The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the editor-in-chief, Chandler Davis.
Report on the Zurich Congress

I find I agree with V.I. Arnold's "Will mathematics survive? Report on the Zurich Congress" (Mathematical Intelligencer 17 (1995), no. 3, 6-10) in a few things: e.g., most talks in ICM94 were not enlightening. I disagree in more: e.g., I didn't find the talks given by representatives of the Russian school more comprehensible than the median. There is, however, an inaccuracy about the General Assembly of the International Mathematical Union which I think needs correction. The Assembly did not reject the proposition of the American delegation to increase the representation of women and ethnic groups, but rather refused to vote on it. It used one of the Assembly's rules of procedure: that a proposition made on the second day of the meeting can't be voted on unless the Assembly decides (by vote) to take it up. It seems the members of the Assembly, while not endorsing the resolution, were not terribly eager to endorse the comment made there that such a resolution would lower the quality of the Congress. As Ingrid Daubechies asked during the Congress (I am paraphrasing), "How come increasing the number of women would automatically lower the quality?" If memory serves me correctly, the Assembly delegate who made the sarcastic comment about "sexual minorities" quoted in the Intelligencer article was V.I. Arnold himself, now vice-president of the International Mathematical Union. I found the comment offensive then, and I find it offensive now.

Alfredo Octavio
IVIC
Caracas, Venezuela

Corrigenda to "Quaternions in Physics"

In my article in The Intelligencer 17 (1995), no. 4, 7-15, the following two items were inadvertently omitted from the list of references:

W.S. Anglin and J. Lambek, The Heritage of Thales, Springer-Verlag, New York, 1995.
L. Silberstein, Quaternionic form of relativity, Phil. Mag. 23 (1912), 790.

The name "Sudbery" was misspelled. The journal reference of the article by A.W. Conway [1948] should be corrected to Pontificia Academia Scientiarum.

Page 9, first column, line 17: I did not mean to imply that Hamilton was in fact influenced by Parmenides. I am informed by E.A. Costa that there is no direct evidence for this.

Page 10, second column, last paragraph: Lewis Carroll did not assert that time is reversed in a mirror. My remark must have been based on a flawed recollection of Through the Looking Glass. Thanks to Lewis Stiller for pointing this out.

On page 12, column 2, lines 12 and 15: replace dx) d *

J. Lambek
Department of Mathematics and Statistics
McGill University
Montreal, Quebec H3A 2K6
Canada
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
The Missing Link Felipe Acker
Introduction

The goal of this article is to change the views of mathematicians throughout the world on three fundamental theorems of elementary analysis: Cauchy-Goursat's Theorem, Stokes's Theorem, and the Mean Value Theorem. My claims are the following:

1. Cauchy-Goursat's Theorem is really a mere corollary of Green's.
2. The usual treatment of Stokes's Theorem is misguided and I will do it properly.
3. The Mean Value Theorem does generalize to higher dimensions as an equality.

Sophisticated objects like differentiable manifolds and exterior differential forms will not figure in the exposition, lest they discourage potential readers. For a more technical version, see [1] and [2].

To make my points of view clear, I'll begin by summarizing what seems to be received wisdom about these theorems. Cauchy-Goursat's Theorem states (in a simplified version) that if $A$ is an open subset of the complex plane, $f$ is holomorphic on $A$, and $R$ is a (closed) rectangle contained in $A$, then

$$\int_{\partial R} f(z)\,dz = 0$$

(where $\partial R$ represents the boundary of $R$ with positive orientation). Almost every introductory complex analysis book contains the remark that, "with the additional hypothesis that $f$ is $C^1$," the proof can be carried out using Green's Theorem. So there seems to be a general belief in the reciprocal: without this additional hypothesis, Green's Theorem wouldn't apply. This is a fallacy, as I'll prove below. The real question is, How could such a fiction persist for one entire century?

For Stokes's Theorem, let's restrict ourselves to Green's Theorem: the exposition will be less technical, and the central ideas won't suffer. The simplest version states that if $R$ is a rectangle and $P, Q : R \to \mathbb{R}$ are $C^1$ functions, then

$$\int_{\partial R} P\,dx + Q\,dy = \iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy.$$

Although it is obvious that the $C^1$ hypothesis can be relaxed, the universally accepted proof is based on iterated integration and needs some kind of regularity of each one of the partial derivatives $\partial Q/\partial x$ and $\partial P/\partial y$. However, if we look at this theorem as a generalization of the Fundamental Theorem of Calculus, we feel that the natural hypothesis should be the integrability of $(\partial Q/\partial x - \partial P/\partial y)$, even if individually $\partial Q/\partial x$ and $\partial P/\partial y$ are bad. This is, in fact, true: if $P$ and $Q$ are differentiable (approximable by linear functions), it is enough to assume $(\partial Q/\partial x - \partial P/\partial y)$ to be Riemann integrable. The precise hypotheses are more subtle, as I will show, but this version is sufficient in order to get Cauchy-Goursat as a corollary.

Now let's turn to the Mean Value Theorem or, should I say, the Mean Value Equality: if $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable at each point of $]a, b[$, then there exists a point $c$ in $]a, b[$ such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

The trouble appears when we try to generalize this result to higher dimensions: the pretty and geometrical equality becomes an inequality. I think the best expression of what everybody seems to believe was given by Jean Dieudonné in his celebrated Foundations of Modern Analysis [4]:

After the formal rules of Calculus have been derived (sections 8.1 to 8.4), the other sections of the chapter are various applications of what is probably the most useful theorem in Analysis, the mean value theorem, proved in section 8.5. The reader will observe that the formulation of that theorem, which is of course given for vector-valued functions, differs in appearance from the classical mean value theorem (for real-valued functions), which one usually writes as an equality $f(b) - f(a) = f'(c)(b - a)$. The trouble with that classical formulation is that: 1° there is nothing similar to it as soon as $f$ has vector values; 2° it completely conceals the fact that nothing is known on the number $c$, except that it lies between $a$ and $b$, and for most purposes, all one needs to know is that $f'(c)$ is a number which lies between the g.l.b. and l.u.b. of $f'$ in the interval $[a, b]$ (and not the fact that it actually is a value of $f'$). In other words, the real nature of the mean value theorem is exhibited by writing it as an inequality, and not as an equality.

Well, Dieudonné was wrong!¹ The Mean Value Theorem does generalize to higher dimensions as an equality.² This is a key idea: when we prove the Fundamental Theorem of Calculus, we really need the Mean Value Theorem in the equality form. If we try to mimic this proof to get Stokes's Theorem, it becomes clear that a general version of the Mean Value Equality would be welcome. In the chain leading from the Fundamental Theorem of Calculus to Stokes's Theorem, this is the missing link.

¹ I really appreciate people like Dieudonné (or, on the opposite side, Arnold) who express polemic opinions; polemics is fundamental to intellectual activity. I prefer Arnold, but, as the French say, "il faut de tout pour faire un monde."

² And this equality reveals a new aspect of its nature. The theorem referred to by Dieudonné is usually called by French authors the finite increases theorem. I claim the true mean value theorem is the one I will present below.

The Fundamental Theorem of Calculus and Stokes's Theorem

Let me briefly recall the proof of the Fundamental Theorem of Calculus just to emphasize the role played in it by the Mean Value Theorem.

THE FUNDAMENTAL THEOREM OF CALCULUS. Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $]a, b[$. If $f'$ is (Riemann) integrable, then

$$\int_a^b f' = f(b) - f(a).$$

Proof: Let $P = \{a_0, \ldots, a_n\}$, $a = a_0 < a_1 < \cdots < a_n = b$, be a partition of $[a, b]$. Then writing

$$f(b) - f(a) = \sum_{i=1}^{n} \bigl( f(a_i) - f(a_{i-1}) \bigr)$$

and applying the Mean Value Theorem to each subinterval $[a_{i-1}, a_i]$, we get

$$f(b) - f(a) = \sum_{i=1}^{n} f'(c_i)(a_i - a_{i-1}),$$

where $c_i \in \,]a_{i-1}, a_i[$, $i = 1, \ldots, n$. The right-hand side converges to $\int_a^b f'$. □
Now let us turn to Stokes's Theorem and try to adapt the proof of the one-dimensional case. For simplicity, let us restrict our study to the elementary case of Green's Theorem.

GREEN'S THEOREM (TENTATIVE). Let $R = [a, b] \times [c, d]$ and $P, Q : R \to \mathbb{R}$ be continuous on $R$ and differentiable in its interior. Let

$$\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$$

be integrable on $R$. Then

$$\int_{\partial R} P\,dx + Q\,dy = \iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy.$$
Proof. The proof should begin by taking two partitions,

$$a = a_0 < a_1 < \cdots < a_n = b, \qquad c = c_0 < c_1 < \cdots < c_m = d,$$

and considering the rectangles $R_{ij} = [a_{i-1}, a_i] \times [c_{j-1}, c_j]$. Then just observe that, with positive orientations,

$$\int_{\partial R} P\,dx + Q\,dy = \sum_{\substack{i = 1, \ldots, n \\ j = 1, \ldots, m}} \int_{\partial R_{ij}} P\,dx + Q\,dy.$$

Figure 2.

We are now at the point where the Mean Value Theorem is needed. If only we could write, for each $i$ and for each $j$,

$$\int_{\partial R_{ij}} P\,dx + Q\,dy = \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)(u_{ij})\,(a_i - a_{i-1})(c_j - c_{j-1}),$$

for some $u_{ij}$ in the interior of $R_{ij}$ (just as in the one-dimensional case), the proof would be finished. Now, what is the trouble if we try to generalize the Mean Value Theorem? The problem seems to be that we must then generalize Rolle's Theorem, whose usual proof is peculiar to real-valued functions.

The Mean Value Theorem

I will give now an alternative proof of the one-dimensional Mean Value Theorem. Refer to Figure 1. If we divide $[a, b]$ into three (equal) parts, setting

$$h = \frac{b - a}{3}$$

and letting $\alpha_0, \alpha_1, \alpha_2, \alpha_3$ be the endpoints of the subintervals we get, then we must have, for the slopes

$$m = \frac{f(b) - f(a)}{b - a}, \qquad m_i = \frac{f(\alpha_i) - f(\alpha_{i-1})}{h}, \quad i = 1, 2, 3,$$

that either $m_i = m$ for all the $i$'s or there are two of them ($m_1$ and $m_3$, for instance, in Fig. 1) such that $m_1 < m < m_3$.

Figure 1.

Now look at the (continuous) function

$$m : [a, b - h] \to \mathbb{R}, \qquad x \mapsto \frac{f(x + h) - f(x)}{h}.$$

What we have said implies that, for some point $\xi$ in $]a, b - h[$, we have $m(\xi) = m$ (be sure you understand that $\xi$ can be taken in the open interval). Next, make $a_1 = \xi$ and $b_1 = \xi + h$ and iterate the procedure for the interval $[a_1, b_1]$, and so on. We will get a sequence of nested intervals $[a_n, b_n]$ with

$$b_n - a_n = \frac{b - a}{3^n}, \qquad \frac{f(b_n) - f(a_n)}{b_n - a_n} = \frac{f(b) - f(a)}{b - a}.$$

Let $c$ be the intersection point of this sequence of intervals. As $c$ is in $]a, b[$, we are sure that $f'(c)$ exists and

$$f'(c) = \lim_{n \to \infty} \frac{f(b_n) - f(a_n)}{b_n - a_n} = \frac{f(b) - f(a)}{b - a}.$$ □

Everyone can see that this proof is not simpler than the usual one, and I do not pretend otherwise. The difference is that I am able to generalize it to higher dimensions.
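The trisection argument is constructive enough to try numerically. The following is only a rough sketch under stated assumptions, not part of the article: the function name `mean_value_point`, the parameters `steps` and `bisect_iters`, and the use of bisection to locate the point where the slope over a window of length h equals the global slope are all choices made here for illustration. Floating-point cancellation limits how many trisection steps are meaningful.

```python
def mean_value_point(f, a, b, steps=15, bisect_iters=60):
    """Approximate a point c in ]a, b[ with f'(c) = (f(b) - f(a)) / (b - a)
    by repeatedly shrinking [a, b] to a sub-interval of one-third the length
    that has the same chord slope (hypothetical helper, illustration only)."""
    m = (f(b) - f(a)) / (b - a)          # global slope, kept fixed throughout
    for _ in range(steps):
        h = (b - a) / 3.0

        def g(x, h=h):
            # slope of the chord over [x, x + h], minus the target slope
            return (f(x + h) - f(x)) / h - m

        # the three sub-intervals start at a, a + h, a + 2h
        starts = [a, a + h, a + 2.0 * h]
        values = [g(x) for x in starts]
        neg = [x for x, v in zip(starts, values) if v < 0]
        pos = [x for x, v in zip(starts, values) if v > 0]
        if not neg or not pos:
            # all three slopes agree with m (up to rounding): take the middle third
            xi = a + h
        else:
            # g changes sign between lo and hi, so bisection finds a zero
            # strictly between them, hence inside ]a, b - h[
            lo, hi = neg[0], pos[0]
            for _ in range(bisect_iters):
                mid = 0.5 * (lo + hi)
                if g(mid) < 0:
                    lo = mid
                else:
                    hi = mid
            xi = 0.5 * (lo + hi)
        a, b = xi, xi + h                # next nested interval
    return 0.5 * (a + b)


if __name__ == "__main__":
    c = mean_value_point(lambda x: x ** 3, 0.0, 1.0)
    print(c, 3 * c * c)   # the second number should be close to the slope 1
```

For f(x) = x^3 on [0, 1] the only admissible point is 1/sqrt(3), so the returned value can be checked against it.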
The Generalized Mean Value Theorem and Its Corollaries

Let's begin with what is sometimes called the physical interpretation of the divergence. To fix some notation, when I say "rectangle" read "closed rectangle with sides parallel to the coordinate axes"; "$\mu(R)$" should be read "area of $R$."

PROPOSITION. Let $A$ be an open subset of the plane, let $F : A \to \mathbb{R}^2$ be continuous on $A$, and let $(R_n)$ be a sequence of similar rectangles containing the point $u_0$ such that

$$\lim_{n \to \infty} \mu(R_n) = 0.$$

Let $\omega$ be the differential form $P\,dx + Q\,dy$, where $F = (P, Q)$.³ If $F$ is differentiable at $u_0$, then

$$\lim_{n \to \infty} \frac{1}{\mu(R_n)} \int_{\partial R_n} \omega = \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)(u_0).$$

Proof. Let $T$ be the differential of $F$ at $u_0$, and write, for $u$ in $A$,

$$F(u) = F(u_0) + T(u - u_0) + \varepsilon(u),$$

with

$$\lim_{u \to u_0} \frac{\varepsilon(u)}{|u - u_0|} = 0.$$

In other words, $F = F_0 + \varepsilon$, where $F_0$ is $C^1$. Let $\omega_0$ be the differential form corresponding to $F_0$, and $\omega_\varepsilon$ the differential form corresponding to $\varepsilon$. Apply the $C^1$ form of Green's Theorem to $\omega_0$ to get

$$\int_{\partial R_n} \omega_0 = \mu(R_n) \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)(u_0)$$

(this could be proved directly with some more effort). Finally, taking into account that the rectangles $R_n$ are similar, we get

$$\lim_{n \to \infty} \frac{1}{\mu(R_n)} \int_{\partial R_n} \omega_\varepsilon = 0,$$

and the proof is finished. □

³ To get what is called the divergence, we should of course have written $\omega = -Q\,dx + P\,dy$.

The above result motivates the following definition.

DEFINITION. Let $A$ be an open subset of the plane, and let $P, Q : A \to \mathbb{R}$ be such that, for $\omega = P\,dx + Q\,dy$,

$$\int_{\partial R} \omega$$

is well defined whenever $R$ is a rectangle. The differential form $\omega$ will be said to have an exterior derivative $d\omega(u)$ (which will be just a real number) at the point $u$ of $A$ if for any sequence $(R_n)$ of similar rectangles containing $u$ such that $\mu(R_n)$ converges to zero, we have

$$\lim_{n \to \infty} \frac{1}{\mu(R_n)} \int_{\partial R_n} \omega = d\omega(u).$$

The name exterior derivative is taken here, in a somewhat improper sense, as a reminder of what general concept I am trying to put into a "popular form." Now observe that if, for instance, $P$ and $Q$ are continuous on $A$ and are differentiable at $u$, then $\omega$ has an exterior derivative at $u$. Moreover, if we change the values of $P$ and $Q$ at $u$ so as to lose continuity, this will not change the values of the integrals and $\omega$ will still have the same exterior derivative at $u$. This is not quite surprising: a differential form assigns numbers to objects (curves, in the present case; manifolds, in higher dimensions), not to points. So, the continuity of $\omega$ should be taken in the sense that the integrals of $\omega$ over two neighboring curves should differ very little.
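The definition can be probed numerically by dividing a boundary integral by the area of a small square. This is a minimal sketch, assuming smooth P and Q supplied as Python callables; the names `boundary_integral` and `exterior_derivative_estimate`, the midpoint rule, and the choice of squares centered at u are illustrative assumptions, not anything prescribed by the text.

```python
def boundary_integral(P, Q, x0, x1, y0, y1, n=200):
    """Midpoint-rule approximation of the integral of P dx + Q dy over the
    positively (counterclockwise) oriented boundary of [x0, x1] x [y0, y1]."""
    dx = (x1 - x0) / n
    dy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        y = y0 + (i + 0.5) * dy
        total += P(x, y0) * dx   # bottom side, traversed left to right
        total += Q(x1, y) * dy   # right side, traversed upward
        total -= P(x, y1) * dx   # top side, traversed right to left
        total -= Q(x0, y) * dy   # left side, traversed downward
    return total


def exterior_derivative_estimate(P, Q, u, side):
    """Boundary integral over a square of the given side centered at u,
    divided by its area (one term of the limit in the definition)."""
    x, y = u
    r = side / 2.0
    return boundary_integral(P, Q, x - r, x + r, y - r, y + r) / (side * side)


if __name__ == "__main__":
    # Illustrative form: P = -x y^2, Q = x^2 y, for which
    # dQ/dx - dP/dy = 4xy, i.e. 8 at the point (1, 2).
    P = lambda x, y: -x * y ** 2
    Q = lambda x, y: x ** 2 * y
    for side in (1e-1, 1e-2, 1e-3):
        print(side, exterior_derivative_estimate(P, Q, (1.0, 2.0), side))
```

Only one family of similar rectangles is sampled here, so agreement with the partial-derivative expression is a sanity check rather than a verification of the definition, which quantifies over all such sequences.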
DEFINITION. If $\omega = P\,dx + Q\,dy$ is a differential form defined on some subset $A$ of the plane, $\omega$ will be said to be continuous on $A$ if for every rectangle $R$ the function

$$u \mapsto \int_{\partial(u + R)} \omega$$

is continuous (wherever defined), where $u + R = \{u + r \,;\, r \in R\}$.

We now have good definitions for proving the generalized Mean Value Theorem. The proof follows the same steps as the one-dimensional case. The basic trick is the following.

LEMMA. Let $\omega = P\,dx + Q\,dy$ be a continuous differential form on the rectangle $R$. Then there exists a rectangle $R_0$ contained in the interior of $R$ such that

$$\frac{1}{\mu(R_0)} \int_{\partial R_0} \omega = \frac{1}{\mu(R)} \int_{\partial R} \omega,$$

the sides of $R_0$ being one-third of those of $R$.

Proof. Let $R = [a, b] \times [c, d]$. Taking

$$h = \frac{b - a}{3}, \qquad k = \frac{d - c}{3},$$

we get, if we divide the sides of $R$ into three intervals, nine rectangles $R_i$, $i = 1, \ldots, 9$, congruent to $\tilde{R} = [0, h] \times [0, k]$. Now note that

$$\text{(i)} \qquad \int_{\partial R} \omega = \sum_{i=1}^{9} \int_{\partial R_i} \omega$$

and that, for all $i$,

$$\text{(ii)} \qquad \mu(R) = 9\,\mu(R_i).$$

So

$$\text{(iii)} \qquad \frac{1}{\mu(R)} \int_{\partial R} \omega = \frac{1}{9} \sum_{i=1}^{9} \frac{1}{\mu(R_i)} \int_{\partial R_i} \omega.$$

Now, just as in the one-dimensional case, we are sure that either

$$\frac{1}{\mu(R_i)} \int_{\partial R_i} \omega = \frac{1}{\mu(R)} \int_{\partial R} \omega \quad \text{for } i = 1, \ldots, 9$$

(and then we can just choose the central rectangle to be $R_0$) or we have, for two of the rectangles, say $R_1$ and $R_2$,

$$\frac{1}{\mu(R_1)} \int_{\partial R_1} \omega < \frac{1}{\mu(R)} \int_{\partial R} \omega < \frac{1}{\mu(R_2)} \int_{\partial R_2} \omega.$$

The differential form $\omega$ being continuous, we can, pushing $R_1$ and $R_2$ inside $R$, suppose $R_1$ and $R_2$ to be contained in the interior of $R$ and still satisfy the above inequalities. Then the function

$$m : \,]a, a + 2h[\, \times \,]c, c + 2k[\, \to \mathbb{R},$$

where

$$m(u) = \frac{1}{\mu(\tilde{R})} \int_{\partial(u + \tilde{R})} \omega - \frac{1}{\mu(R)} \int_{\partial R} \omega,$$

is continuous and takes positive and negative values. It must then vanish for some $u_0$, and we take $R_0 = u_0 + \tilde{R}$. □

We are finally ready for

THE MEAN VALUE THEOREM. Let the differential form $\omega = P\,dx + Q\,dy$ be continuous on the rectangle $R$ and have an exterior derivative at each point of the interior of $R$. Then there is a point $u$ in the interior of $R$ such that

$$d\omega(u) = \frac{1}{\mu(R)} \int_{\partial R} \omega.$$

Proof. Just as in the one-dimensional case, we iterate the lemma to get a sequence of nested rectangles $(R_n)$ such that all the $R_n$'s are contained in the interior of $R$, the lengths of the sides of $R_{n+1}$ are one-third of those of $R_n$, and

$$\frac{1}{\mu(R_n)} \int_{\partial R_n} \omega = \frac{1}{\mu(R)} \int_{\partial R} \omega.$$

Let $u$ be the intersection point of the sequence $(R_n)$. □

We get now, as an immediate consequence, the following stronger form of Green's Theorem:

THEOREM. Let the differential form $\omega = P\,dx + Q\,dy$ be continuous on the rectangle $R$ and have an exterior derivative at each point of the interior of $R$. If $d\omega$ is (Riemann) integrable, then

$$\int_R d\omega = \int_{\partial R} \omega.$$

COROLLARY. Let $R$ be a rectangle and $P, Q : R \to \mathbb{R}$ be continuous on $R$ and differentiable at each point of the interior of $R$. If

$$\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$$

is (Riemann) integrable, then

$$\int_{\partial R} P\,dx + Q\,dy = \iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy.$$

COROLLARY. Let $A$ be an open subset of the complex plane and let $f : A \to \mathbb{C}$ be holomorphic. If $R$ is a rectangle contained in $A$, then

$$\int_{\partial R} f(z)\,dz = 0.$$
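The step from Green's Theorem to the holomorphic case can be written out explicitly. The computation below is a standard one and is offered only as a sketch; the names $u$, $v$ (real and imaginary parts of $f$) and $\omega_1$, $\omega_2$ are notation introduced here, not taken from the text. Since $f$ is holomorphic, $u$ and $v$ are continuous and differentiable at every point, and the Cauchy-Riemann equations $u_x = v_y$, $u_y = -v_x$ hold, so both real forms have exterior derivative zero, which is trivially Riemann integrable.

```latex
\begin{align*}
f(z)\,dz &= (u + iv)(dx + i\,dy)
          = \underbrace{(u\,dx - v\,dy)}_{\omega_1}
            + i\,\underbrace{(v\,dx + u\,dy)}_{\omega_2},\\
d\omega_1 &= \frac{\partial(-v)}{\partial x} - \frac{\partial u}{\partial y}
           = -\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) = 0,\\
d\omega_2 &= \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} = 0,\\
\int_{\partial R} f(z)\,dz &= \int_R d\omega_1 + i \int_R d\omega_2 = 0.
\end{align*}
```

No continuity of the partial derivatives is used anywhere in this computation, which is the point of the claim that the $C^1$ hypothesis is unnecessary.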
Remark. It is usual to present Goursat's Theorem under the hypothesis that $f$ is holomorphic on $A$, except at a subset $D$ of isolated points such that

$$\lim_{z \to z_0} f(z)(z - z_0) = 0$$

for each $z_0$ in $D$. This hypothesis just assures the continuity of the complex differential form $f(z)\,dz$. On the other hand, we have $d\omega(z) = 0$ if $z$ is not in $D$. To conclude the proof, we just argue that $d\omega(z) = 0$ also if $z \in D$. This is a consequence of the following proposition, which generalizes a well-known result of real analysis (proved using the Mean Value Equality).

PROPOSITION. Let $A$ be an open subset of the plane and let $\omega = P\,dx + Q\,dy$ be a continuous differential form on $A$. Suppose $\omega$ has an exterior derivative at each point of $A$, except perhaps at $z_0$. Then, if

$$\lim_{z \to z_0} d\omega(z) = \ell,$$

we have $d\omega(z_0) = \ell$.

The proof is left as an exercise. We can also prove the Cauchy-Goursat Theorem without this proposition. To make things simpler, let $D = \{z_0\}$. If $z_0$ isn't in the interior of $R$, just apply Green's Theorem. If $z_0$ is in the interior of $R$, draw a horizontal line and a vertical line through $z_0$ and apply Green's Theorem to the four rectangles you get (this is also a hint for the exercise!).

Of course the above version of Green's Theorem can be extended in the obvious way to the Divergence Theorem on a block $B = [a_1, b_1] \times \cdots \times [a_N, b_N]$ in $\mathbb{R}^N$. The next step is Stokes's Theorem on chains and on manifolds; see [1] and [2].
Final Remarks

I apologize for having included some proofs, but I had to justify my bombastic claims. I hope everybody is now persuaded. You may say, "Okay, the guy has found a trick," referring to the proof of the Mean Value Theorem. But there is something else to be observed: the definition of the exterior derivative. In the books on differential forms, the exterior derivative is made to look like some kind of algebraic operation on partial derivatives (something like a mixture). One notable exception is Arnold's book [3]. Let me emphasize this: exterior derivation is a genuine analytic operation, and it is an independent concept; it does not depend on differentiation. The one-dimensional derivative contains both the idea of linear approximation and that of flux (didn't Newton call it fluxion?). The first one generalizes as the differential, the latter as the exterior derivative. People working on what is called Partial Differential Equations should perhaps think about this: the physically relevant "partial differential equations" (at least those from classical physics) are, in fact, "exterior derivative equations," and current Sobolev spaces are almost surely not their habitat.
References

1. F. Acker, The Mean Value Theorem for Differential Forms, to appear.
2. F. Acker, Advanced Calculus (in Portuguese, book, draft version).
3. V. Arnold, Mathematical Methods of Classical Mechanics, New York: Springer-Verlag, 1978.
4. J. Dieudonné, Foundations of Modern Analysis, New York: Academic Press, 1960.

Departamento de Matemática Aplicada
Instituto de Matemática
Universidade Federal do Rio de Janeiro
21945-970 Rio de Janeiro, Brazil
e-mail: [email protected]
Gabor Szegő: 1895-1985
Richard Askey and Paul Nevai
The international mathematics community has recently celebrated the 100th anniversary of Gabor Szegő's birth.¹ Gabor Szegő was 90 years old when he died. He was born in Kunhegyes on January 20, 1895, and died in Palo Alto on August 7, 1985. His mother's and father's names were Hermina Neuman and Adolf Szegő, respectively. His birth was formally recorded at the registry of the Karcag Rabbinical district on January 27, 1895. He came from a small town of approximately 9 thousand inhabitants in Hungary (approximately 150 km southeast of Budapest), and died in a town in northern California, U.S.A., with a population of approximately 55 thousand, near Stanford University and just miles away from Silicon Valley. So many things happened during the 90 years of his life that shaped the politics, history, economy, and technology of our times that one should not be surprised that the course of Szegő's life did not follow the shortest geodesic curve between Kunhegyes and Palo Alto.

¹ We refer the reader to the section "Answers to Some Frequently Asked Questions About the Hungarian Language" at the end of this article.

I (R. A.) first met Szegő in the 1950s when he returned to St. Louis to visit old friends, and I was an instructor at Washington University. Earlier, when I was an undergraduate there, I had used a result found by Hsien Yu Hsu in his Ph.D. thesis at Washington University under Szegő. This was in the first paper I wrote. While I was at the University of Chicago in the early 1960s, Szegő visited. I still remember seeing him at one end of the hall and a graduate student, Stephen Vági, at the other end of the same hall. They walked toward each other and both started to speak in Hungarian. I am certain they had not met before, and I have always wondered how Szegő recognized another former Hungarian. In 1972, I spent a month in Budapest and Szegő was there. We talked most days, and although his health was poor and his memory was not as good as it had been a few years earlier, we had some very useful discussions. Three years earlier, also in Budapest, Szegő had mentioned two papers of his which he said should be studied. I did not do it immediately, but three months later I did. One contained the solution of a problem I had been trying to solve for three years. His paper had been written 40 years earlier. I learned from this that when a great mathematician tells you to look at a paper which he or she thinks has been unjustly neglected, one should do it rapidly.

Adolf Szegő (father of Gabor Szegő).

Gabor Szegő in 1896 (age around 18 months).

Hermina Neuman Szegő (mother of Gabor Szegő).

I (P. N.) only met Szegő once. It was in 1972 when I had just graduated. By that time, he had been inactive in mathematics research for almost a decade. Yet he was the mathematician who, for two reasons, had the greatest influence on my career as a research mathematician. One of the reasons was the book which we called Pólya-Szegő, that is, the problem book titled Problems and Theorems in Analysis by George Pólya and Szegő, which needs no introduction for the readers of this article. The other reason was the book Szegő, that is, Szegő's monograph titled Orthogonal Polynomials and the accompanying contemporary theory of orthogonal polynomials, whose founding father was Szegő. To be really pedantic, he should be called the "founding grandfather," since it is already the third generation of mathematicians that is developing his theory today. These books will be discussed later in more detail. Very recently, I had an extraordinarily rewarding experience while working on a project which led to erecting Szegő's bust in Kunhegyes, St. Louis, and Stanford.
In telegraphic style: Gabor Szegő was Professor Emeritus at Stanford University. He was a member of the American Academy of Arts and Sciences, the Science Academy of Vienna, and the Hungarian Academy of Sciences. He was one of the prominent classical analysts of the twentieth century. He wrote more than 130 research articles and authored or co-authored 4 influential books, 2 of which were exceptionally successful. For analysts, Szegő is best known for Szegő's extremal problem, for his results on Toeplitz matrices which led to the concept of the Szegő reproducing kernel and which were the starting point for the Szegő limit theorem and the strong Szegő limit theorem, and for Szegő's theory of Szegő's orthogonal polynomials on the unit circle. These have been summarized in his books Orthogonal Polynomials (Colloquium Publications, Vol. 23, American Mathematical Society, Providence, RI, 1939) and Toeplitz Forms and Their Applications (jointly with Ulf Grenander, University of California Press, Berkeley and Los Angeles, 1958). The former is one of the most successful books ever published by the American Mathematical Society (four editions and numerous reprints).

The book Aufgaben und Lehrsätze aus der Analysis, vols. I and II ("Problems and Theorems in Analysis"), which he coauthored with George Pólya in 1925, contributed to the education of many generations of mathematicians. It was first published by Springer-Verlag in the series Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete (widely known as The Grundlehren) as volumes 19 and 20. The book Isoperimetric Inequalities in Mathematical Physics by Pólya and Szegő was published as No. 27 in the series Annals of Mathematics Studies by Princeton University Press, Princeton, NJ in 1951 (translated into Russian in 1962). Lawrence E. Payne writes on p. 39 of Vol. 1 of Szegő's Collected Papers (Birkhäuser, 1982): "Not only did this [book] make available to the mathematical public a number of powerful new tools of mathematical investigation but it also opened up an interesting new and fertile area of mathematical research."

His work and results not only deeply influenced the development of pure and applied mathematics but also found many applications in statistics, physics, chemistry, and various fields of engineering science. In what follows, we discuss Szegő's life and work, as seen by us and by several of his contemporaries.

After completing elementary school in Kunhegyes and graduating from high school in Szolnok (a town approximately 100 km southeast of Budapest) on June 28, 1912, he enrolled in the Pázmány Péter University in Budapest (today known as Eötvös Loránd University), where he primarily studied mathematics and physics. The same year, he won first prize in the academic contest organized by the (Hungarian) Mathematical and Physical Society (which later became known as the Eötvös Competition and today is known as the
Kürschák Competition). Winning the competition was not a passing event but a very important milestone in Szegő's career. The competition, as all mathematically inclined Hungarians know, carried a great deal of prestige. It was especially important for Szegő because it is doubtful that, as a Jew whose family had no connections, he would have been able without it to study or receive the attention that he did. His father had even tried to discourage him from entering the university since he thought that his son would have no future there as a Jew. Szegő later made sure his children knew of these circumstances. The following year, his paper on polynomial approximations of continuous functions received a University prize. He never abandoned approximation theory, and his very last research paper also focused on this subject.

Szegő spent the summers of 1913 and 1914 in Germany, first at the University of Berlin, later at the University of Göttingen. In Berlin, he attended the lectures of Georg Ferdinand Frobenius, Hermann Amandus Schwarz, and Konrad Knopp, and he also participated in Friedrich Schottky's seminar. In Göttingen, he took courses from David Hilbert, Edmund Landau, and a fellow Hungarian, Alfréd Haar, who was teaching there at the time.

When the First World War broke out, he immediately returned to Hungary and continued his university studies there until May 15, 1915. Conscription was under way and he knew that he would be drafted into the army of the Austro-Hungarian Monarchy. So, to avoid the infantry, he bought a horse and enlisted in the cavalry. According to his comments to his children, he was a poor horseman and fared poorly. He spent the last years of the war in Vienna. His military service lasted beyond the Austro-Hungarian capitulation on November 11, 1918; he remained in the army until early 1919. During this time, he served in the infantry, the artillery, and the air force. Naturally, the military applications of aeronautics were not very sophisticated at that time. However, the Austro-Hungarian Air Force had two extraordinary theoretical experts, Theodore von Kármán and Richard von Mises, two of the founders of modern aerodynamics. They both became Szegő's lifelong friends.

Between 1912 and 1915, Leopold Fejér, Manó Beke, József Kürschák, and Mihály Bauer were among his professors. He met George Pólya (who, at the age of 97, died in Palo Alto on September 7, 1985) and Mihály Fekete (of transfinite diameter fame) at this time; Szegő developed a long-lasting collaboration with both of them. His first publication in an international journal, in
which he gave a solution of a problem p r o p o s e d by P61ya, was in Archiv der Mathematik und Physik 21 (1913), 291-292. As w e k n o w very well, there are problems at various levels. Some, like those done in school, everyone should learn h o w to do. Then there are contest problems, like those in the Mathematical Olympiads. These frequently require d e e p e r insight than seems indicated at first reading. The problem of P61ya, which Szeg6 solved and published in 1913, is an example of a still harder type, which attracts prospective mathematicians. H u n g a r y has long specialized in the use of problems to attract y o u n g students to mathematics; other countries have learned from them, and have contests of problems to encourage d e e p e r mathematical thought. His first research paper, "Ein G r e n z w e r t s a t z / i b e r die Toeplitzschen Determinanten einer reellen positiven Funktion," was published in the Mathematische Annalen 76 (1915), 490-503. This is h o w P61ya r e m e m b e r s it in 1982 (p. 11 of Vol. 1 of Szeg6's Collected Papers): Our cooperation started from a conjecture which I found. It was about a determinant considered by Toeplitz and others, formed with the Fourier-coefficients of a function fix).
I had no proof, but I published the conjecture and the young Szeg6 found the proof [...] We have seen here a good example of the fruitful cooperation between two mathematicians. Mathematical theorems often, perhaps in most cases, are found in two steps: first the guess is found; then minutes, or hours, or days, or weeks, or months, perhaps even several years later, the proof is found. Now the two steps can be done by different mathematicians, as we have seen. Szeg6 spent another 45 years working on sharpening, extending, and finding applications of the results published in this article, and the t h e o r y of Toeplitz determinants became one of his p r i m a r y research areas. While he was serving in the military and his unit was stationed in Vienna, he received his Ph.D. from the University of Vienna on July 8, 1918. His dissertation was based on the above-mentioned article. Fifty years later, he returned to Vienna for a celebration of this, and I (R. A.) still r e m e m b e r h o w pleased he was recalling this celebration in conversation a few years later. Szeg6, having been a mathematical p r o d i g y himself, was an ideal person to be asked to tutor one of the great mathematical minds of this century, John v o n N e u m a n n (born as J~nos N e u m a n n in Budapest in 1903). Here is what N o r m a n Macrae wrote in his book John von Neumann (Pantheon Books, N e w York, 1992, p. 702) Professor Joseph Kiirsch~k soon wrote to a university tutor, Gabriel Szeg6, saying that the Lutheran School had a young boy of quite extraordinary talent. Would Szeg6, as was the Hungarian tradition with infant prodigies, give some university teaching to the lad? Szeg6's own account of what happened was modest. He wrote that he went to the von Neumann house once or twice a week, had tea, discussed set theory, the theory of measurement, and some other subjects with Jancsi [Johnny in Hungarian], and set him some problems. Other accounts in Budapest were more dramatic. Mrs. 
Szegő recalled that her husband came home with tears in his eyes from his first encounter with the young prodigy. The brilliant solutions to the problems posed by Szegő, written by Johnny on the stationery of his father's bank, can still be seen in the von Neumann archives in Budapest. Szegő was married on May 22, 1919, in Budapest just after he was released from the Austro-Hungarian Army. His wife, Erzsébet Anna Neményi, had a Ph.D. in chemistry from the Pázmány Péter University in Budapest. He was still in uniform when they were married. It is
Gabor Szegő, his wife Anna, and their children, Peter and Veronica, shortly after their arrival in the U.S. in 1934.
²According to Macrae, the "coaching" took place in 1915-16, but most likely it was earlier. Macrae writes that von Neumann entered the Lutheran School in Budapest and that "László Rátz [was an] instructor in mathematics in Johnny's 1914-21." Then he writes that "Rátz's recognition of von Neumann's mathematical talents was instant [...] Rátz turned his student over to the mathematicians at Budapest University." This would suggest that the tutoring started in late 1914. On the other hand, Veronica Szego Tincher gathers from her father's comments that "the tutoring occurred after the First World War, although it is also possible that their mathematical discussions began before the War."
said that during the ceremony, there was bombing from a boat on the Danube. They were to have two children: Peter (born in Berlin in 1925) and Veronica (born in Königsberg in 1929). Peter is an engineer by profession; he wrote a number of papers on special functions. He lives in San Jose where he is retired from work with the State Legislature of California. Veronica has lived in Southern California since 1954 and worked for the University of Southern California. She retired in 1995 as Executive Director for Budget and Planning and now lives in Palo Alto. Veronica has three children, Steven, Emily, and Russell. Emily has a son, Nathan, and Russell has a daughter, Micaela. The Szegős lived in a happy marriage until Anna, after many years of suffering, died in 1968. Subsequently, Gabor married Irén Vajda in 1972 in Budapest. She died in 1982 in Budapest. Turbulent revolutionary, counterrevolutionary, and anti-semitically discriminatory years followed the First World War (in political terms: Mihály Károlyi = middle, Béla Kun = left, and Miklós Horthy = right). There were only a very limited number of academic positions in Hungary. As a result, a great many Hungarian scientists who were not appreciated or were even labeled as unreliable characters in their own country, left Hungary, primarily for Germany, Switzerland, the United Kingdom, and a decade or so later, for the United States, where they received much more scientific and financial respect and reward than they could ever hope for in Hungary. For a short while, Szegő worked as an assistant of József Kürschák at the Technical University of Budapest in 1919 and 1920. After he could no longer work at the university, John von Neumann's father Maximilian helped Szegő. Giving up all hope that he would ever get a job guaranteeing a reasonable living in Hungary, he moved to Berlin in 1921, where he became a friend and colleague of Issai Schur and worked with Leon Lichtenstein, von Mises, and Erhard Schmidt.
For a result on the equiconvergence of orthogonal polynomial series and trigonometric Fourier series, he received his Habilitation in 1921. With this, he became a Privatdozent at the University of Berlin in May 1921. This meant that he had the right to give lectures but received very little compensation for it. Other mathematicians holding this title at the University of Berlin in the 1920s were Stefan Bergman, Salomon Bochner, Eberhard Hopf, Heinz Hopf, Charles Loewner,³ and von Neumann. From 1925 he had the Lehrstuhl für angewandte Mathematik (chair for applied mathematics). This was a
³He was called Karl Löwner in those days. In 1984 Louis de Branges proved the Bieberbach conjecture, the most famous previously unsolved problem in classical complex analysis, utilizing, in addition to the results of Milin-Lebedev and Askey-Gasper, the works of Loewner from those Berlin years.
nichtbeamtete ausserordentliche Professur, that is, an associate professorship without tenure. When Szegő left Berlin, Adolf Hammerstein (1888-1941) became his successor and held this position from 1927 to 1935. Szegő's above-mentioned paper was published in the Mathematische Zeitschrift 12 (1922), 61-94. At the same time, independently from Szegő, his Berlin colleagues Bergman and Bochner laid down the foundations of a theory of orthogonal functions that approached the problem from a different perspective. During this period, Szegő was also helping Lichtenstein with editing the Jahrbuch über die Fortschritte der Mathematik. While in Berlin, he was awarded the Julius König prize by the Eötvös Loránd Mathematical and Physical Society on April 10, 1924. The members of the prize committee were József Kürschák (president), Gyula Farkas, Dénes König, and Frederick Riesz. F. Riesz was asked to make a presentation report on the work of the recipient. His report was published in Hungarian in Mathematikai és Physikai Lapok 23 (1924), 1-6, and later it
was reprinted (pp. 1461-1466) with a French translation (pp. 1573-1576) in the second volume of F. Riesz's Oeuvres Complètes (Akadémiai Kiadó, Budapest, 1960). We recommend it highly: an English translation is provided at the end of this article. There is general consensus among mathematicians that the two-volume Pólya-Szegő is the best written and most useful problem book in the history of mathematics. In the 70 years since its first publication, it has continuously influenced mathematics research and made a great impact on the education and training of young mathematicians. So far, it has had four German, one English, one Hungarian, and three Russian editions. Both authors believed that mathematics could only be learned by doing mathematics. The book introduces the reader to mathematical research through a series of carefully selected and related problems, in such a way that after analyzing and solving a group of these problems, the reader is almost ready to do independent research in that particular area. Even though the title suggests that the book is about analysis and most of the problems were indeed selected from that area, a variety of problems from number theory, combinatorics, and geometry were also included, along with a few physical applications. The selection of the problems demonstrates the refined taste and mathematical elegance of the authors as well as their technical repertoire. Virtually every page offers something unexpected--an elegant argument, an unexpectedly clever proof, or a single problem that grows into a complex theory right
in front of one's eyes. Pólya describes the origins of Pólya-Szegő (p. 11 of Vol. 1 of Szegő's Collected Papers): I cannot remember how and when the plan emerged; what is certain is that we worked on this plan for many years until the work appeared in two volumes in 1925. It was a wonderful time; we worked with enthusiasm and concentration. We had similar backgrounds. We were both influenced, like all other Hungarian mathematicians of that time, by Leopold Fejér. We were both readers of the same well directed Hungarian Mathematical Journal for high school students that stressed problem solving. We were interested in the same kind of questions, in the same topics; but one of us knew more about one topic and the other more about some other topic. It was a fine collaboration. The book [Pólya-Szegő], the result of our cooperation, is my best work and also the best work of Gabor Szegő. It is hard
to argue with Pólya's assessment of Pólya-Szegő. They set a standard for later books of problems, and no one has yet come close to their level. Szegő was always modest when he talked of this collaboration. He emphasized that the idea and the planning came from Pólya, and cooperative efforts followed. Szegő was invited to the University of Königsberg to succeed Knopp in 1926 and worked there as Ordinarius (Professor) until 1934. His first two Ph.D. students were in Königsberg. One of his favorite stories is related to Hilbert's visit to the city in 1930. Hilbert was to be given an "honorary citizenship" by the city of Königsberg. He had not come dressed for the exceptionally cold fall weather. Szegő helped Hilbert out by lending him an overcoat so that his native city could welcome him. Life became increasingly difficult for Jews in Germany in the 1930s. Szegő was one of the last to suffer, because he was so highly respected by his students and colleagues and because of his service in the First World War. This is what Pólya wrote to Jacob David Tamarkin from Zürich dated February 14, 1934: It was very difficult to write about the chief point which is the fate of Szegő. Well, I shall be brief and plain. I am terribly worried about him. I saw Mrs. Szegő in December. I got a letter from Szegő in the beginning of January; although no official measure was taken against him [until the beginning of January] and no direct collision happened with the students, I cannot see how it would go on indefinitely under those circumstances. He would accept, I understand, any offer even for a short period of 1 or 2 years, he should try to get a leave of absence for that time, and see whether he can live with his family on that amount. There is no hope to get something for him in Hungary, say Fejér and also Szegő himself [...] I could not do anything for him here in Switzerland [...] Excuse this letter, but you see, I am worried.
The whole European situation is very dark.
Tamarkin set to work immediately, trying to find a job for Szegő in the United States (cf. p. 2 of Vol. 1 of Szegő's Collected Papers). Obviously, such a task was not simple in the middle thirties during the Great Depression; it was next to impossible even for American mathematicians to find jobs. Most positions involved a considerable amount of teaching (12 hours a week was not at all exceptional, and there were cases when it was even more) and were poorly paid ($3000 a year was considered a good salary). It is less widely known that Jews were not particularly welcome in the United States during the 1930s either. United States officials were seemingly not interested in providing asylum. Aided by the unrelenting support of some American mathematicians, quite a few Jewish mathematicians managed to come, and this played a major role in the development of mathematics in the United States. For a better understanding of the period, we suggest several articles published in A Century of Mathematics in America, Part I (Peter Duren, ed., American Mathematical Society, 1988); in particular, "The European mathematicians' migration to America" by Lipman Bers (pp. 231-243) and "Refugee mathematicians in the United States of America, 1933-1941: Reception and reaction" by Nathan Reingold (pp. 175-200). In May 1934, during his Pentecostal holidays, Szegő went to Copenhagen to confer with Harald Bohr about his future. Without having to worry about German censors, he used this opportunity to write a letter, dated May 23, 1934, to Tamarkin. The background was this. In 1925, Szegő had been invited for a visiting appointment to Dartmouth College. The Szegős considered it carefully but decided to turn it down because his future seemed more secure in Germany. Szegő then recommended Tamarkin for the position. It is thus not surprising that Szegő turned to Tamarkin for assistance; they continued to be close friends until Tamarkin's death in 1945. Szegő's letter to Tamarkin was written in German; it is reprinted on pp. 3-6 of Vol. 1 of Szegő's
Collected Papers. In this letter, he describes his feelings about the future of his family and his contemporaries. He seems unaware of how serious the situation was. In reality, it was their lives that were at risk. (According to Carl de Boor, who translated the letter for us, Szegő's German style was excellent and it would be hard to recapture it in English.)
Copenhagen, 23. May 1934
Dear Mr. Tamarkin!
For some time now, I have been planning to write to you and thank you, respectively Professor Richardson, most cordially for all the efforts you have made on my behalf. Please forgive the fact that, once again, I write in German, I can in this way express myself partly more easily, partly more precisely. I do hope that the reading of this letter will not be difficult for you because of the language. I have come to Copenhagen for a few days over the Pentecost break. I will tell you in a moment what made me make this journey. However, in the interest of clarity, I want to begin with a short description of my situation, starting roughly at the point last summer when we last corresponded concerning these questions. Since that time, there has been, on the face of it, no essential change in my personal situation. I have been treated, by colleagues as well
as students, correctly and, despite the present political passions, bearably. Nevertheless, my situation continues naturally to be very difficult, in many instances very depressing and offensive. At the center of these difficulties stands the worry about my family, especially the education and future of my children. In order to keep him away from the politics-filled air of the school and the life in Germany, we have, already last fall, sent our nine-year-old boy to Switzerland, where he is well taken care of, body and soul. Since a few days ago, he is with us during summer vacation, giving us the opportunity to see clearly the great advantage of his stay there. However, this situation cannot be maintained for long ... The future of our children is hard to visualize in Germany. This, I am convinced, would be the case even if, against expectations, there were to be a change in the general direction taken by the government ... Added to these considerations concerning the present and future of my children is the point that I have no faith at all in the stability of my own situation. Last summer, we had many discussions with Fejér about these matters, considered parallels to the analogous (as we saw it) development in Hungary, etc. Some of the prophecies have already been contradicted by what has happened. In Germany, the course of the new 'Weltanschauung' is being maintained with such single-mindedness that a change, a compromise, or even an attenuation in the near future cannot be expected. Of course, there are shades and differences of temperament in the leading circle, and it is impossible to predict which forces will finally be victorious. For example, in the recent formation of the 'Reichskultusministerium' [Ministry for Culture] (a change which is of prime importance for personnel questions at the universities), the better spirit seems to have come out on top.
Yet, one still hears reassurances that in 5 years no 'non-Aryan' person will occupy a university chair. I point to the many retirements that have taken place recently, often without any proper procedure (e.g., Rademacher, at the end of February of this year), also to retirements for the sake of administrative simplification, but with the hidden goal to remove unwanted persons who otherwise would be protected by the 'Beamtengesetz' ['Law for the Restoration of the German Civil Service']. Just a few weeks ago, an outstanding classical philologist has in this way been removed from my university. In Mathematics, the situation in Königsberg is as follows. Reidemeister has been moved, and no successor has so far been appointed. This means that I am needed for the time being since I am alone in a position of responsibility. However, as soon as the successor arrives, something to be expected rather sooner than later (probably by the fall), I don't expect to stay around much longer, even though, as a participant in the war and officer at the front, I supposedly am not affected by that bill. Last summer, it was generally thought that this 'Beamtengesetz' would be temporary in any case, so sooner or later the [former] law-based security would be restored. Since then, the bill has been extended twice already, and there is nothing to prevent further extensions ad inf. In addition, should the bill be revoked, there remain a thousand other means for making it impossible to work here. Such bills are changed with much greater ease than, say, a mathematician would switch from one system of axioms to another. One other pertinent fact deserves to be stressed. Königsberg has been called, semiofficially, a 'Reichsuniversität', meaning that in future only politically correct people will work there.
It is therefore nearly impossible that I will remain there after the final arrangements have been made, probably this coming fall, as a consequence of the formation of the Reichskultusministerium. Rather, if not pensioned off at once, I will be moved to a different university.
This used to be impossible for university professors in Germany; today, the 'law' provides the means for it! Now, what can I expect of colleagues and students in a new environment, likely to be of ill will toward me from the start? Probably, they will only see a person who has been moved as punishment and will hardly tend to allow for mitigating circumstances, of the kind I am used to here in Königsberg, where one knows me from former times when judgements were still objective and, with the slogan 'There are decent Jews, a rare exception' as a kind of excuse, behaves correctly toward me. All in all, my situation is, from the standpoint of my children, equally bad for present and future, but my own future is extremely uncertain. The longer one waits, the more difficult is the change to a new milieu likely to be. For this reason, I continued to think, after our correspondence last year, of moving to the U.S.A. In spite of my rather hopeless situation, just described, I do not wish to proceed with this hastily; in particular, I am not forgetting at any moment the difficulties which exist over there and which you kindly described last year to me and my friends Fejér and Pólya. Therefore, we arrived with Fejér and Pólya at the conclusion that I will continue to look for a position in the U.S.A., but that in case of success ... I shall try first to get temporary leave from here in order not to burn any bridges. It was in this sense that I asked Pólya last fall to write to you. On 7. April of this year, he told me that you had responded and had related the results of ca. 20 written inquiries. I am really moved by this undeserved measure of willingness to help! In addition, he asked me to let you know exactly, through him, my own intentions, since you would have to act quickly and decisively. For the sake of clarity, I am stating these today again, even though you are certain to have had his answer for some time now.
Here is my thinking: In case A), if 1) the offer in question for 2-3 years is certain, and there is some hope of an extension, 2) if it is such that I and my family can live on it, then I would accept it for sure. However, in case B), if the two conditions just formulated are not fully satisfied, I would apply for temporary leave, and make further decisions only after I have that leave. Of course, I would also apply for such leave in the case A), but that wouldn't be so utterly important as in the case B). In a letter from 9. May, Fejér tells me, based on a letter from you, that Washington University in St. Louis plans to offer me a visiting professorship for 3-4 years, assuming that it can obtain the necessary financing, which probably is not certain. He also writes that you, in order to proceed, would like to know whether, in case of an official invitation, I would be able to obtain leave from my university ... I decided on a moment's notice to travel to Copenhagen in order to speak with Bohr in person. I am now there, staying with him, and we are discussing the situation day and night. He kindly showed me the letter from Richardson; by the way, in a telegram two days ago he has announced my presence here as well as a letter to be sent. These happenings explain my somewhat late but, so I hope, very much clearer response than would have been possible from Germany. In any case, I ask for your and Prof. Richardson's indulgence, should this lateness cause the postponement of some steps ... I now would like to ask you, dear Mr. Tamarkin, to understand this response, as well as inform the relevant people, as if the possibility of an official leave were already settled. As a matter of fact, I do consider this very likely. However, I am going one step further. Since in the matter of St. Louis, the above-mentioned condition A), 1) seems to
be satisfied, I would accept the offer irrespective of the leave, provided also condition A), 2) is satisfied. In this respect, I am however badly informed, in particular, I have no idea what is the sum needed in the U.S.A. to keep a professor with family from starving to death. I have no intention of making special demands. In order to have a basis for comparison, I mention that my present yearly income is about 9000 M., but it is likely to experience increasing diminution because of the present uncertain situation. However, I hardly believe that it makes sense to convert this into U.S. currency, especially since its devaluation. In any case, I would be very grateful if you could give me some information on this point. About St. Louis itself, I couldn't find
out much: It is about 1,400 km from New York, an industrial city with 800,000 inhabitants, several universities, with the one in question apparently well equipped and financed (cf. American Universities and Colleges, 2nd ed., Amer. Council on Educ., Baltimore, 1932), however the mathematicians there are unknown to me. This wouldn't bother me at all. It would, however, be very valuable to me if I could learn from you also about city and university, also about climate, housing possibilities, and standard of living. I am assuming that you are able to say something based on hearsay or even on personal experience. I dare to stress one very important point. A leave of absence longer than 1 year for me cannot be hoped for at present. (I can apply for an extension later.) On the other hand, I can't really apply for a 1-year leave of absence based on an invitation for 3-4 years, as was the formulation transmitted to me by Fejér, extremely valuable though it is to me. Should the invitation take this form, then I would have the additional big and important request to you, to inform the relevant office about my situation and to request that, in addition to an invitation for 3-4 years for my personal use, a second invitation be sent in which only the duration of 1 year is mentioned. I hope that I have managed to be completely clear on this particular point. Don't be surprised when you receive very soon a paper of mine in English. I corresponded with Professor Walsh about a certain question and afterwards wrote up my results in English. He then asked me to send him the paper and he is going to honor my request to send it on to you or Professor Hille. I hope that the editing won't be as bad as when Japanese write in German.
By the way, I am studiously learning English and hope to master quickly the art of giving talks (if not the art of day-to-day conversation), once I have had for a while the opportunity to listen to, resp. give, classes in English. In order not to disappoint the gentlemen in St. Louis too much, it might be advisable to mention on occasion my low skills in this area. I now close, having, I hope, developed the essentials clearly. With repeated warmest thanks to you and Professor Richardson, I remain, with cordial greetings, sincerely yours,
[signed] G. Szegő
At the same time, H. Bohr also wrote a letter to Tamarkin, explaining that Szegő's situation in Germany "is still more untenable than he pictures it himself." He also writes that the fact that "Szegő until now has been able to maintain his position, is only due to his quite exceptional position among his pupils, because he not only is a first class mathematician, but also an extremely estimated, inspiring and successful teacher." H. Bohr's letter is reprinted on pp. 2-3 of Vol. 1 of Szegő's Collected
Papers. Tamarkin's efforts were successful and Szegő was offered a professorship at Washington University in St. Louis, Missouri, in 1934. These were extraordinary times both economically and politically. The university did not have sufficient funds for Szegő's salary. The money was raised by a grant of $4000 from the Rockefeller Foundation, by a matching grant from the Emergency Committee in Aid of Displaced German Scholars, and from donations from the local Jewish business community, which covered Szegő's salary for four years. The Rockefeller Foundation was also instrumentally involved in arranging visas, exit permits, travel documents, and so forth.⁴ Following the advice of Pólya and H. Bohr, Szegő accepted the job, went to St. Louis in the fall of 1934, and remained there until June, 1938. In the course of that time, he was the advisor to five Ph.D. students (one of them only graduated in 1948 at Stanford) and finished the first version of his book Orthogonal Polynomials. He had a summer visiting appointment at Stanford University in 1935. In 1936, he gave an invited address at a meeting of the American Mathematical Society. The friendships and contacts he made in St. Louis lasted to the end of his life. Orthogonal Polynomials was first published in 1939, and it has since become one of the main reference books for many pure and applied mathematicians and for scientists working in various fields. There are many reasons why researchers started investigating orthogonal polynomials. Historically, these polynomials first appeared in connection with special functions, numerical analysis, and approximation theory (quadrature and interpolation). They are also denominators and numerators of convergents of continued fractions. Later, they would arise in Padé approximations and moment problems.
However, the foundations of a general asymptotic theory of orthogonal polynomials were first laid down in a series of papers that Szegő wrote in the 1920s and 1930s. Szegő succeeded in reducing many significant problems related to orthogonal polynomials to the asymptotic behavior of certain Toeplitz and Hankel determinants. This ingenious result made the solution of many problems easy, at least in the case of the so-called Szegő class, that is, for measures whose absolutely continuous component is Lebesgue integrable on the unit
⁴P. N. appreciates very well the utmost significance of the latter.
circle. Many mathematicians contributed to developing Szegő's theory, including Naum I. Akhiezer, Sergey N. Bernstein, Paul Erdős, Géza Freud, Yakov L. Geronimus, Alexander N. Kolmogorov, Mark G. Krein, James Alexander Shohat, Vladimir I. Smirnov, and Paul Turán, just to name a few. Szegő's theory has found significant applications in other scientific fields such as numerical methods, direct and inverse discrete scattering theory, differential and difference equations, mathematical statistics, prediction theory, statistical physics, systems theory, coding theory, and fractals. A theory of orthogonal polynomials beyond Szegő's class was only recently discovered. The primary reason for the renewed interest lies in the various recurrence formulas satisfied by orthogonal polynomials. Szegő's Orthogonal Polynomials discusses almost all facets of the theory, including those areas that (often inspired by Szegő's book) were to be developed only later. Quite a few well-written books have appeared on orthogonal polynomials since; yet Szegő's remains the standard, the first place to look for ideas and information. In 1938 Szegő accepted an offer from Stanford University to become Head of the Department of Mathematics there. He was the Head until 1953. His greatest achievement during this time was to raise Stanford's mathematics to a world-class standard. At this point we turn the pen over to Peter Lax by quoting from "The old days" to be published in A Century of Mathematical Meetings (Bettye Anne Case, ed., American Mathematical Society, 1996).
In the forties and fifties I spent many summers at Stanford University at the invitation of the Head of the Mathematics Department, Gábor Szegő, who was my uncle by marriage. The Head of a department was in those days a much more powerful figure than a mere chairperson today; he made all decisions, including hiring and firing. Szegő used his powers to turn the provincial mathematics department that Stanford had been under [Hans Frederick] Blichfeldt and [James Victor] Uspensky--both remarkable mathematicians--into one of the leading departments of the country that Stanford is today. He appointed four senior mathematicians from Europe: Pólya, Loewner, [Max M.] Schiffer [who succeeded Szegő as head of the Department] and Bergman, and half a dozen brilliant young Americans: [Richard] Bellman, [Albert Hosmer] Bowker, [Paul] Garabedian, [Halsey] Royden, [Albert Charles] Schaeffer and [Donald C.] Spencer. He took advantage of the availability of postwar Government support for science by joining Bowker in the creation of the Applied Mathematics and Statistics Laboratory. The Szegő period at Stanford is well documented in Royden's article "A history of mathematics at Stanford" in Part II of A Century of Mathematics in America, published by the AMS, [1989, pp. 237-277]: He wrote his first paper [...] at age 20 while in the trenches during the First World War. His comrade-in-arms and later lifelong friend, Strasser, recalled that his fellow officers realized that Szegő was a very precious talent, and did their best not to expose him to danger. As a young man, Szegő was very shy; by the time he came to the United States, he was a self-assured man with old world courtly manners. Underneath his somewhat aristocratic appearance he was a warmhearted person, ever willing to help others. He was aware of the absurdities of life and savored them. He had high standards, but did not expect everyone to live up to them. Throughout his scientific life, he was very close to his mentor, George Pólya. Contrary to a mistaken assertion in a recent biography of Pólya by the Taylors, Szegő always treated him with utmost consideration and tact. The two had different personalities; Pólya was conservative and pessimistic, Szegő liberal and optimistic. To illustrate their differences with a small story, when Marcel Riesz came to Stanford, Pólya felt he couldn't invite to his house a man who had sired two illegitimate daughters; Szegő was amused by this Victorian prudery. Szegő believed that one should do mathematics as long as one can; he made Pólya agree to delay his study of the psychology of problem solving until the age 65. Since this left Pólya almost 30 years for his educational enterprise, it was not a bad bargain. [...] in 1946 there was one departmental secretary at Stanford. All faculty members, including Szegő, typed their own papers. All this changed soon in the postwar boom, but there was a corresponding loss of intimacy. For instance, it would be impossible today, as Szegő did in the summer of 1946, to invite all graduate students to his home for a supper of stuffed cabbage and plum dumplings, cooked expertly by his wife. Cooking wasn't Mrs. Szegő's only expertise; she was a chemist, and supported the family while Szegő served in the prestigious but barely remunerated position of Privatdozent in Berlin. She was a voracious reader, in four languages, and provided intellectual companionship and stimulation to her husband, and their children. It was a happy household, and I was privileged to be part of it for three summers.
Szeg6 stayed at Stanford until his retirement in 1960 as Professor Emeritus.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
Peter Duren recalls: When I was an Instructor at Stanford, I attended Szegő's course on orthogonal polynomials. It was really a course on various techniques in analysis: asymptotic estimation of integrals, for instance. I remember that he needed to use a Blaschke product at one point, but he didn't call it that. When one of the students asked whether that was a Blaschke product, he replied that some people called it that, but he would not want to honor that man in any way. The student immediately caught on and asked whether Blaschke had been a Nationalist Socialist. Szegő didn't respond directly, but the gleam in his eye confirmed that the student had got it right. Times have changed. Today, it's no big deal to call Blaschke a Nazi and still talk about Blaschke products. We do it all the time. We wonder how Szegő would have reacted to the solution of the Bieberbach conjecture. Bob Osserman recalls: I can recall one Szegőism from a conversation I had with Szegő at a party shortly after I arrived at Stanford. He said to me something to the effect: Don't you think it's somewhat
fraudulent that we claim to teach people how to become research
mathematicians? That's like claiming you can teach someone how to become a poet. All you can really do is show by example how research in mathematics is done, and then they either can do it themselves or they can't. At the time, I thought that was a
fairly shocking idea, perhaps meant deliberately to be a bit outrageous, but I gradually came to realize that there was more truth to it than I would have at first conceded.
In his article already cited, Halsey Royden tells about Marcel Riesz's series of four lectures at Stanford in 1948: The day of the first lecture was warm, the good-sized lecture room was full of faculty and students. Gabor Szegő introduced Riesz, who promptly took off his jacket and proceeded to lecture in his shirtsleeves and suspenders. A bowl of water and sponge had been provided. After filling up the blackboard, Riesz motioned imperiously to Szegő, who jumped up and washed off the blackboard while Riesz stood by and watched! Now Szegő was very distinguished and autocratic; wore elegant tailor-made suits, and was always regarded with awe by the students and most of the faculty. To see him in the role of young European assistant to Riesz was startling! After several repetitions of this performance, needless to say, blackboard and floor soon became quite a mess. Sitting directly behind me was George Pólya, who had brought Felix Bloch to hear a distinguished fellow Hungarian. Pólya was somewhat embarrassed by the performance and muttered apologies sotto voce. Szegő became a naturalized American citizen in 1940. In 1945-1946, he spent a year teaching mathematics to American soldiers (waiting to be shipped back to the States) at the American University in Biarritz, France. He served as a civilian employee of the War Department; he was in uniform and was given a rank equivalent to Colonel with PX and similar privileges. Szegő's son, Peter, was serving in the U.S. Army at the same time and was also a student in Biarritz while Szegő was teaching there. During this time, Szegő traveled to the Netherlands and England where he met and assisted various mathematicians.
Joseph Ullman (Professor Emeritus at the University of Michigan, who died on September 11, 1995, at the age of 72, while we were putting the finishing touches on this manuscript) met Szegő in Biarritz and later followed him to Stanford, where, under Szegő's direction, he defended his Ph.D. in 1950. Michael Aissen (Professor Emeritus at Rutgers University, Newark) and Robert L. Wilson (Professor Emeritus at Ohio Wesleyan University) were also students of Szegő in Biarritz. Szegő also tried to get permission to enter Hungary, but the Russians declined his request. His mother lived in Budapest until she died there in 1946. At the time Szegő was in France, the Szegős knew little of what happened to his brother and his wife's family.
Michael Aissen recalls: The students [in Biarritz] became aware of Szegő's prestige because of a curious fact. The faculty had a veneer of democracy evidenced by the fact that the standard form of address was Mister .... However, there was a single exception. Everyone addressed Gabor as Dr. Szegő. Szegő received a lump sum reparation from the German government during the 1950s. When he reached retirement age, he received the equivalent of the pension he would have had if he had not been forced to leave Germany. After he retired, his wife's health deteriorated and Szegő's health declined as well. He gave his last mathematical talk on Fejér's work, at an international conference on Constructive Function Theory in Budapest in 1969. Anna died in 1968 and, in 1970, Szegő discovered that he was suffering from Parkinson's disease. During the years between 1973 and 1980 he split his time between Palo Alto and Budapest (often at the Grand Hotel on Margaret Island on the Danube). When he was in Budapest, his friends and followers often visited him. He particularly enjoyed the company of György Alexits, Erdős, László Fejes Tóth, and Turán. During his last years he was confined to a wheelchair and suffered a lot of pain. Even after Szegő stopped doing mathematics research, papers, problems, and questions continued to be sent to him from all over the world. He was pleased to
Gabor Szegő, around 1950.
receive them, but it annoyed him that he could not respond to all of them as he once had. In 1952, Szegő published an extension of his first paper titled "On certain Hermitian forms associated with the Fourier series of a positive function" (Comm. Sém. Univ. Lund, Tome Suppl., Festskrift Marcel Riesz, 1952, 228-238). About this paper Barry McCoy wrote on pp. 47-52 of Vol. 1 of Szegő's Collected Papers: It is easily arguable that, of all Szegő's papers [this] has had the most applications outside of mathematics. In the first place, the problem which inspired the theorem was propounded by a chemist working on magnetism. Extensions of this work made by physicists have led to surprising connections with integrable systems of nonlinear partial difference and differential equations [...] In addition Szegő's theorem has recently been used by physicists investigating quantum field theory and Toeplitz determinants arise in the study of static monopole solutions of Yang-Mills equations. One way mathematicians are honored is to have something they discovered named after them. As we mentioned before, he made a number of such discoveries. Another way we show that the work of mathematicians is deep enough to last is to publish their selected or collected works. He was still alive when his collected works were published in three thick volumes (2626 pages) in 1982 with the title Gabor Szegő: Collected Papers by Birkhäuser in its series Contemporary Mathematicians. I (R. A.) sent a copy of the three volumes to Szegő via his son Peter who looked at it before taking it up to his father. Szegő had a very good day when Peter brought it; he was not only very pleased, but he also kept asking Peter if he had seen this in the book, and then going on to ask about something else he saw there.
Mark Kac wrote in his review of Szegő's Collected Papers published in The American Mathematical Monthly 91 (1984), 591-592: For who could be indifferent to the theorem that a power series with only finitely many different coefficients either represents a rational function or is not continuable beyond its circle of convergence! Or if a Toeplitz matrix is generated by the Fourier coefficients of a nonnegative Lebesgue integrable 2π-periodic function f, then the nth roots of the determinants of the n × n truncated matrices converge to the geometric mean of f. [The last sentence is a paraphrase of what Kac wrote with formulas.] I am picking these two examples, both from Szegő's early years, because the first one is one of the many isolated jewels scattered throughout the books, and the second the beginning of an important development which is likely to continue for many years to come. It is characteristic of most, if not all, of Szegő's work that it begins with a concrete problem. That much of it flowered into elegant general theories (e.g., orthogonal polynomials on the unit circle) is a tribute to Szegő's impeccable taste in choosing problems and to the depth of his insight. Even then no one, not even Szegő himself, could have dreamed of the extent to which some of his work would ultimately influence mathematics and science [...]
They are a monument to the vitality of classical analysis and to the virtuosity of their author. Szegő left a memorial for us, his mathematical work. It continues to live and lead to new work. We often regret that he is not here to appreciate all of the work being done on problems he started.
Epilogue
The city of Kunhegyes celebrated the Szegő centenary on January 21, 1995. An entertaining description of the day's event, including two speeches by Lee Lorch and R. A., can be found at URL http://www.math.ohio-state.edu/JAT/DATA/SPECIALS/szego. With the help of generous contributions by over 100 individuals, the city of Kunhegyes, Washington University, and Stanford University, Szegő's bronze bust was commissioned from the Hungarian artist Lajos Győrfi. The dedication of the statue, which was erected in front of the City Library in Kunhegyes, took place on August 23, 1995. A short account of the dedication ceremony by Kathy A. Driver can be found at URL http://www.math.ohio-state.edu/JAT/DATA/SPECIALS/szego.bust. In addition, copies of the bust will be placed at Washington University and Stanford University.
Answers to Some Frequently Asked Questions About the Hungarian Language
First, a short course in pronouncing Hungarian words. It is the easiest thing on the face of the earth.⁵ Hungarian is almost exclusively phonetic; that is, words are pronounced exactly as they are written (with a few exceptions). The (single) letter "sz" stands for "s" as in "set." The letter "s" is pronounced as "sh" as in "sheet." The letter "ö" is just like its German twin: "Hölder." On the other hand, "ő" is a looong "ö" as in "Szegőőő." Incidentally, "z" is as in "zero," and the (single) letter "zs" is pronounced as in the French "Legendre." Surprise: the letter "á" is not a long version of "a"; it's a vowel in its own right, as in "art." Homework: say Zsazsa Gábor. N.B. Gábor is legitimate both as a family and a given name. Note that Szegő is also frequently spelled as Szegö or Szego. In addition, Frigyes = Frederick (as in Riesz), Gábor = Gabor (as in Szegő), György = George (as in Pólya), Gyula = Julius (as in Kőnig), Lipót = Leopold (as in Fejér), Marczel = Marcel (as in Riesz), Miksa = Maximilian (as in John von Neumann's father who was also called Max), Pál = Paul (as in Erdős and Turán), and Tódor = Theodore (as in von Kármán). The custom in Hungarian is that the family name comes first and given names follow.
5 Come now. Pronouncing Czech must be just as easy. [Editor's Note]
Acknowledgments
We thank Veronica Szego Tincher and Peter Szego for information of a personal nature and for their continued support of this project. We thank Carl de Boor for consulting on German-related matters in this article and for his translation of Szegő's letter to Tamarkin. We thank Dietrich Braess and Herbert Stahl for helping us to figure out how the German academic system worked in the 1920s. We thank Priscilla R. Feigen for helping us locate some former members of the Department of Mathematics at Stanford University. We thank Edit Kurali for translating the article "Gábor Szegő" by P. N. which was published in Hungarian in Magyar Tudomány 8-9 (1986), 728-736 [and was reprinted as "Hungarian scientists, XVI: Gábor Szegő" in Nyelvünk és Kultúránk 65 (1986), 57-63]. We thank Peter Lax for allowing us to use an excerpt from his not yet published article. We thank Michael Aissen, Peter Duren, and Bob Osserman for anecdotes about Szegő. We thank László Szabó for translating F. Riesz's presentation report from Mathematikai és Physikai Lapok 23 (1924), 1-6. We thank Liz Askey, Chandler Davis, Peter Duren, Samuel Karlin, Peter Lax, Lee Lorch, Peter Szego, Veronica Szego Tincher, and István Vincze for reading a draft version of this work and suggesting many improvements. Parts of the present paper are based on the above article by P. N. in Magyar Tudomány which itself has greatly benefited from the introductory material to Szegő's Collected Papers edited by R. A. and published by Birkhäuser, Boston, in 1982. This material is based on work supported by the National Science Foundation under Grant Nos. DMS-9300524 (R. A.) and No. DMS-940577 (P. N.).
Appendix: Frederick Riesz's Report on the 1924 Gyula Kőnig Prize⁶
Gentlemen! The committee entrusted with the task of making a proposal concerning the award of the second Gyula Kőnig Prize held a meeting on January 26 of this year at the Technical University with József Kürschák as chairman; also present were Gyula Farkas, Dénes Kőnig, and yours truly. The committee observed with pleasure that among the Hungarian mathematicians eligible, according to the rules of the foundation, there are more than one who are worthy of receiving the prize. This year the committee wanted to
6 Translated from Hungarian by László Szabó from Mathematikai és Physikai Lapok 23 (1924), 1-6 [cf. in Hungarian (pp. 1461-1466) and in French (pp. 1573-1576) in the second volume of Riesz's Oeuvres Complètes, Akadémiai Kiadó, Budapest, 1960].
reward a member of the youngest generation and decided to recommend for the prize Gabor Szegő, Privatdozent at the University of Berlin. The committee has charged me with the task of preparing a report and of analyzing and appraising the works of the candidate. I have the honor to present my report. During his eight years of scientific activity Gábor Szegő has produced numerous works. Please permit me to restrict myself to those of his papers that attracted most of my attention by the novelty, beauty, and significance of their results and methods. From among those results, if you excuse me for my perhaps excessive subjectivity, I start with a discovery that is in direct contact with my own research. It is known and easy to prove that the value of a function that is holomorphic inside a curve, say, a circle, and continuous on an arc of this circle, cannot be constant on this arc except in the trivial case when the function is constant. In 1906 the French mathematician Fatou, after showing in his famous doctoral dissertation that every function that is bounded and holomorphic inside a circle has a limiting value almost everywhere, that is, with the exception of a set of measure 0, raised the following question: since this limiting function cannot be constant on the whole arc, as was stated above, how large can the set be on which it is constant; or, and this amounts to the same, how large can the set be on which it vanishes? After showing that this set cannot fill out "almost" entirely an arc, he formulated the conjecture, which he believed was difficult to prove, that this set has measure 0. My younger brother Marcel and I proved this conjecture in a joint article, which we presented at the 1916 Stockholm Congress, not only in the bounded case but for a more general class of holomorphic functions as well.
Szegő succeeded in showing the deeper, I could say real, reason of this phenomenon in a March 1920 letter addressed to me which was published, together with my comments, in the 38th volume of Math. és Term. Értesítő in 1920 under the title "Analytikus függvény kerületi értékeiről." Szegő later also published his related research in the 84th volume of Math. Annalen under the title "Über die Randwerte einer analytischen Funktion." Namely, in these papers he proved that the logarithm of the absolute value of the [nontangential] boundary limit function is Lebesgue integrable. Therefore, the logarithm may be equal to negative infinity, that is, the boundary limit function itself may vanish, only on a set of measure 0. The interesting nature of this result is perhaps better shown by the following theorem which is easily seen to be equivalent to it: given a nonnegative function on the circumference of a disk, a necessary and sufficient condition for the existence of a function that is holomorphic inside the disk and is not identically vanishing inside the disk and has bounded mean value, such that the absolute value of its [nontangential] boundary value is almost everywhere equal to the given function, is that both the given function itself and its logarithm be integrable. I note that Szegő's theorem, which he obtained in a roundabout way via the study of Toeplitz forms and the Fourier series of positive functions, can very easily be derived from a famous formula of Jensen. This was pointed out not only by me in the above cited correspondence, but also by Fatou himself, who applied a similar chain of ideas to prove in just a few lines, not the theorem of Szegő, but his own conjecture. Another, extensive group of Szegő's work belongs to the following sphere of ideas: from properties of the coefficients of power series, or from arithmetic properties of most of these coefficients, he deduces properties of the corresponding analytic functions.
More specifically: (i) theorems of
Hadamard and Fabry about lacunary power series; (ii) the theorem conjectured by Pólya and proved by Carlson about power series with integer coefficients that are convergent inside the unit disk, which states that the function defined by such a power series either is rational or else cannot be extended beyond the unit disk; and (iii) the analogous theorem of Szegő about power series having only finitely many different coefficients ("Über Potenzreihen mit endlich vielen verschiedenen Koeffizienten," Sitzungsber. d. preuss. Akademie, 1922). In "Tschebyscheff'sche Polynome und nicht fortsetzbare Potenzreihen," which appeared in the 87th volume of Math. Annalen, Szegő shows that all these theorems follow naturally from the relationship discovered by Faber that exists between the Tschebyscheff polynomials of a curve and conformal mapping, and which was used by Carlson in his proof of Pólya's conjecture. In the same article, starting from the same principle, Szegő also deduces a theorem of Ostrowski which leads us to a seemingly distant theorem of Jentzsch, a young German mathematician who died in the war, about the distribution of zeros of the partial sums of power series. His article titled "Über die Tschebyscheff'schen Polynome," which was published in the first volume of mathematical Acta [Acta Sci. Szeged] of the Ferenc József University, and another one, a short article titled "Über die Nullstellen von Polynomen, die in einem Kreise gleichmässig konvergieren," which was recently published in the Sitzungsberichte of the Mathematical Association of Berlin belong to this area as well. In them Szegő, using completely elementary methods, throws light on the deeper causes behind the theorem of Jentzsch and related phenomena. Finally, I turn to the area belonging both to complex and real analysis to which Szegő devoted the largest part of his work: the theory of orthogonal systems and the corresponding series expansions.
To start with a smaller, very interesting work, which also shows his great ability in formal calculations, let me mention the article "Über die Lebesgue'schen Konstanten bei den Fourier'schen Reihen" which appeared in the 9th volume of the Math. Zeitschrift. Here he gives very simple numerical expressions for Lebesgue constants whose properties were previously studied by Fejér and Gronwall. Then he proves in a straightforward fashion properties of these constants some of which were proved by the above-mentioned authors in a much more complicated way and some of which were conjectured by them. In more extensive work which appeared in the same volume under the title "Über orthogonale Polynome, die zu einer gegebenen Kurve der komplexen Ebene gehören," he examines Fourier expansions in polynomials which are orthogonal on a closed curve. This contains, as a special case, both Legendre and power series. These expansions, even in the general case, behave very much like power series, and provide a new and most natural solution to the following problem of Faber: given a domain, find a system of polynomials in which every function that is holomorphic in this domain has an expansion. The expansions studied by Szegő have an interesting and very simple relationship with the conformal mapping of the finite and infinite domains bounded by the given curve onto the unit disk. For example, the conformal mapping between the exterior of the curve and the exterior of the unit disk which maps infinity onto itself is the limit of the ratios $P_{n+1}(z)/P_n(z)$ formed from consecutive polynomials. Among the papers of Szegő related to the problems just discussed there are two that deserve the greatest acclaim. In these two papers he examines the so-called "inner" asymptotics for orthogonal systems and the corresponding series expansions. In other words, he discusses questions concerning asymptotic behavior on those curves and intervals on which the polynomials are orthogonalized with respect to some weight function $p(x)$. In this area, where the first classical results are linked with the names of Laplace and Darboux, Szegő not only obtains very general results, far overshadowing anything known previously, but he obtains these results exactly because he examines these questions, considered very difficult, using a simple, one can say elementary, method. The main point of his method is that he squeezes the weight function $p(x)$ between two functions of a very simple structure that have the form $\sqrt{1 - x^2}/P(x)$ where $P(x)$ is a polynomial. He shows that these functions may be viewed as majorants and minorants, respectively, from the point of view of the problems which are studied. After reducing the problems to the case of weight functions of this special type, he evaluates explicitly the corresponding expressions, using a theorem of Fejér on positive trigonometric polynomials. Using this method that we have just sketched, in his article, "Über den asymptotischen Ausdruck von Polynomen, die durch eine Orthogonalitätseigenschaft definiert sind" which appeared in the 86th volume of Math. Annalen, he gives [inner] asymptotic expressions of orthogonal polynomials for every point $x$ where $p''(x)$ exists. It is known, especially after Haar's dissertation, that with the help of these asymptotic expressions one can reduce questions of convergence and summability of series expansions to certain special cases, e.g., Fourier series. In addition, in another article titled "Über die Entwicklung einer willkürlichen Funktion nach den Polynomen eines Orthogonalsystems" published in the 12th volume of the Math.
Zeitschrift, Szegő also shows that the same elementary method, without the use of asymptotic expressions of the polynomials, directly gives asymptotics for the partial sums [of orthogonal series] and in this way reduces convergence problems to analogous questions for Fourier series. I wish to mention another merit of Szegő of a different nature. Namely, the many careful, precise, to-the-point, and, if needed, critical reviews which he wrote for the last two volumes of Jahrbuch über die Fortschritte der Mathematik, which dealt with the literature from 1914 to 1918. In my judgement, with these reviews, together with those of other collaborators, he has considerably contributed to raising the quality of the yearbook. This is of permanent value while contacts between scientists of different nations are made difficult by financial and other considerations. Based on these observations I recommend that the board approve the committee's recommendation.
Szeged, March 7, 1924
Frederick Riesz
Department of Mathematics
University of Wisconsin-Madison
Madison, WI 53706, USA

Department of Mathematics
The Ohio State University
Columbus, OH 43210-1174, USA
Website: http://www.math.ohio-state.edu/~nevai
David Gale*
This column is interested in publishing mathematical material which satisfies the following criteria, among others:
1. It should not require technical expertise in any specialized area of mathematics.
2. The topics treated should when possible be comprehensible not only to professional mathematicians but also to reasonably knowledgeable and interested nonmathematicians.
We welcome, encourage and frequently publish contributions from readers. Contributors who wish an acknowledgement of submission should enclose a self-addressed postcard.
Avoiding Numbers
white but a mixture to the red so there must be more red in the white.") Now of course one can work it out quantitatively. Let R and W be the given volumes of red and white wine (measured in some units) and let S be the volume of the spoon. Then it is not hard to get the formula for the amounts in question in terms of R, W, and S. The point is that this is all unnecessary, and, further, the condition of complete mixing is a red herring. Here is an alternative scenario. You have a full glass of wine (of either color). Throw some of it in the ocean, wait 5 minutes, and then refill the glass from the now slightly polluted ocean. Question: Is there more wine in the ocean or (pure) ocean in the wine? Answer: The two amounts are equal. Reason: At the end of the experiment the ocean in the wine glass has replaced the wine that was originally there. Exercise: Where is the missing wine? The significance of this example is that it not only avoids numbers but, one can argue, it avoids mathematics of any sort, using nothing but common sense. I will return to this matter in the next section.
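For readers who do want to bring on the numbers, here is a minimal Python sketch of the two-spoonful experiment. The function name `two_spoonfuls` and the parameter `red_share` are illustrative inventions, not part of the column; `red_share` is the fraction of the returned spoonful that happens to be red, and leaving it out assumes complete mixing. Exact rational arithmetic is used so that the two amounts can be compared without rounding.

```python
from fractions import Fraction


def two_spoonfuls(R, W, S, red_share=None):
    """Return (red wine left in the white glass, white wine moved to the red glass).

    R, W: starting volumes of red and white; S: volume of the spoon (S <= R).
    red_share: fraction of the second spoonful that is red wine.
    None means complete mixing (an assumption the argument does not need).
    """
    R, W, S = Fraction(R), Fraction(W), Fraction(S)
    red_glass = {"red": R, "white": Fraction(0)}
    white_glass = {"red": Fraction(0), "white": W}

    # First spoonful: S units of pure red go into the white glass.
    red_glass["red"] -= S
    white_glass["red"] += S

    # Second spoonful: S units go back, some red and some white.
    if red_share is None:
        red_share = white_glass["red"] / (white_glass["red"] + white_glass["white"])
    red_back = S * Fraction(red_share)
    white_back = S - red_back
    white_glass["red"] -= red_back
    white_glass["white"] -= white_back
    red_glass["red"] += red_back
    red_glass["white"] += white_back

    return white_glass["red"], red_glass["white"]


print(two_spoonfuls(10, 10, 1))                            # complete mixing
print(two_spoonfuls(7, 3, 2, red_share=Fraction(1, 4)))    # badly mixed spoonful
```

Whatever value `red_share` takes, the two returned amounts agree, which is the sense in which complete mixing is a red herring: each glass ends with the volume it started with, so whatever red is missing from one has been replaced by white.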
One of the more frustrating things about being a mathematician is that almost no one other than professional scientists has the faintest idea what the subject is about or what it is we do. Ask a person chosen at random what the central concept of mathematics is and the answer will probably be numbers. Now I don't know myself what mathematics is about (more on this later), but I'm quite sure the central concept is not numbers. Still there is a central concept. It is proof. The examples to follow are intended to illustrate how sometimes the reflex action of trying to use numbers can make proofs more difficult.
Counting Beans
At the most naive level, consider the problem of deciding which of two jars contains more beans. Easy, you say, just count the number of beans in each jar. True, but a person who didn't know how to count might find the quicker solution. Remove pairs of beans simultaneously from each jar, and stop when one of them is empty. This fable illustrates (1) the advantage of "parallel computation" and (2) at a very primitive level, the notion of cardinality of sets via one-to-one correspondences.
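The pairing procedure can be sketched in a few lines of Python. This is only an illustration under assumed names (`fuller_jar` is hypothetical); the point is that the comparison never computes a length or any other number, only tests whether a jar is empty.

```python
def fuller_jar(jar_a, jar_b):
    """Compare two jars by removing one bean from each until one is empty.

    No counting is done: the only question ever asked is "is this jar empty?"
    """
    a, b = list(jar_a), list(jar_b)   # work on copies so the jars survive
    while a and b:                    # both jars still hold a bean
        a.pop()                       # take one bean from each jar
        b.pop()
    if a:
        return "first"
    if b:
        return "second"
    return "same"


print(fuller_jar(["bean"] * 5, ["bean"] * 3))
```

The loop stops after as many steps as there are beans in the smaller jar, which is the "quicker solution" of the fable.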
Mixing Wine
A bit more sophisticated, but the same idea is illustrated in this well-known problem. You are given a glass of red and a glass of white wine and are told to take a spoonful of the red, add it to the white, and mix completely. Then take a spoonful of the mixture and return it to the red. Question: Is there more red in the white or white in the red? (Warning: this problem may have the effect of alienating people, who may argue vehemently for wrong answers like, "You added pure red to the
*Column editor's address: Department of Mathematics, University of California, Berkeley, CA 94720 USA.
Qualitative Geometry
The preceding questions and those to follow are qualitative as opposed to quantitative problems. A qualitative question deals with concepts like more, fewer, greater, equal, less, bigger, smaller, whereas a quantitative problem involves attaching numbers to objects. The example par excellence of a qualitative theory, and I was surprised to realize this, is the geometry of Euclid. This is not to be confused with what we now refer to as Euclidean geometry with the "Euclidean metric." There is no metric, no concept of distance, no notion of the size of an angle in Euclid's Elements. It is true that Euclid adds, subtracts, and compares objects, but these objects are the segments and angles themselves, not their lengths or sizes. Here, for example, is the triangle inequality, the celebrated Proposition XX.
In any triangle two sides taken together are greater than the remaining side.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
Here is Euclid's proof: Given triangle ABC, extend side BC to D so that DC equals AC (here and elsewhere for "equals" read "is congruent to").
[Figure: triangle ABC with side BC extended to D.]
To show β is greater than γ, connect A to the midpoint M of BC and extend the line AM to D so that MD equals AM. Then triangles AMC and DMB are congruent (side-angle-side), so γ equals δ, which is contained in, hence less than, β.
Remark. We know, and so does Euclid, the much
Then, since triangle ADC is isosceles, angles α and β are equal (Proposition V). Further, α is less than (read, "contained in") γ. But in triangle ABD, the side opposite γ is AC+CB and the side opposite β is AB, so by (previously proved) Proposition XIX, DC+CB = AC+CB is greater than AB. Qualitative all the way! But let us continue to backtrack. Proposition XIX is the immediate converse of
stronger fact that the exterior angle is the sum of the opposite interior angles (Proposition XXXII). Why then does Euclid settle for this much weaker result? The reason is surely that he wants to go as far as possible without having to make use of the parallel postulate. In today's language we would say that the three propositions proved above are theorems of absolute geometry, holding in elliptic and hyperbolic as well as Euclidean geometry. Once again we see an example of the astonishing sophistication of the mathematics of 2000 years ago.
Stacking Rectangles
Two congruent rectangles are stacked one on top of the other as shown in the figure below on the left. Question: Does the upper rectangle cover more or less than half of the lower one?
PROPOSITION XVIII. In any triangle the greater side
subtends the greater angle. Referring to the figure below, we suppose AB is longer than BC.
Choose D on AB so that BD equals BC. Then CBD is isosceles, so angles ε and δ are equal (Prop V); but ε is contained in, hence less than γ, and δ, an exterior angle of triangle ADC, is greater than α. This in turn used PROPOSITION XVI. The exterior angle of a triangle is
The first impulse is probably to bring on the numbers, i.e., calculate the areas of the two uncovered triangles. One can do this, of course, but even after arriving at the somewhat messy expression, it is not at all obvious which way the inequality should go, whereas a little thought suggests drawing the two lines shown in the figure on the right and observing the two pairs of congruent triangles. Indeed, one does not even need to have the concept of area to present the problem. We need only agree that one region is bigger than another if the second is "piecewise congruent" to a subset of the first.
greater than either of the interior opposite angles.
A Look Back
Proof.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
In winding up my 5-year tenure as editor of the "Mathematical Entertainments" column it seems appropriate to look back, try to take stock, and report any cosmic illuminations I may have gained from the experience. The 22 columns starting in the winter of 1991 have treated a sort of hodgepodge of topics. Here is a rough inventory.
Topic                              No. of columns
1. Sequences                       3
2. Games and paradoxes             3
3. Automata (the Ant)              3
4. Tilings by rectangles           2
5. Triangles                       3
6. Privacy-preserving protocols    2
7. Optimization                    2
It's pretty hard to detect any unifying theme in all this. Computers played a major role in the first five topics, sometimes in solving problems but more often in posing them. In contemplating these "computer-generated mysteries" it seems to me that the problems they posed were of two very different sorts, solvable and unsolvable.
Examples
Our very first column featured the sequences discovered by Michael Somos and their generalizations. These sequences are given by rational recursions, but the terms turn out for no apparent reason to be integers. Ingenious arguments have proved integrality for some cases, but most cases remain unexplained. Here is an example which as far as I know is still mysterious. We consider the sequence $(a_k)$ given by the recursion,
$a_{k+8} = (a_{k+1} a_{k+7} + a_{k+4}^2)/a_k$, where the first eight terms are 1. This is just the simplest case of a 4-parameter family of algorithms which grind out integers as far as the eye can see. How come? Now I believe this is a solvable problem, by which I mean there exists a proof that these generalized Somos sequences always yield integers. This is not to say that anyone will necessarily find the proof; but one feels strongly that such a proof exists. The above is in contrast to other sequence problems which I speculate are unsolvable. A good example is the shuffle problem which appeared back in the winter issue of 1992. In case you have forgotten, we considered a countably infinite deck of cards numbered 1, 2, .... An n-shuffle consists of taking the first n cards of the deck and interlacing them with the next n. Thus, after a 5-shuffle the deck would look like: 6, 1, 7, 2, 8, 3, 9, 4, 10, 5, 11, 12, 13, .... We are told to start with a 1-shuffle, giving 2, 1, 3, 4, ..., then a 2-shuffle, giving 3, 2, 4, 1, 5, 6, ..., and so on; and the conjecture, essentially due to Richard Guy, is that if you keep going long enough, every card will eventually rise to the top of the deck. What about "experimental evidence"? As was reported in the column, all numbers up to 5000 have been tested, and they all do manage to rise to the top, although some of them take a long time to get there. It takes more than 250 million shuffles for card 54 to make it, and more than 21 billion for card 3464. Now I have no idea whether Guy's conjecture is true or false, but I do have a strong feeling that there simply is no proof either way, meaning no proof from any set
of sensible axioms. This is based on the hunch that when phenomena exhibit this sort of seemingly random, chaotic behavior, it is an indication of unprovability. A similar and simpler question was raised in the 1992 summer issue. Consider the sequence consisting of the powers of 2. Do all but a finite number of these have, say, a seven in their decimal expansion? The largest known sevenless power of 2 (which, amazingly, is also fiveless and nineless) is $2^{71}$ = 2361183241434822606848.
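The three experiments mentioned above (the recursion, the n-shuffles, and the sevenless powers of 2) are easy to try by machine. The following is only a minimal sketch: the function names are made up for illustration, the recursion is indexed so that $a_0, \ldots, a_7$ are the eight initial 1's, and the infinite deck is replaced by a finite prefix long enough for the shuffles performed.

```python
from fractions import Fraction


def recursion_terms(count):
    """First `count` terms of a_{k+8} = (a_{k+1} a_{k+7} + a_{k+4}^2) / a_k.

    Assumption: a_0 .. a_7 are the eight initial 1's. Exact rational
    arithmetic is used so that a non-integer term would be visible.
    """
    a = [Fraction(1)] * 8
    while len(a) < count:
        k = len(a) - 8
        a.append((a[k + 1] * a[k + 7] + a[k + 4] ** 2) / a[k])
    return a[:count]


def n_shuffle(deck, n):
    """Interlace the first n cards with the next n (next block's card first)."""
    first, second = deck[:n], deck[n:2 * n]
    mixed = [card for pair in zip(second, first) for card in pair]
    return mixed + deck[2 * n:]


def shuffles_until_top(card, limit):
    """Apply a 1-shuffle, 2-shuffle, ..., up to a `limit`-shuffle.

    Returns how many shuffles were needed to bring `card` to the top,
    or None if it did not get there within `limit` shuffles.
    """
    # An n-shuffle only touches the first 2n positions, so a prefix of
    # length 2 * limit behaves exactly like the infinite deck here.
    deck = list(range(1, 2 * limit + 1))
    for n in range(1, limit + 1):
        deck = n_shuffle(deck, n)
        if deck[0] == card:
            return n
    return None


def is_sevenless(exponent):
    """True if the decimal expansion of 2**exponent contains no digit 7."""
    return "7" not in str(2 ** exponent)
```

For instance, `shuffles_until_top(3, 10)` returns 2, in agreement with the deck 3, 2, 4, 1, 5, 6, ... listed above after the 2-shuffle. A brute-force loop of this kind is far too slow for the billions of shuffles quoted in the column; it is meant only to make the definitions concrete.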
It seems to me unlikely that there is a proof that all powers of 2 from here on have at least one 7 in their decimal expansion (assuming it's true), but Hillel Furstenberg believes that one may be able at least to show that there are only finitely many sevenless powers of 2. All of which leads to the following meta-meta-question. Consider again statements like, "every card eventually reaches the top of the deck," or "there are only finitely many sevenless powers of 2." Suppose it is true that there are no proofs of these assertions and, therefore, we can never know the answer. Are they nevertheless either true or false? We are back to the question of the law of the excluded middle. Must every mathematically well-formed statement be either true or false? Is it a fact (there's another word to conjure with) that Guy's conjecture is true? If so, was it a fact before he conjectured it? During the dinosaur era? At the time of the Big Bang? My feeling is that words like "true," "false," and "fact" are not appropriate for dealing with these infinite questions, questions whose verification would depend on performing impossible experiments, like calculating all the powers of 2.
So much for cosmic speculation. Coming back down to earth, I'm still offering a $200 prize to anyone who can tell me who wins Chomp on the non-negative lattice points of $\mathbf{R}^3$. And I'm still tormented by the assertion by Scott Huddleston that he can show that Chomp on a 3 × ∞ board is a second-player win. Huddleston seems to have vanished, if not from the face of the earth, at least from the face of the Internet. If you read this, Scott, please phone home.
And Finally, What Is Mathematics Anyway?
After much rumination I've reached the conclusion that there's no such thing. Let me elucidate. Consider again the wine-water problem. If one "deconstructs" it properly, all the mathematics disappears. Here is another well-known question which doesn't seem to be mathematics at all ... or is it?
Doing it with Mirrors
Why is it, people have asked, that mirrors reverse right and left but not top and bottom? This hardly seems like a question in mathematics, but if not, what kind of a question is it? Physics? Optics? There is a fairly substantial literature on this question, but I have yet to see a completely satisfactory discussion. Here's how I look at it.
1. The mirror is irrelevant. The phenomenon occurs in other quite different contexts. Write some words on a thin sheet of paper and then turn it away from you and hold it up to the light. The words are still there but the writing runs backward but not upside down. Is this a strange property of paper? Of course not. The phenomenon was produced by you, the experimenter, when in turning the paper toward the light you chose to turn it about a vertical rather than a horizontal axis.
2. My local barbershop has a sign painted on its window which viewed from the street looks like this,
Hair Cut Connection
but as I look at it from inside while the barber is at work it looks like this,
[figure: the same sign in mirror-reversed lettering]
but not like this.
[figure: the same sign flipped top to bottom]
How come? Again it was I, the "experimenter," who caused this to happen, but in this case I moved myself rather than the window. In looking at the window from the inside I was facing in the opposite direction from when I was looking at it from the street, and I achieved this by turning myself about a vertical axis. If I had wanted to, and was sufficiently well coordinated, I could have faced the window from inside the barber shop by standing on my head, in which case the writing would have appeared to be like the last figure above.
3. The mirror phenomenon is felt to be paradoxical because the mirror seems to reverse sides but not top and bottom. But this is not true. My mirror is on the north wall of my bedroom and my window is on the west wall, looking out at San Francisco Bay. When I look in the mirror, the reflected window is also on the west wall of the reflected room. Clearly, the mirror does not reverse east and west. Why then are we confused?
4. Nevertheless, you may say, why is it that the guy I am looking at in the mirror seems to be wearing his watch on his right wrist (rather than his left ankle)? The point is that these paradoxes are of our own making. Right and left, unlike east and west or top and bottom, are in the eye of the beholder. So are backward and upside down. All of the italicized words are defined relative to the observer, whereas east, west, north, south, up, down are objective. The sun always rises in the east, no matter where I happen to be, but it may rise on my right or left or in front or in back depending on my position. The fallacy is in mixing relative with absolute concepts, the "subjective" left-right with "objective" east-west. When I say my reflection wears a watch on his right hand rather than his left ankle, it is because if I were to replace "him" in the mirror, I would do so by making an about-face, keeping my feet on the ground, rather than going into a handstand.
The fact that we humans, along with most other animals, are bilaterally symmetric makes us feel that the only natural way to "turn around" is about a vertical rather than a horizontal axis, but it doesn't have to be that way. But let us get back to the issue at hand. In what way can the arguments above be thought of as mathematical? I claim there is some similarity both in form and content. As to form, we have insisted, as in mathematics, on having precise definitions of the words we use. As to content, the key words, left and right, are the
everyday expression of the notion of orientation, which is pervasive in much of mathematics. Here is a last example which is even more remote from what one would normally consider mathematics. I recall reading in a child's science book the fairly obvious optical fact that the images of objects on the retina of the eye are upside down. The book went on to say that the brain by some remarkable unexplained trick manages to turn them right side up. This is another example of the subjective-objective fallacy. To whom does the image on the retina appear upside down? To someone peering into my eye from the outside. To say that it appears upside down to me implies the absurd idea that I am looking at my own retina. The point of these examples is to illustrate the gray area between what is and is not mathematics, but, more importantly, to show how we get tricked and trapped by words if we fail to analyze them rigorously enough. It is my belief that we are victims of a similar word trap when we try to define mathematics. Looking for a definition of mathematics is chasing a will-o'-the-wisp. We don't even have a definition of, for example, the word "green." My wife insists that this green tie I'm wearing is actually blue, and there is no way either of us can prove we are right. On a more highbrow level, I recall that many years ago there was an ongoing exchange of letters in the New York Times Book Review section on the question "What is History."
Scholar A said history is simply everything that happens, but scholar B insisted that, no, history is only those things that happen that make other things happen, or words to that effect--and the debate went on. I think these things illustrate what I call the linguistic fallacy. We invent words like "history," "mathematics," "green," in order to be able to communicate with each other. But then we seem to think that because the words exist they must correspond to some actual outside entity, that the concept of History or Mathematics has some sort of existence of its own, perhaps in the mind of God, and so we seek to discover what these things really are, forgetting the fact that it was we, not He/She, who invented them in the first place. What is mathematics? My answer is that there is no such thing. For some things, e.g., up, down, left, right, it is crucial to have definitions. For others, like mathematics, searching for a definition becomes mere wordplay. Let's not waste time on it.
And a Final Perspective
This from a granddaughter now in second grade who is somewhat ambivalent on the subject of mathematics. As she puts it, she likes carrying, but she can't stand borrowing, which probably says it all. Over and out. DG
Forgotten Fractals Keith Hannabuss
Computer graphics have kindled a general interest in fractals, and a cabinet of mathematical curiosities, such as Cantor's ternary set, has been brought to the attention of a wider public. The simple recipe of dividing the unit interval [0,1] into three parts, removing the middle piece, and then continuing the process so that at each stage, each remaining subinterval is similarly subdivided into three and the middle piece removed, demands little background knowledge. What is less often remarked, even in books presenting an historical approach, is the fact that this set appeared originally in an 1875 paper by the Oxford mathematician Henry Smith (1826-83) [18], some eight years before Cantor mentioned it (without giving its recursive geometrical construction) in 1883 [3]. (One welcome exception is [19], which provides the correct attribution.) Smith, whose major contributions were in number theory, is less well known than he deserves to be; his results tended to be noticed by Continental mathematicians only after their rediscovery by others [8]. Many mathematicians will have encountered Smith's name only in Constance Reid's biography of Hilbert [12], as the mathematician who posthumously shared the 1883 Grand Prix of the French Académie des Sciences with the young Hermann Minkowski. The furore generated by the joint award, which is mentioned there, was, however, due to the fact that the prize had been offered for solving a problem whose solution Smith had already published in 1867 [17], rather than to any perceived slight to Smith's memory. His best known result is probably the Smith normal form of a matrix over a Euclidean domain, the theorem that any such matrix can by pre- and post-multiplication by non-singular matrices be put into a form in which the only non-zero entries are on the leading diagonal, with each such entry dividing all its successors, and uniquely determined up to units.
Smith published this result for integer matrices in an 1861 paper on linear Diophantine equations [16], but it was only after its rediscovery and application by Frobenius in 1879 that it became more widely known [6], [7].
In his 1875 paper Smith was investigating and clarifying certain aspects of Riemann's recently published theory of integration [13], and was, in particular, interested in the question of how discontinuous a function can be and yet be integrable. The paper is illuminated by five particular examples, of which the last two are of most interest here. The fourth example proceeds as follows: "Let m be any given integral number greater than 2. Divide the interval from 0 to 1 into m equal parts; and exempt the last segment from any subsequent division. Divide each of the remaining m - 1 segments into m equal parts; and exempt the last segment of each from any subsequent division. If this operation be continued ad infinitum, we shall obtain an infinite number of points of division P upon the line from 0 to 1. These points are in loose order." Not only is this construction essentially the same as that for the Cantor set, but when m = 3 it results in a half-scale model of the standard ternary set
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
Henry John Stephen Smith (1826-83), who was Savilian Professor of Geometry at Oxford from 1861 until his death.
(see Figure 1). By noting that after k stages the total length of the unexcluded segments is $[(m-1)/m]^k$, which tends to 0 as $k$ tends to $\infty$, Smith was able to show that a function discontinuous only at the points of the set so constructed was integrable. In his fifth example Smith used a similar technique, except that at the k-th stage each remaining subinterval was to be divided into $m^k$ equal subintervals and the last excluded. This time the total length of the remaining intervals after k stages was $\prod_{s=1}^{k} (1 - m^{-s})$, which approaches a finite nonzero limit as $k \to \infty$, so that functions discontinuous at the points of the resulting set would not be integrable. (This provided a counterexample to an assertion of Hankel.) Yet again Smith's work was ignored, although, as Hawkins has noted in his classic history of Lebesgue's theory of integration [9], "Probably the development of a measure-theoretic viewpoint within integration theory would have been accelerated had the contents of Smith's paper been known to mathematicians whose interest in the theory was less tangential than Smith's."
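The contrast between the two examples can be checked numerically. The sketch below simply evaluates the two length formulas quoted above in exact rational arithmetic; the function names are illustrative and not taken from Smith's paper.

```python
from fractions import Fraction


def fourth_example_length(m, k):
    """Total length of the unexcluded segments after k stages: ((m-1)/m)^k."""
    return Fraction(m - 1, m) ** k


def fifth_example_length(m, k):
    """Total remaining length after k stages: product of (1 - m^-s), s = 1..k."""
    length = Fraction(1)
    for s in range(1, k + 1):
        length *= 1 - Fraction(1, m ** s)
    return length


if __name__ == "__main__":
    for k in (1, 5, 10, 50):
        print(k,
              float(fourth_example_length(3, k)),
              float(fifth_example_length(3, k)))
```

For m = 3 the first column of values shrinks toward 0, while the second settles down to a positive number (it stays above one half), which is the distinction Smith exploited.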
Smith's construction certainly provides one of the earliest published examples of a fractal set, and may be the first to have been constructed recursively. The earliest such construction of a geometrical fractal seems to have been that of Bolzano in the 1830s, but this lay undiscovered amongst his papers for some ninety years until its eventual publication in 1922 [10]. Riemann's 1854 example of a function which is continuous but not differentiable at most points appeared in the paper which stimulated Smith's own work; but, like the other examples of this genre, it was very different in spirit from Smith's construction, being defined by a series rather than by geometrical recursion. Although discussed in a lecture of 1872, Weierstrass's famous example of a continuous nowhere differentiable function was first published (with acknowledgment) in a paper of du Bois-Reymond which appeared in the same year as Smith's work [5], whilst Cellérier's earlier example of the same phenomenon (from around 1860) was published only in 1890, when it was discovered amongst his papers after his death [2]. For a fuller discussion of these other fractals (but omitting Smith's contribution), see [4]. The main purpose of this article is, however, to take up some ideas motivated by the discussion, in Smith's final section, of the multidimensional analogue of his ideas. Before doing this, let us first look at a two-dimensional version of the ternary set. Start with a unit square. By splitting it into three both horizontally and vertically, divide it into nine subsquares, the central of which is then removed. (This removes the points whose x and y coordinates would both qualify them for exclusion in the one-dimensional case.) The same procedure is then meted out to each subsquare and so on. This construction, which is shown in Figure 2, leads to the well-known fractal Sierpiński carpet [15].
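The recursive recipe for the carpet translates directly into a few lines of code. This is a minimal sketch under the assumption that each stage is recorded as the set of retained cells on a $3^k \times 3^k$ integer grid; the name `carpet_cells` is hypothetical.

```python
def carpet_cells(k):
    """Cells (x, y) retained after k stages, on a 3**k by 3**k grid."""
    cells = {(0, 0)}  # stage 0: the whole unit square is a single cell
    for _ in range(k):
        # Split every retained cell into nine and drop the central one.
        cells = {(3 * x + dx, 3 * y + dy)
                 for (x, y) in cells
                 for dx in range(3)
                 for dy in range(3)
                 if (dx, dy) != (1, 1)}
    return cells
```

Each stage keeps eight of the nine subsquares of every retained cell, so `len(carpet_cells(k))` equals `8 ** k`.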
One can similarly produce a planar version of Smith's construction, in which, instead of removing the central subsquare at each stage, it is the bottom right-hand corner square which is deleted. The asymmetry in Smith's construction, however, leads to a very different picture, shown in Figure 3 for the case of m = 3. It contains a particularly interesting feature totally absent from the Sierpiński carpet: the boundary at the lower right-hand
Figure 1. The first six stages in the construction of the Cantor and Smith sets. At each stage the Smith set is topologically the same as the preceding stage of the Cantor set. In the limit the Cantor set is just a double size version of the Smith set.
Figure 2. The first, second, and fourth stages of the construction of the Sierpiński carpet, a two-dimensional analogue of the Cantor set.
Figure 3. The first, second, and fourth stages of a planar version of Smith's construction when m = 3. The bottom right-hand corner is beginning to take on its eventual snowflake curve profile.
corner resembles a rectangular version of the famous snowflake curve, which appeared in a 1904 paper of Helge von Koch [11], in which edges facing the corner have been sheltered from further excrescences. There is an additional amusing possibility in the plane, which would be nugatory for the interval: one can take m = 2. This gives rise to a Sierpiński gasket [14], [20], as shown in Figure 4. (See also [1].) It is impossible to tell from Smith's paper, which, like all his published work, is very terse, whether he was
aware of these particular examples. He was aware, however, that the boundaries need not be rectifiable, for he notes that "the space of integration ... may be intersected by curves of discontinuity ... the function may be integrable even though the total length of the curves of discontinuity is infinite; because an infinite number of contiguous curves may be enclosed in the same channel." It is tempting to speculate that Smith had sketched out the first few steps in the planar construction outlined here for m = 3, and so had seen the snowflake-like curve develop.
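Readers who want to sketch out the first few steps themselves can do so with a short program. The following is a rough illustration of the planar construction as described in this article (delete the bottom right-hand subsquare of every retained cell at each stage); the grid representation and the names `planar_smith_cells` and `render` are assumptions made for the example.

```python
def planar_smith_cells(m, k):
    """Cells (x, y) retained after k stages on an m**k by m**k grid.

    Convention assumed here: y = 0 is the bottom row and x = m - 1 the
    right-hand column, so (m - 1, 0) is the bottom right-hand subsquare.
    """
    cells = {(0, 0)}
    for _ in range(k):
        cells = {(m * x + dx, m * y + dy)
                 for (x, y) in cells
                 for dx in range(m)
                 for dy in range(m)
                 if (dx, dy) != (m - 1, 0)}
    return cells


def render(cells, size):
    """Text picture of the retained cells, top row first."""
    rows = []
    for y in reversed(range(size)):
        rows.append("".join("#" if (x, y) in cells else "."
                            for x in range(size)))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(planar_smith_cells(2, 4), 2 ** 4))  # gasket-like pattern
    print()
    print(render(planar_smith_cells(3, 3), 3 ** 3))  # the case m = 3
```

Printing the case m = 2 for a few stages shows the triangular pattern of Figure 4 emerging, and the case m = 3 shows the stepped lower right-hand corner.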
Figure 4. The second, fourth, and sixth stages of a planar version of Smith's construction when m = 2, showing the convergence to a Sierpiński gasket.
References
1. J.E. Barnsley, Fractals everywhere, Academic Press, 2nd edition, 1993; p. 88.
2. C. Cellérier, "Note sur les principes fondamentaux de l'analyse," Bull. des Sci. Math. (2) 14 (1890), 142-60.
3. G. Cantor, "Grundlagen einer allgemeinen Mannigfaltigkeitslehre," Math. Annalen 21 (1883), 545-591.
4. J-L. Chambert, "Un demi-siècle des fractales 1870-1890," Hist. Math. 17 (1990), 339-65.
5. P. du Bois-Reymond, "Versuch einer Classification der willkürlichen Functionen reeller Argumenten in den kleinsten Intervallen," J. f. reine u. angew. Math. 79 (1875), 21-37.
6. F.G. Frobenius, "Theorie der linearen Formen mit ganzen Coefficienten," J. f. reine u. angew. Math. 86 (1879), 146-208.
7. F.G. Frobenius & L. Stickelberger, "Über Gruppen von vertauschbaren Elementen," J. f. reine u. angew. Math. 86 (1879), 217-262.
8. K.C. Hannabuss, "Henry Smith" in Oxford Figures, ed. J. Fauvel, R. Flood, and R.J. Wilson, Oxford, to appear.
9. T. Hawkins, Lebesgue's theory of integration; its origins and development, Chelsea, 1970.
10. M. Jasek, "Über den wissenschaftlichen Nachlass Bernhard Bolzanos," Jahresbericht der deutschen Mathematiker-Vereinigung 31 (1922), 109-110.
11. H. v. Koch, "Sur une courbe continue sans tangente, obtenue par une construction géométrique élémentaire," Ark. f. Mat. Astron. o. Fys. 1 (1904), 681-704, and "Une méthode géométrique élémentaire pour l'étude de certaines questions de la théorie des courbes planes," Acta Math. 30 (1906), 145-174.
12. C. Reid, Hilbert, Springer, 1970; p. 12.
13. G.B. Riemann, "Über die Darstellbarkeit einer Function durch eine trigonometrische Reihe," Abh. der Ges. der Wiss. zu Gött. 13 (1868), 87-132 (based on Habilitationsschrift of 1854).
14. W. Sierpiński, "Sur une courbe dont tout point est un point de ramification," Comptes Rendus (Paris) 160 (1915), 302.
15. W. Sierpiński, "Sur une courbe cantorienne qui contient une image biunivoque et continue de toute courbe donnée," Comptes Rendus (Paris) 162 (1916), 629.
16. H.J.S. Smith, "On systems of linear indeterminate equations and congruences," Phil. Trans. Roy Soc. 101 (1861), 1293-326; Collected Mathematical Papers, 12.
17. H.J.S. Smith, "On the orders and genera of quadratic forms containing more than three indeterminates," Proc. Roy. Soc. 16 (1867), 197-208; Collected Mathematical Papers, 18.
18. H.J.S. Smith, "On the integration of discontinuous functions," Proc. London Math. Soc. 6 (1875), 140-153; Collected Mathematical Papers, 25.
19. I. Stewart, Does God play dice?, Blackwell, 1989.
20. I. Stewart, "Four encounters with Sierpiński's gasket," Math. Intelligencer 17 (1995), no. 1, 52-64.
Balliol College Oxford, OX1 3BJ England
The Beginning of Polish Topology Krzysztof Ciesielski and Zdzislaw Pogoda
Up to the end of the 19th century Poland was not noted for mathematics. The mathematical results obtained by Nicolaus Copernicus in the 16th century and Józef Hoene-Wroński in the 19th century were not enough to make Polish mathematics famous throughout the world. Suddenly, after the First World War, the Polish Mathematical School became widely known. Two cities became great mathematical centers: Lwów, where Stefan Banach and others worked on functional analysis, and Warsaw, where the main area of research was set theory and topology. Here, we will concentrate on topological achievements. Topology was then a young branch of mathematics. How did it happen that topology flourished in a country without a significant mathematical tradition? Poland had for two centuries been divided among Germany, Russia, and Austria. However, old universities in Polish cities still continued to function; at two of them (in Kraków and in Lwów, under Austria) the lectures were conducted in Polish. The level of mathematical research was not very high. If young people wanted the best education in modern mathematics, they had to go to study in Paris, Göttingen, or other centers. By the beginning of the 20th century some Polish mathematicians were well known in the mathematical world but were scattered among different branches of mathematics. In 1911, the IVth Congress of Polish Naturalists and Physicians took place in Kraków. Wacław Sierpiński wrote about it [14],
in this Congress all the five Polish professors of mathematics took part. Each one of us presented his talk, however, when we met we talked about everything but mathematics. The reason was that we worked in different scientific areas: Żorawski was interested in geometry, Zaremba in differential equations, Puzyna in analytic functions, Dickstein in the history of mathematics, and I in set theory and number theory. There were no scientific problems common to all of us. After this Congress I thought over this problem and came to the conclusion that this situation must not continue. There was no partnership, no control. We had Polish mathematicians known abroad from their work but we had no Polish mathematics. My conclusion was that it would be much better if a greater number of Polish mathematicians worked in one area of research.
Wacław Sierpiński in 1910.
Sierpiński quickly moved to the realisation of his ideas. He was fortunate that just at this time two young mathematicians, Stefan Mazurkiewicz and Zygmunt Janiszewski, began their scientific careers, and both of them were interested in set theory and its applications. Janiszewski was studying abroad. In 1911 he passed his doctor's exam in Paris, examined by Poincaré, Lebesgue, and Borel. Two years later he presented his habilitation [8] to the University of Lwów. Some results from this paper (about cutting the plane by a continuum) are now called the Theorems of Janiszewski. In 1913, under the supervision of Sierpiński, Mazurkiewicz wrote his Ph.D. paper about curves filling a square. Sierpiński was a wonderful supervisor for his students. However, the ambitious plans and the far-reaching partnership were interrupted by the First World War. In 1914 Sierpiński was sent to an internment camp by Russians. After 4 years he came back to Warsaw and found that the University of Warsaw was already functioning and his friends Janiszewski and Mazurkiewicz were working there; Poland was again an independent country. In 1917 an article by Janiszewski entitled "O potrzebach matematyki w Polsce" (On the needs of mathematics in Poland) [7] was published in the journal Nauka Polska (Polish Science). This paper was an independent continuation of the earlier considerations of Sierpiński. Janiszewski advanced the idea of concentrating research on one new and developing branch of mathematics. He also proposed putting out a new journal which would publish papers just in this area of mathematics. This was the beginning of Fundamenta Mathematicae; in this journal were published almost all the most important results obtained by Polish mathematicians in the twenties and the thirties. Unfortunately, Janiszewski died suddenly in 1920, at age 32, before the first issue of Fundamenta Mathematicae was published.
A caricature of Kuratowski and Knaster, drawn in 1946 by Polish mathematician Leon Jeśmanowicz.
Also in 1917, Janiszewski and Mazurkiewicz started to conduct topological seminars. After the return of Sierpiński, the seminars became much more active. These were probably the first regular topological seminars in the world. Very soon the three masters were joined by their young, talented, and active pupils. These included Kazimierz Kuratowski (the first Ph.D. student of Sierpiński after the war), Bronisław Knaster (a student of Mazurkiewicz), Stanisław Saks, and Antoni Zygmund. After some time, the foreign mathematicians began to come and take part in the seminars. There is a story about a mathematician from the United States, who presented a talk in a seminar and cited many theorems proved by, as he said, Mezoorkick. In the middle of the lecture, Mazurkiewicz exclaimed, "He is talking about my results!" He had not recognized the pronunciation of his surname. One can wonder why just set theory and topology
Nous avot,a le devoir douloureux d'an,,o,,eer la mort prttm&tur~ de t ZYGMUNT .}ANISZEWSKI foJidateur do .otro journal o1 mombro de la Ik~deetion. Z y g m u n t , l a n i ~ z e w s k l .6 A Vsrgovie [e 19 Juillet 1888, devint docteur de POniversll~ de Parie on 1911: maitre de eo.f6reneos dos math6matiquea A I'Universit~ de L~opoI ea 19181 profeeaeur A ['Univerait4 de Vareovio on 1919. II est d6e6dL~ ]e ;4 Janvier 19'20. See publications tout: 1. 0ontxlbatiou ~ [a g6~ln6trie dee eourboi planea g'&l~rale& (.'omple~ll~ldu* de I'Aead. dee ~eienetm de Paris, t. 150 (Note du 7 mars 1910). 9. Sur la g~t,m6trie de llg.es eantorlenne.. Coraptes Beudus~ t. 151 (Note de 18 juillet 1910). 3. Une uouvelle tend0nee en g6om6trie (on polonail) Wiadora~ci ~mtonatyeraa t. XIV (Varaovie 1910). 4. Sat lea r irr6dur entre deux points. (.'ora~tea ReVue k 152 (Bore du 90 mare 1911). 5. Sur lee eontinus irrMuetlbles entre deux points. T h ~ , paris 10it. Parer auui dens le .]oun:~ de Cl~o~ Polyte~ui~ue. ~ S~fie~ 16.6me Cahler (1919), p, 7 9 - 1 7 0 . 6. Ober die Itegriffo ahi.ie u und ,Flaehe a. lntera. gresJ of Ma~hema~cia~ls. Cambridge: Auguet 19d2. 7. D4monstsatlon d'une propri6t~ dee eontiuus irr&luetiblee entre deux points. Bull. (le l'AcacL des Sciences de Craeoeie 1912 (p. 906 --914). 8. Sat les oouF,uree du plan faitee lmr lee eontinus (en polonais). P,'oec .latematycz.c-t~zv~zne t. 26, p. I I - 69 (Vat.erie 1913). 9. Sur ]e r~alisme et rid&diame ell Math6matlquee(en polonais), PrzeJlqd ~zo$eznv t. I~ (Varsovie 191 ~: p. 161--170).
The first page of the first issue of Fundamenta Mathematicae, containing an obituary of Janiszewski.
were chosen as the main directions of research activities. At that time, set theory was quite a controversial branch of mathematics. Recall: historians of science regard Georg Cantor's research on Fourier series from 1879 to 1884 as the beginning of general topology. Cantor's investigations on the representation of real functions by trigonometric series showed that the sets of convergence of these series had a rather complicated structure. To characterize this structure, Cantor invented new definitions, applicable for any subset of the real line and Euclidean space. The definitions of a neighborhood, a limit point, a connected set, a closed set, a dense set, and many others quickly became very useful in the investigation of functions and in purely geometric problems. Surprisingly, even very well known sets like curves, lines, regions, and their boundaries led to important and difficult problems; moreover, the solutions were often amazing and led to other problems. Cantor's ideas developed into set-theoretic topology (general topology). There was another stream of mathematical thought connected with topology which was closer to geometry and algebra; this became algebraic topology. According to Pavel S. Alexandrov, Cantor should be regarded as "the creator of general topology" and Poincaré as "the creator of algebraic topology." These two directions were quite different, yet interrelated.
Figures from a famous paper by Knaster [11]: some steps in the construction of a hereditarily indecomposable continuum.
The curve in hexagonal shape, obtained by putting together six Sierpiński gaskets. Every point of this curve is of the same kind of ramification (i.e., there is a parametrization such that the inverse images of each point contain equal numbers of elements). The Sierpiński gasket does not have this property.
Why did Warsaw mathematicians select just topology for the realization of the program of Sierpiński and Janiszewski? There were several reasons for this. First, it was a logical consequence of their earlier interests: Sierpiński was interested in set theory, Janiszewski in the properties of continua, and Mazurkiewicz in curves. Second, Polish mathematicians probably anticipated that in the future not "pure" set theory but its applications would predominate. Also, young people are not afraid to attack difficult and original problems. When one undertakes such a task, the chance of final success is better in a new, developing branch of science. Topology was a very good candidate for this.
A remarkable profusion of results was obtained shortly after the war by Polish mathematicians. Many of them became classical, being mentioned in almost every book on general topology. Also, Polish mathematicians proved many specialized theorems, of interest mainly for topologists, and gave simpler proofs of known results, like the Jordan Curve Theorem or the Brouwer Fixed Point Theorem. Up to 1925, five Polish mathematicians (Sierpiński, Janiszewski, Mazurkiewicz, Knaster, and Kuratowski) published about 100 (!) papers on topology.
Let us give more details about the results of this period. When Camille Jordan defined a plane curve as an image of a continuous mapping of a closed interval into the plane, this seemed a good, natural, and fully satisfactory definition. Unfortunately, in 1890 Giuseppe Peano constructed his famous example of a curve filling a square. The notion of curve was seen to be very difficult to define. The Peano example showed also that the intuitive concept of dimension would not be easy to formalize. The Poles took up problems connected with curves and dimension. Sierpiński [29] gave another example of a square-filling curve (made into a portrait of Sierpiński by Fritz Lott, see Mathematical Intelligencer, vol. 17, no. 1, cover). In particular, they considered sets which are simultaneously compact and connected; such a set is called a continuum. A continuous image of a closed interval must be a continuum. Mazurkiewicz (in 1913) and Sierpiński (in 1920) gave a complete characterization of the spaces obtained as continuous images of a segment. Mazurkiewicz showed [20] that any locally connected continuum can be obtained as a continuous image of a segment, which proved that these sets form a very large class (the same result was obtained by Hahn, also in 1913). Later, Sierpiński [26] found that a locally connected continuum is a set which for any ε > 0 can be presented as a finite union of continua, each of diameter less than ε. It was also Sierpiński who proved [36] that no continuum can be represented as a union of countably
The Sierpiński gasket was used in the emblem of the Polish Mathematical Olympic Games for secondary school students.
Stefan Mazurkiewicz.
many pairwise disjoint nonempty closed sets. Moreover, he gave a very interesting characterization of an arc: a set M is a homeomorphic image of a closed unit interval if and only if it is a continuum and there are a, b ∈ M such that for any x different from a and b there are A and B with a ∈ A, b ∈ B, A ∩ B = {x} and A ∪ B = M.
There was another definition of curve, called the Cantor definition. A Cantor curve was defined as a plane continuum which does not contain interior points (in other words, a nowhere dense planar continuum). This definition rules out the Peano curve; however, it admits another strange phenomenon: it turns out that there is an example of such a curve which does not contain any arc (i.e., no subset of this curve is homeomorphic to a segment!). This example was presented by Janiszewski during his lecture at the International Congress of Mathematicians in 1912 in Cambridge. In 1919, during a Warsaw seminar, Hugo Steinhaus called Janiszewski's example the most complicated geometrical set ever considered in geometry. Soon it appeared that there existed much more complicated planar sets. This followed from results obtained by Knaster. We will come to this point later.
Sierpiński supposed that even combining the Jordan definition and the Cantor definition would allow some strange examples. This was the origin of further famous sets, called the Sierpiński carpet and the Sierpiński triangle curve. The Sierpiński triangle curve (also called the Sierpiński gasket) has the following interesting property: any of its points (other than the three vertices of the triangle) is a common endpoint of three arcs in the set which have only this point in common. Moreover, any Cantor curve is homeomorphic to a subset of the Sierpiński carpet! Sierpiński presented different constructions of curves of this type: for instance, take a square, divide it into four congruent squares, and throw away the square at the bottom left-hand corner. Apply the same procedure to each of the three remaining squares. Iterate this procedure. The infinite intersection of all sets obtained in this manner gives a curve with the same properties. This construction was described by Sierpiński in his paper [28]. Recent Intelligencer articles have shown a little-known construction of the triangle curve [6, Fig. 4] as well as a well-known one [38, Fig. 2]. Also in [28], we read with surprise the following:
Note that as early as one year ago Mr. Mazurkiewicz found an example of a curve which was simultaneously a Jordan curve and a Cantor curve...
Mazurkiewicz forms this curve by dividing the square into nine smaller squares using lines parallel to the sides and removing the interior of the center square, performing the same procedure on each of the remaining eight squares, and iterating this procedure ad infinitum.
We recognize the description of the Sierpiński carpet! Apparently the Sierpiński carpet was found by Mazurkiewicz. Nowadays the Sierpiński carpet is mentioned in almost every book about fractals.
We have already mentioned the example of a curve containing no arc, given by Janiszewski, and promised to describe planar sets with still stranger properties. The most famous one is undoubtedly a hereditarily indecomposable continuum, constructed by Bronisław Knaster in his Ph.D. thesis in 1922 [11]. With the methods known at the time, the construction was extremely difficult. A continuum X is called indecomposable if it contains more than one point and is not the union of two proper subcontinua. Indecomposable continua were discovered in 1910 by Luitzen Brouwer. A hereditarily indecomposable continuum, or Knaster continuum, is a set which has the property that every continuum contained in it is either a one-point set or an indecomposable continuum! This seems incredible. However, it turned out that such a strange set is not so unusual. Mazurkiewicz proved [21] that Knaster continua form a dense Gδ set in the family of all subcontinua of the square. The set constructed by Knaster quickly became a favorite counterexample; if somebody had a conjecture about continua, the first step was to check whether a Knaster continuum satisfied it. Later, it was shown that a Knaster continuum, although very strange, is quite regular. In 1948 E. E. Moise proved [23] that a Knaster continuum X is homeomorphic to each subset of X which is a continuum and contains more than one point. This was the origin of another name for this set: pseudoarc. In 1951 R. H. Bing [2] obtained another beautiful result: a pseudoarc is homogeneous; that is, if p and q are points of a pseudoarc, then there is a homeomorphism carrying the pseudoarc to itself and p to q. This condition is fulfilled by a circle. For a long time, it was supposed that a circle is the unique subset of the plane having this property.
The construction of strange, unusual sets was a specialty of Knaster. Let us mention some more examples. First, a definition: a set A is a separator of the plane if ℝ² \ A is not connected; a separator A is irreducible if no proper subset of A is a separator of the plane.
The Sierpiński carpet is also called the Sierpiński universal curve, because any Cantor curve is homeomorphic to a subset of the Sierpiński carpet (see [5]). This means that for any such curve T contained in ℝ², there is a set S homeomorphic to the Sierpiński carpet with T ⊂ S. The picture shows the idea of construction of such a set containing a given curve (in the shape of a letter 3).
The pictures published in [12] showing the constructions of untypical (now well-known) connected sets.
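The two subdivision procedures described earlier (the four-square construction from Sierpiński's paper [28] and the nine-square construction attributed to Mazurkiewicz) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `refine`, `iterate`, `gasket_stage`, `carpet_stage`, and `total_area` are hypothetical, squares are stored as (x, y, side) triples with exact rational coordinates, and each finite stage is approximated by closed squares rather than by the limit set itself.

```python
from fractions import Fraction

def refine(squares, n, removed):
    """Split each square into an n-by-n grid and drop the cells in `removed`.

    A square is (x, y, side), with (x, y) its bottom-left corner.
    `removed` holds (column, row) grid indices counted from the bottom-left.
    """
    result = []
    for x, y, side in squares:
        step = side / n
        for col in range(n):
            for row in range(n):
                if (col, row) not in removed:
                    result.append((x + col * step, y + row * step, step))
    return result

def iterate(n, removed, depth):
    """Apply the subdivision `depth` times, starting from the unit square."""
    squares = [(Fraction(0), Fraction(0), Fraction(1))]
    for _ in range(depth):
        squares = refine(squares, n, removed)
    return squares

def gasket_stage(depth):
    # Four congruent squares; throw away the bottom-left one.
    return iterate(2, {(0, 0)}, depth)

def carpet_stage(depth):
    # Nine congruent squares; throw away the central one.
    return iterate(3, {(1, 1)}, depth)

def total_area(squares):
    """Sum of the areas of the squares kept at this stage."""
    return sum(side * side for _, _, side in squares)

if __name__ == "__main__":
    for depth in range(4):
        g, c = gasket_stage(depth), carpet_stage(depth)
        print(depth, len(g), total_area(g), len(c), total_area(c))
```

At depth k the first procedure keeps 3^k squares of total area (3/4)^k and the second keeps 8^k squares of total area (8/9)^k. Both areas tend to zero, which is consistent with the limit sets having no interior points.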
Knaster constructed [11] a separator A of the plane such that A does not contain any irreducible separator. He also proved that there exists an uncountable family of pairwise disjoint and not locally connected separators of the plane. Another example constructed by Knaster is a subset of the plane which is the common boundary of infinitely many pairwise disjoint regions. Separators were also investigated by Kuratowski. Kuratowski showed that any subset A of the plane such that ℝ² \ A is the union of finitely many (at least two) regions contains an irreducible separator. Also, he proved that any irreducible separator A of the plane, such that ℝ² \ A is the union of more than two regions, is either an indecomposable continuum or the union of two indecomposable continua.
We turn to the important results in plane topology now called the theorems of Janiszewski. The First Theorem of Janiszewski states that if A and B are continua and they are not separators of the plane, then A ∪ B is a separator of the plane if and only if A ∩ B is not connected. The Second Theorem of Janiszewski states that the two-dimensional sphere S² is a Janiszewski
space (a locally connected continuum X is said to be a Janiszewski space if for any continua A, B ⊂ X with A ∩ B disconnected, there are p, q ∈ X such that A ∪ B is a separator of the plane and p and q belong to different components of X \ (A ∪ B)). Later, these theorems were generalized by Kuratowski and Stefan Straszewicz; Straszewicz noticed that in the assumptions continua may be replaced by closed connected sets. The First Theorem of Janiszewski was applied to shorten the proof of the Jordan Curve Theorem.
We have to say something about the famous results on connectivity. Connected sets were investigated as early as the very beginning of the 20th century. However, it was Kuratowski and Knaster who developed the ideas of connectivity. One of the most famous examples in topology is the Knaster-Kuratowski fan [12]. Let C be the Cantor set on the interval [0, 1] × {0} ⊂ ℝ²; denote by P the set of all end points of intervals removed from [0, 1] × {0} in the process of constructing the Cantor set. Join every point c ∈ C to the point q = (1/2, 1/2) ∈ ℝ² by a segment I_c, and denote by F_c the set of all points (x, y) ∈ I_c, where y ∈ ℚ for c ∈ P and y ∈ ℝ \ ℚ for c ∈ C \ P. The Knaster-Kuratowski fan is the set F = ⋃ {F_c : c ∈ C} ⊂ ℝ².
There were different kinds of disconnectivity. Sierpiński investigated punctiform sets (discontinuous sets), a notion introduced by Janiszewski. A set is punctiform if it does not contain any continuum of cardinality greater than 1. Sierpiński proved some theorems about decomposition of the plane into punctiform sets [27, 34]. Sierpiński and Kuratowski [18] presented a decomposition of the plane into two punctiform sets A and B such that A is an intersection of an Fσ set and a Gδ set, and B is a union of an Fσ set and a Gδ set. The technique of construction of peculiar spaces by the use of graphs of functions was used by Sierpiński and Kuratowski here for the first time.
Sierpiński distinguished the following "better" and "worse" kinds of disconnectivity [definitions (c) and (d) were introduced by him in 1921 [30]; (b) by Felix Hausdorff in 1914]: (a) countable space; (b) hereditarily disconnected space (i.e., a space which does not contain any nontrivial connected subset); (c) totally disconnected space (any two different points can be separated by open-and-closed sets); (d) zero-dimensional space (the space has a base consisting of open-and-closed sets); (e) punctiform space. For instance, the Knaster-Kuratowski fan F is connected and punctiform; the space F \ {q} is hereditarily disconnected but not totally disconnected. Sierpiński gave examples showing the difference among all these classes. He also noticed that any countable space dense in itself is homeomorphic to the set ℚ of rationals. The investigation of zero-dimensional spaces as well as some results by Mazurkiewicz anticipated the development of dimension theory.
Let us also mention some other important results. Kuratowski invented the method of generating topology by the closure operator. Sierpiński, independently of F. Riesz, characterized compact sets by families having the finite intersection property [35]. These are only a few of the many important and famous results obtained by Polish topologists at the very beginning of the century.
Outside of Warsaw, general topology was not the main field of research. However, some topological results were obtained by Lwów mathematicians: Stanisław Mazur, Stanisław Ulam, Juliusz Paweł Schauder, and, of course, Stefan Banach. Also in Kraków, Tadeusz Ważewski constructed the space which is nowadays called the Ważewski dendrite. We do not mention the many results obtained in the thirties and later.
Among the young Polish topologists there were two who turned to another kind of topology and who emigrated early from Poland: they were Samuel Eilenberg and Witold Hurewicz. Eilenberg first worked in general topology. Later, when he moved to the United States, he became interested in algebraic topology. The results of Eilenberg are nowadays considered classical and appear in many textbooks on algebraic topology. Hurewicz studied in Vienna, and then worked for a long time in Amsterdam under Brouwer. He obtained fundamental results on homotopy theory and dimension theory.
Finally, let us return to the idea of the development of mathematics in Poland with special emphasis on just one branch of mathematics. Although it gives one the opportunity to achieve significant results quickly, such a conception may be dangerous. For example, if the subject is ill-chosen, the finest mathematical brains in the country may squander their effort on a very narrow area. Also, editing a journal devoted only to one branch of mathematics was a very controversial notion at that time.
For instance, when the first issue of Fundamenta Mathematicae was published, Henri Lebesgue wrote a letter to Sierpiński, in which he expressed his enjoyment of the papers in this volume, but doubted whether so specialized a journal would receive enough papers to ensure its continuation at such a high level. The creators of the Warsaw School of Mathematics realized all these dangers. Nevertheless, they believed that their choice was good. Sierpiński thought that it was better to concentrate on one branch of mathematics than to work chaotically, with no sense of partnership.
The development of topology in the 20th century was enormous. Even those who selected topology as the branch of investigation for Polish mathematicians did not anticipate this. New trends like algebraic topology and differential topology grew and were widely applied. In these areas, Polish mathematicians were not as dominant as in general topology: the best known are Karol Borsuk, Hurewicz, and Eilenberg, the last two not working in Poland. Some mathematicians criticize not so much the choice
of topology but rather the long concentration on just general topology. They think that the research area should have been extended to algebraic topology and differential topology. Famous mathematicians differ on this point. It is impossible to imagine the development of general topology without the results obtained by Polish mathematicians. On the other hand, at present Polish mathematics does not play such a role in topology as it did 70 years ago. But is that the fault of the creators of the Warsaw School? Is it anybody's fault? It is impossible to require any talented mathematician to work on a particular kind of problem. Also, the period 1935-1950, when algebraic topology developed so richly, was very difficult for Poland. It is perhaps too early to judge these things now. We have to wait about 100 years.
We can say that the Polish contribution to the development of topology was extremely impressive. It was in Poland that the basics of general topology began to be put in order, many ideas were formalized, many definitions stated, and many really important problems solved. It even seems that the mathematicians of the twenties solved too many problems and did not leave enough for their successors. The books written by Polish topologists, especially the impressive monograph by Kuratowski, became classics. Although old, they are still much cited. Let us end by quoting the famous Japanese mathematician, J. Nagata [4]; asked about his teachers, he answered, "I had two teachers: Alexandrov and Kuratowski, because I learned topology from the books written by them."
References
1. A.V. Arkhangelskii and L.S. Pontryagin (eds.), General Topology, vol. I, Springer-Verlag, New York: 1990.
2. R.H. Bing, Concerning hereditarily indecomposable continua, Pacific Journal of Mathematics 1 (1951), 43-52.
3. R. Engelking, General Topology, PWN, 1977.
4. R. Engelking, "P.S. Aleksandrow," Wiadomości Matematyczne 20 (1978), 174-177.
5. R. Engelking and K. Sieklucki, Introduction to Topology, Amsterdam: North-Holland, 1994.
6. K. Hannabuss, Forgotten fractals, Mathematical Intelligencer 18, no. 3, 28-31.
7. Z. Janiszewski, O potrzebach matematyki w Polsce, in: Nauka Polska, Warszawa, Kasa im. Mianowskiego 1917; reprinted in: Wiadomości Matematyczne 7 (1963), 3-8.
8. Z. Janiszewski, O rozcinaniu płaszczyzny przez continua, Prace Matematyczno-Fizyczne 26 (1913), 11-63.
9. Z. Janiszewski, Sur les continus irréductibles entre deux points, Comptes Rendus Paris (1911), 752-755.
10. Z. Janiszewski, Über die Begriffe "Linie" und "Fläche," International Congress of Mathematicians, Cambridge, 1912.
11. B. Knaster, Un continu dont tout sous-continu est indécomposable, Fundamenta Mathematicae 3 (1922), 247-286.
12. B. Knaster and K. Kuratowski, Sur les ensembles connexes, Fundamenta Mathematicae 2 (1921), 206-255.
13. K. Kuratowski, Notatki do autobiografii, Czytelnik, Warszawa, 1981.
14. K. Kuratowski, Pół wieku matematyki polskiej, Wiedza Powszechna, Warszawa, 1977.
15. K. Kuratowski, S. Mazurkiewicz et son oeuvre scientifique, Fundamenta Mathematicae 34 (1947), 316-331.
16. K. Kuratowski, Topologie, vol. I, Warszawa, 1933.
17. K. Kuratowski, Topologie, vol. II, Warszawa, 1950.
18. K. Kuratowski and W. Sierpiński, Les fonctions de classe 1 et les ensembles connexes punctiformes, Fundamenta Mathematicae 3 (1922), 303-313.
19. A. Lelek, Zbiory, PZWS, Warszawa, 1966.
20. S. Mazurkiewicz, O arytmetyzacji continuów, Comptes Rendus Varsovie 6 (1913), 305-311.
21. S. Mazurkiewicz, Sur les continus absolument indécomposables, Fundamenta Mathematicae 16 (1930), 151-159.
22. S. Mazurkiewicz and W. Sierpiński, Contribution à la topologie des ensembles dénombrables, Fundamenta Mathematicae 1 (1920), 17-27.
23. E.E. Moise, An indecomposable plane continuum which is homeomorphic to each of its nondegenerate subcontinua, Trans. American Mathematical Society 63 (1948), 581-594.
24. A. Schinzel, Rola Wacława Sierpińskiego w historii matematyki polskiej, Wiadomości Matematyczne 26 (1984), 1-9.
25. W. Sierpiński, Oeuvres Choisies, vols. I, II, PWN, Warszawa, 1974.
26. W. Sierpiński, Sur une condition pour qu'un continu soit une courbe jordanienne, Fundamenta Mathematicae 1 (1920), 44-60.
27. W. Sierpiński, Sur la décomposition du plan en deux ensembles punctiformes, Bulletin International de l'Académie des Sciences de Cracovie, Ser. A (1913), 76-82.
28. W. Sierpiński, O krzywej, której każdy punkt jest punktem rozgałęzienia (Sur une courbe dont tout point est un point de ramification), Prace Matematyczno-Fizyczne 27 (1916), 77-85.
29. W. Sierpiński, O krzywych, wypełniających kwadrat (Sur les courbes qui remplissent un carré), Prace Matematyczno-Fizyczne 23 (1912), 193-219.
30. W. Sierpiński, Sur les ensembles connexes et non connexes, Fundamenta Mathematicae 2 (1921), 81-95.
31. W. Sierpiński, Sur une courbe dont tout point est un point de ramification, Comptes Rendus Paris 160 (1915), 302-305.
32. W. Sierpiński, Sur une courbe cantorienne qui contient une image biunivoque et continue de toute courbe donnée, Comptes Rendus Paris 172 (1916), 629-632.
33. W. Sierpiński, Sur une nouvelle courbe continue quelconque, Bulletin International de l'Académie des Sciences de Cracovie, Ser. A (1912), 462-478.
34. W. Sierpiński, Sur un ensemble punctiforme connexe, Fundamenta Mathematicae 1 (1920), 7-10.
35. W. Sierpiński, Un théorème sur les ensembles fermés, Bulletin International de l'Académie des Sciences de Cracovie, Ser. A (1918), 49-51.
36. W. Sierpiński, Un théorème sur les continus, Tôhoku Mathematical Journal 13 (1918), 300-303.
37. L.A. Steen and J.A. Seebach Jr., Counterexamples in Topology, New York: Springer-Verlag, 1978.
38. I. Stewart, Four encounters with Sierpinski's gasket, Mathematical Intelligencer 17, no. 1, 52-64.
39. G. Temple, 100 Years of Mathematics, Duckworth, London, 1981.
Mathematics Institute
Jagiellonian University
Reymonta 4, 30-059 Kraków, Poland
e-mail: [email protected]
e-mail: [email protected]
Jeremy J. Gray*
*Column editor's address: Faculty of Mathematics, The Open University, Milton Keynes, MK7 6AA, England.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
Augustus De Morgan (1806-1871)
Adrian Rice
This year marks the 125th anniversary of the death of Augustus De Morgan. Immortalised by the famous laws which bear his name, De Morgan is otherwise largely unknown to the majority of today's mathematicians. However, he was one of the most respected and influential British mathematicians of his day, an intriguing character whose enormous intellect was matched by a sharp wit and sense of humour.
Born in Madurai, southern India, on 27 June 1806, De Morgan suffered an early infection which left him blind in the right eye, a disability which throughout his life resulted in concentration on mental rather than physical activities. Raised and educated in the southwest of England, he entered Trinity College, Cambridge, in February 1823, where his mathematical talents soon blossomed under the influence of tutors such as the algebraist George Peacock, the philosopher of science William Whewell, and the astronomer George Biddell Airy. As a result, in 1827, he graduated in fourth place as a 'Wrangler' (i.e., one with a first-class degree).
His graduation coincided with the search for professors at the newly established University College London. Founded in 1826 as "The London University," UCL was the first such body to be established in England since Oxford and Cambridge. Inspired by its progressive aims and explicit secular character, De Morgan applied for the mathematics chair. At 21, he was the youngest of thirty-one candidates and had no teaching experience whatsoever. Nevertheless, due in no small part to excellent references from his Cambridge mentors, he was unanimously elected founder Professor of Mathematics on 23 February 1828, and gave his first lecture on 5 November that year.
Augustus De Morgan
All did not go smoothly, however. Relationships between the twenty-eight professors and the college's dictatorial governing council were often strained; and when, in 1831, the professor of anatomy was unfairly dismissed, De Morgan, being a man of principle, promptly resigned in protest. Five years later, however, his successor, Professor George James Pelly White, was accidentally drowned while on holiday in the Channel Islands. De Morgan immediately offered himself as a temporary replacement ... and stayed on for another thirty years! (The professor of anatomy who had been at issue never did return.)
The maths course then offered at University College was divided into four groups: the junior and senior classes, each with a higher and lower division. The course began with elementary arithmetic and the first book of Euclid's Elements, progressing as far as the calculus of variations in a period of two years. Incidentally, De Morgan never taught what we today would call "applied maths." Subjects such as dynamics and statics were taught by the Professor of Natural Philosophy (i.e., physics). De Morgan lectured from 9 to 10 am and 3 to 4 pm every day except Sundays, and at the end of each lecture would give homework problems for the class to solve by next time.
Although University College was something of a feeder for the more advanced instruction offered at Cambridge, which took many of De Morgan's graduates, a number of his students went on to achieve fame in their own right, such as the algebraist James Joseph Sylvester, the mathematical textbook author Isaac Todhunter, the economist and logician William Stanley Jevons, and the constitutional writer Walter Bagehot. Recollections of pupils such as these tell us that De Morgan was an "eccentric but brilliant teacher" whose lectures were stimulating, often inspiring, but far from easy. Even his best students had to struggle to keep up, as Bagehot wrote in 1843:
De Morgan has been taking us through a perfect labyrinth lately; he was quite lost by the whole class for one lecture, but we are, I hope, getting better... We have been discussing the properties of infinite series, which are very perplexing.
(De Morgan was one of the first to lecture on this topic in Britain.)
Perhaps due to his own experiences at university, De Morgan was severely critical of how students were examined, preferring them to be able to think for themselves rather than reproduce proofs in an exam. As one ex-student later wrote:
"All cram he held in the most sovereign contempt. I remember, during the last week of his course which preceded an annual College examination, his abruptly addressing his class as follows: 'I notice that many of you have left off working my examples this week. I know perfectly well what you are doing; YOU ARE CRAMMING FOR THE EXAMINATION. But I will set you such a paper as shall make ALL YOUR CRAM of no use.'"
De Morgan was, throughout his career, a prolific writer, publishing 18 books and over 160 papers on many subjects. His research is primarily remembered today for its contribution to the development of modern symbolic algebra and logic, encouraging William Rowan Hamilton with his work on quaternions and George Boole in his algebraic logic. Indeed, De Morgan's major achievement lies in his recognition of the connection between the two disciplines. As he later characteristically put it:
We know that mathematicians care no more for logic than logicians for mathematics. The two eyes of exact science are mathematics and logic: the mathematical sect puts out the logical eye, the logical sect puts out the mathematical eye; each believing that it sees better with one eye than with two.
De Morgan's interest in logic arose from his teaching of Euclid. He noticed that the Elements provided a perfect example of the poor relationship between logic and mathematics: while Euclidean geometry was considered the model of deductive reasoning in mathematics, and the syllogism the model in Aristotelian logic, hardly any connections existed between the two systems.
De Morgan was virtually the only person to consider this peculiarity in 2000 years, although his attempts to "syllogise" Euclid were largely unsuccessful. He also believed that the traditional Aristotelian syllogistic method was inadequate in any reasoning involving quantity. Giving the following example,
Most of the Ys are Xs
Most of the Ys are Zs
∴ Some Xs are Zs
he asserted that this argument could not be proved by means of any of the normally accepted Aristotelian syllogisms. In order to rectify this defect, De Morgan introduced the notion of "quantifying the predicate" into his logic. Here he said that if the total number of Ys is m, the number of Ys that are Xs is x, and the number of Ys that are Zs is y, then there are at least (x + y − m) Xs that are Zs. For example, given that a boat with 100 people on board sinks, if 55 were below deck and the total number drowned is 70, then, by De Morgan's syllogism, at least 25 (i.e., 55 + 70 − 100) people below deck were drowned. This extension of the concept of syllogism was successful in developing a numerically definite system of logic: a significant step forward.
He published two books and four papers based on his research, of which the fourth is now regarded as his most original contribution. In it, he introduced the logic of relations which, although his work in this area was left unfinished, substantially increased the scope of the subject. Less enduring perhaps were his attempts to invent a suitable notation for his symbolic logic, which were superseded by Boole's more algebraic approach. An illustration of this is provided by the fact that the famous De Morgan's Laws are far more familiar to us in their modern Boolean formulation:
(A ∩ B)' = A' ∪ B',
(A ∪ B)' = A' ∩ B'.
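De Morgan's numerical bound and the two laws quoted above are easy to check mechanically. The following Python sketch is illustrative only: the function names `min_overlap`, `brute_force_min_overlap`, and `de_morgan_holds` are hypothetical, and the brute-force search simply tries every possible choice of the Zs among m labelled Ys to confirm that x + y − m (or zero, when that quantity is negative) is the smallest overlap that can occur.

```python
from itertools import combinations

def min_overlap(m, x, y):
    """Least possible number of Ys that are both Xs and Zs."""
    return max(0, x + y - m)

def brute_force_min_overlap(m, x, y):
    """Try every way of choosing y Zs among m Ys; the Xs are fixed as 0..x-1."""
    xs = set(range(x))
    return min(len(xs & set(zs)) for zs in combinations(range(m), y))

def de_morgan_holds(universe, a, b):
    """Check both De Morgan laws for subsets a, b of a finite universe."""
    def comp(s):
        return universe - s
    return (comp(a & b) == comp(a) | comp(b)
            and comp(a | b) == comp(a) & comp(b))

if __name__ == "__main__":
    # The boat example: 100 on board, 55 below deck, 70 drowned.
    print(min_overlap(100, 55, 70))  # prints 25

    # Exhaustive comparison on small cases, where the search is feasible.
    for m in range(1, 7):
        for x in range(m + 1):
            for y in range(m + 1):
                assert brute_force_min_overlap(m, x, y) == min_overlap(m, x, y)

    print(de_morgan_holds(set(range(10)), {1, 2, 3}, {3, 4, 5}))  # prints True
```

When both x and y exceed m/2, as in "most of the Ys", the bound x + y − m is positive, which gives the conclusion "some Xs are Zs".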
In addition to his work in mathematics and logic, De Morgan had a lifelong fascination with the history and philosophy of science in general, and mathematics in particular. He contributed over 700 articles to a publication entitled the Penny Cyclopaedia on all areas of mathematical science, including one in which he invented the term, though not the method, of "mathematical induction." Though he was deeply interested in philosophy, this mode of thought was not usually one of his strengths. As he wrote, he "had no objection to Metaphysics, far from it, but if a man takes a candle to look down his own throat, he must take care not to set his head on fire."
It was the history of mathematics that was his particular forte. Articles such as "The early history of infinitesimals in England" and "Notices of English mathematical and astronomical writers between the Norman Conquest and the year 1600" give a mere indication of the breadth of his knowledge and interest in the subject. Yet his approach was never dry. In a letter to Hamilton in 1852, he wrote:
Dates are of as much importance to an historian as to an Arab. The Arab, however, has to dry his; the historian's are as dry as possible from the outset.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
Perhaps De Morgan's best known work is a book entitled A Budget of Paradoxes. This is a collection of humorous writings and reviews featured in a weekly Victorian magazine called The Athenaeum, compiled posthumously by his widow Sophia. De Morgan was a keen bibliophile, accumulating over 3000 mathematical volumes by his death, and the Budget consists of accounts of many of these works together with anecdotes and witty verses. A couple of reviews will suffice to give the flavour. "The Decimal System as a whole. By Dover Statter. London and Liverpool, 1856. The proposition is to make everything decimal. The day, now 24 hours, is to be made 10 hours. The year is to have ten months, Unusber, Duober, &c. Fortunately there are ten commandments, so there will be neither addition to, nor deduction from, the moral law. But the twelve apostles! Even rejecting Judas, there is a whole apostle of difficulty. These points the author does not touch." "A method to trisect a series of angles having relation to each other; also another to trisect any given angle. By James Sabben. 1848 (two quarto pages). 'The consequence of years of intense thought': very likely, and very sad." Another area into which De Morgan directed his intellectual energy was mathematical astronomy. He served on the council of the Royal Astronomical Society for over three decades between 1830 and 1861, holding the office of secretary between 1831 and 1838 and again from 1848 to 1854, as well as being the society's vice-president on many occasions. Although an enthusiastic member, due to his optical disability De Morgan was not an observational or experimental astronomer. For this reason, he resisted considerable pressure to become the society's president. He wrote at the time: I will vote for and tolerate no President but a practical astronomer. . . . The President must be a man of brass--a micrometer-monger, a telescope-twiddler, a star-stringer, a planet-poker, and a nebula-nabber.
De Morgan was a man of many eccentricities. When asked his age, he was wont to declare: "I was x years old in the year x²," a phenomenon peculiar to those born in years such as 1640, 1722, 1806, 1892, 1980, and so on. In 1859, when offered an honorary law doctorate by Edinburgh University, he declined it, saying that he "did not feel like an LL.D." He also refused to allow himself to be proposed as a Fellow of the Royal Society. "Whether I could have been a Fellow," he later said, "I cannot know; as the gentleman said who was asked if he could play the violin, I never tried." Married with seven children, De Morgan lived in close proximity to University College, first at No. 69, Gower Street (now numbered 35), later moving to No. 7, Camden Street, then on the edge of London, but now
relatively central. He retained a lifelong love of London, rarely leaving it, not even for family holidays in the "desolation" of the countryside. Viewing these rural excursions with a humorous dread, he once wrote of himself: Ne'er out of town; 'tis such a horrid life: But duly sends his family and wife. Yet despite his love of the city, he never visited Westminster Abbey, or the Tower, or the House of Commons, and he refused to vote in any election. The last major event of his career was his term as first president of the London Mathematical Society, founded in 1865. Conceived as the "University College Mathematical Society" by two former students, one of whom was his son George, the society received great encouragement and support from De Morgan. His inaugural address was principally noteworthy for the emphasis it placed on the necessity for research into his two favourite topics: logic and the history of mathematics. To this day, the society commemorates its founding president with the De Morgan Medal, awarded every three years for outstanding mathematical achievement. Based at University College for the whole of De Morgan's presidency, the LMS moved to new premises at the end of 1866. De Morgan's own link with University College ended simultaneously, although the two events were unconnected. His departure was on another matter of principle, this time over adherence to the college's policy of religious equality. For De Morgan, the council's refusal to appoint a candidate to the vacant chair of philosophy on the grounds of his being a controversial Unitarian minister was a betrayal of its founding principles. He resigned his chair on 10 November 1866, giving his last lecture in the summer of 1867. He never returned, refusing even to sit for a bust to be placed in the college library, explaining that, as far as he was concerned, "our old college no longer exists." The years following this final resignation were plagued by misfortune. 
Although no personal bitterness had resulted from the controversy, the incident affected De Morgan so strongly that it injured his health. The death of George De Morgan in October 1867, at the age of just 26, served as a further blow. In 1868, he suffered a stroke from which he never fully recovered. The final decline in his health followed the premature death of another child, Helen Christiana, in August 1870. He died of nervous prostration and kidney disease on 18 March 1871, and was buried in Kensal Green Cemetery in north-west London.
Adrian Rice, School of Mathematics and Statistics, Middlesex University, Bounds Green, London N11 2NG, UK
Light Shadows: Remembrances of Yale in the Early Fifties
Gian-Carlo Rota
Jack Schwartz
The first half of Jack Schwartz's life coincides with one of the greatest ages of science. The achievements in the exact sciences of the period that runs from roughly 1930 to 1990 may well remain unmatched in any foreseeable future. Jack Schwartz's name will be remembered as a beacon of this age. No one among the living has left as broad and deep a mark on as many areas of pure and applied mathematics, on computer science, in economics, in physics, as well as in fields which ignorance prevents me from naming. I hope you will forgive me as I declare my incompetence to do justice to Jack Schwartz's life, to his personality, to his achievements. I beg your indulgence if I resort instead to an easier task. I'd like to recall a few anecdotes from a brief period of the past, the years 1953 to 1955, when I met Jack and learned mathematics as a graduate student at Yale. The first lecture by Jack I listened to was given in the spring of 1954, in a seminar in functional analysis. A brilliant array of lecturers had been expounding throughout the spring term on their pet topics. Jack's lecture dealt with stochastic processes. Probability was still a mysterious subject cultivated by a few scattered mathematicians, and the expression "Markov chain" conveyed more than a hint of mystery. Jack started his lecture with the words, "A Markov chain is a generalization of a function." His perfect motivation of the Markov property put the audience at ease. Graduate students and instructors relaxed and followed his every word to the end.
Jack's sentences are lessons in clarity and poise. I remember a discussion in the mid-eighties about the future of artificial intelligence, in which for some reason I was asked to participate. The advocates of what was then called "hard A. I." were painting a triumphalist picture of the future of computer intelligence, to the dismay of their opponents. As the discussion went on, all semblance of logical argument was given up. Eventually, everyone realized that Jack had not said a word, and all faces turned toward him. "Well," he said, "some of these developments may lie one hundred Nobel prizes away." His felicitous remark calmed everyone down. The A. I. people felt they were being granted the scientific standing they craved, and their opponents felt vindicated.
1 Inaugural address delivered at Courant Institute (New York University) at a meeting in honor of Jacob T. Schwartz on May 19, 1995.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
Jack Schwartz
I have made repeated use in my own lectures of Jack's strikingly apposite phrases. You may forgive this shameless appropriation upon learning that my students have picked up the very same phrases from me, and so on.
From Princeton to Yale
Mathematics in the fifties was a marginal subject, like Latin. The profession of mathematician had not yet been recognized by the public, and it was not infrequent for a mathematics graduate student to be asked whether he was planning to become an actuary. The centers of mathematics were few and far between, and communication among them was infrequent. The only established departments were Princeton and Chicago. Harvard was a distant third, and Yale was in the process of overcoming its overdependence on the College. In New York, Richard Courant was busy setting up his Institute of Mathematical Sciences at 25 Waverly Place, and he had just finished training his first generation of students in America, the generation of Lax and Nirenberg, of Cathleen Morawetz and Harold Grad. It was already clear that the Institute he was putting together was going to be a great center of mathematics. In the spring of 1953, I was a senior at Princeton, and I applied to various universities for admission to graduate school. It soon became apparent that I only needed to apply to one graduate school. Professor A. W. Tucker was not yet the chairman of the Mathematics Department, but he was already acting as if he were. Solomon Lefschetz, the nominal chairman on the verge of retirement, would make fun of Tucker by lavishing in public uncomfortably high praise on Tucker's managerial skills. There were few undergraduate majors, maybe a half dozen each year, and Al Tucker would see to it that they were sent to the "right" graduate schools. He made sure that Jack Milnor stayed in Princeton, and he sent Hyman Bass, Steve Chase, and Jack Eagon to Chicago, Mike Artin to Harvard. In April 1953, I wrote a letter of acceptance to the University of Chicago, which had offered me a handsome fellowship (in those days, it was extremely easy to be offered a graduate fellowship anywhere). On my way to the mailbox, I met Professor Tucker on the narrow, rickety stairs of the old Fine Hall. He asked me where I had decided to go to graduate school, and, upon hearing of my decision, he immediately retorted, "You are not going to Chicago, you are going to Yale!" I had no choice but to do what he bid me; I tore up the letter to Chicago and wrote an identical letter of acceptance of a fellowship that I had been offered by Yale. In retrospect, my decision to go to Yale is one of the few right decisions I have made, and I will always thank Al Tucker's memory for guiding me to it. Don Spencer, another of my undergraduate teachers, was first to mention Jack Schwartz's name to me. He complimented me on my choice of a graduate school, with the remark: "Oh yes, Yale, that is where Jack Schwartz is..." It was an astounding statement, considering that Jack Schwartz was getting his Ph.D. from Yale that very month. Spencer's remark began a process of turning Jack Schwartz into a mythological figure in my mind, a process that did not stop after I actually met Jack Schwartz a few months later.
Actually, I have never been able to stop the process.
Josiah Willard Gibbs
The sciences at Yale have always played second fiddle to the humanities. At faculty meetings it is not unusual to witness a professor of literature point with a wide gesture, like a Roman senator, towards Hillhouse Avenue, where most of the science departments are located, and begin an oratorical sentence with the words "Even in the sciences..." Despite the distrust that Yale College has felt towards science, Yale was once blessed with the presence of one of the foremost scientists of the nineteenth century, namely, Josiah Willard Gibbs. Gibbs served as a professor at Yale without any stipend. Professors did not receive any salary from Yale in those happy days. Teaching young men from the upper echelons was not a salaried profession, it was a privilege. The administration did of course receive handsome salaries, like all administrations in all times of history. One day, Gibbs received an offer from the recently founded Johns Hopkins University. It was an endowed professorship. We may hazard the guess that it was the position Sylvester had relinquished when he accepted a professorship at Oxford, after the requirement of religious vows for professors was dropped by the two English universities. Thanks to its endowment, the Johns Hopkins professorship carried a stipend of one hundred dollars a year. It is unclear whether Gibbs was delighted with the offer; in any case he felt obliged to get ready to move to Baltimore. One of his colleagues realized that Gibbs was packing, and hastened to contact the Dean of the College. The Dean asked the colleague if he could do something to keep Gibbs at Yale. "Why, just tell him that you'd like him to remain at Yale!" answered the colleague. The Dean kept his word and did what the colleague had recommended. He summoned Gibbs to his office and generously let him know that he wanted Gibbs to stay. It was the kind of reassurance Gibbs needed. He declined the Johns Hopkins offer, and remained at Yale for the rest of his career. Some of Gibbs's most original papers in statistical mechanics were published in the Proceedings of the Connecticut Academy of Sciences, a journal which I dare surmise few of us have ever seen. One might wonder how papers which saw the light in such an obscure publication could manage to receive within a short time worldwide publicity and acclaim. After I moved to Yale in the summer of 1953, I accidentally found the answer to this puzzle. There was no mathematics library at Yale in the fifties; a mathematics library was not opened until the early sixties, after several members of the Mathematics Department had threatened to quit.
Before that time, mathematics books were relegated to a few shelves in the Sterling Library, randomly classified under that miscarriage of reason that was the Dewey Decimal System. All students had access to the shelves. In August 1953, I used to walk through the mathematics shelves of the Sterling Library and to pull out books at random, as we do when we are young. Next to an array of perused calculus books were hard-bound lecture notes of courses offered at Yale at various times by members of the faculty. Among these were some course notes by Gibbs, written in his own hand. A few additional sheets were glued to one of these volumes. The names of all notable scientists of Gibbs's time were listed in these sheets, from Poincaré and Hilbert and Boltzmann and Mach, all the way to individuals who are now all but forgotten. Altogether, more than two hundred names and addresses were alphabetized in a beautiful, fading handwriting. Those sheets were a copy of Gibbs's mailing list. As I leafed through it with amazement, I realized at last how Gibbs had succeeded in getting himself to be known in a short time. I also learned an instant lesson, the importance of keeping a mailing list.
Yale in the Fifties
In the early fifties, Yale had not yet lost the charm of a posh out-of-the-way college for the children of the wealthy. Erwin Chargaff, in his autobiography Heraclitean Fire, describes Yale in the following words:
Yale University was much more of a college than a graduate school; and the undergraduates were all over town. They were digesting their last goldfish, for the period of whoopee, speakeasies, and raccoon coats was coming to an end, to be replaced by a grimmer America which was never to recover the joy of upper-class life. The University proper was much less in evidence. Shallow celebrities, such as William Lyon Phelps, owed their evanescent fame to the skill with which they kept their students in a state of elevated somnolence.
The main part of the campus, consisting of nine shining colleges in the middle of New Haven, was of recent vintage. At the lower end of Hillhouse Avenue, the red bricks of Silliman College shone like the plaster of a movie set as one made one's way back to the main campus from the deliberately distant science buildings. Envious Englishmen spread the malicious rumor that the colleges that Mr. Harkness's money had built were Hollywood-style imitations of Oxford colleges. But nowadays the shoe is on the other foot, and it is Oxford that is at the receiving end of other jibes. The graduate school was a genteel (though less and less gentile) appendage added to the University by gracious assent of the Dean of the College. The Dean of the College held the real power, and he could overrule the President. Since the thirties, professors appointed to the few and ill-paid graduate chairs had consistently turned out to be better scholars and scientists than the Administration had foreseen at tenure time.
Nonetheless, evil tongues from Northern New England whispered that a certain well-known physics professor would never have made it past assistant professor in Cambridge; but he was one of the last exceptions, soon to fade into best-sellerdom. Hard work, the kind one reads about in the hagiographies of scientists, was regarded by the graduate students with embarrassment. It was not unusual for a graduate student to spend seven postgraduate years as a teaching assistant before being reluctantly awarded a terminal Ph.D. The university cynically encouraged graduate students to defer their degrees: the money saved by hiring low-paid teaching assistants in place of professors could be used to enrich the rare book collection. Writing a doctoral dissertation was an in-house affair, having little to do with publishing or with distasteful professionalism. On learning about the shocking leisure of graduate life at Yale in the fifties, one may seek shelter in one of the current philosophies of education, which promise instant relief from the onslaught of reality. One would thereby be led to the mistaken conclusion that "creativity" (a pompous word currently enjoying a fleeting but insidious vogue) would be stifled in the constricted, provincial, unhurried atmosphere of New Haven. The facts tell a different story. The comforts of an easy daily routine in a rigidly circumscribed environment, encouraged by the indulgent scrutiny of benign superiors, foster the life of the mind. Professors were poorly paid but enjoyed unquestioned prestige. In their sumptuous quarters in the colleges they would encourage their students with sherry and conversation. Purposeless delectation in ideas may be as educational as intensive study. At Yale, together with the enjoyment of an absorbing range of campus activities went the lingering belief that nothing much mattered in that little corner of the world. Teachers and students were thereby led to meet the fundamental requirement of a successful educational experience: They were kept from taking themselves too seriously.
Mathematics at Yale
The Mathematics Department was the first of the science departments to awaken. It was not until the fifties that the last of a long line of professional teachers of calculus retired from the Mathematics Department: fine, upright gentlemen of the old school, richly endowed with family values, who reaped handsome profits on the royalties of their best-selling textbooks. The mathematicians who replaced them were eager to create a research atmosphere, and at last a few graduate students were slowly beginning to drift over to New Haven. From the beginning of the Yale graduate school all the way to the twenties, the one notable research mathematician to have taught at Yale was E. H. Moore, and two of the few distinguished mathematicians to come out of Yale until the fifties were Marshall Hall and Irving Segal. In the fifties, a sudden plethora of stars appeared, led by Jack Schwartz.
It is not clear how functional analysis took over the Mathematics Department. Einar Hille was hired away from Princeton sometime in the mid-thirties, but for several years he was one of two research mathematicians. At the time, several universities would hire one and only one "research mathematician"; Yale could afford as many as two: Einar Hille and Oystein Ore. Nelson Dunford was next to come, as an assistant professor. Soon after his arrival, he received an attractive offer from the University of Wisconsin, and Yale took the unusual step of promoting him two steps up to a full professorship. After the end of World War Two, Kakutani came over from Japan, and Charles Rickart from Michigan. By the early fifties, just about every younger mathematician at Yale was working in functional analysis, and the weekly seminars were attended by well over 50 people.
The core of graduate education in mathematics was Dunford's course in linear operators. Everyone who was interested in mathematics at Yale eventually went through the experience, even some brilliant undergraduates, such as Andy Gleason, McGeorge Bundy and Murray Gell-Mann. The course was taught in the style of R. L. Moore: mimeographed sheets containing unproved statements were handed out every once in a while, and the students would be asked to produce proofs on request. Occasionally, some student at the blackboard would fall silent. Dunford would make no effort to help, and the silence, sometimes lasting the whole 50 minutes, became unbearable to all. I suspect that Dunford wanted to minimize his teaching load, which in those years ran to 12 hours per week for full professors. Everyone who took Dunford's course was marked by it. George Seligman once remarked to me that Dunford's course in linear operators was the turning point in his graduate career as an algebraist.
Dunford had an unusual youth. After being passed over for a graduate fellowship in the middle of the depression in the thirties, he survived in St. Louis on 10 dollars a month, while studying and writing in the public library. Remarkably, the St. Louis library did subscribe to the few mathematics research journals of the time, and while unemployed in St. Louis Dunford managed to finish his first paper, which deals with integration of functions with values in a Banach space. After the paper had been accepted for publication in the Transactions of the A. M. S., Dunford was offered an assistantship at Brown, where he worked under Tamarkin. His doctoral dissertation dealt with the functional calculus that bears his name. He was hired by Yale right after he received his Ph.D., and spent his entire career there. He retired early, ostensibly because he had made lucrative investments in art and in the stock market. But in reality, Dunford's retirement could be another episode of The Bridge of San Luis Rey. It coincided with the completion of his life work, which was the three-volume treatise Linear Operators, written in collaboration with his student Jack Schwartz.
Post-retirement portrait of Nelson Dunford and his wife.
Linear Operators started out as a set of solutions to problems handed out in class; it gradually increased in size. Soon after Jack Schwartz enrolled in the course, Dunford asked him to become co-author. The project quickly expanded to include Bill Bade and Bob Bartle, as well as several students, instructors and assistant professors. It was fully supported by the Office of Naval Research. There is a persistent rumor, never quite denied, that every nuclear submarine on duty carries a copy of Linear Operators.
There is a fundamental difference between the quality of life in Northern New England and in Southern New England. It comes from the shadows. On a Cambridge Sunday, the sharp shadows across the Charles River cut out the outlines of the distant buildings of Boston as if made of stiff cardboard, and deepen the blue of the water. In New Haven, by contrast, the light shadows are softened in a silky white haze, which encloses the colleges in a cozy aura of unreality. Such foresight of Mother Nature bespeaks a parting of destinies.
Abstraction in Mathematics
The pendulum of mathematics swings back and forth towards abstraction and away from it, with a timing that remains to be estimated. The period that runs roughly from the twenties to the middle seventies was an age of abstraction. It probably reached its peak in the fifties and sixties. The fifties were the heyday of functional analysis, as the sixties were the heyday of algebraic geometry. The two major centers of functional analysis in the fifties were Yale and Chicago. The Mathematics Department at Stanford, which consisted entirely of classical analysts, had trouble finding graduate students. The great classical analysts at Stanford, such names as Pólya, Szegő, Loewner, Bergman, Schiffer, and the first Spencer, were considered to be hopelessly old-fashioned. At Yale you could find no analysis courses offered other than functional analysis and supporting abstractions. Algebra reached an independent peak of abstraction with Nathan Jacobson and Oystein Ore. There was a standing bet among graduate students at Yale that whenever a doctoral dissertation in analysis was turned in, the writer would be challenged to use its results to give a new proof of the spectral theorem.
In those days, no one doubted that the more abstract the mathematics, the better it would be. A distinguished mathematician, who is still alive, pointedly remarked to me in 1955 that any existence theorem for partial differential equations which had been proved without using a topological fixpoint theorem should be dismissed as applied mathematics. Another equally distinguished mathematician once whispered to me in 1956, "Did you know that your algebra teacher Oystein Ore has published papers in graph theory? Don't let this get around!" Sometime in the early eighties the tables were turned, and a stampede away from abstraction started, which is still going on. A couple of years ago I listened to a lecture by a well-known probabilist, which dealt with properties of Markov processes. After the lecture, I remarked to the speaker that his presentation could be considerably shortened if he expressed his results in terms of positive operators rather than in terms of kernels. "I know," he answered, "but if I had lectured on positive operators nobody would have paid any attention!" There are already signs that the tables may be turning again, and we old abstractionists are waiting with mischievous glee for the pendulum to swing back. Just a few months ago, I overheard a conversation between two brilliant assistant professors, purporting to provide an extraordinary simplification to some recently proved theorem; eventually, I realized with pleasant surprise that they were rediscovering the usefulness of taking adjoints of operators.
Linear Operators: the Past
The three-volume treatise Linear Operators was originally meant by Dunford as a brief introduction to the new functional analysis, and to the spectral theory that had been initiated by Hilbert and Hellinger, but that had not really taken root until the work of von Neumann and Stone. Dunford, however, championed spectral theory as a new field. He introduced the term "resolution of the identity," and he developed the program of extending spectral theory to non-self-adjoint operators. The initial core of the book consisted of what are now Chapters 2, 4, and 7, as well as some material on spectral theory now in chapter 11; eventually, this material expanded into two volumes. The idea of volume three was a belated one, coming in the wake of the development of the theory of spectral operators. The writing of Linear Operators took approximately 20 years, starting in the late forties. The third volume was published in 1971. Entire sections and even entire chapters were added to the text at various times, up to the last minute. For example, one of the last bits to be added to the first volume right before it went to press is the last part of section 16 of chapter 4, containing the Gauss-Wiener integral in Hilbert space together with a simple formula relating it to the ordinary Wiener integral. This section was the subject of a lecture that Jack gave at the famous seminar on integration in function space that was held at the Courant Institute in the fall of 1956. The flavor of the first drafts of the book can be gleaned from reading chapter 2, which underwent fewer redrafts than most of the other chapters. Dunford meant the three theorems proved in this chapter, namely, the Hahn-Banach theorem, the uniform boundedness theorem and the closed graph theorem, to be the cornerstones of functional analysis. The exercises for this chapter, which in their first draft were rather dry, were eventually enriched by a set of exercises on summability of series. These problems are continued in chapter 4, and conclude in chapter 11 with the full expanse of Tauberian theorems.
The contrast between the uncompromising abstraction of the text and the incredible variety of concrete examples in the exercises is immensely beneficial to the student who learns mathematical analysis from Dunford-Schwartz. The topics dealt with in Dunford-Schwartz can be roughly divided into three kinds. There are topics for which Dunford-Schwartz is still the definitive account. There are, on the other hand, other topics fully dealt with in the text which ought to be well-known, but which have yet to be properly read. Finally, there are topics that are still ahead of the times, and that remain to be fully appreciated. Presumptuous as it is on my part, I will try to give some examples of each kind. Besides the introductory chapter on Banach spaces (chapter 2), the treatment of the Stone-Weierstrass theorem and all that goes with it in chapter 4 still makes very enjoyable reading today; in its time, it was the first thorough account. The short sections on Bohr compactification and almost periodic functions are also still the best reference for a quick summary of Bohr's extensive theory.
Section 12 of chapter 5 is remarkable. It presents a proof of the Brouwer fixpoint theorem. The proof was submitted for publication in a journal in 1954, but it was rejected by an irate referee, a topologist who was miffed by the fact that the proof uses no homology theory whatsoever. It does use instead some determinantal identities, the kind that are now again becoming fashionable. Spectral theory proper does not make an appearance until chapters 7 and 8, with the functional calculus and the theory of semigroups. In those days, such terms as "resolvent" and "spectrum" carried an aura of mystery, and the spectral mapping theorem sounded like magic. The meat and potatoes comes in chapters 10, 12, and 13; the proofs are invariably the most instructive, bringing into full play the abstract theory of boundary conditions of Calkin and von Neumann, as well as the theory of deficiency indices.
Linear Operators: the Present

There are topics for which Dunford-Schwartz was the starting point of a long development, and which have grown into autonomous subjects. Thus, for example, the notion of unconditional convergence of series in Banach spaces, which goes back to an old theorem of Steinitz and which is mentioned in chapter 2 almost as a curiosity, has blossomed into a full-fledged discipline. The same can be said of the geometry of Banach spaces initiated in chapter 4, and of the theory of convexity in chapter 5. In the sixties, several mathematicians pronounced the general theory of Banach spaces dead several times over, but this is not what happened. The geometry of Banach spaces not only managed to survive, but it is now widely considered to be the deepest chapter of convex geometry. Grothendieck once told me that his favorite theorem of his analysis period was a convexity theorem that generalizes a result in Dunford-Schwartz. Unfortunately, he published it in an obscure Brazilian journal, and he never received any reprints of the papers. The theory of vector-valued measures in chapter 4 has equally blossomed into a chapter of functional analysis of beauty and depth. Strangely, at the time of the book's writing, we all thought that this theory had reached its definitive stage, perhaps because the proofs were so crystal clear. The same can be said of the theory of representation of linear operators in chapter 6; here again whole theories nowadays replace single sections of Dunford-Schwartz. Corollary 5 of Section 7, stating that in certain circumstances the product of two weakly compact operators is a compact operator, has always struck me as one of the most elegant results in functional analysis, and undoubtedly sooner or later some extraordinary application of it will be found, as should happen to all beautiful theorems. Thorin's proof of the Riesz convexity theorem had appeared a short time before chapter 6 was written, and it is here given its first billing in a textbook. I take the liberty of calling your attention to problem 15 of section 11. This exercise holds the key to giving one-line proofs of some of the famous inequalities in the classic book by Hardy, Littlewood, and Pólya. Section 12.9 has been scandalously neglected. The classical moment problems are thoroughly dealt with in this section by an application of the spectral theorem for unbounded self-adjoint operators. It is shown in a couple of pages that the various criteria for determinacy of the moment problem can be inferred from a simple computation with deficiency indices. Partial rediscoveries of this fact are still being published every few years by mathematicians who haven't done their reading.
Linear Operators: the Future

Finally, there are numerous subjects that were first written up in Dunford-Schwartz, from which the mathematical world has yet to benefit. It is surprising to hear from time to time probabilists or physicists addressing problems for which they would find ready help in Dunford-Schwartz. The functional analytic incompetence of physicists has decreased since the fifties, but one suspects that a lot of research funds might be saved if all physicists were to be required to have some basic functional-analytic background. Once, while I was a graduate student, a physicist working in quantum mechanics, who is now one of the leading theoretical physicists of our day, asked me to describe the difference between a symmetric and a self-adjoint operator in Hilbert space, a difference of which he was unaware; one wonders how much the situation has improved in forty years. Chapter 3 on measure theory is one of the chapters inserted at a fairly late stage. It has not been read much, perhaps because every reader believes he or she is supposed to know measure theory when embarking upon the reading of Dunford-Schwartz. Actually, chapter 3 contains a number of yet-to-be-appreciated jewels. One of them is the comprehensive treatment of theorems of the Vitali-Hahn-Saks type. The proofs are so concocted as to bring out the analogies between the combinatorics of sigma-fields of sets and the algebra of linear spaces. Few analysts make use of this kind of reasoning. In probability, an appeal to the Vitali-Hahn-Saks theorem would bypass technical complications that are instead settled by the Choquet theory, for example, randomization theorems of the De Finetti type. Apparently, the only probabilist to have taken advantage of this opportunity is Alfred Rényi in an elementary introduction to probability that also has been little read. Similarly, one wonders why so little use is still made of theorem III.7.6, which might come in handy in integral geometry.
Large portions of spectral theory presented in Dunford-Schwartz remain to be assimilated. Thus, for example, the fine theory of Hilbert-Schmidt operators and the wholly original theory of subdiagonalization of compact operators in chapter 11 have not been read. The spectral theory of non-self-adjoint operators of chapters 14 through 19 is a gold mine that is still waiting for its day in the sun; only the latter parts of chapter 20, dealing with what the authors have successfully called "Friedrichs's method" and with the wave operator method, have been developed since the treatise was published. It will be a pleasure to watch the rediscovery of these chapters by the younger generations.
Working with Jack Schwartz

There are fringe benefits to being a student of Jack's. Occasionally I decline invitations to attend meetings in computer science and even in economics, from organizers who mistakenly assume that I have inherited my thesis advisor's interests. Two traits of Jack's personality have particularly endeared him to his students. One is his instinctive understanding of another person's state of mind, his tact in dealing with difficult situations. He gives encouragement without exaggerating, and he knows how to steer his friends away from being their own worst enemies. The second is his Leibnizian universality. It spills over onto all of us, it lifts us and points us in the right direction. Whatever topic he deals with at one time he sees as a stepping stone to some wider horizon to be dealt with at some future time. Both of these qualities shine in the pages of Linear Operators, the first by the transparent proofs, the second from the encyclopedic range of the material that is dealt with in 2592 pages. I was hired to work on the Dunford-Schwartz project in the summer of 1955, together with Bob McGarvey. Immediately, Jack took us aside and let us in on the delicate matter of the semicolons. There were to be no semicolons in anything we wrote for the project. Dunford would get red in the face every time he saw a semicolon. For years hence, I was terrified of being caught using a semicolon, and you may verify that in the three printed volumes of Dunford-Schwartz not a single semicolon is to be found. I was asked to check the problems in chapter 3, while Bob was checking problems in chapters 7 and 8. We would all get together every morning in a little office in Leet-Oliver Hall, an office that nowadays would not be considered fit for a teaching assistant.
A bulky record player, which we had bought for ten dollars, occupied much of the space; we played over and over the entire sequence of Beethoven quartets and Bach partitas while working on the problems. It took me half the summer to finish checking the problems in chapter 3. There were a few that I had trouble with, and worst of all, I was unable to work out problem 10 of section 9. One evening, Dunford and several other members of the group got together to discuss changes in the exercises. Jack was in New York City. It was a warm summer evening, and we sat on the hard wooden chairs of the corner office of Leet-Oliver Hall. Pleasant sounds of squawking crickets and frogs came through the open window, and mosquitoes were flying in through the open Gothic windows. After I admitted my failure to work out problem 10, Dunford tried one trick after another on the blackboard, in an effort to solve the problem or to find a counterexample. No one remembered where the problem came from, or who had inserted it. After a few hours, we all got up and left, somewhat downcast. The next morning, I met Jack, who patted me on the back and told me, "Don't worry, I could not do it either". I did not hear again about problem 10 of section 9 for another three years. A first-year graduate student took Dunford's course in linear operators. Dunford assigned him the problem, and the student solved it, and developed an elegant theory around it. His name is Robert Langlands. In the second half of the summer of 1955, after checking the problems in chapter 3, I was assigned to check the problems in spectral theory of differential operators in chapter 13. This is the chapter of Dunford-Schwartz that decided my career in mathematics. Apparently, I had less difficulty with this second round of exercises, but I made a number of careless mistakes, as I always have since. One day, I was unexpectedly called in by Dunford. The details of this meeting have been many times rewritten in my mind. The large office was empty, except for Dunford and Schwartz sitting together at the desk in the shadow, like judges. "We have decided to assign you the problems in sections G and H of chapter 13", they said. A minute of silence followed. I had the feeling that there was something they weren't saying. Eventually, I got it. They were NOT assigning me the problems in section I, which dealt with the use of special functions in eigenfunction expansions.
I soon learned, somewhat to my annoyance, that the person in charge of checking the problems in section I was an undergraduate who had just gotten his B.A. two months before. "You will never find a better undergraduate in math coming out of Yale," Jack told me, aware of my feelings. He was right. The undergraduate checked all the special function problems by the end of the summer, and section I is now spotless. His name is John Thompson, and he went on to win the Fields Medal. I have kept a copy of the mimeographed version of the manuscript of Dunford-Schwartz. On gloomy days, I pull the dusty 15-pound bulk out of the shelf. Reading the now-yellowed pages, with their inky smell, was once a great adventure; rereading them after 40 years is a happy homecoming.
Department of Mathematics
Massachusetts Institute of Technology
Cambridge, MA 02139, USA
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
Ian Stewart*

The catapult that Archimedes built, the gambling-houses that Descartes frequented in his dissolute youth, the field where Galois fought his duel, the bridge where Hamilton carved quaternions: not all of these monuments to mathematical history survive today, but the mathematician on vacation can still find many reminders of our subject's glorious and inglorious past: statues, plaques, graves, the café where the famous conjecture was made, the desk where the famous initials are scratched, birthplaces, houses, memorials. Does your hometown have a mathematical tourist attraction? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks. Please send all submissions to the Mathematical Tourist Editor, Ian Stewart.
Sacred Star Polyhedron
István Hargittai

There is a beautiful star polyhedron at the top of the Sacristy of St. Peter's Basilica in Vatican City (Fig. 1). It was built by the architect Carlo Marchionni in the years 1776-1784. It is a great stellated dodecahedron, also called Kepler's great stellated dodecahedron (Fig. 2 [1]), with 2 of its 20 triangular pyramids left out to accommodate the vertical rod serving as the stand of the cross above the polyhedron.

There are many other examples of star polyhedron decorations from even earlier times, such as at the top of the obelisks in St. Peter's Square and in the Rotonda Square in Rome, and on the gate in the Square of September 20 in Bologna (Fig. 3). The star polyhedron often stands on a pile of dome-shaped stones.

An octagonal star standing on top of a pile of dome-shaped stones was a characteristic motif in the coat of arms of the Chigi family of Pope Alexander VII (1655-1667). This motif is prominently displayed on the colonnades of St. Peter's Square (Fig. 4). Giovanni Lorenzo Bernini (1598-1680) and Francesco Borromini (1599-1667) were leading architects of the Baroque period and their activities overlapped with the reign of Pope Alexander VII. The octagonal star and the coat of arms of the Chigi family are conspicuously present in many of their works. Figure 5 shows Sant Ivo's Church and three of its details by Borromini. Two of them display star polyhedra on piles of dome-shaped stones and octahedral stars. However, the decoration beneath the cross at the top of the tower is not a polyhedron but a sphere.

All photographs in this article were taken by the author in 1993. I am grateful to Anna Rita Campanelli and Aldo Domenicano (Rome), Lodovico Riva di Sanseverino (Bologna), and Magnus J. Wenninger (Collegeville, Minnesota) for assistance and advice.

Figure 1. Left: The Sacristy of St. Peter's Basilica in Vatican City; right: the star polyhedron at its top.

Figure 2. Great stellated dodecahedron. Photograph courtesy of Magnus J. Wenninger [1].

Figure 3. Left: Top of the obelisk in St. Peter's Square, Vatican City; center: top of the obelisk in Rotonda Square, Rome; right: one of the two side decorations of the gate in the Square of September 20, Bologna.

Figure 4. Decoration from the top of the colonnade in St. Peter's Square, Vatican City.

Figure 5. Sant Ivo's Church (top right) with three details enlarged (above).

Reference
1. M. J. Wenninger, Polyhedron Models, New York: Cambridge University Press (1971).

Budapest Technical University
Szt. Gellért tér 4
H-1521 Budapest, Hungary

*Column Editor's address: Mathematics Institute, University of Warwick, Coventry, CV4 7AL England.

THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
Simon Stevin's Statue
Dirk Huylebrouck

Simon Stevin was born in the Belgian city of Bruges in 1548, but left Belgium in 1582 and a few years later became professor of mathematics at the University of Leyden. A successful engineer, he first published his mathematics in Latin (1583: Problemata Geometrica), but later defended the "use of the mother tongue to stimulate the progress of science" (1585: Dialectike ofte Bewysconste, The Art of Proving Statements). He died in 1620. The University of Gent placed his bust in an auditorium, and until 1994 its mathematics review was named after him (see below). Belgium-in-24-hour tourists always have on their program a visit to the Venice of the North, Bruges. From Brussels, there is only one way to drive through this city crowded with tourists, and one cannot miss the Simon Stevin square in the centre of Bruges. A statue erected in his honor shows a thinking man, holding a pair of compasses in the right hand, and resting the left hand on a book with a drawing of a parallelogram for adding forces. The inscription SIMON STEVIN INAUG. MDCCCXLVI.F. tells us it took the city more than 200 years to honor the mathematician. It was indeed quite a controversial decision. Until the first half of the 19th century, the Catholic Belgian and Protestant Dutch blocs were involved in something like a cold war, and to some Stevin had passed to the other side of the religious curtain. Several politicians and priests did not hesitate to use insults, but the offended party fortunately got the statue anyway. The following plea in his favor by the (Belgian!) physicist A. Quetelet is more polite, although the ordinary people he refers to include a member of the Brussels Academy of Science. Many of the statements may still be valid today (just replace "princes," "crusade," etc. by "generals," "war," etc. and use names you think appropriate instead of Simon Stevin and Bruges):

Simon Stevin, no matter what foreigners have said, was not forgotten by his compatriots. His statue will decorate his native city and will make her proud, a pride he felt himself for her, since it was the only title he used in his works, on the front pages of which one reads the words so remarkable in their simplicity: "By Simon Stevin of Bruges". But, one may say, does an ordinary scholar, whose name the ordinary people do not know, deserve the honor of a statue? Certainly! an ordinary scholar, who, lost in the mass of people, has grown by himself and the force of his genius up to the highest conceptions: who, by his work and his insight, impregnated the domain of the intelligence; who tore aside with a steady hand the veils covering the great laws of nature; who enriched us with useful discoveries whose fruits we reap peacefully: what, this scholar should not take place next to those great conquerors who distinguished themselves, very often, by the evil they caused to humanity: those princes who impoverished and exterminated their population, and brought ruin and desolation to their neighbors? If you deify those men, then do not deny the honors given to great virtues, to sublime intellects. Those precious qualities are more obvious signs of Divinity than those you honor by your statues. It is in the obscurity of the forest, in the childhood of society that man, still under the strain of material compulsion, elevated fear and glorified him who inspired it.
Today, our honors must see higher; and the nation that knows how to celebrate the great military virtues, who made a statue for the famous head of the first crusade, for the hero praised by Tasso; that nation will not refuse to use the talent of its sculptors to reproduce features of its children who distinguished themselves in other careers as well. If the ordinary people do not know their names, let them learn them; that they know who their benefactors were. Ingratitude is humiliating; it is one of the principal factors of dissolution of societies: it breaks the links, fosters political egoism, and dries up the source of all the civic virtues. Honor to the city of Bruges, which wanted to celebrate the memory of one of its most famous sons! More than one young talent will be roused before this monument of gratitude, and even a foreigner will not look at it unmoved. Before climbing, two centuries after his death, on the pedestal destined to him, the scholar of Bruges met more than one obstacle. Was not he even accused of bearing arms against his country? And on what proof was this accusation based? I do not know, and neither do those who made the accusations, because the life of Simon Stevin is clouded by mysteries; and although the scholar held high functions, one only knows him through his works and by the few things he told us about himself in his works. But the silence of history does not authorize us to become unjust twice towards him.

Quetelet's text is quoted in A. Vanhoutryve's book The Statues of Bruges, pp. 22-23. The photograph was provided by Mr. R. Jacobus, press attaché for the city of Bruges.
Aartshertogstraat 42
8400 Oostende
Belgium
Quaternionic Determinants
Helmer Aslaksen
Introduction

The classical matrix groups are of fundamental importance in many parts of geometry and algebra. Some of them, like $Sp(n)$, are most conceptually defined as groups of quaternionic matrices. But, the quaternions not being commutative, we must reconsider some aspects of linear algebra. In particular, it is not clear how to define the determinant of a quaternionic matrix. Over the years, many people have given different definitions. In this article I will discuss some of these.

Let us first briefly recall some basic facts about quaternions. The quaternions were discovered on October 16, 1843 by Sir William Rowan Hamilton. (For more on the history, I recommend [19], [31], [47], and [48].) They form a noncommutative, associative algebra over $\mathbb{R}$:

\[ \mathbb{H} = \{ a + ib + jc + kd \mid a, b, c, d \in \mathbb{R} \}, \]

where

\[ i^2 = j^2 = k^2 = -1, \quad ij = k = -ji, \quad jk = i = -kj, \quad ki = j = -ik. \]

We can also express $z \in \mathbb{H}$ in the form $z = x + jy$, where $x, y \in \mathbb{C}$, but then we have to remember that $yj = j\bar{y}$ for $y \in \mathbb{C}$. Notice that $\mathbb{H}$ is not an algebra over $\mathbb{C}$, since the center of $\mathbb{H}$ is only $\mathbb{R}$. Conjugation in $\mathbb{H}$ is defined by $\overline{a + ib + jc + kd} = a - ib - jc - kd$ and satisfies $\overline{hk} = \bar{k}\,\bar{h}$. We will call the quaternions of the form $ib + jc + kd$ with $b, c, d \in \mathbb{R}$ the pure quaternions. For any ring $R$, we let $R^*$ denote the set of units in $R$, i.e., the invertible elements of $R$. If $R$ is a skewfield, then $R^* = R - \{0\}$. Let $M(n, R)$ be the ring of $n \times n$ matrices with entries in $R$. We will denote the set of invertible $n \times n$ matrices over $R$ by $GL(n, R)$. (Some readers might worry about our definition of invertible in $M(n, \mathbb{H})$: Is there a distinction between left and right inverses? We will see later that there is no such problem. See also [15] and [32].)

Cayley

The most simple-minded approach when trying to define the determinant of a quaternionic matrix would be to use the usual formula. But then the question is: Which usual formula? For a $2 \times 2$ determinant we could use $a_{11}a_{22} - a_{12}a_{21}$ (expanding along the first row), or $a_{11}a_{22} - a_{21}a_{12}$ (expanding along the first column), or some other ordering of the factors in the usual formula. To a modern mathematician, this lack of a canonical definition is an indication that this is not the correct approach. But we might still ask ourselves: What exactly would happen if we tried one of these formulas? In 1845, just 2 years after Hamilton's discovery of the quaternions, Arthur Cayley [10, 35] did precisely this. He chose to expand both the original matrix and all the minors along the first column (or vertical row as he called it). If we denote the Cayley determinant by Cdet, we get
\[ \operatorname{Cdet}\begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} = a_1 b_2 - a_2 b_1 \]

and

\[ \operatorname{Cdet}\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix} = a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1). \]

Is this a good definition? Cayley himself points out that if two rows are the same in a $2 \times 2$ matrix, then

\[ \operatorname{Cdet}\begin{pmatrix} a & b \\ a & b \end{pmatrix} = ab - ab = 0, \]

whereas if two columns are the same in a $2 \times 2$ matrix, then

\[ \operatorname{Cdet}\begin{pmatrix} a & a \\ b & b \end{pmatrix} = ab - ba, \]

which in general is nonzero. For some reason, this didn't seem to bother Cayley much, and he happily proceeded to write a couple more pages about his new function. But it should bother us.

Let us try to clarify the situation by first deciding on which properties we want the determinant to satisfy. Based on our experience with complex matrices, we will call $d : M(n, \mathbb{H}) \to \mathbb{H}$ a determinant if it satisfies the following three axioms.

AXIOM 1. $d(A) = 0$ if and only if $A$ is singular.

AXIOM 2. $d(AB) = d(A)d(B)$ for all $A, B \in M(n, \mathbb{H})$.

AXIOM 3. If $A'$ is obtained from $A$ by adding a left-multiple of a row to another row or a right-multiple of a column to another column, then $d(A') = d(A)$.

Let me make some comments about these axioms. It can be shown [7] that if $d$ is not constantly equal to 0 or 1, then Axiom 2 implies that $d(A) = 0$ for all singular matrices. Thus, we need only to define the determinant of invertible matrices. Notice that in Axiom 3 there is a distinction between left and right scalar multiplication. Consider the mapping $T(v) = cv$. Then, for $f \in \mathbb{H}$,

\[ T(fv) = c(fv) \]

is in general different from

\[ fT(v) = f(cv), \]

whereas

\[ T(vf) = c(vf) = cvf = T(v)f. \]

We see that we must write the coefficients of a linear transformation on the opposite side of what we use for the vector space structure. I will identify vectors with columns and identify linear transformations with matrices on the left, but consider all vector spaces to be right vector spaces. Axiom 3 can be expressed in terms of matrix multiplication. Let $e_{ij}$ be the matrix with a 1 in the $(i, j)$ entry, and 0 otherwise. Define

\[ B_{ij}(b) = I_n + b e_{ij} \quad \text{for } i \neq j. \]

Multiplying a matrix $A$ by $B_{ij}(b)$ on the left adds the $j$th row multiplied by $b$ on the left to the $i$th row, whereas multiplying $A$ by $B_{ij}(b)$ on the right adds the $i$th column multiplied by $b$ on the right to the $j$th column. So Axiom 3 can be restated (using Axiom 2) as saying that $d(B_{ij}(b)) = 1$. It is easy to see that

\[ B_{ij}(b)^{-1} = B_{ij}(-b), \]

so it follows that products of $B_{ij}(b)$'s generate a subgroup of $GL(n, \mathbb{H})$, which we will denote by $SL(n, \mathbb{H})$. Notice that when $K$ is a field, we define $SL(n, K)$ to be the set of matrices with determinant equal to 1. But because we don't have a determinant yet, we must define $SL(n, \mathbb{H})$ in some other way, and then hope that once we have our determinant, it will have $SL(n, \mathbb{H})$ as its kernel. That Axiom 3 can be restated as saying that matrices in $SL(n, \mathbb{H})$ have determinant equal to 1 is therefore promising. An obvious question is now whether such determinants exist. Let me first state a simple obstruction.

THEOREM 1. Assume that $d$ is a determinant, i.e., $d$ satisfies our three axioms. Then the image $d(M(n, \mathbb{H}))$ is a commutative subset of $\mathbb{H}$.

This theorem essentially says that when trying to define a quaternionic determinant, we must keep it complex-valued. This rules out Cayley's definition, since Cdet is onto $\mathbb{H}$. The proof of Theorem 1 depends on the next two lemmas. We first observe that the definition of $B_{ij}(b)$ only involves two indices. We can, therefore, often assume without loss of generality that $n = 2$. A simple calculation proves the following lemma.
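Cayley's first-column expansion and the elementary matrices $B_{ij}(b)$ are easy to experiment with numerically. The following is a minimal sketch in Python, not taken from the article: the class `Q` and the helpers `cdet2`, `matmul2`, and `B12` are illustrative names introduced here, and only the $2 \times 2$ case is covered. It checks that Cdet vanishes when two rows agree but not when two columns agree, and that $B_{12}(b)^{-1} = B_{12}(-b)$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Q:
    """Quaternion a + ib + jc + kd (illustrative helper, not from the article)."""
    a: int = 0
    b: int = 0
    c: int = 0
    d: int = 0

    def __add__(self, o):
        return Q(self.a + o.a, self.b + o.b, self.c + o.c, self.d + o.d)

    def __sub__(self, o):
        return Q(self.a - o.a, self.b - o.b, self.c - o.c, self.d - o.d)

    def __neg__(self):
        return Q(-self.a, -self.b, -self.c, -self.d)

    def __mul__(self, o):
        # Hamilton product: i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j.
        return Q(
            self.a * o.a - self.b * o.b - self.c * o.c - self.d * o.d,
            self.a * o.b + self.b * o.a + self.c * o.d - self.d * o.c,
            self.a * o.c - self.b * o.d + self.c * o.a + self.d * o.b,
            self.a * o.d + self.b * o.c - self.c * o.b + self.d * o.a,
        )


ZERO, ONE = Q(), Q(1)
I, J, K = Q(0, 1), Q(0, 0, 1), Q(0, 0, 0, 1)


def cdet2(m):
    """Cayley's 2 x 2 determinant: first-column expansion a1*b2 - a2*b1."""
    (a1, b1), (a2, b2) = m
    return a1 * b2 - a2 * b1


def matmul2(x, y):
    """Product of two 2 x 2 quaternionic matrices, keeping the factor order."""
    return tuple(
        tuple(x[r][0] * y[0][c] + x[r][1] * y[1][c] for c in range(2))
        for r in range(2)
    )


def B12(b):
    """Elementary matrix B_12(b) = I_2 + b*e_12."""
    return ((ONE, b), (ZERO, ONE))


IDENTITY = ((ONE, ZERO), (ZERO, ONE))

equal_rows = cdet2(((I, J), (I, J)))      # ab - ab with a = i, b = j
equal_columns = cdet2(((I, I), (J, J)))   # ab - ba with a = i, b = j

print(equal_rows == ZERO)                    # True
print(equal_columns == K + K)                # True: ij - ji = 2k
print(matmul2(B12(J), B12(-J)) == IDENTITY)  # True: B_12(b)^{-1} = B_12(-b)
```

Integer coefficients are used so that the equality tests are exact; the choice $a = i$, $b = j$ is just one convenient pair of noncommuting quaternions.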
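LEMMA 2. Let $a \neq 0$ and $d$ be a determinant. Then

\[ \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ a^{-1} - 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ a - 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -a^{-1} \\ 0 & 1 \end{pmatrix} \]

and

\[ d\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} = 1. \]

The next lemma is crucial.

LEMMA 3. Every $A \in GL(n, \mathbb{H})$ can be written in the form

\[ A = D(x)B, \]

where

\[ D(x) = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & x \end{pmatrix} \]

and $B \in SL(n, \mathbb{H})$.

Proof. Because $A$ is invertible, there must be at least one nonzero element in the first row, say $a_{1j} \neq 0$. By adding the $j$th column multiplied by $a_{1j}^{-1}(1 - a_{11})$ on the right to the first column, we get a matrix with $a_{11} = 1$. We can then make all the other entries in the first row equal to zero, and proceed by induction. □

The observant reader may now be wondering about the uniqueness of the $A = D(x)B$ decomposition. But it is more urgent to prove Theorem 1.

Proof of Theorem 1. Define $f : \mathbb{H} \to \mathbb{H}$ by

\[ f(x) = d(D(x)). \]

It follows from Lemma 3 that $f(\mathbb{H}) = d(M(n, \mathbb{H}))$. For simplicity of notation we will assume that $n = 2$. We have

\[ f(x) = d\begin{pmatrix} 1 & 0 \\ 0 & x \end{pmatrix} = d\begin{pmatrix} x & 0 \\ 0 & 1 \end{pmatrix} \]

by Axiom 2 and Lemma 2. But then

\[ f(x)f(y) = d\left( \begin{pmatrix} x & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & y \end{pmatrix} \right) = d\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} = d\left( \begin{pmatrix} 1 & 0 \\ 0 & y \end{pmatrix} \begin{pmatrix} x & 0 \\ 0 & 1 \end{pmatrix} \right) = f(y)f(x), \]

and we see that $f(\mathbb{H}) = d(M(n, \mathbb{H}))$ is commutative. □

It is now time to ask how Cayley's definition fits into this. It clearly cannot satisfy all the three axioms. In fact, it doesn't satisfy any of them! Consider the matrix

\[ M = \begin{pmatrix} k & j \\ i & 1 \end{pmatrix}. \]

It is easy to prove that if

\[ M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]

then $x = y = 0$, so $M$ is invertible. But

\[ M^t \begin{pmatrix} 1 \\ -j \end{pmatrix} = \begin{pmatrix} k & i \\ j & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -j \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]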
so $M^t$ is singular. But $\operatorname{Cdet} M = 0$ and $\operatorname{Cdet} M^t = 2k$, so we see that Axiom 1 fails. This also shows that the transpose is not a very useful concept in quaternionic linear algebra. The reason is that it is neither an automorphism nor an antiautomorphism! (But notice that Hermitian involution, $M^* = \overline{M}^t$, is an antiautomorphism, i.e., $(MN)^* = N^*M^*$.) For similar reasons, the concept of rank is also more complicated. The right column-rank is the same as the left row-rank, but they might be distinct from the left column-rank, which is equal to the right row-rank [12]. Noting that

\[ \operatorname{Cdet}\left( \begin{pmatrix} 1 & i \\ j & k \end{pmatrix} \begin{pmatrix} k & j \\ i & 1 \end{pmatrix} \right) = 2 - 2k \]

whereas

\[ \operatorname{Cdet}\begin{pmatrix} 1 & i \\ j & k \end{pmatrix} \operatorname{Cdet}\begin{pmatrix} k & j \\ i & 1 \end{pmatrix} = 0, \]

we see that Axiom 2 also fails. As for Axiom 3, we have

\[ \operatorname{Cdet}\begin{pmatrix} a & 1 \\ a & 1 \end{pmatrix} = 0, \]

but after subtracting the second row multiplied by $b$ on the left from the first row, we get

\[ A' = \begin{pmatrix} a - ba & 1 - b \\ a & 1 \end{pmatrix} \]

and $\operatorname{Cdet}(A') = ab - ba$, which in general is nonzero. This clearly shows that Cdet is not the way to go. A more promising lead is before us, in Lemma 3. It will be followed up later.

Let me finish this section with a remark about Theorem 1. It is inspired by a related theorem proved by the physicist and mathematician Freeman J. Dyson in 1972 [21]. He used a different third axiom:
AXIOM 3'. Let $A = (a_{ij})$, $B = (b_{ij})$, and $C = (c_{ij})$. If for some
Set
row index r we have $a_{ij} = b_{ij} = c_{ij}$ for $i \neq r$, and $a_{rj} + b_{rj} = c_{rj}$, then
\[ d(A) + d(B) = d(C). \]
In other words, d should be additive in the rows. He then proved that if d satisfies Axioms 1, 2, and 3', the image of d is commutative. It is easy to see that Axioms 1, 2, and 3' imply Axiom 3. We just have to prove that $d(B_{ij}(b)) = 1$. Let B' be the matrix obtained by replacing the ith entry along the diagonal in $B_{ij}(b)$ by a 0. Then B' is singular, and it follows from Axiom 3' that $d(B_{ij}(b)) = 1$. So his definition of determinant is more restrictive than ours. But it is, in fact, too restrictive. Determinants satisfying his three axioms simply don't exist over the quaternions! Why? It follows from Axiom 2 that $d(I_n) = 1$. Define
\[ D(x) = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & x \end{pmatrix}. \]
\[ \varphi(M(n, C)) = \{P \in M(2n, R) \mid JP = PJ\}. \]
In a similar way, any quaternionic n × n matrix can be expressed uniquely in the form M = A + jB, where A and B are complex n × n matrices. (We write j on the left since we work with right vector spaces.) We can, therefore, define $\psi : M(n, H) \to M(2n, C)$ by
x/"
Since $I_n + D(-1) = 2D(0)$ is singular, it follows from Axioms 1 and 3' that $d(D(-1)) = -1$. Because $-1 = iji^{-1}j^{-1}$, we get $D(-1) = D(i)D(j)D(i)^{-1}D(j)^{-1}$, so $D(-1)$ is a commutator in GL(n, H). But Axiom 2 and Theorem 1 then imply that $d(D(-1)) = 1$, which is a contradiction.
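The identity $-1 = iji^{-1}j^{-1}$ used in this argument is easy to confirm by direct computation. The following is a minimal Python sketch, assuming quaternions are stored as 4-tuples (real, i, j, k); the helper names `qmul` and `qinv` are illustrative and do not come from the article.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(q):
    """Inverse of a nonzero quaternion: conjugate divided by squared norm."""
    n2 = sum(c * c for c in q)
    return (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)

I = (0, 1, 0, 0)
J = (0, 0, 1, 0)

# The commutator i j i^{-1} j^{-1}, which should equal the quaternion -1.
commutator = qmul(qmul(I, J), qmul(qinv(I), qinv(J)))
print(commutator)
```

Since −1 is a commutator in H*, any multiplicative d with commutative image must send D(−1) to 1, which is the source of the contradiction.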
Study
Concerning quaternionic determinants, nothing much happened during the 75 years after Cayley. In the second (posthumous) edition of W. R. Hamilton's book Elements of Quaternions [24] from 1889, the editor added an appendix, which was just a restatement of Cayley's paper. Also, a paper by J. M. Peirce [38] from 1899 is just a laborious elaboration on the Cayley determinant. But in 1920 a very interesting paper by Eduard Study appeared [44]. (For more details, see also [16], [23], and [46].) His idea was to transform a quaternionic matrix into a complex 2n × 2n matrix and then take the determinant. I will start by discussing some important homomorphisms between quaternionic, complex, and real matrices. Recall that any complex n × n matrix can be written uniquely as N = C + iD, where C and D are real n × n matrices. We can then define an injective algebra homomorphism $\varphi : M(n, C) \to M(2n, R)$ by
Let $R_i$ be right-multiplication by i on $C^n$. The corresponding matrix is iI, and $J = \varphi(iI) = \varphi(R_i)$ (I will sometimes identify a linear transformation and its standard matrix). This gives a complex structure on $R^{2n}$, and we know that $P \in M(2n, R)$ corresponds to a complex linear transformation if and only if P commutes with the complex structure. Hence,
It is straightforward to show that this map is an injective algebra homomorphism. [This implies in particular that there is no distinction between left- and right-inverses in GL(n, H).] Let $R_j$ be right-multiplication by j on $H^n$. Notice that any H-linear transformation commutes with $R_j$, but that $R_j$ is not H-linear. Thus, there is no matrix associated to $R_j$, and it doesn't make sense to talk about $\psi(R_j)$, but we can still consider the corresponding map of $C^{2n}$ given by $R_j(x, y) = (-\bar{y}, \bar{x})$. We see that $R_j$ corresponds to first multiplying by J and then conjugating. This gives a quaternionic structure on $C^{2n}$, and we know that $N \in M(2n, C)$ corresponds to a quaternionic linear transformation if and only if N commutes with the quaternionic structure. Since $N\overline{Jv} = \overline{\bar{N}Jv}$, we have $N\overline{Jv} = \overline{JNv}$ if and only if $\bar{N}J = JN$, so
\[ \psi(M(n, H)) = \{N \in M(2n, C) \mid JN = \bar{N}J\}. \tag{1} \]
Notice that this is simply a generalization of the formula $jz = \bar{z}j$ for $z \in C$. It follows immediately from (1) that $\det_C \psi(M) \in R$, but we will soon see that, in fact, we have $\det_C \psi(M) \geq 0$. (I will sometimes write $\det_R$ or $\det_C$ to stress that I'm taking the determinant of a real or complex matrix.) By applying the homomorphism $\varphi_1 : C = M(1, C) \to M(2, R)$ to each element of $N \in M(n, C)$, we get a map $\tilde\varphi : M(n, C) \to M(2n, R)$. [$\varphi(N)$ consists of four n-blocks, whereas $\tilde\varphi(N)$ consists of $n^2$ 2-blocks.] The important thing here is that the 2-blocks in $\tilde\varphi(N)$ are easier to manage than the n-blocks in $\varphi(N)$. Since C is commutative
and $\varphi_1$ is a homomorphism, the 2-blocks in $\tilde\varphi(N)$ commute. This allows us to use the following folklore theorem. [It has been rediscovered numerous times, but to the best of my knowledge it is originally due to M. H. Ingraham [26].] THEOREM 4. If $A = (A_{ij})$ is a square block matrix, where
M(2n, C), but we need to know that the inverse actually lies in $\psi(M(n, H))$. By conjugating and inverting the formula $J\psi(M) = \overline{\psi(M)}J$, we see that $J\psi(M)^{-1} = \overline{\psi(M)^{-1}}J$. But then it follows from (1) that $\psi(M)^{-1}$ lies in
$\psi(M(n, H))$. To show that Axiom 3 holds, it suffices to prove that Sdet $B_{ij}(b) = 1$. If $b = b_1 + jb_2$, then
the $A_{ij}$ are mutually commutative m × m matrices, and B is the m × m matrix obtained by taking the determinant of A with the $A_{ij}$ as elements, then det A = det B. For example, if $A_{11}$, $A_{12}$, $A_{21}$, and $A_{22}$ are mutually commutative, then
\[ \det\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \det(A_{11}A_{22} - A_{12}A_{21}). \tag{2} \]
In other words, you evaluate by "taking the determinant twice." By shuffling some rows and columns, we see that $\det_R \varphi(N) = \det_R \tilde\varphi(N)$, and we can now apply Theorem 4 to get [6]
\[ \det_R \varphi(N) = \det_R \tilde\varphi(N) = \det_R(\varphi_1(\det_C N)) = \det_R\begin{pmatrix} \operatorname{Re}\det_C N & -\operatorname{Im}\det_C N \\ \operatorname{Im}\det_C N & \operatorname{Re}\det_C N \end{pmatrix} = |\det_C N|^2 \]
for $N \in M(n, C)$. This discussion leads to the following important theorem.
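The identity $\det_R \varphi(N) = |\det_C N|^2$ can be checked numerically on a small example. The sketch below is an illustration only: it assumes the common block form $\varphi(C + iD) = \begin{pmatrix} C & -D \\ D & C \end{pmatrix}$ (the explicit formula did not survive in this copy of the text), and the helper names `det` and `phi` are hypothetical.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    result = 1
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result

def phi(N):
    """Assumed real form of a complex matrix N = C + iD: [[C, -D], [D, C]]."""
    C = [[z.real for z in row] for row in N]
    D = [[z.imag for z in row] for row in N]
    top = [C[r] + [-d for d in D[r]] for r in range(len(N))]
    bottom = [D[r] + C[r] for r in range(len(N))]
    return top + bottom

N = [[1 + 2j, 3 - 1j],
     [0.5j, 2 + 1j]]

lhs = det(phi(N))          # real 4 x 4 determinant
rhs = abs(det(N)) ** 2     # squared modulus of the complex 2 x 2 determinant
print(lhs, rhs)
```

Here $\det_C N = -0.5 + 3.5i$, so both sides should agree with 12.5 up to rounding.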
\[ \psi(B_{ij}(b)) = \begin{pmatrix} I_n + b_1e_{ij} & -\bar{b}_2e_{ij} \\ b_2e_{ij} & I_n + \bar{b}_1e_{ij} \end{pmatrix}. \]
But $e_{ij}e_{ij} = 0$, so we can apply Theorem 4 to get $\det(\psi(B_{ij}(b))) = \det(I_n) = 1$. Thus, the Study determinant satisfies all our axioms, and it is used frequently in differential geometry and Lie theory [23]. Bear in mind that it is a quadratic function of the entries, not multilinear in the rows and the columns like the usual determinant. Let me finish this section with a couple of additional comments. The Study determinant was defined above by identifying H with $C^2$. What would happen if we instead identified H with $R^4$? After all, the center of H is R, not C, so the quaternions form an R-algebra. We can write $M \in M(n, H)$ uniquely as $M = A_0 + iA_1 + jA_2 + kA_3$ where $A_0$, $A_1$, $A_2$, and $A_3$ are real n × n matrices, and apply the homomorphism $\mu : M(n, H) \to M(4n, R)$ given by
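\[ \mu(A_0 + iA_1 + jA_2 + kA_3) = \begin{pmatrix} A_0 & -A_1 & -A_2 & -A_3 \\ A_1 & A_0 & -A_3 & A_2 \\ A_2 & A_3 & A_0 & -A_1 \\ A_3 & -A_2 & A_1 & A_0 \end{pmatrix}. \]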
THEOREM 5. For any complex matrix N, we have
\[ \det_R \varphi(N) = |\det_C N|^2 \geq 0. \tag{3} \]
For any quaternionic matrix M, we have
\[ \det_C \psi(M) = \sqrt{\det_R \varphi(\psi(M))} \geq 0. \]
Notice that
c ~ A o + iA1 + jA2 + kA3) = \-A3
(4)
Proof. The first part follows from (2). It follows from (1) that $\det_C \psi(M) \in R$, and since $\det_C \psi(GL(n, H))$ is a connected subset of $R^*$, we get that Sdet M ≥ 0 for quaternionic matrices. We then deduce (4) from (2). □
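To make the construction concrete, here is a small Python sketch that computes $\det_C \psi(M)$ for 2 × 2 quaternionic matrices. It assumes the block form $\psi(A + jB) = \begin{pmatrix} A & -\bar{B} \\ B & \bar{A} \end{pmatrix}$, which agrees with the formula for $\psi(B_{ij}(b))$ in the text; the function names and the two test matrices are chosen here for illustration and are not the article's examples. A quaternion $a_0 + a_1i + a_2j + a_3k$ is stored as a 4-tuple and split as $(a_0 + a_1i) + j(a_2 - a_3i)$.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    result = 1
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result

def psi(m):
    """Complex 2n x 2n matrix [[A, -conj(B)], [B, conj(A)]] for M = A + jB."""
    n = len(m)
    A = [[complex(q[0], q[1]) for q in row] for row in m]
    B = [[complex(q[2], -q[3]) for q in row] for row in m]
    top = [A[r] + [-b.conjugate() for b in B[r]] for r in range(n)]
    bottom = [B[r] + [a.conjugate() for a in A[r]] for r in range(n)]
    return top + bottom

def sdet(m):
    """Determinant of psi(m); real and nonnegative by Theorem 5."""
    return det(psi(m))

ONE, I, J = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
MINUS_K = (0, 0, 0, -1)

M1 = [[ONE, I], [J, MINUS_K]]   # second column = first column times i
M2 = [[ONE, J], [I, MINUS_K]]   # transpose of M1

print(sdet(M1), sdet(M2))
```

In this example the second column of `M1` is the first column multiplied on the right by i, so `M1` is singular, while its transpose `M2` is invertible; the two values therefore differ.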
Ao -A 3
A3 A0
-A1
A2
A1
.
--
$\mu(A_0 + iA_1 + jA_2 + kA_3)$, but it is easy to see that by shuffling some rows, columns, and signs, we get (see also [4] and [30])
\[ \det_R \mu(M) = \det_R \varphi(\psi(M)) = \mathrm{Sdet}(M)^2. \]
We are finally ready to define the Study determinant Sdet by
\[ \mathrm{Sdet}\, M = \det_C \psi(M). \]
The obvious question is now which axioms the Study determinant satisfies. The Study determinant satisfies Axiom 2 because $\psi$ is a homomorphism. Let us show that Axiom 1 holds. (Notice that the proof of this statement is wrong in both editions of the otherwise excellent book by Morton L. Curtis [16].) We know that if Sdet M = $\det_C \psi(M) \neq 0$, then $\psi(M)$ is invertible in
I also note that in general
\[ \psi(M^t) = \psi(A^t + jB^t) = \begin{pmatrix} A^t & -\bar{B}^t \\ B^t & \bar{A}^t \end{pmatrix} \neq \begin{pmatrix} A^t & B^t \\ -\bar{B}^t & \bar{A}^t \end{pmatrix} = \psi(M)^t, \]
but
\[ \psi(M^*) = \psi(\bar{A}^t + \overline{jB}^t) = \psi(\bar{A}^t - \bar{B}^tj) = \psi(\bar{A}^t - jB^t) = \begin{pmatrix} \bar{A}^t & \bar{B}^t \\ -B^t & A^t \end{pmatrix} = \psi(M)^*. \]
Hence, Sdet $M^*$ = $\overline{\mathrm{Sdet}\, M}$ = Sdet M; but in general Sdet $M^t$ ≠ Sdet M, for, as we saw earlier, M can be invertible while $M^t$ is singular.
Dieudonné
Study was not the only one studying quaternionic determinants in his time. In the next 10 years, A. Heyting, E. H. Moore, Ø. Ore, and A. R. Richardson all wrote about this topic [25, 34, 36, 42, 43]. The paper by Øystein Ore [36] is important because it introduces the concept of the ring of fractions for a noncommutative ring. But from the point of view of determinants, the most interesting are the papers by A. R. Richardson [42, 43] (this is the Richardson in the Littlewood-Richardson rule, but Littlewood is not the one in Hardy-Littlewood). His main contribution was to make it apparent that commutators play a key role. His papers are filled with formulas involving commutators. Let us go back to studying SL(n, H) and take a closer look at Lemma 3. It is easy to see that SL(n, H) is a normal subgroup of GL(n, H), and it can be shown [1, 15, 17, 40] that SL(n, H) is the commutator subgroup of
minant is well defined, so it is an easy consequence of results in [1], [17], and [40], and I refer the reader to those excellent sources for the details. □ It follows that in the decomposition A = D(x)B, neither x nor B is unique, but that the coset $x[H^*, H^*] \in H^*/[H^*, H^*]$ is unique. This is exactly what Jean Dieudonné used in his 1943 paper [17]. His goal was to show how the determinant could be expressed in terms of group theory. We would expect
\[ \det\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} = \det\begin{pmatrix} b & 0 \\ 0 & a \end{pmatrix}, \]
but then we probably need the determinant to take values in a commutative ring, and we get that by considering $H^*/[H^*, H^*]$. His main theorem states that for any skew field K, there is an isomorphism
\[ GL(n, K)/[GL(n, K), GL(n, K)] \cong K^*/[K^*, K^*]. \]
For K = H, this is immediate from Lemmas 3 and 7. We therefore define the Dieudonné determinant by
\[ \det A = \det(D(x)B) = x[H^*, H^*]. \]
GL(n, H).
LEMMA 6. SL(n, H) = [GL(n, H), GL(n, H)]. Let me mention in passing that for any field k, the commutator of GL(n, k) is SL(n, k), except when n = 2 and k is $F_2$ or $F_3$ [15]. The main reason why Lemma 3 is so crucial is that it shows that we only need to define our determinant on the matrices D(x). But you may be impatient for me to get back to the issue of uniqueness. Since SL(n, H) is normal in GL(n, H), the question becomes: For which $x \in H$ does D(x) lie in SL(n, H)? The answer is given by the following lemma.
Thanks to Lemma 7, we see that this is well defined and that the kernel is precisely SL(n, H), i.e., our definition of SL(n, H) agrees with the usual one, once we have the determinant. If we now extend to M(n, H), we get a determinant that takes values in $H^*/[H^*, H^*] \cup \{0\}$. But what does this set look like? We need the following lemma. LEMMA 8. $[H^*, H^*]$ is isomorphic to the set of quaternions
of length 1. Proof. It is clear that every commutator has length 1. The
LEMMA 7.
\[ D(x) = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & x \end{pmatrix} \]
is a commutator in GL(n, H) [i.e., it lies in SL(n, H)] if and only if x is a commutator in $H^*$.
\[ pq = -\langle p, q \rangle + p \times q. \]
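Proof. One direction is trivial:
\[ \begin{pmatrix} 1 & 0 \\ 0 & aba^{-1}b^{-1} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & a \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & b \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & a^{-1} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & b^{-1} \end{pmatrix}. \]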
The other direction, however, is not so easy. It is essentially equivalent to showing that the Dieudonné deter-
set of quaternions of length 1 can be identified with $S^3$, and $\psi(S^3) = SU(2)$. But every element of SU(2) is conjugate to a diagonal element, so it follows that every element in $S^3$ is conjugate to an element of $S^1$, the unit circle of $C \subset H$. (This also follows from the Noether-Skolem Theorem.) So, given $z \in S^3$, we can write $z = xyx^{-1}$ with $y \in S^1$. We can identify the pure quaternions with $R^3$, and for $p, q \in R^3$ we have $p^{-1} = \bar{p}/|p|^2 = -p/|p|^2$ and
where $\langle\,,\rangle$ is the usual inner product on $R^3$ and × is the vector product in $R^3$. From this, we can easily deduce that every quaternion can be written as the product of two pure quaternions. Since y is complex, we can find $w \in C$ with $y = w^2$, and it follows from the above that we can write w = pq,
where $p, q \in R^3$. Since |w| = |y| = 1, we can also assume that |p| = |q| = 1, so $p^{-1} = -p$ and $q^{-1} = -q$. But then
plies the determinant by $m[H^*, H^*]$. (This last product can be either on the left or on the right, since $H^*/[H^*, H^*]$ is commutative.) On the other hand,
\[ z = xpqpqx^{-1} = xpq(-p)(-q)x^{-1} = xpqp^{-1}q^{-1}x^{-1} = (xpx^{-1})(xqx^{-1})(xpx^{-1})^{-1}(xqx^{-1})^{-1}. \]
For other proofs, see [9], [17], and [50]. It follows that $H^*/[H^*, H^*]$ is isomorphic to the positive real numbers. Define $\omega : H^*/[H^*, H^*] \to$
det( lb aba)=(ab-ba)[H*'H*]'
[] but
:)0
$R^+$ by $\omega(x[H^*, H^*]) = |x|$, and we see that we cannot factor out a right multiple of
and define the normalized Dieudonné determinant by Ddet(M) = $\omega(\det(M))$. Dieudonné showed [17] that any determinant function satisfying our three axioms will be of the form
a row.
Moreover, it doesn't behave well with respect to addition. Consider the determinant as a function of the first row, keeping the other rows fixed. Denote this function by m(v). Define addition in $H^*/[H^*, H^*]$ by setting $a[H^*, H^*] + b[H^*, H^*] = \{ak_1 + bk_2 \mid k_1, k_2 \in [H^*, H^*]\}$.
\[ d(M) = \mathrm{Ddet}^r(M) \]
(5) It can then be shown [1] that
for some $r \in R$. In particular, we can easily check the following theorem.
\[ \mathrm{Sdet}\, M = \det_C(\psi(M)) = \mathrm{Ddet}^2(M), \tag{6} \]
If we use Ddet instead of det and denote the corresponding function by M(v), we get a sort of triangle inequality:
\[ \det_R \mu(M) = \det_R \varphi(\psi(M)) = \mathrm{Ddet}^4(M). \tag{7} \]
\[ M(v_1 + v_2) \leq M(v_1) + M(v_2). \]
T H E O R E M 9.
Let me also point out that it follows from (6) that the Study determinant corresponds to the reduced norm [15]. Equation (5) has been generalized by L. E. Zagorin [52]. If $\nu$ is a homomorphism of H into M(s, C), and $T_\nu$ is the corresponding homomorphism of M(n, H) into M(ns, C), then $\det_C T_\nu(M) = \mathrm{Ddet}^s(M)$. In addition to satisfying our three axioms, the Dieudonné determinant has several other properties [1, 17, 40]. Interchanging rows i and j corresponds to left-multiplying by the matrix $P_{ij} = B_{ij}(1)B_{ji}(-1)B_{ij}(1)$. But $-1 \in [H^*, H^*]$, so $\det P_{ij} = 1[H^*, H^*]$: interchanging two rows doesn't change the determinant. When n = 2,
\[ \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \det\begin{pmatrix} a & b \\ 0 & d - ca^{-1}b \end{pmatrix} = (ad - aca^{-1}b)[H^*, H^*] \]
if a ≠ 0, and
\[ \det\begin{pmatrix} 0 & b \\ c & d \end{pmatrix} = \det\begin{pmatrix} c & d \\ 0 & b \end{pmatrix} = cb[H^*, H^*] = -bc[H^*, H^*]. \]
\[ m(v_1 + v_2) \subset m(v_1) + m(v_2). \]
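The 2 × 2 formulas above translate directly into code once the coset is replaced by its norm, which gives the normalized determinant Ddet. The following Python sketch is an illustration under that assumption; quaternions are 4-tuples (real, i, j, k), and the helper names and the two sample matrices are hypothetical choices, not taken from the article.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(q):
    """Inverse of a nonzero quaternion: conjugate divided by squared norm."""
    n2 = sum(c * c for c in q)
    return (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)

def qnorm(q):
    return math.sqrt(sum(c * c for c in q))

def ddet2(m):
    """Normalized Dieudonne determinant of a 2 x 2 quaternionic matrix."""
    (a, b), (c, d) = m
    if any(a):
        # a != 0: norm of ad - a c a^{-1} b
        left = qmul(a, d)
        right = qmul(qmul(qmul(a, c), qinv(a)), b)
        return qnorm(tuple(x - y for x, y in zip(left, right)))
    # a == 0: norm of cb
    return qnorm(qmul(c, b))

ONE, I, J = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
MINUS_K = (0, 0, 0, -1)

M1 = [[ONE, I], [J, MINUS_K]]   # singular: d - c a^{-1} b = -k - ji = 0
M2 = [[ONE, J], [I, MINUS_K]]   # transpose: d - c a^{-1} b = -k - ij = -2k

print(ddet2(M1), ddet2(M2))
```

The pair `M1`, `M2` also illustrates that a matrix and its transpose can have different determinants over the quaternions.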
We can also show that multiplying a row on the left by m or multiplying a column on the right by m multi-
Moore
We started out by showing what was wrong with the Cayley determinant. But sometimes it does work. Granted that his formula doesn't make sense in general, does it still make sense for certain matrices? The answer is that if we restrict to Hermitian quaternionic matrices (M* = M), then we get a useful function by specifying a certain ordering of the factors in the n! terms in the sum. This was first studied by Eliakim Hastings Moore (for biographical information about Moore, see [37]), and I will denote it by Mdet. Let $\sigma$ be a permutation of n. Write it as a product of disjoint cycles. Permute each cycle cyclically until the smallest number in the cycle is in front. Then sort the cycles in decreasing order according to the first number of each cycle. In other words, write
\[ \sigma = (n_{11} \cdots n_{1l_1})(n_{21} \cdots n_{2l_2}) \cdots (n_{r1} \cdots n_{rl_r}), \]
where for each i, we have $n_{i1} < n_{ij}$ for all j > 1, and $n_{11} > n_{21} > \cdots > n_{r1}$. Then we define
\[ \mathrm{Mdet}\, M = \sum_{\sigma \in S_n} |\sigma|\, m_{n_{11}n_{12}} \cdots m_{n_{1l_1}n_{11}}\, m_{n_{21}n_{22}} \cdots m_{n_{rl_r}n_{r1}}. \]
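The ordering rule in this definition can be sketched in a few lines of Python. The code below is an illustration under two assumptions: $|\sigma|$ is read as the sign of the permutation, and quaternions are stored as 4-tuples (real, i, j, k). Indices are 0-based, and the helper names are hypothetical.

```python
from itertools import permutations

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def ordered_cycles(perm):
    """Disjoint cycles of perm, each starting at its smallest element,
    sorted in decreasing order of that first element."""
    seen, cycles = set(), []
    for start in range(len(perm)):   # increasing start => smallest element first
        if start in seen:
            continue
        cycle, nxt = [start], perm[start]
        seen.add(start)
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        cycles.append(cycle)
    cycles.sort(key=lambda c: c[0], reverse=True)
    return cycles

def moore_det(m):
    """Sum over permutations of sign times the cycle-ordered product."""
    n = len(m)
    total = (0, 0, 0, 0)
    for perm in permutations(range(n)):
        cycles = ordered_cycles(perm)
        sign = -1 if (n - len(cycles)) % 2 else 1
        term = (sign, 0, 0, 0)
        for cycle in cycles:
            for pos, a in enumerate(cycle):
                b = cycle[(pos + 1) % len(cycle)]   # factor m[a][sigma(a)]
                term = qmul(term, m[a][b])
        total = tuple(x + y for x, y in zip(total, term))
    return total

q = (1, 1, 1, 1)
q_bar = (1, -1, -1, -1)
H = [[(2, 0, 0, 0), q],
     [q_bar, (3, 0, 0, 0)]]   # Hermitian: H* = H

print(moore_det(H))
```

For this Hermitian 2 × 2 example the two terms are $m_{22}m_{11} = 6$ and $-m_{12}m_{21} = -|q|^2 = -4$, so the result is the real number 2.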
If H is Hermitian, then Mdet H is a real number. I will not go into details, but refer to the work of Moore, Jacobson, Dyson, Mehta, Chen, Van Praag, and Piccinni [5, 11, 12, 20, 21, 27, 28, 32, 33, 34, 39, 49, 50]. But I would again like to make some comments. In general, it is difficult to talk about eigenvalues of a quaternionic matrix [13, 29]. As we work with right vector spaces, we must consider right eigenvalues. If $Mx = x\lambda$,
then for q ≠ 0, we get $M(xq) = x\lambda q = (xq)q^{-1}\lambda q$.
Hence, all the conjugates of $\lambda$ are also eigenvalues. Let us study the conjugacy classes more closely. For nonzero $q \in H$, we define $\rho(q)$ by $\rho(q)(x) = qxq^{-1}$. Since $\rho(q)$ leaves the real axis invariant and is orthogonal, we can restrict to $R^3$. It is easy to see [18] that if we write $q = q_0 + q'$ with $q_0 \in R$ and $q' \in R^3$, then $\rho(q)$ represents the rotation of $R^3$ with the axis $q'$ and angle $2\arctan(|q'|/q_0)$. From this we get that if x is real, then the conjugacy class of x is just {x}, whereas for $x \in S^3 \setminus \{\pm 1\}$, we get a copy of $S^2$ containing x and orthogonal to the real axis. Suppose that $\lambda = \lambda_0 + \lambda'$ with $\lambda_0 \in R$ and $\lambda' \in R^3$. Then $q\lambda q^{-1} = \lambda_0 + q\lambda'q^{-1}$, and the conjugacy class of $\lambda'$ intersects the i axis at $\pm|\lambda'|i$. It follows that the conjugacy class of a non-real eigenvalue contains exactly two complex numbers and that they are conjugate. If p is complex and v = u + jw, then Mv = vp if and only if $\psi(M)(u\ w)^t = (u\ w)^tp$, and it can be proved by induction [29] that the eigenvalues of $\psi(M)$ occur in conjugate pairs. It follows that the eigenvalues of $\psi(M)$ are precisely the 2n numbers $\lambda_1, \ldots, \lambda_n$ and $\bar\lambda_1, \ldots, \bar\lambda_n$, whereas the eigenvalues of M are the elements of the conjugacy classes of $\lambda_1, \ldots, \lambda_n$, where we can replace $\lambda_i$ by $\bar\lambda_i$. It is now easy to show [29] that M is symplectically similar to a triangular matrix with diagonal elements $d_i$, where $d_i$ equals $\lambda_i$ or $\bar\lambda_i$. For more about normal forms of quaternionic matrices, see [27], [29], [41], [45], and [51]. If we restrict to a Hermitian matrix, H, then it turns out that all its eigenvalues are real (and there are, therefore, precisely n of them, since each conjugacy class only contains one element) and that the matrix can be symplectically diagonalized; that is, we can find $P \in GL(n, H)$ such that $PH\bar{P}^t = D$,
THEOREM 10. Let H be a Hermitian quaternionic matrix. Then
\[ |\mathrm{Mdet}\, H| = \mathrm{Ddet}\, H \]
and
\[ \mathrm{Mdet}\, H\,[H^*, H^*] = \det H. \tag{8} \]
For any quaternionic matrix M, we have
\[ \mathrm{Sdet}\, M = \mathrm{Mdet}(MM^*). \tag{9} \]
Proof. It can be shown that for a Hermitian matrix, the Moore determinant is equal to the product of the eigenvalues, so Mdet H is real-valued. But the normalized Dieudonné determinant of a diagonal matrix is the norm of the product of the diagonal elements, so (8) follows. To prove (9), we just have to observe that the eigenvalues of MM* are positive, and use the product rule and (6). □ Finally, if H is Hermitian, then
\[ (J\psi(H))^t = -\psi(H)^tJ = -J\overline{\psi(H)}^t = -J\psi(H), \]
so $J\psi(H)$ is skew-symmetric, and we can take its Pfaffian [14]. But $\mathrm{pf}(-J\psi(H))^2 = \det_C(-J\psi(H)) = \mathrm{Ddet}^2 H = \mathrm{Mdet}^2 H$, so
\[ \mathrm{Mdet}(H) = \mathrm{pf}(-J\psi(H)). \]
For other applications of the Pfaffian, see [2] and [3].
SP(n)
I would like to finish with a simple application of these ideas. As mentioned in the introduction, the group SP(n) can be defined as the group preserving the norm on $H^n$. But the usual description of this group is by considering its image under $\psi$ in M(2n, C). It is easy to see that all such matrices have determinants ±1. There are different ways of proving that in fact the determinant is equal to 1, but this also follows from the results above, since all matrices in $\psi(GL(n, H))$ have positive determinants. In conclusion, I would also like to mention the recent work of Gelfand and Retakh [22]. Unfortunately, it is beyond the scope of this article to report on it.
Acknowledgments
where $\bar{P}^t = P^{-1}$ and D is diagonal and real. We can now prove the following theorem that relates the Moore determinant to the other determinants.
The author would like to thank Jon Berrick, P.M. Cohn, Soo Teck Lee, and the referee for help in improving this article.
References
1. E. Artin, Geometric Algebra, New York: Interscience, 1957; reprinted by Wiley, New York, 1988.
2. H. Aslaksen, SO(2) invariants of a set of 2 × 2 matrices, Math. Scand. 65 (1989), 59-66.
3. H. Aslaksen, E.-C. Tan, and C. Zhu, Invariant theory of special orthogonal groups, Pacific J. Math. (in press).
4. A. Bagazgoitia, A determinantal identity for quaternions, in Proceedings of 1983 Conference on Algebra Lineal y Aplicaciones, Vitoria-Gasteiz, Spain, 1984, pp. 127-132.
5. R. W. Barnard and E. Hastings Moore, General analysis. Part 1, Memoirs of the American Philosophical Society, 1935.
6. J. Brenner, Expanded matrices from matrices with complex elements, SIAM Rev. 3 (1961), 165-166.
7. J. Brenner, Applications of the Dieudonné determinant, Linear Algebra Appl. 1 (1968), 511-536.
8. J. Brenner, Corrections to "Applications of the Dieudonné determinant," Linear Algebra Appl. 13 (1976), 289.
9. J. Brenner and J. De Pillis, Generalized elementary symmetric functions and quaternion matrices, Linear Algebra Appl. 4 (1971), 55-69.
10. A. Cayley, On certain results relating to quaternions, Philos. Mag. 26 (1845), 141-145; reprinted in The Collected Mathematical Papers Vol. 1, Cambridge: Cambridge University Press, 1989, pp. 123-126.
11. L. Chen, Definition of determinant and Cramer solution over the quaternion field, Acta Math. Sinica (N.S.) 7 (1991), 171-180.
12. L. Chen, Inverse matrix and properties of double determinant over quaternion field, Sci. China Ser. A 34 (1991), 528-540.
13. P. M. Cohn, The similarity reduction of matrices over a skew field, Math. Z. 132 (1973), 151-163.
14. P. M. Cohn, Algebra, vol. 1, 2nd ed., New York: Wiley, 1991.
15. P. M. Cohn, Algebra, vol. 3, 2nd ed., New York: Wiley, 1991.
16. M. L. Curtis, Matrix Groups, New York: Springer-Verlag, 1979; 1984.
17. J. Dieudonné, Les déterminants sur un corps non-commutatif, Bull. Soc. Math. France 71 (1943), 27-45.
18. J. Dieudonné, Special Functions and Linear Representations of Lie Groups, CBMS 42, Providence, RI: American Mathematical Society, 1980.
19. R. Dimitrić and B. Goldsmith, Sir William Rowan Hamilton, Math. Intelligencer 11 (1989), no. 2, 29-30.
20. F. J. Dyson, Correlations between eigenvalues of a random matrix, Commun. Math. Phys. 19 (1970), 235-250.
21. F. J. Dyson, Quaternion determinants, Helv. Phys. Acta 45 (1972), 289-302.
22. I. M. Gelfand and V. S. Retakh, Determinants of matrices over noncommutative rings, Functional Anal. Appl. 25 (1991), 91-102.
23. F. Reese Harvey, Spinors and Calibrations, New York: Academic Press, 1990.
24. W. R. Hamilton, Elements of Quaternions, 2nd ed., London: Longman, 1889.
25. A. Heyting, Die Theorie der linearen Gleichungen in einer Zahlenspezies mit nichtkommutativer Multiplikation, Math. Ann. 98 (1927), 465-490.
26. M. H. Ingraham, A note on determinants, Bull. Amer. Math. Soc. 43 (1937), 579-580.
27. N. Jacobson, Normal semi-linear transformations, Amer. J. Math. 61 (1939), 45-58.
28. N. Jacobson, An application of E. H. Moore's determinant of a Hermitian matrix, Bull. Amer. Math. Soc. 45 (1939), 745-748.
29. H. C. Lee, Eigenvalues and canonical forms of matrices with quaternionic entries, Proc. Roy. Irish Acad. Sect. A 52 (1949), 253-260.
30. D. W. Lewis, A determinantal identity for skewfields, Linear Algebra Appl. 7 (1985), 213-217.
31. K. O. May, The impossibility of a division algebra of vectors in three dimensional space, Amer. Math. Monthly 73 (1966), 289-291.
32. Madan Lal Mehta, Determinants of quaternion matrices, J. Math. Phys. Sci. 8 (1974), 559-570.
33. Madan Lal Mehta, Elements of Matrix Theory, Delhi: Hindustan Pub. Corp., 1977.
34. E. H. Moore, On the determinant of an hermitian matrix of quaternionic elements, Bull. Amer. Math. Soc. 28 (1922), 161-162.
35. Thomas Muir, The Theory of Determinants, Vol. 2, London: MacMillan, 1911.
36. Ø. Ore, Linear equations in non-commutative fields, Ann. Math. 32 (1931), 463-477.
37. K. Hunger Parshall and D. E. Rowe, The Emergence of the American Mathematical Research Community, 1876-1900: J. J. Sylvester, Felix Klein and E. H. Moore, Providence, RI: American Mathematical Society, 1994.
38. J. M. Peirce, Determinants of quaternions, Bull. Amer. Math. Soc. 5 (1899), 335-337.
39. P. Piccinni, Dieudonné determinant and invariant real polynomials on gl(n, H), Rend. Mat. (7) 2 (1982), 31-45.
40. R. S. Pierce, Associative Algebras, New York: Springer-Verlag, 1982.
41. J. Radon, Lineare Scharen orthogonaler Matrizen, Abh. Math. Sem. Univ. Hamburg 1 (1922), 2-14.
42. A. R. Richardson, Hypercomplex determinants, Messenger of Math. 55 (1926), 145-152.
43. A. R. Richardson, Simultaneous linear equations over a division algebra, Proc. London Math. Soc. 28 (1928), 395-420.
44. E. Study, Zur Theorie der linearen Gleichungen, Acta Math. 42 (1920), 1-61.
45. O. Teichmüller, Operatoren im Wachsschen Raum, J. Reine Angew. Math. 174 (1935), 73-124.
46. C. L. Tong, Symplectic Groups, honours thesis, National Univ. of Singapore, 1991.
47. B. Leendert van der Waerden, Hamilton's discovery of quaternions, Math. Mag. 49 (1976), 227-234.
48. B. Leendert van der Waerden, A History of Algebra, New York: Springer-Verlag, 1985.
49. P. Van Praag, Sur les déterminants des matrices quaterniennes, Helv. Phys. Acta 62 (1989), 42-46.
50. P. Van Praag, Sur la norme réduite du déterminant de Dieudonné des matrices quaterniennes, J. Algebra 136 (1991), 265-274.
51. L. A. Wolf, Similarity of matrices in which the elements are real quaternions, Bull. Amer. Math. Soc. 42 (1936), 737-743.
52. L. E. Zagorin, The determinants of matrices over a field (Russian), Proc. First Republican Conf. Math. Byelorussia, Izdat, Minsk: "Vysšaja Škola", 1965, pp. 151-152.
Department of Mathematics
National University of Singapore
Singapore 0511
Republic of Singapore
e-mail: [email protected]
The Solution of the n-body Problem* Florin Diacu
The wind scrambles and thunders over hills with a voice far below what we can hear. Whalesong, birdsongs boom and twitter. Sea, air, everything's a chaos of signals and even those we've named veer and fall in pieces under our neat labels. Waves-how to speak of the structure of waves when all disperses and there's nothing fixed to tell?
--Philip Holmes, Background Noise
Folk-Mathematics
A folk-tale is a popular story uttered from one generation to the next. The main source of culture in times of old, oral tradition plays a marginal role in spreading scientific information today. Still, its significance is by no means negligible, and all domains of human activity are more or less influenced by it. Mathematics is no exception. We all know theorems we have never read in books or papers or learned about at formal presentations. We often don't know a reference, have no idea who proved that result, how, and when. Usually a colleague mentioned it at some conference dinner, during a coffee break, or in a friendly discussion in our Department. It is striking, it sticks in our mind, and after a while it is part of our mathematical heritage: we just know it. Then we tell it further under similar circumstances, and so the wheel turns on. We will call this component of our knowledge folk-mathematics.
Without denying the positive role folk-mathematics plays in spreading information, we must admit that results gathered through it are sometimes misleading or misunderstood. A typical example is the Cantor set. Everybody knows that the middle-third Cantor set has zero Lebesgue measure, and many believe that the middle-fifth analogue has positive measure. Intuitively this sounds plausible: if we remove each time a smaller segment, the remaining quantity should be larger. Unfortunately, the intuition leads us astray this time. For any k, the middle-kth Cantor set has zero measure. Though a simple computation would show this, few do it, so the mistake propagates from one mathematician to the other. We can indeed obtain a Cantor set of positive measure by assigning a variable removal step. Delete first the middle-third segment, then the middle-ninth, then the middle-twenty-seventh, and so on. This algorithm will lead us to the desired result.
The above example is easy to check, but what are we up against when a more complicated folk-mathematical situation appears? Physicists and mathematicians less familiar with celestial mechanics have asked me on different occasions to provide details about the "impossibility of solving the n-body problem." Some had heard that Poincaré had proved the result, others recalled only that such a theorem exists somewhere in the literature. After all, this is a natural question. Since Abel and Galois proved the impossibility of solving algebraic equations
*Dedicated to Philip Holmes, for his deep mathematics, for his warm and candid poetry, and for the immense intellectual joy he has instilled in me during the time our book took shape.
66 THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
of degree five or higher through formulae involving only roots, why should there not be an impossibility proof for solving the n-body problem? The astonishment comes when we respond that the n-body problem has already been solved. Of course, the answer requires explanation, and since this old question of celestial mechanics continues to raise interesting challenges (as it has for the last three centuries), it is worth telling here the intriguing story and the unexpected consequences of the most important attempts to obtain an explicit solution.
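The Cantor-set claim made earlier in this section can be checked with the "simple computation" the text alludes to. The sketch below is illustrative and assumes that "middle-kth" means removing, at every step, the middle piece of relative length 1/k from each remaining interval, and that the variable scheme removes relative lengths 1/3, 1/9, 1/27, and so on; the function name is hypothetical.

```python
def remaining_length(fractions):
    """Total length left in [0, 1] after removing, at step n, the middle
    piece of relative length fractions[n] from every remaining interval."""
    length = 1.0
    for f in fractions:
        length *= 1.0 - f
    return length

# Fixed removal step: the middle fifth at every stage, 200 stages.
middle_fifth = remaining_length([1 / 5] * 200)

# Variable removal step: 1/3, then 1/9, then 1/27, ...
variable = remaining_length([3.0 ** -n for n in range(1, 61)])

print(middle_fifth, variable)
```

With a fixed step the length is $(1 - 1/k)^n$, which tends to 0 for every k, whereas the variable scheme leaves the convergent product $\prod_{n \geq 1}(1 - 3^{-n})$, a positive number.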
King Oscar's Prize
Having its origins in Newton's Principia, the n-body problem of celestial mechanics is an initial-value problem for ordinary differential equations: for given initial data $q_i(0)$, $\dot{q}_i(0)$, i = 1, ..., n (with $q_i(0) \neq q_j(0)$ for mutually distinct i and j), find the solution of the second-order system
\[ m_i\ddot{q}_i = \sum_{j \neq i} \frac{m_im_j(q_j - q_i)}{|q_i - q_j|^3}, \qquad i = 1, \ldots, n, \tag{$*$} \]
where $m_1, m_2, \ldots, m_n$ are constants representing the masses of n point-masses, and $q_1, q_2, \ldots, q_n$ are 3-dimensional vector functions of the time variable t, describing the positions of the point-masses. For n = 2 the problem was completely solved by Johann Bernoulli in 1710 (see [B], [W], [DH]), but for more than a century and a half after Bernoulli's success, the case n ≥ 3 eluded the efforts of everyone. Interest in the problem grew towards the end of the last century, when a special event made the best mathematicians look at celestial mechanics with more concern than ever before. In volume 7, 1885/86, Acta Mathematica announced the establishment of a prize in honour of King Oscar II of Sweden and Norway, to be awarded on the King's 60th birthday: 21 January 1889. The deadline for submission was set for 1 June 1888. Finding a convergent power-series solution of the above initial value problem was the first, and the most important, among the four questions proposed by the three-member jury: Gösta Mittag-Leffler (the editor-in-chief of Acta), Charles Hermite, and Karl Weierstrass. The formulation of the first question, due to Weierstrass, who had shown growing interest in the problem himself, appeared in German and French as follows in our translation (a slightly different translation was given by Daniel Goroff in [P]):
Given a system of arbitrarily many mass points that attract each other according to Newton's laws, under the assumption that no two points ever collide, try to find a representation of the coordinates of each point as a series in a variable that is some known function of time and for all of whose values the series converges uniformly. This problem, whose solution would considerably extend our understanding of the solar system, seems capable of solution using analytic methods now at our disposal; we can at least suppose as much, since Lejeune Dirichlet communicated shortly before his death to a geometer of his acquaintance [Leopold Kronecker] that he had discovered a method for integrating the differential equations of Mechanics, and that by applying this method, he had succeeded in demonstrating the stability of our planetary system in an absolutely rigorous manner. Unfortunately, we know nothing about this method, except that the theory of small oscillations would appear to have served as his point of departure for this discovery. We can nevertheless suppose, almost with certainty, that this method was based not on long and complicated calculations, but on the development of a fundamental and simple idea that one could reasonably hope to recover through persevering and penetrating research. In the event that this problem remains unsolved at the close of the contest, the prize may also be awarded for a work in which some other problem of Mechanics is treated as indicated and solved completely.
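For readers who prefer to see the system (*) in executable form, the right-hand side can be sketched as follows. This is an illustration only: it assumes units in which the gravitational constant is 1 and the attractive sign convention, with the force on body i directed toward body j; the function name `accelerations` is hypothetical.

```python
import math

def accelerations(masses, positions):
    """Acceleration of each point mass under mutual Newtonian attraction:
    a_i = sum over j != i of m_j (q_j - q_i) / |q_i - q_j|^3."""
    n = len(masses)
    result = []
    for i in range(n):
        acc = [0.0, 0.0, 0.0]
        for j in range(n):
            if j == i:
                continue
            diff = [positions[j][k] - positions[i][k] for k in range(3)]
            dist_cubed = math.sqrt(sum(c * c for c in diff)) ** 3
            for k in range(3):
                acc[k] += masses[j] * diff[k] / dist_cubed
        result.append(tuple(acc))
    return result

# Two bodies of masses 1 and 2, separated by distance 2 along the x-axis.
masses = [1.0, 2.0]
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
acc = accelerations(masses, positions)
print(acc)
```

In the two-body example the mass-weighted accelerations cancel, reflecting conservation of total momentum; the assumption $q_i \neq q_j$ in the problem statement is what keeps the denominators nonzero.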
Out of the 12 papers eventually submitted for the competition, 5 treated the n-body problem; none of them, however, obtained the required power-series solution. Under these circumstances the jury decided to award the prize to the 35-year-old Henri Poincaré, for his remarkable contribution to the understanding of the equations of dynamics (called Hamiltonian systems today) and for the many new ideas he brought into mathematics and mechanics. Indeed, Poincaré's memoir, later developed into his monumental 3-volume work Les Méthodes Nouvelles de la Mécanique Céleste, laid the foundations of several branches of mathematics and--most important--opened the way to qualitative methods, as opposed to the quantitative ones that had reigned in analysis since Newton and Leibniz. Published in volume 12, 1890, of Acta Mathematica, Poincaré's memoir offered the first example of chaotic behavior in a deterministic system (it involved homoclinic orbits in a first-return map in the restricted 3-body problem). In fact Poincaré understood the complicated behavior of those orbits only after the prize was awarded to him. The first version of his paper, the one actually awarded the prize, incorrectly claimed that such orbits were stable, by missing the important fact that the homoclinic intersection might be transversal. Assaulted with questions by Edvard Phragmén, the assistant editor at Acta in charge of preparing the manuscript for publication, Poincaré finally discovered and corrected the mistake. Phragmén had found Poincaré's work very hard to read. The initial version almost doubled in size after Phragmén's repeated requests for clarification. Writing about the subsequent 1895 paper entitled Analysis Situs, Jean Dieudonné [Di] characterized Poincaré's style in the following words:
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
As in so many of his papers, he gave free rein to his imaginative powers and his extraordinary intuition, which only very seldom led him astray; in almost every section is an original idea. But we should not look for precise definitions, and it is often necessary to guess what he had in mind by interpreting the context. For many results he simply gave no proof at all, and when he endeavored to write down a proof, hardly a single argument does not raise doubts. The paper is a blueprint for future developments of entirely new ideas, each of which demanded the creation of a new technique to put it in a sound basis.
Unfortunately Poincaré's correction came only after the memoir had been printed and some of Acta's issues delivered to subscribers. As editor-in-chief of Acta, as a member of the jury, and as a favorite of the King, Mittag-Leffler was put in a delicate position. To defend the honor of the prize and his own credibility and position, he decided to recall the published issues and print the correct version. Poincaré agreed to bear the costs of the first printing: 3585 Swedish crowns and 63 öre, more than the 2500 crowns he had received for the prize (to understand the figures, bear in mind that Mittag-Leffler's annual salary as a professor at the University of Stockholm had been 7000 crowns in 1882) [A], [BG]. I do not go further into the history and the scandal that followed (the interested reader can find the historical and mathematical details in [DH], our forthcoming book about the origins and the development of chaos and stability). What matters now is the negative result proved by Poincaré in the prize memoir, a result that does show the impossibility of solving the n-body problem, but only by use of a certain method.

Is this Problem Unsolvable?

First integrals (or simply integrals) for systems of differential equations are functions that remain constant along any given solution of the system, the constant depending on the solution. In other words, integrals provide relations between the variables of the system, so each scalar integral would normally allow the reduction of the system's dimension by one unit. Of course, this reduction can take place only if the integral is an algebraic--not very complicated--function with respect to its variables, such that one variable can be expressed as a function of the others. If the integral is transcendental, any attempt to obtain such an expression is pointless. At the time of Poincaré, the method of solving systems of differential equations by finding first integrals was much in use. It had been known for a long time that the n-body problem had 10 independent algebraic first integrals: 3 for the center of mass, 3 for the linear momentum, 3 for the angular momentum, and one for the energy (see, e.g., [W], [D1], [D2]). This allowed the reduction of the primitive system from 6n variables (each point-mass is represented in space by 3 position and 3 velocity components) to 6n - 10. Jacobi had shown that using a so-called reduction of nodes (some symmetries), the dimension of the system could be further reduced to 6n - 12, but this was not enough to understand even the 3-body problem--it still left a complicated 6-dimensional first-order system unsolved--not to mention higher values of n.

In 1887 the 39-year-old German mathematician Ernst Heinrich Bruns published in Acta Mathematica a surprising result [Br]: the n-body problem has no integrals--algebraic with respect to the time, the position, and the velocity coordinates--except the 10 known ones. Though some gaps were subsequently discovered in Bruns's proof, Poincaré had no doubt that the result was true. In his prize paper he proved an even stronger theorem: there are no integrals--algebraic with respect to the time, the position, and the velocities only--other than the 10 known ones. In other words, these negative results showed it is impossible to solve the equations of motion of the n-body problem by reducing the dimension of the system with the help of first integrals. This does not mean that the n-body problem is unsolvable, just that a certain method fails to solve it. In fact, standard results of differential equations theory show that any initial value problem for the equations (~), with initial data not starting from collisions, leads to the existence of a unique solution defined on a maximal interval, which is the whole real line if singularities do not occur. So the problem posed by King Oscar's prize made sense and could be solved, in principle. Unfortunately, the folk-mathematical tradition retained only one aspect of these results and perpetuated the wrong message that the n-body problem was unsolvable. After a digression into the foundations of mathematics, I will tell how the n-body problem was later solved in the spirit of King Oscar's prize.
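The statement that first integrals remain constant along any solution can be checked numerically. The following is only a sketch under stated assumptions, none of which come from the article: units in which the gravitational constant is 1, a velocity-Verlet time-stepping scheme, and illustrative initial data for three equal masses (the values usually quoted for the figure-eight orbit). It monitors the energy, the linear momentum, and the angular momentum, that is, 7 of the 10 classical integrals.

```python
# Sketch: check that classical first integrals of the n-body problem stay
# (nearly) constant along a numerically computed solution.
# Assumptions (not from the article): G = 1, velocity-Verlet stepping,
# illustrative figure-eight initial data for three unit masses.

def accelerations(masses, pos):
    """a_i = sum over j != i of m_j (q_j - q_i) / |q_j - q_i|^3."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = [pos[j][k] - pos[i][k] for k in range(3)]
                r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
                for k in range(3):
                    acc[i][k] += masses[j] * d[k] / r3
    return acc


def verlet_step(masses, pos, vel, dt):
    """Advance positions and velocities by one velocity-Verlet step."""
    n = len(masses)
    acc = accelerations(masses, pos)
    half = [[vel[i][k] + 0.5 * dt * acc[i][k] for k in range(3)] for i in range(n)]
    new_pos = [[pos[i][k] + dt * half[i][k] for k in range(3)] for i in range(n)]
    new_acc = accelerations(masses, new_pos)
    new_vel = [[half[i][k] + 0.5 * dt * new_acc[i][k] for k in range(3)] for i in range(n)]
    return new_pos, new_vel


def energy(masses, pos, vel):
    """Kinetic energy minus the sum of m_i m_j / r_ij over all pairs."""
    n = len(masses)
    kinetic = sum(0.5 * masses[i] * sum(v * v for v in vel[i]) for i in range(n))
    potential = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = sum((pos[i][k] - pos[j][k]) ** 2 for k in range(3)) ** 0.5
            potential += masses[i] * masses[j] / r
    return kinetic - potential


def linear_momentum(masses, vel):
    return [sum(m * v[k] for m, v in zip(masses, vel)) for k in range(3)]


def angular_momentum(masses, pos, vel):
    total = [0.0, 0.0, 0.0]
    for m, q, v in zip(masses, pos, vel):
        total[0] += m * (q[1] * v[2] - q[2] * v[1])
        total[1] += m * (q[2] * v[0] - q[0] * v[2])
        total[2] += m * (q[0] * v[1] - q[1] * v[0])
    return total


def integrals(masses, pos, vel):
    return (energy(masses, pos, vel),
            linear_momentum(masses, vel),
            angular_momentum(masses, pos, vel))


def run(steps=1000, dt=0.001):
    """Return the integrals before and after integrating for steps * dt."""
    masses = [1.0, 1.0, 1.0]
    pos = [[0.97000436, -0.24308753, 0.0],
           [-0.97000436, 0.24308753, 0.0],
           [0.0, 0.0, 0.0]]
    vel = [[0.466203685, 0.43236573, 0.0],
           [0.466203685, 0.43236573, 0.0],
           [-0.93240737, -0.86473146, 0.0]]
    before = integrals(masses, pos, vel)
    for _ in range(steps):
        pos, vel = verlet_step(masses, pos, vel, dt)
    return before, integrals(masses, pos, vel)


if __name__ == "__main__":
    before, after = run()
    print("energy drift:", abs(after[0] - before[0]))
```

With this scheme the two momenta should be preserved up to rounding, while the energy is only approximately preserved; the remaining three integrals express the uniform motion of the center of mass and could be monitored in the same way.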
Brouwer's Attack

All active mathematicians have opinions about what problems have importance, what branches are difficult, and what directions are promising in their own field. But unlike other sciences, whatever differences of opinion arise, all mathematicians agree that a result proved two millennia, two centuries, or two years ago, remains true forever. The progress of mathematics has little to do with the foundations. In spite of this, some prominent mathematicians have dedicated time and energy towards understanding the roots of their discipline. Sometimes, their efforts have raised polemics and disputes as sharp as those frequently met in other domains of human activity.

In 1913, the 32-year-old Luitzen Brouwer launched an attack against a well established mathematical method of reasoning. As an editor of the prestigious Mathematische Annalen, he rejected all submitted papers that used reductio ad absurdum as a method of proof. This led to a scandal. The editorial board held an emergency meeting to save the reputation of the journal. The board resigned as a whole and reelected itself, except Brouwer. Offended by his colleagues' attitude and supported by his government, Brouwer immediately established a rival journal in Holland [G].

That embarrassing incident marked the beginning of a long fight between intuitionism and formalism, the main schools of mathematical-philosophical thought at the beginning of our century, each claiming to have found--against the other--the only viable way of laying the foundations of mathematics. The building of foundations had come to seem urgent due to the antinomies, known already by the Greeks, but which had now started to embarrass the recently established set theory. The main objection of Brouwer's intuitionism against Hilbert's formalism concerned existence theorems. Brouwer considered that a nonconstructive argument cannot be accepted as proof of existence, so reductio ad absurdum seemed to him a good point to start the polemic. On the other hand Hilbert, who took Brouwer's action personally, attempted to show that every theorem can be deduced by logical steps from the postulates of a given axiomatic system. Unfortunately, in this respect the German mathematician was wrong.

In 1931, Hilbert's formalism received a sharp blow when the Austrian logician Kurt Gödel published his incompleteness theorem [Gö]. Gödel proved that any sufficiently rich, sound, and recursively axiomatizable theory is incomplete. A recent paper [CJZ] goes even further by showing that, in a quite general topological sense, incompleteness is a common phenomenon: with respect to any reasonable topology, the set of true and unprovable statements is dense in the set of all statements. This result has persuaded some mathematicians that the future of mathematics is not with proving theorems but with trying to estimate the probability that a result is true.

On the other hand, Brouwer's intuitionism--though never fully refuted by any other theory and still the object of some research--fell into oblivion, because it raised barriers which the mathematical community refused to acknowledge. Mathematics has developed almost undisturbed by the fight for its foundations.

We will further see, however, that the main idea of intuitionism is off target. In certain cases a constructive proof of existence brings no more information than a nonconstructive one. This is surprising, and the example I offer is the n-body problem.

The Series Solution

In 1913, when he launched the attack that would deprive him of editorial membership at the Mathematische Annalen, Brouwer was not aware of a paper published in Acta Mathematica a few months before by a Finn of Swedish origin, Karl Sundman. If he had known and understood Sundman's work, Brouwer would probably never have developed his intuitionism.

Sundman's paper [Su3] revisited and republished some of his own results (inspired by a previous work of the Italian mathematician Giulio Bisconcini [Bi]) that had appeared in 1907 [Su1] and 1909 [Su2] in a Finnish journal of lesser fame and circulation. One of Sundman's achievements was to find, for almost all admissible initial data, a series solution of the 3-body problem. If he had gotten this result 22 years earlier, he would have probably been awarded King Oscar's prize.

Reading Sundman's paper we see that he obtained a series solution in powers of t^(1/3) for the 3-body problem, a series convergent for all real t, except for a negligible set of initial conditions, namely, those for which the angular momentum is zero. Indeed, Sundman proved first the convergence of the series as long as no collisions take place. (The importance of the method developed in that paper, which is based on the theory of functions of a complex variable, is analyzed in a nice article by Donald Saari [S].) Sundman also surmounted the impediment of binary collisions through a process he called regularization, which means to analytically extend the solution beyond the collision singularity, and which physically corresponds to an elastic bounce. In this case, his series still proves convergent for all real values of the time variable. Unfortunately he could not apply the same method if a triple collision occurs, but he showed that such a collision can take place only if the angular momentum cancels, hence for a set of initial data having measure zero. (Even within this set, the subset of initial data leading to triple collisions has measure zero, as one of Saari's students has shown in his Ph.D. thesis [U].) In 1941, Carl Ludwig Siegel proved that such a regularization is possible only for a negligible set of masses, so indeed, the analytic continuation of triple collisions is generically impossible [Si].

Sundman's method failed to apply to the n-body problem for n > 3. It took about 7 decades until the general case was solved. In 1991, a Chinese student, Quidong (Don) Wang, published a beautiful paper [Wa], [D1], in which he provided a convergent power series solution of the n-body problem. He omitted only the case of solutions leading to singularities--collisions in particular. (To understand the complications raised by solutions with singularities, see [D2].)

Did this mean the end of the n-body problem? Was this old question--unsuccessfully attacked by the greatest mathematicians of the last 3 centuries--merely solved by a student in a moment of rare inspiration? Though he provided a solution as defined in sophomore textbooks, does this imply that we know everything about gravitating bodies, about the motion of planets and stars? Paradoxically, we do not; in fact we know nothing more than before having this solution. The following section deals with this apparent paradox.
The Foundations of Mathematics

What Sundman and Wang did is in accord with the way solutions of initial value problems are defined; everything is apparently all right; but there is a problem, a big one: these series solutions, though convergent on the whole real axis, have very slow convergence. One would have to sum up millions of terms to determine the motion of the particles for insignificantly short intervals of time. The round-off errors make these series unusable in numerical work. From the theoretical point of view, these solutions add nothing to what was previously known about the n-body problem.

This unusual situation makes us think once more about the foundations of our discipline. First of all, it illustrates that even a constructive solution can be useless from the practical point of view. Then why stick to it, why give intuitionism any concern? Well, this difficulty would still not keep us from sleeping soundly. How many of us really care about intuitionism when doing mathematics?

Unfortunately, doubt is also cast on the definition of a solution for an initial value problem attached to a differential equation. If our definition is meaningful, then shouldn't it exclude totally useless solutions? In certain cases all our efforts toward finding and writing down solutions might be as futile as Sisyphus's work; moreover, we have no way of knowing in advance when this will be the case. What to do then? Eliminate power series solutions from our definition? This would mean to negate two centuries of mathematics and throw many achievements away. Clearly there is no simple answer.

The third problem is connected to what "good" mathematics means. Consciously or not, we usually understand by this the mathematics promoted by famous mathematicians. No one would doubt that the mathematics of Weierstrass, for example, was and remains "good." But Weierstrass stated the first problem of King Oscar's prize, a problem tackled by the sharpest minds of the time. It was eventually solved exactly as the German mathematician had wished; still, a hundred years later, its solution presents only historical interest. Fortunately, the genius of Poincaré steered our discipline in the right direction--at least this is what we believe today. But how will mathematicians think a hundred years from now?

The n-body problem--a bulwark against the flow of time, a reliable landmark on the map of mathematics--has posed and continues to pose new challenges. Almost untouched, mysterious as in the beginning, it has survived 300 years of siege. It has kindled and witnessed a few revolutions: the beginnings of calculus, of qualitative methods, of relativity, of chaos; tackled numerically, it has contributed to the launch of satellites and to the first human step on the moon. Now it is disturbing the fundamentals of differential equations theory, the structure on which a significant part of modern science and technology is based. Do we have an answer to this last challenge?
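The point made in this section, that a series may converge everywhere and still be unusable once round-off errors enter, can be illustrated on a much simpler series. The sketch below is an analogy only and has nothing to do with the actual series of Sundman or Wang: it sums the Taylor series of the exponential function at -30 in double precision. The function name `exp_series` and the choice of 200 terms are assumptions made for the illustration.

```python
import math

# Sketch: a power series that converges for every argument can still be
# useless in floating-point arithmetic. Analogy only; this is not the
# series of Sundman or Wang.

def exp_series(x, terms=200):
    """Sum the Taylor series of exp(x) term by term in double precision."""
    total = 0.0
    term = 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)
    return total


exact = math.exp(-30.0)

# Alternating terms of order 1e11 must cancel down to about 1e-13,
# so rounding errors swamp the answer.
naive = exp_series(-30.0)

# Same series with positive terms only, followed by one division.
stable = 1.0 / exp_series(30.0)

naive_rel_error = abs(naive - exact) / exact
stable_rel_error = abs(stable - exact) / exact

if __name__ == "__main__":
    print("relative error, direct summation:", naive_rel_error)
    print("relative error, reciprocal form: ", stable_rel_error)
```

Both computations use the same convergent series; only the second gives a usable number, which is the sense in which convergence alone says little about practical value.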
References

[A] K.G. Andersson, Poincaré's discovery of homoclinic points, Archive for History of Exact Sciences 48 (1994), 133-147.
[BG] J. Barrow-Green, Oscar II's prize competition and the error in Poincaré's memoir on the three body problem, Archive for History of Exact Sciences 48 (1994), 107-131.
[B] J. Bernoulli, Opera Omnia, vol. I, Georg Olms Verlagsbuchhandlung, Hildesheim, 1968.
[Bi] G. Bisconcini, Sur le problème des trois corps, Acta Mathematica 30 (1906), 49-92.
[Br] E.H. Bruns, Über die Integrale des Vielkörper-Problems, Acta Mathematica 11 (1887), 25-96.
[CJZ] C. Calude, H. Jürgensen and M. Zimand, Is independence an exception? Applied Math. Comput. 66 (1994), 63-76.
[D1] F.N. Diacu, Singularities of the N-Body Problem, Les Publications CRM, Montréal, 1992.
[D2] F.N. Diacu, Painlevé's conjecture, The Mathematical Intelligencer 15 (1993), no. 2, 6-12.
[DH] F.N. Diacu and P. Holmes, Celestial Encounters--The Origins of Chaos and Stability, Princeton University Press (to appear in August 1996).
[Di] J. Dieudonné, A History of Algebraic and Differential Topology 1900-1960, Birkhäuser, Boston, Basel, 1989.
[G] R.L. Goodstein, Essays in the Philosophy of Mathematics, Leicester University Press, 1965.
[Gö] K. Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, Monatshefte für Mathematik und Physik 38 (1931), 173-198.
[P] H. Poincaré, New Methods of Celestial Mechanics (with an introduction by D.L. Goroff), American Institute of Physics, 1993.
[S] D.G. Saari, A visit to the Newtonian N-body problem via elementary complex variables, The American Mathematical Monthly 97 (1990), 105-119.
[Si] C.L. Siegel, Der Dreierstoß, Annals of Mathematics 42 (1941), 127-168.
[Su1] K. Sundman, Recherches sur le problème des trois corps, Acta Societatis Scientiarum Fennicae 34 (1907), no. 6.
[Su2] K. Sundman, Nouvelles recherches sur le problème des trois corps, Acta Societatis Scientiarum Fennicae 35 (1909), no. 9.
[Su3] K. Sundman, Mémoire sur le problème des trois corps, Acta Mathematica 36 (1912), 105-179.
[U] J.B. Urenko, Improbability of collisions in Newtonian gravitational systems of specified angular momentum, SIAM J. Appl. Math. 36 (1979), 123-147.
[Wa] Q. Wang, The global solution of the n-body problem, Celestial Mechanics 50 (1991), 73-88.
[W] A. Wintner, The Analytical Foundations of Celestial Mechanics, Princeton University Press, Princeton, NJ, 1941.
Department of Mathematics and Statistics University of Victoria Victoria, British Columbia V8W 3P4 Canada
Jet Wimp*
Probability Theory: An analytic view
by Daniel W. Stroock
Cambridge, England: Cambridge University Press, 1993. xvi + 512 pp. Hardcover: $52.95, ISBN 0-521-43123-9
Reviewed by Peter Whittle

The book has an intriguing title. There is no doubt that many stochastic models can be treated either by probabilistic or by analytic methods, and that a strange complementarity between the two approaches prevents one from judging either superior. The most obvious case in point is the treatment of sums of independent random variables. This can be undertaken without recourse to the characteristic function. However, such a recourse, which amounts to an explicit appeal to Fourier techniques and the great body of associated analytic theory, offers speed and economy. The usefulness of this approach is not confined to the case of independent random variables; it is very often helpful to transform the Kolmogorov equations for a Markov process into operator equations in the characteristic function. The possibility of the two approaches, each with its own advantages, becomes even more evident if one considers the large deviation analysis of such processes when an increase in physical scale causes the model to approach determinism. The probabilistic approach appeals to the tilting of distributions and to martingale structure, the analytic approach to a WKB treatment of the operator equations--essentially to approximation of a Fourier transform by a Legendre transform.

Quite a different class of ideas was initiated by Kac in his magnificently stimulating paper [1]. Kac considered a random walk (in fact, a Wiener process) in a region with an absorbing boundary. He showed how properties of the process which were "obvious" probabilistically implied classic theorems of Weyl and Carleman on the distribution of eigenvalues of the linear operator constituted by the infinitesimal generator of the process.
* Column Editor's address: Department of Mathematics, Drexel University, Philadelphia, PA 19104 USA.
Either approach, though it must be supplemented by detailed argument if it is to provide a rigorous proof, can supply a key insight into the problem. However, the analytic approach is notable in that formal use of a standard machinery can give one a very quick route to the results which one can expect will hold. I have expressed this view elsewhere in words which cannot command universal assent: "One might assert as a rough truth that the probabilistic course is preferred by the purist and the analytic course by the stylist ... the analytic route has the advantage of homing more directly onto the goal, even if it is the probabilistic route which ultimately provides both rigor and insight."

So much for the train of thought initiated by Professor Stroock's title. Does his text follow the tracks suggested? Only slightly. Professor Stroock also pays homage to Kac, but has his own ideas of direction and style. Briefly, the book separates into two parts. In the first, Chapters 1-4, he works simply with the concept of independent random variables and exploits this for all it is worth. The role of analysis in this part seems largely to supply rigour, rather than novelty, to the arguments. Not until the second part, Chapters 5-8, does he introduce the concept of conditioning. This leads quickly to the study of martingales and a demonstration of the relationships--significant, but evident only to a powerful mind--which these enjoy with some of the truly major themes of classical analysis.

Chapter 1 (Sums of independent random variables: Independence; the weak law of large numbers; Cramér's theory of large deviations; the strong law of large numbers; the law of the iterated logarithm). This is necessarily fairly standard material. The concept of probability (as a measure on sets) is adopted immediately, together with that of independence. Professor Stroock produces a rabbit out of a hat with deduction of the Kolmogorov 0-1 law on p. 2. The expectation concept is smuggled in--defined in a two-line footnote on p. 3 and not mentioned in the index--despite the fact that it is put to work immediately and continually. The proofs of the laws of large numbers are standard--the weak law by the Chebychev inequality plus truncation, the strong law by Kolmogorov's inequality. However, interesting applications are given immediately, e.g., the approximation of functions by Bernstein polynomials.
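For readers who have not met the application just mentioned, here is a minimal sketch of it; the function names and the test function are illustrative assumptions and are not taken from the book under review. The Bernstein polynomial of degree n of a function f on [0, 1] is the expectation of f(S/n), where S counts the successes in n independent trials with success probability x, and the weak law of large numbers is what drives it towards f(x).

```python
import math

# Sketch: Bernstein polynomial of f, written as the expectation E[f(S / n)]
# with S distributed as Binomial(n, x). Names are illustrative assumptions.

def bernstein(f, n, x):
    """Evaluate the degree-n Bernstein polynomial of f at x in [0, 1]."""
    return sum(
        f(k / n) * math.comb(n, k) * x ** k * (1.0 - x) ** (n - k)
        for k in range(n + 1)
    )


def max_error(f, n, grid=101):
    """Largest |B_n f(x) - f(x)| over an evenly spaced grid on [0, 1]."""
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in points)


def kink(t):
    """A continuous but non-smooth test function."""
    return abs(t - 0.5)


if __name__ == "__main__":
    for n in (20, 200):
        print(n, max_error(kink, n))
```

The approximation improves as n grows, though slowly near the kink, where the spread of S/n around x is felt most.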
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York
The treatment of Cramér's theorem provides the only mention of large deviations in the book, despite the fact that this would have been a natural theme (for reasons indicated above) and one which the author is impressively qualified to develop. One notices idiosyncratic notation: √−1 (rather than the accepted i) for the square root of minus one and, later, T (rather than the accepted φ) for the normal density. As in some botanical gardens, common objects often do not bear their common names, and interpretation is sketchy. Professor Stroock assumes not merely an analytic competence but also an analytic motivation, and so gives some of us an opportunity to learn a lot. Every section has a considerable collection of exercises--an appropriate term with its connotations of muscularity. Professor Stroock has an extensive private collection of pet ideas, techniques, associations and applications, and the exercises give him a fine chance to run through these.

Chapter 2 (The central limit theorem). The theorems of Lindeberg and Berry-Esseen are proved by very elegant and ingenious arguments; essentially a weak convergence proof, appealing (for the Berry-Esseen sharpening) to Bolthausen's version of Stein's method. Fourier ideas and the characteristic function are invoked first in the following section ("extensions"), which treats the multivariate case and other characterizations of the normal distribution (e.g., invariance under convolution for finite-variance distributions, the isotropy/independence characterizations of statistical mechanics). The chapter concludes with what the author admits to being a non-probabilistic diversion, but one that he cannot resist: Hermite multipliers. In fact, the reader is less likely to object to the diversion as such than to the fact that its motivation and point remain obscure, obstinately analytic.
Hermite polynomials make an appearance, and operators are introduced which are plainly the creation and annihilation operators of quantum mechanics, but these names are not used, although the author references Nelson's construction of a two-dimensional quantum field. Enthusiasm is a fine thing, but the section seems to pursue a formalism for its own sake, with no explanation of the context which would make it meaningful.

Chapter 3 (Convergence of measures, infinite divisibility, processes with independent increments). The introductory section, developing the weak convergence concept, opens in great generality. However, matters begin to clear quickly, and the relatively limpid Theorem 3.1.7 is especially welcome. This is a variation of the Riesz representation theorem, establishing a sufficient condition that a nonnegative linear functional should be representable as an integral with respect to a measure. Standard concepts and results then begin to click easily into place: tightness, the Kolmogorov extension theorem, the Lévy continuity theorem. The following two sections then develop the notion of infinite divisibility, first the non-Gaussian case and then the limit (Gaussian) case, so leading to the Wiener process, Donsker's invariance principle, and the full Lévy-Khintchine characterization of infinitely divisible processes. Some hint is given early in the chapter of the form this representation might plausibly take, which is just as well, because 44 pages of dense technical argument are required to get there.

Chapter 4 (A celebration of Wiener's measure). This chapter is motivated by Professor Stroock's assertion that "Wiener's measure is possibly the single most important object in all of modern probability theory." This is an assertion which invites discussion, at the very least, but one can see its particular validity for an analyst, with the relation of the Wiener process to the ∇² operator, and so to harmonic functions, potentials, and the like. The chapter opens with a compact demonstration of standard properties of the process: scaling-invariance; continuity in a strong sense; unbounded variation. The topic of the second section, "Gaussian aspects," would indeed be straightforward in its finite-dimensional aspects, but Professor Stroock is skirting the idea of white noise: that the exponent in the "probability density" for the Wiener process would involve the time integral of the squared derivative of the path. He resolves familiar difficulties by an abstract formulation and discussion of the characteristic function of a linear functional of the process. Other deductions follow: e.g., the Cameron-Martin Lemma and the time-reversibility of the pinned process. In the final section, Stroock introduces what he terms "Markov aspects" and stopping times, and demonstrates the strong Markov nature of the Wiener process (all this without mention, as yet, of the conditioning concept!). From this follows the reflection principle, the first-exit distribution, and the Feynman-Kac formula.
Actually, to anyone with a background in dynamic programming, and so an easy if presumptuous familiarity with the Kolmogorov backward equation, the Feynman-Kac formula is immediately evident, and for a general Markov process rather than the Wiener process. However, presumption is not tolerated here, rigour is de rigueur, and rigorous deduction of versions of the formula is a heavy matter.

The first part, then, does not display the slickness of the analytic approach, but rather its grinding power in hands as skilled as Professor Stroock's. It is perhaps vain at the present time to resurrect the debate between those with a principal concern for rigour (largely, but not entirely, represented by the mathematicians) and those with a principal concern for insight (largely, but not entirely, represented by the physicists). At the moment fashion sways in favour of the former, although the dynamics of attitude and fashion ensure that this will not last. The cry goes in one direction "You have proved nothing" and in the other "You have proved nothing," and neither group listens. However, one may at least claim a qualitative difference between those results which remain interesting and significant in their naive (finite-dimensional) version and those which do not. The Kac-Feller assertions relating expected recurrence time and equilibrium occupation probability would be examples of the first type, the Cameron-Martin formula (giving the effect on a Gaussian density of a displacement of the mean) an example of the second (which is not to deny the content of the infinite-dimensional version). To these one must of course add concepts which are intrinsically infinite-dimensional in nature, of which the Wiener process is a clear example (being the only homogeneous process of independent increments which has continuous paths).

Chapter 5 (Conditioning and martingales). Conditioning is introduced first in the naive characterisation and then by the Kolmogorov characterisation of a conditional expectation. (This latter, I cannot resist saying, illustrates the natural primacy of the expectation concept.) It is introduced for a definite purpose: to set up the concept of a martingale (in fact, a discrete-parameter martingale). That this is now seen as the natural progression of ideas is a tribute to Doob's pioneering insight. Martingale convergence is demonstrated, largely following Doob's treatment, although with a recognition of the advantages of seeing a martingale as a sequence of projections, in the case when second moments exist. After his habitual meaty selection of exercises, Professor Stroock links the martingale concept immediately with some of the material of classical analysis. Explicitly, he discusses the Hardy-Littlewood maximal function Mf(x)--the maximal value of the average of |f| over cubes centred on x. By appeal to martingale arguments he then deduces a celebrated inequality of these two authors, the Lebesgue differentiation theorem, and the Calderón-Zygmund decomposition.

However, it is in Chapter 6 (Some applications of martingale theory) that Professor Stroock cuts loose and demonstrates the far reach of the martingale concept.
First he establishes the individual ergodic theorem. Then he moves on to a topic which one would have imagined completely unrelated to martingales: the study of singular integral operators (of which the Hilbert transform, with kernel proportional to (x - ~)-1, is the prime example). The discussion is again technical and does not lend itself to summary, but confirms the author's contention that the topic exemplifies "the kind of delicate cancellation properties which underlie the most challenging applications of martingale theory." Indeed, in the remarkable following section the author relates this work, Burkholder's inequality, and the general question of the Fourier representation of the action of an operator. Chapter 7 (Continuous martingales and elementary diffusion theory). Completion of the discrete-parameter theory to the continuous-parameter case leads straight into a discussion of the properties of Wiener paths, e.g., their recurrence properties and their ability to mimic
any given continuous path arbitrarily well over any finite time interval. "Perturbations of Wiener paths" refers to what one would loosely speak of as a first-order stochastic differential equation driven by additive white noise--what we otherwise know as a diffusion process. The qualitative conclusions are that the drift (deterministic) term in the equation determines the global properties of the path, but that the local properties are very much those of the Wiener path itself. All this is, of course, expressed and analysed in the most rigorous fashion. A later section deals with the case when the drift term is the gradient of a potential, when it is known that, at least formally, the process has an invariant measure whose density is exponential in this potential. Professor Stroock rigorises the conclusion, and also establishes rate-of-convergence results for passage to this equilibrium.

Chapter 8 (A little classical potential theory: The Dirichlet heat kernel: the Dirichlet problem; Poisson's problem and Green's functions; Green's potentials, Riesz decompositions and capacity). This is lovely stuff, which the author plainly handles with the keenest pleasure. It is, of course, again all very much centred on the Wiener process. For example, the Dirichlet problem concerns the solution of Laplace's equation $\nabla^2 u = 0$ in a region $\mathfrak{G}$ given the value of $u$ on the boundary of $\mathfrak{G}$. The 'probabilistic solution' $u(x)$ is the expectation over boundary values under the first-passage distribution to the boundary of a Wiener path originating at $x$. Professor Stroock is concerned with rigorous proof that this is indeed the valid and unique solution. One might assert that Wiener character plays a role here only in that $\nabla^2$ is the infinitesimal generator of the Wiener process and that the path stops on the boundary.
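The 'probabilistic solution' can be illustrated numerically. What follows is only a rough sketch and not anything taken from the book under review: it replaces the Wiener path by a simple symmetric random walk on a square grid (a discrete stand-in, for which the analogous mean-value property holds), and the function names, the grid size, and the boundary data are all illustrative assumptions.

```python
import random

def exit_value(start, size, boundary_value, rng):
    """Run one simple random walk from `start` until it first reaches the
    edge of the grid {0..size} x {0..size}; return the boundary value there."""
    x, y = start
    while 0 < x < size and 0 < y < size:
        dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
        x, y = x + dx, y + dy
    return boundary_value(x, y)

def dirichlet_estimate(start, size, boundary_value, walks=4000, seed=0):
    """Monte Carlo estimate of u(start): the average boundary value seen
    at first passage, over many independent walks."""
    rng = random.Random(seed)
    total = sum(exit_value(start, size, boundary_value, rng) for _ in range(walks))
    return total / walks

# Hypothetical test case: boundary data g(x, y) = x / size is itself
# (discretely) harmonic, so the exact answer at the point (3, 5) is 0.3.
size = 10
g = lambda x, y: x / size
estimate = dirichlet_estimate((3, 5), size, g)
print(estimate)
```

The estimate is only statistical: its error shrinks like the inverse square root of the number of walks, which is one reason the rigorous treatment in the book cannot be replaced by simulation.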
Modulo rigour, the corresponding assertion for a Markov process with infinitesimal generator $A$ is that the solution $u(x)$ of $Au = 0$ in $\mathfrak{G}$ subject to prescription of $u$ outside $\mathfrak{G}$ is just the expectation of terminal $u$-value under the first-passage distribution of the Markov process to the exterior of $\mathfrak{G}$, starting from $x$. However, "modulo rigour" means "nullity" to Professor Stroock, and he asserts nothing that he cannot prove to the hilt. On the other hand, he does invoke Kac's intuitive insights; principally, that the process "does not feel the boundary" for the first few moments of its start from an interior point $x$. The treatment goes on to consider Poisson's equation (the driven form of Laplace's equation) and to develop a probabilistic view of the concepts of potential and capacity. This is again all very much Wiener-centred, presumably for definiteness, since (as Doob, Hunt, and others have demonstrated) these concepts have versions for much more general processes.

The dominating impression conveyed by the text is that a distinguished research worker has written the book that he wanted to write. Such a work cannot be anything other than strong, original, sincere, and thought-provoking. According to the author's own acknowledgements it is a "kinder and gentler" text than the version which Professor Diaconis first saw. In that case we owe a debt of gratitude to Professor Diaconis: the text is more remarkable for muscle than for grace, but has perhaps the more character for that. It is a first-class work of reference, for technique as well as for results and definitive theory; even as extended a review as this does not do justice to its detail and density. It will be a source of stimulation to those mature enough to appreciate its range and the richness of connections it establishes. It could certainly make a great graduate text if one regarded it as concentrating nourishment in the same way as does dried beef, from which one carves off a piece to be chewed and digested at leisure.

Reference
1. Mark Kac. On some connections between probability theory and differential and integral equations. Second Berkeley Symposium. University of California Press (1951), 189-215.

Statistical Laboratory
University of Cambridge
Cambridge CB2 1SB
U.K.

THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996

Billiards, A Genetic Introduction to the Dynamics of Systems with Impacts
by Valerii V. Kozlov and Dmitrii V. Treshchev
Providence: American Mathematical Society Translations of Mathematical Monographs No. 89, 1991. US $157.00, ISBN 0-8218-4550-0

Invariant Manifolds; Entropy and Billiards; Smooth Maps with Singularities
by Anatole Katok and Jean-Marie Strelcyn
New York: Springer-Verlag, 1986. US $42.00, ISBN 0-387-17190-8

Reviewed by Ya. B. Pesin

Generations of people have enjoyed billiards. It is an old game, known in India and China long before the birth of Christ. For example, in Shakespeare's Antony and Cleopatra, the Egyptian Queen liked to play billiards with her maid of honour. And the story goes that French King Charles IX was playing billiards when he heard a prearranged signal for the St. Bartholomew's Day massacre--the ringing of the bells of St. Germain Cathedral. Nowadays the game is even more popular. However, I should warn those readers seeking advice on winning that the books being reviewed will give you none. They are concerned entirely with the mathematics and mechanics of billiards and the great influence this game has exerted on mathematics and physics.

Billiards is the first impact theory in physics. In mathematics, billiards offers a look at new and complicated geometries. An inexperienced reader might be tempted to exclaim, "Look, the billiard table is merely a rectangle. Although I was not the top student in my school geometry class I can easily solve all problems dealing with rectangles." But the analysis of billiards offers us a chance to witness the ability of the human intellect to question and to extend. Starting from rectangle billiards the inquisitive mind goes on to inquire, "What are triangle billiards like?" and "How about polygons or circles and ellipses?" "Can someone play multi-dimensional billiards or billiards where balls move along geodesic lines of some Riemannian metric?"

Even in ordinary billiards the geometry involved turns out to be not so simple as it may seem at first. In order to describe the position of a billiard ball, we need to know not only its position on the billiard table (given by two coordinates $x, y$) but also the direction of its motion (defined by a unit vector $v$). The triplet $(x, y, v)$ forms the "real coordinates" of the billiard ball. Thus its "real motion" is displayed in a 3-dimensional space of "coordinates and velocities" called the phase space of the billiards. The geometry of this space is much more complicated than the geometry of a plane rectangle. The main problem is to describe different types of billiard trajectories in the phase space. For example, one can ask, "Is there a closed trajectory?" If the answer is positive we can ask, "How many of them can we count?" and then, "Are they stable?" (this means that any trajectory starting near this periodic one tends to it with time). The last question can be crucial for any billiards player: if the periodic trajectory does not hit any billiard pockets and the player launches the ball in a direction close enough to the trajectory, then the ball will not fall into those billiard pockets. Perhaps expert players have to develop intuitive methods for finding stable trajectories. One might suspect that a good billiards player knows all the stable periodic trajectories but keeps them secret from mathematicians!

The analysis of billiards can be done in the framework of the theory of dynamical systems. The analysis is based on classical geometry but also uses extensively results in number theory, topology, ergodic theory, and theoretical mechanics. Many of the methods used in these books, though quite elementary, allow very nontrivial conclusions.

Both of these books are monographs on the theory of billiards. However, right away I would like to emphasize the important difference between them. The first book can be considered an introduction to the theory of billiards. It is intended for an undergraduate knowing calculus and algebra. The second book is for readers familiar with the basic notions of dynamical systems and ergodic theory.

The main idea of the first book is expressed by the authors as follows: "The authors strive to clarify the genesis of the basic ideas and concepts of the theory of dynamical systems with impact interactions and also to demonstrate that they are natural and effective." The reader's attention is focused on the mechanics of billiards and the study of its stable trajectories. The authors show how to derive the mathematical laws of billiards from the well-known physical principles of impact theory. They point out that "An impact is a short-time interaction of bodies" such that "the positions of bodies do not change at the moment of impact, while their velocities acquire finite increments. Thus, a central feature of impact theory is finding the dependence between the velocities before and after impact." Classical billiards corresponds to absolutely elastic collisions; the trajectories in it obey the variational principles of Hamilton and Maupertuis. The main law is the law of reflection, which is expressed in the well-known maxim "the angle of incidence equals the angle of reflection." A detailed exposition of this theory can be found in the first chapter of the book.

The next chapter is devoted to the problem of the number of geometrically distinct closed trajectories. The first result in this direction is due to Birkhoff: there exist at least two such trajectories if the billiard table is given by a smooth, closed, convex plane curve having nonzero curvature at every point (such billiards are called Birkhoff's billiards). The proof introduces the reader to interesting geometrical ideas and constructions in billiard theory. The authors also formulate some geometrical conditions for stability of Birkhoff closed trajectories.

Chapter 5, devoted to integrable billiard systems, plays a special role. Completely integrable systems in mechanics are the simplest ones: they can be completely solved. A natural problem is to find all completely integrable billiards. An example has been known for a long time: elliptical billiards.
A reader will find a quite elementary geometrical description of this case as well as some others, and the authors are careful to convey the contemporary status of the problem. It has been conjectured that only billiards on elliptical tables are completely integrable. The authors deal with this problem in the last chapter of the book. The study of nonintegrable billiards is quite different and utilizes ideas and methods from the theory of dynamical systems and ergodic theory. Mathematicians have accomplished a great deal and the end seems to be somewhere only a little beyond the horizon. For the general theory of nonintegrable billiards one must read the second, more advanced book. It develops the ergodic theory of smooth maps with singularities having nonzero characteristic Lyapunov exponents with respect to a smooth invariant measure given by the Riemannian metric of the phase space. This sentence can serve as a test. If you are familiar with every notion in it, then you are ready to start reading the book. You will find 1) smooth nonuniformly-hyperbolic theory in its
modern guise; 2) one of the most general versions of the theory of local invariant manifolds; and 3) the complete description of ergodic properties of the systems specified in the above "test" sentence. The book contains much more: for example, formulas for entropy, and information about a number of periodic trajectories. The authors' methods work like long-range guns: they cover a large area where many classes of billiards are located. The most interesting among them are the dispersed billiards or Sinai billiards, introduced and studied by Ya. Sinai in 1970. Sinai's methods were "more geometrical" while the present authors propose a "more dynamical" approach. Throughout this book the predominance of dynamics over geometry allows greater generality. Both books are written by well-known mathematicians who have contributed a great deal to the field. The exposition in the books is very thorough, and in the second book highly demanding. The reader is rewarded with carefully formulated statements and detailed proofs.
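The law of reflection and the notion of a closed trajectory, both discussed in this review, can be made concrete for the simplest table, a rectangle. The sketch below is an illustration only and is not taken from either book; the helper names `reflect` and `next_bounce`, the unit-square table, and the starting data are assumptions chosen so that the arithmetic is exact.

```python
def reflect(velocity, normal):
    """Elastic reflection off a wall with unit inward normal `normal`:
    v' = v - 2 (v . n) n. The tangential component is kept and the normal
    component is reversed, so the angle of incidence equals the angle of
    reflection."""
    vx, vy = velocity
    nx, ny = normal
    dot = vx * nx + vy * ny
    return (vx - 2 * dot * nx, vy - 2 * dot * ny)

def next_bounce(position, velocity, width, height):
    """Advance a ball in the rectangle [0, width] x [0, height] to the next
    wall it hits; return the impact point and the reflected velocity."""
    x, y = position
    vx, vy = velocity
    candidates = []  # (time to reach wall, inward unit normal of that wall)
    if vx > 0:
        candidates.append(((width - x) / vx, (-1.0, 0.0)))
    if vx < 0:
        candidates.append((-x / vx, (1.0, 0.0)))
    if vy > 0:
        candidates.append(((height - y) / vy, (0.0, -1.0)))
    if vy < 0:
        candidates.append((-y / vy, (0.0, 1.0)))
    t, normal = min(candidates)
    return (x + t * vx, y + t * vy), reflect(velocity, normal)

# A point (x, y, v) of the phase space: start at the midpoint of the bottom
# edge of the unit square, moving at 45 degrees.
state = ((0.5, 0.0), (1.0, 1.0))
for _ in range(4):
    state = next_bounce(*state, 1.0, 1.0)
print(state)
```

After four bounces the ball returns to its initial position and direction, so this diamond-shaped path is a closed trajectory of the square billiard.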
Department of Mathematics
Pennsylvania State University
University Park, PA 16802
USA
e-mail:
[email protected]
Genetic Algorithms + Data Structures = Evolution Programs
by Zbigniew Michalewicz
Second Extended Edition, New York: Springer-Verlag, 1994. xvi + 340 pp. US $39.00, ISBN 3-540-58090-5
Reviewed by Stephen J. Hartley

Some People Receive Too Many Books
How I came to review this book for the Mathematical Intelligencer is amusing. I needed a text for an artificial intelligence special topics course in genetic algorithms that I was going to teach in the spring. I called several publishers for examination copies of various books, including Springer-Verlag. In a few weeks, all books arrived except Michalewicz's. After waiting another month, I called Springer-Verlag to see what had happened. They called UPS to verify that the book had arrived at Drexel. I sent electronic mail to all department members to determine whether someone had picked it up accidentally. Jet Wimp, Review Editor for this journal, gets many books from Springer-Verlag and realized he had taken it. When he returned the book to me, he asked me to review it. The focus of this review is to compare Michalewicz's book with the one I chose for the genetic algorithms course, Goldberg's popular text [3]. Will I switch books if I teach the course again? First, I'll give a brief introduction to genetic algorithms.
Genetic Algorithms are Inorganic
Genetic algorithms are used for search and optimization, for finding the maximum or minimum of a function. Instead of a deterministic search, as in hill-climbing or gradient methods, genetic algorithms use randomization. The members of the search space--for example, integers or real numbers in some domain--are encoded as bit strings. An initial population of bit strings is generated at random. Each bit string is called a chromosome. Each member of the initial population is evaluated for its fitness in solving the problem (maximizing or minimizing the function). A new population of candidate solutions to the problem is generated using three genetic operators: reproduction, crossover, and mutation. These are modeled on their biological counterparts. With probabilities proportional to their fitness, members of the population are chosen or selected for a new population. Pairs of chromosomes in the new population are chosen at random to exchange genetic material (bits) in a mating operation called crossover, resulting in two offspring. Bits are flipped at random, a procedure called mutation. The new population that is generated with these operators replaces the old population. The algorithm has performed one generation and then repeats for some specified number of additional generations. The population evolves, containing more and more highly fit chromosomes. When the convergence criterion is reached, such as no further increase in the average fitness of the population, the best chromosome is decoded into the solution (maximum or minimum) produced by the genetic algorithm for the problem.

Genetic algorithms have been very successful at solving many types of problems, such as maximizing discontinuous, multimodal, multidimensional functions. They have also been used on discrete problems, including such combinatorial optimization tasks as the traveling salesperson, bin-packing, and job-shop scheduling.
Two excellent introductory articles are [1,2].
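The generation loop just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SGA program from Goldberg's text or any code from Michalewicz's book: the objective f(x) = x^2 on 5-bit integers, the parameter values, and all function names are chosen here purely for the example, and a fixed number of generations stands in for a convergence criterion.

```python
import random

def fitness(bits):
    """Decode a chromosome (list of 0/1) as an unsigned integer x; return x^2."""
    x = int("".join(map(str, bits)), 2)
    return x * x

def select(population, rng):
    """Reproduction: choose members with probability proportional to fitness."""
    weights = [fitness(c) + 1e-9 for c in population]  # guard against all-zero weights
    chosen = rng.choices(population, weights=weights, k=len(population))
    return [list(c) for c in chosen]

def crossover(a, b, rng):
    """Single-point crossover: swap tails at a random cut, giving two offspring."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]

def run_ga(n_bits=5, pop_size=20, generations=30, p_cross=0.7, p_mut=0.01, seed=1):
    """Run the loop; pop_size is assumed even so that parents pair off."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        parents = select(population, rng)
        rng.shuffle(parents)  # random mating pairs
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < p_cross:
                a, b = crossover(a, b, rng)
            offspring += [mutate(a, p_mut, rng), mutate(b, p_mut, rng)]
        population = offspring  # the new population replaces the old one
        best = max(population + [best], key=fitness)
    return best, population

best, final_population = run_ga()
print(best, fitness(best))
```

The best chromosome seen so far is tracked separately because, with purely generational replacement, a highly fit individual can be lost to crossover or mutation.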
I'd Rather Fight than Switch
I chose Goldberg's book as the text because it contains introductory material in addition to advanced topics, has problems and programming assignments at the ends of the chapters, and describes module-by-module a program called SGA implementing a simple genetic algorithm. Michalewicz's book also contains introductory material in Part I. However, the book seems to be written for those who already have some familiarity with genetic algorithms. Advanced operators are mentioned (PMX, OX, CX on page 86) but not defined until much later (page 218). The Banach fixed-point theorem
is used to explain the convergence of genetic algorithms (pages 66-67). Part I could be used as the introductory material for a genetic algorithms class if the instructor provides exercises and problem sets. Part II is on numerical optimization and assumes a sophisticated mathematical background, for example in dynamic control (page 98) and nonlinear optimization (page 158). There are many minor, readily identifiable typographical and editing errors of the kind common to author-supplied camera-ready copy. Michalewicz's book is an excellent resource on genetic algorithms for the specialist. It concentrates on incorporating linear and nonlinear problem constraints into genetic algorithms. However, I would hesitate to use it instead of Goldberg as the text for an introductory class in genetic algorithms because of Michalewicz's mathematical sophistication and lack of exercises.
References
[1] David Beasley, David R. Bull, and Ralph R. Martin, An overview of genetic algorithms: part 1, fundamentals, University Computing 15, 2 (1993), 58-69.
[2] David Beasley, David R. Bull, and Ralph R. Martin, An overview of genetic algorithms: part 2, research topics, University Computing 15, 4 (1993), 170-181.
[3] David E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Reading, Mass: Addison-Wesley (1989).
Department of Mathematics and Computer Science
Drexel University
Philadelphia, PA 19104
USA
e-mail:
[email protected] website: http://www.mcs.drexel.edu/-shartley
Polynomials and Polynomial Inequalities
by Peter Borwein and Tamás Erdélyi
Graduate Texts in Mathematics Vol. 161
New York: Springer-Verlag, 1995. x + 480 pp. US $59.00, ISBN 0-387-94509-1
Reviewed by Jet Wimp

Several years ago I reviewed John and Peter Borwein's book, Pi and the AGM (Wiley-Interscience, 1987), and my praise of that book was unstinting. I find this book equally praiseworthy, and to dispel any suspicion that the authors and myself are in league, I will state for the record that I haven't seen the Borwein brothers for at least 10 years, and I've never met Tamás Erdélyi. Partly, it's a case of confluent sensibilities--the things that interest these authors interest me. Another factor is that all these authors write exceedingly well. Their prose is deft and uncluttered, and they organize their material in a way that could serve as a model to up-and-coming mathematical writers. Reading the present book or the "Pi" book, I was constantly energized, divided between
my desire to toss it aside and lavish my own research on the subject at hand and the desire to stay on the speeding train, wondering what on earth was going to appear around the next bend. Polynomials and Polynomial Inequalities is one of the best mathematical books in years.

Polynomials are the workhorses of analysis. A question that recurs on the Ph.D. written exams at Drexel is the following: Let $\phi \in L^1[a, b]$ and

$$\int_a^b t^n \phi(t)\, dt = 0, \qquad n = 0, 1, 2, \ldots.$$

Show that $\phi = 0$ a.e.

If the examinee had me as instructor for the three-term real variable sequence, he or she knew exactly how to proceed. First, approximate $\phi$ by a continuous function and, then, using the Weierstrass approximation theorem, approximate that function by a polynomial. In my graduate courses I always emphasize the utility of polynomials in analysis. It's not that crucial results in approximation theory or interpolation theory can't be obtained any other way--they usually can. It's just that polynomials are often the slickest way to prove things. The books of Davis, Interpolation and Approximation [dav], and Achieser, Theory of Approximation [ach], contain many examples.

This book is about equally divided between the traditional literature on polynomials in a single complex variable and more recent research, due to the authors, on Müntz generalized polynomials. I found the original material especially enjoyable. Let me introduce some notation. $D$ and $\overline{D}$ will denote the open unit disk and its closure, respectively.

$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0, \tag{1}$$

$$q(z) = a_n z^{\delta_n} + a_{n-1} z^{\delta_{n-1}} + \cdots + a_0 z^{\delta_0}, \qquad \delta_0 < \delta_1 < \cdots < \delta_n. \tag{2}$$

$q$ is called a Müntz polynomial. Many features of ordinary polynomials have Müntz polynomial analogs. The supremum norm of a function on a complex set $A$ is

$$\|f\|_A = \sup_{z \in A} |f(z)|.$$
It's quite remarkable that, given the work record of polynomials, few general accounts of their properties have appeared in book form. The only contender, E. J. Barbeau's similarly excellent Polynomials (1989),¹ harbors a different set of concerns. It is pitched at the undergraduate level. Although it too has an exhaustive and entertaining set of exercises, there is little overlap with the present book.

¹ Reviewed in Mathematical Intelligencer, Vol. 16, no. 2, pp. 78-79.
Of the remaining books on polynomials, some treat specialized polynomial sets--Stirling or Eulerian polynomials, for instance--and are little more than pamphlets. Some treat orthogonal polynomials, or the role of polynomials in approximation theory. In numerical analysis, estimating the location of the roots of polynomials is imperative for analyzing the growth of error in many algorithms and, in engineering problems, making decisions about the stability of dynamical processes. Yet, the only comprehensive book treatment is Marden's 1949 The Geometry of the Zeros of a Polynomial in a Complex Variable. My copy is so tattered and the gold lettering on the spine so effaced that I sometimes can't find it on my bookshelf. Much material in Marden is present here, including one of my favorites, the extraordinary theorem of Eneström-Kakeya:
Let all $a_k > 0$ in (1). Then all the zeros of $p$ lie in the annulus

$$r_1 := \min_k \frac{a_k}{a_{k+1}} \le |z| \le r_2 := \max_k \frac{a_k}{a_{k+1}}.$$

Before I get into details, let me indicate the plan of the book. There are seven chapters. Each chapter starts with a one-paragraph overview, a greatly effective organizing principle. Chapters are divided into sections; each section closes with a subsection on comments, exercises, examples, and historical observations. I appreciated the firm sense of historical grounding the Borweins displayed in "Pi," and the same disposition is at work here. Books that lack contact with this side of their subject always seem to me superficial, evanescent: we need to know where we've been to know where we're going, and we can convey the richness and vitality of mathematics only by attending to its social and historical dimensions. The text is not cluttered with proofs; usually these are left to the exercises (sometimes with appropriate hints), so that occasionally the book assumes the flavor of an encyclopedia, which is not at all a bad thing. It's what makes the book such compulsive reading. The authors organize and present the material in a way that emphasizes its usefulness. It's very much a working mathematician's book.

Now about the content of the individual chapters. Some of the results I knew of; some were surprising, others were more than surprising--they were astonishing. As we go along I'll display a few that caught my eye, with almost no commentary and, of course, no proofs. I just want to give an idea of the terrain of the book. We have all found things in mathematics to make us marvel. Often our reaction was one of wonderment: where do people get such ideas? There are many marvels in this book--pretty, or useful, or both. I'll concentrate on the pretty.
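The Eneström-Kakeya theorem quoted above is easy to check numerically on a small example. The sketch below is an illustration, not material from the book: it uses the Durand-Kerner iteration as a convenient pure-Python root finder (an assumption of this example, chosen only because it needs no external library), and the test polynomial $1 + 2z + 3z^2 + 4z^3$ and all function names are hypothetical.

```python
def poly_eval(coeffs, z):
    """Evaluate a_0 + a_1 z + ... + a_n z^n by Horner's rule.
    `coeffs` lists the coefficients in ascending order [a_0, ..., a_n]."""
    result = 0
    for a in reversed(coeffs):
        result = result * z + a
    return result

def durand_kerner(coeffs, iterations=200):
    """Approximate all complex zeros simultaneously (Durand-Kerner iteration).
    Assumes the zeros are distinct and a_n is nonzero."""
    n = len(coeffs) - 1
    monic = [a / coeffs[-1] for a in coeffs]
    roots = [(0.4 + 0.9j) ** k for k in range(n)]  # customary starting points
    for _ in range(iterations):
        for i in range(n):
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= poly_eval(monic, roots[i]) / denom
    return roots

def enestrom_kakeya_bounds(coeffs):
    """Return (r1, r2) = (min, max) of a_k / a_{k+1} for positive coefficients."""
    ratios = [coeffs[k] / coeffs[k + 1] for k in range(len(coeffs) - 1)]
    return min(ratios), max(ratios)

coeffs = [1.0, 2.0, 3.0, 4.0]  # p(z) = 1 + 2z + 3z^2 + 4z^3
r1, r2 = enestrom_kakeya_bounds(coeffs)
zeros = durand_kerner(coeffs)
for z in zeros:
    print(abs(z), r1 <= abs(z) <= r2)
```

For this polynomial the ratios are 1/2, 2/3, and 3/4, so the theorem places every zero in the annulus between radius 0.5 and radius 0.75.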
Chapter 1, the introduction, contains everything we need to know about polynomials to handle later chapters: the fundamental theorem of algebra; explicit solutions for quadratic, cubic, and quartic equations (You all knew these formulas existed. Did any of you know
where to find them?); Newton's identities; norms; partial fractions; theorems on the location of zeros and critical points of polynomials; basic theorems from complex analysis. (I would like to have seen the Schur criterion, a recursive algorithm for deciding whether the zeros lie in the interior of the unit disk. It's a valuable tool for analyzing the stability of numerical integration schemes.¹) The preparation in this chapter for what follows is painstaking. There are no nebulous concepts--we always have what we need to understand what we're reading. One of the first results in this chapter to strike my fancy was the following, a consequence of a general theorem of Szegő: If $\sum_{k=0}^{n} a_k z^k$ has all its zeros in $\overline{D}$, so does

$$\sum_{k=0}^{n} \frac{a_k}{\binom{n}{k}} z^k.$$
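The Schur criterion mentioned in this paragraph lends itself to a short recursive test. The following is a sketch under stated assumptions rather than a production routine: for $p(z) = a_n z^n + \cdots + a_0$ it forms $\hat{p}(z) = a_n^* + a_{n-1}^* z + \cdots + a_0^* z^n$ and $p_1(z) = (1/z)[\hat{p}(0)p(z) - \hat{p}(z)p(0)]$, accepts only if $|\hat{p}(0)| > |p(0)|$, and then repeats on $p_1$. The function name `is_schur` and the ascending coefficient order are choices made for this example, and no attempt is made to guard against floating-point round-off in borderline cases.

```python
def is_schur(coeffs):
    """Return True if every zero of p(z) = a_0 + a_1 z + ... + a_n z^n lies in
    the open unit disk. `coeffs` is [a_0, ..., a_n] with a_n nonzero."""
    coeffs = [complex(a) for a in coeffs]
    while len(coeffs) > 1:
        n = len(coeffs) - 1
        a0, an = coeffs[0], coeffs[-1]
        # p_hat(0) is the conjugate of a_n and p(0) is a_0, so condition (i)
        # reads |a_n| > |a_0|.
        if not abs(an) > abs(a0):
            return False
        # Coefficient k of p_hat(0) p(z) - p(0) p_hat(z) is
        # conj(a_n) a_k - a_0 conj(a_{n-k}); the constant term vanishes, so
        # dividing by z just drops it and lowers the degree by one.
        coeffs = [an.conjugate() * coeffs[k] - a0 * coeffs[n - k].conjugate()
                  for k in range(1, n + 1)]
    return True  # a nonzero constant has no zeros at all

print(is_schur([-0.125, -0.25, 1]))          # (z - 0.5)(z + 0.25)
print(is_schur([1, -2.5, 1]))                # (z - 2)(z - 0.5)
print(is_schur([0.015, -0.01, -1.5, 1]))     # (z - 1.5)(z^2 - 0.01)
```

Each pass costs only a handful of multiplications per coefficient, which is what makes the test attractive for stability analysis, where one wants a yes-or-no answer without computing the zeros themselves.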
¹ Call a polynomial Schur if all its roots are in $D$. Define $\hat{p}(z) = a_n^* + a_{n-1}^* z + \cdots + a_0^* z^n$, the $*$ indicating complex conjugation, and $p_1(z) = (1/z)[\hat{p}(0)p(z) - \hat{p}(z)p(0)]$. Then $p(z)$ is Schur if and only if (i) $|\hat{p}(0)| > |p(0)|$ and (ii) $p_1(z)$ is Schur.

Chapter 2, Some special polynomials, begins with a discussion of Chebyshev polynomials $T_n(x)$ and $U_n(x)$. They occur in the most unexpected contexts, including the problem of functional iteration: Define $p^{[k]} = p(p^{[k-1]})$, $p^{[1]} = p$, see (1), and suppose the closure of $\{z \in \mathbb{C} : p^{[k]}(z) = 0 \text{ for some } k = 1, 2, \ldots\}$ is the interval $[-1, 1]$. Then $p(x) = \pm T_n(x)$. In the subsection on transfinite diameter, the important notion of the Fekete polynomial and logarithmic capacity for a complex set appear. Next, the authors discuss orthogonal polynomials on the real line, polynomials orthogonal with respect to an arbitrary measure, the Gram-Schmidt process, and best approximation. The section on $L_p$ spaces is a model of brevity. Section 2.3 is devoted to the classical orthogonal polynomials and their properties. There follows an exercise set on the moment problem. The celebrated theorem of Favard is relegated to an exercise! The chapter closes with a section on polynomials with non-negative coefficients.

Chapter 3 explores the vital role polynomials play in approximation theory: Chebyshev and Descartes systems, rational systems, Müntz polynomials. The authors define a very clever Müntz analog of the Legendre polynomials:

$$L_n(x) = \frac{1}{2\pi i} \int_\Gamma \prod_{k=0}^{n-1} \frac{t + \lambda_k^* + 1}{t - \lambda_k} \, \frac{x^t}{t - \lambda_n}\, dt, \qquad n = 0, 1, 2, \ldots,$$

where $\Gamma$ encloses all the poles of the integrand. $L_n$ is a polynomial of the kind (2), but with the powers unrestricted complex numbers, as we find by using residue calculus,

$$L_n(x) = \sum_{j=0}^{n} A_{n,j}\, x^{\lambda_j}, \qquad A_{n,j} = \frac{\prod_{k=0}^{n-1} (\lambda_j + \lambda_k^* + 1)}{\prod_{k=0,\, k \ne j}^{n} (\lambda_j - \lambda_k)}.$$
(The above formula requires that the $\lambda_k$ be distinct, but the case where they aren't is easily handled.) Quite surprisingly, these polynomials, like the Legendre polynomials--the case $\lambda_j = j$--are an orthogonal set. I found the following result highly dramatic when I encountered it in one of the authors' previous publications. I was happy to see it here: Let $\{\lambda_k\}$ be a complex sequence, $\operatorname{Re}(\lambda_k) > -\frac{1}{2}$. Then

$$\int_0^1 L_n(x) L_m^*(x)\, dx = \frac{\delta_{n,m}}{1 + \lambda_n + \lambda_n^*}.$$

It is well known that all the zeros of the Legendre polynomials lie in the open interval $(-1, 1)$. Where are the zeros of the Müntz-Legendre polynomials? I won't spoil the fun for the reader. Read the book and find out.

In Chapter 4 we find a discussion of denseness properties of various approximation families, including the powers (essentially, Weierstrass's approximation theorem) and the generalized powers, $\{x^{\lambda_k}\}$ (essentially the classical result of Müntz). As I was browsing, my eye jumped to the following curious item:

$$\min_{a_i \in \mathbb{C}} \left\| 1 + \sum_{i=1}^{n} \frac{a_i}{z - \alpha_i} \right\|_{D} = \prod_{i=1}^{n} |\alpha_i|^{-1}.$$
Bernstein inequalities and the Paley-Wiener theorem finish the chapter. Chapter 5, Basic inequalities. Many polynomial inequalities are scattered throughout the mathematical literature, and the authors have performed a service to the mathematical community by gathering them together: the Remez, Bernstein, Markov, Schur inequalities among them. One of the most inscrutable is the Remez inequality:
$$\|p\|_{[-1,1]} \le T_n\!\left(\frac{2 + s}{2 - s}\right)$$

holds for every real polynomial $p$ of degree at most $n$ and $s \in (0, 2)$ satisfying $m(\{x \in [-1, 1] : |p(x)| \le 1\}) \ge 2 - s$. ($T_n$ is the Chebyshev polynomial and $m$ is the Lebesgue measure.) It has been said that great mathematics is always surprising. According to this criterion, the Remez inequality, coming out of the blue, surely qualifies for greatness. What could conceivably lead anyone to suspect that this result was true?
Among all this chapter's gorgeous results, I was particularly attracted to Chebyshev's inequality:

$$|p_n(y)| \le |T_n(y)| \cdot \|p_n\|_{[-1,1]}, \qquad y \in \mathbb{R} \setminus [-1, 1],$$

for $p_n$ a real polynomial of degree at most $n$, with equality if and only if $p_n = cT_n$. The authors generalize a bit in Chapter 6 by mentioning some inequalities for entire functions of exponential type. The chapter ends with weighted inequalities and inequalities for norms of factors. Many of these findings can be extended to Müntz spaces, where integer powers of the variable are replaced by general powers, and this is done in Chapter 6. Much of this material is due to the authors. In Chapter 7 the authors take on inequalities for rational function spaces. I was struck by an inequality on logarithmic derivatives:
Let $p$ be a real polynomial of degree $n$. Then, for every $\alpha > 0$,

$$m\!\left(\left\{x \in \mathbb{R} : \frac{p'(x)}{p(x)} \ge \alpha\right\}\right) \le \frac{2n}{\alpha}.$$

There are five appendices. None of these contains material necessary for topics in the text. Instead, they are mini-essays, each devoted to a subject relevant to polynomials. The Borwein brothers in the book "Pi" expressed a steadfast concern about computability and algorithmic construction. The first appendix in the present book treats computability as the idea relates to polynomials: the fast Fourier transform, fast algebraic operations on polynomials, methods for localizing zeros. I was disappointed that the authors mention neither the Jenkins-Traub method nor the Lehmer-Schur search algorithm. These are the methods of choice for computing the complex zeros of polynomials. The Lehmer-Schur algorithm is a generalization of the bisection method to disks in the complex plane, and it always gets all the zeros and always converges linearly. The discussions of root finding both here and in Barbeau's book [bar] are unsatisfactory. In particular, Newton's method and its kin, touted for their elegance and their easy extension to operator equations, have no value as global methods for finding the zeros of complex polynomials. The reader who is concerned about effective methods for root finding should consult the recent survey article by McNamee [mcn] and the references given in [pre].

The other appendices treat orthogonality and rationality, interpolation, inequalities for $L_p$ polynomials, and constrained inequalities. There is an 18-page bibliography and a very helpful index of notation. The production values of the book are exceptional, just what we always expect from the Springer-Verlag tradition. The book can be used as a reference work or as a basis for a very imaginative graduate or even advanced undergraduate course.
This superb book fills a need that was unaddressed for far too long. I predict it will become a classic.

References
[ach] N. I. Achieser, Theory of Approximation, Ungar Publishing, New York, 1956.
[bar] E. J. Barbeau, Polynomials, Springer-Verlag, New York, 1986.
[dav] P. J. Davis, Interpolation and Approximation, Blaisdell Publishing, Waltham, MA, 1963.
[mar] M. Marden, The Geometry of the Zeros of a Polynomial in a Complex Variable, American Mathematical Society, Providence, RI, 1949.
[mcn] J. M. McNamee, A bibliography on roots of polynomials, Journal of Computational and Applied Mathematics 47, 391-394 + floppy disk (1993).
[pre] W. H. Press, W. T. Vetterling, S. A. Teukolsky, and B. P. Flannery, Numerical Recipes in C: the Art of Scientific Computing, 2nd ed., Cambridge University Press, Cambridge, England, 1992.
Department of Mathematics and Computer Science, Drexel University, Philadelphia, PA 19104, USA
e-mail: [email protected]
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3, 1996
Robin Wilson*

Irish Mathematics

Raymond Flood and Robin Wilson

Irish mathematicians have featured on a number of stamps. We illustrate four of these here, covering three centuries.

George Berkeley (1685-1753) was a philosopher and clergyman who became Bishop of Cloyne in 1734. A vehement and highly competent critic of many aspects of Newtonian science, he sought to show that Isaac Newton's universe was constructed upon shaky foundations. In his 1734 book The Analyst, or a Discourse addressed to an infidel mathematician (generally thought to refer to Edmond Halley), he unleashed a devastating attack on the calculus of Newton and Leibniz. In particular, he argued cogently that Newton's method of fluxions was logically unsound, referring to derivatives as "ghosts of departed quantities." The Irish stamp below was issued in 1985 to commemorate the 300th anniversary of his birth.

Sir William Rowan Hamilton (1805-1865) was a child prodigy who mastered several languages (modern, classical and oriental) by the age of 14. While still a teenager he discovered an error in Laplace's Traité de Mécanique Céleste, and was appointed Astronomer Royal of Ireland while an undergraduate at Trinity College, Dublin. He carried out important theoretical work in geometrical optics and dynamics, and several concepts and results are named after him, such as the Hamiltonian function, Hamilton's principle, and the Hamilton-Jacobi equation. He also revolutionized algebra by his investigations into non-commutative systems. The two stamps below were issued in 1943 and 1983 to commemorate Hamilton's discovery of quaternions in 1843. We recall Dimitric and Goldsmith's "Mathematical Tourist" article, Mathematical Intelligencer vol. 11, no. 2, 29-30.

Éamon de Valera (1882-1975) was brought up on a farm in County Limerick. He became a teacher of mathematics in Dublin, where he increasingly became involved in republican circles. He was a commandant in the 1916 Easter uprising and narrowly escaped death by firing squad when the uprising was defeated. After independence he was an opposition leader, Prime Minister, and eventually President of Ireland. De Valera's lifelong interest in mathematics, particularly in celestial mechanics and quaternions, is shown in his letters from prison and in his founding one of the leading research institutions of Ireland, the Dublin Institute for Advanced Studies, with its three constituent Schools of Theoretical Physics, Cosmic Physics and Celtic Studies. On establishing the Institute in 1939, he said, "This is the country of Hamilton, a country of great mathematics; ... establishing a School of Theoretical Physics will again enable us to achieve a reputation in that direction comparable to the reputation which Dublin and Ireland had in the middle of the last century."

Raymond Flood, Department of Continuing Education, Kellogg College, Oxford, OX1 2JA, UK

[Stamps shown: Berkeley, Hamilton, Quaternions, de Valera]
*Column editor's address: Faculty of Mathematics and Computing, The Open University, Milton Keynes, MK7 6AA, England.
THE MATHEMATICAL INTELLIGENCER VOL. 18, NO. 3 © 1996 Springer-Verlag New York