The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the editor-in-chief, Chandler Davis.
Too Long for this Margin I'n vol. 12, no. 3 of the Intelligencer (Summer 1990), the Mathematical Entertainments editor cited Fermat's remark ("I have discovered a w o n d e r f u l proof of this fact but this margin is too small to contain it"), and invited the submission of remarks accompanying statements of other famous conjectures or theorems. Now, while I can perhaps not completely satisfy the then editor's requirements, here are t w o remarks, in a vein similar to Fermat's, which m a y interest readers. The first is a remark in de Morgan's A Budget of Paradoxes [Open Court (1915), Chicago and London, vol. I, p. 15], where, on the quadrature of the circle, de Morgan writes, Aristotle, treating of the category of relation, denies that the quadrature has been found, but appears to assume that it can be done. Boethius, in his comment on the passage, says that it has been done since Aristotle, but that the demonstration is too long for him to give. The second remark is nearer in time to Fermat's comment (Boethius's dates are c. 480-524 A.D.). Replying, in N o v e m b e r 1710, to a letter from Jean Bernoulli, Pierre R e m o n d de M o n t m o r t wrote (see his Essay d'analyse sur les jeux de hazard of 1713, p. 403), in a n s w e r to a c o m m e n t on his treatment of the g a m e of Treize (or matches), I would tell you about my method, if I did not fear it would be too long, I flatter myself that it would be to your taste.
A. I. Dale Mathematical Statistics University of Natal Durban, 4001 South Africa
Bourbaki Quoted i n a B o o k o f Mathematics i n 1918 In the mathematical literature, there are m a n y theories about the birth of the n a m e Bourbaki: there exist both a French general [3] and an o m n i v o r o u s h e d g e h o g [2] carrying the notorious name. Andr~ Weil recalls in his memoirs [5] that the very earliest mathematical Bourbaki p u n was a fake lecture given b y a certain Raoul H u s s o n at the Ecole Normale Sup~rieure in 1923. Recently, as I was browsing a m o n g antiquarian books collected b y m y friend Jouni Parkkonen, I h a p p e n e d to o p e n W. Ahrens's Altes und Neues aus der Unterhaltungsmathematik [1] printed in Berlin in 1918. I was startled to see the n a m e Bourbaki q u o t e d on page 103 in a chapter dealing with superstition related to the n u m b e r 13. The passage in question reads as follows: Unter den Schtilem der franz6sischen Kriegsschule SaintCyr gilt die Zahl 13 geradezu als Gl~ickszahl: Wer als dreizehnter bei der Abgangspriifung rangiert, gilt ftir einen gemachten Mann; eine gl/inzende milit/irische Laufbahn ist ihm sicher, und der MarschaUstab liegt in seinem Tornister zur jederzeitigen gef/illigen Bedienung bereit. Mac Mahon, Bourbaki und andere sollen lebendige Beispiele dieses Dogmas gewesen sein. [Among students at the French military academy Saint-Cyr, the number 13 was positively lucky. Whoever came out thirteenth in the ranking at graduation was considered to have it made; a brilliant military career was assured; promotion to marshal's rank would be his for the taking, with continual pleasant duties. Mac Mahon, Bourbaki, and others were given as living examples of this rule.] Of course, this Bourbaki is the general Charles-DenisSauter Bourbaki (1816-1897) w h o has no other relationship with mathematics. Nonetheless, I bet 10,000 Polde-
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 1 01994 Springer-VerlagNew York 3
vian crowns that this is the earliest existing quotation of the n a m e Bourbaki in any book of mathematics! En passant, some time ago, I also ran into the n a m e Nicolas Bourbaki in the wartime files of the Finnish counter-espionage service. Indeed, Andr6 Weil was interned in Finland from N o v e m b e r 30 t h r o u g h December 12,1939 during the Finnish-Soviet Winter War. Weil was suspected of belonging to the n e t w o r k of a Soviet master spy alias Nicolas Bourbaki. But this is a long story that has been told elsewhere by Weil [5]; I a d d e d some curious features from the fat file on him in the Finnish Central Police Archives [4].
References 1. W. Ahrens, Altes und Neues aus der Unterhaltungsmathematik, Verlag Julius Springer, Berlin, 1918. 2. S. K. Berberian, "Bourbaki, the omnivorous hedgehog: a historical note," The Mathematical Intelligencer 2 (1980), 104105. 3. P.R. Halmos, "Nicolas Bourbaki," Scientific American (May 1957), 88--99. 4. O. Pekonen, "Uaffaire Weil ~ Helsinki en 1939," Gazette des Mathdmaticiens 52 (1992), 13-20. 5. A. Weil, Souvenirs d'apprentissage, Birkh/iuser Verlag, Basel 1991.
Osmo Pekonen Department of Mathematics University of Jyviiskylii P.O. Box 35 SF-40351 Jyviiskylii Finland
9 John de PilIis 4 THEMATHEMATICAL INTELLIGENCERVOL.16,NO.I, 1994
The Opinion column offers mathematicians the opportunity to write about any issue of interest to the international mathematical community. Disagreement and controversy are welcome. An Opinion should be submitted to the editor-in-chief, Chandler Davis.
Copyrights and Wrongs Adrian Riskin
Now, however, it will be open to those who possess the requisite ability to examine these discoveries of mine ... as I judge it well to communicate them to those who are conversant with mathematics, I send them to you with the proofs written out, which it will be open to mathematicians to examine. Farewell. Archimedes [1] With these words, the finest mathematician of antiquity commended to his colleague Dositheus his treatise "On the Sphere and the Cylinder." Friends of Dositheus copied the manuscript, and their friends copied it again, and so on in an unbroken chain through the hundred subsequent generations of mathematicians down to our day. However, if modern conditions had prevailed and "On the Sphere and the Cylinder" had been accepted for publication in a scholarly journal, Archimedes would have been within his rights to send a copy to Dositheus, but anyone who copied that copy would have been in violation of the law. The chain would have been broken and, possibly, the treatise lost to posterity. Modern mathematicians have essentially the same goal as Archimedes did in publishing their work: to "communicate them to those who are conversant with mathematics." And yet, at present, the whole apparatus of mathematical publishing works against this goal. Copyright laws and litigious publishers have put the entire mathematical literature, monographs and journals, effectively out of the purchasing power of individuals and even of many libraries. Large-scale photocopying, the natural solution to this problem, is also proscribed by publishers. If I need a copy of a journal article and my university library doesn't take the journal, I have essentially two options. First, I can write to the author for 6
a reprint, which I may eventually receive if he or she isn't dead, lazy, disorganized, or incommunicado. On the other hand, I can request the article from inteflibrary loan, where it may cost a fortune ($10 for a three-page article is not outside my experience), it may be unavailable as no other libraries in the country have it, or they may simply refuse to process the request due to self-imposed limits designed to protect them from being sued by a publisher. It is tragic that mathematicians should suffer in this parched desert when all around us the springs of information, facilitated by computers and photocopiers, flow freely.
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 (~)1994 Springer-Verlag New York
The process by which we fell into this trap was, however, a natural one. Scientific journals have their roots in the publications of the great learned societies of the 17th century. The case of The Philosophical Transactions, published by the Royal Society in London, is archetypal. According to M. Ornstein [2], "the Royal Society had early planned a regular publication." Scientists being what they were (and are), nothing was done until the secretary, Henry Oldenburg, "decided, as a private venture, to publish monthly the matters of most importance which the members of the society or foreign scientists communicated to him." This work was undoubtedly a labor of love; he never made more than s a year profit. Because of the difficulty and unprofitability of this t y p e of enterprise, it was unlikely that many journals could have been produced this way. As mathematics 1 became more widely practiced, the need for journals grew, until it became feasible for private businesses to take over the publication of these materials. Privately published journals served the purpose of disseminating information quite well because their publishers had the (profit-driven) inclination and capital that mathematicians lacked. Because lax copyright laws might have interfered with the publishers' ability to make a profit, and because the distribution of the material was dependent on the publishers, it was in the best interest of mathematicians, who are not primarily motivated by the possibility of profit, to sign over their rights to their research. I refer to this setup as the old system. The old system was probably the best possible w a y to facilitate the free flow of information, although it has always had at least one serious-- if previously unavoida b l e - drawback. The market for mathematical journals is small, and that fact tends to drive up the price. This creates a vicious circle that sends journal prices to astronomical levels. Because of this, some of the more obscure English-language mathematical journals are only available at one or two libraries in this country, and some foreign-language ones aren't available at all. The New System
The beginning of the obsolescence of the old system was the advent of high-qualit~ inexpensive, readily available photocopying, which made any price above five cents per page too high. As mathematical journal prices can run to over a dollar per page, photocopying is widely practiced. Publishers retreated in the face of this onslaught, establishing the Copyright Clearance Center (CCC). Most journals have some set fee, usually around three dollars, which one is supposed to remit to the CCC on copying an article. This makes clear the present contradictions of the old system: It was adopted to facilitate 1Note that I am writing about mathematicians specificallybecause of my professional experience, although it is probable that most of my points apply to the other sciences as well.
the free flow of information and has ended up impeding it due to its outmoded internal logic. The old world is dying and the new cannot be born. Nowadays it is a routine matter for mathematicians around the world to transmit their work electronically to one another via TEXand e-mail. The obstacles to free access imposed by the necessity for personal communication between author and reader are gradually being overcome by means of electronic periodicals such as the Ulam Journal, and by electronic archives such as the algebraic geometry repository based at Duke University. Methods like these, which the new system comprises, show the old system up as even more harmful and pointless. For the first time in history, mathematicians can distribute their work to one another freely and effectively without a n y kind of middleman, and yet we remain in the grip of publishers whose goals no longer mesh with our own. Furthermore, we as mathematicians do most of the work for the publishers. Publishers may require authors to submit their work already typeset or in camera-ready form, mathematicians edit journals for free or for a mere pittance, referees review articles for free, and publishers repay the mathematical community by charging $60 a copy for journals and by prohibiting the photocopying of articles. This extensive voluntarism was justifiable in the past because the old system was the only w a y to get the research distributed, but that is no longer the case. We as mathematicians can benefit greatly by redirecting all this effort so that it serves our ends more effectively. The solution is simple: We can rely more heavily on electronics. Electronic journals, perhaps sponsored by universities a n d / o r by the NSF, should become the norm. These would have little overhead because the work of typesetting (i.e., TEXing), editing, and refereeing would be done entirely on a voluntary basis, as it is now. Libraries could print up cheap hard copies of these journals for archival purposes, and anyone on the internet could get copies of articles for free. Standards would not be lowered because electronic journals are refereed in exactly the same fashion as print journals. Monographs could be distributed in the same fashion and would thus never go out of print. I have heard some frustrated mathematicians insist that copyright laws should be changed to allow a freer flow of scientific information, but this is unnecessary. Clearly, electronics will cause the old system to wither away, a process that we can and should hurry along by acting in concert with the inevitable. References
1. Archimedes, On the sphere and the cylinder, in The Works of Archimedes (T. L. Heath, trans.), Cambridge: Cambridge University Press (1987). 2. M. Ornstein, The Role of Scientific Societies in the Seventeenth Century, London: Archon Books (1963), 125.
Department of Mathematics Northern Arizona University Flagstaff, AZ 86011 USA THEMATHEMATICAL INTELLIGENCERVOL.16,NO.1,1994 7
Captain Mangin-Bocquet's Contribution to Mathematics D. Huylebrouck
In Mathematical Intelligencer, vol. 7 (1985), no. 4, Douglas M. Campbell r e n e w e d interest in the life of the French mathematician Andr6 Bloch. Having c o m m i t t e d a triple murder, this first-rank mathematician completed most of his research in a psychiatric hospital. The Mathematical Intelligencer [vol. 10 (1988), no. 1] published a four-page p a p e r b y Henri Cartan and Jacqueline Ferrand on "The Case of Andr~ BlocK" They concluded that "the life of Andr4 Bloch has remained a unique and troubling case, as m u c h in the annals of psychiatry as in the history of mathematics." His biography and m a n y interesting details can be found in the articles quoted above. C o m m e n t i n g on the m u r d e r drama, D. M. Campbell said, " S o m e d a y w e m a y be able to get the French n e w s p a p e r accounts ..."; the second p a p e r stated, "It seems that this painful affair was kept quiet at the time and was not reported in the press." At that time indeed, even an important n e w s p a p e r like Le Figaro was often not more than four pages (one large folded sheet). Yet, it closely followed the event, as did Le Journal des De'bats and Paris-Midi. The column b e l o w is extracted from the Nov. 18, 1917 issue of Le Figaro. It was longer than the note o n President Wilson's congratulations to the N o r t h w e s t States delegates' w a r contribution!
could foretell the drama of which he would be the unconscious actor. How did the tragic event happen? No one knows. What we did learn was that the lieutenant killed his uncle, his aunt, and his cousin with revolver shots in the living room, where they were quietly having dinner. The madman then attacked the bodies with knife-thrusts. The uncle had signs of about fifty wounds. When he had finished the beating, Lieutenant Bloch surrendered himself to the police station of Plaine-Monceau, asked for Mr Raoul Legrand, superintendent of police, and told him:
--Harassed by society with unfair hatred, I just committed justice, I killed them all. Thus he explained his triple murder. Mr. Legrand went to the boulevard de Courcelles and could only verify that the unfortunate had told the truth. Lieutenant Bloch has been imprisoned at Cherche-Midi. (Note from the translator:. Bloch committed his crime at n o o n - - m i d i - - a n d was put in the Cherche-Midi prison, chercher meaning: to look for.)
A TRIPLE MURDER Drama of madness
A terrible drama took place yesterday between noon and 1 o'clock, at I04 boulevard de Courcelles. At that address lived Mr Charles D., forty-six years old, and Mme D., born Linda O., thirty-three years old, manufacturers of watches. The couple had invited their brother, Mr Georges D., lieutenant at the 83rd artillery regiment and member of the Ecole polytechnique, and their cousin Andr6 Bloch, lieutenant at the 117th heavy artillery, staying at 26 avenue de Wagram, for dinner. Lieutenant Bloch had been convalescing for two months following a concussion he received in an attack on the Aisne. He had not yet recovered completely and had complained, every since his injury, of frequent brain troubles, but nothing 8
THE MATHEMATICALINTELLIGENCERVOL.16, NO. 1 (~ 1994 Springer-VerlagNew York
UN TR/PLE ASSA3 AT D r s m e de la relic Un terrible drame s'est d6roul6 hier, entr~ mldl et une hours, 104, boulevard de Couroslles. A cette adreseo habitait M. Charles Didishelm, t1~6 de quarante-six ans, et Mine Didlsheim, n~e Linda Oiivetti,trente-troi8 ans, fabrieanta de montres. Lea ~poux avaient invite /k d6jeuner leue fr~re, M. Geor~ea Didisheim, lieutenant au 83" r~giment d'artillerie, d6tach6 /~ l'Ecolc i~elytechuique, et leur neveu Andr~ Bloch, utenant au it7' d'artilhrie lourde, demeurant 26, avenue de Wa~ram. Le lieutenant Bloeh ~'tait en convaleseeuc~ de deux mois par suite d'une commotioa c~r~brale revue au ~:oum d'une attaque sur l'Ai~ne; I1 h ~tait pas complbtement remis et ai:cusait, . ~ p u i s sa bless~xre, des d&ange. meats c 6 a ~ u x fr~quqnta, mais rien no fan slit p ~ e drams dent il allalt ~tre Fintone,lent 9~.~ r . Comment le traglque 6v6nement s'est-iI d6roul6 ? Ou l'ignore. Ce qua l'on a appris c'est qua le lieutenant a tu6 k e o u ~ de revolver son oncie, sa tante et son cousin clans la sage a manger, o~ ils d~Jetmaient_ tranqufllement. , L e fou a est ensuite acharn6 sur lea cadav ~ s k coups do couteau. L'oncle porte lea tracem d'tme cinquantaine de bleseurea. Quand U fur las de frap~r, le lieutenaac Bloch se readit au comrmssariat de police de la Plain~-Moneeau, fit mander M. Raoul Legra_nd, commi~saire de re}ice, et lui dit : - - Ppursuivt par la so,-i~t~ d'une haino i~Justs, Je viens ite me fairs justice, Je le~ ai tons tufa. Et il expliaua sea triple assaaainat. M. Legrrana so readit boulevard de Coureelle~ et-ne!put que eonstater que le m.,dheureux avail tilt vrai. 1~. liellte~aant Bloch a 6t6 ~ u ~ ~ h~ ~nson du (~hegche-M/dL .... "
La domestique, Mrne Bossart. tenter de d~,s*~'. mor le foreen6 et rut bless6e/t la nmin. Le concierge n'eut que le temps de sc pv~eggi. tea"dans sa loge et do s'y enlermer pour 6ehn,p:pea" h la lu='etw (to 1 assassin, q.o,I, lee mains el 1o visage com~erLs de sang, se pr6cipita s~r le ~u, levard, oa il se luissa arrtter.
T h e Journal des De'bats a n d Paris-Midi p r o v i d e s h o r t e r texts, b u t s o m e a d d i t i o n a l i n f o r m a t i o n c a n be d r a w n f r o m the latter: A. Bloch u s e d two g u n s a n d two knives (this settles a q u e s t i o n raised b y D. M. Campbell!), a n d ... Mme B., the maidservant, tried to disarm the madman and was w o u n d e d in the hand. The housekeeper had only time to reach her room and lock herself up to escape from the fury of the murderer, who, hands and face covered with blood, hurried to the boulevard, where he let himself be arrested .... These a d d i t i o n a l details s e e m to d e e p e n the guilt o f A n d r 6 Bloch. As for the actual accusation, a n o t h e r colu m n in Le Figaro of Nov. 21 a g a i n q u o t e s A n d r 6 Bloch h i m s e l f to relate the facts:
Nouvolies D/voesos Le triple mcurtrc
du boulevard de Courctlles Lc capitaiuc Ma~gin-Bocquet rapportcur pros Io :2" Conceit dc guerre, .~'eot re~,du .L l'h6pitAl mi itairc ,In Val-de.-Gr.~cc, dells 1,= ~er,-ice de [~sychiatrie, u~t !o ticute~tnt Bb..,:]t est soign~. Le meurtrier a expliqu~, pou.rquoi il avait tu6 son fr~ro el, lez ;tpoux Didisheiut, ':"~ onele et tante. -- C'est par humauitd.., dat-il d'abord... J,' ,: voulaispas qu'ils aient des enfants~ui aura~c:,: h&it6 des tares qui font ma misore phy..t...i.~gique. Puis sc repren.',at : - - i l s so } 3 o r t a i e n t t r o p b l c n . . . It y ,x ,hx :it,-" q u e j o VOUlais t u e r le p r e m i e r ~eutt,.. il tJ.~ a q u o q l l e h l u c : i m o i s o.ud JO veU.~:, t u e r dal,~ th.t f.w m i l l e . . . 11 y a t r o i s s e t a . t d u e s , /L a u d,~j a:i ':. c o m m e s a m e d i , j ' a v a i n u a r e v o l v e r , je I'ai ~.,,:.t~ p l u s i ~ u r , : fol,~ p o u r t u e r . . , p u i s j ' a i c u l , t i , 3 . .
Le maiade, qui (:.~t ua math6maticiea ,h:tiugu6, a ,3t,~ couiio aux sOLeS tilt d o c t c t l : Briand, rn.3decin alt6uiste. ~et
MISCELLANEOUS NEWS The triple murder of the boulevard de Courcelles Captain Mangin-Bocquet, reporting at the 2nd Council of War, went to the military hospital of Val-de-Grace, in the psychiatric service, where Lieutenant Bloch is a patient. The murderer explained w h y he had killed his brother and the married couple D., his uncle and aunt. --It is by humanity.., he first said .... I did not want them to have children who would have inherited the infirmities that make my physiological misery today. Then, resuming: --They were having it too well... Ten years ago I wanted to kill the first passer-by.., it is only the last few months that I wanted to kill in my own family ... Three weeks ago, during a dinner, I had a revolver, I drew it several times to kill.., then I had pity on them... The sick man, a distinguished mathematician, was taken care of by Dr. Briand, physician for mental illnesses.
T h e Journal des De'bats o f Nov. 22 c o n f i r m s t h o s e facts, b u t a d d s t h a t Bloch w a s k e p t u n d e r the particular s u p e r vision of C a p t a i n M a n g i n - B o c q u e t to p r e v e n t h i m f r o m c o m m i t t i n g suicide. T h e r e a d e r will notice v a r i o u s inconsistencies in these a c c o u n t s r e g a r d i n g the relation of the p r o t a g o n i s t s to o n e another. These o c c u r in the original n e w s p a p e r a c c o u n t s (along w i t h c o n f u s i o n as to A. Bloch's m i l i t a r y rank: w a s he lieutenant or sous-lieutenant?). In fact, the fellow officer m u r d e r e d w a s A. Bloch's b r o t h e r G e o r g e s , n o t his c o u s i n G e o r g e s D. (Did he e v e n h a v e s u c h a cousin?) University of Burundi BP 2605 Bujumbura Burundi THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994 9
Mysteries of Mathematics and Computation Michael Shub
I was quite surprised last May when Ted Brown, the Chairman of the Queens College Computer Science Department, called and asked me to give a lecture entitled "The Future of the Theory of Computing." I had just returned from two months in California and hadn't opened my mail yet. So I didn't realize that Ted was asking for a very brief and informal talk. He had written to each member of the Industrial Affiliation Board of the Queens College Computer Science Department asking that they make a small presentation at a meeting to be held at the College in June. About two years ago I wrote a book review of David Ruelle's book on dynamical systems and bifurcation theory; this made me reflect on what I thought was important in dynamical systems. I was quite happy with the result [20]. So I hesitantly agreed to give the lecture, to have the opportunity for some reflection on the theory of computing, only partly aware of the work it would entail. Among the issues I wanted to consider was the relation of theory and practice-- past, present, and future; especially as sources of funding are more and more asking the scientific community to assess the effectiveness of scientific research. After the talk was announced at Queens College, I was asked to give it in my own department and at the Courant Institute, and I revised the talk each time. Theory is related to practice in at least four ways. For each I will give examples first from computer science and then from mathematics.
2. Structural: Theory provides a language and structure in which to discuss and analyze an existing practice. As an outstanding example here we have complexity theory, P- and NP-complete problems, or the application of linguistic and logical constructions to the theory of computer languages and compilers. For mathematics, we may take the organizing effect that dynamical systems has had on ordinary differential equations. 3. Anticipatory: For its own internal reasons, theory creates structures which later may be important for practice. Here we might point to the spectacular example of Turing's Universal Machines as precursors of the modern computer and to this day as the main theoretical model of the machine. I will return to this
1. Incremental: Theory and practice incrementally improve together in a well-defined branch of study. Here I have in mind the steady improvement of algorithms for data sorting, compiler design, etc.; for mathematics, the steady progress in the numerical solution of differential equations of interest in physics, engineering, and industrial design. 10
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 (~)1994 Springer-Verlag New York
later. For mathematics, we have the standard example: the development of Riemannian geometry which was later so useful to Einstein. On a smaller scale, but much more recently and much closer to home, in fact in my own department, the work of Brian Marcus and Roy Adler on the classification problem in symbolic dynamics was later found useful for coding, and was used to improve the storage capacity of our disks. 4. Informal: Theory creates a background which is thought to embody the state of knowledge as a measure of what can be accomplished. Here we have G6del's and Turing's theorems and the conviction that NP-complete problems are hard. For mathematics, we can again cite G6del and Turing, or the unpredictability of chaotic systems. For each category there are lots of examples, and, of course, the structural, anticipatory, and informal all contribute to the incremental. Having tied theory to practice, I was partly off the hook as far as the title of my lecture was concerned. The future of theory will depend to a degree on the future of practice, and we will have to wait and see what that is. Currently, the theory of computing is mostly tied to machine design, data management, and the concomitant combinatorial optimization problems. Looking at the table of contents of the Symposium on the Theory of Computing, 1969, I was struck by how much the first symposium seemed like logic or perhaps the theory of computable functions, with emphasis on the complexity of computable functions. By 1992 complexity theory has matured enormously and some new issues appear in the 24th Annual ACM Symposium on the Theory of Computing, such as parallel computing and fault tolerance. But the 1992 STOC emphasis remains on solving combinatorial problems and problems about the logical structure of the machine such as problems of communication. Over the decades, data processing has occupied most computer time, but the advent of the workstation has put cheap sophisticated computational power on millions of desks. Computational mathematics, even symbolic computational mathematics is exploding. CAD-CAM and emerging computer-video interactive technologies, robotics, and scientific problems such as protein folding will require extensive scientific computation. Parallel and distributed computers will give greatly enhanced computing power. The computer is a tool; numerical algorithms get more sophisticated, partly in response to more sophisticated hardware. We can easily predict that theory will incrementally progress along with practice. The current explosion o'f work on wavelets is an excellent example. Stored-program digital computers were invented to do scientific computations. Here the theory, numerical analysis, and the practice seem almost entirely incre-
Figure 1.
mental. Wilkinson's invention of backward error analysis and condition number are two exceptions. I would call them structural. What are the ingredients of a structural theory of scientific computation? Let me give you an example of a problem with scientific origin and the difficulties we encounter. Computer graphics have proven to be extremely useful tools for the study of low-dimensional dynamical systems. The Lorenz attractor, Julia sets, Mandelbrot set, and Henon attractor are a few of the earliest and most famous examples. If we consider Newton's method for the complex polynomial
f ( z ) = (z 2 - 1)(z 2 +0.16), we have the beautiful picture made by Scott Suthefland [1] (Fig. 1; also see the cover of this issue for a color version). We recall that N y ( z ) = z - f ( z ) / f ' ( z ) is a dynamical system on the Riemann sphere. The fixed points of N f are the roots of f. These are along the imaginary and real axes. The regions in shades of red, green, yellow or blue converge under iteration of N / t o these roots, with the lighter shades converging more quickly. The black region is an open set of points which converge under iteration to two attracting points of period two and, thus, fail to converge to a root. At the boundary between any two hues, including black, is (an approximation of) the Julia set. Its fractal nature is evident in the picture. This computer graphic is an excellent heuristic to understand the dynamics and hence Newton's method for this particular case. But now let us turn this discussion around and ask ourselves to explain the machine which produced this picture and the picture itself. The Turing machine model THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
11
does not seem adequate to the task. For starters, fractional dimension and the boundary between open domains do not make sense for a discrete set of points. To truly understand this picture we must posit a machine which operates on complex numbers, so that we can understand the complex analytic dynamics Nf (z) = z - f ( z ) / f ' ( z ) via the work of Fatou and Julia, Sullivan's theorem on nonwandering domains, the hyperbolicity of the Julia set, approximations to it, etc. Newton's method is but one example. Scientific computation and numerical analysis deal with problems whose natural domains of definition are the real and complex numbers. To be able to discuss computability, efficiency, and complexity in this context, Lenore Blum, Steve Smale, and I [Blum-Shub-Smale [2] (=BSS)] have introduced machines which operate on elements of a ring. The main examples of rings which we have in mind are the integers, Z, the reals, 1R, and the complexes, C. Ring
Functions
Branching
g II~ C
Polynomials Rational functions Rational functions
< 0 or > 0 < 0 or > 0 = 0 or # 0
Our machines, which have come to be known as BSS machines, are essentially flowchart machines, comprising a finite directed graph with five types of nodes: input, output, computation, branch, and a certain fifth node which can access memory. All computations are done on a finite number of variables and there is only one input node. An example of such a machine is a machine to implement Newton's method. input z E C
l :=Z
~
no "J If(z)l < e
l
yes
] output z For appropriately chosen ~, this machine could be used together with a counter as a subroutine to produce the illustration on the cover of this issue. Part of the reason to have a machine defined over various rings is to have the power of mathematical analysis available for problems over I~ or C, while having available simultaneously the well-developed theory of computability and complexity over 77,. One of our first results, motivated by recursive function theory, was on nondecidability. Given a subset of the inputs, S c /, a 1 2 THEMATHEMATICALINTELLIGENCERVOL.16, NO. 1, 1994
machine M is said to decide S, and S is said to be decidable, if on input x E /, M outputs yes if x E S and no if
zCS. THEOREM 1 (BSS). If f ( z ) = ~ j = e 0 ajzJ is a complex polynomial with three or more distinct roots, then no machine can decide for input z E C if Newton's method will converge to a root of f , i.e., if f ( N ~ ( z ) ) --+ Oask --+ oo. Thus, the approximation to the Julia set in Figure 1 can only be an approximation. For problems which are decidable, we have those decidable in polynomial cost in the input size. This is called the class P of decision problems. In order to define P we need two more pieces of data for our three basic examples: Ring
Input Size
Cost
Z
Bit Dimension Dimension
Bit Algebraic Algebraic
I~ C
The class NP is roughly that class of decision problems which can be solved in polynomial cost in the input size. The classes P and NP are defined in analogy with the classes of Cook [3] and agree for Z. The basic problem, does P # NP? now makes sense in all three contexts. The importance of the class NP over Z is largely due to the great number of NP-complete problems of Cook [3] and Karp [4]; see Garey and Johnson [5]. Recall that a problem is NP-complete if any other problem in NP admits a polynomial cost reduction to it. THEOREM 2 (BSS). The following problems are NPcomplete: Over Z: (a) Given f E Z[Xl,..., x,~] and a bound b E Z, is there a point ( x l , . . . , xn) E Z '~ such that f ( x l , . . . , x,~) = 0 and E x 2i < _ 52? (Bounded Hilbert's lOth) (b) Given n x m integer matrix A, integer vectors b, c, and an integer k, is there an integer point x such that A x < b and cx > k? (Integer Linear Programming) Over 1~: (c) Given f: lg n --+ I~ a polynomial of degree 4, is there an x E IRn such that f ( x ) = O? (Four feasibility) Over C: (d) Given a system of polynomials f l , . . . , fa where fj: C n --+ C, is there a common root of the fj, i.e., an x E C '~ such that fj(x) = 0 for j = 1 , . . . , k? (Hilbert's Nullstellensatz ) The problems over Z are well known to be N P complete. Using BSS machines, we give an integrated proof for (a), (c), and (d); (b) is only slightly more difficult. In search problems over ~, we are asked not only to decide if a problem has a solution but to approximate
a solution if one exists. Even to solve x 2 - a = 0, i.e., to compute x/~ up to e, we see that more and more algebraic operations are needed as e --* 0, since v ~ i s not a rational function of a. Thus, the degree of approximation must be part of the input size of the problem. The situation for search problems in the presence of input or computational error is even more problematical. The condition of the problem, i.e., the sensitivity of the solution to perturbation of the data, will have to play a role. Even for two simultaneous linear equations, the solutions are not continuous in the data, so in the presence of error we have infinitely badly conditioned problems. Steve Smale and I have recently done an analysis of homotopy methods for Bezout's theorem [6-8] which measures the n u m b e r of steps in terms of the condition of the homotopy, gives a geometric interpretation to the condition, and gives probability estimates that a problem m a y be ill-conditioned. Here is an example of our results for Bezout's problem: Find the solution lines of n homogeneous equations of degrees dl, .. 9 d,~ in n + 1 complex variables. We let ?'/(d) be the vector space of such systems, (d) = ( d l , . . . , dn). We concentrate on the incidence variety V C P(7-~(d) ) • "P(n), where P(7-l(d)) is the projective space of 7-l(d), P(n) is the projective space of C n+l, and Y = {(f, x)lf(x ) = 0}. The unitary group acts on C n+l with the usual Hermitian structure and induces an action on 7-/(4). We take a unitarily invariant Hermitian structure on 7"/(d) first considered by Kostlan [9]. Thus, we have the Fubini-Study metric on P(7-l(d)) and P(n) and an induced metric on V. We replace N e w t o n ' s method by projective Newton introduced in [10]. The classical condition of (f, x) where f(x) = 0 is determined by the norm of the inverse of the derivative of f at x. We denote the projective version of this number by #(f, x). For a homotopy F -- ft, let #(F) be the sup of #(ft, xt) over all ft(xt) = O. THEOREM (Bez I). The number of steps of projective New-
ton's method sufficient to follow a homotopy is <_ r 2, where c is a constant < 10, D = max d~, and L is the length of the homotopy. There is also a one-root version of this theorem. Thus, if the condition of a problem is taken into account, the number of steps m a y depend modestly on the condition. The unitary group is crucial to our analysis. Next, we give a geometric interpretation to the condition number, generalizing work of Demmel [11, 12]. Let ~' = {(f, x) C Vlrank D f ( x ) < n}. E' is the generalization of the univariate variety of (f, x) such that x is a multiple root of f.,We define a vertical distance p to ~ and prove. THEOREM (Bez I). For (f, x) E V, 1
#(f, x) -- p((f, x), E')
Finally, we estimate the volume of the badly conditioned problems. THEOREM (Bez II). The probability for (f, x) E V that p(f, x) < Po < 1/x/n is < K n N p 2, where N = dim ~'~(d)
and K is a small constant. In order to prove this volume estimate we exploit the two projections
V C P(~(d)) x P(n)
/ P(7-l(d))
\ P(n)
When we apply the technique to real homogeneous systems with the induced Riemannian structure, we have T H E O R E M (Bez II). The average number of real roots of
real homogeneous systems is the square root of the number of complex roots, i.e., :D1/2, where l) = Hd~ is the Bezout number. For all the di equal, this is a result of Kostlan [13]. It stands in distinction to the univariate result of Kac [14] which asymptotically gives (2/lr)In d, where d is the degree. The difference is the role of the unitary group. This is but a contribution to a theory of the complexity of equation-solving. Finding adequate, realistic models and a good body of results for a complexity theory of scientific computation remains a great problem. N o w I would like to consider the theory and practice from a completely different perspective. I have pointed to Turing's Universal Machine as a spectacular example of theory developed for its o w n sake applied to practice. Turing and von N e u m a n n are both frequently credited with the invention of the stored-program computer. But two engineers, J. E Eckert and J. W. Mauchly at the Moore School of the University of Pennsylvania, had built the Eniac and were planning a follow-up machine. Von N e u m a n n consulted on the project and wrote a draft of a report on the Edvac which was distributed under y o n N e u m a n n ' s n a m e alone [14]. Joel Shurkin [15] writes: The paper has become, arguably, the single most important document in the field. Almost every history written about the computer, or about von Neumann, credits him with the idea of a stored-memory computer. Among scientists, computers are to this day known as von Neumann machines. Von Neumann was not the originator of the storedprogram computer. As we have seen, the idea for such a machine was being discussed at the Moore School a year before von Neumann arrived on the scene, and Eckert had written a memo on the subject almost six months before von Neumann had even heard about the Moore School project. Shurkin goes on to quote H. H. Goldstine about the report. This report represents a masterful analysis and synthesis by him of all the thinking that had gone into the EDVAC from THEMATHEMATICAL INTELLIGENCERVOL.16,NO.I, 1994 13
the fall of 1944 through the spring of 1945. Not everything in there is his, but the crucial parts are .... It is obvious that von Neumann, by writing his report, crystallized thinking in the field of computers as no other person ever did. He was, among all members of the group at the Moore School, the indispensable o n e . . , only von Neumann was essential to the entire task. Compare this with S. Frankel quoted b y B. Randell: Many people have acclaimed von Neumann as the 'father of the computer' (in a modem sense of the term) but I am sure that he would never have made that mistake himself. He might well be called the midwife, perhaps, but he firmly emphasized to me, and to others I am sure, that the fundamental conception is owing to Turing--insofar as not anticipated by Babbage, Lovelace and others. In my view von Neumann's essential role was in making the world aware of these fundamental concepts introduced by Turing and of the development work carried out in the Moore School and elsewhere. I can imagine that von N e u m a n n u n d e r s t o o d Eckert's idea better than Eckert because he certainly k n e w Turing's work. This w o u l d be consistent with Goldstine's quote and Frankel's. I have not tried to check the original documents. Two things are clear here, though: the enormous difficulty of tracing the intellectual origin of ideas in practice, and the value of having s o m e o n e w h o really understands t h e m w h e n it comes time to apply them. In these tight economic times, as theoreticians we must not lose o u r confidence in the value of good theoretical work, even w h e n it seems distant from practice in the near term. This is h o w I concluded m y talk at Q u e e n s College in June. The other two talks were not scheduled until October. I put m y material aside, afraid to get too involved in this historical question without the skills or training of a historian, and certainly h a p p y to be concentrating on the mathematical papers I was writing. But I could not put out of m y mind the question of whether von N e u m a n n k n e w Turing's Universal Machine. The assertion was plausible enough. Turing had spent two years at Princeton after solving the Entscheidungsproblem. Von N e u m a n n had even asked him to be his assistant. But there was a very disturbing feature to the assertion. The Edvac report mentioned McCuUoch and Pitts, w h o had i n v e n t e d neural nets in a 1943 paper, but there was no mention of Turing! It is true that McCulloch and Pitts [16] m e n t i o n Turing machines a n d assert that their neural nets give the same computable functions. "This is of interest as affording a psychological justification of the Turing definition of computability and its equivalents, Church's A-definability and Kleene's primitive recursiveness: If any n u m b e r can be c o m p u t e d by an organism, it is computable by these definitions and conversely" (p. 35 of reprint). But no mention is m a d e of Universal Machines and Turing's p a p e r is not referenced. Besides Frankel, Goldstine [17], p. 174, had the following to say about von Neumann: 14 THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
It was his training in formal logics that made him very much aware of and interested in a result which foreshadowed the modem computer. This was contained in independent papers published by Emil L. Post and Alan M. Turing in 1936. Post taught at City College of New York and Turing was an Englishman studying at Princeton University (1936-1938). Each of them conceived of what is now called an automaton and described it in similar, mechanistic terms. The men worked independently and in ignorance of each other. There is no doubt that von Neumann was thoroughly aware of Turing's work but apparently not of Post's. In 1937, von N e u m a n n wrote a letter of r e c o m m e n d a tion about Turing which is q u o t e d and c o m m e n t e d on in H o d g e ' s biography of Turing [18]. June 1, 1937 Sir, Mr. A. M. Turing has informed me that he is applying for a Proctor [sic] Visiting Fellowship to Princeton University from Cambridge for the academic year 1937-1938. I should like to support his application and to inform you that I know Mr. Turing very well from previous years: during the last term of 1935, when I was a visiting professor in Cambridge, and during 1936-1937, which year Mr. Turing has spent in Princeton. I had opportunity to observe his scientific work. He has done good work in branches of mathematics in which I am interested, namely: theory of almost periodic functions, and theory of continuous groups. I think that he is a most deserving candidate for the Proctor Fellowship, and I should be very glad if you should find it possible to award one to him. I am, Respectfully, John von Neumann Last w e e k I called Goldstine and read him von Neum a n n ' s letter. H e found it incredible. H e was of the opinion that "Johnny w o u l d have mentioned the w o r k if he had k n o w n it." He also said that in the times of the Edvac or the early stages of the IAS project, Turing's w o r k was never mentioned. Only later w h e n von Neum a n n lectured on automata theory did Turing's w o r k come up. Goldstine had learned of Turing from v o n N e u m a n n at some point and had read the work, but he wasn't sure when. Some other evidence that von Neum a n n k n e w Turing's w o r k was recorded in H o d g e ' s 1983 biography: reminiscences of Ulam from 1938, 1939, and of Frankel, w h o recalls that v o n N e u m a n n pointed out Turing's w o r k to him in 1943 or 1944. On the other h a n d H o d g e relates that von N e u m a n n claimed not to have read a n o t h e r p a p e r in logic after G6del's 1931 theorem. So a n o t h e r plausible reconstruction of reality is that y o n N e u m a n n did not k n o w Turing's Universal Machine at the time of the Edvac report, but w h e n he learned of it he graciously gave him credit. As an amateur, I've gone about as far as I can. I w o n d e r if historians can settle the question. H a n k Tropp tells me I'm opening a can of worms. Does this change m y evaluation of the anticipatory value of theory in this example? N o t really. In any case, Turing's w o r k did anticipate the m o d e r n computer and is e n o r m o u s l y important as a model of computation. Von N e u m a n n ' s b a c k g r o u n d in logic was surely an important ingredient to his w o r k on the Edvac. One thing that becomes even clearer is the
incredible difficulty of tracing the intellectual origins of ideas in practice. When I gave this talk at the Courant Institute, Martin Davis gave me a copy of his excellent scholarly article [19]. Martin was m o r e certain that von N e u m a n n k n e w about the Universal Machine. We agreed that proof, as mathematicians like to have it, is hard to find.
10.
11. 12. 13.
References 1. Sutherland, S., Finding roots of complex polynomials with Newton's method, Preprint, Institute for Math. Sciences, SUNY, Stony Brook, 1989. 2. Blum, L., Shub, M. and Smale, S., On a theory of computation and complexity over the real numbers: NPcompleteness, Recursive functions and Universal Machines, Bull. Amer. Math. Soc. (New Series). Also abstracted in the IEEE 1988 FOCS 21 (1989), 1-46. 3. Cook, S. A., The complexity of theorem-proving procedures, Proceedings 3rd ACM STOC, 1989, 151-158. 4. Karp, R. M., Reducibility among combinatorial problems, in Complexity of Computer Computations, (R. E. Miller, and J. W. Thatcher, eds.), Plenum, New York, 1972, 85-104. 5. Garey, M. and Johnson D., Computers and Intractability, Freeman, San Francisco, 1979. 6. Shub, M. and Smale, S., Complexity of Bezout's Theorem I: Geometrical aspects, J. Amer. Math. Soc. 6 (1993), 459-501. 7. Shub, M. and Smale, S., Complexity of Bezout's Theorem II: Volumes and probabilities, Computational Algebraic Geometry (E Eysette & A. Galliger, eds.) Progress in Mathematics, Vol. 109, Birkh~iuser Boston, (1993), 267-285. 8. Shub, M. and Smale, S., Complexity of Bezout's Theorem III: Condition number and packing, J. Complexity 9, (1993), 4-14. 9. Kostlan, E., Random polynomials and the statistical funda-
14. 15. 16.
17. 18. 19.
20.
mental theorem of algebra. Preprint, University of Hawaii, 1987. Shub, M., Some remarks on Bezout's Theorem and complexity theory, in From Topology to Computation, Proceedings of the Smalefest (M. Hirsch, J. Marsden and M. Shub, eds.) Springer, 1993, 443--455. Demmel, J., On condition numbers and the distance to the nearest ill-posed problem, Numer. Math. 51 (1987), 251-289. Demmel, J., The probability that a numerical analysis problem is difficult, Math. Comp. 50 (1988), 449-480. Kostlan, E., On the distribution of the roots of random polynomials, in From Topology to Computation, Proceedings of the Smalefest, (M. Hirsch, J. Marsden and M. Shub, eds.) Springer, 1993, 419-432. Kac, M., On the average number of real roots of a random algebraic equation, Bull. Amer. Math. Soc. 49 (1943), 314320. Shurkin, J., Engines of the Mind, W. W. Norton, New York, 1984. McCulloch, W. S. and Pitts, W., A logical calculus of the ideas imminent in nervous activity, Bull. Math. Biophys. 5 (1943), 115-133. Reprinted in McCulloch, W. S., Embodiments of Mind, MIT Press, 1988. Goldstine, H. H., The Computer from Pascal to von Neumann Princeton University Press, Princeton, NJ, 1972. Hodges, A., Alan Turing: The Enigma, Simon & Schuster, New York, 1983. Davis, M., Logic and the Origin of Modern Computers, in The Universal Turing Machine. A Half Century Survey (Rolf Hecken, eds.), Verlag Kammerer & Unverzagt, HamburgBerlin; Oxford University Press, Oxford, 1988. Shub, M., Book review of Elements ofdifferentiable dynamics and bifurcation theory by David Ruelle, Bull. Amer. Math. Soc. (New Series) 24 (1991), 199-211.
IBM P.O. Box 318 Yorktown Heights, NY 10598 USA
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. I, 1994 1 ~
Sophus Lie and Harmony in Mathematical Physics, on the 150th Anniversary of His Birth Nail H. Ibragimov
"The extraordinary significance of Lie's work for the general development of geometry can not be overstated; I am convinced that in years to come it will grow still greater" - - s o wrote Felix Klein [13] in his nomination of the results of Sophus Lie on the group-theoretic foundations of geometry to receive the N. I. Lobachevskii prize. This prize was established by the Physical-Mathematical Society of the Imperial University of Kazan in 1895 and was to recognize works on geometry, especially nonEuclidean geometry, chosen by leading specialists. The first three prizes awarded were to the following: 1897: S. Lie 1900: W. Killing 1904: D. Hilbert
(Nominator: E Klein) (Nominator: E Engel) (Nominator: H. Poincar6).
There can be no doubt that the work of Lie in differential equations merits equally high evaluation. One of Lie's striking achievements in this domain was the discovery that the majority of the known methods of integration, which until then had seemed artificial and not intrinsically related to one another, could be introduced
all together by means of group theory. Further, Lie gave a classification of ordinary differential equations of arbitrary order in terms of the admitted group, thereby identifying the full set of equations which could be integrated or reduced to lower-order equations by group-theoretic considerations. But these and a rich store of other results of his did not lend themselves to popular expositions and remained for a long time the special preserve of a few. Today we find that this is the case with methods of solution of the problems of mathematical physics: Many of them have a group-theoretic nature yet are taught as though they were the result of a lucky guess. It was my good fortune to get interested in application of groups to differential equations at the very beginning of my university work, and to write my first paper under the direction of Professor L. V. Ovsiannikov, who has done so much to awaken interest in this discipline and establish it as a contemporary scientific field. In my later
Friedrich Engel 20
THE MATHEMATICALINTELLIGENCERVOL.16, NO. 1 01994 Springer-VerlagNew York
work I saw over and over how effective a tool Lie theory is for solving complicated problems. It significantly widens and sharpens the intuitive notion of symmetry, supplies concrete methods to apply it, guides one to the proper formulation of problems, and often discloses possible approaches to solving them. This article presents my view of the role of Lie group theory in mathematical physics, drawing on parts of some of my lectures over the years at Moscow University and Moscow Institute of Physics and Technology.
A decisive move in integrating differential equations is simplifying the skeleton by means of a suitable change of variables. For this purpose, one uses the symmetry group of the differential equation (or its admissiblegroup),
His Life Story
Marius Sophus Lie was born 17 December 1842 in the town of Nordfjordeid, Norway, the sixth and youngest child of the Lutheran pastor Johann Herman Lie. He studied in Christiania (now Oslo) from 1857, first in gymnasium and then (1859-1865) at the University. Among the events of Lie's life which set his creative course, these stand out: his independent study in 1868 of the geometric works of Chasles, Poncelet, and Plficker; his travels in Germany and France in 1869-1870; his contacts there with Felix Klein, Chasles, Jordan, and Darboux; and his close friendship with Klein, leading to a long collaboration. Lie worked at the University of Christiania from 1872 to 1886, then from 1886 to 1898 at Leipzig. He died 18 February 1899 in Christiania. The life and intellectual development and works of the greatest Norwegian mathematician are described in reminiscences of his colleagues and later biographies (see, for example, [7, 22, 27, 29], and references therein). I call special attention to the painstaking introduction of F. Engel to Lie's Collected Works [21]. These give detailed insight into the essence of Lie's ideas and a picture of him as a person. Symmetry
of Differential
1. The skeleton of the Riccati equation yS+ y2- 2/z ~ = 0 is a surface invariant under the group of inh o m o g e n e o u s deformations ~ = ze ~, ~ = ye-~, ~ = y~e-~'. Figure
r
Equations
The notion of differential equations really has two components. For an ordinary first-order differential equation, for example, it is necessary 1. to specify a surface F(x, y, y') = 0 in the space of the three variables x, y, y~; we will call this surface the skeleton of the differential equation; 2. to define the class of solutions; for example, a smooth solution is a continuously differentiable function ~(x) such that the curve y =
y, _ o
L~
(x)
Ox lies on the surface, i.e., F
Oz
] =0
identically in x; going over to discontinuous or generalized solutions (keeping the same skeleton) changes the situation altogether.
Figure 2. The skeleton of the equation u s + u 2 - u - 2 = 0, obtained from the Riccati equation ys + i/2_ 2/x2 = 0 by the change of variables t ~- In x, u = ~ .
THEMATHEMATICAL INTELLIGENCERVOL.16,NO.1,1994 21
Table 1. Lie's group classification of second-order equations.
Group
Basis of the Lie Algebra
Equation
Cl
Xl = o~
yt! = f(y, yt)
x~ = ~ G2
X1 = ~ ,
x,=s163
x~ = ~ )(2 = X
V'=f(V)
~
r
y ~
:
x,=s
x,:xs163
x,:s X6 = y o-~,
X.:xs o --x('VO X7 = x 2 ~7 -t- y ~ ,
x.=ys
y/" ~ 0
X8=xYo27+y2o~
defined as the g r o u p of transformations of the (x, y)plane whose extensions to the derivatives y ' , . . , leave the equation's skeleton invariant. EXAMPLE. The Riccati equation y' + y2 _ 2 i x 2 = 0 admits the group of transformations ~ = xe a, ~ = ye -a, for the equation's skeleton (Fig. 1) is invariant u n d e r the inh o m o g e n e o u s stretching 9 = xe a, ~ = ye -a, ~' = y' e -2a which is obtained b y extending the transformations of the group to the first derivative y'. The substitution t = In x, u = x y leads to the differential equation u' + u 2 - u - 2 = 0. Thus, it straightens out the skeleton of the original Riccati equation, taking it to a parabolic cylinder (Fig. 2); concomitantly, the stretchings are replaced by a g r o u p of translations t = t + a, ~ = u, and E l ~ .UI"
Group
s(r
y" + 2 ( V+cV3/2+V2~ = 0 \ =-y / y" = Cy -3 ytt Ce-y' y" = Cy '(k-2)(k-1), k # 0, 89
x2:xs163
G3
G8
-----
substitutions and complex bases of algebras as needed. For example, the equation y" = C(1 + yt2)3/2eq arctany '
C, q = const,
admits a 3-dimensional Lie algebra with basis 0 X l = Ox'
0 X2 = ~y,
0 0 X3 = ( q x + Y ) ~ x + ( q y - X ) ~ y .
N o real substitution takes this to any of the equations of Table 1. But it is transformed to the equation k-2 y " = C ~ . k'-~- l '
k_q+l q- 1
by the complex substitution 9 = 89(y - ix), y = 89(y + ix). Classification
In a short communication to the Scientific Society of G6ttingen (3 December 1874), I gave, among other things, a listing of all continuous transformation groups in two variables x, y, and specially emphasized that this might be made the basis of a classification and rational integration theory of all differential equations f(x, y, y', . . . . y(m)) = 0 admitting a continuous group of transformations. The great program sketched there I have subsequently carried out in detail. (S. Lie [16], p. 187) This and the next two sections give some of the main results of implementing the program, as it applies to ordinary second-order differential equations. The restriction to second order is motivated not b y anything essential about the m e t h o d b u t by a desire to concentrate on concrete cases and give brief but definitive statements. For second-order equations the g r o u p classification [16] looks especially simple. The second-order classification result is stated briefly and explicitly in [18], w and appears here as Table 1. Remember that Lie carried out his classification in the complex domain, using complex 22 THEMATHEMATICALINTELLIGENCERVOL 16, NO. I, 1994
Algorithm of Integration I noticed that the majority of ordinary differential equations which were integrable by the old methods were left invariant under certain transformations, and that these integration methods consisted in using that property. Once I had thus represented many old integration methods from a common viewpoint, I set myself the natural problem: to develop a general theory of integration for all ordinary differential equations admitting finite or infinitesimal transformations. (S. Lie [17], p. iv) If a second-order equation admits a Lie algebra of dimensionality r _> 2, then it can be integrated by a grouptheoretic quadrature method. This can be done in various ways, one of which is given in Table 3. It is based on the simple fact that in the complex case any Lie algebra of dimensionality r > 2 has a distinguished 2-dimensional subalgebra. But the structure of a 2-dimensional Lie algebra with basis X~=~
~x 0 0, ( ,y)~x+rl~(x,y)-~y
a=l,2,
Table 2. Canonical form of 2-dimensional Lie algebras and invariant second-order equations.
Type
L2 structure
Basis of L2 in Canonical Variables
Equation
I
[Xa,X2]=O,
XaVX2#O
X, = o ,
X2-- o
V't=f(v')
II
[X1, X2] 7---O,
X1 V X 2 = 0
X l 7-- 0"~ '
X2 ----x O_Ou
Y" = f (x )
III
[X1,X2]=X1,
Xa=o-~u,
X2=x~
y,,= 89
X l = ~y,
_
IV
[Xl: X2] = X l ,
X1VX2#O X l V X2 = 0
can be described simply in terms of the commutator [ X l , X2] = X l X 2
-- X 2 X l
and the pseudo-scalar product
0
y-
X2 -- y ~yy
= f(x)y t
Consequently, the algebra L2 belongs to type III of Table 2. Fourth step: finding the integrating change of variables. From the equations xi(t)
= 0,
= 1,
X 1 V X 2 = ~1T]2 -- 771~2.
we find the substitution This description is given in Table 2; for details, see [4, 6, 11, 12, 19, 23, 32]. EXAMPLE. Let us apply the group algorithm to the equation y . _ y' 1 y2 xy "
First step: finding the admissible algebra. This is done by use of the so-called determining equation. It turns out, as a consequence of standard and straightforward computations, that our equation admits the Lie algebra L2 with basis Xl
= X2 0
CO
Or'
X 2 = --X
0 Ox
yO 20y"
From Table 3 we see that we can pass at once to the third step. Third step: finding the type of the algebra L2. We have [Xl, )(2] -- Xl,
Xl V )(2 ---- lx2y r O.
y t = --,
1 u = ----,
X
X
taking the 1-parameter group generated by the operator X1 (group of projective transformations) to the group of translations in u. After this substitution, the basis of Z 2 takes the form
0 X1 = -~U'
tO 0 X2 - 2 0 t + u 0---~
and coincides (up to the inessential coefficient 89in X2) with the canonical basis for type III in Table 2. Here we exclude solutions of the form y = Cz. This substitution has put the equation into the integrable form u" 1 u,--~ + ~ = O. Solving it gives t2 u= -~-+C
and
t 1 u = ~-~-i+ ~212 lnlC, t - l l + C 2 .
Table 3. Algorithm for integrating a second-order equation using a 2-dimensional Lie algebra.
Step
Operation
Result
Compute the admitted Lie algebra Lr. If r = 2, go to the next step; if r > 2, then distinguish any 2-dimensional subalgebra L2 of Dr. (If r = 1, the order of the equation may be lowered; if r = 0, the group method is not applicable.) Determine the type of the algebra L2 obtained by Table 2. For this one computes the commutator [X1, X2] of X1 and X2 and their pseudoscalar product X1 V X2; if [X1, X2] is neither 0 nor X1, then choose a new basis X~, X~, such that [X~, X~] = X~. Bring the basis of L2 into agreement with Table 2 by going over to canonical variables x, V. Rewrite the equation in canonical variables and integrate it. Rewrite the solution in terms of the original variables.
A basis of Lr: X1,..., Xr A basis of L2: Xl, X2
Reduction of structure to a canonical form from Table 2 Finding the integrating change of variables Solution of the equation
THE MATHEMATICALINTELLIGENCERVOL 16, NO. 1, 1994
Fifth step: finding the solution in the given variables. N o w replace t , u in the foregoing formulas by their expressions, and recall the excluded special solutions y = Cx. We obtain the general solution of the given second-order differential equation in the form y = Cx,
y = +V/2x + C x 2,
+C2=0.
Cly + C2x + x ln Cl Y - 1 x
and the compatibility condition zxv = zvx gives F w = 0. Consequently, the equation y" + F(x, y) = 0 having an F(x, y) not already linear in y cannot be linearized. EXAMPLE 3. Let us see w h e n the equation y" = f(y~) in Table 1 can be linearized. According to (ii) of Theorem 1, it is required for linearizability that f(y') be a polynomial of at most third degree, i.e., that the equation be of the form y" + A3y '3 + A2y 12 + Aly' + A0 = 0 (5) with constant coefficients A~. When one writes out the auxiliary system (3) for Eq. (5), one easily sees it is compatible. Consequently, Eq. (5) is linearizable for arbitrary coefficients Ai.
Linearization In the study of ordinary differential equations it is useful to have simple tests for linearizability. Summing up Lie's results on this question, we can state the following theorem ([16], Part III, w see also [11, 12]).
EXAMPLE 4.
N o w consider this equation from Table 1:
v,, = ! y(y,). x
Linearizability requires it to be of the form (2), i.e., T H E O R E M 1. The following are equivalent:
y . + _1 (A3y,3 + A2y,2 + Aly' + Ao) = 0
(i) the second-order ordinary differential equation
v" = f(x, y, v')
(1)
can be linearized by change of variables; (ii) Equation (1) has the form (2):
with constant coefficients Ai. The compatibility conditions of its auxiliary system (3) come out to A2(2 - A1) + 9AoA3 = 0,
y" +F3(x, y)y'3+F2(x, y)y'2+F1 (x, y)y' + F ( x , y) = 0
Oz
-- Z2-
Oz Oy
-
OF
Fw-
Flz +-~y
y,, (3)
- z w - FF3
Ow Oy
_
1 OF1 2 OF1 -- + 30y 30x'
),
Ao=-
b3 )
~aa+2--~a2
[
9
1
ay,3 q_byt2 +
1 + _~a
y, q_ _~a + 2_~a2 " (7 /
x
It is convenient to find the linearizing substitution from assertion (iv). Let us do this for Eq. (7) in the case a -- 1, b = 0:
vl, = 1 (y, + v,3).
OF3 w2 + F2w + F3z + __~z _ F~ Fa;
(8)
x
This equation admits the algebra L2 with basis
(iii) Equation (1) admits an 8-dimensional Lie algebra; (iv) Equation (1) admits a 2-dimensional Lie algebra with basis X1, X2 such that X1 V X2 = 0.
(4)
EXAMPLE 1. The equation y" = e -y' from Table I is not linearizable, for it does not have the form (2) in (ii). EXAMPLE 2. Suppose in Eq. (2) that F1 = F2 = F3 = 0. Then Equations (3) take the form Zx ": Z 2 - F w + F y , Wx = ZW~
24
b
A,=-(l+3~
Hence, Eq. (6) can be linearized if and only if it is of the form
q- F F 2 ,
1 OF1 2 OF1 z w + f F3 - -~ 0---~+ ~ O--y'
Ow Ox
3A3(1 + A1) - A 2 = 0.
Setting A3 = - a , A2 = -b, this gives
with coefficients F3, F2, F1, F satisfying the compatibility conditions of the auxiliary system
OX
(6)
x
Zy = - - Z W Wy = - - W 2,
THE MATHEMATICALINTELLIGENCERVOL.16, NO. 1, 1994
10 Xl - x Ox,
XE-
yO x Ox,
(9)
satisfying condition (4). This algebra L2 belongs to type II of Table 2, and the linearizing substitution is obtained by going over to the canonical variables 9 = y, ~ = x2/2, relative to which Eq. (9) take the form X1 = 0/0-~, X2 = ~(0/0-~). Aside from the particular solutions p = const, this transforms Eq. (8) into ~" + 1 = 0.
Invariant Solutions Special types of exact solutions, n o w widely k n o w n as invariant solutions, have long been used to advantage on
concrete problems. They have grown familiar in mathematics, mechanics, and physics even before there was any group theory, acquiring the status of folklore. Lie [20] elucidated their group-theoretical meaning and studied the possibility of integrating partial differential equations when the group is sufficiently rich (see [20], Chaps. III and IV). Subsequently, group theory made it possible to clarify, sharpen, and extend many intuitive ideas, and incorporate the method of invariant solutions as an essential component of modern group analysis. It was exactly by the notion of invariant solution that group theory was able to transfer its area of application from ordinary differential equations to the problems of mathematical physics, especially thanks to the works [1, 5, 24, 26, 31]. EXAMPLE. Consider the equation y- = y-3 from Table 1, admitting a 3-parameter group. Its solution, y = V/1 + X2, is invariant under a 1-parameter group, whose generator is ," 0 0 X 1 - ~ - X 3 = (1 q - X 2) ~XX + X y ~ y . Subjecting this invariant solution to the transformations of the 3-parameter admissible group gives
When invariance of the boundary conditions is lost (as often happens), the principle stated can be put to use in other ways. This is what happens, for example, in the method of the majorant in the proof of the Cauchy-Kovalevskaya theorem (on the method of the invariant majorant, see [9]). Another example is the Riemann method, which reduces the Cauchy problem with arbitrary (hence not invariant) data to the special Goursat problem, which is invariant and can be solved by the invariance principle. Here, we only indicate briefly the essence of this approach, referring for details to [12].
The Group Approach to Riemann's Method This section is an attempt at synthesis, at combining Riemann's method [30] of integrating linear hyperbolic second-order equations with Lie's group classification [16] of such equations. Also important here is the invariant formulation (in terms of Laplace invariants) of Lie's results, as given by Ovsiannikov [25]. Riemann's method reduces the problem of integrating the equation
L[u]
=
U~y + a(x, y)ux + b(x, y)uy + c(x, y)u f(x,y) (10)
to the construction of an auxiliary function v satisfying the adjoint equation with given conditions on the characteristics: Y
y = [C1 x2 q- 2 ~
- 1 x + C2] 1/2,
L* [v] = 0,
vlx=~ o = exp
a(xo, 71) &l, a Y0
which is the general solution. This means every solution of this equation is invariant under some 1-parameter subgroup of the admissible 3-parameter group (details in [12]).
The Invariance Principle in the Problems of Mathematical Physics When we pass from ordinary to partial differential equations, it becomes impossible (with rare exceptions) and anyway not particularly useful to write out general solutions. But mathematical physics in any case seeks only those solutions that satisfy given side conditions-initial conditions, boundary conditions, etc. In solving many problems of mathematical physics it is advantageous to use the following semiempirical rule, which is rigorously based only in certain cases (see [5, w 25, w 28]).
THE INVARIANCE PRINCIPLE. If a boundary-value problem is invariant under a group, then we should seek a solution among functions invariant under this group. Invariance of a boundary-value problem means invariance of the differential equation, also of the manifold where the data are given, and also of the data themselves.
vl~=~o = exp
F b(G yo) d(.
(11)
J Z0
Once v is found, the solution of the Cauchy problem for Eq. (10) with data on an arbitrary noncharacteristic curve is obtained by the known integral formula. The function v is called the Riemann function, and the boundary-value problem (11) which determines it is called the characteristic Cauchy problem or Goursat problem. The quantities h = a~ + a b - c,
k = by + a b - c
are called the Laplace invariants for Eq. (10). They remain unaltered by any linear transformation of u with variable coefficients, with x, y not being transformed. In contrast, the quantities p = ~ ,k
q = 1 (ln h)xy
(12)
are invariant under the general equivalence transformation ~ = a(x), ~ = fl(y), ~ = )~(x, y)u of the homogeneous equation (10). These invariants are useful for the classification of Eq. (10) according to the dimensionality of the admissible group. Namely, the homogeneous equation (10) ( f = 0) admits a 4-dimensional Lie algebra [more THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 1, 1994
25
precisely, the quotient algebra with respect to the ideal generated by X = r y) with r y) an arbitrary solution of Eq. (10)] if the quantities (12) are constant; whereas if at least one of them fails to be constant, then Eq. (10) can admit at most a 2-dimensional algebra. For proof, see [25], w Using this result, one proves the following theorem (see [11] or [12]):
THEOREM 2. Assume that Eq. (I0) has constant invariants (12). Then the Goursat problem (11) admits a 1-parameter group. Therefore, the invariance principle is applicable, and the Riemann function can be found from a second-order ordinary differential equation. EXAMPLE1. For the telegrapher's equation uxy + u = 0 we have p = 1, q = 0. Hence, Theorem 2 applies. The Goursat problem (11), namely, v]x=~o= 1,
v~y + v = O,
v]y=~0= 1
(13)
must by Theorem 2 admit a 1-parameter group with generator 0
x = ( x - x0)
- (y - y0)
0
Functionally independent invariants of this group are v and z = (x - xo)(y - yo). Therefore, the invariant solution has the form v = V(z), and after substitution in Eqs. (13), we get a form of Bessel's equation: z V " + V' + V = 0 with condition V(0) = 1. Consequently, for the telegrapher's equation, a Riemann function is the Bessel function do. EXAMPLE 2. Riemann ([30], w applied the technique he introduced to the equation f u~u + (x + y)~ u = 0, ~ = const r 0. (14) In the corresponding problem (11), the condition on the characteristics has the form v]x=~0 = 1,
v]y=y0 = 1.
(14')
Riemann reduced the problem (14), (14') to an ordinary differential equation (leading to the special hypergeometric function of Gauss), considering v as a function of the variable z =
(x - xo)(v
+ y0)(
-
(15)
vo)
+ v)"
Here is how this looks from the group point of view. The invariants (12) of Eq. (14) are p = 1, q = 2/L Hence, Theorem 2 applies here. Solving the determining equation, we find the operator x = (x - x0)(
+ y0)
0
- (y - y 0 ) ( y +
0
admissible for the Goursat problem (14), (14r). Invariants for this operator are v and the quantity z given by Eq. (15). Therefore, the invariant solution has the form 26
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
v = V(z). This is just the invariant solution found by Riemann!
EXAMPLE3.
Next take the equation
uxy+
x+y
u=0,
g = const ~ 0 ,
an "intermediate case" between the telegrapher's equation (13) and Eq. (14). The invariants (12) are p = 1, q = 1 / f ( x + y). But q not being constant, Theorem 2 is not applicable. A full catalog of equations to which Theorem 2 applies is in [11].
Fundamental Solutions Keeping the same orientation as in the preceding sect i o n - t h e application of the invariance principle to boundary-value problems with arbitrary data by reduction to an invariant problem of a special f o r m - - l e t us see what Lie group theory can offer for the construction of fundamental solutions in the case of the three fundamental equations of mathematical physics. This natural line of development of group analysis, passing to the space of distributions, was sketched in [10], giving heuristic considerations and statement of the problems. Yurii Berest [3], my student, recently got remarkable results applying this method to wave equations in Riemannian manifolds with nontrivial conformal group. Some details of infinitesimal group techniques applicable to distributions may be found in [12]. The Laplace Equation. Let us consider the equation Au = 6(x),
x E ~n,
(16)
for a fundamental solution as a boundary-value problem, where at a fixed point, the origin, a &function singularity is given. This boundary problem is invariant under the group of rotations and dilations, generated by the operators i,j=l,
Xij
=
xj __O _ xi 0 Ox i cgxJ '
Z
=
x i O + (2 - n)u -~u" 0
.. n, " '
A basis of the invariants of this group consists of the single function d = ulxl '~-2. According to the invariance principle, the fundamental solution is to be sought as an invariant solution, determined by the equation d = const. Thus, u = C]x] 2-n. (17) Substituting Eq. (17) into Eq. (16), we find the value of the constant, C = 1/(2 - n) fin, where f~,~ is the measure of the surface of the unit sphere in n-space. Thus, the fundamental solution was determined from the condition of invariance up to a constant multiple, and the differential equation served only for the normalization.
The Heat Equation.
Kepler's Laws
The equation (18)
ut - a u = 5(t, x)
with n-dimensional Laplace operator in the space of variables x ~ is invariant under the group of rotations, Galilei transformations, and dilations, which is generated by Xij
=
xj
0
- xi 0 Ox---]'
Z
=
0 0 2t-~+x~-~xi
O . 0 Yi = 2t -~ - x~u O--u'
-nuff--~.
This group has the invariant
The motion of a material point under a central force with potential V = a / I x I satisfies the obvious law of conservation of angular m o m e n t u m M = m ( x x v), where m is the mass of the particle and x and v are its coordinate and velocity, respectively. This conservation law, k n o w n in celestial mechanics as Kepler's Second Law, is a corollary of the invariance of the Lagrange equations of motion under the group of rotations and follows from Noether's theorem. Write the infinitesimal transformation of the rotation group with vector parameter a = (a 1, a 2, a 3) in the form = x + 6x,
J = utn/2eIxl2/4t.
Therefore, an invariant solution has the form (19)
u = c t - n l 2 e -Ix1214t.
Equation (18) serves as a normalizing condition: Substitution of Eq. (19) into Eq. (18) yields the value of the constant, C = (2v~) -n. The Wave Equation. For the equation (20)
utt - A u = 5(t, x),
the group of symmetries is generated by Xij Z
=
x~ 0
_xJ
=
0 t-~+
xi
0
Ox i ,
y~ = t ~
0 0 --Oxi + ( 1 - n ) U o - u u ,
0
0
+ x ~ -5i'
6x = x x a.
(22)
Then from the group point of view, Kepler's Second Law expresses the invariance of the problem under infinitesimal rotations (22). The Kepler problem is also invariant u n d e r the inhomogeneous dilation generated by the operator 0
0
X = 3t ~ + 2 x i Ox i.
An invariant both of the rotation group and of this dilation is the quantity d = t2 / v 3. The existence of this invariant is called in celestial mechanics Kepler's Third Law. Finally, the Kepler problem has a special group of symmetries, which in the notation of Eqs. (22), can be written = x + 6x,
6x = (x x v) x a + x x (v x a).
(23)
i,j=l,...,n.
The operators Xij and Yi generate the group of rotations and Lorentz transformations and have two invariants: u and T = t 2 -- IXl2. Therefore, an invariant solution is to be sought of the form u = f(7). The condition of invariance under the group of dilations with generator Z takes the form 2 r f ' ( T ) + (n -- 1)f(r) = 0. (21) Let us look only at odd n, for in the case of even n we would have to use the method of balayage of Hadamard. Then, setting n = 2m + 1, m = 0, 1 , . . . , we write Eq. (21) as T f f ( T ) + m f ( T ) : O, which is k n o w n to have general solution f(T) =
via(r) + c2 C16(m_U(T) + C2T_ m
form =0 f o r m r 0.
Substitution of these formulas in Eq. (20) gives C1 = 89 -m, C2 = 0. In this way, the invariance principle yields the fundamental solution
?.t
:
89 -Ix12),
n =
17r(l--n)/2~(n--3)/2(t2 -- IX12),
n _> 3,
1
where 8 is the Heaviside function and (~(n-3)/2 is the derivative of order (n - 3)/2 of the &function.
A. V. Biicklund THE MATHEMATICAL INTELUGENCER VOL. 16, NO. 1,1994
27
This 3-parameter group differs from ordinary Lie groups of point and contact transformations, being more general; it is called the group of Lie-Bficklund transformations [2]. The computation of the symmetry group (23) is carried out in [9], p. 346. From Noether's theorem one gets a vector integral of the motion X
A=vxM+alx
I,
found first by Laplace [14]. Taking the scalar product of Laplace's vector A with the radius vector x, one readily infers that the orbit of Keplerian motion is an ellipse. This is Kepler's First Law. So from the group point of view, Kepler's First Law expresses the invariance of the problem under the 3-parameter Lie-Biicklund group with infinitesimal transformation (23). Thus, all three of Kepler's Laws of celestial mechanics have a group-theoretic nature.
Concluding Remarks I could continue in this spirit, for there are m a n y entertaining applications of Lie theory, and n o w a d a y s newly developed methods of group analysis are awaiting application. But I hope what I have said is enough to convince you that acquaintance with the classical foundations and m o d e r n group-theoretic m e t h o d s has become an important part of the mathematical culture of anyone constructing and investigating mathematical models of natural problems. For this, one can go to the beautiful books of Lie, Bianchi, and so on, and more recent works (see the References). In conclusion, I would like to carry over to Lie theory in mathematical physics Einar Hille's remark [8]: "I hail a [semi-]group w h e n I see one and I seem to see them everywhere! Friends have observed, however, that there are mathematical objects which are not [semi-]groups."
References 1. W. F. Ames, Nonlinear Partial Differential Equations in Engineering, Vols. I and II, New York: Academic Press (1965, 1972). 2. R. L. Anderson and N. H. Ibragimov, Lie-Bficklund Transformations in Applications, Philadelphia: SIAM (1979). 3. Yu. Berest, Construction of fundamental solutions for Huygens equations as invariant solutions, Dokl. Akad. Nauk SSSR, 317(4), 786-789 (1991). 4. L. Bianchi, Lezioni sulla teoria dei gruppi continui finiti di trasformazioni, Pisa: Spoerri (1918). 5. G. Birkhoff, Hydrodynamics, Princeton, NJ: Princeton University Press (1950, 1960). 6. G. W. Bluman and S. Kumei, Symmetries and Differential Equations, New York: Springer-Verlag (1989). 7. T. Hawkins, Jacobi and the birth of Lie's theory of groups, Arch. History Exact Sciences 42(3), 187-278 (1991). 8. E. Hille, Functional Analysis and Semigroups, New York: Amer. Math. Soc. (1948), preface. 28
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
9. N. H. Ibragimov, Transformation Groups Applied to Mathematical Physics, Dordrecht: D. Reidet (1985). 10. N. H. Ibragimov, Primer on the Group Analysis, Moscow: Znanie (1989). 11. N. H. Ibragimov, Essays in the Group Analysis of Ordinary Differential Equations, Moscow: Znanie (1991). 12. N. H. Ibragimov, Group analysis of ordinary differential equations and new observations in mathematical physics, Uspekhi Mat. Nauk, To appear. 13. E Klein, Theorie der Transformationsgruppen B. III, Pervoe prisuzhdenie premii N. I. Lobachevskogo, 22 okt. 1897 goda, Kazan: Tipo-litografiya Imperatorskago Universiteta (1898), pp. 10-28. 14. E S. Laplace, Mdcanique cdleste, T. I. livre 2, Chap. III (1799). 15. S. Lie, Uber die Integration durch bestimmte Integrale yon einer Klasse linearer partieller Differentialgleichungen, Arch. for Math. VI (1881). 16. S. Lie, Klassifikation und Integration yon gew6hnlichen Differentialgleichungen zwischen x, y, die eine Gruppe yon Transformationen gestatten, Arch. Math. VIII, 187-453 (1883). 17. S. Lie, Theorie der Transformationsgruppen, Bd. 1 (Bearbeitet unter Mitwirkung von F. Engel), Leipzig: B. G. Teubner (1888). 18. S. Lie, Die infinitesimalen Beriihrungstransformationen der Mechanik, Leipz. Ber. (1889). 19. S. Lie, Vorlesungen fiber Differentialgleichungen mit bekannten infinitesimalen Transformationen (Bearbeitet und herausgegeben von Dr. G. Scheffers), Leipzig: B. G. Teubner (1891). 20. S. Lie, Zur allgemeinen Theorie der partiellen Differentialgleichungen beliebiger Ordnung, Leipz. Ber. I, 53-128 (1895). 21. S. Lie, Gesammelte Abhandlungen, Bd. 1-6, Leipzig-Oslo. 22. M. Noether, Sophus Lie, Math. Annalen 53, 1-41 (1900). 23. P.J. Olver, Applications of Lie Groups to Differential Equations, New York: Springer-Verlag (1986). 24. L. V. Ovsiannikov, Group properties of differential equations, Novosibirsk: USSR Academy of Science, Siberian Branch (1962). 25. L. V. Ovsiannikov, Group Analysis of Differential Equations, Boston: Academic Press (1982). 26. A. Z. Petrov, Einstein Spaces, Oxford: Pergamon Press (1969). 27. E. M Polischuk, Sophus Lie, Leningrad: Nauka (1983). 28. V. V. Pukhnachev, Invariant solutions of Navier-Stokes equations describing free-boundary motions, Dokl. Akad. Nauk SSSR 202(2), 302-305 (1972). 29. W. Purkert, Zum Verh/iltnis yon Sophus Lie und Friedrich Engel, Wiss. Zeitschr. Ernst-Moritz-Arndt-Universitiit Greifswald, Math.-Naturwiss. Reihe XXXIII, Heft 1-2, 29-34, (1984). 30. G. E B. Riemann, Ueber die Fortpflanzung ebener Luftwellen von endlicher Schwingungsweite, Abh. K. Ges. Wiss. GOttingen 8 (1860). 31. L.I. Sedov, Similarity and Dimensional Methods in Mechanics, 4th ed., New York: Academic Press (1959). 32. H. Stephani, Differential Equations: Their Solution Using Symmetries, Cambridge: Cambridge University Press (1989).
Institute of Mathematical Modeling Miusskaya Sq. 4 Moscow 125047 Russia
Sophus Lie* Nils A. Baas
In 1992, it is 150 years since the mathematician Sophus Lie was born. It is a great pleasure to see that The Royal Norwegian Academy of Sciences and Letters today on Founder's day honours him by having his portrait on this year's memorial medal. It is a difficult task to give an account of the work of a mathematician in general terms. Permit me to quote from the memorial speech made by L. Sylow on Lie: "It is the misfortune of the mathematician, more than any other scientist, that his work cannot be presented and explained to the well-educated public, not even to a general audience of scientists. One has to be a mathematician in order to sense the special beauty which a mathematical theorem can offer, or to admire the pure lines of the ",finished parts of this science." In spite of this quotation, I will try to give you an impression of the life and work of Sophus Lie. Marius Sophus Lie was born in Nordfjordeid in the western part of Norway on December 17, 1842. His father--Johan Herman L i e - - w a s a minister, and his mother came from a well-known Trondheim family. At an early age, the family moved to Moss (near Oslo), where he grew up. Later on, he was sent to Nissens Latinog Realskole in Christiania (Oslo) where he graduated in 1859. At school, Lie did well in all subjects. In mathematics, he had Ludvig S y l o w u l a t e r on a well-known mathemat i c i a n - a s a teacher. Sylow remarked later that even if Lie was ahead of his class in mathematics, he did not see a great mathematician in him. Lie himself said that the road to mathematics for him was long and difficult. Sophus Lie was extremely well endowed, not only intellectually; but also physicall3a He was tall and strong and did very well in sports. It was, therefore, natural
for him to think of a military career, but his astigmatism stopped him. He considered studying both languages and science, and his friends thought that a bad mark in Greek made Lie switch to science. During his studies, Lie attended lectures by Sylow on group theory--a subject not being taught many places at the time and which turned out to have fundamental importance for his later research. He graduated from the University at Christiania in 1865. He did not graduate with distinction as he had hoped. The years just after graduation were very difficult. His talents were man~ and he could not make up his mind what to do. He was teaching mathematics, working as an assistant in astronom~ but without real satisfaction. The turning point in Lie's life came in 1868. He came across works by J.V. Poncelet and J. Plficker, introducing
*This paper is a translation by the author of the biography presented in Norwegian at The Royal Norwegian Academy of Sciences and Letters in Trondheim, Norway, on 26 February 1992.
16 HE MATHEMATICALINTELLIGENCERVOL.16,NO. 1 ~1994 SpringeroVerlagNew York
him to m o d e m geometry. Suddenly he felt that he had found a field where he could unfold his talent. Lie was excited by P1/icker's idea of replacing the point as a space element by more complicated objects like lines, curves, and surfaces. From now on, he was fully devoted to mathematics, and he seemed happy with his choice. He soon felt the need to communicate his results, but formal publications were too tedious for him. Therefore, in this early period he made use of small pamphlets containing isolated theorems without proofs and in unedited form. To Lie, it seemed important to secure as soon as possible the priority of the scientific discoveries he made. His first published paper appeared in 1869. It gives a new representation of the complex plane and uses ideas of P1/icker. But Lie had difficulties in getting these ideas published by the Academy in Christiania. He was impatient. Professor Bjerknes asked for more time to look at the paper, but Professor Broch returned it after two days--saying he had understood nothing! However, three other professors--who probably understood the material even less--supported publication. This happened as a result of influence by friends of Lie. The important consequence of this was that Lie now received a / travel grant which made it possible for him to go abroad. Elling Holst writes in a paper in 1912 that many stories were told in Christiania at the time about Lie and his physical strength and good condition. One weekend, he was going to visit his parents in Moss. He walked the 60 km. When he did not find his parents at home, he immediately turned around and walked the same way back! He was also known as an excellent gymnast. Lie spent the winter of 1869-70 in Berlin. Here he met another gifted young geometer, Felix Klein. They became friends and were to influence each other strongly in the following years. He also received positive attention among the leading mathematicians in Berlin. From Berlin, he went to Paris via G6ttingen; later on, Klein also came to Paris. Here Lie started to work on transformation groups, which was to become one of his major fields of research. Because of the French-German war in 1870-71, Klein had to return to Germany. Lie decided to walk to Italy. However, he was soon arrested--accused of being a German spy. His mathematical notes were assumed to be military secrets in codc cspecially a letter from Klein seemed suspicious! After intervention by the French mathematician G. Darboux, Lie was released. Thereafter, he travelled back to Christiania through Switzerland and Germany without further complications. At home, he now received a fellowship from the University, and in 1871 he received his doctor's degree, defending a dissertation entitled, "Over en Classe geometriske Transformationer." Lie now applied for a professorship in Lund (Sweden), but many of his influential friends appealed to the Parliament (Stortinget), and consequently he was appointed to an extraordinary profes-
Sophus Lie sorship in Christiania in 1872. In 1874 he married Anna Birch from Tvedestrand, to w h o m he had been engaged since 1872. They had three children--two girls and one boy. The marriage was a happy one, and Lie was very fond of his family. Lie entered a very rich and productive scientific period. His starting point had been geometry, but now he turned to the theory of differential equations with all his force. He began to see the connections between his transformation groups and general symmetries. Lie was still in contact with Klein, who published his Erlangen program in 1872. Here Klein defines geometry as the study of those properties of a space which are left invariant under a group of transformations. In Norwa~ it is well-known that N.H. Abel showed the impossibility of solving the general quintic equation or higher-degree equation by radical expressions. However, the full understanding of these questions first came through the work of the French mathematician E. Galois w h o studied the symmetries of these equations. Lie discovered that his theory of transformation groups could be used to give a corresponding theory of differential equations. Working out the underlying group theory took a long time, and it was not until 1893 that "Theorie der Transformationsgruppen" was finished in three volumes. In Christiania, Lie was mathematically isolated, and he complained to Klein. In THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
17
1884, Klein sent one of his students--Friedrich Engel-to Christiania to study with Lie. They developed a warm and lifelong friendship. Engel helped Lie by giving his often intuitive geometrical ideas a more precise mathematical form. Lie often felt it a burden to prepare his ideas for publication. Despite his isolation, the 1870s were the most intense and creative years of his life. At the same time, he was disappointed that his works did not receive more attention abroad. In a letter to the German mathematician A. Mayer he writes, "If I only knew how to get the mathematicians interested in transformation groups and their applications to differential equations. I am certain, absolutely certain in my case, that these theories in the future will be recognized as fundamental. I want to form such an impression now, since for one thing, I could then achieve ten times as much." This disappointment led him in 1876 to start working on differential geometric subjects again. In fall 1882 Lie went to Paris, and he seemed to have impressed the leading French mathematicians. From then on, they followed his work with great interest.
"If I only knew how to get the mathematicians interested in transformation groups and their applications to differential equations . . . . " In 1886, Lie was offered a professorship in Leipzig to succeed Klein, who had moved to G6ttingen. The offer was very tempting for Lie, and he accepted. Here he could come to a great mathematical center where he could lecture on his own research. But he did not want to stay forever--he had in mind 6-8 years abroad. So he did not resign from his professorship in Christiania, but he was granted an extraordinary leave of absence. For Lie, it was now vital to get his ideas published. Therefore, it was also important for him that in Leipzig he was close to the printers of the extensive production he was planning. His great work on transformation groups with Engel was published in the period 1888-93. At this time it was quite unusual for young French mathematicians to go to Germany to study. But the I~cole Normale Superieure in Paris sent some of its best students to Lie, and he was very proud of this. Life in Leipzig was not that easy for Lie. His teaching duties were much heavier than at home, the language caused him some problems, and he tired of supervising weak and dependent graduate students. As time passed, he also ran into trouble with some of his colleagues. In the fall of 1889, he suffered severe neurasthenia and insomnia. He entered a psychiatric clinic near Hannover and described himself as an impossible patient, resisting the opium cure to which he was subjected. He got over this crisis and started to teach again in the fall of 1890. But 18 ThEMATHEMATICAL INTELLIGENCERVOL.16,NO.1,1994
from his letters it seems he did not get well until 1892. Then he writes that he has regained his old health and "with sleep, the pleasure of life and work has returned." Yet, he wrote: "I cannot find words to express how much I am longing to be back in Norway. My nervous system has suffered a lot here in Leipzig, where I have missed the opportunity for exercise and the spiritual influence of nature." After the breakdown, it seems he regained his scientific power, but he was more depressed and pessimistic. At the same time, he also entered into sharp polemic attacks against colleagues. In 1892, there was a conflict between Klein and Lie. The Erlangen program of Klein from 1872 had not been as influential as the work of Lie. Klein wanted to republish it with an addition involving the ideas of Lie, but they had very different points of view on how things had developed. In addition, Klein had burned all the letters from Lie up to 1877---contrary to an agreement they had. The old friends had fallen out. In 1893, Lie shocked the German mathematicians by publicly attacking Klein who now had a leading position in German mathematics. He wrote: "I am no pupil of Klein, nor is the opposite the case, although this might be closer to the truth". The famous Norwegian author Bjornstjerne Bjornson was told by Fridtjof Nansen that Lie was longing to go back to Norway, and he took the initiative to get him home. In 1894, the Parliament established an honorary chair in the theory of transformation groups and with a special salary--so that he could return without any financial loss. But it was not until 1898 that he left Leipzig and moved back to Christiania. Lie was a sick man when he came home. He started lecturing in the fall of 1898, but he had to stop after a few months. He suffered from pernicious anemia, and the illness progressed quickly. Sophus Lie died on February 18, 1899. The contributions to science by Sophus Lie are very impressive. He created fundamentally new theories and fields of research which have greatly influenced the mathematical development in our century. Lie groups and Lie algebras are today classical notions pervading all of mathematics. But also in other sciences where fundamental symmetries play a role, like physics and mechanics, the work of Lie has had an enormous influence. Especially in elementary particle physics, Lie groups have been essential. The most modern theories of elementary particles--gauge theories and string theories---could not even have been formulated without the work of Lie. Lie himself, in a letter to his friend E. Motzfeldt in 1890, wrote, "... m y life's work will stand through all times, and in the years to come, be more and more appreciated--no doubt about it." He seems to have been absolutely right! Finall}~ I regret that a proper biography on Sophus Lie has not yet been written. Many myths have been created about h i m - - m a n y because of his illness. Many
of those he was in conflict with survived him, and they may have been too influential in forming our impression of Lie. It is, therefore, important to study carefully his correspondence and other available material in order to get a better understanding of his personality. Let us hope that in connection with the 150th anniversary of his birth, an initiative will be taken to give him the biography he truly deserves.
Acknowledgments I would like to express my gratitude to Eldar Straume, Inst. for Mat. Realfag, Univ. of Troms6 and Elin Strom, Historisk Institutt, Univ. of Oslo for their kind help in finding relevant sources and for interesting and informative conversations about Sophus Lie. Note added to manuscript. At the University of Oslo "The Sophus Lie Memorial Week" took place August 17-21,1992, including an exhibition of his life and work. A bust of Sophus Lie was unveilled at his birthplace Nordfjordeid in Norway on August 22, 1992.
3. E. Holst: "Tr~ek of Sophus Lies ungdomsliv," Ringeren 2 (1899), 98-102. 4. E. Hoist: "Sophus Lie, in Nordm~end i det nittende Aarhundre," 2 (1912), 99-143. 5. I. Johansson: "Sophus Lie--betraktninger i anledning av hundre~irsdagen for hans fodsel," Norsk Mat. Tidsskrift 24 (1942), 97-106. 6. I. Johansson: "Sophus Lie ct femtblrsminne," article in Aftenposten, Oslo, February 18, 1949. 7. L. Sylow: "Sophus Lie cn Mindetale," Kristiania Videnskabsselskab, February 24, 1899, published 1900. 8. E. Straume: "Sophus Lie i historisk perspektiv," Inst. Mat. Realfag, Univ. of Tromso, 1983. Department of Mathematical Sciences,NTH University of Trondheim N-7034 Trondheim,Norway
Bibliography 1. E Engel: "Sophus Lie," NorskMat. Tidsskrifl4 (1922), 97-114. 2. E. Hoist: Speech at "Universitetes Mindefest over Sophus Lie," April 20, 1899, published by Aftenposten, Kristiania.
T.J. Stieltjes
uvres Completes / Collected Papers Volumes I and 2 Edited by G. van Dijk 1993. XV, 1316 pp. 1 fig. (In 2 volumes, not available seperately. Vol. 1: IX, 566 pp.; Vol. 2: VI, 750 pp.) Hardcover $ 240.00 ISBN0-387-55560-9 This is a new, annotated edition of Thomas J. Stieltjes'Collected Papers, published earlier in 1914 (Vol. I) and 1918 (Vol. II) by Noordhoff, Groningen, in French. The present edition is published in two volumes on the occasion of the lOOthanniversary of Stieltjes' death in 1894. Besides the reproduction of Stieltjes' papers, the new edition contains a short biography and about 75 pages of commentaries by important mathematicians who discuss the impact of Stieltjes'work on the development of the theory of orthogonal polynomials and continued fractions. In particular, Stieltjes'claim to have proven the Riemann hypothesis is discussed. The work is especiallyof interest for mathematicians,but also for others who are anxious to know about the impact of Stieltjes on modem mathematics.
Springer Pri~ are subjectto change without notice. All prices for books and journals include 7% VAT.In EC countries the local VATis effective,
rb.1025/MNT~/2q
Spriager-~'iag D HeidelbeigerPlatz 3, 0-14197Berlin, F. R.GermanyD 175 Fifth Ave., New York,NY10010,USA [] 8 Mexandeagd., London SW197Jg, England[] 26, rue des Cumes, F-75005 Paris, France[~ 37-3, Hongo ychome, Bunkyo-ku,Tokyo 113,Jalm [] Room 701, MirrorTower, 61 mody Road,Tsimshatsui,Kowloon,Hoog Koog[] AvingudaDiagonal,468-4~ E-08006 Barcelona,Spain[] We~sel~yi a. 28,1t-1075 Budapest,lltmgary THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
19
Self-Avoiding Walks Gordon Slade
What Are They?
Self-avoiding walks are discrete paths without selfintersections. For our purposes they live in the ddimensional integer lattice Z a, which consists of the points in R a whose components are allintegers. Elements of Z e will be referred to as sites. Two sites are called nearest neighbours if they are separated by unit Euclidean distance. An n-step self-avoiding walk is then defined to Jbe an ordered set of n + 1 sites in Z a for which each consecutive pair of sites consists of nearest neighbours and in which no site occurs more than once (Fig. 1). It is this last feature that gives the walk its self-avoiding character and turns out to make life difficult. For despite its simple definition, the self-avoiding walk leads to mathematical problems which are simple enough to state to the mathematically uninitiated, but which are very hard to solve and mainly still open. This article gives an introduction to some of these mathematical problems; a detailed account can be found in [23]. The self-avoiding walk has been used for some time as a model of linear polymers. Linear polymers are molecules which form in long chains of basic units called monomers. The chains can get pretty long: some consist of about 105 monomers. Polymer scientists want to know how many different configurations an n-monomer chain can adopt, and also how far apart the endpoints of the molecule typically are, assuming each configuration is equally likely. These questions translate into questions about self-avoiding walks, where the selfavoidance constraint models the excluded volume effect: No two monomers can occupy the same region in space. For polymers, n is very large, so it is natural to ask about the asymptotic behaviour of properties of the set of nstep self-avoiding walks, in the limit as n goes to infinity. Real polymers live in the continuum rather than on a lattice, but this local issue turns out to be largely irrelevant for long-range questions, and the lattice approximation is a good one for large n. For polymers, the most relevant dimension is d = 3, but 2-dimensional self-avoiding walks are also important as a model of polymers constrained between two narrowly separated
f
1 1
im
f rr Figure 1. A 2-dimensional self-avoiding walk with 105 steps, beginning at 0. A subsequent step to the left would eventually lead to a trap.
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 (~1994 Springer-Verlag N e w York
29
parallel planes. It is perhaps less clear w h a t dimensions 4 and higher have to do with polymers, but it is instructive nevertheless to s t u d y how the behaviour of the model changes as a function of the spatial dimension d. Visualizing a walk as a path in the lattice, the selfavoidance constraint says that the p a t h has no loops. Although a walk is defined to be a sequence of sites, it is useful to think of it as the continuous path formed by joining each pair of consecutive sites by the intervening unit line segment; see Figure 1. Actually, the term "walk" can lead to some confusion, and perhaps "chain" would be better. The self-avoiding walk is not a model of polymers which are growing in time: n is static. It is tempting to try to think of the self-avoiding walk as a kind of nonMarkovian stochastic process (a Markov process forgets its past, whereas the self-avoiding walk has to remember its entire history), but the self-avoiding walk is not only n o n - M a r k o v i a n - - it is also not a process. A walk m a y be trapped and impossible to extend by another step. The Connective Constant
Let cn denote the n u m b e r of n-step self-avoiding walks which begin at the origin. This measures the number of possible configurations of a polymer of n + 1 monomers. For simple r a n d o m walks, which have no self-avoidance constraint, the analogue of cn is just (2d) n as there are 2d options for the walk at each step. The situation is not as easy for self-avoiding walks, but we can at least say the following. All possible (n + m)-step self-avoiding walks can be formed by concatenating n-step self-avoiding walks to m-step walks, but not all such concatenations will be self-avoiding. This means that c,~+m <_ c,~cm, or in other words the sequence {cn } is submultiplicative.
Figure 2. A piece of the honeycomb lattice, the only lattice for which the connective constant is believed to be precisely known.
30
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
Table 1. Rigorous lower and upper bounds on the connective constant #, together with estimates of actual values, for dimensions 2, 3, 4, 5, 6. Estimated errors in the last digit(s) are shown in parentheses.
d
Lower Bound
Estimate
Upper Bound
2 3 4 5 6
2.62002 a 4.572140 c 6.742945c 8.828529c 10.874038c
2.638 1585 (10)d 4.683 907 (22)e 6.772 0 (5)f 8.838 6 (8)g 10.8788 (9)g
2.695 76b 4.756b 6.832b 8.881 b
10.903b
References: (a) [6], (b) [1], (c) [17], (d) [5, 11], (e) [8], (f) [9], (g) [101.
It is an immediate consequence of submultiplicativity that the sequence {(c2k) 1/2k } is nonincreasing. With a bit more w o r k it can be s h o w n that submultiplicativity ensures the existence of the limit # = lim,~_,oo c~/n and that cn ~ #n for all n. This limit # is k n o w n as the connective constant and was first s h o w n to exist by H a m m e r s l e y and Morton [13]. Roughly speaking, cn is of order #n for large n, so that # measures the average number of possible next steps available for a long self-avoiding walk. Of course, any particular walk can have anywhere from zero to 2d - 1 possible next steps. It is not hard to see that d n ~ cn ~_ (2d)(2d - 1) n - l , and hence d _< # _< 2d - 1. The lower b o u n d is due to the fact that walks which only take steps in positive coordinate directions are surely self-avoiding. The upper b o u n d follows from the observation that the set of all walks having no immediate reversals (steps which reverse their predecessor) includes all self-avoiding walks and can be counted by noting that such a walk has 2d choices for its first step and 2d - 1 for each subsequent step. Improving these estimates on # is an activity that some people enjoy, and Table I gives the current status for dimensions 2 through 6. The estimates of the precise value of # are obtained from series extrapolation methods: Values of c,~ are enumerated on a computer (a nontrivial job) for n as large as can be managed, and then # is estimated from these data. For d = 2 the current world record is n = 39, but this is changing rapidly. Table 2 gives the k n o w n values in two dimensions. The precise value of # is not k n o w n in any dimension. Early guesses that in two dimensions # equals 1 + v ~ = 2.4142... or e = 2.7182... have been ruled out by the rigorous bounds. There is one interesting exception where # is believed to be k n o w n exactly: For the 2dimensional honeycomb lattice (Fig. 2) there is strong evidence [26] from physical arguments that # = 2 v / ~ v~. This intriguing value has been confirmed numerically, but not yet by a rigorous proof.
Table2. Values of c,~ on the 2-dimensional square lattice. The most recent additions to this table are from [5]. n
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Cn
4 12 36 100 284 780 2 172 5 916 16 268 44 100 120 292 324 932 881 500 2 374 444 6 416 596 17 245 332 46 466 676 124 658 732 335 116 620 897697 164
rl
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
Cn
2 408 806 028 6 444 560 484 17 266 613 812 46 146 397 316 123 481 354 908 329 712 786 220 881 317 491 628 2 351 378 582 244 6 279 396 229 332 16 741 957 935 348 44 673 816 630 956 119 034 997 913 020 317 406 598 267 076 845 279 074 648 708 2 252 534 077 759 844 5 995 740 499 124 412 15 968 852 281 708 724 42 486 750 758 210 044 113 101 676 587 853 932
Because of trapping, it is not immediately apparent even that cn < cn+l for all n, although this strict monotonicity has been p r o v e d recently [27]. In fact, one w o u l d expect from the definition of the connective constant that cn+l/Cn approaches the limiting value of p as n --* co, but this has only been proved for d > 5 and remains open for d = 2, 3, 4. (The case of d = 1 is trivial for selfavoiding walks.) Thirty years ago Kesten [19] s h o w e d that limn-,oo Cn+2/Cn = #2 in all dimensions, using a sophisticated a r g u m e n t based on a "pattern theorem," but his proof doesn't w o r k if the subscript n + 2 is replaced byn+l.
The Mean-Square Displacement The standard measure of the average end-to-end distance of an n-step self-avoiding walk or p o l y m e r is the mean-square displacement, which is d e n o t e d b y (R 2) and is defined as the average of the squared Euclidean distance b e t w e e n the endpoints of a walk. The average is taken over all possible n-ste p self-avoiding walks, with each walk equally weighted. For the simple r a n d o m walk, which is a M a r k o v process, an elementary probability argument 1 shows that the analogue of the mean-
1 Let Xi (i = 1,2, 3,...) be independent and identically distributed random variables such that X~ takes values equal to the 2d positive and negative unit vectors in Z d with equal probabilities 1/2d. The analogue of (R 2) for the simple random walk is the variance of the random variable X1 + ' " + Xn. Because the variance of a sum of independent random variables is the sum of the variances and because the variance of each X~ is unity, the analogue of (Ran) is n.
square displacement is exactly equal to n. In contrast, for the self-avoiding walk the most "obvious" b o u n d s on the mean-square displacement remain u n p r o v e n in low dimensions. For a lower bound, it seems clear that the self-avoidance constraint should force the self-avoiding walk to m o v e a w a y from its starting point at least as fast as the simple r a n d o m walk, and hence that (R 2) > O(n). But it remains an open problem to prove this in dimensions 2, 3, and 4. Nor has an u p p e r b o u n d of the form (R~) _< O(n 2-') been p r o v e d in these dimensions for positive e, even t h o u g h it seems obvious that the ballistic n 2 behaviour cannot be typical above one dimension.
Critical Exponents Chemists and physicists can't always afford to wait for rigorous proofs w h e n they need results, and this is the case for the self-avoiding walk. They n o w have precise conjectures about the b e h a v i o u r of the n u m b e r of n-step walks and of the mean-square displacement, and more. Actually some physicists and chemists m a y be surprised to see their results reported as conjectures, as they regard them rather as facts. W h a t e v e r w e call them, they are almost certainly correct. The consensus is that there are critical exponents "~and v, and amplitudes A and D, such that c,~ .., A#nn "-1, (1)
(R2n) ,'~ Dn 2v.
(2)
Here the symbol ,,~ means that the left side is asymptotic to the right side as n --* 0% in the sense that their ratio has limiting value of unity. These critical exponents are the subject of a considerable b o d y of research, not least because they are believed to be universal. Universality means that they should dep e n d on the spatial dimension and on almost nothing else. For example, the exponents are believed to have the same value on all 2-dimensional lattices such as the square or honeycomb. This will certainly not be the case for the connective constant, which describes the average n u m b e r of next steps available for a long self-avoiding walk and, hence, depends greatly on the lattice. Because they are believed to be universal, the critical exponents are physically more meaningful than the connective constant. Their d e p e n d e n c e on dimension makes it natural to s t u d y the self-avoiding walk in general dimensions, not just in dimensions 2 and 3 where the application to p o l y m e r s is most clear. N u m e r o u s tools have been used to arrive at relations (1) and (2). Important confirmation has been p r o v i d e d b y numerical studies, including extrapolation of exact e n u m e r a t i o n data and M o n t e Carlo simulation. The renormalization group m e t h o d has provided a powerful formal tool in the u n d e r s t a n d i n g of these relations. Application of the renormalization group has been facilitated by a remarkable connection between the selfTHE MATHEMATICAL
INTELLIGENCER VOL. 16, NO. I, 1994
31
avoiding walk and the theory of ferromagnets, which provides a link with the general theory of critical phenomena and phase transitions. In this approach it has proved useful to employ a representation in which the spatial dimension d can be replaced by a complex variable and to study the behaviour of the model as a function of this upgraded dimension. Conformal field theory (the physicist's, not the algebraist's, field theory) has played an important role in two dimensions. Relations (1) and (2) were recently proved for dimensions 5 and higher by Hara and Slade [15, 16], with the values ~ = 1 and L, = 1/2. These are the same values as for the simple random walk, which indicates that the selfavoidance constraint is not playing a terribly dramatic role in high dimensions. It is, however, known that D is strictly greater than 1 for d _> 5: The self-avoidance constraint does push the walk away from the origin faster than simple random walk, but only at the level of the amplitude and not at the level of the exponent. For d = 5 the bounds 1.098 _< D < 1.803 give limits on this effect; as d --* c~ it is known that D approaches the corresponding simple random walk value of 1. The proof of these results relies on an expansion known as the lace expansion, first introduced by Brydges and Spencer [4]. This expansion has its roots in the cluster expansions of statistical mechanics and constructive quantum field theory. Brydges and Spencer used the lace expansion to study the weakly self-avoiding walk in more than four dimensions. The weakly self-avoiding walk involves a measure on simple random walks in which all self-avoiding walks receive the same weight, whereas walks which intersect themselves receive a slightly smaller weight--there is a small penalty for each intersection. The weakly self-avoiding walk is believed on the basis of the renormalization group to have the same critical exponents as the usual self-avoiding walk whenever the penalty for intersections is strictly positive (an example of universality), and the small parameter helps a lot with convergence issues. The HaraSlade proof involves instead using the inverse dimension as a small parameter. More precisely, the small parameter is proportional to (d - 4) -1, which is a serious but surmountable hindrance when d = 5. It is perhaps not surprising that high dimensions are easier to treat because in high dimensions it is more difficult for a walk to run into itself, and, therefore, the self-avoidance constraint is weaker. Four dimensions is the borderline case here, as is well known from the theory of simple random walks [21]. For example, the probability that two independent simple random walks of length n do not intersect remains bounded away from zero as n ~ oc for dimensions d > 4, but not for d _< 4. Nevertheless, it is still a nontrivial problem to treat the high-dimensional case, and the proof given in [15, 16] is long, technical, and computer-assisted. For d = 4 it is believed on the basis of physical and numerical arguments that the critical exponents are the 32 mE MATHEMATICAL INTELLIGENCERVOL.16,NO.1,1994
same as for d _> 5, but that the right sides in relations (1) and (2) should be modified by the insertion of factors equal to the fourth root of the logarithm of n. This is a subtle deviation from simple random walk behaviour, and four dimensions is particularly amenable to renormalization group analysis. Indeed, in the physics literature [22] one finds that three dimensions can be treated by expanding about four dimensions in a complex parameter e = 4 - d and then setting e = 1. This e-expansion, introduced 20 years ago by Wilson and Fisher, has become a standard tool in the renormalization group. Although it remains unclear how to apply such arguments to study d = 3 in a mathematically rigorous way, important steps have recently been taken in making renormalization group arguments rigorous for the weakly selfavoiding walk in four dimensions [2, 3, 18]. For d = 3 it is believed that ~/is about 1.162 and v is about 0.588, whereas for d = 2 it is believed that "y is exactly equal to the astonishing value 43/32 and v is exactly equal to 3/4. These 2-dimensional values were first predicted by Nienhuis [26] on the basis of connections with the theory of ferromagnetism. Numerical work leaves little room for doubt that these values are correct, but a proof is lacking. It will likely be some time before things are completely settled in a mathematically rigorous fashion in dimensions 2, 3, and 4. The 30-year-old bounds
#n exp[Cnl/2], cn <_ #n exp[Cn2/5 log n], #n exp[Cnl/3 log n],
d=2 d= 3 d = 4,
(3)
which are due to Hammersley and Welsh [14] and Kesten [20], have not been improved, nor has the submultiplicativity bound cn _> #n been improved in dimensions 2, 3, or 4. The bounds (3), whose proofs have at their heart an elegant argument relying on submultiplicativity, are still a long w a y from (1). And as has already been mentioned, the situation is even more embarrassing for the mean-square displacement.
Monte
Carlo Methods
Suppose you want to measure the mean-square displacement on a computer, to check the asymptotic relation (2). One w a y is to compute (R 2) exactly for as many small values of n as you can and then extrapolate, and a lot of work has been done in that direction [12]. Another w a y is to do a Monte Carlo experiment. In a Monte Carlo experiment, you first fix a value of n and generate a lot of (hopefully) independent examples of n-step self-avoiding walks, and then take the average of the square displacements of this sample. If your sample is "typical," then the measured average should be a good approximation to the true mean-square displacement (in which the average is taken over all n-step self-avoiding
walks and not just your particular sample). The process can be repeated for several different values of n, and the results fitted to (2). Interesting algorithmic issues arise. H o w do you generate a random sample of 100-step self-avoiding walks? One way that jumps to mind is to do the following. Start by constructing the first step, by picking a neighbour of the origin at random. Then proceed inductively by choosing your next site at random from those neighbours of your current site which have not previously been visited, until you have a 100-site walk. If in this process you become trapped, so that any next step would force an intersection, then discard the walk and start over. This is certainly a w a y of generating 100-step walks, but unfortunately they will have the wrong probability distribution. This can be easily seen by looking at a specific example: Consider 4-step walks in two dimensions. The walk NEEE, where N denotes a step to the north (upwards) and E denotes a step to the east (to the right) has probability 1 x ~1 x ~1 x ~1 = i-~, 1 whereas the walk NESS has probability 88x 89x 89x 1 = 1 . Because these two walks have different probabilities of being generated by the algorithm, taking averages over walks produced by this algorithm will not be the same as taking averages 9"with respect to walks with the desired uniform distribution, in which all walks of the same length are equally likely. Hence, measured quantities like the mean-square displacement cannot be trusted if the walks are generated by this algorithm. Indeed, there are good reasons to believe that the distribution of walks generated by this algorithm (sometimes called "true" or "myopic" selfavoiding walks) has very different properties from the uniform distribution of the self-avoiding walk. A second natural method is to begin with a particular 100-step walk, such as a straight line, and then to perform a sequence of random modifications of the walk to generate new walks. Perhaps the simplest such modification would be to perform local moves, which just change a few (say up to k) contiguous sites of the walk. For example, a small subwalk could be chosen randomly from within a long walk and then replaced by a different self-avoiding subwalk having the same length and endpoints (unless the original subwalk occurs at the beginning or end of the original walk, in which case the unattached endpoint can move). If a self-avoiding walk is produced, then the new walk is kept; otherwise the new walk is rejected and we try again. The set of rules which are used to say how replacements are made defines a specific algorithm. This is usually arranged in such a w a y that the procedure is reversible, which means that the probability of a particular transition being made is equal to the probability of the reverse transition., Unfortunately, all such algorithms suffer from the following fundamental drawback. A theorem due to Madras and Sokal [24] states that for any fixed k, any such algorithm can explore only an exponentially small subset of all n-step self-avoiding walks. More precisely,
given a reversible local algorithm in which each move changes up to k contiguous sites, define an (k) to be the maximum, over all possible n-step initial walks, of the number of walks which can be reached from the initial walk by applying any number of allowed moves to the initial walk. The Madras-Sokal theorem states that for any fixed k, limsup an(k) 1/n < #. n ----~ o o
Because the number of walks is at least # n this means that the algorithm explores only an exponentially small subset of all walks. The proof of this result relies on Kesten's pattern theorem. A pattern is defined as a self-avoiding walk of some fixed length. A proper pattern is a pattern that can occur arbitrarily often in some self-avoiding walk, or, in other words, one which does not inherently entail traps. The pattern theorem says essentially that any given proper
AI andar se hace camino, y al volver la vista atr~s se ve la senda que nunca se ha de volver a pisar. Antonio Machado (As you go you make the way, and looking back, you see the path your feet will never tread again.)
pattern must appear often on all but an exponentially small subset of self-avoiding walks. The theorem about the local algorithms then follows by exhibiting a specific pattern which is frozen, or, in other words, cannot be changed by a local move. The number of occurrences of the frozen pattern remains unchanged under the algorithm. If there are few occurrences of the pattern, then only a small subset of all self-avoiding walks can be explored, by the pattern theorem. On the other hand, if there is a large number of occurrences of the pattern, then again only a small subset of all self-avoiding walks can be explored as only the subwalks which are not part of the occurrences of the frozen pattern are able to change. An example in two dimensions of a proper pattern which is frozen under local moves changing up to k contiguous sites is shown in Figure 3. A similar pattern has been written down in three dimensions. This has not been done in higher dimensions, and, hence, the theorem has not been completely proved for d > 4, but this could THEMATHEMATICAL INTELLIGENCERVOL.16,NO.1,1994 33
.......
i
.......
l
Figure 3. A frozen pattern for local moves.
Figure 4. A tree.
likely be done with some effort. The pattern in Figure 3 is the (10k + 39)-step walk Nk+2W3Sk+ 1ENkESk +I W3Nk +3E9Sk +3W3N k+l 9ESkENk+IW3S k+2. An algorithm which does explore the entire sample space of all n-step self-avoiding walks is the pivot algorithm. In this algorithm, the first step is to choose a site at random on a self-avoiding w a l k thereby dividing the walk into two pieces. Treating this site as the origin of the lattice, one of the pieces is then acted upon by a random lattice symmetry, namely, reflection or rotation. This has the drawback that very often the resulting walk will not be self-avoiding and the trial will therefore be rejected, but this is compensated by the fact that the resulting walk will typically be quite different from the original walk, facilitating a rapid exploration of all corners of the sample space. The pivot algorithm has been studied by many people and used with great success; the most in-depth analysis is in [25].
Related Problems
Not all polymers are linear. Some like to form branches but still remain self-avoiding, and these branched polymers are modelled by trees in the lattice (Figure 4). A tree in the lattice is defined to be a finite connected set of nearest-neighbour bonds having no closed loops, where a nearest-neighbour bond is a unit line segment joining two neighbouring sites. One can ask similar questions about these, such as how many n-bond trees are there in the lattice, and what is the average radius of gyration of these trees (assuming they are equally likely)? These questions are harder and less studied than the corresponding selfavoiding walk questions 9 Another related problem concerns lattice animals. (See Figure 5.) A lattice animal is a finite connected cluster of 34 THEMATHEMATICAL INTELLIGENCERVOL.16,NO.1,1994
Figure 5. A lattice animal.
bonds in the lattice, which, unlike a tree, is permitted to form closed loops. If in two dimensions you replace each bond by the square having the bond as diagonal, then a lattice animal models a connected planar beast composed of unit cells. H o w many are there, and how big are they typically? It is believed that there are critical exponents for trees and animals analogous to those in relations (1) and (2). The most detailed theorems for trees and animals are also for high dimensions, this time (strictly) above eight dimensions, where things appear to be simpler. Very roughly speaking, trees and animals like to be &dimensional objects, and life is therefore easier in more than eight dimensions because two &dimensional objects generically do not intersect in more than eight dimensions. Lattice animals are basic in the theory of percolation. In percolation each nearest-neighbour bond in the infinite lattice is assumed to be "occupied" with probability p and "vacant" with probability I - 19.Here p is fixed for all bonds and the occupation status of different bonds is independent 9 The set of occupied bonds decomposes in a natural w a y into connected clusters, and each cluster will be a lattice animal or its infinite generalization. Percolation provides a model of a porous medium: The bonds which are occupied correspond to pores which admit the
flow of fluid. These pores are to be t h o u g h t of as microscopic in size, so that fluid flow on a macroscopic scale requires the existence of an infinite connected cluster of occupied bonds. It is k n o w n that for all dimensions greater than or equal to 2 there is a phase transition in this model: There is a critical value pc = pc (d) lying strictly b e t w e e n 0 and 1, such that for p < Pc the probability is 0 that there exists an infinite connected cluster of occupied bonds, whereas for p > pc this probability is 1. In other words, if the bond density p is less than Pc, then there is certainly no fluid flow on the macroscopic scale, but as soon as p is above Pc there certainly is such flow. For any p for which there is an infinite occupied cluster, it is k n o w n that with probability I the infinite cluster is unique: There cannot be more than one. Simulations and physical arguments suggest very strongly that at pc itself there can be no infinite cluster of occupied bonds, but this has been p r o v e d rigorously only for d = 2 and for very high d. There are several critical exponents that can be defined for percolation; for example, one exponent is defined in terms of the rate of divergence of the expected size of the connected cluster containing the origin of Z a, as p 7 Pc. This time it is above six dimensions that existence of critical exponents is best u n d e r s t o o d - - t h e r e is still no rigorous proof of their existence otherwise, although the numerical and other evidence leaves little room for d o u b t that they are out there waiting. Critical exponents and the nature of percolation at and near pc is a subject which is attracting great attention in the probability c o m m u n i t y these days; a good introduction is [7].
Acknowledgments This article was written during a visit to the Department of Mathematics of the University of Virginia, and I am grateful to David Brydges and the D e p a r t m e n t for their hospitality. I thank David Brydges, Takashi Hara, Neal Madras, and Alan Sokal for teaching m e about selfavoiding walks and other topics. I also thank the Natural Sciences and Engineering Research Council of Canada for its s u p p o r t u n d e r grant A9351.
References 1. S. E. Alm, Upper bounds for the connective constant of self-avoiding walks, to appear in Combinatorics, Probability and Computing 2 (1993). 2. D. Arnaudon, D. Iagolnitzer, and J. Magnen, Weakly selfavoiding polymers in four dimensions. Rigorous results, Phys. Lett. B 273 (1991), 268-272. 3. D. Brydges, S. N. Evans, and J. Z. Imbrie, Self-avoiding walk on a hierarchical lattice in four dimensions, Ann. Probab. 20 (1992), 82-124. 4. D. C. Brydges and T. Spencer, Self-avoiding walk in 5 or more dimensions, Commun. Math. Phys. 97 (1985), 125-148.
5. A. R. Conway, I. G. Enting, and A. J. Guttmann, Algebraic techniques for enumerating self-avoiding walks on the square lattice, J. Phys. A: Math. Gen. 26 (1993), 1519-1534. 6. A. R. Conway and A. J. Guttmann, Lower bound on the connective constant for square lattice self-avoiding walks, J. Phys. A: Math. Gen. 26 (1993), 3719-3724. 7. G. Grimmett, Percolation, Springer-Verlag, Berlin, 1989. 8. A.J. Guttmann, private communication. 9. A. J. Guttmann, On the zero-field susceptibility in the d = 4, n = 0 limit: analysing for confluent logarithmic singularities, J. Phys. A: Math. Gen. 11 (1978), L103--L106. 10. A.J. Guttmann, Correction to scaling exponents and critical properties of the n-vector model with dimensionality > 4, J. Phys. A: Math. Gen. 14 (1981), 233-239. 1I. A.J. Guttmann and I. G. Enting, The size and number of rings on the square lattice, J. Phys. A: Math. Gen. 21 (1988), L165-L172. 12. A.J. Guttmann and J. Wang, The extension of self-avoiding random walk series in two dimensions, J. Phys. A: Math. Gen. 24 (1991), 3107-3109. 13. J. M. Hammersley and K. W. Morton, Poor man's Monte Carlo, J. Roy. Stat. Soc. B 16 (1954), 23-38. 14. J. M. Hammersley and D. J. A. Welsh, Further results on the rate of convergence to the connective constant of the hypercubical lattice, Quart. J. Math. Oxford (2) 13 (1962), 108-110. 15. T. Hara and G. Slade, The lace expansion for self-avoiding walk in five or more dimensions, Rev. Math. Phys. 4 (1992), 235-327. 16. T. Hara and G. Slade, Self-avoiding walk in five or more dimensions. I. The critical behaviour, Commun. Math. Phys. 147 (1992), 101-136. 17. T. Hara, G. Slade, and A. D. Sokal, New lower bounds on the self-avoiding-walk connective constant, J. Stat. Phys. 1993, 479-517. 18. D. Iagolnitzer and J. Magnen, Polymers in a weak random potential in dimension four: rigorous renormalization group analysis, to appear in Comm. Math. Phys. 19. H. Kesten, On the number of self-avoiding walks, J. Math. Phys. 4 (1963), 960-969. 20. H. Kesten, On the number of self-avoiding walks. II. J. Math. Phys. 5 (1964), 1128-1137. 21. G. F. Lawler, Intersections of Random Walks, Birkh/iuser, Boston, 1991. 22. J.C. Le Guillou and J. Zinn-Justin, Accurate critical exponents from field theory, J. Phys. France 50 (1989), 1365-1370. 23. N. Madras and G. Slade, The Self-Avoiding Walk, Birkh/iuser, Boston, 1993. 24. N. Madras and A. D. Sokal, Nonergodicity of local, lengthconserving Monte Carlo algorithms for the self-avoiding walk, J. Stat. Phys. 47 (1987), 573-595. 25. N. Madras and A. D. Sokal, The pivot algorithm: A highly efficient Monte Carlo method for the self-avoiding walk, J. Stat. Phys. 50 (1988), 109-186. 26. B. Nienhuis, Critical behavior of two-dimensional spin models and charge asymmetry in the Coulomb gas, J. Stat. Phys. 34 (1984), 731-761. 27. G.L.O'Brien, Monotonicity of the number of self-avoiding walks, J. Stat. Phys. 59 (1990), 969-979.
Department of Mathematics and Statistics McMaster University Hamilton, Ontario, Canada L8S 4K1 e-mail:
[email protected] THE MATHEMATICAL INTELLIGENCER VOL. 16, NO, 1,1994
35
36
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 (~1994 Springer-Verlag New York
David Gale* For the general philosophy of this section see Vol. 13, No. 1 (1991). Contributors to this column who wish an acknowledgment of their contributions should enclosea self-addressedpostcard.
Our lead item for this issue was contributed by Jim Propp.
Further Ant-ics
Jim Propp The Industrious Ant (see this column, Spring 1993) has some sisters and cousins that are even more interesting. These generalized ants m o v e from cell to cell in an infinite square grid. At any given m o m e n t , each cell in the grid is in s o m e particular state; these states will be n u m b e r e d 0 t h r o u g h n - 1, where n is the n u m b e r of allowed states. W h e n an ant passes t h r o u g h a cell that is in state k, the state of the cell changes from k to k + 1 m o d u l o n and the ant then leaves the cell, m a k i n g either a left or a right turn relative to the direction it traversed in arriving at the cell. The ant is not free to choose which w a y it will go (right versus left); it must proceed in accordance with a rule-string of length n that is fixed for all time. The rule-string consists of n bits, n u m b e r e d 0 through n - 1. W h e n the ant is leaving a cell w h o s e state used to be k (and is n o w k + 1 m o d u l o n), it turns right if ra is I and left if rk is 0, where rk is the kth bit of the rulestring. This generalization of the ant seems to have been first considered b y Greg Turk [1] and, independently,
*Column editor's address: Department of Mathematics, Universityof California, Berkele~ CA 94720, USA.
by Bunimovich and Troubetzkoy [2] w h o were building u p o n earlier work of E. G. D. Cohen [3]. We m a y as well focus o u r attention on rule-strings that begin with a 1, as c o m p l e m e n t i n g all the bits in a rulestring simply interchanges "left" and "right" and thus gives a mirror-image universe that is not essentially different from its twin. We will interpret a rule-string that starts with a 1 as the base-2 representation of a natural number. For instance, Langton's original rule will be called rule 1 0 or rule 2. It is easy to see that rule 1 is trivial and causes an ant to travel endlessly a r o u n d a 2 x 2 square. For the same reason, rules 1 1, 1 1 1, 1 1 1 1, and so forth are also trivial. These generalized ants constitute a special case of the "tur-mites" studied b y Greg Turk and others, which were described in A. K. D e w d n e y ' s article "Two-dimensional Turing machines and tur-mites m a k e tracks on a plane" (see [1]). Tur-mites are so general that they include Turing machines as a special case; consequently, it is nearly impossible to prove any general theorems about tur-mite behavior. For the ants, however, one can prove at least one result, namely: T H E O R E M . An ant's trajectory is always unbounded, provided that the rule-string contains at least one 0 and at least one 1. The proof is the same as the one given in the earlier Mathematical Entertainments column. There are m a n y questions w e can ask about generalized ants. The one I'll discuss here is, "What h a p p e n s w h e n y o u start an ant in a universe in which all cells are originally in state 0?" We'll a s s u m e that the ant's initial heading is southward.
THE MATHEMATICALINTELLIGENCER VOL. 16, NO. 1 (~)1994Springer-Verlag New York 3 7
Figure2.
Figure1. Figure 1 shows the state of the universe after an ant with rule-string 1 0 has been wandering around for 11,000 steps in a universe in which all of its cells were initially in state 0. State 0 is drawn in white; state 1 in black. We can see a highway forming in a northwesterly direction. (This highway has the same structure as the one shown in Fig. 5 of David Gale's column in the Spring 1993 issue; it is merely rotated by 90 ~. The same picture appears in Dewdney's article.) To handle ants with rule-strings of length n > 2, I'll let white represent state 0, black represent state n - 1, and intervening shades of gray represent the intermediate states. Ant 4, like ant 2, starts out by creating various symmetrical patterns (such as the one shown in Fig. 2, which comes into existence at the 236th step); these patterns tend to possess bilateral symmetry, unlike the patterns 38 THEMATHEMATICAL INTELLIGENCER VOL. 16, NO. I, 1994
created by ant 2, which have 180 ~ rotational symmetry. The ant then stops behaving symmetrically and creates a chaotic jumble, as shown in Figure 3 (step 100,000). Is some sort of highway eventually formed? I don't know. I've tracked it for over 150,000,000 steps without seeing signs of any sort of clear pattern. Ant 5 is even more like ant 2, in that it favors twofold rotational symmetry; its crowning accomplishment is the pattern shown in Figure 4, created after 616 steps. However, after that, the pattern breaks, and after 150,000,000 steps one sees a configuration with no signs of any reemerging structure. Nevertheless, some surprising statistical patterns appear. For instance, I looked at a central 21 x 21 square in the middle of the configuration and found only 79 cells in state 0, as compared with 190 l's and 172 2's. What is the cause of this fluctuation? Ant 6, in contrast to ants 4 and 5, is very t a m e - - even
F~ure3. tamer than ant 2. After just 150 steps, one can see a sort of highway forming to the southwest (Fig. 5). Unlike the highway formed by ant 2, which has a "period" of 104 (that is, it takes 104 time-steps for the ant to build each successive piece of highway), the highway formed by ant 6 has a period of only 18. Moreover, experiments show that even if one modifies the initial state of the universe by sprinkling a few l"s and 2's among the O's, highways tend to form extremely quickly. Ants 8 through 14 reveal new phenomena. Ants 8 and 14 are the fully "chaotic" ones; the patterns they build, like those built by ants,4 and 5, show no signs of global structure, though each one has distinctive local motifs, especially along the boundary of the growing cloud of chaos. Ant 10 (with rule-string 1 0 1 0) is just an elaborated version of ant 2 (with rule-string I 0); more generally, a rule-string that consists of two or more repetitions
F~meS.
F~e4.
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
39
of a shorter rule-string will lead to the same behavior as the shorter string. Ant 13 starts out chaotic, but after roughly 250,000 steps it starts building a highway of period 388. Ant 14 is a curious hybrid of ants 2 and 6; like ant 2, it builds a highway of period 52, but the highway looks very much like the one built by ant 6. Are these two similarities (one numerical and one pictorial) merely coincidental?
Figure 6.
Figure 7.
INTELUGENCERVOL.16,NO.I, 1994 40 THEMATHEMATICAL
Ants 9 and 12 are the truly surprising ones. In each case, the patterns gets ever larger, but without ever getting too far away from bilateral symmetry! More specifically, one finds that the ant makes frequent visits to the cell it started from, and when it does, the total configuration quite often has bilateral symmetry at the instant that the ant arrives at the starting cell. This phenomenon was first noted by Greg Turk. Figure 6 shows ant 12 after 16,464 steps. (This portrayal of M.I.T.'s mascot, the beaver, is offered in appreciation for the use of M.I.T.'s facilities during the preparation of this article.) Figure 7 shows the same ant after 186,848 steps, and Figure 8 shows ant 9 after 38,836 steps. (For a picture of this configuration at a later stage, see page 182 in [1].) As this article is going to press, Greg Turk informs me that Bernd R6mmler has proved that ant 12 builds ever-larger bilaterally symmetric patterns for all time. If we proceed to look at five-bit rule-strings (corresponding to universes with five states), the major new behavior we find is exemplified by ant 27, which builds an ever-increasing spiral, as shown in Figure 9. We do not, however, find any ants that build ever-larger bilaterally symmetrical patterns, like ants 9 and 12. To find more
Figure 8.
Figure 9. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
41
ants of this sort, we have to move on to rule-strings of length 6. Here we encounter another mystery: the rulestrings of length 6 that lead to bilaterally symmetrical patterns are 33, 39, 48, 51, 57, and 60. Note that all these numbers are divisible by three! Surely this cannot be an accident. The study of generalized ants could easily become a game between two teams. On the one team are the "theorists," who will try to develop general rules describing what long-term behaviors are possible and which ones will occur given a particular ant rule and a particular initial state of the universe. On the other team are the "engineers," who will try to devise patterns in the ant universe that can be used as building blocks for a general-purpose computer. If the engineers succeed in this, then it will follow that the theorists' goal is in some sense unreachable, just as in the case of Conway's game of Life (see Chapter 25 of [4]). It might even be the case that a simple question like "Does ant 4 ever build a highway?" is unprovable in your favorite axiomatic basis for mathematics (ZFC or whatever). Readers are encouraged to play around with ants on their own and draw their own conclusions. I've written a C program which I'll be glad to send to anyone interested. Thanks to L. A. Bunimovich, E. G. D. Cohen, X. E Kong, Chris Langton, Bruce Smith, S. E. Troubetzkoy, Greg Turk and Fei Wang, on whose work this article draws.
go part of the way. The solution is to allow the jeep to make preliminary forays part-way into the desert for the purpose of depositing various amounts of fuel at "depots" along the wa~ thus allowing it to refuel as needed in the final trip. Can the jeep cross an arbitrarily long desert in this way, and, if so, how can this be done so as to consume the minimum amount of fuel? A complete solution to this problem was given by N. J. Fine in 1947. Since then, numerous variants of the problem have been treated. In particular, your obedient servant noted in 1970 that if the problem was to send not one but n jeeps across, then the minimum cost was strictly less than n times the cost for a single jeep (American Math. Monthly agreed to publish the result, and even allowed the article to be subtitled "Jeeper by the Dozen"). In that article, I listed as unsolved the seemingly natural problem of the most economical way for a jeep to make a trip across the desert and b a c k assuming that fuel was available at both ends. I'm happy to announce that now, 23 years later, this problem has been solved by Alan Hausrath of Boise State University, and Bradley Jackson, John Mitchem, and Edward Schmeichel of San Jose State. Their solution is quite elegant, and I will attempt here to describe it qualitatively. Before doing so, however, I will look one more time at the original jeep problem, whose solution, as others have noted, can be made quite plausible by some common sense arguments. First, it turns out to be more convenient to consider the clearly equivalent problem of calculating the maximum distance a jeep can go on x tankloads of fuel, and we shall adhere to this formulation from now on. In thinking about the problem, I suspect most people References would picture the jeep scurrying back and forth between 1. A.K. Dewdney, Computer recreations, Scientific American the starting point and the various depots, depositing or picking up fuel as it goes. Suppose, however, that each (September 1989), 180-183; follow-up (March 1990), 121. 2. L.A. Bunimovich and S. E. Troubetzkoy,"Rotators, period- time the jeep returned to the home base it was replaced icity,and absence of diffusion in cycliccellular automata"; by a new jeep for the next outward foray. This surely Journal of Statistical Physics, 74 (January 1994). would not affect the problem. Further, the new jeep could 3. E. G. D. Cohen, New types of diffusion in lattice gas celbe driven by a new chauffeur. But if that were the case, lular automata, in "Microscopic Simulations of Complex HydrodynamicPhenomena," M. Mareschal and B. L. Ho- one could save a lot of time, because there would be no lian ed., Plenum Press, 1992. reason for each jeep to wait for the return of the preceding 4. E. R. Berlekamp, J. H. Conway and R. K. Guy, "Winning one before setting out. They could all leave the starting Ways," Academic Press, 1982. line and travel together as a convoy, the purpose of all but one of the jeeps being to refuel the others. Figure I is a schematic bird's-eye view of a four-jeep convoy which Department of Mathematics has set out from S in the direction of E Massachusetts Institute of Technology We choose the unit of distance to be the distance a Cambridge, MA 02139-4307, USA jeep can travel on a tankload of fuel. In the figure, the plaid or superjeep, d*, is supposed to make the trip. The others are the refuelers. The refuelers are required to go back to the starting line which is equivalent to assuming that they consume twice as much fuel per unit distance as The Return of the Jeep the superjeep. The shaded portion of the jeeps represents No doubt many readers know about the so-called jeep the fuel remaining in their tanks after they have gone a problem, but in case some of you may have forgotten-- distance x, so the unshaded portion represents x units of the problem is to get a jeep across a desert, the difficulty fuel for the superjeep and 2x for the others. At the time being that the jeep is only able to carry enough fuel to of Figure 1, Jeep #1 is about to give up all of its fuel,
42 THEMATHEMATICALINTELLIGENCERVOL.16,NO.I, 1994
1 - 2x units, to exactly fill up the other three, so we have 1 - 2x = 5x or x = 1/7. The four-jeep problem has n o w become a three-jeep problem. It is n o w easy to see that using this method, n jeeps can travel a distance 1
1
1 + g + g + . 99+
1
2n----Z~.
In case the given a m o u n t of fuel is not an integer, let f be the fractional part and add an extra refueling jeep to the convoy to carry the extra f units. A calculation like the one above shows that one gains an additional f/(2n + 1) units of distance. For future use, define D(x) to be the m a x i m u m distance a jeep can go on x units of fuel. Then 1
1
f
D(x) = 1 + -~ + . . . + 2n~-1 + 2n +--~'
(*)
where f is the fractional part of x and n = x - f. The problem in which the jeep is required to cross the desert and return is called the round-trip problem, and here the formula is even simpler, being 1/2 + 1/4 + . . . + 1/2n, as follows easily from the convoy formulation. Likewise, for the k- (or dozen) jeep problem the same model shows t h a t on n tankloads of fuel, n > k, the k jeeps can go distance 1 + 1/(k + 2) + - . . + 1/(2n - k). Of course, in all these cases some further argument is needed to show
that these formulas actually give the optimal distance. Before going on to describe the solution of w h a t I will call the two-way jeep problem (round-trip with fuel available at both ends), let me digress for a m o m e n t to consider a more general convoy problem. Instead of assuming all jeeps are alike, we consider different kinds of jeeps. A jeep is characterized by two numbers, its capacity, C = the number of liters it can carry, and its fuel efficiency, E = the number of kilometers it can travel on a liter of fuel. Problem: Given n jeeps, each with its o w n C and E, w h a t is the longest desert that can be crossed? I believe there is a literature on this problem (the context being usually rockets and interstellar space rather than jeeps and deserts), but as far as I know the general problem remains unsolved, meaning there is no k n o w n "good algorithm" for finding the optimum. Presumably there is some refueling sequence which achieves the optimum, but what is it? One w a y to visualize the situation is again given by Figure 1; but this time, imagine the jeeps are rockets and the figure is in a vertical rather than horizontal plane. Then the refueling can be thought to take place continuously. Gravity causes the fuel to run out of the top rocket, keeping the lower ones full at all times. When the top tank is empty, the top rocket is abandoned and the others continue on, and the process repeats. It is conceivable that this problem is NP-hard. As with m a n y optimization problems, there is the difficulty that
Figure 1. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
43
S-fuel
'~I
F-fuel
T
I
I
~w S-fuel
S
F
Figure 2.
there seems to be no easy way to recognize when a given solution is optimal. How does one deal with this type of situation? There is one general method that sometimes works. One shows that any optimum solution must satisfy certain conditions (e.g., at an interior maximum of a differentiable function, the derivative must vanish). If one is lucky, one finds enough such conditions so that there is a unique object satisfying them, which must, therefore, be the optimum. This, in fact, is what the authors do for the case of the two-way jeep problem, which I will now attempt to describe. We first consider the problem of the longest desert which can be crossed if there are f units of fuel at S and g units at F. Some more common sense: If g ~ f, then clearly the best one can do is D(f) [given by (,)], and one uses S-fuel for the outward trip and F-fuel for the return. On the other hand, this procedure is clearly nonoptimal if g < f, for then one could do no better than D(9) and some of the S-fuel would remain unused. There must, therefore, be some way of setting up depots of S-fuel on the outward trip which can be used on the return; so imagine such depots have been set up, and suppose the one farthest from S is located at a point T as shown in Figure 2. The authors now prove that there are optimal solutions with the following highly plausible properties. Obviously on the outward trip only S-fuel can be used, but it is shown that there is an optimal solution such that (i) on the return trip only F-fuel is used in getting from F back to T, and (ii) only S fuel is used for getting from T back to S (as indicated in Fig. 2). Intuitively, it would surely seem wasteful to bring S-fuel to F (coals to Newcastle) or to bring F-fuel further back than T (if you needed the 'rT-TI~]~AATT-KI~T~AA'I"Tt'~AT
T~T~TTf~Tt"~D
'~lt"~Z
1E. I~,TL~ 1
1K3tt3L4
extra fuel, you should have left it there on the outward trip). But it turns out that these conditions actually determine the solution. Namel~ the distance dl from T to F must be D(g), because, from (ii), all of g and nothing else must be used up in going from F to T. Therefore, on the outward trip the jeep must transport g units of fuel to T. The distance d2 from S to T must be the optimal distance a round-trip jeep can go on f units of fuel if it is required to deliver g units at its destination; this is just a mild modification of the original round-trip jeep problem and is easily solved by the convoy method. Then d = dl + d2 is the distance we are looking for. Thus, the solution for the two-way problem is essentially reduced to patching together the solutions of the two original jeep problems. The authors now treat another two-way problem: given x units of fuel, how should one choose f and g optimally, where f + g = x? For 2 > x, the best one can do is put half the fuel at each end. For x > 2, the solution is, in general, not unique, but there is always a solution in which g is an integer given by g = L((x+ 1)/2)I/2J, much less than half of the fuel for large values of x, although the optimal distance traveled differs from the distance when half the fuel is at each end by no more than 1 + In 2 (independent of x). Further, the number of intermediate depots needed for the return trip turns out to be Ix1 - 2. Thus, Figure 2 represents the solution where there are between four and five tankloads of fuel available. Of course, the arguments proving the optimality of the authors' algorithm are the main content of their article. All we have done here is to describe what that algorithm is. Practical applications? Well, it provides nice material for an Entertainments column.
Jeremy Gray*
Green and Green's Functions The English mathematician George Green was born 200 years ago, on 14 July 1793, in the town of Nottingham. He led a prosperous life, inheriting a successful milling and bakery business from his father. On the strength of a mathematical essay which few, if any, of its original readers understood and the contacts he made in the town, he found his w a y to the University of Cambridge, but finding little to stimulate him there he returned to Nottingham, where he died a few weeks short of his 48th birthday, not even, it would seem, a local celebrity. Toda)~ we still know tantalisingly little of Green, the man, although this anniversary has seen the publication of as well-researched and as thorough a book on him as we are likely to get.** But his theorem and the type of functions named after him are probably used in every country, if not every institution, where mathematics is practised, from elementary particle physics to soil science. What are the ingredients of this success, and how was it ever brought about? George Green could only have learned little from the schooling he received. He would have done little better if his background had taken him at the age of 20 to Cambridge, then famously, and complacently, at a low ebb. But being born into trade, he led a different life. It seems that he fathered no fewer than seven illegitimate children, all by the faithful Jane Smith; we do not know w h y he never married. Jane took his name when he died and seems not to have suffered much disapproval from society, but some of the children felt their illegitimacy keenly. Nor do we know what captured Green for mathematics. Even local influences are hard to document, unless one agrees that John Toplis might have helped. Toplis was a Cambridge graduate who went to Nottingham in 1806 and published a translation of the
* Column Editor's address: Faculty of Mathematics, The Open University, Milton Keynes, MK7 6AA, England. **D. M. Cannel, George Green, Mathematician and Physicist, 1793-1841. ~Le background to his Life and Work, London and Atlantic Highlands, NJ: The Athlone Press, 1993.265 pp.
first two books of Laplace's Mdcanique cdleste in 1814, before returning to Cambridge in 1819 to become Dean of Queen's College. He was a well-read advocate of the methods of Lagrange, Legendre, and Lacroix, and may well have tutored Green, but there is no hard evidence. The historian can, however, see two more nebulous factors working in Green's favour. By the 1820s it was a commonplace that British science had fallen far since the days of Newton. Oxford and Cambridge, which were the only English Universities, seemed incapable of reform, although, in fact, reform at Cambridge was by then slowly underway. Those who did not belong to the Church of England, including Jews, Roman Catholics, and nonconformists, were still excluded by statute from taking a degree. Such men were often active in the booming new industries that were forging the Industrial Revolution and simultaneously turning towns like Nottingham into overcrowded slums. The prosperous among them, conscious of their own achievements and wanting to learn more, often formed local learned societies. One of these was Bromley House, later the Nottingham Subscription Library, which was founded in 1816. Even here, Green continues to elude us. He joined the Library when he was 30 and attended its scientific discussions. But recent research has shown that the Library did not contain many of the few sources Green was to cite, although we may suppose that it acted as a stimulus to his researches. The second factor that made Green's time a propitious one was the emergence of the new physics of electricity and magnetism. Experiments, often lending themselves to public displa~ led to new phenomena, and there was no satisfactory theory to explain them. Leaving experimental work to the likes of his contemporaries Ampere and Faraday, Green set himself the task of facilitating "the application of analysis to one of the most interesting of the physical sciences." The new physics called for new mathematics, and was based on intuitions of a non-Newtonian kind. Green read what he could of the relevant literature, which was not much: Laplace's Mdcanique celeste, some
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 1 9 1994Springer-Ver|agNew York 45
papers by Poisson, Biot, and Coulomb (more, in any case, than the Cambridge syllabus could have alerted him to). By 1828 he had written his first and single most important paper, entitled "An Essay on the Application of Mathematical Analysis to the Theories of Electricity and Magnetism." Even the word "analysis" in the title alerts us to the fact that Green was self-taught, for it refers to the calculus as it had become in France, not to the sterile exercises in Newtonian methods preferred in England. In the Essay, which ran to nearly 80 pages, Green had the happy idea of coining the term "potential function" to describe the effect upon a point of forces coming from a system of bodies. He considered forces arising from a distribution of something of variable density throughout a body and wrote down as an integral the total effect this had upon the point. This effect depends also on the distance of the point from the body. As Green had learned from Laplace, the potential function satisfies a partial differential equation, or, rather it satisfies two: one for points outside the body, and another for points inside. The move away from force as the central concept, as it was in Newtonian physics, and to the concept of potential was not original with Green, but the introduction of the convenient name helped the process along. He then formulated what today is called Green's theorem. This relates an integral of one function taken over a volume to another taken over the surface enclosing that volume. A mathematician sees this theorem as generalising the fundamental theorem of the calculus to several variables. A physicist sees Green's theorem as relating a flux across a surface to the quantity of material inside it. Green gave an ingenious account of the consequences of this theorem, including a reciprocity theorem that is today expressed in terms of the symmetry of a Green's function when treated as a kernel.
Green's Functions and Distributions Green's theorem is a piece of mathematics, whether pure or applied. His third and deepest insight, the introduction of what today are called Green's functions, derived from his interest in physics, because it is only there that it makes intuitive sense. He introduced the functions to solve the differential equation satisfied by a potential function. His ideas was to solve the equations in extreme cases, where the solution is easy to discover. In particular, Green considered the case where the potential was caused by a single charge at an isolated point. So the potential function satisfied Laplace's equation away from the point. Green assumed that the potential vanished on the boundary of the body, and that the potential increased as 1 / r as one neared the point charge. He then showed how to solve the equation for such a curious function by an attractive use of the consequences of Green's theorem which he had developed in the earlier part of his Essay. He was aware that his mathematics 46 THEMATHEMATICALINTELLIGENCERVOL.16,NO.1,1994
was not rigorous but gave a plausible limiting argument to show how such potential functions can be defined. Amusingl}~ such examples have often struck mathematicians as likely to make sense on physical grounds, and physicists (such as Maxwell) as likely to be amenable to rigorous mathematics. There are many other situations where a point source of influence is a natural object to study. To mathematicians, among Green's eventual readers, the example of a complex function was irresistible. Cauchy showed during the 1820s that much of their theory follows from the distribution of their poles, and later writers, such as Riemann, saw Green's theorem as the natural w a y to the Cauchy Integral theorem. To a physicist, a single force concentrated at a point might represent an impulse. A load on a beam might be concentrated at a point. These extreme situations have the advantage over more general ones of simplifying the attendant mathematics, which is w h y Green's functions have become such a powerful idea. It remains to show that there is a way of going from the solution to a problem with an extreme set of boundary conditions (say, an infinite amount of electricity concentrated at a single point) to more plausible ones (where the electricity is distributed through an entire body). Two steps suffice. First, one can solve a problem with finitely many point charges simply by adding the solutions. Second, one can pass to infinitely many point charges distributed in some w a y on replacing addition with integration. Green's Essay was not appreciated in his lifetime, although his later papers earned him a modest reputation. That the Essay became known at all was the work of William Thomson, later Lord Kelvin, who had picked up a stray reference to it and finally tracked the Essay down the day before taking his degree and leaving for Paris in January 1845. Thomson was entranced. He took the Essay to Paris and showed it to Joseph Liouville and his friend Charles Sturm. They too were excited by it, as was the German Leopold Crelle, who immediately accepted it for publication in his journal, the leading mathematical journal of the day. It was published in three instalments between 1850 and 1854. The combination of ideas, their elegance, and their clear presentation--even when rigour lay out of reach --impressed mathematicians and rapidly ensured that Green's name was henceforth securely attached to his discoveries, even though some had by then been discovered by others. From Crelle, news of the essay passed to Dirichlet, the leading mathematical analyst of his da~ and from him to Bernhard Riemann. Green's functions have played a prominent role in mathematical physics ever since. Likewise, among the physicists, William Thomson remained a staunch advocate of them, as was Maxwell. The standard equations of mathematical physics can always be treated using Green's functions. The reason is
A Green's Function for a Neumann problem (Mathematica graphics by Claude Beauchamp). that the equations are linear, and so may be said to have a family of solutions, every one of which is a sum (in a suitable sense) of what may be called the basic solutions. If a basis of solutions can be found by solving extreme cases (such as concentrating the electricity at a point), then the method of Green's functions will work, and this is usually the case. If the equations are not linear, their solutions may still fill out a cone, in which case the ones arising from the extremal problems will lie on the boundary of the cone. There is a subtle connection between Green's functions and Dirac's delta function. As Dirac clearly saw, the delta function is not a function in the mathematical sense of the term. There is no function 6t which is zero everywhere except at the point t, where it is infinite. The delta "function" is something that you use with an integral. The result is to pick out the value of a function being integrated at some point:
~
b f(x)6t dx = f(t).
The same is true of Green's functions. Whatever the physical intuition that may lead to them, they, too, are used only in integrals,What tends to happen is well illustrated by the case of Dirichlet's problem. One is to find a function that satisfies Laplace's equation on the interior of a domain and takes specified values on the boundary. The domain and boundary have to be reasonably well behaved; suppose that they are a disc and a circle.
First, a solution is found in an extreme case: The function is to be zero everywhere on the boundary except at one point, where it is infinite (making it a sort of delta function). The solution can be written down by Poisson's formula. Then an integration is used to smear out the solutions until they take the required value on the circle. The object one is integrating is the normal derivative of a Green's function for the given Dirichlet's problem. In passing in the reverse direction, from the smeared out "function" concentrated at a point, one sees the role of the delta function. It would doubtlessly have surprised Green, who was engagingly modest about his accomplishments, to see how they have been used in branches of physics so different from the new physics of his day. It is by now highly unlikely that the full range of their application will ever be known. A new lease on life came with their use in 1947 by Schwinger, Feynman, Tomonaga, Freeman Dyson, and others to solve problems in the emerging theory of quantum electrodynamics. Moreover, because Green's functions and delta functions belong mathematically not with functions but in the theory of distributions, it has become hard if not impossible to sort out Green's functions from other techniques to which they have contributed. They are flourishing in mathematics too. One basic question, recently resolved by Douady and Hubbard, is to prove that the Mandelbrot set is connected. It is, and the crucial step in the proof is the study of a family of functions defined by the Mandelbrot set. Green's functions, of course. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
47
Ian Stewart* The catapult that Archimedes built, the gambling-houses that Descartes frequented in his dissolute youth, the field where Galois fought his duel, the bridge where Hamilton carved quaternions--not all of these monuments to mathematical history survive today, but the mathematician on vacation can still find many reminders of our subject's glorious and inglorious past: statues, plaques, graves, the cafd where the famous conjecture was made, the desk where the famous ini-
tials are scratched, birthplaces, houses, memorials. Does your hometown have a mathematical tourist attraction ? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks. Please send all submissions to the Mathematical Tourist Editor, Ian Stewart.
Colin Maclaurin and Glendaruel
John Mooney The west of Scotland contains four universities-Glasgow, Strathclyde, Glasgow Caledonian, and Paisl e y - a l l lying within 7 miles of the centre of the city of Glasgow. If visiting one of these universities, or on holiday in the area, the opportunity to visit the quiet neighbouring peninsula of Cowal should not be missed. This peninsula lies to the north of the isle of Bute, to which there is access via a 5-minute car ferry crossing at scenic Rhubodach-Colintraive on the Kyles of Bute. The principal towns of Rothesay in Bute and Dunoon in Cowal are both ports of call for the world's last seagoing paddle steamer, the Waverley, which harbours at the Broomielaw in central Glasgow. The Cowal peninsula can also be accessed in 20 minutes by car ferry from Greenock/Gourock to Dunoon and by road via Loch Lomond, Loch Long, and the east shore of Loch Fyne to Strachur. About 15 miles south of Strachur is the quiet, almost hidden clachan (village) of Glendaruel, bypassed by the Strachur-Colintraive road on the east. The church at Glendaruel, dedicated to St. Modan, and the third to occupy the site, was built in 1610 and modernised in 1783. The church is the focal point of the parish of Kilmodan in the presbytery of Dunoon. The current m i n i s t e r is the Rev. David Cumming who occupies the house there. The church, which is open throughout the year, contains a monument mural dedicated to the Scottish mathematician Colin Maclaurin. The minister at the end of the 17th century was John Maclaurin, father of Colin. Both Colin and elder brother John were born and baptised at Glendaruel in 1693 and 1698, respectively.
* Column Editor's address: MathematicsInstitute, Universityof Warwick, Coventry,CV4 7AL,England. 48
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 9 1994 Springer-Verlag New York
Their father died 6 weeks after Colin's birth and their - mother in 1707, by which time they were in the care of an uncle who was a minister in Kilfinan, Argyll. There is a plaque above a picture of Colin Maclaurin on the south wall of the church, bearing the inscription: In memory of John Maclaurin, minister of Kilmodan, who died in 1698, a faithful pastor of great public spirit, literary ability and business talent. His eldest son John Maclaurin, minister of North West Parish, Glasgow, born at Kilmodan, October 1693, died September 1754. A man of extraordinary zeal and piety, the most profound and eloquent Scottish theologian of the 18th century. His youngest son, the celebrated Colin Maclaurin, M.A. ER.S., Professor of Mathematics, University of Edinburgh, born at Kilmodan, February 1698, died 14th June 1746. One of the most eminent among the mathematicians and philosophers that Great Britain has produced, he was noted for benevolence and unaffected piety and rendered distinguished service to his country and to the cause of religion. Erected by Sir James Russell, LL.D., and Lt.-Colonel A.T. Russell C.M.G., natives of Kilmodan. The picture is described: Colin Maclaurin M.A. ER.S. 1698-1746 Buried in Greyfriars Churchyard, Edinburgh In 1745, Colin Maclaurin, while professor of mathematics at the University of Edinburgh, organised the unsuccessful defense of the city against an army of the pretender, Bonnie Prince Charlie. His efforts, together with his arduous return from York in 1746, resulted in his death at the age of 48.
His precociousness was reflected in his enrolment in the University of Glasgow at the age of 11, and his graduation there at 15, for which he wrote a treatise on the power of gravity. His subsequent career was closely connected with that of Isaac Newton when, in September 1717, at the age of 19, he was elected to the chair of mathematics at the University of Aberdeen. According to the Guiness Book of Records, his position is the youngest professorial chair on record at any university. His eldest son John, born 1734, was Lord Dreghorn, a well-respected Scottish judge. Of passing interest are the preserved carved gravestones of the 15th and 16th centuries in the Lapidarium in the southwest corner of the kirkyard. The hill rising steeply above the church on the west is crossed by the single track road to Otter Ferry and it is home to several species of birds of prey. Also, the traveller with a wider scientific appreciation will find much of interest, in particular at the estuary of Loch Riddon, 2 miles south of Glendaruel, which is a designated S.S.S.I. (a site of special scientific interest) with old native broad-leaf forests, marsh plants and mosses in abundance, and a variety of seabirds and marine life inhabiting its sand fiats and shallow estuarine waters. Even from a map the surrounding topography appears most diverse and inviting.
Mathematics Department Glasgow Caledonian University Cowcaddens Road Glasgow G4OBA Scotland, U.K. TIq~' MI'ATIq~K.,[ATIC'AT TNTTI~I].I~I~.NTC'ERVOL. 16. N O . 1.1994
4~
Oswald Veblen A. F. M o n n a
Some remarks on O. Veblen in this magazine (vol. 13, no. with the name Veflen. It appeared that the name Veflen 3, summer 1991, p. 7) reminded me of interesting math- was still a family name on several farms (=seter): it is a ematical incidents during one of our holidays passed in common name in the region. In the Folkemuseum at Fagernes they then had a talk Norway many years ago; it must have been in the 1960s. Coming from the Sogne fjord we stayed some days in with one of the managers who kindly gave the decisive the old farm Elveseter in the mountains of Jotunheimen. information. In a publication of the Museum, "Valdres We shared the table with an American engineer, and to- 900--Arsskrift 1923," both A. A. Veblen and O. Veblen gether we climbed Galdhopiggen, the highest mountain are mentioned, and details about their lives are given. of Norway. Surprisingly he appeared to be a neighbor The familj6 then living in Hurum, Norway, settled in of the Dutch historian-mathematician Dirk J. Struik (see the U.S.A. in the middle of the 19th century. Andrew Math. In telligencer vol. 11, no. 1,1989). The world is small! Andersen Veblen (Veflen) was born in 1843 in the U.S.A. We continued to Fagernes in the district called Val- (Ozankee County, Wisconsin). He became a professor of dres, where we visited the Valdres Folkemuseum. On mathematics. It may be that the name Andrew is related the wall of one of the rooms hung the painted picture to the Norwegian name Andris. of a man whose name was indicated as being A. A. Oswald Veblen is the son of Andrew Andersen Veblen. Quite naturally I associated this person with the Veblen; he was born in 1880 in Decorah, Iowa. He also mathematician Veblen whose name I already knew from studied mathematics and also was a professor: in the his book "Analysis Situs" with examples and theorems on homotopy and homology groups (New York, A.M.S., 1923; 1932). I wondered whether the man in the picture was really the mathematician and in what way he was related to Fagernes. I went out for information to the tourist office. At first they could not give me the desired information, but the next day they called and asked me to come again. The family Veblen came from Grindaheim or surroundings, not far from Fagernes. The name Veblen was still known there but was written as Veflen, the V pronounced as W. Grindaheim is a village near Jotunheimen. So we went by bus to Grindaheim, hoping to get more information there, perhaps in the churchyard. But this was in vain. Home again, I saw in the library that the first name of the mathematician was Oswald. But who was A. A. Veblen? I asked my daughter and her Danish husband, who live in Denmark and often pass holidays in Norway, to look into it when they were in Valdres. More or less by chance they came to visit Hore, a hamlet not very far from Grindaheim, to see the old Norwegian stavkirke. In the churchyard they discovered several tombstones A n d r e w Andersen Veblen 50
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 9 1994 Springer-Verlag New York
mathematical world he is famous for his work in geometry and topology. His name is connected with a prize: the Oswald Veblen Prize in Geometry (see Notices of the American Mathematical Soc., vol. 38, No. 3, p. 181, 1991). There is some confusion about the name: one finds Veblen, Veflen and even Vovle (see the •rsskrift mentioned before). The village Hurum is identical with H~re. This difference in notation may be explained from the old political connections between Denmark and Norway and their respective languages. Acknowledgments: I thank the managers of the Folkemuseum at Fagernes and the Tourist office for their kind help in getting information on the Veblen family and my Danish family for their assistance.
Kerkdwarslaan 3731 E.M. De Bilt The Netherlands
Oswald Veblen
THEMATHEMATICAL INTELLIGENCERVOL.16,NO.I, 1994 51
Thomas Joannes Stieltjes: Honorary Doctor of Leiden University Gerrit van Dijk A few years ago, Leiden University conferred an honorary doctorate on E. L. Lehmann, professor of statistics at the University of California at Berkeley. A little research showed that the last laureate had been T. J. Stieltjes, precisely one century ago. Thomas Joannes Stieltjes was born on December 29, 1856 in Zwolle, The Netherlands. He carried the same first names as his father, a civil engineer and member of parliament, well known for his work on the Rotterdam harbours. Thomas Jr. started his studies in 1873 at the Polytechnical School in Delft (now the Technical University). Instead of going to lectures, he spent most of his time in the library studying the works of Gauss and Jacobi. As a consequence, he failed the propedeutical examination, and failed again in 1875 and 1876. His father contacted his friend Prof. H. G. van de Sande-Bakhuyzen, director of Leiden Observatory, and in April 1877 Thomas was appointed "assistant for astronomical calculations" at the Observator3a Thomas devoted almost all of his free time to mathematics. Through his work on celestial mechanics he made contact with Charles Hermite in Paris, and his first letter to Hermite is dated November 8, 1882.
It was followed by a long correspondence consisting of 432 letters, the last being written on December 15, 1894, 14 days before Stieltjes's death. In May 1883, Thomas married Elizabeth (Lilly) Intveld, who was a tremendous stimulus for his mathematical work. On December 1, 1883, Thomas resigned from the Observatory to devote himself completely to mathematics. On January 15, 1884, Stieltjes wrote to Hermite: I have been offered, some days ago, a professorship in analysis (differential and integral calculus) at the University of Groningen. I have accepted this offer and I believe that this position will permit me to be more useful. I owe much, in this situation, to the extreme kindness of my old chief Mr. Bakhuyzen, the director of the observatory. One of these days, my nomination will become definitive. However, there were problems. On March 13, 1884, he wrote to Hermite: The Groningen Faculty has indeed put me in first place for the vacancy, but the Minister has named one of the others. Probably the reason is that I, not having followed the ordinary route, have not obtained any degree at the University. From archives of the University of Groningen it appears that the following nomination was made in 1883: (1) Prof. dr. D. J. Korteweg, (2) Mr. T. J. Stieltjes. When asked, Prof. Korteweg said he would not consider moving to Groningen. Stieltjes, however, declared that he would accept an appointment. A new nomination was then made: (1) T. J. Stieltjes, (2) E (Floris) de Boer. De Boer was appointed by Royal Decree of March 12, !884. In May 1884, Hermite met the Dutch mathematician Bierens de Haan at the fifth centenary celebrations of the University of Edinburgh. They discussed Stieltjes's poor ~ m s t a n c e s . It is very likely that the idea of conferring an honorary doctorate upon Stieltjes originated during this discussion. At any rate, June, 1884 an honorary doctorate in mathematics and astronomy was conferred upon Stieltjes by Leiden University, following the nomination by D. Bierens de Haan and H. G. van de Sande-Bakhuyzen. Even then things were less than smooth, as Stieltjes's reply to the official letter from the Senate of Leiden University shows:
Letter of thanks to the Senate. 52
THE MATHEMATICALINTELLIGENCER VOL, 16, NO. 1 (~)1994 Springer-Verlag New York
The undersigned wishes to thank you for the honourable distinction, conferred upon him by Your College, and to assure you that the distinction is highly appreciated. Due to a regrettable misunderstanding he was not aware of the intention of a public ceremony on last Tuesday June 17 at 3 o'clock. Leiden, June 19,1884 T. J. Stieltjes
In April 1885, Stieltjes's family settled in Paris; and in 1889 he was appointed professor of differential and integral calculus at Toulouse. On June 18,1894, summary of his most important paper "Recherches sur les fractions continues" was published in the Comptes Rendus. An extended version was published in the Annales de la Facultd des Sciences de Toulouse in 1894/95. This article, in which he introduced the Stieltjes integral was awarded a prize by the Acad6mie des Sciences. On December 31, 1894, Stieltjes passed away in Toulouse, at the age of 38. The burial actually took place on January 2, 1895. His tomb is in the cemetery of Terre Cabade in Toulouse (no. 828, section II, division 4) and can be visited. In 1990, Stieltjes's family restored the tomb. His name is still alive in Leiden and Toulouse. In both cities, there is a street named after him, and one of the two mathematics amphitheatres at the University Paul Sabatier bears his name (the other bears the name of Fermat). Furthermore, Leiden University has recently taken the initiative of establishing a graduate school in mathematics, called the Stieltjes Institute for Mathematics. Participating institutions are the University of Amsterdam, the Free University of Amsterdam, the Center 9,for Mathematics and Computer Science at Amsterdam, Delft Technical University, and the Erasmus University at Rotterdam. The Institute plans to begin its activities in 1993. Both in The Netherlands and France events are in preparation for the commemoration of the 100th anniversary of Stieltjes's death. Symposia and a "Stieltjes year" (in Toulouse) are just two of the activities on the
Stieltjes's tomb in Toulouse.
Stieltjes's s~eet in Leiden.
programme. A Stieltjes Lecture will be delivered at the Congress of the Dutch Mathematical Society, to be held in Leiden. A new annotated edition of Stieltjes's Collected Papers will be published by Springer-Verlag, edited by the author of the present article. It will appear in 1993. Acknowledgments
The author would like to thank Prof. J.-B. Hiriart-Urruty of the University Paul Sabatier for his assistance in the preparation of this article. He would also like to thank Springer-Verlag for the kind permission of publication of the above material, which will partly be included in the new edition of Stieltjes's Collected Papers. My colleague Bert Peletier assisted with the historical research.
T. J. Stieltjes.
Leiden University Department of Mathematics and Computer Science Niels Bohrweg1 2333 CA Leiden, The Netherlands THE MATHEMATICAL INTELLIGENCER VOL 16, NO. 1, 1994
53
Max Dehn and Black Mountain College R. B. Sher In the Blue Ridge Mountains of western North Carolina is the site of a singular experiment in higher education: Black Mountain College. And there, in a large grove of rhododendrons, can be found the grave of Max Dehn. Black Mountain College was founded in 1933, the brainchild of a small group of faculty members at Rollins College in Florida who had been dismissed or had resigned over a dispute involving academic freedom. Opening its doors in the fall of that first year with 13 faculty and 22 students, the college was intended to be a community of learning in which decisions would be made in a democratic fashion through consensus reached in open meetings involving faculty and students. The college was owned as a corporation by the faculty as a whole, and there was no outside board to exercise control. There were no required courses, although the student was expected to develop an overall plan of study in consultation with an advisor. Classes were to consist of some combination of recitation, lectures, tutorials, and seminars scheduled at the discretion of the instructor. Indeed, it was intended that learning be a continuing process that spread beyond the usual confines of the classroom. Faculty and students alike lived on the campus of the college, ate together in a common dining hall, and participated in all aspects of community life including the work program. I Education at Black Mountain was to mean not just the understandings reached through the usual classroom experience, but also growth of personal responsibility and the ability to deal intelligently and rationally with whatever life presented. Black Mountain College did not seek, nor was it granted, accreditation from any overseeing body. At first, 1 Except for those tasks that required full-time employees, such things a s general maintenance, some construction projects, work on the col-
lege farm, etc., were to be the responsibility of all members of the community. In a July 14, 1946 letter to Mary Gregor}~ Secretary of the Corporation, Dehn wrote, "In little more than two weeks I shall be b a c k with you. I hope you will arrange some nice work to do for me, for instance Geometry for Artists or hoeing potatoes."
Lake Eden Campus of Black Mountain College at the time of Max Dehn. Courtesy of North Carolina State Archives.
grades were not even given, although this later changed to allow students to transfer credit more easily to other institutions. 2 Graduation was attained by comprehensive examination, oral and written, given by experts selected from outside the college. Despite the lack of an accredited degree, many of the college's students were able to compete successfully at major graduate schools after leaving Black Mountain. Although the usual ingredients of a college curriculum were present at Black Mountain, what made the program of study so distinctive was the central role of the creative arts. Indeed, much of the college's fame comes from its association with such faculty and students as Joseph and Anni Albers, John Cage, Robert Creeley, Merce Cunningham, Fielding Dawson, Robert Duncan, Buckminster Fuller, Charles Olson, Joel Oppenheimer, Arthur Penn, Robert Rauschenberg, Xanti Schawinsky, and Jonathan Williams. 3 As it evolved, the arts became not just the subject of a few electives here and there but instead the core of the curriculum. Mathematics was frequently taught by instructors whose academic training was in the sciences or engineering. In fact, during the 23 years of the college's existence only one full-time member of the faculty had been trained as a mathematician--Max Dehn. After leaving Germany in 1938 as a result of Nazi oppression, Dehn had made his w a y to the United States by way of Scandinavia, Russia, and Japan. He had taught at the University of Idaho, at the Illinois Institute of Technology, and at Saint John's College in Annapolis, Maryland, but had not found a suitable full-time position. Dehn first visited Black Mountain in March of 1944 to give two lectures. In a letter written that February, he noted that it did not seem appropriate, given the constitution of the faculty and student body, to give a talk on an advanced mathematical topic. He offered instead the titles "The psychology of mathematical activities," "Common roots of mathematics and ornamentics," and "Some moments in the development of mathematical ideas." The latter two were chosen by the faculty. Later, in the fall of 1944, representatives of the faculty corresponded with Dehn about the possibility of a full-time appointment for the winter and spring quarters of 1944/45. After an exchange of letters and telegrams in which the initial offer of $25 a month was increased to $40 a month, the offer was accepted and Dehn joined the faculty in January of 1945. 4 Apparently, Dehn's years at Black Mountain were happy ones. An amateur naturalist and an enthusias2 Grades were not, however, made available to the students. 3 Some of these were, however, not associated with the college as fulltime faculty, but in the summer institutes that provided an important share of the intellectual life of the communi~.
54
THEMATHEMATICALINTELLIGENCERVOL.16,NO. 1 (~)1994Springer-VerlagNewYork
tic hiker, he enjoyed the forested mountains in which the college was located. He also enjoyed the enthusiasm of the students and the intellectual contact with the faculty. While on leave in 1949 during one of the college's frequent crises, he wrote to Albers, "I am always glad to do something for BMC which I consider--in spite of occasional trouble--a wonderful place where I can be together with young people without any institutional impediments. There, I can use what little abilities I have to transmit to them what I think is leading most surely towards a happy life. Not to forget the beauty of the surrounding nature which, I think, is of the greatest value to transform young and old people who live in it." While at Black Mountain, Dehn taught not only mathematics but also philosophy, Latin, and Greek, including the reading of the classics. Perhaps the flavor of his ideas on teaching can be gotten from a remark to his colleague the physicist Natasha Goldowski, who had suggested in a paper that as students were generally not well prepared in mathematics, it was best to simply state results without indicating in any way how they were derived. Dehn's response was, "Only in one point I am, perhaps, not quite of your opinion. All instruction, including mathematics, should be an end in itself, not, in the first place, means. Therefore in mathematics one should be so simple and elementary as possible so that the structure of mathematical thinking becomes clear. If this is achieved then the student will have less difficulty to see mathematical structures in physical phenomena or even in phenomena of biology and social life. Do you agree?" In the summer of 1952, Dehn was made Professor Emeritus. He was to serve as an advisor and continue to live on the campus. That same summer, while supervising the removal of some of his beloved dogwood trees that the college had agreed to sell because of financial difficulties, he suffered an embolism and died at the age of 73. He was buried on the side of a mountain beneath a large growth of rhododendrons that still thrives today. Black Mountain College closed in 1956. Its campus, located about 20 miles east of Asheville and just north of the community of Black Mountain, is now the site of a summer camp. The visitor should exit Interstate Highway 40 to the north at the Swannanoa exit. Turn right onto Highway 70 then, after 0.4 miles, left across the short bridge at Whitson Road to "old Highway 70." Turn right and follow the signs to Camp Rockmont, which is on Lake Eden Road. (Turn left on Lake Eden Road after following old 70 for 1.9 miles.) On entering the campgrounds, follow the road to the Administration Building. Let someone there know that you would like to visit the grave site. Continue by foot on the road up the hill from the Administration Building, taking the right side when the road forks. Shortly you will see a house on the left with a tin roof. Walk to the right side of the house.
Left to right: Unknown, Johanna Jalowetz, Joseph Albers, Max Dehn. Courtesy of North Carolina State Archives.
About 60 feet past the house you will enter a grove of rhododendrons and see, on your left, the grave of the Viennese conductor Heinrich Jalowicz who preceded Dehn in death in 1946. About 25 feet to your left you will find a small ceramic marker that indicates the final resting place of Max Dehn. For more information on Black Mountain College one should refer to [1] or [2]. Information on Dehn's life can be found in [3], and his mathematical works are treated in [4] and [5]. References
1. Martin Duberman, Black Mountain College:An Experiment in Community, New York: E. R Dutton (1972). 2. Mary Emma Harris, The Arts at BlackMountain College,Cambridge, MA: MIT Press (1987). 3. C. L. Siegel, Zur Geschichte des Frankfurter Mathematischen Seminars, in Gesammelte Abhundlungen, Vol. 3, Heidelberg: Springer-Verlag (1966), Vol. 3, 462-474. 4. Wilhelm Magnus and Ruth Moufang, Max Dehn zum Ged/ichtnis, Math. Annalen 127 (1954), 215-227. 5. Wilhelm Magnus, Max Dehn, Math. Intelligencer 1 (1978/79), 132-143.
The University of North Carolina at Greensboro Greensboro, NC 27412 USA
The ceramic marker of Dehn's grave.
4It should be noted that the collegealso provided, in additionto the salar~ livingquarters, meals,and laundryservice. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
55
Are the Traditional Philosophies of Mathematics Really Incompatible? J. Lambek
The Four Schools Having been under the impression that most mathematicians do not care about the foundations of their subject, I was amazed by the heat generated by this topic in recent issues of the Mathematical Intelligencer, particularly in the letters to the editor (see, e.g., Paris [25]). The purpose of this article is to marshal a number of facts that support a certain philosophical thesis, which I hope to persuade at least some readers to share. I would like to argue that, contrary to widely held opinion, the traditional philosophies, logicism, formalism, Platonism, and intuitionism, if stated with sufficient moderation, do not really contradict each other, although I still have some reservations about logicism. This idea was first proposed in our book [17] by Phil Scott and me and elaborated for a philosophical audience in collaboration with Jocelyne Couture [5]. The present discussion owes a considerable debt to both co-authors. For background material on the traditional mathematical philosophies, the reader is referred to the standard references Benacerraf and Putnam [1], Hintikka [10], and van Heijenoort [30]. There are a number of problems a philosophy of mathematics should address. Perhaps the most important of these are: H o w is mathematical knowledge obtained (epistemology), and w h y can it be applied to nature? However, we shall here confine attention to another problem: What is the nature of mathematical entities (ontology) and of mathematical truth? The best-known mathematical philosophies have given different answers to this ontological question (see [5]), which we shall summarize here in rather abbreviated form. 56
Logicists claim that mathematical entities can be defined in the language of symbolic logic. Formalists claim that mathematical entities, if they exist at all, are nothing but terms of a formal language (of course, modulo the equivalence relation of provable equality, two terms being equivalent, or denoting the same entity, if the formula obtained by putting an equality sign between them is provable in the language). Platonists claim that mathematical entities exist independently of our w a y of viewing them. Intuitionists claim that mathematical entities are mental constructs.
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 (~)1994 Springer-Verlag N e w York
The different schools also have varied conceptions of what is mathematical truth. Logicists and formalists would claim that mathematical statements are true only when they are provable. Platonists claim that mathematical truths are there to be discovered and intuitionists claim that mathematical statements are true only if they can be known to be true. These are the more moderate views expressed by these schools. Some adherents of these schools may have more extreme opinions, which we shall mention only briefl3a Thus, an extreme logicist might claim that set theory is not part of logic. An extreme formalist might claim that mathematics is a meaningless game and that there is no such thing as a number, that only numerals exist. An extreme Platonist might believe that mathematical entities are ideas in the mind of a supernatural being; I am told that this view was proposed by Nichomachus of Gerasa about 100 A.D. and entertained by many religious thinkers since. Finally, an extreme intuitionist might believe that only those statements are true which are known to be true today; this was, in fact, occasionally asserted by the founder of the school. It goes without saying that these extremist views cannot be reconciled with one another. Alas, it has become cleat from various comments I have received that the more moderate reformulations, proposed in the interest of eclectic conciliation, are rejected by some adherents as well as by some opponents of these positions. However, I believe that, if hard pressed, I could find adherents who will accept the moderate presentations advocated here.
G 6 d e l ' s Impact There seems to be a general consensus among logicians that the rather vague concept of "truth" should be replaced by the more precise notion of "truth in a model," and we shall adopt this point of view here. Hilbert's formalist program implicitly contains the proposal that the semantical notion of truth can be captured by the syntactic notion of provability. In a sense, this proposal was carried out by GSdel [1930] in his completeness theorem:
a statement in a formal language is provable if and only if it is true in every model of that language. This result holds not only for first-order logic, but also for higher-order logic, that is, type theory, as was shown by Henkin [8], and even for intuitionistic type theory [17]. Type theory for us is the language of mathematics. Presumably, this solution is not acceptable to a Platonist, who feels uneasy with the plurality of models and wishes to single out a distinguished model, let us call it the real world of mathematics, in the hope that truth in this model alone should suffice.
G6del was a Platonist and believed in a real world of mathematics. In the semantic version of his famous incompleteness theorem [7], he apparently showed that the Platonists' hope is incompatible with Hilbert's proposal: There are mathematical statements true in what he thought was the real world, yet not provable in a language adequate for arithmetic, it being assumed that the set of proofs in that language is recursive. But wait a minute, let us look at GSdel's argument more closely. (See also [4].) G6del constructed a formula G of the form Vy6NgP(y) such that G is not provable, yet true in every w-complete model. By this, we shall mean that Vu~Nqo(y ) is true in the model whenever qa(0), qa(S0), ~(S(S0)), etc. are all true. (For some details in this argument, the reader is invited to consult Appendix II.) It follows that the following two statements are incompatible: (a) the real world of mathematics is w-complete; (b) truth in the real world implies provability. Classical Platonists may assert (a), whereas Hilbert presumably hoped for (b). So who is right?
Brouwer to the Rescue Curiously, it would seem in retrospect that the intuitionist Brouwer comes to Hilbert's rescue here, even though both Hilbert and Brouwer had perceived a conflict between their respective positions, formalism and intuitionism, prior to the publication of G6del's epochmaking paper (see [29]). Moreover, to allow himself to be rescued, Hilbert would have to sacrifice the principle of the excluded third, which is not essential to a formalist position. Brouwer would certainly deny (a) and, although he cannot be accused of favouring Platonism, I shall argue that his position can be interpreted as defending (b), thus removing the apparent contradiction between formalism and Platonism. On the one hand, he would allow us to assert the truth of~/ueNqa(y ) only if the truths of ~(0), ~(SO), ~(S(SO)), etc., can be established in a uniform way. This would fail to be the case, for example, if the lengths of the proofs of ~(sno) were unbounded as n varies over the natural numbers. On the other hand, he would insist that a mathematical statement is true only when it can be known, which we will take the liberty of interpreting to mean that it can be proved. It would appear that Brouwer himself later softened his stand against formalism and that his present-day followers, on the whole, have adopted formal proof theory as a tool to investigate his principles. On the other hand, intuitionism has been accepted into the mathematical mainstream, even if not always as an exclusive position, by constructivists, logicians, categorists, and, for some purposes, by computer scientists. (See, e.g., Troelstra and van Dalen [28].) THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. I, 1994
57
What Distinguished Model?
Constructive Formalism
There remains the question whether there is a dis- What we are suggesting here is that the free topos is tinguished model which suffices for the discussion of a suitable candidate for the real (meaning ideal) world mathematical truth. We shall not follow Tait [27], who of mathematics. It should satisfy a moderate formalist believes that Platonism can survive even without this because it has been constructed from terms of a lan"Model-in-the-sky," as he calls it. Having accepted the guage and because it exhibits the correspondence beintuitionistic viewpoint, we must insist that this model tween truth and provability. It should satisfy a modershould not only exist in the classical sense, but that it ate Platonist because it is distinguished by being initial among all models and because truth in this model sufshould actually be constructible. Different answers have been given to this question. fices to ensure provability. It should satisfy a moderate Plato himself would have said that the real world is in- intuitionist, who insists that "true" means "knowable," habited by ideal objects, of which we can only observe the not only because it has been constructed from pure inshadows. Leibniz would have said that the real world, tuitionistic type theor~ but also because it illustrates namely, this world, is the best of all possible worlds. A mod- all kinds of intuitionistic principles [17]. The free topos e m logician would be tempted to construct the distin- would also satisfy a logicist w h o accepts pure intuitionguished model as the term model of pure intuitionistic istic type theory as an updated version of symbolic logic type theory: The entities in it are closed terms modulo and is willing to overlook the objection that the natural provable equality. In particular, a statement is true if and numbers have been postulated rather than defined. It is by no means a trivial matter to show that the only if it is provably equal to T, that is, if and only if it is provable. A categorist might attempt to bring Leibniz up Lindenbaum-Tarski category is a model in our sense. to date, albeit in a watered-down version, but one that is Some fancy metamathematics or category theory has to immune against Voltaire's criticism, by suggesting that be used to prove this (see, e.g., [17] ). The three properties the distinguished model is an initial object in the category of truth in a model are certainly principles that Brouwer would have insisted on. [The arrow a : 1 --* A of propof all models. It turns out that the term model when suitably pre- erty (3) is just a term in the internal language of T; see sented as a category, which might reasonably be called Appendix II.] He might also be happy that truth in the the Lindenbaum-Tarski category, is an initial object in the free topos coincides with provability, even if the latter is category of all models, even in the category of all toposes only a formalist's attempt to interpret "knowability." It also turns out that in the free topos every natural (with logical morphisms) and has been called the free topos. (See Appendix III for a discussion of toposes.) It so number is standard, namely, equal to one of O, SO, S(SO), happens that the intuitionistic version of model, gener- etc. In view of property (3) in the definition of "model," alizing Henkin's classical nonstandard model, is a special it then follows that the free topos is w*-complete in the is true in the topos, then kind of topos. It is true, though not at all obvious, that following sense: if 3u~Nr one of r r r etc., is true. This property the term model is a model in this sense. A topos T is called a model if it shares the following is equivalent to w-completeness classically, but not intuitionistically. We may, therefore, subscribe to the followproperties with the usual category of sets: ing revised form of (a): 1. _1_is not true in T; 2. if p V q is true in T, then p is true or q is true; (a*) the real world of mathematics is w*-complete. 3. if 3~eA~(X) is true in T, then ~a(a) is true for some arrow a : 1 --* A i n T . In this connection it should be pointed out that the free Boolean topos, namely, the Lindenbaum-Tarski category These properties have an algebraic translation, first for pure classical type theory, is not a model because for pointed out by Peter Freyd, concerning the terminal ob- G6del's sentence G, G v --G is true, but neither G nor ject I of T: -~G is, thus violating property (2). One may, of course, obtain a model of pure classical type theory from the 1. I is not an initial object; free Boolean topos, as a first step with the help of an 2. I is indecomposable; ultrafilter of arrows 1 --* f~, by declaring all these to be 3. I is projective. equal to T; but I doubt whether any such ultrafilter can be described constructivel~ Here "projective" has exactly the same meaning as in If we look at G6del's incompleteness theorem for pure module theory. Moreover, model toposes are the ana- classical type theory, we thus obtain a classical (Boolean) logues of local rings, and the completeness theorem can model which is not w-complete. It follows from G6del's be sharpened to yield an analogue of the representation argument (see Appendix II) that there is a formula ~(y), of commutative rings by continuous sections of sheaves y a variable of type N, such that ~a(0), ~a(S0), ~a(S(S0)), of local rings [13]. etc., are all true, but also 3y~g-,~(y) is true. This al~8
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
lows us to construct a nonstandard natural number n = #y~N-~(y) in this classical m o d e l which is afortiori a model of pure intuitionistic type theory. The existence of nonstandard natural numbers in a model is what bothered Nyikos [24]. Would he be satisfied with the distinguished model discussed above in which all natural numbers are standard? To sum up, we suggest that to a moderate intuitionist there should be no contradiction between formalism and Platonism. Moreover, he ought to be willing to accept the free topos as one candidate for the real world of mathematics, at least of elementary mathematics. It is not necessary for our argument that the free topos is all there is, only that its existence shows the compatibility of apparently conflicting views. There are competing candidates for the "real world of mathematics," for instance, G6del's universe of constructible sets and Martin Hyland's realizability topos; but I have not investigated to what extent either of these notions would support the attempt of eclectic conciliation. The proposal to accept as the real world of mathematics the term model of pure intuitionistic type theory, or perhaps of some more powerful language, has been called constructive nominalism in [5]. It is m y belief that this Jproposal can be extended to natural languages to construct the everyday world of "shoes and ships and sealing wax, of cabbages and kings" [2]. There may even be different such worlds for different linguistic cultures. I suspect that similar views are held by a number of philosophers, linguists, and anthropologists. The real world of mathematics should not be confused with the real world of physics. Not being ultrafinitists, who believe that numbers greater than 10l~176(say) do not exist, we take the world of mathematics to be infinite. According to the present state of physics, there is no conclusive evidence that the material universe is infinite in the large, although ever since Zeno, it is generally believed that every interval contains infinitely many points, but even this has been doubted, for example, by Coish [1959].
What About Logicism? The problem with logicism is not its compatibility with the other positions, but whether it is defensible in the first place. The usual mathematical entities are natural numbers, pairs of such, sets of such, sets of sets of such, and so on. If we want to reduce mathematics to logic, as chemistry has been reduced to physics, we must surely include the machinery of set theory into what we call logic, thus allowing for the notions of equality and membership and some form of the comprehension scheme. This much seems to be taken for granted by all logicists. The difficulty arises when we want to construct the natural numbers as sets. For this, we need an axiom of infinity, which asserts the existence of an infinite set. But
then we may as well adopt Peano's axioms in the first place. This entails, in particular, that we include symbols for zero and successor in our language. There seems to be a general feeling that this is contrary to the logicist program, hence that logicism has failed. There is, however, a glimmer of hope that logicism may be resurrected, in view of recent developments in categorical computer science (e.g., [15], [16]). Following the lead of Church, one can construct the natural numbers object in a category as the retract of the formal product IIx ( x X ) (xx) or H x X (xx+'), where X is an indeterminate object and where the retract is constructed with the help of equalizers. Unfortunately, such formal products exist neither in the usual category of sets nor in the free topos, so some difficulties still have to be ironed out. Anyway, if logicism is to be salvaged, this may have to be with the help of categorical logic. See the discussion below.
Some Objections It is not likely that the proposed compromise among three, or perhaps four, major philosophical schools will put the controversy about the foundations of mathematics to rest. One reason for this is that some people's favourite positions have been ignored in this discussion, for example, predicative mathematics, ultrafinitism, and quasiempiricism. Others believe that formalism and Platonism are both wrong. This is the opinion of Saunders Mac Lane (expressed privately, but see also his lecture [20] and his book [21, Chapter XII]). Finall~ I may as well admit that, in presenting the traditional philosophies in moderate form, I have distorted each of them a little. It is debatable whether Plato, Frege, Hilbert, and Brouwer would acknowledge my version of Platonism, logicism, formalism, or intuitionism, respectively. Mac Lane has also criticized the prominence given to the free topos, or, what amounts to the same thing, to pure intuitionist type theory. Of course, other models (equivalently, applied type theories) should be studied too, and one may even look at them simultaneously, for instance bundled up in a sheaf [13]. The question then arises: Where do these models live? Well, in the category of sets, of course. But what is the category of sets? According to a constructive nominalist, it is the free topos. Yet there are other candidates for the category of sets and these are models of pure intuitionistic type theory, so we are back where we started. Let me anticipate another criticism, which shows that constructive nominalism, the position defended here, is guilty of the same circularity. In constructing the free topos linguistically, we have taken the number 2 to be the class of all closed terms of our formal language which are provably equal to S(SO). N o w the terms of this language are elements of the free monoid generated by a finite set of symbols. However, the exact nature of these symbols is of no importance; it does not matter whether they conTHE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
59
sist of chalk marks or of sound waves, or whether they are written in blue or red ink. What matters is that the elements of the free monoid can be put in one-to-one correspondence with the natural numbers, as in G6del's well-known arithmetization. If we pick one such coding of terms by numbers, we end up with the conclusion that 2 is a set of natural numbers! This is hardly an illuminating conclusion. Pursuing this line of reasoning even further, we find that the free topos is an object in the category of sets, for that matter, in any model of our language, even in the free topos. Evidentl}; we have again gone in a circle. It is like lifting oneself up by one's shoelaces. However, I believe that this kind of circularity is inherent in any attempt to come to grips with basic ontological questions. Many people share G6del's belief (a) that the real world of mathematics is w-complete and that, therefore, his statement G is true but not provable. Because apparently we can see that G is true, Penrose [26], following Lucas [19], draws the further conclusion "that our consciousness is a crucial ingredient in our comprehension of mathematical truth" and that it is "not something that we can ascertain merely by use of an algorithm." For all I know, this conclusion, asserting the superiority of the human mind over the computer, is correct, but I must reject the argument, as I do not believe (a). G6del himself drew an important corollary from his incompleteness theorem: To prove the consistency of any language adequate for arithmetic one has to go outside that language. This shows that Hilbert's proposal to restrict metamathematics to finitary methods cannot succeed. Indeed, metamathematicians no longer feel bound by Hilbert's proposed restriction. For example, the simplest proof of the consistency of pure intuitionistic type theory consists of pointing to property (1) of some model, say the free topos. Whereas the free topos can be constructed in pure intuitionistic type theo~, the proof that it is a model requires more powerful methods. Our version of type theory, sometimes called the theory of finite types, is adequate for elementary mathematics, namely, arithmetic and analysis. Even if we want to treat these subjects classically, we can do so within intuitionistic type theory by looking at statements of the form V x ~ ( x V ~x) =~ p. Metamathematics and category theory require more than the theory of finite types. One may have to admit not only the axiom of choice, but also much higher types, corresponding to Grothendieck universes in G6del-Bernays set theory or to inaccessible cardinals in Zermelo-Fraenkel set theory. For some purposes, even quantification over types may be required.
Categorical Logicism It has been argued by Henle [9] and Marquis [22] that logicism should be revived as categorism or categorical Iogicism. Without necessarily following these two authors, I see categorical logicism as abandoning the at60
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
tempt to reduce arithmetic to logic, but realizing instead that, at a very basic level, arithmetic and logic are the same. To say this, however, one has to enlarge one's conception of logic to incorporate proofs in place of mere provability. Consider, for example, the arithmetical identities ( a b ) c = a (c•
a b+c = a b x a ~,
and compare them with the intuitionistically valid logical equivalences
c=v (b=~ a) ~-* (cAb) =~ a, (b V c) ==>a ~ (b =:~ a) A (c :=~ a). The obvious analogy extends to the usual laws of arithmetic: commutative, associative, and distributive laws, and the laws of indices. What lies behind this analogy is Lawvere's [18] concept of cartesian closed category. However, one can also understand the analogy by looking at sets: Replace the natural number a by a typical set of a elements and replace the proposition a by the set of all reasons for a. Here "reason" cannot be taken to mean proof, else all unprovable propositions would be replaced by the empty set; but it may be taken to be any deduction c -* a, where c is any proposition whatsoever (see [15]). In this analogy, we have compared arithmetical operations with logical connectives. One can also compare arithmetical operations with logical deductions in the form of Gentzen sequents, because both are special cases of algebraic operations (see [14]). A primitive recursive function N n ~ N may be viewed as realizing an operation N n --+ N in a certain algebraic theory, and a Gentzen style deduction A1 ... An --* B may be viewed as an operation in a multisorted algebraic theory. The categorical viewpoint allows us to go beyond mere ontology and ask: Which mathematical entities are interesting, relevant, or important? With Bill Lawvere and other categorists, I share the view that interesting mathematical entities tend to be categories or functors and that the growth of mathematics is often guided by looking for functors adjoint to previously known functors. However, I admit that it may be difficult to convince a number-theorist of this.
Appendix I. A Modem Version o f T y p e T h e o r y G6del's incompleteness theorem applies to any formal system, classical or intuitionistic, as long as it is adequate for arithmetic and as long as the set of all proofs is recursive. In fact, the title of his paper [7] referred to classical type theory as formulated by Russell and Whitehead in their Principia Mathematica. Personally, I prefer a more modern version of type theory as presented in [17]. We admit the following types and terms, the latter written under their respective types:
1 f~
N AxB
PA
, a=a'ae,, snO (a,b) { x 9 where it is assumed that A and B are previously given types, that a and a' are terms of type A, x is a variable of type A, ~ a term of type PA, n a term of type N, b a term of type B, and ~(x) a term of type fL We also presuppose a supply of countably many variables of each type. The usual logical connectives may be defined by writing T p Aq p :=~ q VxeA~(X)
for * = *; for (p, q) = (T, T), p and q being of type f~; for p A q = p; for {x 9 A}9~(x)} = {x 9 A]T}.
Other connectives 3-, -~, and v and the quantifier 3 are defined in familiar fashion, for example, by writing pVq
for Vxen(((p ~ x) A (q ~ z)) ~ x).
For a complete list of axioms and rules of inference, the reader is referred to [17]; there are no surprises. Notably absent is the axiom V~ea(x V ~x) or, equivalently, V~en ( ~ x =} x); if it is added, one obtains classical type theory. We speak of pure type theory, intuitionistic or classical, if there are no types, terms, axioms, or rules other than those that have to be there; in applied type theory, there may be others. It is often useful to incorporate into the language of type theory a Russellian description operator. It so happens that this is not needed in pure intuitionistic type theory as formulated here, nor is it needed in the internal logic of a topos to be discussed in Appendix III (see [17]).
Appendix II. GSdel's Argument G6del's basic argument may be presented as follows. Let c~0,~1, a2,.., be a given effective enumeration of all closed terms of type P N , and let P0, P1, P2, .- 9be an effective enumeration of all proofs, regarded as strings of terms of type fL Consider the metamathematical statement R(m, n):
Pn is a proof of SmO 9 C~m. G6del realized that this is a recursive binary predicate, having practically invented the theory of recursive functions to do so. He succeeded in proving that there is a formula (term of type fD ~(x, y) with free variables of type N such that (i) if R(m, n), then r S'~0) is provable; (ii) if not R(m, n), then ~@(sm0, Sn0) is provable. N o w consider the closed term % - {x 9 NIV~cN-~P(z, y)}.
If we assume that G - SgO 9 ag is provable, say with proof Pn, then we can prove k~(Sg0, Sn0); hence 3yeNk~(SgO, y), and therefore -~G. If we assume that our
formal language is consistent, we may infer that G is not provable and so not R(g, n) for any n; thus -~r SnO) is provable for all n. Taking ~a(y) = -~r y) in the definition of w-completeness, we infer that VucN-~d(Sa0, y) is true in any w-complete model, which implies that G ~ Sg0 9 ag is true in such a model. In the syntactic version of his incompleteness theorem, GSdel assumed that the language is w-consistent and deduced that -~G is not provable either. Rosser later showed that w-consistency here can be replaced by consistency (see [12] or [11]).
Appendix III. On the Notion of Topos The notion of a topos, actually of an elementary topos, was introduced by Lawvere in collaboration with Tierney, following a lead by Grothendieck. It is a category which resembles the familiar category of sets in having finite products, exponentiation, like the object B A of all functions from A to B, and a subject classifier f~, resembling the set {T, 3_} in classical set theory, inasmuch as it allows one to characterize subsets of A by their characteristic functions from A to f~. For our purposes, to the regret of all logicists, we must also stipulate a natural numbers object N, resembling the usual set of natural numbers. This is not the place to describe in detail the internal language of a topos T. Suffice it to say that its closed terms of type A are arrows a : 1 --* A in 7", where 1 is the terminal object (empty product) and A is any object. In particular, closed formulas are arrows p : 1 --* fL In general, the internal language of a topos is intuitionistic and there may be more than the two arrows T, 3_ : 1 ~ fL To say that p is true in T means that the arrows p and T from I to f~ coincide. Conversely, from every language, that is, type theor~ one can form the topos generated by it, alias its Lindenbaum-Tarski category. Its objects are closed terms a of type PA, A being any type, and its morphisms c~ --* fl, fl of type PB, are closed terms of type P ( B x A) about which it can be proved in the language that they denote functions from the set denoted by c~ to the set denoted byfl. When we say that a model topos T is a model of a language s we are implicitly referring to an interpretation of s in T. (It so happens that, when s is pure intuitionistic type theor~ there is exactly one such interpretation.) An interpretation of s in T may be viewed either as a morphism (translation) in the category of type theories from s to the internal language of T or, equivalently, as a morphism (logical functor) in the category of toposes to T from the topos generated by s (The equivalence follows from the fact that the processes "topos generated" and "internal language" are adjoint functors, see [17].) A d o s e d formula p of s is true in the topos 7", under the given interpretation, if the translation sends it onto the arrow 3_: 1 ~ f~ in T. Before G6del proved the incompleteness theorem, people had need of a special term for THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. I, 1994
61
"true in every model"; for example, "semantically true," and I seem to recall that Carnap used "analytic." As we already hinted in the section entitled "What Distinguished Model?", a model of pure classical type theory is precisely a nonstandard model in the sense of Henkin, showing that this concept is not at all contrived, as some people seem to think. It m a y be worth pointing out in support of nominalism that every topos is equivalent to the topos generated by its internal language. On the other hand, the internal language of the topos generated by a language is merely a conservative extension of the latter. Although it must be conceded to extreme formalists that, at first sight, pure intuitionistic type theory is not about anything, it has a conservative extension, the internal language of its free topos, which is about the free topos, the proposed candidate for the real world of mathematics.
5. J. Couture and J. Lambek, Philosophical reflections on the foundations of mathematics, Erkenntnis 34 (1991), 187-209. 6. K. G6del, Die Vollst/indigkeit der Axiome des logischen Funktionenkalkiils, Monatshefte Math. Physik 37 (1930), 349-360. 7. K. G6del, Ober formal unentscheidbare S/itze der Principia Mathematica und verwandter Systeme I, Monatshefle Math. Physik 38 (1931), 173-198. 8. L.A. Henkin, Completeness in the theory of types, J. Symbolic Logic 15 (1950), 81-91. 9. J.M. Henle, The happy formalist, Mathematical Intelligencer 13 (1991), no. 1, 12-18. I0. J. Hintikka, The Philosophy of Mathematics, Oxford University Press, Oxford, 1969. 11. V. Huber-Dyson, GfJdel's theorems: a workbook on formalization, Teubner, Stuttgart and Leipzig, 1991. 12. S.C. Kleene, Introduction to Metamathematics, Van Nostrand, New York, 1952. 13. J. Lambek, On the sheaf of possible worlds, in J. Ad~mek and S. Mac Lane (eds.), Categorical Topology, World Scientific, Singapore, 1989, pp. 36-53. 14. J. Lambek, On the unity of algebra and logic, in F. Borceux Appendix IV. Some Recollections of Brouwer (ed.), Categorical Algebra and its Applications, SpringerVerlag, New York, 1989, pp. 221-229. I wish to take this opportunity to share some personal 15. J. Lambek, Some aspects of categorical logic, in D. Prawitz recollections of L.E.J. Brouwer. When he visited Canada, et al. (eds.), Proc. 9th International Congress of Logic, quite a few years ago, to address the Canadian MathMethodology and Philosophy of Science, Uppsala 1991, NorthHolland, Amsterdam, 1992. ematical Congress (now called "Society") on his ideas, he defended his notion of "twoity" against H.S.M. Cox- 16. J. Lambek, Least fixpoints of endofunctors of cartesian closed categories, Report from the Dept. of Math., 91eter's criticism that it should be called either "twoness" 11, McGill U., Montreal 1991; in Mathematical Structures or "binity." He also came to m y house and became quite in Computer Science, to appear. interested w h e n I told him that he had influenced two 17. J. Lambek and P.J. Scott, Introduction to Higher Order Categorical Logic, Cambridge University Press, Cambridge, people in rejecting Aristotelian logic, the founder of Gen1986. eral Semantics, Korzybski, and the science-fiction writer 18. F.W. Lawvere, Adjointness in foundations, Dialectica 23 Van Vogt. S o m e h o w the conversation turned to Wittgen(1969), 281-296. stein, and Brouwer doubted whether the latter had made 19. J.R. Lucas, Minds, machines and G6del, Philosophy 36 any contributions to logic. I mentioned that Wittgenstein (1961), 120-124. had invented truth tables, although I n o w k n o w that they 20. S. Mac Lane, The 1982 Ryerson Lecture, University of Chicago Press, Chicago, 1982. go back to Philo of Megara, about 300 B.C. Brouwer then 21. S. Mac Lane, Mathematics, Form and Function, Springerasked: "What are truth tables?" I was naive enough to Verlag, New York, 1986. attempt to explain them to him. 22. J.-P. Marquis, Russell's logicism and categorical logicism, preprint 1991; in: A. Irvine et al. (eds.), Proceedings of the Acknowledgments Conference on Russell, University of Toronto Press, Toronto, to appear. The author acknowledges support from the Social Sci- 23. G.H. Moore, Sixty years after G6del, Mathematical Intelligencer 13 (1991), no. 3, 6-11. ences and Humanities Research Council of Canada. He is indebted to Bill Anglin, Saunders Mac Lane, Jean-Pierre 24. P.J. Nyikos, Formal systems, Mathematical Intelligencer 13 (1991), no. 3, 4-5. Marquis, and Du~ko Pavlovi4 for helpful comments on 25. A. Paris, A letter to J.M. Henle, the "happy formalist," a preliminary version of this article and wishes to thank Mathematical InteUigencer 14 (1992), no. I, 6-8. the referee and the editor for their constructive criticism. 26. R. Penrose, The Emperor's New Mind, Oxford University Press, New York, 1989. 27. W.W.Tait, Truth and proof: the Platonism of mathematics, References Synthese 69 (1986), 341-370. 28. A.S. Troelstra and D. van Dalen (eds.), The L.E.J. Brouwer 1. P. Benacerraf and H. Putnam, Philosophy of Mathematics: seCentenary Symposium, North-Holland, Amsterdam, 1982. lected readings, 2nd ed., Cambridge University Press, Cam- 29. D. van Dalen, The war of the frogs and the mice, or the cribridge, 1984. sis of the Mathematische Annalen, Mathematical Intelligencer 2. L. Carroll, The Complete Works of Lewis Carroll, The Modern 12 (1991), no. 4, 17-31. Library, New York. 30. J. van Heijenoort (ed.), From Frege to Gfidel, Harvard Uni3. H.R. Coish, Elementary particl6s in a finite world geomeversity Press, Cambridge, MA, 1967. try, Physical Review 114 (1959), 383-388. 4. J. Couture, Analyticit6 et compl6tude, Cahiers d'dpistdmologie, D4pt. de philosophie de UQAM 8510 (1985); also in Department of Mathematics and Statistics R. Nadeau (ed.), Contingence et Raison, Vrin-Bellarnin, to McGill University appear. Montreal, Quebec, H3A 2K6, Canada 62 THEMATHEMAnCALINTELLIGENCERVOL.16,NO.I, 1994
Jet Wimp*
A Course on Integral Equations by Allen C. Pipkin Springer Texts in Applied Mathematics, Vol. 9 N e w York: Springer-Verlag, 1991. xiii + 268 pp. US $39.00. ISBN 0-387-97557-8.
Integral Equations: A Practical Treatment from Spectral Theory to Applications by David Porter and David G. Stifling N e w York: Cambridge University Press, 1990, xi + 372 pp. US $29.95. (paper) ISBN 0-521-33742-9.
Linear Integral Equations by Rainer Kress Springer Texts in Applied Mathematics, Vol. 82 N e w York: Springer-Verlag, 1989. xi + 299 pp. US $49.00. ISBN 0-387-50616-0.
Integral Equations and Applications by Constantin Corduneanu N e w York: Cambridge University Press, 1991. ix + 366 pp. US $94.95. ISBN 0-521-34050-0.
Reviewed by Thomas S. Angell In the May 1697 issue o f the Acta Eruditorum appear articles by Johannes and Jakob Bernoulli and also a note by Leibnitz, all discussing the solution of the famous * Column Editor's address: Department of Mathematics, Drexel Uni-
versity, Philadelphia, PA 19104 USA.
problem of the brachistochrone: What is the plane curve having the property that the time necessary for a particle to slide d o w n a path to the lowest point on the curve is a minimum? Many of us are familiar with the history of this "challenge problem" posed to the mathematicians of Europe by Johannes Bernoulli in June of 1696. The note of Leibnitz appears there simply because, as is perhaps not so well known, he had already solved the problem. In response to a private letter sent to him on June 9th, Leibnitz had communicated his solution to Bernoulli on June 16, 1696. The note of Leibnitz is remarkable not only for its reticence but also for Leibnitz's assertion that Huygens too, had he been alive, could have solved the problem. Huygens, who died in 1695, had not seen Bernoulli's challenge. Not that the problem was new: Galilei in 1638 had posed it and suggested, incorrectly, that the curve of steepest descent was an arc of a circle. The Bernoullis and Leibnitz showed 49 years later that the brachistochrone was instead a cycloidal arc. Huygens was probably ignorant of Galilei's problem. On the other hand, both Leibnitz and the Bernoullis were well aware of the investigations of Huygens into the properties of the cycloid and, in particular, its isochronous property, which Huygens had described in 1673. Indeed, the Bernoullis expressed amazement that the solution of the problem of the curve of quickest descent was the same as Huygens's tautochrone, the plane curve having the property that the time necessary for a particle to slide down a path to the lowest point on the curve is independent of the point of origin. I enjoy talking to the students of elementary calculus about Huygens's study of the pendulum clock and the tautochrone's importance to navigation, geography, astronomy, and the development of a clock with a period
THE MATHEMATICALINTELLIGENCER VOL. 16, NO. 1 (~)1994 Springer-Verlag New York
63
independent of the length of the pendulum. Of course, sometimes a curious student catches me up by asking how Huygens came to try the cycloid in his search for an isochronous curve. My answer is that good intuition always characterizes the work of a master, an assertion that hardly sits well with students. I wish that I could tell them about the work of Abel in 1823. Abel developed a method, starting from first principles, for finding the tautochrone simply from its defining property. Unfortunately, Abel's work is a bit distant from the subject matter of the course, and my students are not sophisticated in matters of mechanics. The problem, as stated by Abel, was a generalization of the problem of Huygens in that he asked for the path along which a particle, confined to a vertical plane and subject only to the force of gravity, should fall so that its time of transit would be equal to a given function of the vertical distance fallen. Abel proceeded from the relationship between potential and kinetic energy and, integrating with respect to arc length, derived the relation y=O ds _ T(h), =h v/2g(h - y)
~
where T is the given function of transit time depending on the vertical distance h. If one sets ds/dy = -u(y), his equation becomes
1 h v ~ ~o ~u(Y) d y = T(h). We obtain Huygens's problem by setting T(0) = 0 and T(h) - c f o r h > 0. In modern terminology, this last equation, Abel's equation, is a Volterra integral equation of the first kind for the unknown function u. What is particularly fascinating about this equation, apart from its mathematical character, is that the physical problem leads ab initio to an integral equation rather than to a differential equation that can be rewritten in integral form. Also, the equation seems to be ubiquitous, occurring, for example, in mechanics, geophysics (in the description of earthquake shocks), and tomography. The equation is still an object of lively investigation. Today we write Abel's equation in symbolic form as )~Au = T, the symbol A denoting the map
u ---* f0 h ~ u(y) dy, with A = 1/v/-~. The function A(h, y) = (h - y)1/2 is the kernel of the integral equation. This equation and Volterra equations of the second kind were both first studied systematically by Vito Volterra in 1896. He summarized his work in his book of 1913 [19]. Equations of the second kind have the form
(I + )~K)f = g, 64 THEMATHEMATICALINTELLIGENCERVOL.16,NO. 1,1994
the operator K being generated by ~a t
f --*
K(x, y)f(y) dy,
with variable upper limit of integration. Although Volterra used the theory of algebraic equations primarily as a guide, he did advocate the use of infinite determinants. Other eminent mathematicians soon became interested in integral equations. Du Bois Reymond in 1888 was the first to suggest the name "integral equations," whereas Poincar6, in 1894, used integral equations in his study of a three-dimensional version of Liouville's problem concerning the cooling of a bar. Unlike Abel's equation, the equations considered by Poincar6 arose from differential equations, in this case the heat equation. In Poincar6's time, derivations of integral equations from differential equations were common. George Green had proposed them in his Essay of 1828. Green had studied Laplace's equation, was the first to pose what we now call the Dirichlet problem, and had derived what we now call Fredholm integral equations of the first kind. It remained for Beer [2] in 1865 to derive an integral equation of the second kind for the Dirichlet problem. Integral equations and partial differential equations have been intimately linked ever since. This was the heritage that Fredholm enjoyed at the turn of the century when he developed his theory of integral equations. Fredholm exploited the analogy with linear algebraic equations in a w a y that Volterra did not, and he laid the foundation for much of what was to come, particularly the work of Hilbert and his school. The theory flourished in the first half of our century with significant contributions made by Gevrey, Tamarkin, Tonelli, Carleman, Weyl, E Riesz, Wiener, and Hopf. For those who enjoy history, I recommend the introduction to Corduneanu's book under review.
Is There a Field of Integral Equations? The subject should not be relegated to mere history; it is quick, not dead, as a glance at Mathematical Reviews or Zentralblatt will confirm. Even its oldest manifestations retain vitality, as the current interest in Abel's equation shows [5]. The application to population dynamics, envisioned by Volterra and d'Ancona in the late 1920s, is another example. The books reviewed here are full of current and classical applications: thermostatic regulators, acoustic scattering, composite materials, population genetics, automatic control input-output systems. Yet it is a rare undergraduate who has ever encountered the subject. I maintain that most graduate students have only a passing acquaintance with integral equations, one gleaned either from examples of compact operators in a course in functional analysis or from applications, numerical or otherwise, in a course in partial differential equations or in mathematical physics. A re-
cent article in the SIAM News [11], which discusses the training of the next generation of applied mathematicians, fails to suggest integral equations as a study topic. As the classical tools of applied mathematics the author lists complex variables, linear algebra, advanced calculus, ordinary and partial differential equations, perturbation, and asymptotic methods. He mentions modern tools as well, but absent from either list is integral equations. This omission, this gap in the knowledge of our students, is simultaneously understandable and lamentable. As a result of the basic simplicity of textbook models and the weak background of students in physical applications, most lecturers present their applications in the language of differential equations. There may even be a psychological barrier. I know colleagues who in the 1950s were exposed to integral equations, but whose recollec-
" . . . thermostatic regulators, acoustic scattering, composite materials, population genetics, automatic control, input-output s y s t e m s . . . " tions are of massive systems of linear algebraic equations and large, complicated Fredholm determinants. Their recollections are not happy ones. There is also the understandable view that modern functional analysis is so rich, and the compact operators form such a small chapter, that one can afford to mention concrete realizations of the operators only in passing. Nonlinear matters, e.g., monotone operators and Hammerstein's equation, are best left to advanced seminars. Perhaps the appearance of these books, all "based on lectures," heralds a change from curricular neglect. The curricular neglect is not for the want of suitable texts. When I began to teach the subject, the four books of which I had some knowledge were Tricomi [18], Hochstadt [7], Muskhelishvili [13], and Krasnosel'skii [9]. Add to that list the book of Smithies [16]. These were the books called to m y attention as being "top notch." Those of Muskhelishvili and Krasnosel'skii were monographs suitable for research students, the others were texts. None of these books has lost its luster. The books reviewed here are all described by their authors as textbooks. As one might expect, there is considerable overlap, but the books differ in style, approach, audience of choice, and emphasis on topics. They fit at different levels and even into different branches of the mathematics curriculum. Of the four, the text of Pipkin is the most elementary. The author intends the book as an advanced undergraduate or perhaps a beginning graduate text. Its greatest strength--as is appropriate for such an audience--is the breadth of examples and applications. Pipkin emphasizes analogies with linear algebra, and in the latter half of the book he uses complex variables. For example,
to work through Chapter 9 on principal value integrals, one must be proficient at contour integration. Pipkin's discussion is somewhat informal, but I think appropriately so. He offers us a "techniques" book, which allows the great breadth of topics, though it has the disadvantage that, without an experienced instructor, students will find this book difficult. The first five chapters present the basic Fredholm theory, Hilbert-Schmidt equations, and Volterra equations. In particular, the fourth chapter contains a brief account of the Laplace transform. After a chapter entitled Reciprocal Kernels that deals with integral representations of solution of convolution equations of the first kind, Pipkin introduces Wiener-Hopf equations (suitable for problems defined only on a half-space) and principal value integrals. He devotes the final three chapters to integral equations whose kernels are of the form K(x, y) = (x - y)-l. Woven into the development throughout are applications involving fiber-reinforced materials, input-output systems, lateral vision, airfoils, and viscoelastic materials. One omission, curious in a m o d e m text, is a discussion of numerical methods. However, one cannot cover everything. There are many good exercises emphasizing computation, with some answers provided in an appendix. The author provides precise statements of theorems but few proofs. I recommend the book to experienced instructors with a good knowledge of integral equations who, along with their students, are interested in applications (with a seasoning of engineers). The book of Porter and Stirling also will require an experienced instructor, but for a different reason. The authors direct the book toward final-year Honors Mathematics students or M.Sc. students studying in the British educational system. Unfortunately, the book will not fit into a niche in the standard curricula of U.S. graduate schools. Though the benefits might be considerable, an instructor in the United States should exercise great care in using the book. What are the problems? Problems of necessary background, mostly. Few beginning graduate students in the United States have sufficient acquaintance with complex variables and linear algebra, few have the background sufficient for this book, though in all other respects the book is really introductory. The authors thoughtfully supply two appendices, one on the facts of Lebesgue integration through Fubini's theorem, and one on the facts about operators on Hilbert space, which, with certain sections of the text, provide a whirlwind tour of the spectral theorem for compact operators. The authors admit from the start that they are addressing students with a first-level graduate course in functional analysis. Those with such exposure will find the text an excellent w a y to gain acquaintance with integral equations and will enjoy a review of the functional analysis recently learned. At my suggestion, a colleague used this book in just this w a y with great success, no doubt partly due to the large number of exercises. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. I, 1994
65
The book begins with two chapters of motivation presenting problems in ordinary and partial differential equations and showing how integral equations arise in each. Here we meet, for example, Tricomi's equation, Laplace's equation (as a special case of the former), and Sturm-Liouville problems. There follow chapters on Fredholm theory, compact operators, the spectral theorem for these operators in the self-adjoint case, and one on positive operators. The emphasis is on eigenfunction expansions, and this naturally leads to a discussion of the topic of approximation methods for eigenvalues and eigenfunctions. The authors close with chapters on variational techniques and Galerkin's method, and on singular equations and the Hilbert transform. This last chapter is particularly well done. There are nice things scattered throughout the text. For example, Porter and Stirling present a variant of Galerkin's method, called the Iterated Galerkin Method, which apparently goes back to a 1975 article of Sloan, Burn, and Datyner [15]. This variant yields better approximations to the solution of second-kind equations. The method can be explained simply as follows: One replaces the equation ( ! + ,~K)x = f with a fixed point equation x = T x with the affine map x ~ )~Kx + f. A solution will then lie in the range of T. Although the usual Galerkin procedure computes an approximate solution xn of the fixed point equation, the iterated Galerkin procedure replaces this approximate solution xn with ~,~ = Txn, which is both an element of R ( T ) and of the fiber over x,~. Examples show the increased accuracy obtained. The table on page 282 compares the exact solution, the Galerkin approximate, and the iterated Galerkin approximate for a specific example. The cautious reader will note that the last two rows of the table on page 282 are interchanged. I really don't want to argue matters of taste, but I find it curious that, even though the authors emphasize the subject's origins in differential equations, they do not include an account of potential theory. One does not find single- or double-layer potentials or any systematic account of Laplace's equation. Considering the role played by potential theory in the history of the subject, the authors have missed a great opportunity.
Applying Functional Analysis Kress doesn't miss the opportunity. He devotes a chapter to the subject and another to potential-theoretic methods for the heat equation. He uses similar methods in the final chapter of his book on inverse acoustic scattering. As the author remarks in the preface, his book is based on lectures given in G6ttingen in which he presents not only theory, but also applications and numerical methods. There is no question here about the need of a firm foundation in functional analysis. Kress employs it heavily throughout, including in the error and convergence analysis of numerical methods. Most American students 66
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
will need a course in functional analysis before they achieve any facility with the ideas of compact operators, dual systems in Banach spaces, and Sobolev spaces. On the other hand, if you wish to teach a course that combines functional analysis and integral equations (not a bad combination), then this book would be an excellent choice. For someone with a grounding in functional analysis, this book provides a well-constructed self-contained presentation. It is clear, appropriate details are filled in, and a connected thread runs throughout. This thread is, of course, the story of how linear boundary value problems in partial differential equations can be approached using integral equations. Fortunately, the author does not allow the theme to overwhelm the development: The subject remains linear integral equations. Kress's approach to the Fredholm alternative and the Riesz-Schauder theory, using general dual pairings of Banach spaces, deserves mention, as does his treatment, in the same setting, of normal solvability. To my knowledge, this approach has appeared only in the books of J6rgens [8] and Heuser [6]. Here, the use of more general pairings--rather than that of a Banach space with its d u a l - - allows a proof of the Fredholm theorems which avoids the use of the Hahn-Banach Theorem, therefore of Zorn's Lemma. The tactic of circumventing the axiom of choice is familiar in analysis, although perhaps not so much pursued in the recent past. My opinion, which I pass on to my students, is that such niceties, if the only such rationale, to an applied mathematician hardly justify the effort. In Kress's b o o k however, it is not just a nicety; the rationale is much more practical. If one studies integral equations with square-integrable kernels, say in the space L2[0, 1], then the adjoint equation is again an integral equation in L2[0, 1]. However, in case the function space is C[0, 1], the adjoint equation is in an inherently different dual space. The use of dual systems avoids the even more difficult question of how to interpret the adjoint equation if we have no concrete knowledge of the dual space. Kress's treatment of Sobolev spaces is straightforward and tailored to the numerical methods presented later in the text. In the final chapters, Kress carefully works out the material on ill-posed problems and their regularization; in the final chapter, on inverse scattering theor~ we can see all the theory in action. What really sets this book apart, and makes it so valuable in the present climate, is its emphasis on numerical methods. These are worked out in detail in five chapters. Although there are books available (e.g., [4]) treating such methods, this is the only general text that I know that contains this material. I can well imagine the book becoming a standard text for second-year graduate students. My suggestion to Springer-Verlag is that it persuade the author to include a diskette containing basic codes. They have done that with other texts. It would make a stunning addition to Kress's book.
Altogether different from the previous books is the one by Corduneanu. After the historical introduction mentioned above, written with G. Bantam, it discusses briefly some applications in which integral equations arise and summarizes the basic theory of linear Volterra and Fredholm equations. The author reveals the true nature of the book in the last section of the first chapter, devoted to nonlinear equations of Hammerstein type. He articulates the hope that the "topics featured in the book will convince the reader that integral equations are a very useful and successful tool in contemporary research .... " I think he has succeeded admirably with his emphasis on nonlinear problems and equations of Volterra type, an invaluable bibliography, and sets of bibliographical notes. Apparently only the few of us working in areas like control theory or dynamical systems know that Tonelli in 1928 [17] introduced an extensive generalization of Volterra equations, what are now called abstract Volterra operators. In the engineering control literature, the term used is "causal operator." Such an operator V defined on some function space E([t0, T], R) is characterized by the property that if x(s) = y(s), to <_ s <_ t, t <_ T, then (Vx)(t) = (Vy)(t). I first learned of such operators from a d o s e friend and fellow graduate student, an electrical engineer, who wrote his Ph.D. thesis on nonlinear feedback systems with causal operators. Working on my dissertation, I was naturally excited to find the same idea, and the reference to Tonelli, in the book of O~uzt6reli [14]. The idea shows up later in Warga's book [20] under the name p-hereditary operator, and most recently in Krasnosel'skii and Pokrovskii [10] as a physically realizable transducer. As Corduneanu so well documents, the use of Volterra operators can be a unifying tool for a range of equations including integrodifferential equations of Volterra type, functional differential equations, and linear and nonlinear Volterra integral equations in one or several variables. As the last three references show, they are important for the description and analysis of control systems that involve hereditary dependence. It is fortunate that Corduneanu treats these ideas and amply illustrates their usefulness. I contend that much could be done to develop a systematic approach to hereditary systems by using Volterra operators. To my knowledge, such a synthesis has yet to be accomplished. Also of growing applied interest are equations with multivalued right-hand sides, sometimes called contingent equations. Differential inclusions, i.e., relations of the form :b(t) E Q(x, t), were known in the 1930s. They appeared explicitly in the 1937 paper of McShane [12], which anticipated many of Filippov's results. A systematic development of the subject, driven largely by the demands of control theor~ occurred only in the 1960s. Two recent references are [1, 3]. Of still more recent vintage are integral inclusions. Corduneanu gives us a brief introduction to the subject with a treatment of Volterra-
Hammerstein integral inclusions. His discussion, and the material in the additional references, affords an entry into what should become a fertile field of research. Considering that much of the motivation for the development of the theory of differential inclusions was their occurrence in control theory, it is a pity that Corduneanu does not illustrate their use in his discussion of control problems monitored by Volterra equations. Much is to be discovered in this field of applications, particularly in the investigation of multidimensional problems, and those who read this book will find it easy to identify promising research problems.
References 1. J. P. Aubin and A. Cellina, Differential Inclusions, SpringerVerlag, Berlin, Heidelberg, New York, 1984. 2. A. Beer, Einleitung in die Electrostatik, die Lehre vom Magnetismus und die Elektrodynamik, Vieweg, Braunschweig, 1865. 3. K. Deimling, Multivalued Differential Equations, de Gruyter, Berlin, New York, 1992. 4. L. M. Delves and J. L. Mohamed, Computational Methods for Integral Equations, Cambridge University Press, Cambridge, New York, 1985. 5. R. Gorenflo, Abel Integral Equations: Analysis and Applications, Springer-Verlag, Berlin, New York, 1991. 6. H. Heuser, Funktionalanalysis, Teubner, Stuttgart, 1975. 7. H. Hochstadt, Integral Equations, Wiley, New York, 1973. 8. K. J6rgens, Linear Integral Operators (G. Roach, trans.), Pitman, London, 1982. 9. M. A. Krasnosel'skiL Topological Methods in the Theory of Nonlinear Integral Equations, Pergamon Press, Oxford, New York, 1964. 10. M. A. Krasnosel'skii and A. V. PokrovskiL Systems with Hysteresis, Springer-Verlag, Berlin, Heidelberg, New York, 1989. 11. C. D. Levermore, Training a new generation of applied math faculty, SIAM News 25(6) (1992), 17. 12. E. J. McShane, A navigation problem in the calculus of variations, Amer. J. Math. 59 (1937), 327-334. 13. N. I. Muskhelishvili, Singular Integral Equations; Boundary Problems of Function Theory and Their Application to Mathematical Physics, P. Noordhoff, Gronigen, The Netherlands, 1953. 14. M. N. O~,uzt6reli, Time-Lag Control Systems, Academic Press, New York, London, 1966. 15. I. Sloan, B. Burn, and N. Datyner, A new approach to the numerical solution of integral equations, J. Comp. Physics 18 (1975), 92-103. 16. F. Smithies, Integral Equations, Cambridge University Press, Cambridge, 1958. 17. L. Tonelli, Sulle equazioni funzionali di Volterra, Bull. Calcuta Math. Soc. 20 (1929), 31-48; Opere Scelti 4, 198-212, Edizioni Cremonese, Rome, 1960. 18. E G. Tricomi, Integral Equations, Interscience Publishers, New York, 1957. 19. V. Volterra, Theory of Functionals and of Integral and IntegroDifferential Equations, Dover, New York, 1959. 20. J. Warga, Optimal Control of Differential and Functional Equations, Academic Press, New York, London, 1972.
Department of Mathematical Sciences University of Delaware Newark, DE 19716 USA THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. I, 1994
67
Exact Constants in Approximation Theory by N. Korneichuk (translated by K. Ivanov) Cambridge: Cambridge University Press, 1991, xii + 452 pp. US$89.50. Reviewed by I". M. Mills In teaching undergraduate calculus, we try to ensure that our students learn the qualitative fact that oK)
S := E n-2 < o0. It is easy to prove this by comparing partial sums with integrals. If we note that, for n > 1,
1 1 ---~ n < n(n - 1--~)
1 n - 1
1 n'
then, by estimating the partial sums again, we can prove the quantitative fact that S<2. However, I never cease to enjoy telling my students that S = 7r2/6. We have found an "exact constant." If you agree with me that there is something final and something beautiful about the statement that S = ~r2/6, then you will appreciate the subject matter of this new book by Korneichuk. It deals with best-constant problems in the context of approximation theor~ Specifically, approximation theory is the study of methods for approximating functions by using simple
functions such as polynomials, or splines, or rational functions. In this study, one is concerned with estimating errors that are committed by approximation methods. Estimating these errors as precisely as possible often comes down to finding a best constant in some inequality. This usually gives us the last word in estimating an error. Exact Constants in Approximation Theory deals with finding these constants across a broad range of problems in approximation theory. To give readers some flavour of both the subject and the book, I have described below a typical problem. Many mathematicians think that approximation theory began and ended with Weierstrass's theorem; perhaps the description below will help to dispel this myth.
Typical Problem Let X
=
C2~ = {f : I~ ~ I~[ fcontinuous everywhere, 27r-periodic}
and V,~ be the set of trigonometric polynomials of order n - 1 with real coefficients. When endowed with the uniform norm [[fllx = [[f[[~ = sup{lf(x)[ : x E ~}, X becomes a Banach space and V,~ is a finitedimensional subspace of X. Obviously, V1 c V2 c V3 c ... and, by Weierstrass's approximation theorem, Un~176 1 Wn is dense in X. One of the principal developments in approximation theory is the study of best approximation, which is an attempt to quantify Weierstrass's theorem. Dunham Jackson's doctoral thesis [2] is the starting point of this development. Let f E X. Then, the best approximation to f by elements of Vn is defined to be E n ( f ) = inf{llf - zllx : T E Vn}. Jackson estimated E n ( f ) in terms of w(f; .), the modulus of continuity of f. This is defined by w(f;6) =
sup{If(x)- f(y)[
: Ix - y l - < 6;x,y E I~}.
If f : ~ --* I~ is a 27r-periodic function, then the following statements are equivalent: o f E C27r 9 f is uniformly continuous on 9 lim~__.0+w(f; 6) = 0. Jackson proved the following result.
THEOREM 1 (D. Jackson). There is a finite constant c such that Figure 1. En(f) is the best approximation to f by elements of Vn. 68
~ E MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
E n ( f ) ~ cw(f; 7r/n)for all f E X, n E 1~.
(1)
So as n --* c~ we have 1/n --* O, w(f;rc/n) ~ O, and, hence, E,~(f) --, O. Thus, Jackson's theorem is a quantitative version of Weierstrass's theorem. For example, suppose that f is not only continuous but has a bounded derivative. Then, for some M > 0, If'(x)I < M for all x E ~. Hence, by the mean value theorem,
w(f;5)
= <
sup(If(x) - f(Y)l : l x - Yl <- 6;x,y ~ ]~} MS,
and so Jackson's theorem implies that, for n = 1,2, ,...
E n ( f ) < cM/n. We can now ask whether inequality (1) can be improved. Jackson showed that there is a function f0 E X and a positive constant co such that
E,~(fo) > cow(fo, Tr/n)
(n = 1,2,3,...).
Hence, we cannot replace 1/n in Equation (1) by a smaller function of n, such as 1/n 2 for example. It remains to find the smallest c that can be used in inequality (1). This is a typical problem in the book under review. The solution to the problem is given in the following result (p. 270, Theorem 6.2.2.): THEOREM 2. The assertion
En(f) <_ cw(f;Tr/n) for all f E X , n E I~ is valid for c = 1 but not for any c < 1. Thus, we have solved a best-constant problem for Jackson's theorem. Jackson's theorem has been a rich source of ideas for generalisation. One of my favourite pieces of reading is the dissertation by James Case [1] which is an exposition of the generalisations of Jackson's theorem. Some entrepreneurial mathematician should organise a conference on Jackson's Theorem: The first hundred years to be held in 2011 in some exotic location. Another worthwhile project would be to publish the collected works of Dunham Jackson. His papers (listed in [3]) are beautiful to read, and they provide fine examples of style and clarity for graduate students to emulate. Readers of the Mathematical Intelligencer enjoy photos: Does any one out there have a photo of Dunham Jackson? Other P r o b l e m s The problem above has certain key ingredients: the space X, the norm (l" IIx, and the increasing sequence of finite-dimensional subspaces ~c 89189
We can create new settings for our best-constant problem by varying these ingredients. For example, we could let X be 1. Lp, the space of 27r-periodic functions whose pth power is Lebesgue integrable over [0, 2r); 2. "-'2~m(m) = { f E V21rI f (m) E U2~}, the space of 27rperiodic functions which are m times continuously differentiable; 3. C([a, b]), the space of continuous real-valued functions defined on [a, b] (with no restrictions about periodicity). Also we could let V,~ be 1. a finite-dimensional space of splines, 2. [in the case of X = U([a, b]) mentioned above] the space of ordinary polynomials of degree at most n. As you vary the scenario of the problem, you generate new best-constant problems to be solved. The author does this throughout the book.
General R e m a r k s 1. Title: I do not like the title, because all constants are exact. "Best Constants in Approximation Theory" would be a better title, as most English-speaking mathematicians would prefer the phrase "best constant" to "exact constant." 2. Introduction: The author does not give an introduction to the subject matter which would give a nonspecialist reader some idea of what the book is about. In fact, he does not define the topic of the book at all, unless you accept his claim that the title is selfexplanatory. The Preface suggests that the material may be useful to "the applied scientist who is looking for mathematical approaches to the solution of practical problems .... " As a mathematician who works sideby-side with applied scientists every day, I reckon that most "applied scientists" could not read this book. The book will be useful to mathematicians who work in approximation theory and are interested in best-constant questions. Perhaps some people who work in numerical analysis or computational mathematics also may be interested in it. 3. Price: Like all books in this series, this book is expensive. Graduate students could not afford it; academic mathematicians who could afford it may think twice about buying it; many libraries will baulk at spending this amount of money without strong justification from faculty. 4. Printing: The book is up to the usual high standard set by Cambridge University Press for the series. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
69
5. Exercises: There are 90 exercises at the ends of chapters. They appear to be fairly difficult (at least to this reviewer!) as many of them are based on results in research papers. This is common in monographs from the (former) USSR. 6. Translation: The quality of translation is satisfactory, particularly considering that the translation was done by Professor K. Ivanov (Bulgarian Academy of Sciences) whose native language is not English. One sentence, though, tickled my fancy: "This fact is far from trivial unlike the other results in this section" (p. 285). However, it is irritating to see that certain book references have not been translated. For example, it would be more helpful to refer to an English version of a book where it exists rather than the Russian version. Of the 45 Russian books given in the bibliography, 19 have been published in English. Many were originally published in English: some were even published by Cambridge University Press! 7. Bibliography: The bibliography is extensive and heavily biased towards Soviet work. This may not be a very unfair bias as Soviet mathematicians have been the most active in the field of best constants for a long time: but the bias is a little unfair. Still an extensive guide to the Soviet literature is useful to non-Soviet best constant hunters. 8. Index: Brief. 9. Notation: I often find the notation in this area of mathematics difficult to remember, especially in the description of function classes. For example, we are told on p. 212 that it is clear that the class W m K H 1[a, b], which, incidentally, is the same as the class K W ~ +1 [a, b], is contained in W TM K ~ [a, hi. This problem is aggravated by the fact that mathematicians do not read mathematics books from cover to cover. We tend to dive in at the middle.
References 1. James Robert Case, Extensions and Generalizations of Jackson's Theorem, Ph.D. Dissertation, Syracuse University, 1970. 2. Dunham Jackson, Uber die Genauigkeit der Ann/iherung stetiger Funktionen durch ganze rationale Funktionen gegbenen Grades und trigonometrische Summen gegebener Ordnung, Gekr6nte Preisschrift und Inaugural-Dissertation, G6ttingen University, 1911. 3. William L. Hart, Dunham Jackson 1888-1946, Bull. Amer. Math. Soc. 54 (1948), 847-860.
Department of Mathematics La Trobe University College of Northern Victoria PO Box 199, Bendigo, Victoria 3550 Australia 70
THE MATHEMATICALINTELLIGENCER VOL. 16, NO. 1, 1994
Leonardo, Special Issue, Visual Mathematics Guest Editor:. Michele Emmer Journal of the International Society for the Arts, Sciences, and Technology Volume 25, Numbers 3 and 4 N e w York: Pergamon Press (1992) Reviewed by Harold L. D o r w a r t This will be an unconventional review of an unconventional but beautifully produced publication that is worthy of high praise. First some facts, by w a y of stage setting. The following one-sentence biography appears in Webster's Dictionary: "Vinci, Leonardo da, 1452-1519; Italian painter, sculptor, architect, scientist, musician, and natural philosopher." Therefore it is easy to see w h y Frank J. Malina (1912-1981), who was an "aeronautical engineer, pioneer in rocketry, research administrator, promoter of international cooperation, artist, and editor," chose the name Leonardo when, in 1967, he founded a "professional journal for working artists to write about their own work." Concerning the founding publisher, I. R. Maxwell (1923-1991), who was then chair of Pergamon Press, it has been stated that "his vision of the future of publishing was instrumental to the establishment of contemporary scientific and scholarly publications and resulted in a major contribution to the development of modern science." The International Society for the Arts, Sciences, and Technology was founded in 1981 as a "non-profit organization that seeks to encourage the interaction of art, science, and technology." Thus it is not surprising that the bimonthly publication Leonardo was chosen as the official journal of the organization, which would then become known as Leonardo/ISAST. Headquarters were in Europe until recently, when the organization moved to San Francisco. 1 N o w that the stage is set, it is time for some general comments concerning the publication to be reviewed. Visual Mathematics is apparently the most elaborate of a series of special issues, and of theme packs "of frequently requested articles that have appeared in the Journal over the years." Its face size is the same as that of The Intelligencer, but its thickness is that of a book. It weighs one pound, six ounces. It was printed in Great Britain on very good, heavy paper, the binding is excellent, and the cover is strong enough to hold up under considerable use. The price to a non-member of the Society or to an Institution/Library--not mentioned on the cover but discovered in the classified advertisements section - is $45. The Society "gratefully acknowledges donations from Research on Demand, the Malina Trust and CRSS Architects, Inc.," for "partial support of this volume." 1Beginning in 1993, Leonardo was published by the MIT Press-Editor's Note.
The editors have organized the contributions under 7 headings: Geometr~ Computer Graphics and Geometr~ Computer Graphics and Art, Topology, Tessellations, Perspective, Art and Mathematics. The difficult part of the review now has to be faced! H o w do you do justice to a guest editor and to 29 individuals who have written an Introduction and 22 articles, when your feelings--after thumbing through the book a few t i m e s - - are those of a small child of long ago who was given some money and told to spend it at a candy store? After gazing for some minutes at the plates of delectables, the child was completely bewildered. For the first few days after receiving the publication, this reviewer merely looked (and looked) at the 179 black-and-white and 28 beautifully colored figures, all with carefully written captions, and then read the helpful abstracts that preceded each article. Approximately half the authors describe themselves as mathematicians or as working in closely related fields. The other half are sculptors, composers, writers, etc. Of the 30 article writers, exactly half give the USA as their home base, and the others are from Italy, United Kingdom, German)~ Switzerland, Canada, and Chile. The articles are heavily referenced; it is interesting to lo0k at some of the bibliographic entries m in particular, at the books that the present authors have considered important in helping their ideas come to fruition. Among the older reference books, one finds D'Arcy Thompson's On Growth and Form, J. Hambidge's The Elements of Dynamic Symmetry, and H. M. Cundy and A. P. Rollet's Mathematical Models. Hilbert and Cohn-Vossen's Geometry and the Imagination comes up frequentl)~ as do the early and later works of Coxeter, including Escher and the Visual Representation of Mathematical Ideas, of which Coxeter, M. Emmer, R. Penrose, and M. L. Teubner are editors. Also, there are references to modern books on fashionable subjects, such as Mandelbrot's The Fractal Geometry of Nature, Peitgen and Richter's The Beauty of Fractals, Peitgen and Saupe's The Science of Fractals, and Gleick's Chaos, Making a New Science. In his Introduction, entitled "Visual Mathematics: a N e w Era?", guest editor Michele Emmer mentions one of his favorite paintings, "The Flagellation," by the Italian Renaissance artist Piero della Francesca, and then quotes Morris Kline's description of this artist as a mathematician (from KIine's Mathematics in Western Culture): The artist who perfected the science of perspective was Piero della Francesca. This highly intellectual painter had a passion for geometry and planned all his work mathematically to the last detail. The placement of each figure was calculated so as to be correct in relation to other figures and to the organization of the painting as a whole. He even used geometric forms for parts of the body and objects of dress and he loved curved surfaces and solidity. He later quotes from The Mathematical Experience, by P. J. Davis and R. Hersh, concerning the use of highspeed computers in mathematics, and briefly describes
Figure 1. An impossible figure, the tribar, drawn in perspective.
the seminal work at Brown University (and elsewhere) in computer graphics, concluding, "it is time to propose a comparison between the research of mathematicians and the works of artists, in order to gain an understanding of what interesting results can be expected from the field of visual mathematics." I will select just a few of the articles and describe them. In "Portraits of a Family of Complex Polytopes," H. S. M. Coxeter and G. C. Shephard exploit recent developments in computer graphics to represent regular complex polytopes on the real plane. The intricate and very beautiful figures include, for example, four projections of the M6bius-Kantor polygon 3{3}3 showing how its appearance gradually changes as it is rotated. Roger Penrose, in "On the Cohomology of Impossible Figures," explores the close relationship between certain types of impossible figures, including the tribar (see Fig. 1), and the mathematical idea of cohomology. He first gives a fascinating definition of an impossible figure: It is a two-dimensional representation meant to be reconstructed by the observer as a three-dimensional figure but in which it is impossible for the viewer to decide the distance from his mind's eye to any given point in the reconstruction. In the abstract, such figures are said to possess an ambiguity group. Weirdl)~ an impossible figure may be decomposed into possible figures, as is the case with the tribar, which shows that the lack of ambiguity can be a local phenomenon (see Fig. 2). In "On the Edge of Science: The Role of the Artist's Intuition in Science," the remarkable sculptor Charles O. Perry describes his highly geometric sculptures, and the intuitive processes that led to their creation. He credits D'Arcy W. Thompson's On Growth and Form as being his bridge from real life to art. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
71
T h e Physical Basis of the Direction of Time by H. D. Zeh N e w York: Springer-Verlag, 1992. x + 188 pp. US $39.95 paper. ISBN 0-387-54884-X
Reviewed by John C. Baez
Figure 2. The tribar shown pieced together out of overlapping smaller drawings, each of which depicts a possible structure. In "Visualization in Art and Science," the educator Harriet E. Brisson addresses the puzzle of how humans obtain a cognitive grasp of what in reality are not visualizable structures, such as four-dimensional objects or space-filling curves. Obviously to some extent we can learn to do this, because we can utilize in mathematical proofs the salient properties of such objects. Recent advances in computer graphics and sculptural forms have shed some light on the process by means of which we create internal models for such objects. In "Visualization of Soap Bubble Geometries" Fred Almgren and John Sullivan discuss startling new techniques for generating computer graphics of bubble clusters. The software, based on Fresnel's equations, produces both the colored interference pattern and the Fresnel effect of varying transparency. Several articles deal with patterns and tilings. Branko Griinbaum and G. C. Shephard, in "Interlace Patterns in Islamic and Moorish Art," explore the rather deep mathematical structures inherent in these very old Moorish decorative patterns. Instead of discussing all the many fascinating articles in detail, which is what the writer would like to do, he will tell you how the child solved the candy selection problem. After counting pennies and seeing that there was a one-to-one matching with candy containers, the child announced, "I'll take a penny's worth of each kind." I hope the reader does the same, and purchases this wonderful journal. H a p p y reading! 17 Cobble Road Salisbury, CT 06068 USA 72
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1,1994
In this book Zeh brings some clarity to a very murky problem: Why are the future and the past so different? One need only read the physics journals to see that this is a multifaceted and very real issue that still vexes the experts. To understand its seriousness, it is first necessary to see how similar the future and the past are. They don't seem so in everyday life: We remember the past but not the future, our actions affect the future but not the past, and so on. From this standpoint it is really surprising that the dynamical laws of p h y s i c s - - with one small exception--seem symmetrical under time reversal. Before we go further, it's important to get a clear idea of what time-reversal symmetry really means. At the simplest level we may think of the laws of physics as equations involving a time variable t, and say that they are symmetric under time reversal if given any solution, and making the substitution t ---, - t , we obtain another solution. To take an easy example, consider a point particle of mass m in three-dimensional space with no forces acting on it. If we write its position as a function of time, r(t), Newton's second law says that
d2r --
dt 2
~0.
If r(t) satisfies this equation, then so does r(-t). Typically, the laws are more complicated, and one may have to be more careful in defining time-reversal symmetry. Different laws of physics involve very different mathematical structures, but they are usually separated into two components, the "kinematics" and the "dynamics." The kinematics consists of the description of the set S of states that the system can be in at any given time. For example, in classical mechanics, we can specify the state of a point particle in three-dimensional space by giving its position r and its velocity v, so S = ~3 x ]1~3. The dynamics tells how states change with time. In a theory where we can predict both the future and the past from the present, and where there are no time-dependent external influences, we usually describe dynamics with a family of maps U(t): S ~ S, where t E I~. If the state of the system is ~b at some time to, then the state is U(t)~b at time to + t. The maps {U(t)} should form a "oneparameter group," that is, u ( t ) U(s) = u ( t + s),
v t, s, ~ ~.
We say that the physical system given by (S, U) has "time-reversal symmetry" if there is a map T: S ---, S, called time reversal, such that u(-t)
= T -~ U(t) T.
For example, our point particle with no forces on it moves with constant velocity, so U(t)(r,v) = ( r + tv, v). It's easy to check that U is a one-parameter group and that the system has time-reversal symmetry, where T(r, v) = (r, - v ) . Note, by the way, that time-reversal symmetry in the sense described is different from requiring that a given state r be invariant under time reversal: Tr162 Our world is evidently in a state that is not even approximately invariant under time reversal; there are many processes going on whose time-reversed versions never seem to happen. This is logically independent of the question of whether the dynamical laws of physics admit time-reversal symmetry. Keeping this distinction straight is crucial for thinking clearly about the direction of time. Even people who claim to understand the distinction often slip. When reading about time-reversal symmetry, I become infuriated when authors confuse symmetry of the laws with symmetry of the state, and I am happy to report that not once did I hurl Zeh's book to the floor in anger.
... Violation of T symmetry has so far been seen only in the decays of a single particle . . . . At this point, we could go through all theories of physics and check to see whether they have time-reversal symmetry. However, let us simply turn to the most upto-date and complete laws of physics we know: the standard model and general relativity. The "standard model" is a complicated theory of quantum fields that describes the most fundamental particles we know (mainly leptons and quarks) and the forces (electromagnetism and the weak and strong nuclear forces) by means of which they interact. In other words, it treats everything except gravity. The standard model has time-reversal symmetry except for effects involving the weak force. This is the force that permits a proton and electron to turn into a neutron and a neutrino, as happens in some radioactive atoms, or vice versa, as in some others. In quantum field theory, time reversal or T, is one of a trio of possible symmetries, the others being charge conjugation or U, which amounts to interchanging particles with their antiparticles, and parity, or P, which is related to spatial inversion
(t, x, y, z)
(t, - x , -y, - z )
in somewhat the same way that time reversal relates to the map --.
( - t , x , y , z).
In a quantum field theory, states are given by unit vectors in a Hilbert space H. The symmetries C and P are given by unitary operators on H - - i f the theory in question admits these symmetries--whereas T is given by an antiunitary operator, that is, a conjugate-linear oneto-one and onto norm-preserving map from H to itself. I will restrain myself from explaining why T must be antiunitary rather than unitary, fascinating though this is. The key point here is that in the standard m o d e l the weak force violates C, P, and T symmetry, whereas electromagnetism and the strong force admit all these symmetries. Moreover, although violation of P symmetry is common and blatant-- neutrinos, for example, which only interact by the weak force, come in a left-handed form but not a right-handed one--violation of T symmetry has so far been seen only in the decays of a single particle, the neutral kaon, and the amount of violation is minute. Most physicists believe that this small T symmetry violation is not particularly related to the gross time asymmetry of the state of the universe. However, there is something very curious about this, as in the elaborate Islamic designs that are perfectly symmetrical except for one tiny flaw put in to avoid the wrath of Allah. Zeh's book does not treat the T asymmetry of the weak interaction very thoroughly, but luckily there is already a good book that does just this [1]. On the other hand, general relativity treats gravity, which is a great puzzle in its own right. It seems very difficult to unify gravity with the rest of the forces. Unlike all the other forces, it is not at all natural to formulate its dynamics in terms of a one-parameter time evolution group. Essentially, this is because it treats the geometry of space-time itself and describes how it wiggles around. Although the dynamics of general relativity is by now moderately well understood, the modifications required for a quantum theory of gravity are still very poorly understood and seem to require a radical rethinking of the very notion of time. In his last chapter, "The Quantization of Time," Zeh tours this fascinating subject. Although a quantum theory of gravity would be likely to have profound implications for the study of time reversal one can fairly say that, so far, the dynamics of gravity seems to admit time-reversal symmetry. It's worth noting that there are some cases where at first glance it looks as if the laws of physics are asymmetric under time reversal, but on closer inspection it turns out to be the fault of the particular state of the universe in which we are. The two most famous examples are the "time arrow of radiation" and the "time arrow of thermodynamics." Here an "arrow of time" is used loosely to denote something that is not symmetric under time reversal. The time arrow of radiation refers simply to the fact that when we shake an electrically charged object, it emits waves of radiation that ripple outward as time progresses into the future, rather than the past. We express this concept mathematically in terms of what are THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1, 1994
73
called Green's functions. To understand these, it's easier to consider the scalar wave equation rather than Maxwell's equations of electromagnetism in their full glory. We have a "field" r ]~4 ----+ ~ being produced by a "source" f: I~4 ~ ~; we assume that both are smooth functions and that
De = where O=
f,
02
02
02
02
Ot2
COX2
Oy2
OZ2"
The source does not uniquely determine the field, but it is possible to write down formulas that give us for any source f a field r with EJr = f. In particular, we say that G(t, x, y, z) is a Green's function (actually a distribution) for the scalar wave equation if r
= j ( a ( t - t',r - r')f(t',r') dt' dr'
(where r is short for (x, y, z)) implies that o r = f. Two Green's functions are the "advanced" one, G.dv(t,r) -- 5(t + Irl) [rl
'
and the "retarded" one, G~t(t,r)
- 5(t-
Irl) Irl
'
where 6 is the Dirac delta distribution. In electromagnetism, one typically uses the retarded Green's function, so that if f(t, r) is nonzero only for times t E [to, tl], then r is typically nonzero after tl, but is zero before to. It may seem odd that although the equation []r = f is preserved by the transformation ( t , x , y , z ) -* ( - t , x, y, z), we are solving it in a way that doesn't respect this symmetry. There are, however, two things that help resolve this puzzle. First, working with the retarded rather than the advanced Green's function is, at least for f vanishing outside a bounded set, equivalent to an assumption about the nature of the field r namely, that it vanish as t -* -oc. In short, we are making a timeasymmetric assumption about the state of the system in choosing the retarded Green's function. Why do we make this assumption? For a quite interesting reason: because it's dark at night. Light radiates out from the sun and from our flashlights, rather than coming into them from the distance, because the universe is a dark and cold place. The very fact that space is mostly dark and empty, with a speckling of hot bright stars that radiate outward, is blatantly time asymmetric, so the time arrow of radiation appears to be cosmological in origin. This fact about the universe is crucial to life as we know it, as all life on earth is powered by the outgoing radiation of the sun, and the earth, in turn, dumps its waste heat into the blackness of space. A second, subtler point is that the equation De = f 74 T.EMATHEMATICAL INTELLIGENCER VOL 16, NO. I, 1994
does not fit into the general framework of one-parameter groups because the field r is subject to an arbitrary timedependent external influence, the source f. Here one wants to imagine oneself, the experimenter, as being able to do whatever one wants with the source f, and to see what it does to the field r This is related to the notion of free will: We like to think that the laws of physics govern the behavior of everything else, but that we are free to do whatever we want. In the most fundamental laws of physics we k n o w - - the standard model and general rela t i v i t y - n o "arbitrary external influences" appear. In these laws, there is no need to choose between a retarded and advanced Green's function (or some other Green's function, for that matter). There is only the need to choose the state that best matches what we observe. The time arrow of thermodynamics is perhaps the most famous aspect of time-reversal s y m m e t r y - - s o I will treat it very briefly here. Why is it so much more likely that a porcelain cup will fall to the floor and smash to smithereens, than it is for a pile of porcelain smithereens to form into a cup and jump into one's hand? Disorder seems always on the increase. In fact, in thermodynamics there is a quantity called entropy, S, which is a measure of disorder--although one must be very careful not to fall for the negative connotations of "disorder," which here has a very precise and sometimes counterintuitive sense. The second law of thermodynamics states that dS -->0. dtThis law appears utterly time-asymmetric, and reconciling it with the (almost) time-symmetric fundamental laws of physics has exercised the minds of many physicists for many years. The final resolution seems a simple one: This law is not true except for certain states. That is, it expresses a time asymmetry of the state of a system, rather than an asymmetry in the dynamical laws. As long as the dynamical laws admit symmetry under time reversal, for every state with dS/dt > 0 there is a time-reversed state with dS/dt < 0. It is also worth noting that the vast majority of states typically have S quite large and dS/dt -~ O. As with the time arrow of radiation, in the last analysis it appears to be nothing but a raw experimental fact that the entropy of our universe is increasing. In a sense this is not surprising because chemistry and biology convince us that life as we know it requires entropy to be changing monotonically, rather than staying about the same. One might ask why dS/dt > 0 rather than dS/dt < O, but this is essentially a matter of convention. Processes like remembering and planning, which define the psychological notions of past and future, can occur only in the direction of increasing entropy. That is, a memory at time t can only be of an event at time t' for which S(t') < S(t), whereas a plan at time t can only be for an action at time t' for which S(t') > S(t). Because we have settled on using calendars for which the number of
the years increase in the direction of plans rather than memories, we have chosen a time coordinate t for which t < t' implies S(t) < S(t'). The main remaining mystery, then, is w h y the state of the universe is grossly asymmetric u n d e r time reversal, though the dynamical laws of physics are a l m o s t - - b u t not q u i t e ! - - s y m m e t r i c . If readers wish to puzzle over this some more, or want supporting evidence for some of the (perhaps upsetting) claims I've m a d e above, they could not do better than to read Zeh's book.
References 1. Robert G. Sachs, The Physics of Time Reversal, Chicago, University of Chicago Press (1987).
THE SELFAVO IDING WALK by N.N. M a d r a s , York University, Ontario & G.D. S l a d e , M c M a s t e r University, Ontario
"This is the first book written on the self-avoiding random walk that has been written for mathematicians rather than for chemists. The authors' achievement is impressive. A particularly awesome treatment is given of the lace expansion, a technique of which professional combinatorialists are not yet aware. Required reading for whoever takes applied discrete mathematics to heart." - - G i a n - C a r l o Rota, The Bulletin of Mathematics Books and Computer Software
Department of Mathematics University of California Riverside, CA 92521 USA
CONTENTS: Introduction 9 Scaling, Polymers and Spins 9 Some Combinatorial Bounds 9 Decay of the Two-Point Function 9 The Lace Expansion 9 Above Four Dimensions 9 Pattern Theorems 9 Polygons, Slabs, Bridges, and Knots 9 Analysis of Monte Carlo Methods ~ Related Topics 9 Index
Corrections
1992 425pp. Hardcover $64.50 ISBN0-8176-3589-0 Probability and Its Applications
The authors of the first book reviewed in vol. 15, no. 3, pp. 69-71 are Yu. V. Egorov and M. A. Shubin. We regret having misspelled one of these names. In the review by David M. Bressoud in vol. 15, no. 4, 71-73, both formulas on p. 73 are wrong due to an editing error. In each case, the summation should be a product.
MOVING? We n e e d y o u r n e w address so that y o u do not miss a n y issues of
THE MATHEMATICAL INTELLIGENCER. Please fill out the form below and send it to:
Springer-Verlag New York, Inc. Journal Fulfillment Services 44 Hartz Way, Secaucus, NJ 07096-2491 Old Address (or label)
"This well organized and clearly written advanced textbook introduces students to analytic functions of one or more real variables. Many historical remarks, examples and references to the literature encourage the beginner to study further this ample, valuable and exciting theory." - - Mathematical Reviews This book treats the subject of analytic functions of one or more real variables using, almost solely, the techniques of real analysis. This point of view dramatically alters the natural progression of ideas and brings previously neglected arguments to the fore. 1992 200pp. Hardcover $49.50 ISBN0-8176-2768-5 Birkhiiuser Advanced Texts: Basler Lehrbiicher, Volume 4
Name. Address. City/State/Zip
New Address
by S.G. K r a n t z , W a s h i n g t o n University, St. Louis & H.R. P a r k s , Oregon State University
Name Address. City/State/Zip.
To Order! Call ToU-Free 1-800-777-4643. In NJ call (201) 348-4033. Or Write to BirkhA'useI,Dept. Y751, 44 Hartz Way, Secaucus, NJ 07096-2491. Prices valid in North America only. For information outside North America, please contact:
B kh u,er Verlag AG, aost berg 23, P.O. Box 133, CH-4010 Basel, Switzerland. FAX 061 271 7666.
Birkhduser B o s t o n Basel Berlin
Please give us six weeks notice. THEMATHEMATICALINTELLIGENCERVOL,16,NO. 1,1994 75
Robin Wilson* Islamic Mathematics and Astronomy II By the year 1000, the Islamic world extended along the northern coast of Africa into southern Spain, and the next three centuries were to be fertile years for the development of mathematics and astronomy. Outstanding among the scholars of the time was Omar Khayyam(1048?-1122), a mathematician, physician, and astronomer who wrote an important treatise on algebra containing a geometrical approach to the solution of cubic equations; he is known in the West mainly for his poem Rubaiyat. Later on the scene was the Persian mathematician and philosopher Nasir Eddin al-Tusi (12011274), who improved on earlier Arabic translations of Euclid, ApoUonius, and Ptolemy and who made original contributions to both mathematics and astronomy. The Islamic scholars were familiar with all types of astronomical and navigational instruments. These included the planisphere, used for observing the positions and altitudes of celestial bodies and telling the time of day, and its cousin, the astrolabe, shown here with the Spanish geometer, astronomer, and instrumentmaker al-Zarqali (d. 1100), who lived in Toledo and Cordoba and wrote a number of important works. Also constructed during this period was an Arabian celestial globe, featuring the stars and planets.
Grateful though we are for the wealth of philatelic marvels Robin Wilson has provided over the years, we need not condemn to silence all the other readers who have prizes of their own to share. Feel free to submit guest columns! They may present stamps, or coins, or for that matter bills, having some association with mathematics. Submissions may be sent to the Column Editor or to the Editor-in-Chief. Chandler Davis *Column editor's address: Facultyof Mathematics,The Open University,MiltonKeynes,MK76AA,England. 76
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 1 (~)1994 Springer-Verlag New York