The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the editor-in-chief, Chandler Davis.
Prediction Paradox
Ethan Akin Responds
Lwas surprised that so many readers who wrote about the two-computer prediction paradox [Mathematical Intelligencer, vol. 15, no. 2, 1993, 3-5)] seemed unaware of how often this has been analyzed by logicians and philosophers of science. John Kemeny, in A Philosopher Looks at Science (1959), was one of the first to discuss it. Since then many variations have been proposed. Prediction paradoxes can arise whenever a prediction is part of the event being predicted. A variation I myself invented is in Chapter 11 of my New Mathematical Diversions from Scientific American. You write on a sheet of paper, turning it face down so no one can read it, a description of an event that is certain to occur within an hour. You then bet a million dollars to a dime that a person cannot accurately predict whether the event will or will not occur by writing "yes" if he thinks it will, "no" if he thinks it won't. You are sure to win because the event you described is that the person will write "no." As I pointed out, the simplest variant of this paradox is to say to someone, "Will the next word you speak be 'no'? Please answer yes or no." Such paradoxes are cleverly disguised forms of the old liar paradox which arises whenever a language is allowed to talk about the truth or falsity of its own statements. Two notorious prediction paradoxes are much harder to resolve. One is the problem of the unexpected hanging, discussed in the first chapter of my Unexpected Hanging. The other is Newcomb's paradox, the topic of two chapters in my Knotted Doughnuts.
Gestalt psychologists have produced a number of drawings which, like the duck-rabbit, yield conflicting interpretations depending upon how they are viewed. My intent with "The Spiteful Computer" was to produce a conceptual analogue of these ambiguous figures. As a matter of logic, we interpret the setup using words like "prediction" and discover an internal contradiction. As Martin Gardner points out, it is just a version of the Liar Paradox and, as he is too polite to say, it is a rather ponderous construction compared with the examples he delight fully summarizes. N o w forget all that. Look instead at the universe of moving particles which motivated Laplace's original boast. The initial state determines all future states, and Laplace, human or machine, is equipped to compute them. The question is, "Where does the logic of the first interpretation apply its bite to prevent the computation, routine on the second interpretation, from succeeding?" My current answer is that the difficulty lies in describing the initial conditions, in giving to Laplace the description of the world in which he (or it) has been asked the question and been given the initial conditions.
Martin Gardner 110 Glenbrook Drive Hendersonville, NC 28739
Department of Mathematics City College (CUNY) New York, NY 10031-9100 USA
Mathematics and History The recent essay "Mathematics and History," by W. S. Anglin [1], was no doubt "provocative" by intent. But there aie at least two places where Anglin is so seriously unjust to earlier writers that he should not be allowed to go uncorrected.
THE MATHEMATICAL1NTELLIGENCERVOL. 16, NO. 2 (~)1994Springer-Verlag New York 3
Here is one [1], p. 11, from a section on treating mathematical progress as a joke: Euclid intended to give a rigorous treatment of geometry but he goofed in his very first demonstration. He forgot to add an axiom to ensure that the circumferences of two overlapping circles do not merely pass through each other without touching, but actually meet at a point. In 1889, Hilbert presented a rigorous reworking of Euclid, but he made the very same mistake. It was only in the French translation of Hilbert's book that he belatedly added the necessary axiom. (p. 25) This is true of Euclid, of course, but it is an outright falsehood about Hilbert. The well-known b o o k b y Hilbert [3] takes axioms o n incidence, betweenness, congruence, and parallels and proves that they yield planes (with the usual distance formula) having coordinates in suitable ordered fields. Hilbert deliberately a v o i d e d all "continuity" axioms, and his theorems are valid even in geometries where overlapping circles m a y not intersect. In m o d e r n terms, his fields are Pythagorean but not necessarily Euclidean (and he pioneered the s t u d y of this distinction, which has become important in the m o d e r n theory of quadratic forms). Indeed, Hilbert in the same book went on to analyze the constructions n e e d e d for his axioms; he s h o w e d that they require only the use of a straightedge and a tool for marking off congruent segments on different lines, rather than the full Euclidean use of straightedge and compass. To k n o w this obviously requires some k n o w l e d g e of the subject. But no special knowledge is n e e d e d to see what [3] marks as a d d e d in the French translation: a "Remark" in which Hilbert formulates a "completeness" axiom and explains that adopting it w o u l d force the coordinate field to be the real numbers. This axiom is in no w a y necessary for the rest of the book. N o r do y o u have to read the whole b o o k to be sure of this; a n y o n e w h o turns the page to finish the Remark (p. 26) will find that it ends with the w o r d s "in what is to follow, no use will be m a d e of the axiom of completeness." A second falsification occurs in Anglin's discussion of the historian D. E. Smith [1], p. 8: Smith's nationalistic analysis leads him to remark that the work of Fibonacci (an Italian) was beyond the competence of any professor in Paris. What needs to be told is quite different, however. The scholars of Medieval Europe were not tied to their particular countries. They spoke a common language (Latin), and they travelled all over the continent. At that time, the sharp social boundaries were not national or racial, but economic or religious. The point which Smith's nationalism obscures is that Fibonacci was the best mathematician, not just in Italy or France, but in the whole of the European scholarly community. Smith's w o r k is readily available in reprint [4], and readers can check the following facts for themselves. First, Smith's discussion of Fibonacci (pp. 214-218) occurs not in a section on a n y particular nation but u n d e r the general heading "Christian Europe from 1200 to 1300." Sec4
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
ond, Smith says there that Fibonacci's best w o r k w o u l d not have been comprehensible to professors "in the University of Paris, to select w h a t was soon to become the greatest intellectual center," a d d i n g that the subject had no standing "there or in the schools of Italy." And finally, Smith's very first sentence about Fibonacci calls him "the greatest and most productive mathematician of all the Middle Ages." If this amounts to "obscuring" the point,
vive l'obscuritd! What is more, even Smith's brief account makes it clear that the "European scholarly c o m m u n i t y " has only a tenuous claim to Fibonacci. We k n o w from Fibonacci's o w n statements that he learned mathematics from Arabic teachers in North Africa, and that he later discussed the subject with scholars in "Egypt, Syria, Greece, Sicily, and Provence." He explicitly describes his simplest work, the famous Liber Abaci, as an attempt to familiarize Western Europeans with material from Indian, Greek, and Arabic mathematics. I w o u l d add that the dedications of Fibonacci's works give us a good idea where he f o u n d intellectual stimulation later in his life: the court of Frederick II in Sicily, one of the places (like Muslim Spain) where the "sharp religious boundaries" were broken down. Frederick himself was famous (or infamous, as the papal writers said) for listening to Greeks, Jews, and Arabs as m u c h as to Catholics; Fibonacci's Book of Squares is dedicated to him and arose from a problem posed by a m e m b e r of the court (probably from Arabic sources) [2]. Michael Scot, to w h o m the second edition of the Liber Abaci was dedicated, was a m e m b e r of the court famous for translations from Greek and Arabic (Aristotle and Avicenna, no less!); the general European view was that he was a sorcerer, and Dante a century later reported seeing him in Hell (Inferno XX). Another of Fibonacci's works is in the form of a Letter to Master Theodore; Theodore, a p r o m i n e n t philosopher at the court, was a Greek born in Egypt. In short, w h a t m a d e Fibonacci stand out so much a m o n g his Latin-speaking contemporaries was precisely that he was not one of the scholastics of Medieval Europe.
References 1. W. S. Anglin, Mathematics and history, Math. Intelligencer14 (1992), no. 4, 6-12. 2. Leonardo Pisano (Fibonacci), The Book of Squares (trans. L. E. Sigler), Academic Press, Orlando, FL, 1987. 3. D. Hilbert, Foundations of Geometry, Open Court, Chicago, 1902. 4. D.E. Smith, History of Mathematics, Vol. I, Dover, New York, 1958.
William C. Waterhouse Mathematics Department The Pennsylvania State University University Park, PA 16802 USA
On the Kepler Conjecture Professor Wu-Yi Hsiang announced in 1990 that he had solved Kepler's problem of establishing the densest packing of spheres in three dimensions. The signatories of this letter, who have all worked on sphere packing problems, do not consider that Hsiang's work constitutes a proof of Kepler's conjecture, or can be completed to one in a reasonable time. We understand that, despite the many objections that have been made to it over the past few years, Hsiang's paper will soon be published in the International Journal of Mathematics. Perhaps this is fortunate, since interested mathematicians will now be able to judge its quality for themselves. An article by Hales to be published in the next issue of the Intelligencer describes our own objections in more detail. J. H. Conway Department of Mathematics Princeton University Princeton, N] 08544 USA emaih
[email protected] T. C. Hales Department of Mathematics University of Michigan Ann Arbor, M148109 USA emaih
[email protected] D. J. Muder Mitre Corporation Burlington Road Bedford, MA 01730 USA emaih
[email protected]
Dear Wet Hen, Unrivaled explained that the item you ordered was temporarily out of stock and assured Quick-Fix that it has now been sent to you. Please let Quick-Fix know if you have not received this item, for in that case we will poor-mouth the Unrivaled Can-Opener Company unmercifully.
N o w it occurs to me that the Mathematical Intelligencer could use a Quick-Fix column of its own. A typical M.I. Quick-Fix might read as follows: Dear M.I. Quick-Fix, In 1989 1 wrote to Professor X, Dear Honorable and Honored Professor Dr. X, On behalf of my coauthor and me, I am submitting our paper, "A note on totally analytic and algebraically topological infinitely differentiable Riemannian manifolds," for publication. Several weeks later I received a reply, which indicated that our paper had been assigned number 0000-0 and had been sent to a referee. A year later I inquired about the paper and received the following reply: "We have received your inquiry concerning your paper 0000-0, 'A note on totally....' I have reminded our referee. As soon as I receive his report I will inform you." All subsequent inquiries about our paper have gone unanswered, even though I have taken to sending them by registered mail. Can M.I. Quick-Fix find out what has happened to 0000-0? Sincerely Yours, Hopping Mad
N. ]. A. Sloane Department of Mathematics AT&T Bell Labs Murray Hill, NJ 07974 USA emaih
[email protected]
Proposed Feature for Intelligencer My local newspaper has a feature called Quick-Fix. A typical Quick-Fix column goes something like this: Dear Quick-Fix, Four months ago I ordered a can opener from the Unrivaled Can-Opener Company and enclosed a check for $19.95. Although the company cashed my check in June, I still have not received my can opener!! Sincerely Yours, Madder than a Wet Hen
Dear Professor Hopping, M.I. Quick-Fix has learned that your paper was viewed more favorably by the journal's referee than by its editors. Therefore, the editors have been waiting for someone else to discover and publish your results so that you can be sent a letter of rejection that will seem reasonable both to you and to the referee who recommended your paper for publication. The wait has been longer than the editors had anticipated, and they apologize for the delay; but at last your principal result has been published by another author in another journal, and the editors have assured M.I. Quick-Fix that they will reject 0000-0 forthwith. Please let us know if M.I. Quick-Fix can be of any further assistance. Perhaps there are other readers of the Mathematical Intelligencer who have need for a Quick-Fix feature. Peter Fletcher 2466 Meadow CreekRoad Christiansburg, VA 24073 USA THE MATHEMATICALIN'I~LLIGENCERVOL. 16, NO. 2,1994 5
The Birth of Lie's Theory of Groups Thomas Hawkins
In 1865 when Sophus Lie (1842-1899) completed his studies at the University of Christiania (now Oslo), Norway, he had no idea he was destined to become a mathematician. He had done well, but not brilliantly, in all subjects and was toying with the idea of becoming an observational astronomer. He even gave lectures on the subject in the student union. He had a real talent for explaining the geometry of the heavens. To support himself financially while in this state of career indecision he gave private instruction in mathematics. In this connection, Lie began to read the geometrical works by Poncelet, Chasles, and above all Pl~icker. Inspired by his reading, he did some original mathematical research on the real representation of imaginary quantities in projective geometry, a portion of which was accepted for publication by one of the leading mathematics journals of the t i m e - Crelle's journal in Berlin. On the basis of this experience, he decided to devote himself to geometrical research, to become a mathematician. He was 26 years old. Five years later, during the fall of 1873, Lie made a second fateful decision: to devote himself to the enormous task of creating a theory of continuous transformation g r o u p s - - a task that meant doing mathematics of a quite different sort from the geometrical work that had occupied him in his first years of mathematical research, 1 8 6 9 - 1 8 7 1 - a task that ended by occupying most of his creative mathematical energies for the remainder of his career. The purpose of this article is to explain how Lie was led from the one decision to the other. To accomplish this, I have to immerse you in the mathematical world inhabited by Lie, a world that is quite different from the one to which you are accustomed. The first two sections of this article concern the early geometrical work of Lie (18691871), done in close contact with Felix Klein. It was from this work that the ideas emerged which served to redirect Lie's researches. The third section briefly discusses the years 1872-1873, when the theory of first-order PDEs, 6
particularly in the form given to it by the work of Jacobi, provided what turned out to be a fertile context for the development of the group-related ideas that had emerged during Lie's geometrical investigations. One of the reasons the history of mathematics is fascinating is that it provides insight into the dynamics by means of which ideas and concepts from diverse mathematical theories combine, in remarkable, often unexpected ways, to give birth to entirely new mathematical theories. In the course of showing you how Lie was led from his deci-
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2 (~ 1994 Springer-Verlag New York
sion to become a mathematician to his decision to create a new and far-reaching mathematical theory, I also hope to give you some feeling for the manner in which concepts, theorems, ideas, and viewpoints from 19th-century algebra, geometry, and analysis played roles in this creative process.
Tetrahedral Line Complexes On the basis of the mathematical creativity exhibited by Lie in his essay on the real representation of imaginaries, he was granted a stipend to leave Norway and to travel to various centers. His first stop was Berlin, where the mathematical scene was dominated by Kummer and Kronecker and, above all, Weierstrass. Berlin was certainly one of the world's foremost centers of mathematics in 1869. However, the mathematics and the spirit in which it was done did not appeal to Lie, who found it too analytical. In Berlin he met another visitor who felt as he did. His name was Felix Klein (1849-1925). He was just 20 years o l d - - 7 years younger than L i e - - a l t h o u g h he already had his doctorate. Klein had been Pliicker's student, and after his mentor's untimely death in 1868, he had edited Plficker's lectures on line geometry for publication. Lie had, in fact, studied these lectures, and when they met in Berlin, they were both actively engaged in line-geometric research. So Klein and Lie enjoyed each other's mathematical company. They were self-styled "Synthesists" in the midst of analysts and arithmeticians. Although Lie and Klein had interests in common, their backgrounds and personalities were quite different. Klein had a relatively solid mathematical educat i o n - first at Bonn under Plficker and then in G6ttingen, where Alfred Clebsch (1833-1872) had attracted a circle of students, including, for example, Max Noether. Clebsch had been trained in Jacobian analysis and mathematical physics at the University of K6nigsberg. As we shall see in the third section, his contributions to Jacobi's theory of PDEs in the 1860s were to influence Lie. By the time Klein met Clebsch, however, he was working on algebraic geometry and the theory of invariants. His unexpected death from diphtheria at age 39 was a blow to Klein, who had great admiration for him. Klein enjoyed learning about the work of others because he wanted to "understand" it from his own point of view and to place it within a more encompassing, conceptually unified picture. By contrast, Lie had a limited mathematical background (and no doctorate yet). He tended to focus rather exclusively on developing his own (highly original) ideas and became interested in the work of others only when it was clear it was relevant to his own interests. This difference in personalities dictated the way they related to each other: Lie developed his ideas, explained them to Klein, Klein reacted, Lie (sometimes) responded to Klein's reactions, and so on. When Klein and Lie met in Berlin, Lie was studying what were known as tetrahedral line complexes.
Line complexes were a basic object of study in line geometry, a geometry in which lines rather than points are taken as the basic objects. Thus, lines rather than points are coordinatized. Given a line f c p3(C), if ( X l , . . . , X4) and (yl,--., y4) are the homogeneous coordinates of two points on ~, set pij = x i y j - x j y ~ . Then P12, P13, P14, P23, /942, and P34 are the six homogeneous P1/icker coordinates of L They satisfy the relation f2 = P12P34 + P13P42 + PI4P23 -~ 0.
In terms of line coordinates, a line complex of degree d is a set of lines f = ( P 1 2 ~ 9 9 ~ P 4 2 ) whose line coordinates Pi; satisfy a homogeneous polynomial equation of degree d (in addition to f~ = 0). From an abstract viewpoint, line complexes are simply projective varieties in p5(C), but in the mid-19th century the fact that the homogeneous coordinates corresponded to lines was always in view, and attention was focused on particular types of line complexes and their special geometrical properties. The tetrahedral line complex studied by Lie was a special second-degree complex, which may be defined geometrically as follows. Let A denote a tetrahedron with faces determined by planes 7 h , . . . , 7r4,and let k be a fixed constant. The totality T of all lines f c CP 3 such that the cross ratio of p~ = f n 7ri, i = 1 , . . . , 4, is k is called a tetrahedral complex. Other mathematicians had considered the geometry associated to a tetrahedral complex before Lie, but Lie's approach was totally original. Consider the set G of all projective transformations of space which fix the vertices of A. (Lie himself did not introduce notation such as G; he simply spoke of the transformations in G as the transformations of the tetrahedron.) Then for any fixed line t0, the set T of all lines T[Q] such that T c G is a tetrahedral complex. In effect, a tetrahedral complex for Lie was the orbit of some fixed line under the transformation of G. An idea of what the transformations comprising G are like can be obtained by choosing for A the tetrahedron with vertices at the origin and the points at infinity on the coordinate axes. Then the T E G are given in Cartesian coordinates by x' = Ax,
y' = #y,
z' = uz.
(1)
In his study of the geometry of tetrahedral complexes, Lie took advantage of the following properties of the transformations of G: If T1, T2 E G: (a) T1 o T2 E G; (b) T1 o T2 = T2 o T1; (c) for "general" p, q 9 p3 (C), a unique T 9 G exists such that T ( p ) = q; (d) there is a threefold infinity (oc3) of transformations in G. For us today, the fact that G is a group stands out. We see Lie using the commutativity and simple transitivity of G. But in 1869, the theory of groups was not a part of the basic mathematics known to all active mathematicians. In 1869, "group theory" meant the theory of finite permutation groups, which was known to have a nice application to algebraic equations as shown by Galois. However, relatively few mathematicians were actively interested in this subject. THE MATHEMATICALINTELLIGENCERVOL.16, NO. 2,1994 7
One place where permutation groups and Galois theory were the subject of lectures was the University in Christiania, where Sylow presented such lectures in 1862. In fact, Lie had actually attended these lectures! If history were more "rational," this would have been a turning point in Lie's life. But, apparently, history is not so ration a l - and, as a result, is more fascinating. According to Lie, he understood hardly anything that Sylow said and the lectures made no impression on him whatsoever. This is not surprising, since throughout his life Lie never displayed an interest in algebra or the theory of numbers; he was at heart a geometer. In 1862, he was still more than 5 years away from his decision to become a mathematician. Undoubtedly, Lie came away from Sylow's lectures with a vague realization that the theory of permutation groups had an important application to the resolution of algebraic equations, but there is no evidence to suggest that, before he met Klein, Lie regarded the totality of transformations G of the tetrahedral complex as the analogue of a group of permutations or, more importantly, that he thought this analogy was somehow significant. All the evidence suggests that it was Klein who perceived in Lie's work a continuous analogue of the theory of permutation groups, with analogous applications. Before coming to Berlin, Klein had spent several months in G6ttingen with Clebsch and his students. Shortly before Klein arrived, Clebsch had been in contact with a young French mathematician named Camille Jordan (1838-1922). Jordan was in the finishing stages of composing a systematic treatise on the theory and application of permutation groups. Jordan's book represented the first attempt to systematize Galois's cryptic ideas on polynomial equations and, in particular, to emphasize and develop their group-theoretic basis. Jordan was also interested in applying the theory of permutation groups to geometry, and this is w h y he made contact
Figure 1. Kummer surface with 16 real nodal points. (Reproduced from Kummer's Quartic Surface by R.W.H.T. Hudson, Cambridge University Press.) 8
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2,1994
with Clebsch. Consider, for example, the fourth degree Kummer surface with 16 nodal points (Fig. 1). Jordan was interested in the polynomial of degree 16 whose roots yield the nodal points. From the geometrical properties of these points, he was able to determine the Galois group of the equation, and then to use its structure-- the composition series--to conclude that the resolution of the equation reduced to resolving the general equation of degree 6 and some quadratic equations. Clebsch's work in enumerative algebraic geometry supplied Jordan with further geometrically relevant polynomial equations to analyze in this fashion. Clebsch admittedly did not fully understand the group theory Jordan was applying, but he encouraged Jordan in his efforts, apparently sensing their importance. In fact, the inaugural 1869 volume of Clebsch's new journal, Mathematische Annalen, contained Jordan's expository "commentary" on Galois's most famous memoir. H o w much group and Galois theory was discussed among Clebsch and his students, including Klein, in G6ttingen is unknown, but it seems likely that there was some general talk about permutation groups, Galois's theory and Jordan's geometrical applications. In any case, Klein knew about Jordan's work on Kummer surfaces. Klein was an expert on these surfaces, which he showed could be studied completely in terms of a second-degree line complex. He cited Jordan's results about the nodal points in one of his papers, written while at G6ttingen in 1869, just before he came to Berlin. At Berlin, as Klein became acquainted with Lie and with his investigations on tetrahedral complexes, he perceived significant analogies between what Lie was doing and Galois's theory. Of primary importance in this connection was Lie's work on problems of a type considered earlier by Pliicker. Plficker associated to a quadratic line complex, such as T, a family of "cones." The cone at a point p consists of all lines f 6 T which pass through p (so p is the "vertex" of the cone). Plficker had used the cones to define geometrically interesting surfaces associated to quadratic line complexes. It was in the spirit of this work that Lie posed problems such as: Determine surfaces S with the property that at each point 196 S, the complex cone at p "touches" S at p. Here the cone at p is said to touch the surface S at p if it meets the tangent plane to S at p in one of the lines of the cone as pictured in Figure 2. Lie observed that a surface S defined by z = ~(z, y) has this property if and only if ~ is a solution to a certain first-order PDE, that is, an equation of the general form f(x,y,z,p,q) = O, where p = Oz/Ox and q = Oz/Oy. From the geometry of the situation, Lie could see readily that if a surface S has the desired property, then so does any surface obtained from S by any transformation of G. In terms of the PDE, this says that the transformations of G take solutions into solutions. For future reference I shall employ the terminology later introduced by Lie to describe this property.
By 1871 Lie had pushed this idea to the following extent: Suppose a first-order PDE f(x, y, z, p, q) -- 0 admits k _< 3 known independent, commuting infinitesimal transformations. Then new variables X, Y, Z may be chosen so that the equation becomes (1) F(P, Q) = 0 if k = 3, (2) F(Z,P,Q) = 0 i f k = 2, (3) F ( X , Y , P , Q ) = 0 i f k = 1. In each case, the form of the transformed equation simplifies its integration. For example, regarding case (2), Lie observed that it followed from known results that the solution to such a PDE reduces to a quadrature. Lie's proof of this proposition was very vague and intuitive.
The Line-to-Sphere Mapping Figure 2. A surface touched by complex cones. (Reproduced with the permission of Chelsea Publishing Company.) A PDE f(x, y, z, p, q) = 0 is said to admit the transformations T E G if each T E G takes solutions into solutions. Using the fact that the PDE associated to his problem admits the transformations G of the tetrahedron, Lie was able to show how to solve it. To this end he introduced the logarithmic mapping X = log x, Y = log y, Z = log z, which sends the T E G, expressed in the form (1), to the T* E G*, where T* is defined by X' = X + a, Y' = Y + f l , Z' = Z + % w i t h a = logA, fl = log#,-y = log v. Thus, the PDE f(x, y, z, p, q) -- 0 transforms into a PDE, F(X, Y, Z, P, Q) = O, which admits the T* E G*. Because the PDE F ( X , Y, Z, P, Q) = 0 thereby admits all translations, Lie concluded that the function F cannot actually vary with X, Y, Z, and, hence, must be of the form F(P, Q) -- 0, a type of PDE that had been resolved by Euler. When Klein learned about Lie's method of solving the problem, he was struck by a consequential analogy with a result due to Abel. In a famous memoir of 1829, Abel had considered a class of polynomial equations whose roots had a commutativity property that generalized the property of the roots of cyclotomic polynomials exploited by Gauss in Disquisitiones arithmeticae. Abel showed that this class of polynomials could also be solved by radicals. Abel's commutativity property of the roots is equivalent to the Galois group of the polynomial being commutative. Klein's observation of this analogy made a great and lasting impression on Lie. From that moment on, Lie always had in the back of his mind the search for a systematic mathematical theory of differential equations that would resemble somewhat Galois's theory of algebraic equations. I will call it LIE'S ID~IE FIXE. The fact that a differential equation, or a system of such equations, admits known (possibly infinitesimal) transformations, which commute or which, more generally, form a group, should translate into information about its integration. Establish theorems showing this is the case.
Before proceeding further, an update on the travels of Lie and Klein is in order. After one semester at Berlin, they had moved to the next obvious center of mathemati c s - Paris, where they spent the spring and early summer of 1870. Towards the end of their stay in Paris, Lie discovered a remarkable line-to-sphere mapping, which switched the focus of his research from tetrahedral complexes to the properties and implications of his new mapping. I will have more to say about this historically important work in a moment. The outbreak of the Franco-Prussian war caused Klein to leave Paris and return to Germany. Lie, being Norwegian, opted to hike back to Norway via Italy. He was an avid hiker with a reputation for great endurance. Lie also had some interesting hiking techniques. For example, according to Klein, in order to keep his clothing dry while hiking in the rain, he would take it off and put it in his backpack! He must have made an interesting sight. Perhaps this was one reason why he was picked up by the French authorities. He was suspected of being a German spy and thrown into prison. The letters he had written encouraged the authorities in their suspicions, for when Lie wrote (in German) about "lines" and "spheres" they thought he was writing about "infantry" and "artillery." When Lie said it was mathematics and began to explain, "Let x, y and z be rectangular coordinates . . . . " they decided he was insane! Eventually, he was released (through the intervention of Darboux), and he returned to Norway to write up the new mathematical discoveries he had made in P a r i s - - a n d in p r i s o n - - a s his doctoral dissertation. Lie and Klein's stay in Paris proved to be a far more congenial mathematical experience than their stay in Berlin. Camille Jordan was there and they met him. In fact, Jordan's great treatise on permutation groups and Galois's theory rolled off the press while they were in Paris. Klein at least appears to have taken a look at it and was impressed-- impressed both by the monumental nature of the theory expounded by Jordan and by how little he understood of it! Klein and Lie did not spend much time talking to Jordan, and they spent no time trying to decipher his treatise. They were impressed THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
9
by the existence of the theory, but they did not envision its technical details as the basis for a theory of continuous groups. By this time, the concept of an infinitesimal transformation was basic to their group-related studies connected with tetrahedral complexes and Lie's idde fixe, and they believed infinitesimal transformations would be fundamental to the development of a theory of continuous groups as well. Their primary research interests remained geometrical, and the person they spent most of their time talking to in Paris was a geometer, Gaston Darboux (1842-1917). Darboux, who was Lie's age, was part Of a group of French mathematicians who sought to develop an approach to geometry that combined concepts from differential g e o m e t r y - - f o r which there was a great tradition in France going back to M o n g e - - and concepts from projective geometry in the tradition of Poncelet. Lie and Klein tended to refer to this methodological amalgam as French "metrical geometry." French metrical geometry was inspired in part by a theorem of Liouville (1846) which showed that, by contrast to the case of the plane, conformal transformations of space are less abundant: They can all be generated by composition of certain projective transformations (similarity transformations) and transformations closely related to projective transformations: what were called transformations by reciprocal radii, that is, inversions in spheres. One of the tools used by Darboux and his colleagues was a "pentaspherical coordinate" system. Darboux pointed out to them that there was a remarkable formal similarity between some of Klein's line-geometrical results (expressed in six homogeneous line coordinates) and Darboux's results (expressed in five pentaspherical coordinates). In July 1870, shortly before his departure from Paris, Lie discovered a basis for such analogies in what he called his sphere mapping. Lie's sphere mapping was a by-product of his generalization of a basic notion in projective geometry: reciprocity, a notion that Pliicker in particular had emphasized and himself generalized. One of the simplest examples is the duality between points and planes in space. It is given by xX +yY + zZ + l =0.
(2)
Thus, corresponding to a fixed point whose coordinates are given by the lower case x, y, z, is the plane of points capital X, 1I, Z satisfying (2). Lie's sphere mapping is based on the reciprocity determined by the two linear equations (X+iY)-zZ-x=O,
z(X-iY)+Z-y=O.
(3)
By virtue of these equations it is possible to associate various geometrical objects in complex projective space r, with Cartesian coordinates (x, y, z), to objects in space R, with Cartesian coordinates (X, Y, Z). For example, for 10
~
MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
each fixed point P = (X, Y, Z) E R, (3) represents a system of two linear equations in three unknowns x, B, z, and the solutions/9 = (x, p, z) form a line f C r. Thus, (3) establishes a correspondence 9 : P - , f between points of R and certain lines of r. The lines f = ~ ( P ) obtained by varying P E R form a linear (=degree one) line complex s = {~ : ~ = ~(P), P E R}, and the mapping : R ---* s is one-to-one. Given any line ~ c r, let C(f) denote the set of lines ~* from the complex s which intersect the given ~. Then the corresponding set of points in R, S(f) = r [C(f)], is a sphere, that is, a set of points in R satisfying the usual Cartesian equation of a sphere, although the center and the radius may be complex numbers. In this manner, Lie set up a correspondence between all lines in r and "spheres" in R - - a n d , thus, between the line geometry relative to a linear line complex s in r and a "sphere geometry" in R. For example, lines which intersect in r correspond to spheres which touch in R. Incidentally, Lie sphere geometry (in any dimension) has recently become of interest to a number of mathematicians. These properties of the line-to-sphere correspondence bring out another aspect of reciprocities in general to which Lie first called attention. Each of them defines what he called a contact transformation. Contact transformations of 3-space do not transform points, they transform what Lie called "surface elements." A surface element ds consists of a point a = (x, y, z) and an infinitesimal surface through that point which may be identified with the tangent plane, (z* - z) = p(x* - x) + q(x* - x), to the surface at point a = (x, y, z). In the spirit of Pliicker, Lie thought of ds as coordinatized by the (x, y, z, p, q). Thus, a contact transformation is a transformation T : (x, y, z,p, q) ---* (X, Y, Z, P, Q) of surface elements. Lie called these transformations "contact transformations" because they preserve the contact between surfaces; that is, if two surfaces touch at a point, they are transformed by a contact transformation into surfaces which touch at the image point. Any "point transformation" - - that is, any transformation in which X, II, Z are functions of (x, y, z) (and not p and q as well)--defines (by prolongation) a contact transformation, for point transformations evidently preserve the contact between surfaces. But many contact transformations are not generated by point transformations. An example is provided by the contact transformation E : (x, y, z,p, q) ---* (X, II, Z, P, Q) associated to the reciprocity (3). E may be defined geometrically as follows: Given a surface element in r as represented by a point and a plane through it, consider all lines through that point and lying in that plane. These lines correspond in R to a family of spheres which touch in a common point and, hence, share a common tangent plane. This point and plane define the corresponding surface element in R under the contact transformation E. Here the point (X, Y, Z) depends not only on the point (x, y, z)
but also on the chosen plane through it and, hence, on p and q. Lie pursued his investigations related to the sphere mapping in two directions, both of which are historically important. The first direction was encouraged by the discovery in July 1870 that, by virtue of the sphere mapping reciprocity r ~ R, the asymptotic curves on a surface in r correspond to the lines of curvature on the associated surface in R. Asymptotic curves and lines of curvature were two types of curve lying on a surface of interest to differential geometers of the time. Lie and Klein had learned in Paris that Darboux and his colleagues had recently determined the lines of curvature of a certain class of fourth-degree surfaces which arise naturally from pentaspherical coordinates and were called generalized "cyclides." Lie observed that if one thought of these cyclides as "living" in the space R, then the corresponding surfaces in r were the Kummer surfaces that Klein had studied. In this way, Lie was able to transfer the results of the French on lines of curvature of generalized cyclides to a description of the asymptotic curves on Kummer surfaces - - a new result at that time and one that greatly impressed Klein. With the encouragement of Klein, Lie set out to explore the relations, established by his sphere mapping, between line geometry in the space r and the metrical geometry of the Darboux school in the space R. The contact transformation G sets up a correspondence between transformations of r and transformations of R: t : r ~ r corresponds to T : R ~ R, where T = Y]t[~] -1. Lie observed that various types of familiar geometrical transformations of r correspond, in this manner, to familiar types of geometrical transformations of R. All of these types corresponded to groups, although Lie did not mention this fact explicitly. For example, the totality L10 of 0o1~projective transformations of r which (as line transformations) take s into itself corresponds to the totality G10 of 0010 conformal transformations T of S. The totality M7 of OO7 t E L10 which fix ~ (the line at ~ in the xy plane) corresponds to the totality H7 of all similarity transformations T (generated by translations, rotations, and scaling maps). Klein found Lie's results very stimulating. The metrical geometry of the French involved properties invariant under the group Glo of conformal transformations or, more generally, under the group H7 of similarity transformations. For Klein, the key to Lie's success was the mapping 9. He saw Lie's correlation of projective line geometry in r with metrical geometry in R as based on the existence of the map r : s --, R, which brings with it Llo --~ Glo and M7 --* H7. Incidentally, the group Llo, the projective group of a linear line complex as Lie later called it, is the projective symplectic group with structure type C2. Symplectic groups first arose in Lie's work in this manner. As the group Glo of conformal transformations has structure type B2, one could say (anachronistically) that Lie's correspondence between the projective line ge-
ometry of a linear complex in r and the metrical geometry of R was a reflection of the "accidental isomorphism" of C2 and 82. Lie's work convinced Klein that he had become too dogmatic in his assumption, encouraged by his association with Clebsch, that projective geometry was more fundamental than other types of geometry. The equivalence established by the map 9 put three-dimensional metrical geometry on an equal footing with the line geometry of the linear complex s Now the totality of all lines in 3-space coordinatized by Plficker coordinates is four-dimensional, s being of course only threedimensional. This fact led Klein to ask whether an analogous correspondence exists between all of line geometry (which is four-dimensional) and four-dimensional metrical geometry. Klein showed the answer is yes. Klein's answer, published in 1871, is itself less important historically than the new conception of the essence of geometry implicit in it, for Klein took the radical view that a geometry can be described by specifying (i) a manifold A4 of "elements" and (ii) a group G of transformations T : A4 ~ At, which defines the invariant relations of the geometry. He regarded two geometries (A41, G1) and (.M2, G2) as equivalent if there exists a one-to-one map 9 from A41 onto J~2 such that G1 -- k~-lG2 k~. Klein regarded line geometry as determined by the pair A41,G1, where .A41 consists of all ( x l , . . . ,x6) E PS(C) which satisfy f~ = x 2 + ... + x 2 = 0. Here, following Klein, Plficker coordinates pq have been replaced by those coordinates xi in which the quadratic form f~ is expressible as x 2 + ... + x 2. Accordingly, G1, the group of all projective line transformations, consists of all T C PGL(6, C) such that T takes .M1 into itself. Fourdimensional metrical geometry was defined by analogy with the three-dimensional case as the pair .M2, G2, where .M2 = p4(C), and G2 is the conformal group of 4-space, the group generated by four-dimensional similarity transformations and transformations by reciprocal radii. Having characterized line geometry and fourdimensional metrical geometry in this radically new way, Klein then showed that these two geometries are equivalent in the sense that a mapping @ exists. The conception of geometry implicit in the abovedescribed work was made explicit by Klein 2 months later in an essay "On Geometrical Methods." In it he took the view that every geometrical method is determined by specifying a manifold A4 and a group G of transformations taking it into itself. Although Klein never published this essay, the ideas were finally published in his famous Erlangen Program of 1872. As you can see, Klein owed much in the way of inspiration for these ideas to Lie's work on his sphere mapping. Lie himself was led in another direction by the sphere mapping research. Lie's reciprocity defined by (3) was, he realized, but one example of an infinite family of reciprocities. Just as (3) defined the contact transformation ~, so too, any type of reciprocity THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
11
F(x,y,z,X,Y,Z)
(A) F ( x , y, z, X , Y, Z) = O,
(B) G(x, y, z, X , Y, Z)
=0 0
F(x,y,z,X,Y,Z) =0 (C) G ( x , y , z , X , Y , Z ) = O H(x,y,z,X,Y,Z)
= 0
defines an associated contact transformation T: (x, y, z, p, q) --* (X, Y, Z, P, Q). For example, take type (A). Fix X, Y, Z and regard F = 0 as defining z as a function of x and y, to obtain, by differentiation of F = 0, the equations Fx + Fzp = 0 and F v + Fzq = 0, where F~ = a F / O x , p = Oz/Ox, etc. Similarly, with x, y, z held fixed, differentiation of F = 0 yields F x + F z P = 0 and F y + F z Q = 0. The 5 equations F = 0 , . . . , F y + F z Q = 0 in the 10 unknowns x , y , z , p , q , X , Y , Z , P , Q can be solved for X, Y, Z, P, Q (assuming the requisite implicit function theorem applies) to obtain 5 equations expressing each of X, Y, Z, P, Q as a function of x, y, z, p, q. These equations define the contact transformation T associated to the reciprocity F ( x, y, z, X , Y, Z) = O. Pliicker had called attention to type ( A ) - - b u t in two dimensions and with F restricted to being a polynomial. Type (C) defines ordinary "point transformations"; for example, for a fixed point x, y, z, system (C) defines (generally) a p o i n t - - the intersection of three surfaces. But it was Lie who saw all of these types as contact transformations. His sphere mapping represents a type (B) contact transformation--apparently the first of type (B) to be considered by any geometer. By 1872 Lie had obtained a characterization of contact transformations that proved more useful in analytical reasoning and which may make the concept seem more familiar today. Contact transformations of threedimensional space turn out to be those transformations of five variables, T : ( x , y , z , p , q ) ~ ( X , Y , Z , P , Q ) , which leave the Pfaffian equation d Z - ( P d X + Q d Y ) = 0 invariant, meaning that dZ - ( P d X + Q d Y ) = p[dz - (pdx + qdy)], p = p(x, y, z, p, q) # O.
(4)
Although geometry had been the inspiration for the general idea of a contact transformation, Lie's interest in developing it was due mainly to its potential importance in the study of PDEs. In the mid-1860s the German mathematician Paul Du Bois-Reymond had focused attention on a very special type of contact transformation, a type which had been used occasionally in the 18th century by Euler, D'Alembert, and Legendre to transform PDEs. Typical is the transformation X = -q,
Y = p,
P = y, This
transformation
Z = z-px-
qy,
Q = -x. takes
a
(5) first-order
PDE
F ( X , Y, Z, P, Q) = 0 into another such PDE, f ( x , y, z,p, q) = 0;because Z = z - p x - q y = z - Y x + X y , so P = cgZ/OX = y and Q = O Z / O Y = - x . In other
words, (5) actually transforms a first-order PDE into an12 THEMATHEMATICAL INTELUGENCER VOL. 16, NO. 2, 1994
other such equation because the first three equations (for X, Y, Z) imply the remaining two, so that P is actually coZ/OX and Q is actually OZ/OY. Du Bois-Reymond tried to get a more general class of transformations by which a first-order PDE might be transformed into another, but he did not get very far. Certainly he had no general concept of a contact transformation. Lie realized that any contact transformation may be used to transform a first-order PDE. As Lie had learned while in Paris, Monge had showed how to develop the 18th-century theory of PDEs in terms of geometrical constructs. Lie dreamed of doing something analogous by means of the geometrical notions associated with contact transformations. One aspect of that dream was a prospective "invariant theory" of contact transformations. A general goal was to determine when two given systems of PDEs can be transformed into one another by a contact transformation. Such a transformation would take the solutions of the one system into those of the other, so that solving the one system would be tantamount to solving the other. Now, if one system has a property (P) that is preserved under all contact transformations, then it can only be transformed into another system having (P). Therefore, the problem arises to determine invariant properties of systems of PDEs in the hopes of determining necessary and sufficient conditions that one system be transformable into another by a contact transformation. The idea of this prospective invariant theory was partly inspired by the results Lie had obtained transforming various types of PDEs using the contact transformation E associated to the sphere mapping. Lie was thinking about this in the closing months of 1871 when he received a copy of Klein's essay on geometrical methods. At first, Lie failed to see what Klein was proposing. But once he understood, he became very enthusiastic, because his project on invariant properties of PDEs exemplified what Klein was saying in his essay: Lie's prospective theory could be thought of as the study of the Kleinian geometrical method corresponding to the five-dimensional manifold of surface elements x, y, z, p, q and the group of all contact transformations. (Strictly speaking, these contact transformations form a Lie pseudogroup - - but all group-related concepts were only vaguely articulated at this time.) Klein, in turn, was excited by Lie's observations, for he had not had anything like Lie's example in mind! When Lie and Klein were finally together again in the fall of 1872, they made their new ideas public in the following ways. Lie, with Klein's editorial assistance, announced some of his ideas on PDEs and contact transformations. Klein, with Lie's assistance, composed his Erlangen Program on geometrical methods. In the Program, Klein wrote, "Given a manifold and a group of transformations of it, develop the invariant theory related to the g r o u p " - - a n d that is precisely what Lie intended to do for the group of contact transformations.
In the Erlangen Program, Klein called for the development of an autonomous theory of continuous transformation groups. He pointed out that in Jordan's 1870 treatise, the theory of finite groups is first developed independently, then its applications are developed. Klein and Lie were suggesting some sort of development of the theory of continuous groups, with the applications to follow. Klein probably had geometrical applications in mind, but Lie still had his iddefixe-- an application of group theory to differential equations somewhat analogous to Galois's theory and perhaps now also involving the invariant theory of contact transformations. But who would develop the group theory? Did they actually envision carrying out such a development themselves? Certainly not in 1872. Neither of them felt prepared to do it, even though they claimed the continuous theory would be easier to develop than the "discrete" theory. After writing the Erlangen Program, Klein's interests turned to other matters, which did not involve continuous groups at all. As for Lie, he was neither prepared nor inclined to grapple with the theory of continuous groups. For example, according to his recollections, he realized at the time the Erlangen Program was written that implicit in it was the problem of classifying continuous groups. Lie, however, felt that such a classification was "absurd or impossible" - - that is, so far beyond what he would be capable of resolving as to appear ridiculous. This is the same negative attitude that both Lie and Klein had shared earlier, in 1871, when a spinoff of the work on tetrahedral comp l e x e s - their theory of W-configurations--had led to a problem of classifying certain commutative subgroups of the general projective group. This classification problem they also had dismissed as too difficult to consider after attempting to resolve it in special cases. Slightly over I year later, however, Lie had completely changed his mind! By the end of 1873 he had decided that he could develop a theory of continuous groups, and that the effort to do so was warranted. What caused him to change his mind? In a word, the answer is: Jacobi.
Jacobi and First-Order PDEs By the time Lie and Klein got together to write the Erlangen Program, Lie was trying to work out his geometrical theory of PDEs, based on his "invariant theory" of contact transformations. At that time, the theory of first-order PDEs was dominated by the results of Jacobi that had been inspired by his study of Hamilton's papers on dynamics. Jacobi had actually obtained these results in the late 1830s--he died in 1 8 5 1 - but many of them were first published by Clebsch in the period 1862-1866. Many mathematicians.were interested in developing Jacobi's theory, but Lie was unique in that he approached it with a geometrical background and with "groups" and his iddefixe in the back of his mind. To conclude this article, I will briefly indicate how Lie's approach was linked by Lie to Jacobi's theory, and why the link encouraged
him to take on the challenge in the Erlangen Program. Let FI(x,p) = Fl(Xl,...,Xn,Pl,...,pn) = O, Oz (6)
Pi - Oxi' denote a first-order PDE which does not explicitly involve the dependent variable z = ~ ( x l , . . . , x,~). It was known that the solution of any first-order PDE could be reduced to this special type. Jacobi's method of solving (6) made fundamental use of the brackets that had been introduced by Poisson in 1809 in connection with Lagrangian mechanics. If G and H are any functions of 2n variables (x,p) =
(Xl,...
,
xn,Pl,...
,Pn),
then
(G,H) = ~--~ ( OG OH i=10piOxi
OG OH~ ~ ~ ] .
(7)
Jacobi observed that the PDE (6) could be completely solved if functions F2, 999 Fn of the variables (x, p) could be determined such that F1,..., Fn are functionally independent and satisfy the relations (Fi, Fj) = 0 for all i and j. He then proceeded to devise a method for producing the functions Fi. [In other words, Jacobi's method made the Hamiltonian system defined by H = F1 (x, p) completely integrable.] Jacobi also discovered a property of the Poisson bracket (7) that Poisson and Lagrange had apparently missed: JACOBI'S IDENTITY. If F, G, and H are any three func-
tions of the variables (x, p), then ((F, G), H) + ((G, H), F) + ((H, F), G) = 0.
(8)
The Jacobi Identity played a fundamental role in Jacobi's method for determining the functions F2, 999 Fn. He also saw the Identity as having important implications for an earlier method he had devised for solving a firstorder PDE. The earlier method solved F1 (x, p) = 0 by determining 2n - 1 functionally independent solutions 9 = F1,..., F2n-1 to (F1, (I)) = 0, one of which is automatically (I) = F1. Jacobi believed that if he knew just two solutions 0 = F2 and (I) = F3 to (F1, O) = 0 such that F1, F2, F3 are functionally independent, then "in general" the remaining 2n - 1 could be determined by (8). Indeed, given the solutions F2 and F3, it follows readily from (8) that F4 = (F2, F3) is another solution. The same reasoning then implies that F5 = (F3, F4) is another solution, and so on. Jacobi was convinced that "in general" it would be possible to produce in this manner the requisite 2n - 1 independent functions. He realized, on the other hand, that in "particular cases," such as when a certain number of solutions to (F1, O) = 0 are known from general mechanical principles, the above bracketing process falls short of the goal of generating 2n - 1 independent solutions. For example, it could happen that THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
13
F4 = (F2, F3) vanishes or is some function of F1, F2, and F3. He claimed that in such cases, knowledge of these solutions could be used to simplify the problem of solving the PDE F1 (x, p) = 0. Perhaps due to the poor health that plagued him after 1843 as a consequence of diabetes, Jacobi never published any detailed justification of his claims before his death in 1851. Most mathematicians doubted them. Still left open
Mathematicians writing about (11) would switch to (10) without a word and vice versa: To them, (10) and (11) were two sides of the same coin. Lie also fell into this habit of identification. Thus, d T could be identified with the system (10) or with (11). Jacobi's work added another twist to this identification process. In developing his method of solving first-order PDEs, Jacobi had introduced differential-operator notation such as
was
JACOBI'S PROBLEM. Suppose 9 = F1, . . . , Fr are r func_ tionally independent solutions to ( F1, cb) = 0 with the property that bracketing produces no more solutions, that is, for all i, j ( Fi, Fj ) is functionally dependent on F1, . . . , Fr: (Fi,Fj) = f2ij(F1,...,F~),
i,j = 1,...,r.
(9)
How does knowledge of F a , . . . , G simplify the problem of solving F1 (x, p) = 0? Although several mathematicians had dealt with this problem in the case that the equations in (9) have the special form (Fi, Fj) = 0, +1, Lie was the first to tackle the general problem. For him, it was an expression of his idde fixe. This understanding of the problem was based on discoveries he had made in seeking to develop his invariant theory of contact transformations. Before discussing these discoveries, some preparatory remarks about Lie's use of infinitesimal transformations are necessary. Infinitesimal transformations were fundamental to the conception of a continuous group that Lie had developed as an outgrowth of his work on tetrahedral complexes in 1870. For him, a group of transformations was continuous because its transformations could be generated by infinitesimal ones. In present-day formulations of Lie's theory, the infinitesimal transformations of a group are the elements of its Lie algebra. Lie himself tended to think of the infinitesimal transformations as themselves belonging to the group. He identified such a transformation d T : y = ( Y l , . . . , Y m ) ~ (Yl + d y l , . . . , y m + dym), dyi = 71i(y)dt with the system of ordinary differential equations dyi = ~i(y), dt
i = 1,..,m.
(10)
If yi(t) = fi(t,y(~ i = 1 , . . . , m , is the solution to this system satisfying the initial condition yi(to) = y~O), then these equations defined the one-parameter family of transformations Tt : y(0) __, y(t) generated by dT. As Lie learned about the theory of first-order PDEs as it stood circa 1870, he found that solving the system (10) was equivalent to solving the linear homogeneous PDE m
Of
Z r/dY) ~
-- O.
i=1
14
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
(11)
m
of
Y ( f ) = Z ~li(Y) Oyi i=l so that (11) can be expressed succinctly as Y ( f ) = O. He also utilized the fact that if X ( f ) is another such operator, then so is X ( Y ( f ) ) - Y ( X ( f ) ) . Eventually, Lie identified the infinitesimal transformation dT with Y ( f ) (which nowadays would, in turn, be identified with a vector field). If this identification is expressed by the notation dU ~ X ( f ) , d T ~ Y ( f ) , then Jacobi's calculus of differential operators led Lie to the further identifications: d U o dT ~ X ( f ) + Y(f), d U o d T o dU -1 o dT -1 ~ X ( Y ( f ) ) - Y ( X ( f ) ) .
(12)
These identifications proved inspirational to him as he developed his invariant theory of contact transformations. In the (n + 1)-dimensional space of points (x, z) = ( x l , . . . , x,~, z), contact transformations are transformations T : (x, z, p) ~ (X, Z, P) of the 2n + 1 variables (x, z,p) = ( x l , . . . , X n , Z , p l , . . . , p n ) which satisfy the higher-dimensional analog of (4). Lie discovered that in developing his theory for PDEs of the form F ( x , p) = O, he could restrict his attention to those with equations of the form Xi = ~i(x,p), Z = z + ~(x,p), P~ = 7ri(x,p). Following Lie, these contact transformations will be referred to as transformations of (x, p). He realized that, as transformations (x, p) ---* (X, P), they define canonical transformations in the sense of Hamilton-Jacobi dynamics. A link between the theory of PDEs and his idde fixe was provided by his discovery that d T ~ X ( f ) is an infinitesimal transformation of (x, p) if and only if there is a function W ( x , p) such that X ( f ) : [W,f]
forallf = f(x,z,p),
(13)
where the generalized Poisson brackets are defined for functions G, H of (x, z,p) by O__~GD H [G, H l = ~ i=l Opi Oxi
D___GGOIt -
OXi
D 0 0 Dxi - Ox--i + Pi -~z"
0
Pi
'
Generalized Poisson brackets had been introduced into Jacobi's theory by his successors. They have the same basic properties as the Poisson brackets (7). In particular, the analogous Jacobi Identity holds. Lie discovered the connection between infinitesimal transformations of (x, p) and functions W(x, p) by a line of reasoning typical in projective geometry: Take a simple fact (e.g., about a circle) and apply projective transformations to turn it into a general theorem (e.g., about conics). Lie's simple fact was from the theory of first-order PDEs, and contact, rather than projective, transformations were applied to obtain the general result. Consider Jacobi's Problem. Giventhe solutions ep = Fi to (F1, ~) = 0, it follows from the chain rule that any function of F1,..., Fr, F = O(F1, 999 F~) for some function O of r variables, is also a solution to (F1, ~) = 0. Thus, in Jacobi's Problem all solutions of the form F = O ( F 1 , . . . , F~) are known, and the problem is to use this information to simplify the integration of F1 (x, p) = 0. N o w by virtue of (13), in Lie's mind F = O ( F 1 , . . . , F~) corresponded to an infinitesimal transformation X (f) = [F, f]. He realized that an infinitesimal transformation X ( f ) is admitted by a PDE Fl(x,p) = 0 if and only if X(F1) = 0 for all (x,p) satisfying Fl(x,p) = 0. By virtue o f (13), X(F1) = IF, F1] = (F,F~) = 0, so that F~ = 0 admits the infinitesimal transformation X ( f ) = IF, f] corresponding to F = O ( F 1 , . . . , F,). Lie also realized that these X ( f ) form a Lie algebra. This property of the X ( f ) is a consequence of the Jacobi Identity for generalized Poisson brackets. At this time, Lie did not concern himself with the question as to whether a group corresponded to this Lie algebra. He seems to have assumed it did, but, in any case, he focused his attention on the "group" of infinitesimal transformations X(f), which will be denoted by G(F1,..., G ) . Jacobi's Problem could then be interpreted as an expression of his idde
fixe: Given that the PDE FI(x,p) = 0 admits the group 6(F1, 999 F~) of infinitesimal transformations, how does
this knowledge simplify the problem of solving it? Lie provided an answer to this question and called the attendant theory his theory of function groups. It became the core of his invariant theory of contact transformations. N o w it can be seen as a precursor of that part of symplectic geometry involving Poisson structures. Although the concept of a function group was motivated by Jacobi's Problem, Lie's actual definition was independent of the Problem and runs as follows. Given functionally independent functions of (x, p), F 1 , . . . , F~, which satisfy relations, of the form (9) - - b u t which need not be solutions to (F1, O) = 0 - - let $'(F1,.. 9 F~) consist of all functions F which are functionally dependent on F 1 , . . . , Fr, so that F = O ( F 1 , . . . , F~) for some O. Then G( F 1 , . . . , Fr) consists of all infinitesimal transformations of the form X ( f ) = [F, f] for some F E 9V(F1,..., F~). In
his publications, written for 19th-century analysts with no understanding of group-related concepts, Lie officially designated 5V(F1,..., Fr) as an "r-term function group" and did not mention G(F1,..., Fr), even though he himself thought in terms of G. Equipped with the Poisson bracket, ,~(F1,..., Fr) is a Lie algebra in its own right, and the mapping F --* X (f) = [F, f] is a h o m o m o r p h i s m with kernel consisting of the constant functions, so that the distinction between Y and G is more significant conceptually than mathematically. Using his theory of function groups, Lie was able to resolve the problem of w h e n two systems of PDEs, Fi(x,p) = 0 and Gi(x,p) = 0, i = 1 , . . . , r , are transformable into each other by a contact transformation in the sense that each Fi gets transformed into the corresponding Gi. Recall from the second section that this was a major goal of his prospective invariant theory in 1871. Lie first solved the problem w h e n the Fi and Gi form r-term function groups. In this case, the necessary and sufficient condition is that for all i and j, (Fi, Fj) is the same function of F1,. 99 Fr as (Gi, Gj) is of G 1 , . . . , G~:
(Fi, Fj) = 12ij(F1,..., Fr) iff (ai, aj) = f~ij(G1, . . . , at).
(14)
As we shall see, Lie discovered this theorem had an inspiring application to the group classification problem. As a Lie algebra, G(F1, 999 F~) is infinite-dimensional. Jacobi's method of solving first-order PDEs was also the source of Lie's discovery that finite-dimensional groups are likewise important to the theory of first-order PDEs. In 1866 Clebsch had developed his theory of complete systems of linear homogeneous PDEs, motivated by the fact that both Jacobi's method and his own extension of it to Pfaffian equations reduced the problem to solving complete systems. (Clebsch's theory is now subsumed in the theory of completely integrable vector fields.) Lie sought to apply his idde fixe to complete systems just as he had to Jacobi's Problem. In the case of Jacobi's Problem he had seen that if F1 (x, p) = 0 admits a finite n u m b e r of infinitesimal transformations, then, by applying the bracketing process, it can be assumed without loss of generality that F1 = 0 admits a function group ~ ( F 1 , . . . , F~). Within the context of complete systems, somewhat analogous reasoning led him to the discovery that if a complete system admits a finite number of infinitesimal transformations, then without loss of generality it can be assumed that the system admits infinitesimal transformations X1 ( f ) , . . . , X~ (f) which satisfy
Xi(Xj(f)) - Xj(Xi(f)) = ~
cijkXk(f).
(15)
i=1
In m o d e r n terms this means that the Xi(f) span a finitedimensional Lie algebra. To Lie, it meant that the complete system admits a group which depends continuTHEMATHEMATICAL INTELLIGENCERVOL.16,NO.2,1994 15
ously on a finite number of parameters, the type of group that had arisen in his early work related to tetrahedral complexes. These groups could now be seen to be important to the general theory of PDEs. Moreover, Lie discovered a link between these groups and his theory of function groups, a link which encouraged him to believe that he was now in a position to solve, for these groups, the sort of classification problem that he had earlier dismissed as impossible. For Lie, two transformation groups were essentially the same--"similar" was his t e r m - - i f the transformations of the one could be transformed into those of the other by means of a variable change. On the infinitesimal level on which he operated, the problem of classifying all finite-dimensional groups of transformations was to determine up to similarity all finite-dimensional groups (Lie algebras) of infinitesimal transformations in m variables, X ( f ) = ~_,im=lrli(y)Of/cgyi, y = (Yl,... ,Ym). Today, Lie's problem would be formulated as that of determining all Lie algebras of vector fields up to diffeomorphisms. In the case m = 1, Lie was able to use (15) to conclude by elementary considerations that there are only three distinct possibilities (the projective group of the line and its subgroups). This success undoubtedly helped him appreciate the value of (15), but his confidence that he could achieve comparable success in the case of any number of variables came from another source. As soon as the number of variables exceeds one, there are contact transformations as well as point transformations to consider. Point transformations being (as noted in the second section) special types of contact transformations, Lie's problem was to classify all Lie algebras of infinitesimal contact transformations. In this connection he had discovered that just as the study of projective transformations can be reduced to the study of linear transformations, so too the study of contact transformations can be reduced to the study of what he called "homogeneous contact transformations." On the, infinitesimal level, these are transformations of (x, p) which leave z fixed. Consequently, by (13), an infinitesimal homogeneous transformation has the form X (f) = [W, f] = (W, f). Lie's problem reduced to classifying all Lie algebras of infinitesimal homogeneous contact transformations Xi(f) = (Fi, f). Since in this case X ( f ) is given by a Poisson bracket, it follows from Jacobi's Identity that Xi(Xj(f)) - Xj(Xi(f)) = ((Fi, Fj), f), and so the Lie algebra property (15) implies
(Fi, Fj) = ~
CijkFk.
(16)
k=l
Equation (16) was the primary reason w h y Lie changed his mind about the impossibility of group classification problems. It showed him that a finite-dimensional group has associated to it a function group. Suppose that p _< r 16
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
of the functions Fi are functionally independent and that the notation is such that F1,..., Fp are independent. Then for k > p, Fk is some function of F1,..., Fp, and (16) implies that (Fi, Fj ) is expressible as a function of F1,..., Fp, which is precisely the condition (9) that F1,..., Fp determine a p-term function group G(F1,..., Fp). By means of this connection between the rdimensional group and the p-term function group, Lie believed he could classify the r-dimensional groups into general similarity types by applying the results of the theory of function groups. To give some idea of how this w o r k e d - - a n d some of the critical points Lie overlooked--it suffices to consider the case in which F1,..., Fr are functionally independent, so p = r. Suppose Xi(f) ---- (Fi, f) and Yi(f) -- (Hi, f) define two groups of this type, and that in addition [possibly by taking suitable linear combinations of the Yi(f)] the constants cijk are the same for the Yi(f). In other words, the assumption is that the two Lie algebras are isomorphic. In view of (16) this means, of course, that (Fi, Fj) is the same function of F1,..., Fr as (Hi, Hi) is of H1,..., Hr. Thus, (14) is satisfied, so that by the related theorem Lie could infer the existence of a homogeneous contact transformation which takes each function Hi into the corresponding Fi. This meant that the two r-dimensional groups Xi(f) and Y/(f) are similar. In other words, Lie concluded that groups of this type are similar if and only if their Lie algebras are isomorphic, so that the classification problem reduces in this case to that of classifying Lie algebras up to isomorphism. Lie envisioned a limited number of nonisomorphic possibilities and evidently felt it would not be difficult to determine them. In the fall of 1873 he,
... the task of creating a theory of continuous transformation groups, the task that became his life work. thus, glossed over an algebraic problem which is still not completely resolved today! The problem was first tackled in 1888 by Killing, who resolved it in the semisimple case. Since Lie's classification of groups with p < r also depended on the classification of Lie algebras, his overall approach was not as effective as he believed. At the time, however, it seemed that his theory of function groups, and more generally his invariant theory of contact transformations, had provided him with precisely the tools he needed to tackle the group classification problem. Not only was he now convinced of the importance of continuous groups to the general theory of PDEs but he also saw the theory of these groups as drawing on the same mathematical tools as the general theory of PDEs. These considerations gave him the courage to make the decision to commit himself completely to the task of creating a theory of continuous transformation groups, the task that became his life work.
Concluding Remarks Lie's encounter with the work of Jacobi was evidently a decisive factor in his decision. But, as the presentation in the first two sections indicates, it was only because of his experiences during 1869-1871, when geometry was the focus of his interests, that Lie was in a position to see in Jacobi's theory something no one else saw. During 18691871 his work had been dominated by two successive research projects: the study of the geometry of tetrahedral complexes and the study of the sphere mapping. From the first came a fundamental concept and a fundamental idea: the concept of a continuous group of transformations and the idea of a continuous analog of Galois's theory of algebraic equations-- his idde fixe. Likewise, a fundamental concept and a related idea originated in the sphere mapping work: the concept of a contact transformation and the idea of an invariant theory of contact transformations. It was in terms of these ideas and concepts that Lie assimilated the theory of first-order PDEs of Jacobi and his successors, and turned it into something quite different. . Further Reading For further information about Lie's early geometrical work, see the articles by T. Hawkins ("Line Geometry, Differential Equations and the Birth of Lie's Theory of Groups") and by D. Rowe ("The Early Geometrical Works of Sophus Lie and Felix Klein") in Volume 1 of The History of Modern Mathematics (D. Rowe and J. McCleary, eds.), London: Academic Press (1989). For more information about Jacobi's influence on Lie, see T. Hawkins, Jacobi and the birth of Lie's theory of groups, Arch. History Exact Sci. (to appear). A m o d e m presentation of Lie's group-theoretic approach to differential equations can be found in P. Olver, Applications of Lie Groups to Differential Equations, N e w York: Springer-Verlag (1986). Olver's book also contains informative historical notes, such as one on Lie's function groups and symplectic geometry (pp. 417418). Killing's work is discussed in T. Hawkins, Wilhelm Killing and the structure of Lie algebras, Arch. History Exact Sci. 26 (1982), 127-192. See also: T. Hawkins, Non-Euclidean geometry and Weierstrassian mathematics: The background to Killing's work on Lie algebras, Studies in the History of Mathematics (E.R. Phillips, ed.), Washington DC: Mathematical Association of America (1987), pp. 21-36; A.J. Coleman, The greatest mathematical paper of all time, Mathematical Intelligencer 11 (3) (1989), 29-38; N.H. Ibragimov, Sophus Lie and harmony in mathematical physics, The Mathematical Intelligencer 16 (1) (1993), 20-28.
Department of Mathematics Boston University Boston, MA 02215 USA THE MATHEMATICAL INTELLIGENCER VOL, 16, NO. 2,1994
17
The Sacred Cut Revisited: The Pavement of the Baptistery of San Giovanni, Florence Kim Williams
The pavement of the Baptistery of San Giovanni in Florence has been metaphorically compared to a carpet that has been unrolled in a spaceJ It is true in the case of the Baptistery that the motifs in the floor were inspired by the designs in Arabic carpets. But a decorated pavement can do more than merely embellish a monument. The design of a pavement may reveal an underlying ordering principle or philosophy in the architecture. The Baptistery of San Giovanni is the oldest church in Florence. The walls of the octagonal building date from about the seventh century after Christ. Its foundations are even earlier. During the 1912-1915 renovation, which returned the Baptistery to its "original" state, an excavation was undertaken inside the Baptistery which revealed the older Roman substructure under the apse. Fragments of the substructure and its mosaic decoration may still be seen through grills in the floor. By the time she became the queen of the Renaissance, almost all of Florence's Roman and Paleo-Christian structures had been destroyed. The Baptistery serves as a reminder of Florence's ancient past. Was the Baptistery intended from its conception to be a baptistery? Probably not. Despite its central plan which is characteristic of baptisteries, it probably served as a chapel. San Giovanni became a baptistery officially, however, when the font was transferred from the church of Santa Reparata in 1128. The pavement is conventionally dated 1207. It is likely that the d e c i s i o n to pave San Giovanni in the late twelfth or early thirteenth century coincided with a desire to heighten the symbolic character of the building as a backdrop to the rite of baptism. The geometry which I h a v e f o u n d to be key in the layout of the Baptistery floor provided a particularly rich and apt symbolic base for the program of architectural ornamentation. The pavement is a patchwork quilt of patterns. Three strips of pavement define the north-south and east axes.
The pavement of the west axis was destroyed in 1576 when Grand Duke Francesco I of Tuscany refurbished the Baptistery. A center medallion with a centripetal motif was laid in 1752, but was removed during the 1912 restoration. Today, the west axis and central octagon are paved in simple brown terrazzo. The north-south and east axes radiate out from the center octagon to the three doors. The strips of pavement which define the axes separate the four quadrants of pavement. Each quadrant in its turn is partitioned by decorated borders into compartments, each filled with a different motif. Of the four quadrants, the southeast quadrant is by far the richest, presenting more complex geometrical patterns as well as figural patterns. 2 Of the axes, the east axis is the most elaborate, the east portal being on axis with the Duomo and so being the ceremonial entrance. (See Figure 1.) 2EnricoMadoni has drawn each motifin the Baptisterypavement in his monograph,I1Pavimento del Battistero di Firenze, 1913.
1My drawings and research on the pavement of the Baptistery of San Giovanni are supported by grants from the Anchorage Foundationof Texas and the Graham Foundation for Advanced Studies in the Fine Arts. 18
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2 (~ 1994 Springer-VerlagNew York
Figure 1. The pavement of the Baptistery of San Giovanni in Florence, showing the central octagonal medallion and the altar on the west side. The southeast quadrant is in the lower left-hand corner. Photograph taken before the 1912 restoration of the Baptistery. (Photograph by Fratelli Alinari.) THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2,1994 1 9
.213
~
.707
f
~_
1.0
Figure 2. The construction and proportions of the Sacred Cut geometry (Figures 2-10 were drawn by Kim Williams).
Initially drawn to study the wealth of geometric patterns in the Baptistery pavement (some 60 of them), once I began work I became more and more intrigued by the layout of the pavement as a whole. An Italian scholar named Carocci wrote in 1897, "The pavement, as far as I am concerned, wasn't conceived as a systematically arranged whole because it is so irregular in its partitioning and doesn't present us with a symmetric whole as does, for example, the cathedral of Siena .... -3 1 found the partitioning of the Baptistery floor to be very regular, especially when each individual quadrant is compared to the others. The existence of an underlying order was pointed to by the pavement having been so regularly laid out in each of the quadrants, by the layout having been so deliberately emphasized by decorated borders which separated the segments from each other, and by each segment having been filled with a different geometric motif. With the help of a geometrical construction called the "Sacred Cut," I found the ordering mechanism for which I was looking. The Sacred Cut is the fundamental construction which forms the basis for Danish engineer Tons Brun4s' system of geometry, which he claims has governed the construction of monuments in every period from the Egyptian to the Medieval [1]. Let us examine the construction. To begin, one takes a reference square of a given dimension. The reference square is divided by placing the compass at a corner of the square and striking an arc which intersects the two adjacent sides and passes through the center of the square. Thus, each of 3 G. Carocci, I1 P a v i m e n t o nel Battistero Fiorentino, in Arte Italiana Decorativa e Industriale, 1897, p. 34, " . . . il p a v i m e n t o , s e c o n d o me, n o n stato ideato c o m e cosa organica, perch4 ~ assai irregolare neUa s u a divisione e n o n presenta u n a s s i e m e simmetrico c o m e per e s e m p i o quello del D u o m o di Siena . . . " 20
THE MATHEMATICALINTELLIGENCERVOL.16, NO. 2,1994
the two sides is divided into segments of proportions v ~ / 2 ~ .707 and 1 - (v~/2) ~ .293 of the length of the side. When similar arcs are struck from the other three corners, the operation is complete, and the reference square is subdivided into a nine-module grid with modules of different proportions. The four corner modules are smallest, .293 square; the middle modules are .293 x .414 rectangles; the inner module is the largest, v ~ - 1 ~ .414 square, and Brun6s calls it the "sacred square." It may be seen that the diagonal of each corner square is also equal to v~ - 1 ~ .414, thus completing a regular octagon. The Sacred Cut also provides a close approximation for squaring the circle: The length of the arc used to construct the Sacred Cut is equal to within 0.6% to the length of the diagonal of one-half the reference square (Figure 2). There is no direct evidence that the Sacred Cut was actually used in antiquity. For example, the Sacred Cut is not mentioned specifically by Vitruvius, who wrote the only surviving document on architectural practice which has come down to us. However, Vitruvius does recommend the use of geometric constructions as devices to ensure proportional harmony in an architecture. Modern-day architectural historians have regarded it as a clue to how ancient monuments were constructed. Tons Brun4s found the Sacred Cut to have played an important part in the proportioning of the Pantheon [2]. Professor Brun6s placed a circle within the section of the Pantheon so that it followed the curve of the dome, then circumscribed a square about the circle, which he used as the first reference square. Sacred Cuts performed on this square resulted in horizontals which coincided with the springing line for the dome and the top of the niches between the columns which encircle the main hall. Verticals produced by the Cuts coincided with the upper limit of the coffers in the dome. The sacred square of the first reference square then became the second reference square, and its Sacred Cuts provided verticals which coincided with the diameter of the oculus in the dome and horizontals which coincided with the top of the upper row of windows. The Sacred Cut was useful in analyzing the ground plan of the building as well. It again provided horizontal and vertical lines which coincided with major elements of the building, most notably the width of the porch. More recently, the Sacred Cut has been found by Donald and Carol Watts to have governed the layout of a second-century A.D. housing complex called the "Garden Houses" in ancient Ostia [3]. The Wattses found that three successive Sacred Cuts determined the geometrical order of the Garden House complex. They placed a circle which touched the corner of the courtyard within the complex, and constructed a square which circumscribed that circle. Sacred Cuts performed on the east and west sides of this reference square coincided with the position of the outer walls of the courtyard buildings. Similar Sacred Cuts on a second reference square
/ / / \ \ Figure 3. Using the Sacred Cut to form the outer octagon.
determined by the width of the courtyard and the position of the fountains coincide with the party walls along spines of the courtyard buildings. The sacred square of the second reference square becomes the third reference square, and its Cuts coincide with innermost walls of the 9courtyard building. The unfolding of all the Sacred Cuts from a common center reveals an emphasis on the major east-west axis of the complex. The success with which the Sacred Cut has been used in studying Roman architecture recommends its use in analyzing medieval architecture. The step from a Roman housing project to a medieval Tuscan sacred structure is not the large leap it may appear to be. Italian contemporaries of Romanesque architecture believed themselves to be carrying forward Roman building techniques, and hence Roman greatness. I was drawn to study the Sacred Cut geometry in relation to the Baptistery not only because of the link between Roman and Romanesque architectures, but particularly because I recognized the kinship between Sacred Cut geometry and the Baptistery's octagonal form. Although the octagon played no part in the analyses of either the Pantheon or the Garden Houses, the Sacred Cut is the method by which one constructs an octagon with equilateral sides from a given square. Here is how it is applied to the Baptistery floor: We construct a first square. Using the Sacred Cut procedure to construct an octagon, as discussed, we obtain the perimeter walls of the space, referred to as the "outer octagon." Next, by joining alternate midpoints we construct within the outer octagon two superimposed squares, one turned 45 ~ with respect to the other. This forms an eight-point star. Sacred Cuts are next performed on each of the two superimposed squares, and the resulting smaller eight-point star which is formed determines the octagon in the center of the pavement, referred to as the "inner octagon." (See Figures 3-6.) Originally, the baptismal font occupied the entire inner octagon. After the demolition of the font, a strongly centripetal design in black, white, and red marble took its place (see cover).
\
/
/
Figure 4. The eight-point star found within the outer octagon.
Figure 5. Generating the inner octagon using the Sacred Cut.
Figure 6. Two superimposed Sacred Cuts form the inner octagon.
THEMATHEMATICAL INTELLIGENCERVOL.16,NO.2,1994 21
We have obtained the sequence as two sequences interleaved. The first,
The Wattses found in the Garden Houses a series of integer dimensions which occurred as ratios of segments constructed by Sacred Cuts. Beginning with a reference square with a side of 41 and performing successive Sacred Cuts, using each resulting sacred square as the next reference square, one obtains approximately the following integer series: 41
29
17
12
7 5
3
2
(an):
1
2
3
5
7
(bn):
17 29
41
29
1 3
7
17
41
70...
Figure 7. The Pythagorean construction for approximating the value of v~.
22
2'3 5
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
99...,
4The 0 value was given its name and extensivelyanalyzed by P. H. Scholfieldin The Theory of Proportion in Architecture, Cambridge:Cambridge University Press (1958), 138-144. 5This particular division of the square is illustrated in Scholfield,The Theory of Proportion, p. 133,Fig. 3f. 6For more on the "principle of repetition of ratios," see Jay Kappraff, Connections: The Geometric Bridge between Art and Science, New York: McGraw-Hill(1991), 19.
as mentioned above. It is easy to prove that the ratios bn ~an have the desired property of approaching v~.
12
70...,
represents those assigned to the diagonals. Each of these is a Pell's series, with the same recursion Cn+l = 2cn + Ca--1. The ratios cn+l/Cn approach 0 = 1 + v~. 4 This ratio comes naturally into the Sacred Cut construction: It is the exact ratio of the side of the reference square to that of the inner module or "sacred square." If the side of the reference square is taken as 0v~, the modules are simply expressed (Figure 8). 5 P. H. Scholfield points out another valuable quality of this division to the architect. A rectangle divided by 2 horizontal and 2 vertical lines generically gives 36 different shapes of subrectangle. The Sacred Cut construction limits the number of shapes to 5 [4]. The Baptistery floor contains m a n y rectangles conforming to the shapes occurring in Figure 8. (See Figure 9.) Such repeated proportionality has been studied u n d e r the name of the "principle of repetition of ratios. "6
bn+l = 2an + bn.
12
12
1.
(See Figure 7.) The sequence so produced is 1
5
represents the successive values assigned to the sides of the squares; the second,
These integers and their multiples were found to figure significantly in the apartments of the Garden Houses complex. The w i n d o w s in the largest rooms are based on a 7-foot module, the w i n d o w itself measuring 5 Roman feet and the interval between w i n d o w s measuring 2 Roman feet; the width of the public corridor and stairw a y is 17 Roman feet; the inside width of the courtyard buildings is 58 Roman feet. The Wattses observed that these integers can be obtained from a familiar Pythagorean construction. The construction begins with a square with side 1, whose diagonal, of exact length v ~ according to the Pythagorean theorem, is approximated by integer 1. Subsequent approximations are obtained by recursion: if an and bn are the integers used at the nth stage, then the next stage uses an+l = an + bn,
1 2
5
12
7
S 12
I 29
17
I
I
t
d'2
1
1
I
~e
1:1
(a)
(b)
Figure 8. (a) The Sacred Cut division of the 0x/2 square. (b) The four rectangles generated by the Sacred Cut division of the 0v~ square.
Unlike the Wattses, I was unable to find within the Baptistery a n y linear dimensions which c o n f o r m e d exactly to the P y t h a g o r e a n series or the PelFs series. 7 Rather, m y h a v i n g f o u n d the p r o p o r t i o n s of the pavem e n t c o m p a r t m e n t s to be related to 0 a n d v ~ points to the use of a geometric system rather than the use of a system of integer ratios to lay out the p a v e m e n t . The Sacred Cut is an ideal e x a m p l e of such a system. W h a t remains enigmatic in the Baptistery p a v e m e n t is an a p p a r e n t irregularity in the p e n t a g o n a l pieces along the diagonal axes of the inner octagon. To maintain radial symmetry, they should be circumscribable b y a square. Instead, they are circumscribed b y a rectangle. Similar irregularities in a n o t h e r p a v e m e n t h a v e b e e n ascribed to craftsmen w h o , in articulating a s y s t e m w h o s e ordering principles w e r e u n k n o w n or lost, m a d e alterations to the pattern as they w o r k e d it out and unintentionally interrupted the system. K n o w i n g that trade secrets in the guilds of the m i d d l e ages were jealously g u a r d e d , exposing t h e m being p u n i s h a b l e by death, this is a plausible explanation. (See Figure 10.)
1:0~"2
/ \ 1: e,'~
7 After having taken all the measurements and produced a measured drawing of the pavement, I began to study the measure of the building. The Roman foot was most probably the standard unit of measure used in the Baptistery. Though it has been found to vary, the Roman foot is conventionally considered to be equal to 29.5 cm or 11.625in. The value I have used to obtain the integers used in my analysis is 28.49 cm. This results in dimensions which correspond roughly to the values in the Pythagorean construction discussed earlier. It must be noted that the Baptistery is not a perfect octagon, being some 41 cm shorter on the east-west axis than on the north-south axis. There is accordingly some disagreement between dimensions in the various quadrants.
Figure 9. The proportions of rectangles in the Baptistery pavement. THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2,1994 2 3
Figure 10. (a) The existing irregular trapezoids (shaded) which surround the octagonal center of the Baptistery pavement. (b) Ideal arrangement of irregular pentagons surrounding the inner octagon.
To seek the symbolism in the pavement of San Giovanni, we ask the significance of the geometric figure produced by the Sacred Cut geometry. The superimposed reference squares in the Baptistery create an eight-point star. The eight-point star as an ecclesiastical emblem signifies resurrection. In medieval number symbolism, eight signified cosmic equilibrium and immortality. It was considered to be particularly closely related to resurrection: Christ was resurrected eight days after he triumphantly entered Jerusalem on Palm Sunday, for example. As a symbol of baptism, the number eight, and by extension the eight-point star and the octagon, represent rebirth or resurrection of the catechumen in Christ. Further, the octagon mediates between a square, which represents things terrestrial, and a circle, which represents things celestial [5]. In fact, in wall intarsia in San Giovanni, the two eight-point stars on the ends have an octagonal center, whereas the two inside stars each have a circular center. I believe the unknown designer of the Baptistery pavement was making a similar statement when he based the layout of the floor on the same geometric figure. The use of irrational values, or incommensurables, is linked philosophically to the symbolism of the circle and the square. A circle was indefinite, its circumference and area based on the irrational ~r, whereas the circumference and area of a square were rational values. Philosophically the use of irrational numbers such as ~ shows an attempt to rationalize that which is irrational, or in other words, to make sensible that which is divine or only achievable through the intellect. In this context, let's turn 24 THEMATHEMATICAL INTELLIGENCERVOL.16,NO.2,1994
again to the Sacred Cut. According to Professor Brun6s, its special quality, its "sacredness," lies in its very nearly solving the riddle of how to square the circle. The Wattses believe that the Roman designer of the Garden Houses used the Sacred Cut to represent the squaring of the circle in order to integrate sacred and profane, earthly and celestial. I believe likewise that the designer of the pavement in San Giovanni intended to imbue his pavement with symbolism to represent this integration. The octagon mediates between square and circle in the Baptistery, thus expressing the purpose of the baptismal rite: The catechumen undergoes death to sin and rebirth in Christ, coming in this way as near to the divine as a human can.
References 1. Tons Brun6s, The Secrets of Ancient Geometry and Its Use, Copenhagen: Rhodos (1967), Vol. 1, 72-80. 2. Tons Brun6s, The Secrets of Ancient Geometry and Its Use, Copenhagen: Rhodos (1967), Vol. 2, 38-56. 3. Donald J. and Carol Martin Watts, A Roman apartment complex, Scientific American, 255(6) (1986), 132-140. 4. P. H. Scholfield, The Theory of Proportion, Cambridge: Cambridge University Press, p. 132. 5. Jean Chevalier and Alain Gheerbrant, Dizionario dei Simboli, Milano: BibliotecaUniversale Rizzoli (1986), Vol. 2, 175. Via Mazzini 7 50054 Fucecchio Florence, Italy
David Gale*
For the general philosophy of this section see Vol. 13, no. 1 (1991). Contributors to this column who wish an acknowledgment of their contributions should enclose a self-addressed postcard.
Go The Problems
The July issue of this column was devoted to certain games, and we remarked that although game theory has become an extensive branch of both pure and applied mathematics, the set of games studied in the mathematical literature is with very few exceptions disjoint from the set of games people actually play. A notable exception was the analysis by Paterson and Zwick of the children's game of Concentration, where the authors actually found the rules for optimal play, that is, they solved the game. * Column editor's address: Departmentof Mathematics,Universityof California,Berkeley,CA 94720USA.
Figure 1. White to play and win.
This issue will be devoted again to a game people play, in fact, one of the most played and important games in human history, the game of Go; we will report on some recent and continuing work of Elwyn Berlekamp and his associates, David Wolf, David Moews, Yonghoan Kim, and Raymond Chen. We hasten to remark, however, that, unlike Paterson and Zwick, Berlekamp has not solved Go. In fact, it is not clear how much help, if any, his work offers to people or even computers in trying to improve their game. What he has done is to devise a set of Go problems which he and his computer program are able to solve but which no professional player has succeeded in solving. The board set up in Figure I is an example. Taken by itself this may not be so impressive. Indeed, there are instances of chess problems devised with the aid of computers which one would not expect even Garry Kasparov to be able to solve. An example is given in Figure 2.
Figure 2. Black to move, white to w i n (promote, capture, or mate) in 109 moves.
THEMATHEMATICAL INTELLIGENCERVOL.16,NO.2(~)1994Springer-VerlagNewYork 25
It seems doubtful that any human player would recognize this position as a win for white, since black can prolong the agony for so many moves. However, aside from being a curiosity, this problem is of little interest. In contrast, the thing that makes Berlekamp's work notable is that in order to solve his problems one must make use of a vast, profound, and beautiful mathematical theory called "combinatorial game theory" by its inventors, Berlekamp, Conway, and Guy. There are only a handful of mathematicians in the world who know this theory, which is expounded in the authors' 850-page, twovolume book Winning Ways.Before describing the applications to Go, I will attempt to give a thumbnail sketch of what this theory is about, that is, to summarize 850 pages in a few paragraphs--which obviously can't be
I I ~ I I I I I I I I I I I I I I I I I I I I I ~ I I
Figure 4.
a
Combinatorial Game Theory
The theory treats only win-lose games, between Ms. Left and Mr. Right, in which the person who moves last is the winner. Although this seems rather special, many games can be put in this form by making simple changes in the rules. Such games are conveniently represented as trees like the one in Figure 3. The "button" is originally at the root, as shown. Depending on whose turn it is, the button is pushed along some edge either left or right, and the winner is the one who pushes it onto a terminal vertex. A simple but highly nontrivial example of such a game is Domineering,illustrated in Figure 4. The rules are simple. Left has the vertical dominoes and right the horizontal; they take turns placing them on the g r i d - - a n d that's it. I chose 4 x 6 Domineering as an example because at the time of this writing this game is still unsolved. Returning to the general theory, all games belong to one of four types: (1) (2) (3) (4)
Figure 3.
B
done, but anyway, here goes. (In fairness it should be said that only about 150 pages are devoted to the theory; the bulk of the book is concerned with applications to specific games.)
b
c
Figure 5.
26 T~EMATHEMATICAL INTELLIGENCER VOL 16, NO. 2,1994
d
a win a win a win a win
for for for for
Left (regardless of who moves first), Right (regardless of who moves first), the player who moves first, the player who moves second.
Each of the four possibilities is illustrated by the Domineering games in Figure 5. We call Figure 5d a second-player win because the first player is unable to play on his first move. A less trivial example of a second-player win is the board presented in Figure 6, shown with its tree. The reader will easily verify that this is a second-player win. The first observation is that there is a natural notion of addition on the set of all games; namely, the sum of two games G and H is defined to be the two games played simultaneously. Thus, take (Checkers) + (Tic-Tac-Toe) (suitably modified to make them win-lose). Left, say, plays the black pieces and makes X's, while Right plays red and makes O's. A player on his turn may play in either game.
m[ III a
Figure 6.
b
II II a
b
Figure 7.
Next, the negative, - G , of a game G is defined to be G with the roles of the players interchanged (reflect the tree about a vertical axis). N o w comes the key observation: We define an equivalence relation ,-, on G by writing G ~ H if G - H is a second-player win. Observe that ~ is reflexive: The second player can win G - G by playing " c o p y cat" whenever the first player plays in G ( - G ) , she plays the same m o v e in - G (G). From n o w on, instead of saying G is a second-player win, we will write G ~ 0. One must now show the following: (1) If G ~-, 0 and H ,~ 0, then G + H ,-~ 0 (Do y o u see it?). (2) If G ,,~ 0, then G + H ~ H for all H, because (G + H) - H -- G + (H - H) ,,~ 0, b y reflexivity and (1). (3) Symmetry: G ~ H if and only if H ~ G [because - (G - H) ~ H - G, etc.]. (4) Transitivity: If G - H ,,~ 0 and H - K ,-~ 0, then G H + H - K,-~ 0, so G ~-, K. So (you saw it coming) the set of equivalence classes forms an Abelian g r o u p whose 0 element is the set of all second-player wins. We will call this g r o u p F, and it is the object of study in combinatorial game theory. From n o w on, G denotes an equivalence class, so ~-, can be replaced b y =. To get a feeling for F, consider the Domineering games in Figure 7. Both games are obviously wins for Left; but Figure 7a is the s a m e game as Figure 5a (prove it), whereas Figure 7b is not: Figure 5a-Figure 7b is a win for Left rather than the second player. (If you don't go through these verifications, y o u ' r e missing half the fun.) Back to the theory. There is a natural partial order on F. Call G positive, G > 0, if G is a win for Left, and negative if G is a win for Right. Note that > is well defined, i.e., games equivalent to a positive game are positive (why?). Obviously the s u m of positive games is positive (Left should always respond in whichever game Right plays), so defining G > H if G - H > 0 gives a partial ordering of F. Every game is, thus, either positive (Fig. 5a), negative (Fig. 5b), zero (Fig. 5d), or none of the above, meaning a first-player win (Fig. 5c). So n o w that we have.a partially ordered Abelian group, what do we intend to do with it? At this point the analysis takes a surprising turn. In the usual algebraic approach to group theory one proceeds to s t u d y the structure of the group, its subgroups, representations, and so on, but here the interest is not in structure but in the individual
9
%2
Figure 8.
Figure 9.
g r o u p elements. Indeed, these enter the picture somewhat like characters in a play or novel. They even have names, some of which are numbers, others are like Star, Up, Down, Tiny, and Miny (with the symbols ,, T, ~, +on, --on), and dozens more. Further, each of these elements has not only a name and a symbol but also a "picture," namely, its tree. Of course, m a n y different trees correspond to the same game. For example, the tree of Figure 6b represents the game 0 w h o s e tree is simply a single vertex. One of the important theorems of the theory states that every game has a unique "canonical form," meaning a unique tree with the fewest possible branches. For example, the canonical form for Figure 7b is given by Figure 8. F has m a n y interesting subgroups. Perhaps the simplest is {0, ,} where 9 is the game of Figure 5c whose tree is given in Figure 9. You should verify that 9 + 9 = 0. In fact, any game w h o s e tree is right-left symmetric must clearly have order 2. It turns out that the canonical form of the tree of Figure 3, which I constructed just b y playing around, is the tree of 9 (Fig. 9), as Berlekamp pointed out to me. Obvious question: Given a game, h o w does one find its canonical form? Answer: The general problem is NPhard. The most important subgroup of F is a group called Numbers, and it is isomorphic to the group of dyadic rationals. The tree of n is simply
\
"x and it is a triviality to show that m + n = (m + n). More interesting is 1 / 2 n, whose tree is
,)
~
f f f
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
27
Thus, Figures 8 and 7b are 1/2. A nice exercise is to show that 1/2 n + 1/2 n = 1/2 (n-l) . Thus, Numbers add just like numbers. The tree of 3/4 is
'b Recall that F is partially ordered; so one may ask, for example, where does * fit into the ordering. It is an easy exercise to show that 9 is an "infinitesimal," meaning that it is smaller than any positive Number and greater than any negative Number, but not comparable with 0. My problem now is deciding when to stop, because there is so much more that could be said. For example, games have properties called temperature (every game is either hot, cold, or tepid) and atomic weight. There are important homomorphisms from F to Numbers called cooling and chilling; in some cases, chilling has an inverse called warming; and on and on. It's time, however, to move on toward explaining the connection with Go.
Solutions and Decomposition What does it mean to solve a game? A person not initiated into combinatorial game theory (henceforth to be abbreviated CGT) might suppose it means knowing who the winner is and how to go about winning, that is, to describe the winning strategy. The CGT notion of solution is rather different and is at once more and less than the naive notion. A game is said to be solved if one knows exactly which element of F it is, or, more precisely, if one knows its canonical form. This is called the value of the game. (In practice, one is happy if one can express a given game as a sum of games whose values are known. We will take this up in a moment.) Some examples: The value of 2 x 6 Domineering is 1 - [2 - 1 (don't worry if you don't know what this notation means). It is known that 5 x 5 Domineering has value 0, so it is a second-player win, though knowing this fact does not mean that one necessarily can obtain the second player's winning strategy without performing a major amount of computation. Indeed, the Domineering story is quite intriguing. Even games as apparently simple as 2 x n Domineering are far from trivial. Berlekamp has a "formula" for the value when n is odd. It turns out that 2 x n is a second-player win only for n = 13. At the time of the writing of Winning Ways, 4 x 4 was unsolved, and a year ago Berlekamp assigned it to a couple of MIT undergrads as a term project. They came up with different answers, neither of which was correct. I asked Elwyn for the correct solution, which he supplied, but this too turned out to have a bug. Finally, David Wolf, making use of a computer program, came up with the correct (we hope) canonical form. The tree turned out to have 52 branches. 28
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2,1994
But if CGT doesn't tell one how to play games, what good is it? That's a fair question, and the answer is that the theory is extremely useful for a very restricted class of games; namely, those games in F which can be expressed as a sum of simple games whose values are known. Such games will be said to decompose. The prototype of such games is the game of Nim, where one is clearly playing a sum of games, each of which is completely trivial. (The element of F corresponding to an n-element Nim pile is called a Nimber and is denoted by *n. Exercise: Draw the tree for a three-element Nim pile.) Indeed, one way of looking at CGT is as a vast generalization of the theory of Nim. Now, most games one is likely to run into are not decomposable. For example, the game of Chomp described in our April column essentially never decomposes. However, anyone who has played the well-known game of Dots and Boxes has encountered an example, par excellence, of decomposition. Recall that in the "endgame" the board breaks up into disjoint regions with the property that drawing a line between dots in a region gives the opponent all the boxes in that region. (A whole 43page chapter of Winning Ways is devoted to Dots and Boxes, a Berlekamp speciality, though a complete theory has not yet been found.) So (finally!) what about Go? The positions which occur in the early stages of a game of Go surely do not decompose. However, positions which tend to occur in the later stages of the endgame do decompose-- both in the technical sense of CGT and in the geographical sense that the board is divided up in separate independent regions-and it then becomes fruitful to apply CGT. Professional Go players are very skilled at handling positions involving only a few terms (regions) even when some of these terms are still beyond the reach of CGT analysis. What Berlekamp and Co. have done is to construct certain Go positions which do, in fact, decompose and where the values of the games associated with the different regions can be calculated. As examples, all of the Go positions in Figures 10 and 11 below have as values our old friends, 1/2 and ,, respectively.
Figure 10.
Figure 12 shows the decomposition for the problem of Figure 1. Gaze and admire! In this example, white moves and wins by one point, but only if he does everything correctly. There is a software package which plays optimally in this game and others in the collected problem set. The software does not look ahead; its computations consist only of rudimentary manipulations of the values, but it is able to play perfectly because it maintains an accurate table of the local values of all regions on the board.
Looking Ahead The natural next question is, Where do we go from here? It's clearly fun to stump the experts, but now that one knows it can be done, it would seem the point has been made. But there are other reasons to pursue the study. Berlekamp notes that professional Go players have formulated various quite sophisticated but rather imprecise principles which they use in playing endgames. He believes that, using CGT, some of these "rules of thumb" may be given a precise mathematical formulation. One might think of this enterprise as a very sophisticated analogue to the earliest work in modern game theory, namely, the results of Borel and von Neumann in giving the complete mathematical theory of bluffing in twoperson poker.
Some Afterthoughts I should perhaps apologize to the readers of this column for conducting my education in public in the preceding paragraphs. At the start I knew nothing about combinatorial game theory, and, as should be evident, I have by now barely scratched its surface. Nevertheless, the experience has left me with an overwhelming sense of awe at the unfathomable diversity of mathematics itself. The authors of Winning Ways have created a mathematical fairyland. To my knowledge, there is nothing remotely like it anywhere else in the discipline. With this in mind, I can't resist concluding with a bit of speculation.
Figure 11.
Figure 12. One often hears people say that all of mathematics is really one, and as the subject develops we will ultimately see clearly how everything fits in with everything else. The whole Bourbaki enterprise, it seems to me, was predicted on some such unspoken assumption. Those authors seem to have been trying to help us to find the "right way of looking at things." But I wonder if this belief in ultimate unity may not be just wishful thinking. Certainly, it would be nice if, from a few guiding principles, we could gain an understanding of everything. Unfortunately, there is nothing in the evidence on hand to suggest that this will ever be the case. H o w much unity will there ever be between, say, the theory of transfinite cardinals on the one hand and numerical solutions of partial differential equations on the other? One hears about unity versus diversity in sciences other than mathematics. The physicists talk about a Grand Unified Theory for which they are searching. Perhaps the projected Super Collider will provide some clues on this, but if experience is any guide, this great piece of hardware may end up creating more puzzles than it solves, in which case I suppose we go on and build the Super-Duper Collider. My own hunch is that mathematics (perhaps physics too) is not going to unify. It's just not in the nature of the beast. It may even be that Bourbaki-like attempts in the end do more harm than good by forcing people's thinking into prescribed patterns. Combinatorial game theory is, I expect, just one example of what unfettered mathematical imagination is capable of creating. There will surely be many others, and I suspect that mathematics, far from becoming unified, will continue to diversify in totally unpredictable ways. This may seem a somewhat terrifying thought, but if, as I believe, this is the kind of universe we live in, then we might as well face up to it. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
29
30
WIlE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
MATH INTO TEX A SIMPLE INTRODUCTION TO AMS-LATE x
References
G. Gr/itzer, University of Manitoba 1. J. Mycielski, Representations of infinite p e r m u t a t i o n s by words, Proc. Amer. Math. Sac. 100 (1987), 237-241. 2. R. Dougherty a n d J. Mycielski, Representations of infinite permutations b y w o r d s (II), Proc. Amer. Math. Soc. (to appear).
[Editor's Note: Raphael Robinson has observed that the above solution shows that every permutation can be expressed as the . product of two cycles in uncountably many different ways.]
ATHEMATICAL CO illing Gets the Last Point Elwyn Berlekamp & David Wolfe January 1994 Hardcover, 256 pp. ISBN 1-56881-032-6, $29.95
George Gr~itzer's book provides the beginner with a simple and direct approach to typesetting mathematics with AMS-DTEX. Using many exampies, a formula gallery, sample files, and templates, Part I guides the reader through setting up the system, typing simple text and math formulas, and creating an article template. Part II is a systematic discussion of all aspects of AMS-LATEXand contains both examples and detailed rules. There are dozens of tips on how to interpret obscure "error messages7 and how to find and correct errors. Part III and the Appendices take up more specialized topics, from customizingAMSLATEXto the use of PostScript fonts. Even with no prior experience using any form of TEX, the mathematician, scientist, engineer, or technical typist, can begin preparing articles in a day or two using AM&LATEX. The experienced TEXer will find a wealth of information on macros, complicated tables, postscript fonts, and other details that permit customizing the LATEX program. This book is truly unique in its focus on getting started fast, keeping it simple, and utilizing fully the power of the program.
CONTENTS: Introduction 9 Part I: A Short Course 9The structure of AMS-L^TEX9Typing your first artide ~ Part II: A Leisurely Course * Typing text ~ Typing math ~ The Preamble and the Topmatter 9 The Body of the article 9 The Bibliography ~ Multiline math displays 9Displayed text 9 Part III: Customizing 9 Customizing AM&LATEX 9 TEX macros ~ Appendices (A-G) 9Bibliography 9 Index 1993 294 PP., PLUS DISKETTE SOFTCOVER $42.50 ISBN0-8176-3637-4
Berlekampand Wolfe, only amateur Go players themselves, have developed the mathematical techniques for solving latestage endgame problems that can stump top-ranking professionals. Any Go player with an interest in mathematics or a mathematician interested in Go will not want to miss this book.
INCLUDES READY TO USE TEMPLATES!
'The authors open up a whole world of mathematical techniques and concepts for analyzing games. Applied either in a rigorous fashion or more intuitively as a help in evaluating the possibilities, these techniques and concepts help sharpen one's problem solving abilities." Richard Nowakowski, Canadiaa InternationalMath Olympiad Team Leader and Coach 1988-2991, 1994-1996.
"It is rare that substantial mathematics is relevant to any aspect of a popular game. Berlekamp and Wolfe are to be congratulated on a remarkable achievement." John McCarthy, Prof. of Computer Science, Stanford University
A K Peters, Ltd., 289 Linden St., Wellesley, M A 02181 (617) 235-2210 Fax (617)235-2404 emnil
[email protected] THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
31
G.H. H a r d y as an Editor John Todd
In the mid-forties, a distinguished mathematician whom we shall call X and whose mother tongue was not English, decided he would like to have a manuscript published in English as a Cambridge Tract. Naturally he sent it to Hardy, who, though ailing since 1939, studied it and found that it needed editorial attention. I got involved in this and, in a series of letters, Hardy told me what was required. These letters are presented here, very slightly edited. I first give a brief history of the famous series, almost all paperbacks, as well as s o m e family reminiscences of Hardy, for many of which I am evidently indebted to Olga Taussky. After the letters, I give a commentary which provides some continuity. T h e Tracts
The series "Cambridge Tracts in Mathematics and Mathematical Physics" began in 1905 under the general editors J.G. Leathern and E.T. Whittaker. As might be expected, they wrote early volumes, to set the style, Nos. 1 and 7, respectively. Hardy wrote No. 2, probably based on his undergraduate lectures in Cambridge in the early years of this century. These paperbacks, for economic reasons, were to be of a limited size, in fact, about 100 pages. Some, No. 3 by Bromwich and No. 13 by Henderson, were exactly 100 pages, whereas No. 38 by Hardy and Rogosinski had exactly 100 pages and exactly 100 theorems. There has been, in the course of time, occasionally a considerable inflation in size and in price--originally the Hardy volume, No. 2, cost half a crown. The systematization of integration in No. 2 has again become of interest in view of the use of computers to integrate symbolically. We note that James Harold Davenport, the son of Harold Davenport, a colleague of Hardy, has written a tract [6] in a different series "On the integration of algebraic functions." There were at least two other Cambridge attempts to deal with this problem, less ambitious than Hardy's [4, 11]. Continuing to examine the "Tracts" as a whole, we note that the words "and Mathematical Physics" were removed from the title about 1973, and that they attained their century in 1991 and are now 104 not out. About half a dozen were written by women. Many have appeared in revised editions; some have been translated, for example, No. 7. Hardy was an Editor from 1914 to 1946. Apart 32
from Nos. 2 and 38 already mentioned, he was involved with Nos. 12 and 18. Among the other editors were Hyman Bass, B. Bollob~is, E. Cunningham, H. Halberstam, P. Hall, W.V.D. Hodge, J.EC. Kingman, J.E. Roseblade, P. Sarnak, E Smithies, J.A. Todd, and C.T.C. Wall. It is interesting to note the advertisements on the back cover for volumes "in preparation," particularly those which never were completed. Among those announced elsewhere [14] was one on the Riemann-Zeta function by H. Bohr and J.E. Littlewood, which was replaced by No. 26 by E.C. Titchmarsh and by No. 30 by A.E. Ingham, for reasons explained in the prefaces to these volumes. Note that "Inequalities" was originally envisaged as a Tract. British (and other) mathematicians learned a lot, directly or indirectly, from the Cambridge Tracts. For example, the problems on contour integration from G.N. Watson's No. 15 (1914) were an essential part of complex variable courses, and his (final) erroneous sentence "Further, Cauchy's theorem cannot be employed to evaluate
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2 (~)1994 Springer-Verlag New York
all definite integrals; thus f o e-z2 dx has not been evaluated except b y other methods." w a s repeated in dozens of textbooks. A.E. I n g h a m pointed out to m e that the trick is to integrate e i'z2 cosec 7rz r o u n d a long parallelogram y = m(x=l=89 = + M . Let M --* oc a n d t h e n t a k e m = 1. I do not k n o w w h o first detected the error. 1 I w a s indirectly able to r e p a y in p a r t w h a t I learned f r o m the Tracts. I realized in the early forties that a sequel to Watson's Tract dealing with steepest descent methods w a s desirable, a n d began collecting material. Other c o m m i t m e n t s p r e v e n t e d me f r o m c o m p l e t i n g this but allowed m e to e n c o u r a g e E.T. C o p s o n to take over the project. Ultimately, his Tract No. 55 a p p e a r e d in 1965. Scanning the v o l u m e s in the series, I note that only one w a s not in English, namely, No. 35 b y E. Landau. Carath6odory's No. 28 w a s translated f r o m G e r m a n . There w a s a p p a r e n t l y s o m e discussion (cf. letter of 17 N o v e m ber) a b o u t a v o l u m e in French b y Fr6chet, w h o w a n t e d it in English revised b y a Chinese. H a r d y wrote: "Picture the horror of it!" H e need not h a v e b e e n horrified, for in the mixing of mathematicians w h i c h w e n o w experience, I had m u c h contact with the p r o p o s e d reviser, and his written English w a s well-nigh irreproachable.
Figure 2: G.H. Hardy on an excursion on the lake. ICM, Ziirich, 1932.*
1Editor's note. This result is often attributed to K. N. Srinivasa Rao, ElementeMath. 27 (1972), 88-90--plainly, a rediscovery.
G.H. Hardy
Figure 1: R. Courant, G.H. Hardy, and O. Veblen at ICM, Ziirich, 1932.
There is an excellent obituary of H a r d y b y E.C. Titchm a r s h [21] (see also S n o w [17, 18]). A p a r t from technical material, there are lists of his 12 e n t h u s i a s m s and 12 hates. For a list of his four axioms for collaboration, w e refer to H a r a l d Bohr [3]. In this section, w e include s o m e personal reminiscences and p h o t o g r a p h s of Hardy, Courant, and Veblen at the ICM, Ziirich, 1932 (Fig. 1), w h e n they were the reigning e m p e r o r s of m a t h e m a t i c s in their countries, and of H a r d y alone (Fig. 2). Titchmarsh writes [21, p. 5]: " H e w a s an extremely kind-hearted man, w h o could not bear a n y of his pupils to fail in their research." A colleague assured m e that in each of five papers (probably the entire output) of one of his students there are errors: Referees should not believe that talent is inherited f r o m an academic parent. *Professor Hubert Kalf (Miinchen) has pointed out that this photograph has also been reproduced on p. 89 of the P6lya Picture Album (Birkh/iuser, 1987) to which we shall refer as PPA. Olga Taussky has had the original 1~3/t x 2~1 tt copies of both figures reproduced here since 1932. It is possible that they were taken by Dorothy M. Wrinch (whose photograph also appears on p. 85 of PPA). C.P. Snow [17, p. 15] writes in 1967: "So far as I know, there are only five snapshots (of Hardy) in existence." V. Drobot (PPA, p. 121) remarked that P61ya had more pictures of Hardy than exist. Apart from the six photographs in [12] and the two reproduced here, there is one in D.C. Russell's obituary of L.S. Bosanquet [Bull. London Math. Soc. 18 (1986), 403-420] which also appeared in C.R. Fletcher, "G.H. Hardy--Applied Mathematician." [Bull. Inst. Math. Appl. 16 (1980), 61-67, 264]. As Professor Kalf (who has a photographic memory) also pointed out, there is one in the article by R.P. Boas in "More mathematical people" [ed. D.J. Albers, Gerald L. Alexanderson, Constance Reid, Harcourt, Brace,Jovanovich 1990]. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
33
{~
...~, .r
'~k,k,~
~
~
~ L,
I
Figure 3: Portion of a letter dated 6 June 1946 to John Todd.
At one time, Olga Taussky, my (future) wife, was invited to talk at the famous Conversation Class. On the advice of Hans Heilbronn, then also at Trinity College, she chose to speak on a topic in one of her minor subjects (topological algebra). This apparently went down well, for Hardy, after her talk, told her that she could use his name as a reference when applying for a position. (Times were also bad in the late thirties.) After she had left Cambridge, Heilbronn told her that another mathematician spoke in the Conversation Class on her major subject (algebraic number theory, especially class field theory). Later, at dinner, Hardy said: "I wonder how Miss Taussky would have dealt with the topic." Once when Hardy's sister was visiting Cambridge, Miss Taussky asked them both to tea at her college (Girton). Because Hardy had said to her on a previous occasion, "No foreigner can make proper tea," she took advice on this problem. Miss Hardy arrived on time and they talked until Hardy arrived; he said he lost his way (in Cambridge!). Miss Taussky suspected he was frightened of having to drink her tea. Anyway, Hardy got out of his sweaters and then tea was ready. Hardy said" "I call that a good cup of tea." Her formula is secret. In due course, Miss Taussky was short-listed for a Chair, and at the interview, Hardy was one of the Selection Committee. Another member said to her: "I see you have written several joint papers. Were you the senior or junior author?" Hardy, who was huddled in his muffler, woke up to say: "That is a most improper question. Do not answer it." On another such occasion she was asked: "I see you have collaborated with some men, but with no women. Why?" She replied: "That is why I applied for this position." (It was at a women's college.) To indicate Hardy's punctiliousness, I quote a fragment from another batch of correspondence I had with him. The manuscript is reproduced in Figure 3. "Can you tell me who the signatory of this letter i s is it--'Mark'? and what is his rank or title? He may be anything from a Field Marshal to a home office clerk, for all I know. And as his tone is courteous and he writes 'Dear Prof. Hardy', I don't like the idea of being forced back on the bleak 'Dear Sir'." On any return to London, two things recall Hardy to me. Naturally Lord's (or even St. John's Wood) is the first. 34 THEMATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
Figure 4: A sentence from a letter dated 14 N o v e m b e r 1946 to
Olga Todd. The second is Pimlico which he preferred to Chelsea as the name of the district where he lived. It is interesting to see how fierce a normally mildmannered man like Hardy can get when upset on a matter of which he feels strongly. Witness the last sentence of the 14 November letter, the manuscript of which is also reproduced in Figure 4.
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
35
36
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
H a r d y was satisfied with the mathematics, but mainly concerned with the architecture of the formulas. This remains a problem for all of us w h o use w o r d processors, not only for the technical typists and compositors. He was worried about clarity and e c o n o m y (in space), and he wanted a good looking book. H a r d y was not concerned with t y p o g r a p h y (as was Knuth [15] later). I am no longer clear about the mechanics of the process; for example, w h e t h e r the manuscript was actually a manuscript, or a typescript with formulas entered b y hand. I have even forgotten the title of the "Tract." In any case, in the s u m m e r of 1946, I dealt with it according to H a r d y ' s Rules and, for m y o w n part, queried parts which were not immediately clear to me, and finally sent the manuscript back to X. During this time I received various a m e n d m e n t s from X; in the comparative tranquility of postwar Europe, he was going over his work. When I received the Nov. 17 letter I was, naturally, annoyed. But I soon got over the annoyance and sent all the material back to X, keeping only H a r d y ' s letters. After all, I had had a personal correspondence course from H a r d y on the presentation of mathematics, which served me well in m y later teaching, writing, and editorial duties. For instance, at Caltech for some time there was an "Elementary Seminar" for first-year graduate students, directed by two faculty of differing interests, on a topic for example, Convexity [22, No. 47] - - w i t h m a n y facets. The object was to expose the students to a range of mathematics and to encourage them to use libraries and to learn h o w to report on a topic. To this I a d d e d the following: After an oral report was given, another student was required to prepare a written report, which was reviewed b y the teacher and then typed by the departmental secretariat, photocopied, and distributed to the rest of the class. This arrangement was very beneficial but was later a b a n d o n e d as involving too much manpower. I note that D. E. Knuth [16] ran a writing course. An editor's "lot is not a h a p p y one," particularly w h e n "trying to make the p u n i s h m e n t fit the crime." We mathematicians should be grateful to good editors, for, on the one hand, they prevent us from publishing material which later might embarrass us, and, on the other, they often help us to "polish up" our contributions. Incompetent editors are an abomination. H a r d y was an editor in what he w o u l d have called the Hobbs or Bradman class ([21, p. 450; 18]) and was "a slave of d u t y " to the Tracts.
References 1. Notes on the preparation of mathematical papers, London: London Math. Soc. (1932). 2. A Manual for Authors of Mathematical Papers, Providence, RI: American Mathematical Society (1962). 3. Harald Bohr, Collected Mathematical Works, 3 Vols. Kobenhavn: Dansk Mathematisk Forenung (1952), Vol. 1, pp. xiii-xxxiv, esp. p. xxviii. 4. T.J. I' A. Bromwich, Elementary integrals, Cambridge: Bowes and Bowes (1911).
5. T.W. Chaundy, P.R. Barrett, and C. Batey, The printing of mathematics, Oxford: University Press (1954). 6. J.H. Davenport, "On the integration of algebraic functions," Lecture Notes in Computer Science, No. 102, Berlin: Springer-Verlag (1981). 7. H.W. Fowler, A dictionary of modern English usage, 2nd ed., revised by Ernest Gowers, Oxford: Clarendon Press (1965). 8. H.W. Fowler and F.G. Fowler, The King's English, 3rd ed., Oxford: Clarendon Press (1931). 9. L.E. Gillman, Writing Mathematics Well, Washington, DC: Mathematical Association of America (1987). 10. Ernest Gowers, The Complete Plain Words, London: Her Majesty's Stationery Office (1954). 11. A.G. Greenhill, A chapter in the integral calculus, London: Francis Hodgson (1888). 12. G.H. Hardy, Collected Mathematical Papers, Oxford: Clarendon Press (1966-1979); 7 Vols. 13. G.H. Hardy, A mathematician's apology, Cambridge: Cambridge Univ. Press (1940, 1967, 1992). 14. G.H. Hardy and J.E. Littlewood, "Some problems of Diophantine approximation, I," Acta Math. 37 (1914), 155-191, fn. p. I64. 15. D.E. Knuth, "Mathematical typography" (Gibbs Lecture 1978), Bull. Amer. Math. Soc., N.S. 1 (1979), 337-372. 16. D.E. Knuth, T. Larrabee, and P.M. Roberts, Mathematical Writing, Report No. STAN-CS-88-1193, Stanford University, January 1988, Revised version, Washington, DC: Mathematical Association of America (1990). 17. C.P. Snow, Foreword in 1967 and later editions of [13]. 18. C.P. Snow, "The mathematician on cricket," The Saturday Book, 8th Year, 1968. London: Hutchinson (1968), pp. 65-73. 19. N.E. Steenrod, P.R. Halmos, M.M. Schiffer, and J.A. Dieudonn6, How to Write Mathematics, Providence, Rh American Mathematical Society (1971). 20. Ellen E. Swanson, Mathematics into Type, Providence, RI: American Mathematical Society (1971). 21. E.C. Titchmarsh, Godfrey Harold Hardy, Vol. 1, p. 1-12 of [12]. [Original in Obituary Notices of Fellows of the Royal Society 6 (1949), 447-458.] 22. Tracts Mentioned: (1) J.G. Leathern, Volume and surface integrals used in physics, 1905. (2) G.H. Hardy, Integration of functions of a single variable, 1905, 1916. (3) T.J.I'A. Bromwich, Quadratic forms and their classification by means of invariant factors, 1906. (7) E.T. Whittaker, The theory of optical instruments,1907,1915. (12) G.H. Hardy, Orders of infinity, 1910, 1924. (13) A. Henderson, Twenty-seven lines upon the cubic surface, I910. (15) G.N. Watson, Complex integration and Cauchy's theory, 1914. (18) G.H. Hardy and M. Riesz, The general theory of Dirichlet's series, 1915. (26) E.C. Titchmarsh, The zeta-function of Riemann, 1930. (28) C. Carath6odory, Conformal representation, 1932. (30) A.E. Ingham, The distribution of prime numbers, 1932. (35) E. Landau, Uber einige neuere Fortschritte der additiven Zahlentheorie, 1937. (38) G.H. Hardy and W. Rogosinski, Fourier Series, 1944,1950, 1956. (47) H.G. Eggleston, Convexity, 1966. (55) E.T. Copson, Asymptotic expansions, 1965. (104) Michael Aschbacher, Sporadic groups, 1993.
Department of Mathematics 253-37 Caltech Pasadena, CA 91125 USA THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
37
Jeremy J. Gray* International Congresses of Mathematicians from Zurich 1897 to Cambridge 1912 June Barrow-Green This year, the International Congress of Mathematicians (ICM) returns to Zurich for an unprecedented third time in order to celebrate (as near as possible) the centenary of the ICM, which was held in Zurich in 1897. That first ICM has its origins in the World Congress of Mathematicians in Chicago in 18931, when Felix Klein and other German mathematicians began to think in international terms. In 1895 Georg Cantor began to canvass support for the idea from French mathematicians, notably Hermite, Poincar6, and Laisant and Lemoine (who were independently pursuing the same idea). He worked assiduously at promoting the idea and became "the self-appointed director, working behind the scenes to provide continuous inspiration, maintaining support, even drafting circulars and statutes"2; even claiming for himself the idea of holding the congress. By 1896 several national mathematical societies had voted in favour of the idea. In July Carl Geiser, a professor at the Zurich Polytechnikum, headed a committee to arrange a congress at the Polytechnikum the following year. Z u r i c h , 9 - 1 1 A u g u s t 1897
In January 1897 an invitation was sent to some 2,000 mathematicians inviting them to participate in an international congress. For international credibility the invitation was endorsed by such eminent figures as Klein, Poincar6, Cremona, Mittag-Leffier, Hill, Greenhill, and Markoff. Zurich was promoted for its central geographic location and Switzerland's tradition of promoting international interests; for fostering a spirit of international cooperation rather than international rivalry, there was an obvious advantage in holding the congress in Zurich rather than a location in France or Germany. In general the invitation met with a positive response, although Berlin University did provide one notable ex-
ception by objecting, as Minkowski had predicted, to the choice of Klein as head of the German delegation. Altogether 208 people came, from 17 different countries. There were 60 from Switzerland, 42 from Germany, 23 from France, and 21 from Italy, but the delegation of 3 from Britain was woefully small, one less even than the number of women. The aims of the meeting were: to promote good scientific relationships between mathematicians of different countries; to give in reports or lectures an insight into the current state a n d / o r historical development of different branches of mathematics, and provide the opportunity to address particular questions of recognised importance; to consider future congresses; and to treat questions of bibliography, terminology, etc., on which international cooperation is essential. These points were all taken up in the address by Ferdinand Rudio, the General Secretary of the congress, on behalf of the organising committee. He stressed the necessity of international cooperation for certain tasks, citing the publication of Euler's oeuvre, the standardisation of terminology and units, a n d - - perhaps most essential of a l l - - t h e publication of a mathematical bibliographical journal. Fortschritte had been active for 30 years, but it was only published annually, and even then it came
*Column editor's address: Facultyof Mathematics,The Open University,MiltonKeynes,MK76AA England. 1See KarenV. H. Parshall and David E. Rowe,Embeddedin the Culture: Mathematicsat the World's ColumbianExhibitionof 1893, The Mathematical Intelligencer 15 (2) 1993,40-45. 2j. W. Dauben, Georg Cantor. His mathematics and philosophy of the infinite, Harvard University Press, Cambridge, Massachusetts,1979, p. 164,p. 339,n. 61. 38
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2 (~)1994 Springer-Verlag New York
out considerably later than the last publication it listed. Rudio also announced that it was intended to hold future congresses at intervals of three to five years, and that to celebrate the end of the 19th century it was planned to hold the next congress in 1900 in Paris, to coincide with the great Exhibition there. Plenary lectures were intended to give broad views of particular branches of mathematics and to help specialists to recognise the connections between the different subjects. Sectional meetings with shorter papers were devoted to various specialties. The near absence of applied mathematics papers is notable (only two out of 24 papers given), and there were no papers from either Britain or the United States. After lengthy discussion, permanent commissions were formed to prepare for the next congress; to report on the progress of the different branches of mathematics in different countries; to work on mathematical bibliography, including ideas for standardising and simplifying terminology; and to consider how best to give the institution of international congresses a permanent character. Outside the mathematical activities there was one other important feature of the congress: the social programme. The organisers, having recognised the value of this side of the congress in providing opportunities for informal scientific discussion, had expended considerable effort on its organisation. The programme, which was enthusiastically appreciated by the members, included a steamer excursion on Lake Zurich and a closing banquet on the summit of the Uetli.
Paris, 6-12 August 1900 The second congress, which was organised by the Soci6t6 Math6matique de France (SMF) with Poincar6 as President, is arguably the most famous of all mathematical congresses, for it was there that David Hilbert gave his celebrated speech on mathematical problems. Although about 1,000 mathematicians had provisionally accepted the invitation to attend, the number that actually went to the congress was only 249, just 41 more than had been at Zurich. It had been expected that the Exhibition would draw people to the congress; but, as reported in Nature (August 30, 1900, 418), the reduction in numbers "was doubtless due partly to the great heat of the preceding month, but probably in greater measure to the fear of exhibition crowds and exhibition extortions." Not surprisingly, the dominant nationality was French, but although the Germans again made up the second largest group, there were only 26 of them, Berlin University being again conspicuous by its absence. The Italian contingent, 23, ,was almost as strong as the German. Twelve made the journey across the English Channel, and 19 came from the United States. The number of countries represented rose from 17 to 29, and this included seven countries outside Europe in addition to the United States.
The pattern of the congress was very similar to Zurich, with the addition of a sixth section on teaching and methods. The number of papers presented went up from 24 to 38, although the applied mathematics section was again poorly supported. The French gave 14 out of the 38 papers, with only two speakers from Germany, five from the United States, and no British contributions. Without doubt the most significant of the sectional meetings was the first joint meeting of sections V and VI, where David Hilbert gave his address on the future problems of mathematics. His original list of 23 problems was reduced, on the advice of Hurwitz and Minkowski, to 10 for his talk. According to the report in the Bulletin Amer. Math. Soc. (7 (1901), 68) by Charlotte Scott (one of five women members of the congress), the immediate response was only "a rather desultory discussion." Nevertheless, it was not long before the significance of Hilbert's words was recognised; two years later the Bulletin Amer. Math. Soc. (8 (1902), 437-479) published Hilbert's complete address plus all 23 problems in English translation. Other speakers may not have deserved even a delayed impact, for, as Charlotte Scott remarked (Bulletin Amer. Math. Soc. 7 (1901), 77), One thing very forcibly impressed upon the listener is that the presentation of papers is generally shockingly bad. Presumably the reader desires to be heard and understood; to compass these ends, instead of speaking to his audience, he reads his paper to himself in a monotone that is sometimes hurried, sometimes hesitating, and frequently bored. He does not even take pains to pronounce his own language clearly, but slurs or exaggerates its characteristics so that he is often both tedious and incomprehensible .... It would be invidious to mention names; the special sinners sit in both high and low places. . . . It is perhaps pardonable to refer to Mittag-Leffier's presentation of his paper in Section II as showing how admirable and engaging a style the thing can be done, although it is not given to everyone to do it with this charm. There were other causes for complaint. There was a general dearth of information about arrangements and no common assembly room where the members could meet informally with one another. There were very few excursions arranged by the organisers, which reduced the opportunity for members to meet informally. More seriously, the SMF had done almost nothing about implementing the resolutions passed in Zurich concerning commissions on bibliography, etc. When formally questioned about this, their explanation was that they had been so taken up with making the arrangements for the actual congress they had not had enough time.
Heidelberg, 8-13 August 1904 The third ICM by all accounts was a model of efficiency. The organising committee included Hilbert and Klein and was headed by Heinrich Weber, who was later elected President of the congress. It had made a great effort to increase attendance, and the number of members THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
39
rose to 336, although the number of nationalities fell from 29 to 21. In Heidelberg the Germans accounted for well over half with, for the first time, representatives from the University in Berlin. There was a marked increase in members from Russia, Austria, Hungary, and Scandinavia, a fair reflection of the number of people who had studied in Germany and welcomed the opportunity to return. The congress followed a similar format to that in Paris and again took place over six days. Apart from the four plenary lectures, there were also several presentations, notably Klein's presentation of the first volume of the Encyklopffdie der mathematischen Wissenschaften which was immediately followed by Julius Molk's presentation of the French version. This was a striking example of collaboration between the two premier mathematical nations, for as Molk explained, although the French encyclopaedia was based on the German version, it was not simply a translation but rather it had resulted from editing by French mathematicians and engineers who had been exchanging ideas with their German counterparts. Another feature of the congress which provoked considerable interest was an extensive exhibition of mathematical literature, apparatus, and models. About 300 models were exhibited, including not only the usual plaster, paper, and thread types but also integrators, harmonic analysers, calculating machines, instruments for drawing curves and solving equations, and kinematical models. Leibniz's calculating machine was shown and explained, and there were demonstrations of experiments on fluid motion.
The sections were arranged in the same way as in Paris but with almost double the number of papers presented, with the German contribution accounting for almost half, and for the first time papers by British mathematicians. There was a significant increase in applied mathematics papers. For the first time there were several papers on both history of mathematics and pedagogy, as a result of which the congress was prompted to adopt a resolution in favour of assigning history of mathematics "a fitting place in public education" at both university and secondary school level. Although the DMV appears to have followed the SMF in ignoring the propositions concerning bibliography and terminology, their social programme was beyond reproach. On one evening the members sailed down the Neckar in illuminated barges and on reaching Heidelberg found the castle illuminated by fire with the whole proceedings ending with fireworks, "... in the midst of which the pythagorean diagram stood out brilliantly against the s k y - - an appropriate symbol of the nature of the meeting." (Bulletin Amer. Math. Soc. 11 (1905), 200).
R o m e , 6-11 April 1908
In Rome there was a further increase in the number of members to a total of 535, some 200 more than in Heidelberg. The German presence was extremely strong (120), although both Klein and Hilbert were unable to attend. It was also the first congress at which the number of women members ran into double figures, of which eight
Heidelberg at the turn of the century. (Used by permission of Springer-Verlag New York,Inc.; from InternationaIMathematical Congresses. An Illustrated History: 1893-1986, by D. J. Albers, G. L. Alexanderson, and C. Reid; (~ 1987.) 40
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
came from Italy. The Italians also doubled the number of plenary lectures, and for the first time the section on arithmetic and algebra was joined with the section on analysis. The number of papers in the sectional meetings (129) was almost double that of the previous congress, with almost as many given in French as in Italian. Although the Italians dominated the applied mathematics section, the British also made a significant contribution. A new section was introduced so that for the first time papers on statistics and practical applications such as engineering were included. Also for the first time there were several papers on the philosophy of mathematics. Unfortunately, the first woman due to give a paper at an international congress, Laura Pisati, had died the previous month; her work on complex function theory was briefly described by Marcolongo. There was a large number of papers on teaching, many of which were in response to requests for reports on the position of mathematical teaching in schools, and these came from a broad spectrum of nationalities. This led to the proposition that the congress should recognise the importance of the teaching of mathematics in secondary schools. Klein, Greenhill, and Fehr were asked to set up an International Commission to study the question and report to the next congress. Hadamard, on behalf of the applied mathematics section, and as a result of debate arising from a paper by Marcolongo, proposed an international commission on the unification of vectorial notation. The social arrangements were again a success: there were receptions, a concert, a visit to Hadrian's Villa, with the result that "Everyone went away with the feeling that these non-sessions were among the most delightful parts of the congress."
Cambridge, 22-28 August 1912 The sentiments expressed by Sir George Darwin, the President of the Cambridge congress, in his opening address on the differences between pure and applied mathematics, were doubtless shared by many. He confessed (as an applied mathematician), "I must tell you frankly that when I gaze on some of the [pure mathematics] papers written by men in this room I feel myself much in the same position as if they were written in Sanskrit." The number at the congress was marginally up from Rome with 574 members attending, including 39 women, and, not surprisingly, there was a large increase in the delegation from North America. The format of the congress followed the now established pattern of sectional meetings in the mornings and lectures in the afternoons. In addition an exhibition had been mounted which included English and foreign textbooks, models, and a collection of calculating machines. Altogether 128 papers were presented, almost exactly the same number as in Rome, and they i n c l u d e d - - in the
geometry section--the first paper given by a woman, Hilda Hudson. For the first time the papers given by mathematicians from North America formed the largest group after those of the host nation. Three out of the five sessions of the teaching section were devoted to the reports presented to the International Commission on Mathematical Teaching, which had been formed as a result of the last congress. These reports, which came from 21 countries, included 150 already published with 50 more still to appear. There was still no consensus with regard to vectorial notation and the topic was further postponed to the next congress. Again there was an elaborate programme of social events, including four receptions, one of which was hosted by Lord Rayleigh (as Chancellor of the university), an organ recital in King's College Chapel, and an expedition to lay a wreath on Cayley's tomb. Unfortunately, as Darwin in his closing speech wryly observed, The weather has been such that in a more superstitious age we should surely have concluded that heaven did not approve of our efforts;but fortunatelytoday we regard it rather as a matter for the consideration of Section IIIa to decide why it is that solar radiation acting on a layer of incompressible fluid on the planet should have selected England as the seat of its most unkindly efforts in the way of precipitation.
Conclusions The continuous increase in the number of members attending the congresses, as well as the consistently high number of different nationalities represented, meant that the congresses did, as the original organisers had hoped, provide opportunities for encouraging cooperation. The establishment of various international commissions shows that such cooperation did indeed take place. It was as a result of action initiated at the congresses that the publication of Euler's works finally began; the publication of the first five volumes had been completed by the time of the Cambridge congress. The congresses were not so successful in their treatment of the questions of bibliography and terminology. No real progress was made, and no up-to-date international bibliographical journal appeared until Zentralblatt in 1931 and Mathematical Reviews in 1940. But the congresses were conspicuously successful in giving a view both of the current state and historical development of mathematics, Hilbert's paper in Paris being a prime example. The plenary lectures were especially valuable, even though it took until Rome to establish a balanced set of sectional meetings. In only 15 years, mathematicians did well at promoting mathematics as an international activity.
Faculty of Mathematics and Computing The Open University Milton Keynes,MK7 6AA England THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
41
Linear Ill-Posed Problems Are Solvable on the Average for All Gaussian Measures J. F. Traub and A. G. Werschulz
1.
Introduction
The purpose of this article is to bring a series of rather surprising results which have appeared in the technical literature to the attention of a wider audience. We describe the results informally here; precise definitions and results will be given later. Our subject is the solution of an ill-posed problem on a digital computer. Such a problem can at best only be solved approximately. We say a problem is solvable if we can compute an Eapproximation for any positive e, and is unsolvable if we cannot compute an e-approximation, even for arbitrarily large e. It was shown in [17] that a linear problem is unsolvable iff it is ill-posed. Are there circumstances under which one can avoid this negative conclusion? It is common to associate an operator S with an illposed problem. If S is linear, then a problem is ill-posed iff S is unbounded. Note that unboundedness is a worstcase concept; the s u p r e m u m of S operating on elements in the unit ball of its domain is infinite. This suggests introducing the concept of a problem's being ill-posed on the average if the expectation of S with respect to a measure is infinite and being well-posed on the average if this expectation is finite. These concepts were introduced in [17], where it was shown that if the measure is Gaussian and the linear operator S is measurable, then a linear problem is unsolvable on the average iff it is ill-posed on the average. A natural next step is to find a linear ill-posed problem which is also ill-posed on the average. Several attempts to do this for Gaussian measures were unsuccessful. This suggested the question: Is every linear problem wellposed on the average for any Gaussian measure? This question was answered in the affirmative in [5, 15]. The conclusion in the title follows immediately from this and the preceding result. 42
What if the measure is non-Gaussian? N o w the situation is more complicated. In particular, there are problems that are even ill-posed on the average. In this article, we round out the picture by presenting examples of such problems for non-Gaussian measures. The reader m a y note a similarity between the result that a linear ill-posed problem is unsolvable and a result
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2 (~)1994 Springer-Verlag New York
from computability theory. In [11], it is s h o w n that a linear operator preserves computability iff it is bounded. There has been considerable interest in this result; see, for example, [1] and [10, pp. 185, 215]. The results in [11, 17] indicate a reassuring robustness. In the context of computability theory, Pour-E1 and Richards show that a linear u n b o u n d e d transformation causes noncomputability. In the context of informationbased complexity, Werschulz shows that a linear unb o u n d e d transformation causes unsolvability. (See [14] or [19] for the formulation of information-based complexity.) Note, however, that there is no counterpart in the w o r k of Pour-E1 and Richards to the average-case setting, which is the p r i m a r y focus of this article9 It might be of interest to s t u d y computability on the average. We emphasize that this article deals with solvability, not computational complexity. C o m p u t i n g an eapproximation for a solvable problem m a y still require arbitrarily large c o m p u t i n g resources 9 Indeed, there is no model of computation in this article. We only require that certain functionals can be computed. (This is sometimes called an oracle or black-box model.) We summarize the contents of this article. After introducing the basic concepts in Section 2, worst-case results are reported in Section 3. The average case is discussed in Section 4. We report three new results for non-Gaussian measures. In the concluding section, m u c h stronger conclusions are given for Gaussian measures. The backward heat equation is used as a running illustrative example. Proofs of the n e w results are sketched in the Appendix.
2.
Basic Concepts
M a n y problems that arise in science and engineering are inverse problems; that is, we are given a subset D of a n o r m e d linear space F1, a n o r m e d linear space G, and a m a p p i n g A : G --+ Ft. For an element f E D, our goal is to compute u E G such that A u = f . Recall that H a d a m a r d [2, 3] said that such a problem is well-posed ~ if the following three conditions are satisfied: (1) Existence. For every f E D, there exists an element u E G such that A u = f . (2) Uniqueness. There is at most one element u such that Au = f.
(3) Continuity. The m a p p i n g f H u is continuous. A problem that is not well-posed is said to be ill-posed. Suppose without essential loss of generality that A is injective, and let D be the range of A. Define the operator S : D (C_ F1) --* G b y 'S = A -1. Then the solution u of the original problem A u = f is given b y u = S f . In this article, we shall consider this latter form of the problem. 1Hadamard's original term was "correctement pos6"; his translator used "correctlyset."
Note that m a n y other mathematical problems (besides inverse problems) can be stated as obtaining S d for some f E D, where S is a linear or nonlinear operator, see [13], [14], or [19]9 We say that S is the solution operator for o u r problem. In this article we further confine ourselves to linear problems; that is, we assume that the operator S is linear, in which case D m a y be taken as a linear subspace of F1 without loss of generality. Recall that a linear operator S on a n o r m e d linear space is continuous iff it is bounded, i.e., iff there exists a finite constant M such that sup IISflI _< M. .fED
llfll_<1
If S is not bounded, it is said to be unbounded. Hence, for linear S, a problem is well-posed if S is bounded, and is ill-posed otherwise. T h r o u g h o u t this article w e will illustrate our results with the following example, which is a standard instance of an ill-posed problem: MAIN EXAMPLE. The backward heat equation. Let I = (0, 1). Consider the model one-dimensional heat equation ut=uxx for(x,t) EIxL~, with zero b o u n d a r y conditions and initial condition u(., O) = f . Let F1 = G = L2(I). Write (., .) for the usual L2(I) inner product. For each positive integer j, we let u j ( x ) = v ~ sin(jTrx)
(x E I);
this is the j t h eigenfunction of the heat equation. For t E k, we let D = Dt =
f E FI :
]
.
j=l Then for f E D, oo
( f , uj )e-J2~2tuj
Sf = Stf = j=l
is the solution to this problem at time t. Suppose that t < 0, i.e., we are trying to solve this problem backward 9 . '2~2 t 9 . , m time. Because St u j = e -3 uj for all m&ces 3, we see that the operator St is u n b o u n d e d for any t < 0. [] Assume n o w that we want to approximate S f b y using a digital computer. In applications, f is typically a multivariate function defined on an uncountable set of points, whereas a digital c o m p u t e r can only store a finite set of numbers. Hence, the mathematical input f m u s t be replaced by a computer input; that is, the c o m p u t e r input N f has the form N f = [Al(f),... ,/~n(f)],
Vf E D,
where the )~ are functionals. Here, n = n ( f ) is the n u m ber of functionals. We assume that ,~1, 999 ,~n E A, with T H E M A T H E M A T I C A L I N T E L L I G E N C E R V O L . 16, N O . 2, 1994
43
Mathematical input
Information operator N
IP
f
Computer input
sf
Figure 1.
(3) The problem is unsolvable if we cannot compute an e-approximation, no matter h o w large e is chosen to be. Note that the extent to which a problem is solvable depends, in particular, on the setting [i.e., on how e(r N) is defined[ and the class A of permissible information functionals.
3. A a given set of permissible information functionals on F1. Note that the n u m b e r n(f) of functionals, as well as the choice of which functionals Ai to evaluate, m a y be determined adaptively; that is, they m a y d e p e n d on the previously computed information; see [14] for further details. The many-to-one operator N is called an information operator. We shall refer to the computer input N f as the information about f. See Figure 1. What are typical examples of A? (1) Let F1 be a function space. For some applications, only function evaluations (standard information) can be computed. The class A std consists of function values, and N f = [ f ( x l ) , . . . , f(Xn)]. (2) Let/71 be a normed linear space. The class A* consists of all continuous linear functionals on F1. In particular, if F1 is a Hilbert space, then the continuous linear functionals are inner products of f with elements of F1. Because N is many-to-one, we cannot compute S f exactly. We have to settle for an approximation. We use the information N f to obtain an approximation r to Sf. We say that r is an algorithm using the information N. This formulation is used in information-based complexity; see [13], [14], or [19]. The same formulation is used in optimal recovery; see [7] or [8]. The quality of an algorithm r using information N is given by its error e(r N). H o w this error is defined depends on which setting is used. We consider the worstcase setting in Section 3, and the average-case setting in Sections 4 and 5. Finally, we define the degree of solvability of a problem. Let us say that we can compute an e-approximation if there exist information N and an algorithm r using N such that e(r N) _< e. There are then (at most) three mutually exclusive possibilities: ( D T h e problem.is solvable if we can compute an eapproximation for any e > 0, no matter h o w small. (2) The problem is weakly solvable2 if there exists an e0 > 0 such that we can compute an e-approximation iff C _~CO.
2 This is not to be c o n f u s e d with the w e a k solution of a n operator equation. 44
THE MATHEMATICALINTELLIGENCERVOL. 16,NO. 2,1994
Worst-Case R e s u l t s
We first consider the worst-case setting, in which the error of an algorithm r using information N is its largest value over all possible d, i.e., ew~
=
sup
IISf- r
fEDnF
Here F = {f E F1 : IIf]l -< 1} is the class of problem elements for which we seek to find approximate solutions, our restriction to the unit ball being merely a normalization. We assume that supfEDnF n(f) < oo, so that we cannot use an arbitrarily large number of information evaluations at any problem element f. Moreover, we assume that A = A*, i.e., the permissible information functionals are precisely the continuous linear functionals. Finally, in this section, the terms "solvable," "weakly solvable," and "unsolvable" will always mean "in the worst-case setting." The following theorem for the worst-case setting was proved in [17]: T H E O R E M 1. A linear problem is unsolvable iff it is ill-
posed. MAIN EXAMPLE(continued). Suppose we wish to compute the numerical solution of the backward heat equation. Let t < 0. Recall that our solution operator S is St : Dt (C_ F1) -* G, where F1 = G = L2(I) and Dt = S_t(L2(I)). As St is u n b o u n d e d , the problem is ill-posed. Thus, this backward heat equation problem is unsolvable in the worst-case setting. [] Because we are using a worst-case setting, Theorem 1 means that for any information N and any algorithm r using N, there exist bad problem elements f E D N F for which IISf - r is arbitrarily large. We might hope that there are only a "few" such bad elements. This hope is demolished by the next theorem, which says that no algorithm, no matter h o w sophisticated, can do substantially better than the zero algorithm r which is defined by r VfcD. (The zero algorithm is arguably the most naive algorithm imaginable.) THEOREM 2 [17]. Let q E [0, 1). For any information N and for any algorithm r using N, let
{
"Sf-C(Nf)]l
Aq= f ~ O n F : IlSf -Co(gf)ll
<-q}.
If the problem is ill-posed, then the relative interior of Aq in the unit ball of F1 is empty. Note that q can be chosen arbitrarily close to 1. The set
Aq consists of all problem elements at which our clever algorithm r beats the naive zero algorithm by a factor of q. Theorem 2 tells us that Aq can contain no ball, no matter how small its radius. In short, there are many bad elements for any algorithm that attempts to solve an ill-posed linear problem in the worst-case setting. We consider the consequences of Theorem 1. If the solution operator is bounded, then the problem is either weakly solvable or solvable. As weak solvability will generally not be enough for our needs (we might not be able to find an algorithm whose error is small enough), we want to distinguish between these classes of problems. Using Theorem 1, along with results in Chapter 2 of [13], we have the following "trichotomy theorem" characterizing the solvability of linear problems in a Hilbert setting: THEOREM 3. Suppose that the domain and codomain of the linear solution operator S are Hilbert spaces and that A = A*. (1)" If S is unbounded, then the problem is unsolvable. (2) If S is bounded but noncompact, then the problem is only
weakly solvable. (3) If S is compact, then the problem is solvable. Note that bounded noncompact solution operators often appear in practice. One example is the identity operator on any infinite-dimensional space. Another is the solution of the Poisson equation - A u = f under zero Dirichlet boundary conditions, with error being measured in the H~-norm (H01 is the space of functions having one L2 derivative and zero boundary conditions), with f belonging to the dual space of H01. These two problems illustrate the general principle that a solution operator will be bounded but noncompact when there is just enough smoothness in the elements of F to guarantee existence and uniqueness of solutions. We close this section by mentioning that the unsolvability of ill-posed problems in the worst-case setting was proved under the assumption that the absolute error criterion ]]Sf- r was used to measure the error at a particular f. This negative result may not hold if we use a different error criterion. In particular, suppose that our original problem was a linear inverse problem Au = f, and that we use the residual error criterion IIA(r - f ll. Then the problem is either weakly solvable or solvable. For further details, see [18]. We also note that regularization methods (see, e.g., [9]), which are often used to solve ill-posed problems, do not claim to solve these problems in the worst-case setting. Instead, they provide a solution in an asymptotic setting, described in Chapter 10 of [14].
4. Average-Case Results We have seen that ill-posed problems are unsolvable. Because ill-posedness involves a supremum, it is a worstcase concept. This suggests that a positive result might be achieved if one introduced the concept of being wellposed on the average. One might be able to show that if an ill-posed problem is well-posed on the average, then it is "solvable on the average." This motivated Werschulz to initiate the average-case study of ill-posed problems; see [17]. In the rest of this article, we consider an average-case setting, where we replace the requirement that the error be always at most e by the weaker requirement that the expected error be at most e. The terms "solvable," "weakly solvable," and "unsolvable" will always mean "in the average-case setting" in what follows. We make the following assumptions: (1) D is a linear subspace of a Polish space 3 F1, the space F1 being equipped with a Borel probability measure # and the subspace D being of full measure. (See [4, 6, 12, 14, 16] for background material about probability measures on infinite-dimensional spaces.) (2) G is a separable Hilbert space. (3) S : D --* G is a measurable linear operator. (4) The class A is a subset of the class A~ of measurable linearfunctionals on D. In the average-case setting, the error of an algorithm r using information N is its expected L 2 error, i.e., eavg(r
= (/D HSf --r
f)]]2#(df)) 1/2
Remark. Note that there are several changes when we replace the worst-case setting by the average-case setting. Of course, the error is given by an average, rather than a supremum. Moreover, we take our average over all elements in D, and not merely in the unit ball; often, this does not affect the results (see, e.g., [14], pp. 257-268). There is one other important difference between the two settings. In the worst-case setting, we assumed that for any information N, the number n(f) of linear functionals in N f was uniformly bounded over all problem elements f. In the average-case setting, we can relax this requirement; we only require that the integral fD n(f)#(df) be finite. See [14] for further discussion. [] Recall that in the worst-case setting, a problem is unsolvable iff the solution operator is unbounded, i.e., iff the problem is ill-posed. This suggests that solvability in the average-case setting may depend on whether the problem is "well-posed on the average." We say that S
3 A Polish space is a c o m p l e t e s e p a r a b l e m e t r i c space. THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2, 1994 4 5
is bounded on the average if fD IlS fN2#(df) < ~ " We say a problem is well-posed on the average if S is b o u n d e d on the average, and is ill-posed on the average otherwise. We first consider problems that are well-posed on the average. Clearly, any such problem is at least weakly solvable because eaVg(Oo, N) =
(fD IlSfll 2#(df))1/2
i.e., the average-case error of the zero algorithm ~0 is finite. Furthermore, for some restricted classes A, the problem is only weakly solvable. One example of a problem that is only weakly solvable is the inversion of a compact injection on a Hilbert space, the permissible information consisting of inner products of f with elements that are orthogonal to a nontriviaI eigenspace. On the other hand, we have the following positive resuit: T H E O R E M 4. If a linear problem is well-posed on the average, S is closed, and A* c_ A c A~, then the problem is solvable. (The proofs of the theorems in this section are treated in the Appendix.) Hence, a problem that is well-posed on the average is solvable on the average, assuming that any continuous linear functional is permissible. If the problem is not well-posed on the average, it m a y or m a y not be solvable, d e p e n d i n g on the measure # and the class A of permissible information functionals, as seen in the following theorems. O u r first result states that there exist problems that are ill-posed on the average but are solvable on the average, assuming that only continuous linear functionaIs are permissible. T H E O R E M 5. Let S be a diagonal operator on a separable Hilbert space G, let # be an atomic measure on the eigenvectors of S, and suppose that A = A*. Then the problem is solvable, regardless of whether the problem is well-posed on the average. Our second result is that one can find problems that are ill-posed on the average, but for which the average-case solvability d e p e n d s strongly on the class of permissible information functionals: T H E O R E M 6. There exists a problem that is ill-posed on the average, such that the following holds: (1) If A = A*, then the problem is unsolvable. (2) If A = A~,, then the problem is solvable. It is natural to ask whether there exists a problem that is so badly ill-posed on the average that it is unsolvable for A = Az. We have not yet succeeded in finding such a problem. It is tempting to conjecture that such a problem exists. 46
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
5.
Average-Case Results for Gaussian Measures
We have seen that problems in the average-case setting can be either well-posed or ill-posed on the average. If a problem is well-posed on the average, it is solvable (assuming that any continuous linear functional is permissible). However, if the problem is ill-posed on the average, the problem can be either solvable, weakly solvable, or unsolvable, depending on the measure # and on the class A of permissible information. In short, there is no simple criterion for deciding solvability in the average-case setting. We will n o w discuss a recent result that settles the question of average-case solvability for any Gaussian measure #. We assume that the measure Iz is a Gaussian measure on a separable Banach space F1. This means that there exists a linear correlation operator C~ : F~ ~ F1 and a mean element m~ E F1 such that
IF1 exp{iJ~(f) }#(df) = exp { i)~(m~) - ~ A(C~A) } , for all ,~ ~ F{, where i = v'-2X. (Equivalently, a measure on F1 is Gaussian iff all continuous linear functionals on F1 are normally distributed.) Assume that the solution operator S is a measurable linear operator defined on a subspace D of F1 of full measure. Finally, we assume that the class A of permissible information functionals satisfies A* C_ A C_ A~. Consider what happens w h e n the measure for our backward heat equation problem is Gaussian: MAIN EXAMPLE (continued). Recall that we are trying to c o m p u t e approximations to the problem whose solution operator is St : Dt (c_ F1) ~ G, where stY is the solution at time t to the heat equation with initial condition d, with F1 = G = L2(I) and Dt = S-t(F1). We suppose that # is a Gaussian measure on F1; there is no essential loss of generality in assuming that the m e a n element of # is zero. To simplify the analysis, we assume that the eigenvectors of C~ are precisely the eigenvectors ul, u 2 , . . , of the heat equation and that the corresponding eigenvalues are al _> O~2 ~ ' ' ' > 0. We will also require that & be measurable with respect to #. As # is Gaussian, the Kolmogorov z e r o - o n e principle holds. This means that St is measurable iff the series ~-~j=l OgJe-2j27r2t converges. Moreover, an easy calculation establishes that oo
o ll&flla#(df) = ~ - ' a j e - 2 ; 2~2t. j=l
Thus, the measurability of St implies that St is b o u n d e d on the average, and so the problem is well-posed on the average. As we will discuss below, this is an instance of a general result. For this example, we can d r a w some additional conclusions. We n o w apply the techniques in [19, Chapter 71 to find optimal information Nn using at most n functionals and
an optimal error algorithm using Nn; that is, e avg(r Nn) is the infimum of eavg(r N) over all algorithms r using information N consisting of at most n functionals. We find that
Nnf = [(f, ul),...,(f, Un)], Vf E D,
Using the Lebesgue m o n o t o n e convergence theorem, w e find that (x)
eavg(~gn'Xn)2 = E where
is nth optimal information and that the algorithm
r
=~
e-j2r2t(f, uj)uj,
Vf E D,
~J = / J I(Sf'gJ)12#(df)" S being b o u n d e d on the average, w e have
j=l
oo
is an optimal error algorithm using Nn. It is straightforward to check that 1/2 e
avg(r
~J~
j=n+l
=
[
se-2522t
Y~ "~5 = fD IISflla#(df) < co. j=l
But the remainder in approximating a convergent series by its partial sums converges to zero, so
j=n+1
lim
n ----+oo
eavg(On,
= O,
lim eavg(r
n---~ oo
Because St is b o u n d e d on the average, we see that as required.
Nn) = O.
[]
Proof of Theorem 5. Let F1 = G be a Hilbert space with To be even more specific, suppose that aj = e -2527r2t~ orthonormal basis {ej }5~=1, and let {aj }~-1 be a nondefor some to > 0, i.e., that c~1/2 ,_,, = Sto. We find that if creasing sequence of positive numbers. Define t > - t o , then St is b o u n d e d on the average, and St ,is measurable with d o m a i n of full measure. Hence, the D= f ~ G : rr (f, ej)12 < c~ , problem is solvable for any t > -to. On the other hand, j=l if t < - t o , then the d o m a i n of St has measure zero. []
{ z
Hence, we have an example for which measurability with respect to a Gaussian measure implies that the problem is well-posed on the average, and hence solvable on the average. This is an instance of a general property, as stated in the following theorem, which was proven independently in [5, 15]:
THEOREM 7. Let S : D ( c_ F1) ~ G be a measurable linear transformation of separable Banach spaces. If F1 is equipped with a Gaussian measure #, then the problem is well-posed on the average. From Theorems 4 and 7, we immediately obtain the result mentioned in the title of this article.
}
and define our solution operator S : D --* G b y oo
Sf = E
as(f'es)es'
Vf E D.
j=l
Let # be an atomic measure on {ej }~=1" Note that
/D rlSfll2#(df) = ~
a2#(ej),
j=l
and this m a y or m a y not be finite depending h o w we choose a 5 and #(ej). Hence, S m a y or m a y not be wellposed on the average. To show that this problem is solvable, let
Appendix A(f) = ~
We sketch the proofs of Theorems 4, 5, and 6 in this Appendix.
Proof of Theorem 4. It suffices to prove the result for A = A*. Because S is closed, the domain D(S*) of S* is dense in G. Let gl, g2, 999 E D (S*) be an orthonormal basis for G, so that (S., gj} = (., S*gs) is a b o u n d e d linear functional on D. Define continuous linear information Nn as
Nnf = [(Sf,91,),..., (Sf, gn)], and the algorithm
(~n
Vf E D,
Vf E D,
noting that A E A*. Define information N by N f [,k(f)], and define an algorithm r using N b y r
=
ojej 0
=
if )~(f) = j - 1 otherwise.
Then eaVg(r N ) ---- 0, showing that the problem is solvable. D
Proof of Theorem 6. Let F1 = Eoo, u n d e r the metric
using Nn as
C n ( N , J ) = ~ (Sf, gj}g5, j=l
j - l ( f , e5) ,
j=l
V f E D.
d(x, y) = ~ j=l
2 -j
Ixj
Yjl
1 + Ix 5 - y j l '
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
47
where xj denotes the j t h c o m p o n e n t of x E F1. We let uj denote the standard j t h basis vector for Ft. Next, we choose a nondecreasing sequence {a 5 }j~=l of positive numbers, and let
D=
2 2 ajf~ < oo
f E FI :
a Sherman Fairchild Distinguished Scholar at the California Institute of Technology and a visiting researcher at the Santa Fe Institute. The work of the second a u t h o r was s u p p o r t e d in part by NSF Grant CCR-91-01149.
References
.
j=l
Letting G be a Hilbert space with orthonormal basis {e5}5=1, we define our solution operator S : D --+ G to be oo
Sf = E
asfsej,
Vf e D .
j=l
(That is, Suj = aSe 5 for all indices j.) Finally, w e let # be oo an atomic measure on {U 5}5=1, chosen so that oo
j=l
Then S is not b o u n d e d on the average. Suppose first that A = A*. It is easy to see that any continuous linear functional k on D has the form
j=l
j=l
where only finitely m a n y rh, r/2,.., are nonzero. From this, it then follows that for any information N, there is an index k0 such that N(Uk) = 0 for k > k0. Thus, if r is any algorithm using N, we h a v e oo
eaVg(r
->
E
Ilokek
-
=
k=ko
i.e., the problem is unsolvable if A = A*. Now, suppose that A = At,. Choosing distinct positive numbers rh, rt2,..., we define
j=l
j=l
By the Lebesgue d o m i n a t e d convergence theorem, we see that ~ is defined almost everywhere and that ~ E A~. Define information N b y N f = [~(f)], and define an algorithm r using N by r
=
ajej 0
if)~(f)=~j otherwise.
Then eavg(r N) = 0, and so the problem is solvable for A = A~,. []
Acknowledgments We would like to thank K. Ritter and H. Wo~niakowski for their insights, comments, and suggestions. The w o r k of the first author was s u p p o r t e d in part by NSF Grant CCR-91-14042 and AFOSR Grant 91-0347. Part of this w o r k was done while the first author was 48
~
MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
1. D. S. Bridges, Review of Computability in Analysis and Physics (by M. B. Pour-El and J. I. Richards), Bull. Amer. Math. Soc. (New Series) 24, 216-228 (1991). 2. J. Hadamard, Sur les probl6mes aux d6riv6es partielles et leur signification physique, Princeton Univ. Bull. 13, 49-52 (1902). 3. J. Hadamard, Lectureson the Cauchy Problem in Linear Partial Differential Equations, Dover, New York, 1952. 4. J. L. Kelley and T. P. Srinivasan, Measure and Integral, Volume I, Graduate Texts in Mathematics 116,Springer-Verlag, New York, 1988. 5. M.A. Kon, K. Ritter, and A. G. Werschulz, On the average case solvability of ill-posed problems, J. Complexity 7, 220224 (1991). 6. H.-H. Kuo, Gaussian Measures in Banach Spaces, Lecture Notes in Mathematics 463, Springer-Verlag, Berlin, 1975. 7. C. A. Micchelli and T. J. Rivlin, A survey of optimal recovery, Optimal Estimation in Approximation Theory (C. A. Micchelli and T. J. Rivlin, eds.), Plenum, New York, 1977, pp. 1-54. 8. C. A. Micchelli and T. J. Rivlin, Lectures on optimal recovery, Numerical Analysis Lancaster 1984, Lecture Notes in Mathematics 1129, (P. R. Turner, ed.), Springer-Verlag, Berlin, 1985, pp. 21-93. 9. V. A. Morozov, Methods for Solving Incorrectly Posed Problems, Springer-Verlag, New York, 1984. 10. R. Penrose, The Emperor's New Mind, Oxford University Press, Oxford, 1989. 11. M. B. Pour-E1 and J. I. Richards, Computability in Analysis and Physics, Springer-Verlag, Berlin, 1988. 12. A. V. Skorohod, Integration in Hilbert Space, SpringerVerlag, Berlin, 1974. 13. J. E Traub and H. Wo~niakowski, A General Theory of Optimal Algorithms, Academic Press, New York, 1980. 14. J. E Traub, G. W. Wasilkowski, and H. Wokniakowski, Information-Based Complexity, Academic Press, New York, 1988. 15. N. N. Vakhania, Gaussian mean boundedness of densely defined linear operators, J. Complexity 7, 225-231 (1991). 16. N. N. Vakhania, V. I. Tarieladze, and S. A. Chobanyan, Probability Distributions on Banach Spaces,Reidel Publishing Company, Dordrecht, 1987. 17. A.G. Werschulz, What is the complexity of ill-posed problems? Numer. Func. Anal. Optim. 9, 945-967 (1987). 18. A. G. Werschulz, Optimal residual algorithms for linear operator equations, J. Complexity 7, 150-173 (1991). 19. A.G. Werschulz, The Computational Complexity of Differential and Integral Equations: An Information-Based Approach, Oxford University Press, Oxford, 1991.
Department of Computer Science Columbia University New York, NY 10027 USA Division of Science and Mathematics Fordham University, Collegeat Lincoln Center New York, NY 10023 USA and Department of Computer Science Columbia University New York, NY 10027 USA
H o w to Tangle with a Nested Radical Susan Landau*
Introduction Like many an intriguing question in algebraic manipulation, the problem of denesting nested radicals had its origins with Ramanujan. That is not to say that no one had ever considered the problem of denesting radicals before he did. Certainly, the fact that ~/5 + 2v~ = v ~ + v~ is simple enough that it must have been known several centuries ago. Ramanujan [11] upped the ante. For each of the formulas below, he took the doubly nested radical on the left and simplified it to a combination of singly nested radicals on the right: ~/T~-I
= ~ -
Xy2-~+ ~/'4-/9,
v/xY-5- if4 = (1/3)(xY2+ ~ - 0 - xY~), ~ / / 7 f f ~ - 19 = ~ ' ~ ~/77-~ V
~/~/~- ~ ~/~-~/2-~--5
~27-3,
lem remains open, there are now solutions to a number of subproblems: for denesting real nested square roots [3], for denesting real radicals of depth 2 [5], for radicals of a special form [9, 13], and for radicals using roots of unity [7, 8]. I am interested in three questions: When does a simplification exist? Is there a technique for finding it? How long does it take? In this article, I will briefly present some recent theorems for radical simplification, and the algorithms they lead to. For proofs, and complete presentations, the reader is urged to read the original papers. What D o e s It Mean to D e n e s t a Radical? 1 I begin by making precise what is meant by simplifying a nested radical. Assume all fields are characteristic 0. In defining the depth of nesting of a formula, I will view the formula as a sequence of formal symbols. Following [3], a formula over a field k and its depth of nesting are defined recursively: 9 An element in the field k is a formula of depth 0 over k. Thus, 17 is of depth 0 over Q, while I + v~ is of depth 0 over Q(v~).
4/g+ 1 -
= (1/3)(~9-8- xY~ - 1), = ~ +
~ -
1 This brief overview is taken from [8]. A m o r e detailed d i s c u s s i o n of these issues can be f o u n d in [7].
~9-/25.
What Ramanujan neglected to do was provide a theory for simplifying nested radicals. When computers came along, symbolic computation became important. There was a practical reason to find an algorithm for denesting nested radicals. A machine has no problem with 1, V/5 + 2v~, 5 + 2v~, (5 + 2x/6) 3/2 as a basis for Q(v/5 + 2v~) over Q. Most human beings seem to prefer the basis 1, x/2, v~, v/6. The difficulty is that there was no general method to go from the complex form of a nested radical to a simplified version. If Ramanujan had one, he never wrote it down. Necessity has often been the mother of invention, and so it proved to be in this case. Although the general prob-
* S u p p o r t e d by NSF g r a n t DMS-8807202. THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2 (~ 1994 Springer-Verlag New York 4 9
9 An arithmetic combination (A -4- B, A x B, A/B) of formulas A and B is a formula w h o s e depth over k is max(depth(A), depth(B)). Because I view V/5 + 2 v ~ - v'2 - x/3 as a sequence of formal symbols, it has nesting depth 2 over Q. 9 A root ~/A of a formula A is a formula w h o s e depth over k is 1+ depth(A). 9 Finally, V / V ~ +
2v/-6 has depth 3 over Q.
Such a formula is called a nested radical. A nesting of a means any formula A that can take c~as a value. But there are difficulties involved as an nth root is a multivalued function. When I write the equation V/5 + 2 v ~ = v ~ + v~, it is unclear which v ~ I mean and which V~. The usual interpretation is the positive real roots for all four choices in the equation above. Under those choices, the equation is correct; u n d e r others, it m a y not be. I start with the input as a sequence of expressions of the form: al = n,y-q, q E k, ft 2
O~2 = Ol3 =
Ctrn =
~
~ ) ,
P2 E ]~[Xl],
nv/p3(oL1, O~2), P3 e k [ X l , x21,
q(oo,...,O~rn-1)-t-
q and 16r~ E k [ x l , . . . ,
Xm-1]
nmv/Pm((~l,...,O~m-1),
and a = am.
It is not hard to go from this complicated sequence to the minimal polynomial for a over k. One can do it by first determining a minimal polynomial for a~ over k, then using that to determine a minimal polynomial for a2 over k, and so on (see [81 for details). One must take careful note of the choices of roots as they are made. Once one chooses a particular nth root for ~/-~, the same value must be assigned to it each time it appears. If the roots are specified at the time a nested radical is given, choose those roots. W h e n e v e r roots a p p e a r which have not been previously specified; one is free to pick a value arbitrarily for them, so long as after that one consistently chooses the same value to represent the root. Suppose one is interested in denesting the expression ~ - 1 -
~/1--/9.
terms of degree) field extension possible. In the above example, I would choose the ~ that is already in the field. I will say the formula A can be denested over the field k if there is a formula B of lower d e p t h than A such that A and B have the same (real or complex) value. I will say that A can be denested in the field L if there is a formula B of lower nesting depth than A with all of the terms (subexpressions) of B lying in L, again with A and B having the same value. For any c~, I define the d e p t h of c~ over k to be the d e p t h of the m i n i m u m - d e p t h expression for c~. W h e n I am given a formula A for c~such that A can be denested, I will sometimes say that c~can be denested. For the remainder of this article I will assume that c~ has been given by its minimal polynomial over k, and the choice of roots in any ambiguous situation has been spelled out.
Denesting Real Nested Radicals Real nested radicals were Ramanujan's examples and form a clear starting point for the problem. Nested square roots form the simplest example of nested radicals. Since nested real square roots describe the Euclidean distance from one vertex on a p o l y h e d r o n to another, an algorithm for their simplification is potentially of practical value. With such concerns in mind, Borodin, Fagin, Hopcroft, and Tompa [3] studied simplifying nested square roots. Their first result demonstrates w h e n square roots suffice for denesting: T H E O R E M 1 [3]. Let Q c_ k, and let a, b, r be elements of k, with v/r not in k. The the following are equivalent: 1. x / a + bv/F is in k(v/-r, v/-6T,..., v ~ ) for al, 9 9 9 al in k. 2. x / a + bv/-F is in k(v G, v~) for some s # 0 in k. 3. x/a 2 - b2r is in k.
some
Next they gave the conditions u n d e r which fourth roots m a y help: T H E O R E M 2 [3]. Let Q c_ k, and let a, b, r be in k, with v'7 not in k. Then the following are equivalent: 1. x/'a + bx/-r is in k(,~/r, v r ~ , . . . , v ~ ) for some a l ~ 9 9 9 , a l in k. 2. Either 4/-rv~x/a + bv'-r or v ~ v / a + b v ~ is in k ( v ~ ) , for some s # 0 in k 3. Either V/r(bar - a 2) or v / ~ - b2v is in k.
1 - ~ / ~ , I need to k n o w to which root
Finally, they showed that, for denesting expressions containing only real square roots, only square roots or fourth roots play a role:
of x 3 - 9 1 am referring in Q(~/,Y2 - 1): the one which satisfies x - c~8 - 4c~s - 4oL2, or one of the two satisfying X 2 q- (O~8 -4- 4c~s + 4oL2)x q- (3o~4 -4-6a), w h e r e c~ = ~ - 1. For the purposes of this paper, I have chosen that w h e n I adjoin ~/~, I do so in a way that makes the smallest (in
T H E O R E M 3 [3]. Let k be a real extension of Q, and let a, b, v be in k with v/r not in k. Let nl , . . . , nl >_ 1,and let al , . . . , az in k be positive. If ~ b v ~ is in k(~x/h-~,..., =~/-~), then x/-d-~ b v ~ is in k( 4/F, v/-~, . . . , v ~ ) for some pl, . . . ,PL in k.
The polynomial x 3 -- 9 factors over the field Q ( ~ / q ~ - 1). To denest ~
50 THEMATHEMATICALINTELLIGENCERVOL.16, NO. 2,1994
In combination, these three theorems provide the backbone for an algorithm for denesting real nested square roots. Let r0 be in Q, and inductively let r~ = ai + b~rv/-FTL-(~_l, with ai and bi in Q(ri-1) for i = 1 , . . . ,n. Consider the following tower of fields: ko = Q, and for all i _> 0, ki+l = k i ( v ~ ) , w h e r e ri C ki. There are two ways v ~ can denest using only square roots. One is that ~ m a y denest. The other is that there is a field K satisfying the following (i) it contains a, b, rn-1, (ii) it contains only elements of d e p t h n - 1, and (iii) a 2 - b 2 r n _ l is a square. To find the field K in which these conditions are satisfied, I attempt to denest via 2 - b2rn_l. If this is accomplished, I will have f o u n d a field k in which all elements have depth at most n - 2. I will also have f o u n d an element s such that via 2 - b2rn_l is in ]~(v~, r~/GC~_2).Then K = ]r ~, rv/-G-~_2). This idea leads to a recursive algorithm. If one is careful with the input and output, the running time of this algorithm is polynomial. But this tells us only h o w to handle a nested square root which consists of roots recursively f o r m e d by ri = ai + bi rv/-ff(~,-1,with ai and bi in Q(ri-1) for i = 1 , . . . ,n, and ro in Q. If we w a n t to denest all real nested square roots, we have to be able to handle linear combinations of nested square roots. I look first at the simpler case, in which none of the radicals is nested. T H E O R E M 4 (Besicovitch [2]). Let {ei} denote the set of
n I radicals,
~/ppipT2
m, ,
""Pz
O<miKn,
1
where p l , . . . ,pz are the first l primes. Then the set {ei} is linearly independent over Q. To check if a linear combination of square roots is equal to zero, one needs only to check if it is trivially equal to zero. There is the obvious solution of factoring all the integers u n d e r the square root sign in order to check if pieces cancel, but this is far from optimal. A m u c h faster w a y to do the problem is to compute gcd's of the integers u n d e r the square root signs. Then where appropriate, pull out squares from u n d e r the radical signs. Combine like terms. Repeat until no further simplification is possible. If there is a n y t h i n g left, then the combination of square roots is different from 0. One can i m p l e m e n t a more complex version of this idea to simplify linear combinations of nested square roots. Borodin, et al. [3] consider the case w h e r e k is a real extension of Q, and where li, ai, bi, and v ~ are in k for i = 1 , . . . , h. Suppose E i = h I liv/ai + b i v ~ denests in k. They find the denesting as follows. First denest any single radical that can be denested using the criteria of Theorem 3. Next consider each pair of nested radicals, v/ai + biv'~, v/aj-4-bjv ~ , and see if the product denests. Suppose it is equal to m in k. Replace
liv/ai + biv"~ +ljv/aj + b j v ' ~ b y [li+mlj/(ai+bix/~)]. v / a i + biv/~ in k(v/ai + bix/7~). Iterate the process of looking for a pair of radicals that denests. If at any point the product of any pair of radicals cannot be further denested, then the combination of nested radicals cannot be further denested. Earlier Siegel had studied a more general situation than square roots. Assume that F is an arbitrary real field, and let r l , . 9 . , Vk be natural numbers. Let ql,. 99 qk be elements of F, with ~ - ~ , . . . , T~/~ real. Siegel introduced the multiplicative groups generated by the nonzero elements of F and the first i radicals {~x/~, 999 ~x/~}, which he denoted by F (i). Let r~ be the group index of F (i 1) in r (i). Then Siegel showed: T H E O R E M 5 [12]. With notation as above, the degree of the field extension F( ~x/-~, " " , "~/~) equals I-Ik=l r~. Thus, the basis of the extension is given by k
-I ~
r
O<ei_<
'-1,
i=1
k.
i=1
In seeking an algorithm for simplifying sums of radicals. B16mer recast this as COROLLARY 6 [4]. Let S = y~ik=l 3"i "~x/~i,where the r~ are
natural numbers, and 3"i, fli are in F with ' ~ real. Assume that there is no pair of indices (i, j ) , i r d, with ~ / ~ / ";v/~ c F. Then S is in F if and only if there exists at most one "yi r O. In this case, ~x/~i is in F. That is, arbitrary real radicals ~x/~ are linearly dependent over F if and only if there are already two radicals ~r ~ ' - ~ that are linearly dependent over F. This gives an easy algorithm to check if a sum of radicals over a field Q(a) is in Q(a). Let F = Q(c 0 be an k algebraic n u m b e r field, let S = Y~i=l 3'i ' ~ , and let R = { "~v~, 99-, T~/~-}, where the fli are in F. To show that "~x/~/~Ffi~ is in F, it is sufficient to show that 1. T~v/-fil= fl~ E F,
2. qv/-fi2= fl~ E F, 3. r l I l r l ! F, where r = gcd(rl, r2) and ri = rllr, r 2' = r 2 / r . If F = Q and q = zl/z2, with gcd(zl, z2) = 1, clearly ~ is in Q if and only if ~ - and ~ / ~ are in Z. B16mer uses logarithms to c o m p u t e an approximation to 4/-~,which is then checked to see if it is an rth root of z. If F ~ Q, B16mer [4] gives a Monte Carlo 2 algorithm with error probability 2 - t which, given a, fl, and 3', decides in time polynomial in log r w h e t h e r there is a 3' in Q(a) that satisfies 3'~' = ft. He does this by checking whether 3` is an rth root of fl m o d u l o T distinct primes which are randomly chosen 2 Monte Carlo and Las Vegas methods are the most c o m m o n types of probabilistic algorithms used in c o m p u t e r science. The running time is polynomial in Monte Carlo algorithms, with an exponentially small chance of failure. Las Vegas algorithms always produce the correct a n s w e r with a polynomial expected r u n n i n g time. THE MATHEMATICALINTELLIGENCERVOL.16, NO. 2, 1994 51
from the interval [0, 2T], where T is a polynomial in log r, the input size of fl, and t. B16mer's technique also explains the denestings of the radicals at the beginning of this article. Call depth 2 nested real radicals "Ramanujan Radicals." Building on Theorem 5, B16mer proved T H E O R E M 7 [5]. Let F be a real field, and F' = F ( r~v/~,. . . , r~/-~) be a real radical extension of F of degree r. Let 3" be in F', and let s be a natural number. Assume that ~/~ is real and denests over F. Then there is a nonzero ~1in F such that ~'-q ~/~ is in F'. Consider the field embeddings aj of F( r~v/-fi~,..., r~rfi-~) in its splitting field over F. Let (~ be a primitive r~th root of unity, where r~ is the group index of F (i-1) in F (~). Let 0 < fi _< r~.' The field embeddings aj are given by aj : F( ~V'~I,... , r ~ k ) Then
--* F rrf' ,,r;
, r Ik % G )
T H E O R E M 8. Assume ~/~ denests as above. Let {fli} be a basis of F' over F. Then for some basis element r fli, Y ~ j = l c r j ( f l i ) ~ ( 3 " ) is a nonzero element of F and [ ~ = 1 aj (fli) ~ j (3')]-~ denests ~ . If we l e t , = [~d=l aj (fli)( ~ / ~ ('7))]-~, then ~/denests ~,'-~, that is, ~xY'-~xY-~= 3" E F'. The value of this theorem is that in bounding ~1, it tells us where to search for a denesting. Let us consider the simplest possible case, a Ramanujan Radical. The base field F = Q, and F' = Q( ~'x/~,'", ~ k - ) with ri ~ N and qi ~ Q. Interpret ~ to be the real rith root. I wish to denest 9r~, where 3" = Y~i~l difli, and the {fli} form the basis of F' over F, and the di are in Q. Without loss of generality, one can assume that the di and qj are all integers, that is, that 3' is an algebraic integer. If 3' is an algebraic integer, so is a3(3"~) for each j. Furthermore, (a3(3'~)) - ~ is one of the rs roots of cr3(3'r). Thus, (aj(3'r))-~ is an algebraic integer, and so is [Y~=I aj(fli)~,/-~ (3')]-~, and fli is an element of any basis for F' over F for which fl~3'~ has nonzero trace. As before, B16mer employs the idea of computing logs and lattice reduction to compute ~. If the sum denests, Theorem 8 tells us how. This denesting will have depth 1, that is, it will be a sum of radicals. We can easily check whether this sum is actually in Q or not. Thus, Theorem 8 leads to an algorithm for completely denesting Ramanujan Radicals. Note that Bl6mer's technique guarantees a minimal-depth solution only for the depth 2 case, Ramanujan Radicals. Bl6mer also has a nice solution to the question of sums of nested radicals: k T H E O R E M 9 [5]. Suppose S = ~i=~ fl~ ~x/~ is a sum of real nested radicals such that 3"~is in F~, fli ~ Li, i = 1 , . . . , k, where each F~, Li is a real radical extension of F , a subfield of 52
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
the reals of degree ri, ti, respectively. Assume that no nested radical "~x/-~denests using real radicals. Then 1. S can denest only to 0; 2. if no quotient ~v~*/ r~,~ denests, then S = 0 if and only if fli = O for all i. This gives a polynomial-time algorithm to determine whether the sum of nested real radicals is equal to zero. A Particularly Simple Form of Denesting
Zippel studied the equation
and found the beginnings of a pattern. Whereas ~ - 1 is not a cube in Q(xY2), 9(x~ - 1) is. I can find a 3' which satisfies 3'3 = 9 ( ~ r 2 __ 1 ) in Q(~Y2) by factoring X 3 - - 9 ( ~ / 2 - 1) over Q(xY2). Namely, X 3 -- 9(X~
-- 1)
= (x-(1-~/2+~/4))(x2+(1-~/2+~/-4)x+3~'4-3).
The first factor gives Ramanujan's denesting. Zippel noticed that a similar situation arises with 5 + 2v~. Again, this is not a square in Q (x/-6), but a multiple of it, 2(5 + 2v~), is. Our task is to find 3', where 3'),2 __ 2(5 + 2v~) in Q(v~). We discover X2
-
-
2(5 + v~) = (x - (2 + V'-6))(x - (-2 - v~)).
In both cases, w e have found an element fl in Q such
that, although r is not an element of Q(O), ~ have the following picture: L = K(r
is. We
= KF
k=KNF
In each case, expressions of nesting depth n in the field L have been dropped to expressions of nesting depth n - , in the subfield L. This idea generalizes to a theorem: T H E O R E M 10 (Zippel [,3]3). Assume K is an extension of k, a field containing a primitive dth root of unity. Let L = K(~'-d) be an extension o[ degree d, where a is in K . If there is a field F which is a Galois extension of k = K N F, and L = K F , then there is a fl in k such that aft is a dth power of an element of k. Furthermore, F = k(r Zippel exploited some lucky guesses. If I want an algorithm, I will need something somewhat more deterministic than that. I will need an algorithm to determine whether such a fl exists, and if so, h o w to find it. More precisely, given ~ in L, when is there a solution to aft = 3"d 3 Zippel's original statement omitted, but implicitly assumed, the hypothesis that F is a Galois extension of k. A corrected version appears in [9].
with fl in k, a proper subfield of k(~r~), and ~/in K, a proper subfield of L? It is not hard to s h o w that the following converse of Zippel's theorem holds: T H E O R E M 11 (Landau [9]). Let a be an element of a field K. Suppose that ~ is of degree d over K, and that ~ = ,k/ (r with ,k in K and fl in k C K. Assume that the dth roots of unity lie in k. Then the field F = k(~fi) satisfies: (i) F over k is Galois and the Galois group of F over F n K is isomorphic to the group of F K over K, (ii) F K = K(~/-a), and (iii) k = F N K. Thus, if I seek a "Zippel denesting," I am asking to find subfields of K(~/-a) satisfying Theorem 11. In [10], there is a polynomial-time algorithm to find maximal subfields of a field. One can use this for an algorithm for determining whether a Zippel denesting exists. The first step is to find all maximal subfields of L. It is a simple matter to check if a field extension is Galois. One finds a primitive element for the larger field, possibly by resorting to the well-known construction that k(% p) = k(7+cp) for s o m e c ~ (degk(7) degk(p)) 2. With a primitive element, say a, which has minimal polynomialp(x), one can compute the action of the Galois group by observing that the factorization
perfect square. That computation eventually leads to
-- 2(bq1+ d) (~/4 q(bq -4- d) 2 q- ~/(4q(bq + d)2) 3)
These two denesting formulas are the two s h o w n in [3] to be the only ways in which expressions involving square roots can be denested.
What About the General Case? Zippel's criteria are simple and elegant, but the conditions in Theorems 10 and 11 are sufficiently restrictive that they will not handle all cases. Trying to understand where the 9 came from in ~/~'2 - 1 = ~y179 - ~/27-9 + ~ 7 - 9 , I discovered the following subfields 4 of Q( ~x/-~-73 1):
0 ( 3 3 ~ - 1 ) = 0 (~(/2, ~'-9)
p(x) = (x - pl (a))(x - p2(a))... (x - pro(a)) gives the group table, since rYi(cej) = (pi(Pj(c~)) (mod
p(x)).
The denesting
Computing whether a candidate subfield satisfies parts (ii) and (iii) is even easier. It is just a matter of checking whether two fields are equal. This can be done by seeing whether the basis of one is contained in the other, and vice versa. Iterating the procedure in [10] will give an algorithm to find all subfields. Thus, there is an algorithm to determine if a "Zippel denesting" exists. It is not fast. There may be 2 4 fields between k and L, and in a worst case I would have to check each one of them. But this exponential-time algorithm can still be quite reasonable for small (< 10) values of d. Zippel used his theorem to shed some light on the calculations and theorems of Borodin, et al. [3]. Let a, b, and q be elements of a field k, and suppose I am hoping to denest v/-a + by@ Assume there is a fl in k such that
fl(a + bv/q ) = (ao + v'-q)2. N o w a 2 - qb2, the norm of a + by@ is a square, d 2. This leads to =
(a + d),
a0 =
4- d).
Choosing the positive sign, 4-
--5--'
where the sign depends on the sign of b. If a 2 - qb2 is not a square, then that means that x/-gT-G-v~ does not denest in a quadratic extension K = k(v,~ ), and I must look for a quartic extension in which it denests. I try looking for a fl such that fl(av,~ + bq) is a
@
~/7~-0- 19 = ~ / ~ led to a similar tower of fields: Q (V/7~/2-0 - 19) i Q (~/2-0, ~ - 8 )
In fact, such pictures arose for all the simplifications I had occasion to try. This was too beautiful to happen by accident. A natural place to search for a denesting is the smallest closed field in which the minimal polynomial of a factors c o m p l e t e l y - - the splitting field. The answer almost turned out to be that surprisingly simple and elegant. A minimal-depth expression for a nested radical can always be found in the splitting field, provided all roots of unity lie in the base field. More precisely: T H E O R E M 12 (Landau [81). Suppose a is a nested radical over k, where k is a field of characteristic 0 containing all roots of unity. Then there is a minimal-depth nesting of a with each of its terms lying in the splitting field of the minimal polynomial of a over k. 4 This d i a g r a m gives a partial e x p l a n a t i o n of w h e r e the 9 c o m e s from. A m o r e c o m p l e t e a n s w e r is that 9 d i v i d e s the d i s c r i m i n a n t of x 9 + x 6 -~ x 3 - 1, the m i n i m a l p o l y n o m i a l of ~ / ~ ' ~ - 1 o v e r Q. THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2, 1994 ~ 3
Proof. Consider the following diagram: normal closure of K / k = L
~
/
'
J
~
~
LH:k(a)~!G:G/N
/ K
G = Galois group of L / k /
Let a be a nested radical over k, and suppose that L is the splitting field of k(a) over k. Let L be the normal closure of K over k, where K is a field which contains a minimal-depth nested expression for a. If I let G be the Galois group of L over k, and G be the group of L over k, then I note that G = G//V, where 19 is the group of L over L. (Note t h a t / 9 is normal because L is normal over L.) As c~ can be denested over L, there is a sequence of subgroups/~1,...,/tl of G with G = H0 r>/~1 D ' ' " D /~rl, with H~//ti+l abelian for i = 0 , . . . , I - 1, and H D Hz. This sequence can be pulled d o w n to a sequence in G. If all roots of unity are in k, the tower defined by the groups can be m a d e into a tower of radical extensions, thus showing there is a minimal-depth denesting in L. 9 Of course, this does not solve the original problem, which was to denest over an arbitrary field. Can one a priori add certain roots of unity to the base field so that a minimal-depth nesting can be achieved? The answer is yes, so long as one is careful in handling roots of unity. All roots of unity can be expressed in terms of radicals. The problem is that the depth of nesting of a root of unity can be very deep indeed. In general, a pth root of unity has nesting depth one more than the m a x i m u m of the nesting depths of the prime factors of p - 1. Thus, if there are arbitrarily long sequences of primes p, 2p + 1,2(2p+ 1) + 1 , . . . - - a plausible, but unproved conjecture in number t h e o r y - - t h e n an nth root of unity can have nesting depth log n. For me the motivation for studying the denesting of radicals was to develop an algorithm for radical simplifications. In m a n y applications, writing a root of unity as ~,~ instead of the nested radical is a perfectly reasonable solution. This was the route taken here. But adding roots of unity to k does change the field in unexpected ways. By the Kronecker-Weber Theorem, every abelian extension over Q can be embedded in a cyclotomic extension. When I attempt to write ~/a in Q(~z) I m a y find that is an irrational number which is already in Q(~z). Such is the case for v ~ in the field Q(~5). Thus, v ~ will be represented as a polynomial in ~s, rather than the more usual expression v~. This type of simplification m a y drop us a single level of nesting. A more serious problem is that writing a root of unity as ~z in some sense masks it. There are subtle ways in which I pay for that. For example, ~ / v ~ - 5/2 = ~5 - 1/~5. Which symbol is easier to un54
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
derstand: ~ / v ~ - 5/2 or ~5 - 1/~5? That depends on the a p p l i c a t i o n - - or the mathematician. Taking these concerns into consideration, I find: THEOREM 13 (Landau [8]). Suppose a is a nested radical over k, where k is afield of characteristic O. Let L be the splitting field of k(a) over k, with Galois group G. Let l be the lcm of the exponents of the derived series of G, and let us write a primitive lth root of unity as ~l, and not simply as a nested radical. If there is a denesting of c~ such that each of the terms has depth no more than t, then there is a denesting of a over k(~l) with each of the terms having depth no more than t + 1 and lying in L(~l). I also have an alternative version of this result in which I achieve minimal depth at the expense of adjoining a primitive rth root of unity, where r is dependent on the presentation of the input. COROLLARY 14 [8]. Let k, c~, L, G, I, and t be as in Theorem 13. Let m be the lcm of the (mij ), where mij runs over all the roots appearing in the given nested expression for c~. Let r be the lcm of (m, l). Then there is a minimal-depth nesting of c~over k( ~r ) with each of its terms lying in L( ~ ). These theorems tell us that the splitting field is the right place to look. They also lead naturally to a n algorithm. If I wish to denest the nested radical a, I begin by computing the minimal polynomial of c~ over k. From that I construct the splitting field L of the minimal polynomial of a over k. Next I compute G = Gal(L/k). We have already seen h o w to do these computations. What is the shortest sequence of nested radicals that will give c~? It will come from the shortest sequence of groups in the Galois group, the series of commutator subgroups DiG, i = 1 , . . . , s, where DSG = {e}. Good algorithms for group computations have existed for over 15 years. Having a group table, or an equivalent, for G, it is not hard to compute the commutator series of the group. Next I also compute 1, the lcm of the exponents of the derived series of G. For each i,i = 1 , . . . , s, I compute D i - I G / D i G =dil x . . . x ditl as a direct product of cyclic groups. Let,]ij = {e} x . . . x {e} x dij x {e} x . . . x {e}, ~ and let Li = L D*c. Thus, for each i, Li = Liyi' ' ' ' Li&t~ is a composite of cyclic extensions of L~-I. For each i and d, I compute flij such that L~*~ = Li-l(flij). Thus, Li = L i - , (fli,,. 99 3~t~).
I write K0 = k(~), where r is a primitive lth root of unity. Then Ki; = Ki-l(flij) can be written as a radical extension of Ki-1, and each Ki = K i l . . . Ku, is a composite of radical extensions of Ki-1. The crux of the matter is how to write these extensions as radical ones, that is, Ki = Ki-1 ( d ~ ) . This is achieved as follows. Following Artin, I construct a polynomial s~; (x) whose roots O~jl,..., Oij~j form a "normal" basis for Kij over Ki-1. The degree of 8ij(X) is rij = [Kij : Ki-l], and its roots are linearly independent over Ki-1. Then I use Lagrange resolvents to find a flij in Kij such that K i j = K i - l ( f l i j ) , where fiij satisfies an irreducible polynomial of the form xn*J - bij over Ki_~. That each of the extensions can be obtained as radical extensions stems from the fact that the appropriate roots of unity lie in the base field. This is just a brief sketch of the algorithm, details of which can be found in [8]. But the point should be clear: There is an algorithm for simplifying nested radicals, assuming one allows roots of unity to creep into the expression. H o w long does this take? Too long! If a is of degree n over k, its Galois group m a y be of size n! Even groups which are exponentially large (S,~, An, etc.) have a small set of generators, but from a computational standpoint, that does not seem to help. No one knows h o w to determine the generators of a Galois group of a general polynomial without first determining its splitting field. In general, computing the splitting field (that is, finding a minimal polynomial for a generator over the base field) is an exponential-time computation. Allowing roots of unity written in shorthand, Theorem 12 limits where I have to search if I am seeking a denesting. Nonetheless, except for radicals with small degrees, the computation is presently infeasible. Until there are improvements in algorithms for splitting field and Galois group computations, the algorithms based on Theorems 12, 13, and Corollary 14 are useful only for nested radicals of small degree.
Problem 1: Find a polynomial-time algorithm to compute the Galois group of an irreducible polynomial over Q. There is an improvement one can make to Corollary 14. Horng and H u a n g [7] have shown:
THEOREM 15 [7]. Let k, a, L, G, 1, and t be as in Theorem 13. Let n be a natural number which is divisible by [L : k] and the discriminant of L over Q. Then there is a minimal-depth nesting of a over k( ~n ) with each of its terms lying in L( ~n ). In finding a root of unity to achieve minimal-depth nesting for a, they eliminate the need for including anything that relies on the presentation of a, as in Corollary 14. However, they do so at the expense of introducing a dth root of unity, where the discriminant of L over Q divides d. This discriminant is of exponential size in a.
Their algorithm for denesting follows the algorithm described earlier. The problem of simplification of nested radicals is far from completely solved. Because of B16mer we now can efficiently denest depth-2 real radicals. We do not have efficient algorithms for depth 3 or greater real nested radicals which achieve minimal-depth nestings. Thus, the following questions remain:
Problem 2: Without a special encoding for roots of unity, given a nested radical, determine whether there is another nested radical of the same value, with lower nesting depth. Problem 3: Find such a lower nesting-depth radical. Problem 4: Given a real nested radical, determine whether there is another nested radical of the same value, of lower nesting depth. Acknowledgments I would like to thank Neil Immerman, Hendrik Lenstra, Carl Pomerance, Martin Tompa, and Ann Trenk for their help.
References 1. E. Artin, Galois Theory, University of Notre Dame Press, Notre Dame, IN, 1942. 2. A. Besicovitch, On the linear independence of fractional powers of integers, J. London Math. Soc. 15 (1940), 3-6. 3. A. Borodin, R. Fagin, J. Hopcroft, and M. Tompa, Decreasing the nesting depth of expressions involving square roots, J. Symb. Comput. 1 (1985), 169-188. 4. J. B16mer, Computing sums of radicals in polynomial time, Proceedings of the 32nd Annual IEEE Symposium on Foundations of Computer Science, 1991, pp. 670--677. 5. J. B16mer,How to denest Ramanujan's nested radicals, Proceedings of the 33rd Annual IEEE Symposium on Foundations of Computer Science, 1992, pp. 447-456. 6. B. Caviness and R. Fateman, Simplification of radical expressions, Proc. SYMSAC 77, pp. 329-338. 7. G. Horng and M. Huang, On simplifying nested radicals and solving polynomials by pure nested radicals of minimum depth, Proceedings of the 31st Annual IEEE Symposium on Foundations of Computer Science, 1990, pp. 847-854. 8. S. Landau, Simplification of nested radicals, SIAM J. Cornput. 21 (1992), 85-109. 9. S. Landau, A Note on Zippel denesting, J. Symb. Comput. 13 (1992), 41-45. 10. S. Landau and G. Miller, Solvability by radicals is in polynomial time, J. Comput. Syst. Sci. 30 (1985), 179-208. 11. S. Ramanujan, Problems and Solutions, Collected Works of S. Ramanujan, Cambridge University Press, Cambridge, 1927. 12. C. Siegel, Algebraische Abh/ingigkeit von"Wurzeln, Acta Arithmetica 21 (1971), 59--64. 13. R. Zippel, Simplification of expressions involving radicals, J. Syrnb. Comput. 1 (1985), 189-210.
Computer Science Department University of Massachusetts Amherst, MA 01003 USA
[email protected] THEMATHEMATICAL INTELLIGENCER VOL.16,NO.2,1994 55
Ian Stewart* The catapult that Archimedes built, the gambling-houses that Descartes frequented in his dissolute youth, the field where Galois fought his duel, the bridge where Hamilton carved quaternions--not all of these monuments to mathematical history survive today, but the mathematician on vacation can still find many reminders of our subject's glorious and inglorious past: statues, plaques, graves, the cafd where the famous conjecture was made, the desk where the famous ini-
tials are scratched, birthplaces, houses, memorials. Does your hometown have a mathematical tourist attraction? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks. Please send all submissions to the Mathematical Tourist Editor, Ian Stewart.
Laplace in Calvados Andreas M. Hinz
Leaving the prestigeous C6te Fleuri (Flowery Coast) of Normandy, France, halfway between Cabourg and Deauville, at Villers-sur-Mer, the road Dl18 enters the interior of a region called Auge in the heart of the department of Calvados. This is the origin of three of the most famous French cheeses: Camembert, Livarot, and Pont-l'Ev~que-- not to forget the spirit of Calvados!
Two-thirds of the way to Pont-l'I~v~que, the mathematical tourist reaches the charming village of Beaumonten-Auge. Here Pierre-Simon Laplace passed his first 16 years, before leaving for Caen, the capital of Calvados, and finally Paris. An impressive monument on the Place de Verdun (Fig. 1) shows the father of celestial mechanics leaning on a celestial globe, supported by Atlas, measuring the stars with a compass in his hand. The books at his feet are those of the grandfathers Galilei and Newton, who are represented in medallions on both sides of the base, the center of which has the golden inscription LAPLACE 1749-1827.
Figure 1. The Laplace m o n u m e n t in Beaumont-en-Auge. *Column Editor's address: Mathematics Institute, University of Warwick,Coventry,CV4 7AL England. 56
Figure 2. Inscription on the m o n u m e n t .
THE MATHEMATICALINTELLIGENCER VOL. 16, NO. 2 (~)1994 Springer-Verlag New York
The monument was erected by international subscription with the suFport of Carnegie Foundations for Peace and the Development of Sciences and carries a dedication by John Hemming Fry (Fig. 2) which translates as: "We must erect monuments for astronomers, for artists, for poets who, detached from the useful and the immediate, reveal the ideal and the charm of unknown things." The bronze statue was created by Robert Delandre and inaugurated July 3,1932, by the French marshal Franchet d'Esperey. Behind the monument, the Place de Verdun offers a splendid view over the country, and it is said that the young Pierre-Simon, who was a pupil of the coll~ge (high school) run by Benedictines, observed the movements of the heavenly bodies from this point. At the southwest corner of the same place, a building (Fig. 3) carries an ornament created by Follebarbe and put in place on October 18, 1835. The marble trophy shows attributes of astronomy as well as two inscriptions. The first one is a poem of Charles Lioult de Saint-Martindon, called Ch~nedoll6: SOUS UN MODESTE TOIT, ICI NAQUIT LAPLACE, LUI QUI SUT DE NEWTON AGRANDIR LE COMPAS, ET S'OUVRANT UN SILLON DANS LES CHAMPS DE UESPACE, Y FIT ENCORE UN NOUVEAU PAS. (Under a modest roof, here was born Laplace, He who was able to widen Newton's compass, And in opening himself a furrow in the fields of space, Advanced there another pace.) The second table reads The commune of Beaumont and the department of Calvados, to the memory of Laplace, born in Beaumont the 22nd [sic] of March 1749, died in Paris the 5th of March 1827.
Figure 3. The site of Laplace's birth.
The building is supposed to stand on the site of his grandmother's house, where Laplace was born. However, local tradition has it that his birth took place outside Beaumont on the family's farm (Fig. 4), which may have served as a relay station at the time. It is called the Ferme du Merisier (Farm of the Wild Cherry Tree) and can be reached by leaving Beaumont to the south on road D58, turning westward on N175, where it lies on the right side after about I km. (I must use the metric system throughout as Laplace was a member of the
Figure 4. The farm of the Laplace family. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
57
presides majestically over the rest. To reinforce this aristocratic impression, the gravestone presents only the coat of arms. A truncated obelisk, topped with an urn bearing the name LAPLACE, shows inscriptions on all sides. The front lists all the honorary titles of Pierre-Simon Marquis de Laplace, Pair de France (Peer of France): Grand Croix de l'Ordre Royal de la L6gion d'Honneur (Grand Cross of the Royal Legion of Honour) and memberships of the Acad6mie Fran~aise, the French Academy of Sciences, the Bureau des Longitudes, the Royal Societies of London and G6ttingen, the Academies of Sciences of Russia, Denmark, Sweden, Prussia, The Netherlands, Italy, Boston, etc. As the date of birth the inscription gives the 29th of March 1749. The side panels indicate the main scientific works of Laplace, namely, MECANIQUE C]~LESTE(celestial mechanics) SYSTI~MEDU MONDE (world system) PROBABILITES (probability)
Figure 5. Portrait of Laplace by Krug.
Committee on Weights and Measures which introduced the m e t e r - - y e s , it is that long ago!) Laplace's famous last words Ce que nous connaissons est peu de chose; ce que nous ignorons est immense. (What we do know is little; what we do not know is immense.) apply in particular to his family background. Biographers even say that, ashamed for its modesty, Laplace tried to conceal the truth, for instance his birth in a farmhouse. However, the custom of the time of giving birth in the grandmother's home and the cold season point to the house in Beaumont. There in the townhall (Mairie with the inscription 1880 LEGS LAPLACE) a room of the museum (open in the afternoon from W e d n e s d a y t o Sunday) contains a bust of Laplace, a portrait by Edouard Krug (Fig. 5), some volumes of his scientific work, and a few copies of manuscripts. As there are two birthplaces, there must be two graves (maybe there were two Laplaces?). One of these is in the cemetery of Beaumont, which lies at the end of a small road running southward from the townhall. On top, at the western end of the main axis, a white tomb (Fig. 6) 58
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2,1994
Figure 6. Tomb in Beaumont.
The verso of the stone (Fig. 7) shows some heavenly bodies. The impressive list of honors reflects two sides of Laplace's personality: His political character was not as firm as his scientific achievements--to say the least. (In today's Germany we would call him a Wendehals.) His life and work are summarized in volume XV (supplement I) of the Dictionary of Scientific Biography (C. C. Gillispie, Editor-in-chief; Scribner's Sons, New York 1978, pp. 273403). According to this account, the grave just described is empty (or does at least not contain the remains of Laplace). It was moved to Beaumont in 1878 from the famous Paris cemetery of P6re Lachaise, where Laplace was originally buried, when the family, now the counts of Colbert-Laplace, wanted to have their illustrious ancestor within their estate near St. Julien-de-Mailloc. To find Laplace's very last resting place, take the road D519 from Lisieux, the capital of Auge and about 20 km southeast of Beaumont, to Orbec, again to the southeast.
Figure 8. Last resting place.
Halfway between the two towns, just after passing St. Denis-de-Mailloc and to the left of the road, there is a chapel, the Chapelle de Mailloc, with a bus stop. Leave the car (or the bus or bicycle) there, climb over the rusty gate to the left of the chapel, and follow the pathway uphill along a corn field for about 100 m. You will find to your right a group of trees hiding a little temple (Fig. 8). Here is where Laplace rests. The roof covers a bronze bust of the genius, and the inscription is sympathetically simple: PIERRE SIMON LAPLACE MECANIQUE CELESTE SYSTEME DU MONDE THEORIE ANALYTIQUEDES PROBABILITES
Acknowledgments I want to thank M meMagdeleine Duprez (Deauville) and M me Dominique Sebban (Beaumont-en-Auge) for their valuable suggestions. Thanks also to the keeper of the restaurant La Forge in St. Denis-de-Mailloc for pointing out the precise location of Laplace's grave.
Figure 7. Detail of the obelisk.
MathematischesInstitut Universitdt Mfinchen D-80333 Mfinchen, Germany e-mail
[email protected] THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994 5 9
Symmetry of Opposites: Antisymmetry Istv n Hargittai and Magdolna Hargittai
Symmetry has long been identified with the properties of geometrical figures. The Russian crystallographer E. S. Fedorov, for example, gave it the following definition (quoted in [1]): "Symmetry is the property of geometrical figures of repeating their parts, or more precisely, their property of coinciding with their original position when in different positions." According to the Canadian geometer H. S. M. Coxeter [2], "when we say that a figure is 'symmetrical' we mean that there is a congruent transformation which leaves it unchanged as a whole, merely permuting its component elements." However, it has also been recognized for a long time that symmetry, as observed in real nature, cannot be reduced entirely to this geometrical symmetry. K. Mislow and P. Bickart [3] observed in their epistemological note on chirality that "when one deals with natural phenomena, one enters 'a stage in logic in which we recognize the utility of imprecision.'" Material symmetry, devoid of the rigor of geometrical symmetry, has been viewed as applicable to material objects as well as abstractions with limitless implications [4]. Symmetry also connotes harmony of proportions, a rather vague notion according to Weyl [5]. Human ability to geometrize nongeometrical phenomena helps us see symmetry even in its "vague" and "fuzzy" variations [6, 7]. Thus Weyl [5] said Diirer "considered his canon of the human figure more as a standard from which to deviate than as a standard toward which to strive." The vagueness and fuzziness of the broader interpretation of symmetry allow us to talk about degrees of symmetry. There must be a range of criteria, which may change from problem to problem, and may very well change in time as well. Today, Science is turning to the examination of the less orderly systems, yet symmetry considerations are not losing importance. On the contrary, their applications are gaining depth as well as breadth. 60
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2 ~)1994 Springer-Verlag New York
@
(a)
ii
n
i
II
i
Ii
Ii
ill
a
in
n
|
i
m
i
n
n
i
(a)
(b)
Figure 3. Antireflections: (a) Hungarian batik design; (b) logo of Tungsram Works.
(b)
Figure 1. (a) Identity operation and (b) antiidentity operation [8].
2
~
6
z
6
4m
6,m
L 1
2
3
/,,
Figure 2. Reflections (1/2 and 3/4) and antireflections (1/4 and 2/3) [8).
Chemistry [8], for example, is a science where the symmetry concept has played an increasing role, and not only in such areas as spectroscopy and crystallography but more recently even in such a seemingly nonexact field as organic synthesis. The so-called antisymmetry has become a seminal consideration in m o d e r n chemistry, for example, in the description of atomic and molecular orbitals of electronic structure and its changes and interactions during chemical reactions, and in the description of molecular vibrations. "Operations of antisymmetry transform objects possessing two possible values of a given property from one value to the other" [9]. The simplest antisymmetry operation is color change. Let us first consider an identity operation and an antiidentity operation in Figure 1. Move on then to antireflection in Figure 2, and a few further examples of this in Figure 3. Of course, geometrical symmetries are not restricted to reflection, and Figure 4 presents examples combining color change with both reflection and rotation, after Shubnikov [1].
2m
l,m
3.m
Figure 4. Antisymmetry operations after Shubnikov [1]. 2_,4_, 6_,antirotation axes; 2_,4_,6_,antireflection-rotation axes; _2. m, _4. m, 6. m, antirotation axes combined with ordinary reflection planes; 1. ~ 3. m__,ordinary rotation axes combined with antireflection planes. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
61
Figure 5. (left) Picture by Victor Vasarely (courtesy of the artist). (above) Picture by Victor Vasarely (courtesy of the artist). (left) Vasarely-like car decoration (photograph by the authors).
Figure 6. Logo of sporting goods store in Boston, Massachusetts (photograph by the authors). 62
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
Figure 5 presents some op-art patterns. The first Vasarely picture, at least in this black-and-white version, illustrates simple color change between the upper and lower parts of the figure. The second Vasarely picture and the decorated car involve a change in the shape of the motifs in addition to the color change. Here we have alternative properties, color and shape, either of which can be changed into its opposite. We are also moving away from rigorous geometrical symmetry, and moving toward a wider application of the antisymmetry concept. The presence of a property turning into its opposite becomes the dominating effect; symmetry elements, such as reflection or rotation, may or may not accompany it. Figure 6 shows the logo of a sporting goods store in Boston, Massachusetts. The antireflection plane relates winter and summer. Obviously, this store caters to both winter sports and summer sports fans. The color change in the self-serve/full-serve sign attracts attention in Figure 7, but the concepts may also be considered to have an antisymmetrical relationship. The Perestroika poster of Figure 8 displays color change only, and the implication is ironic: Forces against reform would like to reduce the significance of Perestroika to mere color changes.
Figure 7. Self-serve versus full-serve gas station in Oahu, Hawaii (photograph by the authors).
Let us interrupt our visual examples for two literary examples. The first refers to some antisymmetrical geographical relationships between, say, Western Europe and New Zealand. These locations can be connected by a straight diameter of the Earth going through its center. The noted American journalist James Reston [10] writes in his "Letter from Wellington. Search for the End of the Rainbow": "... Nothing is quite the same here. Summer is from December to March. It is warmer in the North Island and colder in the South Island. The people drive on the left rather than on the right. Even the sky is different-clark blue velvet with stars of the Southern C r o s s - - and the fish love hooks." (He might have added, cyclones go clockwise there, as does water draining from a sink.) The other example is taken from the Hungarian writer of the 1930s, Frigyes Karinthy, from a short story "Two diagnoses" [11]. The same person, Dr. Same, goes to see a physician at two different places on two different occasions. At the recruiting station he would obviously like to avoid getting drafted, whereas at the insurance company he would like to acquire the best possible terms for his policy. His answers to the identical questions of the physicians are related by antisymmetry. DR. SAME
Figure 8. "This is perestroika to some." An award-winning Soviet poster from 1987.
Figure 9. Belgian holiday ad in Flemish and French from 1983.
Returning now to visual examples, Figure 9 shows a Belgian travel ad, and the changing property is the language, Flemish/French. The horizontal antireflection is
PHYSICIAN
DR. SAME
At the recruiting station
At the insurance company
Broken-looking, sad, ruined human wreckage,feeble masculinity, haggard eyes, unsteady movement.
Young athlete with straightened back,flashing eyes.
H o w old are you?
Coyly, Oh, my gosh, I'm almost ashamed of it... I'm so silly...
Old... very old, indeed.
Your I.D. says you're thirty two.
With pain. To be old is not to be far from the cradle-but near the coffin.
To be yofing is not to be near the cradle, but far from the coffin.
Are you ever dizzy? Don't mention dizziness, please, Doctor, or else I'll collapse at once. I always have to walk in the middle of the street, because if I look down from the curb, I become dizzy at once.
Quite often, sorry to say. Every time I'm aboard an airplane and it's upside-down, and breaking to pieces. Otherwise, not...
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
63
TESSi:KVALASZTANI
FIATAL DEMOKRAT,~K SZOVETSEGE Figure 10. Election poster by the (Hungarian) Alliance of Young Democrats (FIDESz), 1990. Upper half: Brezhnev and Honecker. Text in the middle: "Please, make your choice."
Figure 11. Viennese dancing school ad (photograph by the authors). 64
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
very approximate in Figure 10 on the election poster by the Alliance of Young Democrats at the time of the 1990 Hungarian elections. The Viennese dancing school ad (Figure 11) relates an elephant's legs and a girl's by antisymmetry, obviously for the ability to dance. Figure 12 shows two military jets and a sea gull off Bodo, Norway, a military base, and they may imply the polarity of war and peace. A few examples of translational antisymmetry are shown next. Apparently, the first systematic discussion of the 46 two-color two-dimensional patterns was communicated by H. J. Woods in 1936, in a work recently saved from oblivion by D. W. Crowe [12]. Woods pointedly called these two-color patterns '~ounterchange" patterns. The first two of our illustrations (Fig. 13) are by a Canadian crystallographer, F. Brisse [13]. In one, the polar bear is subjected to a twofold rotational antisymmetry and then translation in two directions. In the other, the two-dimensional space group of the pattern, disregarding color change, would be p4gm. This pattern has already been used by G. P61ya [14] among his representations of the 17 two-dimensional space groups. It may also be found as a typical decoration in Islamic geometrical patterns [15]. However, in Brisse's pattern there is a two-color change during a complete revolution. There is then translation in two directions. Further simple color changes are involved in the next two figures. M. C. Escher's famous "Dogs" [16] is an excellent illustration of closest packing (Fig. 14). The color change is combined with glide lines. Reflection is also involved in generating Kh. Mamedov's 'Girls" in Figure 15 [17]. The Azerbaidzhani crystallographer's other drawing "Unity" (Fig. 16) once again combines geometrical symmetries with a conceptual opposition: young versus old.
Figure 12. Military jets and a sea gull, off Bodo, Norway, 1981 (photograph by the authors).
Figure 15. Kh. Mamedov: "Girls" (From Ref. 17, courtesy of Professor Mamedov). Figure 13. E Brisse: (a) "Northwest Territories"; (b) "Canada" (From Ref. 13, reproduced by permission).
Figure 14. M. C. Escher: "Dogs" (From Ref. 16, reproduced by courtesy of the International Union of Crystallography).
Figure 16. Kh. Mamedov: "Unity" (From Ref. 17, courtesy of Professor Mamedov). THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2,1994 6 5
Figure 17. Symmetric (a) and antisymmetric (b) consequences of reflection (drawing courtesy of architect G. Doczi [8]). The symmetric and antisymmetric consequences of reflection for two m o v e m e n t s are illustrated in Figure 17. Suppose we walk alongside a long wall of mirror (Fig. 17a). Our mirror image will be walking with us; the two velocities will be the same. N o w walk from a distance toward the mirror, perpendicular to it (Fig. 17b). In this case, our mirror image will have a different velocity from ours. The speed will be the same again, but the direction will be the opposite. If we don't stop in time, we shall collide. We conclude our discussion b y mentioning A. Koestler's concept of bisociation. According to Koestler [18], the connection in thought association is m a d e between thoughts on the same plane, whereas bisociation refers to connection of thoughts from different planes. Thus, bisociation m a y be considered to be the antisymmetric partner of thought association. Let us just quote one example from Koestler: "The Prince, travelling through his domain, noticed a man in the cheering crowd who bore a striking resemblance to himself. He beckoned him over and asked: 'Was your mother ever employed in my palace?' 'No Sire,' - - t h e man r e p l i e d . - 'But my father was.'"
References 1. A. V. Shubnikov, Simmetriya i antisimmetriya konechnikh figur, Izv. Akad. Nauk SSSR, Moscow, 1951. English translation: A. V. Shubnikov, N. V. Belov et al., Colored Symme-
try: A Series of Publications from the Institute of Crystallography, Academy of Sciences of the U.S.S.R., Moscow, 1951-1958, W. T. Holster, ed., Pergamon Press, New York, 1964. 2. H. S. M. Coxeter, Regular Polytopes, 3rd ed., Dover, New York, 1973. 3. K. Mislow and P. Bickart, Israel ]. Chem. 15 (1976/77), 1. 66
v~E MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
4. I. Hargittai, Limits of perfection, Computers Math. Appl. 12B (1986), 1-17; also in Symmetry: Unifying Human Understanding, I. Hargittai, ed., Pergamon Press, New York, 1986, pp. 1-17. 5. H. Weyl, Symmetry, Princeton University Press, Princeton, NJ, 1952. 6. I. Hargittai, The joy of symmetry, Computers Math. Appl. 17 (1989), 1067-1072; also in Symmetry 2: Unifying Human Understanding, I. Hargittai, ed., Pergamon Press, Oxford, 1989, pp. 1067-72. 7. I. Hargittai, Real turned ideal through symmetry, Symmetrie in Geistes- und Naturwissenschafl, R. Wille, ed., Springer, Berlin, 1988, pp. 131-161. 8. I. Hargittai and M. Hargittai, Symmetry through the Eyes of a Chemist, VCH, Weinheim and New York, 1986. 9. A. L. Mackay, Acta Crystallogr. 10 (1957), 543. 10. J. Reston, Letter from Wellington. Search for the end of the rainbow, International Herald Tribune, 7 May, 1981, p. 4. 11. E Karinthy, Two diagnoses, Selected Works, Szepirodalmi, Budapest, 1962 (in Hungarian). 12. D. W. Crowe, The mosaic patterns of H. J. Woods, Computers Math. Appl. 12B (1986), 407-411; also in Symmetry: Unifying Human Understanding, I. Hargittai, ed., Pergamon Press, New York, 1986, pp. 407-411. 13. E Brisse, La sym6trie bidimensionnelle et le Canada, Can. Mineral. 19 (1981), 217-224. 14. G. P61ya, Uber die Analogie der Kristallsymmetrie in der Ebene. Z. Krist. 60 (1924), 278-282. 15. I. El-Said and A. Parman, Geometric Concepts in Islamic Art, World of Islam Festival Publ. Co., London, 1976. 16. C. H. Macgillavry, Symmetry Aspects of M. C. Escher's Periodic Drawings, Bohn, Scheltema & Holkema, Utrecht, 1976. 17. Kh. Mamedov, I. R. Amiraslanov, G. N. Nadzhafov, and A. A. Muzhaliev, Decorations Remember, Azerneshr, Baku, 1981 (in Azerbaidzhani). 18. A. Koestler, The Act of Creation, Macmillan, New York, 1964. Budapest Technical University Szt. Gelldrt ter 4 1-1-1521 Budapest, Hungary Hungarian Academy of Sciences H-1431 Budapest, Pf. 117 Hungary
Gauss, Eisenstein, and the "Third" Proof of the Quadratic Reciprocity Theorem: Ein kleines Schauspiel Reinhard C. Laubenbacher and David J. Pengelley
The year: 1844 The characters: Carl Friedrich Gauss, Director of the G6ttingen Observatory, and the m o s t p r o m i n e n t m a t h e m a t i c i a n of the 19th century (possibly of all time). Gotthold Eisenstein, 21-year-old m a t h e m a t i c s student at the Friedrich-Wilhelms University in Berlin.
Eisenstein: "Already early in my youth I was attracted by the beauty of a subject which differs from other subjects not only in its content but, most importantly, in the nature and variety of its methods. In it, it is not enough just to lay out the consequences of a single idea in a long sequence of deductions; almost each step requires one to conquer new difficulties and apply new principles. A little over fifty years ago, number theory consisted only of a collection of isolated facts, unknown to most mathematicians, and practiced only occasionally by a few, even though Euler already found in it leisure from his other activities. It was through Gauss and some of his successors that number theory has reached such heights that now it is not inferior to any other mathematical discipline in depth and breadth, and has had a fruitful influence on many of them. A school has arisen which counts the most eminent mathematical talents among its disciples, and which I too proudly am a part of, if only one of its lowliest." [5]
Eisenstein (continuing): You, H e r r Direktor, h a v e described so eloquently the strange attractions of this science, a n d h a v e given several proofs of its F u n d a m e n t a l Theorem. Gauss: "The questions of higher arithmetic often present a remarkable characteristic which seldom appears in more general analysis, and increases the beauty of the former subject. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2 (D1994 Sprlnger-Verlag New York
67
to confirm the assertions of the preceding paragraph. I discovered this theorem independently in 1795 at a time when I was totally ignorant of what had been achieved in higher arithmetic, and consequently had not the slightest aid from the literature on the subject. For a whole year this theorem tormented me and absorbed my greatest efforts until at last I obtained a proof given in the fourth section of the abovementioned work. Later I ran across three other proofs which were built on entirely different principles. One of these I have already given in the fifth section, the others, which do not compare with it in elegance, I have reserved for future publication. Although these proofs leave nothing to be desired as regards rigor, they are derived from sources much too remote, except perhaps the first, which however proceeds with laborious arguments and is overloaded with extended operations. I do not hesitate to say that until now a natural proof has not been produced. I leave it to the authorities to judge whether the following proof which I have recently been fortunate enough to discover deserves this description." [8, 11]
Gauss (continuing): The Q u a d r a t i c Reciprocity Theor e m c o m p a r e s the quadratic character of two p r i m e s w i t h respect to each other. The quadratic character of q with respect to p is expressed b y the Legendre s y m b o l (~), defined to be 1 if q is a quadratic residue (i.e., a square) m o d u l o p, and - 1 if not. Q U A D R A T I C R E C I P R O C I T Y T H E O R E M . If I9 and q
are distinct odd primes, then =
[-1)'~--"
2
.
Gotthold Eisenstein
We also have the Erga'nzungssatz While analytic investigations lead to the discovery of new truths only after the fundamental principles of the subject (which to a certain degree open the way to these truths) have been completely mastered; on the contrary in arithmetic the most elegant theorems frequently arise experimentally as the result of a more or less unexpected stroke of good fortune, while their proofs lie so deeply embedded in the darkness that they elude all attempts and defeat the sharpest inquiries. Further, the connection between arithmetical truths which at first glance seem of widely different nature, is so close that one not infrequently has the good fortune to find a proof (in an entirely unexpected way and by means of quite another inquiry) of a truth which one greatly desired and sought in vain in spite of much effort. These truths are frequently of such a nature that they may be arrived at by many distinct paths and that the first paths to be discovered are not always the shortest. It is therefore a great pleasure after one has fruitlessly pondered over a truth and has later been able to prove it in a round-about way to find at last the simplest and most natural way to its proof. The theorem which we have called in sec. 4 of the Disquisitiones Arithmeticae the Fundamental Theorem, because it contains in itself all the theory of quadratic residues, holds a prominent position among the questions of which we have spoken in the preceding paragraph. We must consider Legendre as the discoverer of this very elegant theorem, although special cases of it had previously been discovered by the celebrated geometers Euler and Lagrange. I will not pause here to enumerate the attempts of these men to furnish a proof; those who are interested may read the above mentioned work. An account of my own trials will suffice 68
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
(~)
= ( - 1 ) ~2;1
for the prime 2. N o t e that the theorem says that p and q h a v e the s a m e quadratic character with respect to each other if either one of t h e m is = 1 (rood 4), but not if b o t h are =_ 3 ( m o d 4). W h y is this theorem so i m p o r t a n t ? First, it can be u s e d to solve the general p r o b l e m of w h e n a quadratic congruence has a solution, because the F u n d a m e n t a l Theorem, along w i t h the multiplicativity of the Legendre s y m b o l ( ~ ) = (~)(F), a b enables one easily to c o m p u t e its values, a n d thus to k n o w exactly w h e n square roots m a y be f o u n d m o d u l o p. But it also represents an a m a z i n g a n d u n e x p e c t e d relationship b e t w e e n pairs of primes, a d e e p l a w g o v e r n i n g p r i m e n u m b e r s . Later in the century, B a u m g a r t [2] will describe the role of the F u n d a m e n t a l T h e o r e m within higher arithmetic: "The higher arithmetic in essence divides into two main parts, the theory of congruences and the theory of homogeneous forms. The theory of binomial congruences forms an integrating part of the general theory of congruences. 'The reciprocity laws are the cornerstone of the latter theory' [Kummer, Berliner Abhandlungen, 1859, S. 19]."
In m y lifetime I will, in fact, present eight different proofs of the Fundamental Theorem, and other mathematicians will find m a n y more. But m y third published proof, which I n o w present, is perhaps the most natural one I know. M y proof begins with a theorem which in the future might be called
The right-hand product evidently becomes, modulo p, = (-1)'~aa'a ' ' . . . bb'b"... = (-1)~q - 2q. 3q..- ~ q
(-1)~q ~7! 91 . 2 . 3
1
P 2
Hence GAUSS'S LEMMA. divisible by p. Then
(q)
that is, q 2 ~ +1 according as c~is even or odd. Hence our theorem follows at once [from Euler's Criterion that (~) qr@].,
= ( - 1 ) '~,
with c~ obtained as follows: Let
"" T
B
=
{p+l 2
'
p+3 2 ''"'P-
and
1} '
Then c~ is defined to be the number of least positive residues modulo p of the set q.A which lie in B.
"Proof: Let a, a', a " , . . , be the residues belonging to the class ~A and b, b', b ' , . . , be those belonging to B. Then it is clear that the complements of the latter, p - b, p - b', p - b ' , . . . are not equal to any of the numbers a, a', a " , . . . [for if, say, qx = a = p - b = p - qy where x, ~/come from .A, then q(x + y) would be divisible by p, which cannot occur since both x and y lie strictly between 0 and p/2], and together with them make up the class .A. Consequently, we have 1 . 2 . 3 . - " p -21
1 = ( - 1 ) ~ q e~A,
Let p be prime, and q any number not
=a.a'
-a"- . . ( p - b ) ( p - b ' ) ( p - b " ) . . . .
Eisenstein: Hochgeehrtester H e r r Hofrath, I too have devoted m u c h effort to the s t u d y of the quadratic reciprocity law and have given four different proofs of it. My recently published geometric proof [71 is related to y o u r third proof and, with all due modesty, represents something of an i m p r o v e m e n t in the exposition of the main ideas underlying it. As I wrote to m y friend Moritz Stern in G6ttingen, "I did not rest until I freed my geometric proof, which delighted you so much, and which also, incidentally, particularly pleased Jacobi, from the Lemma [of Gaussl on which it still depended, and it is now so simple that it can be communicated in a couple of lines. The main difference between my argument and that of Gauss is that I do not divide the numbers less than p into those less than p/2 and those greater than p/2, but rather into even and odd ones." [9] Instead, I begin m y proof as follows, with what m a y some day be called Eisenstein's Lemma. Consider the set a = 2, 4, 6 , . . . ,p - 1. Let r denote the remainder (mod p) of an arbitrary multiple qa. Then it is apparent that the list of n u m b e r s ( - 1 ) r r agrees with the list of numbers a, u p to multiples of p. (For, clearly, each of the numbers ( - 1 ) % has even least positive residue, and if there were duplication a m o n g these residues, e.g., !
!
( - 1 ) q a q a =_ ( - 1 ) qa q a ,
then a = +a'. As the a's are distinct, it follows that a + a ~ - 0, which cannot occur because 0 < a + a ~ < 2p and a + a t is even.) Thus, qa~Ha II a
-
Hr
(modp)
_= ( - 1 ) zr H r
and (modp),
from which it follows that qaq_~ _ ( _ l ) Z r (mod p). Recalling Euler's Criterion that (~) = qG@ (mod p), this produces (q)
Carl Friedrich Gauss
= ( - 1 ) zr,
(1)
so one m a y focus solely on the parity of the exponent. This formula is m y replacement for y o u r Lemma. Because the exponent is algebraic in nature, m y proof will n o w proceed more easily than yours. Incidentally, the THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
69
odd terms actually contributing to the parity of this exponent correspond exactly to the ones counted in your Lemma.
F
Gauss: Well, we shall see about that, Herr Student. I will
begin the main part of my proof by deriving a few technical facts about the greatest integer function [ ] which are needed subsequently. Let x be a nonintegral quantity. 1. [x] + [b - x] = b - 1, whenever b is an integer. 2. If x - [x] is a fraction less than 1, then [2x] - 2[x] = 0. If, on the other hand, x - Ix] is greater than 89 then [2x] - 2[x] -- 1. Thus: 3. If the smallest positive residue of b (mod p) is less than p/2, then [2b/p] - 2[b/p] = 0. If, however, it is larger than p/2, then [2b/p] - 2[b/p] = 1. 4. From this it follows that a = I ~ ] + E-~I + [ ~ 1 + "'" + [(P pl)~q ]
- 2 Eq] - 2 [ ~ 1 - 2
E~] . . . . .
2
[[(p-~)/2]q].
q/2 L
(2)
mula (1) in my Lemma already produces the essence of this more quickly and transparently. Clearly,
A
r"
(3)
first, when p is of the form 4n + ~--
Gauss: Well, Herr Eisenstein, that was indeed a clever
shortcut to formula (3), leading now to the following necessary calculations. I transform formula (2) as follows: From property 1, we have
= :
70
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
1,
(q - 1)(p - 1) 4 -2{[ql+
[~1+
[~1+.-.+
[-~]+ [~1 +'"+
[[(p-3)/2]q]} E[(P-~)/2]q]} ;
(q - 1)(p + 1) 4 -2{[q]+ -{[ql+
This is equivalent to what you have learned about a in Eq. (2), as only the parity is relevant.
[(pp3)q]
x
second, when p is of the form 4n + 3,
Because the elements a are all even and p is odd, it follows that y~ r = y~ [p~] (mod 2), and thus from Eq. (1) that
1
p
p/2
Figure 1.
-{[q]+
1 =
H
When we apply these substitutions to the last (p :F 1)/4 terms of the top series in Eq. (2) we have:
Eisenstein: But with all due respect, Herr Hofrath, for-
( q ) = ( - 1 ) ~ [ ~ ].
B
D
Oz
Eqa=pE[~I+E
J
[~]+
[~1+...+
[~1 + [~] +"'+
[[(p-pl)/a]q]~jj
[[(P-1)/2]ql}.
From these the Erg~inzungssatz follows for q = 2, and we assume henceforth that q is an odd prime. Eisenstein: Firstly, Herr Direktor, the Erg~inzungssatz
also follows easily from my Eq. (3). And, secondly, the results of your substitutions can all be interpreted geometrically for q odd. Let us use a simple geometric representation of the exponent ~ [p~] in Eq. (3) to transform it while retaining its parity: This exponent is precisely the number of integer lattice points with even abscissas lying in the interior of triangle ABD in Figure I (note that no lattice points lie on the line AB). Consider an even abscissa a > p/2. Because the number of lattice points on each abscissa in the interior of rectangle ADBF is q - 1, which is even, the number [p~] of lattice points on the abscissa below AB has the same parity as the number of
lattice points above AB. This, in turn, is the same as the number of points lying below AB on the odd abscissa p - a. The one-to-one correspondence between even abscissas in triangle BHJ and odd abscissas in AHK now implies that ~ [~] _= # (mod 2), where # is the number of points inside triangle AHK, and thus
above for c~we see that L, and likewise M, are even, and, moreover,
L+M=a+fl+
( p - 1 ) ( q - 1) 4
Then
(P) (q) = (-1)a(-1)fl = (-1)a+fl =(-1) (p-1)(q-1)/4 and the proof is complete. In each of your expressions above for a, the first line is even, and the second line counts precisely the lattice points in triangle AHK. Your substitutions are realized simply by these geometric transformations when we focus specifically on the parity. Gauss (aside): Darn, my calculations really do seem clumsy compared with this youngster's geometric methods. Gauss (to Eisenstein again): Very well, Herr Eisenstein, there does seem to be some merit to your approach. But now your geometry has reached its limit, and we must compare the reciprocal roles of p and q as follows. Considering the third line in each of the expressions above for c~, and comparing it with the same expression when the roles of p and q are interchanged, I will prove that [ q ] + [~_q] + [_~] + " ' + + [~]+
[_~] + - - - +
[[(p-1)/2]q]+
Eisenstein (agitated): But Herr Hofrath, please, this is immediately obvious from the geometric representation, for it is merely the sum of the number of lattice points # in triangle AHK with the number v in triangle AHL, which clearly equals the total number (/9-1)/2. (q - 1)/2 inside the rectangle. Gauss (after some hesitation): Indeed, verehrter Herr Eisenstein, this is impressive! For the sake of completeness, though, I will now finish my argument by letting
[ P ] + [ - ~ 1 + [-~] + " ' +
"How lucky good Euler would have considered himself, had he possessed these lines about seventy years ago." [9] Eisenstein (aside): I wonder if Gauss already had m y geometric view of his transformations when he published his third proof in 1808, and if he has merely been humoring me all along in our conversation.
Epilogue
This is somewhat technical and lengthy, but we begin as follows: ...
M=fl+
As I wrote to my friend Stern,
[p]
[[(q-1)/2]p](p-1)(q-1)4
L = a + [ q ] + [ - ~ ] + [-~-~] + ' " +
Eisenstein: Herr Hofrath, thank you for your kind words. Of course, my proof was already finished before, as I have
[[(p-~)/2]q], [[(q-1)/2]p],
where fl has the same definition as a does but with the roles of p and q interchanged. Then from the expressions
Gauss: I have just today, July 14, 1844, written a letter to C. Gerling, saying, "I have recently made the acquaintance of a young mathematician, Eisenstein from Berlin, who came here with a letter of recommendation from Humboldt. This man, who is still very young, exhibits very excellent talent, and will certainly achieve great things" [4]. By the end of this year he will have contributed no less than 16 of the 27 mathematical articles in volume 27 of Crelle's Journal, including his geometric proof of the Fundamental Theorem. Next year, as a thirdsemester student, he will receive an honorary doctorate from the University of Breslau [10]. I will write to the great scientist and explorer Alexander von Humboldt that Eisenstein's talent is "that which nature bestows upon only a few in each century." Both yon Humboldt and I will make great efforts, for the most part in vain, to obtain recognition and financial security for the impoverished Eisenstein. He will obtain a position as a Privatdozent (unsalaried lecturer) at the University in Berlin, and will eventually be admitted to the Berlin Academy of Sciences in early 1852. But by then his lifelong poor health will have seriously deteriorated, and he will die later that same year, at the age of 29, of tuberculosis [3, 10]. How tragic that, like Abel and Galois in the same period, the genius of Gotthold Eisenstein will be lost to so early a death. THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2,1994
71
The brilliance and p o w e r of Eisenstein's w o r k will remain unappreciated for a long time, in part because it will be almost 125 years before the publication of his Collected Works. In the foreword to their second edition [6], the eminent 20th-century mathematician Andr6 Weil will review Eisenstein's mathematical contributions, in particular "the impressive series of papers on elliptic functions and their application to the higher reciprocity laws .... [The] series ends up with a great paper, the Genaue Untersuchung of 1847, which excited Kronecker's enthusiasm when he discovered it late in life, and which still deserves ours; it is nothing less than the sketch of a complete theory of elliptic and modular functions, based on principles essentially distinct from those of Jacobi and from those of Weierstrass (while anticipating him by nearly fifteen years), but, as I have more amply demonstrated elsewhere, its principles can be profitably applied to important current problems." Late in the 19th century, Baumgart [2] will write a survey of the m a n y different proofs of the Fundamental Theorem given b y then. Unfortunately, he will misunderstand and overlook most of the beautiful features of Eisenstein's geometric proof, mentioning only how he counts the points in a rectangle to avoid m y technical argument above for adding the two series with interchanged roles for p and q. Sadly, he will miss Eisenstein's algebraic form of m y Lemma, as well as his geometric way of representing m y transformations. Subsequent mathematicians, probably relying on Baumgart's survey rather than reading Eisenstein's original paper, will perpetuate this oversight. Let me just mention Bachmann's early 20th-century book on n u m b e r theory [1] as an example. Only shortly before the d a w n of the 21.st century will this injustice be rectified, w h e n mathematicians of the distant future rediscover and fully appreciate the neglected and spectacular parts of Eisenstein's geometric proof of the Fundamental Theorem of higher arithmetic.
Acknowledgments The authors w o u l d like to thank Keith Dennis for his assistance in locating sources, Tom H o e k s e m a and Carol Walker for their enthusiastic support of the Honors courses which led to this work, and Joe Z u n d for telling us where to look.
References 1. P. G. Bachmann, Niedere Zahlentheorie, Teubner, Leipzig, 1902-1910; republished by Chelsea, New York, 1968. 2. O. Baumgart, "Uber das Quadratische Reciprozit/itsgesetz," Z. Math. Phys. 30 (1885), 169-277. 72
THE MATHEMATICAL INTELLIGENCER VOL. 16, NO. 2, 1994
3. K.-R. Biermann, Gotthold Eisenstein: "Die wichtigsten Daten seines Lebens und Wirkens," in Mathematische Werke; Gotthold Eisenstein, Chelsea Publ. Co., New York, 1989, pp. 919-929. 4. K.-R. Biermann, Carl Friedrich Gauss, Verlag C. H. Beck, Mfinchen, 1990, p. 177. 5. G. Eisenstein, Berlin Academy of Sciences membership acceptance speech (1852), in Mathematische Werke, 2nd ed., Chelsea Publ. Co., New York, 1989, pp. 762-763. 6. G. Eisenstein, Mathematische Werke, 2nd ed., Chelsea Publ. Co., New York, 1989, p. ix. 7. G. Eisenstein, "Geometrischer Beweis des Fundamentaltheorems fiir die quadratischen Reste," Journal ~ r reine u. angew. Math. 28 (1844), 246-249; also in Mathematische Werke, 2nd ed., Chelsea Publ. Co., New York 1989, pp. 164-166. 8. C. E Gauss, Commentationes Societatis Regiae Scientiarum Gottingensis 16 (1808), G6ttingen; also Werke, G6ttingen, 1876, Band 2, pp. 1-8. 9. A. Hurwitz and E Rudio (eds.), "Briefe von G. Eisenstein an M. Stern," supplement to Z. Math. Phys. 40 (1895), 169203, pp. 173-4; also G. Eisenstein, Mathematische Werke, 2nd ed. Chelsea Publ. Co., New York, 1989, pp. 789-823. 10. E Rudio (ed.), "Eine Autobiographie von Gotthold Eisenstein. Mit Erg/inzenden Biographischen Notizen," Z. Math. Phys. 40 (1895), 143-168; also G. Eisenstein, Mathematische Werke, 2nd ed. Chelsea Publ. Co., New York, 1989, pp. 879904. 11. D. E. Smith, A Source Book in Mathematics, Dover, New York, 1959, pp. 112-118.
Department of Mathematical Sciences New Mexico State University Las Cruces, NM 88003 USA
Jet Wimp*
Inverse Acoustic and Electromagnetic Scattering Theory by D. Colton and R. Kress N e w York: Springer-Verlag, 1992. x + 305 pp. US $59.00, ISBN 0-387-55518-8
Reviewed by Pierre Sabatier We mathematicians are often concerned with respectabili~: We tend to believe there are respectable disciplines, and within these disciplines there are respectable problems. Those working in other sciences are perhaps less value-conscious; to them there are simply problems. The progress of inverse scattering theory from a literal scattering of isolated results of questionable mathematical value to the central (and respectable) position it occupies in present-day mathematical physics is illustrative. Rutherford's discovery of the atomic nucleus, and the development of sonar and radar were essentially isolated discoveries. Now, because of its applications to such cutting-edge technological advances as geophysical exploration, medical imaging (for instance, computerized tomography), and nondestructive testing, inverse scattering theory attracts and challenges engineers and mathematicians, both pure and applied. Scattering theory deals with the effect an inhomogeneous m e d i u m - - the scatterer-- has on an incident particle or wave. The scatterer may either be an object or a medium. Suppose we view the total field as the sum of an incident field u i and a scattered field u s. Workers have found it useful to distinguish two distinct problems. The direct scattering problem is to determine u s from a knowledge of u i and the differential equation governing the wave motion. The more difficult and possibly more interesting inverse scattering problem is to determine the nature of the scattering object from a knowledge of the asymptotic behavior of u s, that is, to reconstruct the differential equation from the behavior of (many of) its solutions. The above much simplified description subsumes a huge * Column Editor's address: Department of Mathematics, Drexel University, P h i l a d e l p h i a , PA 19104 USA.
range of physical concepts and mathematical ideas, and the approaches and the mathematical tools used in scattering theory reflect this range. The methods we use to investigate direct scattering problems depend very much on the frequency of the wave motion, whether the wave is acoustic or electromagnetic. If the wavelength is very small compared with the smallest experimentally observable distance, the scatterer will produce a shadow with a sharp edge. However, if one examines the edge more closely, one finds that it is not sharply defined but consists of fringes, a phenomenon known as diffraction. On the other hand, obstacles which are small compared with the wavelength will not produce identifiable shadows. Hence, there are two distinct crucial frequency regions corresponding to the magnitude of ka, a being the dimension of the scatterer and k the wave number. The high-frequency region is the set of k such that ka >> 1. The resonance region is the set of values such that ka is less than or comparable to unity. The mathematical methods that are necessary to analyze scattering in the resonance region differ from those used in the high-frequency region. Although the book under review gives exact methods which apply to data in any range, these methods are useful almost exclusively in problems in the resonance region. Many investigators have treated the direct scattering problem, starting with Sommerfeld's uniqueness results in 1912, and by now a considerable body of information is available concerning its solution. Only recently has the inverse scattering problem progressed from a smattering of ad hoc techniques with little rigorous mathematical basis to respectability. The reason for the comparative intransigence of the inverse problem, is that it is inherently nonlinear and, more seriously from the point of view of numerical computations, it is improperly posed. A mathematical problem is called improperly posed, or ill-posed, if the informational content of the data is not adequate to allow the reconstruction of the parameters of the problem. Even though the solution of the problem may exist and be unique, the problem is still called ill-posed if small perturbations of the data produce large effects in the reconstructed parameters. In scattering theory, if the scatterer has finite
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2 (~)1994Springer-VerlagNew York 7 3
support, small perturbations of the far-field pattern give rise to a function which lies outside the class of far-field patterns, and this can lead to large errors in attempts to reconstruct the scatterer. Scattering theory allows several kinds of inverse problems. Two clear-cut cases are (1) determining the shape of an obstacle and (2) determining the constitutive parameters of a medium by means of scattering experiments. In both problems, measurements are made on the far-field pattern of the waves. Every day on an entirely personal basis we manage to solve a plethora of inverse scattering problems. A large part of our sensory information on the world around us consists of intuitive solutions of such inverse problems: "We infer the shape, size and surface texture of external objects from their scattering and absorption of light as detected by our eyes" [1]. However, inverse problems can also be idealized as purely mathematical problems, such as determining an operator from its spectrum [2]. It is quite remarkable that inverse problems of imaging in the physical and geophysical sciences, spectral inverse problems in pure and applied mathematics, inverse problems of geological prospecting, and many others coming from even more remote disciplines have so many points in common. One can speak of a unified field of Inverse Problems. In this field, the results of David Colton and Rainer Kress a r e - to use a metaphor from the world of car racing-- in "pole position" among the exact studies of the determination of the shape or constitutive parameters from electromagnetic or acoustic soundings. The Colton and Kress book on the direct problem, Integral Equation Methods in Scattering Theory, published by John Wiley in 1983, has a well-deserved reputation among research workers. For almost the first time, surface integral methods were demonstrated in a comprehensive and efficient way. Kress's book, Linear Integral Equations, published by Springer-Verlag in 1989, had the same qualities. Hence, people in the field looked forward to the present book with great anticipation. We are not disappointed. This book treats four kinds of inverse scattering problem: obstacle and medium scattering problems for scalar waves, namely, acoustic waves; and obstacle and medium scattering problems for vectorial waves, namely, electromagnetic waves. An obstacle D is characterized by its surface OD and the boundary conditions imposed on the wave function on that surface. In acoustical scattering, the wave function u vanishes on OD in the case of a "soft" obstacle, whereas if v is the unit outward normal to OD,
Ou Ov
--~0
for a "hard" obstacle. If the surface impedance A is finite and nonvanishing,
Ou 0---~+ Au = 0 on OD. 74
THE MATHEMATICALINTELLIGENCERVOL. 16, NO. 2,1994
A medium is characterized by a set of "constitutive parameters" such as density, conductivity, and so on, instead of only a boundary and boundary conditions. Colton and Kress begin the presentation with a historical-thematical chapter. The treatment of acoustic direct problems comes next. The treatment is more compact but not less clear and efficient than their 1983 treatment. A tutorial chapter on ill-posed problems follows. This chapter is a simple and self-contained account of all the essential topics: definitions, singular-value decomposition, standard regularization methods, and so on. Next, the authors begin their discussion of one of the main subjects of the book: the inverse acoustic obstacle problem. This was the object of well-known studies by the authors throughout the eighties, and the results of those studies appear as a sort of bouquet at the end of this chapter. Uniqueness theorems open the discussion. A wellknown one is that if the far-field patterns of two "soft" scatterers coincide for an infinite number of incident plane waves with distinct directions and one fixed wave number, the scatterers coincide. This problem can be related to the Dirichlet problem, and by using results on the Dirichlet eigenvalue problem for the Laplacian, the authors are able to prove uniqueness theorems. The authors easily extend these results to the case of a finite number of incident plane waves by obtaining an a priori bound on the number of possible Dirichlet eigenvalues for scatterers contained in a ball of fixed radius. They establish similar theorems for hard scatterers. They then demonstrate the construction of the obstacle shape OD from the scattered field by a direct a p p r o a c h - - a n iterative one in most cases and with convergence proofs given in the case of starlike C 2 surfaces--and by a twostep approach. In the latter, they first derive the scattered wave from the far field (this step taking care of the illposedness of the problem), and then they provide the construction of OD in terms of the location of the zeros of the total field. The authors give numerical results to illustrate a typical reconstruction. Finally, they explain the Colton-Monk approximation method, which relies on the properties of Herglotz wave functions. [A Herglotz wave function is a function of the form
v(x) = / ~ eik~'dg(d)ds(d), where x E ]~3, ~ is the unit sphere, and g E L2(~).] The second part of the book deals with electromagnetic scattering problems for obstacles, mainly perfect conductors. For three-dimensional problems, we must use Maxwell's equations. (Two-dimensional problems can be reduced to the problems of the two-dimensional Helmholtz equation.) After a discussion of the physical background, the authors derive the representation theorems for solutions to the Maxwell equations in a homogeneous medium utilizing the Silver-Muller radiation
condition, which replaces the Sommerfeld condition for acoustic waves. They extend the notion of Herglotz wave functions for acoustic waves to the case of electromagnetic waves. This enables them to present completeness results for the far-field patterns which correspond to the scattering of electromagnetic plane waves with different incident directions and polarizations. The chapter on inverse electromagnetic obstacle scattering extends the results on inverse acoustic scattering. Again, the authors consider only one of the many possible inverse obstacle problems: Given the electric far-field pattern for one or several incident plane electromagnetic waves and a perfectly conducting scattering obstacle, find the shape of the scatterer. The results generalize those of acoustics, but the proofs are more complicated. In the last chapters of the book, the authors deal with scattering by an inhomogeneous medium. They reduce the direct problem to the Lippmann-Schwinger equation, and from this point on, scattering theory could be developed in a manner parallel to that of quantum mechanics [2]. The uniqueness theorems, which were related to Dirichlet eigenvalues in the case of scattering by soft obstacles, are related now to the so-called transmission eigenvalues, which the authors discuss thorOughly. They then extend the analysis to electromagnetic waves. The last chapter is an easy walk through the techniques of the inverse-medium p r o b l e m - - t h e most obvious one being the approximate solution of the Lippmann-Schwinger equation by an optimization process. (An easily proved and useful uniqueness theorem states that the refractive index n is uniquely determined by a knowledge of the far-field pattern for an interval of values of k. A more tedious and lengthy argument shows that a single value of k suffices.) The authors produce a generalization of their results to the electromagnetic case where weak solutions exist.
Applications of the ideas mentioned in this review occur in numerous areas in science and technology. For instance, Colton and Monk [3] have recently shown how their techniques can be used for the detection of leukemia. The idea is that cancerous cells in bone marrow cause a change in the dielectric constant and conductivity of the marrow, and these changes can be detected by solving the inverse medium problem. The book is a striking example of a treatment of a highly technical subject which is well written, yet never sacrifices rigor. Sticking to typical problems, the authors succeed in maintaining clarity and interest. The reviewer would have appreciated more physics. However, he must admit that many physicists are unable to read current mathematical discussions of scattering theory, and he is not sure that many applied mathematicians would have been interested in more physics. The reviewer is sure of one point: This book should be on the desk of any researcher, any student, any teacher interested in scattering theory.
References 1. R.G. Newton, Introduction to Ref. [2]. 2. K. Chadan and P. C. Sabatier, Inverse Problems of Quantum Scattering Theory, 2nd ed., Springer-Verlag, New York, 1989. 3. D. Colton and P. Monk, The monitoring of human or animal bone marrow by electromagnetic waves, in Inverse Problems: Principlesand Applications in Geophysics, Technology,and Medicine (G. Anger, et al., eds.), Academie-Verlag, Berlin, 1993, pp. 100-110. Physique-Mathdmatique Universitd des Sciences et Techniques du Languedoc Place Eugene Bataillon 34095 Montpellier Cedex 05, France
STATEMENT OF OWNERSHIP, MANAGEMENT, AND CIRCULATION (Required by 39 U.S.C. 3685). (1) Title of publication: The Mathematical Intelligencer. A. Publication No.: 001-656. (2) Date of filing: 10/1/93. (3) Frequency of issue: Quarterly. A. No. of issues published annually, 4. B. Annual subscription price, $37.00. (4) Location of known office of publication: 175 Fifth Avenue, New York, NY 10010. (5) Location of the headquarters of general business offices of the publishers: 175 Fifth Avenue, New York, NY 10010. (6) Names and addresses of publisher, editor, and managing editor: Publisher: Springer-Verlag New York Inc., 175 Fifth Avenue, New York, NY 10010. Editor: Chandler Davis, Department of Mathematics, University of Toronto, Toronto, Ontario, M5S 1A1 Canada. Managing Editor: SpringerVerlag New York Inc., 175 Fifth Avenue, New York, NY 10010. (7) Owner: Springer-Verlag Export GmbH, ~ergartenstrasse 17, D-69121 Heidelberg, Germany; and Springer-Verlag GmbH & CO.,KG, Heidelberger Platz 3, Berlin, Germany. (8) Known bondholders, mortgagees, and other security holders owning or holding 1 percent or more of total of bonds, mortgages or other securities: Dr. Konrad Springer, Heidelberger Platz 3, D-14197 Berlin, Germany. (9) The purpose, function, and nonprofit status of this organization and the exempt status for Federal income tax purposes: has not changed during preceding 12 months. (10) Extent and nature of circulation. A. Total no. copies printed (net press run): Average no. copies each issue during the preceding 12 months, 8150; no. copies single issue nearest filing date, 9400. B. Paid circulation: 1. Sales through dealers and carriers, street vendors, and counter sales: Average no. copies each issue during preceding 12 months, 1028; no. copies single issue nearest to filing date, 1228. 2. Mail subscriptions: average no. copies each issue during preceding 12 months, 4570; no. copies single issue nearest to filing date, 4393. C. Total paid circulation: average no. copies each issue during preceding 12 months, 5598; no. copies single issue nearest to filing date, 5621. D. Free distribution by mail carrier, or other means. Samples, complimentary, and other free copies: average no. copies each issue during preceding 12 months, 359; no. copies single issue nearest to filing date, 369. E. Total distribution: average no. copies each issue during the preceding 12 months, 5957; no. copies single issue nearest to filing date, 5990. E Copies not distributed: 1. Office use, left-over, unaccounted, spoiled after printing: average no. copies each issue during the preceding 12 months, 1782; no. copies single !ssue nearest to filing date, 3410.2. Return from news agents: average number of copies single issue during preceding 12 months, 41I; no. of copies single issue nearest to filing date, 0. G. Total: average no. copies each issue during preceding 12 months, 8150; no. copies single issue nearest to filing date, 9400. I certify that the statements made by me above are correct and complete.
Craig Van Dyck Vice-President, Production THE MATHEMATICALINTELLIGENCERVOL.16, NO. 2, 1994 75
Robin Wilson*
Russian Mathematics I Numerous distinguished Russian mathematicians have been featured on stamps. In this column we present three of these from the 19th century. Nikolai Ivanovich Lobachevskii (1792-1856) was appointed Dean of Mathematics at the University of Kazan in 1820, and Rector of the University in 1827. He discovered his celebrated non-Euclidean geometry in the 1820s, announcing it in 1826 and publishing it in 1829, but like Bolyai's geometry around the same time, it went unrecognized for many years. This stamp, and that of Sonya Kovalevskaya appeared in 1951 in a set featuring distinguished Russian scientists. PafnutiT Lvovich Chebyshev (1821-1894) is remembered primarily for his work on orthogonal functions, prime numbers, and probability theory. In 1847 he became Assistant Professor of Mathematics at the University of St. Petersburg, and founded the St. Petersburg School of Mathematics. He has given his name to Chebyshev's inequality in probability theory and to the Chebyshev polynomials. He made an important contribution to the proof of the prime number theorem, worked on quadratic forms and integrals, and studied theoretical mechanics and linkages. This stamp was issued in 1946 to commemorate the 125th anniversary of his birth.
Sonya Kovalevskaya (1850-1891) was a mathematician and novelist who made valuable contributions to partial differential equations, Abelian integrals, and astronomy. Barred, because of her sex, from studying in a Russian university, she went to Heidelberg and attended lectures of du Bois-Reymond, Kirchhoff, and Helmholtz. Later she worked closely with Weierstrass in Berlin. Her greatest success was in winning the French Academy's coveted Prix Bordin for a memoir on the rotation of bodies.
S. Kovalevskaya
Grateful though we are for the wealth of philatelic marvels Robin Wilson has provided over the years, we need not condemn to silence all the other readers who have prizes of their own to share. Feel free to submit guest columns! They may present stamps, or coins, or for that matter bills, having some association with mathematics. Submissions may be sent to the Column Editor or to the Editorin-Chief. Chandler Davis
N.I. Lobachevskii
R L. Chebyshev
*Column editor'saddress:Facultyof Mathematics, The Open University,Milton KeynesMK76AA England 76 THEMATHEMATICAL INTELLIGENCER VOL. 16, NO. 2 (~)1994 Springer-Verlag New York