Mathematics and the 21st Century: Proceedings of the International Conference, Cairo, Egypt, 15-20 January 2000


Author: A. A. Ashour | A. S. F. Obada


Mathematics and the 21st Century

Editors

A. A. Ashour & A.-S. F. Obada

World Scientific

Mathematics and the 21st Century

Proceedings of the International Conference

Mathematics and the 21st Century Cairo, Egypt

15-20 January 2000

Editors

A. A. Ashour Department of Mathematics, Cairo University, Egypt

A.-S. F. Obada Department of Mathematics, Al-Azhar University, Egypt

World Scientific

Singapore • New Jersey • London • Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data Mathematics and the 21st century : proceedings of the international conference, Cairo, Egypt, 15-20 January 2000 / edited by A.A. Ashour, A.-S.F. Obada. p. cm. ISBN 9810245483 (alk. paper) 1. Mathematics--Congresses. I. Ashour, A.A. II.Obada, A.-S.F. III. International Conference on Mathematics and the 21st century (2000 : Cairo, Egypt) QA1 .M83245 2001 00-068520 510-dc21

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

Printed in Singapore by Uto-Print

The International Organizing Committee
A. A. Ashour (Egypt) (Chairman)
F. Abi-Khuzam (Lebanon)
N. Balakrishnan (Canada)
P. Griffiths (USA) (General Secretary of IMU)
A. Hamoui (Kuwait)
M. Ismail (USA)
I. Khalil (Morocco)
J. Palis (Brazil) (President of IMU)
P. Sarnak (USA)

Local Organizing Committee
G. M. Abd Al-Kader (Al-Azhar University)
E. K. Al-Hussaini (Assiut University)
A. H. Azzam (Suez Canal University)
S. El-Gindy (Assiut University)
H. El-Hosseiny (Cairo University)
H. M. El-Oweidy (Al-Azhar University)
H. N. Ismail (Benha High Institute)
I. F. Mikhail (Ain Shams University)
A.-S. F. Obada (Al-Azhar University)


Preface

The Conference on "Mathematics and the 21st Century" was held in Cairo, Egypt, during the period 15-20 January 2000. The Conference was an event of the WMY2000 initiative launched by the International Mathematical Union. It was hosted in the Tiba Rose Hotel in Cairo, with its good facilities, where most of the foreign participants were lodged.

The following organizations are acknowledged for their financial support:

The International Mathematical Union (IMU).
The Abdus Salam International Centre for Theoretical Physics (ICTP), Trieste, Italy.
The Third World Academy of Sciences (TWAS).
UNESCO's Regional Office for Science and Technology for the Arab States (ROSTAS).
Ministry of International Cooperation, Egypt.
The Egyptian Academy for Scientific Research and Technology (ASRT).

The International Committee for WMY2000 is acknowledged for making their web site available to distribute information about the conference. The Egyptian Mathematical Society helped in the organization of the conference.

The Conference was opened by an address from H. E. Prof. Dr. Mofeed Shehab, Egyptian Minister of Scientific Research. In addition, Prof. M. Yosry, President of the ASRT; Prof. Jacob Palis, President of IMU and also representing TWAS; and Prof. M. El-Deek, Head of ROSTAS, addressed the conference. A message of good wishes from Prof. M. Virasoro, Director of ICTP, was delivered by Prof. A. Ashour, chairman of the conference, who also gave a welcoming speech. The opening session was followed by the Millennium lecture, "A Thousand Years of Mathematics", delivered by Sir Michael Atiyah.

The conference's sessions consisted of plenary lectures and topical sessions. Some of the plenary lectures covered general fields such as: Rewriting the history of mathematics (Rashed); Education of mathematics (Ebeid); Relation between mathematics and sciences (Griffiths); Mathematical aspects of transportation (Groetschel).
General reviews of recent research and standing problems were also delivered. In this respect the following lectures are relevant: A global view of dynamical systems (Palis); Einstein's theory of space-time and gravitation (Ehlers); A geometrical theory for the unification of all fundamental forces (El-Naschie); Transfer of energy from low frequency to high frequency modes (Nayfeh); Stratification of algebras (Dlab); Multivariate statistical distributions (Balakrishnan).

The invited topical lectures covered: Finite groups (Ballester-Bolinches & Asaad); Radical theory (Wiegandt); Enumerative geometry (Procesi); Moduli problems in geometry (Narasimhan); Asymptotic behaviour of solutions of evolution equations (Basit); Instability of nonlinear evolution equations (Debnath); Nonsmooth dynamical systems (Kuepper); On approximations of functions (H. Ismail); On a semi-analytic method for the solution of some elliptic B.V.P. (Ghaleb); On robust layer resolving methods for computing numerical approximations (Miller); Eigenvalues for fractal drums (Fleckinger); Electrostatic models and orthogonal polynomials (M. Ismail); On the developments in the theory of functions of several complex variables (Fadlalla); Invertibility preserving linear maps (Sourour); Entire functions and sections (Abi-Khuzam); Optical solitons (Bullough); Non-classical properties of intermediate states (Obada); On the relativistic 2-body problem (Komy); Singularities in general relativity (Buchner); Linear geometry of light cone (Abdel-Megied); Advances and new results in prediction (Al-Hussaini); and Theory of accelerated experiments (Nikouline).

The plenary and invited topical lectures covered a broad spectrum of research in mathematics and its applications, together with the history of mathematics and mathematics education. More than 90 research papers related to different fields of mathematics were delivered in the topical sessions. The Conference was attended by 132 participants from 19 countries, including 11 invited plenary speakers and 24 invited topical speakers.

In the closing ceremony, Prof. Narasimhan, the Director of Mathematics in ICTP, gave a short speech; Prof. Ali H. Nayfeh of Virginia Polytech., USA, spoke for the invited speakers; and Prof. Ashour, chairman of the conference, gave some concluding remarks. The Conference offered a unique opportunity for mathematicians from Egypt and nearby countries to have an overview of the actual status of research in many areas of the mathematical sciences and to tighten connections with their colleagues in other countries.
The organizing committee requested all the lecturers who delivered the plenary and the invited lectures to present the texts of their contributions. The following volume comprises the texts of the plenary and invited topical lectures that have been provided by the contributors.

A. A. Ashour
A.-S. F. Obada

Cairo, October 2000

Contents

Organizing Committee
Preface
Millennium Lecture - Cairo, 15 January 2000
    Sir Michael Atiyah
Trends for Science and Mathematics in the 21st Century
    Phillip A. Griffiths
Arabic Mathematics and Rewriting the History of Mathematics
    Roshdi Rashed
The Paradigm Shift in Mathematics Education: A Scenario for Change
    William Ebeid
Einstein's Theory of Spacetime and Gravity
    Jürgen Ehlers
Moduli Problems in Geometry
    M. S. Narasimhan
Enumerative Geometry from the Greeks to Strings
    C. Procesi
Optical Solitons: Twenty-Seven Years of the Last Millennium and Three More Years of the New?
    R. K. Bullough
Concepts for Non-smooth Dynamical Systems
    Tassilo Küpper
Radical Theory: Developments and Trends
    Richard Wiegandt
On Minimal Subgroups of Finite Groups
    M. Asaad
Totally and Mutually Permutable Products of Finite Groups
    A. Ballester-Bolinches
Asymptotic Behaviour of Solutions of Evolution Equations
    Bolis Basit
On Nonlinear Evolution Equations with Applications
    Lokenath Debnath
A Robust Layer-Resolving Numerical Method for a Free Convection Problem
    Jocelyn Etienne, John J. H. Miller and Grigorii I. Shishkin
Growth Value-Distribution and Zero-Free Regions of Entire Functions and Sections
    Faruk F. Abi-Khuzam
Three Linear Preserver Problems
    Ahmed Ramzi Sourour
Prediction: Advances and New Research
    Essam K. Al-Hussaini
Inference on Parameters of the Laplace Distribution Based on Type-II Censored Samples Using Edgeworth Approximation
    N. Balakrishnan, A. Childs, Z. Govindarajulu and M. P. Chandramouleeswaran
Mathematical Models in the Theory of Accelerated Experiments
    V. Bagdonavicius and M. Nikulin
The Vibrations of a Drum with Fractal Boundary
    Jacqueline Fleckinger-Pellé
Intermediate States: Some Nonclassical Properties
    M. Sebawe Abdalla and A.-S. F. Obada
On the Relativistic Two-Body Equation
    S. R. Komy
Singularities in General Relativity and the Origin of Charge
    K. Buchner
The Inner Geometry of Light Cone in Gödel Universe
    M. Abdel-Megied
List of Participants


Mathematics and the 21st Century
Eds. A. A. Ashour and A.-S. F. Obada
© 2001 World Scientific Publishing Co. (pp. 1-2)

MILLENNIUM LECTURE - CAIRO, 15 JANUARY 2000
Michael Atiyah

1000 YEARS OF MATHEMATICS

Over 2000 years ago the Greeks developed the formal study of Geometry and around 1000 years later the Arabs, building on the work of their predecessors in both Greece and India, established Algebra. These are two pillars of mathematics and provide the framework in which the Calculus was formulated in the 17th Century, beginning the modern era in Mathematics and physical science.

There are many dichotomies in Mathematics representing different viewpoints, and I give below one such list. On the left side are those aspects most closely associated with the geometric way of thinking, while on the right are the more formal aspects represented by algebra. The dichotomies are of course not clear cut, they represent extreme points of a continuous spectrum, and each deserves a study on its own.

Applied          Pure
Concrete         Abstract
Geometry         Algebra
Space            Time
Understanding    Proof
Implicit         Explicit
Infinite         Finite

The two aspects of mathematics, exemplified by Geometry and Algebra, represent different modes of thought which undoubtedly have a physiological basis, perhaps related to the two hemispheres of the brain. Geometry is concerned with vision and static phenomena in space (since the speed of light is almost infinite), while Algebra is concerned with information which is sequential in time and analogous to hearing: algebraic operations are always thought of as being performed one after the other. The Calculus, which describes dynamics, arises from the fusion of these two ways of thinking. The Newtonian approach to Calculus emphasised the geometrical side, exemplified by the use of space-time graphs and tangents to curves. This saw its ultimate realisation in the Minkowski picture of a 4-dimensional space-time. By contrast Leibniz preferred to emphasise algebra with the derivative being treated as an algebraic process and part of a symbolic notation. This formal approach became standard and the whole Leibniz philosophy of treating mathematics symbolically prepared the way for

These two sides of Mathematics, following in the footsteps of Newton on the one hand or Leibniz on the other, still have their protagonists in the 20th century. In 1900 they were represented by Poincaré and Hilbert respectively, while in 2000 they have Arnold and Bourbaki as their champions.

The history of mathematics over the past four centuries can be viewed as the progressive understanding and enlargement of the notions of Geometry and Algebra, together with their fusion in the Calculus. The history of Geometry is the development of the notion of space, starting with Euclid and progressing through Gauss, Riemann, Klein and Lie to the modern times. At present the global or interconnection with physics through Einstein's theory of General Relativity has had a major impact.

Algebra has also had a fascinating development with the discovery of complex numbers (and their geometric representation) being the first high-point. In the 19th century non-commutative multiplication in various guises (quaternions, groups, matrices) opened new doors, which also found their physical applications later in quantum mechanics. Another side of algebra encompasses combinatorial methods from triangulations of spaces to homological algebra. Here geometry and algebra come together in novel ways. Finally there was, following Leibniz, the development of Boolean Algebra leading to symbolic logic, the Hilbert programme of axiomatisation and Gödel's theorem.

As we enter the 21st century both Geometry and Algebra are in full vigour and both have intimate links with the latest ideas in physics. Each represents a vital part of mathematics and both are essential: the legacy of both Greeks and Arabs lives on. To the purist who wants to take sides one should ask the rhetorical question "would you rather be blind or deaf?".

Mathematics and the 21st Century
Eds. A. A. Ashour and A.-S. F. Obada
© 2001 World Scientific Publishing Co. (pp. 3-11)

Trends for Science and Mathematics in the 21st Century¹
Phillip A. Griffiths
Director, Institute for Advanced Study
Princeton, New Jersey

¹ Presented at the International Conference on Mathematics and the 21st Century, Cairo, Egypt, January 16, 2000.

Good morning. I'm very glad to be with you today as we begin a new millennium together. I can't think of a more appropriate topic than trends in science and mathematics, because it seems very likely that science and technology will be even more important in the next millennium than they are right now.

I am certainly no expert on trends, and I confess that I become very nervous when talking about the future. But I have recently served on a science policy committee in Washington, and one of the requirements of serving in Washington is that you pontificate on very large issues you don't know much about. So perhaps you'll excuse me for making some guesses today about some of the issues we've been discussing. Perhaps if we can agree on the shape of a few large trends as we see them today, we can also agree that their momentum will move them some distance into at least the near future.

The primary theme I want to talk about is how interconnected mathematics and the sciences are becoming. We are learning that all scientific and mathematical knowledge is interrelated and interdependent. And we have begun to see this knowledge as a set of principles and relationships that extends from invisible atomic particles to the vast biological and social systems of the earth. As a consequence, we can see more clearly the need to practice the theoretical and applied aspects of research in close proximity, and the need for collaboration among workers in many disciplines.

I am a mathematician, so I'll speak primarily from the viewpoint of mathematics, and from here the present era certainly looks like a golden age. One of the reasons is that mathematics is starting to become very interactive with the sciences and engineering. These interactions are leading both to great insights in the sciences, and to fundamental advances in mathematics.

I want today to describe five major trends in science and mathematics, along with some of the challenges that await us in the 21st century.

Trend 1: From the Linear Model to the Dynamic Model of Research

The first major trend has to do with the way we describe research. Many people who discuss science policy have assumed that basic research is different from applied research. They might say that basic research is the pursuit of knowledge for its own sake, without much thought about how it will be used. And they might say that applied research is different because it is done with more specific goals in mind. Many people talk about a "linear model" of research - they say that knowledge moves in one direction from basic research to applied research to development and finally to application.

But this model does not match very well with the real world. Even the simplest research project involves a dynamic flow of ideas and information in many directions. This is not surprising to researchers, because they know how they work. But it can be surprising to the agencies that provide the money for research. If agencies understand this dynamic process of research, they can do a better job of funding it. For example, an agency would be wise to fund both basic and applied research - not just one kind. If they decide to fund only applied research because they want to move directly to practical applications, they could distort the scientific process very seriously.

We can think of many examples to show how the most creative research depends on both basic and applied thinking. Louis Pasteur, the great French biologist, was often motivated by practical questions from medicine, beer brewing, wine making, and agriculture, and these questions led him to fundamental discoveries about basic biology and disease. Gregor Mendel, the father of modern genetics, was asking very practical questions about how to improve agricultural crops when he discovered the basic laws of genetics. More recently, the study of basic optics in physics, which has the traditional goal of producing better lenses for cameras and telescopes, has now brought us fiber optics, which is one of the most important foundations of modern telecommunications. We need to maintain a balanced and diverse research portfolio, with many kinds of researchers and many linkages between them.

Trend 2: From Theory + Experiment, to Theory + Experiment + Computation

A second major trend in research has been the expansion of the scientific process itself. Until recently, we defined the scientific method as two steps - theory and experiment. Now, with the explosion of computer capacity, we have added the third essential step of computation. This third step allows us to design mathematical models of systems that are too complex to measure or quantify directly, and to answer questions that were beyond understanding only a few decades ago.

The ozone hole

A familiar example that requires extensive computation is the mixing of oceans and atmospheres. We try to understand this mixing through a combination of fluid mechanics and nonlinear dynamics, modeling the underlying physical and chemical processes. It is far more complex than a fast diffusion process, such as the spreading of ink through water. For example, a careful look at either environment reveals "islands" of unmixed fluid that are not penetrated from the outside. In the oceans, this phenomenon can be a matter of life or death for fish, which depend on the mixing of nutrients, chemicals, plankton, and other fish. In the atmosphere, these islands can determine the spread of pollution and greenhouse gases.

For example, the ozone hole that forms every winter over Antarctica is one of these islands. In the hole, ozone is almost completely destroyed by chemical reaction in the upper atmosphere's clouds. The hole is surrounded by ozone, and the atmosphere is stirred by turbulence, but the surrounding ozone doesn't enter the hole. This is because it is at the center of a large vortex, and mathematical models correctly predict that the outer edge of the vortex acts as a barrier to mixing. When warming breaks up the vortex each spring, the barrier disappears and new ozone returns to the hole.
Understanding this question requires all three steps of the scientific process - the theory of fluid mechanics, experiments with atmospheric conditions, and finally computation, which is then checked against the original observations. This understanding was previously impossible because we didn't have the computing power.

Kepler's sphere packing conjecture

Computing power has also allowed the solution of a major problem in mathematics, Kepler's Sphere Packing Conjecture, which had eluded mathematicians for nearly four centuries. Work on this problem began in the latter half of the 16th century, when Sir Walter Raleigh wrote to the English mathematician Thomas Harriot, asking him to find a quick way to estimate the number of cannonballs piled on the deck of a ship. In turn, Harriot wrote to Johannes Kepler, the German astronomer, who was already interested in stacking: how could spheres be arranged to minimize the gaps among them? Kepler could find no system more efficient than the way sailors naturally stack cannonballs, or grocers stack oranges, known as face-centered cubic packing. This assertion became known as the Kepler conjecture.

The problem is difficult because of the immense number of possibilities that must be eliminated. By the mid-20th century, mathematicians knew in principle how to reduce it to a finite problem, but even that problem was too large for existing computing. A major advance came in 1953 when the Hungarian mathematician Laszlo Fejes-Toth reduced the problem to a huge calculation involving many specific cases and also suggested a new way of solving the problem by computer.

The proof announced by Thomas Hales in 1998 involves enormous complexity. His equation has 150 variables, each of which must be changed to describe every conceivable stacking arrangement. The proof relies extensively on methods from the theory of global optimization, linear programming, and interval arithmetic; it fills 250 pages of text and about 3 gigabytes of computer programs and data. Only at the end of the proof does one know for sure that Hales' reduction to a finite problem was legitimate. He acknowledges that for a proof this long and complex, it will be some time before anyone can confirm all its details. It's worth noting that this exercise sheds light on related subfields.
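As a small illustration of the quantity at stake in the Kepler conjecture, the density of face-centered cubic packing can be computed from its unit cell: a cube of edge a contains 4 spheres of radius a/(2√2), which fills a fraction π/√18 (about 74%) of space. The sketch below is not part of the lecture; the function names are illustrative, and simple cubic packing is included only as a point of comparison.

```python
import math

def sphere_volume(radius: float) -> float:
    """Volume of a sphere of the given radius."""
    return 4.0 / 3.0 * math.pi * radius ** 3

def fcc_density() -> float:
    """Fraction of space filled by face-centered cubic packing."""
    a = 1.0                         # edge length of the cubic unit cell
    r = a / (2.0 * math.sqrt(2.0))  # spheres touch along a face diagonal: 4r = a*sqrt(2)
    spheres_per_cell = 4            # 8 corners * 1/8 + 6 faces * 1/2
    return spheres_per_cell * sphere_volume(r) / a ** 3

def simple_cubic_density() -> float:
    """Fraction of space filled when spheres sit on a plain cubic grid."""
    a = 1.0
    r = a / 2.0                     # spheres touch along a cube edge: 2r = a
    return sphere_volume(r) / a ** 3

print(f"face-centered cubic: {fcc_density():.4f}")
print(f"simple cubic:        {simple_cubic_density():.4f}")
```

The conjecture asserts that no arrangement of equal spheres, regular or irregular, exceeds the first of these two values.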
The topic of sphere packing belongs to a crucial part of the mathematics that lies behind the error-detecting and error-correcting codes that are widely used to store information on compact disks and to compress information for transmission around the world. In today's information society, it is difficult to find a more significant application than that.

Theoretical computer science

I'd like to emphasize that computation belongs to the larger field of computer science, the theoretical aspect of which has become one of the most important and active areas of scientific study today. It really started half a century ago, before modern computers existed, when Alan Turing and his contemporaries set out to mathematically define the concept of computation, and to study its power and limits. These questions led to the construction by von Neumann of the first electronic computer, followed by the computer revolution we are witnessing today.

The practical use of computers, and the unexpected depth of the concept of "computation," has significantly expanded theoretical computer science, or TCS. In the last quarter century TCS has grown into a rich and beautiful field, making connections to other sciences and attracting first-rate young scientists. A very important development is the shift in focus from "computation" to the much more elusive notion of "efficient computation." Other important aspects are the fundamental notion of NP-completeness, the use of randomness to revolutionize the theory of algorithms, and the development of modern cryptography and complexity theory.

Beyond these activities that are internal to TCS is important cross-fertilization between TCS and areas of mathematics such as combinatorics, algebra, topology, and analysis. Moreover, the fundamental problems of TCS have gained prominence as central problems of mathematics in general. More and more mathematicians are considering the "computational" aspects of their areas. In other words, they start with the theoretical conclusion that, "An equation can be solved" - then they follow it with the problem, "How fast, and to what degree of approximation, can the solution be found?"

A final aspect of TCS, which is to some people the most interesting, is that the field now overlaps with a whole new set of algorithmic problems from the other sciences. In these problems the required output is not well defined in advance, and it may begin with almost any kind of data: a picture, a sonogram, readings from the Hubble Space Telescope, stock-market share values, DNA sequences, neuron recordings of animals reacting to stimuli. Mathematical models are used to try to make sense of the data or predict their future values.

In general, the very notion of "computation," and the major problems surrounding it, have taken on deep philosophical, as well as practical, meaning and consequences. The field is focused on a few clear and deep questions: for example, Does randomization help computation? What constitutes a difficult theorem to prove? and, Can a quantum computer - or an optical one - be built? The time is ripe for exciting growth and fundamental new understanding throughout this new field.

Trend 3: From Disciplinary to Interdisciplinary Research

A third broad trend today is the shift from disciplinary to interdisciplinary research. Traditionally, academic research institutions are organized by disciplines, and research programs and results are reviewed by peers from the same discipline as the researcher.
A successful academic career is still primarily dependent on success in disciplinary research, which in turn is measured by publications, election to academies (whose sections are disciplinary), and the ability to obtain research grants. By and large, disciplinary science has been spectacularly successful in its depth and focus: physicists have explored the building blocks of matter, chemists have learned to create new compounds with specified qualities, biologists have identified many of the genes and proteins that regulate life. At the same time, modern problems are inviting approaches that require a new degree of breadth. New kinds of interdisciplinary teams are learning to examine problems whose complexity is greater than any single discipline.

The life sciences

This trend is especially evident in the life sciences, where new technologies and new knowledge have revolutionized our abilities to understand normal biological functions and disease. A broad array of scientific disciplines are beginning to overlap - a new consortium of biology, chemistry, physics, and mathematics. Physics, for example, has supplied the ingredients fundamental to many common clinical practices - X rays, CAT scans, fiber-optic viewing, laser surgery, echocardiography and fetal sonograms. Materials science is helping with new joints, heart valves, and other artificial tissues. Likewise, an understanding of nuclear magnetic resonance and positron emissions was required for the imaging experiments that allow us to follow the location and timing of brain activities that accompany thought, motion, sensation, speech, or drug use. And X-ray crystallography, chemistry, and computer modeling are now being used to improve the design of drugs, based on three-dimensional protein structures.

The Human Genome Project, which is now creating the maps and nucleotide sequences of the chromosomes of many organisms, from microbes to man, would not exist without recombinant DNA methods. Molecular cloning, in turn, would not exist without earlier studies of enzymes for synthesizing, cutting, and rejoining DNA. Moreover, today's effort to complete the 3-billion-base sequence of human DNA by 2005 depends on robots for processing samples and computers to store, compare, and retrieve the data.

Other, more specialized subfields have become indispensable. Recent efforts to sequence DNA on a commercial scale - for example, to screen many individuals for mutations that predispose to certain cancers - use nanotechnology and photochemistry to synthesize arrays of nearly 100,000 different short pieces of DNA on a small chip.

Infectious diseases

One of the fastest-growing new partnerships is the collaboration between mathematics and biology in the study of human infections. The foundations of this work were laid in the 1920s, when the Italian mathematician Vito Volterra developed the first models of predator-prey relationships. He found that the rise and fall of predator and prey populations of fish could best be described mathematically. After World War II, the modeling methods developed for populations were extended to epidemiology, which resembles population biology in being the study of diseases in large populations of people. Most recently, the insights of molecular genetics have inspired scientists to adapt these same methods to infectious diseases, where the objects of study are not populations of organisms or people, but populations of cells.
In a cellular system, the predator is a population of viruses, for example, and the prey is a population of human cells. These two populations rise and fall in a complex Darwinian struggle that lends itself to mathematical description. Mathematical biologists have been able to make quantitative predictions about the life expectancy of cells once they are infected by virus.

Some of the most surprising results have emerged in the study of the AIDS epidemic, reversing our understanding of HIV viruses in infected patients. The prevailing view had been that HIV viruses lie dormant for a period of 10 or so years before beginning to infect host cells and cause disease. Mathematical modeling has shown that the HIV viruses that cause the most disease are not dormant; they grow steadily and rapidly, with a half-life of only about 2 days.

Why, then, does it take an average of 10 years for infection to begin? Again, mathematical modeling has shown that disease progression may be caused by viral evolution. The immune system is capable of suppressing the virus for a long time, but eventually new forms of viruses mutate and become abundant and overwhelm the immune defense.

These same mathematical models have brought an understanding of why anti-HIV drugs should be given in combination, and given as early as possible during infection. They are most effective in combination because viruses seldom produce multiple mutations at once. And they should be given early before viral evolution can progress very far.
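The predator-prey models attributed to Volterra above are commonly written today as the Lotka-Volterra equations, dx/dt = αx - βxy for the prey and dy/dt = δxy - γy for the predator. The sketch below integrates them with a standard fourth-order Runge-Kutta step to show the characteristic rise and fall of the two populations. It is an illustration only: the parameter values and starting populations are arbitrary assumptions, not data from the lecture, and the models actually used for viral dynamics are more elaborate.

```python
import math

def lotka_volterra(state, alpha, beta, delta, gamma):
    """Right-hand side: x is the prey population, y the predator population."""
    x, y = state
    return (alpha * x - beta * x * y, delta * x * y - gamma * y)

def rk4_step(f, state, dt, *params):
    """One classical fourth-order Runge-Kutta step for a system of ODEs."""
    k1 = f(state, *params)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(x0, y0, params, dt=0.01, steps=5000):
    """Return the list of (prey, predator) states, including the start."""
    trajectory = [(x0, y0)]
    for _ in range(steps):
        trajectory.append(rk4_step(lotka_volterra, trajectory[-1], dt, *params))
    return trajectory

def invariant(state, alpha, beta, delta, gamma):
    """Quantity conserved along exact solutions; used to check the integrator."""
    x, y = state
    return delta * x - gamma * math.log(x) + beta * y - alpha * math.log(y)

# alpha, beta, delta, gamma: illustrative values (equilibrium at x = 4, y = 2)
PARAMS = (1.0, 0.5, 0.2, 0.8)
trajectory = simulate(6.0, 1.0, PARAMS)

prey = [x for x, _ in trajectory]
predators = [y for _, y in trajectory]
print(f"prey ranges over      [{min(prey):.2f}, {max(prey):.2f}]")
print(f"predators range over  [{min(predators):.2f}, {max(predators):.2f}]")
```

In the cellular setting described above, x would stand for the population of susceptible cells and y for the virus population; neither dies out, and each cycles around its equilibrium value.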

Trend 4: Complementing Reductionism with the Study of Complex Systems

A fourth major trend is a shift away from the traditional focus on reductionism toward more study of complex systems. Reductionism, or reducing a system to its smallest parts, has been dominant until recently, and many people have regarded physics, which studies the smallest of all particles, as the truest of the sciences. There's a famous statement attributed to Lord Rutherford, which is, "All science is either physics or stamp collecting." Obviously Lord Rutherford was an enthusiastic subscriber to the credo of reductionism and the simplicity of early physical laws.

But while the laws of the world are neat and orderly, the world itself is not. Everywhere we look - outside the classroom, that is - we see evidence of complexity: jagged mountain ranges, intricate patterns on the surface of sand dunes, the interdependencies of financial markets, the fluctuation of populations in biology.

Because the world is complex, there is a demand for more complex models. However, complex models lead eventually to problems that are not just larger and more complicated, but fundamentally different. It's not possible to characterize complex systems with the tools that work for well-behaved systems. The study of complex systems is much more subtle than just extrapolating from the fundamental laws by using a huge set of equations.

The study of climate is a good example. The basic equations used to define atmospheric processes, the Navier-Stokes equations, are nonlinear. This means that a predicted variable, such as wind speed and direction, or wind velocity, appears in the equations raised to a power. This exponential quality means that the system is highly sensitive to small differences in the initial state, as well as to measurement errors: Change something slightly and you may get a very different outcome.
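Sensitivity to the initial state can be seen in systems far simpler than the atmosphere. The sketch below does not use the Navier-Stokes equations; as a stand-in it uses the logistic map x → r·x·(1 - x), a one-line nonlinear rule in which the variable appears squared. Two runs that start 10⁻¹⁰ apart are compared step by step. The choice of map, the value r = 4, and the starting point are assumptions made for illustration.

```python
def logistic_map(x: float, r: float = 4.0) -> float:
    """One step of the logistic map, a simple nonlinear (quadratic) rule."""
    return r * x * (1.0 - x)

def trajectory(x0: float, steps: int) -> list:
    """Iterate the map from x0 and return all visited values."""
    values = [x0]
    for _ in range(steps):
        values.append(logistic_map(values[-1]))
    return values

# Two starting states that differ by one part in ten billion.
run_a = trajectory(0.2, 100)
run_b = trajectory(0.2 + 1e-10, 100)

# Distance between the two runs at each step.
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]
for step in (0, 10, 20, 30, 40, 50):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap starts far below any realistic measurement error and grows roughly geometrically until the two runs bear no relation to each other, which is the behavior the paragraph above describes.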
This is one reason that weather predictions are reasonably good for 3-5 days, but not very accurate after that. Complexity is also well known to engineers. A Pentium chip, for example, contains millions of individual elements: transistors, connecting wires, gate arrays. The fundamental principles are known for each component, but in the aggregate, these components interact in ways that are not straightforward. Designers have to use sophisticated modeling programs to predict these interactions and work out sensitivity-induced errors, or bugs. Studies of complexity are fruitful in the life sciences. After decades of successfully reducing fundamental questions about life to individual genes and proteins, biologists are now interested in looking at components in a more systemic way. Gene sequencing and other techniques will soon have isolated all the cell's individual parts and spelled out their individual functions; now investigators want to know how they function as a system. A central challenge is to understand the chemical networks that govern cell function, which is highly complex. For example, the expression of individual genes is regulated not by one, two, or five proteins, but by dozens. Some of them bind to DNA all the time, while others bind temporarily. Interactions between cell molecules have feedback effects that increase or decrease the expression of other molecules. We are now seeing early attempts to model cell systems by computer, which might be called the third phase of physiological study. First we had "in vivo," then we had "in vitro," and now we have "in silico." Primitive simulations can already show us how cells respond to simple changes in nutrients or environment. Other interdisciplinary projects are learning how viruses

"decide" whether to replicate inside a host or to lie dormant, waiting for a better opportunity. It appears that viruses have a feedback control mechanism that is inherently "noisy," so that not all of them make the same decision even under identical conditions. This clever adaptation ensures that some will survive should the other path prove fatal.

Trend 5: Globalization and the Diffusion of Knowledge

A fifth trend that affects research is the globalization of science. I said earlier that we need all kinds of research, both basic and applied. To continue this thought, every nation needs to do all kinds of research in order to compete internationally. At one time, during the 1970s and 1980s, it was believed that nations could use the research done in other countries and convert it into profits by good manufacturing and marketing techniques. But it now appears that this strategy of "technology first" is not as effective as we once thought. Recently, Japan, Korea, and other countries that used this strategy have moved to build up their own basic research capability. They have seen that they need their own advanced capabilities in order to understand and extend the discoveries that are made elsewhere. The second part of this trend is a global exchange of knowledge among both developed and developing countries. This trend is especially important to the developing nations, which are eager to raise their capabilities in science and technology. A generation ago, scientists from these countries usually had to relocate to find the best research opportunities and equipment. That's beginning to change, and increasingly the best scientists in every country are interested in staying at home to contribute to their own country's scientific expertise. Recently the World Bank has launched an initiative to establish small, exemplary research institutes in countries around the world.
The Millennium Science Initiative, as it's called, has received seed money from the Packard Foundation and loan money from the World Bank to begin operation. The first Millennium Science Institutes have now been established in Chile, and subsequent Institutes will be established in other countries in Latin America and elsewhere around the world. The objective of these Millennium Science Institutes is to allow scientists to work in their home countries, where they perform research and train the next generation of scientists through graduate and post-doctoral programs. They will establish linkages with existing research communities and help stimulate economic development. The Institutes themselves will form a global network, connected electronically and bound by common purposes. I anticipate that you'll be hearing more about these Millennium Institutes in the future.

Some challenges

Finally, I want to say that there are difficult challenges that await us in the new millennium - challenges that will resist the trend toward interdisciplinary and collaborative research. I have said that we need a higher level of interaction between disciplines, but there are significant barriers to overcome. I'll use the example of mathematics, which has similarities with other disciplines. One barrier to interaction is our own tradition of isolation. We mathematicians have been isolated from other subfields of mathematics, from other fields of science, and certainly from nonacademic areas, especially the private sector. It's important to build more bridges within

institutions and between institutions. For example, the cultures of universities and private industries are very different, and few mathematics students have enough knowledge about industry to know that they might have satisfying careers there. In the United States, some 80% of new doctoral mathematicians consider only academic careers. And yet we have already talked about how many promising opportunities are found in fields where industry is very active, such as bioinformatics and communications technology.

The culture of "pure" mathematics

Perhaps a more fundamental reason for our discomfort is that in the 20th century we were taught to place the highest importance on mathematical problems of great depth. Our culture has taught us to value the intellectual excitement of mathematics, the elegance and ultimate simplicity of its structures, and the freedom to follow interesting problems wherever they may lead. The tradition of pursuing mathematics for its own sake was firmly in place by the time I was a student. For example, I was strongly influenced by G. H. Hardy's book, A Mathematician's Apology. Hardy wrote about the intrinsic beauty of mathematics, and he suggested that our support of mathematics was justified by its importance as an esthetic and intellectual activity. Any relevance to practical uses or the physical world was beside the point or even undesirable. We were not taught to relate to problems that seem "messy" or insoluble in a precise sense - such as those of engineering, biology, chemistry, and meteorology. We preferred "pure" problems, and the word "pure" gives an accurate picture of our attitude, suggesting that other kinds of activity are less than pure. However, it is helpful to look farther back at the extremely long history of mathematics. As in the cases of Pasteur and Mendel, we see that fundamental mathematical discoveries were motivated by practical questions.
We think of Newton, Euler, Gauss, Riemann, Poincare, and others whose mathematics were integral to studies of the physical world. For most of our history we have participated in the mathematical aspects of physics and found them intrinsically interesting. But in the twentieth century there developed a tradition of doing mathematics for its own sake, and we designed our universities in a way that does not encourage collaboration across disciplinary boundaries. We physically separated the department of "applied mathematics" from the department of "pure mathematics," which reflected a narrow view about mathematical thought. I recall in the late 1970s and early 1980s, for example, at the university where I was teaching mathematics, the mathematics faculty focused exclusively on pure research, which they did extremely well. But we were physically separate from applied mathematicians, who were part of the department of applied science, along with computer science, control theory, and some engineering. Once we tried to hire an excellent mathematician to a joint appointment in both departments. He studied fluid mechanics both from the ["applied"?] viewpoint of partial differential equations and from the ["pure"?] viewpoint of numerical analysis. Unfortunately, other people in the department thought that this work was not "pure" enough for us and they declined what I thought was a very exciting opportunity to reach across disciplines. Today, this would be less likely to happen. Mathematics has become more interactive with the sciences and engineering. These interactions have led both to great insights in the sciences, and to fundamental advances in mathematics. So we are being invited to look more closely at subfields other than our own, and even at disciplines outside mathematics.

I think that the universities can learn a great deal from the private sector about effective organization of research. For example, one of the greatest research institutions in the U.S. was the old Bell Laboratories, in New Jersey, where researchers were organized by multidisciplinary teams rather than disciplines. At Bell Labs, the organizational structure did not determine the science; the science determined the organizational structure. There was far more freedom and flexibility to pursue problems, and great success in producing excellent science. Fortunately, it appears that some change is in the wind. For example, last year, our National Institutes of Health announced a new bioengineering initiative to fund multidisciplinary research, and interdisciplinary review panels are likely to follow. And new interdisciplinary centers are planned. One has been proposed at Stanford University to focus on biophysics; another at Princeton would focus on the working of genes and proteins. The Packard Foundation in the U.S. has recently put in place a major initiative to support interdisciplinary projects, exactly the type of projects that are very difficult to fund within the existing federal agency structure.

Conclusion

In conclusion, I would emphasize that we are witnessing a large global trend toward interactivity and collaboration, both in the way we look at our research activities and in the way we work with each other. The work of research is becoming more complex because we are doing much of it by computation. It is becoming more interdisciplinary, because that is the best way to understand complex systems. Nations all over the world are beginning to understand that they need their own research capability if they are going to compete intellectually and economically in the 21st century.
I have discussed a very exciting possibility with leading scientists here in the Middle East, and that is to establish a small interdisciplinary and international research center in Beirut. This center would be part of the Millennium Science Initiative, and one of its goals would be to promote collaborative research and education both among different disciplines and among leading Arab and Israeli scientists. I am confident that scientific research is an excellent forum to advance not only scientific and technological knowledge, but also the process of learning to work together across national borders. I do believe that the best way to pursue the technological challenges of the 21st century will be to recognize and adapt to this powerful trend, and to learn from organizations like the old Bell Labs, which identified the value of teamwork and interdisciplinary approaches many years ago. The challenge for us today is to improve on these older models, and to extend them from industry into academic research and teaching, where the scientists and engineers of tomorrow are being trained. Thank you very much.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 13-26)

ARABIC MATHEMATICS AND REWRITING THE HISTORY OF MATHEMATICS

Roshdi RASHED

When one speaks about mathematics in Arabic from the 9th century on, it is difficult to avoid understatement; it is difficult to do full justice to one's subject. Even an uncommonly well-armed and industrious mind could scarcely reproduce the content of the large number of lost works, or of the huge number of Arabic mathematical manuscripts that have been neither edited nor studied. In spite of this situation, the results obtained in the last decades show easily that, if we take away the contribution of Arab mathematicians, we will be unable to understand classical mathematics, I mean mathematics until the end of the 17th century. The old story about mathematics beginning in Greece and being renewed in Renaissance Europe, with Arab mathematicians as agents of transmission of the Greek legacy, does not hold up if we take care to get the facts right. In this talk, after a general view of mathematics in the 9th century, I will turn to rewriting the history of two chapters as examples. The early 9th century appears, with mathematicians like the Banū Mūsā, Thābit ibn Qurra and al-Māhānī, among many others, as a great moment of expansion of Hellenistic mathematics in Arabic. And it is at precisely that time, the beginning of the 9th century, that Muḥammad ibn Mūsā al-Khwārizmī writes a small book on a subject and in a style which are both new. It is in these pages that algebra features for the first time as a distinct and independent mathematical discipline. The event was crucial and perceived as such by al-Khwārizmī's contemporaries, as much for the style of mathematics as for the ontology of the subject, and, even more, for the wealth of possibilities that it offered from then on.
The style is both algorithmic and demonstrative, and already we have here, with this algebra, an indication of the immense potential which will pervade mathematics from the 9th century onwards: the application of mathematical disciplines one to another. In other words, algebra, because of its style and generality of purpose, made these inter-disciplinary applications possible, and they in turn, by virtue of their number and diversity, will, after the 9th century, constantly modify the structure of mathematics. A new mathematical rationality is born, one that we think will come to characterise classical mathematics, and more generally, classical science. Al-Khwārizmī's successors began - bit by bit - to apply arithmetic to algebra, and algebra to arithmetic, and both to trigonometry; algebra to


Euclid's theory of numbers, algebra to geometry and geometry to algebra. These applications were the founding acts of new disciplines, or at least of new chapters. This is how polynomial algebra came to be; as well as combinatorial analysis, numerical analysis, the numerical resolution of equations, the new theory of numbers and the geometric construction of equations. There were other effects as a result of these multiple applications - such as the separation of integer Diophantine analysis from rational Diophantine analysis, which would eventually have a chapter of its own within algebra under the title of 'indeterminate analysis'. From the 9th century onwards, therefore, the mathematical landscape is never quite the same: it is transformed, its horizons widen. One first sees the extension of Hellenistic arithmetic and geometry: the theory of conics, the theory of parallels, projective studies, Archimedean methods of measuring surfaces and curved volumes, isoperimetrical problems, geometrical transformations; all these areas become subjects of study for the most prestigious of mathematicians (Thābit ibn Qurra, Ibn Sahl, Ibn al-Haytham, to name but a few) who manage, after in-depth research, to develop them in the same fashion as their predecessors, or by modifying them whenever necessary. At the same time, within the tradition of Hellenistic mathematics, there is seen to be an exploration of non-Hellenistic mathematical areas. It is this new landscape, with its language, its technique and its norms, which will gradually become the landscape of classical mathematics. To show that, let me take two examples: Diophantine analysis and numerical analysis.

Rational Diophantine analysis

The emergence of indeterminate analysis - or, as it is called today, Diophantine analysis - as a distinct chapter in the history of algebra, goes back to the successors of al-Khwārizmī, and notably to Abū Kāmil.
His book, written around 880, was translated into Latin in the 12th century and into Hebrew in the 15th century in Italy. Abū Kāmil's purpose in his Algebra is to improve upon previous uncoordinated works, and to give a more systematic account, including not only problems and their algorithmic solutions, but methods as well. Indeed, Abū Kāmil, towards the end of his Algebra, deals with 38 Diophantine problems of the second degree and the systems of these equations, 4 systems of indeterminate linear equations, other systems of determinate linear equations, a group of problems centred around arithmetical


progression, and a further study of this last group¹. This collection satisfies the double goal set by Abū Kāmil: to solve indeterminate problems and at the same time to use algebra to solve problems that arithmeticians usually dealt with. In Abū Kāmil's Algebra, for the first time in history as far as I know, there is an explicit distinction drawn between determinate and indeterminate problems. A study of his 38 Diophantine problems not only reflects this distinction, it also shows that the problems do not succeed each other randomly, but according to an order implicitly indicated by Abū Kāmil. He puts the first 25 all into the same group, and gives a necessary and sufficient condition to determine rational positive solutions. Thus for instance $x^2 + 5 = y^2$.
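To make the example $x^2 + 5 = y^2$ concrete, here is a minimal sketch in modern terms. It is not a reconstruction of Abū Kāmil's own procedure: it simply factors the equation as $(y - x)(y + x) = 5$ and sets $y - x = t$ for a nonzero rational parameter $t$, so that $y + x = 5/t$. The function name and the sample values of $t$ are hypothetical choices made for illustration.

```python
from fractions import Fraction


def solve_x2_plus_c(t, c=5):
    """Rational point on x^2 + c = y^2 from a nonzero rational parameter t.

    Uses (y - x)(y + x) = c with y - x = t and y + x = c / t.
    """
    t = Fraction(t)
    x = (Fraction(c) / t - t) / 2
    y = (Fraction(c) / t + t) / 2
    return x, y


# Each value of t with 0 < t^2 < 5 gives a positive rational solution.
for t in (Fraction(1), Fraction(1, 2), Fraction(2, 3)):
    x, y = solve_x2_plus_c(t)
    assert x * x + 5 == y * y  # exact check in rational arithmetic
    print(t, x, y)
```

Because every rational solution has a rational value of $y - x$, letting $t$ run over the rationals reaches all of them, which is the sense in which a rational parametrisation settles the problem.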

Abū Kāmil reduces the problem to that of dividing a number which is the sum of two squares into two other squares, and solves it. Abū Kāmil's techniques of resolution show that he knows that if one of the variables can be expressed as a rational function of the other, or, more generally, if a rational parametrisation is possible, then all solutions can be obtained. Whereas if, on the other hand, the sum has led to an expression with an unresolvable radical, then there is absolutely no solution. In other words, in a language unknown to Abū Kāmil, a second degree curve either possesses no rational point or is birationally equivalent to a straight line. The second group is made up of 13 problems that are impossible to parametrise rationally. Once more, in a language unknown to Abū Kāmil, they all define curves of genus 1, as for instance the problem

$$x^2 + x = y^2, \qquad x^2 + 1 = z^2,$$

which defines a "skew quartic" curve of $A^3$ of genus 1. Half a century later, al-Karajī, another algebraist, extends rational Diophantine analysis further than ever before. He marks an important point in the history of algebra by formulating the concept of polynomial and the algebraic calculus of polynomials. In rational Diophantine analysis, al-Karajī differs from his predecessors - from Diophantus to Abū Kāmil - in that he does not give well-ordered lists of problems and their solutions, but instead structures his account on the basis of the number of terms in the

1 Istanbul, MS Kara Mustafa Paşa no. 379, fol. 79r-110v.


algebraic expression, and on the difference between their powers. Al-Karajī considers, for example, successively

$$ax^{2n} \pm bx^{2n-1} = y^2, \qquad ax^{2n} + bx^{2n-2} = y^2, \qquad ax^2 + bx + c = y^2.$$

This is a principle of organisation which would be borrowed by his successors. Al-Karajī further advances the task initially undertaken by Abū Kāmil, highlighting - as far as is possible - the methods for each class of problems. We can show the problem which defines a curve of genus 1 in $A^3$ simply as:

$$x^2 + a = y^2, \qquad x^2 - b = z^2.$$

Al-Karajī's successors attempted to follow the path that he laid out. I shall not elaborate further on the matter of rational Diophantine analysis in Arabic, and will turn now to the development of integer Diophantine analysis.

Integer Diophantine Analysis

The 10th century sees for the first time the constitution of integer Diophantine analysis, or new Diophantine analysis, doubtless thanks to algebra, but also, in some ways, despite it. The study of Diophantine problems had been approached on the one hand by demanding integer solutions, and on the other by proceeding according to demonstrations of the type found in the arithmetical books of Euclid's Elements. It is the specific combination - for the first time in history - of the realm of positive integers (understood as line segments), algebraic techniques and pure Euclidean-style demonstration that permitted the birth of the new Diophantine analysis. The translation of Diophantus's Arithmetica, as we know, provided these mathematicians not so much with methods as with problems in the theory of numbers which they found formulated therein. Unlike their Alexandrian predecessor, they wasted no time in systematising and examining these problems: the representation of a number which is the sum of squares, congruent numbers, etc.


This is how 10th century mathematicians such as al-Khāzin studied numerical right-angled triangles and problems of congruent numbers. Al-Khāzin gives the theorem of congruent numbers as follows²:

Given a natural integer $a$, the following conditions are equivalent:

1° the system

$$x^2 + a = y^2, \qquad x^2 - a = z^2$$

admits a solution;

2° there exists a pair of integers $(m, n)$ such that

$$m^2 + n^2 = x^2, \qquad 2mn = a;$$

under these conditions, $a$ is of the form $4uv(u^2 - v^2)$.

It was also in this tradition that the study of the representation of an integer as the sum of squares started: in fact, al-Khāzin devotes several propositions in his dissertation to this study. These 10th century mathematicians were the first to address the question of impossible problems, such as the first case of Fermat's theorem. But in spite of all their efforts, this problem continued to occupy mathematicians, who later stated the impossibility of the second case, $x^4 + y^4 = z^4$. Research into integer Diophantine analysis did not die with its initiators after the first half of the 10th century: quite the contrary, their successors carried on, at first in the same spirit. But, towards the end of its evolution, there was a noticeable increase in the use of purely arithmetical means in the study of Diophantine equations³.
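The theorem of congruent numbers stated above can be checked numerically in one direction. The sketch below is a modern illustration, not al-Khāzin's argument: starting from integers $u > v > 0$, it builds the pair $m = u^2 - v^2$, $n = 2uv$, so that $m^2 + n^2 = x^2$ with $x = u^2 + v^2$ and $a = 2mn = 4uv(u^2 - v^2)$, and then verifies that $x^2 + a$ and $x^2 - a$ are the squares of $m + n$ and $m - n$. The function name and the sample pairs are hypothetical.

```python
def congruent_from(u, v):
    """Return (a, x, y, z) with x^2 + a = y^2 and x^2 - a = z^2, for integers u > v > 0."""
    m, n = u * u - v * v, 2 * u * v   # legs of a right-angled triangle in integers
    x = u * u + v * v                 # hypotenuse, so that m^2 + n^2 = x^2
    a = 2 * m * n                     # equals 4uv(u^2 - v^2)
    y, z = m + n, abs(m - n)          # (m + n)^2 = x^2 + a and (m - n)^2 = x^2 - a
    return a, x, y, z


for u, v in [(2, 1), (3, 2), (5, 4)]:
    a, x, y, z = congruent_from(u, v)
    assert a == 4 * u * v * (u * u - v * v)
    assert x * x + a == y * y and x * x - a == z * z
    print(u, v, a, x, y, z)
```

The identity behind the check is $(m \pm n)^2 = m^2 + n^2 \pm 2mn$, which is why condition 2° gives a solution of the system in condition 1°.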

The Tradition continues

2 R. Rashed, Entre arithmétique et algèbre: Recherches sur l'histoire des mathématiques arabes, Paris, 1984, p. 212; English transl. The Development of Arabic Mathematics: Between Arithmetic and Algebra, Kluwer, Boston Studies in Philosophy of Science, 1994.

3 See R. Rashed, 'Al-Yazdī et l'équation $\sum_{i=1}^{n} x_i^2 = x^2$', Historia Scientiarum, vol. 4-2 (1994), p. 79-101.


With this example of Diophantine analysis, I wished to illustrate how algebra conceived at the time of al-Khwārizmī was central to the foundation and transformation of this new discipline. As we have seen, the dialectic between algebra and arithmetic has meant that rational Diophantine analysis was considered as part of algebra. And from then on, from al-Karajī to Euler, an important treatise of algebra would always include a chapter on rational Diophantine analysis. This stage marks the birth of integer Diophantine analysis, which would be bound to comply with the exigencies of demonstration. With these disciplines, we have finally seen the rise of elements of a new mathematical rationality which admits the infinity of solutions as a genuine solution. This allows us to differentiate between several types of infinity of solutions - such as the identities and infinitely great numbers - and to positively consider impossibility, or impossible solutions, as a subject for construction and demonstration⁴. However, all these features are precisely those of classical Diophantine analysis as it was conceived and practised in the 17th century by Bachet de Méziriac and Fermat. Around 1640, Fermat invents the method of infinite descent⁵ which itself would breathe new life into the discipline, but that is another story. One might ask whether this so-called epistemological continuity corresponds to a particular historical continuity and, if so, to which? To put it more bluntly, was Bachet de Méziriac, at the beginning of the 17th century, created out of nothing? Let us ponder this question for a while, as it affects our subject. My answer would be to simply recall the figure of one of the most prominent Latin mathematicians of the Middle Ages and the source of many Renaissance writings: Fibonacci, alias Leonardo Pisano. Fibonacci (1170 to after 1240), who lived in Bougie and travelled in Syria, Egypt and Sicily, was in touch with Emperor Frederick II and his court.
This court included Arabists dealing with Arabic mathematics, like John of Palermo, and Arabic speakers knowledgeable in mathematics, like Theodore of Antioch. Fibonacci wrote a Diophantine analysis, the Liber Quadratorum, that historians of mathematics rightly hold to be the most important contribution to Latin Middle-Ages theory of numbers, before Bachet de Meziriac and Fermat's contributions. The purpose of this book, as stated by Fibonacci himself, is to solve this system

4 R. Rashed, Entre arithmétique et algèbre, p. 195 sqq.

5 J. Itard, Essais d'histoire des mathématiques (collected and introduced by R. Rashed), Paris, 1984, pp. 229-234.

$$x^2 + 5 = y^2, \qquad x^2 - 5 = z^2$$

proposed by John of Palermo. This is not just any question of Diophantine analysis, but a problem that crops up as a problem in its own right in the works of al-Karajī and many others. More generally, the main results revealed in the Liber Quadratorum are either those obtained by Arabic mathematicians in the 10th and 11th centuries, or are very close to those. Furthermore, the results are placed in an identical mathematical context, namely the theory of Pythagorean triples, so the conclusion is really nothing new; a prominent historian whose admiration for Fibonacci cannot be doubted had already put it forward. I am referring to Gino Loria, who wrote: "It seems difficult to deny that Leonardo of Pisa (Pisano) has been led to research that had already been summarised by Muḥammad ibn Ḥusayn (read al-Khāzin), and his dependence on him is even more in evidence in the following section of the Liber Quadratorum which deals with 'congruent numbers'." We can see therefore that the Liber Quadratorum truly belongs to the tradition of 10th century mathematicians, who created integer Diophantine analysis. Although the case of Fibonacci and Diophantine analysis is not unique, it is exemplary, considering the level it reached. This mathematician, looked at from one direction, can be seen as one of the great figures in Arabic mathematics of the 9th to 11th centuries, but, looked at from another direction, can be seen as a scholar of 15th to 17th century Latin mathematics. We have seen in this example how classical scientific modernity had its roots in the 9th century, and that it continued to develop until the late 17th century. In this way, rational Diophantine analysis lives on into the 18th century, whereas integer Diophantine analysis undergoes a new revolution in the mid-17th century.
We also see that this modernity is written about in Arabic in the early stages, that it was then transmitted through Latin, Hebrew and Italian, before going on to become part of significant new research. And finally, we see that the rational core of this modernity was algebra, and that the conditions which allowed it to exist are inherent in the new ontology contained within its discipline.
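The system posed by John of Palermo, $x^2 + 5 = y^2$ and $x^2 - 5 = z^2$, has no integer solution, but it has the rational solution $x = 41/12$ for which the Liber Quadratorum is known. The sketch below only verifies that solution in exact arithmetic and shows, as a modern gloss rather than Fibonacci's own exposition, how it follows from an integer solution for $a = 720 = 5 \cdot 12^2$ by dividing through by 12. The variable names are hypothetical.

```python
from fractions import Fraction

# Integer solution of x^2 + a = y^2, x^2 - a = z^2 for a = 720 = 5 * 12**2.
x_int, y_int, z_int, a_int = 41, 49, 31, 720
assert x_int**2 + a_int == y_int**2 and x_int**2 - a_int == z_int**2

# Dividing every length by 12 divides a by 12**2, which turns 720 into 5.
scale = 12
x, y, z = (Fraction(k, scale) for k in (x_int, y_int, z_int))
assert Fraction(a_int, scale**2) == 5
assert x * x + 5 == y * y
assert x * x - 5 == z * z
print(x, y, z)
```

The scaling step is what ties the rational problem to the integer theory of congruent numbers: 5 is handled by finding a multiple of it by a square, here 720, that has the required integer form.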

Numerical analysis and interpolation methods


The second example is devoted to a chapter of mathematics almost unknown in the Hellenistic period, i.e. numerical analysis. Undoubtedly Greek mathematicians, as well as Babylonian and Egyptian ones, knew a lot about numerical procedures, particularly when they dealt with astronomical data, the calculation of calendars, etc. But they definitely did not conceive a chapter of mathematics devoted to the study of numerical procedures. We have to wait until the 10th century, when mathematicians became highly interested in the subject. The origin of this interest is to be found in other topics: algebra and the applied mathematics of that time, i.e. observational astronomy, and later on optics. Much of the algebraic activity of that time concerned the new algorithms required for the study of polynomial expressions and the numerical resolution of algebraic equations. The invention of combinatorial analysis by the 10th century mathematicians became a powerful instrument for this algorithmic research. One of the mathematicians who contributed the most to this research was al-Karajī, whom I mentioned before. A second figure, coming from a different horizon, was Abū al-Rayḥān al-Bīrūnī. He wrote a book one hundred folios long, now lost, on the extraction of the n-th root of an integer. During the same period, astronomers were studying the interpolation methods required for the composition of trigonometrical tables and their astronomical zījes. From the various contributions to numerical analysis, I shall be concerned here only with methods of interpolation. In this field, the main mathematical idea conceived by mathematicians of the 10th century is to approximate a function by polynomials. A good illustration of this activity at that time is the work of Ibn Yūnus [950 - 1009] in Cairo, and of al-Khāzin (mid 10th century) in Western Iran. Both of them invented and applied a quadratic interpolation.
This interpolation, given by Ibn Yūnus in his zīj for instance, may be written in modern notation as follows:

$$y = y_{-1} + \frac{x - x_{-1}}{d}\,\Delta y_{-1} + \frac{1}{2}\,\frac{(x - x_{-1})}{d}\,\frac{(x - x_0)}{d}\,\Delta^2 y_{-1},$$

with $d = x_i - x_{i-1}$ $(i = 0, 1, 2, \ldots)$, $\Delta$ the first order difference and $\Delta^2$ the second order difference. A few decades later, with his work on mathematical astronomy, al-Qānūn al-Masʿūdī, al-Bīrūnī contributed to the advancement of research on interpolation methods. We should notice before going further that al-Bīrūnī had contributed to two main topics of numerical analysis at that time. In this respect, he was the founder of a tradition in Arabic mathematics; to this tradition belonged later not only astronomers, but also algebraists, like al-Samaw'al [d. 1174] and al-Kāshī [d. 1429]. The combination of both


activities should be kept in mind because it influenced how mathematical research on algorithms was conceived. In this tradition, and that is a crucial point, al-Bīrūnī and his successors did not apply interpolation methods indiscriminately. They compared them with one another to choose the best for the given function. In order to carry out these comparisons, they combined theoretical and experimental considerations. Moreover, al-Bīrūnī tried to give a geometrical justification for some interpolation methods, as well as for the inequality of differences. Let us now take the interpolation methods explained in al-Bīrūnī's fragment - the linear interpolation designated as the Astronomers' method - to which we add al-Bīrūnī's method presented in his al-Qānūn al-Masʿūdī. These methods can be written in another notation, for $x_{-1} < x < x_0$, $d = x_0 - x_{-1} = x_i - x_{i-1}$ $(i = -2, -1, \ldots, n)$.

(α)
$$y = y_{-1} + \left(\frac{x - x_{-1}}{d}\right)\Delta y_{-1},$$
the Astronomers' method;

(β)
$$y = y_{-1} + \left(\frac{x - x_{-1}}{d}\right)\Delta y_{-2} + \left(\frac{x - x_{-1}}{d}\right)^{2}\Delta^2 y_{-2},$$
al-Bīrūnī's formula; its application requires the calculation of $\Delta y_{-2}$, $\Delta^2 y_{-2}$ and $x_{-1} > d$.

(γ)
$$y = y_0 + \left(\frac{x - x_0}{d}\right)\left[\frac{\Delta y_{-1} + \Delta y_0}{2} + \frac{1}{2}\left(\frac{x - x_0}{d}\right)\Delta^2 y_{-1}\right].$$

According to al-Bīrūnī, this method supposes $x < x_0$ and would be written

$$y = y_0 + \left(\frac{x_0 - x}{d}\right)\left[\frac{|\Delta y_{-1} + \Delta y_0|}{2} + \frac{1}{2}\left(\frac{x_0 - x}{d}\right)\Delta^2 y_{-1}\right],$$

with $(x_0 - x) > 0$, $\Delta y_{-1} < 0$, $\Delta y_0 < 0$ and $\Delta^2 y_{-1} > 0$. In other terms, according to al-Bīrūnī, Brahmagupta's method (γ) supposes $x < x_0$, and the correction is additive. In a sense, there is a slight difference with Brahmagupta's expression, at least in Sengupta's translation:

Multiply the residual arc left after division by 900' [d] (i.e. by 15°), by half the difference of the tabular difference passed over and that to be passed over and divide by 900' (i.e. by 15°); by the result increase or decrease, as the case may be, half the sum of the same two tabular differences; the result which is whether less or greater than the tabular difference to be passed, is the true tabular difference to be passed over [The Khaṇḍakhādyaka, p. 141].
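The rule quoted above can be compared with plain linear interpolation on a small numerical experiment. The sketch below is a modern illustration under stated assumptions: the three cotangent values are computed with the standard library rather than taken from al-Bīrūnī's table, the tabular step of 15° follows the 900' of the quotation, and the second-order rule is written in the centred form $y = y_0 + t\left[\tfrac{1}{2}(\Delta y_{-1} + \Delta y_0) + \tfrac{1}{2}\,t\,\Delta^2 y_{-1}\right]$ with $t = (x - x_0)/d$. All function and variable names are hypothetical.

```python
import math


def linear(x, x_left, d, y_left, y_right):
    """Linear (Astronomers') rule on the tabular interval [x_left, x_left + d]."""
    return y_left + (x - x_left) / d * (y_right - y_left)


def brahmagupta(x, x0, d, y_prev, y0, y_next):
    """Second-order rule centred on x0, using the tabular differences on either side."""
    t = (x - x0) / d
    dy_prev, dy0 = y0 - y_prev, y_next - y0       # differences passed over / to be passed over
    return y0 + t * ((dy_prev + dy0) / 2 + t / 2 * (dy0 - dy_prev))


def cot_deg(deg):
    """Cotangent of an angle given in degrees."""
    return 1.0 / math.tan(math.radians(deg))


d = 15.0                                           # tabular step: 900 minutes of arc
table = {a: cot_deg(a) for a in (30.0, 45.0, 60.0)}
x = 40.0                                           # lies between the entries at 30 and 45 degrees

exact = cot_deg(x)
lin = linear(x, 30.0, d, table[30.0], table[45.0])
quad = brahmagupta(x, 45.0, d, table[30.0], table[45.0], table[60.0])
print(exact, lin, quad)
```

On this table the linear value overshoots the true cotangent and the second-order value lies closer to it, which agrees with the remark in the text that the quadratic methods were devised to improve on linear interpolation.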

Finally, the monomial method:

(δ)
$$y = y_0 - \frac{(x_0 - x)(x_0 - x + 1)}{d(d+1)}\,\Delta y_{-1},$$

which proceeds by calculating the increments from $x_i$ to $x_{i-1}$.

It is of some interest for the history of numerical analysis to answer the question about the origin of these methods. There is a partial answer by al-Bīrūnī himself, indicated but not justified. For him quadratic methods, and particularly Brahmagupta's (γ), were invented to improve linear interpolation. Now, if we go back to the cotangent tables considered by al-Bīrūnī, without in any way diminishing the generality of the discussion, we see that the values obtained by linear interpolation err by excess, while the variation of the first-order differences is not uniform. Al-Bīrūnī's interpretation is that Brahmagupta, conscious of this fact, had sought second-order interpolation.

Let us take up again al-Bīrūnī's conjecture for all three quadratic interpolations, and replace in (α), that is, in the linear interpolation expression, $\Delta y_0$ by a $\Delta$ dependent on $x$. Start from $\Delta y_{-1}$ by linear interpolation: for $x = x_0$, $\Delta = \Delta y_{-1}$ and for $x = x_1$, $\Delta = \Delta y_0$. On $[x_0, x_1]$ this interpolation gives

$$\Delta = \Delta y_{-1} + \frac{x - x_0}{d}\,(\Delta y_0 - \Delta y_{-1}),$$

hence

$$y = y_0 + \frac{x - x_0}{d}\,\Delta = y_0 + \frac{x - x_0}{d}\left[\Delta y_{-1} + \frac{x - x_0}{d}\,\Delta^2 y_{-1}\right].$$

If we take this interpolation again on $[x_{-1}, x_0]$, we obtain the (β) formula. Let us now consider the linear interpolation on $[x_{-1}, x_1]$; we obtain

$$\Delta = \Delta y_{-1} + \frac{1}{2}\,\Delta^2 y_{-1} + \frac{1}{2}\,\frac{x - x_0}{d}\,(\Delta y_0 - \Delta y_{-1}),$$

hence

$$\Delta = \frac{1}{2}\left[\Delta y_{-1} + \Delta y_0\right] + \frac{1}{2}\,\frac{x - x_0}{d}\,\Delta^2 y_{-1};$$

hence

$$y = y_0 + \frac{x - x_0}{d}\,\Delta = y_0 + \frac{x - x_0}{d}\left[\frac{\Delta y_{-1} + \Delta y_0}{2} + \frac{1}{2}\,\frac{x - x_0}{d}\,\Delta^2 y_{-1}\right],$$
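which is Brahmagupta's formula.

Next we come to the monomial method. Divide the interval $[x_0, x_1]$ into $d$ equal parts. It is evident from the cotangent table that the function decreases more rapidly near $x_0$ than near $x_1$. Let us consider cumulative increments from $x_1$ to $x_0$:

$$\varepsilon + 2\varepsilon + \cdots + d\varepsilon = \frac{d(d+1)}{2}\,\varepsilon = |\Delta y_0|,$$

hence

$$\varepsilon = \frac{2\,|\Delta y_0|}{d(d+1)}.$$

The correction to be made on $y_1$ is additive and corresponds to a cumulative increment on $(x_1 - x)$, where $(x_1 - x)$ is an integer; then $y = y_1 + c$ where

$$c = \frac{(x_1 - x)(x_1 - x + 1)}{2}\,\varepsilon = \frac{(x_1 - x)(x_1 - x + 1)}{d(d+1)}\,|\Delta y_0|,$$

and

$$y = y_1 + \frac{(x_1 - x)(x_1 - x + 1)}{d(d+1)}\,|\Delta y_0|.$$

But $x_1 - x = x_1 - x_0 + x_0 - x = d + x_0 - x$; then

$$y = y_1 + \frac{(d + x_0 - x)(d + 1 + x_0 - x)}{d(d+1)}\,|\Delta y_0|,$$

hence

$$y = y_0 + \frac{x - x_0}{d}\,\Delta y_0\left[\frac{2d+1}{d+1} - \frac{d}{d+1}\,\frac{x - x_0}{d}\right],$$

a relation of the form

$$y = y_0 + \frac{x - x_0}{d}\,\Delta,$$

where $\Delta$ is of first degree relative to $\frac{x - x_0}{d}$.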
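The monomial construction can be checked numerically. The sketch below is an illustration, not the author's procedure: it builds the value at $x = x_1 - k$ by summing the increments $\varepsilon, 2\varepsilon, \ldots, k\varepsilon$ explicitly and compares it with a closed form in $(x - x_0)/d$ obtained by expanding $y = y_1 + c$. The function names are hypothetical, the step between table entries is assumed to be one unit, and $\Delta y_0$ is assumed negative, as for the cotangent.

```python
def monomial_by_increments(k, d, y_0, y_1):
    """Value at x = x_1 - k (k an integer, 0 <= k <= d).

    Increments eps, 2*eps, ..., d*eps are laid from x_1 back towards x_0
    and sum to |dy_0|; the correction c is the sum of the first k of them.
    """
    eps = 2 * abs(y_1 - y_0) / (d * (d + 1))
    c = sum(j * eps for j in range(1, k + 1))
    return y_1 + c


def monomial_closed_form(x, x_0, d, y_0, y_1):
    """Closed form y = y_0 + t * dy_0 * [(2d+1)/(d+1) - t*d/(d+1)], t = (x - x_0)/d.

    Assumes dy_0 = y_1 - y_0 < 0 (a decreasing function).
    """
    dy_0 = y_1 - y_0
    t = (x - x_0) / d
    return y_0 + t * dy_0 * ((2 * d + 1) / (d + 1) - t * d / (d + 1))


if __name__ == "__main__":
    # Hypothetical decreasing table values on [x_0, x_1] = [2, 5], so d = 3.
    x_0, x_1, d = 2, 5, 3
    y_0, y_1 = 10.0, 4.0
    for k in range(d + 1):
        x = x_1 - k
        print(x, monomial_by_increments(k, d, y_0, y_1),
              monomial_closed_form(x, x_0, d, y_0, y_1))
```

The two columns printed for each $x$ should agree, with $y_1$ recovered at $k = 0$ and $y_0$ at $k = d$.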

Finally, we obtain the monomial formula if we consider the interpolation on $[x_{-1}, x_0]$. Al-Bīrūnī was definitely right when he affirmed that the three quadratic methods are three different procedures to improve linear interpolation.

The second topic of interest for the history of these methods is why al-Bīrūnī did not apply his own method in his tract, and why he preferred the monomial formula to that of Brahmagupta. As al-Bīrūnī did not give any indication whatsoever, these historical questions may only have a mathematical answer. The answer, in this case, is quite probable, but not completely certain. We compared the different interpolations given by each of the three formulas with the cotangent curve on the different intervals considered by al-Bīrūnī. In this way, we can prove that for the interval [2°, 5°] he considered, and for other intervals of 3 degrees amplitude, [4°, 7°], [5°, 8°], [6°, 9°], the monomial method gives a better interpolation than Brahmagupta's. For all the other intervals of 3 degrees amplitude, and for the intervals of smaller amplitude, however, the latter gives a better interpolation. Moreover, Brahmagupta's method in general has greater possibilities of approaching the cotangent function more closely. Al-Bīrūnī's method (β) is less suitable than the two others for this function.

In his commentary on al-Bīrūnī's text, al-Samaw'al not only criticized his preference for the monomial method, but also attempted to explain it. In his opinion, al-Bīrūnī might have generalized the result obtained for the particular interval [2°, 5°] to all intervals. This criticism could be justified if al-Bīrūnī had generally preferred the monomial formula to Brahmagupta's formula, and not only for certain intervals where the interpolation of the cotangent function is not easy, because we are in the neighbourhood of the pole. In any case, it is difficult to imagine the true intention of al-Bīrūnī from this fragment alone.
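A comparison of this kind can be sketched as follows. This is a rough illustration and not the authors' computation: it assumes a unit gnomon, nodes spaced by the interval width $d$, errors sampled only at the interior whole degrees, and Brahmagupta's formula centred on the right-hand end of the interval (so that it also uses the node at $x_0 + d$). All function names are hypothetical.

```python
import math


def cot_deg(x):
    """Cotangent of an angle in degrees (unit gnomon assumed)."""
    return 1.0 / math.tan(math.radians(x))


def linear(x, x_0, d, f):
    """Linear interpolation on [x_0 - d, x_0]."""
    y_m1, y_0 = f(x_0 - d), f(x_0)
    return y_m1 + (x - (x_0 - d)) / d * (y_0 - y_m1)


def brahmagupta(x, x_0, d, f):
    """Quadratic formula centred at x_0, using nodes x_0 - d, x_0, x_0 + d."""
    y_m1, y_0, y_1 = f(x_0 - d), f(x_0), f(x_0 + d)
    dy_m1, dy_0 = y_0 - y_m1, y_1 - y_0
    t = (x - x_0) / d
    return y_0 + t * ((dy_m1 + dy_0) / 2 + 0.5 * t * (dy_0 - dy_m1))


def monomial(x, x_0, d, f):
    """Monomial formula on [x_0 - d, x_0]: y_0 - k(k+1)/(d(d+1)) * dy_{-1}, k = x_0 - x."""
    y_m1, y_0 = f(x_0 - d), f(x_0)
    k = x_0 - x
    return y_0 - k * (k + 1) / (d * (d + 1)) * (y_0 - y_m1)


def max_error(method, lo, hi, f=cot_deg):
    """Largest absolute error at the interior whole degrees of [lo, hi]."""
    d = hi - lo
    return max(abs(method(x, hi, d, f) - f(x)) for x in range(lo + 1, hi))


if __name__ == "__main__":
    for lo, hi in [(2, 5), (4, 7), (10, 13)]:
        print((lo, hi),
              "linear:", max_error(linear, lo, hi),
              "Brahmagupta:", max_error(brahmagupta, lo, hi),
              "monomial:", max_error(monomial, lo, hi))
```

Changing the sampling points, the node layout or the error measure may change which method looks better on a given interval, so the output should be read only as a plausibility check on the kind of comparison described above.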
More important than al-Samaw'al's defense of Brahmagupta's formula against the monomial formula is his own contribution to improving all the methods proposed. His main idea was to find convenient weightings. This idea is indeed interesting when first-order differences are large. This is quite true for the intervals [2,3], [3,4], or [2,5]. But it is less interesting when the cotangent decreases less rapidly. In this case, the weighted mean comes nearer and nearer to the arithmetic mean.

Without going into greater detail, let me summarize the main points.

1° Ibn Yūnus, al-Khāzin, al-Bīrūnī, al-Fārisī, al-Kāshī, among many others, contributed to this chapter of numerical analysis and, particularly through their research on quadratic (and sometimes cubic) interpolation methods, to the coming calculus of finite differences. The number of mathematicians interested in this field and their status in the mathematical community reflect in a way the importance of this chapter.

2° All of them were obviously trying to improve linear interpolation. In this respect, they followed two directions: substitute parabolic interpolation for a linear one, or find convenient weightings.

3° The increasing number of these methods led mathematicians to ask new questions which they sometimes did not yet have the conceptual means to answer. Al-Samaw'al, for instance, was asking about the speed of the differences. Nevertheless, to test the performance of each method, they determined the approximate magnitude of the error on different intervals. This is one of the first contributions, to my knowledge, to the evaluation of the error of approximation.

4° A simple examination of the experimental determinations confirms the fact that the mathematicians were guided by a certain knowledge of the studied function, the cotangent for example. This knowledge, however, is only presented in the text in numerical form. Nevertheless, this knowledge determined the choice of one method, rather than another, for a certain interval. This is, I believe, al-Bīrūnī's motive.

Let me now say a few words to conclude. This example of numerical analysis has no place in the traditional writing of the history of classical mathematics. It is by no means in the continuity of Greek mathematics. The case of Diophantine analysis argues in another way for the rewriting of the history of classical mathematics. Unless we include the works of Arabic mathematicians, we cannot even evaluate the Liber Quadratorum of Fibonacci. Moreover, we can understand neither the true reasons for the conception of integer Diophantine analysis, nor what is authentically new with Fermat.


In all that, it is not a mere question of new results that we should take into account, but the emergence of new mathematical rationalities, thanks mainly to the new possibilities offered by algebra, and which shaped a new relationship between various chapters of mathematics.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 27-40)


THE PARADIGM SHIFT IN MATHEMATICS EDUCATION:

A SCENARIO FOR CHANGE

William Ebeid
Prof. Mathematics Education, Faculty of Education, Ain Shams University.

With the advent of the third millennium, mathematics education institutions seem to be in a "tempestuous" zone. While there is great progress in mathematics as a discipline and as a recognized effective tool for advancement in science and technology, to the extent that high technology is considered a mathematical technology (David, 1984), there is distress and dissatisfaction with mathematics education in terms of its content, pedagogies and delivery system at all levels. In general there are poor outcomes in spite of the rich mandated objectives.

Symptoms of Dissatisfaction:

1. International and local levels of student attainment indicate poor results in mathematics examination papers. Perceived problems are: a serious lack of essential technical facility, such as the ability to undertake numerical and algebraic calculation with fluency and accuracy; deficiency in spatial abilities and visual thinking; and a decline in analytical powers when faced with problem-solving situations. International olympiads and competitions such as the Third International Mathematics and Science Study (TIMSS) confirm many of these perceptions. The serious problem, as reported by the London Mathematical Society (1995), is not just that some students are less well prepared, but that many high-attaining students are lacking in fundamental notions of the subject.

2. Beliefs about mathematics tend to perceive it as a tough subject to learn. Skemp (1971) mentioned that mathematics is a subject to be endured, not enjoyed, and to be dropped. Cockroft (1982) reported that mathematics is known as a difficult subject both to teach and to learn. Jensen and others (1989) indicated that because of their intrinsic abstractness and the generality of their issues, concepts and methods, both mathematics and physics are hard subjects to study. There are no roads to their acquisition that do not involve hurdles to overcome and hardship to be endured. Freudenthal (in Ebeid, 1995) alluded to two devils menacing geometry: its absorption into a system of mathematics, or its strangulation by rigid axioms.

3. Negative attitudes towards mathematics are reflected in many attitudinal studies. Mathematics is a generally disliked subject (Ernest, 1991). Mathematics leads away from the things of life and estranges men from the perception of what conduces to the common "weal" (Howson, 1982). Mathematics dries out the heart (Winslow, 1998). Mathematics is a black forest of symbols, it requires proving the obvious, and its professors look arrogant (Ebeid, 1999). A mathphobic social environment leads some students to develop mathematics anxiety, which causes an aversion to learning mathematics or turns them into poor achievers.

4. There is evidence of a general decline in enrolments in tertiary education in mathematics during the last decade (Jorgensen, 1998). This is also noted in the upper grades of the secondary stage, where courses follow the elective system or profiling into different streams of study.

A Half Century of Swirl Progress:

Prior to the panic reaction to the Sputnik incident in the mid-fifties of the twentieth century, mathematics education enjoyed a reasonable state of stability and "linear" amendments in its content. The dominating content was the canonical syllabus as manifested in the arithmetic of numbers, al-Khwarizmi-type algebra concentrating on solving equations and, in some places, manipulation of determinants, and Euclidean geometry moving from practical constructions to theoretical proof. Later, more "new" topics or branches were introduced here and there. In Egypt, for example, trigonometry and solid geometry were introduced as early as 1874; coordinate geometry in 1908 as part of algebra, then as a separate branch in 1953; history of mathematics in 1953, but it was dropped in 1961; statistics in 1957; descriptive geometry in 1961, but it was dropped two years later; and differentiation and integration (as more related to algebraic functions) in 1961 (Ebeid, 1992). In the UK and many other countries mathematics was seen as a training for discipline of thought and for logical reasoning (Dainton, 1968). The most profound change in mathematics curricula in the twentieth century is synonymous with the introduction of modern mathematics in the 1960s.
The changes envisioned in that era were intended to bring mathematics taught in schools into line with university mathematics, including changes in language, symbolism, treatment and topics, so as to give pre-university students a sense of what was preached as honest mathematics emphasizing the "logico-axiomatic" approach to unifying context-free mathematical systems. The enthusiasm for modern mathematics infused most countries, even those which lacked sufficient resources, a repertoire of experienced teachers, and cognitive readiness for early abstractness on the part of the learners. However, the enthusiasm for modern mathematics had faltered by the seventies (in spite of the fact that some third world countries were just being ignited by the movement, encouraged by some international and regional organizations and some commercial agents). The reforms enshrined in modern mathematics had debatable outcomes. A world-wide concern about the inadequacies of modern mathematics was expressed in the "Back to Basics" wave of new changes. The lack of agreement on what is basic caused swirl changes in different places. Some sought a mixture of traditional and modern topics and approaches; others restricted the content to traditional computations and manipulations. A reconciliation agenda for change was proposed by the National Council of Teachers of Mathematics in the U.S.A. (NCTM, 1980), which recommended eight priorities:

(1) Problem solving be the focus of school mathematics,
(2) Basic skills must encompass more than computational facility,
(3) Mathematics programs must take full advantage of the power of calculators and computers at all grade levels,
(4) Stringent standards of both effectiveness and efficiency must be applied to the teaching of mathematics,
(5) The success of mathematics programs and student learning must be evaluated by a wider range of measures than conventional testing,
(6) More mathematics for all and a greater range of options,
(7) A high level of professionalism for teachers,
(8) Public support must be raised to a level commensurate with the importance of mathematics to individuals and society.

Thus we find a shift in change to encompass multidimensional aspects of improvement and involve all stakeholders. In different projects the pendulum has swung between emphasizing mathematical skills and attempts to infuse thinking abilities while teaching mathematical topics. Paul Ernest (1991) distinguished five interest groups in Britain, showing that each has had different aims and views about mathematics education, as follows:

(1) Radical Conservatives and Bourgeois: Back to basics numeracy, social training in obedience.
(2) Meritocratic industry-centered Industrialists and Managers: Useful mathematics to appropriate level and certification.
(3) Conservative Mathematicians: Preserve rigour of proof and purity of mathematics. Transmit body of pure mathematical knowledge.
(4) Professionals, Liberal educators, Welfare state supporters: Creativity, self-realization through mathematics.
(5) Democratic Socialists and Radical Reformers concerned with justice and inequality: Critical awareness and democratic citizenship via mathematics.
Ernest (1998) reports that aims (1) and (3) are conservative, with the lower elements of knowledge and skills, together with external testing of achievement, in aim (1), and the higher elements of knowledge and skill directed at the elite few in aim (3). The two aims are directed at a "good" external to the students. They embody views of knowledge and skills as decontextualized. Aims (2) and (4) support the inclusion of a progressive knowledge-application dimension. These two aims support uses and applications relevant to the learner, so that knowledge is used productively. Aim (5) is concerned with the development of critical citizenship and empowerment for social change and equality through mathematics. Ernest considered that making mathematics relevant to critical citizenship is neglected in most countries.

With the increasing availability of and access to calculators and computers, there have been demands to benefit from this technology in mathematics education, leading to the elimination of some traditional skills and the injection of new concepts and topics relevant to the need to live with complexity. Thus, mathematics educators are more and more riding the wave of interest in creating new and innovative approaches that capitalize on technology. Some, for example, are calling for mathematics to be approached as an experimental science, within visual thinking, but not as language or as liturgy (Davis et al., 1994). However, Ernest (1998) reports that in technology education, curriculum theorists distinguish between developing technological capability and developing appreciation and awareness (Jeffery, 1988). Capability consists of the knowledge and skills in planning and making artifacts and systems. Appreciation and awareness comprise the higher-level skills, knowledge and judgement necessary to evaluate the significance, import and value of technological artifacts and systems within social, environmental, ecological and moral education. Kahan (1998) asserts that the educational project of our time cannot be of the Bourbaki type. Rather, it should be inspired by the Web System. Webbing mathematical knowledge would allow everyone, starting from his own culture and interest, to find a short track in the mathematics forest.

Examples of Paradigmatic Shifts.

The above-mentioned trials and suggestions reflect the fact that "modern" societies, as they contend for socio-economic prosperity and advancement, need numerate citizens, top mathematicians, authentic scientists and creative engineers and technologists. This implies a compelling and imperative necessity to make a paradigmatic shift in the course of mathematics education so as to tune it to the appropriate content, delivery systems and learning theories. In this context the following projects give examples of indigenous shifts, not just changes through additions and deletions.

I. A Chinese Perspective (Er-Sheng, 1998).

The perspective of mathematics education (PME) in China in the 21st century calls for a shift based on changes in: the social needs for mathematics, the nature of mathematics and its applications, and the understanding of how students learn mathematics.
These changes imply the following:

(a) Adaptation to the needs of the economy of the information age and the market economy. This requires useful mathematics to be learnt at the mastery level so as to: interpret computer-controlled processes, acquire analytical rather than merely mathematical skills, and deal with daily activities such as cost, profit, stock, forecast, risk evaluation... which in turn needs the study of ratio and proportion, operational research and optimization, systematic analysis and decision theory (and complexity and chaos).

(b) Inclusion of applications from the real world, in such areas as environmental and ecological sciences, social sciences, art, music (in addition to biology and other bio-sciences). This requires more of statistics and probability, dynamic systems, mathematization, and modeling of patterns as manifested in number, data, shape, arrangements... This also requires the use of appropriate software packages to facilitate and empower students' work.

(c) Approaching the learning of mathematics through constructivism, where the students approach each new task with some prior knowledge, assimilate the new information and construct their own meanings, to the extent that the new knowledge is integrated into their own cognitive structure via creative activities... instead of learning (if any) through passive absorption of information and storing it in easily retrievable fragments as a result of repeated practice.

II. A View From South Africa: Outcomes-Based Education (OBE) (Volmink, 1998).

South Africa has adopted a National Qualifications Framework (NQF) and Curriculum 2005 as the focus for systematic transformation of the education and training system. Further, an outcomes-based education approach was chosen as the vehicle to implement the objectives of the NQF. Eight generic outcomes have been chosen to ensure that learners would be prepared for life in a global society. The eight cross-curriculum outcomes are:

1. Identifying and solving problems in which responses display that responsible decisions, using critical and creative thinking, have been made.
2. Working effectively with others.
3. Organising and managing oneself and one's activities responsibly and effectively.
4. Collecting, analysing, organising and critically evaluating information.
5. Communicating effectively, using visual and/or language skills in the modes of oral and/or written persuasion.
6. Using science and technology effectively and/or critically, showing responsibility towards the environment and health of others.
7. Demonstrating an understanding of the world as a set of related systems by recognizing that problem-solving contexts do not exist in isolation.
8. Contributing to the full personal development of each learner and the social and economic development of the society at large.

The specific outcomes for learning mathematics are stated as follows:

(1) Demonstrate understanding about ways of working with numbers. This outcome is intended to develop an intuitive understanding of the number concept and to extend that understanding to include the tools needed to solve problems and handle information.
(2) Manipulate number patterns in different ways. This involves observing, representing and investigating patterns in social and physical phenomena.

(3) Demonstrate understanding of the historical development of mathematics in various social and cultural contexts. Mathematics must be seen, not as a European product, but as a human activity to which all people of the world have contributed in significant ways.

(4) Critically analyze how mathematical relationships are used in social, political and economic relations. This outcome is intended to allow learners to develop the critical capacity to participate in the decisions that affect their lives and to be aware of how issues such as race, gender and class play out in their lives and their communities.

(5) Measure with competence and confidence in a variety of contexts. This outcome is intended to develop the skills of measurement with due regard to accuracy and relevant units.

(6) Use data from various contexts to make informed judgements. In order to have the skills to make informed decisions within the context of a technologically advanced global system, learners must understand how information is processed.

(7) Describe and represent experience with shape, space, time and motion, using all available senses. This outcome is intended to help learners to visualize and represent phenomena within the context of space and time more effectively.

(8) Analyze natural forms, cultural products and processes as representations of shape, space and time. This will allow learners to make sense of the aesthetic forms, relationships and processes in their communities and beyond.

(9) Use mathematical language to communicate mathematical ideas, concepts, generalizations and thought processes. Learners will acquire the algebraic skills to process and communicate the ideas.

(10) Use various logical processes to formulate, test and justify conjectures. This outcome is intended to encourage learners to question, conjecture and experiment, and to develop their reasoning skills to construct and evaluate arguments.

Volmink (1998) comments that the curriculum of the past had been content-driven and extremely sterile. The new specific outcomes encourage educators and learners to focus on outcomes aimed at helping people to understand and act on the world they live in.

III. U.S.A. Standards 2000 (NCTM, 1998).

A draft document has been issued by the American National Council of Teachers of Mathematics (NCTM). It is concerned with principles and standards for mathematics classrooms, which are viewed as places where thinking about and doing mathematics is the central focus for the 21st century.

Guiding Principles: Mathematics instructional programs should:

(1) promote the learning of mathematics by all students.
(2) emphasize important and meaningful mathematics through curricula that are coherent and comprehensive.
(3) depend on competent and caring teachers who teach all students to understand and use mathematics.
(4) enable all students to understand and use mathematics.
(5) include assessment to monitor, enhance and evaluate the mathematics learning of all students and to inform teaching.
(6) use technology to help all students understand mathematics and prepare them to use mathematics in an increasingly technological world.

Content and Processes.

Ten standards follow the guiding principles; they describe the knowledge base through a connected body of mathematical understanding and competencies. The first five standards address the content, which represents what students should know; the last five address the processes, which represent ways of acquiring and using that knowledge. All ten standards are to be developed spirally through grades pre-K-12:

(St.1) Number and operation: Mathematics programs should foster the development of number and operation sense so that all students:
(a) understand numbers, ways of representing numbers, relationships among numbers and number systems.
(b) understand the meaning of operations and how they relate to each other.
(c) use computational tools and strategies fluently and estimate appropriately.

(St.2) Patterns, Functions and Algebra: Mathematics programs should include attention to patterns, functions, symbols and models so that all students:
(a) understand various types of patterns and functional relationships.
(b) use symbolic forms to represent and analyze mathematical situations and structures.
(c) use mathematical models and analyze change in both real and abstract contexts.

(St.3) Geometry and Spatial Sense: Mathematics programs should include attention to geometry and spatial sense so that all students:
(a) analyze characteristics and properties of two- and three-dimensional geometric objects.
(b) select and use different representational systems, including coordinate geometry and graph theory.
(c) recognize the usefulness of transformations and symmetry in analyzing mathematical situations.
(d) use visualization and spatial reasoning to solve problems both within and outside mathematics.

(St.4) Measurement: Mathematics programs should include attention to measurement so that all students:
(a) understand attributes, units and systems of measurement.
(b) apply a variety of techniques, tools and formulas for determining measurements.

(St.5) Data Analysis, Statistics and Probability: Mathematics programs should include attention to data analysis, statistics and probability so that all students:
(a) pose questions and collect, organize and represent data to answer those questions.
(b) interpret data using methods of exploratory data analysis.
(c) develop and evaluate inferences, predictions and arguments that are based on data.
(d) understand and apply basic notions of chance and probability.

(St.6) Problem solving: Mathematics programs should focus on solving problems as part of understanding mathematics so that all students:
(a) build new mathematical knowledge through their work with problems.
(b) develop a disposition to formulate, represent, abstract and generalize in situations within and outside mathematics.
(c) apply a wide variety of strategies to solve problems and adapt the strategies to new situations.
(d) monitor and reflect on their mathematical thinking in solving problems.

(St.7) Reasoning and Proof: Mathematics programs should focus on learning to reason and construct proofs as part of understanding mathematics so that all students:
(a) recognize reasoning and proof as essential and powerful parts of mathematics.
(b) make and investigate mathematical conjectures.
(c) develop and evaluate mathematical arguments and proofs.
(d) select and use various types of reasoning and methods of proof as appropriate.

(St.8) Communication: Mathematics programs should use communication to foster understanding of mathematics so that all students:
(a) organize and consolidate their mathematical thinking to communicate with others.
(b) express mathematical ideas coherently and clearly to peers, teachers and others.
(c) extend their mathematical knowledge by considering the thinking and strategies of others.
(d) use the language of mathematics as a precise means of mathematical expression.

(St.9) Connections: Mathematics programs should emphasize connections to foster understanding of mathematics so that all students:
(a) recognize and use connections among different mathematical ideas.
(b) understand how mathematical ideas build on one another to produce a coherent whole.
(c) recognize, use and learn about mathematics in contexts outside mathematics.

(St.10) Representation: Mathematics programs should emphasize mathematical representations to foster understanding of mathematics so that all students:
(a) create and use representations to organize, record and communicate mathematical ideas.
(b) develop a repertoire of mathematical representations that can be used purposefully, flexibly and appropriately.
(c) use representations to model and interpret physical, social and mathematical phenomena.

IV. The Swedish "ADM"-Project (Bjork and Brolin, 1998).

The ADM-project is a research and development project for the analysis of the consequences of the computer for mathematics education, initiated at the department of teacher training at the University of Uppsala in Sweden. In these experimental materials, for secondary school calculus, the amount of time for skill development and procedural knowledge was reduced in favor of conceptual knowledge and enhancing a problem-solving learning environment. Computers, and later on graphing calculators, were used to perform all routine operations in the analysis of graphs and functions. In a longitudinal study (1987-92), the results indicated that the use of computing and graphing technology in calculus courses can have many positive effects when compared to traditional paper-and-pencil methods. In particular, students will be better problem solvers, have a deeper and richer understanding of fundamental concepts, be better able to model word problems with functions, to interpret given functions and equations and to change between different representations, and more often use their own methods for solving problems. In 1996/97 the ADM project launched a TEMA (Technology in Mathematics) study.
Secondary school teachers assessed the changes in the new courses and called for:
(a) less emphasis on exact integration and curve construction using derivatives.
(b) greater emphasis on problem solving, discussion, reporting solutions, lines of thought, understanding concepts, using and interpreting derivatives, setting up and interpreting integrals, properties of families of functions...

V. An Australian Curriculum and Standards Framework (CSF) (Board of Studies, 1995).

This framework is a policy about mathematics education for the eleven years of schooling in the State of Victoria, Australia. Its content is adopted from the Australia-wide national profiles. The CSF provides an outline of the mathematics curriculum and leaves the schools responsible for detailed development and delivery. It encompasses: goals, activities, content as structured into strands and substrands, learning outcomes expected at each level, and guidelines on approaches to teaching and learning, in addition to time allocation for the strands at different levels.

Content: The content is structured in the following strands and substrands:
(a) Space: interpreting, drawing and making, location, shapes, transformation.
(b) Number: number, counting and numeration, mental computation and estimation, written computation, applying numbers, number patterns and relationships.
(c) Measurement: choosing units, measuring, estimating, time, using relationships.
(d) Chance and Data: chance, posing questions and collecting data, summarizing and presenting data, interpreting data.
(e) Algebra: expressing generality, equations and inequalities, function.
(f) Mathematical Tools and Procedures: mathematical tools, communicating mathematics, strategies for mathematical investigation, contexts of mathematics.

Access to Technology: The CSF places clear emphasis upon sensible use of technology in: concept development, problem solving, modelling and investigative activities. It encourages schools to ensure that calculators and computers are available for mathematics lessons. Four-function or scientific calculators are recommended to all students. Schools are to make graphing calculators available at levels 6 and 7. Improved access to computer resources is necessary: a free-standing computer with an overhead projector in each class, computer labs and a range of appropriate software.

Competencies and Learning Outcomes: The following is a summarized example of the learning outcomes expected by the end of the first level from each of the five strands, such that children can:

1. (Space): Draw, build and describe shapes and objects that they see and handle, note simple similarities and inferences, match congruent shapes, recognize symmetry in pictures, follow and give directions of position and movement.

2. (Number): Make, count, record and estimate small collections of objects and order and compare them, relate numbers using part-whole imagery, deal with numbers, copy, continue and devise repeating and counting patterns, recall simple facts, count forwards and backwards to make simple mental calculations, represent number stories using materials and drawings, exchange money for goods in play situations.

3. (Measurement): Use everyday language to describe, order and compare length, mass and capacity for familiar objects, compare length and capacity by repeated use of informal units, understand the purpose of clocks and relate time to familiar recurring events, link the days of the week and months of the year with events in their lives.

4. (Chance and Data): Recognize elements of chance in familiar situations, collect and classify objects, pose questions and represent information to make comparisons.
5. (Mathematical Tools): Recognize ways in which mathematics is part of their family's everyday life, communicate and discuss mathematical ideas in natural language, explore and test conjectures about problems that arise in their everyday experience, detect and correct inconsistencies in simple patterns, reassess non-numerical estimates of size, use calculators to represent numbers and explore counting.

A Scenario For Change In Mathematics Education (Case study: Egypt).

Guiding and Controlling Rules:
1. Follow a holistic perspective, away from fragmentation and piecemeal changes.
2. Consider the complexities regarding school buildings, classroom densities, teachers' reactions and competencies, centralized curriculum development, line authority, physical facilities and the flow of increasing student enrolment in all stages, etc.
3. Learn from past experiences, whether they failed or succeeded.
4. Benefit from others' experiences and innovative projects and the pathways to smooth implementation.
5. Simulate the realities using systems analysis.
6. Share and interact in dialectical dialogues with mathematicians, mathematics educators, teachers, students, parents, consumers and users of mathematics.
7. Look for policies, rather than politics, in the process of change so as to serve the society's current and future real challenges and needs.
8. Avoid generalization before scientific experimentation and formative evaluation.
9. Consider the cost and benefit expectations in the light of the hard equation of financing and the obligation for free education.
10. Be aware of consistency among different levels of decision making. Avoid passive or anti-reform executives through convincing dialogues.

Guiding Features For Change:
1. Mathematics instruction should free itself from the classic taxonomy of Bloom and shift to a standards- and outcomes-based philosophy.
2. Levels of achievement ought to be raised to the international benchmarks.
3. Soften centralized curriculum development by adopting a "core" mathematics program which covers 60-80% of the allocated time and is mandated all over the country. Leave the rest to be differentiated by the educational zones so as to contextualize and societalize it to local situations.
4. Delete routine skills and operations along with increasing sensible use of technologies.
5. Incorporate new mathematics concepts at relevant levels. Examples come from data analysis, sampling techniques, probability concepts and new applications, linear programming, game theory, graph theory, topological maps, operational research, patterns, recursions and fractal geometry which reflect the aesthetics of mathematics, history of mathematical discoveries and roles of mathematicians including Egyptians and Arabs, and discrete mathematics wherever a quantity is to be counted.
6. Add integrated societal application modules by the end of each grade to address the mathematics of the farm, the building, the factories, the family budgeting, etc., as related to local communities, environments and ecological situations.
7. The students' use of technology is to go beyond just "Pick and Click". It must be interactive, aiming at development of concepts, discovering relations and verifying generalizations.
8. Create a new culture in the teaching-learning classroom environment, so as to avoid the culture of "talk and chalk" by the solo "sage on the stage". Staff development is an urgent and pre-requisite necessity.
9. Create a center for producing relevant mathematics software in the Arabic language.
10. Mathematical knowledge is not a sort of sport to be watched or imitated, nor is it a mere script to be transmitted; rather, it consists of mental processes, activities and adventures of the human mind, and psychometric skills which need to be constructed through productive actions. Its main media for development ought to be very close to real life situations and problems, targeted to career preparation and self realization. In general, change ought not to be merely content-driven.

Guidelines For Pathways to Change:
1. Sustainable change for reform must be institutionalized, not dependent merely on an initiative from the top authority. It must be directed to foster classroom work.
2. Change has to pass through four main phases: initiation, pilot experimentation and evaluation, implementation and follow-up, and dissemination with flexible continuity.
3. There must be a programmed time schedule until changes reach classrooms.
4. A strategy has to be chosen from alternatives in the light of governing rules and possible intervening factors. Literature in this context projects two influential variables: (1) the degree and intensity of the aimed change, (2) the extent to which the educational community is ready to accept the intended change. This of course goes along with the availability of relevant facilities. Within these variables, one of the following four pathways can be taken:
(a) Successive Development: that is, to implement certain parts of the change, one after the other, all over the country. In time, complete change will be implemented at large.
(b) Increasing Expansion: that is, to implement the complete change in a limited number of schools which can be gradually expanded.
(c) Cautious Change: that is, to do partial reforms, or implement some innovative projects in a limited number of schools without specific plans for dissemination or generalization on a large scale.
(d) Pilot Experimentation: that is, to experiment with some facets of reform in some schools, without having enough facilities or sufficient support. Thus it will depend on convenient circumstances, to be done now and then, here or there.

In general, it may not be easy to make the appropriate choice without being engaged openly and creatively with the context of the total view of the intended paradigm shift in mathematics education, so that the core issues can be identified. This needs democratic sharing, professional preparation, human and material resources, and courage, perseverance and faith on the part of the leadership.

REFERENCES
(1) Birk, Lars-Eric and Brolin (1998): "Which Traditional Algebra and Calculus Skills are Still Important?" A paper presented at the Fourth UCSMP International Conference on Mathematics Education, University of Chicago, August 1998, U.S.A.
(2) Board of Studies (1995, 96): "Curriculum and Standards Framework, Mathematics"; "Mathematics Study Design", Victoria, Australia.
(3) Cockcroft, W.H. (1982): "Mathematics Counts", Report of the Committee of Inquiry into the Teaching of Mathematics, HMSO, London, U.K.
(4) Dainton, F.S. (1968): "Enquiry into the Flow of Candidates in Science and Technology into Higher Education", HMSO, London.
(5) David, Jr. E. (1984): "Renewing U.S. Mathematics: Critical Resource for the Future", National Academy, Washington, D.C., U.S.A.
(6) Davis, B., Porta and Uhl, J. (1998): "Is the Mathematics we Teach the Same as the Mathematics we Do?" A paper distributed at the Roskilde University Conference in 1997, Denmark.
(7) Ebeid, William (1999): "Current and Future Trends in Learning and Teaching Mathematics"; MOE, Dubai, United Arab Emirates.
(8) Ebeid, William (1999): "Societal Mathematics as a Futuristic Trend", International Conference on "Mathematics Education into the 21st Century" (Rogerson ed.), Cairo, Egypt, Nov. 1999.
(9) Ebeid, William (1998): "Enrolment in Mathematics, Problems and Aspirations in Kuwait University", in Proceedings of the Conference on "Justification and Enrolment Problems Involving Mathematics or Physics", Roskilde University, Denmark.
(10) Ebeid, William (1999): "Mathematics for All in Egypt: Adoption and Adaptation", in Proceedings of the Fourth UCSMP Conference (1998), Chicago, U.S.A.
(11) Ebeid, William (1996): "Difficulties in Learning Geometry", 8th ICMI, Seville, Spain.
(12) Elaisawy and others (1999): "The Theoretical and Methodological Bases for "Egypt 2020" Scenarios" (in Arabic), Third International Forum, Cairo, Egypt.
(13) Er-Shing, Ding (1998): "Mathematics Reform Facing the New Century in China", in Proceedings of UCSMP Conference, op. cit.
(14) Ernest, Paul (1998): "Why Teach Mathematics?", in Justification Conference, op. cit.
(15) Freudenthal, Hans (1971): "Geometry Between Devil and the Deep Blue Sea", in Educational Studies in Mathematics, Vol. 3, No. 3-4, Boston, Mass., U.S.A.
(16) Howson, G. (1991): "National Curricula in Mathematics", Leicester, The Mathematical Association, U.K.
(17) Jeffery, J. (1988): "Technology Across the Curriculum", Exeter School of Education, Exeter, U.K.
(18) Jensen, J., Niss and Wedege, T. (1998): "Introduction", in the Justification Conference, op. cit.
(19) Jorgensen, Bent (1998): "Mathematics and Physics Education in Society", in the Justification Conference, op. cit.
(20) Jorgensen, Bent (1998): "Mathematics and Physics Education in Society", in the Justification Conference, op. cit.
(21) Kahan, J.P. (1998): "Mathematics and Higher Education Between Utopia and Realism", in the Justification Conference, op. cit.
(22) London Mathematical Society (1995): "Tackling the Mathematical Problem", LMS, IMA and RSS, Burlington House, Piccadilly, London.
(23) Morris and Arora (eds.) (1992): "Moving into the Twenty First Century", Studies in Mathematics Education, Vol. 8, UNESCO, Paris.
(24) NCTM (1980): "An Agenda for Action: Recommendations for School Mathematics of the Eighties", NCTM, Virginia, U.S.A.
(25) Skemp, R. (1971): "The Psychology of Learning Mathematics", Penguin, Harmondsworth, U.K.
(26) Standards 2000 Group (1998): "Principles and Standards for School Mathematics: Discussion Draft", NCTM, Virginia, U.S.A.
(27) Usiskin, Zalman (1999): "Is there a Worldwide Mathematics Curriculum?", in the UCSMP Conference (August 1998), op. cit.
(28) Volmink, John (1999): "School Mathematics and Outcomes-Based Education - A View From South Africa", in the UCSMP Conference, op. cit.
(29) Winslow, Carl (1998): "Justifying Mathematics as a Way to Communicate", in the Justification Conference, op. cit.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 41-52)

EINSTEIN'S THEORY OF SPACETIME AND GRAVITY

JÜRGEN EHLERS
Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, Am Mühlenberg 1, 14476 Golm, Germany

INTRODUCTION

The general theory of relativity (GR), created almost single-handedly by Albert Einstein between 1907 and 1915, deals with both gravitation and spacetime structure. While in Newtonian physics and in special relativity theory the geometry of spacetime - its topological, affine and metric structure - is assumed to be independent of whatever physical processes are taking place, according to GR spacetime carries a dynamical metric which interacts with matter, and this leads to the phenomena of gravitation. The name geometrodynamics, coined by John Archibald Wheeler, indicates the contents of Einstein's theory more adequately than the dry, traditional term "general relativity theory". Since some spacetime structure so far enters all descriptions of matter, whether classical or quantum, GR is considered, besides quantum theory, as one of the foundations on which our understanding of the physical world rests.

In the first part of the following survey I have tried to describe the mathematical and physical concepts and assumptions on which GR is based. In particular, I have tried to state the two ideas which guided Einstein on his way towards GR, the principle of general invariance and the principle of equivalence, in a modern, mathematically clean way without sacrificing their intuitive meanings, and I have recalled reasons why Einstein's field equation cannot easily be modified. The second part is devoted to a qualitative account of some major developments of the theory which have led to several definitive insights and whose unsolved problems are related to current research.

The literature about GR is vast. The bibliography given below only refers to some of the articles which I consider as sign-posts along the road GR has taken during its eighty-five-year journey. More detailed accounts of these and other developments with extensive references can be found in [36].

I. FOUNDATION

1. As in all successful physical theories so far, spacetime, the set of events "here-now", is represented mathematically as a four-dimensional, smooth, differentiable manifold $M$, and physical objects and processes are described in terms of fields on $M$. In general relativity, in contrast to Newtonian physics or special relativity, all fields - even those specifying the geometric and kinematic structure of spacetime - are assumed to be dynamical, i.e., influencing and influenced by other fields. Therefore, there do not exist preferred coordinates on $M$ which assign universal values to all components of some ("absolute" or "passive") fields in all charts of an atlas. (In Newtonian physics and special relativity the metrics assigned to spacetime are described by absolute fields.) Thus, coordinates generally only serve to "name" points and to express the differential topology of $M$, and all field laws must be expressed in generally covariant form. In physics, theories of this kind, i.e., containing no absolute fields, are said to obey the principle of general invariance [1], [2].

2. One field, assumed to be present always, is a Lorentzian metric $g$, a nondegenerate, 2-covariant, symmetric tensor with signature $(-,+,+,+)$. It defines on each tangent space to $M$ an inner product

$$X \cdot Y = g_{\alpha\beta} X^\alpha Y^\beta. \tag{1}$$
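A minimal sketch of how (1) can be evaluated, assuming the flat metric diag(-1, 1, 1, 1) of special relativity as a stand-in for $g$ at a single event (units with c = 1). The helper names `inner` and `causal_type` are illustrative and not taken from the text; the sign of $X \cdot X$ is what the classification into timelike, null and spacelike vectors rests on.

```python
# Sketch of the inner product (1) for an assumed flat metric g = diag(-1, 1, 1, 1).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def inner(x, y, g=ETA):
    """Return X . Y = g_ab X^a Y^b for two lists of four components."""
    return sum(g[a][b] * x[a] * y[b] for a in range(4) for b in range(4))


def causal_type(x, g=ETA, tol=1e-12):
    """Classify a vector by the sign of X . X (the zero vector counts as spacelike)."""
    s = inner(x, x, g)
    if s < -tol:
        return "timelike"
    if abs(s) <= tol and any(abs(c) > tol for c in x):
        return "null"
    return "spacelike"


if __name__ == "__main__":
    for vec in ([1.0, 0.5, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]):
        print(vec, causal_type(vec))
```

In a curved spacetime the same function applies pointwise, with `g` replaced by the metric components at the event in question.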

A vector $X$ is called timelike if $X^2 < 0$, lightlike or null if $X \neq 0$ and $X^2 = 0$, and spacelike otherwise; a similar terminology is applied to curves in $M$. The lightlike vectors at an event of $M$ form the null cone. The motion of a point particle is represented by a timelike curve, and the arc length

$$\int \left(-g_{\alpha\beta}\, dx^\alpha dx^\beta\right)^{1/2} \tag{2}$$

along such a curve is called its proper time. Under appropriate conditions an undisturbed, good clock - e.g., an atomic clock - measures proper time, irrespective of its motion and its environment. This experimentally well-supported, basic assumption is the most direct way to assign a physical meaning to the metric. It replaces Newton's apparently evident axiom concerning the existence of an absolute, universal time. "If one drops the connection of ds to the measurement of ...time, then relativity loses all its empirical basis" (A. Einstein in a letter to H. Weyl, April 4, 1918 [3]).

The metric defines a canonical isometry between tangent and cotangent spaces. Through this isometry and in terms of its connection the metric enters virtually all descriptions of matter, as will be seen in equations (8)-(11) below and also in the standard model alluded to in connection with equation (13). This role of the metric is structurally more fundamental than its phenomenological role in defining proper time. In fact, the latter role follows as an approximation from the former one, given a dynamical, classical or quantum mechanical model of an object which can serve as a standard clock.

3. The metric uniquely determines a symmetric, linear connection on $M$ which relates different tangent spaces to each other by an inner-product preserving parallel transport along curves, expressed by

$$\frac{dV^\alpha}{du} + \Gamma^\alpha_{\beta\gamma} V^\beta \dot{x}^\gamma = 0, \tag{3}$$

$$2\Gamma^\alpha_{\beta\gamma} = g^{\alpha\delta}\left(g_{\delta\beta,\gamma} + g_{\delta\gamma,\beta} - g_{\beta\gamma,\delta}\right). \tag{4}$$
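Formula (4) can be tried out numerically. The sketch below is an illustration added here, not part of the original text: it assumes a diagonal metric (so that the inverse is taken entrywise), estimates the partial derivatives $g_{\alpha\beta,\gamma}$ by central differences, and uses the unit 2-sphere with coordinates $(\theta, \varphi)$ as a test case. That test metric is Riemannian rather than Lorentzian, but (4) has the same form in both cases. All function names are hypothetical.

```python
import math


def sphere_metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi); diagonal by assumption."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def d_metric(metric, x, c, h=1e-5):
    """Central-difference estimate of g_{ab,c}, returned as a matrix indexed [a][b]."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)]


def christoffel(metric, x):
    """Gamma^a_{bc} from formula (4), valid here for diagonal metrics only."""
    n = len(x)
    g = metric(x)
    g_inv = [1.0 / g[a][a] for a in range(n)]         # inverse of a diagonal metric
    dg = [d_metric(metric, x, c) for c in range(n)]   # dg[c][a][b] = g_{ab,c}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                # Only d = a contributes to g^{ad}(...) because g^{ad} is diagonal.
                gamma[a][b][c] = 0.5 * g_inv[a] * (
                    dg[c][a][b] + dg[b][a][c] - dg[a][b][c]
                )
    return gamma


if __name__ == "__main__":
    gam = christoffel(sphere_metric, [1.0, 0.3])
    print("Gamma^theta_{phi phi} =", gam[0][1][1])
    print("Gamma^phi_{theta phi} =", gam[1][0][1])
```

For the sphere the two independent non-zero components are known in closed form, $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$, which gives a simple check on the finite-difference result.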

In (3) and (4), $x^\alpha(u)$ denotes the curve, $\dot{x}^\gamma$ its tangent, and $V^\alpha(u)$ the transported vector. (A comma denotes partial differentiation; $g^{-1} = (g^{\alpha\beta})$ denotes the inverse of $g = (g_{\alpha\beta})$.) The functions $\Gamma^\alpha_{\beta\gamma}$ are called the components of the connection. The transport (3) in general depends on the connecting curve. The connection defines (and is equivalent to) a covariant derivative operator $\nabla$ acting on vector fields and, more generally, on tensor fields, and serves to formulate covariant differential equations. Such equations therefore depend, via the connection, on the metric and its first derivative. This fact expresses in GR the universal influence of gravity on fields and particles. A curve whose tangent vector is parallel along the curve is called a geodesic. A timelike geodesic represents a freely falling test particle, a lightlike one represents a freely propagating light ray. The union of all light rays through an event $E$ is called the light cone $C_E$ of $E$.

4. Let $x^\alpha(u, \varepsilon)$ be a one-parameter family of geodesics so that, for small $\varepsilon$, $x^\alpha(u, \varepsilon)$ is close to the "central" geodesic $x^\alpha(u, 0) = x^\alpha(u)$. Then $\varepsilon\, \partial_\varepsilon x^\alpha(u, 0) = \varepsilon r^\alpha$ may be thought of as connecting the central geodesic to its neighbours. $r^\alpha$ is called a Jacobi vector; it obeys the equation of geodesic deviation [4],

$$\frac{D^2 r^\alpha}{du^2} = R^\alpha{}_{\beta\gamma\delta}\, \dot{x}^\beta \dot{x}^\gamma r^\delta, \tag{5}$$

where $D/du$ denotes the covariant derivative along the central geodesic. (Conversely, given a solution $r^\alpha$ to (5) on a geodesic, there exists a family of nearby geodesics as above.) The curvature tensor $R^\alpha{}_{\beta\gamma\delta}$ on the r.h.s. of (5) is characterized by that equation. It is determined by the metric and its first and second derivatives. If the geodesics involved in obtaining eq. (5) represent freely falling test particles (or light rays), that equation shows that these particles are relatively accelerated (these light rays are bent relative to each other). If the curvature tensor were zero, eq. (5) would contain an invariant, differential version of the law of inertia, assumed to hold in special relativity. In GR, $R^\alpha{}_{\beta\gamma\delta}$ is seen to be a pointwise measure of the direction-dependent strength of the gravitational field. The tensor

$$G_{\alpha\beta} = R^\gamma{}_{\alpha\gamma\beta} - \frac{1}{2}\, g_{\alpha\beta}\, R^{\gamma\delta}{}_{\gamma\delta} \tag{6}$$

is called the Einstein tensor. It satisfies the contracted Bianchi identity

$$\nabla_\beta G^{\alpha\beta} = 0 \tag{7}$$

which originates in the coordinate-independence of the function $g_{\alpha\beta} \mapsto G_{\alpha\beta}$.

5. In GR, the laws of (classical) physics are expressed as covariant partial differential equations imposed on the metric and on matter variables. One distinguishes matter laws, also called non-gravitational laws, from the gravitational field equation. The matter laws contain, besides matter variables, the metric and, via covariant derivatives, the connection, but not the curvature, and therefore they have the same tensorial (or spinorial) form as in special relativity. This statement may be considered as an exact reformulation of Einstein's principle of equivalence [5]. It permits one to generalize matter laws from special to general relativity and ensures that the special theory holds approximately in regions which are small compared to the length and time scales set by the curvature field. Examples of matter laws are Maxwell's (generalized) equations for electromagnetic fields in vacuum,

$$\nabla_{[\alpha} F_{\beta\gamma]} = 0, \qquad \nabla_\beta F^{\alpha\beta} = 0; \tag{8}$$

the Maxwell-Boltzmann-Vlasov equation for a gas of freely falling particles,

$$p^\alpha \frac{\partial f}{\partial x^\alpha} + \Gamma^\gamma_{\alpha\beta}\, p^\beta p_\gamma\, \frac{\partial f}{\partial p_\alpha} = 0, \tag{9}$$

in which $f(x^\alpha, p_\alpha) \geq 0$ denotes the phase space density of the particle trajectories; and the GR-version of the equations of motion of an Eulerian ideal fluid,

$$\nabla_\beta T^{\alpha\beta} = \nabla_\beta \left\{ (\rho + p)\, U^\alpha U^\beta + p\, g^{\alpha\beta} \right\} = 0, \tag{10}$$

containing the energy density $\rho$, the pressure $p$ and the 4-velocity $U^\alpha$ of the fluid. As in the last example (10), so also in the other cases (8) and (9) one can form a (symmetric) energy tensor $T^{\alpha\beta}$ from the basic matter variables which obeys, as a consequence of the matter laws, the energy-momentum conservation law

$$\nabla_\beta T^{\alpha\beta} = 0. \tag{11}$$
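To make the fluid energy tensor in (10) concrete, the following sketch builds $T^{\alpha\beta} = (\rho + p)U^\alpha U^\beta + p g^{\alpha\beta}$ at a single event and evaluates the energy density $T_{\alpha\beta}O^\alpha O^\beta$ seen by an observer with 4-velocity $O^\alpha$. It assumes a flat metric diag(-1, 1, 1, 1), units with c = 1, and sample values of $\rho$ and $p$ chosen purely for illustration; the function names are hypothetical.

```python
import math

# Assumed flat metric; for diag(-1, 1, 1, 1) the inverse has the same components.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def fluid_energy_tensor(rho, p, u, g_inv=ETA):
    """Contravariant components T^{ab} = (rho + p) U^a U^b + p g^{ab}, as in (10)."""
    return [[(rho + p) * u[a] * u[b] + p * g_inv[a][b] for b in range(4)]
            for a in range(4)]


def lower(vec, g=ETA):
    """Lower an index: v_a = g_{ab} v^b."""
    return [sum(g[a][b] * vec[b] for b in range(4)) for a in range(4)]


def energy_density(t_up, observer, g=ETA):
    """T_{ab} O^a O^b, computed as T^{ab} O_a O_b for a unit timelike observer O."""
    o_low = lower(observer, g)
    return sum(t_up[a][b] * o_low[a] * o_low[b] for a in range(4) for b in range(4))


def boosted_observer(v):
    """4-velocity of an observer moving with speed v along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [gamma, gamma * v, 0.0, 0.0]


if __name__ == "__main__":
    rho, p = 1.0, 1.0 / 3.0                      # sample values (radiation-like fluid)
    T = fluid_energy_tensor(rho, p, [1.0, 0.0, 0.0, 0.0])
    print("comoving observer:", energy_density(T, [1.0, 0.0, 0.0, 0.0]))
    print("observer at v = 0.6:", energy_density(T, boosted_observer(0.6)))
```

For a fluid at rest the comoving observer recovers $\rho$, while an observer moving with speed $v$ finds $\gamma^2(\rho + p v^2)$, which follows directly from contracting (10) with the boosted 4-velocity.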

(The occurrence of the covariant derivative in (11) prevents one from passing from (11) to an integral version of a local conservation law. Nevertheless there is an exact sense in which (11) expresses that material energy-momentum is preserved in the "infinitesimal" vicinity of an event ([4], ch. IV).)

6. The gravitational field equation, which Einstein communicated to the Prussian Academy of Sciences in Berlin on November 25, 1915 [6], states that the curvature of spacetime is related to matter by

$$G^{\alpha\beta} = 8\pi T^{\alpha\beta}. \tag{12}$$

This equation is the hard core of GR; its discovery marks one of the outstanding achievements of physics in the 20th century. Because of the identity (7), the total energy tensor appearing on the r.h.s. of (12) has to satisfy eq. (11), in accordance with the equivalence principle. If interactions between different kinds of matter are taken into account, the total energy tensor is not necessarily a sum of contributions belonging to the constituents; it may contain coupling terms. But in any case the total energy tensor has to obey eq. (11).
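The step from the identity (7) to the conservation law (11) can be written out in one line, assuming only that the field equation (12) holds with the constant factor $8\pi$:

```latex
\nabla_\beta T^{\alpha\beta}
  = \frac{1}{8\pi}\, \nabla_\beta G^{\alpha\beta}
  = 0 .
```

Read in this direction, (11) is a necessary condition on any admissible source: a matter model whose total energy tensor violated (11) could not be coupled consistently to (12).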

The only tensor-valued function $V^{\alpha\beta}(g, \partial g, \partial^2 g)$ which satisfies $\nabla_\beta V^{\alpha\beta} = 0$ identically (in four dimensions) is a linear combination of $G^{\alpha\beta}$ and $g^{\alpha\beta}$ [7]. Thus, apart from a cosmological term $+\Lambda g^{\alpha\beta}$ which may be transferred to the r.h.s. and considered as a (vacuum) contribution to $T^{\alpha\beta}$, there is no choice but to accept eq. (12), provided the field equation is required to contain the metric as the only geometrical, respectively gravitational, field, and that at most in second differential order. (Linearity in the second derivatives, quadratic dependence on [...]

$$A_D[g, \ldots] = -\frac{1}{16\pi} \int_D G[g]\, dV + \int_D L[g, \ldots]\, dV \qquad [8], [9], [10] \tag{13}$$

in which $G$ denotes the trace of the Einstein tensor. The action density $L$ of matter depends on $g$ and $\partial g$, but not on higher derivatives of $g$. Dots indicate matter variables, $dV$ denotes the volume element associated with the metric, and $D$ is a compact domain of $M$. The energy tensor can be obtained from the action density as a variational derivative,

$$T^{\alpha\beta} = \frac{2}{\sqrt{-g}}\, \frac{\delta\left(\sqrt{-g}\, L\right)}{\delta g_{\alpha\beta}}.$$

The action functional (13) connects GR with the classical, geometrical aspect of the standard model of particle physics [11]. The matter fields of that model are defined not on $M$ itself, but on a principal fibre bundle over $M$ with structure group $U(1) \times SU(2) \times SU(3)$; the form of the action expresses internal gauge symmetries of the fields. The action is also the basis of several Hamiltonian reformulations of the field equations of GR in terms of the metric or other, new variables, which in turn form the starting points for attempts to quantize gravity [12], [13].

7. To draw any conclusion from the field equation (12), something has to be assumed about the energy tensor. The weakest requirements are energy positivity conditions. The weak energy condition, e.g., says that the energy $T_{\alpha\beta} U^\alpha U^\beta$ is non-negative for all timelike vectors ("observers") $U^\alpha$. It ensures that under quasistationary conditions the field equation mimics attractive forces. The strongest (local) condition is perhaps that $T^{\alpha\beta}$ should be constructed from matter variables and $g_{\alpha\beta}$ in such a way that the differential system formed by the matter equations and the gravitational equation admits a well-posed Cauchy initial value problem, with evolution equations whose characteristic rays are contained in the (closed) null cone of the metric [14]. This requirement expresses, at the classical level, Einstein causality. This condition is satisfied in the three examples mentioned in section 5, provided that in the fluid case there is an equation of state such that the adiabatic sound speed does not exceed the speed of light. The "Cauchy requirement" imposes a non-trivial restriction on GR matter laws obtained via the equivalence principle from the special theory. For example, the standard SR equations for free fields with spin $s > 1$ do not satisfy it.

8. According to the preceding summary of GR, a model of some physical process is represented by a diffeomorphism-equivalence class of structures $\{M, g, \text{matter variables}\}$ obeying eq. (12) and matter laws. This assertion includes that observable relations and measurable numbers are invariant statements, not dependent on the choice of coordinates.

9. In the transition from SR to GR, Lorentz ($SO(1,3)$) invariance, referring to tangent spaces, is preserved; hence the spin of fields remains well-defined, and eq. (12) implies that the gravitational field has spin 2. On the other hand, translation invariance is lost and generalized to diffeomorphism invariance. This is related to the loss, or rather weakening, of the law of local energy-momentum conservation mentioned in section 5. It appears that the Poincaré group has a well-defined role in GR as a global asymptotic symmetry group for a certain class of asymptotically flat spacetimes (H. Friedrich, private communication). Whether under some conditions an action of this group on spacetime fields exists is an open question.

10. So far, essentially local properties have been considered here. Global properties sometimes imposed include time and/or space orientability, existence of a spinor structure, existence of a foliation of spacetime by spacelike, possibly compact slices (in cosmology particularly), global hyperbolicity, asymptotic Minkowskian or de Sitter or anti-de Sitter behaviour at some kind of infinity attached to $M$, and non-extendability [15]. It is an open, and in part philosophical, question whether some of these properties should be taken as axioms of GR.

11. To end this summary concerning the foundations of GR, I mention that it is possible to reformulate the laws of GR in such a manner that the laws of Newtonian theory, reformulated in spacetime language, arise in a well-defined sense as degenerate limits of those of GR [16].

II. SOME DEVELOPMENTS, RESULTS AND PROBLEMS

12. One way to gain insight into the properties of solutions of the field equation (12) is to study the initial value problem for those equations. This problem was raised by David Hilbert [17]. Steps to solve it were taken by Georges Darmois (1923, 1927), and thanks to the researches of André Lichnerowicz (1940...) and later authors the local problem, with or without sources, is now well understood, and some global results have also been obtained. (For a recent survey see [14].)

The initial value problem in GR differs in three essential respects from the corresponding ones in the standard theory of partial differential equations. The first novelty is that not only the fields have to be determined from initial data, but also the domain of definition of these fields, the spacetime manifold $M$, has to be found. Secondly, uniqueness of tensorial equations without a given background metric has to be understood as "uniqueness up to diffeomorphisms" [17]. (Failing to recognize this was a major reason why Einstein, on his arduous way towards GR, temporarily gave up general invariance "with a heavy heart".) Thirdly, the Einstein equations - coupled to matter equations or not - consist of constraint equations and evolution equations. The former impose conditions on the initial data, the state of the field at one instant of time, while the latter determine the fields later (and earlier).

In the case of the vacuum field equation, $G_{\alpha\beta} = 0$, one therefore proceeds as follows. One first solves, on a 3-manifold $S$, the (non-linear, elliptic) constraint equations. Thus one obtains a set $(S, h_{ab}, k_{ab})$ where $h_{ab}$ is a Riemannian metric and $k_{ab}$ is a symmetric tensor field on $S$. Such an initial data set is determined by four free functions on $S$, which in the terminology of physics correspond to 2 degrees of freedom of the gravitational field, as expected for a massless, helicity 2 field. Next, one chooses gauge conditions which provide, combined with those components of $G_{\alpha\beta} = 0$ which are not constraints, a hyperbolic system of evolution equations. The previously determined data are then evolved according to those equations, providing a metric $g_{\alpha\beta}$ on some 4-manifold $M$ containing $S$ as a spacelike submanifold on which $h_{ab}$ and $k_{ab}$ turn out to be the intrinsic metric and extrinsic curvature of $S$ in $(M, g)$. Such a solution is called a Cauchy development of $(S, h, k)$. On $M$, any non-spacelike curve intersects $S$ exactly once; $S$ is then called a Cauchy hypersurface.
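The count of "four free functions" can be made plausible by a standard bookkeeping argument, given here as a heuristic sketch rather than as part of the original text. The symmetric fields $h_{ab}$ and $k_{ab}$ have six independent components each; the four constraint equations remove four of them, and the freedom to choose coordinates (three on $S$, plus the choice of the slice $S$ within spacetime) accounts for four more:

```latex
\underbrace{6}_{h_{ab}} \;+\; \underbrace{6}_{k_{ab}}
  \;-\; \underbrace{4}_{\text{constraints}}
  \;-\; \underbrace{4}_{\text{coordinate freedom}}
  \;=\; 4
  \;=\; 2 \times \underbrace{2}_{\text{degrees of freedom}}
```

The factor 2 in the last step reflects that each degree of freedom requires two functions as initial data, a configuration variable and its conjugate momentum.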
The procedure just sketched has been carried out in several variants, and the following theorem has been established [18]: An initial data set $(S, h, k)$ determines a geometrically unique, maximal Cauchy development $(M, g)$, i.e., one which is not contained in another such development. An analogous statement presumably holds for the general case when eq. (12) is coupled to matter laws, provided the local Cauchy problem is well posed. (For technical remarks, see [14].)

The intrinsic characteristics of a solution $(M, g)$ to the vacuum field equation are its lightlike hypersurfaces. Hence, gravitational waves propagate with the fundamental speed $c = 1$, just like electromagnetic waves. Since the characteristics determine the domains of dependence ("Einstein causality"), it follows that according to GR, it is impossible to predict future events on the basis of observations. In contrast to Laplace's demon, however, an observer, having determined the state in a finite spatial region, can in principle test whether the laws of GR hold for the fields in the domain of dependence of that region, while according to Laplace the pitiable demon would have to know the instantaneous state on the whole infinite space before being able to predict or retrodict anything. It is this change of causal dependences to local ones which distinguishes field physics from Newtonian physics, based on instantaneous action at a distance.

A spacetime which admits a Cauchy hypersurface, i.e., which can be determined from data according to the theorem above, is said to be globally hyperbolic; its manifold is a product $M = \mathbb{R} \times S$ of a (topological) "time" and a 3-manifold $S$, "space". To be acceptable as a model of the universe, a solution to eq. (12) should be inextendable, since otherwise it would be just a part of a larger universe, a contradictio in adiecto. But then it might not be globally hyperbolic, hence not determined by any instantaneous state (initial data set). This type of "non-local indeterminism" does not seem to have been envisioned before the analysis of the Cauchy problem in GR. It is totally different from quantum indeterminism.

The theorem quoted above does not say anything about the "size" of the respective spacetime. One would like to know whether that size is limited by the occurrence of singularities (and which ones) or Cauchy horizons. In order for such questions to be significant it appears that the initial data sets should in some sense be complete, global objects, not parts of larger such objects, and free of singularities. Attempts to answer such questions form a rather recent, active field of investigation. (See, e.g., [19].)

One important global result on vacuum spacetimes is the following [20]: There exist "small, strongly asymptotically flat" initial data sets whose maximal Cauchy developments are geodesically complete spacetimes whose curvature tensors approach zero on any geodesic as the corresponding affine parameter tends to infinity. These singularity-free, inextendable solutions may be viewed as forming a neighbourhood of Minkowski spacetime in the space of vacuum solutions; in other words, sufficiently small, finite perturbations of Minkowski spacetime do not have singularities. (Geodesic completeness means that all geodesics can be extended to arbitrary values of their affine parameters.) Semi-global results of a similar kind have been established by H. Friedrich [21].

13.
An important class of problems for a theory of gravitation is the modelling of isolated systems such as an oscillating star or a system of n bodies separated by (nearly) empty space, treated as separated from the rest of the world [22]. In fact, all quantitative tests of the field equation (12) are based on models of isolated systems. The spacetime of an isolated system should resemble, far from the matter sources, Minkowski spacetime, whose curvature vanishes. The curvature tensor of a general spacetime can be decomposed into the Einstein tensor and a tensor which depends not on the metric g itself, but only on the field of null cones determined by g. The first part is pointwise related to the matter by the field equation (12), while the second part, the conformal curvature, is only partly and non-locally determined by the material sources.

Guided by these facts and by examples of spacetimes which clearly do represent an isolated body or a black hole, like the Schwarzschild and Kerr spacetimes [22], Roger Penrose had the happy idea of constructing conformal extensions of the Minkowski and Schwarzschild spacetimes. On this basis he proposed an elegant definition of asymptotically flat spacetimes [23], [24] which was later shown [21] to be compatible with radiative solutions to the field equation in a finite neighbourhood of (lightlike) infinity, that region into which gravitational waves emitted by matter travel.

An outstanding, open question is whether there exist solutions to the field equation which are asymptotically flat (in Penrose's or a similar sense) and contain physically reasonable sources like those mentioned in the first sentence of this section. At present, constructing such solutions from initial data is not feasible for at least two reasons. i) The asymptotic behaviour of the metric at spacelike infinity is not yet sufficiently understood to decide what are appropriate Cauchy data for such systems. ii) In GR, bodies can be idealized neither as massive points nor as rigid bodies; they have to be modelled as extended, deformable bodies. This means that the evolution of the surfaces of the bodies has to be controlled, which requires treating initial-boundary value problems for interior and exterior solutions and their matching. Work on both of these problems is progressing (see the report [14] and the references there). A third problem is that one would like to incorporate into the data that there is no (or not much) incoming radiation, and it is not known how that can be done. The solution to these problems may require the combination of analytic and numerical work, but so far the necessary interaction between these communities appears to be rather weak.

For the reasons indicated, physicists have resorted to approximation methods based on plausible assumptions and (at least formally) consistent iteration or expansion methods. By evaluating the conservation law (11), using an approximate metric (and its connection) related to matter by eq. (12), Thibault Damour and his collaborators have been able to derive equations of motion for the centres of nearly spherical bodies with respect to a flat "background" metric which is expected to asymptotically approximate the "physical" metric, albeit in a way which is not well understood. (See, e.g., [26] and [27].) These problems are becoming particularly relevant now because of the opportunity to observe and measure, within the next decade, gravitational radiation produced by nearly isolated systems.
In this context it is significant that exact statements about the total energy-momentum of an isolated system and about the amount of energy-momentum radiated to infinity were obtained between 1962 and 1982 [28]-[32]. In particular, the total energy of any non-flat isolated system has been shown to be strictly positive.

14. As mentioned already in several sections, light cones play a prominent role in GR. In curved spacetimes, light "cones" in general are not (except at the vertex) smooth, cone-like hypersurfaces as in flat spacetime. Rather, they have singularities: self-intersections and caustics. These singularities can be studied by means of the theory of singularities of maps as developed by Vladimir I. Arnold and collaborators. (For an introduction aimed at physicists, see, e.g., [33].) Observationally, these facts lead to the phenomena of gravitational lensing: several, more or less distorted, flux-magnified or diminished images of a light source (a star, a galaxy or a quasar) can be produced by a deflecting matter distribution along or near to the line of sight between "us" and the source. The theoretical possibility of such phenomena had been realized by Einstein already in 1912 and was rediscovered by others much later. Einstein had not published his results since he considered the phenomena too improbable ever to be observed. Only sixteen years after the discovery of very distant, bright sources of light, the quasars, the first observation of a double-imaged quasar occurred. Since then, studying such phenomena and using them to determine masses and mass distributions of stars, galaxies, clusters of galaxies and recently even dark matter concentrations [34] has become an active field of "applied GR" in astrophysics. (For an introduction and survey, see [35].) The research field of gravitational lensing is particularly attractive because it combines sometimes simple, sometimes sophisticated geometry with real observations which teach us something about the world "out there".

15. I should like to end this survey with a few remarks about the foreseeable future of research on gravitational physics. A new and more intense interaction between observers and theoreticians can be expected in the area of gravitational wave research and the related field of compact objects including black holes. The most fundamental theoretical task is and remains to find a theory which combines the successful concepts and laws of GR about gravity and spacetime structure with those of quantum theory, which has uncovered the strange microworld of particles and their interactions, and thereby to remove the "most glaring incompatibility of concepts" (Freeman Dyson) between present physical theories.

REFERENCES

[1] Anderson, J.L., Principles of Relativity Physics, Academic Press, New York (1967)
[2] Trautman, A., "The General Theory of Relativity", Uspekhi Fiz. Nauk 89, 3 (1966)
[3] Straumann, N., "Gauge Theory: Historical Origins of Some Modern Developments", Rev. Mod. Phys. 72, 1 (2000)
[4] Synge, J.L., General Relativity, North-Holland, Amsterdam (1960)
[5] Norton, J., "What Was Einstein's Principle of Equivalence?", p. 5 in Einstein and the History of General Relativity, D. Howard and J. Stachel (eds.), Birkhäuser, Basel (1989)
[6] Einstein, A., "Die Feldgleichungen der Gravitation", Sitzungsber. Preuß. Akad. Wissensch., Math.-Naturw. Kl. II, 844 (1915)
[7] Lovelock, D., "The Four-Dimensionality of Space and the Einstein Tensor", J. Math. Phys. 13, 874 (1972)
[8] Hilbert, D., "Die Grundlagen der Physik", Nachr. Ges. Wiss. Göttingen, 395 (1916)
[9] Lorentz, H.A., "On Einstein's Theory of Gravitation", Versl. Akad. Amsterdam 24, 1389 (1916); Proc. Acad. Amsterdam 19, 1341 (1916)
[10] Einstein, A., "Hamiltonsches Prinzip und allgemeine Relativitätstheorie", Sitzungsber. Preuß. Akad. Wiss., Math.-Naturw. Kl., 1111 (1916)
[11] Ticciati, R., Quantum Field Theory for Mathematicians, Encyclopedia of Mathematics and its Applications, vol. 72, ch. 15, Cambridge University Press (1999)
[12] Arnowitt, R., Deser, S. and Misner, C.W., "The Dynamics of General Relativity", p. 227 in Gravitation: An Introduction to Current Research, L. Witten (ed.), Wiley (1962)
[13] Ashtekar, A., Non-Perturbative Canonical Gravity, World Scientific, Singapore (1991)
[14] Friedrich, H. and Rendall, A., "The Cauchy Problem for the Einstein Equations", p. 127 in Einstein's Field Equations and Their Physical Applications, B.G. Schmidt (ed.), Lecture Notes in Physics 540, Springer, Berlin (2000)
[15] Geroch, R. and Horowitz, G.T., "Global Structure of Spacetimes", p. 212 in General Relativity, An Einstein Centenary Survey, S.W. Hawking and W. Israel (eds.), Cambridge University Press (1979)
[16] Ehlers, J., "Examples of Newtonian Limits of Relativistic Spacetimes", Class. Quantum Grav. 14, A119 (1997)
[17] Hilbert, D., "Die Grundlagen der Physik II", Nachr. Ges. Wiss. Göttingen 53, 61 (1917)
[18] Choquet-Bruhat, Y. and Geroch, R., "Global Aspects of the Cauchy Problem in General Relativity", Commun. Math. Phys. 14, 329 (1969)
[19] Rendall, A., "Local and Global Existence Theorems for the Einstein Equations", Living Reviews, article 1998-4. http://www.livingreviews.org
[20] Christodoulou, D. and Klainerman, S., The Global Nonlinear Stability of the Minkowski Space, Princeton University Press, Princeton (1993)
[21] Friedrich, H., "Einstein's Equation and Geometric Asymptotics", p. 153 in Gravitation and Relativity: At the Turn of the Millennium, N. Dadhich and J. Narlikar (eds.), Inter-University Centre for Astronomy and Astrophysics, Pune, India (1998)
[22] Ehlers, J. (ed.), Isolated Gravitating Systems in General Relativity, North-Holland Publ. Comp., Amsterdam (1979)
[23] O'Neill, B., Geometry of Kerr Black Holes, Peters, Wellesley, Mass. (1995)
[24] Penrose, R., "Zero Rest-Mass Fields Including Gravitation: Asymptotic Behaviour", Proc. Roy. Soc. Lond. A284, 159 (1965)
[25] Esposito, F.P. and Witten, L. (eds.), Asymptotic Structure of Spacetime, Plenum Press, New York (1977)
[26] Damour, T., "The Problem of Motion in Newtonian and Einsteinian Gravity", p. 128 in Three Hundred Years of Gravitation, S.W. Hawking and W. Israel (eds.), Cambridge University Press, Cambridge (1987)
[27] Blanchet, L., "Post-Newtonian Gravitational Radiation", in Einstein's Field Equations and Their Physical Applications, B.G. Schmidt (ed.), Lecture Notes in Physics 540, Springer, Berlin (2000)
[28] Bondi, H., van der Burg, M.G.J., and Metzner, A.W.K., "Gravitational Waves in General Relativity. VII. Waves from Axisymmetric Isolated Systems", Proc. Roy. Soc. A269, 21 (1962)
[29] Sachs, R.K., "Gravitational Waves in General Relativity. VIII. Waves in Asymptotically Flat Spacetime", Proc. Roy. Soc. A270, 103 (1962)
[30] Schoen, R. and Yau, S.T., "On the Proof of the Positive Mass Conjecture in General Relativity", Commun. Math. Phys. 65, 45 (1979); "Proof that the Bondi Mass is Positive", Phys. Rev. Lett. 48, 371 (1982)
[31] Witten, E., "A New Proof of the Positive Energy Theorem", Commun. Math. Phys. 80, 381 (1981)
[32] Horowitz, G.T. and Perry, M.J., "Gravitational Energy Cannot Become Negative", Phys. Rev. Lett. 48, 371 (1982)
[33] Ehlers, J. and Newman, E.T., "The Theory of Caustics and Wavefront Singularities with Physical Application", to appear in J. Math. Phys., June 2000
[34] Erben, T., van Waerbeke, L., Mellier, Y., Schneider, P., Cuillandre, J.-C., Castander, F.J., Dantel-Fort, M., "Mass Detection of a Matter Concentration Projected near the Cluster Abell 1942: Dark Clump or High Redshift Cluster?", Astron. Astrophys. 355, 23 (2000)
[35] Schneider, P., Ehlers, J. and Falco, E.E., Gravitational Lenses, Springer-Verlag, Berlin (1992)
[36] Classical and Quantum Gravity 16, Dec. 1999, Millennium Issue, G.W. Gibbons and N.M.J. Woodhouse (eds.)

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 53-58)


Moduli Problems in Geometry

M. S. Narasimhan

The aim of this expository talk is to explain some aspects of moduli problems, both classical and modern. Moduli problems occur naturally in several areas of Mathematics, like Algebraic Geometry, Differential Geometry, Number Theory and Partial Differential Equations, and in Theoretical Physics. Indeed, techniques from all these areas are used to study moduli problems.

Moduli spaces are essentially parameter spaces for the classification of structures of a specified type, like complex (holomorphic) structures on a differentiable manifold. Some well-known examples are:

1. Moduli spaces of elliptic curves and, more generally, of compact Riemann surfaces.
2. Jacobians of curves and their non-abelian generalisations.
3. The space of solutions of certain (non-linear) partial differential equations.

I. Moduli of compact Riemann surfaces

The term moduli was perhaps introduced for the first time in this context by Riemann. He calculated heuristically that compact Riemann surfaces of genus $g$ ($\ge 2$) depend on $3g - 3$ complex parameters and called $3g - 3$ the number of moduli. In fact it turns out that the set of isomorphism classes of complex structures on a compact orientable differentiable surface $X_0$, whose first Betti number is $2g$, has a natural structure of a connected complex space of complex dimension 1 if $g = 1$ and $3g - 3$ if $g \ge 2$. This space is called the moduli space of compact Riemann surfaces of genus $g$. The dimension of the moduli space is the number of moduli. Note that, since any two compact Riemann surfaces of genus 0 are isomorphic, in this case the number of moduli is 0.

In the case of genus 1 (elliptic curves), the moduli space is well known and can be constructed explicitly. In this case we may assume that the Riemann surface is a 1-dimensional complex torus $X_\tau := \mathbb{C}/\{1, \tau\}$, where $\{1, \tau\}$ is the lattice spanned by 1 and $\tau$, $\tau$ being a complex number with $\operatorname{Im} \tau > 0$. We represent $X_\tau$ by the point $\tau$ in the upper half plane. We see easily that $X_\tau$ and $X_{\tau'}$ are isomorphic if and only if $\tau' = (a\tau + b)/(c\tau + d)$, where $a, b, c, d \in \mathbb{Z}$ with $ad - bc = 1$. Thus the moduli space of elliptic curves is the quotient of the upper half plane by the action of $SL(2, \mathbb{Z})$ given by $\tau \mapsto (a\tau + b)/(c\tau + d)$, where $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z})$. In fact this quotient is isomorphic to the complex plane $\mathbb{C}$. Thus the moduli space of elliptic curves is just the complex plane.

There is a deep generalisation of this situation to complex tori of higher dimensions. While it is not true that the set of isomorphism classes of complex tori of complex dimension $g \ge 2$ forms a good moduli space, certain complex tori, the so-called principally polarised abelian varieties of dimension $g$, have a good moduli space. We consider the Siegel half space $H_g$ consisting of complex $g \times g$ symmetric matrices $Z = X + iY$, with $X$, $Y$ real and $Y$ positive definite. The Siegel modular group $Sp(2g, \mathbb{Z})$, consisting of the $2g \times 2g$ matrices with integral entries belonging to the symplectic group $Sp(2g, \mathbb{R})$, acts on $H_g$ as follows: if $\gamma = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in Sp(2g, \mathbb{Z})$ then $\gamma Z = (AZ + B)(CZ + D)^{-1}$. Then the moduli space of principally polarised abelian varieties is the quotient of $H_g$ by $Sp(2g, \mathbb{Z})$. The number of moduli is $g(g+1)/2$.

II. Moduli of vector bundles on a compact Riemann surface

Let $X$ be a compact Riemann surface of genus $g$. By means of the theorems of Abel and Jacobi, classically one associates to $X$ a complex torus (in fact a principally polarised abelian variety) of dimension $g$, called the Jacobian of $X$. Let $D_0$ be the group of divisors of degree 0 on $X$: an element of $D_0$ is a finite formal linear combination $\sum a_i P_i$, with $a_i \in \mathbb{Z}$, $P_i \in X$ and $\sum a_i = 0$. Let $D_1$ be the subgroup of divisors defined by the zeros and poles of a meromorphic function $\not\equiv 0$.
Then the quotient group $D_0/D_1$ has a natural structure of a complex torus. Observing that a divisor of degree 0 gives rise to a holomorphic line bundle of degree 0 (i.e. Chern class 0) and that the divisor of a meromorphic function defines the trivial holomorphic line bundle, we may say that the moduli space of holomorphic line bundles on $X$ which are differentiably trivial is the Jacobian. The number of moduli is $g$.

In 1938, A. Weil envisaged a generalisation by looking at $r \times r$ "matrix divisors"; the case $r = 1$ corresponds to the Jacobian. In modern terminology, one can say that Weil initiated the study of holomorphic vector bundles on compact Riemann surfaces. Roughly, the question was whether there is a natural structure of an algebraic variety on the set of isomorphism classes of holomorphic vector bundles of rank $r$ with fixed topological type. Weil calculated heuristically the number of moduli and found it to be $r^2(g - 1) + 1$, $g$ being the genus of the surface. He studied holomorphic vector bundles which arise from $r$-dimensional complex representations of the fundamental group of the surface and expected that the bundles which arise from unitary representations would play a special role.

It turns out that one cannot expect to have a good moduli space for all holomorphic vector bundles and one has to restrict the class of holomorphic bundles. In the 60's, D. Mumford constructed the moduli space for stable holomorphic bundles of a given degree. A holomorphic vector bundle $E$ of degree 0 is said to be stable if proper holomorphic subbundles of $E$ have strictly negative degree. C. S. Seshadri and I proved that stable holomorphic vector bundles of degree 0 are precisely those which arise from irreducible unitary representations of the fundamental group. The equivalence classes of unitary representations of the fundamental group give a compactified moduli space.

III. Gauge Theory

With the advent of gauge theory (Yang-Mills theory) in Physics, new moduli spaces, which parametrise solutions of certain non-linear partial differential equations, were studied. The Yang-Mills equation is a non-linear euclidean version of Maxwell's equations. These are equations satisfied by connections on a hermitian vector bundle on a compact oriented Riemannian manifold. If $\omega$ is a connection and $\Omega = d\omega + [\omega, \omega]$ its curvature form, the Yang-Mills equation is $d_\omega * \Omega = 0$, where $d_\omega$ is the covariant differentiation with respect to $\omega$ and the star operator $*$ is given by the metric. (In general the solutions form an infinite dimensional space, but modulo 'gauge equivalence' they form a finite dimensional space. The non-linearity comes from the term $[\omega, \omega]$, which involves non-commutative matrix multiplication.)
A special case of importance is the anti-self-dual Yang-Mills equation in the case of 4 (real) dimensional manifolds: $*\Omega = -\Omega$.

Atiyah and Bott studied moduli spaces of vector bundles on compact Riemann surfaces from the point of view of Yang-Mills theory on Riemann surfaces.
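
Both dimension counts quoted so far, Riemann's $3g - 3$ for curves and Weil's $r^2(g - 1) + 1$ for bundles, can be recovered by a short Riemann-Roch computation. The sketch below is not taken from the talk: it assumes the standard identification of first-order deformations with $H^1(X, T_X)$ for the curve $X$ and with $H^1(X, \mathrm{End}\,E)$ for a stable bundle $E$ of rank $r$, together with the facts that $H^0(X, T_X) = 0$ for $g \ge 2$ and that a stable bundle has only scalar endomorphisms.

```latex
\begin{align*}
\chi(X, T_X) &= \deg T_X + (1 - g) = (2 - 2g) + (1 - g) = 3 - 3g, \\
\dim H^1(X, T_X) &= \dim H^0(X, T_X) - \chi(X, T_X) = 0 - (3 - 3g) = 3g - 3 \qquad (g \ge 2), \\[4pt]
\chi(X, \mathrm{End}\,E) &= \deg(\mathrm{End}\,E) + r^2 (1 - g) = r^2 (1 - g), \\
\dim H^1(X, \mathrm{End}\,E) &= \dim H^0(X, \mathrm{End}\,E) - \chi(X, \mathrm{End}\,E) = 1 + r^2 (g - 1).
\end{align*}
```

For $r = 1$ the second count gives $g$, in agreement with the dimension of the Jacobian.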

IV. Construction of moduli spaces

We may say that the problem of classification of structures is a moduli problem. We expect that the structures form a variety (or more generally a scheme). The variety, which is the parameter space for the structures, is the moduli space and its dimension is the number of moduli. Given a fixed structure, one may say that the moduli space parametrises the deformations of the given structure. The moduli problem can be divided into local and global problems.

IV (a). Local Problem

In the local problem we start with a fixed structure, e.g. a fixed complex structure on a compact differentiable manifold, and study nearby structures. This theory was initiated in a famous series of papers by Kodaira and Spencer. They related this question to certain cohomology groups related to the structure, which describe the (first order) infinitesimal deformations of the structure. For example, if $X$ is a compact complex manifold, the relevant cohomology group is $H^1(X, T)$, where $T$ is the tangent sheaf of $X$. Morally this cohomology space is supposed to be the tangent space to the moduli space at the point representing the complex manifold $X$, and the number of moduli may be expected to be the dimension of this vector space. They also considered the problem whether there exists an effective $\dim H^1(X, T)$-parameter family of deformations. They showed that this is indeed the case if $H^2(X, T) = 0$. This result was generalised later by Kuranishi. Grothendieck made contributions to these questions by systematically using schemes with nilpotent elements.

IV (b). Global Problem

The construction of global moduli spaces as algebraic varieties or schemes requires deeper techniques. There are two questions: a) Which structures should be retained (like stable vector bundles) so that there is a moduli space? b) In general the moduli space may not be compact (e.g. moduli of elliptic curves). What is the 'modular interpretation' of the new objects in the 'boundary'?

The most effective method known so far to deal with these problems is Mumford's Geometric Invariant Theory (G.I.T.). In this theory moduli spaces are constructed as quotients of algebraic varieties by the action of a group like $GL(n)$. The construction of quotients in algebraic geometry is a subtle problem. Suppose that $GL(n)$ acts on a projective variety $V$, the action lifting to an ample line bundle. It turns out that there is a natural open set $V^{ss}$ (the set of semi-stable points for the action) and that there is a projective (compact) quotient of $V^{ss}$ by $GL(n)$. The difficult part of the construction of moduli spaces is to reduce the problem to the construction of a G.I.T. quotient (which would be the required moduli space), to determine the set of semi-stable points, and to interpret the notion of semi-stability geometrically in terms of the structures under consideration. Using G.I.T. one can construct the moduli space of compact Riemann surfaces of genus $g$ as a quasi-projective variety and also compactify this space into a projective variety by adding some singular curves, the so-called stable curves.

V. Study of moduli spaces

Moduli problems lead to the construction of new algebraic varieties, starting from a given one. The problem then naturally arises to study these new varieties in depth. On the one hand, some varieties which were classically studied have a modular interpretation, i.e. they occur as the solution to some moduli problem. This point of view enables one to solve some classical problems. On the other hand, surprisingly, the study of the moduli spaces reveals some hidden properties of the original varieties. For instance the study of the cohomology ring of the moduli spaces (suitably compactified if necessary), which amounts to studying intersection numbers of cohomology classes of these spaces, gives interesting numerical invariants of the original varieties. One famous example is the Donaldson polynomial, which is obtained by studying intersection numbers of the moduli space of stable vector bundles on an algebraic surface.
Another example, which throws light on the original manifold, is the moduli space of J-holomorphic curves on a symplectic manifold. The moduli spaces themselves are studied by several methods: purely topological, number theoretic (e.g. the Weil conjectures), gauge theoretic, algebro-geometric, differential geometric. One popular, if heuristic, method is to use techniques from Physics, like Feynman path integrals (observables are cohomology classes on the moduli space and intersection numbers are expectation values).

VI. Differential Geometric Interpretation of Stability

One of the deep and fascinating aspects of moduli problems is that the notion of stability (which is of a purely algebraic nature), needed to construct good moduli spaces, usually has a transcendental meaning, amounting to the existence of solutions of a non-linear partial differential equation. For instance, on a Kähler manifold the stability of a holomorphic vector bundle is (essentially) equivalent to the existence of a hermitian metric on the bundle whose curvature satisfies the so-called Hermitian-Einstein condition. (This theorem, which generalises the result on the equivalence of stable and unitary vector bundles on a compact Riemann surface, was conjectured by Hitchin and Kobayashi and proved by Donaldson and Uhlenbeck-Yau.) This deep relationship between algebro-geometric and differential geometric structures was exploited by Donaldson in his celebrated work. In this case, the existence of a solution of a P.D.E. is assured by the purely algebraic condition of stability. It would be interesting to discover such purely algebraic conditions for the existence of a solution in the general theory of non-linear P.D.E. Here the analogy with the Hilbert-Mumford criterion for stability in G.I.T. could be relevant.
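
As a concrete companion to the genus 1 case of Section I, the following minimal sketch moves a point $\tau$ of the upper half plane into the standard fundamental domain $|\operatorname{Re} \tau| \le 1/2$, $|\tau| \ge 1$ of $SL(2, \mathbb{Z})$, using only the translation $\tau \mapsto \tau + 1$ and the inversion $\tau \mapsto -1/\tau$. Two points give isomorphic elliptic curves exactly when they reduce to the same point (up to identifications on the boundary of the domain). The function names `act` and `reduce_to_fundamental_domain`, the tolerance and the step limit are illustrative assumptions, not part of the talk, and the computation is in floating point.

```python
def act(matrix, tau):
    """Apply (a, b, c, d) with ad - bc = 1 to tau: tau -> (a*tau + b)/(c*tau + d)."""
    a, b, c, d = matrix
    return (a * tau + b) / (c * tau + d)


def reduce_to_fundamental_domain(tau, tol=1e-12, max_steps=10_000):
    """Return (reduced, (a, b, c, d)) with reduced = (a*tau + b)/(c*tau + d),
    |Re reduced| <= 1/2 and |reduced| >= 1 (up to the tolerance tol)."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half plane")
    a, b, c, d = 1, 0, 0, 1  # running matrix; starts as the identity
    for _ in range(max_steps):
        # Translate by an integer so that the real part lies in [-1/2, 1/2].
        n = round(tau.real)
        tau = tau - n
        a, b = a - n * c, b - n * d
        if abs(tau) >= 1 - tol:
            return tau, (a, b, c, d)
        # Invert: tau -> -1/tau, which increases Im(tau) when |tau| < 1.
        tau = -1 / tau
        a, b, c, d = -c, -d, a, b
    raise RuntimeError("reduction did not terminate within max_steps")


if __name__ == "__main__":
    tau0 = complex(0.3, 0.2)
    tau1 = act((2, 1, 1, 1), tau0)  # same SL(2, Z)-orbit as tau0
    print(reduce_to_fundamental_domain(tau0)[0])
    print(reduce_to_fundamental_domain(tau1)[0])
```

The two printed values should agree up to rounding, reflecting that $X_{\tau_0}$ and $X_{\tau_1}$ are isomorphic.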

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 59-67)


Enumerative Geometry from the Greeks to Strings

C. Procesi

Roma, January 2000. From the transparencies of a lecture given in Cairo.

1. The circumferences of Apollonius
2. The 3264 conics of Chasles
3. Schubert calculus
4. Hilbert's 15th problem
5. Cohomology
6. Homogeneous spaces
7. Spherical varieties
8. Strings and moduli spaces
9. Quantum cohomology
10. Equivariant theories

1. The beginning

Enumerative geometry starts with the beginning of Geometry; even the axioms on points and lines are of this nature. Start with an example, the circumferences of Apollonius. Take 3 disjoint circumferences; then: There are exactly 8 circumferences tangent to all the 3 given circumferences. Enumerative Geometry is counting the number of solutions to a geometric problem.

2. The 3264 conics of Chasles

Jumping ahead about 2000 years, after passing from synthetic geometry to analytic geometry, we find ourselves with: the projective plane (or space etc.), and curves, surfaces, varieties given by equations, in particular algebraic equations and algebraic varieties.

Degree. The first enumerative invariant is the degree of the equation; it counts the number of intersections of the curve (or hypersurface) with a generic line. The first Theorem of modern Enumerative Geometry is:

Bezout's Theorem. Two algebraic curves in the plane of degrees m, n intersect in exactly mn points (with some provisos). Similarly 3 surfaces in space of degrees m, n, p intersect in mnp points, etc.

An idea from projective geometry is that curves and surfaces depend on parameters and thus can be treated as points in new projective spaces (or varieties). For instance duality: the lines in the projective plane form the dual plane. One can interpret the Theorem of Apollonius as a special case of Bezout's Theorem on the 3-dimensional space of circles.
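
One way to make the last remark explicit is sketched below. The parametrisation is an assumption made for illustration and is not taken from the transparencies: a circle is written as $x^2 + y^2 + ax + by + c = 0$ and regarded as the point $(a, b, c)$ of a 3-dimensional space, and the given circle has centre $(p, q)$ and radius $r$.

```latex
\begin{align*}
&\text{centre } \left(-\tfrac{a}{2}, -\tfrac{b}{2}\right), \qquad
\rho^2 = \tfrac{1}{4}(a^2 + b^2) - c, \qquad
d^2 = \left(p + \tfrac{a}{2}\right)^2 + \left(q + \tfrac{b}{2}\right)^2, \\
&\text{tangency: } d = r \pm \rho
\;\Longrightarrow\; \left(d^2 - r^2 - \rho^2\right)^2 = 4 r^2 \rho^2, \\
&d^2 - r^2 - \rho^2 = ap + bq + c + p^2 + q^2 - r^2 \quad \text{(linear in } a, b, c\text{)}, \\
&\left(ap + bq + c + p^2 + q^2 - r^2\right)^2 = r^2 \left(a^2 + b^2 - 4c\right)
\quad \text{(a quadric in } a, b, c\text{)}.
\end{align*}
```

Tangency to each of the 3 given circles is thus a condition of degree 2, and Bezout's Theorem for 3 surfaces of degrees 2, 2, 2 gives $2 \cdot 2 \cdot 2 = 8$, with the usual provisos (solutions counted with multiplicity, in general position).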

Circles are just special kinds of conics; conics depend on 5 parameters and we have: Through 5 (generic) points passes a unique conic; to 5 (generic) lines is tangent a unique conic.

Theorem (Chasles). There are exactly 3264 conics tangent to 5 (generic) conics. This is a difficult point, clarified only much later.

3. Schubert Calculus

The main fact is that the previous numbers can be obtained by an algebraic calculus. Let $\mu$, $\nu$ be, respectively, the conditions for a conic to pass through a point or to be tangent to a line. Then the condition to be tangent to a conic is $2\mu + 2\nu$, and the condition to be tangent to 5 conics is:

$$(2\mu + 2\nu)^5 = 32\,(\mu^5 + 5\mu^4\nu + 10\mu^3\nu^2 + 10\mu^2\nu^3 + 5\mu\nu^4 + \nu^5),$$

$$\mu^5 = \nu^5 = 1, \qquad \mu^4\nu = \mu\nu^4 = 2, \qquad \mu^3\nu^2 = \mu^2\nu^3 = 4,$$

$$(2\mu + 2\nu)^5 = 64\,(1 + 5 \times 2 + 10 \times 4) = 3264.$$

The formulas $\mu^5 = \nu^5 = 1$, $\mu^4\nu = \mu\nu^4 = 2$, $\mu^3\nu^2 = \mu^2\nu^3 = 4$ represent all the special numbers, called characteristic numbers. For instance, $\mu^3\nu^2 = 4$ means that there are 4 conics passing through 3 points and tangent to 2 lines. One can define characteristic numbers in various situations, for quadrics, projectivities and many other interesting examples; they are the predecessors of characteristic classes.
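
The calculus above can be reproduced mechanically. The following minimal sketch takes the characteristic numbers of conics quoted in the text as input and expands $\mu^p \nu^l (2\mu + 2\nu)^c$ with $p + l + c = 5$; the function name `count_conics` and the table name are illustrative choices, not notation from the lecture.

```python
from math import comb

# Characteristic numbers quoted in the text, indexed by the power k of nu:
# mu^(5-k) nu^k = number of conics through 5-k points and tangent to k lines.
CHARACTERISTIC_NUMBERS = {0: 1, 1: 2, 2: 4, 3: 4, 4: 2, 5: 1}


def count_conics(points, lines, conics):
    """Number of conics through `points` points, tangent to `lines` lines and
    to `conics` conics (all generic), obtained by expanding
    mu^points * nu^lines * (2 mu + 2 nu)^conics."""
    if points + lines + conics != 5:
        raise ValueError("exactly 5 conditions are needed")
    total = 0
    for j in range(conics + 1):
        # Binomial term: C(conics, j) * mu^(points + conics - j) * nu^(lines + j)
        total += comb(conics, j) * CHARACTERISTIC_NUMBERS[lines + j]
    return 2 ** conics * total


if __name__ == "__main__":
    print(count_conics(0, 0, 5))  # 3264, the number of Chasles
    print(count_conics(3, 2, 0))  # 4
```

The only geometric input is the table of characteristic numbers; everything else is the binomial expansion.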

4. Hilbert's 15th Problem

In the second half of the 19th century there were various results of this type, due to Chasles, Schubert, Zeuthen, Halphen etc. This is part of the beginning of Algebraic Geometry, with its foundational problems. At the International Congress in Paris in 1900 Hilbert presented 23 problems, a possible guideline for the century just passed! The 15th Problem asks for a justification of these computations, i.e. of Schubert's calculus. The modern approach to these questions goes under the name of Intersection Theory.

5. Cohomology

The main idea in the foundations of Intersection Theory is cohomology, in its various incarnations: singular, simplicial (Lefschetz), through differential forms (De Rham), geometric cycles and Chow groups, étale cohomology (Grothendieck).

In all these cases one arrives at an algebra of cohomology $H^*(X)$ associated to a space or a variety $X$; the computations of Chasles are in such an algebra. Let $X$ (smooth, compact) have complex dimension $n$: $H^*(X, \mathbb{Z})$ is a graded algebra. The top degree cohomology is $H^{2n}(X) = \mathbb{Z}[P]$, generated by a special class $[P]$ (the class dual to a point). A condition defining a subvariety of codimension $j$ corresponds to an element $a \in H^{2j}(X)$; equivalent conditions give rise to the same cohomology class. The class of the intersection of two varieties (or rather two conditions) is the product in the cohomology algebra, usually called the cup product $\cup$, of the corresponding classes. Imposing enough conditions to expect finitely many solutions means computing a product $a_1 \cup a_2 \cup \cdots \cup a_i \in H^{2n}(X)$, so that $a_1 \cup a_2 \cup \cdots \cup a_i = m[P]$ is an integral multiple of the fundamental class $[P]$, and $m$ represents the requested intersection number.
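
As a small worked instance of this recipe, Bezout's Theorem can be read off in the cohomology of the projective plane. The sketch assumes the standard facts that $H^*(\mathbb{P}^2, \mathbb{Z}) = \mathbb{Z}[x]/(x^3)$, where $x$ is the class of a line and $x^2 = [P]$, and that a curve of degree $m$ has class $mx$.

```latex
\begin{align*}
[C] &= m\,x, \qquad [C'] = n\,x \qquad \text{in } H^2(\mathbb{P}^2, \mathbb{Z}), \\
[C] \cup [C'] &= mn\,x^2 = mn\,[P] \qquad \text{in } H^4(\mathbb{P}^2, \mathbb{Z}),
\end{align*}
```

so the requested intersection number is $mn$. In the same way three surfaces of degrees $m, n, p$ in $\mathbb{P}^3$ give $mnp\,[P]$.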

6. Homogeneous spaces

Many classical examples fall in the following class. We take a variety $X$ on which a symmetry group acts transitively (a homogeneous space), for instance projective space or non-degenerate conics, quadrics etc. In this case the problem of intersection theory can be formulated starting with:

Kleiman's Transversality Theorem. Given an algebraic homogeneous space $X$ over an algebraic group $G$ and irreducible subvarieties $V_1, \ldots, V_k$ of codimensions $d_i$ in $X$, for generic elements $g_i \in G$ the intersection

$$g_1 V_1 \cap \cdots \cap g_k V_k$$

is proper and generically transversal, in particular of codimension $d_1 + d_2 + \cdots + d_k$. If $n = d_1 + d_2 + \cdots + d_k$, where $n$ is the dimension of $X$, then $g_1 V_1 \cap \cdots \cap g_k V_k$ consists of a finite number of points, which for generic $g_i$'s is independent of the $g$'s and thus is, by definition, the intersection number of the given varieties.

Here the interpretation through cohomology is not immediate; for instance, in the case of conics one has such an interpretation only after compactifying in a suitable way to the variety of complete conics.

In general there is no complete theory. Compact homogeneous spaces are classified and their cohomology is well understood; the prototype is complex projective space $\mathbb{P}^n(\mathbb{C})$, with cohomology algebra $\mathbb{Z}[x]/(x^{n+1})$, or the flag variety, with cohomology algebra $\mathbb{Z}[x_1, \ldots, x_{n+1}]/(e_1, e_2, \ldots, e_{n+1})$, where

$$\prod_{i=1}^{n+1} (t + x_i) = t^{n+1} + e_1 t^n + e_2 t^{n-1} + \cdots + e_{n+1}.$$

Spherical varieties are the next class which has been studied extensively.

7. Spherical varieties

The technical definition of a spherical variety is: an irreducible algebraic variety $X$ with an action of a reductive group $G$ which has the property that a Borel subgroup has an open orbit on $X$. The classical examples are conics, quadrics, projectivities, null correlations. These are special cases of reductive groups $G$ as $G \times G$ spaces, or of algebraic symmetric spaces

$$G/H, \qquad H := G^\theta, \quad \theta^2 = 1.$$

For these spherical varieties $X$ one can define an intersection ring (Halphen's ring) based on Kleiman's Theorem. If $A$, $B$ have complementary dimension, set

$$(A, B) := \#\,|A \cap gB|,$$

the number of points of intersection of generic translates. Call two irreducible subvarieties $Y_1$, $Y_2$ of codimension $k$ equivalent if $(Y_1, Z) = (Y_2, Z)$ for every $Z$ of codimension $n - k$. One can define an intersection product

$$[A] \cap [B] := [A \cap gB]$$

on equivalence classes of cycles, which defines an associative commutative graded algebra $H^*(X, \mathbb{Z})$. This can be computed, since it is isomorphic to the direct limit

$$H^*(X, \mathbb{Z}) \cong \varinjlim H^*(\overline{X}, \mathbb{Z}),$$

where $\overline{X}$ runs over the smooth equivariant compactifications of $X$. All these terms can be combinatorially described; e.g. for symmetric spaces the relevant compactifications are wonderful and indexed by simplicial rational decompositions of the fundamental Weyl chamber of the restricted root system.

The cohomology of these compactifications can be described (through equivariant cohomology). An algorithm can be used to compute all characteristic numbers; e.g. there are 666,841,088 quadrics tangent to 9 quadrics in $\mathbb{P}^3$.

8. Strings and Moduli spaces

String theory aims to develop a quantum theory based on viewing particles as vibrating strings. From a classical point of view a moving string sweeps out some surface in the ambient space, so it should be described by a function

$$F\colon \Sigma \to X,$$

where $\Sigma$ is a surface, the world sheet, and $X$ is the ambient space. Already in the geometry appears the topological form of $\Sigma$ (the genus, for compact orientable surfaces). According to Feynman's approach to quantization one can proceed by evaluating some functional integral. One will try to evaluate this integral as a perturbative series over all topological forms and also around classical solutions of the embedding $F\colon \Sigma \to X$. This needs a suitable action $S$, for instance a functional depending on $F$, an internal metric $h_{\alpha\beta}$ and the ambient metric $\eta$. A variation of Nambu's action, in coordinates $\sigma, \tau$ on $\Sigma$:

$$S := -\frac{1}{2} \int d^2\sigma \, \sqrt{h}\, h^{\alpha\beta}\, \eta_{\mu\nu}\, \partial_\alpha X^\mu\, \partial_\beta X^\nu.$$

There is an infinite dimensional group of symmetries for the action, the gauge group, which has to be eliminated in the functional integral

$$\int e^{iS}\, \delta X\, \delta h.$$

In this way appear moduli of Riemann surfaces or algebraic curves. Fix an oriented surface $\Sigma$ as pure topological or differentiable data. If M is the set of all metrics on $\Sigma$, two groups act on M: the group of diffeomorphisms and the group of rescalings of the metric by $e^f$, $f : \Sigma \to \mathbb{R}$. What remains invariant is the angle, or conformal structure. The resulting set of moduli is a remarkable algebraic variety $\mathcal{M}_g$ of dimension 3g - 3 for $\Sigma$ compact of genus g > 1. So quantum numbers start to appear as enumerative invariants of algebraic curves. A complete description of the intersection theory of Riemann surfaces is yet unknown. The very remarkable theory of Witten and Kontsevich gives a powerful identity of enumerative meaning on $\overline{\mathcal{M}}_{g,n}$, the Deligne-Mumford compactification of the moduli of stable genus g curves with n marked points.

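To each variable point $P_i$ is associated a line bundle $L_i$ (the tangent line at $P_i$) and a cohomology class

$$c_1(L_i) \in H^2(\overline{\mathcal{M}}_{g,n}),$$

the Chern class of the line bundle. Define characteristic numbers:

$$\langle\tau_{d_1},\ldots,\tau_{d_n}\rangle := \int_{\overline{\mathcal{M}}_{g,n}} \prod_{i=1}^{n} c_1(L_i)^{d_i}$$

and the generating series:

$$F(t_0,t_1,\ldots) := \Big\langle \exp\Big(\sum_{i=0}^{\infty} t_i\tau_i\Big)\Big\rangle = \sum_{(k)} \langle\tau_0^{k_0}\tau_1^{k_1}\cdots\rangle \prod_{i=0}^{\infty}\frac{t_i^{k_i}}{k_i!}.$$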

WITTEN CONJECTURE PROVED BY KONTSEVICH. $F(t_0,t_1,\ldots)$ coincides with the partition function of the standard matrix model and obeys the Korteweg-de Vries hierarchy.

Let $\Lambda$ be a positive N × N hermitian matrix and $d\mu_\Lambda(X)$ the probability measure on hermitian matrices

$$d\mu_\Lambda(X) = c_\Lambda \exp\Big(-\tfrac{1}{2}\,\mathrm{tr}\,X^2\Lambda\Big)\,dX;$$

setting $t_i(\Lambda) := -(2i-1)!!\,\mathrm{tr}\,\Lambda^{-(2i+1)}$, the formal series $F(t_0(\Lambda),t_1(\Lambda),\ldots)$ is an asymptotic expansion when $\Lambda^{-1} \to 0$ of

$$\log\Big(\int \exp\Big(\tfrac{\sqrt{-1}}{6}\,\mathrm{tr}\,X^3\Big)\,d\mu_\Lambda(X)\Big).$$

The proof is by a combination of Feynman diagram techniques and the stratification of moduli space associated to Strebel differentials.


9. Quantum cohomology

Enumerative geometry of algebraic curves in a given variety X is still in a very conjectural form. Some major progress has been made for rational curves, mainly because, for some special varieties like homogeneous spaces, a good theory of moduli is available. The most remarkable result is probably the following. Let X be a generic quintic threefold in $P^4$. In 1991 Candelas, de la Ossa, Green and Parkes [COGP] "predicted" the numbers $n_d$ of degree d rational curves in X, conjecturing that the generating function

$$K(q) = 5 + \sum_{d=1}^{\infty} n_d\,d^3\,\frac{q^d}{1-q^d}$$

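could be recovered via elementary transformations from the hypergeometric series

$$\sum_{d=0}^{\infty}\frac{(5d)!}{(d!)^5}\,e^{dt},$$

which is annihilated by the differential operator

$$D := \Big(\frac{d}{dt}\Big)^4 - 5e^t\prod_{m=1}^{4}\Big(5\frac{d}{dt}+m\Big).$$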

Indeed, pick a basis {fi(t)}i=0 3 of the space of solutions of D(f) = 0, introduce the new variable T{t) = j¥x, and consider functions
[dtj

K(expt)

This conjecture was motivated by a fascinating phenomenon which is known by physicists as mirror symmetry: a relation between enumerative geometry in X and solutions of differential equations for periods of some mirror family $\{Y_t\}$. The prediction of the physicists has been rigorously checked by Givental (based also on the work of several people). Here we are dealing with enumerative geometry on $\overline{M}_{0,n}(X,\beta)$, the space of stable maps of curves of genus 0 to X with fundamental class mapping to a chosen homology class $\beta \in H_2(X,\mathbb{Z})$. The numbers counting rational curves on X (i.e. the Gromov-Witten invariants of X) are defined as follows. Let $\gamma_1,\ldots,\gamma_n$ be classes in $H^*(X)$, and $\beta \in H_2(X,\mathbb{Z})$:

$$I_\beta(\gamma_1,\ldots,\gamma_n) := \int_{\overline{M}_{0,n}(X,\beta)} p_1^*(\gamma_1)\cup\cdots\cup p_n^*(\gamma_n),$$

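then these numbers are used to form a function called the potential, of the following form:

$$\Phi(t_0,\ldots,t_m) := \sum_{n_0+\cdots+n_m\ge 3}\frac{t_0^{n_0}}{n_0!}\cdots\frac{t_m^{n_m}}{n_m!}\sum_{\beta\in H_2(X,\mathbb{Z})} I_\beta(n_0,\ldots,n_m).$$

Set

$$\Phi_{ijk}(t_0,\ldots,t_m) := \frac{\partial^3\Phi}{\partial t_i\,\partial t_j\,\partial t_k}.$$

If we denote by $g^{ij}$ the inverse of the intersection matrix $g_{ij} = \int_X T_i\cup T_j$ of a basis $T_0,\ldots,T_m$ of $H^*(X)$, we define the quantum product (commutative with unit):

$$T_i * T_j := \sum_{k,l=0}^{m}\Phi_{ijk}\,g^{kl}\,T_l .$$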

The extremely remarkable fact is the following.

Theorem. The quantum product is associative. (Quantum cohomology $(QH^*(X),*)$.)

This has for instance as a consequence the formula of Kontsevich: if $N_d$ denotes the number of degree d rational curves in $P^2$ passing through 3d - 1 general points, then $N_d$ is computed recursively as:

$$N_d = \sum_{\substack{d_1+d_2=d\\ d_1,d_2>0}} N_{d_1}N_{d_2}\left(d_1^2d_2^2\binom{3d-4}{3d_1-2} - d_1^3d_2\binom{3d-4}{3d_1-1}\right).$$
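The recursion is easy to run by machine. The following is a minimal sketch, assuming the starting value $N_1 = 1$ (one line through two general points), which the text does not state explicitly; the function name is illustrative only.

```python
from math import comb


def kontsevich_numbers(max_degree):
    """Return [N_1, ..., N_max_degree] from the recursion, seeded with N_1 = 1."""
    n = {1: 1}  # assumption: a unique line passes through 2 general points
    for d in range(2, max_degree + 1):
        total = 0
        for d1 in range(1, d):
            d2 = d - d1
            total += n[d1] * n[d2] * (
                d1 * d1 * d2 * d2 * comb(3 * d - 4, 3 * d1 - 2)
                - d1 ** 3 * d2 * comb(3 * d - 4, 3 * d1 - 1)
            )
        n[d] = total
    return [n[d] for d in range(1, max_degree + 1)]


print(kontsevich_numbers(5))  # [1, 1, 12, 620, 87304]
```

The values for d = 2, 3 recover the classical counts of one conic through 5 points and 12 rational cubics through 8 points.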

10. Equivariant Theories

A very useful way to make computations, which is essential in the previous cases, is to pass through equivariant cohomology and equivariant quantum cohomology. This is the case whenever there is a group of symmetries. The reason is that, computing in equivariant theory, one has the localization principle, which allows powerful inductions. This is the basic tool of Givental.

Bibliography

[AB] M. Atiyah and R. Bott, The moment map and equivariant cohomology, Topology 23 (1984), 1-28.
[Al] P. Aluffi, Quantum cohomology at the Mittag-Leffler Institute, 1996-1997, Appunti della Scuola Normale Superiore (1998).
[BDP] E. Bifet, C. De Concini, C. Procesi, Cohomology of regular embeddings, Advances in Mathematics 82, no. 1 (1990), 1-34.
[COGP] P. Candelas, X. C. de la Ossa, P. S. Green, and L. Parkes, A pair of Calabi-Yau manifolds as an exactly soluble superconformal theory, Nucl. Phys. B 359 (1991), 21-74.
[DP1] C. De Concini, C. Procesi, Complete Symmetric Varieties, C.I.M.E. 1982, Springer Lecture Notes 997 (1983), 1-44.
[DP2] C. De Concini, C. Procesi, Complete Symmetric Varieties II (Intersection Theory), Advanced Studies in Pure Mathematics 6, Algebraic groups and related topics (1985), 481-513.
[DP3] C. De Concini, C. Procesi, Cohomology of compactifications of algebraic groups, Duke Mathematical Journal 58 (1986), 585-594.
[DGMP] C. De Concini, M. Goresky, C. Procesi, R. MacPherson, On the geometry of quadrics and their degenerations, Commentarii Mathematici Helvetici 63 (1988), 337-413.
[Du] B. Dubrovin, Geometry of 2D topological field theories, in Integrable systems and quantum groups, Montecatini Terme, 1993, Lecture Notes in Mathematics 1620, Springer, Berlin, 1996, 120-348.
[FP] W. Fulton, R. Pandharipande, Notes on the stable maps and Quantum Cohomology, alg-geom 9608011 (1996).
[G1] A. B. Givental, Equivariant Gromov-Witten Invariants, IMRN No. 13 (1996), 613-663.
[G2] A. B. Givental, Homological geometry I: Projective hypersurfaces, Selecta Math. 1 (1995), 325-345.
[G3] A. B. Givental, Homological geometry and mirror symmetry, in Proceedings of the International Congress of Mathematicians, 1994, Zurich, Birkhauser, Basel (1995), 472-480.
[K] M. Kontsevich, Enumeration of rational curves via torus actions, in The moduli space of Curves, ed. by R. Dijkgraaf, C. Faber, and G. van der Geer, Progr. Math. 129, Birkhauser (1995), 335-368.
[LLY] B. H. Lian, K. Liu, S.-T. Yau, Mirror Principle I, alg-geom 9712011 (1997).
[Ma] Y. Manin, Generating functions in algebraic geometry and summation over trees, in The moduli space of Curves, ed. by R. Dijkgraaf, C. Faber, and G. van der Geer, Progr. Math. 129, Birkhauser (1995), 401-417.
[PX] C. Procesi, Xambo, On Halphen's first formula (Zeuthen Colloquium, edited by S. Kleiman), Contemporary Mathematics 123 (1991), 199-211.
[Vo] C. Voisin, Variations of Hodge structure of Calabi-Yau threefolds, Quaderni della Scuola Normale Superiore (1998).
[W] E. Witten, Two dimensional gravity and intersection theory on moduli space, Surveys in Diff. Geom. 1 (1991), 243-310.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada ©2001 World Scientific Publishing Co. (pp. 69-121)

OPTICAL SOLITONS: TWENTY-SEVEN YEARS OF THE LAST MILLENNIUM AND THREE MORE YEARS OF THE NEW?

R. K. BULLOUGH

ABSTRACT. I give a short survey of soliton theory since the time of its creation in the late sixties and early seventies of the last century: mainly I here focus on the discovery of solitons in the dynamics of nonlinear optics, with particular reference to the optical solitons of the so-called Maxwell-Bloch equations in one time and one space (1+1) dimensions. These M-B equations include the sine-Gordon (s-G) and nonlinear Schrodinger (NLS) equations as final products under a succession of slowly varying envelope and phase approximations. These SVEPAs preserve the exact integrability of each family of nonlinear partial differential equations in 1+1 dimensions, and each family can be quantised and exactly solved as completely integrable quantum systems. Both the quantum s-G and the quantum NLS in 1+1 dimensions could prove to be natural candidates for applications in the newly developing science and technology of 'quantum information' including 'quantum computing': the attractive case of the quantum NLS is already a centre of interest in optical fibres. Neither the quantum nor the classical NLS equations are integrable in two and three space dimensions (i.e. in 2+1 and 3+1 dimensions). But, as a part of the theme of the paper illustrating the interplay of abstract mathematics, mathematical physics, and experimental physics and technologies, I show that the weakly coupled quantum NLS equations in 3+1 dimensions in particular find direct application to the theory of Bose condensates observed in 1995 in metal vapours held in magnetic traps at temperatures of nano-degrees Kelvin. Here approximated functional integral methods give a good description of these Bose condensates, but so far only for the repulsive case: predictions for the correlation functions in this repulsive case are in good agreement with recent experiments on magnetically trapped $^{87}$Rb vapour held in very elongated traps in the temperature range 250-450 nano-Kelvin. Although these experiments rather confirm that these quantum condensates in 3+1 dimensions are in exact quantum coherent states asymptotically, nevertheless they must also confirm that these states are states of 'quantum chaos'. This is because the semiclassical equations in 3+1 dimensions are not integrable and so represent very large dimensional Hamiltonian chaos. Even so the quantum coherent states in 3+1 dimensions may herald still another new technology based on an atom, rather than a photon, laser.

1. INTRODUCTION: A NEW TECHNOLOGY AND A SHORT SURVEY OF SOLITON THEORY AND ITS MATHEMATICS

The material of this paper superbly illustrates an affirmative answer to the question posed by Professor Griffiths [1] "Is interdisciplinary research possible?" It is possible but perhaps at some cost to the individuals concerned [2]. The paper briefly summarizes the emergence of a new technology deriving from the "optical soliton". Discovered in a mathematical investigation of ultra-short resonant optical pulse propagation (nanosecond pulses) in a nonlinear dielectric medium in the period up to 1973 [e.g. 3-7], perhaps first labelled as such at the First National Quantum Electronics Conference held in Manchester in September 1973 [7], and certainly so described at the Rochester Conference in 1977 [7,8], this "optical soliton" is currently a natural candidate for becoming the "one-bit" for information transfer in trans-oceanic optical fibre communication, and as such will carry some US dollars 40 billions of investment! Moreover quantum multisoliton excitations might become natural "qubits" for quantum information manipulation and transfer [9]. However to date a supposed poor "bit-rate" (number of pulses per second) in the fibres as compared with current linear methods of communication, allied with conservative reaction in the telecommunications industry, has put this fibre investment programme "on hold" until at least 2003 - the reason for the title of this Lecture¹ - while as "qubits" quantum multisoliton solutions for nonlinear dielectrics must at present only be a gleam in somebody's eye (see the §5 following in this paper and the ref. [9]). For mathematics, and for the physics of these technologies, an essential point is that these optical solitons are intrinsically nonlinear phenomena - based as we shall see on certain systems of nonlinear partial differential equations. However, more striking to the mathematician, a remarkable mathematics underlies these optical solitons.
They are true solitons in the sense described in [10] and though they have natural algebras - see for example the range of mathematics sketched (but unfortunately scarcely developed in this paper) contained in the 'map' "Solitons" introduced as the Figure 8 in the §3 below - they also have natural geometries - see Chapter 1 in [10] and the very recent [11-13]. The p.d.e.s in question derive through a hierarchy of so-called Maxwell-Bloch (MB) equations [7]. This hierarchy includes the famous sine-Gordon (s-G) and nonlinear Schrodinger (NLS) equations, and both s-G and NLS relate to the now equally famous Korteweg-de-Vries (KdV) equation. In one space dimension (x) and one time (t) (1+1 dimensions) KdV has a countably infinite set of polynomial conserved densities $p_n(x,t)$ such that $\int p_n(x,t)\,dx$ are constants of the motion [14], and these p.c.d.s are shared with s-G taken in light cone coordinates. Indeed the s-G, NLS and KdV are each "completely integrable Hamiltonian systems" which have the property of 'integrability' and consequently exact analytical solvability by (say) the inverse scattering method [10] and each carries a very large number of conserved quantities. Historically complete integrability derives from Liouville [15] of 1855: for a Hamiltonian system with a finite number N of degrees of freedom the system is completely integrable if N independent constants, each capable of independent variation, can be found mutually commuting under the Poisson bracket: then its solution is reducible to 'quadratures' (a succession of explicit integrations). Note how N constants are sufficient to solve a problem with 2N constants of integration. More recently [16] Arnold shows how when the manifold of level lines (these constants) is compact and connected this allows the construction of N pairs of action-angle variables for these systems, while the motion is confined to an N torus in the phase space. The new feature for integrable nonlinear p.d.e.s in 1+1 dimensions is that the degrees of freedom are labelled by $x \in \mathbb{R}$ (the real line): thus there is a very large (uncountable) infinity of degrees of freedom and a very large infinity of pairs of action-angle variables together with a corresponding very large dimensional torus in the phase space. As far as I know there is still no explicit proof that the Liouville-Arnold theory can be extended to Cantor's $2^{\aleph_0}$ number of degrees of freedom, the problem being the exact count of what is a sufficient number of independent mutually commuting constants.

¹ The title counts 27 years from September 1973 to September 2000 and three years from September 2000 to (about) September 2003. The 'last millennium' therefore includes even September 2000!
Nevertheless s-G, NLS, KdV and very many more systems in 1+1 dimensions (see e.g. [10] and refs., [17-19] and refs.) are manifestly completely integrable in that they can both be integrated and shown to have action-angle variables which may be used to facilitate that integration. Moreover, some systems, like the Kadomtsev-Petviashvili (KP) equations, for example, are completely integrable in 2+1 dimensions [20]. On the other hand KP in 2+1 is the natural extension of KdV in 1+1: the corresponding natural extension of s-G to 2+1 is not covariant and does not extend the physics [21]; the natural integrable extension of the integrable NLS in 1+1 is apparently the Davey-Stewartson system in 2+1 dimensions [22] while under the obvious extension that $\partial^2/\partial x^2 \to \nabla^2$ the NLS becomes explicitly non-integrable in 2+1 and 3+1 dimensions and the number of commuting constants is simply insufficient in these cases. Note that each of these completely integrable nonlinear p.d.e.s already forms a natural hierarchy of its own; for each of the independent mutually commuting constants can be taken as a Hamiltonian for an integrable system and there is a large infinity of these. Such systems, which are rare in some functional sense in the space of functions can thus, nevertheless, be of very large measure in some other space! Moreover this situation extends to corresponding quantum field theories (so far only in 1+1 dimensions): these include the quantum NLS,

the quantum s-G (and indeed the quantum MB systems [23]). These quantum Hamiltonian systems have a large number of mutually commuting independent constant operators commuting under a Lie bracket. Moreover, these hierarchies do not end the matter: solutions of the Yang-Baxter equations at either classical or quantum level [24] lead in each case (as far as I know) to hierarchical families of integrable systems - though again for the integrable quantum field theories no "quantum Liouville-Arnold theorem" is explicitly proved to my knowledge. (The "quantum inverse method" or "algebraic Bethe ansatz" emanating from Leningrad - St. Petersburg [24] can however be said to be illustrating exactly this. I refer the reader to [24] and the third column from the right in the Fig. 8 where the symbolism $R\,T\otimes T = T\otimes T\,R$ leads through its matrix trace to $[A_T(\lambda), A_T(\mu)] = 0$, $A_T(\lambda) = \mathrm{Tr}\,T(\lambda)$, a large number of commuting constants labelled by $\lambda, \mu \in \mathbb{C}$ which however are not specifically counted: $\lambda, \mu$ are spectral parameters, and all of this is elaborated below at eqn. (27) in §3.) At c-number (i.e. at classical) level many of the completely integrable p.d.e.s in 1+1 dimensions can be derived from the self-dual Yang-Mills equations [11-13,25] in 4+0 dimensions or from the KP equations [26] in 2+1 dimensions. However, as far as I know, quantum integrable systems are still confined to 1+1 dimensions (an R matrix, see [17], has been found for KP in 2+1). In this paper I concern myself with quantum and classical integrability in 1+1 dimensions. It can already be seen that these integrable systems carry much mathematical structure, in the classical cases both algebraic and geometrical in the senses described by Sir Michael Atiyah in his Millennium Lecture.
Thus [10] all of the classical so-called AKNS systems have representations as surfaces of constant negative Gaussian curvature with respect to a particular metric while [11] exhibits examples of these surfaces for the classical s-G system. Moreover under periodic boundary conditions elliptic curves seen as punctured Riemann surfaces in the sense of 'algebraic geometry' become natural geometrical objects of the theory of integrable systems [12]. For field theories these Riemann surfaces are of infinite genus (see 'map' Fig. 8, top right, the [18], and the thesis [27]). The algebras extend to the quantum integrable systems, but as yet I cannot say anything about the geometry of these quantum systems: since its inception quantum mechanics has had wholly natural algebras as its basis, e.g. su(2) for angular momentum and spin [28]: on the other hand much quantum geometry for the quantum integrable systems may already be evident within the so-called twistor programme [13]. As Atiyah also mentioned the 'quantum groups' are recent discoveries made very much in the latter part of the last century - although 'q-deformation' was investigated in the 1970's [29]. In the context of integrable systems the quantum groups are generalised Hopf algebras (generalised by inclusion of a spectral parameter [30]) and are intimately related to both classical and quantum integrable systems [29-32]. In this article I can do little more than display the 'map', the Fig. 8 below already mentioned, which for example tracks a route for the quantum groups from the loop algebras well on the left to the quantum inverse method in the third column from the

right through the Hopf algebras (the notation $\Delta T$ in $\Delta T = T\otimes T$ appearing in the second column from the right in the Fig. 8 refers to the co-multiplication structure underlying the Hopf algebras [18,29-32]). It may be surprising, or it may not [33], that this treasure trove of mathematics sketched in Fig. 8 is also a treasure trove of physics: the soliton solutions of integrable p.d.e.s in 1+1 dimensions may have the applications in optics in [7]; but they are also applicable to the much wider range of physics which was already sketched, for example, in [34]. Moreover as mentioned that physics begins to have the applications in the new and important technologies of information transfer and communication referred to while at quantum level quantum solitons may have the applications to 'quantum information', quantum computing, logic gates, etc [9,35] likewise mentioned. After this introductory 'essay' on the potential of solitons for the technologies of the new Millennium the necessarily too short a paper which now follows touches on aspects of all of this material. We begin with the optical solitons and the solitons of the Maxwell-Bloch system in the next section, §2; in §3 we explore the complete Hamiltonian integrability of the soliton systems; in §4 we present recent results in the theory of Bose-Einstein condensates in magnetic traps as an illustration of a non-integrable quantum NLS system in d = 3 dimensions capable in principle of creating a new technology - that of the atom-laser; and in §5 we investigate the quantum soliton in both of the roles of 'quantum communication' and of 'quantum information'.

2. MAXWELL-BLOCH HIERARCHY OF EQUATIONS AND THE SOLITONS OF SELF-INDUCED TRANSPARENCY (SIT)

The sine-Gordon equation in 1+1 dimensions (Eq. (1) below) together with the self-induced transparency (SIT) equations in 1+1 were solved in Manchester for their multisoliton solutions early in 1973 [3]: later in 1973 the sine-Gordon (s-G) was solved more generally (for the general initial value problem) by inverse scattering methods as generalised by AKNS [36]. Independently both SIT and s-G were solved by George Lamb [37]. Then in later 1973 we introduced, and by an inverse scattering method solved, the reduced Maxwell-Bloch equations [4,5] while in [6] we gave a general theory of SIT. The reduced Maxwell-Bloch (RMB) equations of 1973 contain in effect all of the s-G, the nonlinear Schrodinger (NLS) and the SIT equations [7]. This author first came across s-G when studying ultra-short resonant optical pulse propagation ($10^{-9}$ sec pulses) in the context of quantum optics [7]. This s-G, in the notation $\phi_x \equiv \partial\phi/\partial x$, etc., is

$$\phi_{xx} - \phi_{tt} = m^2\sin\phi. \qquad (1)$$
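As a numerical sanity check on eqn. (1), the sketch below evaluates the residual $\phi_{xx} - \phi_{tt} - m^2\sin\phi$ by central differences on the standard travelling kink profile $\phi = 4\tan^{-1}\exp[m(x - Vt)/\sqrt{1 - V^2}]$. The profile, the sample values m = 1 and V = 0.3, the step h and the function names are assumptions made for illustration; they are not taken from the text at this point.

```python
import math


def kink(x, t, m=1.0, v=0.3):
    """Travelling kink profile 4*atan(exp(m*(x - v*t)/sqrt(1 - v**2)))."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return 4.0 * math.atan(math.exp(m * gamma * (x - v * t)))


def sine_gordon_residual(phi, x, t, m=1.0, h=1e-3):
    """Central-difference estimate of phi_xx - phi_tt - m^2 sin(phi) at (x, t)."""
    phi_xx = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / h ** 2
    phi_tt = (phi(x, t + h) - 2.0 * phi(x, t) + phi(x, t - h)) / h ** 2
    return phi_xx - phi_tt - m * m * math.sin(phi(x, t))


# Largest residual on a small grid of x values at a fixed time t = 0.7.
worst = max(abs(sine_gordon_residual(kink, 0.1 * i, 0.7)) for i in range(-50, 51))
print(worst)
```

The residual stays at the level of the finite-difference error, and the profile runs from 0 at large negative x to 2π at large positive x.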

The parameter m > 0 is a "mass". Indeed the linear Klein-Gordon equation entered particle physics in 1925-26 [7,38]: from the relativistic relation $E^2 = c^2p^2 + m^2c^4$ with Schrodinger's correspondences $E = -i\hbar\,\partial/\partial t$, $\mathbf{p} = -i\hbar\nabla$ one readily finds the K-G equation

$$\Box^2\phi = (mc/\hbar)^2\phi, \qquad (2)$$

Fig. 1. The 2-level atom here displayed as a spin-1/2 system: spin-up is $|e\rangle$ in the text, spin-down is $|g\rangle$ in the text. The energy spacing between the 2 levels is $\hbar\omega_0$ and p is the dipole matrix element between them. Physically the 2-level atom is a good approximation when the dynamics of interest is close to resonance with the atomic frequency $\omega_0$ and $|e\rangle$ and $|g\rangle$ are both nondegenerate.

with the d'Alembertian $\Box^2$, and for $\hbar = c = 1$ and 1+1 dimensions eqn. (2) is the linearisation of the s-G eqn. (1). As suggested, and see [7], the s-G lies in a hierarchy of integrable Bloch-Maxwell (= Maxwell-Bloch (MB)) equations in which integrability is 'handed down' - largely in the fashion now thoroughly investigated by Calogero [39]. Conveniently, from the physics, one starts with the two-level atom, Fig. 1. This model atom has two non-degenerate eigenenergies separated by an energy $\hbar\omega_0$ (say): the upper energy has corresponding eigenstate $|e\rangle$, the lower $|g\rangle$; there is a dipole matrix element p between these two states $|g\rangle$, $|e\rangle$. There is an exact representation of this 2-state quantum system by a spin-1/2 system: spin-up is $|e\rangle$, spin-down is $|g\rangle$. Indeed in terms of Dirac's outer products $S^+ = |e\rangle\langle g|$, $S^- = |g\rangle\langle e|$ and $S^z = \frac{1}{2}(|e\rangle\langle e| - |g\rangle\langle g|)$ one has the two dimensional representation of the su(2) Lie algebra

$$[S^\pm, S^z] = \mp S^\pm,\qquad [S^+, S^-] = 2S^z. \qquad (3)$$
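The commutation relations (3) can be checked directly on 2 × 2 matrices. This is a minimal sketch, assuming the basis ordering $|e\rangle = (1,0)$, $|g\rangle = (0,1)$ (an illustrative choice); the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Scalar multiple c * a of a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


# Outer products in the assumed basis |e> = (1, 0), |g> = (0, 1).
S_PLUS = [[0, 1], [0, 0]]        # |e><g|
S_MINUS = [[0, 0], [1, 0]]       # |g><e|
S_Z = [[0.5, 0], [0, -0.5]]      # (|e><e| - |g><g|) / 2

print(commutator(S_PLUS, S_Z) == scale(-1, S_PLUS))    # [S+, Sz] = -S+
print(commutator(S_MINUS, S_Z) == scale(+1, S_MINUS))  # [S-, Sz] = +S-
print(commutator(S_PLUS, S_MINUS) == scale(2, S_Z))    # [S+, S-] = 2 Sz
```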

On the other hand the 2-level atom also has representations as two or one fermion; or indeed as two or one boson [40]. For our purposes the SU(2) group is a double cover of the SO(3) group: the su(2) Lie algebra maps to the so(3) algebra via the so-called Bloch vector $\mathbf{r} = (r_1, r_2, r_3)$, with $|\mathbf{r}| = 1$: spin-up is now (0,0,1), spin-down is (0,0,-1). The equation of motion for a spin-1/2 "magnet" $\mathbf{r}$ is the so-called Bloch equation

$$\dot{\mathbf{r}} \equiv d\mathbf{r}/dt = \boldsymbol{\omega}\times\mathbf{r}. \qquad (4)$$
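Eqn. (4) describes a rotation of $\mathbf{r}$ about $\boldsymbol{\omega}$, so the length $|\mathbf{r}|$ is conserved. The sketch below integrates (4) with a classical fourth-order Runge-Kutta step. The integrator, the sample frequencies and the function names are illustrative assumptions only: first a constant $\boldsymbol{\omega} = (0,0,\omega_0)$, for which the exact solution is a precession about the third axis, then a hypothetical time-dependent first component.

```python
import math


def cross(a, b):
    """Vector product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def shifted(r, k, s):
    """Return r + s*k componentwise."""
    return tuple(r[i] + s * k[i] for i in range(3))


def integrate_bloch(r0, omega, t_end, steps):
    """Integrate dr/dt = omega(t) x r from t = 0 to t_end with RK4."""
    dt = t_end / steps
    r = tuple(r0)
    for n in range(steps):
        t = n * dt
        k1 = cross(omega(t), r)
        k2 = cross(omega(t + dt / 2), shifted(r, k1, dt / 2))
        k3 = cross(omega(t + dt / 2), shifted(r, k2, dt / 2))
        k4 = cross(omega(t + dt), shifted(r, k3, dt))
        r = tuple(r[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                  for i in range(3))
    return r


def norm(r):
    return math.sqrt(sum(c * c for c in r))


# Constant omega = (0, 0, 2): r precesses about the third axis.
free = integrate_bloch((1.0, 0.0, 0.0), lambda t: (0.0, 0.0, 2.0), 1.0, 1000)
# Hypothetical oscillating first component standing in for an applied field.
driven = integrate_bloch((0.0, 0.0, -1.0),
                         lambda t: (0.5 * math.cos(2.0 * t), 0.0, 2.0), 5.0, 5000)
print(free, norm(free), norm(driven))
```

In the constant case the result agrees with $(\cos 2, \sin 2, 0)$, and in both cases the norm stays equal to 1 up to integration error.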

However, the atom enters the physics via the choice $\boldsymbol{\omega} = (2\hbar^{-1}\mathbf{p}\cdot\mathbf{E}(t), 0, \omega_0)$ in which $\mathbf{E}(t)$ is the electric field vector falling on the atom: $\mathbf{p}$ is the vector dipole matrix element. The physics of the step su(2) → so(3) is: for the arbitrary state $|\psi(t)\rangle$ of the 2-level atom, $|\psi(t)\rangle = c_1(t)|e\rangle + c_2(t)|g\rangle$ ($c_1, c_2 \in \mathbb{C}$), the density matrix $\rho(t)$ is a 2 × 2 matrix with diagonal elements $|c_1|^2$, $|c_2|^2$ and off-diagonal ones $\rho_{12} = c_1^*c_2$, $\rho_{21} = c_1c_2^*$ ($\rho$ is Hermitean). The correspondence is then

$$r_1 = (c_1c_2^* + c_1^*c_2),\qquad r_2 = -i(c_1c_2^* - c_1^*c_2),\qquad r_3 = (|c_1|^2 - |c_2|^2). \qquad (5)$$
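The correspondence (5) can be tried on sample amplitudes. The sketch below, with hypothetical values for $c_1$, $c_2$ and an illustrative function name, confirms that a normalised state gives a unit Bloch vector and that an overall phase factor on $|\psi\rangle$ leaves $\mathbf{r}$ unchanged.

```python
import cmath
import math


def bloch_vector(c1, c2):
    """Bloch vector (r1, r2, r3) of the state c1|e> + c2|g>, following eqn. (5)."""
    c1, c2 = complex(c1), complex(c2)
    r1 = (c1 * c2.conjugate() + c1.conjugate() * c2).real
    r2 = (-1j * (c1 * c2.conjugate() - c1.conjugate() * c2)).real
    r3 = abs(c1) ** 2 - abs(c2) ** 2
    return (r1, r2, r3)


# Hypothetical normalised amplitudes: |c1|^2 + |c2|^2 = 0.36 + 0.64 = 1.
c1 = 0.6 * cmath.exp(0.3j)
c2 = 0.8 * cmath.exp(-1.1j)
r = bloch_vector(c1, c2)
print(r, math.sqrt(sum(x * x for x in r)))

# An overall phase, here exp(i*pi), does not change the Bloch vector.
phase = cmath.exp(1j * math.pi)
print(bloch_vector(phase * c1, phase * c2))
```

Spin-up ($c_1 = 1$, $c_2 = 0$) maps to (0,0,1) and spin-down to (0,0,-1), as stated above.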

Here $r_1$ provides the atomic dipole $p\,r_1$, $r_3$ is evidently the atomic inversion, and $r_2$ carries the "phase". Notice that this representation does not distinguish $|\psi(t)\rangle$ and $e^{i\pi}|\psi(t)\rangle$ - a "Berry" or better a 'geometric' phase resulting from the representations of su(2) and so(3). Schrodinger's equation for the atom in the field $\mathbf{E}(t)$ now means exactly the Bloch equation eqn. (4) with $\boldsymbol{\omega}$ as quoted above. For the Bloch-Maxwell (BM) equations one now assumes a smooth distribution of the atoms at all points $\mathbf{x} \in \mathbb{R}^3$ or (see below) $\mathbf{x} \in V \subset \mathbb{R}^3$: physically we are dealing with a mean-field theory which ignores difficult "many-body theory" (see for example [41-43] for the case of the linearised dielectric). For this smooth distribution $\mathbf{r}(t) \to \mathbf{r}(\mathbf{x},t)$ simply, a good approximation for small enough number densities n of atoms. For dielectrics of interest n is locally independent of $\mathbf{x}$, although for a parallel sided slab V of dielectric, a physically realistic case, $n(\mathbf{x}) = n$, $\mathbf{x} \in V$, $n(\mathbf{x}) = 0$, $\mathbf{x} \notin V$ and this choice has physically natural but important consequences at the boundaries of V (see [41]). This is because with $\mathbf{P}(\mathbf{x},t) = \mathbf{p}\,r_1(\mathbf{x},t)$ and $\mathbf{E}(t) \to \mathbf{E}(\mathbf{x},t)$ (6) for all $\mathbf{x} \in V$. The Bloch equations are now

$$\dot{\mathbf{r}}(\mathbf{x},t) = \boldsymbol{\omega}(\mathbf{x},t)\times\mathbf{r}(\mathbf{x},t) \qquad (7)$$

in which $\boldsymbol{\omega}(\mathbf{x},t) = (2\hbar^{-1}\mathbf{p}\cdot\mathbf{E}(\mathbf{x},t), 0, \omega_0)$. These are coupled to $\mathbf{E}(\mathbf{x},t)$ which is coupled to $r_1(\mathbf{x},t)$ through eqn. (6), and eqns. (6) and (7) together constitute the nonlinear Bloch-Maxwell system of p.d.e.s in $\mathbf{r}(\mathbf{x},t)$ and $\mathbf{E}(\mathbf{x},t)$ - nonlinear exactly because $\boldsymbol{\omega}(\mathbf{x},t)$ depends on $\mathbf{E}(\mathbf{x},t)$. Note that the Bloch equations are non-linear but the Maxwell equation, eqn. (6), is still linear, and this latter allows the so-called "optical extinction theorem" [41] to apply to this nonlinear dielectric [44,45]. In this way we can investigate the action of a nonlinear Fabry-Perot interferometer and its optical bistability (OB) [7,44,45]: OB was supposed to herald an optical computer, but "cross-talk" between pixels (the supposedly separate and independent units of OB) has so far crushed that technology. I mention OB again with reference to the Figs. 9 and 10 below. Conveniently the BM system eqns. (6) and (7) will now be referred to as the Maxwell-Bloch (MB) equations: the boundary conditions are initially outgoing at infinity, but the 'extinction theorem' mentioned can look after these. As we had supposed by 1973 we showed in Manchester in 1977 [46] that the MB system eqns. (6) with (7) is not integrable in the Liouville-Arnold sense (refer §1) by showing that its supposed "soliton" solutions interacted with each other: in 1+1 dimensions solitons interact, despite the nonlinearity of the system, only with a simple phase shift [10]. However, for low densities n of atoms (as in the experiments on $^{87}$Rb vapour [47]) we showed that in 1+1 dimensions the Maxwell equation eqn. (6) can be replaced [4] by the 1+1 dimensional forward going system

$$\frac{\partial E(x,t)}{\partial x} + \frac{1}{c}\frac{\partial E(x,t)}{\partial t} = -\frac{2\pi n}{c}\frac{\partial P(x,t)}{\partial t} \qquad (8)$$

now for scalar E, P. The Bloch equation eqn. (7) in 1+1 dimensions has $\mathbf{x}, t \to x, t$ and with eqn. (8) this is the reduced MB (RMB) system in 1+1 dimensions [4,5]. The RMB system in 1+1 is completely integrable and was integrated by a 2 × 2 matrix inverse scattering method in [5]. Note that this theory is "semiclassical": the quantum theory governs the Bloch equations, but the Maxwell equation is classical (a c-number theory in the sense of Dirac). Under a sequence of slowly varying envelope and phase approximations (SVEPAs) as shown in Fig. 2 this integrability is handed down [3-7,39]. Thus
Note that this theory is "semiclassical": the quantum theory governs the Bloch equations, but the Maxwell equation is classical (a c-number theory in the sense of Dirac). Under a sequence of slowly varying envelope and phase approximations (SVEPAs) as shown in Fig. 2 this integrability is handed down [3-7,39]. Thus

77

THE

(

ftieMftCNV

Of * * * fWSffi

VAPyf* EMV&ofe fiV* JHMB (SViStA) J

SLOWLY

lAtfuox c^n+* *in>

'

MWMULtTy ^

(SOLITt*) ^fJSfroCteAt folitons'

(Tt> Lee,

im)

FMST *OfT(CAL SOnTQK/S* \kU - tic CeJJ

ftrtt "Ofhu.1 Sofi"'«v»

in

t*** CQOk

* Av^

owe

HvdctiA

_

J |

Fig. 2. The integrable hierarchy of MB equations taken from Fig. 6 in [7]: the MB system is not integrable (see Ref. [46] below) but each of RMB (reduced MB), SIT, sine-Gordon (s-G) and NLS is integrable in 1+1 dimensions. Reference to Mark Saffman et al. (and our Chairman eg. !) and to 'Spatial "optical solitons"' is explained in [7]. Likewise reference to the "Aristocratic solitons" of T.D. Lee and the first "Optical Solitons" of Hahn and McCall (1967-69) [10]. The SVEPA is illustrated in the following Fig. 3.

the self-induced transparency (SIT) equations derived from the RMB equations this way are also exactly integrable. At exact resonance the so-called sharp-line SIT equations reduce to the sine-Gordon system which is integrable. And under a further SVEPA (or in non-relativistic limit) the s-G becomes the NLS which is also integrable [48]. The Fig. 2 taken from [7] illustrates this sequential integrability and the beginnings of a physics beyond the merely integrable systems and without their mathematical structure [7]: in Fig. 2 this new physics starts at the 'spatial optical solitons' which are actually integrable for 1+1 dimensions co-ordinatised as z (for t) and x (see below). The SVEPA for Fig. 2 is illustrated in the Fig. 3. In more detail the SVEPA approximates the solutions in the form that, for example, $E(x,t) = \mathcal{E}(x,t)\cos(kx - \omega t + \phi(x,t))$, and the envelope function $\mathcal{E}(x,t)$ and the phase function $\phi(x,t)$ are slowly varying in the sense that they change little in x and t over the wave-length $k^{-1}$ and the inverse frequency $\omega^{-1}$ respectively. Terms of order more than one in this ratio are then simply dropped. For nano-second optical pulses this ratio is about $10^{-6}$ but for femto-second optical pulses, since $\omega/2\pi \approx 10^{15}$ Hertz at optical frequencies, the full RMB equations must be used and no SVEPA is possible. However, the RMB equations have "breather" solutions (see below) and rather than use the SVEPA we can use these to get exact solutions for the SIT equations [4]. We now derive the SIT equations from the RMB under the SVEPA. Note again first that for the one-way going Maxwell equation, eqn. (8), it is back scattering which is neglected and this is possible for small enough n: thus $n \sim 10^{13}$ atoms c.c.$^{-1}$ in the vapours of $^{87}$Rb [47] and back scattering is properly neglected. Then for the SIT equations derived from the RMB equations one assumes [6] $E = \hbar p^{-1}\mathcal{E}(x,t)\cos\Phi$, $r_1 = P(x,t;\omega_0)\sin\Phi + Q(x,t;\omega_0)\cos\Phi$, $r_3 = N(x,t;\omega_0)$ for which $\Phi = kx - \omega t + \phi(x,t)$: the additional complication, which does not destroy the integrability, is that the atomic resonance frequency $\omega_0$ is also assumed to vary according to some distribution $g(\Delta\omega)$ (say) where $\Delta\omega = \omega_0 - \omega$: this is to meet, for example, the physics of Doppler shifted moving atoms and is usually called 'inhomogeneous broadening'; for no broadening $g(\Delta\omega) = \delta(\Delta\omega)$, the $\delta$-function, called 'sharp line'. Within the SVEPA one gets from the RMB equations the five SIT equations [6]

$$\begin{aligned}
\partial Q/\partial t &= -(\Delta\omega + \partial\phi/\partial t)\,P\\
\partial P/\partial t &= +(\Delta\omega + \partial\phi/\partial t)\,Q + \mathcal{E}(x,t)\,N(x,t;\omega_0)\\
\partial N/\partial t &= -\mathcal{E}\,P\\
\partial\mathcal{E}/\partial x + c^{-1}\,\partial\mathcal{E}/\partial t &= \alpha\,\langle P(x,t;\omega_0)\rangle\\
\mathcal{E}(x,t)\,(\partial\phi'/\partial x + c^{-1}\,\partial\phi'/\partial t) &= \alpha\,\langle Q(x,t;\omega_0)\rangle
\end{aligned} \qquad (9)$$

in which $\alpha = 2\pi np^2\omega\hbar^{-1}c^{-1}$, $\langle\ldots\rangle = \int g(\Delta\omega)(\ldots)\,d\Delta\omega$, $\phi'(x,t)$ is a new slowly varying phase such that $c^{-1}\partial\phi'/\partial t + \partial\phi'/\partial x = c^{-1}\Delta\omega - \Delta k + c^{-1}\partial\phi/\partial t + \partial\phi/\partial x$, and $\Delta k = k_0 - k$, $k_0 = c^{-1}\omega_0$. Ref. [6] gives all of the details. It is remarkable that this SIT system is exactly integrable, but this as we have explained is because the simpler RMB system, with inhomogeneous broadening, is integrable.

Fig. 3. Illustrating the slowly varying envelope and phase approximation (SVEPA).

In the 'sharp line' case at exact resonance the five SIT equations reduce to

$$\partial\mathcal{E}/\partial x + c^{-1}\,\partial\mathcal{E}/\partial t = \alpha P,\qquad \partial P/\partial t = \mathcal{E}N,\qquad \partial N/\partial t = -\mathcal{E}P \qquad (10)$$

and $Q = 0$, $\phi = 0$ throughout the motion, this requiring only that $Q = 0$, $\phi = 0$ initially [4]. Then this system is solved by setting $\mathcal{E}(x,t) = \partial\phi/\partial t$ in terms of a new function $\phi = \int_{-\infty}^{t}\mathcal{E}(x,t')\,dt'$, together with $P = \sin\phi$, $N = \cos\phi$: note that $Q^2 + P^2 + N^2 = P^2 + N^2 = 1$, since $Q = 0$. Conveniently we can recall the notation $\partial^2\phi/\partial x\partial t = \phi_{xt}$, etc. introduced near eqn. (1). Then from eqns. (10) $\phi_{xt} + c^{-1}\phi_{tt} = \alpha\sin\phi$. By a change of independent variables and the choice $c = 1$ this last equation is exactly the sine-Gordon equation, eqn. (1), with $\alpha = m^2$ and field $\phi(x,t)$. The s-G has the kink solutions

$$\phi(x,t) = \pm 4\tan^{-1}\exp[m(x - Vt)(1 - V^2)^{-1/2}] \qquad (11)$$

and the + is a $2\pi$-kink which takes $\phi(x,t) = 0$ for $x \to -\infty$ to $\phi(x,t) = 2\pi$ for $x \to +\infty$; the - is the $2\pi$ antikink which takes $\phi(x,t) = 0$ at $x = -\infty$ to $-2\pi$ at $x \to \infty$. The s-G is both a Hamiltonian system and Lorentz co-variant, and the stationary kinks found for $V = 0$ both have the rest energy $8m\gamma_0^{-1}$, where $\gamma_0 > 0$ is a real valued coupling constant (see below). However, there is also the $0\pi$ bound kink-antikink pair, the so-called breather solution

$$\phi(x,t) = 4\tan^{-1}[\tan\mu\,\sin\Theta_I\,\mathrm{sech}\,\Theta_R],$$
$$\Theta_R = (m\sin\mu)(x - Vt)(1 - V^2)^{-1/2}, \qquad \Theta_I = (m\cos\mu)(t - Vx)(1 - V^2)^{-1/2} \qquad (12)$$

for $\mu \in \mathbb{R}$, $0 < \mu < \tfrac12\pi$; the rest energy of this breather for $V = 0$ is $16m\gamma_0^{-1}\sin\mu$, less than the kink plus antikink rest energies (it is a bound pair). There is also the $4\pi$ kink solution, which is a sum of two $2\pi$-kinks: for any t the boundary conditions are $\phi(x,t) = 0$ for $x \to -\infty$ becoming $\phi(x,t) = 4\pi$ for $x \to +\infty$, and there are indeed $2n\pi$ kinks, $n = 1,2,\ldots$, with corresponding b.c.s. Fig. 5 gives both the general $4\pi$ kink solution and its time asymptotics for $t \to \pm\infty$. This Fig. 5 is an attempt to draw the coupled rigid pendulums making up the sine-Gordon model, at rest and excited into the $4\pi$ kink solution. For the latter $\phi$ turns through $2\pi$ twice (whereas for the breather $\phi$ swings through less than $2\pi$ and then swings back). Note from the time asymptotics how two separated kinks at $t \to -\infty$ become again two separated kinks as $t \to +\infty$. But in between there is a complicated sequence of interactions in the neighbourhood of $t = 0$. Note too that each $2\pi$-kink has the phase shift $\Delta = 2\ln a_{12}$, $a_{12}$ determined by the two kink velocities $V_1 \neq V_2$. The kink 'one' gets the total phase shift $\Delta_1 = \Delta$ from $t = -\infty$ to $t = +\infty$; the kink 'two' gets $\Delta_2 = -\Delta$; and the total phase shift all together is $\Delta_1 + \Delta_2 = \Delta + (-\Delta) = 0$. For more than two kinks (the "$2n\pi$-kink, $n > 2$") interactions are pairwise for each possible pair (three distinct pairs for the $6\pi$-kink, six for the $8\pi$-kink, etc.) and the total phase shift is conserved. The pair-wise shifts are a measure of the 'solvable S-matrix' available for quantum s-G, and the conservation of total phase shift is a measure of the 'complete integrability' of the sine-Gordon equation. It is instructive to note that for the $2\pi$-kink

$$\phi_x = 2m(1 - V^2)^{-1/2}\,\mathrm{sech}[m(x - Vt)(1 - V^2)^{-1/2}]. \qquad (13)$$

Fig. 4. The 'phase plane' (phase space) for the nonlinear pendulum. The plots are $\dot\phi$ against $\phi$ and the separatrices are the trajectories on which $\phi = -\pi$ goes to $\phi = \pi$ and $+\pi$ goes to $-\pi$.
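The kink and breather formulas, eqns. (11) and (12), can be checked numerically against the covariant sine-Gordon equation $\phi_{xx} - \phi_{tt} = m^2\sin\phi$. The following is a minimal sketch, not part of the original analysis: the function names, the finite-difference step `h`, and the sample parameter values are all illustrative assumptions, and the test is a residual check rather than a proof.

```python
import math

def kink(x, t, m=1.0, V=0.0, sign=1.0):
    """2*pi-kink (sign=+1) or antikink (sign=-1) of eqn. (11), with c = 1."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    return sign * 4.0 * math.atan(math.exp(m * g * (x - V * t)))

def breather(x, t, m=1.0, mu=0.6, V=0.0):
    """Breather of eqn. (12): bound kink-antikink pair, 0 < mu < pi/2."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    theta_r = m * math.sin(mu) * g * (x - V * t)
    theta_i = m * math.cos(mu) * g * (t - V * x)
    return 4.0 * math.atan(math.tan(mu) * math.sin(theta_i) / math.cosh(theta_r))

def sg_residual(phi, x, t, m=1.0, h=1e-3):
    """Central-difference estimate of phi_xx - phi_tt - m^2 sin(phi)."""
    p0 = phi(x, t)
    phi_xx = (phi(x + h, t) - 2.0 * p0 + phi(x - h, t)) / (h * h)
    phi_tt = (phi(x, t + h) - 2.0 * p0 + phi(x, t - h)) / (h * h)
    return phi_xx - phi_tt - m * m * math.sin(p0)

if __name__ == "__main__":
    moving_kink = lambda x, t: kink(x, t, V=0.5)
    moving_breather = lambda x, t: breather(x, t, mu=0.6, V=0.3)
    for x, t in [(-1.0, 0.2), (0.3, 1.0), (2.0, -0.7)]:
        print(abs(sg_residual(moving_kink, x, t)),
              abs(sg_residual(moving_breather, x, t)))
    # Total change of phi across the kink (its "area")
    print(kink(30.0, 0.0) - kink(-30.0, 0.0), 2.0 * math.pi)
```

The residuals should be at the level of the discretisation error only, and the last line compares the total change of $\phi$ across a kink with $2\pi$.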

The sech function, shown in Fig. 6, is the generic 'soliton' (the single soliton solution for all AKNS systems [10,36]). Evidently the amplitude increases as V increases in $0 < V < 1$ ($V \leftrightarrow Vc^{-1}$ with $c = 1$): accordingly 'bigger pulses travel faster' [50] and this 'explains' the collision property of two $2\pi$-kinks travelling in the same direction x: the bigger one overtakes the smaller one and re-emerges asymptotically at large enough x, exactly as though the bigger one has passed through the smaller despite the nonlinearity! The formula for the $4\pi$-kink given in Fig. 5 has exactly this property, with however the phase shifts as already explained: phase shifts are generic in 1+1 as already mentioned. Fig. 7 shows how the bigger $2\pi$-kink taken in generic sech form 'passes through' the smaller one: careful scrutiny also shows up the phase shifts after the collision. Notice that the 'area' of the pulse $\int_{-\infty}^{\infty}(\partial\phi/\partial x)\,dx = 2\pi$ exactly. The kink is the $2\pi$-kink for this reason. Similarly the $4\pi$-kink is (asymptotically!) the simple sum of two $2\pi$-kinks and the $2n\pi$-pulse is the sum of n $2\pi$-kinks. In this language the 'breather' is the $0\pi$ pulse and does not change any total area. Notice that in terms of 2-level atoms the $2\pi$-pulse rotates each atom from

Fig. 5. Model of coupled nonlinear pendulums in static equilibrium under gravity (above); and excited by a $4\pi$-pulse (below). The analytical expression is that for the $4\pi$-pulse of the s-G equation. Note the phase shifts $\Delta_1 = 2\ln a_{12}$, and $\Delta_2 = -\Delta_1$: $a_{12}$ is determined by $V_1$ and $V_2$, the speeds of the two $2\pi$-kinks involved.


Fig. 6. The generic hyperbolic secant pulse of the AKNS systems illustrated by eqn. (13).

Fig. 7. The 'bigger' $2\pi$-kink taken in generic sech form, eqn. (13), 'overtakes' the 'smaller' $2\pi$-kink taken in that generic form.

ground state to ground state: no energy is lost from the pulse to the atoms this way, and the 'attenuator', which is a dielectric made up from a smooth distribution n (say) of 2-level atoms each in their ground states initially, is actually transparent to such a pulse; indeed it is transparent to all $2n\pi$-pulses including $n = 0$, the breather. If the initial area of the pulse is not $2n\pi$, n = integer positive or negative or zero, then the pulse re-shapes to an appropriate value of n! This then is the mathematics of the physical observations of 'self-induced transparency' or SIT ([6,7,34,47,48] and refs). Notice how we can move from the s-G to the SIT equations, eqns. (9), themselves in this way. Notice too that this ability to 'rotate' a 2-level atom is in principle very general: a pulse of area $\theta$ (a '$\theta$ pulse') can rotate the Bloch vector of the 2-level atom by $\theta$. This ability to change the quantum state of a 2-level atom is at the heart of recent ideas to encode quantum information (see §5) on systems of 2-level atoms in cavities - an aspect of 'cavity q.e.d.' (see Figs. 9 and 10 below). The sech pulse, eqn. (13), is roughly speaking an electric field pulse (because of the change of independent variables one needs both of $\partial\phi/\partial x$, $\partial\phi/\partial t$ for s-G). The same kind of electric field pulse arises in the propagation of electrical signals in an optical fibre. In this case, in the absence of any resonance frequency in the fibre, the sine non-linearity in the s-G can be replaced by a $\phi^3$ nonlinearity: more precisely this can be done by a SVEPA [48] so that the resultant field is a complex field $\phi \in \mathbb{C}$. The result is the NLS equation which suitably scaled is

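$$-i\phi_t = \phi_{xx} - 2c|\phi|^2\phi, \qquad \phi \in \mathbb{C}, \qquad (14)$$

and the real valued parameter c has $c < 0$. This system with $c < 0$ has a 2-parameter sech solution satisfying b.c.s. $\phi \to 0$, $|x| \to \infty$:

$$\phi = |c|^{-1/2}\,q\,\mathrm{sech}[q(x - Vt)]\,\exp i[(q^2 - \tfrac14 V^2)t + \tfrac12 Vx] \qquad (15)$$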

with $q, V \in \mathbb{R}$. All of these solutions are of "breather" type (a sech envelope modulates a (complex) oscillatory term): like the s-G breathers these NLS solitons have velocity V and amplitude q (the two parameters) independent of each other (now 'bigger pulses do not travel faster'!): the breather of the NLS equation, eqn. (15), derives directly from the breather solution eqn. (12) of the s-G under a suitable SVEPA. By replacing t by a spatial co-ordinate z the NLS equation, eqn. (14), describes stationary solutions for a nonlinear dielectric with third order non-linearity in the space $(x,z) \in \mathbb{R}^2$: these are the 'spatial optical solitons' appearing in Fig. 2 already mentioned, and were first observed by C.H. Townes et al. in 1964 [51] and by P.L. Kelley in 1965 [52]. The first soliton to be observed was in the form of a 'bump of water' on the surface of a canal in 1834 [10,53]. The motion of this bump, which is a $\mathrm{sech}^2$ bump not a sech like eqn. (12), is governed by the famous Korteweg-de Vries equation [10,53-55]: the KdV equation is (in a frame moving at the sound speed and suitably scaled) $u_t + 6uu_x + u_{xxx} = 0$. (16)
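That the two-parameter sech solution, eqn. (15), satisfies the NLS equation, eqn. (14), for independent choices of q and V can be checked numerically. The sketch below is illustrative only: the function names, the step `h`, and the sample values of q, V and c are assumptions, and $c < 0$ is assumed throughout as in the text.

```python
import cmath
import math

def nls_soliton(x, t, q, V, c):
    """Sech solution of eqn. (15); assumes c < 0."""
    envelope = q / (math.sqrt(abs(c)) * math.cosh(q * (x - V * t)))
    phase = (q * q - 0.25 * V * V) * t + 0.5 * V * x
    return envelope * cmath.exp(1j * phase)

def nls_residual(x, t, q, V, c, h=1e-3):
    """Central-difference estimate of -i phi_t - phi_xx + 2c|phi|^2 phi."""
    p0 = nls_soliton(x, t, q, V, c)
    phi_t = (nls_soliton(x, t + h, q, V, c)
             - nls_soliton(x, t - h, q, V, c)) / (2.0 * h)
    phi_xx = (nls_soliton(x + h, t, q, V, c) - 2.0 * p0
              + nls_soliton(x - h, t, q, V, c)) / (h * h)
    return -1j * phi_t - phi_xx + 2.0 * c * abs(p0) ** 2 * p0

if __name__ == "__main__":
    # Amplitude q and velocity V are chosen independently of each other.
    for q, V, c in [(1.0, 0.5, -1.0), (1.2, -0.8, -0.5), (0.4, 0.0, -2.0)]:
        print(q, V, c, abs(nls_residual(0.3, 0.7, q, V, c)))
```

Each printed residual should be small for every (q, V) pair, in line with the remark that for NLS the amplitude and the velocity are independent parameters.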

The field $u(x,t)$, which arises as a velocity, replaces previous fields $\phi(x,t)$, and eqn. (16) is Galilean invariant if $u_t \to u_t + u_x$ and the number 6 is scaled away, as one checks. The soliton solution is $u = 2\xi^2\,\mathrm{sech}^2[\xi(x - 4\xi^2 t)]$ with $\xi \in \mathbb{R}$ and there are no breathers: bigger pulses travel faster for KdV - see e.g. [10,17,53]. Ref. [17] illustrates more than nine examples of soliton systems.

3. COMPLETE HAMILTONIAN INTEGRABILITY OF THE SOLITON SYSTEMS

All of the nine or so examples discussed in [17], and see e.g. [18,55,56] for still more, are completely integrable Hamiltonian systems and typically have actual soliton solutions. One completely integrable Hamiltonian system of the same type, a kind of repulsive s-G, is the sinh-Gordon (sinh-G) system, which does not have soliton solutions. It takes the form

$$\phi_{xx} - \phi_{tt} = m^2\sinh\phi \qquad (17)$$

and this sinh-G derives from the s-G by analytical continuation in the coupling constant $\gamma_0$ - see below and [18]. All of these systems mentioned are solvable by the inverse scattering method under b.c.s. vanishing sufficiently fast at $x \to \pm\infty$: particularly the hierarchical sequence RMB, SIT, s-G and NLS of Fig. 2 are all completely integrable Hamiltonian systems actually solvable via the 2 x 2 matrix inverse scattering method [10,56-58]. Here we consider complete integrability of the s-G equation, which in covariant form is eqn. (1), with the b.c.s. $\phi \to 0$ (mod $2\pi$), $\phi_x \to 0$ 'fast enough' (e.g. $\phi$ approaching a multiple of $2\pi$ exponentially fast as $x \to \pm\infty$). There is also the nonlinear evolution equation (NEE) form

$$\phi_t(x,t) = m^2\sin\Big[\int_{-\infty}^{x}\phi(x',t)\,dx'\Big] \qquad (18)$$

under comparable b.c.s. The NEE eqn. (18) is, for $\phi(x,t) = u_x(x,t)$,

$$u_{xt} = m^2\sin u \qquad (19)$$

which is the s-G in 'light-cone' coordinates. Takhtadjan and Faddeev used inverse scattering methods to demonstrate the complete integrability of the s-G equation, eqn. (1), in a preprint of 1974 [57]. R.K. Dodd and myself used these methods to demonstrate the complete integrability of the s-G in light-cone coordinates also in 1974 (in an oral presentation at the British Theoretical Mechanics Colloquium in Manchester). However in print we first of all reported both versions together in [59], which appeared in 1977: a more complete analysis was given in [60]. As noted in §1 Liouville's theorem [15] says that for $N < \infty$ degrees of freedom, given N independent constants $I_k$ commuting under the Poisson bracket $\{.,.\}$, i.e. $\{I_k, I_\ell\} = 0$, $k, \ell = 1,\ldots,N$, the system is completely integrable and can be integrated as a sequence of integrals. In [16] of 1974 Arnold shows that if the manifold of level lines, the set of $I_k$ = constant, is compact and connected, the motions are diffeomorphic to a torus (an N-torus $T^N$) and this means action-angle variables can be found. The N degrees of freedom define a symplectic manifold $M^{2N}$, smooth and differentiable, which carries the differential 2-form

$\omega = \sum_{i=1}^{N} dp_i \wedge dq_i$ in terms of local canonical coordinates $p_i, q_i$, $i = 1,\ldots,N$: the 2-form $\omega$ is the 2-form $\omega^{(2)}$ and $M^{2N}$ carries the forms $\omega^{(2)}, \omega^{(4)}, \ldots, \omega^{(2N)}$, in which the last is the phase volume. Each of these $\omega^{(2i)}$ is invariant under canonical transforms [16], and invariance of the phase volume $\omega^{(2N)}$ means the Jacobian of the canonical transformation is unity. For field theories like the KdV, RMB, SIT, s-G, and NLS equations one needs an invariant 2-form like

$$\omega = \int [d\Pi(x,t) \wedge d\phi(x,t)]\,dx \qquad (20)$$

over the running label x: the canonical coordinates are here $\Pi(x,t)$, $\phi(x,t)$ and satisfy the equal time Poisson bracket $\{\Pi(x,t), \phi(x',t)\} = \delta(x - x')$: the manifold M is now infinite dimensional, $2N \to 2 \times 2^{\aleph_0} = 2^{\aleph_0 + 1}$ in Cantor's sense of $2^{\aleph_0}$: the invariant forms extend in number correspondingly and the Jacobian is no longer simply unity. As noted (§1) the Liouville-Arnold theorem is not proved (as far as I know) for symplectic manifolds with this very large dimension. However, complete integrability of all of these systems like s-G is demonstrated (a) by finding a large enough set of commuting constants (certainly $2^{\aleph_0}$ of these); (b) by using these to explicitly integrate the system. A route is to find the action-angle variables under particular boundary conditions, e.g. vanishing sufficiently fast at $\pm\infty$, and to use the scattering (and inverse scattering) method to do this, for the scattering transform (spectral transform) and inverse scattering transform (inverse spectral transform) are canonical transformations (see [57,58,60,61]). A history of demonstrations of such complete integrability for different models is contained in effect in [10,19,56]. Thus for the s-G, eqn. (1), one finds, conveniently for the equations of motion, the Hamiltonian

$$H[\phi] = \gamma_0^{-1}\int\big[\tfrac12\gamma_0^2\Pi^2 + \tfrac12\phi_x^2 + m^2(1 - \cos\phi)\big]\,dx \qquad (21)$$

in which $\gamma_0 \in \mathbb{R}$ and $\gamma_0 > 0$. Under a trivial canonical transform plainly leaving $\omega$, eqn. (20), invariant, $\gamma_0^{1/2}\Pi \to \Pi$, $\gamma_0^{-1/2}\phi \to \phi$, H is equivalently

$$H[\phi] = \int\big[\tfrac12\Pi^2 + \tfrac12\phi_x^2 + m^2\gamma_0^{-1}(1 - \cos(\gamma_0^{1/2}\phi))\big]\,dx, \qquad (22)$$

a more usual form in the literature, and by expanding the cosine $\gamma_0$ is plainly seen as a coupling constant coupling the nonlinear terms to the linear ones. Note that as $\gamma_0 \to 0$ one actually gets the K-G (Klein-Gordon) linear equation. Moreover by the continuation $\gamma_0^{1/2} \to i\gamma_0^{1/2}$ in eqn. (22) one gets sinh-G, eqn. (17) (for which $\gamma_0 \to 0$ also gives K-G). To be explicit, from the form $H[\phi]$ of eqn. (21),

$$\phi_t = \{H, \phi\} = \gamma_0\Pi \quad (= \delta H/\delta\Pi) \qquad (23)$$
$$\Pi_t = \{H, \Pi\} = \gamma_0^{-1}[\phi_{xx} - m^2\sin\phi] \quad (= -\delta H/\delta\phi) \qquad (24)$$

and $\phi_{tt} = \gamma_0\Pi_t$ is exactly eqn. (1), since $\gamma_0$ vanishes from these particular Hamiltonian equations. As in [18,56-60] $H[\phi]$ can be expressed as $H[p]$ in terms of action-angle variables (under the chosen vanishing b.c.s. at infinity; ref. [18], which is concerned to provide a quantum and classical statistical mechanics of solitons, shows how to connect these to action-angle variables under periodic b.c.s.). For laboratory coordinates, eqn. (1), the $H[p]$ takes the elegant form

$$H[p] = \sum_{i=1}^{N_k}(M^2 + p_i^2)^{1/2} + \sum_{j=1}^{N_{\bar k}}(M^2 + \bar p_j^2)^{1/2} + \sum_{\ell=1}^{N_b}[4M^2\sin^2\theta_\ell + p_\ell^2]^{1/2} + \int_{-\infty}^{\infty}\omega(k)P(k)\,dk; \qquad \omega(k) = (m^2 + k^2)^{1/2}. \qquad (25)$$
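The particle-like energies entering eqn. (25) can be compared with a direct numerical evaluation of the Hamiltonian, eqn. (21), on the kink and breather solutions, using $\gamma_0\Pi = \phi_t$ from eqn. (23). The sketch below is a hedged illustration: the function names, the integration box `L`, the grid size `n`, the step `h`, and the sample values are assumptions, and the integrals are plain trapezoidal sums.

```python
import math

def kink(x, t, m=1.0, V=0.0):
    """2*pi-kink of the s-G equation with velocity V (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - V * V)
    return 4.0 * math.atan(math.exp(m * g * (x - V * t)))

def breather(x, t, m=1.0, mu=0.6):
    """Breather at rest (V = 0), with 0 < mu < pi/2."""
    return 4.0 * math.atan(math.tan(mu) * math.sin(m * math.cos(mu) * t)
                           / math.cosh(m * math.sin(mu) * x))

def energy(phi, t=0.0, m=1.0, gamma0=1.0, L=40.0, n=8000, h=1e-4):
    """Trapezoidal estimate of H in eqn. (21), with gamma0 * Pi = phi_t:
    H = (1/gamma0) * integral [phi_t^2/2 + phi_x^2/2 + m^2 (1 - cos phi)] dx."""
    dx = 2.0 * L / n
    total = 0.0
    for i in range(n + 1):
        x = -L + i * dx
        phi_x = (phi(x + h, t) - phi(x - h, t)) / (2.0 * h)
        phi_t = (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)
        density = (0.5 * phi_t ** 2 + 0.5 * phi_x ** 2
                   + m * m * (1.0 - math.cos(phi(x, t))))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * density * dx
    return total / gamma0

if __name__ == "__main__":
    M = 8.0  # rest mass 8 m / gamma0 for m = gamma0 = 1
    print(energy(kink), M)                                 # static kink
    print(energy(lambda x, t: kink(x, t, V=0.6)),
          M / math.sqrt(1.0 - 0.6 ** 2))                   # moving kink
    print(energy(breather, t=0.3), 2.0 * M * math.sin(0.6))  # breather at rest
```

With $m = \gamma_0 = 1$ the static kink should give $M = 8$, the moving kink $M(1 - V^2)^{-1/2} = (M^2 + p^2)^{1/2}$, and the breather at rest $2M\sin\mu$, matching the three types of term in eqn. (25).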

The quantity M is exactly the rest mass of the single soliton solution: $M = 8m\gamma_0^{-1}$, and it can be seen that the two first summations are apparently for $N_k$ kinks and $N_{\bar k}$ anti-kinks, each with relativistic momenta $p_i$ or $\bar p_j$. The third sum apparently involves $N_b$ breather solutions, eqn. (12), of s-G eqn. (1) (the numbers $N_k$, $N_{\bar k}$, $N_b$ are fixed by the initial data): the parameters $\mu_\ell$ for each breather eqn. (12) appear now rewritten as the canonical momenta $\theta_\ell$ and $0 < \theta_\ell < \pi/2$. Note that the $P(k)$ in eqn. (25) have the running label k and the $p_i$, $\bar p_j$, $p_\ell$, $\theta_\ell$, and $P(k)$ together satisfy the requirement that there be $2^{\aleph_0}$ action variables: there are also canonical angle variables for each action variable (which do not of course appear in $H[p]$). The phase spaces are indeed

i. $P(k) > 0$, $0 < Q(k) < 2\pi$; $\{P(k), Q(k')\} = \delta(k - k')$.
ii. $-\infty < p_i, q_i < \infty$, $\{p_i, q_j\} = \delta_{ij}$ (not compact).
iii. $-\infty < \bar p_i, \bar q_i < \infty$, $\{\bar p_i, \bar q_j\} = \delta_{ij}$ (not compact).
iv. $-\infty < p_\ell, q_\ell < \infty$, $\{p_\ell, q_m\} = \delta_{\ell m}$ (not compact).
v. $0 < \theta_\ell < \tfrac12\pi$, $0 < \Phi_m < 8\pi$, $\{4\gamma_0^{-1}\theta_\ell, \Phi_m\} = \delta_{\ell m}$.

Note how these $2^{\aleph_0}$ canonical pairs $P(k), Q(k)$ together with $N_k$ $p_i, q_i$, $N_{\bar k}$ $\bar p_i, \bar q_i$, $N_b$ $p_\ell, q_\ell$ and $4\gamma_0^{-1}\theta_\ell, \Phi_\ell$, together define a torus of at least $2^{\aleph_0}$ dimensions, and notice how the torus defined by the $p_i, q_i$; $\bar p_i, \bar q_i$; $p_\ell, q_\ell$ is opened up and so is not compact. Indeed the $P(k), Q(k)$ define a volume which is a cylinder for each k and this part of the phase space is the large product of each such cylinder: the cylinders are opened into sheets for the $p_i, q_i$, etc. Evidently because this large dimensional torus is formed as the direct product of "more than" $2^{\aleph_0}$ separate one-tori, this means in physical terms that the relativistic "particles" apparent in $H[p]$, eqn. (25), are non-interacting particles: in practice in the statistical mechanics of s-G as presented in [18] (and in e.g. [55] and the other refs. in

[Fig. 8: the 'map' "Solitons". The diagram is not reproduced here; legible box labels include: Lax Pair; Spectral Problem; Spectral Data S(0); canonical transformation; Zero Curvature; Integrable Model (= NEE); Riemann Problem; Marchenko Equation; Riemann theta-functions; K-P Equations; Loop Algebras; Affine (Kac-Moody) Lie algebras; Virasoro Algebras; Bose-Fermi Equivalence; Sklyanin Bracket and r-matrix; Action-Angle Variables; Partition Function Z for 1+1 dimensional classical or quantum integrable models; Braid Group; q-Bosons; Hopf Algebra; Quantum Groups; Quantum Integrable Models (e.g. s-G, MTM, NLS); Symmetries of KP-I and KP-II; Quantum Spin-1/2 XYZ model (1+1); 8-Vertex Model; Solvable Lattice Models (2+0); Potts Model; Ising Model; Jones Polynomials; Generalised Statistics; Conformal Field Theory; Theory of Partitions (Wadati); Strings; 2D-Quantum Gravity; integral of the Chern-Simons 3-form.]

Fig. 1 Overview of generalised 'Soliton' theory as of August 1991 taken from Refs. [7,8]. A hard arrow indicates minimal connection (at least) between the boxes is already established and most hard arrows are actual mappings. Dashed arrows indicate expectation by the authors that some such minimal connection can be achieved or stronger. Note how p-reduction of the KP equations reaches the string equation for 2D-Quantum Gravity coupled to (p,q) conformal matter (72,81). Pure quantum gravity is p = 2, the case considered by Migdal [50].

Fig. 8. The 'map' "Solitons" in the form published in [19] and used in [7]: the inset 'Fig. 1' etc. is that as used in [19,48] and the references [7,8] etc. are references in [19], as explained in more detail in the text.

[18]) one finds instead that under the periodic boundary conditions necessarily used there, the consequent phase shifts describe a highly significant interaction between these particles. To show that this total count of action-angle pairs (or 1-tori) is sufficient to integrate the system one observes that, for example, $P(k)$ = constant but $Q_t(k) = \omega(k)$; and from the P, Q at time $t = 0$ one finds all of these at time t and so inverts these P, Q at time t to the original variables $\Pi$, $\phi$ at time t. The treatment of the solitons works similarly and these likewise contribute to the $\Pi$, $\phi$ at time t. To find the $P(k)$, $Q(k)$ etc. at $t = 0$ one uses the "Lax pair" for s-G [10,56-58,60,61]. One part of this pair is typically an eigenvalue (or spectral) problem $Lv = \zeta v$ (in which L is a 2 x 2 matrix differential operator and $\zeta \in \mathbb{C}$ [10]). It is through L that one converts the initial data for $\phi(x,0)$ to spectral data $S(0)$, which then determine all of the action-angle variables [59,60]. After the time evolution of these angle variables under the Hamiltonian $H[p]$ one gains the spectral data $S(t)$. One then inverts back these spectral data to $\phi(x,t)$ (for 2 x 2 matrix spectral problems) through the Gelfand-Levitan-Marchenko equation, a linear integral equation (for n x n matrix spectral problems one uses the more general Riemann-Hilbert problem method [57,61,62]). Much of this is exhibited in the 'map' 'Solitons', Fig. 8. This 'map' is taken from Fig. 1 of Ref. [19] with the addition of the 'optical soliton' in the Experiments "box" at the extreme right (which addition was first exhibited in [48]). Note too that the caption for Fig. 1 of [19] refers to the references [7,8] of [19], now referenced as [31,65] here. Likewise the references [72,81] and [50], all of [19], are repeated here as [64,65] and [66]. Moreover the 'Overview' referred to goes back to 1991 for the Refs. [7,8] of [19]. These references are much concerned with quantum groups.
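The time-evolution step described above, in which each action $P(k)$ is constant and each angle advances at the rate $\omega(k) = (m^2 + k^2)^{1/2}$ of eqn. (25), is the only dynamical step in the scheme and is trivial to carry out. The following minimal sketch illustrates it for a single continuum mode; the function names and the reduction of the angle modulo $2\pi$ are illustrative assumptions, and the spectral and inverse spectral transforms themselves are not implemented here.

```python
import math

def omega(k, m=1.0):
    """Dispersion relation omega(k) = (m^2 + k^2)^(1/2) of eqn. (25)."""
    return math.sqrt(m * m + k * k)

def evolve_mode(P0, Q0, k, t, m=1.0):
    """Evolve one action-angle pair under H = integral omega(k) P(k) dk:
    P_t = 0 and Q_t = omega(k), so P stays fixed and Q grows linearly."""
    return P0, (Q0 + omega(k, m) * t) % (2.0 * math.pi)

if __name__ == "__main__":
    P0, Q0 = 2.0, 1.0
    for k in (0.0, 0.5, 2.0):
        print(k, evolve_mode(P0, Q0, k, t=3.0))
```

Because the evolution is linear in the angle, evolving for $t_1$ and then $t_2$ agrees with evolving once for $t_1 + t_2$, which is the sense in which the spectral data $S(0)$ determine $S(t)$ directly.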
The first appearance of this 'map' was in [67] of 1988 and contains slightly less material. It was offered on that occasion simply as a joke: the joke now is on the perpetrator since the 'map' can evidently be used!$^2$ The idea of the classical inverse method is as described. On the 'map' the Lax pair, top left, is written matrix wise in the form $\psi_x = U\psi$, $\psi_t = V\psi$ (U, V and indeed $\psi$ [10] are n x n matrices). The integrability condition $\psi_{xt} = \psi_{tx}$ then leads to $U_t - V_x + [U,V] = 0$, which is the typically nonlinear equation of motion for the model under study. Notice that by working with the Lax pair as $d\psi = \Omega\psi$, in which $\Omega$ is a matrix of 1-forms, the compatibility condition becomes $d\Omega + \Omega \wedge \Omega = 0$, a 'zero curvature condition' (refer to Chap. 1 of [10]). Thus the nonlinear equation of motion has, inevitably, exactly this particular geometrical structure: countable infinities of conservation laws are derived in the same geometrical framework in [68]. The motion $U_t - V_x + [U,V] = 0$ can equally well be written $L_t = [A, L]$ by including differentials $\partial/\partial x$ in L as in [10]: L then defines the spectral (or eigenvalue) problem $Lv = \zeta v$ as mentioned already, and these flows are isospectral flows such that $\zeta_t = 0$. In this form L and A are given in [10,58-60] for the s-G in light-cone coordinates.

$^2$ There was an error in the 'map' in [67], namely that in the 'box' "Symmetries of KP-I and KP-II..." the relation $[K_m, T_n] = \tfrac12(m + 1)K_{m+n-2}$ and n → 1, not n.
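To make the zero-curvature statement concrete, the short calculation below first derives the compatibility condition and then evaluates it for one standard AKNS-type Lax pair for the light-cone sine-Gordon equation with $m = 1$. The particular matrices U and V, and the normalisation of the spectral parameter $\zeta$, are an assumption chosen for illustration; they need not coincide with the conventions of [10,58-60].

```latex
% Compatibility of psi_x = U psi and psi_t = V psi
\psi_{xt} = (U_t + UV)\,\psi, \qquad \psi_{tx} = (V_x + VU)\,\psi
\quad\Longrightarrow\quad \big(U_t - V_x + [U,V]\big)\,\psi = 0 .

% An illustrative Lax pair (assumed normalisation)
U = \begin{pmatrix} -i\zeta & -\tfrac12 u_x \\ \tfrac12 u_x & i\zeta \end{pmatrix},
\qquad
V = \frac{i}{4\zeta}\begin{pmatrix} \cos u & \sin u \\ \sin u & -\cos u \end{pmatrix}.

% The pieces of the zero-curvature expression
U_t = \tfrac12 u_{xt}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
-V_x + [U,V] = -\tfrac12 \sin u \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},

% so that
U_t - V_x + [U,V] = \tfrac12\,(u_{xt} - \sin u)\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
```

The $\zeta$-dependent terms cancel identically, so the condition holds for all $\zeta$ exactly when $u_{xt} = \sin u$, which is eqn. (19) with $m = 1$.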

One speaks of the eigenvalue problem $Lv = \zeta v$ for 2 x 2 matrices, or that for n x n matrices more generally, as a scattering (or spectral) transform which transforms initial data at $t = 0$ to a suitably complete set of spectral data $S(0)$. 'Suitably complete' means $S(0)$ can be inverted back via the inverse spectral transform to regain $\phi(x,0)$ [10] at $t = 0$, and $S(0)$ is equivalent to $\phi(x,0)$ in this sense. Likewise after evolution of $S(0) \to S(t)$ this can be inverted to regain $\phi(x,t)$ at time t. On the 'map' all of this last is marked as "$U(x,t)$ solves integrable models in 1+1 dimensions", and $\phi(x,t)$ or $\phi_x(x,t)$ is an element of the matrix U. This procedure, spectral transform followed by inverse spectral transform, evidently constitutes a rather remarkable nonlinear Fourier transform method of solution [10,58]. Moreover, like the Fourier transform, the spectral transform and inverse spectral transform are a canonical and an inverse canonical transform in the Hamiltonian sense. In practice the spectral data $S(t)$ at time t are inverted via a Riemann-Hilbert problem: the one shown in the double lined box on the map is for the 2 x 2 matrix Zakharov-Shabat system [10] and is equation (2.5.21) in [61]: $R(k)$ is a scalar function of $k \in \mathbb{R}$, the contour C is the real k axis, and sub-indices 1,2 refer to the two elements $\Phi_1 = \Phi_1(x, k + i0)$, $\Phi_2(x, k + i0)$ of a column vector: super indices +, - refer to these elements at $k \pm i0$, and $\Phi^+$, $\Phi^-$ are matrices and so is H, the given matrix: $\Phi^-$ has a given asymptotics (see [67] where some of this is referenced and discussed). In [61] on the other hand P.J. Caudrey, working with N x N matrices and $N \to \infty$, was able to go over to the non-local Riemann-Hilbert or D-bar problems capable of solving the K-P I and II equations in 2+1 dimensions respectively: the K-P (Kadomtsev-Petviashvili) equations take the form on the 'map' Fig. 8, which is for $u = u(x,y,t)$,

$$(u_t + 6uu_x + u_{xxx})_x = \pm u_{yy} \qquad (26)$$

and the KP-I, KP-II belong to the + and the - respectively. The solution of KP-I and KP-II is carried further in Caudrey's paper [62]. Moreover action-angle variables for these KP equations are in [20,22] while they were also given

in [22] for the Davey-Stewartson system: the Davey-Stewartson [61] system is the DS on the 'map' Fig. 8 inside the KP 'box'. However it has turned out that the solved DS-I system, solved via a Riemann-Hilbert problem [22], is actually not Hamiltonian (while the Hamiltonian form of DS-I is not yet solved). Thus the status of the "action-angle" variables for DS-I in [22] is still open. Notice that the geometry of solitons already mentioned [10-12] also connects, in the spirit of Atiyah's remarks, to much algebra evident on the 'map' Fig. 8. Thus the loop algebras in the third column from the left in Fig. 8 connect with the quantum groups, which are Hopf algebras [30,31] with the co-multiplication $\Delta T = T \otimes T$ already briefly explained, in the second column from the right, and these quantum groups connect with the quantum inverse method or algebraic Bethe ansatz [24] in the third column from the right. My paper [30] illustrating this route and the algebra of quantum groups is rather incomplete because it was quite an early paper on this topic (first presented in 1988; Drinfeld's Berkeley article is 1986 [32]): following up the remark of Atiyah, the word 'quantum' is appropriate because, as Fig. 8 shows, the quantum groups lead to the quantum inverse method [24] for solving quantum integrable systems: this method involves an R-matrix, as is displayed in the third column from the right, a solution of the Yang-Baxter relations, also in that column. The semi-classical limit of R is the little r-matrix which enters into the Sklyanin bracket, as at the top of the third column from the right of Fig. 8: this defines the Poisson-Lie group Hopf algebra structure traceable back to Drinfeld 1983 [69]. The key expression $RT \otimes T = T \otimes TR$ in the quantum inverse method 'box' in the third column from the right in Fig. 8 is a quantum integrability condition as mentioned in §1: more explicitly [18]

$$R(\lambda,\mu)\,T(\lambda) \otimes T(\mu) = T(\mu) \otimes T(\lambda)\,R(\lambda,\mu) \qquad (27)$$

and $\lambda, \mu \in \mathbb{C}$. And the matrix trace of eqn. (27) yields $[A(\lambda), A(\mu)] = 0$ with $A(\lambda) = \mathrm{Tr}\,T(\lambda)$: the $A(\lambda)$ ($A(\mu)$) are indeed a very large number of commuting constant quantum operators! In our paper [70], for example, which solves the Tavis-Cummings problem of quantum optics for $N > 1$ 2-level atoms and one e.m. field mode, only three independent constant operators are actually involved, and the significance of the infinite number of constants deriving from eqn. (27) is obscure (to me). As noted, the 'map' Fig. 8 was first presented at the 18th Intl. Meeting on Differential Geometric Methods in Theoretical Physics held at Chester, UK in 1988: there it was intended to illustrate the use of the Riemann-Hilbert methods for the inverse spectral transform, and to illustrate some of my work with J.T. Timonen, Finland on the quantum and classical statistical mechanics of the integrable models [67] - particularly that on the SM of the sine-Gordon field theory in 1+1 dimensions summarised later in [18] and its references. This is how the Partition Function $Z = \int\mathcal{D}\mu\,\exp S[p]$ comes into the 'map' Fig. 8 via the Riemann surface of genus infinity, in which the classical action $S[p]$ is expressed in terms of action-angle variables under periodic b.c.s., that is [18] the

canonical invariant $S[\Pi,\phi]$, the classical action expressed essentially as the integral invariant of Poincaré-Cartan [16] as $S[\Pi,\phi] = \int\Pi\phi_t\,dx - H[\phi]$, canonically transforms to $\int P(k)Q(k)_t\,dk - H[p]$ (for the solitons of s-G to be included the invariant, essentially Poincaré's invariant [16], which is here $\int P(k)Q(k)_t\,dk$, must be extended [18]: the sinh-G has no solitons and no such extension is needed [18]). For the quantum and classical statistical mechanics the classical action $S[\Pi,\phi]$ is Wick rotated, $t \to -i\tau$ and $0 < \tau < \beta$ and $\beta = (k_B T)^{-1}$, $k_B$ = Boltzmann's constant, T the temperature. The reader is referred to [18] for actual evaluation of the functional integrals $Z = \int\mathcal{D}\mu\,\exp S[p]$, especially these for the s-G and sinh-G systems, as carried out in detail in [18]. In passing note too how Fig. 8 shows that the quantum s-G sits inside the spin-$\tfrac12$ XYZ model in 1+1 dimensions (see third column from the right in Fig. 8, where 'Quantum spin-$\tfrac12$ XYZ model' maps onto the quantum s-G). This also leads down in that column to Baxter's '8-Vertex model' of 2+0 dimensional statistical mechanics, and this 8-Vertex model contains both the 6-vertex and the Ising models [108]. Conversely, with reference to the remarks in the preface to [108], the physical applications of the RMB, SIT, s-G and NLS equations in classical and quantum forms (see especially §§4,5 next for the latter) wholly justify the intellectual achievement involved in solving exactly these 2+0 dimensional models of statistical mechanics. The complex of models solved as at Fig. 2 in 1973 thus solves enormously more than was realised at that time! (These several remarks referencing the new ref. [108] were only added in the final 'proof'.) Evidently I need several more lectures in order to cover all of the content of the 'map' Fig. 8! However, we have still to deal with experiment also.
There is the EXPERIMENTS box, the experimental output of all of this theory, at the extreme right of Fig. 8. Now I need to refer to these experimental aspects, and Figs. 9 and 10 attempt to do some of this. Fig. 9 (taken from [7]) elaborates a bit on the actual EXPERIMENTS box in Fig. 8. Particularly it refers to some aspects of interest in quantum and nonlinear optics - notably the micromaser, SIT and the optical soliton, the MB equations, optical bistability, transoceanic optical communication in fibres, all but the micromaser already referred to in this paper. For the micromaser, reference to the Abstract 'Stroboscopic theory of out-coming atom statistics and quantum measurements for the one atom maser' reproduced in [7] is one possibility: a quantum dynamical analysis of one-, two-, and N-atom micromasers is in [71] and there is the preprint [72] for the relevant references to the real micromaser in operation at Garching, near Munich (for numerical work on this quantum dynamics see also [73,74]). In the 'key words' Fig. 10 (taken from [7]) I was unable (so far) to give rigorous, or even semi-rigorous, connections like those marked in the 'map' Fig. 8. But, for example, the quantum coherent states and solitons have a correspondence [26]: roughly speaking every semi-simple Lie algebra has a coherent quantum state associated with it [26,75] and these states are typically "squeezed" such that for canonical quantum operators p, q, $[q,p] = i\hbar$, this commutation relation means $\Delta p\,\Delta q \ge \tfrac12\hbar$, yet e.g. the mean

Fig. 9. Extension of the 'map', Fig. 8, expanding on the EXPERIMENTS 'box'. Notice that in what is the extreme left hand column here 'e.g. s-G, MTM, NLS' (for sine-Gordon, massive Thirring model, nonlinear Schrodinger equation respectively) is extended to include 'Maxwell-Bloch MB', 'optical solitons' referring to MB as well as attractive NLS. Cavity q.e.d., developed further for 'quantum information' in section 5, now enters in this Fig. 9 only to 'Micromaser', but there via the quantum integrable Tavis-Cummings model [70] of N 2-level atoms and one cavity mode. One example of the T-C models is the Jaynes and Cummings model, eqn. (50), fundamental [71-74] to the models of the micromaser. Nonlinear dielectrics [44,45] enter at the right.


Fig. 10. The EXPERIMENTS of Figs. 8,9 are extended further. 'Non-classical light' enters as 'squeezing' top right and 'sub Poissonian photon statistics from the micromaser' [40,72-75]: the purple reference to 'coherent states = solitons' refers to the coherent states of arbitrary semi-simple Lie algebras, which are typically squeezed, e.g. su(2), atomic coherent states [75]. There is the correspondence between such coherent states and integrable systems noted in [26] mentioned in the text. Both the '$q$-deformed bosons' [23,29] and the 'quantum repulsive Bose gas' lead naturally to 'BEC', appearing at the bottom, as analysed in the parabolic trap in the section 4: 'Gross-Pitaevsky' (Pitaevskii) is a self-consistent approximation, in c-number form, to the exact quantum theory of BEC [85] and agrees with much experimental data. A first exact (in the scaling limit) calculation of the correlation function $\langle\psi^\dagger\psi\rangle$ of section 4 for the repulsive Bose gas and $d = 1$ which uses the $q$-bosons on a lattice is in [107]: the correlator for the repulsive Bose gas is in [34].

fluctuation in $q$, which is $\Delta q$, is squeezed below the Heisenberg bound, $\Delta q < (\tfrac12\hbar)^{1/2}$, not difficult to do mathematically but quite difficult to do for physical systems [75]. In optical fibres the electric field pulse, a solution of the NLS equation, can take the general classical form of eqn. (15). But viewed as a quantum object (a "quantum soliton"?) it will be a solution of the quantum NLS equation in 1+1 dimensions

$$-i\phi_t = \phi_{xx} - 2c\,\phi^\dagger\phi^2 \qquad (28)$$

for which there are the Bose commutation relations $[\phi(x,t),\phi^\dagger(x',t)] = \delta(x - x')$, and which is solved for $c > 0$ by the algebraic Bethe ansatz [24] (the case $c < 0$ needs further work but see e.g. the one $n$-string solution for $c < 0$ exhibited at eqns. (53) and (54) below). The quantum NLS in 3+1 is not quantum integrable. It is very much my current interest so I make some remarks on it now.

4. BOSE-EINSTEIN CONDENSATION (BEC)

The discovery by S N Bose in 1924 that thermally excited photons satisfied "Bose statistics" when seen as a problem of the quantum statistical mechanics of massless particles led Albert Einstein to introduce the corresponding Bose-Einstein statistics for the massive particles which are now called massive bosons. It was realised that at low enough temperatures, less than or much less than a few Kelvin, a phase transition should occur in which macroscopic numbers of these bosons should collapse into one single quantum state of lowest (free) energy. As we now know this is in contrast with the 'fermions' which cannot occupy the same quantum states. In 1938 F. London suggested that the peculiar 'superfluid' properties of liquid $^4$He below $T \sim 2.2$ K were an actual manifestation of just such a Bose-Einstein Condensate (a BEC). In 1947 [76]

N.N. Bogoliubov (NNB) suggested that such BEC's would be well described by a weakly interacting gas of bosons interacting through a "hard-core" potential in pairs, and only in pairs, of the type $c\,\delta(r_1 - r_2)$ in which $\delta(r_1 - r_2)$ is the 3-dimensional ($d = 3$) $\delta$-function and $c$ is a small parameter - evidently the boson-boson coupling constant. NNB showed in effect that $c > 0$ for stability of the system [77], so the pair potential is repulsive hard core. Since for $N$ bosons the total pair potential is $c\sum_{i=1}^{N}\sum_{j=1,\,j\neq i}^{N}\delta(r_i - r_j)$ each pair introduces the potential $2c\,\delta(r_i - r_j)$. Huang [78], for example, describes the properties of this repulsive 'Bose gas'. In 1947 NNB calculated the particle energy spectrum of such a weakly interacting repulsive Bose gas below the critical temperature $T_c$ for any BEC and showed that in this 'condensate' although the individual bosons had the free-particle kinetic energies $p^2/2m$ ($p = |\mathbf{p}|$ is the magnitude of the particle momentum) the collective excitation behaved as though each 'particle' of that collective excitation had a single particle kinetic energy $\sqrt{N_0 v(0)/mV}\,p$ in which the constants in the square root are introduced below. From this result for the collective excitations of the condensate he was able to derive superfluidity - an interpretation depending critically on the $p^2 \to |\mathbf{p}| = p$ behaviour [79]. It is well known that for relative simplicity of 'many-body problems' involving many interacting particles it is helpful to work in so-called second quantisation [80]: one introduces for bose systems two quantum fields $\hat\psi$, $\hat\psi^\dagger$ ($\hat\psi^\dagger$ is the adjoint of $\hat\psi$) which satisfy the equal time "Bose commutation relations"

$$[\hat\psi(r,t),\hat\psi^\dagger(r',t)] = \hbar\,\delta(r - r') \qquad (29)$$

in which $r, r'$ are vectors in $d$ dimensions. For the particular $2c\,\delta(r - r')$ interaction between two bosons at $r$ and $r'$ Schrödinger's linear time dependent equation can be rewritten exactly, in second quantised form, as the nonlinear equations

$$-i\hbar\frac{\partial\hat\psi}{\partial t} - \frac{\hbar^2}{2m}\nabla^2\hat\psi + 2c\,\hat\psi^\dagger\hat\psi\hat\psi = 0, \qquad i\hbar\frac{\partial\hat\psi^\dagger}{\partial t} - \frac{\hbar^2}{2m}\nabla^2\hat\psi^\dagger + 2c\,\hat\psi^\dagger\hat\psi^\dagger\hat\psi = 0 \qquad (30)$$

(note the 'normal order' in the $\hat\psi^\dagger\hat\psi\hat\psi$, etc). Reference to eqn. (28) shows that eqns. (30) with (29) for $\hbar = 1$, $m = \tfrac12$ are exactly the $d$-dimensional quantum NLS equation: $\nabla^2 = \sum_{i=1}^{d}\partial^2/\partial x_i^2$, the Laplacian in $d$ dimensions. If we replace the quantum fields $\hat\psi, \hat\psi^\dagger$ by classical fields $\psi, \psi^*$ satisfying the Poisson bracket $\{\psi(r,t),\psi^*(r',t)\} = i\delta(r - r')$ in $d$ dimensions (so that $i\hbar^{-1}[.,.]$ is replaced by $\{\psi,\psi^*\}$, the reverse of Dirac's canonical quantisation, so that this 'semiclassical limit' is formally singular in $\hbar$), then eqns. (30) can be seen to be exactly the classical NLS equation in however many space dimensions $d$. Unfortunately, although the NLS systems are completely integrable and solvable in $d = 1$ dimensions (whether $c > 0$ or $c < 0$) these equations are no longer completely integrable in $d = 2$ or $d = 3$ space dimensions: there are not 'enough' conserved quantities in these cases (§1). Under translational invariance the total linear momentum

$$P = -i\hbar\,\tfrac12\int[\psi^*\nabla\psi - (\nabla\psi^*)\psi]\,d^dr \qquad (31)$$

for the classical $\psi, \psi^*$ (called c-numbers by Dirac) commutes with the Hamiltonian under the bracket $\{.,.\}$. For the quantum fields $\hat\psi, \hat\psi^\dagger$ under the Lie bracket eqn. (29) this likewise remains true. For a quantum Bose condensate with zero macroscopic momentum $P$ NNB exploited the idea that a macroscopic number of massive bosons were now in the $p = 0$ mode. In momentum space NNB's Hamiltonian was [81]

$$H = \sum_p \frac{p^2}{2m}\,a_p^\dagger a_p + \frac{1}{2V}\sum_{p_1+p_2=p_1'+p_2'} v(p_1 - p_1')\,a_{p_1}^\dagger a_{p_2}^\dagger a_{p_2'} a_{p_1'} \qquad (32)$$

in which $v(p)$ is the Fourier transform to $p$-space of the pair potential $v(r - r')$: evidently $v(p) = v(0)$ and becomes a constant for the chosen $2c\,\delta(r - r')$ interaction. To solve this system (to a good approximation) NNB's crucial move was to exploit the supposedly very large number $N_0$ of particles in the $p = 0$ state. In eqn. (32) $a_p, a_p^\dagger$ are Bose operators with commutation $[a_p, a_{p'}^\dagger] = \delta_{pp'}$. NNB replaced these by

$$b_p = a_0^\dagger N_0^{-1/2} a_p, \qquad b_p^\dagger = a_p^\dagger N_0^{-1/2} a_0 \qquad (33)$$

in which $a_0, a_0^\dagger$ commute as c-numbers. This way [81] he gets

$$H = H_0 + H_{int}, \qquad H_0 = \sum_p \frac{p^2}{2m}\,b_p^\dagger b_p, \qquad H_{int} = \frac{N_0^2}{2V}v(0) + \frac{N_0}{2V}\sum_p v(p)\,(b_p^\dagger b_{-p}^\dagger + b_p b_{-p} + 2b_p^\dagger b_p) + H',$$

in which $H'$ is of third and fourth degree in the $b_p, b_p^\dagger$. The bilinear form in $b_p, b_p^\dagger$ was diagonalised under a 'Bogoliubov-Valatin' transformation, and this way NNB found his energy spectrum as $\sqrt{N_0 v(0)/mV}\,p$, linear in $|\mathbf{p}| = p$, already mentioned. Notice that $v(0)$ must be positive so that $c > 0$ (repulsive case) for stability [77]. Subsequently, among much theoretical work, L.D. Faddeev and V.N. Popov [82] and then V.N. Popov [83,84] developed functional integral methods in order to calculate, in particular, the 2-point correlators

$$G(r,r') = \langle\hat\psi^\dagger(r)\hat\psi(r')\rangle. \qquad (34)$$

The $\langle\dots\rangle$ means thermal average at finite temperatures $T > 0$ and this is introduced by using Wick rotated time $t \to -i\tau$. One finds [18] that $0 < \tau < \beta$, $\beta = (k_BT)^{-1}$ ($k_B$ = Boltzmann's constant) and the classical action used in the functional integral is periodic of period $\beta$. In fact $G(r,r')$, eqn. (34), becomes independent of 'time' $\tau$ and $G(r,r')$ as written indicates this fact [85].
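As a numerical aside on NNB's spectrum quoted above, here is a minimal Python sketch. It assumes the standard textbook form of the Bogoliubov dispersion, $E(p) = \sqrt{(p^2/2m)^2 + (N_0v(0)/mV)\,p^2}$, which is not written out in the text. The function names and the illustrative units ($m = 1$, $N_0v(0)/V = 1$) are assumptions made only for this example. It shows the crossover from the linear behaviour $\sqrt{N_0v(0)/mV}\,p$ at small $p$ to the free-particle $p^2/2m$ at large $p$.

```python
import math

def bogoliubov_energy(p, m=1.0, n0_v0_over_V=1.0):
    """Textbook Bogoliubov dispersion (assumed form, illustrative units).

    p            : magnitude of the excitation momentum
    m            : boson mass
    n0_v0_over_V : the combination N0 * v(0) / V, positive for c > 0
    """
    free = p * p / (2.0 * m)          # free-particle kinetic energy p^2/2m
    slope_sq = n0_v0_over_V / m       # square of the slope sqrt(N0 v(0)/mV)
    return math.sqrt(free * free + slope_sq * p * p)

def sound_slope(m=1.0, n0_v0_over_V=1.0):
    """Slope of the linear (small-p) part of the spectrum."""
    return math.sqrt(n0_v0_over_V / m)

if __name__ == "__main__":
    for p in (1e-3, 1e-1, 1.0, 10.0, 100.0):
        e = bogoliubov_energy(p)
        # E/p tends to the slope at small p; E/(p^2/2m) tends to 1 at large p
        print(p, e, e / p, e / (p * p / 2.0))
```

With a negative value of `n0_v0_over_V` the argument of the square root becomes negative at small `p` and `math.sqrt` raises an error, which is one way to see the stability requirement $v(0) > 0$ in this sketch.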

Under translational invariance eqn. (34) for $G$ depends only on the vector $r - r'$, not on vectors $r, r'$ separately. Thus the calculations are advantageously carried out in terms of the $a_p, a_p^\dagger$ in momentum space. An important point is that the volume $V$ (appearing in $H$ eqn. (32)) is large with periodic boundary conditions such that one can take the finite density limit. This limit is such that e.g., for a total number $N$ of particles, $N/V \to n > 0$ as the periods in $d$ dimensions go to infinity so that $V \to \infty$. This finite density limit ensures that the large $V$ behaviours scale as $V$, a condition for the thermodynamics derivable from the thermal average $\langle\dots\rangle$. Translational invariance is thus achieved under these finite density conditions (see discussions in [18]). In momentum space for $T < T_c$ one finds [84] (see pp. 26, 28) that the following shift transformation is convenient: $\psi(r,\tau) \to \psi(r,\tau) + \alpha$, $\psi^\dagger(r,\tau) \to \psi^\dagger(r,\tau) + \alpha^*$ and $\alpha, \alpha^* \in \mathbb{C}$. This way one shifts $a_p, a_p^\dagger$ as

$$a_p = b_p + \alpha(\beta V)^{1/2}\delta_{p0}, \qquad a_p^\dagger = b_p^\dagger + \alpha^*(\beta V)^{1/2}\delta_{p0}. \qquad (35)$$

Then one finds $\langle b_0\rangle = \langle b_0^\dagger\rangle = 0$ while

$$\langle a_0\rangle = \alpha(\beta V)^{1/2}, \qquad \langle a_0^\dagger\rangle = \alpha^*(\beta V)^{1/2}, \qquad \langle a_p a_p^\dagger\rangle = \langle b_p b_p^\dagger\rangle + \beta V|\alpha|^2\delta_{p0}. \qquad (36)$$

The action $a|\alpha\rangle = \alpha|\alpha\rangle$, $\alpha \in \mathbb{C}$, is a property of the so-called Glauber 'coherent state' $|\alpha\rangle$: $\langle\alpha|a^\dagger = \alpha^*\langle\alpha|$. Moreover $\langle\alpha|a^\dagger a|\alpha\rangle = |\alpha|^2$, $\langle\alpha|a^\dagger a^\dagger aa|\alpha\rangle = |\alpha|^4$, etc. To this extent the condensate in the $p = 0$ mode is in a Glauber coherent state [86]: it is important that there is also 'off diagonal long-range order' [7], namely $\langle a_0\rangle$, $\langle a_0^\dagger\rangle$, two 'order parameters', are both non-vanishing. Unfortunately the reality in the recent (1995) experiments producing BEC in the metal vapours $^{23}$Na, $^{87}$Rb, $^{7}$Li [85] is that these vapours are cooled to micro-Kelvin temperatures $T$ using magneto-optical traps. And then, held in a magnetic trap, evaporative cooling of the gas is achieved down to $T$'s $\sim 10^{-8}$, $10^{-9}$ K when the BEC occurs. To a good approximation magnetic traps introduce harmonic potentials into the quantum theory of the general form [85]

$$V(r) = \tfrac12 m(\Omega_1^2x_1^2 + \Omega_2^2x_2^2 + \Omega_3^2x_3^2), \qquad (37)$$

(for $r = (x_1, x_2, x_3)$). Such traps break translational invariance. And we have been obliged [85] to re-work all of the functional integration theory to accommodate this fact. It no longer pays to work in momentum space. We work with the fields $\psi(r,\tau)$, $\psi^\dagger(r,\tau)$ directly and derive the following for each of $d = 3$, $d = 2$, $d = 1$:

$$G(r,r') = \sqrt{\rho_0(r)\rho_0(r')}\,\exp(L_3R^{-1}), \qquad L_3 = (m/8\pi\hbar^2\beta)[\rho_0(r)^{-1} + \rho_0(r')^{-1}]; \quad (d = 3) \qquad (38)$$

$$G(r,r') = \sqrt{\rho_0(r)\rho_0(r')}\,R^{-L_2}, \qquad L_2 = (m/4\pi\hbar^2\beta)[\rho_0(r)^{-1} + \rho_0(r')^{-1}] \quad (d = 2); \qquad (39)$$

$$G(r,r') = \sqrt{\rho_0(r)\rho_0(r')}\,\exp(-RL_1^{-1}), \qquad L_1^{-1} = (m/4\hbar^2\beta)[\rho_0(r)^{-1} + \rho_0(r')^{-1}] \quad (d = 1). \qquad (40)$$

in which $R = |r - r'|$. To the approximations of the argument as developed so far the results equivalently involve $\exp[\nu/(2\pi\beta\rho_0(s)R)]$, $\exp[-(\nu/\pi\beta\rho_0(s))\ln R]$, and $\exp[-(\nu/\beta\rho_0(s))R]$, for $d = 3, 2, 1$ respectively in which $\nu = m/2\hbar^2$, $2s = r + r'$ (see footnote 3), and [85] the results are thus described in terms of $\tfrac12(r + r')$, a 'centre of mass', and $r - r'$ (translation against the centre of mass) [85]. Of course the real point is that these correlations in $d = 3, 2$ and 1 no longer depend on $r - r'$ alone! The density $\rho_0(r)$ is the density of the condensate and it proves to be the negative of the potential eqn. (37), cut-off at a particular value where $r$ reaches the vector $\mathbf{R}_c = (R_{c1}, R_{c2}, R_{c3})$ [85]. Notice that from eqns. (38), (39), (40) only for $d = 3$ is $G(r,r')$ long range: evidently for $R$ large,

$$\exp(L_3R^{-1}) = 1 + L_3/R + O(R^{-2}) \qquad (41)$$

and the 'one' means $G(r,r') \sim \sqrt{\rho_0(r)\rho_0(r')}$ with $\rho_0(r)$ described by the inverted paraboloid $-V(r)$ eqn. (37), cut off for $r = \mathbf{R}_c$ (see footnote 3) as indicated above. In the translationally invariant theory $\rho_0(r) = \rho_0$ = constant and $G \sim \rho_0$, simply. On the other hand for $d = 2$, $d = 1$ there is no longer any such long range behaviour, and in this sense there is no condensation for $d = 2$ and $d = 1$ dimensions in the translationally invariant case. When the trap is present and translational invariance is broken the expressions eqns. (38)-(40) again indicate that there is a condensate only for $d = 3$. Notice too that for $d = 3$ in the presence of the trap the "first order coherence function"

$$G^{(1)}(r,r') = G(r,r')/\sqrt{\rho_0(r)\rho_0(r')} \sim 1 \qquad (42)$$

for, but only for, large enough $R$. Eqn. (42) does hold with the trap present (and $d = 3$), and one can guess (but this is still to be demonstrated) that the $n$th order coherence function

$$G^{(n)}(r_1, r_2, \dots, r_{2n}) = G(r_1, \dots, r_{2n})/\sqrt{\rho_0(r_1)\rho_0(r_2)\dots\rho_0(r_{2n})} \sim 1 \qquad (43)$$

for large enough joint separations between each pair of the test points $r_1, \dots, r_{2n}$ and in this sense (only) in the trap the Bose condensate is in a form of quantum coherent state (coherent states have the property that (a) all coherence functions are unity; and (b) $\langle\psi\rangle, \langle\psi^\dagger\rangle \neq 0$ with the condition (b) demonstrating the "off diagonal long range order", compare with eqns. (36) above). These particular coherent quantum states are indeed very coherent$^{4}$, and it is paradoxical (perhaps) that since the relevant c-number NLS equations (necessarily including the trap terms $V(r)\psi$, $V(r)\psi^*$) are not integrable (not integrable for $d = 3$, $d = 2$ anyway but not even integrable for $d = 1$ because momentum $P$ no longer commutes with $H$), they are necessarily "Hamiltonian chaotic". The quantum theory of BEC which yields the quantum coherent state picture (for $d = 3$ only) is thus "quantum chaotic" in the sense that the semiclassical limits of the theory (the c-number theory) are classical chaotic!

$^{3}$ In general I use $r, r'$ for vectors in $d$ dimensions, but bold type $\mathbf{r}, \mathbf{r}'$ may also be used to emphasize the vector character, as in $\mathbf{R}_c$ also.

$^{4}$ Using 'coherent' here to mean in its more general non-technical usage.

Even so one expects to build an atom-laser successfully during the early years of this Millennium, the analogy with the single mode photon laser being that well above threshold the photons in the laser cavity are in a Glauber coherent state! Since the condensate is held in the magnetic trap through the quantum spin state of the condensate, and this spin can be flipped by an RF field, condensate can fall under gravity at the point of spin flip: the outcoming condensate can then be seen as a real atom-laser [87] for periods ~ millisec (only) before the trap is emptied. Evidently this real atom-laser still needs its pumping mechanism. Very recently [88] experiments have been done which actually measure $\langle\psi^\dagger(r)\psi(r')\rangle$ at $T \sim 300$ nano-Kelvin. One clear observation is the $L_3R^{-1}$ fall-off to the long range condensate behaviour predicted by eqn. (41) (for the experiments see the Fig. 4, curve at $T = 310$ nK in particular, in ref. [88]). However, now notice the extra effects of the trap: the 'correlation length' $L_3$ for $d = 3$ actually depends on both of $r$ and $r'$ (under translational invariance $L_3$ does not depend at all on $r, r'$).
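The different large-$R$ behaviours read off from eqns. (38)-(40) can be compared numerically. The following Python lines are a minimal sketch only: the function names and the sample values of the parameters are hypothetical, and the densities $\rho_0$ are treated as locally constant, so that the normalised correlator $G(r,r')/\sqrt{\rho_0(r)\rho_0(r')}$ depends on $R$ alone. The three forms used are $\exp(L_3/R)$ for $d = 3$, a power law $R^{-L_2}$ for $d = 2$, and an exponential decay $\exp(-R\,L_1^{-1})$ for $d = 1$.

```python
import math

def g1_d3(R, L3):
    """Normalised correlator for d = 3: exp(L3 / R), tends to 1 as R grows."""
    return math.exp(L3 / R)

def g1_d2(R, L2):
    """Normalised correlator for d = 2: power-law decay R**(-L2)."""
    return R ** (-L2)

def g1_d1(R, inv_L1):
    """Normalised correlator for d = 1: exponential decay exp(-R * inv_L1)."""
    return math.exp(-R * inv_L1)

if __name__ == "__main__":
    L3, L2, inv_L1 = 0.05, 0.05, 0.05    # hypothetical sample values
    for R in (1.0, 10.0, 100.0, 1000.0):
        # second column is the two-term expansion 1 + L3/R of eqn. (41)
        print(R, g1_d3(R, L3), 1.0 + L3 / R, g1_d2(R, L2), g1_d1(R, inv_L1))
```

Only the $d = 3$ column approaches a non-zero constant at large `R`, with the leading correction $L_3/R$; the other two columns decay to zero, which is the sense in which there is a condensate only for $d = 3$.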
Since to the approximations of the theory $L_3 = \nu/2\pi\beta\rho_0(s)$ with $s = \tfrac12(r + r')$ as explained we currently look for this small factor in the data already reported in [88]: this data is transversely averaged data so that $G(r,r') \to G(z,z')$ with $z$ the long axis of the trap [88], and it may be necessary to perform further experiments to check out the existence of this small, but fundamentally important, term arising solely from the breakdown of translational invariance in the magnetic trap. Observe that this feature strictly speaking destroys the scaling as volume $V$ at finite density so that, strictly speaking, a 'new thermodynamics' is involved in these calculations. Of course the new 'device', this atom-laser, must still be shown to have any technological future; but this was true of the first photon-maser (first successfully built in 1954 [7]). Otherwise the BEC system is a beautiful example of the quantum NLS equations realised in an experiment. Notice the problem posed by gravity for the massive bosons of any atom-laser: this was not a problem of the photon laser. Note finally [7] that BEC for $^{7}$Li in a one-dimensional ($d = 1$) magnetic trap is under experimental investigation: $^{7}$Li forms an attractive ($c < 0$) BEC and it may be possible to see 'quantum solitons' of the $d = 1$ quantum NLS equation in this system. Quantum solitons are a theme of the next section, §5.

5. QUANTUM INFORMATION

At this stage of its evolution 'quantum information' systems may or may not prove to involve any solitons. The essential point is that the soliton of the NLS equation under translational invariance (and $d = 1$) is, strictly speaking, a quantum object (a 'quantum soliton'?) which acts as a 'one-bit' of information in an optical fibre. In this sense it is one 'qubit' of quantum information [9,35,89]. In [9] it was suggested that quantum solitons are easier to realise in actual experiments (or for an actual quantum information technology) than are the other 'qubits' proposed (and recently realised) so far. A typical 'qubit' is the quantum state of a 2-level atom (§2) or of a genuine spin-$\tfrac12$ system. In either case this state is

$$|\psi\rangle = a|g\rangle + b|e\rangle \qquad (44)$$

(with $|a|^2 + |b|^2 = 1$, for coefficients $a, b \in \mathbb{C}$). We know from the theory of SIT (§2) that a $2\pi$-pulse takes an atom in $|g\rangle$ back to the state $|g\rangle$ via however a passage through $|e\rangle$ ($2n\pi$ pulses do this $n$ times). Similarly a $\theta$-pulse takes $|g\rangle$ to

$$|\psi\rangle = \cos\tfrac12\theta\,|g\rangle - i\sin\tfrac12\theta\,|e\rangle \qquad (45)$$

(note the geometrical phase $|g\rangle \to |\psi\rangle = e^{i\pi}|g\rangle$ for the $2\pi$-pulse, a phase which has been measured in [90]). The relevance of eqn. (45) to quantum information and quantum computing is [35,89] that two such qubits are quantum states in the Hilbert space spanned by $|g_1\rangle|g_2\rangle$, $|g_1\rangle|e_2\rangle$, $|e_1\rangle|g_2\rangle$, $|e_1\rangle|e_2\rangle$ and

$$|\psi\rangle = a|g_1\rangle|g_2\rangle + b|g_1\rangle|e_2\rangle + c|e_1\rangle|g_2\rangle + d|e_1\rangle|e_2\rangle. \qquad (46)$$

Any successive measurements of this state (according to the Copenhagen interpretation of quantum mechanics) will measure any one of these states with probability $\propto |a|^2, |b|^2$, etc. But $|\psi\rangle$ itself contains all four states. Moreover optical pulses may act on each atom so that e.g. a $2\pi$-pulse on atom 1 takes $|\psi\rangle$ to $-a|g_1\rangle|g_2\rangle - b|g_1\rangle|e_2\rangle - c|e_1\rangle|g_2\rangle - d|e_1\rangle|e_2\rangle$ while a $\pi$ pulse produces $-ia|e_1\rangle|g_2\rangle - ib|e_1\rangle|e_2\rangle + ic|g_1\rangle|g_2\rangle + id|g_1\rangle|e_2\rangle$. The point here is that, prior to actual state measurement, we can manipulate on all four basis states spanning the Hilbert space: of course by using a different basis of four states we can manipulate on each one of these four states. Thus for $N$ qubits we can manipulate on $2^N$ numbers and this allows a massive parallelism on these quantum computers which in principle can be exploited for particular kinds of computation (examples are given in [35], for example, where e.g. Grover's algorithm allows the searching of an unsorted list of $N$ items in only $\sqrt{N}$ steps!). An experimental situation already achieved in [90] is to use a 2-level atom plus one photon as two qubits: photon number is conserved in these experiments so that we can be restricted to the 2-state basis $|e,0\rangle$ and $|g,1\rangle$, with $|e,0\rangle = |e\rangle|0\rangle$, $|g,1\rangle = |g\rangle|1\rangle$. Now

$$|\psi\rangle = \cos\tfrac12\theta\,|g,1\rangle - i\sin\tfrac12\theta\,|e,0\rangle \qquad (47)$$

and for $\theta = \pi$ (a $\pi$-pulse) $|\psi\rangle = |g,1\rangle \leftrightarrow |e,0\rangle$, up to phases, a form of excitation swapping. For $\theta = \pi/2$

$$|\psi\rangle = \tfrac{1}{\sqrt2}\,[\,|g,1\rangle - i|e,0\rangle\,] \qquad (48)$$

a nice quantum 'entanglement' of photon and atom ('entanglement', going back to Schrödinger, means the state $|\psi\rangle$ cannot be written as a simple product state: $|e,0\rangle$ is the product state $|e\rangle|0\rangle$ and $|g,1\rangle$ is the product state $|g\rangle|1\rangle$ but $|g,1\rangle \to$ state not of this form in eqn. (48)). Physical manipulations like this on single atoms are now readily achievable in 'high-Q' (very little damped) microwave cavities. For example [90] does this using $^{87}$Rb atoms undergoing high Rydberg transitions at frequencies ~ 51.1 GHz (the '2-level atom' is that between the $n = 50$ ($|g\rangle$) and $n = 51$ ($|e\rangle$) states, and $n$ is the principal quantum number for the Bohr-like atom with the one optical electron which is $^{87}$Rb). Moreover similar manipulations like this can be done on more than one atom (2 atoms or 3 atoms so far in [90]). Take here the photon-atom entanglement eqn. (48) mentioned: the atom carries one 'bit' of binary information as $|e\rangle \leftrightarrow 1$, $|g\rangle \leftrightarrow 0$ (say). So this is one qubit. The photon is also one qubit ($|1\rangle$ or $|0\rangle$). These two qubits can be manipulated. In particular [90] they can be manipulated as follows. The experimental system used in [90] involves a Ramsey interferometer which allows for close to resonant Ramsey pulses connecting $|g\rangle$ ($n = 50$) to another state $|i\rangle$ (actually the $n = 49$ state) at 54.3 GHz, sufficiently different from the 51.1 GHz transition for the $|g\rangle$, $|e\rangle$ system. Combined with $|0\rangle$, $|1\rangle$ photon states representing the presence of 0 or 1 photons the relevant basis is the $2^2 = 4$ states $|i,0\rangle$, $|g,0\rangle$, $|i,1\rangle$, $|g,1\rangle$. The action of a $2\pi$ pulse on the 2-level $|e\rangle$, $|g\rangle$ system takes $|g,1\rangle \to e^{i\pi}|g,1\rangle$ with the other three basis states unchanged. This is a quantum phase gate (QPG): $|i,0\rangle \to |i,0\rangle$, $|g,0\rangle \to |g,0\rangle$, $|i,1\rangle \to |i,1\rangle$, $|g,1\rangle \to e^{i\phi}|g,1\rangle$ with $\phi = \pi$ (and by detuning the '$2\pi$-pulse' $\phi$ can also be varied). This QPG, combined with unitary rotations acting on each qubit separately, can produce any unitary two-qubit operation [92]. As an interesting example, achieved experimentally in [90], with the Ramsey interferometer switched off one performs $\pi/2$ rotations on the atomic qubit formed by a first atom (a 'source' atom) which initially enters the totally empty cavity in $|e\rangle$ but under Rabi nutation finally exits the cavity in the entangled state

$$|\psi_{\pi/2}\rangle = (1/\sqrt2)\,(|e,0\rangle - i|g,1\rangle). \qquad (49)$$

Detection of the atom outside the cavity via an ionisation detector will detect $|e\rangle$ or $|g\rangle$ with equal probability of one half - as measured over many successive experiments. But detection of $|e\rangle$ or of $|g\rangle$ equally well detects $|0\rangle$ (no photons) or $|1\rangle$ (one photon) in the cavity. If now a second atom (the so-called 'meter' atom) enters in $|g\rangle$ and, by nutation, undergoes a $2\pi$-pulse with the Rabi fields resonant on $|g\rangle \to |i\rangle$ switched on, the Ramsey fringes can measure the correlation between the probability of finding the meter atom in $|g\rangle$ and the presence or absence of a photon, namely the fringes determine the conditional probabilities of measuring the meter atom in $|g\rangle$ given that the source atom was detected in $|g\rangle$ or it was detected in $|e\rangle$: these probabilities are equally conditional on 1 or 0 photons being present from the source atoms, that is the conditional probabilities $P(g_2|g_1) = P(g|1)$, $P(g_2|e_1) = P(g|0)$ are being measured. Two points follow: one is that for $P(g|1)$ one is detecting one photon - a so-called 'quantum non-demolition' (QND) detection of one photon; the other point is that one can see that this detection of one photon conforms to the dynamics of a controlled-NOT (C-NOT) gate [93]. Such C-NOT gates are fundamental to quantum computation [9]. The Ref. [90] goes on to consider the experimental manipulation of 3 qubits (3 atoms) [90,94]. As far as I know this number 3 is a measure of current achievements in the experimental manipulation of qubits. The problems of maintaining coherence over many qubits in real cavities are considerable [89] and this may be the ultimate problem for real, that is experimentally realised, quantum computation and information. Before we turn again to solitons I explain further about the nutation of a 2-level atom in a cavity. The Jaynes-Cummings model couples one 2-level atom to one single mode of a very high-Q (not at all damped) cavity.
The Hamiltonian involved is (at exact resonance between the atom and the cavity mode)

$$H = \omega_0S^z + \omega_0a^\dagger a + g(S^+a + a^\dagger S^-) \qquad (50)$$

in which $S^\pm, S^z$ take the two-dimensional representation of su(2) introduced above eqn. (3) and $[a, a^\dagger] = 1$ for the photons of the cavity mode. A constant of the motion is the operator $M$ (say)

$$M = S^z + a^\dagger a + \tfrac12 \qquad (51)$$

so that $M|g,1\rangle = 1|g,1\rangle$, $M|e,0\rangle = 1|e,0\rangle$ and $M|\psi\rangle = 1|\psi\rangle$ for any state $|\psi\rangle$ like eqn. (47). Since there are two degrees of freedom, one for the cavity mode and actually precisely one for the 2-level atom, the two commuting constants $H$ and $M$ make this system quantum completely integrable. Jaynes and Cummings [91] solved the model directly by solving a $2 \times 2$ matrix formulation of the quantum mechanics. But in [70] we solved this JC-model, eqn. (50), via the quantum inverse method as part of the solution of a more general quantum problem involving $N > 1$ 2-level atoms and one mode also shown in Fig. 10.
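Because $M$ is conserved, an atom-plus-mode state that starts in $|g,1\rangle$ or $|e,0\rangle$ stays in the two-dimensional subspace they span, and its evolution can be labelled by the pulse area $\theta$ as in eqn. (47). The Python sketch below simply evaluates the amplitudes of eqn. (47) at a few pulse areas. The function names are hypothetical, and no particular relation between $\theta$ and the interaction time is assumed here.

```python
import math

def nutation_state(theta):
    """Amplitudes (on |g,1>, on |e,0>) of eqn. (47) at pulse area theta."""
    return (complex(math.cos(theta / 2.0), 0.0),
            complex(0.0, -math.sin(theta / 2.0)))

def probabilities(theta):
    """Probabilities of finding |g,1> and |e,0> at pulse area theta."""
    amp_g1, amp_e0 = nutation_state(theta)
    return abs(amp_g1) ** 2, abs(amp_e0) ** 2

if __name__ == "__main__":
    for label, theta in (("pi/2", math.pi / 2), ("pi", math.pi),
                         ("2pi", 2 * math.pi), ("3pi", 3 * math.pi)):
        print(label, nutation_state(theta), probabilities(theta))
```

At $\theta = \pi/2$ the two probabilities are each one half, the entangled state of eqn. (48). At $\theta = 2\pi$ the state is $-|g,1\rangle$, the sign change of the $2\pi$-pulse. Between $\theta = \pi$ and $\theta = 3\pi$ the amplitude on $|e,0\rangle$ goes from $-i$ to $+i$.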

For our purposes ref. [80] actually constructed an effective single mode cavity and sends one 2-level atom into it! If the atom enters in $|e\rangle$ and there are no photons, since the state $|e,0\rangle$ is not an eigenstate of $H$ the system makes a unitary evolution under $H$ at fixed $M$, and so evolves through states $|\psi\rangle$, eqn. (47), in which $\theta$ starts at $\theta = \pi$ (say) and moves through $\pi < \theta < 3\pi$. This is the nutation of the '$2\pi$-pulse'. In the course of that nutation $|e,0\rangle$ passes through $|g,1\rangle$, the $\pi$-pulse by nutation, emitting one photon to the cavity. A second $\pi$-pulse by nutation then absorbs that photon and the resultant total $2\pi$ pulse restores $|\psi\rangle = +i|e,0\rangle$ (with a change of sign). This nutation can be observed by the Ramsey interferometer (see the Fig. 2(a) in [90]). The condition for the $2\pi$-pulse was $gt_{int} = 2\pi$, i.e. $t_{int} = 2\pi g^{-1}$. For any $\theta$-pulse by nutation one adjusts $t_{int}$ to $\theta g^{-1}$ [90]. Reference to the dynamics of the micromaser will show [71-74] that we are here concerned with 'trapping states' of a one-atom micromaser which satisfy the condition $\sqrt{n+1}\,gt_{int} = 2r\pi$, $r$ = integer. Here $r = 1$ and $n + 1$, the photon number, is precisely 1. Notice that the 2-level atom driven by the electric field $E(t)$ in the MB system, §2, is also undergoing optical nutation driven by $2\pi$-pulses in particular. The difference now is that at this level of 'cavity quantum electrodynamics' we must quantise the electro-magnetic field. What, if anything, has this to do with solitons and particularly 'quantum solitons'? It is a suggestion in [9] that 'one quantum soliton' is a natural qubit. In [9] this 'quantum soliton' is viewed as a c-number sech solution eqn. (15) of the c-number attractive (coupling constant $c < 0$) NLS equation eqn. (14) which exhibits however quantum fluctuations: particularly a particle number operator $N$ (say) and its canonical phase $\phi$ satisfying $[N, \phi] = -i$ (for large enough eigenvalues of $N$) will satisfy an uncertainty principle $\Delta N\Delta\phi \ge \tfrac12$.
This will be true in particular for the one bit sech signals of the NLS equation eqn. (15) in an optical fibre. And this means that the fluctuations in the values of $N$ in this one-bit will register as quantum noise ("shot" noise) in the fibre and blur the signals. A prescription to reduce this noise is to 'squeeze' the fluctuations $\Delta N$ so that $\Delta N < 1/\sqrt2$ while $\Delta\phi > 1/\sqrt2$, consistent with $\Delta N\Delta\phi \ge \tfrac12$ as is still demanded. A further suggestion in [9] is to use quantum solitons of the NLS equation as qubits! If this can make sense such qubits are easier both to create and to keep (without dissipation - compare [89]) in a fibre than are the high Rydberg atomic states even in very high Q cavities. Unfortunately one must ask at this stage 'Exactly what is a quantum soliton'. For the quantum attractive NLS equation ($c < 0$) with $d = 1$, eqn. (28), one shows (see e.g. [7], and see the eqns. (53) and (54) below) that there are so-called $n$-string solutions in which $n$ is an eigenvalue of the number operator $N$: these quantum states are simultaneous eigenstates of $H$, $N$, $P$, the Hamiltonian, particle number and total momentum operators (and of a further infinity, for $d = 1$ and translational invariance, of such mutually commuting operators). If this exact $n$-string solution of the attractive NLS were to be considered the 'quantum soliton' of this system it could not be squeezed in particle number $n$

of course: any eigenstate of $N$ is infinitely squeezed in $n$ already. However it is immediately intuitive that in the case of the quantum soliton of an optical fibre we are actually talking about an optical pulse which is quantised but somehow still very much like the sech 1-soliton solution, eqn. (15), of the c-number NLS equation eqn. (14). In [95] Miki Wadati and colleague showed in 1984 how the solution of the classical NLS model eqn. (14) emerges from a quantum mechanical "matrix element", namely the matrix element

$$\lim_{n\to\infty}\,\langle n, X'', t|\,\phi(x - Vt)\,|n + 1, X', t\rangle. \qquad (52)$$

The quantum field $\phi(x - Vt)$ satisfies the quantum NLS equation, eqn. (28) (for which $\hbar = 1$ and $m = \tfrac12$) and the states $|n + 1, X'\rangle$ are the Fourier transforms on $P$ to $X'$ from simultaneous eigenstates $|n + 1, P, \dots\rangle$ of $H$, $N$, $P$ ... while the states in the matrix element eqn. (52) depend also on the $t$ because they are actually wave packets deriving [95] by Fourier transformation on $P$ of $|n, P, \dots t\rangle = e^{-iHt}|n, P, \dots\rangle$. Notice that $n \to n + 1$ in the matrix element eqn. (52) and there is off diagonal long-range order and coherence (compare §4) in this sense. Moreover $n \to \infty$ in eqn. (52) is a form of classical limit - even though, unlike the quantum coherent states (§4), $n$-states are always intrinsically quantum. Wadati shows how eqn. (52) becomes exactly the hyperbolic secant solution eqn. (15) of the $d = 1$ c-number attractive NLS eqn. (14).$^{5}$ However there is apparently the problem for this limit $n \to \infty$ that the quantum attractive NLS equation is unstable with no stable ground state unless the particle number $n$, an eigenvalue of $N$, is held fixed [96]. Also for present purposes what are these 'particles': their numbers $n$ are eigenvalues of $N$, but they are not photons, e.g. their masses are $m = \tfrac12$, and so $> 0$. Of course this mass $m > 0$ is all an artefact of some effective nonlinear refractive index. But moreover, and still more so for the present purposes of connecting (52) with ref. [9], the matrix element which is eqn. (52) has (by definition) lost its quantum mechanics. However, for any valid comparisons with [9] and its references we have still to calculate all correlators, like those in §4, in $x - x'$ and $t - t'$, and this must now be done. Note that ref. [9] suggests that the coherent quantum soliton is in a Glauber coherent state but matrix element (52) appears to suggest quite otherwise. These various questions are open and demand further work.
Next we note that two qubits in this solitons realisation must (presumably) be the 2-soliton solution of the c-number NLS equation eqn. (14). That the two separate qubits in this description interact is plain from the Fig. 5 where the 2-soliton solution of the comparable sine-Gordon equation shows that (c-number) interaction. So far the matrix element description of two $n$-string solutions of the quantum attractive NLS equation is still to be worked out, and for present purposes clearly needs doing. Note that for one $n$-string we are concerned with states $|k_1, \dots, k_n\rangle$ satisfying

$$N|k_1,\dots,k_n\rangle = n|k_1,\dots,k_n\rangle, \qquad P|k_1,\dots,k_n\rangle = \Big(\sum_{j=1}^{n}k_j\Big)|k_1,\dots,k_n\rangle, \qquad H|k_1,\dots,k_n\rangle = \Big(\sum_{j=1}^{n}k_j^2\Big)|k_1,\dots,k_n\rangle \qquad (53)$$

and when $c < 0$ the set of wave numbers $\{k_j\}$ forms the $n$-string for chosen

$$k_j = P + \tfrac12\{n - (2j - 1)\}\,ic, \qquad j = 1, 2, \dots, n \qquad (54)$$

where $i = \sqrt{-1}$ so that the total momentum is $\sum_{j=1}^{n}k_j = nP$ and the energy $E_n(P) = \sum_{j=1}^{n}k_j^2 = nP^2 - \tfrac{1}{12}n(n^2 - 1)c^2$ (not bounded below for $n \to \infty$!).

$^{5}$ The matrix element eqn. (52) becomes the classical sech multiplied by what is essentially $\delta(X' - X'')$ for large enough $n$: $X' = X'' = x_0$ plays the role of a phase in the argument of the sech, i.e. sech$[q(x - Vt)] \to$ sech$[q(x - x_0 - Vt)]$.

Thus at this stage of investigation it remains a very interesting but very open question what features of "quantum information" can be extracted from this quantum mechanical model system. One idea is to try to encode quantum information on the $n$-strings either as one $n$-string for large $n$ or on a set of $n$-strings. I do not know how to manipulate single $n$-strings but for sets of $n$-strings I note that the matrix element eqn. (52) for one $n$-string is essentially located at some place $x_0$ [95] so that in eqn. (15) the argument of the sech is $q(x - x_0 - Vt)$ and $x$ in eqn. (15) $\to x - x_0$ as already explained in the footnote 5. From the known asymptotic behaviours of two solitons of the NLS equation, eqn. (14), two $n$-strings characterised by $n_1$, $n_2$ both large, and associated with $x_0 = x_1$, $x_0 = x_2$ respectively and with velocities $V_1$, $V_2$ respectively should mean that one can add two $n$-string solutions asymptotically, that is for large enough initial separations: these (rather complicated) quantum objects thus become quantum qubits each with their own quantum structures in a collision with a possibly exotic quantum entanglement! This entanglement at semiclassical level becomes the phase shift $\Delta$ of the argument $x - x_0$ of the sech as was described in §2 for the s-G system. Interestingly, as the final figure, Fig. 11, shows, the squeezing [7,9] of a single quantum soliton in an optical fibre has certainly been observed already [9,97-100] even though such squeezing at a fixed $n$ of the matrix element eqn. (52) is in itself not possible. Moreover refs. [97-100] include measurements of the correlations between modes of the quantum soliton (especially [97,98]) which refer back to the remarks below eqn. (52).
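The totals quoted after eqn. (54) can be checked directly. What follows is a minimal Python sketch, with hypothetical function names and arbitrary sample values of $n$, $P$ and $c$. It builds the wave numbers $k_j = P + \tfrac12\{n - (2j - 1)\}ic$ and compares $\sum_j k_j$ and $\sum_j k_j^2$ with $nP$ and $nP^2 - \tfrac{1}{12}n(n^2 - 1)c^2$.

```python
def n_string(n, P, c):
    """Wave numbers k_j = P + (1/2){n - (2j - 1)} i c for j = 1..n, eqn. (54)."""
    return [complex(P, 0.5 * (n - (2 * j - 1)) * c) for j in range(1, n + 1)]

def total_momentum(ks):
    """Sum of the wave numbers."""
    return sum(ks)

def total_energy(ks):
    """Sum of the squared wave numbers."""
    return sum(k * k for k in ks)

def closed_form_energy(n, P, c):
    """n P^2 - n (n^2 - 1) c^2 / 12, the closed form quoted in the text."""
    return n * P * P - n * (n * n - 1) * c * c / 12.0

if __name__ == "__main__":
    n, P, c = 5, 0.3, -0.2          # arbitrary sample values with c < 0
    ks = n_string(n, P, c)
    print(total_momentum(ks), n * P)
    print(total_energy(ks), closed_form_energy(n, P, c))
```

The imaginary parts of the $k_j$ are symmetric about zero, so they cancel in both sums and the totals are real. At $P = 0$ the closed form is negative and decreases like $-n^3c^2/12$ for large $n$, the unboundedness noted in the text.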
Evidently an early investigation for this new Millennium is to take each part of this particular quantum analysis significantly further.

6. FINAL COMMENTS AND CONCLUSIONS

The discovery of the soliton solutions, and more generally of the complete Hamiltonian integrability of many classes of nonlinear partial differential equations, or of other related nonlinear integro-differential systems such as [101] ,


[Figure: "Squeezing the soliton". Two panels plotted against input pulse energy (pJ).] Output energy (top) and squeezing (bottom) plotted as a function of the input energy for a 90/10 asymmetric fibre-based interferometer loop. The output-pulse energy shows an optical-limiting effect at input energies of 53 picojoules and 83 picojoules. The y-axis in the lower graph is the photocurrent noise power in a photodiode detector relative to the shot noise (horizontal line). The quantum fluctuations are reduced below the shot-noise level (i.e. "squeezed") at the input energies for which optical limiting occurs. (From Schmitt et al.)

Fig. 11. Experimental data for the 'squeezing' of a quantum soliton in an optical fibre, taken with its caption from [9]. These particular quantum solitons are certainly more complicated in practice [7] than any single quantum soliton based on the quantum attractive NLS equation, eqn. (28) (c < 0), since additional nonlinearities are included. The reference to Schmitt et al. is [98].

during the period 1965-1974 or soon thereafter, together with the quantisation of these systems since 1979, has exposed wholly new mathematical structures of exceptional interest. Although the Fig. 8 of this paper literally sketches the remarkable 'connectivity', that is the relations between this variety of structures, it can only be a sketch; and the opportunity offered to the author to embellish this Fig. 8 with the actual comments of the text of the paper still leaves him with the wish to dig much deeper and go much further. Certainly even at the superficial level necessarily adopted in the paper much is left out. Thus the paper does not attempt to address the wider issues of the mathematics of integrability per se [102], while all of the work on the Painlevé tests for integrability already available (eg. [103]) is deliberately not mentioned. I have developed the theme of "complete Hamiltonian integrability" in the paper because this leads directly, via Dirac's canonical quantisation, to quantum theories. But the theory of quantum groups [29-32] is not developed as such in the paper, only mentioned; the co-multiplication structure underlying these generalised (by a spectral parameter) non-commutative and non-co-commutative algebras is not explored in the paper, and indeed the commutative and co-commutative algebras which are the Poisson-Lie groups are not exhibited in the paper either. These latter do appear on the 'map', the Fig. 8, where the starting point, the Sklyanin bracket, which there takes the form

$\{T \overset{\otimes}{,} T\} = [T \otimes T,\, r]$, (55)

in terms of the 'little r-matrix' ([18] and see [24]), and in which the left side represents the 16 Poisson brackets which can be considered for the 2 x 2 matrix integrable systems (the AKNS systems [36]), appears at the top of the third column from the right. But it also leads down to the quantum groups as Hopf algebras in the second column from the right. Of course what leads to what in this 'map' can be a matter of subjective choice. On the 'map' the origins of these Hopf algebras in the loop algebras (with spectral parameter), well down in the third column from the left, are covered, though inadequately and incompletely, in my paper [30]. But again the KSA (or AKS: Adler-Kostant-Symes) theorem there is not at all developed in the paper; see eg. my [104] for results and references, and see also the references to supersymmetric integrable systems theory in this [104]. Lax pair theory and the inverse method are well sketched in the text of §3, and I hope that the twin pillars of geometry and algebra supporting all of this are at least partly identifiable. A theme which is not developed is the non-commutative quantum geometry of these systems because, implicit as this is in the R-matrix theory, these aspects still await a proper elucidation (by this author at least). However, in this author's view the most remarkable aspect of the mathematical structure displayed in the Fig. 8 must still be that it leads directly to physical manifestations and even to successful experiments. The Figs. 9 and 10 of this paper serve to show how even the EXPERIMENTS 'box' on the Fig. 8 could do little justice to that experimental situation. Thus the physics of self-induced transparency (SIT) pursued during 1967-1974,

and indeed subsequently, was already a striking manifestation of 'optical solitons'; and since then the many physical examples, some collected in [34] of 1977, follow the same theme. Thus, for this author, [8] as well as [34] of 1977 and then [105] of 1980 could already include spin-wave phenomena in liquid $^3$He at temperatures $T \sim 2.6$ mK and the appearance of the integrable s-G equation for spin-waves in the $^3$He A-phase. They also heralded the appearance of the non-integrable 'double sine-Gordon' equation $\phi_{xx} - \phi_{tt} = -m^2(\sin\phi + \tfrac{1}{2}\sin\tfrac{1}{2}\phi)$ for spin waves in the $^3$He B-phase, solved in [105] by 'soliton perturbation theory' about the s-G equation. Things like this can only extend the still scarcely explained 'unreasonableness' [33] of the interplay of the mathematics, mathematical physics and actual physics exhibited by Nature as particularly described in this paper. This physics extends to realisable technologies, and it is a theme of this paper that this is so. The connections between SIT, cavity quantum electrodynamics, and the potential for 'quantum information' and 'quantum computing' are one theme of the paper. The arcane (to this author) connection between methods of functional integration on infinite dimensional systems and quantum mechanics remains almost as mysterious now as it did when first put forward by R.P. Feynman [106]. Indeed, although details could not be given in the paper, the fact that these 'arcane' functional integral methods from mathematical physics can yield predictions for the behaviours of Bose condensates held in magnetic traps at temperatures $T \sim 250$-$450$ nK which are in agreement with current experiments [85,88] is a source of present amazement to this author. The new technology which is the atom-laser is still to be created. Because of the obvious problems of gravity, ultra-low temperatures, and the short lifetimes of condensates anyway, this technology may never be created. But it remains the spectacular manifestation, already mentioned, of mathematics with physics which is Nature. On the other hand the 'optical soliton', whose short life in the last Millennium was only some 27 years, must have a secure future in the new communication systems of this new Millennium. Less predictable is the 'quantum soliton'. What this might yet do for 'quantum information' remains a question which must now be vigorously explored!

REFERENCES

1. P.A. Griffiths, "Mathematics and the sciences: Is interdisciplinary research possible?" Plenary paper at this meeting.
2. This author's experience of interdisciplinary research is that in straddling more than one supposedly distinct 'camp' of research - as one necessarily must - one runs the risk of finishing with an acknowledged place in no one of them: in short each research camp can become very protective of its own perceived boundaries! There is of course the intrinsic problem, anyway, of simple understanding between camps. This author's actual experience is that newly discovered abstract mathematics, although evidently directly applicable to real experiments in the laboratory and consequently to newly emerging technologies, can rarely be perceived as such by all but a very select few of the available experimentalists and technologists! This lecture attempts to delineate a route which can be said to begin in abstract and aesthetically appealing 'pure mathematics', through systems of partial differential equations of 'applied mathematics', thence to theoretical and experimental physics, and from the last to the newly emerging technology of 'quantum information'. Only the reader can determine if this route is made apparent in the paper. In practice discovery of the optical soliton was already in the border land between theoretical and experimental physics: the rather remarkable mathematics of solitons (Fig. 8) grew 'backwards' out of that theoretical physics, as can be seen from the references [3-7] then [8] and then [10] (for example) following: both the algebras of solitons (Fig. 8) and their geometries [10-13] have proved of intrinsic mathematical interest, while the non-mathematical reference [9] contrasts all of this with emerging, or potentially emerging, new technology.
3. Caudrey, P.J., Gibbon, J.D., Eilbeck, J.C. and Bullough, R.K., 1973, 'Exact multi-soliton solutions of the self-induced transparency and sine-Gordon equations', Phys. Rev. Lett., 30, 237.
4. Caudrey, P.J., Gibbon, J.D., Eilbeck, J.C. and Bullough, R.K., 1973, 'Solitons in nonlinear optics I. A more accurate description of the 2π pulse in self-induced transparency', J. Phys. A: Math. Gen., 6, 1337.
5. Gibbon, J.D., Caudrey, P.J., Eilbeck, J.C. and Bullough, R.K., 1973, 'An N-soliton solution of a nonlinear optics equation derived by a general inverse method', Lett. al Nuovo Cimento, 8, 775.
6. Bullough, R.K., Caudrey, P.J., Eilbeck, J.C., Gibbon, J.D., 1974, 'A general theory of self-induced transparency', Opto-Electronics, 6, 121.
7. Bullough, R.K., 2000, 'The optical solitons of QE1 are the BEC of QE14: has the quantum soliton arrived?', in Proceedings of the 14th National Quantum Electronics and Photonics meeting, Manchester, UK, 5-9 September 1999. Journal of Modern Optics. In the press at October 1999, J. Mod. Optics, Vol. 47, No. 11, 2029-2065 [erratum J. Mod. Optics, Vol. 48, No. 4, to appear February 2001].
8. Bullough, R.K., and Caudrey, P.J., 1978, 'Optical solitons and their spin wave analogues in 3He', in "Coherence and quantum optics IV" edited by L. Mandel and E. Wolf (New York: Plenum) pp. 762-780.
9. Abram, Izo, 1999, 'Quantum Solitons', Physics World, February 1999, pp. 21-25.
10. Solitons, 1980, Springer Topics in Current Physics, 17, edited by R.K. Bullough and P.J. Caudrey (Heidelberg: Springer-Verlag). Chap. 1, The soliton and its history, pp. 1-64, and the other Chaps.
11. Terng, Chuu-Lian and Uhlenbeck, Karen, 2000, Geometry of Solitons, Notices of the AMS, January, pp. 17-25.

12. Hitchin, N.J., Segal, G.B., and Ward, R.S., 1999, 'Integrable Systems. Twistors, Loop Groups, and Riemann Surfaces', Oxford Science Publications (Oxford: Clarendon Press) [ISBN 0 19 850421 7] eg. 'Introduction' by N.J. Hitchin, pp. 4-6.
13. 'The Geometric Universe, Science, Geometry and the Work of Roger Penrose', 1998, edited by S.A. Huggett, L.J. Mason, K.P. Tod, S.T. Tsou, and N.M.J. Woodhouse (Oxford: Oxford University Press) §1 p.5 (by Michael Atiyah) and §6, pp. 99-108 (by R.S. Ward) as well as other places. [ISBN 0 19 850059 9] (Hbk).
14. Solitons, 1980, Ref. [10], p.27 and the reference [1.67] there.
15. Liouville, J., 1855, Journal de Mathematique, XX, p.137.
16. Arnold, V.I., 1978, Mathematical Methods of Classical Mechanics (Berlin: Springer-Verlag) Chap. 10, pp.271-275 and pp.279-291.
17. Bullough, R.K., 1994, 'Instabilities in Nonlinear Dynamics: Paradigms for Self-Organization' in "On Self-Organization", Springer Series in Synergetics, Vol. 61, edited by R.K. Mishra, D. Maaß and E. Zwierlein (Berlin: Springer-Verlag) pp. 212-244.
18. Bullough, R.K. and Timonen, J.T., 1995, 'Quantum and Classical Integrable Models and Statistical Mechanics' in "Statistical Mechanics and Field Theory", edited by V.V. Bazhanov and C.J. Burden (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 336-414 [ISBN 981 02 2397 8].
19. Bullough, R.K. and Caudrey, P.J., 1995, Acta Applicandae Mathematicae, 39, 193-228. (This article as printed contains gross publisher's errors and interested readers are referred to the authors at UMIST, Manchester for an original typescript.)
20. Bullough, R.K., Jiang, Z. and Manakov, S.V., 1986, Proc. Intl. Conf. on Solitons and Coherent Structures, Santa Barbara, Jan. 1985. Physica 18D: Nonlinear Phenomena, pp. 305-307.
21. Konopelchenko, B. and Rogers, C., 1991, Phys. Lett., 158A, 391.
22. Jiang, Z., 1987, 'Integrable Systems and Integrability', Ph.D. Thesis, University of Manchester, February.
23. Bogoliubov, N.M., Rybin, A.V., Bullough, R.K., and Timonen, J., 1995, 'Maxwell-Bloch system on a lattice', Phys. Rev. A, 52, No. 2, 1487-1493.
24. Korepin, V.E., Bogoliubov, N.M., and Izergin, A.G., 1993, 'Quantum Inverse Scattering Method and Correlation Functions' (Cambridge: Cambridge University Press) [Paperback, 1997. ISBN 0 521 58646 1].
25. Ward, R.S., 1985, 'Integrable and solvable systems and relations among them', Phil. Trans. Roy. Soc. London A315, 451-457 (Discussion meeting 'New Developments in the Theory and Application of Solitons').
26. D'Ariano, G.M., Montorsi, A., and Rasetti, M.G., 1985, 'Integrable Systems in Statistical Mechanics' (Singapore: World Scientific Publishing Co. Pte. Ltd.) pp. 96-127 and the work of E. Date, M. Jimbo, M. Kashiwara and T. Miwa, and of M. Sato and Y. Sato referenced.

27. Cheng, Yi, 1987, 'Theory of Integrable Lattices', Ph.D. Thesis, University of Manchester, January.
28. Weyl, H., 1931, 'The Theory of Groups and Quantum Mechanics' (New York: Dover Publications, Inc.) paperback edition (translated from the German by H.P. Robertson, September).
29. Bogoliubov, N.M. and Bullough, R.K., 1992, 'A q-deformed completely integrable Bose gas model', J. Phys. A: Math. Gen. 25, 4057-4071.
30. Bullough, R.K., Olafsson, S., Chen, Yu-zhong, and Timonen, J., 1990, 'Integrability conditions: recent results in the theory of integrable models' in "Differential Geometric Methods in Theoretical Physics" (NATO ARW 'Physics and Geometry' 1989) edited by Ling-Lie Chau and Werner Nahm (New York: Plenum Press) pp. 47-69.
31. Bullough, R.K. and Bogoliubov, N.M., 1992, 'Quantum Groups: q-Boson Theories of Integrable Models' in Proc. XXth Intl. Conf. on Diff. Geometric Methods in Theoretical Physics Vol. 1 edited by Sultan Catto and Alvany Rocha (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 488-504 [ISBN 981 02 0827 6 (Vol 1)].
32. 'Quantum Groups', 1990, Springer Lecture Notes in Physics, edited by H.-D. Doebner and J.-D. Henning (Berlin: Springer-Verlag) [ISBN 3 540 53503 9].
33. Wigner, E.P., 1960, 'The unreasonable effectiveness of mathematics in the natural sciences', Comm. Pure and Applied Maths 13, 1.
34. Bullough, R.K., 1977, 'Solitons' in "Interaction of radiation with condensed matter. Vol. 1" IAEA-SMR-20/51 (Vienna: International Atomic Energy Agency) pp. 381-469.
35. Deutsch, D. and Ekert, A., 1998, 'Quantum Information. Quantum Computation', Physics World, March 1998, pp. 47-52.
36. Ablowitz, M.J., Kaup, D.J., Newell, A.C. and Segur, H., 1973, Phys. Rev. Lett., 31, 125.
37. Lamb, G.L., 1973, Phys. Rev. Lett., 31, 196.
38. Schweber, S.S., 1961, 'An Introduction to Relativistic Quantum Field Theory' (New York: Harper and Row, Publishers Inc.) Chap. 3.
39. Calogero, Francesco, 1995, 'Integrable Nonlinear Evolution Equations and Dynamical Systems in Multidimensions', Acta Applicandae Mathematicae 39, 229-244; and Calogero, F., 'Universal Integrable Nonlinear PDEs' in "Applications of Analytic and Geometric Methods to Nonlinear Differential Equations" (Dordrecht, Holland: Kluwer Academic Publishers) pp. 109-114; and references.
40. Bullough, R.K., Thompson, B.V., Nayak, N. and Bogoliubov, N.M., 1995, 'Microwave cavity quantum electrodynamics, I: one and many Rydberg atoms in microwave cavities; and II: fundamental theory of the micromaser' in "Studies in Classical and Quantum Nonlinear Optics" edited by Ole Keller (Commack, New York: Nova Science Publishers Inc.) pp. 609-623 [ISBN 1 56072 168 5].

41. Hynne, F., and Bullough, R.K., 1984, 'The scattering of light. I. The optical response of a finite molecular fluid', Phil. Trans. R. Soc. Lond. A, 312, 251.
42. Hynne, F., and Bullough, R.K., 1987, 'The scattering of light, II. The complex refractive index of a molecular fluid', Phil. Trans. R. Soc. Lond. A, 321, 305.
43. Hynne, F., and Bullough, R.K., 1990, 'The scattering of light, III. External scattering from a finite molecular fluid', Phil. Trans. R. Soc. Lond. A, 330, 253.
44. Bullough, R.K., Batarfi, H.A., Hassan, S.S., Ibrahim, M.N.R., and Saunders, R., 1996, in 'ICONO '95 Atomic and Quantum Optics: High Precision Measurements' edited by Sergei N. Bagayev and Anatoly S. Chirkin, Proc. SPIE 2799, pp.320-328; and see the other references, 91, 92, 93 in Ref. [7].
45. Bullough, R.K., Hassan, S.S. and Ibrahim, M.N.R., 2000, 'A nonlinear refractive index theory of optical multi-stability in normal and squeezed vacua: analytical and numerical results'. One of a sequence of papers to be published. Also see M.N.R. Ibrahim, Ph.D. thesis, UMIST, 1996.
46. Caudrey, P.J., and Eilbeck, J.C., 1977, Phys. Lett., 62A, 65.
47. Gibbs, H.M. and Slusher, R.E., 1972, Phys. Rev. A, 6, 2326-2334.
48. Bullough, R.K., 1995, 'Optical solitons, chaos and all that: thirty years of quantum optics and nonlinear phenomena' in Proceedings of the First International Scientific Conference (Science and Development) Organized by the Faculty of Science, Al-Azhar University, Cairo, 20-23 March 1995. Edited by Prof. Dr. Ahmed M. El-Naggar (Dean, Faculty of Science), and Prof. Dr. Abd El-Wahab A. El-Sharkawy (Vice Dean).
49. Arnold, V.I., 1978, Ref. [16], p.285.
50. Bullough, R.K., and Caudrey, P.J., 1980, Ref. [10], p.3 and following pages.
51. Chiao, R.Y., Garmire, E., Townes, C.H., 1964, Phys. Rev. Lett. 13, 479.
52. Kelley, P.L., 1965, Phys. Rev. Lett. 15, 1005.
53. Bullough, R.K. and Caudrey, P.J., 1980, Ref. [10] pp.2-5 and Appendix pp.373-378. And see also [19,54].
54. Bullough, R.K., 1988, '"The Wave" "par excellence", the solitary great wave of equilibrium of the fluid - an early history of the solitary wave' in "Solitons", edited by M. Lakshmanan, Springer Series in Nonlinear Dynamics (Heidelberg: Springer-Verlag) pp. 7-42.
55. Bullough, R.K., Chen, Yu-zhong and Timonen, J., 1990, 'Soliton statistical mechanics - thermodynamic limits for quantum and classical integrable models' in "Nonlinear World Vol. 2" edited by V.G. Bar'yakhtar, V.M. Chernousenko, N.S. Erokhin, A.G. Sitenko, and V.E. Zakharov (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 1377-1422.
56. Bullough, R.K. and Caudrey, P.J., 1980, Ref. [10] pp. 29-36.
57. Faddeev, L.D., and Takhtajan, L.A., 1987, 'Hamiltonian Methods in the Theory of Solitons' (Berlin: Springer-Verlag).

58. Bullough, R.K., 1980, 'Solitons: inverse scattering and its applications' in "Bifurcation phenomena in mathematical physics and related problems" edited by D. Bessis and C. Bardos (Dordrecht, Holland: D. Reidel Publ. Co.) pp.295-349.
59. Bullough, R.K., and Dodd, R.K., 1977, 'Solitons' in "Synergetics" Proc. Intl. Workshop on Synergetics, Bavaria, May 1977, edited by H. Haken (Heidelberg: Springer-Verlag) pp.92-119.
60. Dodd, R.K. and Bullough, R.K., 1979, 'The generalised Marchenko equation and the canonical structure of the A.K.N.S.-Z.S. method', Physica Scripta 20, 364-381.
61. Caudrey, P.J., 1990, 'Spectral transforms' in "Soliton theory: a survey of results" edited by Allan P. Fordy (Manchester: Manchester University Press) pp.25-54 [ISBN 0 7190 1491 3]; also see Caudrey, P.J., 1980, Phys. Letts. A 79, 264 referenced.
62. Caudrey, P.J., 1990, 'Two dimensional spectral transforms' in "Soliton theory: a survey of results" edited by Allan P. Fordy (Manchester: Manchester University Press) pp.55-74 [ISBN 0 7190 1491 3].
63. Bullough, R.K. and Bogoliubov, N.M., 1992, 'Quantum groups: q-boson theories of integrable models and application to nonlinear optics' in "Proc. III Potsdam-V Kiev Intl. Workshop on Nonlinear Processes in Physics" edited by A.S. Fokas, D.J. Kaup, A.C. Newell and V.E. Zakharov (Berlin: Springer-Verlag) pp. 232-240.
64. Fukuma, M., Kawai, H., and Nakayama, Ryuichi, 1991, Int. J. Modern Physics A6(8), 1385-1406.
65. Aoyama, S., and Kodama, Y., 1992, Phys. Lett. B278, 56-62.
66. Migdal, A.A., 1995, 'Quantum Gravity as Dynamical Triangulation' in "Statistical Mechanics and Field Theory" edited by V.V. Bazhanov and C.J. Burden (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 214-252.
67. Bullough, R.K., and Olafsson, S., 1989, 'Algebra of Riemann-Hilbert Problems and the Integrable Models - a sketch' in "Proc. XVII Intl. Conference on Differential Geometric Methods in Theoretical Physics" edited by Allan I. Solomon (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp.295-309 [ISBN 9971 50 836 2].
68. Bullough, R.K., and Sasaki, R., 1980, 'Geometry of the AKNS-ZS Inverse Scattering Scheme' in "Nonlinear Evolution Equations and Dynamical Systems", Springer Lecture Notes in Physics 120, edited by M. Boiti, F. Pempinelli and G. Soliani (Berlin: Springer-Verlag) pp.314-337.
69. Drinfel'd, V.G., 1983, 'Hamiltonian structure on Lie groups, Lie bialgebras and the geometric meaning of the classical Yang-Baxter equations', Soviet Math. Dokl., 27, 68-71.
70. Bogoliubov, N.M., Bullough, R.K., and Timonen, J., 1996, 'Exact solution of generalised Tavis-Cummings models in quantum optics', J. Phys. A: Math. Gen. 29, No. 19, 6305-6312.

71. Bullough, R.K., Bogoliubov, N.M. and Puri, R.R., 2000, 'Proc. NEEDS in Leeds meeting 1998', edited by A.P. Fordy and A.V. Mikhailov. Published (in Russian) in the Russian J. of Theor. and Math. Phys., Vol. 122, No. 2, February 2000, pp. 182-204 [English Translation, 2000, Theor. Math. Phys., Vol. 122, No. 2, pp.151-169, ISSN 0040 5779].
72. Puri, R.R., Kumar, S. Arun, and Bullough, R.K., 'Stroboscopic Theory of Atom Statistics in the Micromaser', preprint 2000.
73. Joshi, A., Kremid, A., Nayak, N., Thompson, B.V., and Bullough, R.K., 1996, 'Exact trapping state dynamics for the $^{85}$Rb atom micromaser at very high Q and/or very low T', J. Mod. Optics 43, No. 5, 971-992.
74. Bullough, R.K., Joshi, Amitabh, Nayak, N., and Thompson, B.V., 1996, 'The micromaser at very low temperatures' in "Notions and perspectives of nonlinear optics" edited by Ole Keller. Series in Nonlinear Optics (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 13-87 [ISBN 981 02 2627 6].
75. Bullough, R.K. and 10 co-authors, 1989, 'Giant Quantum Oscillators from Rydberg Atoms: atomic coherent states and their squeezing from Rydberg atoms' in "Squeezed and Nonclassical Light" edited by P. Tombesi and E.R. Pike, NATO ASI Series B: Physics Vol. 190 (New York: Plenum Press) pp.81-106 [ISBN 0 306 43084 3].
76. Bogoliubov, N.N., 1947, Journal of Physics, 9, 23; Vestn. MGU 7, 43.
77. Bogoliubov, N.N., Tolmachev, V.V., and Shirkov, D.V., 1959, 'A new method in the Theory of Superconductivity' (New York: Consultants Bureau Inc.) p.8, eqn. (1.8).
78. Huang, K., 1987, Statistical Mechanics (2nd Edn.) (New York: John Wiley and Sons Inc.) Chap. 12, pp.278-304.
79. Bogoliubov, N.N., Tolmachev, V.V. and Shirkov, D.V., 1959, Ref. [77] pp.8-9.
80. Kadanoff, Lev P. and Baym, Gordon, 1962, 'Quantum statistical mechanics' (New York: W.A. Benjamin, Inc.).
81. Bogoliubov, N.N., Tolmachev, V.V. and Shirkov, D.V., 1959, Ref. [77] p.6.
82. Faddeev, L.D., and Popov, V.N., 1965, Sov. Phys. JETP 20, 840.
83. Popov, V.N., 1983, 'Functional Integrals in Quantum Field Theory and Statistical Physics' (Dordrecht, Holland: D. Reidel Publ. Co.).
84. Popov, V.N., 1990, 'Functional integrals and collective excitations' (Cambridge: Cambridge University Press) [ISBN 0521 407 877 paperback].
85. Bogoliubov, N.M., Bullough, R.K., Kapitonov, V.S., Malyshev, C., and Timonen, J., 2000, 'Finite-temperature correlations in the trapped Bose-Einstein condensate'. Submitted to Phys. Rev. Lett. at September 1999; resubmitted April 2000.
86. Kleppner, D., 1997, Physics Today, August 11.
87. Bloch, I., Esslinger, T., and Hansch, T.W., 1999, Phys. Rev. Lett., 82, 3008.

88. Bloch, I., Hansch, T.W., and Esslinger, T., 2000, Nature 403, 166.
89. DiVincenzo, D. and Terhal, B., 1998, 'Quantum Information. Decoherence: the obstacle to quantum computation' in Physics World, March, pp.53-57.
90. Haroche, S., Nogues, G., Rauschenbeutel, A., Osnaghi, S., Brune, M., and Raimond, J.M., 1999, 'Quantum knitting in cavity QED' in "Laser spectroscopy XIV International Conference" edited by Rainer Blatt, Jürgen Eschner, Dietrich Leibfried and Ferdinand Schmidt-Kaler (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 140-149.
91. Jaynes, E.T. and Cummings, F.W., 1963, Proc. IEEE 51, 89.
92. Lloyd, S., 1995, Phys. Rev. Lett. 75, 346.
93. Barenco, A., et al., 1995, Phys. Rev. Lett., 74, 4083.
94. Rauschenbeutel, A., Nogues, G., Osnaghi, S., Brune, M., Raimond, J.-M., and Haroche, S., 1999, 'Generation of GHZ Type Three-atom Correlations in a Cavity QED Experiment' in "Laser Spectroscopy" Ref. [90] pp.364-365.
95. Wadati, Miki and Sakagami, Masa-aki, 1984, 'Classical soliton as a limit of the quantum field theory', Journal of the Physical Society of Japan, 53, No. 6, pp. 1933-1938.
96. Bullough, R.K. and Timonen, J.T., 1995, Ref. [18], p.29.
97. Spalter, S., Korolkova, N., Konig, F., Sizman, A., and Leuchs, G., 1998, Phys. Rev. Lett. 81, 786.
98. Schmitt, S., Ficker, J., Wolff, M., Konig, F., Sizman, A., and Leuchs, G., 1998, Phys. Rev. Lett. 81, 2446.
99. Drummond, P.D., Shelby, R.M., Friberg, S.R., and Yamamoto, Y., 1993, Nature, 365, 307.
100. Friberg, S.R., Machida, S. and Yamamoto, Y., 1992, Phys. Rev. Lett. 69, 3165.
101. Lakshmanan, M. and Bullough, R.K., 1980, 'Geometry of generalised non-linear Schrodinger and Heisenberg ferromagnet spin equations with linearly x-dependent coefficients', Phys. Lett. 80A, 287-292.
102. Zakharov, V.E., 1991, 'What is integrability?', Springer Series in Nonlinear Dynamics (Berlin: Springer-Verlag).
103. Jiang, Zuhan and Bullough, R.K., 1995, Physica Scripta 51, 545-548, and references.
104. Bullough, R.K. and Olafsson, S., 1989, 'Complete integrability of the integrable models: quick review' in "IXth Intl. Congress on Math. Phys." edited by B. Simon, A. Truman and I.M. Davies (Bristol: Adam Hilger) pp. 329-334 [ISBN 0 85274 250 9]. Unfortunately the relevant paper called OB in this paper was never completed.
105. Bullough, R.K., Caudrey, P.J., and Gibbs, H.M., 1980, 'The Double Sine-Gordon Equations: A Physically Applicable System of Equations' in "Solitons" Ref. [10] pp.107-141.
106. Feynman, R.P. and Hibbs, A.R., 1965, Quantum Mechanics and Path Integrals (New York: McGraw-Hill Inc.) and earlier work.

107. Bogoliubov, N.M., Bullough, R.K. and Timonen, J., 1994, 'Critical behaviour of strongly coupled boson systems in 1+1 dimensions', Phys. Rev. Lett. 72, No. 25, 3933-3936.
108. Baxter, R.J., 'Exactly Solved Models in Statistical Mechanics' (London: Academic Press Inc. (London) Ltd.) [ISBN 0 12 083180 5].

Note added in proof: The reader will find articles relevant to the Section 5 of this paper, 'Quantum Information', in the Ref. [13] and in articles 26. and 27. of Ref. [13] in particular: article 27. references a number of fundamental papers on 'non-locality', 'complexity', entanglement, quantum teleportation, and the (unsolved) problems of quantum measurement, all within the general topic of quantum mechanics, which I could at best scarcely touch on in my Section 5. My own reference to 2-dimensional quantum gravity (perhaps relevant to these problems) is in the §6 of [19]. I also draw attention in this connection to the Note 1 on the p.30 of [19].

In a further additional note I add some comments on functional integration in the contexts of this paper. Statistical mechanical partition functions $Z = \int \mathcal{D}\mu \exp S[p]$ as functional integrals with a measure $\mathcal{D}\mu$ for one space and one time dimension (1 + 1 dimensions) are introduced into the text following Eqn. (27). This is in reference to this particular functional integral appearing in the 'map' Fig. 8 at its top right corner. The classical action $S[p]$ in this expression is described in terms of the action-angle variables typified by those appearing in the text below Eqn. (24) for the sine-Gordon model, extended however to periodic boundary conditions as is explained in the Ref. [18]. This Ref. [18] develops such a statistical mechanics for all of the integrable models in 1 + 1 dimensions in terms of such partition functions Z. Functional integration is also used implicitly in the §4 of the text, but these functional integrals are evaluated in terms of the fields $\phi(\mathbf{r}, t)$ rather than in terms of action-angle variables in order to compute both the partition functions Z and the correlation functions $G(\mathbf{r}, \mathbf{r}')$ in each of d = 3, 2 and 1 space dimensions given in Eqns. (38), (39) and (40) respectively, and these functional integral calculations are presented as such in Ref. [85] ($\mathbf{r} \in \mathbb{R}^d$ is a vector in d dimensions, and for the nonlinear Schrodinger models transformation to action-angle variables is not possible in d + 1 dimensions for d > 1 [18]). The 'map' Fig. 8 makes a second reference to a functional integral, namely in the 'box' at bottom right "Knot (link) Polynomials Partition Functions Z, Jones Polynomials". As a partition function Z this connects (tenuously) with the partition function Z at top right (see the dashed line connecting from the box top right, via "Invariants of some manifold", to the "Knot" box bottom right). The functional integral Z for the "Knot" box is not quoted explicitly but is written in an invariant form independent of metric as $Z[A] = \int \exp iS(A)\, \mathcal{D}A$, where $S(A) = S$ is the integral of the Chern-Simons 3-form given in the box at absolutely bottom right and $i = \sqrt{-1}$. This "3-form" box connects directly to the "Knot" box and the reference here is to Witten, E., 1989, 'Some geometrical applications of quantum field theory' in Proc. IXth Intl. Congress on Math. Phys. edited by B. Simon, A. Truman and I.M. Davies (Bristol, Adam Hilger) (ISBN 0-852-74-250-9) p.81. But readers might also see Johnson, Gerald W. and Lapidus, Michael L., 2000, 'The Feynman Integral and Feynman's Operational Calculus' (Oxford, Clarendon Press) p. 643. For further geometrical aspects see the same book pp. 637-659 (say), and concerning Witten's 'topological invariants' of his paper see particularly the quote from Atiyah on the p.641 with reference to current interest in relations between geometry and physics. Unfortunately there is an obvious, but long standing, error in the 'map' Fig. 8 where the Chern-Simons 3-form should read $\mathrm{Tr}(A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A)$ with the extra '
Department of Mathematics, UMIST

P O Box 88

Manchester M60 1QD

UK

Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 123-140)


Concepts for non-smooth dynamical systems

Tassilo Küpper
Math. Inst., Univ. of Cologne, Weyertal 86-90, D-50931 Cologne, Germany.
kuepper@mi.uni-koeln.de

April 27, 2000

Abstract

We present some concepts within the area of dynamical systems which have been extended to non-smooth differential equations. These include the definition of Lyapunov exponents, extension of Conley index or KAM theory, an adaptation of the Melnikov technique for the detection of chaos and an approach to generalize Hopf bifurcation.

Keywords: Non-smooth dynamical systems, bifurcation, chaos.
AMS Classification: 34A60, 34Cxx, 70K50.

1 Introduction

The area of dynamical systems can be considered as one of the topics that have governed mathematical research in the past century. Starting with Poincaré at the end of the 19th century, great progress has been achieved in investigating the qualitative behaviour of evolutionary problems. A fascinating review of that kind of research has been provided by Palis during this conference. Important achievements in the study of dynamical systems rely on smoothness properties of the systems, since in many cases linearization techniques are employed. For non-smooth systems such techniques are not at hand, and for that reason new methods have to be developed. Because of such difficulties, effects leading to non-smoothness have been neglected for a long time. The need for a better understanding of dynamical processes in engineering requires improved modeling including effects like dry friction, impacts, discontinuous switches etc., see [38, 39, 4]. Moreover, some recent applications are based on a direct use of such non-smooth effects like stick-slip motions, see [2].

In an obvious way discontinuities may arise due to geometrical properties such as corners in a billiard. The lack of differentiability implies that there is no uniquely defined tangent in a corner, which might lead to non-uniqueness in the evolution of a dynamical system. Impacts usually will cause jumps in the velocity components of a mechanical system; an elementary example is provided by the impact oscillator, while more realistic situations can be found in machine dynamics. An important source of non-smooth behavior is dry friction arising in dampers, drilling processes or rail-wheel contacts, audible as creaking. In a recent application efficient use of stick-slip motions due to dry friction in a micromechanical positioning system has been made [2]. State-dependent switches in electrical, physical or biological systems also lead to differential equations involving state-dependent discontinuities.

We list a few examples which have been used as model examples in the investigation of non-smooth systems.

(i) Impact oscillator with external forcing f(t) and damping constant r at the reflection point ([4]):

$\ddot{x}(t) + x(t) = f(t) \qquad (x(t) < a)$
$\dot{x}(t^+) = -r\,\dot{x}(t^-) \qquad (x(t) = a)$

The forced impact oscillator is one of the simplest examples in which the new effect of grazing bifurcation can be illustrated.

Figure 1.1


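(ii) Rolling ball on a symmetrical surface with corners

Figure 1.2

\[ \ddot{x} + g \sin\alpha \cos\alpha \, \operatorname{sgn}(x) = 0 \]

(iii) Rocking block (Housner 1956)

Figure 1.3

[Equation garbled in the source; the surviving fragments show the coefficient $mgh/J_0$ and a case distinction on the sign of $\varphi$.]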

(iv) Pendulum with friction (Reissig [40, 41, 42, 43], Kunze [23])

Figure 1.4

\[ \ddot{x} = -r\dot{x} - c\,\operatorname{sgn}(\dot{x}) - kx + p(\Omega t) \]

(v) Single mass friction oscillator ([15, 38, 39, 31, 23, 24])

Figure 1.5

The effect of dry friction has been studied by several authors with the help of the friction oscillator. A block of mass $m$ is positioned on a belt moving with constant velocity $v_0$. The block is attached elastically to a wall through a spring. An external periodic forcing $u(t)$ is applied at the position of the spring. Let $x(t)$ denote the position of the block. The corresponding differential equation for $x(t)$ is of the form

\[ \ddot{x} + \omega_0^2 x = m^{-1} F_R(\dot{x} - v_0) + \omega_0^2 u(t). \]

The friction force $F_R(v) = -\operatorname{sgn}(v)\, F_N\, \mu(|v|)$ depends on the friction characteristic $\mu$. Various models based on theoretical and experimental investigation have been used [31, 15].

(vi) Multiple mass friction oscillator ([31])

Figure 1.6


The multiple mass friction oscillator is built in a similar way. We have studied an oscillator with two coupled masses, leading to a system of fourth order for the relative position of the blocks:

\[ m_1 \ddot{x}_1 = -c_1 x_1 - d_1 \dot{x}_1 - F_{R1}(\dot{x}_1 - v_0) + c_2 (x_2 - x_1) + d_2 (\dot{x}_2 - \dot{x}_1), \]
\[ m_2 \ddot{x}_2 = -c_2 (x_2 - x_1) - d_2 (\dot{x}_2 - \dot{x}_1) - F_{R2}(\dot{x}_2 - v_0). \]

The friction forces $F_{R1}$, $F_{R2}$ are given in a similar form as in (v).

(vii) Neural networks

A simple example describing the dynamics of 2 neurons is given by

\[ \dot{x} = -x + q_{11} f(x) - q_{12}\, y, \qquad \dot{y} = -y + q_{21} f(x). \]

The response function $f(x)$ is piecewise constant (i.e. $f(x) = \operatorname{sgn}(x)$) or a smooth approximation. A detailed analysis is given in [16].

Experimental observations as well as numerical simulations, applied both directly to the non-smooth systems and to mollified approximations, indicate that the standard scenario of bifurcations, such as saddle-node bifurcations, the onset of periodic motions, and period doubling up to chaotic behaviour, can occur [38, 39, 34, 15, 45]. Just by looking at experimentally observed data it is difficult to distinguish between, for example, periodic orbits of high order, quasiperiodic or chaotic behaviour. For smooth systems Lyapunov exponents provide a useful tool to classify various states. The notion of the standard Lyapunov exponents and their interpretation require the linearized flow, hence smoothness.

Within the frame of a DFG-Schwerpunktprogramm [25] we have extended concepts from the classical theory of dynamical systems to the non-smooth case. In this lecture we will mainly report on results concerning Lyapunov exponents, but first we briefly review a few other areas which have been investigated.

(i) Analysis of piecewise linear planar systems. For the symmetric case a complete description of the bifurcation behaviour has been obtained in [14]. Further studies concerning piecewise linear systems are treated for example in [22].


(ii) The Conley index is a topological method to prove existence and, in a generalized sense, bifurcation. In [26] we have extended classical results. The method is illustrated by an example describing the motion of wings.

(iii) Usually KAM theory requires extreme smoothness. Using a change of variables we have been able to extend some results to problems with a lack of smoothness [27].

(iv) Perturbations of planar systems with homoclinic orbits lead to chaotic behaviour. A well-established tool to analyze the influence of the perturbation is provided by the Melnikov function. In [47] we derive a generalized Melnikov function and show that similar results hold if the homoclinic orbit crosses the discontinuity.

(v) For smooth systems Hopf bifurcation is characterized by the crossing of a pair of complex-conjugate eigenvalues through the imaginary axis. Geometrically this is equivalent to the change of a stable focus to an unstable focus. While the analytical approach is not available for non-smooth systems due to the lack of a linearized problem, the geometrical setting might be used. This generalized concept for the onset of periodic orbits is studied in [32], and it turns out that there are two different mechanisms for planar systems.

2 The general setting of non-smooth systems

The mathematical treatment of non-smooth differential equations requires an extension of the standard notion of a solution. An appropriate definition is offered by the class of differential inclusions, in the sense that a (non-smooth) differential equation such as

\[ \dot{x}(t) = f(x(t)) \tag{2.1} \]

is replaced by a differential inclusion

\[ \dot{x}(t) \in F(x(t)). \tag{2.2} \]

The function $x(t)$ is called a solution of (2.2) on some interval $I$ if

(i) $x(t)$ is absolutely continuous on $I$ (so that $\dot{x}(t)$ exists a.e.) and

(ii) $\dot{x}(t) \in F(x(t))$ a.e. in $I$.

Here $F(x)$ denotes a suitable set-valued function. Usually the closed convex hull of all limits is taken. A formal definition is given by

\[ F(x) := \bigcap_{\delta > 0} \; \bigcap_{\mu(N) = 0} \operatorname{cl}\Bigl(\operatorname{conv}\bigl(f(\{\xi : \|x - \xi\| < \delta\} \setminus N)\bigr)\Bigr). \]

The following examples illustrate some obvious choices for $F$ and some typical difficulties which may arise.

Example (i) $\dot{x}(t) = \operatorname{sgn}(x(t)) =: f(x(t))$. Then

\[ F(x) = \begin{cases} \operatorname{sgn}(x) & \text{if } x \neq 0, \\ [-1, 1] & \text{if } x = 0. \end{cases} \]

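Solutions outside the line of discontinuity $x = 0$ are well-defined. For the initial value $x_0 = 0$ there are solutions of the differential equation

a) $x(t) = 0$ for $t \le 0$, $x(t) = t$ for $t > 0$, or

b) $x(t) = 0$ for $t \le 0$, $x(t) = -t$ for $t > 0$, or

c) $x(t) = 0$ for $t \in \mathbb{R}$.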

For that problem uniqueness is violated in forward time while it holds in backward time. The reverse situation is treated in

(ii) $\dot{x}(t) = -\operatorname{sgn}(x(t))$.

While in (i) the trivial solution $x = 0$ is unstable, it is stable in (ii).

(iii)
\[ \dot{x}(t) = -\tfrac{1}{2}\operatorname{sgn}(x(t)) + \cos(t) \in \begin{cases} -\tfrac{1}{2} + \cos t & x(t) > 0, \\ [-1/2, 1/2] + \cos t & x(t) = 0, \\ \tfrac{1}{2} + \cos t & x(t) < 0. \end{cases} \]

While in (ii) the trajectory eventually stays in the stationary solution $x = 0$ (a critical point of the set-valued function), in (iii) it leaves the critical point after a finite amount of time. The motion in the line of discontinuities is governed by a reduced differential equation which is determined by projections.
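The non-uniqueness in Example (i) and its absence in Example (ii) can be checked pointwise. The following is a minimal sketch, not part of the original lecture: the helper names `filippov_sgn` and `satisfies_inclusion` are hypothetical, and the leaving time `t0 = 1.0` is an arbitrary assumption (the cases listed in the text correspond to leaving at time 0). It tests on a time grid whether a candidate function satisfies the inclusion $\dot{x}(t) \in \pm F(x(t))$.

```python
def filippov_sgn(x):
    """Set-valued extension F of sgn, returned as a closed interval (lo, hi)."""
    if x > 0:
        return (1.0, 1.0)
    if x < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)

def satisfies_inclusion(x, dx, times, sign=1.0, tol=1e-12):
    """Check dx(t) in sign * F(x(t)) at every sample time."""
    for t in times:
        lo, hi = filippov_sgn(x(t))
        lo, hi = sorted((sign * lo, sign * hi))
        if not (lo - tol <= dx(t) <= hi + tol):
            return False
    return True

t0 = 1.0  # assumed time at which a trajectory leaves x = 0
times = [0.01 * k for k in range(300)]

# Each candidate is a pair (x(t), derivative of x at t).
candidates = {
    "leave upwards": (lambda t: max(0.0, t - t0), lambda t: 1.0 if t > t0 else 0.0),
    "leave downwards": (lambda t: -max(0.0, t - t0), lambda t: -1.0 if t > t0 else 0.0),
    "stay at zero": (lambda t: 0.0, lambda t: 0.0),
}

# Example (i): dx in F(x); Example (ii): dx in -F(x).
example_i = {name: satisfies_inclusion(x, dx, times, sign=1.0)
             for name, (x, dx) in candidates.items()}
example_ii = {name: satisfies_inclusion(x, dx, times, sign=-1.0)
              for name, (x, dx) in candidates.items()}
print(example_i)
print(example_ii)
```

Under these assumptions all three candidates pass for Example (i), while only the trivial solution passes for Example (ii), which matches the forward non-uniqueness and forward uniqueness described above.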


General results concerning the theory of differential inclusions can be found in Filippov [13] and Deimling [6], where in particular the standard concepts concerning existence, uniqueness, continuous dependence and stability are covered. For example, the difference with respect to uniqueness in Examples (i) and (ii) is captured by the notion of a one-sided Lipschitz condition, which is satisfied in case (ii) and does not hold in case (i).

In our approach we are rather concerned with differential inclusions treated as a dynamical system, hence we focus on qualitative properties such as stability and bifurcations. For the long-time behaviour, uniqueness of solutions in forward time is of great relevance.

The lack of smoothness causes difficulties which become obvious whenever linearization is needed. It is of course possible to avoid such difficulties with the help of smoothing techniques: when they are applied, classical results become available, but only approximate information is at hand. For a good understanding the limit procedure must be carried out. Another approach to smoothing is based on the embedding of the original dynamical system in a differential delay equation. Problems at discontinuities are reduced to straightforward integration, but again limits must be taken. It has been shown in [7] that this approach is equivalent to the differential inclusion approach.

Although it is still an interesting problem to study the limit procedure in a systematic way, we prefer to attack the non-smooth problems directly. As a first example we have studied whether Lyapunov exponents can be defined in a suitable way, and whether they provide similar information as in the smooth case. Straightforward numerical simulations for a series of problems showed a remarkable coincidence between the information gained by formal computation of the Lyapunov exponents and the corresponding bifurcation diagrams.

For that reason we tried to determine a class of non-smooth problems where Lyapunov exponents could be defined. This approach could be carried out either for differential equations directly or for the corresponding Poincaré maps. In the example of the friction oscillator Hubbuch [18] worked out parameter regions where Lyapunov exponents could be given in a meaningful sense for Poincaré maps, and also regions where that approach failed.
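The one-sided Lipschitz condition mentioned in this section can be made concrete for the two scalar examples. The sketch below assumes the standard scalar form of the condition, $(f(x) - f(y))(x - y) \le L\,(x - y)^2$ for all $x, y$, which is not spelled out in the text; the function names and sample grids are illustrative choices. It evaluates the quotient $(f(x) - f(y))(x - y)/(x - y)^2$ on two grids around the discontinuity.

```python
def sgn(x):
    return (x > 0) - (x < 0)

def one_sided_quotient(f, x, y):
    """(f(x) - f(y)) * (x - y) / (x - y)**2 for x != y."""
    return (f(x) - f(y)) * (x - y) / (x - y) ** 2

def worst_quotient(f, points):
    """Largest quotient over all pairs of distinct sample points."""
    return max(one_sided_quotient(f, x, y)
               for x in points for y in points if x != y)

# Sample points on both sides of the discontinuity x = 0 (0 itself excluded).
coarse = [k / 10 for k in range(-10, 11) if k != 0]    # spacing 0.1
fine = [k / 1000 for k in range(-10, 11) if k != 0]    # spacing 0.001

worst_i_coarse = worst_quotient(sgn, coarse)                 # Example (i)
worst_i_fine = worst_quotient(sgn, fine)                     # Example (i), finer grid
worst_ii = worst_quotient(lambda x: -sgn(x), coarse + fine)  # Example (ii)
print(worst_i_coarse, worst_i_fine, worst_ii)
```

For $f = \operatorname{sgn}$ the quotient across the discontinuity equals $2/|x - y|$ and grows without bound as the grid is refined, so no constant $L$ works; for $f = -\operatorname{sgn}$ the quotient never exceeds zero, so the condition holds with $L = 0$.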

3 Lyapunov exponents

Lyapunov exponents provide a well established tool to characterize the longtime behaviour of dynamical systems; for a review see Eckmann-Ruelle [12]. First we collect a few facts concerning the definition of Lyapunov exponents and their interpretation. For a smooth system

\[ \dot{x} = f(x) \]

we compare the evolution of nearby trajectories. Assume that $\varphi : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ denotes a global flow and let $\tilde{x}_0 = x_0 + z_0$. Then

\[ \varphi(t, \tilde{x}_0) - \varphi(t, x_0) \approx \partial_x \varphi(t, x_0)(\tilde{x}_0 - x_0) = \partial_x \varphi(t, x_0)\, z_0. \]

The quantity

\[ \lambda(x_0, z_0) := \limsup_{t \to \infty} \frac{1}{t} \ln \frac{\|\partial_x \varphi(t, x_0)\, z_0\|}{\|z_0\|} = \limsup_{t \to \infty} \frac{1}{t} \ln \|\partial_x \varphi(t, x_0)\, z_0\| \]

may be considered as a measure for the longtime evolution. Immediately two questions arise:

1. Does the lim sup exist as a true limit?

2. If so, when is the limit independent of $z_0$?

For the simple linear case with constant coefficients $\dot{x} = Ax$ we have $\varphi(t, x) = e^{tA}x$, hence $\partial_x \varphi(t, x) = e^{tA}$ ...

it can be shown using Oseledec's multiplicative ergodic theorem ([35]) that the true limit exists and there are $n$ Lyapunov exponents $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ which are independent of the initial data. Moreover, the Lyapunov exponents can be used to characterize attractors:

1. If all the Lyapunov exponents are negative there is a stable fixed point.

2. If there is a periodic orbit at least one Lyapunov exponent vanishes. If all other Lyapunov exponents are negative the periodic orbit is asymptotically stable.

3. If $k$ Lyapunov exponents vanish and all the other ones are negative there is an attracting quasiperiodic $k$-dimensional torus.

4. If at least one Lyapunov exponent is positive there is chaotic motion.

To extend the notion of Lyapunov exponents we assume the following setting:

1. The phase space $\mathbb{R}^n$ is separated into submanifolds $M_1, \dots, M_k, M_\infty$ such that $\mathbb{R}^n = \bigcup_{i=1}^{k} M_i \cup M_\infty$.

2. In each manifold the dynamical system is represented by a smooth system $\dot{x} = f_i(x)$ $(x \in M_i)$, where $f_i \in C^1(\mathbb{R}^n, \mathbb{R}^n)$.

3. There are well-defined switching conditions for the transition from one manifold to another. This implies uniqueness in forward time.

We further assume that there is no accumulation of switching times and that the switching is continuous; hence we allow simple crossing from one manifold to another or stick-slip motions, but no jumps such as those which occur in the impact oscillator.

Under those conditions there is a uniquely defined flow, and it is possible to define piecewise a linearization as well as a linearized transition from one manifold to another. This local process has already been worked out by Müller [33]. It is the merit of Kunze [23] to have derived a set of hypotheses which guarantee a global result, although these hypotheses are difficult to check for complex settings. Kunze has succeeded in working out all the details for the friction pendulum. Without listing all the technical assumptions we state the theorem which guarantees the existence of Lyapunov exponents almost everywhere.

Theorem 3.1 (Kunze [23], Michaeli [31]) There exists $G \subset \mathbb{R}^n$ such that

1. Lyapunov exponents are defined in $G$,

2. $G$ is 'large', i.e. $\mathbb{R}^n \setminus G$ is a set of measure zero.

On the basis of that theorem the numerical computation of Lyapunov exponents is justified. As far as the interpretation is concerned, Michaeli [31] has confirmed the stability of periodic orbits for non-smooth systems: if $\Gamma$ is a periodic orbit for a piecewise smooth system and if all Lyapunov exponents except the leading one are negative, then $\Gamma$ is asymptotically stable.

For a large class of examples we have carried out numerical computation of Lyapunov exponents. A comparison with the complete bifurcation diagram confirms their usefulness; for a review see [25]. In the case of the friction pendulum a direct comparison with analytical results is available. For large parameter areas we obtain coincidence between analytical and numerical results.

A new approach to characterize attractors is based on linearization techniques and on the computation of an invariant measure related to the attractors. The geometrical shape of an attractor provides useful information. For the understanding of the dynamics it is useful to know the time spent in various parts of the attractor. That kind of information is covered by the invariant measures. Dellnitz et al. [8, 9, 10] have followed this approach and developed techniques for the computation of attractors and invariant measures and for their visualization.
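As a sanity check on the definition of the Lyapunov exponent in the linear constant-coefficient case discussed earlier in this section, the following sketch estimates $\frac{1}{t}\ln\|e^{tA}z_0\|$ numerically. Everything specific here is an assumption made for illustration: the matrix `A` (eigenvalues $-1$ and $-2$), the step size, the integration times, and the use of a classical Runge-Kutta step with renormalization. It is not the method used in the cited works.

```python
import math

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rk4_step(A, z, h):
    """One classical Runge-Kutta step for the linear system z' = A z."""
    k1 = mat_vec(A, z)
    k2 = mat_vec(A, [zi + 0.5 * h * ki for zi, ki in zip(z, k1)])
    k3 = mat_vec(A, [zi + 0.5 * h * ki for zi, ki in zip(z, k2)])
    k4 = mat_vec(A, [zi + h * ki for zi, ki in zip(z, k3)])
    return [zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]

def lyapunov_estimate(A, z0, t_end, h=0.01):
    """Finite-time estimate of (1/t) ln ||exp(tA) z0|| / ||z0||."""
    norm0 = math.sqrt(sum(c * c for c in z0))
    z = [c / norm0 for c in z0]
    steps = int(round(t_end / h))
    log_growth = 0.0
    for _ in range(steps):
        z = rk4_step(A, z, h)
        norm = math.sqrt(sum(c * c for c in z))
        log_growth += math.log(norm)   # accumulate growth, then renormalize
        z = [c / norm for c in z]
    return log_growth / (steps * h)

A = [[-1.0, 1.0], [0.0, -2.0]]   # assumed example matrix, eigenvalues -1 and -2
lam_generic = lyapunov_estimate(A, [1.0, 1.0], t_end=200.0)
lam_special = lyapunov_estimate(A, [1.0, -1.0], t_end=10.0)  # eigenvector for -2
print(lam_generic, lam_special)
```

The generic direction gives an estimate close to $-1$ and the eigenvector direction an estimate close to $-2$, which illustrates why the second question above, the dependence of the limit on $z_0$, is a genuine one even in the linear case.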
It is a special feature occurring in non-smooth systems that trajectories may remain in discontinuities for some finite time. Such properties should be made visible by the corresponding invariant measures. Vosshage [46] has adapted the techniques developed by Dellnitz et al. [8, 9, 10] to non-smooth problems. We use the simple example of the friction oscillator to illustrate some of the features. The equations in normalized form are given by

\[ \dot{x}_1 = x_2, \qquad \dot{x}_2 = -x_1 - \operatorname{sgn}(x_2) + \gamma \sin(x_3), \qquad \dot{x}_3 = \eta. \]

To illustrate the complementary views provided by the different approaches we consider the friction oscillator in the resonant case $\eta = 1$. Figure 3.1 shows the bifurcation diagram and Figure 3.2 illustrates the Lyapunov exponents. For $\gamma \in [0, 1]$ there is a continuum of stationary solutions. For that reason two Lyapunov exponents vanish, and the third one is equal to $-\infty$. At $\gamma = 1$ a branch of periodic solutions bifurcates which is asymptotically stable. The leading Lyapunov exponent vanishes, the second one is negative and the third Lyapunov exponent is equal to $-\infty$, indicating that a stick-slip motion occurs, hence a partial reduction to a system of lower order. For $\gamma = 4/\pi$ there are infinitely many periodic solutions, and for $\gamma > 4/\pi$ all solutions are unbounded. There is no invariant measure, and for that reason there are no Lyapunov exponents, although a formal numerical computation seems to give some numbers which could naively be interpreted as Lyapunov exponents. This example should therefore serve as a warning that a formal computation of Lyapunov exponents without a sound mathematical justification is sometimes misleading.

The periodic solution, say for $\gamma = 10/9$, shows a small sticking phase (see [31]); this goes along with the fact that one Lyapunov exponent is equal to $-\infty$. The attractor as computed by the box method is shown in Figures 3.3 and 3.4; of course the attractor would shrink to the periodic orbit if the mesh size of the boxes were refined. The corresponding invariant measure is shown in Figure 3.5. Here it is obvious that a significant amount of time is spent in the discontinuity. Further examples have been worked out by Vosshage [46].
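The continuum of stationary solutions for small forcing amplitude can be read off from the normalized equations under the Filippov convention that $\operatorname{sgn}(0)$ is replaced by the interval $[-1, 1]$. The short derivation behind the sketch below is an assumption of this note rather than a statement from the lecture: a point $(x_1, 0)$ is stationary if $0 \in -x_1 - [-1, 1] + \gamma \sin(x_3)$ for every phase $x_3$, that is, if $|x_1| + \gamma \le 1$. The function names are hypothetical.

```python
import math

def is_stationary(x1, gamma, samples=2000):
    """Check the sticking condition |gamma*sin(phase) - x1| <= 1 for sampled phases."""
    for k in range(samples):
        phase = 2.0 * math.pi * k / samples
        # Sticking at x2 = 0 requires 0 in -x1 - [-1, 1] + gamma*sin(phase).
        if abs(gamma * math.sin(phase) - x1) > 1.0 + 1e-12:
            return False
    return True

def stationary_interval(gamma):
    """Closed-form set of stationary positions |x1| <= 1 - gamma, or None if empty."""
    half_width = 1.0 - gamma
    return (-half_width, half_width) if half_width >= 0 else None

print(is_stationary(0.3, 0.5), is_stationary(0.6, 0.5))
print(stationary_interval(0.5), stationary_interval(1.1))
```

Under this assumption the interval of rest positions shrinks to the single point $x_1 = 0$ at forcing amplitude 1 and is empty beyond it, which is consistent with the bifurcation picture described in the text.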

Figure 3.1 Bifurcation diagram (regions: stationary solutions, unique periodic solution, unbounded solutions).

Figure 3.2 Lyapunov exponents over $\gamma$ (regions labelled $-\infty, 0, 0$ and "no Lyapunov exponents", with $4/\pi$ marked).

Figure 3.3 Projection of the attractor on the $(x_2, x_3)$-plane.

Figure 3.4 Projection of the attractor on the $(x_1, x_3)$-plane.

Figure 3.5 Invariant measure.

References

[1] A. A. Andronov, A. A. Vitt & S. E. Khaikin, Theory of Oscillators. Dover Publications, Inc., New York, 1966.
[2] F. Altpeter, Friction modelling, identification and compensation. Thèse No. 1998, EPFL, 1999.
[3] M. di Bernardo, C. J. Budd & A. R. Champneys, Corner collision implies border collision bifurcation. Preprint, 2000.
[4] C. Budd & F. Dux, Chattering and related behaviour in impact oscillators. Phil. Trans. R. Soc. London, Ser. A 347, 365-389, 1994.
[5] C. Budd, F. Dux & A. Cliffe, The effect of frequency and clearance variations in single-degree-of-freedom impact oscillators. J. of Sound and Vibration, 184, 475-502, 1995.
[6] K. Deimling, Multivalued Differential Equations. Walter de Gruyter & Co., Berlin, New York, 1992.
[7] K. Deprez, Lösung unstetiger Differentialgleichungen mittels verzögerter Systeme. Diploma thesis, University of Cologne, 1993.
[8] M. Dellnitz, A. Hohmann, O. Junge & M. Rumpf, Exploring invariant sets and invariant measures. Chaos 7, no. 2, 221-228, 1997.
[9] M. Dellnitz & O. Junge, An adaptive subdivision technique for the approximation of attractors and invariant measures. Computing and Visualization in Science 1, 63-68, 1998.
[10] M. Dellnitz & O. Junge, On the approximation of complicated dynamical behaviour. SIAM J. Numer. Anal., 36(2), 491-515, 1999.
[11] A. Dontchev & F. Lempio, Difference methods for differential inclusions: a survey. SIAM Review, 34, 263-294, 1992.
[12] J. P. Eckmann & D. Ruelle, Theory of chaos and strange attractors. Reviews of Modern Physics, 57, 617-656, 1985.


[13] A. F. Filippov, Differential Equations with Discontinuous Righthand Sides. Kluwer Academic Publishers Group, Dordrecht, 1988.
[14] E. Freire & E. Ponce, Bifurcation sets of continuous piecewise linear systems with two zones. Preprint.
[15] U. Galvanetto, S. R. Bishop & L. Briseghella, Mechanical stick-slip vibrations. Int. J. Bifurcation and Chaos, 5, 637-651, 1995.
[16] F. Giannakopoulos, A. Kaul & K. Pliete, Qualitative analysis of a planar system of piece-wise linear differential equations with a line of discontinuity. Submitted to J. of Diff. Eq., 2000.
[17] H. Hölscher, U. D. Schwarz & R. Wiesendanger, Modelling of the scan process in lateral force microscopy. Surface Science, 375, 395-402, 1997.
[18] F. Hubbuch, Die Dynamik des periodisch erregten Reibschwingers. ZAMM 75, Suppl. I, 51-62, 1995.
[19] F. Hubbuch & A. Müller, Lyapunov Exponenten in dynamischen Systemen mit Unstetigkeiten. ZAMM 75, Suppl. I, 91-92, 1995.
[20] R. A. Ibrahim, Friction-induced vibration, chatter, squeal and chaos, part I: Mechanics of contact and friction. ASME Applied Mechanics Reviews, Vol. 47, no. 7, 209-226, 1994.
[21] R. A. Ibrahim, Friction-induced vibration, chatter, squeal and chaos, part II: Dynamics and modeling. ASME Applied Mechanics Reviews, Vol. 47, no. 7, 227-253, 1994.
[22] P. Jvanov, Stability of periodic motions with impacts. Preprint, 2000.
[23] M. Kunze, On Lyapunov exponents for non-smooth dynamical systems with an application to a pendulum with dry friction. To appear in J. Dynamics Diff. Eqs.
[24] M. Kunze & T. Küpper, Qualitative analysis of a non-smooth friction oscillator model. ZAMP 48, 1-15, 1997.

[25] M. Kunze & T. Küpper, Non-smooth dynamical systems. DFG research report, 1999.
[26] M. Kunze, T. Küpper & J. Li, On the Conley index theory to nonsmooth dynamical systems. J. Diff. Eqs. Vol. 13, 4-6, 479-502, 2000.
[27] M. Kunze, T. Küpper & J. You, On the application of KAM-theory to discontinuous dynamical systems. J. Diff. Eqs. 139, 1-21, 1997.
[28] T. Küpper, Non-smooth Dynamical Systems. CISM Lecture notes, Udine, 1999 (in publication).
[29] R. I. Leine, B. L. van de Vrande & D. H. van Campen, Bifurcations in nonlinear discontinuous systems. Internal report, Eindhoven University of Technology, report number 99.010.
[30] F. Lempio & V. Veliov, Discrete approximations of differential inclusions. Bayreuther Math. Schriften, 54, 149-232, 1998.
[31] B. Michaeli, Lyapunov Exponenten für nichtglatte dynamische Systeme. Dissertation, Köln, Oktober 1998.
[32] S. Moritz, "Hopf-Verzweigung" bei unstetigen planaren Systemen. Diploma thesis, University of Cologne, 2000.
[33] P. C. Müller, Lyapunov-Exponenten zeitvarianter nichtlinearer dynamischer Systeme mit Unstetigkeiten. ZAMM, 74, 70-72, 1994.
[34] H. E. Nusse, E. Ott & J. A. Yorke, Border-collision bifurcations: an explanation for observed phenomena. Phys. Rev. E, 49, 1073-1076, 1994.
[35] V. I. Oseledec, A multiplicative ergodic theorem: Lyapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197-231, 1979.
[36] F. Pfeiffer, Mechanische Systeme mit unstetigen Übergängen. Ing. Arch. 58, 232-240, 1984.


[37] K. Pliete, Über die Anzahl geschlossener Orbits bei unstetigen, stückweise linearen dynamischen Systemen in der Ebene. Diplomarbeit, Köln, September 1998.
[38] K. Popp & P. Stelter, Nonlinear oscillations of structures induced by dry friction. In Nonlinear Dynamics in Engineering Systems, IUTAM Symposium Stuttgart, 233-240, 1989. Ed. W. Schiehlen, Springer, Berlin-Heidelberg-New York, 1990.
[39] K. Popp & P. Stelter, Stick-slip vibrations and chaos. Phil. Trans. Roy. Soc. London A 332, 89-105, 1990.
[40] R. Reissig, Erzwungene Schwingungen mit zäher und trockner Reibung. Math. Nachrichten, 11, 345-384, 1954.
[41] R. Reissig, Erzwungene Schwingungen mit zäher Reibung und starker Gleitreibung, II. Math. Nachrichten, 12, 119-128, 1954.
[42] R. Reissig, Erzwungene Schwingungen mit zäher und trockner Reibung: Ergänzung. Math. Nachrichten, 12, 249-252, 1954.
[43] R. Reissig, Erzwungene Schwingungen mit zäher und trockner Reibung: Abschätzung der Amplituden. Math. Nachrichten, 12, 283-300, 1954.
[44] K. Taubert, Differenzenverfahren für Schwingungen mit trockener und zäher Reibung und für Regelungssysteme. Numer. Math., 20, 379-395, 1976.
[45] L. N. Virgin & C. J. Begley, Grazing bifurcations and basins of attraction in an impact-friction oscillator. Physica D, 30, 43-57, 1999.
[46] C. Vosshage, Visualisierung von Attraktoren und invarianten Mengen in nichtglatten dynamischen Systemen. Diploma thesis, University of Cologne, 2000.
[47] Y. Zou & T. Küpper, Melnikov method and detection of chaos for nonsmooth systems. Preprint, 2000.

Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 141-152)

RADICAL THEORY: DEVELOPMENTS AND TRENDS

RICHARD WIEGANDT

1. Introduction

I intend to give a glimpse of radical theory, of its origin and aim, featuring recent important developments and indicating the trends of research. Attention is focused on the decomposition and description of rings which are semisimple with respect to certain radicals, on characterizing and describing classes of rings by closure operations, on ring constructions including the most recent results which solve old and difficult problems in the negative, on radical theory in various algebraic and non-algebraic categories, and also on cardinality conditions. I am aware that visiting the Pyramids and the Egyptian Museum does not make anyone an egyptologist, though it may be a very instructive (and enjoyable!) tour. In a similar manner, my hope is to provide an overview of the main issues of radical theory which may be useful for the experts and also informative for non-specialists in radical theory.

In my opinion radical theory has contributed to the development of mathematics mainly in four aspects:

i) Living up to the original goal, its task has been to prove structure theorems for rings and algebras which are semisimple with respect to certain radicals.

ii) Studying and comparing classes (that is, properties) of rings via closure operations.

iii) Constructing rings which distinguish two given properties of rings. These rings sometimes possess very peculiar properties which may ruin beautiful expectations, but serve the better understanding of the structure of rings.

iv) The infiltration of radical theory into other branches of mathematics (for instance, topology, incidence algebras) opens new dimensions for researchers and enriches the arsenal of investigations.

2. Structure theorems

A central problem of ring theory is to determine the structure of rings in terms of linear transformations. Wedderburn [63] (1908) suggested and Köthe (1930) implemented an ingenious technique.
They discarded or ignored a "bad" ideal $R$ of a ring (or algebra) $A$ such that the factor ring $A/R$ has a "nice" structure (that is, representable by rings of linear transformations on vector spaces).

1991 Mathematics Subject Classification: 16N80. The support of the Hungarian National Foundation for Scientific Research Grant # T029525 is gratefully acknowledged.

Köthe [38] considered the unique largest nil ideal

\[ \mathcal{N}(A) = \sum (I \triangleleft A \mid I \text{ is a nil ring}) \]

as a "bad" ideal and determined the structure of $A/\mathcal{N}(A)$, which has no nonzero nil ideals. To each element $a \in \mathcal{N}(A)$ there exists an exponent $n \ge 1$ with $a^n = 0$, whence the elements of $\mathcal{N}(A)$ can be viewed as $n$-th roots of 0. Root in Latin is radix, so Köthe called the "pathological" ideal $\mathcal{N}(A)$ the (nil) radical of $A$. A ring $A$ is said to be nil semisimple if $\mathcal{N}(A) = 0$. The classical Wedderburn-Artin Structure Theorem describes the structure of a nil semisimple ring under the assumption that $A$ is artinian (that is, $A$ satisfies the descending chain condition (dcc) on left ideals).

Wedderburn-Artin Structure Theorem 2.1. A ring $A$ is artinian and nil semisimple if and only if $A$ is a direct sum of finitely many simple rings $A_1, \dots, A_n$, and each $A_i$ is isomorphic to a matrix ring $M_{k_i}(D_i)$ over a division ring $D_i$.

This Theorem also explains the origin of the attribute "semisimple".

The most general but satisfactorily efficient structure theorem describes the structure of Jacobson semisimple rings as subdirect sums of dense rings of linear transformations. The nil radical $\mathcal{N}(A)$ is to some extent not big enough to determine the nil semisimple rings in general. Jacobson [34] introduced a somewhat bigger radical via quasi-regularity. A ring $A$ is quasi-regular if for every element $x \in A$ there exists an element $y \in A$ such that $x + y - xy = 0$. The Jacobson radical $\mathcal{J}(A)$ of a ring $A$ is the unique largest quasi-regular ideal of $A$. A ring $A$ is said to be (left) primitive if $A$ contains a maximal left ideal $L$ such that $xA \subseteq L$ implies $x = 0$, or equivalently, $A$ has a faithful irreducible $A$-module (for instance $A/L$). The Jacobson radical $\mathcal{J}(A)$ can be represented as the intersection

\[ \mathcal{J}(A) = \bigcap (I \triangleleft A \mid A/I \text{ is a primitive ring}). \]
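For small commutative examples the nil radical and the Jacobson radical defined above can be computed by brute force. The sketch below works in the residue rings $\mathbb{Z}/n$, a choice made here purely for illustration; the function names are hypothetical. It uses the definition of quasi-regularity quoted in the text ($x + y - xy = 0$ for some $y$) and the fact that in a commutative ring with identity the ideal generated by $x$ is the set of multiples of $x$.

```python
def nil_radical(n):
    """Nilpotent elements of Z/n; they form the nil radical since Z/n is commutative."""
    return {a for a in range(n) if any(pow(a, k, n) == 0 for k in range(1, n + 1))}

def is_quasi_regular(x, n):
    """x is quasi-regular if x + y - x*y = 0 in Z/n for some y."""
    return any((x + y - x * y) % n == 0 for y in range(n))

def jacobson_radical(n):
    """Elements x of Z/n whose principal ideal consists of quasi-regular elements."""
    return {x for x in range(n)
            if all(is_quasi_regular((x * r) % n, n) for r in range(n))}

print(nil_radical(12), jacobson_radical(12))
print(nil_radical(30), jacobson_radical(30))
```

In these finite examples the two radicals coincide, as expected for artinian rings; the difference between them only shows up in larger rings. For the square-free modulus 30 both radicals are zero, and $\mathbb{Z}/30$ is the direct sum of the fields $\mathbb{Z}/2$, $\mathbb{Z}/3$ and $\mathbb{Z}/5$, a very small instance of the Wedderburn-Artin decomposition.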

Theorem 2.2. A ring $A$ is Jacobson semisimple (that is, $\mathcal{J}(A) = 0$) if and only if $A$ is a subdirect sum of primitive rings. A ring $A$ is primitive if and only if $A$ is a dense subring of a ring $\mathrm{Hom}(V, V)$ of linear transformations on a vector space $V$, that is, to any finitely many linearly independent elements $x_1, \dots, x_n \in V$ and arbitrary elements $y_1, \dots, y_n \in V$ there exists an element $t \in A$ such that $t x_i = y_i$ for $i = 1, \dots, n$.

We mention two more theorems describing the structure of simple rings. A subring $B$ of a ring $A$ is called a biideal if $BAB \subseteq B$. We say that a ring $A$ is a strongly locally matrix ring over a division ring $D$ if every finite subset $C$ of $A$ can be embedded into a biideal $B$ of $A$ such that $B \cong M_n(D)$ for some natural number $n$ depending on the size of $C$.

Litoff-Anh Theorem 2.3. A ring $A$ is simple, $A^2 \neq 0$ and $A$ has a minimal left ideal if and only if $A$ is a strongly locally matrix ring over a division ring.

In the original Litoff Theorem [35] only embeddability in subrings was demanded, and therefore it was not an "if and only if" theorem. The present version is due to Anh [9].

Theorem 2.4 (Beidar [12]). A ring $A$ is a matrix ring over a division ring if and only if $A$ is a prime ring such that to each nonzero element $a \in A$ the subset $aA$ contains a nonzero idempotent and the degrees of nilpotency of the elements in $A$ are bounded.

Every ring $A$ has a unique largest ideal $\tau(A)$ whose additive group is torsion. $\tau(A)$ is called the torsion radical of $A$. Every ring $A$ also possesses a unique largest von Neumann regular ideal $\nu(A)$, called the von Neumann regular radical of $A$.

Theorem 2.5 (F. Szász [59]). Every artinian ring decomposes into a direct sum

\[ A = \tau(A) \oplus F \]

where $F$ is a uniquely determined ideal of $A$ whose additive group is torsionfree.

Theorem 2.6 ([21], [49]). The von Neumann regular radical $\nu(A)$ is a direct summand in every artinian ring $A$, and $\nu(A)$ is a nil semisimple artinian ring.

3. Closure operations on classes of rings

In studying the structure of rings, various radicals have been introduced, and all of them share some common properties. Amitsur [4] and Kurosh [40] independently introduced the notion of general radicals. A class $\gamma$ of rings (that is, a property of rings) is said to be a radical class in the sense of Kurosh and Amitsur if

i) $\gamma$ is homomorphically closed,

ii) $\gamma$ has the inductive property: if $A = \bigcup I_\lambda$ and $\{I_\lambda\}$ is an ascending chain of ideals of $A$ such that all $I_\lambda \in \gamma$, then also $A \in \gamma$,

iii) $\gamma$ is closed under extensions: if $I, A/I \in \gamma$, then also $A \in \gamma$.

Every ring $A$ has a unique largest $\gamma$-ideal $\gamma(A)$, called the $\gamma$-radical of $A$. Each radical class $\gamma$ determines its semisimple class

\[ \mathcal{S}\gamma = \{\text{all rings } A \mid \gamma(A) = 0\}. \]

For a radical class $\gamma$ we shall often speak only of a radical $\gamma$. Obviously $\gamma = \{\text{all rings } A \mid \gamma(A) = A\}$. Semisimple classes can be characterized dually to radical classes.

Theorem 3.1 ([43], [53]). A class $\sigma$ of rings is a semisimple class of the upper radical class

\[ \gamma = \mathcal{U}\sigma = \{\text{all rings } A \mid A \text{ has no nonzero homomorphic image in } \sigma\} \]

if and only if $\sigma$ is hereditary ($I \triangleleft A \in \sigma$ implies $I \in \sigma$) and closed under subdirect sums and extensions.

Of particular interest are the special radicals introduced by Andrunakievich [8]. A class $\varrho$ of prime rings is said to be a special class if $\varrho$ is hereditary and closed under essential extensions (that is, if $I$ is an essential ideal of a ring $A$ ($I \cap K \neq 0$ for every nonzero ideal $K$ of $A$) and $I \in \varrho$, then $A \in \varrho$). The radical $\gamma = \mathcal{U}\varrho$ is called a special radical. The nil radical as well as the Jacobson radical are examples of special radicals.

Theorem 3.2 ([8]). Every special radical class $\gamma = \mathcal{U}\varrho$ is hereditary. Moreover, every $\gamma$-semisimple ring $A$ is a subdirect sum of rings in $\varrho$, and also of $\gamma$-semisimple prime rings.

Theorem 3.3 ([31]). A class $\gamma$ of rings is a special radical if and only if $\gamma$ is homomorphically closed, hereditary and satisfies condition
(*) if every nonzero prime homomorphic image of a ring $A$ has a nonzero ideal $I \in \gamma$, then $A \in \gamma$.

Theorem 3.4 ([52]). A class $\sigma$ of rings is the semisimple class of a special radical if and only if $\sigma$ is hereditary, closed under subdirect sums, essential extensions and satisfies condition
(**) if $A \in \sigma$, then $A$ is a subdirect sum of prime rings in $\sigma$.

A subclass $\varrho$ of rings is called a variety, if $\varrho$ is closed under taking homomorphic images, subrings and complete direct sums. As is well known, the varieties are just the subclasses of rings which can be defined by identities (e.g. $xy = yx$ or $x^n = x$ for a fixed $n$).

Theorem 3.5 ([11], [30], [45], [67]). For a class $\varrho$ of rings the following conditions are equivalent:
(i) $\varrho$ is a variety which is closed under extensions,
(ii) $\varrho$ is a radical class closed under subdirect sums,
(iii) $\varrho$ is a radical and a semisimple class,
(iv) $\varrho$ is a homomorphically closed semisimple class,
(v) $\varrho$ is a radical class closed under essential extensions,
(vi) $\varrho$ is a variety closed under essential extensions.

These classes have been determined explicitly by Stewart.

Theorem 3.6 ([58]). A class $\varrho$ satisfying any of the conditions of Theorem 3.5 is either $\varrho = \{0\}$ or $\varrho = \{\text{all rings}\}$ or $\varrho$ is the semisimple class of a special radical $\gamma = \mathcal{U}\mathcal{F}$ where $\mathcal{F} = \{F_1, \dots, F_n\}$ is the special class of finite fields $F_1, \dots, F_n$ such that if $G$ is a subfield of some $F_i \in \mathcal{F}$ then also $G \in \mathcal{F}$.

For any class $\gamma$ of rings we define its essential cover $\mathcal{E}\gamma$ as
$$\mathcal{E}\gamma = \{\text{all rings } A \mid A \text{ has an essential ideal } I \in \gamma\}.$$
In view of Theorem 3.5 (iii) and (v) a radical class $\gamma$ is a semisimple class if and only if $\mathcal{E}\gamma = \gamma$.

Theorem 3.7 ([19]). The essential cover $\mathcal{E}\gamma$ of a radical class $\gamma$ is a semisimple class if and only if $\gamma$ is hereditary and has a complement in the lattice of all hereditary radicals.

The classes of Theorem 3.7 are generalizations of radical and semisimple classes and have been explicitly determined in [13] and [14] in terms of matrix rings over finite fields.
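The subfield condition in Theorem 3.6 can be illustrated by a small computation. The sketch below is not taken from [58]; it assumes the standard fact that the subfields of the field with $p^n$ elements are exactly the fields with $p^d$ elements for $d$ dividing $n$, and it encodes such a field as the pair `(p, n)`. The function names are hypothetical and chosen only for this illustration.

```python
def divisors(n):
    """Return all positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def subfield_closure(fields):
    """Close a set of finite fields under taking subfields.

    A field with p**n elements is encoded as the pair (p, n); its
    subfields are the fields (p, d) with d dividing n (assumed fact).
    """
    closed = set()
    for p, n in fields:
        for d in divisors(n):
            closed.add((p, d))
    return closed


def is_subfield_closed(fields):
    """Check the closure condition of Theorem 3.6 for an encoded set."""
    return set(fields) == subfield_closure(fields)


# Hypothetical example: start from GF(2^6) and GF(3^2).
example = {(2, 6), (3, 2)}
closure = subfield_closure(example)
print(sorted(closure))
print(is_subfield_closed(example), is_subfield_closed(closure))
```

Under these assumptions, a finite class such as the closure computed here is the kind of set $\mathcal{F}$ that the theorem allows, while the starting set `example` is not.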


4. Ring constructions

Constructing rings with peculiar properties exhibits how delicate or nasty (up to the reader's taste) the behaviour of rings may be. This activity contributes substantially to a better understanding of the structure of rings and gives impetus for further research. To decide that two radical classes (or semisimple classes, or other properties of rings) are different, one has to construct a ring which belongs to one class but not to the other. Two famous examples:
i) Bergman [17] constructed a left primitive ring $A$ which is not right primitive. The Jacobson radical class is the class of all quasi-regular rings, so it is a left- and right-symmetric notion, and the same is true for its semisimple class. Hence in the constructed left primitive ring $A$, being Jacobson semisimple, the intersection of all right primitive ideals is zero.
ii) Sasiada [54] (see also Sasiada and Cohn [55]) constructed a simple prime ring which is also a Jacobson radical ring.

Kasch [37] defined the total $\mathrm{Tot}(A)$ of a ring $A$ as the set
$$\mathrm{Tot}(A) = \{a \in A \mid aA \text{ has no nonzero idempotents}\}.$$
In general $\mathrm{Tot}(A)$ is not closed under addition, so it need not be an ideal in $A$. In [16] the Kasch radical class $\mathcal{K}$ is defined as
$$\mathcal{K} = \{\text{all rings } A \mid \text{no nonzero homomorphic image of } A \text{ has zero total}\}.$$
The question whether the Kasch radical $\mathcal{K}$ is a special one is equivalent to the question whether $\mathcal{K}$ coincides with the radical class
$$\mathcal{K}_p = \{\text{all rings } A \mid \text{no nonzero prime homomorphic image of } A \text{ has zero total}\}.$$
Beidar [16] gave a quite involved genuine construction for a ring $G$ such that $\mathrm{Tot}(G) = 0$ but $\mathrm{Tot}(G/P) \neq 0$ for every prime ideal $P$ of $G$. The existence of such a ring $G$ proves that $\mathcal{K} \neq \mathcal{K}_p$, whence the Kasch radical $\mathcal{K}$ is not a special radical, though $\mathcal{K}$ possesses many nice properties (for instance, if $L$ is a left ideal of a ring $A \in \mathcal{K}$ then also $L \in \mathcal{K}$, and if $L = \mathcal{K}(L)$ is a left ideal of a ring $A$ then $L \subseteq \mathcal{K}(A)$).
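To make the definition of the total concrete, the following sketch computes $\mathrm{Tot}(\mathbb{Z}/n)$ by brute force for two small values of $n$. The choice of the rings $\mathbb{Z}/12$ and $\mathbb{Z}/8$ and all function names are assumptions made for illustration; they do not come from [37] or [16]. In these commutative examples the total happens to be closed under addition, which need not be the case for rings in general.

```python
def idempotents(n):
    """Return all e in Z/n with e*e = e."""
    return {e for e in range(n) if (e * e) % n == e}


def principal_ideal(a, n):
    """Return the set aA for A = Z/n (left and right ideals coincide here)."""
    return {(a * r) % n for r in range(n)}


def total(n):
    """Return Tot(Z/n): all a such that aA contains no nonzero idempotent."""
    nonzero_idempotents = idempotents(n) - {0}
    return {
        a for a in range(n)
        if not (principal_ideal(a, n) & nonzero_idempotents)
    }


print(sorted(idempotents(12)))  # [0, 1, 4, 9]
print(sorted(total(12)))        # [0, 6]
print(sorted(total(8)))         # [0, 2, 4, 6]
```

In $\mathbb{Z}/12$ only the multiples of 6 generate ideals without nonzero idempotents, while in $\mathbb{Z}/8$, whose only idempotents are 0 and 1, the total consists of all non-units.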
For a survey of rings and ring constructions which distinguish the various interesting radicals, the reader is referred to [68].

In the fundamental and seminal paper [38] Köthe posed the following problem.

Köthe's Problem. Does the nil radical $\mathcal{N}(A)$ of any ring $A$ contain every nil left ideal of $A$?

Although this problem was raised in 1930, it is still open and seems to be the hardest problem in ring theory. An affirmative answer has been verified for many different classes of rings, but it has withstood so far the efforts of several brilliant mathematicians. Köthe's Problem has many equivalent formulations; we present here two of them (Krempa [39]).

Is the $2 \times 2$ matrix ring over a nil ring $A$ a nil ring?

Is the polynomial ring $A[x]$ over a nil ring $A$ a Jacobson radical ring?

For a negative answer it would suffice to construct a nil ring $A$ such that $A[x]$ is not quasi-regular. Recently Agata Smoktunowicz [56] solved an old problem of Amitsur [5], and constructed a nil ring $A$ such that the polynomial ring $A[x]$ is not nil. Since nil rings are always quasi-regular, her result can be considered as an approximation of Köthe's Problem from below. An upper approximation was given by Puczyłowski and Smoktunowicz [50]: the polynomial ring $A[x]$ over a nil ring is always a Brown-McCoy radical ring (that is, $A[x]$ cannot be mapped homomorphically onto a simple ring with unity element).

Another famous and hard problem was posed by Levitzki. Does there exist a simple prime nil ring? It is the most recent development in ring theory that Smoktunowicz [57] constructed such a ring $S$. A negative answer to Köthe's problem could be given by proving that $S[x]$ is not a Jacobson radical ring, or that the $2 \times 2$ matrix ring $M_2(S)$ is not nil.

5. Radical theory in other categories

The infiltration of radical theory into other branches of mathematics opens new dimensions for researchers and enriches the arsenal of investigations. Radical theory can be developed in categories which are similar to that of rings (e.g. groups, modules, nonassociative rings, near-rings, rings with involution, $\Omega$-groups), and also in categories differing considerably from that of rings ($S$-acts, topological spaces, graphs).

It was Kurosh [41] and his collaborators who developed the radical theory for groups. For ordered groups it was done by Chehata and Wiegandt [22]. The radical theory of modules and abelian categories is called torsion theory.
The theory of hereditary torsions is highly developed (see for instance Golan [33]); much less is known about non-hereditary torsion theories (cf. [20]).

As mentioned earlier, semisimple classes of associative rings are always hereditary. This remains true for alternative rings and Jordan algebras with 1/2, but not for nonassociative rings in general. In fact, the "nice" radical theory (with hereditary semisimple classes) collapses to that of abelian groups, as proved by Gardner.

Theorem 5.1 ([27]). In the variety of all not necessarily associative rings, if a radical $\gamma$ has a hereditary semisimple class $\mathcal{S}\gamma$ then $\gamma$ depends only on the additive group of the rings: if $A$ and $B$ are rings with isomorphic additive groups and $A \in \gamma$ then also $B \in \gamma$.

The case of near-rings is of particular interest inasmuch as its radical theory degenerates in a nice way. A (right) near-ring $N$ is a not necessarily commutative group $(N, +)$ and a semigroup $(N, \cdot)$ such that addition and multiplication are linked by the right distributive law
$$(x + y)z = xz + yz \qquad \forall x, y, z \in N.$$
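A standard example of a near-ring, used here only as an illustration and not drawn from the cited papers, is the set of all maps of a group into itself with pointwise addition and composition as multiplication. The sketch below takes the cyclic group $\mathbb{Z}/3$ (an assumption; the names `add`, `mul` and `ELEMENTS` are hypothetical) and checks by brute force that the right distributive law holds, that the left one fails, and that $x0 \neq 0$ occurs for some elements $x$.

```python
from itertools import product

N = 3
# A map f: Z/3 -> Z/3 is stored as the tuple (f(0), f(1), f(2)).
ELEMENTS = list(product(range(N), repeat=N))
ZERO = (0,) * N  # the constant zero map, neutral for addition


def add(f, g):
    """Pointwise addition of maps."""
    return tuple((f[x] + g[x]) % N for x in range(N))


def mul(f, g):
    """Composition: (f * g)(x) = f(g(x))."""
    return tuple(f[g[x]] for x in range(N))


right_distributive = all(
    mul(add(f, g), h) == add(mul(f, h), mul(g, h))
    for f, g, h in product(ELEMENTS, repeat=3)
)
left_distributive = all(
    mul(f, add(g, h)) == add(mul(f, g), mul(f, h))
    for f, g, h in product(ELEMENTS, repeat=3)
)
# Elements x with x0 != 0: exactly the maps that do not fix 0.
not_zero_symmetric = [f for f in ELEMENTS if mul(f, ZERO) != ZERO]

print(right_distributive, left_distributive, len(not_zero_symmetric))
```

The maps fixing 0 form a subset on which $x0 = 0$ always holds; restricting to them gives a 0-symmetric example in the sense used in the text.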

Note that $x0 \neq 0$ may happen. If $x0 = 0$ is true for all $x \in N$, then we speak of a 0-symmetric near-ring. Betsch and Kaarli proved

Theorem 5.2 ([18]). If the semisimple class of near-rings is hereditary, then the corresponding radical class contains all nilpotent near-rings.

The converse, however, is not true: the class of all nil near-rings in the variety of 0-symmetric near-rings is a radical class with non-hereditary semisimple class (Kaarli [36]). Veldsman [61] proved that if a hereditary semisimple class is not the class of all near-rings then it consists entirely of 0-symmetric near-rings.

Theorem 5.3 ([60]). A radical class $\gamma \neq \{0\}$ of near-rings contains all nilpotent near-rings if and only if the semisimple class $\mathcal{S}\gamma$ is weakly homomorphically closed: $I$ […] $x^{*}$, called involution, satisfying the identities
$$x^{**} = x, \qquad (x + y)^{*} = x^{*} + y^{*}, \qquad (xy)^{*} = y^{*}x^{*} \qquad \forall x, y \in A.$$

Examples are commutative rings with identical involution $x \mapsto x$, real and complex matrices with transposition and adjoint matrix, respectively, and polynomial rings $A[x]$ over an involution ring subject to $x^{*} = x$.

Working in the category of involution rings with homomorphisms preserving also involution, we have fewer mappings, which makes the situation more difficult. In the case of associative rings the Anderson-Divinsky-Suliński Theorem [6] tells us that for any radical $\gamma$ and for any ideal $I$ of a ring $A$, $\gamma(I) \triangleleft A$ always holds. This is no longer so for involution rings.

Theorem 5.4 ([46]). For a radical $\gamma$ of involution rings the following are equivalent:
(i) $I \triangleleft A$ implies $\gamma(I) \triangleleft A$ for all ideals in every involution ring $A$,
(ii) if $A^{*} \in \gamma$ and $A^2 = 0$ then $A^{\circ} \in \gamma$ for every involution $\circ$ on $A$.

In proving structure theorems for rings, one-sided ideals play a decisive role. In the case of involution rings we cannot use one-sided ideals, because a left ideal $L$ closed under involution must also be a right ideal. Working, however, with biideals closed under involution (called *-biideals) one can prove quite strong structure theorems. Aburawash [1] proved the involutive version of the Wedderburn-Artin Structure Theorem 2.1.

Theorem 5.5. An involution ring $A$ is nil semisimple and satisfies dcc on *-biideals if and only if $A$ is a direct sum of finitely many matrix rings $M_{n_i}(D_i)$ with involution over a division ring $D_i$, $i = 1, \dots, r$, and of finitely many involution rings $K_j(D_j)$, $j = 1, \dots, s$, where each $K_j(D_j)$ is a direct sum of a matrix ring $M_{n_j}(D_j)$ and of the opposite ring $M_{n_j}^{op}(D_j)$ and the involution on $K_j(D_j)$ is the exchange involution
$$(x, y)^{*} = (y, x) \qquad \forall (x, y) \in K_j(D_j).$$

Notice that in $K_j(D_j)$ the only ideals which are closed under involution are 0 and $K_j(D_j)$, so we may call such involution rings *-simple. A *-simple involution ring is either a simple ring or the direct sum of two simple rings $I$ and $I^{op}$ endowed with the exchange involution. The involutive version of the Litoff-Anh Theorem is due to Aburawash [2].

Theorem 5.6. An involution ring $A$ is *-simple and possesses a minimal *-biideal if and only if every finite subset $C$ of $A$ can be embedded into a *-biideal $B$ of $A$ such that $B \cong M_n(D)$ whenever $A$ is a simple ring, and $B \cong K_m(D)$ whenever $A$ is not simple as a ring. Here $D$ is a fixed division ring and the natural numbers $n$ and $m$ depend on the size of $C$.

The maximal torsion ideal $\tau(A)$ of an involution ring $A$ is closed under involution and if $A$ satisfies dcc on principal *-biideals, then $\tau(A)$ is a direct summand of $A$ ([47]), analogously to the case of rings without involution (cf. Theorem 2.5).

Imposing chain conditions on *-biideals for involution rings is a stronger condition than imposing chain conditions on left ideals for rings without involution. This can be seen very well from the next two theorems.

Theorem 5.7 ([15]). If an involution ring $A$ satisfies dcc on *-biideals then its Jacobson radical $\mathcal{J}(A)$ satisfies dcc on additive subgroups. Hence $A$, as a ring, is artinian with artinian Jacobson radical.

The reader is reminded that the Jacobson radical of an artinian ring need not be an artinian ring.

Theorem 5.8 ([15]). For an involution ring $A$ the following are equivalent:
(i) the polynomial ring $A[x]$ satisfies the ascending chain condition on *-biideals,
(ii) $A$ is a finite direct sum of matrix rings over finite fields (as given in Theorem 5.5).

Theorem 5.8 can be compared with Hilbert's Basis Theorem which states that the polynomial ring $A[x]$ over a ring $A$ with unity element is noetherian (that is, satisfies the ascending chain condition on left ideals) if and only if $A$ is noetherian.
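The role of the opposite ring in the exchange involution of Theorem 5.5 can be checked on a toy model. The sketch below replaces the division ring by the integers and uses a handful of $2 \times 2$ matrices; these choices, and the names `k_mul`, `naive_mul` and `star`, are assumptions made purely for illustration. It verifies that swapping the components is an involution on $M_2 \oplus M_2^{op}$, and that the same swap fails to reverse products if the second summand is not replaced by its opposite ring.

```python
from itertools import product


def mat_mul(a, b):
    """Multiply 2x2 matrices stored as ((a11, a12), (a21, a22))."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def k_mul(u, v):
    """Product in M_2 (+) M_2^op: the second component multiplies reversed."""
    return (mat_mul(u[0], v[0]), mat_mul(v[1], u[1]))


def naive_mul(u, v):
    """Componentwise product in M_2 (+) M_2, without the opposite ring."""
    return (mat_mul(u[0], v[0]), mat_mul(u[1], v[1]))


def star(u):
    """Exchange involution (x, y)* = (y, x)."""
    return (u[1], u[0])


SAMPLES = [((1, 2), (3, 4)), ((0, 1), (1, 0)), ((2, 0), (1, 1)), ((1, 1), (0, 1))]
PAIRS = list(product(SAMPLES, repeat=2))  # elements (x, y) of the direct sum

involutive = all(star(star(u)) == u for u in PAIRS)
reverses_products = all(
    star(k_mul(u, v)) == k_mul(star(v), star(u))
    for u, v in product(PAIRS, repeat=2)
)
reverses_products_naive = all(
    star(naive_mul(u, v)) == naive_mul(star(v), star(u))
    for u, v in product(PAIRS, repeat=2)
)

print(involutive, reverses_products, reverses_products_naive)
```

The failure in the componentwise case comes from noncommuting sample matrices, which is why the second summand has to be the opposite ring.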
Let $S$ be a semigroup. An $S$-act $A$ is a set on which the semigroup $S$ operates, that is, to every $s \in S$ and $a \in A$ there is assigned an element $sa \in A$ subject to the rule
$$s(ta) = (st)a \qquad \forall s, t \in S \text{ and } a \in A.$$
Let $\kappa$ be a congruence relation on $A$. Some $\kappa$-cosets may be $S$-subacts. If $C = \{C_\lambda \mid \lambda \in \Lambda\}$ is a set of pairwise disjoint subacts of $A$, then there may be several congruence relations such that each $C_\lambda$, $\lambda \in \Lambda$, is a coset. $C$ determines, however, a smallest congruence $\kappa$ in which the cosets are the $S$-subacts $C_\lambda \in C$ and the singletons $\{a\}$ for each $a \in A \setminus \bigcup C_\lambda$. The radical theory of $S$-acts was developed by Lex, Amin and Wiegandt [44], [3].

For radical theory of semifields we refer to [64], [65], [66]. Recently Veldsman [62] developed the general radical theory of incidence algebras.

Topological spaces and graphs differ substantially from ring-like structures. Considering a topological space (or graph) $A$, any partition of the underlying set determines a congruence relation and a factor space (or factor graph, respectively) uniquely, and vice versa. The topologies (or graphs) defined on a fixed set $A$ form a lattice. No algebraic structure has this property. Nevertheless, radical theory can be interpreted for topological spaces ([51], [10]) as well as for graphs ([24]), and even for their common generalization, called abstract relational structures ([25]). Radical properties correspond to connectedness properties and semisimple properties to disconnectednesses.

For the so far most general Kurosh-Amitsur radical theory incorporating that of all kinds of algebraic structures as well as topological spaces and graphs, we refer the reader to [48].

6. Cardinality condition

In the theory of abelian groups it is quite common to impose cardinality conditions and set-theoretic assumptions (see [32]). Analogous questions are meaningful in ring theory, in particular in radical theory, but the situation is much more complicated and far less developed.

Let $\gamma$ be a radical of rings (or abelian groups). $\gamma$ satisfies the cardinality condition for ideals (subgroups), if there exists an infinite cardinal number $\alpha$ such that for every ring (abelian group) $A$, if $a \in \gamma(A)$ then $a \in \gamma(B)$ for some $B \triangleleft A$ ($B \subseteq A$, respectively) with $|B| < \alpha$.

Theorem 6.1 ([28]). If a radical $\gamma$ of rings satisfies the cardinality condition for ideals, then $\gamma = \{0\}$. For a radical $\gamma$ there exists an infinite cardinal number $\alpha$ such that every $\gamma$-ring is the sum of $\gamma$-ideals of cardinality $< \alpha$ if and only if $\gamma = \{0\}$ or $\gamma$ is the class of zero-rings on divisible $P$-groups for a set $P$ of primes.

Theorem 6.2 ([28]). If $\gamma$ is the upper radical of a variety of associative rings, then $\gamma$ satisfies the cardinality condition of subrings with respect to $\aleph_1$.

The relation between the cardinality condition of rings and abelian groups is given in

Theorem 6.3 ([28]). Let $\gamma$ be a radical class of abelian groups and $\gamma^{*} = \{\text{all rings } A \text{ with additive group in } \gamma\}$. Then the radical $\gamma^{*}$ satisfies the cardinality condition for subrings if and only if $\gamma$ satisfies the cardinality condition.

A kind of cardinality condition concerns the problem whether the direct product of radical rings (abelian groups) is again radical.

Theorem 6.4 ([26]). A radical class $\gamma$ of abelian groups is closed under countable direct products if and only if $\gamma$ is generated by torsion-free groups.

Theorem 6.5 ([23]). Let $\varrho = \{A_\lambda \mid \lambda \in \Lambda\}$ be a set of countable abelian groups $A_\lambda$, $|\Lambda| = \aleph_1$ and $\mathrm{Hom}(A_\lambda, A_\mu) = 0$ for $\lambda \neq \mu \in \Lambda$. If $\gamma$ is the radical generated by the set $\varrho$, then the direct product of $A_\lambda$, $\lambda \in \Lambda$, is not in $\gamma$.

There are many radical classes of rings which are closed under arbitrary direct products, for instance the Jacobson radical, the von Neumann regular radical and the radical semisimple classes (in fact, subvarieties) of Theorem 3.5. The radical class of nil rings is obviously not closed under countable direct products. Product-closed radical classes of abelian groups and rings (via Theorem 6.3) were investigated recently in [29].

It would be nice to develop appropriate methods and prove more cardinality condition results in the radical theory of rings.

References

[1] U. A. Aburawash, Semiprime involution rings with chain conditions, Contr. General Alg. 7, Hölder-Pichler-Tempsky, Wien & B. G. Teubner, Stuttgart, 1991, pp. 7-11.
[2] U. A. Aburawash, The structure of *-simple involution rings with minimal *-biideals, Beiträge Alg. und Geom., 33 (1992), 77-83.
[3] I. A. Amin and R. Wiegandt, Torsion and torsion-free classes of Acts, Contr. General Alg. 2, Hölder-Pichler-Tempsky, Wien & B. G. Teubner, Stuttgart, 1983, pp. 19-34.
[4] S. A. Amitsur, A general theory of radicals I, Amer. J. Math., 74 (1952), 774-786, II ibidem, 76 (1954), 100-125, and III ibidem, 76 (1954), 126-136.
[5] S. A. Amitsur, Radicals of polynomial rings, Canad. J. Math., 8 (1956), 355-356.
[6] T. Anderson, N. Divinsky and A. Suliński, Hereditary radicals in associative and alternative rings, Canad. J. Math., 17 (1965), 594-603.
[7] T. Anderson and R. Wiegandt, Weakly homomorphically closed semisimple classes, Acta Math. Acad. Sci. Hungar., 34 (1979), 329-336.
[8] V. A. Andrunakievich, Radicals of associative rings, I (Russian), Mat. Sb., 44 (1958), 179-212; English transl.: Amer. Math. Soc. Transl., (2) 52 (1966), 95-128.
[9] P. N. Anh, On Litoff's theorem, Studia Sci. Math. Hungar., 18 (1983), 153-157.
[10] A. V. Arhangel'skiĭ and R. Wiegandt, Connectednesses and disconnectednesses in topology, General Topology and Appl., 5 (1975), 9-33.
[11] E. P. Armendariz, Closure properties in radical theory, Pacific J. Math., 26 (1968), 1-7.
[12] K. I. Beidar, On rings with zero total, Beiträge Alg. und Geom., 38 (1997), 233-239.
[13] K. I. Beidar, Y. Fong and W. F. Ke, On complemented radicals, J. Algebra, 201 (1998), 328-356.
[14] K. I. Beidar, Y. Fong, W. F. Ke and K. P. Shum, On radicals with semisimple essential covers, Preprint, 1995.
[15] K. I. Beidar and R. Wiegandt, Rings with involution and chain conditions, J. Pure & Appl. Algebra, 87 (1993), 205-220.
[16] K. I. Beidar and R. Wiegandt, Radicals induced by the total of rings, Beiträge Alg. und Geom., 38 (1997), 149-159.
[17] G. M. Bergman, A ring primitive on the right but not on the left, Proc. Amer. Math. Soc., 15 (1964), 473-475.
[18] G. Betsch and K. Kaarli, Supernilpotent radicals and hereditariness of semisimple classes of near-rings, Coll. Math. Soc. J. Bolyai 38, Radical Theory, Eger 1982, North-Holland, 1985, pp. 47-58.

[19] G. F. Birkenmeier and R. Wiegandt, Essential covers and complements of radicals, Bull. Austral. Math. Soc., 53 (1996), 261-266.
[20] G. F. Birkenmeier and R. Wiegandt, Pseudocomplements in the lattice of torsion classes, Comm. in Algebra, 26 (1998), 197-220.
[21] B. Brown and N. H. McCoy, The maximal regular ideal of a ring, Proc. Amer. Math. Soc., 1 (1950), 165-171.
[22] C. G. Chehata and R. Wiegandt, Radical theory for fully ordered groups, Mathematica, Cluj, 20 (1978), 143-157.
[23] M. Dugas and R. Göbel, On radicals and products, Pacific J. Math., 118 (1985), 79-104.
[24] E. Fried and R. Wiegandt, Connectednesses and disconnectednesses of graphs, Alg. Universalis, 5 (1975), 411-428.
[25] E. Fried and R. Wiegandt, Abstract relational structures, I (General theory), Alg. Universalis, 15 (1982), 1-21, II (Torsion theory), ibidem, 15 (1982), 22-39.
[26] B. J. Gardner, Two notes on radicals of abelian groups, Comment. Math. Univ. Carolinae, 13 (1972), 419-430.
[27] B. J. Gardner, Some degeneracy and pathology in non-associative radical theory, Annales Univ. Sci. Budapest, 22-23 (1979/1980), 65-74.
[28] B. J. Gardner, Some cardinality conditions for ring radicals, Quaest. Math., 15 (1992), 27-37.
[29] B. J. Gardner, On product-closed radical classes of abelian groups, Bul. Acad. Ştiinţe Rep. Moldova, Matematica, 2(30) 1999, 33-52.
[30] B. J. Gardner and P. N. Stewart, On semi-simple radical classes, Bull. Austral. Math. Soc., 13 (1975), 349-353.
[31] B. J. Gardner and R. Wiegandt, Characterizing and constructing special radicals, Acta Math. Acad. Sci. Hungar., 40 (1982), 73-83.
[32] R. Göbel, Radicals in abelian groups, Coll. Math. Soc. J. Bolyai 61, Theory of Radicals, Szekszárd 1991, North-Holland, 1993, pp. 77-107.
[33] J. S. Golan, Torsion Theories, John Wiley & Sons, 1986.
[34] N. Jacobson, The radical and semisimplicity of arbitrary rings, Amer. J. Math., 67 (1945), 300-320.
[35] N. Jacobson, Structure of rings, Amer. Math. Soc. Coll. Publ., 37, Providence, 1968.
[36] K. Kaarli, Classification of irreducible R-groups over a semiprimary near-ring (Russian), Tartu Riikl. Ül. Toimetised, 556 (1981), 47-63.
[37] F. Kasch, Partiell invertierbare Homomorphismen und das Total, Algebra Berichte 60, Verlag Reinhard Fischer, München, 1988.
[38] G. Köthe, Die Struktur der Ringe, deren Restklassenring nach dem Radikal vollständig reduzibel ist, Math. Zeitschr., 32 (1930), 161-186.
[39] J. Krempa, Logical connections among some open problems in non-commutative rings, Fund. Math., 76 (1972), 121-130.
[40] A. G. Kurosh, Radicals of rings and algebras, Mat. Sb., 33 (1953), 13-26 (Russian), English translation: Coll. Math. Soc. J. Bolyai 6, Rings, Modules and Radicals, Keszthely 1971, North-Holland, 1973, pp. 297-312.
[41] A. G. Kurosh, Radicals in the theory of groups, Sibir. Mat. Zh., 3 (1962), 912-931, English transl.: Coll. Math. Soc. J. Bolyai 6, Rings, Modules and Radicals, Keszthely 1971, North-Holland, 1973, pp. 271-296.
[42] W. G. Leavitt and R. Wiegandt, Torsion theory for not necessarily associative rings, Rocky Mountain J. Math., 9 (1979), 259-271.
[43] L. C. A. van Leeuwen, C. Roos and R. Wiegandt, Characterizations of semisimple classes, J. Austral. Math. Soc., 23 (1977), 172-182.
[44] W. Lex and R. Wiegandt, Torsion theory for acts, Studia Sci. Math. Hungar., 16 (1981), 263-280.

[45] N. V. Loi, Essentially closed radical classes, J. Austral. Math. Soc., Ser. A, 35 (1983), 132-142.
[46] N. V. Loi and R. Wiegandt, Involution algebras and the Anderson-Divinsky-Suliński property, Acta Sci. Math. Szeged, 50 (1986), 5-14.
[47] N. V. Loi and R. Wiegandt, On involution rings with minimum condition, Ring Theory, Israel Math. Conf. Proc., 1 (1989), 206-214.
[48] L. Márki, R. Mlitz and R. Wiegandt, A general Kurosh-Amitsur radical theory, Comm. in Algebra, 16 (1988), 249-305.
[49] R. Mlitz, A. D. Sands and R. Wiegandt, Radicals coinciding with the von Neumann regular radical on artinian rings, Monatsh. für Math., 125 (1998), 229-239.
[50] E. R. Puczyłowski and A. Smoktunowicz, On minimal ideals and the Brown-McCoy radical of polynomial rings, Comm. in Algebra, 26 (1998), 2473-2482.
[51] G. Preuß, Eine Galois-Korrespondenz in der Topologie, Monatsh. für Math., 75 (1971), 447-452.
[52] Ju. M. Rjabuhin and R. Wiegandt, On special radicals, supernilpotent radicals and weakly homomorphically closed classes, J. Austral. Math. Soc., 31 (1981), 152-162.
[53] A. D. Sands, Strong upper radicals, Quart. J. Math. Oxford, 27 (1976), 21-24.
[54] E. Sasiada, Solution of the problem of existence of a simple radical ring, Bull. Acad. Polon. Sci., 9 (1961), 257.
[55] E. Sasiada and P. M. Cohn, An example of a simple radical ring, J. Algebra, 5 (1967), 373-377.
[56] A. Smoktunowicz, Polynomial rings over nil rings need not be nil, Preprint, 1998.
[57] A. Smoktunowicz, A simple nil ring exists, Preprint, 1999.
[58] P. N. Stewart, Semi-simple radical classes, Pacific J. Math., 32 (1970), 249-254.
[59] F. Szász, Über artinsche Ringe, Bull. Acad. Polon. Sci., 11 (1963), 351-354.
[60] S. Veldsman, Supernilpotent radicals of near-rings, Comm. in Algebra, 15 (1987), 2497-2509.
[61] S. Veldsman, The general radical theory of near-rings - answers to some open problems, Alg. Universalis, 36 (1996), 185-189.
[62] S. Veldsman, The general radical theory of incidence algebras, Comm. in Algebra, 27 (1999), 3659-3673.
[63] J. H. M. Wedderburn, On hypercomplex numbers, Proc. London Math. Soc., (2) 6 (1908), 77-118.
[64] H. J. Weinert and R. Wiegandt, A Kurosh-Amitsur radical theory for proper semifields, Comm. in Algebra, 20 (1992), 2419-2458.
[65] H. J. Weinert and R. Wiegandt, Complementary radical classes of proper semifields, Coll. Math. Soc. J. Bolyai 61, Theory of Radicals, Szekszárd 1991, North-Holland, 1993, pp. 297-310.
[66] H. J. Weinert and R. Wiegandt, On the structure of semifields and lattice-ordered groups, Periodica Math. Hungar., 32 (1996), 129-147.
[67] R. Wiegandt, Homomorphically closed semisimple classes, Studia Univ. Babes-Bolyai, Cluj, 17 (1972) no. 2, 17-20.
[68] R. Wiegandt, Rings distinctive in radical theory, Quaest. Math., 23&24 (1999), 447-472.

Author's address: A. Rényi Institute of Mathematics, Hungarian Academy of Sciences, P.O. Box 127, H-1364 Budapest, Hungary
e-mail: [email protected]

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 153-158)


On minimal subgroups of finite groups

By M. Asaad
Department of Mathematics, Faculty of Science, Cairo University, Giza, Egypt

Abstract. Many authors have investigated the structure of a finite group G under the assumption that all subgroups of G of prime order are well-situated in G. The aim of this talk is to introduce these investigations.

Introduction and results

Throughout, the groups are finite. A subgroup of G of prime order is called a minimal subgroup. Two subgroups H and K of a group G are said to permute if HK = KH.
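The permutability condition can be tested by brute force in a small group. The following sketch works in the symmetric group on three points, with permutations stored as tuples; the group, the particular subgroups and the function names are choices made here for illustration only. It also checks, for every pair of subgroups of this group, that HK = KH holds exactly when the set HK is a subgroup.

```python
from itertools import permutations, product


def compose(p, q):
    """Return the permutation 'first q, then p' on {0, 1, 2}."""
    return tuple(p[q[i]] for i in range(len(p)))


def product_set(H, K):
    """Return the set HK = {hk : h in H, k in K}."""
    return {compose(h, k) for h in H for k in K}


def is_subgroup(S):
    """A nonempty finite subset closed under products is a subgroup."""
    return bool(S) and all(compose(a, b) in S for a in S for b in S)


def permute(H, K):
    """Check the condition HK = KH."""
    return product_set(H, K) == product_set(K, H)


E = (0, 1, 2)
H = {E, (1, 0, 2)}                # generated by the transposition (0 1)
K = {E, (0, 2, 1)}                # generated by the transposition (1 2)
L = {E, (2, 1, 0)}                # generated by the transposition (0 2)
A3 = {E, (1, 2, 0), (2, 0, 1)}    # the subgroup of order 3
S3 = set(permutations(range(3)))
SUBGROUPS = [{E}, H, K, L, A3, S3]

print(permute(H, K), is_subgroup(product_set(H, K)))    # False False
print(permute(H, A3), is_subgroup(product_set(H, A3)))  # True True

equivalence_holds = all(
    permute(X, Y) == is_subgroup(product_set(X, Y))
    for X, Y in product(SUBGROUPS, repeat=2)
)
print(equivalence_holds)
```

Two distinct subgroups of order 2 give a product set with four elements, which cannot be a subgroup of a group of order 6, whereas the normal subgroup of order 3 permutes with every subgroup.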

It is easily seen that H and K permute iff the set HK is a subgroup of G. We say, following Kegel [1], that a subgroup of G is S-quasinormal in G if it permutes with every Sylow subgroup of G.

Many authors have investigated the structure of a finite group G under the assumption that all minimal subgroups of G are well-situated in G. The aim here is to introduce these investigations. We begin with the following result:

Theorem 1 (Ito). Let G be a group of odd order. If all minimal subgroups of G lie in the center of G, then G is nilpotent.

Proof. See [2, p. 283].

An extension of Ito's result is the following statement:


Theorem 2. (1) If, for an odd prime p, every subgroup of G of order p lies in the center of G, then G is p-nilpotent.
(2) If all elements of G of order 2 and 4 lie in the center of G, then G is 2-nilpotent.

Proof. See [2, p. 435].

A more interesting result of the same type is the following statement:

Theorem 3 (Buckley). If all minimal subgroups of an odd order group G are normal in G, then G is supersolvable.

Proof. See [3].

An extension of Buckley's result is the following statement:

Theorem 4 (Shaalan). If every subgroup of G of prime order or order 4 is S-quasinormal in G, then G is supersolvable.

Proof. See [4].

We use the following notation: $p$ is always a prime; $\mathfrak{S}_p$ is the class of all finite $p$-groups; for a subgroup $N$ of a group $G$,
$$\Psi_p(N) = \langle x \mid x \in N, |x| = p \rangle \text{ if } p \text{ is odd},$$
$$\Psi_2(N) = \langle x \mid x \in N, |x| = 2 \text{ or } 4 \rangle,$$
$$\Psi(N) = \langle x \mid x \in N, |x| \text{ is a prime or } |x| = 4 \rangle.$$

Let $\mathfrak{F}$ be a class of groups. We call $\mathfrak{F}$ a formation provided:
(1) $\mathfrak{F}$ contains all homomorphic images of groups in $\mathfrak{F}$, and
(2) if $G/M$ and $G/N$ are in $\mathfrak{F}$, then $G/(M \cap N)$ is in $\mathfrak{F}$ for normal subgroups $M$, $N$ of $G$.

A formation $\mathfrak{F}$ is said to be saturated if $G/\Phi(G) \in \mathfrak{F}$ implies that $G \in \mathfrak{F}$.
We assume throughout that $\mathfrak{F}$ is a formation, locally defined by the system $\{\mathfrak{F}(p)\}$ of full and integrated formations $\mathfrak{F}(p)$ (that is, $\mathfrak{S}_p\mathfrak{F}(p) = \mathfrak{F}(p) \subseteq \mathfrak{F}$ for all primes $p$). It is well known that for any saturated formation $\mathfrak{F}$, there is a unique integrated and full system which locally defines $\mathfrak{F}$.

A solvable normal subgroup $N$ of a group $G$ is an $\mathfrak{F}$-hypercentral subgroup of $G$ (see Huppert [5]) provided $N$ possesses a chain of subgroups $1 = N_0 < N_1 < \dots < N_r = N$ such that
(1) every factor $N_{i+1}/N_i$ is a chief factor of $G$, and
(2) if $N_{i+1}/N_i$ has order a power of the prime $p_i$, then $G/C_G(N_{i+1}/N_i)$ belongs to $\mathfrak{F}(p_i)$.

The product of all $\mathfrak{F}$-hypercentral subgroups of $G$ is again an $\mathfrak{F}$-hypercentral subgroup, denoted by $Z_{\mathfrak{F}}(G)$ and called the $\mathfrak{F}$-hypercenter of $G$.

An extension of the results of Ito and Buckley is the following statement:

Theorem 5 (Yokoyama). Let $G$ be a solvable group, and let $N \trianglelefteq G$ such that $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation containing the class of nilpotent groups. If $\Psi(N) \leq Z_{\mathfrak{F}}(G)$, then $G \in \mathfrak{F}$.

Proof. See [6,7].

In [8] Laue proved the following statement:

Theorem 6. Let $G$ be a solvable group. If every subgroup of prime order or order 4 of the Fitting subgroup $F(G)$ is normal in $G$, then $G$ is supersolvable.

In [9] Derr, Deskins and Mukherjee proved the following statement:

Theorem 7. Let $N \trianglelefteq G$ with $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation. If $\Psi_p(N) \leq Z_{\mathfrak{F}}(G)$, then $G/O_{p'}(N) \in \mathfrak{F}$.

Remark (1). From Theorem 7, it follows immediately that if $\Psi_p(G) \leq Z_{\mathfrak{F}}(G)$, then $G/O_{p'}(G) \in \mathfrak{F}$. For the case of a solvable group $G$, Theorem 7 can be deduced from theorems of Yokoyama [6,7] and Laue [8].

Remark (2). The $\mathfrak{F}$-hypercenter, as introduced before, is a solvable group because it is the product of all solvable $\mathfrak{F}$-hypercentral normal subgroups. This fact is essential in the proof presented in [9] because it depends heavily on the fact that in a solvable group $G$, $C_G(F(G)) \leq F(G)$.

In [10] Ballester-Bolinches and Pedraza-Aguilera extended Theorem 7. The $\mathfrak{F}$-hypercenter, as introduced in [10], is not necessarily solvable.

In [11] Asaad, Ballester-Bolinches and Pedraza-Aguilera extended the results of Shaalan [4]. They proved the following statement:

Theorem 8. Let $N \trianglelefteq G$ with $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation containing the class of supersolvable groups. If every subgroup of $N$ of prime order or order 4 is S-quasinormal in $G$, then $G \in \mathfrak{F}$.

Recently, Asaad and Csörgő [12] extended Theorems 3, 4, 6 and 8. They proved the following statement:

Theorem 9. Let $N \trianglelefteq G$ with $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation containing the class of supersolvable groups. If $N$ is solvable and every subgroup of prime order or order 4 of $F(N)$ is S-quasinormal in $G$, then $G \in \mathfrak{F}$.

To end this talk, the author would like to mention that the influence of minimal subgroups on the structure of finite groups remains open to further research.


References

[1] O.H. Kegel, Sylow-Gruppen und Subnormalteiler endlicher Gruppen. Math. Z. 78, 205-221 (1962).
[2] B. Huppert, Endliche Gruppen I. Berlin-Heidelberg-New York 1967.
[3] J. Buckley, Finite Groups whose Minimal Subgroups are Normal. Math. Z. 116, 15-17 (1970).
[4] A. Shaalan, The influence of π-quasinormality of some subgroups on the structure of a finite group. Acta Math. Hungar. 56, 287-293 (1990).
[5] B. Huppert, Zur Theorie der Formationen. Arch. Math. 19, 561-574 (1968).
[6] A. Yokoyama, Finite solvable groups whose 𝔉-hypercenter contains all minimal subgroups. Arch. Math. 26, 123-130 (1975).
[7] A. Yokoyama, Finite solvable groups whose 𝔉-hypercenter contains all minimal subgroups II. Arch. Math. 27, 572-575 (1976).
[8] R. Laue, Dualization of saturation for locally defined formations. J. Algebra 52, 347-353 (1978).
[9] J.B. Derr, W.E. Deskins and N.P. Mukherjee, The influence of minimal p-subgroups on the structure of finite groups. Arch. Math. 45, 1-4 (1985).
[10] A. Ballester-Bolinches and M.C. Pedraza-Aguilera, On minimal subgroups of finite groups. Acta Math. Hungar. 73, 335-342 (1996).
[11] M. Asaad, A. Ballester-Bolinches and M.C. Pedraza-Aguilera, A note on minimal subgroups of finite groups. Comm. Algebra 24(8), 2771-2776 (1996).
[12] M. Asaad and P. Csörgő, The influence of minimal subgroups on the structure of finite groups. Arch. Math. 72, 401-404 (1999).

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 159-167)


TOTALLY AND MUTUALLY PERMUTABLE PRODUCTS OF FINITE GROUPS

A. BALLESTER-BOLINCHES
Departament d'Algebra, Universitat de Valencia, Dr. Moliner 50, 46100 Burjassot, Valencia (Spain)

1 Introduction

This survey is about finite groups. Therefore, unless otherwise stated, all groups considered are finite. We start by quoting a series of classical theorems which are behind our results.

Soluble groups have a long history reaching back to Burnside. His famous $p^a q^b$ theorem, published in 1904 ([9]), states that a group is soluble if its order is divisible by at most two different primes. Based on this theorem, P. Hall during the period 1928-1937 characterized solubility by means of the existence of Sylow complements and Sylow systems (see [10]). In particular, we have:

A group is soluble if and only if it is the product of pairwise permutable Sylow subgroups.

Taking the above characterization into account, the following question arises:

Assume that a group G is the product of pairwise permutable nilpotent subgroups. Is G soluble?

In the early 50's many people worked on this question. It was promptly conjectured that the answer was affirmative. Wielandt, in 1958, proved that a dinilpotent group must indeed be soluble if the factors are of coprime orders ([15]), and Kegel then proved that a dinilpotent group is always soluble ([12]). Today these results are known as the Kegel-Wielandt theorem. The answer to the above question then follows by an induction argument.


Meanwhile, Huppert in 1953 ([11]) proved a particular case of the Kegel-Wielandt theorem which will be useful for our purposes:

A group which is the product of pairwise permutable cyclic subgroups is supersoluble.

The idea behind the above results is the following: Assume that $G = G_1G_2\cdots G_n$ is a group which is the product of the subgroups $G_1, G_2, \ldots, G_n$ such that $G_iG_j = G_jG_i$ for all $i, j$, where $n \ge 2$. What is the relationship between the structure of $G$ and that of the subgroups $G_i$?

As a special case of a product $G = G_1G_2\cdots G_n$ of pairwise permutable subgroups we have the one in which all the factors $G_i$ are normal in $G$ (normal products). In particular, when $G = G_1 \times G_2 \times \cdots \times G_n$ it is a direct product. The distance between general products and direct products is usually long. For instance, the normal product of supersoluble groups is not supersoluble in general, while the direct product of supersoluble groups is always supersoluble. This shows in particular that formations, even saturated ones, are in general not closed under normal products. However, it is known that they are always closed under direct products and even under central products.

To create intermediate situations, it seems reasonable to consider products in which the subgroups of the distinct factors are connected by certain permutability properties. This is probably what Asaad and Shaalan had in mind when in 1989 they introduced the concepts of totally and mutually permutable products, the central concepts of this paper.

Definition. [1]
a) A group $G$ is said to be the totally permutable product (t.p.p.) of the subgroups $H$ and $K$ if $G = HK$ and every subgroup of $H$ permutes with every subgroup of $K$.
b) A group $G$ is the mutually permutable product (m.p.p.) of $H$ and $K$ if $G = HK$ and $H$ permutes with every subgroup of $K$ and $K$ permutes with every subgroup of $H$.

Clearly, direct products and central products are t.p.p. Moreover, "totally" implies "mutually", but the converse is not true: the group $S_4$ is the mutually permutable product of a Sylow 2-subgroup and the alternating group $A_4$, but this product is not totally permutable. In [1], the following results are proved:
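The $S_4$ example can be checked by brute force. The following is only a minimal sketch, not part of the original survey: permutations of $\{0,1,2,3\}$ are stored as tuples, and the helper names (`compose`, `closure`, `all_subgroups`, `permute`) as well as the particular generators chosen for the Sylow 2-subgroup are illustrative assumptions.

```python
from itertools import permutations

N = 4
IDENTITY = tuple(range(N))

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(N))

def closure(gens):
    """Subgroup generated by gens (finite case: products of generators suffice)."""
    gens = list(gens)
    elems = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def all_subgroups(group):
    """All subgroups of a small group, found by adjoining one element at a time."""
    trivial = closure([])
    found = {trivial}
    stack = [trivial]
    while stack:
        s = stack.pop()
        for g in group:
            if g not in s:
                t = closure(list(s) + [g])
                if t not in found:
                    found.add(t)
                    stack.append(t)
    return found

def permute(a, b):
    """Two subgroups A and B permute when AB = BA as sets."""
    ab = {compose(x, y) for x in a for y in b}
    ba = {compose(y, x) for x in a for y in b}
    return ab == ba

def is_even(p):
    inversions = sum(1 for i in range(N) for j in range(i + 1, N) if p[i] > p[j])
    return inversions % 2 == 0

S4 = frozenset(permutations(range(N)))
A4 = frozenset(p for p in S4 if is_even(p))
# Assumed generators: the 4-cycle (0 1 2 3) and the transposition (0 2)
# generate a dihedral group of order 8, a Sylow 2-subgroup of S4.
D8 = closure([(1, 2, 3, 0), (2, 1, 0, 3)])

factorises = {compose(h, k) for h in D8 for k in A4} == S4
mutually = (all(permute(D8, y) for y in all_subgroups(A4))
            and all(permute(x, A4) for x in all_subgroups(D8)))
totally = all(permute(x, y)
              for x in all_subgroups(D8) for y in all_subgroups(A4))

print(factorises, mutually, totally)
```

In this sketch the product fails to be totally permutable because, for instance, the subgroup generated by the transposition (0 2) does not permute with the subgroup of order 3 generated by a 3-cycle moving the point 3.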


Theorem. [1; Theorems 3.1, 3.2 and 3.8] Let $G = HK$ be the product of two supersoluble subgroups $H$ and $K$.
a) If $G$ is the t.p.p. of $H$ and $K$, then $G$ is supersoluble.
b) If $G$ is the m.p.p. of $H$ and $K$ and either $K$ is nilpotent or $G'$, the commutator subgroup of $G$, is nilpotent, then $G$ is also supersoluble.

A natural question arising at this point is whether it is possible to extend the above results through the theory of formations. In this survey we present some affirmative answers to this question. We first introduce some definitions and results on formations. Recall that a formation is a class of groups $\mathcal{F}$ such that:

1. If $G \in \mathcal{F}$ and $N \trianglelefteq G$, then $G/N \in \mathcal{F}$.

2. If $N_1, N_2 \trianglelefteq G$ and $G/N_1, G/N_2 \in \mathcal{F}$, then $G/(N_1 \cap N_2) \in \mathcal{F}$.
2 Totally permutable products

The first relevant extension of part (a) of the theorem by Asaad and Shaalan is the following result due to Maier:

Theorem. [13; Theorem] Let $\mathcal{F}$ be a saturated formation containing the class $\mathcal{U}$ of supersoluble groups. If $G = HK$ is the t.p.p. of $H$ and $K$ and both $H$ and $K$ belong to $\mathcal{F}$, then $G$ also belongs to $\mathcal{F}$.


Notice that the condition $\mathcal{U} \subseteq \mathcal{F}$ is necessary, as the symmetric group of degree 3 and the formation of all nilpotent groups show. In the same paper, he proposes the following question: "Does the above result extend to non-saturated formations which contain all supersoluble groups?". This question was answered affirmatively by Ballester-Bolinches and Perez-Ramos in 1996.

Theorem A. [6; Theorem] Let $\mathcal{F}$ be a formation such that $\mathcal{U} \subseteq \mathcal{F}$. If $G = HK$ is the t.p.p. of $H$ and $K$, and both $H$ and $K$ belong to $\mathcal{F}$, then $G$ also belongs to $\mathcal{F}$.

In the following, we give a sketch of the proof of this theorem. In order to prove a theorem like this, one needs to obtain information about the structure of the group from the structure of the factors. In particular, we concentrate on how far apart the product is from a central product. We shall see that the basis of the results is quite often the supersolubility of the product of two cyclic groups. The following results turn out to be crucial for understanding the structure of totally permutable products.

Lemma 1. [13; Lemma 2] If $G = HK$ is the t.p.p. of $H$ and $K$, then
1. There exists a normal subgroup $N$ of $G$ such that $N$ is contained either in $H$ or in $K$.
2. $H \cap K$ is a nilpotent subnormal subgroup of $G$.

Assume that the normal subgroup $N$ is contained in $K$. Then $NH$ is a t.p.p. and $N$

Lemma 3. [6; Lemma 6 and Corollary 2] If $G$ is the t.p.p. of the subgroups $H$ and $K$, then $[H^{\mathcal{U}}, K] = [H, K^{\mathcal{U}}] = 1$. Moreover, $G^{\mathcal{U}} = H^{\mathcal{U}}K^{\mathcal{U}}$.

Recently J. Beidleman and H. Heineken in [7] have obtained a valuable improvement in the knowledge of the structure of these groups by proving that in this kind of product the nilpotent residual of each factor in fact centralizes the other factor. More generally, if $\mathcal{F}$ is a formation such that $\mathcal{U} \subseteq \mathcal{F}$, then $H^{\mathcal{F}}$ and $K^{\mathcal{F}}$ are normal subgroups of $G$. At this point, it is possible to give a proof of Maier's theorem.

Theorem. Let $\mathcal{F}$ be a saturated formation containing $\mathcal{U}$. If $G = HK$ is the t.p.p. of the $\mathcal{F}$-subgroups $H$ and $K$, then $G \in \mathcal{F}$.

Proof. Notice that $1 \le H^{\mathcal{U}} \le H^{\mathcal{U}}K^{\mathcal{U}} = G^{\mathcal{U}} \le G$ is a proper normal series of $G$, which can therefore be refined to a chief series of $G$. If $A/B$ is a chief factor such that $A \le H^{\mathcal{U}}$, then $K \le C_G(A/B)$. Hence $G/C_G(A/B) \cong H/C_H(A/B) \in f(p)$ for every prime $p$ dividing $|A/B|$. Moreover, if $H^{\mathcal{U}} \le B < A \le H^{\mathcal{U}}K^{\mathcal{U}}$, then $A/B$ is isomorphic to $(A \cap K^{\mathcal{U}})/(B \cap K^{\mathcal{U}})$, a chief factor of $K$ centralized by $H$. Since $K \in \mathcal{F}$, it follows that $G/C_G(A/B) \cong K/C_K((A \cap K^{\mathcal{U}})/(B \cap K^{\mathcal{U}})) \in f(p)$ for every $p$ dividing $|A/B|$. Finally, $G/H^{\mathcal{U}}K^{\mathcal{U}} = G/G^{\mathcal{U}} \in \mathcal{U} \subseteq \mathcal{F}$. Consequently, $G \in \mathcal{F}$.

Lemma 4. [6; Lemma 1] Let the group $G = NB$ be the product of $N$ and $B$, where $N$

Then $\langle K^C\rangle = \langle K^A\rangle \le KA$ because $K$ centralizes $H^{\mathcal{U}}$. Hence $\langle K^C\rangle \cap H^{\mathcal{U}} = 1$. Now $C/\langle K^C\rangle \cong [H^{\mathcal{U}}]A/(\langle K^C\rangle \cap [H^{\mathcal{U}}]A) \in \mathcal{F}$ (notice that $[H^{\mathcal{U}}]A \in \mathcal{F}$ by Lemma 4). Therefore $C/(H^{\mathcal{U}} \cap \langle K^C\rangle) \cong C \in \mathcal{F}$.

The second case is when neither $H$ nor $K$ is supersoluble. Then $H^{\mathcal{U}}, K^{\mathcal{U}} \ne 1$ and $H = H^{\mathcal{U}}A$ and $K = K^{\mathcal{U}}B$, where $A$ and $B$ are supersoluble projectors of $H$ and $K$ respectively. Moreover, $A$ is a proper subgroup of $H$ and $B$ is a proper subgroup of $K$. Denote $U = AB$. It is clear that $U$ is a supersoluble subgroup of $G$. There is a natural action of $U$ on $H^{\mathcal{U}} \times K^{\mathcal{U}}$. Let $X = [H^{\mathcal{U}} \times K^{\mathcal{U}}]U$ be the semidirect product with respect to this action. It is rather easy to see that there exists an epimorphism $\gamma : X \to [H^{\mathcal{U}}K^{\mathcal{U}}]U = Y$. Moreover, by Lemma 4, there exists an epimorphism from $Y$ to $G$. We see that $X$ is an $\mathcal{F}$-group: by arguments similar to those used above, it is not difficult to prove that $[H^{\mathcal{U}}]U$ and $[K^{\mathcal{U}}]U$ both belong to $\mathcal{F}$. This implies that $X/H^{\mathcal{U}}$ and $X/K^{\mathcal{U}}$ are $\mathcal{F}$-groups, and so $X$ is also an $\mathcal{F}$-group. This implies $G \in \mathcal{F}$, the final contradiction.

Let us now give some results on t.p.p. concerning the behaviour of distinguished subgroups associated to formations. A well-known theorem by Doerk and Hawkes (see [10]) states that for a formation $\mathcal{F}$ of soluble groups, the $\mathcal{F}$-residual respects the operation of forming direct products; that is, if $G = H \times K$, then $G^{\mathcal{F}} = H^{\mathcal{F}} \times K^{\mathcal{F}}$. We have:

Theorem B. [3; Theorem 4] Let $\mathcal{F}$ be a formation containing $\mathcal{U}$ such that $\mathcal{F}$ is either saturated or $\mathcal{F} \subseteq \mathcal{S}$. If $G = HK$ is the t.p.p. of $H$ and $K$, then $G^{\mathcal{F}} = H^{\mathcal{F}}K^{\mathcal{F}}$.

Notice that this theorem implies that the converse of Maier's theorem is true; that is, if $G$ belongs to $\mathcal{F}$ and $\mathcal{F}$ is saturated, then $H$ and $K$ both belong to $\mathcal{F}$.

On the other hand, it is quite clear that knowledge of the $\mathcal{F}$-projectors of a group usually reveals little about its $\mathcal{F}$-subgroup structure. In fact, there is no connection in general between the projectors of a group and those of a proper subgroup. Totally permutable products are exceptions to this general rule when $\mathcal{F}$ is a saturated formation containing $\mathcal{U}$, as the following result shows:

Theorem C. [2; Theorem B] Let $\mathcal{F}$ be a saturated formation such that $\mathcal{U} \subseteq \mathcal{F}$ and let $G = HK$ be the t.p.p. of $H$ and $K$. If $A$ is an $\mathcal{F}$-projector of $H$ and $B$ is an $\mathcal{F}$-projector of $K$, then $AB$ is an $\mathcal{F}$-projector of $G$.


Finally, Theorems A, B and C hold for totally permutable products of more than two subgroups:

Theorem. [3; Theorems 1, 4 and 5] Let $G = G_1G_2\cdots G_n$ be a product of pairwise totally permutable subgroups $G_i$. Then Theorems A, B and C hold.

3 Mutually permutable products

In order to obtain an extension of part (b) of the Asaad-Shaalan theorem, we now focus our attention on mutually permutable products (m.p.p.). As a preliminary remark, notice that the Asaad-Shaalan theorem does not hold in the case "$K$ nilpotent" and "$\mathcal{F}$ a formation containing $\mathcal{U}$". For instance, take the formation $\mathcal{F} = (G : G' \text{ is supersoluble})$. It is clear that $\mathcal{F}$ contains $\mathcal{U}$. Moreover, the symmetric group of degree 4, $G$, is the m.p.p. of the alternating group of degree 4, $H$ say, which belongs to the class $\mathcal{F}$, and a Sylow 2-subgroup $K$ of $G$. In this example, the derived group of $G$ is $H$, which does not belong to $\mathcal{U}$. Hence $G \notin \mathcal{F}$.

Even if the formation $\mathcal{F}$ is saturated, the result is not true. For instance, let us take the formation function

$$f(p) = \begin{cases} \mathcal{S}_2\mathcal{S}_3 & \text{if } p = 2, \\ \mathcal{S}_p\,\mathcal{U}(p-1) & \text{if } p \ne 2, \end{cases}$$

where $\mathcal{U}(p-1)$ denotes the class of abelian groups of exponent dividing $p-1$. Let $\mathcal{F} = LF(f)$. It is clear that $\mathcal{U} \subseteq \mathcal{F}$. Then $\Sigma_4$ is again the m.p.p. of $H = A_4$ and $K \in \mathrm{Syl}_2(\Sigma_4)$. It is not difficult to see that $H, K \in \mathcal{F}$, but $\Sigma_4$ does not belong to the class $\mathcal{F}$, because $\Sigma_4/O_{2',2}(\Sigma_4)$ is isomorphic to $\Sigma_3$, which does not belong to $\mathcal{S}_2\mathcal{S}_3$.

However, for $\mathcal{F} = \mathcal{U}$, the following improvement of the Asaad-Shaalan theorem was obtained.

Theorem. [4; Theorems 1 and 2] Let $G = HK$ be the product of $H$ and $K$. If either $K$ is supersoluble and $G' \in \mathcal{N}$, or $K \in \mathcal{N}$, then $G^{\mathcal{U}} = H^{\mathcal{U}}$.

Returning to formations, the case $G' \in \mathcal{N}$ is interesting because of the following result:

Lemma 5. [5; Lemma 1] Let $G = HK$ be the product of the subgroups $H$ and $K$ such that $G' \in \mathcal{N}$. Assume that $\mathcal{F}$ is a saturated formation with $G \in \mathcal{F}$. Then both $H$ and $K$ belong to $\mathcal{F}$.


In fact, for saturated formations containing $\mathcal{U}$, the following holds:

Theorem. [5; Theorem A] Let $G = HK$ be the m.p.p. of $H$ and $K$. Let $\mathcal{F}$ be a saturated formation containing $\mathcal{U}$. If $G' \in \mathcal{N}$, then $G^{\mathcal{F}} = H^{\mathcal{F}}K^{\mathcal{F}}$.

We do not offer the complete proof because it is quite long and at some points rather technical. One of the main ideas of the proof is to analyse the behaviour of the intersection $H \cap K$. In this context, the following results are extremely useful. Let $G = HK$ be a mutually permutable product. Then:

(i) [8; Proposition 3.5] If $X$ and $Y$ are subgroups of $G$ such that $H \cap K \le X \le H$ and $H \cap K \le Y \le K$, then $X$ and $Y$ are mutually permutable subgroups of $G$. In particular, if $H \cap K = 1$, the product $G$ is totally permutable.
(ii) [8; Proposition 3.5] $H \cap K$ is permutable in $H$ and $K$. Moreover, $H \cap K$ is a subnormal subgroup of $G$.
(iii) [14] If $Q$ is a permutable subgroup of a group $A$, then $Q^A/\mathrm{Core}_A(Q) \le Z_\infty(A/\mathrm{Core}_A(Q))$. As a consequence, if $D = \mathrm{Core}_H(H \cap K) \ne 1$, then $D^G = D^K \le K$. Hence $K$ contains a normal subgroup of $G$.
(iv) [5; Lemma 4] If $G'$ is nilpotent and $\mathcal{F}$ is a saturated formation containing $\mathcal{U}$, then $(H \cap K)^{\mathcal{F}} \le G^{\mathcal{F}}$.

One might wonder whether some $\mathcal{F}$-projector of a mutually permutable product with nilpotent commutator subgroup could be the product of $\mathcal{F}$-projectors of the factors. Unfortunately, this is not true in general, as the next example shows.

Let $G$ be the direct product of a cyclic group $\langle a\rangle$ of order 3 with the alternating group $A_4$ of degree 4. Let $V$ be the Klein 4-group of $A_4$. Then $G$ is the mutually permutable product of $H = \langle a\rangle \times V$ and $K = A_4$. Moreover, $G' = V$ is nilpotent. Notice that $H$ is the supersoluble projector of $H$ and a Sylow 3-subgroup $B$ of $A_4$ is a supersoluble projector of $K$. However, $HB = G$ is not supersoluble.
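The group-theoretic facts used in this example (the factorization $G = HK$, mutual permutability, $G' = V$ and $HB = G$) can be verified mechanically. The sketch below is only an illustration under stated assumptions: $G$ is realised as a permutation group on seven points, with $A_4$ acting on $\{0,1,2,3\}$ and $a$ acting as the 3-cycle $(4\ 5\ 6)$; all helper names are hypothetical. It does not test supersolubility or the projector property.

```python
N = 7
IDENTITY = tuple(range(N))

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(N))

def inverse(p):
    inv = [0] * N
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def closure(gens):
    """Subgroup generated by gens."""
    gens = list(gens)
    elems = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def all_subgroups(group):
    """All subgroups of a small group, found by adjoining one element at a time."""
    trivial = closure([])
    found = {trivial}
    stack = [trivial]
    while stack:
        s = stack.pop()
        for g in group:
            if g not in s:
                t = closure(list(s) + [g])
                if t not in found:
                    found.add(t)
                    stack.append(t)
    return found

def product_set(a, b):
    return {compose(x, y) for x in a for y in b}

def permute(a, b):
    return product_set(a, b) == product_set(b, a)

def derived_subgroup(group):
    """Subgroup generated by all commutators of the group."""
    comms = {compose(compose(inverse(x), inverse(y)), compose(x, y))
             for x in group for y in group}
    return closure(comms)

# Assumed concrete generators (not taken from the source):
c3 = (1, 2, 0, 3, 4, 5, 6)   # 3-cycle (0 1 2) in A4
v1 = (1, 0, 3, 2, 4, 5, 6)   # (0 1)(2 3)
v2 = (2, 3, 0, 1, 4, 5, 6)   # (0 2)(1 3)
a = (0, 1, 2, 3, 5, 6, 4)    # 3-cycle (4 5 6), commuting with A4

V = closure([v1, v2])        # Klein 4-group
K = closure([c3, v1])        # A4
H = closure([a, v1, v2])     # <a> x V
B = closure([c3])            # a Sylow 3-subgroup of A4
G = closure([a, c3, v1])     # <a> x A4

factorises = product_set(H, K) == G
mutually = (all(permute(H, y) for y in all_subgroups(K))
            and all(permute(x, K) for x in all_subgroups(H)))
derived_is_V = derived_subgroup(G) == V
HB_is_G = product_set(H, B) == G

print(len(G), factorises, mutually, derived_is_V, HB_is_G)
```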

References

[1] M. Asaad, A. Shaalan "On the supersolvability of finite groups" Arch. Math. 55 (1989), 318-326
[2] A. Ballester-Bolinches, M.C. Pedraza-Aguilera, M.D. Perez-Ramos "On finite products of totally permutable groups" Bull. Austral. Math. Soc. 53 (1996), 441-445
[3] A. Ballester-Bolinches, M.C. Pedraza-Aguilera, M.D. Perez-Ramos "Finite groups which are products of pairwise totally permutable subgroups" Proc. Edinburgh Math. Soc. 41 (1998), 567-572
[4] A. Ballester-Bolinches, M.C. Pedraza-Aguilera, M.D. Perez-Ramos "Mutually permutable products of finite groups" J. Algebra 213 (1999), 369-377
[5] A. Ballester-Bolinches, M.C. Pedraza-Aguilera "Mutually permutable products of finite groups II" J. Algebra 218 (1999), 563-572
[6] A. Ballester-Bolinches, M.D. Perez-Ramos "A question of R. Maier concerning formations" J. Algebra 182 (1996), 738-747
[7] J. Beidleman, H. Heineken "Totally permutable torsion groups" J. Group Theory 2 (1999), 377-392
[8] A. Carocca "p-supersolvability of factorized finite groups" Hokkaido Math. J. 21 (1992), 395-403
[9] W. Burnside "On groups of order $p^aq^b$" Proc. London Math. Soc. 2 (1904), 388-392
[10] K. Doerk, T.O. Hawkes "Finite Soluble Groups" De Gruyter Expositions in Mathematics, 4. Berlin, 1992
[11] B. Huppert "Über das Produkt von paarweise vertauschbaren zyklischen Gruppen" Math. Z. 58 (1953), 243-264
[12] O. Kegel "Produkte nilpotenter Gruppen" Arch. Math. (Basel) 12 (1961), 90-93
[13] R. Maier "A completeness property of certain formations" Bull. London Math. Soc. (1992), 540-544
[14] R. Maier, R. Schmid "The embedding of permutable subgroups in finite groups" Math. Z. 131 (1973), 269-272
[15] H. Wielandt "Über das Produkt von paarweise vertauschbaren nilpotenten Gruppen" Math. Z. 55 (1951), 1-7

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 169-172)


Asymptotic behaviour of solutions of evolution equations

by

Bolis Basit
Department of Mathematics, Monash University
Clayton, Victoria 3168, Australia
e-mail: bbasit@vaxc.cc.monash.edu.au

Abstract. In this talk we study the asymptotic behavior of solutions of abstract equations of the form

(*) $B\phi(t) = A\phi(t) + \psi(t)$ for $t \in J$,

where $A$ is the generator of a $C_0$-semigroup of operators on a Banach space $X$, $\psi : J \to X$ is continuous, $B\phi = \phi'$ and $J \in \{\mathbb{R}_+, \mathbb{R}\}$. However, our treatment applies to more general operators $A$, $B$ and to more general groups or semigroups $J$. The following are the main tools in our investigations.

1. Harmonic analysis: we introduced the notion of a spectrum $sp_{\mathcal{F}}(\phi)$ of a function $\phi$ with respect to a class $\mathcal{F}(J, X)$, and the resonance set $\theta_B^{-1}(\sigma(A))$, where $\theta_B$ is the characteristic function of $B$. It contains the Beurling spectra of all solutions of the homogeneous equation $B\phi = A \circ \phi$.
2. Ergodicity: the notion of ergodicity is developed to include unbounded functions. Using ergodicity we developed a criterion for a function to be an element of a class $\mathcal{F}$.
3. Extension of the known classes to mean classes: this enabled us to study not only uniformly continuous bounded solutions but also measurable bounded solutions.

These tools are used by the speaker for the study of the abstract Cauchy problem and were developed jointly with A. Pryde (Monash University, Australia) and Hans Günzler (Kiel University, Germany) to include more general equations. This talk is a description of some ideas in three papers [2], [4], [5]; however, it is largely taken from [5].

We study

$$B\phi(t) = A\phi(t) + \psi(t) \quad \text{for } t \in J, \tag{1.1}$$

where $A$ is the generator of a $C_0$-semigroup of operators on a Banach space $X$, $\psi : J \to X$ is continuous, $B\phi = \phi'$ and $J \in \{\mathbb{R}_+, \mathbb{R}\}$. However, the treatment in [5] applies to more general operators $A$, $B$ and more general groups $G$ or semigroups $J$. In particular $J = \mathbb{Z}_+, (\mathbb{R}_+)^n$.

General setting: We denote by $X$ a complex or real Banach space, by $G$ a locally compact abelian group and by $\hat{G}$ the dual of $G$. $w$ will stand for a function $w : G \to [1, \infty)$ which is continuous and satisfies $w(t + s) \le w(t)w(s)$ and $\sum_{n=1}^{\infty} \frac{1}{n^2}\log w(nt) < \infty$ (Beurling-Domar condition); $w$ is called a weight. See [1], [8].

Function spaces (see [5]): $BC_w(G,X) = w\,BC(G,X)$, $BUC_w(G,X) = w\,BUC(G,X)$, $C_{0,w}(G,X) = w\,C_0(G,X)$. We write $\phi_t(s) = \phi(t + s)$ and $\Delta_t\phi = \phi_t - \phi$. A function $p \in C(G,X)$ is a polynomial of order $m$ if $\Delta_t^{m+1} p = 0$ but $\Delta_t^m p \not\equiv 0$. See [1], [5].

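Harmonic analysis: Beurling spectrum (see [1, p. 34], [6-9]). If $\phi \in L^\infty_w(\mathbb{R}, X)$ and $I_w(\phi) = \{f \in L^1_w(\mathbb{R}) : \phi * f = 0\}$, then $I_w(\phi)$ is a closed ideal of $L^1_w(\mathbb{R})$.

Spectrum: $sp_w(\phi) = \{\gamma \in \hat{\mathbb{R}} : \hat{f}(\gamma) = 0 \text{ for all } f \in I_w(\phi)\}$.

Examples (see [1, Proposition 1.1, Proposition 2.6]). For $\phi = p \ne 0$ a polynomial, $sp_w(p) = \{e\}$, where $e$ is the identity of $\hat{G}$; for $\phi = \gamma \in \hat{G}$, $sp_w(\gamma) = \{\gamma\}$; and for $\phi = \gamma_1 p_1 + \cdots + \gamma_n p_n$ with $p_j \ne 0$, $j = 1, \ldots, n$, $sp_w(\phi) = \{\gamma_1, \ldots, \gamma_n\}$.

Spectra relative to a class $\mathcal{F} := \mathcal{F}(J, X)$: Let $\mathcal{F}(G, X)$ be a translation invariant closed subspace of $BUC_w(G, X)$ (see [5, Section 5]), $\phi \in BUC_w(\mathbb{R}, X)$ and $I_{\mathcal{F}}(\phi) = \{f \in L^1_w(\mathbb{R}) : \phi * f \in \mathcal{F}(G, X)\}$. Then $I_{\mathcal{F}}(\phi)$ is a closed ideal of $L^1_w(G)$.

Spectrum: $sp_{\mathcal{F}}(\phi) = \{\gamma \in \hat{G} : \hat{f}(\gamma) = 0 \text{ for all } f \in I_{\mathcal{F}}(\phi)\}$.

Ergodicity (see [4] and references therein). A function $\phi \in L^1_{loc}(J, X)$ is called ergodic if there is $m(\phi) \in X$ such that

$$\sup_{x \in J}\left\| \frac{1}{T}\int_0^T \phi(x + s)\,ds - m(\phi) \right\| \to 0 \quad \text{as } T \to \infty.$$

The limit $m(\phi)$ (clearly unique) is called the mean of $\phi$. A function $\phi \in L^1_{loc}(J, X)$ is called totally ergodic if $\gamma_\lambda\phi$ is ergodic for all characters $\gamma_\lambda(t) = e^{i\lambda t}$, $\lambda \in \mathbb{R}$.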

A function $\phi \in BC(J,X)$ is called asymptotically almost periodic (respectively Eberlein almost periodic) if $H(\phi) = \{\phi_s : s \in J\}$ is relatively compact (respectively weakly relatively compact) in $BC(J,X)$. The space of all ergodic (totally ergodic) functions from $L^1_{loc}(J, X)$ will be denoted by $\mathcal{E}(J,X)$ ($\mathcal{TE}(J,X)$). We set $\mathcal{E}_{ub}(J,X) := \mathcal{E}(J,X) \cap BUC(J,X)$ and $\mathcal{TE}_{ub}(J,X) := \mathcal{TE}(J,X) \cap BUC(J, X)$. $AP(J, X)$, $AAP(J, X)$, $EAP(J, X)$ will respectively stand for the spaces of almost periodic functions, asymptotically almost periodic functions and Eberlein almost periodic functions. Stepanoff $S^p$-almost periodic functions for all 1 ... $\in BUC(J,X) \cap \mathcal{TE}_{ub}(J,X)$ and $\mathcal{F}$ is a translation invariant closed subspace of $BUC(J, X)$, invariant under multiplication by characters and containing all constants, then $\phi \in \mathcal{F}$ if and only if $sp_{\mathcal{F}}(\phi)$ is countable.

Theorem 2 (see [2, Theorem 4.2] and references therein). Let $T(t)$ be a uniformly bounded $C_0$-semigroup of bounded operators. Let $\sigma(A) \cap i\mathbb{R}$ be countable. Let $u_x(t) = T(t)x$ for some $x \in X$.
(a) If range$(A - i\lambda I)$ + ker$(A - i\lambda I)$ is dense in $X$ for all $\lambda \in \mathbb{R}$, $u_x$ is asymptotically almost periodic.
(b) If ker$(A - i\lambda I)$ is dense in $X$ for all $\lambda \in \mathbb{R}$, $u_x$ is asymptotically stable.

Theorem 3. Under certain assumptions (see [5, Section 5])
(i) $I_w(\phi) \subseteq I_{\mathcal{F}}(\phi)$ and so $sp_{\mathcal{F}}(\phi) \subseteq sp_w(\phi)$

(ii) $\phi \in \mathcal{F}$ if and only if $sp_{\mathcal{F}}(\phi) = \emptyset$
(iii) $\Delta_t\phi \in \mathcal{F}$ for all $t \in G$ if and only if $sp_{\mathcal{F}}(\phi) \subseteq \{1\}$

Theorem 4 (see [5, Section 6]). Let $\phi$ be a solution of (1.1). Under certain assumptions

(&)v i, e f, sPr(<j>) c egl() C *pu)(V')Example (see [5, Section 6]). B<j> = A = f

+ 2if-4>,

A =[§:?].

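Then

$$B\gamma_s x = \theta_B(is)\gamma_s x = [(is)^2 + 2i(is) - 1]\gamma_s x = -(s + 1)^2\gamma_s x, \qquad \sigma(A) = \{-1, -4\}.$$

It follows that $\theta_B^{-1}(\sigma(A)) = \{\gamma_0, \gamma_1, \gamma_{-2}, \gamma_{-3}\}$. With $\gamma_s(t) = e^{ist}$, the general solution of the homogeneous equation $B\xi = A \circ \xi$ is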

I ci7o+C27i+c 3 7_ 2 +C47_3 J*

We notice that spw(£) = 9gl(ir(A)). Now let T = APW(J,X) (1 + 1*1)^. Let
or T = C 0 ,„,(J,X) and t»^(() =

So £,V £ C^, 0 (M,«7) n A P ^ R . O ) .

Whereas 0 0

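Motivated by the above example, we refer to $\theta_B^{-1}(\sigma(A))$ as the resonance set of (1.1) (see [5]). We say that $\phi \in BUC_w(J,X)$ is a resonance solution of (1.1) for the class $\mathcal{F}(J, X)$ if $\psi, \xi \in \mathcal{F}(J, X)$ for all solutions $\xi \in BUC_w(J,X)$ of the homogeneous equation $B\xi(t) = A\xi(t)$ but $\phi \notin \mathcal{F}(J,X)$.

Mean classes

Definition (see [4]). Let $\mathcal{A} \subseteq L^1_{loc}(J, X)$ and $0 \ne h \in \mathbb{R}_+$. For $\phi \in L^1_{loc}(J, X)$ set

$$M_h\phi(t) := \frac{1}{h}\int_0^h \phi(t + s)\,ds, \quad t \in J, \qquad \mathcal{MA} := \{\phi \in L^1_{loc}(J, X) : M_h\phi \in \mathcal{A} \text{ for all } h > 0\}.$$

It is easy to show that $S^p\text{-}AP(J,X) \subseteq \mathcal{M}AP(J,X)$.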

Theorem (see [4, Theorem 4.2]). Assume c0 (£ X. 0 e S£(R,X), 1 < p< oo. Then <j> 6 SP-AP{WL,X).

Let A ^ = 6 with 6 6 S P -AP(R,X),
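The resonance set in the example of [5, Section 6] above can be recomputed directly from the symbol $\theta_B(is) = (is)^2 + 2i(is) - 1$ and the spectrum $\sigma(A) = \{-1, -4\}$ quoted there. The following is a minimal sketch under those two inputs only; the function names are illustrative assumptions, and only real exponents $s$ (characters $\gamma_s(t) = e^{ist}$) are retained.

```python
import cmath

def symbol(s):
    """theta_B(is) for B(phi) = phi'' + 2i*phi' - phi acting on gamma_s(t) = exp(i*s*t)."""
    z = 1j * s
    return z**2 + 2j * z - 1

def resonance_exponents(spectrum, tol=1e-12):
    """Real s with theta_B(is) in the given spectrum.

    Uses theta_B(is) = -(s + 1)**2, so theta_B(is) = mu gives s = -1 +/- sqrt(-mu).
    """
    exponents = set()
    for mu in spectrum:
        root = cmath.sqrt(-mu)
        for s in (-1 + root, -1 - root):
            if abs(s.imag) < tol:          # keep only characters of the real line
                exponents.add(round(s.real, 12))
    return sorted(exponents)

sigma_A = [-1, -4]                          # spectrum quoted in the example
exponents = resonance_exponents(sigma_A)
print(exponents)
```

Each returned value of $s$ corresponds to one character $\gamma_s$ in the resonance set $\theta_B^{-1}(\sigma(A))$.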

REFERENCES

[1] B. Basit and A.J. Pryde, Polynomials and functions with finite spectra on locally compact abelian groups, Bull. Austral. Math. Soc. 51 (1995), 33-42.
[2] B. Basit, Harmonic analysis and asymptotic behavior of solutions to the abstract Cauchy problem, Semigroup Forum 54 (1997), 58-74.
[3] B. Basit and A.J. Pryde, Ergodicity and differences of functions on semigroups, J. Austral. Math. Soc. 149 (1998), 253-265.
[4] B. Basit and Hans Günzler, Asymptotic behavior of solutions of neutral equations, J. Differential Equations 149 (1998), 115-142.
[5] B. Basit and A. Pryde, Asymptotic behavior of unbounded solutions of evolution equations on topological groups (to be submitted).
[6] J. T. Benedetto, Spectral Synthesis, Academic Press, New York, 1975.
[7] Y. Katznelson, An Introduction to Harmonic Analysis, J. Wiley and Sons, New York, 1968.
[8] H. Reiter, Classical Harmonic Analysis and Locally Compact Groups, Oxford Math. Monographs, Oxford Univ., 1968.
[9] W. Rudin, Harmonic Analysis on Groups, Interscience Pub., New York, London, 1962.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 173-188)


On Nonlinear Evolution Equations with Applications

Lokenath Debnath
Department of Mathematics, University of Central Florida
Orlando, Florida 32816, U.S.A.
e-mail:

"... the progress of physics will to a large extent depend on the progress of nonlinear mathematics, of methods to solve nonlinear equations ... and therefore we can learn by comparing different nonlinear problems." WERNER HEISENBERG

"... as Sir Cyril Hinshelwood has observed ... fluid dynamics were divided into hydraulic engineers who observed things that could not be explained and mathematicians who explained things that could not be observed." JAMES LIGHTHILL

Abstract. This paper is concerned with the most general water wave equations in (3 + 1) dimensions. Special attention is given to the derivation of the (1 + 1)-dimensional forced Korteweg-de Vries (fKdV) equation and the (1 + 1)-dimensional forced nonlinear Schrödinger (fNLS) equation near resonant conditions. Using the multiple scale technique, the (1 + 1)-dimensional nonlinear Schrödinger (NLS) equation for the amplitude function $A$ of wave packets is derived. Section 6 deals with the fourth-order nonlinear Schrödinger equation for the amplitude of the wave potential, leading to a major improvement in agreement with the analysis of Longuet-Higgins (1978a,b). A stability analysis is made based on the simplified version of Dysthe's (1979) NLS equation. This is followed by Hogan's analysis of the fourth-order evolution equation for capillary-gravity waves in deep water. The final section deals with the Davey-Stewartson equations and the Kadomtsev and Petviashvili equation in water of finite depth.

1. Introduction

The mathematical theory and applications of nonlinear evolution equations have experienced a revolution over the last three decades of the twentieth century. During this revolution many fascinating and unexpected phenomena have been observed in physical, chemical and biological systems. Many new instability phenomena have been discovered. Considerable attention has been given to these instability and wave breaking phenomena. Other major achievements of twentieth century applied mathematics include the remarkable discovery of the soliton, soliton interactions and the Inverse Scattering Transform (IST) for finding the explicit exact solution of several canonical nonlinear partial differential equations.

Almost all nonlinear evolution equations after 1960 originated from the mathematical theory of water waves. Water waves are the most common observable phenomenon in nature. Invariably, water waves break on ocean beaches. A common question is: why do water waves break on beaches? Answers to this question led to the analytical study of nonlinear partial differential equations, topological study involving Riemann surfaces, algebraic study of Lie groups, and experimental and computational studies of nonlinear evolution equations. So, water waves possess an extremely rich mathematical structure. The study of water waves still remains an active subject of mathematical research.

This article deals with several nonlinear evolution equations described by one or more physical processes that include dispersion and nonlinearity. Special attention is given to the (1 + 1)-dimensional fKdV and fNLS equations and the fourth-order NLS equations in water with and without the effects of surface tension. Included are the Davey-Stewartson (DS) equations and the Kadomtsev and Petviashvili (KP) equation in water of finite depth.

2. Basic Equations of Water Waves

We consider an unsteady irrotational motion of an inviscid and incompressible water in a constant gravitational field $g$ which acts in the negative $z$ direction. The uneven bottom is described by $z = b(x, y)$. Including the effects of surface tension $T$, the basic equations and the free surface boundary conditions for the velocity potential $\phi = \phi(x, y, z, t)$ and the free surface elevation $\eta = \eta(x, y, t)$ are

$$\nabla^2\phi = \phi_{xx} + \phi_{yy} + \phi_{zz} = 0, \quad b(x, y) \le z \le h_0 + a\eta, \quad t > 0, \tag{2.1}$$

$$\eta_t + (\phi_x\eta_x + \phi_y\eta_y) - \phi_z = 0 \quad \text{on } z = h_0 + a\eta, \tag{2.2}$$

$$\phi_t + \tfrac{1}{2}(\nabla\phi)^2 + g\eta - T(R_1 + R_2) = 0 \quad \text{on } z = h_0 + a\eta, \tag{2.3}$$

$$\phi_z - (\phi_x b_x + \phi_y b_y) = 0 \quad \text{on } z = b(x, y), \tag{2.4}$$

where the curvatures $R_1$ and $R_2$ are

$$R_1 = \eta_{xx}(1 + \eta_x^2 + \eta_y^2)^{-3/2} \quad \text{and} \quad R_2 = \eta_{yy}(1 + \eta_x^2 + \eta_y^2)^{-3/2}. \tag{2.5ab}$$

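Here $h_0$ is the typical depth of water and $a$ is the typical amplitude of the surface wave. It is convenient to introduce a typical horizontal length scale $l$ (which may be the wavelength $\lambda$), a typical vertical length scale $h_0$ and a typical velocity scale $c = \sqrt{gh_0}$ (the shallow water wave speed). We also introduce two fundamental parameters $\varepsilon$ and $\delta$ as

$$\varepsilon = \frac{a}{h_0}, \qquad \delta = \frac{h_0^2}{l^2}, \tag{2.6ab}$$

where $\varepsilon$ is called the amplitude parameter and $\delta$ is called the long wavelength or shallowness parameter. Making reference to Debnath (1994), the basic water wave equations and free surface boundary conditions without the effects of surface tension ($T = 0$) in nondimensional form are

$$\delta(\phi_{xx} + \phi_{yy}) + \phi_{zz} = 0 \quad \text{in } b \le z \le 1 + \varepsilon\eta, \quad t > 0, \tag{2.7}$$

$$\phi_t + \frac{\varepsilon}{2}(\phi_x^2 + \phi_y^2) + \frac{\varepsilon}{2\delta}\phi_z^2 + \eta = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{2.8}$$

$$\phi_z - \delta[\eta_t + \varepsilon(\phi_x\eta_x + \phi_y\eta_y)] = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{2.9}$$

$$\phi_z - \delta(\phi_x b_x + \phi_y b_y) = 0 \quad \text{on } z = b(x, y). \tag{2.10}$$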

3. The Korteweg-de Vries (KdV) Equation Near Resonant Speed

According to the linearized theory of Stoker (1957), and Debnath and Rosenblat (1969), two-dimensional water waves generated by a steady surface pressure moving at a constant speed $U$ propagate in the far field only if the pressure field travels at a speed $U < c = \sqrt{gh_0}$; they appear behind the pressure field, with no disturbances ahead. On the other hand, if $U > c$, no disturbance at all exists far from the pressure field, but only transients are generated, and they decay in the far field as $t \to \infty$. However, at the resonant (or critical) speed $U = c$ (or Froude number $F = U/c = 1$) the linearized solution becomes unbounded as $t \to \infty$, so that the linearized theory breaks down due to finite-amplitude effects. The reason for the unbounded growth of the solution is that the energy transferred by the moving pressure field to the water cannot be radiated away from the disturbance because the group velocity of the induced waves tends to $U$. Thus, the neglected nonlinear terms apparently play a significant role in the evolution of the response in the finite-amplitude regime.

We formulate the nonlinear initial value problem in water of finite depth $h_0$, $0 \le z \le h_0$, under the action of a steady surface pressure $p(x)$ moving at a constant speed $U$. The nonlinear water wave problem is formulated in terms of the velocity potential $\Phi = Ux - \frac{1}{2}U^2t + \phi(x, z, t)$ and the free surface elevation $z = h_0 + \eta(x, t)$. In terms of nondimensional variables denoted by asterisks,

$$(x, z) = h_0(x^*, z^*), \quad t = \left(\frac{h_0}{g}\right)^{1/2}t^*, \quad \phi = a(gh_0)^{1/2}\phi^*, \quad \eta = a\eta^*, \quad p = (\rho g a)p^*,$$

the basic equations for water waves are, dropping the asterisks,

$$\phi_{xx} + \phi_{zz} = 0 \quad \text{in } 0 < z < 1 + \varepsilon\eta, \quad t > 0, \tag{3.1}$$

$$\eta_t + F\eta_x + \varepsilon\phi_x\eta_x - \phi_z = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{3.2}$$

$$\phi_t + \eta + F\phi_x + \tfrac{1}{2}\varepsilon(\phi_x^2 + \phi_z^2) = -p(x) \quad \text{on } z = 1 + \varepsilon\eta, \tag{3.3}$$

$$\phi_z = 0 \quad \text{on } z = 0, \tag{3.4}$$
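The three regimes of the linearized theory described at the start of this section depend only on the Froude number $F = U/c$ with $c = \sqrt{gh_0}$. As a small illustration (a sketch only: the function names, the default value of $g$ and the tolerance used to decide $F = 1$ are assumptions made here, not part of the source), the classification can be written as:

```python
import math

def froude_number(U, h0, g=9.81):
    """F = U / c, with c = sqrt(g * h0) the shallow water wave speed."""
    return U / math.sqrt(g * h0)

def linear_regime(U, h0, g=9.81, tol=1e-9):
    """Classify the far-field response predicted by the linearized theory."""
    F = froude_number(U, h0, g)
    if abs(F - 1.0) <= tol:
        return "resonant"        # U = c: linear solution grows without bound
    if F < 1.0:
        return "subcritical"     # U < c: waves behind the pressure field
    return "supercritical"       # U > c: only decaying transients

print(linear_regime(1.0, 1.0), linear_regime(5.0, 1.0))
```

Near the "resonant" branch the linearized theory is no longer adequate, which is where the forced KdV description of this section applies.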

where $F = U/c$ is the Froude number and $\varepsilon = a/h_0$ is a nonlinear parameter. It has already been found that the amplitude of the linearized solution grows like $\varepsilon t^{1/3}$, while the dispersive effects proportional to the square of the wavenumber decay like $t^{-2/3}$, so that a balance is obtained when $t = O(1/\varepsilon)$.

Thus, the originally neglected nonlinear terms eventually become

significant and the unbounded growth can be modified. Akylas (1984a) developed an asymptotic analysis that takes account of the finite-amplitude effects {e = O l e " ) and 7] = O (e5 I, new rescaled variables appropriate in the far field are defined by (j> = e-l${X,z,T),

T) = £-iA(X,T),

(3.5)

where $ and A are assumed to be 0(1). In terms of new variables $ and A, the basic equations (3.1)-(3.4) are transformed together with the off resonant condition F = l + 7£3 where 7 = O(l). Following Whitham's (1974) analysis of the nonlinear theory of the full water wave equations, we derive the evolution equation for the amplitude A(X, T) to the leading order in the form AT + 1AX - 2AAX - ^Axxx

= np(0)8'{X),

(3.6)

where $\hat{p}(k)$ is the Fourier transform of $p(x)$ and $\delta(x)$ is the Dirac delta function. Equation (3.6) is the forced Korteweg-de Vries (fKdV) equation. To solve (3.6), we can obtain appropriate initial conditions by matching asymptotically the finite-amplitude response to the solution determined by the linearized theory. For $\gamma \ne 0$, the solution of the linear nonhomogeneous KdV equation can be obtained by asymptotic approximation for $T \to \infty$. For $\gamma > 0$, the disturbance is found to decay as $T \to \infty$ for any fixed $X$. On the other hand, for $\gamma < 0$, the disturbance also decays when $X < 0$, and it represents a steady wave solution for $X > 0$ in the form $A(X,T)$

~ 6wp(0)(67)-5 sinh( v /6^X).

(3.7)

This result is in agreement with that of the linearized theory. Moreover, a numerical study of the forced KdV equation reveals that a series of solitons is generated in front of the pressure field. At larger distances from the pressure field, the waves are highly oscillatory, with a larger amplitude than that predicted by the linearized theory. These predictions are in excellent agreement with the experimental findings of Huang et al. (1982).

4. The Nonlinear Schrödinger (NLS) Equation Near Resonant Conditions

According to the linearized theory of Debnath and Rosenblat (1969) for water waves on a running stream of finite depth due to an oscillating surface pressure field, the solution becomes unbounded

177 at the boundary curve separating two kinds of possible steady states. Indeed, the solution for the free surface elevation r)(x, t) becomes singular on the boundary (critical) curve in the sense that wave amplitude grows like x or like \fl as t —> oo. This leads clearly to a resonance phenomenon. So, it is necessary to include nonlinear terms in the original formulation of the problem in order to achieve a physically bounded solution near resonant conditions. We next discuss (see Akylas (1984b) and Debnath (1994)) the nonlinear evolution equation associated with nonlinear water waves generated by a moving oscillatory surface pressure near resonant (or critical) conditions. We consider a two-dimensional harmonically oscillating pressure distribution of frequency w traveling at a uniform speed U on the free surface of water of constant depth h,—h
It is

convenient to use a frame of reference moving with the pressure field so that the pressure is at rest and a uniform stream U exists in the water. In terms of nondimensional (asterisks) variables and parameters (x ,z ,h ) = —^(x,z,h),

t =—t,

u = — u,

e = -,

water wave equations (2.1)-(2.4) with surface tension $T = 0$ are, dropping the asterisks,

$$\nabla^2\phi = 0, \quad -h < z < 0, \quad t > 0; \tag{4.1}$$

$$\eta_t + \eta_x + \varepsilon\phi_x\eta_x = \phi_z \quad \text{on } z = \varepsilon\eta; \tag{4.2}$$

$$\phi_t + \eta + \phi_x + \tfrac{1}{2}\varepsilon(\phi_x^2 + \phi_z^2) = -p(x)e^{i\omega t} + \text{c.c.} \quad \text{on } z = \varepsilon\eta, \tag{4.3}$$

$$\phi_z = 0 \quad \text{on } z = -h, \tag{4.4}$$

where c.c. stands for the complex conjugate. According to the linearized theory, the wave amplitude grows like $\varepsilon\sqrt{t}$ and the dispersive effects decay like $t^{-1/2}$ for large $t \gg 1$. Therefore, nonlinear effects are found to appear when the amplitude is $O(\sqrt{\varepsilon})$ at $t = O(\varepsilon^{-1})$. We now introduce the slow time and slow spatial variables by $T = \varepsilon t$ and $X = \sqrt{\varepsilon}\,x$, respectively. The finite amplitude effect is assumed to be in the form of a wave packet of $O(\sqrt{\varepsilon})$ amplitude, modulated by an envelope depending on $X$ and $T$. Accordingly, we assume the expansions for
4> = e~1/20 + £" 1 / V 1 + & + •••, 1,2

ri = £~ m+Vo

(4.5)

+ V2 + --- ,

(4.6)

where 0 = $o(-X, •*, T ) , ie

& = $(X, 2, T)e

r)0 = A0(X, T), + ex.,

<j>2 = 3>2{X, z,T)e

2ie

+ c.c,

ie

7ft = A{X, T)e

2ie

V2

= A2(X,)e

(4.7ab) + c.c, + c.c.,

(4.8ab) (4.9)

and the phase function $\theta = k_0 x + \omega t$. Through nonlinear effects, the primary harmonic produces the mean-flow components $\phi_0$, $\eta_0$ and the second harmonics $\phi_2$, $\eta_2$. On the other hand, the nonlinear interactions associated with the waves of wavenumbers $k_1$ and $k_2$, which are found behind the pressure disturbance, can be neglected because they occur at higher time scales. The main objective of the nonlinear analysis is to determine an evolution equation for the wave envelope $A(X, T)$. With the appropriate envelope scales and the perturbation expansions (4.5)-(4.6), we assume that the frequency of the oscillating pressure is off the exact critical value $\omega_0$ by $\varepsilon\gamma$, $\gamma = O(1)$, and then use an analysis similar to that of Benney and Roskes (1969) to derive the following evolution equation, B =

Aexp(-i-fT), iBT -

7

S + ( ^ - ) BXX

- ( ^ ) B*B2 = -7rap05(X),

(4.10)

where $p_0$ is the Fourier transform of $p(x)$, and $\alpha_1$, $\alpha_2$ and $a$ are given in Debnath's book (1994). Clearly, (4.10) is the forced nonlinear Schrödinger (fNLS) equation with a minor modification due to the assumed frequency detuning. In fact, the right-hand side of (4.10) represents the forcing effect of the applied pressure, which includes the Dirac delta function $\delta(X)$ as a factor. The evolution equation (4.10) can be solved subject to the initial condition determined by asymptotically matching the nonlinear response to the linearized far-field response. The resulting nonlinear initial-boundary value problem reveals the existence of a finite-amplitude steady-state solution close to the resonant conditions. However, the number and nature of the possible steady states depend on the sign of $\alpha_1\alpha_2$ (see Debnath (1994)) and on the value of the detuning parameter $\gamma$. In particular, for infinitely deep water ($h \to \infty$), $\omega_0 = k_0 = \tfrac{1}{4}$, $\alpha_1 = 2$ and $\alpha_2 = -\tfrac{1}{8}$, so that $\alpha_1\alpha_2 = -\tfrac{1}{4} < 0$, and it can be shown that only one steady-state solution describing a uniform wave is possible for all values of $\gamma$. This finding is in excellent agreement with the result of Dagan and Miloh (1982).

5. The Nonlinear Schrödinger Equation and Evolution of Wave Packets

We consider the evolution of wave packets for gravity waves on the surface of water of uniform, but finite depth ($\delta \neq 0$). We retain the shallowness parameter $\delta$ in equations (2.7)-(2.10) and consider $\varepsilon \to 0$ for $\sqrt{\delta}$ fixed, so that equations (2.7)-(2.10) reduce to

$$\phi_{zz} + \delta\phi_{xx} = 0 \quad\text{in } 0 < z < 1 + \varepsilon\eta, \quad t > 0, \tag{5.1}$$

$$\phi_t + \eta + \tfrac{1}{2}\varepsilon\,\phi_x^2 + \frac{\varepsilon}{2\delta}\,\phi_z^2 = 0 \quad\text{on } z = 1 + \varepsilon\eta, \tag{5.2}$$

$$\phi_z - \delta\left(\eta_t + \varepsilon\phi_x\eta_x\right) = 0 \quad\text{on } z = 1 + \varepsilon\eta, \tag{5.3}$$

$$\phi_z = 0 \quad\text{on } z = 0. \tag{5.4}$$

Consistent with the theory of modulated waves, we start from the Fourier integral representation of a wave,

$$\phi(x, t) = \int_{-\infty}^{\infty} F(k)\,\exp[i(kx - \omega t)]\,dk, \tag{5.5}$$

where $F(k)$ is given and $\omega = \omega(k)$ is also a given dispersion relation. We assume that the main contribution to the wave profile comes from the neighborhood of the carrier wavenumber $k = k_0$, so that $k - k_0 = \varepsilon\kappa$ and $\omega = \omega(k)$ can be expanded in a Taylor series about $k = k_0$ up to terms of order $\varepsilon^2$. It turns out that, as $\varepsilon \to 0$,

$$\phi(x, t) \sim A(\chi, \tau)\,\exp[i(k_0 x - \omega_0 t)], \tag{5.6}$$

where $A(\chi, \tau)$ is known, $\chi = \varepsilon(x - c_g t)$, $\omega_0 = \omega(k_0)$, $\tau = \varepsilon^2 t$ and $c_g = \omega'(k_0)$ is the group velocity. We recognize here that the relevant scales are associated with both $\varepsilon$ and $\varepsilon^2$, and hence we introduce

$$\xi = x - c_p t, \qquad X = \varepsilon(x - c_g t), \qquad \tau = \varepsilon^2 t, \tag{5.7}$$

where $c_p$ is the phase velocity of the wave. Following Johnson (1997), the asymptotic solution takes the form

$$\eta_0 = A_0(X, \tau)\,\exp(ik\xi) + \text{c.c.}, \tag{5.8}$$

$$\phi_0 = f_0(X, \tau) + F_0(X, z, \tau)\,\exp(ik\xi) + \text{c.c.}, \tag{5.9}$$

where c.c. stands for the complex conjugate of the terms in $\exp(ik\xi)$. The asymptotic analysis leads to an equation for $A_0(X, \tau)$ in the form

$$-2ikc_p A_{0\tau} + \alpha A_{0XX} + \beta A_0|A_0|^2 = 0, \tag{5.10}$$

where

$$\alpha = c_g^2 - (1 - \delta k\tanh\delta k)\,\mathrm{sech}^2\delta k, \tag{5.11}$$

$$\beta = \frac{k^2}{c_p^2}\left\{\tfrac{1}{2}\left(1 + 9\coth^2\delta k - 13\,\mathrm{sech}^2\delta k - 2\tanh^4\delta k\right) - \left(2c_p + c_g\,\mathrm{sech}^2\delta k\right)^2\left(1 - c_g^2\right)^{-1}\right\}. \tag{5.12}$$
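The coefficients (5.11)-(5.12) are easy to evaluate numerically. The following is a minimal sketch, not part of the original paper. It assumes the standard linear finite-depth relations $c_p^2 = \tanh(\delta k)/(\delta k)$ and $c_g = \tfrac{1}{2}c_p\,(1 + 2\delta k/\sinh 2\delta k)$ for the phase and group speeds in these nondimensional variables; the function names and the argument `dk` (standing for the product $\delta k$) are illustrative only.

```python
import math


def phase_speed(dk):
    """c_p, assuming c_p^2 = tanh(dk)/dk (linear dispersion relation)."""
    return math.sqrt(math.tanh(dk) / dk)


def group_speed(dk):
    """c_g, assuming c_g = (c_p/2) * (1 + 2*dk/sinh(2*dk))."""
    return 0.5 * phase_speed(dk) * (1.0 + 2.0 * dk / math.sinh(2.0 * dk))


def alpha(dk):
    """Dispersion coefficient of (5.11)."""
    sech2 = 1.0 / math.cosh(dk) ** 2
    return group_speed(dk) ** 2 - (1.0 - dk * math.tanh(dk)) * sech2


def beta(dk, k=1.0):
    """Nonlinear coefficient of (5.12)."""
    th = math.tanh(dk)
    sech2 = 1.0 / math.cosh(dk) ** 2
    cp, cg = phase_speed(dk), group_speed(dk)
    bracket = 0.5 * (1.0 + 9.0 / th ** 2 - 13.0 * sech2 - 2.0 * th ** 4)
    mean_flow = (2.0 * cp + cg * sech2) ** 2 / (1.0 - cg ** 2)
    return (k ** 2 / cp ** 2) * (bracket - mean_flow)


def beta_sign_change(lo=1.0, hi=2.0, tol=1e-10):
    """Bisection for the value of dk where beta changes sign (beta(lo) < 0 < beta(hi))."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for dk in (0.5, 1.0, 2.0):
        print(dk, alpha(dk), beta(dk))
    print("beta changes sign near dk =", beta_sign_change())
```

Under these assumptions the sketch gives a positive `alpha` at every sampled depth and a `beta` that is positive for the larger values of `dk` and negative for the smaller ones, with the bisection locating the crossover.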

Equation (5.10) is one of the standard forms of the nonlinear Schrödinger equation. It is easy to check that $\alpha > 0$ for all $\delta k$ and that $\beta$ changes sign from positive to negative as $\delta k$ decreases.

6. Higher-order Nonlinear Schrödinger Equations

It is well known from the theory and experiments of Benjamin and Feir (1967) that a finite-amplitude uniform train of surface gravity waves is unstable to modulational perturbations with

sufficiently long wavelengths. According to Yuen and Lake (1982), the envelope of a weakly nonlinear wave packet in deep water is governed by the nonlinear Schrödinger equation. This equation seems to provide an accurate description of the evolution of a wave packet of small wave steepness $\varepsilon = ak$ only for a limited time, at most $O(\varepsilon^{-2})$ wave periods. Feir (1967) confirmed this restriction on the validity of the NLS equation experimentally, and showed that an initially symmetric wave packet of uniform frequency and moderate wave steepness eventually loses its symmetry as it propagates away from the wave maker, and splits into two prominent groups of different frequencies. Later, Su (1982a,b) observed similar phenomena in his more comprehensive experiments with initially symmetric wave packets of uniform frequency and various durations. For short pulses his results extend those of Feir and reveal further information on group splitting and frequency downshift in the leading wave group.

Both Lake et al. (1977) and Melville (1982) performed instability experiments on a uniform wavetrain with typical values of the wave steepness $ka > 0.2$. The sideband disturbances, if they were of equal magnitude initially, were found to grow at equal rates only for a limited time. As nonlinear effects become more and more significant, the lower sideband grows faster and attains a greater maximum than the upper sideband, whereas the carrier wave drops to a minimum. Before these extrema were attained, local breaking was observed by Melville and probably occurred in the experiments of Lake et al. Lake et al. (1977) suggested that the unequal growth is possibly responsible for the downward shift of the spectral peak of wind waves with increasing fetch. Janssen (1981) investigated the long-time behavior of the Benjamin-Feir modulational instability. His study, based on the NLS equation, exhibits the Fermi-Pasta-Ulam (1955) recurrence phenomenon, and is in qualitative agreement with the experimental work of Lake et al. (1977) and the numerical computations of Yuen and Ferguson (1978a,b). Janssen's analysis also reveals that other effects, including dissipation, are likely to explain the observed frequency downshift in the experiments of Lake et al. (1977).

For small amplitudes, the NLS equation was derived by several authors (see Debnath, 1994) to describe the evolution of a wavetrain, and this equation seems to be correct to third order in the wave steepness. Subsequently, Longuet-Higgins' (1978a,b) work on the normal mode perturbation analysis of the fully nonlinear water wave problem in deep water confirmed that the NLS equation provides an adequate description for all but the smallest wave steepness. However, the preceding NLS equation is found to compare rather unfavorably with the exact results of Longuet-Higgins for $\varepsilon > 0.10$. Dysthe (1979) later showed that a significant improvement can be made by extending the perturbation analysis to $O(\varepsilon^4)$. He derived the evolution equation to fourth order in the amplitude of the wave potential, leading to a major improvement in agreement with Longuet-Higgins' analysis. In nondimensional form, the complex amplitude $A$ of the first harmonic of the Stokes waves, the average value $\bar\phi$ of the velocity potential over one wave period and the average surface elevation $\bar\zeta$ satisfy Dysthe's equations

2i (At + -Ax\

+ -Ayy = kAxxx

- -Axx

- A\A\2

2 -J-K.-.-X - - - X*), - -2 i \A\ AX + A(x - uj>z), - 6Axyy) + ^A{AA% - -A*A

0 t + 2C = O,

(6.1) (6.2)

2

^ z - 2(t = (|^| )x.

(6.3)

The right hand side of equation (6.1) is made up of contributions of $O(\varepsilon^4)$. Two linear terms on the right hand side of (6.1) are simply corrections to the dispersive effects of the NLS equation. In the absence of the fourth-order effects, the whole right hand side of (6.1) becomes zero, so that (6.1) reduces to the ordinary NLS equation derived by many authors in the 1960s and 1970s. The fourth-order NLS equation was first derived by Dysthe (1979) to eliminate the weakness of the ordinary NLS equation. His fourth-order NLS equation gives a significant improvement in the results relating to the stability of finite amplitude waves. The dominant new effect introduced by adding terms of order $\varepsilon^4$ to the ordinary NLS equation is the mean flow response to nonuniformities of the radiation stress caused by modulation of a finite amplitude wave. Moreover, the horizontal component of the mean flow along the direction of wave propagation causes a slowly varying Doppler shift of the wave, as represented by the last term in (6.1). The Doppler shift seems to have a detuning effect on the modulational instability. Among the new terms in (6.1), only the term $A\bar\phi_x$ makes a significant contribution to the stability of a finite amplitude wave. Thus, for stability analysis, it is sufficient to use the following simplified Dysthe equations:

$$2i\left(A_t + \tfrac{1}{2}A_x\right) + \tfrac{1}{2}A_{yy} - \tfrac{1}{4}A_{xx} = A\left(|A|^2 + \bar\phi_x\right), \qquad z = 0, \tag{6.4}$$

$$\bar\phi_z = \tfrac{1}{2}\left(|A|^2\right)_x \quad\text{on } z = 0, \tag{6.5}$$

$$\nabla^2\bar\phi = 0 \quad\text{for } z < 0. \tag{6.6}$$
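As a rough check on the stability statements in this section, the sideband growth rate can be worked out for the one-dimensional NLS limit of the simplified equation, $2i(A_t + \tfrac{1}{2}A_x) - \tfrac{1}{4}A_{xx} = A|A|^2$. The sketch below is not part of the original paper: it assumes that the mean-flow term and the $y$-dependence are dropped, and it perturbs the uniform wavetrain $A = a_0\exp(-\tfrac{1}{2}i a_0^2 t)$ by a sideband proportional to $\exp(i\lambda x + \sigma t)$ in the frame moving with the group velocity. Linearizing gives $\sigma^2 = \tfrac{1}{8}\lambda^2\,(a_0^2 - \tfrac{1}{8}\lambda^2)$. The function names are illustrative.

```python
import math


def sideband_growth_rate(a0, lam):
    """Growth rate sigma of a sideband exp(i*lam*x + sigma*t) on a uniform
    wavetrain of amplitude a0, from sigma^2 = (lam^2/8) * (a0^2 - lam^2/8).
    Returns 0.0 when the sideband is neutrally stable."""
    sigma_sq = (lam ** 2 / 8.0) * (a0 ** 2 - lam ** 2 / 8.0)
    return math.sqrt(sigma_sq) if sigma_sq > 0.0 else 0.0


def cutoff_wavenumber(a0):
    """Sidebands with |lam| below this value grow; shorter modulations do not."""
    return 2.0 * math.sqrt(2.0) * a0


def most_unstable_wavenumber(a0):
    """Modulation wavenumber maximizing sigma; the maximum is a0^2 / 2."""
    return 2.0 * a0


if __name__ == "__main__":
    a0 = 0.1
    for lam in (0.05, 0.1, 0.2, 0.25, 0.3):
        print(lam, sideband_growth_rate(a0, lam))
```

In this reduced model only modulations with $\lambda < 2\sqrt{2}\,a_0$ grow, which is the sense in which the wavetrain is unstable to sufficiently long modulations. The mean-flow term of (6.4), omitted here, modifies these values.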

In deep water, so that $kh = O\left((ka)^{-1}\right) \gg 1$, Dysthe's equations for $A$ and $\bar\phi$ in dimensional form are

: At +

i

TkA*)-w^-\«k2w2A i w iuik t 3zoifc 2 Axxx + -rA2Ax 2 A - 1__ — - \ A \ A X + kA[^>x]z=0 = 0, 16F 1I, + T 2 V2 = 0

(6.7)

-/i
(6.8)

z = ^{\A\ )x

on

z = 0,

(6.9)

z = 0

on

z = -h

(6.10)

2

where $\bar\phi$ is the potential of the induced mean current. The first four terms in (6.7) constitute the NLS equation in the fixed frame of reference. All terms on the right hand side of (6.7) are $O(\varepsilon)$.

Lo and Mei (1985) made a numerical study of water-wave modulation based on Dysthe's fourth-order NLS system (6.7)-(6.10). Their analysis shows reasonable agreement with recent experimental results, so the Dysthe equation represents a useful model for predicting the long-time evolution of narrow-banded, weakly nonlinear waves.

Several authors, including Janssen (1983), Stiassnie (1984) and Hogan (1985), derived the fourth-order NLS equation for gravity waves and for capillary-gravity waves. Stiassnie showed that the Dysthe fourth-order NLS equation is a special case of the more general Zakharov equation, which is free from the narrow spectral width assumption. Hogan (1985) extended Stiassnie's analysis to deep water capillary-gravity wave packets and, starting from the Zakharov integral equation, derived the corresponding fourth-order NLS equation. In addition to the leading-order dispersive and nonlinear terms of the NLS equation, Hogan's fourth-order NLS equation features certain nonlinear modulation terms and a nonlocal term that describes the coupling of the envelope with the induced mean flow. Hogan's analysis also reveals that there is a band of stable capillary-gravity waves at fourth order, just as such a band is known to exist at third order. The effects of the mean flow for pure capillary waves are, in general, of opposite sign to those for pure gravity waves. Hogan showed that the second-order corrections to the first-order stability properties depend on the interaction between the mean flow and the envelope frequency-dispersion term in the fourth-order NLS equation. His derivation is based on the Zakharov integral equation under the assumption of a narrow band of waves, and includes the interactions of capillary-gravity waves. His fourth-order evolution equation for the complex wave amplitude $a(x, t)$ for capillary-gravity waves in deep water is given in the form

$$2i\left(a_t + c_g a_x\right) + p\,a_{xx} + q\,a_{yy} - \gamma|a|^2 a = -i\left(r\,a_{xxx} + s\,a_{xyy}\right) + i\left(v|a|^2 a_x - u\,a^2 a^*_x\right) + a\bar\phi_x, \tag{6.11}$$

where the group velocity $c_g$ is

$$c_g = \left(\frac{\omega_0}{2k_0}\right)\left(\frac{1 + 3\kappa}{1 + \kappa}\right), \qquad \kappa = \frac{S k_0^2}{g},$$

and the other coefficients in (6.11) are given by

$$p = \frac{3\kappa^2 + 6\kappa - 1}{4(1 + \kappa)^2}, \qquad q = \frac{1}{2}\left(\frac{1 + 3\kappa}{1 + \kappa}\right), \qquad \gamma = \frac{8 + \kappa + 2\kappa^2}{8(1 - 2\kappa)(1 + \kappa)},$$

$$r = -\frac{(1 - \kappa)(1 + 6\kappa + \kappa^2)}{8(1 + \kappa)^3}, \qquad s = \frac{3 + 2\kappa + 3\kappa^2}{4(1 + \kappa)^2},$$

$$u = \frac{(1 - \kappa)(8 + \kappa + 2\kappa^2)}{16(1 - 2\kappa)(1 + \kappa)^2}, \qquad v = \frac{3\left(4\kappa^4 + 4\kappa^3 - 9\kappa^2 + \kappa - 8\right)}{8(1 + \kappa)^2(1 - 2\kappa)^2}.$$

For pure capillary waves, equation (6.11) reduces to the form 2i(at

+ -ax J + -axx + -ayy + -\a\2a - --(axxx

+ daxyy) + -(3\a\2ax - a2a*x) + a<j>x. (6.12)

Neglecting the right hand side of (6.12) leads to the third-order envelope equation for capillary-gravity waves in deep water. Hogan showed that the second-order corrections to the first-order stability properties depend essentially on the interaction between the mean flow and the envelope frequency-dispersion term in the governing equation. From a physical point of view, two kinds of interaction influence the stability of a nonlinear wavetrain. First, it was shown by Lighthill (1965) that the relative signs of the envelope frequency-dispersion term $p\,a_{xx}$ and the nonlinear term $(-\gamma|a|^2 a)$ govern the stability of the solution. There exists a band of stable waves where the product $p\gamma$ is positive. Second, the corrections to the stability characteristics that occur at higher order arise from an interaction between the mean flow term $(a\bar\phi_x)$ and the frequency-dispersion term $(p\,a_{xx})$.

It is well known that the frequency-dispersion term for gravity waves has the opposite sign to that for capillary waves. For gravity waves the detuned nonlinear term $A(|A|^2 + \bar\phi_x)$ balances the dispersion term $-\tfrac{1}{4}A_{xx}$, as shown by Dysthe (1979). As a result, a resonant quartet is maintained and energy is transferred to growing sidebands. On the other hand, for capillary waves, the nonlinear term is of the form $a\left(\tfrac{1}{8}|a|^2 - \bar\phi_x\right)$ and the dispersion term is $-\tfrac{3}{4}a_{xx}$, so that the detuning arising from the mean flow has the opposite sign. Hogan's theoretical predictions are also in good agreement with recent computational results for sufficiently small values of the wave steepness $\varepsilon = ak$.

Based on the fourth-order NLS equation due to Hogan (1985), Akylas et al. (1998) recently derived an asymptotic expression for small-amplitude capillary-gravity solitary waves in deep water that exhibits algebraic decay, like $x^{-2}$, of the tails of these waves. This decay at infinity is solely due to the induced mean flow, which is not accounted for in the NLS equation. Their numerical computations based on the full nonlinear water wave equations also confirm the algebraic decay of the solitary-wave tails in deep water.

7. The Davey-Stewartson (DS) Equations in Water of Finite Depth

The ordinary NLS equation describes the situation where the wave propagates in only one direction and the wave profile evolves only in that same direction. Such a wave is generated by an initial profile (at $t = 0$) of the form $A(\varepsilon x)\exp(ikx) + \text{c.c.}$ We consider Davey and Stewartson's (1974) analysis of the development of a wave generated by an initial profile (at $t = 0$) of the form $A(\varepsilon x, \varepsilon y)\exp(ikx) + \text{c.c.}$

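According to Davey and Stewartson, the slow (or weak) dependence occurs in both the $x$- and $y$-directions, but the rapid oscillation is only in the $x$-direction, and the wave packet propagates in the $x$-direction with a slowly varying structure in both the $x$- and $y$-directions. The group velocity is assumed to be still associated with propagation in the $x$-direction. The governing equations are, in nondimensional form,

$$\delta\left(\phi_{xx} + \phi_{yy}\right) + \phi_{zz} = 0, \qquad 0 < z < 1 + \varepsilon\eta, \quad t > 0, \tag{7.1}$$

$$\eta_t + \varepsilon\left(\phi_x\eta_x + \phi_y\eta_y\right) - \delta^{-1}\phi_z = 0 \quad\text{on } z = 1 + \varepsilon\eta, \tag{7.2}$$

$$\phi_t + \eta + \tfrac{1}{2}\varepsilon\left(\phi_x^2 + \phi_y^2\right) + \frac{\varepsilon}{2\delta}\,\phi_z^2 = 0 \quad\text{on } z = 1 + \varepsilon\eta, \tag{7.3}$$

$$\phi_z = 0 \quad\text{on } z = 0. \tag{7.4}$$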

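It is convenient to introduce the new variables

$$\xi = x - c_p t, \qquad X = \varepsilon(x - c_g t), \qquad Y = \varepsilon y, \qquad \tau = \varepsilon^2 t, \tag{7.5}$$

so that equations (7.1)-(7.4) can readily be transformed into the system

$$\phi_{zz} + \delta\left[\phi_{\xi\xi} + 2\varepsilon\phi_{\xi X} + \varepsilon^2\left(\phi_{XX} + \phi_{YY}\right)\right] = 0, \tag{7.6}$$

$$\phi_z = \delta\left[\left(\varepsilon^2\eta_\tau - \varepsilon c_g\eta_X - c_p\eta_\xi\right) + \varepsilon\left(\phi_\xi + \varepsilon\phi_X\right)\left(\eta_\xi + \varepsilon\eta_X\right) + \varepsilon^3\phi_Y\eta_Y\right] \quad\text{on } z = 1 + \varepsilon\eta, \tag{7.7}$$

$$\varepsilon^2\phi_\tau - \varepsilon c_g\phi_X - c_p\phi_\xi + \eta + \tfrac{1}{2}\varepsilon\left(\phi_\xi + \varepsilon\phi_X\right)^2 + \tfrac{1}{2}\varepsilon^3\phi_Y^2 + \frac{\varepsilon}{2\delta}\,\phi_z^2 = 0 \quad\text{on } z = 1 + \varepsilon\eta, \tag{7.8}$$

$$\phi_z = 0 \quad\text{on } z = 0. \tag{7.9}$$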

When terms of $O(\varepsilon^2)$ are retained, the only contribution from the dependence on $Y$ comes from the term $\phi_{YY}$ in the Laplace equation (7.6). The other terms involving derivatives in $Y$ produce new nonlinear interactions that first arise at $O(\varepsilon^3)$. We seek a solution of the form

$$\phi = f_0(X, Y, \tau) + \sum_{n=0}^{\infty}\varepsilon^n\left\{\sum_{m=0}^{n+1} F_{nm}(X, Y, z, \tau)\,E^m + \text{c.c.}\right\}, \tag{7.10}$$

$$\eta = \sum_{n=0}^{\infty}\varepsilon^n\left\{\sum_{m=0}^{n+1} A_{nm}(X, Y, \tau)\,E^m + \text{c.c.}\right\}, \tag{7.11}$$

where $E = \exp(ik\xi)$ and $A_{00} = 0$, so that the first approximation to the surface gravity wave is purely harmonic. To order $O(\varepsilon^2)$, the problem at $\varepsilon^2 E^0$ gives the equation for $f_0$:

$$\left(1 - c_g^2\right)f_{0XX} + f_{0YY} = -c_p^{-2}\left(2c_p + c_g\,\mathrm{sech}^2\delta k\right)\left(|A_0|^2\right)_X. \tag{7.12}$$

Given $A_0 = A_{01}$, the surface boundary conditions for the $\varepsilon^2 E^1$ terms lead to

$$-2ikc_p A_{0\tau} + \alpha A_{0XX} - c_p c_g A_{0YY} + \tfrac{1}{2}k^2 c_p^{-2}\left(1 + 9\coth^2\delta k - 13\,\mathrm{sech}^2\delta k - 2\tanh^4\delta k\right)A_0|A_0|^2 + k^2\left(2c_p + c_g\,\mathrm{sech}^2\delta k\right)A_0 f_{0X} = 0. \tag{7.13}$$

The coupled equations (7.12)-(7.13) are known as the Davey-Stewartson (DS) equations for the modulation of harmonic waves. In the absence of $Y$-dependence, equation (7.12) takes the form

$$\left(1 - c_g^2\right)f_{0X} = -c_p^{-2}\left(2c_p + c_g\,\mathrm{sech}^2\delta k\right)|A_0|^2. \tag{7.14}$$

This equation provides the leading contribution to the mean drift generated by the nonlinear interaction of the wave motion, usually called the Stokes drift. Similarly, equation (7.13) reduces to the ordinary nonlinear Schrödinger equation

$$-2ikc_p A_{0\tau} + \alpha A_{0XX} + \beta A_0|A_0|^2 = 0, \tag{7.15}$$

which is identical with (5.10). Finally, the Davey-Stewartson equations can also be written in the compact form

$$\left(1 - c_g^2\right)f_{0XX} + f_{0YY} = -\frac{\gamma}{c_p^2}\left(|A_0|^2\right)_X, \tag{7.16}$$

$$-2ikc_p A_{0\tau} + \alpha A_{0XX} - c_p c_g A_{0YY} + \left\{\beta + \frac{\gamma^2 k^2}{c_p^2\left(1 - c_g^2\right)}\right\}A_0|A_0|^2 + k^2\gamma A_0 f_{0X} = 0, \tag{7.17}$$

where $\alpha$ and $\beta$ are given by (5.11)-(5.12) and

$$\gamma = 2c_p + c_g\,\mathrm{sech}^2\delta k, \tag{7.18}$$

where $\gamma > 0$ and $c_p c_g > 0$. These evolution equations can be further approximated for long waves ($\delta \to 0$) and short waves ($\delta \to \infty$), respectively. However, the validity of these approximations is open to criticism because other terms become important. The long wave approximation ($\delta \to 0$) leads to the one-dimensional propagation of long waves, so that the relevant equation is the KdV equation. Thus the KdV and NLS equations are two fundamental equations for weakly nonlinear waves. The former represents long waves and can be derived in the limit as $\delta \to 0$ and $\varepsilon \to 0$ with $\delta = O(\varepsilon)$. However, the use of a suitably rescaled variable allows us to obtain the KdV equation for arbitrary $\sqrt{\delta}$. On the other hand, the NLS equation requires scaled variables that are defined with respect to $\varepsilon$ only, with $\delta = O(1)$ retained as a parameter. At least for a class of nonlinear waves, there are two representations:

(i) $\eta(x, t; \varepsilon, \delta)$ with $\varepsilon \to 0$ and $\delta \to 0$ gives the KdV equation,
(ii) $\eta(x, t; \varepsilon, \delta)$ with $\varepsilon \to 0$ for fixed $\delta$ leads to the NLS equation.

The same match also occurs between the Davey-Stewartson equations (7.12) and (7.13) and the two-dimensional KdV equation (see Freeman and Davey (1975)). In fact, the long wave limit ($\delta \to 0$) of the DS equations matches the short wave limit ($\delta \to \infty$) of the (1 + 2)-dimensional KdV equation

$$\left(2\eta_{0\tau} + 3\eta_0\eta_{0\xi} + \tfrac{1}{3}\eta_{0\xi\xi\xi}\right)_\xi + \eta_{0YY} = 0. \tag{7.19}$$

This equation is often called the Kadomtsev-Petviashvili (KP) equation (Kadomtsev and Petviashvili, 1970). It is another completely integrable evolution equation. It also admits an exact analytical solution describing any number of waves that cross obliquely and interact nonlinearly. For the case of three such waves, special solutions exist corresponding to a resonance phenomenon.

Acknowledgement. This paper is based on the lecture delivered at the International Conference on Mathematics and the 21st Century in Cairo, Egypt in January 2000. The author expresses his grateful thanks to the Organizing Committee of the Conference and the University of Central Florida for their financial support for attending the Conference.

References

Akylas, T.R., Dias, F. and Grimshaw, R.H.J. (1998). The effect of the induced mean flow on solitary waves in deep water, J. Fluid Mech. 355, 317-328.
Akylas, T.R. (1984a). On the excitation of long nonlinear water waves by a moving pressure distribution, J. Fluid Mech. 141, 455-466.
Akylas, T.R. (1984b). On the excitation of nonlinear water waves by a moving pressure distribution oscillating at resonant frequency, Phys. Fluids 27, 2803-2807.
Benjamin, T.B. and Feir, J.E. (1967). The disintegration of wavetrains on deep water, Part 1, Theory, J. Fluid Mech. 27, 417-430.
Benney, D. and Roskes, G. (1969). Wave instabilities, Studies Appl. Math. 48, 377-385.
Dagan, G. and Miloh, T. (1982). Free-surface flow past oscillating singularities at resonant frequency, J. Fluid Mech. 120, 139-156.
Davey, A. and Stewartson, K. (1974). On three-dimensional packets of surface waves, Proc. Roy. Soc. Lond. A338, 101-110.
Debnath, L. (1994). Nonlinear Water Waves, Academic Press, Boston.
Debnath, L. and Rosenblat, S. (1969). The ultimate approach to the steady state in the generation of waves on a running stream, Quart. Jour. Mech. and Appl. Math. XXII, 221-233.
Dysthe, K.B. (1979). Note on a modification to the nonlinear Schrodinger equation for application to deep water waves, Proc. Roy. Soc. Lond. A369, 105-114.
Feir, J.E. (1967). Discussion: Some results from wave pulse experiments, Proc. Roy. Soc. Lond. A299, 54-58.
Freeman, N.C. and Davey, A. (1975). On the evolution of packets of long surface waves, Proc. Roy. Soc. Lond. A344, 427-433.
Hogan, S.J. (1985). The fourth-order evolution equation for deep-water gravity-capillary waves, Proc. Roy. Soc. Lond. A402, 359-372.
Huang, D.-B., Sibul, O.J., Webster, W.C., Wehausen, J.V., Wu, D.-M. and Wu, T.Y. (1982). Ship moving in a transcritical range, Proc. Conf. on Behavior of Ships in Restricted Waters (Varna, Bulgaria), 2, 26-2 - 26-10.
Janssen, P.A.E.M. (1983). On a fourth-order envelope equation for deep water waves, J. Fluid Mech. 126, 1-11.
Janssen, P.A.E.M. (1981). Modulational instability and the Fermi-Pasta-Ulam recurrence, Phys. Fluids 24, 23-36.
Kadomtsev, B.B. and Petviashvili, V.I. (1970). On the stability of solitary waves in weakly dispersive media, Sov. Phys. Dokl. 15, 539-541.
Lake, B.M., Yuen, H.C., Rungaldier, H. and Ferguson, W.E. (1977). Nonlinear deep-water waves: Theory and experiment, Part 2, Evolution of a continuous wave train, J. Fluid Mech. 83, 49-74.
Lighthill, M.J. (1965). Contributions to the theory of waves in nonlinear dispersive systems, Proc. Roy. Soc. Lond. A299, 38-53.
Lo, E. and Mei, C.C. (1985). A numerical study of water-wave modulation based on a higher-order nonlinear Schrodinger equation, J. Fluid Mech. 150, 395-416.
Longuet-Higgins, M.S. (1978a). The instabilities of gravity waves of finite amplitude in deep water I, Superharmonics, Proc. Roy. Soc. Lond. A360, 471-488.
Longuet-Higgins, M.S. (1978b). The instabilities of gravity waves of finite amplitude in deep water II, Subharmonics, Proc. Roy. Soc. Lond. A360, 489-505.
Melville, W.K. (1982). The instability and breaking of deep-water waves, J. Fluid Mech. 115, 165-185.
Stiassnie, M. (1984). Note on the modified nonlinear Schrodinger equation for deep water waves, Wave Motion, 6, 431-433.
Stoker, J.J. (1957). Water Waves, Interscience, New York.
Su, M.-Y. (1982a). Three-dimensional deep-water waves, Part I, Experimental measurement of skew and symmetric wave patterns, J. Fluid Mech. 124, 73-108.
Su, M.-Y. (1982b). Evolution of groups of gravity waves with moderate to high steepness, Phys. Fluids 25, 2167-2174.
Whitham, G.B. (1974). Linear and Nonlinear Waves, John Wiley, New York.
Yuen, H.C. and Lake, B.M. (1982). Nonlinear dynamics of deep-water, Ann. Rev. Fluid Mech. 12, 303-334.
Yuen, H.C. and Ferguson, W.E. (1978a). Relationship between Benjamin-Feir instability and recurrence in the nonlinear Schrodinger equation.
Yuen, H.C. and Ferguson, W.E. (1978b). Fermi-Pasta-Ulam recurrence in the two space dimensional nonlinear Schrodinger equation, Phys. Fluids 21, 2116-2118.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 189-198)

A ROBUST LAYER-RESOLVING NUMERICAL METHOD FOR A FREE CONVECTION PROBLEM

JOCELYN ETIENNE, JOHN J. H. MILLER, GRIGORII I. SHISHKIN

ABSTRACT. We consider free convection near a semi-infinite vertical flat plate. This problem is singularly perturbed with perturbation parameter Gr, the Grashof number. Our aim is to find numerical approximations of the solution in a bounded domain, which does not include the leading edge of the plate, for arbitrary values of Gr > 1. Thus, we need to determine values of the velocity components and temperature with errors that are Gr-independent. We use the Blasius approach to reformulate the problem in terms of two coupled nonlinear ordinary differential equations on a semi-infinite interval. A novel iterative numerical method for the solution of the transformed problem is described, and numerical approximations are obtained for the Blasius solution functions, their derivatives and the corresponding physical velocities and temperature. The numerical method is Gr-uniform in the sense that error bounds of the form $C_p N^{-p}$, where $C_p$ and $p$ are independent of Gr, are valid for the interpolated numerical solutions. The numerical approximations are therefore of controllable accuracy.

Keywords and Phrases: Robust method, Layer-resolving, Boundary layer, Fitted mesh, Free convection, Coupled nonlinear equations

1 THE FREE CONVECTION PROBLEM

A free or natural convection flow occurs when a fluid at rest, subjected to a body force such as gravity, is near an object at a different temperature. The heat transfer between the object and the fluid causes an increase or a decrease in the fluid density at the surface of the object, and thus generates an unbalancing body force. The fluid near the surface is accelerated, and a boundary layer develops. We study this problem for a two-dimensional, steady flow near a semi-infinite flat plate. This involves an interesting and typical system of singularly perturbed partial differential equations.

Our goal is to construct a numerical method for this problem in a bounded domain that does not include the leading edge of the plate. Because the solution we seek is self-similar, we can use Blasius' approach to reduce the problem to the numerical solution of a coupled system of nonlinear ordinary differential equations. We require that the numerical approximations generated by our method converge to the exact solution with an order of convergence that is independent of the Grashof number Gr for all Gr > 1. We refer to a numerical method with this property as a layer-resolving method. No standard numerical method fulfills this requirement.

1.1 PHYSICAL DESCRIPTION

We consider a semi-infinite vertical flat plate in an incompressible fluid. We assume that the density of the fluid varies linearly with the temperature and that its other properties are constant. The plate is heated to the temperature $\theta_1$, while the fluid temperature away from the plate is $\theta_\infty$. For $\theta_1 > \theta_\infty$, the heat transfer into the fluid decreases its density in a small region around the plate, resulting in an upward motion of the fluid. Since motion in the fluid results only from this heat transfer, we assume that the fluid away from the surface is not affected by this upward motion. The governing equations are

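$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0, \qquad u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = g\beta(\theta - \theta_\infty) + \nu\frac{\partial^2 u}{\partial y^2}, \qquad u\frac{\partial\theta}{\partial x} + v\frac{\partial\theta}{\partial y} = \alpha\frac{\partial^2\theta}{\partial y^2} \tag{1}$$

with the boundary conditions

$$v = u = 0, \quad \theta = \theta_1 \quad\text{for } y = 0; \qquad u \to 0, \quad \theta \to \theta_\infty \quad\text{for } y \to \infty.$$

When we non-dimensionalise these equations we obtain

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0, \qquad u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = Gr\,\theta + \frac{\partial^2 u}{\partial y^2}, \qquad u\frac{\partial\theta}{\partial x} + v\frac{\partial\theta}{\partial y} = \frac{1}{Pr}\frac{\partial^2\theta}{\partial y^2} \tag{2}$$

with the boundary conditions

$$v = u = 0, \quad \theta = 1 \quad\text{for } y = 0; \qquad u \to 0, \quad \theta \to 0 \quad\text{for } y \to \infty,$$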

where the Grashof and Prandtl numbers have their usual definitions

$$Gr = \frac{g\beta L^3(\theta_1 - \theta_\infty)}{\nu^2}, \tag{3}$$

$$Pr = \frac{\nu}{\alpha}. \tag{4}$$

1.2 BLASIUS' FORMULATION

The problem is now transformed, using Blasius' technique, to a one-dimensional problem. For a complete description we refer the reader to [2]. The transformed problem involves the single dimensionless variable

$$\eta = \left(\frac{Gr}{4}\right)^{1/4}\frac{y}{x^{1/4}}$$

and two dimensionless functions $f$ and $t$, which are related to the physical velocities and temperature through the relations

$$\theta(x, y) = t(\eta), \tag{5}$$

$$u(x, y) = 4\left(\frac{Gr}{4}\right)^{1/2}x^{1/2}f'(\eta), \tag{6}$$

$$v(x, y) = \left(\frac{Gr}{4}\right)^{1/4}x^{-1/4}\left(\eta f'(\eta) - 3f(\eta)\right). \tag{7}$$

In terms of these functions, the governing equations become

$$t'' + 3Pr\,f\,t' = 0, \tag{8}$$

$$f''' + 3ff'' - 2(f')^2 + t = 0, \tag{9}$$

with the boundary conditions

$$f(0) = f'(0) = 0, \quad t(0) = 1; \qquad f'(\eta \to \infty) \to 0, \quad t(\eta \to \infty) \to 0.$$

This is again a singularly perturbed problem. Our aim is to find numerical approximations of the velocity components and temperature in a bounded domain, which does not include the leading edge of the plate. Because we want this solution for arbitrary values of Gr > 1, we need to determine numerical approximations to the solution of the above problem at each point $\eta$ of the semi-infinite interval $I = (0, \infty)$.
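As a consistency check, the reduction to (8)-(9) can be verified by direct substitution. The sketch below is not taken from the paper. It assumes that the non-dimensional momentum and energy equations read $uu_x + vu_y = Gr\,\theta + u_{yy}$ and $u\theta_x + v\theta_y = Pr^{-1}\theta_{yy}$, and it introduces a stream function $\psi$ (an auxiliary quantity not used in the text) with $u = \psi_y$ and $v = -\psi_x$.

```latex
\begin{align*}
\psi &= 4\left(\tfrac{Gr}{4}\right)^{1/4} x^{3/4} f(\eta), \qquad
\eta = \left(\tfrac{Gr}{4}\right)^{1/4}\frac{y}{x^{1/4}}, \qquad
\frac{\partial\eta}{\partial x} = -\frac{\eta}{4x}, \\
u &= \psi_y = 4\left(\tfrac{Gr}{4}\right)^{1/2} x^{1/2} f', \qquad
v = -\psi_x = \left(\tfrac{Gr}{4}\right)^{1/4} x^{-1/4}\left(\eta f' - 3f\right), \\
u u_x + v u_y &= Gr\left(2f'^2 - 3ff''\right), \qquad
u_{yy} = Gr\, f''', \\
u\theta_x + v\theta_y &= -3\left(\tfrac{Gr}{4}\right)^{1/2} x^{-1/2} f\,t', \qquad
\theta_{yy} = \left(\tfrac{Gr}{4}\right)^{1/2} x^{-1/2}\, t''.
\end{align*}
```

Substituting into the momentum equation and dividing by $Gr$ gives $f''' + 3ff'' - 2f'^2 + t = 0$. Substituting into the energy equation and dividing by $(Gr/4)^{1/2}x^{-1/2}/Pr$ gives $t'' + 3Pr\,f\,t' = 0$. Under the stated assumptions, $Gr$ drops out of the ordinary differential equations and enters only through $\eta$ and the velocity scalings.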

2 LAYER-RESOLVING METHOD FOR BLASIUS' PROBLEM

The equations obtained by Blasius are posed on the semi-infinite interval $I$, and there are boundary conditions at infinity. Clearly the problem cannot be solved numerically in this form. A standard alternative approach is to satisfy these boundary conditions by using an iterative method involving additional boundary conditions at $\eta = 0$ (see [2]).

Figure 1: Mesh on semi-infinite domain for Blasius' problem.

Here we use the method described in [1, Chap. 11], which yields a solution on the whole of $I$. We divide $I$ into two subintervals, $[0, L]$ and $[L, +\infty)$. On the first we define a discrete problem on a uniform mesh, and on the second we define an affine extension of each function using the boundary conditions. Thus, for $T$, the interpolated function of the discrete approximation of $t$, we have $T(\eta \to +\infty) = 0$ and therefore we take $T(\eta > L) = 0$. Similarly, for $F$ we know that $f'(\eta \to +\infty) = 0$ and we take $F(\eta > L) = F(L)$, where this latter value is obtained from the solution of the discrete problem on $[0, L]$.

In order to guarantee that the method is Gr-uniform, a careful choice of the point $L$ is of course crucial. We take $L_N = \ln N$ and on the subinterval $[0, L_N]$ we define the uniform mesh $I_N = \{x_i = iN^{-1}\ln N : 0 \le i \le N\}$. This choice is motivated by the discussion in [1] for a simpler problem. The computations described in what follows show that in practice the resulting method is L-uniform. We introduce the discrete problem

$$(P_L^N)\quad \begin{cases} \forall i \in \{1, \ldots, N-1\}, & \delta^2 T_i + 3Pr\,F_i\,D^- T_i = 0, \\ \forall i \in \{2, \ldots, N-1\}, & D^-\delta^2 F_i + 3F_i\,\delta^2 F_i - 2\left(D^- F_i\right)^2 = -T_i, \end{cases}$$

with

$$F_0 = 0, \quad D^+ F_0 = 0, \quad D^- F_N = 0, \qquad T_0 = 1, \quad T_N = 0,$$

where $D^+$ and $D^-$ are the forward and backward first order finite difference operators, $\delta^2$ is the centred second order operator and, for any mesh function $G$, $G_i = G(x_i)$ for all $x_i \in I_N$.

We need to linearize these equations. The natural first attempt is the linearization

$$\delta^2 T^m + 3Pr\,F^{m-1}D^- T^m = 0, \tag{10}$$

$$D^-\delta^2 F^m + 3F^{m-1}\delta^2 F^m - 2\left(D^- F^{m-1}\right)^2 = -T^m, \tag{11}$$

which we iterate until uniform convergence is achieved for a given tolerance. But, for a fixed $N$, the iterates do not converge as $m$ grows. In fact we have

$$\lim_{m\to\infty} F^{2m} \neq \lim_{m\to\infty} F^{2m+1} \quad\text{and}\quad \lim_{m\to\infty} T^{2m} \neq \lim_{m\to\infty} T^{2m+1}.$$

Figure 2: Oscillations of F function (sketch), showing the results of the even iterations and of the odd iterations.
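The mesh $I_N$ and the difference operators that appear in the discrete problem above can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, and the composite operator $D^-\delta^2$ is assumed to mean the backward difference of the centred second difference, which is defined for $i = 2, \ldots, N-1$, the same index range as the discrete momentum equation.

```python
import math


def uniform_log_mesh(N):
    """Mesh I_N = {x_i = i * ln(N) / N : 0 <= i <= N} on [0, L_N], L_N = ln N.
    Returns the list of mesh points and the mesh width h."""
    h = math.log(N) / N
    return [i * h for i in range(N + 1)], h


def d_plus(G, h):
    """Forward difference D+ G_i, defined for i = 0..N-1 (None elsewhere)."""
    n = len(G)
    return [(G[i + 1] - G[i]) / h for i in range(n - 1)] + [None]


def d_minus(G, h):
    """Backward difference D- G_i, defined for i = 1..N (None elsewhere)."""
    n = len(G)
    return [None] + [(G[i] - G[i - 1]) / h for i in range(1, n)]


def delta2(G, h):
    """Centred second difference, defined for i = 1..N-1 (None elsewhere)."""
    n = len(G)
    inner = [(G[i + 1] - 2.0 * G[i] + G[i - 1]) / h ** 2 for i in range(1, n - 1)]
    return [None] + inner + [None]


def d_minus_delta2(G, h):
    """Assumed composite operator D-(delta2 G)_i, defined for i = 2..N-1."""
    n = len(G)
    d2 = delta2(G, h)
    out = [None] * n
    for i in range(2, n - 1):
        out[i] = (d2[i] - d2[i - 1]) / h
    return out


if __name__ == "__main__":
    x, h = uniform_log_mesh(128)
    cubic = [xi ** 3 for xi in x]
    # For G(x) = x^3 the discrete third difference equals 6 up to rounding.
    print(d_minus_delta2(cubic, h)[2:5])
```

On a cubic the centred second difference reproduces $6x$ exactly and the composite operator returns 6 up to rounding, which gives a quick sanity check of the index ranges.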

Figure 2 shows a sketch of the oscillations of the iterates $F^m$. To prevent these oscillations we use previously computed values of $F$ by introducing the auxiliary variables

$$F_c^{m-1} = \frac{1}{2} F^{m-1} + \frac{1}{2} F_c^{m-2} \qquad (12)$$
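The effect of the averaging in (12) can be seen on a toy problem. The following is a minimal sketch only, not the solver used here: the logistic map `g` stands in for one sweep of the linearized discrete problem, which is an assumption made purely for illustration, and the names `g`, `plain_iteration` and `averaged_iteration` are hypothetical. The plain iteration settles into a two-cycle, much like the even and odd iterates, whereas feeding the averaged value back in converges.

```python
def g(x):
    """Stand-in for one sweep of the linearized problem (hypothetical toy map)."""
    return 3.2 * x * (1.0 - x)


def plain_iteration(x0, steps):
    """x^m = g(x^{m-1}); returns the last two iterates."""
    prev, cur = x0, g(x0)
    for _ in range(steps - 1):
        prev, cur = cur, g(cur)
    return prev, cur


def averaged_iteration(x0, steps):
    """x^m = g(c^{m-1}) with c^m = x^m / 2 + c^{m-1} / 2, mirroring (12)."""
    c = x0
    for _ in range(steps):
        x = g(c)
        c = 0.5 * x + 0.5 * c
    return c


prev, cur = plain_iteration(0.3, 200)
fixed_point = 1.0 - 1.0 / 3.2  # the solution of x = g(x) in (0, 1)
print("plain iterates:", prev, cur)
print("averaged iterate:", averaged_iteration(0.3, 200), "target:", fixed_point)
```

In this toy setting the averaged map has slope -0.1 at the fixed point instead of -1.2, which is why the oscillation disappears; whether the same margin holds for the discrete Blasius problem is not something this sketch establishes.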

It is clear that $F_c^{m-1}$ depends on all previously computed values of $F$. Since it is much less subject to oscillation than $F^{m-1}$, we use it to replace $F^{m-1}$ in equation (10). The resulting method yields good results for all physically relevant values of the Prandtl number.

3 CONVERGENCE OF THE METHOD

We use the above method to compute approximations for values of $N$ in the range 128 to 32768, and work in quadruple precision in order to obtain significant error bounds. We study the convergence of the resulting sequence of solutions, and their first and second derivatives, using the experimental error analysis technique described in [1]. All of the computations in this section are carried out for $Pr = 0.72$, which is the value of the Prandtl number for air. Other experiments within a physically relevant range of $Pr$ yield similar results. As in [1], for any mesh function $G^N$ on the mesh $I^N$, we define the maximum pointwise error

$$E^N = \left\| \bar G^N - \bar G^{N_{\max}} \right\|_{I^N},$$

the two-mesh difference

$$D^N = \left\| \bar G^N - \bar G^{2N} \right\|_{I^N \cup I^{2N}},$$

and the order of convergence

$$p^N = \log_2 \frac{D^N}{D^{2N}},$$

where $\bar G$ is the interpolated function corresponding to the mesh function $G^N$ and $N_{\max} = 32768$ is the largest value of $N$ used in the computations. The computed values of the error parameters $C$ and $p$ are defined in an analogous way to those in [1]. From the numerical results in the first and last two rows of Tables 1-3 we see that, in practice, the method is robust and layer-resolving in the sense that it is Gr-uniform and that the Gr-uniform order of convergence of the numerical approximations of $f$ and $t$, and their derivatives, is better than 0.78 for all $N \ge 512$.

N             128       256       512       1024      2048      4096      8192      16384
E^N(F)        0.020684  0.010497  0.005228  0.002543  0.001201  0.000547  0.000236  0.000092
E^N(T)        0.005699  0.002864  0.001672  0.000938  0.000510  0.000268  0.000134  0.000061
D^N(F)        0.014051  0.006856  0.003334  0.001606  0.000761  0.000353  0.000159  0.000069
D^N(T)        0.003695  0.001468  0.000738  0.000429  0.000242  0.000134  0.000073  0.000040
p^N(F)        1.035344  1.039840  1.053815  1.077516  1.109775  1.150155  1.200088  1.264512
p^N(T)        1.331509  0.991703  0.781964  0.825135  0.853144  0.872730  0.886969  0.897438

Table 1: Computed maximum pointwise error $E^N$, computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $F$ and $T$ in quadruple precision arithmetic.

N             128       256       512       1024      2048      4096      8192      16384
E^N(D^+F)     0.008547  0.003847  0.001754  0.000800  0.000363  0.000161  0.000070  0.000057
E^N(D^+T)     0.006304  0.003760  0.002147  0.001190  0.000643  0.000337  0.000176  0.000087
D^N(D^+F)     0.020668  0.012189  0.007557  0.004147  0.002438  0.001319  0.000735  0.000395
D^N(D^+T)     0.007780  0.004259  0.002350  0.001287  0.000701  0.000380  0.000205  0.000110
p^N(D^+F)     0.761882  0.689707  0.865701  0.766244  0.885989  0.843121  0.896398  0.886538
p^N(D^+T)     0.869365  0.857542  0.869375  0.875633  0.883201  0.890366  0.896959  0.902954

Table 2: Computed maximum pointwise error $E^N$, computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $D^+F$ and $D^+T$ in quadruple precision arithmetic.

N             128       256       512       1024      2048      4096      8192      16384
E^N(δ^2F)     0.017785  0.010544  0.006069  0.003395  0.001850  0.000977  0.000491  0.000224
E^N(δ^2T)     0.007241  0.004030  0.002212  0.001197  0.000624  0.000325  0.000163  0.000066
D^N(δ^2F)     0.042189  0.023946  0.014474  0.008000  0.004551  0.002483  0.001362  0.000735
D^N(δ^2T)     0.011175  0.006820  0.003682  0.002133  0.001136  0.000641  0.000338  0.000187
p^N(δ^2F)     0.817064  0.726389  0.855304  0.813816  0.873976  0.866127  0.891145  0.893042
p^N(δ^2T)     0.712518  0.889433  0.787396  0.908475  0.825205  0.924485  0.851711  0.934954

Table 3: Computed maximum pointwise error $E^N$, computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $\delta^2 F$ and $\delta^2 T$ in quadruple precision arithmetic.

3.1 COMPUTED ERROR BOUNDS FOR BLASIUS' FUNCTIONS

The results in [1] for a simpler problem suggest that we can expect error bounds of the form $CN^{-p}$, where $C$ and $p$ are independent of $Gr$. Considering values of $N \ge 512$, the experimental error analysis described in [1, Chap. 8] yields computed values for $p$ and $C$. Applying this technique to the present problem we obtain the following a posteriori error bounds for the functions $F$, $T$ and their derivatives, for all $N \ge 512$:

$$\begin{aligned}
\max_{\eta\in[0,+\infty)} |(F - f)(\eta)| &\le 4.607\,N^{-1.054} \\
\max_{\eta\in[0,+\infty)} |(T - t)(\eta)| &\le 2.320\,N^{-0.782} \\
\max_{\eta\in[0,+\infty)} |(D^+F - f')(\eta)| &\le 2.182\,N^{-0.766} \\
\max_{\eta\in[0,+\infty)} |(D^+T - t')(\eta)| &\le 1.175\,N^{-0.869} \\
\max_{\eta\in[0,+\infty)} |(\delta^2 F - f'')(\eta)| &\le 5.386\,N^{-0.814} \\
\max_{\eta\in[0,+\infty)} |(\delta^2 T - t'')(\eta)| &\le 1.188\,N^{-0.787}
\end{aligned} \qquad (13)$$
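The first bound in (13) can be retraced from the two-mesh differences of Table 1. The sketch below assumes that the technique of [1] takes $p^* = \min_N p^N$ and $C = \max_N D^N N^{p^*} / (1 - 2^{-p^*})$ over the meshes considered; that recipe is stated here as an assumption, and the function names are hypothetical. Because the tabulated $D^N$ are rounded to six decimals, the result only agrees with (13) to about three digits.

```python
import math

# Two-mesh differences D^N(F) copied from Table 1.
D_F = {
    128: 0.014051, 256: 0.006856, 512: 0.003334, 1024: 0.001606,
    2048: 0.000761, 4096: 0.000353, 8192: 0.000159, 16384: 0.000069,
}


def order_of_convergence(D, N):
    """p^N = log2(D^N / D^{2N})."""
    return math.log2(D[N] / D[2 * N])


def error_parameters(D, N_min):
    """Assumed recipe: p* = min p^N and C = max D^N N^{p*} / (1 - 2^{-p*})."""
    sizes = [N for N in sorted(D) if N >= N_min and 2 * N in D]
    p_star = min(order_of_convergence(D, N) for N in sizes)
    C = max(D[N] * N ** p_star for N in sizes) / (1.0 - 2.0 ** (-p_star))
    return p_star, C


p_star, C = error_parameters(D_F, 512)
print(f"p* = {p_star:.3f}, C = {C:.3f}")
```

Under these assumptions the computed pair is close to the exponent 1.054 and the constant 4.607 quoted for $F$.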

These computed error bounds show experimentally that our numerical method is robust and layer-resolving for $N$ in the range 512 to 32768.

3.2 ERROR IN THE PHYSICAL QUANTITIES

We return now to the original non-dimensionalised problem. We want to compute the error for the velocities and temperature on a bounded sub-domain $\bar\Omega = [0.1,1] \times [0,1]$ of the non-dimensionalised semi-infinite domain. The choice of the interval $[0.1,1]$ for the variable $x$ is required because of the singularity in the velocity components $u$, $v$ and their derivatives at the point $x = 0$. We use the relations between these quantities and Blasius' functions described in section 1.2. We see that the velocity components $u$ and $v$ respectively behave like $Gr^{1/2}$ and $Gr^{1/4}$. Therefore, we need to scale the components by these factors in order to obtain quantities that are bounded uniformly with respect to the Grashof number. Graphs of the resulting approximate scaled velocity components and temperature on $[0.1,1] \times [0,1]$ are shown in Figures 3-5 for $Gr = 10^5$ and $N = 32768$. We see that a boundary layer in each physical variable arises on the boundary of the plate. The corresponding scaled errors in the physical quantities are

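$$Gr^{-1/2} \max_{(x,y)\in\bar\Omega} |(U - u)(x,y)| = \max_{\substack{(x,y)\in\bar\Omega \\ \eta=\eta(x,y)}} \left| 2x^{1/2}\,(D^+F - f')(\eta) \right| \le 2 \max_{\substack{(x,y)\in\bar\Omega \\ \eta=\eta(x,y)}} |(D^+F - f')(\eta)| \qquad (14)$$

$$Gr^{-1/4} \max_{(x,y)\in\bar\Omega} |(V - v)(x,y)| = \max_{\substack{(x,y)\in\bar\Omega \\ \eta=\eta(x,y)}} \left| \frac{1}{\sqrt{2}\,x^{1/4}} \bigl( \eta D^+F(\eta) - 3F(\eta) - (\eta f'(\eta) - 3f(\eta)) \bigr) \right| \le 1.26 \max_{\substack{(x,y)\in\bar\Omega \\ \eta=\eta(x,y)}} \left| \eta D^+F(\eta) - 3F(\eta) - (\eta f'(\eta) - 3f(\eta)) \right| \qquad (15)$$

$$\max_{(x,y)\in\bar\Omega} |(\Theta - \theta)(x,y)| = \max_{\substack{(x,y)\in\bar\Omega \\ \eta=\eta(x,y)}} |(T - t)(\eta)| \qquad (16)$$

Figure 3: The approximate scaled horizontal velocity for the free convection problem on $[0.1,1] \times [0,1]$ with $Gr = 10^5$ generated with $N = 32768$.

Figure 4: The approximate scaled vertical velocity for the free convection problem on $[0.1,1] \times [0,1]$ with $Gr = 10^5$ generated with $N = 32768$.

Figure 5: The approximate temperature for the free convection problem on $[0.1,1] \times [0,1]$ with $Gr = 10^5$ generated with $N = 32768$.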

We see that we need to estimate the additional quantity $\eta D^+F(\eta) - 3F(\eta)$. The required numerical results are given in Table 4.

N             128       256       512       1024      2048      4096      8192      16384
D^N           0.042154  0.020567  0.010003  0.004819  0.002283  0.001058  0.000477  0.000207
p^N           1.035344  1.039840  1.053815  1.077516  1.109775  1.150155  1.200088  1.247754

Table 4: Computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $\eta D^+F - 3F$ in quadruple precision arithmetic.

With these results, and those from the previous section, we find the following computed scaled error bounds for the physical quantities

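$$\begin{aligned}
Gr^{-1/2}\,\|U - u\|_{\bar\Omega} &\le 4.37\,N^{-0.766} \\
Gr^{-1/4}\,\|V - v\|_{\bar\Omega} &\le 17.4\,N^{-1.053} \\
\|\Theta - \theta\|_{\bar\Omega} &\le 2.32\,N^{-0.782}
\end{aligned} \qquad (17)$$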
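The constants 2 and 1.26 in (14) and (15), and the constant 4.37 in (17), can be checked by elementary arithmetic. The sketch below assumes the usual similarity scalings, $Gr^{-1/2}u = 2x^{1/2}f'(\eta)$ and $Gr^{-1/4}v = (\eta f' - 3f)/(\sqrt{2}\,x^{1/4})$; these prefactors are taken from the standard Blasius formulation as an assumption, since section 1.2 is not restated here, and the variable names are hypothetical.

```python
import math

X_MIN, X_MAX = 0.1, 1.0  # range of x in the sub-domain [0.1, 1] x [0, 1]

# Assumed prefactors from the similarity transformation:
#   Gr^{-1/2} u = 2 x^{1/2} f'(eta)
#   Gr^{-1/4} v = (eta f' - 3 f) / (sqrt(2) x^{1/4})
sup_u_factor = 2.0 * math.sqrt(X_MAX)                   # increasing in x, largest at x = 1
sup_v_factor = 1.0 / (math.sqrt(2.0) * X_MIN ** 0.25)   # decreasing in x, largest at x = 0.1

C_DF = 2.182  # constant in the bound (13) for D^+F - f'
print(sup_u_factor, sup_v_factor, sup_u_factor * C_DF)
```

With these assumed factors the supremum for $v$ is slightly below 1.26, and twice the constant 2.182 from (13) is slightly below 4.37, consistent with the rounded-up constants above.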

These computed error bounds show that the boundary layers have been successfully resolved. We remark that we can use the same approach to generate similar approximations to the derivatives of the physical variables.

4 CONCLUSION

For free convection on a semi-infinite vertical flat plate, Grashof uniform numerical approximations to the velocity components and temperature have been generated in a bounded domain, which does not include the leading edge of the plate, for arbitrary values of Gr, using the Blasius formulation. Analysis of the numerical approximations shows that this numerical method is robust and layer-resolving. It follows that numerical approximations of controllable accuracy, with errors independent of the value of the Grashof number, can be computed with this method.

ACKNOWLEDGMENTS

This work has been supported in part by the Russian Foundation for Basic Research under grant No. 98-01-00362 and by the Enterprise Ireland grant SC-98-612.

REFERENCES

[1] Paul Farrell, Alan Hegarty, John J. H. Miller, Eugene O'Riordan, Grigorii I. Shishkin. Robust Computational Techniques for Boundary Layers. Series in Applied Mathematics and Scientific Computation, CRC Press, 2000.
[2] David F. Rogers. Laminar flow analysis. Cambridge University Press, 1992.
[3] Hermann Schlichting. Boundary-layer theory. McGraw-Hill, 7th ed., 1979.

Jocelyn Etienne John J.H. Miller Department of Mathematics University of Dublin Trinity College Dublin 2, Ireland

Grigorii I. Shishkin Institute of Mathematics and Mechanics Russian Academy of Sciences, Ural Branch Ekaterinburg 620219 Russia

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 199-209)


Growth, value-distribution and zero-free regions of entire functions and sections

Faruk F. Abi-Khuzam
Department of Mathematics, American University of Beirut, Beirut, Lebanon
October 27, 2000

Abstract
The growth of a second derivative of the logarithm of the maximum modulus of an entire function provides information about the location of zeros of the function and its sections. We present a survey of work on this topic along with some recent sharp results and open questions.

1991 Mathematics Subject Classification: Primary 30D20, 30D15; Secondary 30D35.
Keywords and Phrases: Entire function, section, zeros, gaps.

1 Introduction

I am happy to be here before this gathering of distinguished workers in the field, and I feel deeply honoured to be given this great opportunity to share with you some of the mathematics we love.

The study of zero sets of functions or roots of equations is a great story in mathematics, starting very far back in time and continuing on into the present. It would not be too far from the truth to say that almost every mathematician has worked at one time or another on a problem related to some zero set. When we scan the literature for this topic we find ourselves in good company. The names of Weierstrass, Riemann, Hadamard, Polya, Szego, Hayman, Gol'dberg and Ostrowski stand out but there are numerous others.

When we think about zero sets of functions, we naturally recall the fundamental theorem of Algebra, and Picard's theorem. It would be inappropriate here not to mention something about these results and their generalization in what is known as value-distribution theory. But I will content myself with some brief remarks since this talk will emphasize what might be called angular distribution of zeros.

Picard's (little) theorem distinguishes values according as they are lacunary (omitted by the function) or non-lacunary (assumed by the function). Value-distribution theory quantifies lacunarity through the notion of deficiency and climaxes in the deficiency relation of Nevanlinna, a far reaching generalization of Picard's Theorem. The method of negative curvature provides connection with geometry and extensions in various directions.

Let us give a few details. If $f$ is an entire function, we count the number of $a$-points of $f$ (i.e. solutions of the equation $f(z) = a$) in the disk $|z| < r$ and denote this number by $n(r,a)$. This is the counting function of the $a$-points. The related function, defined in most cases by $N(r,a) = \int_0^r t^{-1} n(t,a)\,dt$, is called the smoothed counting function and one aspect of value-distribution theory is to compare the growth of $N$ or $n$ with some average function called the characteristic and defined, for entire functions, say when $f(0) = 1$, by

$$T(r) = \frac{1}{2\pi} \int_0^{2\pi} N(r, e^{i\lambda})\,d\lambda.$$

The deficiency of a value $a$ is defined by

$$\delta(a) = 1 - \limsup_{r\to\infty} \frac{N(r,a)}{T(r)}$$

and the famous deficiency relation is the inequality

$$\sum_{a \in \mathbb{C}\cup\{\infty\}} \delta(a) \le 2.$$

For an entire function $\delta(\infty)$ is always 1. So if the function omits a finite value $a$ then $n(r,a) = N(r,a) = 0$ and this lacunary value will have $\delta(a) = 1$. The defect relation then implies that, unless $f$ is a constant, it can have no other lacunary value. This is Picard's theorem.

A large part of value distribution theory is concerned with the deficiency relation and in particular, the number of $a$-points of the function $f$ in a disk. But along with the problem of counting $a$-points or zeros, there is the problem of determining the angular distribution of zeros of a given entire function. Many outstanding questions in analysis involve the problem of locating the zeros of some entire, or meromorphic, function. The most famous of all is, of course, the Riemann conjecture which states that all the non-trivial zeros of the zeta function $\zeta(\sigma + it)$ lie on the line $\sigma = \frac{1}{2}$. If $\xi$ is the function defined by

$$\xi(t) = \tfrac{1}{2}\,s(s-1)\,\zeta(s)\,\Gamma\!\left(\tfrac{1}{2}s\right)\pi^{-s/2}, \qquad\text{where } s = \tfrac{1}{2} + it,$$

then $\xi$ is an entire function with non-negative Maclaurin coefficients and the Riemann conjecture is equivalent to the statement that all the zeros of $\xi$ are real.

Another outstanding problem is the width conjecture of Saff-Varga [ESV] which is a statement about the possible dispersion in the plane of the zeros of sections of a given power series. It has been verified in certain specific cases, notably in work of Edrei, Saff and Varga [ESV] but, as far as I know, continues to be unresolved in the general case.

What I want to present here is a program of study, perhaps a bit ambitious, which has as its aim the answer to two questions that I will present shortly. The starting point is an entire function $f : \mathbb{C} \to \mathbb{C}$ and its power series

$$f(z) = \sum_{k=0}^{\infty} a_k z^k$$

and its sections

$$s_n(z) = \sum_{k=0}^{n} a_k z^k.$$

We now ask two questions:
(*) Where are the zeros of $f$ located?
(**) Where are the zeros of $s_n$ located?

I aim to show here that a study of the growth of a certain second derivative associated to $f$ promises to supply very precise information about the location of the zeros of $f$. This second derivative is defined by

$$b(r) = \frac{d^2 \log M(r)}{d(\log r)^2} \qquad (2)$$

where $M(r) = \sup_{|z|=r} |f(z)|$, the maximum modulus of $f$. It will be seen that the best results available through the study of $b(r)$, so far, concern entire functions of zero order. But there are also general results, such as those in Theorems 3, 8 below, covering functions of all orders. Hopefully they will serve as evidence favoring further studies on $b(r)$.
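To get a feel for the definition (2), the second derivative can be approximated by a central difference in the variable $\log r$. This is a minimal numerical sketch, not part of the original argument; the helper name `b_numeric` and the step size are hypothetical choices. It uses two functions with positive coefficients, for which $M(r) = f(r)$: the exponential, where $b(r) = r$, and the polynomial $1 + z$, where $b(r) = r/(1+r)^2$ peaks at the value $1/4$ at $r = 1$.

```python
import math


def b_numeric(log_M, r, h=1e-3):
    """Central second difference of log M(r) with respect to log r."""
    s = math.log(r)
    phi = lambda u: log_M(math.exp(u))
    return (phi(s + h) - 2.0 * phi(s) + phi(s - h)) / (h * h)


# exp z has M(r) = e^r, so log M(r) = r and b(r) should equal r.
b_exp = b_numeric(lambda r: r, 3.0)

# The polynomial 1 + z has M(r) = 1 + r and b(r) = r / (1 + r)^2.
b_lin = b_numeric(lambda r: math.log1p(r), 1.0)

print(b_exp, b_lin)
```

The two printed values should be close to 3 and 0.25 respectively, up to the discretisation error of the difference quotient.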

2 Preliminaries

The answer to the first question in the introduction goes back to Weierstrass. Take any sequence of complex numbers tending to $\infty$ but otherwise arbitrary, then you can find an entire function whose zero set is precisely this sequence. In other words the zeros of an entire function can be located anywhere in the complex plane.

So we have to specialize the class of functions under consideration if we want to obtain something interesting. In this talk I shall concentrate mostly on the class of entire functions with non-negative coefficients. This is a nice class of functions that contains many of the usual functions that we meet everyday in our work such as the exponential function, the Mittag-Leffler functions and $\xi$. If the coefficients are positive then the zeros of $f$ cannot lie on the positive real axis and, by continuity, there will be a neighborhood of the positive real axis where $f$ would have no zeros. What does this neighborhood look like and what is the corresponding set for the sections?

By way of experimenting one may consider first the exponential function. Of course it has no zeros but plotting the zeros of its sections one obtains some very nice pictures as in [ESV]. The numerical computations done on the sections of $\exp z$, which we owe to Iverson [Iv], suggested the existence of a parabolic region free of all zeros of all sections of $\exp z$. This was verified (following a result [NR] of Newman and Rivlin) by Saff and Varga who proved the following [SV].

Theorem 1 If $a_j > 0$ for $j = 0,1,2,\dots$, $b_k = a_{k-1}/a_k$ and

$$\alpha = \inf\{(b_k - b_{k-1}) : k = 1,2,\dots\} > 0,$$

then the sections $s_n$ of $f$ have no zeros in the parabolic region $P_\alpha$ defined by

$$P_\alpha = \{z = x + iy \in \mathbb{C} : y^2 \le 4\alpha(x + \alpha),\ x > -\alpha\}. \qquad (3)$$

For example, $\exp z$ has $b_k = k$ and $\alpha = 1$ so that the parabolic region $P_1 = \{z = x + iy \in \mathbb{C} : y^2 \le 4(x + 1),\ x > -1\}$ is free of zeros of all sections of $\exp z$. Theorem 1 is reminiscent of another "Parabola" theorem occurring in the theory of continued fractions. But I am not aware of any connection between the two. The Saff-Varga result is rather elegant but suffers from a certain limitation of its applicability. For it is a fact, easy to prove [AK1], that the condition $\alpha > 0$ in the Saff-Varga result implies that $f$ is of exponential growth and type $\le 1/\alpha$. It is a result limited to functions of growth order at most one and finite type.

At about the same time as the Iverson experiments, Edrei [E] considered a sort of converse question and obtained:

Theorem 2 If, for every positive integer $n$, the zeros of $s_n$ lie in a closed half-plane containing the origin on its boundary, then

$$\limsup_{n\to\infty} |a_n|^{1/n^2} < 1. \qquad (4)$$
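The hypotheses of Theorems 1 and 2 are easy to test on the exponential function. The following sketch computes $b_k = a_{k-1}/a_k$ exactly for $a_k = 1/k!$, with the convention $b_0 = 0$ taken as an assumption, and then evaluates $|a_n|^{1/n^2}$ for growing $n$. The helper name `nth_root_n_squared` is hypothetical.

```python
import math
from fractions import Fraction

# Maclaurin coefficients of exp z: a_k = 1 / k!
a = [Fraction(1, math.factorial(k)) for k in range(12)]

# b_k = a_{k-1} / a_k for k >= 1, with b_0 taken as 0 (an assumed convention).
b = [Fraction(0)] + [a[k - 1] / a[k] for k in range(1, 12)]
alpha = min(b[k] - b[k - 1] for k in range(1, 12))


def nth_root_n_squared(n):
    """|a_n|^(1/n^2) for a_n = 1/n!, computed through log-gamma."""
    return math.exp(-math.lgamma(n + 1) / (n * n))


print("alpha =", alpha)
print([round(nth_root_n_squared(n), 4) for n in (10, 100, 1000, 10000)])
```

The differences $b_k - b_{k-1}$ all equal 1, in line with $\alpha = 1$, while the values $|a_n|^{1/n^2}$ creep up towards 1, so the exponential satisfies the hypothesis of Theorem 1 but not the conclusion (4) of Theorem 2.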

Edrei's result was extended by Ganelius [G], in particular replacing the half-plane by a sector.

The Saff-Varga result goes in a direction opposite that of Edrei. But there is a feature common to these two results: both connect the growth of the coefficients of $f$ to a region free of zeros of sections of $f$. If we suspect that there is an underlying principle behind this connection, we have to express the growth of coefficients in possibly different form. Indeed the zero-free region in the Edrei-Ganelius result is a sector, that in the Saff-Varga result is a parabola, and we need to connect the geometry of these figures with some index related to the growth of the coefficients or of the function. Let us then turn to the idea of growth of an entire function.

3 Growth of entire functions

One of the most important ideas related to the study of entire functions is the notion of growth. An important index of growth is the order of $f$ which is defined by

$$\rho = \limsup_{r\to\infty} \frac{\log\log M(r)}{\log r},$$

where $M(r) = \sup_{|z|=r} |f(z)|$, the maximum modulus of $f$. It can be shown that the order of $f$ can also be expressed in terms of the coefficients by

$$\frac{1}{\rho} = \liminf_{n\to\infty} \frac{\log(1/|a_n|)}{n\log n}.$$
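The coefficient formula for the order can be tried out numerically. The sketch below evaluates the quotient $\log(1/|a_n|)/(n\log n)$ at a single large $n$ for two standard examples; the helper name `order_estimate` is hypothetical, and a single value of $n$ only hints at the lower limit, since the convergence is of order $1/\log n$.

```python
import math


def order_estimate(log_inv_abs_coeff, n):
    """log(1/|a_n|) / (n log n), whose lower limit is 1/rho."""
    return log_inv_abs_coeff(n) / (n * math.log(n))


# exp z: a_n = 1/n!, so log(1/|a_n|) = log n! and rho = 1.
inv_rho_exp = order_estimate(lambda n: math.lgamma(n + 1), 10**6)

# cosh(sqrt(z)) = sum z^n / (2n)!: log(1/|a_n|) = log (2n)! and rho = 1/2.
inv_rho_cosh = order_estimate(lambda n: math.lgamma(2 * n + 1), 10**6)

print(inv_rho_exp, inv_rho_cosh)
```

The two quotients approach 1 and 2 slowly from below, corresponding to the orders 1 and 1/2.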

Since the order $\rho$ holds information about the growth of $f$ and its coefficients in an explicit way, we might suspect that it is the index sought. However one obvious implication of the second formula is that a change in the arguments of the coefficients of $f$ does not affect $\rho$. Since such a change is expected to affect the angular location of the zeros, we cannot hope to get a description of the geometry of the zeros through some function of the order only. At any rate our remark after Theorem 1 implies that the Saff-Varga functions are of order at most 1. Also (4) implies that the Edrei-Ganelius functions are of order zero by the second formula for the order. So in the case of positive coefficients, Theorems 1, 2 suggest the following: when the order of growth of $f$ is at most one we expect a parabolic region free of zeros of sections, and when the order of growth of $f$ is zero we expect a sectorial region free of zeros of sections. But what zero-free region do we expect if $\rho = \frac{1}{2}$ or 2? So far we have no idea because we have yet to figure out a way of connecting growth with angular distribution of zeros.

4 Hadamard-Hayman convexity

It turns out that in order to obtain the connection, unifying the previous results and extending them to cover functions of any order, including infinite order, we have to adopt a new way of measuring the growth of $f$. What is needed is some index or functional which is sensitive to the angular distribution of zeros of $f$.

Now we are all familiar with the three circles theorem of Hadamard, which states that $\log M(r)$ is a convex function of $\log r$. This is a very important result that finds applications in various areas such as function theory, harmonic analysis, and partial differential equations. Let us put

$$b(r) = r\,\frac{d}{dr}\left(\frac{r M'(r)}{M(r)}\right),$$

which is the second derivative in (2). It is easy to show that $b(r) \le K r^{\lambda}$ implies that $f$ is of order at most $\lambda$, but $b$ can have growth larger than that of $f$.

The study of the growth of $b(r)$ was initiated in 1968 by Hayman [H], independently of the question of angular distribution of zeros. Hayman showed that the classical estimate $b(r) \ge 0$ obtained from the three circles theorem could be improved under certain conditions. He showed that there exists a positive absolute constant $A_0 > 0.180$ such that

$$\limsup_{r\to\infty} b(r) \ge A_0$$

for every transcendental entire function $f$. The exact value of $A_0$ is as yet undetermined. Hayman's result was followed by work of Kjelleberg [Kj] and others. In particular Boichuck and Gol'dberg [BG] showed that $A_0 = 0.25$ if we restrict to the class of entire functions with non-negative coefficients.

The first explicit connection was obtained in [AK2] where a very simple inequality was found to the effect that, under positivity of coefficients,

$$f^2(r) - |f(re^{i\theta})|^2 \le 4\sin^2\frac{\theta}{2}\; f^2(r)\, b(r). \qquad (5)$$
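Inequality (5) can be probed numerically on simple examples. The sketch below is an illustration under stated assumptions rather than a proof: it samples a grid of radii and angles (both hypothetical choices) for $\exp z$, where $b(r) = r$, and for the polynomial $1 + z$, where $b(r) = r/(1+r)^2$ and a direct computation shows that both sides of (5) equal $4r\sin^2(\theta/2)$.

```python
import cmath
import math


def gap(f, b, r, theta):
    """Right-hand side minus left-hand side of (5) at z = r e^{i theta}."""
    f_r = f(complex(r, 0.0)).real
    lhs = f_r ** 2 - abs(f(cmath.rect(r, theta))) ** 2
    rhs = 4.0 * math.sin(theta / 2.0) ** 2 * f_r ** 2 * b(r)
    return rhs - lhs


radii = [0.1 * k for k in range(1, 51)]
angles = [math.pi * k / 60 for k in range(1, 61)]

# exp z, for which b(r) = r; the gap is rescaled by f(r)^2 = e^{2r}.
worst_exp = min(gap(cmath.exp, lambda r: r, r, t) / math.exp(2.0 * r)
                for r in radii for t in angles)

# 1 + z, for which b(r) = r / (1 + r)^2; here (5) holds with equality.
worst_lin = max(abs(gap(lambda z: 1 + z, lambda r: r / (1 + r) ** 2, r, t))
                for r in radii for t in angles)

print(worst_exp, worst_lin)
```

On this grid the rescaled gap for the exponential stays non-negative and the gap for $1 + z$ vanishes up to rounding, which also shows that the constant in (5) cannot be improved in general.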

This inequality was used to obtain an alternative proof, and a refinement, of the Boichuck-Gol'dberg result. But once you have this inequality you see at once the connection between the location of zeros of $f$ (and also its $a$-values) and the growth of $f$. For example, if $f$ vanishes at $z = re^{i\theta}$ then (5) tells us that $r$ and $\theta$ must be governed by the inequality

$$1 \le 4\sin^2\frac{\theta}{2}\; b(r). \qquad (6)$$

An immediate consequence of (6) is that $\limsup_{r\to\infty} b(r) \ge 0.25$ for every transcendental entire $f$ with non-negative Maclaurin coefficients. This is the result of Boichuck-Gol'dberg and it is best possible.

If now we accept to measure the growth of $f$ by $b(r)$, rather than $M(r)$, then (6) gives us a direct and simple relation between the growth of $f$ and the angular distribution of its zeros. In fact it gives much more. For example, if we return to the exponential function, where $b(r) = r$, we see that, if (6) were applicable to all sections of $e^z$, we would have that all zeros $z = re^{i\theta}$ of all sections of $e^z$ must satisfy the inequality

$$1 \le 4r\sin^2\frac{\theta}{2} = 2r(1 - \cos\theta) \qquad\text{or}\qquad \frac{1}{2(1 - \cos\theta)} \le r, \qquad (7)$$

which is manifestly a parabolic region. This is the Saff-Varga result. Important cases where (6), with $b(r; f)$, does apply to all sections $s_n$ have been obtained in [AK1]:

Theorem 3 If $a_j > 0$ and $G$ is the region defined by

$$G = \left\{ (r,\theta) : b(r) < \frac{1}{2(1 - \cos\theta)},\ r > 0,\ -\pi < \theta \le \pi \right\} \qquad (8)$$

then $G$ is free of all zeros of $f$. If in addition $b_k \ge b_{k-1}$, where $b_k = a_{k-1}/a_k$, then $G$ is free of zeros of all sections of $f$.

Among functions satisfying the conditions of Theorem 3 we note the Mittag-Leffler functions $E_{1/\lambda}(z) = \sum_{j=0}^{\infty} \frac{z^j}{\Gamma(1 + j/\lambda)}$ and, in particular, the exponential function [AK1]. Particular cases of this theorem give successively:

Example 4 If $\sup_{0<r<\infty} b(r) = 1/(4\beta)$ where $0 < \beta < \infty$, then $\beta \le 1$ and the sector $S = \{z = x + iy \in \mathbb{C} : |\arg z| < 2\sin^{-1}\sqrt{\beta}\}$ is free of all zeros of $f$ and also of all its sections under the condition $b_k \ge b_{k-1}$.

Example 5 If $b(r) \le Kr$ then the parabolic region

$$P_{1/4K} = \left\{ z = x + iy \in \mathbb{C} : y^2 < \frac{1}{K}\left(x + \frac{1}{4K}\right) \right\}$$

is free of all zeros of $f$ and also of all its sections under the condition $b_k \ge b_{k-1}$.

Notice that the result in Theorem 3 includes the Saff-Varga result, though the region they obtain is larger. It also sheds light on the Edrei-Ganelius result and, more importantly, applies to functions of all orders including those of infinite order. The main tool in the proof of Theorem 3 is the following lemma whose proof is rather difficult and it would be desirable to find alternative proofs and possible extensions.

Lemma 6 If the coefficients of $f$ are non-negative and satisfy $a_{n-1}a_{n+1} \le a_n^2$ then

$$b(r, s_n) \le b(r, f).$$

4.1 The extremal cases

Returning to Theorem 3 it is natural to try to study the extremal cases in it. We can look at the case where $f$ is entire with positive coefficients and $\limsup_{r\to\infty} b(r) = 0.25$, and ask if such a function has some special properties. From (6), it would appear that the zeros of $f$ will have to be real and negative! If true, this would be very valuable. For it would give precise information about the location of zeros from an asymptotic relation. But of course (6) does not suffice to give this and a different approach is needed to handle the extremal case which, however, brings a pleasant surprise:

Theorem 7 If $f$ has positive coefficients and

$$\limsup_{r\to\infty} b(r) \le \frac{1}{4} \qquad (9)$$

then all but a finite number of the zeros of $f$ are simple, real and negative. Furthermore the constant $\frac{1}{4}$ is best possible.

The proof [AK] of this result is obtained by first locating certain radii $t_n$ where the growth of $f$ is comparable with that of its maximal term. Rouché's theorem then gives that $f$ has $n$ zeros in the disk $|z| < t_n$. Thus $f$ has exactly one zero in the annulus $t_n < |z| < t_{n+1}$. Since complex zeros of $f$, if they exist, must occur in conjugate pairs, this zero of $f$ must be real and simple. Of course it cannot be positive so it must be real and negative.

5 Extensions

The result in Theorem 7 is a consequence of the fact that the growth of $b(r)$ is sensitive to the presence of equimodular zeros. In the context that we were describing the growth of $b(r)$ doubles in the presence of equimodular zeros. In particular double zeros would double the size of $b(r)$. The tension between the size of $b(r)$ and the presence of double zeros in the presence of positive coefficients leads to the very precise result in Theorem 7.

The preceding discussion suggests that a study of the growth of $b(r)$ in the general case, that is in the case where the coefficients are not necessarily real positive, ought to be taken up. Of course in this very general setting two difficulties arise. The first is that it will no more be possible to have simple explicit formulas for $M(r)$ and $b(r)$ to work with. The second is that there will not be a distinguished line, the positive real axis in the case of positive coefficients, with respect to which we could try to locate the zeros. Or so it may seem. For there is always the curve where $f$ takes on its maximum modulus and one expects this curve to exert a repellent force on the zeros. A preliminary study of this situation has led to the following results [AK3].

Theorem 8 Let

fc,1/2

oo

2n—1

= n(!riM 2 ' 9=e- " 2/a ' a>0

n=l

1

207 and

^ ( a ) = or 4l°g f c ' 1 / 2 («)If f is any transcendental entire function then lim sup b(r) > maxAi(a) = A\.

(10)

r—>oo

Also if $\limsup_{r\to\infty} b(r) = 2\max A_1(a) = 2A_1$ then all but a finite number of the multiple zeros of $f$ must satisfy $|\theta_n - \omega(r_n)| = \pi$ where $z_n = r_n e^{i\theta_n}$ is a multiple zero of $f$ and $\omega(r_n)$ is the argument of a point where the maximum modulus is achieved.

This result confirms the repelling property alluded to above and underlines the importance of obtaining the sharp constant in (10). It also serves to demonstrate, once more, how the growth of $b(r)$ becomes more pronounced in the presence of multiple zeros or equimodular zeros. Of course there may be no multiple or equimodular zeros. In this case one may consider ratios of successive zeros. The closer such ratios are to unity the closer we are to the equimodular case. A preliminary study of the connection between the growth of $b(r)$ and ratios of successive zeros, in the special case where the zeros are on one ray, has led to the following precise result [AK]:

Theorem 9 Let $f$ be transcendental, with non-negative Maclaurin coefficients, bounded $b(r)$ and having all but a finite number of its zeros real and negative. Then:

(a) If $\lim_{n\to\infty} r_{n+1}/r_n = q$ for some $q \in (1,\infty]$ then $\limsup_{r\to\infty} b(r) = \psi(q)$ where

$$\psi(q) = \frac{1}{4} + 2\sum_{k=1}^{\infty} \frac{q^k}{(1 + q^k)^2}. \qquad (11)$$

(b) If $\limsup_{r\to\infty} b(r) = \psi(q)$ for some $q \in (1,\infty]$ then $\limsup_{n\to\infty} r_{n+1}/r_n \ge q$. If equality holds in this last inequality then actually $\lim_{n\to\infty} r_{n+1}/r_n = q$.

Thus an entire function $f$ satisfying $\limsup_{r\to\infty} b(r; f) = 0.25$ and having non-negative coefficients has all but a finite number of its zeros simple, real and negative. In addition, we now have that its successive zeros satisfy the equality

$$\lim_{n\to\infty} \frac{r_{n+1}}{r_n} = \infty.$$

It should be possible to obtain an extension of Theorem 7 at least when the coefficients are non-negative and the zeros are confined to a small angle bisected by the negative real axis.

6 Gap Series

The previous discussion indicates that the growth of $b(r)$ is smallest when all the zeros lie on one ray through the origin and gets larger as they swing toward the opposite ray. One way to make the zeros swing towards, say, the positive real axis is to consider $g(z) = f(z^A)$ where $A$ is an integer at least equal to 2 and $f$ is the entire function with only positive coefficients satisfying $\limsup_{r\to\infty} b(r; f) = 0.25$. In this case it is easy to see that

$$\limsup_{r\to\infty} b(r; g) = A^2 \limsup_{r\to\infty} b(r; f) = \frac{A^2}{4}. \qquad (12)$$
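The scaling behind (12) is the identity $b(r; g) = A^2\, b(r^A; f)$ for $g(z) = f(z^A)$, which follows from the chain rule in the variable $\log r$. The sketch below checks it numerically on a stand-in: the polynomial $f(z) = 1 + z$ is used instead of a transcendental extremal function, purely as an illustrative assumption, and the helper name `b_numeric` and the value of `A` are hypothetical.

```python
import math


def b_numeric(log_M, r, h=1e-3):
    """Central second difference of log M(r) with respect to log r."""
    s = math.log(r)
    phi = lambda u: log_M(math.exp(u))
    return (phi(s + h) - 2.0 * phi(s) + phi(s - h)) / (h * h)


A = 3  # hypothetical gap parameter
log_M_f = lambda r: math.log1p(r)        # f(z) = 1 + z
log_M_g = lambda r: math.log1p(r ** A)   # g(z) = f(z^A) = 1 + z^A

for r in (0.5, 1.0, 2.0):
    print(r, b_numeric(log_M_g, r), A ** 2 * b_numeric(log_M_f, r ** A))
```

At $r = 1$ both columns should be close to $A^2/4$, the largest value of $b(r; g)$ for this stand-in.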

This suggests that the growth of $b(r)$ is also connected with the gap structure of the series. In their paper [BG], Boichuck and Gol'dberg have already noted this. Their sharp result may be rephrased for our present purposes as follows. Suppose that

$$f(z) = \sum_{k=0}^{\infty} a_k z^{\lambda_k} \qquad (13)$$

is an entire function where $a_k > 0$ and $\limsup_{k\to\infty} (\lambda_{k+1} - \lambda_k) = A$. Then

$$\limsup_{r\to\infty} b(r; f) \ge \frac{A^2}{4}.$$

In [GO] Gol'dberg and Ostrowski have asked about the connection between the gap structure of the Maclaurin series of $f$ and the location of its zeros. This connection may be studied via $b(r)$. Ongoing work has resulted in some progress but, except for an unpublished result of Ostrowski announced in [GO], no sharp results are known.

The real ambitious question is to extend the above results to functions of positive order $< 0.5$. We have very little knowledge as to what happens in this case.

Thank you.

References

[AK] Faruk F. Abi-Khuzam, The distribution and multiplicity of values of entire functions of small growth, Complex Variables, (2000).
[AK1] Faruk F. Abi-Khuzam, Zero-Free Regions for Entire Functions and Sections of Their Power Series, Complex Variables, 29(1996), 173-187.
[AK2] Faruk F. Abi-Khuzam, Maximum modulus convexity and the location of zeros of an entire function, Proc. Amer. Math. Soc. 106(1989), 1063-1068.
[AK3] Faruk F. Abi-Khuzam, Hadamard Convexity And Multiplicity And Location Of Zeros, Trans. Amer. Math. Soc. 347(1995), 3043-3051.
[GO] A.A. Gol'dberg and I.V. Ostrowski, Connection Between Arguments Of Zeros And Lacunarity, Linear & Complex Analysis Problem Book 3, Part II, (#1574 LNM).
[BG] V.S. Boichuck and A.A. Gol'dberg, The three-lines theorem, Mat. Zametki 15(1974), 45-53. (Russian)
[H] W.K. Hayman, Note on Hadamard's convexity theorem, Entire Functions and Related Parts of Analysis, Proc. Sympos. Pure Math., vol. 11, Amer. Math. Soc., Providence, RI, 1968, pp. 210-213.
[Kj] B. Kjelleberg, The convexity theorem of Hadamard-Hayman, Proc. Sympos. Math., Stockholm (June 1973, Royal Institute of Technology), pp. 87-114.
[ESV] A. Edrei, E.B. Saff and R.S. Varga, Zeros of sections of power series, Springer-Verlag, Berlin, 1983.
[G] T. Ganelius, The zeros of partial sums of power series, Duke Math. J. 30(1963), 533-540.
[E] A. Edrei, Power series having partial sums with zeros in a half-plane, Proc. Amer. Math. Soc. 9(1958), 320-324.
[SV] E.B. Saff and R.S. Varga, Zero-free parabolic regions for sequences of polynomials, SIAM J. Math. Anal. 7(1976), 344-357.
[NR] D. J. Newman and T. J. Rivlin, Correction: The zeros of the partial sums of the exponential function, J. Approx. Theory 16(1976), 299-300.
[Iv] K. E. Iverson, The zeros of the partial sums of $e^z$, Math. Tables Aids Comp. 7(1953), 163-168.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2 0 0 1 World Scientific Publishing Co. (pp. 211-221)


THREE LINEAR PRESERVER PROBLEMS

AHMED RAMZI SOUROUR

ABSTRACT. Linear preserver problems are questions about characterising linear maps on spaces of matrices or spaces of operators (or more generally on rings or algebras) that preserve certain properties. We present an exposition of three such problems on preserving invertibility or commutativity or rank one. 2000 Mathematics Subject Classification: 15A30, 16S50, 16W10, 46H05, 47B48 Keywords and Phrases: Invertibility, commutativity, Lie and Jordan isomorphisms, rank.

INTRODUCTION. What came to be called the "linear preserver problems" are questions on characterising linear maps on spaces of matrices or spaces of operators (or more generally on algebras) that preserve certain properties. There has been a great deal of research in this area, especially on spaces of matrices, with results dating back to 1897 (see Theorem 0 below). We refer the reader to the expository articles [LT1, LT2]. There has also been some research activity for maps on Banach algebras, algebras of operators, abstract rings, etc. Possibly the earliest result on this subject is Frobenius' characterization, in 1897, of determinant preserving linear maps which we state presently. The transpose of a matrix $x$ is denoted by $x^t$.

THEOREM 0 (Frobenius [Fr]) Let $\phi$ be a determinant preserving linear map on the space of all (real or complex) $n \times n$ matrices, i.e., $\det\phi(x) = \det(x)$ for every matrix $x$. Then there exist invertible matrices $b$ and $c$ with $\det(bc) = 1$, such that either $\phi(x) = bxc$ for every $x$ or $\phi(x) = bx^tc$ for every $x$.

Three of the most appealing linear preserver problems, in my view, are invertibility preservers, because of their connection with algebra isomorphisms and Jordan isomorphisms, commutativity preservers, because of their connection with Lie isomorphisms, and rank one preservers, because many other preserver problems are reduced to them. In this expository article, we will concentrate on these three problems. The discussion that follows will be far from encyclopedic, and the emphasis will reflect the author's experience.

1. INVERTIBILITY PRESERVING MAPS AND JORDAN ISOMORPHISMS

Let $\mathcal{A}$ and $\mathcal{B}$ be algebras with identity. A linear map $\varphi$ from $\mathcal{A}$ to $\mathcal{B}$ is called unital if $\varphi(1) = 1$ and is called invertibility preserving if $\varphi(a)$ is invertible in $\mathcal{B}$ for every invertible $a \in \mathcal{A}$. It is called a Jordan homomorphism if $\varphi(ab + ba) = \varphi(a)\varphi(b) + \varphi(b)\varphi(a)$ for all $a$ and $b \in \mathcal{A}$, or equivalently $\varphi(a^2) = \varphi(a)^2$ for every $a \in \mathcal{A}$. [...]

THEOREM 1.1. ([S]) Let $\varphi$ be a unital bijective linear map from $\mathcal{L}(X)$ onto $\mathcal{L}(Y)$, where $\mathcal{L}(X)$ denotes the algebra of bounded linear operators on a Banach space $X$. Then the following conditions are equivalent.
(a) $\varphi$ preserves invertibility.
(b) $\varphi$ is a Jordan isomorphism.
(c) $\varphi$ is either an isomorphism or an anti-isomorphism.
(d) Either (i) $\varphi(T) = A^{-1}TA$ for every $T \in \mathcal{L}(X)$, where $A : Y \to X$ is an isomorphism; or (ii) $\varphi(T) = B^{-1}T^*B$ for every $T \in \mathcal{L}(X)$, where $B : Y \to X'$ is an isomorphism.
In particular such maps are automatically continuous in any of the usual topologies on $\mathcal{L}(X)$.

We should point out that the equivalence of (b) and (c) is true for any additive map from a ring onto a prime ring [H1, pp. 47-51], and that the equivalence of (c) and (d) includes a classical theorem asserting that every automorphism of $\mathcal{L}(X)$ is inner, i.e., of the form $x \mapsto a^{-1}xa$. Furthermore, the case of a nonunital map $\varphi$ can be reduced to the unital case by considering the map $\psi$ defined by $\psi(x) = \varphi(1)^{-1}\varphi(x)$. We state the conclusion formally. [...] Consequently such a map takes one of the forms $\varphi(T) = ATB$ or $\varphi(T) = AT^*B$ for invertible operators $A$ and $B$ between the relevant spaces.

The proof of Theorem 1.1 in [S] and the related result in [JS] proceed by first characterising rank one operators in terms of the spectrum. This implies that an invertibility preserving map preserves the property of having rank one. The spectrum of an element $a$ is denoted by $\operatorname{spec}(a)$.

THEOREM 1.2. ([JS], [So]) For an operator $R \in \mathcal{L}(X)$, the following conditions are equivalent:
(i) $\operatorname{rank} R \leq 1$.
(ii) For every $T \in \mathcal{L}(X)$ and every pair of distinct scalars $\alpha$ and $\beta$, $\operatorname{spec}(T + \alpha R) \cap \operatorname{spec}(T + \beta R) \subseteq \operatorname{spec}(T)$.
(iii) For every $T \in \mathcal{L}(X)$, there exists a compact subset $K_T$ of the complex plane such that $\operatorname{spec}(T + \alpha R) \cap \operatorname{spec}(T + \beta R) \subseteq K_T$ for all distinct scalars $\alpha$ and $\beta$.
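To see why a rank one perturbation forces the spectral inclusion in (ii), the following is a sketch of the implication (i) ⇒ (ii) in the finite-dimensional case only. The restriction to matrices and the factorisation $R = uv^t$ are assumptions made here for illustration; the operator case treated in [JS] and [S] requires a different argument.

```latex
% Assumption: X = C^n and R = u v^t with column vectors u, v (so rank R <= 1).
% By the matrix determinant lemma, det(M + alpha u v^t) is an affine function of alpha:
\[
  p_\lambda(\alpha) := \det(T + \alpha u v^{t} - \lambda I)
  = \det(T - \lambda I) + \alpha\, v^{t} \operatorname{adj}(T - \lambda I)\, u .
\]
% If lambda lies in spec(T + alpha R) and in spec(T + beta R) with alpha != beta,
% then the affine function p_lambda vanishes at two distinct points, hence identically:
\[
  p_\lambda(\alpha) = p_\lambda(\beta) = 0,\quad \alpha \neq \beta
  \;\Longrightarrow\; p_\lambda \equiv 0
  \;\Longrightarrow\; p_\lambda(0) = \det(T - \lambda I) = 0 ,
\]
% that is, lambda belongs to spec(T).
```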

In a different direction, results of Gleason [G] and Kahane-Zelazko [KZ], refined by Zelazko [Z] show that every unital invertibility preserving linear map from a Banach algebra A into a semi-simple commutative Banach algebra B is multiplicative. (See also [RS]). Additional related results are in [Au], [CHNRR], and [Ru]. Articles [CHNRR] and [Ru] contain similar results on invertibility preserving positive linear maps on C*-algebras and von-Neumann algebras respectively.

The commutativity assumption in [G] and [KZ] is quite crucial. It would be a major advance if the conclusion held for noncommutative algebras. More precisely, we pose this question.

Question. Let $\mathcal{A}$ be a semi-simple Banach algebra and let $\varphi$ be a unital bijective linear map on $\mathcal{A}$. If $\varphi$ preserves invertibility, must it be a Jordan isomorphism?

Aupetit [Au2] has recently announced a proof when $\mathcal{A}$ is a von Neumann algebra. Perhaps the next step is to prove the result for C*-algebras. We close this section with a counterexample. Another example may be found in [Au1; p. 28].

EXAMPLE. Let $\mathcal{A}$ be the algebra of $4 \times 4$ matrices of the form
$$\begin{bmatrix} A & B \\ 0 & C \end{bmatrix},$$
where $A$, $B$, $C$ are $2 \times 2$ matrices, and let
$$\varphi \begin{bmatrix} A & B \\ 0 & C \end{bmatrix} = \begin{bmatrix} A & B^t \\ 0 & C \end{bmatrix}.$$
It is straightforward to verify that $\varphi$ is unital and preserves invertibility, but that it is not a Jordan homomorphism. Other examples may be constructed by taking $\mathcal{A}$ to be a radical algebra with identity adjoined and $\varphi$ a bijective unital linear mapping sending the radical to itself.
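The verification claimed in the example can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it uses integer matrices, the helper names (`block`, `phi`, `det`, `matmul`) are hypothetical, and the witness matrix `a` is one convenient choice rather than anything taken from the paper.

```python
import random

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    # Laplace expansion along the first row (adequate for 4 x 4 integer matrices)
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def block(a, b, c):
    # Assemble the 4 x 4 matrix [[A, B], [0, C]] from 2 x 2 blocks
    return [a[0] + b[0], a[1] + b[1], [0, 0] + c[0], [0, 0] + c[1]]

def phi(m):
    # Transpose the upper-right 2 x 2 block and keep the other blocks
    out = [row[:] for row in m]
    for i in range(2):
        for j in range(2):
            out[i][2 + j] = m[j][2 + i]
    return out

rng = random.Random(0)

def rand2():
    return [[rng.randint(-3, 3) for _ in range(2)] for _ in range(2)]

# det of a block triangular matrix is det(A) * det(C), which phi does not touch,
# so phi maps invertible elements to invertible elements (and conversely).
samples = [block(rand2(), rand2(), rand2()) for _ in range(100)]
dets_agree = all(det(phi(m)) == det(m) for m in samples)

# A witness that phi is not a Jordan homomorphism: phi(a^2) != phi(a)^2
a = block([[0, 1], [0, 0]], [[1, 0], [0, 1]], [[0, 0], [0, 0]])
jordan_fails = phi(matmul(a, a)) != matmul(phi(a), phi(a))
print(dets_agree, jordan_fails)
```

For this witness $B = I$, so $\varphi(a) = a$, while the upper-right block of $a^2$ is the non-symmetric matrix $AB + BC = A$; transposing it changes the matrix, which is why $\varphi(a^2) \neq \varphi(a)^2$.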

2. COMMUTATIVITY PRESERVING MAPS AND LIE ISOMORPHISMS

A linear map $\varphi$ from an algebra $\mathcal{A}$ to an algebra $\mathcal{B}$ is said to be commutativity preserving if $\varphi(a)\varphi(b) = \varphi(b)\varphi(a)$ whenever $ab = ba$. Under the Lie product $[a, b] = ab - ba$, the algebra $\mathcal{A}$ becomes a Lie algebra, and a Lie homomorphism is a linear map $\varphi$ satisfying $\varphi([a, b]) = [\varphi(a), \varphi(b)]$ for all $a$ and $b$. In fact, it is a standard result (see, e.g., [Hu, Chapter V]) that every Lie algebra $\mathfrak{g}$ may be embedded as a Lie subalgebra of an associative algebra, namely the universal enveloping algebra of $\mathfrak{g}$ equipped with the product $[a, b]$. As usual, a Lie isomorphism is a bijective Lie homomorphism. We note that if $\alpha$ is an (associative) isomorphism or the negative of an anti-isomorphism from $\mathcal{A}$ to $\mathcal{B}$ and $\gamma$ is a linear map from $\mathcal{A}$ into the centre of $\mathcal{B}$ such that $\gamma(ab - ba) = 0$ for every $a$ and $b$ in $\mathcal{A}$, then $\alpha + \gamma$ is a Lie isomorphism, provided it is injective. We may ask for sufficient conditions on algebras $\mathcal{A}$ and $\mathcal{B}$ for the converse to hold.
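The claim that $\alpha + \gamma$ respects the Lie product can be checked in two lines. The derivation below is a sketch using the notation of the preceding paragraph; the symbol $\theta$ for the anti-homomorphism is introduced here for illustration only.

```latex
% gamma maps into the centre of B and vanishes on commutators.
% Case 1: alpha is an associative homomorphism.
\[
  (\alpha+\gamma)([a,b]) = \alpha(ab-ba) + \gamma(ab-ba)
  = [\alpha(a),\alpha(b)]
  = [\alpha(a)+\gamma(a),\, \alpha(b)+\gamma(b)] ,
\]
% the last step because central elements drop out of every commutator.
% Case 2: alpha = -theta, where theta is an anti-homomorphism, theta(ab) = theta(b) theta(a).
\[
  \alpha([a,b]) = -\theta(ab) + \theta(ba)
  = \theta(a)\theta(b) - \theta(b)\theta(a)
  = [\theta(a),\theta(b)] = [\alpha(a),\alpha(b)] ,
\]
% since the two minus signs in [-theta(a), -theta(b)] cancel.
```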

Evidently every Lie isomorphism preserves commutativity. [...]

$$T \mapsto cA^{-1}TA + f(T)I \quad\text{or}\quad T \mapsto cA^{-1}T^{*}A + f(T)I,$$

where $c$ is a scalar, $T^*$ is either the adjoint or the transpose or some other anti-isomorphism (depending on the space considered), $A$ is an invertible operator (perhaps a unitary), and $f$ is a linear functional. Consequently, the results may be stated as showing that every such map is a linear combination of a Lie isomorphism and a map with central range.

Recently, algebras of triangular or block-triangular matrices and their infinite dimensional generalisations have received a lot of attention. In the remainder of this section, we will discuss results on commutativity preservers and on Lie isomorphisms for some such algebras. Let $T_n(F)$ denote the algebra of upper triangular $n$ by $n$ matrices over an arbitrary field $F$. The "transpose" of an $n \times n$ matrix $T$ with respect to the "anti-diagonal", i.e., the diagonal that includes the positions $(j, n-j+1)$, is denoted by $T^+$. It is easy to see that the mapping $T \mapsto T^+$ is an anti-isomorphism. Indeed it is a composition of the usual transpose and an inner automorphism induced by the matrix $J := [\delta_{i,\,n-j+1}]$, where $\delta_{ij}$ is the Kronecker delta symbol.

THEOREM 2.1. [MS1] Let $F$ be an arbitrary field and $\varphi$ a linear map from $T_n(F)$, the algebra of upper triangular matrices, into itself. Assume that $n \geq 3$. The following conditions are equivalent.
(a) $\varphi$ preserves commutativity in both directions.
(b) There exist a non-zero scalar $c \in F$, a linear functional $f$ on $T_n$ and an invertible matrix $S \in T_n$ such that
$$\varphi(T) = cS^{-1}TS + f(T)I \quad\text{or}\quad \varphi(T) = cS^{-1}T^{+}S + f(T)I.$$
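(c) There exist a Lie isomorphism $\alpha$ of $T_n(F)$, a non-zero scalar $c \in F$, and a linear mapping $f$ from $T_n(F)$ into its centre such that $\varphi = c\alpha + f$.

THEOREM 2.2. Let $\varphi : T_n(F) \to T_n(F)$ be a linear map. Then $\varphi$ is a Lie automorphism of $T_n(F)$ if and only if $\varphi$ takes one of the following forms:
$$\varphi(T) = S^{-1}TS + \operatorname{tr}(TD)I, \quad\text{or}\quad \varphi(T) = -S^{-1}T^{+}S + \operatorname{tr}(TD)I,$$
where $S \in T_n(F)$ is invertible, tr denotes the trace and $D$ is a diagonal matrix with $\operatorname{tr}(D) \neq -1$.

We note that the result above implies that every Lie isomorphism [...]
$$\varphi(T) = S^{-n}A^{-1}TAS^{n} + \tau(T)I, \quad\text{or}$$
$$\varphi(T) = -S^{-n}A^{-1}T^{+}AS^{n} + \tau(T)I,$$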

where $A$ is an invertible element of $T_\infty$, $S$ is the bilateral shift, $n$ is an integer, and $\tau$ is a linear functional on $T_\infty$ that annihilates all commutators.

Also in the finite dimensional spaces, the following result about block triangular algebras was proved in [MS2]. We start with a definition. For every finite sequence of positive integers $n_1, n_2, \dots, n_k$ satisfying $n_1 + n_2 + \dots + n_k = n$, we associate an algebra $T(n_1, n_2, \dots, n_k)$ consisting of all $n \times n$ matrices of the form
$$A = \begin{bmatrix} A_{11} & A_{12} & \dots & A_{1k} \\ 0 & A_{22} & \dots & A_{2k} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & A_{kk} \end{bmatrix},$$
where $A_{ij}$ is an $n_i \times n_j$ matrix. We call such an algebra a block upper triangular algebra.

THEOREM 2.4. Let $\mathcal{A} = T(n_1, n_2, \dots, n_r)$ and let $\mathcal{B} = T(m_1, m_2, \dots, m_s)$ be block upper triangular algebras in $M_n$ and $M_m$ respectively, and let $\varphi$ be a Lie isomorphism from $\mathcal{A}$ onto $\mathcal{B}$. Then $m = n$, $r = s$ and there exist an invertible matrix $B \in \mathcal{B}$ and a linear functional $\tau$ on $\mathcal{A}$ satisfying $\tau(I) \neq -1$ such that either
(a) $n_i = m_i$ and $\varphi(T) = B^{-1}TB + \tau(T)I$, or
(b) $n_i = m_{r-i+1}$ and $\varphi(T) = -B^{-1}T^{+}B + \tau(T)I$.
The mapping $\tau$ is given by $\tau(T) = \operatorname{tr}(TD)$, where $D$ is a diagonal matrix such that $\operatorname{tr}(D) \neq -1$ and the diagonal entries in every one of the blocks that determine $\mathcal{A}$ are identical.
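The anti-diagonal transpose $T \mapsto T^+$ and the "negative anti-isomorphism plus trace term" form that recurs in this section can be spot-checked on upper triangular integer matrices. This is a minimal sketch under stated assumptions: the function names are hypothetical, $S$ is taken to be the identity, and only the bracket-preserving property is tested, not bijectivity.

```python
import random

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(x, y):
    # Lie product [x, y] = xy - yx
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(xy, yx)]

def anti_transpose(t):
    # T^+ : reflect across the anti-diagonal; in 0-based indices (i, j) -> (n-1-j, n-1-i)
    n = len(t)
    return [[t[n - 1 - j][n - 1 - i] for j in range(n)] for i in range(n)]

def random_upper(n, rng):
    return [[rng.randint(-5, 5) if j >= i else 0 for j in range(n)]
            for i in range(n)]

def lie_map(t, d):
    # phi(T) = -T^+ + tr(TD) I with D = diag(d); the case S = I of the second form
    n = len(t)
    trace_td = sum(t[i][i] * d[i] for i in range(n))
    plus = anti_transpose(t)
    return [[-plus[i][j] + (trace_td if i == j else 0) for j in range(n)]
            for i in range(n)]

rng = random.Random(0)
d = [2, 0, 5]  # hypothetical diagonal of D
anti_ok = True
lie_ok = True
for _ in range(50):
    a, b = random_upper(3, rng), random_upper(3, rng)
    # (ab)^+ = b^+ a^+ : the map reverses products
    anti_ok &= anti_transpose(matmul(a, b)) == matmul(anti_transpose(b), anti_transpose(a))
    # phi([a, b]) = [phi(a), phi(b)] : commutators of triangular matrices have zero
    # diagonal, so the trace term vanishes on them
    lie_ok &= lie_map(bracket(a, b), d) == bracket(lie_map(a, d), lie_map(b, d))
print(anti_ok, lie_ok)
```

The trace term is harmless because $[a, b]$ has zero diagonal when $a$ and $b$ are upper triangular, so $\operatorname{tr}([a,b]D) = 0$ for every diagonal $D$.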

3. RANK ONE PRESERVING MAPS

A map $\varphi$ from a space $S_1$ of matrices into a space $S_2$ of matrices is said to preserve matrices of rank one if $\varphi(T)$ is of rank one whenever $T$ has rank one. It is said to preserve rank one matrices in both directions when $\varphi(T)$ has rank one if and only if $T$ has rank one. [...] often involve rank-one preservers. Classifying isomorphisms of several types of operator algebras is frequently accomplished by exploiting the fact that they preserve rank one operators; see, e.g., [Da; Chapter 17]. Although the forms of rank one preservers are very similar to the forms of other preservers discussed in the previous sections, we describe them slightly differently. By a left multiplication on an algebra $\mathcal{A}$ we mean a mapping $L_a$ defined by $L_a(x) = ax$ for every $x \in \mathcal{A}$, where $a$ is an element of $\mathcal{A}$. Right multiplications $R_a$ are defined analogously.

The linear rank one preservers on the space of all $n \times n$ matrices were characterized by Marcus and Moyls [MM]. They show that every such map is a composition of a left multiplication $L_A$ by an invertible matrix $A$, a right multiplication $R_B$ by an invertible matrix $B$, and possibly the transpose map. For related results, and a summary of similar results obtained from 1960 until 1989, we refer to [Lo] and the references therein. In this section, we discuss more recent results about additive (not necessarily linear) maps that preserve rank one, especially on triangular matrix algebras. In [OS2], Omladic and Semrl characterized surjective additive rank one preserving maps on the space of finite rank operators on real or complex Banach spaces. In the case of finite dimensional spaces, they show that every such map is a composition of the three types of maps described above and a fourth type induced by an automorphism of the underlying field, which we describe presently. Assume that $c \mapsto \bar{c}$ is an automorphism of the underlying field $F$, and $C = [c_{ij}] \in M_{m \times n}(F)$. We denote the matrix $[\bar{c}_{ij}]$ by $\bar{C}$. Evidently the map $C \mapsto \bar{C}$ preserves every rank. We say that $C \mapsto \bar{C}$ is the map induced on the space of matrices by the field-automorphism $c \mapsto \bar{c}$. We shall make use of the transpose with respect to the anti-diagonal, $T \mapsto T^+$, described in §2. We now define another type of rank one preserver which appears in [BS].

(3.1) Let each of $f_1, f_2, \dots,$
$f_n$ be an additive mapping from $F$ to $F$ such that $f_1$ is bijective, and let $\mathbf{f} = (f_1, f_2, \dots, f_n)$. Define a mapping $\hat{\mathbf{f}}$ on a triangular algebra $\mathcal{A} = T(n_1, \dots, n_k)$, with $n_1 = 1$, by
$$\hat{\mathbf{f}} \begin{pmatrix} c_{11} & c_{12} & \dots & c_{1n} \\ 0 & c_{22} & \dots & c_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & c_{nn} \end{pmatrix} = \begin{pmatrix} f_1(c_{11}) & f_2(c_{11}) + c_{12} & \dots & f_n(c_{11}) + c_{1n} \\ 0 & c_{22} & \dots & c_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & c_{nn} \end{pmatrix}.$$
This is a surjective additive mapping on $\mathcal{A}$ and it preserves rank one matrices, but only when $n_1 = 1$.

(3.2) For $\mathbf{f}$ and $f_1, f_2, \dots, f_n$ as above, define a mapping $\check{\mathbf{f}}$ on a triangular algebra $\mathcal{A} = T(n_1, \dots, n_k)$, with $n_k = 1$, in a similar fashion except that the "action" is on the last column instead of the first row; more precisely, $\check{\mathbf{f}}(C) = (\hat{\mathbf{f}}(C^+))^+$. Again this is an additive mapping on $\mathcal{A}$ preserving rank one matrices, but only when $n_k = 1$.

We now present a result from [BS].

THEOREM 3.3. [BS] Let $\mathcal{A} = T(n_1, \dots, n_k)$ be a block upper triangular algebra in $M_n(F)$ such that $\mathcal{A} \neq T_2(F)$. Let $\varphi : \mathcal{A} \to \mathcal{A}$ be a surjective additive mapping that preserves rank one matrices. Then $\varphi$ is a composition of some or all of the following maps:
(i) Left multiplication by an invertible matrix in $\mathcal{A}$.
(ii) Right multiplication by an invertible matrix in $\mathcal{A}$.
(iii) The map $C \mapsto \bar{C}$ induced by a field automorphism $c \mapsto \bar{c}$ of $F$.
(iv) The map $\hat{\mathbf{f}}$ defined in 3.1 above, but only when $n_1 = 1$.
(v) The map $\check{\mathbf{f}}$ defined in 3.2 above, but only when $n_k = 1$.
(vi) The transpose with respect to the anti-diagonal, $T \mapsto T^+$. This is present only when $\mathcal{A} = \mathcal{A}^+$, i.e., $n_j = n_{k-j+1}$ for every $j$.

COROLLARY 3.4. If [...]

REFERENCES

[A1] B. Aupetit, Propriétés spectrales des algèbres de Banach, Lecture Notes in Mathematics, No. 735, Springer-Verlag, New York, 1979.
[A2] B. Aupetit, Spectrum preserving linear maps, preprint.
[AMo] B. Aupetit and H. du T. Mouton, Spectrum-preserving linear mappings in Banach algebras, Studia Math. 109 (1994), 91-100.
[BS] J. Bell and A. R. Sourour, Additive rank-one preserving mappings on triangular matrix algebras, Linear Algebra Appl. 312 (2000), 13-33.
[Br] M. Bresar, Commuting traces of biadditive mappings, commutativity preserving mappings and Lie mappings, Trans. Amer. Math. Soc. 335 (1993), 525-546.
[BM] M. Bresar and C. R. Miers, Commutativity preserving mappings of von Neumann algebras, Can. J. Math. 45 (1993), 695-708.
[CL] G. H. Chan and M. H. Lim, Linear transformations on symmetric matrices that preserve commutativity, Linear Algebra Appl. 47 (1982), 11-22.
[CHNRR] M. D. Choi, D. Hadwin, E. Nordgren, H. Radjavi and P. Rosenthal, On positive linear maps preserving invertibility, J. Funct. Anal. 59 (1984), 462-469.
[CJR] M. D. Choi, A. A. Jafarian and H. Radjavi, Linear maps preserving commutativity, Linear Algebra Appl. 87 (1987), 227-241.
[Da] K. R. Davidson, Nest Algebras, Pitman Research Notes in Mathematics, no. 191, Longman Scientific and Technical, London and New York, 1988.
[Di] J. Dieudonné, Sur une généralisation du groupe orthogonal à quatre variables, Arch. Math. 1 (1949), 282-287.
[Do] D. Dokovic, Automorphisms of the Lie algebra of upper triangular matrices over a connected commutative ring, J. Algebra 170 (1994), 101-110.
[F] G. Frobenius, Über die Darstellung der endlichen Gruppen durch lineare Substitutionen, Sitzungsber. Deutsch. Akad. Wiss. Berlin (1897), 994-1015.
[G] A. Gleason, A characterization of maximal ideals, J. Analyse Math. 19 (1967), 171-172.
[H1] I. N. Herstein, On the Lie and Jordan rings of a simple associative ring, Amer. J. Math. 77 (1955), 279-285.
[H2] I. N. Herstein, Topics in Ring Theory, Chicago Lecture Notes in Mathematics, University of Chicago Press, Chicago and London, 1969.
[Hu] J. E. Humphreys, Introduction to Lie Algebras and Representation Theory, Graduate Texts in Math. 9, Springer-Verlag, New York, Heidelberg, Berlin, 1972.
[JS] A. Jafarian and A. R. Sourour, Spectrum preserving linear maps, J. Funct. Anal. 66 (1986), 255-261.
[KZ] J. P. Kahane and W. Zelazko, A characterization of maximal ideals in commutative Banach algebras, Studia Math. 29 (1968), 339-343.
[K] I. Kaplansky, Algebraic and Analytic Aspects of Operator Algebras, Regional Conference Series in Math. 1, Amer. Math. Soc., Providence, 1970.
[LT1] C. K. Li and N. K. Tsing, ed., A survey of linear preserver problems, Linear and Multilinear Algebra 33 (1992), 1-129.
[LT2] C. K. Li and N. K. Tsing, Linear preserver problems: A brief introduction and some special techniques, Linear Algebra Appl. 162-164 (1992), 217-235.
[Lo] R. Loewy, Linear transformations which preserve or decrease rank, Linear Algebra Appl. 121 (1989), 151-161.
[MS1] L. Marcoux and A. R. Sourour, Commutativity preserving linear maps and Lie automorphisms of triangular matrix algebras, Linear Algebra Appl. 288 (1999), 89-104.
[MS2] L. Marcoux and A. R. Sourour, Lie isomorphisms of nest algebras, J. Funct. Anal. 164 (1999), 163-180.
[Ma] M. Marcus, Linear transformations on matrices, J. Nat. Bureau Standards 75B (1971), 107-113.
[MM] M. Marcus and B. N. Moyls, Transformations on tensor product spaces, Pacific J. Math. 9 (1959), 1215-1221.
[MP] M. Marcus and R. Purves, Linear transformations on algebras of matrices: The invariance of the elementary symmetric functions, Canad. J. Math. 11 (1959), 383-396.
[Ma1] W. S. Martindale, Lie isomorphisms of primitive rings, Proc. Amer. Math. Soc. 14 (1963), 909-916.
[Ma2] W. S. Martindale, Lie isomorphisms of simple rings, J. London Math. Soc. 44 (1969), 213-221.
[Mi] C. R. Miers, Lie homomorphisms of operator algebras, Pacific J. Math. 38 (1971), 717-735.
[O] M. Omladic, On operators preserving commutativity, J. Funct. Anal. 66 (1986), 105-122.
[OP1] M. Omladic and P. Semrl, Spectrum-preserving additive maps, Linear Algebra Appl. 153 (1991), 67-72.
[OP2] M. Omladic and P. Semrl, Additive mappings preserving operators of rank one, Linear Algebra Appl. 182 (1993), 239-256.
[RS] M. Roitman and Y. Sternfeld, When is a linear functional multiplicative?, Trans. Amer. Math. Soc. 267 (1981), 111-124.
[Ra] H. Radjavi, Commutativity-preserving operators on symmetric matrices, Linear Algebra Appl. 61 (1984), 219-224.
[Ru] B. Russo, Linear mappings of operator algebras, Proc. Amer. Math. Soc. 17 (1966), 1019-1022.
[S] A. R. Sourour, Invertibility preserving linear maps, Trans. Amer. Math. Soc. 348 (1996), 13-30.
[W] W. Watkins, Linear maps that preserve commuting pairs of matrices, Linear Algebra Appl. 14 (1976), 29-35.
[Ze] W. Zelazko, A characterization of multiplicative linear functionals in complex Banach algebras, Studia Math. 30 (1968), 83-85.

Department of Mathematics and Statistics University of Victoria Victoria, British Columbia Canada V8W 3P4

Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 223-245)


PREDICTION: ADVANCES AND NEW RESEARCH

Essam K. AL-Hussaini Mathematics Department, University of Assiut, Assiut, Egypt

ABSTRACT

Prediction is reviewed and the most recent advances in the area are presented. An objective of this paper is to study Bayesian multisample prediction and to give a concise form for the predictive density function of the $r$th observable in sample $j$ based on the informative sample(s). Applications are shown to a general class of population distributions which specializes to a wide spectrum of life testing distribution models. The uncertainty about the true value of the parameter(s) is measured by a general class of prior density functions.


1. INTRODUCTION

Statistical prediction is the problem of inferring the values of unknown observables ('future variables'), or functions of such observables, from currently available 'informative' observations. As in estimation, a predictor can be either a point or an interval predictor. Parametric and nonparametric (distribution-free) prediction have been considered in the literature. Frequentist and Bayesian approaches have been used to obtain predictors and study their properties. The maximum likelihood predictor (MLP), best linear unbiased predictor (BLUP) and best linear invariant predictor (BLIP) are examples of frequentist point predictors. A review of frequentist prediction intervals was made by Patel (1989). In parametric prediction, he covered the results obtained when the underlying population distributions are discrete (Poisson, binomial and negative binomial) and continuous (normal, lognormal, exponential, Weibull, gamma, inverse Gaussian, Pareto and increasing failure rate). He also presented nonparametric prediction intervals, among other results. Nagaraja (1995) surveyed prediction results that are particularly associated with the exponential distributions. The BLUP, BLIP, MLP, Bayes predictors, prediction intervals and regions are among the topics surveyed. A more recent review of point and interval prediction of order statistics was made by Kaminsky and Nelson (1998), which covered linear and maximum likelihood point predictions and interval prediction based on pivotals and on best linear predictors, Bayesian prediction intervals and model shifts. Balakrishnan and Rao (1997) studied large sample approximations to the BLUP based on progressively censored samples. Seshadri (1999) examined the methods of prediction intervals for Inverse


Gaussian observables that were presented by Chhikara and Guttman (1982), Padgett (1982) and Padgett and Tsoi (1986). Nonparametric prediction was considered by Fligner and Wolfe [(1976), (1979a), (1979b)], Guilbaud (1983) and Johnson et al. (1999), among others. The problem of prediction can be solved fully within the Bayesian framework [Geisser (1993)]. Several researchers have studied Bayesian prediction. Among others are Dunsmore [(1974), (1976), (1983)], Geisser [(1984), (1986), (1990), (1993)], Lingappaiah [(1978), (1979), (1980), (1986), (1989)], AL-Hussaini and Jaheen [(1995), (1996), (1999)], AL-Hussaini [(1999a), (1999b)], Lee and Lio (1999), Corcuera and Giummole (1999) and AL-Hussaini, Nigm and Jaheen (2000). The two books by Aitchison and Dunsmore (1975) and Geisser (1993), which are primarily concerned with Bayes prediction, give illustrative examples, analysis and possible applications. A wide range of potential applications of statistical prediction includes density function estimation, calibration, classification, regulation, model comparison and model criticism. For details and references, see, for example, Bernardo and Smith (1994). A growing interest in predicting future records has arisen in the last two decades. For example, the BLUP of future records was obtained by Ahsanullah (1980) when the two-parameter exponential was the underlying distribution. Nagaraja (1984) studied the BLUP and BLIP of records under the Weibull model. Doganaksoy and Balakrishnan (1997) suggested a simple way to obtain the BLUP of records. Interval prediction of records was studied by Dunsmore (1983), Ahsanullah (1990), Balakrishnan and Chan [(1994), (1998)], Balakrishnan, Ahsanullah


and Chan (1995), Berred (1998) and Chan (1998). For details on prediction of records and other interesting topics related to records, see, for example, the book by Arnold, Balakrishnan and Nagaraja (1998). One possible sampling scheme, described by Dunsmore (1974) as plan 2, consists of two random samples: $T_{10}, \dots, T_{n_0 0}$ and $T_{11}, \dots, T_{n_1 1}$. The informative experiment (sample zero) is assumed to be censored (type II), and so only the first $r_0$ order statistics of the failure times of this sample are available. The target is to predict future observables from the second sample (sample one). The two random samples are assumed to be independent and to be drawn from the same population. The uncertainty about the true value of the parameter(s) is measured by some prior density function. Plan 1 of Dunsmore (1974) is a special case of plan 2 where no censoring is imposed on the informative experiment and the complete sample zero is available. Lingappaiah (1978) extended this sampling scheme from two to $M + 1$ independent random samples, all of which are assumed to be drawn from an exponential population. He assumed that the prediction process begins with sample (stage) 1 and moves on sequentially. No prediction is made at stage 0. For a given sample (stage), he proposed the use of the posterior density as a prior for the next stage each time a next stage is required. Such prediction intervals are useful when, for example, a manufacturer wishes to assure the acceptance of $M$ future shipments of equipment. In this paper, the Bayesian multisample prediction problem is formalized in a theorem in Section 2, in which an expression for the predictive density function of the $r_j$th order statistic in sample $j$ is based on the information available in sample zero and the order statistics in previous samples $1, \dots, j-1$, which are assumed to have been observed. In this expression, the posterior

is obtained only once without having to find a posterior at each stage, and the underlying distribution is assumed to be of a general form rather than being exponential as proposed by Lingappaiah (1978). Applications to the Pareto and Weibull (including the exponential and Rayleigh) models are presented in Section 3.

2. PREDICTIVE DENSITY OF THE $r_j$th ORDER STATISTIC IN SAMPLE $j$

Consider a series of $M + 1$ independent random samples drawn from a population whose probability density function is $f_T(t \mid \theta)$ and cumulative distribution function $F_T(t \mid \theta)$, $t > 0$, where $\theta$ is a vector of parameters that belongs to a space $\Omega$ such that $f_T(t \mid \theta) > 0$ for $\theta \in \Omega$. Designate the samples by $0, 1, \dots, M$ and their sizes by $n_0, n_1, \dots, n_M$. It is assumed that only the first $r_0$ order statistics, out of $n_0$, representing times-to-failure in sample zero are available. Schematically, the samples may be as follows:

Sample number    Sample observations             $r_j$th order statistic
0                $T_{10}, \dots, T_{n_0 0}$
1                $T_{11}, \dots, T_{n_1 1}$      $T_{(r_1)1} = Y_{1r_1}$
...              ...                             ...
M                $T_{1M}, \dots, T_{n_M M}$      $T_{(r_M)M} = Y_{Mr_M}$

Let the $r_j$th order statistic of sample $j$ be denoted by $Y_{jr_j} = T_{(r_j)j}$, $r_j = 1, \dots, n_j$; $j = 1, \dots, M$.

In sample zero, we shall assume that the times-to-failure are ordered, so that $t_{10} < \dots < t_{r_0 0} < \dots < t_{n_0 0}$, and that only the first $r_0$ order statistics are available. That is, type II censoring is imposed on sample zero. The


order statistic $T_{(r_j)j}$ represents the time-to-failure number $r_j$ in sample $j$, $j = 1, \dots, M$. At each stage, it is desired to predict an order statistic based on the order statistics at earlier stages and the information available at stage 0. The density function of $Y_{jr_j}$ is known to be given by
$$f_{Y_{jr_j}}(y_{jr_j} \mid \theta) \propto [F_T(y_{jr_j} \mid \theta)]^{r_j - 1}\,[1 - F_T(y_{jr_j} \mid \theta)]^{n_j - r_j}\, f_T(y_{jr_j} \mid \theta). \tag{2.1}$$
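Formula (2.1) is stated up to a constant; the standard normalizing factor is $n_j!/((r_j-1)!\,(n_j-r_j)!)$. The sketch below checks this numerically for one case. The exponential population, the rate `theta`, the choice $r = 3$, $n = 5$ and the helper names are all assumptions made for illustration.

```python
import math

def order_stat_density(y, r, n, cdf, pdf):
    # Density of the r-th order statistic in a sample of size n, cf. (2.1),
    # with the normalizing constant n! / ((r-1)! (n-r)!) included.
    const = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    F = cdf(y)
    return const * F ** (r - 1) * (1.0 - F) ** (n - r) * pdf(y)

def integrate(g, a, b, steps=20000):
    # Composite midpoint rule on [a, b]
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

theta = 2.0  # hypothetical exponential rate

def cdf(t):
    return 1.0 - math.exp(-theta * t)

def pdf(t):
    return theta * math.exp(-theta * t)

# The density should integrate to (approximately) one over the support.
total = integrate(lambda y: order_stat_density(y, 3, 5, cdf, pdf), 0.0, 20.0)
print(round(total, 6))
```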

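The following theorem gives the predictive density function of the $r_j$th order statistic in sample $j$, denoted by $Y_{jr_j}$, given the first $r_0$ order statistics at stage 0, denoted by $\underline{t}_0$, and the previous order statistics of samples $1, \dots, j-1$.

THEOREM. The Bayesian predictive density function of $Y_{jr_j}$ is given by
$$f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}}, \dots, y_{1r_1}, \underline{t}_0) \propto \int_\Omega L_j(\theta; \underline{y}_{r_j})\, \pi_0^*(\theta \mid \underline{t}_0)\, d\theta, \tag{2.2}$$
where
$$L_j(\theta; \underline{y}_{r_j}) \propto \prod_{i=1}^{j} f_{Y_{ir_i}}(y_{ir_i} \mid \theta), \tag{2.3}$$
$$\pi_0^*(\theta \mid \underline{t}_0) \propto L(\theta; \underline{t}_0)\, \pi(\theta), \tag{2.4}$$
$$L(\theta; \underline{t}_0) \propto \Big[\prod_{i=1}^{r_0} f_T(t_{i0} \mid \theta)\Big]\, [1 - F_T(t_{r_0 0} \mid \theta)]^{n_0 - r_0}, \tag{2.5}$$
$f_{Y_{ir_i}}(y_{ir_i} \mid \theta)$ is given by (2.1), $f_T(t_{i0} \mid \theta)$ is the population density evaluated at $t_{i0}$, $\pi(\theta)$ is some given prior density function, $\underline{t}_0$ is the vector of the first $r_0$ order statistics at stage 0, given by $\underline{t}_0 = (t_{10}, \dots, t_{r_0 0})$, and
$$\underline{y}_{r_j} = (y_{1r_1}, \dots, y_{jr_j}). \tag{2.6}$$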

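Proof. Suppose that $\pi(\theta)$ is a given prior density function, $\theta \in \Omega$. Then the posterior density function (based on sample zero) is given by (2.4). The predictive density function of the $r_1$th order statistic in sample 1 is then given by
$$f_{Y_{1r_1}}(y_{1r_1} \mid \underline{t}_0) \propto \int_\Omega f_{Y_{1r_1}}(y_{1r_1} \mid \theta)\, \pi_0^*(\theta \mid \underline{t}_0)\, d\theta, \tag{2.7}$$
where $f_{Y_{1r_1}}(y_{1r_1} \mid \theta)$ is given by (2.1) when $j = 1$. The posterior density function at sample (stage) 1, denoted by $\pi_1^*(\theta \mid y_{1r_1}, \underline{t}_0)$, is defined by
$$\pi_1^*(\theta \mid y_{1r_1}, \underline{t}_0) \propto f_{Y_{1r_1}}(y_{1r_1} \mid \theta)\, \pi_0^*(\theta \mid \underline{t}_0). \tag{2.8}$$
Such a posterior density function is used as a prior for the next sample (stage) 2, so that the predictive density function of $Y_{2r_2}$ is given by
$$f_{Y_{2r_2}}(y_{2r_2} \mid y_{1r_1}, \underline{t}_0) \propto \int_\Omega f_{Y_{2r_2}}(y_{2r_2} \mid \theta)\, \pi_1^*(\theta \mid y_{1r_1}, \underline{t}_0)\, d\theta, \tag{2.9}$$
which, upon the substitution of (2.8), yields
$$f_{Y_{2r_2}}(y_{2r_2} \mid y_{1r_1}, \underline{t}_0) \propto \int_\Omega f_{Y_{2r_2}}(y_{2r_2} \mid \theta)\, f_{Y_{1r_1}}(y_{1r_1} \mid \theta)\, \pi_0^*(\theta \mid \underline{t}_0)\, d\theta. \tag{2.10}$$
Continuing in this line, the Bayesian predictive density function of $Y_{jr_j}$ at stage $j = 1, 2, \dots, M$ is then given by
$$f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}}, \dots, y_{1r_1}, \underline{t}_0) \propto \int_\Omega L_j(\theta; \underline{y}_{r_j})\, \pi_0^*(\theta \mid \underline{t}_0)\, d\theta,$$
where $L_j(\theta; \underline{y}_{r_j})$ is as given by (2.3), $\underline{t}_0$ and $\underline{y}_{r_j}$ by (2.6), $r_j = 1, \dots, n_j$ and $j = 1, \dots, M$.

Remark. Eq. (2.2) gives the predictive density of $Y_{jr_j}$ in the form
$$f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}}, \dots, y_{1r_1}, \underline{t}_0) = K \int_\Omega L_j(\theta; \underline{y}_{r_j})\, \pi_0^*(\theta \mid \underline{t}_0)\, d\theta,$$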


where $K$ is the normalizing constant. It then follows that
$$P[Y_{jr_j} > \nu \mid y_{j-1,r_{j-1}}, \dots, y_{1r_1}, \underline{t}_0] = I(\nu)/I(0), \tag{2.11}$$
where
$$I(\nu) = \int_\nu^\infty \int_\Omega L_j(\theta; \underline{y}_{r_j})\, \pi_0^*(\theta \mid \underline{t}_0)\, d\theta\, dy_{jr_j}, \tag{2.12}$$
and $I(0) = 1/K$. The bounds of a two-sided confidence interval with cover $\tau$ for $Y_{jr_j}$ may thus be obtained by solving the following two equations for the lower and upper bounds, $L_j$ and $U_j$, respectively:
$$\frac{1+\tau}{2} = P[Y_{jr_j} > L_j \mid y_{j-1,r_{j-1}}, \dots, y_{1r_1}, \underline{t}_0] = I(L_j)/I(0), \tag{2.13}$$
$$\frac{1-\tau}{2} = P[Y_{jr_j} > U_j \mid y_{j-1,r_{j-1}}, \dots, y_{1r_1}, \underline{t}_0] = I(U_j)/I(0), \tag{2.14}$$
where $I(0)$, $I(L_j)$ and $I(U_j)$ are obtained by substituting $0$, $L_j$ and $U_j$ in the integral $I(\nu)$, given by (2.12).

Special Cases

(i) Two-sample prediction

This is the case in which $j = 1$, so that $M = 1$. In this case, we only have two samples: sample zero and sample one. The Bayesian predictive density of $Y_{1r_1}$ is then given by (2.7), where $\underline{t}_0$ represents the vector of the first $r_0$ order statistics in informative sample zero and $y_{1r_1}$ is a value of the $r_1$th order statistic in future sample one. In this case, the lower and upper bounds of $Y_{1r_1}$ are obtained by solving the following two equations, which are reduced forms of (2.13) and (2.14):
$$\frac{1+\tau}{2} = P[Y_{1r_1} > L_1 \mid \underline{t}_0] = I(L_1)/I(0), \tag{2.15}$$
$$\frac{1-\tau}{2} = P[Y_{1r_1} > U_1 \mid \underline{t}_0] = I(U_1)/I(0), \tag{2.16}$$
where $I(\nu)$ is given by (2.12) when $j = 1$.


(ii) Predictive density of the smallest observable in sample $j$

If, in (2.2), $r_j = 1$ for $j = 1, \dots, M$, we obtain the Bayesian predictive density function of the smallest observable in sample $j$. In this case,
$$L_j(\theta; \underline{y}_1) \propto \prod_{i=1}^{j} [1 - F_T(y_{i1} \mid \theta)]^{n_i - 1} f_T(y_{i1} \mid \theta), \tag{2.17}$$
where $\underline{y}_1$ is given by (2.6) when $r_j = 1$, $j = 1, \dots, M$. The lower and upper bounds of $Y_{j1}$ are obtained by solving the two equations
$$\frac{1+\tau}{2} = P[Y_{j1} > L_j \mid y_{j-1,1}, \dots, y_{11}, \underline{t}_0] = I(L_j)/I(0), \tag{2.18}$$
$$\frac{1-\tau}{2} = P[Y_{j1} > U_j \mid y_{j-1,1}, \dots, y_{11}, \underline{t}_0] = I(U_j)/I(0), \tag{2.19}$$
where $I(\nu)$ is given by (2.12) when $r_j = 1$, $j = 1, \dots, M$. In the two-sample case ($j = 1$), $L_1(\theta; \underline{y}_1)$ takes the form
$$L_1(\theta; \underline{y}_1) \propto [1 - F_T(y_{11} \mid \theta)]^{n_1 - 1} f_T(y_{11} \mid \theta). \tag{2.20}$$
The lower and upper bounds of $Y_{11}$ are obtained by solving the two equations
$$\frac{1+\tau}{2} = P[Y_{11} > L_1 \mid \underline{t}_0] = I(L_1)/I(0), \tag{2.21}$$
$$\frac{1-\tau}{2} = P[Y_{11} > U_1 \mid \underline{t}_0] = I(U_1)/I(0), \tag{2.22}$$
where $I(\nu)$ is given by (2.12) in which $L_1(\theta; \underline{y}_1)$ is given by (2.20).

(iii) Predictive density of the largest observable in sample $j$

If, in (2.2), $r_j = n_j$, $j = 1, \dots, M$, we obtain the Bayesian predictive density function of the largest observable in sample $j$. In this case,
$$L_j(\theta; \underline{y}_{n_j}) \propto \prod_{i=1}^{j} [F_T(y_{in_i} \mid \theta)]^{n_i - 1} f_T(y_{in_i} \mid \theta), \tag{2.23}$$
where $\underline{y}_{n_j}$ is given by (2.6) when $r_j = n_j$, $j = 1, \dots, M$. The lower and upper bounds of $Y_{jn_j}$ are obtained by solving the two equations (2.13) and (2.14) after replacing $r_j$ by $n_j$.
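As an illustration of the two-sample, smallest-observable case, equations (2.21) and (2.22) can be solved in closed form under one convenient set of assumptions that is not taken from the text: an exponential population with rate $\theta$ and a Gamma prior with shape `a` and rate `b`. Then (2.4) and (2.5) give a Gamma posterior, and $I(\nu)/I(0) = E[e^{-n_1 \theta \nu}]$. All function names and data values below are hypothetical.

```python
import math

def posterior_params(t_obs, n0, a, b):
    # Gamma(a, b) prior (shape a, rate b) on the exponential rate theta.
    # t_obs: the first r0 ordered failure times of sample zero (type II censoring).
    r0 = len(t_obs)
    total_time = sum(t_obs) + (n0 - r0) * t_obs[-1]
    return a + r0, b + total_time  # posterior shape, posterior rate

def survival_smallest(nu, n1, shape, rate):
    # P[Y_11 > nu | t_0] = E[exp(-n1 * theta * nu)] under the Gamma posterior
    return (rate / (rate + n1 * nu)) ** shape

def bound(p, n1, shape, rate):
    # Solve survival_smallest(nu) = p for nu in closed form
    return (rate / n1) * (p ** (-1.0 / shape) - 1.0)

t0 = [0.11, 0.25, 0.40, 0.62]  # hypothetical censored sample, r0 = 4 out of n0 = 10
shape, rate = posterior_params(t0, n0=10, a=2.0, b=1.0)
tau, n1 = 0.95, 5
lower = bound((1 + tau) / 2, n1, shape, rate)  # solves (2.21)
upper = bound((1 - tau) / 2, n1, shape, rate)  # solves (2.22)
print(lower, upper)
```

For other members of the class considered later in the paper the ratio $I(\nu)/I(0)$ generally has no such simple inverse, and the two equations would have to be solved numerically.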

3. A GENERAL CLASS OF POPULATION DISTRIBUTIONS AND A GENERAL CLASS OF PRIORS

Suppose that the population cumulative distribution function (cdf) is of the form
$$F_T(t \mid \theta) = 1 - \exp[-\lambda(t)], \quad t > 0, \tag{3.1}$$
where $\lambda(t) = \lambda(t; \theta)$ is a nonnegative continuous function of $t$ such that $\lambda(t) \to 0$ as $t \to 0^+$ and $\lambda(t) \to \infty$ as $t \to \infty$. The corresponding reliability $R(t \mid \theta)$, hazard rate $h(t \mid \theta)$ and density $f_T(t \mid \theta)$ functions are given, respectively, by
$$R(t \mid \theta) = 1 - F_T(t \mid \theta) = \exp[-\lambda(t)], \tag{3.2}$$
$$h(t \mid \theta) = \lambda'(t), \tag{3.3}$$
$$f_T(t \mid \theta) = \lambda'(t)\exp[-\lambda(t)], \quad t > 0. \tag{3.4}$$
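The relations (3.1)-(3.4) are easy to check numerically for a concrete member of the class. The sketch below uses a Weibull-type choice $\lambda(t) = \alpha t^{\beta}$; this parametrization, the parameter values and the helper names are assumptions for illustration.

```python
import math

def weibull(alpha, beta):
    # Returns (lambda, lambda') for lambda(t) = alpha * t**beta;
    # beta = 1 gives the exponential and beta = 2 a Rayleigh-type model.
    return (lambda t: alpha * t ** beta,
            lambda t: alpha * beta * t ** (beta - 1))

def cdf(lam, t):
    return 1.0 - math.exp(-lam(t))                       # (3.1)

def density(lam, dlam, t):
    return dlam(t) * math.exp(-lam(t))                   # (3.4)

def hazard(lam, dlam, t):
    return density(lam, dlam, t) / (1.0 - cdf(lam, t))   # should equal lambda'(t), (3.3)

lam, dlam = weibull(alpha=0.5, beta=2.0)  # hypothetical parameter values
t = 1.3
h = 1e-6
# Central difference of the cdf approximates the density
numeric_density = (cdf(lam, t + h) - cdf(lam, t - h)) / (2 * h)
print(density(lam, dlam, t), numeric_density, hazard(lam, dlam, t), dlam(t))
```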

Some important distributions that are used in life testing naturally belong to this class. Among others are the Weibull (including the exponential and Rayleigh as special cases), compound Weibull (or the three-parameter Burr type XII), Pareto, beta, Gompertz and compound Gompertz distributions [see AL-Hussaini and Osman (1997)]. Such a general class of distributions, together with a general (natural conjugate) class of prior density functions given by
$$\pi(\theta) \propto C(\theta; \gamma)\exp[-D(\theta; \gamma)], \quad \theta \in \Omega, \tag{3.5}$$
where $\gamma$ is a vector of prior parameters, were suggested by AL-Hussaini (1999) to develop Bayesian prediction bounds for observables from (3.4). It follows from (2.5) that
$$L(\theta; \underline{t}_0) \propto A(\underline{t}_0; \theta)\exp[-B(\underline{t}_0; \theta)], \tag{3.6}$$
where
$$A(\underline{t}_0; \theta) = \prod_{i=1}^{r_0} \lambda'(t_{i0}), \qquad B(\underline{t}_0; \theta) = \sum_{i=1}^{r_0} \lambda(t_{i0}) + (n_0 - r_0)\,\lambda(t_{r_0 0}). \tag{3.7}$$

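The following corollary specializes the above Theorem to class (3.1) of population distributions and class (3.5) of priors.

COROLLARY. If all of the $M + 1$ independent random samples are assumed to be drawn from a population with cumulative distribution function (3.1) and if a prior density is given by (3.5), then the predictive density of $Y_{jr_j}$ is given by
$$f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}}, \dots, y_{1r_1}, \underline{t}_0) \propto \int_\Omega g(\theta)\exp[-h(\theta)]\, d\theta, \tag{3.8}$$
where
$$g(\theta) = b(\underline{y}_{r_j})\, \eta(\theta; \underline{t}_0) \quad\text{and}\quad h(\theta) = \delta(\underline{y}_{r_j}) + \zeta(\theta; \underline{t}_0), \tag{3.9}$$
$$b(\underline{y}_{r_j}) = \prod_{i=1}^{j} \Big\{ \lambda'(y_{ir_i}) \sum_{l_i=0}^{r_i-1} (-1)^{l_i} \binom{r_i - 1}{l_i} \exp[-l_i \lambda(y_{ir_i})] \Big\}, \tag{3.10}$$
$$\eta(\theta; \underline{t}_0) = A(\underline{t}_0; \theta)\, C(\theta; \gamma) \quad\text{and}\quad \delta(\underline{y}_{r_j}) = \sum_{i=1}^{j} m_i \lambda(y_{ir_i}), \tag{3.11}$$
$$m_i = n_i - r_i + 1, \qquad \zeta(\theta; \underline{t}_0) = B(\underline{t}_0; \theta) + D(\theta; \gamma). \tag{3.12}$$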


The proof of this corollary follows by implementing (3.1)-(3.5) in (2.3), where (2.1) and (2.3) yield L

j(8;yr)

ccb(y )exp[-6(y

—~3

—~]

)] , —~3

b(y ) and 6(y ) are given by (3.10) and (3.11), respectively. The posterior —Tj

—Tj

density function at stage zero is obtained by substituting (3.5) and (3.6) in (2.4), to yield TTO*(0 | to) oc 7/(0;io)ezp[-C(0;io] , 6 G fi , where 7?(6';i0) and C(^;*o) are given by (3.11) and (3.12), A^-,6),

(3.13) B^Q)

by (3.7) and C(6;j), D(6;j) are the prior factors appearing in (3.13). 4. APPLICATIONS 4.1 The Pareto-type I model Suppose that T ~ Pareto-type I (a,0) with cdf FT{t | 6) = 1 - ( ^ ) a = 1 - exp[a\n(/3/t)],

t > 0, (a > 0,0 < 0 < d) . (4.1)

Comparing with (3.1), we have \(t) =-aln(0/t)

,

\l{t)=a/t.

(4.2)

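Let $\theta = (\alpha,\beta)$, where both parameters are assumed to be unknown. It follows from (3.7) that

$$\mathcal{A}(\underline{t}_0;\theta) = \alpha^{r_0}/A(\underline{t}_0), \qquad \mathcal{B}(\underline{t}_0;\theta) = -\alpha B(\underline{t}_0;\beta), \tag{4.3}$$

where

$$A(\underline{t}_0) = \prod_{i=1}^{r_0} t_{i0}, \qquad B(\underline{t}_0;\beta) = \sum_{i=1}^{r_0}\ln(\beta/t_{i0}) + (n_0-r_0)\ln(\beta/t_{r_0 0}). \tag{4.4}$$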

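Suppose that a 'generalized' prior, suggested by Lwin (1972), is of the form

$$\pi(\theta) \propto \alpha^{a}\beta^{-1}\exp[-\alpha(\ln c - b\ln\beta)], \qquad \alpha > 0,\ 0 < \beta < d. \tag{4.5}$$

Denote the vector of positive prior parameters by $\gamma = (a,b,c,d)$. Then

$$C(\theta;\gamma) = \alpha^{a}/\beta, \qquad D(\theta;\gamma) = \alpha[\ln c - b\ln\beta]. \tag{4.6}$$

The posterior density (at stage 0) is then given by (3.13), where (3.11), (3.12), (4.3) and (4.6) yield

$$\eta(\theta;\underline{t}_0) = \alpha^{r_0+a}/[\beta A(\underline{t}_0)], \qquad \zeta(\theta;\underline{t}_0) = \alpha[\ln c - b\ln\beta - B(\underline{t}_0;\beta)]. \tag{4.7}$$

It is assumed that $\alpha > 0$ and $0 < \beta < N$, $N = \min\{t_{10}, d\}$. From (3.10),

$$b(\underline{y}_{r_j}) = \prod_{i=1}^{j}\Big(\frac{\alpha}{y_{ir_i}}\Big)\sum_{l_i=0}^{r_i-1}(-1)^{l_i}\binom{r_i-1}{l_i}\exp[l_i\,\alpha\ln(\beta/y_{ir_i})], \tag{4.8}$$

and from (3.11),

$$\delta(\underline{y}_{r_j}) = -\alpha\sum_{i=1}^{j} m_i\ln(\beta/y_{ir_i}). \tag{4.9}$$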

It can be shown, by using (3.8), (3.9), (4.8), (4.9) and some algebraic manipulations, that TT

1

i=i * i r i ii=o

(l v

[# 2j {ff„ + K- + 1) m ^ . } - ^ - 1 ) ] } , yjrj > (3 ,

(4.10)

where ro

j-1

F j j = In c + ( n 0 - r 0 ) In i r o 0 + E l n i i o + E ^ i=l

i #2j = 6+no + E ( m * + ^) ' mi i=l

=

m i + Z

^lnyiri ~ ^ 2 J ' (4-n)

i=l

^ - ' " i + l and ujj = a+ro+j-l.

(4.12)

236 Therefore, for j > 2 and v > (3

1{y) =

rr y;(-i)''( n ~x) ^{nj i=l ii=0

\

i

+ 1)H2j{Hlj

/

+ K- + *)ln ^P r1

i l i = l 2/ir-i

(4.13)

It then follows that a confidence interval for $Y_{jr_j}$ with cover $\tau$ has bounds $L_j$ and $U_j$ that are given by the solution of the two equations

$$\frac{1+\tau}{2} = \frac{I(L_j)}{I(\beta)} \quad\text{and}\quad \frac{1-\tau}{2} = \frac{I(U_j)}{I(\beta)}.$$

If the smallest observable in sample $j$ is to be predicted, then $r_j = 1$ for all values of $j$. In this case, bounds for the confidence interval with cover $\tau$ for $Y_{j1}$ can be obtained explicitly, and the ratio $I(v)/I(\beta)$ takes the form

!W) /(/?)

r-Hij + K + iym/-, -u,,JJ y + fa + l)Zn/?J '

,

L

{

'

. '

where H\j (and HZJ) are given by (4.11) and (4.12) with n=\ and li=0 for i=l,...j. It then follows, from (2.18) and (2.19) that the lower and upper bounds of Yji are given, respectively, by Lj = e ^ [ ( - i r i ) { ( l ± I ) - 1 M ^ 1 . + („. + l)ln/3] - J J y } ] ,

(4.15)

Uj = e*P[(^p[){(^L)-1/Ui\Hii

(4.16)

+ (nj + 1)ln/3] - H^}]

In the two-sample case 0 = 1 ) , (4.10) reduces to

,


Ji=0

^

1

[yirx-ff2i{-ffn + ("i + l ) l n 2 / l n } W 1

'

] " , yin >/? •

So that /(„) = ^ ( - ^ ^ " ^ [ w i K /i=0

^

*

+ lJffa^ffu + Cm + l J l n i / p ] " 1 ,

'

where #11 = In c + (n 0 - r 0 ) In i r o 0 + X)I=i l n (**o) - #21 In d , and #2i = b + no + mx + l±. If, furthermore, ri—1, then /H = Mn

1

+ l ) t f 2 i { # i i + (rii + l ) l n z / ; p ] - 1 ,

v>p.

Therefore, a confidence interval for Yu with cover r has bounds Li and U\, given by Li = expK^-^iHn

+ (m + 1) In/?} - . ^ / ( m + 1) ,

and tfi = e a f p K ^ ) - 1 / ^ ^ ! ! + (ni + l)ln/?} - Hn\/{nx

+ 1).

4.2 The Weibull model

Suppose that $T \sim$ Weibull$(\theta,\beta)$ with cdf

$$F_T(t \mid \theta) = 1-\exp(-\theta t^{\beta}), \qquad t > 0, \tag{4.17}$$

where $\theta$ is an unknown scale parameter and $\beta$ is a known shape parameter. Both parameters are assumed to be positive. It follows, from (3.1), that

$$\lambda(t) = \theta t^{\beta} \quad\text{and}\quad \lambda'(t) = \theta\beta t^{\beta-1}. \tag{4.18}$$

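The likelihood function $L(\theta;\underline{t}_0)$ is of the form (3.6), in which (3.7) and (4.18) yield

$$\mathcal{A}(\underline{t}_0;\theta) = \theta^{r_0}A(\underline{t}_0;\beta), \qquad \mathcal{B}(\underline{t}_0;\theta) = \theta B(\underline{t}_0;\beta), \tag{4.19}$$

where

$$A(\underline{t}_0;\beta) = \beta^{r_0}\prod_{i=1}^{r_0} t_{i0}^{\beta-1}, \qquad B(\underline{t}_0;\beta) = \sum_{i=1}^{r_0} t_{i0}^{\beta} + (n_0-r_0)\,t_{r_0 0}^{\beta}, \tag{4.20}$$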

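and $\underline{t}_0$ is the vector defined in (2.6). Under the assumption that $\beta$ is known, the gamma prior family is closed under sampling from the Weibull distribution. A natural conjugate prior density $\pi(\theta)$ then takes the form of (3.5), where

$$C(\theta;\gamma) = \theta^{a-1}, \qquad D(\theta;\gamma) = b\theta, \tag{4.21}$$

and $\gamma = (a,b)$ is the vector of prior parameters. The posterior density function $\pi_0^*(\theta \mid \underline{t}_0)$ then takes the form (3.13), in which

$$\eta(\theta;\underline{t}_0) = \theta^{r_0+a-1}A(\underline{t}_0;\beta), \qquad \zeta(\theta;\underline{t}_0) = \theta[b + B(\underline{t}_0;\beta)], \tag{4.22}$$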

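where $A(\underline{t}_0;\beta)$ and $B(\underline{t}_0;\beta)$ are given by (4.20). The function $L_j(\theta;\underline{y}_{r_j})$ is of the form (3.8), in which

$$b(\underline{y}_{r_j}) = \theta^{j}A(\underline{y}_{r_j};\beta)\prod_{i=1}^{j}\sum_{l_i=0}^{r_i-1}(-1)^{l_i}\binom{r_i-1}{l_i}\exp[-\theta\,l_i\,y_{ir_i}^{\beta}], \tag{4.23}$$

$$\delta(\underline{y}_{r_j}) = \theta\Big(\sum_{i=1}^{j} m_i\,y_{ir_i}^{\beta}\Big), \tag{4.24}$$

where

$$A(\underline{y}_{r_j};\beta) = \beta^{j}\prod_{i=1}^{j} y_{ir_i}^{\beta-1} \quad\text{and}\quad m_i = n_i - r_i + 1. \tag{4.25}$$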

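It then follows, from (3.8)-(3.12), that the predictive density of $Y_{jr_j}$ is given by

$$f(y_{jr_j} \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0) \propto A(\underline{y}_{r_j};\beta)\sum_{l_1=0}^{r_1-1}\cdots\sum_{l_j=0}^{r_j-1}\Big\{\prod_{i=1}^{j}(-1)^{l_i}\binom{r_i-1}{l_i}\Big\}\Big[b + B(\underline{t}_0;\beta) + \sum_{i=1}^{j}(m_i+l_i)\,y_{ir_i}^{\beta}\Big]^{-(a+r_0+j)}. \tag{4.26}$$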

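The lower and upper bounds of $Y_{jr_j}$ can be obtained by solving (2.13) and (2.14) using numerical integration, since $I(v)$ could not be obtained in closed form. The first order statistic in sample $j$ has a predictive density of the form (4.26) with $r_i = 1$, $i = 1,\ldots,j$.

If, in addition, $j = 1$ (the two-sample case), then $Y_{11}$ has the predictive density

$$f(y_{11} \mid \underline{t}_0) \propto \beta\,y_{11}^{\beta-1}\big[b + B(\underline{t}_0;\beta) + m_1 y_{11}^{\beta}\big]^{-(a+r_0+1)}, \qquad m_1 = n_1 - r_1 + 1. \tag{4.27}$$

In this case ($j = 1$ and $r_j = 1$), $I(v)$ can be obtained in the form

$$I(v) = \int_v^{\infty}\beta\,y_{11}^{\beta-1}\big[b + B(\underline{t}_0;\beta) + m_1 y_{11}^{\beta}\big]^{-(a+r_0+1)}\,dy_{11} = \big[b + B(\underline{t}_0;\beta) + m_1 v^{\beta}\big]^{-(a+r_0)}\big/[m_1(a+r_0)]. \tag{4.28}$$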

The lower and upper bounds of the confidence interval for Yu with cover r are obtained by solving (2.21) and (2.22), so that

and


Remarks

(1) By setting $\beta = 1$ and $\beta = 2$ in the Weibull$(\theta,\beta)$ model, we obtain the exponential and Rayleigh models, respectively, each with parameter $\theta$. Consequently, all of the results obtained for the Weibull model specialize to the exponential and Rayleigh models by setting $\beta = 1$ and $\beta = 2$, respectively.

(2) The predictive density (4.27) is of the form of a Burr type XII density function with parameters $(\beta,\omega,\xi)$, where $\omega = a + r_0$ and $\xi = m_1/[b + B(\underline{t}_0;\beta)]$.

(3) If $T \sim$ Weibull$(\alpha,\beta)$, where both parameters are unknown, prediction bounds for $Y_{jr_j}$ can only be obtained numerically, whether a noninformative or an informative prior is used.
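The closed form (4.28) can be inverted directly. The sketch below is an illustration rather than the authors' code: it assumes the bounds are defined by $I(L)/I(0) = (1+\tau)/2$ and $I(U)/I(0) = (1-\tau)/2$, which corresponds to the Burr type XII survival function $[1+\xi v^{\beta}]^{-\omega}$ of Remark (2). The function names, the sample values and the prior parameters are all hypothetical.

```python
def _scale_and_shape(t_obs, n0, n1, beta, a, b):
    """Return ((b + B)/m1, omega) for the two-sample Weibull case with r_1 = 1."""
    t_sorted = sorted(t_obs)
    r0 = len(t_sorted)
    # B(t_0; beta) as in (4.20): observed failures plus the censored units
    big_b = sum(t ** beta for t in t_sorted) + (n0 - r0) * t_sorted[-1] ** beta
    m1 = n1                      # m_1 = n_1 - r_1 + 1 with r_1 = 1
    omega = a + r0               # Burr XII shape parameter, Remark (2)
    return (b + big_b) / m1, omega


def predictive_survival(v, t_obs, n0, n1, beta, a, b):
    """P(Y_11 > v | t_0) = I(v)/I(0), using the closed form (4.28)."""
    scale, omega = _scale_and_shape(t_obs, n0, n1, beta, a, b)
    return (1.0 + v ** beta / scale) ** (-omega)


def weibull_two_sample_bounds(t_obs, n0, n1, beta, a, b, tau):
    """Lower and upper prediction bounds for Y_11 with cover tau (assumed convention)."""
    scale, omega = _scale_and_shape(t_obs, n0, n1, beta, a, b)

    def invert(p):
        # Solve (1 + v**beta / scale) ** (-omega) = p for v
        return (scale * (p ** (-1.0 / omega) - 1.0)) ** (1.0 / beta)

    return invert((1.0 + tau) / 2.0), invert((1.0 - tau) / 2.0)


# Hypothetical data: r0 = 4 failures observed out of n0 = 6 units on test
observed = [0.3, 0.7, 1.1, 1.6]
lower, upper = weibull_two_sample_bounds(observed, n0=6, n1=5, beta=1.5,
                                         a=2.0, b=1.0, tau=0.9)
print(lower, upper)
```

Setting `beta=1` or `beta=2` gives the exponential and Rayleigh cases of Remark (1) without any other change.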

5. CONCLUDING REMARKS

Several results have been reached in the past three decades regarding prediction. In point prediction, maximum likelihood and linear prediction based on location-scale families have been developed. Results on frequentist and Bayesian interval prediction have been obtained and studied. However, further investigations need to be carried out to study prediction when samples are based on families which are not of location-scale type, and the optimality of the predictors resulting in such cases. Nonlinear prediction, optimality and cost still need further study.

REFERENCES

Ahsanullah, M. (1980). Linear prediction of record values for the two parameter exponential distribution. Ann. Instit. Statist. Math. 32, 363-368.
Ahsanullah, M. (1990). Estimation of the parameters of the Gumbel distribution based on m record values. Comput. Statist. Quart. 6, 231-239.
Aitchison, J. and Dunsmore, I. (1975). Statistical Prediction Analysis. Cambridge University Press, Cambridge.
AL-Hussaini, E.K. (1999)(a). Bayesian prediction under a mixture of two exponential components model based on type I censoring. J. Appl. Statist. Science 8, 173-185.
AL-Hussaini, E.K. (1999)(b). Predicting observables from a general class of distributions. J. Statist. Plann. Infer. 79, 79-91.
AL-Hussaini, E.K. and Jaheen, Z.F. (1995). Bayesian prediction bounds for the Burr type XII model. Commun. Statist.-Theory Meth. 24, 1829-1842.
AL-Hussaini, E.K. and Jaheen, Z.F. (1996). Bayesian prediction bounds for the Burr type XII distribution in the presence of outliers. J. Statist. Plann. Infer. 55, 23-37.
AL-Hussaini, E.K. and Jaheen, Z.F. (1999). Parametric prediction bounds for the future median of the exponential distribution. Statistics 32, 267-275.
AL-Hussaini, E.K.; Nigm, A.M. and Jaheen, Z.F. (2000). Bayesian prediction based on finite mixtures of Lomax components model and type I censoring. Statistics (to appear).
AL-Hussaini, E.K. and Osman, M.I. (1997). On the median of a finite mixture. J. Statist. Comput. Simul. 58, 121-144.
Arnold, B.C.; Balakrishnan, N. and Nagaraja, H.N. (1998). Records. Wiley, New York.
Balakrishnan, N.; Ahsanullah, M. and Chan, P.S. (1995). On the logistic record values and associated inference. J. Appl. Statist. Science 2, 233-248.
Balakrishnan, N. and Chan, P.S. (1994). Record values from Rayleigh and Weibull distributions and associated inference. NIST Special Publication 866, Proceedings of the Conference on Extreme Value Theory and Applications, vol. 3 (Eds., J. Galambos, J. Lechner and E. Simiu), pp. 41-51.
Balakrishnan, N. and Chan, P.S. (1998). On the normal record values and associated inference. Statist. Prob. Letters (to appear).
Balakrishnan, N. and Rao, C.R. (1997). Large sample approximations to the best linear unbiased estimation and best linear unbiased prediction based on censored samples and some applications. In: Advances in Statistical Decision Theory and Applications (Eds., S. Panchapakesan and N. Balakrishnan), Birkhauser, Boston, pp. 431-448.
Bernardo, J.M. and Smith, A.F.M. (1994). Bayesian Theory. Wiley, New York.
Berred, A.M. (1998). Prediction of record values. Commun. Statist.-Theory Meth. 27, 2221-2240.
Chan, P.S. (1998). Interval estimation of parameters of life based on record values. Statist. Prob. Letters (to appear).
Chhikara, R. and Guttman, I. (1982). Prediction limits for the Inverse Gaussian distributions. Technometrics 24, 319-324.
Corcuera, J.M. and Giummole, F. (1999). A generalized Bayes rule for prediction. Scand. J. Statist. 26, 265-279.
Doganaksoy, N. and Balakrishnan, N. (1997). A useful property of best linear unbiased predictors with applications to life testing. The Amer. Statist. 51, 22-28.
Dunsmore, I.R. (1974). The Bayesian predictive distribution in life testing models. Technometrics 16, 455-460.
Dunsmore, I.R. (1976). Asymptotic prediction analysis. Biometrika 63, 627-630.
Dunsmore, I.R. (1983). The future occurrence of records. Ann. Instit. Statist. Math. 35, 267-277.
Fligner, M.A. and Wolfe, D.A. (1976). Some applications of sample analogues to the probability integral transform and a coverage property. Amer. Statist. 30, 78-85.
Fligner, M.A. and Wolfe, D.A. (1979)(a). Methods for obtaining distribution-free prediction interval for the median of a future sample. J. Quality Tech. 11, 192-198.
Fligner, M.A. and Wolfe, D.A. (1979)(b). Nonparametric prediction intervals for a future sample median. J. Amer. Statist. Assoc. 74, 453-456.
Geisser, S. (1984). Predicting Pareto and exponential observables. Canad. J. Statist. 12, 143-152.
Geisser, S. (1986). Predictive analysis. In: Encyclopedia of Statistical Sciences, vol. 7 (Eds., S. Kotz, N.L. Johnson and C.B. Read), Wiley, New York, pp. 158-170.
Geisser, S. (1990). On hierarchical Bayes procedures for predicting simple exponential survival. Biometrics 46, 225-230.
Geisser, S. (1993). Predictive Inference: An Introduction. Chapman and Hall, London.
Guilbaud, O. (1983). Nonparametric prediction intervals for sample medians in the general case. J. Amer. Statist. Assoc. 78, 937-941.
Johnson, R.A.; Evans, J.W. and Green, D.W. (1999). Nonparametric Bayesian predictive distributions for future order statistics. Statist. Prob. Letters 41, 247-254.
Kaminsky, K.S. and Nelson, P.I. (1998). Prediction of order statistics. In: Handbook of Statistics, vol. 17 (Eds., N. Balakrishnan and C.R. Rao), Elsevier Science, Amsterdam, pp. 431-450.
Lingappaiah, G.S. (1978). Bayesian approach to the prediction problem in the exponential population. IEEE Trans. Rel. R-27, 222-225.
Lingappaiah, G.S. (1979). Bayesian approach to prediction and the spacings in the exponential distribution. Ann. Instit. Statist. Math. 31, 391-401.
Lingappaiah, G.S. (1980). Intermittent life testing and Bayesian approach to prediction with spacings in the exponential model. Statistica 40, 477-490.
Lingappaiah, G.S. (1986). Bayes prediction in exponential life-testing when sample size is a random variable. IEEE Trans. Rel. 35, 106-110.
Lingappaiah, G.S. (1989). Bayes prediction of maxima and minima in exponential life tests in the presence of outliers. J. Indust. Math. Soc. 39, 169-182.
Lwin, T. (1972). Estimating the tail of the Paretian law. Scandinavian Aktuarietidskr. 55, 170-178.
Nagaraja, H.N. (1984). Asymptotic linear prediction of extreme order statistics. Ann. Instit. Statist. Math. 36, 289-299.
Nagaraja, H.N. (1995). Prediction problems. In: The Exponential Distribution: Theory and Applications (Eds., N. Balakrishnan and A.P. Basu), Gordon and Breach, New York, pp. 139-163.
Padgett, W.J. (1982). An approximate prediction interval for the mean of future observations from the Inverse Gaussian distribution. J. Statist. Comp. Simul. 14, 199-209.
Padgett, W.J. and Tsoi, S.K. (1986). Prediction interval for observations from the Inverse Gaussian distribution. IEEE Trans. Rel. R-35, 406-408.
Patel, J.K. (1989). Prediction intervals - a review. Commun. Statist.-Theory Meth. 18, 2393-2465.
Seshadri, V. (1999). The Inverse Gaussian Distribution. Statistical Theory and Applications. Springer Verlag, Berlin.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 247-270)


INFERENCE ON PARAMETERS OF THE LAPLACE DISTRIBUTION BASED ON TYPE-II CENSORED SAMPLES USING EDGEWORTH APPROXIMATION

N. Balakrishnan*, A. Childs*, Z. Govindarajulu‡ and M.P. Chandramouleeswaran†

* McMaster University, Hamilton, Ontario, Canada
† University of Missouri, Columbia, Missouri, USA
‡ University of Kentucky, Lexington, Kentucky, USA

Keywords and Phrases: Order statistics; Type-II censored samples; Laplace distribution; Exponential distribution; Single moments; Double moments; Triple moments; Quadruple moments; Coefficients of skewness and kurtosis; Pivotal quantities; BLUE; MLE; Edgeworth approximation; Recurrence relations; Life-testing

ABSTRACT By deriving two recurrence relations which express the single and double moments of order statistics from a symmetric distribution in terms of the corresponding quantities from its folded distribution, Govindarajulu (1963, 1966) determined means, variances and covariances of Laplace order statistics (using the results on exponential order statistics) for sample sizes up to 20. He also tabulated the BLUE's (Best Linear Unbiased Estimators) of the location and scale parameter of the Laplace distribution based on complete and symmetrically Type-II censored samples. In this paper, we first establish similar relations for the computation of triple and quadruple moments. We then use these results to develop Edgeworth approximations for some pivotal quantities which will enable one to develop inference for the location and scale parameters. Next, we show that this method provides close approximations to percentage points of the pivotal quantities determined by Monte Carlo simulations. Finally, we present an example to illustrate the method of inference developed in this paper.

1 INTRODUCTION

Let $X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$ be a doubly Type-II censored sample available from the Laplace (or double exponential) population with pdf

$$f(x;\mu,\sigma) = \frac{1}{2\sigma}\,e^{-|x-\mu|/\sigma}, \qquad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0. \tag{1.1}$$

Here, out of $n$ items placed on a life-testing experiment, the smallest $r$ and the largest $s$ items have been censored. The most common situation in a life-testing problem is, of course, a Type-II right-censored sample (with $r = 0$ and $s > 0$), as the experimenter will often terminate the experiment as soon as a certain number of items have failed instead of waiting for all the items to fail; see Mann, Schafer and Singpurwalla (1974), Bain (1978), Lawless (1982), Cohen and Whitten (1988), Bain and Engelhardt (1991), and Balakrishnan and Cohen (1991).

It should be mentioned here that Govindarajulu (1963) established two recurrence relations which express the single and double moments of order statistics from a symmetric distribution in terms of the corresponding quantities from its folded distribution. By using these relations and the exact explicit expressions of the single and double moments of exponential order statistics, Govindarajulu (1966) determined the means, variances and covariances of Laplace order statistics for $n$ up to 20; he also tabulated the BLUE's of the parameters $\mu$ and $\sigma$ based on symmetrically Type-II censored samples (with $r = s$) for $n$ up to 20. Raghunandanan and Srinivasan (1971) derived simplified linear estimators of $\mu$ and $\sigma$ in this situation. Ali, Umbach and Hassanein (1981) discussed the estimation of quantiles based on optimally selected order statistics. Shyu and Owen (1986a,b) constructed the one-sided and two-sided tolerance intervals based on complete samples. Balakrishnan and Ambagaspitiya (1988) discussed the robustness features of various linear estimators of $\mu$ and $\sigma$ when a single scale-outlier is present in the sample, and Childs and Balakrishnan (1997a) extended this work to include multiple outliers.
Recently, Childs and Balakrishnan (1997b) extended the work of Balakrishnan and Cutler (1995) by deriving the MLE's of $\mu$ and $\sigma$ based on general Type-II censored samples. Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996) extended the work of Govindarajulu (1966) and presented tables of BLUE's and variances and covariance of these estimators for $n$ up to 20 for the case of right-censored samples (with $r = 0$ and $s = 0(1)n-2$). They also presented some percentage points of three pivotal quantities which will enable one to construct confidence intervals and carry out tests of hypotheses for the parameters $\mu$ and $\sigma$. Balakrishnan and Chandramouleeswaran (1996a) discussed the estimation of the reliability function and the construction of lower and upper tolerance limits based on the BLUE's when the available sample is Type-II right-censored. Balakrishnan and Chandramouleeswaran (1996b) considered the problem of predicting the time of failure of a surviving item and also predicting the lifetime of an item from a future sample, for the case when a Type-II right-censored sample is available from a life-test.

In this paper, we first extend the work of Govindarajulu (1963) and derive recurrence relations that will enable the computation of triple and quadruple moments of order statistics from a symmetric distribution in terms of the corresponding quantities from its folded distribution. These results, along with the exact explicit expressions of these moments of exponential order statistics, will enable one to compute all moments of order up to 4 for the Laplace order statistics. These quantities are used to determine the coefficients of skewness and kurtosis of some pivotal quantities based on the BLUE's and MLE's of $\mu$ and $\sigma$, and then to propose Edgeworth approximations for the distributions of these pivotal quantities. The proposed approximations will enable one to develop inference for the parameters $\mu$ and $\sigma$ based on Type-II censored samples. We then show that this method provides close approximations to percentage points of the pivotal quantities determined by Monte Carlo simulations. Finally, we present a numerical example to illustrate the method of inference developed in this paper. Similar work in the case of the exponential distribution has been carried out recently by Balakrishnan and Gupta (1998).

2 RECURRENCE RELATIONS FOR ORDER STATISTICS FROM TWO RELATED DISTRIBUTIONS

Let $Z_1, Z_2, \ldots, Z_n$ be I.I.D. random variables with probability density function $f(z)$ symmetric about 0 (without loss of generality), and cumulative distribution function $F(z)$. Then, for $z > 0$, let

$$F^*(z) = 2F(z) - 1 \quad\text{and}\quad f^*(z) = 2f(z). \tag{2.1}$$

That is, the density function $f^*(z)$ is obtained by folding the density function $f(z)$ at zero (the point of symmetry). Let $Y_{1:n} \le Y_{2:n} \le \cdots \le Y_{n:n}$ denote the order statistics obtained from $n$ I.I.D. random variables $Y_1, Y_2, \ldots, Y_n$ having probability density function $f^*(z)$ and cumulative distribution function $F^*(z)$ as given in (2.1), and let $Z_{1:n} \le Z_{2:n} \le \cdots \le Z_{n:n}$ denote the order statistics obtained from the random variables $Z_1, Z_2, \ldots, Z_n$. Let us now denote the single moments $E(Z_{i:n}^{a})$ by $\mu_{i:n}^{(a)}$, the double moments $E(Z_{i:n}^{a}Z_{j:n}^{b})$ by $\mu_{i,j:n}^{(a,b)}$, the triple moments $E(Z_{i:n}^{a}Z_{j:n}^{b}Z_{k:n}^{c})$ by $\mu_{i,j,k:n}^{(a,b,c)}$, and the quadruple moments $E(Z_{i:n}^{a}Z_{j:n}^{b}Z_{k:n}^{c}Z_{l:n}^{d})$ by $\mu_{i,j,k,l:n}^{(a,b,c,d)}$, for $1 \le i < j < k < l \le n$ and $a, b, c, d > 0$. Similarly, let us denote the corresponding moments of the order statistics $Y_{i:n}$ by $\nu_{i:n}^{(a)}$, $\nu_{i,j:n}^{(a,b)}$, $\nu_{i,j,k:n}^{(a,b,c)}$ and $\nu_{i,j,k,l:n}^{(a,b,c,d)}$, for $1 \le i < j < k < l \le n$ and $a, b, c, d > 0$. Then by making use of the relations in (2.1), Govindarajulu (1963) established the following two recurrence relations.

Relation 1: For i = 1,2,... ,n and a > 1,

Relation 2: For 1 1,

*=i

)

It is worth mentioning here that the above two relations have been generalized recently by Balakrishnan (1989) and Balakrishnan, Govindarajulu and Balasubramanian (1993) to the case when the order statistics $Z_{i:n}$ arise from $n$ independent and non-identically distributed symmetric random variables and arbitrarily distributed symmetric random variables, respectively. By proceeding along the lines of Govindarajulu (1963), we now establish two recurrence relations which will express the triple and quadruple moments of the order statistics $Z_{i:n}$ in terms of the corresponding quantities of the order statistics $Y_{i:n}$.

Relation 3: For $1 \le i < j < k \le n$ and $a, b, c \ge 1$,

$$\mu_{i,j,k:n}^{(a,b,c)} = 2^{-n}\Big\{\sum_{t=0}^{i-1}\binom{n}{t}\nu_{i-t,j-t,k-t:n-t}^{(a,b,c)} + (-1)^{a}\sum_{t=i}^{j-1}\binom{n}{t}\nu_{t-i+1:t}^{(a)}\,\nu_{j-t,k-t:n-t}^{(b,c)} + (-1)^{a+b}\sum_{t=j}^{k-1}\binom{n}{t}\nu_{t-j+1,t-i+1:t}^{(b,a)}\,\nu_{k-t:n-t}^{(c)} + (-1)^{a+b+c}\sum_{t=k}^{n}\binom{n}{t}\nu_{t-k+1,t-j+1,t-i+1:t}^{(c,b,a)}\Big\}.$$

Proof: The relation is proved by considering the triple integral expression of $\mu_{i,j,k:n}^{(a,b,c)}$ over the range $(-\infty < x_i < x_j < x_k < \infty)$ and splitting the range into four parts as $(0 < x_i < x_j < x_k < \infty)$, $(-\infty < x_i < 0,\ 0 < x_j < x_k < \infty)$, $(-\infty < x_i < x_j < 0,\ 0 < x_k < \infty)$, and $(-\infty < x_i < x_j < x_k < 0)$, and then using the relations in (2.1) in each of these four integrals.

Proceeding similarly, we may also establish the following relation for the quadruple moments of the order statistics $Z_{i:n}$.

Relation 4: For $1 \le i < j < k < l \le n$ and $a, b, c, d \ge 1$,

$$\mu_{i,j,k,l:n}^{(a,b,c,d)} = 2^{-n}\Big\{\sum_{t=0}^{i-1}\binom{n}{t}\nu_{i-t,j-t,k-t,l-t:n-t}^{(a,b,c,d)} + (-1)^{a}\sum_{t=i}^{j-1}\binom{n}{t}\nu_{t-i+1:t}^{(a)}\,\nu_{j-t,k-t,l-t:n-t}^{(b,c,d)} + (-1)^{a+b}\sum_{t=j}^{k-1}\binom{n}{t}\nu_{t-j+1,t-i+1:t}^{(b,a)}\,\nu_{k-t,l-t:n-t}^{(c,d)} + (-1)^{a+b+c}\sum_{t=k}^{l-1}\binom{n}{t}\nu_{t-k+1,t-j+1,t-i+1:t}^{(c,b,a)}\,\nu_{l-t:n-t}^{(d)} + (-1)^{a+b+c+d}\sum_{t=l}^{n}\binom{n}{t}\nu_{t-l+1,t-k+1,t-j+1,t-i+1:t}^{(d,c,b,a)}\Big\}.$$

Relations 1-4 can be used recursively to compute the single, double, triple and quadruple moments of order statistics from a symmetric distribution by making use of the corresponding quantities from its folded distribution. In particular, after computing all these moments of order up to 4, one can determine the mean, variance and the coefficients of skewness and kurtosis of any linear function of order statistics from that symmetric distribution. These measures can then be utilized to develop Edgeworth approximations for distributions of linear functions of order statistics, as we will illustrate in Section 5.

For the case when the order statistics $Z_{i:n}$ arise from the standard Laplace distribution [with $\mu = 0$ and $\sigma = 1$ in (1.1)], the computation of the single, double, triple and quadruple moments by means of Relations 1-4 requires the knowledge of these moments from the standard exponential distribution, with pdf $f^*(z) = e^{-z}$, $z > 0$. In this case, the additive Markov chain representation of standard exponential order statistics [Sukhatme (1937), Renyi (1953)] given by

$$Y_{i:n} = \sum_{t=1}^{i}\frac{E_t}{n-t+1}, \qquad i = 1, 2, \ldots, n,$$

where the $E_t$'s are I.I.D. standard exponential random variables, makes it possible to write down the necessary moments $\nu_{i:n}^{(a)}$, $\nu_{i,j:n}^{(a,b)}$, $\nu_{i,j,k:n}^{(a,b,c)}$ and $\nu_{i,j,k,l:n}^{(a,b,c,d)}$ in simple explicit algebraic forms. For example,

$$\nu_{i:n} = \sum_{t=1}^{i}\frac{1}{n-t+1}, \tag{2.2}$$

$$\nu_{i:n}^{(2)} = \sum_{t=1}^{i}\frac{1}{(n-t+1)^2} + \Big(\sum_{t=1}^{i}\frac{1}{n-t+1}\Big)^{2}, \tag{2.3}$$

and, for $i < j$,

$$\nu_{i,j:n} = \sum_{t=1}^{i}\frac{1}{(n-t+1)^2} + \Big(\sum_{t=1}^{i}\frac{1}{n-t+1}\Big)\Big(\sum_{t=1}^{j}\frac{1}{n-t+1}\Big). \tag{2.4}$$

Alternatively, one may use the recurrence relations for the single and product moments of exponential order statistics derived by Joshi (1978, 1982), along with the recurrence relations for the triple and quadruple moments derived recently by Balakrishnan and Gupta (1998), in order to compute the necessary moments $\nu_{i:n}^{(a)}$, $\nu_{i,j:n}^{(a,b)}$, $\nu_{i,j,k:n}^{(a,b,c)}$ and $\nu_{i,j,k,l:n}^{(a,b,c,d)}$ in a simple recursive manner.
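To make the use of Relation 1 concrete, the sketch below computes the first two single moments of standard Laplace order statistics from those of standard exponential order statistics. It takes Relation 1 in the form $\mu_{i:n}^{(a)} = 2^{-n}\{\sum_{t=0}^{i-1}\binom{n}{t}\nu_{i-t:n-t}^{(a)} + (-1)^{a}\sum_{t=i}^{n}\binom{n}{t}\nu_{t-i+1:t}^{(a)}\}$ and uses the mean and variance implied by the additive Markov chain representation; the function names are illustrative and only $a = 1, 2$ are covered.

```python
from math import comb


def exp_os_mean(i, n):
    """nu_{i:n}: mean of the i-th standard exponential order statistic."""
    return sum(1.0 / (n - t + 1) for t in range(1, i + 1))


def exp_os_second_moment(i, n):
    """nu^{(2)}_{i:n}: variance plus squared mean of the i-th exponential order statistic."""
    variance = sum(1.0 / (n - t + 1) ** 2 for t in range(1, i + 1))
    return variance + exp_os_mean(i, n) ** 2


def laplace_os_moment(i, n, a):
    """mu^{(a)}_{i:n} for the standard Laplace distribution via Relation 1 (a = 1 or 2)."""
    if a == 1:
        nu = exp_os_mean
    elif a == 2:
        nu = exp_os_second_moment
    else:
        raise ValueError("this sketch only covers a = 1 and a = 2")
    # t counts the negative observations; t < i leaves Z_{i:n} on the positive side
    positive_side = sum(comb(n, t) * nu(i - t, n - t) for t in range(0, i))
    # t >= i puts Z_{i:n} on the negative side, hence the sign (-1)**a
    negative_side = sum(comb(n, t) * nu(t - i + 1, t) for t in range(i, n + 1))
    return (positive_side + (-1) ** a * negative_side) / 2 ** n


if __name__ == "__main__":
    n = 5
    for i in range(1, n + 1):
        mean = laplace_os_moment(i, n, 1)
        var = laplace_os_moment(i, n, 2) - mean ** 2
        print(i, round(mean, 4), round(var, 4))
```

Two quick sanity checks are that the means are antisymmetric, $\mu_{i:n} = -\mu_{n-i+1:n}$, and that the second moments sum to $n\,E(Z^2) = 2n$.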

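3 BLUE'S OF $\mu$ AND $\sigma$

Let $X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$ denote a general doubly Type-II censored sample from the Laplace distribution in (1.1), and let $Z_{i:n} = (X_{i:n}-\mu)/\sigma$, $i = r+1, r+2, \ldots, n-s$, be the corresponding order statistics from the standard Laplace distribution. Let us denote $E(Z_{i:n})$ by $\mu_{i:n}$, $\mathrm{Var}(Z_{i:n})$ by $\sigma_{i,i:n}$, and $\mathrm{Cov}(Z_{i:n}, Z_{j:n})$ by $\sigma_{i,j:n}$; further, let

$$\mathbf{X} = (X_{r+1:n}, X_{r+2:n}, \ldots, X_{n-s:n})^{T}, \qquad \boldsymbol{\mu} = (\mu_{r+1:n}, \mu_{r+2:n}, \ldots, \mu_{n-s:n})^{T},$$

$\mathbf{1}$ be a vector of ones of the same length, and

$$\boldsymbol{\Sigma} = ((\sigma_{i,j:n})), \qquad r+1 \le i, j \le n-s.$$

Then, the BLUE's of $\mu$ and $\sigma$ are given by

$$\mu^* = \left\{\frac{\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}\,\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1} - \boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}\,\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}}{(\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu})(\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}) - (\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1})^{2}}\right\}\mathbf{X} = \sum_{i=r+1}^{n-s} a_i^{*}X_{i:n} \tag{3.1}$$

and

$$\sigma^* = \left\{\frac{\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}\,\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1} - \mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}\,\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}}{(\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu})(\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}) - (\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1})^{2}}\right\}\mathbf{X} = \sum_{i=r+1}^{n-s} b_i^{*}X_{i:n}. \tag{3.2}$$

Furthermore, the variances and covariance of these BLUE's are given by

$$\mathrm{Var}(\mu^*) = \sigma^{2}\left\{\frac{\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}}{(\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu})(\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}) - (\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1})^{2}}\right\} = \sigma^{2}V_1^{*}, \tag{3.3}$$

$$\mathrm{Var}(\sigma^*) = \sigma^{2}\left\{\frac{\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}}{(\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu})(\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}) - (\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1})^{2}}\right\} = \sigma^{2}V_2^{*}, \tag{3.4}$$

and

$$\mathrm{Cov}(\mu^*, \sigma^*) = \sigma^{2}\left\{\frac{-\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}}{(\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu})(\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}) - (\boldsymbol{\mu}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1})^{2}}\right\} = \sigma^{2}V_3^{*}; \tag{3.5}$$

for details, refer to David (1981), Balakrishnan and Cohen (1991), and Arnold, Balakrishnan and Nagaraja (1992). From Equations (3.1) and (3.2), Govindarajulu (1966) computed the coefficients $a_i^{*}$ and $b_i^{*}$ for symmetrically Type-II censored samples (with $r = s$) for sample sizes $n$ up to 20. He also presented the values of $V_1^{*}$ and $V_2^{*}$ in Equations (3.3) and (3.4). Similar tables have been prepared recently by Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996) for Type-II right-censored samples (with $r = 0$ and $s = 0(1)n-2$) for sample sizes $n = 3(1)20$. They also presented some percentage points of three pivotal quantities, based on the BLUE's $\mu^*$ and $\sigma^*$, which will enable one to construct confidence intervals and carry out tests of hypotheses for the parameters $\mu$ and $\sigma$.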

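4 MLE'S OF $\mu$ AND $\sigma$

Recently, Childs and Balakrishnan (1997b) derived the MLE's of $\mu$ and $\sigma$ based on the general doubly Type-II censored sample, $X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$, from the Laplace distribution (1.1). They found that when $r+1 \le n-s \le n/2$,

$$\hat\mu = X_{n-s:n} - \hat\sigma\ln\Big[\frac{n-s}{n/2}\Big] \tag{4.1}$$

and

$$\hat\sigma = \frac{1}{n-s-r}\Big[(n-s)X_{n-s:n} - \sum_{i=r+1}^{n-s}X_{i:n} - rX_{r+1:n}\Big]; \tag{4.2}$$

when $\frac{n}{2}+1 \le r+1 \le n-s$,

$$\hat\mu = X_{r+1:n} + \hat\sigma\ln\Big[\frac{n-r}{n/2}\Big] \tag{4.3}$$

and

$$\hat\sigma = \frac{1}{n-s-r}\Big[sX_{n-s:n} + \sum_{i=r+1}^{n-s}X_{i:n} - (n-r)X_{r+1:n}\Big]; \tag{4.4}$$

and when $r+1 \le n/2 < n-s$,

$$\hat\mu = \begin{cases} X_{m+1:2m+1} & \text{if } n \text{ is odd, } n = 2m+1, \\ \tfrac{1}{2}(X_{m:2m} + X_{m+1:2m}) & \text{if } n \text{ is even, } n = 2m \end{cases} \tag{4.5}$$

(actually, when $n$ is even any value in $[X_{m:2m}, X_{m+1:2m}]$ is an MLE for $\mu$, but we will use the one given since it is unbiased) and

$$\hat\sigma = \frac{1}{n-r-s}\times\begin{cases} sX_{n-s:n} + \sum_{i=m+1}^{n-s}X_{i:n} - \sum_{i=r+1}^{m}X_{i:n} - rX_{r+1:n} & \text{if } n \text{ is even, } n = 2m, \\ sX_{n-s:n} + \sum_{i=m+2}^{n-s}X_{i:n} - \sum_{i=r+1}^{m}X_{i:n} - rX_{r+1:n} & \text{if } n \text{ is odd, } n = 2m+1. \end{cases} \tag{4.6}$$

Unlike the BLUE's presented in the previous section, the MLE's are explicit linear functions of order statistics,

$$\hat\mu = \sum_{i=r+1}^{n-s} a_i X_{i:n}, \qquad \hat\sigma = \sum_{i=r+1}^{n-s} b_i X_{i:n},$$

where special tables are not required for the computation of the $a_i$'s or $b_i$'s [they are given explicitly in (4.1)-(4.6)]. Since the MLE for $\sigma$ is not unbiased, we will obtain an unbiased estimator $\tilde\sigma$ of $\sigma$ based on the MLE (to be used in the following section) by dividing by its expected value when the underlying random variables come from the standard Laplace distribution ($\mu = 0$, $\sigma = 1$),

$$\tilde\sigma = \sum_{i=r+1}^{n-s} b_i X_{i:n}\Big/\sum_{i=r+1}^{n-s} b_i\,\mu_{i:n}, \tag{4.7}$$

where $\mu_{i:n}$ is given in Relation 1, and the corresponding $\nu_{i:n}$ is given in (2.2). Furthermore, a closed form expression for its variance may be obtained using Relations 1 and 2 in conjunction with Equations (2.2)-(2.4). We have

$$\mathrm{Var}(\tilde\sigma) = \sigma^{2}\left\{\frac{\sum_{i=r+1}^{n-s} b_i^{2}\,\mu_{i:n}^{(2)} + 2\sum_{i=r+1}^{n-s-1}\sum_{j=i+1}^{n-s} b_i b_j\,\mu_{i,j:n}}{\big(\sum_{i=r+1}^{n-s} b_i\,\mu_{i:n}\big)^{2}} - 1\right\},$$

where exact explicit expressions for $\mu_{i:n}$, $\mu_{i:n}^{(2)}$ and $\mu_{i,j:n}$ are given in Relations 1 and 2, with the corresponding $\nu_{i:n}$, $\nu_{i:n}^{(2)}$ and $\nu_{i,j:n}$ given in Equations (2.2)-(2.4). An analogous equation holds for the variance of $\hat\mu$.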

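5 PIVOTAL QUANTITIES AND INFERENCE

Based on the BLUE's $\mu^*$ and $\sigma^*$ in (3.1) and (3.2), let us define

$$P_1 = \frac{\mu^*-\mu}{\sigma\sqrt{V_1^{*}}}, \qquad P_2 = \frac{\sigma^*-\sigma}{\sigma\sqrt{V_2^{*}}}, \qquad\text{and}\qquad P_3 = \frac{\mu^*-\mu}{\sigma^*\sqrt{V_1^{*}}}. \tag{5.1}$$

Let us also define analogous quantities based on the MLE's $\hat\mu$ and $\tilde\sigma$,

$$P_1' = \frac{\hat\mu-\mu}{\sigma\sqrt{V_1}}, \qquad P_2' = \frac{\tilde\sigma-\sigma}{\sigma\sqrt{V_2}}, \qquad\text{and}\qquad P_3' = \frac{\hat\mu-\mu}{\tilde\sigma\sqrt{V_1}}, \tag{5.2}$$

where exact explicit expressions for $\sigma\sqrt{V_1} = \sqrt{\mathrm{Var}(\hat\mu)}$ and $\sigma\sqrt{V_2} = \sqrt{\mathrm{Var}(\tilde\sigma)}$ are described in Section 4. It is easily verified that all of the quantities in (5.1) and (5.2) are pivotal quantities. $P_1$ and $P_1'$ can be used to draw inference for $\mu$ when $\sigma$ is known, while $P_3$ and $P_3'$ can be used to draw inference for $\mu$ when $\sigma$ is unknown. Similarly, $P_2$ and $P_2'$ can be used to draw inference for $\sigma$ when $\mu$ is unknown.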

By making use of the results developed in Section 2, we propose Edgeworth approximations to the distributions of $P_1$, $P_1'$, $P_2$ and $P_2'$ and examine their effectiveness by comparing the approximate results with simulated results. First of all, note that $P_1$ and $P_2$ in (5.1) can be written as

$$P_1 = \frac{\sum_{i=r+1}^{n-s} a_i^{*}Z_{i:n}}{\sqrt{V_1^{*}}} = \frac{P_1^{*}}{\sqrt{V_1^{*}}} \quad\text{and}\quad P_2 = \frac{\sum_{i=r+1}^{n-s} b_i^{*}Z_{i:n} - 1}{\sqrt{V_2^{*}}} = \frac{P_2^{*}}{\sqrt{V_2^{*}}},$$

while $P_1'$ and $P_2'$ in (5.2) can similarly be written as

$$P_1' = \frac{\sum_{i=r+1}^{n-s} a_i Z_{i:n}}{\sqrt{V_1}} = \frac{P_1^{\prime *}}{\sqrt{V_1}} \quad\text{and}\quad P_2' = \frac{\sum_{i=r+1}^{n-s} b_i Z_{i:n} - 1}{\sqrt{V_2}} = \frac{P_2^{\prime *}}{\sqrt{V_2}}.$$

Thus, they are linear functions of order statistics $Z_{i:n}$ from the standard Laplace distribution. By making use of the relations in Section 2, the values of $a_i^{*}$, $b_i^{*}$, $V_1^{*}$ and $V_2^{*}$ [tabulated by Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996)], and the exact values of $a_i$, $b_i$ [given in (4.5) and obtained from (4.6), respectively], $V_1$ and $V_2$ (from the exact explicit expressions described in Section 4) for the case of Type-II right-censored samples, $X_{1:n} \le X_{2:n} \le \cdots \le X_{n-s:n}$, we determined the values of the mean, variance and the coefficients of skewness and kurtosis ($\sqrt{\beta_1}$ and $\beta_2$) of $P_1^{*}$, $P_2^{*}$, $P_1^{\prime *}$ and $P_2^{\prime *}$ for $n = 5(1)10(5)20$ and $s = 0(1)(\frac{n}{2}-1)$ for $n$ even and $s = 0(1)[\frac{n}{2}]$ for $n$ odd. These values for $P_1^{*}$ and $P_2^{*}$ are presented in Table 1, while the results for $P_1^{\prime *}$ and $P_2^{\prime *}$ are given in Table 2.

An examination of the $\beta_2$ values in Table 1 reveals that the distribution of $P_1^{*}$ (and hence of $P_1$) is slightly heavier tailed than normal and, therefore, an Edgeworth approximation in this case will be quite appropriate. An examination of the $(\sqrt{\beta_1}, \beta_2)$ values in Table 1 reveals that the distribution of $P_2^{*}$ (and hence of $P_2$) is positively skewed and also heavier tailed than the normal, but lies in the range of an Edgeworth approximation; for details on the possible range for Edgeworth approximation, see Barton and Dennis (1952) and Johnson, Kotz and Balakrishnan (1994). Similar observations may be made from Table 2 regarding the distributions of $P_1^{\prime *}$ and $P_2^{\prime *}$. Note also that the variance and $(\sqrt{\beta_1}, \beta_2)$ values for $P_2^{*}$ and $P_2^{\prime *}$ are very similar, often agreeing up to the second decimal place in the $(\sqrt{\beta_1}, \beta_2)$ values, and the third decimal place in the variance values.

The Edgeworth approximation for the distribution of a standardized statistic $T$ (with mean 0 and variance 1) is given by

$$F(t) \approx \Phi(t) - \phi(t)\left\{\frac{\sqrt{\beta_1}}{6}(t^{2}-1) + \frac{\beta_2-3}{24}(t^{3}-3t) + \frac{\beta_1}{72}(t^{5}-10t^{3}+15t)\right\}, \tag{5.3}$$

where $\sqrt{\beta_1}$ and $\beta_2$ are the coefficients of skewness and kurtosis, respectively, of $T$, and $\Phi(t)$ is the cumulative distribution function of the standard normal distribution with corresponding pdf $\phi(t)$. By making use of the entries in Tables 1 and 2, we determined the lower and upper 1%,

256 2.5%, 5% and 10% points of Pi, P[, P 2 and P2' through the Edgeworth approximation in (5.3). These values, for the case of Type-II right-censored samples (r = 0) for s = 0(1)(| — 1) for n even and s = 0(1)[|] for n odd and sample size re = 5(1)10(5)20, are presented in Tables 3-6. For the purpose of comparison, these percentage points were also determined by simulations (based on 5000 runs) and they are presented along with the Edgeworth percentage points in Tables 3-6. From Tables 3 and 4 we see that the Edgeworth approximation of the distributions of Pi and Pi provides quite close agreement with the simulated percentage points. The largest discrepancy occurs at the extreme lower and upper tails of the distribution, but only for small sample sizes. As the sample size increases, the agreement becomes quite close at all levels of censoring, even at the extremes of the distribution. From Tables 5 and 6 we see that the Edgeworth approximation of the distributions of P 2 and P2' also provide close agreement with the simulated percentage points. This time however, the discrepancy for small sample sizes only occurs at the upper tail. But again, as the sample size increases, the discrepancy becomes quite small at all levels of censoring, even at the extremes of the distribution. In conclusion, we observe that the Edgeworth approximations of the distributions of Pi, P[, P2 and P2' all work quite satisfactorily even in samples of size as small as 5, and they indeed improve in accuracy as the sample size re increases. We recommend the use of the pivotal quantities P 2 and P2' based on the MLE's since they require no special tables to use. It should also be pointed out here that a similar Edgeworth approximation can not be developed for the percentage points of the pivotal quantities P 3 or P 3 in (5.1) and (5.2) since it is not a linear function of order statistics. 
However, as displayed in the next section, approximate inference based on Pi with a replaced by u* or based on P[ with a replaced by a, provides quite close results to those based on P 3 or P3' respectively. For this purpose, we have presented in Tables 7 and 8 some selected percentage points of P3 and P3' determined by simulations (based on 5000 runs).
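As an illustration of how such percentage points can be computed, the following is a minimal sketch that inverts an Edgeworth-type distribution function numerically. It assumes that (5.3) has the standard form for a standardized variable, $G(t) \approx \Phi(t) - \phi(t)\{\frac{\sqrt{\beta_1}}{6}(t^2-1) + \frac{\beta_2-3}{24}(t^3-3t) + \frac{\beta_1}{72}(t^5-10t^3+15t)\}$; the function names and the bisection bracket are illustrative choices, not part of the original text.

```python
import math


def norm_pdf(t):
    """Standard normal density phi(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def norm_cdf(t):
    """Standard normal distribution function Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def edgeworth_cdf(t, skew, kurt):
    """Edgeworth approximation to P(T <= t) for a standardized T.

    skew is sqrt(beta_1) and kurt is beta_2. The form used here is the
    standard one and is ASSUMED to coincide with (5.3).
    """
    correction = (
        skew / 6.0 * (t**2 - 1.0)
        + (kurt - 3.0) / 24.0 * (t**3 - 3.0 * t)
        + skew**2 / 72.0 * (t**5 - 10.0 * t**3 + 15.0 * t)
    )
    return norm_cdf(t) - norm_pdf(t) * correction


def edgeworth_quantile(p, skew, kurt, lo=-6.0, hi=6.0, tol=1e-10):
    """Solve edgeworth_cdf(t) = p by bisection on the bracket [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if edgeworth_cdf(mid, skew, kurt) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Moments quoted in Section 7 for n = 20, r = 2, s = 3.
print(round(edgeworth_quantile(0.05, 0.0, 3.5352), 2))     # lower 5% point of P_1
print(round(edgeworth_quantile(0.95, 0.5260, 3.4151), 2))  # upper 5% point of P_2
```

With the skewness and kurtosis quoted in Section 7, this sketch gives values in line with the tabulated Edgeworth points (about -1.63 for $P_1$ and 1.78 for $P_2$), which supports the assumed form of the expansion.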

6 NUMERICAL ILLUSTRATION

In order to illustrate the usefulness of the inference procedures discussed in the previous sections, we consider here a simulated data set of size $n = 20$ (with $\mu = 50$ and $\sigma = 5$): 32.007, 37.757, 43.848, 46.268, 46.907, 47.262, 47.290, 47.593, 48.065, 49.254, 50.278, 50.487, 50.662, 53.336, 53.493, 53.567, 53.981, 54.942, 55.695, 66.396. Using this sample, the BLUE's and MLE's were calculated based on complete as well as Type-II right-censored samples ($r = 0$) by making use of the tables of Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996), and the explicit expressions in (4.5) and (4.6), respectively. These estimates are presented in the following table.

 n    s    $\mu^*$    $\hat\mu$    $\sigma^*$    $\hat\sigma$
 20   0    49.561     49.766       4.947         4.964
      1    49.561     49.766       4.635         4.653
      2    49.561     49.766       4.814         4.834
      3    49.561     49.766       4.931         4.952
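As a quick check on the location estimate, note that for a complete Laplace sample the MLE of the location parameter is the sample median. The short sketch below computes it for the data listed above; the variable names are illustrative, and the censored-sample formulas (4.5) and (4.6) are not reproduced here.

```python
from statistics import median

# The simulated sample of size n = 20 listed in this section.
sample = [
    32.007, 37.757, 43.848, 46.268, 46.907, 47.262, 47.290, 47.593,
    48.065, 49.254, 50.278, 50.487, 50.662, 53.336, 53.493, 53.567,
    53.981, 54.942, 55.695, 66.396,
]

# For even n the median is the average of the two middle order statistics.
mu_hat = median(sample)
print(round(mu_hat, 3))
```

The result agrees with the tabulated value 49.766. Since right-censoring up to $s = 3$ does not remove the two middle order statistics, it is consistent that the tabulated location estimate is unchanged across $s = 0(1)3$.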

With these estimates and the use of Tables 3 and 4, we determined the 90% confidence intervals for $\mu$ (when $\sigma$ is known to be 5) based on the Edgeworth approximation and on the simulated percentage points using both $P_1$ and $P_1'$. These are presented in the table below.

                      Edgeworth C.I.                          Simulated C.I.
 n    s    $P_1$              $P_1'$             $P_1$              $P_1'$
 20   0    (47.501, 51.621)   (47.667, 51.865)   (47.454, 51.668)   (47.637, 51.895)
      1    (47.501, 51.621)   (47.667, 51.865)   (47.530, 51.656)   (47.637, 51.895)
      2    (47.501, 51.621)   (47.667, 51.865)   (47.517, 51.618)   (47.637, 51.895)
      3    (47.501, 51.621)   (47.667, 51.865)   (47.466, 51.631)   (47.637, 51.895)
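The arithmetic behind such an interval can be sketched as follows. The sketch assumes that the tabulated percentage points refer to the standardized pivot $(\mu^* - \mu)/(\sigma\sqrt{V})$, where $V$ is the variance listed in Table 1 (0.0637 for $n = 20$), and it takes $\pm 1.63$ as the 5% and 95% Edgeworth points; the function name and argument layout are hypothetical.

```python
import math


def location_interval(estimate, scale, variance, lower_pt, upper_pt):
    """Interval for mu from the standardized pivot
    (estimate - mu) / (scale * sqrt(variance)),
    given its lower and upper percentage points (assumed standardization)."""
    unit = scale * math.sqrt(variance)
    return (estimate - unit * upper_pt, estimate - unit * lower_pt)


# BLUE of mu = 49.561, sigma known to be 5, Var = 0.0637, 5% and 95% points -1.63, 1.63.
low, high = location_interval(49.561, 5.0, 0.0637, -1.63, 1.63)
print(f"({low:.3f}, {high:.3f})")
```

Because the percentage points are rounded to two decimals here, the computed endpoints differ slightly in the third decimal from the tabulated interval (47.501, 51.621).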

It is clear that the confidence interval based on the Edgeworth approximation is very close to the confidence interval determined by simulations, at all levels of censoring. Similarly, with the use of Tables 5 and 6, we determined the 90% confidence intervals for $\sigma$, and they are presented below.

                      Edgeworth C.I.                      Simulated C.I.
 n    s    $P_2$            $P_2'$           $P_2$            $P_2'$
 20   0    (3.535, 7.522)   (3.547, 7.550)   (3.565, 7.471)   (3.576, 7.526)
      1    (3.285, 7.138)   (3.298, 7.168)   (3.332, 7.226)   (3.344, 7.257)
      2    (3.383, 7.518)   (3.396, 7.550)   (3.381, 7.483)   (3.395, 7.518)
      3    (3.433, 7.820)   (3.447, 7.854)   (3.415, 7.795)   (3.429, 7.832)

Once again, we observe that the confidence intervals based on the Edgeworth approximation are very close to those based on simulations. In the case when $\sigma$ is unknown, the Edgeworth approximation method cannot be used to draw inference for $\mu$ using $P_3$ or $P_3'$. However, as pointed out in the last section, the Edgeworth approximation for the distribution of the pivotal quantity $P_1$ may be used in this case with $\sigma$ replaced by $\sigma^*$, or $P_1'$ may be used with $\sigma$ replaced by $\hat\sigma$, in order to draw approximate inference for $\mu$. By this method, we determined the 90% confidence intervals for $\mu$ and these are presented in the following table for the choices of $r = 0$, $s = 0(1)3$. Also presented in this table are the corresponding 90% confidence intervals for $\mu$ based on the simulated percentage points of the pivotal quantities $P_3$ and $P_3'$ given in Tables 7 and 8.

                      Edgeworth C.I.                                              Simulated C.I.
 n    s    $P_1$ ($\sigma = \sigma^*$)   $P_1'$ ($\sigma = \hat\sigma$)   $P_3$              $P_3'$
 20   0    (47.526, 51.596)              (47.678, 51.854)                 (47.384, 51.738)   (47.483, 52.000)
      1    (47.654, 51.468)              (47.809, 51.723)                 (47.614, 51.647)   (47.719, 51.906)
      2    (47.581, 51.541)              (47.733, 51.799)                 (47.539, 51.583)   (47.687, 51.845)
      3    (47.532, 51.590)              (47.683, 51.849)                 (47.391, 51.681)   (47.538, 51.895)

It is quite clear that the confidence intervals using the approximate Edgeworth method based on the pivotal quantities $P_1$ or $P_1'$ are all very close to the confidence intervals determined by simulation of the distribution of the pivotal quantity $P_3$ or $P_3'$ at all levels of censoring.

7 RESULTS FOR GENERAL CENSORED SAMPLES

All the methods of inference described in the previous sections are equally applicable to general censored samples

$X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$,

but the Edgeworth approximate percentage points of the pivotal quantities based on the BLUE's require special tables which at the present time do not exist for general censored samples. However, the Edgeworth approximate percentage points of the pivotal quantities based on the MLE's continue to require no special tables, as the MLE's are given explicitly in (4.1) and (4.2) or (4.5) and (4.6) depending on the level of censoring.

For the purpose of illustration, let us consider here the numerical example presented in the last section and assume that the smallest two and largest three observations have been censored, i.e., we take $r = 2$, $s = 3$ and $n = 20$. By explicitly computing the coefficients of the BLUE's for $\mu$ and $\sigma$ and using the formulas for the MLE's given in (4.5) and (4.6), we find that $\mu^* = 49.561$, $\hat\mu = 49.766$, $\sigma^* = 4.373$ and $\hat\sigma = 4.397$. We determined the mean, variance and the coefficients of skewness and kurtosis of $P_1$, $P_1'$, $P_2$ and $P_2'$ to be

           Mean     Variance   $\sqrt{\beta_1}$   $\beta_2$
 $P_1$     0.0000   0.0637     0.0000             3.5352
 $P_1'$    0.0000   0.0666     0.0000             3.7475
 $P_2$     1.0000   0.0693     0.5260             3.4151
 $P_2'$    1.0000   0.0693     0.5261             3.4153

By making use of these quantities, we determined the lower and upper 1%, 2.5%, 5% and 10% points of the distribution of the pivotal quantities $P_1$, $P_1'$, $P_2$ and $P_2'$ through the Edgeworth approximation in (5.3), and these are presented in the following table:

           1%      2.5%    5%      10%     90%    95%    97.5%   99%
 $P_1$     -2.47   -2.00   -1.63   -1.24   1.24   1.63   2.00    2.47
 $P_1'$    -2.53   -2.02   -1.63   -1.22   1.22   1.63   2.02    2.53
 $P_2$     -1.92   -1.70   -1.49   -1.22   1.32   1.78   2.21    2.74
 $P_2'$    -1.92   -1.70   -1.49   -1.22   1.32   1.78   2.21    2.74

We also simulated the percentage points of all of the pivotal quantities:

           1%      2.5%    5%      10%     90%    95%    97.5%   99%
 $P_1$     -2.45   -2.07   -1.69   -1.29   1.26   1.66   2.07    2.47
 $P_1'$    -2.58   -2.06   -1.70   -1.28   1.25   1.66   2.07    2.58
 $P_2$     -1.95   -1.70   -1.48   -1.22   1.29   1.73   2.07    2.45
 $P_2'$    -1.96   -1.70   -1.48   -1.22   1.29   1.73   2.08    2.45
 $P_3$     -0.64   -0.53   -0.44   -0.34   0.34   0.45   0.56    0.68
 $P_3'$    -0.68   -0.55   -0.45   -0.34   0.34   0.45   0.59    0.71
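A Monte Carlo estimate of this kind can be sketched for the pivotal quantity $P_1'$. The sketch rests on several assumptions that are not stated in this section: that $\hat\mu$ remains the sample median for $n = 20$, $r = 2$, $s = 3$ (as the unchanged estimate 49.766 suggests), that the tabulated points refer to $(\hat\mu - \mu)/(\sigma\sqrt{0.0666})$, and that the scale parameterization is such that a standard Laplace variate is the difference of two independent unit exponentials. The run count and seed are arbitrary choices (the paper used 5000 runs).

```python
import math
import random
from statistics import median


def simulate_median_pivot(n, runs, var_pivot, seed=0):
    """Simulate (median - mu) / (sigma * sqrt(var_pivot)) with mu = 0, sigma = 1.

    Each Laplace(0, 1) variate is generated as the difference of two
    independent unit exponential variates. Returns the sorted draws.
    """
    rng = random.Random(seed)
    root = math.sqrt(var_pivot)
    draws = []
    for _ in range(runs):
        sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]
        draws.append(median(sample) / root)
    draws.sort()
    return draws


def empirical_point(sorted_draws, p):
    """Empirical 100p% point of a sorted list of simulated values."""
    index = min(len(sorted_draws) - 1, int(p * len(sorted_draws)))
    return sorted_draws[index]


draws = simulate_median_pivot(n=20, runs=20000, var_pivot=0.0666)
for p in (0.05, 0.95):
    print(p, round(empirical_point(draws, p), 2))
```

Under these assumptions the simulated 5% and 95% points should fall near the values reported above for $P_1'$, up to Monte Carlo error.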

An approximate 90% Edgeworth confidence interval for $\mu$ (when $\sigma$ is unknown) is then obtained to be (47.762, 51.360) based on the BLUE's, and (47.916, 51.616) based on the MLE's. Similarly, the 90% Edgeworth approximate confidence interval for $\sigma$ is obtained to be (2.978, 7.195) based on the BLUE's, and (2.994, 7.235) based on the MLE's. Upon comparing these intervals with (47.593, 51.485), (47.787, 51.745), (3.005, 7.164) and (3.021, 7.204), the 90% confidence intervals for $\mu$ and $\sigma$ based on the BLUE's and MLE's determined through Monte Carlo simulations of the percentage points of the distributions of the pivotal quantities $P_3$, $P_3'$, $P_2$ and $P_2'$, we observe that the Edgeworth approximation provides quite close results even in the case of general Type-II doubly-censored samples.

In conclusion, the numerical illustrations in this and the last section clearly indicate the usefulness of the Edgeworth approximation method in developing inference for the parameters of the Laplace distribution based on Type-II censored samples.
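The Edgeworth intervals quoted in this section can be retraced from the estimates, variances and percentage points given earlier in the section. The sketch below assumes the pivots are standardized as $(\text{estimate} - \mu)/(\sigma\sqrt{V})$ for location and $(\text{estimate}/\sigma - 1)/\sqrt{V}$ for scale, with $\sigma$ replaced by its estimate in the location case; the function names are hypothetical.

```python
import math


def location_interval(estimate, scale, variance, lower_pt, upper_pt):
    """Interval for mu from (estimate - mu) / (scale * sqrt(variance))."""
    unit = scale * math.sqrt(variance)
    return (estimate - unit * upper_pt, estimate - unit * lower_pt)


def scale_interval(estimate, variance, lower_pt, upper_pt):
    """Interval for sigma from (estimate / sigma - 1) / sqrt(variance)."""
    root = math.sqrt(variance)
    return (estimate / (1.0 + root * upper_pt), estimate / (1.0 + root * lower_pt))


# n = 20, r = 2, s = 3: estimates, variances and 5%/95% Edgeworth points
# as quoted in this section.
intervals = {
    "mu (BLUE)": location_interval(49.561, 4.373, 0.0637, -1.63, 1.63),
    "mu (MLE)": location_interval(49.766, 4.397, 0.0666, -1.63, 1.63),
    "sigma (BLUE)": scale_interval(4.373, 0.0693, -1.49, 1.78),
    "sigma (MLE)": scale_interval(4.397, 0.0693, -1.49, 1.78),
}
for name, (low, high) in intervals.items():
    print(f"{name}: ({low:.3f}, {high:.3f})")
```

Under the stated standardization, the four computed intervals agree with the Edgeworth intervals reported above to within rounding, which suggests that the assumed form of the pivots matches the one used in the paper.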

Acknowledgments

The first two authors would like to thank the Natural Sciences and Engineering Research Council of Canada for funding this research.

References

Ali, M.M., Umbach, D., and Hassanein, K.M. (1981). Estimation of quantiles of exponential and double exponential distributions based on two order statistics, Communications in Statistics - Theory and Methods 10, 1921-1932.

Arnold, B.C., Balakrishnan, N. and Nagaraja, H.N. (1992). A First Course in Order Statistics, John Wiley & Sons, New York.

Bain, L.J. (1978). Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods, Marcel Dekker, New York.

Bain, L.J. and Engelhardt, M. (1991). Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods, Second edition, Marcel Dekker, New York.

Balakrishnan, N. (1989). Recurrence relations among moments of order statistics from two related sets of independent and non-identically distributed random variables, Annals of the Institute of Statistical Mathematics 41, 323-329.

Balakrishnan, N. and Ambagaspitiya, R.S. (1988). Relationships among moments of order statistics from two related outlier models and some applications, Communications in Statistics - Theory and Methods 17, 2327-2341.

Balakrishnan, N. and Chandramouleeswaran, M.P. (1996a). Reliability estimation and tolerance limits for Laplace distribution based on censored samples, Microelectronics and Reliability 36, 375-378.

Balakrishnan, N. and Chandramouleeswaran, M.P. (1996b). Prediction for the Laplace distribution based on Type-II censored samples, Microelectronics and Reliability (to appear).

Balakrishnan, N., Chandramouleeswaran, M.P., and Ambagaspitiya, R.S. (1996). BLUE's of location and scale parameters of Laplace distribution based on Type-II censored samples and associated inference, Microelectronics and Reliability 36, 371-374.

Balakrishnan, N. and Cohen, A.C. (1991). Order Statistics and Inference: Estimation Methods, Academic Press, Boston.

Balakrishnan, N. and Cutler, C.D. (1995). Maximum likelihood estimation of the Laplace parameters based on Type-II censored samples, In Statistical Theory and Applications: Papers in Honor of Herbert A. David (Eds., H.N. Nagaraja, P.K. Sen and D.F. Morrison), pp. 145-151, Springer-Verlag, New York.

Balakrishnan, N., Govindarajulu, Z., and Balasubramanian, K. (1993). Relationships between moments of two related sets of order statistics and some extensions, Annals of the Institute of Statistical Mathematics 45, 243-247.

Balakrishnan, N. and Gupta, S.S. (1998). Higher order moments of order statistics from exponential and right-truncated exponential distributions and applications to life-testing problems, In Handbook of Statistics 17: Order Statistics: Applications (Eds., N. Balakrishnan and C.R. Rao), 25-59.

Barton, D.E. and Dennis, K.E.R. (1952). The conditions under which Gram-Charlier and Edgeworth curves are positive definite and unimodal, Biometrika 39, 425-427.

Childs, A. and Balakrishnan, N. (1997a). Some extensions in the robust estimation of parameters of exponential and double exponential distributions in the presence of multiple outliers, In Handbook of Statistics 15: Robust Inference (Eds., C.R. Rao and G.S. Maddala), 201-235, Elsevier Science, North-Holland, Amsterdam.

Childs, A. and Balakrishnan, N. (1997b). Maximum likelihood estimation of Laplace parameters based on general Type-II censored samples, Statistische Hefte 38, 343-349.

Cohen, A.C. and Whitten, B.J. (1988). Parameter Estimation in Reliability and Life Span Models, Marcel Dekker, New York.

David, H.A. (1981). Order Statistics, Second edition, John Wiley & Sons, New York.

Govindarajulu, Z. (1963). Relationships among moments of order statistics in samples from two related populations, Technometrics 5, 514-518.

Govindarajulu, Z. (1966). Best linear estimates under symmetric censoring of the parameters of a double exponential population, Journal of the American Statistical Association 61, 248-258.

Johnson, N.L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions, Vol. 1, Second edition, John Wiley & Sons, New York.

Lawless, J.F. (1982). Statistical Models and Methods for Lifetime Data, John Wiley & Sons, New York.

Mann, N.R., Schafer, R.E., and Singpurwalla, N.D. (1974). Methods for Statistical Analysis of Reliability and Life Data, John Wiley & Sons, New York.

Raghunandanan, K. and Srinivasan, R. (1971). Simplified estimation of parameters in a double exponential distribution, Technometrics 13, 689-691.

Renyi, A. (1953). On the theory of order statistics, Acta Math. Acad. Sci. Hung. 4, 191-231.

Shyu, J.C. and Owen, D.B. (1986a). One-sided tolerance intervals for the two-parameter double exponential distribution, Communications in Statistics - Simulation and Computation 15, 101-119.

Shyu, J.C. and Owen, D.B. (1986b). Two-sided tolerance intervals for the two-parameter double exponential distribution, Communications in Statistics - Simulation and Computation 15, 479-495.

Sukhatme, P.V. (1937). Tests of significance for samples of the $\chi^2$ population with two degrees of freedom, Ann. Eugen. 8, 52-56.

Table 1. Mean, Variance and Coefficients of Skewness and Kurtosis of $P_1$ and $P_2$

n 5

• s 0

b 7

p; Mean 0-0000 0-0000 0-0000 0.0D00 0.0000 0-0000 0-0000 0-0000

o-onoo

a 1

ID

IS

20

.

•>

0-0000 D-0000 0-0000 0-0000 0-0000 0-0000 D-DODO 0-000D 0-0000 0-0D00 D.0000 0-0000 0-0000 0-0000 0-DO0O 0.0000 0.0000 0.0000 0-OODO 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0-000D 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

Variance 0-31b1 0.317E 0.3450 0.2548 D.2S48 0.2511 0.2122 0-2122 0-2130 0-2216 0.1814 0.1414 0.1615 0.1854 0-1581 0.1581 0-1581 0.1511 0-1702 o-i3ii o-i3ii 0.1311 0-1401 0.1435 0.0880 0-0680 0.0880 0.0860 D.0860 0-0881 0.0810 0.0135 O.Db37 0.0t37 0-0b37 0-0b37 0-0b37 0.0b37 0.0b37 0-0t38 0-0b41 0-01.55

•IF,

P,

0-0000 0-Olbl -0-0752 D-000D 0-001,3 0-0122 0-0000 0.0024 0.0158 -0.0110 0-0000 0-0010 0-0013 -0-0101 0.0000 0-0004 0.004k 0.0D14 -O-lObS 0-0000 0-0002 0.0D22 0.0011 -0-0252 0-0000 0-0000 0-0000 0.0004 0-0022 0-0052 -0-0107 -0.1D7t 0.0000 0-0000 0-0000 0-0000 0.0001 D.0005 0-0011 0.D034 -0.0048 -0-0532

4.1231 4.1371 4.5658 4.0075 4.0D11 4.1513 3-1324 3.1321 3.1701 4-307b 3-8b73 3.8b74 3.8771 4.0300 3-8153 3-8153 3.8171 3.8730 4.134b 3.7701 3.7701 3.771b 3.7100 3.13b5 3.b22S 3.b225 3.b225 3-b22b 3-b231 3-b375 3.707b 3.8515 3.5352 3.5352 3.5352 3-5352 3.5352 3-5353 3-53b7 3.54b2 3.5857 3-b75b

A"

"Smi"' n 5 b 7

a i

10

15

20

s 0 1 2 0 1 2 0 1 2 3 0 1 2 3 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 5 b 7 0 1 2 3 4 5 b 7 6 1

Heart

Variance

1.0000 1.0000 1.0000 l-OODO 1-0000 1-0000 1.0000

0.2210 0.3001 0.4b35 0.1658 0.2214 0.3078 0.15b5 0.1856 0.2318 O.31b0 D.1351 D.15b4 0.16b7 0-2357 0.1110 0.1351 0.15b7 0.1685 0.2311 0.10b2 0.1181 0.1352 0.1575 0.1101 0.0b13 0.0745 0.0805 0.0875 O.OIbl D.lObS D-1207 0-1314 0-0514 0-0542 0-0573 Q.ObOS Q.0b48 D-0b13 0.0745 0.0806 0.0884 0-0178

i-oooo 1.0000 1.0000 1.0000 1.0000 1-0000 1.0000 1.0000 l-OODO 1-0000 1-0000

i-oooo

1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1-0000 1-0000 1-0000 1.0000 1.0D00 1.0000 1.0000 1.0000 1-0000 1-0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

V*

0.1523 1.0815 1.3443 0.6581 0.1414 1.0120 0.7681 0-8575 0-1510 1-1145 0.733b 0-7881 0.85b7 O.lbOl 0.b68b 0.7332 0.7871 0-8515 0.1731 O.b510 0-b683 0-7354 0.7S74 0.6bb3 0-S2bl 0-5453 0-Sbb7 0-5108 0.bl71 0-b411 O.b105 0.7438 0.4534 0.4b5S 0.4787 0.4130 0.508b 0-5257 0.544b 0-5bbl 0.5117 0-b230

A 4-3b34 4-7558 5.70b1 4-1077 4.3533 4.7815 3-1345 4.104b 4.3545 4.6b04 3.6071 3.132b 4.1001 4.378b 3.7117 3.6071 3-1211 4.1056 4-4185 3-b3b0 3-7112 3-8053 3-1212 4.1231 3.4153 3-44b2 3.4820 3.5236 3-5731 3-b332 3-7140 3-8210 3-3084 3-3251 3-3436 3-3b4b 3-3882 3-4148 3.4451 3-4805 3-5244 3-5815

Table 2. Mean, Variance and Coefficients of Skewness and Kurtosis of $P_1'$ and $P_2'$

h n 5 b 7

S

s D 1 S D 1 2 Q 1 2 3 D 1

a 1

10

IS

3 D 1 5 3 4 0 1 2 3 4 D 1

a

5D

3 4 5 t 7 D 1

a 3 4

s b 7 A 1

flean O.DODD 0-00D0 0-0000 0-DDDD 0-ODOO 0-0000

Variance 0.3512 0.351B 0-3512 0-2b01 0-2b01 0-2b01

a-oooa a-oooo

o.asst o.asst

0-00Q0 0-0000 0.0D00

0.335b 0-235b 0.1873 0.1873 D-1873 D.1473 D.17S1 0-1751 0-1751 0-1751 D-1751

o-oooo O.DDOD

o.aoDO O.OODO D.DDOD 0-0000 D.OOOO 0.0000 0.0000 a.DODO O.ODOO O.OODO 0.D00D D.0000 D.00D0 0.D00D 0.0000 0.0000 0.0000 0.D00D 0.0000 D.D000 O.OODO O.DDDD 0-0000 0.0000 D.OODD 0.000D O.DOOO 0.0000 O-OODO

D.msa 0.1452

o-msa D-msa D.msa D.01b3 O-OlbS 0-01b3 0-D1b3

o.oibs D.D1b3 D-D1b3 0-01b3 O-Obbb O-Obbb O-Obbb O-Dbbb O-Obbb O-Obbb O-Obbb O-Obbb O-Obbb O-Dbbb

VA

A

O.ODOO 0.0000 0-0000 O.OODO 0.0000 O-DDOD 0-0000 O.OODO O.OODO 0.0000 0-0000 D.ODOO 0.0000 O.OODD 0.0000 0.0000 O.ODOO D.OODD 0-0000 0-0000 O-DDOD D.ODOO O.ODOO 0-0000 O.OODD O.DDOD O.DOOO O.ODOO O-ODDO O-ODDO 0-OOOD O.OODD O.DDOD O.OODO D-ODDO 0.0000 O-DOOO 0-0000 0-0000 D-OOOO O.ODDD O.OODD

4-b177 4.b177 4.b177 4.2002 4.2002 4.BDD3 4.4310 4-4310 4-421D 4-4B10 4-D151 4-0151 4.0151 4-0151

4-asos

n 5

m s D

b 7

8

1

4-asDa 4-2SD2

4-asoa 4-asoa 4-DD87 4-0D87 4.0067 4.0047 4-0D47 3.1471 3.1471 3.1471 3-1471 3.1471 3.1471 3.1471 3.1471 3.7475 3.747S 3.7475 3-7475 3-7475 3.7475 3.7475 3.7475 3.7475 3.7475

ID

15

50 q

_2=

flean 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1-0000 1.0000 1.0000 l.ODDO l.DOOD l.DDOO l.ODDO l.OOOD l.DOOD l.DDOO 1.0000 1-0000 l.DOOD l.DDOO l.ODDO l-OODO 1.0000 l.DOOD l.DDOO l.ODDO l.ODDO l-OODO l-DOOD l.DOOD l.DDOO l.ODDO l-OODO l.DOOD l-DOOD l.ODDO l.ODDO l.OODD l.DOOD l-DOOD l-DDOO

Pi Variance D.aBID 0.3010 0-bOb4 0.18b5 D.33D4 0.3015 0.15b5 0-1851

D-asn D.3b78 0-1354 0-15b8 0.1873 0-23b7 D-1110 0-1351 D-15b8

a-iasb 0-Eb48 0.10b4 0.1111 D.1354 0-1571 0.1115 O.DbIS D.D745 0-0805 0-0875 O.DIbl D-lDbS

D-iaos D-1453 D-0515 0-0543 0-D574 D.ObDI D-0b48 0-0b13 0-D74b 0-D8DS D-D8B5 D-0171

VA

A

o-isao

4.3b21 4-7538 5-1007 4-1015 4-35bl 4-7842 3-1333 4-1024 4-3517 4-b211 3-8013 3-1344 4-1034 4-3811 3-7101 3-80b0 3.124b 4.103b 4.30D1 3-b3b8 3-7122 3-8Db7 3-1301 4-1257 3-4151 3-44bD 3-4818 3-5235 3.5727 3-b337 3-7133 3-4007 3-3085 3-3252 3-3431 3-3b4S 3-3843 3-4150 3-4453 3-4407 3.524b 3.5821

1.0811

l.ian 0.8bD4 0.1514 1.014b 0.788b 0.8571 0.15DS 1-0157 0-7345 0-7413 D.4S82

o.ibai 0.b444 0.73B1 0.78b4 0.8511 0.1115 0.0515 D.bBIO 0.7332 0.7843 0.4b7S D.52bl D.5453 0-Sbb7 0.5107 0.bl71 0-b418 D.b1D4 0.7282 0.4535 0.4b5b D.4788 0-4131 0.5088 0.5251 D-5448 D.5bb3 0.5111 0.b234

Table 3. Percentage Points of the Distribution of $P_1$

m

s G 1 2 b a 1 2 7 0 1 i 3 a D 1 2 3 1 D 1 2 3 1 ID D 1 2 3 1 15 0 1 2 3 14 5 t 7 20 D 1 2 3 1 5 t 7 B 1 5

Simulated

Edqeworth

1* -2-t5 -2.t1 -2.82 -2.tl -2-bl -2.1.5 -2-51 -2.51 -2.St -2.7t -2.57 -2-57 -2-57 -2.t3 -2.55 -2.55 -2-55 -2.5b -2-72 -2-St -2. St -2.51 -E.SM -2-bl -2-50 -2-50 -2-50 -2.IT -2.11 -2.50 -2.53 -B.LI

-2.147 -2.147 -2.17 -2.147 -2-17 -2.17 -2.17 -2.17 -2.11 -2.55

2.5Z -5.07 -2.0b -2-20 -2.0b -2.05 -2-07 -5-05 -5.01 -5.01 -5-17 -5-01 -5-01 -5-03 -5-07 -5.03 -5.03 -5-03 -5-03 -2-11 -5-03 -5.03 -5-03 -5.02 -2-0t -2.D1 -5-D1 -5-D1 -5.01 -5.01 -2.01 -2.03 -2.ID -2.00 -2.0D -2-0D -5-00 -2-00 -2.00 -2. DO -5. DO -5.D1 -2.05

S>.

10X

-l.tl -l.tl -l.t3 -l.t2 -l.t2 -l.tl -l.t2 -Lt2 -l.tl -1-tS -l.tB -l.t2 -i.ts -1.U2 -1-tS -l.ta -l.t2 -l.t2 -i.ts -l.t3 -l.t3 -l-b3 -l.t2 -l.t3 -l.t3 -l-t3 -Lb3 -l-t3 -l.t3 -l.t3 -l.t3 -l.tt -l.t3 -i.ts -i.ts -i.ts -l-t3 -l.t3 -l.t3 -l.t3 -l.t3 -i.ts

-1.11 -1.11 -1.15 -1.20 -1.20 -1.16 -1.21 -1-21 -1.20 -1.18 -1-21 -1.21 -1.21 -1.20 -1.22 -1-22 -1.22 -1.21 -1.20 -1.22 -1.22 -1-22 -1.22 -1.21 -1.23 -1.23 -1.23 -1-23 -1.23 -1.23 -1.23 -1.22 -1.21 -1.21 -1.21 -1.21 -1-21 -1.21 -1.21 -1.21 -1.21 -1.23

10* l.n l.n i.n

1SX 17. SZ H i

2.07 2.08 2.08 2.0b 2.0b i.n Lt2 2.08 1.21 Lt2 2-05 1-21 l.t2 2.05 1.20 1-13 2.0b Lit 1.57 2.02 1-21 1.4,2 2.01 1.21 l.t2 2.01 1-21 l-t3 2.OS 1.20 l.tl 2.05 1.22 l.t2 2.03 1.22 l.t2 2.03 1.22 l.t3 2-01 1.21 l.t3 2.05 1.18 1.58 2-00 1.22 l.t3 2.03 1.22 l.t3 2.03 1-22 l-t3 2.03 1.22 Lt3 2-01 1.20 l.tl 2.03 1.23 l.ta 2.01 1.23 l.ta 2.01 1.23 l.ta 2.01 1-23 l.ta 2.D1 1-23 i-ta 2-01 1-23 l.b3 B.D2 1.22 l.b2 2.D1 1.20 1.51 1-17 1.21 i-ta 2-DO 1.21 l.ta 2.00 1.21 l.ta 2.0D 1.21 l.ta 2.00 1.21 Lb3 2.00 1.21 i-ta 2-OD 1.21 l.b3 2.0D 1.21 l.b3 2.01 1.21 l.b3 2.01 1.22 i.bi 1.11 l.tl l.t2 1.5b 1-2D l.t2 1.2D Lb2

2.b5 2.bb 5-71 2.bl 2.b2 2.bb 2.51 2.51 2-bl 2.b3 2.57 2.57 2-58 2.bl 2.S5 2.55 2-5b 2-58 2-57 2-51 2.51 2.51 2.55 2.57 2-50 2.50 2.50 2.50 2-50 2.50 2-51 2-18 2-17 2-17 2.17 2.17 2.17 2.17 2.17 2-18 2-18 2.17

IX -2. SO -2. S3 -2.1b -2-53 -2.1b -2.bl -2.11 -2.11 -2. 57 -2-52 -2.58 -2.17 -2. tO -2.51 -2.52 -2-51 -2.1t -2-38 -2.55 -2-50 -2.51 -2.5t -2.57 -2.51 -2.31 -2.15 -2-t3 -2-17 -2.15 -2.1t -2.51 -2-bl -2.1t -2-52 -2.31 -2-50 -2.51 -2.1t -2.18 -2.ID -2-11 -2.52

2.5* -2.05 -2.08 -2.01 -2.02 -2.D1 -2-D7 -1.11 -1.17 -2.D7 -B-Ot -2-D5 -2-D5 -2-D3 -2.D2 -2.00 -2.03 -2.0t -1.15 -2.OS -2.02 -1-11 -2-07 -2.10 -2.01 -2.03 -2-DS -2-D5 -2.D2 -2.03 -2.01 -2.D2 -2.15 -2.07 -2.0b -2-03 -1-11 -2.D5 -2.07 -1-11 -2.DO -2-D1 -2-Db

5M

10*

ID*

-l.bS -l.tl -l.tl -l.b5 -l-b3 -l.bb -l.tl -l.b3 -l.b8 -l.bl -Lb7 -1-bl -1-bl -l.bl -l.b3 -l.bl -l.b7 -l.b2 -l.b7 -l.bl -l-b2 -i.ts -1-bl -l.b7 -l.bS -1.70 -l.b3 -l-b3 -l.bb -l.bO -l.bl -1.75 -l.b7 -l.tt -i.ts -1-bl -l.bb -l.tt -i.ts -1-58 -l.tl -Lb7

-1.22 -1.21 -1.20 -1.23 -1.11 -1.25 -1.21 -1.21 -1.27 -1.20 -1-21 -l-2b -1-27 -1.17 -1.21 -1.27 -1.25 -1.22 -1.25 -1.25 -1.21 -1-21 -1.20 -1.2b -1.2b -1.28 -1.21 -1.21 -l-2b -1.21 -1.23 -1.25 -1.27 -1.25 -1-21 -1.21 -1.21 -1.25 -1.27 -1.21 -1.21 -1.27

1.22 1.23 1.17 1-23 1.23 1.11 1.21 1.21 1.21 1.11 1.21 1-23 1-23 1.22 1.21 1.11 1.2S 1.21 1.2D 1.2S 1-27 1.25 1.22 1.23 1.2b 1.23 1.22 1.23 1.2b 1.21 1.23 1.22 1.27 1.22 1.22 1-21 1.25 1.21 1.25 1.2b 1-23 1.21

15* 1 7 . SX n x l.bS 1.58 l.b2 l.b5 l.bl l.bS l.bl l.bb l.bS 1.58

Lb7 1-tE i.ts l.bb l.ta l.tl l.t7 l.t2 l.ta l.tl 1.70 i.ts 1.51 l.t2 i.ts l.tt l.ta l.t2 l.tl 1.70 l.tl l.ta l.t7 l.tl l.ta l.bb 1-tE l.b5 l.ta l.b7 l.ta i-to

2.05 1.17 2.0t 2.02 2.03 2.0b 1.11 2.02 2.D1 1.11 2.D5 1.17 2-03 2.01 2.00 1.11 2.OS 2.02 2.03 2.02 2.08 2.08 2.01 2.01 2.03 2.02 1.12 1-15 2.03 2.07 2.01 1.17 2.07 l.lt 1.18 2-D2 2.D1 2. DO 2.07 2.12 2.D1 1-17

2.50 2.17 2.58 2-53 2-17 2.77 2.11 2.15 2.51 2.11 2.58 2.10 2.51 2.17 2.S2 2.12 2.51 2.11 2.51 2.50 2.58 2-51 2.51 2.t2 2.31 2.31 2.31 2.It 2-13 2.It E.tl 2.11 2.It 2.12 2.15 2.10 2-53 2.38 2.SO 2.53 2.15 2-32

Table 4. Percentage Points of the Distribution of $P_1'$

m5# n 5

b

7

8

1

ID

s D 1 2 0 1 2 D 1 2 3 0 1 E 3 G 1 2 3 1

o i

a

3 4 15 D 1 2 3 4 5 b 7 BO D 1 2 3 4 5 b 7 8 1

IX -2-81 -2.81 -2.81 -2.b7 -2-b7 -2.b7 -2.74 -2.74 -2.74 -2.74 -2-1.1 -2-b4 -2-b4 -2-b4 -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2.51 -2.51 -2.51 -2.51 -2.51 -2.51 -2-51 -2-51 -2.53 -2.53 -2.53 -2.53 -2.53 -2.53 -2-53 -2-53 -2.53 -2-53

2-5* -2-17 -2-17 -2.17 -2-D8 -2-D8 -2.D8 -2.12 -2.12 -2.12 -2-12 -2-07 -2.07 -2.07 -2.07 -2-01 -2.01 -2.01 -2.01 -2.01 -2-0t -2.0k -2.0h -2.0b -2.0b -2.05 -2.05 -2-05 -2.05 -2.05 -2-05 -2.05 -2.05 -2.02 -2.02 -2.02 -2.02 -2.02 -2-02 -2.02 -2.02 -2.02 -2-02

Edqeworth 10* 102 5* -1.51 -1.13 1.13 -1.51 -1.13 1.13 -1.51 -1.13 1.13 -l.bl -1.18 1.18 -l.bl -1.18 1.18 -l.bl -1.18 1.18 -l.bO -1.1b 1.1b -l.bO -1.1b 1.1b -l.bO -1.1b 1.1b -1-bO -Lib 1.1b -l.b2 -1-11 1-11 -l.b2 -1.11 1-11 -l.b2 -1.11 1.11 -l.b2 -1.11 1.11 -1-bl -1.18 1.18 -l.bl -1.18 1.18 -l.bl -1.18 1.18 -1-bl -1.16 1.18 -l.bl -1-18 1.18 -l-b2 -1.20 1-20 -1.L2 -1.20 1.20 -l-b2 -1-20 1-20 -l.b2 -1.20 1.20 -1-bE -1-20 1.20 -l.b2 -1.20 1.20 -l-b2 -1.20 1.20 -l.b2 -1.20 1.20 -l.b2 -1-20 1-20 -l.b2 -1-20 1.20 -l.b2 -1.20 1.20 -l-b2 -1.20 1.20 -1-bE -1-20 1-20 -l-b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l-b3 -1-22 1-22 -l.b3 -1-22 1-22 -l.b3 -1.22 1.22 -l-b3 -1-22 1-22

15* 17.5* n* 1-51 2-17 2.81 1-51 2.17 2.81 1.51 2.17 2.81 l.bl 2.OS 2-b7 l.bl 2-08 2-b7 l.bl 2.08 2-b7 l.bO 2.12 2.74 l.bO 2.12 2.74 l.bO 2-12 2-74 l.bO 2-12 2-74 l-b2 2.07 2.b4 l.b2 2.07 2.b4 l.b2 2.07 2.b4 l.b2 2-07 2.b4 l.bl 2-01 2-bl l.bl 2.D1 2-bl l.bl 2.01 2.b1 l.bl 2-01 2-bl l.bl 2.01 2.b1 l.b2 2.0b 2-bl l.b2 2.0b 2.bl l.b2 2.0b 2.bl l.b2 2.0b 2.bl l.b2 2.0b 2.bl l.b2 2.05 2.51 l.b2 2-05 2-51 l.b2 2.05 2.51 l.b2 2.05 2.51 l.b2 2-05 2-51 l.b2 2.05 2.51 l.b2 2.05 2.51 i.ta 2-05 2.51 l-b3 2-D2 2-53 1-L3 2-02 2-53 l.b3 2.02 2.53 l.b3 2-02 2.53 l.b3 2.02 2-53 l.b3 2.02 2.53 L b 3 2.02 2.53 l.b3 2-02 2.53 l.b3 2-02 2-53 L b 3 2-02 2.53

4

IX -2.58 -2.58 -2.58 -2.bl -2.bl -2.bl -2-58 -2-50 -2.58 -2.58 -2.57 -2.57 -2.57 -2-57 -2.55 -2-55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.52 -2-52 -2.52 -2.52 -2.52 -2.52 -2.52 -2-52 -2-50 -2.50 -2.50 -2-50 -2.50 -2.50 -2.50 -2-50 -2.50 -2.50

2-5* -2-04 -2.04 -2.D4 -2.0b -2.0b -2.Db -2.0b -2.0b -2.0b -2.0b -2-04 -2.04 -2-04 -2.04 -2.07 -2.07 -2-07 -2.07 -2.07 -2.05 -2.QS -2.05 -2.05 -2.05 -2.04 -2.D4 -2.04 -2-04 -2.D4 -2.04 -2-04 -2-04 -2.04 -2.04 -2.04 -2.04 -2.04 -2.04 -2-04 -2-04 -2.04 -2.04

Simulated 10* 10* 5* -l.b3 -1.18 1.18 -1-13 -1.18 1.18 -l-b3 -1-18 1.18 -l-b5 -1.21 1.21 -l-b5 -1.21 1.21 -1-bS -1-21 1.21 -1.14 -1-20 1-20 -l.b4 -1-20 1.20 -1-14 -1.20 1.20 -1-L4 -1-20 1.20 -l.b5 -1.22 1.22 -l.b5 -1.22 1.22 -l.b5 -1.22 1-22 -l.b5 -1.22 1.22 -1-bb -1-23 1.23 -l.bb -1.23 1-23 -l.bb -1.23 1.23 -l.bb -1-23 1-23 -1-bb -1.23 1.23 -J.fcS -1.23 1.23 -1-L5 -1.23 1.23 -l.bS -1.23 1.23 - i . t s -1-23 1.23 -l.b5 -1.23 1.23 -l.bb -1.22 1.22 -l.bb -1.22 1.22 -l.bb -1-22 1.22 -l.bb -1-22 1.22 -l.bb -1.22 1.22 -l.bb -1.22 1-22 -l.bb -1-22 1.22 -l.bb -1-22 1-22 -l.b5 -1-24 1.24 -l.b5 -1-24 1.24 -l.b5 -1.24 1.24 -l.b5 -1.24 1.24 -l.b5 -1.24 1.24 -l.bS -1.24 1.24 -l.fcj -1.24 1-24 -l-b5 -1-24 1-24 -l.b5 -1.24 1.24 -l.b5 -1.24 1.24

15* l.b3 l.b3 l.b3 l.b5 l.b5 l.bS l.b4 l.b4 l.b4 l.b4 l.bS 1-bS l.b5 l.bS l.bb l.bb l.bb l.bb l.bb l.bS 1-bS l.b5 1-bS l.b5 l.bb l.bb l.bb l.bb l.bb l.bb l.bb l.bb l-b5 1-bS l.b5 l.b5 l.b5 l.b5 l.b5 l.b5 1-bS l.b5

17.5* 2.Q4 2.04 2.D4 2.Db 2.0b 2.Db 2.0b 2.0b 2.0b 2.0b 2-04 2-04 2-04 2.04 2.07 2.07 2.07 2.07 2.07 2-05 2.05 2.05 2-05 2.05 2.04 2.04 2-04 2.04 2.04 2.04 2.04 2.04 2.04 2-04 2.04 2-04 2-04 2.04 2.04 2.04 2.04 2.04

11* 2.58 2.58 2.58 2.bl 2.bl 2.bl 2-58 2.58 2.58 2.58 2-57 2.57 2.57 2.57 2.55 2-55 2.55 2.55 2.55 2.55 2.55 2.55 2.55 2.55 2-52 2.52 2-52 2.52 2.52 2.52 2.52 2.52 2-50 2.50 2-50 2-50 2-50 2-50 2-50 2-50 2.50 2-50

Table 5. Percentage Points of the Distribution of $P_2$

n 5 6

7

a i

ID

s D 1 S 0 1 2 0 1 E 3 D 1 E 3 0 1 2 3 1

n i

B 3 1 15 D 1 2 3 4 5 I 7 EO G 1 2 3 1 S t 7 8 1

V/.

-1.1.1 -1-57 -LIS -l.tl -Ltl -l.St -1-73 -l.fi -l.tl -1.55 -1.77 -1.71 -l.tl -Lt3 -1.60 -1.77 -L71 -l.fi -Lt2 -1.63 -1.80 -1.77 -1.71 -1.1.1 -LIE -1.11 -L81 -1.87 -1.85 -1.83 -1-BD -1.7t -1.18 -1.17 -l.lt -1.15 -1-11 -1.12 -1-11 -1.81 -1.87 -1.85

2-5Z -1.51 -Lit -1.38 -1.55 -1.51 -l.lt -1.58 -1.55 -1.51 -1.15 -1.1,1 -1.58 -1.55 -1-51 -l.t3 -l.tl -1.58 -1-55 -1.50 -l.tl -1-L3 -l.tl -1.58 -1.55 -1.7D -l.tl -Lbfl -1.L7 -l.tt -l.tl -l.t3 -l.tO -1.71 -1-73 -1.73 -1.72 -1.71 -1.7D -l.tl -l.t8 -l-t7 -l.tt

Edqeworth IOX 1 0 * - 1 . 3 7 - i . i t 1-31 - 1 . 3 3 - i . m 1.30 -1.E7 - i . i i 1.27 - 1 . 3 1 - 1 . 1 7 1.32 -1-37 - L i t 1.31 - 1 . 3 3 - 1 . 1 1 1.30 - 1 . 1 1 - 1 . 1 8 1.32 - 1 . 3 1 - 1 . 1 7 1.32 -1-37 - l . l t 1.31 - 1 . 3 3 -1-11 1.30 - 1 . 1 3 - 1 - 1 1 1.32 - 1 . 1 1 - 1 . 1 8 1.32 - 1 . 3 1 - 1 . 1 7 1.3E - 1 . 3 b - l . l t 1.31 - 1 . 1 1 - 1 . 1 1 1-32 - 1 . 1 3 - 1 . 1 1 1.32 - 1 - 1 1 -1-18 1-32 - 1 . 3 1 - 1 . 1 7 1.32 - 1 . 3 t - l . l t 1.31 - 1 . 1 5 -1-20 1.3E - 1 . 1 1 - 1 - 1 1 1.3E -1-13 - 1 - 1 1 1.32 - 1 . 1 1 - L I S 1-32 - 1 . 3 1 - 1 . 1 7 1.32 - 1 . 1 1 -1-22 1.32 - 1 . 1 8 - 1 . 2 1 1.32 -1.18 - 1 . 2 1 1.32 -1-17 - 1 - 2 1 1-32 - l . l t - 1 . 2 0 1-32 -1-15 -1-20 1-32 - 1 . 1 1 - 1 . 1 1 1.32 - 1 . 1 2 - 1 . 1 1 1.3E - 1 - 5 1 -1-23 1.32 - 1 . 5 1 -1-22 1.32 -1.5D -1-22 1-32 -1.5C - 1 . 2 2 1.32 - 1 . 1 1 - 1 . 2 2 1-32 - 1 - 1 1 -1-22 1.3E - 1 . 1 8 - 1 - 2 1 1-3E - 1 . 1 8 - 1 . 2 1 1-33 -1-17 - 1 - 2 1 1.32 - L i t -LEO 1.32 5Z

W, 15*

l.St 1.88 1.11 1-61 l.Bt 1.88 1.83 1.81 l.St 1.88 1.S2 1.63 1.81 l.St 1.81 1.82 1.83 LSI L8t

1.80 LSI

1.8E 1.83 LSI

1.7S 1.78 L71

1.71 LSD

1.80 LSI

1.B2 1.7b 1.7t 1.77 1.77 1.77 1.78 1.78 1.71 1.71 1.80

17.5* 2.17 2.5S 2.S3 2.10 2.17 2-51 2.3t 2.10 2.17 2.tl 2.32 2.3t 2.10 2.18 2.30 2-32 E.3t E.10 E.11 2-27 2-30 2.32 2-3b 2.11 2.21 2.22 2-23 2-21 2.Et 2.27 2.30 2-33 2-17 E.18 E.18 2-11 E.20 2-21 2.22 B.23 2.21 2.2t

Simulated

11*

IX

3.11 3-25 3-11 3.05 3.11 3.2t 2-11 3.05 3.11 3-27 2.11 2.11 3-05 3.11 2-81 2.11 2.11 3.05 3.It 2.8t 2.81 2.13 2-11 3.0t 2.71 2-7t 2.78 2.SO E.63 2. St E.81 2.15 2.tB 2-tl E.70 2.71 2.73 2-71 2-7t 2-78 2.60 2.63

-l.to -L51 -l.St -l.tl -l.tl -L50 -l.tl -l.tl -l.tS -1.50 -1.71 -1.73 -L7D -LtO -1.81 -1.71 -1.7t -Ltl -L51 -1.8b -1.81 -1.60 -1.73 -Lt8 -1.11 -1.13 -LSI -1.67 -LSI -1.61 -L71 -1.78 -E.01 -1.11 -1.13 -1.17 -1.11 -1.11 -1.15 -1.10 -1.10 -L88

2.5* -Lit -1.11 -L28 -1.51 -L11 -L37 -LSI -1.51 -1.11 -1.10 -L5t -1.55 -L53 -Lit -Ltl -L57 -1-St -1.S5 -Lit -Ltl -Lt2 -Ltl -LSS -1.5E -Ltl -l.tS -l.t7 -Lt8 -l.tl -Lt3 -l.tl -L5S -L77 -1.77 -Lt8 -Ltl -1.7E -L73 -Ltl -Lt7 -Lt7 -l.tS

S*

10*

-1.31 -1.30 -1.11 -1.35 -L33 -1.2t -1.37 -L38 -1.31 -L27 -1.11 -L37 -1.35 -1.32 -L11 -1.11 -LM0 -L3S -1.32 -1.11 -1.13 -LIE -LSI -1.37 -1.17 -L17 -Lit -1.11 -L12 -1.13 -1.12 -1.12 -LIT -L51 -1-11 -1.11 -LIS -1.51 -1.18 -L15 -Lit -LM5

-1.13 -1.11 -l.Ot -1.15 -1.13 -1.10 -1.11 -LIS -1.15 -1.11 -1.11 -1.13 -LIS -1.13 -LEI -LEO -1.11 -1.17 -i.m -LIS -1.17 -1.18 -Lit -1.13 -LEI -LEO -LIB -1.E3 -1.18 -1.18 -L11 -1.17 -LEE -1.E2 -1.20 -1.2E -1.21 -1.E3 -1.11 -1.11 -1.11 -L20

10*

15*

1.31 L 8 1 L S I 1.10 1-33 L 8 1 L 3 1 1.81 1.32 1.81 L 3 S 1.88 1-31 L B 3 L 2 1 1.78 L3t LS7 L 3 B 1.10 L2t Lt7

1.31 1.81 L 3 t LSS L10 LSS 1.33 L 7 S 1-31 L 7 t

1.35 1.88 L 3 1 1.88 L 3 2 1.81 1-31 L 8 7 1-31 L 6 2 1-33 L S S 1.31 1.83 L 3 1 1.81 1-30 1.71 1.31 L 7 t L 2 7 1.72 1-33 1.81 1.37 1.81 1.38 L B t L 3 8 1.10 L 3 3 1-80 1.21 1.71 L E I i.ta L33 L37 L28 L32 L3D L32 L35

L77 LS0

1.71 L75

1.72 LSI LB3

1.37 L S t

17- 5* 2.10 E.17 E.50 2.28 2.33 B.11 2-27 E.B7 2-35 2.11 2.13 B.37 S-35 E.13 2-11 2. I t 2-35 2.37 2.37 2.33 2.11 2.2t 2.2b 2.30 2.13 2.13 2.16 2-26 2.25 2.21 2.32 2.2b 2-13 2.13 2-17 E.E1 2-10. B.1S E.10 E.E5 E.E1 E.Et

n*

3-02 3.08 3.EO B.10 2.10 3.05 E.77 2-81 3.08 3-13 E.t3 2.11 E.83 3.11 E.63 2.51 2-86 2.13 3.01 B.87 B.73 E.81 2.S3 E.11 E.73 2.t5 2.71 B-77 2-S1 2.7t 2.S2 E.81 E.57 2.57 2.7D E.t8 2.51 E.tB E.51 E.73 E.70 E.87

Table 6. Percentage Points of the Distribution of $P_2'$

•5 n

t 7

S

|P s 0 1 2 D 1 S D 1 E 3 0 1

a3 1

10

D 1 E 3 4

ai s3 4

15

ai s3 4 5

t 50

7 0 1 5 3 4 5

t 7 6 I

!S

Edqeworth 1%

-i.ti -1.57 -1.51 -l.tl -1-tl -L5t -1.71 -l.fi -i.ti -i.to -1.77 -1.73 -Ltl -Lt3 -i.ao -1.77 -1.71 -i.ti -Lt5 -1.63 -l.SG -1-77 -1.714 -l.fi -LIE -1.11 -1.01 -1.67 -1.85 -1-83 -1.80 -1.78 -1.18 -1.17 -Lit -L15 -1.11 -LIE -1.11 -1.81 -1.87 -1.85

2-5* -1.51 -l.lt -1.11 -1.55 -1.51 -1-11 -1-50 -1-55 -1.51 -1-11 -l.tl -1-58 -1.55 -1.51 -l.t3 -1-tl -1-58 -1.55 -1.52 -l.tl -i.ts -1-tl -1-58 -1.55 -1.70 -l.tl -i.ta -l.t? -l.tt -l.tl -l.t3 -1-tl -1.74 -1.73 -1.73 -1.7E -1.71 -1.70 -l.tl -i.ta -l.t? -l.tt

5*

10*

-1-37 -1.33 -1.31 -1.31 -1.37 -1.33 -1.11 -1.31 -1.37 -1.35 -1.13 -1.11 -1.31 -Lit -1.14 -1.43 -1.41 -1.31 -1.37 -1-45 -1-44 -1.43 -1.41 -1.31 -1.41 -1.18 -1.18 -1.17 -Lit -1.15 -1.14 -1.43 -1.51 -1.51 -1.50 -1.50 -1-41 -1.41 -1-48 -1-48 -1-47 -Lit

-Lit -L14 -1.13 -1.17 -Lit -1.11 -1.18 -1.17 -Lit -L15 -1.11 -1.18 -L17 -Lit -1.11 -1.11 -LIS -1.17 -l.lt -LEO -1-11 -LIT -1.18 -1.17 -LEE -LEI -LSI -LEI -LED -LED -L11 -1.11 -L23 -LEE -LEE -LEE -LEE -LEE -LEI -LSI -LEI -LEO

10*

15*

1.31 l . a t L 3 0 i.aa 1.28 1.10 L 3 E 1.84 1-31 L 8 t 1.30 L S B 1.3S 1.83 1.3E L S 4 1.31 l . B t 1.30 L 8 7 1.32 1.62 1.3E 1.B3 L32 L84 L 3 1 l.Bt

1.32 1.81 1.32 1.82 L32 L83

1.32 1-84 1.31 1.85 1.3S 1.80 L 3 E 1.81 L32 L62

1.3S 1.83 L3S L64 L 3 E 1.78

1.3E 1.78 L32 L71 L 3 2 1.71

1.32 1.80 L 3 2 LSD L 3 2 1.81

1.32 1.82 1.32 L 7 t L 3 2 1.7t 1.3E 1.77 1.32 L 7 7 L32 L77 L 3 2 1.78

1.32 1.78 L 3 2 1.71 1.32 1.71 1.32 1.60

17.5* 2-47 2.58 2.tt 2.40 2.47 2-51 2.3t 2-10 S-17 S.54 E.3E 2-3b E.40 E.48 S.3D E-3E 2.3t 2-40 2.45 2.27 2.30 2-32 2.3t 2-11 2.21 S.S2 E-S3 E.S1 E.Et 2-27 2.30 2.32 2-17 2.18 2.18 2-11 2-20 2-21 2.22 2.23 2.24 2.2t

Simulated

11*

1!!

3-14 3.35 3-30 3-05 3.14 3.St 2.11 3.05 3-14 3.SO 2.14 2.11 3.05 3.15 2-81 2.14 2.11 3.05 3-11 2.at E.81 5.14 E-11 3'.0t 2.74 2.7t 2.76 2.6D 2-B3 2.6t 2.61 S-13 2.t8 E.tl 2.70 5-71 E.73 E.74 E.7t E.78 s.ao 2.83

-i.to -1.54 -i.to -l.tl -Lt4 -1.41 -l.tl -l.tl -i.ts -l.tl -L75 -1.73 -L70 -LtO -1.84 -1.76 -L7t -l.tl -1.7D -L87 -L83 -1.81 -1.73 -i.ta -L14 -L13 -1.61 -LB8 -L85 -1.81 -1.71 -L71 -2.01 -2.00 -1.13 -1.17 -1.14 -1.11 -1.15 -1.10 -L10 -LB7

2.5* -l.lt -1.41 -L43 -1.51 -1.41 -L37 -L54 -1.54 -1.11 -L18 -l.St -L55 -1.53 -Lit -l.tl -1.57 -L5t -1.S5 -LSI -Lt5 -Lt2 -LtO -1.55 -L52 -1-tl -i.ts -Lt7 -LtB -Lt4 -Lt3 -l.tl -1.51 -1.78 -L77 -Lt8 -l.tl -1.72 -1.71 -l.tl -Lt7 -Lt7 -Lt5

5*

10*

-1.31 -L3D -1.27 -1.35 -L33 -1.27 -L37 -1.31 -1.34 -L3S -1.12 -1.37 -1.35 -1.32 -1.41 -1.41 -1.4D -L36 -L33 -1.11 -1.13 -1.11 -LID -1.38 -1.17 -1.47 -l.lt -1.11 -LIE -L43 -1.13 -1.13 -1.50 -LSI -1.11 -1.11 -LIS -L5D -LIS -1-11 -1.15 -Lit

-L13 -1.11 -LDB -1.11 -LIE -1.11 -1.11 -1-18 -1.15 -1-11 -1.11 -1.11 -1.15 -1.13 -LED -1-11 -L11 -1.17 -1.13 -1-18 -1-17 -LIB -Lit -L13 -L20 -1-20 -1.1a -L23 -1.18 -1.18 -1.11 -1-17 -1-22 -1-21 -1-20 -1-23 -1-21 -1-23 -1.11 -1-11 -1.11 -1-20

ID* L35

15* L81

1.34 1.10 L30

1-3S 1.32 1.38 1.32 L21 L3t L33

L87 LS2

1.83 1.10 1.B2 1.78 i-a? 1.10 l.t?

1.25 1.34 L B 4 L 3 7 1-B5 L 3 B 1.11 1.33 1.7t L30 L3S

L7t LS6

1.34 i.aa 1.33 1.87 1.3B i.ea L 3 4 1.00 L33 L34 L34 L30 L31

LB7

1.84 L81 L7t L75

1.27 1.7E L 3 3 1.84 L3t L65

1.38 1.31 1.32 1.21 1.2B 1.33 1.38

1.87 1.10 L78

1-71 i.to 1.7? i.ao

L21 L74 1-32 L 7 t 1.30 L 7 3 1-32 L S I L34 LS3 L3t LSt

17.5* E.41 S-4t E.4E E.E8 E-31 2.37 2.S7 2.E4 E.33 B.41 2.11 S-3t E.3t E.lt E-11 E.lt 2-3b 2.35 S.3t 2.32 2.20 2-21 2.27 2.32 2-11 2.13 2-11 2-27 2-2t 2.3D 2-33 2-28 2.12 2.11 2.18 2-21 2.10 2-11 2.01 2-25 2.21 2.2t

11*

3.D2 3. D8 3.13 2.12 2. I t 3-03 2-7t 2-10 3.07 3-08 2-t3 2.11 2.87 3.15 2-81 2.5B 2.10 E-11 3-02 2.at 2.7t 2.85 2-82 2.11 2.71 2.b5 2-62 2.77 2.66 E.77 E.8D E.81 S.56 2.5b 2-tl 2-tl 2.51 E-bO 2.tl 2.7t 2.t1 2.65

Table 7. Simulated Percentage Points of the Distribution of P, n S

b

7

a

1

10

IS

2D

s D 1 2 D 1 2 D 1 2 3

•1 2 3 D 1 2 3 4 0 1 2 3 4 0 1 2 3 4 5 b 7

o i 2 3 4 5 b 7 8 1

IX -1.17 -2-M7 -5-31 -LSD -l.bl -2.fa3 -1.3S -1.33 -1-84 -2.63 -1.22 -1.23 -1-3D -1.71 -1-D8 -LID -1-lb -1.4b -2.11 -1.D2 -1-Dl -1.D7 -1-D1 -1.45 -0.77 -0.77 -0.7ft -0.81 -0.71 -0.81 -0.12 -l.lt -D-bS -0-b4 -0-b7 -O.bb -0-b7 -D-tfi -0-b7 -D-bb -0.73 -0-77

2.5X

-mo

-Lb4 -3.07 -1.11 -1.22 -1-72 -1.0b -1.02 -1.2b -1.17 -0.14

-o-m

-1.02 -1.22 -o.ab -0-10 -0.11 -1.03 -1.43 -O.flO

-o.ai

-0.B4 -0.8b -1.02 -0-bb -0-bS -D.b3 -0-bM -0-b3 -D.bb -0-70 -o.ab -0.5M -D.S1 -0-S2 -0.S1 -0.SS -0-S4 -0.SM -0-52 -0.56 -0-bO

S>. -1.0b -1.1b -2.0b -0.11 -0.11 -1.17 -o.as -0.82 -0.15 -1.33 -0.7b -0.7b -o.ao -0.11 -0-70 -0.71 -0.72 -o.ao -1.01 -0.b5 -0-bS -0.fc7 -0.b7 -0.76 -0.52 -0-53 -0-51 -0.52 -0.51 -0-52 -0-54 -0.b5 -D.MM -0-M5 -0-M2 -0.M3 -D.M4 -0-M2 -O.MM -0.M2 -D.M5

-cm

1Q'/. -0.78 -0.82 -1.2b -D.b5 -0.b6 -0.80 -0.b3 -0.b2 -O.bb -0.85 -0.5b -0.57 -0.56 -O.bl -0.51 -0.53 -0.53 -0.55 -O.bb -0.41 -0.46 -0.48 -0.41 -0.54 -0.31 -0.31 -0.31 -0.38 -0.38 -0.36 -0.40 -0.45 -0.33 -0.33 -0.32 -0.32 -0-31 -0.33 -0.34 -0.32 -0.34 -D.3b

10* 0.77 0.83 0.85 O.bl 0-70 0-bl 0.51 O.bM O.bl 0-b4 0.57 0-57 0-57 0.57 0.52 0.51 0.54 0-52 0.54 0.47 0.51 0.41 0-48 0-50 0.36 0-31 0.36 0-38 0.40 0.31 0.36 0-38 0.33 0.32 0.32 0.32 0-32 0.32 0-33 0.34 0-33 0.32

15X 1.04 1.1b 1.17 0-14 0.15 D-17 0.61 0.87 0-8b 0-87 0.7b 0.77 0-71 0.80 0.70 0.71 0.73 0.73 0.72 O.bH 0-b6 D-bb 0-bS O.bb 0.52 0.52 0.50 0.51 0-51 0.51 0.52 0-51 0-44 0.42 0.42 0.44 0-42 0.44 0-44 0.45 0.43 0.43

1 7 . 5X 1.21 1-41 1-bO 1.17 1.24 1.32 1.00 1.08 1.01 1.14 0.11 0.1b 0.11 1.02 0-85 0.81 0.13 0.14 0.11

IIX l.b3 2.12 2.30 1.46 l.b7 l.b7 1.3b 1.43 1.45 1.54 1.21 1.23 1.21 1.21 1.01 1.12 1.15 1.20 1.17

o.ai

o.ia

0-64 0.62 o.ao o.ao 0-b3 0-b4 0-b2 0.b5 0-b2 0-b2 0.b3 0-b4 0.55 0.52 0.51 0.51 0.51 0.54 0-54 0-54 0.53 0.53

1.05 1.10 1.01 1-05 0.74 0.7b 0.71 0.71 0.78 0.60 0.77 0-71 0.b7 o.ts 0-b3 D.b4 0.b2 0-b3 0-b7 O.bl 0-b7 O.bS


Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 271-303)

MATHEMATICAL MODELS IN THE THEORY OF ACCELERATED EXPERIMENTS

V. BAGDONAVICIUS and M. NIKULIN

Department of Statistics, Vilnius University, Vilnius, Lithuania

Statistique Mathématique, Université Victor Segalen Bordeaux 2, Bordeaux, France & Steklov Mathematical Institute, Saint Petersburg, Russia

AMS subject classifications: 62F10, 62J05, 62G05, 62N05

Keywords: Arrhenius model, accelerated life models, additive accumulation of damages model, changing shape and scale model, Cox model, covariable, cross-effects of hazard rates, Eyring model, exponential distribution, frailty model, gamma frailty model, generalized probit model, generalized proportional hazards model, generalized logistic regression model, generalized proportional odds-rate model, generalized additive model, generalized additive-multiplicative model, heredity principle, inverse gaussian frailty model, linear transformation model, log-linear model, Meeker-Luvalle model, Peshes-Stepanova model, periodic stress, power rule model, reliability, resource, Sedyakin's model, step-stress, survival function, switch-on's and switch-off's effects, tampered failure rate model, Weibull distribution.

0. Introduction

Mathematical models describing the dependence of the lifetime distribution on explanatory variables (stresses) are considered. These models are used in survival analysis and reliability theory to analyse the results of accelerated life testing. Many manufactured devices have a long life when used under normal conditions, so much time is required to collect enough data for reliability estimation. To avoid this, items can be tested under higher stress conditions, in which all processes resulting in failures proceed more quickly. As a result, failures which under normal conditions would occur only after long testing can be observed, and the amount of data can be enlarged. Such reliability testing is called accelerated life testing. Using information about failures under higher stress conditions, inference about item reliability under normal stress must be made. The solution of this problem requires the construction of a mathematical theory of models relating the distribution of the failure time to the stress.
A number of such models were proposed by engineers, who considered the physics of the failure formation process of certain products, or by statisticians; see Andersen, Borgan, Gill & Keiding (1993), Bagdonavicius (1978, 1990), Bagdonavicius & Nikulin (1995, 1998), Bhattacharyya & Soejoeti (1989), Cox (1972), Cox & Oakes (1984), Clayton & Cuzick (1985), Dabrowska & Doksum (1988), Elandt-Johnson & Johnson (1980), Gertsbakh & Kordonskiy (1969), Genest, Ghoudi & Rivest (1995), Greenwood & Nikulin (1996), Harrington & Fleming (1982), Kalbfleisch & Prentice (1980), Lawless (1982), Hougaard (1986), Johnson (1975), Kartashov (1979), Kartashov & Perrote (1968), Lee (1992), Lin & Ying (1994, 1995, 1996), Meeker & Escobar (1993, 1998), Mann, Schafer & Singpurwalla (1974), Miner (1945), Nelson (1990), Robins & Tsiatis (1992), Schabe (1998), Sedyakin (1966), Shaked & Singpurwalla (1983), Singpurwalla (1995), Singpurwalla & Wilson (1999), Schweizer & Sklar (1983), Viertl (1988), Viertl & Spencer (1991), Voinov & Nikulin (1993, 1996), etc.

The definitions of these models are somewhat eclectic, which obscures the relations between them. A general approach to the construction of accelerated life models is considered here (Bagdonavicius and Nikulin (1994-2000)). It makes it possible to formulate a number of new models and to show the place of the known ones in the proposed classes of models. These models can be considered as parametric, semiparametric or nonparametric. Parametric models used in accelerated life testing were thoroughly investigated; see, for example, Viertl (1988), Nelson (1990), Nelson & Meeker (1991), Meeker & Escobar (1998), Basu and Ebrahimi (1982), Singpurwalla (1971), etc. The best known of these models are the proportional and additive hazards, logistic regression and other models, which are used most often in survival analysis and in reliability. See also reviews such as Rukhin and Hsieh (1987), Meeker and Escobar (1993), Singpurwalla (1987), etc. Statistical analysis of the considered models can be found, for example, in Lin and Ying (1994, 1995, 1996), Meeker and Escobar (1998), Dabrowska and Doksum (1988), Robins and Tsiatis (1992), Bagdonavicius and Nikulin (1995-2000), Gerville-Reache and Nikoulina (1998), Tsiatis (1990), Schmoyer (1991), Sethuraman and Singpurwalla (1982), Ying (1993), etc.

1. Resource

All models of accelerated life will be formulated in a unified way, using the notion of the resource. We consider only those models for which statistical estimation or hypothesis testing procedures are available to date. Suppose at first that stresses are deterministic time functions:

$$x(\cdot) = (x_1(\cdot), \ldots, x_m(\cdot))^T : [0, \infty) \to B \subset \mathbf{R}^m.$$

If $x(\cdot)$ is constant in time, we'll write $x$ instead of $x(\cdot)$ in all formulae. Let $\Omega$ be a population of items and suppose that the time-to-failure of items under the stress $x(\cdot)$ is defined by a non-negative absolutely continuous random variable $T_{x(\cdot)} = T_{x(\cdot)}(\omega)$, $\omega \in \Omega$, with the survival function

$$S_{x(\cdot)}(t) = P\{T_{x(\cdot)} > t\},$$

strictly decreasing on the support of the distribution $[0, sp_{x(\cdot)})$. The moment of failure of a concrete item $\omega_0 \in \Omega$ is given by a nonnegative number $T_{x(\cdot)}(\omega_0)$. Let $F_{x(\cdot)}(t) = 1 - S_{x(\cdot)}(t)$ be the cumulative distribution function of $T_{x(\cdot)}$. We use the following interpretation of it.

Definition 1. The proportion $F_{x(\cdot)}(t)$ of items from $\Omega$ which fail until the moment $t$ under the stress $x(\cdot)$ is called the uniform resource of population used until the moment $t$.

Definition 2. The random variable

$$R^U = F_{x(\cdot)}(T_{x(\cdot)}) = 1 - S_{x(\cdot)}(T_{x(\cdot)})$$

is called the uniform resource (of population).

The distribution of the random variable $R^U$ doesn't depend on $x(\cdot)$ and is uniform on $[0, 1)$. This explains the name uniform resource. The uniform resource of any concrete item $\omega_0 \in \Omega$ is $R^U(\omega_0)$. It shows the proportion of the population $\Omega$ which fails until the failure $T_{x(\cdot)}(\omega_0)$ of the item $\omega_0$. The same population of items $\Omega$, observed under different stresses $x_1(\cdot)$ and $x_2(\cdot)$, uses different resources until the same moment $t$ when $F_{x_1(\cdot)}(t) \neq F_{x_2(\cdot)}(t)$. In the sense of equality of used resources the moments $t_1$ and $t_2$ are equivalent if $F_{x_1(\cdot)}(t_1) = F_{x_2(\cdot)}(t_2)$. If we denote by $G(t) = 1 - t$, $t \in [0, 1)$, the uniform survival function and by $H(p) = 1 - p$, $p \in (0, 1]$, the inverse function of $G$, then the resource can be written in the form $R^U = H(S_{x(\cdot)}(T_{x(\cdot)}))$.

The considered definition of the resource is not unique. Take any strictly decreasing and continuous function $H : (0, 1] \to \mathbf{R}$ such that the inverse $G = H^{-1}$ of $H$ is a survival function. Then the distribution of the random variable $R^G = H(S_{x(\cdot)}(T_{x(\cdot)}))$ also does not depend on $x(\cdot)$ and the survival function of $R^G$ is $G$.

Definition 3. The random variable $R^G$ is called the $G$-resource and the number $H(S_{x(\cdot)}(t))$ is called the $G$-resource used until the moment $t$.

If identical populations of items operate under different stresses $x_1(\cdot)$ and $x_2(\cdot)$, then, independently of the choice of resource, the moments $t_1$ and $t_2$ are equivalent if $S_{x_1(\cdot)}(t_1) = S_{x_2(\cdot)}(t_2)$. In what follows the exponential resource is mostly used, for which

$$G(t) = e^{-t}, \quad t \ge 0, \qquad H(p) = -\ln p, \qquad H(S_{x(\cdot)}(t)) = -\ln S_{x(\cdot)}(t).$$

The choice of the exponential resource is due to the fact that the exponential resource usage rate is the hazard rate

$$\alpha_{x(\cdot)}(t) = \lim_{h \downarrow 0} \frac{1}{h} P\{T_{x(\cdot)} \in (t, t+h] \mid T_{x(\cdot)} > t\} = -\frac{S'_{x(\cdot)}(t)}{S_{x(\cdot)}(t)}$$

and the used resource is the accumulated hazard rate

$$A_{x(\cdot)}(t) = \int_0^t \alpha_{x(\cdot)}(u)\,du = -\ln\{S_{x(\cdot)}(t)\}$$

under the stress $x(\cdot)$. These notions have a good interpretation. The (exponential) resource

$$R = A_{x(\cdot)}(T_{x(\cdot)})$$

has the standard exponential distribution with the survival function $G(t) = e^{-t}$, $t \ge 0$. The resource $R$ takes values in the interval $[0, \infty)$ and its distribution doesn't depend on $x(\cdot)$. For any $t$ the number $A_{x(\cdot)}(t) \in [0, \infty)$ is the exponential resource used until the moment $t$ under the stress $x(\cdot)$ (see Fig. 1). The rate of exponential resource usage is the hazard rate $\alpha_{x(\cdot)}(t)$; for any moment $t$ it shows the risk of failure just after this moment for items which survived until $t$. Sometimes the meaning of certain models in terms of other resources will be discussed. This will be done when the formulation of a model in terms of a non-exponential resource is simpler or more comprehensible.
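The statement that the exponential resource $R = A_{x(\cdot)}(T_{x(\cdot)})$ is standard exponential whatever the stress can be checked by simulation. The following is only a minimal sketch under assumptions that are not in the text: the two "stress levels" are hypothetical constant stresses with Weibull lifetimes (so that $A(t) = (t/\theta)^{k}$), and all function names are illustrative.

```python
import math
import random


def weibull_cum_hazard(t, scale, shape):
    """Accumulated hazard A(t) = (t / scale)**shape of a Weibull lifetime."""
    return (t / scale) ** shape


def simulate_resource(scale, shape, n, seed):
    """Draw n failure times T and return the used resources R = A(T)."""
    rng = random.Random(seed)
    return [weibull_cum_hazard(rng.weibullvariate(scale, shape), scale, shape)
            for _ in range(n)]


def empirical_survival(sample, t):
    """Fraction of the sample strictly larger than t."""
    return sum(1 for r in sample if r > t) / len(sample)


# Two hypothetical stress levels with very different lifetime laws (assumed).
resource_mild = simulate_resource(scale=10.0, shape=2.0, n=20000, seed=1)
resource_harsh = simulate_resource(scale=2.0, shape=0.7, n=20000, seed=2)

# Both empirical survival functions should be close to G(t) = exp(-t).
for t in (0.5, 1.0, 2.0):
    print(t, empirical_survival(resource_mild, t),
          empirical_survival(resource_harsh, t), math.exp(-t))
```

The lifetimes under the two stresses differ greatly, but the resources used at failure should have, up to simulation noise, the same standard exponential law.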

Fig. 1.

Remark 1. We'll give definitions of models for deterministic stresses. If the stress is a stochastic process $X(t)$, $t \ge 0$, and $T_{X(\cdot)}$ is the time-to-failure under $X(\cdot)$, then denote by

$$S_{x(\cdot)}(t) = P\{T_{X(\cdot)} > t \mid X(s) = x(s),\ 0 \le s \le t\}, \quad \alpha_{x(\cdot)}(t) = -\frac{S'_{x(\cdot)}(t)}{S_{x(\cdot)}(t)}, \quad A_{x(\cdot)}(t) = -\ln\{S_{x(\cdot)}(t)\} \tag{0}$$

the conditional survival, hazard rate and accumulated hazard rate functions. In this case definitions of models should be understood in terms of these conditional functions.

2. Generalized Sedyakin's model

Definition of the model

The first idea which comes to mind when modelling the influence of a stress on the lifetime distribution is to suppose that the rate of resource usage $\alpha_{x(\cdot)}(t)$ at any moment $t$ depends on the value of the stress $x(t)$ at this moment and on the resource $A_{x(\cdot)}(t)$ used until $t$. It is formalized by the following definition.

Definition 4. The generalized Sedyakin's (GS) model (Bagdonavicius (1978)) holds on a set of stresses $E$ if there exists a positive function $g$ on $E \times \mathbf{R}_+$ such that for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = g\left(x(t), A_{x(\cdot)}(t)\right). \tag{1}$$

If the stress is a random process, we denote by $E$ the space of trajectories of this process and consider the conditional functions (0) in all definitions which follow. Definition 4 implies that the accumulated hazard rate (or the used resource) satisfies the integral equation

$$A_{x(\cdot)}(t) = \int_0^t g\left(x(u), A_{x(\cdot)}(u)\right) du. \tag{2}$$

In the following subsections it will be shown that if the GS model holds then the survival functions under step-stresses can be written in terms of the survival functions under constant stresses.

Simple step-stress

Let $E_0$ be a set of constant in time stresses and $E_1$ be a set of simple step-stresses of the form

$$x(\tau) = \begin{cases} x_1, & 0 \le \tau < t_1, \\ x_2, & \tau \ge t_1, \end{cases} \tag{3}$$

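where $x_1, x_2 \in E_0$.

Proposition 1. If the GS model holds on $E_1$ then the survival function and the hazard rate under the stress $x(\cdot) \in E_1$ verify the equalities

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1, \\ S_{x_2}(t - t_1 + t_1^*), & t \ge t_1, \end{cases} \tag{4}$$

and

$$\alpha_{x(\cdot)}(t) = \begin{cases} \alpha_{x_1}(t), & 0 \le t < t_1, \\ \alpha_{x_2}(t - t_1 + t_1^*), & t \ge t_1, \end{cases} \tag{5}$$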
...

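respectively; the moment $t_1^*$ is determined by the equality $S_{x_1}(t_1) = S_{x_2}(t_1^*)$.

Proof of Proposition 1. Put $a = A_{x_1}(t_1) = A_{x(\cdot)}(t_1) = A_{x_2}(t_1^*)$. The equalities (2)-(3) imply that for all $t \ge t_1$

$$A_{x(\cdot)}(t) = a + \int_{t_1}^{t} g\left(x_2, A_{x(\cdot)}(u)\right) du$$

and

$$A_{x_2}(t - t_1 + t_1^*) = a + \int_{t_1^*}^{t - t_1 + t_1^*} g\left(x_2, A_{x_2}(u)\right) du = a + \int_{t_1}^{t} g\left(x_2, A_{x_2}(u - t_1 + t_1^*)\right) du.$$

It implies that for all $t \ge t_1$ the functions $A_{x(\cdot)}(t)$ and $A_{x_2}(t - t_1 + t_1^*)$ satisfy the integral equation

$$h(t) = a + \int_{t_1}^{t} g\left(x_2, h(u)\right) du$$

with the initial condition $h(t_1) = a$. The solution of this equation is unique, therefore we have

$$A_{x(\cdot)}(t) = A_{x_2}(t - t_1 + t_1^*) \quad \text{for all } t \ge t_1. \tag{6}$$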

It implies the equalities (4) and (5). The proof is complete.

Corollary 1. Under the conditions of Proposition 1,

$$P\{T_{x(\cdot)} > t_1 + s \mid T_{x(\cdot)} > t_1\} = P\{T_{x_2} > t_1^* + s \mid T_{x_2} > t_1^*\}. \tag{7}$$

The model (7) was proposed by Sedyakin (1966). The equality $S_{x_1}(t_1) = S_{x_2}(t_1^*)$ and the definition of the resource imply that for two groups of identical items, observed under $x_1$ and $x_2$, respectively, the moments $t_1$ and $t_1^*$, respectively, are equivalent in the sense of resource usage. The equality (4) implies that for any $s > 0$

$$S_{x(\cdot)}(t_1 + s) = S_{x_2}(t_1^* + s).$$

Thus the GS model implies that if two identical populations of items under different stresses use the same resources until the moments $t_1$ and $t_1^*$, respectively, and after these moments both populations operate under the same stress, then the rates of resource usage of these populations in the intervals $[t_1, t_1 + s]$ and $[t_1^*, t_1^* + s]$ are the same.

General step-stress

Let $E_m$ be a set of general step-stresses of the form

$$x(\tau) = \begin{cases} x_1, & 0 \le \tau < t_1, \\ x_2, & t_1 \le \tau < t_2, \\ \cdots & \cdots \\ x_m, & t_{m-1} \le \tau < t_m, \end{cases} \tag{8}$$

where $x_1, \ldots, x_m \in E_0$. Put $t_0 = 0$.

Proposition 2. If the GS model holds on $E_m$ then the survival function $S_{x(\cdot)}(t)$ verifies the equalities:

$$S_{x(\cdot)}(t) = S_{x_i}(t - t_{i-1} + t_{i-1}^*), \quad \text{if } t \in [t_{i-1}, t_i), \quad (i = 1, 2, \ldots, m), \tag{9}$$

where $t_i^*$ can be found by solving the equations

$$S_{x_1}(t_1) = S_{x_2}(t_1^*), \ \ldots, \ S_{x_i}(t_i - t_{i-1} + t_{i-1}^*) = S_{x_{i+1}}(t_i^*), \quad (i = 1, \ldots, m-1). \tag{10}$$

Proof. Proposition 1 implies that Proposition 2 is true for $m = 2$, i.e. we have

$$A_{x(\cdot)}(t) = A_{x_2}(t - t_1 + t_1^*) \quad \text{for all } t \in [t_1, t_2),$$

where $A_{x_1}(t_1) = A_{x_2}(t_1^*)$. Suppose that Proposition 2 is true for $m = j - 1$. Then

$$A_{x(\cdot)}(t) = A_{x_i}(t - t_{i-1} + t_{i-1}^*), \quad \text{if } t \in [t_{i-1}, t_i), \quad (i = 1, \ldots, j-1), \tag{11}$$

where

$$A_{x_i}(t_i - t_{i-1} + t_{i-1}^*) = A_{x_{i+1}}(t_i^*), \quad (t_0 = 0,\ i = 1, \ldots, j-2).$$

We'll prove that Proposition 2 is true for $m = j$. Continuity of the function $A_{x(\cdot)}(t)$ and the equalities (11) imply that

$$A_{x(\cdot)}(t_{j-1}) = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*).$$

So the equation (2) implies that for all $t \in [t_{j-1}, t_j)$

$$A_{x(\cdot)}(t) = A_{x(\cdot)}(t_{j-1}) + \int_{t_{j-1}}^{t} g\left(x_j, A_{x(\cdot)}(u)\right) du = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*) + \int_{t_{j-1}}^{t} g\left(x_j, A_{x(\cdot)}(u)\right) du. \tag{12}$$

The definition of $t_{j-1}^*$, given in (10), and the equation (2) imply that for all $t \in [t_{j-1}, t_j)$:

$$A_{x_j}(t - t_{j-1} + t_{j-1}^*) = A_{x_j}(t_{j-1}^*) + \int_{t_{j-1}^*}^{t - t_{j-1} + t_{j-1}^*} g\left(x_j, A_{x_j}(u)\right) du = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*) + \int_{t_{j-1}}^{t} g\left(x_j, A_{x_j}(u - t_{j-1} + t_{j-1}^*)\right) du. \tag{13}$$

The equalities (12) and (13) imply that the functions $A_{x(\cdot)}(t)$ and $A_{x_j}(t - t_{j-1} + t_{j-1}^*)$ satisfy the integral equation

$$h(t) = b + \int_{t_{j-1}}^{t} g\left(x_j, h(u)\right) du \quad \text{for all } t \in [t_{j-1}, t_j),$$

with the initial condition $h(t_{j-1}) = b = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*)$. The solution of this equation is unique, therefore for all $t \in [t_{j-1}, t_j)$ we have

$$A_{x(\cdot)}(t) = A_{x_j}(t - t_{j-1} + t_{j-1}^*).$$

The proof is complete.

Remark 2. In the statistical literature (see Nelson (1990)) the model (9) is called the basic cumulative exposure model. In terms of the graphs of the accumulated hazard rate functions $A_{x(\cdot)}(t)$ (thick curve) and $A_{x_i}(t)$ ($m = 3$, $i = 1, 2, 3$), the result of Proposition 2 is illustrated by Fig. 2.
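The cumulative exposure rule for a simple step-stress can be checked numerically against the integral equation (2). The sketch below rests on assumptions that are not in the text: the two constant-stress laws are hypothetical Weibull laws with $A_{x_i}(t) = (t/\theta_i)^{k_i}$, the function $g(x_i, a)$ is taken as the constant-stress hazard evaluated at the time when the resource $a$ is reached, and the second phase is integrated with a classical Runge-Kutta scheme. All names are illustrative.

```python
def weibull_A(t, theta, k):
    """Accumulated hazard A(t) = (t / theta)**k under a constant stress."""
    return (t / theta) ** k


def weibull_A_inv(a, theta, k):
    """Time at which the resource a is used up under a constant stress."""
    return theta * a ** (1.0 / k)


def gs_rate(a, theta, k):
    """g(x, a): constant-stress hazard at the moment when resource a is reached."""
    t = weibull_A_inv(a, theta, k)
    return (k / theta) * (t / theta) ** (k - 1)


def step_stress_A_closed(t, t1, law1, law2):
    """Closed form: A_{x1}(t) before t1, A_{x2}(t - t1 + t1*) afterwards."""
    if t < t1:
        return weibull_A(t, *law1)
    t1_star = weibull_A_inv(weibull_A(t1, *law1), *law2)  # S_{x1}(t1) = S_{x2}(t1*)
    return weibull_A(t - t1 + t1_star, *law2)


def step_stress_A_numeric(t, t1, law1, law2, steps=2000):
    """Integrate A' = g(x2, A) on [t1, t] by RK4, starting from A_{x1}(t1)."""
    a = weibull_A(t1, *law1)
    h = (t - t1) / steps
    for _ in range(steps):
        k1 = gs_rate(a, *law2)
        k2 = gs_rate(a + 0.5 * h * k1, *law2)
        k3 = gs_rate(a + 0.5 * h * k2, *law2)
        k4 = gs_rate(a + h * k3, *law2)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return a


law_x1 = (10.0, 1.5)  # hypothetical (theta, k) under the first stress
law_x2 = (4.0, 2.5)   # hypothetical (theta, k) under the second stress
closed = step_stress_A_closed(8.0, 5.0, law_x1, law_x2)
numeric = step_stress_A_numeric(8.0, 5.0, law_x1, law_x2)
print(closed, numeric)
```

The two values should agree up to the discretisation error of the integrator, which is the content of the uniqueness argument in the proof of Proposition 1.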

Fig. 2.

Remark 3. The GS model assumes that the failure rate $\alpha_{x(\cdot)}(t)$ at any moment $t$ depends only on the resource accumulated until this moment (or, equivalently, on the proportion of items failed until $t$) and on the value of the stress applied at this moment $t$. In situations of periodic and quick change of the stress level, or when there are many life-shortening switch-on's and switch-off's of the stress, this model is not appropriate. We'll consider generalizations later. There are no methods of estimation for this model.

What is the region of applications of this model? Suppose that the model is parametric and it is impossible to obtain the complete sample under the "normal" conditions of functioning of items. When right censored data are used, goodness-of-fit tests can only check that the left tail of a survival distribution corresponds well to the chosen model. But often estimates of $p$-quantiles with $p$ near unity are needed, and in the case of a bad choice of the model big mistakes can be made. The use of the model of Sedyakin can help to solve this problem. If stepwise stresses are used, it is possible to obtain failures of items at the end of life under the "normal" conditions and therefore to test whether the right tail is from the specified class of distributions. A test for the Sedyakin's model can be found in Bagdonavicius & Nikoulina (1997).

3. Additive accumulation of damages model

Definition of the model

Consider the following important particular case of the GS model. We now suppose that the rate of resource usage $\alpha_{x(\cdot)}(t)$ at any moment $t$ is proportional to a function of the stress applied at this moment and to a function of the resource used until $t$. It is formalized by the following definition.

Definition 5. The additive accumulation of damages (AAD) model (Bagdonavicius (1978)) holds on $E$ if there exist a positive function $r$ on $E$ and a positive function $q$ on $[0, \infty)$ such that for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\, q\{A_{x(\cdot)}(t)\}. \tag{14}$$

Proposition 3. Suppose that the integral

$$\int_0^x \frac{dv}{q(v)}$$

converges for all $x > 0$. The AAD model holds on the set of stresses $E$ iff there exists a survival function $G$ such that

$$S_{x(\cdot)}(t) = G\left(\int_0^t r\{x(\tau)\}\,d\tau\right). \tag{15}$$

The inverse $H = G^{-1}$ of the function $G$ is defined by the equality

$$H(p) = \int_0^{-\ln p} \frac{dv}{q(v)}.$$

Proof. The equation (14) is equivalent to the integral equation

$$\int_0^{A_{x(\cdot)}(t)} \frac{dv}{q(v)} = \int_0^t r\{x(u)\}\,du.$$

The result follows immediately.

The name of the AAD model is implied by the following considerations. Fix any constant in time stress $x_0$; for example, let $x_0$ be the usual stress. Then under the AAD model $S_{x_0}(t) = G(r(x_0)t)$, and putting $\rho(x) = r(x)/r(x_0)$ we obtain

$$S_{x(\cdot)}(t) = S_{x_0}\left(\int_0^t \rho\{x(\tau)\}\,d\tau\right).$$

The $S_{x_0}$-resource is

$$R = S_{x_0}^{-1}\left(S_{x(\cdot)}(T_{x(\cdot)})\right) = \int_0^{T_{x(\cdot)}} \rho\{x(\tau)\}\,d\tau.$$

Its survival function is $S_{x_0}$ and thus it is stochastically equivalent to the time-to-failure under the usual stress. The decrease of the resource in the interval $[\tau, \tau + d\tau]$ we'll call the damage. For the AAD model damages have the form $\rho\{x(\tau)\}\,d\tau$ and are linear functions of $d\tau$. Thus the resource is used by the linear accumulation of damages.

Remark 4. The AAD model written in the form (15) is also called the accelerated failure time (AFT) model (Cox and Oakes (1984)).

Constant stresses

If $x(\tau) = x = \mathrm{const}$ then the AAD model gives

$$S_x(t) = G\{r(x)t\}. \tag{16}$$

Thus different constant in time stresses change only the scale of the distribution. Applicability of this model in accelerated life testing was first noted by Pieruschka (1961). It is the simplest and the most used model in accelerated life testing.

Simple step-stresses

Consider the properties of the survival functions under step-stresses. As before, we denote by $E_1$ and $E_m$ the sets of stresses of the form (3) and (8), respectively.

Proposition 4. If the AAD model holds on $E_1$ then the survival function under the stress $x(\cdot)$ verifies the equality

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1, \\ S_{x_2}(t - t_1 + t_1^*), & t \ge t_1, \end{cases} \tag{17}$$

where

$$t_1^* = \frac{r(x_1)}{r(x_2)}\, t_1.$$

Proof. The equality (16) implies that

$$S_{x_1}(t) = S_{x_2}\left(\frac{r(x_1)}{r(x_2)}\, t\right),$$

therefore the moment $t_1^*$, defined in Proposition 1, is $t_1^* = \frac{r(x_1)}{r(x_2)}\, t_1$.

General step-stress

Proposition 5. If the AAD model holds on $E_m$ then the survival function $S_{x(\cdot)}(t)$ verifies the equalities:

$$S_{x(\cdot)}(t) = G\left(\sum_{j=1}^{i-1} r(x_j)(t_j - t_{j-1}) + r(x_i)(t - t_{i-1})\right) = S_{x_i}\left(t - t_{i-1} + \frac{1}{r(x_i)} \sum_{j=1}^{i-1} r(x_j)(t_j - t_{j-1})\right),$$

$$\text{if } t \in [t_{i-1}, t_i), \quad (i = 1, 2, \ldots, m). \tag{18}$$

Proof. The first equality is implied by the formula (15) and the form (8) of the stress. The second is implied by the formula (16). The proof is complete.

Remark 5. If the AAD model holds on $E_m$, the moment $t_i^*$ defined by (10) has the form

$$t_i^* = \frac{1}{r(x_{i+1})} \sum_{j=1}^{i} r(x_j)(t_j - t_{j-1}). \tag{19}$$

Relations between the means and the quantiles
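The formulas (18) and (19) can be illustrated by a short computation. This is only a sketch under stated assumptions: the baseline survival function $G(u) = \exp(-u^{1.7})$, the switch times and the rates $r(x_j)$ are hypothetical values chosen for illustration, and the function names are not from the text.

```python
import math


def G(u, shape=1.7):
    """Hypothetical baseline survival function G(u) = exp(-u**shape)."""
    return math.exp(-u ** shape)


def accumulated_exposure(t, switch_times, rates):
    """Integral of r{x(tau)} over [0, t] for a step-stress.

    switch_times = [t_1, ..., t_{m-1}], rates = [r(x_1), ..., r(x_m)].
    """
    total, start = 0.0, 0.0
    for end, rate in zip(switch_times + [math.inf], rates):
        if t <= end:
            return total + rate * (t - start)
        total += rate * (end - start)
        start = end


def equivalent_times(switch_times, rates):
    """t_i^* as in (19): exposure accumulated up to t_i divided by r(x_{i+1})."""
    stars, total, start = [], 0.0, 0.0
    for i, end in enumerate(switch_times):
        total += rates[i] * (end - start)
        stars.append(total / rates[i + 1])
        start = end
    return stars


switch_times = [2.0, 5.0]   # t_1, t_2 (assumed)
rates = [0.1, 0.3, 0.9]     # r(x_1), r(x_2), r(x_3) (assumed)
t = 6.0                     # a moment in the third interval [t_2, t_3)

stars = equivalent_times(switch_times, rates)
survival_first_form = G(accumulated_exposure(t, switch_times, rates))
# Second form of (18): S_{x_3}(t - t_2 + t_2^*) with S_x(u) = G(r(x) u).
survival_second_form = G(rates[2] * (t - switch_times[1] + stars[1]))
print(stars, survival_first_form, survival_second_form)
```

Both forms should give the same survival probability, since $t_2^*$ is exactly the time needed under the constant stress $x_3$ to accumulate the exposure collected on $[0, t_2)$.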

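Suppose that $x(\cdot)$ is a time-varying stress and denote by $t_{x(\cdot)}(p)$ the $p$-quantile of the random variable $T_{x(\cdot)}$, $0 < p < 1$. In the next proposition we shall write $x(\tau)$ to denote both the value of the stress $x(\cdot)$ at the point $\tau$ and the constant in time stress equal to this value.

Proposition 6. Suppose that the AAD model holds on $E$ and $x(\cdot), x(\tau) \in E$ for all $\tau \ge 0$. Then

$$\int_0^{t_{x(\cdot)}(p)} \frac{d\tau}{t_{x(\tau)}(p)} = 1. \tag{20}$$

If the means $\mathbf{E}T_{x(\cdot)}$, $\mathbf{E}T_{x(\tau)}$ exist then

$$\mathbf{E}\left(\int_0^{T_{x(\cdot)}} \frac{d\tau}{\mathbf{E}T_{x(\tau)}}\right) = 1. \tag{21}$$

The model (21) is the model of Miner (1945).

Proof. If the AAD model holds, the equality (15) implies that the $G$-resource used until $t$ is

$$f_{x(\cdot)}(t) = H\left(S_{x(\cdot)}(t)\right) = \int_0^t r\{x(s)\}\,ds. \tag{22}$$

The resource $R$ has the form:

$$R = \int_0^{T_{x(\cdot)}} r\{x(s)\}\,ds. \tag{23}$$

The distribution of $R$, with the survival function $G$, doesn't depend on $x(\cdot)$. Taking the constant in time stress $x(\tau)$, equal to the value of the stress $x(\cdot)$ at the moment $\tau$, we obtain

$$R = \int_0^{T_{x(\tau)}} r\{x(\tau)\}\,ds = r\{x(\tau)\}\,T_{x(\tau)}. \tag{24}$$

Taking the means of both sides we get

$$\mathbf{E}R = r\{x(\tau)\}\,\mathbf{E}T_{x(\tau)}. \tag{25}$$

The equalities (23) and (25) imply

$$\mathbf{E}R = \mathbf{E}\left[\int_0^{T_{x(\cdot)}} r\{x(\tau)\}\,d\tau\right] = \mathbf{E}\left[\int_0^{T_{x(\cdot)}} \frac{\mathbf{E}R}{\mathbf{E}T_{x(\tau)}}\,d\tau\right] = \mathbf{E}R \cdot \mathbf{E}\int_0^{T_{x(\cdot)}} \frac{d\tau}{\mathbf{E}T_{x(\tau)}}$$

and the equality (21) is obtained. Denote by $t(p)$ the $p$-quantile of the resource $R$. If $\tau$ is fixed, the equality (16) implies

$$t(p) = r\{x(\tau)\}\,t_{x(\tau)}(p). \tag{26}$$

Using the definition of the resource we have

$$p = P\{T_{x(\cdot)} \le t_{x(\cdot)}(p)\} = P\left\{\int_0^{T_{x(\cdot)}} r\{x(\tau)\}\,d\tau \le \int_0^{t_{x(\cdot)}(p)} r\{x(\tau)\}\,d\tau\right\} = P\left\{R \le \int_0^{t_{x(\cdot)}(p)} r\{x(\tau)\}\,d\tau\right\}. \tag{27}$$

The equalities (27) and (26) imply that

$$t(p) = \int_0^{t_{x(\cdot)}(p)} r\{x(\tau)\}\,d\tau = t(p) \int_0^{t_{x(\cdot)}(p)} \frac{d\tau}{t_{x(\tau)}(p)}$$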

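and hence the equality (20) is obtained. The proof is complete.

Corollary 2. For the stress (8) the formula (21) implies the equality

$$\sum_{k=1}^{m} \frac{\mathbf{E}T_k}{\mathbf{E}T_{x_k}} = 1, \tag{28}$$

where

$$T_k = \begin{cases} 0, & T_{x(\cdot)} < t_{k-1}, \\ T_{x(\cdot)} - t_{k-1}, & t_{k-1} \le T_{x(\cdot)} < t_k, \\ t_k - t_{k-1}, & T_{x(\cdot)} \ge t_k, \end{cases}$$

is the life of an item, tested under the stress $x(\cdot)$, in the interval $[t_{k-1}, t_k)$. For the stress (8) the formula (20) implies that for $t_{x(\cdot)}(p) \in [t_{k-1}, t_k)$ the following equality is true:

$$\sum_{i=1}^{k-1} \frac{t_i - t_{i-1}}{t_{x_i}(p)} + \frac{t_{x(\cdot)}(p) - t_{k-1}}{t_{x_k}(p)} = 1. \tag{29}$$

The model (29) is the model of Peshes-Stepanova (see Kartashov, 1979). So all of the models (16), (18), (20), (21), (28), (29) are implied by the AAD model and illustrate properties of this model. In the case $m = 2$, the formula (28) can be written in the form

$$\frac{\mathbf{E}T_1}{\mathbf{E}T_{x_1}} + \frac{\mathbf{E}T_2}{\mathbf{E}T_{x_2}} = 1, \tag{30}$$

and the formula (29) can be written in the form

$$\frac{t_1}{t_{x_1}(p)} + \frac{t_{x(\cdot)}(p) - t_1}{t_{x_2}(p)} = 1, \quad \text{if } t_{x(\cdot)}(p) > t_1. \tag{31}$$

So

$$\mathbf{E}T_{x_1} = \frac{\mathbf{E}T_1}{1 - \dfrac{\mathbf{E}T_2}{\mathbf{E}T_{x_2}}} \tag{32}$$

and

$$t_{x_1}(p) = \frac{t_1}{1 - \dfrac{t_{x(\cdot)}(p) - t_1}{t_{x_2}(p)}}, \quad \text{if } t_{x(\cdot)}(p) > t_1. \tag{33}$$

Thus, if the AAD model holds on $E_1$, then $\mathbf{E}T_1$, $\mathbf{E}T_2$ and $\mathbf{E}T_{x_2}$ determine $\mathbf{E}T_{x_1}$, and $t_{x(\cdot)}(p)$ and $t_{x_2}(p)$ determine $t_{x_1}(p)$.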
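The quantile relation for a simple step-stress, $t_1/t_{x_1}(p) + (t_{x(\cdot)}(p) - t_1)/t_{x_2}(p) = 1$, can be used to recover a quantile under the first (usual) stress from a step-stress experiment. The sketch below assumes, for illustration only, an AAD model with $G(u) = \exp(-u^{k})$ and hypothetical rates $r(x_1)$, $r(x_2)$; it computes the step-stress quantile directly from (15) and then checks that solving the relation for $t_{x_1}(p)$ returns the true value.

```python
import math


def resource_quantile(p, shape):
    """p-quantile of the resource R when G(u) = exp(-u**shape) (assumed form)."""
    return (-math.log(1.0 - p)) ** (1.0 / shape)


def constant_stress_quantile(p, rate, shape):
    """t_x(p) under a constant stress with S_x(t) = G(r(x) t)."""
    return resource_quantile(p, shape) / rate


def step_stress_quantile(p, t1, rate1, rate2, shape):
    """t_{x(.)}(p) for the stress x_1 on [0, t1) and x_2 afterwards."""
    u = resource_quantile(p, shape)
    if u <= rate1 * t1:
        return u / rate1
    return t1 + (u - rate1 * t1) / rate2


def usual_quantile_from_step(t_step, t1, t_x2):
    """Solve t1/t_x1 + (t_step - t1)/t_x2 = 1 for t_x1 (requires t_step > t1)."""
    return t1 / (1.0 - (t_step - t1) / t_x2)


p, shape = 0.5, 2.0
rate_usual, rate_accel, t1 = 0.01, 0.1, 30.0   # hypothetical values

t_step = step_stress_quantile(p, t1, rate_usual, rate_accel, shape)
t_x2 = constant_stress_quantile(p, rate_accel, shape)
t_x1_true = constant_stress_quantile(p, rate_usual, shape)
t_x1_recovered = usual_quantile_from_step(t_step, t1, t_x2)
print(t_step, t_x2, t_x1_true, t_x1_recovered)
```

In this hypothetical setting the median under the usual stress is far beyond the switch time, yet it is recovered from two quantities observable in much shorter experiments.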

4. Proportional hazards model

Definition of the model

In survival analysis the most used model describing the influence of covariates on the lifetime distribution is the proportional hazards or Cox model, introduced by Cox (1972). In terms of stresses it is formulated as follows.

Definition 6. The proportional hazards (PH) model holds on a set of stresses $E$ if for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,\alpha_0(t), \tag{34}$$

where $\alpha_0(t)$ is the baseline hazard rate. The model means that the rate of resource usage at any moment $t$ is proportional to some function of the stress applied at this moment and to a baseline rate which does not depend on the stress. Under the PH model the resource used until the moment $t$ has the form

$$A_{x(\cdot)}(t) = \int_0^t r\{x(u)\}\,dA_0(u),$$

where

$$A_0(t) = \int_0^t \alpha_0(u)\,du.$$

In terms of survival functions the PH model is written:

$$S_{x(\cdot)}(t) = \exp\left\{-\int_0^t r\{x(u)\}\,dA_0(u)\right\}. \tag{35}$$

Relations between the PH and the AAD models

When is the PH model also the AAD model? The answer is given in the following two propositions.

Proposition 7. Let $E_0$ be a set of constant stresses such that the set $r(E_0)$ has an interior point. Suppose that the PH model holds on $E_0$. The AAD model also holds on $E_0$ iff the time-to-failure distribution is Weibull for all $x \in E_0$.

Proof. 1) If the time-to-failure distribution is Weibull and the PH model holds on $E_0$ then for all $x \in E_0$

$$S_x(t) = \exp\left\{-\left(\frac{t}{\theta(x)}\right)^{\alpha(x)}\right\} = S_0(t)^{r(x)}. \tag{36}$$

Taking logarithms of both sides twice, we obtain that for all $t > 0$

$$\alpha(x)\left(\ln t - \ln \theta(x)\right) = \ln r(x) + \ln(-\ln S_0(t)). \tag{37}$$

The function $\ln(-\ln S_0(t))$ doesn't depend on $x$, so $\alpha(x) = \alpha = \mathrm{const}$ for all $x \in E_0$, which implies

$$S_x(t) = \exp\left\{-\left(\frac{t}{\theta(x)}\right)^{\alpha}\right\}, \tag{38}$$

i.e. the AAD model holds on $E_0$.

2) Suppose that both the PH and AAD models hold on $E_0$. It means that there exist functions $S_0$, $S_1$, $r$ and $\rho$ such that for all $x \in E_0$

$$S_1(\rho(x)t) = S_0(t)^{r(x)}. \tag{39}$$

Taking logarithms of both sides twice, we obtain that for all $t > 0$

$$\ln\{-\ln S_1(\rho(x)t)\} = \ln r(x) + \ln(-\ln S_0(t)). \tag{40}$$

Put

$$g_1(v) = \ln(-\ln S_1(e^v)), \quad g_0(v) = \ln(-\ln S_0(e^v)), \quad \alpha(x) = \ln \rho(x), \quad \beta(x) = \ln r(x).$$

The equality (40) can be written in the following way: for all $u \in \mathbf{R}$, $x \in E_0$

$$g_1(u + \alpha(x)) = \beta(x) + g_0(u).$$

The set $r(E_0)$ has an interior point, i.e. contains an interval, so the set $\rho(E_0)$ also has an interior point. Take $x_1, x_2, x_3 \in E_0$ such that $\rho(x_2)/\rho(x_1) \neq \rho(x_3)/\rho(x_2)$.

For all $i, j = 1, 2, 3$

$$g_1(u + \alpha(x_i)) - g_1(u + \alpha(x_j)) = \beta(x_i) - \beta(x_j). \tag{41}$$

Put

$$k_1 = \alpha(x_2) - \alpha(x_1), \quad k_2 = \alpha(x_3) - \alpha(x_2), \quad l_1 = \beta(x_2) - \beta(x_1), \quad l_2 = \beta(x_3) - \beta(x_2).$$

For all $v \in \mathbf{R}$

$$g_1(v + k_i) = g_1(v) + l_i \quad (i = 1, 2,\ k_1 \neq k_2). \tag{42}$$

It implies that

$$g_1(v) = av + b, \quad S_1(t) = \exp\{-e^{b} t^{a}\}, \tag{43}$$

and consequently

$$S_x(t) = \exp\{-e^{b}(\rho(x)t)^{a}\}. \tag{44}$$

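So the lifetime distribution is Weibull for all $x \in E_0$. The proof is complete.

Suppose that $E_0$ is the set of constant stresses defined in Proposition 7, $x_1, x_2 \in E_0$ are two fixed constant stresses and a step-stress $x_s(\cdot)$ has the form

$$x_s(\tau) = \begin{cases} x_1, & 0 \le \tau < s, \\ x_2, & \tau \ge s, \end{cases} \tag{45}$$

where $s$ is a fixed positive number.

Proposition 8. Suppose that the PH model holds on the set $E$ including $E_0$ and $x_s(\cdot)$. The AAD model also holds on $E$ iff the time-to-failure is exponential for all $x \in E_0$.

Proof. Proposition 7 implies that for all $x \in E_0$

$$S_x(t) = \exp\left\{-\left(\frac{t}{\theta(x)}\right)^{\alpha}\right\}. \tag{46}$$

Put $\theta_i = \theta(x_i)$, $i = 1, 2$. Then

$$S_{x_i}(t) = \exp\left\{-\left(\frac{t}{\theta_i}\right)^{\alpha}\right\}, \quad \alpha_{x_i}(t) = \frac{\alpha}{\theta_i^{\alpha}}\, t^{\alpha - 1}. \tag{47}$$

The PH model implies that

$$\alpha_{x_s(\cdot)}(\tau) = \begin{cases} \alpha_{x_1}(\tau), & 0 \le \tau < s, \\ \alpha_{x_2}(\tau), & \tau \ge s, \end{cases}$$

and for all $t \ge s$

$$S_{x_s(\cdot)}(t) = \exp\left\{-\int_0^t \alpha_{x_s(\cdot)}(u)\,du\right\} = \exp\left\{-\int_0^s \alpha_{x_1}(u)\,du - \int_s^t \alpha_{x_2}(u)\,du\right\} = \exp\left\{-\left(\frac{s}{\theta_1}\right)^{\alpha} - \left(\frac{t}{\theta_2}\right)^{\alpha} + \left(\frac{s}{\theta_2}\right)^{\alpha}\right\}.$$

1) Suppose that both the PH and the AAD models hold on $E$. Then (46) and (16) imply that

$$G(t) = e^{-t^{\alpha}}, \quad r(x_i) = 1/\theta_i,$$

and (18) implies that for all $t \ge s$

$$S_{x_s(\cdot)}(t) = \exp\left\{-\left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha}\right\}.$$

The two expressions for $S_{x_s(\cdot)}(t)$ imply that for all $t \ge s$

$$\exp\left\{-\left(\frac{s}{\theta_1}\right)^{\alpha} - \left(\frac{t}{\theta_2}\right)^{\alpha} + \left(\frac{s}{\theta_2}\right)^{\alpha}\right\} = \exp\left\{-\left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha}\right\}. \tag{49}$$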

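If $\alpha = 1$, this equality is verified. Suppose that $\alpha \neq 1$. For all $t \ge s$ put

$$g(t) = \left(\frac{s}{\theta_1}\right)^{\alpha} + \left(\frac{t}{\theta_2}\right)^{\alpha} - \left(\frac{s}{\theta_2}\right)^{\alpha} - \left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha}.$$

The derivative of $g(t)$ is

$$g'(t) = \frac{\alpha}{\theta_2}\left[\left(\frac{t}{\theta_2}\right)^{\alpha - 1} - \left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha - 1}\right]$$

and for all $t > s$ has the same sign for fixed $\theta_1 \neq \theta_2$ and $\alpha \neq 1$. So the function $g$ is increasing or decreasing but not constant in $t$, which contradicts the equality (49). The assumption $\alpha \neq 1$ was false. So $\alpha = 1$, and the equality (46) implies that the lifetime distribution under any $x \in E_0$ is exponential:

$$S_x(t) = \exp\left\{-\frac{t}{\theta(x)}\right\}, \quad t \ge 0.$$

2) Suppose that the PH model holds on $E$ and the time-to-failure is exponential for all $x \in E_0$. The formula (35) implies that for all $x \in E_0$

$$S_x(t) = \exp\{-r(x)A_0(t)\}.$$

Exponentiality of the times-to-failure under $x \in E_0$ and the last formula imply that $A_0(t) = ct$. The constant $c$ can be included in $r(x)$, so we have $A_0(t) = t$. The formula (35) implies that

$$S_{x(\cdot)}(t) = \exp\left\{-\int_0^t r\{x(u)\}\,du\right\},$$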

i.e. the AAD model holds on $E$. The proof is complete.

Relations between the GS and the PH models

The GS model is more general than the AAD model. When is the PH model also the GS model? The answer is given in the following proposition.

Proposition 9. Suppose that the PH model holds on the set $E$ including $E_0$ and all the stresses of the form (45) with $s < \delta$, where $\delta$ is any positive number. The GS model also holds on $E$ iff the time-to-failure is exponential for all $x \in E_0$.

Proof. The PH model implies that for all $s < \delta$

$$\alpha_{x_s(\cdot)}(t) = \alpha_{x_2}(t), \quad t \ge s.$$

1) If the GS model also holds on $E$, then for all $s < \delta$

$$\alpha_{x_s(\cdot)}(t) = \alpha_{x_2}(t - s + \varphi(s)), \quad t \ge s,$$

where

$$\varphi(s) = A_{x_2}^{-1}\left(A_{x_1}(s)\right)$$

is an increasing function. It implies that if both the GS and PH models hold on $E$ then for all $s_1 < \delta$ and $s_2 < \delta$

$$\alpha_{x_2}(t - s_1 + \varphi(s_1)) = \alpha_{x_2}(t - s_2 + \varphi(s_2)), \quad t \ge \max(s_1, s_2).$$

Any function $\alpha_{x_2}(t) = \mathrm{const}$ verifies this. Assume that the function $\alpha_{x_2}(t)$ is not constant. Then

$$\varphi(s) - s = c = \mathrm{const} \quad \text{for all } s > 0,$$

because the function $\alpha_{x_2}(t)$ cannot be two or more-periodic. Note that $c \neq 0$, because $A_{x_1}(s) \neq A_{x_2}(s)$. The equalities $\lim_{s \to 0} A_{x_2}(\varphi(s)) = \lim_{s \to 0} A_{x_1}(s) = 0$ and the monotonicity of $\varphi(s)$ imply that $\lim_{s \to 0} \varphi(s) = 0$, so there exists $\delta_0 > 0$ such that $|\varphi(s) - s| < |c|$ if $0 < s < \delta_0$. It contradicts the implication that $\varphi(s) - s = c$ for any $s > 0$. It means that the assumption that $\alpha_{x_2}(t)$ is not constant was false. So $\alpha_{x_2}(t) = \alpha = \mathrm{const}$, which implies that

$$S_{x_2}(t) = e^{-\alpha t}.$$

The PH model implies that for all $x \in E_0$

$$S_x(t) = S_0(t)^{r(x)} = e^{-r(x)t}$$

(the constant factor in the exponent being included in $r(x)$), i.e. the time-to-failure distribution is exponential for all $x \in E_0$.

2) Suppose that for all $x \in E_0$ the time-to-failure distribution is exponential and the PH model holds. The proof of Proposition 8 implies that the AAD model and consequently the GS model also holds on $E$. The proof is complete.

Constant stresses

If $x \in E_0$ is a constant stress, then the PH model gives

$$\alpha_x(t) = r(x)\,\alpha_0(t), \quad S_x(t) = S_0(t)^{r(x)},$$

where

$$S_0(t) = e^{-A_0(t)}.$$

For any $x_0 \in E_0$

$$\alpha_x(t) = \rho(x_0, x)\,\alpha_{x_0}(t), \quad S_x(t) = S_{x_0}^{\rho(x_0, x)}(t),$$

where

$$\rho(x_0, x) = r(x)/r(x_0).$$
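Simple step-stresses

If $x(\cdot) \in E_1$ is a simple step-stress (3) then the PH model gives

$$\alpha_{x(\cdot)}(t) = \begin{cases} r(x_1)\,\alpha_0(t), & 0 \le t < t_1,\\ r(x_2)\,\alpha_0(t), & t \ge t_1. \end{cases}$$

It implies that

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1,\\ S_{x_1}(t_1)\,\dfrac{S_{x_2}(t)}{S_{x_2}(t_1)}, & t \ge t_1. \end{cases} \qquad (50)$$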

The PH model in the form (50) is called the tampered failure rate (TFR) model (Bhattacharyya & Soejoeti (1989)).
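The step-stress formula (50) can be checked numerically. The sketch below is only an illustration: the Weibull-type baseline $A_0(t) = t^{1.5}$ and the values of $r(x_1)$, $r(x_2)$, $t_1$ are assumptions chosen for the example, not quantities from the text. It compares the survival function obtained by integrating the PH hazard over the two stress phases with the TFR form (50).

```python
import math


def cumulative_baseline(t, shape=1.5):
    """Assumed baseline cumulative hazard A_0(t) = t**shape (illustrative choice)."""
    return t ** shape


def survival_constant(t, r):
    """PH survival under a constant stress: S_x(t) = exp(-r(x) * A_0(t))."""
    return math.exp(-r * cumulative_baseline(t))


def survival_step_direct(t, t1, r1, r2):
    """Integrate the PH hazard r(x(u)) * alpha_0(u) over both stress phases."""
    if t < t1:
        return math.exp(-r1 * cumulative_baseline(t))
    total = r1 * cumulative_baseline(t1)
    total += r2 * (cumulative_baseline(t) - cumulative_baseline(t1))
    return math.exp(-total)


def survival_step_tfr(t, t1, r1, r2):
    """Formula (50): S_{x1}(t) before t1, S_{x1}(t1) * S_{x2}(t) / S_{x2}(t1) after."""
    if t < t1:
        return survival_constant(t, r1)
    return (survival_constant(t1, r1)
            * survival_constant(t, r2) / survival_constant(t1, r2))


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.0):
        print(t, survival_step_direct(t, 1.0, 0.4, 1.3),
              survival_step_tfr(t, 1.0, 0.4, 1.3))
```

Both columns printed by the script agree, which is what (50) asserts under the PH model.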

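For any $x_0 \in E_0$

$$\alpha_{x(\cdot)}(t) = \begin{cases} \rho(x_0, x_1)\,\alpha_{x_0}(t), & 0 \le t < t_1,\\ \rho(x_0, x_2)\,\alpha_{x_0}(t), & t \ge t_1, \end{cases}$$

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_0}^{\rho(x_0,x_1)}(t), & 0 \le t < t_1,\\ S_{x_0}^{\rho(x_0,x_1)}(t_1)\left(\dfrac{S_{x_0}(t)}{S_{x_0}(t_1)}\right)^{\rho(x_0,x_2)}, & t \ge t_1. \end{cases}$$

General step-stresses

If $x(\cdot) \in E_m$ is a general step-stress, then the PH model can be written in the following forms: for any $t \in [t_{i-1}, t_i)$ ($t_0 = 0$, $i = 1, \dots, m$)

$$\alpha_{x(\cdot)}(t) = r(x_i)\,\alpha_0(t),$$

$$S_{x(\cdot)}(t) = \prod_{j=1}^{i-1}\left(\frac{S_0(t_j)}{S_0(t_{j-1})}\right)^{r(x_j)} \left(\frac{S_0(t)}{S_0(t_{i-1})}\right)^{r(x_i)}.$$

For any $x_0 \in E_0$ and $t \in [t_{i-1}, t_i)$

$$\alpha_{x(\cdot)}(t) = \rho(x_0, x_i)\,\alpha_{x_0}(t), \qquad S_{x(\cdot)}(t) = \prod_{j=1}^{i-1}\left(\frac{S_{x_0}(t_j)}{S_{x_0}(t_{j-1})}\right)^{\rho(x_0,x_j)} \left(\frac{S_{x_0}(t)}{S_{x_0}(t_{i-1})}\right)^{\rho(x_0,x_i)}.$$
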
5. Generalized proportional hazards models

The AAD and the PH models are rather restrictive. In the case of the AAD model the stress changes only the scale locally. Under the PH model the hazard rate under the stress $x(\cdot)$ at the moment $t$ does not depend on the resource used until $t$. It is not very natural if items are aging. Indeed, let $E_0 \subset E$ be a set of constant in time covariates, $x_0$ be a usual stress, $x_1$ be a stress accelerated with respect to $x_0$, $x_0, x_1 \in E_0$, i.e. $S_{x_0}(t) > S_{x_1}(t)$ for all $t > 0$, and $E_1$ be a set of simple step-stresses of the form

$$x(t) = \begin{cases} x_1, & 0 \le t < t_1,\\ x_0, & t \ge t_1. \end{cases}$$

If the PH model holds on $E_0 \cup E_1$ then for all $t_1 > 0$, $t > t_1$

$$\alpha_{x(\cdot)}(t) = \alpha_{x_0}(t).$$

If one population of items is tested under the usual stress $x_0$ and a second identical population is tested under the accelerated stress $x_1$ until a moment $t_1$, and after this moment both populations are observed under the same usual stress $x_0$, the failure rate after the moment $t_1$ is the same for both populations. These populations use different resources until the moment $t_1$; nevertheless, after the moment $t_1$, when both populations operate under the stress $x_0$, the resource usage rate is the same. It is not natural for aging items.

Definitions of the generalized proportional hazards models

A generalization of the AAD and PH models is obtained by supposing that the rate of resource usage at any moment $t$ is proportional not only to a function of the stress applied at this moment and to a baseline rate, but also to a function of the resource used until $t$. This is formalized by the following definition.

Definition 7. The first generalized proportional hazards (GPH1) model (Bagdonavicius & Nikulin (1998)) holds on $E$ if for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,q\{A_{x(\cdot)}(t)\}\,\alpha_0(t). \qquad (51)$$

The particular cases of the GPH1 model are the PH model ($q(u) \equiv 1$) and the AAD model ($\alpha_0(t) \equiv \alpha_0 = \text{const}$). A generalization of the GS and PH models is the following model.

Definition 8. The second generalized proportional hazards (GPH2) model holds on $E$ if for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = u\{x(t), A_{x(\cdot)}(t)\}\,\alpha_0(t). \qquad (52)$$

The particular cases of the GPH2 model are the GS model ($\alpha_0(t) \equiv \alpha_0 = \text{const}$) and the GPH1 model ($u(x,s) = r(x)q(s)$). Models of different levels of generality can be obtained by completely specifying $q$, parametrizing $q$ or considering $q$ as unknown.

Relations with generalized multiplicative models

The GPH1 models can be formulated in terms of resources other than exponential. It helps to choose the function $q$. Denote by

$$f^G_{x(\cdot)}(t) = H(S_{x(\cdot)}(t)) \qquad (53)$$

the $G$-resource used until the moment $t$; here $H = G^{-1}$.

Definition 9. The generalized multiplicative (GM) model (Bagdonavicius & Nikulin (1994)) with the resource survival function $G$ holds on $E$ if there exist a positive function $r$ and a survival function $S_0$ such that for all $x(\cdot) \in E$

$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = r\{x(t)\}\,\frac{\partial f^G_0(t)}{\partial t}, \qquad (54)$$

where $f^G_0(t) = H(S_0(t))$.

The equality (54) implies that for all $x(\cdot) \in E$

$$S_{x(\cdot)}(t) = G\left\{\int_0^t r(x(\tau))\,dH(S_0(\tau))\right\}. \qquad (55)$$

If $E_0$ is a set of constant in time covariates then for all $x \in E_0$

$$S_x(t) = G\{r(x)H(S_0(t))\}. \qquad (56)$$

For any $x_1, x_2 \in E_0$

$$S_{x_2}(t) = G\{\rho(x_1,x_2)H(S_{x_1}(t))\}, \qquad (57)$$

where $\rho(x_1,x_2) = r(x_2)/r(x_1)$.

Proposition 10. Suppose that the integral

$$\int_0^x \frac{dv}{q(v)}$$

converges for all $x > 0$. The GPH1 model (51) holds on $E$ iff there exists a survival function $G$ such that the GM model (54) holds on $E$.

Proof. Suppose that the GPH1 model (51) holds on $E$. Define the function $H(u)$ by the formula

$$H(u) = \int_0^{-\ln u} \frac{dv}{q(v)}.$$

Then the used resource $f^G_{x(\cdot)}(t)$, defined by the formula (53), verifies the equalities

$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = \frac{\alpha_{x(\cdot)}(t)}{q\{A_{x(\cdot)}(t)\}} = r\{x(t)\}\,\alpha_0(t) = r\{x(t)\}\,\frac{\partial f^G_0(t)}{\partial t},$$

where $S_0$ is such that $H(S_0(t)) = A_0(t)$. So the GM model holds on $E$.

Vice versa, if there exists a survival function $G$ such that the GM model holds on $E$ then (55) implies that for all $x(\cdot) \in E$

$$S_{x(\cdot)}(t) = G\left(\int_0^t r\{x(\tau)\}\,dH(S_0(\tau))\right)$$

and

$$\alpha_{x(\cdot)}(t) = -e^{A_{x(\cdot)}(t)}\,G'\big(H(e^{-A_{x(\cdot)}(t)})\big)\,r\{x(t)\}\,H'\{S_0(t)\}\,S_0'(t).$$

Put

$$q(u) = -e^{u}\,G'(H(e^{-u})), \qquad \alpha_0^*(t) = H'\{S_0(t)\}\,S_0'(t).$$

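Then for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,q\{A_{x(\cdot)}(t)\}\,\alpha_0^*(t).$$

So the GPH1 model holds. The proof is complete.

Corollary 3. The GPH1 model with specified $q$ is equivalent to the GM model with the survival function of the resource $G = H^{-1}$, and the following relation holds:

$$q(u) = -e^{u}\,G'(H(e^{-u})). \qquad (58)$$

Put $A_0(u) = H(S_0(u))$.

In terms of survival functions the GPH1 model is written

$$S_{x(\cdot)}(t) = G\left(\int_0^t r\{x(u)\}\,dA_0(u)\right), \qquad (59)$$

where

$$G = H^{-1}, \qquad H(p) = \int_0^{-\ln p}\frac{dv}{q(v)}, \qquad A_0(u) = \int_0^u \alpha_0(v)\,dv.$$

Under the AAD model

$$S_{x(\cdot)}(t) = S_{x_0}\left(\int_0^t r\{x(u)\}\,du\right).$$

Hence

$$f^{S_{x_0}}_{x(\cdot)}(t) = S_{x_0}^{-1}(S_{x(\cdot)}(t)) = \int_0^t r\{x(u)\}\,du$$

and

$$\frac{\partial f^{S_{x_0}}_{x(\cdot)}(t)}{\partial t} = r\{x(t)\},$$

i.e., the AAD model is the GM model with $S_{x_0}$-resource.

Constant stresses

If the GPH1 model holds on a set of constant stresses $E_0$, then

$$\alpha_x(t) = r(x)\,q(A_x(t))\,\alpha_0(t), \qquad S_x(t) = G(r(x)A_0(t)).$$

For all $x_1, x_2 \in E_0$

$$S_{x_2}(t) = G(\rho(x_1,x_2)H(S_{x_1}(t))).$$

Simple step-stresses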

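If the GPH1 model holds on a set of simple step-stresses $E_1$, then for any $x(\cdot) \in E_1$ of the form (3)

$$S_{x(\cdot)}(t) = \begin{cases} G(r(x_1)A_0(t)), & 0 \le t < t_1,\\ G(r(x_1)A_0(t_1) + r(x_2)(A_0(t) - A_0(t_1))), & t \ge t_1. \end{cases}$$

The survival function $S_{x(\cdot)}$ can be written in terms of the function $S_{x_1}$:

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1,\\ G\{H(S_{x_1}(t_1)) + \rho(x_1,x_2)(H(S_{x_1}(t)) - H(S_{x_1}(t_1)))\}, & t \ge t_1. \end{cases}$$

For a general step-stress $x(\cdot) \in E_m$ and $t \in [t_{i-1}, t_i)$

$$S_{x(\cdot)}(t) = G\left\{\sum_{j=1}^{i-1} r(x_j)(A_0(t_j) - A_0(t_{j-1})) + r(x_i)(A_0(t) - A_0(t_{i-1}))\right\},$$

$$S_{x(\cdot)}(t) = G\left\{\big(H(S_{x_i}(t)) - H(S_{x_i}(t_{i-1}))\big) + \sum_{j=1}^{i-1}\rho(x_1,x_j)\big(H(S_{x_1}(t_j)) - H(S_{x_1}(t_{j-1}))\big)\right\}.$$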

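Relations between survival functions under constant and non-constant stresses

As in the case of the AAD model, consider some useful relations between survival functions under constant and time-varying stresses.

Proposition 11. Suppose that $x(\cdot), x(\tau), x_0 \in E$ for all $\tau > 0$. If the GM model holds on $E$, then

$$S_{x(\cdot)}(t) = G\left(\int_0^t \frac{H(S_{x(\tau)}(t))}{H(S_{x_0}(t))}\,dH(S_{x_0}(\tau))\right) = G\left(\int_0^t H(S_{x(\tau)}(\tau))\,d\log H(S_{x_0}(\tau))\right). \qquad (60)$$

Corollary 4. If $x(\cdot) \in E_1$ is a simple step-stress of the form (3), $x_0 \in E_0$, then

$$S_{x(\cdot)}(t) = \begin{cases} G\{\rho(x_0,x_1)H(S_{x_0}(t))\}, & 0 \le t < t_1,\\ G\{\rho(x_0,x_1)H(S_{x_0}(t_1)) + \rho(x_0,x_2)(H(S_{x_0}(t)) - H(S_{x_0}(t_1)))\}, & t \ge t_1. \end{cases} \qquad (61)$$

If $x(\cdot) \in E_m$ is a general step-stress of the form (8), $x_0 \in E_0$, then for all $t \in [t_{i-1}, t_i)$

$$S_{x(\cdot)}(t) = G\left\{\rho(x_0,x_i)\big(H(S_{x_0}(t)) - H(S_{x_0}(t_{i-1}))\big) + \sum_{j=1}^{i-1}\rho(x_0,x_j)\big(H(S_{x_0}(t_j)) - H(S_{x_0}(t_{j-1}))\big)\right\}. \qquad (62)$$

Proof. The equality (55) implies that there exists a functional $r_1 : E \to [0,\infty)$ such that

$$S_{x(\cdot)}(t) = G\left\{\int_0^t r_1[x(\tau)]\,dH(S_{x_0}(\tau))\right\}. \qquad (63)$$

So for all fixed $\tau$,

$$r_1\{x(\tau)\} = \frac{H(S_{x(\tau)}(t))}{H(S_{x_0}(t))}. \qquad (64)$$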

Putting $r_1\{x(\tau)\}$ in the equality (63), the first of the equalities (60) is obtained. Putting $t = \tau$ in (64) and the obtained expression of $r_1\{x(\tau)\}$ in (63), the second of the equalities (60) is obtained.

Characterization of the GM model with constant in time stresses

At first glance it looks like there are too many GM models. It appears that it is not so. Indeed, assume that a function $G$ is continuous and strictly decreasing on $[0,\infty)$ and

$$G_1(u) = G((u/\theta)^{\nu}).$$

Let $E_0 = [x_0, x_1] \subset \mathbb{R}$ be an interval of constant in time stresses, $\{S_x,\ x \in [x_0,x_1]\}$ be a class of continuous survival functions such that $S_x(t) > S_y(t)$ for all $x, y \in E_0$, $x < y$, $t > 0$, and let $H = G^{-1} : (0,1] \to [0,\infty)$ and $H_1 = G_1^{-1}$ be the inverse functions of $G$ and $G_1$, respectively. If the GM model with the resource survival function $G$ holds on $E_0$, then the equality (57) implies that

$$H(S_x(t)) = \lambda(x)H(S_{x_0}(t)), \quad t > 0,\ x \in [x_0,x_1], \qquad (65)$$

where $\lambda(x) = \rho(x_0,x)$. Then

$$H_1(S_x(t)) = \lambda^{1/\nu}(x)\,H_1(S_{x_0}(t)), \quad t > 0,\ x \in [x_0,x_1]. \qquad (66)$$

The converse result also holds:

Proposition 12. Assume that a function $G$ is continuous and strictly decreasing on $[0,\infty)$ and the equality (65) holds. Then the equality (66) also holds iff $G_1(u) = G((u/\theta)^{\nu})$, $u \in [0,\infty)$, for some positive constants $\theta$ and $\nu$.

Proof. 1) It was just shown that if the GM model holds for the survival function $G$ and $G_1(t) = G((t/\theta)^{\nu})$ then the GM model holds for the survival function $G_1$.

2) Suppose that the GM model holds for the survival functions $G$ and $G_1$, i.e. the equalities (65) and (66) hold. Introduce a function $D : [0,\infty) \to [0,\infty)$ such that $D(u) = H_1(G(u))$, $u \in [0,\infty)$. In this case $H_1(p) = D(H(p))$, $p \in (0,1]$, and the relation (66) can be rewritten as follows:

$$D(H(S_x(t))) = \lambda^{1/\nu}(x)\,D(H(S_{x_0}(t))), \quad t > 0,\ x \in [x_0,x_1].$$

Using (65) we obtain that

$$D(\lambda(x)H(S_{x_0}(t))) = \lambda^{1/\nu}(x)\,D(H(S_{x_0}(t))), \quad t > 0,\ x \in [x_0,x_1],$$

with the initial conditions $D(0) = 0$ and $\lim_{u\to\infty} D(u) = \infty$. Putting $y = H(S_{x_0}(t))$ we obtain that

$$D(\lambda(x)y) = \lambda^{1/\nu}(x)\,D(y), \quad y \in [0,\infty),\ x \in [x_0,x_1],$$

or for $v = \ln y$

$$Q(\ln\lambda(x) + v) = \frac{1}{\nu}\ln\lambda(x) + Q(v), \quad v \in \mathbb{R},\ x \in [x_0,x_1],$$

where $Q(v) = \ln(D(e^{v}))$. This equality leads to the equality

$$Q(v) = av + b, \qquad a = \frac{1}{\nu}.$$

It implies that $D(y) = \theta y^{1/\nu}$, where $\theta = e^{b}$. Consequently,

$$G(y) = G_1(D(y)) = G_1(\theta y^{1/\nu}) \quad \text{and} \quad G_1(u) = G((u/\theta)^{\nu}),\ u \in [0,\infty).$$

The proof is complete.

This proposition implies that, for example, the PH model is a submodel of the GM model not only when $G$ is the standard exponential survival function but also when it is any exponential or two-parameter Weibull survival function. So submodels of the GM model form classes generated by classes of resource distributions which differ only by shape and scale parameters.

Relations with the frailty models

A method of choosing the function $q$ is obtained by using relations between the GPH1 models and the frailty models with covariates. The hazard rate can be influenced not only by the observable stress $x(\cdot)$ but also by a nonobservable positive random covariate $Z$, called the frailty variable, see Hougaard (1986). Suppose that for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t \mid Z = z) = z\,r(x(t))\,\alpha_0(t). \qquad (67)$$

Then

$$S_{x(\cdot)}(t \mid Z = z) = \exp\left\{-z\int_0^t r(x(\tau))\,dA_0(\tau)\right\}$$

and

$$S_{x(\cdot)}(t) = E\exp\left\{-Z\int_0^t r(x(\tau))\,dA_0(\tau)\right\} = G\left\{\int_0^t r(x(\tau))\,dA_0(\tau)\right\}, \qquad (68)$$

where $G(s) = E e^{-sZ}$. If we put $S_0(t) = G\{A_0(t)\}$, the equality (68) implies that for all $x(\cdot) \in E$

$$S_{x(\cdot)}(t) = G\left\{\int_0^t r(x(\tau))\,dH(S_0(\tau))\right\}, \qquad (69)$$

where $H = G^{-1}$. We obtained that the frailty model defined by a frailty variable $Z$, the GM model with the survival function of the resource $G(s) = E e^{-sZ}$, and the GPH1 model with the function $q$ defined by (58) give the same survival function under any stress $x(\cdot) \in E$.

Relations with the linear transformation models

Under constant in time stresses the GPH1 model is related to the linear transformation (LT) model, Dabrowska & Doksum (1988b), Cheng, Wei, Ying (1995). Consider the set $E_0$ of constant in time stresses and let $T_x$ denote the time-to-failure under the covariate $x \in E_0$. The LT model holds on $E_0$ if for all $x \in E_0$

$$h(T_x) = -\beta^T x + \varepsilon, \qquad (70)$$

where $h : [0,\infty) \to [0,\infty)$ is a strictly increasing function, and $\varepsilon$ is a random error with distribution function $Q$. The relation (70) implies that for all $x \in E_0$

$$S_x(t) = G\{e^{\beta^T x + h(t)}\} = G\{e^{\beta^T x}H(S_0(t))\},$$

where $G(t) = 1 - Q(\ln t)$, $S_0(t) = G\{e^{h(t)}\}$. Therefore, in the case of constant in time stresses, the frailty model defined by the frailty variable $Z$, the GM model with the survival function of the resource $G(s) = E e^{-sZ}$, the GPH1 model with the function $q$ defined by (58) and the LT model with the distribution function $Q(x) = 1 - G(e^{x})$ of the random error $\varepsilon$ give the same expression of survival functions.

6. The main classes of GPH models

Particular classes of the GPH models are very important for survival analysis and accelerated life testing. Numerous examples of real data show that, taking two constant in time covariates, say $x_1$ and $x_2$, the ratio $\alpha_{x_2}(t)/\alpha_{x_1}(t)$ (which is constant under the PH model) can be increasing or decreasing in time, and even a cross-effect of hazard rates can be observed. Such data can be modelled by submodels of the GPH1 or the more general GPH2 model. Consider possible parametrizations in the GPH models.

GPH model with a monotone ratio of hazard rates

Consider the GPH1 model with parametrization

$$q(u) = (1+u)^{\gamma+1}, \qquad (71)$$

where $\gamma \in \mathbb{R}$ is an unknown scalar parameter. We have the model

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,(1 + A_{x(\cdot)}(t))^{\gamma+1}\,\alpha_0(t). \qquad (72)$$
The particular case of this model is the PH model when 7 = — 1. Suppose that 7 < 0 and c 0 = r(x2)/r(xi), (xi,X2 G Eo)- Then , w

,,

f l-jr{x2)Ao \.l-Tr(xi)An(t))

(t)l

•"

The ratio aXl(t)/aXl(t) has the following properties: a) if —1 < 7 < 0, then the ratio aXl(t)/aXl(t) increases from the value co until the value c^, = limj-too «»xj(0/ a *i(0i where the constant cx can take any value in the interval (co, 00); b) if 7 = — 1 (PH model), the ratio aX2(t)/aXl(t) is constant in time. c) if 7 < —1, then the ratio aX2(t)/aXl(t) decreases from the value CQ until the value CQO G (1,CO). G P H m o d e l w i t h cross-effects of h a z a r d r a t e s To obtain a cross-effect of hazard rates consider the following submodel of GPH2: "«(•>(«) =r(x(t))(l+Axi)(t)yT^^ao(t). T

Suppose that co = r(x2)/r(xi)

(73) r

> 1, (xlt x 2 € Bo), and 7 X2 < 7 * i < 0. Then

ax,{t)/aXl(t)

(l-7Tx2r(z2)^o(Or = c0T

(l- 7 *ir(*iMo(0)

_Tr

**

^

and a „ {0)/aXl (0) = c 0 > 1,

lim ax, (t)/aXl

(t) = 0.

t—too
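The behaviour of the hazard ratio in this section can be explored numerically. The following sketch is an illustration under stated assumptions: it uses the closed form of the hazard under a constant stress, $r(x)(1 - \gamma r(x)A_0(t))^{-1/\gamma - 1}\alpha_0(t)$ with $\gamma < 0$, which follows from integrating the GPH1 equation with $q(u) = (1+u)^{\gamma+1}$; the function names and the numerical values $r(x_1) = 1$, $r(x_2) = 2$ are hypothetical. The ratio is expressed as a function of $A_0(t)$, so no particular baseline has to be chosen.

```python
def hazard_factor(cum_baseline, r, gamma):
    """Hazard under a constant stress, up to the common factor alpha_0(t).

    Uses r * (1 - gamma * r * A_0(t)) ** (-1/gamma - 1), valid for gamma < 0.
    """
    return r * (1.0 - gamma * r * cum_baseline) ** (-1.0 / gamma - 1.0)


def hazard_ratio(cum_baseline, r1, r2, gamma1, gamma2):
    """Ratio alpha_{x2}(t) / alpha_{x1}(t) as a function of A_0(t).

    gamma1, gamma2 play the role of gamma^T x_1, gamma^T x_2; set them equal
    to recover the monotone-ratio model (72).
    """
    return (hazard_factor(cum_baseline, r2, gamma2)
            / hazard_factor(cum_baseline, r1, gamma1))


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 5.0, 50.0]
    print("gamma = -0.5:", [round(hazard_ratio(a, 1.0, 2.0, -0.5, -0.5), 3) for a in grid])
    print("gamma = -1  :", [round(hazard_ratio(a, 1.0, 2.0, -1.0, -1.0), 3) for a in grid])
    print("gamma = -2  :", [round(hazard_ratio(a, 1.0, 2.0, -2.0, -2.0), 3) for a in grid])
    print("cross-effect:", [round(hazard_ratio(a, 1.0, 2.0, -1.0, -2.0), 3) for a in grid])
```

With equal exponents the ratio is monotone (cases a to c); with $\gamma^T x_2 < \gamma^T x_1 < 0$ it starts at $c_0 > 1$ and falls below 1.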

So we have a cross-effect of the hazard rates.

Generalization of the gamma frailty model with covariates

Consider the GPH1 model with parametrization

$$q(u) = e^{\gamma u}, \quad \gamma \in \mathbb{R}. \qquad (74)$$

We have the model

$$\alpha_{x(\cdot)}(t) = r(x(t))\,e^{\gamma A_{x(\cdot)}(t)}\,\alpha_0(t). \qquad (75)$$

If $\gamma = 0$, it becomes the usual PH model. Suppose that $x_0 < x(\cdot)$ and that the support of $S_{x(\cdot)}$ is $I_{x(\cdot)} = [0, sp_{x(\cdot)})$. The model (75) implies that for all $x(\cdot)$ and $t \in I_{x(\cdot)}$

$$S_{x(\cdot)}(t) = \begin{cases} \left\{1 + \displaystyle\int_0^t r\{x(\tau)\}\,dS^{\gamma}_{x_0}(\tau)\right\}^{1/\gamma}, & \text{if } \gamma \ne 0,\\[2mm] \exp\left\{\displaystyle\int_0^t r\{x(\tau)\}\,d\ln S_{x_0}(\tau)\right\}, & \text{if } \gamma = 0, \end{cases}$$

and for constant over time stresses $x > x_0$ and any $t \in I_x$

$$S_x(t) = \left\{1 - r(x)\big(1 - S^{\gamma}_{x_0}(t)\big)\right\}^{1/\gamma}.$$

For $\gamma > 0$ the upper bound $sp_{x(\cdot)}$ of the support $I_{x(\cdot)}$ satisfies the equation

$$1 + \int_0^{sp_{x(\cdot)}} r\{x(\tau)\}\,dS^{\gamma}_{x_0}(\tau) = 0.$$

Note that the condition $x_0 < x(\cdot)$ implies that $sp_{x(\cdot)} < sp_{x_0}$. For $\gamma < 0$ the upper bound $sp_{x(\cdot)}$ of the support $I_{x(\cdot)}$ satisfies the equation

$$\int_0^{sp_{x(\cdot)}} r\{x(\tau)\}\,dS^{\gamma}_{x_0}(\tau) = \infty.$$

If all stresses are constant over time, then $sp_x = sp_{x_0}$ for all $x$. If $sp_{x_0} = \infty$, then $sp_x = \infty$ for all $x$. Note that the model (75) is a generalization of the gamma frailty model (GFM) (see Vaupel et al. (1979)) with covariates. Indeed, suppose that the frailty variable $Z$ follows a gamma distribution with the scale parameter $\theta > 0$, the shape parameter $k > 0$ and the density

$$p_Z(z) = \frac{z^{k-1}}{\theta^{k}\Gamma(k)}\,e^{-z/\theta}, \quad z > 0. \qquad (76)$$

The survival function of the resource is

$$G(t) = E e^{-tZ} = (1 + \theta t)^{-k}.$$

Put $\gamma = -1/k < 0$. The formula (58) implies that

$$q(u) = -\frac{\theta}{\gamma}\,e^{\gamma u}, \quad \gamma < 0. \qquad (77)$$

The proportionality constant can be included in $\alpha_0$, and $q(u)$ can be written in the form $q(u) = e^{\gamma u}$, $\gamma < 0$. We have the gamma frailty model. Consider the frailty model with a density $p_Z(z)$ which is the inverse Laplace transformation of the survival function

$$G(t) = (1 - \gamma t)^{1/\gamma}\,\mathbf{1}_{[0,1/\gamma)}(t). \qquad (78)$$

The formula (58) implies that $q(u) = e^{\gamma u}$, $\gamma > 0$. We obtained that under the model (77) the survival function of the resource is

$$G(t) = (1 - \gamma t)^{1/\gamma}, \quad \gamma < 0. \qquad (79)$$

For $\gamma > 0$ the support of $G$ is $[0, 1/\gamma)$. For constant in time stresses $x_1, x_2 \in E_0$ and $\gamma < 0$ the equality (79) implies the generalized proportional odds-rate (GPOR) model (Dabrowska & Doksum (1988)):

$$\frac{S^{\gamma}_{x_2}(t) - 1}{S^{\gamma}_{x_1}(t) - 1} = \frac{r(x_2)}{r(x_1)}.$$
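The link between the gamma frailty and the function $q$ can be illustrated numerically. The sketch below is not part of the original derivation: it assumes a gamma frailty with shape $k = 2$ and scale $\theta = 0.5$ (values chosen only for the example, with $k \ge 1$ so that the density is finite at zero), approximates the Laplace transform $E e^{-sZ}$ by a trapezoidal rule, and evaluates $q(u) = -e^{u}G'(H(e^{-u}))$, the relation between $q$ and $G$ used in the proof of Proposition 10.

```python
import math


def gamma_density(z, k, theta):
    """Gamma density with shape k and scale theta, as in (76); assumes k >= 1."""
    return z ** (k - 1) * math.exp(-z / theta) / (theta ** k * math.gamma(k))


def laplace_numeric(s, k, theta, upper=60.0, steps=60000):
    """Trapezoidal approximation of E exp(-s Z) for the gamma frailty Z."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-s * z) * gamma_density(z, k, theta)
    return total * h


def resource_survival(s, k, theta):
    """Closed form G(s) = (1 + theta * s) ** (-k)."""
    return (1.0 + theta * s) ** (-k)


def q_from_resource(u, k, theta):
    """q(u) = -exp(u) * G'(H(exp(-u))) with H = G^{-1} for the gamma frailty."""
    h_value = (math.exp(u / k) - 1.0) / theta            # H(e^{-u})
    g_prime = -k * theta * (1.0 + theta * h_value) ** (-k - 1.0)
    return -math.exp(u) * g_prime


if __name__ == "__main__":
    k, theta = 2.0, 0.5
    print(laplace_numeric(1.5, k, theta), resource_survival(1.5, k, theta))
    # q(u) should be proportional to exp(gamma * u) with gamma = -1/k.
    print(q_from_resource(0.7, k, theta), k * theta * math.exp(-0.7 / k))
```

The two numbers on each printed line agree, in line with $G(t) = (1+\theta t)^{-k}$ and with $q(u)$ being proportional to $e^{\gamma u}$, $\gamma = -1/k$.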

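Inverse Gaussian frailty model

Consider the GPH1 model with parametrization

$$q(u) = \frac{1}{1 + \gamma u}, \quad \gamma > 0. \qquad (80)$$

We obtain the model

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,\frac{\alpha_0(t)}{1 + \gamma A_{x(\cdot)}(t)}. \qquad (81)$$

Note that this model is the inverse Gaussian frailty model with covariates. Indeed, suppose that the frailty variable $Z$ has the inverse Gaussian distribution with the density

$$p_Z(z) = \left(\frac{\sigma}{\pi}\right)^{1/2} e^{\sqrt{4\sigma\theta}}\,z^{-3/2}\,e^{-\theta z - \sigma/z}, \quad z > 0. \qquad (82)$$

The formulas (68) and (58) imply that

$$q(u) = \frac{(\sigma/\theta)^{1/2}}{1 + (4\sigma\theta)^{-1/2}u}. \qquad (83)$$

The proportionality constant can be included in $\alpha_0$ and $q(u)$ can be written in the form (80).

Consider GM models with $G$ specified. These models are alternatives to the PH model.

Generalized logistic regression model

If the distribution of the resource is loglogistic, i.e.

$$G(t) = \frac{1}{1+t}\,\mathbf{1}\{t \ge 0\}, \qquad (84)$$

then $q(t) = e^{-t}$ and the GM model can be formulated in the following way:

$$\frac{1 - S_{x(\cdot)}(t)}{S_{x(\cdot)}(t)} = \int_0^t r\{x(\tau)\}\,d\left(\frac{1 - S_0(\tau)}{S_0(\tau)}\right).$$

If $x(\cdot) \in E_1$ is the step-stress of the form (3) then we have

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1,\\ \left[\dfrac{1}{S_{x_1}(t_1)} + \rho(x_1,x_2)\left(\dfrac{1}{S_{x_1}(t)} - \dfrac{1}{S_{x_1}(t_1)}\right)\right]^{-1}, & t \ge t_1. \end{cases}$$

If stresses are constant in time then we obtain the model

$$\frac{1 - S_x(t)}{S_x(t)} = r(x)\,\frac{1 - S_0(t)}{S_0(t)}.$$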

It is the analogue of the logistic regression model which is used for the analysis of dichotomous data, when the dependence of the probability of "success" on some factors is analysed. The obtained model is close to the Cox model when $t$ is small.

Generalized probit model

If the resource is lognormal, then

$$G(t) = 1 - \Phi(\log t), \quad t > 0,$$

where $\Phi$ is the distribution function of the standard normal law. If covariates are constant in time then in terms of survival functions the GM model can be written as follows:

$$\Phi^{-1}(S_x(t)) = -\log(r(x)) + \Phi^{-1}(S_0(t)).$$

It is the generalized probit model, see Dabrowska & Doksum (1988).

7. Parametrization of the function r in AAD and GPH models

Following Viertl (1988), consider parametrization of the function $r$ in the AAD and GPH models. If the AAD model holds on $E_0$, then for all $x_1, x_2 \in E_0$

$$S_{x_2}(t) = S_{x_1}(\rho(x_1,x_2)\,t), \qquad (87)$$

where the function $\rho(x_1,x_2) = r(x_2)/r(x_1)$ shows the degree of scale variation. It is evident that $\rho(x,x) = 1$. In the case of the more general GPH1 model

$$S_{x_2}(t) = G(\rho(x_1,x_2)H(S_{x_1}(t))), \qquad (88)$$

where

$$\rho(x_1,x_2) = \frac{H(S_{x_2}(t))}{H(S_{x_1}(t))} = \frac{f^G_{x_2}(t)}{f^G_{x_1}(t)}$$

shows the degree of the resource usage rate variation. Suppose at first that $x$ is unidimensional. The rate of scale (AAD model) or resource usage rate (GPH1 model) variation can be defined by the infinitesimal characteristic (see Viertl (1988) for the AAD model):

$$\delta(x) = \lim_{\Delta x \to 0}\frac{\rho(x, x + \Delta x) - \rho(x,x)}{\Delta x} = [\log r(x)]'.$$

So for all $x \in E_0$ the function $r(x)$ is given by the formula:

$$r(x) = r(x_0)\exp\left\{\int_{x_0}^{x}\delta(v)\,dv\right\}.$$

Suppose that $\delta(x) = \alpha u(x)$, where $u(x)$ is a known function and $\alpha > 0$.

In this case

$$r(x) = e^{\beta_0 + \beta_1 z(x)}, \qquad (90)$$

where $z(x)$ is some known function, $\beta_0, \beta_1$ are unknown parameters.

Example 1. $\delta(x) = \alpha$, i.e. the rate of scale changing is constant. Then $r(x) = e^{\beta_0 + \beta_1 x}$, where $\beta_1 > 0$. It is the so-called log-linear model.

Example 2. $\delta(x) = \alpha/x$. Then $r(x) = e^{\beta_0 + \beta_1 \log x} = \alpha_1 x^{\beta_1}$, where $\beta_1 > 0$. It is the so-called power rule model.
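The relation $\delta(x) = [\log r(x)]'$ behind Examples 1 and 2 can be checked with a finite difference. The sketch below is only illustrative: the parameter values $\beta_0 = 0.3$, $\beta_1 = 0.8$ and the function names are assumptions made for the example.

```python
import math


def r_log_linear(x, beta0, beta1):
    """Log-linear model: r(x) = exp(beta0 + beta1 * x)."""
    return math.exp(beta0 + beta1 * x)


def r_power_rule(x, beta0, beta1):
    """Power rule model: r(x) = exp(beta0 + beta1 * log x), x > 0."""
    return math.exp(beta0 + beta1 * math.log(x))


def delta_numeric(r, x, step=1e-6):
    """Central finite difference of log r at x, approximating delta(x)."""
    return (math.log(r(x + step)) - math.log(r(x - step))) / (2.0 * step)


if __name__ == "__main__":
    beta0, beta1, x = 0.3, 0.8, 2.0
    # Log-linear model: delta(x) should be the constant beta1.
    print(delta_numeric(lambda v: r_log_linear(v, beta0, beta1), x))
    # Power rule model: delta(x) should equal beta1 / x.
    print(delta_numeric(lambda v: r_power_rule(v, beta0, beta1), x), beta1 / x)
```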

Example 3. $\delta(x) = \alpha/x^2$. Then $r(x) = e^{\beta_0 + \beta_1/x} = \alpha_1 e^{\beta_1/x}$, where $\beta_1 < 0$. It is the so-called Arrhenius model.

Example 4. $\delta(x) = \alpha/(x(1-x))$. Then

$$r(x) = e^{\beta_0 + \beta_1\ln\frac{x}{1-x}} = \alpha_1\left(\frac{x}{1-x}\right)^{\beta_1}, \quad 0 < x < 1,$$

where $\beta_1 > 0$. It is the model of Meeker-Luvalle (1995).

The Arrhenius model is widely used to model product life when the stress is the temperature; the power rule model, when the stress is voltage or mechanical loading; the log-linear model is applied in endurance and fatigue data analysis and in testing various electronic components (see Nelson (1990)). The model of Meeker-Luvalle is used when $x$ is the proportion of humidity. If it is not clear which of the first three models to choose, one can take a larger class of models. For example, all these models are particular cases of the class of models determined by

$$\delta(x) = \alpha x^{\varepsilon - 1}$$

with unknown $\varepsilon$ or, in terms of the function $r(x)$, by

$$r(x) = \begin{cases} e^{\beta_0 + \beta_1 (x^{\varepsilon} - 1)/\varepsilon}, & \text{if } \varepsilon \ne 0,\\ e^{\beta_0 + \beta_1 \log x}, & \text{if } \varepsilon = 0. \end{cases}$$

In this case the parameter $\varepsilon$ must be estimated. The model (90) can be generalized. One can suppose that $\delta(x)$ is a linear combination of some known functions of the stress:

$$\delta(x) = \sum_{i=1}^{k}\alpha_i u_i(x).$$

In such a case

$$r(x) = \exp\left\{\beta_0 + \sum_{i=1}^{k}\beta_i z_i(x)\right\},$$

where $z_i(x)$ are some known functions of the stress, $\beta_0, \dots, \beta_k$ are unknown (possibly not all of them) parameters.

Example 5. $\delta(x) = 1/x + \alpha/x^2$. Then

$$r(x) = e^{\beta_0 + \beta_1\log x + \beta_2/x} = \alpha_1\,x\,e^{\beta_2/x},$$

where $\beta_1 = 1$, $\beta_2 < 0$. It is the so-called Eyring model, applied when the stress $x$ is the temperature.

Example 6. $\delta(x) = \sum_{i=1}^{k}\alpha_i/x^{i}$. Then

$$r(x) = \exp\left\{\beta_0 + \beta_1\log x + \sum_{i=1}^{k-1}\beta_{i+1}/x^{i}\right\}.$$

It is the so-called generalized Eyring model.

Suppose now that the stress $x = (x_1, \dots, x_m)$ is multidimensional. Define (see Viertl (1988)) the infinitesimal characteristics $\delta_i(x)$ by the equalities

$$\delta_i(x) = \lim_{\Delta x_i \to 0}\frac{\rho(x, x + \Delta x_i e_i) - \rho(x,x)}{\Delta x_i} = \frac{\partial \log r(x)}{\partial x_i},$$

where $e_i = (0, \dots, 1, \dots, 0)$. The unity is the $i$th coordinate.

Generalizing the unidimensional case, $\delta_i(x)$ can be parametrized in the following manner:

$$\delta_i(x) = \sum_{j=1}^{k_i}\alpha_{ij}u_{ij}(x),$$

where $u_{ij}(x)$ are known functions, $\alpha_{ij}$ are unknown constants. In this case

$$r(x) = \exp\left\{\beta_0 + \sum_{i=1}^{m}\sum_{j=1}^{k_i}\beta_{ij}z_{ij}(x)\right\},$$

where $z_{ij}(x)$ are known functions, $\beta_{ij}$ are unknown constants.

Example 7. $\delta_1(x) = 1/x_1 + (\alpha_{11} + \alpha_{12}x_2)/x_1^2$ and $\delta_2(x) = \alpha_{21} + \alpha_{22}/x_1$. Then

$$r(x) = \exp\{\beta_0 + \beta_1\log x_1 + \beta_2 x_2 + \beta_3/x_1 + \beta_4 x_2/x_1\}.$$

It is the so-called generalized Eyring model. This model is used for certain semiconductor materials, when $x_1$ is the temperature and $x_2$ is the voltage.

Example 8. $\delta_i(x) = \alpha_i u_i(x_i)$, where $u_i$ are known functions. Then

$$r(x) = \exp\left\{\beta_0 + \sum_{j=1}^{m}\beta_j z_j(x_j)\right\},$$

where $z_j$ are known functions. It is the so-called generalized Arrhenius model. It is also called the log-linear model.

8. Generalized additive and additive-multiplicative models

Definition 10. The generalized additive (GA) model (Bagdonavicius & Nikulin (1995)) holds on $E$ if there exist a function $a$ on $E$ and a survival function $S_0$ such that for all $x(\cdot) \in E$

$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = \frac{\partial f^G_0(t)}{\partial t} + a(x(t)) \qquad (91)$$

with the initial conditions $f^G_0(0) = f^G_{x(\cdot)}(0) = 0$; here $f^G_0(t) = H(S_0(t))$. So the stress influences additively the rate of resource usage. The last equation implies that

$$S_{x(\cdot)}(t) = G\left(H(S_0(t)) + \int_0^t a(x(\tau))\,d\tau\right). \qquad (92)$$

In terms of exponential resource usage the GA model can be written in the form

$$\alpha_{x(\cdot)}(t) = q\{A_{x(\cdot)}(t)\}\,(\alpha_0(t) + a(x(t))).$$

The particular case of the GA model is the additive hazards (AH) model:

$$\alpha_{x(\cdot)}(t) = \alpha_0(t) + a(x(t)). \qquad (93)$$

Both the GM and the GA models can be included into the following model.

Definition 11. The generalized additive-multiplicative (GAM) model (Bagdonavicius & Nikulin (1997)) holds on $E$ if there exist functions $a$ and $r$ (positive) on $E$ and a survival function $S_0$ such that for all $x(\cdot) \in E$

$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = r\{x(t)\}\,\frac{\partial f^G_0(t)}{\partial t} + a(x(t)) \qquad (94)$$

with the initial conditions $f^G_0(0) = f^G_{x(\cdot)}(0) = 0$; here $f^G_0(t) = H(S_0(t))$. So the stress influences the rate of resource usage both multiplicatively and additively. The last equation implies that

$$S_{x(\cdot)}(t) = G\left(\int_0^t r\{x(\tau)\}\,dH(S_0(\tau)) + \int_0^t a(x(\tau))\,d\tau\right). \qquad (95)$$

In terms of exponential resource usage the GAM model can be written in the form:

$$\alpha_{x(\cdot)}(t) = q\{A_{x(\cdot)}(t)\}\,(r\{x(t)\}\,\alpha_0(t) + a(x(t))).$$

In the particular case of the exponential resource we obtain the additive-multiplicative hazards (AMH) model, see Lin and Ying (1996):

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,\alpha_0(t) + a(x(t)). \qquad (96)$$
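As an illustration of the AMH model (96), the survival function $\exp\{-\int_0^t [r\{x(u)\}\alpha_0(u) + a(x(u))]\,du\}$ can be evaluated numerically for a step-stress. Everything specific in the sketch below is an assumption made for the example: the baseline $\alpha_0(u) = 2u$, the functions $r(x) = x$ and $a(x) = 0.1x$, and a stress equal to 1 before $t_1 = 1$ and 2 afterwards.

```python
import math


def amh_survival_numeric(t, stress, r, a, alpha0, steps=20000):
    """Midpoint-rule value of exp(-int_0^t [r(x(u)) * alpha0(u) + a(x(u))] du)."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        x = stress(u)
        total += (r(x) * alpha0(u) + a(x)) * h
    return math.exp(-total)


# Hypothetical ingredients, chosen only for illustration.
def example_stress(u):
    return 1.0 if u < 1.0 else 2.0


def example_r(x):
    return x


def example_a(x):
    return 0.1 * x


def example_alpha0(u):
    return 2.0 * u


if __name__ == "__main__":
    numeric = amh_survival_numeric(2.0, example_stress, example_r,
                                   example_a, example_alpha0)
    # Closed form for this example: (1 + 0.1) on [0, 1) plus (6 + 0.2) on [1, 2].
    print(numeric, math.exp(-(1.1 + 6.2)))
```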

The function $a$ in the GAM models is parametrized as the function $\ln r$ in the GM models, and the function $q$ as the function $q$ in the GPH models.

9. Changing shape and scale models

Definition of models

Consider now an important model which does not lie in the class of the GAM models but includes the AAD model as a particular case. Suppose that the constant in time stresses $x \in E_0$ change not only the scale but also the shape of the time-to-failure distribution: there exist on $E_0$ positive functions $\theta(x)$ and $\nu(x)$ such that for any $x \in E_0$

$$S_x(t) = S_{x_0}\left\{\left(\frac{t}{\theta(x)}\right)^{\nu(x)}\right\}. \qquad (97)$$

The $S_{x_0}$-resource used until the moment $t$ under $x$ is $f_x(t) = S_{x_0}^{-1}(S_x(t))$ and the resource usage rate is

$$\frac{\partial f_x(t)}{\partial t} = r(x)\,t^{\nu(x)-1},$$

where $r(x) = \nu(x)/\theta(x)^{\nu(x)}$, and $H = G^{-1}$ with $G = S_{x_0}$. So the model (97) means that the resource usage rate under the stress $x$ is increasing if $\nu(x) > 1$, decreasing if $0 < \nu(x) < 1$, and constant if $\nu(x) = 1$. In the case $\nu(x) \equiv 1$ we have the AAD model. Consider the following generalization of the model (97) to the case of time varying stresses:

Definition 12. The changing shape and scale (CHSS) model (Bagdonavicius & Nikulin (1998)) holds on $E$ if there exist positive on $E$ functions $r$ and $\nu$ such that for all $x(\cdot) \in E$

$$\frac{\partial f_{x(\cdot)}(t)}{\partial t} = r\{x(t)\}\,t^{\nu(x(t))-1}. \qquad (98)$$

The equality (98) implies that

$$S_{x(\cdot)}(t) = S_{x_0}\left(\int_0^t r\{x(\tau)\}\,\tau^{\nu(x(\tau))-1}\,d\tau\right). \qquad (99)$$

In terms of the exponential resource usage the model can be written in the form:

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,q(A_{x(\cdot)}(t))\,t^{\nu(x(t))-1}.$$

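Simple step-stresses

If $x(\cdot) \in E_1$, then the formula (99) implies that

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_0}\left\{\left(\dfrac{t}{\theta(x_1)}\right)^{\nu(x_1)}\right\}, & 0 \le t < t_1,\\[3mm] S_{x_0}\left\{\left(\dfrac{t_1}{\theta(x_1)}\right)^{\nu(x_1)} + \left(\dfrac{t}{\theta(x_2)}\right)^{\nu(x_2)} - \left(\dfrac{t_1}{\theta(x_2)}\right)^{\nu(x_2)}\right\}, & t \ge t_1. \end{cases} \qquad (100)$$

General step-stresses

If $x(\cdot) \in E_m$, then for all $t \in [t_{i-1}, t_i)$

$$S_{x(\cdot)}(t) = S_{x_0}\left\{\sum_{j=1}^{i-1}\left[\left(\frac{t_j}{\theta(x_j)}\right)^{\nu(x_j)} - \left(\frac{t_{j-1}}{\theta(x_j)}\right)^{\nu(x_j)}\right] + \left(\frac{t}{\theta(x_i)}\right)^{\nu(x_i)} - \left(\frac{t_{i-1}}{\theta(x_i)}\right)^{\nu(x_i)}\right\}. \qquad (101)$$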
Wo;

10. Generalizations Schabe and Viertl (1995) considered an axiomatic approach to model building. P r o p o s i t i o n 1 3 . (Schabe and Viertl (1995)). Suppose that there exists a functional a :Ex such that for any xi(-),X2(-)

E x [0,oo)-> [0,oo)

£ E it is differentiable and increasing in t, a{x1(),x2(),0)

=0

and r*a(.)~«(*i(-).*2(-).r.l(.)), where ~ denotes equality in distribution. For any differentiable on [0,oo) c.d.f. F exists afunctional allx()£E

b : E x [0, oo) —¥ [0,oo) such that for (102)

Proof. Fix z o ( ) € E and for all x(-) 6 E put a0(x(-),t)

=

F-1(FM.)(a(x(-),x0(-),t))).

The distribution of the random variable R = ao(x(-),Tx^) F. Put

does not depend on x(-) and its c.d.f. is

»(*(•).') = £«o(*(-).0-

299

Then

t

ao(x(-),t)

= I Jo

b(x(u),u)du,

which implies F„ ( .)(t) = P{T I ( .) < t} = P { i ? < a 0 ( x ( ) , < ) } = F ( a 0 ( i ( - ) , 0 ) =

F

( /

*(*(«).«)<*«) •

The proof is complete. R e m a r k 8. P u t G(t) = l-F{t),

5I(.)(0 = 1-FJI(.)(0,

H = G-\

/f ( .)(<) = / f ( 5 l ( . ) ( t ) ) .

The equality (102) implies that

(103)

J^?(.)C) = »(*(•).')•

This model means that the rate of the $G$-resource usage is a functional of the stress and the time. The models considered above are submodels of this general model:
1) If $b(x(\cdot),t) = r(x(t))$, we have the AAD model.
2) If $b(x(\cdot),t) = r(x(t))\,\alpha_0(t)$, we have the GM (or, equivalently, GPH1) model.
3) If $b(x(\cdot),t) = r(x(t))\,\alpha_0(t)$ and the resource is exponential, i.e. $G(t) = e^{-t}$, $t > 0$, we have the PH model.
4) If $b(x(\cdot),t) = r\{x(t)\}\,t^{\nu(x(t))-1}$, we have the CHSS model.

Considering the GS model, it was noted that this (and also the AAD) model is not appropriate when the stress is periodic with quick changes of its values. The greater the number of stress cycles, the shorter the life of items. So the effect of cycling must be included in the model. Suppose that a periodic stress is differentiable. Then the number of cycles in the interval $[0,t]$ is

$$n(t) = \int_0^t \left|\,d\mathbf{1}\{x'(u) > 0\}\right|.$$
Generalizing the G P H l (or GM) model we suppose that the G-resource used until the moment t has the form 4 ) «

= /

r 1 {jr( U )}dff(5 0 ( U )) + J

r2{x(u)}d

\ l{a;'( U ) > 0} |

(104)

The second term includes the effect of cycling on resource usage. In terms of survival functions S l ( .,(i) = G | / n{x(u)}dH(So(u))

+ f r 2 {*(u)} | dl{x'(u)

> 0}|.

(105)

If amplitude is constant, r2{x(u)} = c can be considered. The AAD model is generalised by the model S l ( 0 ( t ) = G{J

ri{x(u)}du

+ J

r2{x(u)}

| dl{x'(u)

> 0} | | .

(106)

The GS and AAD models are not appropriate if $x(\cdot)$ is a step-stress with many switch-ons and switch-offs which shorten the life of items. In this case the following model can be considered:

$$f^G_{x(\cdot)}(t) = \int_0^t r_1\{x(u)\}\,dH(S_0(u)) + \int_0^t r_2\{x(u)\}\,\mathbf{1}(\Delta x(u) > 0)\,\frac{|dx(u)|}{|\Delta x(u)|} + \int_0^t r_3\{x(u)\}\,\mathbf{1}(\Delta x(u) < 0)\,\frac{|dx(u)|}{|\Delta x(u)|}. \qquad (107)$$

The second and the third terms include the effect of switch-ons and switch-offs (or vice versa), respectively, on resource usage. If the step-stress has two values, the functions $r_2$ and $r_3$ can be constants.


11. The heredity hypothesis

A process of production is unstable if the reliability of items produced in different time intervals is different. If items produced in some specified time interval are considered and the AAD, GM (GPH1) or GA models hold on $E_0$, then for all $x_1, x_2 \in E_0$

$$S_{x_2}(t) = S_{x_1}(\rho(x_1,x_2)\,t), \qquad (108)$$

$$S_{x_2}(t) = G(\rho(x_1,x_2)H(S_{x_1}(t))), \qquad (109)$$

$$S_{x_2}(t) = G(H(S_{x_1}(t)) + b(x_1,x_2)\,t), \qquad (110)$$

respectively.

Definition 13. If one of the models AAD, GM or GA holds, the process of production is unstable and the function $\rho(x_1,x_2)$ (the models AAD or GM) or $b(x_1,x_2)$ (the model GA) is invariant for groups of items produced in different time intervals, then we say that the heredity hypothesis is satisfied.

Suppose that $x_1$ is a usual stress and $x_2 > x_1$ an accelerated stress. If one of the models AAD, GM or GA and the heredity principle hold, then sufficiently large data can be accumulated during a long period of observations and good estimators of the functions $\rho(x_1,x_2)$ or $b(x_1,x_2)$ can be obtained. The reliability of newly produced items under the "usual" stress $x_1$ can be estimated from accelerated life data obtained under the accelerated stress $x_2$, using the estimators $\hat\rho(x_1,x_2)$ or $\hat b(x_1,x_2)$.

REFERENCES

P.K. Andersen, R.D. Gill, 1982. Cox's regression model for counting processes: A large sample study. Ann. Statist., 10, 1100-1120.
P.K. Andersen, O. Borgan, R.D. Gill and N. Keiding, 1993. Statistical Models Based on Counting Processes. Springer, New York.
V. Bagdonavicius, 1978. Testing the hypothesis of the additive accumulation of damages. Probab. Theory and its Appl., 23, No. 2, 403-408.

V. Bagdonavicius, 1990. Accelerated life models when the stress is not constant. Kybernetika, 26, 289-295.
V. Bagdonavicius, 1993. The modified moment method for multiply censored samples. Lithuanian Mathematical Journal, 33, No. 4, 295-306.
V. Bagdonavicius, M. Nikulin, 2000. On goodness-of-fit for the linear transformation and frailty models. Statistics and Probability Letters, 47, No. 2, 177-188.
V. Bagdonavicius, M. Nikulin, 2000. On nonparametric estimation in accelerated experiments with step stresses. Statistics, 33, No. 4, 349-350.
V. Bagdonavicius, M. Nikulin, 2000. Modeles statistiques de degradation avec des covariables dependant de temps. Comptes Rendus, Academie des Sciences de Paris, 329, Serie I, No. 2, 131-134.
V. Bagdonavicius, M. Nikulin, 1999. Generalized Proportional Hazards Model Based on Modified Partial Likelihood. Lifetime Data Analysis, 5, 329-350.
V. Bagdonavicius, S. Malov and M. Nikulin, 1999. Characterizations and semiparametric regression estimation in Archimedean copulas. Journal of Applied Statistical Sciences, 8, 137-154.
V. Bagdonavicius, M. Nikulin, 1998. Additive and Multiplicative Semiparametric Models in Accelerated Life Testing and Survival Analysis. Queen's Papers in Pure and Applied Mathematics, 108, Queen's University, Kingston, Ontario, Canada.
V. Bagdonavicius, M. Nikulin, 1997. Analysis of general semiparametric models with random covariates. Revue Roumaine de Mathematiques Pures et Appliquees, 42, No. 5-6, 351-369.
V. Bagdonavicius, M. Nikulin, 1997. Statistical analysis of the generalized additive semiparametric survival model with random covariates. Questiio, 21, No. 1-2, 273-291.
V. Bagdonavicius, M. Nikulin, 1997. Sur l'application des stress en escalier dans les experiences accelerees. Comptes Rendus, Academie des Sciences de Paris, 325, Serie I, 523-526.
V. Bagdonavicius, M. Nikulin, 1997. Transfer functionals and semiparametric regression models. Biometrika, 84, 2, 365-378.
V. Bagdonavicius, M. Nikulin, 1997. Asymptotic analysis of semiparametric models in survival analysis and accelerated life testing. Statistics, 29, 261-281.
V. Bagdonavicius, M. Nikulin, 1997. Accelerated life testing when a process of production is unstable. Statistics and Probability Letters, 35, No. 3, 269-275.
V. Bagdonavicius, M. Nikulin, 1997. Some rank tests for multivariate censored data. In: Advances in the Theory and Practice of Statistics: A Volume in Honor of Samuel Kotz (eds. N.L. Johnson and N. Balakrishnan), J. Wiley, New York, 193-207.
V. Bagdonavicius, V. Nikoulina, 1997. A goodness-of-fit test for Sedyakin's model. Revue Roumaine de Mathematiques Pures et Appliquees, 42, 1, 5-14.
V. Bagdonavicius, M. Nikulin, 1996. Analyses of generalized additive semiparametric models. Comptes Rendus, Academie des Sciences de Paris, 323, 9, Serie I, 1079-1084.
V. Bagdonavicius, M. Nikulin, 1995a. Semiparametric models in accelerated life testing. Queen's Papers in Pure and Applied Mathematics, 98, Queen's University, Kingston, Ontario, Canada, 70p.
V. Bagdonavicius, M. Nikulin, 1995b. On accelerated testing of systems. European Journal of Diagnosis and Safety in Automation, 5, 3, 307-316.
V. Bagdonavicius, M. Nikulin, 1995c. Estimation of system reliability from accelerated experiments. In: Proceeding of International Conference on Statistical Methods and Statistical Computing for Quality and Productivity Improvement (ICSQP'95), II, 602-608.
V. Bagdonavicius, M. Nikulin, 1995d. Accelerated life models for systems and their components. In: Seventh International Conference on Applications of Statistics and Probability in Civil Engineering, Paris, 2, 1157-1164.
V. Bagdonavicius, M. Nikulin, 1994. Stochastic models of accelerated life. In: Advanced Topics in Stochastic Modelling (eds. J. Gutierrez, M. Valderrama), pp. 73-87. World Scientific, Singapore.
A.P. Basu and N. Ebrahimi, 1982. Nonparametric accelerated life testing. IEEE Trans. on Reliability, 31, 432-435.
G.K. Bhattacharyya and Soejoeti, 1989. A tampered failure rate model for step-stress accelerated life test. Comm. in Statist., Part A - Th. and Meth., 18, 1627-1643.
S.C. Cheng, L.J. Wei, Z. Ying, 1995. Analysis of Transformation Models With Censored Data. Biometrika, 82, 835-845.
D. Clayton and J. Cuzick, 1985. Multivariate generalizations of the proportional hazards model. Journal of Royal Statistical Society, Series A, 148, 82-117.
D.R. Cox, 1972. Regression models and life tables. J. R. Statist. Soc., B, 34, 187-220.
D.R. Cox and D. Oakes, 1984. Analysis of Survival Data. Methuen (Chapman and Hall), New York.
D.M. Dabrowska and K.A. Doksum, 1988. Partial likelihood in Transformations Models with Censored Data. Scand. J. Statist., 15, 1-23.
R.C. Elandt-Johnson, N.L. Johnson, 1980. Survival Models and Data Analysis. J. Wiley, New York.
T.R. Fleming and D.P. Harrington, 1991. Counting processes and survival analysis. J. Wiley, New York.
C. Genest, K. Ghoudi and L.P. Rivest, 1995. A semiparametric estimation procedure for dependence parameters in multivariate families of distributions. Biometrika, 82, 543-552.
I.B. Gertsbakh, K.B. Kordonskiy, 1969. Models of Failure. Springer Verlag, Berlin.
L. Gerville-Reache, V. Nikoulina, 1997. Analysis of reliability characteristics of estimators in accelerated life testing. In: Statistical and Probabilistic Models in Reliability (eds. D. Ionescu, N. Limnios), Birkhauser, Boston, 91-100.
P.E. Greenwood and M.S. Nikulin, 1996. A Guide to Chi-squared Testing. J. Wiley, New York.

302 D.P. Harrington, and T.R. Fleming, 1982. A class of rank test procedures for censored survival data. Biometrika 69, 133-143. P. Hougaard, 1986. A class of multivariate failure time distributions, Biometrika, N X . Johnson, 1975.

73, 3, 671-678

On Some Generalized Farlie-Gumbel-Morgenstern Distributions, Comm.

in

Stat., 4, 5, 415-427. J.D. Kalbfleisch, R.L. Prentice, 1980. The Statistical Analysis of Failure Time Data, J. Wiley, New York. G.D. Kartashov, 1979. Methods of Forced (Augmented) Experiments (in Russian). Znaniye Press, Moscow. G.D. Kartashov and A.I.Perrote, 1968. On the principle of "heredity" in reliability theory. Cybernetics, 9, 2, 231-245.

Engrg.

J.F. Lawless, 1982, Statistical Models and Methods for Lifetime Data, J.Wiley, New York. J.F.Lawless, 1986. A Note on Lifetime Regression Models, Biometrika,

73, 509-512.

E.T. Lee, 1992. Statistical methods for survival data analysis, J. Wiley, New York. D.Y. Lin and Z. Ying, 1994. Semiparametrical analysis of the additive risk model. Biometrika

81,

61-71. D.Y. Lin and Z. Ying, 1995. Semiparametric inference for accelerated life model with time dependent covariates. Journal of Statistical Planning and Inference 44, 47-63. D.Y. Lin and Z. Ying, 1996. Semiparametric analysis of the general additive-multiplicative hazard models for counting processes. The Annals of Statistics,

23, 5, 1712-1734.

N.R. Mann, R.E. Schafer and N.D. Singpurwalla, 1974. Methods for Statistical Analysis of Reliabitity and Life Data, J. Wiley, New York. J.W.Q. Meeker, 1984.

A comparison of Accelerated Life Test Plans for Weibull and Lognormal

Distributions and Type I Censoring, Technometrics,

26, 157-172.

J.W.Q Meeker and L.A. Escobar, 1993. A review of recent research and current issues in accelerating testing, International

Statistical Review, 6 1 , 1, 147-168.

J.W.Q Meeker and L.A. Escobar, 1993. Statistical Methods for Reliability Data, J.Wiley, New York. M.A. Miner, 1 9 4 5 . Cumulative Damage in Fatigue. J. of Applied Mechanics, 12, A159-A164. W. Nelson, 1990. Accelerated Testing. Statistical Models, Test Plans, and Data Analyses, J. Wiley, New York. E. Pieruschka, 1961.

Relation between lifetime distribution and the stress level causing failures.

LMSD-8OOO44O1 Lockhead Missils and Space Division, Sunnyvale, California. J.M. Robins and A.A.Tsiatis, 1992. Semiparametric estimation of an accelerated failure time model with time dependent covariates. Biometrika,

79, 311-319.

A.L. Rukhin and H.K. Hsieh, 1987, Survey of Soviet work in reliability. Statistical Science, 2, 484-503. H.Schabe 1998. Accelerated Life Models for Nonhomogeneous Poisson Processes, Statistical Papers, 39, 291-312. B. Schweizer and A.Sklar, 1983. Probabilistic Metric Spaces. North-Holland , Amsterdam. N.M. Sedyakin, 1966. On one physical principle in reliability theory.(in Russian). Engrg. Cybernetics, 3, 80-87. J.Sethuraman, N.D.Singpurwalla, 1982. Testing of Hypotheses for Distributions in Accelerated Life Testing, JASA, 77, 204-208. M. Shaked and N.D. Singpurwalla, 1983. Inference for step-stress accelerated life tests. J. Statist. Plann. Inference, 7, 295-306. N.D. Singpurwalla, 1987. Comment on "Survey of Soviet work in reliability". Statistical Science, 2, 497-499. N.D. Singpurwalla, 1971. Inference from Accelerated Life Tests When Observations Are Obtained from Censored Samples, Technometrics, 13, 161-170. N.D. Singpurwalla, 1995. Survival in Dynamic Environnements, Statistical Science, 10, 86-103.

303 N.D.Singpurwalla, S.P.Wilson, 1999. Statistical Risk, Springer-Verlag, New York.

Methods in Software Engineering:

Reliability

and

R. Schmoyer, 1 9 9 1 , Nonparametric Analysis for Two-Level Single-Stress Accelerated Life Tests, Technometrics, 33, 175-186. A.A. Tsiatis, 1990. Estimating regression parameters using linear rank tests for censored data, Ann. Statist., 18, 353-72. J.W.Vaupel, K.G.Manton, E.Stallard, 1979. The impact of heterogeneity in individual frailty on the dynamic of mortality, Demography, 16, 439-454. R. Viertl, 1988.

Statistical Methods in Accelerated Life Testing.

Vandenhoeck &

Ruprecht,

Gottingen. R. Viertl and F. Spencer, 1991. Statistical Methods in Accelerated Life Testing. Technometrics,

33,

360-362. V.G.Voinovand M.S. Nikulin, 1993, Unbiased Estimators and Their Applications, Vol.1: Univariate C a s e , Kluwer, Dordrecht. V.G.Voinov and M.S. Nikulin, 1996, Unbiased Estimators and Their Applications, Vol.2: M u l t i variate C a s e , Kluwer, Dordrecht. Z. Ying, 1 9 9 3 . A large sample study of rank estimating for censored regression data, Ann. Statist., 21, 76-99.

Mathematics and the 21st Century
Eds. A. A. Ashour and A.-S. F. Obada
© 2001 World Scientific Publishing Co. (pp. 305-322)

THE VIBRATIONS OF A DRUM WITH FRACTAL BOUNDARY

JACQUELINE FLECKINGER-PELLE

ABSTRACT. Let $\Omega$ be a bounded domain in $\mathbb{R}^N$. We study the eigenvalues of the Dirichlet Laplacian defined on the domain $\Omega$. There exists a countable sequence of eigenvalues. Their asymptotics are related to the geometry of the domain. We recall the results established during the previous century concerning the link between the geometry of the domain and the asymptotics of the eigenvalues; we try to answer M. Kac's question "Can one hear the shape of a drum?", especially in the case of domains with fractal boundary.

1991 Mathematics Subject Classification: 35-02; 35P05; 35P20

Keywords and Phrases: Eigenvalues, Counting function, Fractals, Minkowski dimension, Heat equation.

CEREMATH/MIP-UT1 (UMR 5640) Universite Toulouse 1 31042 Toulouse Cedex, France [email protected]

1 INTRODUCTION

We recall the evolution over the previous century of the question raised by M. Kac ([K]) in 1966: "Can one hear the shape of a drum?". We are mainly concerned with drums with fractal boundaries.

More precisely, it has been well known since Euler and Lagrange that the vibrations of a membrane are described by the wave equation, which leads to an eigenvalue problem. These eigenvalues correspond to the eigenfrequencies of the membrane and, of course, depend on its size, on its shape, and so on. Conversely, it is often important to know whether one can recognize the geometry of the body just by listening to the tones and overtones. We are concerned here with this problem mainly when the membrane has a fractal boundary. We recall the most significant results obtained on this topic in the last century.

NOTATIONS

If $\omega$ is a bounded domain in $\mathbb{R}^n$, $\partial\omega$ denotes its boundary and $|\omega|_n$ stands for its $n$-dimensional Lebesgue measure. As usual, $H^k(\omega)$, where $k \ge 1$ is an integer, is the Sobolev space of all (classes of) functions $f \in L^2(\omega)$ whose partial derivatives of order $\le k$ all belong to $L^2(\omega)$. Moreover, $H^1_0(\omega)$ is the completion of $\mathcal{D}(\omega) = C_0^\infty(\omega)$ with respect to the norm of $H^1(\omega)$.

This paper is organized as follows. We begin (Section 2) with the case of a cord. Then (Section 3) we study the vibrations of a membrane and recall Weyl's estimate for the counting function. We also introduce the partition function. The study of the second term in the counting function leads us to Kac's question (Section 4). The case of domains with fractal boundaries was first introduced by M. Berry, and we mention the results obtained for the counting function in this case (Section 5). Finally we turn our attention to the heat equation, which also gives some information on the domain (Section 6).

2 AN EXAMPLE: VIBRATION OF A CORD

It is well known that the position $v(x,t)$ at time $t$ of a point $x$ on a cord of length $a$, which is fixed at both ends, satisfies:

$$\frac{\partial^2 v}{\partial t^2}(x,t) = k\,\frac{\partial^2 v}{\partial x^2}(x,t), \qquad (x,t) \in [0;a] \times \mathbb{R}^+ .$$

The constant $k$ depends on physical data. For such a problem, it is also classical to seek stationary solutions $v(x,t) = u(x)T(t)$, so that we are led to the following eigenvalue problem:

(2.1.a) $\Delta u(x) + \lambda u(x) = 0, \quad x \in [0;a],$

(2.1.b) $u(0) = u(a) = 0.$

A pair $(\lambda,u)$ satisfying (2.1) is an eigenpair; it consists of an eigenvalue $\lambda \in \mathbb{R}$ and an eigenfunction $u : x \in [0;a] \to u(x) \in \mathbb{R}$, with $u(x) = \sin(p\pi x/a)$, $p \in \mathbb{N}^*$. To each given integer $p$ corresponds the eigenvalue $\lambda_p = (p\pi a^{-1})^2$.

The tones of the cord are given by the ground state ($p = 1$) and the harmonics ($p > 1$). It is easy to see on this example that the length of the cord determines the sounds and conversely. Moreover it is easy to compute the asymptotics, as $s \to \infty$, of the counting function

$$N(s) := \#\{p \,/\, (p\pi/a)^2 \le s\}.$$

Obviously $N(s) = E[a\sqrt{s}/\pi]$, where $E[\cdot]$ denotes the entire part.
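The statement that the spectrum determines the length can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the counting function is taken with a non-strict inequality so that it agrees with the closed form $E[a\sqrt{s}/\pi]$.

```python
import math

def cord_eigenvalue(p, a):
    """Eigenvalue lambda_p = (p*pi/a)^2 of the cord of length a."""
    return (p * math.pi / a) ** 2

def count_by_enumeration(s, a):
    """N(s): number of integers p >= 1 with lambda_p <= s (direct enumeration)."""
    p = 0
    while cord_eigenvalue(p + 1, a) <= s:
        p += 1
    return p

def count_closed_form(s, a):
    """N(s) = E[a*sqrt(s)/pi], where E is the integer part."""
    return math.floor(a * math.sqrt(s) / math.pi)

def length_from_spectrum(s, a):
    """Recover the length from the counting function: a ~ pi*N(s)/sqrt(s) for large s."""
    return math.pi * count_by_enumeration(s, a) / math.sqrt(s)
```

For instance, with `a = 2.0` the two counts coincide for every tested `s`, and `length_from_spectrum(1e8, 2.0)` is close to 2, which is the sense in which one "hears" the length of the cord.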

3 VIBRATIONS OF A DRUM; WEYL'S ESTIMATE

THE CASE OF A SQUARE AND GAUSS ESTIMATE:

It is easy to extend this result to a square $K_a = (0,a) \times (0,a) \subset \mathbb{R}^2$. The position of a point $x \in K_a$, when $K_a$ has fixed boundaries, is given by

$$\frac{\partial^2 v}{\partial t^2}(x,t) = c\,\Delta v(x,t), \qquad (x,t) \in K_a \times \mathbb{R}^+,$$

$$v(x,t) = 0, \qquad x \in \partial K_a .$$

Of course, in this section, we assume that $\mathbb{R}^2$ is equipped with Euclidean coordinates: $x = (x_1,x_2) \in \mathbb{R}^2$. The Laplacian $\Delta$ applies to the space variables:

$$\Delta u(x,t) = \Delta u(x_1,x_2;t) = \frac{\partial^2 u}{\partial x_1^2}(x_1,x_2;t) + \frac{\partial^2 u}{\partial x_2^2}(x_1,x_2;t).$$

We seek stationary solutions as above, $v(x,t) = u(x)T(t)$, and we are led again to an eigenvalue problem:

(3.1.a) $\Delta u(x) + \lambda u(x) = 0, \quad x = (x_1,x_2) \in K_a,$

(3.1.b) $u(x) = 0, \quad x \in \partial K_a .$

There exists a countable number of solutions to this problem:

$$u(x_1,x_2) = \sin(p x_1 \pi/a)\,\sin(q x_2 \pi/a), \qquad \lambda_{p,q} = (p^2+q^2)(\pi/a)^2, \qquad (p,q) \in \mathbb{N}^* \times \mathbb{N}^* .$$

As above, for a given $\lambda > 0$, we can introduce the counting function $N(\lambda,K_a) := \#\{\lambda_{p,q} \le \lambda\}$. We have $N(\lambda,K_a) = N_2(a\sqrt{\lambda}/\pi)$, where $N_2(r)$ denotes the number of lattice points inside the quarter of the disk with radius $r$:

(3.2) $N_2(r) := \#\{(p,q) \in \mathbb{N}^* \times \mathbb{N}^* \,/\, p^2 + q^2 \le r^2\}.$

Gauss [G] shows in 1801 that this number is proportional to the area of the disk:

(3.3) $N_2(r) = (\pi r^2)/4 + O(r).$
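Gauss's estimate (3.3) is easy to observe numerically. The sketch below is illustrative only: the function name is hypothetical, the count uses the non-strict inequality $p^2+q^2 \le r^2$, and the bound $2r+1$ used in the check is a deliberately generous stand-in for the $O(r)$ remainder.

```python
import math

def quarter_disk_lattice_count(r):
    """N_2(r): number of pairs (p, q) of positive integers with p^2 + q^2 <= r^2."""
    total = 0
    r2 = r * r
    for p in range(1, int(r) + 1):
        remaining = r2 - p * p          # room left for q^2
        if remaining >= 1:
            # largest integer q with q^2 <= remaining
            total += math.isqrt(int(remaining))
    return total

def gauss_remainder(r):
    """Difference between the lattice count and the quarter-disk area pi*r^2/4."""
    return quarter_disk_lattice_count(r) - math.pi * r * r / 4
```

The remainder grows at most linearly in `r`, while the leading term grows like `r**2`, so the ratio of the count to `math.pi * r * r / 4` tends to 1.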

Notice that $N_2(r) = 0$ if $r < \sqrt{2}$; it has jumps when $r = \sqrt{2}, \sqrt{5}, \ldots$ It follows from (3.2) and (3.3) that

(3.4) $N(\lambda,K_a) \sim (\lambda a^2)/(4\pi), \qquad \lambda \to +\infty .$

THE COUNTING FUNCTION:

In 1911, H. Weyl extends this estimate to bounded smooth domains $\Omega$ in $\mathbb{R}^n$ ([W1], 1911; [W2], 1912). The study of the vibrations of $\Omega$ leads to the following eigenvalue problem (that we consider in the distributional sense): find $(\lambda,u) \in \mathbb{R} \times H^1_0(\Omega)$ with $u \ne 0$ such that:

(3.5) $-\Delta u(x) = \lambda u(x), \quad x \in \Omega .$

Since $\Omega$ is bounded, Problem (3.5) has a countable sequence of solutions (the eigenpairs) $(\lambda_k, u_k)$; since the eigenfunctions are defined up to a multiplicative constant, we add the condition $\|u\|_{L^2(\Omega)} = 1$. Finally we obtain an infinite sequence of eigenvalues:

$$0 < \lambda_1 \le \lambda_2 \le \ldots \le \lambda_j \le \ldots, \qquad \lambda_j \to \infty \ \text{as} \ j \to \infty,$$

where each eigenvalue is repeated according to its algebraic multiplicity. The counting function associated with Problem (3.5) for a given $\lambda > 0$ is:

(3.6) $N(\lambda,\Omega) := \#\{0 < \lambda_j \le \lambda\}.$

Note that it is equivalent to seek an estimate of $\lambda_j$ as $j \to +\infty$, or an estimate of $N(\lambda,\Omega)$ as $\lambda$ tends to $+\infty$.

WEYL'S ESTIMATE:

For $\Omega$ a sufficiently smooth bounded domain in $\mathbb{R}^n$, H. Weyl shows:

(3.7) $N(\lambda,\Omega) \sim W(\lambda,\Omega) := (2\pi)^{-n} B_n |\Omega|_n \lambda^{n/2}, \quad \text{as} \ \lambda \to \infty,$

where $B_n$ denotes the volume of the unit ball in $\mathbb{R}^n$. Hence the volume of $\Omega$ can be derived from the knowledge of the spectrum of the Dirichlet Laplacian defined on $\Omega$.

REMARK 1: This result also holds for Neumann boundary conditions under some additional assumptions on the length of the boundary. In particular, if the boundary is "too long", the counting function can behave like $\lambda^\alpha$, $\alpha > n/2$. Counterexamples such as "combs" can be exhibited for the Neumann Laplacian ([FMt], 1973). Moreover, for the Dirichlet Laplacian, the smoothness assumption can be withdrawn ([FMt], [Mt1], 1976; [Mt2], 1977).
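As a consistency check of Weyl's term (3.7), one can evaluate it for the cord ($n = 1$) and for the square ($n = 2$) and compare with the explicit formulas of Sections 2 and 3. The sketch below is illustrative: the function names are not from the text, and the unit-ball volume is computed from the standard formula $B_n = \pi^{n/2}/\Gamma(n/2+1)$.

```python
import math

def unit_ball_volume(n):
    """Volume B_n of the unit ball in R^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def weyl_term(lam, volume, n):
    """W(lam, Omega) = (2*pi)^(-n) * B_n * |Omega|_n * lam^(n/2), as in (3.7)."""
    return (2 * math.pi) ** (-n) * unit_ball_volume(n) * volume * lam ** (n / 2)
```

For `n = 1` and a cord of length `a`, `weyl_term(lam, a, 1)` equals `a * sqrt(lam) / pi`, the quantity whose integer part is the counting function of the cord; for `n = 2` and the square of side `a`, `weyl_term(lam, a * a, 2)` equals `lam * a**2 / (4 * pi)`, in agreement with (3.4).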

4 CAN ONE HEAR THE SHAPE OF A DRUM?

THE "REMAINDER TERM"

When Weyl's formula holds, it is natural to try to estimate the "second term" or, equivalently, to estimate $N(\lambda,\Omega) - W(\lambda,\Omega)$, the "remainder term". Under some additional assumptions (if there are not too many periodic geodesics), it can be shown, for Problem (3.5), that:

(4.1) $N(\lambda,\Omega) = W(\lambda,\Omega) - \gamma_n |\partial\Omega|_{n-1} \lambda^{(n-1)/2} + o(\lambda^{(n-1)/2}), \quad \lambda \to +\infty .$

Here, as above, $|\partial\Omega|_{n-1}$ denotes the $(n-1)$-dimensional Lebesgue measure of the boundary $\partial\Omega$ (for a planar domain, $|\partial\Omega|_1$ is the length of the boundary); $\gamma_n$ is a constant which depends only on $n$. When there are periodic geodesics, the second term oscillates ([DG], 1975; [Iv2], 1984; [Me 2], 1984; [V1], 1986; [Se], [S1], 1987; [S2], 1988). Obviously, it follows from (4.1) that the knowledge of the spectrum implies not only the knowledge of the "volume" of the domain, but also the measure of its boundary. Hence it is natural to try to derive other geometrical attributes.

THE PARTITION FUNCTION:

Indeed, the link between the eigenvalues and the measure of the domain was established as far back as 1949 ([MiP], 1949), by the use of the heat equation and the Laplace transform of the eigenvalues. In place of the counting function let us introduce the trace of the heat kernel, also called the "partition function":

(4.2) $Z(t,\Omega) := \sum_{j=1}^{\infty} e^{-\lambda_j t} = t \int_0^{+\infty} e^{-\lambda t} N(\lambda,\Omega)\, d\lambda .$

As $t \to 0$, the following estimate holds ([MiP], 1949; [K], 1966):

(4.3) $Z(t,\Omega) = (4\pi t)^{-n/2}\left(|\Omega|_n + a_1 t^{1/2} + \ldots\right).$
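For the square of Section 3 the partition function can be summed directly, which gives a quick check of the leading term of (4.3). This is a sketch under stated assumptions: the function names are illustrative, and the two correction terms in `square_three_term_expansion` come from the Jacobi theta identity for this particular square; they are not taken from the text.

```python
import math

def square_partition_function(t, a=1.0):
    """Z(t) for the square of side a: sum over p, q >= 1 of exp(-lambda_{p,q} t).

    Since lambda_{p,q} = (p^2 + q^2)(pi/a)^2, the double sum is the square
    of a one-dimensional sum.
    """
    one_dim, p = 0.0, 1
    while True:
        term = math.exp(-((p * math.pi / a) ** 2) * t)
        if term < 1e-18:   # remaining terms are negligible
            break
        one_dim += term
        p += 1
    return one_dim ** 2

def leading_term(t, a=1.0):
    """First term of (4.3) for n = 2: |Omega| / (4*pi*t)."""
    return a * a / (4 * math.pi * t)

def square_three_term_expansion(t, a=1.0):
    """Area, perimeter and corner terms for the square (assumption: theta identity)."""
    return leading_term(t, a) - a / (2 * math.sqrt(math.pi * t)) + 0.25
```

At `t = 1e-4` the ratio `square_partition_function(t) / leading_term(t)` is already within a few percent of 1, and the perimeter term accounts for essentially all of the remaining gap.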

As for the counting function, the expansion of the partition function involves the "volume" $|\Omega|_n$ in the first term and the measure of the boundary $|\partial\Omega|_{n-1}$ in the second one; moreover the third term is proportional to the number of holes.

REMARK 2: Note that (4.3) was established before (4.1), since the asymptotics w.r.t. $j$ of the eigenvalues are derived from the asymptotics w.r.t. $t$ of the partition function by Tauberian theorems, whereas the derivation of analogous results for the counting function uses the theory of pseudo-differential operators. Note also that the knowledge of the asymptotics of the counting function implies the knowledge of the asymptotics of the partition function, but the converse is not true. We study problems related to the heat equation below in Section 6.

THE INVERSE PROBLEM

Since the expansion of the partition function yields so many geometric characteristics, it is natural to study the inverse problem: the possibility of determining $\Omega$ completely from the spectrum of the associated Dirichlet (or Neumann) Laplacian. A first stimulus for research in this field is the famous paper by M. Kac ([K], 1966): "Can one hear the shape of a drum?". This problem also has several important applications such as the determination of cracks. First, Milnor exhibits two isospectral tori in $\mathbb{R}^{16}$ which are not isometric. Then H. Urakawa constructs two isospectral domains which are not congruent ([U], 1982). Finally, in 1992, Gordon, Webb and Wolpert [GWW] give a very simple counterexample in $\mathbb{R}^2$.

5 DOMAINS WITH FRACTAL BOUNDARIES

The question of domains with fractal boundaries is introduced by M. Berry in [Be1], 1979 and [Be2], 1980. Studying the scattering of waves by fractals, he suggests replacing $n - 1$ in the second term of (4.1) by the Hausdorff dimension $h$; hence, if $\partial\Omega$ has Hausdorff dimension $h$, his conjecture is:

(5.1) $N(\lambda,\Omega) = W(\lambda,\Omega) - \gamma_n H(\partial\Omega)\lambda^{h/2} + o(\lambda^{h/2}), \quad \lambda \to +\infty .$

Here $H(\partial\Omega)$ is the $h$-Hausdorff measure of the boundary. Before going further, let us first recall the definition of the Hausdorff dimension and, more generally, of "fractal" dimensions (see e.g. [Fa]).

HAUSDORFF DIMENSION:

The Hausdorff dimension, introduced by Hausdorff in 1919 and popularized by B. Mandelbrot in the seventies, is the most famous of the fractal dimensions. It is defined in the following way. Let us consider first, for given $\varepsilon > 0$, a covering of $\partial\Omega$ by balls $(B_i)_{i \in I}$ with radii $r_i < \varepsilon$. For any $t > 0$, set

$$M(t) := \lim_{\varepsilon \to 0}\Big(\inf \sum_{i \in I} r_i^t\Big),$$

where the infimum is taken over all such coverings. The Hausdorff dimension of $\partial\Omega$ is $h := \inf\{t > 0 \,/\, M(t) < +\infty\}$, and its $h$-Hausdorff measure is $H(\partial\Omega) = M(h)$.

BOULIGAND-MINKOWSKI DIMENSION:

In 1985, J. Brossard and R. Carmona [BC] construct a counterexample to Berry's conjecture and suggest replacing in (5.1) the Hausdorff dimension $h$ by $d$, the Bouligand-Minkowski one. This new conjecture is usually referred to as the "modified Weyl-Berry conjecture". Bouligand [Bo] has extended the notion of Minkowski dimension to the noninteger case. The Bouligand-Minkowski dimension of $\partial\Omega$ is defined in the following way. For a given $\varepsilon > 0$ we consider the interior boundary strip

(5.2) $\gamma^*_\varepsilon := \{x \in \Omega \,/\, d(x,\partial\Omega) < \varepsilon\},$

where $d(\cdot,\cdot)$ denotes the Euclidean distance in $\mathbb{R}^n$. For any $t > 0$, set

$$M^*(t,\partial\Omega) := \limsup_{\varepsilon \to 0}\ \varepsilon^{-(n-t)}\,|\gamma^*_\varepsilon|_n .$$

As above, $|\cdot|_n$ denotes the $n$-dimensional Lebesgue measure. The interior Bouligand-Minkowski dimension of $\partial\Omega$ is:

(5.3) $d_i := \inf\{t > 0 \,/\, M^*(t,\partial\Omega) < +\infty\} = n - \liminf_{\varepsilon \to 0}\left[(\ln \varepsilon)^{-1} \ln |\gamma^*_\varepsilon|_n\right].$

We have $d_i \in [n-1, n]$. Taking $\gamma^*_\varepsilon := \{x \in \mathbb{R}^n \setminus \Omega \,/\, d(x,\partial\Omega) < \varepsilon\}$, we define in the same way $d_e$, the exterior Bouligand-Minkowski dimension of $\partial\Omega$. Finally, when $|\partial\Omega|_n = 0$, the Bouligand-Minkowski dimension of $\partial\Omega$ is $d$ defined by:

(5.4) $d := \max(d_i, d_e).$
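In practice the Bouligand-Minkowski dimension of a planar curve is often estimated by box counting: count the grid squares of side $\varepsilon$ met by the curve and look at how this number scales with $1/\varepsilon$. The sketch below is illustrative only and makes several assumptions that are not in the text: the curve is one side of the triadic von Koch snowflake (whose dimension is $\ln 4/\ln 3$), it is sampled at the vertices of a fine polygonal approximation, and the slope is obtained by a least-squares fit over a few scales, so the result is only a rough estimate.

```python
import math

def koch_points(depth):
    """Vertices of the depth-th polygonal approximation of the von Koch curve on [0, 1]."""
    pts = [0j, 1 + 0j]
    rot = complex(0.5, math.sqrt(3) / 2)      # rotation by +60 degrees
    for _ in range(depth):
        new = []
        for z0, z1 in zip(pts, pts[1:]):
            d = (z1 - z0) / 3
            # replace the segment by four segments with a triangular bump
            new.extend([z0, z0 + d, z0 + d + d * rot, z0 + 2 * d])
        new.append(pts[-1])
        pts = new
    return pts

def box_count(points, eps):
    """Number of grid squares of side eps containing at least one sample point."""
    return len({(math.floor(z.real / eps), math.floor(z.imag / eps)) for z in points})

def box_counting_dimension(points, scales):
    """Least-squares slope of log N(eps) against log(1/eps)."""
    xs = [math.log(1 / eps) for eps in scales]
    ys = [math.log(box_count(points, eps)) for eps in scales]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

curve = koch_points(8)
estimate = box_counting_dimension(curve, [3.0 ** (-k) for k in range(2, 7)])
```

The estimate depends on the range of scales and on the alignment of the grid, so it should be read as an approximation of $\ln 4/\ln 3 \approx 1.26$ rather than as an exact value.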

REMARK 3: This dimension is not as well known as the Hausdorff one, though it is frequently used by chemists and physicists under various names such as "box-counting dimension", logarithmic dimension, or Kolmogorov's entropy. Practically, it is sufficient to count the number of squares with side $\varepsilon$ which intersect a curve $\gamma$ in $\mathbb{R}^2$ to deduce its "box-counting dimension".

REMARK 4: For any bounded domain, the Hausdorff and the Bouligand-Minkowski dimensions of the boundary always satisfy $n - 1 \le h \le d$. It is possible to exhibit examples where $h < d$ (with strict inequality). This is precisely what Brossard and Carmona do to contradict Berry's conjecture: they construct a subdivision of a union of squares so that one has the strict inequality $h < d_i$.

ORDER OF GROWTH OF THE "REMAINDER TERM"

Assume now that $\Omega$ is a bounded domain in $\mathbb{R}^n$ with a fractal boundary $\partial\Omega$; assume moreover that the Bouligand-Minkowski dimension of the boundary is $d > n - 1$. We have ([LF1], [LF2], [F], [L], 1988):

(5.5) $N(\lambda,\Omega) = W(\lambda,\Omega) + O(\lambda^{d_i/2}), \quad \lambda \to +\infty,$

where, as above, $d_i$ is the interior Bouligand-Minkowski dimension. To establish this result, we consider a Whitney covering of $\Omega$, that is, a covering with adjacent cubes which are smaller near the boundary. First we insert inside $\Omega$ the maximum number of adjacent (non-overlapping) cubes $Q_0$ with side 1. In what is left, we insert again the maximum number of adjacent (non-overlapping) cubes $Q_1$ with side $2^{-1}$ which are also adjacent to the previous ones, and so on. At the $k$-th step, we have a domain $B_k$ near the boundary which is still free and that we cover partially with $n_k$ adjacent cubes $Q_k$ with side $2^{-k}$. By use of the boundary strip $\gamma^*_\varepsilon$ defined in (5.2), we have $B_k \subset \gamma^*_{2^{-k}\sqrt{n}}$. Hence

(5.6) $n_k \le C\,2^{kd}.$
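The step from the inclusion in the boundary strip to (5.6) is a volume comparison. A sketch of the argument, under the assumption that $M^*(d,\partial\Omega) < +\infty$ (otherwise $d$ is replaced by any exponent larger than $d_i$) and with $c$ denoting a constant depending only on $n$ such that the cubes $Q_k$ lie in $\gamma^*_{c\,2^{-k}}$:

```latex
% Assumption: |\gamma^*_\varepsilon|_n \le C_0\,\varepsilon^{\,n-d} for all small \varepsilon,
% which follows from M^*(d,\partial\Omega) < +\infty.
% The n_k cubes Q_k are non-overlapping, have side 2^{-k}, and lie in the strip \gamma^*_{c\,2^{-k}}.
\begin{align*}
n_k\,2^{-kn} \;=\; \Big|\bigcup Q_k\Big|_n
  &\;\le\; \big|\gamma^*_{c\,2^{-k}}\big|_n
  \;\le\; C_0\,\big(c\,2^{-k}\big)^{\,n-d},\\
n_k &\;\le\; C_0\,c^{\,n-d}\;2^{kn}\,2^{-k(n-d)} \;=\; C\,2^{kd}.
\end{align*}
```

The constant $C = C_0\,c^{\,n-d}$ does not depend on $k$, which is all that is needed in the bracketing argument.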

For a given $\lambda$, $N(\lambda,Q_p) = 0$ for $p$ large enough, or equivalently for $Q_p$ small enough. Let $K \in \mathbb{N}$ be such that, for all integers $p > K$, $N(\lambda,Q_p) = 0$. We now use Courant's method, which is also called "Dirichlet-Neumann bracketing" (see e.g. [CH]):

(5.7) $\sum_{k=0}^{K} n_k N(\lambda,Q_k) \le N(\lambda,\Omega) \le \sum_{k=0}^{K} n_k N_N(\lambda,Q_k) + N_N(\lambda,B_K),$

where $N_N(\lambda,\omega)$ denotes the counting function for the Neumann Laplacian defined on $\omega$. Finally we combine Gauss's formula with an estimate for $B_K$ established in [FMt], [Mt1], [Mt2].

A PRECISE SECOND TERM FOR A DOMAIN WITH FRACTAL BOUNDARY:

For a domain with fractal boundary, it is natural, as for smooth domains, to try to calculate the second term. Unfortunately, except when n = 1 where the "modified Weyl-Berry conjecture" holds ([La]), the second term can oscillate, exactly as for smooth domains. As in [FVl], [FV2], let us consider a union of cubes which are smaller and smaller as shown here:

[Figure: a central square with successively smaller squares attached to the middle of each free side.]

To construct this set in $\mathbb{R}^2$, let us first choose $s$ satisfying

(5.8) $1 + \sqrt{2} < s < 3.$

We fix the central square $Q_0$ with side 1. Then we "stick", outside $Q_0$, in the middle of each of the 4 sides of $Q_0$, 4 squares $Q_1$ with side $s^{-1}$, as shown by the figure above. We now have $4 \times 3$ "free" sides with length $s^{-1}$; on the middle part of each of these 12 sides, outside the previous squares, we "stick" again one square $Q_2$ with side $s^{-2}$, and so on. At the $k$-th step we have $n_k$ squares $Q_k$ with sides $s^{-k}$, where $n_k = \frac{4}{3}\,3^k$ for $k \ge 1$, and $n_0 = 1$. We denote by $Q$ the union of all these squares (which is disconnected in $\mathbb{R}^2$). It follows from (5.8) that the squares do not overlap and that $Q$ has finite measure. The interior Bouligand-Minkowski dimension $d_i$ of $\partial Q$ is

(5.9) $d_i = (\ln 3)/(\ln s), \qquad 1 < d_i < 2,$

since for $\varepsilon > 0$

$$|\gamma^*_\varepsilon|_2 = \sum_{k=0}^{K} n_k\left(4\varepsilon s^{-k} - 4\varepsilon^2\right) + \sum_{k=K+1}^{+\infty} n_k s^{-2k},$$

where $K$ is such that $s^{-(K+1)} < 2\varepsilon \le s^{-K}$.

We can compute exactly the second term of the asymptotics of $N(\lambda,Q)$.
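The quantities in this construction are explicit enough to be evaluated directly. The following sketch is illustrative (the function names and the choice $s = 2.5$ are assumptions, not from the text): it sums the areas of the squares, and it evaluates the strip formula above at two values of $\varepsilon$ one "period" apart to recover the exponent $2 - d_i$ in $|\gamma^*_\varepsilon|_2 \approx \varepsilon^{2-d_i}$.

```python
import math

def number_of_squares(k):
    """n_0 = 1 and n_k = 4 * 3^(k-1) for k >= 1."""
    return 1 if k == 0 else 4 * 3 ** (k - 1)

def total_area(s, terms=200):
    """Area of Q: sum of n_k * s^(-2k), truncated after `terms` terms."""
    return sum(number_of_squares(k) * s ** (-2 * k) for k in range(terms))

def strip_area(eps, s):
    """Area of the interior boundary strip, from the formula following (5.9).

    Requires 2*eps <= 1 so that the index K below is nonnegative.
    """
    K = math.floor(-math.log(2 * eps) / math.log(s))   # s^-(K+1) < 2*eps <= s^-K
    large = sum(number_of_squares(k) * (4 * eps * s ** (-k) - 4 * eps ** 2)
                for k in range(K + 1))
    ratio = 3 / s ** 2
    small = (4 / 3) * ratio ** (K + 1) / (1 - ratio)    # squares entirely inside the strip
    return large + small

def strip_exponent(eps, s):
    """Slope of log(strip area) against log(eps) over one period eps -> eps/s."""
    return (math.log(strip_area(eps, s)) - math.log(strip_area(eps / s, s))) / math.log(s)

s = 2.5
d_i = math.log(3) / math.log(s)
```

With `s = 2.5` the area is finite (about 2.23), `d_i` is about 1.2, and `strip_exponent(1e-9, s)` is close to `2 - d_i`, as the definition (5.3) requires.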

PROPOSITION 1. ([FV1][FV2]). As $\lambda$ tends to $+\infty$:

(5.10) $N(\lambda,Q) = W(\lambda,Q) - \frac{4}{3}\,(\lambda/\pi^2)^{d_i/2}\; p_2\!\left(\frac{\ln\lambda - 2\ln\pi}{2\ln s}\right) + O(\sqrt{\lambda}),$

where

$$p_2(y) = \sum_{k=-\infty}^{+\infty} 3^{k-y}\,P_2(s^{y-k}); \qquad P_2(r) = \frac{\pi r^2}{4} - N_2(r).$$

The function $p_2$ is well defined, positive, bounded, 1-periodic and left-continuous; moreover the set of its points of discontinuity is dense in $\mathbb{R}$.
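Because $Q$ is a disjoint union of squares, $N(\lambda,Q)$ can be computed exactly by summing the lattice counts of Section 3, and the order of magnitude of the second term can be observed directly. The sketch below is illustrative: the function names and the value $s = 2.5$ are assumptions, counts use the non-strict inequality, and the test only checks that the normalized second term stays between two fixed positive bounds; it does not evaluate $p_2$ itself.

```python
import math

def quarter_disk_lattice_count(r):
    """Number of pairs (p, q) of positive integers with p^2 + q^2 <= r^2."""
    r2 = r * r
    return sum(math.isqrt(int(r2 - p * p))
               for p in range(1, int(r) + 1) if r2 - p * p >= 1)

def drum_counting_function(lam, s):
    """N(lam, Q): sum over k of n_k times the count for one square of side s^-k."""
    total, k = 0, 0
    while True:
        r = s ** (-k) * math.sqrt(lam) / math.pi
        if r * r < 2:          # squares this small have no eigenvalue below lam
            return total
        n_k = 1 if k == 0 else 4 * 3 ** (k - 1)
        total += n_k * quarter_disk_lattice_count(r)
        k += 1

def drum_weyl_term(lam, s):
    """W(lam, Q) = |Q| * lam / (4*pi), with |Q| the total area of the squares."""
    ratio = 3 / s ** 2
    area = 1 + (4 / 3) * ratio / (1 - ratio)
    return area * lam / (4 * math.pi)

def normalized_second_term(lam, s):
    """(W - N) divided by (lam / pi^2)^(d_i / 2)."""
    d_i = math.log(3) / math.log(s)
    gap = drum_weyl_term(lam, s) - drum_counting_function(lam, s)
    return gap / (lam / math.pi ** 2) ** (d_i / 2)
```

For `s = 2.5` the normalized quantity remains of order one over several decades of `lam`, which is the numerical signature of a second term of order $\lambda^{d_i/2}$ rather than $\lambda^{1/2}$.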

The set $Q$ being disconnected, we also introduce the connected set $O$, derived from $Q$ by opening in the middle of each $\partial Q_k \cap \partial Q_{k-1}$ a small "cut" $I_k$ with length $\varepsilon_k = (100(k!))^{-1}$; the connected open set $O$ has the same Lebesgue measure (in $\mathbb{R}^2$) as $Q$ and it also has the same interior Bouligand-Minkowski dimension $d_i = (\ln 3)/(\ln s)$.

THEOREM 1. ([FV1][FV2]). As $\lambda$ tends to $+\infty$:

$$-\frac{4}{3}\,(\lambda/\pi^2)^{d_i/2}\; p_2\!\left(\frac{\ln\lambda - 2\ln\pi}{2\ln s}\right) + O(\sqrt{\lambda}) = N(\lambda,Q) - W(\lambda,Q) \le N(\lambda,O) - W(\lambda,O) \le -\frac{4}{3}\,(\lambda/\pi^2)^{d_i/2}\; p_2\!\left(\frac{\ln\lambda - 2\ln\pi}{2\ln s} + o(1)\right) + o(\lambda^{d_i/2}).$$

Here again, the second term has the form $c_n M(\partial\Omega)\,\lambda^{d_i/2}\,p(\ln\lambda)$, where $p$ is a periodic function which is positive, bounded and discontinuous. This periodicity arises naturally for self-similar fractals. It is also the case for the snowflake that we consider in Section 6. Finally, except perhaps for a very small class of domains, the "modified Weyl-Berry conjecture" is not true (see also [LV], 1996; [MoVa], 1995).

THE INVERSE PROBLEM:

We now give some conditions under which "we can hear the dimension of a fractal boundary" in $\mathbb{R}^2$. The following results ([V2], 1990; [FV1], 1990; [FV3]), close to the one in [BC], 1986, are derived from the asymptotics of the partition function defined in (4.2):

THEOREM 2. If $\Omega$ is a bounded domain in $\mathbb{R}^n$, $n \ge 2$, then $d_i$, the interior Bouligand-Minkowski dimension of $\partial\Omega$, is such that:

(5.11) $d_i \ge -2 \liminf\,(\ln t)^{-1} \ln\left[\,|\Omega|_n (4\pi t)^{-n/2} - Z(t,\Omega)\right].$

Moreover, if $n = 2$ and if $\partial\Omega$ consists only of a finite number of connected components, then one has equality in (5.11).

REMARK 5: These conditions are necessary since it is possible to construct examples with strict inequality in (5.11).

6 HEAT EQUATION ON THE TRIADIC VON KOCH SNOWFLAKE

We now consider the heat equation defined on $D \subset \mathbb{R}^n$, $n \ge 1$. Let $u : (x,t) \in D \times [0,+\infty) \to u(x,t) \in \mathbb{R}$ satisfy:

(6.1) $\Delta u(x,t) = \frac{\partial u}{\partial t}(x,t), \quad t > 0, \ x \in D,$

(6.2) $u(x,0) = 1, \quad x \in D.$

The total amount of heat contained in $D$ at the moment $t > 0$ is

(6.3) $Q_D(t) := \int_D u(x,t)\,dx$

and the total amount of heat lost up to the moment $t$ is

(6.4) $E_D(t) := \int_D \left(1 - u(x,t)\right) dx.$

We now describe the asymptotic behaviour of the function $E_D(t)$ as $t \to 0$, when the domain $D$ is the triadic von Koch snowflake in $\mathbb{R}^2$.

[Figure: the triadic von Koch snowflake.]

Let us first recall some known results concerning $E_D(t)$. For planar domains with polygonal boundary $\partial D$, it follows from [vdBSr], [Du] that:

(6.5) $E_D(t) = |D|_2 - Q_D(t) = 2\,(t/\pi)^{1/2}\,|\partial D|_1 - t \sum_{j=1}^{k} c(\theta_j) + O(e^{-r/t}), \quad t \to +0,$

with some positive constant $r$ depending on $D$. Here $|D|_2$ is (as above) the area of $D$, $|\partial D|_1$ is the length of the boundary $\partial D$, $\theta_j$, $j = 1, 2, \ldots, k$, are the angles at the vertices of $\partial D$, and the function $c : [0,2\pi] \to \mathbb{R}$ is defined by

(6.6) $c(\theta) := \int_0^{\infty} \frac{4\sinh((\pi - \theta)y)}{\sinh(\pi y)\cosh(\theta y)}\,dy.$
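Formula (6.5) can be tested on the unit square, where the heat content is available as an eigenfunction series. The sketch below is illustrative and rests on assumptions that are not spelled out in the text: the solution of (6.1)-(6.2) is taken with the Dirichlet condition $u = 0$ on $\partial D$, the integral (6.6) is approximated by a composite Simpson rule on a truncated interval (adequate for angles that are not too small), and all function names are hypothetical.

```python
import math

def corner_function(theta, upper=40.0, steps=20000):
    """c(theta) from (6.6), by composite Simpson quadrature on [0, upper]."""
    def integrand(y):
        if y == 0.0:
            return 4.0 * (math.pi - theta) / math.pi     # limit of the integrand at y = 0
        return (4.0 * math.sinh((math.pi - theta) * y)
                / (math.sinh(math.pi * y) * math.cosh(theta * y)))
    h = upper / steps                                    # steps must be even
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def heat_loss_square_series(t, a=1.0):
    """E_D(t) for the square of side a, from the Dirichlet eigenfunction expansion."""
    content_1d, p = 0.0, 1
    while True:
        term = 8 * a / (p * math.pi) ** 2 * math.exp(-((p * math.pi / a) ** 2) * t)
        if term < 1e-18:
            break
        content_1d += term
        p += 2                                           # only odd modes contribute
    return a * a - content_1d ** 2

def heat_loss_polygon_formula(t, perimeter, angles):
    """First two terms of (6.5): 2*sqrt(t/pi)*|dD| - t * sum of c(theta_j)."""
    return (2 * math.sqrt(t / math.pi) * perimeter
            - t * sum(corner_function(theta) for theta in angles))
```

For the unit square (perimeter 4, four angles $\pi/2$) the two functions agree to many digits at `t = 0.01`, consistent with the exponentially small remainder in (6.5); the quadrature also returns a value of `corner_function(math.pi / 2)` matching $4/\pi$, which follows from (6.6) by an elementary integration.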

The first term in the right-hand side of (6.5) corresponds to the loss of heat through the sides of the polygon $\partial D$ and the second term is the amount of heat lost near the vertices. For an arbitrary open bounded and connected set $D$ in $\mathbb{R}^2$ with smooth boundary ($\partial D \in C^3$), we have ([vdBDa], [Du]):

$$E_D(t) = 2\,(t/\pi)^{1/2}\,|\partial D|_1 - \pi t\,\chi(D) + O(t^{3/2}), \quad t \to +0,$$

where $\chi(D)$ is the Euler-Poincare characteristic of $D$ (i.e., $1 - \chi(D)$ is the number of holes in $D$). Moreover, van den Berg ([vdB], 1999) establishes some bounds for $E_D(t)$ for $D$ an arbitrary open set in $\mathbb{R}^n$ with finite volume and a fractal boundary which satisfies a uniform capacitary density condition. The main result in [FLV1], [FLV2] is the following.

THEOREM 3. For the triadic von Koch snowflake (with Minkowski dimension $d = 2\ln 2/\ln 3$):

(6.7) $E_D(t) = p(\ln t)\,t^{1-d/2} + q(\ln t)\,t + O(e^{-r/t}) \quad \text{as} \ t \to +0,$

where $p$ and $q$ are continuous, $(\ln 9)$-periodic functions and $r$ is some positive constant. Moreover:

(6.8) $p_{\min}(z) \le \tilde{p}(z) \le p_{\max}(z),$

where $z = \ln t/\ln 9$ (so that $t = 9^z$), $\tilde{p}(z) = p(z \ln 9)$, and $p_{\min}$, $p_{\max}$ are given by explicit series:

(6.9) [explicit series for $p_{\min}(z)$ and $p_{\max}(z)$; the formulas are not recoverable from the source]

Both functions $p_{\min}(z)$ and $p_{\max}(z)$ are positive, continuous and 1-periodic.

There are several other new results concerning self-similar domains such as snowflakes, cabbages, ... ([FLV1], [LV], [vdB], [MoVa]); they all show the existence of an oscillating term as in (6.8); these oscillations, which arise naturally in the calculations, are related to the renewal theorem (see [LV]).

ACKNOWLEDGMENT: The author expresses her gratitude to the organizers of the Congress and to Al-Azhar University for supporting her visit, and to H. Berriche for improving the final version of this paper.

REFERENCES

[vdB] M. van den Berg, Heat equation on the arithmetic von Koch snowflake, Probability Theory, to appear (1999).

[vdBDa] M. van den Berg and E. B. Davies, Heat flow out of regions in $\mathbb{R}^m$, Math. Z. v.202 (1989), p.463-482.

[vdBSr] M. van den Berg and S. Srisatkunarajah, Heat flow and Brownian motion for a region in $\mathbb{R}^2$ with a polygonal boundary, Probab. Th. Rel. Fields v.86 (1990), p.41-52.

[Be1] M.V. Berry, Distribution of modes in fractal resonators, in Structural stability in Physics, Springer-Verlag, Berlin (1979), p.51-53.

[Be2] M.V. Berry, Some geometric aspects of wave motion: wavefront dislocations, diffraction catastrophes, diffractals, in Geometry of the Laplace operator; Proc. Symp. Pure Math., A.M.S. v.36 (1980), p.13-38.

[Bo] G. Bouligand, Ensembles impropres et nombre dimensionnel, Bull. Sci. Math., 2, t.52 (1928), p.320-344 et 361-376.

[BC] J. Brossard and R. Carmona, Can one hear the dimension of a fractal?, Comm. Math. Phys. 104 (1986), 103-122.

[CH] R. Courant and D. Hilbert, Methods of mathematical physics, Vol. 1, English transl., Interscience, New York, 1953.

[DG] J.J. Duistermaat and V.W. Guillemin, The spectrum of positive elliptic operators and periodic bicharacteristics, Invent. Math. 29 (1975), 39-79.

[Du] B. Duplantier, Can one "hear" the thermodynamics of a (rough) colloid?, Phys. Rev. Lett. v.66 (1991), p.1555-1558.

[Fa] K. Falconer, Fractal geometry, mathematical foundations and applications, Chichester; John Wiley & Sons (1990).

[F] J. Fleckinger-Pelle, On eigenvalue problems associated with fractal domains, Ordinary and Partial Differential Equations, vol.2, Sleeman, Jarvis Ed., Pitman Research Notes in Math. 216 (1988), p.60-72.

[FLV1] J. Fleckinger, M. Levitin and D. Vassiliev, Heat content of the triadic von Koch snowflake, Proc. R. Soc. Lond. Ser. 3, V.71 (1995), p.372-396.

[FLV2] J. Fleckinger, M. Levitin and D. Vassiliev, "The heat equation on a snowflake", Int. Jal. Appl. Sc. Comp., V.2, N.2 (1996), p.289-305.

[FMt] J. Fleckinger and G. Metivier, Theorie spectrale des operateurs uniformement elliptiques sur quelques ouverts irreguliers, C.R. Acad. Sci. Paris Ser. A 276 (1973), p.913-916.

[FV1] J. Fleckinger and D.G. Vasil'ev, Tambour fractal: exemple d'une formule asymptotique a deux termes pour la "fonction de comptage", C.R. Acad. Sci., Paris, Ser. I, Math., t.311 (1990), p.867-872.

[FV2] J. Fleckinger and D.G. Vasil'ev, An example of a two-term asymptotics for the "counting function" of a fractal drum, Transact. A.M.S. V.337, N.1 (1993), p.99-116.

[FV3] J. Fleckinger and D.G. Vasil'ev, "Vibration du tambour fractal: estimation du 2eme terme de la fonction de comptage et determination de la dimension du bord", Matapli, Paris, Oct 1992, p.29-36.

[G] C.F. Gauss, Disquisitiones arithmeticae, Leipzig (1801).

[GWW] C. Gordon, D. Webb and S. Wolpert, Isospectral plane domains and surfaces via Riemannian orbifolds, Invent. Math. 110, N.1 (1992), p.1-22.

[Iv1] V.Ya. Ivrii, On the second term of the spectral asymptotics for the Laplace-Beltrami operator on a manifold with a boundary, Funktsional Anal. i Prilozhen. 14 (1980), No 2, 25-35; English transl. Funct. Anal. Appl. 14 (1980).

[Iv2] V.Ya. Ivrii, Precise spectral asymptotics for elliptic operators acting in fiberings over manifolds with boundary, Lecture Notes in Math., Vol. 1100, Springer-Verlag, Berlin (1984).

[K] M. Kac, Can one hear the shape of a drum?, Amer. Math. Monthly 73 (1966), 1-23.

[L1] M.L. Lapidus, Fractal drum, inverse spectral problems for elliptic operators and a partial resolution of the Weyl-Berry conjecture, Trans. A.M.S. v.325 (1991), p.465-529.

[LF1] M.L. Lapidus and J. Fleckinger, The vibrations of a fractal drum, Lecture Notes in Pure and Applied Mathematics, Differential Equations, Marcel Dekker, N.Y.-Basel (1989), pp.423-436.

[LF2] M.L. Lapidus and J. Fleckinger, Tambour fractal: vers une resolution de la conjecture de Weyl-Berry pour les valeurs propres du laplacien, C.R. Acad. Sci. Paris Ser. I Math. 306 (1988), 171-175.

[M] S. Minakshisundaram and A. Pleijel, Some properties of the eigenfunctions of the Laplace operator on Riemannian manifolds, Canadian Jal Math. 1 (1949), p.242-256.

[Me 1] R.B. Melrose, Weyl's conjecture for manifolds with concave boundary, Geometry of the Laplace Operator, Proc. Symp. Pure Math., Vol. 36, Amer. Math. Soc., Providence, 1980, pp.257-273.

[Me 2] R.B. Melrose, The trace of the wave group, Contemp. Math., Vol. 27, Amer. Math. Soc., Providence, 1984, pp.127-167.

[Mt1] G. Metivier, Etude asymptotique des valeurs propres et de la fonction spectrale de problemes aux limites, These de Doctorat d'Etat, Mathematiques, Universite de Nice, France, 1976.

[Mt2] G.Metivier, Valeurs propres de problemes aux limites elliptiques irreguliers, Bull. Soc. Math. France, Mem. 51-52 (1977), 125-219.

[MoVa] S.Molchanov and B.Vainberg, On spectral asymptotics for Domains with Fractal Boundaries, (1995), preprint.

[S1] Yu.G.Safarov, Asymptotics of the spectrum of a boundary value problem with periodic billiard trajectories, Funktsional. Anal. i Prilozhen. 21 (1987), No.4, 90-92; English translation in Funct. Anal. Appl. 21 (1987).

[S2] Yu.G.Safarov, Precise asymptotics of the spectrum of a boundary value problem and periodic billiards, Izvestija Akad. Nauk SSSR, Mathematical Series 52 (1988), No. 6, 1230-1251; English translation in Mathematics of the USSR - Izvestija.

[Se] R.Seeley, A sharp asymptotic remainder estimate for the eigenvalues of the Laplacian in a domain of $\mathbb{R}^3$, Adv. in Math. 29, (1978), p.244-269.

[U] H.Urakawa, Bounded domains which are isospectral but not congruent, Ann. Sci. Ecole Normale Sup. 15, (1982), 441-456.

[V1] D.Vasil'ev, Asymptotics of the spectrum of a boundary value problem, Trudy Moscov. Mat. Obsch. 49 (1986), 167-237; English translation in Trans. Moscow Math. Soc., 1987, pp. 173-245.

[V2] D.Vasil'ev, One can hear the dimension of a connected fractal in $\mathbb{R}^2$, in Petkov & Lazarov - Integral Equations and Inverse Problems; Longman Academic, Scientific & Technical, 1990 (to appear).

[W1] H.Weyl, Uber die asymptotische Verteilung der Eigenwerte, Gott. Nach. (1911), 110-117.

[W2] H.Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen, Math. Ann. 71 (1912), 441-479.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 323-356)


INTERMEDIATE STATES: SOME NONCLASSICAL PROPERTIES

M. SEBAWE ABDALLA† AND A.-S. F. OBADA*

† Department of Mathematics, College of Science, King Saud University, P.O.Box 2455, Riyadh 11451, Saudi Arabia

* Department of Mathematics, Faculty of Science, El-Azhar University, Nasr City 11884, Cairo, Egypt

Abstract. In this article we consider in some detail some new classes of states. These are intermediate states, either between the pure number (Fock) states and the (non-pure) chaotic (thermal) state, such as the geometric state, or between the coherent state and the number state, such as the binomial state. We extend the discussion to other states such as the even (odd) coherent states, the even (odd) binomial states and the phased generalized binomial state. In studying these states we pay attention to their nonclassical properties as well as their statistical properties, for example correlation functions, squeezing, and quasiprobability distribution functions (P-representation, Wigner W-function, and Q-function). Furthermore we consider the field distribution and the photon number distribution, as well as the phase properties. Finally, some schemes for the production of these states are presented.

1. Introduction.

Since the early days of quantum optics it has been well known that the Fock (number) state and the coherent state represent two of the most fundamental states of a single boson mode [1,2]. The number state $|n\rangle$ is determined by its photon number, and its phase is entirely random. In this state the amplitude of the field has zero expectation value. The coherent state $|\alpha\rangle$ can be generated by the action of the Glauber displacement operator $D(\alpha) = \exp(\alpha a^\dagger - \alpha^* a)$ on the vacuum state $|0\rangle$, such that

$$|\alpha\rangle = \exp(\alpha a^\dagger - \alpha^* a)|0\rangle = \exp\left(-\tfrac{1}{2}|\alpha|^2\right)\sum_{n=0}^{\infty}\frac{\alpha^n}{\sqrt{n!}}\,|n\rangle \tag{1.1}$$

with $a$ ($a^\dagger$) standing for the annihilation (creation) boson operator and $\alpha$ complex. For this state the phase is determined and the amplitude of the field has a non-zero value. In fact the coherent state is a linear combination of all $|n\rangle$ states with coefficients chosen such that the photon counting distribution is Poissonian. To generate such a state one can use the fact that a classical charge distribution radiates a field in a coherent state, while a single atom in its first excited state, in the absence of external interactions, radiates a field in the $|n = 1\rangle$ state. It is worthwhile to mention another state, the chaotic (thermal) state [3], whose density operator $\rho_{th}$ is given by

$$\rho_{th} = \sum_{n=0}^{\infty}\frac{\bar{n}^{\,n}}{(1+\bar{n})^{n+1}}\,|n\rangle\langle n| \tag{1.2}$$

where $\bar{n}^{\,n}/(1+\bar{n})^{n+1}$ is the Bose-Einstein distribution function and $\bar{n}$ is the mean photon number. Recently there has been a great deal of interest in producing and generating new states in addition to the previous ones. Most of these are intermediate states and can be generated from the above states. They interpolate between distinct states, reducing to them in different limits of the parameters involved. In consequence, an intermediate state plays a unifying role, describing the physical properties of its limiting states. The earliest example in the literature is the binomial state [4], which interpolates between the coherent state and the number (Fock) state. Another is the negative binomial state [5], which bridges the coherent state and the quasi-thermal state (i.e. the Susskind-Glogower phase state [6]). The notions of the even binomial states [7] (between the even coherent and the even number state) and the q-deformed binomial states [8] (between the q-coherent and the q-number state) have been introduced. The negative binomial states were also generalized to the even (odd) negative binomial states [9], which interpolate between the even (odd) coherent state and the even (odd) quasi-thermal state. The logarithmic state, as a special case of the negative binomial states with the n = 0 term removed, was also investigated [10]. There are also many other intermediate states, among which one can cite: (i) the generalized geometric state [11], between the number state and the (non-pure) chaotic (thermal) state; (ii) the intermediate number phase state [12], between the number state and the Pegg-Barnett phase state [13]; (iii) the intermediate number squeezed state [14], between the number state and the squeezed coherent state; (iv) the even and odd intermediate number squeezed states [15], between the even (odd) number state and the even (odd) squeezed state.
Most of the theoretical studies concerning these states have focused on their construction and on the possible occurrence of various nonclassical effects exhibited by them. All the above mentioned states are grouped under the category of nonclassical states of light, and the main purpose of the present work is to review some nonclassical properties of some of these states. To reach this goal we take the even and odd coherent states as our starting point. This is done in the following section.

2. Even and odd coherent states.

2.1 Generation of the state.

During their study of a singular nonstationary one-dimensional oscillator, Dodonov et al. introduced the even and odd coherent states [16]. They showed that these states separately form complete sets in the Hilbert spaces of even and odd functions. The states can be written as

$$|\alpha\rangle_\pm = \tfrac{1}{2}\lambda_\pm\, e^{|\alpha|^2/2}\left(|\alpha\rangle \pm |-\alpha\rangle\right), \tag{2.1}$$

or in terms of number states

$$|\alpha\rangle_+ = \lambda_+\sum_{n=0}^{\infty}\frac{\alpha^{2n}}{\sqrt{(2n)!}}\,|2n\rangle \tag{2.2}$$

and

$$|\alpha\rangle_- = \lambda_-\sum_{n=0}^{\infty}\frac{\alpha^{2n+1}}{\sqrt{(2n+1)!}}\,|2n+1\rangle, \tag{2.3}$$

where $\lambda_+$ and $\lambda_-$ are the normalization constants for the even and odd coherent states respectively, given by

$$\lambda_+ = [\cosh|\alpha|^2]^{-1/2}, \qquad \lambda_- = [\sinh|\alpha|^2]^{-1/2}. \tag{2.4}$$

The factor $e^{|\alpha|^2/2}$ in (2.1) cancels the Gaussian prefactor of (1.1), so that the same constants (2.4) normalize (2.1), (2.2) and (2.3).

Several methods of constructing these states are given in the literature. One of them uses the inversion operator $I$, which has the properties $I a I = -a$, $I a^\dagger I = -a^\dagger$ and consequently $I D(\alpha) I = D(-\alpha)$. One can construct two operators, each generating an irreducible representation of the group consisting of two elements, the unit operator and the inversion operator $I$. The first operator generates the symmetric representation of the group and takes the form

$$\cosh(\alpha a^\dagger - \alpha^* a) = \tfrac{1}{2}[D(\alpha) + D(-\alpha)] = D_+(\alpha), \tag{2.5}$$

while the second operator generates the antisymmetric representation of the group,

$$\sinh(\alpha a^\dagger - \alpha^* a) = \tfrac{1}{2}[D(\alpha) - D(-\alpha)] = D_-(\alpha), \tag{2.6}$$

and hence $|\alpha\rangle_\pm = \lambda_\pm\, e^{|\alpha|^2/2} D_\pm(\alpha)|0\rangle$. It is easy to see that the parity of these functions with respect to $\alpha$ is the same as with respect to the coordinate, therefore these functions completely describe the even and odd coherent states. Here it is interesting to mention the solution of the problem of the nonstationary oscillator with a wall at the origin of the coordinates, which has been obtained in terms of odd coherent states; for more details see ref. [16]. Recently it has been shown that quantum interference between coherent states leads to the generation of states whose properties are as far as one can imagine from those of classical states. For example, the superposition of two coherent states ($|\alpha\rangle_+$) can arise as a consequence of the propagation of coherent light through an amplitude-dispersive medium. Furthermore it has been shown that the even coherent states exhibit ordinary (second order) squeezing as well as fourth order squeezing. The idea of superposing coherent states has also been extended to continuous one-dimensional superpositions of coherent states, whose quadrature variances show a significant reduction of fluctuations in one of the quadratures. The nonclassical properties of even and odd coherent states have been studied in detail, see for example refs. [17-20].
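The construction above can be checked numerically. The following is a minimal sketch, assuming a truncated Fock basis (the cutoff `n_max` and all function names are illustrative choices, not part of the original text): it builds the coefficients of the even and odd coherent states from the number-state expansions, and separately from the symmetric and antisymmetric combinations of $|\alpha\rangle$ and $|-\alpha\rangle$.

```python
import math


def even_odd_coherent(alpha, n_max=60):
    """Truncated Fock coefficients of the even and odd coherent states.

    Uses lambda_+ = cosh(|alpha|^2)^(-1/2) and lambda_- = sinh(|alpha|^2)^(-1/2).
    The cutoff n_max is an illustrative assumption.
    """
    x = abs(alpha) ** 2
    lam_plus = 1.0 / math.sqrt(math.cosh(x))
    lam_minus = 1.0 / math.sqrt(math.sinh(x))
    even, odd = [], []
    for n in range(n_max + 1):
        amp = complex(alpha) ** n / math.sqrt(math.factorial(n))
        even.append(lam_plus * amp if n % 2 == 0 else 0j)
        odd.append(lam_minus * amp if n % 2 == 1 else 0j)
    return even, odd


def inner(u, v):
    """Inner product <u|v> of two coefficient lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def superposition(alpha, sign, n_max=60):
    """Coefficients of (1/2) e^{|alpha|^2/2} (|alpha> + sign |-alpha>)."""
    return [0.5 * (complex(alpha) ** n + sign * complex(-alpha) ** n)
            / math.sqrt(math.factorial(n)) for n in range(n_max + 1)]


alpha = 0.8 + 0.6j
even, odd = even_odd_coherent(alpha)
norm_even = inner(even, even).real   # close to 1
norm_odd = inner(odd, odd).real      # close to 1
overlap = abs(inner(even, odd))      # 0: the two states live on disjoint Fock levels
```

Multiplying `superposition(alpha, +1)` by $\lambda_+$ reproduces `even` term by term, which is the statement that the symmetric combination of two coherent states contains only even photon numbers.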

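2.2 Photon number.

The photon number distribution for both even and odd states can be obtained from equation (2.1). For the even case we have [17]

$$P^{(e)}(n) = \begin{cases} \dfrac{|\alpha|^{2n}}{n!\,\cosh|\alpha|^2}, & n \text{ even},\\[2mm] 0, & n \text{ odd}, \end{cases} \tag{2.7}$$

while the probability $P^{(o)}(n)$ of finding $n$ photons in the odd state is

$$P^{(o)}(n) = \begin{cases} \dfrac{|\alpha|^{2n}}{n!\,\sinh|\alpha|^2}, & n \text{ odd},\\[2mm] 0, & n \text{ even}. \end{cases} \tag{2.8}$$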
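As a quick illustration of (2.7) and (2.8), the sketch below evaluates both distributions and their mean photon numbers. The function names and the cutoff `n_max` are assumptions made for the example; the closed forms $|\alpha|^2\tanh|\alpha|^2$ and $|\alpha|^2\coth|\alpha|^2$ used for comparison follow from summing $n\,P(n)$ over the even or odd integers.

```python
import math


def photon_distribution(abs_alpha, parity, n_max=80):
    """P(n) of the even (parity=0) or odd (parity=1) coherent state, n = 0..n_max."""
    x = abs_alpha ** 2
    norm = math.cosh(x) if parity == 0 else math.sinh(x)
    return [x ** n / (math.factorial(n) * norm) if n % 2 == parity else 0.0
            for n in range(n_max + 1)]


def mean_photon_number(p):
    """First moment of a photon number distribution given as a list."""
    return sum(n * pn for n, pn in enumerate(p))


abs_alpha = 1.5
x = abs_alpha ** 2
p_even = photon_distribution(abs_alpha, 0)
p_odd = photon_distribution(abs_alpha, 1)
mean_even = mean_photon_number(p_even)   # compare with x * tanh(x)
mean_odd = mean_photon_number(p_odd)     # compare with x / tanh(x)
```

The alternating zeros in `p_even` and `p_odd` are the oscillations referred to in the text; the envelope of the non-zero entries follows the Poisson shape of the ordinary coherent state, rescaled by the normalization.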

It has been shown that the photon number probability distributions are oscillatory and otherwise resemble the Poisson distribution of the ordinary coherent state (see figure 1 of Ref. [17]).

2.3 Quasidistribution function.

The representation of quantum fields in phase space in terms of quasiprobabilities is widely used in quantum optics, with particular emphasis on the Wigner W-function [21,22], the Q-function [23], and the Glauber-Sudarshan P-representation [2,23]. In this subsection we shall be concerned with the Wigner W-function and the Q-function for both even and odd coherent states. To find them we have to calculate the characteristic function $C(\xi, s)$, which is associated with the ordering of the bosonic (photon) operators and is given by

$$C(\xi, s) = \mathrm{Tr}\left[\rho\exp\left(\xi a^\dagger - \xi^* a + \tfrac{s}{2}|\xi|^2\right)\right], \tag{2.9}$$

where $\rho$ is the density matrix of the state under consideration and $s$ is a parameter that defines the relevant quasiprobability distribution function, taking the values $-1, 0, 1$ for the Husimi Q-function, the Wigner W-function, and the P-representation, respectively. The quasidistribution functions $I(\beta, s)$ are defined through the Fourier transform of the characteristic function in the form

$$I(\beta, s) = \frac{1}{\pi^2}\int d^2\xi\,\exp(\beta\xi^* - \beta^*\xi)\,C(\xi, s), \tag{2.10}$$

where $W(\beta) = I(\beta, 0)$, $Q(\beta) = I(\beta, -1)$ and $P(\beta) = I(\beta, 1)$. For any state $|\psi_g\rangle$ expanded in terms of the Fock states $|n\rangle$ as

$$|\psi_g\rangle = \sum_n C_n|n\rangle, \qquad \text{with} \qquad \rho = |\psi_g\rangle\langle\psi_g|, \tag{2.11}$$

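the generalized quasidistribution function $I(\beta, s)$ of equation (2.10) can be cast in the equivalent form [24]

$$I(\beta, s) = \frac{2}{\pi(1-s)}\,e^{-\frac{2}{1-s}|\beta|^2}\sum_{m,n} C_m^* C_n\sqrt{\frac{n!}{m!}}\left(\frac{s+1}{s-1}\right)^{n}\left(\frac{2\beta}{1-s}\right)^{m-n} L_n^{m-n}\!\left(\frac{4|\beta|^2}{1-s^2}\right). \tag{2.12}$$

Hence the Wigner function becomes

$$W(\beta) = I(\beta, 0) = \frac{2}{\pi}\,e^{-2|\beta|^2}\sum_{m,n}(-1)^n C_m^* C_n\sqrt{\frac{n!}{m!}}\,(2\beta)^{m-n} L_n^{m-n}(4|\beta|^2), \tag{2.13}$$

while the Husimi Q-function takes the form

$$Q(\beta) = I(\beta, -1) = \frac{e^{-|\beta|^2}}{\pi}\sum_{m,n} C_m^* C_n\,\frac{\beta^{m}\beta^{*n}}{\sqrt{m!\,n!}}, \tag{2.14}$$

where $L_n^{\alpha}(x) = \sum_{m=0}^{n}\binom{n+\alpha}{n-m}\dfrac{(-x)^m}{m!}$ is the associated Laguerre polynomial. For the P-representation one has to take the proper limit as $s \to 1$. For the even coherent states the Wigner function can be written as

$$W^{(e)}(\beta) = \frac{\exp(|\alpha|^2 - 2|\beta|^2)}{\pi\cosh|\alpha|^2}\left[e^{-2|\alpha|^2}\cosh 2(\alpha\beta^* + \alpha^*\beta) + \cos\left[2i(\alpha\beta^* - \alpha^*\beta)\right]\right]. \tag{2.15}$$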

For the odd coherent states we have

$$W^{(o)}(\beta) = \frac{\exp(|\alpha|^2 - 2|\beta|^2)}{\pi\sinh|\alpha|^2}\left[e^{-2|\alpha|^2}\cosh 2(\alpha\beta^* + \alpha^*\beta) - \cos\left[2i(\alpha\beta^* - \alpha^*\beta)\right]\right]. \tag{2.16}$$

Figures for these functions show negative values, which is a sign of nonclassical effects. Similarly we can write the Q-function for the even coherent states in the form

$$Q^{(e)}(\beta) = \frac{e^{-|\beta|^2}}{2\pi\cosh|\alpha|^2}\left[\cosh(\alpha\beta^* + \alpha^*\beta) + \cos\left[i(\alpha\beta^* - \alpha^*\beta)\right]\right], \tag{2.17}$$

while for the odd case we get

$$Q^{(o)}(\beta) = \frac{e^{-|\beta|^2}}{2\pi\sinh|\alpha|^2}\left[\cosh(\alpha\beta^* + \alpha^*\beta) - \cos\left[i(\alpha\beta^* - \alpha^*\beta)\right]\right]. \tag{2.18}$$

After this quick review of the even and odd coherent states, we turn to the intermediate states, beginning with the binomial states.
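The Q-function of a pure state is $|\langle\beta|\psi\rangle|^2/\pi$, so for the even and odd coherent states it can be evaluated either from the Fock expansion or in closed form. The sketch below compares the two and also evaluates the Wigner function at the origin, where only the diagonal terms of the Fock sum survive and $W(0) = (2/\pi)\sum_n(-1)^n|C_n|^2$. The closed form used here, $e^{-|\beta|^2}[\cosh(2\,\mathrm{Re}\,\alpha\beta^*) \pm \cos(2\,\mathrm{Im}\,\alpha\beta^*)]/(2\pi N)$ with $N = \cosh|\alpha|^2$ or $\sinh|\alpha|^2$, is derived for this example from $|\cosh z|^2$ and $|\sinh z|^2$; the function names and the cutoff are illustrative assumptions.

```python
import math


def fock_coefficients(alpha, parity, n_max=60):
    """Truncated coefficients of the even (parity=0) or odd (parity=1) coherent state."""
    x = abs(alpha) ** 2
    norm = math.cosh(x) if parity == 0 else math.sinh(x)
    return [complex(alpha) ** n / math.sqrt(math.factorial(n) * norm)
            if n % 2 == parity else 0j for n in range(n_max + 1)]


def q_from_series(coeffs, beta):
    """Q(beta) = |<beta|psi>|^2 / pi, evaluated from the Fock expansion."""
    amp = sum(c * complex(beta).conjugate() ** n / math.sqrt(math.factorial(n))
              for n, c in enumerate(coeffs))
    return math.exp(-abs(beta) ** 2) * abs(amp) ** 2 / math.pi


def q_closed_form(alpha, beta, parity):
    """Closed form obtained from |cosh z|^2 or |sinh z|^2 with z = alpha * conj(beta)."""
    x = abs(alpha) ** 2
    z = complex(alpha) * complex(beta).conjugate()
    sign = 1.0 if parity == 0 else -1.0
    norm = math.cosh(x) if parity == 0 else math.sinh(x)
    bracket = math.cosh(2 * z.real) + sign * math.cos(2 * z.imag)
    return math.exp(-abs(beta) ** 2) * bracket / (2 * math.pi * norm)


def wigner_at_origin(coeffs):
    """W(0) = (2/pi) * sum_n (-1)^n |C_n|^2, i.e. 2/pi times the mean parity."""
    return (2 / math.pi) * sum((-1) ** n * abs(c) ** 2 for n, c in enumerate(coeffs))


alpha, beta = 1.2 + 0.4j, 0.7 - 0.9j
even = fock_coefficients(alpha, 0)
odd = fock_coefficients(alpha, 1)
w0_even = wigner_at_origin(even)   # +2/pi
w0_odd = wigner_at_origin(odd)     # -2/pi, a negative value
```

The odd state has $W(0) = -2/\pi$ for every $\alpha$, which is the simplest instance of the negativity mentioned in the text, while its Q-function vanishes at the origin and stays non-negative everywhere.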

3. The Binomial states.

A natural way to study the intermediate states, especially those that interpolate between coherent states and number states, is to start from the binomial state [4,25]. The binomial state $|\eta, M\rangle$ is a linear combination of the number states $|0\rangle, |1\rangle, \ldots, |M\rangle$ with coefficients chosen such that the photon counting probability distribution is binomial with mean $M|\eta|^2$. The binomial state is defined as

$$|\psi_{\eta,M}\rangle = |\eta, M\rangle = \sum_{n=0}^{M} B_n^M|n\rangle, \tag{3.1}$$

where $B_n^M$ are the binomial coefficients given by

$$B_n^M = \binom{M}{n}^{1/2}\eta^n\,(1 - |\eta|^2)^{(M-n)/2}. \tag{3.2}$$

The binomial state is quantum mechanical in nature. It produces light which is antibunched and sub-Poissonian, but it cannot attain the minimum uncertainty product, since it contains a finite number of states $|n\rangle$. Different methods may be used to generate the binomial state [7]. One of them uses the Hamiltonian

$$H = \mu J_+ + \mu^* J_-, \tag{3.3}$$

where $J_+$ ($J_-$) are the raising (lowering) operators related to the angular momentum operator, which satisfy the SU(2) commutation relations $[J_+, J_-] = 2J_z$ and $[J_z, J_\pm] = \pm J_\pm$. The evolution operator of the Hamiltonian (3.3) can be written as

$$U(t) = \exp(\zeta J_+ - \zeta^* J_-), \tag{3.4}$$

where $\zeta = -i\mu t$. If we now let (3.4) act on the state $|l, -l\rangle$, where $l$ is the co-operation number, we find

$$|\psi\rangle = U(t)|l, -l\rangle = \exp(\zeta J_+ - \zeta^* J_-)|l, -l\rangle. \tag{3.5}$$

Taking $\zeta = \tfrac{\theta}{2}\exp(-i\varphi)$ and defining $\tau = e^{-i\varphi}\tan\tfrac{\theta}{2}$, equation (3.5) becomes

$$|\psi\rangle = (1 + |\tau|^2)^{-l}\sum_{m=-l}^{l}\binom{2l}{l+m}^{1/2}\tau^{l+m}\,|l, m\rangle. \tag{3.6}$$

By setting $\tau = \eta/(1 - |\eta|^2)^{1/2}$, $l + m = n$ and $2l = M$, equation (3.6) takes the form of equation (3.1). The binomial state is thus a good example of an intermediate state. However we shall concentrate on the so-called even and odd binomial states [7,26,27]. These states interpolate between even (odd) coherent states and even (odd) number states, as will be seen in the following subsections.

3.1 The even binomial state.

The even binomial state $|\psi_e\rangle$ is defined as

$$|\psi_e\rangle = \frac{\lambda_1}{2}\left(|\eta, M\rangle + |-\eta, M\rangle\right) = \lambda_1\sum_{n=0}^{[M/2]}\binom{M}{2n}^{1/2}\eta^{2n}(1 - |\eta|^2)^{(M-2n)/2}\,|2n\rangle = \lambda_1\sum_{n=0}^{[M/2]} B_{2n}^M|2n\rangle, \tag{3.7}$$

where $[M/2]$ is the largest integer less than or equal to $M/2$, and $\lambda_1$ is the normalization constant given by

$$|\lambda_1|^2 = 2\left[1 + (1 - 2|\eta|^2)^M\right]^{-1}. \tag{3.8}$$

As a limiting case, when $\eta \to 0$ and $M \to \infty$ such that $M|\eta|^2 \to |\alpha|^2$, equation (3.7) reduces to equation (2.2), corresponding to the even coherent state $|\alpha\rangle_+$.

i) Correlation function

To discuss antibunching we consider the Glauber second order (zero time) correlation function

$$g^{(2)}(0) = \frac{\langle a^{\dagger 2}a^2\rangle}{\langle a^\dagger a\rangle^2} = \frac{\langle\hat{n}^2\rangle - \langle\hat{n}\rangle}{\langle\hat{n}\rangle^2}. \tag{3.9}$$

From equation (3.7) we calculate the expectation value of the photon number $\hat{n}$ as well as the second moment $\hat{n}^2$, from which equation (3.9) can be rewritten as

$$g^{(2)}(0) = \left(1 - \frac{1}{M}\right)\frac{\left[1 + (1 - 2|\eta|^2)^M\right]\left[1 + (1 - 2|\eta|^2)^{M-2}\right]}{\left[1 - (1 - 2|\eta|^2)^{M-1}\right]^2}. \tag{3.10}$$
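The closed form for $g^{(2)}(0)$ of the even binomial state can be cross-checked against a direct evaluation from the photon distribution $P(2n) \propto \binom{M}{2n}|\eta|^{4n}(1-|\eta|^2)^{M-2n}$. The sketch below does this; the function names and the sample parameter values are illustrative assumptions.

```python
import math


def g2_even_direct(p, M):
    """g2(0) of the even binomial state from its photon distribution, p = |eta|^2."""
    weights = {n: math.comb(M, n) * p ** n * (1 - p) ** (M - n)
               for n in range(0, M + 1, 2)}
    norm = sum(weights.values())
    n1 = sum(n * w for n, w in weights.items()) / norm            # <n>
    n2 = sum(n * (n - 1) * w for n, w in weights.items()) / norm  # <n(n-1)>
    return n2 / n1 ** 2


def g2_even_closed(p, M):
    """Closed form in terms of x = 1 - 2|eta|^2."""
    x = 1 - 2 * p
    return (1 - 1 / M) * (1 + x ** M) * (1 + x ** (M - 2)) / (1 - x ** (M - 1)) ** 2


small_eta = g2_even_closed(0.05, 8)   # about 7.04, super-Poissonian
large_eta = g2_even_closed(0.90, 8)   # about 0.88, sub-Poissonian

abs_alpha_sq, big_M = 1.0, 10 ** 6    # even coherent state limit at fixed M |eta|^2
limit_value = g2_even_closed(abs_alpha_sq / big_M, big_M)
coth_sq = 1 / math.tanh(abs_alpha_sq) ** 2
```

The two sample values show the crossover described in the text: the state is super-Poissonian for small $|\eta|$ and sub-Poissonian for large $|\eta|$, and the limit reproduces $\coth^2|\alpha|^2$.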

Equation (3.10) goes to $\coth^2|\alpha|^2$ if $\eta \to 0$ and $M \to \infty$ with $M|\eta|^2 \to |\alpha|^2$, which is the value of the correlation function for the even coherent state. Numerical investigation of equation (3.10) shows that the system is super-Poissonian for small values of the parameter $\eta$, although for large $M$ the interval of super-Poissonian behavior diminishes, while the system is sub-Poissonian for large values of $\eta$. This means that increasing the number of photons in the even binomial state changes the distribution from super-Poissonian to sub-Poissonian (see Figure 1 of Ref. [7]).

ii) Squeezing

Squeezing is one of the most interesting phenomena in quantum optics, and is a direct quantum effect of Heisenberg's uncertainty principle. It reflects reduced quantum fluctuations in one of the field quadratures at the expense of the other, correspondingly stretched, quadrature. Our aim in the present subsection is to consider two cases of squeezing. The first is normal squeezing, which can be discussed through the fluctuations in the quadratures

$$X = \tfrac{1}{2}(a^\dagger + a), \qquad Y = \tfrac{1}{2i}(a - a^\dagger), \tag{3.11}$$

where $a$ and $a^\dagger$ satisfy the (bosonic) commutation relation $[a, a^\dagger] = 1$, so that $[X, Y] = \tfrac{i}{2}$. The second is amplitude-squared squeezing, which arises naturally in second-harmonic generation and in a number of nonlinear optical processes, and is defined through

$$d_1 = \tfrac{1}{2}(a^2 + a^{\dagger 2}), \qquad d_2 = \tfrac{1}{2i}(a^2 - a^{\dagger 2}), \tag{3.12}$$

which satisfy the commutation relation $[d_1, d_2] = i(1 + 2\hat{n})$. To discuss squeezing we first have to calculate the expectation value of the operator $a^{2s}$ with respect to the state given by equation (3.7). For $2s < M$ we find

$$\langle a^{2s}\rangle = |\lambda_1|^2\left(\frac{\eta^2}{1 - |\eta|^2}\right)^{s}\sum_{n=0}^{[M/2]-s}\binom{M}{2n}\sqrt{\frac{(M-2n)!}{(M-2n-2s)!}}\;|\eta|^{4n}(1 - |\eta|^2)^{M-2n}, \tag{3.13}$$

where $s$ is a positive integer. The field is said to be squeezed when $\langle(\Delta X)^2\rangle$ or $\langle(\Delta Y)^2\rangle < \tfrac{1}{4}$, while $\langle(\Delta X)^2\rangle\langle(\Delta Y)^2\rangle \geq \tfrac{1}{16}$ must hold. From the numerical study of ref. [7] one can see that squeezing occurs for all values of $M$, but it becomes more effective as $M$ increases. However, the maximum squeezing is shifted to lower $\eta$ as $M$ increases. It is also noted that as one changes the phase of the parameter $\eta$ the squeezing moves between the $X$ and $Y$ quadratures. For the second case (amplitude-squared squeezing) the field is said to be in an amplitude-squared squeezed state if $\langle(\Delta d_1)^2\rangle$ or $\langle(\Delta d_2)^2\rangle < \langle\hat{n}\rangle + \tfrac{1}{2}$, while $\langle(\Delta d_1)^2\rangle\langle(\Delta d_2)^2\rangle \geq (\langle\hat{n}\rangle + \tfrac{1}{2})^2$ must hold. The numerical study of this phenomenon shows that amplitude-squared squeezing is effective for large values of $M$; however the point of maximum squeezing in this case moves slightly as $M$ increases (see Figs 2 and 3 of Ref. [7]).

iii) Quasi-distribution function

As in the previous section we consider some statistical aspects of the even binomial state. We are only concerned with the quasiprobability distribution functions, namely the Wigner W-function and the Q-function. This is because the P-representation is highly singular due to the nonclassical character of the even binomial state. From equation (2.9) with $\rho = |\psi_e\rangle\langle\psi_e|$, together with equation (3.7), one can calculate the characteristic function $C^{(e)}(\xi, s)$ in the form

$$C^{(e)}(\xi, s) = |\lambda_1|^2(1 - |\eta|^2)^M\sum_{n=0}^{[M/2]}\binom{M}{2n}\left(\frac{|\eta|^2}{1 - |\eta|^2}\right)^{2n}\exp\left[-\tfrac{1}{2}(1 - s)|\xi|^2\right]L_{2n}(|\xi|^2), \tag{3.14}$$

where $L_n(x)$ are the Laguerre polynomials of order $n$,

$$L_n(x) = \sum_{r=0}^{n}\binom{n}{r}\frac{(-x)^r}{r!}. \tag{3.15}$$

From equations (2.13) and (2.14) we find the diagonal terms of the Wigner function,

$$W^{(e)}(\beta) = \frac{2}{\pi}|\lambda_1|^2(1 - |\eta|^2)^M\sum_{n=0}^{[M/2]}\binom{M}{2n}\left(\frac{|\eta|^2}{1 - |\eta|^2}\right)^{2n}\exp[-2|\beta|^2]\,L_{2n}(4|\beta|^2), \tag{3.16}$$

while for the Q-function we find

$$Q^{(e)}(\beta) = \frac{|\lambda_1|^2}{\pi}(1 - |\eta|^2)^M\sum_{n=0}^{[M/2]}\binom{M}{2n}\left(\frac{|\eta|^2}{1 - |\eta|^2}\right)^{2n}\frac{|\beta|^{4n}}{(2n)!}\exp[-|\beta|^2], \tag{3.17}$$

where we have only taken the diagonal terms of the density matrix $\rho$. For the Wigner function $W(\beta)$ of equation (3.16), when $|\eta|$ is very small ($\sim 0.2$) the distribution is almost Gaussian and its shape is insensitive to changes in $M$. Increasing the value of $M$ makes the peak slightly sharper. However, if one increases $|\eta|$ ($\sim 0.6$), so that effects due to higher excitations become important, one notes a remarkable change in the shape of the function as $M$ increases. The sharpening of the peak and the appearance of shallower wobbles are signs of the contributions of higher excitations. In contrast, the Q-function of equation (3.17) is insensitive to any change in either $|\eta|$ or $M$ (see Figs 4-6 of Ref. [7] for details).

3.2 The odd binomial state.

The odd binomial state interpolates between the odd coherent state and the odd number state and is defined by [27]

$$|\psi_o\rangle = \frac{\lambda_2}{2}\left(|\eta, M\rangle - |-\eta, M\rangle\right) = \lambda_2\sum_{n=0}^{[(M-1)/2]} B_{2n+1}^M\,|2n+1\rangle, \tag{3.18}$$

where $\lambda_2$ is the normalization constant given by

$$|\lambda_2|^2 = 2\left[1 - (1 - 2|\eta|^2)^M\right]^{-1}. \tag{3.19}$$

Since the odd binomial state does not contain the vacuum state, the parameter $\eta$ ranges between 0 and 1 such that $0 < |\eta| < 1$. In the limit $M \to \infty$ and $\eta \to 0$, with $M|\eta|^2 \to |\alpha|^2$ as before, equation (3.18) tends to equation (2.3), corresponding to the odd coherent state.

i) Sub-Poissonian behavior

To discuss the sub-Poissonian behavior of the state one needs the explicit expression of the Glauber second order correlation function (3.9). For the odd binomial state we find

$$g^{(2)}(0) = \left(1 - \frac{1}{M}\right)\left\{1 - \frac{4(1 - 2|\eta|^2)^{M-2}(1 - |\eta|^2)^2}{\left[1 + (1 - 2|\eta|^2)^{M-1}\right]^2}\right\}. \tag{3.20}$$
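The same kind of cross-check can be made for the odd binomial state. The sketch below assumes the photon distribution $P(2n+1) \propto \binom{M}{2n+1}|\eta|^{4n+2}(1-|\eta|^2)^{M-2n-1}$ and compares the directly computed mean photon number and $g^{(2)}(0)$ with closed forms written in terms of $x = 1 - 2|\eta|^2$; the function names are illustrative.

```python
import math


def odd_binomial_direct(p, M):
    """Return (<n>, g2(0)) of the odd binomial state from its photon distribution."""
    weights = {n: math.comb(M, n) * p ** n * (1 - p) ** (M - n)
               for n in range(1, M + 1, 2)}
    norm = sum(weights.values())
    n1 = sum(n * w for n, w in weights.items()) / norm
    n2 = sum(n * (n - 1) * w for n, w in weights.items()) / norm
    return n1, n2 / n1 ** 2


def mean_odd_closed(p, M):
    """<a† a> = M p [1 + x^(M-1)] / [1 - x^M], with p = |eta|^2 and x = 1 - 2p."""
    x = 1 - 2 * p
    return M * p * (1 + x ** (M - 1)) / (1 - x ** M)


def g2_odd_closed(p, M):
    """g2(0) = (1 - 1/M) {1 - 4 x^(M-2) (1-p)^2 / [1 + x^(M-1)]^2}."""
    x = 1 - 2 * p
    return (1 - 1 / M) * (1 - 4 * x ** (M - 2) * (1 - p) ** 2 / (1 + x ** (M - 1)) ** 2)


mean_direct, g2_direct = odd_binomial_direct(0.3, 9)

abs_alpha_sq, big_M = 1.0, 10 ** 6    # odd coherent state limit at fixed M |eta|^2
limit_value = g2_odd_closed(abs_alpha_sq / big_M, big_M)
tanh_sq = math.tanh(abs_alpha_sq) ** 2
```

On a grid of parameters the closed form stays below one, which is the sub-Poissonian behavior stated in the text, and the limit reproduces $\tanh^2|\alpha|^2$.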

From the above function we can deduce that $g^{(2)}(0)$ is always less than one as long as both $\eta$ and $M$ are finite. However, if we increase $M$ and decrease $\eta$ at the same time, such that $\eta \to 0$ as $M \to \infty$, the correlation function tends to $\tanh^2|\alpha|^2$, so that a sub-Poissonian effect does exist for the odd coherent state. This emphasises that the odd binomial state has sub-Poissonian behavior. We also note that for large $M$ at fixed $\eta$ the function approaches unity more rapidly, and stays there, as $\eta \to 1$, so that the system shows coherent behavior [27].

ii) Squeezing

Since normal squeezing is based on the field quadrature operators of equation (3.11), in this case we find

$$\langle a^{2s}\rangle = |\lambda_2|^2\left(\frac{\eta^2}{1 - |\eta|^2}\right)^{s}\sum_{n=0}^{[(M-1)/2]-s}\binom{M}{2n+1}\sqrt{\frac{(M-2n-1)!}{(M-2n-2s-1)!}}\;|\eta|^{4n+2}(1 - |\eta|^2)^{M-2n-1}, \tag{3.21}$$

while the expectation value of the photon number is

$$\langle a^\dagger a\rangle = |\eta\lambda_2|^2\,(M/2)\left[1 + (1 - 2|\eta|^2)^{M-1}\right]. \tag{3.22}$$

The numerical investigation shows that the odd binomial state does not exhibit squeezing whatever the values of the parameters $\eta$ and $M$. For amplitude-squared squeezing the situation is different: the squeezing becomes pronounced as $M$ increases, and the maximum point moves toward higher values of $\eta$ (see Ref. [27]).

iii) Quasiprobability distribution

Following the same procedure as in the previous subsection we can calculate the quasidistribution functions for the odd binomial state. From equations (2.13), (2.14) and (3.18) we obtain the Wigner function in the form

$$W^{(o)}(\beta) = -\frac{2}{\pi}|\lambda_2|^2 e^{-2|\beta|^2}\Bigg\{\sum_{n=0}^{[(M-1)/2]}|B_{2n+1}^M|^2 L_{2n+1}(4|\beta|^2) + 2\sum_{\substack{m,n=0\\ m>n}}^{[(M-1)/2]}\sqrt{\frac{(2n+1)!}{(2m+1)!}}\;|B_{2n+1}^M||B_{2m+1}^M|\,(2|\beta|)^{2(m-n)}L_{2n+1}^{2(m-n)}(4|\beta|^2)\cos[2(n-m)(\varphi - \zeta)]\Bigg\}. \tag{3.23}$$

Similarly we can find the expression for the Q-function in the form

$$Q^{(o)}(\beta) = \frac{|\lambda_2|^2}{\pi}e^{-|\beta|^2}\Bigg\{\sum_{n=0}^{[(M-1)/2]}|B_{2n+1}^M|^2\frac{|\beta|^{4n+2}}{(2n+1)!} + 2\sum_{\substack{m,n=0\\ m>n}}^{[(M-1)/2]}\frac{|\beta|^{2n+2m+2}}{\sqrt{(2n+1)!(2m+1)!}}\;|B_{2n+1}^M||B_{2m+1}^M|\cos[2(n-m)(\varphi - \zeta)]\Bigg\}, \tag{3.24}$$

where we have taken $\beta = |\beta|e^{i\varphi}$ and $\eta = |\eta|e^{i\zeta}$. The last two equations consist of two parts: the first part represents the diagonal terms, while the second part represents the off-diagonal terms of the density matrix $\rho$. From the numerical study of the odd binomial state we find that when $\eta$ has a small value ($\sim 0.1$) and $M = 5$, the Wigner function has a hole on the summit similar to that found for the geometric state to be discussed later. However as we increase the value of $\eta$ ($\sim 0.6$), keeping the value of the parameter $M$ fixed, we have four asymmetric peaks with a chaotic behavior, where the interference between the component states results in the selective preservation of the nonclassical effects during the amplification process. Increasing the value of the parameter $M$ ($\sim 17$) while keeping $\eta$ at the same value, we find that the four peaks are shifted and the chaotic behavior becomes pronounced. For the Q-function we find that when $\eta$ is small ($\sim 0.1$) and $M \sim 5$ the function almost represents the case of a Fock state. When we increase the value of $\eta$, the probability of having single photons also increases, and we have four adjacent deformed peaks at the center (see figures 5 and 6 of Ref. [27]).

4. The phased generalized binomial state.

Our purpose in the present section is to introduce a new class of intermediate states. The idea of introducing such a state is not only to widen the range of intermediate states under study, but also to generalize the so-called orthogonal even coherent state given by [28]

$$|\psi\rangle = B\left[|\alpha\rangle_+ + |i\alpha\rangle_+\right], \tag{4.1}$$

where $B$ is the normalization constant and $|\alpha\rangle_+$ is the usual even coherent state given by equation (2.2). The name of the orthogonal-even coherent state originates from the definition (4.1): in the complex $\alpha$-plane the vector representing $i\alpha$ is rotated by 90 degrees from $\alpha$, so that the even coherent state $|i\alpha\rangle_+$ is, in this geometric sense, orthogonal to the state $|\alpha\rangle_+$. The state we shall introduce is called the 'phased generalized binomial state' and is defined by [29]

$$|\chi\rangle = \Lambda^{1/2}\left[|\eta, M\rangle + e^{i\psi}|\eta e^{i\phi}, M\rangle\right], \tag{4.2}$$

where $\Lambda$ is the normalization constant, and $\psi$ and $\phi$ are two different phases that enable us to obtain different states. It is interesting to point out that the phased generalized binomial state can be regarded as a generalization of the states already considered in the present work. The state $|\chi\rangle$ also generalizes the results of [28], while the state $|\eta, M\rangle$ can be regarded as the usual binomial state or as the even or odd binomial state, depending upon the values of the phase parameters $\psi$ and $\phi$. The normalization constant for the state (4.2) is given by

$$\Lambda = \tfrac{1}{2}\left[1 + \mathrm{Re}\,e^{i\psi}\langle\eta, M|\eta e^{i\phi}, M\rangle\right]^{-1}, \tag{4.3}$$

while for the usual binomial state it takes the form

$$\Lambda_b = \tfrac{1}{2}\left[1 + \mathrm{Re}\,e^{i\psi}\left(1 + (e^{i\phi} - 1)|\eta|^2\right)^M\right]^{-1}. \tag{4.4}$$

In the limiting case $M \to \infty$ and $\eta \to 0$ such that $M|\eta|^2 = |\alpha|^2$, equation (4.4) tends to

$$\tilde{\Lambda}_b = \tfrac{1}{2}\left[1 + \cos(\psi + |\alpha|^2\sin\phi)\,e^{|\alpha|^2(\cos\phi - 1)}\right]^{-1}. \tag{4.5}$$
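The normalization of the phased state rests on the overlap of two binomial states whose parameters differ by a phase. The sketch below is written under the assumption that $\phi$ rotates $\eta$ inside the second ket and $\psi$ is the relative phase between the two kets; all function names are illustrative. It compares the direct overlap with the closed form $(1 + (e^{i\phi} - 1)|\eta|^2)^M$, checks that the resulting state has unit norm, and evaluates the large-$M$ limit.

```python
import cmath
import math


def binomial_state(eta, M):
    """Fock coefficients of the binomial state |eta, M>."""
    q = 1.0 - abs(eta) ** 2
    return [math.sqrt(math.comb(M, n)) * eta ** n * q ** ((M - n) / 2)
            for n in range(M + 1)]


def overlap_direct(eta, phi, M):
    """<eta, M | eta e^{i phi}, M> from the Fock coefficients."""
    u = binomial_state(eta, M)
    v = binomial_state(eta * cmath.exp(1j * phi), M)
    return sum(a.conjugate() * b for a, b in zip(u, v))


def overlap_closed(eta, phi, M):
    """Binomial theorem applied to the sum over n."""
    return (1 + (cmath.exp(1j * phi) - 1) * abs(eta) ** 2) ** M


def normalization(eta, phi, psi, M):
    """Lambda = (1/2) [1 + Re e^{i psi} <eta, M | eta e^{i phi}, M>]^(-1)."""
    return 0.5 / (1 + (cmath.exp(1j * psi) * overlap_closed(eta, phi, M)).real)


def phased_state(eta, phi, psi, M):
    """Coefficients of Lambda^(1/2) [ |eta, M> + e^{i psi} |eta e^{i phi}, M> ]."""
    lam = normalization(eta, phi, psi, M)
    u = binomial_state(eta, M)
    v = binomial_state(eta * cmath.exp(1j * phi), M)
    return [math.sqrt(lam) * (a + cmath.exp(1j * psi) * b) for a, b in zip(u, v)]


eta, phi, psi, M = 0.5 + 0.2j, 0.9, 0.4, 7
chi = phased_state(eta, phi, psi, M)
norm_chi = sum(abs(c) ** 2 for c in chi)   # close to 1
```

With $\phi = \pi$ and $\psi = 0$ the same function returns $\tfrac{1}{2}[1 + (1 - 2|\eta|^2)^M]^{-1}$, a quarter of the even binomial constant $|\lambda_1|^2$, consistent with the prefactor $\lambda_1/2$ of the even binomial state.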

It is clear from the definition of $|\chi\rangle$ that when $\psi = 0$ and $\phi = s\pi$, $s = 1, 3, 5, \ldots$, the state reduces to the even binomial state, and when $\phi = \psi = s\pi$ to the odd binomial state; see equation (3.8). In the forthcoming subsections we consider some statistical properties of the state $|\chi\rangle$ given by equation (4.2) in the case where $|\eta, M\rangle$ is the even (odd) binomial state, taking $\phi = \tfrac{\pi}{2}$; in this case we call the state a phased orthogonal even (odd) binomial state.

4.1 Phased orthogonal even binomial state.

In this subsection we consider the case $\phi = \tfrac{\pi}{2}$ in the state $|\chi\rangle$ of equation (4.2). Here the state $|\eta, M\rangle_e$ represents the usual even binomial state, and equation (4.2) reduces to

$$|\chi\rangle_e = \Lambda_e^{1/2}\left[|\eta, M\rangle_e + e^{i\psi}|i\eta, M\rangle_e\right]. \tag{4.6}$$

It is clear that

$${}_e\langle\eta, M|i\eta, M\rangle_e = |\lambda_1|^2\sum_{n=0}^{[M/2]}\binom{M}{2n}(i|\eta|^2)^{2n}(1 - |\eta|^2)^{M-2n} = |\lambda_1|^2\,\mathrm{Re}\left[1 + (i - 1)|\eta|^2\right]^M, \tag{4.7}$$

and then the normalization constant $\Lambda_e$ becomes

$$\Lambda_e = \tfrac{1}{2}\left\{1 + |\lambda_1|^2\cos\psi\;\mathrm{Re}\left[1 + (i - 1)|\eta|^2\right]^M\right\}^{-1}, \tag{4.8}$$

where $\lambda_1$ is the normalization constant of the even binomial state given by equation (3.8). As stated above, the even binomial state tends to the even coherent state when $M \to \infty$, $\eta \to 0$ such that $M|\eta|^2 = |\alpha|^2$. In this case equation (4.8) tends to

$$\tilde{\Lambda}_e = \tfrac{1}{2}\cosh|\alpha|^2\left[\cosh|\alpha|^2 + \cos\psi\cos|\alpha|^2\right]^{-1}. \tag{4.9}$$

For $\psi = 0$ equation (4.9) is exactly equation (5) of [28], and the state $|\chi\rangle_e$ reduces to the orthogonal-even coherent state.

4.1.1 Second and fourth order squeezing.

Now let us employ the Hong-Mandel definition [30] to study second and fourth order squeezing. To do so we have to calculate the expectation values of operators of different orders in the state $|\chi\rangle_e$. It is clear that the expectation values of both $a$ and $a^\dagger$ vanish. The following expectation values are easily found:

$$\langle a^2\rangle_e = -2i\Lambda_e|\lambda_1|^2\sum_{n=0}^{[M/2]-1}(B_{2n}^M)^*(B_{2n+2}^M)\sqrt{(2n+2)(2n+1)}\,\sin(\psi + n\pi), \tag{4.10}$$

$$\langle a^\dagger a\rangle_e = 2\Lambda_e|\lambda_1|^2\sum_{n=1}^{[M/2]}2n\,|B_{2n}^M|^2\left[1 + \cos(\psi + n\pi)\right], \tag{4.11}$$

$$\langle a^{\dagger 2}a^2\rangle_e = 2\Lambda_e|\lambda_1|^2\sum_{n=1}^{[M/2]}2n(2n-1)\,|B_{2n}^M|^2\left[1 + \cos(\psi + n\pi)\right]. \tag{4.12}$$

The expectation value of $a^4$ can also be given as

$$\langle a^4\rangle_e = 2\Lambda_e|\lambda_1|^2\frac{\eta^4}{(1 - |\eta|^2)^2}\sum_{n=0}^{[(M-4)/2]}|B_{2n}^M|^2\sqrt{(M-2n)(M-2n-1)(M-2n-2)(M-2n-3)}\,\left[1 + \cos(\psi + n\pi)\right], \tag{4.13}$$

and

$$\langle a^\dagger a^3\rangle_e = -2i\Lambda_e|\lambda_1|^2\frac{\eta^2}{1 - |\eta|^2}\sum_{n=1}^{[(M-2)/2]}2n\,|B_{2n}^M|^2\sqrt{(M-2n)(M-2n-1)}\,\sin(\psi + n\pi), \tag{4.14}$$

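where $\eta = |\eta|e^{i\theta}$. Now let us define two quadrature operators for the field, $X_1 = \sqrt{2}X$ and $X_2 = \sqrt{2}Y$, where $X$ and $Y$ are given by equation (3.11). These quadratures satisfy the commutation relation $[X_1, X_2] = i$ and play the same role as the position and momentum operators $q$ and $p$ respectively, where $[q, p] = i$.

For the second-order squeezing we may write the variances $\langle(\Delta X_1)^2\rangle$ and $\langle(\Delta X_2)^2\rangle$ in terms of $a$, $a^\dagger$ as follows:

$$\langle(\Delta X_{1,2})^2\rangle - \tfrac{1}{2} = \langle a^\dagger a\rangle_e \pm \mathrm{Re}\langle a^2\rangle_e. \tag{4.15}$$

When we differentiate the resulting expression with respect to $\theta$ and set the result equal to zero, we find a necessary condition for $\langle(\Delta X_1)^2\rangle$ (or $\langle(\Delta X_2)^2\rangle$) to be an extremum. That is (where $\sin\psi \neq 0$) $\cos 2\theta = 0$, so that the extreme values $\theta = \pi/4, 3\pi/4$ are independent of $|\eta|$. On the other hand the extreme values for $\psi$ are $\psi = 0, \pi$. The fourth order moments of $X_1$ and $X_2$ in the state $|\chi\rangle_e$ are given by

$$\langle(\Delta X_1)^4\rangle = \tfrac{3}{4}\left[1 + 2\langle a^{\dagger 2}a^2\rangle + \tfrac{2}{3}\mathrm{Re}\langle a^4\rangle + 4\langle a^\dagger a\rangle + 4\,\mathrm{Re}\langle a^2\rangle + \tfrac{8}{3}\mathrm{Re}\langle a^\dagger a^3\rangle\right], \tag{4.16}$$

$$\langle(\Delta X_2)^4\rangle = \tfrac{3}{4}\left[1 + 2\langle a^{\dagger 2}a^2\rangle + \tfrac{2}{3}\mathrm{Re}\langle a^4\rangle + 4\langle a^\dagger a\rangle - 4\,\mathrm{Re}\langle a^2\rangle - \tfrac{8}{3}\mathrm{Re}\langle a^\dagger a^3\rangle\right], \tag{4.17}$$

where the fourth order uncertainty relation becomes

$$\langle(\Delta X_1)^4\rangle\langle(\Delta X_2)^4\rangle \geq \tfrac{9}{16}. \tag{4.18}$$

To measure fourth order squeezing in the quadratures $X_1$ and $X_2$ from zero, we can rewrite equations (4.16) and (4.17) in the form

$$Q_1 = \tfrac{4}{3}\langle(\Delta X_1)^4\rangle - 1, \qquad Q_2 = \tfrac{4}{3}\langle(\Delta X_2)^4\rangle - 1. \tag{4.19a}$$

Thus we may conclude that $X_1$ and $X_2$ will exhibit fourth-order squeezing whenever

$$Q_1 < 0 \quad \text{or} \quad Q_2 < 0. \tag{4.19b}$$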
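The squeezing parameters can be evaluated directly from Fock coefficients. The sketch below does this for the coherent-state limit of the phased state, $|\alpha\rangle_+ + e^{i\psi}|i\alpha\rangle_+$, using a truncated basis. It assumes that all odd-order moments vanish (true for any superposition of even number states); the function names and the cutoff `n_max` are illustrative. The closed form used to cross-check the second-order variance, $\tfrac{1}{2} + |\alpha|^2[\sinh|\alpha|^2 - \sin|\alpha|^2\cos\psi + \cos|\alpha|^2\sin\psi\sin 2\theta]/[\cosh|\alpha|^2 + \cos\psi\cos|\alpha|^2]$, is obtained by summing the series for $\langle a^\dagger a\rangle$ and $\mathrm{Re}\langle a^2\rangle$.

```python
import cmath
import math


def orthogonal_even_coherent(alpha, psi, n_max=60):
    """Normalized, truncated Fock coefficients of |alpha>_+ + e^{i psi} |i alpha>_+."""
    c = [0j] * (n_max + 1)
    for n in range(0, n_max + 1, 2):
        c[n] = ((alpha ** n + cmath.exp(1j * psi) * (1j * alpha) ** n)
                / math.sqrt(math.factorial(n)))
    norm = math.sqrt(sum(abs(v) ** 2 for v in c))
    return [v / norm for v in c]


def moment(c, k, l):
    """<a†^k a^l> for a pure state, as the overlap of a^k|psi> with a^l|psi>."""
    def lowered(power):
        return [c[j + power] * math.sqrt(math.factorial(j + power) / math.factorial(j))
                for j in range(len(c) - power)]
    return sum(u.conjugate() * v for u, v in zip(lowered(k), lowered(l)))


def quadrature_moments(c):
    """Second and fourth moments of (X1, X2), assuming odd-order moments vanish."""
    n1 = moment(c, 1, 1).real     # <a† a>
    n2 = moment(c, 2, 2).real     # <a†^2 a^2>
    a2 = moment(c, 0, 2).real     # Re <a^2>
    a4 = moment(c, 0, 4).real     # Re <a^4>
    a13 = moment(c, 1, 3).real    # Re <a† a^3>
    var = (0.5 + n1 + a2, 0.5 + n1 - a2)
    fourth = tuple(0.75 * (1 + 4 * n1 + 2 * n2 + (2 / 3) * a4
                           + sgn * (4 * a2 + (8 / 3) * a13)) for sgn in (+1, -1))
    return var, fourth


def variance_closed_form(abs_alpha, theta, psi):
    """Closed form of <(Delta X1)^2> for alpha = |alpha| e^{i theta}."""
    x = abs_alpha ** 2
    den = math.cosh(x) + math.cos(psi) * math.cos(x)
    num = (math.sinh(x) - math.sin(x) * math.cos(psi)
           + math.cos(x) * math.sin(psi) * math.sin(2 * theta))
    return 0.5 + x * num / den


state = orthogonal_even_coherent(0.5 * cmath.exp(1j * math.pi / 4), 0.0)
var, fourth = quadrature_moments(state)
Q1, Q2 = (4 / 3) * fourth[0] - 1, (4 / 3) * fourth[1] - 1
```

For this choice ($|\alpha| = 0.5$, $\theta = \pi/4$, $\psi = 0$) both `Q1` and `Q2` are negative while both second-order variances exceed $1/2$, so the fourth-order squeezing is simultaneous in the two quadratures and is not accompanied by ordinary squeezing.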

It can be shown from the calculation that the extrema of the fourth-order squeezing coincide with those of the second-order squeezing for both $\theta$ and $\psi$. To discuss the second and fourth-order squeezing in the phased orthogonal-even coherent state, we have to take the limit of equations (4.15)-(4.17). After long and tedious calculations one finds the following expressions:

$$\langle(\Delta X_1)^2\rangle = \frac{1}{2} + \frac{|\alpha|^2\left[\sinh|\alpha|^2 - \sin|\alpha|^2\cos\psi\right]}{\cosh|\alpha|^2 + \cos\psi\cos|\alpha|^2} + \frac{|\alpha|^2\cos|\alpha|^2\sin\psi\sin 2\theta}{\cosh|\alpha|^2 + \cos\psi\cos|\alpha|^2}, \tag{4.20}$$

$$\langle(\Delta X_1)^4\rangle = \frac{3}{4}\Big\{1 + 4\left[\cosh|\alpha|^2 + \cos\psi\cos|\alpha|^2\right]^{-1}\Big[|\alpha|^4\left[C\cosh|\alpha|^2 - D\cos\psi\cos|\alpha|^2 - \tfrac{2}{3}\sin|\alpha|^2\sin 2\theta\sin\psi\right] + |\alpha|^2\left[\sinh|\alpha|^2 - \cos\psi\sin|\alpha|^2 + \cos|\alpha|^2\sin\psi\sin 2\theta\right]\Big]\Big\}, \tag{4.21}$$

with

$$C = \tfrac{1}{2}\left(1 + \tfrac{1}{3}\cos 4\theta\right) \quad \text{and} \quad D = \tfrac{1}{2}\left(1 - \tfrac{1}{3}\cos 4\theta\right), \tag{4.22}$$

and similar expressions for $\langle(\Delta X_2)^2\rangle$ and $\langle(\Delta X_2)^4\rangle$. From the expressions for $\langle(\Delta X_1)^2\rangle$ and $\langle(\Delta X_1)^4\rangle$ one can easily see that there is simultaneous fourth order squeezing in the quadrature components for both the phased orthogonal even binomial state and the phased orthogonal even coherent state (see figure 12 of Ref. [28]).

4.1.2 The distribution functions.

Here we use the quasiprobability phase-space distributions to examine the phased generalized binomial states given by equation (4.2). Using equation (2.12) we can obtain an explicit form of the quasiprobability distribution function for the different forms of phased orthogonal states. Thus for the density matrix $\rho$ generated by equation (4.2) we obtain the following expression for the quasiprobability function:

$$I(\beta, s) = \frac{2\Lambda}{\pi(1-s)}(1 - |\eta|^2)^M\exp\left(-\frac{2|\beta|^2}{1-s}\right)\sum_{m,n=0}^{M}\left[\binom{M}{n}\binom{M}{m}\frac{n!}{m!}\right]^{1/2}\left(\frac{s+1}{s-1}\right)^{n}\left(\frac{|\eta|^2}{1 - |\eta|^2}\right)^{(m+n)/2}\left(\frac{2|\beta|}{1-s}\right)^{m-n}P_nP_m\,L_n^{(m-n)}\!\left(\frac{4|\beta|^2}{1-s^2}\right), \tag{4.23}$$

where

$$P_nP_m = 4\cos\left[\left(\theta - \delta + \tfrac{\phi}{2}\right)(m - n)\right]\cos\left(\frac{\psi + n\phi}{2}\right)\cos\left(\frac{\psi + m\phi}{2}\right). \tag{4.23a}$$
The function L„ (y) in equation (4.23) is the associated Laguerre polynomial defined after (2.15). Note that in our calculations we have taken 77 = |?7|eie and /3 = |/9|et(5, where 9 and 5 are the phases of r\ and (5 respectively. The Wigner function of phased orthogonal even binomial state is [M/2] r

W(P) = ^ ( 1

exp(-2|/3| 2 )

7T

[M/2]

P| n L 2n (4|/3| 2 ) + 2 /

I 12

\

(iH^J

m

£

£

M In I VI

2n! (M\ (M 2m! \ 2 n / \ 2 m

)(i

(4.24)

+n

(4|/3l2)(m"n)-p2nP2m^lm-n)(4|/3|2

and

PlnPin

4cos[(2# — 25 — 4>)(m — n)] cos ( — + n<j) I cos ( — + m(f> (4.24a)

While the Q-function is

QW)

i(l-|,|Vexp(-|, (\P\2)

£j

\2n)\l-\T,pJ

[M/2]

(m+n)

(4.25)

|0|2(M+n) \/2n!2m!

-P271-P21

Finally we would like to mention that, as a result of the non-classical character of the present state, the P-representation function is highly singular. Therefore we shall not consider the P-representation. We now make use of the Wigner function (4.24) to get the probability distribution function P(x) by integrating W(β), with β = x + iy, over the imaginary variable y:

$$P(x) = \int_{-\infty}^{\infty} W(x + iy)\,dy. \tag{4.26}$$

Substituting equation (4.24) into equation (4.26), we get

P{x)=yJlAe(l- N 2 ) M exp(-2x 2 ) [M/2] m,n~Q

~[^](M\(

\2np2HUV2x)

\n?

BtjWUl-|f)J

M M J( )( )( M \ y \2n) \2m) \2(\ -\r,\')J 2

m+n

pl p, f2nl m

-'

2n

2n!

H2m(V2x)H2n(V2x)~ V2n\2m\ \ (4.27)

where $H_m(z)$ is the Hermite polynomial of order m:

$$H_m(z) = \sum_{r=0}^{[m/2]} \frac{(-1)^r\, m!\,(2z)^{m-2r}}{r!\,(m-2r)!}, \tag{4.27a}$$
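The explicit sum for the Hermite polynomial quoted in (4.27a) can be checked against the standard three-term recurrence $H_{k+1}(z) = 2zH_k(z) - 2kH_{k-1}(z)$. The following is only a minimal sketch; the function names are illustrative and are not taken from the paper.

```python
from math import factorial


def hermite_explicit(m, z):
    """H_m(z) from the finite sum over r = 0 .. floor(m/2), as in (4.27a)."""
    return sum(
        (-1) ** r * factorial(m) * (2 * z) ** (m - 2 * r)
        / (factorial(r) * factorial(m - 2 * r))
        for r in range(m // 2 + 1)
    )


def hermite_recurrence(m, z):
    """H_m(z) from H_{k+1} = 2 z H_k - 2 k H_{k-1}, with H_0 = 1 and H_1 = 2 z."""
    h_prev, h = 1.0, 2.0 * z
    if m == 0:
        return h_prev
    for k in range(1, m):
        h_prev, h = h, 2.0 * z * h - 2.0 * k * h_prev
    return h
```

Both routines return the physicists' Hermite polynomials, for example $H_2(z) = 4z^2 - 2$ and $H_3(z) = 8z^3 - 12z$.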

and PLPL

= 4cos[(20 + w )(m - n)] cos (t + n l \ cos ft + m l \

(4.27b)

The Wigner function W(β) has been investigated for different values of M, η and ψ. For small values of η (~ 0.1), we find that the vacuum state is the dominant contribution for ψ = 0° and 90°. For ψ = 180°, where the Fock state |2⟩ is the only existing state when M = 4, the figures for W(β) represent this state clearly whatever the value of η. For intermediate η (~ 0.5) we find that for ψ = 0° and M = 4 the Gaussian is deformed slightly, with the appearance of shoulders. This is due to the slight admixture of the state |4⟩ to the vacuum in this case, while for ψ = 90° and M = 4 the figure is asymmetric: the state |2⟩ deforms the Gaussian of the earlier case (see Fig. 2a of [29]). The nonclassical effect is apparent. Increasing M adds more states to be considered. The central peak is surrounded by wobbling circles, and changing ψ changes the symmetry (the case of M = 16 and ψ = 180° is presented in Fig. 2b of Ref. [29]). When η is increased (to ~ 0.9) the effect of the Fock states with higher excitations is pronounced, especially when M takes larger values. Increasing the value of M shows breaking up of the outer engulfing circles. Changes in ψ result in asymmetry in the figures, especially when we take ψ = f. Therefore the shape of the quasiprobabilities is very sensitive to the choice of the phase ψ.

Investigations of the formula (4.25) for the Q-function reveal that for small values of η we find the Gaussian form that characterizes the vacuum state, for all values of M and ψ = 0° or 90°. However, when we take ψ = 180° and M = 4, the state |2⟩ results, and the figure for the Q-function is representative of this case whatever the value of η (see for example Fig. 4d of [4], which represents the Fock state |5⟩). For intermediate values of η (~ 0.5), one finds a displacement towards the centre (α = 0), which is a sign of squeezing; this effect is pronounced for the case ψ = 90° and M (~ 4). However, when M is increased one finds a split into four summits; the choice of ψ makes the interference between the summits greater (ψ = 0°) or the split clearer (as when we take ψ = 90° or 180°).
As η increases (~ 0.9) one finds that the height of these summits depends on the choice of ψ (see Figs. 3 of Ref. [29]).

4.1.3 Phase properties. The notion of the phase in quantum optics has attracted renewed strong interest because of the existence of phase-dependent quantum noise. The new measurements [32] in this field opened the way to a deeper understanding of the quantum nature of the phase. There are many different approaches [33] to this problem. To calculate the phase properties for the even and odd phased orthogonal binomial states we adopt the Pegg-Barnett formalism [34]. In this approach, on the (s + 1)-dimensional subspace, one chooses as a basis the (s + 1) orthonormal phase states defined by

$$|\theta_m\rangle = \frac{1}{\sqrt{s+1}} \sum_{n=0}^{s} \exp(i n \theta_m)\,|n\rangle, \tag{4.28}$$

where

$$\theta_m = \theta_0 + \frac{2\pi m}{s+1}, \qquad m = 0, 1, 2, \ldots, s. \tag{4.28a}$$

The phase operator is then defined as

$$\hat\phi_\theta = \sum_{m=0}^{s} \theta_m\,|\theta_m\rangle\langle\theta_m|, \tag{4.29}$$
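As a numerical illustration of the Pegg-Barnett construction, the sketch below builds the (s + 1) phase states on a truncated Fock space, where they can be checked for orthonormality, and evaluates the discrete phase probabilities $|\langle\theta_m|\psi\rangle|^2$ for a state given by its number-state coefficients. The helper names and the list-based representation are assumptions made for this example only.

```python
import cmath
import math


def phase_state(m, s, theta0=0.0):
    """Return (theta_m, coefficients of |theta_m> in the basis |0>..|s>)."""
    theta_m = theta0 + 2.0 * math.pi * m / (s + 1)
    norm = 1.0 / math.sqrt(s + 1)
    return theta_m, [norm * cmath.exp(1j * n * theta_m) for n in range(s + 1)]


def inner(u, v):
    """Inner product <u|v> of two coefficient lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def phase_probabilities(coeffs, s, theta0=0.0):
    """P(theta_m) = |<theta_m|psi>|^2 for |psi> = sum_n coeffs[n] |n>.

    The coefficient list is zero-padded up to dimension s + 1
    (it must not be longer than s + 1).
    """
    psi = [complex(c) for c in coeffs] + [0j] * (s + 1 - len(coeffs))
    return [
        abs(inner(phase_state(m, s, theta0)[1], psi)) ** 2
        for m in range(s + 1)
    ]
```

For a number state the probabilities come out uniform, equal to 1/(s + 1), which reflects the complete absence of phase information in such a state.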

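which has the state $|\theta_m\rangle$ as its eigenstate with the eigenvalue $\theta_m$. The probability distribution for any state $|\psi\rangle$ is given by

$$P(\theta_m) = |\langle\theta_m|\psi\rangle|^2. \tag{4.30}$$

This can be used to compute various moments, after which the limit $s \to \infty$ is taken. The continuous-phase distribution function $P(\theta)$ is introduced by

$$P(\theta) = \lim_{s\to\infty} \frac{s+1}{2\pi}\,|\langle\theta_m|\psi\rangle|^2. \tag{4.31}$$

For any state of the form $|\psi\rangle = \sum_n c_n|n\rangle$, we find that $P(\theta)$ takes the form

$$P(\theta) = \frac{1}{2\pi}\left\{1 + 2\,\mathrm{Re}\sum_{n>m} c_n^{*}\,c_m \exp[i(n-m)\theta]\right\}. \tag{4.32}$$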

From this phase probability function the moments can be calculated. When the phase reference angle $\theta_0$ is put equal to zero, we find that $\langle\hat\phi_\theta\rangle = 0$

(o4 + 2E-
(4 33)

-

3 '-^ (m - nY

When the function P(θ) of equation (4.32) is plotted, we note the breaking up of the figure for the probability distribution function for the case ψ = 0. This is due only to the presence of the states |4n⟩ in the state, and hence we get a four-fold degeneracy. This is in line with the case of the even states, where the probability is split into only two (see Ref. [29]).

4.2 Phased orthogonal odd binomial state. We now introduce and investigate the phased orthogonal odd binomial state $|\chi\rangle_o$, which is obtained by replacing $|\eta, M\rangle$ by $|\eta, M\rangle_o$ in expression (4.2), taking into account φ = π/2 and ψ → ψ + π/2. Hence the state $|\chi\rangle_o$ is

$$|\chi\rangle_o = A_o^{1/2}\left[\,|\eta, M\rangle_o + e^{i(\psi+\pi/2)}\,|i\eta, M\rangle_o\right]. \tag{4.34}$$

The normalization constant A0 is given by 1 ^ = 2

2lM

l-|A2|2cosV[l + ( i - l ) M 2 ]

(4.34a)

where A2 is the normalization constant for odd binomial state defined in equation (3.19).

4.2.1 Second and fourth order squeezing. A similar discussion can be given as in the previous subsection. The extreme values for the present case are identical with those obtained for the phased orthogonal even binomial state; this is due to the fact that the phase ψ of equation (4.6) is replaced by ψ + π/2. Numerical investigations reveal that the optimal choice of θ and ψ does not produce fourth-order squeezing. This is in contrast to the phased orthogonal even binomial state, which has simultaneous fourth-order squeezing. For the phased orthogonal odd coherent state, the normalization constant $A_o$ of equation (4.34a) reduces to

$$A_o = \frac{1}{2}\,\sinh|\alpha|^2\left[\sinh|\alpha|^2 - \cos\psi\,\sin|\alpha|^2\right]^{-1}. \tag{4.35}$$

The second and the fourth order moment of squeezing are given respectively as: AXl

AXx

= --

[cosh I a | 2 — cos |a|2cosV')] sinh | a | 2 — cos ip sin | a | 2 | a | 2 sin | a | 2 sin ip sin 2d sinh | a | 2 — cosip sin | a | 2 2

1 + 4[sinh | a | 2 — cosip sin | a

(4.36a)

a | 4 [Csinh | a

•Dcosipsin | a | 2 - - c o s | a | 2 sm29siail)]+ \ a | 2 [cosh | a | 2

(4.36b)

• cos ip cos | a | 2 - sin | a | 2 sin ip sin 2^]]

and similar expressions for $\langle(\Delta X_2)^2\rangle$ and $\langle(\Delta X_2)^4\rangle$, where the constants C and D are given by equations (4.21a). Comparing equations (4.36) with Eqs. (4.20) and (4.21), we conclude that the behaviour with respect to ψ and θ is identical in both cases, provided that ψ + π/2 is used instead of ψ.

5 Generalized geometric state. As we have stated earlier, the geometric state represents the gradual behavior of some quantum optical systems in which the state of the field changes from the pure number (Fock) state to the non-pure chaotic state [11,35,37]. This means a field state that interpolates between the number state |n⟩ and the chaotic state with density given by equation (1.2).

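5.1 Definition. We define the normalized generalized two-parameter state $|Y, M\rangle$ as follows:

$$|Y, M\rangle = \lambda_0 \sum_{n=0}^{M} Y^{n/2}\,|n\rangle, \tag{5.1}$$

where Y is a complex parameter whose phase is random in general, and the normalization constant $\lambda_0$ is given by

$$|\lambda_0|^2 = \frac{1-|Y|}{1-|Y|^{M+1}}. \tag{5.1a}$$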

The limiting cases of the definition in equation (5.1) are:

(a) The chaotic state. For $|Y| \left(= \frac{\bar n}{1+\bar n}\right) < 1$ and $M \to \infty$, the density operator is

$$\rho_{Y,\infty} = \lim_{M\to\infty} |Y,M\rangle\langle Y,M| = \lim_{M\to\infty} |\lambda_0|^2 \sum_{n,n'=0}^{M} Y^{n/2}\,Y^{*\,n'/2}\,|n\rangle\langle n'|. \tag{5.2a}$$

If $Y = |Y|e^{2i\psi}$ and ψ is a random phase, then the average over ψ gives

$$\frac{1}{2\pi}\int_0^{2\pi} \rho_{Y,\infty}\,d\psi = (1-|Y|)\sum_{n=0}^{\infty} |Y|^{n}\,|n\rangle\langle n| = \sum_{n=0}^{\infty} \frac{\bar n^{\,n}}{(1+\bar n)^{1+n}}\,|n\rangle\langle n|. \tag{5.2b}$$

This is identical with the single-mode chaotic state with mean photon number $\bar n$, equ. (1.2).

(b) The number state. For $|Y| \to \infty$ and M finite, equation (5.1) reduces to the number state $|M\rangle$.

(c) The vacuum state $|0\rangle$. This is obtained either by taking the limit $|Y| \to 0$ or, equivalently, by taking M = 0.

(d) The phase state $|\theta_m\rangle$ (see equ. (4.28)) of Pegg and Barnett [34]. For $Y = e^{i\theta_m}$, $|Y| = 1$ and s = M. This is the partially coherent phase state, and when $s \to \infty$ the coherent phase state [33] results.

5.2 Properties. The mean value of the mth moment of the photon number operator in the generalized geometric state is given by

$$\langle \hat n^{m}\rangle = |\lambda_0|^2 \sum_{n=0}^{M} n^{m}\,|Y|^{n}. \tag{5.3}$$

In particular, for m = 1, 2 we have

$$\langle \hat n\rangle = |\lambda_0|^2 \sum_{n=0}^{M} n\,|Y|^{n} = |Y|\,(1-|Y|)^{-1}(1-|Y|^{M+1})^{-1}\left[1-(M+1)|Y|^{M}+M|Y|^{M+1}\right] \tag{5.3a}$$

and

$$\langle \hat n^{2}\rangle = |\lambda_0|^2 \sum_{n=0}^{M} n^{2}\,|Y|^{n} = \frac{|Y|(1+|Y|)-(M+1)^2|Y|^{M+1}+(2M^2+2M-1)|Y|^{M+2}-M^2|Y|^{M+3}}{(1-|Y|)^2\,(1-|Y|^{M+1})}. \tag{5.3b}$$

Note that from (5.3a) and (5.3b) we have, in the chaotic-state limit, $\langle\hat n\rangle \to \bar n$ and $\langle\hat n^2\rangle \to \bar n(2\bar n+1)$, and in the number-state $|M\rangle$ limit, $\langle\hat n\rangle \to M$ and $\langle\hat n^2\rangle \to M^2$. To calculate the normalized second-order correlation function one can use equation (3.9). In this case we have the expression

$$g^{(2)}(0) = (1-|Y|^{M+1})\left[1+M|Y|^{M+1}-(1+M)|Y|^{M}\right]^{-2}\left[2-M(M+1)|Y|^{M-1}+2(M^2-1)|Y|^{M}-M(M-1)|Y|^{M+1}\right], \tag{5.4}$$

which goes to 2 for the chaotic state and to $(1 - \frac{1}{M})$ for the number state $|M\rangle$. For the special case of M = 1, $g^{(2)}(0) = 0$, an expected result since the state $|Y, 1\rangle$ does not contain the photon number state $|2\rangle$ in its expansion. Figures (1a and b) of Ref. [11] show the behavior of $g^{(2)}(0)$ against $|Y| < 1$. For M = 2, $0.69 < g^{(2)}(0) < 1.94$. In the range $0 < |Y| < 0.36$, there is a partially coherent property, $g^{(2)}(0) > 1$. For $0.36 < |Y| < 0.9$ the antibunching effect ($g^{(2)}(0) < 1$) is clear, but it is weaker than that of the photon number state [ ]. For higher values of M (M = 10), the chaotic behavior is exhibited [$g^{(2)}(0) = 2$] for $|Y| < 0.3$. In fact, as $M \to 100$, $g^{(2)}(0) = 2$ for the whole range $0 < |Y| < 0.95$. The ratio of the variance of the photon number $\langle(\Delta n)^2\rangle$ to the mean photon number (the Fano factor) is defined as

$$F = \frac{\langle(\Delta n)^2\rangle}{\langle\hat n\rangle} = \frac{\langle\hat n^2\rangle}{\langle\hat n\rangle} - \langle\hat n\rangle. \tag{5.6}$$
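The closed forms (5.3a), (5.3b) and the quoted ranges of $g^{(2)}(0)$ can be cross-checked by direct summation over the photon-number distribution. The sketch below assumes photon-number weights proportional to $|Y|^n$ for n = 0, ..., M, as implied by the sums in (5.3a) and (5.3b); all function names are illustrative.

```python
def ggs_weights(y_abs, M):
    """Photon-number probabilities p_n proportional to |Y|^n, n = 0..M."""
    raw = [y_abs ** n for n in range(M + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def moments_direct(y_abs, M):
    """Return (<n>, <n^2>) by direct summation."""
    p = ggs_weights(y_abs, M)
    n1 = sum(n * pn for n, pn in enumerate(p))
    n2 = sum(n * n * pn for n, pn in enumerate(p))
    return n1, n2


def mean_closed(x, M):
    """Closed form (5.3a) with x = |Y| (x != 1)."""
    return x * (1 - (M + 1) * x ** M + M * x ** (M + 1)) / (
        (1 - x) * (1 - x ** (M + 1))
    )


def second_closed(x, M):
    """Closed form (5.3b) with x = |Y| (x != 1)."""
    num = (
        x * (1 + x)
        - (M + 1) ** 2 * x ** (M + 1)
        + (2 * M * M + 2 * M - 1) * x ** (M + 2)
        - M * M * x ** (M + 3)
    )
    return num / ((1 - x) ** 2 * (1 - x ** (M + 1)))


def g2(y_abs, M):
    """Normalized second-order correlation <n(n-1)> / <n>^2."""
    n1, n2 = moments_direct(y_abs, M)
    return (n2 - n1) / n1 ** 2


def fano(y_abs, M):
    """Fano factor (<n^2> - <n>^2) / <n>."""
    n1, n2 = moments_direct(y_abs, M)
    return (n2 - n1 * n1) / n1
```

For M = 2 the direct sum reduces to $g^{(2)}(0) = 2(1+|Y|+|Y|^2)/(1+2|Y|)^2$, which crosses unity near $|Y| \approx 0.36$ and is about 0.69 at $|Y| = 0.9$, in line with the ranges quoted in the text.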

(For a number state, F = 0.) In the special case of M = 1, $F = (1+|Y|)^{-1} < 1$ and the generalized geometric distribution shows sub-Poissonian behavior. For M = 2 the sub-Poissonian behavior is shown in the range $0.37 < |Y| < 0.9$. The cases of M = 10, 100 indicate the chaotic character of the state (F > 1) (Fig. 2 of Ref. [11]).

5.2.1 Squeezing. Now we shall examine the squeezing property of the generalized geometric state. To do so we need to calculate the variances of both quadratures, $X_1 = \sqrt{2}X$ and $X_2 = \sqrt{2}Y$, with X and Y as defined in Section 4. Thus we find that the variance $(\Delta X_1)^2$ takes the form

$$2(\Delta X_1)^2 = \langle a^2 + a^{\dagger 2}\rangle + 2\langle\hat n\rangle + 1 - \langle a + a^{\dagger}\rangle^2. \tag{5.7a}$$

Similarly for $X_2$ we have

$$2(\Delta X_2)^2 = -\langle a^2 + a^{\dagger 2}\rangle + 2\langle\hat n\rangle + 1 + \langle a - a^{\dagger}\rangle^2. \tag{5.7b}$$

On the other hand we find

$$\langle a^2\rangle = |\lambda_0|^2\,(Y^{*})^{-1}\sum_{n=2}^{M} |Y|^{n}\sqrt{n(n-1)} = \langle a^{\dagger 2}\rangle^{*}, \tag{5.8a}$$

where $Y = |Y|e^{i\phi}$ (φ = 2ψ). Similarly we can show that

$$\langle a\rangle = |\lambda_0|^2\,|Y|^{-1/2}e^{i\phi/2}\sum_{n=1}^{M} |Y|^{n}\sqrt{n} = \langle a^{\dagger}\rangle^{*}. \tag{5.8b}$$

The numerical results for the variance expressions

$$S_1 = 2(\Delta X_1)^2 - 1, \qquad S_2 = 2(\Delta X_2)^2 - 1, \tag{5.9}$$

where $S_{1,2} < 0$ signify squeezing, are presented in Figs. (3) of Ref. [11]. In the case M = 1 the component $S_1$ shows squeezing for φ = 0 up to $|Y| \sim 0.98$, but for φ = π/4 a lesser squeezing occurs over a shorter range of $|Y|$ (namely, $0 < |Y| < 0.7$). There is no squeezing at all for φ = π/2. Consistently, for M = 1 the component $S_2$ does not exhibit squeezing for the same values of φ. For M = 10, $S_2$ shows some squeezing for φ = 0, π/4. The same is true for the case M = 50, but the magnitude of squeezing is less. In both cases, M = 10 and M = 50, there is no squeezing in $S_1$, as expected. Note that for $|Y| > 1$, both $S_{1,2}$ are positive for all φ, hence there is no squeezing. This is consistent with the fact that as $|Y| \to \infty$ the generalized geometric state tends to a Fock (number) state, which does not exhibit squeezing in either quadrature.
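The squeezing parameters of (5.9) can be evaluated numerically from the number-state coefficients. The sketch below assumes coefficients proportional to $Y^{n/2}$ with $Y = |Y|e^{i\phi}$, the form implied by the sums in (5.8a) and (5.8b), and uses (5.7a), (5.7b) directly; the function names are hypothetical. For M = 1 these expressions reduce to $S_1 = 2|Y|(|Y| - \cos\phi)/(1+|Y|)^2$ and $S_2 = 2|Y|(|Y| + \cos\phi)/(1+|Y|)^2$, so that $S_1 < 0$ requires $\cos\phi > |Y|$.

```python
import cmath
import math


def ggs_coeffs(y_abs, phi, M):
    """Normalized coefficients c_n proportional to |Y|^(n/2) exp(i n phi / 2)."""
    raw = [y_abs ** (n / 2) * cmath.exp(1j * n * phi / 2) for n in range(M + 1)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in raw))
    return [c / norm for c in raw]


def squeezing(y_abs, phi, M):
    """Return (S1, S2) as defined in (5.9), using (5.7a) and (5.7b)."""
    c = ggs_coeffs(y_abs, phi, M)
    n_mean = sum(n * abs(c[n]) ** 2 for n in range(M + 1))
    # <a> and <a^2> from the off-diagonal number-state overlaps
    a1 = sum(c[n - 1].conjugate() * c[n] * math.sqrt(n) for n in range(1, M + 1))
    a2 = sum(
        c[n - 2].conjugate() * c[n] * math.sqrt(n * (n - 1))
        for n in range(2, M + 1)
    )
    s1 = 2 * n_mean + 2 * a2.real - (2 * a1.real) ** 2
    s2 = 2 * n_mean - 2 * a2.real - (2 * a1.imag) ** 2
    return s1, s2
```

With this convention the M = 1 thresholds quoted in the text are reproduced: squeezing in $S_1$ persists up to $|Y| \approx 1$ for φ = 0, up to $|Y| \approx 0.7$ for φ = π/4, and is absent for φ = π/2.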

5.3 Quasiprobability distribution function. The quasiprobability distributions are considered here. These functions can be found from the characteristic function, equation (2.10). If we take the parameter s = 1, corresponding to the normally ordered characteristic function, and the density matrix $\rho = |Y, M\rangle\langle Y, M|$, where $|Y, M\rangle$ is the state given by equation (5.1), then after some calculations we have

n ,n—0

where $L_n^{(n'-n)}(z)$ is the associated Laguerre polynomial given earlier after equ. (2.14). Although the P representation affords a convenient way of evaluating the ensemble averages of normally ordered operators, the function is highly singular as a result of the nonclassical character of the state (5.1). Therefore we shall consider the Wigner function, which can be calculated using equation (2.13). In this case we have

$$W(\beta) = \frac{2}{\pi}\,|\lambda_0|^2\,e^{-2|\beta|^2} \sum_{n',n=0}^{M} (-1)^n \sqrt{\frac{n!}{n'!}}\,(2\beta)^{(n'-n)}\,Y^{n/2}\,Y^{*\,n'/2}\,L_n^{(n'-n)}(4|\beta|^2), \tag{5.11}$$

where we used the generating function [36]

$$(1+k)^{\gamma}\exp[-kz] = \sum_{n=0}^{\infty} k^{n} L_n^{(\gamma-n)}(z). \tag{5.11a}$$

As a special case, if we average over the phase ψ then equation (5.11) reduces to

$$W(\beta) = \frac{2}{\pi}\,|\lambda_0|^2\,e^{-2|\beta|^2} \sum_{n=0}^{M} (-1)^n\,|Y|^{n}\,L_n(4|\beta|^2). \tag{5.12}$$

In the chaotic state, where $M \to \infty$ and $|Y| < 1$, we get

$$[W(\beta)]_{\rm ch} = \frac{1}{\pi}\left(\bar n + \tfrac{1}{2}\right)^{-1}\exp\left[-\left(\bar n + \tfrac{1}{2}\right)^{-1}|\beta|^2\right]. \tag{5.12a}$$
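The phase-averaged Wigner function (5.12) is a finite Laguerre series and is straightforward to evaluate. The sketch below assumes the form $W(\beta) = (2/\pi)|\lambda_0|^2 e^{-2|\beta|^2}\sum_{n=0}^{M}(-1)^n|Y|^n L_n(4|\beta|^2)$ with $|\lambda_0|^2 = (1-|Y|)/(1-|Y|^{M+1})$, and compares it with the chaotic-state Gaussian of (5.12a) using $|Y| = \bar n/(1+\bar n)$. Function names are illustrative only.

```python
import math


def laguerre(n_max, x):
    """L_0(x)..L_{n_max}(x) via (k+1) L_{k+1} = (2k+1-x) L_k - k L_{k-1}."""
    vals = [1.0]
    if n_max >= 1:
        vals.append(1.0 - x)
    for k in range(1, n_max):
        vals.append(((2 * k + 1 - x) * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals


def wigner_phase_averaged(beta_abs, y_abs, M):
    """Phase-averaged Wigner function of the generalized geometric state."""
    x = 4.0 * beta_abs ** 2
    lam2 = 1.0 / sum(y_abs ** n for n in range(M + 1))  # |lambda_0|^2
    lag = laguerre(M, x)
    series = sum((-1) ** n * y_abs ** n * lag[n] for n in range(M + 1))
    return (2.0 / math.pi) * lam2 * math.exp(-2.0 * beta_abs ** 2) * series


def wigner_chaotic(beta_abs, y_abs):
    """Gaussian Wigner function of the chaotic state, valid for |Y| < 1."""
    nbar = y_abs / (1.0 - y_abs)  # inverted from |Y| = nbar / (1 + nbar)
    return math.exp(-beta_abs ** 2 / (nbar + 0.5)) / (math.pi * (nbar + 0.5))
```

For $|Y| < 1$ and large M the two functions agree to numerical precision, while for M = 1 and $|Y| \gg 1$ the value at the origin becomes negative, as expected for a state approaching the Fock state $|1\rangle$.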

In Ref. [11], figures (11-14) show the Wigner function W(β) as a function of β = Re(β) + i Im(β) for different values of M and $|Y|$. From equation (5.12) it is clear that W(β) is a symmetric function in both Re(β) and Im(β). For $|Y|$ (~ 0.1), W(β) is insensitive to M and its Gaussian-like form has its peak at Re(β) = Im(β) = 0. As $|Y|$ increases, and for M = 1, 2, W(β) exhibits a hole at its center, and eventually W(β) behaves as a Gaussian similar to that of the chaotic state. In all cases W(β) is positive for $|Y| < 1$. For $|Y| > 1$ and for increasing M, the Laguerre polynomial term in equation (5.12) is more effective and hence W(β) becomes negative around its center. In fact, in the limit $|Y| \gg 1$ and for fixed M, equation (5.12) reduces to that for a number (Fock) state $|M\rangle$, namely

$$[W(\beta)]_{\rm Fock} = \frac{2}{\pi}\,(-1)^{M} L_M(4|\beta|^2)\exp[-2|\beta|^2], \tag{5.12b}$$

which tends to zero for $|\beta| \gg 1$. Finally we calculate the Q function, which can be used to express the ensemble averages of antinormally ordered operators as a simple integral. Carrying out the integration in equation (2.10) with s = -1 yields

$$Q(\beta) = \frac{1}{\pi}\,|\lambda_0|^2\,(-1)^{M} L_M^{(-M-1)}(|\beta|^2|Y|)\exp[-|\beta|^2]. \tag{5.13}$$

When M tends to infinity and $|Y| < 1$, we have the corresponding formula for the Q-function in the chaotic state,

$$[Q(\beta)]_{\rm ch} = \frac{1}{\pi}\,(\bar n+1)^{-1}\exp[-(\bar n+1)^{-1}|\beta|^2]. \tag{5.13a}$$

Now let us discuss the probability distribution function P(x) associated with the quadrature x for the generalized geometric state. To do so we use equation (5.11) to calculate the integral given by equation (4.26). With the aid of equation (5.11a) we obtain the following result:

$$P(x) = \left(\frac{2}{\pi}\right)^{1/2} |\lambda_0|^2\,e^{-2x^2} \sum_{n,n'=0}^{M} (n'!\,n!)^{-1/2}\,2^{-\frac{1}{2}(n+n')}\,Y^{n/2}\,Y^{*\,n'/2}\,H_{n'}(\sqrt{2}x)\,H_n(\sqrt{2}x). \tag{5.14}$$

The function $H_n(x)$ is the Hermite polynomial of equ. (4.27a). The behavior of P(x) is discussed in Ref. [35] for the averaged state [i.e., terms with n = n' in the summation in equation (5.14)]. For very small $|Y|$ (~ 0.1), P(x) is insensitive to M (as is the Wigner function W(β)) and P(x) is of Gaussian-like form. For increasing $|Y|$ (~ 0.8), the hole exhibited in W(β) for M = 1 is reflected in the non-monotonic decay of P(x). The emergence of a peak in W(β) at its center for M = 2 results in the monotonic decay of P(x). For $M \gg 1$, the Gaussian behavior of P(x) is reached, as expected for the chaotic state field. For $|Y| > 1$, the negative values of W(β) at its center, due to the Laguerre polynomials, result in a relatively reduced initial value P(x = 0). As M increases, P(x) has an oscillating behavior with a growing envelope for some range of x before it vanishes. For greater $|Y|$, the behavior of P(x) coincides with that for a Fock state $|M\rangle$, namely

$$P_{\rm Fock}(x) = \left(\frac{2}{\pi}\right)^{1/2} (2^{M} M!)^{-1}\,H_M^{2}(\sqrt{2}x)\exp(-2x^2). \tag{5.14a}$$

The non-averaged case (i.e., fixed phase ψ) is also discussed in Ref. [35]. For $|Y| = 0.8$ and ψ = 0, it is shown that P(x) has multiple peaks with increasing M. For ψ = | , | , P(x) has a similar behavior but with reduced peaks and suppressed oscillations for higher M (~ 10). For ψ = π the alternating behavior of the Wigner function for odd and even M values is clearly reflected in the pronounced increased initial peak for odd M = 1, 3 (note the spike in the Wigner function at its center). Smaller peaks in P(x) appear for increasing M. For larger $|Y|$ (~ 3) and for ψ = 0, P(x) has a behavior essentially similar to that of $|Y|$ (~ 0.8). For ψ = f the sharp dips in the Wigner function at its center for odd M result in a sharp drop of P(x) near x = 0, followed by an oscillatory (for higher M) decay. For ψ = π, P(x) resembles that of the Fock state, equation (5.14a). The results presented provide further insight into the systematic study of the gradual behavior from number to chaotic state.

The Wigner function for the generalized geometric state obtained from (2.13) is now given by the following expression, in which the expansion coefficients are taken proportional to $Y^{n}$, so that the photon-number weights are $|Y|^{2n}$:

$$W(\beta) = \frac{2}{\pi}\exp[-2|\beta|^2]\,\frac{1-|Y|^2}{1-|Y|^{2(M+1)}} \left\{\sum_{n=0}^{M} (-1)^n |Y|^{2n} L_n(4|\beta|^2) + 2\,\mathrm{Re}\sum_{m>n} (-1)^n \left(\frac{n!}{m!}\right)^{1/2} Y^{*m}\,Y^{n}\,(2\beta)^{m-n}\,L_n^{(m-n)}(4|\beta|^2)\right\}. \tag{5.15}$$

The Wigner function is particularly important as it gives the correct probability for a chosen observable by integration over the conjugate observable. It is a complete representation of the state of the system, since there is a one-to-one correspondence between the Wigner function and the wavefunction of the state. Thus, if we can map the Wigner function, then we have a complete representation of the state. Investigations of the Wigner quasiprobability function of equation (5.15) reveal that it has an almost Gaussian shape for small values of $|Y|$ (< 0.3) and is largely insensitive to any change in the value of M. This is due to the dominance of the vacuum state over the higher excitations. As $|Y|$ increases, however, distinct regions appear in which the Wigner function takes (non-classical) negative values. The structure changes dramatically once $|Y|$ exceeds unity. For these values the vacuum state no longer has the highest probability in the number state expansion, and one notices the appearance of crescent-like structures at radii characteristic of the photon number states. For some different values of M, the characteristic number state rings and the anisotropy characteristic of a preferred phase are reported in Ref. [37].

5.4 Phase properties of the generalized geometric states. The Hermitian optical phase operator is defined in a finite-dimensional state space by equation (4.29). The use of the phase operator involves evaluating expectation values and moments as functions of s before letting s tend to infinity. The importance of this limiting procedure has been discussed in detail; see for example [38]. Equation (4.31) can be used as a probability density for calculating the moments of the Hermitian optical phase operator. If we take the argument of Y to be zero, so that Y is real and positive, then one finds that the phase probability density is

$$P(\theta) = \frac{1}{2\pi}\,\frac{1-|Y|^2}{1-|Y|^{2(M+1)}}\,\frac{1 - 2Y^{M+1}\cos[(M+1)\theta] + Y^{2(M+1)}}{1 - 2Y\cos\theta + Y^2}. \tag{5.16}$$

We also choose the value of $\theta_0$ in equation (4.28) to be $-\pi$, so as to ensure the sensible result that the expectation value of the phase operator is the argument of Y. With these choices equation (4.29) gives

$$\langle\hat\phi_\theta\rangle = 0, \qquad \Delta\phi_\theta^2 = \frac{\pi^2}{3} + \frac{4}{1-Y^{2(M+1)}} \sum_{m=1}^{M} \frac{(-1)^m}{m^2}\,Y^{m}\left[1 - Y^{2(M+1-m)}\right]. \tag{5.17}$$

It is worth noting that, in the limit as M tends to infinity, this variance becomes

$$\Delta\phi_\theta^2(M \to \infty) = \frac{\pi^2}{3} + 4\,\mathrm{dilog}(1+Y), \tag{5.17a}$$

where dilog(.) is the dilogarithm function. The form of this variance is similar to that found for the variance of the phase sum associated with the two-mode squeezed vacuum state. This is a consequence of the similarity between the coefficients in the number state expansions of the coherent phase state and of the two-mode squeezed vacuum. Investigations of P(θ) of equation (5.16) for different values of M, for the values $0 < |Y| < 0.99$, $1.01 < |Y| < 20$ and $-\pi < \theta < \pi$, show that [37] P(θ) is almost constant (about $\frac{1}{2\pi}$) for $|Y| < 0.2$ whatever the value of M, which means that the vacuum is the dominant state in this case (compare with the discussion of the Wigner function for small values of $|Y|$). However, as $|Y|$ increases, a peak is produced around θ = 0. This peak sharpens as M increases, tending towards a delta function as $M \to \infty$, $Y \to 1$ (the phase state). As $|Y|$ increases and takes values greater than unity, this peak gradually disappears. Finally the distribution settles to the constant value (about $\frac{1}{2\pi}$) for Y > 10 (the state tends to the Fock state $|M\rangle$ as $|Y| \to \infty$, and hence phase information is lost). Discussion of the fluctuations $\langle\Delta\phi_\theta^2\rangle$ and $\langle\Delta n^2\rangle$ for different values of M shows that [37], as M increases, the fluctuations in n are increased near $|Y| = 1$. It is to be noted that either $\langle\Delta n^2\rangle$ or $\langle\Delta\phi_\theta^2\rangle$ goes to zero as $|Y| \to 0$, or as $|Y| \to 1$ with $M \to \infty$. This is expected, since $|Y| = 0$ is the vacuum state and $|Y| \to 1$ with $M \to \infty$ is essentially the phase state $|\theta_m\rangle$ and, as has been reported, the expectation value of the commutator $[\hat n, \hat\phi_\theta]$ vanishes for both these states (see [37] for details).
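The phase density and its variance can be checked numerically. The sketch below assumes a state with real positive Y and number-state coefficients proportional to $Y^n$ (n = 0, ..., M), the convention implied by the normalization factor $(1-|Y|^2)/(1-|Y|^{2(M+1)})$ in (5.16), and a phase window $[-\pi, \pi)$. It compares the closed-form density with the direct sum, and a series for the variance (obtained here from $\frac{1}{2\pi}\int_{-\pi}^{\pi}\theta^2\cos(m\theta)\,d\theta = 2(-1)^m/m^2$) with a simple midpoint-rule integration. All names are illustrative.

```python
import math


def phase_density(theta, Y, M):
    """Closed-form phase density for real positive Y != 1."""
    lam2 = (1 - Y ** 2) / (1 - Y ** (2 * (M + 1)))
    num = 1 - 2 * Y ** (M + 1) * math.cos((M + 1) * theta) + Y ** (2 * (M + 1))
    den = 1 - 2 * Y * math.cos(theta) + Y ** 2
    return lam2 * num / (2 * math.pi * den)


def phase_density_direct(theta, Y, M):
    """(1/2pi) |sum_n c_n exp(-i n theta)|^2 with c_n proportional to Y^n."""
    amp = sum(
        Y ** n * complex(math.cos(n * theta), -math.sin(n * theta))
        for n in range(M + 1)
    )
    norm = sum(Y ** (2 * n) for n in range(M + 1))
    return abs(amp) ** 2 / (2 * math.pi * norm)


def phase_variance_series(Y, M):
    """pi^2/3 + 4/(1 - Y^(2(M+1))) * sum_m (-1)^m Y^m [1 - Y^(2(M+1-m))] / m^2."""
    pref = 4.0 / (1 - Y ** (2 * (M + 1)))
    return math.pi ** 2 / 3 + pref * sum(
        (-1) ** m / m ** 2 * Y ** m * (1 - Y ** (2 * (M + 1 - m)))
        for m in range(1, M + 1)
    )


def phase_variance_numeric(Y, M, steps=20000):
    """Midpoint-rule estimate of the integral of theta^2 P(theta) over [-pi, pi)."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi + (k + 0.5) * h
        total += theta ** 2 * phase_density(theta, Y, M)
    return total * h
```

For very small Y the variance approaches $\pi^2/3$, the value for a uniform phase distribution, consistent with the vacuum being the dominant state in that regime.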

5.5 Production schemes. In this subsection we present two schemes for the production of such states. The first one depends on a generalized Jaynes-Cummings (JC) model, while the second relies on the SU(1,1) algebra. Generalizations of the JC model that include nonlinear interactions (in boson and spin variables) have been proposed. An interaction Hamiltonian for one of these generalizations, which describes multiphoton processes in finite-level atomic systems, is of the form

$$H_{\rm int} = \sum_{j=1}^{2r} g_j\left\{(aS_+)^{j} + (a^{\dagger}S_-)^{j}\right\}, \tag{5.18}$$

where $S_3$, $S_+$ and $S_-$ are the inversion, raising and lowering operators which describe atomic systems having (2S + 1) states. They satisfy the commutation relations $[S_3, S_\pm] = \pm S_\pm$, and $S_3$ has the eigenvalues m such that $S_3|m\rangle = m|m\rangle$ where $-S \le m \le S$. The field operators $a$ and $a^{\dagger}$ satisfy the usual commutation relation $[a, a^{\dagger}] = 1$. The coupling constants $g_j$ couple the atomic system to the field; finally $r \le S$. This Hamiltonian produces the JC model for r = 1/2, S = 1/2. When r = | and S = | it gives the model discussed by Senitzky [39]. Several values for S were investigated by Buck and Sukumar [40]. The case of general S and r = 1/2 is the well-known Dicke model and the Tavis-Cummings model [41] of cooperative two-level atoms. Taking r = 1 and S = 1, equation (5.18) describes a three-level atom in interaction with a single mode, in which transitions between neighboring levels are effected by single-photon processes while the transition between the upper and lower levels is effected through a two-photon process; this is a special case of a Hamiltonian considered for the three-level atom system. Thus we may say that the interaction model (5.18) represents a (2r + 1)-level atom interacting with one mode of the radiation field, where one-photon transitions occur between neighboring levels, two-photon transitions occur between levels i and i + 2 (i < 2r + 1), and so on up to 2r-photon transitions between the two extreme levels of the atom. The model (5.18) can be used to describe processes such as multiphonon transitions in a two-level atomic system, multiphoton lasers, and Raman and hyper-Raman processes. The operators $S_\pm$ when applied to the state vector $|m\rangle$, for r = S, give

S

r(5-m)!(5

+ m + j')!1i.

, .

+ | m > ^ { (5-J- J - ) !(5 + m)!}2|m +

J>

'

(5.19) ( 5 - m + j)!(5 + m ) ! n i . > = 2 m -\ 0• o(S - mm)\(S w e ,+„m —-^ Tj)> \ ~J>We assume that the system is evolving under the Hamiltonian (5.18) from the initial atomic coherent state aj. S m

r

s

Wo)>= £

r 25 ' \m > |0 > p h, m+S (1 + M 2 ) S

(5.20)

where $|0\rangle_{\rm ph}$ is the vacuum state for the field, while $\tau = \tan(\vartheta/2)e^{i\varphi}$. For a short time t (i.e., $g_j t \ll 1$), the wave function of the system becomes

\il,(t) > = |V(0) > - f t £ 2 <9j f { ( o S + ) ' + (at5_y}|V(0) > . 3= 1

(5.21)

J

'

By using the equations (5.20) and (5.19) we get the following expression 2r

n

JL.

Wt)> = W 0 ) > - * * 5 : ^ f l ^ E

rm+S

(1 +

2 91

I

| T |2)5 ( S _ m ) | Kg + ^) ! (g + m ~ 2j)!p \m-j).

(5.22) Suppose at time t the atom is measured to be in its ground state | — 5 >, then the state of the field is given by v—%

i—

TJ

T25'

|V/(t) > ^ Ao|0 > - i < 2 ] V j ! S i 7 ^ TT

\j >,

(5.23)

- + I "-

where $\lambda_0 = (\cos\vartheta/2)^{2S}$. By taking the coupling constants $g_j \propto (2S-j)!\,\sqrt{j!}\,Y^{j/2}$ and 2r = M [i.e., an (M + 1)-level atom], the field state is then the $|Y, M\rangle$ of equation (5.1).

To produce the generalized geometric states by using the Lie algebra approach, one notes that the coherent phase state bears a close relationship to the SU(1,1) coherent states, which are important in the theory of squeezing. The SU(1,1) coherent state $|\xi\rangle$ is defined in terms of the action of the operators $K_+$, $K_-$ and $K_3$ obeying the commutation relations

$$[K_+, K_-] = -2K_3, \qquad [K_3, K_\pm] = \pm K_\pm. \tag{5.24a}$$

The single-mode squeezed states result if we let $K_- = a^2/2$, with $K_+$ being its Hermitian conjugate, by the action of the unitary operator $\exp(\xi K_+ - \xi^{*}K_-)$ on the vacuum state $|0\rangle$. An alternative representation of the SU(1,1) algebra for operators acting on a single field mode is given by

$$\hat A = \hat a\,\hat n^{1/2}, \qquad \hat A^{\dagger} = \hat n^{1/2}\,\hat a^{\dagger}, \tag{5.24b}$$

representing $K_-$ and $K_+$ respectively, with the representation completed by the operator $\hat n + \frac{1}{2}$ as $K_3$. An SU(1,1) coherent state constructed using these operators has the form

$$|\xi\rangle = \exp(\xi\hat A^{\dagger} - \xi^{*}\hat A)\,|0\rangle, \tag{5.25}$$

which, on using the disentangling theorem

$$\exp(\xi\hat A^{\dagger} - \xi^{*}\hat A) = \exp\{[e^{i\eta}\tanh r]\hat A^{\dagger}\}\,\exp\{-2[\ln(\cosh r)](\hat n + \tfrac{1}{2})\}\,\exp\{-[e^{-i\eta}\tanh r]\hat A\}, \tag{5.25a}$$

with $\xi = r\exp(i\eta)$, gives

$$|\xi\rangle = (\cosh r)^{-1}\exp\{[e^{i\eta}\tanh r]\hat A^{\dagger}\}\,|0\rangle = (\cosh r)^{-1}\sum_{n=0}^{\infty}\left[e^{i\eta}\tanh r\right]^{n}|n\rangle. \tag{5.25b}$$
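As a quick consistency check of the number-state expansion (5.25b), the photon-number probabilities $p_n = \tanh^{2n} r/\cosh^2 r$ should sum to one, and the mean photon number should equal $\sinh^2 r$. The sketch below simply truncates the infinite sum at an assumed cutoff `n_max`; the function names are illustrative.

```python
import math


def coherent_phase_state_probs(r, n_max):
    """p_n = tanh(r)^(2n) / cosh(r)^2 for n = 0..n_max (truncated series)."""
    t2 = math.tanh(r) ** 2
    return [t2 ** n / math.cosh(r) ** 2 for n in range(n_max + 1)]


def mean_photon_number(r, n_max):
    """Truncated mean photon number sum_n n p_n."""
    return sum(n * p for n, p in enumerate(coherent_phase_state_probs(r, n_max)))
```

With $|Y| = \tanh r$ these weights are $(1-|Y|^2)|Y|^{2n}$, i.e. the $M \to \infty$ limit of a geometric distribution in $|Y|^2$.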

If we write $Y = \exp(i\eta)\tanh r$, then we see that this is simply the coherent phase state or the generalized geometric state $|Y, \infty\rangle$. In principle, this state could be produced by an interaction that involves an intensity-dependent coupling.

6 Even geometric states. Now we shall consider the properties of a normalized superposition of the two generalized geometric states $|Y, M\rangle$ and $|-Y, M\rangle$ in the form

\MY, M)) = \(\Y,M)

+ \-Y,M))=(1_

/

1 _ m* \ * [M/2) ^ J/a]+1) j £ r 2 " | 2 n > ,

|y

(6.1)

where [x] denotes the largest integer not exceeding x. We refer to these states as the even geometric states, by analogy with the terminology applied to superpositions of coherent states [42]. With this definition of the even generalized geometric state we can calculate some statistical properties of the field.

\ ' ' / _ 2|F| 4 {[M/2]|F| 4 ([ M / 2 )+i) - ([M/2] + i)|y|4([M/2]+i)}

(6.2a)

(i-|iT)(i-l*T a M / 2 1 + 1 ) ) and 1 _ iy|4

[M/2]

at2&2

<

>« = 1 _ |y| J[J/ 3 ] + i) E 2<2n - 1^4n-

The expectation value for the field operator a2" is given by

( 6 ' 2b )

353 W2)

i_|y|4

<^I<M = X_\Y^^T-

(

2m'

\*

4m

£ l^| ((2^)0

(«•*>

while the expectation values of odd powers vanish. It is to be expected that this state would give stronger squeezing than the generalized geometric state. For the state (6.1) we calculate the quasiprobability function and find that

^)(/?)=^exp(-|/?nT-i^]TTy [M/2]

m-n

2

% |

,

2

i £ irrw^i )+2 £ (m )^)\\y\2lm^)

(6 3)

-

cos[2(m-n)(7-0)]i^m-n)(4|/3|2)]

with $Y = |Y|\exp(i\gamma)$ and $\beta = |\beta|\exp(i\theta)$. The investigation of the Wigner function of equation (6.3) (see [37]) for small values of $|Y|$ (< 0.2) does not show any structure beyond the Gaussian shape, apart from a slight anisotropy similar to that of the squeezed states. The effect of the vacuum state is predominant, while the effect of the higher excitations is not pronounced in this case. As $|Y|$ increases, the effects of the higher photon numbers again lead to negative regions. For $|Y| = 2$ one finds a complicated structure for the Wigner function, but an enhanced probability for two phases corresponding to opposite directions in the β plane is discernible. We consider the phase properties of this state in a manner analogous to that of §5; consequently we find that

$$P(\theta) = \frac{1}{2\pi}\,\frac{1-|Y|^4}{1-|Y|^{4([M/2]+1)}}\,\frac{1 - 2|Y|^{2([M/2]+1)}\cos\{2([M/2]+1)\theta\} + |Y|^{4([M/2]+1)}}{1 - 2|Y|^2\cos 2\theta + |Y|^4}, \tag{6.4}$$

while the phase fluctuations in this case are given by

<(A&) ) - - + l _ |y|4([M/2]+1) 2^ ^

[\Y\

jyjSS— y

7T 2

-> — + di log(l -\Y\2)

as

M -> oo,

| y | < 1.

The phase properties of the even geometric state are considered in [37], where the function P(θ) of equation (6.4) is plotted for different values of M, in the ranges $0 < |Y| < 0.99$ and $1.01 < |Y| < 20$, for $-\pi < \theta < \pi$. Peaks appear at θ = 0 and also at θ = π and θ = -π. The phase probability distribution is, of course, 2π-periodic, so that in any 2π interval or window there are really two peaks. This is reminiscent of the phase distribution found for the squeezed vacuum states [43]. For both the even geometric states and the squeezed vacuum, the π periodicity of the phase probability distribution arises from the absence of odd photon numbers in the expansion of the state. The limits of both large and small $|Y|$ are again number states, and this is reflected in the asymptotic form of P(θ).

7 Conclusion. Superpositions of quantum mechanical states of the e.m. field have recently received much attention in quantum optics. Experimental realizations of nonclassical states of motion of a trapped ion, such as Fock states, coherent states, squeezed states and Schrödinger cat states, have recently been reported [44-47]. In these experiments an ion is laser-cooled in a Paul trap to the ground harmonic state. The atom is then put into various quantum states of motion by the application of optical and electric pulses of different durations. Thus the study of non-classical states of light is not a mere academic exercise but relates to the experimental realm. In this report we have studied some intermediate states, in particular the even and odd binomial states, which interpolate between the even and odd coherent states and the even and odd Fock states; the even and odd negative binomial states, which bridge between the even and odd coherent states and the even and odd pure thermal states, have also been discussed. The geometric state, which is a bridge between the pure thermal state and the Fock state, has been introduced, and the even and odd geometric states have been studied. For these states we have considered the non-classical properties, especially antibunching, sub-Poissonian statistics and squeezing (normal and higher-order) of the field quadratures, and the quasi-probability distribution functions have been calculated and plotted. The nonclassical signature shows in the Wigner function attaining negative values and in oscillations in the photon distribution.

References.
[1] P. A. M. Dirac, Principles of Quantum Mechanics, 4th ed. (Oxford University Press, Oxford, 1958); E. Schrödinger, Naturwissenschaften 14, 644 (1926).
[2] R. Glauber, Phys. Rev. 130, 2529 (1963); ibid. 131, 2766 (1963).
[3] J. Perina, Coherence of Light (Reidel, Dordrecht, 1985).
[4] D. Stoler, B. E. A. Saleh and M. C. Teich, Opt. Acta 32, 345 (1985); A. Vidiella-Barranco and J. A. Roversi, Phys. Rev. A 50, 5233 (1994).
[5] A. Joshi and S. V. Lawande, Opt. Commun. 70, 21 (1989); G. S. Agarwal, Phys. Rev. A 45, 1787 (1992).
[6] L. Susskind and J. Glogower, Physics 1, 49 (1964).
[7] M. S. Abdalla, M. H. Mahran and A.-S. F. Obada, J. Mod. Opt. 41, 1889 (1994).
[8] S. C. Jing and H. Y. Fan, Phys. Rev. A 49, 2277 (1994).
[9] A. Joshi and A.-S. F. Obada, J. Phys. A: Math. Gen. 30, 81 (1997).
[10] R. Simon and M. V. Satyanarayana, J. Mod. Opt. 35, 719 (1988).
[11] A.-S. F. Obada, S. S. Hassan, R. R. Puri and M. S. Abdalla, Phys. Rev. A 48, 3174 (1993).

355 [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44]

B.Baseia, A.F.de Lima, and G.C.Marques, Phys.Lett.A 204,1(1995). D.T.Pegg and S.M. Barnett, Europhys.Lett. 6,483(1988); J.Mod.Opt.36,7(1988). B.Baseia, A.F.de Lima, and A.J. da Silva, Mod.Phys.Lett. 9,1673(1995). B.Roy, Mod.Phys.Lett.B 12,23(1998). V.V.Dodonov, I.A.Malkin, and V.I.Man'ko, Physica 72, 597(1974). C.C.Gerry, J.Mod.Opt. 40, 1053(1993). V.Buzek, and P.L.Knight,Opt.Commun.l8, 331(19991). V.Buzek, A.Vidiella-Barranco, and P.L.Knight, Phys.Rev.A 45, 657(1992). V.Buzek.I.Jex, and T.Quang J.Mod.Opt.37,159(1990). E.P.Wigner, Phys.Rev.40,749(1932). G.S.Agrawal, and E.Wolf, Phys.Rev.D 2, 2161(1970). C.L.Mehta, and E.C.G Sudarshan, Phys.Rev.B138,274(1965). H.Moya-Cessa and P.L.Kinght, Phys..Rev. A 48, 2479(1993). G.Dattoli, J.Gallardo, and A. Torre, J.Opt.Soc.Am.B. 4,185(1987). F.A.A.El-Orany,M.H.Mahran,A.-S.F.Obada,and M.S.Abdalla,Inter.J. Theor.Phys. 35, 1393 (1998). A.-S.F.Obada, M.H.Mahran,F.A.A.El-Orany,and M.S.Abdalla, Inter.J. Theor.Phys.38, 1493(1999). R.Lynch Phys.Rev.A 49, 2800(1994). M.H.Mahran,M.S.Abdalla,A.-S.F.Obada,andF.A.A.El-Orany,Nonlinear. Optics 19, 189(1998). C.K.Hong.and L.Mandel, Phys.Rev.Lett., 54, 323(1985); Phys.Rev.A 32, 974(1985). G.S.Agarwal,and K.Tara, Phys.Rev.A 43,492(1991). J.W.Noh,A.Feguires,and L.Mandel,Phys.Rev.Lett.67,1920(1991); Phys.Rev.A 45,424(1992). See the special issue Physica Scripta T,48 (1993) ,and R.Lynch Phys.Rep.250, 367(1995). D.T.Pegg, and S.M.Barnett, Phys.Rev.A 39,1005(1989). H.A.Batarfi,M.S.Abdalla,A.-S.F.Obada,andS.S.Hassan,Phys.Rev.A51, 2644(1995). and H.A.Batarfi, M.S.Abdalla,and S.S.Hassan, Nonlinear Optics 16, 131(1996). B.Spain and M.G.Smith, functions of Mathematical Physics (Van Nostrand Reinhold,New York, 1970). A.-S.F.Obada, O.M.Yassin and S.M.Barnett. J. Mod. Optics 44149(1997). J.A.Vaccaro,and D.T.Pegg, Physica Scripta T, 48, 22(1993), S.M.Barnett and D.T.Pegg, J.Mod.Opt.39, 2121(1992). I.R.Senitzky, Phys.Rev.A 3, 421(1971). B.Buck and C.V.Sukumar, J. Phys.A 17, 877, 885(1984). 
R.H.Dicke, Phys.Rev.93, 99(1954);M.Tavis and F.W.Cummings 170, 379(1968);ibid 188, 692(1969). B.M.Garraway, and P.L.Knight, Physica Scripta T 48, 66(1993). J.A.Vaccaro, S.M. Barnett, and D.T. Pegg, J. Mod.Optics, 39,603(1992). D.Leibfried, D.M.Meekhof, B.E.King, C.Monroe, W.M. Itano and D.J.Wineland, Phys. Rev. Lett. 77, 4281 (1996).

356 [45] D.M.Meckhof, C.Monroe, B.E.King, W.M.Itano, and D.J.Wineland, Phys. Rev. Lett. 76, 1796 (1996). [46] C.Monroe, D.M.Meekhof, B.E.King, and D.J.Wineland, Science 272, 1131 (1996). [47] W.M.Itano, C.Monroe, D.M.Meckhof, D.Leibfried, B.E.King, and D.J.Wineland SPIE Proc. 2995, 43 (1997.

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 357-371)


ON THE RELATIVISTIC TWO-BODY EQUATION

S. R. Komy
Department of Mathematics, Faculty of Science, Helwan University, Egypt.

Abstract

Recently, an exact two-body covariant wave equation has been derived from the field theory of the coupled Maxwell-Dirac equations (Barut & Komy, 1985). It involves only one common invariant center-of-mass time $\tau$. It takes full account of the spin and recoil corrections of both particles. The equation, which also includes self-energy effects in a non-linear fashion, is a 16-component spinor equation for two Dirac particles. Working directly from this equation, energy eigenvalues and eigenfunctions are calculated to order $\alpha^2$ and $\alpha^4$, where $\alpha$ is the fine structure constant. The eigenvalues agree with those determined previously by perturbative techniques, including relativistic, recoil and spin corrections, for all the energy levels of orthopositronium. The eigenvalue problems that arise are of Sturm-Liouville type, but involve two or four coupled second-order differential equations in the radial variable $r$, with up to four singular points. In this approach approximate eigenvalues are determined directly, and the corresponding approximate eigenfunctions are obtained (for the first time) in simple closed form.

1 Introduction:

In nonrelativistic quantum theory we have a well-established basis for the dynamics of many particles in terms of the many-body wave function $\phi(\vec x_1, \vec x_2, \ldots, t)$ in configuration space with one time, and potentials $V_{ij}(\vec x_i - \vec x_j)$ which are functions of the relative coordinates. This leads to a

powerful and, in principle, nonperturbative way of describing bound states, resonances, and scattering states. Of particular interest is the trivial analysis that separates the center-of-mass and relative motion in the nonrelativistic two-body system, leaving a simplified problem in which the relative motion is described using the reduced mass. When one or both particles are to be treated relativistically, there is no similar procedure. In fact, if one starts with a two-body system and lets one of the particle masses become infinite, it requires a non-trivial analysis to demonstrate that the result can be expressed in terms of a relativistic equation for the other particle in a central potential.

In 1930 Dirac found the relativistically invariant equation of motion for an electron in an electromagnetic field. Together with Maxwell's equations, it forms the fundamental set of equations for photons and electrons [1]. One of the major successes of the Dirac theory was the explanation of the energy spectrum of the H-like atoms, including the fine structure. However, the theoretical treatment is far from complete. There remain considerable radiative corrections, as well as relativistic refinements. Moreover, the observed energy levels refer to real systems, so that small corrections are to be expected, arising from interactions between magnetic moments, nuclear structure, recoil effects, and various combinations of all of these. A proper understanding of fine and hyperfine structures requires a careful treatment of the actual two-body problem; in particular, purely electromagnetic systems such as positronium, muonium, ... cannot be studied without a well-developed two-particle theory. The earliest treatment of the relativistic two-body system was that of Breit, who constructed a theory of two electrons interacting with the electromagnetic field [2].
Although this theory does account for retardation effects, it corresponds to a single-particle theory and in turn does not yield the proper corrections to the fine and hyperfine structures. On the other hand, many practical calculations are made on the basis of a Hamiltonian which is a sum of Dirac Hamiltonians, and the main problem is how to choose the potentials. Usually they are assumed phenomenologically, or one-photon and one-boson exchange potentials are used [3]. The theory based on such postulated Hamiltonians is approximate.

In quantum electrodynamics, two-body systems such as the H-atom or positronium are treated perturbatively. Starting from a suitable wave equation (Schrödinger or Dirac equation with reduced mass, or a truncated Bethe-Salpeter equation), recoil and radiative corrections are introduced term by term from Feynman graphs. In this way, energy level corrections in positronium have been calculated up to the order of about $\alpha^5$, where $\alpha$ is the fine structure constant. It is very important to determine whether all these quantum electrodynamic effects can be obtained directly from an exact, relativistic, non-perturbative two-body equation.

Recently, an exact two-body covariant wave equation has been derived from the field theory of the coupled Maxwell-Dirac equations (Barut-Komy [4]). This equation involves only one common time, an important feature, because the use of different proper times for the individual particles would lead to unwieldy retardation effects. The equation, which also includes self-energy effects in a non-linear fashion, is a 16-component spinor equation for two Dirac particles. From the mathematical point of view, this two-body equation offers some interesting new aspects. It leads to problems of Sturm-Liouville type involving a set of coupled second-order ordinary differential equations in the radial variable. In Section 2, we derive the two-body equation. In Section 3, solutions for some special cases are presented.

2 Derivation of the two-body equation:

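Consider a number of (distinct) fermion fields $\psi_1(x), \psi_2(x), \ldots$ interacting with the electromagnetic field $A_\mu$. The action is:

$$S = \int dx \Big[ -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \sum_j \bar\psi_j (i\gamma^\mu\partial_\mu - m_j)\psi_j - \sum_j e_j\,\bar\psi_j\gamma^\mu\psi_j\,A_\mu - \sum_j a_j\,\bar\psi_j\sigma^{\mu\nu}\psi_j\,F_{\mu\nu} \Big]. \qquad (1)$$

The fermions have electric charges $e_j$, anomalous magnetic moments $a_j$, and spin matrices $\gamma_\mu^{(j)}$, $j = 1, 2, \ldots$, with

$$\sigma_{\mu\nu} = \tfrac{i}{2}(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu). \qquad (2)$$

Hence:

$$\sigma^{0l} = i\gamma^0\gamma^l = i\alpha_l, \qquad \sigma^{lm} = -i\alpha_l\alpha_m = \epsilon_{lmn}\sigma_n \;(l \neq m), \qquad \alpha_l^2 = 1, \qquad \sigma_n^2 = 1. \qquad (3)$$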
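The relations between the tensor $\sigma^{\mu\nu}$ of (2) and the matrices $\alpha_l = \gamma^0\gamma^l$ and $\sigma_n$ can be checked numerically. The following is a minimal sketch, assuming the standard Dirac representation of the $\gamma$ matrices (the text does not fix a representation) and reading (3) as $\sigma^{0l} = i\alpha_l$, $\sigma^{lm} = -i\alpha_l\alpha_m = \epsilon_{lmn}\sigma_n$; the function names are illustrative only.

```python
import numpy as np


def dirac_matrices():
    """Return (gamma, alpha, Sigma) in the standard Dirac representation (an assumption)."""
    I2 = np.eye(2, dtype=complex)
    Z2 = np.zeros((2, 2), dtype=complex)
    pauli = [
        np.array([[0, 1], [1, 0]], dtype=complex),
        np.array([[0, -1j], [1j, 0]], dtype=complex),
        np.array([[1, 0], [0, -1]], dtype=complex),
    ]
    g0 = np.block([[I2, Z2], [Z2, -I2]])
    gamma = [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
    alpha = [g0 @ gamma[l] for l in (1, 2, 3)]           # alpha_l = gamma^0 gamma^l
    Sigma = [np.block([[s, Z2], [Z2, s]]) for s in pauli]  # 4x4 spin matrices sigma_n
    return gamma, alpha, Sigma


def sigma_tensor(ga, gb):
    """sigma = (i/2) [ga, gb], as in equation (2)."""
    return 0.5j * (ga @ gb - gb @ ga)


def levi_civita(l, m, n):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (l - m) * (m - n) * (n - l) / 2


def check_sigma_relations():
    gamma, alpha, Sigma = dirac_matrices()
    I4 = np.eye(4)
    ok = True
    for l in range(3):
        # sigma^{0l} = i alpha_l, alpha_l^2 = 1, sigma_l^2 = 1
        ok &= np.allclose(sigma_tensor(gamma[0], gamma[l + 1]), 1j * alpha[l])
        ok &= np.allclose(alpha[l] @ alpha[l], I4)
        ok &= np.allclose(Sigma[l] @ Sigma[l], I4)
        for m in range(3):
            if l == m:
                continue
            s_lm = sigma_tensor(gamma[l + 1], gamma[m + 1])
            eps_sigma = sum(levi_civita(l, m, n) * Sigma[n] for n in range(3))
            # sigma^{lm} = -i alpha_l alpha_m = eps_{lmn} sigma_n
            ok &= np.allclose(s_lm, -1j * alpha[l] @ alpha[m])
            ok &= np.allclose(s_lm, eps_sigma)
    return bool(ok)
```

Calling `check_sigma_relations()` exercises every index combination; the identities themselves are representation independent, so the choice of representation only affects the explicit matrices.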

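The equations of motion obtained from (1) are:

$$F_{\nu\mu}{}^{,\nu} = j_\mu \qquad (4)$$

and

$$(i\gamma^\mu\partial_\mu - m_j)\psi_j - e_j\gamma^\mu A_\mu\psi_j - a_j\sigma^{\mu\nu}F_{\mu\nu}\psi_j = 0, \qquad j = 1, 2, \ldots \qquad (5)$$

Because we shall use the Green's function of the wave operator, there is a preferred covariant gauge,

$$A^{\mu}{}_{,\mu} = 0, \qquad (6)$$

and with that choice, equation (4) becomes

$$\Box A_\mu = j_\mu = \sum_j \big[ e_j\,\bar\psi_j\gamma_\mu\psi_j + 2a_j\,(\bar\psi_j\sigma_{\mu\nu}\psi_j)^{,\nu} \big]. \qquad (7)$$

The general solution of (7) is

$$A_\mu(x) = A^{\rm ext}_\mu(x) + \int dy\, D(x-y) \sum_j \big[ e_j\,\bar\psi_j(y)\gamma_\mu\psi_j(y) + 2a_j\,(\bar\psi_j(y)\sigma_{\mu\nu}\psi_j(y))^{,\nu} \big]. \qquad (8)$$

In the second term we perform an integration by parts, and assume that, for localized current distributions, the surface terms at infinity vanish. Then

$$A_\mu(x) = A^{\rm ext}_\mu(x) + \sum_j e_j \int dy\, D(x-y)\,\bar\psi_j(y)\gamma_\mu\psi_j(y) - 2\sum_j a_j \int dy\, \partial^\lambda D(x-y)\,\bar\psi_j(y)\sigma_{\lambda\mu}\psi_j(y). \qquad (9)$$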
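To eliminate $A_\mu$ completely from the action, we insert (9) into the action (1) and obtain:

$$\begin{aligned}
S = \int dx \Big\{ &\sum_j \bar\psi_j(x)(i\gamma^\mu\partial_\mu - m_j)\psi_j(x)
 - \frac{1}{2}\sum_{j,k} e_j e_k \int dy\, \bar\psi_j(x)\gamma^\mu\psi_j(x)\, D(x-y)\, \bar\psi_k(y)\gamma_\mu\psi_k(y) \\
&- \sum_{j,k} e_j a_k \int dy\, \bar\psi_j(x)\gamma^\mu\psi_j(x)\, \partial^\lambda D(x-y)\, \bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y) \\
&- \sum_{j,k} a_j e_k \int dy\, \bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\, \partial_\nu D(x-y)\, \bar\psi_k(y)\gamma_\mu\psi_k(y) \\
&- \sum_{j,k} 2a_j a_k \int dy\, \bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\, \partial_\nu\partial^\lambda D(x-y)\, \bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y) \\
&- \sum_j \big( e_j\,\bar\psi_j(x)\gamma^\mu\psi_j(x)\, A^{\rm ext}_\mu(x) + a_j\,\bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\, F^{\rm ext}_{\mu\nu}(x) \big) \Big\}. \qquad (10)
\end{aligned}$$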
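The diagonal terms $j = k$ correspond to self-energy, the interaction of the particle's current with itself. These terms have to be renormalized and treated separately. Here we are interested in the mutual interaction of two or more different particles, hence in the terms $j \neq k$. The interaction term with the external fields $A^{\rm ext}_\mu$ is of the form $-j^\mu A^{\rm ext}_\mu$. The sources of $A^{\rm ext}_\mu$ do not appear in the action, as usual. For the mutual interaction of two particles we use the retarded Green's function $D_{\rm ret}(x-y)$. Now for the $e^2$-interaction there are two terms in (10), with coefficients $e_1e_2$ and $e_2e_1$. In the second term we interchange $x$ and $y$ and use the identity

$$D_{\rm ret}(x-y) = D_{\rm adv}(y-x).$$

This is equivalent to writing the $e^2$-interaction part as

$$-\sum_{j<k} e_j e_k \int dx\,dy\; \bar\psi_j(x)\gamma^\mu\psi_j(x)\, \bar D(x-y)\, \bar\psi_k(y)\gamma_\mu\psi_k(y), \qquad (11)$$

where

$$\bar D = \tfrac{1}{2}(D_{\rm ret} + D_{\rm adv}).$$

Similarly, in the other interaction terms, we note that

$$\partial_\lambda D_{\rm ret}(x-y) = -\partial_\lambda D_{\rm adv}(y-x),$$

and we combine the various terms to obtain

$$\begin{aligned}
&-\sum_{j<k} 2e_j a_k \int dx\,dy\; \bar\psi_j(x)\gamma^\mu\psi_j(x)\, \partial^\lambda \bar D(x-y)\, \bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y) \\
&+\sum_{j<k} 2a_j e_k \int dx\,dy\; \bar\psi_j(x)\sigma^{\mu\lambda}\psi_j(x)\, \partial_\lambda \bar D(x-y)\, \bar\psi_k(y)\gamma_\mu\psi_k(y) \\
&-\sum_{j<k} 4a_j a_k \int dx\,dy\; \bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\, \partial_\nu\partial^\lambda \bar D(x-y)\, \bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y). \qquad (12)
\end{aligned}$$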

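In particular, for the two-body problem, which we shall mainly consider in the following, the spin algebra is a direct product of two Dirac algebras. We shall always write the spin matrices as, e.g., $\gamma^\mu \otimes \gamma_\mu$. The first term of the direct product will refer to particle 1, and the second term to particle 2. A feature of the $(ea)$ and $(aa)$ interactions in (12), not present in the $(ee)$ Coulomb terms, is the occurrence of the derivatives $\partial_\lambda D(x-y)$ and $\partial_\mu\partial_\lambda D(x-y)$. With $r_m = x_m - y_m$ and $r = |\vec x - \vec y|$ we have the relations

$$\begin{aligned}
D_{\rm ret}(x-y) &= \frac{1}{4\pi r}\,\delta(x^0 - y^0 - r), \qquad D_{\rm adv}(x-y) = \frac{1}{4\pi r}\,\delta(x^0 - y^0 + r), \\
4\pi\,\partial_m D_{\rm ret}(x-y) &= -\frac{r_m}{r^3}\,\delta(x^0 - y^0 - r) - \frac{r_m}{r^2}\,\delta'(x^0 - y^0 - r), \\
4\pi\,\partial_m\partial_n D_{\rm ret}(x-y) &= \Big(\frac{3r_mr_n}{r^5} - \frac{\delta_{mn}}{r^3}\Big)\delta(x^0 - y^0 - r) + \Big(\frac{3r_mr_n}{r^4} - \frac{\delta_{mn}}{r^2}\Big)\delta'(x^0 - y^0 - r) + \frac{r_mr_n}{r^3}\,\delta''(x^0 - y^0 - r). \qquad (13)
\end{aligned}$$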
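The structure of the spatial derivatives in (13) can be checked numerically by replacing the $\delta$-function with a smooth profile. The sketch below is an illustration under stated assumptions, not part of the original derivation: it uses a Gaussian $f(s) = e^{-s^2}$ in place of $\delta(s)$, treats $\partial_m$ as the ordinary derivative with respect to the spatial component $x_m$ of the separation, and compares finite differences of $f(t-r)/r$ with the closed-form expressions. All function names are hypothetical.

```python
import math


def f(s):
    """Smooth stand-in for the delta function (an assumption made for testing)."""
    return math.exp(-s * s)


def f1(s):
    """First derivative of f."""
    return -2.0 * s * f(s)


def f2(s):
    """Second derivative of f."""
    return (4.0 * s * s - 2.0) * f(s)


def radius(x):
    return math.sqrt(sum(c * c for c in x))


def four_pi_D(x, t):
    """4*pi*D_ret with delta replaced by f: f(t - r) / r."""
    r = radius(x)
    return f(t - r) / r


def first_derivative(x, t, m):
    """Closed form: -(r_m/r^3) f - (r_m/r^2) f'."""
    r = radius(x)
    s = t - r
    return -x[m] / r**3 * f(s) - x[m] / r**2 * f1(s)


def second_derivative(x, t, m, n):
    """Closed form: (3 r_m r_n/r^5 - d_mn/r^3) f + (3 r_m r_n/r^4 - d_mn/r^2) f' + (r_m r_n/r^3) f''."""
    r = radius(x)
    s = t - r
    d = 1.0 if m == n else 0.0
    rr = x[m] * x[n]
    return ((3 * rr / r**5 - d / r**3) * f(s)
            + (3 * rr / r**4 - d / r**2) * f1(s)
            + rr / r**3 * f2(s))


def shifted(x, m, h):
    y = list(x)
    y[m] += h
    return y


def numeric_first(x, t, m, h=1e-5):
    return (four_pi_D(shifted(x, m, h), t) - four_pi_D(shifted(x, m, -h), t)) / (2 * h)


def numeric_second(x, t, m, n, h=1e-3):
    pp = four_pi_D(shifted(shifted(x, m, h), n, h), t)
    pm = four_pi_D(shifted(shifted(x, m, h), n, -h), t)
    mp = four_pi_D(shifted(shifted(x, m, -h), n, h), t)
    mm = four_pi_D(shifted(shifted(x, m, -h), n, -h), t)
    return (pp - pm - mp + mm) / (4 * h * h)


def max_deviation(x, t):
    """Largest mismatch between the closed forms and finite differences."""
    worst = 0.0
    for m in range(3):
        worst = max(worst, abs(numeric_first(x, t, m) - first_derivative(x, t, m)))
        for n in range(3):
            worst = max(worst, abs(numeric_second(x, t, m, n) - second_derivative(x, t, m, n)))
    return worst
```

For a generic point such as `max_deviation([0.7, -0.4, 1.1], 1.0)` the mismatch stays at the level of the finite-difference error, which supports the coefficient pattern $\delta$, $\delta'$, $\delta''$ used in the text.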

Thus all the derivatives of the $D$-function reduce to $\delta$, $\delta'$, and $\delta''$ terms. Integrating these $\delta$ derivatives by parts with respect to $y_0$, we obtain the derivatives of the currents with respect to $y_0$ at retarded times. The variation of the action with respect to the individual fields $\psi_j$ leads to a set of coupled non-linear integro-differential equations of the Hartree type. If instead we define composite fields and vary the action with respect to these composite fields, we obtain linear equations. Define the bilocal field $\phi$, which is a 16-component composite spinor, by

$$\phi(x,y) = \psi_1(x) \otimes \psi_2(y) \qquad (14)$$

and consider, for simplicity, two particles. The interaction terms contain the composite field. The free part of the action is a sum of terms each containing one field only. We multiply the free-particle part of one particle with the (Dirac) normalization integral of the other particle, e.g.,

$$\int d\vec y\; \bar\psi_2(y)\gamma^0\psi_2(y) = 1,$$

at some arbitrary time $y_0$, so that the free action terms can be written as

$$\int dx\,d\vec y\; \bar\psi_1(x)(i\gamma^\mu\partial_\mu - m_1)\psi_1(x)\, \bar\psi_2(y)\gamma^0\psi_2(y) + \int dy\,d\vec x\; \bar\psi_2(y)(i\gamma^\mu\partial_\mu - m_2)\psi_2(y)\, \bar\psi_1(x)\gamma^0\psi_1(x). \qquad (15)$$
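The statement that the interaction terms can be written entirely in terms of the composite field can be illustrated at a single pair of points. The sketch below is an assumption-laden illustration: it uses the standard Dirac representation, random numerical spinors in place of the fields, and the metric signature $(+,-,-,-)$. It builds the 16-component object $\psi_1 \otimes \psi_2$ with a Kronecker product and checks that the product of currents $(\bar\psi_1\gamma^\mu\psi_1)(\bar\psi_2\gamma_\mu\psi_2)$ equals the single bilinear $\bar\phi\,(\gamma^\mu\otimes\gamma_\mu)\,\phi$ with $\bar\phi = \phi^\dagger(\gamma^0\otimes\gamma^0)$.

```python
import numpy as np

ETA = [1.0, -1.0, -1.0, -1.0]  # metric signature (+,-,-,-), an assumption


def gammas():
    """Gamma matrices in the standard Dirac representation (an assumption)."""
    I2 = np.eye(2, dtype=complex)
    Z2 = np.zeros((2, 2), dtype=complex)
    pauli = [
        np.array([[0, 1], [1, 0]], dtype=complex),
        np.array([[0, -1j], [1j, 0]], dtype=complex),
        np.array([[1, 0], [0, -1]], dtype=complex),
    ]
    g0 = np.block([[I2, Z2], [Z2, -I2]])
    return [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in pauli]


def composite(psi1, psi2):
    """16-component composite spinor psi1 (x) psi2."""
    return np.kron(psi1, psi2)


def product_of_currents(psi1, psi2):
    """(psi1bar gamma^mu psi1)(psi2bar gamma_mu psi2), computed particle by particle."""
    g = gammas()
    total = 0.0
    for mu in range(4):
        j1 = psi1.conj() @ g[0] @ g[mu] @ psi1
        j2 = psi2.conj() @ g[0] @ g[mu] @ psi2
        total += ETA[mu] * j1 * j2
    return total


def composite_bilinear(phi):
    """phibar (gamma^mu (x) gamma_mu) phi with phibar = phi^dagger (gamma^0 (x) gamma^0)."""
    g = gammas()
    G0 = np.kron(g[0], g[0])
    op = sum(ETA[mu] * np.kron(g[mu], g[mu]) for mu in range(4))
    return phi.conj() @ G0 @ op @ phi


def random_spinor(rng):
    return rng.normal(size=4) + 1j * rng.normal(size=4)
```

A short usage example: with `rng = np.random.default_rng(0)`, two spinors from `random_spinor(rng)` give a `composite` of shape `(16,)`, and `product_of_currents` agrees with `composite_bilinear` to rounding error, which is the mixed-product property of the Kronecker product.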

Clearly, we have one 4-dimensional and one 3-dimensional integral. However, the interaction part also has one 4-dimensional and one 3-dimensional integral, due to the $\delta$-functions in $D$. After inserting (13) into (12), we collect the $\delta$, $\delta'$, $\delta''$ terms separately; for the $\delta'$ and $\delta''$ terms an integration by parts is performed with respect to $x_0$, which yields derivatives of $\bar\phi$ and $\phi$. It is remarkable that all derivative terms can be combined into total derivative terms. Now we are in a position to vary the action with respect to $\bar\phi(x,y)$. This is done for each $\delta$-function term separately, retarded and advanced, after which the overall $\delta$-function is cancelled. The total time derivatives give surface terms at $x_0 \to \pm\infty$. For the localized bound-state problem we assume that these surface terms at infinite times are zero, so the total derivatives do not contribute to the equation of motion. The equation of motion for $\phi(x,y)$ is then:

$$\big[(i\gamma^\mu\partial_\mu - m_1)\otimes\gamma^0 + \gamma^0\otimes(i\gamma^\mu\partial_\mu - m_2) + V\big]\,\phi(x,y) = 0, \qquad (16)$$

where the potential $V$ is now given by equation (17).

[Equation (17) is illegible in the source scan. The surviving fragments indicate a term proportional to $e_1e_2/(4\pi r)$, mixed charge-moment terms, and a term proportional to $a_1a_2$.] (17)
In equation (16), the spin matrices of the two particles are written as tensor products, e.g. $\gamma^\mu \otimes \gamma_\mu$, with the first factor always referring to particle 1 and the second to particle 2. Now we introduce relative and center-of-mass coordinates:

$$r_\mu = x_\mu - y_\mu, \qquad R_\mu = a\,x_\mu + (1-a)\,y_\mu,$$

or

$$x_\mu = R_\mu + (1-a)\,r_\mu, \qquad y_\mu = R_\mu - a\,r_\mu, \qquad (18)$$

and hence

$$p_1^\mu = a\,P^\mu + p^\mu, \qquad p_2^\mu = (1-a)\,P^\mu - p^\mu. \qquad (19)$$
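The momentum split (19) is just the chain rule for the change of variables (18). A minimal one-component sketch follows, assuming an arbitrary smooth test function (the function `f` below is purely illustrative) and using central finite differences: it checks $\partial/\partial x = a\,\partial/\partial R + \partial/\partial r$ and $\partial/\partial y = (1-a)\,\partial/\partial R - \partial/\partial r$.

```python
import math


def to_cm(x, y, a):
    """Center-of-mass and relative coordinates: R = a x + (1 - a) y, r = x - y."""
    return a * x + (1 - a) * y, x - y


def from_cm(R, r, a):
    """Inverse transformation (18): x = R + (1 - a) r, y = R - a r."""
    return R + (1 - a) * r, R - a * r


def f(x, y):
    """Arbitrary smooth test function (illustrative only)."""
    return math.sin(1.3 * x) * math.exp(0.4 * y) + x * y * y


def derivative(func, u, h=1e-5):
    """Central finite difference."""
    return (func(u + h) - func(u - h)) / (2 * h)


def check_momentum_split(x, y, a):
    """Return the two chain-rule mismatches for the split (19)."""
    R, r = to_cm(x, y, a)

    def g(R_, r_):
        return f(*from_cm(R_, r_, a))

    df_dx = derivative(lambda t: f(t, y), x)
    df_dy = derivative(lambda t: f(x, t), y)
    dg_dR = derivative(lambda t: g(t, r), R)
    dg_dr = derivative(lambda t: g(R, t), r)
    err1 = abs(df_dx - (a * dg_dR + dg_dr))        # p_1 = a P + p
    err2 = abs(df_dy - ((1 - a) * dg_dR - dg_dr))  # p_2 = (1 - a) P - p
    return err1, err2
```

The check holds for any value of the parameter $a$, which reflects the freedom in choosing the center-of-mass variable noted in the text.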

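The action can be written in terms of the new coordinates as

$$S = \int dR\,dr\; \bar\phi(R,r)\big[(\gamma^\mu(aP_\mu + p_\mu) - m_1)\otimes\gamma^0 + \gamma^0\otimes(\gamma^\mu((1-a)P_\mu - p_\mu) - m_2) + V(r)\big]\phi(R,r),$$

and the equation of motion (16) becomes:

$$\big[(a\,\gamma^\mu\otimes\gamma^0 + (1-a)\,\gamma^0\otimes\gamma^\mu)P_\mu + (\gamma^\mu\otimes\gamma^0 - \gamma^0\otimes\gamma^\mu)p_\mu - (I\otimes\gamma^0 m_1 + \gamma^0\otimes I\,m_2) + V(r)\big]\phi = 0. \qquad (20)$$

We note that the coefficient of $p_0 = i\,\partial/\partial r_0$ cancels, so that no operator acts on the $r_0 = t$ dependence of $\phi$; that is, the $r_0$-dependence does not change. Hence we can consider $\phi$ to be a function of $R_\mu$ and $\vec r$ only, $\phi = \phi(R_\mu, \vec r)$, or in momentum space $\phi = \phi(P_\mu, \vec p)$. Equation (20) is in the form of a linear wave equation

$$[\Gamma^\mu P_\mu + K]\phi = 0 \qquad (21)$$

with

$$\Gamma_\mu = a\,\gamma_\mu\otimes\gamma_0 + (1-a)\,\gamma_0\otimes\gamma_\mu, \qquad (22)$$

$$K = -(\vec\gamma\otimes\gamma^0 - \gamma^0\otimes\vec\gamma)\cdot\vec p - (I\otimes\gamma^0 m_1 + \gamma^0\otimes I\,m_2) + V. \qquad (23)$$

Since $\Gamma^0 = \gamma_0\otimes\gamma_0$, we multiply (21) with $\gamma_0\otimes\gamma_0$ and obtain

$$\big(P_0 - \gamma_0\otimes\gamma_0\,\vec\Gamma\cdot\vec P + \gamma_0\otimes\gamma_0\,K\big)\phi = 0.$$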

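Hence the Hamiltonian of the system can be written as

$$H = P_0 = (a\,\vec\alpha\otimes I + (1-a)\,I\otimes\vec\alpha)\cdot\vec P + (\vec\alpha\otimes I - I\otimes\vec\alpha)\cdot\vec p + (\beta\otimes I\,m_1 + I\otimes\beta\,m_2) - \gamma^0\otimes\gamma^0\,V. \qquad (24)$$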
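The passage from the wave equation (21) to the Hamiltonian form (24) rests on a few Kronecker-product identities: multiplying by $\gamma^0\otimes\gamma^0$ turns $\gamma^l\otimes\gamma^0$ into $\alpha_l\otimes I$, $\gamma^0\otimes\gamma^l$ into $I\otimes\alpha_l$, and $I\otimes\gamma^0$ into $\beta\otimes I$. The sketch below verifies these identities numerically, assuming the standard Dirac representation with $\beta = \gamma^0$ and $\alpha_l = \gamma^0\gamma^l$; the function name and the sample value of $a$ are illustrative.

```python
import numpy as np


def gammas():
    """Gamma matrices in the standard Dirac representation (an assumption)."""
    I2 = np.eye(2, dtype=complex)
    Z2 = np.zeros((2, 2), dtype=complex)
    pauli = [
        np.array([[0, 1], [1, 0]], dtype=complex),
        np.array([[0, -1j], [1j, 0]], dtype=complex),
        np.array([[1, 0], [0, -1]], dtype=complex),
    ]
    g0 = np.block([[I2, Z2], [Z2, -I2]])
    return [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in pauli]


def check_hamiltonian_matrices(a=0.3):
    g = gammas()
    g0 = g[0]
    I4 = np.eye(4, dtype=complex)
    G0 = np.kron(g0, g0)  # Gamma^0 = gamma^0 (x) gamma^0
    ok = np.allclose(G0 @ G0, np.eye(16))
    for l in (1, 2, 3):
        alpha_l = g0 @ g[l]
        # center-of-mass part: G0 * Gamma^l = a alpha (x) I + (1 - a) I (x) alpha
        Gamma_l = a * np.kron(g[l], g0) + (1 - a) * np.kron(g0, g[l])
        ok &= np.allclose(G0 @ Gamma_l,
                          a * np.kron(alpha_l, I4) + (1 - a) * np.kron(I4, alpha_l))
        # relative part: G0 * (gamma^l (x) gamma^0 - gamma^0 (x) gamma^l)
        rel = np.kron(g[l], g0) - np.kron(g0, g[l])
        ok &= np.allclose(G0 @ rel, np.kron(alpha_l, I4) - np.kron(I4, alpha_l))
    # mass terms: G0 * (I (x) gamma^0) = beta (x) I, G0 * (gamma^0 (x) I) = I (x) beta
    ok &= np.allclose(G0 @ np.kron(I4, g0), np.kron(g0, I4))
    ok &= np.allclose(G0 @ np.kron(g0, I4), np.kron(I4, g0))
    return bool(ok)
```

Since $K$ enters the multiplied equation with a plus sign, the potential appears in $H = P_0$ as $-\gamma^0\otimes\gamma^0 V$ when (21) to (23) are read as written above.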

The first term represents the Hamiltonian of the center of mass; the second and fourth terms constitute the relative Hamiltonian. The third term can be put either into the center-of-mass part or into the relative part, or may be divided between them. In order to put the wave equation (20) in a covariant form, we introduce an arbitrary time-like four-vector $n$ and

$$r_\perp = \big[((x_1 - x_2)\cdot n)^2 - (x_1 - x_2)^2\big]^{1/2}.$$

Hence we write equation (20) in the form

$$\big[(\gamma^\mu P_{1\mu} - m_1)\otimes\gamma\cdot n + \gamma\cdot n\otimes(\gamma^\mu P_{2\mu} - m_2) + V(r_\perp)\big]\phi = 0. \qquad (25)$$

If we take $n = (1,0,0,0)$, then $r_\perp$ reduces to the usual radial coordinate $r$, and $\gamma\cdot n = \gamma^0$. Thus equation (25) reduces to equation (20). We have shown, after the elimination of the electromagnetic field in the interaction of two charged particles $\psi_1, \psi_2$, that the resultant action can be written entirely in terms of a composite field $\phi(x,y)$ and total time derivatives of $\phi$, if we use the retarded Green's function. Here $x$ and $y$ have light-like separation; hence $\phi$ has one time coordinate only. The variation of the action with respect to $\bar\phi$ then yields a linear equation for $\phi$ which is of the type of an infinite-component wave equation. The center-of-mass and relative coordinates can be separated. The $16 \times 16$ spinor potentials are given for both the Dirac and the Pauli couplings to the electromagnetic field.

3 Solutions:

Our task now is to search for stationary states of equation (20). After the angular dependence is separated out, this leads to sixteen coupled equations, among the 16 components of the field $\phi$, in the variable $r$. Considering only the minimal coupling, these sixteen equations, for a given value $j$ of the total angular momentum, form two decoupled sets, each containing eight coupled linear equations. In each set four of the equations are first-order ordinary differential equations in the radial variable $r$, and four equations are purely algebraic. The first set is given by equation (26).

[The eight equations of the first set (26) are illegible in the source scan. The surviving fragments show that they couple the radial functions $u_1, v_1, u_{00}, v_{00}, Z_0, y_0, Z_2, y_2$ through the energy $E$, the masses $M$ and $\Delta M$, the Coulomb term, and the operators $d/dr$ and $1/r$.] (26)

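The second set (which we shall not write) is obtained from the first set, and vice versa, by the following symmetry substitutions:

$$M \leftrightarrow \Delta M, \quad y_0 \leftrightarrow v_0, \quad Z_0 \leftrightarrow u_0, \quad u_{00} \leftrightarrow -Z_{00}, \quad v_{00} \leftrightarrow -y_{00}, \quad u_1 \leftrightarrow -Z_1, \quad v_1 \leftrightarrow -y_1, \quad Z_2 \leftrightarrow u_2, \quad y_2 \leftrightarrow v_2. \qquad (27)$$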

Clearly the set (26) contains four algebraic equations, which can be used to express four of the unknown functions in terms of the remaining four in that set, leaving four ordinary linear differential equations to be solved. However, the resulting set of four equations does not possess simple series solutions. In fact, by eliminating (e.g.) $u_{00}, Z_0, v_1, y_2$ and substituting

$$u_1 = \sum_{n=0} a_n r^n, \quad \ldots, \quad v_{00} = \sum_{n=0} b_n r^n \qquad (28)$$

in the resulting four differential equations, we obtain 3-term and 4-term recurrence relations which are difficult to solve for $a_n, \ldots, b_n$. One may try to eliminate more component functions by forming second-order differential equations out of the set (26). However, such second-order differential equations are more singular, and simple series solutions of the form (28) do not exist. Functional series solutions are suggested instead, and this seems to work. The basic idea is to replace $\{r^n\}$ in (28) by an appropriate complete set of functions $\{f_n\}$, and hope that the series terminates. As an example, in the following we shall consider the case of two free Dirac

particles.

Two free Dirac particles:

The free case is obtained from the first set (26) by putting $e_1 = e_2 = 0$. Again eliminating the components $u_{00}, Z_0, v_1$ and $y_2$, we obtain four second-order ordinary linear differential equations. The first two are

$$\frac{d^2u_1}{d\rho^2} + \Big(1 - \frac{J^2}{\rho^2}\Big)u_1 = 0, \qquad \frac{d^2v_{00}}{d\rho^2} + \Big(1 - \frac{J^2}{\rho^2}\Big)v_{00} = 0.$$

[The remaining two coupled equations, (29) and (30), for $Z_2$ and $y_0$ are illegible in the source scan.] (29) (30)

Here

$$\rho = kr, \qquad k^2 = \frac{(E^2 - \Delta M^2)(E^2 - M^2)}{4E^2}, \qquad J^2 = j(j+1); \qquad (31)$$

the definitions of the remaining constants appearing in (29) and (30) are likewise illegible. The first two equations have the well-known regular (at $\rho = 0$) solutions $u_1 = \rho\, j_j(\rho)$ and $v_{00} = \rho\, j_j(\rho)$, where $j_n(\rho)$ is the spherical Bessel function. The form of the other two coupled equations suggests the solution:

$$Z_2 = \sum_{n=0} A_n\, \rho\, j_{n+s}(\rho), \qquad y_0 = \sum_{n=0} B_n\, \rho\, j_{n+s}(\rho). \qquad (32)$$
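The claim that $\rho\, j_j(\rho)$ is the regular solution of the two decoupled free equations can be checked numerically. The sketch below reads those equations as $u'' + (1 - j(j+1)/\rho^2)\,u = 0$, generates the spherical Bessel functions by the standard upward recurrence $j_{k+1}(x) = \frac{2k+1}{x} j_k(x) - j_{k-1}(x)$ (adequate for the small orders and moderate arguments used here), and evaluates the residual of the differential equation with a central second difference. The function names are illustrative.

```python
import math


def spherical_jn(n, x):
    """Spherical Bessel function j_n(x) by upward recurrence (fine for x > n, small n)."""
    j_prev = math.sin(x) / x
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for k in range(1, n):
        j_prev, j_curr = j_curr, (2 * k + 1) / x * j_curr - j_prev
    return j_curr


def u(j, rho):
    """Candidate regular solution rho * j_j(rho)."""
    return rho * spherical_jn(j, rho)


def residual(j, rho, h=1e-3):
    """Residual of u'' + (1 - j(j+1)/rho^2) u = 0 using a central second difference."""
    d2 = (u(j, rho + h) - 2.0 * u(j, rho) + u(j, rho - h)) / h**2
    return d2 + (1.0 - j * (j + 1) / rho**2) * u(j, rho)
```

Evaluating `residual(j, rho)` for `j = 0, 1, 2, 3` at a few values of `rho` between 3 and 6 gives numbers at the level of the discretization error, while replacing `j * (j + 1)` by, say, `j * j` produces an order-one residual for `j >= 1`.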

Substituting into (29), it can be shown that the indicial equations imply $s = j - 1$. Moreover, if $A_1 = 0 = B_1$, then $A_n = 0 = B_n$ for odd $n$. For even $n$, put $n + 2 = 2m$, and obtain the recurrence relations (33).

[The two recurrence relations (33) are illegible in the source scan. The surviving fragments show that the first links $A_{2m-2}$, $A_{2m}$, $A_{2m+2}$ and $B_{2m}$, and the second links $B_{2m-2}$, $B_{2m}$, $B_{2m+2}$ and $A_{2m}$.] (33)

Notice that for $m = 2$, the coefficient of $A_2$ in the first relation and that of $B_2$ in the second relation vanish. Hence if we assume that $A_4 = 0 = B_4$, then $A_{2m} = 0 = B_{2m}$ for $m > 2$. Because $A_{-2} = 0 = B_{-2}$, (32) then leads to four equations, only two of which are independent, in $A_0$, $B_0$, $A_2$, and $B_2$.

[These equations, and the resulting expressions obtained by keeping $A_0$ and $B_0$ arbitrary, are illegible in the source scan.]

Choosing in turn $A_0 = 0$ with $B_0$ arbitrary, and $B_0 = 0$ with $A_0$ arbitrary, we obtain the two solutions

$$Z_2 = A_2\,\rho\, j_{j+1}(\rho), \qquad y_0 = B_0\,\rho\, j_{j-1}(\rho) + B_2\,\rho\, j_{j+1}(\rho)$$

and

$$Z_2 = A_0\,\rho\, j_{j-1}(\rho) + A_2\,\rho\, j_{j+1}(\rho), \qquad y_0 = B_2\,\rho\, j_{j+1}(\rho), \qquad (34)$$

which can be checked by direct substitution in (29).

Interacting two Dirac particles:

For the minimal coupling, neglecting the self-interaction, we have the set of equations (26). Again eliminating the components $u_{00}, Z_0, v_1$, and $y_2$, we obtain the set of four ordinary differential equations (35).

[The four equations (35) for $U_1$, $Z_2$, $Y_0$ and $V_{00}$ are illegible in the source scan.] (35)

where $U_1, Y_0, Z_2$, and $V_{00}$ denote $ru_1, ry_0, rZ_2$, and $rv_{00}$. Due to the $1/r$ singularity no simple series solutions exist. Performing a second differentiation, we still obtain four coupled second-order ordinary differential equations. The main idea is to expand the coefficients of the unknown functions in powers of $\alpha$, keeping terms up to $\alpha^2, \alpha^4, \ldots$ as needed. The resulting equations can be solved exactly, thus giving the approximate component functions, but in closed form. For example, up to $\alpha^2$, the equations for the component functions $u_1$ and $v_{00}$ decouple and take the form:

$$\Big[\frac{d^2}{d\rho^2} - \frac{1}{4} + \frac{\lambda}{\rho} - \frac{J^2}{\rho^2}\Big][u_1, v_{00}] = 0, \qquad (36)$$

with

$$k^2 = \frac{(M^2 - E^2)(E^2 - \Delta M^2)}{4E^2} \qquad\text{and}\qquad \lambda = \frac{\alpha}{4kE}\,(2E^2 - M^2 - \Delta M^2). \qquad (37)$$

The regular solutions (at $\rho = 0$) of (36) are given by the hydrogenic functions $u_1 = R_{nj}(\rho)$, $v_{00} = R_{nj}(\rho)$, provided $\lambda = n$ (integer) and $n \geq j + 1$. This in turn implies the energy-mass relation

$$E^2 = \frac{M^2 + \Delta M^2}{2} \pm \frac{M^2 - \Delta M^2}{2}\Big(1 + \frac{\alpha^2}{n^2}\Big)^{-1/2}.$$
The two components $Y_0$ and $Z_2$ satisfy two coupled second-order differential equations of more complicated structure, which can be solved by developing series solutions in which $\{r^n\}$ is replaced by $\{R_{nj}\}$. Up to $\alpha^4$ we obtain, for the set (35), four coupled second-order ordinary differential equations. Two of them are:

[The two equations, which couple $u_1$ and $v_{00}$, are illegible in the source scan. The surviving fragments show the operator $d^2/d\rho^2 + (a/\rho)\,d/d\rho$ and the combination $-\tfrac14 + \lambda/\rho - (J^2 - \alpha^2)/\rho^2$.] (39)

where $\lambda$, $\rho$, and $k$ are as before (see eq. (36)), and the constants $a$ and $\delta E$ are defined in (40).

[The definitions (40) are illegible in the source scan.] (40)

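Comparing the set of equations (38) with the set of equations (35), we suggest the solutions:

$$u_1 = \sum_{\ell=0} A_\ell\, w_{\ell+s}(\rho) \qquad\text{and}\qquad v_{00} = \sum_{\ell=0} B_\ell\, w_{\ell+s}(\rho), \qquad (41)$$

where we have put $R_{n\ell}(\rho) \equiv w_\ell(n, \rho)$. The following recurrence relations (written for the first time) are needed:

[The recurrence relations (42), which express $dw_\ell/d\rho$ and $w_\ell/\rho$ in terms of $w_\ell$, $w_{\ell-1}$ and $w_{\ell+1}$, are illegible in the source scan.] (42)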

Again following the same steps as before, we show that the solutions of (38) take the form

$$u_1 = \sum_{\ell=0}^{n-j-1} A_\ell\, w_{\ell+s} \qquad\text{and}\qquad v_{00} = \sum_{\ell=0}^{n-j+1} B_\ell\, w_{\ell+s},$$

where the constants $A_\ell$ and $B_\ell$, for the various values of $\ell$, depend on two arbitrary constants. In a separate publication [6], a new method for the determination of the eigenvalues and eigenfunctions of the system of equations (26) is developed. For the equal-mass case, the constructed eigenfunctions are those of the states $N\,^3P_0$, $N\,^3S_1$, $N\,^3P_2, \ldots$ and $N\,^3D_1$, $N\,^3F_2, \ldots$, where $N$ is the appropriate principal quantum number in each case. The energy eigenvalues obtained are accurate to within terms of order $\alpha^6$. For more general cases, the work is in progress.

4 References

1. P.A.M. Dirac, Proc. Royal Soc. A126, 360 (1930).
2. G. Breit, Phys. Rev. 29, 553 (1929).
3a. N. Kemmer, Helv. Phys. Acta 10, 48 (1937).
3b. E. Fermi and C.N. Yang, Phys. Rev. 76, 1739 (1949).
4. A.O. Barut and S.R. Komy, Fortschr. Phys. 33, 6, 309 (1985).
5. A.O. Barut and N. Unal, Fortschr. Phys. 33, 319 (1985).
6. A.O. Barut, A.J. Bracken, S.R. Komy, and N. Unal, J. Math. Phys. 34(6), 2089 (1993).

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 373-386)


SINGULARITIES IN GENERAL RELATIVITY AND THE ORIGIN OF CHARGE

K. BUCHNER
Zentrum Mathematik der TU München, D-80290 München, Germany

Abstract

d-spaces are a simple and very useful tool for the description of singularities in General Relativity. In the first part of this paper, we recall the basic definitions of the theory of d-spaces. Then a short review of the results obtained with this theory is presented. It is shown that there are situations where pointlike particles can pass through singularities. This is the case, e.g., for the classical Big Bangs of a series of closed Friedmann universes. In Schwarzschild's solution, the problem of causality violation near the White Source can be solved, although some mild form of causality violation remains. Finally, the first example of a "wormhole" without exotic matter is presented. This means that an electric field proportional to $1/r^2$ is generated only by topology, without any electric charge.

1 Introduction

The famous theorems by Hawking and Penrose (see, e.g., [15], [2]) state that most "reasonable" solutions to Einstein's equations contain some sort of singularity. In general, the tidal forces become infinite when an observer approaches such a singularity. This means that physicists who are curious and approach it too closely are killed. So the question "What does a singularity look like inside?" seems to be forbidden. Mathematically speaking, the solutions to Einstein's equations are usually considered as differential manifolds. By definition, they cannot contain points in which the metric is singular. Here "singular" means, e.g., that its determinant is zero or that some scalars built from the curvature tensor are infinite. But one must keep in mind that all general definitions of a singularity

discussed so far lead to difficulties. However, this is not relevant here, as in the framework of d-spaces, singularities are considered as points (or sets of points) of space-time. So, e.g., the problem with Geroch's definition is that the topology at the singularity is not well defined [10]. And Schmidt's b-boundary sometimes leads to very strange topologies [4], [17], [18]. But in the theory of d-spaces a natural topology arises, which agrees very well with our expectations, at least in the cases discussed as yet.

Sometimes questions about the singularities themselves are important. Consider, e.g., two closed Friedmann solutions, where the Final Collapse of one of them is identified with the Big Bang of the other. Here the question "Can pointlike particles pass through this singularity?" is of practical importance. It means: "Can we get signals from the universe before ours?" However, very little is known about the state of matter immediately before and after the Big Bang. So it is not clear whether pointlike particles exist there at all. Still, if it were so, it would be very surprising. Similarly one may ask: "Where do the (pointlike) particles come from, whose geodesics begin in the White Source of Schwarzschild's solution?" Of course, we cannot see this White Source in our part of the world. But if the maximal analytic extension of Schwarzschild's solution makes any sense, this question must be answered.

For the discussion of such problems, one needs a generalization of differential manifolds which can include singularities. It is not surprising that to most questions, practically all such theories give the same answers. This is so because one is mainly interested in the geodesics passing through the singularities. They are completely determined by their starting point and the tangent vector in this point.
Now, if in a suitable topology the non-singular points are dense in the total space-time, then continuity arguments are sufficient, and the generalization of differential manifolds serves only to put the results on firm ground. In the following, the basic definitions of the theory of d-spaces [11], [12], [13] are presented very briefly. It provides a simple way to treat the above-mentioned questions. A very similar, but not identical, mathematical framework has been developed in [17], [19]. In the second part of this paper, some applications to the most common singularities in General Relativity are discussed. Another review on this subject with somewhat different content has been given in [3].

2 d-spaces

The basic idea of differential manifolds is to express functions and maps in local coordinates. More precisely: if $M$ is an $n$-dimensional differential manifold and $f$ is a real-valued function on $M$, then $f \circ \varphi^{-1}$ is considered instead of $f$, where $\varphi$ is the local coordinate function of $M$. So all functions $f$ on $M$ are reduced to compositions of the real functions $g := f \circ \varphi^{-1}$ on $\mathbb{R}^n$ with the $n$ coordinate functions $\varphi_i : x \mapsto x^i$ on $M$. Already in 1967, R. Sikorski had the idea to replace the functions on $M$ (which can be defined via the coordinate functions [...] $\mathbb{R}$. So one arrives at the following definition, which keeps all the good properties of Sikorski's definition:

Definition 1 Let $M$ be a topological space. The pair $(M, C)$ is called d-space, and $C$ differential structure, if $C$ is a sheaf of continuous real-valued functions on $M$ which form an algebra (w.r.t. pointwise operations).

It may be surprising that Sikorski's requirement of $C^\infty$-composition has been replaced by the weak axiom that $C$ forms an algebra. But it turns out that this is sufficient for a general definition, although in practice one chooses very special algebras for $C$. So one may also impose differentiability requirements, if needed. On d-spaces, a differential calculus can easily be constructed starting from the directional derivatives, which are identified with the tangent vectors. Note that it is not possible to define them as derivations if one wants to generalize also $C^r$-manifolds with $r < \infty$: it is well known that even for finite-dimensional $C^r$-manifolds, the space of derivations is infinite-dimensional.

The following definition has the advantage that the dimension of the tangent spaces is independent of $r$ (i.e. of the differentiability class of the functions $\alpha$ in the following definition, which may be greater than 1, if $C$ is chosen accordingly):

Definition 2 Let $(M, C)$ be a d-space, $x \in M$, and $C_x$ the stalk at $x$. A map

$$V : C_x \to \mathbb{R}$$

is called tangent vector to $(M, C)$ in $x$, if for all $n \in \mathbb{N}$, all $f_1, \ldots, f_n \in C_x$, and all germs $\alpha$ of $C^1(\mathbb{R}^n, \mathbb{R})$ at $y := (f_1(x), \ldots, f_n(x)) \in \mathbb{R}^n$, the equation

$$V(\alpha \circ (f_1, \ldots, f_n)) = \sum_{i=1}^{n} (\partial_i\alpha) \cdot V(f_i)$$

holds, provided $\alpha \circ (f_1, \ldots, f_n) \in C_x$. Here $\partial_i\alpha$ denotes the partial derivative of $\alpha$ w.r.t. the $i$-th argument. The vector space of all tangent vectors to $(M, C)$ in $x$ is called tangent space $T_xM$.

The definition of differential forms is not straightforward, but can be done in a consistent way [11], [12]. It will not be needed in the following. The integration causes more problems: if one wants to integrate a vector field, the result should be a curve. But what does this mean? A curve is a continuous map of a real interval to the d-space. Such maps may not exist, as can be seen if the d-space is a Julia set. Still, for the simplest examples, existence and uniqueness of the integration can be proved [8]. Fortunately, these proofs cover the definition of geodesics in singularities. But in the cases discussed below, continuity arguments are sufficient. So even these theorems are not needed.

Before d-spaces had been introduced, there was the problem that Schmidt's b-boundary gave strange topologies ([4], [10], [18]). E.g., the only neighbourhood of the Big Bang was the total Friedmann space-time. It is a great advantage of d-spaces that they yield a topology which is suitable for all applications studied so far. Note that in the definition of a d-space, one starts from a given topology which cannot be coarser than the initial topology of the functions in $C$. But it may be too fine:

Definition 3 Let $(M, \gamma)$ be a topological space, and $F$ a sheaf of local functions $M \to \mathbb{R}$ defined w.r.t. the topology $\gamma$. A topology $\sigma$ on $M$ that is coarser than $\gamma$ is called a slackening of $F$, if for every $V \in \gamma$ and every $f \in F(V)$, there are $U \in \sigma$, $U \supset V$, and $g \in F(U)$ such that $f = g|_V$ holds.

It is shown in [11], [13] that there exists a coarsest slackening $\mu$ such that all functions $g \in F(U)$, $U \in \mu$, are continuous. This $\mu$ is called initial topology of $(M, C)$. This topology should be used for the discussion of singularities. Loosely speaking, Geroch defines a singularity as a set of points where geodesics begin or end [10]. This is one of the reasons why it is sometimes useful to glue two different space-times, or two parts of the same space-time, together in their singularities: in many cases it is possible to prolong the geodesics to the newly attached region.

Definition 4 We say that two d-spaces $M_1$ and $M_2$ (or two parts of the same d-space) are glued together along $B_1 \subset M_1$ and $B_2 \subset M_2$, if

1. $M_1 - B_1$ and $M_2 - B_2$ are $C^3$-manifolds.

2. There is a continuous map $f : B_1 \to B_2$ such that each geodesic $g$ ending in $x \in B_1$ is mapped to a geodesic that starts in $f(x)$, and each geodesic ending in $f(x)$ to one that starts in some preimage of $f(x)$. "Gluing" means to identify $f(x)$ with all its preimages $x$, and to consider the geodesics in $M_1$ and the corresponding geodesics in $M_2$ as one and the same geodesic. (Therefore the above mappings of geodesics must be one to one.)

3. On these geodesics $g$, a parameter $\tau$ can be chosen in such a way that the tangent vector does not vanish: $dg(\tau)/d\tau \neq 0$ for all $g(\tau) \in B_1$ and all $g(\tau) \in B_2$. It is not necessarily the arc length, but on $M_1 - B_1$ and on $M_2 - B_2$ there must be an admissible $C^3$ parameter transformation from the arc length to $\tau$.

4. In addition, there must be atlases of $M_1 - B_1$ and of $M_2 - B_2$ and a choice of this parameter $\tau$ on each geodesic such that it is a $C^3$-curve on $M_1 \cup M_2 - (B_1 \cup B_2)$.

This definition is similar to that of [3]. It is much more restrictive than the earlier ones [1], [20], [21], [22].
But it allows one to discuss the continuation of geodesics and the introduction of local coordinates similar to geodesic coordinates. (This is the reason why $C^3$ is required, although $C^2$ would be sufficient for the definition of a geodesic.) Of course, definition 4 could easily be generalized such that more general singularities could be included, e.g. the "fork"

$$\{(x^1, x^2) \in \mathbb{R}^2 \mid x^1 < 0 \text{ and } x^2 = 0\} \cup \{(x^1, x^2) \in \mathbb{R}^2 \mid x^1 = 0 \text{ and } x^2 > 0\} \cup \{(x^1, x^2) \in \mathbb{R}^2 \mid x^1 > 0 \text{ and } x^2 = -x^1\}.$$

But the necessary changes are obvious, and would complicate the above formulation.

3

A simple example

When a geodesic, i.e. a pointlike particle, passes through a singularity, the conservation laws for momentum, energy, and mass do not necessarily hold. This can be seen from a simple example: Consider the (flat) manifolds

$$M_1 := \{(x^1, x^2, x^3, x^4, x^5) \in \mathbb{R}^5 \mid x^5 < 0 \text{ and } 2x^4 = x^5\}$$
$$M_2 := \{(x^1, x^2, x^3, x^4, x^5) \in \mathbb{R}^5 \mid x^5 > 0 \text{ and } x^4 = 0\}.$$

On these manifolds, we introduce a pseudo-Riemannian metric, whose tensor has the components diag(+1, +1, +1, +1, -1). The two spaces are glued together along

$$B_1 := B_2 := \{(x^1, x^2, x^3, x^4, x^5) \in \mathbb{R}^5 \mid x^4 = x^5 = 0\}.$$

The energy of a pointlike particle moving on a curve $x(\tau)$ is proportional to $\dot{x}^5$, and the square of its mass is proportional to $(\dot{x}^5)^2 - (\dot{x}^1)^2 - (\dot{x}^2)^2 - (\dot{x}^3)^2 - (\dot{x}^4)^2$. We normalize the masses such that for the particle under consideration, the constant of proportionality is one. Consider the geodesics composed of two parts: For $x^5 < 0$, we choose the lines with $x^i = \text{const}$, $i = 1, 2, 3$; $2x^4 = x^5$, and for $x^5 > 0$ the $x^5$-lines. Without loss of generality, one may put $\tau = x^5$ for $x^5 > 0$. Then energy conservation requires that $dx^5/d\tau = 1$ holds also on $M_1$. Therefore $dx^4/d\tau = 1/2$, and the mass of the particle is $\sqrt{3}/2$ in contrast to its mass 1 on $M_2$. This shows that it is not possible to conserve both energy and mass at the same time.

Of course, the non-conservation of energy and momentum is to be expected in this case [5]. It should be kept in mind that the conditions of definition 4 can be satisfied by adjusting the parameter $\tau$. But it is by no means clear that the mass is constant, while energy and momentum are not conserved. To answer this question, definition 4, which is purely mathematical, is not sufficient. Some knowledge about the physical processes in the "edge" of space-time is necessary (cf. also the example in [6]). Indeed, mass conservation leads to strange results: At the edge, a particle travelling on the geodesic in the direction of increasing $x^5$-values would be slowed down, whereas a particle moving in the opposite direction would be accelerated.
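The arithmetic behind the mass value $\sqrt{3}/2$ can be checked with a few lines of Python. This is only a sketch under the normalization stated above (constant of proportionality one, metric diag(+1, +1, +1, +1, -1)); the function name `mass_squared` and the tuple layout of the velocity are introduced here for illustration and do not come from the paper.

```python
import math

def mass_squared(velocity):
    """Squared mass (x5')^2 - (x1')^2 - (x2')^2 - (x3')^2 - (x4')^2
    of a particle with velocity (x1', ..., x5'), proportionality constant one."""
    x1, x2, x3, x4, x5 = velocity
    return x5**2 - x1**2 - x2**2 - x3**2 - x4**2

# On M2 the particle moves along an x5-line and tau = x5, so dx5/dtau = 1.
velocity_m2 = (0.0, 0.0, 0.0, 0.0, 1.0)

# On M1 energy conservation fixes dx5/dtau = 1; the constraint 2*x4 = x5
# then gives dx4/dtau = 1/2.
velocity_m1 = (0.0, 0.0, 0.0, 0.5, 1.0)

mass_m2 = math.sqrt(mass_squared(velocity_m2))
mass_m1 = math.sqrt(mass_squared(velocity_m1))
print(mass_m2, mass_m1)
```

If instead the mass were kept equal to 1 on $M_1$, the same function shows that $dx^5/d\tau$ would have to differ from 1 there, which is the energy jump described in the text.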

4

Schwarzschild space-time

In Kruskal coordinates, the maximal analytic extension of the exterior Schwarzschild solution is

(1) $ds^2 = \frac{32 M^3}{r}\, e^{-r/(2M)} \left(du^2 - dv^2\right) + r^2 \left(d\vartheta^2 + \sin^2\vartheta\, d\varphi^2\right)$

where the "radius" r and the "coordinate time" t are expressed by u and v as

(2) $\left(\frac{r}{2M} - 1\right) e^{r/(2M)} = u^2 - v^2$,

$t = 4M \tanh^{-1}(v/u)$ for $u > |v|$ or $u < -|v|$,
$t = 4M \tanh^{-1}(u/v)$ for $v > |u|$ or $v < -|u|$.
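Relation (2) defines r only implicitly. A minimal numerical sketch of how r can be recovered from given Kruskal coordinates (u, v) is shown below; it assumes the form $(r/(2M) - 1)\, e^{r/(2M)} = u^2 - v^2$ and uses plain bisection. The function name `kruskal_radius` and the default M = 1 are illustrative choices, not part of the paper.

```python
import math

def kruskal_radius(u, v, M=1.0, iterations=200):
    """Solve (r/(2M) - 1) * exp(r/(2M)) = u**2 - v**2 for r >= 0 by bisection."""
    target = u * u - v * v
    if target < -1.0:
        raise ValueError("u^2 - v^2 < -1 lies beyond the singularity r = 0")

    def g(s):
        # s = r/(2M); g is increasing for s >= 0 because g'(s) = s * exp(s)
        return (s - 1.0) * math.exp(s) - target

    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:      # enlarge the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 2.0 * M * 0.5 * (lo + hi)

r_horizon = kruskal_radius(1.0, 1.0)      # u^2 - v^2 = 0
r_singular = kruskal_radius(0.0, 1.0)     # u^2 - v^2 = -1
r_outside = kruskal_radius(2.0, 0.0)      # a point in an exterior region
print(r_horizon, r_singular, r_outside)
```

With these inputs the sketch returns r = 2M on the lines $u = \pm v$ and r = 0 on the hyperbola $v^2 - u^2 = 1$.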

Equation (1) shows that the points r = 0 define a singularity of the metric. According to (2), they form the hyperbola $v^2 - u^2 = 1$. Fig. 1 shows more details. There are four regions I, ..., IV which are separated by the horizons r = 2M. All future directed (i.e. dv > 0) causal lines in region I must end in the upper part of the hyperbola, i.e. in the part with v > 0. Therefore this is a Black Hole. Similarly, all future directed causal lines in region II start from the part with v < 0 of the hyperbola. Therefore it is a White Source. This is very unsatisfactory, because all particles in this region are created "without any reason" from nothing. So causality is broken in a very bad way. Of course, we who live in the asymptotically flat region III cannot see this, because all lines coming from region II have to cross a horizon with coordinate time $t = -\infty$. Nevertheless, we think that in classical physics, causality has to be respected everywhere.

This suggests that the Black Hole should be glued to the White Source. It is not difficult to verify (see [6]) that the points $(u, v, \vartheta, \varphi)$ of the Black Hole may be identified with the points $(-u, -v, \vartheta, \varphi)$ [...]. Then the [...] (in the above Kruskal coordinates) of their tangent vectors when they emerge from the White Source.

Figure 1: Maximal analytic extension of the Schwarzschild space-time. The dark lines show a photon starting in region III, which is absorbed in the Black Hole, emitted from the White Source, and, after being scattered in region IV, returns to region III.

Fig. 1 shows such a light ray. The photon starts in "our world", i.e. in region III. It falls radially into the Black Hole, and continues to travel from the White Source in region II to region IV. So the two asymptotically flat regions III and IV can communicate. It may happen that the photon is scattered by a mirror in region IV in such a way that it falls again radially into the Black Hole. Then, as shown in fig. 1, it finally reaches region III and intersects its own trajectory. This is a mild form of causality violation, as the photon crosses the horizons $t = -\infty$ four times. One could interpret this trajectory as a photon which is emitted from an observer at some coordinate time $t_1$. The photon intersects the trajectory of the observer again at an earlier time $t_2 < t_1$. So in principle, the observer gets information about his future. But he thinks that the photon comes from the past $t = -\infty$.

The question about the topology can easily be answered by definition 3: Clearly, the local coordinates of the geodesics (Kruskal coordinates as a function of some suitable parameter) belong to the differential structure. Therefore the open neighbourhoods of a singular point p are simply the open sets containing p in the Kruskal coordinate space. For details see [1], [6].

5

The closed Friedmann Universe

The standard model of cosmology describes the beginning (or the end) of our universe by the "radiation filled Friedmann model". At present, it is not clear whether our universe is expanding forever or whether it will finally collapse again. Here we treat the latter case. It is generally assumed that after such a final collapse, a new universe starts. In classical cosmology, a closed Friedmann universe can be followed by an open one, because the universe loses all its information in the singularity. This is not possible, however, when d-spaces are used [16], because the singularity is a point of this space. Therefore the pulsating universe is a connected set, and the initial condition of the solution to Friedmann's equation cannot change. So a closed universe can be followed only by a closed one.

The metric of the closed radiation filled Friedmann universe is

(3) $ds^2 = S^2(\eta) \left(-d\eta^2 + d\chi^2 + \sin^2\chi\, (d\vartheta^2 + \sin^2\vartheta\, d\varphi^2)\right)$, $\chi \in [0, \pi]$; $\vartheta \in [0, \pi]$; $\varphi \in [0, 2\pi)$,

where the function S is given by

(4) $S(\eta) := a \sin\eta$; $t = a(1 - \cos\eta)$; $\frac{dt}{d\eta} = S(\eta)$; $a \in \mathbb{R}$.

Here $\eta = t = 0$ denotes the Big Bang and $\eta = \pi$ the final collapse. The "world radius" shrinks to zero in both points. As equation (3) suggests, it is natural to assume that the final collapse is identified with the Big Bang of the "next" universe with $\eta \in [\pi, 2\pi]$. But this has to be verified in view of definition 4 above. It is well known that the distance of an arbitrary point p to the Big Bang and the final collapse is finite. And all geodesics which reach these singularities do so with a finite affine parameter. Moreover, equations (3) and (4) show that the metric is symmetric under reflections of $\eta$ at $\eta = 0, \pi, 2\pi, \ldots$
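The properties of (4) used in this argument can be checked numerically. The following sketch assumes $S(\eta) = a \sin\eta$ and $t = a(1 - \cos\eta)$ as in (4); the function names and the finite-difference step are illustrative choices.

```python
import math

def world_radius(eta, a=1.0):
    """S(eta) = a * sin(eta), the 'world radius' in (4)."""
    return a * math.sin(eta)

def cosmic_time(eta, a=1.0):
    """t(eta) = a * (1 - cos(eta))."""
    return a * (1.0 - math.cos(eta))

def dt_deta(eta, a=1.0, h=1e-6):
    """Central finite-difference approximation of dt/d(eta)."""
    return (cosmic_time(eta + h, a) - cosmic_time(eta - h, a)) / (2.0 * h)

a = 2.5
eta = 0.7
# dt/d(eta) = S(eta)
derivative_error = abs(dt_deta(eta, a) - world_radius(eta, a))
# The metric contains only S^2, which is even under reflection at eta = 0 and eta = pi.
reflection_at_0 = abs(world_radius(-eta, a) ** 2 - world_radius(eta, a) ** 2)
reflection_at_pi = abs(world_radius(math.pi + eta, a) ** 2 - world_radius(math.pi - eta, a) ** 2)
# The cosmic time between Big Bang (eta = 0) and final collapse (eta = pi) is finite: 2a.
lifetime = cosmic_time(math.pi, a) - cosmic_time(0.0, a)
print(derivative_error, reflection_at_0, reflection_at_pi, lifetime)
```

The sign of S changes at $\eta = \pi$, but since only $S^2$ enters (3), the metric on $[\pi, 2\pi]$ mirrors the one on $[0, \pi]$.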

Therefore the geodesic equations are also symmetric about the singularities. It remains only to show that there is a parameter for these geodesics such that the tangent does not vanish and the geodesic is $C^3$ in the singularities. The easy calculation is done in [1] and [6].

The topology at the singularities is found in the same way as above. The topology can be generated by the open sets not containing a singularity, and by the balls at the singularities with radius $\eta = \varepsilon$, where $\varepsilon$ is a rational number.

There are two interesting facts to notice: First, the singularities of Friedmann space-time are single points. This statement is intuitively clear, and can be made precise by a discussion of the geodesics [1]. A more surprising result is that the dimension of the Friedmann singularities is not four, but five: Consider

(5) $\lim_{\eta \to 0} \frac{f(\chi + \alpha, \vartheta, \varphi, \eta) - f(\chi, \vartheta, \varphi, \eta)}{\| x(\chi + \alpha, \vartheta, \varphi, \eta) - x(\chi, \vartheta, \varphi, \eta) \|}$, $\alpha = \pi/2$,

where the denominator is the distance of the points with the given coordinates, and the $\chi$-values have to be taken modulo $\pi$. This defines a tangent vector in $\eta = 0$. A similar vector can be defined by interchanging the role of $\vartheta$ and $\chi$ in (5). In addition, if this formula is applied to $\varphi$ instead of $\chi$, two additional independent tangent vectors are defined for $\alpha = \pi/2$ and $\alpha = \pi$. Furthermore, one may consider a $C^1$-curve on M ending in $\eta = 0$. The derivative in the direction of this curve (defined by a sequence of points on this curve) is independent of the above vectors.

The intuitive meaning of these tangent vectors becomes clear if one considers the embedding of the space-time into a 5-dimensional flat space. There the singularities are cones.

6

T h e origin of charge

Almost 50 years ago, Archibald Wheeler showed [28], how charge can be generated by topology: Charge is the property of space that the divergence of some field (the dielectric field in the electromagnetic case) does not vanish. Consider the field of an electric dipole, i.e. the field of two electric point charges, one positive and one negative, of equal absolute value. Now the two points, where the charges are located, are identified with each other. This has the consequence that the field lines do no longer end in these points, but are continued to the other part of space-time. So the divergence can

383 be made zero, and the charges disappear, although the electric field is not altered. Only the topology has been changed. Wheeler has formulated this idea of a " wornhole" at a time, when one did not know yet many details about exact solutions to Einstein's equations. Recently, his proposal has been frequently discussed in connection with the maximal analytic extention of space-times (a very readable review is [27]). But now, the goal of the authors is not to explain charge by topology, but to find "transversable" wormholes, i.e. paths to other parts of the world which otherwise would be not accessable (see, e.g. [26]). In particular, "transversable" means that no horizon can be crossed. On the other hand, the ReiflnerNordstr0m solution shows that charge is hidden from our part of the world even behind two horizons. Therefore we cannot expect that Wheeler's idea of replacing charge by topology works with transversable wormholes. It is surprising that such a wormhole can be realized with only one pointlike charge. The reason is that gravity produces an infinite series of singularities at the "position" of the charge: The solution to the Einstein-Maxwell equations with one pointlike mass M and electric charge Q is the famous Reifiner-Nordstr0m space-time: In case that M is larger than Q (in natural units), the maximal analytic extension of the metric is given by [9]

<6> *> = -sfe 'LVwW d V i V "''<*" + ""'**"' Here r + and r_ denote the two horizons. They are determined by the mass and the charge of the Black Hole. Further, one defines a := (r+ — r_)/{2r\) > 0 ; j3 := r2__jr\. The "radius" r is implicitely defined by tan U tan V

• r+\ \r — r_| ^ for • r+\ \r — r_| _ / 3 for

r > r+ and 0 < r < r_ r_ < r < r+

Of course, the electric field is proportional to $1/r^2$. The Penrose diagram of this space-time is shown in fig. 2. The singularities r = 0 can be reached by spacelike and null geodesics, which end there. Therefore definition 4 suggests to identify pairs of singular points in such a way that the geodesics can be continued. A straightforward calculation [7] shows that this is possible if points with the same value of U + V in $A_n$ and in $A'_{n+1}$ are glued together. Note that the r-lines are geodesics, which at the same time are also the electric field lines. And the matter is homogeneous; therefore the electric and the dielectric fields are proportional to each other. This means that also the divergence of the dielectric field, i.e. the charge, vanishes, although an observer in an asymptotically flat region $C_n$ or $C'_n$ only sees a field of a point charge proportional to $1/r^2$.

Surprisingly, it is not necessary to glue singular points. It is also possible to identify points with $r = \varepsilon > 0$ and the same value of U + V in $A_n$ and $A'_{n+1}$, if all points with $r < \varepsilon$ are deleted. In this case, no exotic matter appears at the surface $r = \varepsilon$, provided $\varepsilon$ is small enough. This can be seen by a straightforward calculation of the surface energy and momentum [7], which are well defined and finite for $\varepsilon \to 0$, whereas the electromagnetic self-energy tends to infinity.

So one has the choice: Either one glues the space-time in the singularities. This has the advantage that no surface terms appear. If instead points with finite radius $\varepsilon > 0$ are glued together, the singularities in the metric and an infinite electromagnetic energy are avoided.

Figure 2: Maximal analytic extension of the Reissner-Nordström space-time. Dashed lines: r = const. The solid line shows a photon starting in $C_n$ and passing to $C'_{n+1}$ through a singularity.

References

[1] M. Abdel-Megied, K. Buchner, R.M.M. Gad: Topologie und Verklebung singulärer Raum-Zeiten. Proc. 4th Intern. Congr. Geometry, N. K. Artemiadis and N. K. Stephanidis ed., Thessaloniki 1996, 57 - 68
[2] J. K. Beem, P. E. Ehrlich: Global Lorentzian geometry. Marcel Dekker, Inc., New York and Basel 1981
[3] A. Beigel, K. Buchner: d-spaces, singularities, and the origin of charge. To be published
[4] B. Bossardt: On the b-boundary of the closed Friedmann model. Comm. Math. Phys. 46 (1976), 263 - 268
[5] K. Buchner: A remark on energy and momentum in embedded spacetimes. Progr. Theor. Phys. 46 (1971), 1946 - 1947
[6] K. Buchner: Differential spaces and singularities of space-time. General Mathematics 5 (1997), 53 - 66
[7] K. Buchner: 1/r - Potential ohne Ladung. To be published
[8] K. Buchner, K. Büschel: Dynamical systems on differential spaces. Proc. 23rd National Conf. on Geom. and Topol., Cluj-Napoca 1993, 30 - 37
[9] S. Chandrasekhar: The mathematical theory of Black Holes. Oxford University Press, Oxford 1998
[10] R. Geroch: Local characterization of singularities in General Relativity. Journ. Math. Phys. 9 (1968), 450 - 468
[11] M. Gerstner: d-Räume. Eine Verallgemeinerung der Differentialräume mittels Funktionsgarben. Dissertation, TU München 1995
[12] M. Gerstner, K. Buchner: Differential spaces based on local functions. Analele Ştiinţ. ale Univ. "Ovidius" Constanţa, Ser. Mat. III (1995), 37 - 45
[13] M. Gerstner, K. Buchner: The topology of differential spaces. Analele Ştiinţ. ale Univ. "Al. I. Cuza", Iaşi, 42 (Supliment) (1996), 101 - 111
[14] J. Gruszczak, M. Heller, and Z. Pogoda: Cauchy boundary and b-incompleteness of space-time. Intern. Journ. Theor. Phys. 30 (1991), 555 - 565
[15] S. Hawking, G. F. R. Ellis: The large scale structure of space-time. Cambridge University Press, Cambridge 1976
[16] M. Heller, W. Sasin: Generalized Friedmann's equation and its singularities. Acta Cosmologica XIX (1993), 23 - 33
[17] M. Heller, W. Sasin: Structured spaces and their application to relativistic physics. Journ. Math. Phys. 36 (1995), 3644 - 3663
[18] R. A. Johnson: The bundle boundary in some special cases. J. Math. Phys. 18 (1977), 898 - 902
[19] M. A. Mostow: The differential space structures of Milnor classifying spaces, simplicial complexes, and geometric realizations. Journ. Diff. Geom. 14 (1979), 255 - 293
[20] W. Sasin: Geometrical properties of gluing of differential spaces. Demonstratio Mathematica 24 (1991), 635 - 656
[21] W. Sasin: Gluing of differential spaces. Demonstratio Mathematica 25 (1992), 361 - 384
[22] W. Sasin, K. Spallek: Gluing of differential spaces and applications. Math. Ann. 292 (1992), 85 - 102
[23] R. Sikorski: Abstract covariant derivative. Colloquium Mathem. 18 (1967), 252 - 272
[24] R. Sikorski: Differential modules. Colloquium Mathem. 24 (1971), 46 - 79; cf. also R. Sikorski: Wstęp do geometrii różniczkowej, Państwowe Wydawnictwo Naukowe, Warszawa 1972
[25] K. Spallek: Differenzierbare und holomorphe Funktionen auf analytischen Mengen. Math. Ann. 161 (1965), 143 - 162
[26] M. Visser: Quantum wormholes. Phys. Rev. D 43 (1991), 402 - 409
[27] M. Visser: Lorentzian wormholes. AIP Press and Springer-Verlag 1995
[28] A. Wheeler: Geons. Phys. Rev. 97 (1955), 511 - 536; cf. also: A. Wheeler: Einsteins Vision. Springer-Verlag 1968

Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 387-394)


The Inner Geometry of Light Cone in Gödel Universe

M. Abdel-Megied
Mathematics Department, Faculty of Science, Minia University, El-Minia, EGYPT

1

Introduction

Einstein's general theory of relativity, with its powerful mathematical instruments, enables us to investigate the local and global structure of our universe in any space-time (M, g), where M is a 4-dimensional manifold and g is a Lorentz metric on M with signature -2. The most important feature of any space-time is the existence of null curves, particularly null geodesics (light rays), null surfaces and null hypersurfaces, which are characteristics of Einstein's field equations in the sense of the theory of normal hyperbolic second order differential equations. The existence of these null (or light-like) manifolds has a physical origin, since null geodesics (light rays) are the trajectories of photons, and the null hypersurfaces of constant phase in the geometric optics (high frequency) limit can be considered as level surfaces (null surfaces) of the function $S = S(x^i)$, $i = 0, 1, 2, 3$, satisfying the eikonal equation (massless Hamilton-Jacobi equation), namely $g^{ij}\, \partial_i S\, \partial_j S = 0$ [Frittelli and Newman (1999)]. A geometric illustration can be given if we start with a given 2-dim. space-like smooth surface Z in a space-time M that is a solution of the Einstein field equations

$$R_{ij} - \tfrac{1}{2} R\, g_{ij} + \Lambda g_{ij} = -8\pi k\, T_{ij} \qquad (*)$$

where $R_{ij}$ is the Ricci tensor, R is the scalar curvature, $\Lambda$ is the cosmological constant, k is the gravitational constant and $T_{ij}$ is the energy-momentum tensor. Choose at each point of Z a null direction perpendicular to Z, depending smoothly on the foot point. There are exactly two such direction fields on Z; each of these determines a unique null geodesic to which it is tangent, giving a two-parameter family of null geodesics. This family can be interpreted as a bundle of light rays issuing from a surface at a specific instant of time and propagating freely without direct interaction with matter.

Let W be the set of all points of M that can be joined to Z by one of these geodesics. In the neighbourhood of Z, W is a 3-dim. light-like (null) submanifold of M; further away from Z, neighbouring null geodesics (light rays) might intersect each other and W might fail to be a submanifold of M. Following Friedrich and Stewart (1983), we call any subset W of M that can be constructed in this way a wavefront. By definition the caustic of a wavefront W is the set of all points $x \in W$ where W fails to be an immersed submanifold of M. A picture of this construction of caustics has been given by Penrose (1972). The occurrence of caustics (singularities) is an intrinsic property of wavefronts. A caustic can also be interpreted as the location of focusing regions, where the intensity of light becomes very high.

It is found that the classification of wavefronts near their singularities (caustics) is equivalent to the classification of Legendrian submanifolds near the points where the projection from the submanifold to the base has a singularity, in the sense that the tangent map has non-maximal rank [Arnold et al (1985)]. Friedrich and Stewart (1983) used Arnold's results to obtain a local classification of caustics of wavefronts. Hasse et al (1996) obtained a local classification of caustics of wavefronts in terms of their projection from space-time to space. This classification is found to be more general than that given by Friedrich and Stewart in 1983, in the sense that it is independent of which timelike vector field has been chosen, i.e., it is observer independent. A more recent elegant study of the theory of caustics and wavefront singularities is given by Ehlers and Newman (1999). This article includes, from the physical and mathematical points of view, a nice and clear review of the work of V.I. Arnold on the theory of Lagrangian and Legendrian submanifolds and their associated maps.

It is worth mentioning that the notion of wavefronts includes, in particular, light cones, which can be constructed from all null geodesics (light rays) issuing from a point $x \in M$ as a vertex, by choosing for Z (the 2-dim. space-like surface) an appropriate sphere near $x \in M$. The study of singularities of wavefronts (caustics), i.e., the study of the inner geometry of the light cone, is of great importance since it gives, e.g., a clear description of the gravitational field in vacuum, particularly near the vertex of the light cone [Dautcourt (1965)]. According to general relativity, the null geodesics constituting the light cone with vertex $x \in M$ can be forced to reconverge by a sufficiently strong gravitational field (e.g., of a quasar, galaxy or cluster of galaxies). This is known as the gravitational lens effect, which is today one of the most rapidly growing areas in astrophysics. A comprehensive review of this physical theory is contained in Schneider, Ehlers and Falco (1992) [see also: Ehlers (1998), Kayser et al (1992)].


2

Structure of Light Cone in Gödel Universe

Kurt Gödel (1949) obtained a solution of Einstein's field equations (*) with cosmological constant $\Lambda < 0$, which can be represented by the metric

$$ds^2 = \left(dx^0 + e^{x^1/b}\, dx^2\right)^2 - \left[(dx^1)^2 + \tfrac{1}{2}\, e^{2x^1/b} (dx^2)^2 + (dx^3)^2\right] \qquad (2.1)$$

This solution describes a rotating dust-filled universe (non-expanding and shear-free) with density of matter $\rho$ and a constant rigid rotation $\Omega$, where $\Omega = \frac{1}{\sqrt{2}\, b}$, $b = \frac{1}{\sqrt{8\pi k \rho}}$, $T_{ij} = \rho\, u_i u_j$, and $u^i$ is the 4-velocity vector.

The metric (2.1) can be written in the so-called "standard Gödel coordinate system" in the form:

$$ds^2 = 4\left[dt^2 - dr^2 + 2\sqrt{2}\, \sinh^2 r\, d\phi\, dt + (\sinh^4 r - \sinh^2 r)\, d\phi^2 - dz^2\right] \qquad (2.2)$$

Among the various striking properties of this Gödel universe [Gödel (1949)], one of the most interesting is the existence of closed time-like curves and past travelling world lines. So, a global serial order with respect to the comparison of "later than" or "earlier than" is no longer given. These exotic features imply a violation of causality and the absence of a cosmic time. The proof that in Gödel universe such a time travel cannot be performed along timelike geodesics is given by Pfarr (1981). These peculiar properties attracted cosmologists' attention to the geodesics in this universe. The geodesic equations

$$\frac{d^2 x^i}{ds^2} + \Gamma^i_{jk}\, \frac{dx^j}{ds} \frac{dx^k}{ds} = 0 \qquad (2.3)$$

where $\Gamma^i_{jk}$ are the Christoffel symbols and s is an affine parameter along the geodesic, were explicitly integrated by Kundt (1956) and Chandrasekhar and Wright (1961). The light cone in Gödel universe was constructed by Abdel-Megied and Dautcourt (1972). With the vertex (0, 0, b, 0), its parametric representation is given by:

$$t = 2\sqrt{2}\, b \arctan\left\{ \frac{(u^2 + v^2) \tan w}{v(u^2 + 1) + u(v^2 - 1) \tan w} \right\} - \frac{b}{\sqrt{2}} \left(v + \frac{1}{v}\right) w$$
$$x = \frac{b (v^2 - 1) \left\{ v(u^2 - 1) + u(v^2 + 1) \tan w \right\} \tan w}{(v - u \tan w)^2 + v^2 (u + v \tan w)^2}$$
$$y = \frac{b\, v^2 (u^2 + 1) \sec^2 w}{(v - u \tan w)^2 + v^2 (u + v \tan w)^2} \qquad (2.4)$$
$$z = \frac{b}{\sqrt{2}\, v} \sqrt{6v^2 - v^4 - 1}\;\, w$$

where w is an affine parameter along the null geodesics (light rays) and u, v are directional (transversal) parameters fixing a geodesic (generator of the light cone). The parameter u is arbitrary, while v is restricted to the values $1 \le |v| \le 1 + \sqrt{2}$. The null directions $\xi^i = \frac{dx^i}{dw}$ (i = 0, 1, 2, 3) at the vertex can be calculated from (2.4), so we get

$$g_{ij}\, \xi^i \xi^j = 0 = (\xi^0 + \sqrt{2}\, \xi^1)^2 - (\xi^1)^2 - (\xi^2)^2 - (\xi^3)^2 \qquad (2.5)$$

Introducing, at the vertex, the new directional parameters

$$\zeta^1 = \frac{\xi^1}{\xi^0 + \sqrt{2}\, \xi^1}, \quad \zeta^2 = \frac{\xi^2}{\xi^0 + \sqrt{2}\, \xi^1}, \quad \zeta^3 = \frac{\xi^3}{\xi^0 + \sqrt{2}\, \xi^1} \qquad (2.6)$$

then using (2.5), we have:

$$(\zeta^1)^2 + (\zeta^2)^2 + (\zeta^3)^2 = 1 \qquad (2.7)$$

which represents the celestial sphere of an observer at the vertex (0, 0, b, 0) of the light cone (2.4). A very interesting picture which gives a deeper insight into the structure of the light cone in Gödel universe can be found in [Hawking and Ellis (1973), p. 168].
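The step from the null condition (2.5) to the sphere (2.7) can be illustrated numerically. The sketch below assumes that (2.5) has the form $(\xi^0 + \sqrt{2}\,\xi^1)^2 = (\xi^1)^2 + (\xi^2)^2 + (\xi^3)^2$ and that the directional parameters are the ratios $\zeta^\alpha = \xi^\alpha / (\xi^0 + \sqrt{2}\,\xi^1)$; both forms, and the helper names, are assumptions made for this illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def null_direction(xi1, xi2, xi3):
    """Return (xi0, xi1, xi2, xi3) satisfying the assumed null condition
    (xi0 + sqrt(2)*xi1)^2 = xi1^2 + xi2^2 + xi3^2, on the branch xi0 + sqrt(2)*xi1 > 0."""
    norm = math.sqrt(xi1**2 + xi2**2 + xi3**2)
    return (norm - SQRT2 * xi1, xi1, xi2, xi3)

def celestial_point(xi):
    """Map a null direction to the assumed directional parameters (zeta1, zeta2, zeta3)."""
    xi0, xi1, xi2, xi3 = xi
    scale = xi0 + SQRT2 * xi1
    return (xi1 / scale, xi2 / scale, xi3 / scale)

xi = null_direction(0.3, -1.2, 0.7)
zeta = celestial_point(xi)
radius_squared = sum(component**2 for component in zeta)
print(zeta, radius_squared)
```

Dividing the null condition by $(\xi^0 + \sqrt{2}\,\xi^1)^2$ is all that happens here, so every null direction lands on the unit sphere regardless of its overall scale.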

3

The Inner Geometry of Light Cone

To study the behaviour of the light cone near its singular points (caustics) there are some difficulties due to the very complicated form of its inner metric $g^*_{\alpha\beta}$ ($\alpha, \beta = 1, 2, 3$) defined by

$$g^*_{\alpha\beta} = g_{ij}\, \frac{\partial x^i}{\partial y^\alpha} \frac{\partial x^j}{\partial y^\beta} \qquad (3.1)$$

where $y^1 = w$, $y^2 = u$, $y^3 = v$ are the coordinates on the light cone. The metric $g^*_{\alpha\beta}$ is singular, since $g^*_{1\alpha} = 0$. Using (2.4) in (3.1) we get the components $g^*_{22}$, $g^*_{23}$, $g^*_{33}$ of the inner metric as finite sums of the type

$$\frac{b^2 \sin^2 w \cos^2 w}{U^2} \sum_{\nu} P_\nu(u, v) \tan^\nu w \qquad (3.2)$$

with analogous sums over $Q_\nu$, $R_\nu$, $S_\nu$, $T_\nu$, further terms carrying the prefactor $b^2 w \sin w \cos w$, and a term in $g^*_{33}$ proportional to $w^2$ that contains the factor $6v^2 - v^4 - 1$. Here $U = (v - u \tan w)^2 + v^2 (u + v \tan w)^2$, and P, Q, R, S, T are functions of the transversal parameters u and v.

Now, as we have mentioned in the introduction, the singularities on the light cone are the locations of focal points where neighbouring geodesics (light rays) intersect. This leads to a higher degeneracy of the inner metric $g^*_{\alpha\beta}$ [Dautcourt (1967)]. A glance at the metric (3.2) shows that at the values $w = n\pi$ (n = 0, 1, 2, 3, ...) the inner metric becomes singular:

$$g^*_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -\dfrac{n^2 \pi^2 b^2 (v^2 - 1)^4}{v^4 (6v^2 - v^4 - 1)} \end{pmatrix} \qquad (3.3)$$

Substituting the values $w = n\pi$ in the parametric equations of the light cone given by (2.4), we get the curve:

$$t = -\frac{n\pi b}{\sqrt{2}}\, l, \quad x = 0, \quad y = b, \quad z = \frac{n\pi b}{\sqrt{2}} \sqrt{8 - l^2} \qquad (3.4)$$

where $l = v + \frac{1}{v}$. This shows that the focal points lie on the circles

$$t^2 + z^2 = 4 n^2 \pi^2 b^2 \qquad (3.5)$$

in the pseudo-Euclidean plane x = 0, y = b. The value n = 0 corresponds to the vertex (0, 0, b, 0) of the light cone. The circles (3.5) for different values of n are space-like and represent the so-called Keel curves [Riesz (1956)]. Now from (2.4) and (2.6) we have, at the vertex of the light cone, the null directions (rays):

$$(\zeta^1)^2 + (\zeta^2)^2 = \frac{2 (l^2 - 4)}{l^2}, \quad \zeta^3 = \frac{\sqrt{8 - l^2}}{l} \qquad (3.6)$$

where $l = v + \frac{1}{v}$. The values $l = \pm 2$ ($v = \pm 1$) correspond to the north and south poles of the celestial sphere (2.7) of an observer near the vertex, while the values $l = \pm 2\sqrt{2}$ ($v = \pm(1 + \sqrt{2})$) correspond to the great circle

$$(\zeta^1)^2 + (\zeta^2)^2 = 1 \qquad (3.7)$$

For all other values of $l \neq 2$, $l \neq 2\sqrt{2}$, we get circles with increasing radius. These circles cover the upper half of the sphere (2.7). For the negative values $l \neq -2$, $l \neq -2\sqrt{2}$, we get circles covering the lower part of the sphere (2.7). Thus we have the following property of the inner geometry of the light cone:

There is a correspondence between the focal points on the light cone in Gödel universe and the points on the celestial sphere of an observer at the vertex.

For physical interest we calculate the length of the Keel curves, i.e., the circumference of the circles (3.5) for any n. From the formula

$$s = \int \left| g_{ij}\, \frac{dx^i}{dl} \frac{dx^j}{dl} \right|^{1/2} dl = 2 n \pi b \int_{2}^{2\sqrt{2}} \sqrt{\frac{l^2 - 4}{8 - l^2}}\; dl$$

To evaluate this integral put $l = 2\sqrt{2 - \sin^2\theta}$:

$$s = 2 n \pi b \int_0^{\pi/2} \frac{\sqrt{2}\, \cos^2\theta\, d\theta}{\sqrt{1 - \tfrac{1}{2} \sin^2\theta}}$$

$$s = 4\sqrt{2}\, n \pi b\, E(k) - 2\sqrt{2}\, n \pi b\, K(k)$$

where $K(k)$ and $E(k)$ are the complete elliptic integrals of the first and second kind with $k^2 = 1/2$, which have the values $E = 1.35064$, $K = 1.85407$. So, for any n, the circumference of any circle (Keel curve) is found to be $s = 2.4\, n \pi b$. If we take into consideration that $b = \frac{1}{\sqrt{8\pi k \rho}}$, where $k = \frac{f}{c^2}$ is the gravitational constant, $c = 3 \times 10^{10}$ cm/sec the velocity of light, $f = 6.67 \times 10^{-8}$ dyn cm$^2$/gm$^2$ the Newtonian constant, and $\rho = 5 \times 10^{-30}$ gr/cm$^3$ the density, we obtain

$$s \approx 2.4\, n \pi \times 10^{28}\ \text{cm} \approx 0.8\, n \times 10^{11}\ \text{light years} \qquad (3.8)$$

(a light year $\approx 9.46 \times 10^{17}$ cm). We can formulate this result as follows:

The length of the circumference of the Keel curve in Gödel Universe is, for small n, of the order of the gravitational radius.
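The numerical factor 2.4 and the quoted order of magnitude can be reproduced with a short script. It is a sketch under explicit assumptions: the elliptic integrals are taken with $k^2 = 1/2$ (the choice that reproduces the quoted values 1.35064 and 1.85407), and the length scale is taken as $b = 1/\sqrt{8\pi k\rho}$ with $k = f/c^2$, which is an inferred reading of the constants in the text rather than a formula confirmed by it. The Simpson rule helper is written out so that only the standard library is needed.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

K2 = 0.5  # assumed squared modulus k^2 = 1/2

# Complete elliptic integrals of the first (K) and second (E) kind.
K = simpson(lambda th: 1.0 / math.sqrt(1.0 - K2 * math.sin(th) ** 2), 0.0, math.pi / 2)
E = simpson(lambda th: math.sqrt(1.0 - K2 * math.sin(th) ** 2), 0.0, math.pi / 2)

# s / (n * pi * b) = 4*sqrt(2)*E - 2*sqrt(2)*K
coefficient = 4.0 * math.sqrt(2.0) * E - 2.0 * math.sqrt(2.0) * K

# Physical constants in cgs units, as quoted in the text.
c = 3.0e10          # cm/sec
f = 6.67e-8         # Newtonian constant
rho = 5.0e-30       # gr/cm^3
k = f / c**2        # assumed relation k = f / c^2
b = 1.0 / math.sqrt(8.0 * math.pi * k * rho)   # assumed relation for b, in cm

light_year_cm = 9.46e17
s_cm_per_n = coefficient * math.pi * b          # circumference for n = 1, in cm
s_ly_per_n = s_cm_per_n / light_year_cm
print(K, E, coefficient, b, s_cm_per_n, s_ly_per_n)
```

Under these assumptions b comes out close to $10^{28}$ cm, so that the n = 1 circumference is of the order of $10^{11}$ light years.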

REFERENCES

Abdel-Megied, M. and Dautcourt, G. "Zur Struktur des Lichtkegels in Gödel Kosmos." Math. Nachrichten 54 (1972) pp: 33 - 39.
Arnold, V.I.; Gusein-Zade, S.M. and Varchenko, A.N. "Singularities of Differential Maps." Vol. 1. Birkhäuser, Boston, Basel (1985).
Dautcourt, G. "Isotrope Flächen in der allgemeinen Relativitätstheorie." Habilitationsschrift, Humboldt Univ., Berlin (1965).
Dautcourt, G. "Characteristic Hypersurfaces in General Relativity." J. Math. Phys. 8 (1967) p: 1492.
Ehlers, J. "Gravitationslinsen: Lichtablenkung in Schwerefeldern und ihre Anwendung." Carl Friedrich von Siemens Stiftung, Bd. 69 (1998).
Ehlers, J. and Newman, E. "The Theory of Caustics and Wavefront Singularities with Physical Applications." gr-qc/9906065 (10 June 1999).
Friedrich, H. and Stewart, J.M. "Characteristic Initial Data and Wavefront Singularities." Proc. Roy. Soc. London A 385 (1983) pp: 345 - 371.
Frittelli, S. and Newman, E.T. "The Eikonal Equation I" J. Math. Phys. 40 (1999) pp: 383 - 407.
Frittelli, S. and Newman, E.T. "The Eikonal Equations II ..." J. Math. Phys. 40 (1999) pp: 1041 - 1056.
Gödel, K. "An Example of a New Type of a Cosmological Solution of Einstein's Field Equations." Rev. Mod. Phys. 21 (1949) pp: 447 - 450.
Hasse, W.; Kriele, M. and Perlick, V. "Caustics of Wavefronts in General Relativity." Class. & Quant. Grav. 13 (1996) pp: 1161 - 1182.
Hawking, S.W. and Ellis, G.F.R. "Large Scale Structure of Space-time." Cambridge Univ. Press (1973).
Kayser, R.; Schramm, T. and Nieser, L. (Eds.) "Gravitational Lenses." Lect. Notes in Phys. 406 (1992) Springer-Verlag.
Kundt, W. "Trägheitsbahnen in einem von Gödel angegebenen kosmologischen Modell." Zeitschr. f. Phys. 145 (1956) p: 611.
Penrose, R. "Techniques of Differential Topology in Relativity." Regional Conf. Series in Applied Math. Published by SIAM, Philadelphia, PA (1972).
Pfarr, J. "Time Travel in Gödel's Space." Gen. Rel. & Grav. 13 (1981) pp: 1073 - 1091.
Riesz, M. "Problems Related to Characteristic Surfaces." Proc. Inter. Conf. in Differential Equations (1956) p: 57.
Schneider, P.; Ehlers, J. and Falco, E. "Gravitational Lensing." Springer-Verlag, Berlin (1992).


List of Participants

* Plenary or topical lecture speaker

1. G. M. Abd Al-Kader, Faculty of Science, Al-Azhar University, Egypt
2. Elham M. Abd Elrasol, Faculty of Science, Cairo University, Egypt
3. Abo-El Nour N. Abd-Alla, Faculty of Science, South Valley University, Egypt
4. A. M. Abdalla, Faculty of Science, Benha, Egypt
5. M. Z. Abdalla, Faculty of Science, Cairo University, Egypt
6. Nassar Hassan Abdel-All, Faculty of Science, Assiut University, Egypt
7. Laila F. Abdel-A'll, Faculty of Science, Cairo University, Egypt
8. M. Abdel-Aty, Faculty of Science, South Valley University, Egypt
9. M. R. Abdel-Aziz, Faculty of Science, Kuwait University, Kuwait
10. Hamdy I. Abdel-Gawad, Faculty of Science, Cairo University, Egypt
11. A. M. Abdel-Hafez, Faculty of Science, El-Minia University, Egypt
12. M. Ezzat Abdel-Monsef, Faculty of Science, Tanta University, Egypt
13. *M. Abdel-Megied, Faculty of Science, El-Minia University, Egypt
14. M. Elshafie Abdellatif, Faculty of Engineering, Assiut University, Egypt
15. Hosny A. Abdusalam, Faculty of Science, Cairo University, Egypt
14. *Faruk F. Abi-Khuzam, A.U.B., Beirut, Lebanon
16. Abdel Aziz Abo Khadra, Faculty of Engineering, Tanta University, Egypt
17. Mostafa S. Abou-Dina, Faculty of Science, Cairo University, Egypt
18. Abdel-Karim Aboul-Hassan, Faculty of Engineering, Alexandria University, Egypt
19. Saeed Abu-Zour, Faculty of Science, United Arab Emirates University, UAE
20. A. I. Aggour, Faculty of Science, Al-Azhar University, Egypt
21. *Essam K. Al-Hussaini, Faculty of Science, University of Assiut, Egypt
22. Manuel J. Alejandre, Universidad Pública de Navarra, Spain
23. Mohamed Nabil Allam, Faculty of Science, Mansoura University, Egypt
24. M. A. Amer, Faculty of Engineering, Mansoura University, Egypt
25. S. M. Amer, Faculty of Science, Zagazig University, Egypt
26. M. Mostafa Anbar, Faculty of Engineering, Cairo University, Egypt
27. *Mohammed Asaad, Faculty of Science, Cairo University, Egypt
28. *Attia A. Ashour, Faculty of Science, Cairo University, Egypt (Chairman)
29. Maria J. Asiain, Universidad Pública de Navarra, Spain
30. Mohammed Atallah, Faculty of Science, Tanta University, Egypt


31. *Michael Atiyah, Department of Mathematics and Statistics, Edinburgh, U.K.
32. A. H. Azzam, Suez Canal University, Egypt
33. M. A. Bakry, Faculty of Education, Ain Shams University, Egypt
34. *N. Balakrishnan, McMaster University, Canada
35. *Adolfo Ballester-Bolinches, University of Valencia, Spain
36. *Bolis Basit, Monash University, Australia
37. I. Bayoumi, Faculty of Science, Ain Shams University, Egypt
38. *Klaus Buchner, Technische Universität München, Germany
39. *Robin Bullough, UMIST, Manchester, UK
40. J. Bustoz, Arizona State University, USA
41. Sergio Camp-Mora, Universidad Pública de Navarra, Spain
42. A. Dabbour, Faculty of Science, Ain Shams University, Egypt
43. *Lokenath Debnath, University of Central Florida, USA
44. Assem Deif, Faculty of Engineering, Cairo University, Egypt
45. Hacen Dib, Faculty of Science, University of Tlemcen, Algeria
46. *Vlastimil Dlab, Carleton University, Canada
47. Eid H. Doha, Faculty of Science, Cairo University, Egypt
48. *W. Ebeid, Faculty of Education, Ain Shams University, Egypt
49. *Jürgen Ehlers, Max-Planck Inst. für Gravitationsphysik, Albert Einstein Inst., Germany
50. Ibrahim F. Eissa, Institute for Stat. and Res., Cairo University, Egypt
51. M. M. El-Ann, Faculty of Science, Al-Azhar University, Egypt
52. Samia S. Elazab, Women's University College, Ain Shams University, Egypt
53. A. El-Gohary, Faculty of Science, Mansoura University, Egypt
54. Salah El-Gindy, Faculty of Science, Assiut University, Egypt
55. Hany M. El-Hosseiny, Faculty of Science, Cairo University, Egypt
56. A. I. El-Maghrabi, Faculty of Education, Tanta University, Egypt
57. *M. S. El-Naschie, Cambridge, UK
58. H. M. El-Owaidy, Faculty of Science, Al-Azhar University, Egypt
59. Tarek M. El-Shahat, Faculty of Science, Al-Azhar University of Assiut, Egypt
60. Magdy El-Tawil, Faculty of Engineering, Cairo University, Egypt
61. Somaia El-Zahaby, Faculty of Science, Al-Azhar University, Egypt
62. E. Elshobaky, Faculty of Science, Ain Shams University, Egypt
63. Ahmed Elsonbaty, Faculty of Engineering, Assiut University, Egypt
64. L. M. Ezquerro, Universidad Pública de Navarra, Spain
65. *M. Fadlalla, Faculty of Science, Ain Shams University, Egypt


66. M. Faragallah, Faculty of Education, Ain Shams University, Egypt
67. Magdi Elias Fares, Faculty of Science, Mansoura University, Egypt
68. J. Fleckinger, Université de Toulouse, France
69. *Ahmed F. Ghaleb, Faculty of Science, Cairo University, Egypt
70. M. H. Ghanem, Faculty of Science, Zagazig University, Egypt
71. *Phillip A. Griffiths, Institute for Advanced Study, Princeton, USA
72. *Martin Groetschel, Konrad-Zuse-Zentrum für Informationstechnik, Berlin, Germany
73. Salah Haggag, Faculty of Science, Al-Azhar University of Assiut, Egypt
74. Badie T. M. Hassan, Faculty of Science, Cairo University, Egypt
75. Michel Hebert, The American University in Cairo, Egypt
76. Ahmed S. Hegazi, Faculty of Science, Mansoura University, Egypt
77. Mohamed Atef Helal, Faculty of Science, Cairo University, Egypt
78. Abdel Rahman A. Heleil, Faculty of Science at Beni Soueif, Cairo University, Egypt
79. Sahar Ibrahim, Institute for Statistical Studies and Research, Cairo University, Egypt
80. *Hassan N. Ismail, Benha High Institute, Egypt
81. *Mourad Ismail, University of South Florida, USA
82. Fatma Ismail, Faculty of Science, Cairo University, Egypt
83. Abdelouahab Kadem, Université Farhat Abbas, Algeria
84. M. E. Kahil, Faculty of Science, Cairo University, Egypt
87. Ahmed B. Khalil, Faculty of Science, Cairo University, Egypt
88. Zeinhom M. G. Kishka, Faculty of Science, South Valley University, Egypt
89. *S. R. Komy, Helwan University, Egypt
90. Abdel Monem Kozae, Faculty of Science, Tanta University, Egypt
91. *Tassilo Küpper, Universität Koeln, Germany
92. Gamal M. Mahmoud, Faculty of Science, Assiut University, Egypt
93. Mohamed Ezzat Mahmoud, Faculty of Science at Beni Soueif, Cairo University, Egypt
94. B. Merouani, Université Farhat Abbas, Algeria
95. I. F. Mikhail, Faculty of Science, Ain Shams University, Egypt
96. *John J. H. Miller, Trinity College, Dublin, Ireland
97. Aboul-Magd A. Mohamed, Faculty of Education, Ain Shams University, Egypt
98. Nahed Mokhlis, Faculty of Science, Ain Shams University, Egypt
99. Mohamed Adel M. Ali Mousa, Faculty of Science, Assiut University, Egypt
100. A. B. Morcos, National Research Institute of Astronomy and Geophysics, Egypt

101. Taher Mourid, Faculty of Science, University of Tlemcen, Algeria
102. *M. S. Narasimhan, ICTP, Trieste, Italy
103. A. Nasef, Faculty of Education in Arish, Suez Canal University, Egypt
104. Gamal G. L. Nashed, Faculty of Science, Ain Shams University, Egypt
105. *Aly Nayfeh, Virginia Polytech. Inst. and State University, USA
106. Valentina Nikoulina, Université Victor Segalen Bordeaux 2, France
107. *Mikhail Nikouline, Université Victor Segalen Bordeaux 2, France
108. *A.-S. F. Obada, Faculty of Science, Al-Azhar University, Egypt
109. G. Ochoa, Universidad Pública de Navarra, Spain
110. *Jacob Palis, IMPA, Brazil
111. N. Papadatos, University of Cyprus, Cyprus
112. *Claudio Procesi, Università di Roma La Sapienza, Italy
113. Sherif I. Rabia, Faculty of Engineering, Alexandria University, Egypt
114. Ahmed E. Radwan, Faculty of Education, Tanta University, Egypt
115. S. F. Ragab, Faculty of Engineering, Cairo University, Egypt
116. *Roshdi Rashed, CNRS, France
117. Effat A. Saied, Faculty of Science, Benha University, Egypt
118. A. G. Saif, Atomic Energy Establishment, Egypt
119. Samir Sallam, Kuwait University, Kuwait
120. Mohamed S. Selim, Faculty of Science, Al-Azhar University of Assiut, Egypt
121. Mohamed Sifi, École Supérieure des Sciences et Techniques de Tunis, Tunisia
122. Saad M. Sileem, Faculty of Science, Ain Shams University, Egypt
123. X. Soler-Escriva, Universidad Pública de Navarra, Spain
124. Laila Soueif, Faculty of Science, Cairo University, Egypt
125. *A. R. Sourour, University of Victoria, Canada
126. Khalaf S. Sultan, Faculty of Science, Al-Azhar University, Egypt
127. Hussein Sayed Tantawy, Faculty of Engineering, Al-Azhar University, Egypt
128. M. I. Wanas, Faculty of Science, Cairo University, Egypt
129. *Richard Wiegandt, Math. Institute, Hungarian Acad. of Science, Hungary
130. Afaf Zagrout, Faculty of Science (Girls), Al-Azhar University, Egypt
131. Adel Zaki, Faculty of Science, Helwan University, Egypt
132. Maher Zayed, Faculty of Science, University of Benha, Egypt
133. Nasr A. Zeyada, Faculty of Science, Cairo University, Egypt
