xt, etc. introduced near eqn. (1). Then from eqns. (10) $\phi_{xt} + c^{-1}\phi_{tt} = a\sin\phi$. By a change of independent variables and choice $c = 1$ this last equation is exactly the sine-Gordon equation eqn. (1) with $a = m^2$: ...

$\phi(x,t) = 4\tan^{-1}\exp\left[\pm m(x - Vt)(1 - V^2)^{-1/2}\right]$ (11)

and the + is a $2\pi$-kink which takes $\phi(x,t) = 0$ for $x \to -\infty$ to $\phi(x,t) = 2\pi$ for $x \to +\infty$; the − is the $2\pi$ antikink which takes $\phi(x,t) = 0$ at $x = -\infty$ to $-2\pi$ at $x \to \infty$. The s-G is both a Hamiltonian system and Lorentz covariant and the stationary kinks found for $V = 0$ both have the rest energy $8m\gamma_0^{-1}$ where $\gamma_0 > 0$ is a real valued coupling constant (see below). However, there is also the $0\pi$ bound kink-antikink pair, the so-called breather solution $\phi(x,t)$ ...

... $\phi_x(x,0)$ [10]) at $t = 0$ and $S(0)$ is equivalent to $\phi(x,0)$ in this sense. Likewise after evolution of $S(0) \to S(t)$ this can be inverted to regain $\phi(x,t)$ at time $t$. On the 'map' all of this last is marked as: "$U(x,t)$ solves integrable models in 1+1 dimensions" and $\phi(x,t)$ or $\phi_x(x,t)$ is an element of the matrix $U$. This procedure, spectral transform followed by inverse spectral transform, evidently constitutes a rather remarkable nonlinear Fourier transform method of solution [10,58]. Moreover, like the Fourier transform, the spectral transform and inverse spectral transform are a canonical and inverse canonical transform in the Hamiltonian sense. In practice the spectral data $S(t)$ at time $t$ are inverted via a Riemann-Hilbert problem: the one shown in the double lined box on the map is for the $2\times2$ matrix Zakharov-Shabat system [10] and is equation (2.5.21) in [61]: $R(k)$ is a scalar function of $k \in \mathbb{R}$, C = k is the real axis, and sub-indices 1,2 refer to the two elements $\Phi_1 = \Phi_1(x, k + i0)$, $\Phi_2(x, k + i0)$ of a column vector: super indices +, − refer to these elements at $k \pm i0$ and ...

... $|g,0\rangle$, $|i,1\rangle \to |i,1\rangle$, $|g,1\rangle \to e^{i\phi}|g,1\rangle$ with $\phi = \pi$ (and by detuning the '$2\pi$-pulse' $\phi$ can also be varied).
This QPG, combined with unitary rotations acting on each qubit separately, can produce any unitary two-qubit operation [92]. As an interesting example, achieved experimentally in [90], with the Ramsey interferometer switched off one performs $\pi/2$ rotations on the atomic qubit formed by a first atom (a 'source' atom) which initially enters the totally empty cavity in $|e\rangle$ but under Rabi nutation finally exits the cavity in the entangled state $|\Psi\rangle = (1/\sqrt{2})(|e,0\rangle + |g,1\rangle)$. ... $1/\sqrt{2}$, consistent with $\Delta N \Delta \ldots$ as is still demanded. A further suggestion in [9] is to use quantum solitons of the NLS equation as qubits! If this can make sense such qubits are easier both to create and to keep (without dissipation, compare [89]) in a fibre than are the high Rydberg atomic states even in very high Q cavities. Unfortunately one must ask at this stage 'Exactly what is a quantum soliton?'. For the quantum attractive NLS equation ($c < 0$) with $d = 1$, eqn. (28), one shows (see e.g. [7], and see the eqns. (53) and (54) below) that there are so-called $n$-string solutions in which $n$ is an eigenvalue of the number operator $N$: these quantum states are simultaneous eigenstates of $H$, $N$, $P$, the Hamiltonian, particle number and total momentum operators (and of a further infinity, for $d = 1$ and translational invariance, of such mutually commuting operators). If this exact $n$-string solution of the attractive NLS were to be considered the 'quantum soliton' of this system it could not be squeezed in particle number $n$ ...

... $\mathbb{R}^n$ denotes a global flow and let $x = X$ is continuous, ... $\phi = \phi(x, y, z, t)$ and the free surface elevation $\eta = \eta(x,y,t)$ ... $\phi(ab + ba) = \phi(a)\phi(b) + \phi(b)\phi(a)$ for all $a$ and $b \in A$, or equivalently ... be a unital bijective linear map from $C(X)$ onto $C(Y)$. Then the following conditions are equivalent. (a) $\phi$ preserves invertibility. (b) $\phi$ is a Jordan isomorphism. ... $[a, b]$, the algebra $A$ becomes a Lie algebra. In fact, it is a standard result (see, for e.g.
[Hu, Chapter V]) that every Lie algebra $L$ may be embedded as a Lie subalgebra of an associative algebra, the universal enveloping algebra of $L$, equipped with the product $[a, b]$. As usual, a Lie isomorphism is a bijective Lie homomorphism. We note that if $\alpha$ is an (associative) isomorphism or the negative of an anti-isomorphism from $A$ to $B$ and $\gamma$ is a linear map from $A$ into the centre of $B$, such that $\gamma(ab - ba) = 0$ for every $a$ and $b$ in $A$, then $\alpha + \gamma$ is a Lie isomorphism, provided it is injective. We may ask for sufficient conditions on algebras $A$ and $B$ for the converse to hold. ... $T_n(F)$ be a linear map. The following are equivalent. ... Then $\varphi$ is a Lie automorphism of $T_n(F)$ if and only if $\varphi$ takes one of the following forms: ... $A$ be a surjective additive mapping that preserves rank one matrices. Then $\varphi$ is a composition of some or all of the following maps:
(i) Left multiplication by an invertible matrix in $A$.
(ii) Right multiplication by an invertible matrix in $A$.
(iii) The map $C \mapsto C'$, induced by a field automorphism $a \mapsto a'$ of $F$.
(iv) The map f defined in 3.1 above, but only when m = 1.
(v) The map f defined in 3.2 above, but only when $n_k = 1$.
(vi) The transpose with respect to the antidiagonal $T \mapsto T^{+}$. This is present only when $A = A^{+}$, i.e., $n_j = n_{k-j+1}$ for every $j$.
COROLLARY 3.4 If s, $\max(s_1, s_2)$ ... Any function $a_{12}(t)$
Fig. 4. The 'phase plane' (phase space) for the nonlinear pendulum. The plots are $\dot\theta$ against $\theta$ and the separatrices are the trajectories on which $\theta = -\pi$ goes to $\theta = +\pi$ and $+\pi$ goes to $-\pi$.

$\phi(x,t) = 4\tan^{-1}\left[\tan\mu\,\sin\Theta_I\,\mathrm{sech}\,\Theta_R\right]$, $\Theta_R = (m\sin\mu)(x - Vt)(1 - V^2)^{-1/2}$, $\Theta_I = (m\cos\mu)(t - Vx)(1 - V^2)^{-1/2}$ (12)
for $\mu \in \mathbb{R}$, $0 < \mu < \tfrac12\pi$; the rest energy of this breather for $V = 0$ is $16m\gamma_0^{-1}\sin\mu$, less than the kink plus antikink rest energies (it is a bound pair). There is also the $4\pi$-kink solution which is a sum of two $2\pi$-kinks: for any $t$ the boundary conditions are $\phi(x,t) = 0$ for $x \to -\infty$ becoming
The sech function, shown in Fig. 6, is the generic 'soliton' (the single soliton solution for all AKNS systems [10,36]). Evidently the amplitude increases as $V$ increases in $0 < V < 1$ ($V \leftrightarrow Vc^{-1}$ with $c = 1$): accordingly 'bigger pulses travel faster' [50] and this 'explains' the collision property of two $2\pi$-kinks travelling in the same direction $x$: the bigger one overtakes the smaller one and re-emerges asymptotically at large enough $x$, exactly as though the bigger one has passed through the smaller despite the nonlinearity! The formula for the $4\pi$-kink given in Fig. 5 has exactly this property, with however the phase shifts as already explained: phase shifts are generic in 1+1 dimensions as already mentioned. Fig. 7 shows how the bigger $2\pi$-kink taken in generic sech form 'passes through' the smaller one: careful scrutiny also shows up the phase shifts after the collision. Notice that the 'area' of the pulse is $\int_{-\infty}^{\infty}(\partial\phi/\partial x)\,dx = 2\pi$ exactly. The kink is the $2\pi$-kink for this reason. Similarly the $4\pi$-kink is (asymptotically!) the simple sum of two $2\pi$-kinks and the $2n\pi$-pulse is the sum of $n$ $2\pi$-kinks. In this language the 'breather' is the $0\pi$ pulse and does not change any total area. Notice that in terms of 2-level atoms the $2\pi$-pulse rotates each atom from
Fig. 5. Model of coupled nonlinear pendulums in static equilibrium under gravity (above); and excited by a $4\pi$-pulse (below). The analytical expression is that for the $4\pi$-pulse of the s-G equation. Note the phase shifts $\Delta_1 = 2\ln a_{12}$ and $\Delta_2 = -\Delta_1$: $a_{12}$ is determined by $V_1$ and $V_2$, the speeds of the two $2\pi$-kinks involved.
Fig. 6. The generic hyperbolic secant pulse of the AKNS systems illustrated by eqn. (13).
Fig. 7. The 'bigger' $2\pi$-kink taken in generic sech form eqn. (13) overtakes the 'smaller' $2\pi$-kink taken in that generic form.
ground state to ground state: no energy is lost from the pulse to the atoms this way and the 'attenuator', which is a dielectric made up from a smooth distribution $n$ (say) of 2-level atoms each in their ground states initially, is actually transparent to such a pulse; indeed it is transparent to all $2n\pi$-pulses including $n = 0$, the breather. If the initial area of the pulse is not $2n\pi$, $n$ = integer positive or negative or zero, then the pulse re-shapes to an appropriate value of $n$! This then is the mathematics of the physical observations of 'self-induced transparency' or SIT ([6,7,34,47,48] and refs). Notice how we can move from the s-G to the SIT equations eqns. (9) themselves in this way. Notice too that this ability to 'rotate' a 2-level atom is in principle very general: a pulse of area $\theta$ (a '$\theta$ pulse') can rotate the Bloch vector of the 2-level atom by $\theta$. This ability to change the quantum state of a 2-level atom is at the heart of recent ideas to encode quantum information (see §5) on systems of 2-level atoms in cavities, an aspect of 'cavity q.e.d.' (see Figs. 9 and 10 below). The sech pulse eqn. (13) is roughly speaking an electric field pulse (because of the change of independent variables one needs both of d
$-i\phi_t = \phi_{xx} - 2c\,\phi^{*}\phi^{2}$ (14)

and the real valued parameter $c$ has $c < 0$. This system with $c < 0$ has a 2-parameter sech solution satisfying b.c.s. $\phi \to 0$,
(15)
with $q, V \in \mathbb{R}$. All of these solutions are of "breather" type (a sech envelope modulates a (complex) oscillatory term): like the s-G breathers these NLS solitons have velocity $V$ and amplitude $q$ (the two parameters) independent of each other (now 'bigger pulses do not travel faster!'): the breather of the NLS equation, eqn. (15), derives directly from the breather solution eqn. (12) of the s-G under a suitable SVEPA. By replacing $t$ by a spatial co-ordinate $z$ the NLS equation eqn. (14) describes stationary solutions for a nonlinear dielectric with third order non-linearity in the space $(x, z) \in \mathbb{R}^2$: these are the 'spatial optical solitons' appearing in the Fig. 2 already mentioned, and were first observed by C.H. Townes et al. in 1964 [51] and by P.L. Kelley in 1965 [52]. The first soliton to be observed was in the form of a 'bump of water' on the surface of a canal in 1834 [10,53]. The motion of this bump, which is a sech$^2$ bump not a sech like eqn. (12), is governed by the famous Korteweg-de Vries equation [10,53-55]: the KdV equation is (in a frame moving at the sound speed and suitably scaled)

$u_t + 6uu_x + u_{xxx} = 0.$ (16)
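As a quick numerical sanity check on eqn. (16), the sketch below evaluates $u_t + 6uu_x + u_{xxx}$ by central differences for the standard one-soliton profile $u = 2\xi^2\,\mathrm{sech}^2[\xi(x - 4\xi^2 t)]$. The function names, the step size `h`, the sample grid and the optional `speed` argument are illustrative choices made here and are not taken from the text. The residual should be small only when the speed equals $4\xi^2$, which is the sense in which taller KdV pulses travel faster.

```python
import math


def kdv_soliton(x, t, xi, speed=None):
    """sech^2 pulse of height 2*xi**2; speed defaults to the soliton value 4*xi**2."""
    if speed is None:
        speed = 4.0 * xi ** 2
    return 2.0 * xi ** 2 / math.cosh(xi * (x - speed * t)) ** 2


def kdv_residual(x, t, xi, speed=None, h=1e-3):
    """Central-difference estimate of u_t + 6*u*u_x + u_xxx at the point (x, t)."""
    def u(xx, tt):
        return kdv_soliton(xx, tt, xi, speed)

    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xxx = (u(x + 2 * h, t) - 2.0 * u(x + h, t)
             + 2.0 * u(x - h, t) - u(x - 2 * h, t)) / (2.0 * h ** 3)
    return u_t + 6.0 * u(x, t) * u_x + u_xxx


def max_residual(xi, speed=None, t=0.3):
    """Largest |residual| over a hypothetical sample grid -8 <= x <= 8."""
    xs = [-8.0 + 0.25 * i for i in range(65)]
    return max(abs(kdv_residual(x, t, xi, speed)) for x in xs)


if __name__ == "__main__":
    print("speed 4*xi^2 (soliton):", max_residual(1.0))
    print("speed 3*xi^2 (wrong):  ", max_residual(1.0, speed=3.0))
```

With the soliton speed the residual is at the level of the finite-difference error; with any other speed it is of the same order as the individual terms of the equation.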
The field $u(x,t)$, which arises as a velocity, replaces previous fields $\phi(x,t)$ and eqn. (16) is Galilean invariant if $u_t \to u_t + u_x$ and the number 6 is scaled away, as one checks. The soliton solution is $u = 2\xi^2\,\mathrm{sech}^2[\xi(x - 4\xi^2 t)]$ with $\xi \in \mathbb{R}$ and there are no breathers: bigger pulses travel faster for KdV, see e.g. [10,17,53]. Ref. [17] illustrates more than nine examples of soliton systems.

3. COMPLETE HAMILTONIAN INTEGRABILITY OF THE SOLITON SYSTEMS
All of the nine or so examples discussed in [17], and see e.g. [18,55,56] for still more, are completely integrable Hamiltonian systems and typically have actual soliton solutions. One completely integrable Hamiltonian system of the same type, a kind of repulsive s-G, is the sinh-Gordon (sinh-G) system which does not have soliton solutions. It takes the form

$\phi_{xx} - \phi_{tt} = m^2\sinh\phi$ (17)
and this sinh-G derives from the s-G by analytical continuation in the coupling constant $\gamma_0$, see below and [18]. All of these systems mentioned are solvable by the inverse scattering method under b.c.s. vanishing sufficiently fast at $x \to \pm\infty$: particularly the hierarchical sequence RMB, SIT, s-G and NLS of the Fig. 2 are all completely integrable Hamiltonian systems actually solvable via the $2\times2$ matrix inverse scattering method [10,56-58]. Here we consider complete integrability of the s-G equation which in covariant form is eqn. (1), with the b.c.s. $\phi \to 0$ (mod $2\pi$), $\phi_x \to 0$ 'fast enough' (e.g. exponentially as $x \to \pm\infty$). There is also the nonlinear evolution equation (NEE) form

$\phi_t(x,t) = m^2\sin\left[\int_{-\infty}^{x}\phi(x',t)\,dx'\right]$ (18)

under comparable b.c.s. The NEE eqn. (18) is for $\phi(x,t) = u_x(x,t)$ in

$u_{xt} = m^2\sin u$ (19)
which is the s-G in 'light-cone' coordinates. Takhtadjan and Faddeev used inverse scattering methods to demonstrate the complete integrability of the s-G equation eqn. (1) in a preprint of 1974 [57]. R.K. Dodd and I used these methods to demonstrate the complete integrability of the s-G in light-cone coordinates, also in 1974 (in an oral presentation at the British Theoretical Mechanics Colloquium in Manchester). However in print we first reported both versions together in [59], which appeared in 1977: a more complete analysis was given in [60]. As noted in §1 Liouville's theorem [15] says that for $N < \infty$ degrees of freedom, given $N$ independent constants $I_k$ commuting under the Poisson bracket {.,.}, i.e. $\{I_k, I_\ell\} = 0$, $k, \ell = 1, \ldots, N$, the system is completely integrable and can be integrated as a sequence of integrals. In [16] of 1974 Arnold shows that if the manifold of level lines, the set of $I_k$ = constant, is compact and connected the motions are diffeomorphic to a torus (an $N$-torus $T^N$) and this means action-angle variables can be found. The $N$ degrees of freedom define a symplectic manifold $M^{2N}$, smooth and differentiable, which carries the differential 2-form
$\omega = \sum_{i=1}^{N} dp_i \wedge dq_i$ in terms of local canonical coordinates $p_i, q_i$, $i = 1, \ldots, N$: the 2-form $\omega$ is the 2-form $\omega^{(2)}$ and $M^{2N}$ carries the forms $\omega^{(2)}, \omega^{(4)}, \ldots, \omega^{(2N)}$ in which the last is the phase volume. Each of these $\omega^{(2i)}$ is invariant under canonical transforms [16] and invariance of the phase volume $\omega^{(2N)}$ means the Jacobian of the canonical transformation is unity. For field theories like the KdV, RMB, SIT, s-G, and NLS equations one needs an invariant 2-form like

$\omega = \int [d\Pi(x,t) \wedge d\phi(x,t)]\,dx$ (20)
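over the running label $x$: the canonical coordinates are here $\Pi(x,t)$, $\phi(x,t)$ and satisfy the equal time Poisson bracket $\{\Pi(x,t), \phi(x',t)\} = \delta(x - x')$. The s-G Hamiltonian is

$H[\phi] = \gamma_0^{-1}\int\left[\tfrac12\gamma_0^2\Pi^2 + \tfrac12\phi_x^2 + m^2(1 - \cos\phi)\right]dx$ (21)

in which $\gamma_0 \in \mathbb{R}$ and $\gamma_0 > 0$. Under a trivial canonical transform plainly leaving $\omega$ eqn. (20) invariant, $\gamma_0^{1/2}\Pi \to \Pi$, $\gamma_0^{-1/2}\phi \to \phi$, this becomes

$H[\phi] = \int\left[\tfrac12\Pi^2 + \tfrac12\phi_x^2 + m^2\gamma_0^{-1}(1 - \cos\gamma_0^{1/2}\phi)\right]dx$ (22)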
a more usual form in the literature, and by expanding the cosine $\gamma_0$ is plainly seen as a coupling constant coupling the nonlinear terms to the linear ones. Note that as $\gamma_0 \to 0$ one actually gets the K-G (Klein-Gordon) linear equation.
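The expansion referred to here can be written out explicitly. The sketch assumes that the potential term of eqn. (22) is $m^2\gamma_0^{-1}(1 - \cos\gamma_0^{1/2}\phi)$, which is what the rescaling $\gamma_0^{1/2}\Pi \to \Pi$, $\gamma_0^{-1/2}\phi \to \phi$ gives when applied to a potential $\gamma_0^{-1}m^2(1 - \cos\phi)$; if the printed eqn. (22) is normalised differently the powers of $\gamma_0$ shift accordingly.

```latex
\frac{m^{2}}{\gamma_0}\Bigl(1-\cos\gamma_0^{1/2}\phi\Bigr)
  = \tfrac{1}{2}\,m^{2}\phi^{2}
  \;-\; \frac{\gamma_0\,m^{2}}{4!}\,\phi^{4}
  \;+\; \frac{\gamma_0^{2}\,m^{2}}{6!}\,\phi^{6}
  \;-\;\cdots
```

Only the quadratic (Klein-Gordon mass) term survives as $\gamma_0 \to 0$, and every nonlinear term carries a positive power of $\gamma_0$.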
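Moreover by the continuation $\gamma_0^{1/2} \to i\gamma_0^{1/2}$ in eqn. (22) one gets sinh-G, eqn. (17) (for which $\gamma_0 \to 0$ also gives K-G). To be explicit, from the form $H[\phi]$ eqn. (21)

$\phi_t = \{H, \phi\} = \gamma_0\Pi \quad (= \delta H/\delta\Pi)$ (23)

$\Pi_t = \{H, \Pi\} = \gamma_0^{-1}[\phi_{xx} - m^2\sin\phi] \quad (= -\delta H/\delta\phi)$ (24)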
and $\phi_{tt} = \gamma_0\Pi_t$ is exactly eqn. (1), since $\gamma_0$ vanishes from these particular Hamiltonian equations. As in [18,56-60] $H[\phi]$ can be expressed as $H[p]$ in terms of action-angle variables (under the chosen vanishing b.c.s. at infinity; ref. [18], which is concerned to provide a quantum and classical statistical mechanics of solitons, shows how to connect these to action-angle variables under periodic b.c.s.). For laboratory coordinates eqn. (1) the $H[p]$ takes the elegant form

$H[p] = \sum_{i=1}^{N_k}(M^2 + p_i^2)^{1/2} + \sum_{j=1}^{N_{\bar k}}(M^2 + \bar p_j^2)^{1/2} + \sum_{\ell=1}^{N_b}[4M^2\sin^2\theta_\ell + p_\ell^2]^{1/2} + \int_{-\infty}^{\infty}\omega(k)P(k)\,dk; \quad \omega(k) = (m^2 + k^2)^{1/2}.$ (25)
The quantity $M$ is exactly the rest mass of the single soliton solution: $M = 8m\gamma_0^{-1}$, and it can be seen that the two first summations are apparently for $N_k$ kinks and $N_{\bar k}$ anti-kinks each with relativistic momenta $p_i$ or $\bar p_j$. The third sum apparently involves $N_b$ breather solutions, eqn. (12), of s-G eqn. (1) (the numbers $N_k$, $N_{\bar k}$, $N_b$ are fixed by the initial data): the parameters $\mu_\ell$ for each breather eqn. (12) appear now rewritten as the canonical momenta $\theta_\ell$ and $0 < \theta_\ell < \pi/2$. Note that the $P(k)$ in eqn. (25) have the running label $k$ and the $p_i$, $\bar p_j$, $p_\ell$, $\theta_\ell$, and $P(k)$ together satisfy the requirement that there be $2^{\aleph_0}$ action variables: there are also canonical angle variables for each action variable (which do not of course appear in $H[p]$). The phase spaces are indeed
i. $P(k) > 0$, $0 < Q(k) < 2\pi$; $\{P(k), Q(k')\} = \delta(k - k')$.
ii. $-\infty < p_i, q_i < \infty$, $\{p_i, q_j\} = \delta_{ij}$ (not compact).
iii. $-\infty$ ... (not compact).
v. $0 < \theta_\ell < \tfrac12\pi$, $0 < \Phi_m < 8\pi$, $\{4\gamma_0^{-1}\theta_\ell, \Phi_m\} = \delta_{\ell m}$.
Note how these $2^{\aleph_0}$ canonical pairs $P(k), Q(k)$ together with $N_k$ $p_i, q_i$, $N_{\bar k}$ $\bar p_i, \bar q_i$, $N_b$ $p_\ell, q_\ell$ and $4\gamma_0^{-1}\theta_\ell, \Phi_\ell$ together define a torus of at least $2^{\aleph_0}$ dimensions, and notice how the torus defined by the $p_i, q_i; \bar p_i, \bar q_i; p_\ell, q_\ell$ is opened up and so is not compact. Indeed the $P(k), Q(k)$ define a volume which is a cylinder for each $k$ and this part of the phase space is the large product of each such cylinder: the cylinders are opened into sheets for the $p_i, q_i$, etc. Evidently because this large dimensional torus is formed as the direct product of "more than" $2^{\aleph_0}$ separate one-tori, this means in physical terms that the relativistic "particles" apparent in $H[p]$ eqn. (25) are non-interacting particles: in practice in the statistical mechanics of s-G as presented in [18] (and in e.g. [55] and the other refs. in
[Fig. 8 artwork not reproduced: the 'map' is a chart of boxes joined by arrows. Legible box labels include: Lax Pair; Spectral Problem; Spectral Data S(0); Riemann Problem; Marchenko Equation; Riemann theta-functions; K-P Equations; Loop Algebras; Affine (Kac-Moody) Lie algebras; Virasoro Algebras; Bose-Fermi Equivalence; Hard Hexagon; Potts Model; Ising Model; Conformal Field Theory; Theory of Partitions (Wadati); Sklyanin Bracket and r-matrix; Action-Angle Variables; Partition Function; Braid Group; q-Bosons; Hopf Algebra; Quantum Groups; Quantum Integrable Models; Quantum Spin-1/2 XYZ model; 8-Vertex Model; Solvable Lattice Models; Jones Polynomials; Chern-Simons 3-form.]

Fig. 1 (inset). Overview of generalised 'Soliton' theory as of August 1991 taken from Refs. [7,8]. A hard arrow indicates minimal connection (at least) between the boxes is already established and most hard arrows are actual mappings. Dashed arrows indicate expectation by the authors that some such minimal connection can be achieved or stronger. Note how p-reduction of the KP equations reaches the string equation for 2D-Quantum Gravity coupled to (p,q) conformal matter (72,81). Pure quantum gravity is p = 2, the case considered by Migdal [50].

Fig. 8. The 'map' "Solitons" in the form published in [19] and used in [7]: the inset 'Fig. 1' etc. is that as used in [19,48] and the references [7,8] etc. are references in [19] as explained in more detail in the text.
[18]) one finds instead that under the periodic boundary conditions necessarily used there, the consequent phase shifts describe a highly significant interaction between these particles. To show that this total count of action-angle pairs (or 1-tori) is sufficient to integrate the system one observes that for example $P(k)$ = constant but $Q_t(k) = \omega(k)$; and from the $P, Q$ at time $t = 0$ one finds all of these at $t = t$ and so inverts these $P, Q$ at time $t$ to the original variables $\Pi, \phi$ at time $t$. The treatment of the solitons works similarly and these likewise contribute to the $\Pi, \phi$ at time $t$. To find the $P(k), Q(k)$ etc. at $t = 0$ one uses the "Lax pair" for s-G [10,56-58,60,61]. One part of this pair is typically an eigenvalue (or spectral) problem $Lv = \zeta v$ (in which $L$ is a $2\times2$ matrix differential operator and $\zeta \in \mathbb{C}$ [10]). It is through $L$ that one converts the initial data for
There was an error in the 'map' in [67], namely that in the 'box' "Symmetries of KP-I and KP-II..." the relation [Km,Tn] = ^(m + l ) ^ m + n - 2 and n —• 1, not n.
One speaks of the eigenvalue problem $Lv = \zeta v$ for $2\times2$ matrices, or that for $n\times n$ matrices more generally, as a scattering (or spectral) transform which transforms initial data at $t = 0$ to a suitably complete set of spectral data $S(0)$. 'Suitably complete' means $S(0)$ can be inverted back via the inverse spectral transform to regain ...

$(u_t + 6uu_x + u_{xxx})_x = \pm u_{yy}$ (26)
and the KP-I, KP-II belong to the + and the − respectively. The solution of KP-I and KP-II is carried further in Caudrey's paper [62]. Moreover action-angle variables for these KP equations are in [20,22] while they were also given
in [22] for the Davey-Stewartson system: the Davey-Stewartson [61] system is the DS on the 'map' Fig. 8 inside the KP 'box'. However it has turned out that the solved DS-I system, solved via a Riemann-Hilbert problem [22], is actually not Hamiltonian (while the Hamiltonian form of DS-I is not yet solved). Thus the status of the "action-angle" variables for DS-I in [22] is still open. Notice that the geometry of solitons already mentioned [10-12] also connects, in the spirit of Atiyah's remarks, to much algebra evident on the 'map' Fig. 8. Thus the loop algebras in the third column from the left in Fig. 8 connect with the quantum groups, which are Hopf algebras [30,31] with the co-multiplication $\Delta T = T \otimes T$ already briefly explained, in the second column from the right, and these quantum groups connect with the quantum inverse method or algebraic Bethe ansatz [24] in the third column from the right. My paper [30] illustrating this route and the algebra of quantum groups is rather incomplete because it was quite an early paper on this topic (first presented in 1988; Drinfeld's Berkeley article is 1986 [32]): following up the remark of Atiyah, the word 'quantum' is appropriate because, as Fig. 8 shows, the quantum groups lead to the quantum inverse method [24] for solving quantum integrable systems: this method involves an R-matrix, as is displayed in the third column from the right, a solution of the Yang-Baxter relations, also in that column. The semi-classical limit of $R$ is the little r-matrix which enters into the Sklyanin bracket as at the top of the third column from the right of Fig. 8: this defines the Poisson-Lie group Hopf algebra structure traceable back to Drinfeld 1983 [69]. The key expression $R\,T \otimes T = T \otimes T\,R$ in the quantum inverse method 'box' in the third column from the right in Fig. 8 is a quantum integrability condition as mentioned in §1: more explicitly [18]

$R(\lambda,\mu)\,T(\lambda) \otimes T(\mu) = T(\mu) \otimes T(\lambda)\,R(\lambda,\mu)$ (27)
and $\lambda, \mu \in \mathbb{C}$. The matrix trace of eqn. (27) yields $[A(\lambda), A(\mu)] = 0$ with $A(\lambda) = \mathrm{Tr}\,T(\lambda)$: the $A(\lambda)$ ($A(\mu)$) are indeed a very large number of commuting constant quantum operators! In our paper [70], for example, which solves the Tavis-Cummings problem of quantum optics for $N > 1$ 2-level atoms and one e.m. field mode, only three independent constant operators are actually involved and the significance of the infinite number of constants deriving from eqn. (27) is obscure (to me). As noted the 'map' Fig. 8 was first presented at the 18th Intl. Meeting on Differential Geometric Methods in Theoretical Physics held at Chester, UK in 1988: there it was intended to illustrate the use of the Riemann-Hilbert methods for the inverse spectral transform, and to illustrate some of my work with J.T. Timonen, Finland, on the quantum and classical statistical mechanics of the integrable models [67], particularly that on the SM of the sine-Gordon field theory in 1+1 dimensions summarised later in [18] and its references. This is how the Partition Function $Z = \int \mathcal{D}\mu \exp S[p]$ comes into the 'map' Fig. 8 via the Riemann surface of genus infinity in which the classical action $S[p]$ is expressed in terms of action-angle variables under periodic b.c.s., that is [18] the
canonical invariant $S[\Pi, \phi]$, the classical action expressed essentially as the integral invariant of Poincaré-Cartan [16] as $S[\Pi, \phi] = \int \Pi$
Fig. 9. Extension of the 'map', Fig. 8, expanding on the EXPERIMENTS 'box'. Notice that in what is the extreme left hand column here 'e.g. s-G, MTM, NLS' (for sine-Gordon, massive Thirring model, Nonlinear Schrödinger equation respectively) is extended to include 'Maxwell-Bloch MB', 'optical solitons' referring to MB as well as attractive NLS. Cavity q.e.d., developed further for 'quantum information' in section 5, now enters in this Fig. 9 only to 'Micromaser', but there via the quantum integrable Tavis-Cummings model [70] of $N$ 2-level atoms and one cavity mode. One example of the T-C models is the Jaynes and Cummings model, eqn. (50), fundamental [71-74] to the models of the micromaser. Nonlinear dielectrics [44,45] enter at the right.
Fig. 10. The EXPERIMENTS of Figs. 8, 9 are extended further. 'Non-classical light' enters as 'squeezing' top right and 'sub-Poissonian photon statistics from the micromaser' [40,72-75]: the purple reference to 'coherent states = solitons' refers to the coherent states of arbitrary semi-simple Lie algebras, which are typically squeezed, e.g. su(2), atomic coherent states [75]. There is the correspondence between such coherent states and integrable systems noted in [26] mentioned in the text. Both the 'q-deformed bosons' [23,29] and the 'quantum repulsive Bose gas' lead naturally to 'BEC', appearing at the bottom, as analysed in the parabolic trap in section 4: 'Gross-Pitaevsky' (Pitaevskii) is a self-consistent approximation, in c-number form, to the exact quantum theory of BEC [85] and agrees with much experimental data. A first exact (in the scaling limit) calculation of the correlation function $\langle\psi^\dagger\psi\rangle$ of section 4 for the repulsive Bose gas and $d = 1$ which uses the q-bosons on a lattice is in [107]: the correlator for the repulsive Bose gas is in [34].
fluctuation ... in $q$ which is $\Delta q$ is squeezed below the Heisenberg bound, $\Delta q < (\tfrac12\hbar)^{1/2}$, not difficult to do mathematically but quite difficult to do for physical systems [75]. In optical fibres the electric field pulse, a solution of the NLS equation, can take the general classical form of eqn. (15). But viewed as a quantum object (a "quantum soliton"?) it will be a solution of the quantum NLS equation in 1+1 dimensions

$-i\phi_t = \phi_{xx} - 2c\,\phi^\dagger\phi^2$ (28)

for which there are the Bose commutation relations $[\phi,$
(BEC)
The discovery by S.N. Bose in 1924 that thermally excited photons satisfied "Bose statistics" when seen as a problem of the quantum statistical mechanics of massless particles led Albert Einstein to introduce the corresponding Bose-Einstein statistics for the massive particles which are now called massive bosons. It was realised that at low enough temperatures, less than or much less than a few kelvin, a phase transition should occur in which macroscopic numbers of these bosons should collapse into one single quantum state of lowest (free) energy. As we now know this is in contrast with the 'fermions', no two of which can occupy the same quantum state. In 1938 F. London suggested that the peculiar 'superfluid' properties of liquid $^4$He below $T \approx 2.2$ K were an actual manifestation of just such a Bose-Einstein Condensate (a BEC). In 1947 [76]
N.N. Bogoliubov (NNB) suggested that such BECs would be well described by a weakly interacting gas of bosons interacting through a "hard-core" potential in pairs, and only in pairs, of the type $c\delta(r_1 - r_2)$ in which $\delta(r_1 - r_2)$ is the 3-dimensional ($d = 3$) $\delta$-function and $c$ is a small parameter, evidently the boson-boson coupling constant. NNB showed in effect that $c > 0$ for stability of the system [77], so the pair potential is repulsive hard core. Since for $N$ bosons the total pair potential is $c\sum_{i=1}^{N}\sum_{j=1, j\neq i}^{N}\delta(r_i - r_j)$, each pair introduces the potential $2c\delta(r_i - r_j)$. Huang [78], for example, describes the properties of this repulsive 'Bose gas'. In 1947 NNB calculated the particle energy spectrum of such a weakly interacting repulsive Bose gas below the critical temperature $T_c$ for any BEC and showed that in this 'condensate', although the individual bosons had the free-particle kinetic energies $p^2/2m$ ($p = |\mathbf{p}|$ is the magnitude of the particle momentum), the collective excitation behaved as though each 'particle' of that collective excitation had a single particle kinetic energy $\sqrt{N_0\nu(0)/mV}\,p$ in which the constants in the square root are introduced below. From this result for the collective excitations of the condensate he was able to derive superfluidity, an interpretation depending critically on the $p^2 \to |\mathbf{p}| = p$ behaviour [79]. It is well known that for relative simplicity of 'many-body problems' involving many interacting particles it is helpful to work in so-called second quantisation [80]: one introduces for Bose systems two quantum fields $\hat\psi, \hat\psi^\dagger$ ($\hat\psi^\dagger$ is the adjoint of $\hat\psi$) which satisfy the equal time "Bose commutation relations"

$[\hat\psi(r,t), \hat\psi^\dagger(r',t)] = \hbar\,\delta(r - r')$ (29)
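in which $r, r'$ are vectors in $d$ dimensions. For the particular $2c\delta(r - r')$ interaction between two bosons at $r$ and $r'$ Schrödinger's linear time dependent equation can be rewritten exactly, in second quantised form, as the nonlinear equations

$i\hbar\,\partial\hat\psi/\partial t + (\hbar^2/2m)\nabla^2\hat\psi - 2c\,\hat\psi^\dagger\hat\psi\hat\psi = 0$, $\quad -i\hbar\,\partial\hat\psi^\dagger/\partial t + (\hbar^2/2m)\nabla^2\hat\psi^\dagger - 2c\,\hat\psi^\dagger\hat\psi^\dagger\hat\psi = 0$ (30)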
(note the 'normal order' in the $\hat\psi^\dagger\hat\psi\hat\psi$, etc.). Reference to eqn. (28) shows that eqns. (30) with (29) for $\hbar = 1$, $m = \tfrac12$ are exactly the $d$-dimensional quantum NLS equation: $\nabla^2 = \sum_{i=1}^{d}\partial^2/\partial x_i^2$, the Laplacian in $d$ dimensions. If we replace the quantum fields $\hat\psi, \hat\psi^\dagger$ by classical fields $\psi, \psi^*$ satisfying the Poisson bracket $\{\psi(r,t), \psi^*(r',t)\} = i\delta(r - r')$ in $d$ dimensions (so that $i\hbar^{-1}[.,.]$ is replaced by {.,.}, the reverse of Dirac's canonical quantisation, so that this 'semiclassical limit' is formally singular in $\hbar$), then eqns. (30) can be seen to be exactly the classical NLS equation, in however $d$ space dimensions. Unfortunately, although the NLS systems are completely integrable and solvable in $d = 1$ dimensions (whether $c > 0$ or $c < 0$) these equations are no longer completely integrable in $d = 2$ or $d = 3$ space dimensions: there are not 'enough' conserved quantities in these cases (§1). Under translational invariance the total linear momentum

$P = -i\hbar\,\tfrac12\int\left[\hat\psi^\dagger\nabla\hat\psi - (\nabla\hat\psi^\dagger)\hat\psi\right]d^d r$ (31)
for the classical $\psi, \psi^*$ (called c-numbers by Dirac) commutes with the Hamiltonian under the bracket {.,.}. For the quantum fields $\hat\psi, \hat\psi^\dagger$ under the Lie bracket eqn. (29) this likewise remains true. For a quantum Bose condensate with zero macroscopic momentum $P$ NNB exploited the idea that a macroscopic number of massive bosons were now in the $p = 0$ mode. In momentum space NNB's Hamiltonian was [81]

$H = \sum_p \frac{p^2}{2m}\,a_p^\dagger a_p + \frac{1}{2V}\sum_{p_1+p_2=p_1'+p_2'}\nu(p_1 - p_1')\,a_{p_1}^\dagger a_{p_2}^\dagger a_{p_2'} a_{p_1'}$ (32)
in which $\nu(p)$ is the Fourier transform to $p$-space of the pair potential $v(r - r')$: evidently $\nu(p) = \nu(0)$ and becomes a constant for the chosen $2c\delta(r - r')$ interaction. To solve this system (to a good approximation) NNB's crucial move was to exploit the supposedly very large number $N_0$ of particles in the $p = 0$ state. In eqn. (32) $a_p, a_p^\dagger$ are Bose operators with commutation $[a_p, a_{p'}^\dagger] = \delta_{pp'}$. NNB replaced these by

$b_p = a_0^\dagger N_0^{-1/2} a_p, \quad b_p^\dagger = a_p^\dagger N_0^{-1/2} a_0$ (33)

in which $a_0, a_0^\dagger$ commute as c-numbers. This way [81] he gets $H = H_0 + H_{int}$,

$H_0 = \sum_p \frac{p^2}{2m}\,b_p^\dagger b_p, \quad H_{int} = \frac{N_0^2}{2V}\nu(0) + \frac{N_0}{2V}\sum_p \nu(p)\,(b_p^\dagger b_{-p}^\dagger + b_p b_{-p} + 2b_p^\dagger b_p) + H',$

in which $H'$ is of third and fourth degree in the $b_p, b_p^\dagger$. The bilinear form in $b_p, b_p^\dagger$ was diagonalised under a 'Bogoliubov-Valatin' transformation, and this way NNB found his energy spectrum as $\sqrt{N_0\nu(0)/mV}\,p$, linear in $|\mathbf{p}| = p$ as already mentioned. Notice that $\nu(0)$ must be positive so that $c > 0$ (repulsive case) for stability [77]. Subsequently, among much theoretical work, L.D. Faddeev and V.N. Popov [82] and then V.N. Popov [83,84] developed functional integral methods in order to calculate, in particular, the 2-point correlators

$G(r,r') = \langle\hat\psi^\dagger(r)\hat\psi(r')\rangle.$ (34)
The $\langle\ldots\rangle$ means thermal average at finite temperatures $T > 0$, and this is introduced by using Wick rotated time $t \to -i\tau$. One finds [18] that $0 < \tau < \beta$, $\beta = (k_BT)^{-1}$ ($k_B$ = Boltzmann's constant), and the classical action used in the functional integral is periodic of period $\beta$. In fact $G(r,r')$, eqn. (34), becomes independent of 'time' $\tau$, and $G(r,r')$ as written indicates this fact [85].
Under translational invariance eqn. (34) for $G$ depends only on the vector $r - r'$, not on the vectors $r, r'$ separately. Thus the calculations are advantageously carried out in terms of the $a_p, a_p^\dagger$ in momentum space. An important point is that the volume $V$ (appearing in $H$, eqn. (32)) is large, with periodic boundary conditions, such that one can take the finite density limit. This limit is such that, e.g., for a total number $N$ of particles, $N/V \to n > 0$ as the periods in $d$ dimensions go to infinity so that $V \to \infty$. This finite density limit ensures that the large-$V$ behaviours scale as $V$, a condition for the thermodynamics derivable from the thermal average $\langle\ldots\rangle$. Translational invariance is thus achieved under these finite density conditions (see discussions in [18]). In momentum space for $T < T_c$ one finds [84] (see pp. 26, 28) that the following shift transformation is convenient: $\psi(r,\tau) \to \psi(r,\tau) + \alpha$, $\psi^\dagger(r,\tau) \to \psi^\dagger(r,\tau) + \alpha^*$, with $\alpha, \alpha^* \in \mathbb{C}$. This way one shifts $a_p, a_p^\dagger$ as

$$a_p = b_p + \alpha(\beta V)^{1/2}\delta_{p,0}, \qquad a_p^\dagger = b_p^\dagger + \alpha^*(\beta V)^{1/2}\delta_{p,0}. \qquad (35)$$
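Then one finds $\langle b_0\rangle = \langle b_0^\dagger\rangle = 0$ while

$$\langle a_0\rangle = \alpha(\beta V)^{1/2}, \qquad \langle a_0^\dagger\rangle = \alpha^*(\beta V)^{1/2}, \qquad \langle a_pa_p^\dagger\rangle = \langle b_pb_p^\dagger\rangle + \beta V|\alpha|^2\delta_{p,0}. \qquad (36)$$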
The action $a|\alpha\rangle = \alpha|\alpha\rangle$, $\alpha \in \mathbb{C}$, is a property of the so-called Glauber 'coherent state' $|\alpha\rangle$: $\langle\alpha|a^\dagger = \alpha^*\langle\alpha|$. Moreover $\langle\alpha|a^\dagger a|\alpha\rangle = |\alpha|^2$, $\langle\alpha|a^\dagger a^\dagger aa|\alpha\rangle = |\alpha|^4$, etc. To this extent the condensate in the $p = 0$ mode is in a Glauber coherent state [86]: it is important that there is also 'off diagonal long-range order' [7], namely $\langle a_0\rangle$, $\langle a_0^\dagger\rangle$, two 'order parameters', are both non-vanishing. Unfortunately the reality in the recent (1995) experiments producing BEC in the metal vapours $^{23}$Na, $^{87}$Rb, $^{7}$Li [85] is that these vapours are cooled to micro-Kelvin temperatures $T$ using magneto-optical traps. Then, held in a magnetic trap, evaporative cooling of the gas is achieved down to $T \sim 10^{-8}$, $10^{-9}$ K, when the BEC occurs. To a good approximation magnetic traps introduce harmonic potentials into the quantum theory of the general form [85]

$$V(r) = \tfrac12 m\left(\Omega_1^2x_1^2 + \Omega_2^2x_2^2 + \Omega_3^2x_3^2\right) \qquad (37)$$
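(for $r = (x_1, x_2, x_3)$). Such traps break translational invariance, and we have been obliged [85] to re-work all of the functional integration theory to accommodate this fact. It no longer pays to work in momentum space. We work with the fields $\psi(r,\tau)$, $\psi^\dagger(r,\tau)$ directly and derive the following for each of $d = 3$, $d = 2$, $d = 1$:

$$G(r,r') = \sqrt{\rho_0(r)\rho_0(r')}\,\exp\left[R^{-1}L_3\right], \qquad L_3 = (m/8\pi\hbar^2\beta)\left[\rho_0(r)^{-1} + \rho_0(r')^{-1}\right]; \qquad (d = 3) \qquad (38)$$

$$G(r,r') = \sqrt{\rho_0(r)\rho_0(r')}\,R^{-L_2}, \qquad L_2 = (m/4\pi\hbar^2\beta)\left[\rho_0(r)^{-1} + \rho_0(r')^{-1}\right] \qquad (d = 2); \qquad (39)$$

$$G(r,r') = \sqrt{\rho_0(r)\rho_0(r')}\,\exp\left[-RL_1^{-1}\right], \qquad L_1^{-1} = (m/4\hbar^2\beta)\left[\rho_0(r)^{-1} + \rho_0(r')^{-1}\right] \qquad (d = 1). \qquad (40)$$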
in which $R = |r - r'|$. To the approximations of the argument as developed so far the results equivalently involve $\exp[(\nu/2\pi\beta\rho_0(s))R^{-1}]$, $\exp[-(\nu/\pi\beta\rho_0(s))\ln R]$, and $\exp[-(\nu/\beta\rho_0(s))R]$, for $d = 3, 2, 1$ respectively, in which $\nu = m/2\hbar^2$, $2s = r + r'$ (see footnote 3), and the results are thus described in terms of $\tfrac12(r + r')$, a 'centre of mass', and $r - r'$ (translation against the centre of mass) [85]. Of course the real point is that these correlations in $d = 3$, 2 and 1 no longer depend on $r - r'$ alone! The density $\rho_0(r)$ is the density of the condensate and it proves to be the negative of the potential eqn. (37), cut off at a particular value where $r$ reaches the vector $R_c = (R_{c1}, R_{c2}, R_{c3})$ [85]. Notice that from eqns. (38), (39), (40) only for $d = 3$ is $G(r,r')$ long range: evidently for $R$ large,

$$\exp\left[R^{-1}L_3\right] = 1 + L_3/R + O(R^{-2}) \qquad (41)$$
and the 'one' means $G(r,r') \sim \sqrt{\rho_0(r)\rho_0(r')}$ with $\rho_0(r)$ described by the inverted paraboloid $-V(r)$, eqn. (37), cut off for $r = R_c$ (see footnote 3) as indicated above. In the translationally invariant theory $\rho_0(r) = \rho_0$ = constant and $G \sim \rho_0$, simply. On the other hand for $d = 2$, $d = 1$ there is no longer any such long range behaviour, and in this sense there is no condensation for $d = 2$ and $d = 1$ dimensions in the translationally invariant case. When the trap is present and translational invariance is broken the expressions eqns. (38)-(40) again indicate that there is a condensate only for $d = 3$. Notice too that for $d = 3$ in the presence of the trap the "first order coherence function"

$$G^{(1)}(r,r') = G(r,r')/\sqrt{\rho_0(r)\rho_0(r')} \sim 1 \qquad (42)$$

for, but only for, large enough $R$. Eqn. (42) is indeed with the trap present (and $d = 3$), and one can guess (but this is still to be demonstrated) that the $n$th order coherence function

$$G^{(n)}(r_1, r_2, \ldots, r_{2n}) = G(r_1, \ldots, r_{2n})/\sqrt{\rho_0(r_1)\rho_0(r_2)\cdots\rho_0(r_{2n})} \sim 1 \qquad (43)$$
for large enough joint separations between each pair of the test points $r_1, \ldots, r_{2n}$, and in this sense (only) in the trap the Bose condensate is in a form of quantum coherent state (coherent states have the property that (a) all coherence functions are unity; and (b) $\langle\psi\rangle$, $\langle\psi^\dagger\rangle \neq 0$, with the condition (b) demonstrating the "off diagonal long range order"; compare with eqns. (36) above).

Footnote 3: In general I use $r, r'$ for vectors in $d$ dimensions, but bold type may also be used to emphasize the vector character, as in $R_c$ also.
Footnote 4: Using 'coherent' here to mean in its more general non-technical usage.

These particular coherent quantum states are indeed very coherent (see footnote 4), and it is paradoxical (perhaps) that since the relevant c-number NLS equations (necessarily
including the trap terms $V(r)\psi$, $V(r)\psi^*$) are not integrable (not integrable for $d = 3$, $d = 2$ anyway, but not even integrable for $d = 1$ because the momentum $P$ no longer commutes with $H$), they are necessarily "Hamiltonian chaotic". The quantum theory of BEC which yields the quantum coherent state picture (for $d = 3$ only) is thus "quantum chaotic" in the sense that the semiclassical limits of the theory (the c-number theory) are classical chaotic! Even so one expects to build an atom-laser successfully during the early years of this Millennium, the analogy with the single mode photon laser being that well above threshold the photons in the laser cavity are in a Glauber coherent state! Since the condensate is held in the magnetic trap through the quantum spin state of the condensate, and this spin can be flipped by an RF field, condensate can fall under gravity at the point of spin flip: the outcoming condensate can then be seen as a real atom-laser [87] for periods ~ milliseconds (only) before the trap is emptied. Evidently this real atom-laser still needs its pumping mechanism. Very recently [88] experiments have been done which actually measure $\langle\psi^\dagger(r)\psi(r')\rangle$ at $T \sim 300$ nano-Kelvin. One clear observation is the $L_3R^{-1}$ fall-off to the long range condensate behaviour predicted by eqn. (41) (for the experiments see Fig. 4 of ref. [88], in particular the curve at $T = 310$ nK). However, now notice the extra effects of the trap: the 'correlation length' $L_3$ for $d = 3$ actually depends on both $r$ and $r'$ (under translational invariance $L_3$ does not depend at all on $r, r'$).
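The contrast drawn above between $d = 3$ and $d = 2, 1$ can be illustrated numerically. The following is a minimal sketch only, assuming the exponential forms quoted below eqn. (40) with the density taken as locally constant; the name `kappa` (standing for $\nu/\beta\rho_0(s)$) and its value, and the arbitrary units of $R$, are hypothetical choices made for illustration and are not taken from [85] or [88].

```python
import math

def coherence_factor(R, kappa, d):
    """G(r,r') / sqrt(rho0(r) rho0(r')) as a function of separation R.

    kappa stands for nu / (beta * rho0(s)); its value here is illustrative only.
    """
    if d == 3:
        return math.exp(kappa / (2.0 * math.pi * R))      # exp(L3 / R)
    if d == 2:
        return math.exp(-(kappa / math.pi) * math.log(R))  # power-law decay
    if d == 1:
        return math.exp(-kappa * R)                        # exponential decay
    raise ValueError("d must be 1, 2 or 3")

def large_R_expansion(R, kappa):
    """First two terms of eqn. (41): 1 + L3/R, with L3 = kappa / (2 pi)."""
    L3 = kappa / (2.0 * math.pi)
    return 1.0 + L3 / R

kappa = 0.3  # hypothetical value, arbitrary units
for R in (10.0, 1.0e3, 1.0e6):
    row = [coherence_factor(R, kappa, d) for d in (3, 2, 1)]
    print(R, row, large_R_expansion(R, kappa))
```

Under these assumptions only the $d = 3$ factor tends to a non-zero constant (unity) at large $R$, while the $d = 2$ and $d = 1$ factors decay to zero.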
Since to the approximations of the theory $L_3 = \nu/2\pi\beta\rho_0(s)$ with $s = \tfrac12(r + r')$ as explained, we currently look for this small factor in the data already reported in [88]: this data is transversely averaged data so that $G(r,r') \to G(z,z')$ with $z$ the long axis of the trap [88], and it may be necessary to perform further experiments to check out the existence of this small, but fundamentally important, term arising solely from the breakdown of translational invariance in the magnetic trap. Observe that this feature strictly speaking destroys the scaling as volume $V$ at finite density so that, strictly speaking, a 'new thermodynamics' is involved in these calculations. Of course the new 'device', this atom-laser, must still be shown to have any technological future; but this was true of the first photon-maser (first successfully built in 1954 [7]). Otherwise the BEC system is a beautiful example of the quantum NLS equations realised in an experiment. Notice the problem posed by gravity for the massive bosons of any atom-laser: this was not a problem of the photon laser. Note finally [7] that BEC for $^{7}$Li in a one-dimensional ($d = 1$) magnetic trap is under experimental investigation: $^{7}$Li forms an attractive ($c < 0$) BEC and it may be possible to see 'quantum solitons' of the $d = 1$ quantum NLS equation in this system. Quantum solitons are a theme of the next section, §5.

5. QUANTUM INFORMATION
At this stage of its evolution 'quantum information' systems may or may not prove to involve any solitons. The essential point is that the soliton of the NLS equation under translational invariance (and d = 1) is, strictly speaking, a quantum object (a 'quantum soliton'?) which acts as a 'one-bit' of information
in an optical fibre. In this sense it is one 'qubit' of quantum information [9,35,89]. In [9] it was suggested that quantum solitons are easier to realise in actual experiments (or for an actual quantum information technology) than are the other 'qubits' proposed (and recently realised) so far. A typical 'qubit' is the quantum state of a 2-level atom (§2) or of a genuine spin-$\tfrac12$ system. In either case this state is

$$|\psi\rangle = a|g\rangle + b|e\rangle \qquad (44)$$

(with $|a|^2 + |b|^2 = 1$, for coefficients $a, b \in \mathbb{C}$). We know from the theory of SIT (§2) that a $2\pi$-pulse takes an atom in $|g\rangle$ back to the state $|g\rangle$, via however a passage through $|e\rangle$ ($2n\pi$-pulses do this $n$ times). Similarly a $\theta$-pulse takes $|g\rangle$ to

$$|\psi\rangle = \cos\tfrac12\theta\,|g\rangle - i\sin\tfrac12\theta\,|e\rangle \qquad (45)$$

(note the geometrical phase $|g\rangle \to |\psi\rangle = e^{i\pi}|g\rangle$ for the $2\pi$-pulse, a phase which has been measured in [90]). The relevance of eqn. (45) to quantum information and quantum computing is [35,89] that two such qubits are quantum states in the Hilbert space spanned by $|g_1\rangle|g_2\rangle$, $|g_1\rangle|e_2\rangle$, $|e_1\rangle|g_2\rangle$, $|e_1\rangle|e_2\rangle$ and

$$|\psi\rangle = a|g_1\rangle|g_2\rangle + b|g_1\rangle|e_2\rangle + c|e_1\rangle|g_2\rangle + d|e_1\rangle|e_2\rangle. \qquad (46)$$
Any successive measurements of this state (according to the Copenhagen interpretation of quantum mechanics) will measure any one of these states with probability $\propto |a|^2$, $|b|^2$, etc. But $|\psi\rangle$ itself contains all four states. Moreover optical pulses may act on each atom so that, e.g., a $2\pi$-pulse on atom 1 takes $|\psi\rangle$ to $-a|g_1\rangle|g_2\rangle - b|g_1\rangle|e_2\rangle - c|e_1\rangle|g_2\rangle - d|e_1\rangle|e_2\rangle$ while a $\pi$-pulse produces $-ia|e_1\rangle|g_2\rangle - ib|e_1\rangle|e_2\rangle + ic|g_1\rangle|g_2\rangle + id|g_1\rangle|e_2\rangle$. The point here is that, prior to actual state measurement, we can manipulate all four basis states spanning the Hilbert space: of course by using a different basis of four states we can manipulate each one of these four states. Thus for $N$ qubits we can manipulate $2^N$ numbers, and this allows a massive parallelism on these quantum computers which in principle can be exploited for particular kinds of computation (examples are given in [35] where, e.g., Grover's algorithm allows the searching of an unsorted list of $N$ items in only $\sqrt{N}$ steps!). An experimental situation already achieved in [90] is to use a 2-level atom plus one photon as two qubits: photon number is conserved in these experiments so that we can be restricted to the 2-state basis $|e,0\rangle$ and $|g,1\rangle$, with $|e,0\rangle = |e\rangle|0\rangle$, $|g,1\rangle = |g\rangle|1\rangle$. Now

$$|\psi\rangle = \cos\tfrac12\theta\,|g,1\rangle - i\sin\tfrac12\theta\,|e,0\rangle \qquad (47)$$

and for $\theta = \pi$ (a $\pi$-pulse) $|\psi\rangle = |g,1\rangle \leftrightarrow |e,0\rangle$, up to phases, a form of excitation swapping. For $\theta = \pi/2$

$$|\psi\rangle = \tfrac{1}{\sqrt2}\left[|g,1\rangle - i|e,0\rangle\right] \qquad (48)$$
a nice quantum 'entanglement' of photon and atom ('entanglement', going back to Schrödinger, means the state $|\psi\rangle$ cannot be written as a simple product state: $|e,0\rangle$ is the product state $|e\rangle|0\rangle$ and $|g,1\rangle$ is the product state $|g\rangle|1\rangle$, but $|g,1\rangle \to$ a state not of this form in eqn. (48)). Physical manipulations like this on single atoms are now readily achievable in 'high-Q' (very little damped) microwave cavities. For example [90] does this using $^{87}$Rb atoms undergoing high Rydberg transitions at frequencies ~ 51.1 GHz (the '2-level atom' is that between the $n = 50$ ($|g\rangle$) and $n = 51$ ($|e\rangle$) states, and $n$ is the principal quantum number for the Bohr-like atom with the one optical electron which is $^{87}$Rb). Moreover similar manipulations can be done on more than one atom (2 atoms or 3 atoms so far in [90]). Take here the photon-atom entanglement eqn. (48) mentioned: the atom carries one 'bit' of binary information as $|e\rangle \leftrightarrow 1$, $|g\rangle \leftrightarrow 0$ (say). So this is one qubit. The photon is also one qubit ($|1\rangle$ or $|0\rangle$). These two qubits can be manipulated. In particular [90] they can be manipulated as follows. The experimental system used in [90] involves a Ramsey interferometer which allows for close to resonant Ramsey pulses connecting $|g\rangle$ ($n = 50$) to another state $|i\rangle$ (actually the $n = 49$ state) at 54.3 GHz, sufficiently different from the 51.1 GHz transition for the $|g\rangle$, $|e\rangle$ system. Combined with $|0\rangle$, $|1\rangle$ photon states representing the presence of 0 or 1 photons, the relevant basis is the $2^2 = 4$ states $|i,0\rangle$, $|g,0\rangle$, $|i,1\rangle$, $|g,1\rangle$. The action of a $2\pi$-pulse on the 2-level $|e\rangle$, $|g\rangle$ system takes $|g,1\rangle \to e^{i\phi}|g,1\rangle$ with
(49)
Detection of the atom outside the cavity via an ionisation detector will detect $|e\rangle$ or $|g\rangle$ with equal probability of one half, as measured over many successive experiments. But detection of $|e\rangle$ or of $|g\rangle$ equally well detects $|0\rangle$ (no photons) or $|1\rangle$ (one photon) in the cavity. If now a second atom (the so-called 'meter' atom) enters in $|g\rangle$ and, by nutation, undergoes a $2\pi$-pulse with the Rabi fields resonant on $|g\rangle \to |i\rangle$ switched on, the Ramsey fringes can measure the correlation between the probability of finding the meter atom in $|g\rangle$ and the presence or absence of a photon. That is, the fringes determine the conditional probabilities of measuring the meter atom in $|g\rangle$ given that the source atom was detected in $|g\rangle$ or that it was detected in $|e\rangle$: these probabilities are equally conditional on 1 or 0 photons being present from the source atoms, so that the conditional probabilities $P(g_2|g_1) = P(g|1)$, $P(g_2|e_1) = P(g|0)$ are being measured. Two points follow: one is that for $P(g|1)$ one is detecting one photon, a so-called 'quantum non-demolition' (QND) detection of one photon; the other point is that one can see that this detection of one photon conforms to the dynamics of a controlled-NOT (C-NOT) gate [93]. Such C-NOT gates are fundamental to quantum computation [9]. Ref. [90] goes on to consider the experimental manipulation of 3 qubits (3 atoms) [90,94]. As far as I know this number 3 is a measure of current achievements in the experimental manipulation of qubits. The problem of maintaining coherence over many qubits in real cavities is considerable [89] and this may be the ultimate problem for real, that is experimentally realised, quantum computation and information. Before we turn again to solitons I explain further about the nutation of a 2-level atom in a cavity. The Jaynes-Cummings model couples one 2-level atom to one single mode of a very high-Q (not at all damped) cavity.
The Hamiltonian involved is (at exact resonance between the atom and the cavity mode)

$$H = \omega_0S^z + \omega_0a^\dagger a + g\left(S^+a + a^\dagger S^-\right) \qquad (50)$$

in which $S^\pm, S^z$ take the two-dimensional representation of su(2) introduced above eqn. (3) and $[a, a^\dagger] = 1$ for the photons of the cavity mode. A constant of the motion is the operator $M$ (say)

$$M = S^z + a^\dagger a + \tfrac12 \qquad (51)$$
so that $M|g,1\rangle = 1\,|g,1\rangle$, $M|e,0\rangle = 1\,|e,0\rangle$ and $M|\psi\rangle = 1\,|\psi\rangle$ for any state $|\psi\rangle$ like eqn. (47). Since there are two degrees of freedom, one for the cavity mode and actually precisely one for the 2-level atom, the two commuting constants $H$ and $M$ make this system quantum completely integrable. Jaynes and Cummings [91] solved the model directly by solving a $2 \times 2$ matrix formulation of the quantum mechanics. But in [70] we solved this JC-model, eqn. (50), via the quantum inverse method as part of the solution of a more general quantum problem involving $N \geq 1$ 2-level atoms and one mode, also shown in Fig. 10.
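The two-state subspace spanned by $|g,1\rangle$ and $|e,0\rangle$ is small enough to tabulate directly. The sketch below is illustrative only: it evaluates the amplitudes of eqn. (47) at a few pulse angles $\theta$ and checks the eigenvalue of $M$ on the two basis states, assuming eqn. (51) is read as $M = S^z + a^\dagger a + \tfrac12$ with $S^z = \mp\tfrac12$ on $|g\rangle$, $|e\rangle$. The function names are hypothetical, and no particular relation between $\theta$ and the interaction time is assumed.

```python
import math

def pulse_state(theta):
    """Amplitudes (c_g1, c_e0) of eqn. (47):
    |psi> = cos(theta/2)|g,1> - i sin(theta/2)|e,0>."""
    return (complex(math.cos(theta / 2.0)), -1j * math.sin(theta / 2.0))

def probabilities(theta):
    """Probabilities of finding |g,1> and |e,0> in the state of eqn. (47)."""
    c_g1, c_e0 = pulse_state(theta)
    return (abs(c_g1) ** 2, abs(c_e0) ** 2)

def constant_of_motion(sz, photons):
    """Eigenvalue of M = S^z + a^dagger a + 1/2 on a product state
    (assumed reading of eqn. (51))."""
    return sz + photons + 0.5

# theta = pi/2 gives the equal-weight entangled state of eqn. (48);
# theta = pi and 3*pi both give |e,0>, with amplitudes -i and +i respectively.
for label, theta in (("pi/2", math.pi / 2), ("pi", math.pi),
                     ("2pi", 2 * math.pi), ("3pi", 3 * math.pi)):
    print(label, pulse_state(theta), probabilities(theta))

# Both basis states carry the same eigenvalue of M.
print(constant_of_motion(-0.5, 1), constant_of_motion(0.5, 0))
```

In this parametrisation the amplitude on $|e,0\rangle$ changes sign between $\theta = \pi$ and $\theta = 3\pi$, passing through $|g,1\rangle$ at $\theta = 2\pi$.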
For our purposes ref. [80] actually constructed an effective single mode cavity and sends one 2-level atom into it! If the atom enters in $|e\rangle$ and there are no photons, since the state $|e,0\rangle$ is not an eigenstate of $H$ the system makes a unitary evolution under $H$ at fixed $M$, and so evolves through states $|\psi\rangle$, eqn. (47), in which $\theta$ starts at $\theta = \pi$ (say) and moves through $\pi < \theta < 3\pi$. This is the nutation of the '$2\pi$-pulse'. In the course of that nutation $|e,0\rangle$ passes through $|g,1\rangle$, the $\pi$-pulse by nutation, emitting one photon to the cavity. A second $\pi$-pulse by nutation then absorbs that photon and the resultant total $2\pi$-pulse restores $|\psi\rangle = +i|e,0\rangle$ (with a change of sign). This nutation can be observed by the Ramsey interferometer (see Fig. 2(a) in [90]). The condition for the $2\pi$-pulse was $gt_{\rm int} = 2\pi$, i.e. $t_{\rm int} = 2\pi g^{-1}$. For any $\theta$-pulse by nutation one adjusts $t_{\rm int}$ to $\theta g^{-1}$ [90]. Reference to the dynamics of the micromaser will show [71-74] that we are here concerned with 'trapping states' of a one-atom micromaser which satisfy the condition $\sqrt{n+1}\,gt_{\rm int} = 2r\pi$, $r$ = integer. Here $r = 1$ and $n + 1$, the photon number, is precisely 1. Notice that the 2-level atom driven by the electric field $E(t)$ in the MB system, §2, is also undergoing optical nutation driven by $2\pi$-pulses in particular. The difference now is that at this level of 'cavity quantum electrodynamics' we must quantise the electromagnetic field. What, if anything, has this to do with solitons and particularly 'quantum solitons'? It is a suggestion in [9] that 'one quantum soliton' is a natural qubit. In [9] this 'quantum soliton' is viewed as a c-number sech solution, eqn. (15), of the c-number attractive (coupling constant $c < 0$) NLS equation, eqn. (14), which exhibits however quantum fluctuations: particularly a particle number operator $N$ (say) and its canonical phase $\phi$ satisfying $[N, \phi] = -i$ (for large enough eigenvalues of $N$) will satisfy an uncertainty principle $\Delta N\Delta\phi \geq \tfrac12$.
This will be true in particular for the one-bit sech signals of the NLS equation, eqn. (15), in an optical fibre. And this means that the fluctuations in the values of $N$ in this one-bit will register as quantum noise ("shot" noise) in the fibre and blur the signals. A prescription to reduce this noise is to 'squeeze' the fluctuations $\Delta N$ so that $\Delta N < 1/\sqrt2$ while $\Delta$
of course: any eigenstate of $N$ is infinitely squeezed in $n$ already. However it is immediately intuitive that in the case of the quantum soliton of an optical fibre we are actually talking about an optical pulse which is quantised but somehow still very much like the sech 1-soliton solution, eqn. (15), of the c-number NLS equation, eqn. (14). In [95] Miki Wadati and colleague showed in 1984 how the solution of the classical NLS model, eqn. (14), emerges from a quantum mechanical "matrix element", namely the matrix element

$$\lim_{n\to\infty}\langle n, X'', t|\phi(x - Vt)|n + 1, X', t\rangle. \qquad (52)$$
The quantum field $\phi(x - Vt)$ satisfies the quantum NLS equation, eqn. (28) (for which $\hbar = 1$ and $m = \tfrac12$), and the states $|n + 1, X'\rangle$ are the Fourier transforms on $P$ to $X'$ from simultaneous eigenstates $|n + 1, P, \ldots\rangle$ of $H, N, P, \ldots$, while the states in the matrix element eqn. (52) depend also on $t$ because they are actually wave packets deriving [95] by Fourier transformation on $P$ of $|n, P, \ldots t\rangle = e^{-iHt}|n, P, \ldots\rangle$. Notice that $n \to n + 1$ in the matrix element eqn. (52), and there is off diagonal long-range order and coherence (compare §4) in this sense. Moreover $n \to \infty$ in eqn. (52) is a form of classical limit, even though, unlike the quantum coherent states (§4), $n$-states are always intrinsically quantum. Wadati shows how eqn. (52) becomes exactly the hyperbolic secant solution, eqn. (15), of the $d = 1$ c-number attractive NLS eqn. (14) (see footnote 5). However there is apparently the problem for this limit $n \to \infty$ that the quantum attractive NLS equation is unstable, with no stable ground state, unless the particle number $n$, an eigenvalue of $N$, is held fixed [96]. Also for present purposes what are these 'particles'? Their numbers $n$ are eigenvalues of $N$, but they are not photons: e.g. their masses are $m = \tfrac12$, and so $> 0$. Of course this mass $m > 0$ is all an artefact of some effective nonlinear refractive index. But moreover, and still more so for the present purposes of connecting (52) with ref. [9], the matrix element which is eqn. (52) has (by definition) lost its quantum mechanics. However, for any valid comparisons with [9] and its references we have still to calculate all correlators, like those in §4 in $x - x'$ and $t - t'$, and this must now be done. Note that ref. [9] suggests that the coherent quantum soliton is in a Glauber coherent state, but the matrix element (52) appears to suggest quite otherwise. These various questions are open and demand further work.
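The distinction between a Glauber coherent state and an $n$-state that underlies these open questions can be made concrete through their number statistics. The sketch below is a generic illustration and not a calculation from [9] or [95]: it uses the standard Poisson number distribution of a coherent state $|\alpha\rangle$ and compares it with a number eigenstate of the same mean. The function names, the truncation `n_max`, and the value $|\alpha|^2 = 4$ are hypothetical choices.

```python
import math

def coherent_number_distribution(alpha_abs2, n_max):
    """Poisson weights p_n = exp(-|alpha|^2) |alpha|^(2n) / n! for n = 0..n_max."""
    weights = []
    p = math.exp(-alpha_abs2)
    for n in range(n_max + 1):
        weights.append(p)
        p *= alpha_abs2 / (n + 1)  # recurrence p_{n+1} = p_n |alpha|^2 / (n+1)
    return weights

def number_state_distribution(n):
    """All weight on a single eigenvalue n of the number operator N."""
    return [0.0] * n + [1.0]

def moments(weights):
    """Return (<N>, <N(N-1)>, variance of N) for a number distribution."""
    mean = sum(n * p for n, p in enumerate(weights))
    factorial2 = sum(n * (n - 1) * p for n, p in enumerate(weights))
    variance = sum(n * n * p for n, p in enumerate(weights)) - mean ** 2
    return mean, factorial2, variance

# Coherent state: <N> = |alpha|^2, <N(N-1)> = |alpha|^4, variance = |alpha|^2.
print(moments(coherent_number_distribution(4.0, 80)))
# Number state with the same mean: the variance of N vanishes.
print(moments(number_state_distribution(4)))
```

The coherent state reproduces $\langle a^\dagger a\rangle = |\alpha|^2$ and $\langle a^\dagger a^\dagger aa\rangle = |\alpha|^4$, with shot-noise fluctuations $\Delta N = |\alpha|$, whereas the number state has $\Delta N = 0$.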
Next we note that two qubits in this soliton realisation must (presumably) be the 2-soliton solution of the c-number NLS equation, eqn. (14). That the two separate qubits in this description interact is plain from Fig. 5, where the 2-soliton solution of the comparable sine-Gordon equation shows that (c-number) interaction. So far the matrix element description of two $n$-string solutions of the quantum attractive NLS equation is still to be worked out, and for present purposes clearly needs doing.

Footnote 5: The matrix element eqn. (52) becomes the classical sech multiplied by what is essentially $\delta(X' - X'')$ for large enough $n$: $X' = X'' = x_0$ plays the role of a phase in the argument of the sech, i.e. sech$[q(x - Vt)] \to$ sech$[q(x - x_0 - Vt)]$.

Note that for one $n$-string we are concerned with
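states $|k_1, \ldots, k_n\rangle$ satisfying

$$N|k_1,\ldots,k_n\rangle = n|k_1,\ldots,k_n\rangle, \qquad P|k_1,\ldots,k_n\rangle = \Big(\sum_{j=1}^{n}k_j\Big)|k_1,\ldots,k_n\rangle, \qquad H|k_1,\ldots,k_n\rangle = \Big(\sum_{j=1}^{n}k_j^2\Big)|k_1,\ldots,k_n\rangle \qquad (53)$$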
and when $c < 0$ the set of wave numbers $\{k_j\}$ forms the $n$-string for chosen

$$k_j = P + \tfrac12\{n - (2j - 1)\}\,ic, \qquad j = 1, 2, \ldots, n \qquad (54)$$
where $i = \sqrt{-1}$, so that the total momentum is $\sum_{j=1}^{n}k_j = nP$ and the energy is $E_n(P) = \sum_{j=1}^{n}k_j^2 = nP^2 - \tfrac{1}{12}n(n^2 - 1)c^2$ (not bounded below for $n \to \infty$!). Thus at this stage of investigation it remains a very interesting but very open question what features of "quantum information" can be extracted from this quantum mechanical model system. One idea is to try to encode quantum information on the $n$-strings, either as one $n$-string for large $n$ or on a set of $n$-strings. I do not know how to manipulate single $n$-strings, but for sets of $n$-strings I note that the matrix element eqn. (52) for one $n$-string is essentially located at some place $x_0$ [95], so that in eqn. (15) the argument of the sech is $q(x - x_0 - Vt)$ and $x$ in eqn. (15) $\to x - x_0$, as already explained in footnote 5. From the known asymptotic behaviours of two solitons of the NLS equation, eqn. (14), two $n$-strings characterised by $n_1$, $n_2$ both large, and associated with $x_0 = x_1$, $x_0 = x_2$ respectively and with velocities $V_1$, $V_2$ respectively, should mean that one can add two $n$-string solutions asymptotically, that is for large enough initial separations: these (rather complicated) quantum objects thus become qubits, each with their own quantum structures, in a collision with a possibly exotic quantum entanglement! This entanglement at semiclassical level becomes the phase shift $\Delta$ of the argument $x - x_0$ of the sech, as was described in §2 for the s-G system. Interestingly, as the final figure, Fig. 11, shows, the squeezing [7,9] of a single quantum soliton in an optical fibre has certainly been observed already [9,97-100], even though such squeezing at a fixed $n$ of the matrix element eqn. (52) is in itself not possible. Moreover refs. [97-100] include measurements of the correlations between modes of the quantum soliton (especially [97,98]) which refer back to the remarks below eqn. (52).
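The sums quoted for the $n$-string of eqn. (54) are easy to verify numerically. The following is a small check, assuming only eqn. (54) and the closed form for $E_n(P)$ stated above; the parameter values and function names are arbitrary and purely illustrative.

```python
def n_string(n, P, c):
    """Wave numbers k_j = P + (1/2){n - (2j - 1)} i c, j = 1..n, of eqn. (54)."""
    return [P + 0.5 * (n - (2 * j - 1)) * 1j * c for j in range(1, n + 1)]

def totals(n, P, c):
    """Return (sum of k_j, sum of k_j^2) for the n-string."""
    ks = n_string(n, P, c)
    return sum(ks), sum(k * k for k in ks)

def closed_form_energy(n, P, c):
    """E_n(P) = n P^2 - (1/12) n (n^2 - 1) c^2."""
    return n * P * P - n * (n * n - 1) * c * c / 12.0

# Illustrative values only: attractive coupling c < 0.
n, P, c = 5, 0.7, -1.3
momentum, energy = totals(n, P, c)
print(momentum)                                  # real part n*P, imaginary parts cancel
print(energy, closed_form_energy(n, P, c))

# The binding term grows like n^3, so E_n(P) is unbounded below as n increases.
print([closed_form_energy(m, 0.0, c) for m in (1, 10, 100)])
```

The imaginary parts of the $k_j$ cancel in pairs, so both the total momentum and the energy come out real even though the individual wave numbers are complex.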
Evidently an early investigation for this new Millennium is to take each part of this particular quantum analysis significantly further.

[Figure: 'Squeezing the soliton'. Two panels plotted against input pulse energy (pJ): output-pulse energy (top) and photocurrent noise power relative to the shot noise (bottom). Output energy (top) and squeezing (bottom) plotted as a function of the input energy for a 90/10 asymmetric fibre-based interferometer loop. The output-pulse energy shows an optical-limiting effect at input energies of 53 picojoules and 83 picojoules. The y-axis in the lower graph is the photocurrent noise power in a photodiode detector relative to the shot noise (horizontal line). The quantum fluctuations are reduced below the shot-noise level (i.e. "squeezed") at the input energies for which optical limiting occurs. (From Schmitt et al.)]

Fig. 11. Experimental data for the 'squeezing' of a quantum soliton in an optical fibre, taken with its caption from [9]. These particular quantum solitons are certainly more complicated in practice [7] than any single quantum soliton based on the quantum attractive NLS equation, eqn. (28) ($c < 0$), since additional nonlinearities are included. The reference to Schmitt et al. is [98].

6. FINAL COMMENTS AND CONCLUSIONS

The discovery of the soliton solutions, and more generally of the complete Hamiltonian integrability of many classes of nonlinear partial differential equations, or of other related nonlinear integro-differential systems such as [101], during the period 1965-1974 or soon thereafter, together with the quantisation of these systems since 1979, has exposed wholly new mathematical structures of exceptional interest. Although Fig. 8 of this paper literally sketches the remarkable 'connectivity', that is the relations between this variety of structures, it can only be a sketch; and the opportunity offered to the author to embellish this Fig. 8 with the actual comments of the text of the paper still leaves him with the wish to dig much deeper and go much further. Certainly even at the superficial level necessarily adopted in the paper much is left out. Thus the paper does not attempt to address the wider issues of the mathematics of integrability per se [102], while all of the work on the Painlevé tests for integrability already available (e.g. [103]) is deliberately not mentioned. I have developed the theme of "complete Hamiltonian integrability" in the paper because this leads directly, via Dirac's canonical quantisation, to quantum theories. But the theory of quantum groups [29-32] is not developed as such in the paper, only mentioned; the co-multiplication structure underlying these generalised (by a spectral parameter) non-commutative and non-co-commutative algebras is not explored in the paper, and indeed the commutative and co-commutative algebras which are the Poisson-Lie groups are not exhibited in the paper either. These latter do appear on the 'map', the Fig. 8, where starting from the Sklyanin bracket, which there takes the form

$$\{T \,\overset{\otimes}{,}\, T\} = [T \otimes T, r], \qquad (55)$$
in terms of the 'little r-matrix' ([18] and see [24]), in which the left side represents the 16 Poisson brackets which can be considered for the $2 \times 2$ matrix integrable systems (the AKNS systems [36]), appears at the top of the third column from the right. But it also leads down to the quantum groups as Hopf algebras in the second column from the right. Of course what leads to what in this 'map' can be a matter of subjective choice. On the 'map' the origins of these Hopf algebras in the loop algebras (with spectral parameter) well down in the third column from the left is covered, though inadequately and incompletely, in my paper [30]. But again the KSA (or AKS: Adler-Kostant-Symes) theorem there is not at all developed in the paper; see e.g. my [104] for results and references, and see also the references to supersymmetric integrable systems theory in [104]. Lax pair theory and the inverse method are well sketched in the text of §3, and I hope that the twin pillars of geometry and algebra supporting all of this are at least partly identifiable. A theme which is not developed is the non-commutative quantum geometry of these systems because, implicit as this is in the R-matrix theory, these aspects still await a proper elucidation (by this author at least). However, in this author's view the most remarkable aspect of the mathematical structure displayed in Fig. 8 must still be that it leads directly to physical manifestations and even to successful experiments. Figs. 9 and 10 of this paper serve to show how even the EXPERIMENTS 'box' on Fig. 8 could do little justice to that experimental situation. Thus the physics of self-induced transparency (SIT) pursued during 1967-1974,
and indeed subsequently, was already a striking manifestation of 'optical solitons'; and since then the many physical examples, some collected in [34] of 1977, follow the same theme. Thus, for this author, [8] as well as [34] of 1977 and then [105] of 1980 could already include spin-wave phenomena in liquid $^{3}$He at temperatures $T \sim 2.6$ mK and the appearance of the integrable s-G equation for spin-waves in the $^{3}$He A-phase. They also heralded the appearance of the non-integrable 'double sine-Gordon' equation $\phi_{xx} - \phi_{tt} = -m^2(\sin\phi + \tfrac12\sin\tfrac12\phi)$ for spin waves in the $^{3}$He B-phase, solved in [105] by 'soliton perturbation theory' about the s-G equation. Things like this can only extend the still scarcely explained 'unreasonableness' [33] of the interplay of the mathematics, mathematical physics and actual physics exhibited by Nature as particularly described in this paper. This physics extends to realisable technologies, and it is a theme of this paper that this is so. The connections between SIT, cavity quantum electrodynamics, and the potential for 'quantum information' and 'quantum computing' are one theme of the paper. The arcane (to this author) connection between methods of functional integration on infinite dimensional systems and quantum mechanics remains almost as mysterious now as it did when first put forward by R.P. Feynman [106]. Indeed, although details could not be given in the paper, the fact that these 'arcane' functional integral methods from mathematical physics can yield predictions for the behaviours of Bose condensates held in magnetic traps at temperatures $T \sim 250$-$450$ nK which are in agreement with current experiments [85,88] is a source of present amazement to this author. The new technology which is the atom-laser is still to be created. Because of the obvious problems of gravity, ultra-low temperatures, and the short lifetimes of condensates anyway, this technology may never be created.
But it remains the spectacular manifestation of mathematics with physics which is Nature mentioned. On the other hand the 'optical soliton', whose short life in the last Millennium was only some 27 years, must have a secure future in the new communication systems of this new Millennium. Less predictable is the 'quantum soliton'. What this might yet do for 'quantum information' remains a question which must now be vigorously explored!

REFERENCES
1. P.A. Griffiths "Mathematics and the sciences: Is interdisciplinary research possible?" Plenary paper at this meeting.
2. This author's experience of interdisciplinary research is that in straddling more than one supposedly distinct 'camp' of research - as one necessarily must - one runs the risk of finishing with an acknowledged place in no one of them: in short each research camp can become very protective of its own perceived boundaries! There is of course the intrinsic problem, anyway, of simple understanding between camps. This author's actual experience is that newly discovered abstract mathematics although evidently directly applicable to real experiments in the laboratory and consequently to newly emerging technologies can rarely be perceived as such by all but a very select few of the available experimentalists and technologists! This lecture attempts to delineate a route which can be said to begin in abstract and aesthetically appealing 'pure mathematics' through systems of partial differential equations of 'applied mathematics' thence to theoretical and experimental physics, and from the last to the newly emerging technology of 'quantum information'. Only the reader can determine if this route is made apparent in the paper. In practice discovery of the optical soliton was already in the border land between theoretical and experimental physics: the rather remarkable mathematics of solitons (Fig. 8) grew 'backwards' out of that theoretical physics as can be seen from the references [3-7] then [8] and then [10] (for example) following: both the algebras of solitons (Fig. 8) and their geometries [10-13] have proved of intrinsic mathematical interest while the non-mathematical reference [9] contrasts all of this with emerging, or potentially emerging, new technology.
3. Caudrey, P.J., Gibbon, J.D., Eilbeck, J.C. and Bullough, R.K., 1973, 'Exact multi-soliton solutions of the self-induced transparency and sine-Gordon equations', Phys. Rev. Lett., 30, 237.
4. Caudrey, P.J., Gibbon, J.D., Eilbeck, J.C. and Bullough, R.K., 1973, 'Solitons in non-linear optics I. A more accurate description of the 2π pulse in self-induced transparency', J. Phys. A: Math. Gen., 6, 1337.
5. Gibbon, J.D., Caudrey, P.J., Eilbeck, J.C. and Bullough, R.K., 1973, 'An N-soliton solution of a nonlinear optics equation derived by a general inverse method', Lett. al Nuovo Cimento, 8, 775.
6. Bullough, R.K., Caudrey, P.J., Eilbeck, J.C., Gibbon, J.D., 1974, 'A general theory of self-induced transparency', Opto-Electronics, 6, 121.
7. Bullough, R.K., 2000, 'The optical solitons of QE1 are the BEC of QE14: has the quantum soliton arrived?'. In Proceedings of the 14th National Quantum Electronics and Photonics meeting, Manchester, UK, 5-9 September 1999. Journal of Modern Optics. In the press at October 1999, J. Mod. Optics, Vol. 47, No. 11, 2029-2065 [erratum J. Mod. Optics, Vol. 48, No. 4, to appear February 2001].
8. Bullough, R.K., and Caudrey, P.J., 1978, 'Optical solitons and their spin wave analogues in 3He', in "Coherence and quantum optics IV" edited by L. Mandel and E. Wolf (New York: Plenum) pp. 762-780.
9. Abram, Izo, 1999, 'Quantum Solitons', Physics World, February 1999, pp. 21-25.
10. Solitons, 1980, Springer Topics in Current Physics, 17, edited by R.K. Bullough and P.J. Caudrey (Heidelberg: Springer-Verlag). Chap. I, The soliton and its history, pp. 1-64, and the other Chaps.
11. Terng, Chuu-Lian and Uhlenbeck, Karen, 2000, Geometry of Solitons, Notices of the AMS, January, pp. 17-25.
12. Hitchin, N.J., Segal, G.B., and Ward, R.S., 1999, 'Integrable Systems. Twistors, Loop Groups, and Riemann Surfaces' Oxford Science Publications (Oxford: Clarendon Press) [ISBN 0 19 850421 7] e.g. 'Introduction' by N.J. Hitchin, pp. 4-6.
13. 'The Geometric Universe, Science, Geometry and the Work of Roger Penrose', 1998, edited by S.A. Huggett, L.J. Mason, K.P. Tod, S.T. Tsou, and N.M.J. Woodhouse (Oxford: Oxford University Press) §1 p.5 (by Michael Atiyah) and §6, pp. 99-108 (by R.S. Ward) as well as other places. [ISBN 0 19 850059 9] (Hbk).
14. Solitons, 1980 Ref. [10], p.27 and the reference [1.67] there.
15. Liouville, J., 1855, Journal de Mathematique, XX, p.137.
16. Arnold, V.I., 1978, Mathematical Methods of Classical Mechanics (Berlin: Springer-Verlag) Chap. 10, pp.271-275 and pp.279-291.
17. Bullough, R.K., 1994, 'Instabilities in Nonlinear Dynamics: Paradigms for Self-Organization' in "On Self-Organization", Springer Series in Synergetics, Vol. 61, edited by R.K. Mishra, D. Maaß and E. Zwierlein (Berlin: Springer-Verlag) pp. 212-244.
18. Bullough, R.K. and Timonen, J.T., 1995, 'Quantum and Classical Integrable Models and Statistical Mechanics' in "Statistical Mechanics and Field Theory", edited by V.V. Bazhanov and C.J. Burden (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 336-414 [ISBN 981 02 2397 8].
19. Bullough, R.K. and Caudrey, P.J., 1995, Acta Applicandae Mathematicae, 39, 193-228. (This article as printed contains gross publisher's errors and interested readers are referred to the authors at UMIST, Manchester for an original typescript.)
20. Bullough, R.K., Jiang, Z. and Manakov, S.V., 1986, Proc. Intl. Conf. on Solitons and Coherent Structures, Santa Barbara, Jan. 1985. Physica 18D: Nonlinear Phenomena, pp. 305-307.
21. Konopelchenko, B. and Rogers, C., 1991, Phys. Lett., 158A, 391.
22. Jiang, Z., 1987, 'Integrable Systems and Integrability', Ph.D. Thesis, University of Manchester, February.
23. Bogoliubov, N.M., Rybin, A.V., Bullough, R.K., and Timonen, J., 1995, 'Maxwell-Bloch system on a lattice', Phys. Rev. A, 52, No. 2, 1487-1493.
24. Korepin, V.E., Bogoliubov, N.M., and Izergin, A.G., 1993, 'Quantum Inverse Scattering Method and Correlation Functions' (Cambridge: Cambridge University Press) [Paperback, 1997. ISBN 0 521 58646 1].
25. Ward, R.S., 1985, 'Integrable and solvable systems and relations among them', Phil. Trans. Roy. Soc. London A315, 451-457 (Discussion meeting 'New Developments in the Theory and Application of Solitons').
26. D'Ariano, G.M., Montorsi, A., and Rasetti, M.G., 1985, 'Integrable Systems in Statistical Mechanics' (Singapore: World Scientific Publishing Co. Pte. Ltd.) pp. 96-127 and the work of E. Date, M. Jimbo, M. Kashiwara and T. Miwa, and of M. Sato and Y. Sato referenced.
27. Cheng, Yi, 1987, 'Theory of Integrable Lattices', Ph.D. Thesis, University of Manchester, January.
28. Weyl, H., 1931, 'The Theory of Groups and Quantum Mechanics' (New York: Dover Publications, Inc.) paperback edition (translated from the German by H.P. Robertson, September).
29. Bogoliubov, N.M. and Bullough, R.K., 1992, 'A q-deformed completely integrable Bose gas model', J. Phys. A: Math. Gen. 25, 4057-4071.
30. Bullough, R.K., Olafsson, S., Chen, Yu-zhong, and Timonen, J., 1990, 'Integrability conditions: recent results in the theory of integrable models' in "Differential Geometric Methods in Theoretical Physics" (NATO ARW 'Physics and Geometry' 1989) edited by Ling-Lie Chau and Werner Nahm (New York: Plenum Press) pp. 47-69.
31. Bullough, R.K. and Bogoliubov, N.M., 1992, 'Quantum Groups: q-Boson Theories of Integrable Models' in Proc. XXth Intl. Conf. on Diff. Geometric Methods in Theoretical Physics Vol. 1 edited by Sultan Catto and Alvany Rocha (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 488-504 [ISBN 981 02 0827 6 (Vol 1)].
32. 'Quantum Groups', 1990, Springer Lecture Notes in Physics, edited by H.-D. Doebner and J.-D. Henning (Berlin: Springer-Verlag) [ISBN 3 540 53503 9].
33. Wigner, E.P., 1960, 'The unreasonable effectiveness of mathematics in the natural sciences', Comm. Pure and Applied Maths 13, 1.
34. Bullough, R.K., 1977, 'Solitons' in "Interaction of radiation with condensed matter. Vol. 1" IAEA-SMR-20/51. (Vienna: International Atomic Energy Agency) pp. 381-469.
35. Deutsch, D. and Ekert, A., 1998, 'Quantum Information. Quantum Computation', Physics World, March 1998, pp. 47-52.
36. Ablowitz, M.J., Kaup, D.J., Newell, A.C. and Segur, H., 1973, Phys. Rev. Lett., 31, 125.
37. Lamb, G.L., 1973, Phys. Rev. Lett., 31, 196.
38. Schweber, S.S., 1961, 'An Introduction to Relativistic Quantum Field Theory' (New York: Harper and Row, Publishers Inc.) Chap. 3.
39. Calogero, Francesco, 1995, 'Integrable Nonlinear Evolution Equations and Dynamical Systems in Multidimensions', Acta Applicandae Mathematicae 39, 229-244; and Calogero, F., 'Universal Integrable Nonlinear PDEs' in "Applications of Analytic and Geometric Methods to Nonlinear Differential Equations", (Dordrecht, Holland: Kluwer Academic Publishers) pp. 109-114; and references.
40. Bullough, R.K., Thompson, B.V., Nayak, N. and Bogoliubov, N.M., 1995, 'Microwave cavity quantum electrodynamics, I: one and many Rydberg atoms in microwave cavities; and II: fundamental theory of the micromaser' in "Studies in Classical and Quantum Nonlinear Optics" edited by Ole Keller (Commack, New York: Nova Science Publishers Inc.) pp. 609-623 [ISBN 1 56072 168 5].
41. Hynne, F., and Bullough, R.K., 1984, 'The scattering of light. I. The optical response of a finite molecular fluid', Phil. Trans. R. Soc. Lond. A, 312, 251.
42. Hynne, F., and Bullough, R.K., 1987, 'The scattering of light, II. The complex refractive index of a molecular fluid', Phil. Trans. R. Soc. Lond. A, 321, 305.
43. Hynne, F., and Bullough, R.K., 1990, 'The scattering of light, III. External scattering from a finite molecular fluid', Phil. Trans. R. Soc. Lond. A, 330, 253.
44. Bullough, R.K., Batarfi, H.A., Hassan, S.S., Ibrahim, M.N.R., and Saunders, R., 1996, in 'ICONO '95 Atomic and Quantum Optics: High Precision Measurements' edited by Sergei N. Bagayev and Anatoly S. Chirkin, Proc. SPIE 2799, pp.320-328; and see the other references, 91, 92, 93 in Ref. [7].
45. Bullough, R.K., Hassan, S.S. and Ibrahim, M.N.R., 2000, 'A nonlinear refractive index theory of optical multi-stability in normal and squeezed vacua: analytical and numerical results'. One of a sequence of papers to be published. Also see M.N.R. Ibrahim, Ph.D. thesis, UMIST, 1996.
46. Caudrey, P.J., and Eilbeck, J.C., 1977, Phys. Lett., 62A, 65.
47. Gibbs, H.M. and Slusher, R.E., 1972, Phys. Rev. A, 6, 2326-2334.
48. Bullough, R.K., 1995, 'Optical solitons, chaos and all that: thirty years of quantum optics and nonlinear phenomena!' in Proceedings of the First International Scientific Conference (Science and Development) Organized by the Faculty of Science, Al-Azhar University, Cairo, 20-23 March 1995. Edited by Prof. Dr. Ahmed M. El-Naggar (Dean, Faculty of Science), and Prof. Dr. Abd El-Wahab A. El-Sharkawy (Vice Dean).
49. Arnold, V.I., 1978, Ref. [16], p.285.
50. Bullough, R.K., and Caudrey, P.J., 1980, Ref. [10], p.3 and following pages.
51. Chiao, R.Y., Garmire, E., Townes, C.H., 1964, Phys. Rev. Lett. 13, 479.
52. Kelley, P.L., 1965, Phys. Rev. Lett. 15, 1005.
53. Bullough, R.K. and Caudrey, P.J., 1980, Ref. [10] pp.2-5 and Appendix pp.373-378. And see also [19,54].
54. Bullough, R.K., 1988, '"The Wave" "par excellence", the solitary great wave of equilibrium of the fluid - an early history of the solitary wave' in "Solitons", edited by M. Lakshmanan, Springer Series in Nonlinear Dynamics (Heidelberg: Springer-Verlag) pp. 7-42.
55. Bullough, R.K., Chen, Yu-zhong and Timonen, J., 1990, 'Soliton statistical mechanics - thermodynamic limits for quantum and classical integrable models' in "Nonlinear World Vol. 2" edited by V.G. Bar'yakhtar, V.M. Chernousenko, N.S. Erokhin, A.G. Sitenko, and V.E. Zakharov (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 1377-1422.
56. Bullough, R.K. and Caudrey, P.J., 1980, Ref. [10] pp. 29-36.
57. Faddeev, L.D., and Takhtajan, L.A., 1987, 'Hamiltonian Methods in the Theory of Solitons' (Berlin: Springer-Verlag).
58. Bullough, R.K., 1980, 'Solitons: inverse scattering and its applications' in "Bifurcation phenomena in mathematical physics and related problems" edited by D. Bessis and C. Bardos (Dordrecht, Holland: D. Reidel Publ. Co.) pp.295-349.
59. Bullough, R.K., and Dodd, R.K., 1977, 'Solitons' in "Synergetics" Proc. Intl. Workshop on Synergetics; Bavaria, May 1977, edited by H. Haken (Heidelberg: Springer-Verlag) pp.92-119.
60. Dodd, R.K. and Bullough, R.K., 1979, 'The generalised Marchenko equation and the canonical structure of the A.K.N.S.-Z.S. method', Physica Scripta 20, 364-381.
61. Caudrey, P.J., 1990, 'Spectral transforms' in "Soliton theory: a survey of results" edited by Allan P. Fordy (Manchester: Manchester University Press) pp.25-54 [ISBN 0 7190 1491 3] also see Caudrey, P.J., 1980, Phys. Letts. A 79, 264 referenced.
62. Caudrey, P.J., 1990, 'Two dimensional spectral transforms' in "Soliton theory: a survey of results" edited by Allan P. Fordy (Manchester: Manchester University Press) pp.55-74 [ISBN 0 7190 1491 3].
63. Bullough, R.K. and Bogoliubov, N.M., 1992, 'Quantum groups: q-boson theories of integrable models and application to nonlinear optics' in "Proc. III Potsdam-V Kiev Intl. Workshop on Nonlinear Processes in Physics" edited by A.S. Fokas, D.J. Kaup, A.C. Newell and V.E. Zakharov (Berlin: Springer-Verlag) pp. 232-240.
64. Fukuma, M., Kawai, H., and Nakayama, Ryuichi, 1991, Int. J. Modern Physics A6(8), 1385-1406.
65. Aoyama, S., and Kodama, Y., 1992, Phys. Lett. B278, 56-62.
66. Migdal, A.A., 1995, 'Quantum Gravity as Dynamical Triangulation' in "Statistical Mechanics and Field Theory" edited by V.V. Bazhanov and C.J. Burden (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 214-252.
67. Bullough, R.K., and Olafsson, S., 1989, 'Algebra of Riemann-Hilbert Problems and the Integrable Models - a sketch' in "Proc. XVII Intl. Conference on Differential Geometric Methods in Theoretical Physics" edited by Allan I. Solomon (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp.295-309 [ISBN 9971 50 836 2].
68. Bullough, R.K., and Sasaki, R., 1980, 'Geometry of the AKNS-ZS Inverse Scattering Scheme' in "Nonlinear Evolution Equations and Dynamical Systems" Springer Lecture Notes in Physics 120 edited by M. Boiti, F. Pempinelli and G. Soliani (Berlin: Springer-Verlag) pp.314-337.
69. Drinfel'd, V.G., 1983, 'Hamiltonian structure on Lie groups, Lie bialgebras and the geometric meaning of the classical Yang-Baxter equations', Soviet Math. Dokl., 27, 68-71.
70. Bogoliubov, N.M., Bullough, R.K., and Timonen, J., 1996, 'Exact solution of generalised Tavis-Cummings models in quantum optics', J. Phys. A: Math. Gen. 29, No. 19, 6305-6312.
71. Bullough, R.K., Bogoliubov, N.M. and Puri, R.R., 2000, 'Proc. NEEDS in Leeds meeting 1998', edited by A.P. Fordy and A.V. Mikhailov. Published (in Russian) in the Russian J. of Theor. and Math. Phys., Vol. 122, No. 2, February 2000, pp. 182-204 [English Translation, 2000, Theor. Math. Phys., Vol. 122, No. 2, pp.151-169 ISSN 0040 5779].
72. Puri, R.R., Kumar, S. Arun, and Bullough, R.K., 'Stroboscopic Theory of Atom Statistics in the Micromaser', preprint 2000.
73. Joshi, A., Kremid, A., Nayak, N., Thompson, B.V., and Bullough, R.K., 1996, 'Exact trapping state dynamics for the 85Rb atom micromaser at very high Q and/or very low T', J. Mod. Optics 43, No. 5, 971-992.
74. Bullough, R.K., Joshi, Amitabh, Nayak, N., and Thompson, B.V., 1996, 'The micromaser at very low temperatures' in "Notions and perspectives of nonlinear optics" edited by Ole Keller. Series in Nonlinear Optics (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 13-87 [ISBN 981 02 2627 6].
75. Bullough, R.K. and 10 co-authors, 1989, 'Giant Quantum Oscillators from Rydberg Atoms: atomic coherent states and their squeezing from Rydberg atoms' in "Squeezed and Nonclassical Light" edited by P. Tombesi and E.R. Pike, NATO ASI Series B: Physics Vol. 190 (New York: Plenum Press) pp.81-106 [ISBN 0 306 43084 3].
76. Bogoliubov, N.N., 1947, Journal of Physics, 9, 23; Vestn. MGU 7, 43.
77. Bogoliubov, N.N., Tolmachev, V.V., and Shirkov, D.V., 1959, 'A new method in the Theory of Superconductivity' (New York: Consultants Bureau Inc.) p.8, eqn. (1.8).
78. Huang, K., 1987, Statistical Mechanics (2nd Edn.) (New York: John Wiley and Sons Inc.) Chap. 12, pp.278-304.
79. Bogoliubov, N.N., Tolmachev, V.V. and Shirkov, D.V., 1959, Ref. [77] pp.8-9.
80. Kadanoff, Lev P. and Baym, Gordon, 1962, 'Quantum statistical mechanics' (New York: W.A. Benjamin, Inc.).
81. Bogoliubov, N.N., Tolmachev, V.V. and Shirkov, D.V., 1959, Ref. [77] p.6.
82. Faddeev, L.D., and Popov, V.N., 1965, Sov. Phys. ZhETP 20, 840.
83. Popov, V.N., 1983, 'Functional Integrals in Quantum Field Theory and Statistical Physics' (Dordrecht, Holland: D. Reidel Publ. Co.).
84. Popov, V.N., 1990, 'Functional integrals and collective excitations' (Cambridge: Cambridge University Press) [ISBN 0521 407 877 paperback].
85. Bogoliubov, N.M., Bullough, R.K., Kapitonov, V.S., Malyshev, C., and Timonen, J., 2000, 'Finite-temperature correlations in the trapped Bose-Einstein condensate'. Submitted to Phys. Rev. Lett. at September 1999; resubmitted April 2000.
86. Kleppner, D., 1997, Physics Today, August 11.
87. Bloch, I., Esslinger, T., and Hänsch, T.W., 1999, Phys. Rev. Lett., 82, 3008.
88. Bloch, I., Hänsch, T.W., and Esslinger, T., 2000, Nature 403, 166.
89. DiVincenzo, D. and Terhal, B., 1998, 'Quantum Information. Decoherence: the obstacle to quantum computation' in Physics World, March, pp.53-57.
90. Haroche, S., Nogues, G., Rauschenbeutel, A., Osnaghi, S., Brune, M., and Raimond, J.M., 1999, 'Quantum knitting in cavity QED' in "Laser spectroscopy XIV International Conference" edited by Rainer Blatt, Jürgen Eschner, Dietrich Leibfried and Ferdinand Schmidt-Kaler (Singapore: World Scientific Publ. Co. Pte. Ltd.) pp. 140-149.
91. Jaynes, E.T. and Cummings, F.W., 1963, Proc. IEEE 51, 89.
92. Lloyd, S., 1995, Phys. Rev. Lett. 75, 346.
93. Barenco, A., et al., 1995, Phys. Rev. Lett., 74, 4083.
94. Rauschenbeutel, A., Nogues, G., Osnaghi, S., Brune, M., Raimond, J.-M., and Haroche, S., 1999, 'Generation of GHZ Type Three-atom Correlations in a Cavity QED Experiment' in "Laser Spectroscopy" Ref. [90] pp.364-365.
95. Wadati, Miki and Sakagami, Masa-aki, 1984, 'Classical soliton as a limit of the quantum field theory', Journal of the Physical Society of Japan, 53, No. 6, pp. 1933-1938.
96. Bullough, R.K. and Timonen, J.T., 1995, Ref. [18], p.29.
97. Spälter, S., Korolkova, N., König, F., Sizmann, A., and Leuchs, G., 1998, Phys. Rev. Lett. 81, 786.
98. Schmitt, S., Ficker, J., Wolff, M., König, F., Sizmann, A., and Leuchs, G., 1998, Phys. Rev. Lett. 81, 2446.
99. Drummond, P.D., Shelby, R.M., Friberg, S.R., and Yamamoto, Y., 1993, Nature, 365, 307.
100. Friberg, S.R., Machida, S. and Yamamoto, Y., 1992, Phys. Rev. Lett. 69, 3165.
101. Lakshmanan, M. and Bullough, R.K., 1980, 'Geometry of generalised non-linear Schrödinger and Heisenberg ferromagnet spin equations with linearly x-dependent coefficients', Phys. Lett. 80A, 287-292.
102. Zakharov, V.E., 1991, 'What is integrability?' Springer Series in Nonlinear Dynamics (Berlin: Springer-Verlag).
103. Jiang, Zuhan and Bullough, R.K., 1995, Physica Scripta 51, 545-548, and references.
104. Bullough, R.K. and Olafsson, S., 1989, 'Complete integrability of the integrable models: quick review' in "IXth Intl. Congress on Math. Phys." edited by B. Simon, A. Truman and I.M. Davies (Bristol: Adam Hilger) pp. 329-334 [ISBN 0 85274 250 9]. Unfortunately the relevant paper called OB in this paper was never completed.
105. Bullough, R.K., Caudrey, P.J., and Gibbs, H.M., 1980, 'The Double Sine-Gordon Equations: A Physically Applicable System of Equations' in "Solitons" Ref. [10] pp.107-141.
106. Feynman, R.P. and Hibbs, A.R., 1965, Quantum Mechanics and Path Integrals (New York: McGraw-Hill Inc.) and earlier work.
107. Bogoliubov, N.M., Bullough, R.K. and Timonen, J., 1994, 'Critical behaviour of strongly coupled boson systems in 1+1 dimensions', Phys. Rev. Lett. 72, No. 25, 3933-3936.
108. Baxter, R.J., 'Exactly Solved Models in Statistical Mechanics' (London: Academic Press Inc. (London) Ltd.) [ISBN 0 12 083180 5].

Note added in proof: The reader will find articles relevant to Section 5 of this paper, 'Quantum Information', in Ref. [13], and in articles 26. and 27. of Ref. [13] in particular: article 27. references a number of fundamental papers on 'non-locality', 'complexity', entanglement, quantum teleportation, and the (unsolved) problems of quantum measurement, all within the general topic of quantum mechanics, which I could at best scarcely touch on in my Section 5. My own reference to 2-dimensional quantum gravity (perhaps relevant to these problems) is in §6 of [19]. I also draw attention in this connection to Note 1 on p.30 of [19].

In a further additional note I add some comments on functional integration in the contexts of this paper. Statistical mechanical partition functions $Z = \int \mathcal{D}\mu\, \exp S[p]$ as functional integrals with a measure $\mathcal{D}\mu$ for one space and one time dimension (1 + 1 dimensions) are introduced into the text following Eqn. (27). This is in reference to this particular functional integral appearing in the 'map' Fig. 8 at its top right corner. The classical action $S[p]$ in this expression is described in terms of the action-angle variables typified by those appearing in the text below Eqn. (24) for the sine-Gordon model, extended however to periodic boundary conditions as is explained in Ref. [18]. This Ref. [18] develops such a statistical mechanics for all of the integrable models in 1 + 1 dimensions in terms of such partition functions $Z$.

Functional integration is also used implicitly in §4 of the text, but these functional integrals are evaluated in terms of the fields $\phi(r, t)$ rather than in terms of action-angle variables in order to compute both the partition functions $Z$ and the correlation functions $G(r, r')$ in each of d = 3, 2 and 1 space dimensions given in Eqns. (38), (39) and (40) respectively, and these functional integral calculations are presented as such in Ref. [85] ($r \in \mathbb{R}^d$ is a vector in $d$ dimensions, and for the nonlinear Schrödinger models transformation to action-angle variables is not possible in d + 1 dimensions for d > 1 [18]).

The 'map' Fig. 8 makes a second reference to a functional integral, namely in the 'box' at bottom right "Knot (link) Polynomials Partition Functions Z, Jones Polynomials". As a partition function $Z$ this connects (tenuously) with the partition function $Z$ at top right (see the dashed line connecting from the box top right, via "Invariants of some manifold" to the "Knot" box bottom right). The functional integral $Z$ for the "Knot" box is not quoted explicitly but is written in an invariant form independent of metric as $Z[A] = \int \exp(iS(A))\,\mathcal{D}A$ where $S(A) = S$ is the integral of the Chern-Simons 3-form given in the box at absolutely bottom right and $i = \sqrt{-1}$. This "3-form" box connects directly to the "Knot" box and the reference here is to Witten, E., 1989, 'Some geometrical applications of quantum field theory' in Proc. IXth Intl. Congress on Math. Phys. edited by B. Simon, A. Truman and I.M. Davies (Bristol, Adam Hilger) (ISBN 0-852-74-250-9) p.81. But readers might also see Johnson, Gerald W. and Lapidus, Michael L., 2000, 'The Feynman Integral and Feynman's Operational Calculus' (Oxford, Clarendon Press) p. 643. For further geometrical aspects see the same book pp. 637-659 (say) and concerning Witten's 'topological invariants' of his paper see particularly the quote from Atiyah on p.641 in reference to current interest in relations between geometry and physics. Unfortunately there is an obvious, but long standing, error in the 'map' Fig. 8 where the Chern-Simons 3-form should read $\mathrm{Tr}(A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A)$ with the extra '
Department of Mathematics, UMIST
P O Box 88
Manchester M60 1QD
UK
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. P. Obada © 2001 World Scientific Publishing Co. (pp. 123-140)

Concepts for non-smooth dynamical systems

Tassilo Küpper
Math. Inst., Univ. of Cologne, Weyertal 86-90, D-50931 Cologne, Germany. kuepper@mi.uni-koeln.de

April 27, 2000

Abstract

We present some concepts within the area of dynamical systems which have been extended to non-smooth differential equations. These include the definition of Lyapunov exponents, extensions of the Conley index and of KAM theory, an adaptation of the Melnikov technique for the detection of chaos, and an approach to generalizing Hopf bifurcation.

Keywords: Non-smooth dynamical systems, bifurcation, chaos.
AMS Classification: 34A60, 34Cxx, 70K50.

1 Introduction
The area of dynamical systems can be considered as one of the topics that have governed mathematical research in the past century. Starting with Poincaré at the end of the 19th century, great progress has been achieved in investigating the qualitative behaviour of evolutionary problems. A fascinating review of that kind of research has been provided by Palis during this conference. Important achievements in the study of dynamical systems rely on smoothness properties of the systems, since in many cases linearization techniques are employed. For non-smooth systems such techniques are
not at hand, and for that reason new methods have to be developed. Because of such difficulties, effects leading to non-smoothness have been neglected for a long time. The need for a better understanding of dynamical processes in engineering requires improved modeling that includes effects like dry friction, impacts, discontinuous switches etc., see [38, 39, 4]. Moreover, some recent applications are based on a direct use of such non-smooth effects, like stick-slip motions, see [2].

Discontinuities may arise in an obvious way from geometrical properties such as corners in a billiard. The lack of differentiability implies that there is no uniquely defined tangent in a corner, which might lead to non-uniqueness in the evolution of a dynamical system. Impacts usually cause jumps in the velocity components of a mechanical system; an elementary example is provided by the impact oscillator, and more realistic situations can be found in machine dynamics. An important source of non-smooth behavior is dry friction arising in dampers, drilling processes or rail-wheel contacts, audible as creaking. In a recent application, efficient use has been made of stick-slip motions due to dry friction in a micromechanical positioning system [2]. State-dependent switches in electrical, physical or biological systems also lead to differential equations involving state-dependent discontinuities.

We list a few examples which have been used as model examples in the investigation of non-smooth systems.

(i) Impact oscillator with external forcing $f(t)$ and damping constant $r$ at the reflection point ([4]):

$$\ddot x(t) + x(t) = f(t) \quad (x(t) < a), \qquad \dot x(t^+) = -r\,\dot x(t^-) \quad (x(t) = a)$$

The forced impact oscillator is one of the simplest examples where the new effect of grazing bifurcation can be illustrated.

Figure 1.1
(ii) Rolling ball on a symmetrical surface with corners

Figure 1.2

$$\ddot x + g \sin\alpha \cos\alpha \,\mathrm{sgn}(x) = 0$$

(iii) Rocking block (Housner 1956)

Figure 1.3

$\ddot\varphi + \dfrac{mgh}{J_0}\,[\ldots]$, with separate branches for $\varphi > 0$ and $\varphi < 0$

(iv) Pendulum with friction (Reissig [40, 41, 42, 43], Kunze [23])

Figure 1.4 (labels: $x = 0$, $x(t)$, $p(\Omega t)$)

$$\ddot x = -r\dot x - c\,\mathrm{sgn}(\dot x) - kx + p(\Omega t)$$

(v) Single mass friction oscillator ([15, 38, 39, 31, 23, 24])
Figure 1.5 (labels: $x_c$, $x_c + x(t)$)

The effect of dry friction has been studied by several authors with the help of the friction oscillator. A block of mass $m$ is positioned on a belt moving with constant velocity $v_0$. The block is attached elastically to a wall through a spring. An external periodic forcing $u(t)$ is applied at the position of the spring. Let $x(t)$ denote the position of the block. The corresponding differential equation for $x(t)$ is of the form

$$\ddot x + \omega_0^2 x = m^{-1} F_R(\dot x - v_0) + \omega_0^2 u(t).$$

The friction force $F_R(v) = -\mathrm{sgn}(v)\,F_N\,\mu(|v|)$ depends on the friction characteristic $\mu$. Various models based on theoretical and experimental investigation have been used [31, 15].

(vi) Multiple mass friction oscillator ([31])

Figure 1.6
The multiple mass friction oscillator is built in a similar way. We have studied an oscillator with two coupled masses leading to a system of fourth order for the relative position of the blocks

$$m_1\ddot x_1 = -c_1 x_1 - d_1\dot x_1 - F_{R1}(\dot x_1 - v_0) + c_2(x_2 - x_1) + d_2(\dot x_2 - \dot x_1)$$
$$m_2\ddot x_2 = -c_2(x_2 - x_1) - d_2(\dot x_2 - \dot x_1) - F_{R2}(\dot x_2 - v_0)$$

The friction forces $F_{R1}$, $F_{R2}$ are given in a similar form as in (v).

(vii) Neural networks

A simple example describing the dynamics of 2 neurons is given by

$$\dot x = -x + q_{11} f(x) - q_{12}\, y, \qquad \dot y = -y + q_{21} f(x).$$

The response function $f(x)$ is piecewise constant (i.e. $f(x) = \mathrm{sgn}(x)$) or a smooth approximation. A detailed analysis is given in [16].

Experimental observations as well as numerical simulations, applied both directly to the non-smooth systems and to mollified approximations, indicate that the standard scenario of bifurcations such as saddle-node bifurcations, the onset of periodic motions, and period doubling up to chaotic behaviour can occur [38, 39, 34, 15, 45]. Just by looking at experimentally observed data it is difficult to distinguish between, for example, periodic orbits of high order, quasiperiodic or chaotic behaviour. For smooth systems Lyapunov exponents provide a useful tool to classify various states. The notion of the standard Lyapunov exponents and their interpretation require the linearized flows, hence smoothness.

Within the frame of a DFG-Schwerpunktprogramm [25] we have extended concepts from the classical theory of dynamical systems to the non-smooth case. In this lecture we will mainly report on results concerning Lyapunov exponents, but first we briefly review a few other areas which have been investigated.

(i) Analysis of piecewise linear planar systems. For the symmetric case a complete description of the bifurcation behaviour has been obtained in [14]. Further studies concerning piecewise linear systems are treated for example in [22].
(ii) The Conley index is a topological method to prove existence and, in a generalized sense, bifurcation. In [26] we have extended classical results. The method is illustrated by an example describing the motion of wings.

(iii) Usually KAM theory requires extreme smoothness. Using a change of variables we have been able to extend some results to problems with a lack of smoothness [27].

(iv) Perturbations of planar systems with homoclinic orbits lead to chaotic behaviour. A well-established tool to analyze the influence of the perturbation is provided by the Melnikov function. In [47] we derive a generalized Melnikov function and show that similar results hold if the homoclinic orbit crosses the discontinuity.

(v) For smooth systems Hopf bifurcation is characterized by the crossing of a pair of complex-conjugate eigenvalues through the imaginary axis. Geometrically this is equivalent to the change of a stable focus to an unstable focus. While the analytical approach is not available for non-smooth systems due to the lack of a linearized problem, the geometrical setting might be used. This generalized concept for the onset of periodic orbits is studied in [32], and it turns out that there are two different mechanisms for planar systems.
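The impact oscillator listed as model example (i) in this Introduction lends itself to a small numerical experiment. The following Python sketch is only an illustration under stated assumptions, not the method of the cited works: it assumes the model $\ddot x + x = f(t)$ for $x < a$ with the impact law $\dot x \to -r\,\dot x$ at $x = a$, integrates the smooth part with a classical Runge-Kutta step, and locates each impact by bisection on the step size. The function names, the step size and the cap on the number of impacts are hypothetical choices made for this sketch.

```python
import math


def rk4_step(rhs, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = rhs(t, y)."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = rhs(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
            for yi, s1, s2, s3, s4 in zip(y, k1, k2, k3, k4)]


def simulate_impact_oscillator(forcing, a, r, x0, v0, t_end,
                               h=1e-3, max_impacts=1000):
    """Integrate x'' + x = forcing(t) while x < a; at x = a set v -> -r * v.

    Assumed model form (see the lead-in). Returns the final time, the final
    state [x, v] and a list of (impact_time, velocity_before_impact) pairs.
    """
    def rhs(t, y):
        x, v = y
        return [v, forcing(t) - x]

    t, y, impacts = 0.0, [x0, v0], []
    # max_impacts guards against endless chattering at the wall.
    while t < t_end and len(impacts) < max_impacts:
        y_new = rk4_step(rhs, t, y, h)
        if y_new[0] <= a:
            # No impact inside this step: accept it.
            t, y = t + h, y_new
            continue
        # The wall x = a is crossed inside the step: bisect on the sub-step.
        lo, hi = 0.0, h
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if rk4_step(rhs, t, y, mid)[0] > a:
                hi = mid
            else:
                lo = mid
        x_hit, v_hit = rk4_step(rhs, t, y, lo)
        t += lo
        impacts.append((t, v_hit))
        # Impact law: position stays at the wall, velocity is reversed and damped.
        y = [a, -r * v_hit]
    return t, y, impacts


if __name__ == "__main__":
    # Unforced test case: start at x = 0 with speed 2, wall at a = 1, r = 0.5.
    _, _, hits = simulate_impact_oscillator(lambda t: 0.0, 1.0, 0.5, 0.0, 2.0, 10.0)
    for time, speed in hits:
        print(f"impact at t = {time:.4f}, speed before impact = {speed:.4f}")
```

In the unforced case the motion between impacts is harmonic, so the impact speed shrinks by the factor $r$ from one impact to the next; this gives a simple check on the event location. A smoothed model would avoid the bisection but would only approximate the jump in velocity.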
2 The general setting of non-smooth systems

The mathematical treatment of non-smooth differential equations requires an extension of the standard notion of a solution. An appropriate definition is offered by the class of differential inclusions, in the way that a (non-smooth) differential equation such as

$$\dot x(t) = f(x(t)) \qquad (2.1)$$

is replaced by a differential inclusion

$$\dot x(t) \in F(x(t)). \qquad (2.2)$$

The function $x(t)$ is called a solution of (2.2) on some interval $I$ if

(i) $x(t)$ is absolutely continuous on $I$ (so that $\dot x(t)$ exists a.e.) and

(ii) $\dot x(t) \in F(x(t))$ a.e. in $I$.

Here $F(x)$ denotes a suitable set-valued function. Usually the closed convex hull of all limits is taken. A formal definition is given by

$$F(x) := \bigcap_{\delta > 0}\ \bigcap_{\mu(N) = 0} \mathrm{cl}\big(\mathrm{conv}\big(f(\{\xi : \|x - \xi\| < \delta\} \setminus N)\big)\big)$$

The following examples illustrate some obvious choices for $F$ and some typical difficulties which may arise.

Example (i) $\dot x(t) = \mathrm{sgn}(x(t)) =: f(x(t))$. Then

$$F(x) = \begin{cases} \mathrm{sgn}(x) & \text{if } x \neq 0 \\ [-1, 1] & \text{if } x = 0 \end{cases}$$
Solutions outside the line of discontinuity $x = 0$ are well-defined. For initial data $x_0 = 0$ there are the solutions

a) $x(t) = 0$ for $t \le 0$, $x(t) = t$ for $t > 0$, or
b) $x(t) = 0$ for $t \le 0$, $x(t) = -t$ for $t > 0$, or
c) $x(t) = 0$, $t \in \mathbb{R}$.

For that problem uniqueness is violated in forward time while it holds in backward time. The reverse situation is treated in

(ii) $\dot x(t) = -\mathrm{sgn}(x(t))$.

While in (i) the trivial solution $x = 0$ is unstable, it is stable in (ii).

(iii) $$\dot x(t) = -\tfrac{1}{2}\mathrm{sgn}(x(t)) + \cos(t) \in \begin{cases} -\tfrac{1}{2} + \cos t & x(t) > 0 \\ [-\tfrac{1}{2}, \tfrac{1}{2}] + \cos t & x(t) = 0 \\ \tfrac{1}{2} + \cos t & x(t) < 0 \end{cases}$$

While in (ii) the trajectory eventually stays in the stationary solution $x = 0$ (a critical point of the set-valued function), in (iii) it leaves the critical point after a finite amount of time. The motion in the line of discontinuities is governed by a reduced differential equation which is determined by projections.
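The set-valued right-hand sides used in these examples can be written down directly. The Python sketch below is an illustration only, with hypothetical function names: it represents the set-valued replacement of sgn as a closed interval, builds the right-hand side of Example (iii), and checks at which times $0$ belongs to the set at $x = 0$, which is the condition for a trajectory to be able to rest on the line of discontinuity.

```python
import math


def filippov_sgn(x):
    """Interval (lo, hi) replacing sgn(x): one value away from 0, [-1, 1] at 0."""
    if x > 0:
        return (1.0, 1.0)
    if x < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)


def rhs_interval(x, t):
    """Set-valued right-hand side of Example (iii): -1/2 * sgn(x) + cos(t)."""
    lo, hi = filippov_sgn(x)
    c = math.cos(t)
    # Multiplying by -1/2 swaps the end points of the interval.
    return (-0.5 * hi + c, -0.5 * lo + c)


def can_rest_at_zero(t):
    """True if 0 lies in the set at x = 0, so x = 0 is admissible at time t."""
    lo, hi = rhs_interval(0.0, t)
    return lo <= 0.0 <= hi


def first_exit_time(t0, dt=1e-4, t_max=100.0):
    """Scan forward from t0 until resting at x = 0 is no longer admissible."""
    t = t0
    while t < t_max and can_rest_at_zero(t):
        t += dt
    return t


if __name__ == "__main__":
    print(rhs_interval(0.0, 0.0))
    print(can_rest_at_zero(math.pi / 2))
    print(first_exit_time(math.pi / 2))
```

Resting at $x = 0$ is admissible exactly when $|\cos t| \le 1/2$, so a trajectory sitting at the origin at $t = \pi/2$ has to leave at $t = 2\pi/3$. The scan in `first_exit_time` only approximates this time to within the step `dt`.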
General results concerning the theory of differential inclusions can be found in Filippov [13] and Deimling [6], where in particular the standard concepts concerning existence, uniqueness, continuous dependence and stability are covered. For example, the difference with respect to uniqueness in Examples (i) and (ii) is captured by the notion of a one-sided Lipschitz condition, which is satisfied in case (ii) and does not hold in case (i). In our approach we are rather concerned with differential inclusions treated as dynamical systems, hence we focus on qualitative properties such as stability and bifurcations. For the study of the long-time behavior, uniqueness of solutions in forward time is of great relevance. The lack of smoothness causes difficulties which become obvious whenever linearization is needed. It is of course possible to avoid such difficulties with the help of smoothing techniques: when they are applied, classical results become available, but only approximate information is at hand. For a good understanding the limit procedure must be carried out. Another approach to smoothing is based on the embedding of the original dynamical system in a differential delay equation. Problems at discontinuities are reduced to straightforward integration, but again limits must be taken. It has been shown in [7] that this approach is equivalent to the differential inclusion approach. Although it is still an interesting problem to study the limit procedure in a systematic way, we prefer to attack the non-smooth problems directly. As a first example we have studied whether Lyapunov exponents can be defined in a suitable way, and whether they provide similar information as in the smooth case. Straightforward numerical simulations for a series of problems showed a remarkable agreement between the information gained by formal computations of the Lyapunov exponents and the corresponding bifurcation diagrams.
For that reason we tried to determine a class of non-smooth problems where Lyapunov exponents could be defined. This approach could be carried out either for differential equations directly or for the corresponding Poincaré maps. In the example of the friction oscillator, Hubbuch [18] worked out parameter regions where Lyapunov exponents could be given in a meaningful sense for Poincaré maps, and also regions where that approach failed.
3
Lyapunov exponents
Lyapunov exponents provide a well-established tool to characterize the long-time behaviour of dynamical systems; for a review see Eckmann-Ruelle [12]. First we collect a few facts concerning the definition of Lyapunov exponents and their interpretation. For a smooth system $\dot{x} = f(x)$ with flow $\varphi(t, x_0)$ we compare the evolution of nearby trajectories. Assume that
$$\varphi(t, \tilde{x}_0) - \varphi(t, x_0) \approx \partial_x \varphi(t, x_0)(\tilde{x}_0 - x_0) = \partial_x \varphi(t, x_0) z_0.$$
The quantity
$$\lambda(x_0, z_0) := \limsup_{t \to \infty} \frac{1}{t} \ln\bigl(\|\partial_x \varphi(t, x_0) z_0\|\bigr)$$
may be considered as a measure for the long-time evolution. Immediately two questions arise:
1. Does the lim sup exist as a true limit?
2. If so, when is the limit independent of $z_0$?
For the simple linear case with constant coefficients $\dot{x} = Ax$ we have $\varphi(t, x) = e^{tA}x$, hence $\partial_x \varphi(t, x) = e^{tA}$.
It can be shown using Oseledec's multiplicative ergodic theorem ([35]) that the true limit exists and there are $n$ Lyapunov exponents $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ which are independent of the initial data. Moreover, the Lyapunov exponents can be used to characterize attractors:
1. If all the Lyapunov exponents are negative there is a stable fixed point.
2. If there is a periodic orbit at least one Lyapunov exponent vanishes. If all other Lyapunov exponents are negative the periodic orbit is asymptotically stable.
3. If $k$ Lyapunov exponents vanish and all the other ones are negative there is an attracting quasiperiodic $k$-dimensional torus.
4. If at least one Lyapunov exponent is positive there is chaotic motion.
To extend the notion of Lyapunov exponents we assume the following setting:
1. The phase space $\mathbb{R}^n$ is separated into submanifolds $M_1, \dots, M_k, M_\infty$ such that $\mathbb{R}^n = \bigcup_{i=1}^{k} M_i \cup M_\infty$.
2. In each manifold the dynamical system is represented by a smooth system $\dot{x} = f_i(x)$ ($x \in M_i$) where $f_i \in C^1(\mathbb{R}^n, \mathbb{R}^n)$.
3. There are well-defined switching conditions for the transition from one manifold to another. This implies uniqueness in forward time.
We further assume that there is no accumulation of switching times and that the switching is continuous, hence we allow simple crossing from one manifold to another or stick-slip motions but no jumps, which for example occur in the impact oscillator. Under those conditions there is a uniquely defined flow, and it is possible to define piecewise a linearization as well as a linearized transition from one manifold to another. This local process has already been worked out by
Müller [33]. It is the merit of Kunze [23] that he derived a set of hypotheses which guarantee a global result, although these hypotheses are difficult to check for complex settings. Kunze has succeeded in working out all the details for the friction pendulum. Without listing all the technical assumptions we state the theorem which guarantees the existence of Lyapunov exponents almost everywhere.
Theorem 3.1 (Kunze [23], Michaeli [31]) There exists $G \subset \mathbb{R}^n$ such that
1. Lyapunov exponents are defined in $G$,
2. $G$ is 'large', i.e. $\mathbb{R}^n \setminus G$ is a set of measure zero.
On the basis of that theorem the numerical computation of Lyapunov exponents is justified. As far as the interpretation is concerned, Michaeli [31] has confirmed the stability of periodic orbits for non-smooth systems: If $\Gamma$ is a periodic orbit for a piecewise smooth system and if all Lyapunov exponents except the leading one are negative then $\Gamma$ is asymptotically stable. For a large class of examples we have carried out numerical computations of Lyapunov exponents. A comparison with the complete bifurcation diagram confirms their usefulness; for a review see [25]. In the case of the friction pendulum a direct comparison with analytical results is available. For large parameter areas we obtain agreement between analytical and numerical results. A new approach to characterize attractors is based on linearization techniques and on the computation of an invariant measure related to the attractors. The geometrical shape of an attractor provides useful information. For the understanding of the dynamics it is useful to know the time spent in various parts of the attractor. That kind of information is contained in the invariant measures. Dellnitz et al. [8, 9, 10] have followed this approach and developed techniques for the computation of attractors and invariant measures as well as for their visualization.
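The definition of the Lyapunov exponent given at the beginning of this section can be checked numerically in the linear constant-coefficient case, where the linearized flow is simply the solution of $\dot{z} = Az$. The following is only a sketch under stated assumptions: the matrix `A`, the initial direction `z0`, the Runge-Kutta integrator and the function names `rk4_step` and `leading_exponent` are illustrative choices, not part of the original text. The vector is renormalized after every step and the logarithms of the norms are accumulated, so the result approximates $(1/t)\ln\|e^{tA}z_0\|$.

```python
import math

def rk4_step(A, z, h):
    """One classical Runge-Kutta step for the linear system z' = A z."""
    n = len(z)

    def mul(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def shifted(v, w, c):
        return [vi + c * wi for vi, wi in zip(v, w)]

    k1 = mul(z)
    k2 = mul(shifted(z, k1, h / 2))
    k3 = mul(shifted(z, k2, h / 2))
    k4 = mul(shifted(z, k3, h))
    return [z[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(n)]

def leading_exponent(A, z0, t_end, h=0.01):
    """Approximate (1/t) * ln ||exp(tA) z0|| at t = t_end with renormalization."""
    z = list(z0)
    log_growth = 0.0
    steps = int(round(t_end / h))
    for _ in range(steps):
        z = rk4_step(A, z, h)
        norm = math.sqrt(sum(c * c for c in z))
        log_growth += math.log(norm)   # accumulate growth before rescaling
        z = [c / norm for c in z]
    return log_growth / (steps * h)

# Illustrative matrix with eigenvalues -1 and -2 (an assumption for this sketch).
A = [[-1.0, 1.0], [0.0, -2.0]]
estimate = leading_exponent(A, [1.0, 1.0], t_end=50.0)
print(estimate)  # approaches the largest real part of the eigenvalues, -1
```

For a generic initial direction only the largest exponent is seen this way; computing the full spectrum would require propagating and re-orthogonalizing several vectors.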
It is a special feature occurring in non-smooth systems that trajectories may remain in discontinuities for some finite time. Such properties should be made visible by the corresponding invariant measures. Vosshage [46] has adapted the techniques developed by Dellnitz et al. [8, 9, 10] to non-smooth problems. We use the simple example of the friction oscillator to illustrate some of the features. The equations in normalized form are given by
$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = -x_1 - \operatorname{sgn}(x_2) + \gamma \sin(x_3), \qquad \dot{x}_3 = \eta.$$
To illustrate the complementary views provided by the different approaches we consider the friction oscillator in the resonant case $\eta = 1$. Figure 3.1 shows the bifurcation diagram and Figure 3.2 illustrates the Lyapunov exponents. For $\gamma \in [0, 1]$ there is a continuum of stationary solutions. For that reason two Lyapunov exponents vanish, and the third one is equal to $-\infty$. At $\gamma = 1$ there bifurcates a branch of periodic solutions which is asymptotically stable. The leading Lyapunov exponent vanishes, the second one is negative and the third Lyapunov exponent is equal to $-\infty$, indicating that a stick-slip motion occurs, hence a partial reduction to a system of lower order. For $\gamma = 4/\pi$ there are infinitely many periodic solutions, and for $\gamma > 4/\pi$ all solutions are unbounded. There is no invariant measure, and for that reason there are no Lyapunov exponents, although a formal numerical computation seems to give some numbers which could naively be interpreted as Lyapunov exponents. This example should therefore serve as a warning that a formal computation of Lyapunov exponents without a sound mathematical justification is sometimes misleading. The periodic solution, say for $\gamma = 10/9$, shows a small sticking phase (see [31]); this goes along with the fact that one Lyapunov exponent is equal to $-\infty$. The attractor as computed by the box method is shown in Figures 3.3 and 3.4; of course the attractor would shrink to the periodic orbit if the mesh size of the boxes were refined. The corresponding invariant measure is shown in Figure 3.5. Here it is obvious that a significant amount of time is spent in the discontinuity. Further examples have been worked out by Vosshage [46].
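The sticking behaviour of the friction oscillator described above can be reproduced with a very simple simulation. This is a rough sketch, not the box method or any scheme used by the authors: it assumes an explicit Euler step, the initial phase $x_3(0) = 0$, and a hand-written stick rule (the mass stays at rest while the remaining force has modulus at most 1, which is when zero belongs to the set-valued right-hand side). The function name `simulate` and the parameter names `gamma` and `eta` for the forcing amplitude and frequency are hypothetical.

```python
import math

def simulate(gamma, eta=1.0, x1=0.0, x2=0.0, t_end=50.0, h=1e-3):
    """Explicit Euler with a simple stick rule; returns (final x1, max |x2| seen)."""
    t = 0.0
    max_speed = 0.0
    for _ in range(int(round(t_end / h))):
        # Force without the friction term; x3 = eta * t because x3(0) = 0 is assumed.
        force = -x1 + gamma * math.sin(eta * t)
        if x2 == 0.0 and abs(force) <= 1.0:
            # Stick phase: friction can balance the force, the velocity stays zero.
            new_x2 = 0.0
        else:
            # Slip phase: friction opposes the motion (or the imminent motion).
            direction = math.copysign(1.0, x2 if x2 != 0.0 else force)
            new_x2 = x2 + h * (force - direction)
            if x2 != 0.0 and new_x2 * x2 < 0.0:
                new_x2 = 0.0  # velocity changed sign: land on the discontinuity
        x1 += h * x2
        x2 = new_x2
        t += h
        max_speed = max(max_speed, abs(x2))
    return x1, max_speed

print(simulate(0.5))         # forcing below the friction threshold: the mass never moves
print(simulate(10 / 9)[1])   # forcing above the threshold: slipping occurs
```

Snapping the velocity to zero at a sign change is a deliberate simplification; an event-driven integrator would locate the switching time more accurately.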
Figure 3.1 Bifurcation diagram (regions: stationary solutions, unique periodic solution, unbounded solutions).
Figure 3.2 Lyapunov exponents (horizontal axis $\gamma$ with marked value $4/\pi$; labels: $-\infty, 0, 0$; no Lyapunov exponents).
Figure 3.3 Projection of the attractor on the $(x_2, x_3)$-plane.
Figure 3.4 Projection of the attractor on the $(x_1, x_3)$-plane.
Figure 3.5 Invariant measure (horizontal axis $x_1$).
References
[1] A. A. Andronov, A. A. Vitt & S. E. Khaikin, Theory of Oscillators. Dover Publications, Inc., New York, 1966.
[2] F. Altpeter, Friction modelling, identification and compensation. Thèse No. 1998, EPFL, 1999.
[3] M. di Bernardo, C. J. Budd & A. R. Champneys, Corner collision implies border collision bifurcation. Preprint 2000.
[4] C. Budd & F. Dux, Chattering and related behaviour in impact oscillators. Phil. Trans. R. Soc. London, Ser. A 347, 365-389, 1994.
[5] C. Budd, F. Dux & A. Cliffe, The effect of frequency and clearance variations in single-degree-of-freedom impact oscillators. J. of Sound and Vibration, 184, 475-502, 1995.
[6] K. Deimling, Multivalued Differential Equations. Walter de Gruyter & Co., Berlin, New York, 1992.
[7] K. Deprez, Lösung unstetiger Differentialgleichungen mittels verzögerter Systeme. Diploma thesis, 1993, University of Cologne.
[8] M. Dellnitz, A. Hohmann, O. Junge & M. Rumpf, Exploring invariant sets and invariant measures. Chaos 7, no. 2, 221-228, 1997.
[9] M. Dellnitz & O. Junge, An adaptive subdivision technique for the approximation of attractors and invariant measures. Computing and Visualization in Science 1, 63-68, 1998.
[10] M. Dellnitz & O. Junge, On the approximation of complicated dynamical behaviour. SIAM J. Numer. Anal., 36(2), 491-515, 1999.
[11] A. Dontchev & F. Lempio, Difference methods for differential inclusions: a survey. SIAM Review, 34, 263-294, 1992.
[12] J. P. Eckmann & D. Ruelle, Ergodic theory of chaos and strange attractors. Reviews of Modern Physics, 57, 617-656, 1985.
[13] A. F. Filippov, Differential Equations with Discontinuous Righthand Sides. Kluwer Academic Publishers Group, Dordrecht, 1988.
[14] E. Freire & E. Ponce, Bifurcation sets of continuous piecewise linear systems with two zones. Preprint.
[15] U. Galvanetto, S. R. Bishop & L. Briseghella, Mechanical stick-slip vibrations. Int. J. Bifurcation and Chaos, 5, 637-651, 1995.
[16] F. Giannakopoulos, A. Kaul & K. Pliete, Qualitative analysis of a planar system of piece-wise linear differential equations with a line of discontinuity. Submitted to J. of Diff. Eq., 2000.
[17] H. Hölscher, U. D. Schwarz & R. Wiesendanger, Modelling of the scan process in lateral force microscopy. Surface Science, 375, 395-402, 1997.
[18] F. Hubbuch, Die Dynamik des periodisch erregten Reibschwingers. ZAMM 75, Suppl. I, 51-62, 1995.
[19] F. Hubbuch & A. Müller, Lyapunov Exponenten in dynamischen Systemen mit Unstetigkeiten. ZAMM 75, Suppl. I, 91-92, 1995.
[20] R. A. Ibrahim, Friction-induced vibration, chatter, squeal and chaos, part I. Mechanics of contact and friction. ASME Applied Mechanics Reviews, Vol. 47, no. 7, 209-226, 1994.
[21] R. A. Ibrahim, Friction-induced vibration, chatter, squeal and chaos, part II. Dynamics and modeling. ASME Applied Mechanics Reviews, Vol. 47, no. 7, 227-253, 1994.
[22] P. Ivanov, Stability of periodic motions with impacts. Preprint 2000.
[23] M. Kunze, On Lyapunov exponents for non-smooth dynamical systems with an application to a pendulum with dry friction. To appear in J. Dynamics Diff. Eqs.
[24] M. Kunze & T. Küpper, Qualitative analysis of a non-smooth friction oscillator model. ZAMP 48, 1-15, 1997.
[25] M. Kunze & T. Küpper, Non-smooth dynamical systems. DFG-research report, 1999.
[26] M. Kunze, T. Küpper & J. Li, On the Conley index theory to nonsmooth dynamical systems. J. Diff. Eqs. Vol. 13, 4-6, 479-502, 2000.
[27] M. Kunze, T. Küpper & J. You, On the application of KAM-theory to discontinuous dynamical systems. J. Diff. Eqs. 139, 1-21, 1997.
[28] T. Küpper, Non-smooth Dynamical Systems. CISM Lecture notes, Udine, 1999 (in publication).
[29] R. I. Leine, B. L. van de Vrande & D. H. van Campen, Bifurcations in nonlinear discontinuous systems. Internal report, Eindhoven University of Technology, report number 99.010.
[30] F. Lempio & V. Veliov, Discrete approximations of differential inclusions. Bayreuther Math. Schriften, 54, 149-232, 1998.
[31] B. Michaeli, Lyapunov Exponenten für nichtglatte dynamische Systeme. Dissertation, Köln, Oktober 1998.
[32] S. Moritz, "Hopf-Verzweigung" bei unstetigen planaren Systemen. Diploma thesis 2000, University of Cologne.
[33] P. C. Müller, Lyapunov-Exponenten zeitvarianter nichtlinearer dynamischer Systeme mit Unstetigkeiten. ZAMM, 74, 70-72, 1994.
[34] H. E. Nusse, E. Ott & J. A. Yorke, Border-collision bifurcations: an explanation for observed phenomena. Phys. Rev. E, 49, 1073-1076, 1994.
[35] V. I. Oseledec, A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197-231, 1979.
[36] F. Pfeiffer, Mechanische Systeme mit unstetigen Übergängen. Ing. Arch. 58, 232-240, 1984.
[37] K. Pliete, Über die Anzahl geschlossener Orbits bei unstetigen, stückweise linearen dynamischen Systemen in der Ebene. Diplomarbeit, Köln, September 1998.
[38] K. Popp & P. Stelter, Nonlinear oscillations of structures induced by dry friction. In Nonlinear Dynamics in Engineering Systems, IUTAM Symposium Stuttgart, 233-240, 1989. Ed. W. Schiehlen, Springer, Berlin-Heidelberg-New York, 1990.
[39] K. Popp & P. Stelter, Stick-slip vibrations and chaos. Phil. Trans. Roy. Soc. London A 332, 89-105, 1990.
[40] R. Reissig, Erzwungene Schwingungen mit zäher und trockner Reibung. Math. Nachrichten, 11, 345-384, 1954.
[41] R. Reissig, Erzwungene Schwingungen mit zäher Reibung und starker Gleitreibung, II. Math. Nachrichten, 12, 119-128, 1954.
[42] R. Reissig, Erzwungene Schwingungen mit zäher und trockner Reibung: Ergänzung. Math. Nachrichten, 12, 249-252, 1954.
[43] R. Reissig, Erzwungene Schwingungen mit zäher und trockner Reibung: Abschätzung der Amplituden. Math. Nachrichten, 12, 283-300, 1954.
[44] K. Taubert, Differenzenverfahren für Schwingungen mit trockener und zäher Reibung und für Regelungssysteme. Numer. Math., 20, 379-395, 1976.
[45] L. N. Virgin & C. J. Begley, Grazing bifurcations and basins of attraction in an impact-friction oscillator. Physica D, 30, 43-57, 1999.
[46] C. Vosshage, Visualisierung von Attraktoren und invarianten Mengen in nichtglatten dynamischen Systemen. Diploma thesis 2000, University of Cologne.
[47] Y. Zou & T. Küpper, Melnikov method and detection of chaos for nonsmooth systems. Preprint, 2000.
Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 141-152)
RADICAL THEORY: DEVELOPMENTS AND TRENDS
RICHARD WIEGANDT
1991 Mathematics Subject Classification: 16N80. The support of the Hungarian National Foundation for Scientific Research Grant # T029525 is gratefully acknowledged.
1. Introduction
I intend to give a glimpse of radical theory, of its origin and aim, featuring recent important developments and indicating the trends of research. Attention is focused on decomposition and description of rings which are semisimple with respect to certain radicals, on characterizing and describing classes of rings by closure operations, on ring constructions including most recent results which solve old and difficult problems in the negative, on radical theory in various algebraic and non-algebraic categories and also on cardinality conditions. I am aware that visiting the Pyramids and the Egyptian Museum does not make anyone an egyptologist, though it may be a very instructive (and enjoyable!) tour. In a similar manner, my hope is to provide an overview of the main issues of radical theory which may be useful for the experts and also informative for non-specialists in radical theory. In my opinion radical theory has contributed to the development of mathematics mainly in four aspects:
i) Living up to the original goal, its task has been to prove structure theorems for rings and algebras which are semisimple with respect to certain radicals.
ii) Studying and comparing classes (that is, properties) of rings via closure operations.
iii) Constructing rings which distinguish two given properties of rings. These rings sometimes possess very peculiar properties which may ruin beautiful expectations, but serve the better understanding of the structure of rings.
iv) The infiltration of radical theory into other branches of mathematics (for instance, topology, incidence algebras) opens new dimensions for researchers and enriches the arsenal of investigations.
2. Structure theorems
A central problem of ring theory is to determine the structure of rings in terms of linear transformations. Wedderburn [63] (1908) suggested and Köthe (1930) implemented an ingenious technique. They discarded or ignored a "bad" ideal R of a ring (or algebra) A such that the factor ring A/R has a "nice" structure (that
is, representable by rings of linear transformations on vector spaces). Köthe [38] considered the unique largest nil ideal
$$\mathcal{N}(A) = \sum (I \triangleleft A \mid I \text{ is a nil ring})$$
as a "bad" ideal and determined the structure of $A/\mathcal{N}(A)$, which has no nonzero nil ideals. To each element $a \in \mathcal{N}(A)$ there exists an exponent $n \ge 1$ with $a^n = 0$, whence the elements of $\mathcal{N}(A)$ can be viewed as $n$-th roots of 0. Root in Latin is radix, so Köthe called the "pathological" ideal $\mathcal{N}(A)$ the (nil) radical of A. A ring A is said to be nil semisimple if $\mathcal{N}(A) = 0$. The classical Wedderburn-Artin Structure Theorem describes the structure of a nil semisimple ring under the assumption that A is artinian (that is, A satisfies the descending chain condition (dcc) on left ideals).
Wedderburn-Artin Structure Theorem 2.1. A ring A is artinian and nil semisimple if and only if A is a direct sum of finitely many simple rings $A_1, \dots, A_n$, and each $A_i$ is isomorphic to a matrix ring $M_{k_i}(D_i)$ over a division ring $D_i$.
This Theorem also explains the origin of the attribute "semisimple". The most general but satisfactorily efficient structure theorem describes the structure of Jacobson semisimple rings as subdirect sums of dense rings of linear transformations. The nil radical $\mathcal{N}(A)$ is to some extent not big enough to describe the nil semisimple rings in general. Jacobson [34] introduced a somewhat bigger radical via quasi-regularity. A ring A is quasi-regular if for every element $x \in A$ there exists an element $y \in A$ such that $x + y - xy = 0$. The Jacobson radical $\mathcal{J}(A)$ of a ring A is the unique largest quasi-regular ideal of A. A ring A is said to be (left) primitive if A contains a maximal left ideal L such that $xA \subseteq L$ implies $x = 0$, or equivalently, A has a faithful irreducible A-module (for instance A/L). The Jacobson radical $\mathcal{J}(A)$ can be represented as the intersection
$$\mathcal{J}(A) = \bigcap (I \triangleleft A \mid A/I \text{ is a primitive ring}).$$
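The definition of the Jacobson radical via quasi-regularity can be tried out by brute force on the finite rings $\mathbb{Z}/n\mathbb{Z}$. The following is only an illustrative sketch; the function names `is_quasi_regular`, `ideals` and `jacobson_radical` are hypothetical, and the enumeration of ideals relies on the assumption (true for $\mathbb{Z}/n\mathbb{Z}$) that every ideal consists of the multiples of a divisor of $n$.

```python
def is_quasi_regular(x, n):
    """x is quasi-regular in Z/nZ if x + y - x*y = 0 (mod n) for some y."""
    return any((x + y - x * y) % n == 0 for y in range(n))

def ideals(n):
    """All ideals of Z/nZ: the sets of multiples of d, for each divisor d of n."""
    return [sorted({(d * k) % n for k in range(n)})
            for d in range(1, n + 1) if n % d == 0]

def jacobson_radical(n):
    """Largest ideal of Z/nZ all of whose elements are quasi-regular."""
    candidates = [I for I in ideals(n)
                  if all(is_quasi_regular(x, n) for x in I)]
    return max(candidates, key=len)

print(jacobson_radical(12))       # [0, 6]
print(jacobson_radical(8))        # [0, 2, 4, 6]
print(is_quasi_regular(2, 12))    # True, although 2 is not in the radical of Z/12Z
```

The last line shows why the definition speaks of a quasi-regular ideal: a single quasi-regular element need not belong to the radical, since the ideal it generates may contain elements that are not quasi-regular.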
Theorem 2.2. A ring A is Jacobson semisimple (that is, $\mathcal{J}(A) = 0$) if and only if A is a subdirect sum of primitive rings. A ring A is primitive if and only if A is a dense subring of a ring Hom(V, V) of linear transformations on a vector space V, that is, to any finitely many linearly independent elements $x_1, \dots, x_n \in V$ and arbitrary elements $y_1, \dots, y_n \in V$ there exists an element $t \in A$ such that $t x_i = y_i$ for $i = 1, \dots, n$.
We mention two more theorems describing the structure of simple rings. A subring B of a ring A is called a biideal if $BAB \subseteq B$. We say that a ring A is a strongly locally matrix ring over a division ring D if every finite subset C of A can be embedded into a biideal B of A such that $B \cong M_n(D)$ for some natural number n depending on the size of C.
Litoff-Anh Theorem 2.3. A ring A is simple, $A^2 \neq 0$ and A has a minimal left ideal if and only if A is a strongly locally matrix ring over a division ring.
In the original Litoff Theorem [35] only embeddability in subrings was demanded, and therefore it was not an "if and only if" theorem. The present version is due to Anh [9].
Theorem 2.4 (Beidar [12]). A ring A is a matrix ring over a division ring if and only if A is a prime ring such that to each nonzero element $a \in A$ the subset $aA$ contains a nonzero idempotent and the degrees of nilpotency of the elements in A are bounded.
Every ring A has a unique largest ideal $\tau(A)$ whose additive group is torsion. $\tau(A)$ is called the torsion radical of A. Every ring A also possesses a unique largest von Neumann regular ideal $\nu(A)$, called the von Neumann regular radical of A.
Theorem 2.5 (F. Szász [59]). Every artinian ring decomposes into a direct sum
$$A = \tau(A) \oplus F$$
where F is a uniquely determined ideal of A whose additive group is torsionfree.
Theorem 2.6 ([21], [49]). The von Neumann regular radical $\nu(A)$ is a direct summand in every artinian ring A, and $\nu(A)$ is a nil semisimple artinian ring.
3. Closure operations on classes of rings
Studying the structure of rings, various radicals have been introduced, and all of them share some common properties. Amitsur [4] and Kurosh [40] introduced independently the notion of general radicals. A class $\gamma$ of rings (that is, a property of rings) is said to be a radical class in the sense of Kurosh and Amitsur if
i) $\gamma$ is homomorphically closed,
ii) $\gamma$ has the inductive property: if $A = \bigcup I_\lambda$ and $\{I_\lambda\}$ is an ascending chain of ideals of A such that all $I_\lambda \in \gamma$, then also $A \in \gamma$,
iii) $\gamma$ is closed under extensions: if $I, A/I \in \gamma$, then also $A \in \gamma$.
Every ring A has a unique largest $\gamma$-ideal $\gamma(A)$, called the $\gamma$-radical of A. Each radical class $\gamma$ determines its semisimple class
$$\mathcal{S}\gamma = \{\text{all rings } A \mid \gamma(A) = 0\}.$$
For a radical class $\gamma$ we shall often speak only of a radical $\gamma$. Obviously $\gamma = \{\text{all rings } A \mid \gamma(A) = A\}$. Semisimple classes can be characterized dually to radical classes.
Theorem 3.1 ([43], [53]). A class $\sigma$ of rings is the semisimple class of the upper radical class
$$\gamma = \mathcal{U}\sigma = \{\text{all rings } A \mid A \text{ has no nonzero homomorphic image in } \sigma\}$$
if and only if $\sigma$ is hereditary ($I \triangleleft A \in \sigma$ implies $I \in \sigma$) and closed under subdirect sums and extensions.
Of particular interest are the special radicals introduced by Andrunakievich [8]. A class $\varrho$ of prime rings is said to be a special class if $\varrho$ is hereditary and closed under essential extensions (that is, if I is an essential ideal of a ring A ($I \cap K \neq 0$ for every nonzero ideal K of A) and $I \in \varrho$, then $A \in \varrho$). The radical $\gamma = \mathcal{U}\varrho$ is called a special radical. The nil radical as well as the Jacobson radical are examples of special radicals.
Theorem 3.2 ([8]). Every special radical class $\gamma = \mathcal{U}\varrho$ is hereditary. Moreover, every $\gamma$-semisimple ring A is a subdirect sum of rings in $\varrho$, and also of $\gamma$-semisimple prime rings.
Theorem 3.3 ([31]). A class $\gamma$ of rings is a special radical if and only if $\gamma$ is homomorphically closed, hereditary and satisfies condition
(*) if every nonzero prime homomorphic image of a ring A has a nonzero ideal $I \in \gamma$, then $A \in \gamma$.
Theorem 3.4 ([52]). A class $\sigma$ of rings is the semisimple class of a special radical if and only if $\sigma$ is hereditary, closed under subdirect sums and essential extensions, and satisfies condition
(**) if $A \in \sigma$, then A is a subdirect sum of prime rings in $\sigma$.
A subclass $\varrho$ of rings is called a variety if $\varrho$ is closed under taking homomorphic images, subrings and complete direct sums. As is well-known, the varieties are just the subclasses of rings which can be defined by identities (e.g. $xy = yx$ or $x^n = x$ for a fixed n).
Theorem 3.5 ([11], [30], [45], [67]). For a class $\varrho$ of rings the following conditions are equivalent:
(i) $\varrho$ is a variety which is closed under extensions,
(ii) $\varrho$ is a radical class closed under subdirect sums,
(iii) $\varrho$ is a radical and a semisimple class,
(iv) $\varrho$ is a homomorphically closed semisimple class,
(v) $\varrho$ is a radical class closed under essential extensions,
(vi) $\varrho$ is a variety closed under essential extensions.
These classes have been determined explicitly by Stewart.
Theorem 3.6 ([58]). A class $\varrho$ satisfying any of the conditions of Theorem 3.5 is either $\varrho = \{0\}$ or $\varrho = \{\text{all rings}\}$ or $\varrho$ is the semisimple class of a special radical $\gamma = \mathcal{U}\mathcal{F}$ where $\mathcal{F} = \{F_1, \dots, F_n\}$ is a special class of finite fields $F_1, \dots, F_n$ such that if G is a subfield of some $F_i \in \mathcal{F}$ then also $G \in \mathcal{F}$.
For any class $\gamma$ of rings we define its essential cover $\mathcal{E}\gamma$ as
$$\mathcal{E}\gamma = \{\text{all rings } A \mid A \text{ has an essential ideal } I \in \gamma\}.$$
In view of Theorem 3.5 (iii) and (v) a radical class $\gamma$ is a semisimple class if and only if $\mathcal{E}\gamma = \gamma$.
Theorem 3.7 ([19]). The essential cover $\mathcal{E}\gamma$ of a radical class $\gamma$ is a semisimple class if and only if $\gamma$ is hereditary and has a complement in the lattice of all hereditary radicals.
The classes of Theorem 3.7 are generalizations of radical and semisimple classes and have been explicitly determined in [13] and [14] in terms of matrix rings over finite fields.
4. Ring constructions
Constructing rings with peculiar properties exhibits how delicate or nasty (up to the reader's taste) the behaviour of rings may be. This activity contributes substantially to a better understanding of the structure of rings and gives impetus to further research. To decide that two radical classes (or semisimple classes, or other properties of rings) are different, one has to construct a ring which belongs to one class but not to the other. Two famous examples:
i) Bergman [17] constructed a left primitive ring A which is not right primitive. The Jacobson radical class is the class of all quasi-regular rings, so it is a left- and right-symmetric notion, and the same is true for its semisimple class. Hence in the constructed left primitive ring A, being Jacobson semisimple, the intersection of all right primitive ideals is zero.
ii) Sąsiada [54] (see also Sąsiada and Cohn [55]) constructed a simple prime ring which is also a Jacobson radical ring.
Kasch [37] defined the total Tot(A) of a ring A as the set
$$\operatorname{Tot}(A) = \{a \in A \mid aA \text{ has no nonzero idempotents}\}.$$
Tot(A) is in general not closed under addition, so it need not be an ideal in A. In [16] the Kasch radical class $\mathcal{K}$ is defined as
$$\mathcal{K} = \{\text{all rings } A \mid \text{no nonzero homomorphic image of } A \text{ has 0 total}\}.$$
The question whether the Kasch radical $\mathcal{K}$ is a special one is equivalent to the claim that $\mathcal{K}$ coincides with the radical class
$$\mathcal{K}_p = \{\text{all rings } A \mid \text{no nonzero prime homomorphic image of } A \text{ has 0 total}\}.$$
Beidar [16] gave a quite involved genuine construction for a ring G such that $\operatorname{Tot}(G) = 0$ but $\operatorname{Tot}(G/P) \neq 0$ for every prime ideal P of G. The existence of such a ring G proves that $\mathcal{K} \neq \mathcal{K}_p$, whence the Kasch radical $\mathcal{K}$ is not a special radical, though $\mathcal{K}$ possesses many nice properties (for instance, if L is a left ideal of a ring $A \in \mathcal{K}$ then also $L \in \mathcal{K}$, and if $L \in \mathcal{K}$ is a left ideal of a ring A then $L \subseteq \mathcal{K}(A)$).
For a survey of rings and ring constructions which distinguish the various interesting radicals, the reader is referred to [68]. In the fundamental and seminal paper [38] Köthe posed the following problem.
Köthe's Problem. Does the nil radical $\mathcal{N}(A)$ of any ring A contain every nil left ideal of A?
Although this problem was raised in 1930, it is still open and seems to be the hardest problem in ring theory. An affirmative answer has been verified for many different classes of rings, but the general problem has so far withstood the efforts of several brilliant mathematicians. Köthe's Problem has many equivalent formulations; we present here two of them (Krempa [39]). Is the $2 \times 2$ matrix ring over a nil ring A a nil ring? Is the polynomial ring A[x] over a nil ring A a Jacobson radical ring? For a negative answer it would suffice to construct a nil ring A such that A[x] is not quasi-regular. Recently Agata Smoktunowicz [56] solved an old problem of Amitsur [5], and constructed a nil ring A such that the polynomial ring A[x] is not nil. Since nil rings are always quasi-regular, her result can be considered as an approximation of Köthe's Problem from below. An upper approximation was given by Puczyłowski and Smoktunowicz [50]: the polynomial ring A[x] over a nil ring is always a Brown-McCoy radical ring (that is, A[x] cannot be mapped homomorphically onto a simple ring with unity element). Another famous and hard problem was posed by Levitzki. Does there exist a simple prime nil ring? It is the most recent development in ring theory that Smoktunowicz [57] constructed such a ring S. A negative answer to Köthe's problem could be given by proving that S[x] is not a Jacobson radical ring, or that the $2 \times 2$ matrix ring $M_2(S)$ is not nil.
5. Radical theory in other categories
The infiltration of radical theory into other branches of mathematics opens new dimensions for researchers and enriches the arsenal of investigations. Radical theory can be developed in categories which are similar to that of rings (e.g. groups, modules, nonassociative rings, near-rings, rings with involution, $\Omega$-groups), and also in categories differing considerably from that of rings (S-acts, topological spaces, graphs). It was Kurosh [41] and his collaborators who developed the radical theory for groups. For ordered groups it was done by Chehata and Wiegandt [22]. The radical theory of modules and abelian categories is called torsion theory.
The theory of hereditary torsions is highly developed (see for instance Golan [33]); much less is known on non-hereditary torsion theories (cf. [20]). As mentioned earlier, semisimple classes of associative rings are always hereditary. This remains true for alternative rings and Jordan algebras with 1/2, but not for nonassociative rings in general. In fact, the "nice" radical theory (with hereditary semisimple classes) collapses to that of abelian groups, as proved by Gardner.
Theorem 5.1 ([27]). In the variety of all not necessarily associative rings, if a radical $\gamma$ has a hereditary semisimple class $\mathcal{S}\gamma$ then $\gamma$ depends only on the additive group of the rings: if A and B are rings with isomorphic additive groups and $A \in \gamma$ then also $B \in \gamma$.
The case of near-rings is of particular interest inasmuch as its radical theory degenerates in a nice way. A (right) near-ring N is a not necessarily commutative group $(N, +)$ and a semigroup $(N, \cdot)$ such that the addition and multiplication are linked by the right distributive law
$$(x + y)z = xz + yz \qquad \forall x, y, z \in N.$$
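A standard concrete example of a right near-ring is the set of all maps of a group into itself, with pointwise addition and composition as multiplication. The following sketch checks the axioms by brute force for the maps of $\mathbb{Z}/3\mathbb{Z}$ into itself; the choice of this example and the names `add`, `mul`, `zero` and `const_one` are assumptions made for illustration and do not come from the text.

```python
from itertools import product

n = 3
# A map f: Z/3Z -> Z/3Z is stored as the tuple (f(0), f(1), f(2)); there are 27 of them.
maps = list(product(range(n), repeat=n))

def add(f, g):
    """Pointwise sum (f + g)(i) = f(i) + g(i) mod n."""
    return tuple((f[i] + g[i]) % n for i in range(n))

def mul(f, g):
    """Product f * g := f after g (composition of maps)."""
    return tuple(f[g[i]] for i in range(n))

# Right distributive law (f + g) h = f h + g h holds for all maps.
right_distributive = all(
    mul(add(f, g), h) == add(mul(f, h), mul(g, h))
    for f in maps for g in maps for h in maps
)

# The left distributive law h (f + g) = h f + h g fails in general.
left_distributive = all(
    mul(h, add(f, g)) == add(mul(h, f), mul(h, g))
    for f in maps for g in maps for h in maps
)

zero = (0,) * n        # the zero map
const_one = (1,) * n   # the constant map with value 1

print(right_distributive)              # True
print(left_distributive)               # False
print(mul(const_one, zero) != zero)    # True: x0 differs from 0 for a constant map x
```

The maps fixing 0 form a 0-symmetric sub-near-ring, since for them composition with the zero map gives the zero map again.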
Note that $x0 \neq 0$ may happen. If $x0 = 0$ is true for all $x \in N$, then we speak of a 0-symmetric near-ring. Betsch and Kaarli proved
Theorem 5.2 ([18]). If the semisimple class of near-rings is hereditary, then the corresponding radical class contains all nilpotent near-rings.
The converse, however, is not true: the class of all nil near-rings in the variety of 0-symmetric near-rings is a radical class with non-hereditary semisimple class (Kaarli [36]). Veldsman [61] proved that if a hereditary semisimple class is not the class of all near-rings then it consists entirely of 0-symmetric near-rings.
Theorem 5.3 ([60]). A radical class $\gamma \neq \{0\}$ of near-rings contains all nilpotent near-rings if and only if the semisimple class $\mathcal{S}\gamma$ is weakly homomorphically closed: I
$$(x + y)^* = x^* + y^*, \qquad (xy)^* = y^* x^* \qquad \forall x, y \in A.$$
Examples are commutative rings with the identical involution $x \mapsto x$, real and complex matrices with transposition and adjoint matrix, respectively, and polynomial rings A[x] over an involution ring subject to $x^* = x$. Working in the category of involution rings with homomorphisms preserving also the involution, we have fewer mappings, which makes the situation more difficult. In the case of associative rings the Anderson-Divinsky-Suliński Theorem [6] tells us that for any radical $\gamma$ and for any ideal I of a ring A it always holds that $\gamma(I) \triangleleft A$. This is no longer so for involution rings.
Theorem 5.4 ([46]). For a radical $\gamma$ of involution rings the following are equivalent:
(i) $I \triangleleft A$ implies $\gamma(I) \triangleleft A$ for all ideals in every involution ring A,
(ii) if $A^{*} \in \gamma$ and $A^2 = 0$ then $A^{\circ} \in \gamma$ for every involution $\circ$ on A.
In proving structure theorems for rings, one-sided ideals play a decisive role. In the case of involution rings we cannot use one-sided ideals, because a left ideal L closed under involution must also be a right ideal. Working, however, with biideals closed under involution (called *-biideals) one can prove quite strong structure theorems. Aburawash [1] proved the involutive version of the Wedderburn-Artin Structure Theorem 2.1.
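The matrix example mentioned above can be verified directly. The sketch below checks, for all $2 \times 2$ matrices with entries in $\{-1, 0, 1\}$, that transposition is additive and reverses products. The restriction to this finite sample and the helper names `add`, `mul` and `star` are illustrative assumptions.

```python
from itertools import product

def add(X, Y):
    """Entrywise sum of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))

def mul(X, Y):
    """Usual matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def star(X):
    """Transposition, the candidate involution."""
    return tuple(tuple(X[j][i] for j in range(2)) for i in range(2))

mats = [((a, b), (c, d)) for a, b, c, d in product((-1, 0, 1), repeat=4)]

additive = all(star(add(X, Y)) == add(star(X), star(Y)) for X in mats for Y in mats)
anti_multiplicative = all(star(mul(X, Y)) == mul(star(Y), star(X)) for X in mats for Y in mats)
multiplicative = all(star(mul(X, Y)) == mul(star(X), star(Y)) for X in mats for Y in mats)
involutive = all(star(star(X)) == X for X in mats)

print(additive, anti_multiplicative, involutive)  # True True True
print(multiplicative)                             # False: the order of factors must be reversed
```

The failed last check shows that transposition is not a ring homomorphism but an anti-automorphism of order two.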
Theorem 5.5. An involution ring $A$ is nil semisimple and satisfies dcc on *-biideals if and only if $A$ is a direct sum of finitely many matrix rings $M_{n_i}(D_i)$ with involution over a division ring $D_i$, $i = 1, \dots, r$, and of finitely many involution rings $K_j(D_j)$, $j = 1, \dots, s$, where each $K_j(D_j)$ is a direct sum of a matrix ring $M_{n_j}(D_j)$ and of the opposite ring $M_{n_j}^{op}(D_j)$, and the involution on $K_j(D_j)$ is the exchange involution
$$(x, y)^* = (y, x) \qquad \forall (x, y) \in K_j(D_j).$$
Notice that in $K_j(D_j)$ the only ideals which are closed under involution are $0$ and $K_j(D_j)$, so we may call such involution rings *-simple. A *-simple involution ring is either a simple ring or the direct sum of two simple rings $I$ and $I^{op}$ endowed with the exchange involution. The involutive version of the Litoff-Anh Theorem is due to Aburawash [2].

Theorem 5.6. An involution ring $A$ is *-simple and possesses a minimal *-biideal if and only if every finite subset $C$ of $A$ can be embedded into a *-biideal $B$ of $A$ such that $B \cong M_n(D)$ whenever $A$ is a simple ring, and $B \cong K_m(D)$ whenever $A$ is not simple as a ring. Here $D$ is a fixed division ring and the natural numbers $n$ and $m$ depend on the size of $C$.

The maximal torsion ideal $T(A)$ of an involution ring $A$ is closed under involution, and if $A$ satisfies dcc on principal *-biideals, then $T(A)$ is a direct summand of $A$ ([47]), analogously to the case of rings without involution (cf. Theorem 2.5). Imposing chain conditions on *-biideals for involution rings is a stronger condition than imposing chain conditions on left ideals for rings without involution. This can be seen very well from the next two theorems.

Theorem 5.7 ([15]). If an involution ring $A$ satisfies dcc on *-biideals, then its Jacobson radical $J(A)$ satisfies dcc on additive subgroups. Hence $A$, as a ring, is artinian with artinian Jacobson radical.

The reader is reminded that the Jacobson radical of an artinian ring need not be an artinian ring.

Theorem 5.8 ([15]). For an involution ring $A$ the following are equivalent: (i) the polynomial ring $A[x]$ satisfies the ascending chain condition on *-biideals, (ii) $A$ is a finite direct sum of matrix rings over finite fields (as given in Theorem 5.5).

Theorem 5.8 can be compared with Hilbert's Basis Theorem, which states that the polynomial ring $A[x]$ over a ring $A$ with unity element is noetherian (that is, satisfies the ascending chain condition on left ideals) if and only if $A$ is noetherian.
Let $S$ be a semigroup. An $S$-act $A$ is a set on which the semigroup $S$ operates, that is, to every $s \in S$ and $a \in A$ there is assigned an element $sa \in A$ subject to the rule $s(ta) = (st)a$ for all $s, t \in S$ and $a \in A$. Let $\kappa$ be a congruence relation on $A$. Some $\kappa$-cosets may be $S$-subacts. If $C = \{C_\lambda \mid \lambda \in \Lambda\}$ is a set of pairwise disjoint subacts of $A$, then there may be several
congruence relations such that each $C_\lambda$, $\lambda \in \Lambda$, is a coset. $C$ determines, however, a smallest congruence $\kappa$ in which the cosets are the $S$-subacts $C_\lambda \in C$ and the singletons $\{a\}$ for each $a \in A \setminus (\bigcup C_\lambda)$. The radical theory of $S$-acts was developed by Lex, Amin and Wiegandt [44], [3]. For the radical theory of semifields we refer to [64], [65], [66]. Recently Veldsman [62] developed the general radical theory of incidence algebras.

Topological spaces and graphs differ substantially from ring-like structures. Considering a topological space (or graph) $A$, any partition of the underlying set determines a congruence relation and a factor space (or factor graph, respectively) uniquely, and vice versa. The topologies (or graphs) defined on a fixed set $A$ form a lattice. No algebraic structure has this property. Nevertheless, radical theory can be interpreted for topological spaces ([51], [10]) as well as for graphs ([24]), and even for their common generalization, called abstract relational structures ([25]). Radical properties correspond to connectedness properties and semisimple properties to disconnectednesses. For the most general Kurosh-Amitsur radical theory so far, incorporating that of all kinds of algebraic structures as well as topological spaces and graphs, we refer the reader to [48].

6. Cardinality condition

In the theory of abelian groups it is quite common to impose cardinality conditions and set-theoretic assumptions (see [32]). Analogous questions are meaningful in ring theory, in particular in radical theory, but the situation is much more complicated and far less developed. Let $\gamma$ be a radical of rings (or abelian groups). We say that $\gamma$ satisfies the cardinality condition for ideals (subgroups) if there exists an infinite cardinal number $\alpha$ such that for every ring (abelian group) $A$, if $a \in \gamma(A)$ then $a \in \gamma(B)$ for some $B \triangleleft A$ ($B \subseteq A$, respectively) with $|B| < \alpha$.

Theorem 6.1 ([28]). If a radical $\gamma$ of rings satisfies the cardinality condition for ideals, then $\gamma = \{0\}$.
For a radical $\gamma$ there exists an infinite cardinal number $\alpha$ such that every $\gamma$-ring is the sum of $\gamma$-ideals of cardinality $< \alpha$ if and only if $\gamma = \{0\}$ or $\gamma$ is the class of zero-rings on divisible $P$-groups for a set $P$ of primes.

Theorem 6.2 ([28]). If $\gamma$ is the upper radical of a variety of associative rings, then $\gamma$ satisfies the cardinality condition for subrings with respect to $\aleph_1$.

The relation between the cardinality condition for rings and for abelian groups is given in

Theorem 6.3 ([28]). Let $\gamma$ be a radical class of abelian groups and $\gamma^* = \{$all rings $A$ with additive group in $\gamma\}$. Then the radical $\gamma^*$ satisfies the cardinality condition for subrings if and only if $\gamma$ satisfies the cardinality condition.

A kind of cardinality condition concerns the problem of whether the direct product of radical rings (abelian groups) is again radical.
Theorem 6.4 ([26]). A radical class $\gamma$ of abelian groups is closed under countable direct products if and only if $\gamma$ is generated by torsion-free groups.
Theorem 6.5 ([23]). Let $\mathcal{G} = \{A_\lambda \mid \lambda \in \Lambda\}$ be a set of countable abelian groups $A_\lambda$, $|\Lambda| = \aleph_1$ and $\mathrm{Hom}(A_\lambda, A_\mu) = 0$ for $\lambda \neq \mu \in \Lambda$. If $\gamma$ is the radical generated by the set $\mathcal{G}$, then the direct product of the $A_\lambda$, $\lambda \in \Lambda$, is not in $\gamma$.

There are many radical classes of rings which are closed under arbitrary direct products, for instance the Jacobson radical, the von Neumann regular radical and the radical semisimple classes (in fact, subvarieties) of Theorem 3.5. The radical class of nil rings is obviously not closed under countable direct products. Product-closed radical classes of abelian groups and rings (via Theorem 6.3) were investigated recently in [29]. It would be nice to develop appropriate methods and prove more cardinality condition results in the radical theory of rings.

References

[1] U. A. Aburawash, Semiprime involution rings with chain conditions, Contr. General Alg. 7, Hölder-Pichler-Tempsky, Wien & B. G. Teubner, Stuttgart, 1991, pp. 7-11. [2] U. A. Aburawash, The structure of *-simple involution rings with minimal *-biideals, Beiträge Alg. und Geom., 33 (1992), 77-83. [3] I. A. Amin and R. Wiegandt, Torsion and torsion-free classes of acts, Contr. General Alg. 2, Hölder-Pichler-Tempsky, Wien & B. G. Teubner, Stuttgart, 1983, pp. 19-34. [4] S. A. Amitsur, A general theory of radicals I, Amer. J. Math., 74 (1952), 774-786, II ibidem, 76 (1954), 100-125, and III ibidem, 76 (1954), 126-136. [5] S. A. Amitsur, Radicals of polynomial rings, Canad. J. Math., 8 (1956), 355-356. [6] T. Anderson, N. Divinsky and A. Sulinski, Hereditary radicals in associative and alternative rings, Canad. J. Math., 17 (1965), 594-603. [7] T. Anderson and R. Wiegandt, Weakly homomorphically closed semisimple classes, Acta Math. Acad. Sci. Hungar., 34 (1979), 329-336. [8] V. A. Andrunakievich, Radicals of associative rings, I (Russian), Mat. Sb., 44 (1958), 179-212; English transl.: Amer. Math. Soc. Transl., (2) 52 (1966), 95-128. [9] P. N. Anh, On Litoff's theorem, Studia Sci.
Math. Hungar., 18 (1983), 153-157. [10] A. V. Arhangel'skiĭ and R. Wiegandt, Connectednesses and disconnectednesses in topology, General Topology and Appl., 5 (1975), 9-33. [11] E. P. Armendariz, Closure properties in radical theory, Pacific J. Math., 26 (1968), 1-7. [12] K. I. Beidar, On rings with zero total, Beiträge Alg. und Geom., 38 (1997), 233-239. [13] K. I. Beidar, Y. Fong and W. F. Ke, On complemented radicals, J. Algebra, 201 (1998), 328-356. [14] K. I. Beidar, Y. Fong, W. F. Ke and K. P. Shum, On radicals with semisimple essential covers, Preprint, 1995. [15] K. I. Beidar and R. Wiegandt, Rings with involution and chain conditions, J. Pure & Appl. Algebra, 87 (1993), 205-220. [16] K. I. Beidar and R. Wiegandt, Radicals induced by the total of rings, Beiträge Alg. und Geom., 38 (1997), 149-159. [17] G. M. Bergman, A ring primitive on the right but not on the left, Proc. Amer. Math. Soc., 15 (1964), 473-475. [18] G. Betsch and K. Kaarli, Supernilpotent radicals and hereditariness of semisimple classes of near-rings, Coll. Math. Soc. J. Bolyai 38, Radical Theory, Eger 1982, North-Holland, 1985, pp. 47-58.
[19] G. F. Birkenmeier and R. Wiegandt, Essential covers and complements of radicals, Bull. Austral. Math. Soc., 53 (1996), 261-266. [20] G. F. Birkenmeier and R. Wiegandt, Pseudocomplements in the lattice of torsion classes, Comm. in Algebra, 26 (1998), 197-220. [21] B. Brown and N. H. McCoy, The maximal regular ideal of a ring, Proc. Amer. Math. Soc., 1 (1950), 165-171. [22] C. G. Chehata and R. Wiegandt, Radical theory for fully ordered groups, Mathematica, Cluj, 20 (1978), 143-157. [23] M. Dugas and R. Göbel, On radicals and products, Pacific J. Math., 118 (1985), 79-104. [24] E. Fried and R. Wiegandt, Connectednesses and disconnectednesses of graphs, Alg. Universalis, 5 (1975), 411-428. [25] E. Fried and R. Wiegandt, Abstract relational structures, I (General theory), Alg. Universalis, 15 (1982), 1-21, II (Torsion theory), ibidem, 15 (1982), 22-39. [26] B. J. Gardner, Two notes on radicals of abelian groups, Comment. Math. Univ. Carolinae, 13 (1972), 419-430. [27] B. J. Gardner, Some degeneracy and pathology in non-associative radical theory, Annales Univ. Sci. Budapest, 22-23 (1979/1980), 65-74. [28] B. J. Gardner, Some cardinality conditions for ring radicals, Quaest. Math., 15 (1992), 27-37. [29] B. J. Gardner, On product-closed radical classes of abelian groups, Bul. Acad. Ştiinţe Rep. Moldova, Matematica, 2(30) 1999, 33-52. [30] B. J. Gardner and P. N. Stewart, On semi-simple radical classes, Bull. Austral. Math. Soc., 13 (1975), 349-353. [31] B. J. Gardner and R. Wiegandt, Characterizing and constructing special radicals, Acta Math. Acad. Sci. Hungar., 40 (1982), 73-83. [32] R. Göbel, Radicals in abelian groups, Coll. Math. Soc. J. Bolyai 61, Theory of Radicals, Szekszárd 1991, North-Holland 1993, pp. 77-107. [33] J. S. Golan, Torsion Theories, John Wiley & Sons, 1986. [34] N. Jacobson, The radical and semisimplicity of arbitrary rings, Amer. J. Math., 67 (1945), 300-320. [35] N. Jacobson, Structure of rings, Amer. Math. Soc. Coll. Publ., 37 Providence, 1968. [36] K. Kaarli, Classification of irreducible R-groups over a semiprimary near-ring (Russian), Tartu Riikl. Ul. Toimetised, 556 (1981), 47-63. [37] F. Kasch, Partiell invertierbare Homomorphismen und das Total, Algebra Berichte 60, Verlag Reinhard Fischer, München, 1988. [38] G. Köthe, Die Struktur der Ringe, deren Restklassenring nach dem Radikal vollständig reduzibel ist, Math. Zeitschr., 32 (1930), 161-186. [39] J.
Krempa, Logical connections among some open problems in non-commutative rings, Fund. Math., 76 (1972), 121-130. [40] A. G. Kurosh, Radicals of rings and algebras, Mat. Sb., 33 (1953), 13-26 (Russian), English translation: Coll. Math. Soc. J. Bolyai 6, Rings, Modules and Radicals, Keszthely 1971, North-Holland, 1973, pp. 297-312. [41] A. G. Kurosh, Radicals in the theory of groups, Sibir. Mat. Zh., 3 (1962), 912-931, English transl.: Coll. Math. Soc. J. Bolyai 6, Rings, Modules and Radicals, Keszthely 1971, North Holland, 1973, pp. 271-296. [42] W. G. Leavitt and R. Wiegandt, Torsion theory for not necessarily associative rings, Rocky Mountain J. Math., 9 (1979), 259-271. [43] L. C. A. van Leeuwen, C. Roos and R. Wiegandt, Characterizations of semisimple classes, J. Austral. Math. Soc, 23 (1977), 172-182. [44] W. Lex and R. Wiegandt, Torsion theory for acts, Studia Sci. Math. Hungar., 16 (1981), 263-280.
[45] N. V. Loi, Essentially closed radical classes, J. Austral. Math. Soc., Ser. A, 35 (1983), 132-142. [46] N. V. Loi and R. Wiegandt, Involution algebras and the Anderson-Divinsky-Sulinski property, Acta Sci. Math. Szeged, 50 (1986), 5-14. [47] N. V. Loi and R. Wiegandt, On involution rings with minimum condition, Ring Theory, Israel Math. Conf. Proc., 1 (1989), 206-214. [48] L. Marki, R. Mlitz and R. Wiegandt, A general Kurosh-Amitsur radical theory, Comm. in Algebra, 16 (1988), 249-305. [49] R. Mlitz, A. D. Sands and R. Wiegandt, Radicals coinciding with the von Neumann regular radical on artinian rings, Monatsh. für Math., 125 (1998), 229-239. [50] E. R. Puczylowski and A. Smoktunowicz, On minimal ideals and the Brown-McCoy radical of polynomial rings, Comm. in Algebra, 26 (1998), 2473-2482. [51] G. Preuß, Eine Galois-Korrespondenz in der Topologie, Monatsh. für Math., 75 (1971), 447-452. [52] Ju. M. Rjabuhin and R. Wiegandt, On special radicals, supernilpotent radicals and weakly homomorphically closed classes, J. Austral. Math. Soc., 31 (1981), 152-162. [53] A. D. Sands, Strong upper radicals, Quart. J. Math. Oxford, 27 (1976), 21-24. [54] E. Sasiada, Solution of the problem of existence of a simple radical ring, Bull. Acad. Polon. Sci., 9 (1961), 257. [55] E. Sasiada and P. M. Cohn, An example of a simple radical ring, J. Algebra, 5 (1967), 373-377. [56] A. Smoktunowicz, Polynomial rings over nil rings need not be nil, Preprint, 1998. [57] A. Smoktunowicz, A simple nil ring exists, Preprint, 1999. [58] P. N. Stewart, Semi-simple radical classes, Pacific J. Math., 32 (1970), 249-254. [59] F. Szász, Über artinsche Ringe, Bull. Acad. Polon. Sci., 11 (1963), 351-354. [60] S. Veldsman, Supernilpotent radicals of near-rings, Comm. in Algebra, 15 (1987), 2497-2509. [61] S. Veldsman, The general radical theory of near-rings - answers to some open problems, Alg. Universalis, 36 (1996), 185-189. [62] S. Veldsman, The general radical theory of incidence algebras, Comm. in Algebra, 27 (1999), 3659-3673. [63] J. H. M. Wedderburn, On hypercomplex numbers, Proc. London Math. Soc., (2) 6 (1908), 77-118. [64] H. J. Weinert and R. Wiegandt, A Kurosh-Amitsur radical theory for proper semifields, Comm. in Algebra, 20 (1992), 2419-2458. [65] H. J. Weinert and R. Wiegandt, Complementary radical classes of proper semifields, Coll. Math. Soc. J. Bolyai 61, Theory of Radicals, Szekszárd 1991, North-Holland 1993, pp. 297-310. [66] H. J. Weinert and R. Wiegandt, On the structure of semifields and lattice-ordered groups, Periodica Math. Hungar., 32 (1996), 129-147. [67] R. Wiegandt, Homomorphically closed semisimple classes, Studia Univ. Babes-Bolyai, Cluj, 17 (1972) no. 2, 17-20. [68] R. Wiegandt, Rings distinctive in radical theory, Quaest. Math., 23&24 (1999), 447-472.

Author's address: A. Renyi Institute of Mathematics, Hungarian Academy of Sciences, P.O. Box 127, H-1364 Budapest, Hungary. e-mail: [email protected]
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 153-158)
On minimal subgroups of finite groups
By M. Asaad
Department of Mathematics, Faculty of Science, Cairo University, Giza, Egypt
Abstract. Many authors have investigated the structure of a finite group G under the assumption that all subgroups of G of prime order are well-situated in G. The aim of this talk is to introduce these investigations.
Introduction and results

Throughout, the groups are finite. A subgroup of G of prime order is called a minimal subgroup. Two subgroups H and K of a group G are said to permute if HK = KH. It is easily seen that H and K permute iff the set HK is a subgroup of G. We say, following Kegel [1], that a subgroup of G is S-quasinormal in G if it permutes with every Sylow subgroup of G. Many authors have investigated the structure of a finite group G under the assumption that all minimal subgroups of G are well-situated in G. The aim here is to introduce these investigations. We begin with the following result:

Theorem 1 (Ito). Let G be a group of odd order. If all minimal subgroups of G lie in the center of G, then G is nilpotent.

Proof. See [2, p. 283].

An extension of Ito's result is the following statement:
Theorem 2. (1) If, for an odd prime p, every subgroup of G of order p lies in the center of G, then G is p-nilpotent. (2) If all elements of G of order 2 and 4 lie in the center of G, then G is 2-nilpotent.

Proof. See [2, p. 435].

A more interesting result of the same type is the following statement:

Theorem 3 (Buckley). If all minimal subgroups of an odd order group G are normal in G, then G is supersolvable.

Proof. See [3].

An extension of Buckley's result is the following statement:

Theorem 4 (Shaalan). If every subgroup of G of prime order or order 4 is S-quasinormal in G, then G is supersolvable.

Proof. See [4].

We use the following notation: $p$ is always a prime; $\mathfrak{S}_p$ is the class of all finite $p$-groups; for a subgroup $N$ of a group $G$,

$$\Psi_p(N) = \langle x \mid x \in N,\ |x| = p \rangle \quad \text{if } p \text{ is odd},$$
$$\Psi_2(N) = \langle x \mid x \in N,\ |x| = 2 \text{ or } 4 \rangle,$$
$$\Psi(N) = \langle x \mid x \in N,\ |x| \text{ is a prime or } |x| = 4 \rangle.$$

Let $\mathfrak{F}$ be a class of groups. We call $\mathfrak{F}$ a formation provided: (1) $\mathfrak{F}$ contains all homomorphic images of groups in $\mathfrak{F}$, and (2) if $G/M$ and $G/N$ are in $\mathfrak{F}$, then $G/(M \cap N)$ is in $\mathfrak{F}$ for normal subgroups $M$, $N$ of $G$.
A formation $\mathfrak{F}$ is said to be saturated if $G/\Phi(G) \in \mathfrak{F}$ implies that $G \in \mathfrak{F}$.
We assume throughout that $\mathfrak{F}$ is a formation, locally defined by the system $\{\mathfrak{F}(p)\}$ of full and integrated formations $\mathfrak{F}(p)$ (that is, $\mathfrak{S}_p\mathfrak{F}(p) = \mathfrak{F}(p) \subseteq \mathfrak{F}$ for all primes $p$). It is well known that for any saturated formation $\mathfrak{F}$ there is a unique integrated and full system which locally defines $\mathfrak{F}$. A solvable normal subgroup $N$ of a group $G$ is $\mathfrak{F}$-hypercentral in $G$ (see Huppert [5]) provided $N$ possesses a chain of subgroups $1 = N_0 \leq N_1 \leq \dots \leq N_r = N$ such that each $N_{i+1}/N_i$ is a chief factor of $G$ and, if $N_{i+1}/N_i$ has order a power of the prime $p_i$, then $G/C_G(N_{i+1}/N_i)$ belongs to $\mathfrak{F}(p_i)$.
The product of all $\mathfrak{F}$-hypercentral subgroups of $G$ is again an $\mathfrak{F}$-hypercentral subgroup, denoted by $Z_{\mathfrak{F}}(G)$ and called the $\mathfrak{F}$-hypercenter of $G$. An extension of the results of Ito and Buckley is the following statement:

Theorem 5 (Yokoyama). Let $G$ be a solvable group, and let $N \trianglelefteq G$ such that $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation containing the class of nilpotent groups. If $\Psi(N) \leq Z_{\mathfrak{F}}(G)$, then $G \in \mathfrak{F}$.

Proof. See [6,7].

In [8] Laue proved the following statement:

Theorem 6. Let $G$ be a solvable group. If every subgroup of prime order or order 4 of the Fitting subgroup $F(G)$ is normal in $G$, then $G$ is supersolvable.

In [9] Derr, Deskins and Mukherjee proved the following statement:
Theorem 7. Let $N \trianglelefteq G$ with $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation. If $\Psi_p(N) \leq Z_{\mathfrak{F}}(G)$, then $G/O_{p'}(N) \in \mathfrak{F}$.
Remark (1). From Theorem 7, it follows immediately that if $\Psi_p(G) \leq Z_{\mathfrak{F}}(G)$, then $G/O_{p'}(G) \in \mathfrak{F}$. For the case of a solvable group $G$, Theorem 7 can be deduced from theorems of Yokoyama [6,7] and Laue [8].

Remark (2). The $\mathfrak{F}$-hypercenter, as introduced before, is a solvable group because it is the product of all solvable $\mathfrak{F}$-hypercentral normal subgroups. This fact is essential in the proof presented in [9] because it depends heavily on the fact that in a solvable group $G$, $C_G(F(G)) \leq F(G)$. In [10] Bolinches and Aguilera extended Theorem 7. The $\mathfrak{F}$-hypercenter, as introduced in [10], is not necessarily solvable.

In [11] Asaad, Bolinches and Aguilera extended the results of Shaalan [4]. They proved the following statement:

Theorem 8.
Let $N \trianglelefteq G$ with $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation containing the class of supersolvable groups. If every subgroup of $N$ of prime order or order 4 is S-quasinormal in $G$, then $G \in \mathfrak{F}$.

Recently, Asaad and Csörgő [12] extended Theorems 3, 4, 6 and 8. They proved the following statement:

Theorem 9.
Let $N \trianglelefteq G$ with $G/N \in \mathfrak{F}$, where $\mathfrak{F}$ is a saturated formation containing the class of supersolvable groups. If $N$ is solvable and every subgroup of prime order or order 4 of $F(N)$ is S-quasinormal in $G$, then $G \in \mathfrak{F}$.
To end this talk, the author would like to mention that the influence of minimal subgroups on the structure of finite groups remains open to further research.
References

[1] O.H. Kegel, Sylow-Gruppen und Subnormalteiler endlicher Gruppen. Math. Z. 78, 205-221 (1962).
[2] B. Huppert, Endliche Gruppen I. Berlin-Heidelberg-New York 1967.
[3] J. Buckley, Finite Groups whose Minimal Subgroups are Normal. Math. Z. 116, 15-17 (1970).
[4] A. Shaalan, The influence of π-quasinormality of some subgroups on the structure of a finite group. Acta Math. Hungar. 56, 287-293 (1990).
[5] B. Huppert, Zur Theorie der Formationen. Arch. Math. 19, 561-574 (1968).
[6] A. Yokoyama, Finite solvable groups whose $\mathfrak{F}$-hypercenter contains all minimal subgroups. Arch. Math. 26, 123-130 (1975).
[7] A. Yokoyama, Finite solvable groups whose $\mathfrak{F}$-hypercenter contains all minimal subgroups II. Arch. Math. 27, 572-575 (1976).
[8] R. Laue, Dualization of saturation for locally defined formations. J. Algebra 52, 347-353 (1978).
[9] J.B. Derr, W.E. Deskins and N.P. Mukherjee, The influence of minimal p-subgroups on the structure of finite groups. Arch. Math. 45, 1-4 (1985).
[10] A. Ballester-Bolinches and M.C. Pedraza-Aguilera, On minimal subgroups of finite groups. Acta Math. Hungar. 73, 335-342 (1996).
[11] M. Asaad, A. Ballester-Bolinches and M.C. Pedraza-Aguilera, A note on minimal subgroups of finite groups. Comm. Algebra 24(8), 2771-2776 (1996).
[12] M. Asaad and P. Csörgő, The influence of minimal subgroups on the structure of finite groups. Arch. Math. 72, 401-404 (1999).
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 159-167)
TOTALLY AND MUTUALLY PERMUTABLE PRODUCTS OF FINITE GROUPS
A. BALLESTER-BOLINCHES
Departament d'Algebra, Universitat de Valencia Dr. Moliner 50, 46100 Burjassot, Valencia (Spain)
1 Introduction
This survey is about finite groups. Therefore, unless otherwise stated, all groups considered are finite. We start by quoting a series of classical theorems which lie behind our results. Soluble groups have a long history reaching back to Burnside. His famous $p^a q^b$ theorem, published in 1904 ([9]), states that a group is soluble if its order is divisible by at most two different primes. Based on this theorem, P. Hall during the period 1928-1937 characterized solubility by means of the existence of Sylow complements and Sylow systems (see [10]). In particular, we have:

A group is soluble if and only if it is the product of pairwise permutable Sylow subgroups.

Taking the above characterization into account, the following question arises: Assume that a group G is the product of pairwise permutable nilpotent subgroups. Is G soluble? In the early 50's many people worked on this question. It was promptly conjectured that the answer was affirmative. Wielandt, in 1958, proved that a dinilpotent group must indeed be soluble if the factors are of coprime orders ([15]), and Kegel then proved that a dinilpotent group is always soluble ([12]). Today these results are known as the Kegel-Wielandt theorem. The answer to the above question then follows by an induction argument.
Meanwhile, Huppert in 1953 ([11]) proved a particular case of the Kegel-Wielandt theorem which will be useful for our purposes:

A group which is the product of pairwise permutable cyclic subgroups is supersoluble.

The idea behind the above results is the following: Assume that $G = G_1 G_2 \cdots G_n$ is a group which is the product of the subgroups $G_1, G_2, \dots, G_n$ such that $G_i G_j = G_j G_i$ for all $i, j$, where $n \geq 2$. What is the relationship between the structure of $G$ and that of the subgroups $G_i$? As a special case of a product $G = G_1 G_2 \cdots G_n$ of pairwise permutable subgroups we have the one in which all the factors $G_i$ are normal in $G$ (normal products). In particular, when $G = G_1 \times G_2 \times \cdots \times G_n$ it is a direct product. The distance between general products and direct products is usually large. For instance, the normal product of supersoluble groups is not supersoluble in general, while the direct product of supersoluble groups is always supersoluble. This shows in particular that formations, even saturated ones, are in general not closed under normal products. However, it is known that they are always closed under direct products and even under central products. To create intermediate situations, it seems reasonable to consider products in which the subgroups of the distinct factors are connected by certain permutability properties. This is probably what Asaad and Shaalan had in mind when in 1989 they introduced the concepts of totally and mutually permutable products, the central concepts of this paper.

Definition. [1] a) A group $G$ is said to be the totally permutable product (t.p.p.) of the subgroups $H$ and $K$ if $G = HK$ and every subgroup of $H$ permutes with every subgroup of $K$. b) A group $G$ is the mutually permutable product (m.p.p.) of $H$ and $K$ if $G = HK$ and $H$ permutes with every subgroup of $K$ and $K$ permutes with every subgroup of $H$.

Clearly, direct products and central products are t.p.p.
Moreover, "totally" implies "mutually", but the converse is not true: the group $S_4$ is the mutually permutable product of a Sylow 2-subgroup and the alternating group $A_4$, but this product is not totally permutable. In [1], the following results are proved:
Theorem. [1; Theorems 3.1, 3.2 and 3.8] Let $G = HK$ be the product of two supersoluble subgroups $H$ and $K$. a) If $G$ is the t.p.p. of $H$ and $K$, then $G$ is supersoluble. b) If $G$ is the m.p.p. of $H$ and $K$ and either $K$ is nilpotent or $G'$, the commutator subgroup of $G$, is nilpotent, then $G$ is also supersoluble.

A natural question arising at this point is whether it is possible to extend the above results through the theory of formations. In this survey we present some affirmative answers to this question. We first introduce some definitions and results on formations. Recall that a formation is a class of groups $\mathcal{F}$ such that:

1. If $G \in \mathcal{F}$ and $N \trianglelefteq G$, then $G/N \in \mathcal{F}$.

2. If $N_1, N_2 \trianglelefteq G$ and $G/N_1, G/N_2 \in \mathcal{F}$, then $G/(N_1 \cap N_2) \in \mathcal{F}$.
2 Totally permutable products
The first relevant extension of part (a) of the theorem by Asaad and Shaalan is the following result due to Maier:

Theorem. [13; Theorem] Let $\mathcal{F}$ be a saturated formation containing the class $\mathcal{U}$ of supersoluble groups. If $G = HK$ is the t.p.p. of $H$ and $K$ and both $H$ and $K$ belong to $\mathcal{F}$, then $G$ also belongs to $\mathcal{F}$.
Notice that the condition $\mathcal{U} \subseteq \mathcal{F}$ is necessary, as the symmetric group of degree 3 and the formation of all nilpotent groups show. In the same paper, Maier proposes the following question: "Does the above result extend to non-saturated formations which contain all supersoluble groups?". This question was answered affirmatively by Ballester-Bolinches and Perez-Ramos in 1996.

Theorem A. [6; Theorem] Let $\mathcal{F}$ be a formation such that $\mathcal{U} \subseteq \mathcal{F}$. If $G = HK$ is the t.p.p. of $H$ and $K$, and both $H$ and $K$ belong to $\mathcal{F}$, then $G$ also belongs to $\mathcal{F}$.

In the following, we give a sketch of the proof of this theorem. To prove a theorem like this, one needs to obtain information about the structure of the group from the structure of the factors. In particular, we concentrate on how far apart the product is from a central product. We shall see that the basis of the results is quite often the supersolubility of the product of two cyclic groups. The following results turn out to be crucial for understanding the structure of totally permutable products.

Lemma 1. [13; Lemma 2] If $G = HK$ is the t.p.p. of $H$ and $K$, then 1. There exists a normal subgroup $N$ of $G$ such that $N$ is contained either in $H$ or in $K$. 2. $H \cap K$ is a nilpotent subnormal subgroup of $G$.

Assume that the normal subgroup $N$ is contained in $K$. Then $NH$ is a t.p.p. and N
Lemma 3. [6; Lemma 6 and Corollary 2] If $G$ is the t.p.p. of the subgroups $H$ and $K$, then $[H^{\mathcal{U}}, K] = [H, K^{\mathcal{U}}] = 1$. Moreover, $G^{\mathcal{U}} = H^{\mathcal{U}} K^{\mathcal{U}}$.

Recently J. Beidleman and H. Heineken in [7] have obtained a valuable improvement in the knowledge of the structure of these groups by proving that in this kind of product the nilpotent residual of each factor in fact centralizes the other factor. More generally, if $\mathcal{F}$ is a formation such that $\mathcal{U} \subseteq \mathcal{F}$, then $H^{\mathcal{F}}$ and $K^{\mathcal{F}}$ are normal subgroups of $G$. At this point, it is possible to give a proof of Maier's theorem.

Theorem. Let $\mathcal{F}$ be a saturated formation containing $\mathcal{U}$. If $G = HK$ is the t.p.p. of the $\mathcal{F}$-subgroups $H$ and $K$, then $G \in \mathcal{F}$.

Proof. Notice that $1 \trianglelefteq H^{\mathcal{U}} \trianglelefteq H^{\mathcal{U}} K^{\mathcal{U}} = G^{\mathcal{U}} \trianglelefteq G$ is a proper normal series of $G$, which can therefore be refined to a chief series of $G$. If $A/B$ is a chief factor such that $A \leq H^{\mathcal{U}}$, then $K \leq C_G(A/B)$. Hence $G/C_G(A/B) \cong H/C_H(A/B) \in f(p)$ for every prime $p$ dividing $|A/B|$. Moreover, if $H^{\mathcal{U}} \leq B < A \leq H^{\mathcal{U}} K^{\mathcal{U}}$, then $A/B$ is isomorphic to $(A \cap K^{\mathcal{U}})/(B \cap K^{\mathcal{U}})$, a chief factor of $K$ centralized by $H$. Since $K \in \mathcal{F}$, it follows that $G/C_G(A/B) \cong K/C_K((A \cap K^{\mathcal{U}})/(B \cap K^{\mathcal{U}})) \in f(p)$ for every $p$ dividing $|A/B|$. Finally $G/H^{\mathcal{U}} K^{\mathcal{U}} = G/G^{\mathcal{U}} \in \mathcal{U} \subseteq \mathcal{F}$. Consequently, $G \in \mathcal{F}$.

Lemma 4. [6; Lemma 1] Let the group $G = NB$ be the product of $N$ and $B$, where N
Then $\langle K^G \rangle = \langle K^A \rangle \leq KA$ because $K$ centralizes $H^{\mathcal{U}}$. Hence $\langle K^G \rangle \cap H^{\mathcal{U}} = 1$. Now $G/\langle K^G \rangle \cong [H^{\mathcal{U}}]A/(\langle K^G \rangle \cap [H^{\mathcal{U}}]A) \in \mathcal{F}$ (notice that $[H^{\mathcal{U}}]A \in \mathcal{F}$ by Lemma 4). Therefore $G/(H^{\mathcal{U}} \cap \langle K^G \rangle) \cong G \in \mathcal{F}$.

The second case is that neither $H$ nor $K$ is supersoluble. Then $H^{\mathcal{U}}, K^{\mathcal{U}} \neq 1$ and $H = H^{\mathcal{U}} A$ and $K = K^{\mathcal{U}} B$, where $A$ and $B$ are supersoluble projectors of $H$ and $K$ respectively. Moreover $A$ is a proper subgroup of $H$ and $B$ is a proper subgroup of $K$. Denote $U = AB$. It is clear that $U$ is a supersoluble subgroup of $G$. There is a natural action of $U$ on $H^{\mathcal{U}} \times K^{\mathcal{U}}$. Let $X = [H^{\mathcal{U}} \times K^{\mathcal{U}}]U$ be the semidirect product with respect to this action. It is rather easy to see that there exists an epimorphism $\gamma : X \to [H^{\mathcal{U}} K^{\mathcal{U}}]U = Y$. Moreover, by Lemma 4, there exists an epimorphism from $Y$ to $G$. We see that $X$ is an $\mathcal{F}$-group: by arguments similar to those used above, it is not difficult to prove that $[H^{\mathcal{U}}]U$ and $[K^{\mathcal{U}}]U$ both belong to $\mathcal{F}$. This implies that $X/H^{\mathcal{U}}$ and $X/K^{\mathcal{U}}$ are $\mathcal{F}$-groups and so $X$ is also an $\mathcal{F}$-group. This implies $G \in \mathcal{F}$, the final contradiction.

Let us now give some results on t.p.p. concerning the behaviour of distinguished subgroups associated to formations. A well-known theorem by Doerk and Hawkes (see [10]) states that for a formation $\mathcal{F}$ of soluble groups, the $\mathcal{F}$-residual respects the operation of forming direct products, that is, if $G = H \times K$, then $G^{\mathcal{F}} = H^{\mathcal{F}} \times K^{\mathcal{F}}$. We have:

Theorem B. [3; Theorem 4] Let $\mathcal{F}$ be a formation containing $\mathcal{U}$ such that $\mathcal{F}$ is either saturated or $\mathcal{F} \subseteq \mathcal{S}$. If $G = HK$ is the t.p.p. of $H$ and $K$, then $G^{\mathcal{F}} = H^{\mathcal{F}} K^{\mathcal{F}}$.

Notice that this theorem implies that the converse of Maier's theorem is true; that is, if $G$ belongs to $\mathcal{F}$ and $\mathcal{F}$ is saturated, then $H$ and $K$ both belong to $\mathcal{F}$. On the other hand, it is quite clear that knowledge of the $\mathcal{F}$-projectors of a group usually reveals little about its $\mathcal{F}$-subgroup structure. In fact, there is no connection in general between the projectors of a group and those of a proper subgroup.
Totally permutable products are exceptions to this general rule when $\mathcal{F}$ is a saturated formation containing $\mathcal{U}$, as the following result shows:

Theorem C. [2; Theorem B] Let $\mathcal{F}$ be a saturated formation such that $\mathcal{U} \subseteq \mathcal{F}$ and let $G = HK$ be the t.p.p. of $H$ and $K$. If $A$ is an $\mathcal{F}$-projector of $H$ and $B$ is an $\mathcal{F}$-projector of $K$, then $AB$ is an $\mathcal{F}$-projector of $G$.
Finally, Theorems A, B and C hold for totally permutable products of more than two subgroups:

Theorem. [3; Theorems 1, 4 and 5] Let $G = G_1 G_2 \cdots G_n$ be a product of pairwise totally permutable subgroups $G_i$. Then Theorems A, B and C hold.
3 Mutually permutable products
In order to obtain an extension of part (b) of the Asaad-Shaalan theorem, we now focus our attention on mutually permutable products (m.p.p.). As a preliminary remark, notice that the Asaad-Shaalan theorem does not hold in the case "$K$ nilpotent" and "$\mathcal{F}$ a formation containing $\mathcal{U}$". For instance, take the formation $\mathcal{F} = (G : G' \text{ is supersoluble})$. It is clear that $\mathcal{F}$ contains $\mathcal{U}$. Moreover the symmetric group of degree 4, $G$, is the m.p.p. of the alternating group of degree 4, $H$ say, which belongs to the class $\mathcal{F}$, and a Sylow 2-subgroup $K$ of $G$. In this example, the derived group of $G$ is $H$, which does not belong to $\mathcal{U}$. Hence $G \notin \mathcal{F}$. Even if the formation $\mathcal{F}$ is saturated, the result is not true. For instance, let us take the formation function

$$f(p) = \begin{cases} \mathcal{S}_2 \mathcal{S}_3 & \text{if } p = 2, \\ \mathcal{S}_p\, \mathcal{U}(p-1) & \text{if } p \neq 2, \end{cases}$$
where $\mathcal{U}(p-1)$ denotes the class of abelian groups of exponent dividing $p-1$. Let $\mathcal{F} = LF(f)$. It is clear that $\mathcal{U} \subseteq \mathcal{F}$. Then $\Sigma_4$ is again the m.p.p. of $H = A_4$ and $K \in \mathrm{Syl}_2(\Sigma_4)$. It is not difficult to see that $H, K \in \mathcal{F}$, but $\Sigma_4$ does not belong to the class $\mathcal{F}$, because $\Sigma_4/O_{2'2}(\Sigma_4)$ is isomorphic to $\Sigma_3$, which does not belong to $\mathcal{S}_2\mathcal{S}_3$.

However, for $\mathcal{F} = \mathcal{U}$, the following improvement of the Asaad-Shaalan theorem was obtained.

Theorem. [4; Theorems 1 and 2] Let $G = HK$ be the product of $H$ and $K$. If either $K$ is supersoluble and $G' \in \mathcal{N}$ or $K \in \mathcal{N}$, then $G^{\mathcal{U}} = H^{\mathcal{U}}$.

Returning to formations, the case $G' \in \mathcal{N}$ is interesting because of the following result:

Lemma 5. [5; Lemma 1] Let $G = HK$ be the product of the subgroups $H$ and $K$ such that $G' \in \mathcal{N}$. Assume that $\mathcal{F}$ is a saturated formation with $G \in \mathcal{F}$. Then both $H$ and $K$ belong to $\mathcal{F}$.
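The two facts about the symmetric group of degree 4 used in the counterexamples above can be checked by brute force. The following is only an illustrative sketch: permutations of {0, 1, 2, 3} are stored as tuples, and all helper names (`compose`, `generate`, and so on) are hypothetical, not taken from the cited papers. It confirms that the derived subgroup of the symmetric group is the alternating group of order 12, and that the quotient by the Klein four-group (which is $O_{2'2}$ of the symmetric group) has order 6 and is nonabelian.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'apply q first, then p' on the points 0..3."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for point, image in enumerate(p):
        inv[image] = point
    return tuple(inv)

def generate(generators, identity):
    """Subgroup generated by `generators` (closure under multiplication)."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def is_even(p):
    """True when p has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

identity = tuple(range(4))
S4 = list(permutations(range(4)))
A4 = {p for p in S4 if is_even(p)}

# Derived subgroup: generated by all commutators a^-1 b^-1 a b.
commutators = {compose(compose(inverse(a), inverse(b)), compose(a, b))
               for a in S4 for b in S4}
derived_S4 = generate(commutators, identity)

# Klein four-group: the identity and the three double transpositions.
V4 = {p for p in A4 if compose(p, p) == identity}

# Left cosets g*V4 are the elements of the quotient by V4.
cosets = {frozenset(compose(g, v) for v in V4) for g in S4}

# The quotient is nonabelian as soon as some commutator lies outside V4.
quotient_is_nonabelian = any(c not in V4 for c in commutators)

print(len(derived_S4), derived_S4 == A4)
print(len(V4), len(cosets), quotient_is_nonabelian)
```

A nonabelian group of order 6 is isomorphic to the symmetric group of degree 3, which is the quotient appearing in the argument.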
In fact, for saturated formations containing $\mathcal{U}$, the following holds:

Theorem. [5; Theorem A] Let $G = HK$ be the m.p.p. of $H$ and $K$. Let $\mathcal{F}$ be a saturated formation containing $\mathcal{U}$. If $G' \in \mathcal{N}$, then $G^{\mathcal{F}} = H^{\mathcal{F}}K^{\mathcal{F}}$.

We do not offer the complete proof because it is quite long and at some points rather technical. One of the main ideas of the proof is to analyse the behaviour of the intersection $H \cap K$. In this context, the following results are extremely useful. Let $G = HK$ be a mutually permutable product. Then:

(i) [8; Proposition 3.5] If $X$ and $Y$ are subgroups of $G$ such that $H \cap K \le X \le H$ and $H \cap K \le Y \le K$, then $X$ and $Y$ are mutually permutable subgroups of $G$. In particular, if $H \cap K = 1$, the product $G$ is totally permutable.

(ii) [8; Proposition 3.5] $H \cap K$ is permutable in $H$ and $K$. Moreover, $H \cap K$ is a subnormal subgroup of $G$.

(iii) [14] If $Q$ is a permutable subgroup of a group $A$, then $Q^A/\mathrm{Core}_A(Q) \le Z_\infty(A/\mathrm{Core}_A(Q))$. As a consequence, if $D = \mathrm{Core}_H(H \cap K) \ne 1$, then $D^G = D^K \le K$. Hence $K$ contains a normal subgroup of $G$.

(iv) [5; Lemma 4] If $G'$ is nilpotent and $\mathcal{F}$ is a saturated formation containing $\mathcal{U}$, then $(H \cap K)^{\mathcal{F}} \le G^{\mathcal{F}}$.

One might wonder whether some $\mathcal{F}$-projector of a mutually permutable product with nilpotent commutator subgroup could be the product of $\mathcal{F}$-projectors of the factor groups. Unfortunately, this is not true in general, as the next example shows. Let $G$ be the direct product of a cyclic group $\langle a\rangle$ of order 3 with the alternating group $A_4$ of degree 4. Let $V$ be the Klein 4-group of $A_4$. Then $G$ is the mutually permutable product of $H = \langle a\rangle \times V$ and $K = A_4$. Moreover $G' = V$ is nilpotent. Notice that $H$ is the supersoluble projector of $H$ and a Sylow 3-subgroup $B$ of $A_4$ is a supersoluble projector of $K$. However $HB = G$ is not supersoluble.
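The last claim of the example rests on the alternating group of degree 4 not being supersoluble: it is a quotient of $G$, and supersolubility passes to quotients. A brute-force sketch of why (helper names are hypothetical): every minimal normal subgroup is the normal closure of any of its non-identity elements, so if no normal closure has prime order there is no normal subgroup of prime order, which a nontrivial supersoluble group must have.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'apply q first, then p' on the points 0..3."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for point, image in enumerate(p):
        inv[image] = point
    return tuple(inv)

def generate(generators, identity):
    """Subgroup generated by `generators` (closure under multiplication)."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def is_even(p):
    """True when p has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

identity = tuple(range(4))
A4 = [p for p in permutations(range(4)) if is_even(p)]

def normal_closure(x):
    """Smallest normal subgroup of A4 containing x."""
    conjugates = {compose(compose(g, x), inverse(g)) for g in A4}
    return generate(conjugates, identity)

closure_orders = sorted({len(normal_closure(x)) for x in A4 if x != identity})
print(closure_orders)  # [4, 12]
```

The smallest nontrivial normal subgroup has order 4 (the Klein 4-group $V$), so the chief factor at the bottom is not cyclic.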
References

[1] M. Asaad, A. Shaalan "On the supersolvability of finite groups" Arch. Math. 55 (1989), 318-326
[2] A. Ballester-Bolinches, M.C. Pedraza-Aguilera, M.D. Perez-Ramos "On finite products of totally permutable groups" Bull. Austral. Math. Soc. 53 (1996), 441-445
[3] A. Ballester-Bolinches, M.C. Pedraza-Aguilera, M.D. Perez-Ramos "Finite groups which are products of pairwise totally permutable subgroups" Proc. Edinburgh Math. Soc. 41 (1998), 567-572
[4] A. Ballester-Bolinches, M.C. Pedraza-Aguilera, M.D. Perez-Ramos "Mutually permutable products of finite groups" J. Algebra 213 (1999), 369-377
[5] A. Ballester-Bolinches, M.C. Pedraza-Aguilera "Mutually permutable products of finite groups II" J. Algebra 218 (1999), 563-572
[6] A. Ballester-Bolinches, M.D. Perez-Ramos "A question of R. Maier concerning formations" J. Algebra 182 (1996), 738-747
[7] J. Beidleman, H. Heineken "Totally permutable torsion groups" J. Group Theory 2 (1999), 377-392
[8] A. Carocca "p-supersolvability of factorized finite groups" Hokkaido Math. J. 21 (1992), 395-403
[9] W. Burnside "On groups of order $p^a q^b$" Proc. London Math. Soc. 2 (1904), 388-392
[10] K. Doerk, T.O. Hawkes "Finite Soluble Groups" De Gruyter Expositions in Mathematics, 4. Berlin, 1992
[11] B. Huppert "Über das Produkt von paarweise vertauschbaren zyklischen Gruppen" Math. Z. 58 (1953), 243-264
[12] O. Kegel "Produkte nilpotenter Gruppen" Arch. Math. (Basel) 12 (1961), 90-93
[13] R. Maier "A completeness property of certain formations" Bull. London Math. Soc. (1992), 540-544
[14] R. Maier, R. Schmid "The embedding of permutable subgroups in finite groups" Math. Z. 131 (1973), 269-272
[15] H. Wielandt "Über das Produkt von paarweise vertauschbaren nilpotenten Gruppen" Math. Z. 55 (1951), 1-7
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 169-172)
Asymptotic behaviour of solutions of evolution equations
by
Bolis Basit
Department of Mathematics, Monash University, Clayton, Victoria 3168, Australia
e-mail: bbasit@vaxc.cc.monash.edu.au

Abstract. In this talk we study the asymptotic behavior of solutions of abstract equations of the form

(*) $B\phi(t) = A\phi(t) + \psi(t)$ for $t \in J$,

where $A$ is the generator of a $C_0$-semigroup of operators on a Banach space $X$, $\psi : J \to X$ is continuous, $B\phi =$
Harmonic analysis: Beurling spectrum (see [1, p. 34], [6-9])

If $\phi \in L^\infty_w(\mathbb{R}, X)$ and $I_w(\phi) = \{f \in L^1_w(\mathbb{R}) : \phi * f = 0\}$, then $I_w(\phi)$ is a closed ideal of $L^1_w(\mathbb{R})$. Spectrum $sp_w($
Examples (see [1, Proposition 1.1, Proposition 2.6]). $\phi = p \ne 0$ a polynomial, $sp_w(p) = e$ the identity of $G$,

A function $\phi \in BC(J,X)$ is called asymptotically almost periodic (respectively Eberlein almost periodic) if the set of translates $H(\phi) = \{\phi_s : s \in J\}$ is relatively compact (respectively weakly relatively compact) in $BC(J,X)$. The space of all ergodic (totally ergodic) functions from $L^1_{loc}(J,X)$ will be denoted by $\mathcal{E}(J,X)$ ($T\mathcal{E}(J,X)$). We set $\mathcal{E}_{ub}(J,X) := \mathcal{E}(J,X) \cap BUC(J,X)$ and $T\mathcal{E}_{ub}(J,X) := T\mathcal{E}(J,X) \cap BUC(J,X)$. $AP(J,X)$, $AAP(J,X)$, $EAP(J,X)$ will respectively stand for the spaces of almost periodic functions, asymptotically almost periodic functions and Eberlein almost periodic functions. Stepanoff $S^p$-almost periodic functions for all $1 \le p < \infty$ will be denoted by $S^p\text{-}AP(J,X)$. Now we state some results and give an example of a resonance system.

Theorem 1 (see [2, Theorem 2.6] and references therein). If $\phi \in BUC(J,X) \cap T\mathcal{E}_{ub}(J,X)$ and $\mathcal{F}$ is a translation invariant, invariant under multiplication by characters, closed subspace of $BUC(J,X)$ containing all constants, then
(ii) $\phi \in \mathcal{F}$ if and only if $sp_{\mathcal{F}}($
(&)v i, e f, sPr(<j>) c egl(
+ 2i
A =[§:?].
Then $B\gamma_s x = \theta_B(is)\gamma_s x = [(is)^2 + 2i(is) - 1]\gamma_s x = -(s+1)^2\gamma_s x$, $\sigma(A) = \{-1, -4\}$.
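The arithmetic of this example can be reproduced directly. The sketch below assumes, as the surviving text suggests, that the operator has symbol $(is)^2 + 2i(is) - 1 = -(s+1)^2$ and that the spectrum of $A$ is $\{-1, -4\}$; the function name `symbol_B` is hypothetical. It solves $-(s+1)^2 = \lambda$ for each spectral value to list the real frequencies $s$ at which resonance can occur.

```python
import math

def symbol_B(s):
    """Symbol (is)^2 + 2i(is) - 1 of the assumed operator B at frequency s."""
    z = 1j * s
    return z * z + 2j * z - 1

spectrum_A = [-1.0, -4.0]

resonance_set = set()
for lam in spectrum_A:
    root = math.sqrt(-lam)  # -(s+1)^2 = lam  =>  s = -1 +/- sqrt(-lam)
    for s in (-1.0 + root, -1.0 - root):
        # Each candidate must reproduce the spectral value exactly.
        assert abs(symbol_B(s) - lam) < 1e-12
        resonance_set.add(s)

print(sorted(resonance_set))  # [-3.0, -2.0, 0.0, 1.0]
```

The four frequencies 0, 1, -2 and -3 match the indices of the exponentials that appear in the solution of the homogeneous equation.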
It follows that 6gl(
I ci7o+C27i+c 3 7_ 2 +C47_3 J*
We notice that spw(£) = 9gl(ir(A)). Now let T = APW(J,X) (1 + 1*1)^. Let
or T = C 0 ,„,(J,X) and t»^(() =
So £,V £ C^, 0 (M,«7) n A P ^ R . O ) .
Whereas 0 0
Motivated by the above example, we refer to $\theta_B^{-1}(\sigma(A))$ as the resonance set of (1.1) (see [5]). We say that $\phi \in BUC_w(J,X)$ is a resonance solution of (1.1) for the class $\mathcal{F}(J,X)$ if $\psi, \xi \in \mathcal{F}(J,X)$ for all solutions $\xi \in BUC_w(J,X)$ of the homogeneous equation $B\xi(t) = A\xi(t)$ but $\phi \notin \mathcal{F}(J,X)$.

Mean classes

Definition (see [4]). Let $\phi \in \mathcal{A} \subset L^1_{loc}(J,X)$, $0 \ne h \in \mathbb{R}_+$. Set
$$M_h\phi(t) := \frac{1}{h}\int_0^h \phi(t+s)\,ds, \quad t \in J,$$
and $M\mathcal{A} := \{\phi \in L^1_{loc}(J,X) : M_h\phi \in \mathcal{A} \text{ for all } h > 0\}$. It is easy to show that $S^p\text{-}AP(J,X) \subset MAP(J,X)$.
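To make the mean operator concrete, here is a minimal numerical sketch for a scalar-valued function. The midpoint rule, the function name `steklov_mean` and the test function are illustrative choices, not part of the source. For the cosine the mean has the closed form $(\sin(t+h) - \sin t)/h$, which the quadrature should reproduce, and the mean approaches the function itself as $h \to 0$.

```python
import math

def steklov_mean(phi, t, h, steps=1000):
    """Approximate (1/h) * integral_0^h phi(t + s) ds by the midpoint rule."""
    ds = h / steps
    total = sum(phi(t + (j + 0.5) * ds) for j in range(steps))
    return total * ds / h

t, h = 0.7, 0.5
approx = steklov_mean(math.cos, t, h)
exact = (math.sin(t + h) - math.sin(t)) / h

# For a very short window the mean is close to the point value cos(t).
short_window = steklov_mean(math.cos, t, 1e-3)

print(abs(approx - exact) < 1e-6)
print(abs(short_window - math.cos(t)) < 1e-3)
```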
Theorem (see [4, Theorem 4.2]). Assume c0 (£ X. 0 e S£(R,X), 1 < p< oo. Then <j> 6 SP-AP{WL,X).
Let A ^ = 6 with 6 6 S P -AP(R,X),
REFERENCES
[1] B. Basit and A.J. Pryde, Polynomials and functions with finite spectra on locally compact abelian groups, Bull. Austral. Math. Soc. 51 (1995), 33-42.
[2] B. Basit, Harmonic analysis and asymptotic behavior of solutions to the abstract Cauchy problem, Semigroup Forum 54 (1997), 58-74.
[3] B. Basit and A.J. Pryde, Ergodicity and differences of functions on semigroups, J. Austral. Math. Soc. 149 (1998), 253-265.
[4] B. Basit and Hans Günzler, Asymptotic behavior of solutions of neutral equations, J. Differential Equations 149 (1998), 115-142.
[5] B. Basit and A. Pryde, Asymptotic behavior of unbounded solutions of evolution equations on topological groups (to be submitted).
[6] J. T. Benedetto, Spectral Synthesis, Academic Press, New York, 1975.
[7] Y. Katznelson, An Introduction to Harmonic Analysis, J. Wiley and Sons, New York, 1968.
[8] H. Reiter, Classical Harmonic Analysis and Locally Compact Groups, Oxford Math. Monographs, Oxford Univ., 1968.
[9] W. Rudin, Harmonic Analysis on Groups, Interscience Pub., New York, London, 1962.
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 173-188)
On Nonlinear Evolution Equations with Applications Lokenath Debnath Department of Mathematics University of Central Florida Orlando, Florida 32816 U.S.A. e-mail:
"... the progress of physics will to a large extent depend on the progress of nonlinear mathematics, of methods to solve nonlinear equations ... and therefore we can learn by comparing different nonlinear problems." WERNER HEISENBERG
"... as Sir Cyril Hinshelwood has observed ... fluid dynamics were divided into hydraulic engineers who observed things that could not be explained and mathematicians who explained things that could not be observed." JAMES LIGHTHILL
Abstract. This paper is concerned with the most general water wave equations in (3 + 1) dimensions. Special attention is given to the derivation of the (1 + 1)-dimensional forced Korteweg-de Vries (fKdV) equation and the (1 + 1)-dimensional forced nonlinear Schrödinger (fNLS) equation near resonant conditions. Using the multiple scale technique, the (1 + 1)-dimensional nonlinear Schrödinger (NLS) equation for the amplitude function $A$ of wave packets is derived. Section 6 deals with the fourth-order nonlinear Schrödinger equation for the amplitude of the wave potential, leading to a major improvement in agreement with the Longuet-Higgins analysis (1978a,b). A stability analysis is made based on the simplified version of Dysthe's (1979) NLS equation. This is followed by Hogan's analysis of the fourth-order evolution equation for capillary-gravity waves in deep water. The final section deals with the Davey-Stewartson equations and the Kadomtsev and Petviashvili equation in water of finite depth.
1. Introduction

The mathematical theory and applications of nonlinear evolution equations have experienced a revolution over the last three decades of the twentieth century. During this revolution many fascinating and unexpected phenomena have been observed in physical, chemical and biological systems. Many new instability phenomena have been discovered. Considerable attention has been given to these instability and wave breaking phenomena. Other major achievements of twentieth century applied mathematics include the remarkable discovery of solitons, soliton interactions and the Inverse Scattering Transform (IST) for finding the explicit exact solution of several canonical nonlinear partial differential equations.

Almost all nonlinear evolution equations after 1960 originated from the mathematical theory of water waves. Water waves are the most common observable phenomenon in nature. Invariably water waves break on ocean beaches. One of the common questions is: why do water waves break on beaches? Answers to this question led to the analytical study of nonlinear partial differential equations, topological study involving Riemann surfaces, algebraic study of Lie groups, and experimental and computational studies of nonlinear evolution equations. So, water waves possess an extremely rich mathematical structure. The study of water waves still remains an active subject of mathematical research.

This article deals with several nonlinear evolution equations described by one or more physical processes that include dispersion and nonlinearity. Special attention is given to the (1 + 1)-dimensional fKdV and fNLS equations, and the fourth-order NLS equations in water with and without the effects of surface tension. Included are the Davey-Stewartson (DS) equations and the Kadomtsev and Petviashvili (KP) equation in water of finite depth.

2. Basic Equations of Water Waves

We consider an unsteady irrotational motion of inviscid and incompressible water in a constant gravitational field $g$ which is in the negative $z$ direction. The uneven bottom is described by $z = b(x, y)$. Including the effects of surface tension $T$, the basic equations and the free surface boundary conditions for the velocity potential
are V24> = <j>xx + (t>vy +
4>t + \(V
+ R2) = 0
on
b(x, y)
+ ar], t > 0
(2.1)
z = h0 + ar),
(2.2)
z = h0 + ar,, on
(2.3)
z = b{x, y),
(2.4)
where the curvatures R\ and R2 are fli = % ( l + ^ + ^ ) " ^
and
R2 = 7ly(l + ri2x + 4)-K
(2-5ab)
$h_0$ is the typical depth of water and $a$ is the typical amplitude of the surface wave. It is convenient to introduce a typical horizontal length scale $l$ (which may be the wavelength $\lambda$), a typical vertical length scale $h_0$ and a typical velocity scale $c = \sqrt{gh_0}$ (the shallow water wave speed). We also introduce two fundamental parameters $\varepsilon$ and $\delta$ as
$$\varepsilon = \frac{a}{h_0}, \qquad \delta = \frac{h_0^2}{l^2}, \tag{2.6ab}$$
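where $\varepsilon$ is called the amplitude parameter and $\delta$ is called the long wavelength or shallowness parameter. Making reference to Debnath (1994), the basic water wave equations and free surface boundary conditions without the effects of surface tension ($T = 0$) in nondimensional form are
$$\delta(\phi_{xx} + \phi_{yy}) + \phi_{zz} = 0 \quad \text{in } b < z < 1 + \varepsilon\eta,\ t > 0, \tag{2.7}$$
$$\phi_t + \eta + \frac{\varepsilon}{2}(\phi_x^2 + \phi_y^2) + \frac{\varepsilon}{2\delta}\phi_z^2 = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{2.8}$$
$$\delta[\eta_t + \varepsilon(\phi_x\eta_x + \phi_y\eta_y)] - \phi_z = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{2.9}$$
$$\phi_z - \delta(\phi_x b_x + \phi_y b_y) = 0 \quad \text{on } z = b(x, y). \tag{2.10}$$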
3. The Korteweg-de Vries (KdV) Equation Near Resonant Speed

According to the linearized theory of Stoker (1957), and Debnath and Rosenblat (1969), two-dimensional water waves generated by a steady surface pressure moving at a constant speed $U$ propagate in the far field only if the pressure field travels at a speed $U < c = \sqrt{gh_0}$; they appear behind the pressure field, with no disturbances ahead. On the other hand, if $U > c$, no disturbance at all exists far from the pressure field; only transients are generated, and they decay in the far field as $t \to \infty$. However, at the resonant (or critical) speed $U = c$ (or Froude number $F = U/c = 1$) the linearized solution becomes unbounded as $t \to \infty$, so that the linearized theory breaks down due to finite-amplitude effects. The reason for the unbounded growth of the solution is that the energy transferred by the moving pressure field to the water cannot be radiated away from the disturbance because the group velocity of the induced waves tends to $U$. Thus, the neglected nonlinear terms apparently play a significant role in the evolution of the response in the finite-amplitude regime.

We formulate the nonlinear initial value problem in water of finite depth $h_0$, $0 < z < h_0$, under the action of a steady surface pressure $p(x)$ moving at a constant speed $U$. The nonlinear water wave problem is formulated in terms of the velocity potential $\Phi = Ux - \tfrac{1}{2}U^2 t +$
t = I — It*,
* = ( ^ ) * ' ,
n = an*
P={agp)p\
the basic equations for water waves, dropping the asterisks,
in 0 < z < 1 + en, z = 1 + er),
t > 0,
(3.1) (3.2)
z = l + en,
(3.3)
on z = 0
(3.4)
where $F = U/c$ is the Froude number and $\varepsilon$ is a nonlinear parameter. It has already been found that the amplitude of the linearized solution grows like $t^{1/3}$, while the dispersive effects, proportional to the square of the wavenumber, decay like $t^{-2/3}$, so that a balance is obtained when $t = O(1/\varepsilon)$. Thus, the originally neglected nonlinear terms eventually become significant and the unbounded growth can be modified. Akylas (1984a) developed an asymptotic analysis that takes account of the finite-amplitude effects ($\varepsilon$
$\eta = \varepsilon^{-1/3}A(X,T),$
(3.5)
where $\Phi$ and $A$ are assumed to be $O(1)$. In terms of the new variables $\Phi$ and $A$, the basic equations (3.1)-(3.4) are transformed together with the off-resonant condition $F = 1 + \gamma\varepsilon^{2/3}$ where $\gamma = O(1)$. Following Whitham's (1974) analysis of the nonlinear theory of the full water wave equations, we derive the evolution equation for the amplitude $A(X, T)$ to the leading order in the form
$$A_T + \gamma A_X - \tfrac{3}{2}AA_X - \tfrac{1}{6}A_{XXX} = \pi\hat{p}(0)\,\delta'(X), \tag{3.6}$$
where $\hat{p}(k)$ is the Fourier transform of $p(x)$ and $\delta(x)$ is the Dirac delta function. Equation (3.6) is the forced Korteweg-de Vries (fKdV) equation. To solve (3.6), we can obtain appropriate initial conditions by matching asymptotically the finite-amplitude response to the solution determined by the linearized theory. For $\gamma \ne 0$, the solution of the linear nonhomogeneous KdV equation can be obtained by asymptotic approximation for $T \to \infty$. For $\gamma > 0$, the disturbance is found to decay as $T \to \infty$ for any fixed $X$. On the other hand, for $\gamma < 0$, the disturbance also decays when $X < 0$, and it represents a steady wave solution for $X > 0$ in the form
$$A(X,T) \sim 6\pi\hat{p}(0)(6\gamma)^{-1/2}\sinh(\sqrt{6\gamma}\,X). \tag{3.7}$$
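A quick numerical sanity check of the steady wave (3.7) is possible under two stated assumptions: that the steady linearized equation away from the forcing reads $\gamma A_X - \tfrac{1}{6}A_{XXX} = 0$ (the dispersive coefficient $1/6$ is inferred from the factors of 6 in (3.7)), and that $\gamma < 0$, in which case the $\sinh$ of an imaginary argument reduces to a real sine. The values of `gamma` and `p_hat0` below are hypothetical.

```python
import cmath
import math

gamma = -0.5                      # detuning parameter, gamma < 0 (hypothetical value)
p_hat0 = 1.0                      # transformed pressure at k = 0 (hypothetical value)
kappa = math.sqrt(-6.0 * gamma)   # real wavenumber of the steady wave

def A(X):
    """Steady wave written with real functions, valid for gamma < 0."""
    return 6.0 * math.pi * p_hat0 * math.sin(kappa * X) / kappa

def A_complex(X):
    """The same wave in the printed form (6*gamma)^(-1/2) * sinh(sqrt(6*gamma) * X)."""
    root = cmath.sqrt(6.0 * gamma)
    return 6.0 * math.pi * p_hat0 * cmath.sinh(root * X) / root

def residual(X, h=1e-2):
    """gamma * A_X - (1/6) * A_XXX evaluated by central differences."""
    a_x = (A(X + h) - A(X - h)) / (2 * h)
    a_xxx = (A(X + 2 * h) - 2 * A(X + h) + 2 * A(X - h) - A(X - 2 * h)) / (2 * h ** 3)
    return gamma * a_x - a_xxx / 6.0

sample_points = [0.3, 1.0, 2.5, 4.0]
max_residual = max(abs(residual(X)) for X in sample_points)
max_mismatch = max(abs(A_complex(X) - A(X)) for X in sample_points)

print(max_residual < 1e-3, max_mismatch < 1e-9)
```

The residual is at the level of the finite-difference truncation error, several orders of magnitude below the size of the individual terms.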
This result is in agreement with that of the linearized theory. Moreover, a numerical study of the forced KdV equation reveals that a series of solitons are generated in front of the pressure field. At larger distances from the pressure field, waves are highly oscillatory with a larger amplitude than that predicted by the linearized theory. These predictions are in excellent agreement with the experimental findings of Huang et al. (1982).

4. The Nonlinear Schrödinger (NLS) Equation Near Resonant Conditions

According to the Debnath and Rosenblat (1969) linearized theory of water waves on a running stream of finite depth due to an oscillating surface pressure field, the solution becomes unbounded at the boundary curve separating two kinds of possible steady states. Indeed, the solution for the free surface elevation $\eta(x, t)$ becomes singular on the boundary (critical) curve in the sense that the wave amplitude grows like $x$ or like $\sqrt{t}$ as $t \to \infty$. This clearly leads to a resonance phenomenon. So, it is necessary to include nonlinear terms in the original formulation of the problem in order to achieve a physically bounded solution near resonant conditions. We next discuss (see Akylas (1984b) and Debnath (1994)) the nonlinear evolution equation associated with nonlinear water waves generated by a moving oscillatory surface pressure near resonant (or critical) conditions. We consider a two-dimensional harmonically oscillating pressure distribution of frequency $\omega$ traveling at a uniform speed $U$ on the free surface of water of constant depth $h$, $-h \le z \le 0$. It is
convenient to use a frame of reference moving with the pressure field so that the pressure is at rest and a uniform stream U exists in the water. In terms of nondimensional (asterisks) variables and parameters (x ,z ,h ) = —^(x,z,h),
t =—t,
u = — u,
e = -,
water wave equations (2.1)-(2.4) with surface tension T = 0 are, dropping the asterisks, V 2 0 = 0,
-h < z < 0,
t > 0;
Vt + Vx + £
(4.1) (4.2)
on z = -h,
(4.3) (4.4)
where c.c. stands for the complex conjugate. According to the linearized theory, the wave amplitude grows like $\varepsilon\sqrt{t}$ and the dispersive effects decay like $t^{-1/2}$ for large $t \gg 1$. Therefore, nonlinear effects are found to appear when the amplitude is $O(\sqrt{\varepsilon})$ at $t = O(\varepsilon^{-1})$. We now introduce the slow time and slow spatial variables by $T = \varepsilon t$ and $X = \sqrt{\varepsilon}\,x$, respectively. The finite amplitude effect is assumed to be in the form of a wave packet of $O(\sqrt{\varepsilon})$ amplitude, modulated by an envelope depending on $X$ and $T$. Accordingly, we assume the expansions for
4> = e~1/2
ri = £~ m+Vo
(4.5)
+ V2 + --- ,
(4.6)
where
& = $(X, 2, T)e
r)0 = A0(X, T), + ex.,
<j>2 = 3>2{X, z,T)e
2ie
+ c.c,
ie
7ft = A{X, T)e
2ie
V2
= A2(X,)e
(4.7ab) + c.c, + c.c.,
(4.8ab) (4.9)
and the phase function $\theta = k_0 x + \omega t$. In view of nonlinear effects, the primary harmonic produces the mean-flow components $\phi_0$, $\eta_0$, and the second harmonics
Aexp(-i-fT), iBT -
7
S + ( ^ - ) BXX
- ( ^ ) B*B2 = -7rap05(X),
(4.10)
where $p_0$ is the Fourier transform of $p(x)$, and $\alpha_1$, $\alpha_2$ and $a$ are given in Debnath's book (1994). Clearly, (4.10) is the forced nonlinear Schrödinger (fNLS) equation with a minor modification due to the assumed frequency detuning. In fact, the right-hand side of (4.10) represents the forcing effect of the applied pressure, which includes the Dirac delta function $\delta(X)$ as a factor. The evolution equation (4.10) can be solved subject to the initial condition determined by asymptotically matching the nonlinear response to the linearized far-field response. This nonlinear initial-boundary value problem reveals the existence of a finite-amplitude steady state solution close to the resonant conditions. However, the number and nature of the possible steady states depend on the sign of $\alpha_1\alpha_2$ (see Debnath (1994)) and the value of the detuning parameter $\gamma$. In particular, for infinitely deep water ($h \to \infty$), where $\alpha_1\alpha_2 < 0$, it can be shown that only one steady-state solution describing a uniform wave is possible for all values of $\gamma$. This finding is in excellent agreement with the result of Dagan and Miloh (1982).

5. The Nonlinear Schrödinger Equation and Evolution of Wave Packets

We consider the evolution of wave packets for gravity waves on the surface of water of uniform, but finite depth ($b = 0$). We retain the shallowness parameter $\delta$ in equations (2.7)-(2.10) and consider $\varepsilon \to 0$ for $\sqrt{\delta}$ fixed so that equations (2.7)-(2.10) reduce to
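$$\delta\phi_{xx} + \phi_{zz} = 0 \quad \text{in } 0 < z < 1 + \varepsilon\eta,\ t > 0, \tag{5.1}$$
$$\phi_t + \eta + \tfrac{1}{2}\varepsilon(\phi_x^2 + \delta^{-1}\phi_z^2) = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{5.2}$$
$$\phi_z - \delta(\eta_t + \varepsilon\phi_x\eta_x) = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{5.3}$$
$$\phi_z = 0 \quad \text{on } z = 0. \tag{5.4}$$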
In consistent with modulated waves from a Fourier integral representation of a wave described by oo
/
F(fc) exp[i(fcc - ait)]dk,
(5.5)
•oo
where F{k) is given and w = ui(k) is also a given dispersion relation. We assume that the main contribution to the wave profile comes from the neighborhood of the carrier wavenumber k — ko so that k — ko = £K and u> = uj(k) can be expanded in a Taylor series about A: = ko up to the term e2. It turns out that, as e —> 0, tf>(x, t) ~ A(x, T) exp[i(k0x — w04)],
(5-6)
where A(x, r) is known, x = e{x - cgt), u>0 = uj(k0), r = e 2 t and cg = ui'(ko) is the group velocity. We recognize here that the relevant scales seem to be associated with both e and e , and hence, we introduce £ = x — Cpt, X = e(x — Cgt),
T
= e2t,
(5.7)
where $c_p$ is the phase velocity of the wave. Following Johnson (1997), it turns out that the asymptotic solution takes the form
$$\eta_0 = A_0(X,T)\exp(ik\xi) + \text{c.c.}, \tag{5.8}$$
$$\phi_0 = f_0(X,T) + F_0(X,z,T)\exp(ik\xi) + \text{c.c.}, \tag{5.9}$$
where c.c. stands for the complex conjugate of the terms in $\exp(ik\xi)$. The asymptotic analysis leads to an equation for $A_0(X,T)$ in the form
$$-2ikc_p A_{0T} + \alpha A_{0XX} + \beta A_0|A_0|^2 = 0, \tag{5.10}$$
where
$$\alpha = c_g^2 - (1 - \delta k\tanh\delta k)\,\mathrm{sech}^2\delta k, \tag{5.11}$$
$$\beta = \frac{k^2}{c_p^2}\left[\tfrac{1}{2}\left(1 + 9\coth^2\delta k - 13\,\mathrm{sech}^2\delta k - 2\tanh^4\delta k\right) - (2c_p + c_g\,\mathrm{sech}^2\delta k)^2(1 - c_g^2)^{-1}\right]. \tag{5.12}$$
Equation (5.10) is one of the standard forms of the nonlinear Schrödinger equation. It is easy to check that $\alpha > 0$ for all $\delta k$ and that $\beta$ changes sign from positive to negative as $\delta k$ decreases.

6. Higher-order Nonlinear Schrödinger Equations

It is well known from the theory and experiments of Benjamin and Feir (1967) that a finite-amplitude uniform train of surface gravity waves is unstable to modulational perturbations with sufficiently long wavelengths. According to Yuen and Lake (1982), the envelope of a weakly nonlinear wave packet in deep water is governed by the nonlinear Schrödinger equation. This equation seems to provide an accurate description of the evolution of a wave packet of small wave steepness $\varepsilon = ak$ only for a limited time, at most $O(\varepsilon^{-2})$ wave periods. Feir (1967) confirmed this restriction on the validity of the NLS equation experimentally, and confirmed that an initially symmetric wave packet of uniform frequency and moderate wave steepness eventually loses its symmetry as it propagates away from the wave maker, and splits into two prominent groups of different frequencies. Later on Su (1982a,b) observed similar phenomena in his more comprehensive experiments with initially symmetric wave packets of uniform frequency and various durations. For short pulses his results extend those of Feir and reveal further information on group splitting and frequency downshift in the leading wave group.

Both Lake et al. (1977) and Melville (1982) performed instability experiments on a uniform wavetrain with the typical value of wave steepness $ka > 0.2$. The sideband disturbances, if they were of equal magnitude initially, were found to grow at an equal rate only for a limited time. As nonlinear effects become more and more significant, the lower sideband grows faster and attains a greater maximum than the upper sideband, whereas the carrier wave drops to a minimum. Before these extrema were attained, local breaking was observed by Melville and probably occurred in the experiments of Lake et al. Lake et al. (1977) suggested that the unequal growth is possibly responsible for the downward shift of the spectral peak of wind waves with increasing fetch. Janssen (1981) investigated the long time behavior of the Benjamin-Feir modulational instability. His study based on the NLS equation exhibits the Fermi-Pasta-Ulam (1955) recurrence phenomenon, and is in qualitative agreement with the experimental work of Lake et al. (1977) and the numerical computation of Yuen and Ferguson (1978a,b). Janssen's analysis also reveals that other effects including dissipation are likely to explain the observed frequency downshift in the experiments of Lake et al. (1977).

For small amplitudes, the NLS equation was derived by several authors (see Debnath, 1994) to describe the evolution of a wavetrain, and this equation seems to be correct to third order in the wave steepness. Subsequently, Longuet-Higgins' (1978a,b) work on the normal mode perturbation analysis of the fully nonlinear water wave problems in deep water confirmed that the NLS equation provides an adequate description for all but the smallest wave steepness. However, the preceding NLS equation is found to compare rather unfavorably with the exact results of Longuet-Higgins for $\varepsilon > 0.10$. Later, Dysthe (1979) showed that a significant improvement can be made by extending the perturbation analysis to $O(\varepsilon^4)$. He derived the evolution equation to fourth order in the amplitude of the wave potential, leading to a major improvement in agreement with Longuet-Higgins' analysis. In nondimensional form the complex amplitude $A$ of the first harmonic of the Stokes waves, the average value
2i (At + -Ax\
+ -Ayy = kAxxx
- -Axx
- A\A\2
2 -J-K.-.-X - - - X*), - -2 i \A\ AX + A(
0 t + 2C = O,
(6.1) (6.2)
2
^ z - 2(t = (|^| )x.
(6.3)
The right hand side of equation (6.1) is made up of contributions of $O(\varepsilon^4)$. Two linear terms on the right hand side of (6.1) are simply corrections to the dispersive effects of the NLS equation. In the absence of the fourth-order effects, the whole right hand side of (6.1) becomes zero, so that (6.1) reduces to the ordinary NLS equation derived by many authors in the 1960s and 1970s. However, the fourth order NLS equation was first derived by Dysthe (1979) to eliminate the weakness of the ordinary NLS equation. His fourth order NLS equation gives a significant improvement on the results relating to the stability of finite amplitude waves. The dominant new effect introduced by adding terms of order $\varepsilon^4$ to the ordinary NLS equation is the mean flow response to nonuniformities of the radiation stress caused by modulation of a finite amplitude wave. Moreover, the horizontal component of the mean flow along the direction of wave propagation causes a slowly varying Doppler shift of the wave, as represented by the last term in (6.1). The Doppler shift seems to have a detuning effect on the modulational instability. Among the new terms in (6.1) only the term $A\phi_x$ makes a significant contribution to the stability of a finite amplitude wave. Thus, for stability analysis, it is sufficient to use the following simplified Dysthe equations
+ ^A\A x)+^A x) yy--AA 2 yyxx
= A{\A\2 +
4
2
$Z-(\A\ )X
on
2
V 0 = 0
for
2 = 0, 2 < 0.
(6.4) (6.5) (6.6)
In deep water, so that $kh = O((ka)^{-1}) \gg 1$, Dysthe's equations for $A$ and $\phi$ in dimensional form are
: At +
i
TkA*)-w^-\«k2w2A i w iuik t 3zoifc 2 Axxx + -rA2Ax 2 A - 1__ — - \ A \ A X + kA[^>x]z=0 = 0, 16F 1I, + T 2 V2
(6.7)
-/i
(6.8)
on
z = 0,
(6.9)
on
z = -h
(6.10)
2
where $\phi$ is the potential of the induced mean current. The first four terms in (6.7) constitute the NLS equation in the fixed frame of reference. All terms on the right hand side of (6.7) are $O(\varepsilon^4)$.

Lo and Mei (1985) made a numerical study of the water-wave modulation based on Dysthe's fourth order NLS system (6.7)-(6.10). Their analysis shows reasonable agreement with the recent experimental results. So the Dysthe equation represents a useful model for predicting the long-time evolution of narrow-banded weakly nonlinear waves. Recently, several authors including Janssen (1983), Stiassnie (1984) and Hogan (1985) derived the fourth-order NLS equation for gravity waves and for capillary-gravity waves. Stiassnie showed that the Dysthe fourth-order NLS equation is a special case of the more general Zakharov equation, which is free from the narrow spectral width assumption. Hogan (1985) extended Stiassnie's analysis to deep water capillary-gravity wave packets. Based on the Zakharov integral equation, Hogan derived the fourth-order NLS equation. In addition to the leading order dispersive and nonlinear terms in the NLS equation, Hogan's fourth-order NLS equation features certain nonlinear modulation terms and a nonlocal term that describes the coupling of the envelope with the induced mean flow. Hogan's analysis also reveals that there is a band of stable capillary-gravity waves at the fourth order; such a band is known to exist at third order. The effects of the mean flow for pure capillary waves are, in general, of opposite sign to those of pure gravity waves. Hogan showed that the second-order corrections to first-order stability properties depend on the interaction between the mean flow and the envelope frequency-dispersion term involved in the fourth-order NLS equation. Indeed, Hogan's derivation is based on the Zakharov integral equation under the assumption of a narrow band of waves, and includes the interactions of capillary-gravity waves. His fourth-order evolution equation for the complex wave amplitude $a(x, t)$ for capillary-gravity waves in deep water is given in the form
$$2i(a_t + c_g a_x) + p\,a_{xx} + q\,a_{yy} - \gamma|a|^2 a = -i(r\,a_{xxx} + s\,a_{xyy}) + i(v|a|^2 a_x - u\,a^2 a^*_x) + a\phi_x, \tag{6.11}$$
where the group velocity $c_g$ is
$$c_g = \left(\frac{\omega_0}{2k_0}\right)\frac{1 + 3\kappa}{1 + \kappa}, \qquad \kappa = \frac{S k_0^2}{g},$$
and the other coefficients in (6.11) are given by
$$p = \frac{3\kappa^2 + 6\kappa - 1}{4(1 + \kappa)^2}, \qquad q = \frac{1}{2}\left(\frac{1 + 3\kappa}{1 + \kappa}\right), \qquad \gamma = \frac{8 + \kappa + 2\kappa^2}{8(1 - 2\kappa)(1 + \kappa)},$$
$$r = -\frac{(1 - \kappa)(1 + 6\kappa + \kappa^2)}{8(1 + \kappa)^3}, \qquad s = \frac{3 + 2\kappa + 3\kappa^2}{4(1 + \kappa)^2},$$
$$v = \frac{3(4\kappa^4 + 4\kappa^3 - 9\kappa^2 + \kappa - 8)}{8(1 + \kappa)^2(1 - 2\kappa)^2}, \qquad u = \frac{(1 - \kappa)(8 + \kappa + 2\kappa^2)}{16(1 - 2\kappa)(1 + \kappa)^2}.$$
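The linear coefficients in (6.11) can be cross-checked against the dispersion relation. The sketch below rests on assumptions that are not stated in the text: the deep-water capillary-gravity relation $\omega^2 = gk + Sk^3$ with $S$ the surface tension divided by density, $\kappa = Sk_0^2/g$, and the reading of $p$ as the nondimensional curvature $k_0^2\omega''(k_0)/\omega_0$. The numerical values of `S` and `k0` are hypothetical. It compares the closed-form $c_g$ and $p$ with central finite differences.

```python
import math

g = 9.81     # gravitational acceleration (m/s^2)
S = 7.3e-5   # surface tension divided by density (m^3/s^2), assumed value for water
k0 = 100.0   # carrier wavenumber (rad/m), hypothetical

def omega(k):
    """Assumed deep-water capillary-gravity dispersion relation."""
    return math.sqrt(g * k + S * k ** 3)

kappa = S * k0 ** 2 / g
w0 = omega(k0)

# Closed-form expressions as read from the coefficient list of (6.11).
cg_formula = (w0 / (2 * k0)) * (1 + 3 * kappa) / (1 + kappa)
p_formula = (3 * kappa ** 2 + 6 * kappa - 1) / (4 * (1 + kappa) ** 2)

# Central finite differences for omega'(k0) and omega''(k0).
h = 1e-3 * k0
cg_numeric = (omega(k0 + h) - omega(k0 - h)) / (2 * h)
curvature = (omega(k0 + h) - 2 * w0 + omega(k0 - h)) / h ** 2
p_numeric = k0 ** 2 * curvature / w0

print(math.isclose(cg_formula, cg_numeric, rel_tol=1e-5))
print(math.isclose(p_formula, p_numeric, rel_tol=1e-5))
```

Under the same assumptions, $p$ changes sign where $3\kappa^2 + 6\kappa - 1 = 0$, which is the wavenumber of minimum group velocity.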
For pure capillary waves, equation (6.11) reduces to the form 2i(at
+ -ax J + -axx + -ayy + -\a\2a - --(axxx
+ daxyy) + -(3\a\2ax - a2a*x) + a<j>x. (6.12)
Neglecting the right hand side of (6.12) leads to the third-order envelope equation for capillary-gravity waves in deep water. Hogan showed that the second-order corrections to first-order stability properties depend essentially on the interaction between the mean flow and the envelope frequency-dispersion term in the governing equation. From a physical point of view, two kinds of interaction influence the stability of a nonlinear wavetrain. First, it was shown by Lighthill (1965) that the relative signs of the envelope frequency dispersion term $p\,a_{xx}$ and the nonlinear term $(-\gamma|a|^2 a)$ govern the stability of the solution. There exists a band of stable waves where the product $p\gamma$ is positive. Second, the corrections to the stability characteristics that occur at higher order can be found to arise from an interaction between the mean flow term (a
It is well known that the frequency-dispersion term for gravity waves is of opposite sign to that for capillary waves. For gravity waves the detuned nonlinear term $A(|A|^2 +$
According to Davey and Stewartson, the slow (or weak) dependence occurs in both the $x$- and $y$-directions, but the rapid oscillation is only in the $x$-direction, and the wavepacket propagates in the $x$-direction with a slowly varying nature in both the $x$- and $y$-directions. The group velocity is assumed to be still associated with the propagation in the $x$-direction. The governing equations are, in nondimensional form,
$$\delta(\phi_{xx} + \phi_{yy}) + \phi_{zz} = 0, \quad 0 < z < 1 + \varepsilon\eta,\ t > 0, \tag{7.1}$$
$$\eta_t + \varepsilon(\phi_x\eta_x + \phi_y\eta_y) - \delta^{-1}\phi_z = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{7.2}$$
$$\phi_t + \eta + \tfrac{1}{2}\varepsilon(\phi_x^2 + \phi_y^2 + \delta^{-1}\phi_z^2) = 0 \quad \text{on } z = 1 + \varepsilon\eta, \tag{7.3}$$
$$\phi_z = 0 \quad \text{on } z = 0. \tag{7.4}$$
It is convenient to introduce new variables

$$\xi = x - c_p t, \qquad X = \varepsilon(x - c_g t), \qquad Y = \varepsilon y, \qquad \tau = \varepsilon^2 t \qquad (7.5)$$
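so that equations (7.1)-(7.4) can readily be transformed into a system

$$\phi_{zz} + \delta^2\left[\phi_{\xi\xi} + 2\varepsilon\phi_{\xi X} + \varepsilon^2(\phi_{XX} + \phi_{YY})\right] = 0, \qquad (7.6)$$

$$-c_p\eta_\xi - \varepsilon c_g\eta_X + \varepsilon^2\eta_\tau + \varepsilon\left[(\phi_\xi + \varepsilon\phi_X)(\eta_\xi + \varepsilon\eta_X) + \varepsilon^2\phi_Y\eta_Y\right] - \delta^{-2}\phi_z = 0 \quad \text{on} \quad z = 1 + \varepsilon\eta, \qquad (7.7)$$

$$-c_p\phi_\xi - \varepsilon c_g\phi_X + \varepsilon^2\phi_\tau + \eta + \tfrac{1}{2}\varepsilon\left[(\phi_\xi + \varepsilon\phi_X)^2 + \varepsilon^2\phi_Y^2\right] + \tfrac{1}{2}\varepsilon\delta^{-2}\phi_z^2 = 0 \quad \text{on} \quad z = 1 + \varepsilon\eta, \qquad (7.8)$$

$$\phi_z = 0 \quad \text{on} \quad z = 0. \qquad (7.9)$$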
Retaining terms $O(\varepsilon^2)$ generates the only contribution from the dependence in $Y$ from the term $\phi_{YY}$ in the Laplace equation (7.6). The other terms involving derivatives in $Y$ produce new nonlinear interactions that arise first at $O(\varepsilon^3)$. We seek a solution of the form

$$\phi = \sum_{n=0}^{\infty}\varepsilon^n\left\{\sum_{m=0}^{n+1}\phi_{nm}E^m + \text{c.c.}\right\}, \qquad (7.10)$$

$$\eta = \sum_{n=0}^{\infty}\varepsilon^n\left\{\sum_{m=0}^{n+1}\eta_{nm}E^m + \text{c.c.}\right\}, \qquad (7.11)$$
where $E = \exp(ik\xi)$ and $\eta_{00} = 0$, so that the first approximation to the surface gravity wave is purely harmonic. To the order $O(\varepsilon^2)$, the problem at $\varepsilon^2E^0$ gives the equation for $f_0$

$$(1 - c_g^2)f_{0XX} + f_{0YY} = -c_p^{-2}\left(2c_p + c_g\,\mathrm{sech}^2\delta k\right)\frac{\partial|A_0|^2}{\partial X}. \qquad (7.12)$$
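Given $A_0 = A_{01}$, the surface boundary conditions at $\varepsilon^2E^1$ lead to

$$-2ikc_pA_{0\tau} + \alpha A_{0XX} - c_pc_gA_{0YY} + \tfrac{1}{2}k^2c_p^{-2}\left(1 + 9\coth^2\delta k - 13\,\mathrm{sech}^2\delta k - 2\tanh^4\delta k\right)A_0|A_0|^2 + k^2\left(2c_p + c_g\,\mathrm{sech}^2\delta k\right)A_0f_{0X} = 0. \qquad (7.13)$$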
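The coupled equations (7.12)-(7.13) are known as the Davey-Stewartson (DS) equations for the modulation of harmonic waves. In the absence of $Y$ dependence, it turns out that equation (7.12) takes the form

$$(1 - c_g^2)f_{0XX} = -c_p^{-2}\left(2c_p + c_g\,\mathrm{sech}^2\delta k\right)\frac{\partial|A_0|^2}{\partial X}. \qquad (7.14)$$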
This equation provides the leading contribution to the mean drift generated by the nonlinear interaction of the wave motion, usually called the Stokes drift. Similarly, equation (7.13) reduces to the ordinary nonlinear Schrödinger equation

$$-2ikc_pA_{0\tau} + \alpha A_{0XX} + \beta A_0|A_0|^2 = 0 \qquad (7.15)$$
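which is identical with (5.10). Finally, the Davey-Stewartson equations can also be written in the compact form

$$(1 - c_g^2)f_{0XX} + f_{0YY} = -\frac{\gamma}{c_p^2}\frac{\partial|A_0|^2}{\partial X}, \qquad (7.16)$$

$$-2ikc_pA_{0\tau} + \alpha A_{0XX} - c_pc_gA_{0YY} + \left\{\beta + \frac{\gamma^2k^2}{c_p^2(1 - c_g^2)}\right\}A_0|A_0|^2 + k^2\gamma A_0f_{0X} = 0, \qquad (7.17)$$

where $\alpha$ and $\beta$ are given by (5.11)-(5.12) and

$$\gamma = 2c_p + c_g\,\mathrm{sech}^2\delta k \qquad (7.18)$$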
where $\gamma > 0$ and $c_p, c_g > 0$. These evolution equations can be further approximated for long waves ($\delta \to 0$) and short waves ($\delta \to \infty$) respectively. However, their validity is open to criticism because other terms become important. The long wave approximation ($\delta \to 0$) leads to the one-dimensional propagation of long waves, so that the relevant equation is the KdV equation. Thus the KdV and NLS equations are two fundamental equations for weakly nonlinear waves. The former represents long waves and can be derived in the limit as $\delta \to 0$ and $\varepsilon \to 0$ with $\delta = O(\varepsilon)$. However, the use of a suitable rescaled variable allows us to obtain the KdV equation for arbitrary $\delta$. On the other hand, the NLS equation requires scaled variables which are defined with respect to $\varepsilon$ only, with $\delta = O(1)$ retained as a parameter. At least for a class of nonlinear waves, there are two representations:
(i) $\eta(x, t, \varepsilon, \delta)$ with $\varepsilon \to 0$ and $\delta \to 0$ gives the KdV equation,
(ii) $\eta(x, t, \varepsilon, \delta)$ with $\varepsilon \to 0$ for fixed $\delta$ leads to the NLS equation.

The same match also occurs between the Davey-Stewartson equations (7.12) and (7.13) and the two-dimensional KdV equation (see Freeman and Davey (1975)). In fact, the long wave limit ($\delta \to 0$) of the DS equations matches the short wave limit ($\delta \to \infty$) of the (1 + 2)-dimensional KdV equation

$$\left(2\eta_{0\tau} + 3\eta_0\eta_{0\xi} + \tfrac{1}{3}\eta_{0\xi\xi\xi}\right)_\xi + \eta_{0YY} = 0. \qquad (7.19)$$

This equation is often called the Kadomtsev-Petviashvili (KP) equation (Kadomtsev and Petviashvili, 1970). It is another completely integrable evolution equation. It also admits an exact analytical solution describing any number of waves that cross obliquely and interact nonlinearly. For the case of three such waves, special solutions exist corresponding to a resonance phenomenon.

Acknowledgement. This paper is based on the lecture delivered at the International Conference on Mathematics and the 21st Century in Cairo, Egypt in January 2000. The author expresses his grateful thanks to the Organizing Committee of the Conference and the University of Central Florida for their financial support for attending the Conference.
References

Akylas, T.R., Dias, F. and Grimshaw, R.H.J. (1998). The effect of the induced mean flow on solitary waves in deep water, J. Fluid Mech. 355, 317-328.
Akylas, T.R. (1984a). On the excitation of long nonlinear water waves by a moving pressure distribution, J. Fluid Mech. 141, 455-466.
Akylas, T.R. (1984b). On the excitation of nonlinear water waves by a moving pressure distribution oscillating at resonant frequency, Phys. Fluids 27, 2803-2807.
Benjamin, T.B. and Feir, J.E. (1967). The disintegration of wavetrains on deep water, Part 1, Theory, J. Fluid Mech. 27, 417-430.
Benney, D. and Roskes, G. (1969). Wave instabilities, Studies Appl. Math. 48, 377-385.
Dagan, G. and Miloh, T. (1982). Free-surface flow past oscillating singularities at resonant frequency, J. Fluid Mech. 120, 139-156.
Davey, A. and Stewartson, K. (1974). On three-dimensional packets of surface waves, Proc. Roy. Soc. Lond. A338, 101-110.
Debnath, L. (1994). Nonlinear Water Waves, Academic Press, Boston.
Debnath, L. and Rosenblat, S. (1969). The ultimate approach to the steady state in the generation of waves on a running stream, Quart. Jour. Mech. and Appl. Math. XXII, 221-233.
Dysthe, K.B. (1979). Note on a modification to the nonlinear Schrödinger equation for application to deep water waves, Proc. Roy. Soc. Lond. A369, 105-114.
Feir, J.E. (1967). Discussion: Some results from wave pulse experiments, Proc. Roy. Soc. Lond. A299, 54-58.
Freeman, N.C. and Davey, A. (1975). On the evolution of packets of long surface waves, Proc. Roy. Soc. Lond. A344, 427-433.
Hogan, S.J. (1985). The fourth-order evolution equation for deep-water gravity-capillary waves, Proc. Roy. Soc. Lond. A402, 359-372.
Huang, D.-B., Sibul, O.J., Webster, W.C., Wehausen, J.V., Wu, D.-M. and Wu, T.Y. (1982). Ship moving in a transcritical range, Proc. Conf. on Behavior of Ships in Restricted Waters (Varna, Bulgaria), 2, 26-2 - 26-10.
Janssen, P.A.E.M. (1983). On a fourth-order envelope equation for deep water waves, J. Fluid Mech. 126, 1-11.
Janssen, P.A.E.M. (1981). Modulational instability and the Fermi-Pasta-Ulam recurrence, Phys. Fluids 24, 23-36.
Kadomtsev, B.B. and Petviashvili, V.I. (1970). On the stability of solitary waves in weakly dispersive media, Sov. Phys. Dokl. 15, 539-541.
Lake, B.M., Yuen, H.C., Rundgaldier, H. and Ferguson, W.E. (1977). Nonlinear deep-water waves: Theory and experiment, Part 2, Evolution of a continuous wave train, J. Fluid Mech. 83, 49-74.
Lighthill, M.J. (1965). Contributions to the theory of waves in nonlinear dispersive systems, Proc. Roy. Soc. Lond. A299, 38-53.
Lo, E. and Mei, C.C. (1985). A numerical study of water-wave modulation based on a higher-order nonlinear Schrödinger equation, J. Fluid Mech. 150, 395-416.
Longuet-Higgins, M.S. (1978a). The instabilities of gravity waves of finite amplitude in deep water I, Superharmonics, Proc. Roy. Soc. Lond. A360, 471-488.
Longuet-Higgins, M.S. (1978b). The instabilities of gravity waves of finite amplitude in deep water II, Subharmonics, Proc. Roy. Soc. Lond. A360, 489-505.
Melville, W.K. (1982). The instability and breaking of deep-water waves, J. Fluid Mech. 115, 165-185.
Stiassnie, M. (1984). Note on the modified nonlinear Schrödinger equation for deep water waves, Wave Motion 6, 431-433.
Stoker, J.J. (1957). Water Waves, Interscience, New York.
Su, M.-Y. (1982a). Three-dimensional deep-water waves, Part I, Experimental measurement of skew and symmetric wave patterns, J. Fluid Mech. 124, 73-108.
Su, M.-Y. (1982b). Evolution of groups of gravity waves with moderate to high steepness, Phys. Fluids 25, 2167-2174.
Whitham, G.B. (1974). Linear and Nonlinear Waves, John Wiley, New York.
Yuen, H.C. and Lake, B.M. (1982). Nonlinear dynamics of deep-water, Ann. Rev. Fluid Mech. 12, 303-334.
Yuen, H.C. and Ferguson, W.E. (1978a). Relationship between Benjamin-Feir instability and recurrence in the nonlinear Schrödinger equation.
Yuen, H.C. and Ferguson, W.E. (1978b). Fermi-Pasta-Ulam recurrence in the two space dimensional nonlinear Schrödinger equation, Phys. Fluids 21, 2116-2118.
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 189-198)
A ROBUST LAYER-RESOLVING NUMERICAL METHOD FOR A FREE CONVECTION PROBLEM

JOCELYN ETIENNE, JOHN J. H. MILLER, GRIGORII I. SHISHKIN
ABSTRACT. We consider free convection near a semi-infinite vertical flat plate. This problem is singularly perturbed with perturbation parameter Gr, the Grashof number. Our aim is to find numerical approximations of the solution in a bounded domain, which does not include the leading edge of the plate, for arbitrary values of Gr > 1. Thus, we need to determine values of the velocity components and temperature with errors that are Gr-independent. We use the Blasius approach to reformulate the problem in terms of two coupled nonlinear ordinary differential equations on a semi-infinite interval. A novel iterative numerical method for the solution of the transformed problem is described and numerical approximations are obtained for the Blasius solution functions, their derivatives and the corresponding physical velocities and temperature. The numerical method is Gr-uniform in the sense that error bounds of the form $C_pN^{-p}$, where $C_p$ and $p$ are independent of Gr, are valid for the interpolated numerical solutions. The numerical approximations are therefore of controllable accuracy.

Keywords and Phrases: Robust method, Layer-resolving, Boundary layer, Fitted mesh, Free convection, Coupled nonlinear equations
1 THE FREE CONVECTION PROBLEM

A free or natural convection flow occurs when a fluid at rest, subjected to a body force such as gravity, is near an object at a different temperature. The heat transfer between the object and the fluid causes an increase or a decrease in the fluid density at the surface of the object, and thus generates an unbalancing body force. The fluid near the surface is accelerated, and a boundary layer develops. We study this problem for a two-dimensional, steady flow near a semi-infinite flat plate. This involves an interesting and typical system of singularly perturbed partial differential equations.
Our goal is to construct a numerical method for this problem in a bounded domain that does not include the leading edge of the plate. Because the solution we seek is self-similar, using Blasius' approach, we can reduce the problem to the numerical solution of a coupled system of nonlinear ordinary differential equations. We require that the numerical approximations generated by our method converge to the exact solution with an order of convergence that is independent of the Grashof number Gr for all Gr > 1. We refer to a numerical method with this property as a layer-resolving method. No standard numerical method fulfils this requirement.

1.1 PHYSICAL DESCRIPTION
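We consider a semi-infinite vertical flat plate in an incompressible fluid. We assume that the density of the fluid varies linearly with the temperature and that its other properties are constant. The plate is heated to the temperature $\theta_1$, while the fluid temperature away from the plate is $\theta_\infty$. For $\theta_1 > \theta_\infty$, the heat transfer into the fluid decreases its density in a small region around the plate, resulting in an upward motion of the fluid. Since motion in the fluid results only from this heat transfer, we assume that the fluid away from the surface is not affected by this upward motion. The governing equations are

$$\begin{aligned}
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} &= 0 \\
u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} &= g\beta(\theta - \theta_\infty) + \nu\frac{\partial^2 u}{\partial y^2} \\
u\frac{\partial\theta}{\partial x} + v\frac{\partial\theta}{\partial y} &= \alpha\frac{\partial^2\theta}{\partial y^2}
\end{aligned} \qquad (1)$$

with the boundary conditions

$$v = u = 0, \quad \theta = \theta_1 \quad \text{for } y = 0; \qquad u \to 0, \quad \theta \to \theta_\infty \quad \text{for } y \to \infty.$$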
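When we non-dimensionalise these equations we obtain

$$\begin{aligned}
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} &= 0 \\
u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} &= Gr\,\theta + \frac{\partial^2 u}{\partial y^2} \\
u\frac{\partial\theta}{\partial x} + v\frac{\partial\theta}{\partial y} &= \frac{1}{Pr}\frac{\partial^2\theta}{\partial y^2}
\end{aligned} \qquad (2)$$

with the boundary conditions

$$v = u = 0, \quad \theta = 1 \quad \text{for } y = 0; \qquad u \to 0, \quad \theta \to 0 \quad \text{for } y \to \infty$$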
where the Grashof and Prandtl numbers have their usual definitions

$$Gr = \frac{g\beta L^3(\theta_1 - \theta_\infty)}{\nu^2} \qquad (3)$$

$$Pr = \frac{\nu}{\alpha} \qquad (4)$$

1.2 BLASIUS' FORMULATION
The problem is now transformed using Blasius' technique to a one-dimensional problem. For a complete description of this we refer the reader to [2]. The transformed problem involves the unique dimensionless variable

$$\eta = \left(\frac{Gr}{4}\right)^{1/4}\frac{y}{x^{1/4}}$$

and two dimensionless functions $f$ and $t$, which are related to the physical velocities and temperature through the following relations

$$\theta(x,y) = t(\eta) \qquad (5)$$

$$u(x,y) = 4\left(\frac{Gr}{4}\right)^{1/2}x^{1/2}f'(\eta) \qquad (6)$$

$$v(x,y) = \left(\frac{Gr}{4}\right)^{1/4}x^{-1/4}\left(\eta f'(\eta) - 3f(\eta)\right) \qquad (7)$$
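The change of variables in relations (5)-(7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `similarity_eta` and `physical_fields` are hypothetical, and the Blasius functions are passed in as arbitrary callables rather than computed.

```python
def similarity_eta(x, y, gr):
    """Similarity variable eta = (Gr/4)^(1/4) * y / x^(1/4)."""
    return (gr / 4.0) ** 0.25 * y / x ** 0.25


def physical_fields(x, y, gr, f, fp, t):
    """Map Blasius functions to (u, v, theta) through relations (5)-(7).

    f, fp and t are callables returning f(eta), f'(eta) and t(eta);
    they are placeholders supplied by the caller (an assumption of this sketch).
    """
    eta = similarity_eta(x, y, gr)
    u = 4.0 * (gr / 4.0) ** 0.5 * x ** 0.5 * fp(eta)
    v = (gr / 4.0) ** 0.25 * x ** -0.25 * (eta * fp(eta) - 3.0 * f(eta))
    theta = t(eta)
    return u, v, theta
```

The exponents make the scaling explicit: $u$ grows like $Gr^{1/2}$ and $v$ like $Gr^{1/4}$, which is why the velocity components have to be rescaled before Gr-uniform error bounds can be stated.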
In terms of these functions, the governing equations become

$$t'' + 3Pr\,f\,t' = 0 \qquad (8)$$

$$f''' + 3ff'' - 2f'^2 + t = 0 \qquad (9)$$

with the boundary conditions

$$f(0) = f'(0) = 0, \quad t(0) = 1; \qquad f'(\eta \to \infty) \to 0, \quad t(\eta \to \infty) \to 0.$$
This is again a singularly perturbed problem. Our aim is to find numerical approximations of the velocity components and temperature in a bounded domain, which does not include the leading edge of the plate. Because we want this solution for arbitrary values of Gr > 1, we need to determine numerical approximations to the solution of the above problem at each point $\eta$ of the semi-infinite interval $I = (0, \infty)$.
2 LAYER-RESOLVING METHOD FOR BLASIUS' PROBLEM
The equations obtained by Blasius are posed on the semi-infinite interval $I$, and there are boundary conditions at infinity. The problem cannot be solved numerically in this form. A standard alternative approach is to satisfy these boundary conditions by using an iterative method involving additional boundary conditions at $\eta = 0$ (see [2]).
Figure 1: Mesh on semi-infinite domain for Blasius' problem.

Here we use the method described in [1, Chap. 11], which yields a solution on the whole of $I$. We divide $I$ into two subintervals, $[0, L]$ and $[L, +\infty)$. On the first we define a discrete problem on a uniform mesh, and on the second we define an affine extension of each function using the boundary conditions. Thus, for $\bar T$, the interpolated function of the discrete approximation of $t$, we have $\bar T(\eta \to +\infty) = 0$ and therefore we take $\bar T(\eta \ge L) = 0$. Similarly, for $\bar F$ we know that $\frac{d}{d\eta}f(\eta \to +\infty) = 0$ and we take $\bar F(\eta \ge L) = F(L)$, where this latter value is obtained from the solution of the discrete problem on $[0, L]$.

In order to guarantee that the method is Gr-uniform, a careful choice of the point $L$ is of course crucial. We take $L_N = \ln N$ and on the subinterval $[0, L_N]$ we define the uniform mesh $I^N = \{x_i = iN^{-1}\ln N : 0 \le i \le N\}$. This choice is motivated by the discussion in [1] for a simpler problem. The computations described in what follows show that in practice the resulting method is Gr-uniform. We introduce the discrete problem

$$(P_L^N)\quad\begin{cases}
\delta^2T_i + 3Pr\,F_iD^-T_i = 0 & \forall i \in \{1, \dots, N-1\}, \\
D^-\delta^2F_i + 3F_i\delta^2F_i - 2(D^-F_i)^2 = -T_i & \forall i \in \{2, \dots, N-1\},
\end{cases}$$

with

$$F_0 = 0, \quad D^+F_0 = 0, \quad D^-F_N = 0, \quad T_0 = 1, \quad T_N = 0,$$

where $D^+$ and $D^-$ are the forward and backward first order finite difference operators, $\delta^2$ is the centred second order operator and, for any mesh function $G$, $G_i = G(x_i)$ for all $x_i \in I^N$. We need to linearize these equations. The natural first attempt is the linearization

$$\delta^2T^m + 3Pr\,F^{m-1}D^-T^m = 0 \qquad (10)$$

$$D^-\delta^2F^m + 3F^{m-1}\delta^2F^m - 2(D^-F^{m-1})^2 = -T^m \qquad (11)$$
which we iterate until uniform convergence is achieved for a given tolerance. But, for a fixed $N$, the iterates do not converge as $m$ grows. In fact we have

$$\lim_{m\to\infty}F^{2m} \ne \lim_{m\to\infty}F^{2m+1} \qquad \text{and} \qquad \lim_{m\to\infty}T^{2m} \ne \lim_{m\to\infty}T^{2m+1}.$$

Figure 2: Oscillations of F function (sketch), showing the results of even iterations and of odd iterations.
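The mesh $I^N$ and the difference operators entering the discrete problem can be sketched as follows. This is an illustration only: the function names are hypothetical, and the operators are taken to be the standard first-order forward and backward differences and the centred second difference, which the text names but does not write out.

```python
import math


def uniform_log_mesh(n):
    """Mesh I^N = {x_i = i * ln(N) / N : 0 <= i <= N} on [0, ln N]; returns (points, h)."""
    h = math.log(n) / n
    return [i * h for i in range(n + 1)], h


def d_plus(g, i, h):
    """Forward difference D+ G_i = (G_{i+1} - G_i) / h."""
    return (g[i + 1] - g[i]) / h


def d_minus(g, i, h):
    """Backward difference D- G_i = (G_i - G_{i-1}) / h."""
    return (g[i] - g[i - 1]) / h


def delta2(g, i, h):
    """Centred second difference (G_{i+1} - 2 G_i + G_{i-1}) / h^2."""
    return (g[i + 1] - 2.0 * g[i] + g[i - 1]) / (h * h)


def d_minus_delta2(g, i, h):
    """Third-order operator D- delta^2 G_i; it reaches back to G_{i-2}, hence i >= 2."""
    return (delta2(g, i, h) - delta2(g, i - 1, h)) / h
```

The stencil of `d_minus_delta2` uses the nodes $i-2$ to $i+1$, which is consistent with the momentum equation being imposed only for $i \in \{2, \dots, N-1\}$ while the energy equation is imposed from $i = 1$.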
Figure 2 shows a sketch of the oscillations of the iterates $F^m$. To prevent these oscillations we use previously computed values of $F$ by introducing the auxiliary variables

$$F_c^{m-1} = \tfrac{1}{2}F^{m-1} + \tfrac{1}{2}F_c^{m-2}. \qquad (12)$$
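The effect of the averaging step (12) can be seen on a toy scalar problem. The sketch below is not the authors' solver: the map `g` is a made-up example whose plain fixed-point iteration flips between two values, standing in for the even/odd oscillation of the iterates, and the function names are hypothetical.

```python
def plain_iteration(g, x0, steps):
    """Iterate x^m = g(x^{m-1}) and return the list of iterates."""
    x = x0
    history = []
    for _ in range(steps):
        x = g(x)
        history.append(x)
    return history


def averaged_iteration(g, x0, steps):
    """Iterate x^m = g(xc^{m-1}) with xc^m = 0.5 * x^m + 0.5 * xc^{m-1}, as in (12)."""
    xc = x0
    history = []
    for _ in range(steps):
        x = g(xc)
        xc = 0.5 * x + 0.5 * xc
        history.append(xc)
    return history


def g(x):
    """Toy map with fixed point 0.75; the plain iteration oscillates around it."""
    return 1.5 - x


plain = plain_iteration(g, 0.0, 6)        # alternates between 1.5 and 0.0
averaged = averaged_iteration(g, 0.0, 6)  # settles at the fixed point 0.75
```

For this linear toy map the averaged sequence reaches the fixed point after one step; for the nonlinear discrete problem one can only expect the oscillation to be damped, not removed in a single sweep.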
It is clear that $F_c^{m-1}$ depends on all previously computed values of $F$. Since it is much less subject to oscillation than $F^{m-1}$, we use it to replace $F^{m-1}$ in equation (10). The resulting method yields good results for all physically relevant values of the Prandtl number.

3 CONVERGENCE OF THE METHOD
We use the above method to compute approximations for values of $N$ in the range 128 to 32768, and work in quadruple precision in order to obtain significant error bounds. We study the convergence of the resulting sequence of solutions, and their first and second derivatives, using the experimental error analysis technique described in [1]. All of the computations in this section are carried out for Pr = 0.72, which is the value of the Prandtl number for air. Other experiments within a physically relevant range of Pr yield similar results. As in [1], for any mesh function $G^N$ on the mesh $I^N$, we define the maximum pointwise error

$$E^N(G) = \left\|G^N - \bar G^{N_{\max}}\right\|_{I^N},$$

the two-mesh difference

$$D^N(G) = \left\|\bar G^N - \bar G^{2N}\right\|_{I^N \cup I^{2N}}$$

and the order of convergence

$$p^N(G) = \log_2\frac{D^N(G)}{D^{2N}(G)},$$
where $\bar G$ is the interpolated function corresponding to the mesh function $G^N$ and $N_{\max} = 32768$ is the largest value of $N$ used in the computations. The computed values of the error parameters $C$ and $p$ are defined in an analogous way to those in [1]. From the numerical results in the first and last two rows of Tables 1-3 we see that, in practice, the method is robust and layer-resolving in the sense that it is Gr-uniform and that the Gr-uniform order of convergence of the numerical approximations of $f$ and $t$, and their derivatives, is better than 0.78 for all $N \ge 512$.

| N | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 |
|---|---|---|---|---|---|---|---|---|
| $E^N(F)$ | 0.020684 | 0.010497 | 0.005228 | 0.002543 | 0.001201 | 0.000547 | 0.000236 | 0.000092 |
| $E^N(T)$ | 0.005699 | 0.002864 | 0.001672 | 0.000938 | 0.000510 | 0.000268 | 0.000134 | 0.000061 |
| $D^N(F)$ | 0.014051 | 0.006856 | 0.003334 | 0.001606 | 0.000761 | 0.000353 | 0.000159 | 0.000069 |
| $D^N(T)$ | 0.003695 | 0.001468 | 0.000738 | 0.000429 | 0.000242 | 0.000134 | 0.000073 | 0.000040 |
| $p^N(F)$ | 1.035344 | 1.039840 | 1.053815 | 1.077516 | 1.109775 | 1.150155 | 1.200088 | 1.264512 |
| $p^N(T)$ | 1.331509 | 0.991703 | 0.781964 | 0.825135 | 0.853144 | 0.872730 | 0.886969 | 0.897438 |

Table 1: Computed maximum pointwise error $E^N$, computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $F$ and $T$ in quadruple precision arithmetic.

| N | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 |
|---|---|---|---|---|---|---|---|---|
| $E^N(D^+F)$ | 0.008547 | 0.003847 | 0.001754 | 0.000800 | 0.000363 | 0.000161 | 0.000070 | 0.000057 |
| $E^N(D^+T)$ | 0.006304 | 0.003760 | 0.002147 | 0.001190 | 0.000643 | 0.000337 | 0.000176 | 0.000087 |
| $D^N(D^+F)$ | 0.020668 | 0.012189 | 0.007557 | 0.004147 | 0.002438 | 0.001319 | 0.000735 | 0.000395 |
| $D^N(D^+T)$ | 0.007780 | 0.004259 | 0.002350 | 0.001287 | 0.000701 | 0.000380 | 0.000205 | 0.000110 |
| $p^N(D^+F)$ | 0.761882 | 0.689707 | 0.865701 | 0.766244 | 0.885989 | 0.843121 | 0.896398 | 0.886538 |
| $p^N(D^+T)$ | 0.869365 | 0.857542 | 0.869375 | 0.875633 | 0.883201 | 0.890366 | 0.896959 | 0.902954 |

Table 2: Computed maximum pointwise error $E^N$, computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $D^+F$ and $D^+T$ in quadruple precision arithmetic.

| N | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 |
|---|---|---|---|---|---|---|---|---|
| $E^N(\delta^2F)$ | 0.017785 | 0.010544 | 0.006069 | 0.003395 | 0.001850 | 0.000977 | 0.000491 | 0.000224 |
| $E^N(\delta^2T)$ | 0.007241 | 0.004030 | 0.002212 | 0.001197 | 0.000624 | 0.000325 | 0.000163 | 0.000066 |
| $D^N(\delta^2F)$ | 0.042189 | 0.023946 | 0.014474 | 0.008000 | 0.004551 | 0.002483 | 0.001362 | 0.000735 |
| $D^N(\delta^2T)$ | 0.011175 | 0.006820 | 0.003682 | 0.002133 | 0.001136 | 0.000641 | 0.000338 | 0.000187 |
| $p^N(\delta^2F)$ | 0.817064 | 0.726389 | 0.855304 | 0.813816 | 0.873976 | 0.866127 | 0.891145 | 0.893042 |
| $p^N(\delta^2T)$ | 0.712518 | 0.889433 | 0.787396 | 0.908475 | 0.825205 | 0.924485 | 0.851711 | 0.934954 |

Table 3: Computed maximum pointwise error $E^N$, computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $\delta^2F$ and $\delta^2T$ in quadruple precision arithmetic.

3.1 COMPUTED ERROR BOUNDS FOR BLASIUS' FUNCTIONS
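The results in [1] for a simpler problem suggest that we can expect error bounds of the form $CN^{-p}$, where $C$ and $p$ are independent of Gr. Considering values of $N \ge 512$, the experimental error analysis described in [1, Chap. 8] yields computed values for $p$ and $C$. Applying this technique to the present problem we obtain the following a posteriori error bounds for the functions $\bar F$, $\bar T$ and their derivatives, for all $N \ge 512$:

$$\begin{aligned}
\max_{\eta\in[0,+\infty)}\left|(\bar F - f)(\eta)\right| &\le 4.607\,N^{-1.054} \\
\max_{\eta\in[0,+\infty)}\left|(\bar T - t)(\eta)\right| &\le 2.320\,N^{-0.782} \\
\max_{\eta\in[0,+\infty)}\left|(\overline{D^+F} - f')(\eta)\right| &\le 2.182\,N^{-0.766} \\
\max_{\eta\in[0,+\infty)}\left|(\overline{D^+T} - t')(\eta)\right| &\le 1.175\,N^{-0.869} \\
\max_{\eta\in[0,+\infty)}\left|(\overline{\delta^2F} - f'')(\eta)\right| &\le 5.386\,N^{-0.814} \\
\max_{\eta\in[0,+\infty)}\left|(\overline{\delta^2T} - t'')(\eta)\right| &\le 1.188\,N^{-0.787}
\end{aligned} \qquad (13)$$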
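The exponent in the bound for $T$ can be cross-checked against the tabulated two-mesh differences. The sketch below assumes that the order of convergence is $p^N = \log_2(D^N/D^{2N})$ and that the uniform order is the minimum of $p^N$ over $N \ge 512$; the values of $D^N(T)$ are copied from Table 1 as printed (six decimals), so the result agrees with the published exponent only to about three digits. The function names are hypothetical.

```python
import math

# Two-mesh differences D^N(T) as printed in Table 1 (rounded to six decimals).
D_T = {512: 0.000738, 1024: 0.000429, 2048: 0.000242,
       4096: 0.000134, 8192: 0.000073, 16384: 0.000040}


def two_mesh_orders(d):
    """Return {N: log2(D^N / D^{2N})} for every N whose double is also tabulated."""
    return {n: math.log2(d[n] / d[2 * n]) for n in sorted(d) if 2 * n in d}


def error_bound(c, p, n):
    """Evaluate a bound of the form C * N^(-p)."""
    return c * n ** (-p)


orders = two_mesh_orders(D_T)
p_uniform = min(orders.values())  # smallest computed order over the tabulated N >= 512
```

The minimum is attained at $N = 512$, which is where the exponent 0.782 in (13) comes from; `error_bound(2.320, 0.782, N)` then evaluates the corresponding bound for a given $N$.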
These computed error bounds show experimentally that our numerical method is robust and layer-resolving for $N$ in the range 512 to 32768.

3.2 ERROR IN THE PHYSICAL QUANTITIES
We return now to the original non-dimensionalised problem. We want to compute the error for the velocities and temperature on a bounded sub-domain $\bar\Omega = [0.1, 1] \times [0, 1]$ of the non-dimensionalised semi-infinite domain. The choice of the interval [0.1, 1] for the variable $x$ is required because of the singularity in the velocity components $u$, $v$ and their derivatives at the point $x = 0$. We use the relations between these quantities and Blasius' functions described in section 1.2. We see that the velocity components $u$ and $v$ respectively behave like $Gr^{1/2}$ and $Gr^{1/4}$. Therefore, we need to scale the components by these factors in order to obtain quantities that are bounded uniformly with respect to the Grashof number. Graphs of the resulting approximate scaled velocity components and temperature on $[0.1, 1] \times [0, 1]$ are shown in Figures 3-5 for $Gr = 10^5$ and $N = 32768$. We see that a boundary layer in each physical variable arises on the boundary of the plate. The corresponding scaled errors in the physical quantities are

$$Gr^{-1/2}\max_{\bar\Omega}\left|(U - u)(x,y)\right| = \max_{\substack{(x,y)\in\bar\Omega \\ \eta = \eta(x,y)}}\left|2x^{1/2}\,(\overline{D^+F} - f')(\eta)\right| \le 2\max_{\substack{(x,y)\in\bar\Omega \\ \eta = \eta(x,y)}}\left|(\overline{D^+F} - f')(\eta)\right| \qquad (14)$$
Figure 3: The approximate scaled horizontal velocity for the free convection problem on $[0.1, 1] \times [0, 1]$ with $Gr = 10^5$ generated with $N = 32768$.

Figure 4: The approximate scaled vertical velocity for the free convection problem on $[0.1, 1] \times [0, 1]$ with $Gr = 10^5$ generated with $N = 32768$.

Figure 5: The approximate temperature for the free convection problem on $[0.1, 1] \times [0, 1]$ with $Gr = 10^5$ generated with $N = 32768$.
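$$Gr^{-1/4}\max_{\bar\Omega}\left|(V - v)(x,y)\right| = \max_{\substack{(x,y)\in\bar\Omega \\ \eta = \eta(x,y)}}\left|(4x)^{-1/4}\left(\eta\,\overline{D^+F}(\eta) - 3\bar F(\eta) - (\eta f'(\eta) - 3f(\eta))\right)\right| \le 1.26\max_{\substack{(x,y)\in\bar\Omega \\ \eta = \eta(x,y)}}\left|\eta\,\overline{D^+F}(\eta) - 3\bar F(\eta) - (\eta f'(\eta) - 3f(\eta))\right| \qquad (15)$$

$$\max_{\bar\Omega}\left|(\Theta - \theta)(x,y)\right| = \max_{\substack{(x,y)\in\bar\Omega \\ \eta = \eta(x,y)}}\left|(\bar T - t)(\eta)\right| \qquad (16)$$

We see that we need to estimate the additional quantity $\eta D^+F(\eta) - 3F(\eta)$. The required numerical results are given in Table 4.

| N | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 |
|---|---|---|---|---|---|---|---|---|
| $D^N$ | 0.042154 | 0.020567 | 0.010003 | 0.004819 | 0.002283 | 0.001058 | 0.000477 | 0.000207 |
| $p^N$ | 1.035344 | 1.039840 | 1.053815 | 1.077516 | 1.109775 | 1.150155 | 1.200088 | 1.247754 |

Table 4: Computed two-mesh difference $D^N$ and computed order of convergence $p^N$ for $\eta D^+F - 3F$ in quadruple precision arithmetic.

With these results, and those from the previous section, we find the following computed scaled error bounds for the physical quantities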
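$$\begin{aligned}
Gr^{-1/2}\,\|U - u\|_{\bar\Omega} &\le 4.37\,N^{-0.766} \\
Gr^{-1/4}\,\|V - v\|_{\bar\Omega} &\le 17.4\,N^{-1.053} \\
\|\Theta - \theta\|_{\bar\Omega} &\le 2.32\,N^{-0.782}
\end{aligned} \qquad (17)$$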
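Two of the constants appearing in (14), (15) and (17) can be re-derived from the similarity relations. The sketch below assumes, from relations (6) and (7), that the scaled errors carry the factors $2x^{1/2}$ and $(4x)^{-1/4}$ respectively, and that $x$ ranges over $[0.1, 1]$; the function names are hypothetical.

```python
def scaled_u_constant(c_dplus_f, x_max=1.0):
    """Constant in the bound for Gr^(-1/2)|U - u|: sup of 2*sqrt(x) times the constant for D+F."""
    return 2.0 * x_max ** 0.5 * c_dplus_f


def scaled_v_factor(x_min=0.1):
    """Sup over x >= x_min of (4x)^(-1/4); the factor decreases in x, so it is attained at x_min."""
    return (4.0 * x_min) ** -0.25


u_constant = scaled_u_constant(2.182)  # 4.364, quoted in (17) as 4.37
v_factor = scaled_v_factor()           # about 1.257, quoted in (15) as 1.26
```

Both values round up to the constants quoted in the text, which is the safe direction for an upper bound. The constant 17.4 in (17) additionally depends on the fitted constant for $\eta D^+F - 3F$, which is not listed, so it is not reproduced here.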
These computed error bounds show that the boundary layers have been successfully resolved. We remark that we can use the same approach to generate similar approximations to the derivatives of the physical variables.

4 CONCLUSION
For free convection on a semi-infinite vertical flat plate, Grashof uniform numerical approximations to the velocity components and temperature have been generated in a bounded domain, which does not include the leading edge of the plate, for arbitrary values of Gr, using the Blasius formulation. Analysis of the numerical approximations shows that this numerical method is robust and layer-resolving. It follows that numerical approximations of controllable accuracy, with errors independent of the value of the Grashof number, can be computed with this method.
ACKNOWLEDGMENTS

This work has been supported in part by the Russian Foundation for Basic Research under grant No. 98-01-00362 and by the Enterprise Ireland grant SC-98-612.

REFERENCES

[1] Paul Farrell, Alan Hegarty, John J. H. Miller, Eugene O'Riordan, Grigorii I. Shishkin. Robust Computational Techniques for Boundary Layers. Series in Applied Mathematics and Scientific Computation, CRC Press, 2000.
[2] David F. Rogers. Laminar flow analysis. Cambridge University Press, 1992.
[3] Hermann Schlichting. Boundary-layer theory. McGraw-Hill, 7th ed., 1979.
Jocelyn Etienne John J.H. Miller Department of Mathematics University of Dublin Trinity College Dublin 2, Ireland
Grigorii I. Shishkin Institute of Mathematics and Mechanics Russian Academy of Sciences, Ural Branch Ekaterinburg 620219 Russia
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 199-209)
Growth value-distribution and zero-free regions of entire functions and sections

Faruk F. Abi-Khuzam
Department of Mathematics, American University of Beirut, Beirut, Lebanon
October 27, 2000

Abstract

The growth of a second derivative of the logarithm of the maximum modulus of an entire function provides information about the location of zeros of the function and its sections. We present a survey of work on this topic along with some recent sharp results and open questions.

1991 Mathematics Subject Classification: Primary 30D20, 30D15; Secondary 30D35.
Keywords and Phrases: Entire function, section, zeros, gaps.
1 Introduction

I am happy to be here before this gathering of distinguished workers in the field, and I feel deeply honoured to be given this great opportunity to share with you some of the mathematics we love. The study of zero sets of functions or roots of equations is a great story in mathematics, starting very far back in time and continuing on into the present. It would not be too far from the truth to say that almost every mathematician has worked at one time or another on a problem related to some zero set. When we scan the literature for this topic we find ourselves in good company. The names of Weierstrass, Riemann, Hadamard, Polya, Szego, Hayman, Gol'dberg and Ostrowski stand out but there are numerous others. When we think about zero sets of functions, we naturally recall the fundamental theorem of Algebra, and Picard's theorem. It would be inappropriate here not to mention something about these results and their generalization in what is known as value-distribution theory. But I will content myself with some brief remarks since this talk will emphasize what might be called angular distribution of zeros. Picard's (little) theorem distinguishes values according as they are lacunary (omitted by the function) or non-lacunary (assumed by the function). Value-distribution theory quantifies lacunarity through the notion of deficiency and
climaxes in the deficiency relation of Nevanlinna, a far-reaching generalization of Picard's Theorem. The method of negative curvature provides connection with geometry and extensions in various directions. Let us give a few details. If $f$ is an entire function, we count the number of $a$-points of $f$ (i.e. solutions of the equation $f(z) = a$) in the disk $|z| < r$ and denote this number by $n(r, a)$. This is the counting function of the $a$-points. The related function, defined in most cases by $N(r, a) = \int_0^r t^{-1}n(t, a)\,dt$, is called the smoothed counting function, and one aspect of value-distribution theory is to compare the growth of $N$ or $n$ with some average function called the characteristic and defined, for entire functions, say when $f(0) = 1$, by

$$T(r) = \frac{1}{2\pi}\int_0^{2\pi}N(r, e^{i\lambda})\,d\lambda.$$

The deficiency of a value $a$ is defined by

$$\delta(a) = 1 - \limsup_{r\to\infty}\frac{N(r, a)}{T(r)}$$

and the famous deficiency relation is the inequality

$$\sum_{a\in\mathbb{C}\cup\{\infty\}}\delta(a) \le 2.$$

For an entire function $\delta(\infty)$ is always 1. So if the function omits a finite value $a$ then $n(r, a) = N(r, a) = 0$ and this lacunary value will have $\delta(a) = 1$. The defect relation then implies that, unless $f$ is a constant, it can have no other lacunary value. This is Picard's theorem. A large part of value distribution theory is concerned with the deficiency relation and in particular, the number of $a$-points of the function $f$ in a disk. But along with the problem of counting $a$-points or zeros, there is the problem of determining the angular distribution of zeros of a given entire function. Many outstanding questions in analysis involve the problem of locating the zeros of some entire, or meromorphic, function. The most famous of all is, of course, the Riemann conjecture which states that all the non-trivial zeros of the zeta function $\zeta(\sigma + it)$ lie on the line $\sigma = \frac{1}{2}$. If $\xi$ is the function defined by

$$\xi(t) = \tfrac{1}{2}s(s - 1)\,\zeta(s)\,\Gamma(\tfrac{1}{2}s)\,\pi^{-s/2},$$

where $s = \frac{1}{2} + it$, then $\xi$ is an entire function with non-negative Maclaurin coefficients

$$\sum_{n=0}^{\infty}\ldots$$

and the Riemann conjecture is equivalent to the statement that all the zeros of $\xi$ are real.
Another outstanding problem is the width conjecture of Saff-Varga [ESV], which is a statement about the possible dispersion in the plane of the zeros of sections of a given power series. It has been verified in certain specific cases, notably in work of Edrei, Saff and Varga [ESV] but, as far as I know, continues to be unresolved in the general case. What I want to present here is a program of study, perhaps a bit ambitious, which has as its aim the answer to two questions that I will present shortly. The starting point is an entire function $f : \mathbb{C} \to \mathbb{C}$ and its power series

$$f(z) = \sum_{k=0}^{\infty}a_kz^k$$

and its sections

$$s_n(z) = \sum_{k=0}^{n}a_kz^k.$$

We now ask two questions:
(*) Where are the zeros of $f$ located?
(**) Where are the zeros of $s_n$ located?

I aim to show here that a study of the growth of a certain second derivative associated to $f$ promises to supply very precise information about the location of the zeros of $f$. This second derivative is defined by

$$b(r) = \frac{d^2\log M(r)}{d(\log r)^2} \qquad (2)$$

where $M(r) = \sup_{|z|=r}|f(z)|$, the maximum modulus of $f$. It will be seen that the best results available through the study of $b(r)$, so far, concern entire functions of zero order. But there are also general results, such as those in Theorems 3, 8 below, covering functions of all orders. Hopefully they will serve as evidence favoring further studies on $b(r)$.
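To get a feeling for the quantity $b(r)$ in (2), it can be estimated numerically by differencing $\log M$ with respect to $\log r$. The sketch below is only an illustration: the function name `b_of_r` and the step size are arbitrary choices, and $\log M(r)$ is supplied in closed form for two simple cases (for $\exp z$ one has $M(r) = e^r$, and for the monomial $z^3$ one has $M(r) = r^3$).

```python
import math


def b_of_r(log_m, r, h=1e-3):
    """Central-difference estimate of d^2 log M(r) / d(log r)^2.

    log_m is a callable returning log M(r); h is the step in log r.
    """
    t = math.log(r)

    def g(s):
        return log_m(math.exp(s))

    return (g(t + h) - 2.0 * g(t) + g(t - h)) / (h * h)


b_exp = b_of_r(lambda r: r, 2.0)                        # exp z: b(r) = r
b_monomial = b_of_r(lambda r: 3.0 * math.log(r), 2.0)   # z^3: log M is linear in log r, so b = 0
```

For $\exp z$ the estimate returns approximately $r$, so $b$ is unbounded, whereas for a monomial $\log M$ is linear in $\log r$ and $b$ vanishes; $b(r)$ thus measures how far $\log M(r)$ is from being linear in $\log r$.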
2 Preliminaries

The answer to the first question in the introduction goes back to Weierstrass. Take any sequence of complex numbers tending to $\infty$ but otherwise arbitrary; then you can find an entire function whose zero set is precisely this sequence. In other words the zeros of an entire function can be located anywhere in the complex plane. So we have to specialize the class of functions under consideration if we want to obtain something interesting. In this talk I shall concentrate mostly on
the class of entire functions with non-negative coefficients. This is a nice class of functions that contains many of the usual functions that we meet every day in our work such as the exponential function, the Mittag-Leffler functions and $\xi$. If the coefficients are positive then the zeros of $f$ cannot lie on the positive real axis and, by continuity, there will be a neighborhood of the positive real axis where $f$ would have no zeros. What does this neighborhood look like and what is the corresponding set for the sections? By way of experimenting one may consider first the exponential function. Of course it has no zeros but plotting the zeros of its sections one obtains some very nice pictures as in [ESV]. The numerical computations done on the sections of $\exp z$, which we owe to Iverson [Iv], suggested the existence of a parabolic region free of all zeros of all sections of $\exp z$. This was verified (following a result [NR] of Newman and Rivlin) by Saff and Varga who proved the following [SV].

Theorem 1 If $a_j > 0$ for $j = 0, 1, 2, \ldots$ and $b_k = a_{k-1}/a_k$ and

$$\alpha = \inf\{(b_k - b_{k-1}) : k = 1, 2, \ldots\} > 0,$$

then the sections $s_n$ of $f$ have no zeros in the parabolic region $P_\alpha$ defined by

$$P_\alpha = \{z = x + iy \in \mathbb{C} : y^2 \le 4\alpha(x + \alpha),\ x > -\alpha\}. \qquad (3)$$
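The hypothesis and the region (3) of Theorem 1 are easy to evaluate for a concrete series. The sketch below uses a finite truncation of $\exp z$, so the infimum is only taken over the coefficients supplied; it also assumes the convention $b_0 = 0$, which the statement above does not spell out, and the function names are hypothetical.

```python
import cmath


def parabola_alpha(coeffs):
    """alpha = min over k of (b_k - b_{k-1}) with b_k = a_{k-1}/a_k and b_0 = 0 (assumed)."""
    b = [0.0] + [coeffs[k - 1] / coeffs[k] for k in range(1, len(coeffs))]
    return min(b[k] - b[k - 1] for k in range(1, len(b)))


def in_parabolic_region(z, alpha):
    """Membership test for P_alpha = {x + iy : y^2 <= 4 alpha (x + alpha), x > -alpha}."""
    return z.real > -alpha and z.imag ** 2 <= 4.0 * alpha * (z.real + alpha)


# Coefficients a_k = 1/k! of exp z up to degree 10.
coeffs = [1.0]
for k in range(1, 11):
    coeffs.append(coeffs[-1] / k)

alpha = parabola_alpha(coeffs)  # b_k = k for exp z, so every difference is 1

# Zeros of the section s_2(z) = 1 + z + z^2/2, from the quadratic formula.
root = cmath.sqrt(1 - 2)
s2_zeros = [-1 + root, -1 - root]
```

The two zeros of $s_2$ are $-1 \pm i$; they sit on the line $x = -1$, just outside $P_1$, so this small case is consistent with the theorem.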
For example, $\exp z$ has $b_k = k$ and $\alpha = 1$ so that the parabolic region

$$P_1 = \{z = x + iy \in \mathbb{C} : y^2 \le 4(x + 1),\ x > -1\}$$

is free of zeros of all sections of $\exp z$. Theorem 1 is reminiscent of another "Parabola" theorem occurring in the theory of continued fractions. But I am not aware of any connection between the two. The Saff-Varga result is rather elegant but suffers from a certain limitation of its applicability. For it is a fact, easy to prove [AK1], that the condition $\alpha > 0$ in the Saff-Varga result implies that $f$ is of exponential growth and type $\le 1/\alpha$. It is a result limited to functions of growth order at most one and finite type. At about the same time as the Iverson experiments, Edrei [E] considered a sort of converse question and obtained:

Theorem 2 If, for every positive integer $n$, the zeros of $s_n$ lie in a closed half-plane containing the origin on its boundary, then

$$\limsup_{n\to\infty}|a_n|^{1/n^2} < 1. \qquad (4)$$
Edrei's result was extended by Ganelius [G] in particular replacing the halfplane by a sector. The Saff-Varga result goes in a direction opposite that of Edrei. But there is a feature common to these two results: both connect the growth of the coefficients of / to a region free of zeros of sections of / . If we suspect that there is an underlying principle behind this connection, we have to express the growth of
coefficients in a possibly different form. Indeed, the zero-free region in the Edrei-Ganelius result is a sector, that in the Saff-Varga result is a parabola, and we need to connect the geometry of these figures with some index related to the growth of the coefficients or of the function. Let us then turn to the idea of growth of an entire function.
3 Growth of entire functions
One of the most important ideas related to the study of entire functions is the notion of growth. An important index of growth is the order of $f$, which is defined by
$$\rho = \limsup_{r\to\infty} \frac{\log\log M(r)}{\log r},$$
where $M(r) = \sup_{|z|=r} |f(z)|$ is the maximum modulus of $f$. It can be shown that the order of $f$ can also be expressed in terms of the coefficients by
$$\frac{1}{\rho} = \liminf_{n\to\infty} \frac{\log(1/|a_n|)}{n\log n}.$$
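As a quick check of the coefficient formula, here is a worked computation (a sketch, using only Stirling's formula) for the Mittag-Leffler-type series $\sum_{n\ge 0} z^n/\Gamma(1 + n/\lambda)$ with $\lambda > 0$. This normalization is assumed here for illustration; $\lambda = 1$ gives $\exp z$.

$$
\begin{aligned}
\log\frac{1}{a_n} &= \log\Gamma\!\left(1 + \frac{n}{\lambda}\right) = \frac{n}{\lambda}\log\frac{n}{\lambda} - \frac{n}{\lambda} + O(\log n),\\
\frac{1}{\rho} &= \liminf_{n\to\infty}\frac{\log(1/a_n)}{n\log n} = \frac{1}{\lambda}, \qquad\text{so}\qquad \rho = \lambda.
\end{aligned}
$$

In particular $\exp z$ has order 1, and every positive order is attained within this family.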
Since the order $\rho$ holds information about the growth of $f$ and its coefficients in an explicit way, we might suspect that it is the index sought. However, one obvious implication of the second formula is that a change in the arguments of the coefficients of $f$ does not affect $\rho$. Since such a change is expected to affect the angular location of the zeros, we cannot hope to get a description of the geometry of the zeros through some function of the order only. At any rate, our remark after Theorem 1 implies that the Saff-Varga functions are of order at most 1. Also, (4) implies that the Edrei-Ganelius functions are of order zero, by the second formula for the order. So in the case of positive coefficients, Theorems 1 and 2 suggest the following: when the order of growth of $f$ is at most one we expect a parabolic region free of zeros of sections, and when the order of growth of $f$ is zero we expect a sectorial region free of zeros of sections. But what zero-free region do we expect if $\rho = \tfrac12$ or $2$? So far we have no idea, because we have yet to figure out a way of connecting growth with angular distribution of zeros.
4 Hadamard-Hayman convexity
It turns out that in order to obtain the connection, unifying the previous results and extending them to cover functions of any order, including infinite order, we have to adopt a new way of measuring the growth of $f$. What is needed is some index or functional which is sensitive to the angular distribution of the zeros of $f$. Now we are all familiar with the three circles theorem of Hadamard, which states that $\log M(r)$ is a convex function of $\log r$. This is a very important result that finds applications in various areas such as function theory, harmonic analysis, and partial differential equations. Let us put
$$b(r) = r\frac{d}{dr}\left(r\frac{M'(r)}{M(r)}\right) = \frac{d^2\log M(r)}{d(\log r)^2}.$$
It is easy to show that $b(r) \le Kr^{\lambda}$ implies that $f$ is of order at most $\lambda$, but $b$ can have growth larger than that of $f$. The study of the growth of $b(r)$ was initiated in 1968 by Hayman [H], independently of the question of angular distribution of zeros. Hayman showed that the classical estimate $b(r) \ge 0$ obtained from the three circles theorem could be improved under certain conditions. He showed that there exists a positive absolute constant $A_0 > 0.180$ such that
$$\limsup_{r\to\infty} b(r) \ge A_0$$
for every transcendental entire function $f$. The exact value of $A_0$ is as yet undetermined. Hayman's result was followed by work of Kjelleberg [Kj] and others. In particular Boichuck and Gol'dberg [BG] showed that $A_0 = 0.25$ if we restrict to the class of entire functions with non-negative coefficients.

The first explicit connection was obtained in [AK2], where a very simple inequality was found to the effect that, under positivity of coefficients,
$$f^2(r) - |f(re^{i\theta})|^2 \le 4\sin^2\frac{\theta}{2}\, f^2(r)\, b(r). \qquad (5)$$
This inequality was used to obtain an alternative proof, and a refinement, of the Boichuck-Gol'dberg result. But once you have this inequality you see at once the connection between the location of the zeros of $f$ (and also of its $a$-values) and the growth of $f$. For example, if $f$ vanishes at $z = re^{i\theta}$ then (5) tells us that $r$ and $\theta$ must be governed by the inequality
$$1 \le 4\sin^2\frac{\theta}{2}\, b(r). \qquad (6)$$
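Inequality (6) is easy to test numerically. For a polynomial (or truncated series) $f$ with non-negative coefficients the maximum modulus on $|z| = r$ is $f(r)$, so, assuming the reading $b(r) = d^2\log M(r)/d(\log r)^2$, one gets $b(r) = rf'/f + r^2f''/f - (rf'/f)^2$. The sketch below (NumPy; the function names `b` and `check_zeros` are illustrative assumptions, not taken from [AK2]) evaluates this expression and checks (6) at every zero of a sample polynomial.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def b(coeffs, r):
    """b(r) = d^2 log f(r) / d(log r)^2 for f(z) = sum coeffs[k] * z**k, coeffs >= 0."""
    c = np.asarray(coeffs, dtype=float)
    f = P.polyval(r, c)
    f1 = P.polyval(r, P.polyder(c))      # f'(r)
    f2 = P.polyval(r, P.polyder(c, 2))   # f''(r)
    u = r * f1 / f                       # u = r f'(r) / f(r)
    return u + r**2 * f2 / f - u**2      # equals r * du/dr

def check_zeros(coeffs, tol=1e-9):
    """True if every zero r*e^{i*theta} satisfies 1 <= 4 sin^2(theta/2) * b(r)."""
    zeros = P.polyroots(np.asarray(coeffs, dtype=float))
    r, theta = np.abs(zeros), np.angle(zeros)
    return bool(np.all(4 * np.sin(theta / 2) ** 2 * b(coeffs, r) >= 1 - tol))

print(b([1, 1], 1.0))            # f = 1 + z at r = 1, a simple zero at -1
print(b([1, 2, 1], 1.0))         # f = (1 + z)^2 at r = 1, a double zero at -1
print(check_zeros([1, 1, 0.5]))  # second section of exp z, zeros at -1 +/- i
```

For $f = 1 + z$ the zero at $-1$ gives equality in (6), which is one way to see that the constant 0.25 cannot be improved.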
An immediate consequence of (6) is that $\limsup_{r\to\infty} b(r) \ge 0.25$ for every transcendental entire $f$ with non-negative Maclaurin coefficients. This is the result of Boichuck-Gol'dberg and it is best possible. If now we accept to measure the growth of $f$ by $b(r)$, rather than $M(r)$, then (6) gives us a direct and simple relation between the growth of $f$ and the angular distribution of its zeros. In fact it gives much more. For example, if we return to the exponential function, where $b(r) = r$, we see that, if (6) were applicable to all sections of $e^z$, all zeros $z = re^{i\theta}$ of all sections of $e^z$ would have to satisfy the inequality
$$1 \le 4r\sin^2\frac{\theta}{2} = 2r(1 - \cos\theta), \quad\text{or}\quad \frac{1}{2(1 - \cos\theta)} \le r, \qquad (7)$$
so that the zero-free set $r < 1/(2(1 - \cos\theta))$ is manifestly a parabolic region. This is the Saff-Varga result. Important cases where (6), with $b(r; f)$, does apply to all sections $s_n$ have been obtained in [AK1]:
Theorem 3 If $a_j > 0$ and $G$ is the region defined by
$$G = \left\{(r, \theta) : b(r) < \frac{1}{2(1 - \cos\theta)},\ r > 0,\ -\pi < \theta \le \pi\right\} \qquad (8)$$
then $G$ is free of all zeros of $f$. If, in addition, $b_k \ge b_{k-1}$, where $b_k = a_{k-1}/a_k$, then $G$ is free of zeros of all sections of $f$.
Among functions satisfying the conditions of Theorem 3 we note the Mittag-Leffler functions $E_{1/\lambda}(z) = \sum_{n=0}^{\infty} \frac{z^n}{\Gamma(1 + n/\lambda)}$ and, in particular, the exponential function [AK1]. Particular cases of this theorem give successively:

Example 4 If $\sup_{0<r<\infty} b(r) = 1/(4\beta)$ where $0 < \beta < \infty$, then $\beta \le 1$ and the sector $S = \{z = x + iy \in \mathbb{C} : |\arg z| < 2\sin^{-1}\sqrt{\beta}\}$ is free of all zeros of $f$, and also of all its sections under the condition $b_k \ge b_{k-1}$.

Example 5 If $b(r) \le Kr$ then the parabolic region
$$P_{1/4K} = \left\{z = x + iy \in \mathbb{C} : y^2 < \frac{1}{K}\left(x + \frac{1}{4K}\right)\right\}$$
is free of all zeros of $f$, and also of all its sections under the condition $b_k \ge b_{k-1}$.

Notice that the result in Theorem 3 includes the Saff-Varga result, though the region they obtain is larger. It also sheds light on the Edrei-Ganelius result and, more importantly, applies to functions of all orders including those of infinite order. The main tool in the proof of Theorem 3 is the following lemma, whose proof is rather difficult; it would be desirable to find alternative proofs and possible extensions.

Lemma 6 If the coefficients of $f$ are non-negative and satisfy $a_{n-1}a_{n+1} \le a_n^2$, then
$$b(r; s_n) \le b(r; f).$$
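The passage from the polar condition in Theorem 3 to a parabola can be written out in a few lines. Assuming $b(r) \le Kr$ as in Example 5, with $z = x + iy = re^{i\theta}$ so that $r\cos\theta = x$, the point $z$ is certainly not a zero when $Kr < 1/(2(1 - \cos\theta))$, and

$$
\begin{aligned}
2Kr(1 - \cos\theta) < 1 &\iff r < x + \frac{1}{2K}\\
&\iff x^2 + y^2 < x^2 + \frac{x}{K} + \frac{1}{4K^2}, \quad x > -\frac{1}{2K}\\
&\iff y^2 < \frac{1}{K}\left(x + \frac{1}{4K}\right).
\end{aligned}
$$

The side condition $x > -1/(2K)$ is automatic, since the last inequality already forces $x > -1/(4K)$. For $\exp z$, where $b(r) = r$ and $K = 1$, this is the parabola $y^2 < x + \tfrac14$.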
4.1 The extremal cases

Returning to Theorem 3, it is natural to try to study the extremal cases in it. We can look at the case where $f$ is entire with positive coefficients and $\limsup_{r\to\infty} b(r) = 0.25$, and ask if such a function has some special properties. From (6), it would appear that the zeros of $f$ will have to be real and negative! If true, this would be very valuable, for it would give precise information about the location of zeros from an asymptotic relation. But of course (6) does not suffice to give this, and a different approach is needed to handle the extremal case which, however, brings a pleasant surprise:

Theorem 7 If $f$ has positive coefficients and
$$\limsup_{r\to\infty} b(r) < \frac{1}{2} \qquad (9)$$
then all but a finite number of the zeros of $f$ are simple, real and negative. Furthermore the constant $\tfrac12$ is best possible.

The proof [AK] of this result is obtained by first locating certain radii $t_n$ where the growth of $f$ is comparable with that of its maximal term. Rouché's theorem then gives that $f$ has $n$ zeros in the disk $|z| < t_n$. Thus $f$ has exactly one zero in the annulus $t_n < |z| < t_{n+1}$. Since complex zeros of $f$, if they exist, must occur in conjugate pairs, this zero of $f$ must be real and simple. Of course it cannot be positive, so it must be real and negative.
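The size of the constants involved can be seen from a one-factor computation (a sketch for orientation, not the proof in [AK]). For $f(z) = (1 + z/a)^m$ with $a > 0$ and $m$ a positive integer, $M(r) = (1 + r/a)^m$. Writing $t = r/a$ and taking $b(r)$ to be the second derivative of $\log M(r)$ with respect to $\log r$,

$$
r\frac{d}{dr}\log M(r) = \frac{mt}{1 + t}, \qquad b(r) = r\frac{d}{dr}\left(\frac{mt}{1 + t}\right) = \frac{mt}{(1 + t)^2} \le \frac{m}{4},
$$

with equality exactly at $t = 1$, that is, at $r = a$. A simple zero thus contributes a peak of height $\tfrac14$ to $b(r)$ at its own modulus, and a double zero a peak of height $\tfrac12$.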
5 Extensions
The result in Theorem 7 is a consequence of the fact that the growth of $b(r)$ is sensitive to the presence of equimodular zeros. In the context that we were describing, the growth of $b(r)$ doubles in the presence of equimodular zeros. In particular, double zeros would double the size of $b(r)$. The tension between the size of $b(r)$ and the presence of double zeros, under positive coefficients, leads to the very precise result in Theorem 7.

The preceding discussion suggests that a study of the growth of $b(r)$ in the general case, that is, in the case where the coefficients are not necessarily real and positive, ought to be taken up. Of course in this very general setting two difficulties arise. The first is that it will no longer be possible to have simple explicit formulas for $M(r)$ and $b(r)$ to work with. The second is that there will not be a distinguished line, such as the positive real axis in the case of positive coefficients, with respect to which we could try to locate the zeros. Or so it may seem. For there is always the curve where $f$ takes on its maximum modulus, and one expects this curve to exert a repellent force on the zeros. A preliminary study of this situation has led to the following results [AK3].

Theorem 8 Let
fc,1/2
oo
2n—1
= n(!riM 2 ' 9=e- " 2/a ' a>0
n=l
1
207 and
^ ( a ) = or 4l°g f c ' 1 / 2 («)If f is any transcendental entire function then lim sup b(r) > maxAi(a) = A\.
(10)
r—>oo
Also, if $\limsup_{r\to\infty} b(r) = 2\max A_1(a) = 2A_1$, then all but a finite number of the multiple zeros of $f$ must satisfy $|\theta_n - \omega(r_n)| = \pi$, where $z_n = r_n e^{i\theta_n}$ is a multiple zero of $f$ and $\omega(r_n)$ is the argument of a point where the maximum modulus is achieved.

This result confirms the repelling property alluded to above and underlines the importance of obtaining the sharp constant in (10). It also serves to demonstrate, once more, how the growth of $b(r)$ becomes more pronounced in the presence of multiple zeros or equimodular zeros. Of course there may be no multiple or equimodular zeros. In this case one may consider ratios of successive zeros. The closer such ratios are to unity, the closer we are to the equimodular case. A preliminary study of the connection between the growth of $b(r)$ and ratios of successive zeros, in the special case where the zeros are on one ray, has led to the following precise result [AK]:

Theorem 9 Let $f$ be transcendental, with non-negative Maclaurin coefficients, bounded $b(r)$ and having all but a finite number of its zeros real and negative, and let $r_1 \le r_2 \le \cdots$ denote the moduli of its zeros. Then:

(a) If $\lim_{n\to\infty} r_{n+1}/r_n = q$ for some $q \in (1, \infty]$, then $\limsup_{r\to\infty} b(r) = \psi(q)$, where
$$\psi(q) = \frac{1}{4} + 2\sum_{k=1}^{\infty} \frac{q^k}{(1 + q^k)^2}. \qquad (11)$$

(b) If $\limsup_{r\to\infty} b(r) = \psi(q)$ for some $q \in (1, \infty]$, then $\limsup_{n\to\infty} r_{n+1}/r_n \ge q$. If equality holds in this last inequality then actually $\lim_{n\to\infty} r_{n+1}/r_n = q$.

Thus an entire function $f$ satisfying $\limsup_{r\to\infty} b(r; f) = 0.25$ and having non-negative coefficients has all but a finite number of its zeros simple, real and negative. In addition, we now have that its successive zeros satisfy the equality
$$\lim_{n\to\infty} \frac{r_{n+1}}{r_n} = \infty.$$
It should be possible to obtain an extension of Theorem 7 at least when the coefficients are non-negative and the zeros are confined to a small angle bisected by the negative real axis.
6 Gap Series
The previous discussion indicates that the growth of $b(r)$ is smallest when all the zeros lie on one ray through the origin and gets larger as they swing toward the opposite ray. One way to make the zeros swing towards, say, the positive real axis is to consider $g(z) = f(z^{\Lambda})$, where $\Lambda$ is an integer at least equal to 2 and $f$ is an entire function with only positive coefficients satisfying $\limsup_{r\to\infty} b(r; f) = 0.25$. In this case it is easy to see that
$$\limsup_{r\to\infty} b(r; g) = \Lambda^2 \limsup_{r\to\infty} b(r; f) = \frac{\Lambda^2}{4}. \qquad (12)$$
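Relation (12) comes from a change of variable. A short derivation, assuming $b(r)$ denotes the second derivative of $\log M(r)$ with respect to $\log r$ and writing $t = \log r$:

$$
\begin{aligned}
M(r; g) &= \max_{|z|=r}|f(z^{\Lambda})| = M(r^{\Lambda}; f),\\
b(r; g) &= \frac{d^2}{dt^2}\log M(e^{\Lambda t}; f) = \Lambda^2\, b(r^{\Lambda}; f).
\end{aligned}
$$

Taking the upper limit as $r \to \infty$ gives $\limsup b(r; g) = \Lambda^2 \limsup b(r; f)$, so the extremal value $\tfrac14$ for $f$ becomes $\Lambda^2/4$ for $g$.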
This suggests that the growth of $b(r)$ is also connected with the gap structure of the series. In their paper [BG], Boichuck and Gol'dberg have already noted this. Their sharp result may be rephrased for our present purposes as follows. Suppose that
$$f(z) = \sum_{k=0}^{\infty} a_k z^{\lambda_k} \qquad (13)$$
is an entire function where $a_k > 0$ and $\limsup_{k\to\infty}(\lambda_{k+1} - \lambda_k) = \Lambda$. Then
$$\limsup_{r\to\infty} b(r; f) \ge \frac{\Lambda^2}{4}.$$
In [GO] Gol'dberg and Ostrowski have asked about the connection between the gap structure of the Maclaurin series of $f$ and the location of its zeros. This connection may be studied via $b(r)$. Ongoing work has resulted in some progress but, except for an unpublished result of Ostrowski announced in [GO], no sharp results are known. The really ambitious question is to extend the above results to functions of positive order $< 0.5$. We have very little knowledge as to what happens in this case. Thank you.
References

[AK] Faruk F. Abi-Khuzam, The distribution and multiplicity of values of entire functions of small growth, Complex Variables, (2000).
[AK1] Faruk F. Abi-Khuzam, Zero-Free Regions for Entire Functions and Sections of Their Power Series, Complex Variables 29 (1996), 173-187.
[AK2] Faruk F. Abi-Khuzam, Maximum modulus convexity and the location of zeros of an entire function, Proc. Amer. Math. Soc. 106 (1989), 1063-1068.
[AK3] Faruk F. Abi-Khuzam, Hadamard Convexity And Multiplicity And Location Of Zeros, Trans. Amer. Math. Soc. 347 (1995), 3043-3051.
[GO] A. A. Gol'dberg and I. V. Ostrowski, Connection Between Arguments Of Zeros And Lacunarity, Linear & Complex Analysis Problem Book 3, Part II (LNM #1574).
[BG] V. S. Boichuck and A. A. Gol'dberg, The three-lines theorem, Mat. Zametki 15 (1974), 45-53 (Russian).
[H] W. K. Hayman, Note on Hadamard's convexity theorem, Entire Functions and Related Parts of Analysis, Proc. Sympos. Pure Math., vol. 11, Amer. Math. Soc., Providence, RI, 1968, pp. 210-213.
[Kj] B. Kjelleberg, The convexity theorem of Hadamard-Hayman, Proc. Sympos. Math., Stockholm (June 1973, Royal Institute of Technology), pp. 87-114.
[ESV] A. Edrei, E. B. Saff and R. S. Varga, Zeros of sections of power series, Springer-Verlag, Berlin, 1983.
[G] T. Ganelius, The zeros of partial sums of power series, Duke Math. J. 30 (1963), 533-540.
[E] A. Edrei, Power series having partial sums with zeros in a half-plane, Proc. Amer. Math. Soc. 9 (1958), 320-324.
[SV] E. B. Saff and R. S. Varga, Zero-free parabolic regions for sequences of polynomials, SIAM J. Math. Anal. 7 (1976), 344-357.
[NR] D. J. Newman and T. J. Rivlin, Correction: The zeros of the partial sums of the exponential function, J. Approx. Theory 16 (1976), 299-300.
[Iv] K. E. Iverson, The zeros of the partial sums of $e^z$, Math. Tables Aids Comp. 7 (1953), 163-168.
Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 211-221)

THREE LINEAR PRESERVER PROBLEMS
AHMED RAMZI SOUROUR
ABSTRACT. Linear preserver problems are questions about characterising linear maps on spaces of matrices or spaces of operators (or more generally on rings or algebras) that preserve certain properties. We present an exposition of three such problems on preserving invertibility or commutativity or rank one. 2000 Mathematics Subject Classification: 15A30, 16S50, 16W10, 46H05, 47B48 Keywords and Phrases: Invertibility, commutativity, Lie and Jordan isomorphisms, rank.
INTRODUCTION. What came to be called "linear preserver problems" are questions on characterising linear maps on spaces of matrices or spaces of operators (or more generally on algebras) that preserve certain properties. There has been a great deal of research in this area, especially on spaces of matrices, with results dating back to 1897 (see Theorem 0 below). We refer the reader to the expository articles [LT1, LT2]. There has also been some research activity for maps on Banach algebras, algebras of operators, abstract rings, etc. Possibly the earliest result on this subject is Frobenius' characterization, in 1897, of determinant preserving linear maps, which we state presently. The transpose of a matrix $x$ is denoted by $x^t$.

THEOREM 0 (Frobenius [Fr]) Let $\phi$ be a determinant preserving linear map on the space of all (real or complex) $n \times n$ matrices, i.e., $\det\phi(x) = \det(x)$ for every matrix $x$. Then there exist invertible matrices $b$ and $c$ with $\det(bc) = 1$, such that either $\phi(x) = bxc$ for every $x$ or $\phi(x) = bx^tc$ for every $x$.

Three of the most appealing linear preserver problems, in my view, are invertibility preservers, because of their connection with algebra isomorphisms and Jordan isomorphisms; commutativity preservers, because of their connection with Lie isomorphisms; and rank one preservers, because many other preserver problems are reduced to them. In this expository article, we will concentrate on these three problems. The discussion that follows will be far from encyclopedic, and the emphasis will reflect the author's experience.
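The easy direction of Theorem 0, that maps of the two stated forms do preserve the determinant, can be checked numerically. The following is a minimal sketch assuming NumPy; the matrices are random and the names `phi` and `phi_t` are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

b = rng.standard_normal((n, n))
c = rng.standard_normal((n, n))
d = np.linalg.det(b @ c)
if d < 0:
    c[:, [0, 1]] = c[:, [1, 0]]   # swapping two columns flips the sign of det(c)
c /= abs(d) ** (1.0 / n)          # rescale so that det(b c) = 1

def phi(x):
    return b @ x @ c              # first form in Theorem 0

def phi_t(x):
    return b @ x.T @ c            # second form in Theorem 0 (with the transpose)

x = rng.standard_normal((n, n))
print(np.linalg.det(b @ c))
print(np.linalg.det(x), np.linalg.det(phi(x)), np.linalg.det(phi_t(x)))
```

The substance of Frobenius' theorem is the converse, that no other linear determinant preservers exist, which a numerical check of this kind cannot establish.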
1. INVERTIBILITY PRESERVING MAPS AND JORDAN ISOMORPHISMS

Let $A$ and $B$ be algebras with identity. A linear map $\phi$ from $A$ to $B$ is called unital if $\phi(1) = 1$ and is called invertibility preserving if
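(c) the following

(i) rank $R \le 1$.
(ii) For every $T \in \mathcal{L}(X)$ and every distinct scalars $\alpha$ and $\beta$, $\operatorname{spec}(T + \alpha R) \cap \operatorname{spec}(T + \beta R) \subseteq \operatorname{spec}(T)$.
(iii) For every $T \in \mathcal{L}(X)$, there exists a compact subset $K_T$ of the complex plane such that $\operatorname{spec}(T + \alpha R) \cap \operatorname{spec}(T + \beta R) \subseteq K_T$.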
In a different direction, results of Gleason [G] and Kahane-Zelazko [KZ], refined by Zelazko [Z] show that every unital invertibility preserving linear map from a Banach algebra A into a semi-simple commutative Banach algebra B is multiplicative. (See also [RS]). Additional related results are in [Au], [CHNRR], and [Ru]. Articles [CHNRR] and [Ru] contain similar results on invertibility preserving positive linear maps on C*-algebras and von-Neumann algebras respectively.
The commutativity assumption in [G] and [KZ] is quite crucial. It would be a major advance if the conclusion holds for noncommutative algebras. More precisely, we pose this question.

Question. Let $A$ be a semi-simple Banach algebra and let $\phi$ be a unital bijective linear map on $A$. If
$$\begin{pmatrix} A & B \\ 0 & C \end{pmatrix},$$
where $A$, $B$, $C$ are $2 \times 2$ matrices, and let
$$\phi\begin{pmatrix} A & B \\ 0 & C \end{pmatrix} = \begin{pmatrix} A & B^t \\ 0 & C \end{pmatrix}.$$
It is straightforward to verify that $\phi$ is unital and preserves invertibility, but that it is not a Jordan homomorphism. Other examples may be constructed by taking $A$ to be a radical algebra with identity adjoined and $\phi$ a bijective unital linear mapping sending the radical to itself.
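The three claims about this block-transpose map can be checked on a concrete matrix. The sketch below assumes NumPy and reads the example as the map that transposes the upper-right $2 \times 2$ block of a $4 \times 4$ block upper triangular matrix; the helper names `block` and `phi` and the sample blocks are illustrative choices, not taken from the source.

```python
import numpy as np

def block(A, B, C):
    """Assemble the 4 x 4 block upper triangular matrix [[A, B], [0, C]]."""
    return np.block([[A, B], [np.zeros((2, 2)), C]])

def phi(X):
    """Transpose the upper-right 2 x 2 block and leave the rest unchanged."""
    Y = X.copy()
    Y[:2, 2:] = X[:2, 2:].T
    return Y

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
C = np.eye(2)
X = block(A, B, C)

print(np.array_equal(phi(np.eye(4)), np.eye(4)))            # unital
print(np.isclose(np.linalg.det(phi(X)), np.linalg.det(X)))  # determinant unchanged
print(np.allclose(phi(X @ X), phi(X) @ phi(X)))             # Jordan identity phi(X^2) = phi(X)^2
```

Since $\det\phi(X) = \det A \det C = \det X$ for every $X$ in the algebra, invertibility is preserved, while for this particular $X$ the upper-right blocks of $\phi(X^2)$ and $\phi(X)^2$ are $(AB + BC)^t$ and $AB^t + B^tC$, which differ.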
2. COMMUTATIVITY PRESERVING MAPS AND LIE ISOMORPHISMS

A linear map $\varphi$ from an algebra $A$ to an algebra $B$ is said to be commutativity preserving if
Evidently every Lie isomorphism
$$T \mapsto cA^{-1}TA + f(T)I \quad\text{or}\quad T \mapsto cA^{-1}T^*A + f(T)I,$$
where $c$ is a scalar, $T^*$ is either the adjoint or the transpose or some other anti-isomorphism (depending on the space considered), $A$ is an invertible operator (perhaps a unitary), and $f$ is a linear functional. Consequently, the results may be stated as showing that every such map is a linear combination of a Lie isomorphism and a map with central range.

Recently, algebras of triangular or block-triangular matrices and their infinite dimensional generalisations have received a lot of attention. In the remainder of this section, we will discuss results on commutativity preservers and on Lie isomorphisms for some such algebras. Let $T_n(F)$ denote the algebra of upper triangular $n \times n$ matrices over an arbitrary field $F$. The "transpose" of an $n \times n$ matrix $T$ with respect to the "anti-diagonal", i.e., the diagonal that includes the positions $(j, n + 1 - j)$, is denoted by $T^+$. It is easy to see that the mapping $T \mapsto T^+$ is an anti-isomorphism. Indeed, it is a composition of the usual transpose and an inner automorphism induced by the matrix $J := [\delta_{i,\,n+1-j}]$, where $\delta_{ij}$ is the Kronecker delta symbol.

THEOREM 2.1. [MS1] Let $F$ be an arbitrary field and $\varphi$ a linear map from $T_n(F)$, the algebra of upper triangular matrices, into itself. Assume that $n \ge 3$. The following conditions are equivalent.

(a) $\varphi$ preserves commutativity in both directions.
(b) There exists a non-zero scalar $c \in F$, a linear functional $f$ on $T_n$ and an invertible matrix $S \in T_n$ such that
f(T)I
or
+
f(T)I
(c) There exists a Lie isomorphism a of Tn(F) a non-zero scalar c E F, and a linear mapping f from Tn(F) into its centre such that
+
tr(TD)I,
or
(TD)I,
where S E Tn(F) is invertible, tr denotes the trace and D is a diagonal matrix withtr{D) ^ - 1 . We note that the result above implies that every Lie isomorphism
+
T(T)I,
tp(T) = -S-nA~1T+ASn
+ T{T)I,
where $A$ is an invertible element of $T_\infty$, $S$ is the bilateral shift, $n$ is an integer, and $\tau$ is a linear functional on $T_\infty$ that annihilates all commutators.

Also in finite dimensional spaces, the following result about block triangular algebras was proved in [MS2]. We start with a definition. For every finite sequence of positive integers $n_1, n_2, \ldots, n_k$ satisfying $n_1 + n_2 + \cdots + n_k = n$, we associate an algebra $T(n_1, n_2, \ldots, n_k)$ consisting of all $n \times n$ matrices of the form
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1k} \\ 0 & A_{22} & \cdots & A_{2k} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_{kk} \end{pmatrix},$$
where $A_{ij}$ is an $n_i \times n_j$ matrix. We call such an algebra a block upper triangular algebra.

THEOREM 2.4. Let $\mathcal{A} = T(n_1, n_2, \ldots, n_r)$ and let $\mathcal{B} = T(m_1, m_2, \ldots, m_s)$ be block upper triangular algebras in $M_n$ and $M_m$ respectively, and let $\varphi$ be a Lie isomorphism from $\mathcal{A}$ onto $\mathcal{B}$. Then $m = n$, $r = s$ and there exist an invertible matrix $B \in \mathcal{B}$ and a linear functional $\tau$ on $\mathcal{A}$ satisfying $\tau(I) \ne -1$ such that either

(a)
m = mi and ip(T) = B~lTB
+ T(T)I,
or
(b) m = mr-i and tp(T) = B^T+B + r{T)I. The mapping T is given by T(T) = tr {TD), where D is a diagonal matrix such that tr (D) ^ — 1 and the diagonal entries in every one of the blocks that determine A are identical.
3. RANK ONE PRESERVING MAPS

A map $\varphi$ from a space $S_1$ of matrices into a space $S_2$ of matrices is said to preserve matrices of rank one if $\varphi(T)$ is of rank one whenever $T$ has rank one. It is said to preserve rank one matrices in both directions when
often involve rank-one preservers. Classifying isomorphisms of several types of operator algebras is frequently accomplished by exploiting the fact that they preserve rank one operators; see, e.g., [Da; Chapter 17].

Although the forms of rank one preservers are very similar to the forms of the other preservers discussed in the previous sections, we describe them slightly differently. By a left multiplication on an algebra $A$ we mean a mapping $L_a$ defined by $L_a(x) = ax$ for every $x \in A$, where $a$ is an element of $A$. Right multiplications $R_a$ are defined analogously. The linear rank one preservers on the space of all $n \times n$ matrices were characterized by Marcus and Moyls [MM]. They show that every such map is a composition of a left multiplication $L_A$ by an invertible matrix $A$, a right multiplication $R_B$ by an invertible matrix $B$, and possibly the transpose map. For related results, and a summary of similar results obtained from 1960 until 1989, we refer to [Lo] and the references therein.

In this section, we discuss more recent results about additive (not necessarily linear) maps that preserve rank one, especially on triangular matrix algebras. In [OP2], Omladic and Semrl characterized surjective additive maps on the space of finite rank operators on real or complex Banach spaces. In the case of finite dimensional spaces, they show that every such map is a composition of the three types of maps described above and a fourth type induced by an automorphism of the underlying field, which we describe presently. Assume that $c \mapsto \bar{c}$ is an automorphism of the underlying field $F$, and $C = [c_{ij}] \in M_{mn}(F)$. We denote the matrix $[\bar{c}_{ij}]$ by $\bar{C}$. Evidently the map $C \mapsto \bar{C}$ preserves every rank. We say that $C \mapsto \bar{C}$ is the map induced on the space of matrices by the field-automorphism $c \mapsto \bar{c}$. We shall make use of the transpose with respect to the anti-diagonal, $T \mapsto T^+$, described in §2. We now define another type of rank one preserver, which appears in [BS].

(3.1) Let each of $f_1, f_2, \ldots, f_n$ be an additive mapping from $F$ to $F$ such that $f_1$ is bijective, and let $f = (f_1, f_2, \ldots, f_n)$. Define a mapping $\hat{f}$ on a triangular algebra $A = T(n_1, \ldots, n_k)$, with $n_1 = 1$, by
$$\hat{f}\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ 0 & c_{22} & \cdots & c_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & c_{nn} \end{pmatrix} = \begin{pmatrix} f_1(c_{11}) & f_2(c_{11}) + c_{12} & \cdots & f_n(c_{11}) + c_{1n} \\ 0 & c_{22} & \cdots & c_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & c_{nn} \end{pmatrix}.$$
This is a surjective additive mapping on $A$ and it preserves rank one matrices, but only when $n_1 = 1$.

(3.2) For $f$ and $f_1, f_2, \ldots, f_n$ as above, define a mapping $\check{f}$ on a triangular algebra $A = T(n_1, \ldots, n_k)$, with $n_k = 1$, in a similar fashion except that the "action" is on the last column instead of the first row; more precisely, $\check{f}(C) = (\hat{f}(C^+))^+$. Again this is an additive mapping on $A$ preserving rank one matrices, but only when $n_k = 1$.

We now present a result from [BS].

THEOREM 3.3. [BS] Let $A = T(n_1, \ldots, n_k)$ be a block upper triangular algebra in $M_n(F)$, such that A / T^F). Let
REFERENCES

[A1] B. Aupetit, Proprietes Spectrales des Algebres de Banach, Lecture Notes in Mathematics, No. 735, Springer-Verlag, New York, 1979.
[A2] B. Aupetit, Spectrum preserving linear maps, preprint.
[AMo] B. Aupetit and H. du T. Mouton, Spectrum-preserving linear mappings in Banach algebras, Studia Math. 109 (1994), 91-100.
[BS] J. Bell and A. R. Sourour, Additive rank-one preserving mappings on triangular matrix algebras, Linear Algebra Appl. 312 (2000), 13-33.
[Br] M. Bresar, Commutativity traces of biadditive mappings, commutativity preserving mappings and Lie mappings, Trans. Amer. Math. Soc. 335 (1993), 525-546.
[BM] M. Bresar and C. R. Miers, Commutativity preserving mappings of von Neumann algebras, Can. J. Math. 45 (1993), 695-708.
[CL] G. H. Chan and M. H. Lim, Linear transformations on symmetric matrices that preserve commutativity, Linear Algebra Appl. 47 (1982), 11-22.
[CHNRR] M. D. Choi, D. Hadwin, E. Nordgren, H. Radjavi and P. Rosenthal, On positive linear maps preserving invertibility, J. Funct. Anal. 59 (1984), 462-469.
[CJR] M. D. Choi, A. A. Jafarian and H. Radjavi, Linear maps preserving commutativity, Linear Algebra Appl. 87 (1987), 227-241.
[Da] K. R. Davidson, Nest Algebras, Pitman Research Notes in Mathematics, no. 191, Longman Scientific and Technical, London and New York, 1988.
[Di] J. Dieudonne, Sur une generalisation du groupe orthogonal a quatre variables, Arch. Math. 1 (1949), 282-287.
[Do] D. Dokovic, Automorphisms of the Lie algebra of upper triangular matrices over a connected commutative ring, J. Algebra 170 (1994), 101-110.
[F] G. Frobenius, Uber die Darstellung der endlichen Gruppen durch lineare Substitutionen, Sitzungsber. Deutsch. Akad. Wiss. Berlin (1897), 994-1015.
[G] A. Gleason, A characterization of maximal ideals, J. Analyse Math. 19 (1967), 171-172.
[H1] I. N. Herstein, On the Lie and Jordan rings of a simple associative ring, Amer. J. Math. 77 (1955), 279-285.
[H2] I. N. Herstein, Topics in Ring Theory, Chicago Lecture Notes in Mathematics, University of Chicago Press, Chicago and London, 1969.
[Hu] J. E. Humphreys, Introduction to Lie Algebras and Representation Theory, Graduate Texts in Math. 9, Springer-Verlag, New York, Heidelberg, Berlin, 1972.
[JS] A. Jafarian and A. R. Sourour, Spectrum preserving linear maps, J. Funct. Anal. 66 (1986), 255-261.
[KZ] J. P. Kahane and W. Zelazko, A characterization of maximal ideals in commutative Banach algebras, Studia Math. 29 (1968), 339-343.
[K] I. Kaplansky, Algebraic and Analytic Aspects of Operator Algebras, Regional Conference Series in Math. 1, Amer. Math. Soc., Providence, 1970.
[LT1] C. K. Li and N. K. Tsing, ed., A survey of linear preserver problems, Linear and Multilinear Algebra 33 (1992), 1-129.
[LT2] C. K. Li and N. K. Tsing, Linear preserver problems: A brief introduction and some special techniques, Linear Algebra Appl. 162-164 (1992), 217-235.
[Lo] R. Loewy, Linear transformations which preserve or decrease rank, Linear Algebra Appl. 121 (1989), 151-161.
[MS1] L. Marcoux and A. R. Sourour, Commutativity preserving linear maps and Lie automorphisms of triangular matrix algebras, Linear Algebra Appl. 288 (1999), 89-104.
[MS2] L. Marcoux and A. R. Sourour, Lie isomorphisms of Nest Algebras, J. Funct. Anal. 164 (1999), 163-180.
[Ma] M. Marcus, Linear transformations on matrices, J. Nat. Bureau Standards 75B (1971), 107-113.
[MM] M. Marcus and B. N. Moyls, Transformations on tensor product spaces, Pacific J. Math. 9 (1959), 1215-1221.
[MP] M. Marcus and R. Purves, Linear transformations on algebras of matrices: The invariance of the elementary symmetric functions, Canad. J. Math. 11 (1959), 383-396.
[Ma1] W. S. Martindale, Lie isomorphisms of primitive rings, Proc. Amer. Math. Soc. 14 (1963), 909-916.
[Ma2] W. S. Martindale, Lie isomorphisms of simple rings, J. London Math. Soc. 44 (1969), 213-221.
[Mi] C. R. Miers, Lie homomorphisms of operator algebras, Pacific J. Math. 38 (1971), 717-735.
[O] M. Omladic, On operators preserving commutativity, J. Functional Analysis 66 (1986), 105-122.
[OP1] M. Omladic and P. Semrl, Spectrum-preserving additive maps, Linear Algebra Appl. 153 (1991), 67-72.
[OP2] M. Omladic and P. Semrl, Additive mappings preserving operators of rank one, Linear Algebra Appl. 182 (1993), 239-256.
[RS] M. Roitman and Y. Sternfeld, When is a linear functional multiplicative?, Trans. Amer. Math. Soc. 267 (1981), 111-124.
[Ra] H. Radjavi, Commutativity-preserving operators on symmetric matrices, Linear Algebra Appl. 61 (1984), 219-224.
[Ru] B. Russo, Linear mappings of operator algebras, Proc. Amer. Math. Soc. 17 (1966), 1019-1022.
[S] A. R. Sourour, Invertibility preserving linear maps, Trans. Amer. Math. Soc. 348 (1996), 13-30.
[W] W. Watkins, Linear maps that preserve commuting pairs of matrices, Linear Algebra Appl. 14 (1976), 29-35.
[Ze] W. Zelazko, A characterization of multiplicative linear functionals in complex Banach algebras, Studia Math. 30 (1968), 83-85.
Department of Mathematics and Statistics University of Victoria Victoria, British Columbia Canada V8W 3P4
Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 223-245)

PREDICTION: ADVANCES AND NEW RESEARCH
Essam K. AL-Hussaini Mathematics Department, University of Assiut, Assiut, Egypt
ABSTRACT
Prediction is reviewed and the most recent advances in the area are presented. An objective of this paper is to study Bayesian multisample prediction and give a concise form for the predictive density function of the $r$th observable in sample $j$ based on the informative sample(s). Applications are shown to a general class of population distributions which specializes to a wide spectrum of life testing distribution models. The uncertainty about the true value of the parameter(s) is measured by a general class of prior density functions.
1. INTRODUCTION

Statistical prediction is the problem of inferring the values of unknown observables ('future variables'), or functions of such observables, from currently available 'informative' observations. As in estimation, a predictor can be either a point or an interval predictor. Parametric and nonparametric (distribution-free) prediction have been considered in the literature. Frequentist and Bayesian approaches have been used to obtain predictors and study their properties. The maximum likelihood predictor (MLP), best linear unbiased predictor (BLUP) and best linear invariant predictor (BLIP) are examples of frequentist point predictors. A review of frequentist prediction intervals was made by Patel (1989). In parametric prediction, he covered the results obtained when the underlying population distributions are discrete (Poisson, binomial and negative binomial) and continuous (normal, lognormal, exponential, Weibull, gamma, Inverse Gaussian, Pareto and increasing failure rate). He also presented nonparametric prediction intervals, among other results. Nagaraja (1995) surveyed prediction results that are particularly associated with the exponential distributions. The BLUP, BLIP, MLP, Bayes predictors, prediction intervals and regions are among the topics surveyed. A more recent review of point and interval prediction of order statistics was made by Kaminsky and Nelson (1998), which covered linear and maximum likelihood point predictions and interval prediction based on pivotals and on best linear predictors, Bayesian prediction intervals and model shifts. Balakrishnan and Rao (1997) studied large sample approximations to the BLUP based on progressively censored samples. Seshadri (1999) examined the methods of prediction intervals for Inverse
Gaussian observables that were presented by Chhikara and Guttman (1982), Padgett (1982) and Padgett and Tsoi (1986). Nonparametric prediction was considered by Fligner and Wolfe [(1976), (1979)(a), (1979)(b)], Guilbaud (1983) and Johnson et al. (1999), among others. The problem of prediction can be solved fully within the Bayesian framework [Geisser (1993)]. Several researchers have studied Bayesian prediction. Among others are Dunsmore [(1974), (1976), (1983)], Geisser [(1984), (1986), (1990), (1993)], Lingappaiah [(1978), (1979), (1980), (1986), (1989)], AL-Hussaini and Jaheen [(1995), (1996), (1999)], AL-Hussaini [(1999)(a), (1999)(b)], Lee and Lio (1999), Corcuera and Giummole (1999) and AL-Hussaini, Nigm and Jaheen (2000). The two books by Aitchison and Dunsmore (1975) and Geisser (1993), which are primarily concerned with Bayes prediction, give illustrative examples, analysis and possible applications. A wide range of potential applications of statistical prediction includes density function estimation, calibration, classification, regulation, model comparison and model criticism. For details and references, see, for example, Bernardo and Smith (1994). A growing interest in predicting future records has arisen in the last two decades. For example, the BLUP of future records was obtained by Ahsanullah (1980) when the two-parameter exponential was the underlying distribution. Nagaraja (1984) studied the BLUP and BLIP of records under the Weibull model. Doganaksoy and Balakrishnan (1997) suggested a simple way to obtain the BLUP of records. Interval prediction of records was studied by Dunsmore (1983), Ahsanullah (1990), Balakrishnan and Chan [(1994), (1998)], Balakrishnan, Ahsanullah
and Chan (1995), Berred (1998) and Chan (1998). For details on prediction of records and other interesting topics related to records, see, for example, the book by Arnold, Balakrishnan and Nagaraja (1998). One possible sampling scheme, described by Dunsmore (1974) as plan 2, consists of two random samples: $T_{10},\ldots,T_{n_0 0}$ and $T_{11},\ldots,T_{n_1 1}$. The informative experiment (sample zero) is assumed to be censored (type II), so only the first $r_0$ order statistics of the failure times of this sample are available. The target is to predict future observables from the second sample (sample one). The two random samples are assumed to be independent and drawn from the same population. The uncertainty about the true value of the parameter(s) is measured by some prior density function. Plan 1 of Dunsmore (1974) is a special case of plan 2 where no censoring is imposed on the informative experiment and the complete sample zero is available. Lingappaiah (1978) extended this sampling scheme from two to $M+1$ independent random samples, all of which are assumed to be drawn from an exponential population. He assumed that the prediction process begins with sample (stage) 1 and moves on sequentially. No prediction is made at stage 0. For a given sample (stage), he proposed the use of the posterior density as a prior for the next stage each time a next stage is required. Such prediction intervals are useful when, for example, a manufacturer wishes to assure the acceptance of $M$ future shipments of equipment. In this paper, the Bayesian multisample prediction problem is formalized in a theorem in Section 2, in which an expression for the predictive density function of the $r_j$-th order statistic in sample $j$ is based on the information available in sample zero and the order statistics in previous samples $1,\ldots,j-1$, which are assumed to have been observed. In this expression, the posterior
is obtained only once, without having to find a posterior at each stage, and the underlying distribution is assumed to be of a general form rather than being exponential as proposed by Lingappaiah (1978). Applications to the Pareto and Weibull (including the exponential and Rayleigh) models are presented in Section 3.
2. PREDICTIVE DENSITY OF THE $r_j$-TH ORDER STATISTIC IN SAMPLE $j$

Consider a series of $M+1$ independent random samples drawn from a population whose probability density function is $f_T(t \mid \theta)$ and whose cumulative distribution function is $F_T(t \mid \theta)$, $t > 0$, where $\theta$ is a vector of parameters that belongs to a space $\Omega$ such that $f_T(t \mid \theta) > 0$ for $\theta \in \Omega$. Designate the samples by $0, 1, \ldots, M$ and their sizes by $n_0, n_1, \ldots, n_M$. It is assumed that only the first $r_0$ order statistics, out of $n_0$, representing times-to-failure in sample zero are available. Schematically, the samples may be as follows:

Sample number    Sample observations                $r_j$-th order statistic
0                $T_{10},\ldots,T_{n_0 0}$
1                $T_{11},\ldots,T_{n_1 1}$          $T_{(r_1)1} = Y_{1r_1}$
...              ...                                ...
M                $T_{1M},\ldots,T_{n_M M}$          $T_{(r_M)M} = Y_{Mr_M}$

Let the $r_j$-th order statistic of sample $j$ be denoted by $Y_{jr_j} = T_{(r_j)j}$, $r_j = 1,\ldots,n_j$; $j = 1,\ldots,M$.

In sample zero, we shall assume that the times-to-failure are ordered, so that $t_{10} < \cdots < t_{r_0 0} < \cdots < t_{n_0 0}$, and that only the first $r_0$ order statistics are available. That is, type II censoring is imposed on sample zero. The
order statistic $T_{(r_j)j}$ represents the time-to-failure number $r_j$ in sample $j$, $j = 1,\ldots,M$. At each stage, it is desired to predict an order statistic based on the order statistics at earlier stages and the information available at stage 0. The density function of $Y_{jr_j}$ is known to be given by
\[ f_{Y_{jr_j}}(y_{jr_j} \mid \theta) \propto [F_T(y_{jr_j} \mid \theta)]^{r_j - 1}\,[1 - F_T(y_{jr_j} \mid \theta)]^{n_j - r_j}\, f_T(y_{jr_j} \mid \theta) . \tag{2.1} \]
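As a small numerical illustration of (2.1), the sketch below evaluates the order-statistic density with its normalizing constant $n_j!/[(r_j-1)!\,(n_j-r_j)!]$ (a standard fact that (2.1) leaves implicit in the proportionality sign). The unit exponential population, the sample size 5, the rank 2 and the helper names are assumptions chosen only for the example; they do not come from the paper.

```python
import math

def order_stat_pdf(y, r, n, cdf, pdf):
    """Density of the r-th order statistic in a sample of size n, as in (2.1),
    multiplied by the normalizing constant n! / ((r-1)! (n-r)!)."""
    const = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    F = cdf(y)
    return const * F ** (r - 1) * (1.0 - F) ** (n - r) * pdf(y)

# Illustrative population (assumption): unit exponential.
def exp_cdf(t):
    return 1.0 - math.exp(-t)

def exp_pdf(t):
    return math.exp(-t)

def integrate(f, a, b, steps=20000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

# Total mass and mean of the 2nd order statistic out of 5.
total_mass = integrate(lambda y: order_stat_pdf(y, 2, 5, exp_cdf, exp_pdf), 0.0, 40.0)
mean_2_5 = integrate(lambda y: y * order_stat_pdf(y, 2, 5, exp_cdf, exp_pdf), 0.0, 40.0)
```

For the unit exponential the mean of the second order statistic out of five is $1/5 + 1/4$, which gives a quick sanity check on the density.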
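The following theorem gives the predictive density function of the $r_j$-th order statistic in sample $j$, denoted by $Y_{jr_j}$, given the first $r_0$ order statistics at stage 0, denoted by $\underline{t}_0$, and the previous order statistics of samples $1,\ldots,j-1$.

THEOREM. The Bayesian predictive density function of $Y_{jr_j}$ is given by
\[ f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0) \propto \int_{\Omega} L_j(\theta;\underline{y}_{r_j})\,\pi_0^*(\theta \mid \underline{t}_0)\,d\theta , \tag{2.2} \]
where
\[ L_j(\theta;\underline{y}_{r_j}) \propto \prod_{i=1}^{j} f_{Y_{ir_i}}(y_{ir_i} \mid \theta) , \tag{2.3} \]
\[ \pi_0^*(\theta \mid \underline{t}_0) \propto L(\theta;\underline{t}_0)\,\pi(\theta) , \tag{2.4} \]
\[ L(\theta;\underline{t}_0) \propto \Big[\prod_{i=1}^{r_0} f_T(t_{i0} \mid \theta)\Big]\,[1 - F_T(t_{r_0 0} \mid \theta)]^{n_0 - r_0} , \tag{2.5} \]
$f_{Y_{ir_i}}(y_{ir_i} \mid \theta)$ is given by (2.1), $f_T(t_{i0} \mid \theta)$ is the population density evaluated at $t_{i0}$, $\pi(\theta)$ is some given prior density function, $\underline{t}_0$ is the vector of the first $r_0$ order statistics at stage 0, and
\[ \underline{t}_0 = (t_{10},\ldots,t_{r_0 0}) \quad\text{and}\quad \underline{y}_{r_j} = (y_{1r_1},\ldots,y_{jr_j}) . \tag{2.6} \]

Proof. Suppose that $\pi(\theta)$ is a given prior density function, $\theta \in \Omega$. Then the posterior density function (based on sample zero) is given by (2.4). The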
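predictive density function of the $r_1$-th order statistic in sample 1 is then given by
\[ f_{Y_{1r_1}}(y_{1r_1} \mid \underline{t}_0) \propto \int_{\Omega} f_{Y_{1r_1}}(y_{1r_1} \mid \theta)\,\pi_0^*(\theta \mid \underline{t}_0)\,d\theta , \tag{2.7} \]
where $f_{Y_{1r_1}}(y_{1r_1} \mid \theta)$ is given by (2.1) when $j = 1$. The posterior density function at sample (stage) 1, denoted by $\pi_1^*(\theta \mid y_{1r_1},\underline{t}_0)$, is defined by
\[ \pi_1^*(\theta \mid y_{1r_1},\underline{t}_0) \propto f_{Y_{1r_1}}(y_{1r_1} \mid \theta)\,\pi_0^*(\theta \mid \underline{t}_0) . \tag{2.8} \]
Such a posterior density function is used as a prior for the next sample (stage) 2, so that the predictive density function of $Y_{2r_2}$ is given by
\[ f_{Y_{2r_2}}(y_{2r_2} \mid y_{1r_1},\underline{t}_0) \propto \int_{\Omega} f_{Y_{2r_2}}(y_{2r_2} \mid \theta)\,\pi_1^*(\theta \mid y_{1r_1},\underline{t}_0)\,d\theta , \tag{2.9} \]
which, upon the substitution of (2.8), yields
\[ f_{Y_{2r_2}}(y_{2r_2} \mid y_{1r_1},\underline{t}_0) \propto \int_{\Omega} f_{Y_{2r_2}}(y_{2r_2} \mid \theta)\,f_{Y_{1r_1}}(y_{1r_1} \mid \theta)\,\pi_0^*(\theta \mid \underline{t}_0)\,d\theta . \tag{2.10} \]
Continuing in this line, the Bayesian predictive density function of $Y_{jr_j}$ at stage $j = 1, 2, \ldots, M$ is then given by
\[ f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0) \propto \int_{\Omega} L_j(\theta;\underline{y}_{r_j})\,\pi_0^*(\theta \mid \underline{t}_0)\,d\theta , \]
where $L_j(\theta;\underline{y}_{r_j})$ is as given by (2.3), $\underline{t}_0$ and $\underline{y}_{r_j}$ by (2.6), $r_i = 1,\ldots,n_i$ and $j = 1,\ldots,M$.

Remark. Eq. (2.2) gives the predictive density of $Y_{jr_j}$ in the form
\[ f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0) = K \int_{\Omega} L_j(\theta;\underline{y}_{r_j})\,\pi_0^*(\theta \mid \underline{t}_0)\,d\theta , \]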
where $K$ is the normalizing constant. It then follows that
\[ P[Y_{jr_j} > \nu \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0] = I(\nu)/I(0) , \tag{2.11} \]
where
\[ I(\nu) = \int_{\nu}^{\infty}\!\int_{\Omega} L_j(\theta;\underline{y}_{r_j})\,\pi_0^*(\theta \mid \underline{t}_0)\,d\theta\,dy_{jr_j} , \tag{2.12} \]
and $I(0) = 1/K$. The bounds of a two-sided confidence interval with cover $\tau$ for $Y_{jr_j}$ may thus be obtained by solving the following two equations for the lower and upper bounds, $L_j$ and $U_j$, respectively:
\[ \frac{1+\tau}{2} = P[Y_{jr_j} > L_j \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0] = I(L_j)/I(0) , \tag{2.13} \]
\[ \frac{1-\tau}{2} = P[Y_{jr_j} > U_j \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0] = I(U_j)/I(0) , \tag{2.14} \]
where $I(0)$, $I(L_j)$ and $I(U_j)$ are obtained by substituting $0$, $L_j$ and $U_j$ in the integral $I(\nu)$ given by (2.12).
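When $I(\nu)/I(0)$ has no closed-form inverse, the two equations (2.13) and (2.14) can be solved by any one-dimensional root finder, since the ratio is a decreasing function of $\nu$. The following is a minimal sketch using bisection. The function names, the search interval and the Burr-type survival function used to exercise it (with arbitrary constants `omega` and `xi`) are illustrative assumptions, not part of the paper.

```python
def solve_bound(survival, target, lo, hi, tol=1e-12):
    """Bisection for nu in [lo, hi] with survival(nu) = target,
    assuming survival is continuous and decreasing on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if survival(mid) > target:
            lo = mid          # still too much mass above mid: move right
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def prediction_bounds(survival, tau, lo=0.0, hi=1e6):
    """Lower and upper bounds with cover tau, in the sense of (2.13)-(2.14)."""
    lower = solve_bound(survival, (1.0 + tau) / 2.0, lo, hi)
    upper = solve_bound(survival, (1.0 - tau) / 2.0, lo, hi)
    return lower, upper

# Hypothetical predictive survival function I(nu)/I(0), for illustration only.
omega, xi = 5.0, 0.5

def survival(nu):
    return (1.0 + xi * nu) ** (-omega)

L, U = prediction_bounds(survival, 0.95)
```

In practice `survival` would wrap a numerical evaluation of (2.12) for the model at hand; the bisection step is unchanged.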
Special Cases

(i) Two-sample prediction. This is the case in which $j = 1$, so that $M = 1$. In this case, we only have two samples: sample zero and sample one. The Bayesian predictive density of $Y_{1r_1}$ is then given by (2.7), where $\underline{t}_0$ represents the vector of the first $r_0$ order statistics in informative sample zero and $y_{1r_1}$ is a value of the $r_1$-th order statistic in future sample one. In this case, the lower and upper bounds of $Y_{1r_1}$ are obtained by solving the following two equations, which are reduced forms of (2.13) and (2.14):
\[ \frac{1+\tau}{2} = P[Y_{1r_1} > L_1 \mid \underline{t}_0] = I(L_1)/I(0) , \tag{2.15} \]
\[ \frac{1-\tau}{2} = P[Y_{1r_1} > U_1 \mid \underline{t}_0] = I(U_1)/I(0) , \tag{2.16} \]
where $I(\nu)$ is given by (2.12) when $j = 1$.
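(ii) Predictive density of the smallest observable in sample $j$. If, in (2.2), $r_j = 1$ for $j = 1,\ldots,M$, we obtain the Bayesian predictive density function of the smallest observable in sample $j$. In this case,
\[ L_j(\theta;\underline{y}_{1}) \propto \prod_{i=1}^{j} [1 - F_T(y_{i1} \mid \theta)]^{n_i - 1}\, f_T(y_{i1} \mid \theta) , \tag{2.17} \]
where $\underline{y}_{1}$ is given by (2.6) when $r_j = 1$, $j = 1,\ldots,M$. The lower and upper bounds of $Y_{j1}$ are obtained by solving the two equations
\[ \frac{1+\tau}{2} = P[Y_{j1} > L_j \mid y_{j-1,1},\ldots,y_{11},\underline{t}_0] = I(L_j)/I(0) , \tag{2.18} \]
\[ \frac{1-\tau}{2} = P[Y_{j1} > U_j \mid y_{j-1,1},\ldots,y_{11},\underline{t}_0] = I(U_j)/I(0) , \tag{2.19} \]
where $I(\nu)$ is given by (2.12) when $r_j = 1$, $j = 1,\ldots,M$. In the two-sample case ($j = 1$), $L_1(\theta; y_{11})$ takes the form
\[ L_1(\theta; y_{11}) \propto [1 - F_T(y_{11} \mid \theta)]^{n_1 - 1}\, f_T(y_{11} \mid \theta) . \tag{2.20} \]
The lower and upper bounds of $Y_{11}$ are obtained by solving the two equations
\[ \frac{1+\tau}{2} = P[Y_{11} > L_1 \mid \underline{t}_0] = I(L_1)/I(0) , \tag{2.21} \]
\[ \frac{1-\tau}{2} = P[Y_{11} > U_1 \mid \underline{t}_0] = I(U_1)/I(0) , \tag{2.22} \]
where $I(\nu)$ is given by (2.12) in which $L_1(\theta; y_{11})$ is given by (2.20).

(iii) Predictive density of the largest observable in sample $j$. If, in (2.2), $r_j = n_j$, $j = 1,\ldots,M$, we obtain the Bayesian predictive density function of the largest observable in sample $j$. In this case,
\[ L_j(\theta;\underline{y}_{n_j}) \propto \prod_{i=1}^{j} [F_T(y_{in_i} \mid \theta)]^{n_i - 1}\, f_T(y_{in_i} \mid \theta) , \tag{2.23} \]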
where $\underline{y}_{n_j}$ is given by (2.6) when $r_j = n_j$, $j = 1,\ldots,M$. The lower and upper bounds of $Y_{jn_j}$ are obtained by solving the two equations (2.13) and (2.14) after replacing $r_j$ by $n_j$.
3. A GENERAL CLASS OF POPULATION DISTRIBUTIONS AND A GENERAL CLASS OF PRIORS

Suppose that the population cumulative distribution function (cdf) is of the form
\[ F_T(t \mid \theta) = 1 - \exp[-\lambda(t)] , \quad t > 0 , \tag{3.1} \]
where $\lambda(t) = \lambda(t;\theta)$ is a nonnegative continuous function of $t$ such that $\lambda(t) \to 0$ as $t \to 0^+$ and $\lambda(t) \to \infty$ as $t \to \infty$. The corresponding reliability $R(t \mid \theta)$, hazard rate $h(t \mid \theta)$ and density $f_T(t \mid \theta)$ functions are given, respectively, by
\[ R(t \mid \theta) = 1 - F_T(t \mid \theta) = \exp[-\lambda(t)] , \tag{3.2} \]
\[ h(t \mid \theta) = \lambda'(t) , \tag{3.3} \]
\[ f_T(t \mid \theta) = \lambda'(t)\exp[-\lambda(t)] , \quad t > 0 . \tag{3.4} \]
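Relations (3.1)-(3.4) say that any member of the class is determined by the pair $\lambda(t)$, $\lambda'(t)$. A minimal sketch of that idea follows: a hypothetical helper `make_model` builds the four functions from user-supplied `lam` and `dlam`, and the Weibull-type choice $\lambda(t) = \theta t^{\beta}$ with arbitrary values of $\theta$ and $\beta$ is used only as an example.

```python
import math

def make_model(lam, dlam):
    """Build cdf, reliability, hazard and density from lambda(t) and its
    derivative, following (3.1)-(3.4)."""
    def reliability(t):
        return math.exp(-lam(t))            # (3.2)

    def cdf(t):
        return 1.0 - reliability(t)         # (3.1)

    def hazard(t):
        return dlam(t)                      # (3.3)

    def pdf(t):
        return dlam(t) * math.exp(-lam(t))  # (3.4)

    return cdf, reliability, hazard, pdf

# Illustrative choice (assumption): lambda(t) = theta * t**beta.
theta, beta = 0.5, 2.0
cdf, reliability, hazard, pdf = make_model(
    lambda t: theta * t ** beta,
    lambda t: theta * beta * t ** (beta - 1.0),
)
```

The same helper covers any other member of the class by swapping in a different pair of functions.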
Some important distributions that are used in life testing naturally belong to this class. Among others are the Weibull (including the exponential and Rayleigh as special cases), compound Weibull (or the three-parameter Burr type XII), Pareto, beta, Gompertz and compound Gompertz distributions [see AL-Hussaini and Osman (1997)]. Such a general class of distributions, together with a general (natural conjugate) class of prior density functions given by
\[ \pi(\theta) \propto C(\theta;\gamma)\exp[-D(\theta;\gamma)] , \quad \theta \in \Omega , \tag{3.5} \]
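where $\gamma$ is a vector of prior parameters, were suggested by AL-Hussaini (1999)(b) to develop Bayesian prediction bounds for observables from (3.4). It follows, from (2.5), that
\[ L(\theta;\underline{t}_0) \propto A(\underline{t}_0;\theta)\exp[-B(\underline{t}_0;\theta)] , \tag{3.6} \]
where
\[ A(\underline{t}_0;\theta) = \prod_{i=1}^{r_0} \lambda'(t_{i0}) , \qquad B(\underline{t}_0;\theta) = \sum_{i=1}^{r_0} \lambda(t_{i0}) + (n_0 - r_0)\,\lambda(t_{r_0 0}) . \tag{3.7} \]
The following corollary specializes the above Theorem to class (3.1) of population distributions and class (3.5) of priors.

COROLLARY. If all of the independent $M+1$ random samples are assumed to be drawn from a population with cumulative distribution function (3.1) and if a prior density is given by (3.5), then the predictive density of $Y_{jr_j}$ is given by
\[ f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0) \propto \int_{\Omega} g(\theta)\exp[-h(\theta)]\,d\theta , \tag{3.8} \]
where
\[ g(\theta) = b(\underline{y}_{r_j})\,\eta(\theta;\underline{t}_0) \quad\text{and}\quad h(\theta) = \delta(\underline{y}_{r_j}) + \zeta(\theta;\underline{t}_0) , \tag{3.9} \]
\[ b(\underline{y}_{r_j}) = \prod_{i=1}^{j}\Big\{ \lambda'(y_{ir_i}) \sum_{l_i=0}^{r_i-1} (-1)^{l_i}\binom{r_i-1}{l_i}\exp[-l_i\,\lambda(y_{ir_i})] \Big\} , \tag{3.10} \]
\[ \eta(\theta;\underline{t}_0) = A(\underline{t}_0;\theta)\,C(\theta;\gamma) \quad\text{and}\quad \delta(\underline{y}_{r_j}) = \sum_{i=1}^{j} m_i\,\lambda(y_{ir_i}) , \tag{3.11} \]
\[ m_i = n_i - r_i + 1 , \qquad \zeta(\theta;\underline{t}_0) = B(\underline{t}_0;\theta) + D(\theta;\gamma) . \tag{3.12} \]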
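The step from (2.1) to (3.10)-(3.12) rests on a binomial expansion of $[1 - e^{-\lambda(y)}]^{r-1}$, which turns each factor of $L_j$ into $\lambda'(y)\sum_{l}(-1)^{l}\binom{r-1}{l}e^{-l\lambda(y)}$ times $e^{-m\lambda(y)}$ with $m = n - r + 1$. The sketch below checks that identity numerically for a single factor. The Weibull-type $\lambda$, the parameter values and the function names are assumptions made only for the check.

```python
import math

def kernel_direct(y, r, n, lam, dlam):
    """One factor of (2.3): F^(r-1) (1-F)^(n-r) f under the class (3.1)."""
    F = 1.0 - math.exp(-lam(y))
    return F ** (r - 1) * (1.0 - F) ** (n - r) * dlam(y) * math.exp(-lam(y))

def kernel_expanded(y, r, n, lam, dlam):
    """The same factor written as in (3.10)-(3.12): a finite alternating sum
    times exp(-m * lambda(y)), with m = n - r + 1."""
    m = n - r + 1
    series = sum(
        (-1) ** l * math.comb(r - 1, l) * math.exp(-l * lam(y))
        for l in range(r)
    )
    return dlam(y) * series * math.exp(-m * lam(y))

# Illustrative lambda (assumption): theta * y**beta with theta=0.7, beta=1.5.
def lam(y):
    return 0.7 * y ** 1.5

def dlam(y):
    return 0.7 * 1.5 * y ** 0.5

gaps = [
    abs(kernel_direct(y, r, 6, lam, dlam) - kernel_expanded(y, r, 6, lam, dlam))
    for y in (0.1, 0.8, 2.5)
    for r in (1, 3, 6)
]
```

All entries of `gaps` should be at rounding-error level, since the two expressions are algebraically identical.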
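The proof of this corollary follows by implementing (3.1)-(3.5) in (2.3), where (2.1) and (2.3) yield
\[ L_j(\theta;\underline{y}_{r_j}) \propto b(\underline{y}_{r_j})\exp[-\delta(\underline{y}_{r_j})] , \]
where $b(\underline{y}_{r_j})$ and $\delta(\underline{y}_{r_j})$ are given by (3.10) and (3.11), respectively. The posterior density function at stage zero is obtained by substituting (3.5) and (3.6) in (2.4), to yield
\[ \pi_0^*(\theta \mid \underline{t}_0) \propto \eta(\theta;\underline{t}_0)\exp[-\zeta(\theta;\underline{t}_0)] , \quad \theta \in \Omega , \tag{3.13} \]
where $\eta(\theta;\underline{t}_0)$ and $\zeta(\theta;\underline{t}_0)$ are given by (3.11) and (3.12), $A(\underline{t}_0;\theta)$ and $B(\underline{t}_0;\theta)$ by (3.7), and $C(\theta;\gamma)$, $D(\theta;\gamma)$ are the prior factors appearing in (3.5).

4. APPLICATIONS

4.1 The Pareto-type I model

Suppose that $T \sim$ Pareto-type I $(\alpha,\beta)$ with cdf
\[ F_T(t \mid \theta) = 1 - \Big(\frac{\beta}{t}\Big)^{\alpha} = 1 - \exp[\alpha\ln(\beta/t)] , \quad t > \beta , \ (\alpha > 0,\ 0 < \beta < d) . \tag{4.1} \]
Comparing with (3.1), we have
\[ \lambda(t) = -\alpha\ln(\beta/t) , \qquad \lambda'(t) = \alpha/t . \tag{4.2} \]
Let $\theta = (\alpha,\beta)$, where both parameters are assumed to be unknown. It follows from (3.7) that
\[ A(\underline{t}_0;\theta) = \alpha^{r_0}/A(\underline{t}_0) , \qquad B(\underline{t}_0;\theta) = -\alpha\,B(\underline{t}_0;\beta) , \tag{4.3} \]
where
\[ A(\underline{t}_0) = \prod_{i=1}^{r_0} t_{i0} , \qquad B(\underline{t}_0;\beta) = \sum_{i=1}^{r_0}\ln(\beta/t_{i0}) + (n_0 - r_0)\ln(\beta/t_{r_0 0}) . \tag{4.4} \]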
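Suppose that a 'generalized' prior, suggested by Lwin (1972), is of the form
\[ \pi(\theta) \propto \alpha^{a}\beta^{-1}\exp[-\alpha(\ln c - b\ln\beta)] , \quad \alpha > 0 ,\ 0 < \beta < d . \tag{4.5} \]
Denote the vector of positive prior parameters by $\gamma = (a,b,c,d)$. Then
\[ C(\theta;\gamma) = \alpha^{a}\beta^{-1} , \qquad D(\theta;\gamma) = \alpha[\ln c - b\ln\beta] . \tag{4.6} \]
The posterior density (at stage 0) is then given by (3.13), where (3.11), (3.12), (4.3) and (4.6) yield
\[ \eta(\theta;\underline{t}_0) = \alpha^{r_0+a}/[\beta A(\underline{t}_0)] , \qquad \zeta(\theta;\underline{t}_0) = \alpha[\ln c - b\ln\beta - B(\underline{t}_0;\beta)] . \tag{4.7} \]
It is assumed that $\alpha > 0$ and $0 < \beta < N$, $N = \min\{t_{10}, d\}$. From (3.10),
\[ b(\underline{y}_{r_j}) = \prod_{i=1}^{j}\Big(\frac{\alpha}{y_{ir_i}}\Big)\sum_{l_i=0}^{r_i-1}(-1)^{l_i}\binom{r_i-1}{l_i}\exp[\alpha\,l_i\ln(\beta/y_{ir_i})] , \tag{4.8} \]
and from (3.11),
\[ \delta(\underline{y}_{r_j}) = -\alpha\sum_{i=1}^{j} m_i\ln(\beta/y_{ir_i}) . \tag{4.9} \]
It can be shown, by using (3.8), (3.9), (4.8), (4.9) and some algebraic manipulations, that
\[ f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0) \propto \sum_{l_1=0}^{r_1-1}\cdots\sum_{l_j=0}^{r_j-1}\Big\{\prod_{i=1}^{j}\frac{1}{y_{ir_i}}(-1)^{l_i}\binom{r_i-1}{l_i}\Big\}\Big[H_{2j}\{H_{1j} + (m_j + l_j)\ln y_{jr_j}\}^{\omega_j+1}\Big]^{-1} , \quad y_{jr_j} > \beta , \tag{4.10} \]
where
\[ H_{1j} = \ln c + (n_0 - r_0)\ln t_{r_0 0} + \sum_{i=1}^{r_0}\ln t_{i0} + \sum_{i=1}^{j-1} m_i^{*}\ln y_{ir_i} - H_{2j}\ln d , \tag{4.11} \]
\[ H_{2j} = b + n_0 + \sum_{i=1}^{j}(m_i + l_i) , \quad m_i^{*} = m_i + l_i , \quad m_i = n_i - r_i + 1 \quad\text{and}\quad \omega_j = a + r_0 + j - 1 . \tag{4.12} \]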
Therefore, for $j \ge 2$ and $\nu > \beta$,
\[ I(\nu) = \sum_{l_1=0}^{r_1-1}\cdots\sum_{l_j=0}^{r_j-1}\Big\{\prod_{i=1}^{j}(-1)^{l_i}\binom{r_i-1}{l_i}\Big\}\Big[\omega_j\,(m_j + l_j)\,H_{2j}\{H_{1j} + (m_j + l_j)\ln\nu\}^{\omega_j}\Big]^{-1}\prod_{i=1}^{j-1}\frac{1}{y_{ir_i}} . \tag{4.13} \]
It then follows that a confidence interval for $Y_{jr_j}$ with cover $\tau$ has bounds $L_j$ and $U_j$ that are given by the solution of the two equations
\[ \frac{1+\tau}{2} = I(L_j)/I(\beta) \quad\text{and}\quad \frac{1-\tau}{2} = I(U_j)/I(\beta) . \]
If the smallest observable in sample $j$ is to be predicted, then $r_j = 1$ for all values of $j$. In this case, bounds for the confidence interval, with cover $\tau$, for $Y_{j1}$ can be obtained explicitly by substituting $r_j = 1$ for all $j$, so that $m_j = n_j$ and $l_j = 0$. The ratio $I(\nu)/I(\beta)$ then takes the form
\[ \frac{I(\nu)}{I(\beta)} = \Big[\frac{H_{1j} + n_j\ln\nu}{H_{1j} + n_j\ln\beta}\Big]^{-\omega_j} , \tag{4.14} \]
where $H_{1j}$ (and $H_{2j}$) are given by (4.11) and (4.12) with $r_i = 1$ and $l_i = 0$ for $i = 1,\ldots,j$. It then follows, from (2.18) and (2.19), that the lower and upper bounds of $Y_{j1}$ are given, respectively, by
\[ L_j = \exp\Big[\frac{1}{n_j}\Big\{\Big(\frac{1+\tau}{2}\Big)^{-1/\omega_j}[H_{1j} + n_j\ln\beta] - H_{1j}\Big\}\Big] , \tag{4.15} \]
\[ U_j = \exp\Big[\frac{1}{n_j}\Big\{\Big(\frac{1-\tau}{2}\Big)^{-1/\omega_j}[H_{1j} + n_j\ln\beta] - H_{1j}\Big\}\Big] . \tag{4.16} \]
In the two-sample case ($j = 1$), (4.10) reduces to
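\[ f_{Y_{1r_1}}(y_{1r_1} \mid \underline{t}_0) \propto \sum_{l_1=0}^{r_1-1}(-1)^{l_1}\binom{r_1-1}{l_1}\Big[y_{1r_1}H_{21}\{H_{11} + (m_1 + l_1)\ln y_{1r_1}\}^{\omega_1+1}\Big]^{-1} , \quad y_{1r_1} > \beta . \]
So that
\[ I(\nu) = \sum_{l_1=0}^{r_1-1}(-1)^{l_1}\binom{r_1-1}{l_1}\Big[\omega_1(m_1 + l_1)H_{21}\{H_{11} + (m_1 + l_1)\ln\nu\}^{\omega_1}\Big]^{-1} , \]
where $H_{11} = \ln c + (n_0 - r_0)\ln t_{r_0 0} + \sum_{i=1}^{r_0}\ln t_{i0} - H_{21}\ln d$ and $H_{21} = b + n_0 + m_1 + l_1$. If, furthermore, $r_1 = 1$ (so that $m_1 = n_1$ and $l_1 = 0$), then
\[ I(\nu) = \Big[\omega_1 n_1 H_{21}\{H_{11} + n_1\ln\nu\}^{\omega_1}\Big]^{-1} , \quad \nu > \beta . \]
Therefore, a confidence interval for $Y_{11}$ with cover $\tau$ has bounds $L_1$ and $U_1$, given by
\[ L_1 = \exp\Big[\Big\{\Big(\frac{1+\tau}{2}\Big)^{-1/\omega_1}\{H_{11} + n_1\ln\beta\} - H_{11}\Big\}\Big/ n_1\Big] \]
and
\[ U_1 = \exp\Big[\Big\{\Big(\frac{1-\tau}{2}\Big)^{-1/\omega_1}\{H_{11} + n_1\ln\beta\} - H_{11}\Big\}\Big/ n_1\Big] . \]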
4.2 The Weibull model

Suppose that $T \sim$ Weibull$(\theta,\beta)$ with cdf
\[ F_T(t \mid \theta) = 1 - \exp(-\theta t^{\beta}) , \quad t > 0 , \tag{4.17} \]
where $\theta$ is an unknown scale parameter and $\beta$ is a known shape parameter. Both parameters are assumed to be positive. It follows, from (3.1), that
\[ \lambda(t) = \theta t^{\beta} \quad\text{and}\quad \lambda'(t) = \theta\beta t^{\beta-1} . \tag{4.18} \]
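The likelihood function $L(\theta;\underline{t}_0)$ is of the form (3.6), in which (3.7) and (4.18) yield
\[ A(\underline{t}_0;\theta) = \theta^{r_0}A(\underline{t}_0;\beta) , \qquad B(\underline{t}_0;\theta) = \theta\,B(\underline{t}_0;\beta) , \tag{4.19} \]
where
\[ A(\underline{t}_0;\beta) = \beta^{r_0}\prod_{i=1}^{r_0} t_{i0}^{\beta-1} , \qquad B(\underline{t}_0;\beta) = \sum_{i=1}^{r_0} t_{i0}^{\beta} + (n_0 - r_0)\,t_{r_0 0}^{\beta} , \tag{4.20} \]
and $\underline{t}_0$ is the vector defined in (2.6). Under the assumption that $\beta$ is known, the gamma prior family is closed under sampling from the Weibull distribution. A natural conjugate prior density $\pi(\theta)$ then takes the form of (3.5), where
\[ C(\theta;\gamma) = \theta^{a-1} , \qquad D(\theta;\gamma) = b\theta , \tag{4.21} \]
and $\gamma = (a,b)$ is the vector of prior parameters. The posterior density function $\pi_0^*(\theta \mid \underline{t}_0)$ then takes the form (3.13), in which
\[ \eta(\theta;\underline{t}_0) = \theta^{r_0+a-1}A(\underline{t}_0;\beta) , \qquad \zeta(\theta;\underline{t}_0) = \theta[b + B(\underline{t}_0;\beta)] , \tag{4.22} \]
where $A(\underline{t}_0;\beta)$ and $B(\underline{t}_0;\beta)$ are given by (4.20). The function $L_j(\theta;\underline{y}_{r_j})$ is of the form $b(\underline{y}_{r_j})\exp[-\delta(\underline{y}_{r_j})]$, in which
\[ b(\underline{y}_{r_j}) = \theta^{j}A(\underline{y}_{r_j};\beta)\prod_{i=1}^{j}\sum_{l_i=0}^{r_i-1}(-1)^{l_i}\binom{r_i-1}{l_i}\exp[-\theta\,l_i\,y_{ir_i}^{\beta}] , \tag{4.23} \]
\[ \delta(\underline{y}_{r_j}) = \theta\sum_{i=1}^{j} m_i\,y_{ir_i}^{\beta} , \tag{4.24} \]
where
\[ A(\underline{y}_{r_j};\beta) = \beta^{j}\prod_{i=1}^{j} y_{ir_i}^{\beta-1} \quad\text{and}\quad m_i = n_i - r_i + 1 . \tag{4.25} \]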
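It then follows, from (3.8)-(3.12), that the predictive density of $Y_{jr_j}$ is given by
\[ f_{Y_{jr_j}}(y_{jr_j} \mid y_{j-1,r_{j-1}},\ldots,y_{1r_1},\underline{t}_0) \propto A(\underline{y}_{r_j};\beta)\sum_{l_1=0}^{r_1-1}\cdots\sum_{l_j=0}^{r_j-1}\Big\{\prod_{i=1}^{j}(-1)^{l_i}\binom{r_i-1}{l_i}\Big\}\Big[b + B(\underline{t}_0;\beta) + \sum_{i=1}^{j}(m_i + l_i)\,y_{ir_i}^{\beta}\Big]^{-(a+r_0+j)} . \tag{4.26} \]
The lower and upper bounds of $Y_{jr_j}$ can be obtained by solving (2.13) and (2.14) using numerical integration, since $I(\nu)$ could not be obtained in closed form. The first order statistic in sample $j$ has a predictive density of the form (4.26) with $r_i = 1$, $i = 1,\ldots,j$. If, in addition, $j = 1$ (the two-sample case), then $Y_{11}$ has the predictive density
\[ f_{Y_{11}}(y_{11} \mid \underline{t}_0) \propto \beta\,y_{11}^{\beta-1}\big[b + B(\underline{t}_0;\beta) + m_1 y_{11}^{\beta}\big]^{-(a+r_0+1)} , \tag{4.27} \]
where $m_1 = n_1 - r_1 + 1$. In this case ($j = 1$ and $r_1 = 1$), $I(\nu)$ can be obtained in the form
\[ I(\nu) = \int_{\nu}^{\infty}\beta\,y_{11}^{\beta-1}\big[b + B(\underline{t}_0;\beta) + m_1 y_{11}^{\beta}\big]^{-(a+r_0+1)}\,dy_{11} = \big[b + B(\underline{t}_0;\beta) + m_1\nu^{\beta}\big]^{-(a+r_0)}\big/[m_1(a + r_0)] . \tag{4.28} \]
The lower and upper bounds of the confidence interval for $Y_{11}$ with cover $\tau$ are obtained by solving (2.21) and (2.22), so that
\[ L_1 = \Big\{\frac{b + B(\underline{t}_0;\beta)}{m_1}\Big[\Big(\frac{1+\tau}{2}\Big)^{-1/(a+r_0)} - 1\Big]\Big\}^{1/\beta} \]
and
\[ U_1 = \Big\{\frac{b + B(\underline{t}_0;\beta)}{m_1}\Big[\Big(\frac{1-\tau}{2}\Big)^{-1/(a+r_0)} - 1\Big]\Big\}^{1/\beta} . \]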
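For the two-sample Weibull case with $r_1 = 1$, (4.28) gives $I(\nu)/I(0) = [1 + \xi\nu^{\beta}]^{-(a+r_0)}$ with $\xi = m_1/[b + B(\underline{t}_0;\beta)]$, which can be inverted directly. The sketch below computes the resulting bounds under that reading. The function name, the censored sample `t0`, the sample sizes and the prior values are all made up for illustration and are not taken from the paper.

```python
def weibull_two_sample_bounds(t0, n0, n1, beta, a, b, tau):
    """Bounds for the smallest observable Y_11 of a future sample of size n1,
    given the first r0 = len(t0) order statistics of a sample of size n0,
    a known shape beta and a gamma(a, b) prior on theta.
    Returns (lower, upper, survival), where survival(nu) = I(nu)/I(0)."""
    ts = sorted(t0)
    r0 = len(ts)
    # B(t0; beta) as in (4.20): observed failures plus censored survivors.
    B = sum(t ** beta for t in ts) + (n0 - r0) * ts[-1] ** beta
    omega = a + r0
    xi = n1 / (b + B)          # m1 = n1 because r1 = 1

    def survival(nu):
        return (1.0 + xi * nu ** beta) ** (-omega)

    def bound(p):
        # Solve survival(nu) = p for nu.
        return ((p ** (-1.0 / omega) - 1.0) / xi) ** (1.0 / beta)

    return bound((1.0 + tau) / 2.0), bound((1.0 - tau) / 2.0), survival

# Hypothetical data: 4 failures observed out of 6 items, future sample of 5.
lower, upper, survival = weibull_two_sample_bounds(
    t0=[0.2, 0.5, 0.9, 1.3], n0=6, n1=5, beta=2.0, a=2.0, b=1.0, tau=0.9
)
```

Setting `beta=1.0` or `beta=2.0` gives the exponential and Rayleigh special cases, respectively.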
Remarks

(1) By setting $\beta = 1$ and $\beta = 2$ in the Weibull$(\theta,\beta)$ model, we obtain the exponential and Rayleigh models, respectively, each with parameter $\theta$. Consequently, all of the results obtained for the Weibull model specialize to the exponential and Rayleigh models by setting $\beta = 1$ and $2$, respectively.

(2) The predictive density (4.27) is of the form of a Burr type XII density function with parameters $(\beta,\omega,\xi)$, where $\omega = a + r_0$ and $\xi = m_1/[b + B(\underline{t}_0;\beta)]$.

(3) If $T \sim$ Weibull$(\alpha,\beta)$, where both of the parameters are unknown, prediction bounds for $Y_{jr_j}$ could only be obtained numerically, whether a noninformative or informative prior is used.
5. CONCLUDING REMARKS

Several results have been reached in the past three decades regarding prediction. In point prediction, maximum likelihood and linear prediction based on location-scale families have been developed. Results on frequentist and Bayesian interval prediction have been obtained and studied. However, further investigations need to be carried out to study prediction when samples are based on families which are not of location-scale type, and the optimality of predictors resulting in such cases. Nonlinear prediction, optimality and cost still need further study.

REFERENCES

Ahsanullah, M. (1980). Linear prediction of record values for the two parameter exponential distribution. Ann. Instit. Statist. Math. 32, 363-368.
Ahsanullah, M. (1990). Estimation of the parameters of the Gumbel distribution based on m record values. Comput. Statist. Quart. 6, 231-239.
Aitchison, J. and Dunsmore, I. (1975). Statistical Prediction Analysis. Cambridge University Press, Cambridge.
AL-Hussaini, E.K. (1999)(a). Bayesian prediction under a mixture of two exponential components model based on type I censoring. J. Appl. Statist. Science 8, 173-185.
AL-Hussaini, E.K. (1999)(b). Predicting observables from a general class of distributions. J. Statist. Plann. Infer. 79, 79-91.
AL-Hussaini, E.K. and Jaheen, Z.F. (1995). Bayesian prediction bounds for the Burr type XII model. Commun. Statist.-Theory Meth. 24, 1829-1842.
AL-Hussaini, E.K. and Jaheen, Z.F. (1996). Bayesian prediction bounds for the Burr type XII distribution in the presence of outliers. J. Statist. Plann. Infer. 55, 23-37.
AL-Hussaini, E.K. and Jaheen, Z.F. (1999). Parametric prediction bounds for the future median of the exponential distribution. Statistics 32, 267-275.
AL-Hussaini, E.K.; Nigm, A.M. and Jaheen, Z.F. (2000). Bayesian prediction based on finite mixtures of Lomax components model and type I censoring. Statistics (to appear).
AL-Hussaini, E.K. and Osman, M.I. (1997). On the median of a finite mixture. J. Statist. Comput. Simul. 58, 121-144.
Arnold, B.C.; Balakrishnan, N. and Nagaraja, H.N. (1998). Records. Wiley, New York.
Balakrishnan, N.; Ahsanullah, M. and Chan, P.S. (1995). On the logistic record values and associated inference. J. Appl. Statist. Science 2, 233-248.
Balakrishnan, N. and Chan, P.S. (1994). Record values from Rayleigh and Weibull distributions and associated inference. NIST Special Publication 866, Proceedings of the Conference on Extreme Value Theory and Applications, vol. 3 (Eds., J. Galambos, J. Lechner and E. Simiu), pp. 41-51.
Balakrishnan, N. and Chan, P.S. (1998). On the normal record values and associated inference. Statist. Prob. Letters (to appear).
Balakrishnan, N. and Rao, C.R. (1997). Large sample approximations to the best linear unbiased estimation and best linear unbiased prediction based on censored samples and some applications. In: Advances in Statistical Decision Theory and Applications (Eds. S. Panchapakesan and N. Balakrishnan), Birkhauser, Boston, pp. 431-448.
Bernardo, J.M. and Smith, A.F.M. (1994). Bayesian Theory. Wiley, New York.
Berred, A.M. (1998). Prediction of record values. Commun. Statist.-Theory Meth. 27, 2221-2240.
Chan, P.S. (1998). Interval estimation of parameters of life based on record values. Statist. Prob. Letters (to appear).
Chhikara, R. and Guttman, I. (1982). Prediction limits for the Inverse Gaussian distributions. Technometrics 24, 319-324.
Corcuera, J.M. and Giummole, F. (1999). A generalized Bayes rule for prediction. Scand. J. Statist. 26, 265-279.
Doganaksoy, N. and Balakrishnan, N. (1997). A useful property of best linear unbiased predictors with applications to life testing. The Amer. Statist. 51, 22-28.
Dunsmore, I.R. (1974). The Bayesian predictive distribution in life testing models. Technometrics 16, 455-460.
Dunsmore, I.R. (1976). Asymptotic prediction analysis. Biometrika 63, 627-630.
Dunsmore, I.R. (1983). The future occurrence of records. Ann. Instit. Statist. Math. 35, 267-277.
Fligner, M.A. and Wolfe, D.A. (1976). Some applications of sample analogues to the probability integral transform and a coverage property. Amer. Statist. 30, 78-85.
Fligner, M.A. and Wolfe, D.A. (1979)(a). Methods for obtaining distribution-free prediction interval for the median of a future sample. J. Quality Tech. 11, 192-198.
Fligner, M.A. and Wolfe, D.A. (1979)(b). Nonparametric prediction intervals for a future sample median. J. Amer. Statist. Assoc. 74, 453-456.
Geisser, S. (1984). Predicting Pareto and exponential observables. Canad. J. Statist. 12, 143-152.
Geisser, S. (1986). Predictive analysis. In: Encyclopedia of Statistical Sciences, vol. 7 (Eds., S. Kotz, N.L. Johnson and C.B. Read), Wiley, New York, pp. 158-170.
Geisser, S. (1990). On hierarchical Bayes procedures for predicting simple exponential survival. Biometrics 46, 225-230.
Geisser, S. (1993). Predictive Inference: An Introduction. Chapman and Hall, London.
Guilbaud, O. (1983). Nonparametric prediction intervals for sample medians in the general case. J. Amer. Statist. Assoc. 78, 937-941.
Johnson, R.A.; Evans, J.W. and Green, D.W. (1999). Nonparametric Bayesian predictive distributions for future order statistics. Statist. Prob. Letters 41, 247-254.
Kaminsky, K.S. and Nelson, P.I. (1998). Prediction of order statistics. In: Handbook of Statistics (Eds., N. Balakrishnan and C.R. Rao), Elsevier Science, Amsterdam, vol. 17, 431-450.
Lingappaiah, G.S. (1978). Bayesian approach to the prediction problem in the exponential population. IEEE Trans. Rel. R-27, 222-225.
Lingappaiah, G.S. (1979). Bayesian approach to prediction and the spacings in the exponential distribution. Ann. Instit. Statist. Math. 31, 391-401.
Lingappaiah, G.S. (1980). Intermittent life testing and Bayesian approach to prediction with spacings in the exponential model. Statistica 40, 477-490.
Lingappaiah, G.S. (1986). Bayes prediction in exponential life-testing when sample size is a random variable. IEEE Trans. Rel. 35, 106-110.
Lingappaiah, G.S. (1989). Bayes prediction of maxima and minima in exponential life tests in the presence of outliers. J. Indust. Math. Soc. 39, 169-182.
Lwin, T. (1972). Estimating the tail of the Paretian law. Scandinavian Aktuarietidskr. 55, 170-178.
Nagaraja, H.N. (1984). Asymptotic linear prediction of extreme order statistics. Ann. Instit. Statist. Math. 36, 289-299.
Nagaraja, H.N. (1995). Prediction problems. In: The Exponential Distribution: Theory and Applications (Eds. N. Balakrishnan and A.P. Basu), Gordon and Breach, New York, pp. 139-163.
Padgett, W.J. (1982). An approximate prediction interval for the mean of future observations from the Inverse Gaussian distribution. J. Statist. Comp. Simul. 14, 199-209.
Padgett, W.J. and Tsoi, S.K. (1986). Prediction interval for observations from the Inverse Gaussian distribution. IEEE Trans. Rel. R-35, 406-408.
Patel, J.K. (1989). Prediction intervals - a review. Commun. Statist.-Theory Meth. 18, 2393-2465.
Seshadri, V. (1999). The Inverse Gaussian Distribution. Statistical Theory and Applications. Springer Verlag, Berlin.
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 247-270)
INFERENCE ON PARAMETERS OF THE LAPLACE DISTRIBUTION BASED ON TYPE-II CENSORED SAMPLES USING EDGEWORTH APPROXIMATION

N. Balakrishnan*, A. Childs*, Z. Govindarajulu‡ and M.P. Chandramouleeswaran†
*McMaster University, Hamilton, Ontario, Canada
†University of Missouri, Columbia, Missouri, USA
‡University of Kentucky, Lexington, Kentucky, USA
Keywords and Phrases; Order statistics; Type-II censored samples; Laplace distribution; Exponential distribution; Single moments; Double moments; Triple moments; Quadruple moments; Coefficients of skewness and kurtosis; Pivotal quantities; BLUE; MLE; Edgeworth approximation; Recurrence relations; Life-testing
ABSTRACT By deriving two recurrence relations which express the single and double moments of order statistics from a symmetric distribution in terms of the corresponding quantities from its folded distribution, Govindarajulu (1963, 1966) determined means, variances and covariances of Laplace order statistics (using the results on exponential order statistics) for sample sizes up to 20. He also tabulated the BLUE's (Best Linear Unbiased Estimators) of the location and scale parameter of the Laplace distribution based on complete and symmetrically Type-II censored samples. In this paper, we first establish similar relations for the computation of triple and quadruple moments. We then use these results to develop Edgeworth approximations for some pivotal quantities which will enable one to develop inference for the location and scale parameters. Next, we show that this method provides close approximations to percentage points of the pivotal quantities determined by Monte Carlo simulations. Finally, we present an example to illustrate the method of inference developed in this paper.
1 INTRODUCTION

Let $X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$ be a doubly Type-II censored sample available from the Laplace (or double exponential) population with pdf
\[ f(x;\mu,\sigma) = \frac{1}{2\sigma}\,e^{-|x-\mu|/\sigma} , \quad -\infty < x < \infty ,\ -\infty < \mu < \infty ,\ \sigma > 0 . \tag{1.1} \]
Here, out of $n$ items placed on a life-testing experiment, the smallest $r$ and the largest $s$ items have been censored. The most common situation in a life-testing problem is, of course, a Type-II right-censored sample (with $r = 0$ and $s > 0$), as the experimenter will often terminate the experiment as soon as a certain number of items have failed instead of waiting for all the items to fail; see Mann, Schafer and Singpurwalla (1974), Bain (1978), Lawless (1982), Cohen and Whitten (1988), Bain and Engelhardt (1991), and Balakrishnan and Cohen (1991). It should be mentioned here that Govindarajulu (1963) established two recurrence relations which express the single and double moments of order statistics from a symmetric distribution in terms of the corresponding quantities from its folded distribution. By using these relations and the exact explicit expressions of the single and double moments of exponential order statistics, Govindarajulu (1966) determined the means, variances and covariances of Laplace order statistics for $n$ up to 20; he also tabulated the BLUE's of the parameters $\mu$ and $\sigma$ based on symmetrically Type-II censored samples (with $r = s$) for $n$ up to 20. Raghunandanan and Srinivasan (1971) derived simplified linear estimators of $\mu$ and $\sigma$ in this situation. Ali, Umbach and Hassanein (1981) discussed the estimation of quantiles based on optimally selected order statistics. Shyu and Owen (1986a,b) constructed the one-sided and two-sided tolerance intervals based on complete samples. Balakrishnan and Ambagaspitiya (1988) discussed the robustness features of various linear estimators of $\mu$ and $\sigma$ when a single scale-outlier is present in the sample, and Childs and Balakrishnan (1997a) extended this work to include multiple outliers.
Recently, Childs and Balakrishnan (1997b) extended the work of Balakrishnan and Cutler (1995) by deriving the MLE's of $\mu$ and $\sigma$ based on general Type-II censored samples. Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996) extended the work of Govindarajulu (1966) and presented tables of BLUE's and the variances and covariance of these estimators for $n$ up to 20 for the case of right-censored samples (with $r = 0$ and $s = 0(1)n-2$). They also presented some percentage points of three pivotal quantities which will enable one to construct confidence intervals and carry out tests of hypotheses for the parameters $\mu$ and $\sigma$. Balakrishnan and Chandramouleeswaran (1996a) discussed the estimation of the reliability function and the construction of lower and upper tolerance limits based on the BLUE's when the available sample is Type-II right-censored. Balakrishnan and Chandramouleeswaran (1996b) considered the problem of predicting the time of failure of a surviving item and also predicting the lifetime of an item from a future sample, for the case when a Type-II right-censored sample is available from a life-test. In this paper, we first extend the work of Govindarajulu (1963) and derive recurrence relations that will enable the computation of triple and quadruple moments of order statistics from a symmetric distribution in terms of the corresponding quantities from its folded distribution. These results, along with the exact explicit expressions of these moments of exponential order statistics, will enable one to compute all moments of order up to 4 for the Laplace order statistics. These quantities are used to determine the coefficients of skewness and kurtosis of some pivotal quantities based on the BLUE's and MLE's of $\mu$ and $\sigma$, and then to propose Edgeworth approximations for the distributions of these pivotal quantities. The proposed approximations will enable one to develop inference for the parameters $\mu$ and $\sigma$ based on Type-II
censored samples. We then show that this method provides close approximations to percentage points of the pivotal quantities determined by Monte Carlo simulations. Finally, we present a numerical example to illustrate the method of inference developed in this paper. Similar work in the case of the exponential distribution has been carried out recently by Balakrishnan and Gupta (1998).
2 RECURRENCE RELATIONS FOR ORDER STATISTICS FROM TWO RELATED DISTRIBUTIONS

Let $Z_1, Z_2, \ldots, Z_n$ be I.I.D. random variables with probability density function $f(z)$ symmetric about 0 (without loss of generality), and cumulative distribution function $F(z)$. Then, for $z > 0$, let

$$F^*(z) = 2F(z) - 1 \quad \text{and} \quad f^*(z) = 2f(z). \tag{2.1}$$
That is, the density function $f^*(z)$ is obtained by folding the density function $f(z)$ at zero (the point of symmetry). Let $Y_{1:n} \le Y_{2:n} \le \cdots \le Y_{n:n}$ denote the order statistics obtained from n I.I.D. random variables $Y_1, Y_2, \ldots, Y_n$ having probability density function $f^*(z)$ and cumulative distribution function $F^*(z)$ as given in (2.1), and let $Z_{1:n} \le Z_{2:n} \le \cdots \le Z_{n:n}$ denote the order statistics obtained from the random variables $Z_1, Z_2, \ldots, Z_n$. Let us now denote the single moments $E(Z_{i:n}^a)$ by $\mu_{i:n}^{(a)}$, the double moments $E(Z_{i:n}^a Z_{j:n}^b)$ by $\mu_{i,j:n}^{(a,b)}$, the triple moments $E(Z_{i:n}^a Z_{j:n}^b Z_{k:n}^c)$ by $\mu_{i,j,k:n}^{(a,b,c)}$, and the quadruple moments $E(Z_{i:n}^a Z_{j:n}^b Z_{k:n}^c Z_{l:n}^d)$ by $\mu_{i,j,k,l:n}^{(a,b,c,d)}$, for $1 \le i < j < k < l \le n$ and $a, b, c, d > 0$. Similarly, let us denote the corresponding moments of order statistics $Y_{i:n}$ by $\nu_{i:n}^{(a)}$, $\nu_{i,j:n}^{(a,b)}$, $\nu_{i,j,k:n}^{(a,b,c)}$ and $\nu_{i,j,k,l:n}^{(a,b,c,d)}$ for $1 \le i < j < k < l \le n$ and $a, b, c, d > 0$. Then by making use of the relations in (2.1), Govindarajulu (1963) established the following two recurrence relations.
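Relation 1: For $i = 1, 2, \ldots, n$ and $a \ge 1$,

$$2^n \mu_{i:n}^{(a)} = \sum_{t=0}^{i-1} \binom{n}{t} \nu_{i-t:n-t}^{(a)} + (-1)^a \sum_{t=i}^{n} \binom{n}{t} \nu_{t-i+1:t}^{(a)}.$$

Relation 2: For $1 \le i < j \le n$ and $a, b \ge 1$,

$$2^n \mu_{i,j:n}^{(a,b)} = \sum_{t=0}^{i-1} \binom{n}{t} \nu_{i-t,j-t:n-t}^{(a,b)} + (-1)^a \sum_{t=i}^{j-1} \binom{n}{t} \nu_{t-i+1:t}^{(a)} \nu_{j-t:n-t}^{(b)} + (-1)^{a+b} \sum_{t=j}^{n} \binom{n}{t} \nu_{t-j+1,t-i+1:t}^{(b,a)}.$$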
It is worth mentioning here that the above two relations have been generalized recently by Balakrishnan (1989) and Balakrishnan, Govindarajulu and Balasubramanian (1993) to the case when the order statistics $Z_{i:n}$ arise from n independent and non-identically distributed symmetric random variables and arbitrarily distributed symmetric random variables, respectively. By proceeding along the lines of Govindarajulu (1963), we now establish two recurrence relations which will express the triple and quadruple moments of order statistics $Z_{i:n}$ in terms of the corresponding quantities of order statistics $Y_{i:n}$.

Relation 3: For $1 \le i < j < k \le n$ and $a, b, c \ge 1$,

$$2^n \mu_{i,j,k:n}^{(a,b,c)} = \sum_{t=0}^{i-1} \binom{n}{t} \nu_{i-t,j-t,k-t:n-t}^{(a,b,c)} + (-1)^a \sum_{t=i}^{j-1} \binom{n}{t} \nu_{t-i+1:t}^{(a)} \nu_{j-t,k-t:n-t}^{(b,c)} + (-1)^{a+b} \sum_{t=j}^{k-1} \binom{n}{t} \nu_{t-j+1,t-i+1:t}^{(b,a)} \nu_{k-t:n-t}^{(c)} + (-1)^{a+b+c} \sum_{t=k}^{n} \binom{n}{t} \nu_{t-k+1,t-j+1,t-i+1:t}^{(c,b,a)}.$$
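Proof: The relation is proved by considering the triple integral expression of $\mu_{i,j,k:n}^{(a,b,c)}$ over the range $(-\infty < x_i < x_j < x_k < \infty)$ and splitting the range into four parts as $(0 < x_i < x_j < x_k < \infty)$, $(-\infty < x_i < 0,\ 0 < x_j < x_k < \infty)$, $(-\infty < x_i < x_j < 0,\ 0 < x_k < \infty)$, and $(-\infty < x_i < x_j < x_k < 0)$, and then using the relations in (2.1) in each of these four integrals.

Proceeding similarly, we may also establish the following relation for the quadruple moments of order statistics $Z_{i:n}$.

Relation 4: For $1 \le i < j < k < l \le n$ and $a, b, c, d \ge 1$,

$$2^n \mu_{i,j,k,l:n}^{(a,b,c,d)} = \sum_{t=0}^{i-1} \binom{n}{t} \nu_{i-t,j-t,k-t,l-t:n-t}^{(a,b,c,d)} + (-1)^a \sum_{t=i}^{j-1} \binom{n}{t} \nu_{t-i+1:t}^{(a)} \nu_{j-t,k-t,l-t:n-t}^{(b,c,d)} + (-1)^{a+b} \sum_{t=j}^{k-1} \binom{n}{t} \nu_{t-j+1,t-i+1:t}^{(b,a)} \nu_{k-t,l-t:n-t}^{(c,d)} + (-1)^{a+b+c} \sum_{t=k}^{l-1} \binom{n}{t} \nu_{t-k+1,t-j+1,t-i+1:t}^{(c,b,a)} \nu_{l-t:n-t}^{(d)} + (-1)^{a+b+c+d} \sum_{t=l}^{n} \binom{n}{t} \nu_{t-l+1,t-k+1,t-j+1,t-i+1:t}^{(d,c,b,a)}.$$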
Relations 1-4 can be used recursively to compute the single, double, triple and quadruple moments of order statistics from a symmetric distribution by making use of the corresponding quantities from its folded distribution. In particular, after computing all these moments of order up to 4, one can determine the mean, variance and the coefficients of skewness and kurtosis of any linear function of order statistics from that symmetric distribution. These measures can then be utilized to develop Edgeworth approximations for distributions of linear functions of order statistics, as we will illustrate in Section 5.

For the case when the order statistics $Z_{i:n}$ arise from the standard Laplace distribution [with $\mu = 0$ and $\sigma = 1$ in (1.1)], the computation of the single, double, triple and quadruple moments by means of Relations 1-4 requires the knowledge of these moments from the standard exponential distribution, with pdf $f^*(z) = e^{-z}$, $z > 0$. In this case, the additive Markov chain representation of standard exponential order statistics [Sukhatme (1937), Renyi (1953)] given by

$$Y_{i:n} = \sum_{t=1}^{i} \frac{E_t}{n - t + 1}, \quad i = 1, 2, \ldots, n,$$
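where $E_t$'s are I.I.D. standard exponential random variables, makes it possible to write down the necessary moments $\nu_{i:n}^{(a)}$, $\nu_{i,j:n}^{(a,b)}$, $\nu_{i,j,k:n}^{(a,b,c)}$ and $\nu_{i,j,k,l:n}^{(a,b,c,d)}$ in simple explicit algebraic forms. For example,

$$\nu_{i:n} = \sum_{t=1}^{i} \frac{1}{n - t + 1}, \tag{2.2}$$

$$\nu_{i:n}^{(2)} = \sum_{t=1}^{i} \frac{1}{(n - t + 1)^2} + \left( \sum_{t=1}^{i} \frac{1}{n - t + 1} \right)^2, \tag{2.3}$$

and, for $i < j$,

$$\nu_{i,j:n} = \sum_{t=1}^{i} \frac{1}{(n - t + 1)^2} + \left( \sum_{t=1}^{i} \frac{1}{n - t + 1} \right) \left( \sum_{t=1}^{j} \frac{1}{n - t + 1} \right). \tag{2.4}$$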
Alternatively, one may use the recurrence relations for the single and product moments of exponential order statistics derived by Joshi (1978, 1982), along with the recurrence relations for the triple and quadruple moments derived recently by Balakrishnan and Gupta (1998), in order to compute the necessary moments $\nu_{i:n}^{(a)}$, $\nu_{i,j:n}^{(a,b)}$, $\nu_{i,j,k:n}^{(a,b,c)}$ and $\nu_{i,j,k,l:n}^{(a,b,c,d)}$ in a simple recursive manner.
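As a rough illustration of how the exponential moments feed Relations 1 and 2, the following Python sketch computes the first two single moments and the product moments of standard Laplace order statistics. It assumes the standard forms of the exponential moments (mean and variance as partial sums of $1/(n-t+1)$ and $1/(n-t+1)^2$) and of Govindarajulu's relations; the function names are illustrative choices made here, and only the cases a = 1, 2 and a = b = 1 are covered.

```python
from math import comb


def nu1(i, m):
    """E(Y_{i:m}) for standard exponential order statistics."""
    return sum(1.0 / (m - t + 1) for t in range(1, i + 1))


def _var(i, m):
    """Var(Y_{i:m}); also equals Cov(Y_{i:m}, Y_{j:m}) for i < j."""
    return sum(1.0 / (m - t + 1) ** 2 for t in range(1, i + 1))


def nu2(i, m):
    """E(Y_{i:m}^2)."""
    return _var(i, m) + nu1(i, m) ** 2


def nu11(i, j, m):
    """E(Y_{i:m} Y_{j:m}) for i < j."""
    return _var(i, m) + nu1(i, m) * nu1(j, m)


def _relation1(i, n, nu, sign):
    # t counts how many of the n symmetric variables fall below zero.
    pos = sum(comb(n, t) * nu(i - t, n - t) for t in range(0, i))
    neg = sum(comb(n, t) * nu(t - i + 1, t) for t in range(i, n + 1))
    return (pos + sign * neg) / 2 ** n


def laplace_mu1(i, n):
    """E(Z_{i:n}) via Relation 1 with a = 1, so (-1)^a = -1."""
    return _relation1(i, n, nu1, -1)


def laplace_mu2(i, n):
    """E(Z_{i:n}^2) via Relation 1 with a = 2, so (-1)^a = +1."""
    return _relation1(i, n, nu2, +1)


def laplace_mu11(i, j, n):
    """E(Z_{i:n} Z_{j:n}), i < j, via Relation 2 with a = b = 1."""
    both_pos = sum(comb(n, t) * nu11(i - t, j - t, n - t) for t in range(0, i))
    mixed = sum(comb(n, t) * nu1(t - i + 1, t) * nu1(j - t, n - t)
                for t in range(i, j))
    both_neg = sum(comb(n, t) * nu11(t - j + 1, t - i + 1, t)
                   for t in range(j, n + 1))
    return (both_pos - mixed + both_neg) / 2 ** n


if __name__ == "__main__":
    n = 5
    print([round(laplace_mu1(i, n), 4) for i in range(1, n + 1)])
```

A convenient sanity check on such a routine is that the second moments and product moments must add up to $E(\sum Z_i)^2 = 2n$, since the standard Laplace distribution has variance 2.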
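3 BLUE'S OF $\mu$ AND $\sigma$

Let $X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$ denote a general doubly Type-II censored sample from the Laplace distribution in (1.1), and let $Z_{i:n} = (X_{i:n} - \mu)/\sigma$, $i = r+1, r+2, \ldots, n-s$, be the corresponding order statistics from the standard Laplace distribution. Let us denote $E(Z_{i:n})$ by $\mu_{i:n}$, $\mathrm{Var}(Z_{i:n})$ by $\sigma_{i,i:n}$, and $\mathrm{Cov}(Z_{i:n}, Z_{j:n})$ by $\sigma_{i,j:n}$; further, let

$$\mathbf{X} = (X_{r+1:n}, X_{r+2:n}, \ldots, X_{n-s:n})^T, \quad \boldsymbol{\mu} = (\mu_{r+1:n}, \mu_{r+2:n}, \ldots, \mu_{n-s:n})^T, \quad \mathbf{1} = (1, 1, \ldots, 1)^T,$$

and $\boldsymbol{\Sigma} = ((\sigma_{i,j:n}))$, $r+1 \le i, j \le n-s$. Then, the BLUE's of $\mu$ and $\sigma$ are given by

$$\mu^* = \left\{ \frac{\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu} \mathbf{1}^T \boldsymbol{\Sigma}^{-1} - \boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \mathbf{1} \boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1}}{(\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu})(\mathbf{1}^T \boldsymbol{\Sigma}^{-1} \mathbf{1}) - (\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \mathbf{1})^2} \right\} \mathbf{X} = \sum_{i=r+1}^{n-s} a_i^* X_{i:n} \tag{3.1}$$

and

$$\sigma^* = \left\{ \frac{\mathbf{1}^T \boldsymbol{\Sigma}^{-1} \mathbf{1} \boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} - \mathbf{1}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu} \mathbf{1}^T \boldsymbol{\Sigma}^{-1}}{(\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu})(\mathbf{1}^T \boldsymbol{\Sigma}^{-1} \mathbf{1}) - (\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \mathbf{1})^2} \right\} \mathbf{X} = \sum_{i=r+1}^{n-s} b_i^* X_{i:n}. \tag{3.2}$$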
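Furthermore, the variances and covariances of these BLUE's are given by

$$\mathrm{Var}(\mu^*) = \sigma^2 \left\{ \frac{\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}}{(\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu})(\mathbf{1}^T \boldsymbol{\Sigma}^{-1} \mathbf{1}) - (\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \mathbf{1})^2} \right\} = \sigma^2 V_1^*, \tag{3.3}$$

$$\mathrm{Var}(\sigma^*) = \sigma^2 \left\{ \frac{\mathbf{1}^T \boldsymbol{\Sigma}^{-1} \mathbf{1}}{(\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu})(\mathbf{1}^T \boldsymbol{\Sigma}^{-1} \mathbf{1}) - (\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \mathbf{1})^2} \right\} = \sigma^2 V_2^*, \tag{3.4}$$

and

$$\mathrm{Cov}(\mu^*, \sigma^*) = \sigma^2 \left\{ \frac{-\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \mathbf{1}}{(\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu})(\mathbf{1}^T \boldsymbol{\Sigma}^{-1} \mathbf{1}) - (\boldsymbol{\mu}^T \boldsymbol{\Sigma}^{-1} \mathbf{1})^2} \right\};$$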
for details, refer to David (1981), Balakrishnan and Cohen (1991), and Arnold, Balakrishnan and Nagaraja (1992). From Equations (3.1) and (3.2), Govindarajulu (1966) computed the coefficients $a_i^*$ and $b_i^*$ for symmetrically Type-II censored samples (with r = s) for sample sizes n up to 20. He also presented the values of $V_1^*$ and $V_2^*$ in Equations (3.3) and (3.4). Similar tables have been prepared recently by Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996) for Type-II right-censored samples (with r = 0 and s = 0(1)n - 2) for sample sizes n = 3(1)20. They also presented some percentage points of three pivotal quantities, based on BLUE's $\mu^*$ and $\sigma^*$, which will enable one to construct confidence intervals and carry out tests of hypotheses for the parameters $\mu$ and $\sigma$.
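The BLUE coefficients in (3.1) and (3.2) can be sketched in a few lines of plain Python. The sketch below assumes the usual generalized least-squares form of the BLUE's; the helper names `solve` and `blue_coefficients` are hypothetical, and the demonstration uses the exact means and covariances of uniform(0, 1) order statistics only as a convenient stand-in for the Laplace moments, which would in practice come from Relations 1 and 2.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda q: abs(M[q][c]))
        M[c], M[p] = M[p], M[c]
        for q in range(c + 1, n):
            f = M[q][c] / M[c][c]
            for k in range(c, n + 1):
                M[q][k] -= f * M[c][k]
    x = [0.0] * n
    for c in range(n - 1, -1, -1):
        x[c] = (M[c][n] - sum(M[c][k] * x[k] for k in range(c + 1, n))) / M[c][c]
    return x


def blue_coefficients(mu, Sigma):
    """Return (a, b, V1, V2) for mean vector mu and covariance matrix Sigma
    of the standardized order statistics that were actually observed."""
    ones = [1.0] * len(mu)
    si_mu = solve(Sigma, mu)      # Sigma^{-1} mu
    si_one = solve(Sigma, ones)   # Sigma^{-1} 1
    A = sum(m * v for m, v in zip(mu, si_mu))      # mu' Sigma^{-1} mu
    B = sum(si_one)                                # 1' Sigma^{-1} 1
    C = sum(m * v for m, v in zip(mu, si_one))     # mu' Sigma^{-1} 1
    delta = A * B - C * C
    a = [(A * u - C * v) / delta for u, v in zip(si_one, si_mu)]
    b = [(B * v - C * u) / delta for u, v in zip(si_one, si_mu)]
    return a, b, A / delta, B / delta


# Stand-in moments: uniform(0, 1) order statistics, n = 6, r = 1, s = 1.
n, r, s = 6, 1, 1
idx = list(range(r + 1, n - s + 1))
mu = [i / (n + 1) for i in idx]
Sigma = [[min(i, j) * (n + 1 - max(i, j)) / ((n + 1) ** 2 * (n + 2))
          for j in idx] for i in idx]
a, b, V1, V2 = blue_coefficients(mu, Sigma)
```

The unbiasedness conditions give a quick check: the $a_i^*$ should sum to 1 and be orthogonal to the mean vector, while the $b_i^*$ should sum to 0 and have inner product 1 with it.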
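4 MLE'S OF $\mu$ AND $\sigma$

Recently, Childs and Balakrishnan (1997b) derived the MLE's of $\mu$ and $\sigma$ based on the general doubly Type-II censored sample, $X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$, from the Laplace distribution (1.1). They found that when $r + 1 < n - s \le n/2$,

$$\hat{\mu} = X_{n-s:n} - \hat{\sigma} \ln\left( \frac{n - s}{n/2} \right) \tag{4.1}$$

and

$$\hat{\sigma} = \frac{1}{n - s - r} \left[ (n - s) X_{n-s:n} - \sum_{i=r+1}^{n-s} X_{i:n} - r X_{r+1:n} \right]; \tag{4.2}$$

when $\frac{n}{2} + 1 \le r + 1 < n - s$,

$$\hat{\mu} = X_{r+1:n} + \hat{\sigma} \ln\left( \frac{n - r}{n/2} \right) \tag{4.3}$$

and

$$\hat{\sigma} = \frac{1}{n - s - r} \left[ s X_{n-s:n} + \sum_{i=r+1}^{n-s} X_{i:n} - (n - r) X_{r+1:n} \right]; \tag{4.4}$$

and when $r + 1 \le n/2 < n - s$,

$$\hat{\mu} = \begin{cases} \frac{1}{2}(X_{m:2m} + X_{m+1:2m}) & \text{if n is even, } n = 2m, \\ X_{m+1:2m+1} & \text{if n is odd, } n = 2m + 1 \end{cases} \tag{4.5}$$

(actually, when n is even any value in $[X_{m:2m}, X_{m+1:2m}]$ is an MLE for $\mu$, but we will use the one given since it is unbiased) and

$$\hat{\sigma} = \frac{1}{n - s - r} \begin{cases} s X_{n-s:n} + \sum_{i=m+1}^{n-s} X_{i:n} - \sum_{i=r+1}^{m} X_{i:n} - r X_{r+1:n} & \text{if n is even, } n = 2m, \\ s X_{n-s:n} + \sum_{i=m+2}^{n-s} X_{i:n} - \sum_{i=r+1}^{m} X_{i:n} - r X_{r+1:n} & \text{if n is odd, } n = 2m + 1. \end{cases} \tag{4.6}$$

Unlike the BLUE's presented in the previous section, the MLE's are explicit linear functions of order statistics,

$$\hat{\mu} = \sum_{i=r+1}^{n-s} a_i X_{i:n}, \qquad \hat{\sigma} = \sum_{i=r+1}^{n-s} b_i X_{i:n},$$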
where special tables are not required for the computation of the $a_i$'s or $b_i$'s [they are given explicitly in (4.1)-(4.6)]. Since the MLE for $\sigma$ is not unbiased, we will obtain an unbiased estimator $\tilde{\sigma}$ of $\sigma$, based on the MLE (to be used in the following section), by dividing by its expected value when the underlying random variables come from the standard Laplace distribution ($\mu = 0$, $\sigma = 1$),

$$\tilde{\sigma} = \frac{\sum_{i=r+1}^{n-s} b_i X_{i:n}}{\sum_{i=r+1}^{n-s} b_i \mu_{i:n}}, \tag{4.7}$$

where $\mu_{i:n}$ is given in Relation 1, and the corresponding $\nu_{i:n}$ is given in (2.2). Furthermore, a closed form expression for its variance may be obtained using Relations 1 and 2 in conjunction with Equations (2.2)-(2.4). We have,

$$\mathrm{Var}(\tilde{\sigma}) = \sigma^2 \left\{ \frac{\sum_{i=r+1}^{n-s} b_i^2 \mu_{i:n}^{(2)} + 2 \sum_{i=r+1}^{n-s-1} \sum_{j=i+1}^{n-s} b_i b_j \mu_{i,j:n}}{\left( \sum_{i=r+1}^{n-s} b_i \mu_{i:n} \right)^2} - 1 \right\},$$

where exact explicit expressions for $\mu_{i:n}$, $\mu_{i:n}^{(2)}$ and $\mu_{i,j:n}$ are given in Relations 1 and 2, with the corresponding $\nu_{i:n}$, $\nu_{i:n}^{(2)}$ and $\nu_{i,j:n}$ given in Equations (2.2)-(2.4). An analogous equation holds for the variance of $\hat{\mu}$.
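The three censoring cases of the MLE's translate directly into code. The sketch below is one possible reading, assuming the Laplace density in (1.1) has the form $\frac{1}{2\sigma} e^{-|x-\mu|/\sigma}$ and that the case boundaries are as follows: all observations in the lower half, all in the upper half, or the median observed. The function name `laplace_mle` and its argument layout are hypothetical, and the sample used is the simulated data set of Section 6.

```python
from math import log


def laplace_mle(x, n, r, s):
    """MLE's (mu_hat, sigma_hat) from the ordered observed values
    x = [X_{r+1:n}, ..., X_{n-s:n}] of a doubly Type-II censored sample."""
    k = n - r - s
    if len(x) != k or k < 2:
        raise ValueError("x must hold the n - r - s >= 2 observed values")

    def X(i):
        # 1-based order-statistic index i -> position in x
        return x[i - r - 1]

    total = sum(x)
    if n - s <= n / 2:
        # Every observation lies in the lower half of the sample.
        sigma = ((n - s) * X(n - s) - total - r * X(r + 1)) / k
        mu = X(n - s) - sigma * log((n - s) / (n / 2))
    elif r >= n / 2:
        # Every observation lies in the upper half of the sample.
        sigma = (s * X(n - s) + total - (n - r) * X(r + 1)) / k
        mu = X(r + 1) + sigma * log((n - r) / (n / 2))
    else:
        # The sample median is observed (boundary handling for odd n is
        # this sketch's own reading of the case split).
        m = n // 2
        if n % 2 == 0:
            mu = 0.5 * (X(m) + X(m + 1))
            first_upper = m + 1
        else:
            mu = X(m + 1)
            first_upper = m + 2
        upper = sum(X(i) for i in range(first_upper, n - s + 1))
        lower = sum(X(i) for i in range(r + 1, m + 1))
        sigma = (s * X(n - s) + upper - lower - r * X(r + 1)) / k
    return mu, sigma


data = [32.007, 37.757, 43.848, 46.268, 46.907, 47.262, 47.290, 47.593,
        48.065, 49.254, 50.278, 50.487, 50.662, 53.336, 53.493, 53.567,
        53.981, 54.942, 55.695, 66.396]
mu_hat, sigma_hat = laplace_mle(data, n=20, r=0, s=0)
print(round(mu_hat, 3), round(sigma_hat, 3))
```

Note that `sigma_hat` here is the raw (biased) MLE; the unbiased version used in the pivotal quantities is obtained by dividing by its expectation under the standard Laplace distribution.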
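5 PIVOTAL QUANTITIES AND INFERENCE

Based on the BLUE's $\mu^*$ and $\sigma^*$ in (3.1) and (3.2), let us define

$$P_1 = \frac{\mu^* - \mu}{\sigma \sqrt{V_1^*}}, \quad P_2 = \frac{\sigma^* - \sigma}{\sigma \sqrt{V_2^*}}, \quad \text{and} \quad P_3 = \frac{\mu^* - \mu}{\sigma^*}. \tag{5.1}$$

Let us also define analogous quantities based on the MLE's $\hat{\mu}$ and $\tilde{\sigma}$,

$$P_1' = \frac{\hat{\mu} - \mu}{\sigma \sqrt{V_1}}, \quad P_2' = \frac{\tilde{\sigma} - \sigma}{\sigma \sqrt{V_2}}, \quad \text{and} \quad P_3' = \frac{\hat{\mu} - \mu}{\tilde{\sigma}}, \tag{5.2}$$

where exact explicit expressions for $\sigma \sqrt{V_1} = \sqrt{\mathrm{Var}(\hat{\mu})}$ and $\sigma \sqrt{V_2} = \sqrt{\mathrm{Var}(\tilde{\sigma})}$ are described in Section 4. It is easily verified that all of the quantities in (5.1) and (5.2) are pivotal quantities. $P_1$ and $P_1'$ can be used to draw inference for $\mu$ when $\sigma$ is known, while $P_3$ and $P_3'$ can be used to draw inference for $\mu$ when $\sigma$ is unknown. Similarly, $P_2$ and $P_2'$ can be used to draw inference for $\sigma$ when $\mu$ is unknown.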
By making use of the results developed in Section 2, we propose Edgeworth approximations to the distributions of $P_1$, $P_1'$, $P_2$ and $P_2'$ and examine their effectiveness by comparing the approximate results with simulated results. First of all, note that $P_1$ and $P_2$ in (5.1) can be written as

$$P_1 = \frac{\sum_{i=r+1}^{n-s} a_i^* Z_{i:n}}{\sqrt{V_1^*}} \quad \text{and} \quad P_2 = \frac{\sum_{i=r+1}^{n-s} b_i^* Z_{i:n} - 1}{\sqrt{V_2^*}},$$

while $P_1'$ and $P_2'$ in (5.2) can similarly be written as

$$P_1' = \frac{\sum_{i=r+1}^{n-s} a_i Z_{i:n}}{\sqrt{V_1}} \quad \text{and} \quad P_2' = \frac{\sum_{i=r+1}^{n-s} b_i Z_{i:n} \Big/ \sum_{i=r+1}^{n-s} b_i \mu_{i:n} - 1}{\sqrt{V_2}}.$$

Thus, they are linear functions of order statistics $Z_{i:n}$ from the standard Laplace distribution. By making use of the relations in Section 2, the values of $a_i^*$, $b_i^*$, $V_1^*$ and $V_2^*$ [tabulated by Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996)], and the exact values of $a_i$, $b_i$ [given in (4.5) and obtained from (4.6), respectively], $V_1$ and $V_2$ (from the exact explicit expressions described in Section 4) for the case of Type-II right-censored samples, $X_{1:n} \le X_{2:n} \le \cdots \le X_{n-s:n}$, we determined the values of the mean, variance and the coefficients of skewness and kurtosis ($\sqrt{\beta_1}$ and $\beta_2$) of $\mu^*$, $\sigma^*$, $\hat{\mu}$ and $\tilde{\sigma}$ (computed under $\mu = 0$ and $\sigma = 1$) for n = 5(1)10(5)20 and s = 0(1)(n/2 - 1) for n even and s = 0(1)[n/2] for n odd. These values for $\mu^*$ and $\sigma^*$ are presented in Table 1, while the results for $\hat{\mu}$ and $\tilde{\sigma}$ are given in Table 2.

An examination of the $\beta_2$ values in Table 1 reveals that the distribution of $\mu^*$ (and hence of $P_1$) is slightly heavier tailed than normal and, therefore, an Edgeworth approximation in this case will be quite appropriate. An examination of the $(\sqrt{\beta_1}, \beta_2)$ values in Table 1 reveals that the distribution of $\sigma^*$ (and hence of $P_2$) is positively skewed and also heavier tailed than the normal, but lies in the range of an Edgeworth approximation; for details on the possible range for Edgeworth approximation, see Barton and Dennis (1952) and Johnson, Kotz and Balakrishnan (1994). Similar observations may be made from Table 2 regarding the distributions of $P_1'$ and $P_2'$. Note also that the variance and $(\sqrt{\beta_1}, \beta_2)$ values for $\sigma^*$ and $\tilde{\sigma}$ are very similar, often agreeing up to the second decimal place in the $(\sqrt{\beta_1}, \beta_2)$ values, and the third decimal place in the variance values.

The Edgeworth approximation for the distribution of a standardized statistic T (with mean 0 and variance 1) is given by

$$F(t) \approx \Phi(t) - \phi(t) \left\{ \frac{\sqrt{\beta_1}}{6}(t^2 - 1) + \frac{\beta_2 - 3}{24}(t^3 - 3t) + \frac{\beta_1}{72}(t^5 - 10t^3 + 15t) \right\}, \tag{5.3}$$

where $\sqrt{\beta_1}$ and $\beta_2$ are the coefficients of skewness and kurtosis, respectively, of T, and $\Phi(t)$ is the cumulative distribution function of the standard normal distribution with corresponding pdf $\phi(t)$. By making use of the entries in Tables 1 and 2, we determined the lower and upper 1%,
2.5%, 5% and 10% points of $P_1$, $P_1'$, $P_2$ and $P_2'$ through the Edgeworth approximation in (5.3). These values, for the case of Type-II right-censored samples (r = 0) for s = 0(1)(n/2 - 1) for n even and s = 0(1)[n/2] for n odd and sample size n = 5(1)10(5)20, are presented in Tables 3-6. For the purpose of comparison, these percentage points were also determined by simulations (based on 5000 runs) and they are presented along with the Edgeworth percentage points in Tables 3-6.

From Tables 3 and 4 we see that the Edgeworth approximation of the distributions of $P_1$ and $P_1'$ provides quite close agreement with the simulated percentage points. The largest discrepancy occurs at the extreme lower and upper tails of the distribution, but only for small sample sizes. As the sample size increases, the agreement becomes quite close at all levels of censoring, even at the extremes of the distribution. From Tables 5 and 6 we see that the Edgeworth approximation of the distributions of $P_2$ and $P_2'$ also provides close agreement with the simulated percentage points. This time, however, the discrepancy for small sample sizes occurs only at the upper tail. But again, as the sample size increases, the discrepancy becomes quite small at all levels of censoring, even at the extremes of the distribution.

In conclusion, we observe that the Edgeworth approximations of the distributions of $P_1$, $P_1'$, $P_2$ and $P_2'$ all work quite satisfactorily even in samples of size as small as 5, and they indeed improve in accuracy as the sample size n increases. We recommend the use of the pivotal quantities $P_1'$ and $P_2'$ based on the MLE's since they require no special tables to use. It should also be pointed out here that a similar Edgeworth approximation cannot be developed for the percentage points of the pivotal quantities $P_3$ or $P_3'$ in (5.1) and (5.2) since they are not linear functions of order statistics.
However, as displayed in the next section, approximate inference based on $P_1$ with $\sigma$ replaced by $\sigma^*$, or based on $P_1'$ with $\sigma$ replaced by $\tilde{\sigma}$, provides quite close results to those based on $P_3$ or $P_3'$, respectively. For this purpose, we have presented in Tables 7 and 8 some selected percentage points of $P_3$ and $P_3'$ determined by simulations (based on 5000 runs).
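To make the use of the Edgeworth approximation concrete, here is a minimal Python sketch of the approximate distribution function and of percentage points obtained from it by bisection. It assumes the standard three-term Edgeworth form (skewness term, kurtosis term, squared-skewness term); the function names and the bisection bracket [-6, 6] are choices made here, and the inversion presumes the approximation is increasing near the requested level.

```python
from math import erf, exp, pi, sqrt


def edgeworth_cdf(t, sqrt_b1, b2):
    """Edgeworth approximation to P(T <= t) for a standardized statistic T
    with skewness sqrt_b1 and kurtosis b2."""
    Phi = 0.5 * (1.0 + erf(t / sqrt(2.0)))        # standard normal cdf
    phi = exp(-0.5 * t * t) / sqrt(2.0 * pi)      # standard normal pdf
    correction = (sqrt_b1 / 6.0 * (t ** 2 - 1.0)
                  + (b2 - 3.0) / 24.0 * (t ** 3 - 3.0 * t)
                  + sqrt_b1 ** 2 / 72.0 * (t ** 5 - 10.0 * t ** 3 + 15.0 * t))
    return Phi - phi * correction


def edgeworth_quantile(p, sqrt_b1, b2, lo=-6.0, hi=6.0, tol=1e-10):
    """Percentage point t with edgeworth_cdf(t) = p, found by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if edgeworth_cdf(mid, sqrt_b1, b2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Symmetric, slightly heavy-tailed case (skewness 0, kurtosis 3.5352).
    for p in (0.01, 0.025, 0.05, 0.10):
        print(p, round(edgeworth_quantile(p, 0.0, 3.5352), 2))
```

With zero skewness and kurtosis 3, the correction vanishes and the routine returns ordinary normal percentage points, which gives a simple consistency check.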
6 NUMERICAL ILLUSTRATION

In order to illustrate the usefulness of the inference procedures discussed in the previous sections, we consider here a simulated data set of size n = 20 (with $\mu = 50$ and $\sigma = 5$):

32.007, 37.757, 43.848, 46.268, 46.907, 47.262, 47.290, 47.593, 48.065, 49.254, 50.278, 50.487, 50.662, 53.336, 53.493, 53.567, 53.981, 54.942, 55.695, 66.396.

Using this sample, the BLUE's and MLE's were calculated based on complete as well as Type-II right-censored samples (r = 0) by making use of the tables of Balakrishnan, Chandramouleeswaran and Ambagaspitiya (1996), and the explicit expressions in (4.5) and (4.6), respectively. These estimates are presented in the following table.

 n    s    $\mu^*$    $\hat{\mu}$    $\sigma^*$    $\tilde{\sigma}$
 20   0    49.561     49.766         4.947         4.964
      1    49.561     49.766         4.635         4.653
      2    49.561     49.766         4.814         4.834
      3    49.561     49.766         4.931         4.952
With these estimates and the use of Tables 3 and 4, we determined the 90% confidence intervals for $\mu$ (when $\sigma$ is known to be 5) based on the Edgeworth approximation and on the simulated percentage points using both $P_1$ and $P_1'$. These are presented in the table below.

           Edgeworth C.I.                         Simulated C.I.
 n    s    $P_1$              $P_1'$              $P_1$              $P_1'$
 20   0    (47.501, 51.621)   (47.667, 51.865)    (47.454, 51.668)   (47.637, 51.895)
      1    (47.501, 51.621)   (47.667, 51.865)    (47.530, 51.656)   (47.637, 51.895)
      2    (47.501, 51.621)   (47.667, 51.865)    (47.517, 51.618)   (47.637, 51.895)
      3    (47.501, 51.621)   (47.667, 51.865)    (47.466, 51.631)   (47.637, 51.895)
It is clear that the confidence interval based on the Edgeworth approximation is very close to the confidence interval determined by simulations, at all levels of censoring. Similarly, with the use of Tables 5 and 6, we determined the 90% confidence intervals for $\sigma$, and they are presented below.

           Edgeworth C.I.                     Simulated C.I.
 n    s    $P_2$            $P_2'$            $P_2$            $P_2'$
 20   0    (3.535, 7.522)   (3.547, 7.550)    (3.565, 7.471)   (3.576, 7.526)
      1    (3.285, 7.138)   (3.298, 7.168)    (3.332, 7.226)   (3.344, 7.257)
      2    (3.383, 7.518)   (3.396, 7.550)    (3.381, 7.483)   (3.395, 7.518)
      3    (3.433, 7.820)   (3.447, 7.854)    (3.415, 7.795)   (3.429, 7.832)
Once again, we observe that the confidence intervals based on the Edgeworth approximation are very close to those based on simulations. In the case when $\sigma$ is unknown, the Edgeworth approximation method cannot be used to draw inference for $\mu$ using $P_3$ or $P_3'$. However, as pointed out in the last section, the Edgeworth approximation for the distribution of the pivotal quantity $P_1$ may be used in this case with $\sigma$ replaced by $\sigma^*$, or $P_1'$ may be used with $\sigma$ replaced by $\tilde{\sigma}$, in order to draw approximate inference for $\mu$. By this method, we determined the 90% confidence intervals for $\mu$ and these are presented in the following table for the choices of r = 0, s = 0(1)3. Also presented in this table are the corresponding 90% confidence intervals for $\mu$ based on the simulated percentage points of the pivotal quantities $P_3$ and $P_3'$ given in Tables 7 and 8.

           Edgeworth C.I.                                          Simulated C.I.
 n    s    $P_1$ ($\sigma = \sigma^*$)   $P_1'$ ($\sigma = \tilde{\sigma}$)   $P_3$              $P_3'$
 20   0    (47.526, 51.596)              (47.678, 51.854)                     (47.384, 51.738)   (47.483, 52.000)
      1    (47.654, 51.468)              (47.809, 51.723)                     (47.614, 51.647)   (47.719, 51.906)
      2    (47.581, 51.541)              (47.733, 51.799)                     (47.539, 51.583)   (47.687, 51.845)
      3    (47.532, 51.590)              (47.683, 51.849)                     (47.391, 51.681)   (47.538, 51.895)
It is quite clear that the confidence intervals using the approximate Edgeworth method based on the pivotal quantities $P_1$ or $P_1'$ are all very close to the confidence intervals determined by simulation of the distribution of the pivotal quantity $P_3$ or $P_3'$ at all levels of censoring.
7 RESULTS FOR GENERAL CENSORED SAMPLES

All the methods of inference described in the previous sections are equally applicable to general censored samples $X_{r+1:n} \le X_{r+2:n} \le \cdots \le X_{n-s:n}$, but the Edgeworth approximate percentage points of the pivotal quantities based on the BLUE's require special tables which at the present time do not exist for general censored samples. However, the Edgeworth approximate percentage points of the pivotal quantities based on the MLE's continue to require no special tables, as they are given explicitly in (4.1) and (4.2) or (4.5) and (4.6) depending on the level of censoring.

For the purpose of illustration, let us consider here the numerical example presented in the last section and assume that the smallest two and largest three observations have been censored, i.e., we take r = 2, s = 3 and n = 20. By explicitly computing the coefficients of the BLUE's for $\mu$ and $\sigma$ and using the formulas for the MLE's given in (4.5) and (4.6), we find that $\mu^* = 49.561$, $\hat{\mu} = 49.766$, $\sigma^* = 4.373$ and $\tilde{\sigma} = 4.397$. We determined the mean, variance and the coefficients of skewness and kurtosis of $\mu^*$, $\hat{\mu}$, $\sigma^*$ and $\tilde{\sigma}$ (under $\mu = 0$ and $\sigma = 1$) to be

                      Mean      Variance   $\sqrt{\beta_1}$   $\beta_2$
 $\mu^*$              0.0000    0.0637     0.0000             3.5352
 $\hat{\mu}$          0.0000    0.0666     0.0000             3.7475
 $\sigma^*$           1.0000    0.0693     0.5260             3.4151
 $\tilde{\sigma}$     1.0000    0.0693     0.5261             3.4153
By making use of these quantities, we determined the lower and upper 1%, 2.5%, 5% and 10% points of the distribution of the pivotal quantities $P_1$, $P_1'$, $P_2$ and $P_2'$ through the Edgeworth approximation in (5.3), and these are presented in the following table:

            1%      2.5%    5%      10%     90%    95%    97.5%   99%
 $P_1$     -2.47   -2.00   -1.63   -1.24    1.24   1.63   2.00    2.47
 $P_1'$    -2.53   -2.02   -1.63   -1.22    1.22   1.63   2.02    2.53
 $P_2$     -1.92   -1.70   -1.49   -1.22    1.32   1.78   2.21    2.74
 $P_2'$    -1.92   -1.70   -1.49   -1.22    1.32   1.78   2.21    2.74
We also simulated the percentage points of all of the pivotal quantities:

            1%      2.5%    5%      10%     90%    95%    97.5%   99%
 $P_1$     -2.45   -2.07   -1.69   -1.29    1.26   1.66   2.07    2.47
 $P_1'$    -2.58   -2.06   -1.70   -1.28    1.25   1.66   2.07    2.58
 $P_2$     -1.95   -1.70   -1.48   -1.22    1.29   1.73   2.07    2.45
 $P_2'$    -1.96   -1.70   -1.48   -1.22    1.29   1.73   2.08    2.45
 $P_3$     -0.64   -0.53   -0.44   -0.34    0.34   0.45   0.56    0.68
 $P_3'$    -0.68   -0.55   -0.45   -0.34    0.34   0.45   0.59    0.71
An approximate 90% Edgeworth confidence interval for $\mu$ (when $\sigma$ is unknown) is then obtained to be (47.762, 51.360) based on the BLUE's, and (47.916, 51.616) based on the MLE's. Similarly, the 90% Edgeworth approximate confidence interval for $\sigma$ is obtained to be (2.978, 7.195) based on the BLUE's, and (2.994, 7.235) based on the MLE's. Upon comparing these intervals with (47.593, 51.485), (47.787, 51.745), (3.005, 7.164) and (3.021, 7.204), the 90% confidence intervals for $\mu$ and $\sigma$ based on the BLUE's and MLE's, determined through Monte Carlo simulations of the percentage points of the distributions of the pivotal quantities $P_3$, $P_3'$, $P_2$ and $P_2'$, we observe that the Edgeworth approximation provides quite close results even in the case of general Type-II doubly-censored samples.

In conclusion, the numerical illustration in this and the last section clearly indicates the usefulness of the Edgeworth approximation method in developing inference for the parameters of the Laplace distribution based on Type-II censored samples.
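The intervals quoted above follow from inverting the pivotal quantities, which can be sketched as below. The helper names `ci_location` and `ci_scale` are hypothetical; the inputs are the estimates, variance factors and 5% and 95% Edgeworth percentage points reported in this section for r = 2, s = 3, n = 20, with the scale estimate substituted for the unknown $\sigma$ in the location interval.

```python
from math import sqrt


def ci_location(estimate, scale, V1, q_lower, q_upper):
    """Invert P = (estimate - mu) / (scale * sqrt(V1)), where q_lower and
    q_upper are the lower and upper percentage points of P."""
    half = scale * sqrt(V1)
    return estimate - q_upper * half, estimate - q_lower * half


def ci_scale(estimate, V2, q_lower, q_upper):
    """Invert P = (estimate - sigma) / (sigma * sqrt(V2)), i.e.
    sigma = estimate / (1 + P * sqrt(V2))."""
    root = sqrt(V2)
    return estimate / (1.0 + q_upper * root), estimate / (1.0 + q_lower * root)


# BLUE-based intervals (mu* = 49.561, sigma* = 4.373).
mu_blue = ci_location(49.561, 4.373, 0.0637, -1.63, 1.63)
sigma_blue = ci_scale(4.373, 0.0693, -1.49, 1.78)

# MLE-based intervals (mu_hat = 49.766, unbiased scale estimate 4.397).
mu_mle = ci_location(49.766, 4.397, 0.0666, -1.63, 1.63)
sigma_mle = ci_scale(4.397, 0.0693, -1.49, 1.78)

for name, ci in [("mu (BLUE)", mu_blue), ("mu (MLE)", mu_mle),
                 ("sigma (BLUE)", sigma_blue), ("sigma (MLE)", sigma_mle)]:
    print(name, tuple(round(v, 3) for v in ci))
```

Because the scale pivot is skewed, the interval for $\sigma$ is not symmetric about the estimate, which is why both tail points are passed in separately.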
Acknowledgments

The first two authors would like to thank the Natural Sciences and Engineering Research Council of Canada for funding this research.
References

Ali, M.M., Umbach, D., and Hassanein, K.M. (1981). Estimation of quantiles of exponential and double exponential distributions based on two order statistics, Communications in Statistics - Theory and Methods 10, 1921-1932.
Arnold, B.C., Balakrishnan, N. and Nagaraja, H.N. (1992). A First Course in Order Statistics, John Wiley & Sons, New York.
Bain, L.J. (1978). Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods, Marcel Dekker, New York.
Bain, L.J. and Engelhardt, M. (1991). Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods, Second edition, Marcel Dekker, New York.
Balakrishnan, N. (1989). Recurrence relations among moments of order statistics from two related sets of independent and non-identically distributed random variables, Annals of the Institute of Statistical Mathematics 41, 323-329.
Balakrishnan, N. and Ambagaspitiya, R.S. (1988). Relationships among moments of order statistics from two related outlier models and some applications, Communications in Statistics - Theory and Methods 17, 2327-2341.
Balakrishnan, N. and Chandramouleeswaran, M.P. (1996a). Reliability estimation and tolerance limits for Laplace distribution based on censored samples, Microelectronics and Reliability 36, 375-378.
Balakrishnan, N. and Chandramouleeswaran, M.P. (1996b). Prediction for the Laplace distribution based on Type-II censored samples, Microelectronics and Reliability (to appear).
Balakrishnan, N., Chandramouleeswaran, M.P., and Ambagaspitiya, R.S. (1996). BLUE's of location and scale parameters of Laplace distribution based on Type-II censored samples and associated inference, Microelectronics and Reliability 36, 371-374.
Balakrishnan, N. and Cohen, A.C. (1991). Order Statistics and Inference: Estimation Methods, Academic Press, Boston.
Balakrishnan, N. and Cutler, C.D. (1995). Maximum likelihood estimation of the Laplace parameters based on Type-II censored samples, In Statistical Theory and Applications: Papers in Honor of Herbert A. David (Eds., H.N. Nagaraja, P.K. Sen and D.F. Morrison), pp. 145-151, Springer-Verlag, New York.
Balakrishnan, N., Govindarajulu, Z., and Balasubramanian, K. (1993). Relationships between moments of two related sets of order statistics and some extensions, Annals of the Institute of Statistical Mathematics 45, 243-247.
Balakrishnan, N. and Gupta, S.S. (1998). Higher order moments of order statistics from exponential and right-truncated exponential distributions and applications to life-testing problems, In Handbook of Statistics 17: Order Statistics: Applications (Eds., N. Balakrishnan and C.R. Rao), 25-59.
Barton, D.E. and Dennis, K.E.R. (1952). The conditions under which Gram-Charlier and Edgeworth curves are positive definite and unimodal, Biometrika 39, 425-427.
Childs, A. and Balakrishnan, N. (1997a). Some extensions in the robust estimation of parameters of exponential and double exponential distributions in the presence of multiple outliers, In Handbook of Statistics 15: Robust Inference (Eds., C.R. Rao and G.S. Maddala), 201-235, Elsevier Science, North-Holland, Amsterdam.
Childs, A. and Balakrishnan, N. (1997b). Maximum likelihood estimation of Laplace parameters based on general Type-II censored samples, Statistische Hefte 38, 343-349.
Cohen, A.C. and Whitten, B.J. (1988). Parameter Estimation in Reliability and Life Span Models, Marcel Dekker, New York.
David, H.A. (1981). Order Statistics, Second edition, John Wiley & Sons, New York.
Govindarajulu, Z. (1963). Relationships among moments of order statistics in samples from two related populations, Technometrics 5, 514-518.
Govindarajulu, Z. (1966). Best linear estimates under symmetric censoring of the parameters of a double exponential population, Journal of the American Statistical Association 61, 248-258.
Johnson, N.L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions, Vol. 1, Second edition, John Wiley & Sons, New York.
Lawless, J.F. (1982). Statistical Models and Methods for Lifetime Data, John Wiley & Sons, New York.
Mann, N.R., Schafer, R.E., and Singpurwalla, N.D. (1974). Methods for Statistical Analysis of Reliability and Life Data, John Wiley & Sons, New York.
Raghunandanan, K. and Srinivasan, R. (1971). Simplified estimation of parameters in a double exponential distribution, Technometrics 13, 689-691.
Renyi, A. (1953). On the theory of order statistics, Acta Math. Acad. Sci. Hung. 4, 191-231.
Shyu, J.C. and Owen, D.B. (1986a). One-sided tolerance intervals for the two-parameter double exponential distribution, Communications in Statistics - Simulation and Computation 15, 101-119.
Shyu, J.C. and Owen, D.B. (1986b). Two-sided tolerance intervals for the two-parameter double exponential distribution, Communications in Statistics - Simulation and Computation 15, 479-495.
Sukhatme, P.V. (1937). Tests of significance for samples of the $\chi^2$ population with two degrees of freedom, Ann. Eugen. 8, 52-56.
Table 1. Mean, Variance and Coefficients of Skewness and Kurtosis of $\mu^*$ and $\sigma^*$
n 5
• s 0
b 7
p; Mean 0-0000 0-0000 0-0000 0.0D00 0.0000 0-0000 0-0000 0-0000
o-onoo
a 1
ID
IS
20
.
•>
0-0000 D-0000 0-0000 0-0000 0-0000 0-0000 D-DODO 0-000D 0-0000 0-0D00 D.0000 0-0000 0-0000 0-0000 0-DO0O 0.0000 0.0000 0.0000 0-OODO 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0-000D 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
Variance 0-31b1 0.317E 0.3450 0.2548 D.2S48 0.2511 0.2122 0-2122 0-2130 0-2216 0.1814 0.1414 0.1615 0.1854 0-1581 0.1581 0-1581 0.1511 0-1702 o-i3ii o-i3ii 0.1311 0-1401 0.1435 0.0880 0-0680 0.0880 0.0860 D.0860 0-0881 0.0810 0.0135 O.Db37 0.0t37 0-0b37 0-0b37 0-0b37 0.0b37 0.0b37 0-0t38 0-0b41 0-01.55
•IF,
P,
0-0000 0-Olbl -0-0752 D-000D 0-001,3 0-0122 0-0000 0.0024 0.0158 -0.0110 0-0000 0-0010 0-0013 -0-0101 0.0000 0-0004 0.004k 0.0D14 -O-lObS 0-0000 0-0002 0.0D22 0.0011 -0-0252 0-0000 0-0000 0-0000 0.0004 0-0022 0-0052 -0-0107 -0.1D7t 0.0000 0-0000 0-0000 0-0000 0.0001 D.0005 0-0011 0.D034 -0.0048 -0-0532
4.1231 4.1371 4.5658 4.0075 4.0D11 4.1513 3-1324 3.1321 3.1701 4-307b 3-8b73 3.8b74 3.8771 4.0300 3-8153 3-8153 3.8171 3.8730 4.134b 3.7701 3.7701 3.771b 3.7100 3.13b5 3.b22S 3.b225 3.b225 3-b22b 3-b231 3-b375 3.707b 3.8515 3.5352 3.5352 3.5352 3-5352 3.5352 3-5353 3-53b7 3.54b2 3.5857 3-b75b
A"
"Smi"' n 5 b 7
a i
10
15
20
s 0 1 2 0 1 2 0 1 2 3 0 1 2 3 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 5 b 7 0 1 2 3 4 5 b 7 6 1
Heart
Variance
1.0000 1.0000 1.0000 l-OODO 1-0000 1-0000 1.0000
0.2210 0.3001 0.4b35 0.1658 0.2214 0.3078 0.15b5 0.1856 0.2318 O.31b0 D.1351 D.15b4 0.16b7 0-2357 0.1110 0.1351 0.15b7 0.1685 0.2311 0.10b2 0.1181 0.1352 0.1575 0.1101 0.0b13 0.0745 0.0805 0.0875 O.OIbl D.lObS D-1207 0-1314 0-0514 0-0542 0-0573 Q.ObOS Q.0b48 D-0b13 0.0745 0.0806 0.0884 0-0178
i-oooo 1.0000 1.0000 1.0000 1.0000 1-0000 1.0000 1.0000 l-OODO 1-0000 1-0000
i-oooo
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1-0000 1-0000 1-0000 1.0000 1.0D00 1.0000 1.0000 1.0000 1-0000 1-0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
V*
0.1523 1.0815 1.3443 0.6581 0.1414 1.0120 0.7681 0-8575 0-1510 1-1145 0.733b 0-7881 0.85b7 O.lbOl 0.b68b 0.7332 0.7871 0-8515 0.1731 O.b510 0-b683 0-7354 0.7S74 0.6bb3 0-S2bl 0-5453 0-Sbb7 0-5108 0.bl71 0-b411 O.b105 0.7438 0.4534 0.4b5S 0.4787 0.4130 0.508b 0-5257 0.544b 0-5bbl 0.5117 0-b230
A 4-3b34 4-7558 5.70b1 4-1077 4.3533 4.7815 3-1345 4.104b 4.3545 4.6b04 3.6071 3.132b 4.1001 4.378b 3.7117 3.6071 3-1211 4.1056 4-4185 3-b3b0 3-7112 3-8053 3-1212 4.1231 3.4153 3-44b2 3.4820 3.5236 3-5731 3-b332 3-7140 3-8210 3-3084 3-3251 3-3436 3-3b4b 3-3882 3-4148 3.4451 3-4805 3-5244 3-5815
Table 2. Mean, Variance and Coefficients of Skewness and Kurtosis of $\hat{\mu}$ and $\tilde{\sigma}$
h n 5 b 7
S
s D 1 S D 1 2 Q 1 2 3 D 1
a 1
10
IS
3 D 1 5 3 4 0 1 2 3 4 D 1
a
5D
3 4 5 t 7 D 1
a 3 4
s b 7 A 1
flean O.DODD 0-00D0 0-0000 0-DDDD 0-ODOO 0-0000
Variance 0.3512 0.351B 0-3512 0-2b01 0-2b01 0-2b01
a-oooa a-oooo
o.asst o.asst
0-00Q0 0-0000 0.0D00
0.335b 0-235b 0.1873 0.1873 D-1873 D.1473 D.17S1 0-1751 0-1751 0-1751 D-1751
o-oooo O.DDOD
o.aoDO O.OODO D.DDOD 0-0000 D.OOOO 0.0000 0.0000 a.DODO O.ODOO O.OODO 0.D00D D.0000 D.00D0 0.D00D 0.0000 0.0000 0.0000 0.D00D 0.0000 D.D000 O.OODO O.DDDD 0-0000 0.0000 D.OODD 0.000D O.DOOO 0.0000 O-OODO
D.msa 0.1452
o-msa D-msa D.msa D.01b3 O-OlbS 0-01b3 0-D1b3
o.oibs D.D1b3 D-D1b3 0-01b3 O-Obbb O-Obbb O-Obbb O-Dbbb O-Obbb O-Obbb O-Obbb O-Obbb O-Obbb O-Dbbb
VA
A
O.ODOO 0.0000 0-0000 O.OODO 0.0000 O-DDOD 0-0000 O.OODO O.OODO 0.0000 0-0000 D.ODOO 0.0000 O.OODD 0.0000 0.0000 O.ODOO D.OODD 0-0000 0-0000 O-DDOD D.ODOO O.ODOO 0-0000 O.OODD O.DDOD O.DOOO O.ODOO O-ODDO O-ODDO 0-OOOD O.OODD O.DDOD O.OODO D-ODDO 0.0000 O-DOOO 0-0000 0-0000 D-OOOO O.ODDD O.OODD
4-b177 4.b177 4.b177 4.2002 4.2002 4.BDD3 4.4310 4-4310 4-421D 4-4B10 4-D151 4-0151 4.0151 4-0151
4-asos
n 5
m s D
b 7
8
1
4-asDa 4-2SD2
4-asoa 4-asoa 4-DD87 4-0D87 4.0067 4.0047 4-0D47 3.1471 3.1471 3.1471 3-1471 3.1471 3.1471 3.1471 3.1471 3.7475 3.747S 3.7475 3-7475 3-7475 3.7475 3.7475 3.7475 3.7475 3.7475
ID
15
50 q
_2=
flean 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1-0000 1.0000 1.0000 l.ODDO l.DOOD l.DDOO l.ODDO l.OOOD l.DOOD l.DDOO 1.0000 1-0000 l.DOOD l.DDOO l.ODDO l-OODO 1.0000 l.DOOD l.DDOO l.ODDO l.ODDO l-OODO l-DOOD l.DOOD l.DDOO l.ODDO l-OODO l.DOOD l-DOOD l.ODDO l.ODDO l.OODD l.DOOD l-DOOD l-DDOO
Pi Variance D.aBID 0.3010 0-bOb4 0.18b5 D.33D4 0.3015 0.15b5 0-1851
D-asn D.3b78 0-1354 0-15b8 0.1873 0-23b7 D-1110 0-1351 D-15b8
a-iasb 0-Eb48 0.10b4 0.1111 D.1354 0-1571 0.1115 O.DbIS D.D745 0-0805 0-0875 O.DIbl D-lDbS
D-iaos D-1453 D-0515 0-0543 0-D574 D.ObDI D-0b48 0-0b13 0-D74b 0-D8DS D-D8B5 D-0171
VA
A
o-isao
4.3b21 4-7538 5-1007 4-1015 4-35bl 4-7842 3-1333 4-1024 4-3517 4-b211 3-8013 3-1344 4-1034 4-3811 3-7101 3-80b0 3.124b 4.103b 4.30D1 3-b3b8 3-7122 3-8Db7 3-1301 4-1257 3-4151 3-44bD 3-4818 3-5235 3.5727 3-b337 3-7133 3-4007 3-3085 3-3252 3-3431 3-3b4S 3-3843 3-4150 3-4453 3-4407 3.524b 3.5821
1.0811
l.ian 0.8bD4 0.1514 1.014b 0.788b 0.8571 0.15DS 1-0157 0-7345 0-7413 D.4S82
o.ibai 0.b444 0.73B1 0.78b4 0.8511 0.1115 0.0515 D.bBIO 0.7332 0.7843 0.4b7S D.52bl D.5453 0-Sbb7 0.5107 0.bl71 0-b418 D.b1D4 0.7282 0.4535 0.4b5b D.4788 0-4131 0.5088 0.5251 D-5448 D.5bb3 0.5111 0.b234
Table 3. Percentage Points of the Distribution of $P_1$
m
s G 1 2 b a 1 2 7 0 1 i 3 a D 1 2 3 1 D 1 2 3 1 ID D 1 2 3 1 15 0 1 2 3 14 5 t 7 20 D 1 2 3 1 5 t 7 B 1 5
Simulated
Edqeworth
1* -2-t5 -2.t1 -2.82 -2.tl -2-bl -2.1.5 -2-51 -2.51 -2.St -2.7t -2.57 -2-57 -2-57 -2.t3 -2.55 -2.55 -2-55 -2.5b -2-72 -2-St -2. St -2.51 -E.SM -2-bl -2-50 -2-50 -2-50 -2.IT -2.11 -2.50 -2.53 -B.LI
-2.147 -2.147 -2.17 -2.147 -2-17 -2.17 -2.17 -2.17 -2.11 -2.55
2.5Z -5.07 -2.0b -2-20 -2.0b -2.05 -2-07 -5-05 -5.01 -5.01 -5-17 -5-01 -5-01 -5-03 -5-07 -5.03 -5.03 -5-03 -5-03 -2-11 -5-03 -5.03 -5-03 -5.02 -2-0t -2.D1 -5-D1 -5-D1 -5.01 -5.01 -2.01 -2.03 -2.ID -2.00 -2.0D -2-0D -5-00 -2-00 -2.00 -2. DO -5. DO -5.D1 -2.05
S>.
10X
-l.tl -l.tl -l.t3 -l.t2 -l.t2 -l.tl -l.t2 -Lt2 -l.tl -1-tS -l.tB -l.t2 -i.ts -1.U2 -1-tS -l.ta -l.t2 -l.t2 -i.ts -l.t3 -l.t3 -l-b3 -l.t2 -l.t3 -l.t3 -l-t3 -Lb3 -l-t3 -l.t3 -l.t3 -l.t3 -l.tt -l.t3 -i.ts -i.ts -i.ts -l-t3 -l.t3 -l.t3 -l.t3 -l.t3 -i.ts
-1.11 -1.11 -1.15 -1.20 -1.20 -1.16 -1.21 -1-21 -1.20 -1.18 -1-21 -1.21 -1.21 -1.20 -1.22 -1-22 -1.22 -1.21 -1.20 -1.22 -1.22 -1-22 -1.22 -1.21 -1.23 -1.23 -1.23 -1-23 -1.23 -1.23 -1.23 -1.22 -1.21 -1.21 -1.21 -1.21 -1-21 -1.21 -1.21 -1.21 -1.21 -1.23
10* l.n l.n i.n
1SX 17. SZ H i
2.07 2.08 2.08 2.0b 2.0b i.n Lt2 2.08 1.21 Lt2 2-05 1-21 l.t2 2.05 1.20 1-13 2.0b Lit 1.57 2.02 1-21 1.4,2 2.01 1.21 l.t2 2.01 1-21 l-t3 2.OS 1.20 l.tl 2.05 1.22 l.t2 2.03 1.22 l.t2 2.03 1.22 l.t3 2-01 1.21 l.t3 2.05 1.18 1.58 2-00 1.22 l.t3 2.03 1.22 l.t3 2.03 1-22 l-t3 2.03 1.22 Lt3 2-01 1.20 l.tl 2.03 1.23 l.ta 2.01 1.23 l.ta 2.01 1.23 l.ta 2.01 1-23 l.ta 2.D1 1-23 i-ta 2-01 1-23 l.b3 B.D2 1.22 l.b2 2.D1 1.20 1.51 1-17 1.21 i-ta 2-DO 1.21 l.ta 2.00 1.21 l.ta 2.0D 1.21 l.ta 2.00 1.21 Lb3 2.00 1.21 i-ta 2-OD 1.21 l.b3 2.0D 1.21 l.b3 2.01 1.21 l.b3 2.01 1.22 i.bi 1.11 l.tl l.t2 1.5b 1-2D l.t2 1.2D Lb2
2.b5 2.bb 5-71 2.bl 2.b2 2.bb 2.51 2.51 2-bl 2.b3 2.57 2.57 2-58 2.bl 2.S5 2.55 2-5b 2-58 2-57 2-51 2.51 2.51 2.55 2.57 2-50 2.50 2.50 2.50 2-50 2.50 2-51 2-18 2-17 2-17 2.17 2.17 2.17 2.17 2.17 2-18 2-18 2.17
IX -2. SO -2. S3 -2.1b -2-53 -2.1b -2.bl -2.11 -2.11 -2. 57 -2-52 -2.58 -2.17 -2. tO -2.51 -2.52 -2-51 -2.1t -2-38 -2.55 -2-50 -2.51 -2.5t -2.57 -2.51 -2.31 -2.15 -2-t3 -2-17 -2.15 -2.1t -2.51 -2-bl -2.1t -2-52 -2.31 -2-50 -2.51 -2.1t -2.18 -2.ID -2-11 -2.52
2.5* -2.05 -2.08 -2.01 -2.02 -2.D1 -2-D7 -1.11 -1.17 -2.D7 -B-Ot -2-D5 -2-D5 -2-D3 -2.D2 -2.00 -2.03 -2.0t -1.15 -2.OS -2.02 -1-11 -2-07 -2.10 -2.01 -2.03 -2-DS -2-D5 -2.D2 -2.03 -2.01 -2.D2 -2.15 -2.07 -2.0b -2-03 -1-11 -2.D5 -2.07 -1-11 -2.DO -2-D1 -2-Db
5M
10*
ID*
-l.bS -l.tl -l.tl -l.b5 -l-b3 -l.bb -l.tl -l.b3 -l.b8 -l.bl -Lb7 -1-bl -1-bl -l.bl -l.b3 -l.bl -l.b7 -l.b2 -l.b7 -l.bl -l-b2 -i.ts -1-bl -l.b7 -l.bS -1.70 -l.b3 -l-b3 -l.bb -l.bO -l.bl -1.75 -l.b7 -l.tt -i.ts -1-bl -l.bb -l.tt -i.ts -1-58 -l.tl -Lb7
-1.22 -1.21 -1.20 -1.23 -1.11 -1.25 -1.21 -1.21 -1.27 -1.20 -1-21 -l-2b -1-27 -1.17 -1.21 -1.27 -1.25 -1.22 -1.25 -1.25 -1.21 -1-21 -1.20 -1.2b -1.2b -1.28 -1.21 -1.21 -l-2b -1.21 -1.23 -1.25 -1.27 -1.25 -1-21 -1.21 -1.21 -1.25 -1.27 -1.21 -1.21 -1.27
1.22 1.23 1.17 1-23 1.23 1.11 1.21 1.21 1.21 1.11 1.21 1-23 1-23 1.22 1.21 1.11 1.2S 1.21 1.2D 1.2S 1-27 1.25 1.22 1.23 1.2b 1.23 1.22 1.23 1.2b 1.21 1.23 1.22 1.27 1.22 1.22 1-21 1.25 1.21 1.25 1.2b 1-23 1.21
15* 1 7 . SX n x l.bS 1.58 l.b2 l.b5 l.bl l.bS l.bl l.bb l.bS 1.58
Lb7 1-tE i.ts l.bb l.ta l.tl l.t7 l.t2 l.ta l.tl 1.70 i.ts 1.51 l.t2 i.ts l.tt l.ta l.t2 l.tl 1.70 l.tl l.ta l.t7 l.tl l.ta l.bb 1-tE l.b5 l.ta l.b7 l.ta i-to
2.05 1.17 2.0t 2.02 2.03 2.0b 1.11 2.02 2.D1 1.11 2.D5 1.17 2-03 2.01 2.00 1.11 2.OS 2.02 2.03 2.02 2.08 2.08 2.01 2.01 2.03 2.02 1.12 1-15 2.03 2.07 2.01 1.17 2.07 l.lt 1.18 2-D2 2.D1 2. DO 2.07 2.12 2.D1 1-17
2.50 2.17 2.58 2-53 2-17 2.77 2.11 2.15 2.51 2.11 2.58 2.10 2.51 2.17 2.S2 2.12 2.51 2.11 2.51 2.50 2.58 2-51 2.51 2.t2 2.31 2.31 2.31 2.It 2-13 2.It E.tl 2.11 2.It 2.12 2.15 2.10 2-53 2.38 2.SO 2.53 2.15 2-32
266 Table 4. Percentage Points of the Distribution of p;
m5# n 5
b
7
8
1
ID
s D 1 2 0 1 2 D 1 2 3 0 1 E 3 G 1 2 3 1
o i
a
3 4 15 D 1 2 3 4 5 b 7 BO D 1 2 3 4 5 b 7 8 1
IX -2-81 -2.81 -2.81 -2.b7 -2-b7 -2.b7 -2.74 -2.74 -2.74 -2.74 -2-1.1 -2-b4 -2-b4 -2-b4 -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2-bl -2.51 -2.51 -2.51 -2.51 -2.51 -2.51 -2-51 -2-51 -2.53 -2.53 -2.53 -2.53 -2.53 -2.53 -2-53 -2-53 -2.53 -2-53
2-5* -2-17 -2-17 -2.17 -2-D8 -2-D8 -2.D8 -2.12 -2.12 -2.12 -2-12 -2-07 -2.07 -2.07 -2.07 -2-01 -2.01 -2.01 -2.01 -2.01 -2-0t -2.0k -2.0h -2.0b -2.0b -2.05 -2.05 -2-05 -2.05 -2.05 -2-05 -2.05 -2.05 -2.02 -2.02 -2.02 -2.02 -2.02 -2-02 -2.02 -2.02 -2.02 -2-02
Edqeworth 10* 102 5* -1.51 -1.13 1.13 -1.51 -1.13 1.13 -1.51 -1.13 1.13 -l.bl -1.18 1.18 -l.bl -1.18 1.18 -l.bl -1.18 1.18 -l.bO -1.1b 1.1b -l.bO -1.1b 1.1b -l.bO -1.1b 1.1b -1-bO -Lib 1.1b -l.b2 -1-11 1-11 -l.b2 -1.11 1-11 -l.b2 -1.11 1.11 -l.b2 -1.11 1.11 -1-bl -1.18 1.18 -l.bl -1.18 1.18 -l.bl -1.18 1.18 -1-bl -1.16 1.18 -l.bl -1-18 1.18 -l-b2 -1.20 1-20 -1.L2 -1.20 1.20 -l-b2 -1-20 1-20 -l.b2 -1.20 1.20 -1-bE -1-20 1.20 -l.b2 -1.20 1.20 -l-b2 -1.20 1.20 -l.b2 -1.20 1.20 -l.b2 -1-20 1-20 -l.b2 -1-20 1.20 -l.b2 -1.20 1.20 -l-b2 -1.20 1.20 -1-bE -1-20 1-20 -l-b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l.b3 -1.22 1.22 -l-b3 -1-22 1-22 -l.b3 -1-22 1-22 -l.b3 -1.22 1.22 -l-b3 -1-22 1-22
15* 17.5* n* 1-51 2-17 2.81 1-51 2.17 2.81 1.51 2.17 2.81 l.bl 2.OS 2-b7 l.bl 2-08 2-b7 l.bl 2.08 2-b7 l.bO 2.12 2.74 l.bO 2.12 2.74 l.bO 2-12 2-74 l.bO 2-12 2-74 l-b2 2.07 2.b4 l.b2 2.07 2.b4 l.b2 2.07 2.b4 l.b2 2-07 2.b4 l.bl 2-01 2-bl l.bl 2.D1 2-bl l.bl 2.01 2.b1 l.bl 2-01 2-bl l.bl 2.01 2.b1 l.b2 2.0b 2-bl l.b2 2.0b 2.bl l.b2 2.0b 2.bl l.b2 2.0b 2.bl l.b2 2.0b 2.bl l.b2 2.05 2.51 l.b2 2-05 2-51 l.b2 2.05 2.51 l.b2 2.05 2.51 l.b2 2-05 2-51 l.b2 2.05 2.51 l.b2 2.05 2.51 i.ta 2-05 2.51 l-b3 2-D2 2-53 1-L3 2-02 2-53 l.b3 2.02 2.53 l.b3 2-02 2.53 l.b3 2.02 2-53 l.b3 2.02 2.53 L b 3 2.02 2.53 l.b3 2-02 2.53 l.b3 2-02 2-53 L b 3 2-02 2.53
4
IX -2.58 -2.58 -2.58 -2.bl -2.bl -2.bl -2-58 -2-50 -2.58 -2.58 -2.57 -2.57 -2.57 -2-57 -2.55 -2-55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.55 -2.52 -2-52 -2.52 -2.52 -2.52 -2.52 -2.52 -2-52 -2-50 -2.50 -2.50 -2-50 -2.50 -2.50 -2.50 -2-50 -2.50 -2.50
2-5* -2-04 -2.04 -2.D4 -2.0b -2.0b -2.Db -2.0b -2.0b -2.0b -2.0b -2-04 -2.04 -2-04 -2.04 -2.07 -2.07 -2-07 -2.07 -2.07 -2.05 -2.QS -2.05 -2.05 -2.05 -2.04 -2.D4 -2.04 -2-04 -2.D4 -2.04 -2-04 -2-04 -2.04 -2.04 -2.04 -2.04 -2.04 -2.04 -2-04 -2-04 -2.04 -2.04
Simulated 10* 10* 5* -l.b3 -1.18 1.18 -1-13 -1.18 1.18 -l-b3 -1-18 1.18 -l-b5 -1.21 1.21 -l-b5 -1.21 1.21 -1-bS -1-21 1.21 -1.14 -1-20 1-20 -l.b4 -1-20 1.20 -1-14 -1.20 1.20 -1-L4 -1-20 1.20 -l.b5 -1.22 1.22 -l.b5 -1.22 1.22 -l.b5 -1.22 1-22 -l.b5 -1.22 1.22 -1-bb -1-23 1.23 -l.bb -1.23 1-23 -l.bb -1.23 1.23 -l.bb -1-23 1-23 -1-bb -1.23 1.23 -J.fcS -1.23 1.23 -1-L5 -1.23 1.23 -l.bS -1.23 1.23 - i . t s -1-23 1.23 -l.b5 -1.23 1.23 -l.bb -1.22 1.22 -l.bb -1.22 1.22 -l.bb -1-22 1.22 -l.bb -1-22 1.22 -l.bb -1.22 1.22 -l.bb -1.22 1-22 -l.bb -1-22 1.22 -l.bb -1-22 1-22 -l.b5 -1-24 1.24 -l.b5 -1-24 1.24 -l.b5 -1.24 1.24 -l.b5 -1.24 1.24 -l.b5 -1.24 1.24 -l.bS -1.24 1.24 -l.fcj -1.24 1-24 -l-b5 -1-24 1-24 -l.b5 -1.24 1.24 -l.b5 -1.24 1.24
15* l.b3 l.b3 l.b3 l.b5 l.b5 l.bS l.b4 l.b4 l.b4 l.b4 l.bS 1-bS l.b5 l.bS l.bb l.bb l.bb l.bb l.bb l.bS 1-bS l.b5 1-bS l.b5 l.bb l.bb l.bb l.bb l.bb l.bb l.bb l.bb l-b5 1-bS l.b5 l.b5 l.b5 l.b5 l.b5 l.b5 1-bS l.b5
17.5* 2.Q4 2.04 2.D4 2.Db 2.0b 2.Db 2.0b 2.0b 2.0b 2.0b 2-04 2-04 2-04 2.04 2.07 2.07 2.07 2.07 2.07 2-05 2.05 2.05 2-05 2.05 2.04 2.04 2-04 2.04 2.04 2.04 2.04 2.04 2.04 2-04 2.04 2-04 2-04 2.04 2.04 2.04 2.04 2.04
11* 2.58 2.58 2.58 2.bl 2.bl 2.bl 2-58 2.58 2.58 2.58 2-57 2.57 2.57 2.57 2.55 2-55 2.55 2.55 2.55 2.55 2.55 2.55 2.55 2.55 2-52 2.52 2-52 2.52 2.52 2.52 2.52 2.52 2-50 2.50 2-50 2-50 2-50 2-50 2-50 2-50 2.50 2-50
267 Table 5. Percentage Points of the Distribution of p2 H *? rat n 5 b
7
a i
ID
s D 1 S 0 1 2 0 1 E 3 D 1 E 3 0 1 2 3 1
n i
B 3 1 15 D 1 2 3 4 5 I 7 EO G 1 2 3 1 S t 7 8 1
V/.
-1.1.1 -1-57 -LIS -l.tl -Ltl -l.St -1-73 -l.fi -l.tl -1.55 -1.77 -1.71 -l.tl -Lt3 -1.60 -1.77 -L71 -l.fi -Lt2 -1.63 -1.80 -1.77 -1.71 -1.1.1 -LIE -1.11 -L81 -1.87 -1.85 -1.83 -1-BD -1.7t -1.18 -1.17 -l.lt -1.15 -1-11 -1.12 -1-11 -1.81 -1.87 -1.85
2-5Z -1.51 -Lit -1.38 -1.55 -1.51 -l.lt -1.58 -1.55 -1.51 -1.15 -1.1,1 -1.58 -1.55 -1-51 -l.t3 -l.tl -1.58 -1-55 -1.50 -l.tl -1-L3 -l.tl -1.58 -1.55 -1.7D -l.tl -Lbfl -1.L7 -l.tt -l.tl -l.t3 -l.tO -1.71 -1-73 -1.73 -1.72 -1.71 -1.7D -l.tl -l.t8 -l-t7 -l.tt
Edqeworth IOX 1 0 * - 1 . 3 7 - i . i t 1-31 - 1 . 3 3 - i . m 1.30 -1.E7 - i . i i 1.27 - 1 . 3 1 - 1 . 1 7 1.32 -1-37 - L i t 1.31 - 1 . 3 3 - 1 . 1 1 1.30 - 1 . 1 1 - 1 . 1 8 1.32 - 1 . 3 1 - 1 . 1 7 1.32 -1-37 - l . l t 1.31 - 1 . 3 3 -1-11 1.30 - 1 . 1 3 - 1 - 1 1 1.32 - 1 . 1 1 - 1 . 1 8 1.32 - 1 . 3 1 - 1 . 1 7 1.3E - 1 . 3 b - l . l t 1.31 - 1 . 1 1 - 1 . 1 1 1-32 - 1 . 1 3 - 1 . 1 1 1.32 - 1 - 1 1 -1-18 1-32 - 1 . 3 1 - 1 . 1 7 1.32 - 1 . 3 t - l . l t 1.31 - 1 . 1 5 -1-20 1.3E - 1 . 1 1 - 1 - 1 1 1.3E -1-13 - 1 - 1 1 1.32 - 1 . 1 1 - L I S 1-32 - 1 . 3 1 - 1 . 1 7 1.32 - 1 . 1 1 -1-22 1.32 - 1 . 1 8 - 1 . 2 1 1.32 -1.18 - 1 . 2 1 1.32 -1-17 - 1 - 2 1 1-32 - l . l t - 1 . 2 0 1-32 -1-15 -1-20 1-32 - 1 . 1 1 - 1 . 1 1 1.32 - 1 . 1 2 - 1 . 1 1 1.3E - 1 - 5 1 -1-23 1.32 - 1 . 5 1 -1-22 1.32 -1.5D -1-22 1-32 -1.5C - 1 . 2 2 1.32 - 1 . 1 1 - 1 . 2 2 1-32 - 1 - 1 1 -1-22 1.3E - 1 . 1 8 - 1 - 2 1 1-3E - 1 . 1 8 - 1 . 2 1 1-33 -1-17 - 1 - 2 1 1.32 - L i t -LEO 1.32 5Z
W, 15*
l.St 1.88 1.11 1-61 l.Bt 1.88 1.83 1.81 l.St 1.88 1.S2 1.63 1.81 l.St 1.81 1.82 1.83 LSI L8t
1.80 LSI
1.8E 1.83 LSI
1.7S 1.78 L71
1.71 LSD
1.80 LSI
1.B2 1.7b 1.7t 1.77 1.77 1.77 1.78 1.78 1.71 1.71 1.80
17.5* 2.17 2.5S 2.S3 2.10 2.17 2-51 2.3t 2.10 2.17 2.tl 2.32 2.3t 2.10 2.18 2.30 2-32 E.3t E.10 E.11 2-27 2-30 2.32 2-3b 2.11 2.21 2.22 2-23 2-21 2.Et 2.27 2.30 2-33 2-17 E.18 E.18 2-11 E.20 2-21 2.22 B.23 2.21 2.2t
Simulated
11*
IX
3.11 3-25 3-11 3.05 3.11 3.2t 2-11 3.05 3.11 3-27 2.11 2.11 3-05 3.11 2-81 2.11 2.11 3.05 3.It 2.8t 2.81 2.13 2-11 3.0t 2.71 2-7t 2.78 2.SO E.63 2. St E.81 2.15 2.tB 2-tl E.70 2.71 2.73 2-71 2-7t 2-78 2.60 2.63
-l.to -L51 -l.St -l.tl -l.tl -L50 -l.tl -l.tl -l.tS -1.50 -1.71 -1.73 -L7D -LtO -1.81 -1.71 -1.7t -Ltl -L51 -1.8b -1.81 -1.60 -1.73 -Lt8 -1.11 -1.13 -LSI -1.67 -LSI -1.61 -L71 -1.78 -E.01 -1.11 -1.13 -1.17 -1.11 -1.11 -1.15 -1.10 -1.10 -L88
2.5* -Lit -1.11 -L28 -1.51 -L11 -L37 -LSI -1.51 -1.11 -1.10 -L5t -1.55 -L53 -Lit -Ltl -L57 -1-St -1.S5 -Lit -Ltl -Lt2 -Ltl -LSS -1.5E -Ltl -l.tS -l.t7 -Lt8 -l.tl -Lt3 -l.tl -L5S -L77 -1.77 -Lt8 -Ltl -1.7E -L73 -Ltl -Lt7 -Lt7 -l.tS
S*
10*
-1.31 -1.30 -1.11 -1.35 -L33 -1.2t -1.37 -L38 -1.31 -L27 -1.11 -L37 -1.35 -1.32 -L11 -1.11 -LM0 -L3S -1.32 -1.11 -1.13 -LIE -LSI -1.37 -1.17 -L17 -Lit -1.11 -L12 -1.13 -1.12 -1.12 -LIT -L51 -1-11 -1.11 -LIS -1.51 -1.18 -L15 -Lit -LM5
-1.13 -1.11 -l.Ot -1.15 -1.13 -1.10 -1.11 -LIS -1.15 -1.11 -1.11 -1.13 -LIS -1.13 -LEI -LEO -1.11 -1.17 -i.m -LIS -1.17 -1.18 -Lit -1.13 -LEI -LEO -LIB -1.E3 -1.18 -1.18 -L11 -1.17 -LEE -1.E2 -1.20 -1.2E -1.21 -1.E3 -1.11 -1.11 -1.11 -L20
10*
15*
1.31 L 8 1 L S I 1.10 1-33 L 8 1 L 3 1 1.81 1.32 1.81 L 3 S 1.88 1-31 L B 3 L 2 1 1.78 L3t LS7 L 3 B 1.10 L2t Lt7
1.31 1.81 L 3 t LSS L10 LSS 1.33 L 7 S 1-31 L 7 t
1.35 1.88 L 3 1 1.88 L 3 2 1.81 1-31 L 8 7 1-31 L 6 2 1-33 L S S 1.31 1.83 L 3 1 1.81 1-30 1.71 1.31 L 7 t L 2 7 1.72 1-33 1.81 1.37 1.81 1.38 L B t L 3 8 1.10 L 3 3 1-80 1.21 1.71 L E I i.ta L33 L37 L28 L32 L3D L32 L35
L77 LS0
1.71 L75
1.72 LSI LB3
1.37 L S t
17- 5* 2.10 E.17 E.50 2.28 2.33 B.11 2-27 E.B7 2-35 2.11 2.13 B.37 S-35 E.13 2-11 2. I t 2-35 2.37 2.37 2.33 2.11 2.2t 2.2b 2.30 2.13 2.13 2.16 2-26 2.25 2.21 2.32 2.2b 2-13 2.13 2-17 E.E1 2-10. B.1S E.10 E.E5 E.E1 E.Et
n*
3-02 3.08 3.EO B.10 2.10 3.05 E.77 2-81 3.08 3-13 E.t3 2.11 E.83 3.11 E.63 2.51 2-86 2.13 3.01 B.87 B.73 E.81 2.S3 E.11 E.73 2.t5 2.71 B-77 2-S1 2.7t 2.S2 E.81 E.57 2.57 2.7D E.t8 2.51 E.tB E.51 E.73 E.70 E.87
268 Table 6. Percentage Points of the Distribution of P{
•5 n
t 7
S
|P s 0 1 2 D 1 S D 1 E 3 0 1
a3 1
10
D 1 E 3 4
ai s3 4
15
ai s3 4 5
t 50
7 0 1 5 3 4 5
t 7 6 I
!S
Edqeworth 1%
-i.ti -1.57 -1.51 -l.tl -1-tl -L5t -1.71 -l.fi -i.ti -i.to -1.77 -1.73 -Ltl -Lt3 -i.ao -1.77 -1.71 -i.ti -Lt5 -1.63 -l.SG -1-77 -1.714 -l.fi -LIE -1.11 -1.01 -1.67 -1.85 -1-83 -1.80 -1.78 -1.18 -1.17 -Lit -L15 -1.11 -LIE -1.11 -1.81 -1.87 -1.85
2-5* -1.51 -l.lt -1.11 -1.55 -1.51 -1-11 -1-50 -1-55 -1.51 -1-11 -l.tl -1-58 -1.55 -1.51 -l.t3 -1-tl -1-58 -1.55 -1.52 -l.tl -i.ts -1-tl -1-58 -1.55 -1.70 -l.tl -i.ta -l.t? -l.tt -l.tl -l.t3 -1-tl -1.74 -1.73 -1.73 -1.7E -1.71 -1.70 -l.tl -i.ta -l.t? -l.tt
5*
10*
-1-37 -1.33 -1.31 -1.31 -1.37 -1.33 -1.11 -1.31 -1.37 -1.35 -1.13 -1.11 -1.31 -Lit -1.14 -1.43 -1.41 -1.31 -1.37 -1-45 -1-44 -1.43 -1.41 -1.31 -1.41 -1.18 -1.18 -1.17 -Lit -1.15 -1.14 -1.43 -1.51 -1.51 -1.50 -1.50 -1-41 -1.41 -1-48 -1-48 -1-47 -Lit
-Lit -L14 -1.13 -1.17 -Lit -1.11 -1.18 -1.17 -Lit -L15 -1.11 -1.18 -L17 -Lit -1.11 -1.11 -LIS -1.17 -l.lt -LEO -1-11 -LIT -1.18 -1.17 -LEE -LEI -LSI -LEI -LED -LED -L11 -1.11 -L23 -LEE -LEE -LEE -LEE -LEE -LEI -LSI -LEI -LEO
10*
15*
1.31 l . a t L 3 0 i.aa 1.28 1.10 L 3 E 1.84 1-31 L 8 t 1.30 L S B 1.3S 1.83 1.3E L S 4 1.31 l . B t 1.30 L 8 7 1.32 1.62 1.3E 1.B3 L32 L84 L 3 1 l.Bt
1.32 1.81 1.32 1.82 L32 L83
1.32 1-84 1.31 1.85 1.3S 1.80 L 3 E 1.81 L32 L62
1.3S 1.83 L3S L64 L 3 E 1.78
1.3E 1.78 L32 L71 L 3 2 1.71
1.32 1.80 L 3 2 LSD L 3 2 1.81
1.32 1.82 1.32 L 7 t L 3 2 1.7t 1.3E 1.77 1.32 L 7 7 L32 L77 L 3 2 1.78
1.32 1.78 L 3 2 1.71 1.32 1.71 1.32 1.60
17.5* 2-47 2.58 2.tt 2.40 2.47 2-51 2.3t 2-10 S-17 S.54 E.3E 2-3b E.40 E.48 S.3D E-3E 2.3t 2-40 2.45 2.27 2.30 2-32 2.3t 2-11 2.21 S.S2 E-S3 E.S1 E.Et 2-27 2.30 2.32 2-17 2.18 2.18 2-11 2-20 2-21 2.22 2.23 2.24 2.2t
Simulated
11*
1!!
3-14 3.35 3-30 3-05 3.14 3.St 2.11 3.05 3-14 3.SO 2.14 2.11 3.05 3.15 2-81 2.14 2.11 3.05 3-11 2.at E.81 5.14 E-11 3'.0t 2.74 2.7t 2.76 2.6D 2-B3 2.6t 2.61 S-13 2.t8 E.tl 2.70 5-71 E.73 E.74 E.7t E.78 s.ao 2.83
-i.to -1.54 -i.to -l.tl -Lt4 -1.41 -l.tl -l.tl -i.ts -l.tl -L75 -1.73 -L70 -LtO -1.84 -1.76 -L7t -l.tl -1.7D -L87 -L83 -1.81 -1.73 -i.ta -L14 -L13 -1.61 -LB8 -L85 -1.81 -1.71 -L71 -2.01 -2.00 -1.13 -1.17 -1.14 -1.11 -1.15 -1.10 -L10 -LB7
2.5* -l.lt -1.41 -L43 -1.51 -1.41 -L37 -L54 -1.54 -1.11 -L18 -l.St -L55 -1.53 -Lit -l.tl -1.57 -L5t -1.S5 -LSI -Lt5 -Lt2 -LtO -1.55 -L52 -1-tl -i.ts -Lt7 -LtB -Lt4 -Lt3 -l.tl -1.51 -1.78 -L77 -Lt8 -l.tl -1.72 -1.71 -l.tl -Lt7 -Lt7 -Lt5
5*
10*
-1.31 -L3D -1.27 -1.35 -L33 -1.27 -L37 -1.31 -1.34 -L3S -1.12 -1.37 -1.35 -1.32 -1.41 -1.41 -1.4D -L36 -L33 -1.11 -1.13 -1.11 -LID -1.38 -1.17 -1.47 -l.lt -1.11 -LIE -L43 -1.13 -1.13 -1.50 -LSI -1.11 -1.11 -LIS -L5D -LIS -1-11 -1.15 -Lit
-L13 -1.11 -LDB -1.11 -LIE -1.11 -1.11 -1-18 -1.15 -1-11 -1.11 -1.11 -1.15 -1.13 -LED -1-11 -L11 -1.17 -1.13 -1-18 -1-17 -LIB -Lit -L13 -L20 -1-20 -1.1a -L23 -1.18 -1.18 -1.11 -1-17 -1-22 -1-21 -1-20 -1-23 -1-21 -1-23 -1.11 -1-11 -1.11 -1-20
ID* L35
15* L81
1.34 1.10 L30
1-3S 1.32 1.38 1.32 L21 L3t L33
L87 LS2
1.83 1.10 1.B2 1.78 i-a? 1.10 l.t?
1.25 1.34 L B 4 L 3 7 1-B5 L 3 B 1.11 1.33 1.7t L30 L3S
L7t LS6
1.34 i.aa 1.33 1.87 1.3B i.ea L 3 4 1.00 L33 L34 L34 L30 L31
LB7
1.84 L81 L7t L75
1.27 1.7E L 3 3 1.84 L3t L65
1.38 1.31 1.32 1.21 1.2B 1.33 1.38
1.87 1.10 L78
1-71 i.to 1.7? i.ao
L21 L74 1-32 L 7 t 1.30 L 7 3 1-32 L S I L34 LS3 L3t LSt
17.5* E.41 S-4t E.4E E.E8 E-31 2.37 2.S7 2.E4 E.33 B.41 2.11 S-3t E.3t E.lt E-11 E.lt 2-3b 2.35 S.3t 2.32 2.20 2-21 2.27 2.32 2-11 2.13 2-11 2-27 2-2t 2.3D 2-33 2-28 2.12 2.11 2.18 2-21 2.10 2-11 2.01 2-25 2.21 2.2t
11*
3.D2 3. D8 3.13 2.12 2. I t 3-03 2-7t 2-10 3.07 3-08 2-t3 2.11 2.87 3.15 2-81 2.5B 2.10 E-11 3-02 2.at 2.7t 2.85 2-82 2.11 2.71 2.b5 2-62 2.77 2.66 E.77 E.8D E.81 S.56 2.5b 2-tl 2-tl 2.51 E-bO 2.tl 2.7t 2.t1 2.65
269 Table 7. Simulated Percentage Points of the Distribution of P, n S
b
7
a
1
10
IS
2D
s D 1 2 D 1 2 D 1 2 3
•1 2 3 D 1 2 3 4 0 1 2 3 4 0 1 2 3 4 5 b 7
o i 2 3 4 5 b 7 8 1
IX -1.17 -2-M7 -5-31 -LSD -l.bl -2.fa3 -1.3S -1.33 -1-84 -2.63 -1.22 -1.23 -1-3D -1.71 -1-D8 -LID -1-lb -1.4b -2.11 -1.D2 -1-Dl -1.D7 -1-D1 -1.45 -0.77 -0.77 -0.7ft -0.81 -0.71 -0.81 -0.12 -l.lt -D-bS -0-b4 -0-b7 -O.bb -0-b7 -D-tfi -0-b7 -D-bb -0.73 -0-77
2.5X
-mo
-Lb4 -3.07 -1.11 -1.22 -1-72 -1.0b -1.02 -1.2b -1.17 -0.14
-o-m
-1.02 -1.22 -o.ab -0-10 -0.11 -1.03 -1.43 -O.flO
-o.ai
-0.B4 -0.8b -1.02 -0-bb -0-bS -D.b3 -0-bM -0-b3 -D.bb -0-70 -o.ab -0.5M -D.S1 -0-S2 -0.S1 -0.SS -0-S4 -0.SM -0-52 -0.56 -0-bO
S>. -1.0b -1.1b -2.0b -0.11 -0.11 -1.17 -o.as -0.82 -0.15 -1.33 -0.7b -0.7b -o.ao -0.11 -0-70 -0.71 -0.72 -o.ao -1.01 -0.b5 -0-bS -0.fc7 -0.b7 -0.76 -0.52 -0-53 -0-51 -0.52 -0.51 -0-52 -0-54 -0.b5 -D.MM -0-M5 -0-M2 -0.M3 -D.M4 -0-M2 -O.MM -0.M2 -D.M5
-cm
1Q'/. -0.78 -0.82 -1.2b -D.b5 -0.b6 -0.80 -0.b3 -0.b2 -O.bb -0.85 -0.5b -0.57 -0.56 -O.bl -0.51 -0.53 -0.53 -0.55 -O.bb -0.41 -0.46 -0.48 -0.41 -0.54 -0.31 -0.31 -0.31 -0.38 -0.38 -0.36 -0.40 -0.45 -0.33 -0.33 -0.32 -0.32 -0-31 -0.33 -0.34 -0.32 -0.34 -D.3b
10* 0.77 0.83 0.85 O.bl 0-70 0-bl 0.51 O.bM O.bl 0-b4 0.57 0-57 0-57 0.57 0.52 0.51 0.54 0-52 0.54 0.47 0.51 0.41 0-48 0-50 0.36 0-31 0.36 0-38 0.40 0.31 0.36 0-38 0.33 0.32 0.32 0.32 0-32 0.32 0-33 0.34 0-33 0.32
15X 1.04 1.1b 1.17 0-14 0.15 D-17 0.61 0.87 0-8b 0-87 0.7b 0.77 0-71 0.80 0.70 0.71 0.73 0.73 0.72 O.bH 0-b6 D-bb 0-bS O.bb 0.52 0.52 0.50 0.51 0-51 0.51 0.52 0-51 0-44 0.42 0.42 0.44 0-42 0.44 0-44 0.45 0.43 0.43
1 7 . 5X 1.21 1-41 1-bO 1.17 1.24 1.32 1.00 1.08 1.01 1.14 0.11 0.1b 0.11 1.02 0-85 0.81 0.13 0.14 0.11
IIX l.b3 2.12 2.30 1.46 l.b7 l.b7 1.3b 1.43 1.45 1.54 1.21 1.23 1.21 1.21 1.01 1.12 1.15 1.20 1.17
o.ai
o.ia
0-64 0.62 o.ao o.ao 0-b3 0-b4 0-b2 0.b5 0-b2 0-b2 0.b3 0-b4 0.55 0.52 0.51 0.51 0.51 0.54 0-54 0-54 0.53 0.53
1.05 1.10 1.01 1-05 0.74 0.7b 0.71 0.71 0.78 0.60 0.77 0-71 0.b7 o.ts 0-b3 D.b4 0.b2 0-b3 0-b7 O.bl 0-b7 O.bS
Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 271-303)

MATHEMATICAL MODELS IN THE THEORY OF ACCELERATED EXPERIMENTS
V. BAGDONAVICIUS and M. NIKULIN
Department of Statistics, Vilnius University, Vilnius, Lithuania
Statistique Mathématique, Université Victor Segalen Bordeaux 2, Bordeaux, France & Steklov Mathematical Institute, Saint Petersburg, Russia
AMS subject classifications: 62F10, 62J05, 62G05, 62N05
Keywords: Arrhenius model, accelerated life models, additive accumulation of damages model, changing shape and scale model, Cox model, covariable, cross-effects of hazard rates, Eyring model, exponential distribution, frailty model, gamma frailty model, generalized probit model, generalized proportional hazards model, generalized logistic regression model, generalized proportional odds-rate model, generalized additive model, generalized additive-multiplicative model, heredity principle, inverse Gaussian frailty model, linear transformation model, log-linear model, Meeker-Luvalle model, Peshes-Stepanova model, periodic stress, power rule model, reliability, resource, Sedyakin's model, step-stress, survival function, switch-on and switch-off effects, tampered failure rate model, Weibull distribution.

0. Introduction

Mathematical models describing the dependence of the lifetime distribution on explanatory variables (stresses) will be considered. These models are used in survival analysis and in reliability theory to analyse the results of accelerated life testing. Many manufactured devices have a long life when used under normal conditions, so much time is required to collect enough data for reliability estimation. To avoid this, items can be tested under higher stress conditions, under which all processes resulting in failures proceed more quickly. As a result, failures that under normal conditions would occur only after long testing can be observed, and the size of the data set can be enlarged. Such reliability testing is called accelerated life testing. Using information about failures under higher stress conditions, inference about item reliability under normal stress must be made. The solution of this problem requires the construction of a mathematical theory of models relating the distribution of failure time to stress.
A number of such models have been proposed by engineers, who considered the physics of the failure formation process of certain products, or by statisticians; see Andersen, Borgan, Gill & Keiding (1993), Bagdonavicius (1978, 1990), Bagdonavicius & Nikulin (1995, 1998), Bhattacharyya & Soejoeti (1989), Cox (1972), Cox & Oakes (1984), Clayton & Cuzick (1985), Dabrowska & Doksum (1988), Elandt-Johnson & Johnson (1980), Gertsbakh & Kordonskiy (1969), Genest, Ghoudi & Rivest (1995), Greenwood & Nikulin (1996), Harrington & Fleming (1982), Kalbfleisch & Prentice (1980), Lawless (1982), Hougaard (1986), Johnson (1975), Kartashov (1979), Kartashov & Perrote (1968), Lee (1992), Lin & Ying (1994, 1995, 1996), Meeker & Escobar (1993, 1998), Mann, Schafer & Singpurwalla (1974), Miner (1945), Nelson (1990), Robins & Tsiatis (1992), Schabe (1998), Sedyakin (1966), Shaked & Singpurwalla (1983), Singpurwalla (1995), Singpurwalla & Wilson (1999), Schweizer & Sklar (1983), Viertl (1988), Viertl and Spencer (1991), Voinov & Nikulin (1993, 1996), etc. There is some eclecticism in the various definitions of these models, which prevents one from seeing the relations between them. The construction of accelerated life models will be considered now (Bagdonavicius and Nikulin (1994-2000)). This general approach makes it possible to formulate a number of new models and to show the place of the known ones in the proposed classes of models. These models can be considered as parametric, semiparametric or nonparametric. Parametric models used in accelerated life testing were thoroughly investigated; see, for example, Viertl (1988), Nelson (1990), Nelson & Meeker (1991), Meeker & Escobar (1998), Basu and Ebrahimi (1982), Singpurwalla (1971), etc. These are the well-known proportional and additive hazards, logistic regression and other models, which are used most often in survival analysis and in reliability. See also reviews such as Rukhin and Hsieh (1987), Meeker and Escobar (1993), Singpurwalla (1987), etc. Statistical analysis of the considered models can be found, for example, in Lin and Ying (1994, 1995, 1996), Meeker and Escobar (1998), Dabrowska and Doksum (1988), Robins and Tsiatis (1992), Bagdonavicius and Nikulin (1995-2000), Gerville-Reache and Nikoulina (1998), Tsiatis (1990), Schmoyer (1991), Sethuraman and Singpurwalla (1982), Ying (1993), etc.

1. Resource

All models of accelerated life will be formulated in a unified way, using the notion of the resource. We consider only models for which statistical estimation or hypothesis testing procedures are available to date. Suppose at first that stresses are deterministic functions of time:

$$x(\cdot) = (x_1(\cdot), \ldots, x_m(\cdot))^T : [0, \infty) \to B \subset \mathbb{R}^m.$$
If $x(\cdot)$ is constant in time, we'll write $x$ instead of $x(\cdot)$ in all formulae. Let $\Omega$ be a population of items and suppose that the time-to-failure of items under the stress $x(\cdot)$ is defined by a non-negative absolutely continuous random variable $T_{x(\cdot)} = T_{x(\cdot)}(\omega)$, $\omega \in \Omega$, with the survival function

$$S_{x(\cdot)}(t) = P\{T_{x(\cdot)} > t\},$$

strictly decreasing on the support $[0, sp_{x(\cdot)})$ of the distribution. The moment of failure of a concrete item $\omega_0 \in \Omega$ is given by a nonnegative number $T_{x(\cdot)}(\omega_0)$. Let $F_{x(\cdot)}(t) = 1 - S_{x(\cdot)}(t)$ be the cumulative distribution function of $T_{x(\cdot)}$. We use the following interpretation of it.

Definition 1. The proportion $F_{x(\cdot)}(t)$ of items from $\Omega$ which fail until the moment $t$ under the stress $x(\cdot)$ is called the uniform resource of population used until the moment $t$.

Definition 2. The random variable

$$R^U = F_{x(\cdot)}(T_{x(\cdot)}) = 1 - S_{x(\cdot)}(T_{x(\cdot)})$$

is called the uniform resource (of population).

The distribution of the random variable $R^U$ doesn't depend on $x(\cdot)$ and is uniform on $[0,1)$. This explains the name uniform resource. The uniform resource of any concrete item $\omega_0 \in \Omega$ is $R^U(\omega_0)$. It shows the proportion of the population $\Omega$ which fails until the failure $T_{x(\cdot)}(\omega_0)$ of the item $\omega_0$. The same population of items $\Omega$, observed under different stresses $x_1(\cdot)$ and $x_2(\cdot)$, uses different resources until the same moment $t$ when $F_{x_1(\cdot)}(t) \ne F_{x_2(\cdot)}(t)$. In the sense of equality of used resources the moments $t_1$ and $t_2$ are equivalent if $F_{x_1(\cdot)}(t_1) = F_{x_2(\cdot)}(t_2)$. If we denote by $G(t) = 1 - t$, $t \in [0,1)$, the uniform survival function and by $H(p) = 1 - p$, $p \in (0,1]$, the inverse function of $G$, then the resource can be written in the form $R^U = H(S_{x(\cdot)}(T_{x(\cdot)}))$. The considered definition of the resource is not unique. Take any strictly decreasing and continuous function $H : (0,1] \to \mathbb{R}$ such that the inverse $G = H^{-1}$ of $H$ is a survival function. Then the distribution of the random variable $R^G = H(S_{x(\cdot)}(T_{x(\cdot)}))$ also does not depend on $x(\cdot)$ and the survival function of $R^G$ is $G$.

Definition 3. The random variable $R^G$ is called the $G$-resource and the number $H(S_{x(\cdot)}(t))$ is called the $G$-resource used until the moment $t$.

If identical populations of items operate under different stresses $x_1(\cdot)$ and $x_2(\cdot)$, then independently of the resource choice the moments [...]

$$H(p) = -\ln p, \qquad H(S_{x(\cdot)}(t)) = -\ln S_{x(\cdot)}(t).$$
The choice of exponential resource is due to the fact that the exponential resource usage rate is the hazard rate

$$\alpha_{x(\cdot)}(t) = \lim_{h \downarrow 0} \frac{1}{h} P\{T_{x(\cdot)} \in (t, t+h] \mid T_{x(\cdot)} > t\} = -\frac{S'_{x(\cdot)}(t)}{S_{x(\cdot)}(t)}$$

and the used resource is the accumulated hazard rate

$$A_{x(\cdot)}(t) = \int_0^t \alpha_{x(\cdot)}(u)\,du = -\ln\{S_{x(\cdot)}(t)\}$$

under the stress $x(\cdot)$. These notions have a good interpretation. Then the (exponential) resource

$$R = A_{x(\cdot)}(T_{x(\cdot)})$$

has the standard exponential distribution with the survival function $G(t) = e^{-t}$, $t \ge 0$. The resource $R$ takes values in the interval $[0, \infty)$ and doesn't depend on $x(\cdot)$: for any $t$ the number $A_{x(\cdot)}(t) \in [0, \infty)$ is the exponential resource used until the moment $t$ under the stress $x(\cdot)$ (see Fig. 1); the rate of exponential resource usage is the hazard rate $\alpha_{x(\cdot)}(t)$, which for any moment $t$ shows the risk of failure just after this moment for items which survived until $t$. Sometimes the meaning of certain models in terms of other resources will be discussed. This will be done when the formulation of a model in terms of a non-exponential resource is simpler or easier to understand.
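The claim that neither resource depends on the stress can be checked by simulation. The sketch below is only an illustration under stated assumptions: lifetimes are taken to be Weibull, and each "stress" is represented by a hypothetical (scale, shape) pair. Neither choice comes from the text. For each simulated failure time it computes the uniform resource $F(T)$ and the exponential resource $A(T) = -\ln S(T)$.

```python
import math
import random


def weibull_survival(t, scale, shape):
    """Survival function S(t) = exp(-(t/scale)**shape) (assumed lifetime law)."""
    return math.exp(-((t / scale) ** shape))


def accumulated_hazard(t, scale, shape):
    """Accumulated hazard A(t) = -ln S(t) = (t/scale)**shape."""
    return (t / scale) ** shape


def simulate_resources(n, scale, shape, seed=0):
    """Draw n failure times and return the uniform and exponential resources."""
    rng = random.Random(seed)
    uniform, exponential = [], []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # failure time T under this stress
        uniform.append(1.0 - weibull_survival(t, scale, shape))  # R^U = F(T)
        exponential.append(accumulated_hazard(t, scale, shape))  # R = A(T)
    return uniform, exponential


def mean(values):
    return sum(values) / len(values)
```

Whatever (scale, shape) pair is used, the sample mean of the uniform resource should be close to 1/2 and that of the exponential resource close to 1, as expected for the uniform law on $[0,1)$ and the standard exponential law.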
Fig. 1.

Remark 1. We'll give definitions of models for deterministic stresses. If the stress is a stochastic process $X(t)$, $t \ge 0$, and $T_{X(\cdot)}$ is the time-to-failure under $X(\cdot)$, then denote by

$$S_{x(\cdot)}(t) = P\{T_{X(\cdot)} > t \mid X(s) = x(s),\ 0 \le s \le t\}, \quad \alpha_{x(\cdot)}(t) = -\frac{S'_{x(\cdot)}(t)}{S_{x(\cdot)}(t)}, \quad A_{x(\cdot)}(t) = -\ln\{S_{x(\cdot)}(t)\} \qquad (0)$$

the conditional survival, hazard rate and accumulated hazard rate functions. In this case definitions of models should be understood in terms of these conditional functions.

2. Generalized Sedyakin's model

Definition of the model
The first idea which comes to mind when modelling the influence of a stress on the lifetime distribution is to suppose that the rate of resource usage $\alpha_{x(\cdot)}(t)$ at any moment $t$ depends on the value of the stress $x(t)$ at this moment and on the resource $A_{x(\cdot)}(t)$ used until $t$. It is formalized by the following definition.

Definition 4. The generalized Sedyakin's (GS) model (Bagdonavicius (1978)) holds on a set of stresses $E$ if there exists a function $g$, positive on $E \times \mathbb{R}_+$, such that for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = g\big(x(t), A_{x(\cdot)}(t)\big). \qquad (1)$$

If the stress is a random process, we denote by $E$ the space of trajectories of this process and consider the conditional functions (0) in all definitions which follow. Definition 4 implies that the accumulated hazard rate (or the used resource) satisfies the integral equation

$$A_{x(\cdot)}(t) = \int_0^t g\big(x(u), A_{x(\cdot)}(u)\big)\,du. \qquad (2)$$

In the following subsections it will be shown that if the GS model holds then the survival functions under step-stresses can be written in terms of survival functions under constant stresses.
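The integral equation (2) can be explored numerically. The following sketch integrates $A'(t) = g(x, A(t))$ with a classical fourth-order Runge-Kutta scheme while the stress is held constant, starting from an arbitrary amount of already used resource. The function `g` passed in by the caller, the helper name `integrate_resource` and the step count are illustrative assumptions, not part of the source. Chaining two calls gives the used resource under a stress that switches once.

```python
import math


def integrate_resource(g, x, a_start, duration, n_steps=1000):
    """Used resource after `duration` under constant stress x.

    Integrates A' = g(x, A) from A = a_start with the classical RK4 scheme.
    Under the GS model the future of A depends only on the current stress
    and the current value of A, so the start time itself is irrelevant.
    """
    h = duration / n_steps
    a = a_start
    for _ in range(n_steps):
        k1 = g(x, a)
        k2 = g(x, a + 0.5 * h * k1)
        k3 = g(x, a + 0.5 * h * k2)
        k4 = g(x, a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


def step_stress_resource(g, x1, x2, t1, t):
    """Used resource at time t >= t1 when the stress switches from x1 to x2 at t1."""
    a1 = integrate_resource(g, x1, 0.0, t1)      # resource used on [0, t1)
    return integrate_resource(g, x2, a1, t - t1)  # continue under x2 from A = a1
```

With the hypothetical choice $g(x, A) = x(1 + A)$ the constant-stress solution is $A_x(t) = e^{xt} - 1$, so the output can be compared with a closed form; this is only a convenient test case.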
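Simple step-stress

Let $E_0$ be a set of constant in time stresses and $E_1$ be a set of simple step-stresses of the form

$$x(\tau) = \begin{cases} x_1, & 0 \le \tau < t_1, \\ x_2, & \tau \ge t_1, \end{cases} \qquad (3)$$

where $x_1, x_2 \in E_0$.

Proposition 1. If the GS model holds on $E_1$ then the survival function and the hazard rate under the stress $x(\cdot) \in E_1$ verify the equalities

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1, \\ S_{x_2}(t - t_1 + t_1^*), & t \ge t_1, \end{cases} \qquad (4)$$

and

$$\alpha_{x(\cdot)}(t) = \begin{cases} \alpha_{x_1}(t), & 0 \le t < t_1, \\ \alpha_{x_2}(t - t_1 + t_1^*), & t \ge t_1, \end{cases} \qquad (5)$$

respectively; the moment $t_1^*$ is determined by the equality $S_{x_1}(t_1) = S_{x_2}(t_1^*)$.

Proof of Proposition 1. Put $a = A_{x_1}(t_1) = A_{x(\cdot)}(t_1) = A_{x_2}(t_1^*)$. The equalities (2)-(3) imply that for all $t \ge t_1$

$$A_{x(\cdot)}(t) = a + \int_{t_1}^{t} g\big(x_2, A_{x(\cdot)}(u)\big)\,du$$

and

$$A_{x_2}(t - t_1 + t_1^*) = a + \int_{t_1^*}^{t - t_1 + t_1^*} g\big(x_2, A_{x_2}(u)\big)\,du = a + \int_{t_1}^{t} g\big(x_2, A_{x_2}(u - t_1 + t_1^*)\big)\,du.$$

It implies that for all $t \ge t_1$ the functions $A_{x(\cdot)}(t)$ and $A_{x_2}(t - t_1 + t_1^*)$ satisfy the integral equation

$$h(t) = a + \int_{t_1}^{t} g\big(x_2, h(u)\big)\,du$$

with the initial condition $h(t_1) = a$. The solution of this equation is unique, therefore we have

$$A_{x(\cdot)}(t) = A_{x_2}(t - t_1 + t_1^*) \quad \text{for all } t \ge t_1. \qquad (6)$$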
It implies the equalities (4) and (5). The proof is complete.

Corollary 1. Under the conditions of Proposition 1, for any $s > 0$,

$$P\{T_{x(\cdot)} > t_1 + s \mid T_{x(\cdot)} > t_1\} = P\{T_{x_2} > t_1^* + s \mid T_{x_2} > t_1^*\}. \qquad (7)$$
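A small numerical illustration of Proposition 1 and of the identity (7) follows. It assumes, purely for illustration, that the lifetime under each constant stress is Weibull and that a stress is described by a hypothetical (scale, shape) pair; with this choice the equation $S_{x_1}(t_1) = S_{x_2}(t_1^*)$ can be solved in closed form.

```python
import math


def survival(t, scale, shape):
    """Constant-stress survival function S_x(t) = exp(-(t/scale)**shape) (assumed)."""
    return math.exp(-((t / scale) ** shape))


def equivalent_time(t1, stress1, stress2):
    """Solve S_{x1}(t1) = S_{x2}(t1*) for t1*, i.e. match the used resources."""
    scale1, shape1 = stress1
    scale2, shape2 = stress2
    used_resource = (t1 / scale1) ** shape1          # A_{x1}(t1)
    return scale2 * used_resource ** (1.0 / shape2)  # A_{x2}(t1*) = A_{x1}(t1)


def step_stress_survival(t, t1, stress1, stress2):
    """Survival function under the simple step-stress (3), following (4)."""
    if t < t1:
        return survival(t, *stress1)
    t1_star = equivalent_time(t1, stress1, stress2)
    return survival(t - t1 + t1_star, *stress2)
```

For example, with `stress1 = (10.0, 1.5)`, `stress2 = (4.0, 2.0)` and `t1 = 3.0`, the ratio `step_stress_survival(t1 + s, ...) / step_stress_survival(t1, ...)` can be compared with `survival(t1_star + s, *stress2) / survival(t1_star, *stress2)`; the two sides of (7) should agree up to rounding.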
The model (7) was proposed by Sedyakin (1966). The equality $S_{x_1}(t_1) = S_{x_2}(t_1^*)$ and the definition of the resource imply that for two groups of identical items, observed under $x_1$ and $x_2$, respectively, the moments $t_1$ and $t_1^*$, respectively, are equivalent in the sense of resource usage. The equality (4) implies that for any $s > 0$

$$S_{x(\cdot)}(t_1 + s) = S_{x_2}(t_1^* + s).$$

Thus the GS model implies that if two identical populations of items under different stresses use the same resources until the moments $t_1$ and $t_1^*$, respectively, and after these moments both populations operate under the same stress, then the rates of the resource usage of these populations in the intervals [...]

$$x(\tau) = \begin{cases} x_1, & 0 \le \tau < t_1, \\ x_2, & t_1 \le \tau < t_2, \\ \ \vdots \\ x_m, & t_{m-1} \le \tau < t_m, \end{cases} \qquad (8)$$
where $x_1, \ldots, x_m \in E_0$. Put $t_0 = 0$.

Proposition 2. If the GS model holds on $E_m$ then the survival function $S_{x(\cdot)}(t)$ verifies the equalities

$$S_{x(\cdot)}(t) = S_{x_i}(t - t_{i-1} + t_{i-1}^*), \quad \text{if } t \in [t_{i-1}, t_i), \quad (i = 1, 2, \ldots, m), \qquad (9)$$

where $t_0^* = 0$ and the $t_i^*$ can be found by solving the equations

$$S_{x_1}(t_1) = S_{x_2}(t_1^*), \ \ldots, \ S_{x_i}(t_i - t_{i-1} + t_{i-1}^*) = S_{x_{i+1}}(t_i^*), \quad (i = 1, \ldots, m-1). \qquad (10)$$
Proof. Proposition 1 implies that Proposition 2 is true for $m = 2$, i.e. we have

$$A_{x(\cdot)}(t) = A_{x_2}(t - t_1 + t_1^*) \quad \text{for all } t \in [t_1, t_2),$$

where $A_{x_1}(t_1) = A_{x_2}(t_1^*)$. Suppose that Proposition 2 is true for $m = j - 1$. Then

$$A_{x(\cdot)}(t) = A_{x_i}(t - t_{i-1} + t_{i-1}^*), \quad \text{if } t \in [t_{i-1}, t_i), \quad (i = 1, \ldots, j-1), \qquad (11)$$

where

$$A_{x_i}(t_i - t_{i-1} + t_{i-1}^*) = A_{x_{i+1}}(t_i^*), \quad (t_0 = 0,\ i = 1, \ldots, j-2).$$

We'll prove that Proposition 2 is true for $m = j$. Continuity of the function $A_{x(\cdot)}(t)$ and the equalities (11) imply that

$$A_{x(\cdot)}(t_{j-1}) = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*).$$

So the equation (2) implies that for all $t \in [t_{j-1}, t_j)$

$$A_{x(\cdot)}(t) = A_{x(\cdot)}(t_{j-1}) + \int_{t_{j-1}}^{t} g\big(x_j, A_{x(\cdot)}(u)\big)\,du = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*) + \int_{t_{j-1}}^{t} g\big(x_j, A_{x(\cdot)}(u)\big)\,du. \qquad (12)$$

The definition of $t_{j-1}^*$, given in (10), and the equation (2) imply that for all $t \in [t_{j-1}, t_j)$

$$A_{x_j}(t - t_{j-1} + t_{j-1}^*) = A_{x_j}(t_{j-1}^*) + \int_{t_{j-1}^*}^{t - t_{j-1} + t_{j-1}^*} g\big(x_j, A_{x_j}(u)\big)\,du = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*) + \int_{t_{j-1}}^{t} g\big(x_j, A_{x_j}(u - t_{j-1} + t_{j-1}^*)\big)\,du. \qquad (13)$$

The equalities (12) and (13) imply that the functions $A_{x(\cdot)}(t)$ and $A_{x_j}(t - t_{j-1} + t_{j-1}^*)$ satisfy the integral equation

$$h(t) = b + \int_{t_{j-1}}^{t} g\big(x_j, h(u)\big)\,du \quad \text{for all } t \in [t_{j-1}, t_j),$$

with the initial condition $h(t_{j-1}) = b = A_{x_{j-1}}(t_{j-1} - t_{j-2} + t_{j-2}^*)$. The solution of this equation is unique, therefore for all $t \in [t_{j-1}, t_j)$ we have

$$A_{x(\cdot)}(t) = A_{x_j}(t - t_{j-1} + t_{j-1}^*).$$

The proof is complete.

Remark 2. In the statistical literature (see Nelson (1990)) the model (9) is called the basic cumulative exposure model. In terms of graphs of the accumulated hazard rate functions $A_{x(\cdot)}(t)$ (thick curve) and $A_{x_i}(t)$ ($m = 3$, $i = 1, 2, 3$) the result of Proposition 2 is illustrated by Fig. 2.
Fig. 2.

Remark 3. The GS model assumes that the failure rate $\alpha_{x(\cdot)}(t)$ at any moment $t$ depends only on the resource accumulated until this moment (or, equivalently, on the proportion of items failed until $t$) and on the value of the stress applied at this moment $t$. In situations of periodic and quick change of the stress level, or when there are many life-shortening switch-ons and switch-offs of the stress, this model is not appropriate. We'll consider generalizations later. There are no methods of estimation for this model. What is the region of application of this model? Suppose that the model is parametric and that it is impossible to obtain a complete sample under the "normal" operating conditions of the items.
When right censored data are used, goodness-of-fit tests can check that the left tail of the survival distribution agrees well with the chosen model. But estimates of $p$-quantiles with $p$ near unity are often needed, and a bad choice of the model can then lead to big mistakes. The model of Sedyakin can help to solve this problem. If stepwise stresses are used, it is possible to obtain failures of items at the end of life under the "normal" conditions and therefore to test whether the right tail belongs to the specified class of distributions. A test for Sedyakin's model can be found in Bagdonavicius & Nikoulina (1997).

3. Additive accumulation of damages model

Definition of the model

Consider the following important particular case of the GS model. We now suppose that the rate of resource usage $\alpha_{x(\cdot)}(t)$ at any moment $t$ is proportional to a function of the stress applied at this moment and to a function of the resource used until $t$. It is formalized by the following definition.

Definition 5. The additive accumulation of damages (AAD) model (Bagdonavicius (1978)) holds on $E$ if there exist a positive function $r$ on $E$ and a positive function $q$ on $[0,\infty)$ such that for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\, q\{A_{x(\cdot)}(t)\}. \qquad (14)$$

Proposition 3. Suppose that the integral
$$\int_0^x \frac{dv}{q(v)}$$

converges for all $x > 0$. The AAD model holds on the set of stresses $E$ iff there exists a survival function $G$ such that

$$S_{x(\cdot)}(t) = G\left(\int_0^t r\{x(\tau)\}\,d\tau\right). \qquad (15)$$

The inverse $H = G^{-1}$ of the function $G$ is defined by the equality

$$H(p) = \int_0^{-\ln p} \frac{dv}{q(v)}.$$

Proof. The equation (14) is equivalent to the integral equation

$$\int_0^{A_{x(\cdot)}(t)} \frac{dv}{q(v)} = \int_0^t r\{x(u)\}\,du.$$

Since $A_{x(\cdot)}(t) = -\ln S_{x(\cdot)}(t)$, the left side equals $H(S_{x(\cdot)}(t))$ and the result follows immediately.

The name of the AAD model is implied by the following considerations. Fix any constant in time stress $x_0$; for example, let $x_0$ be the usual stress. Then under the AAD model $S_{x_0}(t) = G(r(x_0)t)$, and putting $\rho(x) = r(x)/r(x_0)$ we obtain

$$S_{x(\cdot)}(t) = S_{x_0}\left(\int_0^t \rho\{x(\tau)\}\,d\tau\right).$$

The $S_{x_0}$-resource is

$$R = S_{x_0}^{-1}\left(S_{x(\cdot)}(T_{x(\cdot)})\right) = \int_0^{T_{x(\cdot)}} \rho\{x(\tau)\}\,d\tau.$$

Its survival function is $S_{x_0}$ and thus it is stochastically equivalent to the time-to-failure under the usual stress. The decrease of the resource in the interval $[\tau, \tau + d\tau]$ we'll call the damage. For the AAD model damages have the form $\rho\{x(\tau)\}\,d\tau$ and are linear functions of $d\tau$. Thus the resource is used by the linear accumulation of damages.
Remark 4. The AAD model written in the form (15) is also called the accelerated failure time (AFT) model (Cox and Oakes (1984)).

Constant stresses

If $x(\tau) = x = \mathrm{const}$ then the AAD model gives

$$S_x(t) = G\{r(x)t\}. \qquad (16)$$
Thus different constant in time stresses change only the scale of the distribution. Applicability of this model in accelerated life testing was first noted by Pieruschka (1961). It is the simplest and the most used model in accelerated life testing.

Simple step-stresses

Consider the properties of the survival functions under the step-stresses. As before, we denote by $E_1$ and $E_m$ the sets of stresses of the form (3) and (8), respectively.

Proposition 4. If the AAD model holds on $E_1$ then the survival function under the stress $x(\cdot)$ verifies the equality

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1, \\ S_{x_2}(t - t_1 + t_1^*), & t \ge t_1, \end{cases} \qquad (17)$$

where

$$t_1^* = \frac{r(x_1)}{r(x_2)}\, t_1.$$

Proof. The equality (16) implies that

$$S_{x_1}(t) = S_{x_2}\left(\frac{r(x_1)}{r(x_2)}\, t\right),$$

therefore the moment $t_1^*$, defined in Proposition 1, is $t_1^* = \frac{r(x_1)}{r(x_2)}\, t_1$.

General step-stress

Proposition 5. If the AAD model holds on $E_m$ then the survival function $S_{x(\cdot)}(t)$ verifies the equalities:

$$S_{x(\cdot)}(t) = G\left(\sum_{j=1}^{i-1} r(x_j)(t_j - t_{j-1}) + r(x_i)(t - t_{i-1})\right) = S_{x_i}\left(t - t_{i-1} + \frac{1}{r(x_i)} \sum_{j=1}^{i-1} r(x_j)(t_j - t_{j-1})\right),$$

if $t \in [t_{i-1}, t_i)$, $(i = 1, 2, \ldots, m)$. $\qquad (18)$
Proof. The first equality is implied by the formula (15) and the form (8) of the stress. The second is implied by the formula (16). The proof is complete.

Remark 5. If the AAD model holds on $E_m$, the moment $t_i^*$ defined by (10) has the form

$$t_i^* = \frac{1}{r(x_{i+1})} \sum_{j=1}^{i} r(x_j)(t_j - t_{j-1}). \qquad (19)$$

Relations between the means and the quantiles
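The cumulative-exposure form of the AAD model for a step-stress can be checked numerically. The Python fragment below is only an illustrative sketch: the function names, the Weibull-type resource survival function and the values chosen for $r(x_1)$, $r(x_2)$ and $t_1$ are assumptions made for the example, not quantities taken from the text.

```python
import math

def aad_survival(t, change_points, rates, G):
    """S_{x(.)}(t) = G(integral_0^t r{x(tau)} dtau) for a piecewise-constant stress.

    change_points: increasing switch times t_1 < ... < t_{m-1}
    rates:         r(x_1), ..., r(x_m), one value per stress level
    G:             survival function of the resource
    """
    exposure, start = 0.0, 0.0
    for end, rate in zip(list(change_points) + [math.inf], rates):
        if t <= end:
            exposure += rate * (t - start)
            break
        exposure += rate * (end - start)
        start = end
    return G(exposure)

def weibull_G(u, shape=1.5):
    """Hypothetical resource survival function G(u) = exp(-u**shape)."""
    return math.exp(-u ** shape)

# Hypothetical simple step-stress: r(x1) = 1, r(x2) = 3, switch at t1 = 2.
r1, r2, t1, t = 1.0, 3.0, 2.0, 3.0
s_exposure = aad_survival(t, [t1], [r1, r2], weibull_G)

# The same value through the shifted-time form S_{x2}(t - t1 + t1*),
# with t1* = r(x1) / r(x2) * t1 and S_{x2}(u) = G(r(x2) * u).
t1_star = r1 / r2 * t1
s_shifted = weibull_G(r2 * (t - t1 + t1_star))
```

Both routes accumulate the same exposure, so `s_exposure` and `s_shifted` agree up to rounding.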
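Suppose that $x(\cdot)$ is a time-varying stress and denote by $t_{x(\cdot)}(p)$ the $p$-quantile of the random variable $T_{x(\cdot)}$, $0 < p < 1$. In the next proposition $x(\tau)$ denotes both the value of the stress $x(\cdot)$ at the point $\tau$ and the constant in time stress equal to this value.

Proposition 6. Suppose that the AAD model holds on $E$ and $x(\cdot), x(\tau) \in E$ for all $\tau \ge 0$. Then

$$\int_0^{t_{x(\cdot)}(p)} \frac{d\tau}{t_{x(\tau)}(p)} = 1. \qquad (20)$$

If the means $\mathbf{E}T_{x(\cdot)}$, $\mathbf{E}T_{x(\tau)}$ exist then

$$\mathbf{E}\left(\int_0^{T_{x(\cdot)}} \frac{d\tau}{\mathbf{E}T_{x(\tau)}}\right) = 1. \qquad (21)$$

The model (21) is the model of Miner (1949).

Proof. If the AAD model holds, the equality (15) implies that the $G$-resource used until $t$ is

$$f^G_{x(\cdot)}(t) = H(S_{x(\cdot)}(t)) = \int_0^t r\{x(s)\}\,ds. \qquad (22)$$

The resource $R$ has the form

$$R = \int_0^{T_{x(\cdot)}} r\{x(s)\}\,ds. \qquad (23)$$

The distribution of $R$, with the survival function $G$, doesn't depend on $x(\cdot)$. Taking the constant in time stress $x(\tau)$, equal to the value of the stress $x(\cdot)$ at the moment $\tau$, we obtain

$$R = \int_0^{T_{x(\tau)}} r\{x(\tau)\}\,ds = r\{x(\tau)\}\,T_{x(\tau)}. \qquad (24)$$

Taking the means of both sides we get

$$\mathbf{E}R = r\{x(\tau)\}\,\mathbf{E}T_{x(\tau)}. \qquad (25)$$

The equalities (23) and (25) imply

$$\mathbf{E}R = \mathbf{E}\left[\int_0^{T_{x(\cdot)}} r\{x(\tau)\}\,d\tau\right] = \mathbf{E}\left[\int_0^{T_{x(\cdot)}} \frac{\mathbf{E}R}{\mathbf{E}T_{x(\tau)}}\,d\tau\right] = \mathbf{E}R \cdot \mathbf{E}\int_0^{T_{x(\cdot)}} \frac{d\tau}{\mathbf{E}T_{x(\tau)}},$$

and the equality (21) is obtained.

Denote by $t(p)$ the $p$-quantile of the resource $R$. If $\tau$ is fixed, the equality (16) implies

$$t(p) = r\{x(\tau)\}\,t_{x(\tau)}(p). \qquad (26)$$

Using the definition of the resource we have

$$p = \mathbf{P}\{T_{x(\cdot)} \le t_{x(\cdot)}(p)\} = \mathbf{P}\left\{\int_0^{T_{x(\cdot)}} r\{x(\tau)\}\,d\tau \le \int_0^{t_{x(\cdot)}(p)} r\{x(\tau)\}\,d\tau\right\} = \mathbf{P}\left\{R \le \int_0^{t_{x(\cdot)}(p)} r\{x(\tau)\}\,d\tau\right\}. \qquad (27)$$

The equalities (27) and (26) imply that

$$t(p) = \int_0^{t_{x(\cdot)}(p)} r\{x(\tau)\}\,d\tau = t(p) \int_0^{t_{x(\cdot)}(p)} \frac{d\tau}{t_{x(\tau)}(p)},$$

and hence the equality (20) is obtained. The proof is completed.

Corollary 2. For the stress (8) the formula (21) implies the equality

$$\sum_{k=1}^{m} \frac{\mathbf{E}T_k}{\mathbf{E}T_{x_k}} = 1, \qquad (28)$$

where

$$T_k = \begin{cases} 0, & T_{x(\cdot)} < t_{k-1}, \\ T_{x(\cdot)} - t_{k-1}, & t_{k-1} \le T_{x(\cdot)} < t_k, \\ t_k - t_{k-1}, & T_{x(\cdot)} \ge t_k, \end{cases}$$

is the life of an item, tested under the stress $x(\cdot)$, in the interval $[t_{k-1}, t_k)$. For the stress (8) the formula (20) implies that for $t_{x(\cdot)}(p) \in [t_{k-1}, t_k)$ the following equality is true:

$$\sum_{i=1}^{k-1} \frac{t_i - t_{i-1}}{t_{x_i}(p)} + \frac{t_{x(\cdot)}(p) - t_{k-1}}{t_{x_k}(p)} = 1. \qquad (29)$$

The model (29) is the model of Peshes-Stepanova (see Kartashov, 1979). So all of the models (16), (18), (20), (21), (28), (29) are implied by the AAD model and illustrate properties of this model.

In the case $m = 2$, the formula (28) can be written in the form

$$\frac{\mathbf{E}T_1}{\mathbf{E}T_{x_1}} + \frac{\mathbf{E}T_2}{\mathbf{E}T_{x_2}} = 1, \qquad (30)$$

and the formula (29) can be written in the form

$$\frac{t_1}{t_{x_1}(p)} + \frac{t_{x(\cdot)}(p) - t_1}{t_{x_2}(p)} = 1, \quad \text{if } t_{x(\cdot)}(p) > t_1. \qquad (31)$$

So

$$\mathbf{E}T_{x_1} = \frac{\mathbf{E}T_1}{1 - \mathbf{E}T_2 / \mathbf{E}T_{x_2}} \qquad (32)$$

and

$$t_{x_1}(p) = \frac{t_1}{1 - \left(t_{x(\cdot)}(p) - t_1\right)/t_{x_2}(p)}, \quad \text{if } t_{x(\cdot)}(p) > t_1. \qquad (33)$$

Thus, if the AAD model holds on $E_1$, then $\mathbf{E}T_1$, $\mathbf{E}T_2$ and $\mathbf{E}T_{x_2}$ determine $\mathbf{E}T_{x_1}$, and $t_{x(\cdot)}(p)$ and $t_{x_2}(p)$ determine $t_{x_1}(p)$.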
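The quantile relation for a simple step-stress can be illustrated with a small numerical check. The sketch below assumes a Weibull-type resource survival function $G(u) = \exp(-u^{a})$ and hypothetical values of $r(x_1)$, $r(x_2)$, $t_1$ and $p$; all names are introduced only for this example. It computes the $p$-quantile under the step-stress from the accumulated exposure, verifies that the two quantile fractions sum to one, and then recovers the quantile under $x_1$ from the step-stress and $x_2$ quantiles.

```python
import math

def resource_quantile(p, shape):
    """p-quantile of the resource R when G(u) = exp(-u**shape)."""
    return (-math.log(1.0 - p)) ** (1.0 / shape)

def step_stress_quantile(p, t1, r1, r2, shape):
    """p-quantile of T under the stress x1 on [0, t1), x2 afterwards (AAD model)."""
    tp = resource_quantile(p, shape)
    if r1 * t1 >= tp:                 # the quantile is reached before the switch
        return tp / r1
    return t1 + (tp - r1 * t1) / r2   # remaining exposure is used at rate r2

# Hypothetical values, chosen so that the quantile falls after the switch.
p, t1, r1, r2, shape = 0.9, 0.5, 1.0, 4.0, 2.0

t_x1 = resource_quantile(p, shape) / r1      # quantile under constant stress x1
t_x2 = resource_quantile(p, shape) / r2      # quantile under constant stress x2
t_step = step_stress_quantile(p, t1, r1, r2, shape)

# t1 / t_{x1}(p) + (t_{x(.)}(p) - t1) / t_{x2}(p) should equal 1.
lhs = t1 / t_x1 + (t_step - t1) / t_x2

# Recover t_{x1}(p) from quantities observable under x(.) and x2 only.
t_x1_recovered = t1 / (1.0 - (t_step - t1) / t_x2)
```

In an accelerated test only `t_step` and `t_x2` would be estimated from data; the last line shows how the quantile under the usual stress would then follow under the AAD assumption.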
4. Proportional hazards model

Definition of the model

In survival analysis the most used model describing the influence of covariates on the lifetime distribution is the proportional hazards or Cox model, introduced by Cox (1972). In terms of stresses it is formulated as follows.

Definition 6. The proportional hazards (PH) model holds on a set of stresses $E$ if for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,\alpha_0(t), \qquad (34)$$

where $\alpha_0(t)$ is the baseline hazard rate.

The model means that the rate of resource usage at any moment $t$ is proportional to some function of the stress applied at this moment and to a baseline rate which does not depend on the stress. Under the PH model the resource used until the moment $t$ has the form

$$A_{x(\cdot)}(t) = \int_0^t r\{x(u)\}\,dA_0(u),$$

where $A_0(t) = \int_0^t \alpha_0(u)\,du$. In terms of survival functions the PH model is written:

$$S_{x(\cdot)}(t) = \exp\left\{-\int_0^t r\{x(u)\}\,dA_0(u)\right\}. \qquad (35)$$
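For a piecewise-constant stress the integral in the PH survival function reduces to a finite sum of increments of the baseline cumulative hazard. The following Python sketch illustrates this; the function names, the baseline $A_0(t) = t^2$ and the numerical values are assumptions chosen for the example.

```python
import math

def ph_survival(t, change_points, rates, A0):
    """exp{-integral_0^t r{x(u)} dA0(u)} for a piecewise-constant stress.

    change_points: increasing switch times of the stress
    rates:         r(x_1), ..., r(x_m), one value per stress level
    A0:            baseline cumulative hazard, A0(0) = 0
    """
    total, start = 0.0, 0.0
    for end, rate in zip(list(change_points) + [math.inf], rates):
        stop = min(t, end)
        total += rate * (A0(stop) - A0(start))
        if t <= end:
            break
        start = end
    return math.exp(-total)

def weibull_A0(t, shape=2.0):
    """Hypothetical Weibull baseline cumulative hazard A0(t) = t**shape."""
    return t ** shape

# Step-stress with r(x1) = 1 on [0, 2) and r(x2) = 3 afterwards, evaluated at t = 3:
# the exponent is 1 * (4 - 0) + 3 * (9 - 4) = 19.
s_step = ph_survival(3.0, [2.0], [1.0, 3.0], weibull_A0)

# Constant stress with a Weibull baseline: the PH factor r acts as a change of
# time scale, S_0(t)**r = S_0(r**(1/shape) * t).
r, t = 3.0, 0.7
s_ph = ph_survival(t, [], [r], weibull_A0)
s_rescaled = math.exp(-weibull_A0(r ** 0.5 * t))
```

The last two values coincide, which is the Weibull case in which a PH model is at the same time a scale (AAD) model for constant stresses.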
Relations between the PH and the AAD models

When is the PH model also an AAD model? The answer is given in the following two propositions.

Proposition 7. Let $E_0$ be a set of constant stresses such that the set $r(E_0)$ has an interior point. Suppose that the PH model holds on $E_0$. The AAD model also holds on $E_0$ iff the time-to-failure distribution is Weibull for all $x \in E_0$.

Proof. 1) If the time-to-failure distribution is Weibull and the PH model holds on $E_0$ then for all $x \in E_0$

$$S_x(t) = \exp\left\{-\left(\frac{t}{\theta(x)}\right)^{\alpha(x)}\right\} = S_0(t)^{r(x)}. \qquad (36)$$

Taking two times the logarithms of both sides, we obtain that for all $t > 0$

$$\alpha(x)\,(\ln t - \ln \theta(x)) = \ln r(x) + \ln(-\ln S_0(t)). \qquad (37)$$

The function $\ln(-\ln S_0(t))$ doesn't depend on $x$, so $\alpha(x) = \alpha = \mathrm{const}$ for all $x \in E_0$, which implies

$$S_x(t) = \exp\left\{-\left(\frac{t}{\theta(x)}\right)^{\alpha}\right\}, \qquad (38)$$

i.e. the AAD model holds on $E_0$.

2) Suppose that both the PH and AAD models hold on $E_0$. It means that there exist functions $S_0$, $S_1$, $r$ and $\rho$ such that for all $x \in E_0$

$$S_1(\rho(x)t) = S_0(t)^{r(x)}. \qquad (39)$$

Taking two times the logarithms of both sides, we obtain that for all $t > 0$

$$\ln\{-\ln S_1(\rho(x)t)\} = \ln r(x) + \ln(-\ln S_0(t)). \qquad (40)$$

Put

$$g_1(v) = \ln(-\ln S_1(e^v)), \quad g_0(v) = \ln(-\ln S_0(e^v)), \quad \alpha(x) = \ln \rho(x), \quad \beta(x) = \ln r(x).$$

The equality (40) can be written in the following way: for all $u \in \mathbf{R}$, $x \in E_0$

$$g_1(u + \alpha(x)) = \beta(x) + g_0(u).$$

The set $r(E_0)$ has an interior point, i.e. contains an interval, so the set $\rho(E_0)$ also has an interior point. Take $x_1, x_2, x_3 \in E_0$ such that $\rho(x_2)/\rho(x_1) \ne \rho(x_3)/\rho(x_2)$. For all $i, j = 1, 2, 3$

$$g_1(u + \alpha(x_i)) - g_1(u + \alpha(x_j)) = \beta(x_i) - \beta(x_j). \qquad (41)$$

Put

$$k_1 = \alpha(x_2) - \alpha(x_1), \quad k_2 = \alpha(x_3) - \alpha(x_2), \quad l_1 = \beta(x_2) - \beta(x_1), \quad l_2 = \beta(x_3) - \beta(x_2).$$

For all $v \in \mathbf{R}$

$$g_1(v + k_i) = g_1(v) + l_i \quad (i = 1, 2;\ k_1 \ne k_2). \qquad (42)$$

It implies that

$$g_1(v) = av + b, \quad S_1(t) = \exp\{-e^b t^a\}, \qquad (43)$$

and consequently

$$S_x(t) = \exp\{-e^b (\rho(x)t)^a\}. \qquad (44)$$
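So the lifetime distribution is Weibull for all $x \in E_0$. The proof is complete.

Suppose that $E_0$ is the set of constant stresses defined in Proposition 7, $x_1, x_2 \in E_0$ are two fixed constant stresses and a step-stress $x_s(\cdot)$ has the form

$$x_s(\tau) = \begin{cases} x_1, & 0 \le \tau \le s, \\ x_2, & \tau > s, \end{cases} \qquad (45)$$

where $s$ is a fixed positive number.

Proposition 8. Suppose that the PH model holds on the set $E$ including $E_0$ and $x_s(\cdot)$. The AAD model also holds on $E$ iff the time-to-failure is exponential for all $x \in E_0$.

Proof. Proposition 7 implies that for all $x \in E_0$

$$S_x(t) = \exp\left\{-\left(\frac{t}{\theta(x)}\right)^{\alpha}\right\}. \qquad (46)$$

Put $\theta_i = \theta(x_i)$, $i = 1, 2$. Then

$$S_{x_i}(t) = \exp\left\{-\left(\frac{t}{\theta_i}\right)^{\alpha}\right\}, \quad \alpha_{x_i}(t) = \frac{\alpha\, t^{\alpha-1}}{\theta_i^{\alpha}}. \qquad (47)$$

The PH model implies that

$$\alpha_{x_s(\cdot)}(t) = \begin{cases} \alpha_{x_1}(t), & t \le s, \\ \alpha_{x_2}(t), & t > s, \end{cases}$$

and for all $t > s$

$$S_{x_s(\cdot)}(t) = \exp\left\{-\int_0^t \alpha_{x_s(\cdot)}(u)\,du\right\} = \exp\left\{-\int_0^s \alpha_{x_1}(u)\,du - \int_s^t \alpha_{x_2}(u)\,du\right\} = \exp\left\{-\left(\frac{s}{\theta_1}\right)^{\alpha} - \left(\frac{t}{\theta_2}\right)^{\alpha} + \left(\frac{s}{\theta_2}\right)^{\alpha}\right\}. \qquad (48)$$

1) Suppose that both the PH and the AAD models hold on $E$. Then (46) and (16) imply that $G(t) = e^{-t^{\alpha}}$, $r(x_i) = 1/\theta_i$, and (18) implies that for all $t > s$

$$S_{x_s(\cdot)}(t) = \exp\left\{-\left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha}\right\}.$$

Comparing this with (48) we obtain that for all $t > s$

$$\left(\frac{s}{\theta_1}\right)^{\alpha} + \left(\frac{t}{\theta_2}\right)^{\alpha} - \left(\frac{s}{\theta_2}\right)^{\alpha} = \left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha}. \qquad (49)$$

If $\alpha = 1$, this equality is verified. Suppose that $\alpha \ne 1$. For all $t > s$ put

$$g(t) = \left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha} - \left(\frac{t}{\theta_2}\right)^{\alpha}.$$

By (49) the function $g$ must be constant for $t > s$. The derivative of $g(t)$ is

$$g'(t) = \frac{\alpha}{\theta_2}\left[\left(\frac{s}{\theta_1} + \frac{t - s}{\theta_2}\right)^{\alpha - 1} - \left(\frac{t}{\theta_2}\right)^{\alpha - 1}\right]$$

and for all $t > s$ has the same sign for fixed $\theta_1 \ne \theta_2$ and $\alpha \ne 1$. So the function $g$ is increasing or decreasing but not constant in $t$, which contradicts the equality (49). The assumption $\alpha \ne 1$ was false. So $\alpha = 1$, and the equality (46) implies that the lifetime distribution under any $x \in E_0$ is exponential:

$$S_x(t) = \exp\left\{-\frac{t}{\theta(x)}\right\}, \quad t > 0.$$

2) Suppose that the PH model holds on $E$ and the time-to-failure is exponential for all $x \in E_0$. The formula (35) implies that for all $x \in E_0$

$$S_x(t) = \exp\{-r(x)A_0(t)\}.$$

Exponentiality of the times-to-failure under $x \in E_0$ and the last formula imply that $A_0(t) = ct$. The constant $c$ can be included in $r(x)$, so we have $A_0(t) = t$. The formula (35) implies that

$$S_{x(\cdot)}(t) = \exp\left\{-\int_0^t r\{x(u)\}\,du\right\},$$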
i.e. the AAD model holds on $E$. The proof is complete.

Relations between the GS and the PH models

The GS model is more general than the AAD model. When is the PH model also a GS model? The answer is given in the following proposition.

Proposition 9. Suppose that the PH model holds on the set $E$ including $E_0$ and all the stresses of the form (45) with $s < S$, where $S$ is any positive number. The GS model also holds on $E$ iff the time-to-failure is exponential for all $x \in E_0$.

Proof. The PH model implies that for all $s < S$

$$\alpha_{x_s(\cdot)}(t) = \alpha_{x_2}(t), \quad t > s.$$

1) If the GS model also holds on $E$, then for all $s < S$

$$\alpha_{x_s(\cdot)}(t) = \alpha_{x_2}(t - s + \varphi(s)), \quad t > s,$$

where $\varphi(s) = A_{x_2}^{-1}(A_{x_1}(s))$ is an increasing function. It implies that if both the GS and PH models hold on $E$ then for all $s_1 < S$ and $s_2 < S$

$$\alpha_{x_2}(t - s_1 + \varphi(s_1)) = \alpha_{x_2}(t - s_2 + \varphi(s_2)) = \alpha_{x_2}(t).$$

The function $\alpha_{x_2}(t) = \mathrm{const}$ verifies this. Assume that the function $\alpha_{x_2}(t)$ is not constant. Then

$$\varphi(s) - s = c = \mathrm{const} \quad \text{for all } s > 0,$$

because the function $\alpha_{x_2}(t)$ cannot be two or more-periodic. Note that $c \ne 0$, because $A_{x_2}(s) \ne A_{x_1}(s)$. The equalities $\lim_{s \to 0} A_{x_2}(\varphi(s)) = \lim_{s \to 0} A_{x_1}(s) = 0$ and the monotonicity of $\varphi(s)$ imply that $\lim_{s \to 0} \varphi(s) = 0$, so there exists $\delta_0 > 0$ such that $|\varphi(s) - s| < |c|$ if $0 < s < \delta_0$. It contradicts the implication that $\varphi(s) - s = c$ for any $s > 0$. It means that the assumption that $\alpha_{x_2}(t)$ is not constant was false. So $\alpha_{x_2}(t) = \alpha = \mathrm{const}$, which implies that

$$S_{x_2}(t) = e^{-\alpha t}.$$
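The PH model implies that for all $x \in E_0$

$$S_x(t) = S_0(t)^{r(x)} = e^{-r(x)t}$$

(the constant can be included in $r$), i.e., the time-to-failure distribution is exponential for all $x \in E_0$.

2) Suppose that for all $x \in E_0$ the time-to-failure distribution is exponential and the PH model holds. The proof of Proposition 8 implies that the AAD model and consequently the GS model also holds on $E$. The proof is complete.

Constant stresses

If $x \in E_0$ is a constant stress, then the PH model gives

$$\alpha_x(t) = r(x)\,\alpha_0(t), \quad S_x(t) = S_0(t)^{r(x)},$$

where $S_0(t) = e^{-A_0(t)}$. For any $x_0 \in E_0$

$$\alpha_x(t) = \rho(x_0, x)\,\alpha_{x_0}(t), \quad S_x(t) = S_{x_0}^{\rho(x_0, x)}(t),$$

where $\rho(x_0, x) = r(x)/r(x_0)$.

Simple step-stresses

If $x(\cdot) \in E_1$ is a simple step-stress (3) then the PH model gives

$$\alpha_{x(\cdot)}(t) = \begin{cases} r(x_1)\,\alpha_0(t), & 0 \le t < t_1, \\ r(x_2)\,\alpha_0(t), & t \ge t_1. \end{cases}$$

It implies that

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1, \\ S_{x_1}(t_1)\,\dfrac{S_{x_2}(t)}{S_{x_2}(t_1)}, & t \ge t_1. \end{cases} \qquad (50)$$

The PH model in the form (50) is called the tampered failure rate (TFR) model (Bhattacharyya & Soejoeti (1989)).

For any $x_0 \in E_0$

$$\alpha_{x(\cdot)}(t) = \begin{cases} \rho(x_0, x_1)\,\alpha_{x_0}(t), & 0 \le t < t_1, \\ \rho(x_0, x_2)\,\alpha_{x_0}(t), & t \ge t_1, \end{cases}$$

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_0}^{\rho(x_0, x_1)}(t), & 0 \le t < t_1, \\ S_{x_0}^{\rho(x_0, x_1)}(t_1) \left(\dfrac{S_{x_0}(t)}{S_{x_0}(t_1)}\right)^{\rho(x_0, x_2)}, & t \ge t_1. \end{cases}$$

General step-stresses

If $x(\cdot) \in E_m$ is a general step-stress, then the PH model can be written in the following forms: for any $t \in [t_{i-1}, t_i)$

$$\alpha_{x(\cdot)}(t) = r(x_i)\,\alpha_0(t), \quad (t_0 = 0,\ i = 1, \ldots, m),$$

$$S_{x(\cdot)}(t) = \prod_{j=1}^{i-1} \left(\frac{S_0(t_j)}{S_0(t_{j-1})}\right)^{r(x_j)} \left(\frac{S_0(t)}{S_0(t_{i-1})}\right)^{r(x_i)}.$$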
5. G e n e r a l i z e d p r o p o r t i o n a l h a z a r d s m o d e l s The AAD and the PH models are rather restrictive. In the case of the AAD model the stress changes locally only the scale. Under the PH model the hazard rate under the stress x(-) at the moment t doesn't depend on the resource, used until t. It is not very natural if items are aging. Indeed, let Eo C E be a set of constant in time covariates, x 0 be an usual stress, x\ be an accelerated with respect to xo stress, z o i ^ i € 2?o> i-e- SXt (*) > S i , (t) for all t > 0, and Ei be a set of simple step-stresses of the form *W
=
\ x0,
t>U.
If the PH model holds on E0 U Ei then for all (i > 0, t > h < * « ( • ) ( ' )
= " » . ( ' ) •
If one population of items is tested under the usual stress and the second identical population - under the accelerated stress x\ until a moment t\ and after this moment both populations are observed under the same usual stress XQ, the failure rate after the moment t\ is the same for both populations. These populations use different resources until the moment t i , nevertheless, after the moment t, when both populations operate under the stress xo, the resource usage rate is the same. It is not natural for aging items. D e f i n i t i o n s of t h e g e n e r a l i z e d p r o p o r t i o n a l h a z a r d s m o d e l s A generalization of AAD and PH models is obtained by supposing that the rate of resource usage at any moment t is proportional not only to a function of the stress applied at this moment and to a baseline rate, but also to a function of the resource used until t. This is formalized by the following definition. D e f i n i t i o n 7 The first generalized proportional hazards (GPH1) model ( Bagdonavicius h Nikulin (1998)) holds on E if for all *(•) 6 E ar l( .)(t) = r{x(t)}
}{A I( .)(<)} <*0(t).
(51)
286 The particular cases of the G P H l model are the PH model (q{u) = 1) and the AAD model (ao(i) = CIQ — const). A generalization of the GS and PH models is the following model. D e f i n i t i o n 8 The second generalized proportional hazards (GPH2) model holds on E if for all
x(-)eE ««(•)(') = «{*(<). M) W> <*<>(<)•
(52)
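The particular cases of the GPH2 model are the GS model ($\alpha_0(t) \equiv \alpha_0 = \mathrm{const}$) and the GPH1 model ($u(x, s) = r(x)q(s)$). Models of different levels of generality can be obtained by completely specifying $q$, parametrizing $q$ or considering $q$ as unknown.

Relations with generalized multiplicative models

The GPH1 models can be formulated in terms of resources other than exponential. It helps to choose the function $q$. Denote by

$$f^G_{x(\cdot)}(t) = H(S_{x(\cdot)}(t)) \qquad (53)$$

the $G$-resource used until the moment $t$.

Definition 9. The generalized multiplicative (GM) model (Bagdonavicius & Nikulin (1994)) with the resource survival function $G$ holds on $E$ if there exist a positive function $r$ and a survival function $S_0$ such that for all $x(\cdot) \in E$

$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = r\{x(t)\}\, \frac{\partial f^G_0(t)}{\partial t}, \qquad (54)$$

where $f^G_0(t) = H(S_0(t))$.

The equality (54) implies that for all $x(\cdot) \in E$

$$S_{x(\cdot)}(t) = G\left\{\int_0^t r(x(\tau))\,dH(S_0(\tau))\right\}. \qquad (55)$$

If $E_0$ is a set of constant in time covariates then for all $x \in E_0$

$$S_x(t) = G\{r(x)H(S_0(t))\}. \qquad (56)$$

For any $x_1, x_2 \in E_0$

$$S_{x_2}(t) = G\{\rho(x_1, x_2)H(S_{x_1}(t))\}, \qquad (57)$$

where $\rho(x_1, x_2) = r(x_2)/r(x_1)$.

Proposition 10. Suppose that the integral

$$\int_0^x \frac{dv}{q(v)}$$

converges for all $x > 0$. The GPH1 model (51) holds on $E$ iff there exists a survival function $G$ such that the GM model (54) holds on $E$.

Proof. Suppose that the GPH1 model (51) holds on $E$. Define the function $H(u)$ by the formula

$$H(u) = \int_0^{-\ln u} \frac{dv}{q(v)}.$$

Then the used resource $f^G_{x(\cdot)}(t)$, defined by the formula (53), verifies the equalities

$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = \frac{\alpha_{x(\cdot)}(t)}{q\{A_{x(\cdot)}(t)\}} = r\{x(t)\}\,\alpha_0(t) = r\{x(t)\}\,\frac{\partial f^G_0(t)}{\partial t},$$

where $f^G_0(t) = A_0(t)$, i.e. $S_0(t) = G(A_0(t))$. So the GM model holds on $E$.

Vice versa, if there exists a survival function $G$ such that the GM model holds on $E$ then (55) implies that for all $x(\cdot) \in E$

$$S_{x(\cdot)}(t) = G\left(\int_0^t r\{x(\tau)\}\,dH(S_0(\tau))\right)$$

and, differentiating $A_{x(\cdot)}(t) = -\ln S_{x(\cdot)}(t)$,

$$\alpha_{x(\cdot)}(t) = -e^{A_{x(\cdot)}(t)}\, G'\left(H\left(e^{-A_{x(\cdot)}(t)}\right)\right) r\{x(t)\}\, H'\{S_0(t)\}\, S_0'(t).$$

Put

$$q(u) = -e^{u}\, G'(H(e^{-u})), \quad \alpha_0^*(t) = H'\{S_0(t)\}\, S_0'(t).$$

Both functions are positive, since $G$, $H$ and $S_0$ are decreasing. Then for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\, q\{A_{x(\cdot)}(t)\}\, \alpha_0^*(t).$$

So the GPH1 model holds. The proof is complete.

Corollary 3. The GPH1 model with specified $q$ is equivalent to the GM model with the survival function of the resource $G = H^{-1}$, and the following relations hold:

$$q(u) = -e^{u}\, G'(H(e^{-u})), \quad H(p) = \int_0^{-\ln p} \frac{dv}{q(v)}. \qquad (58)$$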
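Put $A_0(u) = H(S_0(u))$. In terms of survival functions the GPH1 model is written

$$S_{x(\cdot)}(t) = G\left(\int_0^t r\{x(u)\}\,dA_0(u)\right), \qquad (59)$$

where

$$G = H^{-1}, \quad H(p) = \int_0^{-\ln p} \frac{dv}{q(v)}, \quad A_0(u) = \int_0^u \alpha_0(v)\,dv.$$

For the AAD model

$$S_{x(\cdot)}(t) = S_{x_0}\left(\int_0^t \rho\{x(u)\}\,du\right).$$

Hence

$$f^{S_{x_0}}_{x(\cdot)}(t) = S_{x_0}^{-1}(S_{x(\cdot)}(t)) = \int_0^t \rho\{x(u)\}\,du$$

and

$$\frac{\partial f^{S_{x_0}}_{x(\cdot)}(t)}{\partial t} = \rho\{x(t)\},$$

i.e., the AAD model is the GM model with the $S_{x_0}$-resource.

Constant stresses

If the GPH1 model holds on a set of constant stresses $E_0$, then

$$\alpha_x(t) = r(x)\, q(A_x(t))\, \alpha_0(t), \quad S_x(t) = G(r(x)A_0(t)).$$

For all $x_1, x_2 \in E_0$

$$S_{x_2}(t) = G(\rho(x_1, x_2)H(S_{x_1}(t))).$$

Simple step-stresses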
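If the GPH1 model holds on a set of simple step-stresses $E_1$, then for any $x(\cdot) \in E_1$ of the form (3)

$$S_{x(\cdot)}(t) = \begin{cases} G(r(x_1)A_0(t)), & 0 \le t < t_1, \\ G(r(x_1)A_0(t_1) + r(x_2)(A_0(t) - A_0(t_1))), & t \ge t_1. \end{cases}$$

The survival function $S_{x(\cdot)}$ can be written in terms of the function $S_{x_1}$:

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1, \\ G\{H(S_{x_1}(t_1)) + \rho(x_1, x_2)(H(S_{x_1}(t)) - H(S_{x_1}(t_1)))\}, & t \ge t_1. \end{cases}$$

For a general step-stress $x(\cdot) \in E_m$ of the form (8) and $t \in [t_{i-1}, t_i)$

$$S_{x(\cdot)}(t) = G\left\{\sum_{j=1}^{i-1} r(x_j)(A_0(t_j) - A_0(t_{j-1})) + r(x_i)(A_0(t) - A_0(t_{i-1}))\right\},$$

$$S_{x(\cdot)}(t) = G\left\{\left(H(S_{x_i}(t)) - H(S_{x_i}(t_{i-1}))\right) + \sum_{j=1}^{i-1} \rho(x_i, x_j)\left(H(S_{x_i}(t_j)) - H(S_{x_i}(t_{j-1}))\right)\right\}.$$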
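Relations between survival functions under constant and non-constant stresses

Similarly as in the case of the AAD model, consider some useful relations between survival functions under constant and time-varying stresses.

Proposition 11. Suppose that $x(\cdot), x(\tau), x_0 \in E$ for all $\tau \ge 0$. If the GM model holds on $E$, then

$$S_{x(\cdot)}(t) = G\left(\int_0^t \frac{H(S_{x(\tau)}(t))}{H(S_{x_0}(t))}\,dH(S_{x_0}(\tau))\right) = G\left(\int_0^t H(S_{x(\tau)}(\tau))\,d\log H(S_{x_0}(\tau))\right). \qquad (60)$$

Corollary 4. If $x(\cdot) \in E_1$ is a simple step-stress of the form (3), $x_0 \in E_0$, then

$$S_{x(\cdot)}(t) = \begin{cases} G\{\rho(x_0, x_1)H(S_{x_0}(t))\}, & 0 \le t < t_1, \\ G\{\rho(x_0, x_1)H(S_{x_0}(t_1)) + \rho(x_0, x_2)(H(S_{x_0}(t)) - H(S_{x_0}(t_1)))\}, & t \ge t_1. \end{cases} \qquad (61)$$

If $x(\cdot) \in E_m$ is a general step-stress of the form (8), $x_0 \in E_0$, then for all $t \in [t_{i-1}, t_i)$

$$S_{x(\cdot)}(t) = G\left\{\rho(x_0, x_i)\left(H(S_{x_0}(t)) - H(S_{x_0}(t_{i-1}))\right) + \sum_{j=1}^{i-1} \rho(x_0, x_j)\left(H(S_{x_0}(t_j)) - H(S_{x_0}(t_{j-1}))\right)\right\}. \qquad (62)$$

Proof. The equality (55) implies that there exists a functional $r_1 : E \to [0, \infty)$ such that

$$S_{x(\cdot)}(t) = G\left\{\int_0^t r_1[x(\tau)]\,dH(S_{x_0}(\tau))\right\}. \qquad (63)$$

So for all fixed $\tau$,

$$r_1\{x(\tau)\} = \frac{H(S_{x(\tau)}(t))}{H(S_{x_0}(t))}. \qquad (64)$$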
Putting $r_1\{x(\tau)\}$ in the equality (63), the first of the equalities (60) is obtained. Putting $t = \tau$ in (64) and the obtained expression of $r_1\{x(\tau)\}$ in (63), the second of the equalities (60) is obtained.

Characterization of the GM model with constant in time stresses

At first glance it looks like there are too many GM models. It appears that it is not so. Indeed, assume that a function $G$ is continuous and strictly decreasing on $[0, \infty[$ and

$$G_1(u) = G((u/\theta)^{\rho}).$$

Let $E_0 = [x_0, x_1] \subset \mathbf{R}$ be an interval of constant in time stresses, $\{S_x,\ x \in [x_0, x_1]\}$ be a class of continuous survival functions such that $S_x(t) > S_y(t)$ for all $x, y \in E_0$, $x < y$, $t > 0$, and let $H = G^{-1} : ]0, 1] \to [0, \infty[$ and $H_1 = G_1^{-1}$ be the inverse functions of $G$ and $G_1$, respectively. If the GM model with the resource survival function $G$ holds on $E_0$, then the equality (57) implies that

$$H(S_x(t)) = \lambda(x)H(S_{x_0}(t)), \quad t > 0,\ x \in [x_0, x_1], \qquad (65)$$

where $\lambda(x) = \rho(x_0, x)$. Then

$$H_1(S_x(t)) = \lambda^{1/\rho}(x)\,H_1(S_{x_0}(t)), \quad t > 0,\ x \in [x_0, x_1]. \qquad (66)$$

The inverse result also takes place:

Proposition 12. Assume that a function $G$ is continuous and strictly decreasing on $[0, \infty[$ and the equality (65) holds. Then the equality (66) also holds iff $G_1(u) = G((u/\theta)^{\rho})$, $u \in [0, \infty)$, for some positive constants $\theta$ and $\rho$.

Proof. 1) It was just shown that if the GM model holds for the survival function $G$ and $G_1(t) = G((t/\theta)^{\rho})$ then the GM model holds for the survival function $G_1$.

2) Suppose that the GM model holds for the survival functions $G$ and $G_1$, i.e. the equalities (65) and (66) hold. Introduce a function $D : [0, \infty[ \to [0, \infty[$ such that $D(u) = H_1(G(u))$, $u \in [0, \infty[$. In this case $H_1(p) = D(H(p))$, $p \in ]0, 1]$, and the relation (66) can be rewritten as follows:

$$D(H(S_x(t))) = \lambda^{1/\rho}(x)\,D(H(S_{x_0}(t))), \quad t > 0,\ x \in [x_0, x_1].$$

Using (65) we obtain that

$$D(\lambda(x)H(S_{x_0}(t))) = \lambda^{1/\rho}(x)\,D(H(S_{x_0}(t))), \quad t > 0,\ x \in [x_0, x_1],$$

with the initial conditions $D(0) = 0$ and $\lim_{u \to \infty} D(u) = \infty$. Putting $y = H(S_{x_0}(t))$ we obtain that

$$D(\lambda(x)y) = \lambda^{1/\rho}(x)\,D(y), \quad y \in [0, \infty[,\ x \in [x_0, x_1],$$

or for $v = \ln y$

$$Q(\ln \lambda(x) + v) = \frac{1}{\rho}\ln \lambda(x) + Q(v), \quad v \in \mathbf{R},\ x \in [x_0, x_1],$$

where $Q(v) = \ln(D(e^{v}))$. This equality leads to the equality

$$Q(v) = av + b, \quad a = \frac{1}{\rho}.$$

It implies that $D(y) = \theta y^{1/\rho}$, where $\theta = e^{b}$. Consequently, $G(y) = G_1(D(y)) = G_1(\theta y^{1/\rho})$ and

$$G_1(u) = G((u/\theta)^{\rho}), \quad u \in [0, \infty[.$$

The proof is completed.
This proposition implies that, for example, the PH model is a submodel of the GM model not only when $G$ is standard exponential but also when it is any exponential or two-parameter Weibull survival function. So submodels of the GM model form classes generated by classes of resource distributions which differ only by shape and scale parameters.

Relations with the frailty models

A method of choosing the function $q$ is obtained by using relations between the GPH1 models and the frailty models with covariates. The hazard rate can be influenced not only by the observable stress $x(\cdot)$ but also by a nonobservable positive random covariate $Z$, called the frailty variable, see Hougaard (1986). Suppose that for all $x(\cdot) \in E$

$$\alpha_{x(\cdot)}(t \mid Z = z) = z\, r(x(t))\, \alpha_0(t). \qquad (67)$$

Then

$$S_{x(\cdot)}(t \mid Z = z) = \exp\left\{-z \int_0^t r(x(\tau))\,dA_0(\tau)\right\}$$

and

$$S_{x(\cdot)}(t) = \mathbf{E}\exp\left\{-Z \int_0^t r(x(\tau))\,dA_0(\tau)\right\} = G\left\{\int_0^t r(x(\tau))\,dA_0(\tau)\right\}, \qquad (68)$$

where $G(s) = \mathbf{E}e^{-sZ}$. If we put $S_0(t) = G\{A_0(t)\}$, the equality (68) implies that for all $x(\cdot) \in E$

$$S_{x(\cdot)}(t) = G\left\{\int_0^t r(x(\tau))\,dH(S_0(\tau))\right\}, \qquad (69)$$

where $H = G^{-1}$. We obtained that the frailty model defined by a frailty variable $Z$, the GM model with the survival function of the resource $G(s) = \mathbf{E}e^{-sZ}$, and the GPH1 model with the function $q$ defined by (58) give the same survival function under any stress $x(\cdot) \in E$.

Relations with the linear transformation models

Under constant in time stresses the GPH1 model is related to the linear transformation (LT) model, Dabrowska & Doksum (1988b), Cheng, Wei, Ying (1995). Consider the set $E_0$ of constant in time stresses and let $T_x$ denote the time-to-failure under the covariate $x \in E_0$. The LT model holds on $E_0$ if for all $x \in E_0$

$$h(T_x) = -\beta^T x + \varepsilon, \qquad (70)$$

where $h : [0, \infty) \to [0, \infty)$ is a strictly increasing function, and $\varepsilon$ is a random error with distribution function $Q$. The relation (70) implies that for all $x \in E_0$

$$S_x(t) = G\left\{e^{\beta^T x + h(t)}\right\} = G\left\{e^{\beta^T x} H(S_0(t))\right\},$$

where $G(t) = 1 - Q(\ln t)$, $S_0(t) = G\{e^{h(t)}\}$. Therefore, in the case of constant in time stresses, the frailty model defined by the frailty variable $Z$, the GM model with the survival function of the resource $G(s) = \mathbf{E}e^{-sZ}$, the GPH1 model with the function $q$ defined by (58) and the LT model with the distribution function $Q(x) = 1 - G(e^{x})$ of the random error $\varepsilon$ give the same expression of survival functions.
6. The main classes of GPH models

Particular classes of the GPH models are very important for survival analysis and accelerated life testing. Numerous examples of real data show that, taking two constant in time covariates, say $x_1$ and $x_2$, the ratio $\alpha_{x_2}(t)/\alpha_{x_1}(t)$ (which is constant under the PH model) can be increasing or decreasing in time, and even a cross-effect of hazard rates can be observed. Such data can be modelled by submodels of the GPH1 or the more general GPH2 model. Consider possible parametrizations in the GPH models.

GPH model with a monotone ratio of hazard rates

Consider the GPH1 model with parametrization

$$q(u) = (1 + u)^{\gamma + 1}, \qquad (71)$$

where $\gamma \in \mathbf{R}$ is an unknown scalar parameter. We have the model

$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,(1 + A_{x(\cdot)}(t))^{\gamma + 1}\,\alpha_0(t). \qquad (72)$$

The particular case of this model is the PH model when $\gamma = -1$. Suppose that $\gamma < 0$ and $c_0 = r(x_2)/r(x_1) > 1$ ($x_1, x_2 \in E_0$). Then

$$\frac{\alpha_{x_2}(t)}{\alpha_{x_1}(t)} = c_0 \left\{\frac{1 - \gamma r(x_2)A_0(t)}{1 - \gamma r(x_1)A_0(t)}\right\}^{-\frac{\gamma + 1}{\gamma}}.$$

The ratio $\alpha_{x_2}(t)/\alpha_{x_1}(t)$ has the following properties:

a) if $-1 < \gamma < 0$, then the ratio $\alpha_{x_2}(t)/\alpha_{x_1}(t)$ increases from the value $c_0$ to the value $c_\infty = \lim_{t \to \infty} \alpha_{x_2}(t)/\alpha_{x_1}(t)$, where the constant $c_\infty$ can take any value in the interval $(c_0, \infty)$;

b) if $\gamma = -1$ (PH model), the ratio $\alpha_{x_2}(t)/\alpha_{x_1}(t)$ is constant in time;

c) if $\gamma < -1$, then the ratio $\alpha_{x_2}(t)/\alpha_{x_1}(t)$ decreases from the value $c_0$ to the value $c_\infty \in (1, c_0)$.

GPH model with cross-effects of hazard rates

To obtain a cross-effect of hazard rates consider the following submodel of GPH2:

$$\alpha_{x(\cdot)}(t) = r(x(t))\,(1 + A_{x(\cdot)}(t))^{\gamma^T x(t) + 1}\,\alpha_0(t). \qquad (73)$$

Suppose that $c_0 = r(x_2)/r(x_1) > 1$ ($x_1, x_2 \in E_0$), and $\gamma^T x_2 < \gamma^T x_1 < 0$. Then

$$\frac{\alpha_{x_2}(t)}{\alpha_{x_1}(t)} = c_0\, \frac{\left(1 - \gamma^T x_2\, r(x_2)A_0(t)\right)^{-1 - \frac{1}{\gamma^T x_2}}}{\left(1 - \gamma^T x_1\, r(x_1)A_0(t)\right)^{-1 - \frac{1}{\gamma^T x_1}}}$$

and

$$\frac{\alpha_{x_2}(0)}{\alpha_{x_1}(0)} = c_0 > 1, \quad \lim_{t \to \infty} \frac{\alpha_{x_2}(t)}{\alpha_{x_1}(t)} = 0.$$
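The behaviour of the hazard ratio in the models (72) and (73) can be explored numerically. For a constant stress with exponent parameter $g < 0$ (that is, $g = \gamma$ in (72) or $g = \gamma^T x$ in (73)), solving the model equation for the cumulative hazard gives $\alpha_x(t)/\alpha_0(t) = r(x)\,(1 - g\,r(x)A_0(t))^{-(g+1)/g}$. The Python sketch below uses this expression; the function names and all numerical values are hypothetical and serve only as an illustration.

```python
def relative_hazard(a0_cum, r, g):
    """alpha_x(t) / alpha_0(t) for a constant stress, as a function of A_0(t).

    r: value r(x) of the stress function
    g: exponent parameter (gamma or gamma^T x), assumed negative
    """
    return r * (1.0 - g * r * a0_cum) ** (-(g + 1.0) / g)

def hazard_ratio(a0_cum, r1, r2, g1, g2):
    """alpha_{x2}(t) / alpha_{x1}(t) as a function of the baseline A_0(t)."""
    return relative_hazard(a0_cum, r2, g2) / relative_hazard(a0_cum, r1, g1)

# Monotone ratio, common gamma = -0.5 and c0 = r(x2)/r(x1) = 2:
# the ratio grows from c0 = 2 towards c0 ** (-1 / gamma) = 4.
grid = [0.0, 0.5, 1.0, 5.0, 50.0]
monotone = [hazard_ratio(a, 1.0, 2.0, -0.5, -0.5) for a in grid]

# gamma = -1 reproduces the PH model: the ratio stays equal to c0.
ph_case = [hazard_ratio(a, 1.0, 2.0, -1.0, -1.0) for a in grid]

# Cross-effect: exponent parameters g2 < g1 < 0, the ratio starts at c0 > 1
# and eventually drops below 1.
crossing = [hazard_ratio(a, 1.0, 2.0, -0.5, -2.0) for a in grid]
```

With these values `monotone` is increasing, `ph_case` is constant and `crossing` passes from above 1 to below 1.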
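So we have a cross-effect of the hazard rates.

Generalization of the gamma frailty model with covariates

Consider the GPH1 model with parametrization

$$q(u) = e^{\gamma u}, \quad \gamma \in \mathbf{R}. \qquad (74)$$

We have the model

$$\alpha_{x(\cdot)}(t) = r(x(t))\, e^{\gamma A_{x(\cdot)}(t)}\, \alpha_0(t). \qquad (75)$$

If $\gamma = 0$, it becomes the usual PH model. Suppose that $x_0 < x(\cdot)$ and that the support of $S_{x(\cdot)}$ is $I_{x(\cdot)} = [0, sp_{x(\cdot)})$. Put $\rho(x) = r(x)/r(x_0)$. The model (75) implies for all $x(\cdot)$ and $t \in I_{x(\cdot)}$

$$S_{x(\cdot)}(t) = \begin{cases} \left\{1 + \int_0^t \rho\{x(\tau)\}\,dS_{x_0}^{-\gamma}(\tau)\right\}^{1/\gamma}, & \text{if } \gamma \ne 0, \\ \exp\left\{\int_0^t \rho\{x(\tau)\}\,d\ln S_{x_0}(\tau)\right\}, & \text{if } \gamma = 0, \end{cases}$$

and for constant over time stresses $x > x_0$, $\gamma \ne 0$ and any $t \in I_x$

$$S_x(t) = \left\{1 - \rho(x)\left(1 - S_{x_0}^{-\gamma}(t)\right)\right\}^{1/\gamma}.$$

For $\gamma > 0$ the upper bound $sp_{x(\cdot)}$ of the support $I_{x(\cdot)}$ satisfies the equation

$$1 + \int_0^{sp_{x(\cdot)}} \rho\{x(\tau)\}\,dS_{x_0}^{-\gamma}(\tau) = 0.$$

Take notice that the condition $x_0 < x(\cdot)$ implies that $sp_{x(\cdot)} < sp_{x_0}$. For $\gamma < 0$ the upper bound $sp_{x(\cdot)}$ of the support $I_{x(\cdot)}$ satisfies the equation

$$\int_0^{sp_{x(\cdot)}} \rho\{x(\tau)\}\,dS_{x_0}^{-\gamma}(\tau) = \infty.$$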
If all stresses are constant over time, then $sp_x = sp_{x_0}$ for all $x$. If $sp_{x_0} = \infty$, then $sp_x = \infty$ for all $x$.

Take notice that the model (75) is a generalization of the gamma frailty model (GFM) (see Vaupel et al. (1979)) with covariates. Indeed, suppose that the frailty variable $Z$ follows a gamma distribution with the scale parameter $\theta > 0$, the shape parameter $k > 0$ and the density

$$p_Z(z) = \frac{z^{k-1}}{\theta^{k}\,\Gamma(k)}\, e^{-z/\theta}, \quad z > 0. \qquad (76)$$

The survival function of the resource is

$$G(s) = \mathbf{E}e^{-sZ} = (1 + \theta s)^{-k}.$$

Put $\gamma = -1/k < 0$. The formula (58) implies that

$$q(u) = -\frac{\theta}{\gamma}\, e^{\gamma u}, \quad \gamma < 0. \qquad (77)$$

The proportionality constant can be included in $\alpha_0$, and $q(u)$ can be written in the form $q(u) = e^{\gamma u}$, $\gamma < 0$. We have the gamma frailty model.

Consider the frailty model with a density $p_Z(z)$ which is the inverse Laplace transformation of the survival function

$$G(t) = (1 - \gamma t)^{1/\gamma}\, \mathbf{1}_{[0, 1/\gamma)}(t). \qquad (78)$$

The formula (58) implies that $q(u) = e^{\gamma u}$, $\gamma > 0$.

We obtained that under the model (77) the survival function of the resource is

$$G(t) = (1 - \gamma t)^{1/\gamma}, \quad \gamma < 0. \qquad (79)$$

For $\gamma > 0$ the support of $G$ is $[0, 1/\gamma)$. For constant in time stresses $x_1, x_2 \in E_0$ and $\gamma < 0$ the equality (79) implies the generalized proportional odds-rate (GPOR) model (Dabrowska & Doksum (1988)):

$$\frac{S_{x_2}^{-\gamma}(t) - 1}{S_{x_1}^{-\gamma}(t) - 1} = \frac{r(x_2)}{r(x_1)}.$$
293 Consider the GPH1 model with parametrization ?(«) = — — , 1 + fu
7>0.
(80)
We obtain the model
(81)
*"« = *W>T+3fej-
Take notice that this model is the inverse gaussian frailty model with covariates. Indeed, suppose that the frailty variable Z has the inverse gaussian distribution with the density p z ( 2 ) =
g
1 / 2
eV^V3/2e-^-
f )
z > 0
.
(82)
The formulas (68) and (58) imply that
(83)
'M = = T ^ The proportionality constant can be included in a0 and q(u) can be written in the form (80). Consider GM models with G specified. These models are alternative to the PH model. Generalized logistic regression m o d e l . If the distribution of the resource is
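loglogistic, i.e.

$$G(t) = \frac{1}{1 + t}\, \mathbf{1}_{\{t \ge 0\}}, \qquad (84)$$

then $q(t) = e^{-t}$ and the GM model can be formulated in the following way:

$$\frac{\alpha_{x(\cdot)}(t)}{S_{x(\cdot)}(t)} = r\{x(t)\}\, \frac{\alpha_0(t)}{S_0(t)}.$$

If $x(\cdot) \in E_1$ is the step-stress of the form (3) then we have

$$\frac{\alpha_{x(\cdot)}(t)}{S_{x(\cdot)}(t)} = \begin{cases} r(x_1)\,\alpha_0(t)/S_0(t), & 0 \le t < t_1, \\ r(x_2)\,\alpha_0(t)/S_0(t), & t \ge t_1, \end{cases}$$

or

$$S_{x(\cdot)}(t) = \begin{cases} S_{x_1}(t), & 0 \le t < t_1, \\ \left[\dfrac{1}{S_{x_1}(t_1)} + \dfrac{1}{S_{x_2}(t)} - \dfrac{1}{S_{x_2}(t_1)}\right]^{-1}, & t \ge t_1. \end{cases}$$

If stresses are constant in time then we obtain the model

$$\frac{1 - S_x(t)}{S_x(t)} = r(x)\, \frac{1 - S_0(t)}{S_0(t)}.$$

It is the analogue of the logistic regression model which is used for the analysis of dichotomous data when the probability of "success" is analysed as a function of some factors. The obtained model is near to the Cox model when $t$ is small.

Generalized probit model

If the resource is lognormal, then

$$G(t) = 1 - \Phi(\log t), \quad t > 0,$$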
where $\Phi$ is the distribution function of the standard normal law. If covariates are constant in time then in terms of survival functions the GM model can be written as follows:

$$\Phi^{-1}(1 - S_x(t)) = \log(r(x)) + \Phi^{-1}(1 - S_0(t)).$$

It is the generalized probit model, see Dabrowska & Doksum (1988).

7. Parametrization of the function r in AAD and GPH models

Following Viertl (1988), consider parametrization of the function $r$ in the AAD and GPH models. If the AAD model holds on $E_0$, then for all $x_1, x_2 \in E_0$
$$S_{x_2}(t) = S_{x_1}(\rho(x_1, x_2)\,t), \qquad (87)$$

where the function $\rho(x_1, x_2) = r(x_2)/r(x_1)$ shows the degree of scale variation. It is evident that $\rho(x, x) = 1$. In the case of the more general GPH1 model

$$S_{x_2}(t) = G(\rho(x_1, x_2)H(S_{x_1}(t))), \qquad (88)$$

where

$$\rho(x_1, x_2) = \frac{H(S_{x_2}(t))}{H(S_{x_1}(t))} = \frac{f^G_{x_2}(t)}{f^G_{x_1}(t)}$$

shows the degree of the resource usage rate variation.

Suppose at first that $x$ is unidimensional. The rate of scale variation (AAD model) or of resource usage rate variation (GPH1 model) can be defined by the infinitesimal characteristic (see Viertl (1988) for the AAD model):

$$\delta(x) = \lim_{\Delta x \to 0} \frac{\rho(x, x + \Delta x) - \rho(x, x)}{\Delta x} = [\log r(x)]'.$$

So for all $x \in E_0$ the function $r(x)$ is given by the formula:

$$r(x) = r(x_0)\exp\left\{\int_{x_0}^{x} \delta(v)\,dv\right\},$$

where $x_0 \in E_0$ is some fixed stress. Suppose that $\delta(x)$ is proportional to some known function $u(x)$ of the stress:

$$\delta(x) = \alpha u(x), \quad \alpha > 0.$$

In this case

$$r(x) = e^{\beta_0 + \beta_1 z(x)}, \qquad (90)$$

where $z(x)$ is some known function and $\beta_0, \beta_1$ are unknown parameters.

Example 1. $\delta(x) = \alpha$, i.e. the rate of scale changing is constant. Then

$$r(x) = e^{\beta_0 + \beta_1 x},$$

where $\beta_1 > 0$. It is the so-called log-linear model.

Example 2. $\delta(x) = \alpha/x$. Then

$$r(x) = e^{\beta_0 + \beta_1 \log x} = \alpha_1 x^{\beta_1},$$

where $\beta_1 > 0$. It is the so-called power rule model.

Example 3. $\delta(x) = \alpha/x^2$. Then

$$r(x) = e^{\beta_0 + \beta_1/x} = \alpha_1 e^{\beta_1/x},$$

where $\beta_1 < 0$. It is the so-called Arrhenius model.

Example 4. $\delta(x) = \alpha/(x(1 - x))$. Then

$$r(x) = e^{\beta_0 + \beta_1 \ln\frac{x}{1 - x}} = \alpha_1 \left(\frac{x}{1 - x}\right)^{\beta_1}, \quad 0 < x < 1,$$

where $\beta_1 > 0$. It is the model of Meeker-Luvalle (1995).

The Arrhenius model is widely used to model product life when the stress is the temperature; the power rule model is used when the stress is voltage or mechanical loading; the log-linear model is applied in endurance and fatigue data analysis and in testing various electronic components (see Nelson (1990)). The model of Meeker-Luvalle is used when $x$ is the proportion of humidity. If it is not very clear which of the first three models to choose, one can take a larger class of models. For example, all these models are particular cases of the class of models determined by

$$\delta(x) = \alpha x^{\varepsilon - 1}$$

with unknown $\varepsilon$ or, in terms of the function $r(x)$, by

$$r(x) = \begin{cases} e^{\beta_0 + \beta_1 (x^{\varepsilon} - 1)/\varepsilon}, & \text{if } \varepsilon \ne 0, \\ e^{\beta_0 + \beta_1 \log x}, & \text{if } \varepsilon = 0. \end{cases}$$

In this case the parameter $\varepsilon$ must be estimated.

The model (90) can be generalized. One can suppose that $\delta(x)$ is a linear combination of some known functions of the stress:

$$\delta(x) = \sum_{i=1}^{k} \alpha_i u_i(x).$$

In such a case

$$r(x) = \exp\left\{\beta_0 + \sum_{i=1}^{k} \beta_i z_i(x)\right\},$$

where $z_i(x)$ are some known functions of the stress, $\beta_0, \ldots, \beta_k$ are unknown (possibly not all of them) parameters.
E x a m p l e 5. S(x) = \/x + Then
r(,) = el1'+W,"+W' =
1
ftlie'
'«,
where /?i = 1, /?2 < 0. It is so called Eyring model, applied when the stress x is the temperature. E x a m p l e 6. S(x) = J2
ctj/x'.
i=l
Then r(x) = exp l/3o+l3ilogx
+ J2 /3i/xi
I.
It is the so-called generalized Eyring model.
Suppose now that the stress $x = (x_1, \dots, x_m)$ is multidimensional. Define (see Viertl (1988)) the infinitesimal characteristics $\delta_i(x)$ by the equalities
$$\delta_i(x) = \lim_{\Delta x_i \to 0} \frac{\rho(x, x + \Delta x_i e_i) - \rho(x, x)}{\Delta x_i} = \frac{\partial \log r(x)}{\partial x_i},$$
where $e_i = (0, \dots, 1, \dots, 0)$, the unity being the $i$th coordinate. Generalizing the unidimensional case, $\delta_i(x)$ can be parametrized in the following manner:
$$\delta_i(x) = \sum_{j=1}^{k_i} \alpha_{ij} u_{ij}(x),$$
where $u_{ij}(x)$ are known functions and $\alpha_{ij}$ are unknown constants. In this case
$$r(x) = \exp\Big\{\beta_0 + \sum_{i=1}^{m} \sum_{j=1}^{k_i} \beta_{ij} z_{ij}(x)\Big\},$$
where $z_{ij}(x)$ are known functions and $\beta_{ij}$ are unknown constants.
Example 7. $\delta_1(x) = 1/x_1 + (\alpha_{11} + \alpha_{12} x_2)/x_1^2$ and $\delta_2(x) = \alpha_{21} + \alpha_{22}/x_1$. Then
$$r(x) = \exp\{\beta_0 + \beta_1 \log x_1 + \beta_2 x_2 + \beta_3/x_1 + \beta_4 x_2/x_1\}.$$
It is the so-called generalized Eyring model. This model is used for certain semiconductor materials, when $x_1$ is the temperature and $x_2$ is the voltage.
Example 8. $\delta_i(x) = \alpha_i u_i(x_i)$, where $u_i$ are known functions. Then
$$r(x) = \exp\Big\{\beta_0 + \sum_{j=1}^{m} \beta_j z_j(x_j)\Big\},$$
where $z_j$ are known functions. It is the so-called generalized Arrhenius model. It is also called the log-linear model.
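Returning to the one-dimensional class indexed by $\varepsilon$ that precedes Example 5, the following Python sketch evaluates $\log r(x)$ for that family and checks by finite differences that its derivative equals $\beta_1 x^{\varepsilon-1}$, that is, a function of the form $\alpha x^{\gamma}$ with $\alpha = \beta_1$ and $\gamma = \varepsilon - 1$. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math

def log_r(x, beta0, beta1, eps):
    """log r(x) for the family exp{beta0 + beta1 (x^eps - 1)/eps}, with the log x limit at eps = 0."""
    if abs(eps) < 1e-12:
        return beta0 + beta1 * math.log(x)
    return beta0 + beta1 * (x ** eps - 1.0) / eps

def delta(x, beta1, eps):
    """Closed-form infinitesimal characteristic d log r(x) / dx = beta1 * x^(eps - 1)."""
    return beta1 * x ** (eps - 1.0)

def numeric_delta(x, beta0, beta1, eps, h=1e-6):
    """Central finite-difference approximation of d log r(x) / dx."""
    return (log_r(x + h, beta0, beta1, eps) - log_r(x - h, beta0, beta1, eps)) / (2.0 * h)

if __name__ == "__main__":
    for eps in (1.0, 0.0, -1.0):
        print(eps, delta(2.0, 1.7, eps), numeric_delta(2.0, 0.3, 1.7, eps))
```

With $\varepsilon = 1$, $0$ and $-1$ the family reduces, up to a shift of $\beta_0$, to a function that is log-linear in $x$, in $\log x$ and in $1/x$ respectively.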
Generalized additive and additive-multiplicative models
Definition 10. The generalized additive (GA) model (Bagdonavicius & Nikulin (1995)) holds on $E$ if there exist a function $a$ on $E$ and a survival function $S_0$ such that for all $x(\cdot) \in E$
$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = \frac{\partial f^G_0(t)}{\partial t} + a(x(t)) \tag{91}$$
with the initial conditions $f^G_0(0) = f^G_{x(\cdot)}(0) = 0$; here $f^G_0(t) = H(S_0(t))$. So the stress influences additively the rate of resource usage. The last equation implies that
$$S_{x(\cdot)}(t) = G\Big(H(S_0(t)) + \int_0^t a(x(\tau))\,d\tau\Big). \tag{92}$$
In terms of exponential resource usage the GA model can be written in the form
$$\alpha_{x(\cdot)}(t) = q\{A_{x(\cdot)}(t)\}\,\big(\alpha_0(t) + a(x(t))\big).$$
A particular case of the GA model is the additive hazards (AH) model:
$$\alpha_{x(\cdot)}(t) = \alpha_0(t) + a(x(t)). \tag{93}$$
Both the GM and the GA models can be included into the following model.
Definition 11. The generalized additive-multiplicative (GAM) model (Bagdonavicius & Nikulin (1997)) holds on $E$ if there exist functions $a$ and $r$ (positive) on $E$ and a survival function $S_0$ such that for all $x(\cdot) \in E$
$$\frac{\partial f^G_{x(\cdot)}(t)}{\partial t} = r\{x(t)\}\,\frac{\partial f^G_0(t)}{\partial t} + a(x(t)) \tag{94}$$
with the initial conditions $f^G_0(0) = f^G_{x(\cdot)}(0) = 0$; here $f^G_0(t) = H(S_0(t))$. So the stress influences the rate of resource usage both multiplicatively and additively. The last equation implies that
$$S_{x(\cdot)}(t) = G\Big(\int_0^t r\{x(\tau)\}\,dH(S_0(\tau)) + \int_0^t a(x(\tau))\,d\tau\Big). \tag{95}$$
In terms of exponential resource usage the GAM model can be written in the form:
$$\alpha_{x(\cdot)}(t) = q\{A_{x(\cdot)}(t)\}\,\big(r\{x(t)\}\,\alpha_0(t) + a(x(t))\big).$$
In the particular case of the exponential resource we obtain the additive-multiplicative hazards (AMH) model, see Lin and Ying (1996):
$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,\alpha_0(t) + a(x(t)). \tag{96}$$
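A minimal numerical sketch of formula (95) may help to fix ideas. It assumes an exponential resource, $G(t) = e^{-t}$ and $H(p) = -\ln p$, and a unit exponential baseline $S_0(t) = e^{-t}$; these choices, the function names and the example stresses are illustrative assumptions, not part of the model definition.

```python
import math

def gam_survival(t, stress, r, a, S0, G, H, n=20000):
    """Approximate S_{x(.)}(t) = G( int_0^t r{x(u)} dH(S0(u)) + int_0^t a(x(u)) du ).

    The Stieltjes integral is approximated by a sum over a uniform grid,
    with the stress evaluated at the midpoint of every cell.
    """
    h = t / n
    total = 0.0
    for i in range(n):
        u0, u1 = i * h, (i + 1) * h
        x = stress(0.5 * (u0 + u1))
        total += r(x) * (H(S0(u1)) - H(S0(u0))) + a(x) * h
    return G(total)

# Assumed ingredients: exponential resource and unit exponential baseline.
G = lambda s: math.exp(-s)
H = lambda p: -math.log(p)
S0 = lambda u: math.exp(-u)

if __name__ == "__main__":
    step = lambda u: 1.0 if u < 0.5 else 2.0   # simple step-stress switching at u = 0.5
    print(gam_survival(1.0, step, lambda x: x, lambda x: 0.1 * x, S0, G, H))
```

Under these assumptions the result is the AMH survival function, so for a constant stress it reduces to $\exp\{-(r(x) + a(x))\,t\}$.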
The function $a$ in the GAM models is parametrized as the function $\ln r$ in the GM models, and the function $q$ as the function $q$ in the GPH models.
9. Changing shape and scale models
Definition of models
Consider now an important model which does not lie in the class of the GAM models but includes the AAD model as a particular case. Suppose that the constant in time stresses $x \in E_0$ change not only the scale but also the shape of the time-to-failure distribution: there exist on $E_0$ positive functions $\theta(x)$ and $\nu(x)$ such that for any $x \in E_0$
$$S_x(t) = S_{x_0}\Big\{\Big(\frac{t}{\theta(x)}\Big)^{\nu(x)}\Big\}. \tag{97}$$
The $S_{x_0}$-resource used until the moment $t$ under $x$ is $f_x(t) = S_{x_0}^{-1}(S_x(t))$ and the resource usage rate is
$$\frac{\partial f_x(t)}{\partial t} = r(x)\,t^{\nu(x)-1},$$
where $r(x) = \nu(x)/\theta(x)^{\nu(x)}$, $H = G^{-1}$. So the model (97) means that the resource usage rate under the stress $x$ is increasing if $\nu(x) > 1$, decreasing if $0 < \nu(x) < 1$, and constant if $\nu(x) = 1$. In the case $\nu(x) = 1$ we have the AAD model. Consider the following generalization of the model (97) to the case of time varying stresses:
Definition 12. The changing shape and scale (CHSS) model (Bagdonavicius & Nikulin (1998)) holds on $E$ if there exist positive on $E$ functions $r$ and $\nu$ such that for all $x(\cdot) \in E$
$$\frac{\partial f_{x(\cdot)}(t)}{\partial t} = r\{x(t)\}\,t^{\nu(x(t))-1}. \tag{98}$$
The equality (98) implies that
$$S_{x(\cdot)}(t) = S_{x_0}\Big(\int_0^t r\{x(\tau)\}\,\tau^{\nu(x(\tau))-1}\,d\tau\Big). \tag{99}$$
In terms of the exponential resource usage the model can be written in the form:
$$\alpha_{x(\cdot)}(t) = r\{x(t)\}\,q(A_{x(\cdot)}(t))\,t^{\nu(x(t))-1}.$$
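Simple step-stresses
If $x(\cdot) \in E_1$, with $x(t) = x_1$ for $0 \le t < t_1$ and $x(t) = x_2$ for $t \ge t_1$, then the formula (99) implies that
$$S_{x(\cdot)}(t) = \begin{cases} S_{x_0}\Big\{\Big(\dfrac{t}{\theta(x_1)}\Big)^{\nu(x_1)}\Big\}, & 0 \le t < t_1, \\[2mm] S_{x_0}\Big\{\Big(\dfrac{t_1}{\theta(x_1)}\Big)^{\nu(x_1)} + \Big(\dfrac{t}{\theta(x_2)}\Big)^{\nu(x_2)} - \Big(\dfrac{t_1}{\theta(x_2)}\Big)^{\nu(x_2)}\Big\}, & t \ge t_1. \end{cases} \tag{100}$$
General step-stresses
If $x(\cdot) \in E_m$, with $x(t) = x_j$ for $t \in [t_{j-1}, t_j)$ and $t_0 = 0$, then for all $t \in [t_{i-1}, t_i)$
$$S_{x(\cdot)}(t) = S_{x_0}\Big\{\sum_{j=1}^{i-1}\Big[\Big(\frac{t_j}{\theta(x_j)}\Big)^{\nu(x_j)} - \Big(\frac{t_{j-1}}{\theta(x_j)}\Big)^{\nu(x_j)}\Big] + \Big(\frac{t}{\theta(x_i)}\Big)^{\nu(x_i)} - \Big(\frac{t_{i-1}}{\theta(x_i)}\Big)^{\nu(x_i)}\Big\}. \tag{101}$$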
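The step-stress formulas follow from (99) by integrating $r\{x(\tau)\}\tau^{\nu(x(\tau))-1}$ piece by piece, using $r(x) = \nu(x)/\theta(x)^{\nu(x)}$. The sketch below compares, for a simple step-stress, a midpoint-rule evaluation of the integral in (99) with the piecewise closed form. The pairs `(theta, nu)` attached to each stress level and all function names are hypothetical values chosen for the illustration.

```python
def rate(theta, nu):
    """r(x) = nu(x) / theta(x)^nu(x) for a stress level described by (theta, nu)."""
    return nu / theta ** nu

def resource_numeric(t, t1, p1, p2, n=20000):
    """Midpoint-rule value of int_0^t r{x(u)} u^(nu(x(u)) - 1) du for a stress switching at t1."""
    h = t / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        theta, nu = p1 if u < t1 else p2
        total += rate(theta, nu) * u ** (nu - 1.0) * h
    return total

def resource_closed(t, t1, p1, p2):
    """Piecewise closed form of the same integral."""
    (th1, nu1), (th2, nu2) = p1, p2
    if t < t1:
        return (t / th1) ** nu1
    return (t1 / th1) ** nu1 + (t / th2) ** nu2 - (t1 / th2) ** nu2

if __name__ == "__main__":
    p1, p2 = (2.0, 1.5), (1.0, 2.0)   # (theta, nu) under the first and second stress level
    print(resource_numeric(2.0, 1.0, p1, p2), resource_closed(2.0, 1.0, p1, p2))
```

Applying the baseline survival function $S_{x_0}$ to either value gives the survival probability under the step-stress.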
10. Generalizations
Schabe and Viertl (1995) considered an axiomatic approach to model building.
Proposition 13 (Schabe and Viertl (1995)). Suppose that there exists a functional
$$a : E \times E \times [0, \infty) \to [0, \infty)$$
such that for any $x_1(\cdot), x_2(\cdot) \in E$ it is differentiable and increasing in $t$, $a(x_1(\cdot), x_2(\cdot), 0) = 0$ and
$$T_{x_2(\cdot)} \sim a(x_1(\cdot), x_2(\cdot), T_{x_1(\cdot)}),$$
where $\sim$ denotes equality in distribution. Then for any differentiable on $[0, \infty)$ c.d.f. $F$ there exists a functional $b : E \times [0, \infty) \to [0, \infty)$ such that for all $x(\cdot) \in E$
$$F_{x(\cdot)}(t) = F\Big(\int_0^t b(x(u), u)\,du\Big). \tag{102}$$
Proof. Fix $x_0(\cdot) \in E$ and for all $x(\cdot) \in E$ put
$$a_0(x(\cdot), t) = F^{-1}\big(F_{x_0(\cdot)}(a(x(\cdot), x_0(\cdot), t))\big).$$
The distribution of the random variable $R = a_0(x(\cdot), T_{x(\cdot)})$ does not depend on $x(\cdot)$ and its c.d.f. is $F$. Put
$$b(x(\cdot), t) = \frac{\partial}{\partial t} a_0(x(\cdot), t).$$
Then
$$a_0(x(\cdot), t) = \int_0^t b(x(u), u)\,du,$$
which implies
$$F_{x(\cdot)}(t) = P\{T_{x(\cdot)} \le t\} = P\{R \le a_0(x(\cdot), t)\} = F(a_0(x(\cdot), t)) = F\Big(\int_0^t b(x(u), u)\,du\Big).$$
The proof is complete.
Remark 8. Put
$$G(t) = 1 - F(t), \quad S_{x(\cdot)}(t) = 1 - F_{x(\cdot)}(t), \quad H = G^{-1}, \quad f^G_{x(\cdot)}(t) = H(S_{x(\cdot)}(t)).$$
The equality (102) implies that
$$\frac{\partial}{\partial t} f^G_{x(\cdot)}(t) = b(x(\cdot), t). \tag{103}$$
This model means that the rate of the $G$-resource usage is a functional of the stress and the time. The models considered above are submodels of this general model:
1) If $b(x(\cdot), t) = r(x(t))$, we have the AAD model.
2) If $b(x(\cdot), t) = r(x(t))\,\alpha_0(t)$, we have the GM (or, equivalently, GPH1) model.
3) If $b(x(\cdot), t) = r(x(t))\,\alpha_0(t)$ and the resource is exponential, i.e. $G(t) = e^{-t}$, $t \ge 0$, we have the PH model.
4) If $b(x(\cdot), t) = r\{x(t)\}\,t^{\nu(x(t))-1}$, we have the CHSS model.
Considering the GS model, it was noted that this (and also the AAD) model is not appropriate when the stress is periodic with quick changes of its values. The greater the number of stress cycles, the shorter the life of items. So the effect of cycling must be included in the model. Suppose that a periodic stress is differentiable. Then the number of cycles in the interval $[0, t]$ is
$$n(t) = \int_0^t \big|\,d\mathbf{1}\{x'(u) > 0\}\,\big|.$$
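The integral defining $n(t)$ is the total variation of the indicator $\mathbf{1}\{x'(u) > 0\}$, so on a grid it can be approximated by counting how often the sign of $x'$ changes. The following Python sketch does this for a hypothetical sinusoidal stress; the function name and the example are assumptions made for the illustration.

```python
import math

def count_direction_changes(x_prime, t, n=30000):
    """Approximate int_0^t |d 1{x'(u) > 0}| by counting flips of the indicator on a midpoint grid."""
    h = t / n
    prev = x_prime(0.5 * h) > 0
    jumps = 0
    for i in range(1, n):
        cur = x_prime((i + 0.5) * h) > 0
        if cur != prev:
            jumps += 1
        prev = cur
    return jumps

if __name__ == "__main__":
    # Derivative of the stress x(u) = sin(2 pi u), which has period 1.
    x_prime = lambda u: 2.0 * math.pi * math.cos(2.0 * math.pi * u)
    print(count_direction_changes(x_prime, 3.0))
```

In this example every full period contributes two flips of the indicator, one at the maximum and one at the minimum of the stress, so the computed quantity is proportional to the number of completed cycles.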
Generalizing the GPH1 (or GM) model we suppose that the $G$-resource used until the moment $t$ has the form
$$f^G_{x(\cdot)}(t) = \int_0^t r_1\{x(u)\}\,dH(S_0(u)) + \int_0^t r_2\{x(u)\}\,\big|\,d\mathbf{1}\{x'(u) > 0\}\,\big|. \tag{104}$$
The second term includes the effect of cycling on resource usage. In terms of survival functions
$$S_{x(\cdot)}(t) = G\Big\{\int_0^t r_1\{x(u)\}\,dH(S_0(u)) + \int_0^t r_2\{x(u)\}\,\big|\,d\mathbf{1}\{x'(u) > 0\}\,\big|\Big\}. \tag{105}$$
If the amplitude is constant, $r_2\{x(u)\} = c$ can be considered. The AAD model is generalized by the model
$$S_{x(\cdot)}(t) = G\Big\{\int_0^t r_1\{x(u)\}\,du + \int_0^t r_2\{x(u)\}\,\big|\,d\mathbf{1}\{x'(u) > 0\}\,\big|\Big\}. \tag{106}$$
The GS and AAD models are not appropriate if $x(\cdot)$ is a step-stress with many switch-ons and switch-offs which shorten the life of items. In this case the following model can be considered:
$$f^G_{x(\cdot)}(t) = \int_0^t r_1\{x(u)\}\,dH(S_0(u)) + \int_0^t r_2\{x(u)\}\,\mathbf{1}(\Delta x(u) > 0)\,\frac{|dx(u)|}{|\Delta x(u)|} + \int_0^t r_3\{x(u)\}\,\mathbf{1}(\Delta x(u) < 0)\,\frac{|dx(u)|}{|\Delta x(u)|}. \tag{107}$$
The second and the third terms include the effect of switch-ons and switch-offs (or vice versa), respectively, on resource usage. If the step-stress has two values, the functions $r_2$ and $r_3$ can be constants.
11. The heredity hypothesis
A process of production is unstable if the reliability of items produced in different time intervals is different. If items produced in some specified time interval are considered and the AAD, GM (GPH1) or GA models hold on $E_0$, then for all $x_1, x_2 \in E_0$
$$S_{x_2}(t) = S_{x_1}(\rho(x_1, x_2)\,t), \tag{108}$$
$$S_{x_2}(t) = G\big(\rho(x_1, x_2)\,H(S_{x_1}(t))\big), \tag{109}$$
$$S_{x_2}(t) = G\big(H(S_{x_1}(t)) + b(x_1, x_2)\,t\big), \tag{110}$$
respectively.
Definition 13. If one of the models AAD, GM or GA holds, the process of production is unstable and the function $\rho(x_1, x_2)$ (the models AAD or GM) or $b(x_1, x_2)$ (the model GA) is invariant for groups of items produced in different time intervals, then we say that the heredity hypothesis is satisfied.
Suppose that $x_1$ is a usual stress and $x_2 > x_1$ an accelerated stress. If one of the models AAD, GM or GA and the heredity principle hold, then sufficiently large data can be accumulated during a long period of observations and good estimators of the functions $\rho(x_1, x_2)$ or $b(x_1, x_2)$ can be obtained. The reliability of newly produced items under the "usual" stress $x_1$ can be estimated from accelerated life data obtained under the accelerated stress $x_2$, using the estimators $\hat\rho(x_1, x_2)$ or $\hat b(x_1, x_2)$.
REFERENCES
P.K. Andersen, R.D. Gill, 1982. Cox's regression model for counting processes: A large sample study. Ann. Statist., 10, 1100-1120.
P.K. Andersen, O. Borgan, R.D. Gill and N. Keiding, 1993. Statistical Models Based on Counting Processes. Springer, New York.
V. Bagdonavicius, 1978. Testing the hypothesis of the additive accumulation of damages. Probab. Theory and its Appl., 23, No. 2, 403-408.
V. Bagdonavicius, 1990. Accelerated life models when the stress is not constant. Kybernetika, 26, 289-295.
V. Bagdonavicius, 1993. The modified moment method for multiply censored samples. Lithuanian Mathematical Journal, 33, No. 4, 295-306.
V. Bagdonavicius, M. Nikulin, 2000. On goodness-of-fit for the linear transformation and frailty models. Statistics and Probability Letters, 47, #2, 177-188.
V. Bagdonavicius, M. Nikulin, 2000. On nonparametric estimation in accelerated experiments with step stresses. Statistics, 33, #4, 349-350.
V. Bagdonavicius, M. Nikulin, 2000. Modeles statistiques de degradation avec des covariables dependant de temps. Comptes Rendus, Academie des Sciences de Paris, 329, Serie I, #2, 131-134.
V. Bagdonavicius, M. Nikulin, 1999. Generalized Proportional Hazards Model Based on Modified Partial Likelihood. Lifetime Data Analysis, 5, 329-350.
V. Bagdonavicius, S. Malov and M. Nikulin, 1999. Characterizations and semiparametric regression estimation in Archimedean copulas. Journal of Applied Statistical Sciences, 8, 137-154.
V. Bagdonavicius, M. Nikulin, 1998. Additive and Multiplicative Semiparametric Models in Accelerated Life Testing and Survival Analysis. Queen's Papers in Pure and Applied Mathematics, 108, Queen's University, Kingston, Ontario, Canada.
V. Bagdonavicius, M. Nikulin, 1997. Analysis of general semiparametric models with random covariates. Revue Roumaine de Mathematiques Pures et Appliquees, 42, #5-6, 351-369.
V. Bagdonavicius, M. Nikulin, 1997. Statistical analysis of the generalized additive semiparametric survival model with random covariates. Questiio, 21, #1-2, 273-291.
V. Bagdonavicius, M. Nikulin, 1997. Sur l'application des stress en escalier dans les experiences accelerees. Comptes Rendus, Academie des Sciences de Paris, 325, Serie I, 523-526.
V. Bagdonavicius, M. Nikulin, 1997. Transfer functionals and semiparametric regression models. Biometrika, v. 84, 2, 365-378.
V. Bagdonavicius, M. Nikulin, 1997. Asymptotic analysis of semiparametric models in survival analysis and accelerated life testing. Statistics, 29, 261-281.
V. Bagdonavicius, M. Nikulin, 1997. Accelerated life testing when a process of production is unstable. Statistics and Probability Letters, 35, #3, 269-275.
V. Bagdonavicius, M. Nikulin, 1997. Some rank tests for multivariate censored data. In: Advances in the Theory and Practice of Statistics: A Volume in Honor of Samuel Kotz (eds. N.L. Johnson and N. Balakrishnan), J. Wiley, New York, 193-207.
V. Bagdonavicius, V. Nikoulina, 1997. A goodness-of-fit test for Sedyakin's model. Revue Roumaine de Mathematiques Pures et Appliquees, 42, 1, 5-14.
V. Bagdonavicius, M. Nikulin, 1996. Analyses of generalized additive semiparametric models. Comptes Rendus, Academie des Sciences de Paris, 323, 9, Serie I, 1079-1084.
V. Bagdonavicius, M. Nikulin, 1995a. Semiparametric models in accelerated life testing. Queen's Papers in Pure and Applied Mathematics, 98, Queen's University, Kingston, Ontario, Canada, 70p.
V. Bagdonavicius, M. Nikulin, 1995b. On accelerated testing of systems. European Journal of Diagnosis and Safety in Automation, 5, 3, 307-316.
V. Bagdonavicius, M. Nikulin, 1995c. Estimation of system reliability from accelerated experiments. In: Proceeding of International Conference on Statistical Methods and Statistical Computing for Quality and Productivity Improvement (ICSQP'95), II, 602-608.
V. Bagdonavicius, M. Nikulin, 1995d. Accelerated life models for systems and their components. In: Seventh International Conference on Applications of Statistics and Probability in Civil Engineering, Paris, 2, 1157-1164.
V. Bagdonavicius, M. Nikulin, 1994. Stochastic models of accelerated life. In: Advanced Topics in Stochastic Modelling (eds. J. Gutierrez, M. Valderrama), pp. 73-87. World Scientific, Singapore.
A.P. Basu and N. Ebrahimi, 1982. Nonparametric accelerated life testing. IEEE Trans. on Reliability, 31, 432-435.
G.K. Bhattacharyya and Soejoeti, 1989. A tampered failure rate model for step-stress accelerated life test. Comm. in Statist., Part A-Th. and Meth., 18, 1627-1643.
S.C. Cheng, L.J. Wei, Z. Ying, 1995. Analysis of Transformation Models With Censored Data. Biometrika, 82, 835-845.
D. Clayton and J. Cuzick, 1985. Multivariate generalizations of the proportional hazards model. Journal of Royal Statistical Society, Series A, 148, 82-117.
D.R. Cox, 1972. Regression models and life tables. J. R. Statist. Soc., B, 34, 187-220.
D.R. Cox and D. Oakes, 1984. Analysis of Survival Data. Methuen (Chapman and Hall), New York.
D.M. Dabrowska and K.A. Doksum, 1988. Partial likelihood in Transformations Models with Censored Data. Scand. J. Statist., 15, 1-23.
R.C. Elandt-Johnson, N.L. Johnson, 1980. Survival Models and Data Analysis. J. Wiley, New York.
T.R. Fleming and D.P. Harrington, 1991. Counting processes and survival analysis. J. Wiley, New York.
C. Genest, K. Ghoudi and L.P. Rivest, 1995. A semiparametric estimation procedure for dependence parameters in multivariate families of distributions. Biometrika, 82, 543-552.
I.B. Gertsbakh, K.B. Kordonskiy, 1969. Models of Failure. Springer Verlag, Berlin.
L. Gerville-Reache, V. Nikoulina, 1997. Analysis of reliability characteristics of estimators in accelerated life testing. In: Statistical and Probabilistic Models in Reliability (eds. D. Ionescou, N. Limnios), Birkhauser, Boston, 91-100.
P.E. Greenwood and M.S. Nikulin, 1996. A Guide to chi-squared testing. J. Wiley, New York.
D.P. Harrington and T.R. Fleming, 1982. A class of rank test procedures for censored survival data. Biometrika, 69, 133-143.
P. Hougaard, 1986. A class of multivariate failure time distributions. Biometrika, 73, 3, 671-678.
N.L. Johnson, 1975. On Some Generalized Farlie-Gumbel-Morgenstern Distributions. Comm. in Stat., 4, 5, 415-427.
J.D. Kalbfleisch, R.L. Prentice, 1980. The Statistical Analysis of Failure Time Data. J. Wiley, New York.
G.D. Kartashov, 1979. Methods of Forced (Augmented) Experiments (in Russian). Znaniye Press, Moscow.
G.D. Kartashov and A.I. Perrote, 1968. On the principle of "heredity" in reliability theory. Engrg. Cybernetics, 9, 2, 231-245.
J.F. Lawless, 1982. Statistical Models and Methods for Lifetime Data. J. Wiley, New York.
J.F. Lawless, 1986. A Note on Lifetime Regression Models. Biometrika, 73, 509-512.
E.T. Lee, 1992. Statistical methods for survival data analysis. J. Wiley, New York.
D.Y. Lin and Z. Ying, 1994. Semiparametric analysis of the additive risk model. Biometrika, 81, 61-71.
D.Y. Lin and Z. Ying, 1995. Semiparametric inference for accelerated life model with time dependent covariates. Journal of Statistical Planning and Inference, 44, 47-63.
D.Y. Lin and Z. Ying, 1996. Semiparametric analysis of the general additive-multiplicative hazard models for counting processes. The Annals of Statistics, 23, 5, 1712-1734.
N.R. Mann, R.E. Schafer and N.D. Singpurwalla, 1974. Methods for Statistical Analysis of Reliability and Life Data. J. Wiley, New York.
J.W.Q. Meeker, 1984. A comparison of Accelerated Life Test Plans for Weibull and Lognormal Distributions and Type I Censoring. Technometrics, 26, 157-172.
J.W.Q. Meeker and L.A. Escobar, 1993. A review of recent research and current issues in accelerating testing. International Statistical Review, 61, 1, 147-168.
J.W.Q. Meeker and L.A. Escobar, 1993. Statistical Methods for Reliability Data. J. Wiley, New York.
M.A. Miner, 1945. Cumulative Damage in Fatigue. J. of Applied Mechanics, 12, A159-A164.
W. Nelson, 1990. Accelerated Testing. Statistical Models, Test Plans, and Data Analyses. J. Wiley, New York.
E. Pieruschka, 1961. Relation between lifetime distribution and the stress level causing failures. LMSD-8OOO44O1, Lockheed Missiles and Space Division, Sunnyvale, California.
J.M. Robins and A.A. Tsiatis, 1992. Semiparametric estimation of an accelerated failure time model with time dependent covariates. Biometrika, 79, 311-319.
A.L. Rukhin and H.K. Hsieh, 1987. Survey of Soviet work in reliability. Statistical Science, 2, 484-503.
H. Schabe, 1998. Accelerated Life Models for Nonhomogeneous Poisson Processes. Statistical Papers, 39, 291-312.
B. Schweizer and A. Sklar, 1983. Probabilistic Metric Spaces. North-Holland, Amsterdam.
N.M. Sedyakin, 1966. On one physical principle in reliability theory (in Russian). Engrg. Cybernetics, 3, 80-87.
J. Sethuraman, N.D. Singpurwalla, 1982. Testing of Hypotheses for Distributions in Accelerated Life Testing. JASA, 77, 204-208.
M. Shaked and N.D. Singpurwalla, 1983. Inference for step-stress accelerated life tests. J. Statist. Plann. Inference, 7, 295-306.
N.D. Singpurwalla, 1987. Comment on "Survey of Soviet work in reliability". Statistical Science, 2, 497-499.
N.D. Singpurwalla, 1971. Inference from Accelerated Life Tests When Observations Are Obtained from Censored Samples. Technometrics, 13, 161-170.
N.D. Singpurwalla, 1995. Survival in Dynamic Environnements. Statistical Science, 10, 86-103.
N.D. Singpurwalla, S.P. Wilson, 1999. Statistical Methods in Software Engineering: Reliability and Risk. Springer-Verlag, New York.
R. Schmoyer, 1991. Nonparametric Analysis for Two-Level Single-Stress Accelerated Life Tests. Technometrics, 33, 175-186.
A.A. Tsiatis, 1990. Estimating regression parameters using linear rank tests for censored data. Ann. Statist., 18, 353-72.
J.W. Vaupel, K.G. Manton, E. Stallard, 1979. The impact of heterogeneity in individual frailty on the dynamic of mortality. Demography, 16, 439-454.
R. Viertl, 1988. Statistical Methods in Accelerated Life Testing. Vandenhoeck & Ruprecht, Gottingen.
R. Viertl and F. Spencer, 1991. Statistical Methods in Accelerated Life Testing. Technometrics, 33, 360-362.
V.G. Voinov and M.S. Nikulin, 1993. Unbiased Estimators and Their Applications, Vol. 1: Univariate Case. Kluwer, Dordrecht.
V.G. Voinov and M.S. Nikulin, 1996. Unbiased Estimators and Their Applications, Vol. 2: Multivariate Case. Kluwer, Dordrecht.
Z. Ying, 1993. A large sample study of rank estimating for censored regression data. Ann. Statist., 21, 76-99.
Mathematics and the 21st Century, Eds. A. A. Ashour and A.-S. F. Obada, © 2001 World Scientific Publishing Co. (pp. 305-322)
THE VIBRATIONS OF A DRUM WITH FRACTAL BOUNDARY
JACQUELINE FLECKINGER-PELLE
ABSTRACT. Let $\Omega$ be a bounded domain in $\mathbb{R}^N$. We study the eigenvalues of the Dirichlet Laplacian defined on the domain $\Omega$. There exists a countable sequence of eigenvalues. Their asymptotics are related to the geometry of the domain. We recall the results established during the previous century concerning the link between the geometry of the domain and the asymptotics of the eigenvalues; we try to answer M. Kac's question "Can one hear the shape of a drum?", especially in the case of domains with fractal boundary.
1991 Mathematics Subject Classification: 35-02; 35P05; 35P20
Keywords and Phrases: Eigenvalues, Counting function, Fractals, Minkowski dimension, Heat equation.
CEREMATH/MIP-UT1 (UMR 5640) Universite Toulouse 1 31042 Toulouse Cedex, France [email protected]
1 INTRODUCTION
We recall the evolution along the previous century of the question raised by M. Kac ([K]) in 1966: "Can one hear the shape of a drum?". We are mainly concerned with drums with fractal boundaries. More precisely, it is well known since Euler and Lagrange that the vibrations of a membrane are described by the wave equation, which leads to an eigenvalue problem. These eigenvalues correspond to the eigenfrequencies of the membrane and, of course, depend on its size, on its shape, and so on. Conversely, it is often of importance to know whether one can recognize the geometry of the body just by listening to the tones and overtones. We are concerned here with this problem mainly when the membrane has a fractal boundary. We recall the most significant results obtained on this topic in the last century.
NOTATIONS
If $\omega$ is a bounded domain in $\mathbb{R}^n$, $\partial\omega$ denotes its boundary and $|\omega|_n$ stands for its $n$-dimensional Lebesgue measure. As usual, $H^k(\omega)$, where $k \ge 1$ is an integer, is the Sobolev space of all (classes of) functions $f \in L^2(\omega)$ all of whose partial derivatives of order $\le k$ also belong to $L^2(\omega)$. Moreover, $H_0^k(\omega)$ is the completion of $\mathcal{D}(\omega) = C_0^\infty(\omega)$ with respect to the norm of $H^k(\omega)$.
This paper is organized as follows: We begin (Section 2) with the case of a cord. Then (Section 3) we study the vibrations of a membrane and recall Weyl's estimate for the counting function. We also introduce the partition function. The study of the second term in the counting function leads us to Kac's question (Section 4). The case of domains with fractal boundaries was first introduced by M. Berry and we mention the results obtained for the counting function in this case (Section 5). Finally we turn our attention to the heat equation, which also gives some information on the domain (Section 6).
2 AN EXAMPLE: VIBRATION OF A CORD
It is well known that the position $v(x,t)$ at time $t$ of a point $x$ on a cord of length $a$ which is fixed at both ends satisfies:
$$\frac{\partial^2 v}{\partial t^2}(x,t) = k\,\frac{\partial^2 v}{\partial x^2}(x,t).$$
so that we are led to the following eigenvalue problem:
(2.1.a)
(2.1.6)
Au(x) + \u(x)
= 0; (x, t) € [0; a] x 1R+,
u(0) = u(a) = 0, x e [0;o].
A pair (A,u) satisfying (2.1) is an eigenpair, it consists in an eigenvalue A € IR and an eigenfunction u : x € [0; a] —¥ u(x) e H. with u{x) = sin(pirx/a),
p € Z.
To each given integer p corresponds the eigenvalue Ap =
(pita'1)2.
The tones of the cord are given by the ground state (p = 1) and the harmonics (P > !)• It is easy on this example to see that the length of the cord determines the sounds and reciprocally. Moreover it is easy to compute the asymptotics, as s —> oo, of the counting function N(s) = E[a^/w],
3
:
N(s) := #{p/(pff/a)2
< s}.
Obviously
where E[.] denotes the entire part.
VIBRATIONS O F A DRUM; W E Y L ' S ESTIMATE
THE CASE OF A SQUARE AND GAUSS ESTIMATE:
It is easy to extend this result to a square $K_a = (0,a) \times (0,a) \subset \mathbb{R}^2$. The position of a point $x \in K_a$, when $K_a$ has fixed boundaries, is given by
$$\frac{\partial^2 v}{\partial t^2}(x,t) = c\,\Delta v(x,t), \quad (x,t) \in K_a \times \mathbb{R}^+,$$
$$v(x,t) = 0, \quad x \in \partial K_a.$$
Of course, in this section, we assume that $\mathbb{R}^2$ is equipped with Euclidean coordinates: $x = (x_1, x_2) \in \mathbb{R}^2$. The Laplacian $\Delta$ applies to the space variables:
$$\Delta u(x,t) = \Delta u(x_1, x_2; t) = \frac{\partial^2 u}{\partial x_1^2}(x_1, x_2; t) + \frac{\partial^2 u}{\partial x_2^2}(x_1, x_2; t).$$
We seek stationary solutions as above: $v(x,t) = u(x)T(t)$, and we are led again to an eigenvalue problem:
$$\Delta u(x) + \lambda u(x) = 0, \quad x = (x_1, x_2) \in K_a, \tag{3.1.a}$$
$$u(x) = 0, \quad x \in \partial K_a. \tag{3.1.b}$$
There exists a countable number of solutions to this problem:
$$u(x_1, x_2) = \sin(p x_1 \pi/a)\,\sin(q x_2 \pi/a), \quad \lambda_{p,q} = (p^2 + q^2)(\pi/a)^2, \quad (p,q) \in \mathbb{N}^* \times \mathbb{N}^*.$$
As above, for a given $\lambda > 0$, we can introduce the counting function: $N(\lambda, K_a) := \#\{\lambda_{p,q} \le \lambda\}$. We have $N(\lambda, K_a) = \mathcal{N}_2(a\sqrt{\lambda}/\pi)$, where $\mathcal{N}_2(r)$ denotes the number of lattice points inside the quarter of the disk with radius $r$:
$$\mathcal{N}_2(r) := \#\{(p,q) \in \mathbb{N}^* \times \mathbb{N}^* \,/\, p^2 + q^2 \le r^2\}. \tag{3.2}$$
Gauss, [G], shows in 1801 that this number is proportional to the area of the disk:
$$\mathcal{N}_2(r) = (\pi r^2)/4 + O(r). \tag{3.3}$$
Notice that $\mathcal{N}_2(r) = 0$ if $r < \sqrt{2}$; it has jumps when $r = \sqrt{2}, \sqrt{5}, \dots$. It follows from (3.2) and (3.3) that
$$N(\lambda, K_a) \sim (\lambda a^2)/(4\pi), \quad \lambda \to +\infty. \tag{3.4}$$
THE COUNTING FUNCTION:
In 1911, H. Weyl extends this estimate to bounded smooth domains $\Omega$ in $\mathbb{R}^n$ ([W1], 1911; [W2], 1912). The study of the vibrations of $\Omega$ leads to the following eigenvalue problem (that we consider in the distributional sense): Find $(\lambda, u) \in \mathbb{R} \times H_0^1(\Omega)$ with $u \ne 0$ such that:
$$-\Delta u(x) = \lambda u(x), \quad x \in \Omega. \tag{3.5}$$
Since $\Omega$ is bounded, Problem (3.5) has a countable sequence of solutions (the eigenpairs) $(\lambda_k, u_k)$; since the eigenfunctions are defined up to a multiplicative constant, we add the condition $\|u\|_{L^2(\Omega)} = 1$. Finally we obtain an infinite sequence of eigenvalues:
$$0 < \lambda_1 \le \lambda_2 \le \dots \le \lambda_j \le \dots, \quad \lambda_j \to \infty \text{ as } j \to \infty,$$
where each eigenvalue is repeated according to its algebraic multiplicity. The counting function associated to Problem (3.5) for a given $\lambda > 0$ is:
$$N(\lambda, \Omega) := \#\{0 < \lambda_j \le \lambda\}. \tag{3.6}$$
Note that it is equivalent to seek an estimate of $\lambda_j$ as $j \to +\infty$, or an estimate of $N(\lambda, \Omega)$ as $\lambda$ tends to $+\infty$.
WEYL'S ESTIMATE:
For $\Omega$ a sufficiently smooth bounded domain in $\mathbb{R}^n$, H. Weyl shows:
$$N(\lambda, \Omega) \sim W(\lambda, \Omega) := (2\pi)^{-n} B_n |\Omega|_n \lambda^{n/2}, \quad \lambda \to \infty, \tag{3.7}$$
where $B_n$ denotes the volume of the unit ball in $\mathbb{R}^n$. Hence the volume of $\Omega$ can be derived from the knowledge of the spectrum of the Dirichlet Laplacian defined on $\Omega$.
REMARK 1: This result also holds for Neumann boundary conditions under some additional assumptions on the length of the boundary. In particular, if the boundary is "too long", the counting function can behave like $\lambda^{\alpha}$, $\alpha > n/2$. Counterexamples such as "combs" can be exhibited for the Neumann Laplacian ([FMt], 1973). Moreover, for the Dirichlet Laplacian, the smoothness assumption can be withdrawn ([FMt], [Mt1], 1976; [Mt2], 1977).
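Weyl's estimate (3.7) can be checked directly on the two explicit examples above, the cord ($n = 1$, $|\Omega|_1 = a$) and the square ($n = 2$, $|\Omega|_2 = a^2$). The following Python sketch counts the explicit eigenvalues and compares the result with $W(\lambda, \Omega)$; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def unit_ball_volume(n):
    """Volume B_n of the unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def weyl_term(lam, n, volume):
    """W(lambda, Omega) = (2 pi)^(-n) B_n |Omega|_n lambda^(n/2), as in (3.7)."""
    return (2 * math.pi) ** (-n) * unit_ball_volume(n) * volume * lam ** (n / 2)

def count_cord(lam, a):
    """Number of integers p >= 1 with (p pi / a)^2 <= lambda."""
    p = 0
    while ((p + 1) * math.pi / a) ** 2 <= lam:
        p += 1
    return p

def count_square(lam, a):
    """Number of pairs (p, q), p, q >= 1, with (p^2 + q^2)(pi / a)^2 <= lambda."""
    r2 = lam * (a / math.pi) ** 2
    p_max = int(math.sqrt(r2)) + 1
    return sum(1 for p in range(1, p_max + 1)
                 for q in range(1, p_max + 1) if p * p + q * q <= r2)

if __name__ == "__main__":
    print(count_cord(1000.0, 1.0), weyl_term(1000.0, 1, 1.0))
    a = math.pi
    print(count_square(1.0e4, a), weyl_term(1.0e4, 2, a * a))
```

For the cord the Weyl term equals $a\sqrt{\lambda}/\pi$, so the count is its integer part; for the square the count stays below $\lambda a^2/(4\pi)$ and the ratio tends to 1, the gap being of the order of the boundary term discussed in the next section.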
4 CAN ONE HEAR THE SHAPE OF A DRUM?
THE "REMAINDER TERM"
When Weyl's formula holds, it is natural to try to estimate the "second term" or, equivalently, to estimate $N(\lambda, \Omega) - W(\lambda, \Omega)$, the "remainder term". Under some additional assumptions (if there are not too many periodic geodesics), it can be shown, for Problem (3.5), that:
$$N(\lambda, \Omega) = W(\lambda, \Omega) - \gamma_n |\partial\Omega|_{n-1} \lambda^{(n-1)/2} + o(\lambda^{(n-1)/2}), \quad \lambda \to +\infty. \tag{4.1}$$
Here, as above, $|\partial\Omega|_{n-1}$ denotes the $(n-1)$-dimensional Lebesgue measure of the boundary $\partial\Omega$ (for a planar domain, $|\partial\Omega|_1$ is the length of the boundary); $\gamma_n$ is a constant which depends only on $n$. When there are periodic geodesics, the second term oscillates ([DG], 1975; [Iv2], 1984; [Me2], 1984; [V1], 1986; [Se], [S1], 1987; [S2], 1988). Obviously, it follows from (4.1) that the knowledge of the spectrum implies not only the knowledge of the "volume" of the domain, but also the measure of its boundary. Hence it is natural to try to derive other geometrical attributes.
THE PARTITION FUNCTION:
Indeed, the link between the eigenvalues and the measure of the domain is established as far back as 1949 ([MiP], 1949), by the use of the heat equation and the Laplace transform of the eigenvalues. In place of the counting function let us introduce the trace of the heat kernel, also called the "partition function":
$$Z(t, \Omega) := \sum_{j=1}^{\infty} e^{-\lambda_j t} = t \int_0^{+\infty} e^{-\lambda t} N(\lambda, \Omega)\,d\lambda. \tag{4.2}$$
As $t \to 0$, the following estimate holds ([MiP] 1949, [K] 1966):
$$Z(t, \Omega) = (4\pi t)^{-n/2}\big(|\Omega|_n + a_1 t^{1/2} + \dots\big). \tag{4.3}$$
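For the square of side $\pi$, whose Dirichlet eigenvalues are $p^2 + q^2$, the small-time behaviour of the partition function can be observed numerically. The sketch below sums the series for $Z(t)$ and compares it with a three-term expression built from the area $\pi^2$, the perimeter $4\pi$ and a constant term; this expression is the one given by the Poisson summation formula for this particular square and is stated here as an assumption of the illustration, as are the function names.

```python
import math

def partition_square(t, terms=2000):
    """Z(t) = sum over p, q >= 1 of exp(-(p^2 + q^2) t) for the square of side pi."""
    s = sum(math.exp(-p * p * t) for p in range(1, terms + 1))
    return s * s

def small_t_expansion(t):
    """(4 pi t)^(-1) * (area - (perimeter / 2) sqrt(pi t) + pi t), area = pi^2, perimeter = 4 pi."""
    area, perimeter = math.pi ** 2, 4.0 * math.pi
    return (area - 0.5 * perimeter * math.sqrt(math.pi * t) + math.pi * t) / (4.0 * math.pi * t)

if __name__ == "__main__":
    for t in (0.1, 0.01):
        print(t, partition_square(t), small_t_expansion(t))
```

The leading term carries the area and the next one, of relative order $t^{1/2}$, carries the length of the boundary; for a polygon such as this square the constant term comes from the corners.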
As for the counting function, the expansion of the partition function involves the "volume" $|\Omega|_n$ in the first term and the measure of the boundary $|\partial\Omega|_{n-1}$ in the second one; moreover, the third term is proportional to the number of holes.
REMARK 2: Note that (4.2) is established before (4.1), since the asymptotics w.r.t. $j$ for the eigenvalues are derived from the asymptotics w.r.t. $t$ for the partition function by Tauberian theorems, whereas the derivation of analogous results for the counting function uses the theory of pseudo-differential operators. Note also that the knowledge of the asymptotics of the counting function implies the knowledge of the asymptotics of the partition function, but the converse is not true. We study problems related to the heat equation below in Section 6.
THE INVERSE PROBLEM
Since the expansion of the partition function yields so many geometric characteristics, it sounds natural to study the inverse problem: the possibility of determining $\Omega$ completely from the spectrum of the associated Dirichlet (or Neumann) Laplacian. A first stimulus for the research in this field is the famous paper by M. Kac ([K], 1966): "Can one hear the shape of a drum?". This problem also has several important applications, such as the determination of cracks. First, Milnor exhibits two isospectral tori in $\mathbb{R}^{16}$ which are not isometric. Then H. Urakawa is able to construct two isospectral domains which are not congruent ([U], 1982). Finally, in 1992, Gordon, Webb and Wolpert [GWW] show a very simple counterexample in $\mathbb{R}^2$.
5 DOMAINS WITH FRACTAL BOUNDARIES
The question of domains with fractal boundaries is introduced by M. Berry in [Be1], 1979 and [Be2], 1980. Studying scattering of waves by fractals, he suggests replacing $n - 1$ in the second term of (4.1) by the Hausdorff dimension $h$; hence, if $\partial\Omega$ has Hausdorff dimension $h$, his conjecture is:
$$N(\lambda, \Omega) = W(\lambda, \Omega) - \gamma_n H(\partial\Omega)\,\lambda^{h/2} + o(\lambda^{h/2}), \quad \lambda \to +\infty. \tag{5.1}$$
Here $H(\partial\Omega)$ is the $h$-Hausdorff measure of the boundary. Before going further, let us first recall the definition of the Hausdorff dimension and, more generally, of "fractal" dimensions (see e.g. [Fa]).
HAUSDORFF DIMENSION:
The Hausdorff dimension, introduced by Hausdorff in 1919 and popularized by B. Mandelbrot in the seventies, is the most famous of the fractal dimensions. It is defined in the following way: Let us consider first, for given $\varepsilon > 0$, a covering of $\partial\Omega$ by balls $(B_i)_{i \in I}$ with radii $r_i < \varepsilon$. For any $t > 0$, set
$$M(t) := \lim_{\varepsilon \to 0}\Big(\inf \sum_{i \in I} r_i^t\Big),$$
where the infimum is taken over all coverings. The Hausdorff dimension of $\partial\Omega$ is
$$h := \inf\{t > 0 \,/\, M(t) < +\infty\},$$
and its $h$-Hausdorff measure is $H(\partial\Omega) = M(h)$.
BOULIGAND-MINKOWSKI DIMENSION:
In 1985, J. Brossard and R. Carmona, [BC], construct a counterexample to Berry's conjecture and they suggest replacing in (5.1) the Hausdorff dimension $h$ by $d$, the Bouligand-Minkowski one. This new conjecture is usually referred to as the "modified Weyl-Berry conjecture". Bouligand [Bo] has extended to the noninteger case the notion of Minkowski dimension. The Bouligand-Minkowski dimension of $\partial\Omega$ is defined in the following way: For a given $\varepsilon > 0$ we consider the interior boundary strip
$$\gamma_\varepsilon := \{x \in \Omega \,/\, d(x, \partial\Omega) < \varepsilon\}, \tag{5.2}$$
where $d(.,.)$ denotes the Euclidean distance in $\mathbb{R}^n$. For any $t > 0$, set
$$M^*(t, \partial\Omega) := \limsup_{\varepsilon \to 0}\, \varepsilon^{-(n-t)} |\gamma_\varepsilon|_n.$$
As above, $|.|_n$ denotes the $n$-dimensional Lebesgue measure. The interior Bouligand-Minkowski dimension of $\partial\Omega$ is:
$$d_i := \inf\{t > 0 \,/\, M^*(t, \partial\Omega) < +\infty\} = n - \liminf_{\varepsilon \to 0}\big[(\ln \varepsilon)^{-1} \ln |\gamma_\varepsilon|_n\big]. \tag{5.3}$$
We have $d_i \in [n-1, n]$. Taking $\gamma_\varepsilon := \{x \in \mathbb{R}^n \setminus \Omega \,/\, d(x, \partial\Omega) < \varepsilon\}$, we define in the same way $d_e$, the exterior Bouligand-Minkowski dimension of $\partial\Omega$. Finally, when $|\partial\Omega|_n = 0$, the Bouligand-Minkowski dimension of $\partial\Omega$ is $d$ defined by:
$$d := \max(d_i, d_e). \tag{5.4}$$
REMARK 3: This dimension is not as well known as the Hausdorff one, though it is frequently used by chemists and physicists under various names such as "box-counting dimension", logarithmic dimension, Kolmogorov's entropy, etc. Practically, it is sufficient to count the number of squares with side $\varepsilon$ which intersect the curve $\gamma$ in $\mathbb{R}^2$ to deduce its "box-counting dimension".
REMARK 4: For any bounded domain, the Hausdorff and the Bouligand-Minkowski dimensions of the boundary are always such that $n - 1 \le h \le d$. It is possible to exhibit counterexamples where $h < d$ (with strict inequality). This is precisely what Brossard and Carmona do to contradict Berry's conjecture. They construct a subdivision of a union of squares so that one has a strict inequality: $h < d_i$.
ORDER OF GROWTH OF THE "REMAINDER TERM"
Assume now that $\Omega$ is a bounded domain in $\mathbb{R}^n$ with a fractal boundary $\partial\Omega$; assume moreover that the Bouligand-Minkowski dimension of the boundary is $d > n - 1$. We have ([LF1], [LF2], [F], [L], 1988):
(5.5) $N(\lambda, \Omega) = W(\lambda, \Omega) + O(\lambda^{d_i/2}), \quad \lambda \to +\infty,$
where, as above, $d_i$ is the interior Bouligand-Minkowski dimension. To establish this result, we consider a Whitney covering of $\Omega$, that is, a covering with adjacent cubes which are smaller near the boundary. First we insert inside $\Omega$ the maximum number of adjacent (non-overlapping) cubes $Q_0$ with side 1. In what is left, we insert again the maximum number of adjacent (non-overlapping) cubes $Q_1$ which are also adjacent to the previous ones and which have side $2^{-1}$. At the $k$-th step, we have a domain $B_k$ near the boundary which is still free and which we cover partially with $n_k$ adjacent cubes $Q_k$ with side $2^{-k}$. By use of the boundary strip $\gamma_\varepsilon$ defined in (5.2), we have $B_k \subset \gamma_{2^{-k}\sqrt{n}}$. Hence
(5.6) $n_k \le C\, 2^{kd}.$
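For the cubes $Q_k$ of such a covering the eigenvalues of the Laplacian are explicit, so their counting functions can be evaluated directly. The following is a minimal sketch for $n = 2$ only. It assumes that the counting function counts eigenvalues strictly below $\lambda$, and that the leading Weyl term for a planar set of area $|\omega|$ is $|\omega|\lambda/(4\pi)$; both conventions, and all function names, are illustrative assumptions rather than definitions taken from the text.

```python
import math

def dirichlet_count(lam, side=1.0):
    """Count Dirichlet eigenvalues (pi/side)^2 (m^2 + n^2), m, n >= 1, below lam."""
    r2 = lam * side ** 2 / math.pi ** 2
    top = int(math.sqrt(r2)) + 1
    return sum(1 for m in range(1, top + 1)
                 for n in range(1, top + 1) if m * m + n * n < r2)

def neumann_count(lam, side=1.0):
    """Count Neumann eigenvalues (pi/side)^2 (m^2 + n^2), m, n >= 0, below lam."""
    r2 = lam * side ** 2 / math.pi ** 2
    top = int(math.sqrt(r2)) + 1
    return sum(1 for m in range(0, top + 1)
                 for n in range(0, top + 1) if m * m + n * n < r2)

def weyl_term(lam, side=1.0):
    """Leading Weyl term for a planar square (assumed normalization)."""
    return side ** 2 * lam / (4.0 * math.pi)

if __name__ == "__main__":
    lam = 1000.0
    for k in range(5):
        side = 2.0 ** (-k)  # side of the cubes Q_k
        print(k, dirichlet_count(lam, side), weyl_term(lam, side), neumann_count(lam, side))
```

For a fixed $\lambda$ the Dirichlet count of a square vanishes once its side is small enough, and for every square the Weyl term lies between the Dirichlet and Neumann counts.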
For a given $\lambda$, $N(\lambda, Q_p) = 0$ for $p$ large enough, or equivalently for $Q_p$ small enough. Let $K \in \mathbb{N}$ be such that for every integer $p > K$, $N(\lambda, Q_p) = 0$. We now use Courant's method, which is also called "Dirichlet-Neumann bracketing" (see e.g. [CH]):
(5.7) $\sum_{k=0}^{K} n_k N(\lambda, Q_k) \le N(\lambda, \Omega) \le \sum_{k=0}^{K} n_k N_N(\lambda, Q_k) + N_N(\lambda, B_K),$
where $N_N(\lambda, \omega)$ denotes the counting function for the Neumann Laplacian defined on $\omega$. Finally we combine Gauss's formula with an estimate for $B_K$ established in [FMt], [Mt1], [Mt2].

A PRECISE SECOND TERM FOR A DOMAIN WITH FRACTAL BOUNDARY:
For a domain with fractal boundary, it is natural, as for smooth domains, to try to calculate the second term. Unfortunately, except when n = 1 where the "modified Weyl-Berry conjecture" holds ([La]), the second term can oscillate, exactly as for smooth domains. As in [FV1], [FV2], let us consider a union of cubes which are smaller and smaller as shown here:
[Figure: the union of squares, a central square with successively smaller squares attached to the middle of each free side.]
To construct this set in $\mathbb{R}^2$, let us first choose $s$ satisfying
(5.8) $1 + \sqrt{2} < s < 3.$
We fix the central square $Q_0$ with side 1. Then we "stick", outside $Q_0$, in the middle of each of the 4 sides of $Q_0$, 4 squares $Q_1$ with side $s^{-1}$, as shown by the figure above. We now have $4 \times 3$ "free" sides with length $s^{-1}$; on the middle part of each of these 12 sides, outside the previous squares, we "stick" again one square $Q_2$ with side $s^{-2}$, and so on. At the $k$-th step we have $n_k$ squares $Q_k$ with sides $s^{-k}$, where $n_k = \frac{4}{3}\, 3^{k}$ for $k \ge 1$, and $n_0 = 1$. We denote by $\Omega$ the union of all these squares (which is disconnected in $\mathbb{R}^2$). It follows from (5.8) that the squares do not overlap and that $\Omega$ has finite measure. The interior Bouligand-Minkowski dimension $d_i$ of $\partial\Omega$ is
(5.9) $d_i = (\ln 3)/(\ln s), \quad 1 < d_i < 2,$
since for $\varepsilon > 0$
$|\gamma_\varepsilon|_2 = \sum_{k=0}^{K} n_k (4\varepsilon s^{-k} - 4\varepsilon^2) + \sum_{k=K+1}^{+\infty} n_k s^{-2k},$
where $K$ is such that $s^{-(K+1)} < 2\varepsilon \le s^{-K}$.
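The value (5.9) can be checked numerically from the strip-area formula above. The sketch below evaluates $|\gamma_\varepsilon|_2$ along the sequence $\varepsilon_K = s^{-K}/2$ and reads off the exponent, under the assumption (not restated in this passage) that the interior dimension is characterized by $|\gamma_\varepsilon|_2$ behaving like $\varepsilon^{2 - d_i}$ as $\varepsilon \to 0$. The function names and the truncation of the infinite tail are illustrative choices.

```python
import math

def n_squares(k):
    """Number n_k of squares of side s**(-k): n_0 = 1, n_k = 4 * 3**(k-1)."""
    return 1 if k == 0 else 4 * 3 ** (k - 1)

def strip_area(K, s, tail_terms=200):
    """Area of the interior strip for eps = s**(-K) / 2, from the formula in the text.

    The infinite tail sum is truncated after tail_terms terms (geometric decay).
    """
    eps = 0.5 * s ** (-K)
    inner = sum(n_squares(k) * (4 * eps * s ** (-k) - 4 * eps ** 2)
                for k in range(K + 1))
    tail = sum(n_squares(k) * s ** (-2 * k)
               for k in range(K + 1, K + 1 + tail_terms))
    return eps, inner + tail

def estimate_interior_dimension(s, K=40):
    """Estimate d_i as 2 minus the log-log slope of the strip area between two scales."""
    eps1, area1 = strip_area(K, s)
    eps2, area2 = strip_area(K + 1, s)
    slope = math.log(area2 / area1) / math.log(eps2 / eps1)
    return 2.0 - slope

if __name__ == "__main__":
    s = 2.5  # any value with 1 + sqrt(2) < s < 3
    print(estimate_interior_dimension(s), math.log(3) / math.log(s))
```

The convergence of the slope is governed by the ratio $(s/3)^K$, so values of $s$ close to 3 need a much larger $K$ than the default used here.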
We can compute exactly the second term of the asymptotics of N(X, Q).
PROPOSITION 1. ([FV1][FV2]). As $\lambda$ tends to $+\infty$:
(5.10) $N(\lambda, \Omega) = W(\lambda, \Omega) - (\lambda/\pi^2)^{d_i/2}\, p_2\!\left(\frac{\ln\lambda - 2\ln\pi}{2\ln s}\right) + o(\sqrt{\lambda}),$
where
fc=-|-oo
p2(y)
=
Zk-yP2(sy-k);
J2
Pa(r) =
V - ^ r ) .
fc=—00
The function $p_2$ is well defined, positive, bounded, 1-periodic and left-continuous; moreover the set of its points of discontinuity is dense in $\mathbb{R}$.
The set $\Omega$ being disconnected, we also introduce the connected set $O$, derived from $\Omega$ by opening in the middle of each $\partial Q_k \cap \partial Q_{k-1}$ a small "cut" $I_k$ with length $\varepsilon_k = (100\,(k!))^{-1}$; the connected open set $O$ has the same Lebesgue measure (in $\mathbb{R}^2$) as $\Omega$ and it also has the same interior Bouligand-Minkowski dimension $d_i = (\ln 3)/(\ln s)$.
THEOREM 1. ([FV1][FV2]).
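As $\lambda$ tends to $+\infty$:
$-(\lambda/\pi^2)^{d_i/2}\, p_2\!\left(\frac{\ln\lambda - 2\ln\pi}{2\ln s}\right) + O(\sqrt{\lambda}) = N(\lambda, \Omega) - W(\lambda, \Omega) \le N(\lambda, O) - W(\lambda, O) \le -(\lambda/\pi^2)^{d_i/2}\, p_2\!\left(\frac{\ln\lambda - 2\ln\pi + o(1)}{2\ln s}\right) + o(\lambda^{d_i/2}).$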
Here again, the second term has the form $c_n M(\partial\Omega)\, \lambda^{d_i/2}\, p(\ln\lambda)$, where $p$ is a periodic function which is positive, bounded and discontinuous. This periodicity arises naturally for self-similar fractals. It is also the case for the snowflake that we consider in Section 6. Finally, except perhaps for a very small class of domains, the "modified Weyl-Berry conjecture" is not true (see also [LV], 1996, [MoVa], 1995).
THE INVERSE PROBLEM:
We now give some conditions so that "we can hear the dimension of a fractal boundary" in $\mathbb{R}^2$. The following results ([V2], 1990; [FV1], 1990; [FV3]), close to the one in [BC], 1986, are derived from the asymptotics of the partition function defined in (4.2):
THEOREM 2. If $\Omega$ is a bounded domain in $\mathbb{R}^n$, $n \ge 2$, then $d_i$, the interior Bouligand-Minkowski dimension of $\partial\Omega$, is such that:
(5.11) $d_i \ge -2 \liminf\, (\ln t)^{-1} \ln\left[\, |\Omega| (4\pi t)^{-n/2} - Z(t, \Omega) \right].$
Moreover, if $n = 2$ and if $\partial\Omega$ consists only of a finite number of connected components, then one has equality in (5.11).
REMARK 5: These conditions are necessary since it is possible to construct examples with strict inequality in (5.11).

6 HEAT EQUATION ON THE TRIADIC VON KOCH SNOWFLAKE.
We now consider the heat equation defined on $D \subset \mathbb{R}^n$, $n \ge 1$. Let $u : (x,t) \in D \times [0, +\infty) \to u(x,t) \in \mathbb{R}$ satisfy:
(6.1) $\Delta u(x,t) = \frac{\partial u(x,t)}{\partial t}, \quad x \in D, \ t > 0,$
(6.2) $u(x,0) = 1, \quad x \in D.$
The total amount of heat contained in $D$ at the moment $t > 0$ is
(6.3) $Q_D(t) := \int_D u(x,t)\, dx$
and the total amount of heat lost up to the moment $t$ is
(6.4) $E_D(t) := \int_D (1 - u(x,t))\, dx.$
We now describe the asymptotic behaviour of the function $E_D(t)$ as $t \to 0$, when the domain $D$ is the triadic von Koch snowflake in $\mathbb{R}^2$ shown here:
Let us first recall some known results concerning $E_D(t)$. For planar domains with polygonal boundary $\partial D$, there are results from [vdBSr], [Du]; one has:
(6.5) $E_D(t) = |D|_2 - Q_D(t) = \frac{2}{\sqrt{\pi}}\, |\partial D|_1\, t^{1/2} - t \sum_{j=1}^{k} c(\theta_j) + O(e^{-r/t}), \quad t \to +0,$
with some positive constant $r$ depending on $D$. Here $|D|_2$ is (as above) the area of $D$, $|\partial D|_1$ is the length of the boundary $\partial D$, $\theta_j$, $j = 1, 2, \ldots, k$, are the angles at the vertices of $\partial D$, and the function $c : [0, 2\pi] \to \mathbb{R}$ is defined by
(6.6) $c(\theta) := \int_0^{\infty} \frac{4 \sinh((\pi - \theta) y)}{\sinh(\pi y) \cosh(\theta y)}\, dy.$
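Formulas (6.5) and (6.6) can be tested on the unit square, where the heat content is available in closed form as a product of one-dimensional series. The sketch below assumes the Dirichlet condition $u = 0$ on $\partial D$ (the boundary condition is not written out in this passage); it evaluates $c(\pi/2)$ by a midpoint rule and compares the exact heat loss with the two-term asymptotics. Function names, the cutoff `y_max` and the number of series terms are illustrative choices.

```python
import math

def corner_coefficient(theta, y_max=40.0, steps=40000):
    """Midpoint-rule approximation of c(theta) from (6.6), truncated at y_max."""
    h = y_max / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += (4.0 * math.sinh((math.pi - theta) * y)
                  / (math.sinh(math.pi * y) * math.cosh(theta * y)))
    return total * h

def heat_content_unit_square(t, terms=200):
    """Q_D(t) for the unit square with Dirichlet boundary data and u(x, 0) = 1."""
    q = 0.0
    for j in range(terms):
        k = 2 * j + 1  # only odd sine modes contribute
        q += math.exp(-k * k * math.pi ** 2 * t) / (k * k)
    q *= 8.0 / math.pi ** 2
    return q * q  # product of two identical one-dimensional factors

def heat_loss_asymptotic(t):
    """Right-hand side of (6.5) for the unit square: perimeter 4, four right angles."""
    c = corner_coefficient(math.pi / 2)
    return 2.0 / math.sqrt(math.pi) * 4.0 * math.sqrt(t) - t * 4.0 * c

if __name__ == "__main__":
    t = 0.01
    print(1.0 - heat_content_unit_square(t), heat_loss_asymptotic(t))
```

For the right angle the integral can also be done by hand and gives $c(\pi/2) = 4/\pi$, which the quadrature reproduces.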
The first term in the right-hand side of (6.5) corresponds to the loss of heat through the sides of the polygon $\partial D$ and the second term is the amount of heat lost near the vertices. For an arbitrary open bounded and connected set $D$ in $\mathbb{R}^2$ with smooth boundary ($\partial D \in C^3$), we have ([vdBDa], [Du]):
$E_D(t) = \frac{2}{\sqrt{\pi}}\, |\partial D|_1\, t^{1/2} - \pi \chi(D)\, t + O(t^{3/2}), \quad t \to +0,$
where $\chi(D)$ is the Euler-Poincaré characteristic of $D$ (i.e., $1 - \chi(D)$ is the number of holes in $D$). Moreover, van den Berg ([vdB], 1999) establishes some bounds for $E_D(t)$ for $D$ an arbitrary open set in $\mathbb{R}^n$ with finite volume and a fractal boundary which satisfies a uniform capacitary density condition. The main result in [FLV1], [FLV2] is the following:
THEOREM 3. For the triadic von Koch snowflake (with Minkowski dimension $d = 2\ln 2/\ln 3$):
(6.7) $E_D(t) = p(\ln t)\, t^{a} + q(\ln t)\, t + O(e^{-r/t})$ as $t \to +0$,
where $p$ and $q$ are continuous, $(\ln 9)$-periodic functions and $r$ is some positive constant.
Moreover:
(6.8) $p_{\min}(z) \le P(z) \le p_{\max}(z),$
where $z = \ln t / \ln 9$ (so that $t = 9^z$), $P(z) =$
.
..
3VS 3^3 ^
k+ f4\ ' £^? , /4V +Z
fc=-oo V
'
m=l
p(zln9); 2 2 k l eX1? Z A-K m 9 +'
x)m+1
-
(
'
(6-9)
Pmax(*)
= 3^ E
Both functions $p_{\min}(z)$ and $p_{\max}(z)$ are positive, continuous and 1-periodic functions. There are several other new results concerning self-similar domains such as snowflakes, cabbages, ... ([FLV1], [LV], [vdB], [MoVa]); they all show the existence of an oscillating term as in (6.8); these oscillations, which arise naturally in the calculations, are related to the renewal theorem (see [LV]).
ACKNOWLEDGMENT: The author expresses her gratitude to the organizers of the Congress and to Al-Azhar University for supporting her visit, and to H. Berriche for improving the final version of this paper.
REFERENCES
[vdB] M. van den Berg, Heat equation on the arithmetic von Koch snowflake, Probability Theory, to appear, (1999).
[vdBDa] M. van den Berg and E. B. Davies, Heat flow out of regions in $\mathbb{R}^m$, Math. Z. v.202, (1989), p.463-482.
[vdBSr] M. van den Berg and S. Srisatkunarajah, Heat flow and Brownian motion for a region in $\mathbb{R}^2$ with a polygonal boundary, Probab. Th. Rel. Fields v.86, (1990), p.41-52.
[Be1] M. V. Berry, Distribution of modes in fractal resonators, in Structural stability in Physics, Springer-Verlag, Berlin, (1979), p.51-53.
[Be2] M. V. Berry, Some geometric aspects of wave motion: wavefront dislocations, diffraction catastrophes, diffractals, in Geometry of the Laplace operator, Proc. Symp. Pure Math., A.M.S. v.36, (1980), p.13-38.
[Bo] G. Bouligand, Ensembles impropres et nombre dimensionnel, Bull. Sci. Math., 2, t.52, (1928), p.320-344 et 361-376.
[BC] J. Brossard and R. Carmona, Can one hear the dimension of a fractal?, Comm. Math. Phys. 104 (1986), 103-122.
[CH] R. Courant and D. Hilbert, Methods of mathematical physics, Vol. 1, English transl., Interscience, New York, 1953.
[DG] J. J. Duistermaat and V. W. Guillemin, The spectrum of positive elliptic operators and periodic bicharacteristics, Invent. Math. 29 (1975), 39-79.
[Du] B. Duplantier, Can one "hear" the thermodynamics of a (rough) colloid?, Phys. Rev. Lett. v.66, (1991), p.1555-1558.
[Fa] K. Falconer, Fractal geometry, mathematical foundations and applications, John Wiley & Sons, Chichester, (1990).
[F] J. Fleckinger-Pelle, On eigenvalue problems associated with fractal domains, Ordinary and Partial Differential Equations, vol.2, Sleeman, Jarvis Ed., Pitman Research Notes in Math. 216, (1988), p.60-72.
[FLV1] J. Fleckinger, M. Levitin and D. Vassiliev, Heat content of the triadic von Koch snowflake, Proc. R. Soc. Lond. Ser. 3, V.71, (1995), p.372-396.
[FLV2] J. Fleckinger, M. Levitin and D. Vassiliev, "The heat equation on a snowflake", Int. Jal. Appl. Sc. Comp., V.2, N.2, (1996), p.289-305.
[FMt] J. Fleckinger and G. Metivier, Theorie spectrale des operateurs uniformement elliptiques sur quelques ouverts irreguliers, C.R. Acad. Sci. Paris Ser. A 276 (1973), p.913-916.
[FV1] J. Fleckinger and D. G. Vasil'ev, Tambour fractal: exemple d'une formule asymptotique a deux termes pour la "fonction de comptage", C.R. Acad. Sci., Paris, Ser. I, Math., t.311 (1990), p.867-872.
[FV2] J. Fleckinger and D. G. Vasil'ev, An example of a two-term asymptotics for the "counting function" of a fractal drum, Transact. A.M.S. V.337, N.1, (1993), p.99-116.
[FV3] J. Fleckinger and D. G. Vasil'ev, "Vibration du tambour fractal: estimation du 2eme terme de la fonction de comptage et determination de la dimension du bord", Matapli, Paris, Oct 1992, p.29-36.
[G] C. F. Gauss, Disquisitiones arithmeticae, Leipzig, (1801).
[GWW] C. Gordon, D. Webb and S. Wolpert, Isospectral plane domains and surfaces via Riemannian orbifolds, Invent. Math., 110, N.1, (1992), p.1-22.
[Iv1] V. Ya. Ivrii, On the second term of the spectral asymptotics for the Laplace-Beltrami operator on a manifold with a boundary, Funktsional. Anal. i Prilozhen. 14 (1980), No 2, 25-35; English transl. Funct. Anal. Appl. 14 (1980).
[Iv2] V. Ya. Ivrii, Precise spectral asymptotics for elliptic operators acting in fiberings over manifolds with boundary, Lecture Notes in Math., Vol.1100, Springer-Verlag, Berlin, (1984).
[K] M. Kac, Can one hear the shape of a drum?, Amer. Math. Monthly, 73 (1966) 1-23.
[L1] M. L. Lapidus, Fractal drum, inverse spectral problems for elliptic operators and a partial resolution of the Weyl-Berry conjecture, Trans. A.M.S. v.325, (1991), p.465-529.
[LF1] M. L. Lapidus and J. Fleckinger, The vibrations of a fractal drum, Lecture Notes in Pure and Applied Mathematics, Differential Equations, Marcel Dekker, N.Y.-Basel, (1989), pp. 423-436.
[LF2] M. L. Lapidus and J. Fleckinger, Tambour fractal: vers une resolution de la conjecture de Weyl-Berry pour les valeurs propres du laplacien, C.R. Acad. Sci. Paris Ser. I Math. 306 (1988), 171-175.
[M] S. Minakshisundaram and A. Pleijel, Some properties of the eigenfunctions of the Laplace operator on Riemannian manifolds, Canadian Jal Math., I, (1949), p.242-256.
[Me 1] R. B. Melrose, Weyl's conjecture for manifolds with concave boundary, Geometry of the Laplace Operator, Proc. Symp. Pure Math., Vol. 36, Amer. Math. Soc., Providence, 1980, pp. 257-273.
[Me 2] R. B. Melrose, The trace of the wave group, Contemp. Math., Vol. 27, Amer. Math. Soc., Providence, 1984, pp. 127-167.
[Mt1] G. Metivier, Etude asymptotique des valeurs propres et de la fonction spectrale de problemes aux limites, These de Doctorat d'Etat, Mathematiques, Universite de Nice, France, 1976.
[Mt2] G. Metivier, Valeurs propres de problemes aux limites elliptiques irreguliers, Bull. Soc. Math. France, Mem. 51-52 (1977), 125-219.
[MoVa] S. Molchanov and B. Vainberg, On spectral asymptotics for domains with fractal boundaries, (1995), preprint.
[S1] Yu. G. Safarov, Asymptotics of the spectrum of a boundary value problem with periodic billiard trajectories, Funktsional. Anal. i Prilozhen. 21 (1987), No.4, 90-92; English translation in Funct. Anal. Appl. 21 (1987).
[S2] Yu. G. Safarov, Precise asymptotics of the spectrum of a boundary value problem and periodic billiards, Izvestija Akad. Nauk SSSR, Mathematical Series 52 (1988), No. 6, 1230-1251; English translation in Mathematics of the USSR - Izvestija.
[Se] R. Seeley, A sharp asymptotic remainder estimate for the eigenvalues of the Laplacian in a domain of $\mathbb{R}^3$, Adv. in Math. 29, (1978), p.244-269.
[U] H. Urakawa, Bounded domains which are isospectral but not congruent, Ann. Sci. Ecole Normale Sup. 15, (1982), 441-456.
[V1] D. Vasil'ev, Asymptotics of the spectrum of a boundary value problem, Trudy Moscov. Mat. Obsch. 49 (1986), 167-237; English translation in Trans. Moscow Math. Soc., 1987, pp. 173-245.
[V2] D. Vasil'ev, One can hear the dimension of a connected fractal in $\mathbb{R}^2$, in Petkov & Lazarov - Integral Equations and Inverse Problems; Longman Academic, Scientific & Technical, 1990 (to appear).
[W1] H. Weyl, Über die asymptotische Verteilung der Eigenwerte, Gött. Nach. (1911), 110-117.
[W2] H. Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen, Math. Ann. 71 (1912), 441-479.
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 323-356)
INTERMEDIATE STATES: SOME NONCLASSICAL PROPERTIES
M. SEBAWE ABDALLA† AND A.-S. F. OBADA*
† Department of Mathematics, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
* Department of Mathematics, Faculty of Science, El-Azhar University, Nasr City 11884, Cairo, Egypt

Abstract. In this article we consider in some detail some new classes of states. These states are intermediate states either between the pure number (Fock) states and the (non-pure) chaotic state (thermal state), such as the geometric state, or between the coherent state and the number state, such as the binomial state. We extend our discussion to include some other states such as even (odd) coherent states, even (odd) binomial states, the phased generalized binomial state, etc. In our study of these states we pay attention to a discussion of the nonclassical properties, besides the statistical properties, for example correlation functions, squeezing, and quasiprobability distribution functions (P-representation, W-Wigner, and Q-function). Furthermore we consider the field distribution and the photon number distribution, as well as the phase properties. Finally, some schemes for the production of these states are presented.

1. Introduction.
Since the early days of quantum optics, it has been well known that the Fock (number) state and the coherent state represent two of the most fundamental states of a single boson mode [1,2]. The number state $|n\rangle$ is determined by its photon number and the phase is entirely random. In this state the amplitude of the field has a zero expectation value. The coherent state $|\alpha\rangle$ can be generated from the action of the Glauber displacement operator $D(\alpha) = \exp(\alpha a^\dagger - \alpha^* a)$ on the vacuum state $|0\rangle$, such that
(1.1) $|\alpha\rangle = \exp(\alpha a^\dagger - \alpha^* a)|0\rangle = \exp(-\tfrac{1}{2}|\alpha|^2) \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, |n\rangle$
with $a$ ($a^\dagger$) standing for the annihilation (creation) boson operator and $\alpha$ complex. For this state the phase is determined and the amplitude of the field has a non-zero value. In fact the coherent state is a linear combination of all $|n\rangle$ states with coefficients chosen such that the photon counting distribution is Poissonian. To generate such a state one can use the fact that a classical charge distribution radiates a field in a coherent state, while a single atom in its first excited state in the absence of external interactions radiates a field in the $|n = 1\rangle$ state. It is worthwhile to refer to another state, the chaotic (thermal) state [3], whose density operator $\rho_{th}$ is given by
(1.2) $\rho_{th} = \sum_{n=0}^{\infty} \frac{\bar{n}^n}{(1 + \bar{n})^{n+1}}\, |n\rangle\langle n|$
where $\bar{n}^n/(1 + \bar{n})^{n+1}$ is the Bose-Einstein distribution function. Recently there has been a great deal of interest in producing and generating new states in addition to the previous states. Most of these states are intermediate states and can be generated from the above states. In fact they interpolate between distinctive states, reducing to them in different limits of the parameters involved. In consequence, a unifying role is played by an intermediate state describing the physical properties of its limiting states. The earliest example in the literature is the binomial state [4], which interpolates between the coherent state and the number (Fock) state. Another is the negative binomial state [5], which bridges between the coherent state and the quasi-thermal state (i.e. the Susskind-Glogower phase state [6]). The notions of the even binomial states [7] (between the even-coherent and the even-number state) and the q-deformed binomial states [8] (between the q-coherent and the q-number state) have been introduced. The negative binomial states were also generalized to the even (odd) negative binomial states [9], which interpolate between the even (odd) coherent state and the even (odd) quasi-thermal state. The logarithmic state, as a special case of the negative binomial states with the n = 0 term removed, was also investigated [10]. There are also many other intermediate states, among which one can cite: (i) the generalized geometric state [11], between the number state and the (nonpure) chaotic (thermal) states; (ii) the intermediate number phase state [12], between the number state and the Pegg-Barnett phase state [13]; (iii) the intermediate number squeezed state [14], between the number state and the squeezed coherent state; (iv) the even and odd intermediate number squeezed state [15], between the even (odd) number state and the even (odd) squeezed state.
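The photon statistics of the two reference states of the Introduction, the coherent state (1.1) and the thermal state (1.2), can be compared with a few lines of code. This is a minimal sketch: the Poissonian weights $e^{-|\alpha|^2}|\alpha|^{2n}/n!$ follow from (1.1), the Bose-Einstein weights from (1.2), and the Fano factor (variance over mean) is used here as an illustrative measure that is not introduced in the text. All function names are hypothetical.

```python
import math

def coherent_pmf(n, alpha_sq):
    """Photon number probability of a coherent state with |alpha|^2 = alpha_sq > 0."""
    return math.exp(-alpha_sq + n * math.log(alpha_sq) - math.lgamma(n + 1))

def thermal_pmf(n, nbar):
    """Bose-Einstein weight nbar^n / (1 + nbar)^(n + 1) of the thermal state."""
    return nbar ** n / (1.0 + nbar) ** (n + 1)

def mean_and_fano(pmf, n_max=200):
    """Mean photon number and Fano factor from a distribution truncated at n_max."""
    probs = [pmf(n) for n in range(n_max + 1)]
    mean = sum(n * p for n, p in enumerate(probs))
    second = sum(n * n * p for n, p in enumerate(probs))
    return mean, (second - mean ** 2) / mean

if __name__ == "__main__":
    print(mean_and_fano(lambda n: coherent_pmf(n, 2.0)))  # Poissonian: Fano factor 1
    print(mean_and_fano(lambda n: thermal_pmf(n, 2.0)))   # thermal: Fano factor 1 + nbar
```

The intermediate states reviewed in the following sections sit between such limiting distributions, which is why deviations of the Fano factor from 1 are a convenient first diagnostic.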
Most of the theoretical studies concerning these states have focused on their construction and the possible occurrence of various nonclassical effects exhibited by them. All the above mentioned states are grouped under the category of nonclassical states of light, and the main purpose of the present work is to review some nonclassical properties of some of these states. To reach this goal we shall make our starting point the even and odd coherent states. This will be done in the following section.

2. Even and odd coherent states.
2.1 Generation of the state.
During their study of a singular nonstationary one-dimensional oscillator, Dodonov et al. introduced the even and odd coherent states [16]. They have shown that these states separately form complete sets in the Hilbert spaces of even and odd functions. These functions can be written as follows
(2.1) $|\alpha\rangle_\pm = \tfrac{1}{2}\lambda_\pm (|\alpha\rangle \pm |-\alpha\rangle),$
or in terms of number states
(2.2) $|\alpha\rangle_+ = \lambda_+ \sum_{n=0}^{\infty} \frac{\alpha^{2n}}{\sqrt{(2n)!}}\, |2n\rangle$
and
(2.3) $|\alpha\rangle_- = \lambda_- \sum_{n=0}^{\infty} \frac{\alpha^{2n+1}}{\sqrt{(2n+1)!}}\, |2n+1\rangle,$
where $\lambda_+$ and $\lambda_-$ are the normalization constants for the even and odd coherent states respectively, given by the formulae
(2.4) $\lambda_+ = [\cosh|\alpha|^2]^{-1/2}, \quad \lambda_- = [\sinh|\alpha|^2]^{-1/2}.$
Several methods to construct these states are given in the literature. One of them is to use the inversion operator $I$, which has the properties $I a I = -a$, $I a^\dagger I = -a^\dagger$ and consequently $I D(\alpha) I = D(-\alpha)$. One can construct two operators, each generating irreducible representations of the group consisting of two elements, the unit operator and the inversion operator $I$. The first operator generates the symmetric representation of the group and takes the form
(2.5) $\cosh(\alpha a^\dagger - \alpha^* a) = \tfrac{1}{2}[D(\alpha) + D(-\alpha)] = D_+(\alpha),$
while the second operator generates the anti-symmetric representation of the group, thus
(2.6) $\sinh(\alpha a^\dagger - \alpha^* a) = \tfrac{1}{2}[D(\alpha) - D(-\alpha)] = D_-(\alpha),$
and hence $|\alpha\rangle_\pm = \lambda_\pm D_\pm(\alpha)|0\rangle$. It is easy to see that the parity of these functions with respect to $\alpha$ is the same as with respect to the coordinate, therefore these functions completely describe the even and odd coherent states. Here it would be interesting to refer to the solution of the problem of the nonstationary oscillator with a wall at the origin of the coordinates, which has been obtained in terms of odd coherent states; for more details see ref. [16]. Recently it has been shown that quantum interference between coherent states leads to the generation of states whose properties are as far as one can imagine from classical states. For example, one can see that the superposition of two coherent states ($|\alpha\rangle_+$) can arise as a consequence of the propagation of coherent light through an amplitude-dispersive medium. Furthermore it has been shown that the even coherent states exhibit ordinary (second order) squeezing as well as fourth order squeezing. The idea of superposition of coherent states has also been extended to include the quadrature variances of a continuous one-dimensional superposition of coherent states, which shows a significant reduction of fluctuations in one of the quadratures. The nonclassical properties of even and odd coherent states have been studied in detail, see for example refs. [17-20].
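2.2 Photon number.
The photon number distribution for both even and odd states can be obtained from equation (2.1). For the even case we have [17]
(2.7) $P^{(e)}(n) = \begin{cases} \dfrac{|\alpha|^{2n}}{n!\, \cosh|\alpha|^2}, & n \text{ even}, \\ 0, & n \text{ odd}, \end{cases}$
while the probability $P^{(o)}(n)$ of finding $n$ photons for the odd state is
(2.8) $P^{(o)}(n) = \begin{cases} \dfrac{|\alpha|^{2n}}{n!\, \sinh|\alpha|^2}, & n \text{ odd}, \\ 0, & n \text{ even}. \end{cases}$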
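The distributions (2.7) and (2.8) are easy to tabulate. The sketch below checks that each is normalized and computes the mean photon number, which by direct summation of the series is $|\alpha|^2 \tanh|\alpha|^2$ for the even state and $|\alpha|^2 \coth|\alpha|^2$ for the odd state. The function names and the truncation at `n_max` are illustrative choices.

```python
import math

def even_coherent_pmf(n, alpha_sq):
    """P^(e)(n) of (2.7); alpha_sq stands for |alpha|^2."""
    if n % 2 == 1:
        return 0.0
    return alpha_sq ** n / (math.factorial(n) * math.cosh(alpha_sq))

def odd_coherent_pmf(n, alpha_sq):
    """P^(o)(n) of (2.8); alpha_sq stands for |alpha|^2."""
    if n % 2 == 0:
        return 0.0
    return alpha_sq ** n / (math.factorial(n) * math.sinh(alpha_sq))

def total_and_mean(pmf, alpha_sq, n_max=60):
    """Total probability and mean photon number, truncated at n_max."""
    probs = [pmf(n, alpha_sq) for n in range(n_max + 1)]
    return sum(probs), sum(n * p for n, p in enumerate(probs))

if __name__ == "__main__":
    print(total_and_mean(even_coherent_pmf, 2.0))
    print(total_and_mean(odd_coherent_pmf, 2.0))
```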
It has been shown that the photon number probability distributions are oscillatory and also resemble those associated with the Poisson distribution of the ordinary coherent state (Figure (1) of Ref. [17] may be referred to).
2.3 Quasidistribution function.
The representation of quantum fields in phase space in terms of quasiprobabilities is widely used in the field of quantum optics, with particular emphasis on the W-Wigner function [21,22], the Q-function [23], and the Glauber-Sudarshan P-representation [2,23]. Therefore in this subsection we shall be concerned with the W-Wigner and Q-functions for both even and odd coherent states. To find the W-Wigner and Q-functions we have to calculate the characteristic function $C(\xi, s)$, which is associated with the order of the bosonic (photon) operators and is given by
(2.9) $C(\xi, s) = \mathrm{Tr}[\rho \exp(\xi a^\dagger - \xi^* a + \tfrac{s}{2}|\xi|^2)],$
where $\rho$ is the density matrix for the desired state and $s$ is a parameter that defines the relevant quasiprobability distribution function, which takes the values $-1, 0, 1$ corresponding to the Husimi Q-function, the W-Wigner function, and the P-representation, respectively. The quasi-distribution functions $F(\beta, s)$ are defined through the Fourier transform of the characteristic function in the form
(2.10) $F(\beta, s) = \frac{1}{\pi^2} \int d^2\xi\, \exp(\xi\beta^* - \xi^*\beta)\, C(\xi, s),$
where $W(\beta) = F(\beta, 0)$, $Q(\beta) = F(\beta, -1)$ and $P(\beta) = F(\beta, 1)$. For any state $|\psi_g\rangle$ expanded in terms of the Fock states $|n\rangle$ as follows
(2.11) $|\psi_g\rangle = \sum_n C_n |n\rangle,$ with $\rho = |\psi_g\rangle\langle\psi_g|,$
327 the generalized quasidistribution function /(/?, s) of equ. (2.10) can be cast in the equivalent form [24] 2e-(A)l^
m
pmpn
HJ
1 + s\
/
2
\
rm.n
4\0\2
Tn,n
(2.12)
Hence the Wigner function non becomes W(/?) =/(/?, 0) = ^ e " 2 l ' 3 l 2 ^ ( - l ) " C c n ( 2 / 3 ) ' " - " ^ L - - " ( 4 | / 5 | 2 )
(2.13)
while the Husimi Q-function takes the form
Q(/3) =/(/3,-1) = n
I n + Ol \
7r
^
C
— m,n
„C^4=
(2.14)
y/m\n\
f_x\m
where L°(x) = ^ p-j^^—is the asociated Laguerre polynomial. For P-representation m=o \ n — mJ one has to take the proper limit as s —> 1. For the even coherent states case we can write the expression for the Wigner function as follows W(e){p)
=
e X p(| Q | 2 - 2|^| 2 ) 27rcosh |Q!|^
2 | a | . c o s 2 ( a / J , + a,p)
2 + e 2|a| c o s 2 ( a / 3 *
_ a.0)].
(2.15)
For the odd coherent states case we have w(o){p)
=
exp(|a| 2 -2\f}\2) 27rsmh \a\z
[e_a|a|a c o s 2 ( a
^
+ a,p)
_ e2\a\> cos2{a0*
_ a.pym
(2.16)
Figures for these functions show negative values which is a sign of nonclassical effects. Similarly we can write the Q-function in the case of the even coherent states in the form Q(e)(/3) = ^ ^ j j c o s t a / T + a'/?) + e2lQl2 cos(a/r - a*/?)],
(2.17)
while for the odd case we get
Q(0){/?)
= 5rnS [ C O S ( a / r + a'P) " e2'a'2 C ° S(a/? * ~ at/3)1,
(2 18)
-
After this quick review of the even and odd coherent states, we look at the intermediate states; we begin with the binomial states.
3. The Binomial states.
As a natural way to study the intermediate states, especially the states that interpolate between coherent states and number states, one may tackle the binomial state [4,25]. The binomial state $|\eta, M\rangle$ is a linear combination of the number states $|0\rangle, |1\rangle, \ldots, |M\rangle$ with coefficients chosen such that the photon counting probability distribution is binomial with mean $|\eta|^2 M$. The binomial state is defined as
(3.1) $|\psi_{\eta,M}\rangle = |\eta, M\rangle = \sum_{n=0}^{M} B_n^M\, |n\rangle,$
where $B_n^M$ are the binomial coefficients given by
(3.2) $B_n^M = \binom{M}{n}^{1/2} \eta^n\, (1 - |\eta|^2)^{(M-n)/2}.$
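The interpolating character of the binomial state can be seen directly from the probabilities $|B_n^M|^2$ of (3.2). The sketch below takes $\eta$ real for simplicity and evaluates the weights through logarithms so that large $M$ does not overflow. It then checks the normalization and the mean $|\eta|^2 M$, the approach to a Poisson distribution when $M \to \infty$ with $M|\eta|^2$ fixed, and the approach to the number state $|M\rangle$ when $|\eta| \to 1$. Function names and parameter values are illustrative.

```python
import math

def binomial_state_pmf(n, M, eta_sq):
    """|B_n^M|^2 from (3.2), with eta_sq = |eta|^2 strictly between 0 and 1."""
    log_p = (math.lgamma(M + 1) - math.lgamma(n + 1) - math.lgamma(M - n + 1)
             + n * math.log(eta_sq) + (M - n) * math.log1p(-eta_sq))
    return math.exp(log_p)

def poisson_pmf(n, mean):
    """Poisson weight of a coherent state with |alpha|^2 = mean."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))

def max_gap_to_poisson(M, alpha_sq, n_max=40):
    """Largest pointwise difference between binomial and Poisson weights for n <= n_max."""
    eta_sq = alpha_sq / M
    return max(abs(binomial_state_pmf(n, M, eta_sq) - poisson_pmf(n, alpha_sq))
               for n in range(n_max + 1))

if __name__ == "__main__":
    M, eta_sq = 10, 0.25
    probs = [binomial_state_pmf(n, M, eta_sq) for n in range(M + 1)]
    print(sum(probs), sum(n * p for n, p in enumerate(probs)))  # about 1 and eta_sq * M
    print(max_gap_to_poisson(2000, 2.0))                        # small: coherent-state limit
    print(binomial_state_pmf(10, 10, 0.999999))                 # close to 1: number-state limit
```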
The binomial state is quantum mechanical in nature and it produces light which is antibunched and has sub-Poissonian behaviour, but it cannot have the minimum uncertainty product, since it contains a finite number of $|n\rangle$ states. To generate the binomial state different methods may be used [7]. One of these methods is to use the Hamiltonian
(3.3) $H = \mu J_+ + \mu^* J_-,$
where $J_+$ ($J_-$) are the raising (lowering) operators related to the angular momentum operator, which satisfy the SU(2) commutation relations $[J_+, J_-] = 2J_z$ and $[J_z, J_\pm] = \pm J_\pm$. The evolution operator of the Hamiltonian (3.3) can be written as
(3.4) $U(t) = \exp(\zeta J_+ - \zeta^* J_-),$
where $\zeta = -i\mu t$. Now if we use equation (3.4) to act on the state $|l, -l\rangle$, where $l$ is the co-operation number, we find
(3.5) $|\psi\rangle = U(t)|l, -l\rangle = \exp(\zeta J_+ - \zeta^* J_-)|l, -l\rangle.$
By taking $\zeta = \tfrac{\theta}{2}\exp(-i\varphi)$ and defining $\tau = e^{-i\varphi}\tan\tfrac{\theta}{2}$, equation (3.5) becomes
(3.6) $|\psi\rangle = (1 + |\tau|^2)^{-l} \sum_{m=-l}^{l} \binom{2l}{l+m}^{1/2} \tau^{l+m}\, |l, m\rangle.$
By setting $\tau = \eta/(1 - |\eta|^2)^{1/2}$, $l + m = n$ and $2l = M$, equation (3.6) takes the form of equation (3.1). Thus, as one can see, we have an advantage to consider the binomial
state as a good example of the intermediate state. However, we shall concentrate on the so-called even and odd binomial states [7,26,27]. These states interpolate between even (odd) coherent states and even (odd) number states. This will be seen in the following subsections.
3.1 The even binomial state.
The even binomial state $|\psi_e\rangle$ is defined as follows
(3.7) $|\psi_e\rangle = \frac{\lambda_1}{2}(|M, \eta\rangle + |M, -\eta\rangle) = \lambda_1 \sum_{n=0}^{[M/2]} \binom{M}{2n}^{1/2} \eta^{2n} (1 - |\eta|^2)^{(M-2n)/2}\, |2n\rangle = \lambda_1 \sum_{n=0}^{[M/2]} B_{2n}^M\, |2n\rangle,$
where $[M/2]$ is the largest integer less than or equal to $M/2$, and $\lambda_1$ is the normalization constant given by
(3.8) $|\lambda_1|^2 = 2[1 + (1 - 2|\eta|^2)^M]^{-1}.$
As a limiting case, when $\eta \to 0$ and $M \to \infty$ such that $M|\eta|^2 \to |\alpha|^2$, equation (3.7) immediately reduces to equation (2.2) corresponding to the even coherent state $|\alpha\rangle_+$.
i) Correlation function
To discuss anti-bunching we have to consider the Glauber second order (zero time) correlation function
(3.9) $g^{(2)}(0) = \frac{\langle \hat{n}^2 \rangle - \langle \hat{n} \rangle}{\langle \hat{n} \rangle^2}.$
From equation (3.7) we calculate the expectation value of the photon number $\hat{n}$, as well as the second moment $\hat{n}^2$, from which we can rewrite equation (3.9) as follows
(3.10) $g^{(2)}(0) = \left(1 - \frac{1}{M}\right) \frac{[1 + (1 - 2|\eta|^2)^M]\,[1 + (1 - 2|\eta|^2)^{M-2}]}{[1 - (1 - 2|\eta|^2)^{M-1}]^2}.$
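The closed form (3.10) can be cross-checked against a direct evaluation of $g^{(2)}(0) = \langle \hat{n}(\hat{n}-1)\rangle / \langle \hat{n}\rangle^2$ from the even-$n$ weights of the binomial distribution. The sketch below does this for real $\eta$ and also evaluates the closed form for large $M$ with $M|\eta|^2 = |\alpha|^2$ fixed. It is an illustrative check with hypothetical function names, not the numerical procedure of Ref. [7].

```python
import math

def binomial_weight(n, M, eta_sq):
    """|B_n^M|^2 with eta_sq = |eta|^2 strictly between 0 and 1, computed via logarithms."""
    return math.exp(math.lgamma(M + 1) - math.lgamma(n + 1) - math.lgamma(M - n + 1)
                    + n * math.log(eta_sq) + (M - n) * math.log1p(-eta_sq))

def g2_even_binomial_numeric(M, eta_sq):
    """g2(0) computed directly from the renormalized even-n weights."""
    weights = {n: binomial_weight(n, M, eta_sq) for n in range(0, M + 1, 2)}
    norm = sum(weights.values())
    mean_n = sum(n * w for n, w in weights.items()) / norm
    mean_n_nm1 = sum(n * (n - 1) * w for n, w in weights.items()) / norm
    return mean_n_nm1 / mean_n ** 2

def g2_even_binomial_closed(M, eta_sq):
    """Closed form (3.10), with x = 1 - 2|eta|^2."""
    x = 1.0 - 2.0 * eta_sq
    return (1.0 - 1.0 / M) * (1.0 + x ** M) * (1.0 + x ** (M - 2)) / (1.0 - x ** (M - 1)) ** 2

if __name__ == "__main__":
    print(g2_even_binomial_numeric(10, 0.16), g2_even_binomial_closed(10, 0.16))
    # even-coherent-state limit: compare with coth^2(|alpha|^2) for |alpha|^2 = 1
    print(g2_even_binomial_closed(4000, 1.0 / 4000), 1.0 / math.tanh(1.0) ** 2)
```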
Equation (3.10) goes to $\coth^2|\alpha|^2$ if $\eta \to 0$ and $M \to \infty$, which represents the value of the correlation function for the even coherent state case. Numerical investigation of equation (3.10) shows that the system has super-Poissonian behavior for small values of the parameter $\eta$, but for large values of $M$ the interval of super-Poissonian behavior diminishes, while the system shows sub-Poissonian behavior for large values of $\eta$. This means that increasing the number of photons in the even binomial state changes the distribution from super-Poissonian to sub-Poissonian (Figure (1) of Ref. [7] may be consulted).
ii) Squeezing
The squeezing phenomenon represents one of the most interesting phenomena in the field of quantum optics, and is a direct quantum effect of Heisenberg's uncertainty principle. It reflects the reduced quantum fluctuations in one of the field quadratures at the expense
of the other corresponding stretched quadrature. Our aim in the present subsection is to consider two cases of squeezing. The first is the normal squeezing, which can be discussed through the fluctuations in the quadratures
(3.11) $X = \tfrac{1}{2}(a^\dagger + a); \quad Y = \tfrac{1}{2i}(a - a^\dagger),$
where $a$ and $a^\dagger$ satisfy the (bosonic) commutation relation $[a, a^\dagger] = 1$, while $[X, Y] = \tfrac{i}{2}$. The second case is the amplitude-squared squeezing, which arises naturally in second-harmonic generation and in a number of nonlinear optical processes, and is defined through
(3.12) $d_1 = \tfrac{1}{2}(a^2 + a^{\dagger 2}); \quad d_2 = \tfrac{1}{2i}(a^2 - a^{\dagger 2}),$
which satisfy the commutation relation $[d_1, d_2] = i(1 + 2\hat{n})$. To discuss squeezing we have first to calculate the expectation value of the operator $a^{\dagger 2s}$ with respect to the state given by equation (3.7). For $2s < M$, we find
«2s
[fir 3
/M\
(M - 2n)\ (M-2n-2s)\
W
m2\M-2n
(3.13)
where s is a positive integer. The field is said to be squeezed when AX or AY < \ and AX .AY > -^, must hold. From the numerical study of ref [7] one can see that squeezing occurs for all values of M , but it becomes more effective as M increases. However, the maximum squeezing is shifted to lower 77 as M increases. Also it is noted that as one changes the phase of the parameter r] the squeezing changes between AX and AY . For the second case (amplitude 2 squared squeezing) the field is said to be in an amplitude-squared squeezed state if Adi or Ad2 < n + | . and Adi Ad2 > (n + \)2 must hold. The numerical study of this phenomenon proves that amplitude squared squeezing is effective for large values of M , however the point of maximum squeezing in this case moves slightly as M increases (see Figs 2 and 3 of Ref 7). iii) Quasi-distribution function As in the previous section we shall consider some statistical aspects related to the even binomial state. We are only concerned with the quasiprobability distribution function, WWigner and Q-functions. This is because the P representation is highly singular due to the non-classical character of the even binomial state. From equation (2.9) with p = \ipe >< ipe\, together with equation (3.7) one can calculate the characteristic function C( e '(£, s) in
331 the form (Ml
C^^s) =
hi2 2 \Xini-\V\ ) Y,(2n) U-M n=0 ^ '
exp[-i( S + l)|£| 2 ]WI£| 2 ), (3-14)
2 M
where Ln(x) are Laguerre polynomials of order n (3.15) r=0
v
'
From equs (2.13) and (2.14) we find the diagonal terms in the the Wigner function, 2
M2 ^)(/5 ) =^iA 1 i 2 (i-M 2 ) M x:( 2n ) i-W J
ex P [-2|/?| 2 ]L 2n (4|/?| 2 ).
(3.16)
While for the Q-function we find
\H2
Q(c)03) = ^ ( i - H2)M n=0
V
exp[-|/?| 2 ],
(3.17)
'
where we have only taken the diagonal terms of the density matrix $\rho$. For the Wigner function $W(\beta)$ of equation (3.16), when $|\eta|$ is taken very small ($\sim 0.2$), the distribution is almost Gaussian and the shape of the field is insensitive to changes in $M$. Increasing the value of $M$ makes the peak slightly sharper. However, if one increases $|\eta|$ ($\sim 0.6$), which means that effects due to higher excitations are of considerable importance, one notes a remarkable change in the shape of the function as $M$ increases. Sharpening of the peak as well as the appearance of shallower wobbles are signs of the contributions due to higher excitations. In contrast, one finds that the Q-function of equation (3.17) is insensitive to any change in either $|\eta|$ or $M$ (Figs 4-6 of Ref. [7] may be consulted for details).
3.2 The odd binomial state.
The odd binomial state is the state which interpolates between the odd coherent state and the odd number state and is defined by [27]
(3.18) $|\psi_o\rangle = \frac{\lambda_2}{2}(|M, \eta\rangle - |M, -\eta\rangle) = \lambda_2 \sum_{n=0}^{[(M-1)/2]} B_{2n+1}^M\, |2n+1\rangle,$
where $\lambda_2$ is the normalization constant given by
(3.19) $|\lambda_2|^2 = 2[1 - (1 - 2|\eta|^2)^M]^{-1}.$
Since the linear combination of the odd binomial state does not contain the vacuum state, the range of the parameter $\eta$ will be between 0 and 1 such that $0 < |\eta| < 1$. By taking the limit $M \to \infty$ and $\eta \to 0$, equation (3.18) tends to equation (2.3) corresponding to the odd coherent state.
i) Sub-Poissonian behavior
To discuss the sub-Poissonian behavior of the state one needs to find the explicit expression of the Glauber second order correlation function (3.9). For the odd binomial state we find
i,|\ (i-2H2)(M-2)(i-H2)2l ) 1-4-=i
¥
[i + (i •
(3.20)
From the above function we can deduce that $g^{(2)}(0)$ is always less than one as long as both $\eta$ and $M$ are finite. However, if we increase the value of $M$ and decrease the value of $\eta$ at the same time, such that $\eta \to 0$ as $M \to \infty$, then we find that the correlation function tends to $\tanh^2|\alpha|^2$, so that a sub-Poissonian effect does exist for the odd coherent state. This emphasises that the odd binomial state has sub-Poissonian behavior. We may also point out that for large values of $M$ with a fixed value of $\eta$ the function approaches unity more rapidly as $\eta \to 1$ and persists, and the system shows coherent behavior [27].
ii) Squeezing
Since the normal squeezing is based on the definition of the field quadrature operators given by equation (3.11), in this case we find
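$$\langle a^{2s}\rangle = |\lambda_2|^2\left(\frac{\eta^2}{1-|\eta|^2}\right)^{s}\sum_{n=0}^{[(M-1)/2]}\binom{M}{2n+1}\sqrt{\frac{(M-2n-1)!}{(M-2n-2s-1)!}}\;|\eta|^{4n+2}\,(1-|\eta|^2)^{M-2n-1} \tag{3.21}$$

while the expectation value of the photon number is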
$$\langle a^{\dagger}a\rangle = |\eta\lambda_2|^2\,(M/2)\left[1+(1-2|\eta|^2)^{M-1}\right] \tag{3.22}$$

Numerical investigation shows that the odd binomial state does not exhibit squeezing for any values of the parameters $\eta$ and $M$. For amplitude-squared squeezing, however, the situation is different: the squeezing becomes pronounced as $M$ increases, and the maximum point moves toward higher values of $\eta$ (see Ref [27]).

iii) Quasiprobability distribution

Following the same procedure as in the previous subsection we can calculate the quasiprobability distribution functions for the odd binomial state. From equations (2.13,14) and (3.18), the Wigner function takes the form
2
[(M-D/2]
[(M-l)/2]
— —n
m n—n m,ra=0
[——— V ^
'
T7i>n
\B£+1\\BM+1\(2\P\^-^LltT]m2)
cos[2(n - m)(
Similarly we can find the expression for the Q-function in the form 2
[(M-l)/2]
4n+2
(2n+l)l
[(M-l)/2] m ^ 0 m>n
| / o | 2n+2m+2
V(2n+l)!(2m + l)!
$|B^{M}_{2n+1}|\,|B^{M}_{2m+1}|\cos[2(n-m)(\phi+\zeta)]$), (3.24)

where we have taken $\beta = |\beta|e^{i\phi}$ and $\eta = |\eta|e^{i\zeta}$. The last two equations consist of two parts: the first represents the diagonal terms, while the second represents the off-diagonal terms of the density matrix $\rho$. From the numerical study of the odd binomial state we find that when $\eta$ is small ($\sim 0.1$) and $M = 5$, the Wigner function has a hole at its summit, similar to that found for the geometric state to be discussed later. However, as we increase $\eta$ ($\sim 0.6$) keeping $M$ fixed, we find four asymmetric peaks with chaotic behavior, where the interference between the component states results in the selective preservation of the nonclassical effects during the amplification process. Increasing $M$ ($\sim 17$) while keeping $\eta$ at the same value, we find that the four peaks are shifted and the chaotic behavior becomes pronounced. For the Q-function, when $\eta$ is small ($\sim 0.1$) and $M \sim 5$ the function almost represents the case of a Fock state; when we increase $\eta$ the probability of having single photons also increases, and we find four adjacent deformed peaks at the center (see figures (5 and 6) of Ref. [27]).

4. The phased generalized binomial state.

Our purpose in the present section is to introduce a new class of intermediate states. The idea of introducing such a state is not only to widen the range of intermediate states that can be studied, but also to generalize the so-called orthogonal even coherent state given by [28]
complex $\alpha$-plane the vector representing $i\alpha$ is rotated 90 degrees from $\alpha$, and therefore the even coherent state $|i\alpha\rangle_+$ is orthogonal to the state $|\alpha\rangle_+$. The state we shall introduce is called the 'phased generalized binomial state' and is defined by [29]

$$|\chi\rangle = \lambda^{1/2}\left[\,|\eta, M\rangle + e^{i\psi}\,|\eta e^{i\phi}, M\rangle\right], \tag{4.2}$$
where $\lambda$ is the normalization constant, and $\psi$ and
(4.3)
while for the usual binomial state we find that $\lambda$ takes the form

$$\lambda_b = \frac{1}{2}\left\{1+\mathrm{Re}\left[e^{i\psi}\left(1+(e^{i\phi}-1)|\eta|^2\right)^{M}\right]\right\}^{-1} \tag{4.4}$$

In the limiting case $M \to \infty$ and $\eta \to 0$ such that $M|\eta|^2 = |\alpha|^2$, equation (4.4) tends to

$$\lambda = \frac{1}{2}\left[1+\cos(\psi+|\alpha|^2\sin\phi)\,e^{|\alpha|^2(\cos\phi-1)}\right]^{-1} \tag{4.5}$$

It is clear from the definition of $|\chi\rangle$ that when $\psi = 0$ and
$${}_e\langle\eta, M\,|\,i\eta, M\rangle_e = |\lambda_1|^2\sum_{n=0}^{[M/2]}\binom{M}{2n}\,(i|\eta|^2)^{2n}\,(1-|\eta|^2)^{M-2n} = |\lambda_1|^2\,\mathrm{Re}\left[1+(i-1)|\eta|^2\right]^{M}, \tag{4.7}$$

and then the normalization constant $\lambda_e$ becomes

$$\lambda_e = \frac{1}{2}\left\{1+|\lambda_1|^2\cos\psi\,\mathrm{Re}\left[1+(i-1)|\eta|^2\right]^{M}\right\}^{-1} \tag{4.8}$$

where $\lambda_1$ is the normalization constant for the even binomial state given by equation (3.8). As stated above, the even binomial state tends to the even coherent state when $M \to \infty$, $\eta \to 0$ such that $M|\eta|^2 = |\alpha|^2$. In this case equation (4.8) tends to the form

$$\lambda_e = \frac{1}{2}\cosh|\alpha|^2\left[\cosh|\alpha|^2+\cos\psi\cos|\alpha|^2\right]^{-1} \tag{4.9}$$

For $\psi = 0$ equation (4.9) is exactly equation (5) of [28], and the state $|\chi\rangle_e$ reduces to the orthogonal-even coherent state.

4.1.1 Second and fourth order squeezing.

Now let us employ the Hong-Mandel definition [30] to study the second- and fourth-order squeezing. To do so we have to calculate the expectation values of operators of different orders in the state $|\chi\rangle_e$. It is clear that the expectation values of both $a$ and $a^{\dagger}$ vanish. The following expectation values can be easily found
£
(B™)* (B™+2) v /(2n + 2 ) ( 2 n + l ) s i n ( ^ + mr),
(4.10)
n=0
[fl (tfa)e = 2Ae Yj2n\B^n
| 2 [1 + costy- + TITT)].
(4.11)
n=l [M]
2n{2n - 1) | B% | 2 [1 + c o s $ + rwr)]
(4.12)
n=l
The expectation value for a 4 can also be given as rM-4i
{a
>< -
12 i „ |4 0ae ' 2 2
(i-M )
2- IB-I
x y/(M - 2n)(M -2nx [1 + cos(V> + nir)]
1)(M - 2n - 2)(Af - 2n - 3)
(4.13)
w-^fi-n
M |2
2n I
(4.14)
x A / ( M - 2n)(M - 2n - 1) sin(^ + rwr) lB
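where $\eta = |\eta|e^{i\theta}$. Now let us define two quadrature operators for the field, $X_1 = \sqrt{2}\,X$ and $X_2 = \sqrt{2}\,Y$, where $X$ and $Y$ are given by equation (3.11). These quadratures satisfy the commutation relation $[X_1, X_2] = i$ and play the same role as the position and momentum operators $q$ and $p$, where $[q, p] = i$.

For the second-order squeezing we may write the quadrature variances $\langle(\Delta X_1)^2\rangle$ and $\langle(\Delta X_2)^2\rangle$ in terms of $a$, $a^{\dagger}$ as follows:

$$\langle(\Delta X_{1,2})^2\rangle - \frac{1}{2} = \langle a^{\dagger}a\rangle_e \pm \mathrm{Re}\,\langle a^2\rangle_e \tag{4.15}$$

When we differentiate the resulting expression with respect to $\theta$ and set the result equal to zero, we find a necessary condition for $\langle(\Delta X_1)^2\rangle$ (or $\langle(\Delta X_2)^2\rangle$) to be an extremum, namely (for $\sin\psi \neq 0$) $\cos 2\theta = 0$, so that the extreme values $\theta = \pi/4, 3\pi/4$ are independent of $|\eta|$. On the other hand, the extreme values for $\psi$ are $\psi = 0, \pi$. The fourth-order moments of $X_1$ and $X_2$ in the state $|\chi\rangle_e$ are given by

$$\langle(\Delta X_1)^4\rangle = \frac{3}{4}\left[1+2\langle a^{\dagger 2}a^2\rangle+\frac{2}{3}\mathrm{Re}\langle a^4\rangle+4\langle a^{\dagger}a\rangle+4\,\mathrm{Re}\langle a^2\rangle+\frac{8}{3}\mathrm{Re}\langle a^{\dagger}a^3\rangle\right] \tag{4.16}$$

$$\langle(\Delta X_2)^4\rangle = \frac{3}{4}\left[1+2\langle a^{\dagger 2}a^2\rangle+\frac{2}{3}\mathrm{Re}\langle a^4\rangle+4\langle a^{\dagger}a\rangle-4\,\mathrm{Re}\langle a^2\rangle-\frac{8}{3}\mathrm{Re}\langle a^{\dagger}a^3\rangle\right] \tag{4.17}$$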
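where the fourth-order uncertainty relation becomes

$$\langle(\Delta X_1)^4\rangle\,\langle(\Delta X_2)^4\rangle \geq \frac{9}{16} \tag{4.18}$$

To measure fourth-order squeezing in the quadratures $X_1$ and $X_2$ from zero, we can rewrite equations (4.16) and (4.17) in the form

$$Q_1 = \frac{4}{3}\langle(\Delta X_1)^4\rangle - 1, \qquad Q_2 = \frac{4}{3}\langle(\Delta X_2)^4\rangle - 1 \tag{4.19a}$$

Thus we may conclude that $X_1$ and $X_2$ will exhibit fourth-order squeezing whenever

$$Q_1 < 0 \quad\text{or}\quad Q_2 < 0 \tag{4.19b}$$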
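The fourth-order squeezing parameters of (4.19a) can be evaluated for any pure state given by its number-state amplitudes. The following is a minimal sketch, not the authors' procedure: it assumes $\langle a\rangle = 0$ (as holds for the even and odd states discussed here), uses the normally ordered expansion of $(a \pm a^{\dagger})^4$, and the helper names `moment` and `fourth_order_q` are hypothetical.

```python
from math import sqrt

def moment(c, k_dag, k):
    """<a^dagger^k_dag a^k> for the state sum_n c[n] |n> (c: list of amplitudes)."""
    total = 0j
    for n in range(k, len(c)):
        m = n - k + k_dag              # a^k lowers |n> to |n-k>, a^dagger^k_dag raises to |m>
        if m >= len(c):
            continue                   # amplitude beyond the list is zero
        coef = 1.0
        for j in range(k):
            coef *= sqrt(n - j)        # factors picked up by a^k
        for j in range(1, k_dag + 1):
            coef *= sqrt(n - k + j)    # factors picked up by a^dagger^k_dag
        total += complex(c[m]).conjugate() * c[n] * coef
    return total

def fourth_order_q(c):
    """Return (Q1, Q2) = (4/3)<X^4> - 1 for X1, X2, assuming <a> = 0."""
    n1 = moment(c, 1, 1).real
    n2 = moment(c, 2, 2).real
    a2 = moment(c, 0, 2).real
    a4 = moment(c, 0, 4).real
    a1a3 = moment(c, 1, 3).real
    even_part = 1 + 2 * n2 + (2 / 3) * a4 + 4 * n1
    odd_part = 4 * a2 + (8 / 3) * a1a3
    x1_4 = 0.75 * (even_part + odd_part)
    x2_4 = 0.75 * (even_part - odd_part)
    return (4 / 3) * x1_4 - 1, (4 / 3) * x2_4 - 1

if __name__ == "__main__":
    print(fourth_order_q([1, 0, 0, 0, 0]))        # vacuum
    print(fourth_order_q([sqrt(0.96), 0, -0.2]))  # illustrative |0>,|2> superposition
```

For the vacuum both parameters vanish, which is the reference level against which squeezing is measured; the illustrative superposition of $|0\rangle$ and $|2\rangle$ is an arbitrary example chosen to give a negative $Q_1$.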
It can be proved from the calculation that the extrema of the fourth-order squeezing coincide with those of the second-order squeezing for both $\theta$ and $\psi$. To discuss the second- and fourth-order moments of squeezing in the phased orthogonal-even coherent state, we have to take the limit of equations (4.15-17). After long and tedious calculations one finds the following expressions:

$$\langle(\Delta X_1)^2\rangle = \frac{1}{2}+\frac{|\alpha|^2\left[\sinh|\alpha|^2-\sin|\alpha|^2\cos\psi\right]}{\cosh|\alpha|^2+\cos\psi\cos|\alpha|^2}+\frac{|\alpha|^2\cos|\alpha|^2\sin\psi\sin 2\theta}{\cosh|\alpha|^2+\cos\psi\cos|\alpha|^2} \tag{4.20}$$

$$\langle(\Delta X_1)^4\rangle = \frac{3}{4}\Big\{1+4\left[\cosh|\alpha|^2+\cos\psi\cos|\alpha|^2\right]^{-1}\Big[|\alpha|^4\Big(C\cosh|\alpha|^2-D\cos\psi\cos|\alpha|^2-\frac{2}{3}\sin|\alpha|^2\sin 2\theta\sin\psi\Big)+|\alpha|^2\Big(\sinh|\alpha|^2-\cos\psi\sin|\alpha|^2+\cos|\alpha|^2\sin\psi\sin 2\theta\Big)\Big]\Big\} \tag{4.21}$$

with

$$C = \frac{1}{2}\left(1+\frac{1}{3}\cos 4\theta\right) \quad\text{and}\quad D = \frac{1}{2}\left(1-\frac{1}{3}\cos 4\theta\right) \tag{4.22}$$

and similar expressions for $\langle(\Delta X_2)^2\rangle$ and $\langle(\Delta X_2)^4\rangle$. From these expressions one can easily see that there is simultaneous fourth-order squeezing in the quadrature components for both the phased orthogonal even binomial state and the phased orthogonal even coherent state (see figure (12) of Ref [28]).

4.1.2 The distribution functions.

Here we use the quasiprobability phase-space distributions to examine the phased generalized binomial states given by equation (4.2). By using equations (2.12) we can obtain an explicit form of the quasiprobability distribution function for different forms of phased orthogonal states. Thus for the density matrix $\rho$ generated by equation (4.2) we obtain the following expression for the quasiprobability function
l{0,s) = —
, . 1-s
exp(
M2 i- w
(4.23)
'M\(M\ti!_(l±s_\n J \mj m! \ 1 — s,
(jn—n)
( \r)\2 1 - |,|=
1-s
4
p p r(m-n) f IffI . 71 -i TTl J J n
where P n P m = 4[cos[(0 - S - | ) ( m - n)] c o s ( ^ ± ^ ) c o s ( ^ ± ^ )
(4.23a)
The function $L_n^{(m-n)}(y)$ in equation (4.23) is the associated Laguerre polynomial defined after (2.15). Note that in our calculations we have taken $\eta = |\eta|e^{i\theta}$ and $\beta = |\beta|e^{i\delta}$, where $\theta$ and $\delta$ are the phases of $\eta$ and $\beta$ respectively. The Wigner function of the phased orthogonal even binomial state is
W(P) = ^ ( 1
exp(-2|/3| 2 )
7T
[M/2]
P| n L 2n (4|/3| 2 ) + 2 /
I 12
\
(iH^J
m
£
£
M In I VI
2n! (M\ (M 2m! \ 2 n / \ 2 m
)(i
(4.24)
+n
(4|/3l2)(m"n)-p2nP2m^lm-n)(4|/3|2
and
PlnPin
4cos[(2# — 25 — 4>)(m — n)] cos ( — + n<j) I cos ( — + m(f> (4.24a)
While the Q-function is
QW)
i(l-|,|Vexp(-|, (\P\2)
£j
\2n)\l-\T,pJ
[M/2]
(m+n)
(4.25)
|0|2(M+n) \/2n!2m!
-P271-P21
Finally we would like to mention that, as a result of the non-classical character of the present state, the P-representation function is highly singular. Therefore we shall not consider the P-representation. We now make use of the Wigner function (4.24) to obtain the probability distribution function $P(x)$ by integrating $W(\beta)$, with $\beta = x + iy$, over the imaginary variable $y$:

$$P(x) = \int_{-\infty}^{\infty} W(x+iy)\,dy \tag{4.26}$$
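The marginal integral (4.26) is easy to check numerically on states whose Wigner function is known in closed form. The sketch below is an illustration only: it uses the vacuum and the single-photon Fock state in the normalization used in this chapter, $W_n(\beta) = \frac{2}{\pi}(-1)^n L_n(4|\beta|^2)e^{-2|\beta|^2}$, and a plain trapezoidal rule; the function names and the integration cut-off are assumptions.

```python
from math import exp, pi, sqrt

def wigner_vacuum(x, y):
    """W(beta) for |0>, with beta = x + i y."""
    return (2 / pi) * exp(-2 * (x * x + y * y))

def wigner_fock1(x, y):
    """W(beta) for |1>: (2/pi) * (4|beta|^2 - 1) * exp(-2|beta|^2)."""
    r2 = x * x + y * y
    return (2 / pi) * (4 * r2 - 1) * exp(-2 * r2)

def marginal(wigner, x, y_max=6.0, steps=4000):
    """Trapezoidal estimate of P(x) = integral of W(x + i y) dy over [-y_max, y_max]."""
    h = 2 * y_max / steps
    total = 0.5 * (wigner(x, -y_max) + wigner(x, y_max))
    for k in range(1, steps):
        total += wigner(x, -y_max + k * h)
    return total * h

if __name__ == "__main__":
    x = 0.5
    print(marginal(wigner_vacuum, x), sqrt(2 / pi) * exp(-2 * x * x))
    print(marginal(wigner_fock1, x), sqrt(2 / pi) * 4 * x * x * exp(-2 * x * x))
```

The reference values are $\sqrt{2/\pi}\,e^{-2x^2}$ for the vacuum and $\sqrt{2/\pi}\,4x^2e^{-2x^2}$ for $|1\rangle$, i.e. the squared oscillator wavefunctions in the variable $\sqrt{2}x$.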
Substituting equation (4.24) into equation (4.26), we get
P{x)=yJlAe(l- N 2 ) M exp(-2x 2 ) [M/2] m,n~Q
~[^](M\(
\2np2HUV2x)
\n?
BtjWUl-|f)J
M M J( )( )( M \ y \2n) \2m) \2(\ -\r,\')J 2
m+n
pl p, f2nl m
-'
2n
2n!
H2m(V2x)H2n(V2x)~ V2n\2m\ \ (4.27)
where $H_m(z)$ is the Hermite polynomial of order $m$:

$$H_m(z) = \sum_{r=0}^{[m/2]}\frac{(-1)^r\,m!\,(2z)^{m-2r}}{r!\,(m-2r)!} \tag{4.27a}$$
and PLPL
= 4cos[(20 + w )(m - n)] cos (t + n l \ cos ft + m l \
(4.27b)
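The explicit sum for the Hermite polynomial in (4.27a) can be cross-checked against the standard three-term recurrence $H_{k+1}(z) = 2zH_k(z) - 2kH_{k-1}(z)$. A minimal sketch (function names are illustrative):

```python
from math import factorial

def hermite_sum(m, z):
    """H_m(z) from the explicit finite sum over r = 0 .. [m/2]."""
    return sum(
        (-1)**r * factorial(m) * (2 * z)**(m - 2 * r)
        / (factorial(r) * factorial(m - 2 * r))
        for r in range(m // 2 + 1)
    )

def hermite_rec(m, z):
    """H_m(z) from the recurrence H_{k+1} = 2 z H_k - 2 k H_{k-1}."""
    if m == 0:
        return 1.0
    h_prev, h = 1.0, 2 * z
    for k in range(1, m):
        h_prev, h = h, 2 * z * h - 2 * k * h_prev
    return h

if __name__ == "__main__":
    for m in range(6):
        print(m, hermite_sum(m, 0.5), hermite_rec(m, 0.5))
```

For example $H_3(z) = 8z^3 - 12z$, so both functions should agree with that polynomial at any test point.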
The Wigner function $W(\beta)$ has been investigated for different values of $M$, $\eta$ and $\psi$. For small values of $\eta$ ($\sim 0.1$), we find that the vacuum state is the dominant contributing one for $\psi = 0$ and 90. For $\psi = 180$, where the Fock state $|2\rangle$ is the only existing state when $M = 4$, the figures for $W(\beta)$ represent this state clearly whatever the value of $\eta$. For intermediate $\eta$ ($\sim 0.5$) we find that for $\psi = 0$ and $M = 4$ the Gaussian is deformed slightly, with the appearance of shoulders. This is due to the slight admixture of the state $|4\rangle$ to the vacuum in this case, while for $\psi = 90$ and $M = 4$ the figure is asymmetric: the effect of the state $|2\rangle$ deforms the Gaussian of the earlier case (see fig 2a of [29]). The nonclassical effect is apparent. Increasing $M$ adds more states to be considered. The central peak is surrounded by wobbling circles, and changing $\psi$ changes the symmetry (the case of $M = 16$ and $\psi = 180$ is presented in Fig. 2b of Ref [29]). When $\eta$ is increased (to $\sim 0.9$) the effect of the Fock states with higher excitations is pronounced, especially when $M$ takes larger values. Increasing the value of $M$ shows a breaking up of the outer engulfing circles. Changes in $\psi$ result in asymmetry in the figures, especially for intermediate values of $\psi$. Therefore the shape of the quasiprobabilities is very sensitive to the choice of the phase $\psi$. Investigations of the formula (4.25) for the Q-function reveal that for small values of $\eta$ we find the Gaussian form that characterizes the vacuum state for all values of $M$ and $\psi = 0$ or 90. However, when we take $\psi = 180$ and $M = 4$, the case of the state $|2\rangle$ results, and the figure for the Q-function is representative of this case whatever the value of $\eta$ (see for example Fig. 4d of [4], which represents the Fock state $|5\rangle$). For intermediate values of $\eta$ ($\sim 0.5$), one finds a displacement towards the centre ($\alpha = 0$), which is a sign of squeezing; this effect is pronounced for $\psi = 90$ and $M \sim 4$. However, when $M$ is increased one finds a split into four summits; the choice of $\psi$ makes the interference between the summits greater ($\psi = 0$) or the split clearer ($\psi = 90$ or 180).
As $\eta$ increases ($\sim 0.9$) one finds that the heights of these summits depend on the choice of $\psi$ (see Figs. 3 of Ref [29]).

4.1.3 Phase properties.

The notion of phase in quantum optics has attracted renewed interest because of the existence of phase-dependent quantum noise. New measurements [32] in this field have opened the way to a deeper understanding of the quantum nature of the phase. There are many different approaches [33] to this problem. To calculate the phase properties of the even and odd phased orthogonal binomial states we adopt the Pegg-Barnett formalism [34]. In this approach, on the $(s+1)$-dimensional subspace, one chooses as a basis the $(s+1)$ orthonormal phase states defined by

$$|\theta_m\rangle = \frac{1}{\sqrt{s+1}}\sum_{n=0}^{s}\exp(in\theta_m)\,|n\rangle \tag{4.28}$$

where

$$\theta_m = \theta_0+\frac{2\pi m}{s+1}, \qquad m = 0, 1, 2, \ldots, s \tag{4.28a}$$

The phase operator is then defined as

$$\hat{\phi}_\theta = \sum_{m=0}^{s}\theta_m\,|\theta_m\rangle\langle\theta_m|, \tag{4.29}$$

which has the state $|\theta_m\rangle$ as its eigenstate with the eigenvalue $\theta_m$. The probability distribution for any state $|\psi\rangle$ is given by

$$P(\theta_m) = |\langle\theta_m|\psi\rangle|^2 \tag{4.30}$$
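The discrete Pegg-Barnett distribution (4.30) is a finite Fourier sum and can be computed directly. The sketch below is illustrative: it takes a list of $s+1$ number-state amplitudes, builds the phase angles of (4.28a), and returns the probabilities; the function name and argument layout are assumptions.

```python
import cmath
from math import pi

def phase_distribution(c, theta0=0.0):
    """Return (thetas, probs) with probs[m] = |<theta_m|psi>|^2 for the state
    sum_n c[n] |n> on the (s+1)-dimensional space, s + 1 = len(c)."""
    dim = len(c)
    thetas = [theta0 + 2 * pi * m / dim for m in range(dim)]
    probs = []
    for th in thetas:
        # <theta_m|n> = exp(-i n theta_m) / sqrt(s + 1)
        amp = sum(cmath.exp(-1j * n * th) * c[n] for n in range(dim)) / dim**0.5
        probs.append(abs(amp)**2)
    return thetas, probs

if __name__ == "__main__":
    # A number state has a flat phase distribution.
    print(phase_distribution([0, 0, 1, 0, 0, 0])[1])
    # A phase state |theta_2> is concentrated on m = 2.
    dim = 6
    th2 = 2 * pi * 2 / dim
    state = [cmath.exp(1j * n * th2) / dim**0.5 for n in range(dim)]
    print(phase_distribution(state)[1])
```

Because the phase states form an orthonormal basis, the probabilities sum to one for any normalized input.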
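This can be used to compute various moments, after which the limit $s \to \infty$ is taken. The continuous phase distribution function $P(\theta)$ is introduced by

$$P(\theta) = \lim_{s\to\infty}\frac{s+1}{2\pi}\,|\langle\theta_m|\psi\rangle|^2 \tag{4.31}$$

For any state of the form $|\psi\rangle = \sum_n c_n|n\rangle$, we find that $P(\theta)$ takes the form

$$P(\theta) = \frac{1}{2\pi}\left\{1+2\,\mathrm{Re}\sum_{n>m}c_n^{*}\,c_m\exp[i(n-m)\theta]\right\} \tag{4.32}$$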
From this phase probability function the moments can be calculated. When the phase reference angle $\theta_0$ is put equal to zero, we find that $\langle\theta\rangle = 0$
(o4 + 2E-
(4 33)
-
When the function $P(\theta)$ of equation (4.32) is plotted, we note the breaking up of the probability distribution function for the case $\psi = 0$. This is due solely to the presence of the states $|4n\rangle$ in the state, and hence we get a four-fold degeneracy. This is in line with the case of the even states, where the probability is split into two only (see Ref [29]).

4.2 Phased orthogonal odd binomial state.

We now introduce and investigate the phased orthogonal odd binomial state $|\chi\rangle_o$, which is obtained by replacing $|\eta, M\rangle$ by $|\eta, M\rangle_o$ in the expression (4.2) and taking $\phi = \pi/2$, $\psi \to \psi+\pi/2$. Hence the state $|\chi\rangle_o$ is

$$|\chi\rangle_o = \lambda_o^{1/2}\left[\,|\eta, M\rangle_o + e^{i(\psi+\pi/2)}\,|i\eta, M\rangle_o\right] \tag{4.34}$$

The normalization constant $\lambda_o$ is given by

$$\lambda_o = \frac{1}{2}\left\{1-|\lambda_2|^2\cos\psi\,\mathrm{Im}\left[1+(i-1)|\eta|^2\right]^{M}\right\}^{-1} \tag{4.34a}$$

where $\lambda_2$ is the normalization constant for the odd binomial state defined in equation (3.19).
4.2.1 Second and fourth order squeezing.

A discussion similar to that of the previous subsection can be given. The extreme values for the present case are identical with those obtained for the phased orthogonal even binomial state; this is due to the fact that the phase $\psi$ of equation (4.6) is taken equal to $\psi+\pi/2$. Numerical investigations reveal that the optimal choice of $\theta$ and $\psi$ does not introduce fourth-order squeezing. This is in contrast to the phased orthogonal even binomial state, which has simultaneous fourth-order squeezing. For the phased orthogonal odd coherent state, the normalization constant $\lambda_o$ of equation (4.34a) reduces to

$$\lambda_o = \frac{1}{2}\sinh|\alpha|^2\left[\sinh|\alpha|^2-\cos\psi\sin|\alpha|^2\right]^{-1} \tag{4.35}$$

The second- and fourth-order moments of squeezing are given respectively by

$$\langle(\Delta X_1)^2\rangle = \frac{1}{2}+\frac{|\alpha|^2\left[\cosh|\alpha|^2-\cos|\alpha|^2\cos\psi\right]}{\sinh|\alpha|^2-\cos\psi\sin|\alpha|^2}-\frac{|\alpha|^2\sin|\alpha|^2\sin\psi\sin 2\theta}{\sinh|\alpha|^2-\cos\psi\sin|\alpha|^2} \tag{4.36a}$$

$$\langle(\Delta X_1)^4\rangle = \frac{3}{4}\Big\{1+4\left[\sinh|\alpha|^2-\cos\psi\sin|\alpha|^2\right]^{-1}\Big[|\alpha|^4\Big(C\sinh|\alpha|^2+D\cos\psi\sin|\alpha|^2-\frac{2}{3}\cos|\alpha|^2\sin 2\theta\sin\psi\Big)+|\alpha|^2\Big(\cosh|\alpha|^2-\cos\psi\cos|\alpha|^2-\sin|\alpha|^2\sin\psi\sin 2\theta\Big)\Big]\Big\} \tag{4.36b}$$

and similar expressions for $\langle(\Delta X_2)^2\rangle$ and $\langle(\Delta X_2)^4\rangle$, where the constants $C$ and $D$ are given by equation (4.22). Comparing equations (4.36) with Eqs. (4.20) and (4.21), we conclude that the behaviour with respect to $\psi$ and $\theta$ is identical in both cases, provided we use $(\psi+\pi/2)$ instead of $\psi$.

5 Generalized geometric state.

As we have stated earlier, the geometric state describes the gradual behavior of some quantum optical systems in which the state of the field changes from the pure number (Fock) state to the non-pure chaotic state [11,35,37]. This means a field state that interpolates between the number state $|n\rangle$ and the chaotic state with density operator given by equation (1.2).
5.1 Definition.

We define the normalized generalized two-parameter state $|Y, M\rangle$ as follows:

$$|Y, M\rangle = \lambda_0\sum_{n=0}^{M}Y^{n/2}\,|n\rangle \tag{5.1}$$

where $Y$ is a complex parameter whose phase is random in general, and the normalization constant $\lambda_0$ is

$$|\lambda_0|^2 = \frac{1-|Y|}{1-|Y|^{M+1}} \tag{5.1a}$$
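The limiting cases of the definition in equation (5.1) are as follows.

(a) Chaotic state. For $|Y|\left(=\frac{\bar{n}}{1+\bar{n}}\right) < 1$ and $M \to \infty$, the density operator in this case is

$$\rho_{Y,\infty} = \lim_{M\to\infty}|Y, M\rangle\langle Y, M| = \lim_{M\to\infty}|\lambda_0|^2\sum_{n,n'=0}^{M}Y^{n/2}\,Y^{*\,n'/2}\,|n\rangle\langle n'| \tag{5.2a}$$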
If $Y = |Y|e^{2i\psi}$ and $\psi$ is a random phase, then the average over $\psi$ gives

$$\frac{1}{2\pi}\int_0^{2\pi}\rho_{Y,\infty}\,d\psi = \frac{1}{1+\bar{n}}\sum_{n=0}^{\infty}\left(\frac{\bar{n}}{1+\bar{n}}\right)^{n}|n\rangle\langle n| \tag{5.2b}$$

This is identical with the single-mode chaotic state with mean photon number $\bar{n}$ of equation (1.2).

(b) The number state. For $|Y| \to \infty$ and $M$ finite, equation (5.1) reduces to the number state $|M\rangle$.

(c) The vacuum state $|0\rangle$. This is obtained either by taking the limit $|Y| \to 0$ or, equivalently, by taking $M = 0$.

(d) The phase state $|\theta_m\rangle$ (see equation (4.28)) of Pegg and Barnett [34]. This is obtained for $Y^{1/2} = e^{i\theta_m}$, $|Y| = 1$ and $s = M$. It is the partially coherent phase state, and when $s \to \infty$ the coherent phase state [33] results.

5.2 Properties.

The mean value of the $m$th moment of the photon number operator in the generalized geometric state is given by

$$\langle\hat{n}^m\rangle = |\lambda_0|^2\sum_{n=0}^{M}n^m\,|Y|^n \tag{5.3}$$
In particular, for $m = 1, 2$ we have

$$\langle\hat{n}\rangle = |\lambda_0|^2\sum_{n=0}^{M}n|Y|^n = |Y|\,(1-|Y|)^{-1}(1-|Y|^{M+1})^{-1}\left[1-(M+1)|Y|^{M}+M|Y|^{M+1}\right] \tag{5.3a}$$

and

$$\langle\hat{n}^2\rangle = |\lambda_0|^2\sum_{n=0}^{M}n^2|Y|^n = \frac{|Y|(1+|Y|)-(M+1)^2|Y|^{M+1}+(2M^2+2M-1)|Y|^{M+2}-M^2|Y|^{M+3}}{(1-|Y|)^2\,(1-|Y|^{M+1})} \tag{5.3b}$$

Note that from (5.3a) and (5.3b) we have in the chaotic-state limit $\langle\hat{n}\rangle \to \bar{n}$ and $\langle\hat{n}^2\rangle \to \bar{n}(2\bar{n}+1)$, and in the number-state $|M\rangle$ limit $\langle\hat{n}\rangle \to M$ and $\langle\hat{n}^2\rangle \to M^2$. To calculate the normalized second-order correlation function one can use equation (3.9). In this case we have the expression

$$g^{(2)}(0) = (1-|Y|^{M+1})\left[1+M|Y|^{M+1}-(1+M)|Y|^{M}\right]^{-2}\left[2-M(M+1)|Y|^{M-1}+2(M^2-1)|Y|^{M}-M(M-1)|Y|^{M+1}\right] \tag{5.4}$$
which goes to 2 for the chaotic state and to $(1-\frac{1}{M})$ for the number state $|M\rangle$. For the special case $M = 1$, $g^{(2)}(0) = 0$, an expected result since the expansion of the state $|Y, 1\rangle$ does not contain the photon number state $|2\rangle$. Figures (1a and b) of Ref [11] show the behavior of $g^{(2)}(0)$ against $|Y| < 1$. For $M = 2$, $0.69 < g^{(2)}(0) < 1.94$. In the range $0 < |Y| < 0.36$ there is a partially coherent property, $g^{(2)}(0) > 1$. For $0.36 < |Y| < 0.9$ the antibunching effect ($g^{(2)}(0) < 1$) is clear, but it is weaker than for the photon number state [$g^{(2)}(0) = \frac{1}{2}$ for the state $|2\rangle$]. For higher values such as $M = 10$, the chaotic behavior [$g^{(2)}(0) = 2$] is exhibited for $|Y| < 0.3$. In fact as $M \to 100$, $g^{(2)}(0) = 2$ for the whole range $0 < |Y| < 0.95$. The ratio of the variance of the photon number $\langle(\Delta\hat{n})^2\rangle$ to the mean photon number (the Fano factor) is defined as

$$F = \frac{\langle(\Delta\hat{n})^2\rangle}{\langle\hat{n}\rangle} = \frac{\langle\hat{n}^2\rangle-\langle\hat{n}\rangle^2}{\langle\hat{n}\rangle} \tag{5.6}$$
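The photon statistics of the generalized geometric state follow from the truncated geometric distribution $P(n) = |\lambda_0|^2|Y|^n$, $n = 0, \ldots, M$. The sketch below is an illustration with hypothetical function names: it computes the mean, $g^{(2)}(0)$ and the Fano factor by direct summation and compares $g^{(2)}(0)$ with the closed form (5.4). The argument `y` stands for $|Y|$ and is assumed positive and different from 1 in the closed form.

```python
def geometric_probs(y, M):
    """P(n) proportional to y**n for n = 0..M, normalized; y stands for |Y| > 0."""
    w = [y**n for n in range(M + 1)]
    norm = sum(w)
    return [v / norm for v in w]

def mean_g2_fano(probs):
    """Return (<n>, g2(0), Fano factor) for a photon-number distribution."""
    mean = sum(n * P for n, P in enumerate(probs))
    second = sum(n * n * P for n, P in enumerate(probs))
    g2 = (second - mean) / mean**2
    fano = (second - mean**2) / mean
    return mean, g2, fano

def g2_closed_form(y, M):
    """Closed form of g2(0) obtained by summing the truncated geometric series."""
    num = 2 - M * (M + 1) * y**(M - 1) + 2 * (M * M - 1) * y**M - M * (M - 1) * y**(M + 1)
    den = (1 - (M + 1) * y**M + M * y**(M + 1))**2
    return (1 - y**(M + 1)) * num / den

if __name__ == "__main__":
    for M, y in [(1, 0.5), (2, 0.3), (2, 0.5), (400, 0.5)]:
        print(M, y, mean_g2_fano(geometric_probs(y, M)), g2_closed_form(y, M))
```

For $M = 2$ the direct sum gives $g^{(2)}(0) = 2(1+y+y^2)/(1+2y)^2$, which crosses unity near $y \approx 0.366$, in line with the ranges quoted in the text.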
(For a number state, $F = 0$.) In the special case of $M = 1$, equations (5.3a) and (5.3b) give $F = (1+|Y|)^{-1}$.
The generalized geometric distribution shows sub-Poissonian behavior. For $M = 2$ the sub-Poissonian behavior appears in the range $0.37 < |Y| < 0.9$. The cases $M = 10, 100$ indicate the chaotic character of the state ($F > 1$) (Fig. 2 of Ref [11]).

5.2.1 Squeezing.

We now examine the squeezing properties of the generalized geometric state. To do so we need the variances of both quadratures $X_1 = \sqrt{2}\,X$ and $X_2 = \sqrt{2}\,Y$, defined as after equation (4.14). The variance $(\Delta X_1)^2$ takes the form

$$2(\Delta X_1)^2 = \langle a^2+a^{\dagger 2}\rangle+2\langle\hat{n}\rangle+1-\langle a+a^{\dagger}\rangle^2 \tag{5.7a}$$

Similarly for $X_2$ we have

$$2(\Delta X_2)^2 = -\langle a^2+a^{\dagger 2}\rangle+2\langle\hat{n}\rangle+1+\langle a-a^{\dagger}\rangle^2 \tag{5.7b}$$
On the other hand we find

$$\langle a^2\rangle = |\lambda_0|^2\,(Y^{*})^{-1}\sum_{n=2}^{M}|Y|^n\sqrt{n(n-1)} = \langle a^{\dagger 2}\rangle^{*} \tag{5.8a}$$

where $Y = |Y|e^{i\phi}$ ($\phi = 2\psi$). Similarly we can show that

$$\langle a\rangle = |\lambda_0|^2\,|Y|^{-1/2}e^{i\phi/2}\sum_{n=1}^{M}|Y|^n\sqrt{n} = \langle a^{\dagger}\rangle^{*} \tag{5.8b}$$

The numerical results for the variance expressions

$$S_1 = 2(\Delta X_1)^2-1, \qquad S_2 = 2(\Delta X_2)^2-1 \tag{5.9}$$

where $S_{1,2} < 0$ signifies squeezing, are presented in figs. (3) of Ref [11]. In the case $M = 1$ the component $S_1$ shows squeezing for $\phi = 0$ up to $|Y| \sim 0.98$, but for
5.3 Quasiprobability distribution function.

The quasiprobability distribution is considered here. These functions can be found from the characteristic function of equation (2.10). If we take the parameter $s = 1$, corresponding to the normally ordered characteristic function, and the density matrix $\rho = |Y, M\rangle\langle Y, M|$, where $|Y, M\rangle$ is the state given by equation (5.1), then after some calculations we have
n ,n—0
where $L_n^{(\gamma)}(z)$ is the associated Laguerre polynomial given earlier after equation (2.14). Although the P representation affords a convenient way of evaluating the ensemble averages of normally ordered operators, the function is highly singular as a result of the nonclassical character of the state (5.1). Therefore we shall consider the Wigner function, which can be calculated using equation (2.13). In this case we have

$$W(\beta) = \frac{2}{\pi}|\lambda_0|^2e^{-2|\beta|^2}\sum_{n,n'=0}^{M}(-1)^n\sqrt{\frac{n!}{n'!}}\,(2\beta)^{(n'-n)}\,Y^{n/2}\,Y^{*\,n'/2}\,L_n^{(n'-n)}(4|\beta|^2) \tag{5.11}$$

where we used the generating function [36]

$$(1+k)^{\gamma}\exp[-kz] = \sum_{n=0}^{\infty}k^nL_n^{(\gamma-n)}(z) \tag{5.11a}$$

As a special case, if we average over the phase $\psi$ then equation (5.11) reduces to

$$W(\beta) = \frac{2}{\pi}|\lambda_0|^2e^{-2|\beta|^2}\sum_{n=0}^{M}(-1)^n|Y|^nL_n(4|\beta|^2) \tag{5.12}$$

In the chaotic state, where $M \to \infty$ and $|Y| < 1$, we get

$$[W(\beta)]_{ch} = \frac{1}{\pi}(\bar{n}+\tfrac{1}{2})^{-1}\exp\left[-(\bar{n}+\tfrac{1}{2})^{-1}|\beta|^2\right] \tag{5.12a}$$
In Ref [11] figures (11-14) show the Wigner function $W(\beta)$ as a function of $\beta = \mathrm{Re}(\beta)+i\,\mathrm{Im}(\beta)$ for different values of $M$ and $|Y|$. From equation (5.12) it is clear that $W(\beta)$ is a symmetric function of both $\mathrm{Re}(\beta)$ and $\mathrm{Im}(\beta)$. For $|Y| \sim 0.1$, $W(\beta)$ is insensitive to $M$ and its Gaussian-like form has its peak at $\mathrm{Re}(\beta) = \mathrm{Im}(\beta) = 0$. As $|Y|$ increases, for $M = 1, 2$, $W(\beta)$ exhibits a hole at its center, and eventually $W(\beta)$ behaves as a Gaussian similar to that of the chaotic state. In all cases $W(\beta)$ is positive for $|Y| < 1$. For $|Y| > 1$ and for increasing $M$, the Laguerre polynomial term in equation (5.12) becomes more effective, and hence $W(\beta)$ becomes negative around its center. In fact, in the limit $|Y| \gg 1$ and for fixed $M$, equation (5.12) reduces to that for a number (Fock) state $|M\rangle$, namely

$$[W(\beta)]_{Fock} = \frac{2}{\pi}(-1)^{M}L_M(4|\beta|^2)\exp[-2|\beta|^2] \tag{5.12b}$$

which tends to zero for $|\beta| \gg 1$. Finally we calculate the Q function, which can be used to express the ensemble averages of antinormally ordered operators as a simple integral. Carrying out the integration in equation (2.10) with $s = -1$ yields

$$Q(\beta) = \frac{1}{\pi}\,\frac{1-|Y|}{1-|Y|^{M+1}}\,\exp[-|\beta|^2]\sum_{n=0}^{M}\frac{(|\beta|^2|Y|)^n}{n!} \tag{5.13}$$

When $M$ tends to infinity and $|Y| < 1$, we have the corresponding formula for the Q function in the chaotic state,

$$[Q(\beta)]_{ch} = \frac{1}{\pi}(\bar{n}+1)^{-1}\exp\left[-(\bar{n}+1)^{-1}|\beta|^2\right] \tag{5.13a}$$
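The two limits of the phase-averaged Wigner function (5.12) quoted above, the chaotic Gaussian for $M \to \infty$, $|Y| < 1$ and the Fock-state form for $|Y| \gg 1$, can be checked numerically. The sketch below is an illustration under the stated conventions ($|Y| = \bar{n}/(1+\bar{n})$, Laguerre polynomials from the standard recurrence); the function names are hypothetical and `y` stands for $|Y| \neq 1$.

```python
from math import exp, pi

def laguerre(n, x):
    """Ordinary Laguerre polynomial L_n(x) via (k+1) L_{k+1} = (2k+1-x) L_k - k L_{k-1}."""
    if n == 0:
        return 1.0
    l_prev, l = 1.0, 1.0 - x
    for k in range(1, n):
        l_prev, l = l, ((2 * k + 1 - x) * l - k * l_prev) / (k + 1)
    return l

def wigner_phase_averaged(beta_abs, y, M):
    """Phase-averaged Wigner function of the generalized geometric state."""
    lam2 = (1 - y) / (1 - y**(M + 1))
    x = 4 * beta_abs**2
    s = sum((-y)**n * laguerre(n, x) for n in range(M + 1))
    return (2 / pi) * lam2 * exp(-2 * beta_abs**2) * s

def wigner_chaotic(beta_abs, nbar):
    """Gaussian Wigner function of a chaotic field with mean photon number nbar."""
    return exp(-beta_abs**2 / (nbar + 0.5)) / (pi * (nbar + 0.5))

if __name__ == "__main__":
    nbar = 1.0
    y = nbar / (1 + nbar)
    print(wigner_phase_averaged(0.7, y, 80), wigner_chaotic(0.7, nbar))
    # |Y| >> 1 with M = 1: the value at the origin approaches -2/pi (Fock state |1>).
    print(wigner_phase_averaged(0.0, 1e6, 1), -2 / pi)
```

The negative value at the origin in the second check illustrates the statement that $W(\beta)$ becomes negative around its center once $|Y|$ exceeds unity.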
Now let us discuss the probability distribution function $P(x)$ associated with the quadrature $x$ for the generalized geometric state. To do so we use equation (5.11) to calculate the integral given by equation (4.26). With the aid of equation (5.11a) we obtain the following result

$$P(x) = \left(\frac{2}{\pi}\right)^{1/2}|\lambda_0|^2e^{-2x^2}\sum_{n,n'=0}^{M}(n'!\,n!)^{-1/2}\,2^{-(n+n')/2}\,Y^{n/2}\,Y^{*\,n'/2}\,H_{n'}(\sqrt{2}x)\,H_n(\sqrt{2}x) \tag{5.14}$$

The function $H_n(x)$ is the Hermite polynomial of equation (4.27a). The behavior of $P(x)$ is discussed in Ref [35] for the average state [i.e., keeping the terms with $n' = n$ in the summation in equation (5.14)]. For very small $|Y|$ ($\sim 0.1$), $P(x)$ is insensitive to $M$ (as is the Wigner function $W(\beta)$) and is of Gaussian-like form. For increasing $|Y| \sim 0.8$, the hole exhibited in $W(\beta)$ for $M = 1$ is reflected in the non-monotonic decay of $P(x)$. The emergence of a peak at the center of $W(\beta)$ for $M = 2$ results in the monotonic decay of $P(x)$. For $M \gg 1$, the Gaussian behavior of $P(x)$ is reached, as expected for the chaotic field. For $|Y| > 1$, the negative values of $W(\beta)$ at its center, due to the Laguerre polynomials, result in the relatively reduced initial value $P(x = 0)$. As $M$ increases, $P(x)$ oscillates with a growing envelope over some range of $x$ before it vanishes. For greater $|Y|$, the behavior of $P(x)$ coincides with that for a Fock state $|M\rangle$, namely

$$P_{Fock}(x) = \left(\frac{2}{\pi}\right)^{1/2}(2^{M}M!)^{-1}H_M^2(\sqrt{2}x)\exp(-2x^2) \tag{5.14a}$$
In the non-average case (i.e., for fixed phase $\psi$) the matter is also discussed in Ref [35]. For $|Y| = 0.8$ and $\psi = 0$, it is shown that $P(x)$ has multiple peaks with increasing $M$. For the intermediate phases considered there, $P(x)$ has a similar behavior but with reduced peaks and suppressed oscillations for higher $M$ ($\sim 10$). For $\psi = \pi$ the alternating behavior of the Wigner function for odd and even $M$ is clearly reflected in the pronounced initial peak for odd $M = 1, 3$ (note the spike at the center of the Wigner function). Smaller peaks in $P(x)$ appear for increasing $M$. For larger $|Y|$ ($\sim 3$) and $\psi = 0$, $P(x)$ behaves essentially as for $|Y| \sim 0.8$. For the intermediate phase, the sharp dips at the center of the Wigner function for odd $M$ result in a sharp drop of $P(x)$ near $x = 0$, followed by an oscillatory (for higher $M$) decay. For $\psi = \pi$, $P(x)$ resembles that of the Fock state, equation (5.14a). The results presented provide further insight into the systematic study of the gradual behavior from number to chaotic state. The Wigner function for the generalized geometric state, obtained from (2.13), is now given by
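$$W(\beta) = \frac{2}{\pi}\exp[-2|\beta|^2]\,\frac{1-|Y|^2}{1-|Y|^{2(M+1)}}\left\{\sum_{n=0}^{M}(-1)^n|Y|^{2n}L_n(4|\beta|^2)+2\,\mathrm{Re}\sum_{m>n}(-1)^n\left(\frac{n!}{m!}\right)^{1/2}Y^{*m}\,Y^{n}\,(2\beta)^{m-n}L_n^{(m-n)}(4|\beta|^2)\right\} \tag{5.15}$$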
The Wigner function is particularly important as it gives the correct probability for a chosen observable by integration over the conjugate observable. It is a complete representation of the state of the system, since there is a one-to-one correspondence between the Wigner function and the wavefunction of the state. Thus, if we can map the Wigner function, we have a complete representation of the state. Investigations of the Wigner quasiprobability function of equation (5.15) reveal that it has an almost Gaussian shape for small values of $|Y|$ ($< 0.3$) and is largely insensitive to changes in $M$. This is due to the dominance of the vacuum state over the higher excitations. As $|Y|$ increases, however, distinct regions appear in which the Wigner function takes (non-classical) negative values. The structure changes dramatically once $|Y|$ exceeds unity. For these values the vacuum state no longer has the highest probability in the number state expansion, and crescent-like structures appear at radii characteristic of the photon number states. For some different values of $M$ the characteristic number state rings and the anisotropy characteristic of a preferred phase are reported in Ref [37].

5.4 Phase properties of the generalized geometric states.

The Hermitian optical phase operator is defined in a finite-dimensional state space by equation (4.29). The use of the phase operator involves evaluating expectation values and moments as functions of $s$ before letting $s$ tend to infinity. The importance of this limiting procedure has been discussed in detail, see for example [38]. Since equation (4.31) can be used as a probability density for calculating the moments of the Hermitian optical phase operator, if we take the argument of $Y$ to be zero, so that $Y$ is real and positive, the phase probability density is

$$P(\theta) = \frac{1}{2\pi}\,\frac{1-Y^2}{1-Y^{2(M+1)}}\;\frac{1-2Y^{M+1}\cos[(M+1)\theta]+Y^{2(M+1)}}{1-2Y\cos\theta+Y^2} \tag{5.16}$$
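The phase probability density of the generalized geometric state can be checked against a direct evaluation of $\frac{1}{2\pi}|\sum_n c_n e^{-in\theta}|^2$. The sketch below assumes, as in this subsection, number-state amplitudes proportional to $Y^n$ with $Y$ real, positive and different from 1; the function names are hypothetical. It also compares a numerically integrated phase variance over $[-\pi, \pi]$ with the series $\frac{\pi^2}{3} + \frac{4}{1-Y^{2(M+1)}}\sum_{m=1}^{M}\frac{(-1)^m}{m^2}Y^m[1-Y^{2(M+1-m)}]$, which is our reading of the variance formula for this state.

```python
import cmath
from math import cos, pi

def phase_density_closed(theta, Y, M):
    """Closed-form phase density for amplitudes proportional to Y**n, n = 0..M."""
    pref = (1 - Y**2) / (1 - Y**(2 * (M + 1))) / (2 * pi)
    num = 1 - 2 * Y**(M + 1) * cos((M + 1) * theta) + Y**(2 * (M + 1))
    den = 1 - 2 * Y * cos(theta) + Y**2
    return pref * num / den

def phase_density_direct(theta, Y, M):
    """Direct evaluation of |sum_n c_n exp(-i n theta)|^2 / (2 pi)."""
    norm2 = sum(Y**(2 * n) for n in range(M + 1))
    amp = sum(Y**n * cmath.exp(-1j * n * theta) for n in range(M + 1))
    return abs(amp)**2 / (2 * pi * norm2)

def phase_variance_numeric(Y, M, steps=20000):
    """Midpoint-rule estimate of the integral of theta^2 P(theta) over [-pi, pi]."""
    h = 2 * pi / steps
    total = 0.0
    for k in range(steps):
        th = -pi + (k + 0.5) * h
        total += th * th * phase_density_closed(th, Y, M)
    return total * h

def phase_variance_series(Y, M):
    """Series form of the phase variance (mean phase is zero by symmetry)."""
    s = sum((-1)**m * Y**m * (1 - Y**(2 * (M + 1 - m))) / m**2 for m in range(1, M + 1))
    return pi**2 / 3 + 4 * s / (1 - Y**(2 * (M + 1)))

if __name__ == "__main__":
    print(phase_density_closed(0.4, 0.8, 5), phase_density_direct(0.4, 0.8, 5))
    print(phase_variance_numeric(0.8, 5), phase_variance_series(0.8, 5))
```

For small $Y$ the density is close to the uniform value $1/2\pi$ and the variance is close to $\pi^2/3$, the value for a state of random phase.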
We also choose the value of $\theta_0$ in equation (4.28) to be $-\pi$, so as to ensure the sensible result that the expectation value of the phase operator is the argument of $Y$. With these choices equation (4.29) gives

$$\langle\hat{\phi}_\theta\rangle = 0, \qquad \Delta\phi_\theta^2 = \frac{\pi^2}{3}+\frac{4}{1-Y^{2(M+1)}}\sum_{m=1}^{M}\frac{(-1)^m}{m^2}\,Y^m\left[1-\frac{Y^{2(M+1)}}{Y^{2m}}\right] \tag{5.17}$$

It is worth noting that, in the limit as $M$ tends to infinity, this variance becomes

$$\Delta\phi_\theta^2(M\to\infty) = \frac{\pi^2}{3}+4\,\mathrm{dilog}(1+Y), \tag{5.17a}$$

where $\mathrm{dilog}(\cdot)$ is the dilogarithm function. The form of this variance is similar to that found for the variance of the phase sum associated with the two-mode squeezed vacuum state. This is a consequence of the similarity between the coefficients in the number state expansions of the coherent phase state and the two-mode squeezed vacuum. Investigations of $P(\theta)$ of equation (5.16) for different values of $M$ show that [37], for the values $0 < |Y| < 0.99$, $1.01 < |Y| < 20$ and $-\pi < \theta < \pi$, $P(\theta)$ is almost constant (about $\frac{1}{2\pi}$) for $|Y| < 0.2$ whatever the value of $M$, which means that the vacuum is the dominant state in this case (compare with the discussion of the Wigner function for small values of $|Y|$). However, as $|Y|$ increases, a peak is produced at around $\theta = 0$. This peak sharpens as $M$ increases, tending towards a delta function as $M \to \infty$, $Y \to 1$ (the phase state). As $|Y|$ increases and takes values greater than unity, this peak gradually disappears. Finally it settles to the constant value (about $\frac{1}{2\pi}$) for $Y > 10$ (it tends to the Fock state $|M\rangle$ as $|Y| \to \infty$, and hence phase information is lost). Discussion of the fluctuations $\langle\Delta\phi_\theta^2\rangle$ and $\langle\Delta n^2\rangle$ for different values of $M$ shows that [37], as $M$ increases, the fluctuations in $n$ are increased near $|Y| = 1$. It is to be noted that both $\langle\Delta n^2\rangle$ or $\langle\Delta$
5.5 Production schemes.

In this subsection we present two schemes for the production of such states. The first depends on a generalized Jaynes-Cummings (JC) model, while the second relies on the SU(1,1) algebra. Generalizations of the JC model that include nonlinear interactions (in boson and spin variables) have been proposed. An interaction Hamiltonian for one of these generalizations, which describes multiphoton processes in finite-level atomic systems, is of the form
Hint = J2^{(aS+y
+ (tfS„y},
(5.18)
where $S_3$, $S_+$ and $S_-$ are the inversion, raising and lowering operators which describe atomic systems having $(2S+1)$ states. They satisfy the commutation relations $[S_3, S_\pm] = \pm S_\pm$, and $S_3$ has the eigenvalues $m$ such that $S_3|m\rangle = m|m\rangle$, where $-S \leq m \leq S$. The field operators $a$ and $a^{\dagger}$ satisfy the usual commutation relation $[a, a^{\dagger}] = 1$. The coupling constants $g_j$ couple the atomic system to the field; finally $r \leq S$. This Hamiltonian reproduces the JC model for $r = \frac{1}{2}$, $S = \frac{1}{2}$. Another choice of $r$ and $S$ gives the model discussed by Senitzky [39]. Several values of $S$ were investigated by Buck and Sukumar [40]. The case of general $S$ and $r = \frac{1}{2}$ is the well-known Dicke model and the Tavis-Cummings model [41] of cooperative two-level atoms. Taking $r = 1$ and $S = 1$, equation (5.18) describes a three-level atom interacting with a single mode, in which transitions between neighboring levels are effected by single-photon processes while the transition between the upper and lower levels is effected by a two-photon process; this is a special case of a Hamiltonian considered for the three-level atom system. Thus we may say that the interaction model (5.18) represents a $(2r+1)$-level atom interacting with one mode of the radiation field, where one-photon transitions occur between neighboring levels, two-photon transitions occur between levels $i$ and $i+2$ ($i < 2r+1$), and so on, up to $2r$-photon transitions between the two extreme levels of the atom. The model (5.18) can be used to describe processes such as multiphonon transitions in a two-level atomic system, multiphoton lasers, and Raman and hyper-Raman processes. The operators $S_\pm$ when applied to the state vector $|m\rangle$, for $r = S$, give
$$S_+^{\,j}|m\rangle = \left\{\frac{(S-m)!\,(S+m+j)!}{(S-m-j)!\,(S+m)!}\right\}^{1/2}|m+j\rangle, \qquad S_-^{\,j}|m\rangle = \left\{\frac{(S-m+j)!\,(S+m)!}{(S-m)!\,(S+m-j)!}\right\}^{1/2}|m-j\rangle \tag{5.19}$$

We assume that the system evolves under the Hamiltonian (5.18) from the initial atomic coherent state

$$|\psi(0)\rangle = \sum_{m=-S}^{S}\binom{2S}{m+S}^{1/2}\frac{\tau^{\,m+S}}{(1+|\tau|^2)^{S}}\,|m\rangle\,|0\rangle_{ph}, \tag{5.20}$$
where $|0\rangle_{ph}$ is the vacuum state of the field, while $\tau = \tan(\theta/2)\,e^{i\varphi}$. For a short time $t$ (i.e., $g_jt \ll 1$), the wave function of the system becomes
\il,(t) > = |V(0) > - f t £ 2 <9j f { ( o S + ) ' + (at5_y}|V(0) > . 3= 1
(5.21)
J
'
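By using the equations (5.20) and (5.19) we get the following expression

\[
|\psi(t)\rangle = |\psi(0)\rangle - it\sum_{j=1}^{2r} g_j\sqrt{j!}\sum_{m=j-S}^{S}\binom{2S}{m+S}^{1/2}\frac{\tau^{\,m+S}}{(1+|\tau|^{2})^{S}}
\left\{\frac{(S-m+j)!\,(S+m)!}{(S-m)!\,(S+m-j)!}\right\}^{1/2}|m-j\rangle\,|j\rangle.
\tag{5.22}
\]

Suppose at time $t$ the atom is measured to be in its ground state $|-S\rangle$; then the state of the field is given by

\[
|\psi_f(t)\rangle \simeq A_0|0\rangle - it\sum_{j=1}^{2r} g_j\sqrt{j!}\,\frac{(2S)!}{(2S-j)!}\,\frac{\tau^{\,j}}{(1+|\tau|^{2})^{S}}\,|j\rangle,
\tag{5.23}
\]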
where $A_0 = (\cos\vartheta/2)^{2S}$. By making the coupling constants $g_j \propto (2S-j)!\,Y^{j}/(\sqrt{j!}\,\tau^{j})$ and $2r = M$ [i.e., an $(M+1)$-level atom], the field state is then the $|Y, M\rangle$ of equation (5.1). To produce the generalized geometric states by using the Lie algebra approach, one can see that the coherent phase state bears a close relationship to the SU(1,1) coherent states, which are important in the theory of squeezing. The SU(1,1) coherent state $|\xi\rangle$ is defined in terms of the action of the operators $K_+$, $K_-$, and $K_3$ obeying the commutation relations

\[
[K_-, K_+] = 2K_3, \qquad [K_3, K_\pm] = \pm K_\pm.
\tag{5.24a}
\]

The single-mode squeezed states result if we let $K_- = a^{2}/2$, with $K_+$ being its Hermitian conjugate, by the action of the unitary operator $\exp(\xi K_+ - \xi^{*}K_-)$ on the vacuum state $|0\rangle$. An alternative representation of the SU(1,1) algebra for the operators acting on a single field mode is given by

\[
\hat A = \hat a\,\hat n^{1/2}, \qquad \hat A^\dagger = \hat n^{1/2}\,\hat a^\dagger,
\tag{5.24b}
\]
representing $K_-$ and $K_+$ respectively, with the representation completed by the operator $\hat n + \tfrac12$ as $K_3$. An SU(1,1) coherent state constructed using these operators has the form

\[
|\xi\rangle = \exp(\xi\hat A^\dagger - \xi^{*}\hat A)\,|0\rangle,
\tag{5.25}
\]

which, on using the disentangling theorem

\[
\exp(\xi\hat A^\dagger - \xi^{*}\hat A) = \exp\{[e^{i\eta}\tanh r]\hat A^\dagger\}\,\exp\{-2[\ln(\cosh r)](\hat n + \tfrac12)\}\,\exp\{-[e^{-i\eta}\tanh r]\hat A\},
\tag{5.25a}
\]

with $\xi = r\exp(i\eta)$, gives

\[
|\xi\rangle = (\cosh r)^{-1}\exp\{[e^{i\eta}\tanh r]\hat A^\dagger\}|0\rangle = (\cosh r)^{-1}\sum_{n=0}^{\infty}\left[e^{i\eta}\tanh r\right]^{n}|n\rangle.
\tag{5.25b}
\]
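Two ingredients of this construction are simple to verify in the number basis: that $\hat A = \hat a\,\hat n^{1/2}$ and $\hat A^\dagger$ close on $2(\hat n + \tfrac12)$, and that the state (5.25b) is normalized. The sketch below is illustrative only; it assumes the actions $\hat A|n\rangle = n|n-1\rangle$ and $\hat A^\dagger|n\rangle = (n+1)|n+1\rangle$ that follow from that choice of $\hat A$, and the truncation `n_max` is an arbitrary numerical cutoff.

```python
import cmath
from math import cosh, tanh


def commutator_diag(n):
    """<n|[A, A^dagger]|n> using A|n> = n|n-1> and A^dagger|n> = (n+1)|n+1>."""
    return (n + 1) ** 2 - n ** 2


def coherent_phase_amplitudes(r, eta, n_max):
    """Number-state amplitudes of (5.25b), truncated at n_max (numerical cutoff)."""
    zeta = cmath.exp(1j * eta) * tanh(r)
    return [zeta ** n / cosh(r) for n in range(n_max + 1)]


def truncated_norm(r, eta, n_max):
    """Sum of |amplitude|^2 over the retained number states."""
    return sum(abs(c) ** 2 for c in coherent_phase_amplitudes(r, eta, n_max))
```

The diagonal element `commutator_diag(n)` equals $2n + 1 = 2(n + \tfrac12)$, consistent with $K_3 = \hat n + \tfrac12$, and `truncated_norm` approaches one as `n_max` grows because $1 - \tanh^{2} r = 1/\cosh^{2} r$.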
If we write $Y = \exp(i\eta)\tanh r$, then we see that this is simply the coherent phase state or the generalized geometric state $|Y, \infty\rangle$. In principle, this state could be produced by an interaction that involves an intensity-dependent coupling.

6 Even geometric states. Now we shall consider the properties of a normalized superposition of the two generalized geometric states $|Y, M\rangle$ and $|-Y, M\rangle$ in the form

\[
|Y, M\rangle\rangle = \lambda\left(|Y, M\rangle + |-Y, M\rangle\right) = \left(\frac{1-|Y|^{4}}{1-|Y|^{4([M/2]+1)}}\right)^{1/2}\sum_{n=0}^{[M/2]} Y^{2n}\,|2n\rangle,
\tag{6.1}
\]

where $[x]$ denotes the largest integer not exceeding $x$. We refer to these states as the even geometric states, by analogy with the terminology applied to superpositions of coherent states [42]. With this definition for the even generalized geometric state we can calculate some statistical properties of the field.
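As a quick numerical illustration of (6.1), the photon-number distribution of the even geometric state can be tabulated directly and its mean compared with the closed form that follows from summing the finite geometric series. This is a sketch under the assumption that the coefficient of $|2n\rangle$ is proportional to $Y^{2n}$ for $0 \le n \le [M/2]$; the function names are illustrative and the closed form requires $|Y| \ne 1$.

```python
def even_geometric_probs(Y, M):
    """Photon-number probabilities {2n: p} for the state (6.1)."""
    K = M // 2                      # [M/2]
    x = abs(Y) ** 4                 # ratio of successive weights
    weights = [x ** n for n in range(K + 1)]
    total = sum(weights)            # inverse of the squared normalization
    return {2 * n: w / total for n, w in enumerate(weights)}


def mean_photon_number_direct(Y, M):
    """Mean photon number by direct summation over the distribution."""
    return sum(n * p for n, p in even_geometric_probs(Y, M).items())


def mean_photon_number_closed(Y, M):
    """Closed form from the finite geometric sum (valid for |Y| != 1)."""
    K = M // 2
    x = abs(Y) ** 4
    numerator = 2 * x * (1 - (K + 1) * x ** K + K * x ** (K + 1))
    return numerator / ((1 - x) * (1 - x ** (K + 1)))
```

Only even photon numbers appear in the returned dictionary, which is the property responsible for the vanishing of odd-power expectation values.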
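\[
\langle \hat a^\dagger \hat a\rangle = \frac{2|Y|^{4}\left\{1 + [M/2]\,|Y|^{4([M/2]+1)} - ([M/2]+1)\,|Y|^{4[M/2]}\right\}}{(1-|Y|^{4})\,(1-|Y|^{4([M/2]+1)})}
\tag{6.2a}
\]

and

\[
\langle \hat a^{\dagger 2}\hat a^{2}\rangle = \frac{1-|Y|^{4}}{1-|Y|^{4([M/2]+1)}}\sum_{n=1}^{[M/2]} 2n(2n-1)\,|Y|^{4n}.
\tag{6.2b}
\]

The expectation value for the field operator $\hat a^{2n}$ is given by

\[
\langle \hat a^{2n}\rangle = \frac{1-|Y|^{4}}{1-|Y|^{4([M/2]+1)}}\;Y^{2n}\sum_{m=0}^{[M/2]-n}|Y|^{4m}\left(\frac{(2m+2n)!}{(2m)!}\right)^{1/2},
\tag{6.2c}
\]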
while expectation values for odd powers vanish. It is to be expected that this state would give stronger squeezing than the generalized geometric state. For the state (6.1) we calculate the quasiprobability function and find that

\[
W(\beta) = \frac{2}{\pi}\,e^{-2|\beta|^{2}}\,\frac{1-|Y|^{4}}{1-|Y|^{4([M/2]+1)}}
\left[\sum_{n=0}^{[M/2]}|Y|^{4n}L_{2n}(4|\beta|^{2})
+ 2\sum_{m>n}^{[M/2]}\left(\frac{(2n)!}{(2m)!}\right)^{1/2}|Y|^{2(m+n)}\,(2|\beta|)^{2(m-n)}
\cos[2(m-n)(\gamma-\phi)]\,L_{2n}^{2(m-n)}(4|\beta|^{2})\right],
\tag{6.3}
\]

with $Y = |Y|\exp(i\gamma)$ and $\beta = |\beta|\exp(i\phi)$. The investigation of the Wigner function of equation (6.3) (see [37]) for small values of $|Y|$ ($< 0.2$) does not show any structure beyond the Gaussian shape, apart from a slight anisotropy similar to that of the squeezed states. The effect of the vacuum state is predominant, while the effect of the higher excitations is not pronounced for this case. As $|Y|$ increases, the effects of the higher photon numbers again lead to negative regions. For $|Y| = 2$ one finds a complicated structure for the Wigner function, but an enhanced probability for two phases corresponding to opposite directions in the $\beta$ plane is discernible. We consider the phase properties of this state in a manner analogous to that considered in §5; consequently we find that

\[
P^{(2)}(\theta) = \frac{1}{2\pi}\,\frac{1-|Y|^{4}}{1-|Y|^{4([M/2]+1)}}\;
\frac{1 - 2|Y|^{2([M/2]+1)}\cos\{2([M/2]+1)\theta\} + |Y|^{4([M/2]+1)}}{1 - 2|Y|^{2}\cos 2\theta + |Y|^{4}},
\tag{6.4}
\]

while the phase fluctuations in this case are given by

\[
\langle(\Delta\theta)^{2}\rangle = \frac{\pi^{2}}{3} + \frac{1}{1-|Y|^{4([M/2]+1)}}\sum_{k=1}^{[M/2]}\frac{|Y|^{2k} - |Y|^{4([M/2]+1)-2k}}{k^{2}}
\;\longrightarrow\; \frac{\pi^{2}}{3} + \mathrm{dilog}(1-|Y|^{2})
\quad\text{as } M \to \infty,\ |Y| < 1.
\]
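The large-$M$ limit of the phase variance can be checked by integrating $\theta^{2}$ against the distribution of equation (6.4). The sketch below is illustrative: it assumes the phase window $-\pi \le \theta \le \pi$, takes the phase of $Y$ to be zero, reads $\mathrm{dilog}(1-x)$ as the series $\sum_{k\ge1} x^{k}/k^{2}$, and uses a plain trapezoidal rule; all function names are hypothetical.

```python
from math import cos, pi


def phase_density(theta, y_abs, M):
    """Phase distribution of the even geometric state, eq. (6.4), with arg(Y) = 0."""
    K = M // 2
    x = y_abs ** 2
    norm = (1 - x ** 2) / (1 - x ** (2 * (K + 1)))
    num = 1 - 2 * x ** (K + 1) * cos(2 * (K + 1) * theta) + x ** (2 * (K + 1))
    den = 1 - 2 * x * cos(2 * theta) + x ** 2
    return norm * num / (2 * pi * den)


def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h


def phase_variance(y_abs, M):
    """<theta^2> over the window [-pi, pi]; the mean phase vanishes by symmetry."""
    return integrate(lambda t: t * t * phase_density(t, y_abs, M), -pi, pi)


def dilog_limit(y_abs, terms=2000):
    """pi^2/3 + sum_k |Y|^(2k)/k^2, the assumed reading of the M -> infinity limit."""
    return pi ** 2 / 3 + sum(y_abs ** (2 * k) / k ** 2 for k in range(1, terms + 1))
```

For $|Y| \to 0$ the variance tends to $\pi^{2}/3$, the value for a uniform phase distribution, as expected for the vacuum.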
The phase properties for the even geometric state are considered in [37], where the function $P^{(2)}(\theta)$ of equation (6.4) is plotted for different values of $M$ and in the ranges $0 < |Y| < 0.99$ and $1.01 < |Y| < 20$ for $-\pi \le \theta \le \pi$. Peaks appear at $\theta = 0$ and also at $\theta = \pi$ and $\theta = -\pi$. The phase probability distribution is, of course, periodic, so that within any $2\pi$ interval or window there are really two peaks. This is reminiscent of the phase distribution found for the squeezed vacuum states [43]. For both the even geometric states and the squeezed vacuum, the $\pi$ periodicity of the phase probability distribution arises from the absence of odd photon numbers in the expansion of the state. The limits of both large and small $|Y|$ are again number states, and this is reflected in the asymptotic form for $P^{(2)}(\theta)$.

7 Conclusion. Superpositions of quantum mechanical states of the e.m. field have recently received much attention in quantum optics. Experimental realizations of nonclassical states of motion of a trapped ion, such as Fock states, coherent states, squeezed states, and Schrödinger cat states, have recently been reported [44-47]. In these experiments an ion is laser-cooled in a Paul trap to the ground harmonic state. Then the atom is put into various quantum states of motion by the application of optical and electric pulses of different durations. Thus the study of non-classical states of light is not a mere academic exercise; it relates to the experimental realm. In this report we have studied some intermediate states, in particular the even and odd binomial states, which interpolate between the even and odd coherent states and the even and odd Fock states; the even and odd negative binomial states, which bridge between the even and odd coherent states and the even and odd pure thermal states, have also been discussed. The geometric state, which is a bridge between the pure thermal state and the Fock state, has been introduced, and the even and odd geometric states have been studied. For these states we have considered the non-classical properties, especially antibunching, sub-Poissonian statistics, and squeezing (normal and higher-order) of the field quadratures, and the quasi-probability distribution functions have been calculated and plotted.
The nonclassical signature shows in the Wigner function attaining negative values and in oscillations in the photon distribution.

References.
[1] P.A.M. Dirac, Principles of Quantum Mechanics, 4th ed. (Oxford University Press, Oxford, 1958); E. Schrödinger, Naturwissenschaften 14, 644 (1926).
[2] R. Glauber, Phys. Rev. 130, 2529 (1963); ibid. 131, 2766 (1963).
[3] J. Perina, Coherence of Light (Reidel, Dordrecht, 1985).
[4] D. Stoler, B.E.A. Saleh, and M.C. Teich, Opt. Acta 32, 345 (1985); A. Vidiella-Barranco and J.A. Roversi, Phys. Rev. A 50, 5233 (1994).
[5] A. Joshi and S.V. Lawande, Opt. Commun. 70, 21 (1989); G.S. Agarwal, Phys. Rev. A 45, 1787 (1992).
[6] L. Susskind and J. Glogower, Physics 1, 49 (1964).
[7] M.S. Abdalla, M.H. Mahran, and A.-S.F. Obada, J. Mod. Opt. 41, 1889 (1994).
[8] S.C. Jing and H.Y. Fan, Phys. Rev. A 49, 2277 (1994).
[9] A. Joshi and A.-S.F. Obada, J. Phys. A: Math. Gen. 30, 81 (1997).
[10] R. Simon and M.V. Satyanarayana, J. Mod. Opt. 35, 719 (1988).
[11] A.-S.F. Obada, S.S. Hassan, R.R. Puri, and M.S. Abdalla, Phys. Rev. A 48, 3174 (1993).
[12] B. Baseia, A.F. de Lima, and G.C. Marques, Phys. Lett. A 204, 1 (1995).
[13] D.T. Pegg and S.M. Barnett, Europhys. Lett. 6, 483 (1988); J. Mod. Opt. 36, 7 (1988).
[14] B. Baseia, A.F. de Lima, and A.J. da Silva, Mod. Phys. Lett. 9, 1673 (1995).
[15] B. Roy, Mod. Phys. Lett. B 12, 23 (1998).
[16] V.V. Dodonov, I.A. Malkin, and V.I. Man'ko, Physica 72, 597 (1974).
[17] C.C. Gerry, J. Mod. Opt. 40, 1053 (1993).
[18] V. Buzek and P.L. Knight, Opt. Commun. 81, 331 (1991).
[19] V. Buzek, A. Vidiella-Barranco, and P.L. Knight, Phys. Rev. A 45, 657 (1992).
[20] V. Buzek, I. Jex, and T. Quang, J. Mod. Opt. 37, 159 (1990).
[21] E.P. Wigner, Phys. Rev. 40, 749 (1932).
[22] G.S. Agarwal and E. Wolf, Phys. Rev. D 2, 2161 (1970).
[23] C.L. Mehta and E.C.G. Sudarshan, Phys. Rev. B 138, 274 (1965).
[24] H. Moya-Cessa and P.L. Knight, Phys. Rev. A 48, 2479 (1993).
[25] G. Dattoli, J. Gallardo, and A. Torre, J. Opt. Soc. Am. B 4, 185 (1987).
[26] F.A.A. El-Orany, M.H. Mahran, A.-S.F. Obada, and M.S. Abdalla, Inter. J. Theor. Phys. 35, 1393 (1998).
[27] A.-S.F. Obada, M.H. Mahran, F.A.A. El-Orany, and M.S. Abdalla, Inter. J. Theor. Phys. 38, 1493 (1999).
[28] R. Lynch, Phys. Rev. A 49, 2800 (1994).
[29] M.H. Mahran, M.S. Abdalla, A.-S.F. Obada, and F.A.A. El-Orany, Nonlinear Optics 19, 189 (1998).
[30] C.K. Hong and L. Mandel, Phys. Rev. Lett. 54, 323 (1985); Phys. Rev. A 32, 974 (1985).
[31] G.S. Agarwal and K. Tara, Phys. Rev. A 43, 492 (1991).
[32] J.W. Noh, A. Fougères, and L. Mandel, Phys. Rev. Lett. 67, 1920 (1991); Phys. Rev. A 45, 424 (1992).
[33] See the special issue Physica Scripta T 48 (1993), and R. Lynch, Phys. Rep. 250, 367 (1995).
[34] D.T. Pegg and S.M. Barnett, Phys. Rev. A 39, 1005 (1989).
[35] H.A. Batarfi, M.S. Abdalla, A.-S.F. Obada, and S.S. Hassan, Phys. Rev. A 51, 2644 (1995); H.A. Batarfi, M.S. Abdalla, and S.S. Hassan, Nonlinear Optics 16, 131 (1996).
[36] B. Spain and M.G. Smith, Functions of Mathematical Physics (Van Nostrand Reinhold, New York, 1970).
[37] A.-S.F. Obada, O.M. Yassin, and S.M. Barnett, J. Mod. Optics 44, 149 (1997).
[38] J.A. Vaccaro and D.T. Pegg, Physica Scripta T 48, 22 (1993); S.M. Barnett and D.T. Pegg, J. Mod. Opt. 39, 2121 (1992).
[39] I.R. Senitzky, Phys. Rev. A 3, 421 (1971).
[40] B. Buck and C.V. Sukumar, J. Phys. A 17, 877, 885 (1984).
[41] R.H. Dicke, Phys. Rev. 93, 99 (1954); M. Tavis and F.W. Cummings, Phys. Rev. 170, 379 (1968); ibid. 188, 692 (1969).
[42] B.M. Garraway and P.L. Knight, Physica Scripta T 48, 66 (1993).
[43] J.A. Vaccaro, S.M. Barnett, and D.T. Pegg, J. Mod. Optics 39, 603 (1992).
[44] D. Leibfried, D.M. Meekhof, B.E. King, C. Monroe, W.M. Itano, and D.J. Wineland, Phys. Rev. Lett. 77, 4281 (1996).
[45] D.M. Meekhof, C. Monroe, B.E. King, W.M. Itano, and D.J. Wineland, Phys. Rev. Lett. 76, 1796 (1996).
[46] C. Monroe, D.M. Meekhof, B.E. King, and D.J. Wineland, Science 272, 1131 (1996).
[47] W.M. Itano, C. Monroe, D.M. Meekhof, D. Leibfried, B.E. King, and D.J. Wineland, SPIE Proc. 2995, 43 (1997).
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 357-371)
ON THE RELATIVISTIC TWO-BODY EQUATION S.R. Komy Department of Mathematics, Faculty of Science, Helwan University, Egypt.
Abstract Recently, an exact two-body covariant wave equation has been derived from the field theory of coupled Maxwell-Dirac equations (Barut & Komy, 1985). It involves only one common invariant center-of-mass time $\tau$. It takes full account of spin and recoil corrections of both particles. The equation, which also includes self-energy effects in a non-linear fashion, is a 16-component spinor equation for two Dirac particles. Working directly from this equation, energy eigenvalues and eigenfunctions are calculated to order $\alpha^{2}$ and $\alpha^{4}$, where $\alpha$ is the fine structure constant. The eigenvalues agree with those determined previously by perturbative techniques, including relativistic, recoil and spin corrections, for all the energy levels of orthopositronium. The eigenvalue problems that arise are of Sturm-Liouville type, but involve two or four coupled second-order differential equations in the radial variable $r$, with up to four singular points. In this approach approximate eigenvalues are determined directly, and the corresponding approximate eigenfunctions are obtained (for the first time) in simple closed form.
1 Introduction
In nonrelativistic quantum theory we have a well-established basis for the dynamics of many particles in terms of the many-body wave function $\phi(\mathbf{x}_1, \mathbf{x}_2, \ldots, t)$ in configuration space with one time, and potentials $V_{ij}(\mathbf{x}_i - \mathbf{x}_j)$ which are functions of the relative coordinates. This leads to a powerful and, in principle, nonperturbative way of describing bound states, resonances, and scattering states. Of particular interest is the trivial analysis for separating the center of mass and relative motion in the non-relativistic two-body system, which yields a simplified problem in which the relative motion is described using the reduced mass. When one or both particles are to be treated relativistically, there is no similar procedure. In fact, if one starts with a two-body system and lets one of the particle masses become infinite, it requires a non-trivial analysis to demonstrate that the result can be expressed in terms of a relativistic equation for the other particle in a central potential. In 1930 Dirac found the relativistically invariant equation of motion for an electron in an electromagnetic field. Together with Maxwell's equations, it forms the fundamental set of equations of photons and electrons [1]. One of the major successes of Dirac theory was the explanation of the energy spectrum of the H-like atoms, including the fine structure. However, the theoretical treatment is far from complete. There remain considerable radiative corrections, as well as relativistic refinements. Moreover, the observed energy levels refer to real systems, so that small corrections are to be expected which arise from interactions between magnetic moments, nuclear structures, recoil effects, and various contributions of all those mentioned. A proper understanding of fine and hyperfine structures requires careful treatment of the actual two-body problem; in particular, pure electromagnetic systems such as positronium, muonium, ... cannot be studied without a well developed two-particle theory. The earliest treatment of the relativistic two-body system was that of Breit, who constructed a theory of two electrons interacting with the electromagnetic field [2].
Although this theory does account for retardation effects, it corresponds to a single particle theory and in turn does not yield the proper corrections to the fine and hyperfine structures. On the other hand, many practical calculations are made on the basis of a Hamiltonian which is a sum of Dirac Hamiltonians, and the main problem is how to choose the potentials. Usually they are assumed phenomenologically, or one-photon and one-boson exchange potentials are used [?]. The theory based on such postulated Hamiltonians is approximate. In quantum electrodynamics two-body systems such as the H-atom or positronium are treated perturbatively. Starting from a suitable wave equation (Schrödinger or Dirac equation with reduced mass, or a truncated Bethe-Salpeter equation), recoil and radiative corrections are introduced term by term from Feynman graphs. In this way,
energy level corrections in positronium have been calculated up to the order of about $\alpha^{5}$, where $\alpha$ is the fine structure constant. It is very important to determine whether all these quantum electrodynamic effects can be obtained directly from an exact, relativistic, non-perturbative two-body equation. Recently, an exact two-body covariant wave equation has been derived from the field theory of coupled Maxwell-Dirac equations (Barut-Komy [4]). This derived equation involves only one common time, an important feature, because the use of different proper times for the individual particles would lead to unwieldy retardation effects. The equation, which also includes self-energy effects in a non-linear fashion, is a 16-component spinor equation for two Dirac particles. From the mathematical point of view, this two-body equation offers some interesting new aspects. It leads to problems of Sturm-Liouville type involving a set of coupled second order ordinary differential equations in the radial variable. In Section 2, we derive the two-body equation. In Section 3, solutions for some special cases will be presented.
2 Derivation of the two-body equation
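Consider a number of (distinct) fermion fields $\psi_1(x), \psi_2(x), \ldots$ interacting with the electromagnetic field $A_\mu$. The action is:

\[
S = \int dx\left[-\tfrac14 F_{\mu\nu}F^{\mu\nu} + \sum_j \bar\psi_j\,(i\gamma^\mu\partial_\mu - m_j)\,\psi_j - \sum_j e_j\,\bar\psi_j\gamma^\mu\psi_j\,A_\mu - \sum_j a_j\,\bar\psi_j\sigma^{\mu\nu}\psi_j\,F_{\mu\nu}\right].
\tag{1}
\]

The fermions have electric charges $e_j$, anomalous magnetic moments $a_j$, and spin matrices $\gamma^{\mu}_{(j)}$, $j = 1, 2, \ldots$, and

\[
\sigma^{\mu\nu} = \tfrac{i}{2}\left(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu\right).
\tag{2}
\]

Hence:

\[
\sigma^{0k} = i\gamma^{0}\gamma^{k} = i\alpha_k = -\sigma^{k0}, \qquad \sigma^{ij} = \epsilon^{ijk}\sigma_k, \qquad \alpha_k^{2} = 1, \qquad \sigma_k^{2} = 1.
\tag{3}
\]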
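The equations of motion obtained from (1) are:

\[
\partial^{\nu}F_{\nu\mu} = j_\mu
\tag{4}
\]

and

\[
(i\gamma^\mu\partial_\mu - m_j)\psi_j - e_j\gamma^\mu A_\mu\psi_j - a_j\sigma^{\mu\nu}F_{\mu\nu}\psi_j = 0, \qquad j = 1, 2, \ldots
\tag{5}
\]

Because we shall use the Green's function of the wave operator, there is a preferred covariant gauge,

\[
A^{\mu}{}_{,\mu} = 0,
\tag{6}
\]

and with that choice, equation (4) becomes

\[
\Box A_\mu = j_\mu = \sum_j\left[e_j\,\bar\psi_j\gamma_\mu\psi_j + 2a_j\,(\bar\psi_j\sigma_{\mu\nu}\psi_j)^{,\nu}\right].
\tag{7}
\]

The general solution of (7) is

\[
A_\mu(x) = A^{\rm in}_\mu(x) + \int dy\,D(x-y)\sum_j\left[e_j\,\bar\psi_j(y)\gamma_\mu\psi_j(y) + 2a_j\,(\bar\psi_j(y)\sigma_{\mu\nu}\psi_j(y))^{,\nu}\right].
\tag{8}
\]

In the second term we perform an integration by parts, and assume that, for localized current distributions, the surface terms at infinity vanish. Then

\[
A_\mu(x) = A^{\rm in}_\mu(x) + \sum_j e_j\int dy\,D(x-y)\,\bar\psi_j(y)\gamma_\mu\psi_j(y) - 2\sum_j a_j\int dy\,\partial^{\lambda}_{x}D(x-y)\,\bar\psi_j(y)\sigma_{\lambda\mu}\psi_j(y).
\tag{9}
\]

To eliminate $A_\mu$ completely from the action, we insert (9) into the action (1) and obtain: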
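\[
\begin{aligned}
S = {} & \int dx\sum_j \bar\psi_j(x)\,(i\gamma^\mu\partial_\mu - m_j)\,\psi_j(x)
 - \tfrac12\sum_{j,k} e_j e_k\int dx\,dy\;\bar\psi_j(x)\gamma^\mu\psi_j(x)\,D(x-y)\,\bar\psi_k(y)\gamma_\mu\psi_k(y) \\
& - \sum_{j,k} e_j a_k\int dx\,dy\;\bar\psi_j(x)\gamma^\mu\psi_j(x)\,\partial^{\lambda}_{x}D(x-y)\,\bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y) \\
& - \sum_{j,k} a_j e_k\int dx\,dy\;\bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\,\partial_{\nu}D(x-y)\,\bar\psi_k(y)\gamma_\mu\psi_k(y) \\
& - \sum_{j,k} 2a_j a_k\int dx\,dy\;\bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\,\partial_{\nu}\partial^{\lambda}_{x}D(x-y)\,\bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y) \\
& - \int dx\sum_j\left(e_j\,\bar\psi_j(x)\gamma^\mu\psi_j(x)\,A^{\rm in}_\mu + a_j\,\bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\,F^{\rm in}_{\mu\nu}\right).
\end{aligned}
\tag{10}
\]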
The diagonal terms $j = k$ correspond to self-energy, the interaction of the particle's current with itself. These terms have to be renormalized and treated separately. Here we are interested in the mutual interaction of two or more different particles, hence in the terms with $j \ne k$. The interaction term with external fields $A^{\rm in}_\mu$ is of the form $-j^\mu A^{\rm in}_\mu$. The sources of $A^{\rm in}_\mu$ do not appear in the action, as usual. For the mutual interaction of two particles we use the retarded Green's function $D_{\rm ret}(x-y)$. Now for the $e^{2}$-interaction there are two terms in (10), with coefficients $e_1e_2$ and $e_2e_1$. In the second term we interchange $x$ and $y$ and use the identity

\[
D_{\rm ret}(x-y) = D_{\rm adv}(y-x).
\]

This is equivalent to writing the $e^{2}$-interaction part as

\[
-\sum_{j<k} e_j e_k\int dx\,dy\;\bar\psi_j(x)\gamma^\mu\psi_j(x)\,\bar D(x-y)\,\bar\psi_k(y)\gamma_\mu\psi_k(y),
\tag{11}
\]

where

\[
\bar D = \tfrac12\left(D_{\rm ret} + D_{\rm adv}\right).
\]
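Similarly, in the other interaction terms, we note that

\[
\partial_x D_{\rm ret}(x-y) = -\partial_x D_{\rm adv}(y-x);
\]

we combine the various terms to obtain

\[
\begin{aligned}
& -\sum_{j<k} 2e_j a_k\int dx\,dy\;\bar\psi_j(x)\gamma^\mu\psi_j(x)\,\partial^{\lambda}_{x}\bar D(x-y)\,\bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y) \\
& +\sum_{j<k} 2a_j e_k\int dx\,dy\;\bar\psi_j(x)\sigma^{\mu\lambda}\psi_j(x)\,\partial_{\lambda}\bar D(x-y)\,\bar\psi_k(y)\gamma_\mu\psi_k(y) \\
& -\sum_{j<k} 4a_j a_k\int dx\,dy\;\bar\psi_j(x)\sigma^{\mu\nu}\psi_j(x)\,\partial_{\nu}\partial^{\lambda}_{x}\bar D(x-y)\,\bar\psi_k(y)\sigma_{\lambda\mu}\psi_k(y).
\end{aligned}
\tag{12}
\]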
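In particular, for the two-body problem, which we shall mainly consider in the following, the spin algebra is a direct product of two Dirac algebras. We shall always write the spin matrices as, e.g., $\gamma^\mu \otimes \gamma_\mu$.
The first term of the direct product will refer to particle 1, and the second term to particle 2. A feature of the (ea) and (aa) interactions in (12), not present in the (ee) Coulomb terms, is the occurrence of the derivatives $\partial_\lambda D(x-y)$ and $\partial_\mu\partial_\lambda D(x-y)$. With $r_m = x_m - y_m$, $r = |\mathbf{x} - \mathbf{y}|$, and $\partial_m = \partial/\partial x^{m}$, we have the relations

\[
\begin{aligned}
D_{\rm ret}(x-y) &= \frac{1}{4\pi r}\,\delta(x^{0}-y^{0}-r), \qquad D_{\rm adv}(x-y) = \frac{1}{4\pi r}\,\delta(x^{0}-y^{0}+r), \\
4\pi\,\partial_m D_{\rm ret}(x-y) &= -\frac{r_m}{r^{3}}\,\delta(x^{0}-y^{0}-r) - \frac{r_m}{r^{2}}\,\delta'(x^{0}-y^{0}-r), \\
4\pi\,\partial_m\partial_n D_{\rm ret}(x-y) &= \left(\frac{3r_m r_n}{r^{5}} - \frac{\delta_{mn}}{r^{3}}\right)\delta(x^{0}-y^{0}-r)
 + \left(\frac{3r_m r_n}{r^{4}} - \frac{\delta_{mn}}{r^{2}}\right)\delta'(x^{0}-y^{0}-r)
 + \frac{r_m r_n}{r^{3}}\,\delta''(x^{0}-y^{0}-r).
\end{aligned}
\tag{13}
\]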
Thus all the derivatives of the $D$-function reduce to $\delta$, $\delta'$, and $\delta''$ terms. Integrating these $\delta$ derivatives by parts with respect to $y_0$, we obtain the derivatives of the currents with respect to $y_0$ at retarded times. The variation of the action with respect to the individual fields $\psi_j$ leads to a set of coupled non-linear integrodifferential equations of the Hartree type. If instead we define composite fields and vary the action with respect to these composite fields, we obtain a linear equation. Define the bilocal field $\phi$, which is a 16-component composite spinor, by

\[
\phi(x, y) = \psi_1(x)\otimes\psi_2(y),
\tag{14}
\]

and consider, for simplicity, two particles. The interaction terms contain the composite field. The free part of the action is a sum of terms each containing one field only. We multiply the free particle part of one particle with the (Dirac) normalization integral of the other particle, e.g.,

\[
\int d^{3}y\;\bar\psi_2(y)\gamma^{0}\psi_2(y) = 1,
\]

at some arbitrary time $y_0$, so that the free action terms can be written as

\[
\int dx\,d^{3}y\;\left[\bar\psi_1(x)(\gamma^\mu i\partial_\mu - m_1)\psi_1(x)\right]\left[\bar\psi_2(y)\gamma^{0}\psi_2(y)\right]
+ \int dy\,d^{3}x\;\left[\bar\psi_2(y)(\gamma^\mu i\partial_\mu - m_2)\psi_2(y)\right]\left[\bar\psi_1(x)\gamma^{0}\psi_1(x)\right].
\tag{15}
\]
Clearly, we have one 4-dimensional and one 3-dimensional integral. However, the interaction part also has one 4-dimensional and one 3-dimensional integral, due to the $\delta$-functions in $D$. After inserting (13) into (12) and collecting the $\delta$, $\delta'$, $\delta''$ terms separately, an integration by parts is performed for the $\delta'$, $\delta''$ terms with respect to $x_0$, and we obtain derivatives of
(16)
where the potential is now: e e
l 2
47rr
„ ,M
~ '
„ei<22r. n
4TT
L
&-T
' " r3
'""
+
jaa^y+^av
-
4——[{am ® an - aman)—
+
jffm®U3(r)-yam®Qj3(r)].
'
r3
.77171S1
(
In the equation (16), the spin matrices of the two particles are written as tensor products, e.g. $\gamma^\mu \otimes \gamma_\mu$, with the first factor always referring to particle 1, the second to particle 2. Now we introduce relative and center of mass coordinates:

\[
r_\mu = x_\mu - y_\mu, \qquad R_\mu = a\,x_\mu + (1-a)\,y_\mu,
\]

or

\[
x_\mu = R_\mu + (1-a)\,r_\mu, \qquad y_\mu = R_\mu - a\,r_\mu,
\tag{18}
\]

and hence

\[
p_1^{\mu} = a\,P^{\mu} + p^{\mu}, \qquad p_2^{\mu} = (1-a)\,P^{\mu} - p^{\mu}.
\tag{19}
\]
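The action can be written in terms of the new coordinates as

\[
\int dR\,dr\;\bar\phi(R, r)\left[\left(\gamma^\mu(aP_\mu + p_\mu) - m_1\right)\otimes\gamma^{0} + \gamma^{0}\otimes\left(\gamma^\mu((1-a)P_\mu - p_\mu) - m_2\right) + V(r)\right]\phi(R, r),
\]

and the equation of motion (16) becomes:

\[
\left[\left(a\,\gamma^\mu\otimes\gamma^{0} + (1-a)\,\gamma^{0}\otimes\gamma^\mu\right)P_\mu + \left(\gamma^\mu\otimes\gamma^{0} - \gamma^{0}\otimes\gamma^\mu\right)p_\mu - \left(I\otimes\gamma^{0}\,m_1 + \gamma^{0}\otimes I\,m_2\right) + V(r)\right]\phi = 0.
\tag{20}
\]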
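We note that the coefficient of $p_0 = i\,\partial/\partial r_0$ cancels, so that no operator acts on the $r_0 = t$ dependence of $\phi$; that is, the $r_0$-dependence does not change. Hence we can consider

\[
\phi = \phi(R_\mu, \mathbf{r}).
\]

Equation (20) is in the form of a linear wave equation

\[
\left[\Gamma^\mu P_\mu + K\right]\phi = 0
\tag{21}
\]

with

\[
\Gamma^\mu = a\,\gamma^\mu\otimes\gamma^{0} + (1-a)\,\gamma^{0}\otimes\gamma^\mu,
\tag{22}
\]

\[
K = -\left(\boldsymbol{\gamma}\otimes\gamma^{0} - \gamma^{0}\otimes\boldsymbol{\gamma}\right)\cdot\mathbf{p} - \left(I\otimes\gamma^{0}\,m_1 + \gamma^{0}\otimes I\,m_2\right) + V(r).
\tag{23}
\]

Since $\Gamma^{0} = \gamma^{0}\otimes\gamma^{0}$, we multiply (21) with $\gamma^{0}\otimes\gamma^{0}$ and obtain

\[
\left(P_0 - \gamma^{0}\otimes\gamma^{0}\;\boldsymbol{\Gamma}\cdot\mathbf{P} + \gamma^{0}\otimes\gamma^{0}\,K\right)\phi = 0.
\]

Hence the Hamiltonian of the system can be written as

\[
H = P_0 = \left(a\,\boldsymbol{\alpha}\otimes I + (1-a)\,I\otimes\boldsymbol{\alpha}\right)\cdot\mathbf{P} + \left(\boldsymbol{\alpha}\otimes I - I\otimes\boldsymbol{\alpha}\right)\cdot\mathbf{p} + \beta\otimes I\,m_1 + I\otimes\beta\,m_2 - \gamma^{0}\otimes\gamma^{0}\,V.
\tag{24}
\]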
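The passage from (21) to (24) rests on the identity $(\gamma^{0}\otimes\gamma^{0})\,\Gamma^{k} = a\,\alpha_k\otimes I + (1-a)\,I\otimes\alpha_k$ for $\Gamma^{k} = a\,\gamma^{k}\otimes\gamma^{0} + (1-a)\,\gamma^{0}\otimes\gamma^{k}$. The sketch below checks it with explicit matrices. It is an illustration only: it assumes the Dirac representation of the gamma matrices (the text does not fix a representation), uses an arbitrary value of the weight $a$, and all helper names are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def lincomb(c1, A, c2, B):
    """Entrywise c1*A + c2*B."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def neg(A):
    return [[-x for x in row] for row in A]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
G0 = block(I2, Z2, Z2, neg(I2))                      # gamma^0 (Dirac representation)
GAMMA = [block(Z2, s, neg(s), Z2) for s in SIGMA]    # gamma^k
ALPHAS = [matmul(G0, g) for g in GAMMA]              # alpha_k = gamma^0 gamma^k


def reduced_gamma(a, k):
    """(gamma^0 x gamma^0) Gamma^k with Gamma^k = a g^k x g^0 + (1-a) g^0 x g^k."""
    gamma_k = lincomb(a, kron(GAMMA[k], G0), 1 - a, kron(G0, GAMMA[k]))
    return matmul(kron(G0, G0), gamma_k)


def expected(a, k):
    """a alpha_k x I + (1-a) I x alpha_k, the coefficient of P_k in (24)."""
    return lincomb(a, kron(ALPHAS[k], I4), 1 - a, kron(I4, ALPHAS[k]))


def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))
```

The same helpers show that $(\gamma^{0}\otimes\gamma^{0})^{2}$ is the 16-dimensional identity, which is why multiplying (21) by $\gamma^{0}\otimes\gamma^{0}$ isolates $P_0$.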
The first term represents the Hamiltonian of the center of mass; the second and fourth terms constitute the relative Hamiltonian. The third term can be put either in the center of mass part or in the relative part, or may be divided between them. In order to put the wave equation (20) in a covariant form, we introduce an arbitrary time-like four-vector $n$, and $r_\perp = \left[((x_1 - x_2)\cdot n)^{2} - (x_1 - x_2)^{2}\right]^{1/2}$. Hence we write equation (20) in the form

\[
\left[(\gamma^\mu p_{1\mu} - m_1)\otimes\gamma\cdot n + \gamma\cdot n\otimes(\gamma^\mu p_{2\mu} - m_2) + V(r_\perp)\right]\phi = 0.
\tag{25}
\]

If we take $n = (1, 0, 0, 0)$, then $r_\perp$ reduces to the usual radial coordinate $r$, and $\gamma\cdot n = \gamma^{0}$. Thus equation (25) reduces to equation (20). We have shown that, after the elimination of the electromagnetic field in the interaction of two charged particles $\psi_1, \psi_2$, the resultant action can be written entirely in terms of a composite field $\phi(x, y)$ and total time derivatives of $\phi$, if we use the retarded Green's function. Here $x$ and $y$ have light-like separation.
3 Solutions
Our task now is to search for stationary states of equation (20). After the angular dependence is separated out, this leads to sixteen coupled equations, among the 16-components of the field
algebraic. These equation are: first set: (£ _ 2s^Ul +
Mvi
+ 2{dT + \)Z2 - 2+Z0 = 0
(E - ^)u00 + Mv00 = 0 ( £ _ 2e^)z2 + Amy2 - 2(dr + i ) U l = 0 (E - ^)Z0 + AMy0 - f)Ul = 0 (E - 4&a) U l
Evi + Mux + MvlVoo + Mum Ey2 + AMZ2 + Ey0 + AMZ0 +
=0 - 2(dr + Z)y0 + ^y2 ^-vm = 0 aruoo = 0
=0 (26)
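The second set (which we shall not write) is obtained from the first set, and vice versa, by the following symmetry substitutions:

\[
M \leftrightarrow \Delta M,\quad y_0 \leftrightarrow v_0,\quad Z_0 \leftrightarrow u_0,\quad u_{00} \leftrightarrow -Z_{00},\quad v_{00} \leftrightarrow -y_{00},\quad u_1 \leftrightarrow -Z_1,\quad v_1 \leftrightarrow -y_1,\quad Z_2 \leftrightarrow u_2,\quad y_2 \leftrightarrow v_2.
\tag{27}
\]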
Clearly the set (26) contains four algebraic equations, which can be used to express four of the unknown functions in terms of the remaining four in that set, leaving four ordinary linear differential equations to be solved. However, the resulting set of four equations does not possess simple series solutions. In fact, by eliminating (e.g.) $u_{00}$, $Z_0$, $v_1$, $y_2$ and substituting

\[
u_1 = \sum_{n=0} a_n r^{n}, \quad \ldots, \quad v_{00} = \sum_{n=0} b_n r^{n}
\tag{28}
\]

in the resulting four differential equations, we obtain 3-term and 4-term recurrence relations which are difficult to solve for $a_n, \ldots, b_n$. One may try to eliminate more component functions by forming second order differential equations out of the set (26). However, such second order differential equations are more singular, and simple series solutions of the form (28) do not exist. Functional series solutions are suggested, and this seems to work. The basic idea is to replace $\{r^{n}\}$ in (28) by an appropriate complete set of functions $\{f_n\}$, and hope that the series terminates. As an example, in the following we shall consider the case of two free Dirac particles.

Two free Dirac particles: The first set (26) is obtained by putting $e_1 = e_2 = 0$. Again eliminating the components $u_{00}$, $Z_0$, $v_1$ and $y_2$, we obtain the following four second order ordinary linear differential equations:

\[
\frac{d^{2}u_1}{d\rho^{2}} + \left(1 - \frac{J^{2}}{\rho^{2}}\right)u_1 = 0, \qquad
\frac{d^{2}v_{00}}{d\rho^{2}} + \left(1 - \frac{J^{2}}{\rho^{2}}\right)v_{00} = 0,
\]
2J 2
p3(e2_^)
duo . rLr 2 dp 1 l
J2\
P 2J
J2\rj p2j-2 2£ 2
2aJ n _^}2/U-U
(29)
p2(e2
i
2aJ
p2(62_^)jyu
p 2 ( 6 2
v _^}^
= 0
(30)
where r,
r.2
(£2-AM2)(£2-M2)
6m2 e 2
and (31)
^ = J ( J + 1)
The first two equations have the well-known regular (at $\rho = 0$) solutions $u_1 = \rho\,j_j(\rho)$ and $v_{00} = \rho\,j_j(\rho)$, where $j_n(\rho)$ is the spherical Bessel function. The form of the other two coupled equations suggests the solution:

\[
Z_2 = \sum_{n=0} A_n\,\rho\,j_{n+s}(\rho), \qquad y_0 = \sum_{n=0} B_n\,\rho\,j_{n+s}(\rho).
\tag{32}
\]
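That $\rho\,j_j(\rho)$ solves $u'' + (1 - J^{2}/\rho^{2})u = 0$ with $J^{2} = j(j+1)$ can be confirmed numerically for the lowest orders. The following sketch is illustrative only: it uses the elementary closed forms of the spherical Bessel functions for orders 0, 1, and 2 and a central finite difference for the second derivative; the function names and the step size are assumptions of this example.

```python
from math import cos, sin


def riccati_bessel(order, rho):
    """rho * j_order(rho) for order 0, 1, 2 from the elementary closed forms."""
    if order == 0:
        return sin(rho)
    if order == 1:
        return sin(rho) / rho - cos(rho)
    if order == 2:
        return (3 / rho ** 2 - 1) * sin(rho) - 3 * cos(rho) / rho
    raise ValueError("only orders 0, 1, 2 are implemented in this sketch")


def residual(order, rho, h=1e-3):
    """u'' + (1 - j(j+1)/rho^2) u, with u'' from a central difference of step h."""
    u_plus = riccati_bessel(order, rho + h)
    u_mid = riccati_bessel(order, rho)
    u_minus = riccati_bessel(order, rho - h)
    second = (u_plus - 2 * u_mid + u_minus) / h ** 2
    return second + (1 - order * (order + 1) / rho ** 2) * u_mid
```

The residual is zero up to the discretization error of the finite difference, which shrinks quadratically with `h` until round-off takes over.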
Substituting into (24), it can be shown that the indicial equations imply s = j — 1. Moreover, if A\ = 0 = Bi, then An = o = Bn for odd n. For even n, put n + 2 = 2m, and obtain the recurrence relations: (2m+j-3)(2m+j-4)-J 2 4 _ T2 (2m+j+3)(2m+j+3)-J 2 ,. {2(2m+j)}{2(2m+j)-3} ^ 2 m - 2 J {2(2m+j)+l}{2(2m+j)+3} yi 2m+2
+ [ e 2 {(2m + i)(2m + j - 1) - J 2 } - 2J*
^
^ j ^ J A -2aJB2m = 0
^
(2m+j-3)(2m+j-4)-J 2 p _ T2 (2m+j+2)(2m+j+3)-J 2 n {2(2m+j')-5}{2(2m+j)-3} • D 2m-2 J {2(2m+j)+l}{2(2m+j)+3} • D 2m+2
+[G 2 {(2m + i)(2m + j - 1) - J 2 } -
2J'^7^»g&.^}]gan, -2aJA2m = 0
(33)
Notice that for $m = 2$, the coefficient of $A_2$ in the first relation and that of $B_2$ in the second relation vanish. Hence if we assume that $A_4 = 0 = B_4$, then $A_{2m} = 0 = B_{2m}$ for $m > 2$. Because $A_{-2} = 0 = B_{-2}$, (32) leads to four equations in $A_0$, $B_0$, $A_2$, and $B_2$, only two of which are independent. The equations are:
2jTTft + (e!-27TT)B° + J T I ^ = 0 r*=2 1 ± ± ^ R a J 4 J . 1 ± 1 R -n (e "27TT )B2 "T^ 2 + 2lTT Bo - 0 keeping A3 and Bo arbitrary, we find
(
-
1±1R
j
\R
aJ
A
Chosing AQ, arbitrary and BQ — 0, we obtain the two solutions Z2 = A2pjj+i(p)
,
2/0 = BoPJj-i(p) +
B2pjj+i(p)
and Z 2 = A)/07i+i(/9) + A2pjj+i
,
yo = B2pjj-i
(34)
which can be checked by direct substitutions in (29). Interacting two Dirac particles: For the minimal coupling, neglecting the selp interaction, we have the set of equations (26). Again eliminating the components u0o,z0,Vi, and y2 we obtain the set of four ordinary differential equations: {{E + 2-f)2 - ^{E
+ 2f) - *£]U! + 2(E + 2f)Z2 +
2
-^Y0
= 0
2
[E{E + 2s) - AM ]Z 2 - 2EU[ - ^f^-Vao = 0 {(E - *)(E
+ &) - M2 - £(E
- ^)]V00 - 2(E - ^)(4£
+ *)
-2*g±(E-4?)Z2 =0 [E(E + **)- AM 2 ]y 0 + 2{E + 2-f){^ - **) + 2-^-Ux = 0
(35)
where $U_1, Y_0, Z_2$, and $V_{00}$ = $r u_1, r y_0, r z_2$, and $r v_{00}$. Due to the singularity, no simple series solutions exist. Performing a second differentiation, we still obtain four coupled second order ordinary differential equations. The main idea is to expand the coefficients of the unknown functions in powers of $\alpha$, keeping terms up to $\alpha^{2}, \alpha^{4}, \ldots$ as needed. The resulting equations can be solved exactly, thus giving the approximate component functions in closed form. For example, up to $\alpha^{2}$ the equations for the component functions $u_1$ and $v_{00}$ decouple and take the form:

\[
\left[\frac{d^{2}}{d\rho^{2}} - \frac14 + \frac{\lambda}{\rho} - \frac{J^{2}}{\rho^{2}}\right][u_1, v_{00}] = 0,
\tag{36}
\]

where $\rho = 2kr$,

\[
k^{2} = \frac{(M^{2} - E^{2})(E^{2} - \Delta M^{2})}{4E^{2}}
\quad\text{and}\quad
\lambda = \frac{\alpha}{4kE}\left(2E^{2} - M^{2} - \Delta M^{2}\right).
\tag{37}
\]

The regular solutions (at $\rho = 0$) of (36) are given by the hydrogenic functions $u_1 = R_{nj}(\rho)$, $v_{00} = R_{nj}(\rho)$, provided $\lambda = n$ (integer) and $n \ge j + 1$. This in turn implies the energy-mass relation

\[
E^{2} = \frac{M^{2} + \Delta M^{2}}{2} \pm \frac{M^{2} - \Delta M^{2}}{2}\left\{1 - \frac{\alpha^{2}}{2n^{2}} + \cdots\right\}.
\tag{38}
\]
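The quantization condition $\lambda = n$ and the resulting energy-mass relation can be checked against each other numerically. The sketch below is illustrative: it assumes $M = m_1 + m_2$ and $\Delta M = m_1 - m_2$, uses the unexpanded factor $(1 + \alpha^{2}/n^{2})^{-1/2}$ (whose expansion begins $1 - \alpha^{2}/2n^{2}$) that follows from solving $\lambda = n$ for $E^{2}$, and takes an approximate numerical value for the fine structure constant; the function names are hypothetical.

```python
from math import sqrt

ALPHA = 1 / 137.035999  # fine structure constant (approximate value, assumed here)


def energy_squared(m1, m2, n, alpha=ALPHA, sign=+1):
    """E^2 from the condition lambda = n, before expanding in powers of alpha."""
    M, dM = m1 + m2, m1 - m2
    mean = 0.5 * (M * M + dM * dM)
    half_gap = 0.5 * (M * M - dM * dM)
    return mean + sign * half_gap / sqrt(1 + (alpha / n) ** 2)


def coulomb_parameter(m1, m2, E, alpha=ALPHA):
    """lambda = alpha (2E^2 - M^2 - dM^2) / (4 k E), with k taken from k^2 > 0."""
    M, dM = m1 + m2, m1 - m2
    k = sqrt((M * M - E * E) * (E * E - dM * dM)) / (2 * E)
    return alpha * (2 * E * E - M * M - dM * dM) / (4 * k * E)
```

For equal masses $m$ the upper sign gives $E \approx 2m - m\alpha^{2}/4n^{2}$, the Bohr levels of positronium with reduced mass $m/2$.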
The two components $Y_0$ and $Z_2$ satisfy two coupled second order differential equations of more complicated structure, which can be solved by developing power series solutions in which $\{r^{n}\}$ is replaced by $\{R_{nj}\}$. Up to $\alpha^{4}$ we obtain, for the set (34), four coupled second order ordinary differential equations. Two of them are:
i
a dm
, r
1 _i_ A
~W + ~p~dj + l~4 + ~p
J2—a2
~?
aAM 1 „.
n
"W^00 - U - ^ i
= 0
(39)
where A, p, and k as before (see eq. (36)), , „ AM2-M2 and SE = r
AakE a =
+
AM2M2 ~
(40)
Comparing the set of equations (38) with the set of equations (35), we suggest the solutions:

\[
u_1 = \sum_{\ell=0} A_\ell\,w_{\ell+s}(\rho) \quad\text{and}\quad v_{00} = \sum_{\ell=0} B_\ell\,w_{\ell+s}(\rho),
\tag{41}
\]

where we have put $R_{n\ell}(\rho) \equiv w_\ell(n, \rho)$. The following recurrence relations (written for the first time) are needed:
{i + 1} =
^
=(
£i2
1 wt - -Vn2 p z
-
l2wi-i
i " 7^+1 " l\ln2-V+l)2w*
(42)
Again following the same steps as before, we show that the solutions of (38) take the form
Ui=
J2
n—j+1 A w
z t+h
and
v
oo = Yl
B w
e e+s
where the constants $A_\ell$ and $B_\ell$ for various values of $\ell$ depend on two arbitrary constants. In a separate publication [6], a new method for the determination of the eigenvalues and eigenfunctions of the system of equations (26) is developed. For the equal mass case, eigenfunctions are constructed for the states $N^{3}P_0, N^{3}S_1, N^{3}P_2, \ldots$, and $N^{3}D_1, N^{3}F_2, \ldots$, where $N$ is the appropriate principal quantum number in each case. The energy eigenvalues obtained are accurate to within terms of order $\alpha^{6}$. For more general cases, the work is in progress.

4 References
1. P.A.M. Dirac, Proc. Royal Soc. A126, 360 (1930).
2. G. Breit, Phys. Rev. 29, 553 (1929).
3a. N. Kemmer, Helv. Phys. Acta 10, 48 (1937).
3b. E. Fermi, C.N. Yang, Phys. Rev. 16, 1739 (1949).
4. A.O. Barut, S.R. Komy, Fortschr. Phys. 33, 6, 309 (1985).
5. A.O. Barut and N. Unal, Fortschr. Phys. 33, 319 (1985).
6. A.O. Barut, A.J. Bracken, S.R. Komy, and N. Unal, J. Math. Phys. 34(6), 2089 (1993).
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 373-386)
SINGULARITIES IN GENERAL RELATIVITY AND THE ORIGIN OF CHARGE K. BUCHNER Zentrum Mathematik der TU Munchen D- 80290 Munchen, Germany
Abstract d-spaces are a simple and very useful tool for the description of singularities in General Relativity. In the first part of this paper, we recall the basic definitions of the theory of d-spaces. Then a short review of the results is presented, which were obtained with this theory. It is shown that there are situations, where pointlike particles can pass through singularities. This is the case e.g. for the classical Big Bangs of a series of closed Friedmann universes. In Schwarzschild's solution, the problem of causality violation near the White Source can be solved, although some mild form of causality violation remains. Finally, the first example of a "wormhole" without exotic matter is presented. This means that an electric field proportional to 1/r2 is generated only by topology without any electric charge.
1 Introduction
The famous theorems by Hawking and Penrose (see, e.g. [15], [2]) state that most "reasonalbe" solutions to Einstein's equations contain some sort of singularity. In general, the tidal forces become infinite when an observer approaches such a singularity. This means that physicists who are curious and approach it too closely, are killed. So the question "How does a singularity look like inside" seems to be forbidden. Mathematically speaking, the solutions to Einstein's equations are usually considered as differential manifolds. By definition, they can not contain points in which the metric is singular. Here "singular" means e.g. that its determinant is zero or some scalars built from the curvature tensor are infinite. But one must keep in mind that all general definitions of a singularity
374 discussed so far lead to difficulties. However, this is not relevant here, as in the framework of d-spaces, singularities are considered as points (or sets of points) of space-time. So e.g. the problem with Geroch's definition is that the topology at the singularity is not well defined [10]. And Schmidt's b-boundary sometimes leads to very strange topologies [4], [17], [18]. But in the theory of d-spaces, a natural topology arises, which agrees very well with our expectations at least in the cases discussed as yet. Sometimes questions about the singularities themselves are important. Consider e.g. two closed Friedmann solutions, where the Final Collapse of one of them is identified with the Big Bang of the other. Here the question "Can pointlike particles pass through this singularity" is of practical importance. It means: "Can we get signals from the universe before ours?" However, very little is known about the state of matter immediately before and after the Big Bang. So ist is not clear, whether pointlike particles exist there at all. Still, if it were so, it would be very surprising. Similarly one may ask: "Where do the (pointlike) particles come from, whose geodesies begin in the White Source of Schwarzschild 's solution?" Of course, we can not see this White Source in our part of the world. But if the maximal analytic extention of Schwarzschild's solution makes any sense, this question must be answered. For the discussion of such problems, one needs a generalization of differential manifolds which can include singularities. It is not surprising that to most questions, practically all such theories give the same answers. This is so, because one is mainly interested in the geodesies passing through the singularities. They are completely determined by their starting point and the tangent vector in this point. 
Now, if in a suitable topology the non-singular points are dense in the total space-time, then continuity arguments are sufficient, and the generalization of differential manifolds serves only to put the results on firm ground. In the following, the basic definitions of the theory of d-spaces [11], [12], [13] are presented very briefly. It provides a simple way to treat the above-mentioned questions. A very similar, but not identical mathematical framework has been developed in [17], [19]. In the second part of this paper, some applications to the most common singularities in General Relativity are discussed. Another review on this subject with somewhat different content has been given in [3].
2 d-spaces
The basic idea of differential manifolds is to express functions and maps in local coordinates. More precisely: If $M$ is an $n$-dimensional differential manifold and $f$ is a real valued function on $M$, then $f \circ \varphi^{-1}$ is considered instead of $f$, where $\varphi$ is the local coordinate function of $M$. So all functions $f$ on $M$ are reduced to compositions of the real functions $g := f \circ \varphi^{-1}$ on $\mathbb{R}^n$ with the $n$ coordinate functions $\varphi^i : x \mapsto x^i$ on $M$. Already in 1967, R. Sikorski had the idea to replace the functions on $M$ (which can be defined via the coordinate functions
The following definition has the advantage that the dimension of the tangent spaces is independent of r (i.e. of the differentiability class of the functions $\alpha$ in the following definition, which may be greater than 1, if C is chosen accordingly):
Definition 2 Let $(M, C)$ be a d-space, $x \in M$, and $C_x$ the stalk at $x$. A map $V : C_x \to \mathbb{R}$ is called tangent vector to $(M, C)$ in $x$, if for all $n \in \mathbb{N}$, all $f_1, \ldots, f_n \in C_x$, and all germs $\alpha$ of $C^1(\mathbb{R}^n, \mathbb{R})$ at $y := (f_1(x), \ldots, f_n(x)) \in \mathbb{R}^n$, the equation

$$V(\alpha \circ (f_1, \ldots, f_n)) = \sum_{i=1}^{n} (\partial_i \alpha) \cdot V(f_i)$$

holds, provided $\alpha \circ (f_1, \ldots, f_n) \in C_x$. Here $\partial_i \alpha$ denotes the partial derivative of $\alpha$ w.r.t. the i-th argument (taken at $y$). The vector space of all tangent vectors to $(M, C)$ in $x$ is called tangent space $T_x M$.

The definition of differential forms is not straightforward, but can be done in a consistent way [11], [12]. It will not be needed in the following. The integration causes more problems: If one wants to integrate a vector field, the result should be a curve. But what does this mean? A curve is a continuous map of a real interval to the d-space. Such maps may not exist, as can be seen if the d-space is a Julia set. Still, for the simplest examples, existence and uniqueness of the integration can be proved [8]. Fortunately, these proofs cover the definition of geodesics in singularities. But in the cases discussed below, continuity arguments are sufficient. So even these theorems are not needed.

Before d-spaces had been introduced, there was the problem that Schmidt's b-boundary gave strange topologies ([4], [10], [18]). E.g. the only neighbourhood of the Big Bang was the total Friedmann space-time. It is a great advantage of d-spaces that they yield a topology which is suitable for all applications studied so far. Note that in the definition of a d-space, one starts from a given topology which cannot be coarser than the initial topology of the functions in C. But it may be too fine:

Definition 3 Let $(M, \tau)$ be a topological space, and $\mathcal{F}$ a sheaf of local functions $M \to \mathbb{R}$ defined w.r.t. the topology $\tau$. A topology $\sigma$ on $M$ that is coarser than $\tau$ is called a slackening of $\mathcal{F}$, if for every $V \in \tau$ and every $f \in \mathcal{F}(V)$, there are $U \in \sigma$, $U \supset V$, and $g \in \mathcal{F}(U)$ such that $f = g|_V$ holds.

It is shown in [11], [13] that there exists a coarsest slackening $\mu$ such that all functions $g \in \mathcal{F}(U)$, $U \in \mu$, are continuous. This $\mu$ is called initial topology of $(M, C)$. This topology should be used for the discussion of singularities.

Loosely speaking, Geroch defines a singularity as a set of points where geodesics begin or end [10]. This is one of the reasons why it is sometimes useful to glue two different space-times or two parts of the same space-time together in their singularities: In many cases it is possible to prolong the geodesics to the newly attached region.

Definition 4 We say that two d-spaces $M_1$ and $M_2$ (or two parts of the same d-space) are glued together along $B_1 \subset M_1$ and $B_2 \subset M_2$, if
1. $M_1 - B_1$ and $M_2 - B_2$ are $C^3$-manifolds.
2. There is a continuous map $f : B_1 \to B_2$ such that each geodesic $g$ ending in $x \in B_1$ is mapped to a geodesic that starts in $f(x)$, and each geodesic ending in $f(x)$ to one that starts in some preimage of $f(x)$. "Gluing" means to identify $f(x)$ with all its preimages $x$, and to consider the geodesics in $M_1$ and the corresponding geodesics in $M_2$ as one and the same geodesic. (Therefore the above mappings of geodesics must be one to one.)
3. On these geodesics $g$, a parameter $\tau$ can be chosen in such a way that the tangent vector does not vanish: $dg(\tau)/d\tau \neq 0$ for all $g(\tau) \in B_1$ and all $g(\tau) \in B_2$. It is not necessarily the arc length, but on $M_1 - B_1$ and on $M_2 - B_2$, there must be an admissible $C^3$-parameter transformation from the arc length to $\tau$.
4. In addition, there must be atlases of $M_1 - B_1$ and of $M_2 - B_2$ and a choice of this parameter $\tau$ on each geodesic such that it is a $C^3$-curve on $M_1 \cup M_2\,(B_1 \cup B_2)$.

This definition is similar to that of [3]. It is much more restrictive than the earlier ones [1], [20], [21], [22].
But it allows one to discuss the continuation of geodesics and the introduction of local coordinates similar to geodesic coordinates. (This is the reason why $C^3$ is required, although $C^2$ would be sufficient for the definition of a geodesic.) Of course, definition 4 could easily be generalized such that more general singularities could be included, e.g. the "fork" in

$$\{(x^1, x^2) \in \mathbb{R}^2 \mid x^1 < 0 \text{ and } x^2 = 0\} \cup \{(x^1, x^2) \in \mathbb{R}^2 \mid x^1 = 0 \text{ and } x^2 > 0\} \cup \{(x^1, x^2) \in \mathbb{R}^2 \mid x^1 > 0 \text{ and } x^2 = -x^1\}.$$

But the necessary changes are obvious, and would complicate the above formulation.
3 A simple example
When a geodesic, i.e. a pointlike particle, passes through a singularity, the conservation laws for momentum, energy, and mass are not necessarily satisfied. This can be seen from a simple example: Consider the (flat) manifolds

$$M_1 := \{(x^1, x^2, x^3, x^4, x^5) \in \mathbb{R}^5 \mid x^5 < 0 \text{ and } 2x^4 = x^5\},$$
$$M_2 := \{(x^1, x^2, x^3, x^4, x^5) \in \mathbb{R}^5 \mid x^5 > 0 \text{ and } x^4 = 0\}.$$

On these manifolds, we introduce a pseudo-Riemannian metric whose tensor has the components diag(+1, +1, +1, +1, -1). The two spaces are glued together along

$$B_1 := B_2 := \{(x^1, x^2, x^3, x^4, x^5) \in \mathbb{R}^5 \mid x^4 = x^5 = 0\}.$$

The energy of a pointlike particle moving on a curve $x(\tau)$ is proportional to $\dot{x}^5$, and the square of its mass proportional to $(\dot{x}^5)^2 - (\dot{x}^1)^2 - (\dot{x}^2)^2 - (\dot{x}^3)^2 - (\dot{x}^4)^2$. We normalize the masses such that for the particle under consideration, the constant of proportionality is one. Consider the geodesics composed of two parts: For $x^5 < 0$, we choose the lines with $x^i = \text{const}$, $i = 1, 2, 3$; $2x^4 = x^5$, and for $x^5 > 0$ the $x^5$-lines. Without loss of generality, one may put $\tau = x^5$ for $x^5 > 0$. Then energy conservation requires that $dx^5/d\tau = 1$ holds also on $M_1$. Therefore $dx^4/d\tau = 1/2$, and the mass of the particle is $\sqrt{1 - 1/4} = \sqrt{3}/2$, in contrast to its mass 1 on $M_2$. This shows that it is not possible to conserve both energy and mass at the same time. Of course, the non-conservation of energy and momentum is to be expected in this case [5]. It should be kept in mind that the conditions of definition 4 can be satisfied by adjusting the parameter $\tau$. But it is by no means clear that the mass is constant, while energy and momentum are not conserved. To answer this question, definition 4, which is purely mathematical, is not sufficient. Some knowledge about the physical processes in the "edge" of space-time is necessary (cf. also the example in [6]). Indeed, mass conservation leads to strange results: At the edge, a particle travelling on the geodesic in the direction of increasing $x^5$-values would be slowed down, whereas a particle moving in the opposite direction would be accelerated.
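The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the helper names `energy` and `mass_squared` are illustrative and do not come from the paper, and the constant of proportionality is set to one, as in the text.

```python
import math

# Metric components diag(+1, +1, +1, +1, -1) on R^5, as in the example.
METRIC = (1.0, 1.0, 1.0, 1.0, -1.0)


def energy(velocity):
    """Energy of the particle: the component dx^5/dtau (proportionality constant 1)."""
    return velocity[4]


def mass_squared(velocity):
    """(dx^5)^2 - (dx^1)^2 - ... - (dx^4)^2, i.e. minus the metric norm of dx/dtau."""
    return -sum(g * v * v for g, v in zip(METRIC, velocity))


# On M2 the geodesic is an x^5-line parametrized by tau = x^5.
velocity_m2 = (0.0, 0.0, 0.0, 0.0, 1.0)

# On M1 the geodesic satisfies 2 x^4 = x^5; energy conservation fixes
# dx^5/dtau = 1 and therefore dx^4/dtau = 1/2.
velocity_m1 = (0.0, 0.0, 0.0, 0.5, 1.0)

mass_m2 = math.sqrt(mass_squared(velocity_m2))
mass_m1 = math.sqrt(mass_squared(velocity_m1))
print(mass_m2, mass_m1)  # the two masses differ although the energy is the same

# Alternative: keep the mass fixed by rescaling the velocity on M1.
scale = mass_m2 / mass_m1
velocity_m1_same_mass = tuple(scale * v for v in velocity_m1)
print(energy(velocity_m1_same_mass))  # the energy then differs from 1
```

With the energy conserved, the mass on $M_1$ comes out as $\sqrt{3}/2$; if the mass is conserved instead, the energy on $M_1$ becomes $2/\sqrt{3}$, so one of the two quantities has to change at the edge.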
4 Schwarzschild space-time
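In Kruskal coordinates, the maximal analytic extension of the exterior Schwarzschild solution is

(1) $$ds^2 = \frac{32 M^3}{r}\, e^{-r/(2M)} \left(du^2 - dv^2\right) + r^2 \left(d\vartheta^2 + \sin^2\vartheta \, d\varphi^2\right),$$

where the "radius" $r$ and the "coordinate time" $t$ are expressed by $u$ and $v$ as

(2) $$\left(\frac{r}{2M} - 1\right) e^{r/(2M)} = u^2 - v^2, \qquad t = \begin{cases} 4M \tanh^{-1}(v/u) & \text{for } u > |v| \text{ or } u < -|v|, \\ 4M \tanh^{-1}(u/v) & \text{for } v > |u| \text{ or } v < -|u|. \end{cases}$$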
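Relation (2) defines $r$ only implicitly. As a minimal sketch (the function names `kruskal_radius` and `kruskal_time` and the bisection tolerance are choices made here, not part of the text), $r$ can be recovered numerically from $(u, v)$, because $(x - 1)e^x$ is increasing for $x = r/(2M) \ge 0$:

```python
import math


def kruskal_radius(u, v, mass=1.0, tol=1e-12):
    """Solve (r/(2M) - 1) * exp(r/(2M)) = u^2 - v^2 for r >= 0 by bisection."""
    target = u * u - v * v
    if target < -1.0:
        raise ValueError("no solution with r >= 0: v^2 - u^2 exceeds 1")

    def residual(x):
        # x stands for r / (2M); (x - 1) * exp(x) is increasing for x >= 0.
        return (x - 1.0) * math.exp(x) - target

    lower, upper = 0.0, 1.0
    while residual(upper) < 0.0:  # enlarge the bracket until it contains the root
        upper *= 2.0
    while upper - lower > tol:
        middle = 0.5 * (lower + upper)
        if residual(middle) < 0.0:
            lower = middle
        else:
            upper = middle
    return 2.0 * mass * 0.5 * (lower + upper)


def kruskal_time(u, v, mass=1.0):
    """Coordinate time t from (2); it is not finite on the lines |u| = |v|."""
    if abs(u) > abs(v):
        return 4.0 * mass * math.atanh(v / u)
    if abs(v) > abs(u):
        return 4.0 * mass * math.atanh(u / v)
    raise ValueError("t is not finite on the lines u = +-v")


print(kruskal_radius(0.0, 1.0))  # a point on the hyperbola v^2 - u^2 = 1
print(kruskal_radius(1.0, 1.0))  # a point on the line u = v
print(kruskal_radius(2.0, 0.0), kruskal_time(2.0, 0.0))
```

For $M = 1$ the first call returns a radius that vanishes up to rounding, and the second returns $2M$, so the hyperbola $v^2 - u^2 = 1$ carries $r = 0$ and the lines $u = \pm v$ carry $r = 2M$.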
Equation (1) shows that the points $r = 0$ define a singularity of the metric. According to (2), they form the hyperbola $v^2 - u^2 = 1$. Fig. 1 shows more details. There are four regions I, ..., IV which are separated by the horizons $r = 2M$, i.e. $u = \pm v$. All future directed (i.e. $dv > 0$) causal lines in region I must end in the upper part of the hyperbola, i.e. in the part with $v > 0$. Therefore this is a Black Hole. Similarly, all future directed causal lines in region II start from the part of the hyperbola with $v < 0$. Therefore it is a White Source. This is very unsatisfactory, because all particles in this region are created "without any reason" from nothing. So causality is broken in a very bad way. Of course, we who live in the asymptotically flat region III cannot see this, because all lines coming from region II have to cross a horizon with coordinate time $t = -\infty$. Nevertheless, we think that in classical physics, causality has to be respected everywhere. This suggests that the Black Hole should be glued to the White Source. It is not difficult to verify (see [6]) that the points $(u, v, \vartheta, \varphi)$ of the Black Hole may be identified with the points $(-u, -v, \vartheta,$ [...] Then the [...] above Kruskal coordinates) of their tangent vectors when they emerge from the White Source.
Figure 1: Maximal analytic extension of the Schwarzschild space-time. The dark lines show a photon starting in region III, which is absorbed in the Black Hole, emitted from the White Source, and, after being scattered in region IV, returns to region III.
Fig. 1 shows such a light ray. The photon starts in "our world", i.e. in region III. It falls radially into the Black Hole, and continues to travel from the White Source in region II to region IV. So the two asymptotically flat regions III and IV can communicate. It may happen that the photon is scattered by a mirror in region IV in such a way that it falls again radially into the Black Hole. Then, as shown in Fig. 1, it finally reaches region III and intersects its own trajectory. This is a mild form of causality violation, as the photon crosses the horizons $t = -\infty$ four times. One could interpret this trajectory as a photon which is emitted from an observer at some coordinate time $t_1$. The photon intersects the trajectory of the observer again at an earlier time $t_2 < t_1$. So in principle, the observer gets information about his future. But he thinks that the photon comes from the past $t = -\infty$.
The question about the topology can easily be answered by definition 3: Clearly, the local coordinates of the geodesics (Kruskal coordinates as a function of some suitable parameter) belong to the differential structure. Therefore the open neighbourhoods of a singular point $p$ are simply the open sets containing $p$ in the Kruskal coordinate space. For details see [1], [6].
5 The closed Friedmann Universe
The standard model of cosmology describes the beginning (or the end) of our universe by the "radiation filled Friedmann model". At present, it is not clear whether our universe is expanding forever or whether it will finally collapse again. Here we treat the latter case. It is generally assumed that after such a final collapse, a new universe starts. In classical cosmology, a closed Friedmann universe can be followed by an open one, because the universe loses all its information in the singularity. This is not possible, however, when d-spaces are used [16], because the singularity is a point of this space. Therefore the pulsating universe is a connected set, and the initial condition of the solution to Friedmann's equation cannot change. So a closed universe can be followed only by a closed one. The metric of the closed radiation filled Friedmann universe is

(3) $$ds^2 = S^2(\eta)\left(-d\eta^2 + d\chi^2 + \sin^2\chi \,(d\vartheta^2 + \sin^2\vartheta \, d\varphi^2)\right), \qquad \chi \in [0, \pi]; \quad \vartheta \in [0, \pi]; \quad \varphi \in [0, 2\pi),$$

where the function $S$ is given by

(4) $$S(\eta) := a \sin\eta; \qquad t = a(1 - \cos\eta); \qquad \frac{dt}{d\eta} = S(\eta); \qquad a \in \mathbb{R}.$$

Here $\eta = t = 0$ denotes the Big Bang and $\eta = \pi$ the final collapse. The "world radius" shrinks to zero in both points. As equation (3) suggests, it is natural to assume that the final collapse is identified with the Big Bang of the "next" universe with $\eta \in [\pi, 2\pi]$. But this has to be verified in view of definition 4 above. It is well known that the distance of an arbitrary point $p$ to the Big Bang and the final collapse is finite. And all geodesics which reach these singularities do so with a finite affine parameter. Moreover, equations (3) and (4) show that the metric is symmetric under reflections of $\eta$ at $\eta = 0, \pi, 2\pi, \ldots$
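The two properties of (4) used here, $dt/d\eta = S(\eta)$ and the reflection symmetry of the metric (which contains only $S^2$), can be checked numerically. The following is a small sketch; the function names, the finite-difference step and the sample offsets are choices made for this illustration.

```python
import math


def scale_factor(eta, a=1.0):
    """S(eta) = a sin(eta), equation (4)."""
    return a * math.sin(eta)


def cosmic_time(eta, a=1.0):
    """t(eta) = a (1 - cos(eta)), equation (4)."""
    return a * (1.0 - math.cos(eta))


def derivative_error(eta, a=1.0, step=1e-6):
    """|dt/d(eta) - S(eta)|, with dt/d(eta) from a central finite difference."""
    slope = (cosmic_time(eta + step, a) - cosmic_time(eta - step, a)) / (2.0 * step)
    return abs(slope - scale_factor(eta, a))


def reflection_defect(eta0, a=1.0, samples=60, spacing=0.05):
    """Largest |S^2(eta0 + d) - S^2(eta0 - d)| over a few offsets d."""
    return max(
        abs(scale_factor(eta0 + k * spacing, a) ** 2
            - scale_factor(eta0 - k * spacing, a) ** 2)
        for k in range(1, samples + 1)
    )


print(derivative_error(1.0))           # close to zero
print(reflection_defect(math.pi))      # close to zero: S^2 is symmetric about eta = pi
print(cosmic_time(math.pi, a=3.0))     # time from Big Bang to final collapse, 2a
```

Note that $S$ itself changes sign at $\eta = \pi$; only $S^2$, which is what enters (3), is symmetric. The last line shows that the time between Big Bang and final collapse is the finite value $2a$.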
Therefore the geodesic equations are also symmetric about the singularities. It remains only to show that there is a parameter for these geodesics such that the tangent does not vanish and the geodesic is $C^3$ in the singularities. The easy calculation is done in [1] and [6]. The topology at the singularities is found in the same way as above. The topology can be generated by the open sets not containing a singularity, and by the balls at the singularities with radius $\eta = \varepsilon$, where $\varepsilon$ is a rational number. There are two interesting facts to notice: First, the singularities of Friedmann space-time are single points. This statement is intuitively clear, and can be made precise by a discussion of the geodesics [1]. A more surprising result is that the dimension of the Friedmann singularities is not four, but five: Consider

(5) $$f \mapsto \lim_{\eta \to 0} \frac{f(\chi + a, \vartheta, \varphi, \eta) - f(\chi, \vartheta, \varphi, \eta)}{\| x(\chi + a, \vartheta, \varphi, \eta) - x(\chi, \vartheta, \varphi, \eta) \|}, \qquad a = \pi/2,$$

where the denominator is the distance of the points with the given coordinates, and the $\chi$-values have to be taken modulo $\pi$. This defines a tangent vector in $\eta = 0$. A similar vector can be defined by interchanging the role of $\vartheta$ and $\chi$ in (5). In addition, if this formula is applied to $\varphi$ instead of $\chi$, two additional independent tangent vectors are defined for $a = \pi/2$ and $a = \pi$. Furthermore, one may consider a $C^1$-curve on $M$ ending in $\eta = 0$. The derivative in the direction of this curve (defined by a sequence of points on this curve) is independent of the above vectors. The intuitive meaning of these tangent vectors becomes clear if one considers the embedding of the space-time into a 5-dimensional flat space. There the singularities are cones.
6 The origin of charge
Almost 50 years ago, Archibald Wheeler showed [28] how charge can be generated by topology: Charge is the property of space that the divergence of some field (the dielectric field in the electromagnetic case) does not vanish. Consider the field of an electric dipole, i.e. the field of two electric point charges, one positive and one negative, of equal absolute value. Now the two points where the charges are located are identified with each other. This has the consequence that the field lines no longer end in these points, but are continued to the other part of space-time. So the divergence can be made zero, and the charges disappear, although the electric field is not altered. Only the topology has been changed. Wheeler formulated this idea of a "wormhole" at a time when not many details were known about exact solutions to Einstein's equations. Recently, his proposal has been frequently discussed in connection with the maximal analytic extension of space-times (a very readable review is [27]). But now, the goal of the authors is not to explain charge by topology, but to find "traversable" wormholes, i.e. paths to other parts of the world which otherwise would not be accessible (see, e.g. [26]). In particular, "traversable" means that no horizon has to be crossed. On the other hand, the Reissner-Nordström solution shows that charge is hidden from our part of the world even behind two horizons. Therefore we cannot expect that Wheeler's idea of replacing charge by topology works with traversable wormholes. It is surprising that such a wormhole can be realized with only one pointlike charge. The reason is that gravity produces an infinite series of singularities at the "position" of the charge: The solution to the Einstein-Maxwell equations with one pointlike mass $M$ and electric charge $Q$ is the famous Reissner-Nordström space-time. In case that $M$ is larger than $Q$ (in natural units), the maximal analytic extension of the metric is given by [9]

(6) [line element in the coordinates $(U, V, \vartheta, \varphi)$; the explicit expression is garbled in the source]

Here $r_+$ and $r_-$ denote the two horizons. They are determined by the mass and the charge of the Black Hole. Further, one defines $\alpha := (r_+ - r_-)/(2 r_+^2) > 0$; $\beta := r_-^2 / r_+^2$. The "radius" $r$ is implicitly defined by a relation between $\tan U \tan V$ and $|r - r_+|\,|r - r_-|^{-\beta}$, with one branch for $r > r_+$ and $0 < r < r_-$ and another for $r_- < r < r_+$ [the explicit prefactors are garbled in the source].
Of course, the electric field is proportional to $1/r^2$. The Penrose diagram of this space-time is shown in Fig. 2. The singularities $r = 0$ can be reached by spacelike and null geodesics, which end there. Therefore definition 4 suggests to identify pairs of singular points in such a way that the geodesics can be continued. A straightforward calculation [7] shows that this is possible if points with the same value of $U + V$ in $A_n$ and in $A'_{n+1}$ are glued together. Note that the $r$-lines are geodesics, which at the same time are also the electric field lines. And the matter is homogeneous; therefore the electric and the dielectric fields are proportional to each other. This means that also the divergence of the dielectric field, i.e. the charge, vanishes, although an observer in an asymptotically flat region $C_n$ or $C'_n$ only sees a field of a point charge proportional to $1/r^2$.

Figure 2: Maximal analytic extension of the Reissner-Nordström space-time. Dashed lines: $r$ = const. The solid line shows a photon starting in $C_n$ and passing to $C'_{n+1}$ through a singularity.

Surprisingly, it is not necessary to glue singular points. It is also possible to identify points with $r = \varepsilon > 0$ and the same value of $U + V$ in $A_n$ and $A'_{n+1}$, if all points with $r < \varepsilon$ are deleted. In this case, no exotic matter appears at the surface $r = \varepsilon$, provided $\varepsilon$ is small enough. This can be seen by a straightforward calculation of the surface energy and momentum [7], which are well defined and finite for $\varepsilon \to 0$, whereas the electromagnetic self-energy tends to infinity. So one has the choice: Either one glues the space-time in the singularities. This has the advantage that no surface terms appear. If instead points with finite radius $\varepsilon > 0$ are glued together, the singularities in the metric and an infinite electromagnetic energy are avoided.
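The quantities $r_+$, $r_-$, $\alpha$ and $\beta$ used in this section are easy to tabulate. The sketch below assumes the standard expression $r_\pm = M \pm \sqrt{M^2 - Q^2}$ for the horizons in natural units, which is not written out in the text; the function names are illustrative.

```python
import math


def horizons(mass, charge):
    """Return (r_plus, r_minus), assuming r_pm = M +- sqrt(M^2 - Q^2) in natural units."""
    if abs(charge) >= mass:
        raise ValueError("the case discussed in the text requires M > |Q|")
    root = math.sqrt(mass * mass - charge * charge)
    return mass + root, mass - root


def alpha_beta(r_plus, r_minus):
    """alpha = (r_+ - r_-) / (2 r_+^2) and beta = r_-^2 / r_+^2, as defined in the text."""
    alpha = (r_plus - r_minus) / (2.0 * r_plus ** 2)
    beta = r_minus ** 2 / r_plus ** 2
    return alpha, beta


r_plus, r_minus = horizons(mass=1.0, charge=0.6)
alpha, beta = alpha_beta(r_plus, r_minus)
print(r_plus, r_minus)  # outer and inner horizon
print(alpha, beta)      # alpha > 0 and 0 < beta < 1 whenever M > |Q| > 0
```

For $M = 1$, $Q = 0.6$ this gives $r_+ = 1.8$ and $r_- = 0.2$ up to rounding, so that $r_+ + r_- = 2M$ and $r_+ r_- = Q^2$.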
References
[1] M. Abdel-Megied, K. Buchner, R.M.M. Gad: Topologie und Verklebung singulärer Raum-Zeiten. Proc. 4th Intern. Congr. Geometry, N. K. Artemiadis and N. K. Stephanidis ed., Thessaloniki 1996, 57 - 68
[2] J. K. Beem, P. E. Ehrlich: Global Lorentzian geometry. Marcel Dekker, Inc., New York and Basel 1981
[3] A. Beigel, K. Buchner: d-spaces, singularities, and the origin of charge. To be published
[4] B. Bossardt: On the b-boundary of the closed Friedmann model. Comm. Math. Phys. 46 (1976), 263 - 268
[5] K. Buchner: A remark on energy and momentum in embedded spacetimes. Progr. Theor. Phys. 46 (1971), 1946 - 1947
[6] K. Buchner: Differential spaces and singularities of space-time. General Mathematics 5 (1997), 53 - 66
[7] K. Buchner: 1/r - Potential ohne Ladung. To be published
[8] K. Buchner, K. Büschel: Dynamical systems on differential spaces. Proc. 23rd National Conf. on Geom. and Topol., Cluj-Napoca 1993, 30 - 37
[9] S. Chandrasekhar: The mathematical theory of Black Holes. Oxford University Press, Oxford 1998
[10] R. Geroch: Local characterization of singularities in General Relativity. Journ. Math. Phys. 9 (1968), 450 - 468
[11] M. Gerstner: d-Räume. Eine Verallgemeinerung der Differentialräume mittels Funktionsgarben. Dissertation, TU München 1995
[12] M. Gerstner, K. Buchner: Differential spaces based on local functions. Analele Ştiinţ. ale Univ. "Ovidius" Constanţa, Ser. Mat. III (1995), 37 - 45
[13] M. Gerstner, K. Buchner: The topology of differential spaces. Analele Ştiinţ. ale Univ. "Al. I. Cuza", Iaşi, 42 (Supliment) (1996), 101 - 111
[14] J. Gruszczak, M. Heller, and Z. Pagoda: Cauchy boundary and b-incompleteness of space-time. Intern. Journ. Theor. Phys. 30 (1991), 555 - 565
[15] S. Hawking, G. F. R. Ellis: The large scale structure of space-time. Cambridge University Press, Cambridge 1976
[16] M. Heller, W. Sasin: Generalized Friedmann's equation and its singularities. Acta Cosmologica XIX (1993), 23 - 33
[17] M. Heller, W. Sasin: Structured spaces and their application to relativistic physics. Journ. Math. Phys. 36 (1995), 3644 - 3663
[18] R. A. Johnson: The bundle boundary in some special cases. J. Math. Phys. 18 (1977), 898 - 902
[19] M. A. Mostow: The differential space structures of Milnor classifying spaces, simplicial complexes, and geometric realizations. Journ. Diff. Geom. 14 (1979), 255 - 293
[20] W. Sasin: Geometrical properties of gluing of differential spaces. Demonstratio Mathematica 24 (1991), 635 - 656
[21] W. Sasin: Gluing of differential spaces. Demonstratio Mathematica 25 (1992), 361 - 384
[22] W. Sasin, K. Spallek: Gluing of differential spaces and applications. Math. Ann. 292 (1992), 85 - 102
[23] R. Sikorski: Abstract covariant derivative. Colloquium Mathem. 18 (1967), 252 - 272
[24] R. Sikorski: Differential modules. Colloquium Mathem. 24 (1971), 46 - 79; cf. also R. Sikorski: Wstęp do geometrii różniczkowej, Państwowe Wydawnictwo Naukowe, Warszawa 1972
[25] K. Spallek: Differenzierbare und holomorphe Funktionen auf analytischen Mengen. Math. Ann. 161 (1965), 143 - 162
[26] M. Visser: Quantum wormholes. Phys. Rev. D 43 (1991), 402 - 409
[27] M. Visser: Lorentzian wormholes. AIP Press and Springer-Verlag 1995
[28] A. Wheeler: Geons. Phys. Rev. 97 (1955), 511 - 536; cf. also: A. Wheeler: Einsteins Vision. Springer-Verlag 1968
Mathematics and the 21st Century Eds. A. A. Ashour and A.-S. F. Obada © 2001 World Scientific Publishing Co. (pp. 387-394)
The Inner Geometry of Light Cone in Gödel Universe
M. Abdel-Megied
Mathematics Department, Faculty of Science, Minia University, El-Minia, EGYPT
1 Introduction
Einstein's general theory of relativity, with its powerful mathematical instruments, enables us to investigate the local and global structure of our universe in any space-time $(M, g)$, where $M$ is a 4-dimensional manifold and $g$ is a Lorentz metric on $M$ with signature -2. The most important feature of any space-time is the existence of null curves, particularly null geodesics (light rays), null surfaces and null hypersurfaces, which are characteristics for Einstein's field equations in the sense of the theory of normal hyperbolic second order differential equations. The existence of these null (or light-like) manifolds has a physical origin, since null geodesics (light rays) are the trajectories of photons, and the null hypersurfaces of constant phase in the geometric optics (high frequency) limit can be considered as level surfaces (null surfaces) of the function $S = S(x^i)$, $i = 0, 1, 2, 3$, satisfying the eikonal equation (massless Hamilton-Jacobi equation), namely $g^{ij}\,\partial_i S\,\partial_j S = 0$ [Frittelli and Newman (1999)]. A geometric illustration can be given if we start with a given 2-dim. space-like smooth surface $Z$ in a space-time $M$ that is a solution of the Einstein field equations

$$R_{ij} - \tfrac{1}{2} R\, g_{ij} + \Lambda g_{ij} = -8\pi k\, T_{ij} \qquad (*)$$

where $R_{ij}$ is the Ricci tensor, $R$ is the scalar curvature, $\Lambda$ is the cosmological constant, $k$ is the gravitational constant and $T_{ij}$ is the energy-momentum tensor. Choose at each point of $Z$ a null direction perpendicular to $Z$, depending smoothly on the foot point. There are exactly two such direction fields on $Z$; each of these determines a unique null geodesic to which it is tangent, giving a two-parameter family of null geodesics. This family can be interpreted as a bundle of light rays issuing from a surface at a specific instant of time and propagating freely without direct interaction with matter. Let $W$ be the set of all points of $M$ that can be joined to $Z$ by one of these geodesics. In the neighbourhood of $Z$, $W$ is a 3-dim. light-like (null) submanifold of $M$; further away from $Z$, neighbouring null geodesics (light rays) might intersect each other and $W$ might fail to be a submanifold of $M$. Following Friedrich and Stewart (1983), we call any subset $W$ of $M$ that can be constructed in this way a wavefront. By definition, the caustic of a wavefront $W$ is the set of all points $x \in W$ where $W$ fails to be an immersed submanifold of $M$. A picture of this construction of caustics has been given by Penrose (1972). The occurrence of caustics (singularities) is an intrinsic property of wavefronts. Caustics can also be interpreted as the location of focusing regions, where the intensity of light becomes very high. It is found that the classification of wavefronts near their singularities (caustics) is equivalent to the classification of Legendrian submanifolds near the points where the projection from the submanifold to the base has a singularity, in the sense that the tangent map has non-maximal rank [Arnold et al (1985)]. Friedrich and Stewart (1983) used Arnold's results to obtain a local classification of caustics of wavefronts. Hasse et al (1996) obtained a local classification of caustics of wavefronts in terms of their projection from space-time to space. This classification is found to be more general than that given by Friedrich and Stewart in 1983, in the sense that it is independent of which timelike vector field has been chosen, i.e., it is observer independent. A more recent elegant study of the theory of caustics and wavefront singularities was given by Ehlers and Newman (1999).
This article includes, from the physical and mathematical point of view, a nice and clear review of the work of V.I. Arnold on the theory of Lagrangian and Legendrian submanifolds and their associated maps. It is worth mentioning that the notion of wavefronts includes, in particular, light cones: a light cone can be constructed from all null geodesics (light rays) issuing from a point $x \in M$ as a vertex, by choosing for $Z$ (2-dim. space-like surface) an appropriate sphere near $x \in M$. The study of singularities of wavefronts (caustics), i.e., the study of the inner geometry of the light cone, is of great importance since it gives, e.g., a clear description of the gravitational field in vacuum, particularly near the vertex of the light cone [Dautcourt (1965)]. According to general relativity, the null geodesics constituting the light cone with vertex $x \in M$ can be forced to re-converge by a sufficiently strong gravitational field (e.g., of a quasar, a galaxy or a cluster of galaxies). This is known as the gravitational lens effect, which is today one of the most rapidly growing areas in astrophysics. A comprehensive review of this physical theory is contained in Schneider, Ehlers and Falco (1992) [see also: Ehlers (1998), Kayser et al (1992)].
2 Structure of Light Cone in Gödel Universe
Kurt Gödel (1949) obtained a solution of Einstein's field equations (*) with cosmological constant $\Lambda < 0$ which can be represented by the metric

$$ds^2 = \left(dx^0 + e^{x^1/b}\, dx^2\right)^2 - \left[(dx^1)^2 + \tfrac{1}{2}\, e^{2x^1/b} (dx^2)^2 + (dx^3)^2\right] \qquad (2.1)$$

This solution describes a rotating dust-filled universe (non-expanding and shear-free) with density of matter $\rho$ and a constant rigid rotation $\Omega$; here $b = 1/\sqrt{8\pi k\rho}$, and $u^i$ is the 4-velocity vector of the dust.

The metric (2.1) can be written in the so-called "standard Gödel coordinate system" in the form:

$$ds^2 = 4\,[\,dt^2 - dr^2 + 2\sqrt{2}\,\sinh^2 \ldots\,] \qquad (2.2)$$

The geodesic equations

$$\frac{d^2 x^i}{ds^2} + \Gamma^i_{jk}\, \frac{dx^j}{ds} \frac{dx^k}{ds} = 0 \qquad (2.3)$$

where $\Gamma^i_{jk}$ are the Christoffel symbols and $s$ is an affine parameter along the geodesic, were explicitly integrated by Kundt (1956) and Chandrasekhar and Wright (1961). The light cone in the Gödel universe was constructed by Abdel-Megied and Dautcourt (1972). With the vertex $(0, 0, b, 0)$, its parametric representation is given by:

(2.4) [parametric equations for $t$, $x$, $y$, $z$ as functions of $(w, u, v)$; the explicit expressions are garbled in the source]
where $w$ is an affine parameter along the null geodesics (light rays) and $u$, $v$ are directional (transversal) parameters fixing a geodesic (generator of the light cone). The parameter $u$ is arbitrary, while $v$ is restricted to the values $1 < |v| < 1 + \sqrt{2}$. The null directions $\xi^i = \frac{dx^i}{dw}$ ($i = 0, 1, 2, 3$) at the vertex can be calculated from (2.4), so we get

$$g_{ij}\,\xi^i \xi^j = 0 = (\xi^0 + \sqrt{2}\,\xi^1)^2 - (\xi^1)^2 - (\xi^2)^2 - (\xi^3)^2 \qquad (2.5)$$

Introducing, at the vertex, the new directional parameters

$$C^1 = \frac{\xi^1}{\xi^0 + \sqrt{2}\,\xi^1}, \qquad C^2 = \frac{\xi^2}{\xi^0 + \sqrt{2}\,\xi^1}, \qquad C^3 = \frac{\xi^3}{\xi^0 + \sqrt{2}\,\xi^1} \qquad (2.6)$$

then using (2.5), we have:

$$(C^1)^2 + (C^2)^2 + (C^3)^2 = 1 \qquad (2.7)$$

which represents the celestial sphere of an observer at the vertex $(0, 0, b, 0)$ of the light cone (2.4). A very interesting picture which gives a deeper insight into the structure of the light cone in the Gödel universe can be found in [Hawking and Ellis (1973) p. 168].
3 The Inner Geometry of Light Cone
To study the behaviour of the light cone near its singular points (caustics), there are some difficulties due to the very complicated form of its inner metric $g^*_{\alpha\beta}$ ($\alpha, \beta = 1, 2, 3$) defined by

$$g^*_{\alpha\beta} = g_{ij}\, \frac{\partial x^i}{\partial y^\alpha} \frac{\partial x^j}{\partial y^\beta} \qquad (3.1)$$

where $y^1 = w$, $y^2 = u$, $y^3 = v$ are the coordinates on the light cone. The metric $g^*_{\alpha\beta}$ is singular, since $g^*_{1\alpha} = 0$. Using (2.4) in (3.1) we get the components of the inner metric:

(3.2) [components $g^*_{22}$, $g^*_{23}$, $g^*_{33}$: each is a trigonometric prefactor in $w$ divided by a power of $U$, multiplied by a sum of the form $\sum_\nu P_\nu(u, v) \tan^\nu w$; the explicit expressions are garbled in the source]

where $U = (v - u \tan w)^2 + v^2 (u + v \tan w)^2$, and $P$, $Q$, $R$, $S$, $T$ are functions of the transversal parameters $u$ and $v$. Now, as we have mentioned in the introduction, the singularities on the light cone are the locations of focal points where neighbouring geodesics (light rays) intersect. This leads to a higher degeneracy of the inner metric $g^*_{\alpha\beta}$ [Dautcourt (1967)]. A glance at the metric (3.2) shows that at the values $w = n\pi$ ($n = 0, 1, 2, 3, \ldots$) the inner metric becomes singular:

$$g^*_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -\dfrac{n^2 \pi^2 b^2 (v^2 - 1)^4}{v^4 (6v^2 - v^4 - 1)} \end{pmatrix} \qquad (3.3)$$
Substituting the values $w = n\pi$ in the parametric equations of the light cone given by (2.4), we get the curve:

$$t = -\frac{n\pi b}{\sqrt{2}}\, l, \qquad x = 0, \qquad y = b, \qquad z = \frac{n\pi b}{\sqrt{2}} \sqrt{8 - l^2} \qquad (3.4)$$

where $l = v + \frac{1}{v}$. This shows that the focal points lie on the circles

$$t^2 + z^2 = 4 n^2 \pi^2 b^2 \qquad (3.5)$$
in the' pseudo-Euclidean plane x = 0, y = b. The value n = 0 corresponds to the vertex (0,0,6,0) of the light cone. The circles (3.5) for different values of n are space-like which represent the so-called Keel curves [Riesz (1956)]. Now from (2.4) and (2.6) we have at the vertex of the light cone, the null directions (rays):
?- ^ f O '
^2 - l£jfa-k
™
where I = v + £; The values I = ±2 (v = ±1) correspond to the north and south poles of the clestial sphere (2.7) of an observer near the vertex, while the values / = ±2-^/2 (v = ±(1 + \/2)) correspond to the great circle
( 0 2 + « 2 ) 2 = l-
(3-7)
392 For all other values of I ^ —2, I ^ 2\/2, we get circles with increasing radius. These circles cover the upper half of the plane (2.7). For The negative values of I ^ — 2, I ^ —2\/2, we get circles covering the lower part of the sphere (2.7). Thus we have the following property to the inner geometry of light
There is a correspondance between the focal points on the light cone in Godel universe and the points on the celestial sphere of an observer at the vertex. For physical interest we calculate the length of the Keel curves, i.e., the circumference of the circles (3.5) for any n. From the formula dxi dxj 1/2 dl an——r dl dl
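we obtain
$$ s = 2 n \pi b \int_{2}^{2\sqrt{2}} \sqrt{\frac{l^2 - 4}{8 - l^2}}\; dl. $$
To evaluate this integral put $l = 2\sqrt{2 - \sin^2\theta}$:
$$ s = 2 n \pi b \int_{0}^{\pi/2} \frac{\sqrt{2} \cos^2\theta \, d\theta}{\sqrt{1 - \frac{1}{2}\sin^2\theta}} = 4\sqrt{2}\, n \pi b\, E(\tfrac{1}{2}) - 2\sqrt{2}\, n \pi b\, K(\tfrac{1}{2}), $$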
where $E(\tfrac{1}{2})$ and $K(\tfrac{1}{2})$ are the complete elliptic integrals of the second and first kind, respectively, which have the values $E(\tfrac{1}{2}) = 1.35064$, $K(\tfrac{1}{2}) = 1.85407$. So, for any n, the circumference of any circle (Keel curve) is found to be $s = 2.4\, n \pi b$. If we take into consideration that $b = 1/\sqrt{\kappa\rho}$, where:
$\kappa = 8\pi f / c^2$, the gravitational constant,
$c = 3 \times 10^{10}$ cm/sec, the velocity of light,
$f = 6.67 \times 10^{-8}$ dyn cm²/g², the Newtonian constant,
$\rho = 5 \times 10^{-30}$ g/cm³, the density,
we obtain
$$ s \approx 2.4\, n \pi \times 10^{28}\ \text{cm} \approx 0.8\, n \times 10^{11}\ \text{light years} \qquad (3.8) $$
(a light year $\approx 9.46 \times 10^{17}$ cm). We can formulate this result as follows:
The length of the circumference of the Keel curve in the Gödel universe is, for small n, of the order of the gravitational radius.
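The numerical values quoted in this estimate can be checked with a short script. The sketch below is not part of the paper. It assumes that the argument 1/2 of E and K is the elliptic parameter m = k², that the circumference is s = (4√2 E − 2√2 K) nπb, and that b = 1/√(κρ) with κ = 8πf/c². The helper name `simpson` and all variable names are ours.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] using an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


M = 0.5  # elliptic parameter m = k^2 (assumed reading of the argument 1/2)
HALF_PI = math.pi / 2

# Complete elliptic integrals of the first (K) and second (E) kind.
K = simpson(lambda t: 1.0 / math.sqrt(1.0 - M * math.sin(t) ** 2), 0.0, HALF_PI)
E = simpson(lambda t: math.sqrt(1.0 - M * math.sin(t) ** 2), 0.0, HALF_PI)

# Dimensionless coefficient in s = coeff * n * pi * b.
coeff = 4 * math.sqrt(2) * E - 2 * math.sqrt(2) * K

# CGS values as quoted in the text.
c = 3e10       # velocity of light, cm/s
f = 6.67e-8    # Newtonian constant, dyn cm^2 / g^2
rho = 5e-30    # density, g/cm^3

kappa = 8 * math.pi * f / c ** 2     # assumed form of the gravitational constant
b = 1.0 / math.sqrt(kappa * rho)     # assumed relation b = 1/sqrt(kappa*rho), in cm

LIGHT_YEAR_CM = 9.46e17
s_cm = coeff * math.pi * b           # circumference for n = 1, in cm
s_ly = s_cm / LIGHT_YEAR_CM          # the same length in light years

print(f"K = {K:.5f}, E = {E:.5f}, coefficient = {coeff:.3f}")
print(f"b = {b:.3e} cm, s(n=1) = {s_cm:.2e} cm = {s_ly:.2e} light years")
```

With these inputs the coefficient comes out close to 2.4 and b close to 10^28 cm, in line with the orders of magnitude quoted in the text. Simpson quadrature is used only to keep the sketch free of external dependencies; any elliptic-integral routine that uses the same parameter convention would serve equally well.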
REFERENCES
Abdel-Megied, M. and Dautcourt, G. "Zur Struktur des Lichtkegels in Gödel Kosmos." Math. Nachrichten 54 (1972) pp. 33-39.
Arnold, V.I.; Gusein-Zade, S.M. and Varchenko, A.N. "Singularities of Differentiable Maps." Vol. 1. Birkhäuser, Boston, Basel (1985).
Dautcourt, G. "Isotrope Flächen in der allgemeinen Relativitätstheorie." Habilitationsschrift, Humboldt Univ., Berlin (1965).
Ehlers, J. "Gravitationslinsen: Lichtablenkung in Schwerefeldern und ihre Anwendung." Carl Friedrich von Siemens Stiftung, Bd. 69 (1998).
Ehlers, J. and Newman, E. "The Theory of Caustics and Wavefront Singularities with Physical Applications." gr-qc/9906065 (10 June 1999).
Dautcourt, G. "Characteristic Hypersurfaces in General Relativity." J. Math. Phys. 8 (1967) p. 1492.
Friedrich, H. and Stewart, J.M. "Characteristic Initial Data and Wavefront Singularities." Proc. Roy. Soc. London A 385 (1983) pp. 345-371.
Frittelli, S. and Newman, E.N. "The Eikonal Equation I." J. Math. Phys. 40 (1999) pp. 383-407.
Frittelli, S. and Newman, E.N. "The Eikonal Equations II ..." J. Math. Phys. 40 (1999) pp. 1041-1056.
Gödel, K. "An Example of a New Type of a Cosmological Solution of Einstein's Field Equations." Rev. Mod. Phys. 21 (1949) pp. 447-450.
Hasse, V.; Kriele, M. and Perlick, V. "Caustics of Wavefronts in General Relativity." Class. & Quant. Gravi. 13 (1996) pp. 1161-1182.
Hawking, S.W. and Ellis, G.F.R. "Large Scale Structure of Space-time." Cambridge Univ. Press (1973).
Kayser, R.; Schramm, T. and Nieser, L. (Eds.) "Gravitational Lenses." Lect. Notes in Phys. 406, Springer-Verlag (1992).
Kundt, W. "Trägheitsbahnen in einem von Gödel angegebenen kosmologischen Modell." Zeitschr. f. Phys. 145 (1956) p. 611.
Penrose, R. "Techniques of Differential Topology in Relativity." Regional Conf. Series in Applied Math., SIAM, Philadelphia, PA (1972).
Pfarr, J. "Time Travel in Gödel's Space." Gen. Rel. & Gravi. 13 (1981) pp. 1073-1091.
Riesz, M. "Problems Related to Characteristic Surfaces." Proc. Inter. Conf. in Differential Equations (1956) p. 57.
Schneider, P.; Ehlers, J. and Falco, E. "Gravitational Lensing." Springer-Verlag, Berlin (1992).
List of Participants

* Plenary or topical lecture speaker

1. G. M. Abd Al-Kader, Faculty of Science, Al-Azhar University, Egypt
2. Elham M. Abd Elrasol, Faculty of Science, Cairo University, Egypt
3. Abo-El Nour N. Abd-Alla, Faculty of Science, South Valley University, Egypt
4. A. M. Abdalla, Faculty of Science, Benha, Egypt
5. M. Z. Abdalla, Faculty of Science, Cairo University, Egypt
6. Nassar Hassan Abdel-All, Faculty of Science, Assiut University, Egypt
7. Laila F. Abdel-A'll, Faculty of Science, Cairo University, Egypt
8. M. Abdel-Aty, Faculty of Science, South Valley University, Egypt
9. M. R. Abdel-Aziz, Faculty of Science, Kuwait University, Kuwait
10. Hamdy I. Abdel-Gawad, Faculty of Science, Cairo University, Egypt
11. A. M. Abdel-Hafez, Faculty of Science, El-Minia University, Egypt
12. M. Ezzat Abdel-Monsef, Faculty of Science, Tanta University, Egypt
13. *M. Abdel-Megied, Faculty of Science, El-Minia University, Egypt
14. M. Elshafie Abdellatif, Faculty of Engineering, Assiut University, Egypt
15. Hosny A. Abdusalam, Faculty of Science, Cairo University, Egypt
14. *Faruk F. Abi-Khuzam, A.U.B., Beirut, Lebanon
16. Abdel Aziz Abo Khadra, Faculty of Engineering, Tanta University, Egypt
17. Mostafa S. Abou-Dina, Faculty of Science, Cairo University, Egypt
18. Abdel-Karim Aboul-Hassan, Faculty of Engineering, Alexandria University, Egypt
19. Saeed Abu-Zour, Faculty of Science, United Arab Emirates University, UAE
20. A. I. Aggour, Faculty of Science, Al-Azhar University, Egypt
21. *Essam K. Al-Hussaini, Faculty of Science, University of Assiut, Egypt
22. Manuel J. Alejandre, Universidad Publica de Navarra, Spain
23. Mohamed Nabil Allam, Faculty of Science, Mansoura University, Egypt
24. M. A. Amer, Faculty of Engineering, Mansoura University, Egypt
25. S. M. Amer, Faculty of Science, Zagazig University, Egypt
26. M. Mostafa Anbar, Faculty of Engineering, Cairo University, Egypt
27. *Mohammed Asaad, Faculty of Science, Cairo University, Egypt
28. *Attia A. Ashour, Faculty of Science, Cairo University, Egypt (Chairman)
29. Maria J. Asiain, Universidad Publica de Navarra, Spain
30. Mohammed Atallah, Faculty of Science, Tanta University, Egypt
31. *Michael Atiyah, Department of Mathematics and Statistics, Edinburgh, U.K.
32. A. H. Azzam, Suez Canal University, Egypt
33. M. A. Bakry, Faculty of Education, Ain Shams University, Egypt
34. *N. Balakrishnan, McMaster University, Canada
35. *Adolfo Ballester-Bolinches, University of Valencia, Spain
36. *Bolis Basit, Monash University, Australia
37. I. Bayoumi, Faculty of Science, Ain Shams University, Egypt
38. *Klaus Buchner, Technische Universität München, Germany
39. *Robin Bullough, UMIST, Manchester, UK
40. J. Bustoz, Arizona State University, USA
41. Sergio Camp-Mora, Universidad Publica de Navarra, Spain
42. A. Dabbour, Faculty of Science, Ain Shams University, Egypt
43. *Lokenath Debnath, University of Central Florida, USA
44. Assem Deif, Faculty of Engineering, Cairo University, Egypt
45. Hacen Dib, Faculty of Science, University of Tlemcen, Algeria
46. *Vlastimil Dlab, Carleton University, Canada
47. Eid H. Doha, Faculty of Science, Cairo University, Egypt
48. *W. Ebeid, Faculty of Education, Ain Shams University, Egypt
49. *Jürgen Ehlers, Max-Planck Inst. für Gravitationsphysik, Albert Einstein Inst., Germany
50. Ibrahim F. Eissa, Institute for Stat. and Res., Cairo University, Egypt
51. M. M. El-Ann, Faculty of Science, Al-Azhar University, Egypt
52. Samia S. Elazab, Women's University College, Ain Shams University, Egypt
53. A. El-Gohary, Faculty of Science, Mansoura University, Egypt
54. Salah El-Gindy, Faculty of Science, Assiut University, Egypt
55. Hany M. El-Hosseiny, Faculty of Science, Cairo University, Egypt
56. A. I. El-Maghrabi, Faculty of Education, Tanta University, Egypt
57. *M. S. El-Naschie, Cambridge, UK
58. H. M. El-Owaidy, Faculty of Science, Al-Azhar University, Egypt
59. Tarek M. El-Shahat, Faculty of Science, Alazhar University of Assiut, Egypt
60. Magdy El-Tawil, Faculty of Engineering, Cairo University, Egypt
61. Somaia El-Zahaby, Faculty of Science, Al-Azhar University, Egypt
62. E. Elshobaky, Faculty of Science, Ain Shams University, Egypt
63. Ahmed Elsonbaty, Faculty of Engineering, Assiut University, Egypt
64. L. M. Ezquerro, Universidad Publica de Navarra, Spain
65. *M. Fadlalla, Faculty of Science, Ain Shams University, Egypt
66. M. Faragallah, Faculty of Education, Ain Shams University, Egypt
67. Magdi Elias Fares, Faculty of Science, Mansoura University, Egypt
68. J. Fleckinger, Universite de Toulouse, France
69. *Ahmed F. Ghaleb, Faculty of Science, Cairo University, Egypt
70. M. H. Ghanem, Faculty of Science, Zagazig University, Egypt
71. *Phillip A. Griffiths, Institute for Advanced Study, Princeton, USA
72. *Martin Groetschel, Konrad-Zuse-Zentrum für Informationstechnik, Berlin, Germany
73. Salah Haggag, Faculty of Science, Al-Azhar University of Assiut, Egypt
74. Badie T. M. Hassan, Faculty of Science, Cairo University, Egypt
75. Michel Hebert, The American University in Cairo, Egypt
76. Ahmed S. Hegazi, Faculty of Science, Mansoura University, Egypt
77. Mohamed Atef Helal, Faculty of Science, Cairo University, Egypt
78. Abdel Rahman A. Heleil, Faculty of Science at Beni Soueif, Cairo University, Egypt
79. Sahar Ibrahim, Institute for Statistical Studies and Research, Cairo University, Egypt
80. *Hassan N. Ismail, Benha High Institute, Egypt
81. *Mourad Ismail, University of South Florida, USA
82. Fatma Ismail, Faculty of Science, Cairo University, Egypt
83. Abdelouahab Kadem, Universite Farhat Abbas, Algeria
84. M. E. Kahil, Faculty of Science, Cairo University, Egypt
87. Ahmed B. Khalil, Faculty of Science, Cairo University, Egypt
88. Zeinhom M. G. Kishka, Faculty of Science, South Valley University, Egypt
89. *S. R. Komy, Helwan University, Egypt
90. Abdel Monem Kozae, Faculty of Science, Tanta University, Egypt
91. *Tassilo Küpper, Universität Koeln, Germany
92. Gamal M. Mahmoud, Faculty of Science, Assiut University, Egypt
93. Mohamed Ezzat Mahmoud, Faculty of Science at Beni Soueif, Cairo University, Egypt
94. B. Merouani, Universite Farhat Abbas, Algeria
95. I. F. Mikhail, Faculty of Science, Ain Shams University, Egypt
96. *John J. H. Miller, Trinity College, Dublin, Ireland
97. Aboul-Magd A. Mohamed, Faculty of Education, Ain Shams University, Egypt
98. Nahed Mokhlis, Faculty of Science, Ain Shams University, Egypt
99. Mohamed Adel M. Ali Mousa, Faculty of Science, Assiut University, Egypt
100. A. B. Morcos, National Research Institute of Astronomy and Geophysics, Egypt
101. Taher Mourid, Faculty of Science, University of Tlemcen, Algeria
102. *M. S. Narasimhan, ICTP, Trieste, Italy
103. A. Nasef, Faculty of Education in Arish, Suez Canal University, Egypt
104. Gamal G. L. Nashed, Faculty of Science, Ain Shams University, Egypt
105. *Aly Nayfeh, Virginia Polytech. Inst. and State University, USA
106. Valentina Nikoulina, Universite Victor Segalen Bordeaux 2, France
107. *Mikhail Nikouline, Universite Victor Segalen Bordeaux 2, France
108. *A.-S. F. Obada, Faculty of Science, Al-Azhar University, Egypt
109. G. Ochoa, Universidad Publica de Navarra, Spain
110. *Jacob Palis, IMPA, Brazil
111. N. Papadatos, University of Cyprus, Cyprus
112. *Claudio Procesi, Universita di Roma La Sapienza, Italy
113. Sherif I. Rabia, Faculty of Engineering, Alexandria University, Egypt
114. Ahmed E. Radwan, Faculty of Education, Tanta University, Egypt
115. S. F. Ragab, Faculty of Engineering, Cairo University, Egypt
116. *Roshdi Rashed, CNRS, France
117. Effat A. Saied, Faculty of Science, Benha University, Egypt
118. A. G. Saif, Atomic Energy Establishment, Egypt
119. Samir Sallam, Kuwait University, Kuwait
120. Mohamed S. Selim, Faculty of Science, Al-Azhar University of Assiut, Egypt
121. Mohamed Sifi, Ecole Superieure des Sciences et Techniques de Tunis, Tunisia
122. Saad M. Sileem, Faculty of Science, Ain Shams University, Egypt
123. X. Soler-Escriva, Universidad Publica de Navarra, Spain
124. Laila Soueif, Faculty of Science, Cairo University, Egypt
125. *A. R. Sourour, University of Victoria, Canada
126. Khalaf S. Sultan, Faculty of Science, Al-Azhar University, Egypt
127. Hussein Sayed Tantawy, Faculty of Engineering, Al-Azhar University, Egypt
128. M. I. Wanas, Faculty of Science, Cairo University, Egypt
129. *Richard Wiegandt, Math. Institute, Hungarian Acad. of Science, Hungary
130. Afaf Zagrout, Faculty of Science (Girls), Al-Azhar University, Egypt
131. Adel Zaki, Faculty of Science, Helwan University, Egypt
132. Maher Zayed, Faculty of Science, University of Benha, Egypt
133. Nasr A. Zeyada, Faculty of Science, Cairo University, Egypt