Thus, when $H$ is invariant, both partitions of $G$, into left and right cosets, coincide. But there is more.

Proposition 2.6. If $H \triangleleft G$, then $G/H$ is a group for the law
$$(g_1H)(g_2H) = (g_1g_2)H. \qquad (2)$$
The neutral element is $H = eH$ and the inverse of $gH$ is $g^{-1}H$. The law (2) indeed makes sense, for one has
$$(g_1H)(g_2H) = g_1(Hg_2)H = g_1(g_2H)H = g_1g_2HH = (g_1g_2)H,$$
that is, the l.h.s. is indeed a left coset.

Examples:
• Consider the group $O(2)$ introduced after Definition 2.5: $SO(2) \triangleleft O(2)$, $O(2) = SO(2) \cup R\,SO(2)$ and $O(2)/SO(2) \simeq \{I, R\} \simeq \mathbb{Z}_2$.
• For any group $G$, its center $Z(G) = \{z \in G : zg = gz \text{ for all } g \in G\}$ is an invariant subgroup. A subgroup of $Z(G)$ will hence be called central.

Definition 2.7. The direct product of two groups $G$, $G'$ is the set $G \times G' = \{(g, g') : g \in G,\ g' \in G'\}$, endowed with the product $(g_1, g_1') \cdot (g_2, g_2') = (g_1g_2, g_1'g_2')$. Then $G \simeq \{(g, e') : g \in G\}$ and $G' \simeq \{(e, g') : g' \in G'\}$ are both invariant subgroups of $G \times G'$. For instance, $V_4 \simeq \mathbb{Z}_2 \times \mathbb{Z}_2$.

Definition 2.8. (1) The group $G$ is simple if it has no nontrivial invariant subgroup. Examples: $SO(2)$, $SU(2)$. (2) The group $G$ is semisimple if it has no nontrivial invariant abelian subgroup. Example: $SO(4) \simeq (SU(2) \times SU(2))/\mathbb{Z}_2$ is semisimple, but not simple. On the other hand, $O(2)$ is not semisimple, since $SO(2)$ is an invariant abelian subgroup.

Proposition 2.9. Every semisimple group $G$ is of the form
$$G \simeq (G_1 \times G_2 \times \cdots \times G_n)/Z, \qquad (3)$$
where $G_1, G_2, \ldots, G_n$ are simple and $Z$ is a discrete central subgroup of the product.
Now we turn to maps from one group into another that preserve some of the group structure.

Definition 2.10. A (group) homomorphism is a map $\sigma : G \to G'$ such that
$$\sigma(g_1g_2) = \sigma(g_1)\,\sigma(g_2), \quad \text{for all } g_1, g_2 \in G. \qquad (4)$$
It follows from (4) that $\sigma(g^{-1}) = \sigma(g)^{-1}$, for all $g \in G$, and that $\sigma(e) = e'$. The kernel of the homomorphism is the set $\mathrm{Ker}\,\sigma = \{g \in G : \sigma(g) = e'\}$ and its range is the set $\mathrm{Im}\,\sigma = \{\sigma(g),\ g \in G\}$. It is readily seen that $\mathrm{Ker}\,\sigma$ is an invariant subgroup of $G$ and that $\mathrm{Im}\,\sigma$ is a subgroup of $G'$. Of course, if $\sigma : G \to G'$ is bijective, we recover Definition 2.2 of an isomorphism. Actually, the notion of homomorphism is in a sense equivalent to that of invariant subgroup, as follows from the next theorem.

Theorem 2.11. (i) If $H \triangleleft G$, then $\sigma : g \mapsto gH$ is a (canonical) homomorphism of $G$ onto $G/H$ and $\mathrm{Ker}\,\sigma = H$. (ii) Conversely, let $\sigma : G \to G'$ be a homomorphism. Then $G/\mathrm{Ker}\,\sigma \simeq \mathrm{Im}\,\sigma$.

In other words, $H$ is an invariant subgroup of $G$ if and only if it is the kernel of some surjective homomorphism $\sigma : G \to G'$ such that $G' \simeq G/H$.

Example: Take again the group $O(2)$, with its subgroups $SO(2)$ and $\{I, R\}$:
• $SO(2)$
Lemma 2.13. One has
$$\mathrm{Int}\,G \triangleleft \mathrm{Aut}\,G; \qquad \mathrm{Out}\,G \simeq \mathrm{Aut}\,G/\mathrm{Int}\,G.$$
Example: Consider $SO(3)$ acting on the additive group $\mathbb{R}^3$; the map $x \mapsto Rx$ is an outer automorphism of $\mathbb{R}^3$ (a precise definition of the notion of action is given in Definition 2.15 below).

The notion of automorphism allows one to build a new group from two smaller ones, in a fashion more general than a direct product. Let $H$, $K$ be two groups for which there exists a homomorphism $\alpha : K \to \mathrm{Aut}\,H$. Then we define a new type of product:

Definition 2.14. The semidirect product of $H$ (noted additively) by $K$ (noted multiplicatively), with respect to the homomorphism $\alpha : K \to \mathrm{Aut}\,H$, is the group $G = H \rtimes K$ of all pairs $(h, k) \in H \times K$, with composition law
$$(h, k)(h', k') = (h + \alpha(k)(h'),\ kk'), \quad \text{for all } (h, k), (h', k') \in H \times K.$$
The neutral element of $G$ is $(0, e_K)$ and the inverse of $(h, k)$ is
$$(h, k)^{-1} = (-\alpha(k)^{-1}(h),\ k^{-1}).$$
Moreover, it is easy to see that $\{(h, e_K),\ h \in H\} \simeq H \triangleleft G$ and $G/H = (H \rtimes K)/H \simeq K$.
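The composition law and the inverse formula of Definition 2.14 can be checked numerically. The following is only a sketch under stated assumptions: it takes $H = \mathbb{R}^2$ (additive), $K = SO(2)$ stored as an angle, and $\alpha(k)$ = rotation by that angle, which gives the Euclidean group of the plane. The function names (`act`, `mul`, `inv`, `close`) are illustrative and do not come from the text.

```python
import math

def act(theta, h):
    """alpha(k)(h): rotate the vector h = (x, y) by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * h[0] - s * h[1], s * h[0] + c * h[1])

def mul(g1, g2):
    """(h, k)(h', k') = (h + alpha(k)(h'), k k'), with k stored as an angle."""
    (h1, t1), (h2, t2) = g1, g2
    r = act(t1, h2)
    return ((h1[0] + r[0], h1[1] + r[1]), t1 + t2)

def inv(g):
    """(h, k)^{-1} = (-alpha(k)^{-1}(h), k^{-1})."""
    h, t = g
    r = act(-t, h)
    return ((-r[0], -r[1]), -t)

def close(g1, g2, tol=1e-12):
    """Compare two elements, identifying angles modulo 2*pi."""
    (h1, t1), (h2, t2) = g1, g2
    dt = (t1 - t2) % (2 * math.pi)
    dt = min(dt, 2 * math.pi - dt)
    return abs(h1[0] - h2[0]) < tol and abs(h1[1] - h2[1]) < tol and dt < tol

identity = ((0.0, 0.0), 0.0)      # the neutral element (0, e_K)
a = ((1.0, 2.0), 0.3)
b = ((-0.5, 4.0), 1.1)
c = ((2.0, -1.0), -0.7)

print(close(mul(a, inv(a)), identity))               # inverse formula
print(close(mul(mul(a, b), c), mul(a, mul(b, c))))   # associativity
```

Storing $K$ as an angle keeps the example short; for a non-abelian $K$ one would store the group element itself and replace the sum of angles by the product in $K$.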
Example: $O(2) = SO(2) \rtimes \{I, R\}$, $SO(2)$
• The similitude group of $\mathbb{R}^n$: $SIM(n) = \mathbb{R}^n \rtimes (\mathbb{R}^+_* \times SO(n))$; this is the group underlying the $n$-D wavelet transform.

Definition 2.15. (1) If $G$ is a group and $X$ is a set, an action of $G$ on $X$ is a map $(g, x) \in G \times X \mapsto g[x] \in X$ such that
(i) $g_1g_2[x] = g_1[g_2[x]]$, for all $g_1, g_2 \in G$, for all $x \in X$;
(ii) $e[x] = x$, for all $x \in X$, where $e$ is the neutral element of $G$.
Then one says that $X$ is a $G$-set.
(2) The action of $G$ on $X$ is transitive if, for every pair $x, x' \in X$, there exists an element $g \in G$ such that $x = g[x']$; then $X$ is called a ($G$-)homogeneous set.
(3) The orbit of a point $x \in X$ is the set of elements of $X$ to which $x$ can be moved by the elements of $G$. It is denoted by $O_x$: $O_x = \{g[x],\ g \in G\}$.
(4) The stabilizer subgroup of a point $x \in X$ is the set of elements of $G$ that fix $x$: $G_x = \{g \in G : g[x] = x\}$.
(5) Orbits and stabilizers are related. For each point $x \in X$, one has $G/G_x \simeq O_x$. The action is transitive on each orbit and the stabilizer subgroups of any two points of the same orbit are conjugate, hence isomorphic: $y \in O_x$ means that $y = g[x]$, for some $g \in G$, and then $G_y = gG_xg^{-1}$. If the action is transitive, then $X$ is the unique orbit and thus $X \simeq G/G_x$, for all $x \in X$.

Examples:
• $SO(3)$ acts transitively on the unit sphere $S^2$ and the stabilizer of every point of the sphere is isomorphic to $SO(2)$. Thus one has $S^2 \simeq SO(3)/SO(2)$.
• $SO(3)$ acting on $\mathbb{R}^3$: the orbit of the point $x \in \mathbb{R}^3$ is the sphere $S_r$ of radius $r = |x|$ and $\mathbb{R}^3 = \bigcup_{r \ge 0} S_r$.

We conclude this section with some crude indications concerning topological groups. For lack of space, we have to drastically simplify the treatment here. Precise statements and definitions may be found in the textbooks.
Definition 2.16. Roughly speaking, a topological space is a set $X$ with a topology, the latter being defined by a collection of open sets or, equivalently, a collection of closed sets. Then we define the following notions.
• A map $\alpha : X \to Y$ from one topological space into another one is continuous if $\alpha^{-1}(U)$ is open in $X$ for every open subset $U \subset Y$.
• A set $X$ is compact if every open covering of $X$ contains a finite open subcovering (a more manageable definition will be given in Sec. 2.4).
• A set $X$ is connected if $X$ cannot be decomposed into the union of two nonempty disjoint open subsets, $X \neq A \cup B$; in general, every set is the disjoint union of its connected components.
• Given $x, y$ in the same connected component, two continuous curves $P_{x \to y}$, $Q_{x \to y}$ from $x$ to $y$ are homotopic if they can be deformed continuously into one another. Clearly, homotopy is an equivalence relation. Given $x \in X$, the set of equivalence classes of closed curves $P_{x \to x}$ is a group with respect to the composition of curves (it is abelian whenever $X$ is itself a topological group). This group does not depend on the choice of $x \in X$ and is called the (first) homotopy group of $X$.
• The set $X$ is simply connected if every two continuous curves $P_{x \to y}$, $Q_{x \to y}$, $x, y \in X$, are homotopic or, equivalently, if every closed curve $P_{x \to x}$ is homotopic to the null curve. In this case, the homotopy group of $X$ is trivial.

Definition 2.17. A topological group is a group $G$ which is also a topological space, such that the group multiplication $(g_1, g_2) \mapsto g_1g_2$ and the inverse operation $g \mapsto g^{-1}$ are continuous maps (here $G \times G$ is viewed as a topological space with the product topology). A compact (topological) group is a topological group which is also a compact space.

Examples:
• $SO(2)$, $SO(3)$ are compact topological groups.
• $SO(1,1)$, $SO(1,3)$ are noncompact topological groups.

Proposition 2.18. Let $G$ be a topological group and $G_0$ the connected component of $G$ containing the identity. Then $G_0$ is a closed invariant subgroup of $G$ and $G/G_0$ is discrete.
Proposition 2.19. If $G$ is a compact topological group and $H$ is a closed subgroup of $G$, then $H$ is a compact topological group.

Examples: $SO(2) \subset SO(3)$, $SO(2) \subset SU(2)$. Note that the result is false if $H$ is not closed (for instance, an irrational helix winding on the torus $SO(2) \times SO(2)$ is a dense noncompact subgroup).

2.2. Lie groups and Lie algebras
Definition 2.20. A subset $M$ of $\mathbb{R}^n$ is a $k$-dimensional smooth or $C^\infty$ manifold if it is locally diffeomorphic to $\mathbb{R}^k$ for some $k \le n$, i.e., for each point $x \in M$ there exist open sets $U, V \subset \mathbb{R}^n$, $U \ni x$, and a diffeomorphism $h : U \to V$ such that
$$h(U \cap M) = V \cap (\mathbb{R}^k \times \{0, 0, \ldots, 0\}) = \{y \in V : y_{k+1} = \cdots = y_n = 0\}.$$
Given two smooth manifolds $M$, $N$, the notion of smooth or $C^\infty$ map $h : M \to N$ follows immediately from the definition above (a smooth map is also called differentiable; a smooth bijection with a smooth inverse is a diffeomorphism).

Examples: the sphere $S^n$ in $\mathbb{R}^{n+1}$, a (one- or two-sheeted) hyperboloid in $\mathbb{R}^{n+1}$, a torus in $\mathbb{R}^3$, the Möbius strip in $\mathbb{R}^3$.

Definition 2.21. A Lie group is a group $G$ which is at the same time a smooth manifold such that the map $G \times G \to G$, $(g, h) \mapsto gh^{-1}$ is (infinitely often) differentiable.

Examples:
• Abelian Lie groups: finite dimensional vector spaces (with vector addition), $S^1$ (with multiplication of complex numbers).
• Matrix groups: $SO(2)$, $SO(3)$, $SU(2)$, $SO(1,3)$, $GL(n, \mathbb{R}) = GL(\mathbb{R}^n)$.

Note that, if $G$ is a Lie group and $H$ a closed subgroup of $G$, then the quotient $G/H$ is a smooth manifold. For instance, $SO(3)/SO(2) \simeq S^2$, the 2-sphere, and the two are isomorphic as smooth manifolds.
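Lie algebras: intuitive discussion

• The case of $SO(2)$

Every entry of the matrix
$$g(\varphi) = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} \in SO(2)$$
is an analytic function of $\varphi$. Expanding around the identity $\varphi = 0$, one gets
$$g(\varphi) = 1_2 - i\sigma\varphi + \cdots, \qquad \sigma = i\,\frac{dg}{d\varphi}\Big|_{\varphi = 0} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},$$
where $\sigma$ is the infinitesimal generator of the group. Conversely, using $(-i\sigma)^2 = -1_2$, one gets:
$$g(\varphi) = \exp(-i\sigma\varphi) = 1_2 - i\sigma\varphi + \frac{(-i\sigma\varphi)^2}{2!} + \frac{(-i\sigma\varphi)^3}{3!} + \cdots$$
Thus $\sigma$ defines completely the group structure of $SO(2)$ in a neighborhood of the identity and allows one to reconstruct every group element by exponentiation. Notice that $\sigma$ linearizes the group composition law in a neighborhood of the identity, in the sense that
$$g(\varphi)g(\varphi') = (1 - i\sigma\varphi + \cdots)(1 - i\sigma\varphi' + \cdots) = 1 - i\sigma(\varphi + \varphi') + \cdots$$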
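The exponentiation of the generator can be checked numerically. The sketch below assumes the generator $\sigma = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, which satisfies $(-i\sigma)^2 = -1_2$, sums the power series of $\exp(-i\sigma\varphi)$ up to a fixed number of terms, and compares the result with the rotation matrix. The helper names (`matmul`, `exp_series`, `max_diff`) and the truncation order are choices made for this illustration only.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(m, terms=30):
    """Truncated power series I + m + m^2/2! + ... for a 2x2 matrix m."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = matmul(term, m)                        # m^n / (n-1)!
        term = [[x / n for x in row] for row in term]  # m^n / n!
        for i in range(2):
            for j in range(2):
                result[i][j] += term[i][j]
    return result

sigma = [[0j, -1j], [1j, 0j]]   # assumed generator, sigma = i g'(0)

def g(phi):
    """Group element exp(-i sigma phi), computed from the series."""
    return exp_series([[-1j * phi * x for x in row] for row in sigma])

def rotation(phi):
    return [[math.cos(phi), -math.sin(phi)], [math.sin(phi), math.cos(phi)]]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

phi, psi = 0.7, -1.9
print(max_diff(g(phi), rotation(phi)))                  # series vs rotation
print(max_diff(matmul(g(phi), g(psi)), g(phi + psi)))   # composition law
```

Both printed numbers should be at the level of rounding errors, which illustrates that the single matrix $\sigma$ reconstructs the whole group and that angles add under composition.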
• The general case

In general, for studying the group structure of $G$ in a neighborhood of the identity, one considers "sufficiently many" one-parameter subgroups $g_j(\cdot)$:
$$g_j(s)g_j(t) = g_j(s + t), \quad s, t \in \mathbb{R}, \quad j = 1, 2, \ldots, p,$$
where $p$ is the dimension of $G$, that is, its dimension as a manifold. As before, one defines $\Sigma_j$, the infinitesimal generator of the $j$th subgroup $g_j$, so that $g_j(s) = \exp(-i\Sigma_j s)$. Then, the Lie algebra $\mathfrak{g}$ of $G$ is the vector space generated by the infinitesimal generators $\Sigma_j$, $j = 1, \ldots, p$. This definition may be visualized intuitively as follows:
• in the neighborhood of the identity $e$, the group $G$ may be seen as a (hyper)surface containing $e$;
• the one-parameter subgroups $g_j(s)$ are represented by curves on that surface, intersecting at $e$ (only);
• the generators $\Sigma_j$ are vectors tangent to these curves at $e$;
• finally, the Lie algebra $\mathfrak{g}$ is the vector space generated by these tangent vectors, that is, the plane tangent to the surface at $e$.

To give an example, consider the group $SU(2)$, which may be parameterized as
$$SU(2) \ni g = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}, \quad \text{with } a, b \in \mathbb{C},\ |a|^2 + |b|^2 = 1$$
($\bar{a}$ is the complex conjugate of $a$). Writing $a = x_1 + ix_2$, $b = x_3 + ix_4$, one gets $SU(2) \simeq S^3$, the unit sphere in $\mathbb{R}^4$, with equation:
$$x_1^2 + x_2^2 + x_3^2 + x_4^2 = 1.$$
The identity element is the point $(1, 0, 0, 0)$, the North Pole, and the Lie algebra $su(2)$ is the plane tangent to the sphere $S^3$ at the North Pole.

Commutation relations

Measuring the non-abelianness of the group $G$ amounts to evaluating the curvature of the surface that models it. Given two one-parameter subgroups $g_k(s) = e^{-is\Sigma_k}$, $g_\ell(t) = e^{-it\Sigma_\ell}$, their commutator
$$g_k(s)\,g_\ell(t)\,g_k(s)^{-1}g_\ell(t)^{-1} = I + st\,(\Sigma_\ell\Sigma_k - \Sigma_k\Sigma_\ell) + \cdots$$
measures the non-abelian character of $G$. The crucial point, which may be proven using techniques of differential geometry, is that the commutator $[X, Y] = XY - YX$ of two elements of the Lie algebra $\mathfrak{g}$ is again an element of $\mathfrak{g}$, namely,
$$[\Sigma_k, \Sigma_\ell] = i \sum_j c^j_{k\ell}\,\Sigma_j, \qquad (5)$$
where the $c^j_{k\ell}$ are called the structure constants of $\mathfrak{g}$. Thus the Lie algebra of $G$ is the vector space generated by the infinitesimal generators $\Sigma_j$, equipped with the bracket $[X, Y] = XY - YX$, an antisymmetric bilinear map of $\mathfrak{g} \times \mathfrak{g}$ into $\mathfrak{g}$. In addition, this bracket satisfies the Jacobi identity (which reflects the associativity of $G$), namely,
$$[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0, \quad \text{for all } X, Y, Z \in \mathfrak{g}. \qquad (6)$$
Lie algebra of a Lie group: formal definition

In order to state a precise definition of the Lie algebra, we have to resort to the language of differential geometry. Let us consider the action of the Lie group $G$ on itself by left translation: $L_g : G \to G$, $L_g(h) = gh$. The corresponding differential is a map between tangent bundles: $(dL_g)_h : T_hG \to T_{gh}G$. A vector field $X$ on $G$ is left-invariant if $dL_g X = X$, i.e., at $h \in G$, $(dL_g)_h X(h) = X(gh)$. The central result is the following.
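Theorem 2.22. The vector space $\mathfrak{g}$ of left-invariant vector fields on a Lie group $G$ is isomorphic to the tangent space at the neutral element: $\mathfrak{g} \simeq T_eG$.

Examples:
• $SO(3)$: the generators are $J_1, J_2, J_3$, where $J_k$ generates the rotations around the $x_k$-axis; they obey the commutation relations
$$[J_1, J_2] = iJ_3, \quad [J_2, J_3] = iJ_1, \quad [J_3, J_1] = iJ_2 \quad \Longleftrightarrow \quad [J_k, J_l] = i\epsilon_{klm}J_m,$$
where $\epsilon_{klm}$ is the totally antisymmetric unit tensor.
• $SU(2)$: the generators $\Sigma_1, \Sigma_2, \Sigma_3$ obey the commutation relations
$$[\Sigma_k, \Sigma_l] = i\epsilon_{klm}\Sigma_m.$$
These are identical to those of $SO(3)$! In other words, the two Lie algebras $so(3)$ and $su(2)$ are isomorphic (but not the groups $SO(3)$ and $SU(2)$, as we will see below). Note that $\Sigma_j = \frac{1}{2}\sigma_j$, where the $\sigma_j$ are the Pauli matrices, with commutation relations
$$[\sigma_k, \sigma_l] = 2i\epsilon_{klm}\sigma_m, \qquad \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
• $SO_o(1,3)$, the Lorentz group: the generators are $J_n$ (rotations) and $K_n$ (boosts), $n = 1, 2, 3$, with commutation relations
$$[J_l, K_m] = i\epsilon_{lmn}K_n, \qquad [K_l, K_m] = -i\epsilon_{lmn}J_n.$$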
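The $su(2)$ commutation relations and the Jacobi identity (6) can be verified directly on the matrices $\Sigma_j = \frac{1}{2}\sigma_j$. This is a minimal sketch: indices run over 0, 1, 2 instead of 1, 2, 3, and the helper names (`comm`, `eps`, `check_su2`, `jacobi_defect`) are introduced here for illustration only.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def eps(k, l, m):
    """Totally antisymmetric unit tensor on the indices 0, 1, 2."""
    return (k - l) * (l - m) * (m - k) // 2

pauli = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Generators Sigma_j = sigma_j / 2
sigma_half = [[[x / 2 for x in row] for row in s] for s in pauli]

def check_su2(tol=1e-12):
    """True if [Sigma_k, Sigma_l] = i eps_{klm} Sigma_m for all k, l."""
    for k in range(3):
        for l in range(3):
            lhs = comm(sigma_half[k], sigma_half[l])
            rhs = [[sum(1j * eps(k, l, m) * sigma_half[m][i][j]
                        for m in range(3))
                    for j in range(2)] for i in range(2)]
            if any(abs(lhs[i][j] - rhs[i][j]) > tol
                   for i in range(2) for j in range(2)):
                return False
    return True

def jacobi_defect(x, y, z):
    """Largest entry of [x,[y,z]] + [y,[z,x]] + [z,[x,y]]."""
    terms = [comm(x, comm(y, z)), comm(y, comm(z, x)), comm(z, comm(x, y))]
    return max(abs(sum(t[i][j] for t in terms))
               for i in range(2) for j in range(2))

print(check_su2())
print(jacobi_defect(*sigma_half))
```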
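More generally, we list the explicit correspondence between some of the so-called classical Lie groups and their Lie algebras, given by the relation $g(t) = e^{-itX}$:

• $SL(n, \mathbb{R})$: $\det g = 1$   ⟷   $sl(n)$: $\mathrm{tr}\,X = 0$
• $SO(n)$: $g^Tg = I$, $\det g = 1$   ⟷   $so(n)$: $X^T + X = 0$ (antisymmetric), $\mathrm{tr}\,X = 0$
• $SU(n)$: $g^\dagger g = I$, $\det g = 1$   ⟷   $su(n)$: $X^\dagger = X$ (hermitian), $\mathrm{tr}\,X = 0$
• $Sp(2n)$: $g^\dagger g = I$, $g^T h g = h$, with $h = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$   ⟷   $sp(2n)$: $X = \begin{pmatrix} Y & Z \\ Z^\dagger & -Y^T \end{pmatrix}$, with $Y^\dagger = Y$, $Z^T = Z$

For deriving the correspondences above, the following identities have been used:
$$(e^A)^T = e^{A^T}, \qquad (e^A)^\dagger = e^{A^\dagger}, \qquad \det e^A = e^{\mathrm{tr}\,A}, \qquad e^Ae^B = e^{A+B} \ \text{if } AB = BA.$$

Definition 2.23. An abstract Lie algebra is a vector space $\mathfrak{g}$ over $\mathbb{K}$ ($\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$) equipped with a bilinear, antisymmetric bracket $\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g} : (X, Y) \mapsto [X, Y]$ satisfying the Jacobi identity:
$$[\alpha X + \beta Y, Z] = \alpha[X, Z] + \beta[Y, Z], \quad \text{for all } X, Y, Z \in \mathfrak{g},\ \alpha, \beta \in \mathbb{K},$$
$$[X, Y] = -[Y, X], \quad \text{for all } X, Y \in \mathfrak{g},$$
$$[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0, \quad \text{for all } X, Y, Z \in \mathfrak{g}.$$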
Theorem 2.24. (Ado) Every real or complex finite dimensional Lie algebra is isomorphic to a matrix Lie algebra, where [X, Y] = XY — YX.
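As in the case of groups, comparing Lie algebras requires a proper notion of homomorphism. The natural definition reads as follows.

Definition 2.25. Let $\mathfrak{g}, \mathfrak{g}'$ be two Lie algebras over $\mathbb{K}$. A map $\varphi : \mathfrak{g} \to \mathfrak{g}'$ is a Lie algebra homomorphism if:
(i) $\varphi$ is linear: $\varphi(\alpha X + \beta Y) = \alpha\varphi(X) + \beta\varphi(Y)$, for all $X, Y \in \mathfrak{g}$, $\alpha, \beta \in \mathbb{K}$;
(ii) $\varphi$ preserves the bracket: $\varphi([X, Y]) = [\varphi(X), \varphi(Y)]$, for all $X, Y \in \mathfrak{g}$.

An integral curve of a vector field $V$ on a manifold $M$ is a curve $\gamma$ in $M$ whose tangent vector $\dot\gamma(t)$ at each point $t$ coincides with the value of the vector field there: $\dot\gamma(t) = V(\gamma(t))$. This notion applies, in particular, to elements of a Lie algebra, since these may be identified with left-invariant vector fields, according to Theorem 2.22. Thus the following definition makes sense.

Definition 2.26. The exponential map $\exp : \mathfrak{g} \to G$ is the evaluation at $t = 1$ of the integral curve $\Phi_X$ of $X \in \mathfrak{g}$ satisfying $\Phi_X(0) = e$: $\exp(X) = \Phi_X(1)$.

The exponential $\exp : \mathfrak{g} \to G$ is a smooth map which has the following properties:
• $\exp(0) = e$
• $\exp(-X) = [\exp(X)]^{-1}$
• $\exp((s + t)X) = \exp(sX) \cdot \exp(tX)$
• $\exp$ is a diffeomorphism from an open neighborhood of $0 \in \mathfrak{g}$ onto an open neighborhood of $e \in G$.
• All one-parameter subgroups of a Lie group $G$ have the form $\exp(tX)$ for some $X \in \mathfrak{g}$.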
Using these tools, Sophus Lie's theory may be summarized as follows.

Theorem 2.27. (Lie) (1) Every Lie group $G$ has a unique Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$ (obtained by derivation). (2) Local structure: To every Lie algebra $\mathfrak{g}$ corresponds a unique local Lie group $G_{\rm loc}$ such that $\mathfrak{g} \simeq \mathrm{Lie}(G_{\rm loc})$. (3) Global structure: To every (real) Lie algebra $\mathfrak{g}$ corresponds a unique simply connected Lie group $\widetilde{G}$ such that $\mathfrak{g} \simeq \mathrm{Lie}(\widetilde{G})$, obtained by the exponential map $\exp : \mathfrak{g} \to \widetilde{G}$. More precisely, if $\widetilde{G}$ is simply connected and $D$ is a discrete invariant (hence central) subgroup of $\widetilde{G}$, then $\widetilde{G}$ and $\widetilde{G}/D$ have the same Lie algebra and vice versa.

Remarks: (i) In a local Lie group, group operations are defined only in a neighborhood of the identity; (ii) 'Unique' always means 'up to isomorphism'.

Let now $G_1$ and $G_2$ be two Lie groups whose Lie algebras $\mathfrak{g}_1, \mathfrak{g}_2$ are isomorphic. Then, as a consequence of Theorem 2.27, either $G_1$ and $G_2$ are globally isomorphic, or $G_1$ and $G_2$ are homomorphic images of the same simply connected group $\widetilde{G}$, called the universal covering group of $G_1$ and $G_2$. In addition, for $j = 1, 2$, $G_j$ is the quotient of $\widetilde{G}$ by a discrete central subgroup $D_j$, $G_j = \widetilde{G}/D_j$, and the kernel of the homomorphism $\widetilde{G} \to G_j \simeq \widetilde{G}/D_j$ is the homotopy group of $G_j$. Thus the general situation may be represented by the following picture:

[Diagram: the universal covering group $\widetilde{G}$ (simply connected) projects onto the quotients $\widetilde{G}/D_1$, $\widetilde{G}/D_2$, ..., $\widetilde{G}/D_n$; all of them yield the same Lie algebra $\mathfrak{g}$ by derivation.]
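Examples:

For the classical groups that we are mostly interested in, the general scheme yields the following results.
• $SU(n)$, $n \ge 2$, is simply connected;
• $SO(n)$, $n \ge 3$, is doubly connected; its universal covering $\widetilde{SO(n)}$, called $\mathrm{Spin}(n)$, is not a matrix group, for $n > 6$. In particular: $\mathrm{Spin}(3) = \widetilde{SO(3)} = SU(2)$, $\mathrm{Spin}(4) = \widetilde{SO(4)} = SU(2) \times SU(2)$.
• $SO(4) \simeq (SU(2) \times SU(2))/\mathbb{Z}_2$, corresponding to $so(4) \simeq so(3) \oplus so(3) = su(2) \oplus su(2)$.

[Diagrams: in each case the simply connected group covers its quotients, and all groups in a chain share the same Lie algebra, obtained by derivation and recovered by exp.]
• $SU(2) = \widetilde{SO(3)}$ covers $SO(3) \simeq SU(2)/\mathbb{Z}_2$; Lie algebra $su(2) \simeq so(3)$.
• $SU(3)$ covers $SU(3)/\mathbb{Z}_3$; Lie algebra $su(3)$.
• $SU(4)$ covers $SU(4)/\mathbb{Z}_2$ and $SU(4)/\mathbb{Z}_4 = (SU(4)/\mathbb{Z}_2)/\mathbb{Z}_2$; Lie algebra $su(4)$.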
2.3. Simple and semisimple Lie algebras
According to Lie's theorems, there is a systematic parallelism between the properties of Lie groups and those of their Lie algebras. In particular, the following notions correspond to each other.
• $H \subset G$: Lie subgroup ⟺ $\mathfrak{h} \subset \mathfrak{g}$: Lie subalgebra
• $H \triangleleft G$: invariant subgroup ⟺ $\mathfrak{h} \triangleleft \mathfrak{g}$: ideal
• $G$ simple: no nontrivial invariant subgroup ⟺ $\mathfrak{g}$ simple Lie algebra: no nontrivial ideal
• $G$ semisimple: no nontrivial abelian invariant subgroup ⟺ $\mathfrak{g}$ semisimple Lie algebra: no nontrivial abelian ideal

Thus we get the following equivalence for semisimple Lie groups:
$$G_{ss} = (G^s_1 \times \cdots \times G^s_n)/D \quad \Longleftrightarrow \quad \mathfrak{g}_{ss} = \mathfrak{g}^s_1 \oplus \cdots \oplus \mathfrak{g}^s_n$$
("s" = simple, "ss" = semisimple).

The next step is to find a criterion of semisimplicity for a given Lie algebra $\mathfrak{g}$. This is achieved in terms of the so-called Killing form, which defines a metric on $\mathfrak{g}$. First we define the adjoint representation of $\mathfrak{g}$:
$$X \mapsto \mathrm{ad}\,X, \qquad \mathrm{ad}\,X(Y) = [X, Y]. \qquad (7)$$
Let $\{e_i,\ i \in I\}$ be a basis of $\mathfrak{g}$ and $X, Y \in \mathfrak{g}$, $X = x^l e_l$, $Y = y^k e_k$. Then we have
$$(\mathrm{ad}\,X(Y))^i = [X, Y]^i = c^i_{lk}\,x^l y^k, \quad i \in I, \quad \text{i.e.,} \quad (\mathrm{ad}\,X)^i_k = c^i_{lk}\,x^l,$$
where $\{c^i_{lk}\}$ are called the structure constants of $\mathfrak{g}$.

Definition 2.28. The Killing form of $\mathfrak{g}$ is the bilinear symmetric form (scalar product) given by
$$\langle X, Y \rangle = \mathrm{Tr}(\mathrm{ad}\,X\ \mathrm{ad}\,Y). \qquad (8)$$
In coordinates:
$$\langle X, Y \rangle = (\mathrm{ad}\,X)^i_k\,(\mathrm{ad}\,Y)^k_i = c^i_{lk}\,x^l\,c^k_{si}\,y^s = g_{ls}\,x^l y^s,$$
where the second order symmetric tensor $g_{ls} = c^i_{lk}\,c^k_{si}$ is called the Cartan metric of $\mathfrak{g}$.

It is easy to check that the Killing form is invariant under an automorphism $\psi$ of $\mathfrak{g}$: $\langle \psi(X), \psi(Y) \rangle = \langle X, Y \rangle$.

Theorem 2.29. (Cartan's criterion) The Lie algebra $\mathfrak{g}$ is semisimple if and only if its Killing form is regular: $\det g_{kl} \neq 0$.
Thus the Killing form turns $\mathfrak{g}$ into a metric space (scalar product). A fundamental result of Cartan is the following.

Theorem 2.30. A semisimple Lie group $G$ is compact if and only if the Killing form of its real Lie algebra $\mathfrak{g}$ is negative definite. In that case, the Lie algebra $\mathfrak{g}$ is also said to be compact and $\mathfrak{g}$ is a Euclidean space.

Take for instance the group $SO(3)$, which is simple and compact:
$$[A_i, A_j] = \sum_k \epsilon_{ijk}A_k \quad \Longrightarrow \quad g_{im} = \sum_{j,k} \epsilon_{ijk}\,\epsilon_{mkj} = -2\delta_{im}.$$
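The Cartan metric $g_{ls} = c^i_{lk}\,c^k_{si}$ of $so(3)$ can be computed mechanically from the structure constants $c^k_{ij} = \epsilon_{ijk}$. The sketch below stores $c^i_{lk}$ as `c[l][k][i]` with indices 0, 1, 2; this storage convention and the function names are choices made for the illustration.

```python
def eps(i, j, k):
    """Totally antisymmetric unit tensor on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def cartan_metric(c, n):
    """g_{ls} = sum over i, k of c^i_{lk} c^k_{si}, with c[l][k][i] = c^i_{lk}."""
    return [[sum(c[l][k][i] * c[s][i][k] for i in range(n) for k in range(n))
             for s in range(n)] for l in range(n)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Structure constants of so(3): [A_l, A_k] = sum_i eps_{lki} A_i
c = [[[eps(l, k, i) for i in range(3)] for k in range(3)] for l in range(3)]

g = cartan_metric(c, 3)
print(g)          # the Cartan metric, expected to be -2 times the identity
print(det3(g))    # nonzero, in agreement with Cartan's criterion
```

The metric is regular and negative definite, which matches Theorem 2.29 (semisimplicity) and Theorem 2.30 (compactness) for $SO(3)$.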
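As we have seen above, every semisimple Lie algebra $\mathfrak{g}$ is a direct sum of simple Lie algebras:
$$\mathfrak{g} = \bigoplus_{i \in I} \mathfrak{g}_i, \quad \text{with } [\mathfrak{g}_i, \mathfrak{g}_i] \subset \mathfrak{g}_i,\ \mathfrak{g}_i \text{ simple and } [\mathfrak{g}_i, \mathfrak{g}_j] = 0 \text{ if } i \neq j.$$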
Example: $so(4) = su(2) \oplus su(2)$ ⟺ $SO(4) \simeq [SU(2) \times SU(2)]/\mathbb{Z}_2$.

Thus, in order to classify the semisimple Lie algebras, it suffices to classify all the simple Lie algebras. This has been done by Cartan in his thesis (1894), another masterpiece of mathematics. We will now sketch this fundamental result.

Cartan's method of classification consists in choosing a standard basis in the Lie algebra and translating its properties in graphical terms, which results in the so-called root diagram. Then the latter can be completely classified. To that effect, one considers first a complex Lie algebra $\mathfrak{g}$ (i.e., with complex parameters) and one solves the eigenvalue equation
$$\mathrm{ad}\,A(X) = \alpha X, \quad \text{i.e.,} \quad [A, X] = \alpha X, \quad \alpha \in \mathbb{C}. \qquad (9)$$
Then the results may be summarized as follows. Let $\mathfrak{g}$ be a semisimple Lie algebra and $A \in \mathfrak{g}$ an element with a maximal number of distinct eigenvalues. Then:

(1) 0 is the only degenerate eigenvalue; its multiplicity $l$ is called the rank of $\mathfrak{g}$.

(2) Choose $l$ independent eigenvectors associated to 0, $H_i$, $i = 1, 2, \ldots, l$: $\mathrm{ad}\,A(H_i) = [A, H_i] = 0$. Since $[A, A] = 0$, $A = \sum_i c_i H_i$ and one can choose $A = H_1$. The abelian subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ generated by $\{H_i,\ i = 1, 2, \ldots, l\}$ is called a Cartan subalgebra.
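(3) Let $E_\alpha$ be an eigenvector of $H_1$ associated to the (nondegenerate) eigenvalue $\alpha_1$: $[H_1, E_\alpha] = \alpha_1 E_\alpha$. Using the Jacobi identity, one gets
$$[H_i, E_\alpha] = \alpha_i E_\alpha, \quad i = 1, 2, \ldots, l.$$
Therefore, all the $H_i$ are diagonalized simultaneously. Denote by $\alpha_1, \alpha_2, \ldots, \alpha_l$ the eigenvalues of the common eigenvector $E_\alpha$. The vector $\alpha = \{\alpha_i\}_{i = 1, \ldots, l} \in \mathbb{R}^l$ is called a root vector and the set of all roots is called the root diagram of $\mathfrak{g}$. Thus the root diagram is a set of vectors in $\mathbb{R}^l$, where $l$ is the rank of $\mathfrak{g}$.

In the standard (Cartan) basis $\{H_i,\ i = 1, 2, \ldots, l;\ E_\alpha\}$ of $\mathfrak{g}$ (complexified), the commutation relations take the form:
$$[H_i, H_j] = 0, \quad i, j = 1, 2, \ldots, l,$$
$$[H_i, E_\alpha] = \alpha_i E_\alpha,$$
$$[E_\alpha, E_{-\alpha}] = \alpha^i H_i, \qquad (10)$$
$$[E_\alpha, E_\beta] = \begin{cases} N_{\alpha\beta}\,E_{\alpha+\beta}, & \text{if } (\alpha + \beta) \text{ is a nonzero root,} \\ 0, & \text{if } (\alpha + \beta) \text{ is not a root.} \end{cases}$$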
The set of roots has the following properties:
• If $\alpha$ is a root, $k\alpha$ is a root if and only if $k = \pm 1$.
• The set of all roots is invariant with respect to the reflection in the hyperplane perpendicular to the pair $\pm\alpha$.
• All these reflections generate a finite group $W$, called the Weyl group.

Example 1: $so(3)$, with rank 1 and commutation relations
$$[J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = J_3, \qquad \text{where } J_\pm = \frac{1}{\sqrt{2}}(J_1 \pm iJ_2).$$
Root diagram: three points on a line, corresponding to $J_-$, $J_3$, $J_+$.

Fig. 1. The root diagram of SU(2).
Example 2: $su(3) = A_2$, of rank 2

The root diagram consists of six nonzero roots, the tips of which draw a regular hexagon. The Weyl group is isomorphic to the 6-element permutation group $S_3$, generated by reflections with respect to the 3 lines orthogonal to roots (dashed lines in Fig. 2, see Ref. 11).

Fig. 2. The root diagram of SU(3), with $\alpha_3 = \alpha_1 + \alpha_2$ (from Ref. 11).
The root diagram has the following further properties:

(i) For any pair of roots $\alpha, \beta$:
• the ratio $\dfrac{2(\alpha\beta)}{(\alpha\alpha)}$ is an integer;
• $\beta - \dfrac{2(\alpha\beta)}{(\alpha\alpha)}\,\alpha$ is a root (reflection of $\beta$ with respect to the line perpendicular to $\pm\alpha$).

(ii) If $\theta_{\alpha\beta}$ is the angle between the roots $\alpha, \beta$, one has
$$\cos^2\theta_{\alpha\beta} = \frac{(\alpha\beta)^2}{(\alpha\alpha)(\beta\beta)} = \frac{mn}{4}, \quad m, n \in \mathbb{N}.$$
As a consequence, only a few values are allowed for the angles $\theta_{\alpha\beta}$ and the ratio of lengths, as follows:

$\theta_{\alpha\beta}$:         30°    45°    60°   90°         120°   135°   150°   180°
Ratio of lengths:  $\sqrt{3}$   $\sqrt{2}$   1     arbitrary   1      $\sqrt{2}$   $\sqrt{3}$   1
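These properties can be tested on the root diagram of $su(3)$. The sketch below assumes explicit coordinates for the six roots (two simple roots of unit length at 120°, their sum, and the three opposites); these coordinates and the function names are not taken from the text. It checks that $2(\alpha\beta)/(\alpha\alpha)$ is always an integer, that reflections map roots to roots, and that the reflections generate a group of 6 elements.

```python
import math

# Assumed coordinates: two simple roots of equal length at 120 degrees.
a1 = (1.0, 0.0)
a2 = (-0.5, math.sqrt(3) / 2)
a3 = (a1[0] + a2[0], a1[1] + a2[1])
roots = [a1, a2, a3] + [(-x, -y) for (x, y) in (a1, a2, a3)]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def cartan_integer(a, b):
    """2 (a.b) / (a.a) for the pair of roots a, b."""
    return 2 * dot(a, b) / dot(a, a)

def reflect(b, a):
    """Reflection of b with respect to the line perpendicular to a."""
    n = cartan_integer(a, b)
    return (b[0] - n * a[0], b[1] - n * a[1])

def index_of(v, tol=1e-9):
    """Index of v in the list of roots, or None if v is not a root."""
    for i, r in enumerate(roots):
        if abs(v[0] - r[0]) < tol and abs(v[1] - r[1]) < tol:
            return i
    return None

all_integers = all(
    abs(cartan_integer(a, b) - round(cartan_integer(a, b))) < 1e-9
    for a in roots for b in roots)
closed = all(index_of(reflect(b, a)) is not None for a in roots for b in roots)

def as_permutation(a):
    """The reflection associated with a, as a permutation of root indices."""
    return tuple(index_of(reflect(b, a)) for b in roots)

def generated_group(generators):
    """All permutations obtained by composing the given generators."""
    identity = tuple(range(len(roots)))
    group = {identity}
    frontier = [identity]
    while frontier:
        p = frontier.pop()
        for s in generators:
            q = tuple(s[i] for i in p)   # apply p first, then s
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

weyl = generated_group({as_permutation(a) for a in roots})
print(all_integers, closed, len(weyl))
```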
Classification of simple Lie algebras. Dynkin diagrams

One can define an order relation on root systems:
• $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_l)$ is positive if the first nonzero component is positive (lexicographic order),
• $\beta > \alpha$ if $\beta - \alpha > 0$.

A root is said to be simple if it is positive and cannot be decomposed into the sum of two positive roots. Then:
• If $\mathfrak{g}$ has rank $l$, there exist $l$ linearly independent simple roots (i.e., simple roots are a basis of $\mathbb{R}^l$),
• If $\alpha, \beta$ are simple roots, the angle $\theta_{\alpha\beta}$ takes only the values 90°, 120°, 135°, 150°.

These facts are encoded graphically in terms of the so-called Dynkin diagrams. The principles are the following:
• Each simple root is represented by a small circle.
• The number of links between two circles is 0, 1, 2 or 3 whenever the angle between the corresponding roots is 90°, 120°, 135° or 150°, respectively. Example: three simple roots $r_1, r_2, r_3$ with angles $(r_1, r_2) = 120°$, $(r_2, r_3) = 135°$, $(r_1, r_3) = 90°$ give the diagram $r_1$ O---O===O $r_3$, with $r_2$ in the middle.
• Closed loops are forbidden.
• Each circle can support at most three links.
• Simple roots can have two different lengths only; one uses white disks for the short ones, black disks for the long ones.
The result of Cartan's analysis is that there exist four infinite series of simple complex Lie algebras, corresponding to classical groups, plus five exceptional algebras (with no associated classical group). The four infinite series are $A_l$ ($l \ge 1$), $B_l$ ($l \ge 2$), $C_l$ ($l \ge 3$) and $D_l$ ($l \ge 4$), where $l$ denotes the rank of the algebra and the restrictions on $l$ guarantee that all algebras are different. Indeed, there are some isomorphisms for the lower ranks:
$$A_1 \simeq B_1 \simeq C_1, \quad B_2 \simeq C_2, \quad A_3 \simeq D_3, \quad D_2 \simeq A_1 \oplus A_1. \qquad (11)$$
The five exceptional algebras are denoted $G_2$, $F_4$, $E_6$, $E_7$ and $E_8$. All the simple Lie algebras are listed in Table 2.3, together with their Dynkin diagrams.

The next step is to list the real forms of the simple Lie algebras and the corresponding Lie groups. Let $\mathfrak{g}_0$ be a real Lie algebra. Its complexification is the complex Lie algebra $\mathfrak{g} = \mathfrak{g}_0^c$ consisting of all elements of the form $X + iY$, $X, Y \in \mathfrak{g}_0$, the bracket being extended by linearity. Conversely, a real form of $\mathfrak{g}$ is a real Lie algebra $\mathfrak{g}_1$ such that $\mathfrak{g}$ is isomorphic to the complexification of $\mathfrak{g}_1$. Of course, a complex Lie algebra may have several nonisomorphic real forms. There is one fundamental restriction, however.

Theorem 2.31. Every semisimple complex Lie algebra has a real form which is compact.

Starting from this compact real form, one may now obtain a noncompact one, as follows. The tool is the notion of involutive automorphism of the compact Lie algebra $\mathfrak{g}$, that is, an automorphism $\sigma$ of $\mathfrak{g}$ such that $\sigma^2 = I$. Such a map $\sigma$ has eigenvalues $\pm 1$ and it splits $\mathfrak{g}$ into eigensubspaces: $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$, where the eigenspace $\mathfrak{k}$ corresponds to the eigenvalue $+1$, i.e., it is the set of fixed points of $\sigma$. The commutation relations of $\mathfrak{g}$ are the following:
$$[\mathfrak{k}, \mathfrak{k}] \subset \mathfrak{k} \ \text{(thus } \mathfrak{k} \text{ is a subalgebra)}, \qquad [\mathfrak{k}, \mathfrak{p}] = \mathfrak{p}, \qquad [\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{k}.$$
Then the Lie algebra $\mathfrak{g}^* = \mathfrak{k} \oplus \mathfrak{p}^*$, where $\mathfrak{p}^* = i\mathfrak{p}$, is another real form of $\mathfrak{g}^c$, and it is noncompact (this construction is called Weyl's unitary trick). The commutation relations of $\mathfrak{g}^*$ read:
$$[\mathfrak{k}, \mathfrak{k}] \subset \mathfrak{k} \ \text{(} \mathfrak{k} \text{ is still a subalgebra)}, \qquad [\mathfrak{k}, \mathfrak{p}^*] = \mathfrak{p}^*, \qquad [\mathfrak{p}^*, \mathfrak{p}^*] \subset -\mathfrak{k}.$$
Thus classifying the real forms amounts to classifying the involutive automorphisms, and again the result was obtained by Cartan. For instance, the complex Lie algebra $B_2$ has three different real forms, namely, $so(5)$, which is the compact one, $so(4,1)$ and $so(3,2)$, which are noncompact.

All this extends to the corresponding Lie groups. In the example of $B_2$, one gets $SO(5)$, which is compact, $SO(4,1)$, the de Sitter group, and $SO(3,2)$, the Anti-de Sitter group, which are both noncompact. For each nonexceptional Lie algebra, we list in Table 2.3 the real forms of the corresponding classical Lie groups. For each of them, there is one compact form ($SU(l+1)$, $SO(2l+1)$, $Sp(2l)$, $SO(2l)$) and several noncompact forms (for $l \ge 3$). The isomorphisms of the low-rank Lie algebras (11) in turn entail local isomorphisms for the corresponding groups. For instance, $SU(2) \simeq SO(3) \simeq Sp(2)$ or $SU(4) \simeq SO(6)$ locally (but not always globally).

2.4. Integration on a locally compact group
The main advantage of locally compact groups is that they allow a theory of integration, which will prove crucial in a number of situations. Let $G$ be a locally compact group, in particular a Lie group. Then one defines:

• A left invariant measure on $G$, that is, a measure $\mu_L$ on $G$ which satisfies the following relation for any $\mu_L$-integrable function $f$:
$$\int_G f(g_0 g)\,d\mu_L(g) = \int_G f(g)\,d\mu_L(g), \quad \text{for all } g_0 \in G, \qquad (12)$$
or, equivalently, $d\mu_L(g_0^{-1}g) = d\mu_L(g)$, or $\mu_L(g_0^{-1}E) = \mu_L(E)$, for every Borel set $E$ of $G$.

• A right invariant measure $\mu_R$ on $G$:
$$\int_G f(g g_0)\,d\mu_R(g) = \int_G f(g)\,d\mu_R(g), \qquad (13)$$
or, equivalently, $d\mu_R(g g_0^{-1}) = d\mu_R(g)$, or $\mu_R(E g_0^{-1}) = \mu_R(E)$, for every Borel set $E$ of $G$.

Then the fundamental result of Haar is the following.

Theorem 2.32. Up to normalization, every locally compact group possesses a unique left invariant measure $\mu_L$ and a unique right invariant measure $\mu_R$. These two measures, called Haar measures, are equivalent.
Abstract Lie algebra      Real form of corresponding Lie group
$A_l$ ($l \ge 1$)             $SU(l+1)$ or $SU(p,q)$, $p + q = l + 1$
$B_l$ ($l \ge 2$)             $SO(2l+1)$ or $SO(p,q)$, $p + q = 2l + 1$
$C_l$ ($l \ge 3$)             $Sp(2l)$ or $Sp(p,q)$, $p + q = 2l$
$D_l$ ($l \ge 4$)             $SO(2l)$ or $SO(p,q)$, $p + q = 2l$
$F_4$, $G_2$, $E_6$, $E_7$, $E_8$    (exceptional algebras)

Table 2.3. Dynkin diagrams for the simple Lie algebras.
Remarks: (1) Two measures are called equivalent if they have the same sets of measure zero. (2) If $\mu$ is a left invariant measure, then $\check\mu$, image of $\mu$ by the homeomorphism $g \mapsto g^{-1}$, is a right invariant measure and vice versa:
$$\langle \check\mu, f \rangle = \langle \mu, \check f \rangle, \qquad \check f(g) = f(g^{-1}).$$
Since $\mu_L \sim \mu_R$, there exists a continuous function $\Delta : G \to \mathbb{R}^+$, called the modular function, such that:
$$d\mu_L(g) = \Delta(g)\,d\mu_R(g). \qquad (14)$$
The modular function has the following properties:
(1) $\Delta(g) > 0$, for all $g \in G$,
(2) $\Delta(e) = 1$,
(3) $\Delta(g_1)\Delta(g_2) = \Delta(g_1g_2)$, for all $g_1, g_2 \in G$.
In other words, $\Delta$ is a character of the group $G$. One has also:
$$d\mu_R(g) = \Delta(g^{-1})\,d\mu_L(g) = d\mu_L(g^{-1}), \qquad d\mu_L(gg') = \Delta(g')\,d\mu_L(g).$$
The group $G$ is said to be unimodular if $\Delta(g) = 1$, for all $g \in G$, i.e., $\mu_L = \mu_R$.

Examples of unimodular groups:
• Abelian groups
• Compact groups
• Simple and semisimple groups
• Inhomogeneous groups: $E(3)$, $\mathcal{P}(1,3)$, ...
• Discrete groups.
Examples of nonunimodular groups:
• The affine group of $\mathbb{R}$: $\{(b, a) : b \in \mathbb{R},\ a \in \mathbb{R},\ a \neq 0\}$, i.e., $\mathbb{R} \rtimes \mathbb{R}_*$
• The $ax + b$ subgroup of the affine group ($a > 0$), i.e., $\mathbb{R} \rtimes \mathbb{R}^+_*$
• The similitude group $SIM(n) = \mathbb{R}^n \rtimes (\mathbb{R}^+_* \times SO(n))$.

The Haar measures provide an easy criterion for compactness of the group: $G$ is compact if and only if $\mathrm{vol}\,G < \infty$, where
$$\mathrm{vol}\,G = \int_G d\mu_L(g) = \int_G d\mu_R(g).$$
Examples:
• $SO(n)$, $SU(n)$ are compact;
• $SO(p,q)$, $SU(p,q)$, $\mathbb{R}^n$, $\mathcal{P}(1,3)$ are noncompact.
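The nonunimodularity of the $ax + b$ group can be made concrete. The sketch below assumes the composition law $(b, a)(b', a') = (b + ab', aa')$ and the standard candidate Haar measures $d\mu_L = db\,da/a^2$ and $d\mu_R = db\,da/a$, which are not stated in the text. Invariance of a measure with density $\rho$ under a translation $\tau$ amounts to $\rho(\tau(g))\,|\det J_\tau(g)| = \rho(g)$; the Jacobian is estimated by central differences. All function names are illustrative.

```python
def mul(g1, g2):
    """(b, a)(b', a') = (b + a b', a a'), composition of the maps x -> a x + b."""
    (b1, a1), (b2, a2) = g1, g2
    return (b1 + a1 * b2, a1 * a2)

def jacobian_det(f, g, h=1e-6):
    """Determinant of the 2x2 Jacobian of f at g, by central differences."""
    b, a = g
    fb_p, fb_m = f((b + h, a)), f((b - h, a))
    fa_p, fa_m = f((b, a + h)), f((b, a - h))
    d00 = (fb_p[0] - fb_m[0]) / (2 * h)
    d10 = (fb_p[1] - fb_m[1]) / (2 * h)
    d01 = (fa_p[0] - fa_m[0]) / (2 * h)
    d11 = (fa_p[1] - fa_m[1]) / (2 * h)
    return d00 * d11 - d01 * d10

def left_density(g):
    """Assumed density of the left Haar measure: db da / a^2."""
    return 1.0 / g[1] ** 2

def right_density(g):
    """Assumed density of the right Haar measure: db da / a."""
    return 1.0 / g[1]

def invariance_defect(density, translate, g):
    """density(translate(g)) |J| - density(g); zero for an invariant measure."""
    return density(translate(g)) * abs(jacobian_det(translate, g)) - density(g)

def modular(g):
    """Delta(g) defined by d mu_L = Delta d mu_R, here equal to 1/a."""
    return left_density(g) / right_density(g)

g0 = (0.4, 2.5)
g = (-1.3, 0.8)
left = invariance_defect(left_density, lambda x: mul(g0, x), g)
right = invariance_defect(right_density, lambda x: mul(x, g0), g)
mixed = invariance_defect(left_density, lambda x: mul(x, g0), g)
print(left, right, mixed)
print(modular(mul(g0, g)), modular(g0) * modular(g))
```

The first two defects vanish up to rounding, while the third one does not: the left measure is not right invariant, so the group is not unimodular. The last line illustrates the character property $\Delta(g_1)\Delta(g_2) = \Delta(g_1g_2)$.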
Note: A similar discussion may be made for measures on homogeneous spaces $X = G/H$, but there is an essential difference. Indeed, a homogeneous space does not always admit an invariant measure; some (known) criteria have to be satisfied. However, it always admits a quasi-invariant measure, i.e., a measure equivalent (but not necessarily equal) to its translates.

3. Mathematical Tools II: Representations

3.1. Basic notions
Definition 3.1. A linear representation of a group $G$ in a vector space $V$ is a homomorphism $T : G \to GL(V)$, where $GL(V)$ denotes the set of all invertible linear operators on $V$:
$$T(g_1g_2) = T(g_1)T(g_2), \quad \text{for all } g_1, g_2 \in G.$$
It follows that $T(g^{-1}) = T(g)^{-1}$ and $T(e) = I$. The dimension of $T$ is defined as the dimension of $V$.
The most useful case is that where $V$ is a Hilbert space $\mathfrak{H}$ and the operators $T(g)$ are bounded. Then $T$ is a homomorphism of $G$ into $GL(\mathfrak{H})$, the set of bounded operators with bounded inverse. If $G$ is a Lie group, $T$ is called strongly continuous if
$$\|(T(g) - I)\phi\| \to 0, \quad \text{when } g \to e, \quad \text{for all } \phi \in \mathfrak{H}. \qquad (15)$$
Definition 3.2. When $\mathfrak{H}$ is a Hilbert space, the representation $T$ is called unitary if $T(g)$ is a unitary operator for every $g \in G$, i.e.,
$$\langle T(g)f \,|\, T(g)h \rangle = \langle f \,|\, h \rangle, \quad \text{for all } g \in G, \text{ for all } f, h \in \mathfrak{H}.$$
Examples:
• $G = SO(2)$, $\mathfrak{H} = L^2(S^1)$
P W / 1 M = /(V> +
$f \in L^2(S^2)$, $g \in SO(3)$, $\omega \in S^2$. (16)

Both representations are unitary and infinite dimensional.
When $G$ is locally compact, one can define the Hilbert spaces of functions which are square integrable with respect to the left or the right Haar measures. Accordingly, one defines the two regular representations:

• Left regular representation:
$$[U_L(g_0)f](g) = f(g_0^{-1}g), \quad g_0, g \in G, \quad f \in \mathfrak{H} = L^2(G, d\mu_L). \qquad (17)$$
• Right regular representation:
$$[U_R(g_0)f](g) = f(gg_0), \quad g_0, g \in G, \quad f \in \mathfrak{H} = L^2(G, d\mu_R). \qquad (18)$$
UL and UR are unitary and unitarily equivalent. Let now H be a closed subgroup of G such that X = G/H possesses a left invariant measure /z. Then one can define the unitary quasi-regular representation: [UqL(g)f}(x) = J^^f(g-lx),feSj
= L2(X,dvL).
(19)
In this expression, the square root factor is a Radon-Nikodym derivative, which compensates for the non-invariance of the measure.
Examples:
• $G = SO(3)$, $H = SO(2)$, $X = S^2$.
• $G = SO_0(1,3)$, $H = SO(3)$, $X = SO_0(1,3)/SO(3)$ = two-sheeted hyperboloid.

Definition 3.3. Two representations $T_1$ and $T_2$, in the Hilbert spaces $\mathfrak{H}_1, \mathfrak{H}_2$, respectively, are equivalent if there exists an invertible operator $S : \mathfrak{H}_1 \to \mathfrak{H}_2$ such that
$$T_2(g) = S\,T_1(g)\,S^{-1}, \quad \text{for all } g \in G. \tag{20}$$
In this case, we write $T_1 \sim T_2$. Two unitary representations $T_1, T_2$ are unitarily equivalent if the operator $S$ in (20) is unitary. The set of unitary equivalence classes of unitary irreducible representations of $G$ is called the dual of $G$ and is denoted by $\hat{G}$.

Definition 3.4. If $T$ is finite dimensional, one calls character of $T$ the complex-valued function $\chi_T$ on $G$ given by the trace of the matrix $T(g)$:
$$\chi_T(g) = \mathrm{Tr}\,T(g) = \sum_{i=1}^{\dim T} [T(g)]_{ii}.$$
Clearly, equivalent representations have the same character:
$$\chi_{STS^{-1}}(g) = \mathrm{Tr}\,(S\,T(g)\,S^{-1}) = \mathrm{Tr}\,T(g) = \chi_T(g), \quad g \in G.$$

3.2. Irreducibility of representations
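Given a representation $T$ in the Hilbert space $\mathfrak{H}$, a subspace $\mathfrak{H}_1 \subset \mathfrak{H}$ is said to be invariant for $T$ if $h \in \mathfrak{H}_1$ implies $T(g)h \in \mathfrak{H}_1$, for all $g \in G$.

Proposition 3.5. Let $T$ be a unitary representation in $\mathfrak{H}$, $\mathfrak{H}_1$ a subspace of $\mathfrak{H}$, and $P$ the orthogonal projection onto $\mathfrak{H}_1$. Then:
(1) $\mathfrak{H}_1^\perp$ is invariant if and only if $\mathfrak{H}_1$ is invariant.
(2) $\mathfrak{H}_1$ is invariant if and only if $P\,T(g) = T(g)\,P$, for all $g \in G$.

Definition 3.6. The representation $T$ in $\mathfrak{H}$ is called irreducible if there exists no nontrivial invariant subspace in $\mathfrak{H}$ (i.e., different from $\{0\}$ or $\mathfrak{H}$). Otherwise, $T$ is called reducible.

Let $\mathfrak{H}_1$ be an invariant subspace for $T$, $\mathfrak{H} = \mathfrak{H}_1 \oplus \mathfrak{H}_1^\perp$. Then one may write
$$T(g) = \begin{pmatrix} T_1(g) & A(g) \\ 0 & T_2(g) \end{pmatrix},$$
where $T_1(g) : \mathfrak{H}_1 \to \mathfrak{H}_1$, $T_2(g) : \mathfrak{H}_1^\perp \to \mathfrak{H}_1^\perp$, $A(g) : \mathfrak{H}_1^\perp \to \mathfrak{H}_1$.
One says that $T$ is completely reducible if $\mathfrak{H}_1^\perp$ is invariant whenever $\mathfrak{H}_1$ is invariant. In that case, one gets $A(g) = 0$ and
$$T(g) = \begin{pmatrix} T_1(g) & 0 \\ 0 & T_2(g) \end{pmatrix}.$$
Then one writes $T = T_1 \oplus T_2$, acting in $\mathfrak{H} = \mathfrak{H}_1 \oplus \mathfrak{H}_2$ (direct sum), with $\mathfrak{H}_2 = \mathfrak{H}_1^\perp$. If $T$ is reducible, but not completely reducible, it is called indecomposable.
Example: $G = \mathbb{R}$, $\mathfrak{H} = \mathbb{C}^2$,
$$T(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$$
is indecomposable: the subspace spanned by $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is invariant, the one spanned by $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ is not.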
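The indecomposable example can be checked by hand or by a few lines of code. The sketch below is only an illustration in Python (the language and the helper names `T`, `matmul`, `apply_to` and `is_multiple` are choices made here, not part of the text). It takes the upper triangular matrices $T(x)$ with rows $(1, x)$ and $(0, 1)$, verifies the homomorphism property $T(x)T(y) = T(x+y)$, and tests whether the lines spanned by $(1,0)$ and $(0,1)$ are mapped into themselves.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def T(x: float) -> Matrix:
    """Upper triangular 2x2 matrix representing the real number x."""
    return [[1.0, x], [0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply_to(a: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for a 2x2 matrix."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


def is_multiple(v: Vector, w: Vector, tol: float = 1e-12) -> bool:
    """True if v and w are collinear (vanishing 2x2 determinant)."""
    return abs(v[0] * w[1] - v[1] * w[0]) < tol


x, y = 0.7, -1.3  # arbitrary sample group elements
product, expected = matmul(T(x), T(y)), T(x + y)
homomorphism_ok = all(abs(product[i][j] - expected[i][j]) < 1e-12
                      for i in range(2) for j in range(2))

e1, e2 = [1.0, 0.0], [0.0, 1.0]
e1_invariant = is_multiple(apply_to(T(x), e1), e1)  # T(x) e1 = e1
e2_invariant = is_multiple(apply_to(T(x), e2), e2)  # T(x) e2 = (x, 1)
```

Since the only invariant line has no invariant complement, the representation cannot be written as a direct sum, which is the content of the example.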
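Theorem 3.7. Every unitary reducible representation is completely reducible.
Proof: Let $T_1, T_2$ be the restrictions of $T$ to the subspaces $\mathfrak{H}_1$, $\mathfrak{H}_2 = \mathfrak{H}_1^\perp$. Then $\mathfrak{H}_1$ invariant means
$$T(g) = \begin{pmatrix} T_1(g) & A(g) \\ 0 & T_2(g) \end{pmatrix}.$$
By the unitarity of $T(g)$, this gives:
$$\begin{pmatrix} T_1(g^{-1}) & A(g^{-1}) \\ 0 & T_2(g^{-1}) \end{pmatrix} = T(g^{-1}) = T(g)^* = \begin{pmatrix} T_1(g)^* & 0 \\ A(g)^* & T_2(g)^* \end{pmatrix}.$$
Thus $A = 0$ and $T_1, T_2$ are unitary. □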
If $\dim T < \infty$, the decomposition can be repeated until exhaustion. Thus one gets the following result:

Corollary 3.8. Every finite dimensional unitary representation is a direct sum of unitary irreducible representations (UIRs).

3.3. Schur's lemma and its generalizations

Lemma 3.9. (Schur's lemma: classical) Let $T_1, T_2$ be two UIRs of $G$ in $\mathfrak{H}_1$ and $\mathfrak{H}_2$, respectively, and $A : \mathfrak{H}_1 \to \mathfrak{H}_2$ an intertwining operator:
$$A\,T_1(g) = T_2(g)\,A, \quad \text{for all } g \in G.$$
Then, either $A = 0$, or $A$ is invertible and $T_1 \sim T_2$. In the second case, if $\dim T_1 < \infty$, $A$ is unique up to a scalar.

Corollary 3.10. Let $T$ be a finite dimensional unitary irreducible representation and $A\,T(g) = T(g)\,A$, for all $g \in G$. Then $A = \lambda I$.

Given a representation $T$ in the Hilbert space $\mathfrak{H}$, its commutant is the set of bounded operators that commute with every $T(g)$, $g \in G$:
$$T' = \{A \in \mathcal{B}(\mathfrak{H}) : A\,T(g) = T(g)\,A, \text{ for all } g \in G\}.$$
Corollary 3.11. Let $T$ be unitary. Then $T$ is irreducible if and only if its commutant $T'$ is trivial, i.e., $T' = \{\lambda I,\ \lambda \in \mathbb{C}\}$.

Corollary 3.12. If $G$ is abelian, every UIR of $G$ has dimension 1.
Proof: $T(g)T(g') = T(gg') = T(g'g) = T(g')T(g)$, that is, $T(g) \in T'$. By Corollary 3.11, $T(g) = \lambda(g) I$ with $|\lambda(g)| = 1$, i.e., $\dim T = 1$. □
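Example: $G = SO(2)$, with elements
$$g(\varphi) = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} \in SO(2), \qquad g(2\pi) = g(0) = I.$$
Every UIR of SO(2) is of the form $T_k(g(\varphi)) = e^{ik\varphi}$, $k \in \mathbb{Z}$. Indeed, one has $T(g(\varphi)) = \lambda(\varphi) I$, since SO(2) is abelian. Thus
$$T(g(\varphi_1))\,T(g(\varphi_2)) = T(g(\varphi_1)g(\varphi_2)) = T(g(\varphi_1 + \varphi_2)).$$
Therefore one gets
$$\lambda(\varphi_1)\,\lambda(\varphi_2) = \lambda(\varphi_1 + \varphi_2),$$
whose only continuous solutions with $|\lambda(\varphi)| = 1$ are $\lambda(\varphi) = e^{ik\varphi}$, $k \in \mathbb{R}$. Finally, the condition $T(g(2\pi)) = e^{2\pi i k} = I$ implies that $k \in \mathbb{Z}$.

This is equivalent to the theory of Fourier series! Indeed, an arbitrary function $f \in L^2(S^1)$ may be expanded in a Fourier series:
$$f(\varphi) = \sum_{k=-\infty}^{\infty} c_k\, e^{ik\varphi}, \tag{21}$$
so that
$$L^2(S^1) = \bigoplus_{k=-\infty}^{\infty} \mathfrak{H}_k, \qquad \dim \mathfrak{H}_k = 1. \tag{22}$$
The regular representation of SO(2) acts in $L^2(S^1)$ and reads:
$$[U_L(\psi)f](\varphi) = f(\varphi - \psi) = \sum_{k=-\infty}^{\infty} c_k\, e^{ik(\varphi - \psi)} = \sum_{k=-\infty}^{\infty} c_k\, T_{-k}(g(\psi))\, e^{ik\varphi},$$
i.e., each subspace $\mathfrak{H}_k$ carries a 1-dimensional UIR and, since $k$ runs over all of $\mathbb{Z}$, (22) corresponds to the decomposition of $U_L$ into 1-dimensional UIRs:
$$U_L = \bigoplus_{k=-\infty}^{\infty} T_k.$$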
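The statement that a rotation acts on each Fourier mode by a pure phase can be tested numerically. The following sketch is an illustration under stated assumptions: the sample function `f`, the grid size `N`, the rotation angle `psi`, and the convention $[U_L(\psi)f](\varphi) = f(\varphi - \psi)$ are choices made here. It computes the Fourier coefficients of `f` and of its rotated copy by a Riemann sum, which is exact for low-order trigonometric polynomials on a uniform grid, and compares them with the predicted phases $e^{-ik\psi}$.

```python
import cmath
import math
from typing import Callable, Dict

N = 64  # number of sample points on the circle (assumption of this sketch)


def fourier_coeffs(func: Callable[[float], complex], kmax: int) -> Dict[int, complex]:
    """c_k = (1/2pi) * integral of func(phi) exp(-i k phi), via a Riemann sum."""
    coeffs = {}
    for k in range(-kmax, kmax + 1):
        total = sum(func(2 * math.pi * n / N) * cmath.exp(-1j * k * 2 * math.pi * n / N)
                    for n in range(N))
        coeffs[k] = total / N
    return coeffs


def f(phi: float) -> complex:
    """Sample trigonometric polynomial with modes k = 0, 1, -3."""
    return 1.0 + 2.0 * cmath.exp(1j * phi) + 0.5 * cmath.exp(-3j * phi)


psi = 0.4  # rotation angle


def rotated(phi: float) -> complex:
    """[U_L(psi) f](phi) = f(phi - psi)."""
    return f(phi - psi)


c = fourier_coeffs(f, 4)
c_rot = fourier_coeffs(rotated, 4)

# Each mode k should pick up the phase exp(-i k psi), i.e. a 1-dimensional UIR.
max_err = max(abs(c_rot[k] - cmath.exp(-1j * k * psi) * c[k]) for k in c)
```

The coefficient of each mode is only multiplied by a phase, so every one-dimensional subspace $\mathfrak{H}_k$ is invariant, as the decomposition (22) asserts.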
Lemma 3.13. (Schur's lemma: general) Let $U_1$ be a UIR in $\mathfrak{H}_1$, $U_2$ a unitary representation in $\mathfrak{H}_2$, and $T : \mathfrak{H}_1 \to \mathfrak{H}_2$ a bounded operator that intertwines $U_1$ and $U_2$. Then either $T = 0$, or $T$ is a multiple of an isometry, i.e., there exists a constant $\lambda > 0$ such that
$$\|T\phi\|^2_{\mathfrak{H}_2} = \lambda\, \|\phi\|^2_{\mathfrak{H}_1}, \quad \text{for all } \phi \in \mathfrak{H}_1.$$
Proof: From the hypotheses, we have
$$T^*T\,U_1(g) = T^*U_2(g)\,T = U_1(g)\,T^*T, \quad \text{for all } g \in G.$$
Then by Schur's classical lemma 3.9, either $T = 0$ or $T^*T = \lambda I$. □

Lemma 3.14. (Schur's lemma: extended) Let $U$ be a UIR of $G$ in the Hilbert space $\mathfrak{H}$ and $U'$ a unitary representation in $\mathfrak{H}'$. Let $T : \mathfrak{H} \to \mathfrak{H}'$ be a closed linear operator with dense, $U$-invariant domain $D(T)$, that intertwines $U$ and $U'$. Then either $T = 0$, or $T$ is a multiple of an isometry (hence bounded).

3.4. Representations of Lie algebras
Throughout this section, we assume that $T$ is a finite dimensional unitary representation of $G$. Consider first the case of a one-parameter group, namely, SO(2):
$$g(\varphi) = e^{-i\varphi\sigma}, \quad \sigma = \text{infinitesimal generator (see Sec. 2.2).}$$
As before, $T(g(\varphi)) = e^{-i\varphi T(\sigma)}$, with
$$T(\sigma) = \lim_{\varphi \to 0} \frac{T(g(\varphi)) - I}{-i\varphi}.$$
Since $T$ is unitary, it follows that $T(\sigma) = T(\sigma)^\dagger$, that is, $T(\sigma)$ is a hermitian matrix. In the general case, one gets
$$g(x) = e^{-i\sum_j x_j\sigma_j}, \quad x = (x_j),\ \sigma_j \in \mathfrak{g} = \text{Lie algebra of } G.$$
We could take, for instance, SU(2) or SO(3), as discussed in Sec. 2.2. For each one-parameter subgroup $x_j \mapsto g(x_j)$, one has
$$T(g(x_j)) = e^{-i x_j T(\sigma_j)}.$$
The image of the commutator of two such one-parameter subgroups $g(x_1), g(x_2)$ is the operator
$$T(g_1 g_2 g_1^{-1} g_2^{-1}) = T(g_1)\,T(g_2)\,T(g_1^{-1})\,T(g_2^{-1}),$$
where, for simplicity, we have written $g_j = g(x_j)$, $j = 1,2$. To second order, this gives:
$$T(g_j) = T(e^{-i x_j\sigma_j}) = I - i x_j T(\sigma_j) + \tfrac{1}{2}\big(-i x_j T(\sigma_j)\big)^2 + \ldots$$
$$T(g_j^{-1}) = T(e^{i x_j\sigma_j}) = I + i x_j T(\sigma_j) + \tfrac{1}{2}\big(i x_j T(\sigma_j)\big)^2 + \ldots$$
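The second-order expansion can be probed numerically: multiplying out the four factors, the group commutator equals $I - x_1x_2\,[T(\sigma_1),T(\sigma_2)]$ up to third-order terms. The sketch below is a hedged illustration, not the author's computation. It assumes the spin-1/2 choice $T(\sigma_1) = \sigma_x/2$, $T(\sigma_2) = \sigma_y/2$ (Pauli matrices, for which the exponential has the closed form used in `g`), and the helper names `mul`, `lin`, `g` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[complex]]

I2: Matrix = [[1, 0], [0, 1]]
SX: Matrix = [[0, 1], [1, 0]]       # Pauli matrix sigma_x
SY: Matrix = [[0, -1j], [1j, 0]]    # Pauli matrix sigma_y


def mul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lin(ca: complex, a: Matrix, cb: complex, b: Matrix) -> Matrix:
    """Linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def g(x: float, pauli: Matrix) -> Matrix:
    """exp(-i x pauli/2) = cos(x/2) I - i sin(x/2) pauli, valid since pauli^2 = I."""
    return lin(math.cos(x / 2), I2, -1j * math.sin(x / 2), pauli)


x1 = x2 = 1e-3  # small group parameters

# Group commutator T(g1) T(g2) T(g1^{-1}) T(g2^{-1}), with g(x)^{-1} = g(-x).
group_comm = mul(mul(g(x1, SX), g(x2, SY)), mul(g(-x1, SX), g(-x2, SY)))

# Second-order prediction: group_comm = I - x1 x2 [T(s1), T(s2)] + O(x^3).
estimate = lin(-1 / (x1 * x2), group_comm, 1 / (x1 * x2), I2)

T1, T2 = lin(0.5, SX, 0, I2), lin(0.5, SY, 0, I2)
exact = lin(1, mul(T1, T2), -1, mul(T2, T1))  # matrix commutator [T1, T2]

error = max(abs(estimate[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

The residual `error` is of the order of the parameters themselves, as expected from the neglected third-order terms.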
Thus:
$$T([\sigma_1,\sigma_2]) = T(\sigma_1)T(\sigma_2) - T(\sigma_2)T(\sigma_1) = [T(\sigma_1),T(\sigma_2)],$$
where $[\sigma_1,\sigma_2]$ denotes the Lie bracket in $\mathfrak{g}$ and $[T(\sigma_1),T(\sigma_2)]$ the commutator of the two matrices. Thus the representation $T$ induces a homomorphism of $\mathfrak{g}$ into the Lie algebra of hermitian matrices, i.e., a linear map $\sigma \mapsto T(\sigma)$ such that
$$T([\sigma_1,\sigma_2]) = [T(\sigma_1),T(\sigma_2)],$$
in other words, a Lie algebra representation. In particular, one has $T(0) = 0$.

3.5. Representations of compact groups
Fundamental results

In this section, $G$ denotes a compact topological group with normalized invariant Haar measure $dg$:
$$\int_G dg = 1 \quad \text{and} \quad \int_G f(g)\,dg = \int_G f(g^{-1})\,dg.$$
$T$ denotes a strongly continuous representation of $G$ in the Hilbert space $\mathfrak{H}$, as defined in (15). By the principle of uniform boundedness, this implies that $T$ is bounded, in the sense that there exists $M > 0$ such that $\|T(g)\| \leq M$, for all $g \in G$. The main tool for analyzing the UIRs is integration over $G$. The properties of the UIRs may be summarized in the following proposition.

Proposition 3.15. Let $T$ be a strongly continuous representation of the compact group $G$ in the Hilbert space $\mathfrak{H}$. Then
(1) There exists on $\mathfrak{H}$ a new scalar product, defining a norm equivalent to the initial one, with respect to which $T$ is unitary.
(2) Every UIR of $G$ is finite dimensional.
(3) Every unitary representation of $G$ in a Hilbert space $\mathfrak{H}$ is the direct sum of finite dimensional UIRs.

Examples:
(1) Regular representation of $G$, in $\mathfrak{H} = L^2(G,dg)$:
$$U_L = \bigoplus_{j \in \hat{G}} \big(\underbrace{U_j \oplus \cdots \oplus U_j}_{\dim U_j\ \text{times}}\big).$$
Note that all elements of $\hat{G}$ occur in the sum, with a multiplicity equal to their dimension. Thus every UIR of a compact group is a subrepresentation of the regular representation (this is a characteristic property of square integrable representations, see Sec. 3.6 below).
(2) Quasi-regular representation of $G$, in $L^2(G/H, d\mu)$, where $H$ is the maximal compact subgroup of $G$:
$$U_{qL} = \bigoplus_j U_j.$$
Here as well, every UIR occurs in the sum, but only once.
(3) For SO(3), the UIRs are the well-known representations $U_j = D_j$ of dimension $2j+1$, generated by spherical harmonics. Thus,
$$L^2(S^2) = \bigoplus_{j=0}^{\infty} D_j, \qquad L^2(SO(3)) = \bigoplus_{j} \big(\underbrace{D_j \oplus \cdots \oplus D_j}_{(2j+1)\ \text{times}}\big).$$
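Orthogonality relations

Let $T_1, T_2$ be two UIRs of a compact group $G$ in Hilbert spaces $\mathfrak{H}_1, \mathfrak{H}_2$, respectively. Their matrix elements obey the following orthogonality relations:
$$\int_G \langle T_1(g)u_1 | v_1\rangle\, \overline{\langle T_2(g)u_2 | v_2\rangle}\, dg = \begin{cases} 0, & \text{if } T_1 \not\sim T_2, \\ \dfrac{1}{d}\, \langle u_1 | V^{-1}u_2\rangle\, \langle V^{-1}v_2 | v_1\rangle, & \text{if } T_1 \sim T_2, \end{cases}$$
for any $u_1, v_1 \in \mathfrak{H}_1$, $u_2, v_2 \in \mathfrak{H}_2$. In the second case, $T_1$ and $T_2$ are unitarily equivalent: $T_2 = V T_1 V^{-1}$, with $V : \mathfrak{H}_1 \to \mathfrak{H}_2$ unitary, and $d = \dim T_1 = \dim T_2$. Equivalently, in terms of matrix elements in suitable orthonormal bases, the orthogonality relations read:
$$\int_G dg\, (T_1)_{ij}(g)\, \overline{(T_2)_{kl}(g)} = \begin{cases} 0, & \text{if } T_1 \not\sim T_2, \\ \dfrac{1}{d}\, \delta_{ik}\,\delta_{jl}, & \text{if } T_1 \sim T_2. \end{cases}$$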
In fact, these matrix elements are not only orthogonal, but they also constitute a basis of $L^2(G,dg)$, according to the famous Peter-Weyl theorem:

Theorem 3.16. (Peter-Weyl) Let $G$ be a compact group and $\hat{G}$ its dual. For each $s \in \hat{G}$, choose a representative unitary matrix representation $U^s = (U^s_{ij})$ of dimension $d_s$. Then the family
$$\mathcal{B} = \{\sqrt{d_s}\, U^s_{ij} \mid s \in \hat{G},\ 1 \leq i,j \leq d_s\}$$
is an orthonormal basis of $L^2(G,dg)$. For fixed $s \in \hat{G}$ and $i \in \{1,2,\ldots,d_s\}$, denote by $\mathfrak{H}^s_i$ the $d_s$-dimensional subspace of $L^2(G,dg)$ generated by the functions $U^s_{ij}$, $j = 1,\ldots,d_s$. Then the spaces $\mathfrak{H}^s_i$ are invariant under the right regular representation $U_R$ of $G$ and the restriction of $U_R$ to each subspace $\mathfrak{H}^s_i$ is equivalent to $U^s$. In other words,
$$L^2(G,dg) = \bigoplus_{s \in \hat{G}} \bigoplus_{i=1}^{d_s} \mathfrak{H}^s_i, \qquad U_R = \bigoplus_{s \in \hat{G}} d_s U^s,$$
where $d_sU^s$ denotes the direct sum of $d_s$ copies of $U^s$. This theorem is essential for the applications. In fact, the matrix elements $U^s_{ij}(g)$ yield most special functions. For instance, SO(2) yields Fourier analysis; SO(3) yields spherical harmonics; E(2) yields Bessel functions. In the case of SO(3), the matrix elements $U^j_{mm'}(g)$ are the so-called Wigner functions $D^j_{mm'}(g)$. In all such cases, the fact that these functions form orthonormal bases simply follows from the Peter-Weyl theorem, whereas addition theorems in general reflect the group law.

Characters of finite dimensional representations

According to Definition 3.4, the character of the finite-dimensional UIR $T$ is the function $\chi_T : G \to \mathbb{C}$ given by
$$\chi_T(g) = \mathrm{Tr}\,T(g) = \sum_{i=1}^{\dim T} [T(g)]_{ii}, \quad g \in G.$$
The characters have the following properties:
• $\chi_T(g_0^{-1} g\, g_0) = \chi_T(g)$ for any $g_0 \in G$,
• $\chi_T(g^{-1}) = \overline{\chi_T(g)}$,
• $T_1 \sim T_2$ implies $\chi_1 = \chi_2$, where $\chi_j = \chi_{T_j}$, $j = 1,2$,
• $\displaystyle\int_G \overline{\chi_1(g)}\,\chi_2(g)\,dg = \begin{cases} 0, & \text{if } T_1 \not\sim T_2, \\ 1, & \text{if } T_1 \sim T_2. \end{cases}$
For an arbitrary finite dimensional representation $T = \bigoplus_{i=1}^{n} m_i T_i$, where $m_i \in \mathbb{N}^+$ is the multiplicity of the UIR $T_i$ in $T$, one has
$$\chi_T(g) = \sum_{i=1}^{n} m_i\, \chi_i(g).$$
Hence,
$$m_i = \int_G dg\, \overline{\chi_i(g)}\,\chi_T(g), \qquad \sum_{i=1}^{n} m_i^2 = \int_G dg\, |\chi_T(g)|^2.$$
Therefore, $T$ is irreducible if and only if $\int_G dg\, |\chi_T(g)|^2 = 1$.
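The multiplicity formula is easy to try on a finite group, which is a compact group whose normalized Haar integral is the average over the group elements. The sketch below is an illustration chosen here, not an example from the text. It uses the permutation group $S_3$, its three irreducible characters listed by conjugacy class, and the 3-dimensional permutation representation, whose character counts fixed points. All characters are real in this case, so complex conjugation is omitted.

```python
from fractions import Fraction
from typing import Dict, List

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
class_sizes: List[int] = [1, 3, 2]
order: int = sum(class_sizes)  # |S_3| = 6

# Irreducible characters of S_3, one value per conjugacy class.
irreps: Dict[str, List[int]] = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}

# Character of the permutation representation on 3 points (number of fixed points).
chi_perm: List[int] = [3, 1, 0]


def haar_inner(chi_a: List[int], chi_b: List[int]) -> Fraction:
    """Normalized sum over the group of chi_a * chi_b (real characters)."""
    total = sum(n * a * b for n, a, b in zip(class_sizes, chi_a, chi_b))
    return Fraction(total, order)


# m_i = integral of conj(chi_i) * chi_T over G.
multiplicities = {name: haar_inner(chi, chi_perm) for name, chi in irreps.items()}

# Sum of m_i^2 must equal the integral of |chi_T|^2.
norm_sq = haar_inner(chi_perm, chi_perm)
```

Here the permutation representation contains the trivial and the standard representation once each, and the squared norm of its character is 2, so it is reducible by the criterion above.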
Compact vs. noncompact groups

It is instructive to compare the properties of unitary representations of compact groups with the corresponding ones of noncompact groups.
For a compact Lie group $G$:
• every UIR is finite dimensional;
• every unitary representation $T$ is a (finite or infinite) direct sum of UIRs: $T = \bigoplus_j T_j$.
For a noncompact Lie group $G$:
• If $G$ is connected and semisimple, the only representation of dimension 1 is the trivial one: $T(g) = I$, for all $g \in G$;
• If $G$ is connected and simple, it has no nontrivial finite dimensional UIR. Examples: SU(1,1), SO(2,1) $\simeq$ SU(1,1)/$\mathbb{Z}_2$, SO(1,3).
• If $G$ is connected and semisimple, it has no finite dimensional faithful unitary representation (a representation is called faithful if its kernel is trivial).
Counterexample (H. Führ): Let $G = O(1,1) \simeq \mathbb{R} \rtimes \{\pm 1\}$ (this group is not connected!), and let $\chi$ be a nontrivial character of $\mathbb{R}$. Then the following representation $\pi_\chi$ of $G$ is unitary, irreducible and of dimension 2:
Practical analysis of UIRs of compact groups: weight diagrams

Fig. 3. Root diagram and lattice $g_c$ of SU(3) (from Ref. 11).
Let $G$ be a compact Lie group. Following Cartan5 and Weyl6 (also Hopf7 and Stiefel8), the following properties hold true:9-11
(1) The root diagram (see the discussion in Sec. 2.3) divides $\mathbb{R}^l$ into fundamental domains (also called Weyl chambers), which are permuted by the Weyl group $W$.
(2) Let $\{\alpha_j\}$ denote the positive roots, $\rho = \frac{1}{2}\sum_j \alpha_j$, $D_0$ the fundamental domain containing $\rho$. Then there is a lattice $g_c \subset \mathbb{R}^l$ with basic vectors $\lambda_1, \ldots, \lambda_l$, containing the roots, such that:
• Every vector $v \in g_c$ is of the form $v = \sum_i p_i\lambda_i = (p_1, \ldots, p_l) \in \mathbb{Z}^l$.
• $D_0$ is defined by $p_i \geq 0$, for all $i$.
• The $j$-th face of $D_0$ corresponds to $p_j = 0$, $j = 1, \ldots, l$.
(3) In these notations, one has $\rho = \lambda_1 + \ldots + \lambda_l = (1,1,\ldots,1)$.
(4) There is a one-to-one correspondence between lattice points $\lambda$ located strictly inside $D_0$ and UIRs $D(\lambda)$.

We illustrate these notions on the case of SU(3). Figure 3 shows the root diagram and the lattice $g_c$. Notice the hexagonal symmetry of both patterns, which results from the invariance under the Weyl group $W$ (see Sec. 2.3). Figure 4 shows the domain $D_0$ with the lattice points corresponding to the low-dimensional UIRs, each of them being labeled by its dimension.

Fig. 4. Low-dimensional UIRs of SU(3); each UIR is labeled by its dimension (from Ref. 11).

The weights of a representation $D(\lambda)$ are the vectors of simultaneous eigenvalues of the commuting generators $\{H_i\}$, and thus vectors in $\mathbb{R}^l$. The weight diagram of $D(\lambda)$ is the set of tips of the weight vectors, and it is a set of points in $g_c \subset \mathbb{R}^l$ with integer multiplicity, invariant under the Weyl group $W$. The number of points of the weight diagram (counting multiplicities) gives the dimension of the corresponding representation $D(\lambda)$. The character of $D(\lambda)$ is expressed in terms of weights:
$$\chi(\lambda) = X(\lambda)/\Delta, \tag{23}$$
where
$$X(\lambda) = \sum_{w \in W} \varepsilon(w)\, e^{i\langle w\lambda,\,\cdot\,\rangle}, \quad \lambda \in g_c \cap D_0. \tag{24}$$
In this expression, $\Delta = X(\rho)$, since $\chi(\rho) = 1$ (trivial representation). Moreover,
$$\chi(\lambda_1)\,\chi(\lambda_2) = \sum_j \chi(\lambda_j) \quad \text{(Clebsch-Gordan decomposition)},$$
$$X(\lambda_1)\,\chi(\lambda_2) = \sum_j X(\lambda_j) \quad \text{(Speiser's "rubber stamp rule")}.$$
Thus the weight diagram is a basic tool for classification purposes. It is interesting to note that most of this machinery extends to infinite dimensional Lie algebras known as Kac-Moody algebras.11
Fig. 5. The weight diagram of the representation 15 of SU(3), with highest weight (2,1). The basis vectors are $\lambda_1 = (1,0)$ and $\lambda_2 = (0,1)$ (from Ref. 11).
3.6. Square integrable representations

Among all representations of noncompact groups, there is a class that enjoys particularly nice properties, closely reminiscent of those of compact groups, namely, the square integrable representations. These have a special importance in the theory of coherent states, in particular wavelets.12 Let $G$ be a locally compact topological group, with left invariant measure $d\mu(g)$, and $U$ a strongly continuous UIR of $G$ in the Hilbert space $\mathfrak{H}$. One says that a vector $\eta \in \mathfrak{H}$, $\eta \neq 0$, is admissible for $U$ if
$$I(\eta) = \int_G |\langle U(g)\eta | \eta\rangle|^2\, d\mu(g) < \infty, \tag{25}$$
or, equivalently,
$$I(\eta,\phi) = \int_G |\langle U(g)\eta | \phi\rangle|^2\, d\mu(g) < \infty, \quad \text{for all } \phi \in \mathfrak{H}. \tag{26}$$
The representation $U$ is square integrable if it possesses a nonzero admissible vector. It is easy to see that, if a vector $\eta \in \mathfrak{H}$ is admissible, so is the vector $\eta_g = U(g)\eta$, for every $g \in G$. One calls $\{\eta_g,\ g \in G\}$ the set of coherent states associated to $U$. The set $\mathcal{A}$ of admissible vectors is a vector subspace of $\mathfrak{H}$, invariant under $U$. Therefore, since $U$ is irreducible, either $\mathcal{A} = \{0\}$ (in which case $U$ is not square integrable), or $\mathcal{A}$ is dense and $U$ is square integrable.
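Examples:
• If $G$ is compact, every UIR is square integrable.
• For $G = \mathbb{R}$, $\mathfrak{H} = \mathbb{C}$, the representation $U(x) = e^{iax}$, $x \in \mathbb{R}$, is not square integrable. Indeed,
$$I(\eta) = \int_{\mathbb{R}} \big|e^{iax}|\eta|^2\big|^2\, dx = |\eta|^4 \int_{\mathbb{R}} dx = \infty.$$
• For the affine group of $\mathbb{R}$, $G_{\mathrm{aff}} = \{(b,a) : b \in \mathbb{R},\ a \in \mathbb{R},\ a \neq 0\} = \mathbb{R} \rtimes \mathbb{R}_*$ and $\mathfrak{H} = L^2(\mathbb{R},dx)$, the representation
$$[U(b,a)f](x) = |a|^{-1/2} f(a^{-1}(x-b))$$
is square integrable (and it is the only one!). A vector $\psi \in L^2(\mathbb{R},dx)$ is admissible if it satisfies the condition
$$\int_{-\infty}^{\infty} |\hat{\psi}(\xi)|^2\, \frac{d\xi}{|\xi|} < \infty. \tag{27}$$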
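As a worked illustration of the admissibility condition for the affine group (the two test functions below are choices made here, and the unitary Fourier convention $\hat{\psi}(\xi) = (2\pi)^{-1/2}\int e^{-i\xi x}\psi(x)\,dx$ is assumed), take the "Mexican hat" function $\psi(x) = (1-x^2)\,e^{-x^2/2} = -\frac{d^2}{dx^2}e^{-x^2/2}$. Its Fourier transform is $\hat{\psi}(\xi) = \xi^2 e^{-\xi^2/2}$, so that
$$\int_{-\infty}^{\infty} |\hat{\psi}(\xi)|^2\, \frac{d\xi}{|\xi|} = 2\int_0^{\infty} \xi^3 e^{-\xi^2}\, d\xi = \int_0^{\infty} u\, e^{-u}\, du = 1 < \infty,$$
with the substitution $u = \xi^2$. Hence $\psi$ is admissible. By contrast, for the Gaussian $\phi(x) = e^{-x^2/2}$ one has $\hat{\phi}(\xi) = e^{-\xi^2/2}$, and the integrand $e^{-\xi^2}/|\xi|$ behaves like $1/|\xi|$ near $\xi = 0$: the integral diverges logarithmically and $\phi$ is not admissible. The condition thus forces $\hat{\psi}(0) = 0$ for sufficiently regular $\psi$, that is, $\psi$ must have zero mean.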
Square integrable representations have the following characteristic properties:
(1) Every square integrable representation $U$ is unitarily equivalent to a subrepresentation of the left regular representation $U_L$ (such representations $U$ constitute the discrete series of representations of $G$). Examples:
• SO(1,3) (Lorentz), SO(1,4) (de Sitter), $\mathcal{P}(1,3)$ (Poincaré) have no discrete series, hence no square integrable representations.
• SO(2,3) (Anti-de Sitter), SO(2,4) (conformal group), SO(2,q) do have a discrete series.
(2) They satisfy orthogonality relations; more precisely, the following theorem holds.

Theorem 3.17. (Duflo-Moore) Let $U$ be a square integrable representation of $G$ in $\mathfrak{H}$. Then:
(1) There exists a unique positive self-adjoint, invertible operator $C$ on $\mathfrak{H}$, with dense domain $\mathcal{D}(C) = \mathcal{A}$, such that the following orthogonality relations hold:
$$\int_G \overline{\langle U(g)\eta' | \phi'\rangle}\, \langle U(g)\eta | \phi\rangle\, d\mu(g) = \langle C\eta | C\eta'\rangle\, \langle \phi' | \phi\rangle,$$
for every admissible $\eta, \eta'$, and arbitrary $\phi, \phi' \in \mathfrak{H}$.
(2) In addition, $C = \lambda I$, $\lambda > 0$, if and only if $G$ is unimodular.

Examples:
• The Weyl-Heisenberg group (of the quantum harmonic oscillator) is unimodular. Thus every function $f \in L^2(\mathbb{R},dx)$ is admissible for the Schrödinger representation.
• The affine group of the line $G_{\mathrm{aff}}$ is not unimodular. Therefore, there exists a nontrivial admissibility condition, namely, (27). Thus the Duflo-Moore operator is the operator of multiplication by $|\xi|^{-1/2}$ in Fourier space, which is an unbounded, invertible, self-adjoint operator.
• For the similitude group of the plane $\mathrm{SIM}(2) = \mathbb{R}^2 \rtimes (\mathbb{R}^+ \times SO(2))$, exactly the same situation prevails.
In conclusion, we may say that square integrable representations are the natural generalization to noncompact groups of the UIRs of compact groups.

4. Classical Physics

4.1. Conservation laws in classical mechanics
For a classical physical system with finitely many degrees of freedom, the general principle of symmetry is that the invariance of the Lagrangian or the Hamiltonian under some symmetry operation implies a conservation law. This is expressed in the following terms in the two cases.

(1) In Lagrangian mechanics
The starting point of the theory is the Lagrangian $L(t) = L(q(t),\dot{q}(t))$, a function of (generalized) coordinates $q(t) = \{q_1(t), \ldots, q_n(t)\}$ and their time derivatives of first order. The basic axiom is the principle of least action:
$$\delta \int_{t_1}^{t_2} dt\, L(q(t),\dot{q}(t)) = 0.$$
This principle leads to the Euler-Lagrange equations:
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0, \quad i = 1, \ldots, n. \tag{28}$$
From this follows the symmetry principle: if $L(t)$ is independent of $q_i$ (ignorable coordinate), then $\dfrac{\partial L}{\partial \dot{q}_i}$ is a constant.

(2) In Hamiltonian mechanics
Now one starts with conjugate momenta:
$$p(t) = \{p_1(t), \ldots, p_n(t)\}, \quad p_i = \frac{\partial L}{\partial \dot{q}_i},$$
and considers the Hamiltonian $H(p(t),q(t)) = \sum_i p_i\dot{q}_i - L(q(t),\dot{q}(t))$. One defines the Poisson brackets:
$$\{A,B\} = \sum_i \left( \frac{\partial A}{\partial q_i}\frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i}\frac{\partial B}{\partial q_i} \right).$$
In particular, the canonical Poisson brackets read:
$$\{q_i(t),q_j(t)\} = 0, \quad \{p_i(t),p_j(t)\} = 0, \quad \{q_i(t),p_j(t)\} = \delta_{ij}.$$
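Poisson brackets of explicit observables can be evaluated numerically, which gives a quick check of bracket relations. The sketch below is an illustration under stated assumptions, not part of the text. It uses the convention $\{A,B\} = \sum_i (\partial_{q_i}A\,\partial_{p_i}B - \partial_{p_i}A\,\partial_{q_i}B)$, which reproduces $\{q_i,p_j\} = \delta_{ij}$, approximates derivatives by central finite differences (hypothetical helpers `partial` and `poisson`), and takes as sample observables the angular momentum $L = q \times p$ and a Hamiltonian $H = p^2/2 - 1/|q|$ with a central potential.

```python
import math
from typing import Callable, List, Sequence

Observable = Callable[[Sequence[float], Sequence[float]], float]


def partial(f: Observable, q: Sequence[float], p: Sequence[float],
            which: str, i: int, h: float = 1e-3) -> float:
    """Central finite difference of f with respect to q[i] or p[i]."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if which == "q":
        q_plus[i] += h
        q_minus[i] -= h
    else:
        p_plus[i] += h
        p_minus[i] -= h
    return (f(q_plus, p_plus) - f(q_minus, p_minus)) / (2 * h)


def poisson(a: Observable, b: Observable,
            q: Sequence[float], p: Sequence[float]) -> float:
    """{a, b} = sum_i (da/dq_i db/dp_i - da/dp_i db/dq_i), evaluated at (q, p)."""
    return sum(partial(a, q, p, "q", i) * partial(b, q, p, "p", i)
               - partial(a, q, p, "p", i) * partial(b, q, p, "q", i)
               for i in range(len(q)))


def L1(q, p): return q[1] * p[2] - q[2] * p[1]
def L2(q, p): return q[2] * p[0] - q[0] * p[2]
def L3(q, p): return q[0] * p[1] - q[1] * p[0]


def H(q, p):
    """Sample Hamiltonian with a central potential V(r) = -1/r (unit mass)."""
    r = math.sqrt(sum(x * x for x in q))
    return 0.5 * sum(x * x for x in p) - 1.0 / r


q0: List[float] = [0.3, -1.2, 0.8]   # arbitrary phase-space point
p0: List[float] = [0.5, 0.1, -0.7]

canonical = poisson(lambda q, p: q[0], lambda q, p: p[0], q0, p0)  # {q_1, p_1}
so3_defect = poisson(L1, L2, q0, p0) - L3(q0, p0)                  # {L_1, L_2} - L_3
rotation_invariance = poisson(H, L3, q0, p0)                       # {H, L_3}
```

With these choices the first bracket equals 1, the angular momentum components close under the bracket, and `rotation_invariance` vanishes up to discretization error, reflecting the rotational symmetry of a central potential.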
As generator of time translations, the Hamiltonian describes the time evolution of observables. Given an observable $A = A(q(t),p(t))$, its time evolution is governed by Hamilton's equation (the sign being fixed by the convention $\{q_i,p_j\} = \delta_{ij}$)
$$\dot{A} = \{A,H\}. \tag{29}$$
As a consequence, $\{H,A\} = 0$ implies that the observable $A(q(t),p(t))$ is conserved.

4.2. Hamiltonian mechanics: representations with respect to Poisson brackets
If several conserved observables $A_j(q,p)$ have commutation relations (with respect to Poisson brackets) that close, they constitute a Lie algebra. Thus one obtains a representation of that Lie algebra in terms of Poisson brackets. This applies even in General Relativity, when it is formulated in the Hamiltonian formalism.

Examples:
(1) Central potential (2-body problem in center of mass frame):
$$H = \frac{p^2}{2} + V(r).$$
The components of the angular momentum $\mathbf{L} = \mathbf{q} \times \mathbf{p}$ are the infinitesimal generators of rotations, with commutation relations
$$\{L_i,L_j\} = \epsilon_{ijk}L_k: \quad \text{Lie algebra } so(3).$$
The invariance of the system under rotations is expressed by the relation $\{H,\mathbf{L}\} = 0$. As a consequence, $\mathbf{L}$ is constant and therefore the plane of the orbit is fixed in space. For the Earth, this is the plane of the ecliptic, which is indeed fixed, modulo small perturbations due to other bodies, like Jupiter or the Moon.

(2) Dynamical group for the Coulomb-Kepler potential:
In the case of the Coulomb-Kepler potential, $V(r) = -r^{-1}$, there exists a second invariant, namely, $\mathbf{K} = \mathbf{L} \times \mathbf{p} + \mathbf{q}/r$, with $r = |\mathbf{q}|$, which satisfies the relations $\{H,\mathbf{K}\} = 0$ and $\mathbf{K}\cdot\mathbf{L} = 0$. However, the commutation relations do not close (in symbolic notation):
$$\{L,L\} = L, \quad \{L,K\} = K, \quad \{K,K\} = -2HL.$$
Therefore, in the case $H < 0$, one introduces the so-called Runge-Lenz vector $\mathbf{A} = \mathbf{K}(-2H)^{-1/2}$. Then,
$$\{L,L\} = L, \quad \{L,A\} = A, \quad \{A,A\} = L,$$
which is the Lie algebra $so(4) \simeq so(3) \oplus so(3)$. Indeed, writing $\mathbf{X}_\pm = \frac{1}{2}(\mathbf{L} \pm \mathbf{A})$, one gets
$$\{X_\pm,X_\pm\} = X_\pm, \quad \{X_+,X_-\} = 0.$$
Since the Casimir operators are equal, $X_+^2 = X_-^2$, the corresponding representations of SO(4) have dimension $n^2$, which explains the "accidental" degeneracy of the spectrum of the H-atom. A similar analysis for $H > 0$ gives SO(1,3) as symmetry group, and E(3) for $H = 0$.

One can go one step further, by considering additional operators mapping one level to the next one (ladder operators). Altogether, one gets a Lie algebra so(4,1), and a single UIR of SO(4,1) yields the full discrete spectrum of the H-atom. This is an example of a dynamical group.

It is interesting to note that the same results hold true in quantum mechanics, with the usual commutators of operators. Actually, the SO(4) symmetry for the bound states was used by Pauli in 1926 for solving the H-atom algebraically, before Schrödinger's quantum mechanics!

4.3. Classical field theory: Symmetries and Noether's theorem
The most spectacular consequences of invariance properties occur in the case of classical physical systems with infinitely many degrees of freedom, that is, systems that must be described by a (classical) field theory. This covers, for instance, acoustics, fluid dynamics, or classical electromagnetism (Maxwell equations). As in the finite case, one must first identify the various ingredients of the theory.

The canonical variables are (classical) fields: $\varphi_j(x)$, $j = 1, \ldots, N$. Notice that, in a relativistic setting, $x = (t,\mathbf{x}) = \{x^\mu\}$, $\mu = 0,1,2,3$, and all four variables must be on the same footing. The Lagrangian is the space integral of a Lagrangian density, $\mathcal{L} = \mathcal{L}(\varphi_i(x),\partial_\mu\varphi_i(x))$, which depends only on the fields and their first derivatives:
$$L(t) = \int d^3x\, \mathcal{L}(\varphi_j(x),\partial_\mu\varphi_j(x)).$$
The Euler-Lagrange equations are now field equations (the relativistic summation convention is applied):
$$\frac{\partial\mathcal{L}}{\partial\varphi_i} - \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_i)} = 0, \quad i = 1, \ldots, N. \tag{30}$$
The conjugate fields are defined as
$$\pi_i(x) = \frac{\partial\mathcal{L}}{\partial\dot{\varphi}_i(x)},$$
and the Hamiltonian reads
$$H = \int d^3x \Big( \sum_i \pi_i(x)\,\dot{\varphi}_i(x) - \mathcal{L} \Big).$$
As before, the Hamiltonian describes the time evolution of observables. Indeed, given an observable $A = A(\pi_i(x),\varphi_i(x))$, its time evolution is governed by Hamilton's equation (29), which we repeat for convenience (with the sign fixed by the canonical brackets $\{q_i,p_j\} = \delta_{ij}$),
$$\dot{A} = \{A,H\}. \tag{31}$$
In the context of classical field theory, the connection between invariance properties and conservation laws is given by the celebrated theorem of Emmy Noether. But before stating the theorem, we have to make more precise the notion of symmetry. Assume there exists a Lie group $G$ of spacetime transformations under which the fields are covariant:
$$x \mapsto x' = gx, \quad x \in \mathbb{R}^4,\ g \in G, \tag{32}$$
$$\varphi_i(x) \mapsto \varphi_i'(x') = \sum_{j=1}^{N} S_i^j(g)\,\varphi_j(x). \tag{33}$$
In (33), $S(g) = (S_i^j(g))$ is an $N$-dimensional representation of $G$, not necessarily irreducible.
There are several possibilities for such transformations:
(i) Geometric transformations, that is, transformations that act both on spacetime (thus, $x' \neq x$ in general) and on the components of the fields:
• translations: $\varphi_i'(x + a) = \varphi_i(x)$
(b) Local or gauge transformations, the same transformations as in (a), but now $g = g(x)$, which means that all points of spacetime are transformed independently.

Since $G$ is a Lie group, we may consider an infinitesimal transformation
$$x'^\mu = x^\mu + \delta x^\mu, \quad \text{with } \delta x^\mu = \omega_r X^{r,\mu},$$
$$S_i^j(g) = \delta_i^j + \omega_r (G^r)_i^j,$$
that is,
$$\varphi_i'(x') = \varphi_i(x) + \delta\varphi_i, \qquad \delta\varphi_i = \omega_r (G^r)_i^j\, \varphi_j(x).$$
Here $\omega_r$, $r = 1,2,\ldots,\dim G$, are the infinitesimal parameters of the transformation and $G^r$ are the infinitesimal generators in the representation $S$ (thus each $G^r$ is an $N \times N$ matrix).

We introduce now the following terminology:
• A current is a four-vector $J^\mu(x) = J^\mu(\varphi_i,\partial_\nu\varphi_i)$.
• The current $J^\mu$ is conserved if $\partial_\mu J^\mu(x) = 0$.
• The charge associated to the current $J^\mu$ is the quantity
$$Q(t) = \int d^3x\, J^0(\mathbf{x},t).$$
Theorem 4.1. (Noether) Consider a classical field theory with Lagrangian $\mathcal{L}(x) = \mathcal{L}(\varphi_i,\partial_\mu\varphi_i)$, $i = 1, \ldots, N$, and an $R$-dimensional Lie group $G$ of spacetime transformations under which the fields are covariant:
$$x \mapsto x' = gx, \qquad \varphi_i(x) \mapsto \varphi_i'(x') = S_i^j(g)\,\varphi_j(x).$$
Assume that the fields satisfy the Euler-Lagrange equations (30) and that the action integral is invariant under $G$:
$$\int_{\Omega'} d^4x'\, \mathcal{L}'(x') = \int_{\Omega} d^4x\, \mathcal{L}(x),$$
for any region $\Omega \subset \mathbb{R}^4$ and its transform $\Omega'$ under the map $x \mapsto x'$. Then there exist $R$ conserved currents $J^{r,\mu}(x)$:
$$\partial_\mu J^{r,\mu}(x) = 0, \quad r = 1,2,\ldots,R = \dim G.$$
In addition, if the Lagrangian vanishes fast enough for $|\mathbf{x}| \to \infty$, the corresponding charges $Q^r(t)$ are constant:
$$\frac{d}{dt}Q^r(t) = 0, \quad r = 1,2,\ldots,R.$$
The proof of this theorem is essentially computational, using a variational approach (as with the Euler-Lagrange equations). Instead of giving it, we will discuss a series of concrete examples, which are of crucial importance in physics.

(1) Translations
An infinitesimal translation reads $x'^\mu = x^\mu + a^\mu$, with $|a^\mu| \ll 1$, and the conserved charges read
$$P^\nu = \int d^3x\, T^{0\nu} = \int d^3x\, \big(-g^{0\nu}\mathcal{L} + \pi_i\,\partial^\nu\varphi_i\big),$$
$$P^0 = \int d^3x\, \big(-\mathcal{L} + \pi_i\dot{\varphi}_i\big) = \int d^3x\, \mathcal{H}(x).$$
Thus $P^0$ coincides with the Hamiltonian, and represents the total energy of the system. By covariance, $P^k$ represents the total momentum and $T^{\mu\nu}$ is the energy-momentum tensor.

(2) Lorentz transformations
An infinitesimal Lorentz transformation is simply an infinitesimal rotation in Minkowski space, hence
$$x'^\mu = x^\mu + \epsilon_\nu{}^\mu x^\nu, \quad \epsilon^{\mu\nu} = -\epsilon^{\nu\mu}, \text{ with } |\epsilon_\nu{}^\mu| \ll 1,$$
$$\delta x^\mu = \epsilon_\nu{}^\mu x^\nu = \epsilon_{\alpha\beta}\, g^{\beta\mu} x^\alpha = \tfrac{1}{2}\,\epsilon_{\alpha\beta}\,(g^{\beta\mu}x^\alpha - g^{\alpha\mu}x^\beta).$$
Thus
$$\omega_r = \tfrac{1}{2}\epsilon_{\alpha\beta}, \quad r = 01,02,03,12,13,23,$$
$$X^{\alpha\beta,\mu} = x^\alpha g^{\beta\mu} - x^\beta g^{\alpha\mu} = -X^{\beta\alpha,\mu}.$$
The field transformation law is given by group theory: the (proper) Lorentz group has irreducible finite dimensional representations (nonunitary for $s \neq 0$) of dimension $(2s+1)$, with $s = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$, corresponding to scalar, spinor, vector, tensor fields, etc. The infinitesimal generators are
$$(G^r)_i^j = (G^{\alpha\beta})_i^j = -(G^{\beta\alpha})_i^j, \quad \text{with } \alpha,\beta = 0,1,2,3,\ i,j = 1,2,\ldots,2s+1.$$
The conserved currents become
$$J^{\alpha\beta,\mu} = \big(x^\alpha T^{\mu\beta} - x^\beta T^{\mu\alpha}\big) - \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_i)}\,(G^{\alpha\beta})_i^j\,\varphi_j.$$
The two terms represent orbital angular momentum and spin, respectively. It is important to notice that they are not conserved separately, only the total angular momentum is!

(3) Global gauge transformations: U(1)
In the abelian case, each field transforms separately, simply by a phase factor,
$$\varphi_i(x) \mapsto \varphi_i'(x) = e^{-i\alpha q_i}\varphi_i(x),$$
where $\alpha \in \mathbb{R}$ is the transformation parameter and the number $q_i \in \mathbb{R}$ specifies the behavior of the field $\varphi_i$. Since this is a purely internal transformation, we have $X^{r,\mu} = 0$. The parameters may thus be identified, for an infinitesimal transformation: $\delta\varphi_i = -i\alpha q_i\varphi_i$, so that $\omega_r = \alpha$, with $|\alpha| \ll 1$.
As for the conserved current and the charge, we get
$$Q = i\sum_{j=1}^{N} q_j \int d^3x\, \pi_j(x)\,\varphi_j(x).$$
(4) General (global) internal transformations
In this case, which covers, for instance, SU(2) (isospin), SU(3) (color), SU(3)×SU(2)×U(1) (Standard Model), the situation is exactly the same as in the case of U(1):
$$\varphi_i(x) \mapsto \varphi_i'(x) = \sum_{j=1}^{N} S_i^j(g)\,\varphi_j(x) \simeq \varphi_i(x) + \omega_r \sum_{j=1}^{N} (G^r)_i^j\,\varphi_j(x), \quad \text{for } |\omega_r| \ll 1.$$
Since this is a purely internal transformation, we have again $X^{r,\mu} = 0$. The conserved current and the charges read as before:
$$Q^r = -\int d^3x \sum_{i,j} \pi_i(x)\,(G^r)_i^j\,\varphi_j(x).$$
Remark: In the case of purely internal transformations, abelian or not, the invariance condition is simply
Such a condition, which is in general easy to verify, is the primary constraint in the derivation of classical field theory models. This is one of the strengths of the Lagrangian formalism. By extension, as we shall see in the next section, the same situation prevails in quantum field theory. This is the way in which successive models describing the interactions between elementary particles have been set up over the years, with the so-called Standard Model as the ultimate example, so far at least.

(5) Local internal transformations
An infinitesimal U(1) gauge transformation
$$\varphi_j(x) \mapsto e^{iq_j\theta}\varphi_j(x) \simeq (1 + iq_j\theta)\,\varphi_j(x), \quad \text{for } |\theta| \ll 1,$$
yields for the variation of the action
Now two situations may arise:
(i) For a global transformation, $\theta$ is constant and may be taken out of the integral, so that everything is as before.
(ii) For a local transformation $\theta = \theta(x)$, the argument may not work. For instance, a spinor field requires coupling to an electromagnetic field (Weyl). In that case there is no additional conservation law, but the coupling is uniquely determined (covariant derivative, leading to minimal coupling).
In the case of nonabelian local internal transformations, the result is the same, but a proper derivation requires the language of differential geometry (concepts such as a connection one-form and a curvature two-form are needed); we shall refrain from pursuing this any further, for lack of space.

5. Quantum Physics

5.1. Symmetries in Quantum Mechanics

We turn now to Quantum Mechanics. The following assumptions are standard:
(i) Pure states are represented by normalized rays (one-dimensional subspaces) in a Hilbert space $\mathfrak{H}$:
$$\hat{\psi} = \{\psi\, e^{i\alpha},\ \|\psi\|^2 = 1,\ 0 \leq \alpha < 2\pi\}.$$
We denote by $\hat{\mathfrak{H}}$ the corresponding projective Hilbert space, that is, the set of all one-dimensional subspaces of $\mathfrak{H}$. Note that $\hat{\mathfrak{H}}$ is not a vector space!
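(ii) The transition amplitude between two states $\hat{\psi}, \hat{\varphi}$ is $(\hat{\psi},\hat{\varphi}) = |\langle\psi|\varphi\rangle|$. Clearly, the choice of particular vectors $\psi \in \hat{\psi}$, $\varphi \in \hat{\varphi}$ is irrelevant.
(iii) The transition probability between two states is $P(\hat{\varphi} \to \hat{\psi}) = (\hat{\psi},\hat{\varphi})^2$.
(iv) Observables are represented by self-adjoint operators acting on $\mathfrak{H}$.
The motivation for these assumptions is twofold. First the superposition principle imposes a vector space and the notion of transition amplitude requires a scalar product space, thus one needs a prehilbert space. Then mathematical convenience suggests that this state space be complete, i.e., be a Hilbert space. In this set-up, a symmetry is defined as follows.

Definition 5.1. A symmetry is an automorphism of $\hat{\mathfrak{H}}$, that is, a bijective map between states, $\tau : \hat{\mathfrak{H}} \to \hat{\mathfrak{H}}$, that preserves transition amplitudes: $(\tau\hat{\psi},\tau\hat{\varphi}) = (\hat{\psi},\hat{\varphi})$, for all $\hat{\psi}, \hat{\varphi} \in \hat{\mathfrak{H}}$.
$$\mathcal{U}(\hat{\mathfrak{H}}) \simeq \mathcal{U}(\mathfrak{H})/U(1).$$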
Let us now turn to the case of a symmetry group. Suppose that the system possesses a Lie group $G$ of symmetries, that is, to every $g \in G$ one associates a symmetry $\tau_g$ that represents $g$. How does that fit into the present language? First of all, by Wigner's theorem 5.2, every $\tau_g$ is represented by an operator $U(g)$, unitary or antiunitary. Next, in a neighborhood of the identity, every $g$ is a square, $g = g'g'$, hence $U(g)$ must be unitary (the square of a unitary or of an antiunitary operator is unitary). This is true for all $g \in G$ if $G$ is connected, otherwise for all $g \in G_0$, where $G_0$ is the connected component of the identity of $G$.

In order to proceed, we introduce new notions of representations. A projective representation of $G$ in $\hat{\mathfrak{H}}$ is a homomorphism $T : G \to \mathcal{U}(\hat{\mathfrak{H}})$. Similarly, a projective representation of $G$ in $\mathfrak{H}$ (or representation up to a phase) is a map $U_\omega : G \to \mathcal{U}(\mathfrak{H})$ such that
$$U_\omega(g_1)\,U_\omega(g_2) = \omega(g_1,g_2)\,U_\omega(g_1g_2), \quad \text{with } |\omega(g_1,g_2)| = 1.$$
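A representation up to a phase can be made concrete on a small example. The sketch below is an illustration chosen here, not taken from the text. It assigns to the four elements of the Klein group $V_4 \simeq \mathbb{Z}_2 \times \mathbb{Z}_2$ the identity and the three Pauli matrices, and reads off the factor $\omega(g_1,g_2)$ from the products. The dictionary `U` and the helper `factor` are hypothetical names.

```python
from itertools import product
from typing import Dict, Tuple

Matrix = Tuple[Tuple[complex, ...], ...]
Element = Tuple[int, int]  # element of Z_2 x Z_2

I2: Matrix = ((1, 0), (0, 1))
SX: Matrix = ((0, 1), (1, 0))
SY: Matrix = ((0, -1j), (1j, 0))
SZ: Matrix = ((1, 0), (0, -1))

# Assignment g -> U(g): unitary matrices, multiplicative only up to a phase.
U: Dict[Element, Matrix] = {(0, 0): I2, (1, 0): SX, (0, 1): SY, (1, 1): SZ}


def mul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def factor(g1: Element, g2: Element) -> complex:
    """Phase omega with U(g1) U(g2) = omega * U(g1 g2); raises if none exists."""
    g12 = ((g1[0] + g2[0]) % 2, (g1[1] + g2[1]) % 2)
    lhs, rhs = mul(U[g1], U[g2]), U[g12]
    # Read omega off the first nonzero entry of U(g1 g2), then verify all entries.
    i, j = next((r, c) for r in range(2) for c in range(2) if rhs[r][c] != 0)
    omega = lhs[i][j] / rhs[i][j]
    if any(abs(lhs[r][c] - omega * rhs[r][c]) > 1e-12
           for r in range(2) for c in range(2)):
        raise ValueError("product is not a multiple of U(g1 g2)")
    return omega


omega = {(g1, g2): factor(g1, g2) for g1, g2 in product(U, repeat=2)}
a, b = (1, 0), (0, 1)
```

Here $\omega(a,b) = i$ while $\omega(b,a) = -i$: the group is abelian but the operators $U(a)$ and $U(b)$ anticommute, so this factor cannot be removed by redefining the phases of the $U(g)$, which would change $\omega(a,b)$ and $\omega(b,a)$ by the same amount.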
Next we have to introduce some notion of continuity of a representation. The correct way is to define topologies on $\mathcal{U}(\hat{\mathfrak{H}})$ and $\mathcal{U}(\mathfrak{H})$ that turn them into topological groups, but this is technically rather difficult. Instead, we will restrict ourselves to a simple-minded approach, namely, we will require, for physical reasons, that the matrix elements $(\hat{\psi},T_g\hat{\varphi})$ be continuous functions of $g$ in a neighborhood of the identity. The main result is the following theorem, which summarizes the discussion given by Wigner and Bargmann.

Theorem 5.3. (Wigner-Bargmann) If the symmetry group $G$ is continuous in the sense that the matrix elements $(\hat{\psi},T_g\hat{\varphi})$

Theorem 5.4. (Voisin) Let $G$ be a connected and simply connected group and let $U_\omega$ be any projective unitary representation of $G$, with factor $\omega(g_1,g_2) = e^{i\xi(g_1,g_2)}$. Then, $U_\omega$ may be deduced from a genuine unitary representation $U$ of a group $G_\omega$, which is a central extension of $G$ by $\mathbb{R}$, namely, $G_\omega = \{(\theta,g),\ \theta \in \mathbb{R},\ g \in G\}$, with multiplication law
$$(\theta_1,g_1)(\theta_2,g_2) = (\theta_1 + \theta_2 + \xi(g_1,g_2),\ g_1g_2).$$
The link between the two representations is given by the relation
$$U_\omega(g) = e^{-i\theta}\,U(\theta,g).$$
Here the extension is called central since $\{(\theta,e)\} \subset Z(G_\omega)$.

Examples:
(1) Many groups of physical interest do not have nontrivial factors $\omega$, for instance:
• abelian, connected and simply connected groups, e.g., $\mathbb{R}^n$;
• abelian, connected and compact groups;
• semisimple groups (because the Killing form is nondegenerate);
• in particular, SO(p,q), which is semisimple;
• ISO(p,q), the inhomogeneous pseudo-orthogonal group, provided that $p + q > 2$; the result is false for ISO(1,1), which has a one-parameter family of nontrivial factors.
(2) Poincaré group: the factor $\omega(g_1, g_2)$ may be reduced to $\pm 1$. This is the fundamental result of Wigner [13], in his pioneering paper (the first paper treating infinite-dimensional representations).
(3) Galilei group: there is a one-parameter family of nonequivalent factors $\xi_m(g_1, g_2)$, indexed by a parameter $m$, which is interpreted as the mass of the particle described by the representation. Thus, in nonrelativistic quantum mechanics, mass is absolutely conserved ("superselection rule"). This is the fundamental result of Bargmann [15].
A proper discussion of all this requires tools from cohomology theory, which go far beyond the present course.
As we have just seen, symmetries are represented by (possibly projective) strongly continuous unitary representations of Lie groups. What about observables? It turns out that most of them are described by elements of the corresponding Lie algebra. Consider a continuous one-parameter group
of symmetries, for instance time translations (time evolution). This means a family of unitary operators $U(t)$, $t \in \mathbb{R}$, such that
• $U(t_1)U(t_2) = U(t_1 + t_2)$, for all $t_1, t_2 \in \mathbb{R}$,
• $U(0) = I$, which implies $U(t)^* = U(t)^{-1} = U(-t)$,
• $U(t)$ is strongly continuous at $t = 0$.
By Stone's theorem, there exists a self-adjoint operator $H$, the infinitesimal generator (here the Hamiltonian), such that $U(t) = e^{-iHt}$, and defined as
$$H = \text{s-}\lim_{t \to 0} \frac{U(t) - I}{-it} \qquad (\text{s-lim} = \text{strong limit})$$
on the domain $D(H) = \{\varphi \in \mathfrak{H} \text{ such that the limit exists}\}$. In general, this operator is unbounded, so that some care must be exercised! More generally, for a given unitary representation $U(g)$, the generators of the corresponding Lie algebra are represented by unbounded self-adjoint operators, which represent observables, for instance:
• Time translations ⇒ Hamiltonian, total energy
• Space translations ⇒ total momentum
• Lorentz transformations ⇒ total angular momentum
• Gauge transformations ⇒ charges
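The relation between a one-parameter unitary group and its generator can be illustrated numerically in finite dimension, where all domain questions disappear. The following is a minimal sketch, assuming an arbitrary $2 \times 2$ Hermitian matrix as a stand-in for the Hamiltonian; the function names `unitary_group` and `generator` are hypothetical and not part of the lectures.

```python
import numpy as np


def unitary_group(H, t):
    """U(t) = exp(-i H t), built from the spectral decomposition of Hermitian H."""
    evals, evecs = np.linalg.eigh(H)
    # Scale each eigenvector column by its phase, then rotate back.
    return (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T


def generator(U, t=1e-6):
    """Finite-difference approximation of H = lim_{t->0} (U(t) - I) / (-i t)."""
    dim = U(0.0).shape[0]
    return (U(t) - np.eye(dim)) / (-1j * t)


# Hypothetical Hermitian "Hamiltonian", chosen only for illustration.
H = np.array([[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]])


def U(t):
    return unitary_group(H, t)


H_recovered = generator(U)
print(np.allclose(H_recovered, H, atol=1e-5))
```

In this bounded case the limit exists on the whole space; for an unbounded generator it exists only on the domain $D(H)$, which is precisely the difficulty discussed in the text.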
All these are automatically self-adjoint, but there are, of course, domain problems (which are mostly ignored in the physics literature). These problems lead naturally to the notions of analytic vectors or $C^\infty$ vectors, Gårding domain, etc. Thus we are back to the link between Lie groups and their Lie algebras, this time in terms of operators on Hilbert space.

5.2. Symmetries in Quantum Field Theory
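We have seen in Sec. 4.3 that symmetries in classical field theory are described by representations of Lie groups, while Noether's theorem leads to the conservation laws associated with these symmetries. How does this translate to quantum field theory? A possible solution is to consider matrix elements of field operators and treat them as classical field variables (correspondence principle):
$$F_j^{\alpha\beta}(x) = (\Phi_\alpha \,|\, \varphi_j(x)\,\Phi_\beta), \qquad (34)$$
where $\Phi_\alpha, \Phi_\beta$ are two state vectors. Consider for instance a Poincaré transformation
$$x^\mu \mapsto x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu, \quad (\Lambda, a) \in \mathcal{P}(1,3).$$
The transformation law of a classical field reads
$$\varphi_i^{cl}(x) \mapsto \varphi_i'^{\,cl}(\Lambda x + a) = S_i{}^j(\Lambda)\,\varphi_j^{cl}(x).$$
Apply the correspondence principle:
$$F_i^{\alpha\beta}(x) \mapsto F_i'^{\,\alpha\beta}(\Lambda x + a) = S_i{}^j(\Lambda)\,F_j^{\alpha\beta}(x),$$
where $F_i'^{\,\alpha\beta}(x') = (\Phi'_\alpha \,|\, \varphi_i(x')\,\Phi'_\beta)$ and $\Phi'_\alpha, \Phi'_\beta$ are the state vectors representing the states $\alpha, \beta$ in the new frame $\{x'\}$. By the Wigner-Bargmann theorem 5.3, $\Phi'_\alpha = U(\Lambda, a)\Phi_\alpha$ and $\Phi'_\beta = U(\Lambda, a)\Phi_\beta$, where $U(\Lambda, a)$ is a continuous unitary representation of $\mathcal{P}(1,3)$. Hence,
$$F_i'^{\,\alpha\beta}(x') = (\Phi'_\alpha \,|\, \varphi_i(\Lambda x + a)\,\Phi'_\beta) = (\Phi_\alpha \,|\, U^{-1}(\Lambda, a)\,\varphi_i(\Lambda x + a)\,U(\Lambda, a)\,\Phi_\beta) = S_i{}^j(\Lambda)\,(\Phi_\alpha \,|\, \varphi_j(x)\,\Phi_\beta).$$
Assuming that the states $\Phi_\alpha, \Phi_\beta$ run over a dense subset of $\mathfrak{H}$, we may conclude that
$$U^{-1}(\Lambda, a)\,\varphi_i(\Lambda x + a)\,U(\Lambda, a) = S_i{}^j(\Lambda)\,\varphi_j(x) \qquad (35)$$
or, equivalently,
$$U(\Lambda, a)\,\varphi_j(x)\,U^{-1}(\Lambda, a) = (S^{-1}(\Lambda))_j{}^i\,\varphi_i(\Lambda x + a). \qquad (36)$$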
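The same analysis can be made for an arbitrary Lie group of symmetries. The classical transformation law
$$\varphi_i^{cl}(x) \mapsto \varphi_i'^{\,cl}(gx) = S_i{}^j(g)\,\varphi_j^{cl}(x), \quad g \in G,$$
becomes in its quantum translation
$$U(g)\,\varphi_j(x)\,U^{-1}(g) = (S^{-1}(g))_j{}^i\,\varphi_i(gx), \quad g \in G,$$
where $U(g)$ is a unitary (projective) representation of $G$. For an infinitesimal transformation $x \mapsto gx \simeq x + \omega_r X^r$, $|\omega_r| \ll 1$, we get
$$U(g) = e^{i\omega_r K^r} \simeq I + i\omega_r K^r + O(\omega_r^2), \qquad S(g) = e^{\omega_r G^r} \simeq I + \omega_r G^r + O(\omega_r^2),$$
where $K^r, G^r$ are the infinitesimal generators in the representations $U, S$, respectively. This gives
$$(I + i\omega_r K^r)\,\varphi_j(x)\,(I - i\omega_r K^r) = (I - \omega_r G^r)_j{}^i\,\varphi_i(x + \omega_r X^r). \qquad (37)$$
On the other hand
$$\varphi_i(x^\mu + \omega_r X^{r,\mu}) \simeq \varphi_i(x^\mu) + \frac{\partial \varphi_i}{\partial x^\nu}\,\omega_r X^{r,\nu} = (I + \omega_r X^{r,\nu}\partial_\nu)\,\varphi_i(x^\mu).$$
To first order in $\omega_r$, this gives
$$i\,[K^r, \varphi_j(x)] = X^{r,\nu}\partial_\nu\,\varphi_j(x) - (G^r)_j{}^i\,\varphi_i(x).$$
In the case of Poincaré transformations, we thus obtain the consistency relations
$$i\,[P^\mu, \varphi_j(x)] = \partial^\mu \varphi_j(x),$$
$$i\,[J^{\alpha\beta}, \varphi_j(x)] = (x^\alpha \partial^\beta - x^\beta \partial^\alpha)\,\varphi_j(x) - (G^{\alpha\beta})_j{}^i\,\varphi_i(x).$$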
$$= (S^{-1}(g))_j{}^i\,\varphi_i(gx), \quad g \in G. \qquad (38)$$

5.3. Coherent states
Among the many applications of group representations in quantum physics, we point out one that has enjoyed considerable success (including the Nobel Prize in Physics 2005 to R.J. Glauber), namely, coherent states. These were originally introduced by Schrödinger in 1926 in the context of the classical limit of quantum mechanics, but then they were forgotten. They were reconsidered in the 1960s for the purpose of modelling the coherent light emitted by a laser, but it was soon recognized, by Perelomov [17] and Gilmore [4], independently, that this was essentially an application of group representation theory. A comprehensive discussion of the subject may be found in the recent monograph of Ali et al. [12]. Here we sketch only the general construction.
Let $G$ be a locally compact topological group, with left invariant measure $d\mu(g)$, and let $U$ be a square integrable UIR of $G$ in a Hilbert space $\mathfrak{H}$. Choose a nonzero admissible vector $\eta \in \mathfrak{H}$ and define
$$c(\eta) = \frac{1}{\|\eta\|^2} \int_G |(U(g)\eta \,|\, \eta)|^2\, d\mu(g).$$
Then the corresponding family of coherent states (CS) is the set $S = \{\eta_g = U(g)\eta,\ g \in G\}$. The essential properties of the CS family $S$ are neatly summarized in the following theorem.
Theorem 5.5. Let $G$ be a locally compact topological group, $U$ a square integrable UIR of $G$ in the Hilbert space $\mathfrak{H}$ and $\eta \in \mathfrak{H}$ a nonzero admissible vector. Define the (CS) map $W_\eta : \mathfrak{H} \to L^2(G, d\mu(g))$ by
$$(W_\eta \phi)(g) = \frac{1}{\sqrt{c(\eta)}}\,(\eta_g \,|\, \phi). \qquad (39)$$
Then:
(1) $W_\eta$ is an isometry, that is, $W_\eta^* W_\eta = I$, and $S$ defines a resolution of the identity:
$$\frac{1}{c(\eta)} \int_G |\eta_g)(\eta_g|\, d\mu(g) = I.$$
This implies that $S$ is total in $\mathfrak{H}$.
(2) The range $\mathfrak{H}_\eta$ of $W_\eta$ is a closed subspace of $L^2(G, d\mu(g))$ and the corresponding orthogonal projection $P_\eta = W_\eta W_\eta^*$ is an integral operator, with kernel
$$K_\eta(g, g') = \frac{1}{c(\eta)}\,(\eta_g \,|\, \eta_{g'}).$$
Thus $\mathfrak{H}_\eta$ is a reproducing kernel Hilbert space, that is, $\Phi \in L^2(G, d\mu(g))$ belongs to $\mathfrak{H}_\eta$ if and only if it satisfies the reproduction property
$$\Phi(g) = \int_G K_\eta(g, g')\,\Phi(g')\, d\mu(g').$$
(3) $W_\eta$ intertwines the representation $U$ and the left regular representation $U_L$ of $G$ in $L^2(G, d\mu(g))$:
$$W_\eta U(g) = U_L(g) W_\eta, \quad \text{for all } g \in G.$$
(4) The CS map $W_\eta$ may be inverted on its range by the adjoint map $W_\eta^*$, so that one gets a reconstruction formula for every vector $\phi \in \mathfrak{H}$:
$$\phi = W_\eta^* \Phi = \frac{1}{\sqrt{c(\eta)}} \int_G \Phi(g)\,\eta_g\, d\mu(g), \quad \Phi \in \mathfrak{H}_\eta.$$
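The resolution of the identity in Property (1) can be checked by hand in a toy setting. The sketch below is an illustration under assumptions that are not in the lectures: the group is the finite group $S_3$ with counting measure, $U$ is its two-dimensional irreducible representation by rotations and reflections of the plane, and the helper names (`s3_irrep`, `c_eta`, `resolution_operator`) are hypothetical. For a finite group every nonzero vector is admissible, and the integral becomes a sum over the six group elements.

```python
import numpy as np


def rotation(theta):
    """2 x 2 rotation matrix by the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


REFLECTION = np.array([[1.0, 0.0], [0.0, -1.0]])


def s3_irrep():
    """The six matrices of the 2-dimensional irreducible representation of S_3."""
    rots = [rotation(2 * np.pi * k / 3) for k in range(3)]
    return rots + [r @ REFLECTION for r in rots]


def coherent_states(eta):
    """The CS family {eta_g = U(g) eta}."""
    return [U @ eta for U in s3_irrep()]


def c_eta(eta):
    """c(eta) = (1 / ||eta||^2) * sum_g |(U(g) eta | eta)|^2 (counting measure)."""
    overlaps = sum(abs(np.vdot(U @ eta, eta)) ** 2 for U in s3_irrep())
    return overlaps / np.vdot(eta, eta).real


def resolution_operator(eta):
    """(1 / c(eta)) * sum_g |eta_g)(eta_g|, expected to equal the identity."""
    total = sum(np.outer(v, v.conj()) for v in coherent_states(eta))
    return total / c_eta(eta)


eta = np.array([0.6, -1.3])  # arbitrary nonzero vector
print(np.allclose(resolution_operator(eta), np.eye(2)))
```

In this finite case Schur's lemma gives $c(\eta) = (|G|/\dim U)\,\|\eta\|^2 = 3\|\eta\|^2$, which the function `c_eta` reproduces.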
The map $W_\eta$ is usually called the CS transform associated to the group $G$. Property (3) expresses its covariance under the operations of $G$. It also means that $U$ belongs to the discrete series of representations of $G$. As for Property (4), it means that every vector (signal) $\phi$ may be expressed as a (continuous) linear superposition of CSs. Thus the latter play the role of a continuous basis in $\mathfrak{H}$. The two most conspicuous examples of CS transforms are the Gabor and the wavelet transforms, based on the Weyl-Heisenberg and the similitude (or affine) groups, respectively.
(1) The Weyl-Heisenberg group
$$G_{WH} = \mathbb{R}^2 \times S^1 = \{(s, q, p) : s \in S^1,\ (q, p) \in \mathbb{R}^2\}.$$
The center of $G_{WH}$ is $Z = S^1 = \{(s, 0, 0)\}$ and the quotient $G_{WH}/Z$ is isomorphic to $\mathbb{R}^2$. The UIRs of $G_{WH}$ have a very simple form, thanks to von Neumann's uniqueness theorem. Indeed, the latter states that, for fixed $\lambda \neq 0$, all the UIRs $U^\lambda(s, q, p)$ of $G_{WH}$ are unitarily equivalent and have the form
$$U^\lambda(s, q, p) = e^{i\lambda s} D^\lambda(q, p),$$
where $D^\lambda(q, p)$ is a displacement operator. In the Schrödinger representation (for simplicity, we put $\lambda = 1$),
$$(D(q, p)f)(x) = e^{-ipq/2}\, e^{ipx} f(x - q), \quad x \in \mathbb{R}.$$
Moreover, each $D^\lambda$ is square integrable modulo $Z$:
$$\iint |(D^\lambda(q, p)\phi \,|\, \phi)|^2\, dq\, dp < \infty, \quad \text{for all } \phi \in \mathfrak{H}^\lambda = L^2(\mathbb{R}, dx).$$
Then, since the group $G_{WH}$ is unimodular, every function $f \in L^2(\mathbb{R}, dx)$ is admissible. The family of CSs so obtained are the canonical coherent states, the original ones introduced by Schrödinger in 1926 and rediscovered in the 1960s by Glauber, Klauder and Sudarshan [19]. The associated CS transform is variously called the Gabor transform, the Windowed Fourier transform, or the Short Time Fourier transform.
(2) The affine group of $\mathbb{R}$
$$G_{\rm aff} = \{(b, a) : b \in \mathbb{R},\ a \in \mathbb{R},\ a \neq 0\} = \mathbb{R} \rtimes \mathbb{R}_*.$$
The affine group, which is not unimodular, has only one square integrable representation (up to unitary equivalence, of course), acting in $L^2(\mathbb{R}, dx)$, namely,
$$[U(b, a)f](x) = |a|^{-1/2} f(a^{-1}(x - b)).$$
Then $\psi \in L^2(\mathbb{R}, dx)$ is admissible if and only if
$$\int_{-\infty}^{\infty} |\hat\psi(\xi)|^2\, \frac{d\xi}{|\xi|} < \infty,$$
where $\hat\psi$ denotes the Fourier transform of $\psi$. The associated CS transform is the one-dimensional wavelet transform.
(3) The similitude group of the plane $\mathrm{SIM}(2) = \mathbb{R}^2 \rtimes (\mathbb{R}_*^+ \times SO(2))$. The situation is exactly the same as in one dimension. First, one notices that the natural operations that may be applied to a signal $s \in L^2(\mathbb{R}^2, d^2x)$ are obtained by combining three elementary transformations:
$$s_{b,a,\theta}(x) = [U(b, a, \theta)s](x) = a^{-1} s(a^{-1} r_{-\theta}(x - b)),$$
where $b \in \mathbb{R}^2$ is a translation parameter, $a > 0$ is a dilation parameter and $r_\theta$ is a $2 \times 2$ rotation (orthogonal) matrix, $\theta \in [0, 2\pi)$. Then one checks that $U$ is indeed a UIR of SIM(2), acting in $L^2(\mathbb{R}^2, d^2x)$ and, again, it is the only one. Moreover $U$ is square integrable and a vector $\psi \in L^2(\mathbb{R}^2, d^2x)$ is admissible, and called a wavelet, if it satisfies the condition
$$\int_{\mathbb{R}^2} |\hat\psi(k)|^2\, \frac{d^2k}{|k|^2} < \infty.$$
The associated CS transform is the two-dimensional wavelet transform, which is studied thoroughly in our recent monograph [20].

5.4. Applications in quantum physics
There are plenty of applications of groups and group representations in quantum physics. Here are some of them (already mentioned in Sec. 1). A detailed description may be found in our article in the Encyclopedia of Physics [1].
Relativity: In order to completely define a physical system, one must choose a relativity group $G_{\rm rel}$, that is, the group that leaves invariant the chosen class of equivalent reference frames. The standard examples are the Euclidean group (Euclidean geometry), the Galilei group (nonrelativistic mechanics), the Lorentz or the Poincaré group (relativistic mechanics and electromagnetism). This choice then determines the tensorial properties of quantities, via the representations of the group $G_{\rm rel}$. Although these considerations apply to classical physics, they are essential in quantum physics as well, as we have seen in Secs. 5.1 and 5.2. For instance, the Poincaré group plays a central role in quantum field theory, in particular in the axiomatic approach. In studying atoms and molecules, the rotation group SO(3) enormously simplifies the analysis of spectroscopic data (e.g., selection rules); here again, this serves classification purposes. As we have seen in Sec. 4.2, the concept of symmetry may be extended further, for instance, to approximate symmetries, accidental symmetries (H-atom), or dynamical groups. Furthermore, the interaction of matter (mostly atoms) with light is a hot topic now that powerful lasers are available. There, of course, coherent states and their relatives, the so-called squeezed states, play a prominent role. As we have seen in Sec. 5.3, this is another field of application of group theory. As for solid state physics, it relies in an essential way on the crystallographic properties of solids, which are derived from group theory, as we have seen in Sec. 1.
Elementary particles: Since the early 1960s, the whole world of particle physics has been dominated by group theory. One may distinguish several successive stages:
• The original quark model is entirely based on the representations of SU(3).
• For understanding dynamical properties, additional symmetries are postulated: chiral symmetry SU(2)×SU(2), or SU(3)×SU(3), current algebra with a local SU(3)×SU(3) Lie algebra.
• Going over to gauge symmetries, the same scheme repeats itself: QCD is based on SU(3), the electroweak interactions are based on SU(2)×U(1), the Standard Model is based on SU(3)×SU(2)×U(1).
• The next step may be supersymmetry, which mixes bosons and fermions. But this leads also to new "super"mathematics: Lie superalgebras, supermanifolds, supergroups, etc.
In conclusion, one may say without exaggeration "...Except for calculus and linear algebra, no mathematical technique has been so successful" [1].
Acknowledgments
It is a pleasure to thank the organizers of COPROMAPH 4 and, in particular, M. N. Hounkonnou, for the invitation to give these lectures and for their hospitality. Special thanks are due to Daniela Roşca; her thorough reading and constructive criticisms have improved the text considerably in terms of rigor and coherence.
References
1. J.-P. Antoine, Group theory in Physics, in Encyclopedia of Physics, Vol. 1, pp. 941-952; R.G. Lerner and G.L. Trigg (eds.) (Third edition, Wiley-VCH, Weinheim, 2005).
2. H. Bacry, Leçons sur la Théorie des groupes et les Symétries des Particules Élémentaires (Gordon & Breach, New York, and Dunod, Paris, 1967).
3. A. O. Barut and R. Raczka, Theory of Group Representations and Applications (PWN, Warszawa, 1977).
4. R. Gilmore, Lie Groups, Lie Algebras, and Some of Their Applications (Wiley, New York and London, 1974).
5. E. Cartan, Sur la structure des groupes de transformation finis et continus, Thèse (Nony, Paris, 1894); Bull. Soc. Math. 41, 53 (1913); in Oeuvres Complètes, Vol. 1 (Gauthier-Villars, Paris, 1952).
6. H. Weyl, Theorie der Darstellung kontinuierlicher halbeinfacher Gruppen durch lineare Transformationen, I-IV, Math. Z. 23, 271 (1925); 24, 328, 377, 789 (1926). Reprinted in Selecta Hermann Weyl (Birkhäuser, Basel, 1956).
7. H. Hopf, Über den Rang geschlossener Liescher Gruppen, Comment. Math. Helv. 13, 119-143 (1940-1941); Maximale Toroide und singuläre Elemente in geschlossenen Lieschen Gruppen, Comment. Math. Helv. 15, 59-70 (1942-1943).
8. E. Stiefel, Comment. Math. Helv. 14, 350 (1941-1942); Kristallographische Bestimmung der Charaktere der geschlossenen Lie'schen Gruppen, Comment. Math. Helv. 17, 165-200 (1944-1945).
9. D. Speiser, Theory of Compact Lie Groups and some Applications to Elementary Particle Physics, in Group Theoretical Concepts and Methods in Elementary Particle Physics, NATO Summer School Istanbul 1962, F. Gürsey, ed., pp. 201-276 (Gordon and Breach, New York, 1964).
10. J.-P. Antoine and D. Speiser, Characters of irreducible representations of the simple groups. I. General theory. II. Application to classical groups, J. Math. Phys. 5, 1226-1234; 1560-1572 (1964).
11. J.-P. Antoine, David Speiser's group theory: From Stiefel's crystallographic approach to Kac-Moody algebras, in Two Cultures. Essays in Honour of David Speiser, pp. 13-23; K. Williams (ed.) (Birkhäuser, Basel, 2006).
12. S. T. Ali, J.-P. Antoine, and J.-P. Gazeau, Coherent States, Wavelets and their Generalizations (Springer, Berlin-Heidelberg-New York, 2000).
13. E. P. Wigner, Unitary representations of the inhomogeneous Lorentz group, Ann. Math. 40, 149-204 (1939).
14. D. J. Simms, Lie Groups and Quantum Mechanics (Springer Lecture Notes in Math., Vol. 52, Berlin-Heidelberg, 1968).
15. V. Bargmann, On unitary ray representations of continuous groups,
16. J. Voisin, On some unitary representations of the Galilei group. I. Irreducible representations. II. Two-particle systems, J. Math. Phys. 6, 1519-1529; 1822-1832 (1965). Symétrie galiléenne et mécanique quantique, Thèse de doctorat, Université de Liège, 1965-1966.
17. A. Perelomov, Coherent states for arbitrary Lie groups, Commun. Math. Phys. 26, 222-236 (1972); Generalized Coherent States and their Applications (Springer, Berlin, 1986).
18. R. Gilmore, Geometry of symmetrized states, Ann. Phys. (NY) 74, 391-463 (1972); On properties of coherent states, Rev. Mex. Fis. 23, 143-187 (1974).
19. J. R. Klauder and B. S. Skagerstam, Coherent States - Applications in Physics and Mathematical Physics (World Scientific, Singapore, 1985).
20. J.-P. Antoine, R. Murenzi, P. Vandergheynst and S. T. Ali, Two-Dimensional Wavelets and their Relatives (Cambridge University Press, Cambridge (UK), 2004).
LECTURES ON THE GAUGE THEORY/GRAVITY CORRESPONDENCE
ROBERT de MELLO KOCH
Department of Physics and Centre for Theoretical Physics, University of the Witwatersrand, Wits, 2050, South Africa and Stellenbosch Institute for Advanced Studies, Stellenbosch, South Africa
These lectures provide an introduction to the AdS/CFT correspondence. The goal is to review the recent progress which provides the gravitational duals of certain marginal deformations of $\mathcal{N} = 4$ super Yang-Mills theory. An attempt has been made to keep the lectures as self-contained as possible.
Keywords: AdS/CFT; Matrix Models.
1. The Holographic Principle
What are we going to do in this section and why? Quantum field theory is a surprisingly accurate description of the electromagnetic, weak and strong nuclear forces. When combined with general relativity, we find a rather impressive description of nature. However, it cannot possibly be complete: we know of physical phenomena that require both general relativity and quantum mechanics for their description. These include the big bang and the final stages of black hole evaporation. Unifying quantum mechanics and general relativity into a theory of quantum gravity is a hard (as yet) unsolved problem. In these lectures we will be following an approach to this problem, called the holographic principle. In this section, we start by introducing the principle, reviewing evidence for it and explaining why it may provide a valid approach to quantum gravity.
What is the holographic principle? Well, the holographic principle says that in a quantum theory of gravity the number of degrees of freedom available in the system scales like the surface area (and not the volume) of
the system. This is a very crude statement. We will be able to refine it a bit more than this, but we will not give a completely precise statement of the principle; that is an open research problem. The holographic principle is a radical suggestion; it signals a breakdown in extensivity.
What evidence do we have for the principle? Initial evidence for the holographic principle came from the physics of black holes. We will start our discussion by reviewing this evidence. To make our discussion concrete, let us discuss the Schwarzschild black hole
$$ds^2 = -f(r)\,dt^2 + f(r)^{-1}dr^2 + r^2 d\Omega^2, \qquad f(r) = 1 - \frac{2Gm}{r}.$$
The laws of black hole mechanics are stated in terms of the surface gravity of the black hole. The surface gravity is the force that is required of an observer at infinity to hold a particle of unit mass stationary at the event horizon. Let us compute the surface gravity of a Schwarzschild black hole. What force must an observer close to the horizon exert to hold the particle stationary? If we hold the particle stationary, it will have a worldline ($s$ is the parameter labeling different points on the worldline of the particle)
$$x^r = r, \quad x^\theta = \theta_0, \quad x^\phi = \phi_0, \quad x^0 = s f^{-1/2}.$$
In fact, the way we have set things up, $s$ is the proper time:
$$u^\alpha = \frac{dx^\alpha}{ds} = f^{-1/2}\delta^\alpha_0, \qquad u^\alpha g_{\alpha\beta} u^\beta = -1.$$
We can read the force from the proper acceleration
$$a^\alpha = u^\alpha{}_{;\beta}\, u^\beta.$$
The only non-vanishing component of the proper acceleration is
$$a^r = \frac{1}{2}\frac{\partial f}{\partial r},$$
so that
$$a = (g_{\alpha\beta}\, a^\alpha a^\beta)^{1/2} = \frac{1}{2} f^{-1/2} \frac{\partial f}{\partial r}.$$
Our Schwarzschild black hole has its horizon at $r = 2Gm$. It is easy to see that $a$ diverges at the horizon. This is the force required to hold the particle at $r$ if the force is applied locally. The surface gravity is the force required to hold the particle in place by an observer at infinity; call it $a_\infty(r)$. How can we get our hands on $a_\infty(r)$? Imagine the observer at infinity holds the particle in place by attaching to it a light inextensible string. Let the observer at infinity raise the string by a proper distance $\Delta s$. This observer will need to do a work
$$\Delta W_\infty = a_\infty(r)\,\Delta s.$$
At the particle's position, the displacement is also $\Delta s$, so that the work done is $\Delta W = a\,\Delta s$. Let us imagine that the work done is converted into radiation that is collected (i.e., measured) at infinity. The received energy is redshifted by a factor $f^{1/2}$, so that the energy received by the observer at infinity is $\Delta E_\infty = f^{1/2} a\,\Delta s$. Now, energy conservation demands that the energy extracted be equal to the energy put in, so that $\Delta E_\infty = \Delta W_\infty$ and hence
$$a_\infty(r) = f^{1/2} a(r) = \frac{1}{2}\frac{\partial f}{\partial r} = \frac{Gm}{r^2}.$$
For our Schwarzschild black hole, the surface gravity is (from now on I will use the symbol $\kappa$ to denote the surface gravity)
$$\kappa = a_\infty(r)\big|_{r=2Gm} = \frac{1}{4Gm}.$$
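The string argument above can be followed numerically. The sketch below is an illustration only: it assumes units in which $G = m = 1$ in the sample evaluation, and the function names (`local_acceleration`, `acceleration_at_infinity`, `surface_gravity`) are hypothetical. It evaluates the local acceleration $a = \frac{1}{2} f^{-1/2}\,\partial f/\partial r$, which blows up near $r = 2Gm$, and the redshifted quantity $a_\infty = f^{1/2} a$, which stays finite and approaches $1/(4Gm)$.

```python
import math


def f(r, G, m):
    """Schwarzschild function f(r) = 1 - 2Gm/r."""
    return 1.0 - 2.0 * G * m / r


def df_dr(r, G, m):
    """Derivative of f with respect to r: 2Gm / r^2."""
    return 2.0 * G * m / r**2


def local_acceleration(r, G, m):
    """Proper acceleration of a static particle; diverges as r -> 2Gm."""
    return 0.5 * df_dr(r, G, m) / math.sqrt(f(r, G, m))


def acceleration_at_infinity(r, G, m):
    """Force per unit mass exerted from infinity: redshift factor times local value."""
    return math.sqrt(f(r, G, m)) * local_acceleration(r, G, m)


def surface_gravity(G, m):
    """Limit of acceleration_at_infinity at the horizon r = 2Gm."""
    return 1.0 / (4.0 * G * m)


G, m = 1.0, 1.0
r_near_horizon = 2.0 * G * m * (1.0 + 5e-5)
print(local_acceleration(r_near_horizon, G, m))
print(acceleration_at_infinity(r_near_horizon, G, m), surface_gravity(G, m))
```

Far from the horizon the same function reduces to the Newtonian value $Gm/r^2$, which is a useful sanity check on the sign and normalization conventions assumed here.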
The zeroth law of black hole mechanics says that the surface gravity of a stationary black hole is uniform over the entire event horizon. We see explicitly for the Schwarzschild black hole that the zeroth law is true. To state the first law of black hole mechanics, consider a quasi-static process during which a stationary black hole of mass $m$ and surface area $A$ is taken to a new stationary black hole with parameters $m + \Delta m$ and $A + \Delta A$. The first law states that (see footnote a) the change in mass and surface area are related by
$$dm = \frac{\kappa}{8\pi G}\, dA.$$
Let us check this for our Schwarzschild black hole. What is the mass of our black hole? Let us study the motion of a particle far from our black hole so that we have small departures from flat space (and hence may apply Newtonian gravity) and at small velocities (so that we can apply Newtonian mechanics). The motion of our particle is governed by the geodesic equation
$$\frac{d^2 x^\lambda}{ds^2} = -\Gamma^\lambda_{\alpha\beta}\,\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds}, \qquad \Gamma^\lambda_{\alpha\beta} = \frac{1}{2} g^{\lambda\nu}\left(g_{\nu\alpha,\beta} + g_{\nu\beta,\alpha} - g_{\alpha\beta,\nu}\right).$$
Since we are working at small velocities, our particles have a four-velocity
$$u^\alpha = \frac{dx^\alpha}{ds}$$
with $u^0 \approx 1$ and $u^i \approx 0$. Thus, the (spatial components of the) geodesic equation become (we work far from the black hole, so $r$ is large)
$$\frac{d^2 r}{ds^2} \approx -\Gamma^r_{00} = \frac{1}{2} g^{rr}\frac{\partial g_{00}}{\partial r} \approx -\frac{Gm}{r^2}.$$
If we interpret this last equation as Newton's second law, we see we have the dynamics of a massive particle moving in the gravitational field of a point mass with mass $m$. Thus, the mass of our black hole is $m$. This is the result we were after. Next, the horizon is at (see footnote b) $r = 2Gm$, so that the horizon area is $A = 4\pi(2Gm)^2$. It is now easy to compute
$$\frac{dA}{dm} = 32\pi G^2 m, \qquad dm = \frac{1}{4Gm}\,\frac{dA}{8\pi G} = \kappa\,\frac{dA}{8\pi G},$$
in agreement with the first law. The second law of black hole mechanics states that the surface area of a black hole can never decrease, i.e., that $\Delta A \geq 0$. This so-called area theorem was proved by Stephen Hawking in 1971. It also holds for more than one black hole. Imagine you have two black holes which merge to form a single bigger black hole. The sum of the horizon areas of the two original black holes will be smaller than or equal to the horizon area of the final (bigger) black hole. We will not discuss this second law further (see footnote c).
Footnote a: The first law also covers the case that the black hole has an angular momentum and an electric charge. We are not going to consider this case, so we will not write the general form of the first law.
Footnote b: Passing through $r = 2Gm$, we see that the $g_{rr}$ component of the metric changes sign, and consequently timelike geodesics are driven to $r = 0$; no timelike geodesic can get from $r < 2Gm$ to $r > 2Gm$. Thus, $r = 2Gm$ is by definition the horizon.
Footnote c: There is also a third law of black hole mechanics that we will also not discuss here.
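The first-law check for the Schwarzschild black hole is simple enough to repeat with numbers. The following is a minimal sketch under stated assumptions: it uses $A = 4\pi(2Gm)^2$ and $\kappa = 1/(4Gm)$, takes $G = 1$ in the sample values, and the function names are hypothetical. The last function also illustrates the area theorem for an idealized merger in which, by assumption, no mass is radiated away.

```python
import math


def horizon_area(m, G):
    """Horizon area A = 4 pi (2 G m)^2 of a Schwarzschild black hole."""
    return 4.0 * math.pi * (2.0 * G * m) ** 2


def surface_gravity(m, G):
    """Surface gravity kappa = 1 / (4 G m)."""
    return 1.0 / (4.0 * G * m)


def first_law_residual(m, dm, G):
    """dm - kappa dA / (8 pi G) for a small mass increment dm; vanishes to first order."""
    dA = horizon_area(m + dm, G) - horizon_area(m, G)
    return dm - surface_gravity(m, G) * dA / (8.0 * math.pi * G)


def merger_area_change(m1, m2, G):
    """Area gained when two holes merge, assuming (idealization) no mass is radiated."""
    return horizon_area(m1 + m2, G) - horizon_area(m1, G) - horizon_area(m2, G)


G = 1.0
print(first_law_residual(1.0, 1e-4, G))
print(merger_area_change(1.0, 2.0, G))
```

The residual is of second order in the mass increment, as expected for a relation between differentials, and the merger always gains area because $(m_1 + m_2)^2 \geq m_1^2 + m_2^2$.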
The laws of black hole mechanics bear a striking resemblance to the laws of thermodynamics. The fact that the area of the horizon is non-decreasing suggests that we identify the area of the horizon with an entropy. The precise identification we use is
$$S_{BH} = \frac{A}{4G}.$$
Since the energy of the black hole is equal to the mass of the black hole, the first law of black hole mechanics, $dm = \frac{\kappa}{8\pi G}\,dA$, is to be compared to $dE = T\,dS$. In the analogy between black holes and thermodynamics, we see that the surface gravity is playing the role of a temperature. With this identification, the zeroth law of black hole mechanics becomes the statement that a system in thermal equilibrium has a uniform temperature. When this analogy was discovered, it was expected that at best it could only be a formal analogy. After all, the no-hair theorems tell us that all you need to specify the state of a black hole in 3 + 1 dimensions (see footnote d) are its mass, angular momentum and charge. How could such a fantastically simple object have such a huge entropy? Black holes do not seem to be disordered; they appear to be perfectly ordered! A big step forward was taken when Hawking was able to show that black holes do radiate. This gave a convincing independent computation of the temperature of a black hole and it matched perfectly with the temperature suggested by the above analogy. Part of the progress which led to the developments we are discussing came from attempts to give a derivation of the black hole entropy by counting black hole states in a microscopic quantum gravity description of a black hole. On this front string theory has had dramatic success. Now, to get to the holographic principle, we need to take a leap of faith: we have to suppose that the scaling we discovered for black holes, namely that entropy is proportional to the area (and not the volume) of the system, is a general property of gravity and not special to black holes. Both 't Hooft and Susskind have given arguments for why we are "forced" into taking this
Footnote d: I am mentioning the dimensionality of spacetime because the no-hair theorems do not hold in higher dimensions.
radical point of view. We will not review these arguments here, but rather, we will assume the holographic principle is a correct principle of nature and see where it leads us. If we want this to be a general principle of nature and not just a property of black holes, we might want to start asking ourselves where else it may apply. It seems a useful question to ask is, "Is there a way to get horizons other than with black holes?" There are many other ways. As a first example, consider de Sitter space
$$ds^2 = -\left(1 - \frac{r^2}{R^2}\right)dt^2 + \left(1 - \frac{r^2}{R^2}\right)^{-1}dr^2 + r^2 d\Omega^2.$$
This is the simplest accelerating universe you can think of. We usually patch this into cosmological solutions to describe the inflating phase of the universe. In this space, two observers can be separated, and moving away from each other so that a signal emitted by the first observer might not reach the second observer. This leads to what are called cosmological horizons. The holographic principle would then associate an entropy with this horizon. As a second example, consider an observer in Minkowski space undergoing uniform acceleration. This observer sees the Rindler spacetime. The Rindler spacetime has a horizon and the holographic principle would associate an entropy with this horizon. There is a beautiful extension of this idea, due to Jacobson. Jacobson identifies the area of the local Rindler horizon of a non-inertial observer (O) with an entropy $S$, the energy that flows across this horizon (as measured by O) with heat $Q$ and the Unruh temperature observed by O as the temperature $T$. One then finds that the equilibrium thermodynamic relation $\delta Q = T\,dS$ can only be satisfied if gravitational lensing by matter energy distorts the causal structure of spacetime in such a way that the Einstein equation holds. Thus, we obtain the equations of motion of gravity (the Einstein equation) from the relation connecting heat, entropy and temperature, which usually determines the equation of state! This result is remarkable: we are used to thinking of equations of motion as something more fundamental than equations of state. Equations of state only arise as a coarse grained approximation to the dynamics; equations of motion are meant to be the fundamental equations that encode the dynamics of the fundamental fields in the theory. We are used to thinking of the metric as a fundamental field. Is it possible that gravity and spacetime are simply effective concepts, arising
after some sort of coarse graining? If this is true, then it would not be appropriate to canonically quantize gravity! This suggests a radical new approach to quantum gravity. The first question we need to answer to get started is: what are the microstates (and what are their dynamics) that we averaged over to get spacetime? The AdS/CFT correspondence, which we will discuss in these lectures, offers a nice laboratory in which we can provide a concrete approach to quantum gravity, along the lines suggested by Jacobson's calculation. Just before ending this section, I would like to mention that Bousso has formulated covariant entropy bounds which attempt to provide a covariant statement of the holographic principle.
Suggested Reading: The original articles of 't Hooft and Susskind, which propose the holographic principle, are
[1] G. 't Hooft, Dimensional reduction in quantum gravity, arXiv:gr-qc/9310026;
[2] L. Susskind, The World as a hologram, J. Math. Phys. 36, 6377 (1995) [arXiv:hep-th/9409089].
For a lovely review of black holes in string theory, see
[3] S. R. Das and S. D. Mathur, The quantum physics of black holes: Results from string theory, Ann. Rev. Nucl. Part. Sci. 50, 153 (2000) [arXiv:gr-qc/0105063].
For very recent and fascinating developments in the description of black holes in quantum gravity, see
[4] S. D. Mathur, The quantum structure of black holes, arXiv:hep-th/0510180.
Jacobson's argument is given in
[5] T. Jacobson, Thermodynamics of space-time: The Einstein equation of state, Phys. Rev. Lett. 75, 1260 (1995) [arXiv:gr-qc/9504004].
Finally, Bousso's attempts to give a covariant holographic bound can be found in
[6] R. Bousso, Light-sheets and Bekenstein's bound, Phys. Rev. Lett. 90, 121302 (2003) [arXiv:hep-th/0210295].

2. Some String Theory
What are we going to do in this section and why? There are some good examples which provide concrete realizations of the holographic principle. One of the most interesting of these examples is the AdS/CFT correspondence, which was discovered using the tools of string theory. In this section we will
review the necessary string theory background we will need to understand AdS/CFT. String theory is a very simple idea: we usually think of elementary particles as points. To make the transition to string theory, we imagine that elementary particles are not points, but rather they are extended, little loops of string. A few comments are in order. Since the world is quantum mechanical, even point particles appear "fuzzy", smeared in space, so is it such a big deal that we now claim that elementary particles are not point-like? The key difference is that because the string is extended, we can "excite it" (set up vibrations in the string), just like we can with a violin or a piano string. At the quantum level, the energy in each of these vibration modes will be quantized and each of these excitation modes behaves like a distinct elementary particle. Thus, in a very natural way a single string gives rise to a vast (infinite) number of distinct elementary particles (see footnote e). This is the reason why strings provide such an elegant unification of different forces. There is no analog of these excitations for a point particle, illustrating a crucial difference between a point particle and a string. Another important point that deserves discussion is that these are fundamental strings. There is a big difference between a fundamental string and the shoe laces in your shoes. The shoe laces in your shoes are made by binding lots of atoms together; they are not fundamental strings, they are lumpy strings. When you stretch a lumpy string, the "lumps" (= atoms) move farther apart: you pull them out of their equilibrium configuration. If you measure how the tension changes as you stretch the string, you will learn something about the forces between the lumps. Hooke's law, for example, just tells us that the lumps are in equilibrium and hence, for small stretching, you can expand the potential about this equilibrium configuration.
The first term in the expansion of the potential depends on the square of the displacement from equilibrium and this in turn implies a linear force law. Fundamental strings are not built up from smaller constituents — when you stretch them their tension should not change. Also, there is no meaning to "bunching" the string up or "spreading" the string out — implying that there is no meaning to exciting longitudinal excitations in the string; for a fundamental string, there are only transverse excitations. The
[Footnote e: This is only true if the theory is weakly coupled. For a strongly coupled string theory, we would expect the highly excited modes to decay rapidly into lower energy excitations. In this case, it is probably only sensible to think of the low energy excitations as elementary particles, and the string would give rise to a finite number of particles.]
requirement that you preserve these properties at the full quantum level is nontrivial. It requires, for example, that superstrings propagate in 10 spacetime dimensions and further that spacetime itself is determined by higher dimensional supersymmetric generalizations of Einstein's field equations in 3 + 1 dimensions.

How should we interpret string theory? If we study the theory at long distances, we know that it will be a good approximation to think of the strings as point particles, which are described using quantum field theory. We have a lot of experience with quantum field theory, so figuring out what particles (= what string vibration modes) participate in the long distance limit will give us theories we are familiar with and know how to interpret. There are a number of consistent string theories that can be defined using perturbation theory. The one that is most interesting to us is the type IIB string theory.

When the string moves through spacetime, it traces out a worldsheet. Points on the worldsheet are labeled by a pair of coordinates, $(\tau, \sigma)$. These coordinates have no absolute meaning and the worldsheet theory is required to be invariant under general $(\tau, \sigma)$ coordinate transformations. The position of the point on the worldsheet with coordinate $(\tau, \sigma)$ is described by the spacetime coordinates $X^\mu(\tau, \sigma)$. We include a set of fermionic partners for these bosonic fields to obtain a supersymmetric theory. The dynamics of these fields can be specified by giving an action, which can then be used to write down classical equations of motion or to quantize the theory. In bosonic string theory, the action is proportional to the area of the worldsheet traced out by the string. If we do not introduce a worldsheet metric, the action for the string (called the Nambu-Goto action) involves a square root. 
By writing an action that makes use of a worldsheet metric (called the Polyakov action) we obtain a quadratic action, which is much easier to handle. If you eliminate the worldsheet metric from the Polyakov action, you recover the Nambu-Goto action, which demonstrates the equivalence of the two approaches. A straightforward approach to the superstring involves trying to generalize the Polyakov action. This is the approach we are taking.

Now, thanks to the diffeomorphism invariance (and in fact a Weyl symmetry) on the worldsheet, we can make many different gauge choices. The gauge-fixed, physical spectrum of the ten-dimensional Type-IIB string is easiest to calculate in the light-cone gauge. In the light-cone gauge, we choose

\[ \tau = X^+ = X^0 + X^9 . \]

In this gauge it is possible to eliminate $X^-$. Clearly Lorentz invariance is not manifest: only the transverse group of rotations (see footnote f) will be manifest. However, we have eliminated all unphysical modes, both the longitudinal oscillations we mentioned above and the oscillations created by the oscillators appearing in the mode expansion of $X^0$, which have negative norm.

In the light-cone gauge, the transverse group of rotations is $SO(8)$, whose covering group is $Spin(8)$. The three representations of $Spin(8)$ that will be relevant to us are the vector representation $8_v$, the spinor representation $8_s$, and the conjugate spinor representation $8_c$, which are all eight-dimensional. The spinor $8_s$ with right-handed chirality is related to the conjugate spinor $8_c$ with left-handed chirality by a parity transformation that flips the sign of one of the components of the vector $8_v$. We shall use the letters $i, j, k$ as the $8_v$ indices, the letters $a, b, c$ as the $8_s$ indices and the dotted letters $\dot a, \dot b, \dot c$ as the $8_c$ indices.

In the Green-Schwarz formalism in the light-cone gauge, the worldsheet action of the Type-IIB string is given by

\[ S = -\frac{1}{2\pi}\int d\sigma\, d\tau \left( \partial_+ X^i \partial_- X^i - i S^a \partial_- S^a - i \tilde S^a \partial_+ \tilde S^a \right), \]
where $\sigma$ is the coordinate along the string, $0 \le \sigma < 2\pi$, $\tau$ is the worldsheet time and we have set $\alpha' = \tfrac12$. The bosonic fields on the string worldsheet, $X^i$, describe the transverse spatial coordinates of the string. In addition to these bosonic fields there are fermionic fields on the worldsheet: left-moving $S^a$ and right-moving $\tilde S^a$, both of which transform as $8_s$. Since both right and left movers have the same spacetime transformation properties, this theory is non-chiral on the worldsheet. (Left-right symmetric means non-chiral.) But since only the right-handed chirality of spacetime fermions appears and not its parity transform, the theory is chiral in spacetime. Another inequivalent choice is to take left-moving $S^{\dot a}$, which transforms as a left-handed conjugate spinor, and right-moving $\tilde S^a$, which transforms as the right-handed spinor. This choice gives Type-IIA theory, which has opposite chirality properties. To summarize, we have:

    Theory   Left-movers   Right-movers   In spacetime   On the worldsheet
    IIB      S^a           \tilde S^a     chiral         nonchiral
    IIA      S^{\dot a}    \tilde S^a     nonchiral      chiral
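Quantization of this 1 + 1 dimensional free field theory is straightforward. The bosons $X^i$ satisfy periodic boundary conditions along $\sigma$, and by spacetime supersymmetry so do the fermions: $S^a(\sigma + 2\pi) = S^a(\sigma)$, etc. With this boundary condition, the mode expansion is

\[ X^i = x^i + \tfrac12 p^i \tau + \frac{i}{2} \sum_{n \neq 0} \frac{1}{n} \left( \alpha_n^i e^{-in(\tau+\sigma)} + \tilde\alpha_n^i e^{-in(\tau-\sigma)} \right), \]
\[ S^a = \frac{1}{\sqrt2} \sum_{n=-\infty}^{\infty} S_n^a e^{-in(\tau+\sigma)}, \qquad \tilde S^a = \frac{1}{\sqrt2} \sum_{n=-\infty}^{\infty} \tilde S_n^a e^{-in(\tau-\sigma)}. \qquad (1) \]

[Footnote f: Transverse to $X^\pm$.]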
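Canonical quantization of the fields implies standard commutation and anticommutation relations for the oscillator modes:

\[ [\alpha_m^i, \alpha_n^j] = m\,\delta^{ij}\delta_{m+n,0}, \qquad \{S_m^a, S_n^b\} = \delta^{ab}\delta_{m+n,0}, \]
\[ [\tilde\alpha_m^i, \tilde\alpha_n^j] = m\,\delta^{ij}\delta_{m+n,0}, \qquad \{\tilde S_m^a, \tilde S_n^b\} = \delta^{ab}\delta_{m+n,0}. \]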
The zero modes of the $X^i$ fields satisfy the Heisenberg commutation relations $[x^i, p^j] = i\delta^{ij}$, and the ground state is therefore labeled by the momentum eigenvalue, $|p\rangle$. Note that there are fermionic zero modes as well, $S_0^a$ and $\tilde S_0^a$. The ground state should furnish a representation of the zero mode algebra

\[ \{S_0^a, S_0^b\} = \delta^{ab}, \qquad \{\tilde S_0^a, \tilde S_0^b\} = \delta^{ab}. \]
Let us look at the left-movers, since the right-movers can be treated similarly. Let us rewrite the anticommutation relations by defining four fermionic oscillators $\sqrt2\, b_m = (S_0^{2m-1} + i S_0^{2m})$, $m = 1, \ldots, 4$, which satisfy the usual anticommutation relations

\[ \{b_m, b_n^\dagger\} = \delta_{mn}, \qquad \{b_m, b_n\} = 0, \qquad \{b_m^\dagger, b_n^\dagger\} = 0. \]
This rewriting amounts to choosing a particular embedding $SO(8) \supset SU(4) \times U(1)$, so that the $\{b_m\}$ transform in the fundamental representation $4$ of $SU(4)$ with $\tfrac12$ unit of $U(1)$ charge, which we denote as $4(\tfrac12)$, and the $\{b_m^\dagger\}$ transform in the complex conjugate representation. With this embedding various representations decompose as

\[ 8_v = 6(0) + 1(1) + 1(-1), \qquad 8_s = 4(\tfrac12) + \bar 4(-\tfrac12), \qquad 8_c = 4(-\tfrac12) + \bar 4(\tfrac12). \qquad (2) \]

This embedding is more obvious if we use the fact that $SO(6) \sim SU(4)$ and $SO(2) \sim U(1)$. Then the above is a decomposition of the $SO(8)$ spinor in terms of the $SO(6)$ spinor and its conjugate under the embedding $SO(8) \supset SO(6) \times SO(2)$.

The representation of the zero mode algebra can now be worked out easily by starting with the completely "empty" Fock space vacuum $|0\rangle$, which is annihilated by all the $b_m$'s, and then obtaining various filled states by acting with the creation operators. One obtains a 16-dimensional representation:

    |0>                                          1(1)
    b_m^\dagger |0>                              \bar 4(1/2)
    b_m^\dagger b_n^\dagger |0>                  6(0)
    b_m^\dagger b_n^\dagger b_k^\dagger |0>      4(-1/2)
    b_m^\dagger b_n^\dagger b_k^\dagger b_l^\dagger |0>   1(-1)
where the labels in the second column indicate the dimensions of the $SU(4)$ representation and the $U(1)$ charges. We see from the $SU(4) \times U(1)$ quantum numbers that the 16-dimensional representation of the left-moving ground states reduces as a sum of two representations, $8_v + 8_c$. Similarly, for the right-movers, the ground states are given by the sum $8_v + 8_c$. A string state $|\psi\rangle$ is constructed by acting with various creation operators on this 16 x 16-dimensional ground state carrying some spacetime momentum $p$. A physical state is subject to the on-shell conditions

\[ \alpha' M^2 = -\alpha' p^\mu p_\mu = 4N = 4\tilde N, \qquad (3) \]

where $N$ and $\tilde N$ are the left-moving and right-moving oscillator level numbers.
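The counting of the 16 zero-mode states can be checked with a few lines of code. The following is a minimal sketch, assuming the conventions quoted in the text (the empty vacuum carries $U(1)$ charge $+1$ and each creation operator lowers the charge by $\tfrac12$); the function names are illustrative and not part of the original lectures.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def zero_mode_states(n_osc=4):
    """List all Fock states built from n_osc fermionic creation operators.

    Each state is a pair (occupied oscillators, U(1) charge). Assumed
    conventions: vacuum charge +1, each creation operator shifts it by -1/2.
    """
    states = []
    for k in range(n_osc + 1):
        for occupied in combinations(range(1, n_osc + 1), k):
            states.append((occupied, Fraction(1) - Fraction(k, 2)))
    return states


def multiplet_table(n_osc=4):
    """Map filling level k to (number of states, U(1) charge)."""
    return {k: (comb(n_osc, k), Fraction(1) - Fraction(k, 2))
            for k in range(n_osc + 1)}


states = zero_mode_states()
table = multiplet_table()
# Even fillings (1 + 6 + 1) and odd fillings (4 + 4) each give 8 states.
even_filling = sum(dim for k, (dim, _) in table.items() if k % 2 == 0)
odd_filling = sum(dim for k, (dim, _) in table.items() if k % 2 == 1)
print(len(states), even_filling, odd_filling)
```

The even fillings reproduce the content $1(1) + 6(0) + 1(-1)$ of $8_v$, and the odd fillings reproduce the two four-dimensional pieces of $8_c$. The code only counts states and charges; it does not distinguish $4$ from $\bar 4$.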
We see that the massless states have no oscillator excitations and can therefore be read off by tensoring the left-moving and right-moving ground states, $(|i\rangle \oplus |\dot a\rangle) \otimes (|j\rangle \oplus |\dot b\rangle)$. In the Neveu-Schwarz-Ramond formalism of the superstring, which is equivalent to the Green-Schwarz formalism that we have used here, the $8_v$ comes from the Neveu-Schwarz (NS) sector, whereas the $8_c$ comes from the Ramond (R) sector. The Neveu-Schwarz states are spacetime bosons whereas Ramond states are spacetime fermions. In the tensor product of left-moving and right-moving states, the NS-R and R-NS sectors give rise to spacetime fermions, which are the two gravitini of the Type-IIB string. The NS-NS states $|i\rangle \otimes |j\rangle$ can be reduced in terms of the symmetric traceless, antisymmetric, and scalar combinations, which give rise to the metric $g_{ij}$, the 2-form $B_{ij}$, and the dilaton $\phi$. [...]

\[ \sim \lambda_1^T \lambda_2 \oplus \lambda_1^T \Gamma^{ij} \lambda_2 \oplus \lambda_1^T \Gamma^{ijkl} \lambda_2, \]

in terms of the Gamma matrices $\Gamma^i$ and their totally antisymmetrized products $\Gamma^{ij}$ and $\Gamma^{ijkl}$. Because $\lambda_1$ and $\lambda_2$ have the same chirality, products such as $\Gamma^i$ and $\Gamma^{ijk}$ do not appear, and moreover, the combination $\lambda_1^T \Gamma^{ijkl} \lambda_2$ is required to be self-dual. Altogether we obtain a scalar $\chi$, a 2-form $B'_{ij}$, and a self-dual 4-form $D_{ijkl}$ from the R-R sector.

In summary, the massless spectrum of Type-IIB is as follows. Bosons: NS-NS: metric $g_{ij}$, 2-form $B_{ij}$, dilaton $\phi$ [...]

[...] $\int A_\mu dx^\mu$ to the action. This is allowed because it respects the symmetries of the action, namely that it is independent of the specific parameterization chosen for the particle's worldline. In exactly the same way, the term

\[ \int B_{ij}\, dx^i \wedge dx^j \]

is a perfectly acceptable addition to the worldsheet action of a string. The fundamental string couples to the $B_{ij}$ field. The only remaining field in the NS-NS sector is the dilaton $\phi$. The dilaton $\phi$ sets the strength of string interactions. The fields in the RR sector couple to Dirichlet branes. The term $\int D_{ijkl}\, dx^i \wedge dx^j \wedge dx^k \wedge dx^l$ would couple to something with a 3 + 1 dimensional world volume, which we call a D3 brane. The lesson is clear: fields with many indices are a signal that the theory contains higher dimensional objects.
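The index counting behind this lesson can be made explicit. The sketch below is illustrative only: it assumes the standard rule that a $p$-form potential is integrated over a $p$-dimensional worldvolume, together with the usual Hodge-dual counting in $D$ spacetime dimensions (dual potential of degree $D - p - 2$). The function names are hypothetical.

```python
def electric_brane(form_degree):
    """A p-form potential is integrated over a p-dimensional worldvolume,
    so it couples electrically to an object with p - 1 spatial dimensions."""
    return {"worldvolume_dim": form_degree, "spatial_dim": form_degree - 1}


def magnetic_brane(form_degree, spacetime_dim=10):
    """Hodge-dual counting: the field strength has degree p + 1, its dual has
    degree D - p - 1, and the dual potential therefore has degree D - p - 2."""
    dual_degree = spacetime_dim - form_degree - 2
    return {"dual_potential_degree": dual_degree,
            "worldvolume_dim": dual_degree,
            "spatial_dim": dual_degree - 1}


# One-form gauge field: couples to a particle worldline.
print(electric_brane(1))
# Two-form B field: couples to a string worldsheet.
print(electric_brane(2))
# Four-form: couples to a 3 + 1 dimensional world volume.
print(electric_brane(4))
# Dual of the two-form in ten dimensions: a six-form potential.
print(magnetic_brane(2))
```

With `spacetime_dim=4` and `form_degree=1` the same counting gives a dual one-form potential, which is the familiar statement that the magnetic dual of a point charge in 3 + 1 dimensions is again a point-like object.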
The couplings we have discussed above are "electric couplings" to the potential. There are also "magnetic couplings", which we describe now. The two-form NS-NS field $B_{ij}$ gives rise to a three-form field strength $H_3 = dB_2$. Contracting $H_3$ with the ten dimensional $\epsilon$ tensor gives a 7-form dual field strength, $H_7 = {}^*H_3$. This field strength in turn can be written in terms of a 6-form potential, $H_7 = dA_6$. This 6-form potential will couple to a brane with a 5 + 1 dimensional worldvolume, something called an NS5 brane. In exactly the same way that electric charges and magnetic monopoles are related by electromagnetic duality in 3 + 1 dimensions, electromagnetic duality relates NS5 branes and fundamental strings in 9 + 1 dimensions.

Exercise: Describe the electric and magnetic sources for the RR-potentials $\chi$, $B'_{ij}$ and $D_{ijkl}$.

Suggested Reading: There are a number of excellent textbooks dealing with string theory. A very readable, beautiful book to get you started is
[1] B. Zwiebach, A First Course in String Theory (Cambridge University Press, Cambridge (UK), 2004).
The classics (quite a bit tougher) are
[2] M. Green, J. Schwarz and E. Witten, Superstring Theory, Volumes 1 and 2 (Cambridge University Press, Cambridge (UK), 1987);
[3] J. Polchinski, String Theory, Volumes 1 and 2 (Cambridge University Press, Cambridge (UK), 1998).
Another very readable book with a lot of the modern developments is
[4] C.V. Johnson, D-branes (Cambridge University Press, Cambridge (UK), 2003).

3. AdS/CFT

What are we going to do in this section and why? The AdS/CFT conjecture states that type IIB string theory on the $AdS_5 \times S^5$ background is dual to $\mathcal{N} = 4$ super Yang-Mills theory in 3 + 1 dimensional Minkowski space. Type IIB string theory is a theory of quantum gravity. The boundary of $AdS_5$ space is 3 + 1 dimensional Minkowski space. Thus, the AdS/CFT conjecture
claims an equivalence between a theory of quantum gravity and a nongravitational theory living on its boundary, thereby providing a concrete realization of the holographic principle.

We already saw in the last section that in addition to strings, superstring theory contains soliton-like "membranes" of various internal dimensionalities; we call these solitons Dirichlet branes (or D-branes). In this section we will explore how these D-branes can be described. One approach to the Dirichlet p-brane (or Dp-brane) dynamics views the brane as a p + 1 dimensional hyperplane in 9 + 1 dimensional space-time where strings are allowed to end, even in theories where all strings are closed in the bulk of space-time: when a closed string touches the membrane, the closed string can open up and turn into an open string whose ends are free to move along the D-brane. For the end-points of such a string the p + 1 longitudinal coordinates satisfy the conventional free (Neumann) boundary conditions, while the 9 - p coordinates transverse to the Dp-brane have the fixed (Dirichlet) boundary conditions; this is the origin of the term "Dirichlet brane". Polchinski has argued that the Dp-brane, with dynamics as described above, is a BPS saturated object which preserves 1/2 of the bulk supersymmetries and carries an elementary unit of charge with respect to the p + 1 form gauge potential from the Ramond-Ramond sector of the type II superstring. A striking feature of the D-brane formalism is that it provides a concrete (and very simple) embedding of such objects into perturbative string theory: the D-brane is accounted for by modifying the boundary conditions appearing in the worldsheet sigma model.

The massless spectrum of open strings living on a Dp-brane is that of a maximally supersymmetric $U(1)$ gauge theory in p + 1 dimensions. The 9 - p massless scalar fields present in this supermultiplet are the expected Goldstone modes associated with the transverse oscillations of the Dp-brane, while the photons and fermions provide the unique supersymmetric completion of the theory. Thus, the D-branes naturally realize gauge theories on their world volume. If we consider $N$ parallel D-branes, then there are $N^2$ different species of open strings because they can begin and end on any of the D-branes. $N^2$ is the dimension of the adjoint representation of $U(N)$, and indeed we find the maximally supersymmetric $U(N)$ gauge theory in this setting. The relative separations of the Dp-branes in the 9 - p transverse dimensions are determined by the expectation values of the scalar fields. We will be interested in the case where all scalar expectation values vanish, so that the $N$ Dp-branes are stacked on top of each other. If $N$ is large, then this stack is a heavy object embedded into a theory of closed
strings which contains gravity. This macroscopic object will curve space and it may thus be described by some classical metric and other background fields, such as the Ramond-Ramond p + 1 form potential. Thus, we have two very different descriptions of the stack of Dp-branes: one in terms of the $U(N)$ supersymmetric gauge theory on its world volume, and the other in terms of the classical Ramond-Ramond charged p-brane background of the type II closed superstring theory. It is the expected agreement between these two descriptions that leads to the AdS/CFT conjecture.

Example: The action for $N$ Dp-branes can be obtained by dimensionally reducing the action for $\mathcal{N} = 1$ supersymmetric Yang-Mills theory in 9 + 1 dimensions with gauge group $U(N)$ to p + 1 dimensions. The bosonic part of the 9 + 1 dimensional Yang-Mills action is

\[ S = -\frac14 \int d^{10}x\, \mathrm{Tr}\left( F_{\mu\nu} F^{\mu\nu} \right), \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + i g [A_\mu, A_\nu]. \]

To dimensionally reduce, you should (i) replace $d^{10}x \to d^{p+1}x$ and (ii) drop any derivative with respect to the coordinates that have been discarded. Following this procedure for D0-branes, we obtain the following Lagrangian (I have relabeled the spatial components of the gauge field $A^i$ as $X^i$):

\[ L = \mathrm{Tr}\left( \frac12 D_0 X^i D_0 X^i + \frac{g^2}{4} [X^i, X^j][X^i, X^j] \right). \]
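The two steps of the reduction can be spelled out. What follows is a sketch under stated assumptions, not the author's own derivation: a mostly-plus metric signature (so that $F^{0i} = -F_{0i}$), the $-\tfrac14 \mathrm{Tr}\, F^2$ normalization quoted above, and fields that depend only on time, so that every spatial derivative is dropped.

```latex
\begin{align*}
F_{0i} &= \partial_0 X^i + i g\,[A_0, X^i] \equiv D_0 X^i ,
\qquad
F_{ij} = i g\,[X^i, X^j] , \\
-\tfrac14\, \mathrm{Tr}\, F_{\mu\nu}F^{\mu\nu}
 &= -\tfrac14\, \mathrm{Tr}\left( 2 F_{0i}F^{0i} + F_{ij}F^{ij} \right) \\
 &= \mathrm{Tr}\left( \tfrac12\, D_0 X^i D_0 X^i
      + \tfrac{g^2}{4}\, [X^i, X^j][X^i, X^j] \right).
\end{align*}
```

Since the commutator of two Hermitian matrices is anti-Hermitian, the second term is non-positive, so it contributes a non-negative potential energy $V = -\tfrac{g^2}{4}\mathrm{Tr}\,[X^i,X^j][X^i,X^j]$.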
The matrix element $(X^i)_{ab}$ is to be interpreted as the field describing an open string stretching from the $a$th brane to the $b$th brane. Thus, the diagonal elements of the $X^i$ give the positions of the branes. If we want to describe two D0-branes, we would use 2 x 2 matrices $X^i$. For example, to put one D0-brane at $X^1 = a$, $X^i = 0$ for $i > 1$, and another at $X^1 = -a$, $X^i = 0$ for $i > 1$, we would set

\[ X^1 = \begin{pmatrix} a & 0 \\ 0 & -a \end{pmatrix} + \begin{pmatrix} \delta x^1_{11} & \delta x^1_{12} \\ \delta x^1_{21} & \delta x^1_{22} \end{pmatrix}, \qquad X^i = \begin{pmatrix} \delta x^i_{11} & \delta x^i_{12} \\ \delta x^i_{21} & \delta x^i_{22} \end{pmatrix}, \quad i > 1. \]

The fluctuations about this configuration (described by the $\delta x$ pieces in the above matrices) have a mass squared $\propto 4a^2$, as expected for the strings stretching between the two D0-branes, which obviously have a length equal to $2a$.

Exercise: By plugging the above expressions for the matrices $X^i$ into the action, verify the masses of the $\delta x$ modes.
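A quick numerical check of the $4a^2$ scaling is possible before doing the exercise by hand. The sketch below is a hedged illustration: it assumes the potential $V = -\tfrac{g^2}{4} \sum_{i,j} \mathrm{Tr}\,[X^i,X^j][X^i,X^j]$ and the kinetic normalization $\tfrac12 \mathrm{Tr}\,\dot X^2$, keeps only two transverse directions, and turns on a fluctuation in the second direction only. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def potential(Xs, g=1.0):
    """V = -(g^2 / 4) * sum over ordered pairs (i, j) of Tr([Xi, Xj]^2)."""
    total = 0.0
    for Xi in Xs:
        for Xj in Xs:
            C = commutator(Xi, Xj)
            total += trace(matmul(C, C)).real
    return -(g ** 2 / 4.0) * total


def mass_squared(a, fluctuation, g=1.0):
    """Ratio of the potential to the kinetic normalization (1/2) Tr(dX^2)
    for a fluctuation dX in direction 2 around X^1 = diag(a, -a)."""
    background = [[a, 0], [0, -a]]
    norm = trace(matmul(fluctuation, fluctuation)).real / 2.0
    return potential([background, fluctuation], g) / norm


w = 0.3 + 0.1j
off_diagonal = [[0, w], [w.conjugate(), 0]]   # string stretched between branes
diagonal = [[0.2, 0], [0, -0.2]]              # motion of the branes themselves

print(mass_squared(1.5, off_diagonal))  # close to 4 * g^2 * a^2
print(mass_squared(1.5, diagonal))      # zero: diagonal modes stay massless
```

For this restricted configuration the potential is exactly quadratic in the fluctuation, so the ratio does not depend on the size of `w`. Doubling `a` should quadruple the off-diagonal result, in line with a stretched string of length $2a$.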
To make our above discussion more precise, start by studying the threebranes in IIB string theory. The metric of the threebrane is given by

\[ ds^2 = \left(1 + \frac{L^4}{r^4}\right)^{-1/2} \left( -dt^2 + dx_1^2 + dx_2^2 + dx_3^2 \right) + \left(1 + \frac{L^4}{r^4}\right)^{1/2} \left( dr^2 + r^2 d\Omega_5^2 \right). \]

Clearly $r$ is a radial coordinate for six spatial dimensions. To interpret this metric, again consider the motion of a particle far from the brane, i.e., in the large $r$ region (so that we again have small departures from flat space) and at low velocities, so that we can again apply Newtonian mechanics. Arguing just like in Sec. 1, we find an acceleration of magnitude

\[ \left| \frac{d^2 x^i}{ds^2} \right| \sim \frac{L^4}{r^5}. \]

The gravitational force in $d$ spatial dimensions falls off like $1/r^{d-1}$, so the above force law is appropriate for the gravitational field of a point mass in six spatial dimensions. The correct interpretation of the above metric is that it describes an object located at $r = 0$ and filling the $(x^0, x^1, x^2, x^3)$ dimensions, so we see that the above metric does indeed describe a threebrane.

Now, let us take a low energy limit of this theory. We will have very long wavelength supergravity modes propagating in spacetime. In addition, modes close to the brane will be redshifted to low energy (see footnote g), so we need to keep all of the modes of the full string theory in the near horizon limit of the branes. The two sets of modes will not interact: the low energy supergravity modes have such a long wavelength that they do not even detect the (comparatively) tiny threebranes. Thus, we get the dynamics of two decoupled sectors: long wavelength supergravity modes propagating in flat 10 dimensional Minkowski space and all of the modes of IIB string theory in the near horizon geometry of the threebranes. To get the near horizon geometry of the threebranes, we study the metric in the limit that $r$ is much less than $L$. In this limit

\[ ds^2 = \frac{L^2}{z^2} \left( -dt^2 + dx^2 + dz^2 \right) + L^2 d\Omega_5^2, \]

where $z = L^2/r \gg L$. This is the metric of the $AdS_5 \times S^5$ space. Thus, in the low energy limit, we obtain long wavelength supergravity modes propagating in flat 10 dimensional Minkowski space and all of the modes of IIB string theory in the $AdS_5 \times S^5$ geometry.

[Footnote g: If you prefer to turn this into a statement about Newtonian gravity: the D-branes set up a very deep gravitational potential well; the modes sitting really close to the branes are at the bottom of the well and hence have a low energy.]

Again, consider the same system, but this time, instead of replacing the branes with a metric and background RR fluxes, we will include the branes by allowing them to determine the boundary conditions of new open string degrees of freedom. In addition to these open strings, we will also have closed strings propagating in the bulk of spacetime. The open and closed strings will interact with each other. It can be shown that this interaction reproduces the expected Hawking radiation of the threebranes. By taking a low energy limit of this theory, the interaction of the two sets of string modes again goes to zero. From the closed strings, we again recover long wavelength supergravity modes propagating in flat 10 dimensional Minkowski space. From the open strings we recover $\mathcal{N} = 4$ super Yang-Mills theory. Since this must match with the description of the low energy physics we obtained above, we are led to conjecture that IIB string theory in the $AdS_5 \times S^5$ geometry is somehow equivalent to $\mathcal{N} = 4$ super Yang-Mills theory with gauge group $U(N)$. This is the best studied example of the AdS/CFT conjecture.

It is not at all easy to understand what we mean by "somehow equivalent to". Indeed, a straightforward dictionary relating the two looks highly unlikely. The super Yang-Mills theory lives in 3 + 1 dimensional Minkowski space. The IIB string is in the $AdS_5 \times S^5$ background, so the field theory and the string theory live in spacetimes of different dimensionality. However, the boundary of $AdS_5 \times S^5$ is 3 + 1 dimensional Minkowski spacetime, so that our conjecture looks like a concrete realization of the holographic principle.

Let us just make a quick detour to discuss a few aspects of Anti-de Sitter space. One way to visualize a manifold is to embed it in flat space. For example, an n-sphere $S^n$ can be described using the equation

\[ (x^1)^2 + (x^2)^2 + \ldots + (x^{n+1})^2 = R^2, \]

in n + 1 dimensional Euclidean space. From this description, the $SO(n + 1)$ isometry of the n-sphere is manifest. For concreteness set $n = 2$. We can then choose

\[ x^1 = R\cos\theta \sin\phi, \qquad x^2 = R\sin\theta \sin\phi, \qquad x^3 = R\cos\phi, \qquad 0 \le \theta < 2\pi, \quad 0 \le \phi \le \pi. \]
Using the metric for Euclidean space, $ds^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2$, and our above parameterization of the $S^2$ in terms of $\theta$ and $\phi$, we easily obtain the metric on $S^2$ as

\[ ds^2 = R^2 \left( (d\phi)^2 + \sin^2\phi\, (d\theta)^2 \right). \]
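The pullback of the flat metric onto the sphere can be checked numerically. The following sketch assumes the parameterization quoted above (with $\theta$ the azimuthal angle and $\phi$ the polar angle) and approximates the tangent vectors by central finite differences; the function names and the step size are illustrative choices.

```python
import math


def embed(theta, phi, R=1.0):
    """Embedding of the 2-sphere of radius R in Euclidean 3-space."""
    return (R * math.cos(theta) * math.sin(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(phi))


def induced_metric(theta, phi, R=1.0, h=1e-6):
    """Induced metric components from central finite differences."""
    def tangent(index):
        plus, minus = [theta, phi], [theta, phi]
        plus[index] += h
        minus[index] -= h
        return [(p - m) / (2 * h)
                for p, m in zip(embed(*plus, R), embed(*minus, R))]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    e_theta, e_phi = tangent(0), tangent(1)
    return {"theta_theta": dot(e_theta, e_theta),
            "theta_phi": dot(e_theta, e_phi),
            "phi_phi": dot(e_phi, e_phi)}


g = induced_metric(theta=0.7, phi=1.1, R=2.0)
print(g)
```

Under these assumptions the components should agree, up to finite-difference error, with $g_{\phi\phi} = R^2$, $g_{\theta\theta} = R^2 \sin^2\phi$ and $g_{\theta\phi} = 0$.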
In an exactly analogous manner, $AdS_d$ space can be viewed as the surface

\[ (x^{-1})^2 + (x^0)^2 - (x^1)^2 - (x^2)^2 - \ldots - (x^{d-1})^2 = \Lambda^2, \]

in the d + 1 dimensional space with metric

\[ ds^2 = -(dx^{-1})^2 - (dx^0)^2 + (dx^1)^2 + (dx^2)^2 + \ldots + (dx^{d-1})^2. \]
From the above surface and metric, it is easy to see that $AdS_d$ space has an $SO(2, d - 1)$ isometry. I will now specialize to $d = 3$. In global coordinates we choose

\[ x^{-1} = \Lambda \cosh\mu \sin t, \qquad x^0 = \Lambda \cosh\mu \cos t, \qquad x^1 = \Lambda \sinh\mu \cos\theta, \qquad x^2 = \Lambda \sinh\mu \sin\theta. \]

Following exactly the procedure we followed above for the sphere, we obtain the following metric:

\[ ds^2 = \Lambda^2 \left( -\cosh^2\mu\, dt^2 + d\mu^2 + \sinh^2\mu\, d\theta^2 \right). \]

Another set of coordinates that is often used is the Poincaré coordinates

\[ x^{-1} = \frac{1}{2r} \left( \Lambda^2 + x^2 + r^2 - t^2 \right), \qquad x^0 = \frac{\Lambda t}{r}, \]
\[ x^1 = \frac{1}{2r} \left( -\Lambda^2 + x^2 + r^2 - t^2 \right), \qquad x^2 = \frac{\Lambda x}{r}. \]
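The Poincaré parameterization can be checked numerically in the same spirit as for the sphere. The sketch below assumes the embedding $x^{-1} = (\Lambda^2 + x^2 + r^2 - t^2)/2r$, $x^0 = \Lambda t/r$, $x^1 = (-\Lambda^2 + x^2 + r^2 - t^2)/2r$, $x^2 = \Lambda x/r$ for $AdS_3$ and the ambient signature $(-,-,+,+)$; it verifies the hyperboloid constraint and computes the induced metric by central finite differences. The names `lam`, `poincare_embedding` and `induced_metric` are illustrative.

```python
def poincare_embedding(t, x, r, lam=1.0):
    """Embedding coordinates (x^{-1}, x^0, x^1, x^2) of a point of AdS_3."""
    s = x * x + r * r - t * t
    return ((lam * lam + s) / (2 * r),
            lam * t / r,
            (-lam * lam + s) / (2 * r),
            lam * x / r)


# Signs of the flat ambient metric for (x^{-1}, x^0, x^1, x^2).
SIGNATURE = (-1.0, -1.0, 1.0, 1.0)


def constraint(X):
    """Left-hand side of the defining equation of the hyperboloid."""
    return X[0] ** 2 + X[1] ** 2 - X[2] ** 2 - X[3] ** 2


def induced_metric(t, x, r, lam=1.0, h=1e-5):
    """3x3 induced metric in the coordinate order (t, x, r)."""
    point = [t, x, r]
    tangents = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        Xp = poincare_embedding(*plus, lam)
        Xm = poincare_embedding(*minus, lam)
        tangents.append([(p - m) / (2 * h) for p, m in zip(Xp, Xm)])
    return [[sum(s * u * v
                 for s, u, v in zip(SIGNATURE, tangents[i], tangents[j]))
             for j in range(3)] for i in range(3)]


lam, t, x, r = 1.5, 0.3, -0.4, 0.8
X = poincare_embedding(t, x, r, lam)
metric = induced_metric(t, x, r, lam)
print(constraint(X))
for row in metric:
    print(row)
```

Under these assumptions the constraint should evaluate to $\Lambda^2$ and the induced metric should be diagonal, with entries $(-1, 1, 1)$ multiplied by $\Lambda^2/r^2$, up to finite-difference error.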
In Poincaré coordinates,

\[ ds^2 = \frac{\Lambda^2}{r^2} \left( -dt^2 + dx^2 + dr^2 \right). \]

Finally, I should mention that Anti-de Sitter spaces are solutions of the Einstein equations with the "wrong sign" for the cosmological constant.

Given the conjecture, perhaps the first question that we should address is how the parameters of the field theory and those of the string theory are related. At first, we have a puzzle. Both the field theory and the string theory have a parameter (usually called $\hbar$ or, equivalently, the coupling constant) which controls the size of quantum corrections. In addition, the string theory also has a parameter, the tension of the string, which controls a new type of "stringy" correction which arises because we do not have a point particle. There does not seem to be a second parameter in the field theory, so even at this most basic level, things are not clear. In fact, the answer to this puzzle was given by 't Hooft long before the AdS/CFT correspondence was discovered. 't Hooft showed that the Feynman diagrams of the field theory can be arranged in an expansion in two parameters: $1/N$ and the so-called 't Hooft coupling $\lambda = g_{YM}^2 N$. The precise matching between the parameters identifies $1/N$ with the closed string coupling constant and $\lambda$ with the radius of curvature of the AdS space. This second identification is worth commenting on. When $\lambda$ is small, we can trust perturbation theory in the field theory. However, in this limit, the corrections coming from the finite size of the string (curvature corrections) cannot be neglected, so that computations on the string side of the correspondence are not possible. When $\lambda$ is large, the string moves on a space with small curvatures and hence computations in the string theory are straightforward. On the other hand, the field theory is strongly coupled. 
This makes it difficult to test the conjecture — we are usually forced to concentrate on objects that are protected or nearly protected, so that they can be computed at weak coupling and then reliably be continued to strong coupling. Of course, apart from making the conjecture difficult to test, it makes it useful. We obtain new tools with which to handle strongly coupled field theories and strongly curved string backgrounds. We do not yet know how to derive the correspondence, so it remains with the status of a conjecture. However, many thousands of papers have now been written which check the predictions of the conjecture. None of these papers have yet proposed a test that the conjecture did not pass, so our confidence that it is correct grows. There are many ways in which the correspondence can be checked. I am only going to focus on one possibility
which we will use later in these lectures. The idea is to look at conserved charges in the field theory and match them to corresponding conserved charges in the string theory. To obtain the conserved charges in the field theory, we can study what symmetries act in the field theory; then, according to Noether's theorem, there is a conserved charge for each symmetry. The field theory is invariant under an $SO(2,4)$ conformal symmetry and an $SU(4)$ R-symmetry. We use the maximal compact subgroup of the conformal group (that is, $SO(2) \times SO(4)$) and the R-symmetry group quantum numbers to classify states in the field theory. If we use the fact that locally $SU(4)$ and $SO(6)$ are equal, we see that the isometries of the $AdS_5 \times S^5$ supergravity background exactly match the field theory symmetries. The isometries of the supergravity background can be used to label the supergravity states. Thus, we have a nice explicit map between the states of the supergravity and the states of the super Yang-Mills theory. (If there are degeneracies, this mapping is not unique and you need to do more work to define the dictionary.) In what follows we will be making sure that the $SO(2)$ part of the map works. On the field theory side, the $SO(2)$ quantum number is the conformal dimension of the operator. If you compute the two point correlation function of an operator with a definite conformal dimension $\Delta$, conformal invariance restricts its form to be

\[ \langle O(x) O(y) \rangle \propto \frac{1}{|x - y|^{2\Delta}}. \]

The number $\Delta$ is the $SO(2)$ quantum number. On the string theory side, the $SO(2)$ quantum number is the energy of the string state. Thus, in what follows we will try to see if the spectrum of conformal dimensions of operators in the field theory reproduces the energy spectrum of string theory states.

Suggested Reading: Soon after Maldacena's original paper suggesting the AdS/CFT correspondence
[1] J. M. Maldacena, The large N limit of superconformal field theories and supergravity, Adv. Theor. Math. Phys. 2, 231 (1998) [Int. J. Theor. Phys. 38, 1113 (1999)] [arXiv:hep-th/9711200],
two papers appeared which added significant insight
[2] S. S. Gubser, I. R. Klebanov and A. M. Polyakov, Gauge theory correlators from non-critical string theory, Phys. Lett. B 428, 105 (1998) [arXiv:hep-th/9802109];
[3] E. Witten, Anti-de Sitter space and holography, Adv. Theor. Math. Phys. 2, 253 (1998) [arXiv:hep-th/9802150].
For a very nice comprehensive review see
[4] O. Aharony, S. S. Gubser, J. M. Maldacena, H. Ooguri and Y. Oz, Large N field theories, string theory and gravity, Phys. Rept. 323, 183 (2000) [arXiv:hep-th/9905111].

4. Marginal Deformations of $\mathcal{N} = 4$ Super Yang-Mills Theory

What are we going to do in this section and why? We would love to learn how to apply the gauge theory/gravity duality to learn about field theories like QCD. To do this, we need to learn how to study field theories with less symmetry: no supersymmetry and no conformal symmetry. This is, at present, beyond what we are capable of. However, it is possible to construct theories with less supersymmetry that are still conformally invariant. These are the so-called marginal deformations of Yang-Mills theories. It turns out that these theories do have enough symmetries that we can guess the relevant gravitational duals. In this section we will study in detail some of these marginal deformations.

A conformal field theory has an exact scale invariance. This is a rather nontrivial property and difficult to guarantee in general. Indeed, quantum field theory is plagued with divergences; we typically introduce a cut off (or some other dimensionful regulator) to make sense of these infinities. Any dependence on this dimensionful scale will destroy scale invariance and hence conformal invariance. How is it possible to show that a theory has an exact conformal invariance? In general, it is not possible. However, for certain supersymmetric theories there are powerful non-renormalization theorems that can be used to derive exact results. I will sketch how they work, but you will have to do a lot more reading to get a complete understanding. The idea is rather simple and it can be illustrated in a simple setting. Imagine we want to perform the Gaussian integral

\[ I = \int dx\, dy\, (x^2 + y^2)^n e^{-a(x^2 + y^2)} = \int dz\, d\bar z\, (z \bar z)^n e^{-a z \bar z}. \]

I can assign "charges" to $z$ and $\bar z$. Let's say that $z$ has charge $(0,1)$ and $\bar z$ has charge $(1,0)$. Then $a$ must have charge $(-1,-1)$ to ensure that the exponent is uncharged. Further, the integral itself has charge $(n + 1, n + 1)$. Since the only parameter that the integral can depend on is $a$, we must have ($c$ is an $a$-independent number)

\[ I = \frac{c}{a^{n+1}}. \]
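The scaling predicted by the charge assignment is easy to test numerically. The sketch below is only an illustration: it evaluates the integral in polar coordinates with a simple midpoint rule (the cutoff radius and number of steps are arbitrary choices made here) and checks that $I \cdot a^{n+1}$ does not depend on $a$. Doing the integral exactly gives $c = \pi\, n!$, which the selection rule alone cannot determine.

```python
import math


def gaussian_moment(n, a, steps=20000):
    """Midpoint-rule estimate of the integral of (x^2 + y^2)^n exp(-a (x^2 + y^2))
    over the plane, written in polar coordinates as
    2 * pi * integral of r^(2n + 1) * exp(-a r^2) dr."""
    r_max = 12.0 / math.sqrt(a)   # the integrand is negligible beyond this radius
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += r ** (2 * n + 1) * math.exp(-a * r * r)
    return 2.0 * math.pi * h * total


n = 2
c_small = gaussian_moment(n, 0.5) * 0.5 ** (n + 1)
c_large = gaussian_moment(n, 2.0) * 2.0 ** (n + 1)
print(c_small, c_large, math.pi * math.factorial(n))
```

The two rescaled values should agree with each other and with $\pi\, n!$ to within the accuracy of the quadrature, confirming $I \propto a^{-(n+1)}$.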
It was consistent to assign charges as we did above; this is why we had to have $I \sim a^{-n-1}$.

Now for an example from quantum mechanics. Imagine that we perturb a rotationally invariant Hamiltonian by the perturbation $\delta H = a X^1$. If we wanted, we could imagine that our perturbation is

\[ \delta H = \vec a \cdot \vec X, \qquad \vec a = (a, 0, 0). \]
We will further claim that under a rotation, $\vec a$ transforms as a vector; this is certainly consistent. With this assignment, we learn that the perturbed energy levels cannot depend on $\vec a$ arbitrarily: we must respect the selection rules coming from rotational invariance. To obtain our supersymmetry non-renormalization theorems, we follow this idea again: assign charges in a way that is consistent with the structure of the problem. The selection rules following from these assignments are strong enough to provide a number of exact results.

The quantities that we can (sometimes) compute exactly are called effective actions. To introduce the notion of an effective action, recall that in quantum field theory, we are interested in computing correlation functions which, in a path integral approach, are expressed as

\[ \langle O_1 O_2 \cdots O_n \rangle = \int \mathcal{D}\phi_i\, O_1 O_2 \cdots O_n\, e^{iS}. \]

Our field theory has field content $\{\phi_i\}$. The operators $O_i$ are built out of the fields $\{\phi_i\}$. Expand each field in momentum modes,

\[ \phi(x) = \int \left( a(p) e^{-ipx} + a^*(p) e^{ipx} \right) dp. \]

Let us now break the field into two pieces, a "high" momentum piece ($\phi_>$) and a "low" momentum piece ($\phi_<$), separated by a scale $\mu$ (see footnote h):

\[ \phi(x) = \phi_<(x) + \phi_>(x), \]
\[ \phi_<(x) = \int_{p^2 < \mu^2} \left( a(p) e^{-ipx} + a^*(p) e^{ipx} \right) dp, \qquad \phi_>(x) = \int_{p^2 > \mu^2} \left( a(p) e^{-ipx} + a^*(p) e^{ipx} \right) dp. \]

[Footnote h: I am assuming we are working in Euclidean space. In Minkowski space, you would need to do something a bit more fancy than a simple cut off on the magnitude of the momentum.]

We will further imagine that the operators $O_i$ whose correlation functions we are interested in can be expressed completely in terms of $\phi_<$. Then, it is possible to eliminate the $\phi_>$ altogether:

\[ \langle O_1 O_2 \cdots O_n \rangle = \int \mathcal{D}\phi_i\, O_1 O_2 \cdots O_n\, e^{iS} = \int \mathcal{D}\phi_{i<}\, O_1 O_2 \cdots O_n \int \mathcal{D}\phi_{i>}\, e^{iS} = \int \mathcal{D}\phi_{i<}\, O_1 O_2 \cdots O_n\, e^{iS_{\rm eff}}. \qquad (4) \]
$S_{\rm eff}$ is called the "effective action". It governs the low energy dynamics of the theory. What I have done above is (at best) schematic and I did it only to give you an idea about what is going on. In general, the degrees of freedom that you use at low energy might be entirely different from the degrees of freedom that you started with. As an example, the action of QCD is written in terms of quarks and gluons, but the low energy effective action is best written in terms of pions. We use the low energy effective action to analyze four dimensional field theories by taking the limit $\mu\to0$, or equivalently, by just keeping the leading terms (up to two derivatives) in the low energy fields. The $\mu\to0$ low energy effective actions are called infrared effective actions. It is the infrared effective actions that we can sometimes determine exactly. In general, all we can say is that the low energy effective action is a local action describing a theory's degrees of freedom at some scale below a given energy scale $\mu$. Understanding the exact details of how the degrees of freedom are rearranged is usually difficult. We will restrict ourselves to situations in which the low energy degrees of freedom match the original degrees of freedom. In this case, when we discuss the procedure of "integrating out the high energy degrees of freedom" it makes sense to talk about how the couplings appearing in the action change. When we flow to a lower energy, the coefficients of terms in the Lagrangian, namely the couplings, will generally have different values after the flow. We characterize this change by the so-called $\beta$ functions $\mu\frac{\partial g_i}{\partial\mu} = \beta_i(g_k,\mu)$.
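To make the flow equation concrete, here is a minimal sketch that integrates $\mu\,dg/d\mu = \beta(g)$ for a single coupling from a scale $\mu_0$ down to a lower scale $\mu$. The linear $\beta$ functions used below are toy choices for illustration only; they are not taken from any theory discussed in these lectures, and the helper names `flow` and `classify` are ours.

```python
import math

def flow(g0, beta, mu0, mu, steps=1000):
    """Integrate mu dg/dmu = beta(g) from mu0 to mu with RK4 in t = log(mu)."""
    h = (math.log(mu) - math.log(mu0)) / steps
    g = g0
    for _ in range(steps):
        k1 = beta(g)
        k2 = beta(g + 0.5 * h * k1)
        k3 = beta(g + 0.5 * h * k2)
        k4 = beta(g + h * k3)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return g

def classify(g0, beta, mu0=1.0, mu=0.1):
    """Say whether the coupling grows, shrinks or is unchanged at lower energy."""
    g = flow(g0, beta, mu0, mu)
    if math.isclose(g, g0, rel_tol=1e-9):
        return "unchanged"
    return "grows" if abs(g) > abs(g0) else "shrinks"

print(classify(0.2, lambda g: -0.5 * g))  # toy beta < 0: coupling grows toward the infrared
print(classify(0.2, lambda g: +0.5 * g))  # toy beta > 0: coupling shrinks toward the infrared
print(classify(0.2, lambda g: 0.0))       # vanishing beta: coupling does not run
```

For the linear toy case the exact answer is $g(\mu) = g_0(\mu/\mu_0)^{c}$ with $\beta = c\,g$, which the numerical flow reproduces.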
You do not (in general) know what the $\beta$ functions are, but you can always compute them in perturbation theory using quantum field theory. If a theory has a scale invariance (we need this to get a conformally invariant theory), all of the $\beta$ functions must vanish. We say that we have a marginal deformation of a theory if we can add a term to the Lagrangian and its coefficient has a vanishing $\beta$ function. The case just discussed, in which couplings do not change when we integrate the $\phi_>$ modes out of the theory (i.e., their $\beta$ function vanishes), giving a marginal deformation, is not the only situation that can be imagined. Indeed, we could imagine a situation in which the coupling grows as $\phi_>$ is integrated out; we call this a relevant deformation. Finally, the situation in which the coupling gets smaller as $\phi_>$ is integrated out is called an irrelevant deformation. From now on, we will need some supersymmetry technology. See Jim Gates' lectures in this volume for more details. Here we will be schematic, since that is all we need. We work in $\mathcal{N}=1$ superspace. We will denote left chiral superfields by $\Phi$ and the gauge invariant superfield strength by $W_L$. The terms which will be most important at low energy are those with the fewest derivatives. The leading term, called the superpotential, is expressed as an integral over one half of superspace. For example, in a theory with only left chiral superfields, the term
$\int d^4x\,d^2\theta_L\,V(\Phi)$ would be a contribution to the superpotential. Notice that it depends only on $\Phi$ and not on $\Phi^*$; we say that the superpotential is holomorphic. This is an extremely important property and it is forced on us because $\Phi^*$ depends on $\theta_R$, so integrating it over one half of superspace does not lead to terms which can appear in the action. The holomorphic dependence we have just pointed out is the most general dependence allowed. We will generate the terms in the Lagrangian that we usually call potential terms from the superpotential. The most important correction to the leading term is the so-called Kähler potential. The Kähler potential term can be written as $\int d^4x\,d^2\theta_L\,d^2\theta_R\,K(\Phi,\Phi^*)$.
This is an integral over all of superspace so that now both $\Phi$ and $\Phi^*$ are allowed to appear. Thus, the Kähler potential is not holomorphic. The usual kinetic terms in the field theory would be generated by the Kähler potential. We would like to derive a powerful non-renormalization theorem for our supersymmetric field theory. To do this, we are going to look for the selection rules implied by symmetries that the theory has. Towards this end, we will now discuss what are called $\mathcal{R}$ symmetries. Although we will not show it, associativity of the super-Poincaré algebra allows a single Hermitian $U(1)$ generator $\mathcal{R}$ that does not commute with the supercharges of the theory. That means that under this $U(1)_{\mathcal R}$ symmetry, not all component fields in a multiplet transform in the same way. We do the bookkeeping for this by also transforming the superspace coordinates $\theta_L$, so that the superfield itself transforms like its first component. With these conventions, the superpotential has $\mathcal{R}$ charge equal to 2. Imagine we start off with a theory, valid below the energy scale $\mu_0$, that has a generalized superpotential
$f = \frac{\tau}{8\pi^2}\,\mathrm{Tr}(W_L^2) + f(\Phi^n,\lambda_r,\mu)$, $\tau = \frac{\theta}{2\pi} + \frac{4\pi i}{g^2(\mu)}$, $f(\Phi^n,\lambda_r,\mu) = \sum_r\mu^{3-d_r}\lambda_rO_r$.
What effective theory do we obtain if we run down to an energy $\mu$? I am now going to argue that the superpotential is not renormalized. To be concrete, we will study the Wess-Zumino model, which has superpotential $f_{\mu_0} = \frac12\mu_0\lambda_2\Phi^2 + \frac13\lambda_3\Phi^3$. By holomorphy, at a scale $\mu<\mu_0$, $f_\mu = f_\mu(\Phi,\lambda_2,\lambda_3;\mu,\mu_0)$. We have assumed that the effective theory is still described in terms of a single chiral superfield at the scale $\mu$. This is the strongest assumption that we make. Looking at the microscopic (scale $\mu_0$) superpotential, we see that it is consistent to assign the following charges to the fields and couplings.
              $\Phi$    $\lambda_2$    $\lambda_3$    $f_\mu$
$U(1)$           1         -2             -3            0
$U(1)_R$         1          0             -1            2
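The charge table lends itself to a mechanical check. The sketch below (helper names are ours) adds up the two charges of a monomial $\Phi^p\lambda_2^a\lambda_3^b$ and lists the monomials that carry the charges $(0,2)$ of the superpotential. The scale $\mu$ is neutral, so it is left out of the bookkeeping.

```python
from itertools import product

# (U(1), U(1)_R) charges read off from the table.
CHARGES = {"Phi": (1, 1), "lambda2": (-2, 0), "lambda3": (-3, -1)}
TARGET = (0, 2)  # charges of the superpotential

def charge(powers):
    """Total (U(1), U(1)_R) charge of Phi^p * lambda2^a * lambda3^b."""
    return tuple(
        sum(powers[name] * CHARGES[name][i] for name in CHARGES) for i in range(2)
    )

def allowed_monomials(max_power=4):
    """Triples (p, a, b): powers of Phi, lambda2, lambda3 with the target charge.

    Negative powers of lambda2 are scanned too, because the selection rule
    alone does not forbid them.
    """
    found = []
    for p, a, b in product(range(max_power + 1),
                           range(-max_power, max_power + 1),
                           range(max_power + 1)):
        if charge({"Phi": p, "lambda2": a, "lambda3": b}) == TARGET:
            found.append((p, a, b))
    return found

print(allowed_monomials())  # [(2, 1, 0), (3, 0, 1), (4, -1, 2)]
```

Every allowed monomial has the form $\lambda_2\Phi^2(\lambda_3\Phi/\lambda_2)^n$. Only $n=0$ and $n=1$ come without an inverse power of $\lambda_2$.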
With these assignments, the superpotential is invariant under $U(1)\times U(1)_{\mathcal R}$. Whatever the superpotential is at scale $\mu$, we know that it must also share this $U(1)\times U(1)_{\mathcal R}$ invariance. This implies that we must have $f_\mu = \mu\lambda_2\Phi^2\,g\!\left(\frac{\lambda_3\Phi}{\mu\lambda_2}\right)$. If we take the $\lambda_3\to0$ limit, we end up with a free theory and consequently only non-negative powers of $\lambda_3$ can appear if we expand $g(\cdot)$: $f_\mu = \mu\lambda_2\Phi^2\sum_{n\ge0}g_n\left(\frac{\lambda_3\Phi}{\mu\lambda_2}\right)^n$, where $g_n$ are allowed to depend on the ratio $\mu/\mu_0$. Now, take the $\lambda_2\to0$ limit. Terms with $n>1$ blow up, which is not sensible, so these terms are disallowed. Thus, we now have $f_\mu = g_0\mu\lambda_2\Phi^2 + g_1\lambda_3\Phi^3$. How do we determine $g_0$ and $g_1$? Let us start off by setting $\lambda_3=0$. Then we only need to determine $g_0$. Since we are working in a free theory, a choice of $g_0$ is equivalent to a choice of renormalization scheme. A particularly convenient choice will set $g_0 = \frac12$. To determine $g_1$ we can require that the perturbation theory generated using the original action must match the perturbation theory generated using the effective action. This sets $g_1 = \frac13$. Thus $f_\mu = \frac12\mu\lambda_2\Phi^2 + \frac13\lambda_3\Phi^3$. This shows that the superpotential is non-perturbatively unrenormalized! Let me make one comment. We assumed that we should not get divergences when $\lambda_2\to0$. Why? Surely when $\lambda_2\to0$ we have a massless particle and we can get divergences at very low energy from these massless particles. The point is that we didn't flow down to $\mu=0$, so we cannot hit these divergences! This is how the non-renormalization theorems work in general: you find a general set of selection rules which almost fixes the superpotential. The final details are then fixed by looking at regimes in which we know the exact answer (like the perturbative regime, as we did in our example above). You may be a bit puzzled. Where did we use supersymmetry? Surely for any field theory we can play the same game, so why are these
arguments so powerful? Well, you really do need supersymmetry to make things work so nicely: it is supersymmetry that forces the superpotential to be a holomorphic function of the superfields. Without holomorphy, we cannot prove the non-renormalization theorem. This also makes it clear that there is no non-renormalization theorem protecting the Kähler potential. Now, let us turn to see how the generalized superpotential is renormalized. This needs the following facts (which we will not prove): in general the chiral transformation $\psi_i\to e^{i\alpha Q_i}\psi_i$ is anomalous. Although it is a symmetry of the classical theory, there is just no way to make it a symmetry of the full quantum theory. Indeed, the classically conserved current is no longer conserved at the full quantum level: $\partial_\mu j^\mu = \frac{\sum_iQ_iT(r_i)}{16\pi^2}\,\mathrm{Tr}(F\tilde F)$.
Anomalies can be computed in perturbation theory; they are one loop exact, reflecting the fact that they can also be thought of as infrared effects. From this perspective the existence of anomalies depends only on the field content and charges of the light fermions in the theory and not on the details of the interactions. The effects of the anomaly are equivalent to assigning the $\theta$ angle transformation properties under global $U(1)$ symmetries, according to $\theta\to\theta + \alpha\sum_iQ_iT(r_i)$. Now we turn to the gauge kinetic terms in the generalized superpotential. Imagine that we start with a weakly coupled asymptotically free gauge theory. As we flow to lower energies, the (gauge) coupling will grow. So, we had better not flow to very low energies: we want to assume that our effective theory is still described in terms of the same degrees of freedom that we started with. With this assumption, we will have $f_\mu = \frac{\tau(\mu)}{8\pi^2}\,\mathrm{Tr}(W_L^2) + \ldots$, where the dots may include irrelevant terms. These irrelevant terms include operators with higher powers of $\mathrm{Tr}(W_L^2)$; they are irrelevant because $W_L$ has scaling dimension 1. This deserves a comment. We are performing a derivative expansion of the effective action. Denote the scalar component of our left scalar superfield by $\phi$. Our expansion is effectively treating $\phi$ as
a dimensionless variable; usually we assign $\phi$ dimensions of mass. This is done because we are interested in the scaling properties of the fluctuations of $\phi$ about a given vacuum, which are governed by the kinetic terms. However, in determining the vacuum itself (not fluctuations about the vacuum) it is the potential that is important and hence the constant part of $\phi$ (its vacuum expectation value) should be treated as a dimensionless constant. In particular, taking the scale of the low energy effective action to be an energy $\mu$ does not imply that only vacua with (
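$\tau(\mu)\to\tau(\mu)+1$. This must be true of the effective action at any scale $\mu$, since it just follows from the topological quantization of the integral of $\mathrm{Tr}(F\tilde F)$. At one loop, we can write the complex coupling as $\tau(\mu_0) = \frac{\theta}{2\pi} + \frac{4\pi i}{g^2(\mu_0)} = \frac{1}{2\pi i}\log\left(\frac{\Lambda}{\mu_0}\right)^b$,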
where we have defined $\theta$ to be the phase of $\Lambda^b$ at scale $\mu_0$. Thus, as we rotate the phase of $\Lambda^b$, $\Lambda^b\to e^{2\pi i}\Lambda^b$, we have $\tau(\mu_0)\to\tau(\mu_0)+1$. Under the flow from $\mu_0$ to a slightly lower scale, $\tau(\mu)$ changes continuously with $\mu$. It follows that for any $\mu$, when $\Lambda^b\to e^{2\pi i}\Lambda^b$, we must have $\tau(\mu)\to\tau(\mu)+1$. This implies that $\tau(\Lambda,\Phi,\lambda)$ is NOT a general holomorphic function: as we rotate the phase $\theta$ of $\Lambda^b$ by $2\pi$ we must have $\tau\to\tau+1$, implying that $\tau(\mu) = \frac{b}{2\pi i}\log\left(\frac{\Lambda}{\mu}\right) + h(\Lambda^b,\Phi^n,\lambda_r;\mu)$, with $h(\cdot)$ a single valued function of its arguments. Since we deal with an asymptotically free theory, $\Lambda\to0$ corresponds to a weak coupling limit in which the effective couplings should not diverge. Thus $\tau(\mu) = \frac{b}{2\pi i}\log\left(\frac{\Lambda}{\mu}\right) + \sum_{j=0}^{\infty}\Lambda^{bj}h_j(\Phi^n,\lambda_r;\mu)$,
i.e., inverse powers of $\Lambda$ do not appear. These extra terms are nonperturbative corrections coming from instantons. In the presence of the gauge fields, the superpotential itself is also allowed to receive nonperturbative corrections so that $f(\Phi^n,\lambda_r,\Lambda;\mu) = \sum_{j=0}^{\infty}\Lambda^{bj}f_j(\Phi^n,\lambda_r;\mu)$.
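The shift $\tau\to\tau+1$ under a full rotation of the phase of $\Lambda^b$ can be checked numerically for the one loop form of the complex coupling. In this sketch we write $\Lambda^b/\mu^b = e^{\ell+i\theta}$ and keep the phase $\theta$ as a continuous variable, so that no branch of the logarithm has to be chosen. The variable names are ours.

```python
import cmath
import math

def tau(theta, log_ratio):
    """One loop tau = (1/(2 pi i)) log(Lambda^b / mu^b).

    Lambda^b / mu^b is parameterized as exp(log_ratio + i * theta).
    """
    return (log_ratio + 1j * theta) / (2j * math.pi)

def instanton_factor(theta, log_ratio):
    """exp(2 pi i tau), which equals Lambda^b / mu^b and is single valued."""
    return cmath.exp(2j * math.pi * tau(theta, log_ratio))

theta, log_ratio = 0.3, -5.0  # log_ratio < 0 means |Lambda| < mu (weak coupling)
print(tau(theta + 2 * math.pi, log_ratio) - tau(theta, log_ratio))  # close to 1
print(tau(theta, log_ratio).imag)                                   # 4 pi / g^2 > 0
```

The combination $e^{2\pi i\tau}$ is unchanged by the rotation, which is why only non-negative powers of $\Lambda^b$ are natural building blocks for the single valued remainder.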
Now, to discuss $\beta$ functions we need to consider canonically normalized fields. Our field (before integrating out the high momentum modes) will create a particle with probability 1. After integrating out the high momentum modes it creates a particle with a probability which is less than 1. Consequently, we need to perform a wave function renormalization $\Phi^n_c = \sqrt{Z_n(\mu)}\,\Phi^n$.
The factor $\sqrt{Z_n(\mu)}$ is fixed by the requirement that the canonically normalized field ($\Phi^n_c$) creates a particle with probability 1. This is equivalent to the requirement that the propagator has a residue of 1, which itself is equivalent to the requirement that the kinetic terms have a coefficient of 1. The couplings are scaled as
It is the canonical couplings $\lambda^c_r$ that are the honest to goodness physical couplings that would be measured in an experiment. It is now easy to obtain the $\beta$ function
where we have introduced the anomalous dimension of the field $\Phi^n$ defined by
$\gamma_n = \frac{\partial\log Z_n}{\partial\log\mu}$.
This rescaling has an additional effect for gauge theories: the wave function renormalization rescaling can be viewed as a chiral rotation $\Phi^n\to e^{i\alpha_n}\Phi^n = \Phi^n_c$, $W_L\to e^{i\alpha_0}W_L = W_L^c$, $i\alpha_n = \frac12\log Z_n$, $i\alpha_0 = \frac12\log\frac{1}{g_c^2}$.
Now, since chiral rotations are anomalous we find that $\theta\to\theta + T(\mathrm{adj})\alpha_0 + \sum_nT(r_n)\alpha_n$. From this we are able to read off $\frac{1}{g_c^2} = \frac{1}{g^2} + \frac{1}{16\pi^2}\left(T(\mathrm{adj})\log\frac{1}{g_c^2} + \sum_nT(r_n)\log Z_n\right)$
and thus Pg
d,i
W-\T{adj)gln
flcgn
Although the formulas for $\beta_g$ and $\beta_{\lambda_r}$ given above are exact, we do not know the $\gamma_n$, which will be complicated functions of the parameters of the theory. These are the results that we need to discover marginal deformations of $\mathcal{N}=4$ super Yang-Mills theory. Notice that $\beta_{\lambda_r}\propto 3 - d_r - \frac12\sum_nr_n\gamma_n$.
The idea is now to look for situations in which the conditions for the vanishing of these $\beta$ functions are not all linearly independent. We can then deduce the existence of a submanifold of the $g$-$\lambda_r$ coupling space for which the $\beta$ functions vanish. Changing the couplings in such a way that we stay on this manifold corresponds to switching on a marginal operator. Consider a theory with three adjoint left chiral superfields $\Phi_i$ and the superpotential $f = a\,\mathrm{Tr}(\Phi_1\Phi_2\Phi_3) + b\,\mathrm{Tr}(\Phi_1\Phi_3\Phi_2) + c\,\mathrm{Tr}(\Phi_1^3+\Phi_2^3+\Phi_3^3)$. This theory has a $Z_3$ symmetry that acts as $\Phi_1\to\Phi_2$, $\Phi_2\to\Phi_3$ and $\Phi_3\to\Phi_1$, which ensures that $\gamma_1=\gamma_2=\gamma_3=\gamma$. As a result, $\beta_a\propto\beta_b\propto\beta_c\propto\beta_g\propto\gamma$, implying that there is a 3 complex dimensional space of fixed point theories which pass through weak coupling. For $g=a=-b$ and $c=0$ we obtain the $\mathcal{N}=4$ super Yang-Mills theory. The marginal deformations we are interested in in these lectures are included in this 3 complex dimensional space: we will be studying the $\beta$-deformed $\mathcal{N}=4$ super Yang-Mills theory which has superpotential $f = e^{i\pi\gamma}\,\mathrm{Tr}(\Phi_1\Phi_2\Phi_3) + e^{-i\pi\gamma}\,\mathrm{Tr}(\Phi_1\Phi_3\Phi_2)$.
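The counting behind the "3 complex dimensional space" can be sketched as a rank computation: the dimension of the fixed manifold is the number of couplings minus the number of independent conditions. In the sketch below each row holds the coefficient of the common anomalous dimension $\gamma$ in one $\beta$ function. The numerical values are placeholders chosen only to be nonzero (the text fixes them only up to proportionality), and the helper names are ours.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a small matrix via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

# One unknown (gamma) after imposing the Z_3 symmetry; one row per coupling.
# Illustrative coefficients: beta_a, beta_b, beta_c ~ -(3/2) gamma, beta_g ~ gamma.
conditions = {
    "a": [Fraction(-3, 2)],
    "b": [Fraction(-3, 2)],
    "c": [Fraction(-3, 2)],
    "g": [Fraction(1)],
}

independent = rank(list(conditions.values()))
dimension = len(conditions) - independent
print(independent, dimension)  # 1 independent condition, 3 dimensional manifold
```

Four couplings subject to a single independent condition, $\gamma(g,a,b,c)=0$, leave a three dimensional space of solutions.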
With this deformation the theory only has $\mathcal{N}=1$ supersymmetry, which is part of the reason we are so interested in it.

Suggested Reading: The marginal deformations discussed here were discovered by Leigh and Strassler. See their paper [1] R. G. Leigh and M. J. Strassler, Exactly marginal operators and duality in four-dimensional N = 1 supersymmetric gauge theory, Nucl. Phys. B 447, 95 (1995) [arXiv:hep-th/9503121]. You might also find the following lecture notes helpful: [2] M. J. Strassler, An unorthodox introduction to supersymmetric gauge theory, arXiv:hep-th/0309149. For more background on exact results in Yang-Mills theories following as a consequence of supersymmetry, see [3] N. Seiberg, The Power of holomorphy: Exact results in 4-D SUSY field theories, arXiv:hep-th/9408013; [4] K. A. Intriligator and N. Seiberg, Lectures on supersymmetric gauge theories and electric-magnetic duality, Nucl. Phys. B (Proc. Suppl.) 45B,C, 1 (1996) [arXiv:hep-th/9509066].

5. Noncommutative Field Theories and NS-NS B Fields

What are we going to do in this section and why? We would like to determine the gravitational backgrounds that are dual to the $\beta$ deformations of $\mathcal{N}=4$ super Yang-Mills theories. In this section we will show that the $\beta$ deformations look like the deformations that we use to get noncommutative field theories. Further, we will show that these noncommutative field theories arise naturally as the low energy limits of strings in NS-NS B field backgrounds. In this section, we will consider the dynamics of open strings in a flat spacetime with a metric $g_{ij}$ in the presence of a constant NS-NS $B_{ij}$ field. Further, we will assume that $B_{ij}$ as a matrix has even rank $r$ and $r<(p+1)$. Assuming that the worldsheet $\Sigma$ has Euclidean signature, the sigma model describing open string dynamics is given by $S = \frac{1}{4\pi\alpha'}\int_\Sigma\left(g_{ij}\partial_ax^i\partial^ax^j - 2\pi i\alpha'B_{ij}\epsilon^{ab}\partial_ax^i\partial_bx^j\right) = \frac{1}{4\pi\alpha'}\int_\Sigma g_{ij}\partial_ax^i\partial^ax^j - \frac{i}{2}\int_{\partial\Sigma}B_{ij}x^i\partial_tx^j$.
In this action $\partial_t$ is a tangential derivative along the worldsheet boundary $\partial\Sigma$. The equations of motion determine the boundary conditions. Recall that the endpoints of these open strings are attached to a D$p$-brane. For $i$ along the D$p$-branes the boundary conditions are $g_{ij}\partial_nx^j + 2\pi i\alpha'B_{ij}\partial_tx^j\big|_{\partial\Sigma} = 0$, where $\partial_n$ is a normal derivative to $\partial\Sigma$. For $B=0$, the boundary conditions are Neumann boundary conditions. When $B$ has rank $r=p$ and $B\to\infty$ the boundary conditions become Dirichlet. We are interested in the classical open string theory for which $\Sigma$ is a disc. The disc can be conformally mapped to the upper half plane. In this description, the boundary conditions are $g_{ij}(\partial-\bar\partial)x^j + 2\pi\alpha'B_{ij}(\partial+\bar\partial)x^j\big|_{z=\bar z} = 0$.
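In this formula $\partial = \partial/\partial z$, $\bar\partial = \partial/\partial\bar z$, and $\mathrm{Im}\,z>0$. The propagator with these boundary conditions is $\langle x^i(z)x^j(z')\rangle = -\alpha'\left[g^{ij}\log|z-z'| - g^{ij}\log|z-\bar z'| + G^{ij}\log|z-\bar z'|^2 + \frac{1}{2\pi\alpha'}\theta^{ij}\log\frac{z-\bar z'}{\bar z-z'} + D^{ij}\right]$.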
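In the above equation $G^{ij} = \left(\frac{1}{g+2\pi\alpha'B}\right)^{ij}_S = \left(\frac{1}{g+2\pi\alpha'B}\,g\,\frac{1}{g-2\pi\alpha'B}\right)^{ij}$, $G_{ij} = g_{ij} - (2\pi\alpha')^2\left(Bg^{-1}B\right)_{ij}$, $\theta^{ij} = 2\pi\alpha'\left(\frac{1}{g+2\pi\alpha'B}\right)^{ij}_A = -(2\pi\alpha')^2\left(\frac{1}{g+2\pi\alpha'B}\,B\,\frac{1}{g-2\pi\alpha'B}\right)^{ij}$.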
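These relations are easy to check numerically. The sketch below works with $2\times2$ matrices, takes $G^{ij}$ and $\theta^{ij}$ to be the symmetric and antisymmetric parts of $(g+2\pi\alpha'B)^{-1}$ (the latter multiplied by $2\pi\alpha'$), and verifies that $G_{ij}$ is the matrix inverse of $G^{ij}$. The sample values of $g$, $B$ and $\alpha'$ are arbitrary choices for illustration, and the helper names are ours.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def open_string_data(g, B, alpha_prime):
    """Return (G^{ij}, G_{ij}, theta^{ij}) built from the closed string g and B."""
    k = 2 * math.pi * alpha_prime
    m = inv2([[g[i][j] + k * B[i][j] for j in range(2)] for i in range(2)])
    G_up = [[0.5 * (m[i][j] + m[j][i]) for j in range(2)] for i in range(2)]
    theta = [[0.5 * k * (m[i][j] - m[j][i]) for j in range(2)] for i in range(2)]
    BgB = matmul(matmul(B, inv2(g)), B)
    G_down = [[g[i][j] - k * k * BgB[i][j] for j in range(2)] for i in range(2)]
    return G_up, G_down, theta

g = [[2.0, 0.3], [0.3, 1.0]]
B = [[0.0, 0.7], [-0.7, 0.0]]
G_up, G_down, theta = open_string_data(g, B, alpha_prime=0.5)
print(matmul(G_up, G_down))  # close to the identity matrix
print(theta)                 # antisymmetric
```

For $g=\mathbb{1}$, $2\pi\alpha'=1$ and $B_{12}=b$ one finds $G^{ij}=\delta^{ij}/(1+b^2)$ and $\theta^{12}=-b/(1+b^2)$, so a large $B$ field gives a small open string metric $G^{ij}$ and $\theta\sim-1/b$.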
$(\;)_S$ and $(\;)_A$ denote the symmetric and antisymmetric part of the matrix. The constants $D^{ij}$ can depend on $B$ but are independent of $z$ and $z'$; they play no essential role and are usually set to a convenient value. The first three terms in the propagator are manifestly single-valued. The fourth term is single-valued if the branch cut of the logarithm is in the lower half plane. When two open strings interact they do so by "joining" (or splitting) at their ends, which lie at the boundary of $\Sigma$. The relevant propagator is thus obtained by restricting our general propagator to the boundary, i.e., to real $z$ and $z'$, which we denote $\tau$ and $\tau'$. Evaluated at boundary points, the propagator is $\langle x^i(\tau)x^j(\tau')\rangle = -\alpha'G^{ij}\log(\tau-\tau')^2 + \frac{i}{2}\theta^{ij}\epsilon(\tau-\tau')$, where we have set $D^{ij}$ to a convenient value. $\epsilon(\tau)$ is the function that is $1$ for positive $\tau$ and $-1$ for negative $\tau$. The coefficient $\theta^{ij}$ in the propagator has a simple intuitive interpretation. In conformal field theory, one can compute commutators of operators from the short distance behavior of operator products by interpreting time ordering as operator ordering. Interpreting $\tau$ as time, $[x^i(\tau),x^j(\tau)] = T\left(x^i(\tau)x^j(\tau^-) - x^i(\tau)x^j(\tau^+)\right) = i\theta^{ij}$.
This last formula implies that the $x^i$ are coordinates on a noncommutative space with noncommutativity parameter $\theta$. To study the field theory that emerges in the low energy limit of the open string theory in a background magnetic field, we should take $\alpha'\to0$. In this case, the propagator becomes
$\langle x^i(\tau)x^j(\tau')\rangle = \frac{i}{2}\theta^{ij}\epsilon(\tau-\tau')$. With this propagator, normal ordered operators satisfy $:e^{ip_ix^i(\tau)}:\;:e^{iq_ix^i(0)}:\; = e^{-\frac{i}{2}\theta^{ij}p_iq_j\epsilon(\tau)}:e^{ipx(\tau)+iqx(0)}:$. More generally $:f(x(\tau)):\;:g(x(0)):\; =\; :e^{\frac{i}{2}\epsilon(\tau)\theta^{ij}\frac{\partial}{\partial x^i(\tau)}\frac{\partial}{\partial x^j(0)}}f(x(\tau))g(x(0)):$, and $\lim_{\tau\to0^+}:f(x(\tau)):\;:g(x(0)):\; =\; :f(x(0))*g(x(0)):$, where $f(x)*g(x) = e^{\frac{i}{2}\theta^{ij}\frac{\partial}{\partial\xi^i}\frac{\partial}{\partial\zeta^j}}f(x+\xi)g(x+\zeta)\big|_{\xi=\zeta=0}$ is the product of functions on a noncommutative space. This is the result we were after: the effect of the magnetic field is to replace the usual product between functions by the above product $*$.

Suggested Reading: The propagator used in the text was first derived in [1] E. S. Fradkin and A. A. Tseytlin, Nonlinear Electrodynamics From Quantized Strings, Phys. Lett. B 163, 123 (1985); [2] A. Abouelsaood, C. G. Callan, C. R. Nappi and S. A. Yost, Open Strings In Background Gauge Fields, Nucl. Phys. B 280, 599 (1987). An excellent general reference for strings in background B fields and the relation to noncommutative geometry is [3] N. Seiberg and E. Witten, String theory and noncommutative geometry, JHEP 9909, 032 (1999) [arXiv:hep-th/9908142].
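The $*$ product of Section 5 can be made concrete for polynomials in two noncommuting coordinates, for which the exponential of derivatives terminates. The sketch below stores a polynomial in $(x,y)$ as a dictionary from exponent pairs to coefficients and takes $\theta^{xy}=-\theta^{yx}=\theta$. This representation and the helper names are our own choices for illustration.

```python
from math import comb, factorial

def deriv(poly, nx, ny):
    """Apply d^nx/dx^nx d^ny/dy^ny to a polynomial {(i, j): coeff} in x, y."""
    out = {}
    for (i, j), c in poly.items():
        if i >= nx and j >= ny:
            f = (factorial(i) // factorial(i - nx)) * (factorial(j) // factorial(j - ny))
            key = (i - nx, j - ny)
            out[key] = out.get(key, 0) + c * f
    return out

def mul(p, q):
    """Ordinary commutative product of two polynomials."""
    out = {}
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return out

def star(f, g, theta):
    """f * g = exp((i/2) theta (d_x (x) d_y - d_y (x) d_x)) f g, exact for polynomials."""
    order = max((i + j for (i, j) in f), default=0)  # higher derivatives annihilate f
    result = {}
    for k in range(order + 1):
        pref = (0.5j * theta) ** k / factorial(k)
        for m in range(k + 1):
            # binomial expansion of (d_x (x) d_y - d_y (x) d_x)^k
            sign = (-1) ** (k - m)
            term = mul(deriv(f, m, k - m), deriv(g, k - m, m))
            for key, c in term.items():
                result[key] = result.get(key, 0) + pref * comb(k, m) * sign * c
    return result

def commutator(f, g, theta):
    """f * g - g * f, with vanishing coefficients dropped."""
    fg, gf = star(f, g, theta), star(g, f, theta)
    diff = {key: fg.get(key, 0) - gf.get(key, 0) for key in set(fg) | set(gf)}
    return {key: c for key, c in diff.items() if abs(c) > 1e-12}

x, y = {(1, 0): 1}, {(0, 1): 1}
print(commutator(x, y, theta=0.5))  # {(0, 0): 0.5j}, i.e. [x, y]_* = i theta
```

The output reproduces the commutator $[x^i,x^j]=i\theta^{ij}$ read off from the boundary propagator.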
6. Lunin-Maldacena Backgrounds

What are we going to do in this section and why? We have some strong hints as to what the gravitational backgrounds dual to $\beta$-deformed $\mathcal{N}=4$ super Yang-Mills should be. These backgrounds should have a $U(1)\times U(1)$ isometry (besides the $U(1)_{\mathcal R}$ $\mathcal{R}$-symmetry) and should have non-zero NS-NS B fields turned on. In this section we will review some solution generating transformations introduced by Lunin and Maldacena that will allow us to construct the relevant backgrounds. One of the deep insights we have gained about string theory in recent years is that the different string theories we see are in fact all local descriptions of a single theory. In complex analysis you are used to the idea of a complex analytic function which need only be defined by partial representations valid for particular domains in the complex plane. It is all of these partial representations that, when taken together, define what we mean by the analytic function. For quantum gravity, the different string theories we know of are partial descriptions of the dynamics. The appropriate representation of the dynamics that should be used is determined by the values of certain parameters (for example, the radius of a compact dimension would be such a parameter). These different partial representations can be quite different from each other; indeed, they need not even have the same spacetime dimension! Apart from the ten dimensional superstring theories that were constructed in the 1980s, it has become apparent that it is also sometimes appropriate to understand the dynamics as the eleven dimensional dynamics of a very poorly understood theory called M-theory. At low energy M-theory is described by eleven dimensional supergravity. By compactifying eleven dimensional supergravity it is possible to make contact with the type IIA and IIB supergravities that arise as the low energy limits of the IIA and IIB superstring theories.
Momenta (from the eleven dimensional supergravity point of view) along the compactified dimensions are then reflected as conserved charges in the lower dimensional theory (that is, in the IIA and IIB supergravities). Something like a boost or a rotation, which is a symmetry of the eleven dimensional theory, will have a nontrivial effect on the momenta of the compactified dimensions. From the point of view of the lower dimensional theory, this nontrivial action on the momenta is a nontrivial action on the conserved charges labeling the solution, i.e., these transformations generate new solutions out of old ones. For this reason they are called solution generating transformations. Lunin and Maldacena have used precisely this method to derive new
supergravity solutions. They consider solutions of the 10 dimensional type IIB supergravity which are compactified on a two torus. These can be related to solutions of eleven dimensional supergravity compactified on a three torus. The eleven dimensional theory compactified on the three torus has an $SL(3,R)$ symmetry, which becomes a solution generating transformation for the IIB supergravity as discussed above. Using this solution generating transformation, they obtain their supergravity solution by transforming the original AdS$_5\times S^5$ solution.$^1$ The metric of the new solution is given by $ds^2 = R^2\left[ds^2_{AdS_5} + \sum_i\left(d\mu_i^2 + G\mu_i^2d\phi_i^2\right) + \hat\gamma^2G\mu_1^2\mu_2^2\mu_3^2\left(\sum_id\phi_i\right)^2\right]$.
In this metric, $ds^2_{AdS_5}$ is the metric of five dimensional Anti-de Sitter space. The equation defining a five sphere of unit radius, $(x^1)^2+(x^2)^2+(x^3)^2+(x^4)^2+(x^5)^2+(x^6)^2=1$, can be parameterized by trading pairs of coordinates for an angle and a radius. Thus, we trade $(x^1,x^2)$ for $\mu_1$ and $\phi_1$; $(x^3,x^4)$ for $\mu_2$ and $\phi_2$; and $(x^5,x^6)$ for $\mu_3$ and $\phi_3$. The equation defining the five sphere now becomes $\mu_1^2+\mu_2^2+\mu_3^2=1$.
This defines the $\mu_i$ and $\phi_i$. The function $G$ is given by $G^{-1} = 1+\hat\gamma^2\left(\mu_1^2\mu_2^2+\mu_2^2\mu_3^2+\mu_1^2\mu_3^2\right)$, where $\hat\gamma = R^2\gamma$, $R^4 = 4\pi e^{\phi_0}N$.
The parameter 7 is related to the five form flux (see below).
There is a B field that fundamental strings would couple to
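$B^{NS} = \hat\gamma R^2G\left(\mu_1^2\mu_2^2\,d\phi_1d\phi_2 + \mu_2^2\mu_3^2\,d\phi_2d\phi_3 + \mu_3^2\mu_1^2\,d\phi_3d\phi_1\right)$, with $d\omega_1 = \cos\alpha\,\sin^3\alpha\,\sin\theta\,\cos\theta\,d\alpha\,d\theta$.

$^1$The $S^5$ contains a two torus so the solution generating transformation can be applied.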
The five form flux that a D3 brane would couple to is given by $F_5 = (16\pi N)\left(\omega_{AdS_5} + G\,\omega_{S^5}\right)$, $\omega_{S^5} = d\omega_1\,d\phi_1d\phi_2d\phi_3$, $\omega_{AdS_5} = d\omega_4$, where $\omega_{S^5}$ is the volume element of a unit radius $S^5$. It is this last equation that defines what we mean by $N$. This gravitational background was set up by D3-branes; we can see that because it is the D3 branes that would act as a source of the five form flux. In the spirit of Sec. 3, we expect that the full closed string theory in the above background should be dual to the field theory obtained by taking a low energy limit of the open strings attached to the D3 branes. What is this theory? We have a few clues:
(i) The supersymmetry of the background can be analysed. It turns out (we will not prove this) that the field theory should have $\mathcal{N}=1$ supersymmetry.
(ii) The presence of the AdS$_5$ factor tells us that the field theory must have an $SO(2,4)$ symmetry to match the $SO(2,4)$ isometry of the AdS$_5$ space. This tells us that we must have a conformal field theory. Taken together with point (i), we learn that we should have an $\mathcal{N}=1$ superconformal field theory. We studied theories of this type in Sec. 4.
(iii) The presence of the B field suggests, given the results of the previous section, that the field theory should be a noncommutative theory. Looking at the supergravity solution we gave above, we see that the B fluxes are transverse to the three brane. When the fluxes are parallel to the three brane, we would replace the usual product in the field theory by $f(x)*g(x) = e^{\frac{i}{2}\theta^{ij}\frac{\partial}{\partial\xi^i}\frac{\partial}{\partial\zeta^j}}f(x+\xi)g(x+\zeta)\big|_{\xi=\zeta=0}$. The derivatives $\frac{\partial}{\partial\xi^i}$ are parallel to the brane and hence are momenta of the field theory. In the present case, since momenta transverse to the brane are replaced by charges, it is natural to conjecture that the ordinary product in the field theory should be replaced by
$f*g \equiv e^{i\pi\gamma\left(Q^1_fQ^2_g - Q^2_fQ^1_g\right)}fg$, where $fg$ is an ordinary product and $(Q^1,Q^2)$ are the $U(1)\times U(1)$ charges of the fields. A consequence of this last point is that the superpotential will pick up some extra phases, giving precisely the $\beta$ deformed Yang-Mills theory that we wrote down in Sec. 4. We can now state the conjecture that we will be testing in the remainder of these lectures: $\beta$-deformed $\mathcal{N}=4$ super Yang-Mills theory is dual to closed string theory on the Lunin-Maldacena background.
Suggested Reading: The best source for this section is the original paper of Lunin and Maldacena [1] O. Lunin and J. Maldacena, Deforming field theories with U(1) x U(1) global symmetry and their gravity duals, JHEP 0505, 033 (2005) [arXiv:hep-th/0502086].

7. PP-wave limit

What are we going to do in this section and why? We have a proposed gauge theory/gravity correspondence, between $\beta$ deformed $\mathcal{N}=4$ super Yang-Mills theories and the Lunin-Maldacena backgrounds. In this section we will check that the gauge theory reproduces the closed string spectrum of the string theory, for a large class of closed string states. This is our first check of the proposed duality. One of the simplest things we could check is if the mass spectrum of strings on the Lunin-Maldacena background matches the spectrum of anomalous dimensions of operators in the dual field theory. We argued that this should be the case for the AdS$_5\times S^5$ background at the end of Sec. 3. Our argument only made use of the isometries of the AdS space and the conformal symmetry of the field theory, so it is also applicable to the situation we study here. However, a direct comparison is not possible, simply because we do not yet know how to quantize a free string on the Lunin-Maldacena background. We do however know how to quantize the string in the pp-wave or Penrose limit of the Lunin-Maldacena background. We will perform this quantization in this section and then show that the masses that one obtains do match with anomalous dimensions of operators in the dual gauge theory. This will provide strong support for Lunin and Maldacena's proposal. The metric is $ds^2 = R^2\left[-dt^2\cosh^2\rho + d\rho^2 + \sinh^2\rho\,d\Omega_3^2 + \sum_i\left(d\mu_i^2 + G\mu_i^2d\phi_i^2\right) + \hat\gamma^2G\mu_1^2\mu_2^2\mu_3^2\left(\sum_id\phi_i\right)^2\right]$, where $\mu_1 = \cos\alpha$, $\mu_2 = \sin\alpha\cos\theta$, $\mu_3 = \sin\alpha\sin\theta$, $G^{-1} = 1+\hat\gamma^2\left(\mu_1^2\mu_2^2+\mu_2^2\mu_3^2+\mu_1^2\mu_3^2\right)$. (5)
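As a quick consistency check of this parameterization, the sketch below builds the embedding coordinates $x^{2i-1}+ix^{2i} = \mu_ie^{i\phi_i}$, confirms that they lie on the unit five sphere, and evaluates $G$. The symmetric point $\mu_1^2=\mu_2^2=\mu_3^2=\frac13$ is used as a sample point, and the helper names are ours.

```python
import math

def mu(alpha, theta):
    """(mu_1, mu_2, mu_3) = (cos a, sin a cos t, sin a sin t)."""
    return (math.cos(alpha),
            math.sin(alpha) * math.cos(theta),
            math.sin(alpha) * math.sin(theta))

def embedding(alpha, theta, phis):
    """Six embedding coordinates with x^{2i-1} + i x^{2i} = mu_i exp(i phi_i)."""
    xs = []
    for m, phi in zip(mu(alpha, theta), phis):
        xs.extend([m * math.cos(phi), m * math.sin(phi)])
    return xs

def G(alpha, theta, gamma_hat):
    """G = 1 / (1 + gamma_hat^2 (mu1^2 mu2^2 + mu2^2 mu3^2 + mu1^2 mu3^2))."""
    m1, m2, m3 = (m * m for m in mu(alpha, theta))
    return 1.0 / (1.0 + gamma_hat**2 * (m1 * m2 + m2 * m3 + m1 * m3))

xs = embedding(0.4, 1.1, (0.2, 2.5, -1.0))
print(sum(x * x for x in xs))  # close to 1: the point lies on the unit S^5

alpha0, theta0 = math.acos(1 / math.sqrt(3)), math.pi / 4  # mu_i^2 = 1/3
gamma_hat = 0.8
print(G(alpha0, theta0, gamma_hat), 3 / (3 + gamma_hat**2))  # the two agree
```

At the symmetric point $G = 3/(3+\hat\gamma^2)$, and $G=1$ when the deformation is switched off.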
It is useful to use the angles $\psi$, $\varphi_1$ and
0 2 = ll> +
fo=1p-
The parameter 7 is the deformation parameter. The Penrose limit of a spacetime is essentially the limit of the spacetime that would be seen by a massless particle. Consequently, the limit is defined using a specific null geodesic. We will perform the Penrose limit using the null geodesic r = ip, with a0 = c o s - 1 4= and 60 = f • To take the limit set Fix1
•K
U/
x2
Jb
_|_
*'"iih{x"+r")-
r
.
J->
1
J-i
^ttof4'
and take R —> 00. The pp-wave metric we obtain is ds2 = — Adx+dx
+ 2 r2 + 3 ^ ( ^ ) 2 + (*2)2) (dx ) +
dr2+r2dn23
+ (dx1)2 + (dx2)2 + (dx3)2 + (dx4)2 +
^ (xldx3 + x2dx4) dx+. 2 V3 + 7 For the string sigma model, we will also need the B field in this limit. We find B^^Vipi
A T>tp2 = G^gR2Vtpx
A 2fy 2 ,
where
2tyi = dpi -dil> + ^%-d*l),
Vy2 = dy2 - dip + M M d V ) .
9 9 Taking the pp-wave limit as above, we find the following B field
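$B = \frac{\hat\gamma}{\sqrt3}\,dx^3\wedge dx^4 + \frac{2\hat\gamma}{\sqrt{3+\hat\gamma^2}}\left(x^2\,dx^3\wedge dx^+ + x^1\,dx^+\wedge dx^4\right)$, and the following field strengths $H_{23+} = \frac{2\hat\gamma}{\sqrt{3+\hat\gamma^2}}$, $H_{14+} = -\frac{2\hat\gamma}{\sqrt{3+\hat\gamma^2}}$.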
Thus, the field strength is null, which is typical for the pp-wave limit. Given the metric and B field we can now write down the string sigma model. We will be working in the light-cone gauge, just like in Sec. 2. The string worldsheet action is (we are dropping the fermions from our analysis) $S = -\frac{1}{4\pi\alpha'}\int d^2\sigma\left[\sqrt{\eta}\,\eta^{ab}g_{\mu\nu}\partial_aX^\mu\partial_bX^\nu + \epsilon^{ab}B^{NS}_{\mu\nu}\partial_aX^\mu\partial_bX^\nu + \alpha'\sqrt{\eta}\,\phi(x)R\right]$,
with $R$ the scalar curvature on the worldsheet, $\eta_{ab}$ the worldsheet metric and $\eta = |\det\eta_{ab}|$. We will choose $\sqrt{\eta}\,\eta^{ab}$ diagonal with $\sqrt{\eta}\,\eta^{00} = -1$ and $\sqrt{\eta}\,\eta^{11} = 1$. After shifting
73 x
—> x
+
2^3Tf*
{x1x3+x2x4),
the metric becomes 4j
ds = —Adx+dx~
1 2 2 2 B ^ + 3TT^ + i ) *^ ) )
+ \2 (dx+)
,j=5
+ ^(dx1)2
2^3 V ° y/^+J
+
i=l
(xldx3 - x3dxx + x2dx4 - x4dx2)
dx+.
In the gauge x+ = r, we obtain the following Lagrangian density (we take a to run from 0 to n and set a' = J") £ = -2
dx~ 1 Jfr~ ~ 2 27
+ \/3 + 7'2 +
V3
Ay
2 1 2+ 22 E(^) + Trb^ ) <* ) ) 3 + 7' Li=5 2
dx3
j dx4
\X ~fo~X f
2
3
Ifo 1
1dx
3dx
^TTT \ ^~
4
+
^
2
2dx
4dx
^'
\
l r ^ ,
a
^)~~^
iao
d
;
To quantize the theory, compute the canonical momenta p1 (r,a) = a;1 (r,a) - J ^ - ^ p 3 (r,cr) =x3(r,a)
+
3
,
3+7 ^ '
P 0-,*) = ^
p2(r,a) = i2(r,a) -
p
=±*(T,<7),
(T,CT)
fc
= x4(r,a)
= 5,6,7,8,
and impose the equal time commutation relations $[p^k(\tau,\sigma),x^j(\tau,\sigma')] = -i\delta^{jk}\delta(\sigma-\sigma')$.
J
3 + 72 3 + 72
'
105 The Hamiltonian is
U
X
U=l
'
Aj
4T_
2
+3^+Ir > ' > -
k=5
3
/
2dx
V* + rf r lb"*'
k=i
2^3
k=l 4
xdx
da
(pV+pV-pV-pV)
+
Notice that the modes corresponding to $x^5$, $x^6$, $x^7$, $x^8$ have masses that do not depend on $\gamma$, i.e., they are unaffected by the deformation. This is not unexpected, since these coordinates come from the AdS$_5$ part of the space which does not participate in the deformation. We will, from this point on, consider only $x^1$, $x^2$, $x^3$, $x^4$. The Heisenberg equations of motion are $\frac{\partial^2x^i}{\partial t^2} - \frac{\partial^2x^i}{\partial\sigma^2} + f_{ij}\frac{\partial x^j}{\partial t} + h_{ij}\frac{\partial x^j}{\partial\sigma} + k_ix^i = 0$,
where 0
0
0
0
V
V
3 3+7*
-v
3+7 2
0
3+7 2
V
3+7 2
and 0 hij = 2
0
0
/ V
0
V / 3+7 ?
7_
0
- V 3 +T 2 fcl = fc2 =
472
3+
2 7
'
3+7
1/3+72 2
0
0
0
0
0
fco = fc4 = 0 .
To solve these equations introduce the mode expansions $x^i(t,\sigma) = \sum_nx^i_n(t)\,e^{2in\sigma}$.
Reality of $x^i(t,\sigma)$ is encoded (as usual) in $x^i_n = \left(x^i_{-n}\right)^\dagger$.
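The equations of motion now become the following equation for the modes: $\frac{d^2x^i_n}{dt^2} + 4n^2x^i_n + f_{ij}\frac{dx^j_n}{dt} + 2in\,h_{ij}x^j_n + k_ix^i_n = 0$.
Make the ansatz $x^i_n(t) = a^{(kn)}A^{(kn)}_i\,e^{i\omega^{(kn)}t}$.
$a^{(kn)}$ will be a destruction/annihilation operator; $A^{(kn)}_i$ is a unitary transformation diagonalizing the equation of motion; $\omega^{(kn)}$ is our spectrum. Plugging this into the equations of motion we find $\left(-(\omega^{(kn)})^2\delta_{ij} + 4n^2\delta_{ij} + if_{ij}\omega^{(kn)} + 2in\,h_{ij} + k_i\delta_{ij}\right)a^{(kn)}A^{(kn)}_j\,e^{i\omega^{(kn)}t} = 0$.
The condition for a nontrivial solution is $\det\left(-(\omega^{(kn)})^2\delta_{ij} + 4n^2\delta_{ij} + if_{ij}\omega^{(kn)} + 2in\,h_{ij} + k_i\delta_{ij}\right) = 0$,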
which leads to the following quartic equation: $\omega^4 - (4+8n^2)\omega^2 + 16n^4 = 0$. It is solved by $\omega = \sqrt{1+4n^2} - 1\approx 2n^2$. Notice that the spectrum is independent of the deformation parameter $\gamma$. The fact that the spectrum is independent of $\gamma$ is unexpected. Evidently, the $\gamma$ dependence in the B field exactly compensates for the $\gamma$ dependence of the geometry. Our goal now is to show that this spectrum can be reproduced in the $\beta$ deformed $\mathcal{N}=4$ super Yang-Mills theory. The analysis uses certain technical results, non-renormalization theorems, to argue that the subset of interactions that we will consider gives the complete result. These theorems are established for the undeformed theory and then one argues that they are not modified by the $\beta$ deformation, by appealing to general properties of noncommutative field theories. Our analysis will verify the $\gamma$ independence of the pp-wave spectrum. To construct the excited string states, we build an operator which is dual to a string which is not excited. The dimension of this operator is not corrected away from its free field limit. In the dual string theory, the corresponding statement is simply that the string state is massless, independent of the string tension. Exciting the string is then given a description in terms of "impurities" moving on the background provided by the operator which is dual to the string with no excitations. First, we build the "background" on which the impurities move. The background is built from an even number of $\Phi_1$, $\Phi_2$ and $\Phi_3$ fields. The operator will be built out of a very large number (of order $\sqrt N$ in the large $N$ limit) of Higgs fields. This implies it has a large $U(1)$ charge which corresponds to a momentum transverse to the brane; this is the field theory dual of the infinite boost needed to reach the Penrose limit. Start by selecting one of the Higgs fields from which the background is to be composed. Place a second background Higgs field to the left of this first one and let it "hop over the first" assigning phases for even and odd exchanges, where we call the following exchanges
even exchanges and the exchanges $2$1_
> $
1#2
>
or
$
3
#
2^.
$
2
$
3
]
or
$1$3
^ $ 3 $ ^
odd exchanges. For each odd exchange we append the factor a* = e2*"7*, and for each even exchange we append the factor a = e -2 *' 72 . Place a third background Higgs field to the left of the two terms generated, and let it hop all the way to the right, generating a total of 6 terms. Continue until all background Higgs fields have been selected. As an example, if we wanted to build the background out of one $ x , one $ 2 and one $ 3 , we would go through the following steps <j>! _>. $ 2 ^ 1
+e27TJ7^1^2
_> $ 3 $ 2 $ x + e 2 ' ri7 $ 2 $ 3 $ 1 + $ 2 $!$ 3 +e2'ri7$3$1$2 + $1$3$2 +
27ri e
7$i$2 $ 3
Selecting the background fields in a different order may change the overall (and hence arbitrary) phase of the above operator. By building the operator in this way, each exchange term we add by hand will be matched by an exchange performed by the potential, with an opposite sign so that the corrections generated by the Feynman diagrams sum to zero. Thus, this operator is not corrected in perturbation theory — it is dual to the string state with no excitations. This is not quite exact, because we did not consider the exchange that will swap the last and first Higgs field. However, neglecting this exchange is justified in the limit that we have a huge number of fields in the operator (i.e., the pp-wave limit of the dual string theory).
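The hopping construction described above is mechanical enough to be sketched in a few lines of Python. This is only an illustration: the function names are invented here, phases are stored as integer powers of $q = e^{2\pi i\gamma}$, and the rule that two identical fields exchange without a phase is an assumption made for this sketch, since the text only specifies exchanges of different fields.

```python
from collections import Counter

# Odd exchanges, written as (moving field, field it hops over):
#   Phi2 Phi1 -> Phi1 Phi2,  Phi3 Phi2 -> Phi2 Phi3,  Phi1 Phi3 -> Phi3 Phi1
# Each odd exchange contributes q = exp(2*pi*i*gamma), each even one q^(-1).
ODD = {(2, 1), (3, 2), (1, 3)}


def hop_power(moving, passed):
    """Power of q picked up when `moving` hops to the right over `passed`."""
    if moving == passed:
        return 0  # assumption of this sketch: identical fields give no phase
    return 1 if (moving, passed) in ODD else -1


def build_background(fields):
    """Return a Counter mapping (word, power of q) -> multiplicity."""
    terms = Counter({((fields[0],), 0): 1})
    for new in fields[1:]:
        updated = Counter()
        for (word, power), count in terms.items():
            phase = power
            # The new field starts at the far left: no hop yet.
            updated[((new,) + word, phase)] += count
            for pos in range(len(word)):
                # Hop over word[pos] and record the resulting term.
                phase += hop_power(new, word[pos])
                new_word = word[:pos + 1] + (new,) + word[pos + 1:]
                updated[(new_word, phase)] += count
        terms = updated
    return terms


example = build_background([1, 2, 3])
for (word, power), count in sorted(example.items()):
    print(count, word, "q^%d" % power)
```

For the field order `[1, 2, 3]` the six terms carry the same phases as the worked example in the text: the words (3,2,1), (2,1,3) and (1,3,2) come with no phase, and (2,3,1), (3,1,2) and (1,2,3) come with one power of $q$.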
We will now describe how to build excited string states with two impurities. For the impurities take $\Phi_1$ and $\Phi_3$. Let $\Phi_3$ hop into the $n$th position using the same rules for hopping as above. The operator obtained in this way is $O_n$. For each $O_n$ let $\Phi_1$ hop into the $m$th position. Call the resulting operator $O_{n,m}$. Now define ($p = 0, 1, \ldots, J$)
$$O_{(p)} = \sum_{n,m} \mathrm{Tr}\,(O_{n,m})\,\delta_{m-n,p},$$
where the delta function sets $m - n = p$ mod $J$. It is now a simple task to show that
(Oji)(x1)VF(y)6J.)(x2)) = HlMl
NJ+3aJ+4
2J
x2\ \xi
Xi
y\4\x2
~2/|4'
where VF{V) is the interaction vertex of the super Yang-Mills theory, M^is defined by l\[J+2nJ+2
{Oji)^)Ojj)(x2))
= Ml
Fi
n{2J+4,
~x2\
and
$$H_{ik} = 8\delta_{ik} - 4\delta_{i+1,k} - 4\delta_{i,k+1}.$$
When computing these correlators, we sum over all contractions except the contractions involving the fields that were at the endpoints of $O_{n,m}$; this should give the correct answer in the limit we consider. Operators dual to excited string states are built using the eigenvectors of $H$; the eigenvalues of $H$ determine the spectrum of anomalous dimensions of the operators we are interested in. Thus, we need to solve the eigenvalue problem of the operator $H$. Denoting the components of the eigenvectors, $H|\lambda\rangle = \lambda|\lambda\rangle$, by
$$|\lambda\rangle = \begin{pmatrix} v_0 \\ v_1 \\ \vdots \\ v_{J-1} \\ v_J \end{pmatrix},$$
we have
$$-4v_{n-1} + 8v_n - 4v_{n+1} = \lambda v_n \qquad (6)$$
for $1 \le n \le J-1$ and
$$3v_0 - 2v_1 - v_J = \lambda v_0, \qquad 3v_J - 2v_{J-1} - v_0 = \lambda v_J. \qquad (7)$$
Make the ansatz
$$v_n = Ae^{ikn} + Be^{-ikn}.$$
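Substituting this plane-wave ansatz into the bulk recursion (6) fixes $\lambda$ in terms of $k$, namely $\lambda = 8 - 8\cos k$. The short Python sketch below checks this numerically for arbitrary sample amplitudes (the values of `A`, `B` and the sample momenta are illustrative choices, not taken from the text) and also checks the small-$k$ behaviour $\lambda \approx 4k^2$.

```python
import cmath
import math


def bulk_residual(k, n, A=1.0, B=0.5):
    """Left side minus right side of the bulk recursion (6) for the ansatz."""
    def v(j):
        return A * cmath.exp(1j * k * j) + B * cmath.exp(-1j * k * j)

    lam = 8 - 8 * math.cos(k)
    return -4 * v(n - 1) + 8 * v(n) - 4 * v(n + 1) - lam * v(n)


# The residual should vanish (up to rounding) for any k and any bulk site n.
residuals = [abs(bulk_residual(k, n)) for k in (0.05, 0.7, 2.1) for n in (1, 5, 12)]
print(max(residuals))

# For small k the eigenvalue approaches 4 k^2.
small_k = 0.01
ratio = (8 - 8 * math.cos(small_k)) / (4 * small_k ** 2)
print(ratio)
```

The boundary equations (7) are not used here; they only restrict which values of $k$ are allowed and fix the ratio of $A$ to $B$.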
Then, (6) implies $\lambda = 8 - 8\cos(k)$, and (7) implies that the allowed values of $k$ solve
$$\mathrm{Imag}\left[\left(\lambda - 3 + 2e^{-ik} + e^{-iJk}\right)\left(3e^{ikJ} - 2e^{i(J-1)k} - 1 - \lambda e^{iJk}\right)\right] = 0,$$
where Imag stands for the imaginary part. It is now a simple exercise to determine $A$ in terms of $B$ using (7). $B$ is determined by the normalization of the eigenvector. Since for small values of $k$ we have $\lambda \approx 4k^2$ we see that we have reproduced both the $\gamma$ independence and the correct dispersion relation of the string theory.

Suggested Reading: The discussion of this section is based on
[1] R. de Mello Koch, J. Murugan, J. Smolic and M. Smolic, Deformed PP-waves from the Lunin-Maldacena background, JHEP 0508, 072 (2005) [arXiv:hep-th/0505227].
See also
[2] S. A. Frolov, R. Roiban and A. A. Tseytlin, Gauge-string duality for superconformal deformations of N = 4 super Yang-Mills theory, JHEP 0507, 045 (2005) [arXiv:hep-th/0503192];
[3] T. Mateos, Marginal deformation of N = 4 SYM and Penrose limits with continuum spectrum, JHEP 0508, 026 (2005) [arXiv:hep-th/0505243];
for complementary discussions.

8. Giant Gravitons

What are we going to do in this section and why? In the previous section we have compared the spectrum of closed strings as computed in the string theory to the spectrum obtained from the gauge theory. In this section we would like to extend this analysis to a class of open string states. Further, we will do the analysis in a three-parameter generalization (discovered by Frolov) of the Lunin-Maldacena backgrounds. In this section we are going to work in a different supergravity background. It is a generalization of the Lunin-Maldacena background which preserves no supersymmetry. This makes it extremely interesting. The spacetime still has an $AdS_5$ factor (but now it is a product with a deformed sphere, with the deformation characterized by three parameters) which tells us that the dual field theory is still conformally invariant (at least at large $N$ and in the strong coupling limit). Further, we are going to study giant graviton solutions. A graviton moving in a background five form flux lowers its interaction energy with the flux by blowing up into a three sphere with non-zero radius. Of course, by increasing its volume it increases its rest mass (it has a non-zero tension). For the original $AdS_5\times S^5$ background the sum of these energies can be minimized at a non-zero radius giving rise to so-called "giant gravitons". The larger the angular momentum of the giant, the stronger the coupling to the five form flux and the larger the radius of the giant. Giant gravitons are extremely interesting because at weak string coupling they are very heavy objects. They are not described as perturbative excitations; they are solitons of the string theory. Thus, one is probing non-perturbative aspects of the theory by studying giant gravitons. The giant graviton solutions we consider are D3 branes that have blown up in the deformed sphere part of the geometry. These giants cannot grow without bound: a maximal giant will completely fill the sphere. As we have mentioned, the size of the giant increases with its angular momentum. This upper limit on the size of the giant implies an upper limit on the possible angular momentum the giant can have. This sharp prediction has been verified in many independent ways and lends strong support to the existence of giant gravitons. Excitations of the giant gravitons are described in terms of the open strings attached to the giants. This is particularly simple to handle: the presence of the giant simply determines the allowed boundary conditions for the open strings.
Our goal is to show that the dynamics of the open strings attached to the giants can be reproduced in the dual field theory. This will lend strong support to the gauge theory/gravity duality we are studying. As discussed above, the giant graviton solutions we consider are D3 branes that have blown up in the deformed sphere part of the geometry. Our ansatz for the giant assumes that it has a constant radius and a constant angular velocity. To write down the action for the D3 brane, we need the metric and dilaton of the background (to write down the Dirac-Born-Infeld term in the action), the NS-NS two form potential and the RR two and four form potentials (to write down the Chern-Simons terms in the action). The $AdS_5$ and the deformed sphere spaces are orthogonal to each other
$$ds^2 = ds^2_{AdS_5} + ds^2_{S^5_\gamma}.$$
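We will use the following spacetime coordinates: (1) For $AdS_5$ use $(t, \alpha_1, \alpha_2, \alpha_3, \alpha_4)$. In terms of these coordinates, the metric is
$$ds^2_{AdS_5} = -\left(1 + \sum_{k=1}^{4}\alpha_k^2\right)dt^2 + R^2\left(\delta_{ij} - \frac{\alpha_i\alpha_j}{1 + \sum_{k=1}^{4}\alpha_k^2}\right)d\alpha_i\, d\alpha_j.$$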
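These coordinates are useful when studying small fluctuations of the giant graviton, since they make the $SO(4)$ subgroup of the $SO(2,4)$ isometry of $AdS_5$ manifest. (2) For the deformed five sphere, use $(\alpha, \theta, \phi_1, \phi_2, \phi_3)$. In terms of these coordinates, the metric is
$$ds^2_{S^5_\gamma} = R^2\left(d\alpha^2 + \sin^2\alpha\, d\theta^2 + G\sum_{i=1}^{3}\rho_i^2\, d\phi_i^2\right) + R^2 G\,\rho_1^2\rho_2^2\rho_3^2\left(\sum_{i=1}^{3}\gamma_i\, d\phi_i\right)^2,$$
$$\rho_1 = \cos\alpha, \qquad \rho_2 = \sin\alpha\cos\theta, \qquad \rho_3 = \sin\alpha\sin\theta,$$
$$G^{-1} = 1 + \gamma_1^2\rho_2^2\rho_3^2 + \gamma_2^2\rho_1^2\rho_3^2 + \gamma_3^2\rho_2^2\rho_1^2.$$
In terms of the dilaton $\phi_0$ of the undeformed background, the dilaton is
$$e^{\phi} = e^{\phi_0}\, G^{1/2}.$$
The dilaton of the undeformed background satisfies $R^4 e^{-\phi_0} = 4\pi N l_s^4$. The five form field strength of the background is
$$F_5 = 4R^4 e^{-\phi_0}\left(\omega_{AdS_5} + G\,\omega_{S^5}\right), \qquad \omega_{S^5} = \cos\alpha\sin^3\alpha\sin\theta\cos\theta\; d\alpha\, d\theta\, d\phi_1\, d\phi_2\, d\phi_3.$$
Finally, the RR two form potential is
$$C_2 = -4R^2 e^{-\phi_0}\,\omega_1\wedge\sum_{i=1}^{3}\gamma_i\, d\phi_i, \qquad d\omega_1 = \cos\alpha\sin^3\alpha\sin\theta\cos\theta\; d\alpha\, d\theta,$$
and the NS-NS two form potential is
$$B = R^2 G\, w_2, \qquad w_2 = \gamma_3\rho_1^2\rho_2^2\, d\phi_1\wedge d\phi_2 + \gamma_1\rho_2^2\rho_3^2\, d\phi_2\wedge d\phi_3 + \gamma_2\rho_3^2\rho_1^2\, d\phi_3\wedge d\phi_1.$$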
We will not consider the most general background with three arbitrary parameters in this lecture; from now on we set $\gamma_1 = 0$ and $\gamma = \gamma_2 = \gamma_3$. To write down the D3 brane action
$$S = -\frac{1}{(2\pi)^3}\int d^4y\; e^{-\phi}\sqrt{|\det(G + B)|} + \int C_4 + \int C_2\wedge B,$$
we will use the static gauge
$$y^0 = t, \qquad y^1 = \theta, \qquad y^2 = \phi_2, \qquad y^3 = \phi_3.$$
Make the ansatz $\alpha = \alpha_0$, $\phi_1 = \omega t$ where $\alpha_0$ and $\omega$ are constants, independent of $y^\mu$, for the giant graviton solution. It is now a simple matter to integrate the Lagrangian density over $y^1$, $y^2$ and $y^3$ to obtain the Lagrangian $L = -m\sqrt{1 - a\dot\phi_1^2} + b\dot\phi_1$, where
b = 4N
•> i
7-
A/4
e"^0
+ 72
472 A/4 + 7 2
r3
S
log
a = R2
B*
+
A/4+72
7
A/4+72
7
7+ 2
4-y
A/4 A/4
+ 72 + 7
•2^ +
log
2
- 1 A/4+72
A/4+7
Nj?,2
2
+ 1
2
7 6
2
r ,
1
2
r (i? -r ) i?S(l+72£r(l-£))
$r = R\sin\alpha_0$ is the radius of the giant, $T_3$ is the D3 brane tension and $R$ is the radius of curvature of the AdS space and the radius of the (undeformed) sphere. Solving
$$M = \frac{\partial L}{\partial\dot\phi_1}$$
for $\dot\phi_1$, one obtains
$$\dot\phi_1 = \frac{M - b}{\sqrt{a[M - b]^2 + m^2a^2}}.$$
The energy of the giant graviton is now easily computed
$$E = \dot\phi_1 M - L = \sqrt{m^2 + \frac{[M - b]^2}{a}}. \qquad (8)$$
$\alpha_0$ can be determined by minimizing the energy at fixed $M$.
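The step from the Lagrangian $L = -m\sqrt{1 - a\dot\phi_1^2} + b\dot\phi_1$ to the energy (8) is a Legendre transform, and it can be checked numerically. In the sketch below the values of `m`, `a`, `b` and `phidot` are arbitrary sample numbers chosen only for the check; they are not the background-dependent expressions of the text.

```python
import math


def lagrangian(phidot, m, a, b):
    return -m * math.sqrt(1 - a * phidot ** 2) + b * phidot


def momentum(phidot, m, a, b):
    # M = dL/d(phidot)
    return m * a * phidot / math.sqrt(1 - a * phidot ** 2) + b


def phidot_from_momentum(M, m, a, b):
    # Inverse of momentum(), valid on the branch with phidot > 0.
    return (M - b) / math.sqrt(a * (M - b) ** 2 + m ** 2 * a ** 2)


def energy(M, m, a, b):
    # Closed form (8).
    return math.sqrt(m ** 2 + (M - b) ** 2 / a)


m, a, b = 1.3, 0.8, 0.25   # arbitrary sample values
phidot = 0.6               # must satisfy a * phidot**2 < 1

M = momentum(phidot, m, a, b)
E_direct = phidot * M - lagrangian(phidot, m, a, b)
print(phidot_from_momentum(M, m, a, b), E_direct, energy(M, m, a, b))
```

Both the recovered angular velocity and the directly computed $\dot\phi_1 M - L$ agree with the closed-form expressions to rounding accuracy.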
Fig. 1. The energy of the giant graviton versus $r/R$ for fixed angular momentum. For the plot shown, $\gamma = 0.4$, $N = 10$ and $M = 7/N$. The energy is shown in units of $1/R$.
Clearly the energy of the point graviton is less than that of the giant, so that the giant graviton will be an unstable (more precisely, metastable) state. Further investigation shows that the giant graviton is perturbatively stable: it does lie at a minimum of the potential. However, it is not an absolute minimum and it can tunnel to the point graviton state. The contributions to the energy coming from the Chern-Simons four form flux and $C_2\wedge B$ terms enter with opposite signs. At $\gamma = 0$, the $C_2\wedge B$ term vanishes, while the four form flux term is non-zero. As $\gamma$ is increased, the $C_2\wedge B$ term grows faster than the four form flux term. For large enough deformations, the $C_2\wedge B$ term dominates. There is a critical deformation beyond which there is no giant graviton solution. This behavior is shown in Fig. 2 below. A comment is in order: when trying to find the minimum of an action one should make an ansatz at the level of the equations of motion, and not at the level of the action. Indeed, making the ansatz at the level of the action implies that one is restricted to minimizing over the set of functions captured by the ansatz made. There is no reason why the classical solution should belong to this set. For the case considered here, it is possible to evaluate the derivative of the action on the ansatz used and show that it does indeed vanish.
Fig. 2. The energy of the giant graviton versus $r/R$ for fixed angular momentum. For the plot shown, $N = 10$ and $M = 1/N$. The solid line has $\gamma = 0$, the dotted line $\gamma = 0.8$ and the dashed line $\gamma = 1.6$. The energy is shown in units of $1/R$. As the deformation increases the giant graviton minimum is raised until it is no longer a solution.
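The supergravity background we are studying is conjectured (using the Lunin-Maldacena logic) to be dual to the field theory with scalar potential
$$V = \mathrm{Tr}\sum_{n>m=1}^{3}\left|\Phi_m\Phi_n - e^{-2\pi i\alpha_{mn}}\Phi_n\Phi_m\right|^2 + \mathrm{Tr}\left(\sum_{n=1}^{3}[\Phi_n,\bar\Phi_n]\right)^2,$$
where $\alpha_{mn} = -\epsilon_{mnl}\gamma_l$. Our giant graviton solutions correspond to branes orbiting with angular momentum along the $\phi_1$ direction. The R charge of $\Phi_1$ corresponds to the angular momentum $M$ above. Thus, a giant graviton with angular momentum $M$ should be dual to an operator built out of $MN$ $\Phi_1$ fields. From now on we use $Z$ to denote $\Phi_1$ and $X$, $Y$ to denote $\Phi_2$, $\Phi_3$. To match what was done in the dual gravitational theory set $\gamma_1 = 0$ and $\gamma_2 = \gamma_3 = \gamma$ so that
$$V = \mathrm{Tr}\left[\left|e^{i\pi\gamma}ZY - e^{-i\pi\gamma}YZ\right|^2 + \left|e^{i\pi\gamma}XZ - e^{-i\pi\gamma}ZX\right|^2 + \left|YX - XY\right|^2\right] + \mathrm{Tr}\left[\left([X,\bar X] + [Y,\bar Y] + [Z,\bar Z]\right)^2\right].$$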
We want to determine the spectrum of anomalous dimensions for the theory with this potential. A particularly convenient way to do this amounts to identifying the anomalous dimension matrix with the Hamiltonian of a spin chain. The advantage of this procedure stems from the fact that the spin chain obtained in this way is integrable and there are powerful tools available to diagonalize integrable spin chain Hamiltonians. Thus, we need to determine the spin chain of this deformed $\mathcal{N} = 4$ super Yang-Mills theory relevant for the dual description of open strings attached to giants. In the undeformed theory with gauge group $U(N)$, operators dual to sphere giants are given by Schur polynomials of the totally antisymmetric representations, which are labeled by Young diagrams with a single column. The cut off on the number of rows of the Young diagram perfectly matches the cut off on angular momentum arising because the sphere giant fills the $S^5$ of the $AdS_5\times S^5$ geometry. For maximal giants, the Schur polynomials are determinant like operators. Attaching a string to the maximal giant is conjectured to give an operator of the form
$$\epsilon^{j_1\cdots j_N}_{i_1\cdots i_N}\, Z^{i_1}_{j_1}\cdots Z^{i_{N-1}}_{j_{N-1}}\,(M_1M_2\cdots M_n)^{i_N}_{j_N}.$$
The open string is given by the product $(M_1M_2\cdots M_n)^{i}_{j}$. The $M_i$ could in principle be fermions, covariant derivatives of Higgs fields or Higgs fields themselves. To describe excitations of the string involving only coordinates from the $S^5$, restrict the $M_i$ to be Higgs fields. We will restrict ourselves even further and require that the $M_i$ are $Z$ or $Y$. A spin chain description can then be constructed by identifying $(M_1M_2\cdots M_n)^{i}_{j}$ with a spin chain that has $n$ sites. If $M_i = Z$ the $i$th spin is spin up; if $M_i = Y$ the $i$th spin is spin down. It is not possible for $Z$'s to hop off and onto the string attached to a maximal giant; as soon as $M_1 = Z$ or $M_n = Z$ the operator factorizes into a closed string plus a maximal giant graviton. This implies the boundary constraint $M_1 \neq Z \neq M_n$. However, for non-maximal giants, $Z$'s can hop between the graviton and the open string. In this case, the number of sites in the spin chain is dynamical. If however, one identifies the spaces between the $Y$'s as lattice sites and the $Z$'s as bosons which occupy sites in this lattice, the number of sites is again conserved. For the undeformed theory this leads to the Hamiltonian
$$H = 2\lambda\alpha^2 + 2\lambda\sum_{l=1}^{L}a_l^\dagger a_l - \lambda\sum_{l=1}^{L-1}\left(a_l^\dagger a_{l+1} + a_la_{l+1}^\dagger\right) + \lambda\alpha\left(a_1 + a_1^\dagger\right) + \lambda\alpha\left(a_L + a_L^\dagger\right).$$
The operators in the above Hamiltonian are Cuntz oscillators
$$a_la_l^\dagger = I, \qquad a_l^\dagger a_l = I - |0\rangle\langle 0|.$$
For a giant with angular momentum $p/R$, the parameter
$$\alpha = \sqrt{1 - \frac{p}{N}}$$
measures how far from a maximal giant we are.
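The Cuntz oscillator relations can be made concrete with truncated matrices. The sketch below is an illustration under an explicit assumption: the Fock space of a single site is cut off at `dim` states, so the relation $aa^\dagger = I$ necessarily fails on the last retained state, while $a^\dagger a = I - |0\rangle\langle 0|$ holds exactly. The helper names are invented for this example.

```python
def cuntz_matrices(dim):
    """Truncated matrices for a and a^dagger on the states |0>, ..., |dim-1>.

    a lowers the occupation number with unit weight: a|n> = |n-1>, a|0> = 0.
    """
    a = [[1 if col == row + 1 else 0 for col in range(dim)] for row in range(dim)]
    a_dag = [[a[col][row] for col in range(dim)] for row in range(dim)]
    return a, a_dag


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


dim = 6
a, a_dag = cuntz_matrices(dim)
a_adag = matmul(a, a_dag)   # identity, except on the last (truncated) state
adag_a = matmul(a_dag, a)   # identity minus the projector onto |0>

for row in adag_a:
    print(row)
```

Unlike ordinary bosonic oscillators there are no $\sqrt{n}$ factors, which is why the product $a^\dagger a$ differs from the identity only on the vacuum.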
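Due to the deformation, hopping is now accompanied by an extra phase. To see how this comes about, note that the deformation replaces
$$[Z,Y] \to ZYe^{i\pi\gamma} - YZe^{-i\pi\gamma},$$
$$[Z,Y][Z,Y]^\dagger \to ZY\bar Y\bar Z - ZY\bar Z\bar Y e^{2\pi i\gamma} + YZ\bar Z\bar Y - YZ\bar Y\bar Z e^{-2\pi i\gamma}.$$
It is straightforward to see what interactions in the spin chain Hamiltonian these terms induce (the barred fields are Wick contracted with the fields in the second trace)
$$\mathrm{Tr}(YZ\bar Z\bar Y)\,\mathrm{Tr}(YZ\cdots) \to \mathrm{Tr}(YZ\cdots) \leftrightarrow a_l^\dagger a_l,$$
$$\mathrm{Tr}(ZY\bar Y\bar Z)\,\mathrm{Tr}(ZY\cdots) \to \mathrm{Tr}(ZY\cdots) \leftrightarrow a_l^\dagger a_l,$$
$$\mathrm{Tr}(ZY\bar Z\bar Y e^{i2\pi\gamma})\,\mathrm{Tr}(YZ\cdots) \to e^{i2\pi\gamma}\,\mathrm{Tr}(ZY\cdots) \leftrightarrow e^{i2\pi\gamma}a_l^\dagger a_{l+1},$$
$$\mathrm{Tr}(YZ\bar Y\bar Z e^{-i2\pi\gamma})\,\mathrm{Tr}(ZY\cdots) \to e^{-i2\pi\gamma}\,\mathrm{Tr}(YZ\cdots) \leftrightarrow e^{-i2\pi\gamma}a_la_{l+1}^\dagger.$$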
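The four-term expansion of the deformed commutator times its conjugate can be verified with a small amount of bookkeeping for non-commuting words. In this sketch the conventions are assumptions made for the example: lowercase letters stand for the barred fields, a word is a Python string, and phases are stored as integer powers of $q = e^{i\pi\gamma}$.

```python
from collections import defaultdict


def multiply(left, right):
    """Multiply two sums of words.

    Keys are (word, power of q); values are integer coefficients.
    """
    out = defaultdict(int)
    for (w1, p1), c1 in left.items():
        for (w2, p2), c2 in right.items():
            out[(w1 + w2, p1 + p2)] += c1 * c2
    return {key: c for key, c in out.items() if c != 0}


# ZY q - YZ q^(-1), the deformed version of [Z, Y]
deformed = {("ZY", 1): 1, ("YZ", -1): -1}
# Its conjugate: Ybar Zbar q^(-1) - Zbar Ybar q
deformed_dagger = {("yz", -1): 1, ("zy", 1): -1}

product = multiply(deformed, deformed_dagger)
for (word, power), coeff in sorted(product.items()):
    print(coeff, word, "q^%d" % power)
```

Since $q^{\pm 2} = e^{\pm 2\pi i\gamma}$, the two phase-free terms and the two terms carrying $e^{\pm 2\pi i\gamma}$ with a minus sign reproduce the expansion quoted in the text.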
To hop onto the spin chain, we are hopping from the "zeroth site", which is the Schur polynomial/giant graviton, and onto the first site of the string. The term which does this has an $e^{-i2\pi\gamma}$ coefficient. Another way to hop onto the spin chain is to hop from the $(L+1)$th site into the $L$th site. The term which does this has an $e^{i2\pi\gamma}$ coefficient. It is straightforward to argue for the phases when we hop off of the giant graviton. From the above discussion we see that the deformation modifies this Hamiltonian to
$$H = 2\lambda\alpha^2 + 2\lambda\sum_{l=1}^{L}a_l^\dagger a_l - \lambda\sum_{l=1}^{L-1}\left(a_l^\dagger a_{l+1}e^{i2\pi\gamma} + a_la_{l+1}^\dagger e^{-i2\pi\gamma}\right) + \lambda\alpha\left(a_1e^{i2\pi\gamma} + a_1^\dagger e^{-i2\pi\gamma}\right) + \lambda\alpha\left(a_Le^{-i2\pi\gamma} + a_L^\dagger e^{i2\pi\gamma}\right).$$
The semiclassical limit, in which the action derived from coherent states should provide a good approximation to the dynamics, is obtained by taking
$$L \to \infty, \qquad \lambda \to \infty,$$
holding $\lambda/L^2$, $L\gamma$ and $\alpha$ fixed. To obtain the low energy effective action, we will use the coherent states
$$|z_l\rangle = \sqrt{1 - |z_l|^2}\,\sum_{n=0}^{\infty} z_l^n\,|n\rangle$$
with parameter $z_l = r_le^{i\phi_l}$, for the $l$th site. The coherent state action is given as usual by
$$S = \int dt\left(i\langle Z|\frac{d}{dt}|Z\rangle - \langle Z|H|Z\rangle\right).$$
In the above expression the coherent state $|Z\rangle$ is written as a product over all sites
$$|Z\rangle = \prod_l |z_l\rangle.$$
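As an illustration of the manipulations which follow, we describe the evaluation of the first term in the action. It is straightforward to see that
$$\frac{d}{dt}|z_l\rangle = -\frac{r_l\dot r_l}{\sqrt{1 - r_l^2}}\sum_{n=0}^{\infty}r_l^ne^{in\phi_l}|n\rangle + \sqrt{1 - r_l^2}\sum_{n=0}^{\infty}\left(n r_l^{n-1}\dot r_l + in\dot\phi_l r_l^n\right)e^{in\phi_l}|n\rangle.$$
Thus,
$$\langle z_l|\frac{d}{dt}|z_l\rangle = \frac{i r_l^2\dot\phi_l}{1 - r_l^2}.$$
In the large $L$ limit, to leading order in $L$ we have
$$\langle Z|\,i\frac{d}{dt}\,|Z\rangle = -L\int_0^1\frac{r^2(\sigma)\dot\phi(\sigma)}{1 - r^2(\sigma)}\,d\sigma.$$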
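A straightforward computation along these lines gives
$$S = -\int dt\left[L\int_0^1\frac{r^2\dot\phi}{1 - r^2}\,d\sigma + 2\lambda\alpha^2 + \frac{\lambda}{L}\int_0^1\left((r')^2 + r^2(\phi' + 2\pi\gamma)^2\right)d\sigma + \lambda\bar z(1)z(1) + \lambda\bar z(0)z(0) + \lambda\alpha\left(z(0) + \bar z(0)\right) + \lambda\alpha\left(z(1) + \bar z(1)\right)\right].$$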
We identify $\tilde\gamma = L\gamma$. Write this action in terms of $\tilde\gamma$ and rescale $\sigma \to \sigma/L$. Clearly, the deformation replaces $\phi' \to \phi' + 2\pi\tilde\gamma$.
Let us now consider the description of the open strings using the dual sigma model. We will warm up with a discussion of the undeformed case. The metric of $R\times S^5$ is
$$ds^2 = -dt^2 + |dX|^2 + |dY|^2 + |dZ|^2, \qquad |X|^2 + |Y|^2 + |Z|^2 = 1.$$
The giant graviton moves along the $Z$ direction with a radius $\sqrt{1 - p/N}$; it wraps the remaining $S^3$. It is convenient to consider the Polyakov action in first order form
$$S = \sqrt{\lambda_{YM}}\int d\tau\int d\sigma\,\mathcal{L},$$
$$\mathcal{L} = p_\mu\partial_0x^\mu + \frac{1}{2}A^{-1}\left[G^{\mu\nu}p_\mu p_\nu + G_{\mu\nu}\partial_1x^\mu\partial_1x^\nu\right] + BA^{-1}p_\mu\partial_1x^\mu.$$
$\lambda_{YM} = g_{YM}^2N$ as usual, and $R^4 = \alpha'^2\lambda_{YM}$. $A$ and $B$ are Lagrange multipliers implementing the constraints. If you eliminate the momenta and solve the constraints implied by $A$ and $B$, your action would again be computing the area of the string's worldsheet. Now, move into a coordinate system in which the brane is static. Set $X = 0$, corresponding to the fact that only $Z$ and $Y$ Higgs fields appear in our open string. Now choose $Z = re^{i(t - \phi_1)}$,
Y = ±\/l-r2e-*2.
The metric becomes ds2 = - ( 1 - r2)dt2 + 2r2dtd(j>1 + T ^ ^
2
+ r 2 ( # i ) 2 + (1 - r2){dcj>2)2.
To make contact with the gauge theory, choose a gauge in which p^,2 is homogeneously distributed along the string, p^ = 2 J and we choose r = t. The total momentum in the Y direction is equal to the number of sites in the dual lattice boson model, L. Thus, we have
J
= l 2^=L-
To obtain the low energy limit, rescale £ - » £ ' = XYxMt. After this rescaling time derivatives are 0(XYlM) = 0(L~2) — 0(J~2). Now implement the constraints following from A and B and use the equations of motion for pr and Pfa to eliminate them from the action. The result is r2cf>i l_r2
*YM
+ 8TT2J2
(r' 2 + r 2 ^ 2 )
which has been shown to be in perfect agreement with the undeformed result from the field theory, after identifying L = J and XYM = 87r2A.
The background studied above can be obtained by performing a sequence of "TsT transformations". I am mentioning this because the TsT transformation has a particularly simple action on the string sigma model: to obtain the sigma model for the deformed theory, we simply need to shift
$$\phi_i' \to \phi_i' - \epsilon_{ijk}\gamma_jp_k.$$
For the above action, we only need to consider
$$\phi_1' \to \phi_1' - \epsilon_{1jk}\gamma_jp_k.$$
Next, since we set $X = 0$ we know that $p_3 = 0$. Thus,
$$\phi_1' \to \phi_1' - \epsilon_{1jk}\gamma_jp_k = \phi_1' - \epsilon_{132}\gamma_3p_2 = \phi_1' + \gamma_3p_2.$$
Now, we have set $\gamma_3 = \gamma_2 = \gamma$ and in our gauge $p_2 = 2J$, so that
$$\phi_1' \to \phi_1' + 2\gamma J = \phi_1' + 2\gamma L.$$
This is in complete agreement with the spin chain result we obtained above.

Suggested Reading: The results of this section were obtained in
[1] R. de Mello Koch, N. Ives, J. Smolic and M. Smolic, Unstable giants, Phys. Rev. D 73, 064007 (2006) [arXiv:hep-th/0509007].
See also
[2] S. Frolov, Lax pair for strings in Lunin-Maldacena background, JHEP 0505, 069 (2005) [arXiv:hep-th/0503201];
[3] S. A. Frolov, R. Roiban and A. A. Tseytlin, Gauge-string duality for (non)supersymmetric deformations of N = 4 super Yang-Mills theory, Nucl. Phys. B 731, 1 (2005) [arXiv:hep-th/0507021].

Acknowledgments

The work reported in the last two sections is the result of a collaboration with Norman Ives, Jeff Murugan, Jelena Smolic and Milena Smolic. I would like to thank my collaborators for generously sharing their insights and for a rewarding, stimulating and most enjoyable collaboration. I would also like to thank Jan Govaerts and Norbert Hounkonnou for inviting me to participate in a very enjoyable, stimulating and memorable meeting in Benin. I would also like to thank all the participants at the Workshop for lively interactions. This work was supported in part by a South African National Research Foundation grant with grant number Gun 2047219.
INTRODUCTORY AND FUNDAMENTAL MATHEMATICAL ASPECTS OF SUPERSYMMETRY

S. James Gates, Jr.
University of Maryland, Physics Department, Center for String & Particle Theory, College Park, Maryland 20742-4111, U.S.A.
E-mail: [email protected]

A discussion is presented on aspects of supersymmetry in the context of quantum mechanical models. Rather than dealing with the details of specific models, the presentation aims to draw out general features of such systems.

Keywords: Adinkras, MSSM, Supersymmetry, Superspace, Superfields.

1. Introduction to SUSY Via 1D

Although there is no obvious reason why supersymmetric quantum mechanics should be interesting, surprisingly such theories have led to a number of unexpected insights and speculations. Among these are the discovery of the Witten index,¹ a field theoretical derivation of the Atiyah-Singer Theorem² and the possible relation with the hyperbolic algebra $E_{10}$.³ Additionally, since the Green-Schwarz formulation of the superstring is not completely understood, the corresponding superparticle models provide interesting laboratories in which to explore ideas for the more complicated superstring theories. Along these latter lines, in order to introduce models which after fixing degrees of freedom lead to either the NSR spinning string or GS superstring, new types of supersymmetric theories with double supersymmetry were introduced.⁴⁻⁷ These theories possess both spacetime and world-sheet supersymmetries and were originally called '(supersymmetry)²' ⁴,⁵ or 'doubly graded' ⁶,⁷ models. Strings with double supersymmetry were called 'spinning superstrings.'⁸ The zero slope limit of these constructions leads to a first-quantized supersymmetric particle model with double supersymmetry, the spinning superparticle model,⁹ with global spacetime and local world-line supersymmetry. This model describes trajectories in doubly
graded superspace $(X^\mu, \Psi^\mu, \Theta^\alpha, S^\alpha)$ where $(X^\mu, \Theta^\alpha)$ describes the superspace coordinates and the RNS fermions $(X^\mu, \Psi^\mu)$ are the usual ones of the spinning model. The new commuting spinor coordinate $S^\alpha$ together with $\Theta^\alpha$ form a new world-line supermultiplet representation of the world-line SUSY, and $S^\alpha$ can be interpreted as the twistor coordinates associated with spacetime variables $\Theta^\alpha$.¹⁰,¹¹ More recently, there has been a collaboration of mathematicians and physicists¹² which has been probing a structure, called '$\mathcal{GR}(d, N)$,'¹³ that appears associated with 1D theories. The argument has been presented that this structure plays a fundamental role for supersymmetrical theories akin to that played by the 'little group' of Wigner. Thus, although formally a 1D structure, $\mathcal{GR}(d, N)$* is common in all representations of supersymmetry, like the Wigner little group is present for all relativistic field theories. It can be seen by setting all dependence upon spatial coordinates in higher dimensional theories to zero. All of these latter models emphasize the importance of the simplest type of supersymmetry, namely one where there is a single bosonic coordinate, in the investigations of more complicated theories. In this note we will investigate various aspects of this class of theories when there is extended supersymmetry introduced on the world line.

2. Symmetry: A Review of 3D Rotations

We begin our introduction by considering not supersymmetry but a more familiar symmetry, i.e., rotational symmetry in three dimensions. To this end, we may introduce a 'generator' denoted by $L_3$. This is an operator that is defined by its actions on the coordinates $(x, y, z)$
$$L_3x = -iy, \qquad L_3y = ix, \qquad L_3z = 0, \qquad (1)$$
and by the fact that it is a derivation. This means that when the operator (1) acts on a product of coordinates, it obeys a rule similar to a derivative, i.e.,
$$L_3[xy] = [L_3x]\,y + x\,[L_3y] = -i[y^2 - x^2]. \qquad (2)$$

* This name is short for 'generalized real' Pauli matrices of dimension $d$ and extension $N$.
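Now we wish to prove that this operator generates a rotation about the third direction. Let $\gamma$ be an angle; we may consider the object defined by
$$\mathcal{R}_3(\gamma) = \exp[-i\gamma L_3], \qquad (3)$$
and evaluate its effect on $(x, y, z)$. A rotated set of coordinates $(x', y', z')$ may be obtained by application of $\mathcal{R}_3(\gamma)$
$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \mathcal{R}_3(\gamma)x \\ \mathcal{R}_3(\gamma)y \\ \mathcal{R}_3(\gamma)z \end{pmatrix}. \qquad (4)$$
We can evaluate each of these using the definitions above
$$\mathcal{R}_3(\gamma)x = \sum_{n=0}^{\infty}\frac{1}{n!}(-i\gamma L_3)^nx = \left[1 - i\gamma L_3 + \tfrac{1}{2}(-i\gamma L_3)^2 + \cdots\right]x = x - \gamma y - \tfrac{1}{2}\gamma^2x + \tfrac{1}{3!}\gamma^3y + \cdots = x\cos\gamma - y\sin\gamma, \qquad (5)$$
$$\mathcal{R}_3(\gamma)y = \sum_{n=0}^{\infty}\frac{1}{n!}(-i\gamma L_3)^ny = \left[1 - i\gamma L_3 + \tfrac{1}{2}(-i\gamma L_3)^2 + \cdots\right]y = y + \gamma x - \tfrac{1}{2}\gamma^2y - \tfrac{1}{3!}\gamma^3x + \cdots = x\sin\gamma + y\cos\gamma, \qquad (6)$$
$$\mathcal{R}_3(\gamma)z = \sum_{n=0}^{\infty}\frac{1}{n!}(-i\gamma L_3)^nz = \left[1 - i\gamma L_3 + \tfrac{1}{2}(-i\gamma L_3)^2 + \cdots\right]z = z + 0 + \cdots = z. \qquad (7)$$
So that
$$\mathcal{R}_3(\gamma)\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x\cos\gamma - y\sin\gamma \\ x\sin\gamma + y\cos\gamma \\ z \end{pmatrix}. \qquad (8)$$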
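The series manipulations leading to (5)-(8) can be reproduced numerically. The sketch below represents a linear function $c_xx + c_yy + c_zz$ by its coefficient triple, applies $L_3$ using the rules in (1), and sums the exponential series term by term. The function names and the number of retained terms are choices made for this illustration.

```python
import math


def apply_L3(vec):
    """Act with L3 on c_x*x + c_y*y + c_z*z, given as (c_x, c_y, c_z).

    Uses L3 x = -i y, L3 y = i x, L3 z = 0, extended linearly.
    """
    cx, cy, cz = vec
    return (1j * cy, -1j * cx, 0j)


def rotate(vec, angle, terms=40):
    """Sum the series exp(-i * angle * L3) acting on vec."""
    total = tuple(complex(c) for c in vec)
    term = total
    for n in range(1, terms):
        lifted = apply_L3(term)
        # term_n = (-i * angle * L3 / n) applied to term_(n-1)
        term = tuple(-1j * angle * c / n for c in lifted)
        total = tuple(t + s for t, s in zip(total, term))
    return total


angle = 0.4
x_prime = rotate((1, 0, 0), angle)   # coefficients of x' in terms of (x, y, z)
y_prime = rotate((0, 1, 0), angle)   # coefficients of y'
z_prime = rotate((0, 0, 1), angle)   # coefficients of z'
print(x_prime, y_prime, z_prime)
```

The coefficient triples come out as $(\cos\gamma, -\sin\gamma, 0)$, $(\sin\gamma, \cos\gamma, 0)$ and $(0, 0, 1)$, in agreement with (8).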
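For small values of $\gamma$, (4) implies
$$\Delta_{\mathcal{R}_3}(\gamma)x = x' - x = -\gamma y, \qquad \Delta_{\mathcal{R}_3}(\gamma)y = y' - y = \gamma x, \qquad \Delta_{\mathcal{R}_3}(\gamma)z = z' - z = 0, \qquad (9)$$
and the notation is to indicate a change $\Delta$ due to the rotation $\mathcal{R}_3$ through the angle $\gamma$. In the limit where $\gamma$ goes to zero we have $\Delta_{\mathcal{R}_3} = \delta_{\mathcal{R}_3}$ and these can be re-written as
$$\delta_{\mathcal{R}_3}(\gamma)x = -\gamma y, \qquad \delta_{\mathcal{R}_3}(\gamma)y = \gamma x, \qquad \delta_{\mathcal{R}_3}(\gamma)z = 0, \qquad (10)$$
and a direct calculation reveals that acting on $(x, y, z)$
$$\delta_{\mathcal{R}_3}(\gamma) = -i\gamma L_3. \qquad (11)$$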
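It is clearly possible to define two other similar operators
$$L_1x = 0, \qquad L_1y = iz, \qquad L_1z = -iy,$$
$$L_2x = -iz, \qquad L_2y = 0, \qquad L_2z = ix. \qquad (12)$$
Given the definitions of the operators in (1) and (12) there are interesting calculations that can be made.
$$\begin{pmatrix} L_1L_2x \\ L_1L_2y \\ L_1L_2z \end{pmatrix} = \begin{pmatrix} -y \\ 0 \\ 0 \end{pmatrix}, \qquad (13)$$
$$\begin{pmatrix} L_2L_1x \\ L_2L_1y \\ L_2L_1z \end{pmatrix} = \begin{pmatrix} 0 \\ -x \\ 0 \end{pmatrix}, \qquad (14)$$
$$\begin{pmatrix} [L_1, L_2]x \\ [L_1, L_2]y \\ [L_1, L_2]z \end{pmatrix} = \begin{pmatrix} -y \\ x \\ 0 \end{pmatrix}. \qquad (15)$$
In writing this, we have introduced the standard notation of the commutator
$$[A, B] = AB - BA \qquad (16)$$
for any two quantities $A$ and $B$.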
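From Equation (1) it follows that
$$\begin{pmatrix} L_3x \\ L_3y \\ L_3z \end{pmatrix} = \begin{pmatrix} -iy \\ ix \\ 0 \end{pmatrix} \qquad (17)$$
and on comparing (15) and (17) it is apparent that
$$\begin{pmatrix} [L_1, L_2]x \\ [L_1, L_2]y \\ [L_1, L_2]z \end{pmatrix} = \begin{pmatrix} -iL_3x \\ -iL_3y \\ -iL_3z \end{pmatrix}. \qquad (18)$$
Since each component of this has the same general form we write
$$[L_1, L_2] = -iL_3. \qquad (19)$$
If the calculation above is repeated in all possible ways it leads to the familiar result
$$[L_i, L_j] = -i\epsilon_{ijk}L_k, \qquad (20)$$
which is the usual rotation algebra up to the overall sign convention chosen for the generators.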
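The commutators of the three generators can be checked directly from their actions on the coordinates. The sketch below takes the rules in (1) and (12) literally, with an operator product $AB$ meaning "apply $B$, then $A$"; with these conventions the structure constants come out with an overall factor $-i$, and reversing the sign of all three generators turns this into $+i$. The function names are chosen for this illustration.

```python
def L1(v):
    # L1 x = 0, L1 y = i z, L1 z = -i y, extended linearly to c_x*x + c_y*y + c_z*z
    cx, cy, cz = v
    return (0j, -1j * cz, 1j * cy)


def L2(v):
    # L2 x = -i z, L2 y = 0, L2 z = i x
    cx, cy, cz = v
    return (1j * cz, 0j, -1j * cx)


def L3(v):
    # L3 x = -i y, L3 y = i x, L3 z = 0
    cx, cy, cz = v
    return (1j * cy, -1j * cx, 0j)


def commutator(A, B, v):
    """[A, B] acting on v, with AB meaning 'apply B first, then A'."""
    return tuple(p - q for p, q in zip(A(B(v)), B(A(v))))


basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
cyclic = [(L1, L2, L3), (L2, L3, L1), (L3, L1, L2)]

# Largest deviation of [L_i, L_j] from -i * L_k over all cyclic triples.
worst = 0.0
for A, B, C in cyclic:
    for v in basis:
        lhs = commutator(A, B, v)
        rhs = tuple(-1j * c for c in C(v))
        worst = max(worst, max(abs(p - q) for p, q in zip(lhs, rhs)))
print(worst)
```

For example, acting on $x$ the commutator $[L_1, L_2]$ returns $-y$, which is the first entry of (15).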
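For the 3D rotation generators, it can be shown that
$$[L_i, [L_j, L_k]] + [L_j, [L_k, L_i]] + [L_k, [L_i, L_j]] = 0. \qquad (21)$$
The operators $L_1$ and $L_2$ lead to $\mathcal{R}_1(\alpha)$, $\mathcal{R}_2(\beta)$, defined by
$$\mathcal{R}_1(\alpha) = \exp[-i\alpha L_1], \qquad \mathcal{R}_2(\beta) = \exp[-i\beta L_2], \qquad (22)$$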
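and the most general rotation in 3 dimensions is defined by
$$\mathcal{R}(\alpha, \beta, \gamma) = \mathcal{R}_1(\alpha)\,\mathcal{R}_2(\beta)\,\mathcal{R}_3(\gamma). \qquad (23)$$
An alternate but equivalent way to write the general rotation is in the form
$$\mathcal{R}(\alpha, \beta, \gamma) = \exp[-i(\alpha L_1 + \beta L_2 + \gamma L_3)] = \exp[-i\alpha_iL_i]. \qquad (24)$$

2.1. Rotational symmetry & 'the Noether Method'

Any quantity $\mathcal{L}(x, y, z)$ that is 'invariant' under a rotation satisfies
$$\mathcal{L}(x', y', z') = \mathcal{L}(x, y, z), \qquad \mathcal{R}(\alpha, \beta, \gamma)\,\mathcal{L}(x, y, z) = \mathcal{L}(x, y, z), \qquad (25)$$
and if the angles are infinitesimals the left hand side can be written as
$$\left[1 - i(\alpha L_1 + \beta L_2 + \gamma L_3)\right]\mathcal{L}(x, y, z) = \mathcal{L}(x, y, z), \qquad (26)$$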
or equivalently this can be written as
$$\delta_{\mathcal{R}}(\alpha_i)\,\mathcal{L}(x, y, z) = -i(\alpha L_1 + \beta L_2 + \gamma L_3)\,\mathcal{L}(x, y, z) = 0, \qquad (27)$$
where
$$\delta_{\mathcal{R}}(\alpha_i) = -i\alpha_iL_i. \qquad (28)$$
Since the angles are independent, this is actually equivalent to three independent conditions
$$L_1\mathcal{L}(x, y, z) = 0, \qquad L_2\mathcal{L}(x, y, z) = 0, \qquad L_3\mathcal{L}(x, y, z) = 0, \qquad (29)$$
which can be more simply written as $L_i\mathcal{L} = 0$. Any quantity that satisfies (29) is said to possess rotational symmetry. In a similar manner any quantity that satisfies only
$$L_3\mathcal{L}(x, y, z) = 0 \qquad (30)$$
is said to possess rotational symmetry about the z-axis (or third direction). Symmetries are very useful properties. For example, imagine there is some system that possesses a symmetry about the z-axis, and has an energy $\mathcal{E}_t$ that is known to depend on both $x$ and $y$. Further imagine a 'standard measurement' of this quantity is only made when $y = 0$ and yields the $x$-dependence
$$\mathcal{E}_t(x, y = 0) = A_0x^4 = \mathcal{E}_{SM}(x). \qquad (31)$$
Since the total energy $\mathcal{E}_t$ is a function of both $x$ and $y$ it can be written as
$$\mathcal{E}_t(x, y) = \mathcal{E}_{SM}(x) + \mathcal{E}_{smSM}(x, y), \qquad (32)$$
where the function $\mathcal{E}_{smSM}$ ('symmetry-modified standard measurement') must satisfy $\mathcal{E}_{smSM}(x, y = 0) = 0$. Since $\mathcal{E}_t$ possesses a symmetry with respect to $L_3$ it must be the case that $L_3\mathcal{E}_t = 0$. So that
$$0 = L_3\left[\mathcal{E}_{SM}(x) + \mathcal{E}_{smSM}(x, y)\right] = L_3\left[\mathcal{E}_{SM}(x)\right] + L_3\left[\mathcal{E}_{smSM}(x, y)\right] = -i4A_0x^3y + L_3\left[\mathcal{E}_{smSM}(x, y)\right]. \qquad (33)$$
To find the explicit form of $\mathcal{E}_{smSM}$ an expansion in terms of powers of $y$ can be utilized,
$$\mathcal{E}_{smSM}(x, y) = \sum_{n=1}^{\infty}y^nf_n(x). \qquad (34)$$
When this expansion is substituted into (33) it leads to
$$0 = 4A_0x^3y + \sum_{n=1}^{\infty}\left[y^{n+1}\frac{df_n}{dx} - n\,y^{n-1}x f_n(x)\right],$$
$$0 = \left[4A_0x^3 - 2xf_2(x)\right]y - xf_1(x) + \sum_{n=1}^{\infty}y^{n+1}\left[\frac{df_n}{dx} - (n+2)\,x f_{n+2}(x)\right], \qquad (35)$$
i.e., a series of equations for the unknown coefficient functions $f_n(x)$. Upon separation into various powers of $y$ this system yields
$$y^0: \quad xf_1(x) = 0 \;\to\; f_1(x) = 0,$$
$$y^1: \quad 4A_0x^3 - 2xf_2(x) = 0 \;\to\; f_2(x) = 2A_0x^2,$$
$$y^{n+1}\ (n \ge 1): \quad f_{n+2}(x) = \frac{1}{(n+2)\,x}\frac{df_n}{dx}. \qquad (36)$$
The complete solution to this set of equations is given by
$$f_2(x) = 2A_0x^2, \qquad f_4(x) = A_0, \qquad (37)$$
where all other coefficient functions vanish. In supersymmetrical theories, the steps discussed above are often called 'the Noether Method' and are commonly used especially in the case of supergravity theories.¹⁴

3. Grassmann Numbers

In the 1840's a mathematician, Hermann Grassmann, proposed results that imply the existence of a type of number that is a non-trivial solution to the equation
$$\theta^2 = 0. \qquad (38)$$
This should be thought of as an analogy to the fundamental definition of complex numbers
$$x^2 = -1. \qquad (39)$$
The quantity $\theta$ is called by many names, among these are 'Grassmann number' and 'anti-commuting classical number.' It is possible to consider more than one such quantity, (i.e., $\epsilon^I$, $I = (1), (2), \ldots, (N)$) and we define also
$$\epsilon^I\epsilon^K = -\epsilon^K\epsilon^I \qquad (40)$$
and note that this last equation implies that $(\epsilon^I)^2 = 0$ for each independent value of the index. While the concept of a Grassmann number may seem strange, in order to introduce supersymmetry it is convenient to introduce an even stranger object, the Grassmann-valued function. Such a function may be denoted by $\Psi(\tau)$ and by definition, it satisfies
$$\Psi(\tau_1)\Psi(\tau_2) = -\Psi(\tau_2)\Psi(\tau_1), \qquad (41)$$
for all values of $\tau_1$ and $\tau_2$. For $\tau_1 = \tau_2$ this implies $[\Psi(\tau)]^2 = 0$ just as for a Grassmann number. This is exactly opposite to the coordinates of a particle which satisfy
$$X(\tau_1)X(\tau_2) = +X(\tau_2)X(\tau_1). \qquad (42)$$
Notice that the Grassmann numbers $\epsilon^I$ are constants, in contrast to $\Psi(\tau)$, which is a function of time. It is a coincidence (or perhaps not) that the result in (38) also provides the simplest mathematical way to enforce the Pauli Exclusion Principle! Thus objects that obey the rule in (41) are typically called 'fermions' while objects that obey the rule in (42) are typically called 'bosons.' Stated another way, in physics when multi-particle wavefunctions obey (42), they are said to possess 'Bose-Einstein' statistics, while if they obey (41), they are said to possess 'Fermi-Dirac' statistics.

4. Supersymmetry

The topic of supersymmetry began in the early seventies [15] and has evolved into one of the most active areas of research in theoretical fundamental physics; it is pursued in mathematics as well. All supersymmetrical theories possess at least one generator of supersymmetry, typically denoted by a symbol like $\mathbf{Q}$ and often called 'the supercharge.' The supercharge is defined to act on functions, and it does so in such a way as to replace a boson by a fermion and vice versa! One such rule can take the form
$$\mathbf{Q}X(\tau) = \Psi(\tau) , \qquad \mathbf{Q}\Psi(\tau) = i\frac{d}{d\tau}X(\tau) . \tag{43}$$
It follows that
$$\mathbf{Q}\mathbf{Q}X(\tau) = i\frac{d}{d\tau}X(\tau) , \qquad \mathbf{Q}\mathbf{Q}\Psi(\tau) = i\frac{d}{d\tau}\Psi(\tau) . \tag{44}$$
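Because the supercharge acts linearly on single fields, (44) can be checked without any Grassmann bookkeeping. The sketch below is an illustration only: it assumes the component functions are stand-in polynomials in $\tau$ stored as coefficient lists, and the helper names (`d_dtau`, `Q`, `i_d_dtau`) are hypothetical.

```python
def d_dtau(p):
    """Derivative of a polynomial stored as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def Q(multiplet):
    """Rule (43) on components: (X, Psi) -> (Q X, Q Psi) = (Psi, i dX/dtau)."""
    X, Psi = multiplet
    return (list(Psi), [1j * c for c in d_dtau(X)])


def i_d_dtau(multiplet):
    """The operator i d/dtau applied to both components."""
    return tuple([1j * c for c in d_dtau(f)] for f in multiplet)


X = [1, 2, 3]        # X(tau) = 1 + 2 tau + 3 tau^2
Psi = [0, 5, 0, 7]   # coefficient profile standing in for Psi(tau)
```

Applying `Q` twice to `(X, Psi)` reproduces `i_d_dtau((X, Psi))`, which is the content of (44).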
This last result is interesting in comparison with (19). There it was seen that, independent of which coordinate ($x$, $y$, $z$) was used to make the calculation, each led to the result in (19). If this same logic is applied to (44), it leads to
$$\mathbf{Q}\mathbf{Q} = i\frac{d}{d\tau} \quad\Longrightarrow\quad \mathbf{Q}\mathbf{Q} + \mathbf{Q}\mathbf{Q} = i\,2\,\frac{d}{d\tau} . \tag{45}$$
In order to make this even more closely analogous to (19), it is possible to introduce a notational device. The 'anti-commutator' of two objects $A$ and $B$ may be defined by
$$\{ A , B \} = AB + BA \tag{46}$$
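and this allows (45) to be re-written as
$$\{ \mathbf{Q} , \mathbf{Q} \} = i\,2\,\frac{d}{d\tau} . \tag{47}$$
To generalize further, more than one supercharge may be introduced, $\mathbf{Q} \to \mathbf{Q}_I$. In this case (45) is replaced by
$$\{ \mathbf{Q}_I , \mathbf{Q}_K \} = i\,2\,\delta_{IK}\,\frac{d}{d\tau} . \tag{48}$$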
In either case ((47) or (48)), it is clear that $\mathbf{Q}$ by itself does not form a closed set under the anti-commutator. Particle physicists typically set $\hbar$ and $c$ equal to one and, using these conventions, it is possible to make the identification
$$i\frac{d}{d\tau} = H \tag{49}$$
where $H$ may be identified with the energy operator of quantum mechanics. It is then possible to note
$$[\, \mathbf{Q} , H \,] = 0 , \qquad [\, H , H \,] = 0 . \tag{50}$$
Thus, using (47), (48) and (50), the set containing the elements $\mathbf{Q}_I$ and $H$ is closed. This is the analog of (19). However, in order to reach this notion of closure it is necessary to use both anti-commutators and commutators. This suggests a notational device that will simplify things. The 'schizophrenic' commutator is defined by
$$[\, A , B \,\} = AB - (-1)^{\gamma(A)\gamma(B)}\, BA \tag{51}$$
for any two quantities $A$ and $B$. The mapping operation $\gamma$ assigns to its argument the numbers 1 or 0. If we assign the values $\gamma(H) = 0$ and $\gamma(\mathbf{Q}) = 1$, then by picking $A$ and $B$ to correspond to $H$ or $\mathbf{Q}$ in all possible ways, (51) 'covers' all the results in (47), (48) and (50). The analog of the identity (21) also exists and takes the form
$$(-1)^{\gamma(A)\gamma(C)}\,[\, A , [\, B , C \,\}\,\} + (-1)^{\gamma(B)\gamma(A)}\,[\, B , [\, C , A \,\}\,\} + (-1)^{\gamma(C)\gamma(B)}\,[\, C , [\, A , B \,\}\,\} = 0 . \tag{52}$$
Closure under (51) and the satisfaction of (52) means the set containing the elements $\mathbf{Q}_I$ and $H$ describes a 'super Lie algebra.' The quantities $\mathbf{Q}_I$ also act like derivations with a slightly modified version of a Leibniz rule,
$$\mathbf{Q}_I[\,AB\,] = [\,\mathbf{Q}_I(A)\,]\,B + (-1)^{\gamma(A)\gamma(\mathbf{Q})}\,A\,[\,\mathbf{Q}_I(B)\,] \tag{53}$$
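and an example of this is provided by the following short calculation,
$$\begin{aligned}
\mathbf{Q}_I[\,\Psi X\,] &= [\,\mathbf{Q}_I(\Psi)\,]\,X + (-1)^{\gamma(\Psi)\gamma(\mathbf{Q})}\,\Psi\,[\,\mathbf{Q}_I(X)\,] \\
&= [\,\mathbf{Q}_I(\Psi)\,]\,X - \Psi\,[\,\mathbf{Q}_I(X)\,] ,
\end{aligned} \tag{54}$$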
where the second line follows from the first since fermions such as $\Psi$ are defined with $\gamma(\Psi) = 1$. Similar to the 3D rotations, the generators $\mathbf{Q}_I$ must be exponentiated, like the $L_i$ generators, to form the analogs of the rotation matrices. So we write
$$S(\epsilon^I) = \exp\big[\, i\,\epsilon^I \mathbf{Q}_I \,\big] , \tag{55}$$
and as indicated, the analog of the angle is $\epsilon^I$. Here the $\epsilon^I$ are chosen to be Grassmann numbers, i.e., $\gamma(\epsilon^I) = 1$. Any quantity $\mathcal{S}(X, \Psi)$ that depends on $X$ and $\Psi$ is 'supersymmetrically invariant' if it satisfies
$$\mathcal{S}(X', \Psi') = \mathcal{S}(X, \Psi) , \qquad S(\epsilon^I)\,\mathcal{S}(X, \Psi) = \mathcal{S}(X, \Psi) . \tag{56}$$
If $\mathcal{S}$ is an action, this equation must be interpreted to allow for the expression on the left to differ from that on the right by a boundary term. When $\epsilon^I$ is infinitesimal and $\mathcal{L}(X, \Psi)$ is a Lagrangian that leads to the action $\mathcal{S}(X, \Psi)$, the Lagrangian is a supersymmetric invariant if
$$\mathbf{Q}_I\,\mathcal{L}(X, \Psi) = \frac{d}{d\tau}\,\mathcal{F}^{sc}_I , \tag{57}$$
for some quantity $\mathcal{F}^{sc}_I$, known as the 'supercurrent.'
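The graded bracket (51) and the identity (52) hold in any associative algebra whose elements carry a parity $\gamma$. As an illustration only, the sketch below assumes a toy realization by $2\times 2$ matrices in which diagonal matrices are even ($\gamma = 0$) and off-diagonal matrices are odd ($\gamma = 1$); the names `bracket`, `even`, `odd` and `jacobi_lhs`, and the toy choices for `Q` and `H`, are hypothetical and not taken from the text.

```python
import random


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def bracket(A, B):
    """Graded bracket (51). Each argument is a pair (2x2 matrix, parity gamma)."""
    (a, ga), (b, gb) = A, B
    ab, ba = matmul(a, b), matmul(b, a)
    sign = (-1) ** (ga * gb)
    result = [[ab[i][j] - sign * ba[i][j] for j in range(2)] for i in range(2)]
    return result, (ga + gb) % 2


def even(x, y):
    return [[x, 0], [0, y]], 0


def odd(x, y):
    return [[0, x], [y, 0]], 1


def jacobi_lhs(A, B, C):
    """Left-hand side of the graded Jacobi identity (52)."""
    gA, gB, gC = A[1], B[1], C[1]
    terms = [
        ((-1) ** (gA * gC), bracket(A, bracket(B, C))[0]),
        ((-1) ** (gB * gA), bracket(B, bracket(C, A))[0]),
        ((-1) ** (gC * gB), bracket(C, bracket(A, B))[0]),
    ]
    return [[sum(s * t[i][j] for s, t in terms) for j in range(2)] for i in range(2)]


def random_element(rng, parity):
    make = odd if parity else even
    return make(rng.randint(-5, 5), rng.randint(-5, 5))


Q = odd(1, 1)    # toy supercharge
H = even(1, 1)   # toy Hamiltonian, chosen so that {Q, Q} = 2 H
```

In this toy model the bracket of `Q` with itself is an anti-commutator and returns twice `H`, the bracket of `Q` with `H` is a commutator and vanishes, and `jacobi_lhs` is the zero matrix for every choice of parities.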
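There is one more operator (of often overlooked importance) that can be appended to this algebra. It is the 'scale' or 'dilation' operator, which may be denoted by $d$. It is a bosonic operator, hence $\gamma(d) = 0$, defined so that
$$[\, d , H \,\} = i\,H , \qquad [\, d , \mathbf{Q}_I \,\} = i\,\tfrac{1}{2}\,\mathbf{Q}_I . \tag{58}$$
The first of these, together with (49), implies that
$$\Big[\, d , \frac{d}{d\tau} \,\Big\} = i\,\frac{d}{d\tau} . \tag{59}$$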
The dynamical variables $(X, \Psi)$ taken together with (58), the Jacobi identities of $d$, $H$ and $\mathbf{Q}_I$, and the definitions
$$[\, d , X \,\} = d_X\,X , \qquad [\, d , \Psi \,\} = d_\Psi\,\Psi \tag{60}$$
(where $d_X$ and $d_\Psi$ are real numbers), imply the relation
$$d_\Psi = d_X + \tfrac{1}{2} . \tag{61}$$

5. A First Look at Supersymmetric Dynamics
As a first example of supersymmetrical dynamics, it is convenient to treat the case of a free particle with coordinate $X(\tau)$ possessing unit mass. The action for such a particle is simply given by the time-integral of its kinetic energy. We will denote this as $\mathcal{S}_{SMb}$ and note
$$\mathcal{S}_{SMb} = \int d\tau\, \Big\{ \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 \Big\} = \int d\tau\; \mathcal{L}_{SMb} . \tag{62}$$
The next step is to treat this expression exactly like the one in (31), and thus one must add to $\mathcal{L}_{SMb}$ an analog of the term that was added there. A distinction between the rotational symmetry and supersymmetry can be seen, as the latter involves time derivatives while the former does not. This implies that the added term may involve time derivatives. Also, the Grassmann statistics impose the requirement that only even powers of the fermion can appear. The simplest such expression one can write, taking into account the Grassmann nature of $\Psi(\tau)$, is
$$\mathcal{L}_{SMf} = i\,k_0\,\Psi\,\frac{d}{d\tau}\Psi . \tag{63}$$
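The condition of supersymmetry invariance requires that there must exist a choice of $k_0$ and $\mathcal{F}^{sc}$ such that
$$\begin{aligned}
\frac{d}{d\tau}\mathcal{F}^{sc} &= \mathbf{Q}\,\mathcal{L}_{SM} = \mathbf{Q}\big[\, \mathcal{L}_{SMb} + \mathcal{L}_{SMf} \,\big] \\
&= \frac{dX}{d\tau}\,\frac{d}{d\tau}\big(\mathbf{Q}X\big) + i\,k_0\,\big(\mathbf{Q}\Psi\big)\frac{d\Psi}{d\tau} - i\,k_0\,\Psi\,\frac{d}{d\tau}\big(\mathbf{Q}\Psi\big) \\
&= [\,1 - k_0\,]\,\frac{dX}{d\tau}\frac{d\Psi}{d\tau} + k_0\,\Psi\,\frac{d^2X}{d\tau^2} \\
&= [\,1 - 2k_0\,]\,\frac{dX}{d\tau}\frac{d\Psi}{d\tau} + \frac{d}{d\tau}\Big[\, k_0\,\Psi\,\frac{dX}{d\tau} \,\Big] .
\end{aligned} \tag{64}$$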
This equation is satisfied if
$$k_0 = \tfrac{1}{2} , \qquad \mathcal{F}^{sc} = \tfrac{1}{2}\,\Psi\,\frac{dX}{d\tau} . \tag{65}$$
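The calculation in (64) and (65) can be repeated mechanically. The sketch below is an illustration under stated assumptions rather than the method of the text: the fields and their $\tau$-derivatives are treated as formal symbols $X^{(n)}$ (commuting) and $\Psi^{(n)}$ (anticommuting), a product is stored as a sorted tuple of such symbols, and the supercharge is extended from (43) to products by the graded Leibniz rule (53). All function names are hypothetical.

```python
def normal_order(factors):
    """Sort a product of symbols; return (sign, canonical tuple).

    A factor is (kind, n): kind 0 is the boson X^(n), kind 1 the fermion Psi^(n),
    where n counts tau-derivatives. Only fermion reorderings cost a sign.
    """
    fermions = [n for kind, n in factors if kind == 1]
    if len(set(fermions)) < len(fermions):
        return 0, ()  # a repeated Grassmann factor vanishes
    inversions = sum(
        1
        for a in range(len(fermions))
        for b in range(a + 1, len(fermions))
        if fermions[a] > fermions[b]
    )
    return (-1) ** inversions, tuple(sorted(factors))


def add(a, b, scale=1):
    out = dict(a)
    for m, c in b.items():
        out[m] = out.get(m, 0) + scale * c
    return {m: c for m, c in out.items() if c != 0}


def mul(a, b):
    out = {}
    for m1, c1 in a.items():
        for m2, c2 in b.items():
            sign, m = normal_order(m1 + m2)
            if sign:
                out[m] = out.get(m, 0) + sign * c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def derivation(rule, odd):
    """Extend a rule on single factors to products via the (graded) Leibniz rule (53)."""
    def act(a):
        out = {}
        for m, c in a.items():
            for j, factor in enumerate(m):
                fermions_before = sum(kind for kind, _ in m[:j])
                sign = (-1) ** fermions_before if odd else 1
                term = mul(mul({m[:j]: sign * c}, rule(factor)), {m[j + 1:]: 1})
                out = add(out, term)
        return out
    return act


# Rule (43) lifted to derivatives: Q X^(n) = Psi^(n), Q Psi^(n) = i X^(n+1).
Q = derivation(lambda f: {((1, f[1]),): 1} if f[0] == 0 else {((0, f[1] + 1),): 1j}, odd=True)
# Total tau-derivative: raises the derivative order of whichever factor it hits.
d_dtau = derivation(lambda f: {((f[0], f[1] + 1),): 1}, odd=False)

X1 = {((0, 1),): 1}     # dX/dtau
Psi0 = {((1, 0),): 1}   # Psi
Psi1 = {((1, 1),): 1}   # dPsi/dtau


def mismatch(k0):
    """Q(L_SMb + L_SMf) minus d/dtau (k0 Psi dX/dtau), as in (64)."""
    lagrangian = add(mul({(): 0.5}, mul(X1, X1)), mul({(): 1j * k0}, mul(Psi0, Psi1)))
    current = mul({(): k0}, mul(Psi0, X1))
    return add(Q(lagrangian), d_dtau(current), scale=-1)
```

The mismatch is empty for `k0 = 0.5`, while any other value leaves the term $[1 - 2k_0]\,\frac{dX}{d\tau}\frac{d\Psi}{d\tau}$ behind, in line with the last line of (64).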
The discussion in (64) and (65) is another example of the 'Noether Method.' Here we have simply made an ansatz for the analog of (34) and verified its validity. The two dynamical variables $X(\tau)$ and $\Psi(\tau)$ form a 'scalar supermultiplet' with a supersymmetrically invariant Lagrangian given by
$$\mathcal{L}_{SM} = \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 + i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi . \tag{66}$$
The form of this Lagrangian determines the numbers $d_X$ and $d_\Psi$. All actions $\mathcal{S}$ must satisfy
$$[\, d , \mathcal{S} \,\} = 0 , \tag{67}$$
and in the present context this implies
$$[\, d , \mathcal{L}_{SM} \,\} = \mathcal{L}_{SM} . \tag{68}$$
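An explicit calculation yields
$$\begin{aligned}
[\, d , \mathcal{L}_{SM} \,\} &= [\, d , \mathcal{L}_{SMb} \,\} + [\, d , \mathcal{L}_{SMf} \,\} \\
&= (2d_X + 2)\,\mathcal{L}_{SMb} + (2d_\Psi + 1)\,\mathcal{L}_{SMf} \\
&= \mathcal{L}_{SM} + (2d_X + 1)\,\mathcal{L}_{SMb} + 2\,d_\Psi\,\mathcal{L}_{SMf} ,
\end{aligned} \tag{69}$$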
so that $d_X = -\tfrac{1}{2}$ and $d_\Psi = 0$ in order to satisfy the condition in (68). There is one other feature to note about (66). The functions $X(\tau)$ and $\Psi(\tau)$ are real, so how does the factor of '$i$' appear in such a manner that the Lagrangian is real? A way to accomplish this requires a modification of the 'complex conjugation' operation that we typically denote by $*$. A modified definition may be taken to be
$$* = \mathcal{C} \otimes \mathcal{O} , \qquad \mathcal{C}:\ i \leftrightarrow -i \ \text{in all factors} , \qquad \mathcal{O}:\ \text{invert the order of all monomials in all factors} . \tag{70}$$
To see how this definition of the $*$-operation solves the problem, it suffices to consider a brief calculation on the term of concern:
$$\Big[\, i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi \,\Big]^* = -\,i\,\tfrac{1}{2}\,\Big[\frac{d}{d\tau}\Psi\Big]^*\,\Psi^* = -\,i\,\tfrac{1}{2}\,\Big[\frac{d}{d\tau}\Psi\Big]\,\Psi = i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi . \tag{71}$$
We refer to this modified conjugation as 'superspace conjugation,' although it is often referred to as 'hermitian conjugation' (even though there are neither matrices nor expectation values involved). The 'supercurrent' is not a supersymmetrical invariant. However, it does possess some interesting properties. These can be seen by calculating $\mathbf{Q}\mathcal{F}^{sc}$,
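$$\mathbf{Q}\,\mathcal{F}^{sc} = i\,\Big[\, \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 + i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi \,\Big] = i\,\mathcal{L}_{SM} , \tag{72}$$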
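so that
$$\mathbf{Q}\mathbf{Q}\,\mathcal{F}^{sc} = i\,\frac{d}{d\tau}\,\mathcal{F}^{sc} .$$
Thus we find the infinite sequence of relations
$$\mathbf{Q}^{2n}\,\mathcal{F}^{sc} = H^n\,\mathcal{F}^{sc} , \tag{73}$$
$$\mathbf{Q}^{2n+1}\,\mathcal{F}^{sc} = i\,H^n\,\mathcal{L}_{SM} . \tag{74}$$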
A consequence of this derivation is that we can form an entire family of supersymmetrical invariants via
$$\mathcal{L}^{(p)}_{SM} = \tfrac{1}{2}\,K_p\,\Big\{ \Big[\frac{d^{p+1}X}{d\tau^{p+1}}\Big]^2 + i\,\frac{d^{p}\Psi}{d\tau^{p}}\,\frac{d^{p+1}\Psi}{d\tau^{p+1}} \Big\} \tag{75}$$
for any positive integer $p$. The quantity $K_p$ is simply a constant. The existence of (66) and (75) facilitates a convenient demonstration of the role of the operator $d$. The abelian operator assigns a 'scale dimension'
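to every quantity which appears in a supersymmetrical theory. Using the rules in (59), (60) and (61),
$$[\, d , \mathcal{L}^{(p)}_{SM} \,\} = (2d_X + 2p + 2 + d_{K_p})\,\mathcal{L}^{(p)}_{SM} = \mathcal{L}^{(p)}_{SM} + (2p + d_{K_p})\,\mathcal{L}^{(p)}_{SM} , \tag{76}$$
and in order to satisfy (67) we require $d_{K_p} = -2p$.

A second supermultiplet, the 'fermionic multiplet,' consists of a fermion $\eta(\tau)$ and a boson $F(\tau)$ with the supersymmetry variations
$$\mathbf{Q}\,\eta(\tau) = -\,i\,F(\tau) , \qquad \mathbf{Q}\,F(\tau) = -\,\frac{d}{d\tau}\,\eta(\tau) . \tag{77}$$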
The observant reader at this point ought to detect what appears to be a swindle. The variations in (77) can simply be obtained from those in (43) by the replacements
$$\Psi(\tau) \to \eta(\tau) , \qquad \frac{d}{d\tau}X(\tau) \to -\,F(\tau) . \tag{78}$$
This observation immediately implies that a supersymmetrically invariant Lagrangian is given by
$$\mathcal{L}_{FM} = i\,\tfrac{1}{2}\,\eta\,\frac{d}{d\tau}\eta + \tfrac{1}{2}\,F^2 , \tag{79}$$
which is found by performing the substitution of (78) on the Lagrangian in (66)! The form of this Lagrangian fixes $d_\eta = 0$ and $d_F = \tfrac{1}{2}$. The transformation in (77) has been given the name of an 'automorphic duality transformation' and will be the subject of later discussion. Notice that higher order invariants of the form
$$\mathcal{L}^{(p)}_{FM} = \tfrac{1}{2}\,\widetilde{K}_p\,\Big\{ i\,\frac{d^{p}\eta}{d\tau^{p}}\,\frac{d^{p+1}\eta}{d\tau^{p+1}} + \Big[\frac{d^{p}F}{d\tau^{p}}\Big]^2 \Big\} \tag{80}$$
for any integer $p$ greater than one, with $d_{\widetilde{K}_p} = -2p$, also exist. The reason the fermionic multiplet is of interest stems from the fact that it can be used together with the scalar multiplet to form a new supersymmetrically invariant Lagrangian. By beginning with the term $i\,\eta\,\Psi\,W'(X)$
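and using the 'Noether Method,' it is simple to show that the Lagrangian given by
$$\mathcal{L}_{fm\text{-}sm} = i\,\eta\,\Psi\,W'(X) + F\,W(X) , \tag{81}$$
satisfies
$$\mathbf{Q}\,\mathcal{L}_{fm\text{-}sm} = \frac{d}{d\tau}\big[\, \eta\,W(X) \,\big] \tag{82}$$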
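for arbitrary functions $W(X)$. This is precisely what is needed to introduce conservative forces acting on the particle. Consider now a total Lagrangian given by
$$\begin{aligned}
\mathcal{L}_{TOT}(X, \Psi, \eta, F) &= \mathcal{L}_{SM} + \mathcal{L}_{FM} + \mathcal{L}_{fm\text{-}sm} \\
&= \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 + i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi + i\,\tfrac{1}{2}\,\eta\,\frac{d}{d\tau}\eta + \tfrac{1}{2}\,F^2 + i\,\eta\,\Psi\,W'(X) + F\,W(X) ,
\end{aligned} \tag{83}$$
that implies an equation of motion for $F$ given by $F = -W(X)$. This can be substituted into the Lagrangian (while the fermions are set to zero for convenience) to show
$$\mathcal{L}_{TOT}(X, 0, 0, F = -W) = \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 - \tfrac{1}{2}\,\big[\, W(X) \,\big]^2 . \tag{84}$$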
In supersymmetrical theories, functions whose equations of motion are algebraic, as above, are called 'auxiliary fields.' However, note that this definition of an auxiliary field depends on the form of the action. If any of the actions in (80) had been added to (83), then the equation of motion for $F$ would not be algebraic. The Lagrangian in (84) is called an 'on-shell Lagrangian' to distinguish it from the 'off-shell Lagrangian' in (83). The motivation for these names can be seen by the following argument. Since the condition $F = -W(X)$ is imposed, it is possible to calculate the effect of $\mathbf{Q}$ on this condition,
$$\mathbf{Q}\big[\, F + W(X) \,\big] = -\,\frac{d}{d\tau}\,\eta + W'(X)\,\Psi = 0 . \tag{85}$$
However, this is nothing but the equation of motion for $\eta$ that follows from the variation of the action, i.e., $\eta$ must satisfy its equation of motion and is thus 'on-shell.' This situation is not exceptional. For most of the interesting theories known now for almost forty years, only their on-shell Lagrangians are available.

A feature apparent in (84) is that the particle with coordinate $X(\tau)$ now possesses a potential energy, $U(X)$, that is given by
$$U(X) = \tfrac{1}{2}\,\big[\, W(X) \,\big]^2 \tag{86}$$
and for any real function $W(X)$ (called the 'superpotential') the potential energy function is bounded from below by zero. This is one of the interesting properties generically seen in supersymmetrical theories. They tend to possess a natural lowest energy state that can be defined as the 'vacuum.' It is now useful to restore the fermions in (84) and simultaneously restrict the form of the superpotential to that of a quadratic form,
$$\begin{aligned}
W(X) &= A_0 + B_0 X + \tfrac{1}{2}\,C_0 X^2 , \\
\to\quad U(X) &= \tfrac{1}{2}\,\big[\, A_0 + B_0 X + \tfrac{1}{2}\,C_0 X^2 \,\big]^2 , \\
\to\quad W'(X) &= B_0 + C_0 X ,
\end{aligned} \tag{87}$$
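which then leads to the Lagrangian
$$\mathcal{L}_{TOT}(X, \Psi, \eta, F = -W) = \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 - \tfrac{1}{2}\,\big[\, A_0 + B_0 X + \tfrac{1}{2}\,C_0 X^2 \,\big]^2 + i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi + i\,\tfrac{1}{2}\,\eta\,\frac{d}{d\tau}\eta + i\,\eta\,\Psi\,\big[\, B_0 + C_0 X \,\big] . \tag{88}$$
The question of the vacuum for this theory rests upon the solutions to the equation $U(X) = 0$, and since we have arbitrarily chosen the superpotential to be a quadratic form, this can be analyzed in some detail. The vacuum condition becomes
$$0 = A_0 + B_0\,\langle X\rangle_\pm + \tfrac{1}{2}\,C_0\,\big(\langle X\rangle_\pm\big)^2 \quad\to\quad \langle X\rangle_\pm = \frac{1}{C_0}\Big[ -B_0 \pm \sqrt{B_0^2 - 2A_0C_0} \,\Big] . \tag{89}$$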
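The vacuum condition (89) is easy to check numerically. The sketch below is illustrative only: the function names and the sample values of $A_0$, $B_0$, $C_0$ are assumptions chosen for the example, and `vacua` requires $C_0 \neq 0$.

```python
import cmath


def W(x, a0, b0, c0):
    """Quadratic superpotential of (87)."""
    return a0 + b0 * x + 0.5 * c0 * x * x


def U(x, a0, b0, c0):
    """Potential energy (86): U = W^2 / 2."""
    return 0.5 * W(x, a0, b0, c0) ** 2


def vacua(a0, b0, c0):
    """The two solutions <X>_+ and <X>_- of (89); c0 must be non-zero."""
    root = cmath.sqrt(b0 * b0 - 2.0 * a0 * c0)
    return ((-b0 + root) / c0, (-b0 - root) / c0)


# Sample parameters: W(X) = (X - 1)(X - 1.5), so the vacua sit at 1.5 and 1.
x_plus, x_minus = vacua(1.5, -2.5, 2.0)
```

Both returned values make `U` vanish. `cmath.sqrt` is used so that the sketch also returns a pair of complex roots when $B_0^2 < 2A_0C_0$, in which case no real $X$ solves $U(X) = 0$.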
This implies that a non-vanishing vacuum expectation value (VEV) is permitted. In fact, as long as $A_0$ is nonzero, there are two equivalent such values. To define a perturbative theory about these vacua, it is appropriate to define a new dynamical bosonic variable via the equation
$$X = Y + \langle X\rangle_\pm . \tag{90}$$
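Written in terms of $Y$ the Lagrangian becomes
$$\begin{aligned}
\mathcal{L}_{TOT}(Y, \Psi, \eta, F = -W) ={}& \tfrac{1}{2}\Big[\frac{dY}{d\tau}\Big]^2 - \tfrac{1}{2}\,\big[\, B_0 + C_0\langle X\rangle_\pm \,\big]^2\,Y^2 - \tfrac{1}{2}\,C_0\,\big(B_0 + C_0\langle X\rangle_\pm\big)\,Y^3 - \tfrac{1}{8}\,C_0^2\,Y^4 \\
& + i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi + i\,\tfrac{1}{2}\,\eta\,\frac{d}{d\tau}\eta + i\,\eta\,\Psi\,\big[\, \big(B_0 + C_0\langle X\rangle_\pm\big) + C_0\,Y \,\big] .
\end{aligned} \tag{91}$$
In this form a number of features become apparent. Foremost, in the supersymmetric vacuum, it is seen that, in the setting of field theory, the dynamical boson $X(\tau)$ and the remaining dynamical fermionic degree of freedom described by the $\Psi$-$\eta$ pair both possess the same 'mass' (here we mean mass in the sense of field theory),
$$M_X = M_{\Psi\eta} = B_0 + C_0\,\langle X\rangle_\pm . \tag{92}$$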
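The coefficients in (91) follow from expanding $U = \tfrac12 W^2$ about a vacuum, and the expansion can be reproduced with a short polynomial multiplication. This is only a sketch: the function names and the sample parameters are assumptions, and the shift $x_0$ is taken to be one of the roots of (89).

```python
def poly_mul(p, q):
    """Multiply two polynomials stored as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def shifted_potential(a0, b0, c0, x0):
    """Coefficients [u0, ..., u4] of U(Y + x0) = W(Y + x0)^2 / 2 in powers of Y."""
    # W(Y + x0) = W(x0) + W'(x0) Y + (c0 / 2) Y^2 for the quadratic form (87).
    w = [a0 + b0 * x0 + 0.5 * c0 * x0 ** 2, b0 + c0 * x0, 0.5 * c0]
    return [0.5 * c for c in poly_mul(w, w)]


a0, b0, c0 = 1.5, -2.5, 2.0   # sample values; x0 = 1.5 solves (89) for them
x0 = 1.5
mass = b0 + c0 * x0           # M_X of (92)
u = shifted_potential(a0, b0, c0, x0)
```

For these values the list `u` has no constant or linear term, its quadratic coefficient is `mass**2 / 2`, its cubic coefficient is `c0 * mass / 2` and its quartic coefficient is `c0**2 / 8`, matching the bosonic terms of (91). The same `mass` multiplies $i\,\eta\,\Psi$ in the fermionic term.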
Thus in supersymmetrical theories, generically, there is a mass degeneracy among bosons and fermions. Furthermore, the Yukawa coupling $f$ is related to the quartic self-coupling constant $\lambda$ of the boson via the relation
$$\sqrt{\tfrac{1}{3}\,\lambda} = f = C_0 . \tag{93}$$
Finally, the bosonic cubic self-coupling $\tilde{f}$ is related to both the mass and the Yukawa coupling constant via
$$\tilde{f} = 3\,M_X\,f , \tag{94}$$
and the parameter space is completely determined by $M_X$ and $\lambda$. This model provides a lot of insight into dynamical systems which possess supersymmetry. Generically such systems possess mass-degenerate boson-fermion pairs and interactions where coupling constants are related to one another in specific ways. One of the most remarkable features of these attributes is that in a full relativistic quantum field theory, they are preserved. Renormalization does not disturb the mass degeneracy nor the interrelationship among coupling constants! There is one other interesting point to be made about supersymmetric dynamics. As noted above, the equation of motion for the quantity $F$ is given by $F = -W$, which implies via (77) and the infinitesimal version of (55)
$$\delta_Q\,\eta(\tau) = i\,\epsilon\,\mathbf{Q}\,\eta(\tau) = \epsilon\,F = -\,\big[\, A_0 + B_0 X + \tfrac{1}{2}\,C_0 X^2 \,\big]\,\epsilon . \tag{95}$$
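Vacuum values of this equation may be calculated and lead to
$$\langle \delta_Q\,\eta \rangle = \epsilon\,\langle F \rangle = -\,\big\langle\, A_0 + B_0 X + \tfrac{1}{2}\,C_0 X^2 \,\big\rangle\,\epsilon . \tag{96}$$
Now the vacuum value of $X$ is defined by $\langle W \rangle = 0$ and so the r.h.s. vanishes in general. However, if the constants $B_0 = C_0 = 0$, then
$$\langle F \rangle = -\,A_0 \quad\leftrightarrow\quad \langle \delta_Q\,\eta(\tau) \rangle = -\,A_0\,\epsilon . \tag{97}$$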
In the present context, $\epsilon$ is a constant. However, just as the group SO(3) can be gauged, so too can supersymmetry. For such a further extension, called 'supergravity,' the quantity $\epsilon$ would become a function of $\tau$. In this case, a supersymmetry transformation could be used to set $\eta(\tau)$ to zero. This is a signal that supersymmetry can be spontaneously broken, and in such a situation $\eta$ is known as the 'Goldstino,' a fermionic analog of the 'Goldstone Boson.'

6. A First Look at Superfields

In the previous discussion, the attributes of supersymmetrical theories were described by use of functions (both Grassmann-valued and ordinary) of a single time-like parameter $\tau$. Such a mode of description provides the most direct way to compare the dynamics of supersymmetrical systems to those of non-supersymmetrical ones. However, there is a much more powerful method available for describing such systems. The method of 'superspace' was proposed by Salam and Strathdee [16]. It is equivalent to the previous description but has the advantage of making supersymmetry obvious at all steps of a calculation. Moreover, in complicated systems involving consideration of relativistic quantum field theory, superspace methods are, by far, the most computationally efficient ways to carry out calculations. 'Superspace' is to supersymmetrical theories as Minkowski space is to relativistic theories. Superspace provides the natural setting in which to describe supersymmetrical theories.

The basic idea of superspace is similar to that of Minkowski space. For the latter, the temporal coordinate is joined to spatial coordinates to create 4-vectors in Minkowski space. In 1D superspace, the temporal coordinate is joined with one (or more) 'Grassmann coordinates.' In general a number of such objects (i.e., $\theta^{(\alpha)}$, $(\alpha) = (1), (2), (3), \dots, (N)$) can be introduced such that
$$\theta^{(\alpha)}\,\theta^{(\beta)} = -\,\theta^{(\beta)}\,\theta^{(\alpha)} , \tag{98}$$
and furthermore may be regarded as real quantities for the sake of simplicity,
$$\big[\theta^{(\alpha)}\big]^* = \theta^{(\alpha)} . \tag{99}$$
The 'coordinates' of points in 1D superspace are thus of the form
$$Z^K = (\tau , \theta) , \tag{100}$$
as originally suggested by Salam and Strathdee. For the sake of simplicity, we choose (for now) $N = 1$ and can thus 'drop' the superscript on $\theta$. A superfield is simply a function defined over the superspace. If $\mathcal{B}(Z)$ is such a real function, then consideration of a Taylor series representation implies that it can be written as
$$\mathcal{B}(Z) = X(\tau) + i\,\theta\,\Psi(\tau) . \tag{101}$$
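In writing this expression, it was assumed that $\mathcal{B}(Z)$ is a bosonic function. It is simple to also consider a real fermionic superfield $\mathcal{F}(Z)$ for which a similar expansion takes the form
$$\mathcal{F}(Z) = \eta(\tau) + \theta\,F(\tau) . \tag{102}$$
(Note that reality is defined with respect to the $*$-operation in (70).) With the introduction of superspace and superfields, the supersymmetry generator can be realized as a differential operator. This is seen by noting the following formulae,
$$-\,i\Big[\frac{\partial}{\partial\theta} + i\,\theta\,\frac{\partial}{\partial\tau}\Big]\mathcal{B}(Z) = \Psi(\tau) + \theta\,\frac{d}{d\tau}X , \qquad -\,i\Big[\frac{\partial}{\partial\theta} + i\,\theta\,\frac{\partial}{\partial\tau}\Big]\mathcal{F}(Z) = -\,i\,F + \theta\,\frac{d}{d\tau}\eta . \tag{103}$$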
Upon comparing the results on the r.h.s. above with those in (43) and (77), it is seen that they coincide with the identification
$$Q = -\,i\Big[\frac{\partial}{\partial\theta} + i\,\theta\,\frac{\partial}{\partial\tau}\Big] , \tag{104}$$
so that calculating the 'schizophrenic' commutator of this differential operator with itself yields (47). It should be noted that we distinguish the abstract operator $\mathbf{Q}$ from the differential operator $Q$ by removal of the boldface type. There is one other differential operator that plays an important role in superspace formulations of supersymmetrical theories. It is often called the 'supersymmetry covariant derivative.' Since it is actually invariant with respect to the supersymmetry generator, it is more accurate to call this operator the 'supersymmetry invariant fermionic derivative.' We denote it by $D$ and define it as
$$D = \Big[\frac{\partial}{\partial\theta} - i\,\theta\,\frac{\partial}{\partial\tau}\Big] , \tag{105}$$
and a simple calculation reveals
$$[\, D , D \,\} = -\,i\,2\,\frac{\partial}{\partial\tau} , \qquad [\, Q , D \,\} = 0 , \qquad \Big[\, \frac{\partial}{\partial\tau} , D \,\Big\} = 0 . \tag{106}$$
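The first two relations in (106) can be verified by letting the differential operators act on a generic superfield $a(\tau) + \theta\,b(\tau)$. The sketch below is an illustration under stated assumptions: the components are stand-in polynomials stored as coefficient lists, $\theta$ is always kept to the left of $b$, and the sign of the $\theta\,\partial/\partial\tau$ term in `D` is the one fixed by requiring $[D, D\} = -i\,2\,\partial/\partial\tau$. The helper names are hypothetical.

```python
def d_dtau(p):
    """Derivative of a polynomial stored as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def trim(p):
    """Drop trailing zeros so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def times(s, p):
    return trim([s * c for c in p])


def plus(p, q):
    n = max(len(p), len(q))
    p, q = list(p) + [0] * (n - len(p)), list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])


def D(sf):
    """(105): (d/dtheta - i theta d/dtau)(a + theta b) = b - i theta a'."""
    a, b = sf
    return (trim(b), times(-1j, d_dtau(a)))


def Q(sf):
    """(104): -i (d/dtheta + i theta d/dtau)(a + theta b) = -i b + theta a'."""
    a, b = sf
    return (times(-1j, b), trim(d_dtau(a)))


def minus_i_d_dtau(sf):
    return tuple(times(-1j, d_dtau(f)) for f in sf)


superfield = ([1, 2, 3], [0, 4])  # a(tau) = 1 + 2 tau + 3 tau^2, b(tau) = 4 tau
```

Here `D(D(superfield))` equals `minus_i_d_dtau(superfield)`, so that $[D, D\} = 2D^2 = -i\,2\,\partial/\partial\tau$, and the sum of `Q(D(superfield))` and `D(Q(superfield))` vanishes component by component.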
The utility of this operator is revealed when dynamical issues are discussed. Prior to doing so, there is one other important matter to treat: establishing an appropriate integration theory. There is also another feature of superspace conjugation that is now apparent. The coordinates being used here, $\tau$ and $\theta$, are both real, satisfying
$$(\theta)^* = \theta , \qquad (\tau)^* = \tau . \tag{107}$$
Nevertheless, the only consistent way in which the operators $Q$ and $D$ can have the definite properties under superspace conjugation given by
$$Q^* = +\,Q , \qquad D^* = -\,D \tag{108}$$
is if
$$\Big(\frac{\partial}{\partial\theta}\Big)^* = -\,\frac{\partial}{\partial\theta} . \tag{109}$$

6.1. Superspace integration
The concept of integration over Grassmann numbers was first introduced by Berezin [17]. It is useful to review this. Consider the superfunction $\mathcal{B}(Z)$ dependent on the single coordinate $\tau$ and a single Grassmann coordinate $\theta$. Then Berezin's definition of integration implies
$$\int d\theta\; \mathcal{B}(Z) = i\,\Psi(\tau) , \tag{110}$$
which is the same as
$$\frac{\partial}{\partial\theta}\,\mathcal{B}(Z) = i\,\Psi(\tau) . \tag{111}$$
Via this definition, integration over Grassmann numbers is operationally the same as differentiation! We are therefore free to define the superspace integral over a superfield Lagrangian $\mathcal{L}$ by
$$\int d\theta\; \mathcal{L} = \lim_{\theta \to 0} D\,\mathcal{L} \equiv D\,\mathcal{L}\,| . \tag{112}$$
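Berezin's rule can be illustrated with a small routine for the left derivative with respect to a Grassmann generator. This is a sketch only: the dictionary representation (sorted index tuple to coefficient) and the names `d_dtheta` and `berezin` are assumptions made for the example, and the fermion $\Psi$ is modelled as a number times a second generator.

```python
def d_dtheta(f, k):
    """Left derivative with respect to generator k of f = {sorted index tuple: coefficient}."""
    out = {}
    for mono, c in f.items():
        if k in mono:
            pos = mono.index(k)  # generator k must first hop over `pos` generators
            rest = mono[:pos] + mono[pos + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** pos * c
    return out


berezin = d_dtheta  # Berezin's rule: integrating over a generator is the same operation

# B = X + i theta Psi with theta = generator 1 and Psi = 3 * (generator 2), X = 2.
B = {(): 2.0, (1, 2): 3j}
```

Here `berezin(B, 1)` returns `{(2,): 3j}`, i.e. $i\,\Psi$, in agreement with (110) and (111): the $\theta$-independent term integrates to zero and the coefficient of $\theta$ is picked out.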
The reason this will work to form a supersymmetrical action is that the term in the $\theta$-expansion with the highest power of $\theta$ always transforms into the time derivative of something lower in the expansion,
$$\delta_Q\Big( \int d\theta\; \mathcal{L} \Big) = \text{total div} . \tag{113}$$
The action is effectively invariant because this equation shows us that the action (i.e., $\mathcal{S} = \int d\tau \int d\theta\, \mathcal{L}$) only changes by total time derivative terms, which are usually ignored anyway. There is one more labor-saving technique in dealing with superfields. From (101) and (102) it is seen that the functions $X(\tau)$, $\Psi(\tau)$, $\eta(\tau)$ and $F(\tau)$ arise from the Taylor series expansion in the $\theta$-coordinate of superspace. However, with the introduction of the $D$-operator, there is an alternative way to view their origins. It is a matter of a few short calculations to convince oneself that
$$\begin{aligned}
X(\tau) &= \lim_{\theta \to 0} \mathcal{B}(Z) \equiv \mathcal{B}(Z)\,| , & \eta(\tau) &= \lim_{\theta \to 0} \mathcal{F}(Z) \equiv \mathcal{F}(Z)\,| , \\
\Psi(\tau) &= -\,i \lim_{\theta \to 0} D\,\mathcal{B}(Z) \equiv -\,i\,D\,\mathcal{B}(Z)\,| , & F(\tau) &= \lim_{\theta \to 0} D\,\mathcal{F}(Z) \equiv D\,\mathcal{F}(Z)\,| .
\end{aligned} \tag{114}$$
This method of defining component fields is called 'components by projection.' Superspace conjugation applied to these equations is consistent, keeping in mind that the re-ordering rule applies even to operators. Acting on the real superfields introduced above we have $\mathcal{B}^* = \mathcal{B}$ and $\mathcal{F}^* = \mathcal{F}$, and for their first fermionic derivatives,
$$\big[\, D\,\mathcal{B} \,\big]^* = -\,D\,\mathcal{B} , \qquad \big[\, D\,\mathcal{F} \,\big]^* = D\,\mathcal{F} . \tag{115}$$
These equations are verified by performing the calculation at the level of component results on the far l.h.s. of these expressions and then comparing those to the calculation at the level of component results on the far r.h.s. of these expressions.

6.2. Superspace actions & dynamics
We are now positioned to discuss the dynamics of our supersymmetrical system totally at the level of superfields. We begin by making contact with the Lagrangian (and associated action) in (66). As will be clear shortly, the expression given by
$$\mathcal{S}_{SM} = \int d\tau\, d\theta\; \Big\{ i\,\tfrac{1}{2}\,\Big[\frac{\partial \mathcal{B}}{\partial\tau}\Big]\, D\,\mathcal{B} \Big\} \tag{116}$$
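is the corresponding superspace action. Using Berezin's definition of superspace integration together with the method of components by projection yields the following calculation,
$$\begin{aligned}
\mathcal{S}_{SM} &= \int d\tau\, d\theta\; \Big\{ i\,\tfrac{1}{2}\,\Big[\frac{\partial \mathcal{B}}{\partial\tau}\Big]\, D\,\mathcal{B} \Big\} = \int d\tau\; D\,\Big\{ i\,\tfrac{1}{2}\,\Big[\frac{\partial \mathcal{B}}{\partial\tau}\Big]\, D\,\mathcal{B} \Big\}\Big| \\
&= i\,\tfrac{1}{2} \int d\tau\; \Big\{ \Big[ D\,\frac{\partial \mathcal{B}}{\partial\tau} \Big]\, D\,\mathcal{B} + \Big[\frac{\partial \mathcal{B}}{\partial\tau}\Big]\, D^2\,\mathcal{B} \Big\}\Big| \\
&= i\,\tfrac{1}{2} \int d\tau\; \Big\{ \Big[ \frac{\partial}{\partial\tau}\, D\,\mathcal{B} \Big]\, D\,\mathcal{B} - i\,\Big[\frac{\partial \mathcal{B}}{\partial\tau}\Big]^2 \Big\}\Big| \\
&= \int d\tau\; \Big\{ \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 + i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi \Big\} .
\end{aligned} \tag{117}$$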
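These manipulations can be checked by direct use of the Berezin prescription. We first calculate the component expansion of the superfield integrand of (116),
$$i\,\tfrac{1}{2}\,\Big[\frac{\partial \mathcal{B}}{\partial\tau}\Big]\, D\,\mathcal{B} = i\,\tfrac{1}{2}\,\Big[ \frac{dX}{d\tau} + i\,\theta\,\frac{d\Psi}{d\tau} \Big]\Big[ i\,\Psi - i\,\theta\,\frac{dX}{d\tau} \Big] = -\,\tfrac{1}{2}\,\Psi\,\frac{dX}{d\tau} + \theta\,\Big[ \tfrac{1}{2}\Big[\frac{dX}{d\tau}\Big]^2 + i\,\tfrac{1}{2}\,\Psi\,\frac{d}{d\tau}\Psi \Big] = -\,\mathcal{F}^{sc} + \theta\,\mathcal{L}_{SM} , \tag{118}$$
and observe that (up to a sign) the supercurrent is the lowest component field and the Lagrangian is the highest component field. By use of similar arguments, it can be shown that the superfield actions that correspond to (79) and (81) are respectively given by
$$\mathcal{S}_{FM} = \int d\tau\, d\theta\; \Big\{ \tfrac{1}{2}\,\mathcal{F}\, D\,\mathcal{F} \Big\} , \tag{119}$$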
$$\mathcal{S}_{fm\text{-}sm} = \int d\tau\, d\theta\; \Big\{ \mathcal{F}\, W(\mathcal{B}) \Big\} . \tag{120}$$
Thus, there is a total superspace action describing the dynamics of the system described by (83). We simply write
$$\mathcal{S}_{TOT} = \mathcal{S}_{SM} + \mathcal{S}_{FM} + \mathcal{S}_{fm\text{-}sm} , \tag{121}$$
using the superfield actions of (116), (119) and (120). Like ordinary functions, the calculus of variations may be applied to superfields. Application of standard variational principles to $\mathcal{S}_{TOT}$ yields
$$0 = -\,i\,\frac{\partial}{\partial\tau}\, D\,\mathcal{B} + \mathcal{F}\, W'(\mathcal{B}) , \qquad 0 = D\,\mathcal{F} + W(\mathcal{B}) , \tag{122}$$
as the results for variations with respect to $\mathcal{B}$ and $\mathcal{F}$, respectively. Using the method of components-by-projection, the component level equations of motion are recovered.

7. Is Supersymmetry Part of Our Cosmos?

In the previous discussions, we have seen the basic mathematical structure of supersymmetry illustrated within the mathematical confines of 1D simple supersymmetry and superspace. These may be regarded as interesting mathematical diversions. However, there is a very distinct possibility that these mathematical concepts have something to do with our universe. The Standard Model provides the most detailed and best tested theoretical description of the universe presently available to our species. The constituents of the Standard Model consist of quarks that appear in six 'flavors,' each of three 'colors,' providing representations of SU(3),
$$Q_1 = \begin{pmatrix} u_1 & u_2 & u_3 \\ d_1 & d_2 & d_3 \end{pmatrix}_{L+R} , \qquad Q_2 = \begin{pmatrix} c_1 & c_2 & c_3 \\ s_1 & s_2 & s_3 \end{pmatrix}_{L+R} , \qquad Q_3 = \begin{pmatrix} t_1 & t_2 & t_3 \\ b_1 & b_2 & b_3 \end{pmatrix}_{L+R} , \tag{123}$$
three doublets of left-handed leptons providing representations of SU(2),
$$L_1 = \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L , \qquad L_2 = \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix}_L , \qquad L_3 = \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix}_L , \tag{124}$$
and six (presumed singlets) providing representations of U(1),
$$R_{\nu_e} = \nu_{e\,R} , \quad R_e = e_R , \quad R_{\nu_\mu} = \nu_{\mu\,R} , \quad R_\mu = \mu_R , \quad R_{\nu_\tau} = \nu_{\tau\,R} , \quad R_\tau = \tau_R . \tag{125}$$
A count of the number of independent fermions yields the number 24. As each of these is massive, we must multiply by four to find there are 96 fermionic degrees of freedom ($d_F(\mathrm{SM}) = 96$) appearing in the Standard Model. These fermionic particles constitute the known matter in our universe.
[Figure: Elementary Particles, arranged in Three Families of Matter (I, II, III)]
Fig. 1. The Constituents of The Standard Model.
On the other hand, the number of bosons includes eight massless Gluons, one massive neutral Intermediate Vector Boson, two massive charged Intermediate Vector Bosons and a massless photon,
$$G^{i}_a , \quad Z^0_a , \quad W^+_a , \quad W^-_a , \quad A_a . \tag{126}$$
Counting the number of bosonic degrees of freedom ($d_B$) is a little trickier. Each massless boson ($G^{i}_a$ and $A_a$) possesses two degrees of freedom, while each massive boson ($Z^0_a$, $W^+_a$, $W^-_a$) possesses three degrees of freedom. We should also include one more particle, the Higgs Boson $H$, though at the time of writing it has not been experimentally observed. This leads to a total of twenty-eight bosonic degrees of freedom ($8 \times 2 + 1 \times 2 + 3 \times 3 + 1 = 28$, i.e., $d_B(\mathrm{SM}) = 28$) appearing in the Standard Model. These constitute the known forces or energies that bind the matter particles into nucleons, atoms, molecules, compounds, etc. Since $d_B \neq d_F$, there is no sign of equal numbers of bosonic versus fermionic degrees of freedom, a hallmark of supersymmetry. The fermions in the Standard Model are simply far too numerous!

The Standard Model is more than simply a list of the elementary particles ... it is a Lagrangian $\mathcal{L}_{SM}$ describing their dynamics! In analogy to the experiment described near equation (31), we can imagine that all measurements today have not given a complete view of the universe. In the fanciful experiment, the measurement was made only along the $x$-axis. In the real world of the Standard Model, the limiting factors are the limits of current technological capabilities that permit the probing of fundamental physics. If supersymmetry were an exact symmetry of our universe, then it would be natural to expect that $d_B = d_F$. This requires the 'conjuration' of new, previously unseen particles.

'Summoning' the existence of previously unknown particles has a long history in this field, though it may seem to be an act of desperation with regard to the idea of supersymmetry. Perhaps the earliest historical precedent is given by the electron. Today it seems natural not to question the existence of the electron, as most communications and computer technology could not be created without knowing about it. However, there was a time when humanity was unfamiliar with this concept. This was changed in 1874 by George Stoney, who 'summoned' the existence of the electron as a basis for explaining the processes of (what is now called) electro-chemistry. As this has proven to be a most profitable sort of conjuration for our species, we may hope this same path might be so for 'superpartners' (new forms of matter and energy discussed below). If supersymmetry is present, at some level in our universe, then it must be the case that the operator $\mathbf{Q}$ exists in our universe. In this case, it can be used to deduce the presence of as-yet-unseen particles. In a schematic
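way the following equation must be valid,
$$\mathbf{Q}\,Q_i \sim \widetilde{Q}_i , \tag{127}$$
where each quantity $\widetilde{Q}_i$ is of the form
$$\widetilde{Q}_1 = \begin{pmatrix} \tilde{u}_1 & \tilde{u}_2 & \tilde{u}_3 \\ \tilde{d}_1 & \tilde{d}_2 & \tilde{d}_3 \end{pmatrix}_{L+R} , \qquad \widetilde{Q}_2 = \begin{pmatrix} \tilde{c}_1 & \tilde{c}_2 & \tilde{c}_3 \\ \tilde{s}_1 & \tilde{s}_2 & \tilde{s}_3 \end{pmatrix}_{L+R} , \qquad \widetilde{Q}_3 = \begin{pmatrix} \tilde{t}_1 & \tilde{t}_2 & \tilde{t}_3 \\ \tilde{b}_1 & \tilde{b}_2 & \tilde{b}_3 \end{pmatrix}_{L+R} . \tag{128}$$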
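Although similar in appearance to (123), there is one important difference here. While in (123) each matrix entry denoted four fermion states, in (128) each matrix entry instead denotes four boson states, called 'squarks.' This same process is applied to all the leptons,
$$\mathbf{Q}\,L_i \sim \widetilde{L}_i , \qquad \mathbf{Q}\,R_i \sim \widetilde{R}_i , \tag{129}$$
which leads to
$$\widetilde{L}_1 = \begin{pmatrix} \tilde{\nu}_e \\ \tilde{e} \end{pmatrix}_L , \qquad \widetilde{L}_2 = \begin{pmatrix} \tilde{\nu}_\mu \\ \tilde{\mu} \end{pmatrix}_L , \qquad \widetilde{L}_3 = \begin{pmatrix} \tilde{\nu}_\tau \\ \tilde{\tau} \end{pmatrix}_L , \tag{130}$$
$$\widetilde{R}_{\nu_e} = \tilde{\nu}_{e\,R} , \quad \widetilde{R}_e = \tilde{e}_R , \quad \widetilde{R}_{\nu_\mu} = \tilde{\nu}_{\mu\,R} , \quad \widetilde{R}_\mu = \tilde{\mu}_R , \quad \widetilde{R}_{\nu_\tau} = \tilde{\nu}_{\tau\,R} , \quad \widetilde{R}_\tau = \tilde{\tau}_R . \tag{131}$$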
This leads to a set of bosonic states called 'sleptons.' Next the process is applied to the bosons of the Standard Model and leads to
$$\widetilde{G}^{i} , \quad \widetilde{Z}^0 , \quad \widetilde{W}^+ , \quad \widetilde{W}^- , \quad \widetilde{A} . \tag{132}$$
These correspond to 'gluinos,' the 'zino,' the positively charged 'wino,' the negatively charged 'wino' and the 'photino,' respectively. These are all fermions. Finally there is the Higgs particle. Here things become a little more complicated. It is mathematically impossible to realize supersymmetry by adding a single fermion 'superpartner.' Instead it is necessary to add a zoo of Higgs related particles, some bosons
$$H , \quad A , \quad H^0 , \quad H^+ , \quad H^- , \tag{133}$$
and associated fermions,
$$\widetilde{H} , \quad \widetilde{A} , \quad \widetilde{H}^0 , \quad \widetilde{H}^+ , \quad \widetilde{H}^- . \tag{134}$$
As one can see, invoking superpartners as objects that exist in our universe is quite expensive: it comes at the cost of slightly more than doubling the number of basic constituents. There is even more cost than is apparent, because there are now the questions of the masses of the putative superpartners and their couplings to both ordinary matter and themselves. Although supersymmetry solves some of these issues, the number of parameters remaining is frightfully large. So why would we be motivated to carry out such a drastic revision in our view of reality? In fact, we have met one such reason in our 'toy' discussion in an imagined world where there was only time. Recall that we saw that in a supersymmetrical theory, there was a lower bound to the energy. This is a feature that survives an analysis of the full four dimensional theory including known forms of matter and energy together with their superpartners. Without supersymmetry, it has long been known that in giving a description of the universe consistent with quantum theory and relativity, the issue of the lowest state of energy (called 'the vacuum') becomes problematic. Supersymmetry is the only property presently known that resolves this question. The way this occurs is considered elegant by many and is illustrated in Fig. 2 below.

Fig. 2. The Cancellation of Boson and Fermion Loops.

The loop on the r.h.s. corresponds to a Feynman integral over a closed loop formed from propagators constructed from a bosonic field. The loop on the l.h.s. corresponds to a Feynman integral over a closed loop formed from propagators constructed from a fermionic field. Every fermion loop has a minus sign relative to a bosonic loop and there is exact cancellation if there is a mass degeneracy. Though the demonstration above was made in the context of the simplest vacuum fluctuation diagrams, it can be shown to hold no matter how complicated the structure of the diagrams used. There are other more 'practical' reasons that physicists have, during the course of the last twenty-five years, invested an enormous amount of time and energy in the study of supersymmetrical theories. One such
reason involves what is known as the problem of 'naturalness' of the mass of the Higgs boson. Below there is a representation of the 'fully dressed' Higgs boson propagator (the thick solid line) as an expansion that begins with the 'bare' propagator and continues by adding higher order interactions as predicted by the Standard Model.

Fig. 3. Naturalness of Scalar Masses.

There are an infinite number of additional Feynman diagrams required. Like most relativistic quantum field theories, the Standard Model requires renormalization. In particular, quadratically divergent terms appear. As the mass of a scalar field is not protected in this process, the physical mass has an ultra-sensitive dependence on the parameters used to define a finite theory. This situation has been given the designation of being 'unnatural.' One solution to this problem is the introduction of supersymmetry. As in the case of the vacuum fluctuations, here this implies there are additional graphs involving 'superpartners' (not present in the Standard Model) that render the mass dependence 'natural.' There is one other technical reason that is often cited as explaining the enthusiastic anticipation among particle physicists.
The gauge fields of the Standard Model describe three fundamental forces, each characterized by a distinct Lie algebra: SU(3) for the chromodynamic interaction, SU(2) for the weak interaction and U(1) for the electromagnetic interaction. Each of these basic forces is described by its own 'coupling constant,' the most familiar being the fundamental unit of electric charge. Particle physicists typically use the symbols $\alpha_s$ for the SU(3) coupling constant, $\alpha_w$ for the SU(2) coupling constant and $\alpha_e$ for the U(1) coupling constant. However, in the world of relativistic quantum field theory, coupling constants are not constant! Instead they depend upon the energy at which they are measured. This behavior is called the 'running of the coupling constants.'

Fig. 4. The 'Running' of Coupling Constants. (Plots from D. I. Kazakov, Phys. Rept. 344, 309 (2001).)

One other reason supersymmetry has generated such an interest among physicists involves the topic of 'unification.' The two plots in Fig. 4 show the behavior of the reciprocal of the coupling constants as a function of the logarithm of the energy. The plot to the left shows the behavior of the coupling constants as predicted by the equations of the Standard Model (SM) and the one to the right shows their behavior in the 'minimal supersymmetric Standard Model (MSSM).' In the case of the SM, the weak and chromodynamic coupling constants decrease with energy (note the plot describes the reciprocal) and that for the electromagnetic interaction increases. However, this behavior occurs in such a way that there is no common point of
intersection. There is dramatically different behavior for the MSSM. 18 In this case the chromodynamic coupling constant decreases, while both those for the electromagnetic and weak interactions increase. The changes in all three couplings conspire to meet at a single 'point of unification.' Many physicists believe that this is a desirable situation. Here, above the point of unification, there effectively remain only two forces: the gravitational and the combined chromo-electro-weak interactions.

8. A Peek at 4D, $\mathcal{N} = 1$ Superfields

There exists by now a large literature on superspace and superfields. This work began with that of Salam and Strathdee 16 and continues to this day. There are introductory treatments available, 19 two comprehensive treatments 20, 21 and all sorts of levels in between. The mathematics of 4D, $\mathcal{N} = 1$ supersymmetry is remarkably similar to that seen in the simple one dimensional context. For example, the generators of supersymmetry take the forms
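$$ Q_\alpha = i \left[ \frac{\partial}{\partial \theta^\alpha} - i \tfrac{1}{2} \bar\theta^{\dot\alpha} \partial_{\alpha\dot\alpha} \right] , \qquad \bar Q_{\dot\alpha} = i \left[ \frac{\partial}{\partial \bar\theta^{\dot\alpha}} - i \tfrac{1}{2} \theta^{\alpha} \partial_{\alpha\dot\alpha} \right] . \qquad (135) $$

In a similar manner a real 4D, $\mathcal{N} = 1$ superfield takes the form

$$ \begin{aligned} F(z) = {} & C(x) + \theta^\alpha \chi_\alpha(x) + \bar\theta^{\dot\alpha} \bar\chi_{\dot\alpha}(x) - \theta^2 M(x) - \bar\theta^2 \bar M(x) + \theta^\alpha \bar\theta^{\dot\alpha} A_{\alpha\dot\alpha}(x) \\ & - \bar\theta^2 \theta^\alpha \lambda_\alpha(x) - \theta^2 \bar\theta^{\dot\alpha} \bar\lambda_{\dot\alpha}(x) + \theta^2 \bar\theta^2 d(x) . \end{aligned} \qquad (136) $$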
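The supersymmetry variations arise as

$$ \delta_Q F(z) = i \left( \epsilon^\alpha Q_\alpha + \bar\epsilon^{\dot\alpha} \bar Q_{\dot\alpha} \right) F(z) \qquad (137) $$

and lead to

$$ \begin{aligned} \delta_Q C &= - \left( \epsilon^\alpha \chi_\alpha + \bar\epsilon^{\dot\alpha} \bar\chi_{\dot\alpha} \right) , \\ \delta_Q \chi_\alpha &= \epsilon_\alpha M - \bar\epsilon^{\dot\alpha} \left( i \tfrac{1}{2} \partial_{\alpha\dot\alpha} C + A_{\alpha\dot\alpha} \right) , \\ \delta_Q M &= - \bar\epsilon^{\dot\alpha} \left( \bar\lambda_{\dot\alpha} - i \tfrac{1}{2} \partial_{\alpha\dot\alpha} \chi^\alpha \right) , \\ \delta_Q A_{\alpha\dot\alpha} &= - \epsilon^\beta \left( C_{\beta\alpha} \bar\lambda_{\dot\alpha} + i \tfrac{1}{2} \partial_{\beta\dot\alpha} \chi_\alpha \right) + \bar\epsilon^{\dot\beta} \left( C_{\dot\beta\dot\alpha} \lambda_\alpha + i \tfrac{1}{2} \partial_{\alpha\dot\beta} \bar\chi_{\dot\alpha} \right) , \\ \delta_Q \lambda_\alpha &= \epsilon^\beta \left( C_{\beta\alpha} d + i \tfrac{1}{2} \partial_{\beta\dot\alpha} A_\alpha{}^{\dot\alpha} \right) - i \tfrac{1}{2} \bar\epsilon^{\dot\alpha} \partial_{\alpha\dot\alpha} \bar M , \\ \delta_Q d &= i \tfrac{1}{2} \partial_{\alpha\dot\alpha} \left( \epsilon^\alpha \bar\lambda^{\dot\alpha} + \bar\epsilon^{\dot\alpha} \lambda^\alpha \right) . \end{aligned} \qquad (138) $$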
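This expansion is known as 'the vector multiplet superfield.' There are simpler superfields whose expansions take the form

$$ \Phi(z) = A(x) + \theta^\alpha \psi_\alpha(x) - \theta^2 F(x) + i \tfrac{1}{2} \theta^\alpha \bar\theta^{\dot\alpha} \partial_{\alpha\dot\alpha} A(x) + i \tfrac{1}{2} \theta^2 \bar\theta^{\dot\alpha} \partial_{\alpha\dot\alpha} \psi^\alpha(x) + \tfrac{1}{4} \theta^2 \bar\theta^2 \Box A(x) . \qquad (139) $$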
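The fields A(x) and F(x) are complex and some of the θ-expansion terms which appear in (136) are absent from the expansion in (139). The changes induced among the fields by the supersymmetry transformations are found from

$$ \delta_Q \Phi(z) = i \left( \epsilon^\alpha Q_\alpha + \bar\epsilon^{\dot\alpha} \bar Q_{\dot\alpha} \right) \Phi(z) \;\to\; \begin{cases} \delta_Q A = - \epsilon^\alpha \psi_\alpha , \\ \delta_Q \psi_\alpha = - i \bar\epsilon^{\dot\alpha} \partial_{\alpha\dot\alpha} A + \epsilon_\alpha F , \\ \delta_Q F = i \bar\epsilon^{\dot\alpha} \partial_{\alpha\dot\alpha} \psi^\alpha . \end{cases} \qquad (140) $$

A standard Lagrangian for the interaction of these two multiplets takes the form

$$ S_M = \int d^4x \, d^2\theta \, d^2\bar\theta \, \left[ \bar\Phi_+ \Phi_+ + \bar\Phi_- \Phi_- \right] + \left[ \int d^4x \, d^2\theta \, W(\Phi_+, \Phi_-) + \mathrm{h.c.} \right] . \qquad (141) $$

In writing the action as above, we have made use of covariantly chiral superfields defined by $\bar\nabla_{\dot\alpha} \Phi_+ = \bar\nabla_{\dot\alpha} \Phi_- = 0$. In terms of component fields this becomes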
J''
- J ( V M + ) ( V . A + ) - ^ ( V M _ ) ( V a ^ - ) - *V+ d Va^?
- # ° V „ ^ - Q + F+F+ + F _ F _ -
[i\aC[(tcip+a)A+
-
[^i+i+(A+,A-)^+QV'+a
-
[\w,-,-(A+,
+ (tcil>-a)A-]
a
A-)il>- if>-a
+ h.c. ]
+ W,+ (A+, A.)F+
+ h.c]
+ W,-(A+, A_)F_ + h . c ]
Q
[ W + , _ ( A + , A _ ) V + V - a + h.c. ]
+ dc[(tcA+)A+
+ (tcA-)A-]'
. (142)
9. Preliminary Discussion of the $\mathcal{GR}(d, \mathcal{N})$ Structure

There is an approach 12,13 to the description of 1D, arbitrary $\mathcal{N}$-extended off-shell supersymmetry that embeds issues in the field within the context of the well-developed and mature area of the representation theory of Clifford algebras. Schematically, this can be illustrated in the figure marked as Fig. 5.
Fig. 5. Proposed Coordinate Invariant Definition of Supersymmetry.
The specification of the $\mathcal{GR}(d, \mathcal{N})$ structure shown in the diagram is carried out in several steps. First, two real d-dimensional vector spaces $V_L$ and $V_R$ equipped with Euclidean inner product structures are introduced. The elements of $V_L$ will usually be denoted by $\phi$ while those of $V_R$ will usually be denoted by $\psi$. Next, four distinct sets of real linear maps acting on these spaces are introduced according to the following rules. Let $\{\mathcal{M}_L\}$ denote all linear maps acting from $V_L$ to $V_R$, $\{\mathcal{M}_R\}$ denote all linear maps acting from $V_R$ to $V_L$, $\{\mathcal{U}_L\}$ denote all linear maps acting from $V_L$ to $V_L$ and finally $\{\mathcal{U}_R\}$ denote all linear maps acting from $V_R$ to $V_R$. Thus we have

$$ \mathcal{M}_L : V_L \to V_R , \qquad \mathcal{M}_R : V_R \to V_L , \qquad \mathcal{U}_L : V_L \to V_L , \qquad \mathcal{U}_R : V_R \to V_R . \qquad (143) $$

Here it is understood that $\mathcal{M}_L$ is an element of $\{\mathcal{M}_L\}$.
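Since the dimension of the vector spaces is d, it follows that $\dim\{\mathcal{M}_L\} = \dim\{\mathcal{M}_R\} = \dim\{\mathcal{U}_L\} = \dim\{\mathcal{U}_R\} = d^2$. The definition of these maps implies that the compositions $\mathcal{M}_R \circ \mathcal{M}_L$ and $\mathcal{M}_L \circ \mathcal{M}_R$ have the properties

$$ \mathcal{M}_R \circ \mathcal{M}_L : V_L \to V_L , \qquad \mathcal{M}_L \circ \mathcal{M}_R : V_R \to V_R , \qquad (144) $$

and are thus elements of $\{\mathcal{U}_L\}$ and $\{\mathcal{U}_R\}$, respectively. Next, two subsets $\{L\} \subset \{\mathcal{M}_L\}$ and $\{R\} \subset \{\mathcal{M}_R\}$, such that rank({L}) = rank({R}) = $\mathcal{N}$, are defined. Let $L(\alpha) \in \{L\}$ and $R(\beta) \in \{R\}$ be specific fixed elements in each subset. We impose the requirements

$$ \begin{aligned} L(\alpha) \circ R(\beta) + L(\beta) \circ R(\alpha) &= - 2 (\alpha, \beta) \, I_{V_L} , \\ R(\alpha) \circ L(\beta) + R(\beta) \circ L(\alpha) &= - 2 (\alpha, \beta) \, I_{V_R} , \end{aligned} \qquad (145) $$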
where $I_{V_L}$ and $I_{V_R}$ are the identity maps acting on the respective vector spaces $V_L$ and $V_R$ and $(\alpha, \beta)$ is the Euclidean inner product of $\alpha$ and $\beta$. Finally, there is one other condition to be imposed. For all $\phi \in V_L$ and $\psi \in V_R$,

$$ (\psi, L(\alpha) \phi) = - (R(\alpha) \psi, \phi) , \qquad (146) $$

where ( , ) denotes the Euclidean inner product imposed on both $V_L$ and $V_R$. This completes the definition of the abstract structure denoted by the image on the r.h.s. of Fig. 5. The relation of the structure above to real Clifford algebras has been explained in some of our previous works. However, to describe the particular restriction of Clifford algebras to the structure above, we have named the specialized algebras so defined the '$\mathcal{GR}(d, \mathcal{N})$ algebras' as they bear some resemblance to generalized (G) real (R) versions of the Pauli matrices well known in the physics literature.

10. Supersymmetry & $\mathcal{GR}(d, \mathcal{N})$ Structure

Supersymmetry, of the type interesting to physicists, is a derivation $\delta_Q$ defined to act on sets of maps $\{\mathcal{B}\}$ and $\{\mathcal{F}\}$. As a first step in defining $\delta_Q$, the condition

$$ \mathrm{rank}(\{\mathcal{B}\}) = \mathrm{rank}(\{\mathcal{F}\}) \qquad (147) $$

is imposed. The basic idea is that the two classes of maps $\{\mathcal{B}\}$ (called 'bosons' and of even Grassmann grading) and $\{\mathcal{F}\}$ (called 'fermions' and of odd Grassmann grading) act on $\mathbb{R}^1$ to project it into all possible mathematical quantities associated with the $\mathcal{GR}(d, \mathcal{N})$ structure.
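The defining conditions (145) and (146) can be checked numerically in the smallest nontrivial case. The following is a minimal sketch for d = 2 and N = 2. The two matrices chosen for L are an illustrative assumption and are not taken from the text; R is then fixed by imposing R = -L^T, which is the matrix form of condition (146) for the standard Euclidean inner product.

```python
# Sketch: verify the GR(d, N) conditions (145) for d = 2, N = 2.
# The explicit matrices L[0], L[1] are an assumed, illustrative choice.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

IDENTITY = [[1, 0], [0, 1]]

# Assumed representatives of the subset {L}.
L = [
    [[1, 0], [0, 1]],
    [[0, 1], [-1, 0]],
]

# Condition (146) in matrix form: R_I is minus the transpose of L_I.
R = [scale(-1, transpose(m)) for m in L]

def satisfies_gr_algebra(left, right):
    """Check both relations in (145) for every pair of indices (I, J)."""
    count = len(left)
    for i in range(count):
        for j in range(count):
            target = scale(-2 if i == j else 0, IDENTITY)
            lr = add(matmul(left[i], right[j]), matmul(left[j], right[i]))
            rl = add(matmul(right[i], left[j]), matmul(right[j], left[i]))
            if lr != target or rl != target:
                return False
    return True
```

Calling `satisfies_gr_algebra(L, R)` runs the check for this choice. Swapping in `R = L` breaks the relations, which shows that the sign in (146) matters.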
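These classes of maps will form linear, off-shell (as called by physicists) representations of supersymmetry if

$$ \delta_Q(\epsilon) \circ \mathcal{B} = i L(\epsilon) \circ \mathcal{F} , \qquad \delta_Q(\epsilon) \circ \mathcal{F} = R(\epsilon) \circ \frac{d}{d\tau} \mathcal{B} , \qquad (148) $$

(where $\epsilon$ is an element of an anticommuting algebra), which then implies

$$ \begin{aligned} \delta_Q(\epsilon_1) \circ \delta_Q(\epsilon_2) \circ \mathcal{B} - \delta_Q(\epsilon_2) \circ \delta_Q(\epsilon_1) \circ \mathcal{B} &= - i 2 (\epsilon_1, \epsilon_2) \frac{d}{d\tau} \mathcal{B} , \\ \delta_Q(\epsilon_1) \circ \delta_Q(\epsilon_2) \circ \mathcal{F} - \delta_Q(\epsilon_2) \circ \delta_Q(\epsilon_1) \circ \mathcal{F} &= - i 2 (\epsilon_1, \epsilon_2) \frac{d}{d\tau} \mathcal{F} , \end{aligned} \qquad (149) $$

for two such anticommuting elements $\epsilon_1$ and $\epsilon_2$. Sets of maps that satisfy (147) and (149) are called 'superfields.' Examples of these will be given shortly. For a fixed value of $\mathcal{N}$, there is a smallest value of d (denoted by $d_{\mathcal{N}}$) such that matrix representations of $L(\alpha)$ and $R(\alpha)$ are faithful. 13 This relationship is as follows. For any natural number $\mathcal{N}$, a decomposition is accomplished by writing

$$ \mathcal{N} = 8 m + n \qquad (150) $$

where m and n are non-negative integers such that $1 \le n \le 8$. Furthermore, if $\mathcal{N} = 8k$ for some integer k, then m = k - 1.

Table 1. The Radon-Hurwitz Function.

n              1  2  3  4  5  6  7  8
$F_{RH}(n)$    1  2  4  4  8  8  8  8

This is the Radon-Hurwitz function (as noted in the work by Pashnev & Toppan 22) and can be used to write

$$ d_{\mathcal{N}} = 16^m F_{RH}(n) , \qquad F_{RH}(n + 8m) = F_{RH}(n) . \qquad (151) $$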
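The decomposition (150) and the formula (151) are easy to evaluate mechanically. The sketch below copies the values of Table 1 into a dictionary and computes the minimal dimension for a given N. The function names `decompose` and `minimal_dimension` are hypothetical and introduced only for illustration.

```python
# Sketch: evaluate d_N = 16^m * F_RH(n) with N = 8m + n and 1 <= n <= 8.
# The dictionary reproduces Table 1; the function names are illustrative.

F_RH = {1: 1, 2: 2, 3: 4, 4: 4, 5: 8, 6: 8, 7: 8, 8: 8}

def decompose(N):
    """Return (m, n) with N = 8*m + n and 1 <= n <= 8, as in (150)."""
    if N < 1:
        raise ValueError("N must be a natural number")
    m, remainder = divmod(N - 1, 8)
    return m, remainder + 1

def minimal_dimension(N):
    """Smallest d for which L and R are faithfully represented, per (151)."""
    m, n = decompose(N)
    return 16 ** m * F_RH[n]
```

Shifting by one before the division is what enforces the rule that N = 8k gives m = k - 1 and n = 8, rather than m = k and n = 0.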
It is also useful to consider a basis for all the elements of $\{\mathcal{M}_L\}$, $\{\mathcal{M}_R\}$, $\{\mathcal{U}_L\}$ and $\{\mathcal{U}_R\}$. Such a basis can be constructed in two steps.
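We introduce $\mathcal{N}$ arbitrary anticommuting elements $\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_{\mathcal{N}}$ of the form that appear in (148) and (149) and make the observation that all structures $f_{2p}$ where

$$ \begin{aligned} f_2(\epsilon_1, \epsilon_2) &= L(\epsilon_1) \circ R(\epsilon_2) , \\ f_4(\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4) &= L(\epsilon_1) \circ R(\epsilon_2) \circ L(\epsilon_3) \circ R(\epsilon_4) , \\ f_{2p_{max}}(\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_{2p_{max}}) &= L(\epsilon_1) \circ R(\epsilon_2) \circ \cdots \circ L(\epsilon_{2p_{max}-1}) \circ R(\epsilon_{2p_{max}}) , \end{aligned} \qquad (152) $$

satisfy the condition $f_{2p} \in \{\mathcal{U}_L\}$. In a similar manner all structures $\hat f_{2p}$ where

$$ \begin{aligned} \hat f_2(\epsilon_1, \epsilon_2) &= R(\epsilon_1) \circ L(\epsilon_2) , \\ \hat f_4(\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4) &= R(\epsilon_1) \circ L(\epsilon_2) \circ R(\epsilon_3) \circ L(\epsilon_4) , \\ \hat f_{2p_{max}}(\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_{2p_{max}}) &= R(\epsilon_1) \circ L(\epsilon_2) \circ \cdots \circ R(\epsilon_{2p_{max}-1}) \circ L(\epsilon_{2p_{max}}) , \end{aligned} \qquad (153) $$

satisfy the condition $\hat f_{2p} \in \{\mathcal{U}_R\}$. Continuing, we note that all structures $f_{2q+1}$ where

$$ \begin{aligned} f_3(\epsilon_1, \epsilon_2, \epsilon_3) &= L(\epsilon_1) \circ R(\epsilon_2) \circ L(\epsilon_3) , \\ f_5(\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4, \epsilon_5) &= L(\epsilon_1) \circ R(\epsilon_2) \circ L(\epsilon_3) \circ R(\epsilon_4) \circ L(\epsilon_5) , \\ f_{2q_{max}+1}(\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_{2q_{max}+1}) &= L(\epsilon_1) \circ R(\epsilon_2) \circ \cdots \circ R(\epsilon_{2q_{max}}) \circ L(\epsilon_{2q_{max}+1}) , \end{aligned} \qquad (154) $$

satisfy the condition $f_{2q+1} \in \{\mathcal{M}_L\}$. In a similar manner all structures $\hat f_{2q+1}$ where

$$ \begin{aligned} \hat f_3(\epsilon_1, \epsilon_2, \epsilon_3) &= R(\epsilon_1) \circ L(\epsilon_2) \circ R(\epsilon_3) , \\ \hat f_5(\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4, \epsilon_5) &= R(\epsilon_1) \circ L(\epsilon_2) \circ R(\epsilon_3) \circ L(\epsilon_4) \circ R(\epsilon_5) , \\ \hat f_{2q_{max}+1}(\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_{2q_{max}+1}) &= R(\epsilon_1) \circ L(\epsilon_2) \circ \cdots \circ L(\epsilon_{2q_{max}}) \circ R(\epsilon_{2q_{max}+1}) , \end{aligned} \qquad (155) $$

satisfy the condition $\hat f_{2q+1} \in \{\mathcal{M}_R\}$. It is useful in the continuation of the discussion to define

$$ f_0 = I_{V_L} , \qquad \hat f_0 = I_{V_R} , \qquad f_1(\epsilon) = L(\epsilon) , \qquad \hat f_1(\epsilon) = R(\epsilon) , \qquad (156) $$

which are used in the definitions of the following sets

$$ \{f\}_e = \{ f_0, f_{2p} \} , \qquad \{\hat f\}_e = \{ \hat f_0, \hat f_{2p} \} , \qquad \{f\}_o = \{ f_1, f_{2q+1} \} , \qquad \{\hat f\}_o = \{ \hat f_1, \hat f_{2q+1} \} . \qquad (157) $$

As shown in the physics literature by Okubo, 23 it is possible to use the sets $\{f\}_e$, $\{\hat f\}_e$, $\{f\}_o$ and $\{\hat f\}_o$, in the case of a 'normal' Clifford algebra, to respectively provide bases for $\{\mathcal{U}_L\}$, $\{\mathcal{U}_R\}$, $\{\mathcal{M}_L\}$ and $\{\mathcal{M}_R\}$. However, there are two other circumstances that arise, the cases of the 'almost complex' Clifford algebra and the 'quaternionic' Clifford algebra. In each of these cases, the sets in (157) do not provide a complete basis. In the former case, it is necessary to introduce an almost complex structure (which we denote by $\mathcal{D}$) and in the latter case three quaternionic structures (which we denote by $\mathcal{E}^{\hat\alpha}$) must also be introduced. The types of the Clifford algebra for $1 \le d \le 8$ are shown in the accompanying Table 2, and the remaining cases follow by noting that

$$ \mathrm{Type}\,[\mathcal{GR}(d, \mathcal{N})] = \mathrm{Type}\,[\mathcal{GR}(d, \mathcal{N} + 8r)] \qquad (158) $$

which is valid for any integer r. Once the type is identified, the basis elements then follow from Table 3.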
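Table 2. Basis Types.

$\mathcal{GR}(d, \mathcal{N})$     Type[$\mathcal{GR}(d, \mathcal{N})$]
$\mathcal{GR}(8, 8)$               N
$\mathcal{GR}(8, 7)$               N
$\mathcal{GR}(8, 6)$               AC
$\mathcal{GR}(8, 5)$               Q
$\mathcal{GR}(4, 4)$               Q
$\mathcal{GR}(4, 3)$               Q
$\mathcal{GR}(2, 2)$               AC
$\mathcal{GR}(1, 1)$               N

Table 3. Basis Elements.

Type   $\mathcal{U}_L$                               $\mathcal{M}_L$                               $\mathcal{M}_R$                                         $\mathcal{U}_R$
N      $\{f\}_e$                                     $\{f\}_o$                                     $\{\hat f\}_o$                                          $\{\hat f\}_e$
AC     $\{\{f\}_e, \{f\}_e \mathcal{D}\}$            $\{\{f\}_o, \{f\}_o \mathcal{D}\}$            $\{\{\hat f\}_o, \{\hat f\}_o \mathcal{D}\}$            $\{\{\hat f\}_e, \{\hat f\}_e \mathcal{D}\}$
Q      $\{\{f\}_e, \{f\}_e \mathcal{E}^{\hat\alpha}\}$   $\{\{f\}_o, \{f\}_o \mathcal{E}^{\hat\alpha}\}$   $\{\{\hat f\}_o, \{\hat f\}_o \mathcal{E}^{\hat\alpha}\}$   $\{\{\hat f\}_e, \{\hat f\}_e \mathcal{E}^{\hat\alpha}\}$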
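Table 2 together with the periodicity rule (158) amounts to a lookup in N modulo 8. The sketch below encodes that lookup and also records how many extra structures each type requires (none for the normal case, one almost complex structure, three quaternionic structures), as stated in the text. The names `clifford_type` and `extra_structures` are hypothetical.

```python
# Sketch: Clifford algebra type as a function of N, using Table 2 and (158).
# Function and variable names are illustrative, not taken from the text.

TYPE_BY_N = {1: "N", 2: "AC", 3: "Q", 4: "Q", 5: "Q", 6: "AC", 7: "N", 8: "N"}

# Extra structures needed to complete the basis for each type.
EXTRA_STRUCTURES = {"N": 0, "AC": 1, "Q": 3}

def clifford_type(N):
    """Return 'N', 'AC' or 'Q' for a natural number N, with period 8."""
    if N < 1:
        raise ValueError("N must be a natural number")
    return TYPE_BY_N[(N - 1) % 8 + 1]

def extra_structures(N):
    """Number of additional structures needed beyond the sets in (157)."""
    return EXTRA_STRUCTURES[clifford_type(N)]
```

For the case treated next, `clifford_type(2)` gives the almost complex type, so one extra structure is needed.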
10.1. The chiral multiplet as a Clifford algebraic superfield of $\mathcal{GR}(2, 2)$

In order to see all this formalism at work, it is useful to consider some examples. In the following this is done very explicitly in one case: the 4D, $\mathcal{N} = 1$ chiral multiplet. It is useful however to glance at a few simpler examples first. The simplest way of choosing maps consists of the identifications $\{\mathcal{B}\} : \mathbb{R}^1 \to V_L$ and $\{\mathcal{F}\} : \mathbb{R}^1 \to V_R$. In this case, there are introduced functions $\phi_i(\tau)$ for $\{\mathcal{B}\}$ and $\psi_{\hat j}(\tau)$ for $\{\mathcal{F}\}$ along with equations

$$ L_K R_P + L_P R_K = - 2 \delta_{KP} I , \qquad R_K L_P + R_P L_K = - 2 \delta_{KP} I , \qquad (159) $$

$$ \delta_Q \phi_i = i \epsilon^K (L_K)_i{}^{\hat j} \psi_{\hat j} , \qquad \delta_Q \psi_{\hat j} = - \epsilon^K (R_K)_{\hat j}{}^{i} \partial_\tau \phi_i , \qquad (160) $$

which correspond respectively to (145) and (148) after choosing a specific basis. Although we have used different names for this representation in some of our past works, it is accurate to call this the 'isospinning scalar multiplet.' A second such example is provided by choosing $\{\mathcal{B}\} : \mathbb{R}^1 \to \mathcal{U}_L$ and $\{\mathcal{F}\} : \mathbb{R}^1 \to \mathcal{M}_R$. Once again, we will leave $\mathcal{N}$ and d arbitrary and simply
157 identify 'fields' $ki and ^ki, so that $n6{Wi}
,
* i , e W
,
(161)
and define their transformation laws under a supersymmetry variation to be 6Q$kl
= ie1(Ll)ki^h
,
6Qykl
= -^(RVSrfc/i
,
(162)
and we shall call this a 'Type-I Clifford Algebra Supermultiplet.' It can be seen that in many ways, this supermultiplet resembles
, *kle{MR}/V
,
*fcj€{Z4}/D
,
.
(163) *kte{UR}/V
Here $ki together with $^.;- correspond to {B} while ^fe;- together with ^kl correspond to {J7}. We may call this the 'Type-II Clifford Algebra Supermultiplet.' In QTZ(2, 2) we can use the following conventions and identities, L J R J = -<5 IJ I + e IJ / ' , R ! L J = - SUI + e I J / * , Tr[f*] = Tr[t\ = 0 , (/*) 2 = (/*)* = - I .
(164)
In order to define component fields we expand the Q7l{2, 2) fields in the following manner,
®ki = \SkiB
+ \Ulki
(drr'G
,
*kl
= - X/V*1
,
and define their transformation laws under a supersymmetry variation to
158 be
S
Q*ki
- zl {R\edT$a
+ e1 (Vy
-el{l})kldT$ri
- el ( R 1 ) ^ d T $ k e
dT$u (166)
*Q*ki s
Q®ki
and these equations correspond to (148) for this choice of the maps. It should be emphasized that, with the exception of the factors of i above, all other quantities appearing above are real in the usual sense. Therefore the symbol $\bar\epsilon^I$ denotes two real parameters that are totally independent of $\epsilon^I$. In other words, the 'bar' is simply a notational device to denote these independent parameters. Upon combining the results of (164), (165) and (166), this leads to
6QA=
ie1^1
+ ieV
,
6QB=
zeV
-
itl4>1
,
5Qi)1=
eldTB
SQ-IJJ1^
- eldTA
6QF=
ieleudTi)3
SQG=
- i e I e I J a , V - itleudT^
+ eldTA
- e J e JI G + eJenF
+ eldTB
+ e3e3lF
+ ieleudr^
,
+ eJeJIG
(167) ,
, .
These variations close under commutation to
[hxM
= -atfA
-I
^
+ <\?2)d,
(168)
11. The Chiral Multiplet & Associated Adinkra

We now wish to use the variations in (167) to construct an Adinkra that is associated with this multiplet. We begin this process by writing explicitly
159 the variations 6QA =
- i e1 i)1 - i e2 ^ 2 + i e1 tp1 + i I2 ip2
6QB=
ie1^1
+ ie2ip2
^drB
+ e2G + ^drA
JQV1
-
2
2
+ ie1^1
1
2
+ ie2i>2 - e2F 1
6Qip = e dT B - e G + e 5 r A + e F JQ^1
=
, , ,
- e 1 d r A - e2 F + e1 d r B - e2 G
5Qi>2 = - e2dTA SQF=
i^drj)2
SQG=
-ie^rip2
+ elF
,
+ e2 dTB + e1 G
- ie2dTi>1
,
,
+ iel dTip2 - il2dri\)x
+ ie2dT^1
+ iexdT^2
,
- it2 dT^>x
. (169)

Each bosonic field is associated with an open node (such as seen in Fig. 6) which is given a 'height assignment' that is twice its engineering dimension when drawn in a diagram. Each fermionic field is associated with a closed node (such as seen in Fig. 7) which is given a 'height assignment' that is twice its engineering dimension when drawn in a diagram.

Fig. 6. Graphical Representation of Bosonic Node.

Fig. 7. Graphical Representation of Fermionic Node.

For the sake of convenience, we may rename the fermions $\bar\psi^1$ and $\bar\psi^2$ according to the convention

$$ \bar\psi^1 \to \psi^3 , \qquad \bar\psi^2 \to \psi^4 . \qquad (170) $$

The two physical bosonic fields A and B have engineering dimensions equal to $-\tfrac{1}{2}$. The four physical fermionic fields $\psi^1$, $\psi^2$, $\psi^3$ and $\psi^4$ have
engineering dimensions equal to 0. The two auxiliary bosonic fields F and G have engineering dimensions equal to $+\tfrac{1}{2}$. In order to build the Adinkra associated with this multiplet we implement the following rules. All nodes of the same height are drawn horizontally at the same level. Nodes with different height assignments are drawn according to their assignments. Nodes with the lowest value of their height assignment are drawn lower on the graph than those of higher height assignment. At this stage, the 'skeleton' of the Adinkra can be drawn.

Fig. 8. Skeleton Adinkra.

Two open nodes appear at the lowest level of this Adinkra. Reading the diagram from left to right at the lowest level we associate the two nodes with A and B, respectively. Four closed nodes appear above these at a middle level of the diagram. Reading from left to right we associate these four closed nodes with the four physical fermionic fields $\psi^1$, $\psi^2$, $\psi^3$ and $\psi^4$. Two open nodes appear at the highest level of this Adinkra. Reading the diagram from left to right at the highest level we associate the two nodes with F and G, respectively. The four supersymmetry parameters correspond to four 'directions' that are linearly independent. Since in the discussion below we are looking for a diagrammatic representation of the set of variations in (169), it would be most convenient to use distinct colors, one for each linearly independent supersymmetry parameter. This stratagem neatly allows one to accurately represent directions in the multi-dimensional space of the supersymmetry parameter on the two-dimensional surface of a graph. We would assign one distinct color to each of $\epsilon^1$, $\epsilon^2$, $\bar\epsilon^1$ and $\bar\epsilon^2$. Since we do not have color available in this publication, we will have to ask the reader to use their imagination. By examination of the transformation laws of (169), some very interesting features are seen to emerge. Each independent supersymmetry parameter induces a unique 'pairing' of vertices, when one ignores the absence or presence of the derivative operator. These pairings are shown in Table 4. The pairing is constructed by looking at the variations in (169). The left-
Table 4.
'First Table of Pairings' of Fields Related to SUSY Parameters. e2
el
el
e-l
A-ip4
A-$3
A-i/>i
A~V
B-^l i/)i-B i>2-G
B-,p2
A-i>4
B-4>3
xpi-G
ipi-A ip2-A
ilix-F i>2-F
4>4-B
4> -A
•4>S-G
i/> 4 -G ip3~B
F-$4
F-i>2
F-i/)i
F-03
G~44
ip4-A %j}3-F
F-4>3 2
G-i/>
•4>2-B 4>4-F 3
1
G-i>
most member of the pair is the function that sits next to the operator 8Q. The rightmost member of the pair is the function which appears as the coefficient of the supersymmetry parameter (at the top of the column in which the pairing is presented) on the r.h.s. of the corresponding equation. One obvious feature of this Table is that it possesses a redundancy. That is, each specific pairing appears twice. The same information is contained in a smaller Table given by
Table 5. el
'Second Table Pairings' of Fields Related to SUSY Parameters. e2
e-l
A-j>4
A~i3 B^l>2
A-^ip1 A-i/i4
A-i/> 2
B-ip1
F~i>3
F-ip4
F-V 2
F^ip1
G-i/>
2
G-f
1
F-4
3
B-i/)3
G^>4
Therefore different 'color' edges can be drawn linking the various vertices in the Adinkra. There are, of course, innumerable ways to do this. However, one possibility for doing so is according to the rules (to which we refer as the 'colorized' supersymmetry variations)

$\epsilon^1$ corresponds to blue edges joining Table pairs,
$\epsilon^2$ corresponds to green edges joining Table pairs,
$\bar\epsilon^1$ corresponds to orange edges joining Table pairs, and
$\bar\epsilon^2$ corresponds to red edges joining Table pairs. (171)

One feature that should also be apparent is that the rules in (150) and (151) give the relations of the number of nodes $d_{\mathcal{N}}$ (either the number of closed or the number of open nodes) to the number of colors $\mathcal{N}$. The next step is the introduction of colored edges. Since there are four colored supersymmetry parameters appearing in the colorized supersymmetry variations, four distinct colored edges are to be introduced into the diagram. From the colorized variation of A, four edges are to be drawn. A blue edge joins the A-node to the $\psi^1$-node. A green edge joins the A-node to the $\psi^2$-node. An orange edge joins the A-node to the $\psi^3$-node. A red edge joins the A-node to the $\psi^4$-node. This process is repeated for every single node with the result
Fig. 9. 4D, $\mathcal{N} = 1$ Chiral Multiplet 'Shadow' Adinkra.
This image is a graphical representation of the chiral multiplet, i.e., its 'Adinkra.' As each of the supersymmetry laws is apparent with a complete display of all colored edges, this level of detailed presentation is called the 'peacock mode' of the Adinkra. For some purposes, it is not necessary to display the level of detail described above. In this case, nodes may be 'collapsed' upon one another. For example, the fully collapsed version of this Adinkra is given in Fig. 10, where the numbers next to the nodes signify their respective multiplicities.

Fig. 10. A Collapsed Adinkra.

In Fig. 9 there are features that are obvious. For example, the A and B nodes at the lowest level are local minima that only possess edges upward from their level. They are referred to by the name of 'sources.' On the other hand, the F and G nodes at the highest level are local maxima that only possess edges downward from their level. They are referred to by the name of 'sinks.' There is also another feature of the colorized supersymmetry variation that becomes apparent upon further reflection. Consider a term specified by a specific color, a specific bosonic (fermionic) field to the left of the equal sign and a specific fermionic (bosonic) field to the right of the equal sign in the list of variations. Next consider the same specific color, but switch the specified fields. This process will only pick out two terms in the colorized supersymmetry variations. The signs of such terms are seen to be correlated. If the first term so specified has a positive coefficient, then so does the second and vice versa. This means there is an even more elaborate form of the Adinkra that completely specifies the colorized supersymmetry variation. Namely, dashed lines can be added to the Adinkra in such a way so as to specify the signs of each term in the variation. Solid lines indicate positive coefficients while dashed lines indicate negative coefficients. When this elaboration is added to the Adinkra in Fig. 9 we obtain the Adinkra in 'rampant peacock mode' as indicated below
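The height assignments, the colored edges and the source and sink terminology described above can be collected into a small data structure. The following is a minimal sketch under stated assumptions. The engineering dimensions and the four edges leaving the A-node are taken from the text. The coloring of the remaining edges (fermion index equal to boson index XOR color index) is an assumed pattern chosen only so that every node meets each color exactly once; it is not read off from Tables 4 and 5. All names are hypothetical.

```python
# Sketch: the chiral multiplet Adinkra as a colored bipartite graph.
# Only the A-node edges follow the text; the other edges use an assumed
# XOR pattern so that each node carries one edge of every color.
from fractions import Fraction

ENGINEERING_DIMENSION = {
    "A": Fraction(-1, 2), "B": Fraction(-1, 2),
    "psi1": Fraction(0), "psi2": Fraction(0),
    "psi3": Fraction(0), "psi4": Fraction(0),
    "F": Fraction(1, 2), "G": Fraction(1, 2),
}

BOSONS = ["A", "B", "F", "G"]
FERMIONS = ["psi1", "psi2", "psi3", "psi4"]
COLORS = ["blue", "green", "orange", "red"]

def height(node):
    """Height assignment: twice the engineering dimension."""
    return int(2 * ENGINEERING_DIMENSION[node])

def build_edges():
    """Map (boson, color) to the fermion joined by that colored edge."""
    edges = {}
    for b, boson in enumerate(BOSONS):
        for c, color in enumerate(COLORS):
            edges[(boson, color)] = FERMIONS[b ^ c]  # assumed pattern
    return edges

def neighbors(node, edges):
    """All nodes joined to the given node by an edge of any color."""
    if node in BOSONS:
        return [f for (b, _color), f in edges.items() if b == node]
    return [b for (b, _color), f in edges.items() if f == node]

def classify(node, edges):
    """'source' if every edge goes upward, 'sink' if every edge goes downward."""
    linked = [height(n) for n in neighbors(node, edges)]
    if all(h > height(node) for h in linked):
        return "source"
    if all(h < height(node) for h in linked):
        return "sink"
    return "neither"
```

With `edges = build_edges()`, the lookup `edges[("A", "blue")]` returns the first fermion, matching the blue edge described for the A-node, and `classify` labels A and B as sources and F and G as sinks. Dashed lines for negative coefficients could be added as an extra sign stored per edge, which this sketch does not attempt.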
We close this chapter by noting that the final results in (169) follow solely from those in (165) and (166). These results are clearly of the form of those which are found in an entirely different manner. One can start by considering any standard formulation of the 4D, $\mathcal{N} = 1$ chiral multiplet. This can be followed by performing a toroidal reduction on a 0-brane and the results that emerge are isomorphic to those in (140). For our purposes, the way to study a higher D theory on a 0-brane simply means to set all the coordinate dependences of fields to zero, with the exception of the dependence on the time-like variable. We refer to this 1D model as the 'shadow' of the 4D, $\mathcal{N} = 1$ chiral multiplet. It is very satisfying to see that the representation theory of $\mathcal{GR}(d, \mathcal{N})$ is fully capable of encoding all the aspects of the 4D, $\mathcal{N} = 1$ chiral multiplet. In particular, the way in which the holomorphic combination of parameters appears here is precisely what is dictated by the 0-brane reduction of the chiral multiplet in four dimensions. The expansion of the chiral superfield shadow that we have seen defined by (163) and (165) is a special case of a concept we have called a 'root superfield.' In the present context the root superfield takes the form
[**i]w(ai, a2) = Wi(dr)-aiB0
+ £(/*)u ( d r ) ^ 2 ^
,
[*£«Ma3) = -£RJa(3r)-°V , [*fcrMa4, a5) = hWr)-a*B0 [*fcf]w(ae) = Wki{dT)-a«0l
(172)
+ khki(9r)-a5B2
,
,
for some set of non-negative integers, $a_1, \ldots, a_6$. The multiplet described by (165) has $a_1 = 0$, $a_2 = 1$, $a_3 = 0$, $a_4 = 0$, $a_5 = 1$, and $a_6 = 0$. For different choices of these integers, different supermultiplets are described. The root superfield encodes 'dualities' about the higher dimensional supersymmetrical theory with the root superfield as its shadow. Instead of regarding (163) and (165) as the shadow of the 4D, $\mathcal{N} = 1$ chiral multiplet, we are also free to regard it as the shadow of the 2D, $\mathcal{N} = 2$ chiral multiplet. In this guise it is consistent to change the expansion in (165) to
*ki=
h5klA + lU*)ki{dr)-lG
,
Vkl
= - \ ^
1
,
and remarkably enough this change corresponds to the duality between chiral and twisted chiral multiplets! 24 The multiplet described by (173) has $a_1 = 0$, $a_2 = 1$, $a_3 = 0$, $a_4 = 1$, $a_5 = 0$, and $a_6 = 0$.
12. Concluding Comment on Fundamental Mathematical Supersymmetry Representation Theory

In ordinary field theory, the Wigner 'little group' emerges when a field is restricted to a single point and the various symmetries that act upon it are investigated. It is our belief that a similar process always occurs for supersymmetrical theories when they are analyzed on a 0-brane. The group that we have named the $\mathcal{GR}(d, \mathcal{N})$ structure seems likely to emerge for all supersymmetrical theories. If this is so, then the $\mathcal{GR}(d, \mathcal{N})$ algebra is likely to play an important role in the still unsolved problem of reaching a complete representation theory of supersymmetrical theories. Based on this, we have made a series of conjectures about the role that will ultimately be played by the $\mathcal{GR}(d, \mathcal{N})$ structure:

Conjecture I. All superfields that give an off-shell linear representation of spacetime supersymmetry in all dimensions can be represented as Clifford-algebraic root superfields.

Conjecture II. If an on-shell supermultiplet is embedded into a representation of $\mathcal{GR}(d_{\mathcal{N}}, \mathcal{N})$, then an off-shell representation of this supermultiplet is embedded into $\mathcal{GR}(d_{2\mathcal{N}}, 2\mathcal{N})$.

Conjecture III. The constraints to which all irreducible superfields in all D > 3 are subjected ensure that irreducible supermultiplets are also irreducible representations of the $\mathcal{GR}(d, \mathcal{N})$ algebra.

Conjecture IV. All superfields that provide an off-shell linear representation of $\mathcal{N}$-extended spacetime supersymmetric field theory for D-dimensional Minkowski spaces (with D > 1) can be embedded in the representations of the $C(\mathcal{N} + D - 1, 2)$ Clifford algebra with a projection of the dependence on the second temporal coordinate taken to zero.
In closing, we will also mention that use of these methods of studying supersymmetrical representation theory has recently revealed that the supersymmetry representations are intimately connected to a type of topology. The Adinkras, after all, possess closed paths and, depending on how these are chosen, topological-like indices appear to make sense. This fact seems to be intimately connected to the irreducible representation of supersymmetry. The Adinkra we have seen related to the 4D, $\mathcal{N} = 1$ chiral multiplet is the one with the simplest 'topology.' There are more complicated ones. In particular, the next most complicated topology associated with the case of $\mathcal{GR}(2, 2)$ is that associated with the Adinkra of the form shown in Fig. 11.

Fig. 11. The 4D, $\mathcal{N} = 1$ Vector Multiplet Adinkra.

Stated another way, if we begin with the supersymmetry variations given by (138) and restrict them to the 0-brane, the Adinkra that arises is the one in Fig. 11. Thus the problem of classifying the irreducible off-shell linear representations of supersymmetry very likely is embedded within an area that involves topology, Clifford algebras and graph theory in an unexpectedly intricate way. The solution of this problem will likely involve some beautiful and perhaps presently unknown mathematics.
Acknowledgments

I would like to express my gratitude to the organizers, Professors Norbert Hounkonnou and Jan Govaerts, of the Fourth International Workshop on Contemporary Problems in Mathematical Physics (COPROMAPH4) for their kind invitation to deliver the lectures upon which this article is based and for the opportunity to visit such an exotic locale. As well, I wish to recognize the wonderful hospitality of the University of Abomey-Calavi (UAC) and the International Chair in Mathematical Physics and Applications (ICMPA) for their roles as hosting organizations. I also wish to thank my collaborators who have worked with me in the attempt to establish a new mathematically rigorous basis upon which investigations of supersymmetry can be based. The list of such people includes my former students, Dr. Lubna Rana, Dr. William D. Linch, III, Dr. Joseph Phillips and present collaborators Prof. Michael Faux, Prof. Charles Doran, Prof. Tristan Hübsch, Prof. Kevin Iga and Prof. Gregory Landweber. All figures used here were created by Professors Faux, Hübsch and Iga. My research upon which this presentation has been based is supported by the endowment of the John S. Toll Professorship, the University of Maryland's Center for Particle and String Theory and the U.S. National Science Foundation under grant PHY-03-54401. I also wish to express my gratitude to Prof. Govaerts (in his role as editor) for his understanding and extensions of deadlines as I circumnavigated the globe during the Summer of 2006 during the completion of this article.

References

1. E. Witten, Nucl. Phys. B 188, 513 (1981); Nucl. Phys. B 202, 253 (1982).
2. L. Alvarez-Gaume, Commun. Math. Phys. 90, 161 (1983).
3. B. Julia, in Proceedings of the Nuffield Workshop, eds. S. W. Hawking and M. Rocek (Cambridge University Press, Cambridge (UK), 1980).
4. M. B. Green and J. H. Schwarz, Phys. Lett. B 136, 367 (1984); Nucl. Phys. B 198, 252 (1982); Nucl. Phys. B 198, 411 (1982).
5. U. Lindstrom, M. Rocek, W. Siegel, P. van Nieuwenhuizen and A. E. van de Ven, Phys. Lett. B 224, 285 (1989); W. Siegel, Lorentz-Covariant Gauges for Green-Schwarz Superstrings, in Strings '89, eds. R. Arnowitt, R. Bryan, M. J. Duff, D. Nanopoulos and C. N. Pope (World Scientific, Singapore, 1989); S. J. Gates, Jr., M. T. Grisaru, U. Lindstrom, M. Rocek, W. Siegel, P. van Nieuwenhuizen and A. E. van de Ven, Phys. Lett. B 225, 44 (1989); M. Rocek, W. Siegel, P. van Nieuwenhuizen and A. E. van de Ven, Phys. Lett. B 227, 87 (1989); U. Lindstrom, M. Rocek, W. Siegel, P. van Nieuwenhuizen and A. E. van de Ven, Nucl. Phys. B 330, 19 (1990); U. Lindstrom, M. Rocek, W. Siegel, P. van Nieuwenhuizen and A. E. van de Ven, Phys. Lett. B 228, 53 (1989); M. B. Green and C. M. Hull, Phys. Lett. B 225, 57 (1989); R. Kallosh, Phys. Lett. B 224, 273 (1989); F. Bastianelli, G. W. Delius and E. Laenen, Phys. Lett. B 229, 223 (1989); J. M. L. Fisch and M. Henneaux, A Note on the Covariant BRST Quantization of the Superparticle, Universite Libre de Bruxelles preprint ULB-TH2/89-04-Rev (June 1989); E. Nissimov, S. Pacheva and S. Solomon, Nucl. Phys. B 296, 462 (1988); R. E. Kallosh and M. A. Rahmanov, Phys. Lett. B 209, 233 (1988).
6. M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory (Cambridge University Press, Cambridge (UK), 1986), pp. 280-281.
7. F. Eßler, E. Laenen, W. Siegel and J. Yamron, BRST Operator for the First-Ilk Superparticle, Stony Brook preprint ITP-90-76 (August 1990); F. Eßler, M. Hatsuda, E. Laenen, W. Siegel, J. Yamron, T. Kimura and A. Mikovic, Covariant Quantization of the First-Ilk Superparticle, Stony Brook preprint ITP-90-77 (November 1990).
8. W. Siegel, Nucl. Phys. B 238, 307 (1984); Phys. Lett. B 203, 78 (1988); Nucl. Phys. B 263, 93 (1986); L. Romans, Nucl. Phys. B 281, 639 (1987).
9. E. Witten, Nucl. Phys. B 266, 245 (1986).
10. M. T. Grisaru, H. Nishino and D. Zanon, Phys. Lett. B 206, 625 (1988); Nucl. Phys. B 314, 437 (1989).
11. S. J. Gates, Jr. and H. Nishino, Class. Quant. Grav. 3, 391 (1986).
12. C. F. Doran, M. G. Faux, S. J. Gates, Jr., T. Hübsch, K. M. Iga and G. D. Landweber, On Graph-Theoretic Identifications of Adinkras, Supersymmetry Representations and Superfields, e-print arXiv:math-ph/0512016; C. F. Doran, M. G. Faux, S. J. Gates, Jr., T. Hübsch, K. M. Iga and G. D. Landweber, Adinkras and the Dynamics of Superspace Prepotentials, e-print arXiv:hep-th/0605269.
13. S. J. Gates, Jr. and L. Rana, Phys. Lett. B 352, 50 (1995) [arXiv:hep-th/9504025]; Phys. Lett. B 369, 262 (1996) [arXiv:hep-th/9510151]; S. J. Gates, Jr., W. D. Linch, III, J. Phillips and L. Rana, Grav. Cosmol. 8, 96 (2002) [arXiv:hep-th/0109109]; S. J. Gates, Jr., W. D. Linch, III and J. Phillips, When Superspace is not enough, e-print arXiv:hep-th/0211034; M. Faux and S. J. Gates, Jr., Phys. Rev. D 71, 065002 (2005) [arXiv:hep-th/0408004].
14. P. van Nieuwenhuizen and D. Z. Freedman (eds.), Supergravity (North-Holland, Amsterdam, 1979).
15. Yu. A. Gol'fand and E. P. Likhtman, JETP Lett. 13, 323 (1971); J. Wess and B. Zumino, Nucl. Phys. B 70, 39 (1974).
16. A. Salam and J. A. Strathdee, Phys. Rev. D 11, 1521 (1975); Nucl. Phys. B 86, 224 (1975); Supersymmetry And Superfields, Fortsch. Phys. 26, 57 (1978); S. Ferrara, J. Wess and B. Zumino, Phys. Lett. B 51, 239 (1974).
17. F. A. Berezin, The Method of Second Quantization (Academic Press, New York, 1966).
18. D. I. Kazakov, Supersymmetry in Particle Physics: the Renormalization Group Viewpoint, Phys. Rept. 344, 309 (2001) [arXiv:hep-ph/0001257].
19. J. Wess and J. Bagger, Supersymmetry and Supergravity (Princeton University Press, Princeton, 1992).
20. S. J. Gates, Jr., M. T. Grisaru, M. Rocek and W. Siegel, Superspace (Benjamin Cummings, Reading, Massachusetts, 1983) [arXiv:hep-th/0108200].
21. I. L. Buchbinder and S. M. Kuzenko, Ideas and Methods of Supersymmetry and Supergravity (Institute of Physics Publishing, Bristol and Philadelphia, 1995, Revised Edition 1998).
22. A. Pashnev and F. Toppan, J. Math. Phys. 42, 5257 (2001) [arXiv:hep-th/0010135].
23. S. Okubo, J. Math. Phys. 32(7), 1657 (1991); J. Math. Phys. 32(7), 1669 (1991).
24. S. J. Gates, Jr., Nucl. Phys. B 238, 349 (1984); S. J. Gates, Jr., C. M. Hull and M. Rocek, Nucl. Phys. B 248, 157 (1984).
Parallel Sessions: Group I Theoretical Methods of Modern Classical and Quantum Physics
BOSONIZATION OF THE SCHWINGER MODEL BY NONCOMMUTATIVE CHIRAL BOSONS
J. BEN GELOUN,† J. GOVAERTS*,† and M. N. HOUNKONNOU†
† International Chair in Mathematical Physics and Applications (ICMPA), 072 B.P. 50, Cotonou, Republic of Benin E-mail: jobengeloun@yahoo.fr, [email protected]
* Center for Particle Physics and Phenomenology (CP3), Institute of Nuclear Physics, Catholic University of Louvain, 2, Chemin du Cyclotron, B-1348 Louvain-la-Neuve, Belgium E-mail: Jan.Govaerts@fynu.ucl.ac.be
Bosonization of the Schwinger model with noncommutative chiral bosons is considered on a spacetime of cylinder topology. Using point splitting regularization, manifest gauge invariance is maintained throughout. Physical consequences are discussed.
1. Introduction

Bosonization$^{1}$ proves to be a successful method for quantization in noncommutative field theory.$^{2-5}$ On the other hand, it is well known that the ordinary massless Schwinger model is exactly soluble through the bosonization of the massless fermion, in particular within the physical projector method, which avoids having to perform any gauge fixing procedure.$^{6}$ In this contribution, we investigate a new quantized version of the Schwinger model via a noncommutative chiral bosonization, taking the spacelike dimension to be compactified into a circle, $\mathbb{R} \to S^1$. The quantization considered here is realized in a noncommutative field space instead of a noncommutative spacetime. The Hamiltonian analysis is significantly simplified without loss of a meaningful physical interpretation. The quantized system thereby obtained generalizes the nonperturbative quantization of the model$^{6}$ as an asymptotic $\theta$-quantization, and the quantization rules found by Das et al.$^{2}$ are extended to a nontrivial spacetime topology. Furthermore, we provide a gauge invariant quantized model by the point splitting regularization method using the Wilson line phase factor. The Hamiltonian operator is diagonalized, leading to a successful nonperturbative quantization. Such an abelian gauge theory based on the circle $S^1$ as the space topology is new to the best of our knowledge. Section 2 presents our notations and briefly describes the classical Schwinger model and the relevant quantities of the constrained dynamics. In Sec. 3, we study chiral bosons in the noncommutative field space and develop the nonperturbative quantization of the model based on the bosonization of the fermionic degrees of freedom. The gauge invariant regularization is given in Sec. 4. Finally, some conclusions are presented in Sec. 5.

2. The Classical Schwinger Model

Henceforth, the topology of the 1+1 dimensional spacetime is that of the cylinder $\mathbb{R} \times S^1$, where $\mathbb{R}$ stands for the timelike coordinate and the circle $S^1$ of radius $R$ and circumference $L = 2\pi R$ for the spacelike one. The Minkowski metric is $\eta_{\mu\nu} = \mathrm{diag}(+,-)$, $\mu,\nu = 0,1$. Units such that $\hbar = 1 = c$ are used throughout. The antisymmetric tensor $\epsilon^{\mu\nu}$ is such that $\epsilon^{01} = 1 = -\epsilon^{10}$. In the chiral representation, the Clifford-Dirac algebra $\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}$ is given by $\gamma^0 = \sigma^1$, $\gamma^1 = i\sigma^2$ and $\gamma_5 = \gamma^0\gamma^1 = -\sigma^3$, where $\sigma^i$ ($i = 1,2,3$) are the usual Pauli matrices. The field degrees of freedom of the model are the real U(1) gauge vector field $A_\mu(x^\mu)$ and a single massless Dirac spinor $\psi(x^\mu)$ represented by Grassmann odd variables, describing the fermionic particles. The Dirac spinor decomposes into two complex Weyl spinor representations of opposite chiralities, $\psi(t,x) = \psi_+(t+x) + \psi_-(t-x)$, such that $\gamma_5\psi_\pm = \mp\psi_\pm$. Furthermore, the following choice of periodic and twisted (or anti-periodic) boundary conditions on the circle is assumed, respectively,

$$A_\mu(t, x+L) = A_\mu(t,x), \qquad \psi_\pm(t, x+L) = -e^{2i\pi\alpha_\pm}\,\psi_\pm(t,x), \qquad (1)$$
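The chiral representation quoted in Sec. 2 can be checked directly. The following is a minimal sketch in plain Python, assuming only the standard Pauli matrices and the metric $\eta^{\mu\nu} = \mathrm{diag}(+,-)$; the helper names (`matmul`, `scale`, `clifford_holds`) are illustrative and do not come from the paper.

```python
# Check that gamma^0 = sigma^1, gamma^1 = i sigma^2 satisfy
# {gamma^mu, gamma^nu} = 2 eta^{mu nu} and gamma_5 = gamma^0 gamma^1 = -sigma^3.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]
SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]

ETA = [[1, 0], [0, -1]]                  # Minkowski metric diag(+, -)
GAMMA = [SIGMA1, scale(1j, SIGMA2)]      # gamma^0, gamma^1
GAMMA5 = matmul(GAMMA[0], GAMMA[1])      # gamma_5 = gamma^0 gamma^1

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def clifford_holds():
    """True if {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all mu, nu."""
    return all(
        anticommutator(GAMMA[m], GAMMA[n]) == scale(2 * ETA[m][n], IDENTITY)
        for m in range(2) for n in range(2)
    )

if __name__ == "__main__":
    print("Clifford algebra holds:", clifford_holds())
    print("gamma_5 equals -sigma^3:", GAMMA5 == scale(-1, SIGMA3))
```

Since $\gamma_5 = -\sigma^3$ is diagonal, the two spinor components are chirality eigenstates, in line with the decomposition $\gamma_5\psi_\pm = \mp\psi_\pm$ used in the text.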
$\alpha_\pm$ being arbitrary real parameters defined modulo any integer, such that $\alpha_+ = \alpha_- = \alpha \pmod{\mathbb{Z}}$ in order to ensure parity invariance. The model is described by the Lagrangian density, given in a Lorentz and gauge invariant form as (Einstein's summation convention is implicit)

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\,i\bar\psi\gamma^\mu(\partial_\mu + ieA_\mu)\psi - \frac{1}{2}\,i\left[(\partial_\mu - ieA_\mu)\bar\psi\right]\gamma^\mu\psi, \qquad (2)$$

with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, the gauge field strength corresponding, in 1+1 dimensions, to a single field $F_{01}$ which is the pseudo-scalar electric field $E$; $e$ stands for the gauge coupling constant which, up to a sign, is the charge of the fermionic particle (electron or positron).
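As a short worked step, and assuming only the metric $\eta_{\mu\nu} = \mathrm{diag}(+,-)$ and the identification $F_{01} = E$ stated above, the gauge kinetic term of (2) reduces in 1+1 dimensions to the familiar electric energy density:

```latex
\begin{align*}
F^{01} &= \eta^{00}\eta^{11}F_{01} = -F_{01} = -E, \\
F_{\mu\nu}F^{\mu\nu} &= F_{01}F^{01} + F_{10}F^{10} = 2F_{01}F^{01} = -2E^{2}, \\
-\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} &= \tfrac{1}{2}E^{2}.
\end{align*}
```

This is consistent with a Hamiltonian density whose gauge part is quadratic in the momentum conjugate to $A^1$.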
Gauge invariance of the system implies a constrained dynamics$^{6}$ whose Dirac Hamiltonian treatment leads to a first-class constraint $\Phi$ (Gauss' law) and the first-class Hamiltonian density $\mathcal{H}$ given as follows,

$$\Phi = \partial_1\pi_1 + e\psi^\dagger\psi, \qquad \pi_1 = -E, \qquad (3)$$

$$\mathcal{H} = \frac{1}{2}\pi_1^2 - \frac{1}{2}\,i\psi^\dagger\gamma_5(\partial_1 - ieA^1)\psi + \frac{1}{2}\,i\left[(\partial_1 + ieA^1)\psi^\dagger\right]\gamma_5\psi, \qquad (4)$$

where $\pi_1$ is the momentum conjugate to $A^1$. In this system, $A^0$ turns out to be a Lagrange multiplier for $\Phi$. The fundamental graded Dirac-Poisson algebra, taken at equal time, reads ($a, b = +, -$),

$$\{A^1(t,x), \pi_1(t,y)\} = \delta(x-y), \qquad \{\psi_a(t,x), \psi_b^\dagger(t,y)\} = -i\delta_{ab}\,\delta(x-y). \qquad (5)$$
3. Nonperturbative Deformed Quantization Bosonization of the fermionic sector of the Schwinger model involves the quantum deformed algebra of chiral bosons, cj)+{t,x) and
n v n = l, V
L
-j-P±(±x)
„ e -^"(±x) +a ^ e ^n( ± *)jl
(6)
Representing a noncommutative field space, the deformed Dirac algebra of the chiral bosons proposed by Das et al.2 induces the following quantum mode algebras corresponding to the chiral fields
[a 0 ,„,a 6>m ] = -eab96ntTn
= - [at„,a^m ,
(7)
with a, b = +,-, 9 a real parameter, and Aa& = eabO + 6ab- Furthermore, we assume the hermiticity conditions for the zero modes qa = q\ and pa = p\. In terms of these bosonic modes, the fermionic operator ipa is written as ipa(z) — F(z)Va(z) with Va the vertex operator defined as Va = K{a) : e"i0('Po-aO
the bosonization process. The vertex operators Va and V£ are written as Laurent series as oo
V±(Z)=
Yl
i>±,nZ-(n^a±)-K
(8)
n— — oo oo
Vl(z)= Y, A n ^ ^ - ^
(9)
n=—oo
From (8) and (9), we deduce, by a complex contour integral around the origin z = 0, the expressions for the mode operators ip±,n and V'i „ as "4>±,n Jc %l7r ' Jc 2 j 7 r By well established conformal field theory techniques, 7 the fermionic mode anticommutators are evaluated. One has,
with a, b = +, - , N = n — a{\ + aa) and M = m — b{\ + at). We have used the Wick theorem in (10). Integration over UJ stands for an integration depending on z, while the last integral over z is taken around the origin, z = 0. Using : eA :: eB := e
/ 0(|z|-|w|)(
7
i N/3 2 (i+e 2 ) ec b ^Tg-z~ ° )
_9(| w | - | * I) ( T ^7-we£-fc J Vl ' ' " V(w-2),5«'> J x
a/9 e
""T( 1 +« 2 )*.(-»)
e-a/J,-.
? (i+«a)*a(_„
. g t f (PaP-„-ptp-b)+i/3(a(¥)a-afl¥)_0)(z)-ii(¥)b-fcSvJ_b)(w))
.
/-QN
where ©(•) denotes the Heaviside step function. The different parameters are fixed by 0 = ±\{l + 62)-h,
Pa=P-a
= p(l + 02rh,
With A2 = 1 = p2. (12)
The induced singular part of the short distance operator , as z ->• u, reads i J ( V 0 ( z ) ^ H ) = r ^ - r + --(Z-(J)
(13)
so that, by the residue theorem and (13), we get relation (14), which reproduces the well known anticommutator of the corresponding fermionic operators. In the same manner, given (12), evaluating the other radial ordered products $R(V_a^\dagger(z)V_b(\omega))$, $R(V_a(z)V_b(\omega))$ and $R(V_a^\dagger(z)V_b^\dagger(\omega))$ and then integrating the corresponding fermionic algebra, the remaining fermionic mode anticommutators are recovered as follows,

$$[\psi^\dagger_{a,n}, \psi_{b,m}]_+ = \delta_{ab}\delta_{nm}, \qquad [\psi_{a,n}, \psi_{b,m}]_+ = 0, \qquad [\psi^\dagger_{a,n}, \psi^\dagger_{b,m}]_+ = 0. \qquad (15)$$

Indeed, as $z \to \omega$, the only singularity such as the one in (13) with a nonvanishing residue occurs for $R(V_a^\dagger(z)V_b(\omega))$. We complete this study by giving the bosonized fermionic field (the $\theta$-Mattis-Mandelstam formula),
n=l X
JJeWir^' n=l
The holonomy boundary condition (h.b.c.) of the chiral bosons should also be related to $\alpha_\pm$, namely the h.b.c. of the fermionic operator $\psi_\pm$, through the deformed Heisenberg algebra on the circle. This relation has still to be understood (this work is in progress). Finally, the bosonic fields $\phi_\pm$ and their quantum states also provide a representation of the fermionic algebra irrespective of the choices $\rho = \pm 1$ and $\lambda = \pm 1$.
4. Gauge Invariant Regularization

The first-class quantities (3) and (4) are composite operators which require a choice of operator ordering which should remain anomaly free and gauge invariant. Since gauge invariance must be preserved at all steps, a gauge invariant point splitting regularization of short distance singularities is a relevant choice.$^{6}$ For instance, instead of the products $\psi_\pm^\dagger(x)\psi_\pm(x)$, one considers the gauge invariant quantities

$$\lim_{y\to x}\ \psi_\pm^\dagger(y)\, e^{ie\int_x^y du\, A^1(u)}\, \psi_\pm(x)$$

and subtracts the divergent terms. Through this procedure, the fermionic currents in the bosonized representation are given by
4 V ± = - T - 7 = f [d^± =F 6
(16)
27TVTT02
so that the vector and axial gauge invariant currents are expressed as, respectively, Z7TV1 + P 7T
.(17)
2Aevl + #2
Denoting the regularized current components by J± = ip±ip±, the two coupled f/(l) Kac-Moody algebras 2 [ja, Jb] = (aSab + eab9)^dxS(x-y) for the unmixed and mixed commutators are still valid. The gauge contribution is indeed null. Furthermore, we have the regularized vector and axial charges given by
Q = - ^YTW ^+~p-+ = --
Q
6{p+ + p ) ]
- '
[ dxA1 + — L j f b+ +P- - 0(P+ -P-)] •
(18)
7T Jo V l + 02 One notices that the quantum axial anomaly of the axial charge, due to the zero mode of the gauge potential A 1 , remains as in the commutative case. The regularized first-class constraint reads $
- o ^ 2 & &+ + V") + 6dl &+ ~ V-)] • <19) 27TV1 + & The bilinears involving the covariant derivatives, namely, ip±Diip± and Diip±ip±, are also regularized by the point splitting method. Setting Ditp — (di — ieA1)^ and D\^ — (d\ + ieA1)^, we obtain the fermionic contributions to the first-class Hamiltonian =
9l?ri
H± = l-i>lD^± = ±47r(11+g2)(gi(^T^T)TAevTT^A1)2TI72^,
l
-D^± (20)
showing that the deformation has not modified the Casimir energy $\pi/(12L^2)$.
179 The first-class gauge invariant Hamiltonian density reads U = \vlx + 2 2 4TT(1 + 0 ) + (diiy-
(di(
62Aiy
+ 8
(21)
Introducing the change of variables
? = f«u (22) *r=n{(A*-
jxfepdy
[{V+ - tp-) - 8(V+ +*-)]},
with fj, = | e l/v^j the first-class Hamiltonian density (21) can be expressed using the new degrees of freedom (T,n?,<pa,diipa) as
*=^
+ i(W +
i^
+
i(i) 2
+
(i)a,r.
(23)
Finally, given Aaj, = eat,6 + <5a&, we have the bosonic algebra [T(x),Trr(y)] = iS(x-y),
[<pa,di(pb] = 2ibirAabS(x - y), (24)
[Va^rM^-^VTTPSix-y).

5. Concluding Remarks

In this work, we have generalized to the cylinder spacetime topology $\mathbb{R} \times S^1$ and to the massless Schwinger model the noncommutative bosonization induced by a deformed Dirac algebra of noncommutative chiral bosons. The usual quantum theory is recovered as $\theta \to 0$. This quantum model, regularized by the gauge invariant point splitting method, induces a gauge invariant dynamics equivalent to that of an electric field of mass $\mu = |e|/\sqrt{\pi}$, as in the ordinary commutative case. Furthermore, due to the choice of the chiral boson algebra, a charge rescaling $e \to e\sqrt{1+\theta^2}$ appears in the expressions. This fact has still to be related to the equivalence found by Das et al.,$^{2}$ which states that noncommutative chiral bosons are equivalent to free fermionic fields moving with a speed equal to $c' = c\sqrt{1+\theta^2}$, $c$ being the speed of light in vacuum. We expect that, in a forthcoming investigation, a stronger deformation involving the nonzero mode algebra of the chiral fields would lead to a noncommutative spinor field theory.
Acknowledgments

J. B. G. is grateful to the Abdus Salam International Centre for Theoretical Physics (ICTP, Trieste, Italy) for a Ph.D. fellowship under the grant Prj-15. J. G. acknowledges the Abdus Salam ICTP Visiting Scholar Programme in support of a Visiting Professorship at the International Chair in Mathematical Physics and Applications (ICMPA). The work of J. G. is partially supported by the Belgian Federal Office for Scientific, Technical and Cultural Affairs through the Interuniversity Attraction Pole (IAP) P5/27.
References
1. The literature on bosonization is extensive, see for instance, D. Mattis, J. Math. Phys. 15, 609 (1974); S. Coleman, Phys. Rev. D 11, 2088 (1975).
2. A. Das, J. Gamboa, F. Mendez and J. Lopez-Sarrion, JHEP 05, 022 (2004).
3. S. Ghosh, Phys. Lett. B 563, 112 (2003).
4. O. Lechtenfeld, L. Mazzanti, S. Penati and L. Tamassia, Nucl. Phys. B 705, 477 (2005).
5. For a review on Noncommutative Field Theory, see, M. R. Douglas and N. A. Nekrasov, Rev. Mod. Phys. 73, 977 (2001).
6. G. Y. H. Avossevou and J. Govaerts, Proceedings of the Second International Workshop on Contemporary Problems in Mathematical Physics, Cotonou, Benin, eds. J. Govaerts, M. N. Hounkonnou and A. Z. Msezane (World Scientific, Singapore, 2002), pp. 374-394.
7. P. Ginsparg, Applied Conformal Field Theory, Lectures at the Les Houches Summer Session, June 28-August 5, 1988, in Fields, Strings and Critical Phenomena (Les Houches, Session XLIX, 1988), eds. E. Brezin and J. Zinn-Justin (North Holland, Amsterdam, 1990); e-print arXiv:hep-th/9108028.
BOL LOOPS AS A NEW APPROACH IN PHYSICS
THOMAS B. BOUETOU
Département de Mathématiques, École Nationale Supérieure Polytechnique de Yaoundé, B.P. 8390 Yaoundé, République du Cameroun E-mail: [email protected]
A short review of Bol algebras and the suggestion of their relevance to physics is given. The classification of trivial Bol algebras is obtained.
Keywords: Bol algebras and loops; Lie groups and algebras; pseudo-derivation; 3-Webs.
1. Introduction

A survey of geometry and physics or applied sciences stimulates interest in other types of algebraic structures, namely more general nonassociative, non-Lie algebras, in particular Bol loops and the tangential structures deriving from their infinitesimal consideration. The first attempts in this direction were made by Malcev in 1955, by generalizing the structures of Lie groups. Hence such algebras are now named after him. Later on a systematic study was carried out by Akivis, Kikkawa, Sabinin and others.$^{1-8}$ Nowadays some applications are available. We would like to point out that Bol algebras are the proper infinitesimal objects for smooth Bol loops, while any symmetric space, for example, is a smooth Bol loop. It is also noted that Bol loops and Bol algebras are significant in mathematical physics: for example, they may be regarded as an extension of the symmetry concept, the preferable fundamental field of action being the deep inner structures of matter, the interior of the nucleon or the Planck region. Motivations for the possible relevance of nonassociativity have been the physical concepts of curvature, anomalies, chiral anomalies and spontaneous symmetry breaking. This paper is organized as follows. Section 2 provides a brief review of Bol algebras. In Sec. 3 we give a classification of Bol algebras with trivial trilinear operations, and the corresponding 3-Webs are derived.
2. Definitions and Basic Results

Definition 2.1.$^{9}$ Any vector space $V$ over a field of characteristic 0 with the operations

$$\xi, \eta \mapsto \xi\cdot\eta \in V, \qquad \xi, \eta, \zeta \mapsto (\xi,\eta,\zeta) \in V, \qquad (1)$$

where $\xi, \eta, \zeta \in V$, and verifying the identities

$$\xi\cdot\xi = 0, \qquad (\xi,\xi,\zeta) = 0, \qquad (\xi,\eta,\zeta) + (\eta,\zeta,\xi) + (\zeta,\xi,\eta) = 0,$$
$$(\xi,\eta,\zeta)\cdot\chi - (\xi,\eta,\chi)\cdot\zeta + (\zeta,\chi,\xi\cdot\eta) - (\xi,\eta,\zeta\cdot\chi) + (\xi\cdot\eta)\cdot(\zeta\cdot\chi) = 0,$$
$$(\xi,\eta,(\zeta,\chi,\omega)) = ((\xi,\eta,\zeta),\chi,\omega) + (\zeta,(\xi,\eta,\chi),\omega) + (\zeta,\chi,(\xi,\eta,\omega)), \qquad (2)$$

is called a Bol algebra.

We note that any Bol algebra can be realized as the tangent algebra to a Bol loop with the left Bol identity, and that Bol algebras allow embedding in Lie algebras. This can be expressed in the following way. Let $(G, \Delta, e)$ be a local Lie group, $H$ one of its subgroups, and let us denote the corresponding Lie algebra and subalgebra by $\mathfrak{G}$ and $\mathfrak{h}$, respectively. Consider a vector subspace $\mathfrak{B}$ such that $\mathfrak{G} = \mathfrak{h} \dotplus \mathfrak{B}$. Let $\Pi : G \to G\backslash H$ be the canonical projection and let $\Psi$ be the restriction of the composition $\Pi\circ\exp$ to $\mathfrak{B}$. Then there exists a neighborhood $U$ of the point $O$ in $\mathfrak{B}$ such that $\Psi$ maps it diffeomorphically onto the neighborhood $\Psi(U)$ of the coset $\Pi(e)$ in $G\backslash H$,

$$\begin{array}{ccc} \mathfrak{G} & \longleftarrow & \mathfrak{B} \\ {\scriptstyle\exp}\downarrow & & \downarrow{\scriptstyle\Psi} \\ G & \stackrel{\Pi}{\longrightarrow} & G\backslash H \end{array} \qquad (3)$$

and one introduces a local composition law

$$a * b = \Pi_B(a\,\Delta\, b) \qquad (4)$$

on points of the local cross-section $B = \exp U$ of left cosets of $G$ mod $H$, where $\Pi_B = \exp\circ\Psi^{-1}\circ\Pi : G \to B$ is the local projection on $B$ parallel to $H$, which puts every element $g \in G$ in correspondence with the element $a \in B$ such that $g = a\,\Delta\, p$, with $p \in H$. We emphasize that if any two local analytic Bol loops are isomorphic then their corresponding Bol algebras are isomorphic.$^{10}$
Definition 2.2.$^{11}$ The linear endomorphism $\Pi$ of the binary-ternary algebra $\mathfrak{B}$ with the composition laws $X\cdot Y$ and $(X;Y,W)$ will be called a pseudo-derivation with the component $Z \in \mathfrak{B}$ if for all $X, Y, W \in \mathfrak{B}$,

$$\Pi(X\cdot Y) = (\Pi X)\cdot Y + X\cdot(\Pi Y) + (Z;X,Y) + (X\cdot Y)\cdot Z,$$
$$\Pi(X;Y,W) = (\Pi X;Y,W) + (X;\Pi Y,W) + (X;Y,\Pi W). \qquad (5)$$

The pseudo-derivations of a Bol algebra $\mathfrak{B}$ form a Lie algebra with respect to the operations of sum, multiplication by a scalar, and the Lie commutator $[\Pi,\tilde\Pi] = \Pi\tilde\Pi - \tilde\Pi\Pi$. For Bol algebras, one can verify that if $\Pi$ and $\tilde\Pi$ have $Z$ and $\tilde Z$ as a component, respectively, then $\Pi + \tilde\Pi$ and $\lambda\Pi$ have $Z + \tilde Z$ and $\lambda Z$, correspondingly, as a component, while $[\Pi,\tilde\Pi]$ has for a component $Z\cdot\tilde Z + \Pi\tilde Z - \tilde\Pi Z$. We denote the algebra formed by the pseudo-derivations as $\mathrm{pder}\,\mathfrak{B}$. Using this notation the definition of a Bol algebra may be re-expressed as follows.

Definition 2.3.$^{11}$ The binary-ternary $\mathbb{R}$-algebra $\mathfrak{B}$ with the bilinear $(\cdot)$ and trilinear $(-;-,-)$ operations is called a Bol algebra if for all elements $X, Y, Z \in \mathfrak{B}$,
• $X\cdot X = 0$,
• $(X;Y,Y) = 0$,
• $(X;Y,Z) + (Z;X,Y) + (Y;Z,X) = 0$;
• the endomorphism $D_{X,Y} : Z \to (Z;X,Y)$ is a pseudo-derivation with the component $X\cdot Y$.
In Sabinin's work and that of his students it is shown that the set PdertB = {(n> a ); n C pderft, a, component of Yl} i s a Lie algebra under a proper operation. This notion of pseudo-derivation helps to define the enveloping Lie algebra for a Bol algebra. In Ref. 12 it is proved that for any Bol algebra 03 with the operations X • Y and (Z; X, Y) there is a pair of algebras (©, h) such that © = 03 + h, and [03, [03,03]] C 03, [03,03] n 03 = {0} with X • Y = proj.X,Y,X • Y)] where 2» x > y is the linear endomorphism from Definition 2.3. In the present investigation, we will consider a right Bol loop (B,-,e) with a two-sided neutral and a left division. As is known, left division also
exists. Furthermore, the following identity holds,

$$x\cdot[(a\cdot y)\cdot a] = [(x\cdot a)\cdot y]\cdot a. \qquad (6)$$
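To make identity (6) concrete, the sketch below checks it numerically on an example that is not taken from the paper: $2\times 2$ symmetric positive-definite matrices with the product $x\cdot y = (y x^2 y)^{1/2}$, a mirrored form of the standard Bruck loop product. Both the choice of loop and the helper names are assumptions made for illustration only. The same sample also shows that this product is not associative, so (6) is a genuine weakening of associativity.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spd_sqrt(m):
    """Principal square root of a 2x2 symmetric positive-definite matrix.

    Closed form from Cayley-Hamilton: sqrt(M) = (M + s I) / t,
    with s = sqrt(det M) and t = sqrt(tr M + 2 s).
    """
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

def dot(x, y):
    """Illustrative loop product x . y = (y x^2 y)^(1/2) (an assumption)."""
    return spd_sqrt(matmul(y, matmul(matmul(x, x), y)))

def max_abs_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

def bol_defect(x, a, y):
    """Largest entry of x.[(a.y).a] - [(x.a).y].a, cf. identity (6)."""
    lhs = dot(x, dot(dot(a, y), a))
    rhs = dot(dot(dot(x, a), y), a)
    return max_abs_diff(lhs, rhs)

def assoc_defect(x, a, y):
    """Largest entry of (x.a).y - x.(a.y); nonzero means non-associative."""
    return max_abs_diff(dot(dot(x, a), y), dot(x, dot(a, y)))

if __name__ == "__main__":
    x = [[2.0, 1.0], [1.0, 3.0]]
    a = [[1.0, 0.5], [0.5, 2.0]]
    y = [[2.0, -1.0], [-1.0, 2.0]]
    print("Bol defect:", bol_defect(x, a, y))
    print("associativity defect:", assoc_defect(x, a, y))
```

The Bol defect is zero up to floating-point rounding, whereas the associativity defect is not.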
We differentiate (6) with respect to a and set a = e, to obtain the following differential equation of the left Bol loop, -A0(x)?fj$-
+ [Afo) + o £ ( y ) ] ^ £ = Ai(xy),
(7)
where A
r\Z)
~ L gar
Ja=e,
<\z)
~ [ g^
Ja=e-
(»)
We next introduce in short the following
-y?{T) = ±[A!?(z) + a»(z)],
(9)
and note that since e • z — z • e = z, AZ(e)a = a»(£)=7?(e)=6?,
(10)
6£ being the Kronecker symbol. We then obtain from continuity of the inverse matrices A%, a%(z), j£(z) in the neighborhood of e. By introducing as follows the differential operators AT and 7 r , (ATf)(x,y)=A°(x)?f^±,
(11)
hrf)(x,y)=^(y)^^-,
(12)
we can rewrite (7) as (-A„ + 21
= Ax(x-y).
(13)
Deriving now the differential consequences of (7) from this identity, let us consider the relations [-A,
+ 2 7(7 , -Ax + 2 7 A] = [Aa, Ax] + 4[7
(14)
[-A„. + 2 7 M , [-A, + 2lo, -Ax + 27A]] = -[A^, [A^Ax]} + 8[ 7 M , [7*, 7A]](15) These identities may be shown to be valid, by means of the definition of the Lie bracket [A,B]f = A(Bf) - B(Af) and from the fact that [AT,jx]f = Ardxf) — "(x{ATf) = 0. In order to be able to permute differentiations,
185 since AT and j x depend on various variables, we need to assume that these functions are at least of class C2. By applying (x • y)a to both sides of (14), one obtains + 21
( - A r + 2 7 . ) A \ ( z • V)~ i-Ax + 2-r\)AZ{x • y) (17) = [Aa,Ax](x-yy
+ 4[1
and by taking into account (11), dvAax{x • y){-A„ + 2la)(x
• yY- dvA%{x • y){-Ax
+ 2lx){x
• y)v
{A
= Using once again (11) one has, Al{x-y)dvAt{x-y)-Al{x-y)dl/AZ{x-y) _ Defining another operator AT as
=
[Aa)Ax]{x-yr+A[1
(AT/)(*-y) = A , ( * - y ) ^ p
(20)
one can represent (19) in the following form, [ A T , M i x • y)a = [A„,Ax](x • y)a + 4[^,lx}(x
• y)a.
(21)
Hence by using (15), (16) and (21) one obtains, [ - A , + 27M, M O - + 27a, -Ax + 2lx]](x • y)a = ( - A . + 2 7 M ) ( [ - A T + 27., -Ax + 27A](a; • y)a) - [ - A r + 2la, -Ax + 2 7 A ] ( - A i + 2l»)(x • y)a = [Ax, K , Ax]}(x • y)a + 8[7M, h„lx]](x
(22)
• y)a-
Furthermore, one has (-A> + 2 7 M ) { [ A T ,
AT](Z
• y)a} - ([Ar, Ax] + 4[ 7<7)7A ])A2(z • y) (23)
= - [ A , , [Ar, Ax]](x • y)a + 8 [ 7 M , [7
as well as, A;
• [i;,
A~X\(X • y)a
- (37, A\]A;(X
• y)a
(24) = -[A,,, [Aa,Ax]](x
• y)a + 8[ 7/1) h<,,7x]}(x • y)<*.
Finally, we have [A^,[AZ,Ali]}(x-y)a
= -[A^^a,Ax}}(x-y)a+8[^,[7a,jx}](x-y)a.
(25)
We also note that [Aa,Ax} = CTaX-AT, [Ap, [A„,AX]} = pT^aX • AT,
[7
(26)
[7„, [7
(27)
where CTaX and pT^
• y) = Clx{x)A»T{x)d-^^
+ 4£ A (y) 7T *(y) x
^ | - ^ . (28)
As a consequence, the following theorem holds. Theorem 2.I. 1 1 The law of composition of a right C 3 -smooth Bol loop (Q,-,e),x • y =
= T^x)9-^,
(29)
J y=e
187 T h e o r e m 2.2. 1 1 Given the initial conditions A^e) — 5^, [A\,Ap]"(e) = avX(3, the system of differential equations [A^, [A\, Ap]] = R^X/3A^, where A^ (/j = l , . . . , n ) are vector fields (identified, if necessary, with linear differential operators) and BF \pAa are given constants, has an unique solution if and only if R"li,\P
=
~^,0X'
-^ff,A/x ' Rv,a0 -^7,o-A a a/i
—
=
RJL,
Ra,a\aI>ii
' Rl,a0
+
+
K\,l*(r
+ ^a,Xf
~ ^i,aXaav
=
^l,X^
®>
' ^v,er0 + ^/3,AM ' ^v,a
~ ^l,afiaaX
~ aZfia^aaX
=
(30) 0-
3. Bol Algebras with Trivial Trilinear Operations

Let $\mathfrak{M}$ be a Bol algebra with a trivial trilinear operation.$^{13}$ The structure of $\mathfrak{M}$ will be reduced to the representation of an anticommutative bilinear multiplication $(\cdot) : \mathfrak{M}\times\mathfrak{M} \to \mathfrak{M}$ such that

$$(X\cdot Y)\cdot(Z\cdot U) = 0, \qquad \forall\, X, Y, Z, U \in \mathfrak{M}. \qquad (31)$$

The possible cases are:
(1) $\xi\cdot\eta = 0$ (abelian case);
(2) $(\xi\cdot\eta)\cdot\zeta = 0$, but the algebra $\mathfrak{M}$ not being abelian (though an anticommutative 2-nilpotent algebra);
(3) $(\xi\cdot\eta)\cdot(\zeta\cdot\kappa) = 0$, but the algebra $\mathfrak{M}(\cdot)$ is not a 2-nilpotent algebra. In particular, $\mathfrak{M}(\cdot)$ can be an anticommutative 3-nilpotent algebra. Hence

$$((\xi\cdot\eta)\cdot\zeta)\cdot\kappa = 0, \qquad \forall\, \xi, \eta, \zeta, \kappa \in \mathfrak{M}. \qquad (32)$$

In this investigation we are considering the question of how many operators meeting the conditions in (6)-(8) do exist (up to isomorphisms) in the case of a 3-dimensional algebra $\mathfrak{M}$. We will consider (up to isomorphisms) the unique case of the abelian algebra denoted Type .1 and hence investigate Case 2 hereafter.

Case 2. Let us denote the subspace $\mathfrak{M}\cdot\mathfrak{M}$ in $\mathfrak{M}$ by $V$. Then $V\cdot\mathfrak{M} = 0$ and $V \neq \mathfrak{M}$, $V \neq 0$. The following variants can be obtained.

• 2.a. $\dim V = 1$. Then consider that $\mathfrak{M} = \langle e_1, e_2, e_3\rangle$ and $V = \langle e_1\rangle$, with

$$e_1\cdot e_2 = e_1\cdot e_3 = 0, \qquad e_2\cdot e_3 = \alpha e_1, \qquad \alpha \neq 0. \qquad (33)$$

By adjusting the basis $e_1, e_2, e_3$, one can set

$$e_2\cdot e_3 = e_1. \qquad (34)$$

The thereby obtained algebra will be referred to as a Type .2 algebra.

• 2.b. $\dim V = 2$. Then $\mathfrak{M} = \langle e_1, e_2, e_3\rangle$, $V = \langle e_1, e_2\rangle$, and $e_1\cdot e_2 = e_1\cdot e_3 = 0$, $e_2\cdot e_3 = 0$. These relations are in contradiction with the condition $\mathfrak{M}\cdot\mathfrak{M} = V$. Hence there exists, up to isomorphisms, only one Bol algebra of case 2.

Now we move on to the examination of a Bol algebra in the case 3. Let us assume that $\mathfrak{M}\cdot\mathfrak{M} = V \subset \mathfrak{M}$, so that $V\cdot V = \{0\}$ but $V\cdot\mathfrak{M} \neq \{0\}$, $V \neq \mathfrak{M}$, $V \neq \{0\}$.

• 3.a. Let $\dim V = 1$ and $V = \langle e_1\rangle$, where $e_1, e_2, e_3$ are basis vectors in $\mathfrak{M}$. Then,

$$e_1\cdot e_2 = \alpha e_1, \qquad e_1\cdot e_3 = \beta e_1, \qquad e_2\cdot e_3 = \gamma e_1, \qquad \alpha^2 + \beta^2 \neq 0. \qquad (35)$$

Without loss of generality, one can consider $\beta \neq 0$ by changing the basis $e_1, e_2, e_3$. The defining relations of the anticommutative algebra $\mathfrak{M}$ can then be reduced to

$$e_1\cdot e_2 = 0, \qquad e_1\cdot e_3 = e_1, \qquad e_2\cdot e_3 = \epsilon e_1, \qquad \text{where } \epsilon = 1, 0. \qquad (36)$$

Let us denote the thereby defined algebras as $\mathfrak{M}_0$ and $\mathfrak{M}_1$, respectively, and examine their possible isomorphism. We note that

$$\mathfrak{M}_0 = \{0\}, \qquad \mathfrak{M}_1 = \{e_1\}, \qquad (37)$$

which implies that their centers are different. Hence the algebras $\mathfrak{M}_0$ and $\mathfrak{M}_1$ are not isomorphic. We will denote them as Type .3 and Type .4, respectively.

• 3.b. Let $\dim V = 2$, and let us consider $e_1, e_2, e_3$, the basis in $\mathfrak{M}$ such that $V = \langle e_1, e_2\rangle$. Thus,

$$e_1\cdot e_2 = 0, \qquad (38)$$

and

$$e_1\cdot e_3,\ e_2\cdot e_3 \in V. \qquad (39)$$

Deforming $e_3$ to $e_3' = te_3 + ve_2 + ue_1$, where $t \neq 0$, $u, v \in \mathbb{R}$, we obtain

$$e_1\cdot e_3' = e_1\cdot(te_3 + ve_2 + ue_1) = t\,e_1\cdot e_3, \qquad e_2\cdot e_3' = e_2\cdot(te_3 + ve_2 + ue_1) = t\,e_2\cdot e_3. \qquad (40)$$

We may consider that the transformation $\Phi = \mathrm{ad}\,e_3|_V : V \to V$, $v \mapsto v\cdot e_3$ (41), is defined up to multiplication by a scalar. Through a choice of the basis $e_1, e_2$ in $V$ and of the vector $e_3$, we can reduce $\Phi$ to one of two normal forms (42), where $\alpha \neq 0$ and $\mu \in \mathbb{R}$. Correspondingly, we have

$$e_1\cdot e_3 = e_1, \qquad e_2\cdot e_3 = \mu e_2, \qquad (43)$$

or

$$e_1\cdot e_3 = e_1 + \alpha e_2, \qquad e_2\cdot e_3 = e_2. \qquad (44)$$

Let us examine the first case. In that case $\mu \neq 0$ (otherwise there is a contradiction with the assumption $\dim V = 2$), and by adjusting $e_2$ to $e_2' = \mu e_2$ we obtain

$$e_1\cdot e_3 = e_1, \qquad e_2\cdot e_3 = e_2 \qquad (45)$$

(algebra of Type .5). In the second case the change $(1/\alpha)e_1 \to e_1$ reduces the defining relations of the algebra $\mathfrak{M}$ to

$$e_1\cdot e_3 = e_1 + e_2, \qquad e_2\cdot e_3 = e_2 \qquad (46)$$
(algebra of Type .6). The algebras 9JI of Type .5 and Type .6 are obtained by extending 1-dimensional Abelian algebras by means of the 2-dimensional Abelian algebra V = 9JI-3JI = < e\,e2 >. Hence for a £ 9JI \ V the structure of the operator ad\V is, 'a 0'
01
, ,
a ^ O (algebras of Type .5);
(47)
],
a^O
(48)
(algebras of Type .6).
For this very reason, algebras of Types .5 and .6 are not isomorphic.
As a result we have obtained the following theorem.

Theorem 3.1. Up to isomorphisms, there exist 6 Bol algebras, with a trivial trilinear operation, defined as follows:
• Type .1: trivial bilinear operation;
• Type .2: $e_2\cdot e_3 = e_1$;
• Type .3: $e_1\cdot e_3 = e_1$;
• Type .4: $e_2\cdot e_3 = e_1$, $e_1\cdot e_3 = e_1$;
• Type .5: $e_2\cdot e_3 = e_2$, $e_1\cdot e_3 = e_1$;
• Type .6: $e_2\cdot e_3 = e_2$, $e_1\cdot e_3 = e_1 + e_2$.
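The six multiplication tables of Theorem 3.1 can be checked mechanically against condition (31). The sketch below is an illustration rather than part of the paper: it stores each table as structure constants on the basis $e_1, e_2, e_3$ (0-based indices in the code), extends the product bilinearly, and tests anticommutativity, condition (31) and 2-nilpotency on basis vectors, which suffices by multilinearity. All function and variable names are hypothetical.

```python
from itertools import product as cartesian

# Structure constants: (i, j) -> coordinates of e_i . e_j, with 0-based indices,
# so (1, 2): (1, 0, 0) encodes e_2 . e_3 = e_1. Missing pairs multiply to zero.
TYPES = {
    ".1": {},
    ".2": {(1, 2): (1, 0, 0)},
    ".3": {(0, 2): (1, 0, 0)},
    ".4": {(1, 2): (1, 0, 0), (0, 2): (1, 0, 0)},
    ".5": {(1, 2): (0, 1, 0), (0, 2): (1, 0, 0)},
    ".6": {(1, 2): (0, 1, 0), (0, 2): (1, 1, 0)},
}

BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
ZERO = [0, 0, 0]

def make_product(rules):
    """Build the bilinear, anticommutative product from structure constants."""
    table = {}
    for (i, j), vec in rules.items():
        table[(i, j)] = list(vec)
        table[(j, i)] = [-c for c in vec]   # e_j . e_i = -(e_i . e_j)

    def mul(x, y):
        out = [0, 0, 0]
        for i in range(3):
            for j in range(3):
                coeff = x[i] * y[j]
                if coeff and (i, j) in table:
                    for k in range(3):
                        out[k] += coeff * table[(i, j)][k]
        return out

    return mul

def is_anticommutative(mul):
    return all(mul(x, y) == [-c for c in mul(y, x)]
               for x in BASIS for y in BASIS)

def satisfies_31(mul):
    """Condition (31): (X.Y).(Z.U) = 0 on all basis quadruples."""
    return all(mul(mul(x, y), mul(z, u)) == ZERO
               for x, y, z, u in cartesian(BASIS, repeat=4))

def is_two_nilpotent(mul):
    """(X.Y).Z = 0 on all basis triples."""
    return all(mul(mul(x, y), z) == ZERO
               for x, y, z in cartesian(BASIS, repeat=3))

if __name__ == "__main__":
    for name, rules in TYPES.items():
        mul = make_product(rules)
        print("Type", name,
              "| anticommutative:", is_anticommutative(mul),
              "| (31):", satisfies_31(mul),
              "| 2-nilpotent:", is_two_nilpotent(mul))
```

Under this encoding, Types .1 and .2 come out 2-nilpotent (Cases 1 and 2 of the text), while Types .3 to .6 satisfy (31) without being 2-nilpotent (Case 3).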
We note that the enveloping Lie algebras of the Bol algebras of Types .3 and .4 are 4-dimensional, but that these Bol algebras are not isotopic by their structure. The enveloping Lie algebras of Bol algebras of Types .5 and .6 are identical, but these Bol algebras are not isotopic by definition. Hereafter, we give the description of 3-Webs corresponding to the defined Bol algebras.

• Type .1. Bol algebra $\mathfrak{B}$ with trivial trilinear and bilinear operation. Here we obtain grouped 3-Webs (Abelian group $\langle\mathbb{R}^3, +, 0\rangle$).

• Type .2. Bol algebras $\mathfrak{B}$ with bilinear anticommutative operation,

$$e_2\cdot e_3 = e_1. \qquad (49)$$

We also obtain here a grouped 3-Web (global 3-Web) corresponding to the Lie group which is isomorphic to the group of upper triangular unipotent matrices, that is, matrices of the form

$$\begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix}. \qquad (50)$$

• Type .3. Bol algebras $\mathfrak{B}$ with bilinear operation,

$$e_1\cdot e_3 = e_1, \qquad (51)$$

and trivial trilinear operation, which has a 4-dimensional canonical enveloping Lie algebra

$$\mathfrak{G} = \langle e_1, e_2, e_3, e_4\rangle, \qquad \mathfrak{G} = \mathfrak{B} \dotplus \mathfrak{h}, \qquad \mathfrak{B} = \langle e_1, e_2, e_3\rangle, \qquad \mathfrak{h} = \langle e_4 - e_1\rangle, \qquad (52)$$

and a composition law having the following structure constants,

$$[e_1, e_3] = e_4. \qquad (53)$$
191 The composition law (A) corresponding to the Lie group G is denned as follows, ~x{ Xz
~Vi A 2/2 = 2/3
X4.
L2/4J
X2
x\ +2/1 X2 +2/2 X3 +2/3 ^l)/3-j/1^3 X4 + 2/4 + 2
(54)
Moreover the subgroup i / = exp \] is realized as a collection of elements {expi(e 4 - e i ) } t 6 R = {(expie 4 )(expiei)"" 1 } t e R = {-t,0,0,*}t€R(55) The collection of elements B = exp 2$ = {exp(iei + ue 2 ) -exp?;e3}tiU>t,eK = {t,u,v,t}tiU,veR
(56)
forms a local section of the space of left cosets G mod H. The subgroup H is denned as H = exp f) = {exp a(e 4 - e 3 )}„ e K = {0,0, -a, a}a€U,
(57)
with 5 = {t,u,v,0}. Any element (a;i,a;2,a;3,a;4) from G such that (X3 < — 2) can be uniquely represented in the form -2X1+2X4—X1X3"] 2-X3
2Z4 2-X3
X2
0 0
A
X3
0
2x 4
J
L 2-X3 J
(58) 2X4 2-X3
~x{
n
B
X2
0 0
A
X3
2x4
X\_
L 2-0:3 J
The composition law (*) of local analytic Bol loop B(*) is denned as \ u' A v' V .0. .0. /
/
u
V t + t'
tv'-vt' + 2-(v+v')
u + u' V + V1
(59)
192 The corresponding local Bol 3-Web can be realized in the neighborhood of the point (0,0) in E 6 = {(X, Y), X, Y € E 3 } as a space form by X = constant,
Y = constant,
X * Y = constant.
(60)
• Type .4 Bol algebra 23 with bilinear operation ei • e 3 = ei,e 2 • e 3 = ei,
(61)
and a trivial trilinear operation, which has a 4-dimensional canonical enveloping Lie algebra 0 = < ei,e2,e3,e4 >, (S = 23 4- h, 03 = < ei,e2,e3 >, I) = < e4 — e\ >, and a composition law having the following structural equations [ei,e3]=e4,
[e2,e3]=e4.
(62)
The composition law (A), corresponding to the Lie group G is defined as follows ~Xi
"2/i"
x2 x3
2/2
.x4_
A
2/3
L2/4J
=
xi +2/1 £2 + J/2 2:3 + 2/3 Z4 + 2/4 + ( ' i + ' ^ W - ^ + w ) " .
(63)
Moreover the subgroup H — exp f) is realized as the collection of elements {expi(e 4 - e i ) } 4 e K = {(expie 4 ) • (exptei) _ 1 }t 6 R =
{-t,Q,0,t}teWL. (64)
The collection B = exp 23 = {exp(tei+ue 2 )-expt;e3}t ]U]t , € R = {t,u,v,0}t,u,veR
(65)
forms a local section of the space of left cosets G mod H. Any element (xi ,X2,xz,x±) from G such that (23 > - 2 ) can be uniquely represented in the form
(x{\
2X1+2X4 + X1X3"1 2+X3
X2
X2
X3
Xz
[
2X4
L 2+0:3 J r
~x{ X2 X3 Xi.
0 0
A
0
= Ui
2x4 "1 2+x3
A
2x4 2+x3
0 0 2X4 . 2+13
.
(66)
The composition law ( * ) of the local analytic Bol loop !?(*) is defined as
Y" \ / -?1 u A u1 v' v' V .0. .0. / , l +
,, (t+u)v'-v(t'+u'y t
+
2+(u 2+lv+v') U + U1 V + V1
2t+2t'+tv+tv'+t'v'+t' 2+(v+v')
(67)
v+uv'
-vu'
u + u' V + V1
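The corresponding local Bol 3-Web can be realized in the neighborhood of the point $(0,0)$ in $\mathbb{R}^6 = \{(X,Y),\ X, Y \in \mathbb{R}^3\}$ as a space of second order.

• Type .5. Bol algebra $\mathfrak{B}$ with bilinear operation

$$e_1\cdot e_3 = e_1, \qquad e_2\cdot e_3 = e_2, \qquad (68)$$

and trivial trilinear operation, which has a 5-dimensional canonical enveloping Lie algebra $\mathfrak{G} = \langle e_1, e_2, e_3, e_4, e_5\rangle$, $\mathfrak{G} = \mathfrak{B} \dotplus \mathfrak{h}$, $\mathfrak{B} = \langle e_1, e_2, e_3\rangle$, $\mathfrak{h} = \langle e_4 - e_1, e_5 - e_2\rangle$, and with composition law having the following structural equations

$$[e_1, e_3] = e_4, \qquad [e_2, e_3] = e_5. \qquad (69)$$

The composition law $(\Delta)$ corresponding to the Lie group $G$ is defined as follows

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} \Delta \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \\ x_4 + y_4 + \frac{x_1y_3 - x_3y_1}{2} \\ x_5 + y_5 + \frac{x_2y_3 - x_3y_2}{2} \end{pmatrix}. \qquad (70)$$

Moreover the subgroup $H = \exp\mathfrak{h}$ can be realized as a collection of elements

$$\{\exp t(e_4 - e_1),\ \exp p(e_5 - e_2)\}_{t,p\in\mathbb{R}} = \{(-t, -p, 0, t, p)\}_{t,p\in\mathbb{R}}. \qquad (71)$$

The collection

$$B = \exp\mathfrak{B} = \{\exp(te_1 + ue_2)\cdot\exp ve_3\}_{t,u,v\in\mathbb{R}} = \{(t, u, v, 0, 0)\}_{t,u,v\in\mathbb{R}} \qquad (72)$$

forms a local section of the space of left cosets $G$ mod $H$. Any element $(x_1, x_2, x_3, x_4, x_5)$ from $G$ such that $x_3 > -2$ can be uniquely represented in the form

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} \frac{2x_1 + 2x_4 + x_1x_3}{2 + x_3} \\ \frac{2x_2 + 2x_5 + x_2x_3}{2 + x_3} \\ x_3 \\ 0 \\ 0 \end{pmatrix} \Delta \begin{pmatrix} -\frac{2x_4}{2 + x_3} \\ -\frac{2x_5}{2 + x_3} \\ 0 \\ \frac{2x_4}{2 + x_3} \\ \frac{2x_5}{2 + x_3} \end{pmatrix}, \qquad (73)$$

the first factor being $\Pi_B(x_1, \ldots, x_5)$. The composition law $(*)$ of the local analytic Bol loop $B(*)$ is defined as

$$\begin{pmatrix} t \\ u \\ v \\ 0 \\ 0 \end{pmatrix} * \begin{pmatrix} t' \\ u' \\ v' \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} t + t' + \frac{tv' - vt'}{2 + (v + v')} \\ u + u' + \frac{uv' - vu'}{2 + (v + v')} \\ v + v' \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{2t + 2t' + tv + 2tv' + t'v'}{2 + (v + v')} \\ \frac{2u + 2u' + uv + 2uv' + u'v'}{2 + (v + v')} \\ v + v' \\ 0 \\ 0 \end{pmatrix}. \qquad (74)$$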
The corresponding local Bol 3-Web can be realized in the neighborhood of the point $(0,0)$ in $\mathbb{R}^6 = \{(X,Y),\ X, Y \in \mathbb{R}^3\}$ as a space of second order.

• Type .6. Bol algebra $\mathfrak{B}$ with bilinear operation

$$e_1\cdot e_3 = e_1 + e_2, \qquad e_2\cdot e_3 = e_2, \qquad (75)$$
195 and trivial trilinear operation, which has a 5-dimensional canonical enveloping Lie algebra <S = < ei,e2,e3,e4,e 5 >, 0 = 25 4- h, 23 = < ei,e2,e3 >, f) = < e4 - ei - e 2 ,e 5 - e 2 > , and with composition law having the following structural equations [ei,e 3 ] = e 4 ,
(76)
[e 2 ,e 3 ] = e 5 .
The composition law (A) corresponding to the Lie group G is defined as follows ~x{ X2 X3
an + 2/i
"2/i"
Z2 +2/2
2/2
A
2/3
Xi
2/4
lX5]
L2/5J
(77)
=
2:3 +2/3
x5+y5+X2V3-X3y2 Moreover the subgroup H = exp() can be realized as a collection of elements { e x p i ( e 4 - e i - e 2 ) , e x p p ( e 5 - e 2 } t , p e H = {-t,-t-p,0,t,p}t,p€u.
(78)
The collection
$$B = \exp \mathfrak{B} = \{\exp(t e_1 + u e_2) \cdot \exp v e_3\}_{t,u,v \in \mathbb{R}} = \{t, u, v, 0, 0\}_{t,u,v \in \mathbb{R}} \qquad (79)$$
forms a local section of the space of left cosets $G \bmod H$. Any element $(x_1, x_2, x_3, x_4, x_5)$ from $G$ such that $x_3 > -2$ can be uniquely represented in the form
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} \dfrac{2x_1 + 2x_4 + x_1 x_3}{2 + x_3} \\ \dfrac{x_2 (2 + x_3)^2 + 2x_4 (2 + x_3) + 4x_5 + 2x_3 x_5 - 2x_3 x_4}{(2 + x_3)^2} \\ x_3 \\ 0 \\ 0 \end{pmatrix} \Delta \begin{pmatrix} -\dfrac{2x_4}{2 + x_3} \\ -\dfrac{4x_4 + 4x_5 + 2x_3 x_5}{(2 + x_3)^2} \\ 0 \\ \dfrac{2x_4}{2 + x_3} \\ \dfrac{4x_5 + 2x_3 x_5 - 2x_3 x_4}{(2 + x_3)^2} \end{pmatrix}. \qquad (80)$$
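The factorization (80) can be verified for a concrete element. The following sketch assumes the group law (77) and the subgroup parametrization $\{-t, -t-p, 0, t, p\}$ of (78); `split_type6` is a hypothetical helper name, and the sample element is arbitrary.

```python
from fractions import Fraction

def group_mul(x, y):
    """Group law (77) on 5-tuples."""
    x1, x2, x3, x4, x5 = x
    y1, y2, y3, y4, y5 = y
    return (x1 + y1, x2 + y2, x3 + y3,
            x4 + y4 + (x1 * y3 - x3 * y1) / 2,
            x5 + y5 + (x2 * y3 - x3 * y2) / 2)

def split_type6(x):
    """Section and subgroup factors of (80); valid for x3 > -2."""
    x1, x2, x3, x4, x5 = x
    d = 2 + x3
    a = 2 * x4 / d                                        # parameter t of (78)
    b = (4 * x5 + 2 * x3 * x5 - 2 * x3 * x4) / d ** 2     # parameter p of (78)
    return (x1 + a, x2 + a + b, x3, 0, 0), (-a, -a - b, 0, a, b)

# Arbitrary sample element with x3 > -2 (illustrative values only).
x = tuple(Fraction(n) for n in (1, 2, 1, 3, -1))
section, subgroup = split_type6(x)
print(section, subgroup, group_mul(section, subgroup) == x)
```

Multiplying the two factors back together recovers the original element, as (80) states.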
The composition law ($*$) of the local analytic Bol loop $B(*)$ is defined as
$$\begin{pmatrix} t \\ u \\ v \\ 0 \\ 0 \end{pmatrix} * \begin{pmatrix} t' \\ u' \\ v' \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} t + t' + \dfrac{t v' - v t'}{2 + (v + v')} \\ u + u' + \dfrac{t v' - v t'}{2 + (v + v')} + \dfrac{u v' - v u'}{2 + (v + v')} - \dfrac{(v + v')(t v' - v t')}{(2 + (v + v'))^2} \\ v + v' \\ 0 \\ 0 \end{pmatrix}. \qquad (81)$$
The corresponding local Bol 3-Web can be realized in the neighborhood of the point $(0,0)$ in $\mathbb{R}^6 = \{(X, Y),\ X, Y \in \mathbb{R}^3\}$ as a space of third order.
RANDOM PHASE APPROXIMATION WITH EXCHANGE FOR THE INNER-SHELL ELECTRON PHOTOIONIZATION*

ZHIFAN CHEN and ALFRED Z. MSEZANE

Department of Physics, Clark Atlanta University, 223 J. P. Brawley Dr., Atlanta GA 30314, USA
E-mail: [email protected]
We have developed a random phase approximation with exchange (RPAE) method to study the inner-shell electron excitation and photoionization of an open-shell atom or ion. The equations for the dipole matrix element and Coulomb matrix element are derived using the second quantization method and checked against previous formulas by reducing them to those for the case of a single open shell or a closed shell. A computer code has been written to perform the RPAE calculation of the inner-shell photoionization processes. Our RPAE method has been used to calculate the photoionization cross sections of the $4d$-$\varepsilon f$ transition, the so-called Giant resonance, for the atom I and the ions Xe+ and I+. The ground states and core wave functions of Xe+, I and I+ are obtained through self-consistent Hartree-Fock calculations. The radial functions of the continuum electron are determined by solving the linear HF equations. The following states for Xe+ and I are included in the calculations: $4d^95s^25p^5(^1P)\varepsilon f(^2D)$, $4d^95s^25p^5(^1D)\varepsilon f(^2P,{}^2D)$, $4d^95s^25p^5(^1F)\varepsilon f(^2S,{}^2P,{}^2D)$, $4d^95s^25p^5(^3P)\varepsilon f(^2D)$, $4d^95s^25p^5(^3D)\varepsilon f(^2P,{}^2D)$, $4d^95s^25p^5(^3F)\varepsilon f(^2S,{}^2P,{}^2D)$. The following states are included in the I+ calculations: $4d^95s^25p^4(^2P)\varepsilon f(^3D)$, $4d^95s^25p^4(^2D)\varepsilon f(^3P,{}^3D)$, $4d^95s^25p^4(^2F)\varepsilon f(^3S,{}^3P,{}^3D)$, $4d^95s^25p^4(^4P)\varepsilon f(^3D)$, $4d^95s^25p^4(^4D)\varepsilon f(^3P,{}^3D)$, $4d^95s^25p^4(^4F)\varepsilon f(^3S,{}^3P,{}^3D)$. Our results for the Xe+ and I+ $4d$-$\varepsilon f$ transitions agree excellently with the measurements. However, the peak cross section of I is about three to four times higher than the experimental maximum. We recommend a remeasurement of the I $4d$ Giant resonance with a careful experimental arrangement.

*This work is supported by US DOE, Division of Chemical Sciences, Office of Basic Energy Sciences, Office of Energy Research and AFOSR.
1. Introduction

The random phase approximation with exchange (RPAE) method has been successfully used in photon (electron)-atom (ion) collisions for quite some time. These studies demonstrated the important influence of the virtual excitation of pairs of electrons and the necessity of including these electron correlations in the calculation in order to obtain good agreement between the theoretical results and the experimental data [1-13]. However, these studies have been limited mostly to the noble gases and a few other systems, since the method was originally developed for systems with nondegenerate ground states [14,15].

The advent of third generation synchrotron radiation sources has provided the scientific community with tools to probe atoms and molecules at an unprecedented level of detail. Working hand-in-hand with theory, new insight has been gained into the role of many-body effects in atomic and molecular physics. In the course of these investigations, many-body effects have been found to dominate a large variety of processes in atomic physics, including "Giant Resonances", dipole and nondipole photoionization, generalized oscillator strengths, and the Compton effect. These new experimental results have increased the interest in studying the photoionization of open-shell atoms [16-26] and in extending the RPAE method to inner-shell electron transitions.

The generalization of the RPAE to atoms with unfilled shells is a difficult problem and has only been performed by a few scientists. Dalgaard [27] extended the RPAE method to the calculation of oscillator strengths for the discrete transitions of open-shell atoms with two electrons or two vacancies in the valence shell (e.g., silicon). Armstrong [28] and Starace and Armstrong [29] developed a generalization of the RPAE based on the solution of the Rowe equations of motion [30]. However, this generalization does not yield the equation for the filled shell in the limiting case. None of the above methods has proven that the photoionization cross sections are the same in the length and velocity forms. This problem has been resolved by Cherepkov et al. [31-34]. In the limiting case of an atom with filled shells their method yields the well-known equation of the RPAE. Another advantage of this method is that the dipole sum rule holds. The computer code developed for the RPAE method of single open-shell atoms using Cherepkov's equations can be found in Ref. 35. However, the code has not implemented all the parameters contained in the Cherepkov formulas. To date the RPAE method still cannot be used to study the general inner-shell electron transition of an open-shell atom except for a few elements, such as I2+, Cr, etc., which can be treated with the spin polarized technique of the RPAE method.
In this paper we present our recently developed RPAE method, which can be used to study the photoionization cross section, the photoelectron angular distributions, and the generalized oscillator strengths of an inner-shell electron transition in an open-shell atom or ion. The expressions for the dipole matrix elements and Coulomb matrix elements are derived using the second quantization method. The matrix elements are checked against the relevant formulas in the literature by allowing the two open shells to become closed shells. The computer program has been written for atoms with two open shells; if one (or both) of the open shells is closed in the input data, the code can also be used to calculate electron transitions for atoms with one open shell (or for closed-shell noble gas atoms). The code has been tested by comparing its results with the photoionization cross sections of Ar 3p and O 2p electrons obtained from the computer program of Ref. 35. Our method has been used to study the 4d Giant resonances of Xe+, I and I+. Sections 2 and 3 present the theory and its applications, respectively, while Sec. 4 gives conclusions.
2. Theory

We study the electron transition from the state $|l_1^{n_1}[L_1S_1]\,l_2^{n_2}[L_2S_2]\,LS\rangle$ to $|l_1^{n_1-1}[L_1'S_1']\,l_2^{n_2}[L_2S_2][L_c'S_c']\,l_3L'S'\rangle$, $l_1^{n_1}$ and $l_2^{n_2}$ being both open shells. $L_c'$ and $S_c'$ are, respectively, the core orbital and spin angular momenta. The RPAE equation we study here is similar to the equation found in Refs. 34 and 35, $\langle l_1^{n_1-1}[L_1'S_1']\,l_2^{n_2}[L_2S_2][L_c'S_c']\,l_3L'S' \| D \| l_1^{n_1}[L_1S_1]\,l_2^{n_2}[L_2S_2]\,LS \rangle$
a
k
i
> 3
= < l^- [L'1S'1}q [L2S2][L'cS'c]l3L'S'\\O \\q [L1S1]q [L2S2]LS +
Yl
1 1
,
2
i
> 2
J2l< l" - [Li 'Sl'% [L2S2][Lc'>Sc'']UL»S»\\D\\q [L1Si]q [L2S2]LS
>
La"Sc" e 4 F
>
> /(w + e4 - ei)],
(1)
where the first term is the dipole matrix element in RPAE, and the second term is the dipole matrix element calculated with Hartree-Fock wave functions; $l_1$ and $l_2$ are, respectively, the orbital angular momenta of the inner-shell and outer-shell electrons; $l_3$ and $l_4$ are the angular momenta of the photoelectrons; $F$ is the Fermi level; $\omega$, $\epsilon_1$ and $\epsilon_4$ are, respectively, the energies of the photon and of the $l_1$ and $l_4$ electrons; and $L_1$, $S_1$, $L$ and $S$ represent any allowed $l_1^{n_1}$ subshell state and the atomic state. In the matrix calculation a summation over these states has to be performed. The quantities $O^k$, $C$ (see Ref. 36) and $U$ are the operators, O^k = -<
h\\Ckr\\li
> [8mk]-V2[al
C = 2Y,(-l)k[k]-1/2R(Uh\hh)
x a,,]'* 0 ),
(2)
< U\\Cw\\h >< h\\Cw\\l3 >
* H x ari]<*°> x [al x a,,]*>}<«» + R{lJi;hh) < h\\C{k)\\h >< h\\CW\\h > *{[alxah}^x[alxahro^00\ where R{hh',hh)
(3)
is the radial integral k
/
KRf^RhRiydr,
(4)
and Ck are normalized spherical harmonics, 3||^||/1>=(-l)'n^3^l]=(/03o/01).
(5)
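The reduced matrix element of the normalized spherical harmonic in Eq. (5) can be evaluated directly. The sketch below assumes the standard form $\langle l \| C^k \| l' \rangle = (-1)^{l} [(2l+1)(2l'+1)]^{1/2} \begin{pmatrix} l & k & l' \\ 0 & 0 & 0 \end{pmatrix}$ and uses the closed expression for a 3j symbol with vanishing projections; the function names `three_j_zero` and `reduced_ck` are hypothetical. For the $4d \to \varepsilon f$ dipole transition ($l' = 2$, $k = 1$, $l = 3$) it gives $\sqrt{3}$.

```python
from math import factorial, sqrt

def three_j_zero(a, b, c):
    """Wigner 3j symbol (a b c; 0 0 0) for non-negative integers a, b, c."""
    J = a + b + c
    # Vanishes for odd a+b+c (parity) or when the triangle condition fails.
    if J % 2 or c > a + b or c < abs(a - b):
        return 0.0
    g = J // 2
    delta = (factorial(J - 2 * a) * factorial(J - 2 * b) * factorial(J - 2 * c)
             / factorial(J + 1))
    return ((-1) ** g * sqrt(delta) * factorial(g)
            / (factorial(g - a) * factorial(g - b) * factorial(g - c)))

def reduced_ck(l_out, k, l_in):
    """<l_out || C^k || l_in> in the assumed standard convention."""
    return ((-1) ** l_out * sqrt((2 * l_out + 1) * (2 * l_in + 1))
            * three_j_zero(l_out, k, l_in))

print(reduced_ck(3, 1, 2))   # f <- d dipole factor
print(reduced_ck(2, 1, 3))   # reverse ordering changes the sign
print(reduced_ck(3, 1, 3))   # parity-forbidden, vanishes
```

The sign change between the first two values reflects the phase $(-1)^{l}$ in the assumed convention; other phase conventions exist and would alter it.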
The operator $U$ can be obtained after summing all the magnetic quantum numbers in Eq. (8) of Ref. 37. The operator $a^\dagger_{l_3}$, etc., is the electron creation operator, which is the irreducible tensor operator of rank $l_3$ with respect to orbital angular momentum and of rank $1/2$ with respect to the spin angular momentum. The electron annihilation operator, the Hermitian conjugate of the creation operator, is no longer the component of an irreducible tensor [37]. Such a tensor is formed from the $(2l+1)(2s+1)$ components of the operator $\tilde{a}_{lsm\mu} = (-1)^{l+s-m-\mu} a_{ls\,-m\,-\mu}$, where $m$ and $\mu$ are, respectively, the projections on the $z$ axis of the electron orbital angular momentum $l$ and spin $s = 1/2$. The Coulomb matrix element of the first term in the bracket of Eq. (1) is the so-called "time forward" type, which can be derived using Eq. (3). The Coulomb matrix element of the second term in the bracket is called the "time backward" type, which can be evaluated using the operator $U$.

The main task in our development is to find the expressions for the matrix elements of all terms of the dipole and Coulomb interactions. For complex electronic configurations having several open shells this is a difficult and challenging task. One has to combine methods of angular momentum theory, irreducible tensorial sets, tensorial products in a coupled form, coefficients of fractional parentage, second quantization, etc. One advantage of the second quantization approach in the coupled tensorial formalism, compared with the coordinate representation, is the relative ease of finding the expressions for the matrix elements of complex electronic configurations. The reduced dipole matrix element can be obtained from the operator $O^k$, Eq. (2), which acts only on orbital angular momenta, < L'S'||O^k||LS
> = -<
/ 3 ||C*r||ii > [*]*[*]-*
* < l^-1[L[S'1}ir[L2S2][L'cS'c}l3L'S'\\[al l?[L1S1]l?[L2S2]LS
x
ah}^\\
>.
(6)
To simplify the expression in Eq. (6), we first uncouple the a\3 and a^ using Eq. (34) of Ref. 37 and then evaluate the matrix element separately for the operators aj and aix using further equations in Ref. 38. Since a\ or a;x acts only on the Z3 or l\ electron, then L = L'c and S = S'c. We have the following relation, < L'S'\\Ok\\LS
> = -<
h\\Ckr\\h
> [a]*
EI
/_-i\S+S'+l/2 L
>
1
1,5
M *
[S] 1/2
* { L'L \ } < L'S'H\\L§
L_
>< L§W&h\\LS >
= < / ||C fc r||Zi > ( _ l ) i ' + * + L 2 + L i + s i + s 2 + s i + 1 /2+n 2
^ L L
S
1 1
l\
L\ ) 1 i 5
( O L/Z Si )
\L
Lc Li )
where G L) S) = (Z? ~ [Z/1S[]|}Z? [ 'i 'iD is the fractional parentage coefficient. The reduced Coulomb matrix element of the "time forward" type is also derived using second quantization in the coupled irreducible tensorial formalism. The operator {{a^ x a j j ^ x [a\ x a/3](fc0)}(00) is uncoupled twice, which creates the intermediate states L, S, L, S,
203
xa,,r)}{°0)||L'5'>
< L - S - l l ^ x «,,]<*») x ^
W'W2
its,
* < L"S"\\[al x ah]{k0)\\LS / -,\L'+k+3S'
[L',S',k]1/*
D-D
>< LS\\[al x ah]{k0)\\L'S'
>
_ 2
Is
S+S+s
L " + S " + £ + S + f c V ( - l ) 6 + A + 5 f h *»4l
,,,1/2,
Lib
1
2
* < «^- [Li"5 1 "]^ [L 2 S 2 ][L c "5c"]?4L"5"||a, t JK^- 1 [L 1 "5 1 "]Z 2 l2 [L 2 5 2 ]L5 > * < l^-1[L1''S1"]q2{L2S2]LS\\dll\\l^[L1S1]q2[L2S2)LS . r^l/2, m
(_1)
L'+S'+L+5+k^r(-^S+S+S
li
[8,s\v*
>
[h k h \ {LLL'J
LS
* <^1[L1Si]^a[LaS2]LS\\4 Wl^-^S'^^S^LS > * < ^- 1 [Li5i]^[L 2 5 2 ]M||a ( 3 |K^- 1 [Li5 1 ]/ 2 l 2 [L 2 5 2 ][L' c 5i]« 3 L'5' > .
(8)
Since a] and fi;3 act, respectively, only on I4 and I3, then L = L'c and S = S'c. We also have S" = 5. The expressions for the operators aj , a/3, aj 4 and ai1 can be derived using the same method used in deriving the reduced dipole matrix element. Finally the direct part of the reduced Coulomb matrix element of the "time forward" type can be expressed as
g
n 1 [5',L" J L c ",4,5c",5^] 1 / 2 [Z 1 ,5 1 > L]
kLSLiSi
*GL'1s[GL1-s1"
* (-1)
\LL'CL'J\
L" Lc" L J
f Li" L 2 Lc" \ / 5i" 5 2 5C" 1 f Li L 2 L 1 \ L h £1 J 1 5 1/2 5i J \ L'c h L' J * {s\
1/2 l j }
<
^ H ^ l l ' 1 >< '1 Hull's > R{kh;hh).
(9)
The expression for the exchange part of the reduced Coulomb matrix element of the "time forward" type can be obtained similarly by uncoupling the operator {[ajt x ah](k°) x [o^ x ah]{k0)}^0). Noting the condition S'c = Sc", we have
204
<%1-1[Ll''S1"]q2[L2S2][Lc"Sc"]hL"S»\ {[aj4 x a«s]<*0> x [al x ^J^ 0 )}< 00 >||^" x [MM]^ 2 [L 2 S 2 ][L' c S' c ]l 3 L'S'
>
( Lc" h L" ) ( Sc" 1/2 S" 7" ^ { L'c l3 L' ) I S'c 1/2 S' { k k 0 } [ 0 0 0
= [0][L",S\L',S']1/2 * < U\\[al x ahf0)\\l3
r
qi-1[L1"S1"]q2[L2S2]Lc"Sc"
X
* \\[al xa i l ]^°)||Z^- 1 [Li5i]Z^[L 2 5 2 ]L' c 5i> --LL.A.fcJ / Lc" k L"\
(-1) ^
^ - ^ j ( h k h\(
1/2 0 1/2 \
* \ h L'c k J . _4-_ \ Lc" U ' J l 5c" 5 Sc / * < qi-\L1"S1"]q^L2S2]Lc"Sc''\\ah\\l^[L1S1]q2[L2S2]LS 1
l2 L
1
* < «7 [ii5i]i 2 [ 252]i5||4j|^- [Li5i]«^[L 2 S 2 ]L'c5 c > .
> (10)
After evaluating the expressions for the operators 5^ and aj the expression for the exchange part of the reduced Coulomb matrix element of the "time forward" type is given by
]T
Re{kh;h,h)
*n 1 [Li,5 l l L J 5][L' > 5' J Lc",L c ] 1 /2 G ^,f^ i G Li|, f Lc" k L" \( h k h\( Li" L2 L c " 1 * I »3 i c A: J \ Lc" L Lc J \ L Zi Li J f Si" S 2 52" \ / L x L 2 L \ / 5i 5 2 S\ *\ S 1/2 5i J \ L c /"i Li / \ 5 C 1/2 Si J * /_ 1 \U+i'+ri+L
(j^
Combining Eqs. (9) and (11) together and multiplying by ,L, L 1 / 2 the Coulomb matrix element of the "time forward" type can then be written
205
ni[Lc", L'c, Sc", S'c]l'2[Li, Si, L]GL\S\G
^T
w'k"
kLSLiSi * (_-,\Sc"+s'a+2S'+i [
f la k h \ (
h
k
U \
' \LL'CL'}\ L" Lc" L } f Li" L2 Lc» 1 J Si" S2 Sc" \(LiL2L\(SiS2 * \ L h LiJX S 1/2 SiJX L'c h L'j\S'c
*
-
S\ 1/2 S[ j
R{lJi;hh)
Y, kLBL\S\
* (Lc" * \ h f Si" * X S
ni[LuSirL,S][Lc\L'c]^Gl\%MJ0^i U L"\( h k h 1 (Li" L2LC"\ L'c k J \ Lc" LL'JX L h Li] S2 S2" X\LiL2 LX[Si S2 §X 1/2 Si }\L'e h L[ j\S'c 1/2 S'J
„. t_i\l4+L'+h+Lc"+L'c+L+2S'+Sc"+2S2+S+l+Si+S'c^
Q2)
Equation (12) may be checked if li and l2 are set to represent closed shells. In this case we have Lx = L = 0, Si = S = 0, Lx" = Lc" = k, Si" = 1/2, L[ =L'C = h, Si = 1/2, L2 = 0, S2 = 0, L'c = h, S'c = S[ = 1/2, Lc" = h, Sc" = Si" = 1/2, S1 = 0. The reduced Coulomb matrix element of the "time forward" type becomes 2 R{hh;hh)
< h\\Cl\\h >< h\\Cl\\h > -(21 + 1)
x £fl(l 4 Ji;/3Ji) < k\\Ck\\h
X
li\\Ck\\h > (-1)1 + k | £ [ ^ J . (13)
Equation (13) is the same as Eq. (4.55) of Ref. 15. In reducing Eq. (12) to Eq. (13) we have set $L' = k$ in the direct part and changed the symbol $k$ to $l$. The Coulomb matrix element of the "time backward" type can be obtained using a similar procedure of the second quantization method in the uncoupled method [37]. We used the Wigner-Eckart theorem to separate the magnetic quantum numbers from the angular momentum and spin quantum numbers, and then summed over all magnetic quantum numbers to obtain the operator $U$. After evaluating the four creation and annihilation operators, the Coulomb matrix element of the "time backward" type can be expressed as
n1[Lc\L'c,Sc\S^I2[LuSirL]G^Gl\\„
J2 kLSL\S\
(-i)
SC" + S'C+2S'<+i f h
f Lf
k h 1 f lx
L2 L» 1 f 5!" S 2 5 C "
\ L h U ]\ S
\[LXL2L\[SX 5 2 5 1/2 Sx J \ L'c /"i L' J \ 5^ 1/2 5J
* < / 4 ||C*||/i X l3\\Ck\\h > +
A; Z4 1
1 L L'c V J \ L" Lc" L J
Yl
R(kl3;hh)
#('4,J3,li, Ji) < i4||C*||/i X
f 3 ||C*||fi >
LiSiLSk ni[Lc\Sc
,Lc1Scj]
'
[L^S^LJSJGJW^JG^J*^*.
f S£ 1/2SM f L i " L 2 L e » 1 1 Sc" l / 2 5 / \ i *i ^1 J f 5 i " S 2 5 C " 1 f Li I j i n f 5{ S2 S'c \
1 5 1/2 §! J 1 L /i Li J 1 5 1/2 5: J (_l) i i"+' £ 'i+ S c+' S 'c"+25i+L' c +Lc"+2S2+2S+l_
/-^\
The reduced Coulomb matrix element of the "time backward" type can also be reduced to Eq. (13) if both $l_1^{n_1}$ and $l_2^{n_2}$ are closed shells. Equations (12) and (14) are the basic equations used in the RPAE calculations.

3. Application

We have used the RPAE method developed above, which can treat the inner-shell electron transition of an open-shell atom, to calculate the photoionization cross sections of the $4d$-$\varepsilon f$ transition, corresponding to the so-called Giant resonances in Xe+, I and I+. We first create the ground state of Xe+, $4d^{10}5s^25p^5(^2P)$, and the core wave functions of $4d^95s^25p^5(^1P,{}^1D,{}^1F,{}^3P,{}^3D,{}^3F)$ through self-consistent HF calculations. Then the radial functions of the continuum electron are obtained by solving the linear HF equations without self-consistency using those core wave functions. The reduced dipole matrix elements are evaluated using Eq. (8). Since the core wave functions used relaxed orbitals, an overlap integral has to be added into Eq. (7).
[Figure: cross section (0 to 60) versus Photon Energy (eV) (70 to 130).]
Fig. 1. Comparison of the RPAE results for the Xe+ 4d Giant resonance in the length form. The solid curve and dot-dashed curve are, respectively, the photoionization cross sections from the RPAE and HF calculations. The dotted curve represents the measured Xe3+ photoionization cross section from Ref. 40.
According to the triangular rule the final combined core and continuum electron states can only be $^2S$, $^2P$, and $^2D$ states. Therefore we included the following final states in the calculations, $4d^95s^25p^5(^1P)\varepsilon f(^2D)$, $4d^95s^25p^5(^1D)\varepsilon f(^2P,{}^2D)$, $4d^95s^25p^5(^1F)\varepsilon f(^2S,{}^2P,{}^2D)$, $4d^95s^25p^5(^3P)\varepsilon f(^2D)$, $4d^95s^25p^5(^3D)\varepsilon f(^2P,{}^2D)$, $4d^95s^25p^5(^3F)\varepsilon f(^2S,{}^2P,{}^2D)$, and studied the following reactions,
$$h\nu + 4d^{10}5s^25p^5(^2P) \to 4d^95s^25p^5(^1P,{}^1D,{}^1F,{}^3P,{}^3D,{}^3F) + \varepsilon f(^2S,{}^2P,{}^2D). \qquad (15)$$
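The list of final states above follows from angular momentum coupling alone, and can be reproduced with a short enumeration. The sketch below is an illustration under stated assumptions: the core term $(S_c, L_c)$ is coupled with an $\varepsilon f$ electron ($l = 3$, $s = 1/2$), the total spin is conserved by the dipole operator, and the total $L'$ must satisfy the triangle rule with the ground-state $L$ and the photon rank 1. The function name `allowed_final_terms` is hypothetical, and parity is not checked separately.

```python
from fractions import Fraction

L_LETTERS = "SPDFGHIK"

def term_label(S, L):
    """Spectroscopic label, e.g. (1/2, 1) -> '2P'."""
    return f"{int(2 * S + 1)}{L_LETTERS[L]}"

def allowed_final_terms(ground_S, ground_L, core_terms, l_cont=3):
    """Map each core term to the total terms reachable by a dipole transition."""
    half = Fraction(1, 2)
    result = {}
    for core_S, core_L in core_terms:
        # Spin coupling: the total spin must equal the ground-state spin.
        if not abs(core_S - half) <= ground_S <= core_S + half:
            continue
        totals = []
        for L in range(abs(core_L - l_cont), core_L + l_cont + 1):
            # Dipole triangle rule between ground L, rank 1 and final L.
            dipole_ok = abs(ground_L - 1) <= L <= ground_L + 1
            if dipole_ok and not (L == 0 and ground_L == 0):
                totals.append(term_label(ground_S, L))
        result[term_label(core_S, core_L)] = totals
    return result

# Xe+ (and I): ground term 2P, core terms 1P, 1D, 1F, 3P, 3D, 3F.
cores = [(0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (1, 3)]
for core, finals in allowed_final_terms(Fraction(1, 2), 1, cores).items():
    print(core, "->", finals)
```

Calling the same function with a $^3P$ ground term and doublet and quartet cores gives the corresponding triplet final states used for I+.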
After evaluating the reduced dipole matrix elements and the Coulomb matrix elements of the "time forward" type and "time backward" type, the RPAE equation, Eq. (1), was solved to obtain the dipole matrix element in RPAE and the $^2S$, $^2P$, and $^2D$ partial cross sections. The total photoionization cross section is the sum of those partial cross sections. The photoionization of the Xe+ $4d$-$\varepsilon f$ transition is followed, with almost 100% probability [39], by a continuous double Auger decay process to the Xe3+ ion. Therefore, our calculation can be compared with the Xe3+ measurement. Figure 1 shows the partial and total photoionization cross sections versus the photon energy for the Xe+ $4d$ to $\varepsilon f$ transition.
The solid curve and dot-dashed curve are, respectively, the photoionization cross sections from the RPAE and HF calculations in the length form. The dotted curve represents the measured Xe3+ photoionization cross section from Ref. 40. The importance of the RPAE effects is clearly manifested through the comparison with the HF results. The maximum cross section from our RPAE calculation, 24.44 Mb at the photon energy of 96.37 eV, agrees very well with the experimental maximum of 25 Mb at the photon energy of 95 eV [40].

We also calculated the 4d Giant resonances of I and I+. The calculation for the I $4d$ to $\varepsilon f$ transition is similar to that for the Xe+ case, except that the two elements have different nuclear charges. Our RPAE calculation shows maxima at 102.8 eV of 23.85 Mb (length) and 19.6 Mb (velocity), which agree reasonably well with the results of the time-dependent local density approximation (TDLDA) [41]. TDLDA predicts a peak at 96 eV with a maximum value of 27 Mb. However, these data are about three to four times higher than the experimental maximum of 6.5 Mb at 91 eV [42]. Since the recent measurement for the I+ 4d Giant resonance shows a maximum of 23±3 Mb at 90 eV [43], we recommend a remeasurement of the I 4d Giant resonance with a careful experimental arrangement.

For the I+ 4d Giant resonance the ground state is $4d^{10}5s^25p^4(^3P)$. The core wave functions are $4d^95s^25p^4(^2P,{}^2D,{}^2F)$ and $4d^95s^25p^4(^4P,{}^4D,{}^4F)$. We have included the following combined final core and continuum electron states in the calculations, $4d^95s^25p^4(^2P)\varepsilon f(^3D)$, $4d^95s^25p^4(^2D)\varepsilon f(^3P,{}^3D)$, $4d^95s^25p^4(^2F)\varepsilon f(^3S,{}^3P,{}^3D)$, $4d^95s^25p^4(^4P)\varepsilon f(^3D)$, $4d^95s^25p^4(^4D)\varepsilon f(^3P,{}^3D)$, $4d^95s^25p^4(^4F)\varepsilon f(^3S,{}^3P,{}^3D)$. Therefore, the reactions we studied are
$$h\nu + 4d^{10}5s^25p^4(^3P) \to 4d^95s^25p^4(^2P,{}^2D,{}^2F,{}^4P,{}^4D,{}^4F) + \varepsilon f(^3S,{}^3P,{}^3D). \qquad (16)$$
We obtained the total photoionization cross section from the sum of the partial cross sections for $^3S$, $^3P$ and $^3D$. The RPAE calculation gives a maximum of 23.59 Mb at the photon energy of 95.62 eV, which agrees very well with the recent measurement, 23(±3) Mb at 90 eV [43]. Calculated photoionization cross sections (Mb) versus photon energy for the $4d$-$\varepsilon f$ Giant resonance of the ions Xe+ and I+ and the atom I are given in Table 1.
Table 1. Photoionization cross sections (Mb) calculated by our RPAE method for the 4d-εf Giant resonance of the ions Xe+ and I+ and the atom I.

Photon (eV)   Xe+ (Mb)    Photon (eV)   I+ (Mb)     Photon (eV)   I (Mb)
 76.46         3.09        70.46         3.86        65.31         0.11
 76.80         3.87        70.80         4.30        65.65         0.15
 77.76         5.35        71.76         5.32        66.61         0.29
 79.33         7.26        73.33         6.87        68.18         0.67
 81.52        10.51        75.52         9.41        70.37         1.47
 84.31        14.18        78.31        12.53        73.16         3.07
 87.72        18.89        81.72        16.45        76.57         5.69
 91.74        22.33        85.74        19.32        80.59         9.58
 96.37        24.44        90.38        22.42        85.22        14.57
101.62        24.20        95.62        23.59        90.47        18.60
107.48        21.12       101.48        22.92        96.33        22.74
113.95        16.88       107.95        20.38       102.80        23.85
121.03        11.48       115.03        15.52       109.88        22.27
128.73         7.32       122.73        10.75       117.57        18.36
137.03         3.91       131.03         6.15       125.88        12.39
145.95         1.94       139.95         3.22       134.80         7.50
155.48         0.74       149.48         1.34       144.33         3.60
165.62         0.21       159.63         0.46       154.47         1.58
176.38         0.016      170.38         0.074      165.23         0.47
187.75         0.035      181.75         0.0093     176.60         0.06
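The peak positions quoted in the text can be read off Table 1 programmatically. The sketch below simply transcribes the tabulated values and locates each maximum; the variable names are illustrative, and the experimental value of 6.5 Mb for atomic I is the one quoted from Ref. 42 in the text.

```python
# (photon energy in eV, cross section in Mb), transcribed from Table 1.
TABLE = {
    "Xe+": ([76.46, 76.80, 77.76, 79.33, 81.52, 84.31, 87.72, 91.74, 96.37, 101.62,
             107.48, 113.95, 121.03, 128.73, 137.03, 145.95, 155.48, 165.62, 176.38, 187.75],
            [3.09, 3.87, 5.35, 7.26, 10.51, 14.18, 18.89, 22.33, 24.44, 24.20,
             21.12, 16.88, 11.48, 7.32, 3.91, 1.94, 0.74, 0.21, 0.016, 0.035]),
    "I+": ([70.46, 70.80, 71.76, 73.33, 75.52, 78.31, 81.72, 85.74, 90.38, 95.62,
            101.48, 107.95, 115.03, 122.73, 131.03, 139.95, 149.48, 159.63, 170.38, 181.75],
           [3.86, 4.30, 5.32, 6.87, 9.41, 12.53, 16.45, 19.32, 22.42, 23.59,
            22.92, 20.38, 15.52, 10.75, 6.15, 3.22, 1.34, 0.46, 0.074, 0.0093]),
    "I": ([65.31, 65.65, 66.61, 68.18, 70.37, 73.16, 76.57, 80.59, 85.22, 90.47,
           96.33, 102.80, 109.88, 117.57, 125.88, 134.80, 144.33, 154.47, 165.23, 176.60],
          [0.11, 0.15, 0.29, 0.67, 1.47, 3.07, 5.69, 9.58, 14.57, 18.60,
           22.74, 23.85, 22.27, 18.36, 12.39, 7.50, 3.60, 1.58, 0.47, 0.06]),
}

def peak(species):
    """Return (energy, cross section) at the tabulated maximum."""
    energies, sigmas = TABLE[species]
    i = max(range(len(sigmas)), key=sigmas.__getitem__)
    return energies[i], sigmas[i]

for name in TABLE:
    print(name, peak(name))

# Ratio of the calculated I peak to the measured maximum of 6.5 Mb (Ref. 42).
ratio_I = peak("I")[1] / 6.5
print(round(ratio_I, 2))
```

The ratio for atomic I lies between three and four, in line with the discrepancy discussed in the text.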
4. Conclusion

A new RPAE method, which can be used to study inner-shell electron transitions, has been developed. The expressions for the matrix elements of all terms of the dipole and Coulomb interactions have been derived using the second quantization method. By allowing both the open shells to be closed, the derived equations reduce to those for closed shells. The method has been used to study the 4d Giant resonances of Xe+, I and I+. The present photoionization cross sections for Xe+ and I+ agree very well with the experimental data. However, the results for the I 4d Giant resonance are about three times higher than those from the measurement in Ref. 42. We recommend a remeasurement of the cross section for this process.

References

1. M. Ya. Amusia, A. S. Baltenkov, L. V. Chernysheva, Z. Felfli, S. T. Manson and A. Z. Msezane, Phys. Rev. A 67, 060702(R) (2003), and references therein; J. Phys. B 37, 937 (2004).
2. O. Hemmers et al., Phys. Rev. Lett. 91, 053002 (2003).
3. M. Ya. Amusia and V. K. Dolmatov, J. Phys. B 26, 1425 (1993).
4. Zhifan Chen, A. Z. Msezane and M. Ya. Amusia, Phys. Rev. A 60, 5115 (1999).
5. Zhifan Chen and A. Z. Msezane, J. Phys. B 33, 5397 (2000).
6. Zhifan Chen and A. Z. Msezane, Phys. Rev. A 61, 703 (2000).
7. Zhifan Chen and A. Z. Msezane, J. Phys. B 33, 2135 (2000).
8. Zhifan Chen and A. Z. Msezane, J. Phys. B 35, 815 (2002).
9. M. Ya. Amusia, L. V. Chernysheva, Z. Felfli and A. Z. Msezane, Phys. Rev. A 64, 032711 (2001).
10. A. Z. Msezane, Z. Felfli, M. Ya. Amusia, Z. Chen and L. V. Chernysheva, Phys. Rev. A 65, 054701 (2002).
11. M. Ya. Amusia, A. S. Baltenkov, Z. Felfli and A. Z. Msezane, Phys. Rev. A 59, R2544 (1999).
12. M. Ya. Amusia, V. K. Dolmatov and V. K. Ivanov, Pis'ma Zh. Tekh. Fiz. 6, 1465 (1980) [Sov. Tech. Phys. Lett. 6, 632 (1980)].
13. M. Ya. Amusia, V. K. Dolmatov and V. K. Ivanov, Zh. Tekh. Fiz. 56, 8 (1986) [Sov. Phys. Tech. 31, 4 (1986)].
14. M. Ya. Amusia and N. A. Cherepkov, Case Studies in Atomic Physics 5(2), 47-179 (1975).
15. M. Ya. Amusia, Atomic Photoeffect (Plenum, New York, 1990).
16. S. B. Whitfield, K. Kehoe, M. O. Krause and C. D. Caldwell, Phys. Rev. Lett. 84, 4818 (2000).
17. Z. Felfli, N. C. Deb, D. S. F. Crothers and A. Z. Msezane, J. Phys. B 35, L419 (2002).
18. H. P. Saha, Phys. Rev. A 66, 010702 (2002).
19. S. S. Tayal, Phys. Rev. A 65, 032724 (2002).
20. T. W. Gorczyca and B. M. McLaughlin, J. Phys. B 33, L859 (2000).
21. O. Wilhelmi, G. Mentzel, B. Zimmermann and K. H. Schartner, Phys. Rev. A 60, 3702 (1999).
22. G. Prumper, B. Obst, W. Beuten, B. Zimmerman, P. Zimmermann and U. Becker, J. Phys. B 32, 3101 (1999).
23. Zhifan Chen and A. Z. Msezane, Phys. Rev. A 68, 054701 (2003).
24. Zhifan Chen and A. Z. Msezane, Phys. Rev. A 67, 024701 (2003).
25. Zhifan Chen and A. Z. Msezane, Phys. Rev. A 70, 032714 (2004).
26. Zhifan Chen and A. Z. Msezane, Can. J. Phys. 82, 517 (2004).
27. E. Dalgaard, J. Phys. B 8, 695 (1975).
28. L. Armstrong, Jr., J. Phys. B 7, 2320 (1974).
29. A. F. Starace and L. Armstrong, Jr., Phys. Rev. A 13, 1850 (1976).
30. D. J. Rowe, Rev. Mod. Phys. 40, 153 (1968).
31. N. A. Cherepkov, L. V. Chernysheva, V. Radojevic and I. Pavlin, Can. J. Phys. 52, 349 (1974).
32. N. A. Cherepkov and L. V. Chernysheva, Bull. Acad. Sci. USSR, Phys. Ser. (English Transl.) 41(12), 47 (1977).
33. N. A. Cherepkov and L. V. Chernysheva, Phys. Lett. A 60, 103 (1977).
34. G. A. Vesnicheva, G. M. Malyshev, V. F. Orlov and N. A. Cherepkov, Sov. Phys. Tech. Phys. 31, 402 (1986).
35. M. Ya. Amusia and L. V. Chernysheva, Computation of Atomic Processes (Institute of Physics Publishing, Bristol and Philadelphia, 1998).
36. P. H. M. Uylings, J. Phys. B 25, 4391 (1992).
37. B. R. Judd, Second Quantization and Atomic Spectroscopy (The Johns Hopkins Press, Baltimore, 1967).
38. U. Fano and G. Racah, Irreducible Tensorial Sets (Academic Press, New York, 1959).
39. M. Ya. Amusia, N. A. Cherepkov, L. V. Chernysheva and S. T. Manson, J. Phys. B 33, L37 (2000).
40. P. Andersen, T. Andersen, F. Folkmann, V. K. Ivanov, H. Kjeldsen and J. B. West, J. Phys. B 34, 2009 (2001).
41. G. O'Sullivan, C. McGuinness, J. T. Costello, E. T. Kennedy and B. Weinmann, Phys. Rev. A 53, 3211 (1996).
42. L. Nahon, A. Svensson and P. Morin, Phys. Rev. A 43, 2328 (1991).
43. H. Kjeldsen, P. Andersen, F. Folkmann, H. Knudsen, B. Kristensen, J. B. West and T. Andersen, Phys. Rev. A 62, 020702(R) (2000).
A NEW ANALYTICAL APPROACH TO THE ATMOSPHERE CHARACTERIZATION BY A BACKSCATTERED LIDAR SIGNAL

GEORGES DEBIAIS (1), FRANÇOIS K. GUEDJE (1) and M. NORBERT HOUNKONNOU (2,3)

(1) LP2A, University of Perpignan, 52, Avenue Paul Alduy, F-66860 Perpignan, France
E-mail: [email protected], francois.guedjaSuniv-per.fr

(2) International Chair in Mathematical Physics and Applications (ICMPA), University of Abomey-Calavi, 072 B.P. 50, Cotonou, Republic of Benin
E-mail: norbert-hounkonnouQcipma.net

(3) Unité de Recherche en Physique Théorique (URPT), Institut de Mathématiques et de Sciences Physiques (IMSP), Université d'Abomey-Calavi, 01 B.P. 2628, Porto-Novo, République du Bénin
In this contribution, we provide new analytical relations finely tuned to describe various atmospheric elements and based on the determination of the corresponding backscattered optical radiation. The novelty of this work consists in the fact that the cloud is not considered anymore as a well delimited and isolated layer in the ambient air with a constant density of particles, but as an open structure mixing with the ambient air and possessing a non-constant number of particles. The proposed model gives rise to a unique formulation of ice crystallite clouds as well as of fog or aerosol concentration. Application to a real signal is considered as an illustration.
1. Introduction

Research aimed at the best possible knowledge of the atmosphere and at an optimal management of climatic variability is still at the center of intense activity [1-9]. Indeed, a rigorous study of climatic changes must include a thorough understanding of the dynamics of atmospheric and oceanic fluid structure, of the properties of oceanic and continental surfaces, as well as of the interactions of the solar electromagnetic radiation with the atmospheric and cloud micro-particles [1,9]. Our work is based on signals from the Light Detection And Ranging (LIDAR) technique using optical frequencies. We are interested in atmospheric quasi-instantaneous signals detected at an altitude of 20 to 25 km. Recently, we investigated the variation of the optical parameters with altitude, using inverse mathematical relations [1,2]. Here, we aim at providing new analytical relations finely tuned to describe various atmospheric elements and based on the determination of the corresponding backscattered optical radiation. The novelty of this contribution consists in the fact that the cloud is not considered anymore as a well delimited and isolated layer in the ambient air with a constant density of particles, but as an open structure mixing with the ambient air and possessing a non-constant number of particles. The proposed model gives rise to a unique formulation of ice crystallite clouds as well as of fog or aerosol concentration. Application to a real signal is provided as an illustration.

2. LIDAR Diffusion Theory

The LIDAR technique proves to be an improvement over the RADAR one, taking advantage of optical frequencies due to the use of lasers. Emitted photons are either absorbed or scattered by molecules and other atmospheric particles. A very limited number of photons is backscattered and detected by the parabolic LIDAR receptor and then concentrated onto a photomultiplier connected to a data acquisition system. Simulation results show that, at these optical frequencies, less than four percent of the backscattered photons undergo multiple scattering, which allows the analysis to be restricted to singly scattered photons. Analysis of the LIDAR signal for the time interval $(t_1, t_2)$ gives information on properties and optical qualities of the layer located at an altitude between $(c t_1 \cos\theta)/2$ and $(c t_2 \cos\theta)/2$ above the shooting position, $c$ being the speed of light and $\theta$ the phase angle (see Fig. 1).
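The mapping from a time window to a layer can be made concrete with a few lines of code. The sketch below is only an illustration of the relation $(c\,t \cos\theta)/2$ stated above; the function names `layer_bounds` and `vertical_resolution` are hypothetical, the heights are measured above the shooting position, and the vacuum value of $c$ is assumed.

```python
from math import cos, radians

C = 299_792_458.0  # speed of light in vacuum, m/s

def layer_bounds(t1, t2, theta_deg):
    """Heights (m) above the shooting position probed during the window (t1, t2)."""
    factor = C * cos(radians(theta_deg)) / 2.0   # factor 1/2: out-and-back path
    return factor * t1, factor * t2

def vertical_resolution(dt, theta_deg):
    """Layer thickness (m) resolved by a time resolution dt (s)."""
    lo, hi = layer_bounds(0.0, dt, theta_deg)
    return hi - lo

print(layer_bounds(1.0e-4, 1.1e-4, 0.0))   # vertical shot, 100 to 110 microseconds
print(vertical_resolution(1.0e-7, 30.0))   # 100 ns sampling, 30 degree tilt
```

A 100 ns sampling step thus corresponds to a layer roughly 13 m thick for a beam tilted by 30 degrees, which is the sense in which the time resolution fixes the spatial resolution.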
LIDAR spatial resolution depends on the duration of the laser pulse and on the time resolution of the data acquisition electronics. Figure 1 shows the LIDAR position (altitude $h$) and the scattering point M with $z = h + r\cos\theta$. The backscattered energy is described by the classical LIDAR equation. For an inhomogeneous medium whose local optical properties depend on the wavelength, $\lambda$, and the altitude, $z$, or the position, $r$, this energy can be written in the general form,9

$$P(\lambda, r) = -C\,(1-K)\,P_b\,F\,\frac{1}{F}\frac{dF}{dr}\,T(r), \tag{1}$$

an expression which involves the energy flux $F$ emitted by the source, the probability $-\frac{1}{F}\frac{dF}{dr}$ for a photon to undergo a scattering at a distance $r$, the scattering probability $P_b$ in the direction back to the source at the distance $r$, the photon absorption probability $K$ for a particle, the transmission rate $T(r)$ of the considered layer, and the correction factor $C$ associated with the solid angle corrections and the anisotropy function of the scattering centers. In the sequel, we analyse these quantities in order to adapt the LIDAR equation to our purposes.

Fig. 1. The LIDAR experimental set-up. [Schematic: LIDAR position at altitude h above sea level; scattering point at altitude z = h + r cos θ.]

2.1. The transmission coefficient
The atmospheric medium contains various particles such as molecules, aerosols and clouds. The variation $dF(\lambda,h,r)$ of the incoming flux $F$ due to scattering through an elementary layer of thickness $dr$ reads

$$\frac{dF}{F} = -\alpha(\lambda,h,r)\,dr \tag{2}$$

with

$$\alpha(\lambda,h,r) = \sum_i \alpha_i(\lambda,h,r) = \sum_i \frac{1}{\Gamma_i(\lambda,h,r)},$$

where $\alpha_i(\lambda,h,r)$ stands for the scattering cross section per unit volume at a diffusion point for a particle species $i$, and $\Gamma_i(\lambda,h,r)$ is the photon mean free path between two scatterings. Integrating this equation, we obtain the transmission coefficient through a length $r$ in a layer of thickness $r\cos\theta$,

$$T(\lambda,h,r) = \frac{F(\lambda,h,r)}{F(\lambda,h,r=0)} = \exp\left(-\int_0^r \alpha(\lambda,h,r')\,dr'\right). \tag{3}$$

2.2. Case of isotropic and conservative backscattering
To account for the variation of the solid angle through which the detector of radius $R$ is observed from the scattering point, one multiplies the signal $P(\lambda,h,r)$ by $r^2/R^2$ to get

$$S(\lambda,h,r) = P(\lambda,h,r)\,\frac{r^2}{R^2}. \tag{4}$$

Moreover, for computational convenience, one can also use the quantity

$$\ln S(\lambda,h,r). \tag{5}$$
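The range correction of Eqs. (4) and (5) can be illustrated with a short sketch. The names below and the detector radius are assumptions chosen for illustration; the synthetic raw signal simply falls off as $1/r^2$, so that the corrected signal should be constant.

```python
import math


def range_corrected(p, r, detector_radius):
    """S = P * r**2 / R**2, Eq. (4); r and R must share the same length unit."""
    return p * r**2 / detector_radius**2


def log_range_corrected(p, r, detector_radius):
    """ln S, Eq. (5)."""
    return math.log(range_corrected(p, r, detector_radius))


R_DETECTOR = 0.3e-3  # km; hypothetical receiver radius of 30 cm
ranges = [1.0, 2.0, 5.0, 10.0]          # km
raw = [1.0e-9 / r**2 for r in ranges]   # synthetic signal with pure 1/r^2 decay
corrected = [range_corrected(p, r, R_DETECTOR) for p, r in zip(raw, ranges)]
print(corrected)
```

Any remaining altitude dependence of $S$ after this correction is then attributable to the scattering coefficient and to the transmission of the traversed layer.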
In the simple scattering regime, for a collinear LIDAR the factor $C$ in (1) can be expressed from the backscattering angle probability density $H(\theta)$. It is established in scattering theory3 that $C$ is fixed by relation (6) [explicit expression not recoverable from the source], $g$ being the asymmetric backscattering factor. In the case of conservative scattering, $K = 0$. Moreover, if backscattering is isotropic, $g = 0.5$ and $P_b = 0.5$. From the definition of the transmission coefficient, one gets a simplified equation,3

$$\ln S(\lambda,h,r) = \ln\left(\frac{1}{4}\right) + \ln\left[\sum_i \alpha_i(\lambda,h,r)\right] - 2\int_0^r \sum_i \alpha_i(\lambda,h,r')\,dr'. \tag{7}$$
This equation and the above assumptions enable one to derive the analytical relations in Sec. 4. By specifying values for the LIDAR position and the laser wavelength, numerical results may be obtained as functions of $r$ and $z$, and will be illustrated later on for the quantities $S(z)$ and $\ln S(z)$.

3. The Problem

The LIDAR signal is proportional to the density of molecules or scattering particles, i.e., to the pressure at the considered altitude. The standard or unperturbed atmosphere is described by the perfect gas thermodynamics equations with a temperature gradient. The corresponding relations are briefly recalled here,

$$P = \rho R T, \qquad dP = -\frac{P g}{R T}\,dz,$$

with $T = (16.5 - 6.5z) + 273.15$ for $z < 11$ km, and $T = -55 + 273.15$ for $z > 11$ km. Consequently,

$$P = 101500 \times [1 - 0.02255\,z]^{5.26} \quad \text{for } z < 11 \text{ km},$$

$$P = 126656 \times e^{-z/6.396} \quad \text{for } z > 11 \text{ km},$$

where $P$, $\rho$, $R$, $T$ and $g$ stand for the pressure, the mass density, the ideal gas constant, the absolute temperature and the gravitational acceleration, respectively.

Fig. 2. Experimental LIDAR signal: quasi-standard atmosphere. The filtered signal is shown in the continuous line. [Plot; horizontal axis: range above sea level (km).]

These relations can be replaced by the following approximation, with a maximal loss of accuracy not exceeding 1%,

$$P = P_0\,e^{-\chi z}, \tag{8}$$
$1/\chi$ being the pressure decay length from the pressure $P_0$ at sea level. Using this approximation, one notably simplifies the LIDAR equation for the description of the atmosphere. However, let us point out that the standard value $1/\chi = 7.4$ km can depend on the LIDAR location and on meteorological conditions. A significant number of LIDAR signals are available since the research activity of one of us (G. D.) has been in relation to the International CAT and CELESTE Collaborations for the detection of gamma radiation coming
Fig. 3. Experimental LIDAR signal: perturbed atmosphere. The filtered signal is shown in the continuous line. [Plot; horizontal axis: range above sea level (km).]
from extra-galactic sources. These activities came to an end in 2003. These signals were used to identify and quantify the quality of the atmosphere for gamma ray detection by the Cerenkov effect. Figures 2 to 4 are given for illustrative purposes. Figure 2 displays a quasi-standard atmosphere. Figure 3 corresponds to a significant cloudy perturbation above an altitude of 6 km. Figure 4 shows a very pronounced cloud peak at about 8.5 km, probably composed of ice crystallites, together with a smaller one at about 8.2 km of altitude. One can easily determine their exact positions. One can also recognize the signal filtering obtained using the Fourier transform in order to best reduce the noise arising from celestial and electronic sources. Many authors have attempted to interpret various LIDAR signals using different methods. Equation (1) is often adapted to this end. See Refs. 3-8 and the recent book by V. A. Kovalev and W. E. Eichinger,9 which provides an interesting overview of the most widely used and relevant methods. It is still difficult to take into account all the perturbations, fogs and clouds which appear in the atmosphere, which explains the very small number of published quantitative results on the topic. Saka,3 for instance, has considered a cloud as a layer of thickness $E$, added to the standard atmosphere at some altitude $A$ with a constant particle density. From assumptions similar to
Fig. 4. Experimental LIDAR signal: ice cloudy atmosphere. The filtered signal is shown in the continuous line. [Plot; horizontal axis: range above sea level (km).]
those mentioned above, and using the LIDAR equation, this author established an analytical description of the atmosphere with one or two clouds, as illustrated in Fig. 5 in the case of a single cloud. Although the study is interesting, one should point out the excessive "rigidity" of the cloud response compared to experimental signals. Indeed, in the latter, responses appear "rounded-off" in the case of dilute perturbations and fogs, or extremely "peaked" for ice clouds. Moreover, the very strong attenuation of the parts of the signal corresponding to altitudes above the cloud, which determines the optical thickness, is not very clear and seems to be excessive.

4. Results

Given (7), we wish to identify a new model better able to approximate real signals. In that equation there appear the mean free paths $\Gamma_i$, which are inversely proportional to the species densities. We start from the description of the standard atmosphere and assume that the particle density in clouds can vary with altitude according to a law to be determined. This assumption is one of the novelties of this work. Thus, for the atmosphere whose pressure is described by the relation $P = P_0 e^{-\chi z}$, the density can naturally be expressed as $d_a = d_{0a} e^{-\chi z}$, while its mean free path takes the form
Fig. 5. LIDAR analytical responses, ln S(z) and S(z), for a constant mean free path of particles in a cloud (following A. Saka in Ref. 3). [Two plots; horizontal axis: range above sea level (km).]
$\Gamma_a = \Gamma_{0a}\,e^{\chi z}$. However, in order to find the best law for the density variation $d_N$ and for the particle mean free path in the cloud $\Gamma_N$, we proceed through a series of trial models, each improving on the previous one.
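For the cloud-free atmosphere, inserting $\Gamma_a = \Gamma_{0a} e^{\chi z}$ into Eq. (7) with $r = (z-h)/\cos\theta$ gives a closed form that can be checked against a direct numerical integration. The sketch below is only a consistency check under stated assumptions: the closed form is our own evaluation of the integral, the LIDAR altitude $h = 0$ and the vertical shot are assumed defaults, and the values $\Gamma_{0a} = 15$ km and $1/\chi = 7$ km are the arbitrary ones quoted for the figures.

```python
import math


def ln_s_standard(z, h=0.0, chi=1 / 7.0, gamma_0a=15.0, theta=0.0):
    """Closed form of Eq. (7) for an atmosphere with alpha = exp(-chi z) / gamma_0a."""
    big_l = chi * (z - h)
    big_m = gamma_0a * chi * math.cos(theta) * math.exp(chi * h)
    return (math.log(0.25) + math.log(1.0 / gamma_0a) - chi * z
            + 2.0 * (math.exp(-big_l) - 1.0) / big_m)


def ln_s_numeric(z, h=0.0, chi=1 / 7.0, gamma_0a=15.0, theta=0.0, n=2000):
    """Same quantity with the integral of Eq. (7) done by the trapezoid rule."""
    def alpha(alt):
        return math.exp(-chi * alt) / gamma_0a

    dz = (z - h) / n
    area = sum(0.5 * (alpha(h + i * dz) + alpha(h + (i + 1) * dz)) * dz
               for i in range(n))
    optical_depth = area / math.cos(theta)  # path element dr = dz / cos(theta)
    return math.log(0.25) + math.log(alpha(z)) - 2.0 * optical_depth


for altitude in (2.0, 5.0, 10.0, 15.0):
    print(altitude, ln_s_standard(altitude), ln_s_numeric(altitude))
```

The two columns agree to numerical precision, which makes the unperturbed response a convenient baseline against which the cloud models can be compared.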
4.1. Linear growth law for $\Gamma_N$
We first assume that, owing to gravity, the density is largest in the lower part of the cloud and decreases with altitude. Our approach is thus the following: we retain the modeling of the cloud as a layer located between the altitudes $A$ and $(A + E)$, and describe its mean free path $\Gamma_N$ using an arbitrary linear evolution relation of the form

$$\Gamma_N = \Gamma_A + k(z - A) \quad \text{for} \quad A < z < A + E. \tag{9}$$

Here, $\Gamma_A$ is the mean free path in the lower part of the cloud and $k$ stands for a constant. We then have to distinguish three regions: "before", "inside" and "after" the cloud. The relations we obtain for $\ln S(z)$ read as follows,

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{1}{\Gamma_{0a}}\right) - \chi z + \frac{2}{M}\left(e^{-L} - 1\right), \quad \text{for } h < z < A;$$

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{e^{-\chi z}}{\Gamma_{0a}} + \frac{1}{\Gamma_A + k(z-A)}\right) + \frac{2\left(e^{-L} - 1\right)}{M} - \frac{2}{k\cos\theta}\ln\left(\frac{\Gamma_A + k(z-A)}{\Gamma_A}\right), \quad \text{for } A < z < A + E;$$

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{1}{\Gamma_{0a}}\right) - \chi z + \frac{2\left(e^{-L} - 1\right)}{M} - \frac{2}{k\cos\theta}\ln\left(\frac{\Gamma_A + kE}{\Gamma_A}\right), \quad \text{for } z > A + E;$$

with $L = \chi(z - h)$ and $M = \Gamma_{0a}\,\chi\cos\theta\,e^{\chi h}$. Figure 6 shows that the thickset character of the peak component of the response as obtained by Saka is improved here, as we observe a peak which is closer in shape to the experimental one. Furthermore, we note a reduced attenuation of the signal above the cloud, again in better qualitative agreement with experiment. All the graphs presented here, including those of Saka, are obtained for the following arbitrary values of the parameters: $A = 7$ km, $E = 2$ km, $\Gamma_{0a} = 15$ km, $1/\chi = 7$ km, $\Gamma_A = 20$ km and $k = 6$.

4.2. Linear decrease law for $\Gamma_N$
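In a second trial model, we take the opposite point of view by assuming a density growing with $z$ and a linearly decreasing mean free path,

$$\Gamma_N = \Gamma_A - k(z - A) \quad \text{for} \quad A < z < A + E. \tag{10}$$

We obtain for $\ln S(z)$,

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{1}{\Gamma_{0a}}\right) - \chi z + \frac{2}{M}\left(e^{-L} - 1\right), \quad \text{for } h < z < A;$$

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{e^{-\chi z}}{\Gamma_{0a}} + \frac{1}{\Gamma_A - k(z-A)}\right) + \frac{2\left(e^{-L} - 1\right)}{M} + \frac{2}{k\cos\theta}\ln\left(\frac{\Gamma_A - k(z-A)}{\Gamma_A}\right), \quad \text{for } A < z < A + E;$$

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{1}{\Gamma_{0a}}\right) - \chi z + \frac{2\left(e^{-L} - 1\right)}{M} + \frac{2}{k\cos\theta}\ln\left(\frac{\Gamma_A - kE}{\Gamma_A}\right), \quad \text{for } z > A + E.$$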
Fig. 6. Analytical LIDAR responses, ln S(z) and S(z), for a linear growth of particle mean free path in a cloud. [Two plots; horizontal axis: range above sea level (km).]
Figure 7 shows how the cloud description is then significantly reversed as compared to the previous choice of a linear growth law.
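The two linear laws can be compared with a single routine in which a sign selects growth or decrease of the mean free path inside the layer. This is a sketch under stated assumptions, not the authors' code: the closed-form optical depth of the layer is our own evaluation of the integral in Eq. (7), the LIDAR altitude $h = 0$ and the vertical shot are assumed, and the remaining values are the arbitrary parameters quoted for the graphs.

```python
import math

CHI, GAMMA_0A = 1 / 7.0, 15.0               # 1/chi = 7 km, atmospheric mean free path (km)
A, E, GAMMA_A, K = 7.0, 2.0, 20.0, 6.0      # cloud base, thickness, mean free path at base, slope
H, THETA = 0.0, 0.0                         # assumed LIDAR altitude (km) and shot angle


def ln_s_linear(z, sign):
    """ln S(z) for a cloud with Gamma_N = GAMMA_A + sign * K * (z - A) in [A, A + E].

    sign = +1 is the linear growth law, sign = -1 the linear decrease law.
    """
    cos_t = math.cos(THETA)
    big_l = CHI * (z - H)
    big_m = GAMMA_0A * CHI * cos_t * math.exp(CHI * H)
    air = math.exp(-CHI * z) / GAMMA_0A
    z_in = min(max(z, A), A + E)            # highest point reached inside the layer
    gamma_n = GAMMA_A + sign * K * (z_in - A)
    cloud = 1.0 / gamma_n if A <= z <= A + E else 0.0
    # Integral of dz / (cos(theta) * Gamma_N) from A up to z_in.
    cloud_depth = math.log(gamma_n / GAMMA_A) / (sign * K * cos_t)
    return (math.log(0.25) + math.log(air + cloud)
            + 2.0 * (math.exp(-big_l) - 1.0) / big_m - 2.0 * cloud_depth)


for altitude in (5.0, 7.5, 8.5, 12.0):
    print(altitude, ln_s_linear(altitude, +1), ln_s_linear(altitude, -1))
```

Below the layer the two laws coincide, while above it the decrease law attenuates the signal more strongly, since the mean free path is shorter throughout the cloud.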
4.3. Combined linear growth and decrease laws: the "sawtooth" model
Since the relative symmetry of the experimental response is not reproduced, we now attempt to use a combination of the above two models, i.e., by assuming a maximal density in the center, for instance, and a decreasing behaviour in either vertical direction from the center. Concerning
Fig. 7. Analytical LIDAR responses, ln S(z) and S(z), for a linear decrease of the particle mean free path in a cloud. [Two plots; horizontal axis: range above sea level (km).]
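the mean free path, this assumption leads to the following variations,

$$\Gamma_{N1} = \Gamma_A - k_1(z - A) \quad \text{for } A < z < A + E/2;$$

$$\Gamma_{N2} = \Gamma_{A'} + k_2(z - A') \quad \text{for } A' < z < A + E;$$

with $A' = A + E/2$ and $\Gamma_{A'} = \Gamma_A - k_1 E/2$. The results we discuss here correspond to the case $k_1 = k_2 = k$. In the present case, we have to consider four regions, for which we find,

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{1}{\Gamma_{0a}}\right) - \chi z + \frac{2}{M}\left(e^{-L} - 1\right), \quad \text{for } h < z < A;$$

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{e^{-\chi z}}{\Gamma_{0a}} + \frac{1}{\Gamma_A - k(z-A)}\right) + \frac{2\left(e^{-L} - 1\right)}{M} + \frac{2}{k\cos\theta}\ln\left(\frac{\Gamma_A - k(z-A)}{\Gamma_A}\right), \quad \text{for } A < z < A';$$

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{e^{-\chi z}}{\Gamma_{0a}} + \frac{1}{\Gamma_{A'} + k(z-A')}\right) + \frac{2\left(e^{-L} - 1\right)}{M} + \frac{2}{k\cos\theta}\ln\left(\frac{(\Gamma_A - kE/2)\,\Gamma_{A'}}{(\Gamma_{A'} + k(z-A'))\,\Gamma_A}\right), \quad \text{for } A' < z < A + E;$$

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{1}{\Gamma_{0a}}\right) - \chi z + \frac{2\left(e^{-L} - 1\right)}{M} + \frac{2}{k\cos\theta}\ln\left(\frac{\Gamma_A - kE/2}{\Gamma_A}\right) - \frac{2}{k\cos\theta}\ln\left(\frac{\Gamma_{A'} + kE/2}{\Gamma_{A'}}\right), \quad \text{for } z > A + E.$$

Note that we could easily modify the position of the maximum of the density as well as its variation with altitude in either direction. Figure 8 then displays quite a satisfactory symmetry of the response, which is more (or even possibly too much) pronounced. Furthermore, the sides of the cloud are still rather abrupt as a result of the integration being limited to the sharp interval $[A, (A + E)]$. We have checked that even when this interval is enlarged by considering a thicker cloud, the response cannot reproduce the thinner peak structure of the experimental signal visible in Fig. 4.

4.4. A Lorentzian law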
The results and conclusions described above have led us to a Lorentzian description similar to our previous study using the Hilbert relations.2 For this last model, we directly take a density evolution proportional to a Lorentzian function. For the mean free path,

$$\Gamma_N = \Gamma_0\left[1 + (z - z_0)^2\,\tau_0^2\right], \tag{11}$$

$\Gamma_0$ being the mean free path at the altitude $z_0$ of the cloud center, while $\tau_0$ stands for the density decay constant about $z_0$. In addition, in order to avoid the discontinuities observed at the cloud boundaries in the previous models, we integrate the Lorentzian over the whole domain of the LIDAR shooting range, i.e., the interval between $h$ and $z_{max}$. This procedure implies total mixing between the cloud and the atmosphere, which allows one to ignore the notion of a bounded cloud layer. From the physical point of view, one thus obtains a cloud whose boundaries are evanescent and mix more or less rapidly with the ambient air, according to the values of $\Gamma_0$ and $\tau_0$, i.e., according to the nature of the perturbation. The interest of the Lorentzian
Fig. 8. Analytical LIDAR responses, ln S(z) and S(z), for the combined linear decrease and growth of the particle mean free path in a cloud. [Two plots; horizontal axis: range above sea level (km).]
function lies in the fact that a simple modification of the parameters can lead to significant changes, offering the possibility to describe an ice cloud at high altitude as well as a small perturbed zone or a foggy region. Moreover, we get a unique and particularly relevant relation for the description of the response at all altitudes, which reads as follows,

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left(\frac{e^{-\chi z}}{\Gamma_{0a}} + \frac{1}{\Gamma_0\left(1 + (z - z_0)^2\tau_0^2\right)}\right) + \frac{2\left(e^{-L} - 1\right)}{M} - \frac{2}{\tau_0\,\Gamma_0\cos\theta}\left[\tan^{-1}\left((z - z_0)\tau_0\right) - \tan^{-1}\left((h - z_0)\tau_0\right)\right].$$

The sharp peak structure displayed in Fig. 9 now seems acceptable and compatible with experimental data.
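The Lorentzian response can be evaluated directly from the mean free path law (11) and Eq. (7). The sketch below is an illustration under stated assumptions: the arctangent term is our own evaluation of the integral of the Lorentzian, the cloud parameters ($z_0 = 8$ km, $\Gamma_0 = 20$ km, $\tau_0 = 5$ km$^{-1}$) are hypothetical, and the atmospheric values are the arbitrary ones quoted for the graphs, with $h = 0$ and a vertical shot assumed.

```python
import math


def ln_s_lorentz(z, z0, gamma0, tau0, h=0.0, chi=1 / 7.0, gamma_0a=15.0, theta=0.0):
    """ln S(z) for one cloud with Gamma_N = gamma0 * (1 + (z - z0)**2 * tau0**2)."""
    cos_t = math.cos(theta)
    big_l = chi * (z - h)
    big_m = gamma_0a * chi * cos_t * math.exp(chi * h)
    air = math.exp(-chi * z) / gamma_0a
    cloud = 1.0 / (gamma0 * (1.0 + (z - z0) ** 2 * tau0 ** 2))
    # Integral of dz / (cos(theta) * Gamma_N) from h to z.
    cloud_depth = (math.atan((z - z0) * tau0) - math.atan((h - z0) * tau0)) / (tau0 * gamma0 * cos_t)
    return (math.log(0.25) + math.log(air + cloud)
            + 2.0 * (math.exp(-big_l) - 1.0) / big_m - 2.0 * cloud_depth)


# Hypothetical cloud: centre at 8 km, Gamma_0 = 20 km, tau_0 = 5 km^-1.
Z0, GAMMA0, TAU0 = 8.0, 20.0, 5.0
grid = [6.0 + 0.01 * i for i in range(401)]            # 6 km to 10 km
peak = max(grid, key=lambda alt: ln_s_lorentz(alt, Z0, GAMMA0, TAU0))
print(f"local maximum of ln S near z = {peak:.2f} km")
```

Increasing `tau0` narrows the peak, while decreasing `gamma0` raises it and strengthens the attenuation above the cloud, which is the flexibility the single relation is meant to provide.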
Fig. 9. Analytical LIDAR responses, ln S(z) and S(z), for a Lorentzian evolution of the density in a cloud. [Two plots; horizontal axis: range above sea level (km).]
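4.5. Multi-perturbation case

The above relation can be easily extended to $n$ perturbations, each of which is characterized by $z_i$, $\Gamma_i$ and $\tau_i$,

$$\ln S(z) = \ln\left(\frac{1}{4}\right) + \ln\left\{\frac{e^{-\chi z}}{\Gamma_{0a}} + \sum_i \frac{1}{\Gamma_i\left[1 + (z - z_i)^2\tau_i^2\right]}\right\} + \frac{2\left(e^{-L} - 1\right)}{M} - \sum_i \frac{2}{\tau_i\,\Gamma_i\cos\theta}\left[\tan^{-1}\left((z - z_i)\tau_i\right) - \tan^{-1}\left((h - z_i)\tau_i\right)\right].$$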
Figure 10 shows the results of this model for three perturbations. The attenuation of the signal due to the perturbations is also well identified.
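The extension to several perturbations amounts to summing the Lorentzian contributions, both in the local scattering coefficient and in the accumulated optical depth. A minimal sketch follows, assuming $h = 0$, a vertical shot, the arbitrary atmospheric values quoted for the graphs, and three hypothetical clouds given as tuples $(z_i, \Gamma_i, \tau_i)$; the function name is ours.

```python
import math


def ln_s_multi(z, clouds, h=0.0, chi=1 / 7.0, gamma_0a=15.0, theta=0.0):
    """ln S(z) for an atmosphere plus Lorentzian perturbations.

    clouds is an iterable of (z_i, gamma_i, tau_i) tuples in km, km and 1/km.
    """
    cos_t = math.cos(theta)
    big_l = chi * (z - h)
    big_m = gamma_0a * chi * cos_t * math.exp(chi * h)
    air = math.exp(-chi * z) / gamma_0a
    local = sum(1.0 / (g * (1.0 + (z - zc) ** 2 * t ** 2)) for zc, g, t in clouds)
    depth = sum((math.atan((z - zc) * t) - math.atan((h - zc) * t)) / (t * g * cos_t)
                for zc, g, t in clouds)
    return (math.log(0.25) + math.log(air + local)
            + 2.0 * (math.exp(-big_l) - 1.0) / big_m - 2.0 * depth)


# Three hypothetical perturbations at 5, 8 and 11 km.
CLOUDS = [(5.0, 30.0, 3.0), (8.0, 20.0, 5.0), (11.0, 25.0, 4.0)]
for altitude in (3.0, 5.0, 8.0, 11.0, 18.0):
    print(altitude, ln_s_multi(altitude, CLOUDS), ln_s_multi(altitude, []))
```

Passing an empty list recovers the unperturbed atmosphere, so the second column plays the role of the dotted reference curves, and the gap between the two columns well above the last cloud measures the accumulated attenuation.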
Fig. 10. Analytical LIDAR responses, ln S(z) and S(z), for Lorentzian variations of the particle density in three distinct perturbations at different altitudes. Dotted curves: the unperturbed atmosphere responses. [Two plots; horizontal axis: range above sea level (km).]
5. Application to a Real Signal

In order to test our results and to fit some experimental parameters, we have considered a real signal obtained with a vertical laser shot ($\theta = 0$) of wavelength in the green, $\lambda = 532$ nm. However, it is worth noting some practical difficulties related to the implementation of such a measurement. First, the logarithm of a real signal, even when filtered, appears to be particularly unrealistic above 10 km, and at such altitudes one should preferably consider the values of $S(z)$ instead of $\ln S(z)$. Second, one has to take care of the fact that the signal amplitude is measured in relative value while the theoretical relations provide absolute values of the
Fig. 11. The experimental signal, S(z), and, in the continuous line, the analytical interpolation curve. [Plot; horizontal axis: range above sea level (km).]
quantities. This last point can be simply solved by multiplying the complete real signal by a constant factor in such a manner as to identify it with the theoretical signal corresponding to the standard value of $\chi$, i.e., $\chi = 1/7.4$ km$^{-1}$. Then, all parameters are fitted to approximately reproduce the experimental curve using the analytical Lorentzian model. The signal we have chosen for this study presents both a significant exponential atmospheric decay and, close to an altitude of 10.5 km, a very pronounced cloud structure (see Fig. 11). The best parameter fit leads to the following values: $\Gamma_{0a} = 29.5$ km, $\chi = 1/6.5$ km$^{-1}$, $z_0 = 10.450$ km, $\Gamma_0 = 16.2$ km and $\tau_0 = 22.5$ km$^{-1}$. The atmospheric mean free path $\Gamma_{0a}$ is a function of the pressure, the temperature and the excitation wavelength. The obtained value is in good agreement with the known mean values, which range from 10 km (for the violet) to 80 km (for the red). For the cloud, the mean free path is related to the density $d_N$ of particles of diameter $\delta_N$ according to the relation
[Equation (12), relating $\Gamma_N$ to $d_N$ and $\delta_N$: expression not recoverable from the source.]
Even for a single particle species, this last relation unfortunately contains two unknowns, and its use at an altitude $z = z_0$ does not enable one to simultaneously define $d_N$ and $\delta_N$. However, one can mention that, for instance, the couple $d_N = 10$ particles/cm$^3$ and $\delta_N = 2$ μm satisfies (12) well, leading to plausible values. Given this assumption, one may then establish the law of density evolution $d_N$ versus $z$.

6. Conclusion

This work, which is just in its initial stages, leads to important and original results, which are however not yet sufficiently complete to identify the particle density and size in the cloud. Hence we are pursuing investigations to improve the data analysis. We believe that the correct exploitation of the atmospheric attenuation, which is certainly related to microscopic parameters, could help remove this indetermination. Moreover, LIDAR responses are available at two wavelengths. We plan to study the influence of the laser wavelength on all the atmospheric parameters. Such work could provide further important results. Finally, in this work, only isotropic and non-absorbing particles, as well as only singly scattered radiation, have been considered, leading to a unique and readily exploitable relation describing the LIDAR response. Such an approach can be improved by taking into account the full LIDAR equation to incorporate the anisotropy and the multi-diffusion and absorption properties. Such issues will be central to our future work on this topic.

Acknowledgements

This work was finalized during the stay of one of the authors (M. N. H.) at the Laboratoire de Physique Appliquée et d'Automatique, University of Perpignan (France). M. N. H. thanks Professor Monique Polit, the Head of the Laboratory, Dr. Georges Debiais, and all the staff of this group for their kind hospitality.

References

1. G. Debiais, Hilbert transform or Kramers-Kronig relations applied to some aspects of linear and non-linear physics, Proceedings of the Second International Workshop on Contemporary Problems in Mathematical Physics, eds. J. Govaerts, M. N. Hounkonnou and A. Z. Msezane (World Scientific, Singapore, 2002), pp. 233-258.
2. G. Debiais and M. N. Hounkonnou, Optical parameter determination of the atmosphere from a LIDAR signal by Hilbert transforms. Attempt at aerosol characterisation, Proceedings of the Third International Workshop on Contemporary Problems in Mathematical Physics, eds. J. Govaerts, M. N. Hounkonnou and A. Z. Msezane (World Scientific, Singapore, 2004), pp. 188-208.
3. A. Saka, Model of atmospheric scattering. Theory of LIDAR. Atmospheric attenuation of Cerenkov showers, Ph.D. Thesis, unpublished, University of Perpignan (France, 1998).
4. S. Elouragini, Study of the optical and geometric properties of cirrus clouds by optic remote sensing (LIDAR) and passive radiometer techniques, Ph.D. Thesis, unpublished, University of Paris VI (Paris, France, 1991).
5. J. D. Klett, Stable analytical inversion solution for processing LIDAR returns, Appl. Opt. 20, 211-220 (1981).
6. S. A. Young, Analysis of LIDAR backscatter profiles in optically thin clouds, Appl. Opt. 34, 7019-7030 (1995).
7. V. M. Mitev, I. V. Grigorov and V. B. Simeonov, LIDAR measurement of atmospheric aerosol extinction profiles: a comparison between two techniques - Klett inversion and pure rotational Raman scattering methods, Appl. Opt. 31, 6469-6474 (1992).
8. P. Snabre, A. Saka, B. Fabre and P. Espigat, Atmospheric attenuation in gamma ray astronomy, Astroparticle Physics 8, 159-171 (1998).
9. V. A. Kovalev and W. E. Eichinger, Elastic LIDAR (John Wiley and Sons, Hoboken, New Jersey, 2004).
NONABELIAN GLOBAL CHIRAL SYMMETRY REALISATION IN THE TWO-DIMENSIONAL N FLAVOUR MASSLESS SCHWINGER MODEL

LAURE GOUBA,1,2 JAN GOVAERTS1,3 and M. NORBERT HOUNKONNOU1,2

1 International Chair in Mathematical Physics and Applications (ICMPA), University of Abomey-Calavi, 072 B.P. 50, Cotonou, Republic of Benin
E-mail: [email protected]

2 Unité de Recherche en Physique Théorique (URPT), Institut de Mathématiques et de Sciences Physiques (IMSP), Université d'Abomey-Calavi, 01 B.P. 2628, Porto-Novo, République du Bénin
E-mail: [email protected]

3 Center for Particle Physics and Phenomenology (CP3), Institute of Nuclear Physics, Catholic University of Louvain, 2, Chemin du Cyclotron, B-1348 Louvain-la-Neuve, Belgium
E-mail: [email protected]
The nonabelian global chiral symmetries of the two-dimensional N flavour massless Schwinger model are realised through bosonisation and a vertex operator construction.
1. Introduction

Recently,1 the quantum two-dimensional N flavour massless Schwinger model has been revisited without any gauge fixing but using the method of Dirac quantisation. The physical spectrum of this ideal "theoretical laboratory" for nonperturbative quantum field theory is known, and consists of one massive pseudoscalar field, which is essentially the electric field with squared mass $(N/\pi)e^2$, $e$ being the U(1) gauge coupling constant, and $(N - 1)$ massless scalar fields, none of which are interacting. At the quantum level, this model has $SU(N)_- \times SU(N)_+ \times U(1)_V$ chiral symmetries, where the factor $SU(N)_- \times SU(N)_+$ mixes separately the Dirac fermionic field components of each given chirality, while the factor $U(1)_V$ is a common phase symmetry associated to the total fermionic number. The associated $SU(N)_\pm$ and $U(1)_V$ Noether currents are bosonised. Using implicitly techniques from two-dimensional conformal field theory and string theory developed in the 1990's,2 one may construct vertex operators in direct relationship with these global chiral symmetries. From the modes of both these bosonised Noether currents and these vertex operators, we realise two commuting affine Kac-Moody algebras, of which the zero modes of the vertex operators are shown to correspond to the generators of the nonabelian global chiral symmetries. This paper is organised as follows. In Sec. 2, we introduce the two-dimensional N flavour massless Schwinger model. In Sec. 3, we identify the chiral symmetries of the model and specify our notations by also defining the Hilbert space in which we work. In Sec. 4, we construct the relevant vertex operators. Section 5 is devoted to the Kac-Moody algebra. In Sec. 6, we realise the nonabelian global symmetries. Concluding remarks appear in Sec. 7.

2. The Two-Dimensional N Flavour Massless Schwinger Model

2.1. The classical formulation

Let us consider the two-dimensional N flavour massless Schwinger model with a dynamics described by the following Lagrangian density,
.
N
N
~ \ E ^ V^' ~ e Y, V-fA^, 3= 1
(1)
3= 1
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. As usual, the spacetime coordinate indices take the values $\mu = (0,1)$, while the Minkowski spacetime metric signature is $\mathrm{diag}\,\eta_{\mu\nu} = (+,-)$. We also assume a system of units such that $c = 1 = \hbar$. This dynamics is singular and, reducing the second-class constraints through the introduction of Dirac brackets, the fundamental first-class Hamiltonian reads1
'
N
3=1 .
N
+ 1 E( 5 i + ieA1)^^ 3= 1
- cMAV],
(2)
where the first-class constraint, related to the U(1) local gauge invariance of the system, is simply

$$\sigma = \partial_1\pi^1 + e\sum_{j=1}^{N}\psi_j^\dagger\psi^j. \tag{3}$$

2.2. Quantum formulation

Through canonical quantisation and within the Schrödinger picture, bosonisation of the Dirac fermionic operators is achieved as
where the first factor on the r.h.s of this expression represents the Klein factor necessary in order to have fermionic operators of different flavours or chiralities that anticommute with one another, while the quantities ^±(z) are real chiral bosons. Applying the point splitting regularisation procedure and some field redefinitions, the fundamental quantum Hamiltonian is given as
+sS(ft*i),+ii;(ft*'-)'.
(5)
where $\varphi$ is essentially (up to a normalisation factor) the electric field with squared mass $\mu^2 = (N/\pi)e^2$. Here, $\Phi_\pm$ are massless chiral bosons defined by
*^^7TTy(£^- #1 )'
'* " - 1 " '
(6)
3. Chiral Symmetries

3.1. $SU(N)_\pm$ currents
The SU(N)± currents associated to the chiral symmetries are defined by N
J± = E &V (^LV2\a)jj±,
M
= 0,1,
(7)
where the matrices $\lambda^a$, $a \in \{1, \ldots, (N^2 - 1)\}$, are $(N^2 - 1)$ independent hermitian traceless matrices spanning the su(N) algebra. These matrices are a generalisation of the Pauli matrices in the SU(2) case or the Gell-Mann matrices in the SU(3) case. In particular, the matrices associated to the Cartan subalgebra $U(1)^{N-1}$ of the Lie algebra su(N) may be chosen to be given by (A% =
* VMi
( ! > • * * * -ihi+iki+i) +1) \ f c = i
, i e { l , . . . , ( i V - 1)},(8) /
leading to the following associated currents, J £ = (=FALv^) £ ) & ' ( A % # : .
(9)
1,3 = 1 N
J*
= (TXLV2)
£
^±75
(A%, & .
(10)
1,3 = 1
Through the bosonisation procedure of the fermionic operators, one finds, ho _
/ _i_ *-* i a <*.*
Hi — I
L
\
Let us consider the $U(1)^{N-1}$ currents given by
4W = (±^)
ft*iW,
»e{i,...,(w-i)}.
(12)
In terms of modes, these currents may be written as
JUZ) = Pi + E (J±,nZn + 4»*~") ,
(13)
n>l
where the modes are function of the modes of the (N -1) real bosonic fields $!(., and satisfy the following algebra J±,n > J±,m] = ™Mm«3.2. The quantum
Hilbert
(14)
space
From the algebra (14), we conclude that the operators $J^i_{\pm,n}$ and $J^{j\dagger}_{\pm,n}$, with $i, j = 1, \ldots, (N - 1)$ and $n = 1, \ldots, \infty$, form a set of independent harmonic oscillators. Therefore, the state space considered here is a Fock space built up from the simultaneously normalized vacuum $|0\rangle$ of all these oscillators,

$$J^i_{\pm,n}|0\rangle = 0, \quad n > 0, \qquad p^i_\pm|0\rangle = 0. \tag{15}$$
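Let us consider $\Sigma$, the set of roots of the Lie algebra su(N), and $\Lambda_\Sigma$, the root lattice of su(N). States $|\lambda\rangle$ can be added to the above states by acting with the plane wave operator $e^{i\lambda\cdot Q_\pm}$. We denote

$$|\lambda\rangle = e^{i\lambda\cdot Q_\pm}|0\rangle, \tag{16}$$

where $\lambda \in \Lambda_\Sigma$. Later we shall assume that all $\alpha \in \Sigma$ have length $\sqrt{2}$.

4. Vertex Operators

Given any complex number $z$ and any root $\alpha$, let us introduce the vertex operator2

$$U^\alpha_\pm(z) = z^{\alpha^2/2} :e^{i\alpha\cdot\Phi_\pm(z)}:, \tag{17}$$

where

$$\alpha\cdot\Phi_\pm(z) = \sum_{i=1}^{N-1}\alpha^i\,\Phi^i_\pm(z). \tag{18}$$

In order to make (17) single valued in $z$, the following condition is required,

$$\frac{\alpha^2}{2} + \alpha\cdot p_\pm \in \mathbb{Z}. \tag{19}$$

Therefore (17) is analytic and has a Laurent expansion

$$U^\alpha_\pm(z) = \sum_{m\in\mathbb{Z}} U^\alpha_{\pm,m}\,z^{-m}, \tag{20}$$

where the modes can be written as

$$U^\alpha_{\pm,m} = \oint\frac{dz}{2i\pi z}\,z^m\,U^\alpha_\pm(z) \tag{21}$$

and satisfy the hermiticity condition

$$U^{\alpha\dagger}_{\pm,m} = U^{-\alpha}_{\pm,-m}. \tag{22}$$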
5. The Kac-Moody Algebra

5.1. Almost commutation relations
Using the modes of currents and those of vertex operators, we can build almost commutation relations by J±,n i J±,m
—
m
"ij°m,n,
u
±,n i
u
±,m
= alUlm+n,
(23)
235
( 0 UlmUln
- {-irPUlnU£m
=<
if a • P > 0,
Ugh* if«-/5 = - l , a • J± ,m+n +m5m+nfl if a • /3 = - 2 ,
(24)
JV-1 aJ
where a • J±, m + n = Yl
^±,m+n-
3=1
In order to obtain commutation relations, we have to correct the sign $(-1)^{\alpha\cdot\beta}$ which appears in (24).

5.2. Sign compensation
Let us set V = I T,
, where T denotes the Fock space. We
define a sign compensation operator 3 by C±,a=
£
e(a,p)\p)(P\,
(25)
/3eA E
which only acts on the wave plane factor and satisfies the following conditions e(a,p) G {-1,1} ,
e(a,p) = (-1)^+^6^,
a),
(26)
e(a,P)e(a + P,7) = e(a,p + 1)e(p,j).
(27)
Let us set E±,n = U±,nC±,«,
E£,m = UlmC±,fi.
(28)
We then have the following commutation relations, known as the affine Kac-Moody algebra (in fact, we obtain two such algebras, one for each of the chiral sectors of the model, which commute with one another), [Jkm, J(,n] = m6ij6m,n,
[4
m
, Eln]
(e(a,p)El+£+n |^±,m> E±,n\ - \ a- J±,m+n + m8m>n I0
=o^
m + n
ifa-/3 = - l , if a • P = - 2 , if a • P > 0.
,
(29)
(30)
6. Nonabelian Global Chiral Symmetries

We are now ready to realise the nonabelian global chiral symmetries of the model. In fact, having defined the affine Kac-Moody algebra, it is well known from the literature2 that the following algebra is isomorphic to the Lie algebra su(N),

$$J^i_{\pm,0}, \quad E^\alpha_{\pm,0}, \quad \alpha \in \Sigma. \tag{31}$$

The Cartan subalgebra is generated by $J^i_{\pm,0}$, $i \in \{1, \ldots, (N - 1)\}$, while the nonabelian global symmetries are realised by $E^\alpha_{\pm,0}$, $\alpha \in \Sigma$.

7. Concluding remarks

We have realised the nonabelian global chiral symmetries of the two-dimensional N flavour massless Schwinger model through bosonisation of the massless Dirac spinors. Our results generalise the recent work of Michael Creutz.4 An important aspect concerning the nature of the action of these nonabelian global chiral symmetries on the one-particle quantum states of the model, and beyond, would merit further investigation.

Acknowledgements

The authors acknowledge the Agence Universitaire de la Francophonie (AUF) and the Belgian Cooperation CUD-CIUF/UAC for financial support. L. G. is presently supported through a Ph.D. Fellowship of the Third World Organisation for Women in Science (TWOWS, Third World Academy of Science). J. G. acknowledges the Abdus Salam International Centre for Theoretical Physics (ICTP, Trieste, Italy) Visiting Scholar Programme in support of a Visiting Professorship at the International Chair in Mathematical Physics and Applications (ICMPA). The work of J. G. is partially supported by the Belgian Federal Office for Scientific, Technical and Cultural Affairs through the Interuniversity Attraction Pole (IAP) P5/27.

References

1. L. Gouba, Théories de jauge abéliennes scalaire et spinorielle à 1 + 1 dimensions: Une étude non perturbative, Ph.D. Thesis, unpublished, Institut de Mathématiques et Sciences Physiques (IMSP), University of Abomey-Calavi (Republic of Benin, 2005).
2. P. Goddard and D. Olive, Int. J. Mod. Phys. A 1, 303 (1986).
3. I. B. Frenkel and V. G. Kac, Inv. Math. 62, 23 (1980).
4. M. Creutz, Hidden symmetries in two dimensional field theory, e-Print arXiv:hep-th/0508116 (August 2005).
SUPERCONDUCTIVITY AND ELECTRIC FIELDS: A RELATIVISTIC EXTENSION OF BCS SUPERCONDUCTIVITY

JAN GOVAERTS1,2 and DAMIEN BERTRAND1,3

1 Center for Particle Physics and Phenomenology (CP3), Institute of Nuclear Physics, Catholic University of Louvain, 2, Chemin du Cyclotron, B-1348 Louvain-la-Neuve, Belgium
E-mail: Jan.Govaerts@fynu.ucl.ac.be

2 International Chair in Mathematical Physics and Applications (ICMPA), University of Abomey-Calavi, 072 B.P. 50, Cotonou, Republic of Benin

3 Center for Space Radiations (CSR), Department of Physics, Catholic University of Louvain, 2, Chemin du Cyclotron, B-1348 Louvain-la-Neuve, Belgium
E-mail: [email protected]
1. Introduction The Bardeen-Cooper-Schrieffer (BCS) theory 1 of 1957 provides a microscopic understanding for the phenomena of Low Temperature (LTc) superconductivity. 2,3 Below the critical temperature Tc, attractive electronphonon interactions lead to electron-electron Cooper pair formation in the s-wave channel. The ensuing Cooper pair condensation of such identical bosonic quantum states implies the appearance of an energy gap, associated to a spontaneous breaking of the U(l) local gauge symmety of the electromagnetic interaction, 4 hence also an effective non-zero mass for the photon which translates into the physical Meissner effect5 of magnetic field screening in any bulk superconductor. The existence of a gap A(r) also ensures the phenomenon of perfect conductivity, through the collective dynamics of the condensed Cooper pair electrons for electric currents less than some critical value.
The gap $\Delta(\vec r)$ may also be given the interpretation, up to normalisation, of the common complex valued quantum wave function of the spin 0 Cooper pairs. It also plays the role of an order parameter for the phase transition towards the superconducting state, of relevance in an effective field theory description. Among the successes of the BCS theory in the weak coupling regime, one finds the correct description of the temperature dependence of the order parameter, hence the identification of the critical temperature, the critical magnetic field for the Meissner effect, and consequently also the critical current, inclusive of subtle effects such as the isotopic dependence of the critical temperature. In effect, the superconducting state is understood in terms of a coherent superposition of electron-electron pairs of which the momentum and spin values are coupled in order to build up a spin 0 state of vanishing total momentum, in the absence of any electric current,
$$\prod_{\vec k}\left[u(\vec k) + e^{i\theta(\vec k)}\,v(\vec k)\,c^\dagger_\downarrow(-\vec k)\,c^\dagger_\uparrow(\vec k)\right], \qquad (1)$$
where $c^\dagger_\uparrow(\vec k)$ and $c^\dagger_\downarrow(\vec k)$ represent the creation operators of electron states of momentum $\vec k$ and spin projection up or down, respectively, while $u(\vec k)$ and $v(\vec k)$ stand for the probability amplitudes of occupation of states without or with a single Cooper pair of vanishing total momentum and spin. These two functions are identified through a gap equation expressing the minimisation of the energy of such a trial state with respect to these two functions, which obey a normalisation condition involving the combination $|u(\vec k)|^2 + |v(\vec k)|^2$. A few years after the formulation of the BCS theory, Gor'kov showed 6 how, through a finite temperature quantum field theory approach, it is possible to construct an effective field theory representation of the microscopic dynamics, which for all practical purposes coincides with the famous Ginzburg-Landau (GL) phenomenological description of superconductivity dating back to 1950. 7 Based on Landau's approach towards a general theory of phase transitions, within the GL theory the free energy density of the superconducting state, $F_s$, compared to that of the normal state, $F_n$, is expressed as a functional of the order parameter $\psi(\vec r)$ which, up to normalisation, is identified with the superconducting gap $\Delta(\vec r)$,
$$F_s - F_n = \alpha|\psi|^2 + \frac{1}{2}\beta|\psi|^4 + \frac{\hbar^2}{2m}\left|\left(\vec\nabla - i\frac{q}{\hbar}\vec A\right)\psi\right|^2 + \frac{1}{2\mu_0}\left(\vec B - \vec B_{ext}\right)^2, \qquad (2)$$
where $\alpha$ and $\beta$ are temperature dependent coefficients defining an effective potential energy density
$$V(|\psi|) = \alpha|\psi|^2 + \frac{1}{2}\beta|\psi|^4, \qquad (3)$$
while the notation for the other quantities is standard, and corresponds to the magnetic vector potential, $\vec A$, the magnetic induction, $\vec B$, and the externally applied magnetic induction, $\vec B_{ext}$, with $\mu_0$ being the vacuum magnetic permeability. Finally, $q = -2|e|$ and $m$ stand, respectively, for the Cooper pair electric charge and effective mass in the conducting material. Given the above potential energy, as soon as the parameter $\alpha$ turns negative below a specific critical temperature $T_c$, $\alpha(T < T_c) < 0$, one has a potential of the Higgs type with minima attained for nonvanishing expectation values of $\psi$, thereby spontaneously breaking the local U(1) phase invariance symmetry of the GL functional. As explained in any standard textbook on superconductivity, 2,3 the phenomena of perfect conductivity and diamagnetism are readily established from the GL equations for the order parameter, namely the variational equations of motion stemming from the GL functional,
$$-\frac{\hbar^2}{2m}\left(\vec\nabla - i\frac{q}{\hbar}\vec A\right)^2\psi(\vec r) + \alpha\,\psi(\vec r) + \beta\,|\psi(\vec r)|^2\psi(\vec r) = 0, \qquad \vec J(\vec r) = \frac{1}{\mu_0}\vec\nabla\times\vec B(\vec r), \qquad (4)$$
with boundary conditions requiring that the current $\vec J(\vec r)$ has a vanishing component normal to any boundary in the case of a finite domain. In particular, space dependence of the order parameter may then be accounted for, so that not only is the Meissner effect characterized by the magnetic penetration length $\lambda$, but coherence effects are also characterized by a coherence length $\xi$, with their ratio distinguishing between Type I and Type II superconductors. Namely, given the GL parameter $\kappa = \lambda/\xi$, Type I superconductors correspond to a value of the GL parameter less than $1/\sqrt{2}$, $\kappa < \kappa_c = 1/\sqrt{2}$, and Type II superconductors to a value larger than $\kappa_c$, $\kappa > \kappa_c$. The manner in which magnetic fields may or may not penetrate such materials in their bulk is different for each Type. In particular, Type II materials sustain Abrikosov vortices, 8 namely flux tubes carrying a unit value of the quantum of flux penetrating the material even in the superconducting state.
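As a small illustration of the two statements above, the following sketch locates the minimum of the quartic GL potential and classifies a material from its penetration and coherence lengths. The function names are hypothetical, and the convention that $\kappa = \kappa_c$ exactly is reported as Type II is an assumption made here only to obtain a total function; the text does not specify that boundary case.

```python
import math

KAPPA_C = 1.0 / math.sqrt(2.0)  # critical GL parameter separating Type I from Type II


def gl_minimum(alpha, beta):
    """Return |psi| at the minimum of V = alpha*|psi|^2 + (beta/2)*|psi|^4, for beta > 0."""
    if beta <= 0:
        raise ValueError("beta must be positive for the potential to be bounded below")
    # alpha >= 0: the only minimum is psi = 0 (normal state).
    # alpha < 0: Higgs-type potential, minimum at |psi|^2 = -alpha/beta.
    return 0.0 if alpha >= 0 else math.sqrt(-alpha / beta)


def superconductor_type(penetration_length, coherence_length):
    """Classify from kappa = lambda/xi; returns (label, kappa)."""
    kappa = penetration_length / coherence_length
    # Assumption: the boundary case kappa == KAPPA_C is reported as Type II.
    label = "Type I" if kappa < KAPPA_C else "Type II"
    return label, kappa
```

For instance, `superconductor_type(50e-9, 1600e-9)` uses illustrative lengths (not taken from this text) for which the coherence length greatly exceeds the penetration length, and returns a Type I label.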
Furthermore, in the limit that any spatial dependence of the order parameter may be ignored, the GL equations lead back to a yet older empirical approach to superconductivity from 1935 due to the London brothers. 9 The London equations simply read
$$\vec E = \frac{\partial}{\partial t}\left(\Lambda\vec J\right), \qquad \vec B = -\vec\nabla\times\left(\Lambda\vec J\right), \qquad (5)$$
$\vec E$ and $\vec B$ being the electric and magnetic fields, respectively, $\vec J$ the current density, and $\Lambda$ a phenomenological parameter given by
$$\Lambda = \frac{m}{n_s q^2}, \qquad (6)$$
$n_s$ being the density of superconducting electrons. Perfect conductivity is a direct consequence of the first London equation, while the Meissner effect follows from the second with a magnetic penetration length $\lambda_L$ such that
$$\lambda_L^2 = \frac{\Lambda}{\mu_0}. \qquad (7)$$
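To give a feeling for the orders of magnitude involved, here is a minimal numerical sketch of the London parameter and penetration length in SI units. The carrier density used in the example is an illustrative assumption, not a value quoted in this text, and the function names are hypothetical.

```python
import math

MU_0 = 4.0e-7 * math.pi               # vacuum permeability, H/m
ELECTRON_MASS = 9.1093837e-31         # kg
ELEMENTARY_CHARGE = 1.602176634e-19   # C


def london_parameter(mass, density, charge):
    """Lambda = m / (n_s q^2)."""
    return mass / (density * charge ** 2)


def london_penetration_length(mass, density, charge):
    """lambda_L = sqrt(Lambda / mu_0)."""
    return math.sqrt(london_parameter(mass, density, charge) / MU_0)


# Illustrative assumption: 1e28 superconducting electrons per cubic metre.
n_s = 1.0e28
lambda_L = london_penetration_length(ELECTRON_MASS, n_s, ELEMENTARY_CHARGE)
print(f"lambda_L = {lambda_L * 1e9:.1f} nm")
```

With this assumed density the result lies in the tens of nanometres. Note also that counting pairs (mass $2m$, charge $-2|e|$, density $n_s/2$) instead of single electrons leaves $\Lambda$ unchanged, so either bookkeeping gives the same length.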
Of course, it is to be understood that all the above descriptions and their ensuing equations of motion are also coupled to Maxwell's equations of electromagnetism,
$$\vec\nabla\cdot\vec E = \frac{1}{\epsilon_0}\rho, \qquad \vec\nabla\times\vec E + \frac{\partial\vec B}{\partial t} = 0, \qquad \vec\nabla\cdot\vec B = 0, \qquad \vec\nabla\times\vec B - \frac{1}{c^2}\frac{\partial\vec E}{\partial t} = \mu_0\vec J, \qquad (8)$$
$\rho$ being the electric charge density, and $\epsilon_0$ and $\mu_0$ the usual electric permittivity and magnetic permeability of the vacuum such that $\epsilon_0\mu_0 = 1/c^2$, $c$ being the speed of light in vacuum. But one of the players in these latter equations which is conspicuously missing from the above discussion of available descriptions of LTc superconductivity is the electric field. How do electric fields influence or affect the electromagnetic properties of superconducting materials? The usual answer 2,3 to this question is that, in the stationary state without any time dependence, electric fields cannot have any effect whatsoever since, according to the first London equation and because of the perfect conductivity of any superconductor, the electric field must vanish identically, at least for bulk materials. Given that this answer is presumably acceptable, it nonetheless remains possible that for nanoscopic materials of increasing use and interest in nanotechnology, electric fields may have some effect close to the surface of such materials, since it would be difficult to imagine how an externally applied electric field could abruptly and discontinuously vanish
when moving from the outside to the inside of such a conductor. 10 In the present contribution, we briefly discuss this question, and present some of the conclusions that have been reached through work of which far more details may be found in Ref. 11. The characterisation of the problem is presented in Sec. 2. Next, a first framework in which to address the issue is briefly considered in Sec. 3, with experimental results proving that the analysis must be extended to include the effects of all electrons of a superconducting material, even the "normal" ones. An appropriate framework is then developed in Sec. 4, leading first to the identification of the effective potential energy in analogy with the GL potential, and next, in Sec. 5, to some further dynamical properties of the superconducting state in the presence of an applied electric field. Finally, Sec. 6 offers some conclusions and prospects for further work along similar lines.
2. The Problem
To highlight the issue mentioned above from different points of view, let us first recall that the relativistic covariance properties of Maxwell's equations are best made manifest through the fact that the electromagnetic scalar and vector potentials, $\Phi$ and $\vec A$, respectively, define the components of a 4-vector as
$$A^\mu = \left(\frac{\Phi}{c}, \vec A\right), \qquad \mu = 0,1,2,3, \qquad (9)$$
while the associated field strength tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ (with $x^\mu = (ct,\vec x)$) is directly related to the electric and magnetic fields $\vec E/c$ and $\vec B$ as
$$\frac{\vec E}{c} = -\vec\nabla\frac{\Phi}{c} - \frac{\partial}{\partial(ct)}\vec A, \qquad \vec B = \vec\nabla\times\vec A. \qquad (10)$$
The fact of the matter is that all the approaches briefly reviewed in the Introduction are intrinsically nonrelativistic, but are nonetheless coupled to the relativistic covariant Maxwell's equations. In itself this is not necessarily problematic provided the considered regime remains nonrelativistic in a sense to be specified. However, since a nonrelativistic limit amounts to taking a limit such that $1/c \to 0$, clearly any of the effects related to the electric scalar potential, $\Phi/c$, and field, $\vec E/c$, in the same units as those of the magnetic sector, decouple in such a limit. Electric field effects in superconductors are thus at best subleading in $1/c$, but not necessarily vanishing altogether. Therefore one ought to develop a manifestly relativistic invariant framework in which to analyse such effects.
From yet another point of view, one may also argue for the necessity of such a framework by considering specific experimental set-ups. 10 For example, imagine an infinite slab subjected to an external magnetic field parallel to it, without an external electric field being applied in the laboratory frame. Due to the Meissner effect, the magnetic field will penetrate the slab only up to a typical distance set by the magnetic penetration length $\lambda$. Imagine now performing a Lorentz boost in a direction both parallel to the slab and perpendicular to the magnetic field. In such a boosted frame, not only is the strength of the magnetic field slightly modified, but more importantly there now appears an electric field perpendicular to the slab, and a priori also inside the superconductor, and thus necessarily with precisely the same penetration length as the magnetic field! Hence, if for such a gedanken experiment it is justified to restrict only to electrons in the superconducting state, one is forced to conclude, on the basis of relativistic covariance, that an electric field does not necessarily vanish inside a superconductor, and does penetrate such materials with a penetration length identical to the magnetic one. Such considerations thus call for a framework in which Maxwell's electromagnetism is coupled to the superconducting state in a manifestly relativistic invariant manner. Such a framework may also be of relevance to other issues of superconductivity, especially for heavy metallic compounds corresponding to chemical elements of large Z values, implying significant relativistic corrections to electronic orbital properties. 12,13 In effect, an experiment such as the one described above has been performed using a nanoscopic slab of superconducting aluminium, subjected to both a magnetic field parallel to the slab and an electric field perpendicular to it (for the details, see Ref. 11).
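The boost argument of the gedanken experiment can be checked numerically with the standard Lorentz transformation of the electromagnetic fields. The sketch below is only illustrative: the axis choice (slab in the x-z plane with normal along y, field along z, boost along x), the field strength and the boost speed are all assumptions made for the example.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def boost_fields(E, B, v):
    """Return (E', B') as seen in a frame moving with velocity v relative to the lab."""
    speed = math.sqrt(dot(v, v))
    if speed == 0.0:
        return E, B
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    n = tuple(vi / speed for vi in v)  # unit vector along the boost direction
    v_x_B, v_x_E = cross(v, B), cross(v, E)
    E_par, B_par = dot(E, n), dot(B, n)
    E_new = tuple(gamma * (E[i] + v_x_B[i]) - (gamma - 1.0) * E_par * n[i]
                  for i in range(3))
    B_new = tuple(gamma * (B[i] - v_x_E[i] / C ** 2) - (gamma - 1.0) * B_par * n[i]
                  for i in range(3))
    return E_new, B_new


# Lab frame: magnetic field along z, no electric field (illustrative values).
B_lab = (0.0, 0.0, 0.01)        # tesla
E_lab = (0.0, 0.0, 0.0)
velocity = (0.1 * C, 0.0, 0.0)  # boost parallel to the slab, perpendicular to B
E_boosted, B_boosted = boost_fields(E_lab, B_lab, velocity)
print(E_boosted, B_boosted)
```

In this set-up the only nonvanishing electric component in the boosted frame is along y, i.e. normal to the slab, and the magnetic field is rescaled by the Lorentz factor, while the invariant $\vec B^2 - \vec E^2/c^2$ is unchanged.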
Much to our surprise, no effect of the electric field whatsoever, even when ramped up to considerable values, was observed on the critical temperature for the superconducting state, while the latter's values and dependence on the magnetic field were properly observed, measured and seen to coincide with established values for the critical temperature in the case of that material. The experiment was analysed within the framework of both the nonrelativistic GL approach, and its obvious covariant generalisation through the U(1) Higgs model for a charged complex scalar field with Lagrangian density functional 4,10,14
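$$\mathcal{L} = -\frac{1}{4}\epsilon_0 c\,F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\epsilon_0 c\left(\frac{\hbar}{q\lambda}\right)^2\left\{\left|\left(\partial_\mu + i\frac{q}{\hbar}A_\mu\right)\psi\right|^2 - \frac{1}{2\xi^2}\left(|\psi|^2 - 1\right)^2\right\}. \qquad (11)$$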
In the latter case, indeed the London equations are modified in the manner expected on account of manifest Lorentz covariance, with in particular necessarily identical electric and magnetic penetration lengths. 10 Either framework predicts effects of an electric field on such an experimental set-up, with specific characterisations of these effects as a function of the temperature. 10 And much to our surprise, no effect whatsoever was observed in spite of repeated and carefully prepared measurements and nanoscopic samples of the aluminium slab. 11 Faced with this conundrum, we were led to the necessity of developing a microscopic model of s-wave superconductivity which is manifestly relativistic invariant and accounts for all electronic states, whether "superconducting" or "normal" states, the latter being the main suspects behind the close-to-perfect screening of any electric field however large. In other words, a relativistic extension of the BCS theory for LTc superconductors is a priori required to account for the experimental results.
3. A Model
The superconducting state we are interested in being in thermodynamical equilibrium, the natural framework to model the problem at the microscopic level is that of Finite Temperature Quantum Field Theory (FTQFT), 15,16 in which electron states are described by the Dirac field coupled to the electromagnetic field in a U(1) gauge invariant manner. One needs to compute the partition function of the system, in the presence of stationary background electric and magnetic fields, as well as a chemical potential 16,17 $\mu$ for the electron states, namely
$$Z(\beta) = \mathrm{Tr}\,e^{-\beta(H - \mu N)}, \qquad \beta = \frac{1}{kT}, \qquad (12)$$
$H$ being the Hamiltonian of the system, $N$ its electron number operator, $T$ the absolute temperature and $k$ Boltzmann's constant, while the trace is over all quantum states of the system. The calculation of such a partition function proceeds through both operator and path integral techniques. The Hamiltonian to be used stems from the Lagrangian density describing the microscopic dynamics
$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\psi\left(i\gamma^\mu\partial_\mu - m\right)\psi - eA_\mu\bar\psi\gamma^\mu\psi + \mathcal{L}_{4f}. \qquad (13)$$
Henceforth, natural units such that $\hbar = 1 = c$ and $\epsilon_0 = 1 = \mu_0$ are used. Here, $\psi$ stands for a Dirac 4-spinor for the Dirac-Clifford algebra $\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}$, $\eta^{\mu\nu}$ being the four-dimensional Minkowski spacetime metric of mostly negative signature (for the space components), $m$ is the electron mass and $e < 0$ its electric charge, while $\bar\psi = \psi^\dagger\gamma^0$. Finally, $\mathcal{L}_{4f}$ stands for the four-fermion effective interaction used to model the phonon-mediated electron-electron interaction responsible for the superconducting state. A priori, the 4-fermion interaction may involve some combination of all possible relativistic invariant 4-electron operators, of the form
$$\mathcal{L}_{4f} = g_1\left(\bar\psi\psi\right)^2 + g_2\left(\bar\psi\gamma_5\psi\right)^2 + g_3\left(\bar\psi\gamma^\mu\psi\right)^2 + g_4\left(\bar\psi\gamma^\mu\gamma_5\psi\right)^2 + g_5\left(\bar\psi\sigma^{\mu\nu}\psi\right)^2, \qquad (14)$$
with, as usual, $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ and $\sigma^{\mu\nu} = i[\gamma^\mu,\gamma^\nu]/2$, and $g_i$ ($i = 1,2,3,4,5$) arbitrary real coupling constants. However, since the model describes a single fermionic species of which the field degrees of freedom are represented by Grassmann odd variables, and applying Fierz identities, it follows that the same effective interaction may be brought into the form
$$\mathcal{L}_{4f} = P_1\left(\bar\psi_c\psi\right)^\dagger\left(\bar\psi_c\psi\right) + P_2\left(\bar\psi_c\gamma_5\psi\right)^\dagger\left(\bar\psi_c\gamma_5\psi\right) + P_3\left(\bar\psi_c\gamma^\mu\gamma_5\psi\right)^\dagger\left(\bar\psi_c\gamma_\mu\gamma_5\psi\right), \qquad (15)$$
where $\psi_c = \eta_c C\bar\psi^T$, $\eta_c$ being an arbitrary phase factor and $C$ the charge conjugation matrix operator. A detailed analysis of the particle and spin content of these different operators in a nonrelativistic limit shows that, in the order in which they appear in the above relation, the first accounts for a p-wave order parameter, the second for the s-wave BCS state, and the third for a superposition of d- and s-wave contributions. This classification does not coincide with different conclusions available in the literature, which failed to account for the Grassmann character of the electron field and misidentified the properties of such a classification under parity. 12,13 Focusing on s-wave BCS superconductivity, we are thus led to consider the following 4-fermion effective interaction
$$\mathcal{L}_{4f} = -\frac{1}{2}g\left(\bar\psi_c\gamma_5\psi\right)^\dagger\left(\bar\psi_c\gamma_5\psi\right), \qquad (16)$$
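$g$ being a real coupling constant, in order to model the phonon-mediated interaction between electron pairs, leading in fine to the Cooper pair condensed state below the critical temperature. Given that choice as well as well established techniques of FTQFT, 16 it follows that the partition function of the system may be given the following path integral representation,
$$Z(\beta) = \int\mathcal{D}\left[\bar\psi,\psi,\Delta\right]\,e^{-\int_0^\beta d\tau\int d^3\vec x\,\mathcal{L}_E}, \qquad (17)$$
where
$$\mathcal{L}_E = \frac{1}{2}\left[\psi^\dagger\partial_\tau\psi - \partial_\tau\psi^\dagger\,\psi\right] - \frac{1}{2}i\left[\bar\psi\,\vec\gamma\cdot\vec\nabla\psi - \vec\nabla\bar\psi\cdot\vec\gamma\,\psi\right] + m\bar\psi\psi + eA_\mu\bar\psi\gamma^\mu\psi - \mu_0\psi^\dagger\psi + \frac{1}{2g}|\Delta|^2 - \frac{1}{2}\left[\Delta^\dagger\left(\bar\psi_c\gamma_5\psi\right) + \Delta\left(\bar\psi_c\gamma_5\psi\right)^\dagger\right], \qquad (18)$$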
while from now on $\mu_0$ stands for the chemical potential in the absence of external electromagnetic fields. Here, $\tau$ is the imaginary time parameter in which bosonic fields must be periodic and fermionic ones antiperiodic with period $\beta$, while $\Delta$ is an auxiliary field which is introduced in order to express the 4-fermion interaction in terms of only quadratic couplings of the Dirac field. The ensuing Grassmann odd gaussian integrals are then readily feasible, leading to an effective action for the auxiliary field $\Delta$. As a matter of fact, the field $\Delta$ thus coincides with the order parameter of the superconducting state, and measures, up to normalisation, the local density of Cooper pairs in that state. The effective action obtained through the integration over all fermionic degrees of freedom thus corresponds, in the relativistic setting, to the GL action functional in the nonrelativistic setting. In effect, this is also how Gor'kov established the GL effective description from the BCS microscopic one. 6
4. The Effective Potential
Before addressing the issue of the external field dependence of the effective action, it is of interest to identify the effective potential independently of such external electromagnetic disturbances and spatial variations of the order parameter. For all practical purposes, this effective potential should correspond to the GL potential of the Higgs type in (3). In such specific circumstances, given the absence of external electromagnetic fields and the assumption of a space independent order parameter $\Delta_0$, the calculation may be performed exactly through operator techniques by relying on a Bogoliubov transformation which enables one to identify the associated Cooper pairs in analogy with (1). Details may be found in Ref. 11. What such a Bogoliubov transformation achieves is an exact diagonalisation of the quantum Hamiltonian under the above circumstances, with the
Cooper pair condensate defining the physical ground state. Furthermore, excitations of the Cooper pair condensate correspond to collective modes of the electron system of definite momentum and charge, hence also of definite energy, known as pseudo-particles. Any of these states, whether the Cooper pair ground state or its pseudo-particle excitations, are obtained as coherent superpositions of the modes of the original free quantum electronic states of the Dirac spinor and its perturbative vacuum. Denoting by $V$ the volume of the superconductor, the effective potential energy density is expressed as
$$V_{eff} = \frac{1}{2g}|\Delta_0|^2 + \int\frac{d^3\vec k}{(2\pi)^3}\left[2\omega(\vec k) - E_B(\vec k) - E_D(\vec k)\right] - \frac{2}{\beta}\int\frac{d^3\vec k}{(2\pi)^3}\ln\left[1 + e^{-\beta E_B(\vec k)}\right] - \frac{2}{\beta}\int\frac{d^3\vec k}{(2\pi)^3}\ln\left[1 + e^{-\beta E_D(\vec k)}\right], \qquad (19)$$
where
$$E_B(\vec k) = \sqrt{\left(\omega(\vec k) - \mu_0\right)^2 + |\Delta_0|^2}, \qquad E_D(\vec k) = \sqrt{\left(\omega(\vec k) + \mu_0\right)^2 + |\Delta_0|^2}, \qquad \omega(\vec k) = \sqrt{\vec k^2 + m^2}. \qquad (20)$$
Here, $E_B(\vec k)$ and $E_D(\vec k)$ stand for the energies of the pseudo-particle excitations of electron and positron type, respectively, in the presence of the Cooper pair condensate $\Delta_0$, which clearly also specifies the gap value in the dispersion relations for such collective excitations of the superconducting state. Note that the chemical potential $\mu_0$, in the case of an electron conductor, is bounded below by $m$ (in the case of a positron conductor, $\mu_0$ would be bounded above by $-m$), since the electron's relativistic rest mass energy must be added to the ordinary nonrelativistic chemical potential value in the present Lorentz covariant framework. By minimisation of the effective potential and applying the usual BCS approximation, which consists in restricting the momentum integration to a region surrounding the Fermi level with a cut-off set to coincide with the Debye lattice frequency to which a Debye energy $\xi_D$ is associated, 1-3 one obtains the following gap equation for the order parameter $\Delta_0$,
$$\int_{-\xi_D}^{\xi_D}\frac{d\xi}{\sqrt{\xi^2 + |\Delta_0|^2}}\,\tanh\left(\frac{1}{2}\beta\sqrt{\xi^2 + |\Delta_0|^2}\right) = \frac{4}{gN(0)}, \qquad (21)$$
$N(0)$ being the density of states at the Fermi level. In this expression, possible contributions due to positron-like states are not retained since their value is totally insignificant in the case of an ordinary LTc superconductor. This gap equation may be seen to coincide with the usual BCS gap equation. 1 In particular, in the weak coupling regime and at $T = 0$ K, its solution reads
$$|\Delta_0(0)| = \frac{\pi}{e^\gamma}\,\frac{1}{\beta_c} \simeq 2\,\xi_D\,e^{-2/gN(0)}, \qquad (22)$$
$\gamma$ being the Euler constant, $\gamma \simeq 0.577$, and $\beta_c = 1/kT_c$. Hence, one recovers the BCS results, and the order parameter interaction chosen in (16) does indeed represent s-wave BCS Cooper pairs within the present relativistic framework. Given the kinematical regime in which the model is being considered, relativistic corrections to the effective potential thus prove to be totally insignificant. Even though we shall refrain, for lack of space, from presenting here graphs of the effective potential (which would be quite illustrative of the physical results, for which the interested reader is again referred to Ref. 11), the effective potential (20) is indeed of the Higgs type below the critical temperature, namely with an absolute minimum for a nonvanishing order parameter $\Delta$. However, although a quartic approximation of the Higgs form
$$V^{(1)}_{eff}(x) = F(x_0) + \left[F(0) - F(x_0)\right]\left[\left(\frac{x}{x_0}\right)^2 - 1\right]^2, \qquad x = |\Delta|, \qquad (23)$$
is quite satisfactory for temperatures $T$ sufficiently close to $T_c$ and $\Delta$ values sufficiently close to the solution to the gap equation (21), and in which the coefficients $F(0)$ and $F(x_0)$ are chosen to coincide with the values of $V_{eff}(x)$ for $x = 0$ and $x = x_0$, $x_0 = |\Delta_0|$ standing for the solution to the gap equation, better approximations are possible, which greatly extend the temperature range below $T_c$ for which, for all practical purposes, the approximation coincides with the exact effective potential given by its integral definition in (20). Possible examples are 11
$$V^{(2)}_{eff}(x) = F(x_0) + \left[F(0) - F(x_0)\right]\left[\frac{\ln\left(x_0^2 + \lambda x_0^2\right) - \ln\left(x^2 + \lambda x_0^2\right)}{\ln\left(x_0^2 + \lambda x_0^2\right) - \ln\left(\lambda x_0^2\right)}\right]^2, \qquad (24)$$
V$(x)=F(xo)
+
[F(0)-F(xo)}
1+7*
[1 + 7]°
-1 (25)
where $\lambda$, $\alpha$ and $\gamma$ are parameters whose values and temperature dependence may be fitted 11 to the exact effective potential, leading to very efficient approximations to the exact expression, reliable in far greater temperature and order parameter ranges away from their critical values than the usual Higgs potential of the quartic type, $V^{(1)}_{eff}(x)$. A study of the phenomenological consequences of such generalised Higgs-like potentials could be of interest, in particular for what concerns their vortex solutions. 14,18,19
5. The Effective Action
By including the effects of external electromagnetic fields, a computation of the full effective action, and not only the effective potential, is feasible through a perturbative expansion. Namely, by also including effects due to space gradients both in the order parameter and in the electromagnetic potentials $\Phi$ and $\vec A$, it is possible to obtain explicit expressions for all physically relevant parameters which empirically characterise the superconducting state. Thus not only are the coherence and magnetic penetration length values and their temperature dependences obtained, but also those of the electric penetration length. Furthermore, other characterisations also become accessible, which are usually not discussed in the literature for lack of interest in the possible effects of electric fields on superconductors. For instance, it is also possible to study how the total electron charge density distribution, which ought to balance the background lattice charge distribution, is accounted for by the order parameter ("superconducting electrons"), and thus how superconductors could locally acquire charge in specific circumstances. Likewise, a study of the dependence of all the above quantities on the chemical potential $\mu_0$ is also feasible, and is of interest since varying its value amounts to depleting the superconductor of its electrons, or else increasing that number, in other words, charging the material.
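As a numerical aside on the gap equation of Sec. 4 and the statement that the BCS results are recovered, the following sketch solves a gap equation of the standard weak-coupling BCS form, $\int_0^{\xi_D} d\xi\,\tanh(\beta E/2)/E = 1/\lambda_{\rm eff}$ with $E = \sqrt{\xi^2 + \Delta^2}$. The dimensionless coupling `coupling` (standing for $\lambda_{\rm eff}$) is a hypothetical parameter assumed to absorb $gN(0)$ and its normalisation; the Debye energy is set to 1, and the quadrature and bisection routines are deliberately simple.

```python
import math

EULER_GAMMA = 0.5772156649015329


def gap_integral(delta, beta, xi_d=1.0, n=4000):
    """Midpoint-rule estimate of int_0^xi_d tanh(beta*E/2)/E dxi, with E = sqrt(xi^2 + delta^2)."""
    h = xi_d / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        energy = math.sqrt(xi * xi + delta * delta)
        total += math.tanh(0.5 * beta * energy) / energy
    return total * h


def bisect(func, lo, hi, iterations=60):
    """Root of func on [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def zero_temperature_gap(coupling, xi_d=1.0):
    """Gap at T = 0 (beta -> infinity, so tanh -> 1)."""
    return bisect(lambda d: gap_integral(d, math.inf, xi_d) - 1.0 / coupling,
                  1e-4 * xi_d, 10.0 * xi_d)


def critical_kT(coupling, xi_d=1.0):
    """k*T_c, defined by the gap equation admitting the solution delta = 0."""
    beta_c = bisect(lambda b: gap_integral(0.0, b, xi_d) - 1.0 / coupling,
                    1e-3 / xi_d, 1e4 / xi_d)
    return 1.0 / beta_c


coupling = 0.3  # illustrative weak-coupling value
delta_0 = zero_temperature_gap(coupling)
kT_c = critical_kT(coupling)
print(delta_0 / kT_c, math.pi / math.exp(EULER_GAMMA))
```

For a weak coupling such as the one chosen here, the ratio of the zero-temperature gap to $kT_c$ comes out within a fraction of a percent of $\pi/e^\gamma \simeq 1.76$, the residual difference being a finite-coupling correction.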
For lack of space, the results obtained so far along such lines are not detailed here. They are available in Ref. 11 together with relevant and illustrative graphs. In the case study of aluminium, experimental values for the magnetic penetration length and its temperature dependence are well reproduced from our analysis when proper account is given of the role of impurity electron rescattering. For what concerns the electric penetration length, our analysis reveals that this observable, heretofore never computed in the literature, receives two types of contributions, in contradistinction to the magnetic penetration length, whose value depends solely on the Cooper pair condensate density $|\Delta_0|$. Indeed, for the electric penetration length, not only is there a contribution akin to the magnetic penetration length, as expected by reason of the Lorentz covariance arguments discussed in Sec. 2, but in addition the ordinary Thomas-Fermi screening effect existing in normal conductors 20 is also at work. Since typical values for the latter screening length are on the order of an Angstrom or even less, while magnetic penetration lengths typically range in the tens to hundreds of nanometers, and since their combined effect, which finally sets the electric penetration length, derives essentially from the sum of their squared inverse values, namely
$$\frac{1}{\lambda^2_{\rm electric}} = \frac{1}{\lambda^2_{\rm magnetic}} + \frac{1}{\lambda^2_{\rm Thomas-Fermi}}, \qquad (26)$$
it follows that it is the Thomas-Fermi screening effect which by far dominates the screening effects of electric fields in superconductors. This conclusion also explains the null results of our experimental measurements mentioned in Sec. 2. In other words, Cooper pair contributions, namely "superconducting electron" contributions, are indeed similar in value for both the magnetic and electric penetration lengths, but in the latter case contributions from "normal electrons" are also involved, and their effect being so overwhelming in the case of that observable, the complete superconducting electric penetration length essentially coincides with the Thomas-Fermi screening length of the conductor even in the normal conducting state. In fact, since all electron states are being integrated out in the path integral leading to the effective action, the distinction between "superconducting" and "normal" electrons is a matter of convention and arbitrary definition, possible for instance by comparing local charge distributions to the background lattice charge distribution which is also, up to the sign, that of the conducting electrons in the normal state. In technical terms, the electric penetration length is identified directly from the effective potential rather than the full effective action, through the dependence of the effective potential (20) on the chemical potential. Indeed, the chemical potential adds up with the electrostatic potential, and the actual effective action is then a function of the electrochemical potential. Through an expansion in the electrostatic potential, one then identifies the electric penetration length. More specifically, given the effective action computed as indicated above through the path integral over the
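fermionic degrees of freedom, one still needs to add to it the Hamiltonian or energy density of the purely electromagnetic sector, which is treated semi-classically in the effective field theory approach. The latter reads
$$\mathcal{H}_{em} = \Phi\,\rho_{tot} - \frac{1}{2}\left(\vec\nabla\Phi\right)^2 + \frac{1}{2}\left(\vec\nabla\times\vec A\right)^2, \qquad (27)$$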
where $\rho_{tot}$ stands for the total charge density in the conductor, inclusive of the background lattice and valence electron contributions (the "static" charges), to which those of conducting electrons modeled through the above discussion are to be added. Consequently, for what concerns a stationary configuration, by adding this contribution to that following from the effective action one is left with a local functional of the form
$$\mathcal{H}_{tot} = f\left|\left(\vec\nabla - 2ie\vec A\right)\Delta\right|^2 + \Phi\,\rho_{tot} - \frac{1}{2}g\,\Phi^2 - \frac{1}{2}\left(\vec\nabla\Phi\right)^2 + \frac{1}{2}\left(\vec\nabla\times\vec A\right)^2, \qquad (28)$$
where $f$ and $g$ are quite involved expressions determined from the effective action calculation. Note well however that the term in $g\Phi^2$ derives solely from the effective potential rather than the full effective action, whereas the term involving $f$ is a contribution from the effective action per se which determines the magnetic penetration length. Deriving now the equation of motion for the electrostatic potential,
$$\vec\nabla^2\Phi - g\,\Phi + \rho_{tot} = 0, \qquad (29)$$
it is quite clear that the electric field penetration length is determined by the coefficient $g$ through
$$\frac{1}{\lambda^2_{\rm electric}} = g. \qquad (30)$$
Hence indeed the electric field penetration length is solely determined from the expansion to second order of the effective potential with respect to the chemical potential, since in the presence of an external electrostatic potential the effective potential is a function of the electrochemical potential, namely the sum of both the chemical and electrostatic potentials. Finally, the explicit analysis finds that $g$ does receive two types of contributions, one which vanishes for a vanishing order parameter $\Delta_0$, i.e., in the absence of Cooper pairs or the "superconducting electron" contribution, and the second which remains finite even when $\Delta_0 = 0$, namely the "normal electron" contribution, which for all practical purposes leads in effect to the Thomas-Fermi length.
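The dominance of the Thomas-Fermi contribution can be made concrete with a short sketch. It combines the two screening lengths through the sum of their inverse squares and evaluates the exponentially decaying solution of the one-dimensional screened Poisson equation $\Phi'' - g\Phi = 0$, under the simplifying assumption (made here only for illustration) of a vanishing total charge density inside the material. The numerical lengths are illustrative values consistent with the orders of magnitude quoted in the text, and the function names are hypothetical.

```python
import math


def electric_penetration_length(lambda_magnetic, lambda_thomas_fermi):
    """Combine both screening mechanisms: g = 1/lambda_mag^2 + 1/lambda_TF^2, lambda_el = 1/sqrt(g)."""
    g = 1.0 / lambda_magnetic ** 2 + 1.0 / lambda_thomas_fermi ** 2
    return 1.0 / math.sqrt(g)


def screened_potential(x, phi_surface, penetration_length):
    """Decaying solution of Phi'' - Phi/lambda^2 = 0 for x >= 0 (assumes rho_tot = 0 inside)."""
    return phi_surface * math.exp(-x / penetration_length)


# Illustrative values: 50 nm magnetic length, 0.05 nm Thomas-Fermi length.
lambda_mag = 50e-9
lambda_tf = 0.05e-9
lambda_el = electric_penetration_length(lambda_mag, lambda_tf)
print(lambda_el / lambda_tf)                         # very close to 1
print(screened_potential(1e-9, 1.0, lambda_el))      # potential 1 nm inside the surface
```

With these numbers the combined length differs from the Thomas-Fermi length by less than one part in a million, so that an applied potential is already negligible a nanometre below the surface.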
Given this fact, it thus appears that a similar calculation is perfectly feasible also in a nonrelativistic setting, since one only needs to consider the dependence of the effective potential on the chemical potential. Nevertheless, and somewhat surprisingly perhaps, this dependence does not appear to ever have been studied previously, and our result is thus totally new in the literature. Given that the Thomas-Fermi screening length increases with the depletion of conducting electrons, i.e., by positively charging the conductor, it would appear that one could possibly remove the effect of the "normal" electrons and thus reach a regime in which both the magnetic and electric penetration lengths have comparable values, enabling an experimental confirmation of the effects of electric fields in a set-up of the type used in our experiments. Unfortunately, a detailed analysis of the dependence on the chemical potential $\mu_0$ of both these penetration lengths, given our explicit results, has established that such a regime is never achieved. Even though both lengths essentially diverge when the conductor is totally depleted of its conducting electrons, their ratio never approaches a value close to unity; rather, it essentially retains its value for the neutral conductor. Likewise for what concerns their dependence on temperature, although the magnetic penetration length diverges close to $T_c$, the electric one does not display any particular behaviour when crossing the critical threshold because of the dominance of the Thomas-Fermi screening length, and one reproduces the correct values and temperature dependence of the latter quantity in the normal state as well. Finally, our analysis has provided for the first time the temperature dependence of the coherence length of a superconductor such as aluminium.
Even though the numerical values obtained for that material coincide with measured ones, our explicit expressions for that quantity lead to an unexpected behaviour of that observable when approaching the critical temperature. Indeed, while it remains rather stable at low temperatures, upon approaching $T_c$ one observes first a slight dip (on the order of 10%) in its value before it increases, as expected phenomenologically within the GL framework, as $T_c$ is reached. Note that in contradistinction to the magnetic penetration length, which is accessible experimentally, there do not appear to be measurements available in the literature of the temperature dependence of the coherence length of superconductors. This unexpected behaviour of the coherence length requires further corroboration, both experimental and theoretical.
6. Conclusions and Prospects
In this brief contribution, we have described some of the results achieved through a relativistic invariant extension of the well established BCS theory of LTc superconductivity. The motivation for this study is a better understanding, in a relativistic regime at a later dynamical stage, of the effects not only of magnetic fields but also of electric fields on the superconducting state. Following experimental measurements performed on nanoscopic superconductors of which the results were totally unexpected, it was realised that the role of "normal" electrons is also crucial for what concerns such electric field effects. When these are properly included within a microscopic framework, it appears that ordinary screening effects in conductors overwhelm the properties stemming from the superconducting state, an occurrence which does not apply to magnetic field effects for which only the contributions from "superconducting" electrons are relevant. Having identified precisely the origin of the different contributions to the total electric penetration length, it appears that, at least in the instance of that specific observable, a nonrelativistic analysis of the effective potential and its dependence on the chemical potential would have sufficed to reach the same conclusion. In the regime of temperatures and chemical potentials of relevance to ordinary superconductors, indeed the effect of positron-like states is perfectly insignificant. There exist other physical environments though, for which this would no longer be the case, for instance in the astrophysical context. Our work leaves open a series of issues and even possibilities of detailed study which deserve to be investigated further. For instance, our analysis predicts that under certain circumstances superconductors would acquire locally on their surface nonvanishing charge.
Since in recent years such measurements have become possible, a detailed analysis of this issue would be of interest with the prospect of experimental validation of the model. An unexpected temperature dependence of the coherence length has been identified, which deserves confirmation. The relativistic framework has also led to possible types of order parameters beyond the s-wave BCS one, including p- and d-wave order parameters. As is well known, High Temperature (HTc) superconductors display properties of mixtures of s-, p- and d-wave order parameters, in combinations depending on the material being considered. Would the present framework enable a description of some of these HTc superconductors? Note also that by combining in the effective four-fermion interaction a superposition of the three types of order parameters, one would obtain a description of systems possessing
more than one gap, as has indeed also been observed for some HTc materials. There is thus a rich phenomenology of properties to be described through such generalisations of our work. Finally, one may still extend further the choice of four-fermion interaction, by including higher derivative couplings or introducing Lorentz noninvariant couplings. Indeed, the thermodynamical description remains tied to the rest frame of the material being considered, and from that point of view one may still extend the range of possible four-fermion interactions, and see whether some classes of models could account for the observed properties of new classes of superconductors.

Acknowledgements

J. G. acknowledges the Abdus Salam International Centre for Theoretical Physics (ICTP, Trieste, Italy) Visiting Scholar Programme in support of a Visiting Professorship at the International Chair in Mathematical Physics and Applications (ICMPA). This work is partially supported by the Belgian Federal Office for Scientific, Technical and Cultural Affairs through the Interuniversity Attraction Pole (IAP) P5/27.

References

1. J. Bardeen, L. N. Cooper and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957).
2. M. Tinkham, Introduction to Superconductivity, 2nd edition (McGraw-Hill, New York, 1996).
3. J. R. Waldram, Superconductivity of Metals and Cuprates (Institute of Physics Publishing, Bristol, 1996).
4. S. Weinberg, The Quantum Theory of Fields, Vol. II (Cambridge University Press, Cambridge (UK), 1996), pp. 332-352.
5. W. Meissner and R. Ochsenfeld, Naturwiss. 21, 787 (1933).
6. L. P. Gor'kov, Zh. Eksp. Teor. Fiz. 36, 1364 (1959).
7. V. L. Ginzburg and L. D. Landau, Zh. Eksp. Teor. Fiz. 20, 1064 (1950).
8. A. A. Abrikosov, Zh. Eksp. Teor. Fiz. 32, 1442 (1957) (English translation, Sov. Phys. JETP 5, 1174 (1957)).
9. F. London and H. London, Proc. R. Soc. London A 149, 71 (1935).
10. J. Govaerts, D. Bertrand and G. Stenuit, On electric fields in low temperature superconductors, Supercond. Sci. Technol. 14, 463 (2001).
11. D. Bertrand, A Relativistic BCS Theory of Superconductivity: An Experimentally Motivated Study of Electric Fields in Superconductors, Ph.D. Thesis, Catholic University of Louvain (Louvain-la-Neuve, Belgium, July 2005), available at http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-06012006-193449/.
12. K. Capelle and E. K. U. Gross, Phys. Rev. B 59, 7140 (1999); K. Capelle and E. K. U. Gross, Phys. Rev. B 59, 7155 (1999).
13. T. Ohsaku, Phys. Rev. B 65, 024512 (2002); T. Ohsaku, Phys. Rev. B 66, 054518 (2002).
14. J. Govaerts, J. Phys. A 34, 8955 (2001).
15. J. I. Kapusta, Finite Temperature Field Theory (Cambridge University Press, Cambridge (UK), 1989).
16. M. Le Bellac, Thermal Field Theory (Cambridge University Press, Cambridge (UK), 1996).
17. R. P. Feynman, Statistical Mechanics: A Set of Lectures (Benjamin/Cummings Publishing, Reading, Massachusetts, 1972).
18. J. Govaerts, G. Stenuit, D. Bertrand and O. van der Aa, Annular vortex solutions to the Landau-Ginzburg equations in mesoscopic superconductors, Phys. Lett. A 267, 56 (2000).
19. G. Stenuit, S. Michotte, J. Govaerts and L. Piraux, Supercond. Sci. Technol. 18, 174 (2005).
20. C. Kittel, Introduction to Solid State Physics, 3rd edition (John Wiley & Sons, New York, 1966).
ANALYTICAL SOLUTIONS OF A GENERALIZED NONLINEAR REACTION-DIFFUSION EQUATION

M. KABIR MAHAMAN and M. NORBERT HOUNKONNOU

International Chair in Mathematical Physics and Applications (ICMPA), 072 B.P. 50 Cotonou, Republic of Benin
E-mail: [email protected], [email protected]
We generalize a diffusion equation modeling a brain cancer treatment and investigate its Lie symmetry group. Analytical solutions are derived.
1. Introduction

In a recent work,1 the diffusion equation

$$u_t = \lambda u_{xx} - \frac{V}{K}\left(u - \frac{u^2}{K}\right) \tag{1}$$

has been used to study a brain cancer treatment. Unfortunately, this model does not take into account the viscous effect of the investigated medium. The present work aims at meeting this requirement. Thus, we generalize (1) to obtain

$$u_t = (\lambda - \varepsilon u)\,u_{xx} - \frac{V}{K}\left(u - \frac{u^2}{K}\right), \tag{2}$$

where $\varepsilon$ stands for the viscosity parameter of the brain medium. Furthermore, instead of the numerical method adopted in Ref. 1 to solve (1), we provide in this work analytical solutions generated using Lie group techniques.

2. Symmetry Groups and Exact Solutions

Equation (2) corresponds to the kernel equation of the map

$$\Delta : (x, t; u^{(2)}) \longmapsto -u_t + (\lambda - \varepsilon u)\,u_{xx} - \frac{V}{K}\left(u - \frac{u^2}{K}\right) \tag{3}$$

on a subset $M^{(2)}$ of the second jet space $X \times U^{(2)}$ of the manifold $X \times U$, in which the independent variables $(x,t) \in X$, the dependent variable $u \in U$ and $u^{(2)} = (u, u_x, u_t, u_{xx}, u_{xt}, u_{tt}) \in U^{(2)}$. $M^{(2)}$ is the corresponding second prolongation of the subspace $M \subset X \times U$. The Jacobian matrix of $\Delta$,

$$J_\Delta(x,t,u^{(2)}) = \left(0,\; 0,\; -\varepsilon u_{xx} - \frac{V}{K}\left(1 - \frac{2u}{K}\right),\; 0,\; -1,\; (\lambda - \varepsilon u),\; 0,\; 0\right), \tag{4}$$

does not vanish anywhere on $M^{(2)}$. Then $\Delta$ is of maximal rank. On $X \times U$ let the vector field be

$$Q = \xi(x,t,u)\frac{\partial}{\partial x} + \tau(x,t,u)\frac{\partial}{\partial t} + \phi(x,t,u)\frac{\partial}{\partial u}. \tag{5}$$
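The second prolongation of this vector field is the vector field on $X \times U^{(2)}$ given by

$$\mathrm{pr}^{(2)}Q = Q + \phi^{x}\frac{\partial}{\partial u_x} + \phi^{t}\frac{\partial}{\partial u_t} + \phi^{xx}\frac{\partial}{\partial u_{xx}} + \phi^{xt}\frac{\partial}{\partial u_{xt}} + \phi^{tt}\frac{\partial}{\partial u_{tt}}, \tag{6}$$

where the coefficient functions are

$$\begin{aligned}
\phi^{x} &= D_x(\phi - \xi u_x - \tau u_t) + \xi u_{xx} + \tau u_{xt} \\
&= \phi_x + (\phi_u - \xi_x)u_x - \tau_x u_t - \xi_u u_x^2 - \tau_u u_x u_t,
\end{aligned} \tag{7}$$

$$\begin{aligned}
\phi^{t} &= D_t(\phi - \xi u_x - \tau u_t) + \xi u_{xt} + \tau u_{tt} \\
&= \phi_t - \xi_t u_x + (\phi_u - \tau_t)u_t - \xi_u u_x u_t - \tau_u u_t^2,
\end{aligned} \tag{8}$$

$$\begin{aligned}
\phi^{xx} &= D_{xx}(\phi - \xi u_x - \tau u_t) + \xi u_{xxx} + \tau u_{xxt} \\
&= \phi_{xx} + (2\phi_{xu} - \xi_{xx})u_x - \tau_{xx}u_t + (\phi_{uu} - 2\xi_{xu})u_x^2 - 2\tau_{xu}u_x u_t - \xi_{uu}u_x^3 - \tau_{uu}u_x^2 u_t \\
&\quad + (\phi_u - 2\xi_x)u_{xx} - 2\tau_x u_{xt} - 3\xi_u u_x u_{xx} - \tau_u u_t u_{xx} - 2\tau_u u_x u_{xt},
\end{aligned} \tag{9}$$

$$\begin{aligned}
\phi^{tt} &= D_{tt}(\phi - \xi u_x - \tau u_t) + \xi u_{xtt} + \tau u_{ttt} \\
&= \phi_{tt} + (2\phi_{tu} - \tau_{tt})u_t - \xi_{tt}u_x + (\phi_{uu} - 2\tau_{tu})u_t^2 - 2\xi_{tu}u_x u_t - \tau_{uu}u_t^3 - \xi_{uu}u_x u_t^2 \\
&\quad + (\phi_u - 2\tau_t)u_{tt} - 2\xi_t u_{xt} - 3\tau_u u_t u_{tt} - \xi_u u_x u_{tt} - 2\xi_u u_t u_{xt},
\end{aligned} \tag{10}$$

$$\begin{aligned}
\phi^{xt} &= D_{xt}(\phi - \xi u_x - \tau u_t) + \xi u_{xxt} + \tau u_{xtt} \\
&= \phi_{xt} + (\phi_{tu} - \xi_{xt})u_x + (\phi_{xu} - \tau_{xt})u_t - \xi_{tu}u_x^2 + (\phi_{uu} - \xi_{xu} - \tau_{tu})u_x u_t - \tau_{xu}u_t^2 \\
&\quad - \xi_{uu}u_x^2 u_t - \tau_{uu}u_x u_t^2 - \xi_t u_{xx} - \xi_u u_t u_{xx} + (\phi_u - \xi_x - \tau_t)u_{xt} \\
&\quad - 2\xi_u u_x u_{xt} - 2\tau_u u_t u_{xt} - \tau_x u_{tt} - \tau_u u_x u_{tt}.
\end{aligned} \tag{11}$$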
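To check the Lie symmetries of (2), we impose the infinitesimal invariance condition (see Refs. 2 and 3),

$$\mathrm{pr}^{(2)}Q\,\Delta(x,t,u^{(2)}) = 0 \quad \text{whenever} \quad \Delta(x,t,u^{(2)}) = 0, \tag{12}$$

and check the corresponding conditions on $\xi$, $\tau$ and $\phi$. The above conditions provide

$$-\phi^{t} + (\lambda - \varepsilon u)\phi^{xx} - \varepsilon\phi\,u_{xx} - \frac{V}{K}\left(1 - \frac{2u}{K}\right)\phi = 0, \qquad u_t = (\lambda - \varepsilon u)u_{xx} - \frac{V}{K}\left(u - \frac{u^2}{K}\right). \tag{13}$$

Plugging into (13) the expressions of $\phi^{t}$ and $\phi^{xx}$ given by (8) and (9), respectively, equating to zero the coefficients of the various monomials in u and its derivatives, and then solving the corresponding system of partial differential equations, we get $\xi_x = \xi_u = \xi_t = 0$, $\tau_x = \tau_t = \tau_u = 0$ and $\phi = 0$. Hence $\tau$ and $\xi$ are arbitrary constants and

$$Q = a\frac{\partial}{\partial x} + b\frac{\partial}{\partial t}, \tag{14}$$

with a and b being arbitrary constants. This operator generates a group of translations with respect to x and t. Thus if $u(x,t)$ is a solution of (2), so is

$$u(x + \alpha,\, t + \beta), \tag{15}$$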
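with $\alpha$ and $\beta$ two arbitrary constants. Besides, (2) may also admit plane wave solutions $u(x,t) = w(x - ct)$, with c an arbitrary constant. In fact, setting $y = x - ct$ and $w = u$ reduces (2) to the ordinary differential equation (ODE)

$$(\lambda - \varepsilon w)\,w_{yy} + c\,w_y - \frac{V}{K}\left(w - \frac{w^2}{K}\right) = 0. \tag{16}$$

The Lie invariance algebra of (16) is generated by the vector field

$$C = \frac{d}{dy}, \tag{17}$$

so that $\mathrm{pr}^{(2)}C = d/dy$. Thus, setting $X = w$ and $\Psi = y$ leads to

$$w_y = \Psi_X^{-1} \quad \text{and} \quad w_{yy} = -\Psi_X^{-3}\,\Psi_{XX},$$

which, if inserted in (16), gives the following ODE,

$$\Psi_{XX} - \frac{c}{\lambda - \varepsilon X}\,\Psi_X^{2} + \frac{V}{K}\,\frac{X - X^2/K}{\lambda - \varepsilon X}\,\Psi_X^{3} = 0, \tag{18}$$

where the new dependent variable $\Psi = \Psi(X)$ does not appear explicitly. Then, setting $G = \Psi_X$ in (18), we get Abel's equation of the first kind,3,4

$$\frac{dG}{dX} = \frac{c}{\lambda - \varepsilon X}\,G^{2} - \frac{V}{K}\,\frac{X - X^2/K}{\lambda - \varepsilon X}\,G^{3}. \tag{19}$$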
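The change of variables $X = w$, $\Psi = y$ rests on two inverse-function identities, $w_y = 1/\Psi_X$ and $w_{yy} = -\Psi_{XX}/\Psi_X^3$. The following minimal sketch checks them numerically with central finite differences. The sample profile $w(y) = e^y$ (so that $\Psi(X) = \ln X$), the helper names and the step size are illustrative assumptions; the profile is not claimed to solve (16).

```python
import math

def w(y):
    # Sample monotone profile (assumption for illustration only).
    return math.exp(y)

def psi(X):
    # Inverse function of w: psi(w(y)) = y.
    return math.log(X)

def d1(f, s, h=1e-3):
    # Central finite-difference approximation of the first derivative.
    return (f(s + h) - f(s - h)) / (2.0 * h)

def d2(f, s, h=1e-3):
    # Central finite-difference approximation of the second derivative.
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / h**2

def identity_residuals(y):
    # Returns (w_y - 1/psi_X, w_yy + psi_XX/psi_X**3), both expected near zero.
    X = w(y)
    w_y, w_yy = d1(w, y), d2(w, y)
    p_X, p_XX = d1(psi, X), d2(psi, X)
    return w_y - 1.0 / p_X, w_yy + p_XX / p_X**3

if __name__ == "__main__":
    for y in (-0.5, 0.0, 0.5, 1.0):
        print(y, identity_residuals(y))
```

Both residuals stay at the level of the finite-difference error, which is what allows $w_y$ and $w_{yy}$ in (16) to be traded for derivatives of $\Psi$ with respect to $X$.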
Following F. Schwarz,4 let us introduce the changes of variables

$$X = k(z), \qquad G = g(z)\,v(z) - h(z), \tag{20}$$

k, g and h being arbitrary smooth functions, with $dk/dz \neq 0$ and $g \neq 0$. Setting $b_2(z) = c/(\lambda - \varepsilon k(z))$ and $b_3(z) = -(V/K)\,(k(z) - k(z)^2/K)\,(\lambda - \varepsilon k(z))^{-1}$, and making the choices

$$h = \frac{b_2}{3b_3} \quad \text{and} \quad g = \frac{9b_2 b_3' - 9b_2' b_3 - 2k' b_2^{3}}{27\,b_3^{2}}, \tag{21}$$

where $(\,)' = d(\,)/dz$, we obtain the rational normal form (RNF) of the Abel equation (19),

$$v'(z) + A(z)\,v^3 + B(z)\,v + 1 = 0, \tag{22}$$

where $A = -k' b_3 g^2$ and $B = -k' h(-2b_2 + 3b_3 h) + g'/g$. Equation (22) is a particular example of the equations of the form $v'(z) + r(z,v) = 0$, where $r(z,v) = P(z,v)/Q(z,v)$ is rational in its arguments. The general symmetry operator of such ODEs in the unknown function $v = v(z)$ has the form

$$V = M(z,v)\frac{\partial}{\partial z} + G(z,v)\frac{\partial}{\partial v}. \tag{23}$$

This symmetry is trivial if $M = -Q\,\psi(z,v)$ and $G = P\,\psi(z,v)$, with $\psi(z,v)$ an arbitrary function of z and v. Otherwise, it is nontrivial. To integrate (22) by quadrature, we need its nontrivial symmetries. The following statements provide some necessary and sufficient conditions for (22) to allow a nontrivial structure-preserving symmetry.
Theorem 2.1. The equation

$$v'(z) + A\,v^3(z) + B\,v(z) + 1 = 0 \tag{24}$$

with $A' - 3AB \neq 0$ allows a nontrivial symmetry group with the infinitesimal generator

$$U = \frac{A}{A' - 3AB}\left(\frac{\partial}{\partial z} - B v\frac{\partial}{\partial v}\right) - \frac{1}{3}\,v\frac{\partial}{\partial v} \tag{25}$$

if and only if its coefficients A and B satisfy

$$A = D\left(B - \frac{A'}{3A}\right)^{3} \quad \text{with} \quad D \neq 0. \tag{26}$$

The constant D is defined by this relation. The case $A' - 3AB = 0$ is called exceptional: it allows the symmetry generator

$$U = A^{-1/3}\left(\frac{\partial}{\partial z} - B v\frac{\partial}{\partial v}\right). \tag{27}$$

Proof. See Ref. 4. □

Theorem 2.2. If an Abel equation has a RNF of type (24) and allows a one-parameter symmetry group with generator (25), then in the canonical variables z and $v = v(z)$ of this generator it has the form

$$\frac{1}{v'} - Dz^3 - z - 1 = 0 \tag{28}$$

with D defined by (26). As a consequence, its general solution is obtained by integration in the form

$$v = \int \frac{dz}{Dz^3 + z + 1} + \text{constant}. \tag{29}$$

In the exceptional case $A' = 3AB$, the canonical form of the differential equation and the integral are, respectively,

$$\frac{1}{v'} + z^3 + 1 = 0 \tag{30}$$

and

$$v(z) = -\frac{1}{3}\ln\left(\frac{z+1}{\sqrt{z^2 - z + 1}}\right) - \frac{1}{\sqrt{3}}\arctan\frac{2z - 1}{\sqrt{3}} + \text{constant}. \tag{31}$$

Proof. See Ref. 4. □

Hence, as soon as we know the suitable function k(z) which satisfies either the condition (26) or the exceptional case, we can subsequently deduce the general plane wave solutions of (2). Because of the stiffness of these
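constraints, finding the suitable k is quite intricate. However, a particular class of plane wave solutions of (2) is constructed by using the ansatz

$$u(x,t) = f(t)\exp(ax) \tag{32}$$

with a a constant and f(t) an arbitrary smooth function. In fact, by plugging (32) into (2) and collecting the coefficients of $\exp(ax)$ and $\exp(2ax)$, we get the following determining equations,

$$f'(t) - \left(\lambda a^2 - \frac{V}{K}\right) f(t) = 0, \tag{33}$$

$$-\varepsilon a^2 + \frac{V}{K^2} = 0. \tag{34}$$

Solving this system, we have

$$f(t) = c\,\exp\left[\left(\lambda a^2 - \frac{V}{K}\right) t\right], \tag{35}$$

$$a = \pm\sqrt{\frac{V}{\varepsilon K^2}}, \tag{36}$$

with c an arbitrary constant. Hence, the following functions form a class of plane wave solutions of (2),

$$u_1(x,t) = c_1 \exp\left[\left(\frac{\lambda V}{\varepsilon K^2} - \frac{V}{K}\right) t + x\sqrt{\frac{V}{\varepsilon K^2}}\,\right], \tag{37}$$

$$u_2(x,t) = c_2 \exp\left[\left(\frac{\lambda V}{\varepsilon K^2} - \frac{V}{K}\right) t - x\sqrt{\frac{V}{\varepsilon K^2}}\,\right], \tag{38}$$

with $c_1$ and $c_2$ arbitrary constants. The invariance of (2) under the group of translations, $(x,t,u) \to (x + \alpha, t + \beta, u)$, allows one to derive the more general forms of the above solutions, respectively, as

$$u_1(x,t) = c_1 \exp\left[\left(\frac{\lambda V}{\varepsilon K^2} - \frac{V}{K}\right)(t + b) + (x + a)\sqrt{\frac{V}{\varepsilon K^2}}\,\right] \tag{39}$$

and

$$u_2(x,t) = c_2 \exp\left[\left(\frac{\lambda V}{\varepsilon K^2} - \frac{V}{K}\right)(t + s) - (x + r)\sqrt{\frac{V}{\varepsilon K^2}}\,\right], \tag{40}$$

where a, b, r and s are some arbitrary constants.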
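The exponential solutions above can be checked numerically. The sketch below assumes that Eq. (2) reads $u_t = (\lambda - \varepsilon u)u_{xx} - (V/K)(u - u^2/K)$ and evaluates the residual of that equation for $u = c\,\exp[(\lambda a^2 - V/K)t \pm ax]$ with $a = \sqrt{V/(\varepsilon K^2)}$, using central finite differences. The parameter values, the function names and the step size are illustrative assumptions; the paper quotes no numerical values.

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper).
LAM, EPS, V, K = 1.0, 0.5, 2.0, 3.0

A = math.sqrt(V / (EPS * K**2))   # exponent a fixed by the exp(2ax) terms
RATE = LAM * A**2 - V / K         # growth rate of f(t) fixed by the exp(ax) terms

def u(x, t, c=0.7, sign=1):
    # Candidate solution c * exp(RATE * t + sign * A * x).
    return c * math.exp(RATE * t + sign * A * x)

def residual(x, t, sign=1, h=1e-3):
    # u_t - [(LAM - EPS*u) u_xx - (V/K)(u - u^2/K)] by central differences.
    u0 = u(x, t, sign=sign)
    u_t = (u(x, t + h, sign=sign) - u(x, t - h, sign=sign)) / (2.0 * h)
    u_xx = (u(x + h, t, sign=sign) - 2.0 * u0 + u(x - h, t, sign=sign)) / h**2
    rhs = (LAM - EPS * u0) * u_xx - (V / K) * (u0 - u0**2 / K)
    return u_t - rhs

if __name__ == "__main__":
    for sign in (1, -1):
        for (x, t) in ((0.0, 0.0), (0.5, 0.3), (-1.0, 1.0)):
            print(sign, x, t, residual(x, t, sign=sign))
```

The residual vanishes up to discretization error for both signs of a. Replacing `A` by any other value leaves a term proportional to $u^2$ uncancelled, which is the role played by the algebraic condition on a.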
3. Concluding Remarks

In this contribution we have generalized a state equation arising from a model of brain cancer chemotherapy by taking into account the viscosity of the brain medium. The local (Lie) symmetry group of the resulting equation, as well as this group's invariants, are derived, leading to the reduction of the new PDE to an Abel equation and providing information on the corresponding class of solutions, namely the plane wave solutions. A canonical form of this Abel equation is established, together with conditions for its integration by quadrature, that is, for the existence of nontrivial symmetries. Information on the existence of plane wave solutions of the new PDE guided us in setting a suitable ansatz from which such solutions are recovered. Moreover, the scarcity of Lie symmetries of Eq. (2) may be circumvented by investigating its nonclassical symmetries,3,6,7 mainly through its heir equations (see Ref. 6). Hidden symmetries can thus be investigated, since they could lead to yet more general analytical solutions.

References

1. B. Mampassi, B. Saley and B. Some, African Diaspora Journal of Mathematics 1, 1 (2004).
2. P. J. Olver, Applications of Lie Groups to Differential Equations (Springer-Verlag, New York, 1993).
3. R. O. Popovych, On reduction and Q-conditional (nonclassical) symmetry, Proceedings of the Second International Conference, Symmetry in Nonlinear Mathematical Physics (Kyiv, July 7-13, 1997), eds. M. Shkil, A. Nikitin and V. Boyko (Kyiv Institute of Mathematics, 1997), Vol. 2, pp. 437-443, e-Print arXiv:math-ph/0207015.
4. F. Schwarz, Stud. App. Math. 100(3), 269-294 (1998).
5. G. M. Murphy, Ordinary Differential Equations and their Solutions (D. Van Nostrand, New York, 1960).
6. M. C. Nucci, J. Math. Anal. Appl. 279, 168-179 (2003).
7. M. C. Nucci and P. A. Clarkson, Phys. Lett. A 164, 49-56 (1992).
THE ALGEBRAIC STRUCTURE OF A GENERALIZED COUPLED DISPERSIONLESS SYSTEM

K. V. KUETCHE* and T. C. KOFANE†

Department of Physics, Faculty of Sciences, University of Yaounde I, P.O. Box 812 Yaounde, Republic of Cameroon
E-mail: * [email protected], † [email protected]

B. T. BOUETOU

Departement de Genie Informatique, Ecole Nationale Superieure Polytechnique, Universite de Yaounde I, P.O. Box 8390 Yaounde, Republique du Cameroun
E-mail: tbouetou@yahoo.fr

A physical model of the O(3)-invariant coupled integrable dispersionless equations that describe the dynamics of a focused system within the background of a plane gravitational field is studied. The investigation is carried out both numerically and analytically, and is pursued both in the Lie algebra and quasigroup theory contexts, in which the structure constants of the former are extended into the structure functions of the latter. The energy density and topological structures such as loop solitons are examined.

Keywords: Solitons; quasigroups; Lie algebras; General Relativity.
1. Introduction

Nonlinear equations play a central role in modern science. In particular, Ordinary Differential Equations (ODEs) and Partial Differential Equations (PDEs) of nonlinear type are very often encountered in the theoretical description of a broad variety of phenomena and processes. Examples are found in various disciplines such as classical mechanics, biology, chemistry and electronics, to name a few. In addition, the focal point of the study of any nonlinear PDE is the question of its integrability. There exist three approaches to this question, namely Lie algebra analysis, numerical studies and Painlevé analysis. In the third approach, a nonlinear PDE is taken to be integrable if and only if it possesses the Painlevé property.1-3
During the past several years, the study of nonlinear evolution equations has attracted many mathematicians and theoretical physicists because of its considerable applications in various branches of science.4-6 The study of nonlinear phenomena has been very interesting and challenging, both mathematically and physically. Considerable interest has been paid recently to dispersionless or quasiclassical limits of integrable equations and hierarchies.7-11 The study of dispersionless hierarchies is of great importance since they arise in the analysis of various problems in physics, mathematics and applied mathematics, such as the theory of conformal maps on the complex plane. Different methods have been used to study dispersionless equations and hierarchies.7-11 In particular, several (1+1)-dimensional equations and systems have been analyzed by the quasiclassical version of the Inverse Scattering Transform (IST), including the local Riemann-Hilbert problem approach. Dispersionless integrable hierarchies can be viewed as quasiclassical limits of the ordinary integrable systems.11,12 A typical example is the dispersionless Kadomtsev-Petviashvili (dKP) hierarchy, which has played an important role in theoretical and mathematical physics.13-15 The Lax formulation of the dKP hierarchy can be constructed by replacing the pseudodifferential Lax operator of KP with the corresponding Laurent series. An analogous construction can be made for the modified KP (mKP) hierarchy, which leads to the dmKP hierarchy. The singular point structure analysis leading to the Painlevé Property (P-property) for ordinary differential equations16 plays a very useful role in determining the integrability property of nonlinear dynamical systems.17,18 Weiss et al.19 reformulated and generalized the P-test for Partial Differential Equations (PDEs).17-21 When compared with the uncoupled systems, many coupled systems are not completely analyzed because of the complicated and tedious mathematical analysis involved in understanding the nature of their dynamics. However, the P-analysis of the coupled Nonlinear Schrödinger (NLS) equation, the higher-order coupled NLS, the nonlinear coupled Klein-Gordon equation,22-24 the inhomogeneous coupled NLS, the nonlinear coupled integrable dispersionless equations,25-28 and so on, has been investigated. In recent years, the integrable system of coupled integrable dispersionless equations has been studied by many authors.29-40 Some authors33 have presented and solved the above system by the IST technique. From this method, they solved the Gel'fand-Levitan equations. Kotlyarov41 proved that these integrable models are gauge equivalent to the sine-Gordon and
Pohlmeyer-Lund-Regge models. Again, Konno and Kakuhata have investigated the IST method and obtained the soliton solutions for growing, decaying and stationary solitons. Investigations of the solitary waves and their integrability properties have been considered. In another paper, the same authors solved the system by the IST method and discussed one- and two-soliton solutions. Even though the coupled integrable dispersionless equations are known to be completely integrable, their P-property has been established. The remarkable feature of the P-analysis, particularly for soliton solutions, is that a natural connection exists between the Lax pair, BT, Hirota bilinear form and Miura transformation, which can be constructed through the expansion of the solitons about the singularity manifold. The IST scheme for soliton equations is a powerful tool for obtaining N-soliton solutions and an infinite number of conserved quantities. The most famous one is the ZS-AKNS scheme.42,43 Many inverse scattering schemes, such as the ZS-AKNS one and its variations, have a 2 × 2 matrix form. There are, however, fewer generalizations of them to 3 × 3 or higher-dimensional matrix forms. It is interesting to look for soliton equations within the general n × n inverse scattering scheme. Recently, some authors44 proposed a generalized coupled dispersionless system given by

$$\partial^2_{tx} S + [\partial_x S, [S, G]] = 0, \tag{1}$$

where the matrix $S = S(t,x)$ and the constant matrix G are elements of an arbitrary non-abelian Lie algebra. This equation belongs to the n × n ZS-AKNS-type IST scheme, and the nonlinearity comes from the non-abelian character. Equation (1) is a generalization of the coupled integrable dispersionless equations

$$\partial^2_{tx} q + \tfrac{1}{2}\,\partial_x(rs) = 0, \qquad \partial^2_{tx} r - r\,\partial_x q = 0, \qquad \partial^2_{tx} s - s\,\partial_x q = 0, \tag{2}$$

based on a group-theoretical point of view. For SU(1,1) ~ O(2,1) ~ SL(2,R), Eq. (1) reproduces Eq. (2). For SU(2) ~ O(3), we can obtain (the star refers to conjugation)

$$\partial^2_{tx} q + \tfrac{1}{2}\,\partial_x(r r^*) = 0, \qquad \partial^2_{tx} r - r\,\partial_x q = 0, \qquad \partial^2_{tx} r^* - r^*\,\partial_x q = 0, \tag{3}$$

which can be equivalent to the Pohlmeyer-Lund-Regge system. Equations (2) and (3) have been solved within the IST scheme under the appropriate boundary conditions and shown to be integrable. They possess the important conserved quantities

$$(\partial_x q)^2 + \partial_x r\,\partial_x s = q_0^2 \tag{4}$$

and

$$(\partial_x q)^2 + \partial_x r\,\partial_x r^* = q_0^2, \tag{5}$$
which are obtained through the IST scheme. Here $q_0 = \partial_x q(\pm\infty)$ is constant. Furthermore, some connection of issues arising in theoretical physics with nonassociative algebras and differential geometry has been uncovered. This connection has helped in solving many problems in physics. Indeed, during the last thirty years quite remarkable relations between nonassociative algebra and differential geometry have been discovered. Exotic algebraic structures such as quasigroups and loops were obtained from purely geometric structures such as affinely connected spaces. The notion of odule was introduced as a fundamental algebraic invariant of differential geometry. For any space with an affine connection, loopuscular, odular and geoodular structures (partial smooth algebras of a special kind) were introduced and studied. There are now three main approaches in theoretical physics exploring the notion of a nonassociative system:

• the octonionic approach;
• the Lie-admissible approach;
• the quasigroup approach.

We will focus only on the last. It is essentially based on new nonassociative algebraic methods in differential geometry, where the local properties of some global continuous structures such as quasigroups, loops, etc., have been studied. Recent developments have demonstrated that various nonassociative systems, such as quasigroups, loops, odules, etc., also play important roles in geometry and in physical applications. In physics, the main motivation for the quasigroup approach is provided by modern gauge theories, quantum gravity and some attempts at extending or generalizing the classical methods of symmetry and invariance. Here we shall make some brief comments about these directions. Nowadays, gauge theories based on continuous (Lie) groups have become an essential part of modern theoretical physics, providing the unified treatment of the fundamental forces of nature through the localization (gauging) of the global group symmetry. Recently, this approach has been generalized to quasigroups by some authors.44-47
Recently, a purely algebraic formulation of differential geometry, namely nonlinear geometric algebra, has been elaborated by some authors.48,49 In this approach, nonassociativity appears as an algebraic equivalent of the geometrical notion of curvature. This geometric algebra provides a new algebraic approach to the theory of gravity, where spacetime is considered as an algebraic system with a geodesic multiplication of points. The curvature of spacetime is then expressed by the nonassociativity of this multiplication. There is hope for arranging another attack upon the most pressing problem of gravity: the quantization of gravity. Our aim in this paper is to reconsider some differential system of equations modeling the behavior of a charged particle within a magnetic field, and to develop some computations both analytically and numerically from a curved-space perspective. Our work is organized as follows. In Sec. 2, we briefly review the smooth loop theory. Next, a generalization of the system under consideration is given in Sec. 3. Then in Sec. 4 some applications are considered, dealing with the background of a weak plane gravitational wave. Finally, we end our work with some concluding remarks.

2. Smooth Loops

2.1. Basic smooth structures

2.1.1. Smooth local loops (loopusculas)

2.1.1.1. Definition. Let $\varphi : M \times \ldots \times M \to M$ be a partial m-ary operation on a $C^k$-smooth manifold M such that $\varphi(a_1, \ldots, a_m) = b$ (i.e., $\varphi$ is defined on $a_1, \ldots, a_m$).48 Then there exist open sub-manifolds $U_1, \ldots, U_m$ containing $a_1, \ldots, a_m$, respectively, $\varphi$ being defined on $U_1 \times \ldots \times U_m$ and the restriction $\varphi|_{U_1 \times \ldots \times U_m} : U_1 \times \ldots \times U_m \to M$ being a $C^r$-smooth mapping ($r \le k$). Then $\varphi$ is said to be a $C^r$-smooth partial m-ary operation on a $C^k$-smooth manifold. If $\varphi$ is defined everywhere on M then we say that $\varphi$ is a $C^r$-smooth global m-ary operation. A $C^k$-smooth manifold M equipped with a family of $C^r$-smooth partial (global) operations ($r \le k$) and a family of constants (fixed elements) is called a $C^{r,k}$-smooth partial (global) algebra ($C^r$-smooth partial algebra if $r = k$).

2.1.1.2. Definition. Let $\langle M, \cdot, e \rangle$ be a partial magma (groupoid) with a binary operation $(x,y) \mapsto x \cdot y$ and neutral element e, M being a $C^k$-smooth manifold and the operation of multiplication (at least $C^1$-smooth) being defined in some neighborhood $U \ni e$.48 As is known, the above operation is locally left and right invertible, i.e., if $x \cdot y = L_x y = R_y x$ then in some neighborhood of the neutral element e the operations $L_x^{-1}$ and $R_y^{-1}$ exist. This allows us to introduce left and right division,

$$a \backslash x = L_a^{-1} x, \qquad x / b = R_b^{-1} x, \tag{6}$$

with the properties

$$a \cdot (a \backslash x) = x, \quad a \backslash (a \cdot x) = x, \quad (x / b) \cdot b = x, \quad (x \cdot b) / b = x. \tag{7}$$
Thus we have indeed a partial loop on M.

2.1.2. Canonical odules and odular structures

2.1.2.1. Definition. Let $\langle Q, \cdot, \backslash, e \rangle$ be a partial left loop with two-sided neutral element e defined on a $C^k$-smooth manifold Q (dim Q = n).48 We say that $A_1, \ldots, A_n$ are the left basic fundamental vector fields of Q if

$$[A_\alpha(x)]^\beta = A^\beta_\alpha(x) = \left[\frac{\partial (x \cdot y)^\beta}{\partial y^\alpha}\right]_{y=e}. \tag{8}$$

In such a case, any $A = \zeta^\alpha A_\alpha$ ($\zeta^1, \ldots, \zeta^n \in \mathbb{R}$) is called a left fundamental vector field.
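The left and right divisions of Eqs. (6) and (7) are purely algebraic and can be illustrated on a finite loop, where $L_a$ and $R_b$ are simply the rows and columns of the multiplication table. The sketch below is a toy example under stated assumptions: the 5-element table, the names `mul`, `left_div` and `right_div`, and the choice of 0 as neutral element are all hypothetical and are not taken from the text, which deals with smooth local loops.

```python
# Cayley table of a 5-element loop with neutral element E = 0.
# The table is an illustrative assumption (a normalized Latin square).
TABLE = [
    [0, 1, 2, 3, 4],
    [1, 0, 3, 4, 2],
    [2, 4, 0, 1, 3],
    [3, 2, 4, 0, 1],
    [4, 3, 1, 2, 0],
]
N = len(TABLE)
E = 0

def mul(a, b):
    # Loop product a . b
    return TABLE[a][b]

def left_div(a, x):
    # a \ x = L_a^{-1} x: the unique y with a . y = x
    return TABLE[a].index(x)

def right_div(x, b):
    # x / b = R_b^{-1} x: the unique y with y . b = x
    return next(y for y in range(N) if TABLE[y][b] == x)

def division_identities_hold():
    # Checks the four identities of Eq. (7) for every pair of elements.
    for a in range(N):
        for x in range(N):
            if mul(a, left_div(a, x)) != x or left_div(a, mul(a, x)) != x:
                return False
            if mul(right_div(x, a), a) != x or right_div(mul(x, a), a) != x:
                return False
    return True

def is_associative():
    return all(mul(mul(a, b), c) == mul(a, mul(b, c))
               for a in range(N) for b in range(N) for c in range(N))

if __name__ == "__main__":
    print("division identities:", division_identities_hold())
    print("associative:", is_associative())
```

The identities of Eq. (7) hold although the product is not associative, which is the point of working with loops rather than groups.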
2.1.2.2. Definition. Let Q be a $C^k$-manifold. A partial left loop $\langle Q, \cdot, \backslash, e \rangle$ is called left (p,k)-canonical ($p \ge 1$) if $L_x : y \mapsto x \cdot y$ is $C^1$-smooth near e and its left fundamental vector fields are $C^p$-smooth near e. In the case p = 1, we say left canonical instead of left (p,k)-canonical. Analogously,48 one can define a right (p,k)-canonical loop, replacing the left basic fundamental vector fields by the right ones, $a_1, \ldots, a_n$, such that

$$[a_\alpha(y)]^\beta = a^\beta_\alpha(y) = \left[\frac{\partial (x \cdot y)^\beta}{\partial x^\alpha}\right]_{x=e}. \tag{9}$$

2.2. Infinitesimal theory of smooth loops: the general theory48
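Determination of $\varphi$ and $l$. Let $\langle Q, \cdot, e \rangle$ be a smooth partial loop with the neutral element e. Let us introduce the following notations,

$$L_a b = a \cdot b, \quad l(a,b) = L^{-1}_{a \cdot b} \circ L_a \circ L_b, \quad l^\alpha_\beta(a,b) = [l(a,b)_{*,e}]^\alpha_\beta, \quad A^\alpha_\beta(a) = [(L_a)_{*,e}]^\alpha_\beta, \quad B^\lambda_\mu(a) = [(L_a)^{-1}_{*,e}]^\lambda_\mu, \tag{10}$$

where $\alpha, \beta, \lambda, \mu = 1, \ldots, n = \dim M$. Differentiating with respect to c at c = e the relation $(a \cdot b) \cdot l(a,b)c = a \cdot (b \cdot c)$, we have

$$\frac{\partial (a \cdot b)^\alpha}{\partial b^\beta} = A^\alpha_\sigma(a \cdot b)\, l^\sigma_\tau(a,b)\, B^\tau_\beta(b). \tag{11}$$

Let us introduce the vector fields $A_\gamma$ and covector fields $B^\nu$ by the expressions

$$(A_\gamma(a))^\lambda = A^\lambda_\gamma(a), \qquad (B^\nu(a))_\beta = B^\nu_\beta(a), \qquad [A_\alpha, A_\beta](a) = C^\gamma_{\alpha\beta}(a)\, A_\gamma(a), \tag{12}$$

$$C^\gamma_{\alpha\beta} = -C^\gamma_{\beta\alpha}, \tag{13}$$

where $C^\gamma_{\alpha\beta}$ are defined as the point structure functions. Following (13), we get the relation

$$A^\gamma_\alpha(a)\,\partial_\gamma A^\lambda_\beta(a) - A^\gamma_\beta(a)\,\partial_\gamma A^\lambda_\alpha(a) = C^\gamma_{\alpha\beta}(a)\, A^\lambda_\gamma(a), \tag{14}$$

which is equivalent to

$$B^\tau_\lambda(a)\left[A^\gamma_\alpha(a)\,\partial_\gamma A^\lambda_\beta(a) - A^\gamma_\beta(a)\,\partial_\gamma A^\lambda_\alpha(a)\right] = C^\tau_{\alpha\beta}(a), \tag{15}$$

or more briefly,

$$B^\rho([A_\alpha, A_\beta]) = C^\rho_{\alpha\beta}. \tag{16}$$

Using Jacobi's identity

$$[A_\alpha, [A_\beta, A_\gamma]] + [A_\gamma, [A_\alpha, A_\beta]] + [A_\beta, [A_\gamma, A_\alpha]] = 0, \tag{17}$$

we have

$$\sum_{\langle \alpha, \beta, \gamma \rangle}\left[A^\mu_\alpha(a)\,\partial_\mu C^\nu_{\beta\gamma}(a) + C^\mu_{\beta\gamma}(a)\, C^\nu_{\alpha\mu}(a)\right] = 0. \tag{18}$$

Here $\langle \alpha, \beta, \gamma \rangle$ stands for the cyclic sum over $\alpha, \beta, \gamma$. Assuming that

$$\frac{\partial^2 (a \cdot b)^\alpha}{\partial b^\mu\,\partial b^\nu} = \frac{\partial^2 (a \cdot b)^\alpha}{\partial b^\nu\,\partial b^\mu},$$

we obtain

$$A^\sigma_\mu(b)\frac{\partial l^\tau_\nu(a,b)}{\partial b^\sigma} - A^\sigma_\nu(b)\frac{\partial l^\tau_\mu(a,b)}{\partial b^\sigma} = C^\sigma_{\mu\nu}(b)\, l^\tau_\sigma(a,b) - C^\tau_{\lambda\sigma}(a \cdot b)\, l^\lambda_\mu(a,b)\, l^\sigma_\nu(a,b). \tag{19}$$

In particular, at b = e, we have

$$C^\tau_{\mu\nu}(a) = C^\tau_{\mu\nu}(e) + \left[\frac{\partial l^\tau_\mu(a,b)}{\partial b^\nu} - \frac{\partial l^\tau_\nu(a,b)}{\partial b^\mu}\right]_{b=e}. \tag{20}$$

This result is promising since it gives an assessment of the value of the structure function on a given manifold.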
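The map $l(a,b) = L^{-1}_{a \cdot b} \circ L_a \circ L_b$ measures the failure of associativity: it is defined so that $(a \cdot b) \cdot l(a,b)c = a \cdot (b \cdot c)$, and it reduces to the identity map whenever the product is associative. As a minimal sketch, the same relation can be checked on a finite loop. The 5-element table and the helper names below are illustrative assumptions and do not come from the text.

```python
# Cayley table of a 5-element nonassociative loop, neutral element E = 0.
# The table is an illustrative assumption (a normalized Latin square).
TABLE = [
    [0, 1, 2, 3, 4],
    [1, 0, 3, 4, 2],
    [2, 4, 0, 1, 3],
    [3, 2, 4, 0, 1],
    [4, 3, 1, 2, 0],
]
N = len(TABLE)
E = 0
IDENTITY = tuple(range(N))

def mul(a, b):
    # Loop product a . b
    return TABLE[a][b]

def left_div(a, x):
    # L_a^{-1} x: the unique y with a . y = x
    return TABLE[a].index(x)

def associant(a, b):
    # The map c -> l(a,b)c = L_{a.b}^{-1}(a . (b . c)), returned as a tuple.
    ab = mul(a, b)
    return tuple(left_div(ab, mul(a, mul(b, c))) for c in range(N))

def defining_relation_holds():
    # (a . b) . l(a,b)c == a . (b . c) for all a, b, c
    return all(mul(mul(a, b), associant(a, b)[c]) == mul(a, mul(b, c))
               for a in range(N) for b in range(N) for c in range(N))

if __name__ == "__main__":
    print("defining relation:", defining_relation_holds())
    print("l(1,1) =", associant(1, 1))
    print("l(1,E) =", associant(1, E))
```

In this toy example $l(a,e)$ is the identity map for every a, the discrete counterpart of the fact $l^\tau_\sigma(a,e) = \delta^\tau_\sigma$ used in passing from (19) to (20), while $l(1,1)$ is a nontrivial permutation.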
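2.2.1.1. Proposition. Let $\langle Q, \cdot, e \rangle$ be a smooth partial loop. Then $\varphi^\alpha = (a \cdot b)^\alpha$ and $l^\nu_\tau = l^\nu_\tau(a,b)$ are the solutions of the following system of differential equations,48

$$\begin{cases}
\dfrac{\partial \varphi^\alpha}{\partial b^\beta} = A^\alpha_\sigma(\varphi)\, l^\sigma_\tau\, B^\tau_\beta(b), \\[2mm]
A^\sigma_\mu(b)\dfrac{\partial l^\tau_\nu}{\partial b^\sigma} - A^\sigma_\nu(b)\dfrac{\partial l^\tau_\mu}{\partial b^\sigma} = C^\sigma_{\mu\nu}(b)\, l^\tau_\sigma - C^\tau_{\lambda\sigma}(\varphi)\, l^\lambda_\mu\, l^\sigma_\nu, \\[2mm]
\varphi^\alpha|_{b=e} = a^\alpha, \qquad l^\nu_\tau|_{b=e} = \delta^\nu_\tau.
\end{cases} \tag{21}$$

The functions $A^\alpha_\beta(b)$ are supposed to be given and satisfy the conditions $A^\alpha_\beta(e) = \delta^\alpha_\beta$.

Remark. There exists some correspondence between the structure functions and the curvature tensor of a manifold.48 Let us introduce the following infinitesimal right and left translation matrices, respectively,

$$(x \cdot y)^\mu = x^\mu + R^\mu_\nu(x)\, y^\nu + \cdots, \qquad (x \cdot y)^\mu = y^\mu + L^\mu_\nu(y)\, x^\nu + \cdots, \tag{22}$$

with $L^\mu_\nu(y) = \left.\dfrac{\partial (x \cdot y)^\mu}{\partial x^\nu}\right|_{x=e}$ and $R^\mu_\nu(x) = \left.\dfrac{\partial (x \cdot y)^\mu}{\partial y^\nu}\right|_{y=e}$. The matrices $R^\mu_\nu(x)$ and $L^\mu_\nu(y)$ can be used to introduce a local frame field,

$$R_\nu(x) = R^\mu_\nu(x)\,\partial_\mu, \qquad L_\nu(y) = L^\mu_\nu(y)\,\partial_\mu. \tag{23}$$

It is well known that the commutator of two vector fields is a vector field. Since $L_\nu(x)$ and $R_\nu(x)$ are frame fields, it is quite natural to define the structure functions $\lambda^\gamma_{\mu\nu}(x)$ and $C^\gamma_{\mu\nu}(x)$ by

$$[L_\mu(x), L_\nu(x)] = -\lambda^\gamma_{\mu\nu}(x)\, L_\gamma(x), \tag{24}$$

$$[R_\mu(x), R_\nu(x)] = -C^\gamma_{\mu\nu}(x)\, R_\gamma(x). \tag{25}$$

In general, the structure functions $\lambda^\gamma_{\mu\nu}(x)$ and $C^\gamma_{\mu\nu}(x)$ do not coincide. Those functions possess the expansions

$$\lambda^\gamma_{\mu\nu}(x) = -R^\gamma{}_{[\mu\nu]\delta}(e)\, x^\delta + \cdots, \tag{26}$$

$$C^\gamma_{\mu\nu}(x) = -R^\gamma{}_{\delta[\mu\nu]}(e)\, x^\delta + \cdots. \tag{27}$$

The commutator of two frame fields can also be calculated,

$$[L_\mu(x), R_\nu(x)] = -\tfrac{1}{2}\, R^\gamma{}_{\mu\delta\nu}(e)\, x^\delta\, \partial_\gamma + \cdots. \tag{28}$$

Here $R^\gamma{}_{\mu\delta\nu}(e)$ denotes components of the curvature tensor at the unit element e.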
3. The Extended Generalized Coupled Dispersionless System

Let G be a simple extended compact Lie group, dim G = n, and $\mathcal{G}$ be its extended Lie algebra. The generators $T^a(x)$ of $\mathcal{G}$ (where $x = (x^\mu)$, $\mu = 0, 1, \ldots$) satisfy the commutation relations with the structure functions49 $C^{ab}{}_c(x)$ such that

$$[T^a(x), T^b(x)] = C^{ab}{}_c(x)\, T^c(x), \tag{29}$$

and the Cartan metric $\eta^{ab}$ is defined by

$$\eta^{ab} = \mathrm{Tr}(T^a T^b). \tag{30}$$

Without any loss of generality, we can take $\eta^{ab}$ as diagonal and $C^{ab}{}_c(x)$ as totally antisymmetric. Given the $T^a$'s we define S by

$$S = \phi_a(x)\, T^a(x) = \eta_{ab}\,\phi^a(x)\, T^b(x), \tag{31}$$
a
1
where 4> =
2
(32) These quantities are ro-
{M^n-'Gn,
[66)
where ft £ G. Let us write the action of generalized coupled dispersionless system 50 as 1=
[dtdxC(S,d„S,d0S),
(34)
where £ is the Lagrangian density defined by C = e^Tr^SdoS
- l-M[S, [3 M S, S]}).
(35)
Note that the e-character represents the Levi-Civita symbol. The lagrangian density is manifestly invariant under the global gauge transformation (33). By using the Euler-Lagrange equation, we obtain e»0{d20liS-[[S,M],dllS})=0.
(36)
From (36), we show that Tr(d^S)n is conserved for integer n (n > 2) and is also invariant under gauge transformations. Equation (36) is known as the
271 generalized coupled dispersionless system. 50 Positing <9MTa = B°CTC and d„Ta = D«CTC we get [S,M] =
(37)
hence,
+ {d^d)^kbC&\Cdca
(38)
+ MehC^B^C^}
= 0.
We consider the case of a single j-coordinate and set •Cabc =
ifabcq(xj),
Db0a=eb0av{xi,t), ^T = xj -£x°,a = xj
(39) +£x°,
where u,v,q are arbitrary functions and £ is a constant. By a suitable redefinition of x° leading to £ = 1, it follows that d2acj> - d2T
y,
02 = f, 03 = a + 2{dT -da)\nF,
(41)
one derives the following bilinear equations, (D2T - D2a + 1)F • W = 0, (D2. -Dl + 1)F-H = 0, (DT - Da)2F • F - \{W2 + H2) = 0,
(42)
where Ds denotes Hirota's derivatives. This system is equivalent to ( WFTT + FWTT - WFa
1+E
fx(^xi) (44)
= ( A + ifr)
ts + A
fs + T^
- A
[* x (* x * ) ] ,
£ being a phase velocity. Now, according to the usual procedure, the Hamiltonian H is given by
I
H
dxTi,
(45)
with H being the hamiltonian density defined by H = d0cf> • * - C = Tr(^M[S,
(46)
[d^S, S]}),
IT being the canonical conjugate momentum to
(47)
It is noteworthy to emphasize here that the symbol (•) denotes the inner product defined by
Let us consider (f> = (^1,^2,^3) and k — (0,0,1). Therefore $ = (^3 2,(i>l-
= {
{^-^-t+'-t^-^-t+'-t^-^-t^'-t)-
an&y
=
Furthermore, by
setting u
TP(n h\ — I
*{a,b) - {j^
j . v—2uv A
\ da
q2u
, _uv_ db
/nu
+ 3 ^ - J dl + T+zdl - jt^(ao
/-<(„ y.\ uv — q a da • uv — q b db • b — a du U(U,U) — 1- 1 + £ T 1_£ 1 + s dg ds dg, TJ In h ,.\ — uv+q2a dc 1 c—b du
H2(a,b,c) = ^ £ I(a, b,c) =
( T ^
+ tE%%,
+ ^ r )
fs ~ ^(bc
- ac),
„2\
- a ), (48)
we write, ( *j& = F ( 0 i , 0 a ) + i?i(01,02,0s), ^ • = F(
(49)
Also, using the inner and exterior products defined above, we obtain
% = -£• {|0|2 [a - s ) ^ + «(02 - 0i)l -0 3 (i - s ) ^ } (50) 2
-2uu(|0| + 4>\ - 0102 - 0103 - 0203), which is the Hamiltonian density of the system. Note here that we have
$|\phi|^2 = \phi_1^2 + \phi_2^2 + \phi_3^2$.
4. An Illustration: In the Background of a Weak Plane Gravitational Wave
Now we give some illustrative values of u, v, q derived from the quantum kinematics survey of a test particle in the background of a weak plane gravitational wave. The metric tensor of the space-time of a weak plane gravitational wave can be given as a perturbation around the Minkowski metric of the form (Ref. 51)
$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad \eta_{\mu\nu} = \mathrm{diag}(-1,+1,+1,+1). \qquad (51)$$
In the case of a polarized weak plane gravitational wave moving in the direction of $x^j$, the only non-zero components of $h_{\mu\nu}$ in the TT-gauge are, for example,
$$h_{22} = -h_{33} = A\cos\omega(x^0 - x^1), \qquad (52)$$
for j = 1. Here A =constant with A
(53)
where δ, η, ν are some arbitrary constants. Some curves may then be derived, as displayed in the figures hereafter. We apply a numerical method to plot these curves. As was done previously in Ref. 50 for the above loop soliton, we consider the same boundary conditions. We discuss two major situations. The first refers to the absence of perturbation, namely ω = 0. In this case, we produce a set of plots which will be compared to the ones obtained analytically from Hirota's method. The second case deals with the influence of the perturbation, namely ω ≠ 0. Here we compare the curves we derived with the previous ones for ω = 0. This may help us assess the genuine role of the angular frequency.
Considering Figs. 1, 3, 5 and 7, which present plots of φ1 = X versus φ3 = Z and of the energy density versus Z, we see one, two and three loop-shaped self-confined structures. These are obtained for angular frequency ω = 0. For the one loop, we chose a phase velocity of v = 0.6; v = 0.8 for the two-loop; v = 0.0 and v = 0.2 for the two other cases. These figures establish a degree of reliability of the two methods used so far. The two and three loop cases may be interpreted as the interaction between two and three of the single one-loop structures identified previously. With the plot of the energy density, we can assess the relative stability of the two and three interacting one-loop solitons. It is worth pointing out that for other suitable choices of the phase velocities, one may get N-loop structures of interactions. Furthermore, we have found that we only get loop structures for absolute velocities less than unity. This agrees well with the above analytical method from the perspective of Hirota's scheme.
Now, we consider the second case, for which we choose ω = 100. We then get Figs. 2, 4, 6 and 8. Comparing these plots to the previous ones for the case ω = 0, there are some important changes as far as shapes are concerned. In fact, for ω = 100, we have chosen a perturbation amplitude that is really small compared to unity. In Fig. 2 one may see how the effect of small perturbations has shifted the one loop soliton to the left-hand side, leading in this way to a small stretching of the loop. In Fig. 4 these perturbations have created some distortion of the two loop case, creating two more single loops travelling along the main one.
Contrary to the previous case, and as displayed in Fig. 6, the effect of small perturbations on the three loop soliton is to shift the two single one-loops in opposite directions so that they would collide with the static one at the center. As shown in Fig. 2, there has been some creation of two more one loop solitons travelling along with the two main ones. Then, for this velocity of v = 0.2, the effect of small perturbations is really important, as is the case for the previous two loop situation. It should be noticed here that the effects of such small perturbations are clearly visible on the energy density plot, which for instance tells us whether some new loop-like soliton has been created.
5. Summary
We have given an extension of the Lagrangian of Ref. 50 leading to coupled dispersionless systems. If we assign the field φ_a and the constant k_a to the position vector r = (X, Y, Z) of the string and the constant external electric field J respectively, our system behaves like a charged particle moving in an external magnetic field. In contrast to the KdV-type equations, in which dispersion effects balance the nonlinearity, our system shows that the nonlinear external force of the dispersionless equations balances the linear elasticity of the string. With the numerical approach carried out in this work, we have solved the above nonlinear coupled system within a spacetime universe from a flat perspective. The introduction of a perturbation term has rendered this system not easily tractable using Hirota's method. However, the numerical approach has confirmed the above result derived from the analytical method. As has been seen, small perturbations seem to shift the structure under consideration in such a manner as to alter its shape, leading to some new structures. This simply means that some gravitational sources such as spinning double stars or supernova explosions may play an essential role in dynamic system modeling, since the topological space of interest is locally modified. It is worth noting that we could have considered the case of strong fields, but the actual problem depends on the physical choice of the structure functions characterizing this field. This constitutes an open topic of research. Finally, we have also been interested in energy density computations. This has helped us assess the relative stabilities of some one loop structures. However, further study along this line should be made so that the relative stabilities of loop-like structures can be assessed.
References
1. W.-H. Steeb and N. Euler, Nonlinear Evolution Equations and Painlevé Test (World Scientific, Singapore, 1988).
2. R. S. Ward, Nonlinearity 1, 671 (1988).
3. J. Weiss, J. Phys. 27, 1293 (1986).
4. A. I. Maimistov and E. A. Manykin, Sov. Phys. JETP 58, 685 (1983).
5. R. Hirota and J. Satsuma, Phys. Lett. A 85, 407 (1981).
6. Y. Hase and J. Satsuma, J. Phys. Soc. Jpn. 57, 679 (1989).
7. Y. Kodama, Phys. Lett. A 129, 223 (1988).
8. Y. Kodama, Phys. Lett. A 147, 477 (1990).
9. K. Takasaki and T. Takebe, Int. J. Mod. Phys. A 7, 889 (1992).
10. I. Krichever, Comm. Pure App. Math. 47, 437 (1994).
11. K. Takasaki and T. Takebe, Rev. Math. Phys. 7, 743 (1995).
12. R. Carroll and Y. Kodama, J. Phys. A 28, 6373 (1995).
13. D. Lebedev and Yu. Manin, Phys. Lett. A 74, 154 (1979).
14. V. E. Zakharov, Physica D 3, 193 (1981).
15. Y. Kodama and J. Gibbons, Proceedings of the 4th Workshop on Nonlinear and Turbulent Processes in Physics (World Scientific, Singapore, 1990), pp. 166.
16. M. Ablowitz, J. Raman and Segur, J. Math. Phys. 21, 775 (1980).
17. A. Ramani, B. Dorizzi and B. Grammaticos, Phys. Rev. Lett. 49, 1539 (1982).
18. M. Lakshmanan and R. Sahadevan, Phys. Rev. A 31, 861 (1985).
19. J. Weiss, M. Tabor and G. Carnavale, J. Math. Phys. 24, 522 (1983).
20. J. Weiss, J. Math. Phys. 15, 13 (1994).
21. J. Weiss, J. Math. Phys. 25, 2226 (1984).
22. W. H. Steeb, M. Kloke, B. M. Spiker and M. Lakshmanan, J. Phys. A 17, 825 (1984).
23. R. Sahadevan, K. M. Tamizhmani and M. Lakshmanan, J. Phys. A 19, 1983 (1986).
24. K. Porsezian and S. P. Shanmugha, Chaos, Solitons and Fractals 5, 119 (1995).
25. K. Porsezian, S. Shanmugha and A. Mahalingam, Phys. Rev. E 50, 1543 (1994).
26. K. Porsezian and T. Alagesan, Phys. Lett. A 198, 378 (1995).
27. P. A. Clarkson, Physica D 18, 209 (1986).
28. J. P. Corones, J. Math. Phys. 17, 756 (1976).
29. T. Alagesan, Y. Chung and K. Nakkeeran, Chaos, Solitons and Fractals 21, 63 (2004).
30. P. A. Clarkson and C. N. Cosgrove, J. Phys. A 20, 2003 (1987).
31. N. Joshi, Phys. Lett. A 125, 456 (1987).
32. T. Alagesan and K. Porsezian, Chaos, Solitons and Fractals 7, 1209 (1996).
33. K. Konno and H. Oono, J. Phys. Soc. Jpn. 63, 377 (1994).
34. H. Kakuhata and K. Konno, Theor. and Math. Phys. 133(3), 1675 (2002).
35. K. Konno and H. Kakuhata, Theor. and Math. Phys. 137(2), 1527 (2003).
36. K. Kimiaki and K. Hiroshi, J. Phys. Soc. Jpn. 65(3), 713 (1996).
37. R. Hirota, J. Math. Phys. 14, 805 (1973).
38. M. Wadati, H. Sanuki and K. Konno, Prog. Theor. Phys. 53, 419 (1975).
39. K. Konno and H. Kakuhata, J. Phys. Soc. Jpn. 64, 2707 (1995).
40. H. Kakuhata and K. Konno, J. Phys. Soc. Jpn. 65, 340 (1996).
41. V. P. Kotlyarov, J. Phys. Soc. Jpn. 63, 3535 (1994).
42. P. G. Drazin and R. S. Johnson, Solitons: An Introduction (Cambridge University Press, Cambridge (UK), 1989).
43. M. J. Ablowitz and H. Segur, Solitons and the Inverse Scattering Transform (SIAM, Philadelphia, 1985).
44. H. Kakuhata and K. Konno, J. Phys. A 301, 401 (1997).
45. I. A. Batalin, J. Math. Phys. 22, 1837 (1981).
46. I. A. Batalin and G. A. Vilkovisky, J. Math. Phys. 27, 172 (1985).
47. A. I. Nesterov, Quasigroups and Nonassociative Algebras in Physics, Transactions of the Institute of Physics of the Estonian Academy of Science 66, 107 (1990).
48. Lev V. Sabinin, Smooth Quasigroups and Loops, Mathematics and Its Applications (Kluwer Academic Publishers, Dordrecht, 1999).
49. Lev V. Sabinin, Acta Appl. Math. 50, 45 (1998).
50. K. Hiroshi and K. Kimiaki, J. Phys. Soc. Jpn. 68, 757 (1998).
51. C. W. Misner, K. S. Thorne and J. A. Wheeler, Gravitation (Freeman, San Francisco, 1973).
Fig. 1. One loop soliton and energy density in the unperturbed medium with v = 0.6. (Panel title: angular and phase velocities ω = 0 and v = 0.6.)
Fig. 2. One loop soliton and energy density in the perturbed medium with v = 0.6. (Panel title: angular and phase velocities ω = 100 and v = 0.6.)
Fig. 3. Two loop soliton and energy density in the unperturbed medium with v = 0.8. (Panel title: angular and phase velocities ω = 0 and v = 0.8.)
Fig. 4. Two loop soliton and energy density in the perturbed medium with v = 0.8. (Panel title: angular and phase velocities ω = 100 and v = 0.8.)
Fig. 5. Three loop soliton and energy density in the unperturbed medium with v = 0.0. (Panel title: angular and phase velocities ω = 0 and v = 0.0.)
Fig. 6. Three loop soliton and energy density in the perturbed medium with v = 0.0. (Panel title: angular and phase velocities ω = 100 and v = 0.0.)
Fig. 7. Three loop soliton and energy density in the unperturbed medium with v = 0.2. (Panel title: angular and phase velocities ω = 0 and v = 0.2.)
Fig. 8. Three loop soliton and energy density in the perturbed medium with v = 0.2. (Panel title: angular and phase velocities ω = 100 and v = 0.2.)
A DENSITY FUNCTIONAL THEORY STUDY OF THE ADSORPTION OF CH3 ON THE Pt(100) AND Ni(111) SURFACES
P. S. MOUSSOUNDA, B. M'PASSI-MABIALA
Groupe de Simulations Numeriques en Magnetisme et Catalyse (GSMC), Departement de Physique, Faculte des Sciences, Universite Marien Ngouabi, B.P. 69, Brazzaville, Republique du Congo
E-mail: [email protected].
M. F. HAROUN, P. LEGARE
Laboratoire des Materiaux, Surfaces et Procedes pour la Catalyse, 25, rue Becquerel, F-67037 Strasbourg Cedex 2, France
G. RAKOTOVELO, A. RAKOTOMAHEVITRA
Departement des Sciences Exactes, Faculte des Sciences, Universite de Mahajanga, B.P. 652, 401 Mahajanga, Madagascar
C. DEMANGEAT
Institut de Physique et Chimie des Materiaux de Strasbourg, 23, rue du Loess, F-67034 Strasbourg Cedex 2, France
The adsorption of CH3 on Pt(100) and Ni(111) surfaces at a coverage of 0.25 ML has been investigated using density functional theory with the generalised gradient approximation of Perdew-Wang for the exchange-correlation energy. The metallic surface was represented by a slab formed by four layers of Pt(100) or Ni(111). We optimized the adsorbed CH3 on all high-symmetry sites of the Pt(100) (on top, hollow, bridge) and Ni(111) (on top, bridge, hcp, fcc) surfaces. On the Pt(100) surface, CH3 is found to adsorb only on the top site, the hollow and bridge sites being unstable. CH3 is found to adsorb at all four high-symmetry sites on the Ni(111) surface, with similar adsorption energies for the fcc, hcp and bridge sites, whereas the on top site is much less stable.
1. Introduction
The interaction of CH3 with transition metal (TM) surfaces has been extensively studied because methyl is the heaviest fragment resulting from methane dissociation1 or from the methanation of CO and H2 on TM surfaces.2 Pt is an important industrial metal for many catalytic reactions. Up to now, the majority of the work has concentrated on the adsorption of small molecules on Pt(111) and Pt(110) because these surfaces are thermodynamically stable. Experimentally, the technique of reflection absorption infrared spectroscopy (RAIRS) has been applied to isolate both methyl and methylene fragments on Pt(111).3,4 A substantial number of theoretical studies of the adsorption of CH3 on Pt(111) exist in the literature. Minot et al.5 have studied methyl adsorption on Pt(111) using a cluster model and a band structure calculation within the framework of the extended Hückel theory. Cluster-type density functional theory (DFT) studies of CH3 adsorption on Pt(111) have been reported.6-8 Theoretical studies of CH3 adsorption on Pt(111) and Pt(110) have been carried out using the DFT slab approach.9-11
CH3 on nickel surfaces has been studied extensively. The experimental techniques used include static secondary ion mass spectroscopy (SIMS)2 and high resolution electron energy loss spectroscopy (HREELS).12 Several theoretical studies of CH3 adsorption on Ni(111) have been carried out using the extended Hückel method.13-15 Siegbahn et al.16 have performed cluster model calculations for CHx (x = 0-3) adsorbed on the Ni(100) and Ni(111) surfaces. Yang and Whitten17 used an embedded cluster approach with ab initio valence configuration interaction (CI) calculations to study the adsorption energy of CH3 on Ni(111). Burghgraef et al.18 studied the same problem with GGA-DFT calculations on clusters of 7 or 13 Ni atoms. Michaelides and Hu9 studied methyl adsorption on Ni(111) with DFT slab calculations.
After having studied the adsorption of CH4 and the extraction of the first H, leading to CH3 on Pt(100) and Ni(111),19,20 we calculate the adsorption energies of CH3 on those surfaces using the DFT-slab method. For the metastable Pt(100) surface, because of surface reconstruction, no calculations concerning CH3 adsorption have appeared in the literature. Nevertheless, we can point out a few calculations concerning the adsorption of O on Pt(100).21,22 Experimentally, adsorbates like CO, NO and ethane are known to remove this reconstruction.23-25 Hence, it is relevant to study CH3 adsorption on the unreconstructed Pt(100) surface.
2. The Computational Framework
The calculations presented here used the DACAPO code,26 which is based on density functional theory using a plane wave pseudopotential formalism. The Kohn-Sham one-electron equations are solved self-consistently. Computations were performed within the generalised gradient approximation (GGA) using the Perdew-Wang functional (PW91)27 and Vanderbilt ultrasoft pseudopotentials.28 For all results presented here, a slab formed by four Pt(100) or Ni(111) layers is used with a (2 × 2) surface unit mesh. This corresponds to a coverage of 0.25 monolayers (ML) when there is only one CH3 per unit cell. The unit cell is repeated in supercell geometry, with successive slabs separated by a vacuum region equivalent to seven and five interlayer spacings for Pt(100) and Ni(111), respectively. Adsorption is allowed on only one of the two exposed surfaces. The plane wave basis set is limited by a 400 eV energy cutoff. The surface irreducible Brillouin zone is sampled by 13 special k-points using a (5 × 5 × 1) Monkhorst-Pack grid.29 The electron density is determined by diagonalization of the Kohn-Sham Hamiltonian.30 A Fermi broadening corresponding to k_B T = 0.1 eV was employed to help convergence. All total energies are then extrapolated to zero electronic temperature. The top layer of the slab and the CH3 molecule are relaxed. The ionic degrees of freedom are relaxed using a conjugate gradient minimization scheme until the sum of all forces is less than 0.05 eV/Å. In all calculations for the Ni(111) surface, we restrict ourselves to non-spin-polarised electrons because the difference in adsorption energy between the spin-polarised and non-polarised calculations was found to be ~0.03 eV, which is rather small compared to the adsorption energy.
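The k-point count quoted above can be reproduced with a short sketch. It assumes the standard Monkhorst-Pack fractional coordinates u_r = (2r - q - 1)/(2q) and assumes that the only symmetry used to reduce the grid is the equivalence of k and -k; the text does not state which symmetries were applied, and the function names below are hypothetical rather than part of DACAPO.

```python
from itertools import product


def monkhorst_pack(q1, q2, q3):
    """Fractional Monkhorst-Pack points u_r = (2r - q - 1) / (2q), r = 1..q."""
    def axis(q):
        return [(2 * r - q - 1) / (2 * q) for r in range(1, q + 1)]
    return list(product(axis(q1), axis(q2), axis(q3)))


def reduce_by_inversion(kpoints, ndigits=8):
    """Keep one representative of each {k, -k} pair (assumed symmetry).

    Returns a dict mapping the representative point to its weight,
    i.e. the number of grid points it stands for.
    """
    weights = {}
    for k in kpoints:
        plus = tuple(round(x, ndigits) + 0.0 for x in k)    # "+ 0.0" removes -0.0
        minus = tuple(round(-x, ndigits) + 0.0 for x in k)
        representative = max(plus, minus)
        weights[representative] = weights.get(representative, 0) + 1
    return weights


grid = monkhorst_pack(5, 5, 1)
irreducible = reduce_by_inversion(grid)
print(len(grid), len(irreducible))
```

Under these assumptions the 25 grid points reduce to 13 inequivalent points, matching the number quoted in the text; a calculation exploiting the full point group of the clean surface would give fewer.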
The calculated equilibrium lattice constants for bulk fcc Pt and fcc Ni are 4.004 Å and 3.525 Å, respectively, in close agreement with the experimentally determined values of 3.924 Å and 3.524 Å.31 Our results are in good agreement with other theoretical calculations using similar methods.6,10,32-37 For the calculations on the isolated CH3 we use a simple cubic supercell with a lattice constant of 16 Å. The adsorption energies Ead are given by38
$$-E_{\mathrm{ad}} = E_{\mathrm{CH_3/slab}} - E_{\mathrm{slab}} - E_{\mathrm{CH_3}}, \qquad (1)$$
where $E_{\mathrm{CH_3/slab}}$ is the total energy of the adsorbate-substrate system, $E_{\mathrm{slab}}$ is the energy of a clean relaxed slab, and $E_{\mathrm{CH_3}}$ is the energy of an isolated CH3 molecule.
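The sign convention of Eq. (1) can be sketched in a few lines: a bound adsorbate lowers the total energy, so the adsorption energy defined this way is positive for a stable site. The function name and the three total energies below are hypothetical, chosen only to illustrate the convention; they are not values from the calculations.

```python
def adsorption_energy(e_total, e_slab, e_molecule):
    """Adsorption energy (eV) from -E_ad = E_CH3/slab - E_slab - E_CH3."""
    return -(e_total - e_slab - e_molecule)


# Hypothetical total energies in eV, for illustration only.
e_slab = -1000.00      # clean relaxed slab
e_ch3 = -200.00        # isolated CH3 in its supercell
e_total = -1202.60     # slab with one adsorbed CH3

e_ad = adsorption_energy(e_total, e_slab, e_ch3)
print(f"E_ad = {e_ad:.2f} eV")
```

With this convention, a larger value means stronger binding, which is how the adsorption energies in the tables are compared.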
3. Results and Discussion
In this section we determine the site preferences, adsorption energies and geometries of the CH3 molecule adsorbed on the Pt(100) and Ni(111) surfaces. We compare these results with available theoretical findings. Firstly, we checked that the properties of the free molecule in the gas phase were accurately reproduced. Table 1 compares the calculated and experimental39 C-H bond lengths and H-C-H angle of CH3. It is clear from Table 1 that there is good agreement between calculated and experimental values.
3.1. CH3 on Pt(100)
3.1.1. Site preferences, adsorption energies and structures of CH3 on Pt(100)
The adsorption of CH3 on Pt(100) at the high-symmetry top (T), bridge (B) and hollow (H) sites, shown in Fig. 1, was investigated. For all the adsorption sites, only one hydrogen orientation was considered. For the top site, one of the H atoms in the CH3 species points towards the nearest-neighbour Pt atom in the top layer. For the bridge site, one of the H atoms in the CH3 species points towards its closest hollow site in the top layer. For the hollow site, one of the H atoms in the CH3 species points midway between two neighbouring Pt atoms in the top layer. The optimization of CH3 on these sites shows that the hollow and bridge sites are very unstable: the molecule on these sites always moves towards the top site. This result is new; to our knowledge, no such calculations have appeared in the literature. We can however note calculations concerning the adsorption of CH3 on the (100) surface of Ni. Upton40 and Siegbahn and Panas,16 using a cluster model, found that the hollow and bridge sites are stable. The stability of these adsorption sites is consistent with the recent calculations of Wonchoba and Truhlar41 and of Lai et al.,42 who used a semi-empirical potential and a slab-type DFT approach, respectively. The optimized geometries obtained on the top site for CH3 on Pt(100) are summarized in Table 2. Figure 2 gives a representation of the on-top adsorbed CH3 with some geometric parameters. The carbon atom is 2.08 Å above the surface. The three hydrogen atoms are equivalent and the C-H bonds are extended from 1.09 Å (see Table 1), calculated for free CH3, to 1.10 Å for the adsorbed molecule. The H-C-H angles are reduced from 120°, as calculated for the isolated CH3 radical, to 111.1° for the adsorbed CH3. The Pt-C-H angles are 107.7° for the symmetrically equivalent H atoms.
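The quoted top-site geometry can be checked for internal consistency with elementary trigonometry. The sketch below assumes ideal C3v symmetry with the Pt-C bond along the threefold axis, so that cos(H-C-H) = 1 - (3/2) sin²(Pt-C-H), and obtains the H-Pt distance from the law of cosines in the Pt-C-H triangle. The function names are hypothetical; the input values are those given in the text.

```python
import math


def hch_angle(metal_c_h_deg):
    """H-C-H angle (deg) of a C3v-symmetric CH3 whose C-H bonds each make
    the angle metal_c_h_deg with the metal-C axis (assumed threefold axis)."""
    beta = math.radians(metal_c_h_deg)
    cos_hch = 1.0 - 1.5 * math.sin(beta) ** 2
    return math.degrees(math.acos(cos_hch))


def h_metal_distance(d_c_metal, d_c_h, metal_c_h_deg):
    """H-metal distance from the law of cosines in the metal-C-H triangle."""
    gamma = math.radians(metal_c_h_deg)
    return math.sqrt(d_c_metal**2 + d_c_h**2
                     - 2.0 * d_c_metal * d_c_h * math.cos(gamma))


# Values quoted in the text for CH3 on the Pt(100) top site.
print(f"H-C-H angle : {hch_angle(107.7):.1f} deg")
print(f"H-Pt length : {h_metal_distance(2.08, 1.10, 107.7):.2f} Angstrom")
```

Both derived numbers agree with the tabulated H-C-H angle and H-Pt distance to within the rounding of the inputs, and a planar radical (90° to the axis) recovers the 120° of free CH3.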
The calculated adsorption energy of CH3 at the top site of Pt(100), which is the preferred adsorption site, is compared in Table 3 with other theoretical results for the Pt(111) and Pt(110) surfaces. One can see from this table that our value is 0.27 eV higher than the results of Kua et al.7,8 and Petersen et al.10 obtained from their cluster and DFT slab calculations, respectively. The lowest values of 1.77 eV and 1.74 eV were obtained by Au et al.43,44 using a Pt10(7,3) cluster, and by Minot et al.5 within the framework of the extended Hückel theory, respectively. Such discrepancies may be due to edge effects resulting from ignoring the proper periodic boundary conditions, the size of the cluster and the limited number of basis functions.
3.1.2. Density of states analysis
In this section, we discuss the local density of states (LDOS) as obtained by projection of the wave functions on an atomic basis. Figure 3 illustrates the LDOS of CH3 adsorbed at a top site and of the free CH3. The LDOS of the free CH3 molecule presents three peaks: the lowest peak, 12 eV below the Fermi level (E_F), corresponds to the 2a1 molecular state, which is a mixture of the C 2s and three H 1s orbitals. The second peak, about 5.5 eV below E_F, represents the 1e orbitals. The last peak, at E_F, is composed of states which are predominantly of 3a1 molecular character.9 As can be seen in Fig. 3, all the energy peaks in the adsorbed CH3 LDOS (except for the 2a1 states) are shifted downwards in energy relative to the corresponding peaks in the isolated CH3 LDOS. For the 3a1-derived states, the peak intensity becomes much smaller than that of the free methyl, suggesting that the amount of electron density in 3a1-derived states is significantly reduced after adsorption. We found that the 3a1 and 1e orbitals are the main orbitals involved in bonding with the surface. So, on adsorption of CH3 in the top site, these valence molecular orbitals may be expected to mix with Pt states of appropriate symmetry and energy, resulting in the formation of the adsorbate-substrate bond. All these results are consistent with results reported by Michaelides and Hu,37 Petersen et al.10 and Xia and Xie45 for CH3 on Ni(111), on Pt(110) and on Rh(111), respectively. The Pt LDOS of the clean and CH3-covered (100) surface are shown in Fig. 4. Upon adsorption, two new features appear: i) a strong decrease of the LDOS at E_F, and ii) new states extending from -4 eV to -6 eV and at -12 eV relative to the Fermi level. These new states are a mixture of Pt d, C p and H s orbitals.
3.2. CH3 on Ni(111)
3.2.1. CH3 adsorption
We consider the adsorption of CH3 on the Ni(111) surface. The Ni(111) surface exhibits four high-symmetry adsorption sites: top (T), bridge (B), fcc hollow (F) and hcp hollow (H), shown in Fig. 5. The hcp hollow site has a Ni atom directly below it in the second plane, whereas the fcc threefold hollow site has a Ni atom below it only in the third layer. For the two threefold hollow sites, one hydrogen orientation has been taken into account, with C-H bonds aligned with surface Ni atoms. Table 4 shows the adsorption energies for CH3 on Ni(111) from various calculations. We first note that the results reported by Schüle et al.,46 which range from 1.89 eV to 2.08 eV for top site adsorption and from 1.95 eV to 2.20 eV for hollow site adsorption, are close to our results. This cannot be said for the results of Yang et al.17 and Michaelides et al.,37 whose values are much lower than ours. The large discrepancy between our results and those of these authors17,37 for CH3 on Ni(111) could have several origins. Michaelides et al.37 use a method similar to the present one, but with i) a 3-layer slab and ii) a (2 × 2 × 1) k-point grid. The small thickness of the slab and the low number of k-points can explain why their adsorption energies are smaller than ours. Yang et al.17 used a many-electron embedding theory to describe bonding of CH3 on a 28-atom cluster. A recent theoretical paper by Lai et al.42 using the DFT-slab model for the adsorption of CH3 on Ni(100) reports adsorption energies of 2.24 eV, 2.11 eV and 2.05 eV for the bridge, hollow and top sites, respectively. These values are quantitatively consistent with our results.
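The comparison made above can be quantified directly from the numbers in Table 4. The following sketch ranks the sites by adsorption energy for each data set and computes the mean offset between the present results and those of Refs. 17 and 37. The dictionary layout and function names are hypothetical; the energies are transcribed from Table 4.

```python
# Adsorption energies (eV) of CH3 on Ni(111), transcribed from Table 4.
ENERGIES = {
    "present": {"fcc": 2.54, "hcp": 2.50, "bridge": 2.51, "top": 2.28},
    "Ref. 17": {"fcc": 1.69, "hcp": 1.69, "bridge": 1.56, "top": 1.46},
    "Ref. 37": {"fcc": 1.46, "hcp": 1.48, "bridge": 1.37, "top": 1.22},
}


def rank_sites(site_energies):
    """Sites ordered from most to least stable (largest E_ad first)."""
    return sorted(site_energies, key=site_energies.get, reverse=True)


def mean_offset(a, b):
    """Mean of a[site] - b[site] over all sites, in eV."""
    return sum(a[s] - b[s] for s in a) / len(a)


for name, values in ENERGIES.items():
    print(name, rank_sites(values))

for ref in ("Ref. 17", "Ref. 37"):
    offset = mean_offset(ENERGIES["present"], ENERGIES[ref])
    print(f"present - {ref}: {offset:.2f} eV on average")
```

In every data set the top site comes out least stable, while the three higher-coordination sites lie within a few hundredths of an eV of each other in the present results, so the main disagreement with Refs. 17 and 37 is a nearly uniform shift rather than a different site preference.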
3.2.2. CH3 Geometry
The optimized structures obtained at all adsorption sites are listed in Table 5.
(1) fcc threefold hollow site
The fcc threefold hollow adsorption site is the most stable. For the fcc site, the smallest C-Ni bond length is 2.10 Å. Two H-Ni bond lengths of 1.99 Å and one H-Ni bond length of 2.02 Å are formed. The C-H bond length increases from 1.09 Å for the free CH3 to 1.12 Å for the adsorbed CH3, for the three hydrogen atoms. The CH3 pyramid is somewhat squeezed, with a smaller H-C-H angle of 106.0° as compared to the tetrahedral value of 109.5° for CH4.
(2) hcp threefold hollow site
For the hcp site, the C-Ni distances are 2.10 Å and the C-H bond lengths are 1.12 Å. The shortest H-Ni bond length is 2.00 Å. A difference of 0.1° is found between the H-C-H angle of two equivalent hydrogen atoms and the H-C-H angle of the other hydrogen atom.
(3) bridge site
For the bridge site, the three H-Ni distances differ in length. The three C-H bond lengths are 1.12 Å, equal to those at the fcc and hcp hollow sites. Two C-Ni distances are 2.10 Å, whereas the third Ni atom is at 2.11 Å. The H-C-H angle between two equivalent hydrogen atoms is 106.2°, whereas it is 106.0° for the third hydrogen atom.
(4) top site
For CH3 adsorption at the top site, all H atoms are symmetrically equivalent. Moreover, the H-C-H angle is 109.9°, which most closely approximates the tetrahedral structure of methane. We find that the C-Ni distance of 1.95 Å is strongly shortened compared to the other sites, but the top site has a long H-Ni distance of 2.53 Å. The top site is definitely the least stable.
3.2.3. Density of states analysis
In this section, we consider the LDOS of two adsorption sites: the ground-state fcc hollow site and the least stable top site.
(1) fcc hollow site
Figure 6 illustrates the LDOS for CH3 adsorbed at the fcc hollow site. The dotted curves represent the densities of free CH3 and have been superimposed for comparison. The positions of the molecular states of free CH3 are indicated as in Sec. 3.1.2. The peak intensity of the 2a1 orbital is unchanged, but it is shifted by about 1.5 eV from its position in the free molecule. The peak intensity of the 1e orbital in the adsorbed CH3 curve is smaller than that of the corresponding peak in the free CH3. More important is the dramatic change in the 3a1 orbital which occurs upon adsorption. There is strong mixing between the 3a1 orbital and metal 3d states, which occurs some 5 eV below the Fermi level.
(2) top site
The LDOS of CH3 at the top site is shown in Fig. 7 together with the LDOS of the free CH3. The 2a1 orbital is not affected, while the peak intensity of the 1e orbital decreases relative to that of the isolated CH3. The 3a1 orbital is slightly weakened and is shifted by about 3 eV from its position in the
free molecule. On the top site, CH3-Ni mixing is weaker compared to the fcc hollow site.
4. Conclusion
We have computed the adsorption energies of CH3 on Pt(100) and Ni(111) surfaces at 0.25 monolayer coverage with density functional periodic calculations. The most important results can be summarized as follows. On the Pt(100) surface, CH3 adsorbs only on a top site, with an adsorption energy of 2.60 eV. The bridge and hollow sites are unstable. The analysis of the LDOS curves for adsorbed CH3 and for free methyl shows a strong 3a1-Pt d mixing, a weak 1e-Pt d mixing, and no mixing for 2a1-Pt d. A new Pt-d LDOS state appears 12 eV below the Fermi level. On the Ni(111) surface, calculated adsorption energies on the four high-symmetry adsorption sites are in the following order: fcc > hcp ≈ bridge > top. For the fcc ground state, the LDOS analysis has shown that the 3a1, 1e and 2a1 orbitals of CH3 are involved in bonding with the metal. The most significant contribution to the CH3-Ni bonding arises from the 3a1 orbital.
Acknowledgments
P. S. Moussounda and G. Rakotovelo are grateful to the "Services de Cooperation et d'Action Culturelle des Ambassades de France au Congo-Brazzaville et a Madagascar", and the "Agence Universitaire de la Francophonie (AUF)" for financial support.
References
1. M. B. Lee, Q. Y. Yang and S. T. Ceyer, J. Chem. Phys. 87, 2724 (1987).
2. M. P. Kaminsky, N. Winograd, G. L. Geoffroy and M. A. Vannice, J. Am. Chem. Soc. 108, 1315 (1986).
3. F. Zaera, Chem. Rev. 95, 2651 (1995).
4. D. J. Oackes, M. R. S. McCousta and M. A. Chesters, Faraday Discuss. 96, 325 (1993).
5. C. Minot, M. A. van Hove and G. A. Somorjai, Surf. Sci. 127, 441 (1982).
6. G. Papoian, J. K. Nørskov and R. Hoffmann, J. Am. Chem. Soc. 122, 4129 (2000).
7. J. Kua and W. A. Goddard, III, J. Am. Chem. Soc. 102, 9492 (1998).
8. J. Kua, F. Faglioni and W. A. Goddard, III, J. Am. Chem. Soc. 122, 2309 (2000).
9. A. Michaelides and P. Hu, J. Chem. Phys. 114, 2523 (2001).
10. M. A. Petersen, S. J. Jenkins and D. A. King, J. Phys. Chem. B 108, 5909 (2004).
11. D. C. Ford, Y. Xu and M. Mavrikakis, Surf. Sci. 587, 159 (2005).
12. M. B. Lee, Q. Y. Yang, S. L. Tang and S. T. Ceyer, J. Chem. Phys. 85, 1693 (1986).
13. R. M. Gavin, J. Reutt and E. L. Mutterties, Proc. Nat. Acad. Sci. (USA) 78, 3981 (1981).
14. C. Zheng, Y. Apeloig and R. Hoffmann, J. Am. Chem. Soc. 110, 749 (1988).
15. R. C. Baetzold, J. Chem. Phys. 88, 5583 (1988).
16. P. E. M. Siegbahn and I. Panas, Surf. Sci. 240, 37 (1990).
17. H. Yang and Y. L. Whitten, J. Am. Chem. Soc. 113, 6442 (1991).
18. H. Burghgraef, A. P. J. Jansen and R. A. van Santen, Surf. Sci. 324, 345 (1995).
19. P. S. Moussounda, M. F. Haroun, B. M'Passi-Mabiala and P. Legare, Surf. Sci. 594, 231 (2005).
20. M. F. Haroun, Ph.D. Thesis, unpublished, University Louis Pasteur (Strasbourg, France, 2006).
21. O. Ge, P. Hu, D. A. King, M. H. Lee, J. A. White and M. C. Payne, J. Chem. Phys. 106, 1210 (1997).
22. N. A. Deskins, J. Lauterbach and K. T. Thomson, J. Chem. Phys. 122, 184709 (2005).
23. T. E. Jackman, K. Griffiths, J. A. Davies and P. R. Norton, J. Chem. Phys. 181, 403 (1978).
24. E. Ritter, R. J. Behm, G. Potschke and J. Wintterlin, Surf. Sci. 79, 3529 (1983).
25. M. Ronning, E. Bergene, A. Borg, S. Ausen and A. Holmen, Surf. Sci. 477, 191 (2001).
26. The DACAPO plane wave/pseudopotential DFT code is available as Open Source Software at http://www.fysik.dtu.dk/CAMPOS/.
27. J. P. Perdew, J. A. Chevary, S. H. Vosko, K. A. Jackson, M. R. Pederson, D. J. Singh and C. Fiolhais, Phys. Rev. B 46, 6671 (1992).
28. D. Vanderbilt, Phys. Rev. B 41, 7892 (1990).
29. H. J. Monkhorst and J. D. Pack, Phys. Rev. B 13, 5188 (1976).
30. M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias and J. D. Joannopoulos, Rev. Mod. Phys. 64, 1045 (1992).
31. Landolt-Börnstein, New Series Vol. III.b, Structure Data of Elements and Intermetallic Phases (Springer, Berlin, Heidelberg, 1971).
32. A. Kokalj and M. Causa, J. Phys. Condens. Matter 11, 7463 (1999).
33. J. A. Steckel, A. Eichler and J. Hafner, Phys. Rev. B 68, 085416 (2003).
34. Z. Crljen, P. Lazic, D. Sokcevic and R. Brako, Phys. Rev. B 68, 195411 (2003).
35. P. Kratzer, B. Hammer and J. K. Nørskov, J. Chem. Phys. 105, 5595 (1996).
36. J. Greeley and M. Mavrikakis, Surf. Sci. 540, 215 (2003).
37. A. Michaelides and P. Hu, Surf. Sci. 437, 362 (1999).
38. A. Groß, Theoretical Surface Science (Springer-Verlag, Berlin, 2003).
39. D. R. Lide, Handbook of Chemistry and Physics, 78th edition (CRC Press, London, 1998).
40. T. H. Upton, J. Vac. Sci. Technol. 20, 527 (1982).
Fig. 1. Schematic plan view of three high-symmetry adsorption geometries considered in this paper for CH3 on Pt(100): (a) top site, (b) hollow site, (c) bridge site. Only two layers of metal atoms are shown for clarity. Small white and small black spheres represent H atoms and C atoms, respectively.
Fig. 2. Top and side view of CH3 adsorbed in the top site on Pt(100). The Pt atoms are the largest spheres. Small black spheres indicate C atoms, and small white spheres represent H atoms.
41. S. E. Wonchoba and D. G. Truhlar, J. Phys. Chem. 102, 6842 (1998).
42. W. Lai, D. Xie and D. H. Zhang, Surf. Sci. 594, 83 (2005).
43. C.-T. Au, M.-S. Liao and C.-F. Ng, J. Phys. Chem. A 102, 3959 (1998).
44. C.-T. Au, C.-F. Ng and M.-S. Liao, J. Catal. 185, 12 (1999).
45. H. Xia and D. Xie, Surf. Sci. 558, 15 (2004).
46. J. Schüle, P. Siegbahn and U. Wahlgren, J. Chem. Phys. 89, 6982 (1988).
Fig. 3. Local density of states (LDOS) for CH3 adsorption at the top site of Pt(100). The solid curves are for adsorbed CH3 and the dashed curves are for free CH3. The Fermi level lies at 0 eV.
Fig. 4. LDOS curves for the d orbital of Pt. The solid curve is for the surface Pt atom close to CH3 and the dashed curve for the clean Pt(100) surface. The Fermi level lies at 0 eV.

Table 1. Comparison between calculated and experimental structural parameters of isolated CH3.

                    C-H bond length (Å)    H-C-H angle (°)
Experimental³⁹      1.08                   120.0
Calculated          1.09                   120.0
Table 2. Calculated structural parameters for CH3 on Pt(100). Numbers in parentheses represent the number of bonds with that length.

Adsorption site    d(C-Pt) (Å)    d(C-H) (Å)    d(H-Pt) (Å)    H-C-H angle (°)    Pt-C-H angle (°)
Top                2.08(1)        1.10(3)       2.63(3)        107.7(3)           111.1(3)
Table 3. Theoretical adsorption energies (in eV) of CH3 at the top site on (a) Pt(100), (b) Pt(111), (c) Pt(110).
Present result (a)
2.60
Ref. 7(b) Ref. 8(b) Ref. 10(c) 2.33
Ref. 9(b)
Ref. 11(b)
Ref. 43(b) Ref. 44(b)
Ref. 5(b)
2.05
2.04
1.77
1.74
Fig. 5. Schematic plan view of the different adsorption sites for CH3 adsorption on the Ni(111) surface. The dashed line represents the unit cell. White and black spheres represent the first and second layers of Ni atoms, respectively.
Fig. 6. LDOS curves for CH3 adsorption at the fcc hollow site of Ni(111). The solid curves are for the adsorption system and the dashed curves are for the isolated states. The Fermi level lies at 0 eV.
Fig. 7. LDOS curves at CH3 for CH3 adsorption at the top site of Ni(111). The solid curves are for the adsorption system and the dashed curves are for the isolated states. The Fermi level lies at 0 eV.
Table 4. Theoretical adsorption energies (in eV) of CH3 at top, bridge, fcc and hcp sites on Ni(111). (a) Adsorption energies depending on the cluster size.

Adsorption site    Present results    Ref. 17    Ref. 37
fcc hollow         2.54               1.69       1.46
hcp hollow         2.50               1.69       1.48
bridge             2.51               1.56       1.37
top                2.28               1.46       1.22

Ref. 46: 1.95 - 2.20(a); 1.89 - 2.08(a)
Table 5. Structural parameters for CH3 adsorbed at four high-symmetry sites of Ni(111). Numbers in parentheses represent the number of bonds with that length.

Site      d(C-Ni) (Å)         d(C-H) (Å)    d(H-Ni) (Å)                  H-C-H angle (°)
fcc       2.10(2), 2.11(1)    1.12(3)       1.99(2), 2.02(1)             106.0(1), 106.1(1), 106.2(1)
hcp       2.10(3)             1.12(3)       2.00(2), 2.01(1)             106.2(1), 106.3(2)
bridge    2.10(2), 2.11(1)    1.12(3)       2.00(1), 2.01(1), 2.03(1)    106.0(1), 106.2(2)
top       1.95(1)             1.10(3)       2.53(3)                      109.9(3)
THE MAGNETIC STRUCTURE OF FeMn LAYERS ACROSS A Cu SPACER

B. M'PASSI-MABIALA
Groupe de Simulations Numériques en Magnétisme et Catalyse (GSMC), Département de Physique, Université Marien Ngouabi, B.P. 69, Brazzaville, République du Congo
E-mail: [email protected]

B. R. MALONDA-BOUNGOU, L. MOUKETO
Groupe de Simulations Numériques en Magnétisme et Catalyse (GSMC), Département de Physique, Université Marien Ngouabi, B.P. 69, Brazzaville, République du Congo, and Centre for Atomic Molecular Physics and Quantum Optics (CEPAMOQ), University of Douala, P. O. Box 8580, Douala, Cameroon

C. DEMANGEAT
Institut de Physique et Chimie des Matériaux de Strasbourg, 23, rue du Loess, F-63034 Strasbourg Cedex 2, France

Following the convincing experimental evidence of oscillatory exchange interaction between antiferromagnetic FeMn layers across a Cu spacer recently found by Cai et al., we have performed ab-initio spin-polarized density functional theory calculations on (FeMn)_n/Cu/(FeMn)_n with n = 1, 2, 3 in the (001) crystallographic face. For n = 1 we have investigated all possible magnetic configurations in the self-consistent procedure and we end up with five solutions. The ground state is found to be of ferromagnetic type: i) there is an intrinsic ferromagnetic configuration in the FeMn plane, and ii) there is a ferromagnetic coupling between the two FeMn planes separated by the Cu spacer. At 20.17 mRyd/cell above the ground state a solution with complex magnetic behavior is observed: ferromagnetic coupling between Fe and Mn in one FeMn plane, and antiferromagnetic coupling between Fe and Mn in the second FeMn plane on the other side of the Cu spacer. Another solution, at 41.26 mRyd/cell above the ferromagnetic ground state, presents in-plane antiferromagnetic configurations in each FeMn layer. For n = 2 the ground state presents in-plane ferromagnetism for the two FeMn layers adjacent to Cu; the other two FeMn layers are clearly of in-plane antiferromagnetic type. In contrast to n = 1, the ferromagnetic coupling through the Cu spacer is now absent. We report on this ground state and on various metastable states that lie close to it in energy. For n = 3 the magnetic map is more complex. A tentative explanation of the experimental results of Cai et al. is given.
1. Introduction

The discovery of exchange coupling between Fe films separated by a thin Cr spacer layer¹ and of its oscillatory behavior as the Cr thickness is varied,² together with its relevance to giant magnetoresistance,³ has triggered a large number of experimental and theoretical investigations on multilayers consisting of different transition metal ferromagnets (FM's) and nonmagnetic (NM) spacers. By now it is well understood as a general phenomenon that ferromagnetic layers of Fe, Co, Ni and their alloys separated by almost any 3d, 4d, or 5d transition metal spacer⁴⁻⁹ exhibit an exchange coupling that oscillates as a function of the spacer thickness with a period of approximately 10 Å (Cr being an exception). Antiferromagnetism, as the counterpart of ferromagnetism, originates from the same fundamental mechanism, i.e., the quantum-mechanical exchange interaction.¹⁰ From a more general physical point of view, it is therefore a critical question whether an exchange interaction can be propagated between metallic antiferromagnets (AF's) across a nonmagnetic metal spacer, just as in all-metal systems of the FM/NM/FM type. The unidirectional anisotropy of an FM layer adjacent to an AF layer, namely exchange bias,¹¹ is readily measured and has been used to probe the properties of AF's indirectly, including the determination of the AF anisotropy,¹² spin flop field,¹³ AF surface order parameter,¹⁴ and AF domains.¹⁵ Cai et al.¹⁶ have recently proposed that exchange bias might be employed to probe the interlayer exchange interaction between AF's in elaborate multilayers of "FM/AF(1)/NM/AF(2)". The exchange interaction between AF FeMn layers across a Cu spacer was studied by employing the exchange bias as a probe in multilayers of "NiFe/thin FeMn/Cu/thick FeMn". Convincing experimental evidence was found of an oscillatory exchange interaction between AF FeMn layers across a Cu spacer, with a period approximately twice that of FM multilayers. This result has shown that long-range oscillatory exchange interaction is a basic and universal feature of metallic FM/NM/FM and AF/NM/AF systems. It is due to the quantum interferences induced by the spin-dependent interface reflection of Bloch waves, and the different oscillation periods originate from the different interface reflection conditions for FM and AF spin ordering.

A great number of theoretical studies have been performed, essentially focusing on the oscillatory character of the coupling. Ref. 7 gives a detailed and comprehensive discussion of the various aspects of the problem of interlayer magnetic coupling. The interlayer exchange coupling is
described in terms of quantum interferences due to confinement in ultrathin layers. This approach provides both a physically transparent picture of the coupling mechanism and a suitable scheme for discussing the case of a realistic system. This is illustrated for the Co/Cu/Co(001) system. The cases of metallic and insulating spacers are treated in a unified manner by introducing the concept of the complex Fermi surface.

The aim of this work is to investigate the magnetic structure of (FeMn)_n/Cu/(FeMn)_n as an AF/NM/AF system by means of Density Functional Theory (DFT). The γ-FeMn alloy is a typical AF used for exchange bias, and it has been extensively investigated since it was first exploited as a domain stabilizer twenty years ago, so its properties are well documented. Ab-initio density functional calculations have been performed by M'Passi-Mabiala et al.¹⁷ on (Fe0.5Mn0.5)_n/Co(001) and Co/(Fe0.5Mn0.5)_n/Co(001) for n varying from 1 to 3. Within generalized gradient corrections, the Fe-Mn interfacial alloy, one monolayer thick on Co(001), with FM coupling between Fe and Mn corresponds to the ground state, whereas the same Fe-Mn monolayer in Co/FeMn/Co presents a Mn magnetic moment opposite to that of Fe and Co.

This paper is organized as follows. In Sec. 2 we describe the computational model. In Sec. 3 we present and discuss the results obtained for (FeMn)_n/Cu/(FeMn)_n, n = 1, 2, 3. In Sec. 4 we give our conclusions and outlook.

2. The Computational Model

The calculations are performed using a scalar relativistic version of the k-space TB-LMTO method¹⁸ with the atomic sphere approximation. This method is based on density functional theory¹⁹ with gradient corrections. The generalized gradient approximation (GGA) Perdew-Wang (PW91) functional²⁰ has been used. For Cu the experimental lattice parameter of the fcc phase, a = 3.61 Å, is used as input.
The overlayer system is modeled using the repeated slab geometry,²¹ in which 5 layers of Cu(001) surrounded by layers containing Fe and Mn atoms are separated by five layers of empty spheres. These empty spheres are sufficient to prevent interaction between slabs,²¹ which is checked through the vanishing dispersion in the direction perpendicular to the slab and the vanishing charge in the central layer of the empty spheres. The calculations are performed using an increasing number of k points in the irreducible Brillouin zone until final convergence is obtained. This is discussed in detail in the Ph.D. thesis of Meza-Aguilar.²² The description of the ordered alloy Fe0.5Mn0.5 requires calculations with 2 inequivalent atoms per layer. The study of the multilayers FeMn(1)/Cu/FeMn(2) has been done with different magnetic configurations for FeMn(1) and FeMn(2) as input.

3. Results and Discussion

The magnetic moments and differences of total energy (DTE) for n(Fe0.5Mn0.5)/Cu/n(Fe0.5Mn0.5), with n = 1-3, are reported in this section. All possible collinear magnetic configurations have been taken into account as inputs. In the Tables hereafter, the ↑ and ↓ arrows indicate that the Mn and Fe moments are up or down, respectively. Configurations with the same DTE are equivalent. The ground state energy is set to 0.00 mRyd/cell.

For n = 1 the magnetic moments and DTE for FeMn monolayers across the Cu spacer are presented in Table 1. There are five solutions with DTE 0.00, 1.29, 20.17, 41.26 and 41.79 mRyd/cell, respectively. The first is the ground state configuration, with an intrinsic ferromagnetic coupling between Fe and Mn atoms in the two layers adjacent to the Cu spacer. There is also a ferromagnetic coupling between these two FeMn layers separated by Cu. A similar result was already obtained by M'Passi-Mabiala et al.¹⁷ in the case of FeMn on Co(001). However, the second solution is a metastable state lying only 1.29 mRyd/cell above the ground state. We have to point out that the calculations have been done for T = 0 K. Since this energy difference lies below the thermal energy at room temperature, and may also be affected by structural imperfections, it is not obvious which of these two states is the genuine ground state under realistic conditions. In this second solution the two layers adjacent to Cu are antiferromagnetically coupled to each other, in contrast to the ground state, while the coupling between Fe and Mn within each plane remains ferromagnetic.
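The comparison with the thermal energy can be made explicit with a short calculation. The sketch below is only illustrative: it assumes 300 K for "room temperature" and uses rounded CODATA values for the Boltzmann constant and the Rydberg energy. It also compares a per-cell energy difference directly with k_B T, which is the same rough comparison made in the text rather than a statistical-mechanics treatment.

```python
# Rounded physical constants (assumed values, not taken from the paper).
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K
RYDBERG_EV = 13.605693       # 1 Ry expressed in eV


def thermal_energy_mryd(temperature_k: float) -> float:
    """Return k_B * T expressed in mRy."""
    return K_B_EV_PER_K * temperature_k / RYDBERG_EV * 1000.0


room_temperature_k = 300.0                 # assumed "room temperature"
kt_mryd = thermal_energy_mryd(room_temperature_k)
dte_second_solution = 1.29                 # mRyd/cell, second solution for n = 1

print(f"k_B T at {room_temperature_k:.0f} K = {kt_mryd:.2f} mRy")
print("second solution lies within k_B T:", dte_second_solution < kt_mryd)
```

With these inputs k_B T comes out close to 1.9 mRy, so the 1.29 mRyd/cell separation is indeed smaller than the thermal energy, whereas the third solution at 20.17 mRyd/cell is roughly ten times larger.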
The third solution presents a complex magnetic behavior with a ferromagnetic coupling between Fe and Mn in one FeMn plane, and an antiferromagnetic coupling between Fe and Mn in the second FeMn plane on the other side of the Cu spacer. The fourth and fifth solutions show an antiferromagnetic configuration between Fe and Mn in each FeMn plane adjacent to Cu.

For n = 2 the converged solutions are reported in Table 2. The ground state magnetic configuration (DTE = 0.00 mRyd/cell) consists of a ferromagnetic coupling between Fe and Mn in the two layers adjacent to Cu, with an antiferromagnetic coupling between these two FeMn planes separated by Cu. The metastable state (DTE = 1.36 mRyd/cell) shows a ferromagnetic coupling between Fe and Mn in the two layers adjacent to Cu, with a ferromagnetic coupling between these two FeMn layers. Fe and Mn atoms are antiferromagnetically aligned in the FeMn surface layers of both the ground and the metastable states. There are many solutions with close DTE values, namely 7.10, 8.47, 9.19, 9.58 and 9.84 mRyd/cell. Concerning the magnetic moments, no general conclusion can be drawn because of the lack of symmetry between these configurations. The last two solutions (25.14 and 62.35 mRyd/cell) display a complex magnetic behavior.

For n = 3 the converged solutions are reported in Table 3. The ground state presents a ferromagnetic coupling between Fe and Mn in the two layers adjacent to Cu, with an antiferromagnetic coupling between these two FeMn layers. Fe and Mn atoms are ferromagnetically aligned in the two FeMn subsurface layers, while the coupling is antiferromagnetic in the two FeMn surface planes. The second solution (DTE = 15.57 mRyd/cell) displays the same magnetic behavior as the ground state in and between the two layers adjacent to Cu, but Fe and Mn atoms are antiferromagnetically aligned in the two FeMn subsurface layers while the coupling is ferromagnetic in the two FeMn surface planes. At 18.14 mRyd/cell a ferromagnetic coupling between Fe and Mn in the two layers adjacent to Cu, with an antiferromagnetic coupling between these two FeMn layers, is observed. Fe and Mn atoms are ferromagnetically aligned in the two FeMn subsurface layers while the coupling is antiferromagnetic in the two FeMn surface planes. The other solutions display a complex magnetic behavior.

As shown by Cai et al.,¹⁶ there are two different spin alignment configurations at the two interfaces of AF/NM/AF, i.e., ↑↓/NM/↑↓ versus ↑↓/NM/↓↑, where the arrows nearest and next nearest to the slash represent the directions of the interface spin and of the neighboring spin, respectively. The coupling can be explained by the spin-dependent reflections of Bloch waves at the two interfaces, the quantum interface states confined in the spacer and the corresponding energies.
The difference between these oscillatory energies gives rise to the oscillatory interlayer exchange interaction. The ground state configurations obtained for the different thicknesses are in qualitative agreement with the experimental results,¹⁶ where evidence of an oscillatory exchange interaction between antiferromagnetic FeMn layers across a Cu spacer was observed.

4. Conclusion

We have performed ab-initio TB-LMTO calculations to determine the magnetic behavior of n(FeMn)/Cu/n(FeMn)(001) as an AF/NM/AF system. For n = 1 the solution with a ferromagnetic coupling between the two FeMn planes separated by the Cu spacer is the ground state configuration. For n = 2 an in-plane ferromagnetic configuration is displayed at the
interfaces, and an in-plane antiferromagnetic one for the other layers. For n = 3 the in-plane ferromagnetic configuration is displayed at the interfaces, but the other layers show a complex magnetic behavior.

Acknowledgments

B. M'Passi-Mabiala wishes to acknowledge the Abdus Salam International Centre for Theoretical Physics (ICTP) and the Swedish International Development Cooperation Agency (SIDA). B. R. Malonda-Boungou thanks the AIEA/ICTP Sandwich Training Educational Programme (STEP) for financial support.

References

1. P. Grünberg, R. Schreiber, Y. Pang, M. B. Brodsky and H. Sowers, Phys. Rev. Lett. 57, 2442 (1986).
2. S. S. Parkin, N. More and K. P. Roche, Phys. Rev. Lett. 64, 2304 (1990).
3. M. N. Baibich, J. M. Broto, A. Fert, F. Nguyen Van Dau, F. Petroff, P. Etienne, G. Creuzet, A. Friederich and J. Chazelas, Phys. Rev. Lett. 61, 2472 (1988).
4. For a survey, see, e.g., Ultrathin Magnetic Structures, eds. B. Heinrich, A. Bland and J. A. C. Bland (Springer, New York, 1994), Vol. II.
5. S. S. Parkin, Phys. Rev. Lett. 67, 3598 (1991).
6. Q. Leng, V. Cros, R. Schäfer, A. Fuss, P. Grünberg and W. Zinn, J. Magn. Magn. Mater. 126, 367 (1993).
7. P. Bruno, Phys. Rev. B 52, 411 (1995).
8. M. D. Stiles, Phys. Rev. B 48, 7238 (1993).
9. F. J. Himpsel, J. E. Ortega, G. J. Mankey and R. F. Willis, Adv. Phys. 47, 511 (1998).
10. W. Heisenberg, Z. Phys. 49, 619 (1928).
11. W. H. Meiklejohn and C. P. Bean, Phys. Rev. 102, 1413 (1956).
12. D. Mauri, E. Kay, D. Scholl and J. K. Howard, J. Appl. Phys. 62, 2929 (1987).
13. J. Nogués, L. Morellon, C. Leighton, M. R. Ibarra and I. K. Schuller, Phys. Rev. B 61, R6455 (2000).
14. D. Lederman, J. Nogués and I. K. Schuller, Phys. Rev. B 56, 2332 (1997).
15. C. L. Chien, V. S. Gornakov, V. I. Nikitenko, A. J. Shapiro and R. D. Shull, Phys. Rev. B 68, 014418 (2003).
16. J. W. Cai, W. Y. Lai, J. Teng, F. Shen, Z. Zhang and L. M. Mei, Phys. Rev. B 70, 214428 (2004).
17. B. M'Passi-Mabiala, S. Meza-Aguilar and C. Demangeat, Surf. Sci. 547, 201 (2003).
18. O. K. Andersen and O. Jepsen, Phys. Rev. Lett. 53, 2571 (1984); O. K. Andersen, Z. Pawlowska and O. Jepsen, Phys. Rev. B 34, 5253 (1986). The standard TB-LMTO-ASA code (version 47) developed by O. K. Andersen et al. at M.P.I. Stuttgart, Germany, was used.
19. P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964); W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).
20. J. P. Perdew, Y. Wang and E. Engel, Phys. Rev. Lett. 66, 508 (1991).
21. M. A. Khan, J. Phys. Soc. Jpn. 62, 1682 (1993).
22. S. Meza-Aguilar, Ph.D. Thesis, unpublished, Strasbourg (France, 2000).
Table 1. Ground state and the closest metastable magnetic configurations for 1(Fe0.5Mn0.5)/Cu/1(Fe0.5Mn0.5), and magnetic moment per atomic site in units of μB. The GGA-PW91 exchange correlation functional has been used. The energy of the ground state is set to 0.00 mRyd/cell. The output configurations a-e correspond to the following sets of input configurations: {↑↑/Cu/↑↑, ↓↓/Cu/↓↓}, {↓↓/Cu/↑↑, ↑↑/Cu/↓↓}, {↓↓/Cu/↓↑, ↑↑/Cu/↑↓}, {↓↑/Cu/↑↓, ↑↓/Cu/↓↑}, {↑↓/Cu/↑↓, ↑↑/Cu/↓↑}, respectively.

Output   Diff.    Fe1     Mn1     Cu3a    Cu3b    Cu2a    Cu2b    Cu1a    Cu1b    Cu4a    Cu4b    Cu5a    Cu5b    Fe4     Mn4
config.  energy
a         0.00    2.75    3.64    0.02    0.02   -0.00   -0.02   -0.00   -0.00   -0.00   -0.02    0.02    0.02    2.76    3.63
b         1.29   -2.75   -3.63   -0.01   -0.01    0.01    0.01    0.00    0.00   -0.01   -0.01    0.01    0.01    2.75    3.63
c        20.17   -2.77   -3.64   -0.02   -0.02    0.00    0.02   -0.00   -0.00   -0.00    0.00    0.01    0.01   -2.42    3.64
d        41.26   -2.40    3.67    0.02    0.02    0.00   -0.00    0.00    0.00   -0.00    0.00   -0.02   -0.02    2.40   -3.67
e        41.79    2.33   -3.69   -0.02   -0.02    0.00   -0.00   -0.00   -0.00    0.00   -0.00   -0.02   -0.02    2.33   -3.69
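The coupling labels used in the discussion of the n = 1 solutions can be read off from the signs of the Fe and Mn moments in Table 1. The following sketch does this mechanically. The dictionary simply copies the Fe1, Mn1, Fe4 and Mn4 columns of the table; taking the relative sign of the two Fe moments as the indicator of the coupling across the Cu spacer is an assumption made here for illustration.

```python
# (Fe1, Mn1, Fe4, Mn4) moments in Bohr magnetons, copied from Table 1.
moments = {
    "a": (2.75, 3.64, 2.76, 3.63),
    "b": (-2.75, -3.63, 2.75, 3.63),
    "c": (-2.77, -3.64, -2.42, 3.64),
    "d": (-2.40, 3.67, 2.40, -3.67),
    "e": (2.33, -3.69, 2.33, -3.69),
}


def coupling(m1: float, m2: float) -> str:
    """Label two moments as ferromagnetically (FM) or antiferromagnetically (AF) aligned."""
    return "FM" if m1 * m2 > 0 else "AF"


def classify(fe1: float, mn1: float, fe4: float, mn4: float) -> tuple:
    """Return (in-plane coupling of plane 1, in-plane coupling of plane 2, Fe-Fe alignment across Cu)."""
    return coupling(fe1, mn1), coupling(fe4, mn4), coupling(fe1, fe4)


labels = {name: classify(*values) for name, values in moments.items()}
for name, (plane1, plane2, across) in labels.items():
    print(f"{name}: plane 1 {plane1}, plane 2 {plane2}, across Cu {across}")
```

Under this labelling, solution a is ferromagnetic both in-plane and across the spacer, solution b differs from it only by the reversed alignment across Cu, solution c mixes one ferromagnetic and one antiferromagnetic plane, and solutions d and e are antiferromagnetic in both planes, in line with the description given in the text.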
[Tables 2 and 3: converged magnetic configurations, differences of total energy (mRyd/cell) and magnetic moments per atomic site (μB) for n = 2 and n = 3. The tabulated values are illegible in this copy.]
GENERATION OF MATRICES WITH SPECIFIED EIGENVALUES

HABATWA V. MWEENE
Physics Department, University of Zambia, P. O. Box 32379, Lusaka, Republic of Zambia
E-mail: [email protected]

We present a prescription for forming matrices with specified eigenvalues and known eigenvectors. With this method, we can form Hermitian, anti-Hermitian, symmetric and general matrices with arbitrary eigenvalues with great ease. Certain functions are required for the implementation of this method. Probability amplitudes connecting observables with discrete eigenvalue spectra perform the task, and they can be obtained from spin theory. For the example case of 5 x 5 matrices, these functions are given, and various illustrative matrices are generated together with their normalized eigenvectors.
1. Introduction The production of matrices with specified eigenvalues is an important problem in linear algebra, related as it is to both the eigenvalue problem and the inverse eigenvalue problem. 1,2 One way of doing this is by means of similarity transformations. In this paper we present another method. With this particular method, an analytic generic matrix is produced with the desired eigenvalues, and by simply changing the values of some arguments, matrix forms such as Hermitian, anti-Hermitian, symmetric, general, etc., are obtained. At the same time, the normalized eigenvectors are generated. The method lends itself to easy implementation on computers. And since the elements of the matrix and the eigenvectors are given in analytic form, these matrices and their eigenvectors should be useful for general analysis in other areas of linear algebra.
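2. Matrix Generation

The basic result underlying this method is the following. Suppose that the $N^2$ functions $\{\phi(B_i; C_j)\}$ satisfy

$$\sum_{l=1}^{N} \phi(B_l; C_i)\,\phi^*(B_l; C_j) = \delta_{ij}, \qquad (1)$$

where $B$ and $C$ are parameters that take the values $B_1, B_2, \ldots, B_N$ and $C_1, C_2, \ldots, C_N$, respectively. Here the quantities $\phi^*$ are the complex conjugates of the $\phi$. If we form the matrix $[M]$ with elements

$$M_{ij} = \sum_{n=1}^{N} \lambda_n\,\phi^*(B_i; C_n)\,\phi(B_j; C_n), \qquad (2)$$

we find that the eigenvectors of this matrix are

$$[\xi_i] = \begin{pmatrix} \phi^*(B_1; C_i) \\ \vdots \\ \phi^*(B_N; C_i) \end{pmatrix} \qquad (3)$$

with the respective eigenvalues $\lambda_i$. Indeed, if we write

$$[V] = [M][\xi_i], \qquad (4)$$

we find that the $k$-th row of $[M][\xi_i]$ is

$$\begin{aligned}
V_k &= \sum_{l=1}^{N} M_{kl}\,\phi^*(B_l; C_i) \\
    &= \sum_{l=1}^{N} \left( \sum_{n=1}^{N} \lambda_n\,\phi^*(B_k; C_n)\,\phi(B_l; C_n) \right) \phi^*(B_l; C_i) \\
    &= \sum_{n=1}^{N} \lambda_n\,\phi^*(B_k; C_n) \sum_{l=1}^{N} \phi(B_l; C_n)\,\phi^*(B_l; C_i) \\
    &= \sum_{n=1}^{N} \lambda_n\,\phi^*(B_k; C_n)\,\delta_{ni} \\
    &= \lambda_i\,\phi^*(B_k; C_i).
\end{aligned} \qquad (5)$$

Thus the eigenvalue equation

$$[M][\xi_i] = \lambda_i[\xi_i] \qquad (6)$$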
is satisfied. Property (1) means that the eigenvectors are orthonormal,

$$[\xi_i]^\dagger[\xi_j] = \langle \xi_i | \xi_j \rangle = \sum_{l=1}^{N} \phi(B_l; C_i)\,\phi^*(B_l; C_j) = \delta_{ij}. \qquad (7)$$
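The prescription of Eqs. (1)-(7) can be tried out numerically. The sketch below is a minimal illustration in plain Python, not the author's implementation: the 2 x 2 table `phi` is a hypothetical choice (the entries of a unitary matrix, which satisfy property (1)), `phi[l][j]` stands for $\phi(B_{l+1}; C_{j+1})$, and the eigenvalues are arbitrary. It builds $[M]$ from Eq. (2), forms the eigenvectors of Eq. (3), and checks Eq. (6).

```python
import math


def build_matrix(phi, eigenvalues):
    """Eq. (2): M_ij = sum_n lambda_n * conj(phi(B_i; C_n)) * phi(B_j; C_n)."""
    n = len(eigenvalues)
    return [[sum(eigenvalues[k] * phi[i][k].conjugate() * phi[j][k] for k in range(n))
             for j in range(n)]
            for i in range(n)]


def eigenvector(phi, i):
    """Eq. (3): components conj(phi(B_k; C_i)) for k = 1..N."""
    return [row[i].conjugate() for row in phi]


def residual(matrix, vector, eigenvalue):
    """Largest component of |M v - lambda v|, which should vanish by Eq. (6)."""
    return max(abs(sum(m * v for m, v in zip(row, vector)) - eigenvalue * vk)
               for row, vk in zip(matrix, vector))


# Hypothetical amplitudes: columns (1, i)/sqrt(2) and (1, -i)/sqrt(2) are orthonormal.
s = 1.0 / math.sqrt(2.0)
phi = [[complex(s, 0.0), complex(s, 0.0)],
       [complex(0.0, s), complex(0.0, -s)]]
eigenvalues = [3.0, -1.0]   # arbitrary real eigenvalues

M = build_matrix(phi, eigenvalues)
residuals = [residual(M, eigenvector(phi, i), lam) for i, lam in enumerate(eigenvalues)]
is_hermitian = all(abs(M[i][j] - M[j][i].conjugate()) < 1e-12
                   for i in range(2) for j in range(2))
print("residuals:", residuals)
print("Hermitian:", is_hermitian)
```

Because the eigenvalues chosen here are real, the generated matrix comes out Hermitian; replacing them by purely imaginary or general complex numbers in `eigenvalues` changes the character of $[M]$ while leaving the eigenvectors untouched.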
These results give us a means of generating matrices with specified eigenvalues and known eigenvectors. We only need appropriate functions
(8)
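its operator is

$$[\hat{\mathbf{c}}\cdot\mathbf{S}] = \begin{pmatrix}
2\cos\theta & \sin\theta\,e^{-i\varphi} & 0 & 0 & 0 \\
\sin\theta\,e^{i\varphi} & \cos\theta & \sqrt{\tfrac{3}{2}}\sin\theta\,e^{-i\varphi} & 0 & 0 \\
0 & \sqrt{\tfrac{3}{2}}\sin\theta\,e^{i\varphi} & 0 & \sqrt{\tfrac{3}{2}}\sin\theta\,e^{-i\varphi} & 0 \\
0 & 0 & \sqrt{\tfrac{3}{2}}\sin\theta\,e^{i\varphi} & -\cos\theta & \sin\theta\,e^{-i\varphi} \\
0 & 0 & 0 & \sin\theta\,e^{i\varphi} & -2\cos\theta
\end{pmatrix}. \qquad (9)$$

The eigenvalues of this matrix are 2, 1, 0, −1 and −2, with the respective normalized eigenvectors

$$[\chi^{(c)}_{m=2}] = \begin{pmatrix}
\cos^4\frac{\theta}{2}\,e^{-i2\varphi} \\
\sin\theta\cos^2\frac{\theta}{2}\,e^{-i\varphi} \\
\sqrt{\tfrac{3}{8}}\sin^2\theta \\
\sin\theta\sin^2\frac{\theta}{2}\,e^{i\varphi} \\
\sin^4\frac{\theta}{2}\,e^{i2\varphi}
\end{pmatrix}, \qquad (10)$$

$$[\chi^{(c)}_{m=1}] = \begin{pmatrix}
\sin\theta\cos^2\frac{\theta}{2}\,e^{-i2\varphi} \\
\left(3\sin^2\frac{\theta}{2} - \cos^2\frac{\theta}{2}\right)\cos^2\frac{\theta}{2}\,e^{-i\varphi} \\
-\sqrt{\tfrac{3}{2}}\sin\theta\cos\theta \\
-\left(3\cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2}\right)\sin^2\frac{\theta}{2}\,e^{i\varphi} \\
-\sin\theta\sin^2\frac{\theta}{2}\,e^{i2\varphi}
\end{pmatrix}, \qquad (11)$$

$$[\chi^{(c)}_{m=0}] = \begin{pmatrix}
\sqrt{\tfrac{3}{8}}\sin^2\theta\,e^{-i2\varphi} \\
-\sqrt{\tfrac{3}{2}}\sin\theta\cos\theta\,e^{-i\varphi} \\
\tfrac{1}{2}\left(2\cos^2\theta - \sin^2\theta\right) \\
\sqrt{\tfrac{3}{2}}\sin\theta\cos\theta\,e^{i\varphi} \\
\sqrt{\tfrac{3}{8}}\sin^2\theta\,e^{i2\varphi}
\end{pmatrix}, \qquad (12)$$

$$[\chi^{(c)}_{m=-1}] = \begin{pmatrix}
\sin\theta\sin^2\frac{\theta}{2}\,e^{-i2\varphi} \\
-\left(3\cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2}\right)\sin^2\frac{\theta}{2}\,e^{-i\varphi} \\
\sqrt{\tfrac{3}{2}}\sin\theta\cos\theta \\
\left(3\sin^2\frac{\theta}{2} - \cos^2\frac{\theta}{2}\right)\cos^2\frac{\theta}{2}\,e^{i\varphi} \\
-\sin\theta\cos^2\frac{\theta}{2}\,e^{i2\varphi}
\end{pmatrix} \qquad (13)$$

and

$$[\chi^{(c)}_{m=-2}] = \begin{pmatrix}
\sin^4\frac{\theta}{2}\,e^{-i2\varphi} \\
-\sin\theta\sin^2\frac{\theta}{2}\,e^{-i\varphi} \\
\sqrt{\tfrac{3}{8}}\sin^2\theta \\
-\sin\theta\cos^2\frac{\theta}{2}\,e^{i\varphi} \\
\cos^4\frac{\theta}{2}\,e^{i2\varphi}
\end{pmatrix}. \qquad (14)$$

Here, we have used the index $m$ to label the eigenvalues of the spin matrix because that is the convention. Also, we need to keep the distinction between the eigenvalues of this matrix and the eigenvalues $\{\lambda_i\}$ belonging to the matrix we wish to generate once we have obtained the probability amplitudes. For the direction

$$\hat{\mathbf{b}} = (\sin\theta'\cos\varphi', \sin\theta'\sin\varphi', \cos\theta'), \qquad (15)$$

the eigenvectors of the operator are identical, except that $(\theta, \varphi)$ is replaced by $(\theta', \varphi')$.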
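As a sanity check on Eqs. (9)-(14), the eigenvalue equations can be verified numerically for arbitrary angles. The sketch below is illustrative only: the function names are hypothetical, and the test angles θ = 1.0, φ = 2.0 are an arbitrary choice. It builds the 5 x 5 matrix of Eq. (9), the five vectors of Eqs. (10)-(14), and reports the largest deviation from $[\hat{\mathbf{c}}\cdot\mathbf{S}][\chi_m] = m[\chi_m]$ together with the largest deviation of the norms from 1.

```python
import cmath
import math


def spin2_matrix(theta, phi):
    """Tridiagonal Hermitian matrix of Eq. (9)."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(-1j * phi)
    r = math.sqrt(1.5)
    upper = [s * e, r * s * e, r * s * e, s * e]      # elements above the diagonal
    diag = [2 * c, c, 0.0, -c, -2 * c]
    m = [[0j] * 5 for _ in range(5)]
    for k in range(5):
        m[k][k] = complex(diag[k])
    for k in range(4):
        m[k][k + 1] = upper[k]
        m[k + 1][k] = upper[k].conjugate()
    return m


def spin2_eigenvectors(theta, phi):
    """Vectors of Eqs. (10)-(14), keyed by the eigenvalue m."""
    c2, s2 = math.cos(theta / 2) ** 2, math.sin(theta / 2) ** 2
    st, ct = math.sin(theta), math.cos(theta)
    e1, e2 = cmath.exp(1j * phi), cmath.exp(2j * phi)
    a, b = math.sqrt(3 / 8), math.sqrt(3 / 2)
    return {
        2: [c2 * c2 / e2, st * c2 / e1, a * st * st, st * s2 * e1, s2 * s2 * e2],
        1: [st * c2 / e2, (3 * s2 - c2) * c2 / e1, -b * st * ct,
            -(3 * c2 - s2) * s2 * e1, -st * s2 * e2],
        0: [a * st * st / e2, -b * st * ct / e1, 0.5 * (2 * ct * ct - st * st),
            b * st * ct * e1, a * st * st * e2],
        -1: [st * s2 / e2, -(3 * c2 - s2) * s2 / e1, b * st * ct,
             (3 * s2 - c2) * c2 * e1, -st * c2 * e2],
        -2: [s2 * s2 / e2, -st * s2 / e1, a * st * st, -st * c2 * e1, c2 * c2 * e2],
    }


def max_residual(theta, phi):
    """Largest component of |S chi_m - m chi_m| over the five eigenvectors."""
    matrix = spin2_matrix(theta, phi)
    worst = 0.0
    for m, vec in spin2_eigenvectors(theta, phi).items():
        for row, vk in zip(matrix, vec):
            worst = max(worst, abs(sum(x * v for x, v in zip(row, vec)) - m * vk))
    return worst


def max_norm_error(theta, phi):
    """Largest deviation of the squared norms from 1."""
    return max(abs(sum(abs(v) ** 2 for v in vec) - 1.0)
               for vec in spin2_eigenvectors(theta, phi).values())


print("largest residual:", max_residual(1.0, 2.0))
print("largest norm error:", max_norm_error(1.0, 2.0))
```

Both numbers should be at the level of floating-point rounding for any choice of angles, which confirms that the signs and the factors $\sqrt{3/8}$ and $\sqrt{3/2}$ in the vectors are mutually consistent.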
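is a probability amplitude, and its modulus squared is a probability for a certain measurement.³ However, from a mathematical point of view, all that is needed is the fact that these elements possess the property (1). The scalar products of the eigenvectors corresponding to $\hat{\mathbf{b}}$ with those corresponding to $\hat{\mathbf{c}}$ give the most generalized forms of these functions. Thus the required quantities are⁵

$$\phi(B_i; C_j) = [\chi^{(c)}_{m_j}]^\dagger[\chi^{(b)}_{m_i}]. \qquad (16)$$

Evidently, in this case, $x = (\theta, \varphi, \theta', \varphi')$. To illustrate the notation, we give a few of these functions.

$$\begin{aligned}
\phi(B_1; C_1) &= [\chi^{(c)}_{m=2}]^\dagger[\chi^{(b)}_{m=2}] \\
&= \cos^4\tfrac{\theta'}{2}\cos^4\tfrac{\theta}{2}\,e^{i2(\varphi-\varphi')}
 + \sin\theta'\sin\theta\cos^2\tfrac{\theta'}{2}\cos^2\tfrac{\theta}{2}\,e^{i(\varphi-\varphi')}
 + \tfrac{3}{8}\sin^2\theta'\sin^2\theta \\
&\quad + \sin\theta'\sin\theta\sin^2\tfrac{\theta'}{2}\sin^2\tfrac{\theta}{2}\,e^{-i(\varphi-\varphi')}
 + \sin^4\tfrac{\theta'}{2}\sin^4\tfrac{\theta}{2}\,e^{-i2(\varphi-\varphi')},
\end{aligned} \qquad (17)$$

$$\begin{aligned}
\phi(B_1; C_2) &= [\chi^{(c)}_{m=1}]^\dagger[\chi^{(b)}_{m=2}] \\
&= \sin\theta\cos^4\tfrac{\theta'}{2}\cos^2\tfrac{\theta}{2}\,e^{i2(\varphi-\varphi')}
 + \left(3\sin^2\tfrac{\theta}{2} - \cos^2\tfrac{\theta}{2}\right)\sin\theta'\cos^2\tfrac{\theta'}{2}\cos^2\tfrac{\theta}{2}\,e^{i(\varphi-\varphi')} \\
&\quad - \tfrac{3}{4}\sin^2\theta'\sin\theta\cos\theta
 - \left(3\cos^2\tfrac{\theta}{2} - \sin^2\tfrac{\theta}{2}\right)\sin\theta'\sin^2\tfrac{\theta'}{2}\sin^2\tfrac{\theta}{2}\,e^{-i(\varphi-\varphi')} \\
&\quad - \sin\theta\sin^4\tfrac{\theta'}{2}\sin^2\tfrac{\theta}{2}\,e^{-i2(\varphi-\varphi')},
\end{aligned} \qquad (18)$$

and

$$\begin{aligned}
\phi(B_5; C_5) &= [\chi^{(c)}_{m=-2}]^\dagger[\chi^{(b)}_{m=-2}] \\
&= \sin^4\tfrac{\theta'}{2}\sin^4\tfrac{\theta}{2}\,e^{i2(\varphi-\varphi')}
 + \sin\theta'\sin\theta\sin^2\tfrac{\theta'}{2}\sin^2\tfrac{\theta}{2}\,e^{i(\varphi-\varphi')}
 + \tfrac{3}{8}\sin^2\theta'\sin^2\theta \\
&\quad + \sin\theta'\sin\theta\cos^2\tfrac{\theta'}{2}\cos^2\tfrac{\theta}{2}\,e^{-i(\varphi-\varphi')}
 + \cos^4\tfrac{\theta'}{2}\cos^4\tfrac{\theta}{2}\,e^{-i2(\varphi-\varphi')}.
\end{aligned} \qquad (19)$$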
There are 22 other functions. We remind ourselves that since

$$\xi(C_i; B_j) = \phi^*(B_j; C_i), \qquad (20)$$

then

$$\xi(C_i; B_j) = [\chi^{(b)}_{m_j}]^\dagger[\chi^{(c)}_{m_i}]. \qquad (21)$$

In general, the probability amplitudes satisfy the inter-dependence law⁵

$$\phi(B_i; C_j) = \sum_{l} \phi(B_i; D_l)\,\phi(D_l; C_j), \qquad (22)$$

where $\hat{\mathbf{d}}$ is a third direction to which corresponds the new parameter $D$. This is a particular form of a fundamental quantum property of three sets of probability amplitudes belonging to one quantum system,⁶

$$\psi(A_i; C_j) = \sum_{l=1}^{N} \chi(A_i; B_l)\,\phi(B_l; C_j). \qquad (23)$$
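Property (1) and the inter-dependence law (22) can both be checked numerically from the scalar products of Eq. (16). The sketch below is a hedged illustration rather than the author's code: the function names are hypothetical, the spin-2 eigenvectors are those of Eqs. (10)-(14), and the three directions are arbitrary test angles (θ, φ) chosen here.

```python
import cmath
import math

MS = (2, 1, 0, -1, -2)   # ordering of the eigenvalue labels m_1 ... m_5


def spin2_eigenvectors(theta, phi):
    """Vectors of Eqs. (10)-(14), keyed by the eigenvalue m."""
    c2, s2 = math.cos(theta / 2) ** 2, math.sin(theta / 2) ** 2
    st, ct = math.sin(theta), math.cos(theta)
    e1, e2 = cmath.exp(1j * phi), cmath.exp(2j * phi)
    a, b = math.sqrt(3 / 8), math.sqrt(3 / 2)
    return {
        2: [c2 * c2 / e2, st * c2 / e1, a * st * st, st * s2 * e1, s2 * s2 * e2],
        1: [st * c2 / e2, (3 * s2 - c2) * c2 / e1, -b * st * ct,
            -(3 * c2 - s2) * s2 * e1, -st * s2 * e2],
        0: [a * st * st / e2, -b * st * ct / e1, 0.5 * (2 * ct * ct - st * st),
            b * st * ct * e1, a * st * st * e2],
        -1: [st * s2 / e2, -(3 * c2 - s2) * s2 / e1, b * st * ct,
             (3 * s2 - c2) * c2 * e1, -st * c2 * e2],
        -2: [s2 * s2 / e2, -st * s2 / e1, a * st * st, -st * c2 * e1, c2 * c2 * e2],
    }


def amplitudes(first_angles, second_angles):
    """Eq. (16): table[i][j] = [chi^(second)_{m_j}]^dagger [chi^(first)_{m_i}]."""
    chi_first = spin2_eigenvectors(*first_angles)
    chi_second = spin2_eigenvectors(*second_angles)
    return [[sum(complex(u).conjugate() * v
                 for u, v in zip(chi_second[mj], chi_first[mi]))
             for mj in MS]
            for mi in MS]


def property_one_error(table):
    """Largest deviation of sum_l phi(B_l; C_i) conj(phi(B_l; C_j)) from delta_ij."""
    worst = 0.0
    for i in range(5):
        for j in range(5):
            total = sum(table[l][i] * table[l][j].conjugate() for l in range(5))
            worst = max(worst, abs(total - (1.0 if i == j else 0.0)))
    return worst


# Arbitrary test directions (theta, phi) for b, c and the intermediate direction d.
b_dir, c_dir, d_dir = (3.0, 4.0), (1.0, 2.0), (0.7, 0.3)
phi_bc = amplitudes(b_dir, c_dir)
phi_bd = amplitudes(b_dir, d_dir)
phi_dc = amplitudes(d_dir, c_dir)

orthonormality_error = property_one_error(phi_bc)
law_error = max(abs(phi_bc[i][j] - sum(phi_bd[i][l] * phi_dc[l][j] for l in range(5)))
                for i in range(5) for j in range(5))
print("property (1) error:", orthonormality_error)
print("law (22) error:", law_error)
```

Both errors should vanish to rounding accuracy. Law (22) holds here because the five eigenvectors for direction d form a complete orthonormal set, so inserting them between the b and c vectors leaves the scalar product unchanged.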
In the context of quantum theory, the quantity \
With these probability amplitudes, the eigenvector for the eigenvalue $\lambda_i$ is

$$[\xi_i] = \begin{pmatrix}
[\chi^{(b)}_{m=2}]^\dagger[\chi^{(c)}_{m_i}] \\
[\chi^{(b)}_{m=1}]^\dagger[\chi^{(c)}_{m_i}] \\
[\chi^{(b)}_{m=0}]^\dagger[\chi^{(c)}_{m_i}] \\
[\chi^{(b)}_{m=-1}]^\dagger[\chi^{(c)}_{m_i}] \\
[\chi^{(b)}_{m=-2}]^\dagger[\chi^{(c)}_{m_i}]
\end{pmatrix}. \qquad (24)$$

Of course, since the matrix (9) is a particular case of the form of (2), its eigenvectors are given by (24); evidently, they correspond to the angles $\theta' = 0$ and $\varphi' = 0$. In other words, for this case, the unit vector $\hat{\mathbf{b}}$ points in the $z$ direction. The association between the eigenvalues of the matrix we are generating and their eigenvectors is through (2), where the value of $C$ and the eigenvalue have the same index.

4. Examples

The structure of the probability amplitudes and the prescription for generating the matrices lead to the following properties. If θ = θ′ and φ =
$$\begin{pmatrix}
(-0.130, 0.000) & (0.621, -0.082) & (2.294, -0.615) & (0.490, -0.203) & (-1.080, 0.624) \\
(0.621, 0.082) & (2.583, 0.000) & (-0.062, 0.008) & (1.127, -0.302) & (1.923, -0.797) \\
(2.294, 0.615) & (-0.062, -0.008) & (1.827, 0.000) & (0.607, -0.080) & (0.371, -0.099) \\
(0.490, 0.203) & (1.127, 0.302) & (0.607, 0.080) & (0.818, 0.000) & (-1.762, 0.232) \\
(-1.080, -0.624) & (1.923, 0.797) & (0.371, 0.099) & (-1.762, -0.232) & (0.901, 0.000)
\end{pmatrix}$$
while the respective eigenvectors are

$[\xi_1] = \big((-0.030, 0.120),\ (0.239, 0.236),\ (0.539, -0.145),\ (0.165, -0.597),\ (-0.293, -0.302)\big)^T$,
$[\xi_2] = \big((0.124, -0.312),\ (-0.344, -0.442),\ (-0.344, 0.045),\ (0.102, -0.242),\ (-0.370, -0.497)\big)^T$,
$[\xi_3] = \big((-0.272, 0.488),\ (0.176, 0.300),\ (-0.368, 0.000),\ (-0.176, 0.300),\ (-0.272, -0.488)\big)^T$,
$[\xi_4] = \big((0.370, -0.497),\ (0.102, 0.242),\ (0.344, 0.045),\ (-0.344, 0.442),\ (-0.124, -0.312)\big)^T$
and
$[\xi_5] = \big((-0.293, 0.302),\ (-0.165, -0.597),\ (0.539, 0.145),\ (-0.239, 0.236),\ (-0.030, -0.120)\big)^T$.

In order to generate a general symmetric matrix, we use the arbitrary complex eigenvalues (2.0, 1.0), (4.0, −3.0), (−3.0, 1.0), (−1.0, 4.0) and (4.0, −2.0) with the arbitrary arguments θ = 1.0, θ′ = 3.0 and φ =
whose eigenvectors are real and are, respectively,

$[\xi_1] = (0.085,\ 0.265,\ 0.506,\ 0.644,\ 0.501)^T$,
$[\xi_2] = (-0.265,\ -0.535,\ -0.463,\ 0.119,\ 0.644)^T$,
$[\xi_3] = (0.506,\ 0.463,\ -0.240,\ -0.463,\ 0.506)^T$,
$[\xi_4] = (-0.644,\ 0.119,\ 0.463,\ -0.535,\ 0.265)^T$
and
$[\xi_5] = (0.501,\ -0.644,\ 0.506,\ -0.265,\ 0.085)^T$.
Using the imaginary eigenvalues (0.0, 2.0), (0.0, 4.0), (0.0, −3.0), (0.0, −1.0) and (0.0, 4.0) with the arbitrary arguments θ = 1.0, φ = 2.0, θ′ = 3.0 and φ′ = 4.0, we end up with the anti-Hermitian matrix

$$\begin{pmatrix}
(0.000, 0.791) & (0.329, 2.057) & (0.561, 1.708) & (-0.062, -0.120) & (-0.386, -0.524) \\
(-0.329, 2.057) & (0.000, 1.493) & (-0.032, -0.200) & (0.576, 1.751) & (0.797, 1.545) \\
(-0.561, 1.708) & (0.032, -0.200) & (0.000, 1.987) & (-0.133, -0.829) & (-0.322, -0.979) \\
(0.062, -0.120) & (-0.576, 1.751) & (0.133, -0.829) & (0.000, -0.203) & (-0.192, -1.198) \\
(0.386, -0.524) & (-0.797, 1.545) & (0.322, -0.979) & (0.192, -1.198) & (0.000, 1.932)
\end{pmatrix}$$

Its eigenvectors are, respectively,

$[\xi_1] = \big((-0.040, -0.015),\ (-0.004, 0.169),\ (0.383, -0.126),\ (-0.393, -0.509),\ (-0.341, 0.527)\big)^T$,
$[\xi_2] = \big((0.146, 0.084),\ (0.081, -0.443),\ (-0.573, 0.092),\ (0.064, 0.117),\ (-0.430, 0.478)\big)^T$,
$[\xi_3] = \big((-0.314, -0.253),\ (-0.194, 0.548),\ (0.012, 0.000),\ (0.194, 0.548),\ (-0.314, 0.253)\big)^T$,
$[\xi_4] = \big((0.430, 0.478),\ (0.064, -0.117),\ (0.573, 0.092),\ (0.081, 0.443),\ (-0.146, 0.084)\big)^T$
and
$[\xi_5] = \big((-0.341, -0.527),\ (0.393, -0.509),\ (0.383, 0.126),\ (0.004, 0.169),\ (-0.040, 0.015)\big)^T$.

Given the arbitrary eigenvalues (2.0, 1.0), (4.0, −3.0), (−3.0, 1.0), (−1.0, 4.0) and (4.0, −2.0) in conjunction with the arbitrary arguments θ = 0.2, φ = 3.1, θ′ = 2.3 and φ′ = 6.0, the matrix generated is general, and is

$$\begin{pmatrix}
(2.177, -0.086) & (-2.891, 1.324) & (1.662, -0.858) & (-0.228, 0.432) & (-0.080, -0.183) \\
(-1.640, 2.724) & (0.136, 0.692) & (0.047, -1.243) & (1.510, 0.741) & (-0.918, -0.366) \\
(-0.196, -1.860) & (-0.663, -1.052) & (0.729, 0.502) & (2.791, -0.899) & (-1.812, 0.436) \\
(0.472, 0.124) & (1.239, -1.138) & (1.797, -2.317) & (0.630, 0.059) & (-0.207, 1.100) \\
(-0.065, 0.189) & (-0.150, 0.977) & (-0.252, 1.846) & (0.450, 1.024) & (2.328, -0.168)
\end{pmatrix}$$
It has the respective eigenvectors

$[\xi_1] = \big((0.008, 0.007),\ (0.048, 0.039),\ (0.185, 0.126),\ (0.471, 0.270),\ (0.730, 0.345)\big)^T$,
$[\xi_2] = \big((0.055, 0.027),\ (0.245, 0.097),\ (0.563, 0.174),\ (0.520, 0.117),\ (-0.537, -0.077)\big)^T$,
$[\xi_3] = \big((0.221, 0.035),\ (0.588, 0.046),\ (0.452, 0.000),\ (-0.588, 0.046),\ (0.221, -0.035)\big)^T$,
$[\xi_4] = \big((0.537, -0.077),\ (0.520, -0.117),\ (-0.563, 0.174),\ (0.245, -0.097),\ (-0.055, 0.027)\big)^T$
and
$[\xi_5] = \big((0.730, -0.345),\ (-0.471, 0.270),\ (0.185, -0.126),\ (-0.048, 0.039),\ (0.008, -0.007)\big)^T$.

The results above illustrate some of the possibilities of the method. At this time, the kinds of families that can be generated with different combinations of the eigenvectors and of the arguments have not been fully investigated, but it seems probable that, with the proper choices, it is possible to generate such special forms as tridiagonal matrices. It should also be possible, with a little ingenuity, to generate matrices with specific values of some of the elements.

5. Conclusion and Discussion

In this paper, we have presented a prescription for forming matrices in such a way that their eigenvalues and eigenvectors are known. The method is very general, and simply by varying the arguments, different kinds of matrices can be obtained. The normalized eigenvectors are always given at the same time. A good amount of work is still needed in order to fully understand and utilize this method. For example, it is necessary to classify more properly and completely, according to the values of the arguments θ, φ, θ′ and φ′, the kinds of matrices that can be generated.

The matrices dealt with here are of order 5. In order to deal with matrices of higher order, it is necessary to have the functions {φ} for those cases. A source of these functions is spin theory, but treatment of other N-dimensional quantum systems should produce other forms of the functions. Each such set of functions probably produces matrices of different characters. As such, it expands the range of uses to which we can put these matrices.

References

1. G. H. Golub and H. A. van der Vorst, Eigenvalue computation in the 20th century, J. Comp. and Appl. Math. 123, 35 (2000).
2. B. Boley and G. H. Golub, A survey of matrix inverse eigenvalue problems, Inverse Problems 3, 595 (1987).
3. H. V. Mweene, Derivation of spin vectors and operators from first principles, e-Print arXiv:quant-ph/9905012, unpublished.
4. See any standard text on quantum mechanics, such as B. H. Bransden and C. J. Joachain, Introduction to Quantum Mechanics (Longman Scientific and Technical, New York, 1989).
5. H. V. Mweene, Generalized probability amplitudes for spin projection measurements on spin 2 systems, e-Print arXiv:quant-ph/0502005, unpublished.
6. A. Lande, New Foundations of Quantum Mechanics (Cambridge University Press, Cambridge (UK), 1965).
315
THE POTENTIAL GROUP METHOD FOR STURM-LIOUVILLE EQUATIONS

K. SODOGA, M. N. HOUNKONNOU
International Chair in Mathematical Physics and Applications, 072 B.P. 50, Cotonou, Republic of Benin
E-mail: ksodoga@cipma.net, [email protected]

G. DEBIAIS
LP2A, Faculte des Sciences, Universite de Perpignan, 52, Avenue Paul Alduy, F-66860 Perpignan Cedex
E-mail: [email protected]

We use SU(1,1) potential group methods to deduce exact solutions of Sturm-Liouville equations.
1. Introduction

The potential group method is an elegant technique for solving exactly the standard time-independent Schrödinger equation. The method relates the Hamiltonian of the system to the Casimir operator of a dynamic group. This group connects states that have the same energy but belong to one-dimensional potentials of different strengths. The method has been used to derive the solutions of the Morse, Pöschl-Teller and Ginocchio potentials by Alhassid et al.1 Wu et al.2 have discussed the connection of the general Natanzon potential to the SU(1,1) and the SO(2,2) potential groups. Sukumar3 has used the differential realization of the su(1,1) algebra to generate a number of known results. Recently, the method has been successfully extended to the position-dependent effective mass (PDEM) Schrödinger equation.4,5 The PDEM Hamiltonian with the Lévy-Leblond6 kinetic energy term can be viewed as a special case of the Sturm-Liouville (SL) operator.7,8 Our aim in this presentation is to generalize the potential group method to the Sturm-Liouville equation. Here we study the case of the SU(1,1) group.
2. The SU(1,1) Potential Group Method for the SL Equation

Let us consider the SL equation
$$-\frac{d}{dx}\left[\sigma(x)\rho(x)\frac{d\Psi(x)}{dx}\right] + [V(x) - E]\,\rho(x)\Psi(x) = 0, \tag{1}$$
where $\sigma$ is a positive real-valued function, $V$ is the potential function, and $E$ is the eigenvalue associated with the eigenfunction $\Psi$. The eigenfunction $\Psi$ is normalized with respect to the weight function $\rho$. The differential operator associated with the SL equation is Hermitian in the Hilbert space $L^2(\mathbb{R}, \rho(x)\,dx)$ and given by
$$H = -\sigma(x)\frac{d^2}{dx^2} - \tau(x)\frac{d}{dx} + V(x), \tag{2}$$
with $\tau$ related to $\sigma$ by the Pearson equation $(\sigma\rho)' = \tau\rho$. The generators $J_+$, $J_-$ and $J_z$ of SU(1,1) obey the commutation relations
$$[J_z, J_\pm] = \pm J_\pm, \tag{3}$$
$$[J_+, J_-] = -2J_z. \tag{4}$$
The Casimir operator of this group, given by
$$J^2 = J_z^2 - \tfrac12\,(J_+J_- + J_-J_+), \tag{5}$$
commutes with $J_z$. Therefore, both operators $J^2$ and $J_z$ can have common eigenstates, which we denote by $|jm\rangle$. Their eigenvalues are denoted by $j(j+1)$ and $m$, i.e.,
$$J^2|jm\rangle = j(j+1)|jm\rangle, \qquad J_z|jm\rangle = m|jm\rangle. \tag{6}$$
According to Alhassid et al.,1 one can identify the discrete and continuous irreducible representations with bound and scattering states, respectively, of a wide range of quantum mechanical potentials in one dimension. In this case, the generators of the group are realized by means of linear differential operators. As in the case of the standard Schrödinger equation,9 we may take the SL operator (2) to be a linear function of the Casimir operator $J^2$,
$$H = -\left(\tfrac14 + J^2\right). \tag{7}$$
In this case, the common eigenstates $|jm\rangle$ of $J^2$ and $J_z$ are also eigenstates of $H$ with the eigenvalues $-(j + \tfrac12)^2$. Here we are concerned with the bound states which belong to the discrete unitary irreducible representation of the
SU(1,1) group.2 In this case, $m$ can take the values $(-j + n)$, where $n$ is a positive integer and $j$ is assumed to be integer or half-integer. Following Sukumar,3 we define the basis states as
$$|jm\rangle = e^{im\phi}\,\Psi_{jm}(x), \tag{8}$$
and consider the following realization of the generators
$$J_z = -i\frac{\partial}{\partial\phi}, \tag{9}$$
$$J_\pm = e^{\pm i\phi}\left[\pm h(x)\frac{\partial}{\partial x} \pm g(x) + f(x)J_z + c(x)\right], \tag{10}$$
where $h(x)$, $g(x)$, $f(x)$ and $c(x)$ are real-valued functions. The realization (9)-(10) ensures that (3) is satisfied for any choice of the functions $h(x)$, $g(x)$, $f(x)$ and $c(x)$. In order to satisfy the commutation relation (4), these functions must fulfill the equations
$$f^2(x) - h(x)\frac{df(x)}{dx} = 1, \qquad h(x)\frac{dc(x)}{dx} - c(x)f(x) = 0. \tag{11}$$
These equations can be integrated easily to give the following solutions
$$f(x) = -\tanh\left[\int_{x_0}^{x}\frac{dt}{h(t)}\right], \tag{12}$$
$$c(x) = A\,\mathrm{sech}\left[\int_{x_0}^{x}\frac{dt}{h(t)}\right], \tag{13}$$
where $A$ and $x_0$ are constants of integration. Taking into account (9) and (10), the SL operator (7) reads as
$$H = -h^2\frac{d^2}{dx^2} - h\left(\frac{dh}{dx} + 2g - f\right)\frac{d}{dx} - (1 - f^2)m^2 + 2cfm - h\frac{dg}{dx} - g^2 + fg + c^2 - \frac14. \tag{14}$$
Identifying this equation with (2), we obtain the following constraints on the functions $h$ and $g$:
$$\sigma = h^2, \qquad \tau = h(h' + 2g - f), \tag{15}$$
and the potential function $V$ is an $m$-dependent function given as
$$V = V_m = -(1 - f^2)m^2 + 2cfm - hg' - g^2 + fg + c^2 - \frac14.$$
Since $\sigma$ and $\tau$ are related by the Pearson equation, the functions $h(x)$ and $g(x)$ read as
$$h^2(x) = \sigma(x), \qquad g(x) = \frac12\left[h'(x) + h(x)\frac{\rho'(x)}{\rho(x)} + f(x)\right]. \tag{16}$$
Note that the integral in (12) or (13) may be expressed in terms of the change of variable function $u(x)$ defined as
$$\int^{x}\frac{dt}{h(t)} = u(x) + \text{constant}. \tag{17}$$
Then for any choice of $\sigma(x)$, which leads to $h(x)$, the functions $f(x)$, $c(x)$ and $g(x)$ should be determined from (12), (13) and (16), respectively. Therefore, the expression for the potential function $V_m$ may be written explicitly.

3. Application

We now consider different systems of SL operators characterized by the functions $\sigma$ and $\rho$. Several potentials can then be obtained by suitably choosing the integration constant $x_0$. We note that SL equations can be grouped in four classes.
3.1. Class I: $\sigma = 1$ and $\rho = 1$; the standard Schrödinger equation

The SL operator corresponding to this class is the simplest one and is identical to the standard Schrödinger operator of a particle of unit mass moving in a potential function $V$:
$$H_0 = -\frac{d^2}{dx^2} + V(x). \tag{18}$$
3.1.1. Example 1: $x_0 = 0$

Here $h(x) = \pm 1$. For $h(x) = 1$, Eqs. (12), (13) and (16) give
$$f_0(x) = -\tanh x, \qquad c_0(x) = A\,\mathrm{sech}\,x, \qquad g_0(x) = -\tfrac12\tanh x. \tag{19}$$
The resulting potential depends on the parameters $m$ and $A$,
$$V_0(x;m,A) = -\left(m^2 - \tfrac14 - A^2\right)\mathrm{sech}^2x - 2Am\,\mathrm{sech}\,x\,\tanh x. \tag{20}$$
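The Class I formulas can be checked numerically. The following Python sketch is only an illustration: it tests that the functions of Eq. (19) satisfy the two constraints (11), and that the general expression for $V_m$ in terms of $h$, $f$, $c$ and $g$ reproduces the closed form (20). The values chosen for $A$ and $m$, the finite-difference helper and all function names are assumptions made for this example.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text).
A, m = 0.7, 2.5

def derivative(func, x, step=1e-5):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

# Class I data with h = 1 and x0 = 0, as in Eq. (19).
def h(x):
    return 1.0

def f(x):
    return -math.tanh(x)

def c(x):
    return A / math.cosh(x)

def g(x):
    return -0.5 * math.tanh(x)

def constraint_residuals(x):
    """Left-hand sides of the two equations (11), moved to the form '... = 0'."""
    first = f(x) ** 2 - h(x) * derivative(f, x) - 1.0
    second = h(x) * derivative(c, x) - c(x) * f(x)
    return first, second

def potential_from_functions(x):
    """V_m assembled from h, f, c and g (expression following Eq. (15))."""
    return (-(1.0 - f(x) ** 2) * m ** 2 + 2.0 * c(x) * f(x) * m
            - h(x) * derivative(g, x) - g(x) ** 2 + f(x) * g(x)
            + c(x) ** 2 - 0.25)

def potential_closed_form(x):
    """Closed form of Eq. (20)."""
    sech = 1.0 / math.cosh(x)
    return -(m ** 2 - 0.25 - A ** 2) * sech ** 2 - 2.0 * A * m * sech * math.tanh(x)
```

Evaluating `constraint_residuals` and the difference of the two potential functions at a few sample points should give values at the level of the finite-difference error.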
The eigenvalues are
$$E_n = -\left(n - m + \tfrac12\right)^2, \qquad n = 0, 1, \ldots, N < m - \tfrac12. \tag{21}$$
For a fixed value of $m$ and varying values of $A$, the family of potentials defined by (20) have identical spectra given by (21).
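The isospectrality statement can be probed numerically. The sketch below is not the method of the paper: it discretises $-d^2/dx^2 + V_0$ with second-order finite differences on a finite interval (hard walls at $x = \pm 20$) and locates the lowest eigenvalues of the resulting tridiagonal matrix by Sturm-sequence bisection. The box size, the grid and the sample values of $m$ and $A$ are arbitrary assumptions, and agreement with (21) can only be expected up to discretisation error.

```python
import math

def potential_v0(x, m, A):
    """Class I potential of Eq. (20)."""
    sech = 1.0 / math.cosh(x)
    return -(m * m - 0.25 - A * A) * sech ** 2 - 2.0 * A * m * sech * math.tanh(x)

def count_below(diag, off_sq, lam):
    """Sturm count: number of eigenvalues below lam of the symmetric
    tridiagonal matrix with diagonal `diag` and squared off-diagonal `off_sq`."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        q = (d - lam) - (off_sq / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300  # avoid a division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def lowest_eigenvalues(m, A, how_many, half_width=20.0, n_points=2000):
    """Lowest `how_many` eigenvalues of the discretised operator -d2/dx2 + V0."""
    dx = 2.0 * half_width / (n_points + 1)
    xs = [-half_width + (i + 1) * dx for i in range(n_points)]
    values = [potential_v0(x, m, A) for x in xs]
    diag = [2.0 / dx ** 2 + v for v in values]
    off_sq = (1.0 / dx ** 2) ** 2
    energies = []
    for k in range(how_many):
        lo, hi = min(values) - 1.0, 0.0  # bound states are assumed to lie here
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if count_below(diag, off_sq, mid) > k:
                hi = mid
            else:
                lo = mid
        energies.append(0.5 * (lo + hi))
    return energies

def exact_energies(m, how_many):
    """Eq. (21): E_n = -(n - m + 1/2)^2."""
    return [-(n - m + 0.5) ** 2 for n in range(how_many)]
```

For $m = 3$ there are three bound states, so `lowest_eigenvalues(3.0, 1.0, 3)` and `lowest_eigenvalues(3.0, 0.3, 3)` should both approximate `exact_energies(3.0, 3)`.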
3.1.2. Example 2: $x_0 = -\infty$

We consider the choice $h(x) = 1$. The corresponding solutions are
$$f_0(x) = -1, \qquad c_0(x) = Ae^{-x}, \qquad g_0(x) = -\tfrac12. \tag{22}$$
The potential in this case is given by
$$V_0(x;m,A) = A^2e^{-2x} - 2mAe^{-x}. \tag{23}$$
For fixed values of $m$, the Morse class of potentials (23) has the same spectrum (21) as the class of potentials (20). The potentials (20) and (23) have been obtained by Sukumar3 by this potential group method.

3.2. Class II: $\sigma \neq 1$, $\rho = 1$

For the choice $h(x) = +\sqrt{\sigma(x)}$, the functions $f$, $c$ and $g$ may be determined from $f_0$, $c_0$ and $g_0$ as
$$f(x) = f_0(u(x)), \qquad c(x) = c_0(u(x)), \qquad g(x) = g_0(u(x)) + \tfrac12h'(x). \tag{24}$$
Therefore, the potential function $V$ is related to $V_0$ as
$$V(x;m,A) = V_0(u(x);m,A) - g_0(u(x))h'(x) - \tfrac12h(x)h''(x) + g_0(u(x))h'(x) - \tfrac14h'^2(x). \tag{25}$$
3.2.1. Example 3: $\sigma = 1/M(x)$

The associated Hamiltonian operator is given by
$$H = -\frac{d}{dx}\left[\frac{1}{M(x)}\frac{d}{dx}\right] + V(x), \tag{26}$$
which corresponds to the Hamiltonian with a position-dependent mass. Let us consider the mass function
$$M(x) = \left(\frac{\gamma + x^2}{1 + x^2}\right)^2.$$
We then obtain $h(x) = (1 + x^2)/(\gamma + x^2)$ and $u(x) = x + (\gamma - 1)\arctan x$.

(i) $x_0 = 0$:
$$V(x;m,A) = -\left(m^2 - \tfrac14 - A^2\right)\mathrm{sech}^2u(x) - 2Am\,\mathrm{sech}\,u(x)\tanh u(x) + \frac{\gamma - 1}{(x^2 + \gamma)^4}\left[3x^4 - 2x^2(-2 + \gamma) - \gamma + x(x^2 + \gamma)\tanh u(x)\right]. \tag{27}$$
(ii) $x_0 = -\infty$:
$$V(x;m,A) = A^2e^{-2u(x)} - 2mAe^{-u(x)} - \frac{(-1 + \gamma)\left(-3x^4 + 2x^2(-2 + \gamma) + \gamma\right)}{(x^2 + \gamma)^4}. \tag{28}$$
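A short numerical check of this example may be useful. The sketch below verifies that $u'(x) = 1/h(x)$ for the quoted $h$ and $u$, and compares the closed form (28) with $V_0(u(x)) - \tfrac12hh'' - \tfrac14h'^2$, where $V_0$ is the Morse potential (23). That reduced form is our reading of the Class II relation once the two $g_0$ terms cancel, so it is an assumption of this example, as are the parameter values and the helper names.

```python
import math

GAMMA = 2.0             # illustrative value of the mass parameter (assumption)
A, M_PARAM = 0.8, 2.5   # illustrative values of A and m (assumptions)

def h(x):
    return (1.0 + x * x) / (GAMMA + x * x)

def u(x):
    return x + (GAMMA - 1.0) * math.atan(x)

def first_derivative(func, x, step=1e-5):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

def second_derivative(func, x, step=1e-4):
    """Central finite-difference approximation of func''(x)."""
    return (func(x + step) - 2.0 * func(x) + func(x - step)) / step ** 2

def morse_v0(y):
    """Morse potential of Eq. (23), evaluated at y."""
    return A ** 2 * math.exp(-2.0 * y) - 2.0 * M_PARAM * A * math.exp(-y)

def potential_from_h(x):
    """V0(u(x)) - h h''/2 - h'^2/4 (assumed reduced form of the Class II relation)."""
    return (morse_v0(u(x)) - 0.5 * h(x) * second_derivative(h, x)
            - 0.25 * first_derivative(h, x) ** 2)

def potential_closed_form(x):
    """Right-hand side of Eq. (28)."""
    extra = ((GAMMA - 1.0) * (-3.0 * x ** 4 + 2.0 * x * x * (GAMMA - 2.0) + GAMMA)
             / (x * x + GAMMA) ** 4)
    return morse_v0(u(x)) - extra
```

Both functions should agree to within the finite-difference error at any sample point.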
We recover the potentials found by Roy and Roy5 from the application of the SU(1,1) potential group method to PDEM Schrödinger equations.

3.3. Class III: $\sigma = 1$, $\rho(x) \neq 1$

Here $u(x) = x$, so that the functions $f$, $c$ and $g$ may be determined from $f_0$, $c_0$ and $g_0$ as
$$f(x) = f_0(x), \qquad c(x) = c_0(x), \qquad g(x) = g_0(x) + \frac12h(x)\frac{\rho'(x)}{\rho(x)}. \tag{29}$$
Therefore, the potential function $V$ is related to $V_0$ as
$$V(x;m,A) = V_0(u(x);m,A) - \frac12h^2(x)\left(\frac{\rho'(x)}{\rho(x)}\right)' - g_0(x)h(x)\frac{\rho'(x)}{\rho(x)} + g_0(x)h(x)\frac{\rho'(x)}{\rho(x)} - \frac14h^2(x)\left(\frac{\rho'(x)}{\rho(x)}\right)^2. \tag{30}$$
3.3.1. Example 4: $\sigma = 1$; $\rho(r) = r^2$, $r > 0$

The associated Hamiltonian operator is given by
$$H = -\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) + V(r), \tag{31}$$
which corresponds to the radial Hamiltonian which arises in spherically symmetric problems. (1) r0 = 0: V(r;m,A)
= - ( m 2 - \ - A 2 )sech 2 r tanhr — 1 -2 A m sechr tanh r +
(32)
(2) $r_0 = -\infty$:
$$V(r;m,A) = A^2e^{-2r} - 2mAe^{-r}. \tag{33}$$
3.4. Class IV: $\sigma \neq 1$, $\rho(x) \neq 1$

Here, $u(x) \neq x$. The functions $f$, $c$ and $g$ are related to $f_0$, $c_0$ and $g_0$, respectively, as
$$f(x) = f_0(u(x)), \qquad c(x) = c_0(u(x)), \tag{34}$$
$$g(x) = g_0(u(x)) + \frac12\left[h'(x) + h(x)\frac{\rho'(x)}{\rho(x)}\right]. \tag{35}$$
Therefore, the potential function $V$ is related to $V_0$ as
$$V(x;m,A) = V_0(u(x);m,A) - \frac12h(x)\left[h'(x) + h(x)\frac{\rho'(x)}{\rho(x)}\right]' - g_0(u(x))\left[h'(x) + h(x)\frac{\rho'(x)}{\rho(x)}\right] + g_0(u(x))\left[h'(x) + h(x)\frac{\rho'(x)}{\rho(x)}\right] - \frac14\left[h'(x) + h(x)\frac{\rho'(x)}{\rho(x)}\right]^2. \tag{36}$$
4. Concluding Remarks

We have discussed the application of the SU(1,1) potential group method to the Sturm-Liouville equation. Using this method, we have obtained the exact spectrum corresponding to a number of potentials. In this context, we note that the wavefunctions $\Psi_n \sim (J_+)^n\Psi_0$ may be obtained from the ground state $\Psi_0$, which satisfies the relation $J_-\Psi_0 = 0$. It is interesting to note here that different Sturm-Liouville systems can be exactly isospectral.
References
1. Y. Alhassid, F. Gürsey and F. Iachello, Phys. Rev. Lett. 50, 873 (1983); Y. Alhassid, F. Gürsey and F. Iachello, Ann. Phys. (NY) 148, 346 (1983); Y. Alhassid, F. Gürsey and F. Iachello, Ann. Phys. (NY) 167, 181 (1986).
2. J. Wu, Y. Alhassid and F. Gürsey, Ann. Phys. (NY) 196, 163 (1989); J. Wu and Y. Alhassid, J. Math. Phys. 31, 557 (1990).
3. C. V. Sukumar, J. Phys. A 19, 2229 (1986).
4. A. de Souza Dutra and C. A. S. Almeida, Phys. Lett. A 275, 25 (2000).
5. B. Roy and P. Roy, J. Phys. A 35, 3961 (2000).
6. J.-M. Lévy-Leblond, Phys. Rev. A 52, 1845 (1995).
7. R. Dautray and J.-L. Lions, Mathematical Analysis and Numerical Methods for Science and Technology: Spectral Theory and Applications (Springer, Berlin, 2000).
8. M. N. Hounkonnou, K. Sodoga and E. S. Azatassou, J. Phys. A 38, 371 (2005).
9. G. Levai, J. Phys. A 27, 3809 (1994).
Parallel Sessions: Group II Coherent States, Wavelets, Functional Analysis and Orthogonal Polynomials
THE BETA-GEOMETRIC MODEL APPLIED TO FECUNDABILITY IN A SAMPLE OF MARRIED WOMEN

D. B. ADEKANMBI
Department of Pure and Applied Mathematics, Ladoke Akintola University of Technology, Nigeria
E-mail: dammy-vicky@yahoo.com.au

T. A. BAMIDURO
Department of Statistics, University of Ibadan, Nigeria
E-mail: adebayobamiduro@yahoo.com

The time required to achieve pregnancy among married couples, termed fecundability, has been proposed to follow a beta-geometric distribution. The accuracy of the method used in estimating the parameters of the model affects the goodness of fit of the model. In this study, the parameters of the model are estimated using the method of moments and the Newton-Raphson estimation procedure. The goodness of fit of the model is considered using estimates from the two methods of estimation, as well as the asymptotic relative efficiency of the estimates. A noticeable improvement in the fit of the model to the data on time to conception was observed when the parameters were estimated by the Newton-Raphson procedure, which yields reasonable estimates of the expected fecundability for the married female population in the country.
1. Introduction

Fecundability is the probability of conception per menstrual cycle in the absence of contraception. It is important in the study of fertility because it is one of the determinants of the fertility rates of a population.1 The fecundability of Nigerian women has rarely been investigated by researchers, which might be due to the lack of adequate data on the subject. This study aims to fit a theoretical model to data on time to conception, and to find a better method of estimating the parameters of the model, so as to achieve a substantial improvement in the fit of the model to data on time to conception for married female Nigerians. In modeling the number of menstrual cycles required to achieve pregnancy, the beta-geometric model has been found appropriate by many researchers in this field.2-5 The distribution of the number of cycles required to achieve pregnancy, $x$, for any married woman is taken as geometric with parameter $p$, which is assumed to vary according to a beta distribution. There is enough evidence that couples vary in their fecundability.6 The unconditional probability of $x$, which is the month of conception for a woman, is
$$P(X = x) = f(x;a,b) = \frac{1}{B(a,b)}\int_0^1p^a(1-p)^{b+x-2}\,dp, \tag{1}$$
where
$$B(a,b) = \int_0^1p^{a-1}(1-p)^{b-1}\,dp, \qquad a, b > 0 \ \text{and}\ 0 < p < 1,$$
since only fecund women are included in the sample. One has
$$\Pr(X = x) = f(x;a,b) = \begin{cases}\dfrac{a}{a+b}, & x = 1,\\[2mm] \dfrac{a\,b(b+1)\cdots(b+x-2)}{(a+b)(a+b+1)\cdots(a+b+x-1)}, & x \geq 2.\end{cases} \tag{2}$$
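The probability function (2) is easy to evaluate directly. The following Python sketch computes it both from the product form and from the ratio of beta functions implied by (1), $f(x;a,b) = B(a+1, b+x-1)/B(a,b)$, using `math.lgamma`. The function names are illustrative assumptions.

```python
import math

def beta_geometric_pmf(x, a, b):
    """f(x; a, b) from the product form of Eq. (2), for x = 1, 2, ..."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    prob = a / (a + b)
    for i in range(1, x):
        prob *= (b + i - 1.0) / (a + b + i)
    return prob

def log_beta(p, q):
    """Logarithm of the beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

def beta_geometric_pmf_lgamma(x, a, b):
    """Same probability written as B(a + 1, b + x - 1) / B(a, b)."""
    return math.exp(log_beta(a + 1.0, b + x - 1.0) - log_beta(a, b))
```

The two implementations should agree to rounding error, and the probabilities should sum to one over $x = 1, 2, \ldots$ for any $a, b > 0$.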
The expected value of $1/p^r$, which for $r = 1$ is the expected time required for a fecund woman to conceive, based on the assumption that $p$ is distributed as a beta, is
$$E\left(\frac{1}{p^r}\right) = \frac{1}{B(a,b)}\int_0^1p^{a-r-1}(1-p)^{b-1}\,dp = \frac{(a+b-1)(a+b-2)\cdots(a+b-r)}{(a-1)(a-2)\cdots(a-r)}, \qquad a > r. \tag{3}$$
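The first few moments of $x$ about the origin, conditional on $p$, which follow from a simple geometric distribution, are
$$E(X\,|\,p) = \frac1p, \qquad E(X^2\,|\,p) = \frac{2}{p^2} - \frac1p, \qquad E(X^3\,|\,p) = \frac{6}{p^3} - \frac{6}{p^2} + \frac1p, \qquad E(X^4\,|\,p) = \frac{24}{p^4} - \frac{36}{p^3} + \frac{14}{p^2} - \frac1p. \tag{4}$$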
The unconditional moments of $x$ are therefore obtained by substituting the appropriate terms of the moments into $E(1/p^r)$. The mean, mode and variance of $p$ are given by the following expressions,
$$\text{Mean: } \bar p = \frac{a}{a+b}, \qquad \text{Mode: } \frac{a-1}{a+b-2}, \qquad \text{Variance: } \sigma^2 = \frac{ab}{(a+b)^2(a+b+1)}. \tag{5}$$
According to Potter and Parker,2 the mean and the variance of the month of conception are expressed as
$$E\left(\frac1p\right) = \frac{a+b-1}{a-1}, \qquad \sigma_m^2 = \frac{ab(a+b-1)}{(a-1)^2(a-2)}. \tag{6}$$
The expression for the harmonic mean of fecundability is
$$H_p = \frac{1}{m_1} \ \text{ for moment estimates;} \qquad H_p = \frac{a-1}{a+b-1} \ \text{ for maximum likelihood estimates.} \tag{7}$$
In order to measure skewness, Pearson's measure is employed, which is expressed as
$$\frac{\text{mean} - \text{mode}}{\text{standard deviation}}. \tag{8}$$

2. Data

Retrospective data on cycles to pregnancy of the first child of a family have been claimed to be free of digit preference.5 The data employed in this study are retrospective data on a sample of married female Nigerians who are exposed to the risk of conception and are of child-bearing age. Women with at least one conception were included in the sample. Women who were pregnant before marriage and women who had never conceived at the time of the interview were excluded from the sample. The interest of the study is in the estimation of the natural fecundability of fecund women, and including these women would contaminate the model. A total of 942 married women were asked how many cycles it took them to get pregnant, out of whom 488 became pregnant within 24 cycles.

2.1. Estimation: method of moments
Equating the theoretical moments of $x$, which are $\nu_1$ and $\nu_2$, to their empirical estimates $m_1$ and $m_2$ results in two equations in terms of the two parameters of the model. Solving for the two unknown parameters $a$ and $b$, the estimated values of $a$ and $b$ are
$$\hat a = \frac{2(m_2 - m_1^2)}{m_2 - 2m_1^2 + m_1} = f_1(m_1,m_2) = f_1, \qquad \hat b = (m_1 - 1)(\hat a - 1) = f_2(m_1,m_2) = f_2. \tag{9}$$
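The moment estimators (9) can be checked with a round trip: compute the first two raw moments of the conception month from known parameters, then recover the parameters. In the sketch below, the second raw moment is obtained from $E(X^2\,|\,p) = 2/p^2 - 1/p$ together with Eq. (3), which requires $a > 2$. The function names are illustrative assumptions.

```python
def theoretical_moments(a, b):
    """First two raw moments of the conception month X (requires a > 2)."""
    if a <= 2.0:
        raise ValueError("the second moment exists only for a > 2")
    m1 = (a + b - 1.0) / (a - 1.0)                       # E(1/p)
    e_inv_p2 = (a + b - 1.0) * (a + b - 2.0) / ((a - 1.0) * (a - 2.0))
    m2 = 2.0 * e_inv_p2 - m1                             # E(2/p^2 - 1/p)
    return m1, m2

def moment_estimates(m1, m2):
    """Moment estimators of a and b, Eq. (9)."""
    a_hat = 2.0 * (m2 - m1 ** 2) / (m2 - 2.0 * m1 ** 2 + m1)
    b_hat = (m1 - 1.0) * (a_hat - 1.0)
    return a_hat, b_hat
```

With sample moments in place of the theoretical ones, `moment_estimates` gives the starting values that the later maximum likelihood iteration can use.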
The quantity $\hat a$ has bias in both its numerator and its denominator, which is of little significance since we are considering a large sample.7 Eliminating the bias is not necessary, since the bias in the moment estimators results from ratios of two correlated quantities. If $m_r$ and $\nu_r$ are, respectively, the $r$th sample and population raw moments, then8
$$E(m_r) = \nu_r, \qquad V(m_r) = \frac{\nu_{2r} - \nu_r^2}{n}, \qquad E(m_r^2) = \frac{\nu_{2r} + (n-1)\nu_r^2}{n}, \qquad \mathrm{Cov}(m_r,m_s) = \frac{\nu_{r+s} - \nu_r\nu_s}{n}, \tag{10}$$
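where $E(m_r)$, $V(m_r)$ and $\mathrm{Cov}(m_r,m_s)$ are, respectively, the mean and the variance of $m_r$ and the covariance of $m_r$ and $m_s$. The asymptotic variances and the covariance of the two estimators are found to be9
$$\begin{aligned}V(\hat a) &= A_1^2V(m_1) + A_2^2V(m_2) + 2A_1A_2\,\mathrm{Cov}(m_1,m_2),\\ V(\hat b) &= B_1^2V(m_1) + B_2^2V(m_2) + 2B_1B_2\,\mathrm{Cov}(m_1,m_2),\\ \mathrm{Cov}(\hat a,\hat b) &= A_1B_1V(m_1) + A_2B_2V(m_2) + (A_1B_2 + A_2B_1)\,\mathrm{Cov}(m_1,m_2),\end{aligned} \tag{11}$$
where
$$A_1 = \frac{\partial f_1}{\partial m_1}, \qquad A_2 = \frac{\partial f_1}{\partial m_2}, \qquad B_1 = \frac{\partial f_2}{\partial m_1}, \qquad B_2 = \frac{\partial f_2}{\partial m_2}, \tag{12}$$
and $f_1$ and $f_2$ are as given in (9). Equating the sample raw moments of $x$, $m_1$ and $m_2$, to their corresponding population moments $\nu_1$ and $\nu_2$, we have
$$A_1 = \frac{2[2\nu_1\nu_2 - \nu_2 - \nu_1^2]}{(\nu_2 - 2\nu_1^2 + \nu_1)^2}, \qquad A_2 = \frac{-2\nu_1(\nu_1 - 1)}{(\nu_2 - 2\nu_1^2 + \nu_1)^2}, \qquad B_1 = \frac{\nu_2^2 + 2\nu_2\nu_1^2 - 6\nu_1\nu_2 + 2\nu_2 + \nu_1^2}{(\nu_2 - 2\nu_1^2 + \nu_1)^2}, \qquad B_2 = \frac{-2\nu_1(\nu_1 - 1)^2}{(\nu_2 - 2\nu_1^2 + \nu_1)^2}. \tag{13}$$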
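The coefficients in (13) are the partial derivatives (12) of the estimators (9), so they can be cross-checked against finite differences. The sketch below does this at an arbitrary admissible point $(\nu_1, \nu_2)$. The closed forms coded here are the ones obtained by differentiating $f_1$ and $f_2$ directly, and the function names and the test point are assumptions of this example.

```python
def f1(m1, m2):
    """Moment estimator of a, Eq. (9)."""
    return 2.0 * (m2 - m1 ** 2) / (m2 - 2.0 * m1 ** 2 + m1)

def f2(m1, m2):
    """Moment estimator of b, Eq. (9)."""
    return (m1 - 1.0) * (f1(m1, m2) - 1.0)

def delta_coefficients(v1, v2):
    """Closed-form partial derivatives A1, A2, B1, B2 of f1 and f2."""
    denom = (v2 - 2.0 * v1 ** 2 + v1) ** 2
    a1 = 2.0 * (2.0 * v1 * v2 - v2 - v1 ** 2) / denom
    a2 = -2.0 * v1 * (v1 - 1.0) / denom
    b1 = (v2 ** 2 + 2.0 * v2 * v1 ** 2 - 6.0 * v1 * v2 + 2.0 * v2 + v1 ** 2) / denom
    b2 = -2.0 * v1 * (v1 - 1.0) ** 2 / denom
    return a1, a2, b1, b2

def numerical_coefficients(v1, v2, step=1e-6):
    """The same four derivatives by central finite differences."""
    def d_first(func):
        return (func(v1 + step, v2) - func(v1 - step, v2)) / (2.0 * step)
    def d_second(func):
        return (func(v1, v2 + step) - func(v1, v2 - step)) / (2.0 * step)
    return d_first(f1), d_second(f1), d_first(f2), d_second(f2)
```

Plugging these coefficients into (11), together with the moment variances from (10), gives the asymptotic covariance matrix of the moment estimates.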
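The variances $V(\hat a)$, $V(\hat b)$ and the covariance $\mathrm{Cov}(\hat a,\hat b)$ of the moment estimates are evaluated by substituting the corresponding values for $\nu_r$ at $a = \hat a$ and $b = \hat b$. The covariance matrix of the moment estimates therefore is
$$V = \begin{pmatrix}V(\hat a) & \mathrm{Cov}(\hat a,\hat b)\\ \mathrm{Cov}(\hat a,\hat b) & V(\hat b)\end{pmatrix}. \tag{14}$$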
Although the method of moments provides consistent estimators and its estimating equations are simple, it does not provide the most efficient estimators.10 In order to obtain reliable estimates of fecundability, a high precision of the estimates of $a$ and $b$ is very important.

2.2. Estimation: Method of Maximum Likelihood
Adopting the method of scoring for deriving maximum likelihood estimates (MLE), the likelihood function for a sample of size $n$, which is a comprehensive summary of the data conditional on $a$ and $b$, is10
$$L = f(x_1;a,b)\cdot f(x_2;a,b)\cdots f(x_n;a,b) = \prod_{i=1}^{n}f(x_i;a,b), \tag{15}$$
where $x_i$ is the conception month for the $i$th woman ($i = 1, 2, \ldots, n$), and $f$ is as defined in (1). The quantities
$$S_1 = \frac{\partial\log L}{\partial a}, \qquad S_2 = \frac{\partial\log L}{\partial b} \tag{16}$$
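are the efficient scores for $a$ and $b$, respectively. Given that $K_x$ is the observed frequency of conceptions in month $x$, the log-likelihood function becomes
$$\log L = \sum_xK_x\log f_x, \tag{17}$$
with $f_x = f(x;a,b)$, and the efficient scores for $a$ and $b$ now become
$$S_1 = \sum_xK_x\frac{\partial\log f_x}{\partial a}, \qquad S_2 = \sum_xK_x\frac{\partial\log f_x}{\partial b}. \tag{18}$$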
Deriving the MLE amounts to finding the values of $a$ and $b$ for which the likelihood function attains its maximum. At this point $S_1$ and $S_2$ are equal to zero. A suitable iterative method for solving the equations $S_1 = 0$ and $S_2 = 0$ is the Newton-Raphson method. The Newton-Raphson method has the advantage of converging quadratically when the initial value is near the root, but is expensive in terms of function evaluations.11 The iterative procedure is described as follows
$$\delta\theta = I^{-1}S, \tag{19}$$
where $\delta\theta$ is the most recent estimate of $\theta_1 - \theta_0$; $S$ is a column vector of the scores computed at $\theta = \theta_0$; and $I$ is the information matrix. The information matrix is obtained as follows
$$I_{rs} = -E\left(\frac{\partial^2\log L}{\partial\theta_r\,\partial\theta_s}\right), \tag{20}$$
for $r, s = 1, 2$ and $\theta_1 = a$, $\theta_2 = b$. In deriving the MLE, the information matrix used is
$$I_{rs} = -\sum_xK_x\frac{\partial^2\log f_x}{\partial\theta_r\,\partial\theta_s}, \tag{21}$$
which is the generalization of (20) and removes the need to find expectations.10 The asymptotic covariance matrix of the MLE is found in the matrix
$$I^{-1} = \begin{pmatrix}\mathrm{Var}(\hat a) & \mathrm{Cov}(\hat a,\hat b)\\ \mathrm{Cov}(\hat a,\hat b) & \mathrm{Var}(\hat b)\end{pmatrix}, \tag{22}$$
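with elements
$$\mathrm{Var}(\hat a) = I^{11} = \frac{I_{22}}{I_{11}I_{22} - I_{12}^2}, \qquad \mathrm{Var}(\hat b) = I^{22} = \frac{I_{11}}{I_{11}I_{22} - I_{12}^2}, \qquad \mathrm{Cov}(\hat a,\hat b) = I^{12} = I^{21} = \frac{-I_{12}}{I_{11}I_{22} - I_{12}^2}, \tag{23}$$
where
$$I_{11} = -\sum_xK_x\frac{\partial^2\log f_x}{\partial a^2}, \qquad I_{22} = -\sum_xK_x\frac{\partial^2\log f_x}{\partial b^2}, \qquad I_{12} = -\sum_xK_x\frac{\partial^2\log f_x}{\partial a\,\partial b}. \tag{24}$$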
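The scoring iteration above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the derivatives of $\log f_x$ are taken analytically from the product form (2), the information matrix is the observed one, and two safeguards that are not part of the text are added (a small ridge on the diagonal if the matrix is not positive definite, and step halving so that the log-likelihood never decreases). The grouped counts used to exercise it are synthetic expected frequencies, not the survey data.

```python
import math

def pmf_table(a, b, x_max):
    """f(x; a, b) for x = 1..x_max via f(x+1) = f(x) (b + x - 1) / (a + b + x)."""
    probs = [a / (a + b)]
    for x in range(1, x_max):
        probs.append(probs[-1] * (b + x - 1.0) / (a + b + x))
    return probs

def loglik_scores_information(a, b, counts):
    """Log-likelihood, scores and observed information for counts[x-1] = K_x."""
    loglik = s1 = s2 = i11 = i12 = i22 = 0.0
    log_ratio = 0.0          # log of prod(b + i) / prod(a + b + i)
    sum_ab = sum_ab2 = 0.0   # sums of 1/(a+b+i) and 1/(a+b+i)^2 for i < x
    sum_b = sum_b2 = 0.0     # sums of 1/(b+i) and 1/(b+i)^2 for i < x - 1
    for x, k_x in enumerate(counts, start=1):
        top = a + b + x - 1.0
        sum_ab += 1.0 / top
        sum_ab2 += 1.0 / top ** 2
        log_ratio -= math.log(top)
        if x >= 2:
            bottom = b + x - 2.0
            sum_b += 1.0 / bottom
            sum_b2 += 1.0 / bottom ** 2
            log_ratio += math.log(bottom)
        loglik += k_x * (math.log(a) + log_ratio)
        s1 += k_x * (1.0 / a - sum_ab)
        s2 += k_x * (sum_b - sum_ab)
        i11 += k_x * (1.0 / a ** 2 - sum_ab2)
        i12 += k_x * (-sum_ab2)
        i22 += k_x * (sum_b2 - sum_ab2)
    return loglik, (s1, s2), (i11, i12, i22)

def newton_raphson_mle(counts, a, b, tol=1e-10, max_iter=100):
    """Damped Newton-Raphson iteration started from the values (a, b)."""
    for _ in range(max_iter):
        loglik, (s1, s2), (i11, i12, i22) = loglik_scores_information(a, b, counts)
        ridge = 0.0
        while True:  # safeguard: make the matrix positive definite if needed
            j11, j22 = i11 + ridge, i22 + ridge
            det = j11 * j22 - i12 * i12
            if j11 > 0.0 and det > 0.0:
                break
            ridge = max(2.0 * ridge, 1e-6 * (abs(i11) + abs(i22) + 1.0))
        da = (j22 * s1 - i12 * s2) / det   # update = I^{-1} S
        db = (j11 * s2 - i12 * s1) / det
        step = 1.0
        while step > 1e-8:  # safeguard: halve the step until no decrease
            na, nb = a + step * da, b + step * db
            if na > 0.0 and nb > 0.0:
                new_loglik = loglik_scores_information(na, nb, counts)[0]
                if new_loglik >= loglik - 1e-12 * abs(loglik):
                    break
            step *= 0.5
        a, b = a + step * da, b + step * db
        if abs(step * da) + abs(step * db) < tol:
            break
    _, _, (i11, i12, i22) = loglik_scores_information(a, b, counts)
    det = i11 * i22 - i12 * i12
    covariance = (i22 / det, -i12 / det, i11 / det)  # Var(a), Cov(a, b), Var(b)
    return a, b, covariance
```

In practice the moment estimates would serve as the starting values, and `covariance` contains the elements of the inverse information matrix evaluated at the final estimates.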
2.3. Asymptotic relative efficiency

Unbiased estimators are usually compared in terms of their variances. The variances of the moment estimates are always larger than the variances of the MLE, since $V - I^{-1}$ is positive definite.9 The determinant of a covariance matrix is called the generalized variance. The corresponding generalized variances are given by $|V| = \det V$ and $\det I^{-1} = 1/|I|$, respectively. The asymptotic relative efficiency (ARE) of the moment estimates of $(a, b)$ compared with the MLE is, according to Sen and Puri,12
$$\mathrm{ARE} = (|V|\,|I|)^{-1/2}. \tag{25}$$

2.4. Numerical results
Inferences about the distribution of fecundability, such as estimates of its mean, variance, harmonic mean and skewness, are displayed in Table 1. Having obtained the estimates of $a$ and $b$ by the method of moments, the estimates are used as the initial values to find the maximum likelihood estimates. The moment estimates are $a = 9.89232$, $b = 15.1607$. Substituting these values into (11) yields estimates of $V(\hat a)$, $V(\hat b)$ and $\mathrm{Cov}(\hat a,\hat b)$. The covariance matrix of the maximum likelihood estimates of $a$ and $b$ is obtained by substituting the necessary values of $a$ and $b$ into (22). The asymptotic relative efficiency derived from (25) is 87 percent. The MLE of the two parameters are higher than the moment estimates. The MLE give lower estimates of the arithmetic mean and of the variance of $p$, but a higher harmonic mean, when compared with the moment estimates.

Table 1. Moments estimates and maximum likelihood estimates

Estimates            MLE        Moments estimates
a                    11.1124    9.8923
b                    17.0386    15.1607
r                    0.9946     0.9610
mean(p)              0.3947     0.3949
Harmonic mean Hp     0.3725     0.3697
Mode                 0.3867     0.3857
Variance             0.0082     0.0092
Skewness             0.0883     0.0959
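The derived rows of Table 1 can be recomputed from the two pairs of estimates using the expressions (5), (7) and Pearson's skewness measure. The sketch below does so; the function and key names are illustrative, and small differences in the last digit are to be expected because the table appears to have been computed from rounded intermediate values.

```python
import math

def fecundability_summary(a, b):
    """Mean, mode, variance, harmonic mean and Pearson skewness of p."""
    mean = a / (a + b)
    mode = (a - 1.0) / (a + b - 2.0)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    harmonic = (a - 1.0) / (a + b - 1.0)
    skewness = (mean - mode) / math.sqrt(variance)
    return {"mean": mean, "mode": mode, "variance": variance,
            "harmonic": harmonic, "skewness": skewness}

mle_summary = fecundability_summary(11.1124, 17.0386)
moments_summary = fecundability_summary(9.8923, 15.1607)
```

For the moment estimates, $(a-1)/(a+b-1)$ coincides with $1/m_1$ because $\hat b = (m_1 - 1)(\hat a - 1)$, so one formula serves both columns.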
2.5. Considerations of goodness of fit

The expected frequencies of women conceiving during each successive month of observation are evaluated using the estimates of $a$ and $b$ from the two methods of estimation. The expected frequencies obtained are then compared with the observed frequencies to ascertain which set of estimates yields the better fit. The $\chi^2$ value for the difference between the observed and expected frequencies of conception using the MLE is 9.03, which is not significant at the 5 percent level, while the $\chi^2$ value for the moment estimates is 9.33, which is not significant either. The MLE thus yield a slightly improved fit compared with the moment estimates. Although the MLE yield the better fit, the $\chi^2$ value remains high, which points to a rather poor fit. One of the reasons for the poor fit might be memory bias, since the study is based on retrospective data.

3. Discussion

The likelihood function used in deriving the MLE contains all the information in the sample, and is thus a comprehensive summary of the data. The maximum likelihood estimates (MLE) could therefore be considered better than the moment estimates, which are derived from the first and second moments of the conception month only. Placing a higher importance on quality than on convenience, the method of maximum likelihood is more appropriate for estimating the parameters of the model. Fitting the beta-geometric model to data on first conception has the limitation that it only applies to the beginning of married life. Its advantage is that minimal assumptions are made and that it yielded consistent results.

References
1. J. Bongaarts, Demography 12, 645 (1975).
2. R. G. Potter and M. P. Parker, Population Studies 18, 99 (1964).
3. D. D. Baird and A. J. Wilcox, J. of the American Medical Association 253, 2979 (1985).
4. C. R. Weinberg and B. C. Gladen, Biometrics 42, 547 (1986).
5. M. S. Ridout and B. J. Morgan, Biometrics 47, 1423 (1991).
6. H. Leridon and A. Spira, Fertility and Sterility 41, 580 (1984).
7. M. C. Sheps, Population Studies 18, 85 (1964).
8. C. R. Rao, Linear Statistical Inference and its Applications (Wiley and Sons, New-York, 1965).
9. C. R. Rao, Advanced Statistical Methods in Biometric Research (Wiley and Sons, New-York, 1952).
10. A. Stuart and K. Ord, Kendall's Advanced Theory of Statistics, Vol. 2 (Edward-Arnold, London, 1991).
11. P. R. Turner, Numerical Analysis (MacMillan, London, 1994).
12. P. K. Sen and M. L. Puri, Proc. 2nd Internat. Symp. on Multivariate Analysis, 33 (1969).
ON THE EXISTENCE AND UNIQUENESS OF SOLUTIONS TO THE THERMAL FILTRATION MODEL

F. B. AGUSTO
Department of Mathematical Sciences, Federal University of Technology Akure, Nigeria
E-mail: [email protected]

O. M. BAMIGBOLA
Department of Mathematics, University of Ilorin, Ilorin, Nigeria
E-mail: [email protected]

In this work, we couple the filtration model at a decreasing rate to the heat equation and determine the condition for the existence and uniqueness of a solution to the coupled equation.
1. Introduction

The purpose of this paper is to develop a kinetic model which couples the equations in Ref. 1 to the heat equation, and to give the conditions for the existence and uniqueness of solutions of the coupled system. In Sec. 2 the thermal model is described. In Sec. 3 the conditions of existence and uniqueness of solutions to these equations are discussed.

2. The Thermal Model

The heat transfer formulation employed considers the effects of conduction. The heat flux per unit area is therefore given by $Q = -\lambda\nabla T$, where $\lambda$ is the coefficient of thermal conductivity. The principle of conservation of energy yields
$$\frac{\partial}{\partial t}(HT) + \nabla\cdot Q = \omega(x,t),$$
where $H$, the heat capacity of impurities suspended in the liquid at the reference temperature, is defined as $H = \phi V\rho C$. Hence one has
$$\frac{\partial}{\partial t}(\phi V\rho CT) + \nabla\cdot(-\lambda\nabla T) = \omega(x,t), \tag{1}$$
where
$$V(x,t,T) = u(x,t) + a(x,T),$$
while $\phi$ stands for the porosity, $V(x,t,T)$ for the saturation of impurities suspended in the liquid, $u(x,t)$ for the concentration of impurities suspended in the liquid at temperature $T$, $a(x,T)$ for the initial concentration of impurities suspended in the liquid at temperature $T$, $\rho$ for the liquid density, and $C$ for the specific heat capacity of impurities suspended in the liquid. Simplifying, the energy equations become
$$\frac{\partial}{\partial t}\left[V(x,t,T)\,T(x,t)\right] - \lambda\nabla\cdot(\nabla T(x,t)) = \omega(x,t), \qquad T(x,0) = T_0, \qquad \frac{\partial T}{\partial n} = g. \tag{2}$$
Demchik's 1998 model of filtration1 at a decreasing rate, which is based on the Mints model, is given as
$$u_{xt}(x,t) + b(t)u_t(x,t) - p\,u(x,t) = f(x,t), \tag{3}$$
$$u(x,0) = c_0e^{-\beta v_0^{-1}x}, \qquad u(0,t) = c_0(1 + \gamma t)^z, \qquad b(t) = \beta v_0^{-1}(1 + \gamma t),$$
p = pVo17(z-l),
z^aovof-1.
Here $u$ is the concentration of the impurities suspended in the liquid, $\beta$ is the kinetic coefficient, assumed to be a constant, $c_0$ is the impurity concentration in the liquid at the filter inlet, $v_0$ is a constant and $p$ is the scaled pressure. Let $\Omega \subset \mathbb{R}^l$ be a bounded open subset with boundary $\partial\Omega$ and $(0,T)$ an open interval. The symbol $Q_T$ denotes the cylindrical domain $\Omega\times(0,T)$, while $S_T = \partial\Omega\times(0,T)$ is the lateral boundary of $Q_T$. Finally, $f \in C^1([0,\infty), L^2(\Omega))$ and $u \in L^2(\Omega)$. The relations (2) and (3) define the equations of the coupled system of interest. Note that Eq. (3) decouples from the thermal equations (2), and may be solved for $u$ without regard to the temperature $T$. The existence and uniqueness of solutions of Eq. (3) was considered in Ref. 2. Hence we shall consider here the existence and uniqueness of solutions of Eq. (2). In Eq. (2) one has $V(x,t,T) = u(x,t) + a(x,T)$. The existence of solutions with $V(x,t,T)$ is very interesting and is considered in a subsequent paper. Here, for simplicity, we shall take $a > 0$ as constant, hence $V$ becomes $V(x,t) = u(x,t) + a$.

3. Existence and Uniqueness of Solutions

3.1. Theorems on existence and uniqueness

We consider the following theorems on the existence and uniqueness of solutions of the Cauchy problem: find $Y \in X$ such that
$$\frac{d}{dt}\left[B(t)Y(t)\right] + A(t)Y(t) = f(x,t) \quad \text{in } X', \tag{4}$$
with $BY(0) = B(0)Y_0$.
Theorem 1.3 Let the separable Hilbert spaces $V$, $W$, the linear operators $A(t)$, $B(t)$ with $0 \leq t \leq T$, and the data $Y_0 \in W$ and $f \in L^2(0,T;V')$ be given, and assume further that $B(t)$ is a regular family of Hermitian operators. In addition, suppose that $B(0)$ is monotone and that
$$2A(t)v(v) + B'(t)v(v) \geq c\|v\|^2, \qquad v \in V, \quad 0 \leq t \leq T.$$
Then there exists a solution $Y$ of the Cauchy problem (4) and it satisfies
$$\|Y\|_{L^2(0,T;V)} \leq C(\lambda,c)\left(\|f\|^2_{L^2(0,T;V')} + B(0)Y_0(Y_0)\right)^{1/2}.$$
Theorem 2. 3 Let the separable Hilbert spaces V, W, the linear operators A{t), B(t) with 0
v e V,
almost everywhere (a.e.)
Then there is at most one solution of the Cauchy problem.
t G [0,T].
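3.2. Variational formulation

Let the separable Hilbert spaces $V = H^1(\Omega)$, $W = L^2(\Omega)$ and $X = L^2(0,T;V)$, and $Y_0 \in W$ and $\omega \in L^2(0,T;V)$. We consider the problem
$$\frac{d}{dt}\left[V(t)T(t)\right] - \lambda\Delta T(t) = \omega(x,t), \qquad VT(0) = V(0)T_0.$$
Let $\tau \in V$. Then
$$-\int_0^TV(t)T(t)\tau'(t)\,dt - \lambda\int_0^T\nabla T\,\nabla\tau\,dt + \int_0^Tg\tau\,dt = \int_0^Tf(t)\tau(t)\,dt + V_0T_0,$$
where
$$A(t)T\tau = \int_\Omega\nabla T\,\nabla\tau\,dx, \qquad \forall\,T, \tau \in H^1(\Omega),$$
$$A(t)T\cdot T = \int_\Omega\nabla T\,\nabla T\,dx = \int_\Omega\|\nabla T\|^2 \geq C\|T\|^2_{H^1}, \qquad \forall\,T \in H^1(\Omega).$$
Hence, $A(t)$ is $V$-elliptic and regular, and
$$B(t)T\tau = V(t)T\tau = \int_\Omega V(t)T(t)\tau(t)\,dx, \qquad \forall\,T, \tau \in L^2(\Omega),$$
is regular on $L^2(\Omega)$ given that $V(x,t)$ is absolutely continuous from Ref. 2, hence $|\partial_tV(x,t)| \leq K(t)$, a.e. $t \in [0,T]$ and $K \in L^1(0,T)$. Now $V(x,t) = u(x,t) + a > 0$, since $u(x,t) \geq 0$. Hence $V(t)$ is monotone,
$$\partial_tV(x,t) \geq \gamma.$$
This implies $V'T\tau \geq \gamma\|T\|^2_{L^2(\Omega)}$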
and consequently, $2A(t)T\tau + B'T\tau \geq c\|T\|^2$. Hence existence of a solution follows from Theorem 1 and uniqueness follows from Theorem 2.
4. Conclusion

We have established the existence and uniqueness of the solution of the thermal filtration problem. We shall give the numerical solution for this coupled system in a subsequent paper.

References
1. I. I. Demchik, The theory of filtration at a decreasing rate, J. Appl. Maths. Mechs. 62, 479-481 (1998).
2. F. B. Agusto, O. M. Bamigbola and O. P. Layeni, Optimal control for filtration, Proceedings of the Third International Conference on Contemporary Problems in Mathematical Physics, Cotonou, Benin, eds. J. Govaerts, M. N. Hounkonnou and A. Z. Msezane (World Scientific, Singapore, 2004), pp. 170-174.
3. R. E. Showalter, Monotone Operators in Banach Space and Nonlinear Partial Differential Equations, AMS Mathematical Surveys and Monographs, 49 (1997).
BASIC SET OF POLYNOMIALS: A GENERAL OVERVIEW

A. ANJORIN and M. N. HOUNKONNOU
International Chair in Mathematical Physics and Applications (ICMPA), University of Abomey-Calavi, 072 B.P. 50 Cotonou, Republic of Benin
E-mail: [email protected]
In this paper, we provide a general overview of the basic set of polynomials $\{P_n(z)\}_{n\geq0}$. The domain of effectiveness is related to the radius $r$ of convergence of the associated basic series of $\{P_n(z)\}_{n\geq0}$. The Cannon condition is given. The application in the case of Chebychev polynomials leads to the improvement of the Whittaker constant up to fourteen digits.
1. Introduction

Basic sets of polynomials have been investigated in recent years1-8 since the work of Whittaker.2 The properties of series of the form $c_0P_0(z) + c_1P_1(z) + \cdots$, where $P_i(z)$, $i = 0, 1, \ldots$, are prescribed polynomials and the $c_i$ are chosen in a field $K$ of scalars, differ widely according to the particular polynomials chosen. For example, the region of convergence may be a circle (Taylor series), an ellipse (series of Legendre polynomials), or a half-plane (Newton's interpolation series). Whittaker,2 in his attempt to find common properties exhibited by all these polynomials, introduced the notion of basic sets of polynomials. In his work, he gave the definitions of basic sets, basic series and effectiveness of basic sets. Cannon3 obtained the necessary and sufficient conditions for the effectiveness of basic sets for classes of functions of finite radii of regularity and of entire functions. Nassif and Adepoju1 investigated the zeros of polynomials belonging to simple sets. Initially, the subject was approached in the framework of mathematical analysis. Then, Newns7 laid down a treatment of the subject based on functional analysis considerations. Over the years, Newns' approach has been advanced further through the works of Adepoju1 and Falgas,6 to mention a few. In this paper, we only deal with the basic theoretical aspects and illustrate the case of the basic set of Chebychev polynomials of the first kind.
2. Definitions and Some Theorems

We first recall some definitions and theorems relevant to the sequel of our analysis.

Definition 2.1. A sequence $\{P_n(z)\}_{n\geq0}$ of polynomials is said to be a basic set if and only if any polynomial $P(z)$ can be expressed as a unique finite linear combination of the $P_k(z)$'s as
$$P(z) = \sum_{k=0}^{n}c_kP_k(z), \quad \text{where } c_k \in K \text{ and } n < \infty. \tag{1}$$
These polynomials $P_k(z)$ are linearly independent. In particular, each monomial $z^n$ has a unique representation of the form
$$z^n = \sum_{k}\pi_{n,k}P_k(z), \quad \text{where } \pi_{n,k} \in K.$$
In general, given any polynomial $P(z) = \sum_{k=0}^{n}c_kz^k$, we get
$$P(z) = \sum_{k=0}^{n}\sum_{j}c_k\pi_{k,j}P_j(z) = \sum_{k}\xi_kP_k(z), \tag{2}$$
with $\xi_k = \sum_{j=0}^{n}c_j\pi_{j,k}$ for $k = 0, \ldots, n$. Hence, the representation is well defined. If $\{P_n(z)\}_{n\geq0}$ forms a basic set, there corresponds an associated basic series
$$f(z) = \sum_{n=0}^{\infty}a_nz^n = \sum_{n=0}^{\infty}a_n\left(\sum_{k}\pi_{n,k}P_k(z)\right) = \sum_{n=0}^{\infty}\pi_nf(0)P_n(z), \tag{3}$$
where $\pi_nf(0) = \sum_{k=0}^{\infty}a_k\pi_{k,n}$. Since $a_k = f^{(k)}(0)/k!$, we have
$$\pi_nf(0) = \sum_{k=0}^{\infty}\pi_{k,n}\frac{f^{(k)}(0)}{k!}. \tag{4}$$
Finally, the $\pi_n$'s can be expressed as
$$\pi_n = \sum_{k=0}^{\infty}\pi_{k,n}\frac{1}{k!}\left.\frac{d^k}{dz^k}\right|_{z=0},$$
and the set $\{\pi_n\}_{n\geq0}$ is called a basic set of operators corresponding to the basic set of polynomials $\{P_n(z)\}_{n\geq0}$. In the sequel, we consider domains in the complex plane.
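As a concrete illustration of the coefficients $\pi_{n,k}$, the following Python sketch expands the monomials $z^n$ in the Chebychev polynomials of the first kind $T_k(z)$, the basic set that the paper announces as its application. It uses exact rational arithmetic, builds the $T_k$ from the standard recurrence $T_{k+1}(z) = 2zT_k(z) - T_{k-1}(z)$, and solves for the expansion by back substitution. The function names are illustrative assumptions.

```python
from fractions import Fraction

def chebyshev_coefficients(n_max):
    """rows[k][j] = coefficient of z**j in T_k(z), for k = 0..n_max."""
    rows = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, n_max):
        previous, current = rows[k - 1], rows[k]
        nxt = [Fraction(0)] * (k + 2)
        for j, coef in enumerate(current):
            nxt[j + 1] += 2 * coef       # contribution of 2 z T_k
        for j, coef in enumerate(previous):
            nxt[j] -= coef               # contribution of -T_{k-1}
        rows.append(nxt)
    return rows[:n_max + 1]

def monomial_expansion(n_max):
    """pi[n][k] such that z**n = sum_k pi[n][k] * T_k(z), for n = 0..n_max."""
    t = chebyshev_coefficients(n_max)
    pi = []
    for n in range(n_max + 1):
        residual = [Fraction(0)] * (n + 1)
        residual[n] = Fraction(1)        # start from the polynomial z**n
        row = [Fraction(0)] * (n + 1)
        for k in range(n, -1, -1):       # eliminate the highest power first
            row[k] = residual[k] / t[k][k]
            for j, coef in enumerate(t[k]):
                residual[j] -= row[k] * coef
        pi.append(row)
    return pi
```

For example, `monomial_expansion(3)[3]` encodes $z^3 = \tfrac34T_1(z) + \tfrac14T_3(z)$.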
Definition 2.2. Let $f(z)$ be a regular function in a domain $D$. The basic series $\sum_{n=0}^{\infty} \pi_n f(0) P_n(z)$ is said to represent $f(z)$ in the domain $D$ if it converges uniformly to $f(z)$ in $D$. If the domain is a disc $D(r)$ of radius $r$, we say that the basic set $\{P_n(z)\}_{n\ge 0}$ represents $f(z)$ for $|z| < r$.

Definition 2.3. A basic set $\{P_n(z)\}_{n\ge 0}$ is effective in a domain $D$ if every function $f(z)$ regular in $D$ is represented by a basic series associated with $\{P_n(z)\}_{n\ge 0}$. We then say that the basic series associated with $\{P_n(z)\}_{n\ge 0}$ is effective.

Let $\{P_n(z)\}_{n\ge 0}$ be a basic set in a closed circle. The Cannon sum3 $W_n(r)$ of $\{P_n(z)\}_{n\ge 0}$ is defined by

$$W_n(r) = \sum_{k=0}^{n} |\pi_{n,k}|\, M_k(r), \qquad M_k(r) = \max_{|z|=r} |P_k(z)|, \qquad (5)$$

with the Cannon function expressed by

$$\lambda(r) = \overline{\lim}_{n\to\infty} \{W_n(r)\}^{1/n} = \limsup_{n\to\infty} \{W_n(r)\}^{1/n}. \qquad (6)$$

This statement can be generalized to any regular domain such as open or closed discs.

Definition 2.4.2 The Whittaker constant $W$ is the least upper bound of the numbers $c$ such that the function $f(z)$ is an entire function of exponential type $c$.
Definition 2.5. Let $N_n$ be the number of non-zero terms in the representation (1). Then the basic set $\{P_n(z)\}_{n\ge 0}$ is called a Cannon set if $N_n^{1/n} \to 1$ as $n \to \infty$. This condition is called the Cannon condition.

Let us consider the effectiveness of the set $\{P_n(z)\}_{n\ge 0}$ in the domain $D(r)$ (resp. $\bar{D}(r)$), which is the open (resp. closed) disc of radius $r$.

Theorem 2.1.3 Let $\{P_n(z)\}_{n\ge 0}$ be a basic set of polynomials, $a$ a scalar, and $H(a)$ the class of holomorphic functions in the domain $D(a)$. Assume that $\lambda(r) = a > r$. Then the basic series is effective in the closed domain $\bar{D}(a)$ for the class $H(a)$.
Corollary 2.1. If, for any value of $r > 0$, $\lambda(r) = r$, then the basic set $\{P_n(z)\}_{n\ge 0}$ is effective in the domain defined by $|z| \le r$.

Proof. The proof follows from Theorem 2.1 for the limit case $\lambda(r) = r$. □
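As a concrete illustration of the Cannon sum (5) and the Cannon function (6), the following sketch evaluates $W_n(r)^{1/n}$ for two simple basic sets. Both sets are assumptions chosen only for illustration and are not discussed in the text: the monomials $P_n(z) = z^n$, for which $W_n(r)^{1/n} = r$, and the shifted powers $P_n(z) = (z+1)^n$, for which $z^n = \sum_k \binom{n}{k}(-1)^{n-k}(z+1)^k$ gives $W_n(r)^{1/n} = 2 + r \neq r$. The function names are hypothetical.

```python
import math

def cannon_sum(pi_row, max_modulus, r):
    """W_n(r) = sum_k |pi_{n,k}| * M_k(r), following (5)."""
    return sum(abs(c) * max_modulus(k, r) for k, c in enumerate(pi_row))

def cannon_root(pi_row, max_modulus, r):
    """W_n(r)**(1/n) for n >= 1; its lim sup over n is lambda(r) in (6)."""
    n = len(pi_row) - 1
    return cannon_sum(pi_row, max_modulus, r) ** (1.0 / n)

# Illustrative set 1: monomials P_k(z) = z^k, so z^n = 1 * P_n(z).
def monomial_row(n):
    return [0] * n + [1]

def monomial_max(k, r):
    return r ** k  # max of |z^k| on |z| = r

# Illustrative set 2: shifted powers P_k(z) = (z + 1)^k.
def shifted_row(n):
    # z^n = ((z + 1) - 1)^n = sum_k C(n, k) (-1)^(n-k) (z + 1)^k
    return [math.comb(n, k) * (-1) ** (n - k) for k in range(n + 1)]

def shifted_max(k, r):
    return (1.0 + r) ** k  # max of |(z + 1)^k| on |z| = r, attained at z = r

if __name__ == "__main__":
    r = 0.5
    for n in (5, 20, 40):
        print(n,
              cannon_root(monomial_row(n), monomial_max, r),
              cannon_root(shifted_row(n), shifted_max, r))
```

For the monomials the condition $\lambda(r) = r$ of Corollary 2.1 holds, whereas for the shifted powers it fails for every $r > 0$.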
Thus, the condition $\lambda(r) = r$ is a sufficient condition for effectiveness in $\bar{D}(r)$.

Theorem 2.2.3 The necessary and sufficient condition for the Cannon set $\{P_n(z)\}_{n\ge 0}$ to be effective in $\bar{D}(r)$ is $\lambda(r) = r$.

3. Applications

We consider the Chebychev polynomials of the first kind, namely,

$$P_n(z) = \sum_{k=0}^{[n/2]} \binom{n}{2k} z^{n-2k} (z^2-1)^k. \qquad (7)$$

Theorem 3.1.4 The set $\{P_n(z)\}_{n\ge 0}$ of Chebychev polynomials of the first kind forms a basic set.

Proof. By the uniqueness of the polynomial representation (7) and the linear independence of the polynomials, the set $\{P_n(z)\}_{n\ge 0}$ forms a basic set. □
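The explicit formula (7) can be checked against the standard three-term recurrence $T_0 = 1$, $T_1 = z$, $T_{m+1} = 2zT_m - T_{m-1}$ for the Chebychev polynomials of the first kind. The sketch below is only a consistency check under that assumption; polynomials are stored as integer coefficient lists (lowest degree first), and all helper names are hypothetical.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_pow(p, k):
    """k-th power of a polynomial (k >= 0)."""
    out = [1]
    for _ in range(k):
        out = poly_mul(out, p)
    return out

def chebyshev_from_formula(n):
    """Coefficients of sum_k C(n, 2k) z^(n-2k) (z^2 - 1)^k, as in (7)."""
    total = [0]
    for k in range(n // 2 + 1):
        monomial = [0] * (n - 2 * k) + [1]            # z^(n-2k)
        term = poly_mul(monomial, poly_pow([-1, 0, 1], k))
        total = poly_add(total, [comb(n, 2 * k) * c for c in term])
    return total

def chebyshev_from_recurrence(n):
    """Coefficients of T_n from T_{m+1} = 2 z T_m - T_{m-1}."""
    t_prev, t_cur = [1], [0, 1]
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_next = poly_add(poly_mul([0, 2], t_cur), [-c for c in t_prev])
        t_prev, t_cur = t_cur, t_next
    return t_cur

if __name__ == "__main__":
    for n in range(6):
        print(n, chebyshev_from_formula(n))
```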
Let us examine the domain of effectiveness of this basic set. Using $z^n = \sum_{k} \pi_{n,k} P_k(z)$ with $\pi_{n,k} = \binom{n}{2k}$, the relation (5) yields

$$M_n(r) = \max_{|z|=r} |P_n(z)| \le \max_{|z|=r} \sum_{k=0}^{[n/2]} \pi_{n,k}\, |z^{n-2k}(z^2-1)^k| \le \sum_{k=0}^{[n/2]} \pi_{n,k}\, r^{n-2k} |r^2-1|^k, \qquad (8)$$

and

$$W_n(r) = \sum_{k=0}^{n} \pi_{n,k} M_k(r) \le \sum_{k=0}^{n} \sum_{j=0}^{[k/2]} \pi_{n,k}\, \pi_{k,j}\, r^{k-2j} |r^2-1|^j. \qquad (9)$$

The Cannon function $\lambda(r)$ in (6) is such that

$$\lambda(r) \le \overline{\lim}_{n\to\infty} \Big\{ \sum_{k=0}^{n} \sum_{j=0}^{[k/2]} \pi_{n,k}\, \pi_{k,j}\, r^{k-2j} |r^2-1|^j \Big\}^{1/n}. \qquad (10)$$
Applying the well-known Stirling relation, $n! \sim \sqrt{2\pi n}\, n^n e^{-n}$ as $n \to \infty$, to the combinatorial coefficients $\pi_{n,k}$, we can show that this series converges only for $r < 1$. By a similar procedure, we can bound the Cannon function from below by a divergent series for all $r > 1$. Hence, the Chebychev basic set of the first kind is effective only in the unit disc. Furthermore, the following statement holds.

Theorem 3.2. The set $\{P_n(z)\}_{n\ge 0}$ of Chebychev polynomials forms a Cannon set.

Proof. The proof is obvious since the number of non-zero terms in the unique representation of $\{P_n(z)\}_{n\ge 0}$ is $N_n = [n/2] + 1$. □

Theorem 3.3.4 For the basic set of Chebychev polynomials $\{P_n(z)\}_{n\ge 0}$ of the first kind, there exists a positive number $c$ such that we have the upper bound $|P_n(z)| < c^{n+1}$. Then, for these polynomials, the Whittaker constant $W$ is bounded above by 0.73775075151785 and below by 0.73775075151525.

Proof. Using the Levinson method,5 we obtain several upper bounds of the form $M_n < c^{n+1}$ leading to the polynomial equation

$$10r^5 - 3r^4 - 4r^3 - 4r^2 - 6r - 10.18564 = 0 \qquad (11)$$

with four complex roots and one real root $W$ up to fourteen digits, i.e., $0.73775075151525 < W < 0.73775075151785$. □

4. Concluding Remarks

We have provided the main properties of a basic set of polynomials. In addition, we have examined the case of the basic set of Chebychev polynomials of the first kind, which improves the accuracy of the range of the Whittaker constant $W$.

Acknowledgments

A. A. is grateful to the Abdus Salam International Centre for Theoretical Physics (ICTP, Trieste, Italy) for a Ph.D. fellowship under the grant Prj-15.
References

1. J. A. Adepoju, Ph.D. Thesis, unpublished, University of Lagos (Republic of Nigeria, 1979).
2. J. M. Whittaker, Interpolatory Function Theory (Cambridge University Press, Cambridge (UK), 1935).
3. B. Cannon, Proceedings of the London Math. Soc., Ser. 2, Vol. 43, 348-364 (1937).
4. A. Anjorin and M. N. Hounkonnou, ICMPA preprint MPA/2005/06.
5. N. Levinson, Duke Math. J. 11, 729-733 (1944).
6. M. Falgas, Annales Scientifiques de l'Ecole Normale Superieure, 3e Serie, Vol. 8, 1-76 (1964).
7. W. F. Newns, Phil. Trans. of the Roy. Soc. of London, Ser. A 245, 429-468 (1953).
8. A. El-Sayed Ahmed, On derived and integrated sets of basic sets of polynomials of several complex variables, Acta Mathematica Academiae Paedagogicae Nyiregyhaziensis 19, 195-204 (2003).
WAVELETS AND WAVELET FRAMES ON THE 2-SPHERE

J.-P. ANTOINE

Institut de Physique Theorique, Universite catholique de Louvain,
B-1348 Louvain-la-Neuve, Belgium
E-mail: Antoine@fyma.ucl.ac.be
We review the construction of the continuous wavelet transform (CWT) on the 2-sphere by two methods, the group-theoretical approach and the geometrical method based on conformal invariance. Then we discuss the discretization of the spherical CWT and build associated discrete wavelet frames, first half-continuous ones (only the scale is discretized), then fully discrete ones. To that effect, we generalize the notion of frame, introducing weighted and controlled frames.
1. Introduction

Many situations in physics, astronomy and medicine yield data on spherical manifolds, so it is natural to design a suitable tool for treating them. Fourier analysis on the two-sphere $S^2$ is standard, but cumbersome, since it amounts to working with expansions in spherical harmonics! The latter, denoted $\{Y_l^m\}$, constitute an orthonormal basis in $L^2(S^2, d\mu(\omega))$, so that any function $f \in L^2(S^2, d\mu(\omega))$ may be expanded as

$$f(\omega) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \hat{f}(l,m)\, Y_l^m(\omega), \qquad (1)$$

$$\hat{f}(l,m) = \langle Y_l^m | f \rangle = \int_{S^2} d\mu(\omega)\, \overline{Y_l^m(\omega)}\, f(\omega), \qquad (2)$$

where $\omega = (\theta, \varphi) \in S^2$, $\theta \in [0, \pi]$ is the colatitude angle and $\varphi \in [0, 2\pi)$ is the longitude angle, $d\mu(\omega) = \sin\theta\, d\theta\, d\varphi$, and $\hat{f}(l,m)$ is a Fourier coefficient of $f$. The problem is that Fourier analysis is global, since $Y_l^m$ is not localized at all on the sphere! Actually, there are specific combinations of spherical harmonics which are well localized (the so-called spherical harmonics kernels1), but then one loses the simplicity of an orthonormal basis.

Thus it is not surprising that alternative methods have been proposed by various authors. We may quote, for instance, Gabor analysis on the tangent bundle,2 frequential wavelets, based on spherical harmonics,3 diffusion methods with a heat equation.4 Discrete wavelets on the sphere have also been designed, using an $S^2$ multiresolution analysis on spherical meshes (via the lifting scheme),5 or locally supported spline wavelets on spherical triangulations.6 However, various problems plague those constructions, such as an inadequate notion of dilation, the lack of wavelet localization, the excessive rigidity of the wavelets obtained, the lack of directionality, etc. In this respect, the continuous wavelet transform (CWT) has many advantages: locality is controlled by dilation, the wavelets are easily transported around the sphere by rotations from SO(3), and efficient algorithms are available.

Holschneider7 was the first to build a genuine spherical CWT, but his construction involves several assumptions and lacks a geometrical feeling. In particular, it contains a parameter that has to be interpreted as a dilation parameter, but whose geometrical meaning is unclear. Thus the problem was tackled by our group in Louvain-la-Neuve (P. Vandergheynst, L. Jacques, M. Morvidone), resulting in a series of papers8-11 that yield a rigorous and efficient spherical CWT. A further simplification was obtained later by invoking conformal arguments.12 Of course, in practice, the usual two-dimensional CWT in the plane is discretized and replaced with suitable discrete frames. Thus, to complete the picture, one needs to design discrete spherical wavelet frames as well, and this was indeed realized in the last paper of the series.11

The aim of the present contribution is to give a rapid survey of the series of works8-12 mentioned above. As a general reference for 2-D wavelets, including the figures, we use our recent monograph.13

2. The CWT on the Two-Sphere

As we have learned from the previous cases, the design of a CWT on a given manifold $X$ starts by identifying the operations one wants to perform on the finite energy signals living on $X$, that is, functions in $L^2(X, d\nu)$, where $\nu$ is a suitable measure on $X$. Next one realizes these operations by unitary operators on $L^2(X, d\nu)$ and one looks for a possible group-theoretical derivation. In the case of the two-sphere $S^2$, the required transformations are of two types: (i) motions, which are realized by rotations $\varrho \in SO(3)$, and (ii) dilations of some sort by a scale factor $a \in \mathbb{R}_+^*$. The problem is how to define the dilation properly.
Fig. 1. Visual meaning of the stereographic dilation on S2.
A possible solution is to use a (radial) stereographic dilation on $S^2$, which is obtained in three steps (see Fig. 1): (i) given a point $A \in S^2$, different from the South Pole $S$, project it stereographically to the point $B$ in the plane tangent to the sphere at the North Pole $N$; (ii) dilate $B$ radially in the usual way to $B'$; and (iii) project $B'$ back to the sphere, which yields $A'$. The map $A \mapsto A'$ is the required spherical dilation around $N$. In order to dilate around any other point $C$, just bring it to $N$ by a rotation $\varrho \in SO(3)$, dilate as above, and go back to $C$ by the inverse rotation $\varrho^{-1}$.

The operations just defined have a natural realization by unitary operators in $L^2(S^2, d\mu(\omega))$:

. rotation $R_\varrho$: $(R_\varrho f)(\omega) = f(\varrho^{-1}\omega)$, $\varrho \in SO(3)$, (3)
. dilation $D_a$: $(D_a f)(\omega) = \lambda(a,\theta)^{1/2} f(\omega_{1/a})$, $a \in \mathbb{R}_+^*$, (4)

where $\omega_a = (\theta_a, \varphi)$, $\theta_a$ is defined by $\tan\frac{\theta_a}{2} = a \tan\frac{\theta}{2}$ for $a > 0$, and the normalization factor $\lambda(a,\theta)^{1/2}$ (variously called cocycle or Radon-Nikodym derivative) is needed for compensating the noninvariance of the measure $\mu$ under dilation. Explicitly, this factor is given as

$$\lambda(a,\theta) = \frac{4a^2}{[(a^2-1)\cos\theta + (a^2+1)]^2}. \qquad (5)$$

Note that the rotation $\varrho$ may be factorized into three rotations (Euler angles): $R_\varrho = R^z_\varphi R^y_\theta R^z_\chi$.
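The role of the cocycle (5) can be checked numerically: with this factor the dilation (4) preserves the $L^2(S^2, d\mu)$ norm. The following sketch does this for an axisymmetric function, using a midpoint rule in $\theta$. The test function `bump` and all function names are assumptions made for illustration; they do not come from the text.

```python
import math

def dilate_angle(theta, a):
    """theta_a defined by tan(theta_a / 2) = a * tan(theta / 2)."""
    return 2.0 * math.atan(a * math.tan(theta / 2.0))

def cocycle(a, theta):
    """lambda(a, theta) of (5)."""
    return 4.0 * a * a / ((a * a - 1.0) * math.cos(theta) + (a * a + 1.0)) ** 2

def dilate(f, a):
    """Return D_a f for an axisymmetric f(theta), following (4)."""
    def dilated(theta):
        return math.sqrt(cocycle(a, theta)) * f(dilate_angle(theta, 1.0 / a))
    return dilated

def norm_squared(f, n=20000):
    """2*pi * int_0^pi |f(theta)|^2 sin(theta) dtheta, by the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += f(theta) ** 2 * math.sin(theta)
    return 2.0 * math.pi * h * total

def bump(theta):
    """Illustrative axisymmetric test function (an assumption, not from the text)."""
    return math.exp(-math.tan(theta / 2.0) ** 2)

if __name__ == "__main__":
    for a in (0.25, 0.5, 2.0):
        print(a, norm_squared(bump), norm_squared(dilate(bump, a)))
```

The printed norms should agree up to the quadrature error, which is what unitarity of $D_a$ requires.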
The question now is, can one derive a CWT from these ingredients, starting from first principles, for instance the general coherent state formalism, as was the case for the 2-D plane CWT? Is this transformation unique?

2.1. The group-theoretical method
According to the general scheme,13 a possible way of answering the question is to use the general coherent state formalism relying on square integrable representations of a suitable transformation group. Thus we have to identify first the group of affine transformations on $S^2$, containing the motions (here, rotations) and dilations defined above. But a problem arises immediately. On the one hand, motions $\varrho \in SO(3)$ and dilations by $a \in \mathbb{R}_+^*$ do not commute. On the other hand, it is impossible to build a semidirect product of $SO(3)$ and $\mathbb{R}_+^*$, since $SO(3)$ does not admit any outer automorphisms, so that the only extension of $SO(3)$ by $\mathbb{R}_+^*$ is their direct product. However, this contradiction may be evaded if one embeds the two factors into the Lorentz group $SO_o(3,1)$, by the Iwasawa decomposition:

$$SO_o(3,1) = SO(3) \cdot A \cdot N, \qquad (6)$$

where $A \sim SO_o(1,1) \sim \mathbb{R} \sim \mathbb{R}_+^*$ (boosts in the $z$-direction) and $N \sim \mathbb{C}$. This procedure is justified by the fact that the Lorentz group $SO_o(3,1)$ is the conformal group both of the sphere $S^2$ and of the tangent plane $\mathbb{R}^2$.

Next we have to compute the action of the Lorentz group on the sphere. The stability subgroup of the North Pole is $P = SO_z(2) \cdot A \cdot N$ (minimal parabolic subgroup). Thus $S^2 \sim SO_o(3,1)/P \sim SO(3)/SO(2)$, so that $SO_o(3,1)$ acts transitively on $S^2$. Then an explicit computation with the help of the Iwasawa decomposition (6) shows that the pure dilation by $a$, realized as a Lorentz boost along the $z$-axis, coincides with the stereographic dilation (4).

Going over to the Hilbert space, we find that the Lorentz group $SO_o(3,1)$ has a natural unitary irreducible representation (UIR) in $L^2(S^2, d\mu(\omega))$, namely,

$$[U(g)f](\omega) = \lambda(g,\omega)^{1/2} f(g^{-1}\omega), \quad \text{for } g \in SO_o(3,1),\ f \in L^2(S^2, d\mu(\omega)), \qquad (7)$$

where $\lambda(g,\omega)$ is the Radon-Nikodym derivative (5) (which actually does not depend on $\varphi$ in $\omega = (\theta,\varphi)$). Thus the parameter space of spherical wavelets is the homogeneous space $X = SO_o(3,1)/N \sim SO(3) \cdot \mathbb{R}_+^*$, which is not a subgroup of $SO_o(3,1)$.
In order to apply the general formalism, we must introduce a section $\sigma : X \to SO_o(3,1)$ and consider the reduced representation $U(\sigma(\varrho,a))$. Choosing the natural (Iwasawa) section $\sigma(\varrho,a) = \varrho\, a$, $\varrho \in SO(3)$, $a \in A$, we obtain

$$U(\sigma(\varrho,a)) = U(\varrho\, a) = U(\varrho)U(a) = R_\varrho D_a, \qquad (8)$$
exactly as before, in (3)-(4). The following three propositions show that the representation (8) has all the properties that are required to generate a useful CWT. First of all, it is square integrable on the quotient manifold $X = SO_o(3,1)/N \sim SO(3) \cdot \mathbb{R}_+^*$ (for simplicity, we shall identify these two isomorphic manifolds).

Proposition 2.1. The UIR (7) is square integrable on $X$, that is, there exist nonzero (admissible) vectors $\psi \in L^2(S^2, d\mu)$ such that

$$\int_0^\infty \frac{da}{a^3} \int_{SO(3)} d\varrho\; |\langle U(\sigma(\varrho,a))\psi | \phi \rangle|^2 := \langle \phi | A_\psi \phi \rangle < \infty, \quad \text{for all } \phi \in L^2(S^2, d\mu). \qquad (9)$$

Here $d\varrho$ is the left Haar measure on $SO(3)$. The resolution operator (also called frame operator) $A_\psi$ is diagonal in Fourier space (i.e., it is a Fourier multiplier):

$$\widehat{A_\psi f}(l,m) = G_\psi(l)\, \hat{f}(l,m), \qquad (10)$$

where

$$G_\psi(l) = \frac{8\pi^2}{2l+1} \sum_{|m| \le l} \int_0^\infty \frac{da}{a^3}\, |\hat{\psi}_a(l,m)|^2, \quad \text{for all } l \in \mathbb{N}, \qquad (11)$$

and $\hat{\psi}_a(l,m) = \langle Y_l^m | \psi_a \rangle$ is the Fourier coefficient of $\psi_a = D_a \psi$.

Next, we have an exact admissibility condition on the wavelets (this condition was also derived by Holschneider7 in a somewhat ad hoc way).

Proposition 2.2. An admissible wavelet is a function $\psi \in L^2(S^2, d\mu(\omega))$ for which there exists a constant $c > 0$ such that

$$G_\psi(l) \le c, \quad \text{for all } l \in \mathbb{N}. \qquad (12)$$

Equivalently, the function $\psi \in L^2(S^2, d\mu(\omega))$ is an admissible wavelet if and only if the resolution operator $A_\psi$ is bounded and invertible.

As in the plane case,13 there is also a weaker admissibility condition on $\psi$:

$$\int_{S^2} d\mu(\theta,\varphi)\, \frac{\psi(\theta,\varphi)}{1 + \cos\theta} = 0. \qquad (13)$$
Here as well, this condition is only necessary in general, but it is also sufficient under mild regularity conditions on $\psi$. This is clearly similar to the "zero mean" condition of wavelets on the line or the plane. As in the flat case, it implies that the spherical CWT acts as a local filter, in the sense that it selects the components of a signal which are similar to $\psi$, which is assumed to be well localized.

In addition, our spherical wavelets generate continuous frames. Indeed:

Proposition 2.3. For any admissible wavelet $\psi$ such that $\int_0^{2\pi} d\varphi\, \psi(\theta,\varphi) \not\equiv 0$, the family $\{\psi_{a,\varrho} = R_\varrho D_a \psi : a > 0, \varrho \in SO(3)\}$ is a continuous frame, that is, there exist two constants $m > 0$ and $M < \infty$ such that

$$m\, \|f\|^2 \le \int_0^\infty \frac{da}{a^3} \int_{SO(3)} d\varrho\; |\langle \psi_{a,\varrho} | f \rangle|^2 \le M\, \|f\|^2, \quad \text{for all } f \in L^2(S^2, d\mu), \qquad (14)$$

or, equivalently, there exists a constant $d > 0$ such that $d \le G_\psi(l) \le c$, for all $l \in \mathbb{N}$ (in other words, the operators $A_\psi$ and $A_\psi^{-1}$ are both bounded).

Note that the condition $\int_0^{2\pi} d\varphi\, \psi(\theta,\varphi) \not\equiv 0$ [...] for $a > 0$. (15)

Using the previous results, we may now introduce the spherical CWT.

Definition 2.4. Given the admissible wavelet $\psi$, the spherical CWT of a function $f \in L^2(S^2, d\mu(\omega))$ with respect to $\psi$ is defined as

$$W_f(\varrho, a) = \langle \psi_{a,\varrho} | f \rangle = \int_{S^2} d\mu(\omega)\, \overline{[R_\varrho D_a \psi](\omega)}\, f(\omega) = (\psi_a \star f)(\varrho). \qquad (16)$$

In the last equality, $\star$ denotes a spherical correlation.

According to the general coherent state formalism, there is a reconstruction formula. For $f \in L^2(S^2, d\mu(\omega))$ and $\psi$ an admissible wavelet such that
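$\int_0^{2\pi} d\varphi\, \psi(\theta,\varphi) \not\equiv 0$, one has

$$f(\omega) = \int_0^\infty \frac{da}{a^3} \int_{SO(3)} d\varrho\; W_f(\varrho,a)\, [A_\psi^{-1} R_\varrho D_a \psi](\omega). \qquad (17)$$

Correspondingly, instead of the familiar isometry property, one gets a Plancherel relation:

$$\|f\|^2 = \int_0^\infty \frac{da}{a^3} \int_{SO(3)} d\varrho\; \overline{\widetilde{W}_f(\varrho,a)}\, W_f(\varrho,a), \qquad (18)$$

where

$$\widetilde{W}_f(\varrho,a) = \langle \widetilde{\psi}_{a,\varrho} | f \rangle = \langle A_\psi^{-1} R_\varrho D_a \psi | f \rangle. \qquad (19)$$

Fig. 2. The spherical DOG wavelet $\psi_G^{(\alpha)}$, for $\alpha = 1.25$: (a) Original ($a = 0.125$); (b) Rotated; (c) Rotated and scaled ($a = 0.0625$).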
The new fact here is the occurrence of the inverse resolution operator $A_\psi^{-1}$ in these formulas. This results from the square integrability of the representation (7) over the quotient space $X$, instead of the group itself.

Note that all the formulas simplify if the wavelet is axisymmetric. In particular, the third Euler angle $\chi$ drops out in $R_\varrho$, so that motions are now indexed by points $\omega \in S^2$. Thus one writes $R_{[\omega]}$ instead of $R_\varrho$. The corresponding wavelet family is thus $\{\psi_{a,\omega} = R_{[\omega]} D_a \psi : a > 0, \omega \in S^2\}$, and it is a frame under the same condition as in Proposition 2.3. Otherwise there is no essential modification.

In order to illustrate the capabilities of our spherical CWT, we present first, in Fig. 3, an academic example, namely, the transform of the characteristic function of a triangle with apex at the North Pole, $0° \le \theta \le 50°$, $0° \le \varphi \le 90°$, obtained with the spherical DOG wavelet $\psi_G^{(\alpha)}$, for $\alpha = 1.25$, given in (15). The transform is shown at three different, gradually smaller scales, $a = 0.2$, $0.1$ and $0.035$. As expected, it vanishes inside the triangle, and presents a "wall" along the contour, with sharp peaks at each vertex, and the North Pole does not play any particular role. This example confirms that the spherical CWT behaves exactly as its plane counterpart.

Fig. 3. Spherical wavelet transform of the characteristic function of a triangle, obtained with the spherical DOG wavelet $\psi_G^{(\alpha)}$, for $\alpha = 1.25$. (a) Original image. The transform is shown at three gradually smaller scales: (b) $a = 0.2$; (c) $a = 0.1$; and (d) $a = 0.035$.

Next we present, in Fig. 4, a real-life example, namely, the analysis of an image of the Milky Way, based on data from the Hipparcos and Tycho Stars Catalogues.
Fig. 4. Spherical wavelet transform of an image of the Milky Way. (a) Original image. The transform is shown at three successive scales: (b) $a = 0.08$; (c) $a = 0.04$; (d) $a = 0.02$.
2.2. The Euclidean limit
The geometry of the sphere suggests that, when the radius $R$ increases to infinity, the CWT on $S^2$ should tend locally to the CWT on the tangent plane at the North Pole. This condition, imposed for consistency reasons by Holschneider,7 may actually be derived in the group-theoretical approach, using the technique of group contraction, with the sphere radius as parameter, $R \to \infty$. The limit $R \to \infty$ must be taken at several successive stages. The result of the analysis is the following.
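(1) For the groups

$$SO(3) \longrightarrow \mathbb{R}^2 \rtimes SO(2)$$
$$SO_o(3,1) = SO(3) \cdot A \cdot N \longrightarrow \mathbb{R}^2 \rtimes SIM(2)$$

Thus the parameter space $SO(3) \cdot \mathbb{R}_+^*$, which is not a group, becomes in the limit the group $SIM(2)$, that is, precisely the group underlying the 2-D plane CWT.

(2) For the group actions

Let us replace the sphere $S^2$ by the sphere $S_R^2$ of radius $R$. Then: action of $\sigma(X) \subset SO_o(3,1)$ on $S_R^2$ $\longrightarrow$ action of $SIM(2)$ on $\mathbb{R}^2$.

(3) For the representations

Define a family of representations $U_{S,R}$ on $L^2(S_R^2, d\mu_R(\omega))$, where $d\mu_R(\omega) = R^2 d\mu(\omega)$, by $U_{S,R}(\varrho,a) = U_S(\sigma(\varrho, a/R))$. Then $U_{S,R} \to U$ as $R \to \infty$, as a strong limit on a dense set.

(4) For the CWT on $S^2$

Let $\psi(\vec{x}) \in L^2(\mathbb{R}^2, d^2\vec{x})$ and $\psi_R = \Pi_R^{-1}\psi$, where $\Pi_R : L^2(S_R^2, d\mu_R(\omega)) \to L^2(\mathbb{R}^2, d^2\vec{x})$ is the unitary map induced by the stereographic projection (see (20) below). Then

$$G_{\psi_R}(l) \le c \ (\text{for all } l \in \mathbb{N}) \quad \xrightarrow{R \to \infty} \quad c_\psi \sim \int_{\mathbb{R}^2} d^2\vec{k}\; \frac{|\hat{\psi}(\vec{k})|^2}{|\vec{k}|^2} < \infty.$$

Thus admissible vectors on $S^2$ correspond to admissible vectors on $\mathbb{R}^2$, i.e., the Euclidean limit holds. In summary, for $\psi = \lim_{R\to\infty} \Pi_R \psi_R$:

$$\psi_R \text{ admissible on } S_R^2 \quad \Longrightarrow \quad \psi \text{ admissible on } \mathbb{R}^2,$$
$$\int_{S_R^2} d\mu_R(\omega)\, \frac{\psi_R(\theta,\varphi)}{1 + \cos\theta} = 0 \quad \Longrightarrow \quad \int_{\mathbb{R}^2} d^2\vec{x}\; \psi(\vec{x}) = 0.$$

To give an example, take the two Difference of Gaussian wavelets. When $R \to \infty$, the SDOG wavelet on $S_R^2$ tends to the usual DOG wavelet on $\mathbb{R}^2$.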
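The local flattening behind the Euclidean limit can be illustrated with a very small numerical sketch. It assumes that, on a sphere of radius $R$, the stereographic projection sends a point at geodesic distance $s$ from the North Pole to planar radius $2R\tan(s/2R)$; this scaled form is an assumption extending the unit-sphere projection and is not stated explicitly in the text. As $R$ grows, the planar radius tends to $s$, so the neighbourhood of the pole looks more and more like its tangent plane.

```python
import math

def stereographic_radius(arc_length, R):
    """Planar radius of the stereographic image of a point at geodesic
    distance arc_length from the North Pole on a sphere of radius R
    (assumed form: r = 2 R tan(theta / 2), theta = arc_length / R)."""
    theta = arc_length / R
    return 2.0 * R * math.tan(theta / 2.0)

if __name__ == "__main__":
    s = 1.0
    for R in (1.0, 10.0, 100.0, 1000.0):
        # The distortion r - s shrinks roughly like s**3 / (12 R**2).
        print(R, stereographic_radius(s, R) - s)
```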
2.3. The geometrical or conformal method
The group-theoretical method discussed so far yields an asymptotic connection with the plane CWT, via the Euclidean limit $R \to \infty$. In fact, there is also a direct connection (unitary map) through the inverse stereographic projection and it is uniquely specified by geometrical considerations, as we show now. The result is that one obtains uniquely the spherical CWT from the plane (Euclidean) one, simply by lifting everything from the tangent plane to the sphere by inverse stereographic projection: the wavelets, the admissibility conditions, the directionality or steerability properties.12

(1) Uniqueness of the stereographic projection

Let $\pi : S^2 \to \mathbb{R}^2$ be a radial diffeomorphism from the 2-sphere to the tangent plane at the North Pole:

$$\pi(\theta,\varphi) = (r(\theta), \varphi), \quad \text{with inverse} \quad \pi^{-1}(r,\varphi) = (\theta(r), \varphi).$$

Assume that $\pi$ is a conformal map, i.e., it preserves angles, or, equivalently, the metric $g'$ induced by $\pi$ on $\mathbb{R}^2$ is conformally equivalent to the Euclidean metric $g$:

$$g'_{ij}(r,\varphi) = e^{\phi(r)}\, g_{ij}(r,\varphi), \quad \phi(r) > 0.$$

Then $r(\theta) = 2\tan\frac{\theta}{2}$, i.e., $\pi$ is the stereographic projection.

(2) Uniqueness of the stereographic dilation

Let $D_a$ be a radial dilation on the sphere $S^2$:

$$D_a(\theta,\varphi) = (\theta_a(\theta), \varphi).$$

Assume $D_a$ is a conformal diffeomorphism. Then one has uniquely $\tan\frac{\theta_a}{2} = a \tan\frac{\theta}{2}$, i.e., $D_a$ is the stereographic dilation (4).

Thus one obtains an equivalence between the two wavelet formalisms. Let $\Pi : L^2(S^2, d\mu(\omega)) \to L^2(\mathbb{R}^2, d^2\vec{x})$ be the unitary map induced by the stereographic projection (note that $\Pi = \Pi_R$ for $R = 1$, as defined in Sec. 2.2 (4)):

$$[\Pi F](\vec{x}) = \frac{1}{1 + (r/2)^2}\, F(\pi^{-1}(\vec{x})), \quad F \in L^2(S^2, d\mu(\omega)), \qquad (20)$$

with inverse

$$[\Pi^{-1} f](\omega) = \frac{2}{1 + \cos\theta}\, f(\pi(\omega)), \quad f \in L^2(\mathbb{R}^2, d^2\vec{x}). \qquad (21)$$
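The conformality of $r(\theta) = 2\tan(\theta/2)$ can be verified directly: a radial map is conformal when it stretches lengths by the same factor along meridians ($dr/d\theta$) and along parallels ($r/\sin\theta$). The sketch below compares the two factors numerically and also checks that the stereographic dilation acts on the plane as $r \mapsto ar$. It is an illustration only; the function names and the finite-difference step are assumptions.

```python
import math

def r_of_theta(theta):
    """Stereographic projection radius, r = 2 tan(theta / 2)."""
    return 2.0 * math.tan(theta / 2.0)

def radial_scale(theta, h=1e-6):
    """dr/dtheta, estimated by a central difference."""
    return (r_of_theta(theta + h) - r_of_theta(theta - h)) / (2.0 * h)

def azimuthal_scale(theta):
    """Ratio of the image circle length 2*pi*r to the parallel length 2*pi*sin(theta)."""
    return r_of_theta(theta) / math.sin(theta)

def dilate_angle(theta, a):
    """theta_a with tan(theta_a / 2) = a tan(theta / 2)."""
    return 2.0 * math.atan(a * math.tan(theta / 2.0))

if __name__ == "__main__":
    for theta in (0.3, 1.0, 2.0):
        print(theta, radial_scale(theta), azimuthal_scale(theta))
```

Both scale factors equal $1/\cos^2(\theta/2) = 2/(1+\cos\theta)$, which is the factor appearing in (21).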
Fig. 5. The spherical Morlet wavelet is shown at two scales, (a) $a = 0.3$ and (b) $a = 0.03$. Then displaced: (c) $a = 0.03$, centered at $(\pi/3, \pi/3)$; and (d) the same, rotated by $\pi/2$.
Then every admissible Euclidean wavelet $\psi \in L^2(\mathbb{R}^2, d^2\vec{x})$ yields an admissible spherical wavelet $\Pi^{-1}\psi \in L^2(S^2, d\mu(\omega))$. In particular, if $\psi$ is a directional wavelet, so is $\Pi^{-1}\psi$. As an example, the real part of the spherical Morlet wavelet is shown in various positions in Fig. 5. In order to show its directional selectivity, we present in Fig. 6 the analysis of the triangle from Fig. 3. The wavelet is oriented in two ways, $\chi = 0°$ and $\chi = 90°$ ($\chi$ is the third Euler angle (see Sec. 2), which describes a rotation of the wavelet around its center). As expected, this wavelet filters out the directions perpendicular to its orientation, keeping the great circles $\varphi = \mathrm{const}$ in the first case and the latitude circles $\theta = \mathrm{const}$ in the second case.
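The lifting by inverse stereographic projection can be sketched numerically for an axisymmetric wavelet. The code below lifts a planar difference of Gaussians to the sphere with (21), then checks two consequences: the lifted function satisfies the zero-mean-type condition (13), and its $L^2(S^2)$ norm equals the planar $L^2$ norm (unitarity of $\Pi$). The particular planar wavelet, the value `ALPHA = 1.25`, and all function names are assumptions made for illustration; this is not claimed to be the SDOG wavelet of (15).

```python
import math

ALPHA = 1.25  # illustrative width ratio

def plane_dog(r, alpha=ALPHA):
    """Radial difference of Gaussians with zero integral over the plane."""
    return math.exp(-0.5 * r * r) - math.exp(-0.5 * (r / alpha) ** 2) / alpha ** 2

def lift_to_sphere(psi):
    """[Pi^{-1} psi](theta) = 2 / (1 + cos theta) * psi(2 tan(theta / 2)), as in (21)."""
    def psi_sphere(theta):
        return 2.0 / (1.0 + math.cos(theta)) * psi(2.0 * math.tan(theta / 2.0))
    return psi_sphere

def integrate_sphere(g, n=20000):
    """2*pi * int_0^pi g(theta) sin(theta) dtheta, by the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += g(theta) * math.sin(theta)
    return 2.0 * math.pi * h * total

def plane_norm_squared(alpha=ALPHA):
    """Closed form of 2*pi * int_0^inf plane_dog(r)^2 r dr."""
    return 2.0 * math.pi * (0.5 - 2.0 / (alpha ** 2 + 1.0) + 0.5 / alpha ** 2)

psi_sphere = lift_to_sphere(plane_dog)
zero_mean = integrate_sphere(lambda t: psi_sphere(t) / (1.0 + math.cos(t)))
sphere_norm_squared = integrate_sphere(lambda t: psi_sphere(t) ** 2)

if __name__ == "__main__":
    print("condition (13):", zero_mean)
    print("norms:", sphere_norm_squared, plane_norm_squared())
```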
Fig. 6. Analysis of a triangle with the spherical Morlet wavelet, in two different orientations: (a) $\chi = 0°$; (b) $\chi = 90°$, showing the directional selectivity of the wavelet.
3. Discrete Wavelet Frames on the 2-Sphere

In order to discretize our spherical CWT, we have to generalize the notion of frame. The classical notion13 is that a countable family of vectors $\{\phi_n : n \in \Gamma\}$ in a (separable) Hilbert space $\mathfrak{H}$ is a (discrete) frame if there exist two positive constants $m$ and $M$ such that

$$m\, \|f\|^2 \le \sum_{n \in \Gamma} |\langle \phi_n | f \rangle|^2 \le M\, \|f\|^2, \quad \text{for all } f \in \mathfrak{H}. \qquad (22)$$

The index set $\Gamma$ may be finite or infinite. We introduce two variants of this classical notion. The family $\{\phi_n\}$ [...] (23)

The family $\{\phi_n\}$ is a weighted frame in $\mathfrak{H}$ if there are positive weights $w_n > 0$ such that

$$m\, \|f\|^2 \le \sum_{n \in \Gamma} w_n\, |\langle \phi_n | f \rangle|^2 \le M\, \|f\|^2, \quad \text{for all } f \in \mathfrak{H}. \qquad (24)$$

These two notions are in fact mathematically equivalent to the classical notion of frame, namely, a family of vectors $\{\phi_n\}$ is a controlled frame, resp. a weighted frame, iff it is a frame in the standard sense (with different frame bounds, of course). However, this is not true numerically: the convergence properties of the respective frame expansions may be quite different.13,14 And, indeed, the new notions will be used precisely for improving the frame bounds, which ultimately control the convergence.
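A finite-dimensional toy case may help to see how weights change the frame bounds in (22) and (24). For vectors in $\mathbb{R}^2$, the optimal bounds are the extreme eigenvalues of the $2 \times 2$ matrix $S = \sum_n w_n \phi_n \phi_n^T$. The sketch below computes them for two families chosen purely for illustration (they do not appear in the text): three unit vectors at 120 degrees, which form a tight frame, and a lopsided family whose bounds become equal after a suitable choice of weights. The function name `frame_bounds` is hypothetical.

```python
import math

def frame_bounds(vectors, weights=None):
    """Optimal bounds (m, M) in (22) or (24) for vectors in R^2:
    the smallest and largest eigenvalues of S = sum_n w_n phi_n phi_n^T."""
    if weights is None:
        weights = [1.0] * len(vectors)
    sxx = sum(w * x * x for w, (x, y) in zip(weights, vectors))
    sxy = sum(w * x * y for w, (x, y) in zip(weights, vectors))
    syy = sum(w * y * y for w, (x, y) in zip(weights, vectors))
    mean = 0.5 * (sxx + syy)
    dev = math.hypot(0.5 * (sxx - syy), sxy)
    return mean - dev, mean + dev

# Three unit vectors at 120 degrees: a tight frame with m = M = 3/2.
mercedes = [(math.cos(math.pi / 2 + 2 * math.pi * k / 3),
             math.sin(math.pi / 2 + 2 * math.pi * k / 3)) for k in range(3)]

# A lopsided family: unweighted bounds are 1 and 5.
lopsided = [(1.0, 0.0), (0.0, 1.0), (0.0, 2.0)]

if __name__ == "__main__":
    print(frame_bounds(mercedes))
    print(frame_bounds(lopsided))
    # Positive weights can bring the bounds together (here m = M = 1).
    print(frame_bounds(lopsided, weights=[1.0, 0.5, 0.125]))
```

The family is a frame in both the weighted and the unweighted sense, but the ratio $M/m$, which governs the convergence of iterative reconstruction, is very different.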
3.1. Half-continuous spherical frames
In a first step, we will try to build a half-continuous spherical frame, by discretizing the scale variable only, while keeping the position variable on the sphere continuous (this is exactly the approach adopted by Duval-Destin et al. for designing the so-called continuous wavelet packets15). We work with the axisymmetric SDOG wavelet (15) $\psi = \psi_G^{(\alpha)}$ ($\alpha = 1.25$) and the half-continuous grid $\Lambda = \{(\omega, a_j) : \omega \in S^2, j \in \mathbb{Z}, a_j > a_{j+1}\}$, where $A = \{a_j : j \in \mathbb{Z}\}$ is an arbitrary decreasing sequence of scales. Let us first start from the standard weighted frame condition given in (24):

$$m\, \|f\|^2 \le \sum_{j \in \mathbb{Z}} \nu_j \int_{S^2} d\mu(\omega)\, |W_f(\omega, a_j)|^2 \le M\, \|f\|^2, \qquad (25)$$

using a discrete dyadic scale with $K$ voices $a_j = a_0 2^{-j/K}$, $j \in \mathbb{Z}$. The weights $\nu_j$ mimic the natural (Haar) measure $da/a^3$:

$$\nu_j = \frac{a_j - a_{j+1}}{a_j^3} = \frac{2^{1/K} - 1}{2^{1/K}\, a_j^2}.$$

Upon estimating the frame bounds, it turns out that the ratio $M/m$ converges rapidly to 1.8107 as $K$ increases. Thus we obtain a weighted frame, but there is no way of getting a tight one. The reason is obvious: the resolution operator $A_\psi$ has not been taken into account. Thus we start from the Plancherel formula (18) and write a modified frame condition

$$m\, \|f\|^2 \le \sum_{j \in \mathbb{Z}} \nu_j \int_{S^2} d\mu(\omega)\, W_f(\omega, a_j)\, \overline{\widetilde{W}_f(\omega, a_j)} \le M\, \|f\|^2. \qquad (26)$$

A sufficient condition for the relations (26) to hold is that the following ones be valid:

$$m \le \frac{4\pi}{2l+1}\, G_\psi(l)^{-1} \sum_{j \in \mathbb{Z}} \nu_j\, |\hat{\psi}_{a_j}(l,0)|^2 \le M.$$

Proceeding as before, with the same SDOG wavelet, one obtains that the ratio $M/m$ tends to 1 as $K$ increases. Thus a tight frame might be obtained by this method. Indeed, we have the following result.
Proposition 3.1. Let $A = \{a_j : j \in \mathbb{Z}\}$ be a decreasing sequence of scales. If $\psi$ is an axisymmetric wavelet for which there exist two constants $m, M \in \mathbb{R}_+^*$ such that

$$m \le g_\psi(l) = \frac{4\pi}{2l+1} \sum_{j \in \mathbb{Z}} \nu_j\, |\hat{\psi}_{a_j}(l,0)|^2 \le M, \quad \text{for all } l \in \mathbb{N}, \qquad (27)$$

then any function $f \in L^2(S^2, d\mu(\omega))$ may be reconstructed from the corresponding family of spherical wavelets, as

$$f(\omega) = \sum_{j \in \mathbb{Z}} \nu_j \int_{S^2} d\mu(\omega')\, W_f(\omega', a_j)\, [\ell_\psi^{-1} R_{[\omega']} D_{a_j} \psi](\omega), \qquad (28)$$

where $\ell_\psi$ is the (discretized) resolution operator defined by $\widehat{\ell_\psi^{-1} h}(l,m) = g_\psi(l)^{-1}\, \hat{h}(l,m)$.

Note that the resolution operator $\ell_\psi$ is simply the discretized version of the continuous resolution operator $A_\psi$. Clearly (28) may be interpreted as a (weighted) tight frame controlled by the operator $\ell_\psi^{-1}$.

3.2. Discrete spherical frames
Finally, we proceed to design a fully discrete spherical frame, by discretizing all the variables. The scale variable is discretized as before: $a \in A = \{a_j \in \mathbb{R}_+^* : a_j > a_{j+1}, j \in \mathbb{Z}\}$. As for the positions, we choose an equiangular grid $\mathcal{G}_j$ indexed by the scale level:

$$\mathcal{G}_j = \Big\{ \omega_{jpq} = (\theta_{jp}, \varphi_{jq}) \in S^2 : \theta_{jp} = \frac{(2p+1)\pi}{4B_j},\ \varphi_{jq} = \frac{q\pi}{B_j} \Big\}, \qquad (29)$$

for $p, q \in \mathcal{N}_j := \{n \in \mathbb{N} : n < 2B_j\}$ and some range of bandwidths $B = \{B_j \in 2\mathbb{N} : j \in \mathbb{Z}\}$. Note that, in (29), the values $\{\theta_{jp}\}$ constitute a pseudo-spectral grid, with nodes on the zeros of a Chebyshev polynomial of degree $2B_j$. Their virtue is the existence of an exact quadrature rule, due to Driscoll and Healy,16 namely,

$$\int_{S^2} d\mu(\omega)\, f(\omega) = \sum_{p,q \in \mathcal{N}_j} w_{jp}\, f(\omega_{jpq}), \qquad (30)$$

for certain (explicit) weights $w_{jp} > 0$ and for every band-limited function $f \in L^2(S^2, d\mu(\omega))$ of bandwidth $B_j$ (i.e., $\hat{f}(l,m) = 0$ for all $l \ge B_j$). Thus the complete discretization grid reads as follows:

$$\Lambda(A,B) = \{(a_j, \omega_{jpq}) : j \in \mathbb{Z},\ p, q \in \mathcal{N}_j\}. \qquad (31)$$
As before, we are looking for a weighted frame $\{\psi_{jpq} = R_{[\omega_{jpq}]} D_{a_j} \psi\}$ controlled by the operator $A_\psi^{-1}$:

$$m\, \|f\|^2 \le \sum_{j \in \mathbb{Z}} \sum_{p,q \in \mathcal{N}_j} \nu_j\, w_{jp}\, W_f(\omega_{jpq}, a_j)\, \overline{\widetilde{W}_f(\omega_{jpq}, a_j)} \le M\, \|f\|^2. \qquad (32)$$
S'(l) = E S W O <#(') l^(/,0)|2, J=||^||=
sup JlgjH,
(33) (34)
where the infinite matrix X = (<%');,(-^N is given by
xw = E
27rUjC {l n
j ' ^-.coa+ng^a) 1^^0)11^^,0)1 (35)
and Cj(M') = (2(i + Bj) + l) 1 / 2 (2(Z' + 5,-) + if'2. Next define K0 = inf/eN S"(Z) and ifi = sup J g N 5'(/). In these notations, if the constant S = 11X11 is such that 0 ^ 6 < K0
^ Kx < oo,
(36)
t/ten the family {ipjpq — R^^^Da^ : j G Z,p,<7 6 A/}} is a weighted spherical frame controlled by the operator Al1 (i.e., (32) holds). The frame bounds are m = KQ — d, M = KQ + S. Of course, the norm of infinite dimensional matrix X is difficult to compute. However, if / G L2 (S2, dfi(u)) is band-limited of bandwidth b G N°, then X is b x ^-dimensional and the computation is possible. A numerical evaluation has been performed with the SDOG wavelet, b = 64, a dyadically discretized scale with K = ao — 1, and a bandwidth associated to the grid size at resolution j : Bj = BQ2^\ Bo € N, where Bo is the minimal bandwidth associated to tp\. The result is that the sufficient condition (36) is satisfied for BQ ^ 4, but a tight frame cannot be obtained by increasing Bo, that is, using finer and finer spherical grids. Indeed, we know from the first trial above that a discrete frame with a one voice discretization of the scale variable is not sufficient to get a tight frame! As usual, when the frame bounds are close enough (i.e., 8 is sufficiently small), approximate reconstruction formulas may be used. The convergence of the process may still be improved by combining the reconstruction with a conjugate gradient algorithm. A spectacular example may be found in
the last of our papers.11 The signal is a World map, recorded on an equiangular grid of 512 × 512 points. The reconstruction ($|j| \le 6$, $K = 10$) is performed with a half-continuous spherical frame and the SDOG wavelet, with a relative error of 1.1%. Adding the conjugate gradient algorithm with 3 iterations only, the relative error drops to $2 \times 10^{-3}$%.

Fig. 7. Local enhancement of Jupiter's Red Spot. (a) Original image; (b) Local mask; (c) Zoom over the Red Spot; (d) Zoom over the Red Spot with sharper details.

Instead of that example, we present another application of our spherical frames, namely, a local enhancement of Jupiter's Red Spot. The method runs as follows. Before reconstruction, the coefficients at the finest scale $W_f(\omega, a_J)$ are multiplied by a Gaussian mask $M(\omega) = 1 + n_{a'} [R_{[\omega']} D_{a'} G](\omega)$ localized on the center $\omega'$ of the Spot, with $\|M\|_\infty = 2$. This mask increases their amplitudes by a factor up to 2 in the vicinity of the Red Spot, but the rest of the coefficients are not modified (the mask is thus a frame multiplier14). The reconstruction is made with a half-continuous spherical frame with a SDOG wavelet, data bandwidth $b = 256$, and an equiangular grid of size 512 × 512, which gives a good discretization for $|j| < 7$ and $a_0 = 1$. Technical
361 tools are the SpharmonicKit package 1 7 and our own M A T L A B © YAWtb toolbox. 1 8 T h e result, shown in Fig. 7, is quite spectacular. Clearly, such a technique is impossible t o implement with a purely frequential spherical decomposition; one really needs a spherical wavelet frame. 4. C o n c l u s i o n s a n d P e r s p e c t i v e s T h e spherical C W T described in the present paper is fully operational, b u t there remains a number of open questions. For instance, can one derive spherical wavelet frames without Fourier analysis on 5 2 ? C a n one use other discretization grids? Is there a fast algorithm? More generally, can one extend the t r e a t m e n t t o different geometries (two-sheeted hyperboloid, 1 9 paraboloid, torus, general axisymmetric manifold)? T h e problem is to find which methods can be generalized: G r o u p theory? Conformal invariance 1 2 ? Direct construction based on a different notion of dilation and a convolution theorem on the manifold 1 9 ? Concerning applications, plenty of problems arise: analysis of t h e Cosmic Microwave Background (CMB), omnidirectional cameras, plenoptic vision, lightfields, study of molecular (star-shaped) surfaces, etc. Work is in progress in several of these directions in the groups at UCL and E P F L . References 1. D. Potts, G. Steidl and M. Tasche, Kernels of spherical harmonics and spherical frames, in Advanced Topics in Multivariate Approximation, pp. 287-301; F. Fontanella, K. Jetter and P. J. Laurent (eds.) (World Scientific, Singapore, 1996). 2. B. Torresani, Position-frequency analysis for signals defined on spheres, Signal Proc, 43, 341-346 (2005). 3. W. Freeden, T. Maier and S. Zimmermann, A survey on wavelet methods for (geo)applications, Revista Mathematica Complutense, 16, 277-310 (2003). 4. T. Biilow, Multiscale image processing on the sphere, in DAGM-Symposium, pp. 609-617, 2002. 5. P. Schroder and W. 
Sweldens, Spherical wavelets: Efficiently representing functions on the sphere, in Computer Graphics Proceedings (SIGGRAPH 1995), SIGGRAPH, ACM, 1995, pp. 161-172.
6. D. Roşca, Locally supported rational spline wavelets on the sphere, Math. Comput. 74 (252), 1803-1829 (2005).
7. M. Holschneider, Continuous wavelet transforms on the sphere, J. Math. Phys. 37, 4156-4165 (1996).
8. J.-P. Antoine and P. Vandergheynst, Wavelets on the 2-sphere: A group-theoretical approach, Applied Comput. Harmon. Anal. 7, 262-291 (1999).
9. J.-P. Antoine and P. Vandergheynst, Wavelets on the n-sphere and other manifolds, J. Math. Phys. 39, 3987-4008 (1998).
10. J.-P. Antoine, L. Demanet, L. Jacques, and P. Vandergheynst, Wavelets on the sphere: Implementation and approximations, Applied Comput. Harmon. Anal. 13, 177-200 (2002).
11. I. Bogdanova, P. Vandergheynst, J.-P. Antoine, L. Jacques and M. Morvidone, Stereographic wavelet frames on the sphere, Applied Comput. Harmon. Anal. 26, 223-252 (2005).
12. Y. Wiaux, L. Jacques and P. Vandergheynst, Correspondence principle between spherical and Euclidean wavelets, Astrophys. J. 632, 15-28 (2005).
13. J.-P. Antoine, R. Murenzi, P. Vandergheynst and S. T. Ali, Two-Dimensional Wavelets and their Relatives (Cambridge University Press, Cambridge (UK), 2004).
14. P. Balazs, private communication and Regular and irregular Gabor multipliers with application to psychoacoustic masking, Ph.D. thesis, U. Wien, 2005.
15. M. Duval-Destin, M.-A. Muschietti and B. Torresani, Continuous wavelet decompositions, multiresolution, and contrast analysis, SIAM J. Math. Anal. 24, 739-755 (1993).
16. J. R. Driscoll and D. M. Healy, Computing Fourier transforms and convolutions on the 2-sphere, Adv. Appl. Math. 15, 202-250 (1994).
17. D. Rockmore, S. Moore, D. Healy and P. Kostelec, SpharmonicKit (Dartmouth College) http://www.cs.dartmouth.edu/~geelong/sphere/.
18. http://www.fyma.ucl.ac.be/projects/yawtb.
19. I. Bogdanova, Wavelets on non-Euclidean manifolds, Ph.D. thesis, EPFL, 2005.
ON COMPACT ELEMENTS OF BANACH ALGEBRAS

U. N. BASSEY
Department of Mathematics, University of Ibadan, Ibadan, Nigeria
E-mail: [email protected]

In this paper we prove, among other things, that a semi-simple and topologically simple Banach algebra which has an identity and contains a non-zero finite-dimensional element is itself a finite-dimensional algebra.

Keywords: Banach algebra, semi-simple algebra, C*-algebra, H*-algebra, compact element.
1. Introduction

Let $A$ be a normed algebra over the field of complex numbers $\mathbb{C}$. An element $u \in A$ is said to be left compact if the mapping $L_u : x \mapsto ux$ ($x \in A$) is a compact linear operator on $A$, i.e., the sequence $\{ux_n\}$ has a convergent subsequence whenever $\{x_n\}$ is a bounded sequence in $A$. An element $u \in A$ is said to be compact (resp. finite-dimensional) if the mapping $T_{u,u} : x \mapsto uxu$ ($x \in A$) is a compact (resp. finite-dimensional) linear operator on $A$. Finite-dimensional elements are compact since the closed unit ball in their range is compact. The results of the present paper are concerned with (left) compact elements of three classes of Banach algebras, namely, semi-simple algebras, $C^*$- and $H^*$-algebras. We recall that a left ideal $J$ of an algebra $A$ is said to be modular if there exists a modular identity for $J$, that is, an element $v$ of $A$ such that $x - xv \in J$ for all $x \in A$. A modular right ideal is defined similarly. We refer to maximal proper ideals as maximal ideals. The (Jacobson) radical of an algebra $A$, denoted by $\mathrm{Rad}(A)$, is the intersection of all maximal modular left ideals in $A$ (and this is equal to the intersection of all maximal modular right ideals in $A$). $A$ is semi-simple if $\mathrm{Rad}(A) = \{0\}$. Tullo (Ref. 1) has studied conditions on Banach algebras which imply finite-dimensionality. In Theorem 2.2 below it is shown that a semi-simple and
topologically simple Banach algebra which has an identity and contains a non-zero finite-dimensional element is itself a finite-dimensional algebra, that is, it has finite dimension. Let $A$ be a commutative semi-simple Banach algebra. A linear operator $T : A \to A$ is a multiplier of $A$ if and only if $T(xy) = x(Ty)$ for all $x, y \in A$. Kamowitz (see Ref. 2, p. 79) has proved the following remarkable theorem.

Theorem 1.1. Let $A$ be a commutative semi-simple Banach algebra and $T$ a compact multiplier of $A$. If the maximal ideal space of $A$ contains no isolated points, then $T = 0$.

Using this result, we easily identify in Theorem 2.3 commutative semi-simple Banach algebras with no non-trivial left compact elements. Our next result is on left compact elements of $C^*$-algebras. A $C^*$-algebra is a closed $*$-subalgebra of the algebra $L(\mathcal{H})$ of all continuous linear operators on some Hilbert space $\mathcal{H}$. A $C^*$-algebra may be defined abstractly as a Banach algebra $A$ with a conjugate linear involution "$*$" which satisfies $\|a\|^2 = \|a^*a\|$ for all $a \in A$. (Abstract $C^*$-algebras are otherwise known as $B^*$-algebras.) According to a theorem of Gelfand and Naimark (see Ref. 3, p. 45, Theorem 2.6.1; and Ref. 4, p. 209, Theorem 10), the two definitions are equivalent. Before going on we must define the notion of centralizer of an element of a Banach algebra. Given an element $u$ of a Banach algebra $A$, the centralizer of $u$ is the set of all elements of $A$ that commute with $u$. Plainly, the centralizer of $u$ is a closed subalgebra of $A$ containing $u$. Let $X$ be a complex Banach space and let $T$ be a compact linear operator on $X$. Bonsall (Ref. 5) has proved that $T$ is a left compact element of its centralizer (see also Ref. 4, p. 174, Theorem 3). Consider a fixed element $u$ of a $C^*$-algebra $A$ with identity, say 1, and suppose that it is a left compact element of its centralizer.
In Theorem 2.5 below we prove that if there exists a sequence of left compact elements of $A$ which commute with $u$ and converge weakly to 1, then $u$ is a left compact element of $A$. The final result of this paper has to do with a left compact element of an $H^*$-algebra. An $H^*$-algebra (see Ref. 4, p. 182) is a complex Banach algebra $A$, with involution "$*$", that is also a Hilbert space with respect to an inner product $\langle\cdot,\cdot\rangle$ such that the following axioms hold, for all $x, y, z \in A$:

(i) $\|x\|^2 = \langle x, x\rangle$;
(ii) $\langle xy, z\rangle = \langle y, x^*z\rangle$, $\langle yx, z\rangle = \langle y, zx^*\rangle$.
It is pertinent to mention here that concrete examples of $H^*$-algebras include an important class of compact linear operators acting on a Hilbert space. Consider the algebras $c_p$, $1 \le p \le 2$, studied by McCarthy (Ref. 6) and others. This is the class of operators on a Hilbert space for which the $c_p$ norm,
\[ |T|_p = \left[\operatorname{trace}\,(T^*T)^{p/2}\right]^{1/p}, \]
is finite. Let the Hilbert space be separable and infinite dimensional. The algebra $c_1$ is the trace class algebra (TC) in the notation of Schatten's book (Ref. 7), while $c_2$ is the $H^*$-algebra, the Schmidt-class of operators (see Ref. 8, p. 287). The strong radical $\mathrm{Rad}_s(A)$ of a Banach algebra $A$ is the intersection of all modular two-sided ideals. If $\mathrm{Rad}_s(A) = \{0\}$ then $A$ is said to be strongly semi-simple. Nakano (Ref. 9, p. 20, Theorem 5.5) has shown that each minimal closed ideal of an $H^*$-algebra $A$ is finite-dimensional if and only if each element $u \in A$ is left compact; while Grove (Ref. 10, p. 74, Theorem 1.2) has provided five characterizations of strong semi-simplicity for an $H^*$-algebra $A$, each of which implies that each element $u \in A$ is left compact. In what follows, we contribute to the study of left compact elements of $H^*$-algebras by proving in Theorem 2.7 the final result of this paper, that for any left compact element $u$ of an $H^*$-algebra $A$ such that $\lambda = \|L_u\| = \sup\{\|ux\| : x \in A, \|x\| \le 1\} > 0$ there exists an element $e \in A$ of norm 1 such that $u^*ue = \lambda^2 e$. Our results have an operator theoretic flavour. We end this paper with some examples of Banach algebras containing left compact elements.

2. The Results

A Banach algebra $A$ is topologically simple if it contains no proper closed two-sided ideals. The following result of Tullo (Ref. 1, p. 2, Proposition 4) is needed for the proof of Theorem 2.2 below.

Proposition 2.1. A topologically simple Banach algebra which has an identity and contains minimal one-sided ideals is finite dimensional.

Theorem 2.2. Let $A$ be a semi-simple and topologically simple Banach algebra with identity. Suppose that $A$ contains an element $u$ such that the mapping $x \mapsto uxu$ ($x \in A$) is a finite dimensional operator on $A$. Then $A$ is finite dimensional.
Proof. Let $A$ be a semi-simple and topologically simple Banach algebra with identity, say 1, and let $0 \ne u \in A$ be such that the mapping $T_{u,u} : x \mapsto uxu$ ($x \in A$) is a finite dimensional operator on $A$. Then $\dim(uAu) < \infty$. Let $\varepsilon$ be a small positive number and let $v \in uAu$ be a non-zero element such that $\dim(vAv) = \varepsilon$. Then for any element $y \in A$ with $vyv \ne 0$, we have $vyvAvyv = vAv$. Therefore there exists an element $x \in A$ such that $vyvxvyv = vyv$. Now we have
\[ (vxvyv - v)A(vxvyv - v) \subset vAv. \tag{1} \]
Equality cannot hold in (1), since it would imply that
\[ \{0\} = vy(vxvyv - v)A(vxvyv - v)yv = vyvAvyv = vAv. \]
Hence,
\[ (vxvyv - v)A(vxvyv - v) = \{0\}. \tag{2} \]
Since $A$ is semi-simple, condition (2) implies $vxvyv - v = 0$. That is, $vxvyv = v$, which is a contradiction. Now we show that $Av$ is a minimal left ideal. Let $L \subset Av$ be a non-zero left ideal. Then there exist $y_0 \in L$, $x_0 \in A$ with $y_0x_0y_0 \ne 0$ and $y_0 = yv$. Also there exists an element $x \in A$ such that
\[ v = vxvx_0yv. \]
Consequently,
\[ Av = Avxvx_0yv \subset Avx_0yv \subset L. \]
It follows that $Av$ is a minimal left ideal. $A$ as given in the theorem is also a topologically simple Banach algebra with identity. Hence by Proposition 2.1, we conclude that $A$ is finite dimensional. □
Theorem 2.3. Let $A$ be a nontrivial commutative semi-simple Banach algebra with its maximal ideal space containing no isolated points. Then $A$ has no nontrivial left compact elements.

Proof. Suppose on the contrary that $A$ has nontrivial left compact elements. Let $u \in A$ be one such element. Then the multiplication operator $L_u : x \mapsto ux : A \to A$ is compact. But $A$ being a commutative Banach algebra, $L_u$ is a multiplier of $A$. For, $L_u(xy) = u(xy) = x(uy) = x(L_uy)$ for all $x, y \in A$. Hence by Theorem 1.1, $L_u = 0$. This is a contradiction. Hence $A$ has no nontrivial left compact elements. □

A Banach algebra in which every element is (left) compact is called a (left) compact algebra.

Corollary 2.4. The maximal ideal space of a commutative semi-simple left compact Banach algebra $A$ contains isolated points.

Proof. Every element of $A$ is left compact. Therefore every nontrivial element $u \in A$ is left compact. It follows that the maximal ideal space of $A$ contains isolated points. For, otherwise, by Theorem 2.3 each element of $A$ would be trivial. □

Let $A$ be a $C^*$-algebra and let $A'$ be the topological dual of $A$. Then the mapping $\langle\cdot,\cdot\rangle : A \times A' \to \mathbb{C}$ is defined by $\langle x, f\rangle = f(x)$ for all $x \in A$, $f \in A'$. Now, consider a fixed element $u$ of a $C^*$-algebra $A$ with identity 1 and suppose that $u$ is a left compact element of its centralizer. Then we have the following theorem.

Theorem 2.5. If there exists a sequence $\{a_n\}$ of left compact elements of $A$ which commute with $u$ and such that $a_nx \to x$ ($x \in A$) in the weak topology of $A$, then $u$ is a left compact element of $A$.

Proof. Suppose that for each $n \in \mathbb{N}$, $a_n$ is a left compact element of $A$, $a_nu = ua_n$ and $\lim_{n\to\infty}\langle x, L_{a_n}f\rangle = \langle x, f\rangle$ for all $x \in A$, $f \in A'$. Then the sequence $\{\|a_n\|\}$ is bounded, and so there exist a subsequence $\{a_{n_k}\}$ and $t \in A$ such that
\[ \lim_{k\to\infty}\|ua_{n_k} - t\| = 0. \]
Then, by Theorem 3.2 of Freundlich (Ref. 11, p. 276), $t$ is a left compact element of $A$, and also, for all $x \in A$, $f \in A'$, since, by Sakai (Ref. 12, p. 19),
\[ \lim_{n\to\infty}\langle a_nx, f\rangle = \lim_{n\to\infty}\langle x, L_{a_n}f\rangle = \langle x, f\rangle, \]
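we have
\[ \langle tx, f\rangle = \lim_{k\to\infty}\langle ua_{n_k}x, f\rangle = \lim_{k\to\infty}\langle u(a_{n_k}x), f\rangle = \lim_{k\to\infty}\langle a_{n_k}x, L_uf\rangle = \langle x, L_uf\rangle = \langle ux, f\rangle, \]
from which it follows that $t = u$. □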
Hereafter we contribute to the study of left compact elements of $H^*$-algebras by proving the following theorems.

Theorem 2.6. Let $A$ be an $H^*$-algebra. Then an element $u \in A$ is left compact if and only if $u^* \in A$ is left compact.

Proof. Let $u \in A$ be left compact. Then the mapping $L_u : x \mapsto ux$ ($x \in A$) is a compact linear operator on $A$. Now consider the mapping $(L_u)^* : A \to A$. For $x, y \in A$, we have
\[ \langle x, (L_u)^*y\rangle = \langle L_ux, y\rangle = \langle ux, y\rangle = \langle x, u^*y\rangle = \langle x, L_{u^*}y\rangle. \]
Therefore $(L_u)^* = L_{u^*} : y \mapsto u^*y$ ($y \in A$). The result now follows from Schauder's Theorem (Ref. 4, Corollary 4, p. 175) that $L_u$ is compact if, and only if, $(L_u)^*$ is compact. □

Remark. It is clear from Theorem 2.6 that if $u$ is a left compact element of an $H^*$-algebra $A$, then $u^*u$ also acts compactly on $A$. Now the mapping $u \mapsto L_u$ is a $*$-homomorphism of $A$ into $L(A)$, the Banach algebra of all continuous linear operators on $A$, and so $L_{u^*u} = (L_u)^*L_u = L_{u^*}L_u$ is a compact positive self-adjoint operator on $A$. We have the following theorem.

Theorem 2.7. Let $A$ be an $H^*$-algebra. Then for each left compact element $u \in A$ such that $\lambda = \|L_u\| > 0$ there exists an element $e \in A$ such that $u^*ue = \lambda^2 e$ and $\|e\| = 1$.
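Proof. By hypothesis $\lambda = \sup_{\|x\| \le 1}\|ux\|$ (see Ref. 4, p. 184) and $u$ is left compact. [...] Setting $L_{u^*}f = \lambda^2 e$, we have
\[ \lim_{n\to\infty} L_{u^*}L_ux_n = \lambda^2 e, \]
that is,
\[ \lim_{n\to\infty} u^*ux_n = \lambda^2 e. \tag{3} \]
Now we know
\[ \begin{aligned} \|L_{u^*}L_ux_n - \lambda^2x_n\|^2 &= \langle L_{u^*}L_ux_n - \lambda^2x_n,\; L_{u^*}L_ux_n - \lambda^2x_n\rangle \\ &= \|L_{u^*}L_ux_n\|^2 - 2\lambda^2\langle L_{u^*}L_ux_n, x_n\rangle + \lambda^4\|x_n\|^2 \\ &= \|L_{u^*}L_ux_n\|^2 - 2\lambda^2\|L_ux_n\|^2 + \lambda^4\|x_n\|^2 \\ &\le \|L_{u^*}\|^2\|L_ux_n\|^2 - 2\lambda^2\|L_ux_n\|^2 + \lambda^4\|x_n\|^2. \end{aligned} \]
Since the right side of the last inequality converges to 0 as $n \to \infty$, we have
\[ \lim_{n\to\infty}(u^*ux_n - \lambda^2x_n) = 0. \tag{4} \]
Combining (3) and (4) we have
\[ \lambda^2\lim_{n\to\infty}x_n = \lim_{n\to\infty}u^*ux_n = \lambda^2 e. \]
Therefore, $\lim_{n\to\infty}x_n = e$. Consequently, we have the equation
\[ u^*ue = \lim_{n\to\infty}u^*ux_n = \lambda^2 e. \]
From the relation
\[ \lambda = \lim_{n\to\infty}\|ux_n\| = \|ue\| = \|L_ue\| \le \|L_u\|\|e\| = \lambda\|e\| = \lambda\lim_{n\to\infty}\|x_n\| \]
we obtain $\|e\| = 1$. □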
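As a hedged, finite-dimensional illustration of Theorem 2.7 (not part of the original argument), one may take for $A$ the algebra of real 2 × 2 matrices with the Hilbert-Schmidt inner product $\langle x, y\rangle = \operatorname{trace}(y^*x)$, which satisfies the $H^*$-axioms; there $\|L_u\|$ is the largest singular value of $u$. The matrix `u`, the helper names and the use of power iteration are all assumptions made for this sketch.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def hs_norm(a):
    """Hilbert-Schmidt (Frobenius) norm, the Hilbert space norm of the algebra."""
    return math.sqrt(sum(v * v for row in a for v in row))

# Hypothetical element u; for a real matrix u* is the transpose.
u = [[1.0, 1.0], [0.0, 1.0]]
utu = matmul(transpose(u), u)

# Power iteration for the dominant unit eigenvector v of u*u.
v = [1.0, 0.0]
for _ in range(200):
    w = [utu[0][0] * v[0] + utu[0][1] * v[1],
         utu[1][0] * v[0] + utu[1][1] * v[1]]
    n = math.hypot(w[0], w[1])
    v = [w[0] / n, w[1] / n]

# Rayleigh quotient gives lambda^2, the largest eigenvalue of u*u.
lam_sq = sum(v[i] * (utu[i][0] * v[0] + utu[i][1] * v[1]) for i in range(2))
lam = math.sqrt(lam_sq)

# e has v as first column and zero second column, so its norm is 1.
e = [[v[0], 0.0], [v[1], 0.0]]
lhs = matmul(utu, e)
residual = max(abs(lhs[i][j] - lam_sq * e[i][j])
               for i in range(2) for j in range(2))
print(lam_sq, hs_norm(e), residual)
```

In this toy setting the element `e` plays the role of the $e$ of the theorem: `residual` measures how closely $u^*ue = \lambda^2 e$ holds, and `hs_norm(matmul(u, e))` agrees with `lam` up to rounding.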
Example 1. Let $\mathcal{H}$ denote the Hilbert space of square summable sequences with component-wise multiplication. Then $\mathcal{H}$ is a commutative semi-simple Banach algebra with discrete maximal ideal space. If we let $u = \{u_n\}$ be a sequence of complex numbers converging to 0, then $L_u : x = \{x_n\} \mapsto ux = \{u_nx_n\}$ is a non-zero compact linear operator on $\mathcal{H}$. Therefore $u$ is a non-trivial left compact element of $\mathcal{H}$, and $\mathrm{Sp}_{\mathcal{H}}(u) = \{u_n : n \text{ is a positive integer}\} \cup \{0\}$ (Ref. 2, p. 80), where $\mathrm{Sp}_{\mathcal{H}}(u)$ denotes the spectrum of $u$.

Example 2. Consider the algebra $L^2(G)$ of all complex valued square integrable functions on the compact topological group $G$. Here the multiplication operation is convolution
\[ (f * g)(s) = \int_G f(st^{-1})g(t)\,d\mu(t), \]
where $\mu$ is the normalized Haar measure on $G$ (so that $\int_G d\mu = 1$). The involution is given by $f^*(t) = \overline{f(t^{-1})}$. Then $L^2(G)$ is a semi-simple $H^*$-algebra (Ref. 8, p. 330), and it is well-known (Ref. 8, p. 284) that each element of $L^2(G)$ is (left) compact.

Example 3. Let $A$ be a Banach algebra and let $u \in A$ be a nilpotent element of $A$. Then $u$ is power compact, i.e., there exists a positive integer $n$ such that $x \mapsto u^nx$ ($x \in A$) is a compact linear operator on $A$. To see this, let $u$ be a nilpotent element of step $n$, i.e., $u^n = 0$. Then the principal ideal $u^nA$ generated by $u^n$ is the singleton $\{0\} \subset A$. Thus $\dim(u^nA) = 0 < \infty$, showing that $u^n$ is left compact.
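The compactness claimed in Example 1 can be made tangible with a small numerical sketch. It relies on the standard fact that a diagonal multiplier on square summable sequences has operator norm $\sup_n |u_n|$, so the distance from $L_u$ to its truncation keeping only the first $N$ entries is $\sup_{n \ge N} |u_n|$; when this tends to 0, $L_u$ is a norm limit of finite-rank operators. The sequence $u_n = 1/(n+1)$ and the cut-off at 10,000 terms are assumptions chosen only for illustration.

```python
def tail_sup(u, N):
    """sup of |u_n| over n >= N, i.e. the operator norm of L_u minus
    the finite-rank truncation that keeps only the first N entries."""
    return max((abs(a) for a in u[N:]), default=0.0)

# Hypothetical multiplier u_n = 1/(n+1), cut off at 10_000 terms for the demo.
u = [1.0 / (n + 1) for n in range(10_000)]

# Approximation errors of the rank-N truncations for growing N.
errors = [tail_sup(u, N) for N in (1, 10, 100, 1000)]
print(errors)
```

The errors decrease towards 0 as $N$ grows, which is the finite-dimensional shadow of the statement that $u$ is a left compact element.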
References
1. A. W. Tullo, Conditions on Banach algebras which imply finite dimensionality, Proc. Edinburgh Math. Soc., Vol. 20 (Series II), part 1, 1-5 (1976).
2. H. Kamowitz, On compact multipliers of Banach algebras, Proc. Amer. Math. Soc., Vol. 18, No. 1, 79-80 (1981).
3. J. Dixmier, C*-Algebras (North Holland Publishing Company, Amsterdam, 1977).
4. F. F. Bonsall and J. Duncan, Complete Normed Algebras (Springer-Verlag, Berlin, 1973).
5. F. F. Bonsall, Compact linear operators from an algebraic standpoint, Glasgow Math. J. 8, 41-49 (1967).
6. C. A. McCarthy, $c_p$, Israel J. Math. 5, 249-271 (1967).
7. R. Schatten, Norm Ideals of Completely Continuous Operators (Springer-Verlag, Berlin, 1960).
8. C. E. Rickart, General Theory of Banach Algebras (Robert E. Krieger Publishing Company, Huntington, New York, 1974).
9. H. Nakano, Hilbert algebras, Tohoku Math. J. 2, 4-23 (1950).
10. L. C. Grove, A generalized group algebra of compact groups, Studia Math. XXVI, 73-90 (1965).
11. M. Freundlich, Completely continuous elements of a normed ring, Duke Math. J. 16, 273-283 (1949).
12. S. Sakai, C*-Algebras and W*-Algebras (Springer-Verlag, Berlin, 1971).
APPLICATION OF THE ADOMIAN DECOMPOSITION METHOD TO SOLVE THE DUFFING EQUATION AND COMPARISON WITH THE PERTURBATION METHOD

GABRIEL BISSANGA
University Marien Ngouabi, Faculty of Sciences, Department of Mathematics, B.P. 69 Brazzaville, Republic of Congo
e-mail: [email protected]

In this paper, the Adomian Decomposition Method (ADM) is used to study the Duffing equation. The series solution is constructed and compared with the solution obtained by the perturbation method.

Keywords: Duffing equation, Adomian Decomposition Method, regular perturbation.
1. Introduction

The Duffing equation is used in many areas: physics, mechanics, astronomy, etc. Because of its nonlinearity, it is difficult to find the exact solution; often it is solved by numerical methods. The Adomian Decomposition Method (ADM) is very useful for obtaining an approximation of the solution. First, we construct the solution by the ADM. Then, we apply the regular perturbation method (Ref. 1), and finally compare the two methods.

2. The Adomian Decomposition Method

2.1. About the Adomian Decomposition Method

Suppose that we need to solve the following equation
\[ Au = f, \tag{1} \]
in a real Hilbert space $H$, where $A : H \to H$ is a linear or a nonlinear operator, $f \in H$ and $u$ is the unknown. The principle of the ADM is based
on the decomposition of the nonlinear operator $A$ in the following form (Refs. 2, 6):
\[ A = L + R + N, \]
where $L + R$ is linear, $N$ is nonlinear, and $L$ is invertible with inverse $L^{-1}$. Using that decomposition, Eq. (1) is equivalent to
\[ u = \theta + L^{-1}f - L^{-1}Ru - L^{-1}Nu, \tag{2} \]
where $\theta$ obeys $L\theta = 0$. Equation (2) is called the Adomian fundamental equation or Adomian's canonical form. We look for the solution of (1) in a series expansion of the form $u = \sum_{n=0}^{+\infty}u_n$ and we consider $Nu = \sum_{n=0}^{+\infty}A_n$, where the $A_n$ are special polynomials of the variables $u_0, u_1, \ldots, u_n$ called Adomian polynomials and defined by (Refs. 2-5)
\[ A_n = \frac{1}{n!}\frac{d^n}{d\lambda^n}\left[N\left(\sum_{i=0}^{+\infty}\lambda^iu_i\right)\right]_{\lambda=0}, \quad n = 0, 1, 2, \ldots, \]
where $\lambda$ is a parameter used for "convenience". Thus Eq. (2) can be rewritten as follows:
\[ \sum_{n=0}^{+\infty}u_n = \theta + L^{-1}f - L^{-1}R\left(\sum_{n=0}^{+\infty}u_n\right) - L^{-1}\left(\sum_{n=0}^{+\infty}A_n\right). \tag{3} \]
We suppose that the series $\sum_{n=0}^{+\infty}u_n$ and $\sum_{n=0}^{+\infty}A_n$ are convergent, and obtain by identification the Adomian algorithm
\[ \begin{cases} u_0 = \theta + L^{-1}f, \\ u_1 = -L^{-1}(Ru_0) - L^{-1}A_0, \\ \quad\vdots \\ u_{n+1} = -L^{-1}(Ru_n) - L^{-1}A_n. \end{cases} \tag{4} \]
In practice it is often difficult to calculate all the terms of an Adomian series, so we approximate the series solution by the truncated series $u \approx \sum_{i=0}^{n}u_i$, where the choice of $n$ depends on error requirements.

2.2. Application of the Adomian Method to solve the Duffing equation

The general form of the Duffing equation is
\[ a\frac{d^2x(t)}{dt^2} + b\frac{dx(t)}{dt} + cx(t) + dx^3(t) = f(t), \tag{5} \]
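where $a$, $b$, $c$, $d$ are given real constants, $x(t)$ is some physical quantity, $f(t)$ a given function and $t$ stands for time. Here we examine the following initial value problem for the nonlinear spring (Ref. 1),
\[ \frac{d^2x(t)}{dt^2} + x(t) + \varepsilon x^3(t) = 0, \tag{6} \]
with the initial conditions
\[ x(0) = 1, \quad \frac{dx(0)}{dt} = 0, \tag{7} \]
where $0 < \varepsilon \ll 1$. We consider Eq. (6). Let us take
\[ Lx(t) = \frac{d^2x(t)}{dt^2}, \quad L^{-1}x(t) = \int_0^t\!\!\int_0^s x(z)\,dz\,ds, \quad Nx = x^3, \]
and suppose that
\[ x(t) = \sum_{n=0}^{+\infty}x_n(t). \tag{8} \]
From (6) we have
\[ x(t) = x(0) + x'(0)\int_0^t ds - \int_0^t\!\!\int_0^s x(z)\,dz\,ds - \varepsilon\int_0^t\!\!\int_0^s x^3(z)\,dz\,ds. \]
We thus obtain the Adomian algorithm
\[ \begin{cases} x_0(t) = x(0) + x'(0)\int_0^t ds = 1, \\ x_{n+1}(t) = -\int_0^t\!\int_0^s x_n(z)\,dz\,ds - \varepsilon\int_0^t\!\int_0^s A_n(z)\,dz\,ds, \quad \forall n \ge 0, \end{cases} \tag{9} \]
which gives us
\[ A_0(t) = N(x_0) = 1, \]
\[ x_1(t) = -\tfrac{1}{2}(1+\varepsilon)t^2, \qquad A_1(t) = -\tfrac{3}{2}(1+\varepsilon)t^2, \]
\[ x_2(t) = \tfrac{1}{24}(1+\varepsilon)(1+3\varepsilon)t^4, \qquad A_2(t) = \tfrac{1}{8}(1+\varepsilon)(7+9\varepsilon)t^4, \]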
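\[ x_3(t) = -\tfrac{1}{720}(1+\varepsilon)(1+24\varepsilon+27\varepsilon^2)t^6, \qquad A_3(t) = -\tfrac{1}{240}(1+\varepsilon)(61+204\varepsilon+147\varepsilon^2)t^6, \]
\[ x_4(t) = \tfrac{1}{40320}(1+\varepsilon)(1+207\varepsilon+639\varepsilon^2+441\varepsilon^3)t^8, \]
etc. Thus using (9) we can calculate each term of (8).

3. The Regular Perturbation Method

We rewrite (6) and (7) in the following form,
\[ \frac{d^2u(t)}{dt^2} + u(t) + \varepsilon u^3(t) = 0, \tag{10} \]
\[ u(0) = 1, \quad \frac{du(0)}{dt} = 0. \tag{11} \]
Let us suppose that the solution $u(t)$ of the initial value problem (10)-(11) has an expression of the following form (Ref. 1),
\[ u(t) = \sum_{n=0}^{+\infty}u_n(t)\varepsilon^n. \tag{12} \]
Substituting (12) into (10)-(11), and collecting equal powers of $\varepsilon$, we obtain a linear system of recurrent initial value problems for $u_n(t)$, $n = 0, 1, 2, \ldots$,
\[ \frac{d^2u_0(t)}{dt^2} + u_0(t) = 0, \quad u_0(0) = 1, \quad \frac{du_0(0)}{dt} = 0, \tag{13} \]
\[ \frac{d^2u_1(t)}{dt^2} + u_1(t) = -u_0^3(t), \quad u_1(0) = 0, \quad \frac{du_1(0)}{dt} = 0, \tag{14} \]
\[ \frac{d^2u_2(t)}{dt^2} + u_2(t) = -3u_0^2(t)u_1(t), \quad u_2(0) = 0, \quad \frac{du_2(0)}{dt} = 0, \tag{15} \]
etc. From (13), we have $u_0(t) = \cos t$. From (14), we have
\[ u_1(t) = -\tfrac{1}{32}\cos t - \tfrac{3}{8}t\sin t + \tfrac{1}{32}\cos 3t. \]
From (15), we have
\[ u_2(t) = \tfrac{23}{1024}\cos t + \tfrac{3}{32}t\sin t - \tfrac{9}{128}t^2\cos t - \tfrac{3}{128}\cos 3t - \tfrac{9}{256}t\sin 3t + \tfrac{1}{1024}\cos 5t. \]
Etc.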
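The perturbation terms of this section can be checked numerically. The sketch below takes closed forms for $u_0$, $u_1$ and $u_2$ obtained by solving (13)-(15) by hand (they are a reading of the formulas above, re-derived here, and should be treated as an assumption of the sketch) and evaluates the residual of each equation with a central finite difference. The step `h` and the sample times are arbitrary choices.

```python
import math

def u0(t):
    return math.cos(t)

def u1(t):
    return -math.cos(t) / 32 - 3 * t * math.sin(t) / 8 + math.cos(3 * t) / 32

def u2(t):
    return (23 * math.cos(t) / 1024 + 3 * t * math.sin(t) / 32
            - 9 * t * t * math.cos(t) / 128 - 3 * math.cos(3 * t) / 128
            - 9 * t * math.sin(3 * t) / 256 + math.cos(5 * t) / 1024)

def second_derivative(f, t, h=1e-3):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

def residuals(t):
    """Left side minus right side of (13), (14) and (15) at time t."""
    r0 = second_derivative(u0, t) + u0(t)
    r1 = second_derivative(u1, t) + u1(t) + u0(t) ** 3
    r2 = second_derivative(u2, t) + u2(t) + 3 * u0(t) ** 2 * u1(t)
    return r0, r1, r2

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, residuals(t))
```

All three residuals stay at the level of the finite-difference error, and the initial values $u_1(0) = u_2(0) = 0$ are reproduced, which supports the closed forms used in the sketch.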
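4. Comparison of the Two Methods

It is easy to see that in (8) we have obtained $x_n(t) = P_n(\varepsilon)t^{2n}$, where $P_n(\varepsilon)$ is a polynomial in $\varepsilon$ of degree $n$. Suppose that
\[ x(t) = x_0(t) + x_1(t) + x_2(t). \tag{16} \]
To compare (16) with the solution obtained by the regular perturbation method, we must suppose in (10) that $u(t) = u_0(t) + u_1(t)\varepsilon + u_2(t)\varepsilon^2 + o(\varepsilon^2)$ and expand $u_0(t)$, $u_1(t)$, $u_2(t)$ in Maclaurin series in the neighbourhood of $t = 0$ up to order 4. One obtains
\[ u_0(t) = 1 - \frac{t^2}{2} + \frac{t^4}{24} + o(t^4), \quad u_1(t) = -\frac{t^2}{2} + \frac{t^4}{6} + o(t^4), \quad u_2(t) = \frac{t^4}{8} + o(t^4). \]
Thus
\[ \begin{aligned} X(t) &= x_0(t) + x_1(t) + x_2(t) \\ &= 1 - \tfrac{1}{2}(1+\varepsilon)t^2 + \tfrac{1}{24}(1+\varepsilon)(1+3\varepsilon)t^4 \\ &= 1 - \frac{t^2}{2} - \varepsilon\frac{t^2}{2} + \tfrac{1}{24}(1+4\varepsilon+3\varepsilon^2)t^4 \\ &= \left(1 - \frac{t^2}{2} + \frac{t^4}{24}\right) + \left(-\frac{t^2}{2} + \frac{t^4}{6}\right)\varepsilon + \frac{t^4}{8}\varepsilon^2 \\ &= u_0(t) + u_1(t)\varepsilon + u_2(t)\varepsilon^2 + o(t^4) = u(t) + o(t^4) + o(\varepsilon^2). \end{aligned} \]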
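The agreement derived above can also be observed numerically. The following sketch compares the truncated Adomian sum $x_0 + x_1 + x_2$ with the perturbation sum $u_0 + u_1\varepsilon + u_2\varepsilon^2$ at a few small times; the closed forms of $u_1$ and $u_2$ are those obtained by solving (14) and (15) by hand and are an assumption of the sketch, as are the value $\varepsilon = 0.01$ and the sample times.

```python
import math

def adm_sum(t, eps):
    """Truncated Adomian series x_0 + x_1 + x_2."""
    return 1 - (1 + eps) * t**2 / 2 + (1 + eps) * (1 + 3 * eps) * t**4 / 24

def perturbation_sum(t, eps):
    """Truncated perturbation series u_0 + eps*u_1 + eps^2*u_2."""
    u0 = math.cos(t)
    u1 = -math.cos(t) / 32 - 3 * t * math.sin(t) / 8 + math.cos(3 * t) / 32
    u2 = (23 * math.cos(t) / 1024 + 3 * t * math.sin(t) / 32
          - 9 * t * t * math.cos(t) / 128 - 3 * math.cos(3 * t) / 128
          - 9 * t * math.sin(3 * t) / 256 + math.cos(5 * t) / 1024)
    return u0 + eps * u1 + eps**2 * u2

eps = 0.01
times = (0.05, 0.1, 0.2)
gaps = [abs(adm_sum(t, eps) - perturbation_sum(t, eps)) for t in times]
print(gaps)
```

Doubling $t$ multiplies the gap by roughly $2^6$, consistent with the two expressions agreeing through order $t^4$ and first differing at order $t^6$.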
5. Numerical Analysis

In the following, we replace in (6) $x(t)$ by the truncated series solution $X_n(t) = \sum_{i=0}^{n}x_i(t)$ obtained by the Adomian method and analyse the error for $\varepsilon = 0.01$. We suppose that $L(x) = \frac{d^2x}{dt^2} + x(t) + \varepsilon x^3(t)$. We then have
\[ L(X_1) = -1.2879 \times 10^{-3}t^6 + 7.6508 \times 10^{-3}t^4 - 0.52015\,t^2, \]
\[ L(X_2) = 8.1441 \times 10^{-7}t^{12} - 2.8465 \times 10^{-5}t^{10} + 3.8799 \times 10^{-4}t^8 - 2.6013 \times 10^{-3}t^6 + 5.2297 \times 10^{-2}t^4, \]
\[ L(X_3) = -5.2974 \times 10^{-11}t^{18} + 3.9517 \times 10^{-9}t^{16} - 1.443 \times 10^{-7}t^{14} + 3.1951 \times 10^{-6}t^{12} - 4.6335 \times 10^{-5}t^{10} + 4.4081 \times 10^{-4}t^8 - 4.3968 \times 10^{-3}t^6, \]
\[ L(X_4) = 4.84 \times 10^{-15}t^{24} - 3.2238 \times 10^{-13}t^{22} + 1.5174 \times 10^{-11}t^{20} - 5.0233 \times 10^{-10}t^{18} + 1.2709 \times 10^{-8}t^{16} - 2.5563 \times 10^{-7}t^{14} + 4.0 \times 10^{-6}t^{12} - 4.8714 \times 10^{-5}t^{10} + 5.2168 \times 10^{-4}t^8. \]
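The recursion (9) and the residuals listed above can be reproduced with exact rational arithmetic. The sketch below is one possible implementation, not the author's: polynomials in $t$ are stored as coefficient lists, $L^{-1}$ is the double integration from 0, and the Adomian polynomials of the cubic nonlinearity are computed from the identity $A_n = \sum_{i+j+k=n}x_ix_jx_k$, which follows from the definition of $A_n$ for $N(x) = x^3$. All function names are hypothetical.

```python
from fractions import Fraction

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_scale(c, p):
    return [c * a for a in p]

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_twice(p):
    """L^{-1}: t^k -> t^(k+2) / ((k+1)(k+2)), integrating from 0 twice."""
    return [Fraction(0), Fraction(0)] + [a / ((k + 1) * (k + 2))
                                         for k, a in enumerate(p)]

def second_derivative(p):
    return [a * k * (k - 1) for k, a in enumerate(p)][2:] or [Fraction(0)]

def adomian_cubic(xs, n):
    """A_n for N(x) = x^3: sum of x_i * x_j * x_k over i + j + k = n."""
    total = [Fraction(0)]
    for i in range(n + 1):
        for j in range(n + 1 - i):
            k = n - i - j
            total = poly_add(total, poly_mul(poly_mul(xs[i], xs[j]), xs[k]))
    return total

def adm_terms(eps, count):
    """First `count` terms x_0, x_1, ... of the Adomian series from (9)."""
    xs = [[Fraction(1)]]  # x_0(t) = 1
    for n in range(count - 1):
        a_n = adomian_cubic(xs, n)
        xs.append(poly_add(poly_scale(-1, integrate_twice(xs[n])),
                           poly_scale(-eps, integrate_twice(a_n))))
    return xs

def residual(xs, n, eps):
    """Coefficients of L(X_n) = X_n'' + X_n + eps * X_n^3."""
    X = [Fraction(0)]
    for x in xs[:n + 1]:
        X = poly_add(X, x)
    cube = poly_mul(poly_mul(X, X), X)
    return poly_add(poly_add(second_derivative(X), X), poly_scale(eps, cube))

eps = Fraction(1, 100)
xs = adm_terms(eps, 5)
r1 = residual(xs, 1, eps)
print([float(c) for c in r1])
```

With `eps = 1/100`, the non-zero coefficients of `r1` are those of $t^2$, $t^4$ and $t^6$ in $L(X_1)$, and `residual(xs, n, eps)` for larger `n` gives the coefficients of $L(X_n)$ in the same way.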
We remark that when $n$ increases we obtain a good approximate solution. It follows that, for example, if the required error is of the order of $10^{-3}$, in the neighbourhood of $t = 0$, $X_4$ is an approximate solution. Hence, the Adomian algorithm yields an approximate solution after only 4 iterations.

6. Conclusion

After comparison, we remark that the solution obtained by the ADM is a particular case of the solution obtained by the regular perturbation method. The ADM solution is the regular perturbation solution in the neighbourhood of $t = 0$. The existence and the convergence of the series solution of (10)-(11) in a bounded segment $0 \le t \le T$ are issues which are addressed in quite a number of papers such as Ref. 1, pp. 28-38, so that from the convergence of (12) we can deduce the convergence of (8); in practice, calculating the terms of (8) is less difficult than calculating the terms of (12). Thus, to study in the neighbourhood of $t = 0$ some initial value problems for differential equations of the type $\frac{d^2x}{dt^2} + x = \varepsilon f\left(x, \frac{dx}{dt}\right)$, with $f(u, v)$ continuously differentiable with respect to $u$ and $v$ and $\varepsilon$ a small parameter, we can use the ADM.

References
1. E. M. De Jager and Jiang Fu Ru, The Theory of Singular Perturbations, North-Holland Series in Applied Mathematics and Mechanics, Vol. 42 (North Holland, Amsterdam, 1996).
2. K. Abbaoui, Les Fondements de la Methode Decompositionnelle d'Adomian et application a la resolution de problemes issus de la biologie et de la medecine, Ph.D. Thesis, unpublished, University Paris VI (France, October 1995).
3. K. Abbaoui and Y. Cherruault, Convergence of Adomian's Method applied to differential equations, Computers Math. Applic. 28(5), 103-109 (1994).
4. K. Abbaoui and Y. Cherruault, Convergence of Adomian's Method applied to non linear equations, Mathematical and Computer Modelling 20(9), 69-73 (1994).
5. K. Abbaoui and Y. Cherruault, The Decomposition Method applied to the Cauchy problem, Kybernetes 28, 68-74 (1999).
6. N. Ngarhasta, B. Some, K. Abbaoui and Y. Cherruault, New numerical study of Adomian method applied to a diffusion model, Kybernetes 31, 61-75 (2002).
HERMITE INTERPOLATION POLYNOMIAL OF SEVERAL REAL VARIABLES

J. DZOUMBA, D. MOUKOKO and YUMBA NKASA
University Marien Ngouabi, Faculty of Sciences, Department of Mathematics, B.P. 69 Brazzaville, Republic of Congo
E-mail: moukokodani@yahoo.fr

In this paper, we define the Hermite interpolation polynomial of several real variables.
1. Introduction

For $k \ge 2$, there exist many $k$ real variable polynomials of smaller degree with respect to each variable which interpolate the function $f : \mathbb{R}^k \to \mathbb{R}$ and whose first partial derivatives interpolate the corresponding partial derivatives of $f$. Among these polynomials, $H$ is the unique remainder when any of these polynomials is divided by the $k$ real variable polynomial $\pi$. The other polynomials can be written as a sum of $H$ and another polynomial such that the latter and its first partial derivatives vanish at the interpolation points. The quantity $H$ can be called the $k$ real variable Hermite interpolation polynomial of the function $f$. In this paper, we give a proof for two real variable functions. This result can be extended to several real variable functions.
2. Hermite Interpolation Polynomial of Two Variables

Let $(x_i)_{i=0,\ldots,n}$ and $(y_j)_{j=0,\ldots,m}$ be respectively $n + 1$ and $m + 1$ distinct reals, and $f : \mathbb{R}^2 \to \mathbb{R}$ a function such that $f(x_i, y_j)$, $\frac{\partial f}{\partial x}(x_i, y_j)$, $\frac{\partial f}{\partial y}(x_i, y_j)$ exist for $i = 0, \ldots, n$ and $j = 0, \ldots, m$.
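Notations.
a) $I = \{0, \ldots, n\}$ and $J = \{0, \ldots, m\}$.
b) $\forall i \in I$, $X_i(x) = \prod_{j=0,\, j \ne i}^{n}\frac{x - x_j}{x_i - x_j}$, and $\forall j \in J$, $Y_j(y) = \prod_{i=0,\, i \ne j}^{m}\frac{y - y_i}{y_j - y_i}$. We recall that $X_i(x_j) = 1$ if $i = j$, and 0 else.
c) $\forall (i, j) \in I \times J$, $f_{ij} = f(x_i, y_j)$, $\frac{\partial f_{ij}}{\partial x} = \frac{\partial f}{\partial x}(x_i, y_j)$, $\frac{\partial f_{ij}}{\partial y} = \frac{\partial f}{\partial y}(x_i, y_j)$.
d) If $p$ is a polynomial of real variables $x$ and $y$, $d^\circ_x p$ (resp. $d^\circ_y p$) denotes the degree of $p$ with respect to the variable $x$ (resp. $y$). The degree of $p$ will be denoted by the pair $(d^\circ_x p, d^\circ_y p)$.
e) $\pi_x(x) = \prod_{i=0}^{n}(x - x_i)$, $\pi_y(y) = \prod_{j=0}^{m}(y - y_j)$ and $\pi(x, y) = \pi_x(x)\pi_y(y)$.

We shall suppose that the function $f$ used in this section has first partial derivatives in the interpolation domain.

Lemma 2.1. The two real variable polynomials $(U_{ij})_{(i,j) \in I \times J}$ of degree $(2n + 1, 2m + 1)$ defined by, $\forall (i, j) \in I \times J$,
\[ U_{ij}(x, y) = \left[-1 + \left(1 - 2X_i'(x_i)(x - x_i)\right)X_i(x) + \left(1 - 2Y_j'(y_j)(y - y_j)\right)Y_j(y)\right]X_i(x)Y_j(y), \]
are such that, $\forall (l, k) \in I \times J$,
\[ U_{ij}(x_l, y_k) = \begin{cases} 1 & \text{if } l = i \text{ and } k = j, \\ 0 & \text{else,} \end{cases} \qquad \frac{\partial U_{ij}}{\partial x}(x_l, y_k) = 0 \quad \text{and} \quad \frac{\partial U_{ij}}{\partial y}(x_l, y_k) = 0. \]

Proof. Direct by construction of the two real variable polynomials $U_{ij}$. □

Lemma 2.2. The two real variable polynomials $(V_{ij})_{(i,j) \in I \times J}$ of degree $(2n + 1, m)$ defined by $V_{ij}(x, y) = (x - x_i)X_i^2(x)Y_j(y)$, $\forall (i, j) \in I \times J$, are such that, $\forall (l, k) \in I \times J$, $V_{ij}(x_l, y_k) = 0$,
\[ \frac{\partial V_{ij}}{\partial x}(x_l, y_k) = \begin{cases} 1 & \text{if } l = i \text{ and } k = j, \\ 0 & \text{else,} \end{cases} \qquad \text{and} \qquad \frac{\partial V_{ij}}{\partial y}(x_l, y_k) = 0. \]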
Proof. Direct by construction of the two real variable polynomials $V_{ij}$. □

Lemma 2.3. The two real variable polynomials $(W_{ij})_{(i,j) \in I \times J}$ of degree $(n, 2m + 1)$ defined by $W_{ij}(x, y) = (y - y_j)X_i(x)Y_j^2(y)$, $\forall (i, j) \in I \times J$, are such that $W_{ij}(x_l, y_k) = 0$ and $\frac{\partial W_{ij}}{\partial x}(x_l, y_k) = 0$, $\forall (l, k) \in I \times J$, and
\[ \frac{\partial W_{ij}}{\partial y}(x_l, y_k) = \begin{cases} 1 & \text{if } l = i \text{ and } k = j, \\ 0 & \text{else.} \end{cases} \]

Proof. Direct by construction of the two real variable polynomials $W_{ij}$. □

Lemma 2.4. The above three classes of two real variable polynomials $(U_{ij})_{(i,j) \in I \times J}$, $(V_{ij})_{(i,j) \in I \times J}$ and $(W_{ij})_{(i,j) \in I \times J}$ are linearly independent.

Proof. Let
\[ T = \sum_{i=0}^{n}\sum_{j=0}^{m}\alpha_{ij}U_{ij} + \sum_{i=0}^{n}\sum_{j=0}^{m}\beta_{ij}V_{ij} + \sum_{i=0}^{n}\sum_{j=0}^{m}\gamma_{ij}W_{ij} = 0. \]
Then, $\forall (l, k) \in I \times J$,
\[ T(x_l, y_k) = \alpha_{lk} = 0, \quad \frac{\partial T}{\partial x}(x_l, y_k) = \beta_{lk} = 0, \quad \frac{\partial T}{\partial y}(x_l, y_k) = \gamma_{lk} = 0. \]
We have proved that the real coefficients $(\alpha_{ij})_{(i,j) \in I \times J}$, $(\beta_{ij})_{(i,j) \in I \times J}$ and $(\gamma_{ij})_{(i,j) \in I \times J}$ are equal to zero. Hence, the polynomials $U_{ij}$, $V_{ij}$ and $W_{ij}$ are linearly independent. □

Lemma 2.5. The above three classes of two real variable polynomials $(U_{ij})_{(i,j) \in I \times J}$, $(V_{ij})_{(i,j) \in I \times J}$ and $(W_{ij})_{(i,j) \in I \times J}$ form a basis for the vector subspace $P$ of the two real variable polynomial vector space. This subspace is of dimension $3(n + 1)(m + 1)$.

Proof. These two real variable polynomials are linearly independent and their number is $3(n + 1)(m + 1)$. Therefore they span a subspace of dimension $3(n + 1)(m + 1)$. □

Remark 2.1. $P$ is also a subspace of the $(2n + 1, 2m + 1)$-th degree polynomial subspace.
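Theorem 2.1. The two real variable polynomial
\[ p(x, y) = \sum_{i=0}^{n}\sum_{j=0}^{m}f_{ij}U_{ij}(x, y) + \sum_{i=0}^{n}\sum_{j=0}^{m}\frac{\partial f_{ij}}{\partial x}V_{ij}(x, y) + \sum_{i=0}^{n}\sum_{j=0}^{m}\frac{\partial f_{ij}}{\partial y}W_{ij}(x, y) \]
of degree $(2n + 1, 2m + 1)$ is such that, $\forall (l, k) \in I \times J$,
\[ p(x_l, y_k) = f_{lk}, \quad \frac{\partial p}{\partial x}(x_l, y_k) = \frac{\partial f_{lk}}{\partial x}, \quad \frac{\partial p}{\partial y}(x_l, y_k) = \frac{\partial f_{lk}}{\partial y}. \tag{1} \]

Proof. Direct by construction of the two real variable polynomials $(U_{ij})_{(i,j) \in I \times J}$, $(V_{ij})_{(i,j) \in I \times J}$ and $(W_{ij})_{(i,j) \in I \times J}$. □

Corollary 2.1. The two real variable polynomial $p \in P$ which satisfies (1) may be written as
\[ \begin{aligned} p(x, y) = {} & \sum_{i=0}^{n}\sum_{j=0}^{m}f_{ij}\left[-1 + X_i(x)\left(1 - 2X_i'(x_i)(x - x_i)\right) + Y_j(y)\left(1 - 2Y_j'(y_j)(y - y_j)\right)\right]X_i(x)Y_j(y) \\ & + \sum_{i=0}^{n}\sum_{j=0}^{m}\frac{\partial f_{ij}}{\partial x}(x - x_i)X_i^2(x)Y_j(y) + \sum_{i=0}^{n}\sum_{j=0}^{m}\frac{\partial f_{ij}}{\partial y}(y - y_j)X_i(x)Y_j^2(y). \end{aligned} \]

Proof. It is enough to write $p$ as a linear combination with coefficients $f_{ij}$, $\frac{\partial f_{ij}}{\partial x}$ and $\frac{\partial f_{ij}}{\partial y}$. □

Proposition 2.1. If a two real variable polynomial $p$ satisfies (1) then,
1) $\forall j \in J$, $p(\cdot, y_j)$ and $\frac{\partial p}{\partial x}(\cdot, y_j)$ interpolate, respectively, $f(\cdot, y_j)$ and $\frac{\partial f}{\partial x}(\cdot, y_j)$ at the points $(x_i)_{i=0,\ldots,n}$.
2) $\forall i \in I$, $p(x_i, \cdot)$ and $\frac{\partial p}{\partial y}(x_i, \cdot)$ interpolate, respectively, $f(x_i, \cdot)$ and $\frac{\partial f}{\partial y}(x_i, \cdot)$ at the points $(y_j)_{j=0,\ldots,m}$.

Proof. The proof is straightforward. □

We shall use the following important result (Refs. 1, 2) about the one real variable Hermite interpolation polynomial of a one real variable function.

Theorem 2.2. The one variable Hermite interpolation polynomial of the differentiable function $g : \mathbb{R} \to \mathbb{R}$ at the points $(x_i)_{i=0,\ldots,n}$ is of degree $2n + 1$.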
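Corollary 2.2. If a two real variable polynomial $p$ satisfies (1) then one has $d^\circ_x p \ge 2n + 1$ and $d^\circ_y p \ge 2m + 1$.

Proof. 1) $\forall j \in J$, the one real variable polynomials $p(\cdot, y_j)$ and $\frac{\partial p}{\partial x}(\cdot, y_j)$ interpolate, respectively, $f(\cdot, y_j)$ and $\frac{\partial f}{\partial x}(\cdot, y_j)$ at the points $(x_i)_{i=0,\ldots,n}$. [...]

Lemma 2.6. $\forall i \in I$,
\[ X_i^2(x) = X_i(x)\left[-\sum_{k=0,\, k \ne i}^{n}X_k(x)\right] + X_i(x). \tag{2} \]

Proof. We know that $\sum_{i=0}^{n}X_i(x) = 1$. It follows that, $\forall i \in I$,
\[ X_i^2(x) + X_i(x)\left[\sum_{k=0,\, k \ne i}^{n}X_k(x)\right] = X_i(x), \]
and therefore we obtain (2). □

Lemma 2.7. $\forall i \in I$, there exists one $(n - 1)$-th degree polynomial $r_i$ such that
\[ X_i^2(x) = \pi_x(x)\,r_i(x) + X_i(x). \tag{3} \]

Proof. It is enough to use the preceding lemma and to note that, $\forall k \in I$, $k \ne i$, $X_k(x_i) = 0$. □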
Remark 2.2. The polynomials $(Y_j)$ satisfy the following relations of the form (2) and (3): $\forall j \in J$,
\[ Y_j^2(y) = Y_j(y)\left[-\sum_{k=0,\, k \ne j}^{m}Y_k(y)\right] + Y_j(y), \tag{4} \]
and
\[ Y_j^2(y) = \pi_y(y)\,q_j(y) + Y_j(y), \tag{5} \]
where $q_j$ is a polynomial of degree $m - 1$.

Proposition 2.2. The two real variable polynomial $p$ may be written in the form
\[ p(x, y) = p_{00}(x, y) + \pi_x(x)p_{10}(x, y) + \pi_y(y)p_{01}(x, y), \]
where $p_{00}$, $p_{10}$ and $p_{01}$ are two real variable polynomials of degree $(n, m)$.

Proof. It is enough to use (3) and (5) and to remember that, $\forall i \in I$, $(x - x_i)X_i(x) = \pi_x(x)/\pi_x'(x_i)$, and, $\forall j \in J$, $(y - y_j)Y_j(y) = \pi_y(y)/\pi_y'(y_j)$. □

Lemma 2.8. Every two real variable polynomial $Q$ of degree $(2n + 1, 2m + 1)$ may be written in the form
\[ Q(x, y) = Q_{00}(x, y) + \pi_x(x)Q_{10}(x, y) + \pi_y(y)Q_{01}(x, y) + \pi_x(x)\pi_y(y)Q_{11}(x, y), \]
where $Q_{00}$, $Q_{10}$, $Q_{01}$, $Q_{11}$ are two real variable polynomials with degrees between 0 and $n$ (resp. 0 and $m$) with respect to the variable $x$ (resp. $y$).

Proof. The proof is straightforward. □

Proposition 2.3. A two real variable polynomial
\[ Q(x, y) = Q_{00}(x, y) + \pi_x(x)Q_{10}(x, y) + \pi_y(y)Q_{01}(x, y) + \pi_x(x)\pi_y(y)Q_{11}(x, y) \]
of degree $(2n + 1, 2m + 1)$ satisfies (1) if and only if
\[ Q_{00}(x, y) + \pi_x(x)Q_{10}(x, y) + \pi_y(y)Q_{01}(x, y) \]
verifies (1).

Proof. The proof is straightforward. □
Theorem 2.3. Every two real variable polynomial Q(x,y) of degree (2n + 1,2m + 1) which satisfies (1) has the form Q(x,y) = p(x,y) + nx(z)ny(y)Qu(z,y). Proof. We have p 0 0 = Qoo because these two real variable polynomials are the interpolation polynomials of / at points (xi,yj)i=0n.i=.0m, and p1Q = Qio (resp. p01 = Q0i) since p10 and Qw (resp. p01 and Qoi) are the interpolation polynomials of the same function at the points ( a ; «iS'j)i=0,n;j=0,m -
D
Remark 2.3. The polynomial p is the remainder obtained by dividing any polynomial which satisfies (1) by ir(x,y) = irx{x)-Ky{y). We give the following definition for the two real variable Hermite interpolation polynomial. Definition 2.1. The two real variable Hermite interpolation polynomial is the remainder obtained by dividing any polynomial which satisfies (1) by •Kx(x)TTy(y).
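This definition implies uniqueness of the two real variable Hermite interpolation polynomial and can help us find properties of interpolation polynomials. Theorem 2.4. The polynomial
$$\begin{aligned}
H_{2n+1,2m+1}(x,y) = {} & \sum_{i=0}^{n}\sum_{j=0}^{m} f_{ij}\Bigl[-1 + X_i(x)\bigl(1 - 2X_i'(x_i)(x - x_i)\bigr) + Y_j(y)\bigl(1 - 2Y_j'(y_j)(y - y_j)\bigr)\Bigr] X_i(x)\,Y_j(y) \\
& + \sum_{i=0}^{n}\sum_{j=0}^{m} \frac{\partial f}{\partial x}(x_i,y_j)\,(x - x_i)\,X_i^2(x)\,Y_j(y) \\
& + \sum_{i=0}^{n}\sum_{j=0}^{m} \frac{\partial f}{\partial y}(x_i,y_j)\,(y - y_j)\,X_i(x)\,Y_j^2(y)
\end{aligned}$$
is the two real variable Hermite interpolation polynomial of $f$ at the points $(x_i,y_j)_{i=0,n;\,j=0,m}$.
Proof. Direct, since this polynomial is equal to $p$. □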
Example 2.1. The two real variable Hermite interpolation polynomial of f(x,y) = x V at the points (0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2) is H5,5(x, y) = 2Myx2 + 31y5x2 + 315?/V - 420yx -
315y2x2
+ 450y2x + 1802/V + 29ix3y - 168yxA -
30y5x.
Remark 2.4. We obtain the following properties of interpolation polynomials.
1) The one real variable interpolation polynomial of $g$ is the (unique) remainder when any polynomial which interpolates $g$ at the points $(x_i)_{i=0,n}$ is divided by $\pi_x(x)$.
2) The one real variable Hermite interpolation polynomial of a differentiable function $g : \mathbb{R} \to \mathbb{R}$ is the remainder in the division by $\pi_x^2(x)$ of any polynomial which interpolates $g$ and whose first derivative interpolates $g'$ at the points $(x_i)_{i=0,n}$.
3) The two real variable interpolation polynomial of $f : \mathbb{R}^2 \to \mathbb{R}$ is the remainder obtained by dividing $v(x,y)$ by $\pi_x(x)$ (resp. $\pi_y(y)$), where $v(x,y)$ is the remainder from the division of any polynomial which interpolates $f$ at the points $(x_i,y_j)_{i=0,n;\,j=0,m}$ by $\pi_y(y)$ (resp. $\pi_x(x)$).
By the same procedure we can prove that the three real variable Hermite interpolation polynomial of $f : \mathbb{R}^3 \to \mathbb{R}$ is
$$\begin{aligned}
H_{2n+1,2m+1,2k+1}(x,y,z) = {} & \sum_{i=0}^{n}\sum_{j=0}^{m}\sum_{l=0}^{k} f_{ijl}\Bigl[-2 + X_i(x)\bigl(1 - 2X_i'(x_i)(x - x_i)\bigr) + Y_j(y)\bigl(1 - 2Y_j'(y_j)(y - y_j)\bigr) \\
& \qquad\qquad + Z_l(z)\bigl(1 - 2Z_l'(z_l)(z - z_l)\bigr)\Bigr] X_i(x)\,Y_j(y)\,Z_l(z) \\
& + \sum_{i=0}^{n}\sum_{j=0}^{m}\sum_{l=0}^{k} \frac{\partial f}{\partial x}(x_i,y_j,z_l)\,(x - x_i)\,X_i^2(x)\,Y_j(y)\,Z_l(z) \\
& + \sum_{i=0}^{n}\sum_{j=0}^{m}\sum_{l=0}^{k} \frac{\partial f}{\partial y}(x_i,y_j,z_l)\,(y - y_j)\,X_i(x)\,Y_j^2(y)\,Z_l(z) \\
& + \sum_{i=0}^{n}\sum_{j=0}^{m}\sum_{l=0}^{k} \frac{\partial f}{\partial z}(x_i,y_j,z_l)\,(z - z_l)\,X_i(x)\,Y_j(y)\,Z_l^2(z).
\end{aligned}$$
This expression may also be written in the form
$$p_{000}(x,y,z) + \pi_x(x)\,p_{100}(x,y,z) + \pi_y(y)\,p_{010}(x,y,z) + \pi_z(z)\,p_{001}(x,y,z),$$
where $p_{000}$, $p_{100}$, $p_{010}$ and $p_{001}$ are of degrees $(n,m,k)$, and the latter is obtained by dividing any polynomial which interpolates $f$ at the points $(x_i,y_j,z_l)$ by the three real variable polynomial
$$\pi(x,y,z) = \pi_x(x)\,\pi_y(y)\,\pi_z(z), \qquad \text{where} \qquad \pi_z(z) = \prod_{l=0}^{k}(z - z_l).$$
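The remainder characterization in Remark 2.4(2) can be checked on a small one-variable case. The sketch below is only an illustration under assumptions of our own: the nodes 0 and 1 and the function $g(x) = x^5$ are chosen for convenience, and the helper names (`poly_divmod`, `poly_mul`, `poly_eval`, `poly_deriv`) are hypothetical. Since $g$ is itself a polynomial that interpolates $g$ and $g'$ at the nodes, its remainder on division by $\pi_x^2(x) = x^2(x-1)^2$ should be the Hermite interpolation polynomial.

```python
from fractions import Fraction

# Polynomials are coefficient lists, highest degree first.

def poly_mul(a, b):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += Fraction(x) * Fraction(y)
    return out

def poly_divmod(num, den):
    """Exact long division; returns (quotient, remainder)."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quot = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quot.append(factor)
        # Subtract factor * den, aligned with the leading term of num.
        for k, d in enumerate(den):
            num[k] -= factor * d
        num.pop(0)  # the leading coefficient is now zero
    return quot, num

def poly_eval(p, x):
    """Horner evaluation."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc

def poly_deriv(p):
    """First derivative."""
    n = len(p) - 1
    return [c * (n - k) for k, c in enumerate(p[:-1])]

nodes = [0, 1]                      # assumed nodes x_0, x_1
pi_x = [Fraction(1)]
for node in nodes:
    pi_x = poly_mul(pi_x, [1, -node])   # pi_x(x) = x (x - 1)
pi_x_squared = poly_mul(pi_x, pi_x)

g = [1, 0, 0, 0, 0, 0]              # assumed test function g(x) = x^5
quotient, remainder = poly_divmod(g, pi_x_squared)

# remainder represents 3x^3 - 2x^2; compare it with g and g' at the nodes.
values = [poly_eval(remainder, x) for x in nodes]
slopes = [poly_eval(poly_deriv(remainder), x) for x in nodes]
```

For this choice the remainder is $3x^3 - 2x^2$, which takes the values 0 and 1 and has derivatives 0 and 5 at the two nodes, in agreement with $g$ and $g'$.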
3. Conclusion
The definition based on the smallest degree is not valid for the several real variable Hermite interpolation polynomial, because there are many polynomials of smaller degree. The properties of the interpolation polynomial help to construct it easily. The result on the two real variable Hermite interpolation polynomial can be extended to the several real variable case.
References
1. A. Quarteroni, R. Sacco and F. Saleri, Méthodes numériques pour le calcul scientifique. Programmes en MATLAB, Collection IRIS (Springer-France, Paris, 2000).
2. M. Sibony and J.-Cl. Mardon, Approximation et équations différentielles, Analyse numérique II (Hermann, Éditeur des Sciences et des Arts, Paris, 1984).
3. D. Moukoko and J. Dzoumba, Newton form of Hermite interpolation polynomial, Far East Journal of Mathematical Sciences 14 (1), 107-120 (2005).
BRAID INDEX WITH UP TO TEN CROSSINGS
O. A. FADIPE-JOSEPH
Department of Mathematics, University of Ilorin, P. M. B. 1515, Ilorin, Kwara State, Nigeria
E-mail: [email protected]
By using the table of Jones polynomials we compute the braid index of knots with up to ten crossings.
Keywords: Factor, trace invariant, braid index.
1. Introduction
Murray and von Neumann¹,² defined an invariant in the theory of continuous dimension for subspaces affiliated with type II₁ factors. This invariant is called the index. Jones³ discovered the Jones polynomial in 1984. During the 1980s his research focused on von Neumann algebras, and in the course of this work he discovered a new polynomial invariant for knots, which led to surprising connections between apparently quite different areas of mathematics. This invariant represented the first advance in the field since 1928, and it has also enabled molecular biologists to gain new insight into how deoxyribonucleic acid (DNA) can remove the tangles that result when replication and cell division first duplicate the DNA and subsequently pull the chromosomal mass into different cells. The result is a landmark in modern mathematics whose ramifications remain to be fully explored. The discovery came indirectly, by way of von Neumann algebras, which were developed to handle quantum mechanical observables such as energy, position and momentum. The capacity of the operators representing such quantities to be added or multiplied implies that they have the structure of an algebra. Von Neumann algebras can be built out of simpler structures called factors, which have the intriguing property that they can have "continuous dimension", i.e., dimensions that are real numbers. Jones³ was studying subfactors when he discovered that, rather than varying continuously, the values of the index are either greater than or equal to 4 or lie in the set $\{4\cos^2(\pi/k) : k \geq 3,\ k \in \mathbb{N}\}$. When he showed the proof to some friends in Geneva, it was remarked that parts of it resembled the braid group. A braid is like a knot except that it is a series of threads beginning at the top which are woven over and under before being realigned at the bottom. A braid can be converted into a knot by joining its ends together. Vaughan Jones ended up having a meeting at Columbia University with the knot theorist Joan Birman⁵ to see if his work might have some application in knot theory. When the two sat down together the discovery was almost instantaneous. Jones proved that von Neumann algebras are related to knot theory and provided a way to tell very complicated knots apart. The original polynomial for knots derived by Alexander⁶ in 1928 fails to separate a left-handed trefoil from a right-handed one, showing that much remained undiscovered, since some of the simplest knots could not be distinguished. It was quickly discovered that the Jones polynomial and the Alexander polynomial give complementary descriptions of knots, which can be combined to make a two-variable polynomial giving a more complete representation than that provided by either alone.
Let the braid group be denoted by $B_n$. The closure of a braid $b \in B_n$ is the oriented link $\hat{b}$ obtained by tying the top end of each string to the same position on the bottom of the braid. The braid index is the least number of strings needed to make a closed braid representation of a link; that is, the braid index of a link $L$ is the smallest $n$ for which there is a pair $(b,n)$ with $\hat{b} = L$. The braid index is equal to the least number of Seifert circles in any projection of a knot.⁷
Jones⁴ remarked that the trace invariant can probably be used to determine the braid index in a great many cases. Price⁸ provided a complete classification up to conjugacy of the binary shifts of commutant index 2 on the hyperfinite II₁ factor. Fadipe-Joseph⁹ studied the behaviour of the index under tensor products. Index values via $n \times n$ matrices were determined in a way different from using trace invariants.¹⁰ The trace invariant is very important in the study of knots.¹¹,¹² The trace invariant for special angles was constructed in Ref. 13. This was done for the particular cases in which the angle $\theta$ lies between two fixed bounds, for which the indices were also determined.
Stoimenow¹⁴ gave tables of polynomials, signatures and the values of the degree 3 Vassiliev invariant of knots with up to ten crossings. In the present work, the Jones polynomial is used to compute the braid index of knots with up to ten crossings. Jones polynomials are denoted $V_L(t)$ for links and $V_K(t)$ for knots, and are normalized so that
$$V_{\mathrm{unknot}}(t) = 1. \tag{1}$$
For example, the right-handed and left-handed trefoil knots have the polynomials
$$V_{\mathrm{trefoil}}(t) = t + t^{3} - t^{4}, \tag{2}$$
$$V_{\mathrm{trefoil}^{*}}(t) = t^{-1} + t^{-3} - t^{-4}, \tag{3}$$
respectively. Some interesting identities from Jones⁴ follow. For any link $L$,
$$t = e^{\pm 2\pi i/k} \quad \text{for some } k = 3, 4, 5, \ldots, \tag{4}$$
and
$$V_K(e^{2\pi i/3}) = 1. \tag{5}$$
When knot theory was first developed, its major appeal lay in its possible application to chemistry. Lord Kelvin proposed that matter could be explained as knots in the ether, the substance of which the entire universe was supposed to be made. We now know that this is not true. Vaughan Jones discovered the connection when computing a new polynomial invariant for knots. In this field, knots can be used to represent systems and thereby make them easier to study.
2. Main Results
Jones⁴ defined the algebra $TL(k,\tau)$, where $\tau \in \mathbb{C}$, with identity 1 and generators $e_1, e_2, \ldots, e_{k-1}$ subject to the following relations:
(i) $e_i^2 = e_i$, $e_i^* = e_i$,
(ii) $e_i e_{i\pm 1} e_i = \tau e_i = \dfrac{t}{(1+t)^2}\, e_i$,
(iii) $e_i e_j = e_j e_i$ if $|i - j| \geq 2$.
Here $t$ is a complex number. The similarity between relations (ii) and (iii) and Artin's presentation⁵ of the $n$-string braid group,
$$\{ s_1, s_2, \ldots, s_{n-1} : s_i s_{i+1} s_i = s_{i+1} s_i s_{i+1},\ s_i s_j = s_j s_i \ \text{if } |i - j| \geq 2 \},$$
was pointed out. Defining $g_i = \sqrt{t}\,[t e_i - (1 - e_i)]$, the $g_i$ satisfy the braid relations, and Jones⁴ obtained the representations $r_t$ of $B_n$ by sending $s_i$ to $g_i$. In order to compute the braid index, we shall make use of the following known results.
Lemma 2.1.⁵ If $b$ is in $B_n$ and there is an integer $k$ greater than 3 for which $b \in \operatorname{Ker} r_t$, $t = e^{2\pi i/k}$, then $\hat{b}$ has braid index $n$.
Lemma 2.2.⁹ For $t = e^{2\pi i/k}$, with $k = 3, 4, 5, \ldots$, $V_{\hat{b}}(t) = (-2\cos \pi/k)^{n-1}$ if and only if $b \in \operatorname{Ker} r_t$ (for $b \in B_n$).
How to read the Tables
Tables 1-5 contain the braid index of knots with up to ten crossings. For the Jones polynomial, if the absolute term occurs between the minimal and maximal degrees then it is bracketed; otherwise the minimal degree is recorded in braces before the coefficient list. For example, for the knot $4_1$ : 1 -1 [1] -1 1, the absolute term occurs between the minimal and maximal degrees, so it is bracketed, [ ]; for $4_1$ we have $1 - 1\cdot t^{1} - t^{2} + t^{3} = 1 - t - t^{2} + t^{3}$. Likewise, for $3_1$ : {1} 1 0 1 -1, the minimal degree is recorded in braces, { }, before the coefficient list; we have
$$t\,(1 + 0\cdot t + 1\cdot t^{2} - t^{3}) = t + t^{3} - t^{4}.$$
Throughout this work we let $k = 4$ in Lemmas 2.1 and 2.2, which implies that $t = i$. The knot $8_8$ has trace invariant $V_L(t) = -1 + 2t - 3t^{2} + 4t^{3} - 4t^{5} - 3t^{6} + 2t^{7} - t^{8}$. Substituting for $t$ we have that $V_L(i) = 4 - 8i$. Substituting for $V_L(i)$ in Lemma 2.2 one obtains $n = 7$.
3. Conclusions
Knot theory is one of the most exciting fields of study in mathematics because of its many applications; examples are found in molecular chemistry and particle physics. Another area of application of knot theory is statistical mechanics. This application was left undiscovered until very recently. Knowing the trace invariant, one can determine the braid index for knots with different numbers of crossings.
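The substitution $t = i$ used in Section 2 can be reproduced with exact integer arithmetic. The following is a minimal sketch, not the author's procedure: the function name `eval_at_i` is hypothetical, and the coefficient dictionaries are our reading of the right-handed trefoil polynomial in (2) and of the expression quoted for the knot $8_8$.

```python
def eval_at_i(coeffs):
    """Evaluate sum(c * t**d) at t = i for a dict {degree: integer coefficient}.

    Returns (real part, imaginary part) as exact integers, using the fact
    that the powers of i repeat with period 4.
    """
    powers_of_i = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # i**0, i**1, i**2, i**3
    real = imag = 0
    for degree, c in coeffs.items():
        p_real, p_imag = powers_of_i[degree % 4]  # % also handles negative degrees
        real += c * p_real
        imag += c * p_imag
    return real, imag

# Right-handed trefoil, equation (2): t + t^3 - t^4.
trefoil = {1: 1, 3: 1, 4: -1}

# Expression quoted in the text for the knot 8_8 (assumed transcription).
knot_8_8 = {0: -1, 1: 2, 2: -3, 3: 4, 5: -4, 6: -3, 7: 2, 8: -1}

trefoil_value = eval_at_i(trefoil)    # (-1, 0), i.e. -1
knot_8_8_value = eval_at_i(knot_8_8)  # (4, -8), i.e. 4 - 8i
```

Both values agree with the trace invariant column of Table 1 for $3_1$ and $8_8$.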
References
1. F. J. Murray and J. von Neumann, On rings of operators II, Trans. AMS 41, 208 (1937).
2. F. J. Murray and J. von Neumann, On rings of operators IV, Ann. Math. 44, 716 (1943).
3. V. F. R. Jones, Index for subfactors, Invent. Math. 72, 1 (1983).
4. V. F. R. Jones, A polynomial invariant for knots via von Neumann algebras, Bulletin (New series) of the American Mathematical Society 12, 103 (1985).
5. J. Birman, Braids, Links and Mapping Class Groups, Annals of Mathematics Studies, No. 82 (Princeton University Press, Princeton, New Jersey; University of Tokyo Press, Tokyo, 1974).
6. J. W. Alexander, A lemma on systems of knotted curves, Proc. Nat. Acad. Sci. (USA) 9, 93 (1928).
7. S. Yamada, The minimal number of Seifert circles equals the braid index of a link, Invent. Math. 89, 347 (1987).
8. G. L. Price, On the classification of binary shifts of minimal commutant index, Proc. Nat. Acad. Sci. (USA) 96, 8839 (1999).
9. O. A. Fadipe-Joseph, Possible values of the index for subfactors, Mathematical Science Letters 1(2), 20 (2000).
10. O. A. Fadipe-Joseph, Values of index via n x n matrix, The Journal of the Mathematical Association of Nigeria (ABACUS), Mathematical series 30(2A), 13 (2003).
11. B. Curtin, Some planar algebra related to graphs, Pacific J. of Math. 209, 231-248 (2003).
12. V. F. R. Jones, The planar algebra of a bipartite graph, Proceedings of the International Conference on Knot Theory and Its Ramifications, Knots in Hellas'98 (Delphi), Series on Knots and Everything, Vol. 24, eds. C. McA. Gordon, V. F. R. Jones, L. Kauffman, S. Lambropoulou and J. H. Przytycki (World Scientific, Singapore, 2000), pp. 94.
13. O. A. Fadipe-Joseph, The trace invariant for special angles, Nig. J. Pure and Appl. Sci. 19, 1741 (2004).
14. A. Stoimenow, Polynomials of knots with up to 10 crossings, Department of Mathematics, Humboldt University, Berlin (Germany, 1999).
Table 1. Braid Index (between 0 and 8 crossings).
Knot | Polynomial invariant | Trace invariant, V_L(t) | Braid index
0_1 | 1 | 1 | 1
3_1 | {1}101-1 | -1 | 1
4_1 | 1-1[1]-11 | 2-2i | 4
5_1 | {2}101-11-1 | -1 | 1
5_2 | {1}1-12-11-1 | 1 | 1
6_1 | 1-1[2]-21-11 | 2+i | 3
6_2 | 1[-1]2-22-21 | 1-i | 2
6_3 | -12-2[3]-22-1 | -2+6i | 6
7_1 | {3}101-11-11-1 | 1 | 1
7_2 | {1}1-12-22-11-1 | -1 | 1
7_3 | {2}1-12-23-21-1 | -1 | 1
7_4 | {1}1-23-23-21-1 | 1 | 1
7_5 | {2}1-13-33-32-1 | 1 | 1
7_6 | 1[-2]3-34-32-1 | 2+i | 3
7_7 | 1-23-4[4]-33-1 | -9-4i | 8
8_1 | 1-1[2]-22-21-11 | 4+i | 5
8_2 | {0}1-12-23-32-21 | 1 | 1
8_3 | 1-12-3[3]-32-11 | -3+3i | 5
8_4 | 1-23-33[-3]2-11 | -3+5i | 6
8_5 | [1]-13-33-43-21 | -2 | 3
8_6 | 1[-1]3-44-43-21 | 1-3i | 4
8_7 | -12-34-44[-2]2-1 | -7-4i | 7
8_8 | -12-34-4[5]-32-1 | 4-8i | 7
8_9 | 1-23-4[5]-43-21 | -8i | 7
8_10 | -12-45-45[-3]2-1 | -3+3i | 5
8_11 | 1[-2]4-45-53-21 | -2-i | 3
8_12 | 1-24-5[5]-54-21 | -10i | 8
8_13 | -12-35-5[5]-43-1 | 5-11i | 8
8_14 | 1[-2]4-56-54-31 | -4-i | 5
8_15 | {2}1-25-56-64-31 | 1 | 1
8_16 | -13-56-66[-4]3-1 | 5 | 6
8_17 | 1-35-6[7]-65-31 | -7+7i | 8
8_18 | 1-46-7[9]-76-41 | -2-16i | 9
8_19 | {3}10100-1 | -1 | 1
8_20 | -11-12-1[2]-1 | 1 | 1
8_21 | {1}2-23-32-21 | 1 | 1
Table 2. Braid Index (for 9 crossings).
Knot | Polynomial invariant | Trace invariant, V_L(t) | Braid index
9_1 | {4}101-11-11-11-1 | 1 | 1
9_2 | {1}1-12-22-22-11-1 | 1 | 1
9_3 | {3}1-12-23-33-21-1 | -1 | 1
9_4 | {2}1-12-34-33-21-1 | -1 | 1
9_5 | {1}1-23-34-33-21-1 | 1 | 1
9_6 | {3}1-13-34-54-32-1 | -1 | 1
9_7 | {2}1-13-45-54-32-1 | -1 | 1
9_8 | 1-23[-4]5-55-32-1 | 7 | 7
9_9 | {3}1-13-45-55-42-1 | 1 | 1
9_10 | {2}1-24-56-55-31-1 | 1 | 1
9_11 | -12-45-56-43-2[1] | 2-2i | 4
9_12 | 1[-2]4-56-65-32-1 | -2-i | 3
9_13 | {2}1-24-57-65-42-1 | -1 | 1
9_14 | 1-23-56-6[6]-43-1 | 13+6i | 9
9_15 | -12-46-67-64[-2]1 | -9+2i | 7
9_16 | {3}1-14-56-76-53-1 | 1 | 1
9_17 | 1-24[-5]6-76-43-1 | 4-10i | 8
9_18 | {2}1-25-67-76-42-1 | 1 | 1
9_19 | 1-24-6[7]-76-43-1 | -7+7i | 8
9_20 | [1]24-57-76-53-1 | 4i | 5
9_21 | -12-46-78-65[-3]1 | 1+9i | 7
9_22 | 1-24[-6]7-77-53-1 | -6-5i | 7
9_23 | {2}1-25-68-86-53-1 | -1 | 1
9_24 | 1-35-7[8]-77-42-1 | -16-7i | 9
9_25 | 1[-2]5-78-87-53-1 | -4-i | 5
9_26 | 1-35-78-87[-4]3-1 | 12-i | 8
9_27 | 1-35-7[9]-87-53-1 | -14i | 9
9_28 | -13[-5]8-89-85-31 | 2-6i | 6
9_29 | -13-68-89[-7]5-31 | -7-7i | 8
9_30 | -13-58-9[9]-86-31 | 9-19i | 10
9_31 | -13[-5]8-910-86-41 | 2-7i | 7
9_32 | 1-36-910-109[-6]4-1 | -13+2i | 8
9_33 | -13-69-10[11]-97-41 | -1 | 1
9_34 | -14-710-12[12]-108-41 | 12-6i | 8
9_35 | {1}1-23-45-34-31-1 | -1 | 1
9_36 | -12-46-66-54-2[1] | 2-4i | 5
9_37 | 1-25-7[7]-87-43-1 | -9+9i | 8
9_38 | {2}1-37-810-108-63-1 | -1 | 1
9_39 | -13-68-910-86[-3]1 | -5+10i | 8
9_40 | -15[-8]11-1313-118-41 | 3 | 4
Table 3. Braid Index (between 9 and 10 crossings).
Knot | Polynomial invariant | Trace invariant, V_L(t) | Braid index
9_41 | 1-35-78-8[8]-53-1 | -7 | 7
9_42 | 1-11[-1]1-11 | 1-3i | 4
9_43 | [1]-12-22-22-1 | -2 | 3
9_44 | 1-2[3]-33-22-1 | -1+3i | 4
9_45 | {-8}-12-34-44-32 | 1 | 1
9_46 | 1-11-21-1[2] | 2+i | 3
9_47 | -13[-3]5-54-42 | -4-4i | 6
9_48 | -23-46-44[-3]1 | -3+i | 4
9_49 | {2}1-24-45-43-2 | 1 | 1
10_1 | 1-1[2]-22-22-21-11 | 2+i | 3
10_2 | {1}1-12-23-33-32-21 | 1 | 1
10_3 | 1-12-3[4]-43-32-11 | -6-3i | 7
10_4 | 1-23-34[-4]3-32-11 | 4+3i | 6
10_5 | -12-34-55-44-2[2]-1 | 3 | 4
10_6 | [1]-13-45-66-53-21 | 8-4i | 7
10_7 | 1[-2]4-57-76-53-21 | -2-i | 3
10_8 | 1-1[2]-34-44-43-21 | 4+i | 5
10_9 | 1-23[-4]6-66-53-21 | 7 | 7
10_10 | -12-35-67-7[6]-43-1 | 6+7i | 7
10_11 | 1-13[-5]6-77-64-21 | 3-13i | 9
10_12 | -12-46-78-76[-3]2-1 | 1+11i | 8
10_13 | 1-25-7[8]-98-64-21 | -16-7i | 9
10_14 | [1]-24-69-99-85-31 | | 1
10_15 | -12-46-67[-6]5-32-1 | -12-7i | 9
10_16 | 1-24[-5]7-87-64-21 | 4-10i | 8
10_17 | -12-35-6[7]-65-32-1 | -2+6i | 6
10_18 | 1-24[-6]8-99-75-31 | -8-5i | 8
10_19 | -12-36[-7]8-87-53-1 | -2+12i | 8
10_20 | 1[-1]3-45-65-43-21 | 7-7i | 8
10_21 | [1]-24-57-77-63-21 | 4-6i | 7
10_22 | 1-24-6[8]-87-64-21 | -13-6i | 9
10_23 | -12-47-910-98[-5]3-1 | -3-4i | 6
10_24 | 1[-2]5-79-98-74-21 | -4-i | 5
10_25 | [1]-25-710-11 10-96-31 | 4-i | 5
10_26 | 1-36-8[10]-109-74-21 | 8+4i | 7
10_27 | -13-69-11 12-119[-5]3-1 | 5+2i | 6
Table 4. Braid Index with up to 10 crossings (continued).
Knot | Polynomial invariant | Trace invariant, V_L(t) | Braid index
10_28 | -12-46-79-8[7]-53-1 | -8+16i | 9
10_29 | 1-25[-7]9-11 10-86-31 | -2+10i | 8
10_30 | 1[-3]6-811-11 10-85-31 | 2+5i | 6
10_31 | -12-47-8[10]-97-53-1 | 15 | 9
10_32 | 1-36-9[11]-1111-85-31 | -12+10i | 9
10_33 | -13-58-10[11]-108-53-1 | 8-9i | 8
10_34 | -12-34-56-5[5]-32-1 | -1-3i | 4
10_35 | 1-24-67-8[8]-64-21 | -7 | 7
10_36 | 1[-2]4-68-88-64-31 | -2-i | 3
10_37 | -12-47-8[9]-87-42-1 | 8-18i | 10
10_38 | 1[-2]5-79-109-75-31 | -2 | 3
10_39 | [1]-25-79-1010-85-31 | -2i | 3
10_40 | -13-69-1213-1110[-6]3-1 | -9i | 7
10_41 | 1-36[-8]11-1211-96-31 | 12+6i | 9
10_42 | -13-610-12[14]-1310-74-1 | 10+9i | 9
10_43 | -13-69-11[13]-119-63-1 | 11-11i | 9
10_44 | 1-36[-9]12-1313-107-41 | 11-11i | 9
10_45 | -14-711-14[15]-1411-74-1 | -1+5i | 6
10_46 | {1}1-13-34-54-43-21 | 1 | 1
10_47 | -12-45-67-55-3[2]-1 | 5 | 6
10_48 | -12-46-7[9]-76-42-1 | 7-15i | 9
10_49 | {3}1-25-69-109-85-31 | -2 | 3
10_50 | [1]-25-68-98-74-21 | -2 | 3
10_51 | -12-58-1012-109[-6]3-1 | -9-11i | 9
10_52 | -12-47[-8]10-98-63-1 | 15+7i | 9
10_53 | {2}1-37-912-1211-95-31 | -1+6i | 4
10_54 | -12-46-78[-6]6-42-1 | -15-8i | 8
10_55 | {2}1-25-710-109-85-31 | -3 | 4
10_56 | [1]-25-710-1110-96-31 | 4-i | 5
10_57 | -13-710-1214-120[-6]3-1 | 5-4i | 6
10_58 | 1-25-8[10]-1110-86-31 | 10+5i | 8
10_59 | 1-36[-9]12-1212-106-31 | 1-7i | 7
10_60 | 1-36-1013-14[14]-118-41 | -7+8i | 8
10_61 | 1-1[3]-44-55-43-21 | 2i | 3
10_62 | -12-46-77-66-3[2]-1 | 5-2i | 6
10_63 | {2}1-25-79-99-74-31 | 1 | 1
10_64 | 1-24[-6]8-88-74-21 | -6-5i | 7
Table 5. Braid Index with up to 10 crossings (continued).
Knot | Polynomial invariant | Trace invariant, V_L(t) | Braid index
10_65 | -12-58-911-108[-5]3-1 | -13-5i | 8
10_66 | {3}1-26-811-1312-107-41 | -2-5i | 6
10_67 | 1[-2]5-810-1010-85-31 | -5+3i | 6
10_68 | -12-47-89-9[8]-53-1 | -17 | 9
10_69 | -13-711-1315-1411[-7]4-1 | 1 | 1
10_70 | 1-36-911-11 10[-8]5-21 | 10i | 8
10_71 | -13-610-12[13]-1210-63-1 | 12-2i | 8
10_72 | [1]-25-811-1212-107-41 | 5+7i | 7
10_73 | -13-610-1314-1311[-7]4-1 | -1 | 1
10_74 | 1[-3]6-811-109-84-21 | 2+4i | 5
10_75 | 1-36-1012-13[14]-107-41 | -11-3i | 8
10_76 | [1]-14-68-109-86-31 | -1 | 1
10_77 | -13-68-1011-98[-4]2-1 | 3-4i | 6
10_78 | [1]-36-811-1111-95-31 | 3+2i | 5
10_79 | -12-58-9[11]-98-52-1 | 1+9i | 7
10_80 | {3}1-26-811-1211-106-31 | 7-4i | 7
10_81 | -13-711-13[15]-1311-73-1 | -1+3i | 4
10_82 | 1-35[-7]10-1010-85-31 | -5-4i | 6
10_83 | -13-610-1314-1311[-7]4-1 | -1 | 1
10_84 | -13[-6]11-1315-1411-84-1 | -8-12i | 9
10_85 | -13-57-99-87-4[3]-1 | 3+3i | 5
10_86 | 1-48-11[14]-1413-106-31 | -4+5i | 6
10_87 | 1-36-10[13]-1313-107-41 | 2-6i | 6
10_88 | -14-813-16[17]-1613-84-1 | 19+i | 10
10_89 | -13-712-1517-1613[-9]5-1 | 9-6i | 8
10_90 | 1-37-10[12]-1312-96-31 | -12+10i | 9
10_91 | -13-69-11[13]-119-63-1 | 11-11i | 9
10_92 | [1]-37-1014-1514-128-41 | -4+9i | 8
10_93 | -13-69-1011[-10]8-53-1 | -2-11i | 8
10_94 | 1-36[-8]11-1211-96-31 | 12+6i | 9
10_95 | -13-711-1416-1412[-8]4-1 | 8+2i | 7
10_96 | 1-37-1114-16[15]-129-41 | -18 | 9
10_97 | 1[-3]7-1114-1414-117-41 | -5+6i | 7
10_98 | [1]-37-913-1412-117-31 | 3+3i | 5
10_99 | -13-710-12[15]-1210-73-1 | -2+4i | 5
10_100 | -13-68-1011-98-5[3]-1 | -7+7i | 8
HAZARD RATE PREDICTION IN LIFETIME DATA ANALYSIS
KOSSI ESSONA GNEYOU
Department of Mathematics, University of Lome, B.P. 1515 Lome, Republic of Togo
E-mail: kgneyou@tg.refer.org
We consider a nonparametric estimation of the hazard rate function based on right-censored data using the wavelet method. Asymptotic properties and strong uniform consistency rates are established under suitable conditions.
Keywords: Hazard rate; life time data; right censorship model; wavelet method
1. Introduction
Let $X_1, X_2, \ldots, X_n$ be a sequence of independent, identically distributed (i.i.d.) non-negative random variables (r.v.) with common continuous distribution function (d.f.) $F$ and density function $f$, and $Y_1, Y_2, \ldots, Y_n$ another sequence of i.i.d. non-negative r.v. with common continuous d.f. $G$, both sequences $(X_i)$ and $(Y_i)$ being defined on the same probability space $(\Omega, \mathcal{A}, \mathbb{P})$ and mutually independent. In this paper we are concerned with the nonparametric estimation of the hazard rate function $\lambda$ defined by
$$\lambda(t) = \frac{f(t)}{1 - F(t)}, \qquad F(t) < 1, \tag{1}$$
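whenever $F$ and $f$ are unknown and the observations available are the pairs $(Z_i, \delta_i)$ where, for $i = 1, 2, \ldots, n$,
$$Z_i = \min(X_i, Y_i), \qquad \delta_i = \mathbf{1}_{\{X_i \le Y_i\}} = \begin{cases} 1 & \text{if } X_i \le Y_i, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
$X_i$ is said to be censored on the right by $Y_i$ when $\delta_i = 0$. Set $X = X_1$, $Y = Y_1$, $Z = Z_1$, $\delta = \delta_1$ and denote by $H$ the d.f. of $Z$. It is easily seen that $H(t) = 1 - (1 - F(t))(1 - G(t))$.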
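The observation scheme (2) and the identity for $H$ can be sketched in a few lines. This is an illustration only: the sample values, the exponential laws chosen for $F$ and $G$, and the function names `censor` and `df_of_min` are assumptions made for the example, not part of the paper.

```python
import math

def censor(xs, ys):
    """Build the observed pairs (Z_i, delta_i) of (2) from lifetimes xs and censoring times ys."""
    z = [min(x, y) for x, y in zip(xs, ys)]
    delta = [1 if x <= y else 0 for x, y in zip(xs, ys)]
    return z, delta

def df_of_min(F, G, t):
    """Distribution function of Z = min(X, Y) for independent X ~ F and Y ~ G."""
    return 1.0 - (1.0 - F(t)) * (1.0 - G(t))

# Hypothetical lifetimes and censoring times.
xs = [2.0, 5.0, 1.5]
ys = [3.0, 4.0, 1.5]
z, delta = censor(xs, ys)          # the second lifetime is censored at 4.0

# Assumed laws: X exponential with rate 1, Y exponential with rate 2,
# so that Z is exponential with rate 3.
F = lambda t: 1.0 - math.exp(-t)
G = lambda t: 1.0 - math.exp(-2.0 * t)
H_half = df_of_min(F, G, 0.5)
```

With these assumed laws, $H(0.5)$ coincides with $1 - e^{-1.5}$, since the minimum of independent exponential variables is exponential with the sum of the rates.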
In life testing, medical follow-up and other studies, the random variables $X$ and $Y$ represent, respectively, the time of occurrence of an event of interest (such as failure or death) and the time of occurrence of another event (called a censoring event). For example, this is the case in the medical follow-up of a serious pathology, where $Y$ is the hospitalization time and $X$ is the survival time of a patient, which is unobserved whenever $X > Y$. The hazard rate, like the probability density or the distribution function, is a basic characteristic describing the behavior of a random variable $X$. The problem of nonparametric estimation of the hazard rate function is related to that of the density function. The most important methods considered on this topic by statisticians, in non-censored as well as censored models, are kernel, nearest neighbor, orthogonal series and projection methods. For progress and developments in the literature, see Refs. 1-13 and references therein. The aim of this paper is to give another approach to the estimation of the hazard rate function $\lambda$, based on wavelet methods. Indeed, wavelet methods and the multiresolution analysis of $L^2(\mathbb{R})$ introduced by Mallat¹⁴ (see also Ref. 15) have become, in recent years, a mathematically sound tool for adaptively estimating functions. A remarkable property of the wavelet transform is that it reflects the local regularity of the original function, being large where the function is irregular and small where the function is smooth. For references on density and hazard rate estimation using wavelet methods, see, e.g., Refs. 16-19 and, more recently, Refs. 20-21. Optimal rates of convergence of the mean integrated squared error, $L^p$-loss, and weak convergence have been investigated in these works. To get our wavelet estimator of $\lambda$, we use the projection approach. Recall that any function $f \in L^2(\mathbb{R})$ may be expressed in the form
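$$f(t) = \sum_{k\in\mathbb{Z}} \alpha_{j_0k}\,\phi_{j_0k}(t) + \sum_{j\ge j_0}\sum_{k\in\mathbb{Z}} \beta_{jk}\,\psi_{jk}(t), \tag{3}$$
where the functions $\{\phi_{j_0k}, \psi_{jk}\}$ form an orthonormal basis of $L^2(\mathbb{R})$ and are given by
$$\phi_{j_0k}(t) = 2^{j_0/2}\,\phi(2^{j_0}t - k), \qquad \psi_{jk}(t) = 2^{j/2}\,\psi(2^{j}t - k). \tag{4}$$
The coefficients $\alpha_{j_0k}$ and $\beta_{jk}$ are given by
$$\alpha_{j_0k} = \int_{-\infty}^{+\infty} f(x)\,\phi_{j_0k}(x)\,dx \qquad \text{and} \qquad \beta_{jk} = \int_{-\infty}^{+\infty} f(x)\,\psi_{jk}(x)\,dx, \tag{5}$$
where $\phi$ is the scaling function or father wavelet satisfying $\int_{-\infty}^{+\infty} \phi(x)\,dx = 1$.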
The decomposition in (3) gives an approximation of $f \in L^2(\mathbb{R})$ at resolution $j_0$ and the detail in $f$ at resolutions finer than $j_0$. To estimate the hazard rate function by the wavelet method, we first fix an a priori resolution $j$ depending on the sample size $n$ and then obtain, under appropriate assumptions on the regularity of the density and the scaling functions, a linear wavelet estimator of the hazard rate, by projecting onto the subspace $V_j$ of the multiresolution analysis (see, e.g., Refs. 16-18 in the framework of probability density estimation). Finally, we choose the optimal resolution (the smoothing parameter) $j_0 = j_0(n)$ by the classical method of minimizing the mean integrated squared error.
2. Estimation of the Hazard Rate Function by the Wavelet Method
Let $T_F = \sup\{t \in \mathbb{R}_+ : F(t) < 1\}$, and let $T_G$ and $T_H$ be defined as $T_F$ with $F$ replaced by $G$ and $H$, respectively. Obviously $T_H = \min(T_F, T_G)$, and it may be proved²² that $Z_{(n)} = \max(Z_1, \ldots, Z_n)$ tends to $T_H$ almost surely as $n$ tends to $+\infty$. Let
$$K(x,y) = \sum_{k\in\mathbb{Z}} \phi(x - k)\,\phi(y - k). \tag{6}$$
Under assumptions (A1)-(A3), $K$ satisfies the following statements (see, e.g., Ref. 23), which determine the asymptotic properties of the estimator:
(S1) $K$ is bounded, i.e., there exists a positive constant $C$ such that $|K(x,y)| \le C$.
(S2) $K(x,y) = 0$ for $|x - y| > 2L$ and $\int_{-\infty}^{+\infty} K(x,y)\,dy = \sum_{k\in\mathbb{Z}} \phi(x - k) = 1$.
(S3) For all positive integers $l$
the hazard rate $\lambda$ by
$$\lambda_n(t) = \frac{1}{n}\sum_{i=1}^{n} K_j(t, Z_i)\,\frac{\delta_i}{1 - H_n(Z_i)}, \tag{7}$$
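where $1 - H_n = (1 - F_n)(1 - G_n)$ and $F_n$ and $G_n$ are the Kaplan-Meier²⁴ product-limit estimators of $F$ and $G$, respectively, given by (see, e.g., Ref. 25)
$$1 - F_n(t) = \begin{cases} \displaystyle\prod_{\{i \,:\, Z_{(i)} \le t\}} \Bigl(\frac{n - i}{n - i + 1}\Bigr)^{\delta_{(i)}} & \text{if } t < Z_{(n)}, \\[2ex] 0 & \text{if } t \ge Z_{(n)}, \end{cases} \tag{8}$$
$1 - G_n(t)$ being defined as $1 - F_n(t)$ with $\delta_{(i)}$ replaced in (8) by $1 - \delta_{(i)}$, where $Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(n)}$ are the order statistics of the sample $(Z_1, Z_2, \ldots, Z_n)$ and, for $i = 1, \ldots, n$, $\delta_{(i)}$ is the $\delta_j$ corresponding to $Z_{(i)} = Z_j$, $1 \le j \le n$. In the uncensored case (i.e., $G \equiv 0$), $H_n$ is replaced in (7) by the usual empirical distribution function of the $X_i$'s, $F_n(t) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}_{\{X_i \le t\}}$. Let
$$E\lambda_n(t) = \frac{1}{n}\sum_{i=1}^{n} K_j(t, Z_i)\,\frac{\delta_i}{1 - H(Z_i)}. \tag{9}$$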
Note that $E\lambda_n(t)$ is not the mathematical expectation of $\lambda_n(t)$ but rather a r.v. whose expectation is the projection of $\lambda(t)$ on the linear subspace $V_j$ spanned by $\{\phi_{jk}(t) = 2^{j/2}\phi(2^{j}t - k),\ k \in \mathbb{Z}\}$; in the uncensored case, $E\lambda_n(t) = \mathbb{E}(\lambda_n(t))$, where $\mathbb{E}$ denotes the usual expectation. Definition (7) of the wavelet estimator of $\lambda$ differs from that of Ref. 20. Indeed, these authors estimate independently, by wavelet methods, the subdensity $f^*(t) = f(t)(1 - G(t))$ of those observations that are still to fail and the probability $(1 - H)$ of observations remaining at risk, and then form an estimator of the hazard rate function by dividing the subdensity estimator by a wavelet estimator of $(1 - H)$. More precisely, to estimate the subdensity $f^*$, they divide the time axis into a dyadic number of small intervals (bins) of equal width, bin the observed data, and use a wavelet regression estimate on the binned data to get an estimate of the subdensity.
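The product-limit formula (8) is simple to evaluate directly. The sketch below is a minimal illustration under stated assumptions: the sample is hypothetical, ties among the $Z_i$ are assumed absent, and the function name `product_limit_survival` is ours. Passing the flags $\delta_i$ gives $1 - F_n(t)$, while passing $1 - \delta_i$ gives $1 - G_n(t)$.

```python
from fractions import Fraction

def product_limit_survival(z, flags, t):
    """Product-limit estimate of a survival function at time t, as in (8).

    z     : observed times Z_1, ..., Z_n (assumed to have no ties)
    flags : 0/1 indicators; an observation with flag 1 contributes a factor
    """
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])  # indices of the order statistics
    if t >= z[order[-1]]:
        return Fraction(0)                        # second branch of (8)
    surv = Fraction(1)
    for rank, idx in enumerate(order, start=1):
        if z[idx] > t:
            break
        if flags[idx] == 1:
            surv *= Fraction(n - rank, n - rank + 1)
    return surv

# Hypothetical sample of size 4; the observation Z = 2 is censored.
z = [3, 1, 4, 2]
delta = [1, 1, 1, 0]

t = Fraction(5, 2)
surv_f = product_limit_survival(z, delta, t)                   # 1 - F_n(t)
surv_g = product_limit_survival(z, [1 - d for d in delta], t)  # 1 - G_n(t)
surv_h = surv_f * surv_g                                       # 1 - H_n(t)
```

In this example $1 - F_n(2.5) = 3/4$ and $1 - G_n(2.5) = 2/3$, so $1 - H_n(2.5) = 1/2$, which is the proportion of the $Z_i$ exceeding 2.5.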
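Remark. Recalling that $\phi_{jk}(t) = 2^{j/2}\phi(2^{j}t - k)$, $k \in \mathbb{Z}$, we may write
$$\lambda_n(t) = \sum_{k\in\mathbb{Z}} a_{jk}\,\phi_{jk}(t), \tag{10}$$
where
$$a_{jk} = \frac{1}{n}\sum_{i=1}^{n} \frac{\delta_i\,\phi_{jk}(Z_i)}{1 - H_n(Z_i)}. \tag{11}$$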
Thus our wavelet estimator of the hazard rate function $\lambda$ is linear and can be easily computed. Compared to (3), the second term on the right-hand side of (3) may be neglected if one imposes a high degree of regularity on the density function $f$ or on the hazard rate function $\lambda$ directly. Because of the compactness of the support of
[0,TG],
(12) then for all
te[o,TF] VarA„(*) = - ^ L + 0(2-^>)
(13)
where
^=A-
(">
Proof. The proofs of this theorem and the next are given in the Appendix. Thus the estimator $\lambda_n$ is consistent, with bias depending on $j$. Hence an appropriate choice of $j$ can be made by requiring a small mean integrated squared error of $\lambda_n$ ($\mathrm{MISE}(\lambda_n)$). This is possible if one imposes a high order of regularity on the density function $f$. We have the following result. Theorem 3.2. Assume that $G(T_F) < 1$, $f$ has a derivative of order $(r+1)$ and $f^{(r+1)}$ is bounded on $[0,T_F]$, $\phi$ satisfies assumptions (A1)-(A3), and $j = j(n)$ is such that $j(n) \to +\infty$ and $n2^{-j(n)} \to +\infty$ as $n \to +\infty$. Then
$$\mathrm{MISE}(\lambda_n) = O\bigl(2^{-(2r+2)j}\bigr) + O\Bigl(\frac{2^{j}}{n}\Bigr) + O\Bigl(\frac{\mathrm{Log}\,\mathrm{Log}\,n}{n}\Bigr). \tag{15}$$
By Theorem 3.2, it is easy to check that if the multiresolution analysis $(V_j)_{j\in\mathbb{Z}}$ is $r$-regular and if $f^{(r+1)}$ exists and is bounded on $[0,T_F]$, then an optimal choice of $j$ is $j \simeq \mathrm{Log}_2 n/(2r+3)$, which yields $\mathrm{MISE}(\lambda_n) = O\bigl(n^{-(2r+2)/(2r+3)}\bigr)$, where $\mathrm{Log}_2 n = \mathrm{Log}\,n/\mathrm{Log}\,2$. In the next theorem, we establish the strong consistency of the wavelet estimator. Theorem 3.3. Let $T < T_F$ be such that $H(T) < 1$. Assume that
Qsupr
| A„(*) - X(t) |= O [ V ^ S ? j
+
°( 2_J )
as
'-
(16)
The demonstration of Theorem 3.3 uses the process y/n(An — A) where A (resp. A n ) is the cumulative (resp. the Kaplan-Meier empirical cumulative) hazard function. Hence, according to Csorgo's approximation 26 of the process ^Jn{h.n — A) by a Wiener process, one can obtain, from the limit set in Eq. (1.30) in Ref. 27, a law of the iterated logarithm for the wavelet estimator A„. Namely, for all t G [0,1>] define d(t) = J0 X*(s)ds — J* dF(s)/((l - F(s))(l - H(s))). We have the following result. Theorem 3.4. Under the assumptions of Theorem 3.3 and if d{T) < +oo, then the wavelet estimator An satisfies limsupl— where nj{n)
= d(T)
„., .r inf
1 /
0
sup \ Xn(t) - \(t) \—1
K2(2H,s)ds.
a.s.
(17)
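Appendix: Proofs
Before the proofs of the theorems can be given, we need to establish the following lemmas. Lemma 3.1. Let $E\lambda_n$ be defined as in (9) and assume that $G(T_F) < 1$. Then
$$\sup_{0 \le t \le T_F} |\lambda_n(t) - E\lambda_n(t)| = O\Bigl(\sqrt{\frac{\mathrm{Log}\,\mathrm{Log}\,n}{n}}\Bigr) \quad \text{a.s.} \tag{18}$$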
Proof. We have | Xn(t) -EXn(t)
H ^ K ^ Z ^ i j - j —
* (l-HniTFm-H(TF))
- i 3 ^ ) ) I
I'nH = l *,(«, Zt)St | SUPo
Recalling that $1 - H_n(t) = (1 - F_n(t))(1 - G_n(t))$, where $F_n$ and $G_n$ are the Kaplan-Meier product-limit estimators of $F$ and $G$, respectively, we may write
$$H_n(t) - H(t) = (F_n(t) - F(t)) + (G_n(t) - G(t)) - (F_n(t) - F(t))\,G_n(t) - (G_n(t) - G(t))\,F(t).$$
Thus, su
Po
Hence, by results discussed in Ref. 5, we have sup | Hn(t) - H(t) |= 0(JL0gL°Sn) a.s. (20) o
a.s.
as
n -> +co,
with m = E(Kj(t,Z)S) = J^F Kj(t,x)(l tion (A2) <j) is regular, 3C>0
such that
- G(x))dF(x).
Since by assump-
|< C ( l + | t - x \)~2.
| K(t,x)
(21)
Thus (21) implies m
,_ |
2j(T -t)
J_2rYtF F ' K(2H, 2H + u)(l - G(t + 2->u) f(t +
<
2-iu)du
2C |f / l + 2^dist((i,[0,T F ])) 2 '
where dist(t, A) denotes the distance between t and a subset A. Since t € [0,Tp], the quantity in the right-hand side of the last inequality is then bounded (by 2C || / ||oo)- Consequently, m is bounded. Thus i Y^i=i Kj{t, Z{)8i is almost surely bounded, i.e., there exists a constant M > 0 such that | £ £ " = i #,-(*, Z*)^ |< M almost surely. Thus, from (19) we have for n large enough sup o
| A„(t) - £A n (i) |< — M a
sup o
| # n ( t ) - H(t) |
a.s.
(22)
The lemma follows from (20) and (22). L e m m a 3.2. Assume that G{Tp) < 1, G has a derivative g = G' bounded on [0,TG] and that f G C^O.Tp]. Then for all t £ [0,7>] and for p= 1,2,
* - " f *?(«•»)(._^,*- i r J I p r + OP"')- (23) Proof. Note that by the property of <j> and the orthonormal property of {
Kj(t,x)dx
= £ , 6 Z VWJt
~ *) J?F 4>(2jx - k)dx
= fiF 4>{u)duYJk&
= / 0 TF 2^(E f c e z^»(2 j * - WWx = 2 2 i JoTF
£A€Z
-
k)fdx
= V fiF
k dx
)
= V.
For any continuous function L, set the notations L(x) = 1 - L(x), ||I/||oo = sup 0 < t < 7 > | L(t) |< +00, d = 1 - F ( 7 » > 0. We may write AW - AW ^ ^ ( / ( » ) - /(*)) + jffi^W*)
~ n*)).
(24)
Ms) Recalling that A* (s) = - , we have H{s) X*(x) - X*(t) = ^(f(x)
- f{t)) + ^ ^ y ( V ( x ) - V(t)),
(25)
where V(x) = H(x)F(x) = G(x)F2(x). So for p = 1 and in view of (24), we have |
J0TF
Kj(t,x)X(x)dx
- X(t) |=| J 0 T F
VK(2H,VX)(X(X)
- X(t))dx |
< /5 2 -[ F _ t ) £(2'i,2*'t + s) | X(t + 2~is) - \{t) | ds < CollAj-Hoo f_L2L K(2H, 2H + s)d«, where Co = max(d _1 , || / W^ d~2) and Aj(t)=
sup
\f(t + h)-f{t)\+
0
sup
| F(t +ft)- F(t) | .
0
By assumption on / , H-AjHoo behaves like 0(2 _ J ), so that rTF
[ " Kj(t,x)Mx)dx - X(t) \= 0{2-j), Jo and the lemma is proved for p = 1. In the same way, we have for p = 2, pTF
| 2-' /
p2L
K2(t,x)X*(x)dx - X*{t) |< CiH^Hoo /
JO
K2(2jt,2jt
+ s)ds,
J-2L
(26) where C\ — max(d 2 , || / ||oo ^ 4 ) and Bj(t)= sup \f(t + h)-f(t)\+ 0
sup
\V(t +
h)-V(t)\.
0
Furthermore by hypothesis on / and g the derivative V of V exists and is bounded on [0,TF]. Thus B(j) behaves like 0(2~j) so that | 2~j [ F K2At,x)X*{x)dx - X*(t) |= 0{2~j), Jo which is the statement of the lemma for p = 2.
Proof of Theorem 3.1. By Lemma 3.1 we have, for all t £ [0, Tp], +e>(JLogLogn)
E{Xn(t)) = E(EXn(t))
and Var Xn(t) = Var£A n (t).
Tl
V
Consequently we have to evaluate the expectation and the variance of EXn(t). By Lemma 3.2 we have E(EXn(t))
= E \ki{t,Z)T=jm\
= \lF
= / 0 TF kj(t,x)X(x)dx
= X(t) + 0(2~i).
Kj(t,x)±E§$dF(x)
Hence, EA n (t) = X{t) + 0{2~i) + 0{\/L°sL°gn)
and (12) is proved. For
the variance, we have nVaT{EXn(t)) = E
= Var K,(t, Z)
k*(t,z)
1 - H(Z) E Kj(t,Z)
(l--ff(Z))2
1 - H(Z)
= 1-11.
(27)
To conclude, we have to evaluate the two terms I and II. For the first term we have I = E ^ ( ^ ) ( T 3 T W ] = So" = !lF
^{t,x)^§Kdx
= JoF
K](t,x)X*{x)dx.
K ^ t ^ j ^ ^ d F i x )
Hence, by Lemma 3.2, 2~-yI = X*(t) + 0 ( 2 ~ J ) . As for the second term, we have c
[ kj{t Z)
' l-H{Z)\)
n
p J. p
o
=
(/0
o
KifovM*)**)
>
and by Lemma 3.2, we have II = (X(t) + 0{2-i))2
= X2(t) + 0{2~>) + 0(2- 2 >') = X2(t) + 0 ( 2 - ' ) .
It follows that n2~j Vax(E\n(t)) which implies (13).
= X*(t) +
0(2''),
Proof of Theorem 3.2. Let us consider the quantity
MISE(An) = E f J
F
(Xn(t) - X{t))2dt) .
By Fubini's theorem, we may write MISE(A„) = JQTF Var(A„(t))d« + / 0 TF (EA„ (*) = / 0 r ' Vav(EXn(t))dt
+ JQTF(K(EXn(t)
X(t))2dt
- X(t))2dt +
0 ( ^ ^ ) .
By (13), we have F
{E(EXn(t))-X(t))2dt
MISE(A„) = -^— f * \*{t)dt+ [ n 2 3 Jo Jo +0 (I) n +
= [F
f
Jo
Jo
0(^li^) n kj(t,x)(X(x)-X(t))dx
dt (28)
frTFF
Let vn(t) =
/ Kj(t,x)(X(x) Jo (c./., (24)), we have
KM I< I
J -2H -in
' 2L
- X(t))dx. As in the proof of Lemma 3.2
K(2jt, 2H + s) | X(t + 2~js) - X{t) \ ds
k(2-jt,2jt + s)Dj(s)ds
(29)
-2L
where Dj{s) = | f(t + 2~h) - f(t) \ + \ F{t + 2~h) - F(t) |. Since /< r+1 > exists and is bounded on [0, Tp], Taylor's expansion of / and F up to order r and the statements (S1)-(S3) yield K ( * ) | < C0f x
_ ., / \u\r+1K{2H,2H (r J+ 11)! J_2L
Af(r+l)(t+e12-'u)
I + \f(r)(t+e22-'u)
+ u)
^
< ^-(r+l)^
where C\ is a constant. Hence by (28), (15) is established, MISE(A„) = 0 ( - l _ ) + 0(2-(2r+^)
+
0(L°g^°Sn).
(3Q)
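Proof of Theorem 3.3. We may write
\[ \lambda_n(t) = \frac{1}{n}\sum_{i=1}^{n} K_j(t, Z_i)\,\frac{\delta_i}{1 - H_n(Z_i)} = \int_0^{T_F} K_j(t,x)\,\frac{dH_n(x)}{1 - H_n(x-)}, \tag{31} \]
where
\[ H_n(x) = \frac{1}{n}\sum_{i=1}^{n} \delta_i\,\mathbf{1}_{\{Z_i \le x\}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}_{\{Z_i \le x,\ \delta_i = 1\}}. \tag{32} \]
Here, $H_n$ is the empirical distribution function of the uncensored observations $(Z_i, \delta_i = 1)$, whose distribution function is
\[ H(x) = P[Z \le x,\ \delta = 1] = \int_0^x (1 - G(s))\,dF(s). \]
Let
\[ \Lambda(x) = \int_0^x \frac{dF(s)}{1 - F(s)} = -\operatorname{Log}(1 - F(x)) \quad\text{and}\quad \Lambda_n(x) = \int_0^x \frac{dH_n(s)}{1 - H_n(s-)} = -\operatorname{Log}(1 - F_n(x)), \qquad x < T_H. \]
Here, $\Lambda(x)$ and $\Lambda_n(x)$ denote the cumulative and Kaplan-Meier empirical cumulative hazard functions, respectively. Since $dH(x) = (1 - G(x))\,dF(x)$, a natural estimator of $dH(x)$ is $dH_n(x) = (1 - G_n(x))\,dF_n(x)$. In view of (31), we have
\[ \begin{aligned}
\lambda_n(t) - \lambda(t) &= \int_0^{T_F} K_j(t,x)\,\frac{dH_n(x)}{1 - H_n(x-)} - \int_0^{T_F} K_j(t,x)\lambda(x)\,dx + O(2^{-j}) \\
&= \int_0^{T_F} K_j(t,x)\,\frac{1 - G_n(x)}{1 - H_n(x-)}\,dF_n(x) - \int_0^{T_F} K_j(t,x)\,d\Lambda(x) + O(2^{-j}) \\
&= \int_0^{T_F} K_j(t,x)\,d\bigl(\Lambda_n(x) - \Lambda(x)\bigr) + O(2^{-j}).
\end{aligned} \tag{33} \]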
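The empirical cumulative hazard $\Lambda_n$ used above can be computed directly from a censored sample. The following Python sketch is illustrative only and is not taken from the paper: it assumes distinct observation times, and it reads the denominator $1 - H_n(s-)$ as the fraction of observations still at risk just before $s$ (the usual Nelson-Aalen convention). The function name `cumulative_hazard` is hypothetical.

```python
def cumulative_hazard(times, events):
    """Empirical cumulative hazard at each uncensored time.

    times  : observed values Z_i (assumed distinct; an assumption of this sketch)
    events : indicators delta_i (1 = uncensored, 0 = censored)
    Returns a list of (Z_i, Lambda_n(Z_i)) pairs for the uncensored Z_i,
    in increasing order of Z_i.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    total = 0.0
    steps = []
    for rank, i in enumerate(order):
        at_risk = n - rank  # number of observations with Z_j >= Z_i
        if events[i]:
            # jump of size (1/n) / (at_risk / n) = 1 / at_risk
            total += 1.0 / at_risk
            steps.append((times[i], total))
    return steps
```

Under these assumptions each uncensored observation contributes a jump equal to the reciprocal of the number of observations still at risk, so censored observations only shrink later denominators.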
However, / Jo
F
Kj(t,x)d(An(x)
- A(x)) = V £F K{2k,Vx)d{An(x)
= Aj{t) - 22i / 0 T F ( A „ ( X ) -
- A{x))
A{x))K'(Vt,2>x)dx,
where Aj{t) = [(A„(x) - A{x))K(2jt,2jx)]oF. By the statement (S2), K(2h,2iTF) = 0 for 0 < t < T>. Since A„(0) = A(0) = 0, we have Ai(i)=0. Hence, in view of (21) with K replaced by K', we have \Xn(t) - X(t)\ < | 2 2 ' / F (A n (^) - A(x))K'(2H,2jx)dx Jo < 2* sup | An(x) - A(x) | xCV j < * sup | A„(«) - A(*) | Since £ € equal to
[0,TF],
*
+
^ ^ _f
1+ 2i(digg>[0>1>]))8
| +0(2-') cfa + 0(2">')
+ 0(2->).
(34)
the expression in the right-hand side of (34) is less or C"2 J 'sup| A„(a:)-A(a;) | . X
By Lemmas 2 and 4 of Ref. 25, s u p 0 < t < r | An(t) — A(t) | behaves like T
T
O(
-
_
_
J . It follows from the last inequality that
sup |A„(i) - \{t)\ < C'M2j f L 0 g L ° S n X ) 5 + 0(2-j) o
a.s.
(35)
References
1. G. S. Watson and M. R. Leadbetter, Hazard analysis I, Biometrika 51, 175-184 (1964).
2. G. S. Watson and M. R. Leadbetter, Hazard analysis II, Sankhya A26, 101-116 (1964).
3. P. Deheuvels, Conditions necessaires et suffisantes de convergence presque sure et uniforme presque sure des estimateurs de la densite, C. R. Acad. Sci. Paris A278, 1217-1220 (1974).
4. A. Foldes, L. Rejto and B. B. Winter, Strong consistency properties of nonparametric estimators for randomly censored data II: Estimation of density and failure rate, Period. Math. Hungar. 12, 15-29 (1981).
5. A. Foldes and L. Rejto, A LIL type result for the product-limit estimator, Z. Wahrsch. Verw. Gebiete 56, 75-86 (1981).
6. W. Stute, A law of the iterated logarithm for kernel density estimators, Ann. Probab. 10, 414-422 (1982).
7. M. Tanner and W. Wang, The estimation of the hazard function from randomly censored data by the kernel method, Ann. Statist. 11, 983-993 (1983).
8. J. Mielniczuk, Some asymptotic properties of kernel estimators of a density function in case of censored data, Ann. Statist. 1, 766-773 (1986).
9. K. E. Gneyou, Inference statistique pour l'analyse du taux de panne en fiabilite, Ph.D. Thesis, unpublished, University Paris VI (France, 1991).
10. K. E. Gneyou, Normalite asymptotique d'une fonctionnelle du taux de panne base sur des donnees avec censures aleatoires a droite, Ann. Uni. Benin, serie Sciences XII, 3-15 (1996).
11. K. E. Gneyou, A functional law of iterated logarithm for a hazard rate process from censored data, J. Rech. Sci. Uni. Benin (Togo) 1(2), 44-48 (1997).
12. H. G. Müller and J. L. Wang, Hazard rate estimation under random censoring varying kernels and bandwidths, Biometrics 50, 61-76 (1994).
13. P. Deheuvels and J. H. J. Einmahl, Functional limit laws for the increments of Kaplan-Meier product-limit processes and applications, Ann. Probab. 3, 1301-1335 (2000).
14. S. Mallat, Multiresolution approximations and wavelet orthonormal bases of $L^2(\mathbb{R})$, Trans. Amer. Math. Soc. 315, 69-87 (1989).
15. Y. Meyer, Ondelettes et operateurs I (Hermann, Paris, France, 1990), pp. 69-87.
16. G. Kerkyacharian and D. Picard, Estimation de densite par methodes de noyau et d'ondelette: les liens entre la geometrie du noyau et les contraintes de regularite, C. R. Acad. Sci. Paris 315(1), 79-84 (1992).
17. G. Kerkyacharian and D. Picard, Density estimation in Besov space, Statist. and Prob. Letters 13, 15-24 (1992).
18. I. Johnstone, G. Kerkyacharian and D. Picard, Estimation d'une densite de probabilite par methodes d'ondelette, C. R. Acad. Sci. Paris 315, 211-216 (1992).
19. P. Hall and P. Patil, Formula for mean integrated squared error of nonlinear wavelet-based density estimators, Ann. Statist. 23, 905-928 (1995).
20. A. Antoniadis, G. Gregoire and G. Nason, Density and hazard rate estimation for right-censored data using wavelet methods, J. R. Statist. Soc. 61, 63-84 (1999).
21. J. B. Aubin and A. Massiani, Comportement asymptotique d'un estimateur de la densite adaptatif par methode d'ondelettes, C. R. Acad. Sci. Paris 337(1), 293-296 (2003).
22. A. Carbonez, L. Györfi and E. C. Van der Meulen, Partition-estimates of a regression function under random censoring, Statist. Decisions 13, 21-37 (1995).
23. W. Härdle, G. Kerkyacharian, D. Picard and A. Tsybakov, Wavelets, Approximation, and Statistical Applications (Springer-Verlag, New York, 1998).
24. E. K. Kaplan and P. Meier, Nonparametric estimation from incomplete observations, J. Amer. Statist. Assoc. 53, 457-481 (1958).
25. S. Diehl and W. Stute, Kernel density and hazard function estimation in the presence of censoring, J. Multivariate Anal. 25, 299-310 (1988).
26. S. Csorgo, Universal gaussian approximations under random censorship, Ann. Statist. 6, 2744-2778 (1996).
27. M. G. Gu and T. L. Lai, Functional laws of the iterated logarithm for the product-limit estimator of a distribution function under random censorship or truncation, Ann. Probab. 18, 160-189 (1990).
LAGUERRE-FREUD EQUATIONS FOR THE RECURRENCE COEFFICIENTS OF SOME DISCRETE SEMI-CLASSICAL ORTHOGONAL POLYNOMIALS OF CLASS TWO
C. HOUNGA,$^{1,2}$ M. N. HOUNKONNOU$^{1,2}$ and A. RONVEAUX$^{1,3}$
$^1$ International Chair in Mathematical Physics and Applications (ICMPA), University of Abomey-Calavi, 072 B.P. 50, Cotonou, Republic of Benin
$^2$ Unite de Recherche en Physique Theorique (URPT), Institut de Mathematiques et de Sciences Physiques (IMSP), Universite d'Abomey-Calavi, 01 B.P. 2628, Porto-Novo, Republique du Benin
$^3$ Departement de Mathematiques, Unite d'Analyse Mathematique et de Mecanique (AMM), Universite catholique de Louvain, Chemin du Cyclotron, 2, B-1348 Louvain-la-Neuve, Belgique
In this paper, we give Laguerre-Freud equations for the recurrence coefficients of discrete semi-classical orthogonal polynomials of class two, when the polynomials in the Pearson equation are of the same degree. The case of generalized Charlier polynomials is also presented.
1. Introduction
Let $\{P_n\}_{n \ge 0}$ be a sequence of monic orthogonal polynomials with respect to the linear functional $\mathcal{L}$, satisfying a three-term recurrence relation,
\[ \begin{cases} P_{n+1}(x) = (x - \beta_n)P_n(x) - \gamma_n P_{n-1}(x), & n \ge 1, \\ P_0(x) = 1, \quad P_1(x) = x - \beta_0, \end{cases} \tag{1} \]
where $\beta_n$ and $\gamma_n$ are complex numbers (with $\gamma_n \ne 0$). By convention, $\gamma_0 = \langle \mathcal{L}, 1 \rangle$. The difference operators $\Delta$ and $\nabla$ are defined by
\[ \Delta P(x) = P(x + 1) - P(x), \qquad \nabla P(x) = P(x) - P(x - 1). \]
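As a concrete illustration of the recurrence (1) and of the operators $\Delta$ and $\nabla$, the following Python sketch builds the first few monic polynomials from given coefficients and applies both difference operators. It is a sketch under stated assumptions, not part of the paper: the helper names (`monic_polys`, `evaluate`, `forward_diff`, `backward_diff`, `pairing`) are hypothetical, and the coefficients used for the demonstration, $\beta_n = n + \mu$ and $\gamma_n = n\mu$, are those of the classical Charlier polynomials with weight $\mu^x/x!$, chosen only because they are known in closed form.

```python
from fractions import Fraction
from math import factorial


def monic_polys(beta, gamma, n_max):
    """Coefficient lists (lowest degree first) of P_0, ..., P_{n_max}, built
    from P_{n+1}(x) = (x - beta(n)) P_n(x) - gamma(n) P_{n-1}(x)."""
    polys = [[Fraction(1)]]
    if n_max >= 1:
        polys.append([-Fraction(beta(0)), Fraction(1)])  # P_1 = x - beta_0
    for n in range(1, n_max):
        p_n, p_prev = polys[n], polys[n - 1]
        nxt = [Fraction(0)] * (n + 2)
        for k, c in enumerate(p_n):
            nxt[k + 1] += c                    # x * P_n
            nxt[k] -= Fraction(beta(n)) * c    # -beta_n * P_n
        for k, c in enumerate(p_prev):
            nxt[k] -= Fraction(gamma(n)) * c   # -gamma_n * P_{n-1}
        polys.append(nxt)
    return polys


def evaluate(coeffs, x):
    """Horner evaluation of a coefficient list at x."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def forward_diff(coeffs, x):
    """(Delta P)(x) = P(x + 1) - P(x)."""
    return evaluate(coeffs, x + 1) - evaluate(coeffs, x)


def backward_diff(coeffs, x):
    """(nabla P)(x) = P(x) - P(x - 1)."""
    return evaluate(coeffs, x) - evaluate(coeffs, x - 1)


def pairing(p, q, mu, cutoff=60):
    """Truncated <L, p q> for the Charlier weight mu^x / x!, x = 0, 1, 2, ..."""
    return sum(
        float(mu) ** x / factorial(x) * float(evaluate(p, x) * evaluate(q, x))
        for x in range(cutoff + 1)
    )


# Illustration only: classical Charlier coefficients beta_n = n + mu, gamma_n = n * mu.
MU = Fraction(1, 2)
P = monic_polys(lambda n: n + MU, lambda n: n * MU, 4)
```

With these coefficients, `pairing(P[2], P[3], MU)` vanishes up to rounding, which is the orthogonality encoded by (1). Exact rational arithmetic is used for the polynomial coefficients so that only the truncated sum over the weight is approximate.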
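Definition 1.1. The linear functional $\mathcal{L}$ is said to be discrete semi-classical if $\mathcal{L}$ is regular and there exist two polynomials, $\tau$ of degree $\ge 1$ and $\sigma$, such that
\[ \Delta(\sigma\mathcal{L}) = \tau\mathcal{L}, \qquad \langle \tau\mathcal{L}, P \rangle = \langle \mathcal{L}, \tau P \rangle, \qquad \langle \Delta\mathcal{L}, P \rangle = -\langle \mathcal{L}, \nabla P \rangle. \tag{2} \]
Moreover, if $\mathcal{L}$ is discrete semi-classical, the class of $\mathcal{L}$, denoted $cl(\mathcal{L})$, is defined as
\[ cl(\mathcal{L}) = \min\{\max(-2 + \deg(\sigma),\ -1 + \deg(\tau))\}, \]
where the minimum is taken over all pairs of polynomials $\sigma$ and $\tau$ of degree at least unity satisfying $\Delta(\sigma\mathcal{L}) = \tau\mathcal{L}$. We consider the system
\[ \begin{cases} \Delta(\sigma\mathcal{L}) = \tau\mathcal{L} \quad \text{(Pearson equation)}, \\ \langle \Delta(\sigma\mathcal{L}), P_n(x)P_n(x) \rangle = \langle \tau\mathcal{L}, P_n(x)P_n(x) \rangle, \\ \langle \Delta(\sigma\mathcal{L}), P_n(x)P_{n+1}(x) \rangle = \langle \tau\mathcal{L}, P_n(x)P_{n+1}(x) \rangle. \end{cases} \tag{3} \]
In this work, we derive the Laguerre-Freud equations for the recurrence coefficients $\beta_n$ and $\gamma_n$ of discrete semi-classical orthogonal polynomials of class two. The linear functional $\mathcal{L}$ satisfies (2) with the polynomials $\sigma$ and $\tau$ defined as $\tau(x) = a_0 + a_1x + a_2x^2 + a_3x^3$ and $\sigma(x) = b_0 + b_1x + b_2x^2 + b_3x^3$.
2. The Coefficients $B_n^k$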
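The coefficients $B_n^k$ derive from the action of the linear form $\mathcal{L}$ on $x^{n+k}P_n(x)$ (Ref. 3),
\[ B_n^k = \langle \mathcal{L}, x^{n+k}P_n(x) \rangle, \]
with the initial condition $B_n^0 = \langle \mathcal{L}, x^nP_n(x) \rangle = \langle \mathcal{L}, P_n(x)P_n(x) \rangle = I_{0,n}$. Using the orthogonality property, we obtain
\[ \begin{aligned}
B_n^1 &= -T_{n+1,1}\,I_{0,n}, \\
B_n^2 &= (T_{n+1,1}T_{n+2,1} - T_{n+2,2})\,I_{0,n}, \\
B_n^3 &= [T_{n+1,1}(T_{n+3,2} - T_{n+2,1}T_{n+3,1}) + T_{n+3,1}T_{n+2,2} - T_{n+3,3}]\,I_{0,n},
\end{aligned} \tag{4} \]
where the coefficients $T_{n,j}$ are defined by $P_n(x) = \sum_{j=0}^{n} T_{n,j}\,x^{n-j}$. The polynomials $P_n(x + 1)$, $\Delta P_n(x)$ and $P_n(x - 1)$ are expanded (Refs. 3, 4) as
\[ P_n(x + 1) = x^n + \Bigl(n - \sum_{i=0}^{n-1}\beta_i\Bigr)x^{n-1} + \Bigl[\binom{n}{2} - (n-1)\sum_{i=0}^{n-1}\beta_i + \sum_{0 \le i < j \le n-1}\beta_i\beta_j - \sum_{k=1}^{n-1}\gamma_k\Bigr]x^{n-2} + \cdots, \tag{5} \]
\[ \Delta P_n(x) = n x^{n-1} + \Bigl[\binom{n}{2} - (n-1)\sum_{i=0}^{n-1}\beta_i\Bigr]x^{n-2} + \Bigl[\binom{n}{3} - \binom{n-1}{2}\sum_{i=0}^{n-1}\beta_i + (n-2)\sum_{0 \le i < j \le n-1}\beta_i\beta_j - (n-2)\sum_{k=1}^{n-1}\gamma_k\Bigr]x^{n-3} + \cdots, \]
\[ P_n(x - 1) = x^n + \Bigl(-n - \sum_{i=0}^{n-1}\beta_i\Bigr)x^{n-1} + \Bigl[(n-1)\sum_{i=0}^{n-1}\beta_i + \binom{n}{2} + \sum_{0 \le i < j \le n-1}\beta_i\beta_j - \sum_{k=1}^{n-1}\gamma_k\Bigr]x^{n-2} + \cdots. \tag{6} \]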
3. The Structure Relation
Lemma 3.1. Let $\mathcal{L}$ be a linear functional satisfying $\Delta(\sigma\mathcal{L}) = \tau\mathcal{L}$, where $\sigma(x) = b_3x^3 + b_2x^2 + b_1x + b_0$ and $\tau(x) = a_3x^3 + a_2x^2 + a_1x + a_0$ with $|a_3|\,(|b_3| + |b_2| + |b_1| + |b_0|) \ne 0$. Then, the corresponding family of monic discrete semi-classical orthogonal polynomials of class $s = 2$ verifies the following structure relation,
\[ \sigma(x)\nabla P_n(x) = \sum_{k=n-3}^{n+2} \lambda_{n,k}P_k(x), \tag{7} \]
with
\[ \lambda_{n,n+2} = n b_3, \]
A„,n+i = nb2 +
An,n = nbi +
b2 ^
+
'
i=0
;(;)-©*+e->I> n-1
-f- np;» + n ( 7 „ + 7„ + i)
+ E^? + i=0
63 n—1
n—1
i=0
i=l
n-l
n-l
Wi = E ^ -
2r
i=0
+2^
n-l
+
li + nyn
L i=l n-l
3 ^ Z 7i(ft-i + ft) + 7« ^ ft + n(/3„_i1 + /3fti)7n n)7„ 4=0
L i=l
+ (I-)|A+(;)
n-l 2
62 +
&3
n-l
/
»=0
v
x
" &3
x
.
n-l
n-l
-3)J> i<j
1=1
• (
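We get the following expressions for the coefficients $\lambda_{n,n-1}$, $\lambda_{n,n-2}$ and $\lambda_{n,n-3}$,
\[ \begin{aligned}
\lambda_{n,n-1} ={}& -a_1\gamma_n - \gamma_n[n - 1 + \beta_{n-1} + \beta_n]a_2 - (n-1)\gamma_n b_2 \\
& - a_3\gamma_n\Bigl[\beta_n^2 + \beta_{n-1}\beta_n + \beta_{n-1}^2 + \gamma_{n+1} + \gamma_n + \gamma_{n-1} + (n-2)(\beta_{n-1} + \beta_n) + \sum_{i=0}^{n}\beta_i + \binom{n-1}{2}\Bigr] \\
& - \Bigl[(n-2)(\beta_{n-1} + \beta_n) + \sum_{i=0}^{n}\beta_i + \binom{n-1}{2}\Bigr]\gamma_n b_3, \\
\lambda_{n,n-2} ={}& -[(n-2)(a_3 + b_3) + a_3(\beta_n + \beta_{n-1} + \beta_{n-2}) + a_2]\,\gamma_n\gamma_{n-1}, \\
\lambda_{n,n-3} ={}& -a_3\gamma_n\gamma_{n-1}\gamma_{n-2}.
\end{aligned} \tag{8} \]
Proof. Multiplying (7) by $P_k(x)$, we deduce
\[ P_k(x)\sigma(x)\nabla P_n(x) = \sum_{j=n-3}^{n+2} \lambda_{n,j}P_j(x)P_k(x), \tag{9} \]
and, for $n - 3 \le k \le n + 2$,
\[ \lambda_{n,k}I_{0,k} = \langle \mathcal{L}, P_k(x)\sigma(x)\nabla P_n(x) \rangle = \langle \mathcal{L}, P_k(x)\sigma(x)\Delta P_n(x - 1) \rangle = -\langle \tau\mathcal{L}, P_k(x + 1)P_n(x) \rangle - \langle \sigma(x)\mathcal{L}, P_n(x)\Delta P_k(x) \rangle, \]
from which we finally obtain the coefficients (8). •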
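4. Laguerre-Freud Equations
Lemma 4.1. The Laguerre-Freud equations of discrete semi-classical orthogonal polynomials of class $s = 2$ in terms of the coefficients $\lambda_{n,k}$ are the following,
\[ \begin{aligned}
-\langle \tau\mathcal{L}, P_n(x)P_n(x) \rangle ={}& \lambda_{n,n-3}\langle \mathcal{L}, P_n(x - 1)P_{n-3}(x) \rangle + \lambda_{n,n-2}\langle \mathcal{L}, P_n(x - 1)P_{n-2}(x) \rangle \\
& + \lambda_{n,n-1}\langle \mathcal{L}, P_n(x - 1)P_{n-1}(x) \rangle + 2\lambda_{n,n}I_{0,n},
\end{aligned} \tag{10} \]
and
\[ \begin{aligned}
-\langle \tau\mathcal{L}, P_n(x)P_{n+1}(x) \rangle ={}& \lambda_{n+1,n-2}\langle \mathcal{L}, P_n(x - 1)P_{n-2}(x) \rangle + \lambda_{n+1,n-1}\langle \mathcal{L}, P_n(x - 1)P_{n-1}(x) \rangle \\
& + \lambda_{n+1,n}I_{0,n} + \lambda_{n,n+1}\gamma_{n+1}I_{0,n}.
\end{aligned} \tag{11} \]
Proof. Equations (10) and (11) are immediate from the expansion of (3), the structure relation (7) and the relation of orthogonality. •
Lemma 4.2. The following relations hold
\[ \begin{aligned}
\langle \mathcal{L}, P_n(x - 1)P_{n-3}(x) \rangle ={}& \Bigl[-\sum_{i=0}^{n-3}\beta_i^2 + (n-1)\sum_{i=0}^{n-3}\beta_i + (\beta_{n-2} + \beta_{n-1})\sum_{i=0}^{n-3}\beta_i - 2\sum_{i=1}^{n-1}\gamma_i + n\gamma_{n-1} \\
& \ - \binom{n-1}{2}(\beta_{n-2} + \beta_{n-1}) + (2 - n)\beta_{n-2}\beta_{n-1} - \binom{n}{3}\Bigr] I_{0,n-3}, \\
\langle \mathcal{L}, P_n(x - 1)P_{n-2}(x) \rangle ={}& \Bigl[\binom{n}{2} + (n-1)\beta_{n-1} - \sum_{i=0}^{n-2}\beta_i\Bigr] I_{0,n-2}, \\
\langle \mathcal{L}, P_n(x - 1)P_{n-1}(x) \rangle ={}& -n\,I_{0,n-1}.
\end{aligned} \tag{12} \]
Proof. Using the expansion of $P_n(x - 1)$ defined in (6), the relation of orthogonality and the coefficients $B_n^k$, we obtain the expressions (12). •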
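The last relation in (12) lends itself to a quick numerical sanity check. The following Python sketch is an illustration under stated assumptions rather than part of the paper: it uses the classical Charlier functional (weight $\mu^x/x!$ on $x = 0, 1, 2, \ldots$, with $\beta_k = k + \mu$ and $\gamma_k = k\mu$) as a convenient test case, truncates the infinite sum at a finite cutoff, and uses hypothetical helper names.

```python
import math


def charlier(n, x, mu):
    """Monic Charlier polynomial P_n(x), evaluated with the three-term
    recurrence using beta_k = k + mu and gamma_k = k * mu."""
    prev, cur = 0.0, 1.0
    for k in range(n):
        prev, cur = cur, (x - (k + mu)) * cur - k * mu * prev
    return cur


def pairing(func, mu, cutoff=80):
    """Truncated action <L, func> = sum over x >= 0 of mu^x / x! * func(x)."""
    return sum(
        math.exp(x * math.log(mu) - math.lgamma(x + 1)) * func(x)
        for x in range(cutoff + 1)
    )


def shifted_pairing(n, mu):
    """Return (<L, P_n(x-1) P_{n-1}(x)>, -n * I_{0,n-1}) for comparison."""
    lhs = pairing(lambda x: charlier(n, x - 1, mu) * charlier(n - 1, x, mu), mu)
    rhs = -n * pairing(lambda x: charlier(n - 1, x, mu) ** 2, mu)
    return lhs, rhs
```

For small $n$ and moderate $\mu$ the two returned values should agree up to rounding, since the truncated tail of the weight is negligible; the check says nothing about other functionals, for which the weight would have to be replaced.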
Theorem 4.1. The Laguerre-Freud equations for the recurrence coefficients of discrete semi-classical orthogonal polynomials of class s = 2 are the following, o )] = fl3l2 I ] 7i + «(7n+i + 7n+i) + + ( 2 ) ^ " + ( 3 ) + "t^ifin
+ Pn-l)
+ Pniln
+ 7n+l)
n—1
n—1
+7n+i(/?n + 0„+i)] + r(/3„) + J2 e0MPi)
e
+ 2 Yl
j=0 n-1
(n-l)Y^Pi
?MPi)
i=0 ,
v
+63[2n(7n+i+7n+i)+4^7i + 2 ( " ) ] , »=i (it)
^
(13)
'
- a 3 7 n + l (7n + 7 n + l + 7 n + 2 + P\ + PnPn+1 + Pl+l) X
~ [03 + (2fl + 1)6 3 ]
/Wl7n+1 =
(
\
n—1
n
2) - ^ f t - " ( / 3 n + l + ^ n + n - 1 ) ] + IZ<J^) '
i=0
i=l
i=0
£
^
(2n + l ) 7 „ + i - n ^ f t + f "
' 3
i=0
•2
PiPj-nY.fi
0
-(2n-l)Yli-nln+l+(n\J2Pi-(n\l)] j=l
n
i=0
+^[3^7i(ft-i+A)+27„+i(n^ + ^ / 3 i ) -
i=0
+ b2[2J2li+ ^
J ] - &i f "
' 2
i=l
J+rca27n+i +
[01 +a 2 /8„]7„+i,
(14)
where A>=~, W ^ ^ " ^ , r(/30) = -7i[a2+03(2i8o+/3i)], Mo a; — a 2ft o-(/?o) + 7i( 3/9o + h + ai + a2p0) = - 7 1 [(«2 + b3)Pi + 03(71 + 72
+PI+P0P1+PI)}.
(15)
$M_0$ and $M_1$ are the moments of order 0 and 1 with respect to the linear form, respectively.
Proof. From Lemmas 3.1, 4.1 and 4.2, the required equations follow directly. •
Setting $a_3 = 0$ in Eqs. (13) and (14), we recover the Laguerre-Freud equations for the recurrence coefficients of discrete semi-classical orthogonal polynomials of class $s = 1$ (Ref. 3).
5. Applications
The above formalism can be applied to the generalized Charlier orthogonal polynomials, denoted by $C_n(x)$, with the weight $\rho_r$ given by $\rho_r(i) = \mu^i/(i!)^r$, $i = 0, 1, 2, \ldots$, $\mu > 0$, for which $\sigma(x) = x^r$ and $\tau(x) = \mu - x^r$. Putting $r = 3$, the weight $\rho_3(i)$ for the generalized Charlier polynomials (Ref. 6) of class $s = 2$ is obtained. The corresponding Laguerre-Freud equations for the recurrence coefficients read as,
/Wi =
It
±
IL
E #
i=0
- U j ' -
7n+2 =
^
It
fc=l
5 n +
A.
i=0
( 3 ) -7n(/3n + / 3 n - l ) - / 8 n ( ^ + 7 n ) + A i ] + n - 2 / 3 n ,
n
n
EA 'n+1
J.
+ 2 5 > * + (0n-n + l ) 2 > + n ( / £ + 7 » )
3
n
-"Z)#-
E
i=0
0
i=0
' »=o
+ E &+ (
^ J + 3 E ^
t=i n
2
+
A-l)
i=l
^
'
) ~(7n+i+7n+&+,8"+i+/3«/3"+1)'
with the initial value $\beta_0 = M_1/M_0$. Here $M_0$ and $M_1$ are the moments of order 0 and 1 given by
\[ M_0(\mu) = \sum_{j=0}^{\infty} \frac{\mu^j}{(j!)^3}, \qquad M_1(\mu) = \sum_{j=0}^{\infty} \frac{j\,\mu^j}{(j!)^3}. \tag{16} \]
These quantities $M_0(\mu)$ and $M_1(\mu)$ converge for all $\mu > 0$. The recurrence relations for the generalized Charlier polynomials (G-Ch) with $r = 2$ can be rewritten in terms of a discrete Painleve II equation (d-P II) (Ref. 8), which allows one to prove the asymptotic behaviour of $\beta_n$ and $\gamma_n$ conjectured in Ref. 6 ($\lim_{n\to\infty}\beta_n/n = 1$, $\lim_{n\to\infty}\gamma_n = \mu$). It would therefore be interesting to investigate the G-Ch with $r = 3$ in both aspects: the possible reduction of the Laguerre-Freud equations to some discrete Painleve equation and the asymptotic behaviour of $\beta_n$ and $\gamma_n$. For $r = 1$ and $r = 2$, $\lim_{n\to\infty}\beta_n/n = 1$, a result that could lead to the same asymptotic trend for the case $r = 3$. The limit of $\gamma_n$ seems trickier to determine. From $\lim_{n\to\infty}\gamma_n/n = \mu$ for $r = 1$ and $\lim_{n\to\infty}\gamma_n = \mu$ for $r = 2$, we tend to conjecture $\lim_{n\to\infty} n\gamma_n = \mu$ for $r = 3$.
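The asymptotic questions raised above can be explored numerically without the Laguerre-Freud equations themselves. The following Python sketch is illustrative only: it is not the recursion derived in this paper, but the standard discretized Stieltjes procedure applied to the weight $\mu^i/(i!)^r$ on a truncated support. The function names and the default cutoff are assumptions made for the example.

```python
import math


def stieltjes(weights, nodes, count):
    """Discrete Stieltjes procedure.

    Returns (betas, gammas) with `count` entries each, where gammas[0] is the
    total mass <L, 1> and, for k >= 1, gammas[k] = <L, P_k^2> / <L, P_{k-1}^2>.
    """
    p_prev = [0.0] * len(nodes)
    p_cur = [1.0] * len(nodes)
    norm_prev = None
    betas, gammas = [], []
    for k in range(count):
        norm = sum(w * p * p for w, p in zip(weights, p_cur))
        beta = sum(w * x * p * p for w, x, p in zip(weights, nodes, p_cur)) / norm
        gamma = norm if k == 0 else norm / norm_prev
        betas.append(beta)
        gammas.append(gamma)
        # P_{k+1} = (x - beta_k) P_k - gamma_k P_{k-1}, with P_{-1} = 0
        coupling = gamma if k > 0 else 0.0
        p_prev, p_cur = p_cur, [
            (x - beta) * pc - coupling * pp
            for x, pc, pp in zip(nodes, p_cur, p_prev)
        ]
        norm_prev = norm
    return betas, gammas


def generalized_charlier(mu, r, count, cutoff=150):
    """Recurrence coefficients for the weight mu^i / (i!)^r on i = 0..cutoff."""
    nodes = list(range(cutoff + 1))
    # Weights are formed through logarithms so that (i!)^r cannot overflow.
    weights = [math.exp(i * math.log(mu) - r * math.lgamma(i + 1)) for i in nodes]
    return stieltjes(weights, nodes, count)
```

For $r = 1$ the output can be compared with the classical Charlier values $\beta_n = n + \mu$ and $\gamma_n = n\mu$, and for any $r$ the first entry of `betas` is $M_1/M_0$ by construction. In floating point the procedure is only trustworthy for moderate $n$, so it is a way to form conjectures rather than a substitute for the analysis.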
References
1. T. S. Chihara, An Introduction to Orthogonal Polynomials (Gordon and Breach, New York, 1978).
2. S. Belmehdi and A. Ronveaux, Laguerre-Freud equations for the recurrence coefficients of semi-classical orthogonal polynomials, J. Approx. Theory 76, 351-368 (1994).
3. M. Foupouagnigni, M. N. Hounkonnou and A. Ronveaux, Laguerre-Freud equations for the recurrence coefficients of $D_\omega$-semi-classical orthogonal polynomials of class one, J. Comp. App. Math. 99, 143-154 (1998).
4. L. D. Salto, Polinomios $D_\omega$-semiclasicos, Ph.D. Thesis, unpublished, Universidad de Alcala de Henares (Spain, 1995).
5. M. Foupouagnigni, Equations de Laguerre-Freud: Cas des polynomes orthogonaux semi-classiques de classe 2, Master's Thesis, unpublished, Institut de Mathematiques et de Sciences Physiques (IMSP, Republique du Benin, 1995).
6. M. N. Hounkonnou, C. Hounga and A. Ronveaux, Discrete semi-classical orthogonal polynomials: Generalized Charlier, J. Comp. App. Math. 114, 361-366 (2000).
7. E. Azatassou, M. N. Hounkonnou and A. Ronveaux, Laguerre-Freud equations for semi-classical operators, Proceedings of the First International Workshop on Contemporary Problems in Mathematical Physics, Cotonou, Benin, eds. J. Govaerts, M. N. Hounkonnou and W. A. Lester, Jr. (World Scientific, Singapore, 1999), pp. 336-346.
8. W. Van Assche and M. Foupouagnigni, Analysis of nonlinear recurrence relations for the recurrence coefficients of generalized Charlier polynomials, Journal of Nonlinear Mathematical Physics 10, Supplement 2, 231-237 SIDE V (2003).
A STUDY OF A VISCOELASTIC CONTACT PROBLEM
O. P. LAYENI* and A. P. AKINOLA
Department of Mathematics, Obafemi Awolowo University, Ile-Ife 220005, Republic of Nigeria
E-mail: [email protected]
We give existence results for unilateral contact between a certain special case of a class of nonlinear viscoelastic bodies and a proposed harmonic deformable foundation under Tresca friction. The harmonic foundation is defined in terms of a subdifferential and incorporates in some limits the cases of a rigid foundation and a Winkler foundation. Our results are obtained using tools from the theories of nonlinear functional analysis and variational inequalities.
Keywords: Viscoelasticity; unilateral contact; harmonic foundation.
1. Introduction
In this contribution, we discuss the well-posedness of a quasistatic unilateral contact problem between a nonlinear viscoelastic material and a foundation which is not (necessarily) rigid. The constitutive property of the viscoelastic material, which is motivated by the model of Barboteu et al. (Ref. 1), is in terms of the small strain $E(u)$ and the Cauchy stress $\Sigma$, of the form
\[ \Sigma = \Sigma_{El} + \Sigma_{Vs}, \tag{1} \]
where $\Sigma_{El} = \mathfrak{G}E(u)$ and $\Sigma_{Vs} \in (\partial\mathfrak{B} + \mathfrak{A})E(\dot u) = \partial\mathfrak{R}E(\dot u)$. Here, $E(u)$ is the small strain tensor function, $\mathfrak{A}$ is a symmetric and elliptic operator and $\mathfrak{B}$ is a nonlinear operator defined over strain rates, while $\mathfrak{G}$ is a nonlinear operator defined over the strain tensor field and space variable, Lipschitz continuous with respect to the strain and Lebesgue measurable with respect to the space variable. A stacked dot stands for a time derivative of the respective argument. Many authors in recent years have considered contact problems involving nonlinear materials (e.g., Refs. 2-5). Here, our nonlinear material is more general than that of Ref. 1 in that both stress components, due to the viscoelastic and elastic parts, are nonlinear functions of the sought deformation. In addition, we consider a unilateral frictional contact with a harmonic foundation whose reaction is described by a subdifferential expression ($-\Sigma_N \in \partial j_N$), with
\[ j_N(u_N) = \begin{cases} \infty & \text{for } u_N > 1, \\ k(1 - \cos(u_N)(x)) & \text{for } 1 \ge u_N \ge 0, \\ 0 & \text{for } u_N < 0, \end{cases} \tag{2} \]
where $N$ is the unit normal vector to the body and the subscript $N$ denotes the normal component of the respective argument, while $k \in \mathbb{R}^+$. This is an example of the class of deformable foundations given in Ref. 6. It should be noted that the above foundation captures the nonlinear Winkler foundation (Ref. 7) for small deformation theories and models a rigid foundation as $k$ becomes very large, i.e., $k \to \infty$.
* Corresponding author.
2. Problem Setting
The physical setting is the following. A nonlinear viscoelastic body with law (1) occupies a bounded domain $\Omega \times I$, over a finite time $I$, with a regular boundary partitioned into three mutually disjoint measurable portions $\partial\Omega_C \times I$, $\partial\Omega_D \times I$ and $\partial\Omega_F \times I$ such that $\operatorname{meas}\,\partial\Omega_X > 0$ ($X = C$, $D$ and $F$). The body is clamped on $\partial\Omega_D$, a volume force is applied in $\Omega$, traction is prescribed on $\partial\Omega_F$, and $\partial\Omega_C$ is such that the body may come into frictional quasistatic contact with a harmonic deformable foundation. Finally, $\mathbb{E}^n$ and $\mathbb{S}^n$ are the $n$ dimensional Euclidean and Symmetric function spaces, respectively. We further give necessary notations useful in the sequel. Let $H^k = W^{k,2}$ denote the Sobolev space of functions whose distributional derivatives up to the $k$-th order are in the Lebesgue space $L^2$. Also, $(\cdot, \cdot)$ denotes the duality product between the spaces involved.
Lemma 2.1. Given $\Sigma_{El} = \mathfrak{G}E(u)$ and $\Sigma_{Vs} \in (\partial\mathfrak{B} + \mathfrak{A})E(\dot u) = \partial\mathfrak{R}E(\dot u)$, $\Sigma = \Sigma_{El} + \Sigma_{Vs}$ is equivalent to $\Sigma \in \partial\mathfrak{R}E(\dot u) + \mathfrak{G}E(u)$.
Proof. This follows simply from the definition of the subdifferential. Given that tVs e (<9
- E(u)) (d
- E(u)\
,
for all E{w) in an admissible space. Hence, the sum £ = Y,EI + 5V* is such that t<E
(3)
•
which concludes the proof.
Next, we give our problem statement in light of the equivalence of (1) and (3) in the following description. Nonlinear Problem 1 (NPi): Find a displacement vector field u : fl x I -} ln and a stress field £ fixing" such that £ G m.E(u) + <3E(u); V-t
+ pf = 0
in Q x I;
u =0 t-N
on <9fi£> x I; = t
-t,N(uN) k u(0) =
E PN(UN) UQ
(4)
on 9 0 ^ x I; =
djN(uN)
and u(0) = u\
on 9fic x I; in flxl;
with jjv given in (2) together with the Tresca friction condition J | £ T | < h(t), | £ r | < h(t) =^ u r = 0
on <9Qc x I,
1 | £ r | = h(t) =£> there exists A > 0
such that £ r = — XuT.
(5)
3. Variational Formulation and Results Let the set V be defined as V = {u G H11 u = 0 on dnD x 1 } , and define also /C, the set of admissible displacements, by K, = {u G V | uN < 0 on dflc x 1}.
(6)
Lemma 3.1. Given u and £ regular enough satisfying (4) and (5), the NP\ has a variational form: Find u G K such that
E{u)\ (7)
(g,w-u)
+ J{w) - J(u) + *(h(t),w)
- V(h(t),u)
>0
for all w G /C, u(0) = uo where i&(h(t),w) = / \h(t)\ \wT(x)\ds and J{w) = / jN(w)ds JdQcXl JdQxI (g, w-u)
= (pf,
w - u)^^
+ (F, w - U)9UF
and XI
.
Proof. This follows straightaway from taking a product of the equilibrium equation with a test function x (which is homogeneous in $w - u$, where $w \in \mathcal{K}$), using the constitutive law, integrating over $\Omega \times I$, applying Green's theorem and reflecting the boundary conditions in (4) and (5). •
We will need the following well-posedness result.
Theorem 3.1. Let $f$ be given in $L^2(0,T;H)$ and $x_0 \in \overline{D(\varphi)}$. Then there exists a unique function $x \in C([0,T];H) \cap W^{1,2}(]0,T];H)$ which satisfies
\[ \dot x(t) + \partial\varphi(x(t)) - \omega x(t) \ni f(t) \quad \text{a.e. } t \in\, ]0,T[, \qquad x(0) = x_0, \]
almost everywhere on $]0,T[$. If $x_0 \in D(\varphi)$ then $x \in W^{1,2}([0,T];H)$ and
(9)
Proof. A detailed proof of the above theorem due to H. Brezis can be found on page 35 of Ref. 8. •
Remark 3.1. We note that $\Psi(h(t),w)$ and $J(w)$ are the contributions due to friction and the harmonic foundation on the boundary $\partial\Omega_C \times I$, respectively. In addition, we note that the contribution due to the Tresca friction is convex but nondifferentiable.
To address the observation in Remark 3.1, given $\delta > 0$, we introduce the regularized functional $\Psi(h(t),w_\delta)$, which is differentiable and such that $\Psi(h(t),w_\delta) \to \Psi(h(t),w)$ and $w_\delta \to w$ as $\delta \to 0$. The regularized problem now reads as follows.
Find u 6 K such that
- J[us)
+ ^(h(t),ws)
- V(h(t),us)
E(us)) = 0
for all ws S K., ug(0) = uosIn order to study the variational equation (10), we consider the following related variational inequality. Find u € K, such that
+ V(h(t),w5)
-
- V(h(t),us)
E(u5)) >0
for all ws € /C and us(0) = uosLet (0lE(u6),E(ws)
-E(us))
= (Us,ws-us)v
(12)
Also, let W = £>(Q3) = $E(u) € S n |
•<*<4» = {° S f K
<13)
[oo if E(u) f W. By Riesz representation theorem, let there be a (Lipschitz monotone) operator Z such that <SE{us),E{w5)-E{us)) =(Zu6,uls-u6)v. (14) Reflecting the definition of the indicator 3K of a convex set /C and (11)(14) above, we have the following result. Lemma 3.2. Given (12)-(14), the variational problem (11) is equivalent to the Nonlinear Problem 2 (NP2): Find a displacement vector field uj : fl x I 4 t " such that us + Zu5 + d&(u5) 3 g, <5{us) = J (Sis) + V(us) + ?ic{us), us(0) — uos-
(15)
We consider a special case of the regularized problem NP2 wherein Z = dy-al , a e l
(16)
Consequently, we have the following result. Lemma 3.3. Suppose Z = dj - al, a E E and u0d € D(d&) D D(dj). The differential inclusion (15) is equivalent to u5 - aus + &°(us) 3 9, 6°(tfc) = d(6(us)
+ 7(««)) = d
(17)
us(0) = uosProof. Substituting Z = d^ — al into the inclusion (14), we have us — aus + d&(u*s) + dl{us) 3 9- Due to the convexity of <3 with respect to the displacement ug and the domain definition of uos, d&(us) + d^{us) = d(&(its) + l(us)) = dip(u), say. Consequently, we have the differential inclusion (17). •
Theorem 3.2. Suppose (2), (12), (14), (16) and g € L 2 (I;V), under the assumptions of Lemma 3.3 there exists a unique us 6 Hl(J.\V) satisfying the differential inclusion (17) almost everywhere on I. If however geW^^V), then us e W ^ f t V ) . Proof. We apply Theorem 3.1 to (13). Drawing on Ref. 8 and the statement of Theorem 3.1 we note that a precondition for the existence of a unique solution is that the sum (3° be maximal monotone. d
Remark 3.2. We have shown the existence of a unique solution for a differential inclusion problem, and consequently for a regularized problem, involving a special case of a nonlinear viscoelastic material in frictional contact with a harmonic foundation. A model such as this is realizable in Kelvin-Voigt viscoelastic materials whose viscoelastic stress contribution is ideally locking (e.g., of limited compressibility, as in some rubber materials) and whose elastic stress contribution is either linear (possessing the usual properties of symmetry and ellipticity) or elastic hardening (e.g., $\Sigma_{El} = a(I - \mathcal{P}_M)E(u) + b\Sigma^*$, where $\mathcal{P}_M$ is a projection tensor operator over a convex set $M$, $\Sigma^*$ is a Cauchy stress, $a \ne 0 \in \mathbb{R}$ and $b \in \mathbb{R}^+ \cup \{0\}$). This result embodies the cases of unilateral contact of this class of bodies with a rigid elastic foundation.
References
1. M. Barboteu, W. Han and M. Sofonea, A frictional contact problem for viscoelastic materials, Journal of Applied Mathematics 2(1), 1-21 (2002).
2. A. Amassad and C. Fabre, On the analysis of a viscoplastic contact problem with time dependent Tresca's friction law, Electron. J. Math. Phys. Sci. 1, 47-71 (2002).
3. M. Sofonea, On a contact problem for elastic-viscoplastic bodies, Nonlinear Analysis, Theory, Methods and Application 29(9), 1037-1050 (1997).
4. M. Rochdi, M. Shillor and M. Sofonea, Quasistatic viscoelastic contact with normal compliance and friction, Journal of Elasticity 51, 105-126 (1998).
5. W. Han and M. Sofonea, Evolutionary variational inequalities arising in viscoelastic contact problems, SIAM J. Numer. Anal. 38(2), 556-579 (2000).
6. O. P. Layeni and A. P. Akinola, A note on a contact problem involving a class of deformable foundation, submitted for publication (2005).
7. N. Kikuchi and J. T. Oden, Contact Problems in Elasticity: A Study of Variational Inequalities and Finite Element Methods (SIAM, New York, 1998).
8. V. Barbu, Optimal Control of Variational Inequalities (Pitman, Boston, 1988).
List of Participants
A. Abdelhamit Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso D. B. Adekanmbi Department of Pure and Applied Mathematics, Ladoke Akintola University of Technology, Ogbomoso, Republic of Nigeria A. Afouda International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin H. Agbodjalou International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin F. Agusto Department of Mathematical Sciences, Federal University of Technology, Akure, Republic of Nigeria B. Ahounou Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin A. Akpo Faculte des Sciences et Techniques, University of Abomey-Calavi, Republic of Benin S. T. Ali Department of Mathematics and Statistics, Concordia University, Montreal, Quebec, Canada, and, International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin
K. A. Araoun University of Abomey-Calavi, Republic of Benin A. Anjorin International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin, and, Lagos State University, Republic of Nigeria J.-P. Antoine Institute of Theoretical Physics, Catholic University of Louvain, Louvain-la-Neuve, Belgium, and, International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin K. Assiamoua Department of Mathematics, University of Lome, Republic of Togo T. Assih Department of Physics, University of Lome, Republic of Togo S. E. Attakpa University of Abomey-Calavi, Republic of Benin A. Attogouinon International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin G. Y. H. Avossevou Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin, and, International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin
E. Azatassou International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin E. Baloitcha International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin U. N. Bassey Department of Mathematics, University of Ibadan, Ibadan, Republic of Nigeria Z. Beheton International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin J. Ben Geloun International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin G. Bissanga Faculte des Sciences, Departement de Mathematiques, Universite Marien Ngouabi, Brazzaville, Republic of Congo T. L. Bla Institut de Recherche Mathematique, University of Cocody-Abidjan, Ivory Coast Th. Bouetou Departement de Mathematiques et Genie Informatique, Ecole Nationale Superieure Polytechnique, Yaounde, Republic of Cameroon O. J. Chabi Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin F. Dagan University of Abomey-Calavi, Republic of Benin
E. D. Dagbenonmakin International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin A. S. d'Almeida Department of Mathematics, University of Lome, Republic of Togo G. Debiais Groupe de Physique Fondamentale, Universite de Perpignan, Perpignan, France S. Degla Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin R. de Mello Koch Department of Physics and Center for Theoretical Physics, University of the Witwatersrand, Wits, Republic of South Africa, and, Stellenbosch Institute for Advanced Studies (STIAS), Stellenbosch, Republic of South Africa H. Derra Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso A. S. Diallo Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin A. Dossou Universite des Sciences et Techniques de Masuku, Gabon H. Enjieu Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
M. Essoun Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin O. A. Fadipe-Joseph Department of Mathematics, University of Ilorin, Ilorin, Republic of Nigeria K. Fall University Gaston Berger, Saint-Louis, Republic of Senegal S. J. Gates, Jr. Physics Department, University of Maryland, College Park, USA K. E. Gneyou Department of Mathematics, University of Lome, Republic of Togo G. A. Goldin Department of Mathematics and Physics, Bush Campus, Rutgers University, Piscataway, New Jersey, USA L. Gouba Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin E. Gouba University of Ouagadougou, Ouagadougou, Burkina Faso A. H. Goudjania International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin
J. Govaerts, Center for Particle Physics and Phenomenology (CP3), Institute of Nuclear Physics, Catholic University of Louvain, Louvain-la-Neuve, Belgium, and International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin
D. M. Hamani, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
M. Hassirou, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
C. Hounga, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
E. Houngninou, University of Abomey-Calavi, Republic of Benin
M. N. Hounkonnou, International Chair in Mathematical Physics and Applications, and Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
G. Honnouvo, Department of Mathematics and Statistics, Concordia University, Montreal, Quebec, Canada
V. Kana, University of Ouagadougou, Ouagadougou, Burkina Faso
K. Kangni, Unite Fondamentale de Recherche en Mathematiques et Informatique, University of Cocody-Abidjan, Ivory Coast
M. Keita, University of Abobo-Adjame, Abidjan, Ivory Coast
M. Kere, Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso
B. O. Konfe, Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso
B. Kote, Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso
Y. B. Kouagou, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
A. V. Kpadonou, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
B. Kpamegan, Department of Mathematics, University of Abomey-Calavi, Republic of Benin
M. F. Kpindjo, International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin
F. Lahlou, Department of Physics, University of Fes, Morocco
D. Lakoande, University of Ouagadougou, Ouagadougou, Burkina Faso
D. Lauvergnat, Laboratoire de Chimie Physique, Universite Paris-Sud, Orsay, France
O. P. Layeni, Department of Mathematics, Obafemi Awolowo University, Ile-Ife, Republic of Nigeria
E. Ligan, Departement des Sciences et Technologies, Section Mathematiques, Ecole Normale Superieure, Abidjan, Ivory Coast
M. K. Mahaman, International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin
K. J. E. Maleka, University Cheikh Anta Diop, Republic of Senegal
B. R. B. Malonda, Faculte des Sciences, Departement de Physique, Universite Marien Ngouabi, Brazzaville, Republic of Congo
B. Manga, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
W. Marcos, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
D. Moukoko, Faculte des Sciences, Departement de Physique, Universite Marien Ngouabi, Brazzaville, Republic of Congo
F. Mouna, Ecole Nationale Superieure Polytechnique, Yaounde, Republic of Cameroon
P.-S. Moussounda, Faculte des Sciences, Departement de Physique, Universite Marien Ngouabi, Brazzaville, Republic of Congo
B. M'Passi-Mabiala, Faculte des Sciences, Departement de Physique, Universite Marien Ngouabi, Brazzaville, Republic of Congo
G. Munyeme, Department of Physics, University of Zambia, Zambia
L. A. Musesa, University of Kinshasa, Kinshasa, Democratic Republic of Congo
H. V. Mweene, Department of Physics, University of Zambia, Zambia
P. Nang, Universite des Sciences et Techniques de Masuku, Gabon
T. U. F. Ndongmouo, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
E. Ngompe, University of Ouagadougou, Ouagadougou, Burkina Faso
I. Nourou, Department of Mathematics, University of Abomey-Calavi, Republic of Benin
W. A. Nzobadila, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
R. R. Odutayo, Olabisi Onabanjo University, Ago-Iwoye, Republic of Nigeria
O. Oke Olumuyiwa, Ladoke Akintola University of Technology, Dugbe-Ibadan, Republic of Nigeria
H. Onibon, International Chair in Mathematical Physics and Applications, University of Abomey-Calavi, Republic of Benin
M. Ouedraogo, University of Ouagadougou, Ouagadougou, Burkina Faso
K. J. Oyewumi, Department of Physics, University of Ilorin, Ilorin, Republic of Nigeria
R. O. Raji, Olabisi Onabanjo University, Ago-Iwoye, Ogun State, Republic of Nigeria
J. Y. Semegni, The African Institute for Mathematical Sciences (AIMS), Muizenberg, Cape Town, Republic of South Africa
K. Sodoga, Institut de Mathematiques et de Sciences Physiques, University of Abomey-Calavi, Republic of Benin
H. W. B. Sore, Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso
S. Tao, Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso
C. Tchadjeu, University of Ouagadougou, Ouagadougou, Burkina Faso
K. Tchakpele, Department of Physics, University of Lome, Republic of Togo
D. Temga, University of Ouagadougou, Ouagadougou, Burkina Faso
B. Toukourou, Lycee National Leon Mba, Libreville, Gabon
O. R. Walo, University of Kinshasa, Kinshasa, Democratic Republic of Congo
R. L. Woulache, Laboratoire de Mecanique, Faculte des Sciences, University of Yaounde I, Yaounde, Republic of Cameroon
B. Zannou, University of Abomey-Calavi, Republic of Benin
P. Zongo, University of Ouagadougou, Ouagadougou, Burkina Faso
M. Zorom, Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso
A. Zoungrana, Departement de Mathematiques et d'Informatique, University of Ouagadougou, Ouagadougou, Burkina Faso