$P_1 : (x, y) \to x + y$
is continuous on $X \times X$ into $X$
* Note that we are using the product topology here, i.e., the minimal topology that makes the projections $P_1$, $P_2$ continuous (see Chapter 1 in [3]).
Preliminaries 3
$P_2 : (a, x) \to ax$
is continuous on $K \times X$ into $X$.
Since the continuity of $P_1$ and $P_2$ implies that $(x, y) \to x - y$ is continuous, it follows that every t.v.s. is a commutative topological group. Naturally, topological spaces in general and t.v.s. in particular can be provided with some geometric and analytical structure; the following definitions pertain to that.
Definition 0.2.8: The pair $(X, \rho)$ denotes a metric space if $\rho : X \times X \to \mathbb{R}$ is a mapping that satisfies: (i) $\rho(x, y) > 0$ if $x \neq y$, $\rho(x, y) = 0$ if $x = y$; (ii) $\rho(x, y) = \rho(y, x)$; and (iii) $\rho(x, y) \le \rho(x, z) + \rho(z, y)$ for any $z \in X$. If $A$ is a subset of the metric space $(X, \rho)$, the diameter of $A$ and the distance from $x$ to $A$ are defined respectively by:
$\operatorname{diam} A = \sup\{\rho(x, y) : x, y \in A\}$, $\operatorname{dist}(x, A) = \inf\{\rho(x, y) : y \in A\}$.
The notation $B(x; r)$ is used to denote a closed ball centred at $x$ with radius $r > 0$; thus, $B(x; r) = \{y \in X : \rho(x, y) \le r\}$.
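The axioms of Definition 0.2.8 and the quantities diam and dist can be tried out on a finite sample of points. The following is only a minimal sketch: the Euclidean metric on the plane and the three sample points are illustrative assumptions, and a finite check of course proves nothing about the whole space.

```python
import itertools
import math

def rho(x, y):
    """Euclidean metric on R^2 (an illustrative choice of metric)."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def is_metric(points, d, tol=1e-12):
    """Check axioms (i)-(iii) of Definition 0.2.8 on a finite sample."""
    for x, y in itertools.product(points, repeat=2):
        if x == y and d(x, y) != 0:
            return False
        if x != y and d(x, y) <= 0:
            return False
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False
    return True

def diam(A, d):
    """Diameter of a finite set A: the largest pairwise distance."""
    return max(d(x, y) for x in A for y in A)

def dist(x, A, d):
    """Distance from the point x to a finite set A."""
    return min(d(x, y) for y in A)

A = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]  # hypothetical sample set
print(is_metric(A, rho))          # True
print(diam(A, rho))               # 5.0
print(dist((3.0, 4.0), A, rho))   # 3.0
```

For a finite set the supremum and infimum are attained, which is why `max` and `min` suffice here.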
The subset of this ball given by $\{y \in X : \rho(x, y) < r\}$ is called an open ball and is denoted $B_0(x; r)$. A set $\mathcal{U}$ in the space $X$ is said to be open if for each point $x \in \mathcal{U}$ there exists an open ball centred at $x$ and contained in $\mathcal{U}$. It is easy to verify that this defines a topology on $X$. Given a topological space $X$, we call it metrizable if there exists a metric $\rho$ such that the open balls form a basis for the topology; such a metric is said to be compatible with the topology on $X$. When $X$ is a metrizable t.v.s., the metric $\rho$ can be chosen to be translation invariant, i.e., $\rho(x, y) = \rho(x + z, y + z)$ for $x, y, z \in X$. Recall that the symbol $\bar{A}$ denotes the closure of $A$ in $X$. If $X$ has a topology $T$ other than the one induced by the metric, we use $\bar{A}^{T}$ to denote the closure of $A$ in $(X, T)$.
Definition 0.2.9: A subset $C \subset X$ of a t.v.s. is convex if for any $x, y \in C$, $\lambda x + (1 - \lambda) y \in C$, where $\lambda \in [0, 1]$. Given $A \subset X$, the convex hull of $A$, denoted conv $A$, is the smallest convex subset of $X$ that contains $A$; thus: $\operatorname{conv} A = \bigcap \{K \subset X : K \supset A,\ K \text{ is convex}\}$. Thus $x \in \operatorname{conv} A$ if and only if $x = \sum_{i=1}^{n} \lambda_i x_i$ where $x_i \in A$, $\lambda_i \ge 0$ and $\sum_{i=1}^{n} \lambda_i = 1$. The closure of conv $A$ is $\overline{\operatorname{conv}}\, A$. A well known theorem states: If $A$ is compact so is $\overline{\operatorname{conv}}\, A$.
Definition 0.2.10: A real valued mapping on a vector space $X$ defined over $\mathbb{R}$ or $\mathbb{C}$ ($K$ stands for the real or complex number field) is called a norm (denoted $\|\cdot\|$) if it satisfies: (a) $\|\lambda x\| = |\lambda|\,\|x\|$ for all $\lambda \in K$, $x \in X$; (b) $\|x + y\| \le \|x\| + \|y\|$; and (c) $\|x\| = 0$ implies $x = 0$. A normed space is thus a pair $(X, \|\cdot\|)$. It is easy to check that a distance function between a pair of elements can be defined on this space by the rule $(x, y) \to \|x - y\|$. Consequently the vector space $X$ can be given a topology. Hence a normed vector space is a topological vector space.
Definition 0.2.11: In a normed linear space $X$, a sequence $\{x_n\}$ is said to be convergent if there exists an element $x$ in $X$ such that $\|x_n - x\| \to 0$. The sequence $\{x_n\}$ is then said to converge to the element $x$.
4 Mathematical Perspectives on Theoretical Physics
Definition 0.2.12: A sequence $\{x_n\}$ in $X$ is said to be a Cauchy sequence if given $\varepsilon > 0$ we can find an integer $N(\varepsilon)$ such that $\|x_n - x_m\| < \varepsilon$ for all $n, m > N(\varepsilon)$. Evidently every convergent sequence is a Cauchy sequence, but the converse is not always true.
Definition 0.2.13: A normed linear space in which every Cauchy sequence is a convergent sequence is said to be complete, and it is called a Banach space after the Polish mathematician Stefan Banach.
One of the important spaces used in quantum theories is the Hilbert space. In order to define it, we first describe the space from which it arises.
Pre-Hilbert space: If $H$ is a (complex or real) vector space and $\langle \cdot, \cdot \rangle$ is a non-degenerate scalar product on $H$, then we call the pair $(H, \langle \cdot, \cdot \rangle)$ a vector space with scalar product or a pre-Hilbert space. Every pre-Hilbert space carries a norm in a natural way, the norm being $\|f\| = \langle f, f \rangle^{1/2}$ where $f \in H$. If in addition every Cauchy sequence is convergent, then this space is complete in view of Def. (0.2.13). A complete pre-Hilbert space is called a Hilbert space (see Def. (0.2.14)).
Examples of Norm: In $\mathbb{C}^m$ (or $\mathbb{R}^m$) define the norms as
(i) $\|f\|_1 = \sum_{j=1}^{m} |f_j|$
(ii) $\|f\|_\infty = \max\{|f_j| : j = 1, \dots, m\}$
(iii) $\langle f, g \rangle = \sum_{j=1}^{m} f_j^{*} g_j \Rightarrow \|f\| = \left(\sum_{j=1}^{m} |f_j|^2\right)^{1/2}$
The last of these gives the Euclidean length of the vector $f = (f_1, \dots, f_m) \in \mathbb{C}^m$ ($\mathbb{R}^m$), and $\|f - g\|$ here is the Euclidean distance of the points $f$ and $g$.
Let $(H, \langle \cdot, \cdot \rangle)$ be a pre-Hilbert space. A family $M = \{e_\alpha : \alpha \in \text{an index set } A\}$ of elements from $H$ is called an orthonormal system (ONS) if $\langle e_\alpha, e_\beta \rangle = \delta_{\alpha\beta}$ for $\alpha, \beta \in A$.
An orthonormal system $M$ is called an orthonormal basis (ONB) of a subspace $T$ of $H$ if $M$ is total in $T$ (i.e., $M \subset T$ and $\overline{L(M)} \supset T$). If $M$ is an ONB of $H$, then $M$ is called an orthonormal basis of $H$.
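The three norms on $\mathbb{C}^m$ and the orthonormality condition $\langle e_\alpha, e_\beta \rangle = \delta_{\alpha\beta}$ are easy to evaluate numerically. The sketch below is illustrative only: the function names and the sample vectors are assumptions, and the scalar product conjugates its first entry, a convention that affects neither the norm nor the orthonormality test.

```python
import math

def norm_1(f):
    """Norm (i): sum of the moduli of the components."""
    return sum(abs(c) for c in f)

def norm_max(f):
    """Norm (ii): largest modulus among the components."""
    return max(abs(c) for c in f)

def inner(f, g):
    """Scalar product on C^m, conjugating the first entry (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def norm_2(f):
    """Norm (iii): Euclidean length induced by the scalar product."""
    return math.sqrt(inner(f, f).real)

def is_orthonormal(family, tol=1e-12):
    """Check <e_a, e_b> = delta_ab for every pair in a finite family."""
    for i, e in enumerate(family):
        for j, h in enumerate(family):
            expected = 1.0 if i == j else 0.0
            if abs(inner(e, h) - expected) > tol:
                return False
    return True

f = [3 + 4j, 0j, -12 + 0j]                      # hypothetical sample vector
ons = [[1 + 0j, 0j, 0j], [0j, 1j, 0j]]          # orthonormal pair in C^3
not_ons = [[1 + 0j, 1 + 0j, 0j], [1 + 0j, 0j, 0j]]

print(norm_1(f), norm_max(f), norm_2(f))        # 17.0 12.0 13.0
print(is_orthonormal(ons), is_orthonormal(not_ons))  # True False
```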
Definition 0.2.14: A complete normed space whose norm $\|\cdot\|$ is given by a scalar product $\langle \cdot, \cdot \rangle$ is called a Hilbert space. More explicitly, a space $X$ is Hilbert if to each pair of elements $x, y$ in $X$ there is associated a scalar $\langle x, y \rangle$ that satisfies:
(i) $\langle a_1 x_1 + a_2 x_2, y \rangle = a_1 \langle x_1, y \rangle + a_2 \langle x_2, y \rangle$
(ii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$ (the complex conjugate)
(iii) $\langle x, x \rangle = \|x\|^2$ is positive when $x \neq 0$.
$\overline{L(M)}$ = closure of the linear hull $L(M)$ (the set of finite linear combinations of elements of $M$, or in other words, the smallest subspace of $H$ that contains $M$).
The scalar product defined above is called a positive definite Hermitian form. It is usual to denote a Hilbert space by $\mathcal{H}$. We shall use this notation throughout the text. It is important to note here that the elements of $\mathcal{H}$ are arbitrary (e.g., real or complex numbers, or real valued or complex valued functions) and that the scalar product (as is obvious from (ii)) is not necessarily real. It can also be checked that the norm and the scalar product on $\mathcal{H}$ are related by the Schwarz inequality:
(iv) $|\langle x, y \rangle| \le \|x\|\,\|y\|$
and by the polarization identity:
(v) $4\langle x, y \rangle = \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2$.
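Both (iv) and (v) can be checked numerically for the standard scalar product on $\mathbb{C}^m$. The sketch below assumes a scalar product that is linear in the first entry and conjugate-linear in the second, in line with (i) and (ii); the sample vectors are arbitrary.

```python
def inner(x, y):
    """Scalar product on C^m, linear in the first slot as in (i)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm <x, x>, which is real by (ii)."""
    return inner(x, x).real

def add(x, y, c=1):
    """Return x + c*y componentwise."""
    return [a + c * b for a, b in zip(x, y)]

def polarization(x, y):
    """Right-hand side of (v) divided by 4."""
    return (norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
            + 1j * norm_sq(add(x, y, 1j))
            - 1j * norm_sq(add(x, y, -1j))) / 4

x = [1 + 2j, -1j, 3 + 0j]       # arbitrary sample vectors
y = [2 - 1j, 1 + 1j, -2j]

lhs = inner(x, y)
rhs = polarization(x, y)
schwarz_ok = abs(lhs) <= (norm_sq(x) * norm_sq(y)) ** 0.5

print(lhs, rhs, schwarz_ok)
```

With the opposite convention (conjugate-linear in the first entry) the signs of the two imaginary terms in (v) are interchanged.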
The notion of completeness in $\mathcal{H}$ means that if a sequence of elements $\{\phi_n\}$ in $\mathcal{H}$ satisfies the condition $\|\phi_n - \phi_m\| \to 0$ for $m, n \to \infty$, then there exists an element $\phi$ in $\mathcal{H}$ such that $\|\phi_n - \phi\| \to 0$ for $n \to \infty$. In this context we would like to mention that there are two types of convergence in $\mathcal{H}$ (in fact in a normed linear space), the strong and the weak. The sequence of elements $\{\phi_n\}$ converges to the element $\phi$ strongly if as $n \to \infty$
$\|\phi_n - \phi\| \to 0$
and weakly to an element $\phi'$ if for each element $\psi \in \mathcal{H}$
$\langle \phi_n, \psi \rangle \to \langle \phi', \psi \rangle$.
The "strong convergence" is also called "convergence in the mean" and it implies:
$\|\phi_n\| \to \|\phi\|$.
Evidently strong convergence implies weak convergence. Let $u, v \in \mathcal{H}$; we say that these elements are orthogonal if $\langle u, v \rangle = 0$. Suppose that $H$ is a subset of $\mathcal{H}$; the set of all elements $u \in \mathcal{H}$ for which $\langle u, v \rangle = 0$ for every $v \in H$ forms a subspace of $\mathcal{H}$, which we denote by $H^{\perp}$. If the subset $H$ is the whole space $\mathcal{H}$, then the space $H'$ consisting of all elements $u$ in $\mathcal{H}$ such that $u \in \mathcal{H}^{\perp}$ is called the null space of the Hermitian product.
We shall in general deal with a separable Hilbert space. Its dimension is either finite or denumerably infinite. In the former case, we shall sometimes use the familiar nomenclature: a unitary space of dimension $n$. Even when the space is infinite dimensional, an orthonormal basis can be obtained, since it can be shown that there always exists an infinite complete orthonormal sequence, which can be obtained by applying the Gram-Schmidt orthogonalization process to a complete denumerable subset of $\mathcal{H}$.
We give below a few examples to illustrate the objects that have been defined above. The number of each example is the number of the definition of the object it illustrates, followed by an additional letter a, b, etc.
Example 0.2.1a: Let $X = \mathbb{R}$ be the set of real numbers. Let a subset $U$ of $\mathbb{R}$ be called open if for each point $x$ in $U$ there exists an open interval $I$ containing $x$ and contained in $U$. Obviously $\mathbb{R}$ with the set $\mathcal{U}$ formed by the open sets $U$ becomes a topological space.
Example 0.2.1b: Let $X = \{a, b, c, d, e, f\}$ and let $\mathcal{U}$ be the set formed by the subsets (assumed as open) $[\{a, b\}, \{c, d\}, \{e, f\}, \{a, b, c, d\}, \{a, b, e, f\}, \{c, d, e, f\}, \emptyset, X]$; the two-element sets are needed so that intersections such as $\{a, b, c, d\} \cap \{a, b, e, f\} = \{a, b\}$ are again open. It can be easily checked that $X$ together with $\mathcal{U}$ forms a topological space.
Example 0.2.1c: Among the simplest examples of compact and non-compact sets are, respectively, a closed disc, and an open disc or the plane extended to infinity on either side, as shown in Fig. (0.1). Note that the topological space of (0.2.1a) is non-compact and that of (0.2.1b) is compact.
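For a finite set such as that of Example 0.2.1b, the topology axioms can be checked mechanically. The sketch below (the function name `is_topology` is hypothetical) shows that the five sets $\{a, b, c, d\}$, $\{a, b, e, f\}$, $\{c, d, e, f\}$, $\emptyset$, $X$ on their own are not closed under intersection, whereas adding $\{a, b\}$, $\{c, d\}$, $\{e, f\}$ gives a topology.

```python
from itertools import combinations

X = frozenset("abcdef")

def is_topology(X, family):
    """Check the open-set axioms for a family of subsets of a finite set X."""
    opens = {frozenset(s) for s in family}
    if frozenset() not in opens or frozenset(X) not in opens:
        return False
    # For a finite family, closure under pairwise unions and intersections
    # implies closure under arbitrary unions and finite intersections.
    for U, V in combinations(opens, 2):
        if U | V not in opens or U & V not in opens:
            return False
    return True

five_sets = [set("abcd"), set("abef"), set("cdef"), set(), set(X)]
completed = five_sets + [set("ab"), set("cd"), set("ef")]

print(is_topology(X, five_sets))   # False: {a,b,c,d} & {a,b,e,f} = {a,b} is missing
print(is_topology(X, completed))   # True
```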
[Fig. 0.1: Open and closed discs, and extended plane. Panels: open disc $|x| < r$; closed disc $|x| \le r$; plane extended to infinity on either side; here $x = (x_1, x_2)$.]
Example 0.2.1d: Connected and disconnected topological spaces are visually represented as follows.
[Figure: Connected and disconnected topological spaces.]
Also note that an open, half open, or closed interval $I \subset \mathbb{R}$ is always connected.
Example 0.2.2a: The set $\mathbb{Q}$ of rationals is dense in the set $\mathbb{R}$ of real numbers.
Example 0.2.3a: An open interval $(a, b)$ is a neighbourhood of each of its points. A closed interval $[a, b]$ is a neighbourhood of each point of $(a, b)$. As can be easily seen, $[a, b]$ is not a neighbourhood of $a$ or $b$.
Example 0.2.3b: The set $\mathbb{R}$ of real numbers is a neighbourhood of each of its points. The set $\mathbb{Q}$ of rational numbers is not a neighbourhood of any of its points.
Example 0.2.5a: If the set $\mathcal{U}$ of subsets in Exp. (0.2.1b) is replaced by $[\{a\}, \{b\}, \dots, \{f\}; \{a, b\}, \dots; \{a, b, c\}, \dots; \{a, b, c, d\}, \dots; \{a, b, c, d, e\}, \dots; \emptyset, X]$, the topology defined will be the discrete topology.
Example 0.2.6a: Every metric space is Hausdorff, for if $d$ defines a metric on $X$ and $d(a, b) = \delta > 0$ for $a \neq b$, then the sets defined as $V_a := \{x \mid d(a, x) < \delta/2\}$, $V_b := \{x \mid d(b, x) < \delta/2\}$ are disjoint neighbourhoods of the points $a, b$ in $X$.
Example 0.2.6b: A topological space endowed with the discrete topology is a Hausdorff space.
Example 0.2.7a: The cartesian product $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$ over the field of reals is the simplest example of a topological vector space; the vector in this case is $x = (x_1, \dots, x_n)$, and the mappings $P_1$ and $P_2$ are pointwise addition and scalar multiplication of vectors in $\mathbb{R}^n$.
Example 0.2.7b: A similar example is offered by the collection of $n \times n$ matrices defined over the field of real or complex numbers.
Example 0.2.8a: Let $X = \mathbb{R}^2$ be a topological space whose metric is the Euclidean distance:
$\rho(x, y) := \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$.
Denote it by $(X, T)$. Define another metric $\rho'$ on $X = \mathbb{R}^2$ given by:
$\rho'(x, y) := \max\{|x_1 - y_1|, |x_2 - y_2|\}$.
It can be easily verified that both $\rho$ and $\rho'$ given above satisfy the postulates of a metric on $\mathbb{R}^2$ and that the two topologies defined by them are the same, i.e., $T(\rho) = T(\rho')$, in the sense that every $\delta$-ball constructed for $T(\rho)$ contains a ball of $T(\rho')$ with the same centre, and vice-versa.
Example 0.2.10a: Let $X$ be the space of continuous functions that are defined on the interval $(a, b)$ and let $m$ denote a general measure on $(a, b)$. The mapping
$f \mapsto \left(\int_a^b |f(x)|^p\, dx\right)^{1/p}$, or more generally $f \mapsto \left(\int |f(x)|^p\, dm\right)^{1/p}$,
defines a norm for $1 \le p < \infty$. The space $X$ is called an $L^p$-space.
Example 0.2.11a: Let $X$ be the real line $\mathbb{R}$. The sequence defined by $f_n(x) = \frac{\sin nx}{\sqrt{n}}$ ($x \in \mathbb{R}$, $n = 1, 2, 3, \dots$) is a convergent sequence, since $f(x) = \lim_{n \to \infty} f_n(x) = 0$. But the sequence of derivatives given by $f'_n(x) = \sqrt{n} \cos nx$ is not convergent to $f'$, since $f'_n(0) = \sqrt{n} \to \infty$ whereas $f'(0) = 0$.
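The behaviour described in Example 0.2.11a can be observed numerically. The sketch below assumes $f_n(x) = \sin(nx)/\sqrt{n}$, which is the sequence whose derivative is $\sqrt{n}\cos nx$; the sampling interval $[-10, 10]$ and the grid size are arbitrary choices.

```python
import math

def f(n, x):
    """Assumed form of the sequence: sin(n x) / sqrt(n)."""
    return math.sin(n * x) / math.sqrt(n)

def f_prime(n, x):
    """Its derivative: sqrt(n) * cos(n x)."""
    return math.sqrt(n) * math.cos(n * x)

def sup_norm(g, a=-10.0, b=10.0, samples=20001):
    """Approximate sup |g| on [a, b] by sampling on a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(g(a + k * step)) for k in range(samples))

ns = (1, 100, 10000)
sup_values = [sup_norm(lambda x, n=n: f(n, x)) for n in ns]
derivs_at_0 = [f_prime(n, 0.0) for n in ns]

print(sup_values)    # each value is bounded by 1/sqrt(n), so the f_n tend to 0
print(derivs_at_0)   # [1.0, 10.0, 100.0]: the derivatives at 0 grow without bound
```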
Example 0.2.13a: The vector space $\mathbb{R}^1$ and the space $C(K)$ of continuous functions defined on a compact set $K$ of $\mathbb{R}^1$ are complete spaces (obviously this remains true when 1 is replaced by $n$).
Example 0.2.13b: The space of square (summable) functions defined on a measurable space (Exp. 0.2.10a for $p = 2$) is a complete space.
Example 0.2.14a: The space formed by arbitrary sequences (
Exercise 0.2
1. Show that the scalar product defined by (i) and (ii) in Def. (0.2.14) is a sesquilinear form, i.e., it is a map
$X \times X \to \mathbb{C}$, $(x, y) \mapsto \langle x, y \rangle$
which is linear in the first variable and semilinear in the second variable ($\langle x, ay \rangle = \bar{a} \langle x, y \rangle$).
2. Let the function $u \mapsto \|u\|$ be a seminorm on $X$, i.e., $\|u\| \ge 0$ for any $u \in \mathcal{H}$. Show that $\|u\| = 0$ if and only if $u \in \mathcal{H}_0$, the null space corresponding to the Hermitian product.
3. Prove property (iv) of Def. (0.2.14) for a real Hilbert space, i.e., with $\mathbb{R}$ as the scalar field.
4. Let $(X, \{X_\alpha\}, T)$ be a topological space and let $Y$ be a subset of $X$. The topology formed by the open sets $O_\alpha = Y \cap X_\alpha$ of $Y$ is called the subspace (or relative) topology on $Y$ and is denoted $T_Y$.
Show that the real line $\mathbb{R} \subset \mathbb{R}^2$ has the relative topology which is induced by the usual topology on $\mathbb{R}^2$ given by the Euclidean metric.
5. Show that the sequence $\{\sin nx\}$, $0 \le x \le \pi$, converges weakly to 0 but does not converge in the mean.
6. Given an orthonormal sequence $\{\phi_n\}$ and numbers $a_n$ such that the series $\sum_{n=1}^{\infty} |a_n|^2$ converges, show that the series $\sum_{n=1}^{\infty} a_n \phi_n$ will converge in the mean to an element $g$ of the $L^2$-space, where $g$ satisfies:
$\langle g, \phi_n \rangle = a_n$ ($n = 1, 2, 3, \dots$)
(Riesz-Fischer theorem).
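Exercise 5 can be explored numerically before it is proved. The sketch below is an illustration, not a proof: it pairs $\sin nx$ with one hypothetical test function $g(x) = x(\pi - x)$ on $[0, \pi]$ using a midpoint rule, and also computes $\|\sin nx\|^2$. The pairings shrink as $n$ grows, while the squared norm stays at $\pi/2$, so the sequence cannot converge to 0 in the mean.

```python
import math

def integrate(h, a, b, steps=20000):
    """Composite midpoint rule for the integral of h over [a, b]."""
    width = (b - a) / steps
    return width * sum(h(a + (k + 0.5) * width) for k in range(steps))

def inner_with_sin(g, n):
    """Approximate <sin(n x), g> in L^2(0, pi) for a real function g."""
    return integrate(lambda x: math.sin(n * x) * g(x), 0.0, math.pi)

def norm_sq_sin(n):
    """Approximate ||sin(n x)||^2 in L^2(0, pi)."""
    return integrate(lambda x: math.sin(n * x) ** 2, 0.0, math.pi)

g = lambda x: x * (math.pi - x)   # hypothetical test function

ns = (1, 11, 101)
pairings = [inner_with_sin(g, n) for n in ns]
norms_sq = [norm_sq_sin(n) for n in ns]

print(pairings)   # decreasing in absolute value (exactly 4/n^3 for odd n)
print(norms_sq)   # each approximately pi/2
```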
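Hints to Exercise 0.2
1. Since $\langle x, y \rangle = \overline{\langle y, x \rangle}$, we have $\langle x, a_1 y_1 + a_2 y_2 \rangle = \overline{\langle a_1 y_1 + a_2 y_2, x \rangle}$, and linearity in the first entry implies
$\overline{a_1 \langle y_1, x \rangle + a_2 \langle y_2, x \rangle} = \bar{a}_1 \overline{\langle y_1, x \rangle} + \bar{a}_2 \overline{\langle y_2, x \rangle} = \bar{a}_1 \langle x, y_1 \rangle + \bar{a}_2 \langle x, y_2 \rangle$.
3. Consider the function $\phi : \mathbb{R} \to \mathbb{R}$ defined by $\phi(t) = \|x + ty\|^2 = \langle x + ty, x + ty \rangle = \langle x, x \rangle + 2t \langle x, y \rangle + t^2 \langle y, y \rangle$. Since $\phi(t) \ge 0$ for every real $t$, the discriminant of this quadratic in $t$ cannot be positive, i.e., $\langle x, y \rangle^2 \le \langle x, x \rangle \langle y, y \rangle$, which is (iv).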
N be a non-constant holomorphic mapping between them. Then there exists a positive integer $m$ such that every $q \in N$ is attained as an image under $\phi$ (counting multiplicities) exactly $m$ times, i.e., for all $q \in N$:
$\sum_{p \in \phi^{-1}(q)} (b_\phi(p) + 1) = m$.   (1.3.1)
Definition 1.3.8: The integer $m$ is called the degree of $\phi$, and $\phi$ is called an $m$-sheeted (ramified) cover of $N$ by $M$. Equivalently, $\phi$ is said to have $m$ sheets.
3.3 Differential Forms on M, their Algebra and Calculus
Definition 1.3.9: Let $M$ be a Riemann surface. A 0-form on $M$ is a continuous function on $M$. A 1-form $\eta$ is an (ordered) assignment of two continuous functions $f_1$ and $f_2$ to each local coordinate $z$ $(= x + iy)$ on $M$ such that
$\eta = f_1\,dx + f_2\,dy$   (1.3.2)
is invariant under coordinate transformations. A 2-form $\Omega$ on $M$ is an assignment of a continuous function $g$ to each local coordinate $z$ $(= x + iy)$ such that
$\Omega = g\,dx \wedge dy$   (1.3.3)
is invariant under coordinate transformations.$^8$ Under a change of coordinates, the functions in (1.3.2) and (1.3.3) are related to their counterparts by:
$\begin{pmatrix} \tilde f_1(w) \\ \tilde f_2(w) \end{pmatrix} = \begin{pmatrix} \partial x/\partial u & \partial y/\partial u \\ \partial x/\partial v & \partial y/\partial v \end{pmatrix} \begin{pmatrix} f_1(z(w)) \\ f_2(z(w)) \end{pmatrix}$   (1.3.4)
This is possible in the following manner. Choose local coordinates $z$ on $M$ and $w$ on $N$ that vanish at $p$ and $\phi(p)$ respectively, so that $w = \phi(z) = \sum_{k \ge n} a_k z^k$, $n > 0$, $a_n \neq 0$. Use another holomorphic mapping $h$, say, to write $w = z^n h(z)^n = (z h(z))^n = \tilde z^n$. Note that $z \to z h(z) = \tilde z$ is another local coordinate vanishing at $p$, and in terms of this new coordinate the mapping $\phi$ is given by $w = \tilde z^n$, showing that $\phi$ takes the value $\phi(p)$ $n$ times.
$^7$ The ring of all holomorphic mappings on $M$.
$^8$ Note that the forms have been defined in (1.2); we are repeating them here in local coordinates to simplify concepts on a Riemann surface.
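$\tilde g(w) = g(z(w)) \begin{vmatrix} \partial x/\partial u & \partial y/\partial u \\ \partial x/\partial v & \partial y/\partial v \end{vmatrix}$   (1.3.5)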
where we have used the transformed coordinate $w = u + iv$. If we use complex (analytic) coordinates and note that coordinate changes are holomorphic, the 1-form and 2-form given by (1.3.2) and (1.3.3)$^9$ can be written as:
$F_1(z)\,dz + F_2(z)\,d\bar z$   (1.3.2)(a)
$G\,dz \wedge d\bar z$.   (1.3.3)(a)
Since $dz = dx + i\,dy$ and $d\bar z = dx - i\,dy$, the functions in (1.3.2) and (1.3.3) are respectively related to these by
$f_1 = F_1 + F_2$, $f_2 = i(F_1 - F_2)$   (1.3.4)(a)
and
$g = -2iG$   (1.3.5)(a)
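The relations (1.3.4)(a) and (1.3.5)(a) follow from expanding $dz$ and $d\bar z$ in the basis $(dx, dy)$. As a small check, the sketch below represents a 1-form at a point by its pair of $(dx, dy)$ coefficients and the wedge product by a $2 \times 2$ determinant; this representation and the sample values are illustrative assumptions.

```python
def wedge(a, b):
    """Coefficient of dx^dy in a^b, for 1-forms given as (dx, dy) coefficient pairs."""
    return a[0] * b[1] - a[1] * b[0]

DZ = (1, 1j)       # dz = dx + i dy
DZBAR = (1, -1j)   # dz-bar = dx - i dy

def real_coefficients(F1, F2):
    """Return (f1, f2) with F1 dz + F2 dzbar = f1 dx + f2 dy."""
    return (F1 * DZ[0] + F2 * DZBAR[0], F1 * DZ[1] + F2 * DZBAR[1])

F1, F2, G = 2 - 1j, 0.5 + 3j, 1 + 1j   # arbitrary sample values at a point
f1, f2 = real_coefficients(F1, F2)
g = G * wedge(DZ, DZBAR)               # G dz^dzbar = g dx^dy

print(f1, f2)                # equals F1 + F2 and i(F1 - F2)
print(wedge(DZ, DZBAR), g)   # -2j and -2i*G
```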
Remark 1.3.10: In view of the rules of exterior multiplication ($dx \wedge dy = -dy \wedge dx$), it is obvious that on a Riemann surface all forms of degree > 2 are zero; thus if $\Lambda^p$ denotes the vector space of $p$-forms, then $\Lambda^p$ is a module$^{10}$ over $\Lambda^0$ and $\Lambda^p = \{0\}$ for $p \ge 3$. One can also talk about the integration and differentiation of these forms.
Definition 1.3.11: An $r$-form ($r = 0, 1, 2$) can be integrated over $r$-chains ($r = 0, 1, 2$). When $r = 0$, the chain is a finite set of points. When $r = 1$, it is a finite union of paths ($\cup_i C_i$) and for $r = 2$ it is a finite union of discs ($\cup_i D_i$). Thus integrations of 0-, 1- and 2-forms are given by:
$\sum_\alpha n_\alpha f(p_\alpha)$ over the 0-chain $\sum_\alpha n_\alpha p_\alpha$   (1.3.6)
with $n_\alpha \in \mathbb{Z}$, $p_\alpha \in M$, and
$\int_{\cup_i C_i} \eta = \int_0^1 \left[ f_1(x(t), y(t)) \frac{dx}{dt} + f_2(x(t), y(t)) \frac{dy}{dt} \right] dt$   (1.3.7)
where we have simplified $\cup_i C_i$ by treating it as a piecewise differentiable curve $C : I = [0, 1] \to M$ and have used the local expression (1.3.2) to write the right hand side of (1.3.7). Similarly
$\iint_{\cup_i D_i} \Omega = \iint_D g(x, y)\,dx \wedge dy$   (1.3.8)
Here again we have simplified $\cup_i D_i$ to a disc $D$ with a single coordinate chart and have used (1.3.3) to write the above equality. To define the differential operator on the forms, we must assume that they are at least $C^1$, i.e., the functions defining them are continuously differentiable at least up to the first order.
Definition 1.3.12: The differential operator $d$ assigns an $(r + 1)$-form to an $r$-form on a Riemann surface. The operator $d$ satisfies $d^2 = 0$ (whenever $d^2$ is defined). Accordingly $d$ operating on a 0-form (function $f$) gives:
$^9$ We have used capital letters to distinguish them from the previous ones.
$^{10}$ See Chapter 4 for modules.
Complex Functions, Riemann Surfaces and Two-Dimensional Conformal Field Theory (an Introduction) 41
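$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = f_x\,dx + f_y\,dy$   (1.3.9)
whereas for a 1-form (1.3.2) and a 2-form (1.3.3) it gives:
$d\eta = d(f_1\,dx + f_2\,dy) = df_1 \wedge dx + df_2 \wedge dy$
$= ((f_1)_x\,dx + (f_1)_y\,dy) \wedge dx + ((f_2)_x\,dx + (f_2)_y\,dy) \wedge dy$
$= (f_1)_y\,dy \wedge dx + (f_2)_x\,dx \wedge dy = ((f_2)_x - (f_1)_y)\,dx \wedge dy$   (1.3.10)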
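Formula (1.3.10) says that the $dx \wedge dy$ coefficient of $d\eta$ is $(f_2)_x - (f_1)_y$. A minimal numerical sketch, using central differences (the step size and the two sample forms are illustrative assumptions), is:

```python
def d_one_form(f1, f2, x, y, h=1e-5):
    """Approximate the dx^dy coefficient of d(f1 dx + f2 dy) at (x, y)
    using central differences: (f2)_x - (f1)_y."""
    f2_x = (f2(x + h, y) - f2(x - h, y)) / (2 * h)
    f1_y = (f1(x, y + h) - f1(x, y - h)) / (2 * h)
    return f2_x - f1_y

# eta = -y dx + x dy, so d(eta) = 2 dx^dy
rot = d_one_form(lambda x, y: -y, lambda x, y: x, 0.3, -1.2)

# eta = df for f = x^2 y, i.e. f1 = 2xy and f2 = x^2, so d(eta) = 0 (d^2 = 0)
exact = d_one_form(lambda x, y: 2 * x * y, lambda x, y: x * x, 0.3, -1.2)

print(rot, exact)   # approximately 2 and 0
```

The second case illustrates $d^2 = 0$: a form that is already of the type $df$ has vanishing exterior derivative.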
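$d\Omega = 0$   (1.3.11)
When we use complex analytic coordinates, we obtain two other differential operators defined by$^{11}$:
$\partial f = \frac{\partial f}{\partial z}\,dz = f_z\,dz$, $\bar\partial f = \frac{\partial f}{\partial \bar z}\,d\bar z = f_{\bar z}\,d\bar z$   (1.3.12)
for $C^1$-functions, and
$\partial\omega = \partial(f_1\,dz + f_2\,d\bar z) = \partial f_1 \wedge dz + \partial f_2 \wedge d\bar z$, $\bar\partial\omega = \bar\partial(f_1\,dz + f_2\,d\bar z) = \bar\partial f_1 \wedge dz + \bar\partial f_2 \wedge d\bar z$   (1.3.13)
for a $C^1$ 1-form $\omega$. It is easy to check that the operators $\partial$ and $\bar\partial$ satisfy:
(i) $d = \partial + \bar\partial$
(ii) $\partial^2 = \partial\bar\partial + \bar\partial\partial = \bar\partial^2 = 0$   (1.3.14)
The following result on $C^1$-forms links the operations of integration and differentiation on Riemann surfaces.
Result 1.3.13: Let $\eta$ be a $C^1$ $r$-form ($r = 0, 1, 2$), and let $D$ be an $(r + 1)$-chain; then$^{12}$
$\int_D d\eta = \int_{\partial D} \eta$   (1.3.15)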
where $\partial D$ is the $r$-chain obtained by applying the boundary operator $\partial$. The above result, known as Stokes' theorem, is non-trivial only when $r = 1$.
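For $r = 1$ Stokes' theorem can be tested numerically on the unit disc. The sketch below takes the illustrative 1-form $\eta = -y^3\,dx + x^3\,dy$, for which $d\eta = (3x^2 + 3y^2)\,dx \wedge dy$ by (1.3.10), and compares the integral of $d\eta$ over the disc with the integral of $\eta$ over the unit circle; both should be close to $3\pi/2$. The quadrature rules and grid sizes are arbitrary choices.

```python
import math

def boundary_integral(f1, f2, steps=2000):
    """Integrate f1 dx + f2 dy over the unit circle, parametrised by t in [0, 2*pi]."""
    dt = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        # dx/dt = -sin t, dy/dt = cos t
        total += (f1(x, y) * (-math.sin(t)) + f2(x, y) * math.cos(t)) * dt
    return total

def disc_integral(g, n_r=400, n_t=400):
    """Integrate g dx^dy over the unit disc in polar coordinates (midpoint rule)."""
    dr, dt = 1.0 / n_r, 2 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            total += g(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

lhs = disc_integral(lambda x, y: 3 * x * x + 3 * y * y)           # integral of d(eta)
rhs = boundary_integral(lambda x, y: -y ** 3, lambda x, y: x ** 3)  # integral of eta

print(lhs, rhs, 1.5 * math.pi)
```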
$^{11}$ We have used $\partial$ and $\bar\partial$ here in place of $d'$ and $d''$ used in an earlier section.
$^{12}$ For an $(r + 1)$-chain $D$, $\partial D$ is an $r$-chain. Note that the $\partial$ used here is not the same as that in (1.3.12).
3.4 Star (*) Operator on M
Besides the operators introduced above, another operator, denoted $*$ and known as the conjugacy (Hodge star) operator, can be defined on the vector space $\Lambda$ of forms by using an inner product. The operator $*$ maps $\Lambda^r$ to $\Lambda^{2-r}$ and satisfies the rule:
$** = (-1)^r$   (1.3.16)
It can be verified that using the operator $*$, a function, a 1-form defined in (1.3.2) and a 2-form defined in (1.3.3) are mapped as follows:
$*f(z) = f(z)\,(\alpha(z)\,dx \wedge dy)$   (1.3.17)
$*\eta = -f_2\,dx + f_1\,dy$   (1.3.18)
$*\Omega = \beta(z)$   (1.3.19)
where $\alpha(z)$ and $\beta(z)$ are functions defined on the same local regions as $f$ and $\Omega$ are.
Definition 1.3.14: A form $\eta$ is called closed provided it is $C^1$ and $d\eta = 0$; it is co-closed if $d(*\eta) = 0$.
Definition 1.3.15: A 1-form $\eta$ is called exact if $\eta = df$ for some $C^2$-function $f$ on $M$, and is called co-exact if $*\eta$ is exact. The latter holds if and only if $\eta = *df$ for some $C^2$-function $f$.
Remark 1.3.16: On a simply connected domain (see Sec. 1) closed (co-closed) differentials are exact (co-exact), whereas every exact (co-exact) differential is closed (co-closed). Hence locally closed $\Leftrightarrow$ exact (co-closed $\Leftrightarrow$ co-exact).
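On 1-forms the rule (1.3.16) reads $** = -1$, and this can be read off directly from (1.3.18). A minimal sketch, representing a 1-form at a point by its coefficient pair $(f_1, f_2)$ (an illustrative representation with arbitrary sample values):

```python
def star(eta):
    """Hodge star of a 1-form (f1, f2) = f1 dx + f2 dy, following (1.3.18)."""
    f1, f2 = eta
    return (-f2, f1)

def wedge(a, b):
    """Coefficient of dx^dy in a^b for coefficient pairs a and b."""
    return a[0] * b[1] - a[1] * b[0]

eta = (2.0, -5.0)             # arbitrary real sample coefficients
twice = star(star(eta))       # expected to be -eta
area = wedge(eta, star(eta))  # expected to be f1^2 + f2^2

print(twice)   # (-2.0, 5.0)
print(area)    # 29.0
```

The second quantity shows why $*$ is suited to defining an inner product: for a real 1-form, $\eta \wedge *\eta = (f_1^2 + f_2^2)\,dx \wedge dy$ has a non-negative coefficient.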
3.5 Harmonic and Holomorphic Forms on M
Definition 1.3.17: Let $f$ be a $C^2$-function on $M$; then $f$ is called harmonic if the 2-form $(f_{xx} + f_{yy})\,dx \wedge dy$, called the Laplacian of $f$ and denoted by $\Delta f$, is zero (see Sec. 1). A 1-form $\eta$ is called harmonic if locally it can be given as $df$ with $f$ a harmonic function.
Definition 1.3.18: A 1-form $\eta$ is called holomorphic provided locally $\eta$ equals $df$ for a holomorphic function $f$.
Remark 1.3.19: The concept of a harmonic function is a local one; moreover, locally every real-valued harmonic function is the real part of a holomorphic function.
The following results, which are basic for integration on Riemann surfaces, are easy to check.
Result 1.3.20: A differential$^{13}$ $\omega$ is harmonic if and only if it is closed and co-closed.
Result 1.3.21: If $u$ is a harmonic function on $M$, then $\partial u$ is a holomorphic differential.
Result 1.3.22: A differential $\eta = \phi_1\,dz + \phi_2\,d\bar z$ is holomorphic if and only if $\phi_2 = 0$ and $\phi_1$ is a holomorphic function locally.
Result 1.3.23: A differential form $\eta$ is holomorphic if and only if $\eta = \alpha + i{*\alpha}$ where $\alpha$ is a harmonic differential.
$^{13}$ A differential 1-form on a Riemann surface is simply called a differential sometimes (see [8]).
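Whether a given function is harmonic in a local coordinate can be probed with a finite-difference Laplacian. The sketch below uses the standard five-point stencil (a numerical technique chosen here for illustration) on two real parts of holomorphic functions, $\operatorname{Re} z^2$ and $\operatorname{Re} e^z$, and on $x^2 + y^2$, which is not harmonic; the evaluation point is arbitrary.

```python
import math

def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference approximation of f_xx + f_yy at (x, y)."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / (h * h)

u = lambda x, y: x * x - y * y               # Re(z^2), harmonic
v = lambda x, y: math.exp(x) * math.cos(y)   # Re(e^z), harmonic
w = lambda x, y: x * x + y * y               # not harmonic: Laplacian equals 4

point = (0.4, -0.7)
lap_u = laplacian(u, *point)
lap_v = laplacian(v, *point)
lap_w = laplacian(w, *point)

print(lap_u, lap_v, lap_w)   # approximately 0, 0 and 4
```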
Result 1.3.24: Suppose $D$ is a relatively compact region on a Riemann surface $M$ with piecewise differentiable boundary, and $f$ and $\omega$ are a $C^1$-function and a differential 1-form in a neighbourhood of the closure of $D$. Then
$\int_{\partial D} f\omega = \iint_D f\,d\omega - \iint_D \omega \wedge df$   (1.3.20)
(The above result uses Stokes' theorem and, as we can see, it is integration by parts.)
3.6 Square-integrable 1-forms on M
Definition 1.3.25: A 1-form $\eta$ on a fixed region $D$ of $M$ is a measurable form if it can be expressed locally as:
$\eta = f\,dz + g\,d\bar z$   (1.3.21)
in terms of measurable functions $f$ and $g$ on $D$ (see Chap. 0 for the definition). The complex Hilbert space of 1-forms defined above with norm
$\|\eta\|_D^2 = \iint_D \eta \wedge {*\bar\eta} < \infty$   (1.3.22)
is denoted $L^2(D)$, and is called the space of square-integrable 1-forms on $D$. In local coordinates the RHS of (1.3.22) becomes
$\iint_D i(f\bar f + g\bar g)\,dz \wedge d\bar z = \iint_D 2(|f|^2 + |g|^2)\,dx \wedge dy$   (1.3.23)
The inner product of two forms $\eta_1, \eta_2 \in L^2(D)$ is given by:
$(\eta_1, \eta_2)_D = \iint_D \eta_1 \wedge {*\bar\eta_2}$   (1.3.24)
Using the local expression (1.3.21) it can be easily checked that
$(*\eta_1, *\eta_2)_D = (\eta_1, \eta_2)_D$   (1.3.25)
Obviously $L^2(M)$ denotes the Hilbert space of square-integrable 1-forms on the Riemann surface $M$. We state the following important result known as Weyl's lemma (for a proof see [8]).
Result 1.3.26: A measurable square-integrable function $f$ on the unit disc $D$ is harmonic if and only if for every $C^\infty$-function $g$ on $D$ with compact support, the following holds:
$\iint_D f\,\Delta g = 0$   (1.3.26)
where $\Delta g$ is the Laplacian of $g$. An alternative version of the above result using local coordinates can be put as:
Result 1.3.27: Let $f$ be a measurable square-integrable function on the unit disc $D$. The function $f$ is holomorphic if and only if for every $C^\infty$-function $g$ on $D$ with compact support
$\iint_D f(z) \frac{\partial g}{\partial \bar z}\,dz \wedge d\bar z = 0$   (1.3.27)
We next see that $L^2(M)$ can be decomposed in more than one way as a direct sum of orthogonal spaces. Let $E$ denote the $L^2(M)$-closure of the set of 1-forms
$\{df : f \text{ is a smooth function on } M \text{ with compact support}\}$.
Let $E^*$ be the collection of all forms $\eta \in L^2(M)$ such that $*\eta \in E$. Then for every $\eta \in E$ ($E^*$) there exists a sequence $\{f_n\}$ of smooth functions with compact support on $M$ such that:
$\eta = \lim_n df_n$ ($= \lim_n *df_n$).
The set $L^2(M)$ has the following decompositions:
$L^2(M) = E \oplus E^\perp$, $L^2(M) = E^* \oplus (E^*)^\perp$, $L^2(M) = E \oplus E^* \oplus H$   (1.3.28)
where $H = E^\perp \cap (E^*)^\perp$. The following result is a consequence of the above decomposition. (See Hint to Exc. 13 for definitions of $E^\perp$ and $(E^*)^\perp$.)
Result 1.3.28: The Hilbert space $H$ consists of harmonic differentials in $L^2(M)$.
Remark 1.3.29: On a compact Riemann surface there do not exist exact non-zero harmonic differentials. In order to have such differentials one has to allow singularities on them (see [8]).
Result 1.3.30(a): Given a point $P_0$ on $M$ and a local coordinate $z$ that vanishes at $P_0$, one can always find a function $\phi$ with the following properties: (i)
+ $|\log z^2|$ is harmonic in a neighbourhood $N$ of $P_0$.
In addition f [ C is analytic. Thus, in short, (D)a^(A) is an open subset of $M_2$. To prove the given result (1.3.6), we assume that $\phi$ is not constant; then, from above, is surjective.
4. Let $T_x(M)$ denote the cotangent space to an $n$-dimensional Riemannian manifold $M$ and let $\Lambda$ denote the space of differential forms; then the inner product on $T_x(M)$ can be used to define an inner product on $\Lambda$ in the following manner: (a)
z/w on $U_1 = \{[z, w] : w \neq 0\}$, $[z, w] \mapsto w/z$ on $U_2 = \{[z, w] : z \neq 0\}$.
18. We write the metric on the Riemann surface M 3 = CM as $\phi(z)\,|dz|$ and note that A e Aut ( C J is an isometry if and only if (i) I9 U) for z 6 C u {->}.
/ and is written ker $\phi$. If every element of $G'$ is the image of some element in $G$, the mapping is a homomorphism onto, otherwise it is a homomorphism into. For instance, if $H$ is an invariant subgroup of $G$, the mapping $G \to G/H$ is a 'homomorphism onto.'
Definition 2.1.9: A 1-1 correspondence between two groups $G$ and $G'$ is called an isomorphism if it preserves the group structure. Thus from an abstract point of view two isomorphic groups are one and the same.
Definition 2.1.10: If a 1-1 correspondence $x \cdot y$, $\psi : G \to G : \psi(x) \to x^{-1}$ are continuous, where $G \times G$ has the product topology. The requirement (iii) is the compatibility condition between the two structures (algebraic and topological). Obviously the two mappings of (iii) can be reduced to a single continuous mapping: $(x, y) \to x y^{-1}$.
Example 2.2.2: (a) The additive group of real numbers $\mathbb{R}$ is a topological group (T.G.) with metric topology. (b) The group of $(n \times n)$ nonsingular real matrices, denoted GL$(n, \mathbb{R})$, is a T.G. with matrix multiplication as the group structure and with the usual topology of $\mathbb{R}^{n^2}$ as the topological structure. (c) The rotation groups in Exp. (1.1.2, d) are also T.G.s. (d) The unit circle $S^1 : \{\exp(2\pi i x) \mid x \in \mathbb{R}\}$ is a T.G. with multiplication of complex numbers as the group operation.
(In each of these examples it is an easy exercise to check that the mappings (f> and y/ of Def. (2.2.1) are continuous.) Given two topological groups G\ and G2, one can define another T.G. called the product group G{ x G2 by choosing the product topology on the set G{ x G2 as the topology and by letting the pointwise operations as (xx, x2) (y^, y2)~l = (x{ y[l, x2 y21)- be an associative binary operation that can be defined only on certain pairs (x, y) e N x N and let yr be another operation \\f: x —> x~l defined for some elements of N, then iV is a local topological group provided <j) and y/ are continuous and wherever they are defined xx~l = AT1 X = e holds good. 3 ] = [1 - E] (A(j>) which shows that Atp e X1. 4. D (A) is dense, means that V vector ysD (A) there is a sequence yne D (A) such that yis the limit vector of sequence yn whereas D (A+) is the set of all vectors I// that satisfy (3.3.4). To show that adjoint operator A+ is closed we must establish that the limit vector)//' of every convergent sequence y'n that € D(A') also e tD(A+). From (3.3.4) for = < Acp, y'n > -» < j^0, y1 > = < ). This shows that l//' is in CD (A+). In other words A+ y'n converges to a vector A+ y' = £. J'{X) y», so \f(x)] =f (Si). Thus if/is a real function, then/(.#) is a self-adjoint operator, and if/*/= 1, then since \f(A)]+f(A) = 1 =/(jQ [f(A)]+JW {xx) (x2) (CrL) = {0}, hence the subalgebra S as well as the quotient algebra Qis nilpotent. If, on the other hand, Q,is nilpotent and / is contained in the centre of L, then, using CfCi= {0} for some r, we have Crg c / for gsL and therefore Cr+lL c [/, L] = {0} so thatX is nilpotent. The nilpotency of 5 and / can be established trivially (b) plane on a circle of radius. are arbitrary, we note that (a) and (b) will be true even when these are replaced by A' and (ug) = (j> (u)g forge G and u € P, the condition (6.6.3) is still satisfied. 
This enables us to define a group of automorphisms: Aut(F) = {> € Diff(P) | <j) is G-equivariant} satisfies: no $= ft (6.6.7)(a) 24 (x) (M + iN) - 2iF V^^V -id (A-A') A^ X D -> D (7.5.39) Obviously A and D are gauge invariant. If one were to choose the vector superfield V with C, x, M and N as zero, the supersymmetry breaks but the familiar gauge transformation is still invariant (note that we have denoted - i(A - A*) as a). This particular choice of C, £, M, N is called the Wess-Zumino or WZ gauge. It can be easily verified that in WZ gauge the vector field V satisfies: V3 = 0 * is immediate, it shows that this is a linear mapping. To check (ii) and (iii) we take *F and A as 1- and 2-forms. Accordingly (ii) can be written as: (c) ) Thus for a function/(0) (of this superfield 0) we have the functional derivative: x.( / and S, corresponds to the f (x) (x3)){<j>(x2) \*n))o = ?—T^TTT; — ~ ( 9 - 4 - 26 ) exp(iS0)nn(x)d n(x) + do4>(x) {x). 2 - J ^e^'-^f J In Thus the RHS of (9.5.5) equals: ) + £"_ _Q dt dx J(x, t) 0(x, t) (x3)^(x4))\0)c^ '
$\iint_{M \setminus N} d\phi \wedge {*d\bar\phi} < \infty$ for every open set $N$ containing $P_1$ and $P_2$, and $(d\phi, df) = 0 = (d\phi, *df)$ for all smooth functions $f$ on $M$ that have compact support and vanish on a neighbourhood of $P_1$ and $P_2$.
Another concept which is of great use in Riemann surfaces is the classification of differentials on them via the theory of abelian differentials. We shall list a few of these below.
3.7 Abelian Differentials on M
Definition 1.3.31: Let $r$ be an integer. A meromorphic $r$-differential $\eta$ on $M$ is an assignment of a meromorphic function $f$ to each local coordinate $z$ on $M$ such that
$f(z)\,dz^r$   (1.3.29)
is invariantly defined (i.e., it is independent of coordinates). When $r = 1$, this is called an abelian differential.
Result 1.3.32: Let $P \in M$ and let $z$ be a local coordinate on $M$ that vanishes at $P$. Then for every integer $n \ge 1$, there exists an abelian (meromorphic) differential $f(z)\,dz$ on $M$ which is holomorphic on $M \setminus \{P\}$ and has singularity $\frac{1}{z^{n+1}}$ at $P$ (i.e., $f$ has a pole of order $n + 1$ at $P$).
Result 1.3.33: Let $P_1$ and $P_2$ ($P_1 \neq P_2$) be two points on $M$ with local coordinates $z_1$ and $z_2$ that vanish at $P_1$ and $P_2$ respectively. Then there exists an abelian differential $\eta$, holomorphic on $M \setminus \{P_1, P_2\}$ and with singularities $\frac{1}{z_1}$ at $P_1$ and $-\frac{1}{z_2}$ at $P_2$.
To prove the results (1.3.32) and (1.3.33) we set respectively differentials: (i) $\eta = -(1/2)(\alpha + i{*\alpha})$ and (ii) $\eta = \alpha + i{*\alpha}$ where $\alpha = d$
(0 = origin).
If $f(z) = z^n g(z)$, where $g(z)$ is holomorphic and non-vanishing at $P$, then $\operatorname{ord}_0 f = n$.
Remark 1.3.35: Note that local parameters are homeomorphic, hence the order of $\eta$ (defined in terms of the order of $f$) is well defined, i.e., invariant under coordinate transformations. Moreover $\{P \in M : \operatorname{ord}_P \eta \neq 0\}$ is a discrete set on $M$, hence it is a finite set if $M$ is compact.
Definition 1.3.36: If $\eta$ is an abelian differential given by (1.3.29) for $r = 1$, then the residue of $\eta$ at $P$ is defined as the coefficient $a_{-1}$ in the Laurent series expansion of $f(z)$:
$f(z) = \sum_{n} a_n z^n$. Thus $\operatorname{Res}_P \eta = a_{-1}$.
On the other hand, it is known that (see [8])
$\operatorname{Res}_P \eta = \frac{1}{2\pi i} \int_C \eta$
where $C$ is a simple closed curve in $M$ that bounds a disc $D$ containing $P$ and has winding number 1 around $P$, and $\eta$ is holomorphic in $\operatorname{cl} D \setminus \{P\}$. Hence the residue of a meromorphic 1-form at a point $P$ is a well defined concept.
Result 1.3.37: Suppose $P_1, P_2, \dots, P_l$ ($l > 1$) are distinct points on a Riemann surface $M$, and $a_1, \dots, a_l$ are complex numbers with
$\sum_{k=1}^{l} a_k = 0$   (1.3.30)
Then there exists a meromorphic (abelian) differential $\eta$ on $M$, which is holomorphic on $M \setminus \{P_1, \dots, P_l\}$ and satisfies:
(a) $\operatorname{ord}_{P_k} \eta = -1$, (b) $\operatorname{Res}_{P_k} \eta = a_k$ for every $k = 1, \dots, l$   (1.3.31)
Result 1.3.38:
Every abelian differential $\eta$ on a compact Riemann surface has the following property:
$\sum_{P \in M} \operatorname{Res}_P \eta = 0$.   (1.3.32)
3.8 A Few Results Based on Transformation Groups of M
Next we return to some group theoretic aspects that are needed in the theory of Riemann surfaces. Recall that the group $PSL(2, \mathbb{C})$,$^{14}$ the projective linear group formed by $2 \times 2$ complex matrices of determinant one, is a group of automorphisms of $\mathbb{C} \cup \{\infty\}$ which is better known as the group of Möbius transformations:

$z \mapsto \frac{az + b}{cz + d}$, $a, b, c, d \in \mathbb{C}$ and $ad - bc = 1$.

Definition 1.3.39: Let $G$ be a subgroup of $PSL(2, \mathbb{C})$ which acts as a group of biholomorphic automorphisms of the extended plane $\mathbb{C} \cup \{\infty\}$. Let $\Omega(G)$ denote the set of all those points $z_0 \in \mathbb{C} \cup \{\infty\}$ at which $G$ acts (properly) discontinuously.$^{15}$ The set $\Omega(G) = \Omega$ is an open $G$-invariant subset of $\mathbb{C} \cup \{\infty\}$. If $\Omega \neq \emptyset$, we call $G$ a Kleinian group. The set $\Lambda = \Lambda(G) = (\mathbb{C} \cup \{\infty\}) \setminus \Omega(G)$ is called the limit set of $G$.

Definition 1.3.40: A Kleinian group $G$ is called Fuchsian if there is a disc $\Delta$ ($\Delta = \{|z| < \alpha\}$, $\alpha \in \mathbb{R}$) that is invariant under $G$. It is called an elementary group if $\Lambda(G)$ consists of two or fewer points. The following facts about the Kleinian groups are well known.

Fact 1.3.41: (a) Every Kleinian group $G$ is finite or countable. (b) A group $G$ which is Kleinian must be discrete (note that the converse is not true).

Fact 1.3.42: If $G$ is Fuchsian with invariant disc $D$, then $\Lambda(G) \subset \partial D$.
Definition 1.3.43: Let $M$ be an arbitrary Riemann surface, and $\tilde{M}$$^{16}$ its universal covering space with canonical projection $\pi : \tilde{M} \to M$ and $G$ as the covering group (group of topological automorphisms), so that the following diagram commutes:

[Diagram: commutative triangle formed by $g : \tilde{M} \to \tilde{M}$ ($g \in G$) and the two projections $\pi : \tilde{M} \to M$, i.e., $\pi \circ g = \pi$.]

14. See Chap. 2 for group theory.
15. $G$ acts discontinuously at $z_0 \in \mathbb{C} \cup \{\infty\}$ if: (i) the isotropy subgroup $G_{z_0} = \{g \in G : g(z_0) = z_0\}$ of $G$ at $z_0$ is finite; (ii) there exists a neighbourhood $V$ of $z_0$ such that $g(V) = V$ for $g \in G_{z_0}$; (iii) $g(V) \cap V = \emptyset$ for $g \in G \setminus G_{z_0}$.
16. Given a connected topological manifold $X$, a new manifold $\tilde{X}$ is called the universal covering manifold of $X$ if $\tilde{X}$ has the following properties: (i) there is a surjective local homeomorphism $\pi : \tilde{X} \to X$; (ii) the manifold $\tilde{X}$ is simply connected, i.e., the fundamental group of $\tilde{X}$ is trivial (i.e., $\pi_1(\tilde{X}) = \{1\}$) (see Chapter 0); (iii) every homotopically non-trivial closed curve on $X$ lifts to an open curve on $\tilde{X}$ and the curve on $\tilde{X}$ is uniquely determined by the curve on $X$ and the point lying over its initial point.
Complex Functions, Riemann Surfaces and Two-Dimensional Conformal Field Theory (an Introduction) 47
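Then, since $\pi$ is a normal covering, the group $G \cong \pi_1(M)$, the fundamental group of $M$. It is worth noting that $\tilde{M}$ can be one of the following: (i) $\mathbb{C} \cup \{\infty\} = D_1$; (ii) $\mathbb{C} = D_2$; (iii) $\Delta \cong U = \{z \in \mathbb{C} : \operatorname{Im} z > 0\} = D_3$.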
az + b cz+d ' Thus: (a) (b) (c) In
Aut (C u {oo}) = PSL(2,
Result 1.3.44: Every Riemann surface is conformally equivalent$^{17}$ to one of the homogeneous spaces $D_i/G_i$ ($i = 1, 2, 3$), where $D_i$ and $G_i$ are given in (i), (ii) and (iii), and (a), (b), and (c). Furthermore, these groups are fundamental groups of the corresponding Riemann surfaces. If the Riemann surface is simply connected, then (and then alone) it is conformally equivalent to one of the $D_i$'s listed above. In addition, if $\pi_1(M) \cong \mathbb{Z}$ (the ring of integers), then $M$ is conformally equivalent to one of the following: (i) $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$; (ii) $\Delta^* = \Delta \setminus \{0\}$; (iii) $\Delta_r = \{z \in \mathbb{C} : r < |z| < 1\}$ ($0 < r < 1$). If $\pi_1(M) \cong \mathbb{Z} \oplus \mathbb{Z}$, then $M$ is a torus.
If the universal covering of a Riemann surface M is a sphere, then M must be a
Result 1.3.46: If the (holomorphic) universal covering space of M is
17. $M$ and $N$ are conformally equivalent if and only if there exists an analytic bijection of $M$ onto $N$. A surface is simply connected if it can be continuously contracted to a point.

Definition 1.3.47: Let $z \mapsto \frac{az + b}{cz + d}$ be written as $z \mapsto Az$, where

$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$

with $ad - bc = 1$. The transformation $A$ is called parabolic if it has one fixed point. Note that the trace equation

$\operatorname{Trace}^2 A = (a + d)^2$ (1.3.33)

implies that $A$ is parabolic if and only if

$\operatorname{Trace}^2 A = 4$. (1.3.34)

Using equations (1.3.33) and (1.3.34) we can further define:
Definition 1.3.48: $A$ is called elliptic if $\operatorname{Trace}^2 A = T \in \mathbb{R}$ and $0 \leq T < 4$. It is called loxodromic if $T \notin [0, 4] \subset \mathbb{R}$. A loxodromic transformation $A$ is called hyperbolic if $T > 4$.

Using the above characteristic properties of Möbius transformations, one can prove:

Result 1.3.49: If $M$ is a Riemann surface with $\pi_1(M) \cong \mathbb{Z} \oplus \mathbb{Z}$, then the holomorphic universal covering space of $M$ is $\mathbb{C}$.

Result 1.3.50: Suppose $M$ is a Riemann surface with holomorphic universal covering space $U = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$ and $\pi_1(M)$ is commutative. Then $M$ must be $\Delta$, or $\Delta^* = \Delta \setminus \{0\}$ or $\Delta_r = \{z \in \mathbb{C} : r < |z| < 1\}$, $0 < r < 1$ (compare the above two results with (1.3.44)).

We further note that this classification of fixed points of a Möbius transformation also helps in determining whether or not two topologically equivalent Riemann surfaces are conformally equivalent, a concept that is required in String theory. We now move on to the metric properties of a Riemann surface. The Riemannian metrics on our familiar (simply-connected) Riemann surfaces $M_1 = \mathbb{C}$, $M_2 = \Delta = \{z \in \mathbb{C} : |z| < 1\}$ and $M_3 = \mathbb{C} \cup \{\infty\}$ are respectively:

(i) $|dz|$, $z \in M_1$;
(ii) $\frac{2}{1 - |z|^2}\,|dz|$, $z \in M_2$;
(iii) $\frac{2}{1 + |z|^2}\,|dz|$, $z \in M_3$.
It should be noted that (iii) holds only for $z \neq \infty$. We have seen earlier that $M_3$ is a compact surface of genus 0; in fact this is a sphere, as can be illustrated by using the stereographic projection and the metric on it given in (iii) (see Exercise 19). An important result pertaining to $\operatorname{Aut}(M_3)$ is given as follows:

Result 1.3.51: Let

$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$

belong to $\operatorname{Aut}(M_3)$ (the group of Möbius transformations). Then $A$ defines an isometry of the metric given in (iii) if and only if the elements of $A$ satisfy $d = \bar{a}$, $b = -\bar{c}$, and $|a|^2 + |c|^2 = 1$.
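The isometry condition of Result 1.3.51 can be checked numerically. The sketch below is only an illustration: the function names, the sample points and the two test matrices are hypothetical choices, and it assumes the spherical metric $\frac{2}{1+|z|^2}|dz|$ together with $|A'(z)| = 1/|cz+d|^2$ for $ad - bc = 1$.

```python
import cmath
import math

def mobius(a, b, c, d, z):
    """Apply z -> (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)

def spherical_density(z):
    """Coefficient of |dz| in the spherical metric: 2 / (1 + |z|^2)."""
    return 2.0 / (1.0 + abs(z) ** 2)

def pulled_back_density(a, b, c, d, z):
    """Density of the metric pulled back by A, using |A'(z)| = 1/|cz + d|^2 (ad - bc = 1)."""
    return spherical_density(mobius(a, b, c, d, z)) / abs(c * z + d) ** 2

def is_isometry(a, b, c, d, samples, tol=1e-12):
    """True if the pulled-back density agrees with the original one at every sample point."""
    return all(
        abs(pulled_back_density(a, b, c, d, z) - spherical_density(z)) < tol
        for z in samples
    )

# Hypothetical sample points in the finite plane.
samples = [0.3 + 0.4j, -1.2 + 0.7j, 2.0 - 1.5j, 0.05j]

# Hypothetical matrix with d = conj(a), b = -conj(c), |a|^2 + |c|^2 = 1.
a = cmath.rect(math.cos(0.7), 0.3)
c = cmath.rect(math.sin(0.7), -1.1)
unitary = (a, -c.conjugate(), c, a.conjugate())

# Hypothetical matrix with ad - bc = 1 that violates the conditions (z -> 4z).
dilation = (2.0, 0.0, 0.0, 0.5)

print(is_isometry(*unitary, samples))
print(is_isometry(*dilation, samples))
```

A finite set of sample points cannot prove the result, but it does distinguish the two cases: the first matrix preserves the density, the second does not.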
In this section we have only presented the material for a beginner in the theory illustrating it with some 20 exercises. For detailed literature and deep results, see References [2], [3], [6], [8], [9], [11]. We devote the final section of this chapter to conformal field theory in 2-dimensions.
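As a small illustration of the trace classification in Definitions 1.3.47 and 1.3.48, the following sketch classifies a Möbius transformation from $T = \operatorname{Trace}^2 A$. The function name `classify_mobius`, the tolerance and the normalisation step (dividing by a square root of the determinant so that $ad - bc = 1$) are assumptions made for this example.

```python
import cmath

def classify_mobius(a, b, c, d, tol=1e-9):
    """Classify z -> (az+b)/(cz+d) via T = Trace^2 A after normalising to ad - bc = 1."""
    det = a * d - b * c
    if abs(det) < tol:
        raise ValueError("degenerate matrix: ad - bc = 0")
    s = cmath.sqrt(det)          # either square root works: T does not depend on the sign
    a, b, c, d = a / s, b / s, c / s, d / s
    if abs(b) < tol and abs(c) < tol and abs(a - d) < tol:
        return "identity"
    T = (a + d) ** 2
    if abs(T.imag) < tol:        # T is real
        t = T.real
        if abs(t - 4) < tol:
            return "parabolic"
        if -tol <= t < 4:
            return "elliptic"
        if t > 4:
            return "hyperbolic (loxodromic)"
    return "loxodromic"          # T outside the real interval [0, 4] and not > 4

print(classify_mobius(1, 1, 0, 1))      # translation z -> z + 1
print(classify_mobius(1j, 0, 0, 1))     # rotation z -> i z
print(classify_mobius(2, 0, 0, 0.5))    # dilation z -> 4 z
print(classify_mobius(2j, 0, 0, 1))     # z -> 2i z
```

The four sample maps are a translation, a rotation, a real dilation and a rotation combined with a dilation, which fall into the parabolic, elliptic, hyperbolic and (non-hyperbolic) loxodromic classes respectively.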
Exercise 1.3

1. Using $dz = dx + i\,dy$ and $d\bar{z} = dx - i\,dy$, show that the functions $f_1$ and $f_2$ given in (1.3.2) are related to those given in (1.3.2a) as: (i) $f_1 = F_1 + F_2$, $f_2 = i(F_1 - F_2)$ and (ii) $dz \wedge d\bar{z} = -2i\, dx \wedge dy$.
2. Show that the partial derivatives $f_z$ and $f_{\bar{z}}$ in (1.3.12) satisfy: $f_z = \frac{1}{2}(f_x - i f_y)$, $f_{\bar{z}} = \frac{1}{2}(f_x + i f_y)$.
3. Establish Result (1.3.6).
4. Verify the equalities (1.3.17), (1.3.18) and (1.3.19) after mentioning explicitly the requirements for the $*$ operator.
5. Let $\Lambda^k$ denote the vector space of $k$-forms. Show that the vector space $\Lambda = \Lambda^0 \oplus \Lambda^1 \oplus \Lambda^2$ is a graded anti-commutative algebra under the form multiplication.
6. Show that every $C^2$-function $f$ satisfies: $-2i\,\bar{\partial}\partial f = \Delta f = d{*}df$.
7. Prove Result (1.3.20).
8. Prove Result (1.3.21).
9. Prove Result (1.3.22).
10. Prove Result (1.3.23) and deduce from here that a differential $\omega$ is holomorphic if and only if it is closed and $*\omega = -i\omega$.
11. Use the defining equation (1.3.21) for $\eta$ and show that $\iint_D \eta \wedge {*}\bar{\eta} = \iint_D i(f\bar{f} + g\bar{g})\, dz \wedge d\bar{z} = \iint_D 2(|f|^2 + |g|^2)\, dx \wedge dy$.
12. Prove Result (1.3.27), i.e., Weyl's Lemma in its alternative form.
13. Show that the Hilbert space $H$ defined in (1.3.28) consists of harmonic differentials in $L^2(M)$. Deduce condition (iii) of Result (1.3.30)(a) from here.
14. Prove the results (a) (1.3.37) and (b) (1.3.38).
15. By choosing the image of a fixed point as $\infty \in \mathbb{C} \cup \{\infty\}$, show that a Möbius transformation $A$ is parabolic if and only if $A$ is conjugate to a translation $z \mapsto z + b$ (or equivalently to $z \mapsto z + 1$).
16. Show that the Möbius transformation $A$ with two fixed points is conjugate to $z \mapsto \alpha z$ where $\alpha \neq 0, 1$ and that:
    $A$ is elliptic $\iff$ $|\alpha| = 1$, $\alpha \neq 1$;
    $A$ is loxodromic $\iff$ $|\alpha| \neq 1$, $\alpha \neq 0$;
    $A$ is hyperbolic $\iff$ $\alpha \in \mathbb{R}$, $\alpha > 0$, $\alpha \neq 1$.
17. Define the complex projective space $P$ and show that it is a Riemann surface.
18. Prove Result (1.3.51); show further that $A$ here is elliptic.
19. Use the metric $\frac{2|dz|}{1 + |z|^2}$ ($z \neq \infty$) and the stereographic projection to show that the compactified complex plane $\mathbb{C} \cup \{\infty\}$ is a sphere.
20. If $\phi(z)$ denotes the coefficient of $|dz|$ in the above exercise, assuming that the curvature of this sphere is $-\frac{\text{Laplacian}(\log \phi(z))}{(\phi(z))^2}$, show that it equals 1.
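The curvature formula of Exercise 20 can be sanity-checked numerically before attempting the analytic computation. The sketch below is a hypothetical illustration: it assumes $\phi(z) = 2/(1 + |z|^2)$ and approximates the Laplacian with a five-point finite-difference stencil, so the values it prints are only approximately equal to the exact curvature.

```python
import math

def phi(x, y):
    """Coefficient of |dz| in the spherical metric, written in z = x + iy."""
    return 2.0 / (1.0 + x * x + y * y)

def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference approximation of f_xx + f_yy."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4.0 * f(x, y)) / (h * h)

def curvature(x, y):
    """Curvature -Laplacian(log phi) / phi^2 evaluated at the point (x, y)."""
    log_phi = lambda u, v: math.log(phi(u, v))
    return -laplacian(log_phi, x, y) / phi(x, y) ** 2

# Hypothetical sample points in the plane.
for point in [(0.0, 0.0), (0.5, -0.3), (2.0, 1.0)]:
    print(point, curvature(*point))
```

The printed values should be close to 1 at every sample point, up to the discretisation error of the stencil.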
Hints to Exercise 1.3

1. Use (1.2.3) and the procedure described in the Hint to Exercise 8.
2. The verification is an immediate consequence of (1.2.3).
3. Because of the importance of this result, we begin by explaining a few facts that were not explicitly mentioned in the text. We note that we have not drawn any distinction between an analytic mapping and a holomorphic mapping defined on a Riemann surface, although a more precise way would be: $f$ is holomorphic on $M$ if
(a)
D ; C D2 C [Ua nf-1 (Ufi n g'1 (Up)). Since D, and D2 are both contained in the open set within parenthesis, it follows that both/^ a = ZpofoZ^ and gpa = Z'p o g o Z^1 are defined and holomorphic on Za(D2), so the set of points in the compact set Za(D{) w h e r e / ^ = gpa is either finite or is all of Za ( D , ) . Hence/and g are same either throughout Dx or at only a finite subset of Dv We use both these options as follows. Let A be the set of points in Mx which have a neighbourhood on which/= g at only finitely many
18
For/? e M wherep is in (Zw £/„), let N= {q, da (q, p)<£}= Z~x [z : \z - Z a (p)\ < e}, then N, which can be mapped by Za onto an open disc in C, is called an open parametric disc at p with radius e. Note that we are using z as the coordinate of the point.
points and let $B$ be the set of points which have a neighbourhood $\mathcal{N}$ such that $f = g$ throughout $\mathcal{N}$. Then $A$ and $B$ are disjoint open subsets of $M_1$, and the above discussions imply that $A \cup B = M_1$. But since $M_1$ is connected, we have either $M_1 = A$ or $M_1 = B$. (ii) If $M_1$ and $M_2$ are Riemann surfaces and $\phi : M_1 \to M_2$ is holomorphic but not constant, then if $A$ is an open subset of $M_1$,
Da
Uan )-1 (Up) n A.
Now the mapping./^ s Z'pO ) o Z^ 1 is not constant on Za(D) (else 0 would be constant on D and hence would be constant on A/j in view of the result (i)), therefore the set V = fap Za(D) is an open subset of Z'p(Up). This means that
(dx'i A dx?z A ••• A dx{p\dx^
A d x ^ A ••• A d x h ) = Elky''pk
g
j
^ ••• g h k p
where g^kr denote the components of Riemann metric and e'\ ...'£ is the Kronecker tensor that equals +1 or - 1 according as k{, •••, kp is an even or odd permutation of ilt •••, ip. Using (a), the inner product between two arbitrary p-forms
a=
— a-h ••• ivp dx'i ••• dx'p a n d P= — fih ...j dxh ... dxh
can be written as:
(b)
ah ... ip pi - \
(a\p) = ±
When M is a Riemann surface the maximum value which p can take is 2, hence (a) simplifies to: (C)
(dx'l A dx'2\dxji A dxj2) = £ ^
gh k\ g>2k2
= 8Ug22-gi2g2i. If in addition the manifold is oriented, we can define an n-form T called the volume form or the volume element on M. In terms of an orthonormal basis {6'}, this is given by (d)
r = 01
A
&
A
•••
A
6"
or equivalently as F = F, ... ; dx 'i A ••• A dxl". On such an oriented manifold, an isomorphism AP(M) —> A"~P(M) can be defined by an operator *, which is related to F in the following sense:
(e) T(a\p) = a A */? for every p-form a e AP(M). (Note that */} is an (n - p) - form). For example, by choosing a = dx1 A ••• A <& P , (e) gives: (f) which implies
r c ^ 1 A ••• A j^iyS) = rpu •'"
<«>
< * % • i. - . ' . = 7 7 r M - v v
+l
- ^ " ' ''"
For a Riemann surface (n = 2), we set the form F as: (h) a (z) dx A dy. The choice of p is limited to 0, 1 and 2. When a is a 0-form, jS is also a 0-form, hence Eqs. (g) and (h) give: (i) *f(z)=f(z)(a(z)dxAdy). When P is a 1-form 77 =/i(z) etc + / 2 (z) dy, using Eqs. (g) and (d) and noticing that 9'idx1) = <^' (dx1 = dx, dx2 = dy), we obtain: (j) *7] = -/2(z)
(a) $\bar{\partial}(\partial f) = \bar{\partial}(f_z\, dz) = f_{z\bar{z}}\, d\bar{z} \wedge dz$.

From (1.2.3) we know that $f_z = \frac{1}{2}(f_x - i f_y)$ and $f_{\bar{z}} = \frac{1}{2}(f_x + i f_y)$, whereas $dz = dx + i\,dy$ and $d\bar{z} = dx - i\,dy$. We substitute these values in (a), and since:

(b) $\frac{\partial}{\partial \bar{z}}\left[\frac{1}{2}(f_x - i f_y)\right] = \frac{1}{4}(f_{xx} + i f_{xy} - i f_{yx} + f_{yy})$

and

(c) $d\bar{z} \wedge dz = (dx - i\,dy) \wedge (dx + i\,dy) = 2i\,(dx \wedge dy)$

we obtain

$-2i\,\bar{\partial}\partial f = \Delta f$.

If $f$ is harmonic, $\Delta f$ is zero, i.e., $\bar{\partial}\partial f = 0$, i.e., $\partial f$ is holomorphic.

9. The differential $\eta$ =
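that $\eta = \eta_1 + \bar{\eta}_2$. We write $\eta = u\,dz + v\,d\bar{z}$, then using $d = \partial + \bar{\partial}$, we obtain

(a) $d\eta = (u_{\bar{z}} - v_z)\, d\bar{z} \wedge dz$

and

(b) $d{*}\eta = -i(u_{\bar{z}} + v_z)\, d\bar{z} \wedge dz$.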
In view of result (1.3.20), $\eta$ is harmonic if and only if the right hand side of Eqs. (a) and (b) is zero; this will be so only when $u$ and $\bar{v}$ are holomorphic. Hence $\eta$ can be written in terms of holomorphic differentials as: $\eta = \eta_1 + \bar{\eta}_2$.

10. Next we begin by assuming that $\alpha$ is harmonic, hence we can write it in terms of holomorphic differentials $\alpha_1$ and $\alpha_2$:

$\alpha = \alpha_1 + \bar{\alpha}_2$, $\quad {*}\alpha = -i\alpha_1 + i\bar{\alpha}_2$.

This gives $\alpha + i{*}\alpha = 2\alpha_1$. Obviously $2\alpha_1$ is our $\eta$. Conversely, if $\eta$ is holomorphic, then $\eta$ and $\bar{\eta}$ are harmonic (in view of result (1.3.21)) and so $\frac{\eta + \bar{\eta}}{2} = \alpha$ as well as ${*}\alpha = \frac{-i\eta + i\bar{\eta}}{2}$ is harmonic. This gives: $\eta = \alpha + i{*}\alpha$. The last part is left to be completed by the reader.

11.
Now $\eta = f\,dz + g\,d\bar{z} = f(dx + i\,dy) + g(dx - i\,dy) = (f + g)\,dx + i(f - g)\,dy$. Treat $f + g$ as $f_1$, and $i(f - g)$ as $f_2$ and use (1.3.18) to write ${*}\eta$ with the assumption that $*$ is defined in this region. This gives

${*}\eta = -(i(f - g))\,dx + (f + g)\,dy = -if(dx + i\,dy) + ig(dx - i\,dy) = -if\,dz + ig\,d\bar{z}$

and ${*}\bar{\eta} = i\bar{f}\,d\bar{z} - i\bar{g}\,dz$. Therefore

$\iint_D \eta \wedge {*}\bar{\eta} = \iint_D i(f\bar{f} + g\bar{g})\, dz \wedge d\bar{z} = \iint_D 2(|f|^2 + |g|^2)\, dx \wedge dy$.
12. The necessity part is easy to show using Stokes' theorem (although $f$ is not defined on $\partial D$). Let $f$ be holomorphic and $g$ be smooth with compact support. Then
To prove the sufficiency, we suppose that $f$ is $C^1$. Then, since for every $g$ with compact support
We have -UD
jrg(z)dzAdl = \lDf^dzAdl-
=O
which leads to Cauchy-Riemann equations. Moreover if / and g are arbitrary, and g is with compact support, we have:
In view of the form of Weyl's lemma, this shows that $f$ is $C^\infty$ and thus it is holomorphic.

13. By definition, $E^\perp = \{\eta \in L^2(M);\ (\eta, df) = 0$ for all smooth functions $f$ on $M$ with compact support$\}$. Similarly, $E^{*\perp} = \{\eta \in L^2(M);\ (\eta, {*}df) = 0$ for all $f$ described above$\}$. Using these we can show that if $\eta \in L^2(M)$ is $C^1$, it belongs to $E^{*\perp}$ ($E^\perp$) if and only if it is closed (co-closed). For instance, consider that $\eta$ is co-closed, and $f$ is a smooth function with support inside a disc $D$ (closure of $D$ compact); then writing the inner product:

$(df, \eta) = \iint_D df \wedge {*}\bar{\eta} = \iint_D d(f\,{*}\bar{\eta}) - \iint_D f \wedge d({*}\bar{\eta})$

we note that the second integral is zero since $\eta$ co-closed means that ${*}\eta$ is closed, and the first is zero as $f$ is a smooth function and is integrated on a closed boundary. Thus $\eta \in E^\perp$. Conversely, given that $\eta \in E^\perp$, we have $(df, \eta) = 0$, which leads to

$\iint_M f \wedge d({*}\bar{\eta}) = 0$

for all smooth functions on $M$ with compact support. This implies that $d({*}\eta) = 0$, i.e., $\eta$ is co-closed. Finally, to show that $H = E^\perp \cap E^{*\perp}$ consists of harmonic forms, we observe that if $\eta$ is harmonic, it means it is closed and co-closed and hence from above it belongs to $H$. For the converse, let $\eta \in H$ and let $D$ be a coordinate disc on $M$ with local coordinate $z = x + iy$. Choose a real-valued function $\phi$ which is smooth and is supported in $D$, denote the partial derivative $\frac{\partial \phi}{\partial x}$ by $\psi_1$ and $\frac{\partial \phi}{\partial y}$ by $\psi_2$, and note that

$\frac{\partial \psi_1}{\partial y} = \frac{\partial \psi_2}{\partial x}$.
Write $\eta = f\,dx + g\,dy$ where $f$ and $g$ are measurable. Then since $\eta \in H$ and it is real-valued, we have:
0 = (77,
•IL<*+
>*>*{•&;"+&>)
Similarly
0= (77,* ^ 2 ) = Jj D r7A*(*^7)
Subtraction of the above equalities gives:
0=(r1,dyl-*dy/2)= JJD/A0.
Therefore in view of Weyl's Lemma, (1.3.26), $f$ is harmonic and hence $C^1$. Again, writing these steps for ${*}\eta$, which also belongs to $H$, we shall see that $g$ is harmonic and hence $C^1$. Accordingly, $\eta = f\,dx + g\,dy$ is of class $C^1$ and since $H = E^\perp \cap (E^*)^\perp$, $\eta$ belongs to $(E^*)^\perp$ and also to $E^\perp$, hence it is closed and co-closed. To prove (iii) of Result (1.3.30)(a) we simply note that d
at Pk and
Z
at P o (P o ^ Pk). This result can obviously be generalized to an arbitrary number
° of finite points on M giving us a required meromorphic Abelian differential. Thus we set /
1
k=l
k=\
V= I (*krlk= I
akfk(z)dz.
We assume that singularity of an fk(z) is — whereas fj(z) (j ^ k) is holomorphic at zk. We Zk
now use definition (1.3.36) to obtain the relations (a) and (b) of (1.3.31). From this definition and our result (1.3.30)(b) it also follows that the form $\eta$ is well-defined. (b) Since $M$ is compact it can be covered by a finite number of sets. Accordingly we can triangulate $M$ by $l$ triangles $\Delta_1, \ldots, \Delta_l$ (2-simplices) assuming that each singularity of $\eta$ is in the interior of just one triangle. Then using the fact that

$\sum_{P \in M} \operatorname{Res}_P \eta = \frac{1}{2\pi i} \sum_{j=1}^{l} \int_{\partial \Delta_j} \eta$

(where $\partial \Delta_j$ is the (positively oriented) boundary of $\Delta_j$), we obtain the result since each 1-simplex in the boundaries $\partial \Delta_j$ appears twice with opposite sign in the above sum.

15. Recall that an element $b \in G$ (a group) is conjugate to an element $a$ of $G$ if $b = c \circ a \circ c^{-1}$ for some $c$ in $G$. We are dealing here with $G = PSL(2, \mathbb{C})$. Let $A \in G$ be parabolic, and let $z_0$ be the corresponding fixed point; let $C$ be a Möbius transformation such that $C(z_0) = \infty$. Then evidently we have a Möbius transformation $D$ which is conjugate to $A$ such that

(i) $D(\infty) = C \circ A \circ C^{-1}(\infty) = \infty$,

showing that $D$ is parabolic. Equating (i) to $\frac{az + b}{cz + d}$ (and changing the coefficients) it follows that a parabolic transformation can be expressed as $z \mapsto az + b$ with $a \neq 0$. The trace condition (1.3.34) restricts $a$ to be equal to 1, and $b$ can be chosen as 1; hence we have proved the result that if $A$ is parabolic, it is conjugate to a Möbius transformation $z \mapsto z + b$ (and thus also to $z \mapsto z + 1$). Conversely given the translation $z \mapsto z + 1$ represented by the matrix:
$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. We have to show that for every $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $G$ the transformation

$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$

is parabolic. Now

$A = \begin{pmatrix} a(d - c) - bc & a^2 \\ -c^2 & c(a - b) + ad \end{pmatrix}$.

It can be easily checked that $A$ is Möbius and that $\operatorname{Trace}^2 A = 4$, hence $A$ is parabolic.

16. We use almost the same arguments as we did in Exercise 15. Given $A \in G = PSL(2, \mathbb{C})$ with two fixed points $z_1$ and $z_2$, we assume $C \in G$ such that $C(z_1) = 0$ and $C(z_2) = \infty$, then construct the mapping $D = C \circ A \circ C^{-1}$ which gives, using $z_1$ and $z_2$, that $D(z) = cz + d$ and $D(z) = az + b$. Normalization gives: $D(z) = c'z + 1$, $D(z) = a'z + 1$. Since they must both be equal, we have $D(z) = (a - c)z = \lambda z$. Obviously, $\lambda = 0$ will mean that the mapping given by $D$ is degenerate, and $\lambda = 1$ will mean that it is the identity. Hence a mapping $D$ conjugate to $A$ with two fixed points must be such that $z \mapsto \lambda z$ where $\lambda \neq 0, 1$. We leave the converse and the rest of the exercise as a simple problem to be solved using the definition (1.3.48).

17. Consider the space: $V = \mathbb{C} \times \mathbb{C} - \{(0, 0)\} = \{(z, w) : z, w \in \mathbb{C},\ |z|^2 + |w|^2 \neq 0\}$
(i)

with the subspace topology derived from the product topology on $\mathbb{C} \times \mathbb{C}$. We say that two pairs $(z, w)$, $(z', w')$ are equivalent if there is a non-zero complex number $t$ such that $(z, w) = (tz', tw')$. The equivalence class containing $(z, w)$ is denoted as:

(ii) $[z, w] = \{(tz, tw) : t \in \mathbb{C},\ t \neq 0\}$.

The quotient map $q : (z, w) \mapsto [z, w]$ maps $V$ onto the space $P$ of equivalence classes which inherits the quotient topology induced by $q : V \to P$. The space $P$ is called the complex projective space (see also 10A.1). To show that it is a Riemann surface, we consider the maps $\phi : V \to \hat{\mathbb{C}}$$^{19}$ and $\phi' : P \to \hat{\mathbb{C}}$ defined by:

(iii) $\phi(z, w) = \phi'([z, w]) = z/w$ if $w \neq 0$, and $= \infty$ if $w = 0$.

Evidently $\phi = \phi' \circ q$.

19. $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$.
Since $A(z) = \frac{az + b}{cz + d}$ implies $A'(z) = \frac{d}{dz}\left(\frac{az + b}{cz + d}\right) = \frac{1}{(cz + d)^2}$ and

$\phi(A(z)) = \frac{2}{1 + \left|\frac{az + b}{cz + d}\right|^2}$,

simplification after substitution in (i) yields:

(ii) $\frac{1}{|cz + d|^2 + |az + b|^2} = \frac{1}{1 + |z|^2}$.

Thus $A$ is an isometry if and only if (ii) holds. If $d = \bar{a}$ and $b = -\bar{c}$ with $|a|^2 + |c|^2 = 1$, then evidently the left hand side of (ii) equals the right hand side, for

$|cz + d|^2 + |az + b|^2 = |cz + \bar{a}|^2 + |az - \bar{c}|^2 = (cz + \bar{a})(\bar{c}\bar{z} + a) + (az - \bar{c})(\bar{a}\bar{z} - c)$
$= |c|^2|z|^2 + |a|^2 + acz + \bar{a}\bar{c}\bar{z} + |a|^2|z|^2 + |c|^2 - acz - \bar{a}\bar{c}\bar{z} = (|a|^2 + |c|^2)(|z|^2 + 1) = |z|^2 + 1$.

To show the converse, we expand $|cz + d|^2 + |az + b|^2$ and equate it to $1 + |z|^2$; this gives:

$(|a|^2 + |c|^2)|z|^2 + (c\bar{d} + a\bar{b})z + (d\bar{c} + b\bar{a})\bar{z} + (|b|^2 + |d|^2) = 1 + |z|^2$.

This can hold only if

(iii) $|a|^2 + |c|^2 = 1 = |b|^2 + |d|^2$

and

(iv) $c\bar{d} + a\bar{b} = 0$ as well as $d\bar{c} + b\bar{a} = 0$.

Note, however, that the vanishing of one of these in (iv) implies the vanishing of the other. Since $ad - bc = 1$, we have $ad\bar{d} - bc\bar{d} = \bar{d}$, or using (iv), $a|d|^2 + a|b|^2 = \bar{d}$. Again, using (iii), we have $a = \bar{d}$ or $\bar{a} = d$. Similarly, multiplying $ad - bc = 1$ by $-\bar{c}$ and using Eqs. (iv) and (iii) subsequently we get $b = -\bar{c}$. To show that $A$ is elliptic, we note that

(v) $\operatorname{Trace} A = a + \bar{a} = 2\operatorname{Re} a$.
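If $\operatorname{Re} a$ were to be $\pm 1$, $A$ will naturally be the identity (thus $A$ will give a trivial isometry) and this is not what we want; hence we have $-2 < \operatorname{Trace} A < 2$. Since $\operatorname{Trace}^2 A < 4$, $A$ is elliptic.

19. Let $S^2$ be the unit sphere in $\mathbb{R}^3$ given as $\xi^2 + \eta^2 + \zeta^2 = 1$. Consider the stereographic projection through the point $(0, 0, 1)$ of this sphere onto $\mathbb{C}$, thus

$(\xi, \eta, \zeta) \mapsto \frac{\xi + i\eta}{1 - \zeta} = z$.

Put $z = x + iy$ and note that $2x = \frac{2\xi}{1 - \zeta}$, $2y = \frac{2\eta}{1 - \zeta}$ and $(x^2 + y^2) + 1 = \frac{2}{1 - \zeta}$. This helps us to write the inverse of the above map as:

(i) $z \mapsto \left(\frac{2\operatorname{Re} z}{|z|^2 + 1},\ \frac{2\operatorname{Im} z}{|z|^2 + 1},\ \frac{|z|^2 - 1}{|z|^2 + 1}\right)$.

This in turn defines a diffeomorphism between $S^2 \setminus \{(0, 0, 1)\}$ and $\mathbb{C}$ which can be extended to a diffeomorphism between $S^2$ and $\mathbb{C} \cup \{\infty\}$. The Euclidean metric in $\mathbb{R}^3$ induces a metric on $S^2$. We show next that this metric is the same as the one given in the exercise. From (i) we have:

(ii) $(d\xi, d\eta, d\zeta) = \left(\frac{2(1 - x^2 + y^2)\,dx - 4xy\,dy}{(1 + x^2 + y^2)^2},\ \frac{2(1 + x^2 - y^2)\,dy - 4xy\,dx}{(1 + x^2 + y^2)^2},\ \frac{4(x\,dx + y\,dy)}{(1 + x^2 + y^2)^2}\right)$

so that

$d\xi^2 + d\eta^2 + d\zeta^2 = \frac{4}{(1 + x^2 + y^2)^4}\Big\{[(1 - x^2 + y^2)^2 + 4x^2y^2 + 4x^2]\,dx^2 + [4x^2y^2 + (1 + x^2 - y^2)^2 + 4y^2]\,dy^2 + [8xy - 4xy(1 - x^2 + y^2 + 1 + x^2 - y^2)]\,dx\,dy\Big\}$
$= \frac{4}{(1 + x^2 + y^2)^4}\Big\{(1 + x^2 + y^2)^2\,dx^2 + (1 + x^2 + y^2)^2\,dy^2\Big\} = \frac{4(dx^2 + dy^2)}{(1 + x^2 + y^2)^2} = \left(\frac{2|dz|}{1 + |z|^2}\right)^2$.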
20. Beginning with $-\frac{\text{Laplacian}(\log \phi(z))}{(\phi(z))^2}$, we compute the Laplacian$(\log \phi(z))$, i.e.,

$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\left[\log 2 - \log(1 + x^2 + y^2)\right] = -\left[\frac{2(1 - x^2 + y^2)}{(1 + x^2 + y^2)^2} + \frac{2(1 + x^2 - y^2)}{(1 + x^2 + y^2)^2}\right] = -\frac{4}{(1 + x^2 + y^2)^2} = -(\phi(z))^2$.

Hence the curvature equals 1.
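The stereographic projection used in Hint 19 can be explored numerically. The following sketch is an illustration under stated assumptions: the function names and sample points are hypothetical, the projection is taken from the north pole $(0, 0, 1)$, and the induced length of a small displacement is approximated by a chord in $\mathbb{R}^3$.

```python
import math

def to_sphere(z):
    """Inverse stereographic projection: z -> (xi, eta, zeta) on the unit sphere."""
    r2 = abs(z) ** 2
    return (2 * z.real / (r2 + 1), 2 * z.imag / (r2 + 1), (r2 - 1) / (r2 + 1))

def to_plane(xi, eta, zeta):
    """Stereographic projection from the north pole (0, 0, 1)."""
    return complex(xi, eta) / (1 - zeta)

def induced_length(z, dz):
    """Euclidean length in R^3 of the image of a small displacement dz at z."""
    return math.dist(to_sphere(z), to_sphere(z + dz))

# Hypothetical sample points and a small displacement.
samples = [0.2 + 0.1j, -1.0 + 0.5j, 3.0 - 2.0j]
dz = 1e-6 * (1 + 1j)

for z in samples:
    ratio = induced_length(z, dz) / abs(dz)      # approximates the coefficient of |dz|
    print(z, ratio, 2 / (1 + abs(z) ** 2))
```

For each sample point the two printed numbers should nearly agree, which is the numerical counterpart of the identity $d\xi^2 + d\eta^2 + d\zeta^2 = \left(\frac{2|dz|}{1+|z|^2}\right)^2$.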
4
THE TWO-DIMENSIONAL CONFORMAL FIELD THEORY
As the title suggests we shall give in this section a simplified version of conformal field theory, in the sense that we shall limit our discussions of the theory to the Minkowskian/Euclidean space of two dimensions. In later chapters (Chapters 5 and 11), we shall see that conformal groups play a dominant role in string theory via the Virasoro operators $L_n$ (see Sec. 5.2), which happen to be the generators of this group.
4.1
Conformal Group
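Given an $n$-dimensional Minkowski/Euclidean space $M$ with coordinates $x^\mu$ ($\mu = 1, \ldots, n$), a conformal transformation is a diffeomorphism $x^\mu \to \tilde{x}^\mu$ such that the line (metric) element is preserved up to a scale factor, i.e.:

$d\tilde{s}^2 = d\tilde{x}^\mu\, d\tilde{x}^\nu\, \eta_{\mu\nu} = \Omega(x)\, dx^\mu\, dx^\nu\, \eta_{\mu\nu}$ (1.4.1)(a)

In the case of an infinitesimal transformation $x^\mu \to x^\mu + \epsilon^\mu$, the infinitesimal distance $ds^2$ transforms as:

$ds^2 \to ds^2 + (\partial_\mu \epsilon_\nu + \partial_\nu \epsilon_\mu)\, dx^\mu\, dx^\nu$, $\quad \epsilon_\mu = \epsilon^\nu \eta_{\mu\nu}$ (1.4.1)(b)

These two taken together lead to:

$\partial_\mu \epsilon_\nu + \partial_\nu \epsilon_\mu = \frac{2}{n}\, \eta_{\mu\nu}\, \partial_\rho \epsilon^\rho$ (1.4.1)(c)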
The collection of all such transformations given in (1.4.1) is the conformal group of $M$. It is known that for all dimensions $n > 2$, it is finite-dimensional, whereas for $n = 2$ it is infinite-dimensional. However, it is in this case that models in statistical mechanics can be related to string theory since whatever be the dimension of the embedding space, the theory in effect described by the world-sheet swept by the string is 2-dimensional. And this brings the two great theories, the quantum and the string, much closer and renders them more comprehensible (see the original work of Belavin, Polyakov and Zamolodchikov [4] and the references there, and 7.[21]). Since this group is infinite-dimensional, it has an infinite number of generators as we shall see in Subsec. 4. To begin with, we consider a particular case of this group, the Lorentz group, which is the group of those transformations where the scale factor is $\pm 1$. This group is abelian and thus has one generator and its irreducible representations are one-dimensional (see Chapter 2). Thus all representations of this group, such as tensors, can be decomposed into one-dimensional representations. To achieve this end, we shall use the light-cone formalism$^{20}$:
4.2
Light-cone Formalism and the Lorentz Group
Define

$z = x^0 + x^1$, $\quad \bar{z} = x^0 - x^1$ (1.4.2)

where $z$ and $\bar{z}$ are independent coordinates. The coordinate transformation inverse to (1.4.2) gives back the Minkowski coordinates $(x^0, x^1)$:

$x^0 = \frac{1}{2}(z + \bar{z})$, $\quad x^1 = \frac{1}{2}(z - \bar{z})$ (1.4.3)

Under the Lorentz group with (single element) $\omega = \omega_{01}$ (see Remark (2.2.12)), the transformations for light-cone coordinates (resulting from $\delta x^0 = \omega x^1$, $\delta x^1 = \omega x^0$) are

$\delta z = \omega z$, $\quad \delta \bar{z} = -\omega \bar{z}$ (1.4.4)

The Minkowski line element in these coordinates becomes:

$ds^2 = -(dx^0)^2 + (dx^1)^2 = -dz\, d\bar{z}$ (1.4.5)

and written out with metric tensor this becomes:

$ds^2 = g_{zz}\, dz\, dz + g_{z\bar{z}}\, dz\, d\bar{z} + g_{\bar{z}z}\, d\bar{z}\, dz + g_{\bar{z}\bar{z}}\, d\bar{z}\, d\bar{z}$ $^{21}$ (1.4.6)

This eventually leads to the values of these metric tensor components:

$g_{zz} = g_{\bar{z}\bar{z}} = 0$, $\quad g_{z\bar{z}} = g_{\bar{z}z} = -\frac{1}{2}$ (1.4.7)

in light-cone coordinates. The inverse metric can easily be verified as:

$g^{zz} = g^{\bar{z}\bar{z}} = 0$, $\quad g^{z\bar{z}} = g^{\bar{z}z} = -2$. (1.4.8)
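The light-cone metric components quoted in (1.4.7) and (1.4.8) follow from the usual tensor transformation rule, which the sketch below carries out in exact arithmetic. It is an illustrative check only: the helper names are hypothetical, and it assumes the Minkowski metric $\eta = \operatorname{diag}(-1, +1)$ in the coordinates $(x^0, x^1)$.

```python
from fractions import Fraction

# Minkowski metric eta_{mu nu} = diag(-1, +1) in coordinates (x^0, x^1).
ETA = [[Fraction(-1), Fraction(0)], [Fraction(0), Fraction(1)]]

# Jacobian dx^mu / d(z, zbar) from x^0 = (z + zbar)/2, x^1 = (z - zbar)/2.
HALF = Fraction(1, 2)
JAC = [[HALF, HALF], [HALF, -HALF]]  # rows: mu = 0, 1; columns: z, zbar

def transform_metric(eta, jac):
    """g_{ab} = sum over mu, nu of (dx^mu/dy^a) (dx^nu/dy^b) eta_{mu nu}."""
    return [
        [sum(jac[m][a] * jac[n][b] * eta[m][n] for m in range(2) for n in range(2))
         for b in range(2)]
        for a in range(2)
    ]

def inverse_2x2(g):
    """Inverse of a 2 x 2 matrix with non-zero determinant."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

g = transform_metric(ETA, JAC)      # components ordered as (z, zbar)
g_inv = inverse_2x2(g)

print("g    =", g)
print("g^-1 =", g_inv)
```

Using `Fraction` avoids rounding, so the off-diagonal entries can be compared directly with $-\frac{1}{2}$ and $-2$.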
An arbitrary contravariant tensor with components $t^\mu$ in the $(x^\mu)$ system has components $t^z$ and $t^{\bar{z}}$ given by the rule:

$t^z = t^0 + t^1$, $\quad t^{\bar{z}} = t^0 - t^1$ (1.4.9)(a)

and a covariant tensor $T_\mu$ (in view of (1.4.3)) is given as

$T_z = \frac{1}{2}(T_0 + T_1)$, $\quad T_{\bar{z}} = \frac{1}{2}(T_0 - T_1)$ (1.4.9)(b)

20. Light-cone coordinates (an accepted usage in literature) must not be confused with complex coordinates of previous sections.
21. zz in all expressions stands for z z.
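The easiest examples of these tensors given by Eqs. (1.4.9)(a) and (1.4.9)(b) are $dz$, $d\bar{z}$ and $\frac{\partial}{\partial z}$, $\frac{\partial}{\partial \bar{z}}$; when written out in full they stand for:

$dz = dx^0 + dx^1$, $\quad d\bar{z} = dx^0 - dx^1$ (1.4.10)(a)

$\frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x^0} + \frac{\partial}{\partial x^1}\right)$, $\quad \frac{\partial}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial}{\partial x^0} - \frac{\partial}{\partial x^1}\right)$ (1.4.10)(b)

From (1.4.4), the following fact regarding the action of the Lorentz group on these tensors is immediate.

Fact 1.4.1: Each component of a tensor in the light-cone coordinates forms an irreducible tensor of the Lorentz group:

(a) $\delta t^z = \omega t^z$, $\quad \delta t^{\bar{z}} = -\omega t^{\bar{z}}$,
(b) $\delta T_z = -\omega T_z$, $\quad \delta T_{\bar{z}} = \omega T_{\bar{z}}$ (1.4.11)

It is interesting to note that the scalar product in light-cone formalism is one of the following:

(a) $t \cdot T = t^\mu T_\mu = t^z T_z + t^{\bar{z}} T_{\bar{z}}$
(b) $= -2\,(t_{\bar{z}} T_z + t_z T_{\bar{z}})$
(c) $= -\frac{1}{2}\,(t^z T^{\bar{z}} + t^{\bar{z}} T^z)$. (1.4.12)

Equality (c) implies that the sum of any tensor with two indices $z\bar{z}$, such as $T^{z\bar{z}} + T^{\bar{z}z}$, can be expressed as a divergence, i.e.,

$T^{z\bar{z}} + T^{\bar{z}z} = -2\, T^\mu{}_\mu$ (1.4.13)

Also, using the metric, one can express any tensor in terms of only upper and lower $z$ ($\bar{z}$) indices. In conclusion we note that in light-cone formalism (1.4.1)(c) becomes:

$\partial_0 \epsilon_1 = -\partial_1 \epsilon_0$ and $\partial_0 \epsilon_0 + \partial_1 \epsilon_1 = 0$.

4.3 Euclidean Space Formalism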
We next show an analogue of the light-cone coordinates in Euclidean space. This is desirable for two reasons, namely (1) working in a space with positive definite metric allows one to have access to the mathematical theory of Riemann surfaces; (2) conformal field theories associated with statistical models are in Euclidean space (in string theory they are in Minkowski space). The change to Euclidean space from Minkowskian space is achieved via the Wick rotation:

$x^0 \to -i x^0$, $\quad x^1 \to x^1$ (1.4.14)

Denoting the Euclidean coordinates as $x^1, x^2$ in place of the Minkowskian $x^0, x^1$, the line element (1.4.5) becomes

$ds^2 = (dx^1)^2 + (dx^2)^2$ (1.4.15)

The coordinates $z$ and $\bar{z}$ in this format are chosen as

$z = x^1 + i x^2$, $\quad \bar{z} = x^1 - i x^2$.

The inverse coordinate transformation is evidently

$x^1 = \frac{1}{2}(z + \bar{z})$, $\quad x^2 = -\frac{i}{2}(z - \bar{z})$.

The coordinates $(z, \bar{z})$ are our familiar complex coordinates. The line element (1.4.15) now becomes:

$ds^2 = dz\, d\bar{z}$ (1.4.16)

and the metric tensor is given by:

(a) $g_{zz} = g_{\bar{z}\bar{z}} = 0$, $\quad g_{z\bar{z}} = g_{\bar{z}z} = \frac{1}{2}$,
(b) $g^{zz} = g^{\bar{z}\bar{z}} = 0$, $\quad g^{z\bar{z}} = g^{\bar{z}z} = 2$ (1.4.17)

The group of transformations in the case of the Euclidean space is $SO(2)$, with the following rule of Euclidean rotation:

$x^1 \to x^1 \cos\omega + x^2 \sin\omega$, $\quad x^2 \to x^2 \cos\omega - x^1 \sin\omega$ (1.4.18)

These transformations lead to:

$z \to e^{-i\omega} z$, $\quad \bar{z} \to e^{i\omega} \bar{z}$ (1.4.19)
Given a tensor with components $t^\mu$ and $T_\mu$, their counterpart in $z, \bar{z}$ coordinates is given by:

(a) $t^z = t^1 + i t^2$, $\quad t^{\bar{z}} = t^1 - i t^2$,
(b) $T_z = \frac{1}{2}(T_1 - i T_2)$, $\quad T_{\bar{z}} = \frac{1}{2}(T_1 + i T_2)$ (1.4.20)

From these equations it is evident that all the conditions that are satisfied in light-cone formalism can be shown to hold good in this complex coordinate formalism by appropriate introduction of $\pm i$. We shall need this fact while studying Majorana and Weyl spinors (Chapter 7). We also observe that Eq. (1.4.1)(c) in the Euclidean case gives:

$\partial_1 \epsilon_2 + \partial_2 \epsilon_1 = 0$ and $\partial_1 \epsilon_1 - \partial_2 \epsilon_2 = 0$ (1.4.20)(a)
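Equations (1.4.20)(a) are the Cauchy-Riemann equations for $\epsilon_1 + i\epsilon_2$, so any holomorphic function should supply a solution. The sketch below tests this numerically for one hypothetical choice, $f(z) = z^3 + (2 - i)z$, using central differences; the function names and sample points are assumptions made for the illustration.

```python
def epsilon(x1, x2):
    """Hypothetical vector field: real and imaginary parts of f(z) = z**3 + (2 - 1j) * z."""
    w = complex(x1, x2)
    f = w ** 3 + (2 - 1j) * w
    return f.real, f.imag

def partial(component, direction, x1, x2, h=1e-5):
    """Central-difference derivative of epsilon_component with respect to x^direction."""
    shift = (h, 0.0) if direction == 1 else (0.0, h)
    plus = epsilon(x1 + shift[0], x2 + shift[1])[component - 1]
    minus = epsilon(x1 - shift[0], x2 - shift[1])[component - 1]
    return (plus - minus) / (2 * h)

def killing_residuals(x1, x2):
    """Left hand sides of the two Euclidean conformal Killing equations."""
    r1 = partial(2, 1, x1, x2) + partial(1, 2, x1, x2)   # d_1 eps_2 + d_2 eps_1
    r2 = partial(1, 1, x1, x2) - partial(2, 2, x1, x2)   # d_1 eps_1 - d_2 eps_2
    return r1, r2

# Hypothetical sample points.
for point in [(0.0, 0.0), (0.4, -0.9), (1.5, -0.7)]:
    print(point, killing_residuals(*point))
```

Both residuals should vanish up to finite-difference error at every point, reflecting the fact that in two dimensions every holomorphic map generates an infinitesimal conformal transformation.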
4.4 Two-dimensional Conformal Group

We now return (although briefly) to the conformal transformation group in two dimensions. We first note that since the line element $ds^2 = -dz\, d\bar{z}$ is preserved up to a scale factor, this implies that we can find a smooth function $e^{\psi(z, \bar{z})}$ such that

$ds^2 = -e^{\psi(z, \bar{z})}\, dz\, d\bar{z}$ (1.4.21)

Accordingly the metric in $(z, \bar{z})$ coordinates is given by:

(a) $g_{zz} = g_{\bar{z}\bar{z}} = 0$, $\quad g_{z\bar{z}} = g_{\bar{z}z} = -\frac{1}{2}\, e^{\psi(z, \bar{z})}$,
(b) $g^{zz} = g^{\bar{z}\bar{z}} = 0$, $\quad g^{z\bar{z}} = g^{\bar{z}z} = -2\, e^{-\psi(z, \bar{z})}$ (1.4.22)
We shall soon use these to discuss the conformal tensor calculus, but first we establish the claim (made in the introduction of this section) that the conformal group for 2-dimensional spaces has infinitely many generators. However, the good part is that the Lie algebra formed by them is of great value in physical theories. Note that the transformations of the type:

(a) $z \to f(z)$, $\quad \bar{z} \to g(\bar{z})$
(b) $z \to h(\bar{z})$, $\quad \bar{z} \to k(z)$ (1.4.23)

with $f, g, h, k$ as smooth functions will preserve (1.4.21); in the case of (a), for instance, $e^{\psi(z, \bar{z})}$ will be $f'(z)\, g'(\bar{z})\, e^{\psi(z, \bar{z})}$. The second transformation, though, will change the orientation in view of (1.4.10)(b). To avoid complications of orientation change, we stick to the transformations Eq. (1.4.23)(a), and consider infinitesimal transformations$^{22}$:

$z \to z + \sum_n a_n z^{n+1}$, $\quad \bar{z} \to \bar{z} + \sum_n \bar{a}_n \bar{z}^{n+1}$, $\quad n \in \mathbb{Z}$ (1.4.24)

In view of (1.4.13)(a) and (1.4.20)(a) it follows that these transformations are generated by:

$L_n \equiv z^{n+1} \frac{\partial}{\partial z}$, $\quad \bar{L}_n \equiv \bar{z}^{n+1} \frac{\partial}{\partial \bar{z}}$ (1.4.25)
Obviously the Lie algebra formed by them satisfies:

(a) $[L_n, L_m] = -(n - m)\, L_{n+m}$,
(b) $[L_n, \bar{L}_m] = 0$,
(c) $[\bar{L}_n, \bar{L}_m] = -(n - m)\, \bar{L}_{n+m}$ (1.4.26)
We shall return to these generators (operators) in Chapter 5 and of course shall use them in Chapter 11 (see in particular Sec. 11.6).
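The commutation relation (1.4.26)(a) can be verified mechanically on Laurent polynomials. The sketch below is an illustration only: it assumes the generators act as $L_n = z^{n+1}\,\partial/\partial z$, represents a Laurent polynomial as a dictionary from exponents to coefficients, and uses hypothetical helper names.

```python
def apply_L(n, poly):
    """Apply L_n = z**(n+1) d/dz to a Laurent polynomial {exponent: coefficient}."""
    out = {}
    for k, coeff in poly.items():
        if k != 0:                       # the derivative of a constant vanishes
            out[k + n] = out.get(k + n, 0) + k * coeff
    return out

def commutator(n, m, poly):
    """[L_n, L_m] poly = L_n(L_m poly) - L_m(L_n poly), with zero terms dropped."""
    left = apply_L(n, apply_L(m, poly))
    right = apply_L(m, apply_L(n, poly))
    result = {k: left.get(k, 0) - right.get(k, 0) for k in set(left) | set(right)}
    return {k: v for k, v in result.items() if v != 0}

def scaled(factor, poly):
    """Multiply a Laurent polynomial by a number, dropping zero terms."""
    return {k: factor * v for k, v in poly.items() if factor * v != 0}

# Hypothetical test function 3 z^-2 + 5 z - z^4.
test_poly = {-2: 3, 1: 5, 4: -1}

checks = [
    commutator(n, m, test_poly) == scaled(-(n - m), apply_L(n + m, test_poly))
    for n in range(-3, 4)
    for m in range(-3, 4)
]
print(all(checks))
```

The same check applies verbatim to the barred generators acting on functions of $\bar{z}$, while $[L_n, \bar{L}_m] = 0$ holds because the two families differentiate independent variables.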
4.5
Mobius Transformation
In the previous section we have already come across Mobius transformations. We give below an important result on these transformations—which will also serve as an example to the theory discussed above. • In writing an in the second correspondence, we have followed the usual practice in literature. This does not mean that a „ is the complex conjugate of an. The common feature they share is that they are both infinitesimals independent of z and z respectively (an of z and an of z).
Complex Functions, Riemann Surfaces and Two-Dimensional Conformal Field Theory (an Introduction) 65
Result 1.4.2: The most general transformation that maps the Riemann sphere onto itself is the Möbius transformation:

$z \to z' = \dfrac{az + b}{cz + d}$, $\bar z \to \bar z' = \dfrac{\bar a \bar z + \bar b}{\bar c \bar z + \bar d}$    (1.4.27)

where a, b, c, d are complex parameters satisfying the equality $ad - bc = 1$. The mapping is one-one. The conformal group, also known as the Möbius group, is a six real-parameter group (see Remark (2.2.12)). Infinitesimal Möbius transformations can be written as$^{23}$:

$z' = z + a_{-1} + a_0 z + a_1 z^2$, $\bar z' = \bar z + \bar a_{-1} + \bar a_0 \bar z + \bar a_1 \bar z^2$    (1.4.28)

and in view of (1.4.25), they are generated by $L_0$, $L_{\pm 1}$, and $\bar L_0$, $\bar L_{\pm 1}$.
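As an illustration of the group property behind Result 1.4.2, the sketch below checks numerically that composing two Möbius maps corresponds to multiplying their coefficient matrices, and that the condition $ad - bc = 1$ survives composition. The tuple layout `(a, b, c, d)` and the helper `normalized` are assumptions made for this example only.

```python
def mobius(m, z):
    """Evaluate z -> (a z + b) / (c z + d) for m = (a, b, c, d)."""
    a, b, c, d = m
    return (a * z + b) / (c * z + d)

def compose(m1, m2):
    """Coefficients of z -> m1(m2(z)): the 2x2 matrix product m1 * m2."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)

def det(m):
    a, b, c, d = m
    return a * d - b * c

def normalized(a, b, c):
    """Choose d so that ad - bc = 1 (assumes a != 0)."""
    return (a, b, c, (1 + b * c) / a)

m1 = normalized(2 + 1j, 1j, 1)
m2 = normalized(1, 2, 0.5j)
z = 0.3 - 0.7j

lhs = mobius(compose(m1, m2), z)   # the composite map applied once
rhs = mobius(m1, mobius(m2, z))    # the two maps applied in turn
print(abs(lhs - rhs) < 1e-12, abs(det(compose(m1, m2)) - 1) < 1e-12)
```

Three complex parameters remain free after imposing $ad - bc = 1$, which is the count of six real parameters quoted above.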
4.6
Conformal Tensor Calculus
Let the coordinate changes be written as:

$z \to w(z)$, $\bar z \to \bar w(\bar z)$

then a general tensor $T^{ij}_{kl} \equiv T(z, \bar z)$ will correspond to a tensor $T'(w, \bar w)$ in $(w, \bar w)$ coordinates, with transformation rule:

$T(z, \bar z) \to T'(w, \bar w) = \left(\dfrac{dw}{dz}\right)^{i-k} \left(\dfrac{d\bar w}{d\bar z}\right)^{j-l} T(z, \bar z)$    (1.4.29)

The numerical quantities $(i - k) = h$, $(j - l) = \bar h$ are called the conformal weights or dimensions of the tensor T. Similar to other metric theories, we can raise and lower the covariant and contravariant indices by using the appropriate metric tensor, thus for instance:

$T_z = g_{z\bar z}\,T^{\bar z} = \tfrac{1}{2}\,e^{\psi(z,\bar z)}\,T^{\bar z}$ and $T_{\bar z} = \tfrac{1}{2}\,e^{\psi(z,\bar z)}\,T^{z}$.    (1.4.30)
We now state some important results that relate our discussions to familiar pictures in physical theories.

Result 1.4.3: A time translation $x^0 \to x^0 + c$, where c is real, is induced by $z \to z + c$, $\bar z \to \bar z + c$ and so is generated by the sum of generators $(L_{-1} + \bar L_{-1})$ in view of (1.4.25). The generator $(L_{-1} + \bar L_{-1})$ is the Hamiltonian.

Result 1.4.4: A space shift $x^1 \to x^1 + \lambda$ is induced by $z \to z + \lambda$, $\bar z \to \bar z - \lambda$ and is generated by $L_{-1} - \bar L_{-1}$, which is the total momentum.

23. Note that while coefficients $a_n$, $\bar a_n$ in (1.4.24) can stand for light-cone coordinates or for their analogue in Euclidean space, in this case they are in second formalism (see the paragraph below (1.4.13)).
66 Mathematical Perspectives on Theoretical Physics
Result 1.4.5: The rotations $\delta x^0 = -\phi x^1$, $\delta x^1 = +\phi x^0$ are the consequence of the transformations $z \to e^{i\phi} z$, $\bar z \to e^{-i\phi} \bar z$. The resulting generator $L_0 - \bar L_0$ is the angular momentum generator. This is nothing but the generator of the 2-dimensional Poincaré group (see Exercise 4.1.7). Finally, the dilation $z \to \lambda z$, $\bar z \to \lambda \bar z$ for $\lambda$ real is generated by $L_0 + \bar L_0$. It can be checked that under a dilation a tensor $T \to \lambda^{(h + \bar h)}\,T$, while under a rotation $T \to e^{i(h - \bar h)\phi}\,T$. The sum $(h + \bar h)$ and the difference $(h - \bar h)$ are respectively called the dilation weight of T and the spin of T. Finally, we list a few results dealing with conformally invariant two-dimensional theories. These will be pursued in detail in later chapters (Chapters 6 and 11).
4.7
Conserved Currents
Result 1.4.6: If the theory is Poincaré-invariant, then there exists an energy momentum tensor $T_{\alpha\beta}$ (which can always be chosen to be symmetric), which is a conserved current, i.e.,$^{24}$

$\partial^{\alpha} T_{\alpha\beta} = 0$    (1.4.31)

and whose charge generates translations (see Sec. 6.3). The current corresponding to Lorentz rotations is a moment of the energy-momentum tensor:

$x_\alpha T_{\beta\delta} - x_\beta T_{\alpha\delta}$    (1.4.32)

which is conserved on its $\delta$ index due to the symmetry of $T_{\alpha\beta}$ and due to its conservation.

Result 1.4.7: If the theory is dilation-invariant, i.e., it is invariant under $x^\alpha \to \lambda x^\alpha$, and the associated current $j_\beta$ is given by a moment of the energy momentum tensor:

$j_\beta = x^\alpha T_{\alpha\beta}$    (1.4.33)

then $j_\beta$ is conserved provided $T^{\alpha}{}_{\alpha} = 0$. The above relation leads to constructions of further conserved currents; for instance, define an arbitrary current

$f^\alpha(x)\,T_{\alpha\beta}$    (1.4.34)

and demand that

$\partial^\alpha f^\beta + \partial^\beta f^\alpha - \phi\,\eta^{\alpha\beta} = 0$    (1.4.35)
where $\phi$ is an arbitrary function of $x$ and $\eta^{\alpha\beta}$ is the metric tensor, then it can be checked that $f^\alpha(x)\,T_{\alpha\beta}$ is conserved. A simple example of (1.4.34) is given by

$2x^\alpha x^\delta T_{\delta\beta} - x^2\,T^{\alpha}{}_{\beta}$    (1.4.36)
This generates the special translations of the conformal group. Moreover, these additional conserved currents define corresponding generators and together with Poincaré and dilation generators, they have the conformal group as their algebra.

24. Equations (1.4.31), etc., are in arbitrary dimension.

Returning to $(z, \bar z)$-coordinates in a two-dimensional Euclidean invariant theory, the above statements lead to the following realizations: If the energy momentum tensor is symmetric and traceless, it has only two components, say $T_{00}$ and $T_{01}$. The traceless condition in coordinates $(z, \bar z)$ implies:

$T_{z\bar z} = 0$    (1.4.37)

thus there are only two components, $T_{zz}$ and $T_{\bar z\bar z}$. The conservation condition leads to

$\partial_{\bar z} T_{zz} = 0$, $\partial_z T_{\bar z\bar z} = 0$    (1.4.38)
showing that $T_{zz}$ and $T_{\bar z\bar z}$ are functions of only $z$ and $\bar z$, respectively. With the help of the above discussions, we can now define an infinite set of conserved currents; for instance, consider any two functions $f(z)$ and $g(\bar z)$ and form a new pair:

$f(z)\,T_{zz}$, $g(\bar z)\,T_{\bar z\bar z}$    (1.4.39)

which evidently satisfies $\partial_{\bar z}\big(f(z)\,T_{zz}\big) = 0$ and $\partial_z\big(g(\bar z)\,T_{\bar z\bar z}\big) = 0$. The corresponding generators are:
$L_n = \oint \dfrac{dz}{2\pi i}\, z^{n+1}\, T_{zz}$, $\bar L_n = \oint \dfrac{d\bar z}{2\pi i}\, \bar z^{n+1}\, T_{\bar z\bar z}$    (1.4.40)
This shows that a theory for which $T_{z\bar z} = 0$ carries an infinite-dimensional conformal group. As mentioned above, we shall return to these generators in later chapters.
Exercise 1.4
1. Find the light-cone components of tensors of type (2, 0) and (0, 2), i.e., components of contravariant and covariant tensors of degree 2, and then write the generalized version for a mixed tensor of type (r, s).
2. Obtain the Lorentz group transformation for tensors of type (2, 0) and (0, 2) in light-cone formalism and then write its generalized version.
3. Show that for the line element (1.4.21) the Christoffel symbols $\Gamma^{\alpha}_{\beta\gamma}$ in $z, \bar z$ coordinates satisfy:
$\Gamma^{z}_{zz} = \partial_z \psi$, $\Gamma^{\bar z}_{\bar z\bar z} = \partial_{\bar z} \psi$
and the remaining ones are zero.
4. Establish the results (1.4.6) and (1.4.7).
5. Show that for the line element (1.4.21), the current defined by (1.4.36) is a conserved current that generates special translations of the conformal group.
6. Establish the generators of 2-dimensional conformal field theory as given in (1.4.40).
7. Show that for a free spin-zero 2-dimensional field theory, the energy momentum tensor components are:
$T_{z\bar z} = 0$, $T_{zz} = \partial_z \psi\,\partial_z \psi$, $T_{\bar z\bar z} = \partial_{\bar z} \psi\,\partial_{\bar z} \psi$.
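Hints to Exercise 1.4
1. The components of a contravariant tensor of degree 2 are obtained by writing the elements of the tensor product of contravariant vector spaces. Thus denoting the tensor by s, we have

(a) $s^{z\bar z} = t^z \otimes t^{\bar z} = (t^0 + t^1) \otimes (t^0 - t^1) = t^{00} - t^{01} + t^{10} - t^{11}$.

Similarly $s^{zz} = t^{00} + t^{01} + t^{10} + t^{11}$, and $s^{\bar z z}$ and $s^{\bar z\bar z}$ are respectively:

$t^{00} - t^{10} + t^{01} - t^{11}$ and $t^{00} - t^{01} - t^{10} + t^{11}$.

In the case of a covariant tensor, we have:

(b) $S_{z\bar z} = \tfrac{1}{2}(T_0 + T_1) \otimes \tfrac{1}{2}(T_0 - T_1) = \tfrac{1}{4}(T_{00} - T_{01} + T_{10} - T_{11})$.

And the other three are:

(c) $S_{zz} = \tfrac{1}{4}(T_{00} + T_{01} + T_{10} + T_{11})$,
$S_{\bar z z} = \tfrac{1}{4}(T_{00} - T_{10} + T_{01} - T_{11})$,
$S_{\bar z\bar z} = \tfrac{1}{4}(T_{00} - T_{01} - T_{10} + T_{11})$.

We denote the general tensor of type (r, s) by T and note that its components could look like:

(d) $T^{\overbrace{z\cdots z}^{i}\;\overbrace{\bar z\cdots\bar z}^{j}}_{\underbrace{z\cdots z}_{k}\;\underbrace{\bar z\cdots\bar z}_{l}} = T^{ij}_{kl}$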
where (i, j) and (k, l) are some partitions of r and s respectively, and $z, \bar z$ have been suppressed in $T^{ij}_{kl}$. (This is also denoted $T(z, \bar z)$ in the literature.)
2. Note that $\delta(t^z \otimes t^z) = \delta t^z \otimes t^z + t^z \otimes \delta t^z$. Similarly, $\delta(t^{\bar z} \otimes t^{\bar z}) = \delta t^{\bar z} \otimes t^{\bar z} + t^{\bar z} \otimes \delta t^{\bar z}$. Writing $t^z \otimes t^z = T^{zz}$ and $t^{\bar z} \otimes t^{\bar z} = T^{\bar z\bar z}$ and using (1.4.11)(a), we obtain:

(a) $\delta T^{zz} = 2\omega\,T^{zz}$ and $\delta T^{\bar z\bar z} = -2\omega\,T^{\bar z\bar z}$.

In the case of the combinations $t^z \otimes t^{\bar z}$ and $t^{\bar z} \otimes t^z$, it can be easily checked that $\delta(t^z \otimes t^{\bar z})$ and $\delta(t^{\bar z} \otimes t^z)$ are zero. Using the same procedure to write $\delta(T_z \otimes T_z)$ and $\delta(T_{\bar z} \otimes T_{\bar z})$ with the help of (1.4.11)(b), we have:

(b) $\delta S_{zz} = -2\omega\,S_{zz}$ and $\delta S_{\bar z\bar z} = 2\omega\,S_{\bar z\bar z}$

and $\delta S_{z\bar z} = \delta S_{\bar z z} = 0$. In view of these computations it follows that for a general tensor, the Lorentz transformation rule would be:

(c) $\delta T^{ij}_{kl} = \omega\,[(i - k) - (j - l)]\,T^{ij}_{kl}$.
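3. Recall that $\Gamma^{\alpha}_{\beta\gamma} = \tfrac{1}{2}\,g^{\alpha\delta}\,(\partial_\beta g_{\gamma\delta} + \partial_\gamma g_{\beta\delta} - \partial_\delta g_{\beta\gamma})$.

Write $\alpha, \beta, \gamma$ as $z$ and $\delta$ as $\bar z$, as well as $z$ for summation; then since $g_{zz} = 0 = g_{\bar z\bar z}$, it becomes

$\Gamma^{z}_{zz} = \tfrac{1}{2}\,g^{z\bar z}\,(\partial_z g_{z\bar z} + \partial_z g_{z\bar z}) = e^{-\psi}\,(\partial_z e^{\psi}) = \partial_z \psi$.

Similarly for $\Gamma^{\bar z}_{\bar z\bar z}$, we have $\partial_{\bar z} \psi$. The mixed component

$\Gamma^{z}_{z\bar z} = \tfrac{1}{2}\,g^{z\bar z}\,(\partial_z g_{\bar z\bar z} + \partial_{\bar z} g_{z\bar z} - \partial_{\bar z} g_{z\bar z})$

is evidently zero. (A good source for the remaining four exercises is Ref. 7.[21].)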
References
1. L. V. Ahlfors, Complex Analysis (New York: McGraw-Hill, 1979).
2. L. V. Ahlfors and L. Sario, Riemann Surfaces (Princeton University Press, 1960).
3. A. F. Beardon, A Primer on Riemann Surfaces (New York: Cambridge University Press, 1984).
4. A. Belavin, A. M. Polyakov and A. B. Zamolodchikov, Infinite Conformal Symmetry in Two-dimensional Quantum Field Theory, Nucl. Phys. B241, 333-380 (1984).
5. R. Bott and L. W. Tu, Differential Forms in Algebraic Topology (Springer-Verlag, 1982).
6. P. Buser, Geometry and Spectra of Compact Riemann Surfaces (Birkhauser, 1992).
7. R. V. Churchill and J. Brown, Complex Variables and Applications (McGraw-Hill, 1990).
8. H. M. Farkas and I. Kra, Riemann Surfaces (New York: Springer-Verlag, 1991).
9. O. Forster, Lectures on Riemann Surfaces (Springer-Verlag, 1981).
10. S. Kobayashi and K. Nomizu, Foundations of Differential Geometry (Interscience Publishers, 1969).
11. H. Weyl, The Concept of a Riemann Surface (3rd ed., Reading, MA: Addison-Wesley, 1964).
CHAPTER 2
ELEMENTS OF GROUP THEORY AND GROUP REPRESENTATIONS
The theory of groups as a structural theory has been developed and studied in great detail by mathematicians (e.g., the theory of topological groups and Lie groups). From the point of view of physicists, however, the abstract formulations are not of interest nor worth the effort, unless they relate to some physical system or to some observed phenomena. Thus, for instance, the groups whose actions leave some postulated laws of physics or observed phenomena unaltered are of importance to physicists. These groups are known as symmetry groups of given physical systems. We shall return to these in later chapters. We give below the basics of group theory compromising the mathematical rigor, with a view to emphasize the applicability.
1
INTRODUCTION
1.1 Definition of a Group, Examples, and Conjugate Classes
Definition 2.1.1: A group 'G' is a set of elements (a, b, c, ...) together with a binary operation $G \times G \to G$ (i.e., $(a, b) \to a \cdot b = ab$ (closure)) that satisfies: (i) $a \cdot (b \cdot c) = (a \cdot b) \cdot c = abc$ (associativity); (ii) there exists a unique element 'e' in G such that $a \cdot e = e \cdot a = a$; (iii) corresponding to every element a in G, there is an element b such that $a \cdot b = b \cdot a = e$; 'b' is called the inverse of a and is written $a^{-1}$. The group 'G' is finite if the number of elements is finite, otherwise it is infinite. If the elements of G are functions a(x) of a continuous parameter x, then G obviously has an infinite number of elements and as such is an infinite group. If for every pair (a, b), ab = ba, the group G is called an abelian group. The total number of elements in a finite group G is called its order and is denoted |G| or g. In this case every element 'a' has a smallest positive integer, say n, associated with it such that $a^n = e$; n is called its order and is denoted |a|. Any subset $H \subset G$ that is itself a group under the same operation is called a subgroup of G. Evidently the set $\{a, a^2, \ldots, a^n\} = A_n \subset G$ is a subgroup of G. The group $A_n$ is called a cyclic group. To define a cyclic group we need just one element and a binary operation. This one element is called the generator of the group. Clearly every finite group G has a finite number of generators a, b, c, ..., meaning thereby that every element of G is expressible as a finite product of powers (including the negative powers) of a, b, c, .... The order |G| = g is, however, not in general determined by the orders of the generators alone: the group $S_3$ of Example 2.1.2(e), for instance, is generated by two elements of orders 2 and 3 but has order 6.
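The notions of element order and cyclic subgroup in Definition 2.1.1 can be made concrete with a short sketch. It assumes the additive group of integers modulo 12 as the example; the function names `cyclic_subgroup` and `element_order` are illustrative.

```python
def cyclic_subgroup(a, op, identity):
    """Return [a, a^2, ..., a^n], where a^n is the identity element."""
    powers = [a]
    while powers[-1] != identity:
        powers.append(op(powers[-1], a))
    return powers

def element_order(a, op, identity):
    """Smallest positive n with a^n = e."""
    return len(cyclic_subgroup(a, op, identity))

n = 12

def add_mod_n(x, y):
    """Group operation of Z_12: addition modulo 12."""
    return (x + y) % n

print(cyclic_subgroup(8, add_mod_n, 0))
print([element_order(a, add_mod_n, 0) for a in range(n)])
```

In this example every element order divides the group order 12, and the element 5 generates the whole group.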
Elements of Group Theory and Group Representations 71
Example 2.1.2:
(a) The integers, with a and b identified whenever $a - b = mn$ for an arbitrary integer m and a given integer n, form a group under addition modulo n. This is denoted $Z_n$.
(b) The set of real numbers (integers) denoted R (Z) forms a group with addition as the group operation, whereas the set of rationals Q (excluding zero) forms a group with multiplication as the group operation.$^{1}$
(c) The k complex numbers $\exp(2\pi i m/k)$ (m = 0, 1, ..., k - 1) form a cyclic group under multiplication.
(d) All 2 x 2 matrices

$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$, $0 \le \theta < 2\pi$

form a group with respect to matrix multiplication. This is called the rotation group $R_2$ in two dimensions. Similarly, the set of 3 x 3 matrices

$\begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi\cos\theta & \cos\theta\cos\phi & \sin\theta \\ \sin\phi\sin\theta & -\sin\theta\cos\phi & \cos\theta \end{pmatrix}$

with matrix multiplication as the group operation is the rotation group $R_3$.
(e) The set of 3! permutations of {1, 2, 3} is the group $S_3$, where any two permutations $P_i$, $P_j$ are combined into one permutation $P_k$ by carrying out the permutation $P_j$ followed by $P_i$.
All of the above groups except the last two are abelian. While using groups as modes of application we treat the elements a, b, ... as operators A, B, ... and call the groups (in classical terminology) groups of transformation. Of particular interest are the groups which leave the geometry of an object or the equations defining a physical system invariant. These are called symmetry groups of the system. For example the groups in (d) and (e) above are symmetry groups.

Definition 2.1.3: An element 'b' in G is said to be conjugate to a given element 'a' in G, if for some element 'v' of G, we have $b = v a v^{-1}$. We denote this relation as $b \overset{C}{=} a$. It is evident that $a \overset{C}{=} a$, that $b \overset{C}{=} a$ implies $a \overset{C}{=} b$, and also that $a \overset{C}{=} b$, $b \overset{C}{=} c$ leads to $a \overset{C}{=} c$. The set of all elements conjugate to each other forms the so-called conjugacy class or simply the class. In view of the above definition, G can be decomposed into a disjoint union of these classes. If $|C_i|$ denotes the number of elements in the i-th class and N the total number of these classes, then $|G| = \sum_{i=1}^{N} |C_i|$. The integer N is an important characteristic of the group. It may be noted that 'e' in any G forms a class by itself, and if G is an abelian group each element forms a class by itself.
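The decomposition into conjugacy classes in Definition 2.1.3 can be computed directly for a small group. The sketch below uses the permutation group $S_3$, with permutations stored as tuples acting on {0, 1, 2} (a relabelling of {1, 2, 3} chosen for convenience); the helper names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Permutation obtained by carrying out q first and then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Split the group into classes {v a v^-1 : v in G}."""
    remaining = set(group)
    classes = []
    while remaining:
        a = min(remaining)  # deterministic choice of representative
        cls = {compose(compose(v, a), inverse(v)) for v in group}
        classes.append(cls)
        remaining -= cls
    return classes

S3 = list(permutations(range(3)))
classes = conjugacy_classes(S3)
print(sorted(len(c) for c in classes), sum(len(c) for c in classes))
```

The class sizes add up to the order of the group, and the identity sits in a class by itself, as stated above.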
1. The groups formed by R, Z and Q are denoted as: (R, +); (Z, +); and (Q, •).
1.2 Invariant Subgroups, Factor Groups, Simple and Semi-simple Groups Definition 2.1.4: Every subgroup Gs
1.3
Products of Groups and Homomorphism
Definition 2.1.7: Let G and G' be two arbitrary groups with independent binary operations. The group $G \times G'$ formed by using the combination rule $(a, a')(b, b') = (ab, a'b')$ is called their direct product. The order of the group is $|G||G'| = gg'$. Evidently $|G \times G'| = |G' \times G|$. In particular if $G_1$ and $G_2$ are two commuting subgroups of a larger group G, then the direct product group$^{2}$ $G_1 \times G_2 \subset G$, and the groups $G_1$, $G_2$ are invariant subgroups of $G_1 \times G_2$. If $G_1 \times G_2 = G$, then evidently $G_1$ and $G_2$ are invariant subgroups of G. If, however, $G_1$ is invariant and $G_2 \cap G_1 = \{e\}$, and also every member $a \in G$ can be expressed as $a_1 a_2$ (a product of elements of $G_1$, $G_2$), then the product of $G_1$ and $G_2$ equals G; the group G is called the semi-direct product of $G_1$, $G_2$ and is written $G_1 \rtimes_s G_2$ or simply as $G_1 \rtimes G_2$. Definition 2.1.8: A mapping $\phi : G \to G'$ which preserves the group structure (i.e., $\phi(a \cdot b) =$

2. $G_1 \times G_2 \subset G$ in view of the binary operation $G \times G \to G$.
it is called an inner automorphism of G. An automorphism which is not equivalent to the transformation given by a single element is called an outer automorphism. For instance, in a cyclic group of order m, the correspondence $a \to a^k$, if k is prime to m, defines an outer automorphism.
2
LIE GROUPS AND TOPOLOGICAL GROUPS
The importance of Lie groups in mathematics and physics can hardly be overemphasized. For instance they play an important role via diffeomorphism groups in the study of various types of partial differential equations. The subject of Lie groups since the first contributions of Sophus Lie [22] has developed tremendously. One can now talk of infinite-dimensional Lie groups modelled on Banach manifolds or Frechet manifolds. The work in this direction is due to Bott (1956) [4], Eells (1958) [11], Abraham (1961) [1], Smale and Palais (1968) [28], Omori (1970) [27], and Ebin and Marsden (1970) [10]. In the late sixties through the pioneering work of Arnold (1966) [2] (followed by that of Marsden, Ebin and Fischer (1972) [25]), the applications of Lie theory to mechanics, in particular to hydrodynamics and plasma physics, shifted the emphasis from mathematics to physics. During the past decade and a half, the theory of Yang-Mills [Moncrief (1977), (1980), Singer (1977), (1980), Atiyah, Hitchin and Singer (1978),] (see Chapter 10 for these references), Kac-Moody algebras [Chap. 5] and the string and superstring theories [Chap. 11] have given another boost to interacting research areas of Lie groups. It is therefore just in order that we acquaint our readers with the primary tools of the topic, leaving the task of sophisticated technical details to specialized texts on the subject (see [16] and also Chapter 8 of 0.[8] for an introductory account). To begin with, we give the definition of topological group. These groups are of a more general nature.
2.1 Topological Groups Definition 2.2.1: A set G is a topological group if the set is (i) a group, (ii) a Hausdorff topological space, and (iii) mappings 0 : G x G -> G :
For example, the n-fold product of $S^1$ with itself gives the familiar torus $T^n = \{(e^{2\pi i x_1}, \ldots, e^{2\pi i x_n}) \mid (x_1, \ldots, x_n) \in \mathbf{R}^n\}$. Definition 2.2.3: Given a topological group G and an element $a \in G$, the mappings $L_a$ and $R_a$ defined as (i) $L_a : G \to G : x \to ax$, and (ii) $R_a : G \to G : x \to xa$ are called left and right translations of G with respect to a. These mappings are evidently homeomorphisms of G; they help us determine the local properties of any element a with the help of those of e, since a neighbourhood U of e defines a unique neighbourhood $L_a U = V$ of a (see subsection (5.3)). Definition 2.2.4: A subset H of a topological group G is a topological subgroup if: (i) $HH^{-1}$

3. Note that Definitions (2.2.4) and (2.2.5) are the same for Lie groups, once the topological space and continuous mappings here are replaced by analytic ($C^\infty$) manifolds and analytic ($C^\infty$) mappings.
Definition 2.2.10: A Lie group G is a set which is (a) an analytic (or a $C^\infty$) manifold, (b) a group, and (c) where the two group laws, multiplication $m : G \times G \to G : (x, y) \to xy$ and inversion $i : G \to G : x \to x^{-1}$, are analytic (or $C^\infty$) maps with regard to the manifold structure in (a).$^{4}$

Example 2.2.11:
(a) All topological groups given in Exp. (2.2.2) are also Lie groups.
(b) Let R (C) denote the real (complex) numbers and H the quaternions; then the groups GL(n, R), GL(n, C) and GL(n, H) of n x n invertible matrices with real, complex and quaternionic entries are Lie groups.
(c) The quotient group R/Z (Remark 2.2.6), which is isomorphic to the circle group $S^1 (= T^1)$, is a Lie group.
(d) The cartesian product of Lie groups is a Lie group. Accordingly, the n-dimensional torus $T^n = S^1 \times S^1 \times \cdots \times S^1$ formed by n copies of $S^1$ is a Lie group.
(e) The group of (n + 1) x (n + 1) matrices A with complex coefficients satisfying $A \cdot A^+ = 1$ and det A = 1 is a Lie group ($A^+$ = Hermitian conjugate/conjugate transpose of A). It is denoted SU(n + 1) and is called the special unitary group (see also (g)). In particular when n = 1, the collection of matrices

$\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$    (2.2.1)

with the restriction $|\alpha|^2 + |\beta|^2 = 1$ gives SU(2). The group SU(2) is homeomorphic to the 3-sphere $S^3$ of unit vectors in $\mathbf{C}^2$.
(f) Let $g_r$ be the canonical pseudo-metric of signature (n - r) on $\mathbf{R}^n$ whose infinitesimal length is

$ds^2 = dx_1^2 + dx_2^2 + \cdots + dx_r^2 - dx_{r+1}^2 - \cdots - dx_n^2$    (2.2.2)

and whose matrix representation, also denoted $g_r$, is:

$g_r = \begin{pmatrix} 1_r & 0 \\ 0 & -1_{n-r} \end{pmatrix}$    (2.2.3)

Let O(r, n - r) be the group of linear transformations A of $\mathbf{R}^n$ such that $g_r(Ax, Ay) = g_r(x, y)$, $\forall\, x, y \in \mathbf{R}^n$. This group can be identified with the group of n x n matrices A such that

$A^t g_r A = g_r$    (2.2.4)

From the above equation it follows that det A = ± 1, $\forall\, A \in O(r, n - r)$. The subgroup of O(r, n - r) of all those A's for which det A = 1 is denoted SO(r, n - r). The group O(r, n - r), as well as SO(r, n - r), are Lie groups. In particular the groups $O(n, 0) \equiv O(n)$, $SO(n, 0) \equiv SO(n)$ are Lie groups; these are respectively called the orthogonal group and the special orthogonal group*. Furthermore, when n = 4 and r = 1, the group O(1, 3) is called the Lorentz group and SO(1, 3) is known as the proper Lorentz group.

4. The underlying manifold of a Lie group can be $C^\infty$ or analytic. See Sec. 0.3 for definition and distinction between the two structures.
* The group which leaves the quadratic form $(x^2 + y^2 + z^2 - c^2t^2)$ invariant is called the (homogeneous) Lorentz group. (Notice the sign convention here.) If we add to it the group of translations, we obtain the inhomogeneous Lorentz group, commonly known as the Poincaré group.

(g) Let $g_r$ denote the canonical non-degenerate hermitian sesquilinear form on $\mathbf{C}^n$ of signature (n - r):

$g_r(x, y) = x^1\bar y^1 + \cdots + x^r\bar y^r - x^{r+1}\bar y^{r+1} - \cdots - x^n\bar y^n$    (2.2.5)
$\forall\, x, y \in \mathbf{C}^n$. The matrix representation of $g_r$ is the same as in Exp. (f) above. In this case, the group of linear transformations A of $\mathbf{C}^n$ satisfying $g_r(Ax, Ay) = g_r(x, y)$, $\forall\, x, y \in \mathbf{C}^n$, denoted U(r, n - r), can be identified with the group of (n x n) complex matrices A such that

$A^+ g_r A = g_r$    (2.2.6)

where $A^+$ is the conjugate transpose of A. The above equation implies that $|\det A| = 1$. The group U(r, n - r) and its subgroup SU(r, n - r) formed by matrices A with det A = 1 are Lie groups. Evidently $U(n, 0) \equiv U(n)$ and $SU(n, 0) \equiv SU(n)$, known as the unitary group and special unitary group of dimension n, are Lie groups.

(h) The matrix $\omega = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, where 1 denotes the n-rowed unit matrix, defines the canonical symplectic structure on $\mathbf{R}^{2n}$. The group of linear transformations A of $\mathbf{R}^{2n}$ satisfying $\omega(Ax, Ay) = \omega(x, y)$ $\forall\, x, y \in \mathbf{R}^{2n}$, denoted Sp(n, R), can be identified with the group of 2n x 2n matrices A such that

$A^t \omega A = \omega$.    (2.2.7)
The above equation implies that det A = ± 1 $\forall\, A \in Sp(n, \mathbf{R})$. This group, known as the real symplectic group in 2n dimensions, is a Lie group. (See Exercise 12 for (f), (g) and (h).)

Remark 2.2.12: Since the n-dimensional spheres for n = 1 and n = 3 turned out to be Lie groups, a natural question would be, "Do all n-dimensional spheres define Lie groups?" Surely ... 'no.' All even dimensional spheres $S^{2n}$ fail to be Lie groups, since the multiplication map 'm' fails to be analytic. (See Bott [4].) If the underlying manifold of a Lie group G is compact, G is called a compact Lie group. The group $(\mathbf{R}/\mathbf{Z}, +) = S^1$ is thus a compact Lie group and so is $T^n$. Compact Lie groups are important for more than one reason. For instance, they can be given a bi-invariant Riemannian metric (i.e., a metric that is left as well as right invariant). Classically,$^{5}$ a Lie group can be thought of as a special type of continuous group (see Def. 2.2.8) whose elements a, b, c, etc., are labelled by r real parameters $(a_1, a_2, \ldots, a_r)$ which can vary over a finite or infinite range; their domain space is called the group-parameter space. The elements $c = a \cdot b$ and $d = a^{-1}$ obtained from multiplication and inverse operations are analytic functions over the group-parameter space. A Lie group is said to be compact if the domain of variation of its parameters is closed and bounded*. If the number r is the smallest number that characterises the group, the parameters are called essential and the group is known as an r-parameter group.
As an example, the real transformation group GL(2, R) formed by 2 x 2 matrices $a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ has an arbitrary element $a = (a_{11}, a_{12}, a_{21}, a_{22})$. The parameters here are four in number, and since each parameter ranges over R, the group is not compact. The group SO(2, R) formed by orthogonal matrices of determinant 1 is a compact group since its parameters are bounded, and it is a 1-parameter group. The Lorentz group given in (f) is a six-parameter group in four dimensions, whereas the Poincaré group is a 10-parameter group. Clearly for an arbitrary n, the Lorentz group has n(n - 1)/2 essential parameters.

5. Since Lie groups arose via continuous transformation groups, this definition is closer to a physicist's way of looking at them.
* The domain (set) is said to be closed if the limit of every convergent sequence of points in the set is also in the set. A set of numbers (a) is said to be bounded if every a satisfies $|a| < M$, where M is a given positive number.
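The defining condition (2.2.4) and the parameter count n(n - 1)/2 can be illustrated numerically. The sketch below assumes the metric diag(1, -1, -1, -1) for O(1, 3) and uses one boost and one rotation as sample elements; all function names are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def preserves_metric(A, g, tol=1e-12):
    """Check the condition A^t g A = g entry by entry."""
    lhs = matmul(transpose(A), matmul(g, A))
    return all(abs(lhs[i][j] - g[i][j]) < tol
               for i in range(len(g)) for j in range(len(g)))

g13 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def boost_x(rapidity):
    """Boost mixing the time axis with the first space axis."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_z(angle):
    """Rotation mixing the first two space axes."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def essential_parameters(n):
    """Number of essential parameters of the Lorentz group in n dimensions."""
    return n * (n - 1) // 2

element = matmul(boost_x(0.7), rotation_z(1.1))
print(preserves_metric(element, g13), essential_parameters(4))
```

The rapidity of a boost is unbounded, which is the reason the Lorentz group is not compact, while the rotation angle ranges over a closed bounded interval.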
2.2 Algebraic Groups
Our list of definitions will remain incomplete without a word about algebraic groups. These groups are often thought of as mere applications of algebraic geometry, although there is evidence to the contrary. In fact they have their origin as early as 1883, when E. Picard used them in the Galois theory of linear differential equations (see [7]). The proponents of Lie group theory would like to think of them as an algebraization of Lie groups; if that was the case one would expect to find them in Lie's work. But they are simply not there, though Lie's work contains detailed studies of some matrix groups over C (Lie [22]). The definition given below should be enough to convince the reader about our viewpoint that algebraic groups are entities independent of the two topics mentioned above, i.e., algebraic geometry and the Lie groups ([3], [5], [9]).

Definition 2.2.13: Let K denote an algebraically closed field and $SL_n(K)$ the group of n x n matrices $x = (x_{ij})$ with entries in K and with det x = 1. A subgroup $G \subset SL_n(K)$ is called a linear algebraic group over K if there exists a set S of polynomials P in $K[x_{ij}]$, $1 \le i, j \le n$, such that $x = (x_{ij})$ is in G if and only if $P(x_{ij}) = 0$ for every P in S. These are generally referred to as K-groups and their algebra is denoted as K[G].

Definition 2.2.14: The group $GL_n(K)^{6}$ of all non-singular n x n matrices defined over K can be viewed as a K-group contained in $SL_{n+1}(K)$ via the embedding:

$x \mapsto \begin{pmatrix} x & 0 \\ 0 & (\det x)^{-1} \end{pmatrix}$, $x \in GL_n(K)$    (2.2.8)

Note that $SL_n$, $O_n$, $SO_n$, where the last two denote the group of (n x n) orthogonal matrices and orthogonal matrices with determinant 1, are contained in $GL_n(K)$ and hence are K-groups.

Definition 2.2.15: An algebraic group G over an algebraically closed field K is an algebraic variety over K that has a group structure such that the product map $G \times G \to G$ and the inversion map $G \to G$ are morphisms of algebraic varieties. The above definition shows the relation of algebraic groups with algebraic varieties. It should be noted that being an algebraic variety in the case of linear algebraic groups implies that 'G be an affine variety.' For more literature on the topic, see [21], [26], [31].

Definition 2.2.16: A group G is called unipotent if all its elements are unipotent. An element $g \in G$ is unipotent if all its eigenvalues are 1. These unipotent groups play an important role in the Jordan decomposition of algebraic groups (see for instance [3] and Chapter 0 of [9]).
Exercise 2.2
1. Show that if H is an open subgroup of a topological group G, then H is closed in G.
2. Show that if H is an open normal subgroup of the topological group G which is locally connected, then G/H is discrete.
3. Fill in the lines to prove that $T^1$ is topologically isomorphic to R/Z.

6. We are using the integer 'n' as a subscript in $GL_n(K)$, which differs from the previous notation GL(n, R). This is done to emphasize the algebraically closed field K.
4. Show that the centre C of G, $\{x \in G \mid xa = ax \text{ for all } a \in G\}$, is a normal subgroup of G. It is denoted Z(G).
5. Show that the complex linear transformation group GL(2, C) is an eight-parameter group. Show further that the group SU(2) is compact, and it is a 3-parameter group.
6. Show that the elements of a group G of one-dimensional coordinate transformations ($x \to x + \epsilon$) that leave the wave function invariant can be expressed as $U = \exp(-i\epsilon p_x)$, where $p_x = \dfrac{1}{i}\dfrac{d}{dx}$.
7. Show that the group SU(2) can be viewed as an isospin invariant group for the doublet q consisting of quarks u ($I_3 = \tfrac{1}{2}$) and d ($I_3 = -\tfrac{1}{2}$); the expressions in brackets stand for their isospin (see Subsec 6.5.2 and Table-5).
8. Let M be a four-dimensional real vector space equipped with a space-time bilinear form of signature (+, -, -, -). The linear transformations which preserve the bilinear form define the group O(M) which is isomorphic to O(1, 3). Show that $SO_0(M)$ (the elements of O(M) with determinant 1) is a connected component of O(M). Show further that the group operation in the semi-direct product $M \rtimes SO_0(M)$ is defined as
(i) $(x, A) \cdot (y, B) = (x + A \cdot y, A \cdot B)$.
The above group is known as the Poincaré group or inhomogeneous Lorentz group (of the Special Theory of Relativity).
9. The Galilean group (the group of transformations in non-relativistic mechanics) is isomorphic to $\mathbf{R}^4 \rtimes (\mathbf{R}^3 \rtimes SO(3))$. Define the group action of $\mathbf{R}^3 \rtimes SO(3)$ on $\mathbf{R}^4$.
10. Let $\epsilon = i\sigma_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ define the Levi-Civita symbol, and let $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ be the other two Pauli matrices and $\sigma_0$ be
. Then show that the set 5W" = {±£, ±cr0, ±cru ±a2,
±
where $\epsilon$ is $i\sigma_2$ (the Levi-Civita symbol). Show further that the generators $A_1$ and $A_2$ pertaining to infinitesimal rotations about the x and y axes are
e
^ ^
and 0 0
0 , and that
to o o ,
the commutation relation satisfied by them is (b) $[A_i, A_j] = -\epsilon_{ijk} A_k$.
12. Establish that O(r, n - r), SO(r, n - r), U(r, n - r), SU(r, n - r) and Sp(n, R) are all Lie groups by showing that they are Lie subgroups of GL(n, R), etc.
Hints to Exercise 2.2
1. Since H is open, aH ($a \in G$ but not in H) is open in G, and thus $K = \cup\,(aH)$ is open in G; hence the complement of K, which is H, is closed.
2. Use Exercise 1 and Definitions (2.2.4) and (2.2.5).
3. R is an additive group with metric topology, and Z, the additive subgroup of integers, is closed in R. Thus the quotient group R/Z is a topological group. The isomorphism between R/Z and $T^1$ is given by the map
(i) $\phi : \mathbf{R} \to T^1 = S^1 : x \to e^{2\pi i x}$.
The map $\phi$ is a continuous epimorphism with kernel Z.
6. Consider G as a group that leaves the wave function $\psi(x)$ invariant, i.e., $\psi'(x') = \psi(x(x')) = \psi(x' - \epsilon)$; let $U \in G$, then $\psi'(x) = U\psi(x) = \psi(x - \epsilon)$. If $\epsilon$ is small,
(i) $\psi(x - \epsilon) = \psi(x) - \epsilon\,\dfrac{d\psi}{dx} + O(\epsilon^2) = \psi(x) - i\epsilon\,p_x\psi$.
Hence $U = \exp(-i\epsilon\,p_x)$.
7. Let $a \in SU(2)$ be written as $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ (see Eqn (2.2.1)), and let $q' = \begin{pmatrix} u' \\ d' \end{pmatrix} = aq = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} u \\ d \end{pmatrix}$.
Compute (q'\q'}. Since aa = a a = I, (q'\q') ={q\q) (seeApp.9A). 8. Find the identity element of the group (which is (0, e)) and the inverse of a given element (e.g., (x, A)'1 = (-A"1*, A'1)), and establish that (i) satisfies the associativity; these three taken together would show that the given operation is a group operation. 9. Let (a, A) be an arbitrary member of R3 x SO(3). Thus (a, A) maps (x, i) e R3 x R = R4 to the element (i) (A • x + ta, t) e R3 x R = R4. The action (i) is well defined. For instance, another element (/}, B) e R3 « SO (3) maps (A • x + ta, i) to (BAx + t(Ba + (J), f) e R3 x R = R4. (See Sec. 33 in 3.[11] on Galilei transformations.) 12. Let S(n, R) denote the group of (n X n) symmetric matrices. Consider the mapping/: GL(n, R) -> S(n, R) defined as (see Eq. (2.2.4)): f{A)=A'grA.
(i) 1
x
The map / i s continuous and therefore/" ({gr}) is closed. Thus O(r, n - r) = f~ {{gr}) is a closed subgroup of GL(n, R). From (iii) of Def. (2.2.4) it also follows that in view of Ftn. 3, it is a Lie subgroup of GL(n, R). Let S(n, C) denote the group of n x n Hermitian matrices and let / : GL(n, C) -> S(n, C) be the map defined as (see Eq. (2.2.6)): f(A) = A+grA.
(ii) l
The map/is continuous and therefore/ ~ ({£r}) is closed. Accordingly U(r, n - r) is a closed subgroup of GL(n, C) and hence is a Lie subgroup. Similarly, let A(2n, R) denote the group of {In x 2n) skew symmetric matrices and let/: GL(2n, R) —> A {In, R) be the map defined as (see Eq. (2.2.7)):
80 Mathematical Perspectives on Theoretical Physics
\[ f(A) = A^{t} \omega A. \tag{iii} \]
The map $f$ is continuous and hence $f^{-1}(\{\omega\})$ is closed. As a result $Sp(n, \mathbf{R}) = f^{-1}(\{\omega\})$ is a closed subgroup of $GL(2n, \mathbf{R})$, and thus it is a Lie subgroup of $GL(2n, \mathbf{R})$. (See Chap. 0 for continuous mapping and closed set.)
3
BASICS OF GROUP REPRESENTATION
In layman's language a representation of a group $G$ is its realization as a group of transformations of some set $X$ with a structure which is required to be preserved under these transformations. For instance: (i) if $X$ is a linear space over a field $K$ and $GL(X)$ denotes the group of invertible linear transformations of $X$, then a homomorphism of $G$ into the group $GL(X)$ is called a linear representation of $G$ with respect to $X$ over the field $K$ (if $K = \mathbf{R}$ and $X$ has a basis $e_1, e_2, \ldots, e_n$, then $GL(X) = GL(n, \mathbf{R})$). (ii) If $X$ is a Hilbert space (usually denoted as $\mathcal{H}$) and $U(X)$ is the group of unitary operators on $X$, a homomorphism from $G$ to $U(X)$ is a unitary representation of $G$ over $X$ (see Chapter 3 for operators). More formally we have:
Definition 2.3.1: A representation of a given group $G$ is a homomorphic mapping $G \to GL(X)$ where elements of $GL(X)$ are linear operators or matrices on a space $X$. $X$ is called the representation space and its dimension is known as the dimension or degree of the representation.
For topological groups, a representation can be defined as follows:
Definition 2.3.2: Let $E$ be a complex Banach space and $GL(E)$ (defined above) be the group of continuous transformations. A representation $\pi$ of a compact group $G$ in $E$ is a homomorphism $\pi : G \to GL(E)$ such that all maps $G \to E$, $g \mapsto \pi(g)x$ ($x \in E$) are continuous. The space $E$ is called the representation space of $G$ with reference to $\pi$ and is therefore denoted $E_\pi$. When $E = \mathcal{H}$, each $\pi(g)$ is a unitary operator in view of (ii) as given above (i.e., $(\pi(g))^{\dagger} = (\pi(g))^{-1} = \pi(g^{-1})$ $\forall\, g \in G$) and the representation of $G$ is unitary. When $G$ is a Lie group, the mappings defined from $G$ into the space of representations are required to be $C^\infty$ (see Kirillov in [19] and Ref. [13] for an elementary description).
It is easy to note that a group can have more than one representation. It is therefore natural to ask what relationship, if any, exists among these representations.
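As a small numerical illustration of Definition 2.3.1, the following sketch checks the homomorphism property for one concrete case: the cyclic group $\mathbf{Z}_N$ represented by plane rotations. The choice of group, the value of `N`, and all function names are assumptions made for this example only; they do not come from the text.

```python
import math

N = 6  # order of the cyclic group Z_N (an illustrative choice)


def rep(k):
    """Rotation by 2*pi*k/N: the matrix assigned to the group element k."""
    t = 2 * math.pi * (k % N) / N
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def is_homomorphism():
    # rep(j + k) must equal rep(j) rep(k) for every pair of group elements
    return all(close(rep((j + k) % N), matmul(rep(j), rep(k)))
               for j in range(N) for k in range(N))


def is_unitary():
    # For real matrices unitarity reduces to rep(k)^T rep(k) = I
    ident = [[1.0, 0.0], [0.0, 1.0]]
    return all(close(matmul([list(r) for r in zip(*rep(k))], rep(k)), ident)
               for k in range(N))
```

Both checks succeed here because rotations compose by adding angles and are orthogonal; the degree of this representation is 2.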
3.1
Relation Between Two Representations
Definition 2.3.3: Let $R_1$ and $R_2$ be two representations of $G$ in the spaces $X_1$ and $X_2$. The operators $A : X_1 \to X_2$ that commute with $R_1$ and $R_2$:
\[ A R_1(g) = R_2(g) A \tag{2.3.1} \]
are called intertwining operators. The space of these operators is denoted $\mathrm{Hom}_G(R_1, R_2) = C(R_1, R_2)$ and its dimension, denoted $c(R_1, R_2)$, is called the intertwining number of $R_1$ and $R_2$. If the operator $A$ is invertible, $R_1$ and $R_2$ are called equivalent. If $R_1 = R_2 = R$ and $X_1 = X_2 = X$, the notations given above are simply $\mathrm{End}_G(X) = C(R)$ and $c(R)$ respectively.
7 If $G$ is a topological group, $A$ will be a bounded linear operator.
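The intertwining condition (2.3.1) can be tested directly in a finite example. The sketch below takes $R_1 = R_2$ to be the permutation representation of the symmetric group $S_3$ on a 3-dimensional space (an assumption chosen for illustration, as are all names in the code) and checks which matrices commute with every $R(g)$.

```python
from itertools import permutations


def perm_matrix(p):
    """Matrix of the permutation p acting on basis vectors: e_j -> e_{p[j]}."""
    n = len(p)
    return [[1 if i == p[j] else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square integer matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def intertwines(A, group):
    # A lies in End_G(X) when A R(g) = R(g) A for every g in the group
    return all(matmul(A, perm_matrix(g)) == matmul(perm_matrix(g), A)
               for g in group)


S3 = list(permutations(range(3)))
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ALL_ONES = [[1, 1, 1] for _ in range(3)]         # commutes with every permutation
DIAGONAL = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]     # does not commute with swaps
```

The identity and the all-ones matrix are intertwining operators for this representation, while a diagonal matrix with distinct entries is not.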
Elements of Group Theory and Group Representations 81
Given a representation $R$ of a group $G$ with respect to a space $X$, suppose that $X$ admits a subspace $X_1$ which is $R(G)$-invariant; then $R$ restricted to $X_1$ defines the so-called subrepresentation, and the natural representation that follows on the quotient space $X/X_1$ is called the quotient representation. In case the subspace $X_2$, the complement of $X_1$ in $X$, is also $R(G)$-invariant, another subrepresentation $R_2$ is defined and $R = R_1 + R_2$ is said to be completely reducible. A representation that does not admit a proper nonzero subrepresentation is called irreducible. It is called algebraically irreducible when $G$ is an algebraic group and topologically irreducible when $G$ is a topological group.$^{8}$ For a completely reducible representation $R$ of degree $n$, it follows that every matrix $D(g)$ that corresponds to $g \in G$ can always be reduced to a block diagonal form using an $(n \times n)$ nonsingular matrix $M$:
\[ M D(g) M^{-1} = \begin{pmatrix} D_1(g) & & & 0 \\ & D_2(g) & & \\ & & \ddots & \\ 0 & & & D_k(g) \end{pmatrix} \]
The representation is then $R = \sum_{i=1}^{k} R_i$, where the diagonal block $D_i(g)$ corresponds to $R_i$. Furthermore, if $D_i(g)$ is an $n_i \times n_i$ matrix, then $n = \sum_{i=1}^{k} n_i$.
Definition 2.3.4: Every compact, connected Lie group $G$ has a "God given" representation defined by the so-called Adjoint action on the tangent space $T_e(G)$ at $e$.$^{9}$ This action is given by $X \mapsto g X g^{-1}$ and is written as $\mathrm{Ad}(g)X$ where $X \in T_e(G)$ and $g \in G$. This is the canonical finite dimensional representation of the Lie group $G$ on $T_e(G)$. It can be verified that the derivative of $\mathrm{Ad}$, denoted $\mathrm{ad} : T_e(G) \to \mathrm{End}(T_e G)$ and written for $X, Y \in T_e(G)$ as $\mathrm{ad}(X)(Y)$, equals the Lie bracket $[X, Y]$ (see Exercise 4).
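For a matrix group the statement $\mathrm{ad}(X)(Y) = [X, Y]$ can be checked numerically by differentiating $t \mapsto \mathrm{Ad}(e^{tX})Y$ at $t = 0$. The following is only a sketch under stated assumptions: matrices are small real 2x2 arrays, the exponential is computed from its Taylor series, and the derivative is approximated by a central difference. All helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]


def scale(c, a):
    return [[c * entry for entry in row] for row in a]


def expm(a, terms=30):
    """Matrix exponential by its Taylor series (adequate for small matrices)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result


def Ad(g, g_inv, Y):
    """Adjoint action Y -> g Y g^{-1}."""
    return matmul(matmul(g, Y), g_inv)


def ad_numeric(X, Y, h=1e-5):
    # central difference of t -> Ad(exp(tX)) Y at t = 0
    plus = Ad(expm(scale(h, X)), expm(scale(-h, X)), Y)
    minus = Ad(expm(scale(-h, X)), expm(scale(h, X)), Y)
    return scale(1.0 / (2 * h), add(plus, scale(-1.0, minus)))


def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return add(matmul(X, Y), scale(-1.0, matmul(Y, X)))
```

Comparing `ad_numeric(X, Y)` with `bracket(X, Y)` for a few sample matrices shows agreement up to the discretisation error of the finite difference.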
3.2
Tensor Product of Representations
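Definition 2.3.5: Let $\mathcal{H}_1 \otimes \mathcal{H}_2$ denote the algebraic tensor product of complex Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$.$^{10}$ Let $(\pi_1, \mathcal{H}_1)$ and $(\pi_2, \mathcal{H}_2)$ be two representations of $G$. Denote by $\mathcal{H}_1 \hat{\otimes} \mathcal{H}_2$ the space of all Hilbert-Schmidt operators $A : \mathcal{H}'_2 \to \mathcal{H}_1$, $\mathcal{H}'_2$ being the $\mathbf{C}$-linear dual of $\mathcal{H}_2$. By definition $A$ is Hilbert-Schmidt if$^{11}$
\[ \|A\|_2^2 = \sum_{i=1}^{\infty} \|A\xi_i\|^2 < \infty \tag{2.3.2} \]
for an arbitrary orthonormal basis $\{\xi_i\}$ of $\mathcal{H}'_2$. This in turn defines an inner product
\[ (A, B) = \mathrm{tr}(B^{*}A) = \sum_{i=1}^{\infty} (A\xi_i, B\xi_i) \]
for $A, B \in \mathcal{H}_1 \hat{\otimes} \mathcal{H}_2$. The space $\mathcal{H}_1 \hat{\otimes} \mathcal{H}_2$ equipped with this inner product is a Hilbert space (see Sec. (0.2)), and is denoted as $\mathrm{Hom}_2(\mathcal{H}'_2, \mathcal{H}_1)$. The tensor product representation $\pi_1 \otimes \pi_2$ can be defined as an algebraic action on $\mathcal{H}_1 \otimes \mathcal{H}_2$ by setting:
\[ (\pi_1 \otimes \pi_2)(g)(\xi \otimes \eta) = \pi_1(g)\xi \otimes \pi_2(g)\eta \tag{2.3.3} \]
where $g \in G$, $\xi \in \mathcal{H}_1$, and $\eta \in \mathcal{H}_2$. This extends to a unitary representation on $\mathcal{H}_1 \hat{\otimes} \mathcal{H}_2 = \mathrm{Hom}_2(\mathcal{H}'_2, \mathcal{H}_1)$ through the equality: $(\pi_1 \otimes \pi_2)_g A = \pi_1(g) \circ A \circ \pi'_2(g^{-1})$, $g \in G$ and $A \in \mathcal{H}_1 \hat{\otimes} \mathcal{H}_2$. Here $\pi'_2$ is the representation of $G$ with respect to $\mathcal{H}'_2$.
8 We shall simply use the word irreducible, since the adjective will be clear from the context.
9 Note that for any $g \in$ the Lie group $G$ conjugation by $g$ (Def. (2.1.3)) defines an automorphism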
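In finite dimensions the sum in (2.3.2) is simply the sum of squared matrix entries, and its independence of the chosen orthonormal basis is easy to check. The sketch below assumes a real 2x2 matrix and two orthonormal bases of $\mathbf{R}^2$; the sample matrix and the function names are illustrative only.

```python
import math


def apply(A, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


def hs_norm_sq(A, basis):
    """Sum over basis vectors e of |A e|^2, as in (2.3.2)."""
    return sum(sum(c * c for c in apply(A, e)) for e in basis)


A = [[1.0, 2.0], [3.0, 4.0]]
STANDARD = [[1.0, 0.0], [0.0, 1.0]]
T = 0.7  # rotation angle used to build a second orthonormal basis
ROTATED = [[math.cos(T), math.sin(T)], [-math.sin(T), math.cos(T)]]
```

Both bases give the same value, $1 + 4 + 9 + 16 = 30$, which is the trace of $A^{t}A$.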
Exercise 2.3
1. Let $(\pi, \mathcal{H})$ be a unitary representation of a topological group $G$ and let $\mathcal{H}_1 \subset \mathcal{H}$ be a closed $\pi(G)$-invariant subspace that defines the pair $(\pi_1, \mathcal{H}_1)$. Show that the orthogonal complement $(\mathcal{H}_1)^{\perp} = \mathcal{H}_2$ is also a $\pi(G)$-invariant subspace of $\mathcal{H}$ and hence $\pi = \pi_1 + \pi_2$.
2. Show that every unitary representation $(\pi, \mathcal{H}_\pi)$ of a group $G$ uniquely defines another representation $(\pi', \mathcal{H}'_\pi)$ where $\mathcal{H}'_\pi$ is the $\mathbf{C}$-linear dual of $\mathcal{H}_\pi$.
3. Let $R_1$ and $R_2$ be two irreducible representations of $G$; show that any intertwining operator $A \in \mathrm{Hom}_G(R_1, R_2)$ is either zero or invertible (Schur's Lemma).
4. Taking $G$ to be a matrix group over the reals, show that the derivative $\mathrm{ad} = d(\mathrm{Ad})$ satisfies $\mathrm{ad}(X)(Y) = [X, Y]$ for $X, Y \in T_e(G)$.
4
SPECIFIC EXAMPLES OF GROUP REPRESENTATION
In this section we give just three examples of group representations, which relate respectively to quantum theory (in particular to current algebras), Yang-Mills theory and string theory. The following example illustrates that, beginning with a group of unitary operators and its representation space $\mathcal{H}$, another group can be defined whose representation space is $\mathcal{H}$. (This example may be viewed as a follow-up of the theory given in Subsec. 4.3 of Chap. 0, and also as an example for constructing unitary operators.)
Example 2.4.1: Let $G_0(\mathcal{H})$ denote the group of motions of the Hilbert space $\mathcal{H}$, i.e., the group of transformations of the type $x \mapsto Ax + b$ where $b \in \mathcal{H}$ and $A$ is a unitary operator (see Eq. (3.1.4)). The composition rule in $G_0(\mathcal{H})$ is defined as:
\[ (A_1, b_1) \cdot (A_2, b_2) = (A_1 A_2,\ b_1 + A_1 b_2) \tag{2.4.1} \]
since the pair $(A_2, b_2)$ carries $x$ to $A_2 x + b_2$ and the pair $(A_1, b_1)$ carries $A_2 x + b_2$ to $A_1(A_2 x + b_2) + b_1$. The central extension of $G_0(\mathcal{H})$ is another group $G(\mathcal{H})$ formed by triplets $(A, b, c)$ where $c$ belongs to the unit circle $S^1$. The composition rule in $G(\mathcal{H})$ is given by:
\[ (A_1, b_1, c_1) \cdot (A_2, b_2, c_2) = \bigl(A_1 A_2,\ b_1 + A_1 b_2,\ c_1 c_2 \exp(i\, \mathrm{Im}\langle b_1, A_1 b_2\rangle)\bigr). \tag{2.4.2} \]
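The two composition rules (2.4.1) and (2.4.2) can be tried out in the simplest possible setting, where the Hilbert space is one-dimensional ($\mathcal{H} = \mathbf{C}$), a unitary operator is a complex number of modulus one, and the inner product is taken conjugate-linear in its second argument. These choices, and every name in the code, are assumptions made for illustration; the sketch checks that (2.4.1) reproduces composition of motions and that (2.4.2) is associative.

```python
import cmath


def inner(u, v):
    # assumed convention: <u, v> = u * conj(v) on the one-dimensional space C
    return u * v.conjugate()


def compose0(g1, g2):
    """Composition rule (2.4.1) for pairs (A, b)."""
    (a1, b1), (a2, b2) = g1, g2
    return (a1 * a2, b1 + a1 * b2)


def act(g, x):
    """The motion x -> A x + b."""
    a, b = g
    return a * x + b


def compose(g1, g2):
    """Composition rule (2.4.2) for triplets (A, b, c)."""
    (a1, b1, c1), (a2, b2, c2) = g1, g2
    phase = cmath.exp(1j * inner(b1, a1 * b2).imag)
    return (a1 * a2, b1 + a1 * b2, c1 * c2 * phase)


def close(p, q, tol=1e-12):
    return all(abs(s - t) < tol for s, t in zip(p, q))


# sample elements: unit-modulus A, arbitrary b, unit-modulus c
G1 = (cmath.exp(0.3j), 1 + 2j, cmath.exp(0.1j))
G2 = (cmath.exp(-1.1j), -0.5 + 0.25j, cmath.exp(2.0j))
G3 = (cmath.exp(2.4j), 0.75 - 1.5j, cmath.exp(-0.7j))
```

Associativity of (2.4.2) relies on unitarity: the extra phase picked up by the two bracketings differs by $\mathrm{Im}\langle A_1 b_2, A_1 A_2 b_3\rangle - \mathrm{Im}\langle b_2, A_2 b_3\rangle$, which vanishes when $A_1$ preserves the inner product.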
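The group $G(\mathcal{H})$ has the irreducible unitary representation in the space $\mathrm{Exp}\,\mathcal{H}$ which is given by:
\[ (A, b, c) \mapsto U_{A,b,c} \tag{2.4.3} \]
where
\[ U_{A,b,c}(\mathrm{Exp}\, v) = c \exp\left(-\tfrac{1}{2}\|b\|^2 - \langle Av, b\rangle\right) \mathrm{Exp}(Av + b) \tag{2.4.4} \]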
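If $\mathcal{H}$ is the complexification of a real Hilbert space $\mathcal{H}_r$ and $A$ is a unitary real operator (i.e., an operator that leaves $\mathcal{H}_r$ invariant) with $b \in \mathcal{H}_r$, then the operators $\{U_{A,b,c}\}$ can be realized in $L^2_\mu(\Phi)$, the Gaussian model of $\mathrm{Exp}\,\mathcal{H}$ (see Chapter 0 and Gelfand, et al., in [19]). To see this, consider the isomorphism $p_x : \mathrm{Exp}\,\mathcal{H}_r \to L^2_\mu(\Phi)$ defined as $\mathrm{Exp}\, v \mapsto e^{\|v\|^2/2}\, e^{i\langle \cdot, v\rangle}$ for $v \in \mathcal{H}_r$; then in view of (0.4.9), $U_{A,b,c}$ becomes:
\[ U_{A,b,c}\, \varphi(F) = c\, e^{i\langle F, b\rangle}\, \varphi(A^{*}F) \tag{2.4.5} \]
where $A^{*}$ is the conjugate (see Def. (3.1.7)) of the operator $A_r = A|_{\mathcal{H}_r}$ and $\langle\ ,\ \rangle$ denotes an inner product in $\mathcal{H}$. The equation (2.4.4) thus simplifies to:
\[ U_{A,b,c}\, e^{i\langle \cdot, v\rangle} = c\, e^{i\langle \cdot, b\rangle} \cdot e^{i\langle A^{*}\cdot,\, v\rangle} = c\, e^{i\langle \cdot,\, Av + b\rangle} \tag{2.4.6} \]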
Example 2.4.2: We have seen that the special unitary group $SU(n)$ is a Lie group for every $n$. It is formed by unitary matrices $U$: $U^{\dagger}U = UU^{\dagger} = I$, $\det U = 1$. Every unitary matrix $U$ can be written as $e^{iH}$ for some Hermitian matrix $H$. Using the property $\det U = 1$, it can be easily checked that $\mathrm{Tr}\, H = 0$.* In the set of $(n \times n)$ Hermitian matrices, there are $(n^2 - 1)$ linearly independent traceless Hermitian matrices. Accordingly, an element $U$ of $SU(n)$ can be expressed as:
\[ U = \exp\left( i \sum_{r=1}^{n^2-1} \epsilon_r g_r \right) \tag{2.4.7} \]
where $\epsilon_r$ are (real) group parameters and $g_r$ are the group generators$^{12}$ represented by traceless Hermitian matrices. Of these $(n^2 - 1)$ traceless matrices, $(n - 1)$ can be diagonalised simultaneously. The number $(n - 1)$ is called the rank of $SU(n)$. When $n = 2$, we have the group $SU(2)$ of isospin invariance of Yang-Mills theory. Obviously the rank of this group is 1 and the parameters are 3 in number, hence (2.4.7) gives
\[ U(\epsilon_1, \epsilon_2, \epsilon_3) = \exp\left( i \sum_{r=1}^{3} \epsilon_r g_r \right) \tag{2.4.8} \]
* Expand $e^{iH}$ as a series and evaluate its determinant; to first order the term involving $i$ is the trace of the matrix $H$ (in general $\det e^{iH} = e^{i\,\mathrm{Tr}\, H}$).
12 A set $S \subset G$ (any arbitrary group) is said to be a system of generators of $G$ if the powers $S^n$ ($n = 0, 1, 2, \ldots$; $S^0 = e$) cover the entire group $G$ (see also Def. (2.1.1)).
The generators $g_1, g_2, g_3$ are the Pauli matrices
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
Note that if we were to use the generators $J_r = \sigma_r/2$ to represent $SU(2)$, we would have the commutation rule for these generators as:
\[ [J_a, J_b] = i\,\epsilon_{abc} J_c \tag{2.4.9} \]
where $\epsilon_{abc}$ is the anti-symmetric (Levi-Civita) tensor (see [21](b) for in-depth study).
From the previous section it is clear that an arbitrary group can have more than one inequivalent linear representation. Depending on the vector space one chooses, some of these representations are more useful than others. In the next example we illustrate the method of finding a representation of degree $(n + 1)$ for the special linear group $SL(2, \mathbf{C})$ of complex $2 \times 2$ matrices of determinant 1.
Example 2.4.3: Let $X$ denote the $(n + 1)$-dimensional complex vector space whose elements are homogeneous polynomials $P$ of degree $n$ in two variables $(z_1, z_2)$ with complex coefficients:
\[ P(z_1, z_2) = \sum_{l=0}^{n} x_l\, z_1^{l} z_2^{n-l} \tag{2.4.10} \]
In order to show that $X$ is a representation space of $SL(2, \mathbf{C})$, we should establish a homomorphism:
\[ A \mapsto R(A) \cdot P \tag{2.4.11} \]
where $A \in SL(2, \mathbf{C})$ and $R(A) \cdot P$ is a transformed homogeneous polynomial in $X$. We write the pair $(z_1, z_2)$ as a column vector $\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \equiv z$ and let $A = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$, and then define $R(A) \cdot P(z)$ as a new homogeneous polynomial given by:
\[ P(A^{-1}z) = P\left( \begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \right) = \sum_{l=0}^{n} x_l\, (\delta z_1 - \beta z_2)^{l} (-\gamma z_1 + \alpha z_2)^{n-l} \tag{2.4.12} \]
Notationally, $P(A^{-1}z) = P_A(z) = P((\delta z_1 - \beta z_2), (-\gamma z_1 + \alpha z_2))$. Since $R(AB) = R(A)R(B)$, (2.4.11) and (2.4.12) lead to: $AB \mapsto R(A)R(B)P = P_{AB}(z)$. In order to bring this in line with physical theories that use this kind of group representation, we write $n = 2m$. Thus $m$ can be an integer or a half-integer, and we rewrite (2.4.10) as
\[ P(z_1, z_2) = \sum_{k=-m}^{m} x_k\, z_1^{m-k} z_2^{m+k} \tag{2.4.13} \]
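These homogeneous polynomials can be reduced to non-homogeneous ones by writing $(z_1, z_2)$ as $(z, 1)$, where $z = z_1/z_2$ with $z_2 \neq 0$. Thus we have,
\[ P(z, 1) = \sum_{k=-m}^{m} x_k\, z^{m-k} = Q(z) \tag{2.4.14} \]
The above process is evidently reversible since $P(z_1, z_2)$ and $Q(z)$ are related by the formula:
\[ P(z_1, z_2) = z_2^{2m}\, Q\left(\frac{z_1}{z_2}\right) \quad (z_2 \neq 0) \tag{2.4.15} \]
Let $X'_m$ denote the space of all polynomials $Q$ of degree $2m$, and let $R'_m$ be the corresponding representation of $SL(2, \mathbf{C})$ in $X'_m$; then $A \mapsto R'_m(A) \cdot Q = Q_A$ is given by:
\[ Q_A(z) = (-\gamma z + \alpha)^{2m}\, Q\left( \frac{\delta z - \beta}{-\gamma z + \alpha} \right) \tag{2.4.16} \]
From (2.4.14) it follows that the monomials $\{z^{m-k}\}$ for $-m$ …
\[ P(z_1, \ldots, z_{2r}) = \sum_{k_1, \ldots, k_{2r} = 0}^{n} x_{k_1 \ldots k_{2r}}\, z_1^{k_1} \cdots z_{2r}^{k_{2r}} \]
where the indices $k_i$ must satisfy $\sum k_i = n$. We can then write the column vector $\begin{pmatrix} z_1 \\ \vdots \\ z_{2r} \end{pmatrix}$ as $z$, treat $P(z)$ as a function in $\mathbf{C}^{2r}$, and for $A \in SL(2r, \mathbf{C})$ use the transformation rule:
\[ R_n(A) \cdot P = P_A = P_A(z) = P(A^{-1} \cdot z) = P\left( \sum_{i=1}^{2r} \alpha_i z_i, \ \ldots, \ \sum_{i=1}^{2r} \nu_i z_i \right) \]
where the matrix
\[ A^{-1} = \begin{pmatrix} \alpha_1 & \ldots & \alpha_{2r} \\ \vdots & & \vdots \\ \nu_1 & \ldots & \nu_{2r} \end{pmatrix}. \]
Exercise 2.4
1. Verify the Lie bracket relation (2.4.9) for $J_r = \sigma_r/2$ where
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]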
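Hints to Exercise 2.4
1. To prove $[J_a, J_b] = i\epsilon_{abc} J_c$, we write the LHS for $a = 1$ and $b = 2$. Thus we have:
\[ [J_1, J_2] = \frac{1}{4}[\sigma_1, \sigma_2] = \frac{1}{4}\left[ \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} - \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} \right] = \frac{1}{4}\,[2i\,\sigma_3] = i\,\epsilon_{123} J_3. \]
Note that to obtain a non-zero result on the RHS, the subscript $c$ in $\epsilon_{abc}$ must be 3; otherwise $\epsilon_{abc}$ will be zero.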
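The hint treats only the case $a = 1$, $b = 2$. The remaining index pairs of (2.4.9) can be checked mechanically; the following sketch does so with the Pauli matrices stored as nested lists and indices running over 0, 1, 2 instead of 1, 2, 3. The helper names are hypothetical.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# generators J_r = sigma_r / 2
J = [[[entry / 2 for entry in row] for row in s] for s in SIGMA]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def levi_civita(a, b, c):
    # for indices in {0, 1, 2} the product below is 0 or +/-2
    return (a - b) * (b - c) * (c - a) // 2


def rhs(a, b):
    """The right-hand side i * sum_c eps_abc J_c of (2.4.9)."""
    return [[1j * sum(levi_civita(a, b, c) * J[c][i][k] for c in range(3))
             for k in range(2)] for i in range(2)]


def relation_holds(tol=1e-12):
    return all(abs(commutator(J[a], J[b])[i][k] - rhs(a, b)[i][k]) < tol
               for a in range(3) for b in range(3)
               for i in range(2) for k in range(2))
```

The check covers all nine index pairs, including the trivial ones with $a = b$ where both sides vanish.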
5
THE THEORY OF BUNDLES AND RELATED OBJECTS
In this section we use our knowledge of groups to learn about another important topic, bundles, which originated in mathematics around 1950 as a generalization of topological products. Very soon groups, in particular Lie groups, were introduced in this generalization scheme, which led to an important mathematical structure, namely principal fiber bundles. These, as we shall soon see, turned out to be an effective tool, via their offshoot the theory of connections, in the description of other mathematical and physical theories. The notion of connection has been described in the literature in more than one way, for instance (i) by using the classical concept of parallel translation; (ii) by following the Ehresmann-Koszul construction of principal fiber bundles, more specifically by considering the smooth splitting of tangent spaces into horizontal and vertical subspaces; and (iii) by treating the cotangent bundles as the main ingredients in the description. To draw a distinction among these approaches, we have divided this section into two parts: A and B. Part A deals with introductory definitions and with the theory of connections based on (ii). The description of connections based on (iii), together with their reference in physics, is the subject matter of Part B. Other important objects, e.g., associated bundles (of principal fiber bundles), the universal connection and the metric on a bundle, are also given in this part. Due to our limited scope, we describe the theory in brief and refer the reader to other advanced and specialized texts on this subject: 1.[10], [30], [18], and [14].
Part A
5.1
Fiber Bundles, Bundle Morphisms
Definition 2.5.1: Given two topological spaces $E$ and $X$ and a continuous surjective mapping $\pi : E \to X$, the triple $(E, X, \pi)$ is called a bundle. The space $X$ is called the base.
Example 2.5.2: The triple $(X_1 \times X_2, X_1, \pi_1)$, where $E$ is the Cartesian product $X_1 \times X_2$ and $\pi$ is the projection $\pi_1$ on the first factor, is the Cartesian bundle. In particular, the triple $(S^1 \times I, S^1, \pi_1)$, which represents a cylinder formed (as though) by gluing the two ends of a piece of paper, is a Cartesian product bundle of the circle $S^1$ with the line segment $I$.
[Figure: Cartesian product bundle of a circle and a line.]
The inverse image $\pi^{-1}(x)$ for every $x \in X$ is assumed to be homeomorphic to a given space $F$; it is called the fiber at $x$ and is denoted as $F_x$ or sometimes as $E_x$. The space $F$ is called the typical fiber. An additional structure in the bundle can be provided by incorporating a group of homeomorphisms of $F$ and a covering of $X$ by open sets; we then have the following definition:
Definition 2.5.3: A bundle $(E, X, \pi)$ is a fiber bundle $(E, X, \pi, G)$ if it is equipped with a typical fiber $F$, a topological group $G$ of homeomorphisms of $F$ (onto itself) called the structural group, and a covering of $X$ by a family of open sets $\{V_j, j \in J \subset \mathbf{N}\}$ such that:
(a) Locally it is a trivial bundle, i.e., for every $j \in J$, the topological space $\pi^{-1}(V_j)$ is homeomorphic to the topological product $V_j \times F$. This homeomorphism, denoted $\phi_j$, from $\pi^{-1}(V_j)$ to $V_j \times F$ can be written as:
… (2.5.1)
and it satisfies the following diagram:
[Figure: Homeomorphic map, with $\pi_1 \circ \phi_j = \pi$.]
(b) … $\hat{\phi}_{k,x} \circ \hat{\phi}_{j,x}^{-1} : F \to F$ (2.5.2)
[Figure: Composite mapping $\hat{\phi}_{k,x} \circ \hat{\phi}_{j,x}^{-1} : F \to F$.]
Evidently if $G$ has only one element, the bundle is trivial.
(c) The composite mapping $\hat{\phi}_{k,x} \circ \hat{\phi}_{j,x}^{-1}$ induces a continuous mapping $g_{kj} : V_j \cap V_k \to G$ as follows:
\[ g_{kj}(x) = \hat{\phi}_{k,x} \circ \hat{\phi}_{j,x}^{-1} \tag*{(2.5.3)(a)} \]
The elements $g_{kj}(x) = \hat{\phi}_{k,x} \circ \hat{\phi}_{j,x}^{-1}$, which belong to the group of homeomorphisms $G$, are also called the transition functions of local representations. If $x \in V_i \cap V_j \cap V_k$, then it can be checked that the composite map satisfies the following:
\[ g_{ij}(x) \circ g_{jk}(x) \circ g_{ki}(x) = \mathrm{Id}_F \tag*{(2.5.3)(b)} \]
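The cyclic condition (2.5.3)(b) holds automatically whenever the transition functions are built from fiber maps. A minimal numerical sketch, assuming the index convention $g_{ij}(x) = \hat{\phi}_{i,x} \circ \hat{\phi}_{j,x}^{-1}$, a typical fiber $\mathbf{R}^2$, and invertible 2x2 matrices standing in for the maps $\hat{\phi}_{i,x}$ (the particular matrices are hypothetical):

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def phi(i, x):
    """Hypothetical fiber map of chart i at the base point x (upper triangular,
    with nonzero diagonal entries, hence invertible)."""
    return [[1.0 + 0.1 * i, x * (i + 1)], [0.0, math.exp(0.3 * i * x)]]


def g(i, j, x):
    """Transition function g_ij(x) = phi_i(x) phi_j(x)^{-1} (assumed convention)."""
    return matmul(phi(i, x), inverse(phi(j, x)))


def is_identity(a, tol=1e-9):
    return all(abs(a[r][c] - float(r == c)) < tol
               for r in range(2) for c in range(2))


def cyclic_condition(i, j, k, x):
    # g_ij(x) g_jk(x) g_ki(x) should be the identity of the structural group
    return is_identity(matmul(matmul(g(i, j, x), g(j, k, x)), g(k, i, x)))
```

With this convention the inner factors cancel in pairs, so the product telescopes to the identity; the same computation also gives $g_{ii}(x) = \mathrm{Id}_F$.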
The identity map $\mathrm{Id}_F$ on $F$ is in fact the identity element of $G$. The condition (2.5.3) is referred to as the cyclic condition of transition functions. An important concept related to fiber bundles is that of sections, which is described as follows. Given a fiber bundle $\xi = (E, X, \pi, F)$, a $C^r$-(smooth) map $s : X \to E$ such that $\pi \circ s = \mathrm{Id}_X$ is called a $C^r$-(smooth) section$^{13}$ of the fiber bundle $E$ over $X$. The space of these sections is denoted as $\Gamma(X, E)$ or simply as $\Gamma(E)$. For an open set $U \subset X$, the set of sections of the bundle $E$ restricted to $U$ is denoted as $\Gamma(E|_U)$. In particular if $x \in U$ and $s \in \Gamma(E|_U)$, then $s$ is called a local section of $E$ at $x$. Note that by choosing local coordinates $\{x^i, 1 \le i \le m\}$ in a neighbourhood of $x \in X$ and $\{(x^i, y^j); 1 \le i \le m, 1 \le j \le n\}$ in a neighbourhood of $s(x) \in E$, we may think of $s$ as a function from an open subset of $\mathbf{R}^m$ to $\mathbf{R}^{n+m}$ such that$^{14}$
\[ (x^i) \mapsto (x^i,\ y^j(x^1, \ldots, x^m)) \tag{2.5.4} \]
13 The section is also referred to as cross-section. See Exc. 1. The section is $C^r$ ($C^\infty$ = smooth) only in the case of $C^r$ (smooth) fiber bundles. See Def. (2.5.4).
14 Note that (2.5.4) can be written only if the spaces $X$ and $E$ carry coordinate charts and $F$ is homeomorphic to $\mathbf{R}^n$. This is what we have assumed here, though we have not mentioned it explicitly.
From this point of view one can use the calculus to write the Taylor expansion of $s$ at $x$ for a given set of local coordinates and eventually define an equivalence class of expansions of the same order, say $k$. This is called the $k$-jet of $s$ at $x$ (see [20] for details).
Definition 2.5.4: A bundle $(E, X, \pi, G)$ is said to be a differentiable ($C^r$) fiber bundle if $E$, $X$ and the typical fiber are differentiable ($C^r$) manifolds, $\pi$ is a differentiable ($C^r$) mapping, and $G$ is a Lie group; furthermore the covering of $X$ given by the domains of an admissible atlas implies that the mappings $g_{jk}$ are differentiable ($C^r$).
Definition 2.5.5: Consider two bundles $(E_1, X_1, \pi_1)$ and $(E_2, X_2, \pi_2)$; the mappings which map fibers into fibers (i.e., preserve the local product structure) are called bundle morphisms. Thus a bundle morphism is a pair of maps $(f, \hat{f})$:
\[ f : E_1 \to E_2 \quad \text{and} \quad \hat{f} : X_1 \to X_2 \tag{2.5.5} \]
such that the following diagram commutes:
[Figure: Bundle morphism.]
Given the map $f$, the map $\hat{f}$, if it exists, is unique.
Definition 2.5.6: Consider a differentiable fiber bundle $(E, X, \pi)$ where $E$ and $X$ are manifolds of dimension $n + m$ and $n$. A chart $(U, \psi)$ on $E$ defines fiber coordinates on $E$ if the mapping $\psi : U \to \mathbf{R}^{n+m}$ is a bundle morphism with $\mathbf{R}^{n+m}$ viewed as the natural (Cartesian product) bundle $\mathbf{R}^{n+m} = \mathbf{R}^n \times \mathbf{R}^m$.
5.2
Tangent Bundle
It can be checked that given a differentiable manifold $X^n$ the triple $(T(X^n), X^n, \pi)$ is a fiber bundle, where $T(X^n)$ stands for the space of pairs $(x, v_x)$, $x \in X^n$, and $v_x \in T_x(X^n)$, the tangent space to $X^n$ at $x$. The mapping $\pi$ is the projection $\pi : (x, v_x) \mapsto x$. The fiber at $x$ is $T_x(X^n)$, while the typical fiber is $\mathbf{R}^n$. The covering of $X^n$ is given by $\{V_j\}$ where $\{V_j, \psi_j\}$ is an atlas of $X^n$. The homeomorphism $\phi_j$ (see (2.5.1)) can be written as the pair $(\pi, \psi'_j \circ \pi')$ such that
\[ (\pi, \psi'_j \circ \pi') : \pi^{-1}(V_j) \to V_j \times \mathbf{R}^n \tag{2.5.6} \]
is given by $(x, v_x) \mapsto (x, \psi'_j(v_x))$. Here $\pi'$ is the mapping that assigns $(x, v_x)$ to $v_x$, and $\psi'_j(v_x)$ is the representative of $v_x$ in the chart $(V_j, \psi_j)$. The fiber coordinates on $T(X^n)$ are given by the mappings
\[ (\psi_j, \mathrm{Id}) \circ (\pi, \psi'_j \circ \pi') : \pi^{-1}(V_j) \to \mathbf{R}^n \times \mathbf{R}^n \tag{2.5.7} \]
The coordinates of a point $p = (x, v_x) \in \pi^{-1}(V_j) \subset T(X^n)$ are thus:
\[ (x^1 \ldots x^n,\ v_x^1 \ldots v_x^n) \tag{2.5.8} \]
The structural group $G$ of this bundle is the Lie group $GL(n, \mathbf{R})$, the group of linear automorphisms of $\mathbf{R}^n$. Recall that the group consists of non-singular $n \times n$ matrices. The fiber bundle described above is the Tangent Bundle. If the manifold $X^n$ is $C^r$-differentiable, then the tangent bundle $(T(X^n), X^n, \pi)$ is $C^{r-1}$-differentiable.
Definition 2.5.7: Let $\xi = (E, X, \pi, F)$ denote a fiber bundle with structure group $G$; if $F$ is a Banach space and $G$ is the Lie group of automorphisms$^{15}$ (linear continuous bijections) of $F$, then $\xi$ is called a vector bundle with fiber type $F$. When $F$ is $\mathbf{R}^n$ (or $\mathbf{C}^n$) and $G$ is $GL(n, \mathbf{R})$ (or $GL(n, \mathbf{C})$), then we call $\xi$ a real (or complex) vector bundle of rank $n$. Apparently, the tangent bundle given above is a real vector bundle of rank $n$.
Definition 2.5.8: A fiber bundle $\xi = (P, M, \pi, F)$ with structure group $G$ is called a principal fiber bundle over $M$ with structure group $G$, if $F$ is a Lie group, and $G$ is the Lie group of diffeomorphisms $h$ of $F$ (see Def. (2.5.3)) such that:
\[ h(f_1 f_2) = h(f_1) f_2, \quad f_1, f_2 \in F \tag{2.5.9} \]
It should be noted that the mapping that carries $h$ to $h(e)$, where $e$ is the identity element of $F$, establishes a group isomorphism between $G$ and $F$. Thus it is customary to write the bundle $\xi$ in short form as $P(M, G)$,$^{16}$ and use the right action $\rho$ of the group $G$ in setting the conditions of the definition. We shall return to this in Chapter 6 while dealing with gauge theory. We now move on to some algebraic and geometric concepts of bundle theory. As can be expected, the literature dealing with these concepts is quite vast. With our limited scope we only list the definitions of objects that are important from the point of view of applications and give the references for further reading. In order to define the notion of a parallel translate of a point of the fiber or a vector in relation to a principal fiber bundle, we return to Lie groups, in particular to Lie groups of transformations.
5.3
Lie Groups of Transformations, One-parameter Subgroups of a Lie Group, Killing Vector Fields
Given a group G and a manifold X, the set {
(2.5.10)
defined as $(g, x) \mapsto \sigma(g, x)$ is differentiable, and if the set of transformations $\{\sigma_g\}$ ($\sigma_g(x) = \sigma(g, x)$) in relation to the composite mapping follow the group property:
\[ \sigma_g \circ \sigma_h = \sigma_{gh}, \qquad \sigma_e = \text{identity transformation} \tag{2.5.11} \]
From above it is evident that $\sigma_{g^{-1}} = \sigma_g^{-1}$. The group $G$ is said to operate effectively on $X$ if $\sigma_g(x) = x$ for every $x \in X$ implies $g = e$. It is said to operate transitively on $X$ if for every $x \in X$ and $y \in X$, there exists $g \in G$ such that $\sigma_g(x) = y$. If the group of transformations $\{\sigma_t\}$ is one-dimensional, most of its properties can be stated in terms of the vector field which generates it. Naturally if the group of transformations $\{\sigma_{g(t)}; t \in \mathbf{R}\}$ results from a one-parameter subgroup $\{g(t); t \in \mathbf{R}\}$ of $G$, it has similar properties, as we shall soon see. By definition, a one-parameter subgroup of a Lie group $G$ is a differentiable curve:
\[ \mathbf{R} \to G; \quad t \mapsto g(t) \tag{2.5.12} \]
such that
\[ g(t)g(s) = g(t + s), \qquad g(0) = e \tag{2.5.13} \]
15 It is assumed that such a Lie group can be defined, see [21(a)].
16 In spite of this shorthand notation, we often use the general form $\xi = (E, X, \pi, G)$.
Using the ideas developed for the group $\{\sigma_t\}$ and for the group $\{\sigma_{g(t)}\}$, it can be checked that the image of the one-parameter subgroup $\{g(t); t \in \mathbf{R}\}$ under the mapping $\sigma_x$ is indeed the curve generated by the transformations $\{\sigma_{g(t)}; t \in \mathbf{R}\}$ that operate on $x \in X$. The mapping $\sigma_x$ mentioned above is thus defined as follows:
\[ \sigma_{g(t)}(x) = \sigma(g(t), x) = \sigma_x(g(t)) \tag{2.5.14} \]
We give below a pictorial representation of the mappings in (2.5.14).
[Figure: An illustration of the relation between $\sigma_g$ and $\sigma$.]
The vector field which generates the group of transformations $\{\sigma_{g(t)}; t \in \mathbf{R}\}$ is called a Killing vector field on $X$ relative to the group $G$. We denote it as $v$. The integral curve of $v$ which passes through $x$ satisfies the differential equation:
\[ \frac{d\sigma_x(g(t))}{dt} = v(\sigma_x(g(t))) \]
with initial condition $\sigma_x(e) = x$. It is well known that if $G$ acts effectively on $X$, the space of Killing vector fields on $X$ is isomorphic to the Lie algebra $\mathcal{G}$ of $G$. The two groups of transformations that we consider next are of great importance in the study of mathematics as well as physics. They are defined only with respect to Lie groups; specifically they are the groups of transformations formed by left and right translations of $G$, and are defined as:
\[ \text{Left translation: } L_g : G \to G;\ L_g(h) = gh \qquad \text{Right translation: } R_g : G \to G;\ R_g(h) = hg \tag{2.5.15} \]
where $h$ is an arbitrary element of $G$ and $g$ is a fixed element. We now state two important results based on the above concepts (see Hints to Exercises 4 and 5 for the proof).
Result 2.5.9: There is a one-one correspondence between the set of left (right) invariant vector fields on $G$ and the set of vectors tangent to $G$ at $e$, i.e., the tangent vector space $T_e(G)$.
A vector field $v$ on $G$ is left (right) invariant if
\[ L'_g v(h) = v(L_g h) = v(gh) \quad [R'_g v(h) = v(R_g h) = v(hg)], \quad \forall\, g, h \in G. \]
Result 2.5.10: The set of left (right) invariant vector fields is closed under the Lie bracket relation.
Having studied the properties of the transformation groups associated with Lie groups, we are now in a position to define the notion of parallel transport (translate) and connection on a principal bundle.
5.4
Parallel Transport and Connection
We note that in a principal fiber bundle $(E, X, \pi, G)$, while each fiber is diffeomorphic to $G$, this diffeomorphism is not canonical since it depends on the atlas $\{U_i, \psi_i\}$ of $X$. We shall see that a connection is a means to set up a canonical correspondence between any two fibers along a curve $C$ in $X$. Using this correspondence one could say that a point of the fiber over a given point of the curve is parallel translated along the curve. In view of this it is natural that the definition of connection involves mappings between tangent spaces of the base manifold $X$ and those of the bundle manifold $E$.
Definition 2.5.11: A connection on the principal fiber bundle $\xi = (E, X, \pi, G)$ is a linear mapping
\[ \sigma_p : T_x(X) \to T_p(E), \quad x = \pi(p) \]
that satisfies:
(i) $\pi' \sigma_p$ is the identity mapping on $T_x(X)$,
(ii) …
(iii) $\sigma_p$ depends differentiably on $p$. (2.5.16)
The symbol $R_g$ stands for the right action of $G$ on $E$ with respect to $g \in G$, and $R'_g$ is its differential. From the above definition it follows that if $C : I \subset \mathbf{R} \to X$ ($t \mapsto C(t)$) is a curve in $X$ that passes through the point $x_0 = C(0)$ and $p_0$ is a point on the fiber such that $\pi(p_0) = x_0$, then the parallel translation of $p_0$ along $C$ is given by the curve $\hat{C} : I \subset \mathbf{R} \to E$ ($t \mapsto \hat{C}(t)$) which is defined as:
\[ \frac{d}{dt}\hat{C}(t) = \sigma_p \frac{d}{dt} C(t), \qquad \hat{C}(0) = p_0 \tag{2.5.17} \]
Since $\sigma_p$ is linear, the space
\[ H_p = \sigma_p(T_x(X)), \quad x = \pi(p) \tag{2.5.18} \]
is a vector subspace of $T_p(E)$. Using the vector space $H_p$, the definition of connection can also be formulated in the following manner.
Definition 2.5.12: A connection on $\xi = (E, X, \pi, G)$ is a field (an assignment) of vector spaces $H_p$, $H_p \subset T_p(E)$, such that
(i) $\pi' : H_p \to T_x(X)$, $(x = \pi(p))$ is an isomorphism,
(ii) …
(iii) $H_p$ depends differentiably on $p$. (2.5.19)
The complement of the subspace $H_p$ with respect to the tangent space $T_p(E)$ is denoted $V_p$; evidently $V_p = T_p(E_x)$ and $\pi'(V_p) = 0$; thus $T_p(E) = H_p \oplus V_p$; $v = v_h + v_v$ where $v_h \in H_p$ and $v_v \in V_p$, $v \in T_p(E)$. The elements of $H_p$ and $V_p$ are called horizontal and vertical vectors and the spaces are referred to as horizontal and vertical spaces. The dimension of $H_p$ is evidently the same as that of $T_x(X)$ or that of $X$, whereas the dimension of $V_p$ equals that of the Lie algebra $\mathcal{G}$ of $G$. We clarify this in the following remark:
Remark 2.5.13: The group $G$ acts effectively on $E$, hence there is a natural isomorphism between $\mathcal{G}$ and the space of Killing vector fields on $E$ relative to $G$. Moreover, since $p$ and $R_g p$ lie in the same fiber, any Killing vector (denoted $v_k(p)$) satisfying the relation:
\[ v_k(p) = \frac{d}{dt}\bigl(R_{g(t)}\, p\bigr)\Big|_{t=0} \]
is tangent at $p$ to the fiber $E_x$ defined at $x = \pi(p)$, that is, $v_k(p) \in V_p$. Also a Killing vector field does not vanish at any point, unless the point is associated with the zero element of $\mathcal{G}$. This implies that the dimension of the space formed by Killing vectors $\{v_k(p)\}$ at $p$ equals that of $\mathcal{G}$ and therefore that of $V_p$. This leads to a canonical isomorphism $V_p \to \mathcal{G}$ given by $u \mapsto \hat{u}$, and as a consequence there follows another equivalent definition of connection.
Definition 2.5.14: A connection on a principal fiber bundle $\xi = (E, X, \pi, G)$ is a linear mapping $\omega_p : T_p(E) \to \mathcal{G}$ (i.e., a 1-form on $E$ with values in the vector space $\mathcal{G}$) which satisfies:
(i) $\omega_p(u) = \hat{u}$ for every $u \in V_p$,
(ii) $\omega_{R_g p}(R'_g v) = \mathrm{Ad}(g^{-1})\, \omega_p(v)$,
(iii) $\omega_p$ depends differentiably on $p$.
The equivalence of the last two definitions can be checked by using the fact:
\[ v \in H_p \Leftrightarrow \omega(v) = 0 \tag{2.5.20} \]
We next see that a connection on a principal fiber bundle $\xi = (E, X, \pi, G)$ can always be defined by means of a 1-form on $X$ that has values in $\mathcal{G}$. In a manner of speaking, this is an existence theorem for a connection on $\xi$.
Result 2.5.15: On every principal fiber bundle $\xi = (E, X, \pi, G)$ where $X$ is a paracompact manifold, a connection $\omega$ with values in $\mathcal{G}$ can be constructed. (See Hint to Exercise 6 for the proof.)
Another important geometric object that can be defined on a principal fiber bundle is the curvature.
Definition 2.5.16: Let $\xi = (E, X, \pi, G)$ be the principal fiber bundle with connection $H_p$ defined by the 1-form $\omega$; and let $h : T_p(E) \to H_p$ be the mapping such that $v \mapsto v_h$. Then the 2-form
\[ \Omega \equiv D\omega \tag*{(2.5.21)(a)} \]
is called the curvature form of the connection $\omega$ (or that of $H_p$) where by definition:
\[ D\omega(v_1, v_2) = d\omega(hv_1, hv_2) \tag*{(2.5.21)(b)} \]
Thus the symbol $D$ stands for the exterior covariant derivative of forms defined on $E$, just as $d$ stands for the exterior derivative on $X$. In general the exterior covariant derivative of an $r$-form $\psi = \psi^{\alpha} \otimes e_{\alpha}$ (with values in a vector space with basis $e_{\alpha}$) could be written as:
\[ D\psi(v_1, v_2, \ldots, v_{r+1}) = d\psi(hv_1, hv_2, \ldots, hv_{r+1}) \tag{2.5.22} \]
The curvature form $\Omega$ defined in (2.5.21) satisfies the relation:
\[ \Omega(u, v) = d\omega(u, v) + [\omega(u), \omega(v)] \tag{2.5.23} \]
for any vector fields $u$, $v$ on $E$. The above equation, known as Cartan's structural equation,$^{17}$ is quite fundamental in the study of the geometric structure of a principal fiber bundle. We shall return to it in later chapters. (See Exercise 7 for the proof of this equation.) If $\{e_{\alpha}\}$ denotes a basis in $\mathcal{G}$, then the above equation becomes:
\[ \Omega^{\alpha} = d\omega^{\alpha} + \tfrac{1}{2} C^{\alpha}_{\beta\gamma}\, \omega^{\beta} \wedge \omega^{\gamma} \tag{2.5.24} \]
The differentiation of (2.5.24) gives:
\[ d\Omega^{\alpha} = \tfrac{1}{2} C^{\alpha}_{\beta\gamma}\, d\omega^{\beta} \wedge \omega^{\gamma} - \tfrac{1}{2} C^{\alpha}_{\beta\gamma}\, \omega^{\beta} \wedge d\omega^{\gamma} \tag{2.5.25} \]
Evaluating $d\Omega^{\alpha}$ on horizontal vectors $(hu, hv, hw)$ and making use of (2.5.20) and (2.5.22) implies that
\[ D\Omega^{\alpha}(u, v, w) = d\Omega^{\alpha}(hu, hv, hw) = 0 \quad \text{for every } u, v, w \tag{2.5.26} \]
i.e., $D\Omega = 0$. The above equations are called the Bianchi identities on $E$. In conclusion to our study of connections on principal fiber bundles, we shall define two more objects. One of these is the linear connection (metric connection) on $X$, when $X$ is considered as the base manifold of $\xi$. Recall that we are already familiar with the notion of metric connection on a differentiable manifold from Sec. (0.3). The other is the bundle associated to $P(X, G)$ (see Part B).
5.5 The Linear and the Metric Connection, and the Torsion Form
Definition 2.5.17: A connection on the principal fiber bundle L(X) of frames on a differentiable manifold X is called a linear connection on X. Now $L(X) = \bigcup_{x \in X} L_x(X)$.
Let $\phi_p : T_x(X^n) \to \mathbb{R}^n$, $p = (x, u_i)$, be the mapping that carries a vector $v \in T_x(X^n)$ to its components with respect to the basis $\{u_i\}$; thus if $\{\theta^i\}$ is the basis dual to $\{u_i\}$, then the form $\theta$ given by
$\theta_p(u) = \phi_p(\pi'(u))$ for $u \in T_p(P)$ (2.5.27)
is called the canonical form of X. Some authors call it the canonical form of L(X) (see 1.[10]).
Definition 2.5.18: The 2-form $\Theta = D\theta$ is called the torsion form of the linear connection on X (see Exercise 10).
If instead of considering the set of arbitrary frames, we consider orthonormal frames at every point x of X, the principal fiber bundle whose structural group is now $O(n, \mathbb{R})$ is denoted O(X).18 The bundle O(X) is called a reduction bundle of L(X).
17 There are two Cartan structural equations and this is one of them. The other is Eq. (a) in Exc. (2.5.10).
18 Evidently the presence of a Riemannian structure has been assumed here (see Subsec. (0.3.4)).
In view of the result: "if $(E, X, \pi, G)$ is a reduction of L(X) then a connection on E determines a linear connection on L(X),"19 it follows that a connection on O(X) determines a linear connection on L(X). This linear connection on L(X) is called a metric connection. If X is a Riemannian manifold, the Riemannian connection (also called the Levi-Civita connection) is the unique metric connection whose torsion form vanishes:
$\Theta = 0$ (2.5.28)
Having described the connection (developed by Koszul and Ehresmann based on the ideas of E. Cartan) in the fullest generality, we proceed now in the next subsection to give the definition of connection and covariant derivation from another perspective.
Part B
As already mentioned in the introduction, we shall describe here the meaning of connection on a vector bundle, say $E(E, B, \pi, F)$.20 We make the following definition for the purpose.
5.6 Connection and Curvature on a Bundle from a Different Point of View
Definition 2.5.19: Let $T^*(B)$ denote the cotangent bundle of B, i.e. the bundle formed by the spaces of 1-forms (covariant vectors) at all points of B. A connection or a covariant differential is a linear differential operator D from the space of smooth sections of E to the space of smooth sections of $T^*(B) \otimes E$, i.e.,
$D : \Gamma(B, E) \to \Gamma(B, T^*(B) \otimes E)$
such that D satisfies the Leibniz rule (see (iii) in Eq. (0.3.10)); thus for $s \in \Gamma(B, E)$ and $f \in C^\infty(B)$ it gives:
$D(fs) = df \otimes s + f\, Ds$ (2.5.29)
Locally (i.e., in $\pi^{-1}(U) \cong U \times F$, U an open set of B) the above equation can be written as21:
$\nabla_i (fs)\, dx^i = (\nabla_i f \cdot s + f\, \nabla_i s)\, dx^i \quad (i = 1, \ldots, n)$ (2.5.30)
If $\{s_i\}$, $i = 1, 2, \ldots, n$, denotes a set of n linearly independent sections in a neighbourhood U of B, then the action of D on the sections $s_i$ can be expanded in terms of the same local frame with coefficients in $T^*(B)$, i.e.,
$Ds_i = \sum_j \theta_i^{\,j} \otimes s_j$ (2.5.31)
The matrix $\theta$ of one-forms $\theta_i^{\,j}$ on U is called the connection one-form. Using the matrix notation for $\{s_i\}$ we can write (2.5.31) as:
$Ds = \theta \otimes s$ (2.5.32)
19 See reference [30], p. 302 for proof.
20 Note that we use B in place of X to denote the base space in order to distinguish the study in this subsection from the previous subsection.
21 See Sec. (0.3) for $\nabla_i$.
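The covariant differentiation of the above equation gives the curvature two-form matrix:
$D(Ds) = D(\theta \otimes s) = d\theta \otimes s + \theta \otimes Ds = d\theta \otimes s + \theta \otimes (\theta \otimes s) = \{d\theta - \theta \wedge \theta\} \otimes s$ (2.5.33)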
Note that we have replaced the tensor product between the $\theta$'s by a wedge product and have also changed the sign of the term. The first of these steps is taken because $\theta$ is a one-form, and the second results from the fact that $\theta$ is a matrix, and to obey the matrix multiplication rule the two connection matrices have to be interchanged. Note further that equation (2.5.33) defines the covariant exterior differentiation of the one-form matrix $\theta$, and this eventually gives the curvature two-form $\Omega$22:
$\Omega = D\theta = d\theta - \theta \wedge \theta$ (2.5.34)
Let $s' = \{s'_i\}$ be another frame which is related to $s = \{s_i\}$ through local linear transformations as:
$s' = g(x)\, s$ (2.5.35)
where $g(x) \in GL(m, \mathbb{R})$ is a matrix formed by $C^\infty$-functions of $x \in B$. Let $\theta'$ be the connection matrix with respect to $s'$; then we have $Ds' = \theta' \otimes s'$, or equivalently using (2.5.35) and (2.5.29) we obtain:
$D(g(x)s) = dg(x) \otimes s + g(x)\, Ds = dg(x) \otimes s + g(x)(\theta \otimes s) = (dg(x) + g(x)\theta) \otimes s = \theta' g(x) \otimes s$ (2.5.36)
This implies $\theta' g(x) = dg(x) + g(x)\theta$, and since g(x) is invertible, we have:
$\theta' = dg(x) \cdot (g(x))^{-1} + g(x)\, \theta\, (g(x))^{-1}$ (2.5.37)
In Chapter 6 (Eq. (6.6.11)), we shall see that this is the transformation law of a Yang-Mills potential (with $g^{-1}$ in place of g). Differentiate (2.5.37) to obtain23:
$D\theta' = \Omega' = D(dg \cdot g^{-1} + g\, \theta\, g^{-1})$ (2.5.38)
Simplification of the right hand side yields:
$\Omega' = g\, \Omega\, g^{-1}$ (2.5.39)
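where $\Omega'$ is given by (2.5.34) with $\theta'$ in place of $\theta$. The transformation property (2.5.39) is indeed the transformation rule for the Yang-Mills field strength (tensor) under a gauge transformation (see Sec. 6.5). Finally the covariant differential of the two-form matrix $\Omega$, obtained by differentiating (2.5.34) and keeping track of the matrix order, gives the Bianchi identity:
$D\Omega = d\Omega - \theta \wedge \Omega + \Omega \wedge \theta = 0$ (2.5.40)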
This turns out to be the Yang-Mills field equation. We shall return to these ideas in Chapter 6 and later on, in Chapter 10.
22 The reader should return to (2.5.21) to appreciate the difference between the two approaches.
23 $g$ and $g^{-1}$ are in fact $g(x)$ and $(g(x))^{-1}$.
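The covariance rule (2.5.39) can be checked by hand in the simplest nontrivial situation. The sketch below is an illustration under stated assumptions, not the general proof: it takes a connection matrix with constant coefficients, $\theta = A\,dx + B\,dy$ on a two-dimensional base, and a constant change of frame g. Then $d\theta = 0$ and $dg = 0$, so (2.5.34) reduces to $\Omega = -(AB - BA)\,dx \wedge dy$ and (2.5.37) reduces to $\theta' = g\theta g^{-1}$. The matrices A, B and g are hypothetical sample values.

```python
from fractions import Fraction


def mat(rows):
    """Build a 2x2 matrix with exact rational entries."""
    return [[Fraction(x) for x in row] for row in rows]


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]


def inv(x):
    (a, b), (c, d) = x
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def conjugate(g, x):
    """Return g x g^{-1}."""
    return mul(mul(g, x), inv(g))


def curvature(A, B):
    """Coefficient of dx ^ dy in Omega = d(theta) - theta ^ theta
    for the constant connection theta = A dx + B dy (so d(theta) = 0)."""
    return sub(mul(B, A), mul(A, B))


if __name__ == "__main__":
    A = mat([[0, 1], [0, 0]])
    B = mat([[0, 0], [1, 0]])
    g = mat([[2, 1], [1, 1]])
    omega = curvature(A, B)
    omega_prime = curvature(conjugate(g, A), conjugate(g, B))
    print(omega_prime == conjugate(g, omega))
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point approximation. The inhomogeneous term $dg \cdot g^{-1}$ of (2.5.37) plays no role here because g is constant.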
Our next definition describes an important class of bundles known as Associated Bundles.
5.7 Associated Bundles
Definition 2.5.20: Let P(M, G) be a principal fiber bundle and F be a manifold on which G acts on the left. A fiber bundle $E \equiv E(M, F, G, P)$ is called the associated bundle to the principal bundle P(M, G) if its differential structure is constructed in the following manner:
(i) The group G acts on the right on the product manifold $P \times F$, with the rule:
$a : P \times F \to P \times F, \quad (u, \xi)\, a = (ua, a^{-1}\xi)$
where $u \in P$, $\xi \in F$ and $a \in G$. Denote the quotient space of $P \times F$ by this group action as $E = P \times_G F$.
(ii) Let $\phi : P \times F \to M$ be the mapping such that $\phi(u, \xi) = \pi(u)$; the mapping $\phi$ induces a mapping $\pi_E$ called the projection of E onto M. For each $x \in M$ the set $\pi_E^{-1}(x)$ is called the fiber of E over x.
(iii) The isomorphism $\pi^{-1}(U) \cong U \times G$ induces an isomorphism $\pi_E^{-1}(U) \cong U \times F$ for every neighbourhood U of M (see Exercise 12).
(iv) The differential structure in E is assigned by ensuring that $\pi_E^{-1}(U)$ be an open submanifold of E which is diffeomorphic with $U \times F$ under the isomorphism $\pi_E^{-1}(U) \cong U \times F$. The projection $\pi_E$ is then a differentiable mapping of E onto M.
Sometimes the group action is selected to suit the manifold F. For example, F can be G itself and the group action is then the Ad action of G on G, or F can be the Lie algebra $\mathfrak{g}$ of G and the action of G on $\mathfrak{g}$ can naturally be the action ad (see 1.[10]). The associated bundles in this case are denoted $(P \times_{Ad} G) = Ad(P)$ and $(P \times_{ad} \mathfrak{g}) = ad(P)$ respectively.24 We use these bundles in (6.6) while studying the gauge theory.
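Step (i) of the construction can be made concrete in a finite toy model. The sketch below is an illustration only and makes strong simplifying assumptions: the base M is a single point, so that P is just the group G acting on itself by right multiplication, G is the permutation group $S_3$, and F is the set {0, 1, 2} with the obvious left action. It enumerates the orbits of the right action $(u, \xi)\,a = (ua, a^{-1}\xi)$ on $P \times F$ and checks that $[u, \xi] \mapsto u\xi$ is constant on each orbit, so that $P \times_G F$ is identified with F.

```python
from itertools import permutations

# G = S_3, each element stored as a tuple a with a[i] the image of i.
G = list(permutations(range(3)))
F = range(3)


def compose(a, b):
    """Group product: (a b)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(3))


def inverse(a):
    inv = [0] * 3
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def act_right(point, a):
    """Right action on P x F: (u, xi) a = (u a, a^{-1} xi)."""
    u, xi = point
    return (compose(u, a), inverse(a)[xi])


def fiber_coordinate(point):
    """The quantity u.xi, which is unchanged by the right action."""
    u, xi = point
    return u[xi]


def orbits():
    """Partition P x F into orbits, i.e. the points of P x_G F."""
    seen, result = set(), []
    for u in G:
        for xi in F:
            if (u, xi) in seen:
                continue
            orbit = frozenset(act_right((u, xi), a) for a in G)
            seen |= orbit
            result.append(orbit)
    return result


if __name__ == "__main__":
    for orbit in orbits():
        print(len(orbit), {fiber_coordinate(p) for p in orbit})
```

Each orbit has as many elements as G because the action on the first factor is free, and the number of orbits equals the number of points of F, which is the discrete analogue of the fiber of E being a copy of F.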
5.8 Affine Bundle and Affine Connection
Another bundle that is sometimes used is the so-called affine bundle A(M), formed by affine frames of M. The structure group G in this case is $A(n, \mathbb{R})$, the group of affine transformations of $A^n$ (when $\mathbb{R}^n$ is regarded as an affine space25 it is denoted as $A^n$). An arbitrary element $\tilde{a}$ of $A(n, \mathbb{R})$ is
$\tilde{a} = \begin{pmatrix} a & \xi \\ 0 & 1 \end{pmatrix}, \quad a \in GL(n, \mathbb{R}),\ \xi \in \mathbb{R}^n \text{ is a column vector}$ (2.5.41)
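The block form (2.5.41) is chosen so that matrix multiplication reproduces composition of affine maps. The following minimal sketch, with hypothetical integer sample data for n = 2, checks this: the matrix acts on a point $\eta$ written as the column $(\eta, 1)$, giving $a\eta + \xi$, and the product of two such matrices acts as the composite map.

```python
def affine_matrix(a, xi):
    """Return the (n+1) x (n+1) block matrix [[a, xi], [0, 1]]."""
    n = len(a)
    return [list(a[i]) + [xi[i]] for i in range(n)] + [[0] * n + [1]]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def apply_affine(a, xi, eta):
    """Directly compute a.eta + xi."""
    return [sum(a[i][j] * eta[j] for j in range(len(eta))) + xi[i]
            for i in range(len(a))]


def apply_matrix(m, eta):
    """Act with the block matrix m on the column vector (eta, 1)."""
    column = [[c] for c in eta] + [[1]]
    image = matmul(m, column)
    return [row[0] for row in image[:-1]]


if __name__ == "__main__":
    a, xi = [[1, 2], [3, 4]], [5, 6]
    b, zeta = [[0, 1], [1, 1]], [7, 8]
    eta = [1, -1]
    product = matmul(affine_matrix(a, xi), affine_matrix(b, zeta))
    print(apply_matrix(product, eta),
          apply_affine(a, xi, apply_affine(b, zeta, eta)))
```

The bottom row (0, ..., 0, 1) is what keeps the last coordinate equal to 1, so the set of such matrices is closed under multiplication and forms a subgroup of $GL(n+1, \mathbb{R})$.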
The element $\tilde{a}$ of (2.5.41) maps a point $\eta \in A^n$ into $a\eta + \xi$. An affine frame of a manifold M at x consists of a point $p \in A_x(M)$ and a linear frame $(u_1, u_2, \ldots, u_n)$, and it is denoted $(p; u_1, u_2, \ldots, u_n)$. Let the origin of $\mathbb{R}^n$ and its natural basis be denoted by 0 and $(e_1, \ldots, e_n)$; then $(0; e_1, \ldots, e_n)$ gives the canonical frame of $A^n$. Every affine frame $(p; u_1, \ldots, u_n)$ can be identified with $A^n$ via an affine transformation $U : A^n \to A_x(M)$. This 1-1 correspondence between the set of affine frames at x and the set of affine transformations of $A^n$ onto $A_x(M)$ eventually gives the projection mapping $\pi : A(M) \to M$ such that $\pi(U) = x$. The connection resulting from this bundle is called the affine connection on M. Let $\gamma : L(M) \to A(M)$
24 Written out in full these associated bundles are $E(M, G, Ad, P)$ and $E(M, \mathfrak{g}, ad, P)$.
25 An affine subspace (affine hyperplane) of a linear space X is a set of elements of X which can be written as $x = y + x_0$, where $x_0$ is a given point of X and $y \in Y$, a linear subspace of X.
be the map that carries a frame $(u_1, \ldots, u_n)$ to $(0_x; u_1, \ldots, u_n)$, where $0_x \in A_x(M)$ is the point corresponding to the origin of $T_x(M)$, and let $\tilde{\omega}$ and $\omega$ be the connection forms corresponding to A(M) and L(M); then using the pullback map $\gamma^*$ we have:
$\gamma^* \tilde{\omega} = \omega + \alpha$ (2.5.42)
where $\omega$ is a $gl(n, \mathbb{R})$-valued and $\alpha$ is an $\mathbb{R}^n$-valued 1-form (see Kobayashi 1.[10] for details).
5.9 Tensorial and Bundle-valued Forms
In our earlier discussions we have established a correspondence between the connection form on P and the 1-form on M. In the following paragraphs we show that there exists a one-one correspondence between tensorial forms defined on principal bundles and the bundle-valued forms defined on a paracompact manifold M. Let us first explain terms such as tensorial forms mentioned above. Recall that the group action of the structure group G on the principal bundle P(M, G) is a (free) right action denoted by $\rho$ (see Def. 2.5.8), whereas when we consider the associated bundles with manifold F (or vector space V), the group action is a left action on F (or V). In defining the tensorial forms and the bundle-valued forms, both these actions come into play.
Definition 2.5.21: Consider a principal bundle P(M, G) together with a finite dimensional vector space V. Let $r : G \to GL(V)$ be a representation of G on V, and let $\phi \in \Lambda^l(P, V)$ be an l-form on P with values in V. The form $\phi$ is called pseudo-tensorial of type (r, V) if:
$\rho_a^* \phi = r(a^{-1})\, \phi$ for every $a \in G$ (2.5.43)
where $\rho_a$ denotes the right action on P corresponding to the element a. It is easy to note that the connection 1-form $\omega$ on P is pseudo-tensorial of type $(ad, \mathfrak{g})$. The form $\phi$ is called horizontal if
$\phi(X_1, X_2, \ldots, X_l) = 0$ (2.5.44)
whenever some $X_i$ at $p \in P$, $1 \le i \le l$, is a vertical vector. The form $\phi \in \Lambda^l(P, V)$ is called tensorial of type (r, V) if it is horizontal and pseudo-tensorial of type (r, V). Corresponding to every tensorial form there exists a unique l-form, denoted $s_\phi$, on M with values in the associated vector bundle $E = P \times_G V$. The form $s_\phi$ is defined as follows:
$s_\phi(x)(X_1, X_2, \ldots, X_l) = u(\phi(Y_1, Y_2, \ldots, Y_l))$ for every $x \in M$ (2.5.45)
where $u \in \pi^{-1}(x)$, $Y_i \in T_u(P)$, $\pi'(Y_i) = X_i$, $1 \le i \le l$, and u is regarded as a linear mapping from V onto $\pi_E^{-1}(x)$. This ensures that both sides of (2.5.45) are elements of $\pi_E^{-1}(x)$. We note that since $\phi$ is tensorial, the definition of $s_\phi$ is independent of the choice of the point u of the fiber $\pi^{-1}(x)$, and of the vectors $Y_i \in T_u P$. The form $s_\phi$ defined above, belonging to $\Lambda^l(M, E)$, is called the l-form associated to $\phi$. In particular when $\Omega \in \Lambda^2(P, \mathfrak{g})$ is the curvature 2-form $d^\omega \omega$ given by the $\mathfrak{g}$-valued connection 1-form $\omega$ on P, it can be checked that $\Omega$ is a tensorial 2-form of type $(ad, \mathfrak{g})$.26 We have already seen earlier in (2.5.23) that it satisfies the equality:
$d\omega(X, Y) = \Omega(X, Y) - [\omega(X), \omega(Y)]$
26 We are using $d^\omega \omega$ in place of $D\omega$ to emphasize the role of the connection 1-form $\omega$.
which can also be written as: $d\omega = \Omega - \omega \wedge \omega$.
The corresponding form $s_\Omega \in \Lambda^2(M, adP)$ is denoted $F_\omega$ and is known as the curvature 2-form of M. We shall return to this in Chapter 6 while dealing with gauge theory. Next we devote our attention to the notion of metric on fiber bundles. Without going into details we want to note that this is possible only in the case of principal bundles, and that too only on a particular class of these bundles, as will soon be evident from the following definition:
Definition 2.5.22: Let (M, g) be a compact, connected, n-dimensional Riemannian manifold and let G be a compact, semi-simple, m-dimensional Lie group of the principal bundle P(M, G). Let $\langle\ ,\ \rangle_{\mathfrak{g}}$ denote a G-invariant inner product on the Lie algebra $\mathfrak{g}$ of G, and suppose that $\{e^i(x)\}_{1 \le i \le m}$ is a basis for the fiber $(adP)_x$ of the Lie algebra bundle adP. Then using the fact that for every $x \in M$ the metric $g_x$ induces inner products on the tensor spaces and the spaces of differential forms, we can write the inner products for bundle-valued differential forms on M. More precisely, let $\alpha, \beta \in \Lambda^p(M, adP)$ and $\gamma \in \Lambda^q(M, adP)$, with local expressions27
$\alpha(x) = \alpha_i(x) \otimes e^i(x), \quad \beta(x) = \beta_j(x) \otimes e^j(x), \quad \gamma(x) = \gamma_k(x) \otimes e^k(x)$ (2.5.46)
where $\alpha_i(x), \beta_j(x) \in \Lambda^p(M)_x$, $\gamma_k(x) \in \Lambda^q(M)_x$ and $e^i(x), e^j(x), e^k(x) \in (adP)_x$ for every i, j and k; then we can use the above local expressions for the following definitions:
(1) The product $(\alpha, \beta)$, belonging to $\mathcal{F}(M)$, the set of smooth functions on M, is given by the mapping $x \mapsto (\alpha, \beta)_x$ where
$(\alpha, \beta)_x = g_x(\alpha_i(x), \beta_j(x))\, \langle e^i(x), e^j(x) \rangle_{\mathfrak{g}}$ (2.5.47)
(2) The inner product $((\alpha, \beta))$ belonging to $\mathbb{R}$ is:
$((\alpha, \beta)) = \int_M (\alpha, \beta)\, dv_g$ (2.5.48)
where $dv_g$ is the volume element on M with respect to the metric g. The corresponding norms (the first of these being local) resulting from these products are:
$x \mapsto |\alpha|_x = \sqrt{(\alpha, \alpha)_x}$ for every $x \in M$ (2.5.49)(a)
$\|\alpha\| = \sqrt{((\alpha, \alpha))}$ (2.5.49)(b)
(3) The formal adjoint $\delta^\omega$ of $d^\omega : \Lambda^p(M, adP) \to \Lambda^{p+1}(M, adP)$ follows from the definition of adjoint operators, thus28:
$((d^\omega \alpha, \sigma)) = ((\alpha, \delta^\omega \sigma))$ where $\sigma \in \Lambda^{p+1}(M, adP)$ (2.5.50)
27 Repeated indices imply summation.
28 See Chapter 3, Def. (3.1.7).
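(4) The product $\alpha \wedge \gamma \in \Lambda^{p+q}(M)$ is given by:
$(\alpha \wedge \gamma)_x = (\alpha_i(x) \wedge \gamma_k(x))\, \langle e^i(x), e^k(x) \rangle_{\mathfrak{g}} \in \Lambda^{p+q}(M)_x$ (2.5.51)
(5) Finally the bracket $[\alpha, \gamma]_\wedge$ of bundle-valued forms is defined as:
$x \mapsto [\alpha, \gamma]_x = (\alpha_i(x) \wedge \gamma_k(x))\, [e^i(x), e^k(x)] \in \Lambda^{p+q}(M, adP)_x$ (2.5.52)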
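The definitions above depend on the inner product on the Lie algebra being G-invariant, which is what makes the pairing independent of the local trivialization of adP. The following is a minimal sketch under stated assumptions, not part of the text: it takes G = SO(3), the trace form $\langle A, B \rangle = -\mathrm{tr}(AB)$ on so(3) as an example of an invariant inner product, and the adjoint action $A \mapsto gAg^{T}$, and checks the invariance of the inner product together with the compatibility of the bracket used in (5). The rotation g and the elements A, B are hypothetical sample values with rational entries, so all comparisons are exact.

```python
from fractions import Fraction as Fr


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(x):
    return [[x[j][i] for j in range(3)] for i in range(3)]


def trace(x):
    return sum(x[i][i] for i in range(3))


def inner(a, b):
    """Trace form <a, b> = -tr(ab), positive definite on so(3)."""
    return -trace(matmul(a, b))


def adjoint(g, a):
    """Adjoint action of g in SO(3) on so(3): a -> g a g^T."""
    return matmul(matmul(g, a), transpose(g))


def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


# A rotation with rational entries, built from (3,4,5) and (5,12,13) triangles.
RZ = [[Fr(3, 5), Fr(-4, 5), 0], [Fr(4, 5), Fr(3, 5), 0], [0, 0, 1]]
RX = [[1, 0, 0], [0, Fr(5, 13), Fr(-12, 13)], [0, Fr(12, 13), Fr(5, 13)]]
G_ROT = matmul(RZ, RX)

# Two antisymmetric matrices, i.e. elements of so(3).
A = [[0, -1, 2], [1, 0, -3], [-2, 3, 0]]
B = [[0, 4, 0], [-4, 0, 5], [0, -5, 0]]

if __name__ == "__main__":
    print(inner(A, B), inner(adjoint(G_ROT, A), adjoint(G_ROT, B)))
```

Invariance here follows from $g^{T}g = 1$ and the cyclic property of the trace; for a general compact semi-simple group the analogous role is usually played by the negative of the Killing form.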
Note that the wedge product $\wedge$ in (4) and the Lie bracket $[\ ]_\wedge$ in (5) are simply written as $\wedge$ and $[\ ]$ when there is no cause for confusion. All these five definitions are used in principal bundle formalisms of gauge theory in Chapter 6 (see, for instance, Sec. 6.6 and Sec. 6.7). We also remark that the above definitions could be extended to the case of principal bundles where the compactness of M is replaced by paracompactness. Finally we note that for every compact Lie group G a special type of connection, known as the universal connection, can be defined in the following manner.
Remark 2.5.23: Let G be a compact Lie group and n a fixed positive integer. Then there exists a principal bundle $\tilde{P}(N, G)$ (dim N = n) together with a connection $\tilde{\Gamma}$ such that any connection $\Gamma$ on any other principal bundle P(M, G) ($\dim M \le n$) can be obtained as the inverse image of $\tilde{\Gamma}$ by a suitable bundle homomorphism $f$ of P into $\tilde{P}$. In other words, if $\omega$ and $\tilde{\omega}$ are connection forms corresponding to $\Gamma$ and $\tilde{\Gamma}$, then $\omega = f^* \tilde{\omega}$. The connection $\tilde{\Gamma}$ is called the universal connection of "G and the integer n" (see 1.[10] for examples of these connections).
Exercise 2.5
1. Show that a vector field v on $X^n$ is a cross-section of the tangent bundle $T(X^n)$.
2. Show that given the real vector bundle E on a manifold M, a Riemannian metric on E is a smooth section s of the vector bundle $S^2(E)$ on M such that s(x) is a bilinear, symmetric, positive definite map of $E_x \times E_x$ into $\mathbb{R}$ for every $x \in M$.
3. Show that $L(M) = \bigcup_{x \in M} L_x(M)$, formed by the union of frames defined on an n-dimensional differentiable manifold M, is a principal fiber bundle: $L(M)(M, GL(n, \mathbb{R}))$.
4. Establish Result (2.5.9) using the concept of auxiliary functions for the mappings involved.
5. Prove Result (2.5.10).
6. Prove Result (2.5.15).
7. Establish Cartan's structural equations on the principal fiber bundle E.
8. Let $P = M \times G$ be a trivial principal fiber bundle. The canonical flat connection in P is defined by taking the tangent space to $M \times \{a\}$ at $p = (x, a)$ ($p \in P$, $x \in M$, $a \in G$) as the horizontal subspace at p. Show that a connection in P is the canonical flat connection if and only if it is reducible to a unique connection in $M \times \{e\}$.
9. Show that a connection in P(M, G) is flat if and only if the curvature form vanishes identically.
10. Show that the torsion form $\Theta$ satisfies the relation:
(a) $\Theta(u, v) = d\theta(u, v) + \frac{1}{2}[\omega(v)\theta(u) - \omega(u)\theta(v)]$.
11. Let $S^3$ be the unit sphere in $\mathbb{C}^2$ and $S^2$ the unit sphere in $\mathbb{R}^3$, and let $f : S^3 \to \mathbb{C}P^1$ be the natural map $f(z_0, z_1) = [z_0, z_1]$, where $[z_0, z_1]$ stands for the homogeneous coordinates on $\mathbb{C}P^1$; then show
that using the identification of $\mathbb{C}P^1$ with $S^2$ via stereographic projection, a bundle structure giving the principal fiber bundle $S^3(S^2, U(1))$ can be defined with the structure group U(1). The mapping $f : S^3 \to S^2$ is called the Hopf fibration.
12. Show that the isomorphism $\pi^{-1}(U) \cong U \times G$ induces the isomorphism $\pi_E^{-1}(U) \cong U \times F$, where $\pi : P \to M$ and $\pi_E : E \to M$ are the projection maps of the principal bundle P and its associated bundle E.
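For readers who want to experiment with Exercise 11 before attempting it, the Hopf map can be written in explicit coordinates. The sketch below is a numerical illustration and not a solution of the exercise. It assumes the commonly used formula $(z_0, z_1) \mapsto (2\,\mathrm{Re}(z_0 \bar{z}_1),\ 2\,\mathrm{Im}(z_0 \bar{z}_1),\ |z_0|^2 - |z_1|^2)$, which is one way of composing $[z_0, z_1]$ with a stereographic identification of $\mathbb{C}P^1$ with $S^2$; the function names and the sample point are hypothetical.

```python
import cmath
import math


def hopf(z0, z1):
    """Map a point (z0, z1) of the unit sphere S^3 in C^2 to R^3."""
    w = z0 * z1.conjugate()
    return (2 * w.real, 2 * w.imag, abs(z0) ** 2 - abs(z1) ** 2)


def u1_action(z0, z1, t):
    """Right action of exp(it) in U(1) on S^3: both coordinates are rotated."""
    phase = cmath.exp(1j * t)
    return z0 * phase, z1 * phase


def norm(point):
    return math.sqrt(sum(c * c for c in point))


if __name__ == "__main__":
    z0, z1 = complex(0.6, 0.0), complex(0.0, 0.8)   # |z0|^2 + |z1|^2 = 1
    image = hopf(z0, z1)
    moved = hopf(*u1_action(z0, z1, 0.7))
    print(image, norm(image))
    print(moved)
```

The two printed images agree up to rounding, which reflects the fact that the fibers of the map are exactly the U(1) orbits, and the image has unit length because $4|z_0|^2|z_1|^2 + (|z_0|^2 - |z_1|^2)^2 = (|z_0|^2 + |z_1|^2)^2$.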
Hints to Exercise 2.5
1. By definition, a section (cross-section) of the bundle $(T(X^n), X^n, \pi)$ is a mapping $s : X^n \to T(X^n)$ such that $\pi \circ s = \mathrm{Id}_{X^n}$. Denoting the vector field on $X^n$ by v, we wish to check that v fulfills this requirement. We know that $T(X^n)$ is formed by the pairs $(x, v_x)$ where $v_x$ is a tangent vector in $T_x(X^n)$, the tangent space at $x \in X^n$. The mapping $\pi$ carries the pair $(x, v_x)$ to x. Hence if $v : x \mapsto (x, v_x)$, then evidently $\pi \circ v$ is the identity mapping on $X^n$.
2. To begin with, we recall that a Riemannian metric on a differentiable manifold M is a smooth section of the vector bundle $S^2(T(M))$ on M, where $S^2(T(M))$ denotes the bundle whose fiber on $x \in M$ is the vector space $S^2(T_x M)$ of symmetric bilinear maps of $T_x M \times T_x M$ into $\mathbb{R}$. In fact this is also referred to as the Riemannian metric on the tangent bundle T(M). Having said this, it is clear that when T(M) is replaced by a general real vector bundle E on M, then a Riemannian metric on E is a smooth section s of the vector bundle $S^2(E)$ on M such that s(x) is a bilinear, symmetric and positive definite map of $E_x \times E_x$ into $\mathbb{R}$ for every $x \in M$. A Riemannian vector bundle is a pair (E, s) where E is a real vector bundle and s is a Riemannian metric on E.
3. By definition, a triple (P, M, G) is a principal fiber bundle P(M, G) over a differentiable manifold M if the Lie group G can be identified with the typical fiber F of the manifold P, and G acts freely on P on the right. A frame $U = (u_1, \ldots, u_n)$ at a point $x \in M$ is an ordered basis of the tangent space $T_x M$. Let $L_x(M)$ denote the set:
(i) $L_x(M) = \{U \mid U \text{ is a frame at } x \in M\}$
and let
(ii) $L(M) = \bigcup_{x \in M} L_x(M)$.
Define the projection $\pi : L(M) \to M$ such that $U \mapsto x$. The group $G = GL(n, \mathbb{R})$ acts freely on L(M) on the right:
(iii) $(U, g) \mapsto Ug = (u_i g^i_1, \ldots, u_i g^i_j, \ldots, u_i g^i_n)$ (summation over i)
where $g = (g^i_j) \in GL(n, \mathbb{R})$. Furthermore L(M) can be given a manifold structure such that this $GL(n, \mathbb{R})$-action is smooth and $GL(n, \mathbb{R})$ can be identified with a typical fiber $L_x(M)$ of L(M). Hence L(M) is a principal fiber bundle denoted $L(M)(M, GL(n, \mathbb{R}))$.
4. To establish the required result we first define the auxiliary functions for the mappings defined on X and G, and then go on to define auxiliary functions for the mappings $L_g$ and $R_g$. Let the local coordinate expressions for $x \in X$ and $g \in G$ be $(x^i)$ and $(g^a)$, which belong to $\mathbb{R}^n$ and $\mathbb{R}^p$ respectively. The mappings
$\sigma : G \times X \to X$ with $(g, x) \mapsto \sigma(g, x)$,
$\sigma_g : X \to X$ with $\sigma_g(x) = \sigma(g, x)$, (i)
$\sigma_x : G \to X$ with $\sigma_x(g) = \sigma(g, x)$
have the same local coordinate representation $(\sigma^i(g^a, x^j))$ that belongs to $\mathbb{R}^n$. The derivatives $\sigma'_g : T_x(X) \to T_{\sigma_g(x)}(X)$ and $\sigma'_x : T_g(G) \to T_{\sigma_x(g)}(X)$ are respectively represented as
$(\sigma'_g(x))^i_j = \frac{\partial \sigma^i(g^a, x^k)}{\partial x^j}, \qquad (\sigma'_x(g))^i_a = \frac{\partial \sigma^i(g^b, x^k)}{\partial g^a}$. (ii)
The representative of $\sigma'_x$ at $e \in G$ (i.e., when g = e) is the set of auxiliary functions $\sigma^i_a(x^j)$. We now replace X by G in (i) to write the local expressions for left (right) translation $L_g h = gh$ ($R_g h = hg$) as $(gh)^a = L^a(g^b, h^c)$ ($(hg)^a = R^a(g^b, h^c)$). Similarly replacing X by G in (ii), we have the left auxiliary (right auxiliary) function $L^a_b(g)$ ($R^a_b(g)$), which is given by:
$L^a_b(g) = \left.\frac{\partial L^a(g^c, h^d)}{\partial h^b}\right|_{h=e} \qquad \left(R^a_b(g) = \left.\frac{\partial R^a(g^c, h^d)}{\partial h^b}\right|_{h=e}\right)$ (iii)
We emphasize that although the result (2.5.9) is being proved here for left invariant vector fields, it also holds good for right invariant vector fields. Recall that a vector field v is called left invariant if
$L'_g v(h) = v(L_g h) = v(gh)$ for every $g, h \in G$. (iv)
Denote the value of a vector field v at e by $\gamma$, i.e., $\gamma = v(e)$; then at h = e (iv) becomes:
$L'_g \gamma = v(g)$. (v)
This can be locally written in terms of left auxiliary functions as
$v^a(g) \frac{\partial}{\partial g^a} = L^a_b(g)\, \gamma^b\, \frac{\partial}{\partial g^a}$. (vi)
Conversely, $L'_g v(e) = v(g)$ implies that v is left invariant in view of the following equalities:
$v(L_g h) = v(gh) = L'_{gh} v(e) = (L_g \circ L_h)' v(e) = L'_g \circ L'_h v(e) = L'_g v(h)$. (vii)
The above proof does not explicitly mention the correspondence between the left invariant vector field and the tangent vector. We emphasize this point by noting that $\gamma = v(e)$ is indeed the tangent vector at $e \in G$, i.e., $\gamma \in T_e(G)$.
5. We have to show that if v and w are left invariant vector fields on G, then their Lie bracket is also left invariant, which means that they are closed under the Lie bracket operation. In order to prove it, we have to establish:
$L'_g [v, w] = [L'_g v, L'_g w] = [v, w]$. (i)
Prior to showing this, however, we make two important remarks in this connection, namely (a) given a smooth function $f \in C^\infty(G)$ the vector fields v, w determine smooth functions v(f) and w(f) that belong to $C^\infty(G)$; (b) the composition vw considered as an operator on $C^\infty(G)$ does not in general determine a vector field, whereas the difference (vw - wv) does, which means that the collection of vector fields on G is closed under the Lie bracket operation. Thus in order to prove (i), a smooth function which is not explicitly there has to be brought into the picture. Note that in this context (v) and (vi) of the above Hint to Exercise 4 look like:
$(L'_g \gamma) f = v(f(g))$ (ii)
where f(g) is the value of f at the point $g \in G$, and
$L^a_b(g)\, \gamma^b\, \frac{\partial f}{\partial g^a} = v^a(g)\, \frac{\partial f}{\partial g^a}$. (iii)
In view of remark (b), $L'_g(vw)$ has no meaning in our context, whereas $L'_g v(w(f)) = (L'_g v)\, w(f) = v(w(f))$ does. Similarly writing $L'_g w(v(f)) = (L'_g w)\, v(f) = w(v(f))$, we have:
$L'_g [v, w] f = L'_g ((vw - wv) f) = (L'_g v)\, w(f) - (L'_g w)\, v(f) = v(w(f)) - w(v(f)) = (vw - wv) f$. (iv)
To show that $L'_g [v, w] = [L'_g v, L'_g w]$, we only observe that retracing our steps in (iv) and using (ii), $(vw - wv) f$ in fact implies $(L'_g v\, L'_g w - L'_g w\, L'_g v) f$, which leads to $[L'_g v, L'_g w]$.
6. Since X is paracompact, a result proved in a neighbourhood can be extended to the whole manifold through a partition of unity (see Boothby 0.[1]). Let $\{U_i\}$ be the covering of X associated to the fiber bundle structure of E, with $\{\psi_i\}$ the corresponding mappings on E, and let $\omega$ be a differential 1-form on X with values in $\mathfrak{g}$, the vector space of the Lie algebra of G. Suppose that $p_0$ is a point in E such that $\psi_i(p_0) = (x_0, g_0)$ where $x_0 \in U_i$ and $g_0 \in G$; then any vector $v \in T_{p_0}(E)$ can be expressed as:
$v = u_1 + u_2$ (i)
where $\psi'_i u_1 \in T_{(x_0, g_0)}(U_i \times \{g_0\})$ and $\psi'_i u_2 \in T_{(x_0, g_0)}(\{x_0\} \times G)$. This means that $u_2$ is a vertical vector. Define $\omega_{p_0}$ as the connection form at $p_0$ by the equality:
$\omega_{p_0}(v) = \omega(\pi' u_1) + \tilde{u}_2$, (ii)
where $\tilde{u}_2 \in \mathfrak{g}$ denotes the Lie algebra element corresponding to the vertical vector $u_2$. The transitive action of G on each fiber implies that the form $\omega_p$ can be defined at any other point $p = R_g p_0$ by the relation:29
$\omega_p(R'_g v) = Ad(g^{-1})\, \omega_{p_0}(v)$. (iii)
In view of this relation, we conclude that there is a form $\omega_i$ defined on $\pi^{-1}(U_i)$ with values in $\mathfrak{g}$. Next suppose that $\{\theta_i\}$ is a partition of unity on X subordinate to the covering $\{U_i\}$; then the form $\omega = \sum_i (\theta_i \circ \pi)\, \omega_i$ is the required connection form on E.
7. To establish the equality (2.5.23) we shall have to consider three different combinations of vector fields u and v, or of vectors $u_p$ and $v_p$, namely both $u_p$ and $v_p$ horizontal, one horizontal and the other vertical, and both vertical. Now $\Omega(u, v) = D\omega(u, v) = d\omega(hu, hv)$. This means that we have to show that:
$d\omega(hu, hv) = d\omega(u, v) + [\omega(u), \omega(v)]$ (i)
in all three cases.
29 See (ii) of Def. (2.5.14).
Case 1: In view of (2.5.20), which says $\omega(v_p) = 0$ for $v_p \in H_p$, it follows that $\omega(u_p) = \omega(v_p) = 0$, hence $[\omega(u_p), \omega(v_p)] = 0$, and since $hu_p = u_p$ and $hv_p = v_p$, we have: $D\omega(u_p, v_p) = d\omega(u_p, v_p)$.
Case 2: Extend both $u_p$ and $v_p$ in a neighbourhood of p to vector fields u and v which are respectively horizontal and vertical; then $\omega(u) = 0$ and, $\omega(v)$ being a fixed element of $\mathfrak{g}$, $u\omega(v) = u^a \frac{\partial}{\partial e^a}(\text{const.}) = 0$, where $\{e^a\}$ stands for local coordinates in $H_p$. This means that not only $[\omega(u), \omega(v)] = [0, \omega(v)] = 0$, but also $d\omega(u, v) = u\omega(v) - v\omega(u) - \omega[u, v] = 0$, as the Lie bracket [u, v] in the third term is horizontal. Since $D\omega(u, v) = d\omega(hu, hv) = d\omega(hu, 0) = 0$, we note that both sides are identically zero, and hence the equality holds good.
Case 3: Extend u and v to Killing vector fields in the neighbourhood of p, and note that the algebra of Killing vector fields is isomorphic to $\mathfrak{g}$. As a consequence it follows that: $d\omega(u, v) = u\omega(v) - v\omega(u) - \omega[u, v] = 0 - 0 - \omega[u, v] = -[\omega(u), \omega(v)]$, so that $d\omega(u, v) + [\omega(u), \omega(v)] = 0$. (Note that we have established an equality similar to $\omega[u, v] = [\omega(u), \omega(v)]$ in Exercise 5, while dealing with left invariant vector fields.) On the other hand, $D\omega(u, v) = d\omega(hu, hv) = d\omega(0, 0) = 0$. Hence the equality is identically zero on both sides.
8. The result to be proved here is, in a way, a restatement of the definition of the canonical flat connection on a principal fiber bundle. This bundle has to be a trivial fiber bundle for the definition to be meaningful. In order to establish the result, we use the principal bundle homomorphism between the two bundles $P = M \times G$ and $P' = M \times \{e\}$ given by the mappings $\phi : P \to P'$, $\gamma : G \to \{e\}$, which trivially satisfy the needed relation $\phi(pb) = \phi(p)\gamma(b) = p'e$, where $p \in P$ and $b \in G$. Note that $p = (x, a) \in M \times G$ and $p' = (x, e) \in M \times \{e\}$. $P' = M \times \{e\}$ is the sub-bundle of the principal fiber bundle P. The tangent space $T_x M$ at $x \in M$ (for every x) can be identified with the horizontal subspace $H_{p'}$ of the bundle P' at the point (x, e). This defines a connection on $M \times \{e\}$, and due to the bundle homomorphism it defines a connection on P. Moreover the canonical 1-form $\theta$ on G, which is the left invariant $\mathfrak{g}$-valued 1-form determined by $\theta(A) = A$, $A \in \mathfrak{g}$, gives rise to the connection form $\omega$ in the following manner. Let $f : M \times G \to G$ be the projection mapping; then
$\omega = f^* \theta$ (i)
defines the connection form of the canonical flat connection. The curvature form corresponding to $\omega$ is zero, since
$d\omega = d(f^*\theta) = f^*(d\theta) = f^*\left(-\frac{1}{2}[\theta, \theta]\right) = -\frac{1}{2}[f^*\theta, f^*\theta] = -\frac{1}{2}[\omega, \omega]$ (ii)
(see Exercise 7). The Cartan structural equation when written for the 1-form $\theta$ on G is called the Maurer-Cartan equation:
$d\theta = -\frac{1}{2}[\theta, \theta]$.
9. A connection in P(M, G) is called flat if every point $x \in M$ has a neighbourhood U such that the induced connection in the restricted bundle $P|_U = \pi^{-1}(U)$ is isomorphic with the canonical flat connection in $U \times G$. In other words, P admits a flat connection if there exists an isomorphism
$\phi : \pi^{-1}(U) \to U \times G$ which maps the horizontal subspace at $u \in \pi^{-1}(U)$ upon the horizontal subspace at
$\Theta(u, v) = D\theta(u, v) = d\theta(hu, hv)$ (a)
where $h : T_p(P) \to H_p$ is the mapping that carries $v \mapsto v_h$, the horizontal component of v. We further note that since the connection form $\omega$ is a $\mathfrak{g}$-valued 1-form and $\theta$ is an $\mathbb{R}^n$-valued 1-form they satisfy:
u vertical: $\omega(u) \neq 0$, $\theta(u) = 0$ (b)
u horizontal: $\omega(u) = 0$, $\theta(u) \neq 0$. (c)
Now the equality (a) in the Exercise has to be verified (just as we did in Exercise 7) for different combinations of u and v, e.g., both horizontal or both vertical, or u vertical and v horizontal. The first two cases turn out to be trivial (see the steps taken in Exercise 7), hence we establish the equality for the third case. Suppose that a vertical vector u corresponds to the fundamental vector field A*, i.e. u = A*, and v corresponds to the standard horizontal vector $(B(\alpha))_p$. We shall write u and v as $A^*_p$ and $(B(\alpha))_p$ when required. Evidently $\Theta(u, v) = 0$, since $d\theta(0, hv) = 0$. Also
30 The vector field A* is induced by the 1-parameter group of transformations $R_{a_t}$ where $a_t = \exp tA$, whereas the vector field $(R_a)_* A^*$ is induced by the 1-parameter group of transformations $R_a R_{a_t} R_{a^{-1}} = R_{a^{-1} a_t a}$. The latter group is generated by $(ad(a^{-1}))A \in \mathfrak{g}$.
31 Note that these two conditions completely determine $B(\xi)$ for each $\xi \in \mathbb{R}^n$.
$\omega(v) \cdot \theta(u) = 0$ in view of (b) or (c), whereas $\omega(u)\theta(v) = \omega(A^*)\,\theta((B(\alpha))_p) = A\alpha$ in view of properties (ii) and (iii). On the other hand, writing $2\,d\theta(u, v) = u\theta(v) - v\theta(u) - \theta[u, v]$ we note that the first term vanishes since it is the derivative of a constant n-tuple, the second term vanishes as $\theta(u) = 0$, and the third term can be written as:
$\theta[A^*_p, (B(\alpha))_p]$. (d)
Using the property (v) followed by (iii), we see that (d) equals $A\alpha$; and this establishes our result.
11. To define a principal bundle P(M, G) we are supposed to have a Lie group G whose (free) right action $\rho$ on P is such that the orbits of $\rho$ are the fibers of $\pi : P \to M$, i.e., $\pi$ can be identified with the canonical projection $P \to P/G$. Also for every $\psi_U : U \times G \to \pi^{-1}(U)$ (considered as a local trivialization) one has:
$\psi_U^{-1}(u_x g) = \psi_U^{-1}(u_x)\, g \quad \forall\ u_x \in P_x,\ g \in G$ (i)
where $u_x g = \rho(u_x, g)$ and x is a point of $U \subset M$ (see Def. (2.5.3) and (2.5.8) to appreciate the use of the mapping $\psi^{-1}$ here). Now $S^3$ is the unit sphere in $\mathbb{C}^2$ and there exists a natural map $f : S^3 \to \mathbb{C}P^1$ which sends the pair $(z_0, z_1) \in S^3$ to the equivalence class $[z_0, z_1] \in \mathbb{C}P^1$. In Exercise (1.3.8) we have already seen that $S^2$ can be stereographically identified with $\mathbb{C}P^1$. Hence using the right action of the Lie group U(1) on $S^3$, the requirements of a principal fiber bundle (given above) can be easily verified (see Hint to Exercise 3).
12. By definition of a principal bundle, every point $x \in M$ has a neighbourhood U such that $\pi^{-1}(U) \cong U \times G$. We identify $\pi^{-1}(U)$ with $U \times G$ and note that the action of G on the right on $\pi^{-1}(U) \times F$ is indeed the action on $U \times G \times F$, which is given as:
$(x, g, \alpha) \to (x, g, \alpha) \circ h = (x, gh, h^{-1}\alpha)$ (i)
where $x \in U$, $g, h \in G$ and $\alpha \in F$. This means that the isomorphism $\pi^{-1}(U) \cong U \times G$ induces an isomorphism $\pi_E^{-1}(U) \cong U \times F$.
A Note on Hints: Exercise (2.2): Exercises 4, 5, 10 and 11 can be solved with the help of Refs. [13], 4.[4], 4.[8] and 4.[9]. Exercises 6 and 7 will be appreciated better after Chapters 6 and 9. Exercise (2.3): a good source for hints to these exercises is the article by A. A. Kirillov in [19] and the text 3.[11].
References
1. R. Abraham, Piecewise differentiable manifolds and the space-time general relativity, J. Math. Mech. 11 (1962), 553-592.
2. V. I. Arnold, (a) On the topology of three-dimensional steady flows of an ideal fluid, Prikl. Mat. Meh. 30 (1966), 183-185 (Russian), translated as J. Appl. Math. Mech. 30 (1966), 223-226; (b) Singularities of smooth mappings (Russian), Uspehi Mat. Nauk 23 (1968), no. 1 (139), translated as Russian Math. Survey.
3. A. Borel, Linear Algebraic Groups (Benjamin, 1969).
4. R. Bott, An application of the Morse Theory to the Topology of Lie groups, Bull. Soc. Math. France 84 (1956), 251-282.
5. R. W. Carter, Finite Groups of Lie Type (Wiley-Interscience, 1985).
6. H. Chandra, Automorphic Forms on Semisimple Lie Groups (Springer-Verlag Lecture Notes 62, 1968).
7. C. Chevalley, Theory of Lie Groups I (Princeton University Press, 1946).
8. C. M. DeWitt and J. A. Wheeler (ed.), Lectures in Mathematics and Physics, Battelle Rencontres (1967) (New York: W. A. Benjamin, 1968); R. Bott and J. Mather, Topics in Topology and Differential Geometry (New York: W. A. Benjamin, 1968).
9. F. Digne and J. Michel, Representations of Finite Groups of Lie Type (Cambridge University Press, 1991).
10. D. G. Ebin and J. Marsden, Groups of diffeomorphism and the motion of an incompressible fluid, Ann. of Math. (2) 92 (1970), 102-163.
11. J. Eells, Jr., On the Geometry of Function Spaces. Symposium internacional de topologia algebraica [International symposium on algebraic topology], 303-308. Universidad Nacional Autonoma de Mexico and Unesco, Mexico City, 1958.
12. C. Ehresmann, Les Connexions Infinitésimales dans un Espace Fibré Différentiable (Colloque de Topologie, Bruxelles, 1950).
13. M. Gourdin, Basics of Lie Groups (Editions Frontieres, 1982).
14. W. Greub, S. Halperin and R. Vanstone, Connections, Curvature, and Cohomology, Vol. 1: De Rham Cohomology of Manifolds and Vector Bundles (New York: Academic Press, 1972).
15. V. Guillemin and A. Pollack, Differential Topology (New Jersey: Prentice-Hall, Englewood Cliffs, 1974).
16. S. Helgason, Differential Geometry, Lie Groups and Symmetric Spaces (New York: Academic Press, Inc., 1978).
17. F. Hirzebruch, Topological Methods in Algebraic Geometry (3rd enlarged ed., New York: Springer-Verlag, 1966).
18. D. Husemoller, Fiber Bundles (2nd ed., Berlin: Springer-Verlag, 1975).
19. A. A. Kirillov (ed.), Representations of Lie Groups and Lie Algebras (Akademiai kiado, Budapest, 1985).
20. B. A. Kupershmidt, in G. Kaiser and J. E. Marsden, Geometric Methods in Mathematical Physics (Berlin: Springer-Verlag, 1980).
21. S. Lang, (a) Differential Manifolds (Addison-Wesley, Reading, MA, 1972); (b) SL2(R) (New York: Springer-Verlag, 1985).
22. S. Lie, Theorie der Transformationsgruppen (Leipzig: B. G. Teubner, 1888).
23. G. W. Mackey, Unitary Group Representations in Physics, Probability and Number Theory (The Benjamin/Cummings Publishing Company, 1978).
24. K. B. Marathe and G. Martucci, 10.[35].
25. J. Marsden, D. G. Ebin and A. Fischer, Diffeomorphism Groups, Hydrodynamics and Relativity (Toronto? Pub: s.n.)?
26. D. Mumford, Abelian Varieties, Bombay Lectures (2nd ed., Tata Inst. of Fund. Re., Bombay, India, 1974).
27. H. Omori, Infinite Dimensional Lie Transformation Groups (Springer-Verlag, 1974).
28. S. Smale and R. S. Palais, What is global analysis, Amer. Math. Monthly 76 (1969), 4-9.
29. N. E. Steenrod, The Topology of Fiber Bundles (Princeton University Press, 1951).
30. S. Sternberg, Lectures on Differential Geometry (Prentice-Hall, 1965).
31. D. Sundararaman, (a) Topics in Several Complex Variables (Boston: Pitman, 1985); (b) Moduli Deformation and Classification of Compact Complex Manifolds (Marshfield, MA: Pitman, 1980).
Mathematical Perspectives on Theoretical Physics
32. G. Warner, Harmonic Analysis on Semisimple Lie Groups (Springer-Verlag, 1972). 33. F. W. Warner, Foundations of Differentiable Manifolds and Lie Groups (Glenview, IL: Scott, Foresman and Company, 1971). 34. S.-T. Yau (ed.), Geometry, Topology, and Physics for Raoul Bott (International Press, 1995). 35. H. Zassenhaus, Lie Groups, Lie Algebras and Representation Theory (Les Presses De l'Universite de Montreal, 1981).
CHAPTER 3
A PRIMER ON OPERATORS
1 DEFINITIONS AND EXAMPLES
In physics as well as in mathematics, operators play an important role in physical and algebraic structures. We therefore give some elementary details of operator theory before using operators as tools in other theories. It is well known that the search for solutions (ansatz) of a given system often becomes easier when the system is cast in operator formalism. Symbolically, given an equation $L(x) = a$, where $a$ is known and $L$ stands for a well-defined operation, we solve for the unknown $x$. Thus in the set of equations:
\[ l_{11} x_1 + l_{12} x_2 = a_1, \qquad l_{21} x_1 + l_{22} x_2 = a_2 \]
the pairs $(a_1, a_2)$, $(x_1, x_2)$ and the $(2 \times 2)$ matrix
\[ \begin{pmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{pmatrix} \]
represent respectively $a$, $x$ and the operator $L$. The operator may be defined only for a certain class of elements (i.e., those amongst which the unknown is supposed to lie); this set is called the domain of definition of $L$, denoted $D(L)$. As the elements of $D(L)$ vary, the results of the operation define another set called the range of $L$, denoted $R(L)$. If the solution to the equation $L(x) = a$ is unique, we say that the operator $L$ possesses an inverse, denoted $L^{-1}$. Thus if the matrix in the above example is non-singular, $L$ has an inverse $L^{-1}$. The example we have chosen here illustrates an important class of operators known as linear operators. It is this class of operators with which we shall deal in general.
In order to study different aspects of operators, we begin by considering the cartesian product $\mathcal{A} \times X$ of two sets. A mapping from $\mathcal{A} \times X$ into $X$: $(a, x) \mapsto ax = y$, called an external operation on $X$, assigns the role of operators to elements of $\mathcal{A}$. Thus the operator here is a procedure that transforms a given element of $X$ into another element of $X$ (the range set of the operator in this case is included in the domain set $X$). To explain the algebraic structure of the set $\mathcal{A}$, we assume that $X$ is a set of functions (continuous or differentiable as required) defined on an arbitrary bounded domain (Euclidean, Hilbertian¹ or a compact smooth manifold). In this setting generic elements of $\mathcal{A}$ and $X$ are respectively $\Omega$, and $f$ and $g$; thus we have $\Omega f = g$. Simple examples of $\Omega$ are the multiplication by a constant and the derivation, i.e.,
\[ \Omega f = \lambda f = g \quad (\Omega = \lambda) \qquad \text{and} \qquad \Omega f = \frac{df}{dx} = g \quad \left(\Omega = \frac{d}{dx}\right). \]
An operator is called linear if it satisfies:
\[ \Omega(a f_1 + b f_2) = a\,\Omega f_1 + b\,\Omega f_2. \tag{3.1.1} \]
Evidently the 'domain' of a linear operator is a linear (vector) space and the 'range' is also a linear space. Note that the operator which squares, or more generally exponentiates, a given element $f$ is not a linear operator. Operators of this type are non-linear operators.
¹ The Hilbert spaces considered throughout this chapter are complete and separable. (See 0.2.14 for the definition.)
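The linearity condition (3.1.1) can be tested on simple inputs. The following is only a minimal sketch: it assumes polynomials stored as coefficient lists, and all helper names (`trim`, `combine`, `derivative`, `square`, `satisfies_linearity`) are hypothetical choices made for this illustration. It checks the condition for one pair of functions, which can refute linearity but does not prove it.

```python
# Polynomials are stored as coefficient lists: [c0, c1, c2] means c0 + c1*x + c2*x**2.

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def combine(a, p, b, q):
    """Return the polynomial a*p + b*q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a * x + b * y for x, y in zip(p, q)])

def derivative(p):
    """Omega = d/dx acting on a polynomial."""
    return trim([k * c for k, c in enumerate(p)][1:] or [0])

def square(p):
    """Omega f = f*f, the squaring operator."""
    out = [0] * (2 * len(p) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(p):
            out[i + j] += x * y
    return trim(out)

def satisfies_linearity(op, a, f1, b, f2):
    """Check Omega(a*f1 + b*f2) == a*Omega(f1) + b*Omega(f2) for one choice of inputs."""
    return op(combine(a, f1, b, f2)) == combine(a, op(f1), b, op(f2))

f1 = [1, 2, 3]   # 1 + 2x + 3x^2
f2 = [0, 1]      # x
derivative_ok = satisfies_linearity(derivative, 2, f1, -1, f2)
square_ok = satisfies_linearity(square, 2, f1, -1, f2)
```

With these inputs the derivative passes the check while the squaring operator fails it, in line with the remark on non-linear operators.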
1.1 Properties of a Linear Operator
Definition 3.1.1: Consider a linear operator $\Omega$ defined on a topological vector space $X$ (see Def. (0.2.7)). The operator $\Omega$ is continuous if $\Omega\psi_n \to \Omega\psi$ for any sequence of vectors $\{\psi_n\}$ in $X$ that converges to a limit vector $\psi$ in $X$. It is bounded if there exists a positive number $b$ such that $\|\Omega\psi\| \le b\,\|\psi\|$ for every vector $\psi$. The smallest number $b$ with this property is called the norm of $\Omega$ and is denoted $\|\Omega\|$. It can be shown that a linear operator is continuous if and only if it is bounded. The set of bounded operators on $X$ is denoted $\mathcal{B}(X)$. The sum and product of two operators are defined as:
\[ \Omega f = (\Omega_1 + \Omega_2) f = \Omega_1 f + \Omega_2 f, \qquad \text{and} \qquad \Omega f = (\Omega_1 \Omega_2) f = \Omega_1(\Omega_2 f). \tag{3.1.2} \]
While $\Omega_1 + \Omega_2 = \Omega_2 + \Omega_1$, the product $\Omega_1\Omega_2$ is not always equal to $\Omega_2\Omega_1$. For example, when $\Omega_1$ is multiplication of $f$ by $x$ and $\Omega_2 = \frac{d}{dx}$,
\[ (\Omega_1\Omega_2) f \neq (\Omega_2\Omega_1) f. \]
If however $f = f(x, y)$² and
\[ \Omega_1 = \frac{\partial}{\partial x} \quad \text{and} \quad \Omega_2 = \frac{\partial}{\partial y}, \qquad \text{then} \qquad \Omega_1\Omega_2 f = \Omega_2\Omega_1 f = \frac{\partial^2 f}{\partial x\,\partial y}. \]
Sums, scalar multiples and products of bounded operators are bounded. In view of Def. (4.1.1), $\mathcal{B}(X)$ is an algebra.
Definition 3.1.2: A vector subspace $X' \subset X$ is said to be invariant under the action of an operator $\Omega$ if any vector $\psi$ in $X'$ is transformed to another vector $\psi'$ in $X'$. Also, if $\{e_i\}$ denotes an orthonormal basis in $X$, the operator $\Omega$ has a matrix representation given by:
\[ \Omega(e_i) = \sum_j \Omega_{ji}\, e_j. \tag{3.1.3} \]
² $f$ is assumed to be smooth throughout.
Definition 3.1.3: We say that a linear operator $\Omega$ has an inverse $\tilde{\Omega}$ if $\tilde{\Omega}\Omega = \Omega\tilde{\Omega} = 1$; we denote it by $\Omega^{-1}$.
Result 3.1.4: If $\Omega$ is a linear operator on an n-dimensional vector space with basis $\{e_i\}$, it can be shown that each of the following four conditions is necessary and sufficient for $\Omega$ to possess an inverse.
(1) There is no non-zero vector $\psi$ that satisfies $\Omega\psi = 0$.
(2) The set of n vectors $\Omega e_i$ $(i = 1, \ldots, n)$ is linearly independent.
(3) There is a linear operator $\tilde{\Omega}$ such that $\tilde{\Omega}\Omega = \Omega\tilde{\Omega} = 1$.
(4) The matrix $[\Omega_{ij}]$ corresponding to $\Omega$ has non-zero determinant.
Among linear operators with inverses, there are those which preserve the scalar product of a vector space (assuming that a scalar product is defined on it). These are called unitary operators. We denote them as $U$ and we note that for any vector $\psi$ in a given vector space $X$
\[ \|U\psi\| = \|\psi\| \tag{3.1.4} \]
holds. We would like to mention here that (3.1.4) does not imply that an operator satisfying it possesses an inverse. In the case of finite-dimensional vector spaces it has an inverse, but for infinite-dimensional vector spaces this is not always true. For instance, consider the space $l^2$ formed by infinite sequences³ $\psi = (x_1, x_2, \ldots, x_n, \ldots)$ and let $A$ be an operator such that
\[ A\psi = (0, x_1, x_2, \ldots, x_n, \ldots). \tag{3.1.5} \]
Evidently $\|A\psi\| = \|\psi\|$ for every vector $\psi$, but $A$ has no inverse. The following results about unitary operators can be easily checked. Result 3.1.5: If U is a unitary operator and set of vectors (
(3.1.6)
for any vectors $\phi$, $\psi$ in $X$. The bounded operator $\Omega$ is called self-adjoint (Hermitian) if $\Omega = \Omega^{+}$ (see also Eq. (3.1.13)).
1.2 Matrix Representation of a Linear Operator
When we talk of operators in quantum mechanics (see 9A.14-18), we denote the orthonormal basis as $|\psi_i\rangle$ and rewrite (3.1.3) as
\[ \Omega\,|\psi_i\rangle = \sum_j A_{ji}(\Omega)\,|\psi_j\rangle. \tag{3.1.7} \]
The basis vectors $|\psi_i\rangle$ are called the basis state vectors in $X$, and $A_{ij}(\Omega)$ is the matrix representation of $\Omega$ with respect to this basis. It is easy to check that (3.1.3) and (3.1.7) lead respectively to
\[ (e_i, \Omega(e_j)) = \Omega_{ij}, \tag{3.1.8} \]
and
\[ \langle\psi_i|\Omega|\psi_j\rangle = A_{ij}(\Omega).^{4} \tag{3.1.9} \]
Also, as the scalar product in quantum mechanics is given by an integral, equality (3.1.9) can be written as:
\[ A_{ij}(\Omega) = \int \psi_i^{*}(x)\,\Omega\,\psi_j(x)\,dx.^{5} \tag{3.1.10} \]
Using (3.1.10) the unitarity condition (3.1.4) can now be expressed as:
\[ \|U\psi\|^2 = \int (U\psi)^{*}(x)\,U\psi(x)\,dx = \int \psi^{*}(x)\,\psi(x)\,dx = \|\psi\|^2. \tag{3.1.11} \]
Returning to the use of functions for further study, we note that given an operator $\Omega$, its complex conjugate operator $\Omega^{*}$ is defined by the relation:
\[ (\Omega f)^{*} = \Omega^{*} f^{*}. \tag{3.1.12} \]
Pursuing this scheme of ideas, we say that $\Omega$ is Hermitian if for any pair of functions $f$, $g$ it satisfies:
\[ \int f^{*}\,\Omega g\,dx = \int g\,\Omega^{*} f^{*}\,dx. \tag{3.1.13} \]
In view of (3.1.7) and (3.1.8) it can be easily verified that $\Omega$ is Hermitian if
\[ \langle f|\Omega|g\rangle = \langle g|\Omega|f\rangle^{*}. \tag{3.1.14} \]
³ $l^2$-space = the space formed by sequences $(x_1, \ldots, x_n, \ldots)$ such that $\sum_{n=1}^{\infty} |x_n|^2 < \infty$, with addition and scalar multiplication defined componentwise. It is also denoted as $\ell^2$-space (see Exercise 6 in 0.2).
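In finite dimensions, relations (3.1.8) and (3.1.14) can be checked directly. The sketch below is an illustration under stated assumptions: the space is $\mathbb{C}^2$ with its standard basis, the scalar product is conjugate-linear in the first slot, and the operator `omega` is a hypothetical example chosen here, not one taken from the text.

```python
def inner(u, v):
    """Scalar product (u, v) on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def omega(v):
    """A hypothetical linear operator on C^2, given by its action on a vector."""
    x, y = v
    return [2 * x + (1 - 1j) * y, (1 + 1j) * x + 3 * y]

# Standard orthonormal basis e_1, e_2 of C^2.
basis = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Matrix elements Omega_ij = (e_i, Omega e_j), as in (3.1.8).
matrix = [[inner(e_i, omega(e_j)) for e_j in basis] for e_i in basis]

# Hermitian property (3.1.14) in matrix form: Omega_ij = conj(Omega_ji).
is_hermitian = all(
    matrix[i][j] == matrix[j][i].conjugate()
    for i in range(2)
    for j in range(2)
)
```

Here `matrix` comes out as the array with rows `[2, 1-1j]` and `[1+1j, 3]`, which equals its own conjugate transpose.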
Evidently a Hermitian operator is represented by a Hermitian matrix (see Sec. 2). Another important notion about operators is their completeness (not to be confused with the completeness of Hilbert spaces). We call an operator complete if it involves all variables of the domain of its definition. If that is not the case, we call it incomplete. Thus for a particle which is freely moving in space, we note that the operator
\[ (p_x)_{op} = \frac{h}{2\pi i}\,\frac{\partial}{\partial x} \]
is incomplete, but the Hamiltonian
\[ H_{op} \equiv H\!\left(x, y, z, \frac{h}{2\pi i}\frac{\partial}{\partial x}, \frac{h}{2\pi i}\frac{\partial}{\partial y}, \frac{h}{2\pi i}\frac{\partial}{\partial z}\right) \]
is complete. We close this section with three examples of operators which are used in both mathematics and physics all the time, and give at the end a list of operators that we shall use as we go along.
⁴ Note that on the LHS we have used the quantum mechanics (Dirac) notation $\langle\ ,\ \rangle$ in place of $(\ ,\ )$ to denote the scalar product; we use the two notations interchangeably. Also, when there is no fear of confusion, we shall use $\Omega_{ij}$ and not $A_{ij}(\Omega)$.
⁵ $\psi^{*}$ denotes the complex conjugate of $\psi$ (a wave function in quantum-mechanics terminology).
Example 3.1.8: Recall that the Fourier transform of a function $f(x)$ ($f: \mathbb{R} \to \mathbb{C}$) is another function $F(u)$ (see Sec. 4 in App. 9c):
\[ F(u) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-iux} f(x)\,dx. \]
Hence the Fourier transform $F$ can be viewed as $\Omega f$, where
\[ F = \Omega f = \int_{-\infty}^{\infty} \Omega(x; x')\,f(x')\,dx' = \int_{-\infty}^{\infty} (2\pi)^{-1/2} e^{-ixx'} f(x')\,dx'. \tag{3.1.15} \]
Note that this is an integral representation of the operator $\Omega$; the quantity $(2\pi)^{-1/2} e^{-ixx'} = \Omega(x, x')$ is called the kernel of this operator. In the above example we chose the Fourier transform of a function to obtain an integral representation of an operator; since in the Fourier transform there is a built-in integration, this may wrongly suggest that only those transformation procedures that involve integration can be given an integral operator formalism. To illustrate that this is not the case, we show how a differential operator can be given an integral representation.
Example 3.1.9: Let $D = \frac{\partial}{\partial x}$ denote the differential operator. In order to find its integral representation we have to determine its kernel $D(x, x')$ in the following equality (for differentiable functions with compact support defined on the interval $(-\infty, \infty)$):
\[ \frac{df(x)}{dx} = \int_{-\infty}^{\infty} D(x, x')\,f(x')\,dx'. \tag{3.1.16} \]
Using the result of Exercise 4 for the integral representation of the multiplicative unit operator, we can write $\frac{df(x)}{dx}$ as
\[ \frac{df(x)}{dx} = \int_{-\infty}^{\infty} \delta(x - x')\,\frac{df(x')}{dx'}\,dx'. \tag{3.1.17} \]
Integrating the RHS by parts we have:
\[ \frac{df(x)}{dx} = \Big[\delta(x - x')\,f(x')\Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} \left[\frac{\partial}{\partial x'}\,\delta(x - x')\right] f(x')\,dx' = \int_{-\infty}^{\infty} \left[\frac{\partial}{\partial x}\,\delta(x - x')\right] f(x')\,dx' = \int_{-\infty}^{\infty} \delta'(x - x')\,f(x')\,dx'. \tag{3.1.18} \]
The boundary term vanishes because $f$ has compact support. This shows that the kernel is $D(x, x') = \delta'(x - x')$, the derivative of the Dirac delta function.
Example 3.1.10: Shift operator: As the name suggests, operators that shift a parameter to the right or left in a continuous system, or the entries of a sequence in a discrete system, are called right or left shift operators. For example, if $\{U_t\}$ $(t \in \mathbb{R})$ denotes a family of unitary operators and $S_\varepsilon U_t = U_{t+\varepsilon}$ $(\varepsilon > 0)$, then $S_\varepsilon$ is a right shift operator of length $\varepsilon$. $S_{-\varepsilon} = (S_\varepsilon)^{-1}$ is a left shift operator: $S_{-\varepsilon} U_t = U_{t-\varepsilon}$, $\varepsilon > 0$. If $B = (b_1, \ldots, b_n)$ is a sequence of real numbers of length n, the operator that shifts the 1st, 2nd, ... entries to the (k + 1)th, (k + 2)th, ... places and fills the first k entries with zeros:
\[ R^k(B) = (0, 0, \ldots, 0, b_1, \ldots, b_{n-k}) \]
is the right shift operator of order k. The left shift operator is naturally defined by shifting the entries to the left, thus
\[ L^k(B) = (b_{k+1}, b_{k+2}, \ldots, b_n, 0, 0, \ldots, 0). \]
Note that $L^k(B) = R^{-k}(B)$. We give below two operators to which we shall return later in Chapters 5 and 11. (i) Virasoro operators. (ii) Del Giudice, Di Vecchia and Fubini (DDF) operators: these operators commute with Virasoro operators, and when they are applied successively to ground states they give all possible physical states. Moreover they form a closed algebra called the spectrum generating algebra. The elements of this algebra (i.e., operators) are in one-one correspondence with the Fourier coefficients $\alpha^{\mu}_{n}$ of string theory.⁶ Finally we list some of the classical operators that we use all the time.
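The discrete shift operators of Example 3.1.10 are easy to write down for finite sequences. The following is a minimal sketch; the function names `right_shift` and `left_shift` are hypothetical, and the sequence length is kept fixed at n, as in the formulas for $R^k(B)$ and $L^k(B)$.

```python
def right_shift(b, k):
    """R^k: move every entry k places to the right and fill the first k places with zeros."""
    n = len(b)
    k = min(k, n)
    return [0] * k + list(b[: n - k])

def left_shift(b, k):
    """L^k: move every entry k places to the left and fill the last k places with zeros."""
    n = len(b)
    k = min(k, n)
    return list(b[k:]) + [0] * k

B = [1, 2, 3, 4, 5]
shifted_right = right_shift(B, 2)            # [0, 0, 1, 2, 3]
shifted_left = left_shift(B, 2)              # [3, 4, 5, 0, 0]
round_trip = left_shift(right_shift(B, 2), 2)  # [1, 2, 3, 0, 0]
```

On a sequence of fixed finite length the round trip recovers only the entries that were not pushed past the end, so in this finite sketch $L^k$ undoes $R^k$ only on sequences whose last k entries are zero.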
⁶ See Chapters 11 and 11.[13] for (i) and (ii).
1.4 List of Operators (Commonly in Use)

Name of the Operator | The Notation | The Spaces and the Set of Functions on which it Commonly Acts
--- | --- | ---
Position | $Q$ (or $X$) | $\mathbb{R}^n$, $\mathbb{C}^n$.
Grad (Gradient) | $\vec{\nabla} = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$ | Set of functions $f$ defined on $\mathbb{R}^3$: $f(x, y, z)$.
Div (Divergence) | $\vec{\nabla}\,\cdot$ | Set of functions $f$ defined on $\mathbb{R}^3$: $f(x, y, z)$.
Curl | $\vec{\nabla}\,\times$ | Set of functions $f$ defined on $\mathbb{R}^3$: $f(x, y, z)$.
Laplace | $\nabla^2 = \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ | Set of functions $f$ defined on $\mathbb{R}^3$: $f(x, y, z)$.
Momentum | $-i\vec{\nabla} \equiv P$ | Set of functions $\psi$ defined on Minkowskian space: $\psi(t, x, y, z)$.
Kinetic Energy | $-\nabla^2/2m = T$ | Functions on $\mathbb{R}^3$.
Energy | $-i\,\frac{\partial}{\partial t} = E$ | Complex-valued functions $\psi(t, \vec{x})$ known as the wave functions.
Total Energy | $-\nabla^2/2m + V(r, t) = T + V$ | Set of particles moving in $\mathbb{R}^3$ ($V(r, t)$ is the potential energy).
Wave or D'Alembert's | $-\left(\frac{\partial^2}{\partial t^2} - \nabla^2\right) = -\Box^2$ | Set of functions defined on Minkowskian space.
Angular Momentum | $-i\,\vec{r} \times \vec{\nabla} \equiv \vec{r} \times \vec{p} = J = L$ | Set of functions on $\mathbb{R}^3$.
Total Angular Momentum | $J^2 = J_x^2 + J_y^2 + J_z^2$ | Set of functions on $\mathbb{R}^3$.
Angular Momentum Operators in Wave Mechanics | $M_x = \frac{h}{2\pi i}\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right) = \frac{h}{2\pi i}\frac{\partial}{\partial \psi_x}$ | Set of functions on $\mathbb{R}^3$ ($\psi_x$ denotes the azimuth about the axis $Ox$). Similarly for $M_y$ and $M_z$.

Operators defined for Specific Spaces

Name of the Operator | The Notation | The Spaces and the Set of Functions on which it Commonly Acts
--- | --- | ---
The Hamiltonian Operator | $H\!\left(x, y, z, \frac{h}{2\pi i}\frac{\partial}{\partial x}, \frac{h}{2\pi i}\frac{\partial}{\partial y}, \frac{h}{2\pi i}\frac{\partial}{\partial z}\right) = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + V(x, y, z, t)$ | For Minkowskian space.
One-Dimensional Schrodinger Operator (also known as Harmonic Oscillator) | $H = -\frac{1}{2}\frac{d^2}{dx^2} + V(x)$ | For a particle moving in a one-dimensional medium, where $V(x) = \int^{x} F(t)\,dt$ is the potential.
Schrodinger Operator (Harmonic Oscillator in a General Form) | $H = \frac{1}{2}\left(-\Delta + q(x)\right)$ | For a particle moving in $\mathbb{R}^n$ (here $q(x)$ is a positive quadratic form on $\mathbb{R}^n$).
Schrodinger Operator (in General) | $H(a, V) = \frac{1}{2}\left(-i\vec{\nabla} - a\right)^2 + V$ | For Minkowskian space (here $a$ is a vector potential and $V$ is a scalar potential).
Dirac Operator | $c\,\alpha \cdot p + \beta m_0 c^2$ ⁷ | Space of spinors.
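Exercise 3.1
1. Establish the equality (3.1.11).
2. If $\Omega$ is an operator on a Hilbert space $\mathcal{H}$ and $\{\phi_n\}$ is a complete set of orthonormal functions ($(\phi_m, \phi_n) = \delta_{m,n}$), show that the matrix representation for $\Omega$ using the set $\{\phi_n\}$ is given by $\Omega\phi_n = \sum_k \Omega_{kn}\,\phi_k$ with $\Omega_{kn} = \langle\phi_k|\Omega|\phi_n\rangle$. The above representation of $\Omega$ is in terms of a matrix of infinite order.
3. Show that two operators commute if and only if their Lie bracket is zero.
4. Show that the multiplicative unit operator $\Omega_u$ ($\Omega_u f = f$) defined on a set of functions $\{f(x) : x \in (-\infty, \infty)\}$ can be written in terms of an integral operator as
\[ f(x) = \int_{-\infty}^{\infty} \Omega_u(x, x')\,f(x')\,dx' \]
where the kernel $\Omega_u(x, x')$ is the Dirac delta function $\delta(x - x')$ that satisfies $\int dx\,\delta(x - x') = 1$ (see subsection 3 of (0.4) for the Dirac $\delta$-function).
5. Writing the operators $p$ and $J$ in terms of their components $(p_x, p_y, p_z)$, $(j_x, j_y, j_z)$, show that
\[ j_x = y\,p_z - z\,p_y, \qquad j_y = z\,p_x - x\,p_z, \qquad j_z = x\,p_y - y\,p_x, \]
and that the commutation relations satisfied by the $j$'s are:
\[ [j_x, j_y] = i\,j_z, \qquad [j_y, j_z] = i\,j_x, \qquad [j_z, j_x] = i\,j_y, \]
which can succinctly be written, using the indices 1, 2, 3 and the anti-symmetric tensor $\varepsilon_{lmn}$, as:
\[ [j_l, j_m] = i\,\varepsilon_{lmn}\,j_n. \]
6. Show that the Laplace operator
\[ \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \]
in cylindrical coordinates $(\rho, \phi, z)$ and spherical polar coordinates $(r, \theta, \phi)$ is respectively:
\[ \Delta = \frac{\partial^2}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial}{\partial \rho} + \frac{1}{\rho^2}\frac{\partial^2}{\partial \phi^2} + \frac{\partial^2}{\partial z^2}, \]
\[ \Delta = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2} + \frac{\cos\theta}{r^2 \sin\theta}\frac{\partial}{\partial \theta} + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2}{\partial \phi^2}. \]
7. Consider the complex space $L^2(0, 1)$ of complex square integrable functions $\psi(x)$ on the interval $0 \le x \le 1$. Let $w$ be a real number and $U$ a linear operator defined as $U\psi(x) = e^{iwx}\,\psi(x)$. Show that $U$ is a unitary operator.
8. Consider the set of operators
\[ d_n = i\,e^{in\theta}\,\frac{d}{d\theta} \qquad \text{for } n \in \mathbb{Z}. \]
Show that their Lie bracket satisfies:
\[ [d_n, d_m] = (n - m)\,d_{n+m}. \]
The set $\{d_n\}$ is a basis for the Lie algebra of complex vector fields on the unit circle $S^1$ (see Eq. (5.2.9)).
7. p = hp.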
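Hints to Exercise 3.1
2. Let $g_n$ be the image of $\phi_n$ under $\Omega$. Expanding $g_n$ in the complete set $\{\phi_k\}$ we have
\[ g_n = \sum_k a_{n,k}\,\phi_k, \tag{i} \]
hence $g_n = \Omega\phi_n$ becomes
\[ \sum_k a_{n,k}\,\phi_k = \Omega\phi_n. \tag{ii} \]
Premultiplication by $\phi_m^{*}$ and integration gives
\[ \sum_k a_{n,k} \int \phi_m^{*}\,\phi_k\,dx = \int \phi_m^{*}\,\Omega\,\phi_n\,dx, \tag{iii} \]
as
\[ \int \phi_m^{*}\,\Omega\,\phi_n\,dx = \langle\phi_m|\Omega|\phi_n\rangle \qquad \text{and} \qquad \int \phi_m^{*}\,\phi_k\,dx = \langle\phi_m, \phi_k\rangle = \delta_{m,k}. \]
Using Ftn. 4 we have
\[ \Omega_{mn} = a_{n,m}. \tag{iv} \]
Hence (ii) yields the required matrix representation:
\[ \Omega\phi_n = \sum_k \Omega_{kn}\,\phi_k. \]
It should be noted that the coefficients of the expansion of $g_n$, the image of $\phi_n$, form the n-th column of the matrix $(\Omega_{mn})$.
5. Write
\[ J = j_x\,\mathbf{i} + j_y\,\mathbf{j} + j_z\,\mathbf{k} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x & y & z \\ p_x & p_y & p_z \end{vmatrix} \]
and express $j_x$, $j_y$, $j_z$, with the understanding that $p_x = -i\,\frac{\partial}{\partial x}$, etc. The commutation relations can easily be computed by writing $j_x$, $j_y$, $j_z$ in terms of $p_x$, $p_y$, $p_z$.
6. The coordinate relations $x = \rho\cos\phi$, $y = \rho\sin\phi$, $z = z$ give
\[ \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) = \left(\cos\phi\,\frac{\partial}{\partial \rho} - \frac{\sin\phi}{\rho}\frac{\partial}{\partial \phi},\ \ \sin\phi\,\frac{\partial}{\partial \rho} + \frac{\cos\phi}{\rho}\frac{\partial}{\partial \phi},\ \ \frac{\partial}{\partial z}\right). \]
Accordingly
\[ \frac{\partial^2}{\partial x^2} = \left(\cos\phi\,\frac{\partial}{\partial \rho} - \frac{\sin\phi}{\rho}\frac{\partial}{\partial \phi}\right)\left(\cos\phi\,\frac{\partial}{\partial \rho} - \frac{\sin\phi}{\rho}\frac{\partial}{\partial \phi}\right). \]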
2 EIGENVALUES AND EIGENFUNCTIONS
2.1 The Resolvent and the Spectrum of an Operator
In general a function $f \in X = D(\Omega)$ is transformed to another function $g$ under the action of an operator $\Omega$. In $X$ however there is a set $\{f_n\}$ of non-zero functions that satisfy the equation $\Omega f_n = \lambda_n f_n$ for some constant $\lambda_n$ ($\in \mathbb{R}$ or $\mathbb{C}$). These functions are called the eigenfunctions⁸ of the operator and the $\lambda_n$ are called the eigenvalues. A sophisticated way to introduce the concept is as follows: if for some $\lambda \in K$ ($\mathbb{R}$ or $\mathbb{C}$) the operator $\lambda I - \Omega$ is not injective (i.e., $N(\lambda I - \Omega) = \mathrm{kernel}(\lambda I - \Omega) \neq \{0\}$), so that there exists an $f \in D(\Omega)$, $f \neq 0$, such that $\Omega f = \lambda f$, then $\lambda$ is called an eigenvalue and $f$ is called an eigenfunction. The subspace $N(\lambda I - \Omega)$ is the eigenspace of $\lambda$, and $\dim(N(\lambda I - \Omega))$ is its multiplicity. In case $\lambda$ is not an eigenvalue (i.e., $\lambda I - \Omega$ is injective), there is a (well-defined) operator $R(\lambda, \Omega) = (\lambda I - \Omega)^{-1}$ which can be viewed as a function on the set defined below. The set
\[ \rho(\Omega) = \{\lambda \in K : \lambda I - \Omega \text{ is injective, and } R(\lambda, \Omega) \in \mathcal{B}(X)\} \]
is called the resolvent set of $\Omega$, and the function $R(\cdot, \Omega): \rho(\Omega) \to \mathcal{B}(X)$, $\lambda \mapsto R(\lambda, \Omega)$ is called the resolvent of $\Omega$. The set $\sigma(\Omega) = K \setminus \rho(\Omega)$ is called the spectrum of $\Omega$. The set of eigenvalues is obviously contained in $\sigma(\Omega)$; it is called the point-spectrum and is therefore denoted as $\sigma_p(\Omega)$. (For details see [17], O.[3].)
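For a finite matrix the resolvent can be computed explicitly, and the eigenvalues are exactly the values of $\lambda$ at which $\lambda I - \Omega$ fails to be invertible. The sketch below assumes a 2 x 2 matrix with exact rational entries; the function name `resolvent` and the sample matrix `omega` are hypothetical, and scanning integer values of $\lambda$ is done only for illustration.

```python
from fractions import Fraction

def resolvent(omega, lam):
    """Return R(lam, omega) = (lam*I - omega)^(-1) for a 2x2 matrix,
    or None when lam*I - omega is not invertible (lam is an eigenvalue)."""
    (a, b), (c, d) = omega
    m = [[lam - a, -b], [-c, lam - d]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        return None
    # Inverse of a 2x2 matrix via the adjugate formula.
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

omega = [[2, 1], [1, 2]]   # sample matrix with eigenvalues 1 and 3

# Integers in [-5, 5] at which the resolvent does not exist.
point_spectrum = [lam for lam in range(-5, 6)
                  if resolvent(omega, Fraction(lam)) is None]

R0 = resolvent(omega, Fraction(0))   # equals -(omega)^(-1)
```

In finite dimensions an injective operator is automatically invertible and bounded, so here the spectrum coincides with the point-spectrum; in infinite dimensions the two can differ.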
2.2 Examples and Results on Eigenvalues and Eigenfunctions of an Operator
Example 3.2.1: Let $X$ be the space of $C^\infty$-functions defined on $\mathbb{R}^3$ and $\Omega$ be the Laplace operator $\Delta$ defined in Subsec. (1.4). The functions $f_n = \cos nx + \cos ny + \cos nz$ are eigenfunctions of the equation $\Delta f_n = \lambda_n f_n$ where $\lambda_n = -n^2$ is non-degenerate.
Example 3.2.2: The operators $M_k$ ($k$ stands for $x$, $y$, $z$) given in Subsec. (1.4) possess the eigenvalues $m\,(h/2\pi)$ with corresponding eigenfunctions $(2\pi)^{-1/2} \exp(-im\psi_k)$.
⁸ When $\Omega$ is a linear operator acting on a vector space $X$ (Euclidean or Hilbertian), we call them eigenvectors and generally denote the set as $\{\psi_n\}$.
2.3 Hermitian Operators
We next see that Hermitian operators, denoted as $H$, have the following properties:
(1) The eigenvalues are real.
(2) The eigenfunctions corresponding to distinct eigenvalues are orthogonal.
(3) The operator $H$ can always be represented by a Hermitian matrix.
To prove (1) we pre-multiply the eigenvalue equation $Hf_n = \lambda_n f_n$ by $f_n^{*}$, integrate and use the defining relation (3.1.9)-(3.1.10) to obtain:
\[ \langle f_n|H|f_n\rangle = \lambda_n\,(f_n, f_n). \tag{3.2.1} \]
In view of the Hermitian property (3.1.14) of $H$ we further have $\langle f_n|H|f_n\rangle = \langle f_n|H|f_n\rangle^{*}$, so that $\lambda_n = \lambda_n^{*}$, i.e., $\lambda_n$ is real. To prove (2), consider two eigenfunctions $f_m$, $f_n$ with $\lambda_m \neq \lambda_n$. Then
\[ \langle f_m|H|f_n\rangle = \lambda_n\,(f_m, f_n) \tag{3.2.2} \]
and
\[ \langle f_n|H|f_m\rangle^{*} = \lambda_m\,(f_n, f_m)^{*}, \tag{3.2.3} \]
which equals
\[ \langle f_m|H|f_n\rangle = \lambda_m\,(f_m, f_n) \tag{3.2.4} \]
since $H$ is Hermitian. Equating (3.2.2) and (3.2.4) we have the orthogonality of $f_n$ and $f_m$, as $\lambda_m \neq \lambda_n$ (see Exercise 1 for equal eigenvalues). To prove (3) we begin by choosing $\Omega$ as an arbitrary operator. Suppose that $f$ is an eigenfunction of $\Omega$, i.e., $\Omega f = \lambda f$. Let $\{f_n\}$ be a complete set of orthonormal functions; then $f = \sum_k c_k f_k$ implies that the above relation can be put as
\[ \Omega \sum_k c_k f_k = \lambda \sum_k c_k f_k. \tag{3.2.5} \]
We now pre-multiply by $f_m^{*}$ and integrate to obtain
\[ \sum_k \Omega_{mk}\,c_k = \lambda \sum_k c_k\,\delta_{mk} = \lambda\,c_m. \tag{3.2.6} \]
If we choose now $\Omega$ as a Hermitian operator, then
\[ \langle f_m|\Omega|f_k\rangle = \langle f_k|\Omega|f_m\rangle^{*} \]
implies that $\Omega_{mk} = \Omega_{km}^{*}$. Therefore, in view of Exercise 2 of Sec. 1, we conclude that the matrix is Hermitian. From this property it follows that the eigenvalue problem of a Hermitian operator is equivalent to the eigenvalue problem of a Hermitian matrix (which may sometimes be of infinite order, see Exercise 4). For instance, since $(\Omega_{mk})$ is a Hermitian matrix, there exists a unitary matrix $U = (U_{kl})$ that diagonalizes the matrix $(\Omega_{mk})$:
\[ \sum_{m,k} U_{mi}^{*}\,\Omega_{mk}\,U_{kj} = \lambda_i\,\delta_{ij}, \]
and thus gives the eigenvalues of the operator $\Omega$.
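Properties (1) and (2) can be observed on a small Hermitian matrix. The sketch below is restricted to the 2 x 2 case, where the eigenvalues follow from the quadratic characteristic equation; the matrix `H` and the helper names are hypothetical choices for this illustration, and the eigenvector formula used assumes a non-zero off-diagonal entry.

```python
import cmath

def inner(u, v):
    """Scalar product on C^2, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def eigenpairs_2x2(m):
    """Eigenvalues and eigenvectors of a 2x2 matrix [[a, b], [c, d]] with b != 0.
    For each root lam of lam^2 - tr*lam + det = 0, the vector (b, lam - a) is an eigenvector."""
    (a, b), (c, d) = m
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(lam, [b, lam - a]) for lam in ((tr - root) / 2, (tr + root) / 2)]

# A sample Hermitian matrix: it equals its own conjugate transpose.
H = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]

pairs = eigenpairs_2x2(H)
eigenvalues = [lam for lam, _ in pairs]
overlap = inner(pairs[0][1], pairs[1][1])   # scalar product of the two eigenvectors
```

For this matrix the two eigenvalues are 1 and 4, both real, and the scalar product of the corresponding eigenvectors vanishes.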
Another property which makes Hermitian operators rather important is the following:
Proposition 3.2.6: Two Hermitian operators commute iff they have a basis of common eigenfunctions.
Proof: (Nec.) Let $f_1, \ldots, f_i, \ldots$ be the common set of eigenfunctions for operators $H$ and $\tilde{H}$. Then the eigenequations $Hf_i = \lambda_i f_i$, $\tilde{H}f_i = \tilde{\lambda}_i f_i$ taken together imply
(a) $\tilde{H}Hf_i = \tilde{H}(\lambda_i f_i) = \lambda_i\tilde{\lambda}_i f_i$
and
(b) $H\tilde{H}f_i = H(\tilde{\lambda}_i f_i) = \tilde{\lambda}_i\lambda_i f_i$.
Since the RHS of (a) and (b) are equal, it shows that
(c) $H\tilde{H} = \tilde{H}H$.
(Suff.) Starting with the commutativity (c) we now show that they have the same set of eigenfunctions. We assume that they have different sets of eigenfunctions $f_1, \ldots, f_i, \ldots$ and $\tilde{f}_1, \ldots, \tilde{f}_i, \ldots$. Their eigenequations accordingly are:
(d) $Hf_i = \lambda_i f_i$,
and
(e) $\tilde{H}\tilde{f}_i = \tilde{\lambda}_i\tilde{f}_i$.
Pre-multiplying (d) by $\tilde{H}$ and using (c) we have
(f) $\tilde{H}Hf_i = \tilde{H}\lambda_i f_i = \lambda_i(\tilde{H}f_i) = H(\tilde{H}f_i)$.
The extreme RHS of equation (f) shows that if $\lambda_i$ is treated as a non-degenerate eigenvalue, then $\tilde{H}f_i$ is an eigenfunction of $H$ belonging to $\lambda_i$, which must necessarily be a constant multiple of $f_i$. In other words $\tilde{H}f_i = kf_i$, i.e., $f_i$ is also an eigenfunction of $\tilde{H}$. Since the choice of $f_i$ in (d) is arbitrary, it follows that the set of eigenfunctions of $H$ is also the set of eigenfunctions of $\tilde{H}$. Similarly, using the equation (e) it can be shown that the eigenfunctions of $\tilde{H}$ are also the eigenfunctions of $H$. We wish to mention here that the conclusions will be the same even when we drop the requirement of non-degeneracy of the eigenvalue in the above arguments and use instead an m-fold degenerate eigenvalue (for proofs see [5], [11] and Hint to Exercise 1).
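Proposition 3.2.6 can be illustrated with two small real symmetric (hence Hermitian) matrices. This is only a sketch: the matrices `H1`, `H2`, the candidate vectors in `common_basis`, and the helper names are hypothetical choices for this example.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    """Matrix applied to a vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def eigenvalue_of(m, v):
    """Return lam if m v = lam v for the non-zero vector v, otherwise None."""
    w = matvec(m, v)
    pivot = next(k for k, x in enumerate(v) if x != 0)
    lam = Fraction(w[pivot], v[pivot])
    return lam if all(w[k] == lam * v[k] for k in range(len(v))) else None

H1 = [[2, 1], [1, 2]]
H2 = [[0, 1], [1, 0]]

commute = matmul(H1, H2) == matmul(H2, H1)

# Candidate common eigenvectors; each entry of spectra is (eigenvalue for H1, eigenvalue for H2).
common_basis = [[1, 1], [1, -1]]
spectra = [(eigenvalue_of(H1, v), eigenvalue_of(H2, v)) for v in common_basis]
```

The two matrices commute, and both vectors of `common_basis` are eigenvectors of each of them, with eigenvalue pairs (3, 1) and (1, -1).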
2.4 Properties of Commuting Operators
Since in quantum mechanics measurable quantities (observables) always correspond to (bounded) operators, the result proved above leads to an important consequence.
Remark 3.2.7: In a given physical system two measurable quantities $A$ and $B$ cannot be measured simultaneously unless the operators representing them commute. For example, the coordinate $q$ and its conjugate $p$, the linear momentum, cannot be measured simultaneously. To see this, consider the position operator $Q$, which assigns the coordinate $q$ to a measurable quantity, and the momentum operator $P = \frac{h}{2\pi i}\frac{\partial}{\partial q}$,⁹ and note that
\[ (QP - PQ)\,f(q) = q\left(\frac{h}{2\pi i}\right)\frac{\partial f}{\partial q} - \left(\frac{h}{2\pi i}\right)\frac{\partial}{\partial q}(qf) = -\frac{h}{2\pi i}\,f, \]
where $f$ is a differentiable function of $q$. This implies that $QP - PQ = -h/2\pi i$, showing the non-commutativity of $Q$ and $P$. If, however, there are several $Q$'s and $P$'s with the equalities $Q_iQ_j = Q_jQ_i$, $P_iP_j = P_jP_i$ leading to $P_iQ_j = Q_jP_i$, we note that for $i \neq j$ the measurable quantities can be measured simultaneously.
⁹ $P$ and $Q$ of the previous section are the same except for the constant and the coordinate notation.
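The commutator computed in Remark 3.2.7 can be verified on polynomials. The sketch below works in units where $h/2\pi = 1$ (an assumption of this example), so that $P = -i\,d/dq$ and the expected result is $(QP - PQ)f = i f$. Polynomials in $q$ are stored as coefficient lists, and the function names are hypothetical.

```python
# A polynomial c0 + c1*q + c2*q**2 + ... is stored as [c0, c1, c2, ...].

def Q(p):
    """Position operator: multiplication by q shifts every coefficient up by one degree."""
    return [0] + list(p)

def P(p):
    """Momentum operator -i d/dq, in units where h/(2*pi) = 1."""
    return [-1j * k * c for k, c in enumerate(p)][1:] or [0]

def subtract(p, r):
    """Coefficient-wise difference p - r, padding the shorter list with zeros."""
    n = max(len(p), len(r))
    p = list(p) + [0] * (n - len(p))
    r = list(r) + [0] * (n - len(r))
    return [a - b for a, b in zip(p, r)]

f = [1, 2, 0, 4]                          # f(q) = 1 + 2q + 4q^3
commutator_f = subtract(Q(P(f)), P(Q(f)))  # (QP - PQ) f
expected = [1j * c for c in f]             # i * f
```

Restoring the constant, this is the relation $QP - PQ = -h/2\pi i = ih/2\pi$ on the chosen test function; a single polynomial is of course only a spot check.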
Remark 3.2.8: If two arbitrary operators $A$ and $B$ have an eigenfunction $\phi$ in common, then either (i) they commute, or (ii) their commutator $[A, B]$ has a zero eigenvalue. Since $A$ and $B$ share the same eigenfunction $\phi$, for some constants $\alpha$ and $\beta$ ($\alpha \neq \beta$) we have $A\phi = \alpha\phi$ and $B\phi = \beta\phi$. This implies:
\[ (AB - BA)\phi = [A, B]\phi = (\alpha\beta - \beta\alpha)\phi = 0 \cdot \phi, \tag{3.2.8} \]
hence either $[A, B] = 0$, i.e., $A$ and $B$ commute, in which case the operators $A$ and $B$ have the same set of eigenfunctions, or $[A, B] \neq 0$ has a zero eigenvalue, as is obvious from the extreme RHS equality of (3.2.8). To illustrate (ii) further, we consider the operators $M_x$, $M_y$, $M_z$ which satisfy the relation:
\[ [M_x, M_y] = \frac{ih}{2\pi}\,M_z. \]
The set of eigenvalues of $M_z$ given by $m\,(h/2\pi)$, where $m = 0, 1, 2, \ldots$, includes the zero eigenvalue for the commutator $[M_x, M_y]$, and in this case the operators $M_x$, $M_y$ share the common eigenfunction $\phi = \text{const}$. Note that the eigenfunctions of $M_x$ and $M_y$ are respectively $(2\pi)^{-1/2}\exp(-im\psi_x)$ and $(2\pi)^{-1/2}\exp(-im\psi_y)$, where $\psi_x$ and $\psi_y$ are azimuths with respect to $Ox$ and $Oy$. For $m = 0$ these eigenfunctions coincide with $(2\pi)^{-1/2}$.
Sometimes it is useful to consider not eigenvectors but generalized eigenvectors of a linear operator, which are defined as follows:
Definition 3.2.9: Associated with an eigenvalue $\lambda$ of an operator $L$, a vector $\xi$ is called a generalized eigenvector of order $\nu$ if $(L - \lambda)^{\nu}\xi = 0$, and for every $\nu' < \nu$, $(L - \lambda)^{\nu'}\xi \neq 0$. Naturally, given an eigenvalue $\lambda$ of $L$, the generalized eigenvector for $\nu = 1$ is an eigenvector. (See [14] for details.)
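Definition 3.2.9 can be made concrete with a Jordan block, the standard example of an operator that has generalized eigenvectors of order higher than one. The sketch below assumes a 3 x 3 block with eigenvalue 2; the matrix `J` and the function names are hypothetical choices for this illustration.

```python
def matvec(m, v):
    """Matrix applied to a vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def generalized_order(m, lam, v):
    """Smallest nu with (m - lam)^nu v = 0, or None if v is zero or no such nu <= dim exists."""
    n = len(v)
    if all(x == 0 for x in v):
        return None   # the zero vector is not a generalized eigenvector
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    w = list(v)
    for nu in range(1, n + 1):
        w = matvec(shifted, w)
        if all(x == 0 for x in w):
            return nu
    return None

# Jordan block with eigenvalue 2.
J = [[2, 1, 0],
     [0, 2, 1],
     [0, 0, 2]]

orders = [generalized_order(J, 2, e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
```

The three standard basis vectors have orders 1, 2 and 3 respectively, so only the first one is an ordinary eigenvector.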
Exercise 3.2
1. Prove the results (3.2.3) and (3.2.4).
2. Prove the result (3.2.5) and show that the result holds good for infinite-dimensional spaces as well.
3. Show that for a Hermitian operator it is always possible to select an orthogonal (orthonormal) set of eigenfunctions even in the case of degenerate eigenvalues.
4. Given a Hermitian operator $\Omega$ and its matrix representation $(\Omega_{ij})$, show that the eigenvector pertaining to an eigenvalue $\lambda_k$ of the matrix $(\Omega_{ij})$ uniquely determines the corresponding eigenfunction for $\Omega$.
5. Show that the measurable quantities represented by the operators $M_x$ and $M_y$ cannot be simultaneously measured, whereas the quantities represented by the pairs $(M^2, M_x)$, etc., can be simultaneously measured.
6. Show that the Casimir operator*
\[ J^2 = \sum_{r=1}^{3} J_r^2 \]
commutes with each $J_r$. Also the raising and lowering operators $J_{\pm} = J_1 \pm iJ_2$ satisfy the commutation rules $[J_+, J_-] = 2J_3$, $[J_{\pm}, J_3] = \mp J_{\pm}$ (see Sec. 7 and the hints to the exercise there).
7. Show that every eigenvalue of a unitary operator is a complex number of absolute value 1.
8. Consider
\[ \exp(tA) = I + tA + \frac{t^2}{2!}A^2 + \frac{t^3}{3!}A^3 + \cdots \]
as a curve of linear operators and $\frac{d}{dt}(\exp tA)$ as its tangent vector at time $t$; then show that the tangent vector to the curve $L(\exp tA)L^{-1}$ at $t = 0$ is $LAL^{-1}$, where $L$ is any invertible operator.
9. Show that if there exists a generalized eigenvector $x$ of order $\nu$ for a given eigenvalue $\lambda$, then the elements $(L - \lambda)x$, $(L - \lambda)^2 x$, ..., $(L - \lambda)^{\nu - 1}x$ are respectively generalized eigenvectors of order $(\nu - 1)$, $(\nu - 2)$, ..., 1 associated with $\lambda$.
* Given a non-abelian group $G$, a nonlinear function of the generators of $G$ that commutes with all of its generators is called a Casimir operator. It is usually denoted by $C$.
Hints to Exercise 3.2
1. Since $a$ is an eigenvalue, there exists an eigenvector $\psi$ such that $A\psi = a\psi$; this implies $(A - a)\psi = 0$. In view of Result (3.1.4) (1) it follows that $(A - a)$ has no inverse, and conversely if $(A - a)\psi = 0$ holds for a non-zero vector $\psi$, then by definition $a$ must be an eigenvalue. Result (3.2.4) is obvious from condition (4) of Result (3.1.4).
2. Suppose that $a$ is an eigenvalue of $A$; then for a non-zero $\psi$ we have $A\psi = a\psi$. Also, since $L$ has an inverse $L^{-1}$, from (1) of Result (3.1.4), $L\psi \neq 0$. Accordingly
\[ LAL^{-1}(L\psi) = LA(L^{-1}L)\psi = La\psi = aL\psi. \]
As $L\psi$ is a non-zero vector, it follows that $a$ is an eigenvalue of $LAL^{-1}$. If we start by choosing $a$ to be an eigenvalue of $LAL^{-1}$, we can show that it is an eigenvalue of $A$ simply by writing $A = L^{-1}(LAL^{-1})L$, together with the fact that $L^{-1}\psi$ is non-zero if $\psi$ is an eigenvector of $LAL^{-1}$.
3. When $\lambda_i \neq \lambda_j$, the eigenequations $Hf_i = \lambda_i f_i$, $Hf_j = \lambda_j f_j$ with a Hermitian operator $H$ imply¹⁰
\[ \int_X \left(f_j^{*}\,Hf_i - f_i\,(Hf_j)^{*}\right) dx = 0 = (\lambda_i - \lambda_j)\int_X f_j^{*} f_i\,dx, \]
which shows that the eigenfunctions $f_i$ form an orthogonal set. If $\lambda_i$ is s-fold degenerate, there are $s$ linearly independent eigenfunctions $f_{i1}, \ldots, f_{is}$ that may not be orthogonal. They can be replaced by an orthogonal set in the following manner. Choose a set of $s$ functions as:
\[ f'_{i1} = f_{i1}, \qquad f'_{ik} = f_{ik} - \frac{(f_{i1}, f_{ik})}{(f_{i1}, f_{i1})}\,f_{i1} \quad (k = 2, \ldots, s). \]
The new set consists of $(s - 1)$ eigenfunctions $f'_{i2}, \ldots, f'_{is}$, each of which, together with their linear combinations, is orthogonal to $f_{i1}$. Either these $(s - 1)$ functions are mutually orthogonal, or, if not, the above procedure can be repeated by setting:
\[ f''_{i2} = f'_{i2}, \qquad f''_{ik} = f'_{ik} - \frac{(f'_{i2}, f'_{ik})}{(f'_{i2}, f'_{i2})}\,f'_{i2} \quad (k = 3, \ldots, s). \]
7. If $U\psi = u\psi$ for a non-zero vector $\psi$, then
\[ (\psi, \psi) = (U\psi, U\psi) = (u\psi, u\psi) = \bar{u}u\,(\psi, \psi), \]
which shows that $\bar{u}u$ must be 1.
8. Since $(LAL^{-1})^n = LA^nL^{-1}$ for all $n$, it follows that $\exp t(LAL^{-1}) = L(\exp tA)L^{-1}$, and the derivative at $t = 0$ is the coefficient of $t$ in the exponential series.
¹⁰ It is assumed that the eigenfunctions $(f_i)$ defined on $X$ are integrable.
3 SOME PROPERTIES OF OPERATORS
We devote this section to a few more definitions of operators and their properties, and also to questions related to spectral decompositions of the operators defined in previous sections. Most of these results are given without proofs but along with the references where these can be found.
3.1 Projection Operators and their Properties
Definition 3.3.1: Let $\mathcal{H}$ be a separable Hilbert space and $M$ one of its subspaces. Every vector $\psi \in \mathcal{H}$ can be uniquely decomposed into a sum of vectors $\psi_M$ and $\psi_{M^\perp}$ belonging to $M$ and its orthogonal complement $M^\perp$ respectively. An operator which maps every $\psi$ to $\psi_M$ is called a projection operator on $M$, and is denoted as $E_M$. Evidently $M$ is an invariant subspace under the action of $E_M$. The projection operator on the whole space $\mathcal{H}$ is the identity operator $I$, and on $M^\perp$ there is the projection operator $E'$ which is related to $E_M$ as:
\[ E' = 1 - E_M. \tag{3.3.1} \]
Thus every vector $\psi \in \mathcal{H}$, in relation to these two projection operators, satisfies:
\[ (E_M + E')\psi = E_M\psi + E'\psi = \psi_M + \psi_{M^\perp}. \tag{3.3.2} \]
It can be easily verified that every projection operator is self-adjoint. In fact we have the following result that characterizes a bounded linear operator as a projection operator.
Result 3.3.2: (i) A bounded linear operator $E$ is a projection operator if and only if $E^2 = E = E^{+}$. (ii) The subspace onto which $E$ projects consists of the vectors $E\psi$ for all vectors $\psi$ in $\mathcal{H}$ ([6], [7]. See Exercise 1).
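The characterization $E^2 = E = E^{+}$ and the relations (3.3.1)-(3.3.2) can be checked on a one-dimensional subspace of $\mathbb{R}^3$. The sketch below uses exact rational arithmetic; the choice of the spanning vector `v`, the test vector `psi`, and all function names are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def projector_onto(v):
    """Projection onto the line spanned by the integer vector v: E = v v^T / (v . v)."""
    norm2 = sum(x * x for x in v)
    return [[Fraction(a * b, norm2) for b in v] for a in v]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

v = [1, 2, 2]
E = projector_onto(v)                      # projection on M = span{v}
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
E_perp = [[identity[i][j] - E[i][j] for j in range(3)] for i in range(3)]  # E' = 1 - E_M

psi = [3, 0, 0]
psi_M = matvec(E, psi)                     # component of psi in M
psi_M_perp = matvec(E_perp, psi)           # component of psi in the orthogonal complement
recombined = [a + b for a, b in zip(psi_M, psi_M_perp)]
```

For real matrices the adjoint is the transpose, so the conditions to check are that `E` squares to itself and equals its transpose; the product of `E` and `E_perp` vanishes and the two components add back to `psi`.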
Definition 3.3.3: Two arbitrary projection operators $E_1$ and $E_2$ are said to be orthogonal if the subspaces $M_1$, $M_2$ on which they act are orthogonal.* It can be easily checked that in this case:
\[ E_1E_2 = E_2E_1 = 0. \tag{3.3.3} \]
The operators $E_M$ and $E'$ are obviously orthogonal. In case the subspaces $M_1$ and $M_2$ are such that $M_1 \subset M_2$, we say that $E_1 \le E_2$ or equivalently $E_2 \ge E_1$. The following result uses these definitions.
Result 3.3.4: Let $E_1$ and $E_2$ be projection operators onto subspaces $M_1$ and $M_2$. We have:
(i) If $E_1E_2 = E_2E_1$, then $E_1E_2$ is a projection operator that projects on the subspace $M_1 \cap M_2$.
(ii) If $E_1$ and $E_2$ are orthogonal, then $E_1 + E_2$ is the projection operator onto $M_1 \oplus M_2$.
(iii) If $E_1 \le E_2$, then $E_2 - E_1$ is a projection operator, and the corresponding subspace on which it projects is the orthogonal complement of $M_1$ in $M_2$.
Note that (iii) is obvious in view of Def. (3.3.1).
Definition 3.3.5: Let $X$ be a subspace of another space $Y$ and let $\mathcal{A}$ be a linear operator such that for every vector $\psi$ in $X$, $\mathcal{A}\psi$ is in $X$, whereas for every vector $\phi$ in $X^\perp$, $\mathcal{A}\phi$ is in $X^\perp$; then we say that $X$ reduces $\mathcal{A}$. If $a$ is an eigenvalue of the operator $\mathcal{A}$ on $X$ and $X_a$ denotes the collection of all vectors $\psi$ that satisfy $\mathcal{A}\psi = a\psi$, then it can be easily seen that $X_a$ is a subspace that reduces $\mathcal{A}$. Moreover, if $\mathcal{A}$ is a Hermitian (or unitary) operator on $X$, these subspaces corresponding to different eigenvalues $a_i$ are mutually orthogonal, and are naturally the spaces spanned by the corresponding eigenvectors. If $X = \oplus X_{a_i}$, then $X$ reduces $\mathcal{A}$; as a matter of fact $\mathcal{A}$ acts on $X$ as a diagonal matrix. Thus every operator $\mathcal{A}$ which is Hermitian or unitary splits into two separate parts $\mathcal{A}_1$ and $\mathcal{A}_2$: the part $\mathcal{A}_1$ operates on the space spanned by its eigenvectors and $\mathcal{A}_2$ acts on its orthogonal complement. The operator $\mathcal{A}_1$ can be represented by a diagonal matrix consisting of its eigenvalues, once an orthonormal basis of eigenvectors is chosen for the space $X = \oplus X_{a_i}$. The following results further characterize the unique features of Hermitian and unitary operators.
3.2
More on Hermitian and Unitary Operators
Proposition 3.3.6: Every finite-dimensional complex vector space^11 acted upon by Hermitian or unitary operators can be spanned by their eigenvectors.

Proof: Let $A$ be a Hermitian/unitary operator that acts on a given space $X$. Let $X_A$ denote the vector space spanned by its eigenvectors and $E$ the projection operator onto $X_A$. Assume that $X_A \neq X$. We know that $X_A$ reduces $A$, and since by our assumption $X_A \neq X$, we have its non-trivial orthogonal complement $X_A^{\perp} \subset X$ along with the operator $A(1 - E)$ that acts on it. Since $X_A^{\perp}$ is finite-dimensional, $A(1 - E)$ must have a non-zero eigenvector $\varphi$ in it:

$$A(1 - E)\varphi = A\varphi = b\varphi$$

which means that $\varphi \in X_A^{\perp}$ is indeed an eigenvector of $A$, contradicting the definition of $X_A$. Thus our assumption that $X_A \neq X$ is incorrect. $\square$

* Two vector spaces are orthogonal iff the sets of their basis vectors are orthogonal.
11 If $X$ is a real vector space, the result holds good only for a Hermitian operator.
In view of the above result, it follows that a Hermitian/unitary operator acting on a finite-dimensional complex vector space can always be given a diagonal matrix representation. Our next proposition shows how an arbitrary linear operator on a Hilbert space can be described through a collection of projection operators. This means that the simple properties of projection operators can be used to learn about operators of a more complex nature.

Proposition 3.3.7: Let $\mathcal{H}$ be an $n$-dimensional Hilbert space and let $A$ be a linear operator on it. Then for a suitable choice of orthonormal basis
(a) the matrix representation of $A$ is upper triangular;
(b) equivalently there exist projection operators $0 = E_0 < E_1 < E_2 < \cdots < E_n = 1$ that satisfy:
(i) $(1 - E_i) A E_i = 0$, $0 \le i \le n$;
(ii) the algebra generated by $\{E_i\}$ is maximal abelian.^12

Proof: To prove (b), as a first step we must show that given a basis $\{e_i\}$ of orthonormal vectors in $\mathcal{H}$, projection operators satisfying the inclusion relation can be defined. The trivial projection operator, acting on the span of $\{e_i\}$, is $E_n = 1$. Since

$$1 \cdot \mathcal{H} = E_n \mathcal{H} = \mathcal{H} \implies E_n^2 \mathcal{H} = E_n \mathcal{H}$$

we have $E_n^2 = E_n$. Let $\mathcal{H}_{n-1}$ denote the space spanned by the basis vectors $e_1, e_2, \dots, e_{n-1}$ (excluding $e_n$); using the above argument we can define a projection operator $E_{n-1}$ on $\mathcal{H}_{n-1}$. Evidently $E_{n-1} < E_n$ since $\mathcal{H}_{n-1} \subset \mathcal{H}$. The spaces $\mathcal{H}_{n-2}, \dots, \mathcal{H}_1$ can be constructed likewise, and projection operators satisfying the inclusion relation can be formed. The operator $E_0$ in this sequence represents the zero operator, which carries every element to the null element of $\mathcal{H}$. The operators $E_i$ defined on $\mathcal{H}_i$ as projection operators are in fact projection operators on $\mathcal{H}$. Since $\mathcal{H} = \mathcal{H}_i + \mathcal{H}_i^{\perp}$, by the very construction $E_i$ satisfies $E_i \mathcal{H}_i = \mathcal{H}_i$ and $E_i \mathcal{H}_i^{\perp} = 0$, therefore we have from these two equalities:

$$E_i \mathcal{H}_i + E_i \mathcal{H}_i^{\perp} = E_i \mathcal{H} = \mathcal{H}_i + 0 = \mathcal{H}_i$$

and $E_i (E_i \mathcal{H}) = E_i \mathcal{H}_i = E_i \mathcal{H}$, thus $E_i^2 = E_i$ on $\mathcal{H}$. More explicitly, if $f \in \mathcal{H}$ has components $h \in \mathcal{H}_i$ and $g \in \mathcal{H}_i^{\perp}$, then $E_i h = h$ and $E_i g = 0$.
To prove (ii) of (b), we note that in view of Def. (3.1.1) the algebra $\mathcal{B}(\mathcal{H})$ is generated by $\{E_i\}$, and from Remark (3.2.8) it is abelian.

12 That is, if there is any other abelian algebra of operators on $\mathcal{H}$ that contains it, then that algebra must coincide with it. See Lecture 1 in [1] for details.

We have so far dealt with operators (i) that are bounded and (ii) that operate on finite-dimensional spaces. From now on we shall relax these two conditions, for as we shall see in the following example, unbounded operators are not hard to find.

Example 3.3.8: Let $P$ be a linear operator on the space $L^2(-\infty, \infty)$ of square integrable functions $f(x)$ defined by the rule:

$$(Pf)(x) = x f(x).$$

The operator has the Hermitian property

$$(g, Pf) = \int_{-\infty}^{\infty} g(x)^* \, x f(x)\, dx = \int_{-\infty}^{\infty} (x g(x))^* f(x)\, dx = (Pg, f)$$

only if the integrals converge. It is evidently not bounded, since

$$\|Pf\|^2 = \int_{-\infty}^{\infty} |x f(x)|^2\, dx$$

will often be many times larger than

$$\|f\|^2 = \int_{-\infty}^{\infty} |f(x)|^2\, dx.$$
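The unboundedness of the multiplication operator $(Pf)(x) = x f(x)$ can be made concrete with a family of test functions. The following sketch assumes shifted Gaussians $f_a(x) = e^{-(x-a)^2/2}$ (a choice made here for illustration, not taken from the text) and approximates the integrals by Riemann sums on a finite grid. For this family the ratio $\|Pf_a\|^2 / \|f_a\|^2$ equals $a^2 + 1/2$, so it grows without bound as the centre $a$ moves away from the origin.

```python
import numpy as np

def norm_ratio(center, half_width=60.0, n=200001):
    """Approximate ||P f||^2 / ||f||^2 for f(x) = exp(-(x - center)^2 / 2)."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    f = np.exp(-0.5 * (x - center) ** 2)
    norm_f_sq = np.sum(np.abs(f) ** 2) * dx        # ||f||^2
    norm_pf_sq = np.sum(np.abs(x * f) ** 2) * dx   # ||P f||^2 with (P f)(x) = x f(x)
    return norm_pf_sq / norm_f_sq

centers = (0.0, 5.0, 20.0)
ratios = [norm_ratio(a) for a in centers]  # no single bound works for all centers
```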
(3.3.4) +
for every vector $\varphi$ in $D(A)$. Moreover, $D(A)$ is dense, therefore $A^{+}\psi = A\psi$ for every $\psi$ in $D(A)$. Thus $A^{+}$ can be viewed as an extension of $A$; in other words, the domain $D(A^{+})$ can be larger than $D(A)$. In case they are the same, $A = A^{+}$. From the above it is clear that if an operator $A$ is not symmetric, the next best thing we can do is to ask that it be closed. If the operator is not closed but is defined on a dense domain, then it can be shown that its adjoint $A^{+}$ is closed.

Remark: We emphasize that for operators belonging to $\mathcal{B}(\mathcal{H})$ the notions of Hermitian, symmetric, and self-adjoint are equivalent.
Exercise 3.3
1. Prove Result (3.3.2).
2. Show that a projection operator $E$ has only two eigenvalues, which are 1 and 0.
3. Let $A$ be a linear operator and $X$ a subspace that reduces $A$. If $E$ is a projection operator onto $X$, then show that (i) $AE = EA$ and (ii) $(1 - E)A = A(1 - E)$. Conversely, an arbitrary operator $A$ that satisfies (i) and (ii) must be such that $X$ reduces it.
4. Show that an operator $A$ defined on a dense domain has an adjoint which is closed.

13 Since in the physics literature Hermitian means self-adjoint, using physicists' terminology, $A$ is symmetric if it is Hermitian and is densely defined.
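Hints to Exercise 3.3
1. If $E$ is a projection operator on a subspace $X$ of $\mathcal{H}$, then for every vector $\psi \in \mathcal{H}$:

$$E^2\psi = E(E\psi) = E\psi_X = \psi_X = E\psi$$

thus $E^2 = E$. Also, since $\varphi = \varphi_X + \varphi_{\perp}$ and $\psi = \psi_X + \psi_{\perp}$, we have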
(<j>, Ey) = (
EEyn-^EV.
But the limit vector for the sequence has to be unique, thus $E\Psi$ must equal $\Psi$; this implies that $\Psi$ is in $X$, leading to the conclusion that $X$ is a subspace. Again using the self-adjointness of $E$ we can show that if $E\psi$ is in $X$ then $(1 - E)\psi$ is in $X^{\perp}$. To do this we take an arbitrary vector $E\varphi \in X$ and form the inner product: $(E\varphi, (1 - E)\psi) = (\varphi, E(1 - E)\psi) = ($
Xy.
This shows that $\lambda(\lambda - 1)\psi = 0$, i.e., $\lambda = 0$ or 1.
3. Since $X$ reduces $A$, for every vector $\psi \in X$, $A\psi = \varphi \in X$; also, since $E$ projects on $X$, $E\psi = \psi$ and

$$EA\psi = E(A\psi) = E\varphi = \varphi = A\psi = AE\psi$$

i.e., (i) holds and (ii) follows trivially. To prove the converse, i.e., that given (i) and (ii) $X$ reduces $A$, we have to show that if $\psi \in X$ and $\varphi \in X^{\perp}$, then $A\psi$ must be in $X$ and $A\varphi$ in $X^{\perp}$. By definition, if $E$ is a projection operator on $X$, then $(1 - E)$ is a projection operator on $X^{\perp}$, consequently $E\psi = \psi$ and $(1 - E)$
4
THE SPECTRAL DECOMPOSITION
We have seen that a Hermitian or unitary operator $Q$ acting on a finite-dimensional space, when expressed in terms of its eigenvalues and eigenvectors, can be represented by a diagonal matrix with the eigenvalues along the diagonal. We shall next see that, using these eigenvalues, it can be expressed as a sum of projection operators, and that this notion can be generalized to define other projection operators leading to spectral decompositions of operators on infinite-dimensional spaces, although some of these operators may possess neither eigenvalues nor eigenvectors. The technique developed below is quite useful in the study of operators of this general nature.

Consider a Hermitian operator $H$ with eigenvalues $\lambda_k$^14 ($k = 1, \dots, n$) and corresponding eigenspaces $M_k$ spanned by eigenvectors. We know that they are mutually orthogonal; accordingly, if we denote by $I_k$ the projection operator onto $M_k$, the operator $H$ can be written as:

$$H = \sum_{k=1}^{n} \lambda_k I_k \qquad (3.4.1)$$

Evidently these projection operators are mutually orthogonal, i.e., $I_j I_k = \delta_{jk} I_k$, and have the completeness property:

$$\sum_{k=1}^{n} I_k = 1.$$

We now define another set of projection operators:

$$E_x = \sum_{\lambda_k \le x} I_k \qquad (3.4.2)$$

where $x$ is a real number and where we have assumed that $\lambda_1 < \lambda_2 < \cdots < \lambda_n$ (recall that the $\lambda_k$'s are real). $E_x$ is the projection operator on the subspace formed by eigenvectors corresponding to all eigenvalues $\lambda_k \le x$; it is zero for $x < \lambda_1$ and equals 1 when $x \ge \lambda_n$, due to the completeness property of $\{I_k\}$. Also, if $x \le y$, then

$$E_x E_y = E_x = E_y E_x \qquad (3.4.3)$$

from Def. (3.3.3) and Result (3.3.4). The operator $E_x$ defined in this manner increases from zero to one as $x$ increases through the set $\lambda_1$ to $\lambda_n$; more precisely, it increases by $I_k$ when $x$ reaches the value $\lambda_k$. For a given $x$, by choosing a positive number $\varepsilon$ small enough so that there is no eigenvalue between $x - \varepsilon$ and $x$, we set the difference between $E_x$ and $E_{x-\varepsilon}$ as:

$$dE_x = E_x - E_{x-\varepsilon} \qquad (3.4.4)$$

We note that if $x$ equals an eigenvalue $\lambda_k$, then by the very definitions of $E_x$ and $E_{x-\varepsilon}$, (3.4.4) gives:

$$dE_{\lambda_k} = (I_1 + I_2 + \cdots + I_k) - (I_1 + \cdots + I_{k-1}) = I_k \qquad (3.4.5)$$

Accordingly, it follows that:

$$\int_{-\infty}^{\infty} dE_x = 1 \qquad (3.4.6)$$

and (replacing $\lambda_k$ by $x$ in (3.4.1)):

$$\int_{-\infty}^{\infty} x\, dE_x = H \qquad (3.4.7)$$

14 To keep the discussion simple, we are taking the eigenvalues to be distinct.
For any vectors $\psi$ and $\varphi$ that belong to $D(H)$, the above equations imply:

$$(\varphi, \psi) = \int_{-\infty}^{\infty} d(\varphi, E_x\psi) \qquad (3.4.6)(a)$$

$$(\varphi, H\psi) = \int_{-\infty}^{\infty} x\, d(\varphi, E_x\psi) \qquad (3.4.7)(a)$$

It should be noted that by the very nature of (3.4.4), $(\varphi, E_x\psi)$ (which is a complex function of $x$) jumps in value by $(\varphi, I_k\psi)$ at $x = \lambda_k$.

In the case of a unitary operator $U$, we write the eigenvalues as $e^{i\theta_k}$, where $0 \le \theta_1 < \cdots < \theta_n < 2\pi$ gives the order relation; hence for each real number $x$ we can write

$$E_x = \sum_{\theta_k \le x} I_k.$$

We note that in this case $E_x$ defines the projection operator on the space formed by all eigenvectors that correspond to eigenvalues $e^{i\theta_k}$ with $\theta_k \le x$. If $x < 0$, $E_x = 0$, and if $x \ge 2\pi$, $E_x = 1$. Using the same arguments as in the case of Hermitian operators, we can write

$$U = \sum_{k=1}^{n} e^{i\theta_k} I_k$$

as

$$U = \int_0^{2\pi} e^{ix}\, dE_x \qquad (3.4.8)$$

For any two vectors $\psi$ and $\varphi$:

$$(\varphi, U\psi) = \int_0^{2\pi} e^{ix}\, d(\varphi, E_x\psi) \qquad (3.4.8)(a)$$

It is easy to note that the operators $E_x$ defined above are continuous from the right as a function of $x$, since for $\varepsilon > 0$, $E_{x+\varepsilon}\psi \to E_x\psi$ as $\varepsilon \to 0$ for any $\psi$ and $x$, whereas they are discontinuous from the left, since $dE_x = I_k$ at $x = \lambda_k$ or $\theta_k$. Having defined these projection operators, we are now in a position to state two fundamental results on Hermitian and unitary operators defined on infinite-dimensional spaces.
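The finite-dimensional construction (3.4.1) to (3.4.7) is easy to reproduce numerically. The sketch below is an illustration under assumptions made here: a randomly generated $4 \times 4$ Hermitian matrix (whose eigenvalues are distinct for a generic draw), and the hypothetical names `I`, `E`, `jumps` for the eigenprojectors $I_k$, the spectral family $E_x$ and the increments $dE_{\lambda_k}$.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (B + B.conj().T) / 2            # Hermitian test matrix
lam, V = np.linalg.eigh(H)          # real eigenvalues (ascending), orthonormal eigenvectors

# I_k: projector onto the k-th (one-dimensional) eigenspace
I = [np.outer(V[:, k], V[:, k].conj()) for k in range(4)]

def E(x):
    """Spectral family (3.4.2): sum of I_k over all eigenvalues lam_k <= x."""
    out = np.zeros((4, 4), dtype=complex)
    for k in range(4):
        if lam[k] <= x:
            out += I[k]
    return out

H_rebuilt = sum(l * P for l, P in zip(lam, I))            # (3.4.1)

eps = 0.5 * np.min(np.diff(lam))    # no eigenvalue lies in [lam_k - eps, lam_k)
jumps = [E(l) - E(l - eps) for l in lam]                  # (3.4.4), (3.4.5)
H_stieltjes = sum(l * dE for l, dE in zip(lam, jumps))    # discrete form of (3.4.7)
```

Because `E` uses the condition `lam[k] <= x`, the family is continuous from the right and jumps by $I_k$ exactly at $x = \lambda_k$, as described above.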
4.1
Results Based on Spectral Families of Operators
Definition 3.4.1: A family of projection operators $E_x$ depending on a real parameter $x$ is called a spectral family if it has the following properties:
(i) If $x \le y$, then $E_x \le E_y$, or $E_x E_y = E_x = E_y E_x$.
(ii) For any vector $\psi$ and any $x$, and a small $\varepsilon > 0$, $E_{x+\varepsilon}\psi \to E_x\psi$ as $\varepsilon \to 0$.
(iii) For any vector $\psi$, $E_x\psi \to 0$ as $x \to -\infty$, and $E_x\psi \to \psi$ as $x \to +\infty$.

We use this definition to state a few results without proof (see Sec. 107, Sec. 109 and Sec. 120 in [12], and Theorems 5.9 and 8.4 in [15] for proofs).

Result 3.4.2: For each self-adjoint operator $A$, there exists a unique spectral family of projection operators $E_x$ such that for all vectors $\psi$ and $\varphi$ (if $A$ is unbounded, $\psi \in D(A)$) the following holds good:

$$(\varphi, A\psi) = \int_{-\infty}^{\infty} x\, d(\varphi, E_x\psi) \qquad (3.4.9)$$

And the operator $A$ can be written as:

$$A = \int_{-\infty}^{\infty} x\, dE_x \qquad (3.4.10)$$

Result 3.4.3: For each unitary operator $U$ there exists a unique spectral family of projection operators $E_x$ such that $E_x = 0$ for $x < 0$ and $E_x = 1$ for $x \ge 2\pi$ and

$$(\varphi, U\psi) = \int_0^{2\pi} e^{ix}\, d(\varphi, E_x\psi) \qquad (3.4.11)$$

for all vectors $\varphi$ and $\psi$. The operator $U$ can be written as:

$$U = \int_0^{2\pi} e^{ix}\, dE_x \qquad (3.4.12)$$

The equations (3.4.10) and (3.4.12) define the spectral decomposition (resolution) of the self-adjoint operator $A$ and the unitary operator $U$ respectively. More generally (Stone's theorem), if there is a one-parameter unitary group $\{U_t : t \in \mathbb{R}\}$ having positive spectrum,^15 then there is a spectral measure $E$, i.e., a measure on the real line $\mathbb{R}$, such that

$$U_t = \int_{-\infty}^{\infty} e^{itx}\, dE(x) \qquad (3.4.13)$$

Since the positive spectrum condition implies that $E(x)$ is concentrated on the positive half-line $0 \le x < \infty$, a semigroup (with operators) $P_t$, $t \ge 0$, can be defined where:

$$P_t = \int_0^{\infty} e^{-tx}\, dE(x) \qquad (3.4.14)$$
It can be checked that $P_t$ is a strongly continuous contraction operator satisfying $P_0 = 1$ and $P_t^{+} = P_t$. Thus, while (3.4.13) gives unitary operators, (3.4.14) gives self-adjoint operators. This construction is important in the path integral approach to quantum theory (see [1] Lecture 3 and Sec. 7.6 in [17]). To illustrate the above two results, we give below three examples.

Example 3.4.4: Consider the Hilbert space $L^2(0, 1)$ of square summable functions $f(x)$ defined on the interval $[0, 1]$. Let $A$ be the self-adjoint operator defined as $(Af)(x) = x f(x)$ for every $f$, and let $E_x$ be the projection operators that are given by the rule:

$$(E_x f)(y) = f(y) \quad \text{if } y \le x \qquad (3.4.15)$$

and

$$(E_x f)(y) = 0 \quad \text{if } y > x \qquad (3.4.16)$$

Then the family $\{E_x\}$ satisfies (3.4.9) and (3.4.10), and thus gives a spectral decomposition of $A$.

Example 3.4.5: Let $Q$ be the self-adjoint operator defined by the action $(Qf)(x) = x f(x)$, where $f(x)$ now belongs to $L^2(-\infty, \infty)$, and let $E_x$ be defined as in Example (3.4.4); then the family $\{E_x\}$ satisfies (3.4.9) and gives the spectral decomposition of $Q$.

Example 3.4.6: Let $U$ be the unitary operator defined by $(Uf)(x) = e^{2\pi i x} f(x)$, where $f(x)$ belongs to $L^2(0, 1)$, and let $(E_x f)(y) = f(y)$ if $y \le x/2\pi$ and $(E_x f)(y) = 0$ if $y > x/2\pi$; then the family $\{E_x\}$ satisfies (3.4.11) and gives the spectral decomposition (3.4.12).

In each of these examples, $E_x$ is a continuous function of $x$, rather different from what we introduced in the beginning. This fact brings us to our next result, which distinguishes self-adjoint operators that possess no eigenvalues (or eigenvectors) from those that do possess eigenvalues and hence eigenvectors in the domains of their definition.

15 An operator $A$ is said to have positive spectrum if all its eigenvalues are non-negative.

Proposition 3.4.7: Let $A$ be a self-adjoint operator with the spectral decomposition:

$$A = \int_{-\infty}^{\infty} x\, dE_x \qquad (3.4.17)$$

then $E_x$ jumps in value at $x = \lambda$ if and only if $\lambda$ is an eigenvalue of $A$. If $I_\lambda$ denotes the projection operator onto the subspace spanned by the eigenvectors corresponding to $\lambda$, then $E_x I_\lambda = 0$ for $x < \lambda$ and $E_x I_\lambda = I_\lambda$ for $x \ge \lambda$, and for $\varepsilon > 0$, $(E_\lambda\psi - E_{\lambda-\varepsilon}\psi) \to I_\lambda\psi$ for every vector $\psi$, as $\varepsilon \to 0$.

We give in Exercise 3 an outline of the proof; for more details the reader may refer to [11] and [12]. From the above result it is evident that $E_x$ for an operator $Q$ sometimes increases by jumps (this is so when $Q$ has an orthonormal basis of eigenvectors) and sometimes continuously as well as by jumps. In the latter case the eigenspace spanned by eigenvectors is smaller than the whole space. Our next remark and definition clarify this point.

Remark 3.4.8: The set of points $x$ at which $E_x$, the projection operator for a self-adjoint operator $Q$, jumps is the point spectrum of $Q$. The point spectrum, as we already know from Subsec. (2.1), is the set of all eigenvalues of $Q$.

Definition 3.4.9: The set of points $x$ such that $E_x$ increases continuously in the neighbourhood of $x$ is called the continuous spectrum of $Q$. The point spectrum and continuous spectrum comprise the spectrum of $Q$, denoted $\sigma(Q)$ (see Subsec. (2.1)). Note that this is the totality of points at which $E_x$ increases. If $Q$ is a unitary operator $U$, 'the set of points $x$' is replaced by 'the set of $e^{ix}$ at points $x$' in the definitions, and thus the spectrum of $U$ is the totality of points $e^{ix}$ at points $x$ for which $E_x$ increases either by jumps or continuously.

A far reaching^16 relation between the operator and its spectrum is given by our next three results (see [12] Secs. 126-132, and [11] for the proofs).

Result 3.4.10:
A self-adjoint operator is bounded if and only if its spectrum is bounded.
Result 3.4.11: A self-adjoint operator $A$ is positive ($(\varphi, A\varphi) \ge 0$ for every
Exercise 3.4
1. Fill in the lines of proof required for (a) Example (3.4.4) and (b) Example (3.4.5).
2. Obtain the spectral decomposition of the unitary operator defined as $(Uf)(x) = e^{2\pi i x} f(x)$, where $f(x) \in L^2(0, 1)$.
3. Give lines of proof for Proposition (3.4.7) and Result (3.4.12).

16 These results are of great importance in quantum mechanics since a real physical quantity is represented by a self-adjoint operator.
Hints to Exercise 3.4
1. (a) Given (3.4.15) and (3.4.16) we have to establish (3.4.9) for every $f$ and $g \in L^2(0, 1)$ and show that $E_x$ satisfies (i), (ii) and (iii) of Definition (3.4.1). Thus for any $f$ and $g$, we have:

$$\int_{-\infty}^{\infty} x\, d(g, E_x f) = \int_{-\infty}^{\infty} x\, d \int_0^1 g(y)^* (E_x f)(y)\, dy \qquad (\alpha)$$

where we have used the definition of the scalar product on $L^2(0, 1)$. We now write it as $\int_0^1 x\, d \int_0^x g(y)^* f(y)\, dy$. Note that the limits of the outer integral in ($\alpha$) are changed since $x \in (0, 1)$, and those of the inner integral change due to (3.4.15) and (3.4.16). Obviously now,

$$\text{RHS} = \int_0^1 g(x)^* \, x f(x)\, dx = (g, Af).$$

This establishes (3.4.9). By the very definition of a projection operator (see Def. (3.3.1) and Def. (3.4.1)), if $x \le y$, $E_x \le E_y$, then $E_x E_y = E_x = E_y E_x$, which shows that (i) is valid. Again using the definition of the norm (applied here to $E_{x+\varepsilon} - E_x$, $\varepsilon > 0$) we have:

$$\|(E_{x+\varepsilon} - E_x) f\|^2 = \int_x^{x+\varepsilon} |f(y)|^2\, dy. \qquad (\beta)$$

Note that the integral on the RHS in ($\beta$) tends to zero as $\varepsilon \to 0$ for any $f \in L^2(0, 1)$ and $x$ in $[0, 1)$, and this gives (ii). Obviously $E_x = 0$ if $x \le 0$ and $E_x = 1$ if $x \ge 1$, therefore the family of projection operators $\{E_x\}$ defines a spectral family, and gives the spectral decomposition of $A$.
(b) The lines of proof for the operator $Q$ defined on $L^2(-\infty, \infty)$ are identical, except that in this case $E_x \to 0$ as $x \to -\infty$ and $E_x \to 1$ as $x \to +\infty$, and $E_x$ continues to increase in the interval $(-\infty, \infty)$.
2. We set a projection operator $E_x$ on the interval $(0, 1)$ by defining

$$(E_x f)(y) = f(y) \quad \text{when } y \le x/2\pi$$

and

$$(E_x f)(y) = 0 \quad \text{when } y > x/2\pi.$$

Using the same lines of argument as given in 1(a), we note that $\{E_x\}$ defines a spectral family for which $E_x = 0$ if $x \le 0$ and $E_x = 1$ if $x \ge 2\pi$.
3. To prove Prop. (3.4.7), we note that for $\varepsilon > \delta > 0$ and any vector $\psi$, $E_\lambda - E_{\lambda-\varepsilon}$ and $E_\lambda - E_{\lambda-\delta}$ are projection operators. Moreover $E_\lambda - E_{\lambda-\varepsilon} \ge E_\lambda - E_{\lambda-\delta}$, thus $\|(E_\lambda - E_{\lambda-\varepsilon})\psi - (E_\lambda - E_{\lambda-\delta})\psi\|^2$ * can be expressed to show (i) that $\|(E_\lambda - E_{\lambda-\varepsilon})\psi\|$ converges to a limit as $\varepsilon \to 0$, and (ii) that $\|(E_\lambda - E_{\lambda-\varepsilon})\psi\|^2 - \|(E_\lambda - E_{\lambda-\delta})\psi\|^2 \to 0$ as $\varepsilon, \delta \to 0$. Now (ii) implies that the vectors $(E_\lambda - E_{\lambda-\varepsilon})\psi$ have the Cauchy property as $\varepsilon \to 0$. But since the space is complete, $(E_\lambda - E_{\lambda-\varepsilon})\psi$ must converge to a limit vector, say $\psi_\lambda$, as $\varepsilon \to 0$.

* $\|(E_\lambda - E_{\lambda-\varepsilon})\psi - (E_\lambda - E_{\lambda-\delta})\psi\|^2 = \|(E_\lambda - E_{\lambda-\varepsilon})\psi\|^2 + \|(E_\lambda - E_{\lambda-\delta})\psi\|^2 - (\psi, (E_\lambda - E_{\lambda-\varepsilon})(E_\lambda - E_{\lambda-\delta})\psi) - (\psi, (E_\lambda - E_{\lambda-\delta})(E_\lambda - E_{\lambda-\varepsilon})\psi)$. In view of (3.4.3) and $E_x^2 = E_x$, each of the last two terms equals $\|(E_\lambda - E_{\lambda-\delta})\psi\|^2$.
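If $\psi_\lambda \neq 0$ and $x \ge \lambda$, then we have $E_x(E_\lambda - E_{\lambda-\varepsilon}) = E_\lambda - E_{\lambda-\varepsilon}$; whereas for $x < \lambda$, and thus for small $\varepsilon$, $x < \lambda - \varepsilon$, $E_x(E_\lambda - E_{\lambda-\varepsilon}) = 0$. Letting $\varepsilon \to 0$, this shows that $E_x\psi_\lambda = 0$ for $x < \lambda$ and $E_x\psi_\lambda - \psi_\lambda = 0$ for $x \ge \lambda$. So for any vector $\varphi$ we have

$$(\varphi, A\psi_\lambda) = \int_{-\infty}^{\infty} x\, d(\varphi, E_x\psi_\lambda) = \lambda(\varphi, \psi_\lambda) = (\varphi, \lambda\psi_\lambda)$$

leading to the equality $A\psi_\lambda = \lambda\psi_\lambda$, which means that $\lambda$ is an eigenvalue and $\psi_\lambda$ is an eigenvector.

To prove the second part of the proposition we let $I_\lambda$ be the projection operator onto the subspace spanned by the eigenvectors that correspond to the eigenvalue $\lambda$ of $A$. Now $\|E_x I_\lambda\psi\|^2 \to 0$ as $x \to -\infty$ and $\|E_x I_\lambda\psi\|^2 \to \|I_\lambda\psi\|^2$ as $x \to \infty$. Thus for any vector $\psi$, $\|E_x I_\lambda\psi\|^2 \to 0$ if $x < \lambda$ and $\|E_x I_\lambda\psi\|^2 \to \|I_\lambda\psi\|^2$ for $x \ge \lambda$. This means that $E_x I_\lambda = 0$ for $x < \lambda$ and $E_x I_\lambda = I_\lambda$ for $x \ge \lambda$. Hence for any vector $\psi$, $(E_\lambda - E_{\lambda-\varepsilon}) I_\lambda\psi = I_\lambda\psi$, showing that $E_x$ jumps in value at $x = \lambda$. Again for any vector $\psi$, writing

$$\|(E_\lambda - E_{\lambda-\varepsilon})(1 - I_\lambda)\psi\|^2 = ((1 - I_\lambda)\psi, (E_\lambda - E_{\lambda-\varepsilon})^2 (1 - I_\lambda)\psi) = ((1 - I_\lambda)\psi, (E_\lambda - E_{\lambda-\varepsilon})(1 - I_\lambda)\psi)$$

we note that as $\varepsilon \to 0$, it $\to 0$ for one of two reasons: either $(E_\lambda - E_{\lambda-\varepsilon})(1 - I_\lambda)\psi$ converges to a zero limit, or it is a non-zero eigenvector which must be orthogonal to $(1 - I_\lambda)\psi$. Hence for any vector $\psi$ we have $(E_\lambda - E_{\lambda-\varepsilon})\psi \to I_\lambda\psi$ as $\varepsilon \to 0$. $\square$

In the case of Result (3.4.12), we note that if, for instance, $f(x) = c_0 + c_1 x + \cdots + c_n x^n$ ($c_i \in \mathbb{C}$), then $f(A) = c_0 + c_1 A + \cdots + c_n A^n$. Let $\bar{f}(x) = f(x)^*$; for any vectors $\psi$ and $\varphi$ we have $(\varphi, [f(A)]^{+}\psi) = (\psi, f(A)\varphi)^* = \int_{-\infty}^{\infty} f(x)^*\, d(\psi, E_x\varphi)^* = \int_{-\infty}^{\infty} \bar{f}(x)\, d($
is a unitary operator. Writing $(\varphi, f(A)\varphi) = \int_{-\infty}^{\infty} f(x)\, d\|E_x\varphi\|^2$, we have that $f(A)$ is positive if $f(x) \ge 0$ over the spectrum of $A$.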
5
GROUP THEORETIC ASPECTS OF OPERATORS
In the previous section we have already studied (though partially) the equation

$$L\psi = \lambda\psi \qquad (3.5.1)$$

for a linear operator, in order to learn about its eigenvalues $\{\lambda\}$ and its eigenfunctions $\{\psi\}$. Since it happens to be an important equation in mathematical physics, we shall devote Section 6 to the study of (3.5.1) where $L$ stands for a few specific operators of interest to both mathematicians and physicists. In this section we define a few objects related to the algebraic structure of operators.

Definition 3.5.1: Let $\Omega_1$ and $\Omega_2$ be any two operators that commute with $L$; then their product as well as their linear combinations also commute with $L$. The collection of all such operators forms an algebra called the commutator algebra of $L$. We shall denote it by $C_L$.
From equation (3.5.1) it is evident that $\Omega \in C_L$ implies that $\Omega\psi$ (whenever it is non-zero) is an eigenvector of $L$ with the same eigenvalue $\lambda$ that corresponds to the eigenvector $\psi$. If $V_\lambda$ denotes the vector space spanned by the eigenvectors that correspond to the same eigenvalue $\lambda$, then $V_\lambda$ is invariant under $\Omega$; it follows that it is invariant under the entire $C_L$. We further observe that if $V_\lambda$ is finite-dimensional and the operators in $C_L$ generate the full algebra of matrices in the vector space $V_\lambda$, then any solution $\psi \in V_\lambda$ can be obtained from a single solution $\psi_0 \in V_\lambda$ ($\psi_0 \neq 0$) by elements of the commutator algebra. We shall return to this idea in Result (3.6.1).

Definition 3.5.2: The set of all invertible operators in $C_L$ forms a group $G$ called the full symmetry group of $L$. If $G$ is a Lie group, it is of particular interest. We shall see in an example in Chapter 6 that this group for the operator $\Delta$ acting on $S^3$ is $SO(3)$ (Exp. (6.7.6)).

It should be noted that the set $C_L$ can always be made into a Lie algebra since it contains the elements $\Omega_1\Omega_2 - \Omega_2\Omega_1$ together with $\Omega_1$, $\Omega_2$. We shall show later in an example (Chapter 6, Exp. (6.7.8)) that this Lie algebra derived via the symmetry group of $\Delta$ is indeed the Lie algebra of $SO(3)$. We have mostly talked about operators on finite-dimensional spaces; in reality, though, we have to deal with operators on infinite-dimensional spaces. The next two definitions involve such spaces.

Definition 3.5.3: An operator $\Omega$ is called transitive if the only closed subspace $\mathcal{H}$ of the ambient Hilbert space $\mathcal{H}'$ satisfying $\Omega\mathcal{H} \subseteq \mathcal{H}$ is $\mathcal{H} = \{0\}$ or $\mathcal{H} = \mathcal{H}'$. The term intransitive simply stands for non-transitive.

Definition 3.5.4: A linear bounded mapping (operator) $\Omega$ from a Hilbert space $\mathcal{H}_1$ into a Hilbert space $\mathcal{H}_2$ is called compact if it carries bounded sets in $\mathcal{H}_1$ into subsets of compact sets in $\mathcal{H}_2$. An important property of such an operator is that the image sequence $\{\Omega x_n\}$ of every weakly convergent sequence $\{x_n\}$ in $\mathcal{H}_1$ is strongly convergent.
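The remark that elements of the commutator algebra $C_L$ map each eigenspace $V_\lambda$ into itself can be illustrated in finite dimensions. The sketch below rests on assumptions made here for illustration: $L$ is a real symmetric $4 \times 4$ matrix with a three-fold degenerate eigenvalue 2, and `Omega` is a hypothetical element of $C_L$ built to be block-diagonal in the eigenbasis of $L$.

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))      # orthonormal eigenbasis of L
L = Q @ np.diag([2.0, 2.0, 2.0, 5.0]) @ Q.T       # eigenvalue 2 has a 3-dim eigenspace V_2

# Omega acts arbitrarily inside V_2 and by a scalar on the remaining eigenvector,
# which makes it commute with L.
block = rng.normal(size=(3, 3))
Omega = Q @ np.block([[block, np.zeros((3, 1))],
                      [np.zeros((1, 3)), np.array([[0.7]])]]) @ Q.T

commutator = Omega @ L - L @ Omega                # vanishes: Omega lies in C_L

psi = Q[:, :3] @ rng.normal(size=3)               # a vector in V_2, so L psi = 2 psi
image = Omega @ psi                               # stays in V_2
residual = L @ image - 2.0 * image                # vanishes if image is in V_2
```

Since `block` is a generic $3 \times 3$ matrix, operators of this form already exhaust the full matrix algebra on $V_2$, which is the situation described in the text.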
The following results concerning such operators, listed here without proof, are of great interest in operator algebras (see 10.[3] for proofs of Results (3.5.5) and (3.5.6) and also [17]).

Result 3.5.5: Every compact operator on an infinite-dimensional Hilbert space is intransitive.

Result 3.5.6: If an operator $A$ commutes with a non-zero compact operator, then $A$ must be intransitive.
6
A FEW IMPORTANT OPERATORS
We devote this section to those operators which are of great interest from the applications point of view, namely the Laplace operator, the Schrödinger operator and the Dirac operator. In earlier sections we have studied only (self-adjoint) Hermitian operators; in general, however, operators are not self-adjoint. In Subsection 3 we introduce operators of this general type through examples of two of the well-known operators mentioned above. We begin here with the Laplace operator $\Delta$, which can easily be checked to be self-adjoint and hence real-valued on the domains we are considering here (see Subsecs. 3 and 4 for cases where $-\Delta$ is not necessarily self-adjoint).
6.1 Laplace Operator
For the sake of simplicity we denote the Laplace operator $\Delta$ in the next two subsections as $L$ and note that on compact manifolds without boundary, writing it as $L = d\delta + \delta d$, we can use it to determine the harmonic forms. Due to its importance, we study it on four different domains. The first of these is the Euclidean plane $\mathbb{R}^2 : \{(x_1, x_2)\}$, where $L$, often denoted as $L_{\mathbb{R}^2}$, is in its simplest form:

$$L_{\mathbb{R}^2} = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}.$$

Its eigenfunction and eigenvalue in this case are respectively $e^{i\lambda(a_1 x_1 + a_2 x_2)}$ and $-\lambda^2$, $a = (a_1, a_2)$ being a unit vector in $\mathbb{R}^2$ and $\lambda$ a complex number. When $\mathbb{R}^2$ is considered as the homogeneous space $M(2)/O(2)$ formed by $M(2)$, the group of isometries, modulo the orthogonal group $O(2)$ that leaves $(0, 0)$ fixed, we note that every differential operator on $\mathbb{R}^2$ which is $M(2)$-invariant turns out to be a polynomial in $L$. If we think of $\mathbb{R}^2$ as a complex plane $z = x + iy$, then $L$ becomes:

$$L = 4\frac{\partial^2}{\partial z\, \partial \bar{z}}. \qquad (3.6.1)$$
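Here, we obtain its transform under conformal maps. For this we consider the group $SL(2, \mathbb{C})$ formed by complex matrices

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = g$$

of determinant 1; the group acts transitively on the one-point compactification of the plane via the conformal maps:

$$z \to \frac{\alpha z + \beta}{\gamma z + \delta} \qquad (3.6.2)$$

We define the transform $D^g$ of a differential operator $D$ on $\mathbb{R}^2$ under the mapping (3.6.2) as^17:

$$D^g : f \to (D(f \circ g)) \circ g^{-1}, \qquad f \in C^{\infty}(\mathbb{R}^2) \qquad (3.6.3(a))$$

If the differential operator $D$ is $\partial/\partial z$ we have:

$$\left(\frac{\partial}{\partial z}\right)^{g} f = \left(\frac{\partial}{\partial z}(f \circ g)\right) \circ g^{-1} \qquad (3.6.3(b))$$

The simplification of (3.6.3)(b) gives:

$$\left(\frac{\partial}{\partial z}\right)^{g} f = (-\gamma z + \alpha)^2\, \frac{\partial f}{\partial z} \qquad (3.6.4(a))$$

Complex conjugation of the above equality further gives:

$$\left(\frac{\partial}{\partial \bar{z}}\right)^{g} f = \overline{(-\gamma z + \alpha)}^{\,2}\, \frac{\partial f}{\partial \bar{z}} \qquad (3.6.4(b))$$

17 Note that (3.6.2) implies that $g(z) = \dfrac{\alpha z + \beta}{\gamma z + \delta}$, and hence $g^{-1}(z) = \dfrac{\delta z - \beta}{-\gamma z + \alpha}$.

Thus the transform $L^g$ of $L$ under the mapping (3.6.2) satisfies:

$$L^g = |\gamma z - \alpha|^4\, L. \qquad (3.6.5)$$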
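Relation (3.6.5) can be spot-checked numerically. The sketch below is a rough illustration under assumptions chosen here: particular values of $\alpha, \beta, \gamma$ (with $\delta$ fixed by the determinant condition), the test function $f(z) = |z|^4$ for which $Lf = 16|z|^2$, and a five-point finite-difference Laplacian named `laplacian`. It evaluates $(L^g f)(z_0) = [L(f \circ g)](g^{-1}(z_0))$ and compares it with $|\gamma z_0 - \alpha|^4 (Lf)(z_0)$.

```python
import numpy as np

alpha, beta, gamma = 1.0 + 0.5j, 0.3 + 0.0j, 0.2 - 0.1j
delta = (1.0 + beta * gamma) / alpha     # forces alpha*delta - beta*gamma = 1

def g(z):
    """The conformal map (3.6.2)."""
    return (alpha * z + beta) / (gamma * z + delta)

def g_inv(z):
    """Inverse map, as given in footnote 17."""
    return (delta * z - beta) / (-gamma * z + alpha)

def f(z):
    """Test function f(z) = |z|^4, for which L f = 16 |z|^2."""
    return np.abs(z) ** 4

def laplacian(func, w, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian at w = x + iy."""
    return (func(w + h) + func(w - h) + func(w + 1j * h) + func(w - 1j * h)
            - 4.0 * func(w)) / h ** 2

z0 = 0.4 + 0.3j
lhs = laplacian(lambda w: f(g(w)), g_inv(z0))           # (L^g f)(z0)
rhs = abs(gamma * z0 - alpha) ** 4 * 16.0 * abs(z0) ** 2  # |gamma z0 - alpha|^4 (L f)(z0)
```

The two values agree up to the discretization error of the finite-difference stencil.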
Finally we consider the Poincaré model of the non-Euclidean plane and obtain the corresponding $L$ in (3.6.11)(b). The Poincaré model is given by the open disc $\mathcal{D} = \{|z| < 1\}$ in $\mathbb{R}^2$ equipped with the Riemannian structure:

$$\langle \xi, \eta \rangle_z = \frac{(\xi, \eta)}{(1 - |z|^2)^2} \qquad (3.6.6)$$

where $\xi$, $\eta$ are any tangent vectors at $z \in \mathcal{D}$ and $(\ ,\ )$ is the inner product in $\mathbb{R}^2$. Using (3.6.6), the infinitesimal arc length of a segment in $\mathcal{D}$ can be written as:

$$ds^2 = \frac{dx_1^2 + dx_2^2}{(1 - x_1^2 - x_2^2)^2} \qquad (3.6.7)$$

or equivalently as:

$$ds^2 = g_{ij}\, dx^i dx^j \quad \text{where} \quad g_{ij} = (1 - |z|^2)^{-2}\, \delta_{ij} \qquad (3.6.8)$$

The disc $\mathcal{D}$ can be identified with $SU(1,1)/SO(2)$ through the mapping:

$$z \to \frac{\alpha z + \beta}{\bar{\beta} z + \bar{\alpha}}, \qquad |\alpha|^2 - |\beta|^2 = 1 \qquad (3.6.9)$$

The above mapping is the onto conformal mapping of the unit disc.^18 It can be checked that the action of $SU(1,1)$ on $\mathcal{D}$ is transitive and the subgroup fixing the origin is $SO(2)$.
6.2 The Riemannian Measure
Consider the Riemannian measure

$$\varphi \to \int \varphi\, \sqrt{G}\, dx_1 \cdots dx_n \qquad (3.6.10)(a)$$

and the Laplace-Beltrami operator

$$L : \varphi \to \frac{1}{\sqrt{G}} \sum_{k} \partial_k \Big( \sum_{i} g^{ik} \sqrt{G}\, \partial_i \varphi \Big) \qquad (3.6.10)(b)$$

in an $n$-dimensional space, where $G = |\det g_{ij}|$. In view of (3.6.8) we note that for $\mathcal{D}$ they are respectively:

$$dz = \frac{dx_1\, dx_2}{(1 - x_1^2 - x_2^2)^2} \qquad (3.6.11)(a)$$

and

$$L = (1 - x_1^2 - x_2^2)^2 \left( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} \right) \qquad (3.6.11)(b)$$

Since the Riemannian measure and the Laplace-Beltrami operator are invariants of isometries, and all isometries of $\mathcal{D}$ are given by the mapping (3.6.9) and the conjugation $z \to \bar{z}$, it follows that $dz$ and $L$ given respectively by (3.6.11)(a) and (3.6.11)(b) are invariants of these mappings (see Hints to Exercises 1 and 2).

18 See Chapter 1, Sec. 4.

Listed below are a few more results on the Laplacian without proof; the first two of these relate the Laplace operator on $\mathbb{R}^n$ to that of $S^{n-1}$. The proofs can be found in any standard text on complex or harmonic analysis (the texts to our taste are references 1.[1] and [9]).

Result 3.6.1: An eigenfunction $\Phi$ of $L_{\mathbb{R}^2}$ with eigenvalue $(-\lambda^2)$ always satisfies the functional equation:

$$\frac{1}{2\pi} \int_0^{2\pi} \Phi(z + e^{i\theta} w)\, d\theta = \Phi(z)\, \psi_{\lambda,0}(w) \qquad (3.6.12)$$

where $z, w \in \mathbb{C}$ and $\psi_{\lambda,m}$ is given by:

$$\psi_{\lambda,m}(x) = \frac{1}{2\pi} \int_{S^1} e^{i\lambda(x, \omega)}\, e^{im\theta}\, d\omega. \qquad (3.6.13)$$

Here $d\omega$ is the circular measure on $S^1$, $x = (x_1, x_2) \in \mathbb{R}^2$ and $m \in \mathbb{Z}$. Conversely, a continuous function $\Phi$ satisfying (3.6.12) is a solution to the equation $L_{\mathbb{R}^2}\Phi = -\lambda^2\Phi$. It can be verified that if the solutions $\Phi$ of the above equation satisfy:

$$\Phi(e^{i\theta} z) = e^{im\theta}\, \Phi(z) \qquad (3.6.14)$$

then they must be constant multiples of the function $\psi_{\lambda,m}$ defined in (3.6.13) (see [9], pp. 14-16).

Result 3.6.2:
The Laplacian $L_{\mathbb{R}^n}$ ($n > 1$) has the form

$$L_{\mathbb{R}^n} = \frac{\partial^2}{\partial r^2} + \frac{n-1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} L_S \qquad (3.6.15)$$

where $L_S$ is the Laplacian on $S^{n-1}$, the $(n-1)$-dimensional unit sphere (see Exercise 6 of Sec. 1 for $n = 3$).

Result 3.6.3: The eigenspaces corresponding to $L_S$ on $S^{n-1}$ are of the form:

$$E_m = \text{Span of } \Big\{ \Big( \sum_{i=1}^{n} a_i s_i \Big)^m : a = (a_1, \dots, a_n) \in \mathbb{C}^n \text{ isotropic}^{19} \Big\} \qquad (3.6.16)$$

where $(s_1, \dots, s_n)$ are cartesian coordinates on $S^{n-1}$ and $m \in \mathbb{Z}^{+}$. The eigenvalue is $-m(m + n - 2)$. Each eigenspace representation is irreducible. Since they are mutually orthogonal, they give the orthogonal decomposition

$$\sum_{m=0}^{\infty} E_m \qquad (3.6.17)$$

of the Hilbert space $L^2(S^{n-1})$ formed by square integrable functions on $S^{n-1}$.
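The eigenvalue $-m(m + n - 2)$ in Result 3.6.3 follows from (3.6.15) once one notes that $(\sum a_i x_i)^m$ is harmonic on $\mathbb{R}^n$ when $a$ is isotropic. A minimal numerical sketch of this, under assumptions chosen here ($n = 3$, $m = 3$, the isotropic vector $a = (1, i, 0)$, central finite differences, and the hypothetical helper names `laplacian` and `radial_part`):

```python
import numpy as np

n, m = 3, 3
a = np.array([1.0, 1.0j, 0.0])       # isotropic: a_1^2 + a_2^2 + a_3^2 = 0

def P(x):
    """Homogeneous polynomial (a . x)^m; its restriction to the sphere lies in E_m."""
    return np.dot(a, x) ** m

def laplacian(func, x, h=1e-3):
    """Central-difference Laplacian on R^n."""
    total = 0.0
    for i in range(len(x)):
        e = np.zeros(len(x))
        e[i] = h
        total = total + (func(x + e) + func(x - e) - 2.0 * func(x)) / h ** 2
    return total

def radial_part(func, s, h=1e-3):
    """(d^2/dr^2 + (n-1)/r d/dr) applied to r -> func(r s), evaluated at r = 1."""
    d1 = (func((1 + h) * s) - func((1 - h) * s)) / (2.0 * h)
    d2 = (func((1 + h) * s) + func((1 - h) * s) - 2.0 * func(s)) / h ** 2
    return d2 + (n - 1) * d1

s = np.array([0.48, 0.6, 0.64])      # a point on the unit sphere S^2
full = laplacian(P, s)               # vanishes: P is harmonic
sphere_part = full - radial_part(P, s)   # by (3.6.15) at r = 1 this is (L_S P)(s)
eigenvalue = sphere_part / P(s)      # expected to be close to -m (m + n - 2)
```

For $n = 3$ this reproduces the familiar value $-m(m+1)$ for spherical harmonics of degree $m$.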
19 A vector $a = (a_1, \dots, a_n) \in \mathbb{C}^n$ is called isotropic if $\sum_{i=1}^{n} a_i^2 = 0$.
6.3
Operators other than $\Delta$
Continuing in this sequence of ideas we consider a few more important operators that relate to known physical theories. We again use the letter $L$ to denote a formal differential operator on the Hilbert space $\mathcal{H} = L^2(E)$ ($E$ being 3-dimensional Euclidean space) and emphasize that $L$ is not necessarily self-adjoint. However, given such an operator, a densely defined linear operator can be constructed by restricting the domain $L^2(E)$ to $C_0^{\infty}(E)$, i.e., to the set of infinitely differentiable functions with compact support. This new operator turns out to be self-adjoint. We illustrate this point by choosing $L$ as the Schrödinger operator (note that in Sec. 1 we have used an additional factor on the R.H.S.):

$$L = -\Delta + q(x) \qquad (3.6.18)$$

Obviously $L$ may not even be an operator on $\mathcal{H} = L^2(E)$ if $q(x)$ is not suitably chosen; for instance if $q(x)$ is singular, then for $u(x) \in L^2(E)$ it is not necessary that $q(x)u(x) \in L^2(E)$. To begin with, therefore, we ensure that $q(x)$ is locally square integrable (i.e., $q \in L^2(K)$, $K$ a compact set of $E$) and, to make $L$ into an operator for which the notion of self-adjointness is relevant, we assume that its domain is restricted to $C_0^{\infty}(E)$ or at least to $C_0^{2}(E)$. The operator obtained by restricting the domain of $L$ to $C_0^{\infty}$ (denoted $S$) is called the minimal operator pertaining to $L$. Evidently the operator $S$ can be extended to another operator by enlarging its domain with the inclusion of all functions $u \in L^2(E)$. The following definition of self-adjointness of a differential operator $L$ makes it clear why a densely defined operator needs to be constructed from a given $L$. A differential operator $L$ is said to be formally self-adjoint if Green's identity holds, i.e., if

$$\int_{E_0} \big( (Lu)\,\bar{v} - u\,\overline{Lv} \big)\, dx = \int_{\partial E_0} \Big( \frac{\partial u}{\partial n}\,\bar{v} - u\,\frac{\partial \bar{v}}{\partial n} \Big)\, ds \qquad (3.6.19)$$

where $E_0$ is any sub-domain of $E$ with compact closure and with smooth boundary $\partial E_0$; $ds$ on the RHS stands for the surface element and $\partial/\partial n$ for the inward normal derivative on $\partial E_0$. When $q(x)$ is zero, $L$ becomes the familiar Laplace operator $-\Delta$, whose minimal operator, usually denoted $\dot{T}$ in the literature, is self-adjoint. The domain of $\dot{T}$ in view of the above discussion is $C_0^{\infty}(E)$ (see Chapter 6 in [17]).
6.4
Dirac Operator
Finally we write the Dirac operator20 for a particle moving freely in space. Using the same notation L we have: ' Although we shall return to this operator in other Chapters (e.g., 9 and 10), as a historical note we would like to mention here that this operator, introduced by the physicist P. A. M. Dirac in search of Lorentz invariant first order differential operator in the late 1920s, has led to important discoveries in mathematics through Atiyah, Bott and Singer's work. M. Atiyah and I. Singer-through their celebrated work on index theory broadened the concept of this operator by giving it a bundle-theoretic formulation (we use it in the Appendix to Chapter 10 to define elliptic operators and the heat operator). As this bundle-theoretic definition of the Dirac-operator is due to Atiyah and Singer, the Dirac-operator in this format is referred to asAtiyah-Singer operator (See. 7.[14]).
140
Mathematical Perspectives on Theoretical Physics
$$L = -i\,\alpha\cdot\mathrm{grad} + \beta. \tag{3.6.20}$$
This operator L differs from the Schrödinger operator or other differential operators in the sense that its domain consists of vector valued (or spinor valued) functions $u(x) = (u_1(x), u_2(x), u_3(x), u_4(x))$, whose components $u_j(x)$ (j = 1, 2, 3, 4) are functions of the space variables $(x_1, x_2, x_3)$. More specifically, if $u(x) = (u_j(x)) \in \mathbb{C}^4$, then the components $\alpha_k$ (k = 1, 2, 3) of the vector $\alpha$ can be viewed as operators and can be identified with their (operator) representations given by (4 × 4) matrices with complex entries. $\beta$ is also a (4 × 4) matrix of the same kind. Thus $Lu = v = (v_1(x), \ldots, v_4(x))$ in terms of the components would be:
$$v_k(x) = \sum_{h=1}^{4}\left(-i\sum_{j=1}^{3}(\alpha_j)_{kh}\,\frac{\partial}{\partial x_j} + \beta_{kh}\right)u_h(x) \tag{3.6.21}$$
The matrices $\alpha_k$ and $\beta$ are Hermitian matrices and satisfy the anticommutation relation $\alpha_k\alpha_h + \alpha_h\alpha_k = 2\delta_{kh}I$ for k, h = 1, 2, 3, 4, where $\beta$ has been set up as $\alpha_4$. Using this operator we can construct other operators on the Hilbert space $\mathcal{H} = (L^2(\mathbb{R}^3))^4$. Elements of $\mathcal{H}$ are $\mathbb{C}^4$-valued square integrable functions that satisfy:
$$\|u\|^2 = \int |u(x)|^2\,dx, \qquad |u(x)|^2 = \sum_{k=1}^{4}|u_k(x)|^2$$
and whose associated inner product is
$$(u, v) = \int u(x)\cdot v(x)\,dx,$$
where $u(x)\cdot v(x)$ denotes the Hermitian inner product in $\mathbb{C}^4$. The corresponding minimal operator T (Tu = Lu) is obtained by restricting the domain D(T) to $(C_0^\infty(\mathbb{R}^3))^4$. Thus it consists of all u(x) with the components $u_k(x)$ lying in $C_0^\infty(\mathbb{R}^3)$. T is self-adjoint. The Dirac operator for a particle in a static field with potential q(x) can be written as:
$$L = -i\,\alpha\cdot\mathrm{grad} + \beta + Q \tag{3.6.22}$$
where Q is the multiplication operator q(x)I. In order to define the minimal operator S (of L in (3.6.22)) with domain $(C_0^\infty(\mathbb{R}^3))^4$ we must ask that q(x) be locally square integrable. Again S is densely defined but not necessarily self-adjoint.
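The algebraic condition on $\alpha_k$ and $\beta$ can be checked numerically. The sketch below assumes the standard Dirac representation (the text does not fix a representation, so the matrices and all helper names here are illustrative choices) and verifies that the four matrices are Hermitian and satisfy $\alpha_k\alpha_h + \alpha_h\alpha_k = 2\delta_{kh}I$ with $\alpha_4 = \beta$.

```python
# Assumption: the standard Dirac representation, built from Pauli matrices.
# The source only requires Hermitian 4x4 matrices with the stated relation.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

I2 = [[1, 0], [0, 1]]
NEG_I2 = [[-1, 0], [0, -1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

ALPHA = [block(Z2, s, s, Z2) for s in PAULI]   # alpha_1, alpha_2, alpha_3
BETA = block(I2, Z2, Z2, NEG_I2)               # beta, treated as alpha_4
GENERATORS = ALPHA + [BETA]

def satisfies_anticommutation(mats):
    """True if a_k a_h + a_h a_k == 2 delta_kh I for every pair."""
    n = len(mats[0])
    for k, a in enumerate(mats):
        for h, b in enumerate(mats):
            expected = [[2 if (k == h and i == j) else 0 for j in range(n)]
                        for i in range(n)]
            if add(matmul(a, b), matmul(b, a)) != expected:
                return False
    return True

def all_hermitian(mats):
    return all(dagger(m) == m for m in mats)
```

Any other representation related to this one by a unitary change of basis would pass the same two checks.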
Exercise 3.6
1. Show that the Riemannian measure and Laplace-Beltrami operator for the Poincaré model $\mathcal{D}$ are respectively $(1-|z|^2)^{-2}\,\epsilon$ and $(1-|z|^2)^{2}L_{\mathbb{R}^2}$, where $\epsilon$ and $L_{\mathbb{R}^2}$ stand for the Euclidean measure and Laplace operator in $\mathbb{R}^2$.
2. Show that the Riemannian measure and the Laplace-Beltrami operators on $\mathcal{D}$ are invariant under isometries of $\mathcal{D}$.
A Primer on Operators 141
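Hints to Exercise 3.6
1. By definition
$$g_{ij} = (1-|z|^2)^{-2}\,\delta_{ij} = (1 - x_1^2 - x_2^2)^{-2}\,\delta_{ij},$$
accordingly
$$G = \det(g_{ij}) = \frac{1}{(1-|z|^2)^4}.$$
This leads to
$$\sqrt{G} = \frac{1}{(1-|z|^2)^2}.$$
Also the Euclidean measure for $\mathbb{R}^2$ is $dx_1\,dx_2$, hence the result is obvious for the first part of the exercise. For the second part, we note that $g^{ij}$ in this case is simply $(g_{ij})^{-1}$, therefore $g^{ij} = (1-|z|^2)^2\,\delta^{ij}$ and $g^{ij}\sqrt{G} = \delta^{ij}$. This substitution along with the value of $\sqrt{G}$ in (3.6.10)(b) gives the required result, since
$$L_{\mathcal{D}} = (1 - x_1^2 - x_2^2)^2\left(\frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}\right).$$
2. We show first that the mappings^21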
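$$z \to \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha} \qquad\text{and}\qquad z \to \bar z$$
(normalized so that $|\alpha|^2 - |\beta|^2 = 1$) are isometries for $\mathcal{D}$, i.e. (3.6.7) or (3.6.8) are unaltered by replacing z by $\bar z$ or by $\frac{\alpha z+\beta}{\bar\beta z+\bar\alpha}$. Note that (3.6.7) can be written as:
$$\frac{1}{(1-|z|^2)^2}\left[\left(\frac{dz + d\bar z}{2}\right)^2 + \left(\frac{dz - d\bar z}{2i}\right)^2\right] = \frac{dz\,d\bar z}{(1-|z|^2)^2} \tag{i}$$
which immediately shows that $z \to \bar z$ is an isometry. Now $z \to \frac{\alpha z+\beta}{\bar\beta z+\bar\alpha}$ implies that
$$dz \to \frac{\alpha(\bar\beta z + \bar\alpha) - \bar\beta(\alpha z + \beta)}{(\bar\beta z + \bar\alpha)^2}\,dz = \frac{dz}{(\bar\beta z + \bar\alpha)^2}.$$
Similarly $\bar z \to \frac{\bar\alpha\bar z + \bar\beta}{\beta\bar z + \alpha}$ implies that
$$d\bar z \to \frac{d\bar z}{(\beta\bar z + \alpha)^2}.$$
We substitute the values of $z$, $\bar z$, $dz$ and $d\bar z$ in the RHS of (i) and simplify:
$$\frac{1}{\left(1 - \dfrac{(\alpha z+\beta)(\bar\alpha\bar z+\bar\beta)}{(\bar\beta z+\bar\alpha)(\beta\bar z+\alpha)}\right)^{2}}\;\frac{dz\,d\bar z}{(\bar\beta z+\bar\alpha)^2(\beta\bar z+\alpha)^2}
= \frac{dz\,d\bar z}{\left\{|\beta|^2|z|^2 + |\alpha|^2 + \alpha\bar\beta z + \bar\alpha\beta\bar z - \left(|\alpha|^2|z|^2 + |\beta|^2 + \alpha\bar\beta z + \bar\alpha\beta\bar z\right)\right\}^2}
= \frac{dz\,d\bar z}{(1-|z|^2)^2}. \tag{ii}$$
Here we have used the basic computational method to show that
$$z \to \frac{\alpha z+\beta}{\bar\beta z+\bar\alpha}$$
is an isometry. A more fundamental way of arriving at it would be by showing first that the absolute values
$$\left|\frac{z_1 - z_2}{1 - \bar z_1 z_2}\right|$$
are conformal invariants for any pair of points $z_1, z_2 \in \mathcal{D}$, and then by taking the limit $z_2 \to z_1$. This would lead to the fact
$$\frac{|d\omega|}{1-|\omega|^2} = \frac{|dz|}{1-|z|^2}, \qquad\text{where}\quad \omega = \frac{\alpha z+\beta}{\bar\beta z+\bar\alpha}.$$
21 See also Sec. 3, Chapter 1.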
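To show the invariance of the Laplace operator under this mapping (normalized so that $|\alpha|^2 - |\beta|^2 = 1$), we simply note that $\frac{\partial}{\partial z}$ and $\frac{\partial}{\partial\bar z}$ respectively correspond to
$$(\bar\beta z+\bar\alpha)^2\,\frac{\partial}{\partial z} \qquad\text{and}\qquad (\beta\bar z+\alpha)^2\,\frac{\partial}{\partial\bar z},$$
while $(1-|z|^2)^2$ corresponds to
$$\frac{(1-|z|^2)^2}{(\bar\beta z+\bar\alpha)^2(\beta\bar z+\alpha)^2};$$
accordingly the result is obvious. For establishing the invariance of the Riemannian measure
$$\frac{4\,dx\,dy}{(1-|z|^2)^2}$$
we use the conformal invariance of
$$\frac{2\,|dz|}{1-|z|^2}$$
by choosing $z_1, z_2$ with equal y-coordinates to obtain dx and with equal x-coordinates to obtain dy. [Naturally this subtle approach avoids computational complications. We are of the opinion, though, that the computational methods provide some insight for a layman.]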
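The two invariance statements used in this hint lend themselves to a quick numerical sanity check. The sketch below is only illustrative: the parametrization $\alpha = \cosh t\,e^{i\theta}$, $\beta = \sinh t\,e^{i\varphi}$ (which gives $|\alpha|^2 - |\beta|^2 = 1$) and all function names are assumptions made here, and the derivative is approximated by a central difference.

```python
import cmath
import math

def disk_automorphism(z, alpha, beta):
    """z -> (alpha z + beta) / (conj(beta) z + conj(alpha))."""
    return (alpha * z + beta) / (beta.conjugate() * z + alpha.conjugate())

def make_parameters(t, theta, phi):
    """Hypothetical parametrization with |alpha|^2 - |beta|^2 = 1."""
    alpha = math.cosh(t) * cmath.exp(1j * theta)
    beta = math.sinh(t) * cmath.exp(1j * phi)
    return alpha, beta

def pseudo_distance(z1, z2):
    """|z1 - z2| / |1 - conj(z1) z2|, the invariant quoted in the hint."""
    return abs(z1 - z2) / abs(1 - z1.conjugate() * z2)

def line_element_ratio(z, alpha, beta, h=1e-6):
    """Ratio of |dw|/(1-|w|^2) to |dz|/(1-|z|^2); equals 1 for an isometry."""
    w = disk_automorphism(z, alpha, beta)
    dw_dz = (disk_automorphism(z + h, alpha, beta)
             - disk_automorphism(z - h, alpha, beta)) / (2 * h)
    return abs(dw_dz) * (1 - abs(z) ** 2) / (1 - abs(w) ** 2)

if __name__ == "__main__":
    a, b = make_parameters(0.7, 0.3, 1.1)
    print(line_element_ratio(0.3 + 0.4j, a, b))
```

A ratio close to 1 at several sample points is consistent with, but of course does not prove, the invariance derived above.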
7
REPRESENTATIONS OF SU(2) AND SU(3) USING THE THEORY OF OPERATORS
We devote this section to two well known examples where group representations and operator theory are used to explain ideas (e.g. weak and strong interactions) in particle physics.^22 The groups we have in mind are the special unitary groups SU(2) and SU(3), the former being the group of 'isospin invariance' of Yang-Mills theory (also known as the group of internal symmetries of weak interactions in electro-weak theory) and the latter (the 'eightfold way') that of quark theory and of QCD (the group that explained the strong interactions). (See Subsec. (6.5.2).) We have already touched upon both these groups in Chapter 2 (see Exercise 2.2.11 and Exercise 2.4.2) in connection with the Lie algebra satisfied by their generators. Our aim here is to obtain the irreducible representations of these groups using the terminology of operators studied in this chapter. The operators in this case are generators of these groups, and the space on which they act is the Hilbert space of state vectors. Thus eigenfunctions are now replaced by eigenstates (see Secs. 9.2 and 9A).
7.1
The Group SU(2)
In the case of this group we achieve our objective by using the fact that SU(2) is locally isomorphic to SO(3), the group of rotations (the two groups share the same Lie algebra), and as such we have the angular momentum operators $J_a$ (a = 1, 2, 3) (the generators of the group) and their squared sum
$$J^2 = \sum_{a=1}^{3} J_a^2 \tag{3.7.1}$$
at our disposal. These can be identified with the generators $\{\sigma_a/2\}$ (a = 1, 2, 3) of SU(2), where $\sigma_a$ are the Pauli matrices.^23 Recall that $J_a = \sigma_a/2$ satisfies the relation
$$[J_a, J_b] = i\,\epsilon_{abc}\,J_c \tag{3.7.2}$$
22 The reader may like to return to this section after Chapter 9 (in particular 9A). We discuss these ideas further in Chapter 6.
Since the structure constants in both cases (see 2.4.9) are the same up to scalars, the equations set up in the J's can be interpreted to give results for SU(2); more precisely, an irreducible representation of SU(2) can be obtained in terms of eigenstates of these operators. To this end, we first note that $J^2$ defined by (3.7.1) is an invariant operator (in fact it is a Casimir operator; see Exercise 6 of Sec. 2 and the footnote there for the definition) which commutes with all the other operators $J_a$:
$$[J^2, J_a] = 0 \tag{3.7.3}$$
Next, defining the raising and lowering operators
$$J_+ = J_1 + iJ_2, \qquad J_- = J_1 - iJ_2 \tag{3.7.4}$$
we can write $J^2$ as
$$J^2 = \tfrac{1}{2}\,(J_+J_- + J_-J_+) + J_3^2 \tag{3.7.5}$$
and using (3.7.2) we have the bracket relations:
$$[J_+, J_-] = 2J_3 \tag{3.7.6}$$
and
$$[J_\pm, J_3] = \mp J_\pm \tag{3.7.7}$$
We now consider an eigenstate $|\alpha, \mu\rangle$ of $J^2$ and $J_3$ with eigenvalues $\alpha$ and $\mu$ respectively:
$$J^2\,|\alpha, \mu\rangle = \alpha\,|\alpha, \mu\rangle \tag{3.7.8}$$
$$J_3\,|\alpha, \mu\rangle = \mu\,|\alpha, \mu\rangle \tag{3.7.9}$$
In view of (3.7.7) it follows that $J_3$ also has $J_\pm|\alpha, \mu\rangle$ as eigenstates with eigenvalues $(\mu \pm 1)$. It is also natural to ask what the transformed state of $|\alpha, \mu\rangle$ is under the operators $J_+$ and $J_-$. Using the commutation relation (3.7.3) and the same eigenvalue $\alpha$, we have (see Hint to Exercise 2):
$$J_\pm\,|\alpha, \mu\rangle = K_\pm(\alpha, \mu)\,|\alpha, \mu \pm 1\rangle \tag{3.7.10}$$
where $K_\pm(\alpha, \mu)$ are constants dependent on $\alpha$ and $\mu$. Now for a fixed $\alpha$ the values of $\mu$ are bounded; more precisely, because of the equality $J^2 - J_3^2 = J_1^2 + J_2^2 \ge 0$, from (3.7.8) and (3.7.9) we have:
$$\alpha - \mu^2 \ge 0 \tag{3.7.11}$$
Let j denote the largest value that an eigenvalue $\mu$ can have; then, since $J_+|\alpha, \mu\rangle = \text{const.}\,|\alpha, \mu + 1\rangle$ for every $\mu$, it follows that
$$J_+\,|\alpha, j\rangle = 0 \tag{3.7.12}$$
This in turn implies:
23 See Equality (2.4.9) and Exercise (2.4.1) and also Example (6.4.3).
$$0 = J_-J_+\,|\alpha, j\rangle = (J^2 - J_3^2 - J_3)\,|\alpha, j\rangle = (\alpha - j^2 - j)\,|\alpha, j\rangle \tag{3.7.13}$$
(where we have used (3.7.5) and (3.7.6)), leading to the solution:
$$\alpha = j(j+1) \tag{3.7.14}$$
Similarly (since $\mu$ is bounded), setting $j'$ as the smallest value that $\mu$ can take, we have
$$J_-\,|\alpha, j'\rangle = 0 \tag{3.7.15}$$
which gives
$$\alpha = j'(j'-1) \tag{3.7.16}$$
Equating the two values of $\alpha$, we get two solutions for $j'$: either $j' = -j$ or $j' = j + 1$. The latter being inadmissible, we finally conclude that if j is the largest value of $\mu$ then $-j$ is its smallest value. Also, since $J_-$ lowers the value of $\mu$ by one unit (see (3.7.10)), $j - j' = 2j$ must be a non-negative integer, which means that j is either an integer or a half integer. The states $|\alpha, \mu\rangle$ with $\mu = j, j-1, \ldots, -j$ form the basis of an irreducible representation of SU(2). The dimension of the representation is $(2j + 1)$. The representation matrices can be obtained by using (3.7.9) and (3.7.10) (see for example the Hint to Exercise 3).
7.2
The Group SU(3) and its Irreducible Representations
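In Sec. (2.4) we have already seen that elements $U \in SU(3)$, which are unitary matrices ($UU^\dagger = U^\dagger U = I$) with $\det U = 1$, can be expressed as (see Eq. (2.4.7)):
$$U(\epsilon_1, \ldots, \epsilon_8) = \exp(i\,\epsilon_r g_r), \qquad r = 1, 2, \ldots, 8 \tag{3.7.17}$$
where $\epsilon_r$ are the group parameters and $g_r$ are the group generators represented by (3 × 3) traceless Hermitian matrices; for our present purpose we choose these as the $\lambda$-matrices of Gell-Mann, which are given as follows:
$$\lambda_1 = \begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&0 \end{pmatrix},\quad
\lambda_2 = \begin{pmatrix} 0&-i&0\\ i&0&0\\ 0&0&0 \end{pmatrix},\quad
\lambda_3 = \begin{pmatrix} 1&0&0\\ 0&-1&0\\ 0&0&0 \end{pmatrix},$$
$$\lambda_4 = \begin{pmatrix} 0&0&1\\ 0&0&0\\ 1&0&0 \end{pmatrix},\quad
\lambda_5 = \begin{pmatrix} 0&0&-i\\ 0&0&0\\ i&0&0 \end{pmatrix},\quad
\lambda_6 = \begin{pmatrix} 0&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}, \tag{3.7.18}$$
$$\lambda_7 = \begin{pmatrix} 0&0&0\\ 0&0&-i\\ 0&i&0 \end{pmatrix},\quad
\lambda_8 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&-2 \end{pmatrix}.$$
Note that the $\lambda$-matrices have the commutation property
$$\left[\frac{\lambda_a}{2}, \frac{\lambda_b}{2}\right] = i f_{abc}\,\frac{\lambda_c}{2} \tag{3.7.19}$$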
where $f_{abc}$ is totally antisymmetric and has the following non-zero values:
$$f_{123} = 1,\quad f_{147} = \tfrac12,\quad f_{156} = -\tfrac12,\quad f_{246} = \tfrac12,\quad f_{257} = \tfrac12,\quad f_{345} = \tfrac12,\quad f_{367} = -\tfrac12,\quad f_{458} = \tfrac{\sqrt3}{2},\quad f_{678} = \tfrac{\sqrt3}{2}. \tag{3.7.20}$$
It can also be checked that they satisfy the normalization property:
$$\mathrm{Tr}\,(\lambda_a \lambda_b) = 2\,\delta_{ab} \tag{3.7.21}$$
From (3.7.19) it follows that the group generators $g_r$ satisfy the Lie algebra relation
$$[g_a, g_b] = i f_{abc}\, g_c \tag{3.7.22}$$
and also, since SU(3) is a rank 2 group (Sec. 2.4), there are just two Hermitian matrices which are diagonal. These are $\lambda_3$ and $\lambda_8$; this leads to the fact that the corresponding generators $g_3$ and $g_8$ satisfy:
$$[g_3, g_8] = 0 \tag{3.7.23}$$
The raising and lowering operators in terms of the remaining $g_r$'s are^24:
$$T_\pm = g_1 \pm i g_2, \qquad U_\pm = g_6 \pm i g_7, \qquad V_\pm = g_4 \pm i g_5 \tag{3.7.24}$$
The generators $g_3$ and $g_8$ in this set up are denoted as:
$$T_3 = g_3, \qquad Y = \frac{2}{\sqrt3}\, g_8 \tag{3.7.25}$$
One of the Casimir operators* for SU(3) is
$$g^2 = \sum_i g_i\, g_i .$$
The corresponding commutation relations using the operators of (3.7.24) and (3.7.25) are given by:
$$[T_3, T_\pm] = \pm T_\pm, \qquad [T_3, U_\pm] = \mp\tfrac12\, U_\pm, \qquad [T_3, V_\pm] = \pm\tfrac12\, V_\pm$$
$$[Y, T_\pm] = 0, \qquad [Y, U_\pm] = \pm U_\pm, \qquad [Y, V_\pm] = \pm V_\pm \tag{3.7.26}$$
$$[T_+, T_-] = 2T_3, \qquad [U_+, U_-] = \tfrac32\, Y - T_3 \equiv 2U_3, \qquad [V_+, V_-] = \tfrac32\, Y + T_3 \equiv 2V_3 \tag{3.7.27}$$
$$[T_+, V_+] = [T_+, U_-] = [U_+, V_+] = 0$$
$$[T_+, V_-] = -U_-, \qquad [T_+, U_+] = V_+, \qquad [U_+, V_-] = T_-, \qquad [T_3, Y] = 0. \tag{3.7.28}$$
Note that these commutation relations can be easily verified using the values of $f_{abc}$ given in (3.7.20).
24 The use of letters T, U, V, Y is an accepted usage in physics literature.
* Since SU(3) is a rank-2 group, there are two independent Casimir operators. The other one is cubic in the $g_i$'s.
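The relations (3.7.20) and (3.7.26)-(3.7.28) can also be confirmed by direct matrix arithmetic. The sketch below assumes $g_r = \lambda_r/2$ with the Gell-Mann matrices of (3.7.18), extracts $f_{abc}$ from the trace formula $f_{abc} = \mathrm{Tr}([\lambda_a, \lambda_b]\lambda_c)/4i$ (which follows from (3.7.19) and (3.7.21)), and builds the ladder operators of (3.7.24). Helper names are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(*terms):
    """Linear combination of matrices given as (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    return combine((1, matmul(a, b)), (-1, matmul(b, a)))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

S3 = 1 / math.sqrt(3)
LAM = {
    1: [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    2: [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    3: [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    4: [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    5: [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    6: [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    7: [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    8: [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
}
G = {a: combine((0.5, m)) for a, m in LAM.items()}   # g_r = lambda_r / 2

def f(a, b, c):
    """Structure constant f_abc = Tr([lambda_a, lambda_b] lambda_c) / (4i)."""
    return (trace(matmul(comm(LAM[a], LAM[b]), LAM[c])) / 4j).real

T_PLUS, T_MINUS = combine((1, G[1]), (1j, G[2])), combine((1, G[1]), (-1j, G[2]))
U_PLUS, U_MINUS = combine((1, G[6]), (1j, G[7])), combine((1, G[6]), (-1j, G[7]))
V_PLUS, V_MINUS = combine((1, G[4]), (1j, G[5])), combine((1, G[4]), (-1j, G[5]))
T3 = G[3]
Y = combine((2 / math.sqrt(3), G[8]))

if __name__ == "__main__":
    print(f(1, 4, 7), f(1, 5, 6), f(4, 5, 8))
    print(close(comm(T_PLUS, U_PLUS), V_PLUS))
```

In this representation $T_+$, $U_+$ and $V_+$ are simply the matrix units $e_{12}$, $e_{23}$ and $e_{13}$, which makes each bracket easy to confirm by hand as well.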
Now, since SU(3) is of rank 2 (equivalently $T_3$ and Y can be diagonalized simultaneously), the states in its irreducible representation must be labeled by two eigenvalues: $t_3$ and y. A representation can then be considered as a two-dimensional figure in the $t_3$-y plane. The effect of the raising and lowering operators on the states formed by $t_3$ and y can be enumerated from (3.7.26)-(3.7.28) to obtain the graphic representation of SU(3). For instance (using the first bracket in (3.7.26)) $T_+$ raises $t_3$ by 1 unit and $T_-$ lowers $t_3$ by 1 unit, whereas both $T_+$ and $T_-$ leave y unchanged (bracket no. iv). Similarly the operator $U_+$ lowers^25 $t_3$ by 1/2 unit and raises y by 1 unit, and the operator $V_+$ raises $t_3$ by 1/2 unit and y by 1 unit, etc. From the above it is evident that once an appropriate scale is selected for $t_3$ and y, the raising and lowering operators connect points along lines (in the $t_3$-y plane) whose inclinations are multiples of 60° with each other (see Fig. (3.1)).
[Fig. 3.1: Graphic representation of raising and lowering operators of SU(3) on the $t_3$-y plane. Axes: $t_3$ (horizontal), y (vertical); arrows labeled $T_\pm$, $U_\pm$, $V_\pm$ at 60° to one another.]
Each irreducible representation of SU(3) is characterized by a set of two integers (n, m), and graphically it forms a hexagonal boundary such that three sides of the hexagon are equal in length to n units and the other three to m units. The hexagon collapses to an equilateral triangle if either n or m is zero.
[Fig. 3.2: Boundaries of the SU(3) representations (n, m), (n, 0) and (0, m).]
Finally, while an irreducible representation of SU(2) is characterized by one integer or half integer j, whose graphic representation is a straight line of length 2j with 2j + 1 sites, each of them being occupied singly by one state, an irreducible representation of SU(3) has a more complex graphic representation. As can be expected, the variations in both arguments of the pair (n, m) lead to the multiplicity of states on each site in the $t_3$-y plane. In simple words they form the following pattern: the sites on the boundary are singly occupied, on the second layer they are doubly occupied and on the third they are triply occupied; the process continues until a triangular layer is reached, after which the multiplicity ceases to increase. This stage is reached when the multiplicity becomes m + 1 for n > m (or n + 1 for m > n).
25 Note that calling $U_+$ a raising operator is, in this case, a misnomer.
The sum of the multiplicity of states at each site is the dimension of the representation. The formula for this dimension is:
$$d(n, m) = \tfrac12\,(n+1)(m+1)(n+m+2) \tag{3.7.29}$$
Sometimes an irreducible representation is labeled not by the pair (n, m) but by its dimension. For instance, a d-dimensional irreducible representation is labeled by d and its complex conjugate by d*. In conclusion to this section, we give below four simple cases of graphic representations of SU(3) which are explained in Exercise 4. In all these examples all sites (shown by x) are singly occupied except the centre of 8 (see Section (4.2) of 9.[6] or Section (5.3) of 6.[16] for more details).
[Fig. 3.3: Examples of SU(3) representations with states labeled by $(t_3, y)$. Panels: (n, m) = (1, 0), 3 (triplet); (n, m) = (0, 1), 3* (triplet); (n, m) = (1, 1), 8 (octet); (n, m) = (3, 0), 10 (decuplet).]
Exercise 3.7
1. Verify the Lie bracket relations (3.7.6) and (3.7.7).
2. Establish (3.7.10) and determine the constants $K_\pm(\alpha, \mu)$ involved there.
3. Find the representation matrices of the irreducible representation of SU(2) for j = 1/2 and j = 1.
4. Give mathematical explanations for the diagrams of Fig. 3.3, which represent the sites of states for different choices of (n, m).
Hints to Exercise 3.7
1. Written out in full, (3.7.6) becomes $[(J_1 + iJ_2), (J_1 - iJ_2)] = 2J_3$. Using the additive property of the Lie bracket and the fact that $[J_a, J_a] = 0$ for a = 1, 2 we have on the LHS:
$$[J_1, -iJ_2] + [iJ_2, J_1] = 2i\,[J_2, J_1] = -2i\,[J_1, J_2] = -2i\,(i\,\epsilon_{123}J_3) = 2J_3.$$
Equality (3.7.7) can be verified in a similar manner.
2. We use (3.7.3) to write the equality:
$$0 = (J^2J_3 - J_3J^2)(J_\pm|\alpha, \mu\rangle) = J^2J_3(J_\pm|\alpha, \mu\rangle) - J_3(J_\pm J^2|\alpha, \mu\rangle) = J^2(\mu \pm 1)J_\pm|\alpha, \mu\rangle - \alpha J_3J_\pm|\alpha, \mu\rangle = (\mu \pm 1)\,J^2(J_\pm|\alpha, \mu\rangle) - \alpha(\mu \pm 1)\,J_\pm|\alpha, \mu\rangle \tag{i}$$
To determine the constants $K_\pm(\alpha, \mu)$ in (3.7.10) we use the relation:
$$\langle\alpha, \mu|J_-J_+|\alpha, \mu\rangle = |K_+(\alpha, \mu)|^2. \tag{ii}$$
Since $J_-$ is the Hermitian conjugate of $J_+$, it follows that
$$\langle\alpha, \mu|J_- = K_+^*(\alpha, \mu)\,\langle\alpha, \mu+1|. \tag{iii}$$
Also from (3.7.13) the above equality becomes:
$$\langle\alpha, \mu|J_-J_+|\alpha, \mu\rangle = \langle\alpha, \mu|J^2 - J_3^2 - J_3|\alpha, \mu\rangle = j(j+1) - \mu^2 - \mu. \tag{iv}$$
Hence we obtain
$$K_+(\alpha, \mu) = \left(j(j+1) - \mu^2 - \mu\right)^{1/2} = \left[(j-\mu)(j+\mu+1)\right]^{1/2} \tag{v}$$
where $\mu$ varies and j stands for $\mu_{\max}$. Using similar steps, we have
$$K_-(\alpha, \mu) = \left[(j+\mu)(j-\mu+1)\right]^{1/2} \tag{vi}$$
Note that since j in the above values of $K_\pm(\alpha, \mu)$ stands for the largest value of $\mu$, which is an integer or a half integer, $\mu$ takes only those values which are admissible with this value of j.
3. When j = 1/2 the only admissible values for $\mu$ are 1/2 and 1/2 − 1 = −1/2, hence the eigenstates in question are $|1/2, 1/2\rangle$ and $|1/2, -1/2\rangle$. If these eigenstates are denoted in matrix form by column matrices, i.e.:
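$$\left|\tfrac12, \tfrac12\right\rangle = \begin{pmatrix}1\\0\end{pmatrix} \quad\text{and}\quad \left|\tfrac12, -\tfrac12\right\rangle = \begin{pmatrix}0\\1\end{pmatrix} \tag{i}$$
then using
$$J_3\left|\tfrac12, \tfrac12\right\rangle = \tfrac12\left|\tfrac12, \tfrac12\right\rangle \quad\text{and}\quad J_3\left|\tfrac12, -\tfrac12\right\rangle = -\tfrac12\left|\tfrac12, -\tfrac12\right\rangle$$
we have the matrix for $J_3$ given by:
$$J_3 = \frac12\begin{pmatrix}1&0\\0&-1\end{pmatrix} \tag{ii}$$
From
$$J_+\left|\tfrac12, \tfrac12\right\rangle = 0 \quad\text{and}\quad J_+\left|\tfrac12, -\tfrac12\right\rangle = \left[\left(\tfrac12+\tfrac12\right)\left(\tfrac12-\tfrac12+1\right)\right]^{1/2}\left|\tfrac12, \tfrac12\right\rangle = \left|\tfrac12, \tfrac12\right\rangle$$
we have
$$J_+ = \begin{pmatrix}0&1\\0&0\end{pmatrix} = J_1 + iJ_2 \tag{iii}$$
(Note that we have used Eq. (v) of Exercise 2 in writing $J_+\left|\tfrac12, -\tfrac12\right\rangle$.) Similarly
$$J_- = \begin{pmatrix}0&0\\1&0\end{pmatrix} = J_1 - iJ_2 \tag{iv}$$
These give the matrices for $J_1$ and $J_2$ as:
$$J_1 = \frac12\begin{pmatrix}0&1\\1&0\end{pmatrix}, \qquad J_2 = \frac12\begin{pmatrix}0&-i\\i&0\end{pmatrix} \tag{v}$$
The matrix representation for the irreducible representation of SU(2) is therefore given by:
$$\frac12\begin{pmatrix}1&0\\0&-1\end{pmatrix}, \qquad \frac12\begin{pmatrix}0&1\\1&0\end{pmatrix}, \qquad \frac12\begin{pmatrix}0&-i\\i&0\end{pmatrix}.$$
Evidently it involves the Pauli matrices and thus confirms the identification $J_a = \sigma_a/2$ mentioned in Section 7. For the choice j = 1, we have $\mu = 1, 0, -1$; the eigenstates in this case (with their matrix representation) are:
$$|1, 1\rangle = \begin{pmatrix}1\\0\\0\end{pmatrix}, \qquad |1, 0\rangle = \begin{pmatrix}0\\1\\0\end{pmatrix}, \qquad |1, -1\rangle = \begin{pmatrix}0\\0\\1\end{pmatrix} \tag{vi}$$
The eigenvalue of $J_3$ on these eigenstates is respectively given by $J_3|1, 1\rangle = 1\,|1, 1\rangle$, $J_3|1, 0\rangle = 0\,|1, 0\rangle$, $J_3|1, -1\rangle = -1\,|1, -1\rangle$, hence the matrix representation of $J_3$ is:
$$J_3 = \begin{pmatrix}1&0&0\\0&0&0\\0&0&-1\end{pmatrix} \tag{vii}$$
Using the formula $K_\pm(\alpha, \mu) = [(j \mp \mu)(j \pm \mu + 1)]^{1/2}$ for the pairs (1, 1), (1, 0), (1, −1) we have the values of the constants needed for $J_+$ and $J_-$; they are respectively $(0, \sqrt2, \sqrt2)$ and $(\sqrt2, \sqrt2, 0)$. Accordingly, from $J_+|1, 1\rangle = 0$, $J_+|1, 0\rangle = \sqrt2\,|1, 1\rangle$ and $J_+|1, -1\rangle = \sqrt2\,|1, 0\rangle$ we have:
$$J_+ = \begin{pmatrix}0&\sqrt2&0\\0&0&\sqrt2\\0&0&0\end{pmatrix} = J_1 + iJ_2 \tag{viii}$$
and from $J_-|1, 1\rangle = \sqrt2\,|1, 0\rangle$, $J_-|1, 0\rangle = \sqrt2\,|1, -1\rangle$ and $J_-|1, -1\rangle = 0$ we obtain:
$$J_- = \begin{pmatrix}0&0&0\\\sqrt2&0&0\\0&\sqrt2&0\end{pmatrix} = J_1 - iJ_2 \tag{ix}$$
The solutions of (viii) and (ix) yield:
$$J_1 = \frac{1}{\sqrt2}\begin{pmatrix}0&1&0\\1&0&1\\0&1&0\end{pmatrix}, \qquad J_2 = \frac{1}{\sqrt2}\begin{pmatrix}0&-i&0\\i&0&-i\\0&i&0\end{pmatrix}$$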
The triplet $(J_1, J_2, J_3)$ gives another irreducible representation of SU(2), and it is easy to check that the $J_a$'s satisfy the Lie algebra relations (3.7.2).
4. In order to solve this exercise we list a few more facts about the states and the sites involved in SU(3) representations. We recall that: $T_+$ raises $t_3$ by 1 unit and leaves y unchanged; $U_+$ lowers $t_3$ by 1/2 unit and raises y by 1 unit; $V_+$ raises $t_3$ by 1/2 unit and raises y by 1 unit. On the other hand, since $T_- = (T_+)^\dagger$ (adjoint of $T_+$) etc., the operators $T_-$, $V_-$ become lowering operators and $U_-$ becomes a raising operator on $t_3$. $T_-$ leaves y unchanged and $U_-$, $V_-$ lower y by 1 unit. This also implies that the sites are symmetrical with respect to the y-axis. We further note that since $T_+$, $U_-$ and $V_+$ all increase the value of $T_3$, for SU(3) representations there must exist a maximally stretched state $\phi_{\max}$ that satisfies:
$$T_+\,\phi_{\max} = U_-\,\phi_{\max} = V_+\,\phi_{\max} = 0. \tag{i}$$
For a given pair (n, m) the width of the widest portion of the hexagon is n + m, thus the state $\phi_{\max}$ gives:
$$T_3(\phi_{\max}) = \frac{n+m}{2}, \qquad Y(\phi_{\max}) = \frac{n-m}{3}. \tag{ii}$$
We now consider the first diagram, for which n = 1, m = 0. From (ii), $T_3(\phi_{\max}) = t_3 = \tfrac12$ and $Y(\phi_{\max}) = y = \tfrac13$; thus the coordinate representation of one of the sites (with one state) is $\left(\tfrac12, \tfrac13\right)$. Due to symmetry the other site is $\left(-\tfrac12, \tfrac13\right)$, given by $T_-\left|\tfrac12, \tfrac13\right\rangle = \left|-\tfrac12, \tfrac13\right\rangle$. Now none of the raising operators on $T_3$ can be used in view of (i). Hence the only operator that gives an admissible result is:
$$V_-\left|\tfrac12, \tfrac13\right\rangle = \left|0, -\tfrac23\right\rangle. \tag{iii}$$
This gives the third site $\left(0, -\tfrac23\right)$. (Note that since m = 0, three sides of the hexagon have reduced to zero.) Using the formula (3.7.29) we have d = 3. Diagram (2) is the conjugate of (1): $T_3(\phi_{\max}) = \tfrac12$ and $Y(\phi_{\max}) = -\tfrac13$. This explains the inverted triangle. The dimension is $d^* = 3$, which is denoted as 3* or $\bar 3$. In the case of diagram (3), (n, m) = (1, 1) gives:
$$T_3(\phi_{\max}) \equiv t_3 = 1, \qquad Y(\phi_{\max}) \equiv y = 0. \tag{iv}$$
The site of the state $|1, 0\rangle$ is given by the point (1, 0) in the $t_3$-y plane. Using the same arguments as above we use the operators $T_-$, $V_-$ and $U_+$ to obtain:
$$T_-|1, 0\rangle = |0, 0\rangle, \quad V_-|1, 0\rangle = \left|\tfrac12, -1\right\rangle, \quad U_+|1, 0\rangle = \left|\tfrac12, 1\right\rangle, \quad T_-V_-|1, 0\rangle = \left|-\tfrac12, -1\right\rangle,$$
$$T_-T_-|1, 0\rangle = |-1, 0\rangle, \quad T_-U_+|1, 0\rangle = \left|-\tfrac12, 1\right\rangle, \quad V_-U_+|1, 0\rangle = |0, 0\rangle, \tag{v}$$
which shows that all sites except the centre of the hexagon are occupied by single states, while the centre is doubly occupied. Obviously the dimension equals 8. Diagram (4) is left as an exercise for the reader.
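The sites found in hint 4 can be reproduced with exact rational arithmetic. The sketch below assumes the defining 3 × 3 representation in which $T_3 = \mathrm{diag}(\tfrac12, -\tfrac12, 0)$ and $Y = \mathrm{diag}(\tfrac13, \tfrac13, -\tfrac23)$ (as follows from (3.7.25) with the Gell-Mann matrices), reads off the triplet sites from the diagonal entries, and obtains the octet sites as pairwise differences of triplet sites with one copy of the origin removed. That last rule uses the standard decomposition of 3 ⊗ 3* into 8 ⊕ 1, which is an outside fact not derived in the text; all names are illustrative.

```python
from collections import Counter
from fractions import Fraction as F

# Diagonal entries of T3 and Y in the defining representation (assumed basis).
T3_DIAG = [F(1, 2), F(-1, 2), F(0)]
Y_DIAG = [F(1, 3), F(1, 3), F(-2, 3)]

def triplet_sites():
    """Sites (t3, y) of the representation (1, 0)."""
    return list(zip(T3_DIAG, Y_DIAG))

def antitriplet_sites():
    """Sites of the conjugate representation (0, 1): all signs reversed."""
    return [(-t3, -y) for t3, y in triplet_sites()]

def octet_sites():
    """Sites of (1, 1) with multiplicities, via 3 x 3* = 8 + 1."""
    sites = Counter((a[0] - b[0], a[1] - b[1])
                    for a in triplet_sites() for b in triplet_sites())
    sites[(F(0), F(0))] -= 1   # remove the singlet at the origin
    return sites

if __name__ == "__main__":
    print(triplet_sites())
    for site, mult in sorted(octet_sites().items()):
        print(site, mult)
```

The printed octet has six singly occupied boundary sites and a doubly occupied centre, in agreement with (v) above.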
References
1. W. Arveson, Ten Lectures on Operator Algebras, A.M.S. No. 55 (1984).
2. H. Bercovici, C. Foias and C. Pearcy, Dual Algebras with Applications to Invariant Subspaces and Dilation Theory, A.M.S. No. 56 (1985).
3. M. S. Birman and M. Z. Solomjak, Spectral Theory of Self-Adjoint Operators in Hilbert Space (D. Reidel Publishing Company, 1987).
4. L. de Broglie, Heisenberg's Uncertainties and the Probabilistic Interpretation of Wave Mechanics, with Critical Notes of the Author (Boston: Kluwer Academic Publishers, 1990).
5. H. R. Dowson, Spectral Theory of Linear Operators (New York: Academic Press, 1978).
6. N. Dunford and J. Schwartz, Linear Operators (Vols. 1, 2 and 3, New York: Interscience Publishers, 1958).
7. C. Foias, C. Pearcy and B. Sz.-Nagy, (a) The Functional Model of a Contraction and the Space L^1, Acta Sci. Math. (Szeged) 42 (1980) 201-204; (b) Contractions with Spectral Radius One and Invariant Subspace, Acta Sci. Math. (Szeged) 43 (1981) 273-280.
8. H. F. Hameka, Quantum Mechanics (New York: Wiley, 1981).
9. S. Helgason, Topics in Harmonic Analysis on Homogeneous Spaces (Boston: Birkhauser, 1981).
10. M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra (New York: Academic Press, 1974).
11. T. F. Jordan, Linear Operators for Quantum Mechanics (New York: Wiley, 1969).
12. F. Riesz and B. Sz.-Nagy, Functional Analysis (New York: F. Ungar Publishing Company, 1955).
13. W. Rudin, Real and Complex Analysis (New York: McGraw-Hill, 1966).
14. J. L. Soule, Linear Operators in Hilbert Space (Gordon and Breach Science Publishers, 1968).
15. M. H. Stone, Linear Transformations in Hilbert Space (Am. Math. Soc. New York, 1932).
16. B. Sz.-Nagy and C. Foias, Harmonic Analysis of Operators on Hilbert Space (Amsterdam: North-Holland, 1970).
17. J. Weidmann, Linear Operators in Hilbert Spaces (New York: Springer-Verlag, 1980).
CHAPTER 4
BASICS OF ALGEBRAS AND RELATED CONCEPTS
1
SOME DEFINITIONS AND EXAMPLES
The topic of algebra has always been important to both physicists and mathematicians. The diversity of algebras has increased manifold in recent years, and their links to applications in physics are becoming more and more apparent. Some of these algebras (known quite well in the literature) are the following: associative algebras, Lie algebras, Jordan algebras, Cartan algebras, Heisenberg algebras, Chevalley algebras, von Neumann algebras, and, of more recent origin, current algebras, Hopf algebras, affine Kac-Moody algebras, Virasoro algebras and vertex operator algebras.^1 The latter three algebras are all infinite dimensional (see Chapter 5) and have been responsible for a better understanding of string/superstring theory, apart from the fact that they opened new channels of interpretation for some of the classical concepts, e.g., the Rogers-Ramanujan identities, Dirac's magnetic monopole, and soliton solutions of the KdV equation (see Preface and Chapter 7 in [7]). One could very well say that no study in mathematical physics could be complete without the knowledge of these algebras and Lie algebras. Due to our limited scope, we give the definitions and examples (together with some exercises and their hints) of only those objects which are relevant to our main theme: the applications in quantum, Yang-Mills and string theory. To begin with we give a few definitions.
1.1
Associative, Jordan and Lie Algebras
Definition 4.1.1: An algebra A is a linear vector space over a field F of characteristic 0 or a prime number p on which a distributive binary operation '∘' that commutes with scalars can be defined as:
$$a \circ (\lambda b) = \lambda\,(a \circ b) = (\lambda a) \circ b, \qquad a, b \in A,\ \lambda \in F \tag{4.1.1}$$
It is called an associative algebra if for all triples $a, b, c \in A$ the associativity rule:
$$(a \circ b) \circ c = a \circ (b \circ c) \tag{4.1.2}$$
is valid. A linear subspace S of A is called a subalgebra of A if it is closed under the operation '∘'.
1 We study here mainly Lie algebras due to their importance in applications. The study of Hopf algebras is postponed to Chapter 9, since in current terminology they are also called quantum groups (see Appendix D in Chapter 9).
Definition 4.1.2: An algebra J over F characterized by the following properties is called a Jordan algebra:
$$a \circ b = b \circ a \tag{4.1.3}$$
$$a^2 \circ (b \circ a) - (a^2 \circ b) \circ a = 0 \tag{4.1.4}$$
The above properties are also written as: [a, b] = 0 and $[a^2, b, a] = 0$ (see [5] for details).
Definition 4.1.3: An algebra L is called a Lie algebra if the commutator formed by all pairs of elements of the algebra is bilinear and anti-symmetric; thus, denoting the commutator (called the Lie bracket) as [a, b], we have:
$$[a, b] = -[b, a] \tag{4.1.5(a)}$$
and
$$[a + \lambda a', b + \mu b'] = [a, b] + \lambda\,[a', b] + \mu\,[a, b'] + \lambda\mu\,[a', b'] \tag{4.1.5(b)}$$
When the bracket is realized as a commutator $[a, b] = a \circ b - b \circ a$ in an associative algebra, it can be easily verified that it satisfies the Jacobi identity:
$$[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0 \tag{4.1.6}$$
For an abstract bracket, (4.1.6) does not follow from (4.1.5) alone and is imposed as part of the definition. It is to be noted that while a Jordan algebra is commutative (though in general not associative), a Lie algebra is anti-commutative and not associative; the associativity is replaced by the Jacobi identity in this case. Before we give examples to illustrate the above definitions, we shall define one more object, the lattice, since, as we shall soon see, the Lie algebras (with which we shall be most concerned) can very often be constructed via Lie groups and lattices. See for instance Example (4.1.7) and the Hints to Exercise 8 of this section and Exercise 3 of Sec. (5.1).
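The identities (4.1.4) and (4.1.6) are easy to test on matrices. In the sketch below the Lie bracket is taken to be the commutator ab − ba and the Jordan product the symmetrized product ½(ab + ba); both are standard realizations inside an associative matrix algebra and are assumptions of this illustration, as are the helper names and sample matrices. The associator of the symmetrized product is also returned so that one can see whether it vanishes for a given triple.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(*terms):
    """Linear combination of matrices given as (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def lie(a, b):
    """Commutator bracket [a, b] = ab - ba."""
    return combine((1, matmul(a, b)), (-1, matmul(b, a)))

def jordan(a, b):
    """Symmetrized product a o b = (ab + ba) / 2."""
    half = Fraction(1, 2)
    return combine((half, matmul(a, b)), (half, matmul(b, a)))

def jacobi_defect(a, b, c):
    """[a,[b,c]] + [b,[c,a]] + [c,[a,b]]; zero when (4.1.6) holds."""
    return combine((1, lie(a, lie(b, c))), (1, lie(b, lie(c, a))),
                   (1, lie(c, lie(a, b))))

def jordan_defect(a, b):
    """a^2 o (b o a) - (a^2 o b) o a; zero when (4.1.4) holds."""
    a2 = jordan(a, a)
    return combine((1, jordan(a2, jordan(b, a))), (-1, jordan(jordan(a2, b), a)))

def jordan_associator(a, b, c):
    """(a o b) o c - a o (b o c) for the symmetrized product."""
    return combine((1, jordan(jordan(a, b), c)), (-1, jordan(a, jordan(b, c))))

def is_zero(m):
    return all(x == 0 for row in m for x in row)

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
```

With these sample matrices, `is_zero(jacobi_defect(A, B, C))` and `is_zero(jordan_defect(A, B))` can be evaluated directly, and `jordan_associator(A, B, C)` shows how the symmetrized product behaves with respect to associativity.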
1.2
Lattices
Definition 4.1.4: Let V be a real N-dimensional vector space with an inner product (not necessarily non-singular) denoted $x \cdot y$ for $x, y \in V$, and let $\{e_i\}$, i = 1, ..., N, denote the basis vectors of V; then the set of points of V of the form:
$$\sum_{i=1}^{N} n_i e_i, \qquad n_i \in \mathbb{Z} \tag{4.1.7}$$
is called a lattice (denoted $\Lambda$) in V. If V is a Euclidean space $\mathbb{R}^N$ or a Minkowskian space $\mathbb{R}^{N-1,1}$, the lattice $\Lambda$ is called a Euclidean or Lorentzian lattice respectively. If $|\det(e_i \cdot e_j)| = 1$ the lattice is called unimodular. A unimodular lattice implies that $\Lambda$ contains just one point of V per unit volume. In other words, if a lattice $\Lambda$ is unimodular, each unit volume of V can contain only one point of $\Lambda$.
Definition 4.1.5: The dual $\Lambda^*$ (not always a lattice) of any lattice $\Lambda$ is the set of points $y \in V$ for which $x \cdot y$ is an integer for all $x \in \Lambda$. Only if the inner product is non-singular and $\Lambda$ spans V is the set $\Lambda^*$ a lattice, called the dual lattice of $\Lambda$. The lattice $\Lambda$ is called an integral lattice if $x \cdot y$ is an integer for every $x, y \in \Lambda$. It can be verified that in this case $\Lambda \subset \Lambda^*$. If $\Lambda$ is both unimodular and integral then it is self-dual, i.e., $\Lambda = \Lambda^*$ (see Goddard and Olive in 5.[6]).
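The conditions in Definitions 4.1.4 and 4.1.5 reduce to statements about the Gram matrix $(e_i \cdot e_j)$. The sketch below computes it for a given basis and (possibly indefinite) diagonal metric, then tests unimodularity and integrality with exact rational arithmetic. The sample lattices and all function names are illustrative choices made here, not examples taken from the text.

```python
from fractions import Fraction

def gram_matrix(basis, signature):
    """Gram matrix e_i . e_j for a diagonal metric given by its signature."""
    return [[sum(Fraction(s) * Fraction(u) * Fraction(v)
                 for s, u, v in zip(signature, e_i, e_j))
             for e_j in basis] for e_i in basis]

def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det

def is_unimodular(basis, signature):
    return abs(determinant(gram_matrix(basis, signature))) == 1

def is_integral(basis, signature):
    # Integrality on the basis vectors extends to all integer combinations.
    return all(x.denominator == 1
               for row in gram_matrix(basis, signature) for x in row)

def is_self_dual(basis, signature):
    """Unimodular and integral, as in the last sentence of Definition 4.1.5."""
    return is_unimodular(basis, signature) and is_integral(basis, signature)

SQUARE = [(1, 0), (0, 1)]                 # Z^2 with Euclidean metric
STRETCHED = [(2, 0), (0, 1)]              # integral, but two units of volume per point
SKEWED = [(Fraction(1, 2), 0), (0, 2)]    # unimodular, but not integral
```

Passing the signature `(1, -1)` instead of `(1, 1)` gives the corresponding Lorentzian lattices.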
Basics of Algebras and Related Concepts 155
1.3
Examples of Algebras, and the *- and C*-Algebra
Returning to algebra and examples of algebra, we first note that in physics the abstract elements a, b, c belonging to an algebra are realized by specific quantities such as functions $\phi(a)$ on $T^*(M)$ (the cotangent bundle of a manifold M), say, or the operators on a Hilbert space $\mathcal{H}$. The product a ∘ b in the case of functions defined on $T^*(M)$ can then be expressed as:^2
$$\phi(a) \circ \psi(a) = \frac{\partial\phi}{\partial a^u}\,C^{uv}\,\frac{\partial\psi}{\partial a^v} \tag{4.1.8}$$
where $a \in T^*(M)$ and $C^{uv}$ is a tensor with u, v ranging over 1, 2, ..., dim M. The distributivity and commutativity with respect to scalars can easily be checked for the above product. Another simple example of an algebra is the so-called polynomial algebra. The set of all polynomials (in x):
$$f = a_0 + \sum_{i=1}^{n} a_i x^i \qquad (a_n \neq 0)$$
(4.1.9(a)) (4.1.9(b))
(0(1) = 1
(4.1.9(c))
then S\ equipped with a state co is called a *-algebra. Note that a is conjugate of observable a, I is the identity and X , A' are real numbers. With every *-algebra A one can associate another algebra known as C -algebra^L in the following manner: for each element a eJZ and a state co and/e A, define a probability measure fxm a on the real line such that (O(f(a)) = AV a if) = [fettle
a
(4.1.10)
The elements/and g e A satisfy the properties
||/+S||~=||/|U+IUL = |A|||/|U
(4.1.11(a)) (4.i.ii
| | / | | - = 0=>/=0
(4.1.11(c))
\\fg\\£
(4.1.1 Kd))
IU/|U
11/11 \\g II
H//Il = ll//Il = II/1H 2
'
(4.1-11(0)
Note that the definition of this product implies that the product is real valued or complex valued depending on the manifold M.
156
Mathematical Perspectives on Theoretical Physics
where
||/||-= sup|/(A )|.
It can be checked that distributivity and multiplication with scalars for (4.1.10) follows using the properties (4.1.11). These are incidentally the properties of absolute values of complex-valued functions defined on the real line R. The property (4.1. ll)(d) further ensures that multiplication is a continuous operation, i.e., fn^>f> 8n~> 8=> fn8n ~» f8 The above algebra is also referred to as Banach algebra (see [10]). Another example of an algebra is given by the collection of continuous functions with compact support on a locally compact group G (denoted CC(G)). The binary operation here is the convolution defined as: p * yr{x) = jG
dy
(4.1.12)
where dy is a Haar measure on G (see Exercise 1-5 for other examples of algebras and Def. (0.4.3) for Haar measure). We now give a few simple examples of Lie algebras.
1.4
Examples on Lie Algebra
Example 4.1.6: Every vector subspace L of an associative algebra closed under the operation [x, y] = xy − yx is a Lie algebra.
Example 4.1.7: The general linear algebra gl(V) formed by endomorphisms of a vector space V (denoted End V) is a Lie algebra. If V is the n-dimensional space $\mathbb{R}^n$ we write it as $gl_n$ or $gl(n, \mathbb{R})$. The standard basis in this case is $\{e_{ij} \mid 1 \le i, j \le n\}$, where $e_{ij}$ is the matrix with (i, j)-th entry 1 and zero elsewhere. Since $e_{ij}e_{kl} = \delta_{jk}e_{il}$, it can be verified that the Lie bracket satisfies:
$$[e_{ij}, e_{kl}] = \delta_{jk}\,e_{il} - \delta_{li}\,e_{kj}$$
which in turn leads to the Jacobi identity.
Example 4.1.8: The tangent space $T_e(G)$ at the identity element of every Lie group G is a vector space which forms a Lie algebra. More specifically, if G is an r-parameter group with generators $\{X_\mu\}$ ($\mu$ = 1, ..., r), then the commutators formed by the generators are linear combinations of the generators:
$$[X_\mu, X_\nu] = C^{\lambda}_{\mu\nu}\,X_\lambda \tag{4.1.13}$$
The algebra of generators is a Lie algebra, as is evident from (4.1.13). The constants $C^{\lambda}_{\mu\nu}$, known as structure constants, take values on the representation space of G. Since $[X_\mu, X_\nu] = -[X_\nu, X_\mu]$, these constants $C^{\lambda}_{\mu\nu}$ are anti-symmetric in $\mu\nu$. It can be verified that, in consequence of the Jacobi identity satisfied by the commutators, they also satisfy an identity:
$$\sum_{\delta=1}^{r}\left(C^{\delta}_{\mu\nu}C^{\sigma}_{\delta\lambda} + C^{\delta}_{\nu\lambda}C^{\sigma}_{\delta\mu} + C^{\delta}_{\lambda\mu}C^{\sigma}_{\delta\nu}\right) = 0 \tag{4.1.14}$$
(Note that this Lie algebra is finite dimensional since the basis vectors in this case are $\{X_\mu\}$, $\mu$ = 1, 2, ..., r.) We devote the next section to two special classes of algebras, the solvable Lie algebras and the semisimple Lie algebras, which will eventually help us in defining the infinite dimensional algebras.
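The bracket relation of Example 4.1.7 can be confirmed by brute force for small n. The following sketch builds the matrix units $e_{ij}$, checks $[e_{ij}, e_{kl}] = \delta_{jk}e_{il} - \delta_{li}e_{kj}$ for every index combination, and also checks the Jacobi identity on all triples of basis elements. Indices run from 0 in the code, and the function names are illustrative.

```python
from itertools import product

def unit(i, j, n):
    """Matrix unit e_ij: 1 in position (i, j), zero elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(*terms):
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    return combine((1, matmul(a, b)), (-1, matmul(b, a)))

def delta(a, b):
    return 1 if a == b else 0

def check_gl_brackets(n):
    """Verify [e_ij, e_kl] = delta_jk e_il - delta_li e_kj for all indices."""
    for i, j, k, l in product(range(n), repeat=4):
        lhs = bracket(unit(i, j, n), unit(k, l, n))
        rhs = combine((delta(j, k), unit(i, l, n)), (-delta(l, i), unit(k, j, n)))
        if lhs != rhs:
            return False
    return True

def check_jacobi_on_basis(n):
    """Verify the Jacobi identity on all triples of matrix units."""
    basis = [unit(i, j, n) for i, j in product(range(n), repeat=2)]
    zero = [[0] * n for _ in range(n)]
    for a, b, c in product(basis, repeat=3):
        total = combine((1, bracket(a, bracket(b, c))),
                        (1, bracket(b, bracket(c, a))),
                        (1, bracket(c, bracket(a, b))))
        if total != zero:
            return False
    return True
```

Since the bracket is bilinear, verifying the Jacobi identity on a basis is enough to establish it for all of $gl_n$.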
Exercise 4.1
1. Show that the direct sum $T(V) = F \oplus V \oplus (V \otimes V) \oplus \cdots$ of all the tensor powers of a vector space V over a field F is an infinite-dimensional algebra called the tensor algebra. Show also that it is an associative algebra.
2. Show that the quotient vector space E(V) = T(V)/A, where A is a two-sided ideal^3 generated by all elements of the form $x \otimes x$ ($x \in V$), is an associative algebra obtained by antisymmetrizing the tensors. This is known as the exterior algebra, and unlike T(V) it is finite-dimensional.
3. Let Vw", r e Z+(r
$C_+ = \bigoplus_{r\ \mathrm{even}} C_r, \qquad C_- = \bigoplus_{r\ \mathrm{odd}} C_r$
are linear subspaces of C, and $C_+$ is also a subalgebra of C. The Clifford algebras $C(V_{(1)})$ and $C(V_{(0)})$ are respectively called the Dirac algebra and the Pauli algebra. Show that the even subalgebra of the Dirac algebra is the Pauli algebra.
6. Show that every real Lie algebra L can be extended to a complex Lie algebra with the same structure constants.
7. Let the space-time representation of the angular momentum and momentum operators $M_{\mu\nu}$ and $P_\mu$ be given as:
(a) $M_{\mu\nu} = x_\mu \frac{\partial}{\partial x_\nu} - x_\nu \frac{\partial}{\partial x_\mu}$
(b) $P_\mu = \frac{\partial}{\partial x_\mu}$
then show that the Lie brackets for them satisfy:
(c) $[M_{mn}, M_{pq}] = g_{np} M_{mq} - g_{mp} M_{nq} - g_{nq} M_{mp} + g_{mq} M_{np}$
(d) $[M_{mn}, P_q] = g_{nq} P_m - g_{mq} P_n$
Footnote 3. A left ideal (right ideal) I of an algebra A is a subalgebra of A such that $i \in I \Rightarrow ai \in I$ ($ia \in I$) for all $a \in A$ (see Section 2).
Footnote 4. Note that $e_i$ stands collectively for the one-element set $e_1, e_2, e_3, ..., e_n$, while $e_i e_j$ stands for the two-element sets $e_1e_2, e_1e_3, ..., e_1e_n, e_2e_3, ..., e_2e_n, ...$ etc.
(e) $[P_m, P_q] = 0$
where $g_{\mu\nu}$ stands for the Minkowskian metric (- + + +). The algebra obtained above is called the Poincaré algebra. Show that if, on the other hand, $g_{\mu\nu}$ is replaced by the 5-dimensional metric $g_{AB}$ = (- + + + +), one obtains the de Sitter algebra resulting from the group O(1, 4) (see Sec. 7.3). The generalized angular momentum operators $M_{AB}$ in this case satisfy:
(f) $[M_{AB}, M_{CD}] = g_{BC} M_{AD} - g_{AC} M_{BD} - g_{BD} M_{AC} + g_{AD} M_{BC}$
whereas identifying $M_{\mu 5} = \Gamma_\mu$, it can be checked that
(g) $[\Gamma_\mu, \Gamma_\nu] = -M_{\mu\nu} = M_{\nu\mu}$.
8. Show that the vector space spanned by the generators of an r-parameter group forms a Lie algebra by choosing r = 2 and 3.
9. Show that the Lie algebra of the group of volume-preserving diffeomorphisms is the algebra of divergence-free vector fields.
Hints to Exercise 4.1
1. Let $\otimes^r V$ and $\otimes^s V$ denote the vector spaces of tensors of rank r and s respectively. The binary operation on T(V) which identifies the tensor product of elements of $\otimes^r V$ and $\otimes^s V$ with an element of $\otimes^{r+s} V$:
$(x_1 \otimes x_2 \otimes \cdots \otimes x_r) \otimes (y_1 \otimes y_2 \otimes \cdots \otimes y_s) = x_1 \otimes x_2 \otimes \cdots \otimes x_r \otimes y_1 \otimes y_2 \otimes \cdots \otimes y_s$
makes T(V) into an algebra, as Property (4.1.1) of Def. (4.1.1) is obvious with respect to scalar multiplication. The associativity is easy to check. Since the tensor product of any number of elements can be taken as many times as one likes, T(V) contains tensors of every rank and the algebra is infinite-dimensional.
2. The vector space A generated by the elements $x \otimes x$, $x \in V$, is a two-sided ideal of T(V), i.e., $A\,T(V) \subset A$ as well as $T(V)\,A \subset A$ (compare this property of a two-sided ideal with that of a normal subgroup). The multiplication operation induced by $\otimes$ in E(V) is denoted $\wedge$, and two cosets $t_1 + A$ and $t_2 + A$ for $t_1, t_2 \in T(V)$ satisfy:
(i) $(t_1 + A) \wedge (t_2 + A) = [(t_1 \otimes t_2) - (t_2 \otimes t_1)] + A$.
The vector space V is contained in E(V), since $x \in V$ can be identified with the coset $x + A \in E(V)$. V can then be used to give the exterior powers of V, e.g.,
(ii) $\Lambda^r(V) = V \wedge V \wedge \cdots \wedge V$ (r copies).
Each of these is a subspace of E(V), and if V is n-dimensional with basis $\{e_i\}$ (i = 1, ..., n), then $\Lambda^r(V)$ is $\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$-dimensional ($0 \le r \le n$) with basis vectors $e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_r}$ ($i_1 < i_2 < \cdots < i_r$). Thus
(iii) $E(V) = \bigoplus_{r=0}^{n} \Lambda^r(V)$
has finite dimension $2^n$.
3. The set $V^n_{(r)}$ (r = 0, 1, 2, ..., n) whose vectors satisfy (iv) can easily be seen to satisfy the property (4.1.1) required for an algebra. It is associative and distributive with respect to addition, since (denoting the operation by '$\circ$') we have for distributivity:
(i) $2\langle u + \alpha u', v + \lambda v' \rangle = (u + \alpha u') \circ (v + \lambda v') = (u + \alpha u')(v + \lambda v') + (v + \lambda v')(u + \alpha u')$
$= (uv + vu) + \alpha (u'v + vu') + \lambda (uv' + v'u) + \alpha\lambda (u'v' + v'u') = 2\langle u, v \rangle + 2\alpha \langle u', v \rangle + 2\lambda \langle u, v' \rangle + 2\alpha\lambda \langle u', v' \rangle$.
The associativity for addition can also be checked in a similar manner. The properties (i), (ii) and (iii) in view of (iv) lead to:
(ii) $e_i e_j + e_j e_i = \pm 2\delta_{ij}$
(iii) $\langle e_i, e_i \rangle = (e_i)^2 = \pm 1$
(iv) $e_i e_j = -e_j e_i$, $i \ne j$.
Since all combinations of products are to be considered we have:
The basis vectors for these products are naturally formed by $1, e_i, e_{i_1}e_{i_2}$ for $V^2_{(r)}$ and $1, e_i, e_i e_j, e_i e_j e_k$ for $V^3_{(r)}$ (see Ftn. 4). When n = 1, the basis of $V_{(0)}$ consists of only one vector, say e; then in view of (iii), $e^2 = -1$. Hence an arbitrary element of $C(V_{(0)})$ is of the form $\alpha + \beta e = \alpha \pm i\beta$ where $\alpha, \beta \in R$. Thus $C(V_{(0)})$ is C.
4. A basis for $V^2_{(0)}$ is $(e_1, e_2)$ with $\langle e_i, e_j \rangle = -\delta_{ij}$, and elements of $C(V^2_{(0)})$ are of the form $a + be_1 + ce_2 + de_1e_2$ where $a, b, c, d \in R$. Write $i = e_1$, $j = e_2$, $k = e_1e_2$; then using the fact that r = 0 implies $e_1^2 = e_2^2 = (e_1e_2)^2 = -1$, we can write:
(i) $ij = -ji = k$, $jk = -kj = i$, $ki = -ik = j$ and $i^2 = j^2 = k^2 = -1$.
But these equalities set out in (i) are the defining conditions of the algebra of the quaternions with the basis (i, j, k), hence the result.
5. A basis for the even subalgebra of $C(V_{(1)})$ is given by:
$(1, e_1e_2, e_1e_3, e_1e_4, e_2e_3, e_2e_4, e_3e_4, e_1e_2e_3e_4)$.
Let a basis for $V_{(0)}$ be $(\bar{e}_1, \bar{e}_2, \bar{e}_3)$ which satisfies (i), (ii), (iii) of Exercise 3. The basis elements of $C(V_{(0)})$ will then be
$(1, \bar{e}_1, \bar{e}_2, \bar{e}_3, \bar{e}_1\bar{e}_2, \bar{e}_1\bar{e}_3, \bar{e}_2\bar{e}_3, \bar{e}_1\bar{e}_2\bar{e}_3)$.
Use the identification mapping:
$\bar{e}_1 \to e_1e_2, \quad \bar{e}_2 \to e_1e_3, \quad \bar{e}_3 \to e_1e_4$
and note that:
$\bar{e}_1\bar{e}_2 = e_1e_2e_1e_3 = e_1(-e_1e_2)e_3 = -e_2e_3 = (-1)e_2e_3$
$\bar{e}_1\bar{e}_3 = e_1e_2e_1e_4 = -e_2e_4 = (-1)e_2e_4$
$\bar{e}_2\bar{e}_3 = e_1e_3e_1e_4 = -e_3e_4 = (-1)e_3e_4$
which shows that the elements of $C(V_{(0)})$ (the Pauli algebra) are in one-one correspondence with the elements of $C_+$ (the even subalgebra) $\subset C(V_{(1)})$ (the Dirac algebra). In other words, there is an isomorphism between the Pauli algebra and the even part of the Dirac algebra.
6. The complex extension of a real Lie algebra L is denoted $C \otimes L$ and is called the complexification of L. To show that it has the same structure constants, we assume that L is finite-dimensional and thus has a basis $\{e_i\}$ (i = 1, 2, ..., n). The Lie product $[e_i, e_j] = C^{k}_{ij} e_k$ gives the structure constants $\{C^{k}_{ij}\}$. The elements of $C \otimes L$, when it is regarded as a complex vector space, are defined by: (i) $\lambda(\mu \otimes x) = (\lambda\mu) \otimes x$
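7. (i) $[M_{mn}, M_{pq}] = \left[ x_m \frac{\partial}{\partial x_n} - x_n \frac{\partial}{\partial x_m},\; x_p \frac{\partial}{\partial x_q} - x_q \frac{\partial}{\partial x_p} \right]$
$= \left( x_m \frac{\partial x_p}{\partial x_n} \frac{\partial}{\partial x_q} - x_q \frac{\partial x_n}{\partial x_p} \frac{\partial}{\partial x_m} \right) - \left( x_n \frac{\partial x_p}{\partial x_m} \frac{\partial}{\partial x_q} - x_q \frac{\partial x_m}{\partial x_p} \frac{\partial}{\partial x_n} \right) - \left( x_m \frac{\partial x_q}{\partial x_n} \frac{\partial}{\partial x_p} - x_p \frac{\partial x_n}{\partial x_q} \frac{\partial}{\partial x_m} \right) + \left( x_n \frac{\partial x_q}{\partial x_m} \frac{\partial}{\partial x_p} - x_p \frac{\partial x_m}{\partial x_q} \frac{\partial}{\partial x_n} \right)$
where the terms containing second-order derivatives cancel in pairs.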
Since the metric is Minkowskian we have:
(ii) $\frac{\partial x_p}{\partial x_n} = g_{np} = g_{pn} = \frac{\partial x_n}{\partial x_p}$
Hence the expression in the first parenthesis of (i) simplifies to
(iii) $g_{np} \left( x_m \frac{\partial}{\partial x_q} - x_q \frac{\partial}{\partial x_m} \right) = g_{np} M_{mq}$.
Using similar simplifications as in (ii), we obtain (c). To establish (f) we consider
(iv) $M_{AB} = x_A \frac{\partial}{\partial x_B} - x_B \frac{\partial}{\partial x_A}$
where $x_A$, $x_B$ are coordinates in a 5-dimensional space, and repeat the steps taken in (c). Finally, to show (g), we note that
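(v) $\Gamma_\mu = M_{\mu 5} = x_\mu \frac{\partial}{\partial x_5} - x_5 \frac{\partial}{\partial x_\mu}$
accordingly
$[\Gamma_\mu, \Gamma_\nu] = -x_\mu \frac{\partial}{\partial x_\nu} + x_\nu \frac{\partial}{\partial x_\mu} + \left\{ -g_{\mu\nu}\, x_5 \frac{\partial}{\partial x_5} + g_{\nu\mu}\, x_5 \frac{\partial}{\partial x_5} \right\} + \left( x_\mu \frac{\partial x_\nu}{\partial x_5} \frac{\partial}{\partial x_5} + x_5 \frac{\partial x_5}{\partial x_\mu} \frac{\partial}{\partial x_\nu} - x_\nu \frac{\partial x_\mu}{\partial x_5} \frac{\partial}{\partial x_5} - x_5 \frac{\partial x_5}{\partial x_\nu} \frac{\partial}{\partial x_\mu} \right)$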
The last four terms in the above expression given in ( ) are zero as $x_\nu$ and $x_5$ are independent coordinates, while the two terms in { } cancel each other; hence we are left with the first two terms, which equal $M_{\nu\mu}$.
2 SOLVABLE AND SEMI-SIMPLE LIE ALGEBRAS
In order to study the above classes of algebras, we have to define two basic objects that are required there—the Lie subalgebra and the ideal.
2.1 Lie Subalgebras, Ideals and Lattices
Definition 4.2.1: Let A and B be two subspaces of a given Lie algebra L, and let [A, B] denote the subspace spanned by all vectors of the type [a, b] (formed by the Lie product) for $a \in A$ and $b \in B$. A subspace S which is closed under Lie multiplication, i.e., $[S, S] \subset S$, is called a subalgebra of L and is denoted S.
Definition 4.2.2: Given a subalgebra S of L, if the Lie product [l, s] for $l \in L$ and $s \in S$ is a member of S, then S is called an ideal of L and is denoted I. Thus an ideal of L is characterized by the relation
$[L, I] \subset I$    (4.2.1)
It is easy to note that the concepts of subalgebra and ideal play the same role in Lie algebra theory as subgroups and normal subgroups play in Lie group theory. Accordingly, if h is a homomorphism between two Lie algebras $L_1$ and $L_2$, then it can be checked that the kernel of h is an ideal of $L_1$ and the image under h is a subalgebra of $L_2$.
The sum and intersection of ideals are again ideals, and the ideals of a Lie algebra form a lattice under these two operations. To avoid any confusion with the word lattice introduced in the previous section, we note that this usage refers to the following definition:
Definition 4.2.3: Given a vector space V, the set S of all subspaces of V equipped with the operations of intersection and union (of subspaces) is called a lattice of V.
2.2 Semi-simple and Simple Lie Algebras and their Levi Decomposition
Given a Lie algebra L, consider the Lie product [L, L] = L' (footnote 5). Evidently $L \supset L'$ and L' is an ideal of L. Thus with the Lie algebra L a series of ideals can be associated by repeating the process of taking the Lie product:
$[L', L'] = L'' \subset L' \subset L, \quad \ldots, \quad [L^{(r)}, L^{(r)}] = L^{(r+1)}$
where r is a positive integer counting the number of Lie products taken. The ideals obtained in this manner from L are said to form the derived series of ideals of L. More specifically, these ideals form a decreasing derived sequence.
Definition 4.2.4: A Lie algebra L is called a solvable Lie algebra if the series
$L \supset L' \supset L'' \supset \cdots \supset L^{(r)} \supset \cdots$
is eventually zero. Every abelian Lie algebra (i.e., an algebra L for which [L, L] = 0) is trivially solvable, since [L, L] = L' is zero. The simplest example of a solvable Lie algebra which is non-abelian is offered by the two-dimensional affine Lie algebra. In this case L is spanned by $\{e_1, e_2\}$ and [L, L] = L' is given by the Lie product $[e_1, e_2] = e_1$, so that L' is spanned by $e_1$ and hence L'' is zero. An ideal of a given Lie algebra may or may not be solvable. All ideals that are solvable form a sublattice (under the operations of intersection and union) of the lattice associated to L. Evidently the intersection as well as the sum of two solvable ideals is a solvable ideal.
Footnote 5. In Def. (4.2.18) we denote it as $\mathcal{D}L$ and thus indicate the difference between the two definitions.
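The derived series can be computed mechanically from structure constants. The sketch below is illustrative only: it assumes a finite-dimensional algebra given by the coordinates of $[e_i, e_j]$ in a chosen basis, uses exact rational arithmetic, and all function names are hypothetical. The three-dimensional example with cyclic brackets (the cross-product algebra) is included as a contrasting non-solvable case.

```python
from fractions import Fraction

def structure_constants(n, brackets):
    """Build c[i][j] = coordinates of [e_i, e_j] from the listed brackets,
    filling in [e_j, e_i] = -[e_i, e_j]; unlisted brackets are zero."""
    c = [[[Fraction(0)] * n for _ in range(n)] for _ in range(n)]
    for (i, j), vec in brackets.items():
        c[i][j] = [Fraction(a) for a in vec]
        c[j][i] = [-Fraction(a) for a in vec]
    return c

def bracket(c, x, y):
    """Coordinates of [x, y], extended bilinearly from the basis brackets."""
    n = len(c)
    out = [Fraction(0)] * n
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[k] += x[i] * y[j] * c[i][j][k]
    return out

def span_basis(vectors):
    """Gaussian elimination: a basis of the span of the given vectors."""
    basis = []  # pairs (pivot column, row normalised to 1 at the pivot)
    for vec in vectors:
        v = list(vec)
        for pivot, row in basis:
            if v[pivot]:
                factor = v[pivot]
                v = [a - factor * b for a, b in zip(v, row)]
        pivot = next((k for k, a in enumerate(v) if a), None)
        if pivot is not None:
            basis.append((pivot, [a / v[pivot] for a in v]))
    return [row for _, row in basis]

def derived_series_dims(c):
    """Dimensions of L, L', L'', ... until the series reaches 0 or stabilises."""
    n = len(c)
    current = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    dims = [n]
    while current:
        nxt = span_basis([bracket(c, x, y) for x in current for y in current])
        dims.append(len(nxt))
        if len(nxt) == len(current):
            break  # L^(r+1) = L^(r): the series never reaches zero
        current = nxt
    return dims

def is_solvable(c):
    return derived_series_dims(c)[-1] == 0

# Two-dimensional affine algebra: [e1, e2] = e1 (indices are 0-based here).
AFFINE = structure_constants(2, {(0, 1): [1, 0]})
# Cross-product algebra: [e1, e2] = e3, [e2, e3] = e1, [e3, e1] = e2.
CROSS = structure_constants(3, {(0, 1): [0, 0, 1], (1, 2): [1, 0, 0], (2, 0): [0, 1, 0]})
```

For the affine algebra the dimensions drop from 2 to 1 to 0, while for the cross-product algebra the series stabilises immediately at dimension 3.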
Definition 4.2.5: The sum of all solvable ideals of a Lie algebra L is its unique maximal solvable ideal. This is called the radical of the Lie algebra L and is denoted $\mathcal{R}$. The supplement of the radical in L is called the Levi subalgebra of L. Evidently the radical of a Lie algebra is 0 if it does not have any solvable ideals other than zero.
Definition 4.2.6: A Lie algebra is called a semi-simple Lie algebra if it has no non-trivial abelian ideals (note that every Lie algebra contains the trivial abelian ideal 0). In view of the definitions of solvable ideals and the radical, one concludes that a Lie algebra is semi-simple if and only if its radical is zero. A Lie algebra L that is not semi-simple can however be expressed as a direct sum (of vector spaces) of two subalgebras, one of which is its radical $\mathcal{R}$ and the other a semi-simple subalgebra S:
$L = \mathcal{R} \oplus S$    (4.2.2)
This is called the Levi decomposition of L. According to this decomposition every element of L can be written uniquely as a sum of an element in $\mathcal{R}$ and an element in S.
Definition 4.2.7: A Lie algebra is called a simple Lie algebra if it is non-abelian and has no proper ideals (note that the whole algebra L can be viewed as an ideal, but this is not a proper ideal). Every simple Lie algebra can be thought of as a semi-simple Lie algebra.
Example 4.2.8: The real Lie algebra $so(3, R) \cong L$ formed by a 3-dimensional real vector space $V_3$ equipped with the familiar cross product is a simple Lie algebra. To see this, consider a vector subspace S of $V_3$, which can be a line or a plane (passing through the origin); let x be a vector in S and y a vector in L. If S is to be an ideal then $x \times y$ must be in S for every such y, but the cross product of two vectors is always perpendicular to the plane formed by them, and y can always be chosen so that $x \times y$ lies outside S. Thus S is not an ideal of so(3, R). Also, for linearly independent $x, y \in so(3, R)$, $x \times y = -y \times x$ is non-zero, which shows that so(3, R) is non-abelian, and hence it is simple.
Remark 4.2.9: Furthermore, every semi-simple Lie algebra S can be written as a direct sum of simple ideals:
$S = I_1 \oplus I_2 \oplus \cdots \oplus I_k$    (4.2.3)
Thus a Lie algebra L can be expressed as a direct sum of its radical and simple ideals:
$L = \mathcal{R} \oplus I_1 \oplus I_2 \oplus \cdots \oplus I_k$    (4.2.4)
Before we close this section, we give a few more elementary definitions and the results that we shall use.
2.3 Lie Algebra of Derivations, Adjoint Mapping and Centralizer
Definition 4.2.10: Let A be an algebra over a field F. A mapping $D : A \to A$ is called a derivation of A if it is F-linear and satisfies the Leibnitz rule:
$D(ab) = (Da)b + a(Db)$ for $a, b \in A$    (4.2.6)
The kernel of a derivation D is a subalgebra of A. If $D_1$ and $D_2$ are two derivations of A, then it can easily be checked that $D_1D_2 - D_2D_1$ is a derivation. Obviously the above definition holds (in particular) for a Lie algebra L, and implies that the set of all derivations of L is a Lie algebra denoted $\mathcal{D}L$.
Definition 4.2.11: Let x be a fixed element of L. Then the linear mapping $y \to [x, y]$ of L into L is called the adjoint linear mapping on L, and is denoted $ad_L x$ or simply ad x. Based on this we have:
Result 4.2.12: Let L denote a Lie algebra and $\mathcal{D}L$ the Lie algebra formed by the derivations of L. For every $x \in L$, ad x is a derivation. The mapping $x \to ad\,x$ is a homomorphism of the Lie algebra L into $\mathcal{D}L$. Moreover, if $D \in \mathcal{D}L$ and $x \in L$, then
$[D, ad\,x] = ad(Dx)$    (4.2.7)
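Result 4.2.12 can be illustrated numerically in a matrix Lie algebra, where [x, y] = xy - yx. The following minimal sketch assumes square integer matrices stored as nested lists; the derivation D is taken to be an inner derivation ad w purely for illustration, and every name below is hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """Lie product of matrices: [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

def ad(x):
    """The adjoint mapping ad x : y -> [x, y]."""
    return lambda y: bracket(x, y)

def leibniz_holds(x, y, z):
    """(ad x)[y, z] == [(ad x) y, z] + [y, (ad x) z]."""
    lhs = ad(x)(bracket(y, z))
    rhs = add(bracket(ad(x)(y), z), bracket(y, ad(x)(z)))
    return lhs == rhs

def homomorphism_holds(x, y, z):
    """ad [x, y] applied to z equals [ad x, ad y] applied to z."""
    lhs = ad(bracket(x, y))(z)
    rhs = sub(ad(x)(ad(y)(z)), ad(y)(ad(x)(z)))
    return lhs == rhs

def commutator_with_ad_holds(w, x, y):
    """With D = ad w (an assumed inner derivation): [D, ad x] y == ad(D x) y."""
    D = ad(w)
    lhs = sub(D(ad(x)(y)), ad(x)(D(y)))
    rhs = ad(D(x))(y)
    return lhs == rhs

# Arbitrary sample elements of gl(2); any integer matrices would do.
X = [[1, 2], [0, -1]]
Y = [[0, 1], [3, 2]]
Z = [[2, 0], [1, 1]]
W = [[1, -1], [4, 0]]
```

All three checks are instances of the Jacobi identity, so they hold for any choice of sample matrices.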
A mapping ad x is also called an inner derivation of L (see Sec. 3 and the Hint to Exercise 1 for the proof of the above result).
Definition 4.2.13: Let L be a Lie algebra and X be a subset of L. The centralizer of X in L is the set of all those elements of L which commute with the elements of X. The centralizer of X is the intersection of the kernels of ad x as x runs through X. We denote it $X^c$. It can be checked that $X^c$ is a subalgebra of L.
Result 4.2.14: Let L be a Lie algebra and I an ideal of L. The centralizer $I^c$ of I in L is an ideal (see Exercise 2). The centralizer of L in L is called the centre of L. The centre of L is the kernel of the homomorphism $x \to ad\,x$.
Definition 4.2.15: Let $L_1$ and $L_2$ be two Lie algebras over F. An extension of $L_2$ by $L_1$ is a sequence:
$L_1 \xrightarrow{\ f_1\ } L \xrightarrow{\ f_2\ } L_2$    (4.2.8)
where L is a Lie algebra over F; $f_2$ is a surjective homomorphism of L onto $L_2$ and $f_1$ is an injective homomorphism of $L_1$ onto the kernel of $f_2$. The kernel K of $f_2$ is called the kernel of the extension. Evidently the homomorphism $f_1$ is an isomorphism of $L_1$ onto K, and $f_2$ can be viewed as an isomorphism of the quotient L/K onto $L_2$. By an abuse of language L is called an extension of $L_2$ by $L_1$.
2.4 Modules, Lower and Upper Central Series
According to Bourbaki a unitary module M over a field F is a set equipped with a binary operation $(x, y) \to xy$ of $M \times M$ into M that satisfies all axioms of an algebra except associativity (see [2]). A subset of M is a submodule if it is invariant under the binary operation. Using M, a Lie algebra L can be formed; conversely L (or a subset of L) with respect to the binary operation on L can be viewed as a module (or a submodule, provided the subset is closed with respect to the binary operation). In Sec. 3 we shall return to modules while studying the representation theory of Lie algebras.
Definition 4.2.16: An ideal of L is a submodule of L which is stable under inner derivations of L.
Definition 4.2.17: A submodule of L which is stable under every derivation of L is called a characteristic ideal of L.
Definition 4.2.18: The characteristic ideal [L, L] is called the derived ideal of a Lie algebra L and is denoted $\mathcal{D}L$ (note that we have denoted this as L' also). Every submodule of L containing $\mathcal{D}L$ is an ideal of L. The derived series of L is the decreasing sequence
Finally we have the upper central series of a Lie algebra L defined as follows:
Definition 4.2.20: The increasing sequence $C_0L, C_1L, C_2L, ...$ of characteristic ideals defined as $C_0L = \{0\}$, $C_{p+1}L$ = inverse image of the centre of $L/C_pL$ under the canonical mapping of L onto $L/C_pL$, is called the upper central series of L. Note that the ideal $C_1L$ is the centre of L (footnote 6). We shall revert to these series in the next section where we use them to classify Lie algebras.
Exercise 4.2
1. Show that for any $x \in L$, ad x is a derivation, and that $x \to ad\,x$ is a homomorphism of L into $\mathcal{D}L$. Show further that for $x \in L$ and $D \in \mathcal{D}L$ the Lie bracket [D, ad x] = ad(Dx).
2. Show that the centralizer $I^c$ of an ideal I in L is an ideal of L.
3. Let L be a Lie algebra, I an ideal (respectively a characteristic ideal) and J a characteristic ideal of I. Then show that J is an ideal (respectively a characteristic ideal) of L.
4. Show that if L is an n-dimensional Lie algebra over a field F and the centre of L is of dimension $\ge n - 1$, then L is commutative.
5. Let $L_1$ and $L_2$ be two Lie algebras over the same field F and let $\phi$ be a homomorphism from $L_1$ onto $L_2$. Then show that
Footnote 6. Let $\phi : L \to L/C_pL$ be the canonical mapping, and let $z_p$ denote the centre of $L/C_pL$; then $C_{p+1}L = \phi^{-1}(z_p)$. Since $z_0$ is the centre of $L/C_0L = L$, we have $C_1L$ = centre of L. (See also App. 9 of Vol. 2 of 1.[10].)
Hints to Exercise 4.2
1. To show that for any $x \in L$, ad x is a derivation, we must show that for $y, z \in L$, ad x (yz) satisfies (4.2.6). Now the multiplication rule in L is $(y, z) \to [y, z]$; accordingly we should show that
(i) $(ad\,x)([y, z]) = [(ad\,x)y, z] + [y, (ad\,x)z]$.
When written out in full, this is the Jacobi identity:
(ii) $[x, [y, z]] = [[x, y], z] + [y, [x, z]]$.
Linearity of ad x also follows, since the Lie bracket is bilinear in both arguments; hence ad x is a derivation. Let $\phi$ denote the mapping $x \to ad\,x$; to see that it is a homomorphism we must show that
(iii) $\phi[x, y] = \phi(x)\phi(y) - \phi(y)\phi(x)$.
Note that $\phi[x, y] = ad[x, y]$, hence writing
(iv) $(ad[x, y]) \cdot z = ad\,x \cdot ((ad\,y) \cdot z) - ad\,y \cdot ((ad\,x) \cdot z)$
we obtain:
(v) $[[x, y], z] = [x, [y, z]] - [y, [x, z]]$
which is again the Jacobi identity given in (ii). Replacing ad x and ad y by $\phi(x)$ and $\phi(y)$ in (iv), we have (iii). Finally, to show that [D, ad x] = ad(Dx), we note that [D, ad x] is a derivation on L, therefore for $y \in L$ we have:
(vi) $[D, ad\,x] \cdot y = (D\,ad\,x - ad\,x\,D) \cdot y = D((ad\,x) \cdot y) - ad\,x \cdot (Dy) = D([x, y]) - [x, Dy]$.
Since D is a derivation on L, the first term equals [Dx, y] + [x, Dy]; hence (vi) is simply $[Dx, y] = ad(Dx) \cdot y$. Thus the bracket [D, ad x] = ad(Dx). Note that if we were to write ad y in place of D, we would simply re-assert the homomorphism shown in (iii).
2. By definition an inner derivation of L is also an inner derivation of the ideal I (I is stable under inner derivations). We prove the result using this fact on stability. Let D be an inner derivation of L and let $x \in I^c$ and $y \in I$; then we have (Leibnitz rule): D([x, y]) = [Dx, y] + [x, Dy]. From the above observation on stability $Dy \in I$, hence the term on the LHS and the second term on the RHS are zero since $x \in I^c$. This leads to:
$[Dx, y] = 0$
which means that $Dx \in I^c$, or that $I^c$ is stable under D; thus it is an ideal of L.
3. In view of Def. (4.2.16) and (4.2.17) we have that every inner derivation (respectively every derivation) of L leaves the ideal I stable and induces on I a derivation. This induced derivation leaves the characteristic ideal J stable, meaning thereby that J is an ideal (respectively a characteristic ideal) of L.
5. Let M and M' be any two submodules of $L_1$; then $\phi([M, M']) = [\phi(M), \phi(M')]$. In particular $\phi([L_1, L_1]) = [\phi(L_1),$
$D([x, y]) \in [I, I] =$
3 REPRESENTATIONS OF LIE ALGEBRAS, MODULES OVER LIE ALGEBRAS
We are already familiar with the concept of representation from Chapter 2, where we studied it for groups. We also know that information on group representations (in particular of Lie groups) is very often acquired through representations of their Lie algebras. To define a representation of a Lie algebra L over a field, we need a vector space V over which we form a Lie algebra L(V) of linear operators, and a homomorphism $\phi$ from L to L(V) that satisfies the usual properties of a homomorphism between two compatible algebraic objects. Using these ingredients we have the definition as follows:
3.1 Representations of a Lie Algebra
Definition 4.3.1: A representation of a Lie algebra L is the pair $(V, \phi)$, where $\phi$ is a homomorphism on L such that for $x \in L$, $\phi(x)$ is a linear operator on V which satisfies:
$\phi(c_1x_1 + c_2x_2) = c_1\phi(x_1) + c_2\phi(x_2)$    (4.3.1(a))
for $x_1, x_2$ in L and for $c_1, c_2 \in F$, and
$\phi([x_1, x_2]) = \phi(x_1)\phi(x_2) - \phi(x_2)\phi(x_1)$    (4.3.1(b))
An element v e V under the linear operator
(4.3.3)
We also know that the operator ad x is similar to a first-order differential operator since it satisfies the Leibnitz rule (see (4.2.6)):
$(ad\,x)[y, z] = [(ad\,x)y, z] + [y, (ad\,x)z]$    (4.3.4)
From the previous section it is easy to note that (4.3.4) can be viewed as another way of writing the Jacobi identity. The mapping ad which takes x to the operator ad x is a linear mapping from L into the space of linear operators on L. To see that it is a representation we have to show that ad preserves the Lie products, i.e.,
$ad\,[x, y] = [ad\,x, ad\,y]$    (4.3.5)
Now
$[ad\,x, ad\,y]z = (ad\,x)(ad\,y)z - (ad\,y)(ad\,x)z$
$= [x, [y, z]] - [y, [x, z]]$
$= [x, [y, z]] + [y, [z, x]]$
$= -[z, [x, y]] = [[x, y], z] = (ad\,[x, y])z$    (4.3.6)
where the last line uses the Jacobi identity. Thus ad is indeed a homomorphism and (L, ad) is a representation of the Lie algebra L known as the adjoint representation of L. The mapping ad helps us to define a symmetric bilinear form on L called the Killing form by the relation:
$(x, y) = \mathrm{Tr}_L\,(ad\,x)(ad\,y)$    (4.3.7)
Remark 4.3.2: A Lie algebra L is semi-simple if and only if its Killing form is non-singular.
Remark 4.3.3: A Lie algebra L is solvable if and only if its Killing form vanishes on $L \times [L, L]$, i.e., (x, y) = 0 for all $x \in L$ and $y \in [L, L]$ (Cartan's criterion).
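The Killing form is straightforward to evaluate once the structure constants are known, since the matrix of ad $e_i$ has the coordinates of $[e_i, e_j]$ as its j-th column. The sketch below is illustrative: the basis ordering (e, h, f) for sl(2) with [h, e] = 2e, [h, f] = -2f, [e, f] = h, and the three-dimensional Heisenberg algebra with [x, y] = z, are assumed examples that do not appear in the text, and the function names are hypothetical.

```python
def structure_constants(n, brackets):
    """c[i][j] = integer coordinates of [e_i, e_j]; antisymmetry is filled in."""
    c = [[[0] * n for _ in range(n)] for _ in range(n)]
    for (i, j), vec in brackets.items():
        c[i][j] = list(vec)
        c[j][i] = [-a for a in vec]
    return c

def ad_matrix(c, i):
    """Matrix of ad e_i: column j holds the coordinates of [e_i, e_j]."""
    n = len(c)
    return [[c[i][j][k] for j in range(n)] for k in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def trace(m):
    return sum(m[k][k] for k in range(len(m)))

def killing_matrix(c):
    """Entries (e_i, e_j) = Tr(ad e_i ad e_j)."""
    n = len(c)
    ads = [ad_matrix(c, i) for i in range(n)]
    return [[trace(matmul(ads[i], ads[j])) for j in range(n)] for i in range(n)]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

# sl(2) in the basis (e, h, f) = indices (0, 1, 2).
SL2 = structure_constants(3, {(1, 0): [2, 0, 0], (1, 2): [0, 0, -2], (0, 2): [0, 1, 0]})
# Heisenberg algebra in the basis (x, y, z): [x, y] = z, z central.
HEISENBERG = structure_constants(3, {(0, 1): [0, 0, 1]})
```

For sl(2) the resulting matrix has non-zero determinant, in line with Remark 4.3.2, whereas for the Heisenberg algebra every entry vanishes.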
3.2 Representations via Modules over a Lie Algebra
We next define another algebraic object, a module over a Lie algebra, in order to give another way of describing a representation without introducing a homomorphism explicitly. We would like to note that although the following definition is that of a module over a Lie algebra, it can likewise be defined over a group or a ring.
Definition 4.3.4: A module over a Lie algebra L is a vector space M along with a bilinear mapping $L \times M \to M$ which carries $(x, m) \in L \times M$ to an element (vector) xm in M and satisfies:
$[x_1, x_2]m = x_1(x_2m) - x_2(x_1m)$    (4.3.8)
The product defined by the mapping is called the module product rule. In order to see how modules are used as representation spaces, we associate to every element x of L a linear operator f(x) on M which assigns to $m \in M$ the vector xm in M. The equality (4.3.8) therefore translates into:
$f[x_1, x_2] : m \to [x_1, x_2]m = x_1(x_2m) - x_2(x_1m)$    (4.3.9)
Comparing this with the definition of a representation, it follows that (M, f) is a representation of L. More succinctly (using 4.3.2), we can say that a representation of L on a module M is a linear mapping $\rho$ of L into the endomorphism module of M such that
$\rho([x_1, x_2])\,m = \rho(x_1)\rho(x_2)\,m - \rho(x_2)\rho(x_1)\,m$    (4.3.10)
We often use this simpler definition of representation in preference to Def. (4.3.1).
Remark 4.3.5: If one were to define a module over an associative algebra A, all that is needed is the replacement of (4.3.8) by the equality:
$(x_1x_2)m = x_1(x_2m)$    (4.3.11)
It is quite usual to obtain the representation (M, f) by using matrices to define the linear operator f(x) for $x \in L$. We choose $e_1, ..., e_n$ as the basis vectors in M and write:
$x e_j = \sum_{i=1}^{n} f_{ij}(x)\, e_i$
Evidently $[f_{ij}(x)]$ is the matrix (linear operator) with respect to x. This representation is called a matrix representation.
3.3 Nilpotent Lie Algebras
We next recall the definition of another type of derived series of ideals than the one used for solvable algebras in the previous section. This other derived series will lead us to the nilpotent and Cartan subalgebras needed for further structural theory. In the previous section we already defined this derived series of ideals (using different notation, see Def. (4.2.19)):
$L \supset L^2 = [L, L] \supset L^3 = [L^2, L] \supset \cdots \supset L^{i+1} = [L^i, L]$    (4.3.12)
and called it the lower central series of ideals. We now use it to define a nilpotent algebra.
Definition 4.3.6: A Lie algebra L is called nilpotent if in the above series $L^{k+1}$ is zero for some integer k. The integer k is called the order of nilpotency of L.
We list below a few facts about nilpotent Lie algebras.
Fact 4.3.7: We note that $L^{(r)} \subset L^{2^r}$, hence every nilpotent Lie algebra is solvable, but a solvable algebra is not necessarily nilpotent. For example the two-dimensional non-abelian Lie algebra defined by $[e_1, e_2] = e_1$ is solvable (as we have already seen in the previous section) but not nilpotent, since $L^k$ is spanned by $e_1$ for all k > 1.
Fact 4.3.8: A finite-dimensional Lie algebra L is solvable if and only if its derived algebra [L, L] is nilpotent.
Fact 4.3.9: A Lie algebra L is nilpotent if and only if the linear operator ad x for every x in L is a nilpotent operator, i.e., some power of ad x is zero for every $x \in L$.
Fact 4.3.10: A finite-dimensional Lie algebra L is nilpotent if and only if all its maximal subalgebras are ideals.
Fact 4.3.11: The centre of a non-zero nilpotent Lie algebra is non-zero.
Fact 4.3.12: The Killing form of a nilpotent Lie algebra is zero.
Fact 4.3.13: The subalgebras, the quotient algebras and the central extensions of nilpotent Lie algebras are nilpotent. A finite product of nilpotent Lie algebras is a nilpotent Lie algebra.
Since every nilpotent Lie algebra is solvable, the result given in the above fact holds good for solvable Lie algebras as well. In addition to the above facts, we wish to note that each of the following statements characterizes a nilpotent Lie algebra L:
(a) $C^rL = \{0\}$ for sufficiently large r;
(b) $C_rL = L$ for sufficiently large r;
(c) There exists an integer r such that for all elements $x_1, x_2, ..., x_r$ in L,
$ad\,x_1 \circ ad\,x_2 \circ \cdots \circ ad\,x_r = 0$    (4.3.13)
(d) There exists a decreasing sequence of ideals $(L_i)_{0 \le i \le n}$ with $L_0 = L$, $L_n = \{0\}$ such that $[L, L_i] \subset L_{i+1}$ and dim $L_i/L_{i+1} = 1$ for $0 \le i < n$.
Given below is an important result known as Engel's theorem (footnote 7).
Result 4.3.14: Let V be a vector space over a field F and let L be a finite-dimensional subalgebra of gl(V) whose elements are nilpotent endomorphisms of V. If $V \ne \{0\}$, there exists $v \ne 0$ in V such that $x \cdot v = 0$ for all $x \in L$. (See Sec. 3 of Chapter 1 in [4].)
Footnote 7. The result given in Fact (4.3.9) is also cited as Engel's theorem in the literature (see [5]).
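Condition (4.3.13) is multilinear in $x_1, ..., x_r$, so it suffices to test it on basis elements. The following sketch searches for the smallest such r; it assumes a finite-dimensional algebra given by integer structure constants and bounds the search by the dimension. The Heisenberg algebra ([x, y] = z) is an assumed example, and the helper names are hypothetical.

```python
def structure_constants(n, brackets):
    """c[i][j] = integer coordinates of [e_i, e_j]; antisymmetry is filled in."""
    c = [[[0] * n for _ in range(n)] for _ in range(n)]
    for (i, j), vec in brackets.items():
        c[i][j] = list(vec)
        c[j][i] = [-a for a in vec]
    return c

def ad_matrix(c, i):
    """Matrix of ad e_i: column j holds the coordinates of [e_i, e_j]."""
    n = len(c)
    return [[c[i][j][k] for j in range(n)] for k in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def is_zero(m):
    return all(v == 0 for row in m for v in row)

def nilpotency_order(c):
    """Smallest r with ad e_{i1} ... ad e_{ir} = 0 for all basis choices,
    or None if no r <= dim L works (the algebra is then not nilpotent)."""
    n = len(c)
    ads = [ad_matrix(c, i) for i in range(n)]
    identity = [[int(r == s) for s in range(n)] for r in range(n)]
    products = [identity]
    for r in range(1, n + 1):
        # All r-fold products of basis ad-operators.
        products = [matmul(a, p) for a in ads for p in products]
        if all(is_zero(p) for p in products):
            return r
    return None

# Heisenberg algebra in the basis (x, y, z): [x, y] = z, z central.
HEISENBERG = structure_constants(3, {(0, 1): [0, 0, 1]})
# Two-dimensional algebra of Fact 4.3.7: [e1, e2] = e1 (0-based indices here).
AFFINE = structure_constants(2, {(0, 1): [1, 0]})
```

The bound r <= dim L is enough because the lower central series of a nilpotent algebra of dimension n drops in dimension at each step and therefore reaches zero after at most n brackets.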
Definition 4.3.15: Let H be a nilpotent subalgebra of L. The subalgebra of L each of whose elements y satisfies
$(ad\,h)^n y = 0$    (4.3.14)
for every element h of H, for some integer n, is called the Fitting null component of L with respect to H. This subalgebra is denoted $L^0_H$. From (4.3.14) it is evident that $H \subset L^0_H$.
Definition 4.3.16: A nilpotent subalgebra H is called a Cartan subalgebra of L if $H = L^0_H$. Thus if $x \in L$ satisfies
$[h, [h, ... [h, x] ...]] = 0$    (4.3.15)
for every $h \in H$, it follows that x also lies in H. Every finite-dimensional Lie algebra L contains at least one Cartan subalgebra. Even if L contains more than one Cartan subalgebra, they all have one thing in common: their equal dimensionality. This dimension, denoted l, is called the rank of the Lie algebra L, and L is therefore often denoted as $A_l$.
Remark 4.3.17: The Cartan subalgebras of a semi-simple Lie algebra are maximal abelian subalgebras; the converse, however, does not always hold, i.e., a maximal abelian subalgebra of a semi-simple Lie algebra is not necessarily a Cartan subalgebra.
We next define two more important objects connected with modules and Cartan subalgebras, which help in the classification of Lie algebras (see Ref. [1], [4] and [9] for details).
3.4 Weight System and Roots of a Lie Algebra
Definition 4.3.18: Let $L^*$ denote the dual vector space of L (when L is considered as a vector space), and let M be a module over L. A linear form $\mu \in L^*$ is called a weight of M if there exists a non-zero vector m in M such that
$xm = \mu(x)m$    (4.3.16)
for all $x \in L$. Note that (in view of the previous discussions) x here acts as an operator on M, and as such $\mu(x)$ can be treated as an eigenvalue corresponding to the vector m of M. Thus the weight $\mu$ can be thought of as a collection of eigenvalues. The set of all weights of a module is called its weight system.
Remark 4.3.19: Any non-zero module over a solvable Lie algebra defined on the field of complex numbers always admits at least one weight (see Ref. [1], [5]).
In order to consider direct sum decompositions of modules over nilpotent Lie algebras, we further define the so-called simultaneous eigenvector.
Definition 4.3.20: A non-zero vector m in M is called a weight vector or a generalized simultaneous eigenvector if there exists an integer r such that
$(x - \mu(x)\mathbf{1})^r m = 0$    (4.3.17)
for all x in L. The set of all weight vectors corresponding to a weight $\mu$, together with the zero vector, forms a module, which is a submodule of M. It is denoted $M^{\mu}_L$ and is called the weight submodule for the weight $\mu$.
Now in a weight submodule each element $x \in L$ is represented by an operator which is the sum of a nilpotent operator and a multiple of the unit operator. Hence it follows that for a module over a nilpotent Lie algebra there always exists a basis consisting entirely of weight vectors that expresses M as a direct sum:
$M = \bigoplus_{\mu} M^{\mu}_L$    (4.3.18(a))
In particular, if L is a semi-simple complex Lie algebra and H is one of its Cartan subalgebras, then any module over L can be written as the direct sum of its weight submodules with respect to H, thus:
$M = \bigoplus_{\mu} M^{\mu}_H$    (4.3.18(b))
Consider now a Lie algebra L over the complex numbers and let H be a Cartan subalgebra. The dual space $H^*$ of H is the set of all complex-valued linear forms on H.
Definition 4.3.21: A linear form $\alpha \in H^*$ is called a root if there exists a non-zero element $x \in L$ such that
$[h, x] = \alpha(h)x$    (4.3.19)
for every h in H. If we compare (4.3.19) with (4.3.16) it becomes apparent that roots can be thought of as special cases of weights, in the sense that L can be regarded as a module over H via the adjoint representation, and therefore a direct sum decomposition of L (similar to (4.3.18)) can be obtained in terms of its root spaces $L^{\alpha}_H$, defined below. In order to obtain this decomposition, we assume that L is a semi-simple Lie algebra over the complex numbers. The bilinear (Killing) form restricted to H is therefore non-singular (see Remark (4.3.2)), and this means that for every $\alpha \in H^*$ there is a unique vector $h_\alpha \in H$ such that (footnote 8):
$(h_\alpha, h) = \alpha(h)$    (4.3.20(a))
for all $h \in H$, where ( , ) denotes the bilinear form. Thus $H^*$ can be identified with H. Further, for any non-zero root $\alpha$, there exists a vector (denoted $e_\alpha$) in L, unique up to a scalar multiple, such that for every $h \in H$
$[h, e_\alpha] = \alpha(h)e_\alpha$    (4.3.20(b))
The vector $e_\alpha$ is called a root vector, and since $[h, e_\alpha] = ad\,h(e_\alpha)$ it is a simultaneous eigenvector of the linear operators ad h acting on L. The 1-dimensional vector space spanned by $e_\alpha$ is the root space denoted $L^{\alpha}_H$. In particular the Cartan subalgebra H is the root space for $\alpha = 0$. Since every non-zero root $\beta \ne \alpha$ determines a distinct root space, we obtain the required decomposition:
$L = H \oplus \left( \bigoplus_{\alpha \ne 0} L^{\alpha}_H \right)$    (4.3.21)
From (4.3.21) it follows that the total number of non-zero roots of a semi-simple Lie algebra equals the difference: dimension of L minus the rank of L. We now list a few facts about the properties of roots and weights of semi-simple Lie algebras (see [1], [3]).
Fact 4.3.22: If $\alpha$ is a root then $-\alpha$ is also a root. But there are no other non-zero roots which are multiples of $\alpha$.
Footnote 8. $h_\alpha$ in H can be viewed as the image of $\alpha$ under the mapping that identifies $H^*$ with H. In Chapter 5 we shall use the terminology co-root for $h_\alpha$.
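As a concrete illustration of (4.3.20(b)), (4.3.21) and Fact 4.3.22, one can take sl(3) with the traceless diagonal matrices as Cartan subalgebra; this choice of example is an assumption made here and is not taken from the text. In that setting the off-diagonal matrix units $e_{ij}$ are root vectors with roots $\alpha_{ij}(h) = h_i - h_j$. The sketch below, with hypothetical helper names, checks these relations with integer matrices.

```python
N = 3  # sl(3): traceless 3x3 matrices, dimension N*N - 1, rank N - 1

def unit(i, j):
    """Matrix unit e_ij (0-based indices)."""
    return [[int((r, c) == (i, j)) for c in range(N)] for r in range(N)]

def diag(h):
    """Diagonal matrix with entries h[0], ..., h[N-1] (an element of H if traceless)."""
    return [[h[r] if r == c else 0 for c in range(N)] for r in range(N)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def root(i, j):
    """Coefficients of the linear form alpha_ij(h) = h_i - h_j."""
    return tuple(int(k == i) - int(k == j) for k in range(N))

def root_relation_holds(h):
    """Check [h, e_ij] = alpha_ij(h) e_ij for every off-diagonal unit e_ij."""
    H = diag(h)
    for i in range(N):
        for j in range(N):
            if i != j:
                alpha_h = sum(a * b for a, b in zip(root(i, j), h))
                expected = [[alpha_h * v for v in row] for row in unit(i, j)]
                if bracket(H, unit(i, j)) != expected:
                    return False
    return True

# The set of non-zero roots, one for each off-diagonal position.
ROOTS = {root(i, j) for i in range(N) for j in range(N) if i != j}
```

The number of distinct non-zero roots found this way equals dim L minus rank L, and the set is closed under negation.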
Fact 4.3.23: If $\alpha$ and $\beta$ are roots with corresponding root vectors $e_\alpha$ and $e_\beta$ and $\alpha + \beta \ne 0$, then using the orthogonality given by the Killing form it can be seen that $e_\alpha$ and $e_\beta$ are orthogonal.
Fact 4.3.24: Weights and roots have the additive property with respect to taking tensor products. Thus for instance if $M^{\mu}$ and $N^{\nu}$ are two weight submodules of the modules M and N over a nilpotent Lie algebra H, then the tensor product $M^{\mu} \otimes N^{\nu}$ is contained in the weight submodule $(M \otimes N)^{\mu + \nu}$ of the module $M \otimes N$.
We now show the close relationship between weights and roots, which is not so apparent in spite of the fact that the Cartan subalgebra H plays an important role (see Eqs. (4.3.18)(b) and (4.3.19)) in both cases. This relationship is best seen through the so-called weight ladder module obtained in the following manner. Given a weight module $M^{\mu}$, repeated actions of the elements $e_\alpha$ and $e_{-\alpha}$ of L generate the whole ladder of weight submodules that correspond to the entire weight ladder $\mu + z\alpha$ for $z = 0, \pm 1, \pm 2, ...$. However, since L is finite-dimensional, there are only a finite number of elements in the ladder, giving rise to only finitely many submodules in the weight submodule ladder. The direct sum of these weight submodules can be regarded as a module over the (ladder generating) Lie subalgebra $H + L^{\alpha}_H + L^{-\alpha}_H$. Due to our limited scope we shall not go into the details of this topic (we shall however use these concepts in later sections, see the Hint to Exercise 6 of the next section). Interested readers can find them in other texts (see, e.g., Ref. [1], [9]). Finally, before we close this section, we introduce the notion of ordering amongst the roots and weights of a Lie algebra and define a simple root, a highest root and a highest weight, which will eventually lead to the definitions of the Cartan matrix and the Weyl group.
3.5 Lexicographic Ordering, Simple and Highest Root, and Highest Weight
Let $H^*$ denote the dual of $H$, the nilpotent Cartan subalgebra of a given complex semi-simple Lie algebra $L$. Further, let $H^*_R$ denote an $l$-dimensional real subspace of $H^*$ whose elements are real linear combinations of roots. We have seen that $H^*$ can be identified with $H$, and therefore a metric such as the one given below can be defined:

$$(\alpha, \beta) = (h_\alpha, h_\beta) \qquad (4.3.22)$$

for $\alpha, \beta$ in $H^*$ and $h_\alpha, h_\beta$ in $H$. Restricting the above metric to $H^*_R$ implies that $(\alpha, \beta)$ is real; moreover, if $\alpha, \beta$ are non-zero, $(\alpha, \alpha)$ and $(\beta, \beta)$ are positive numbers. The Killing form on $H^*_R$ is thus positive definite, and it makes $H^*_R$ into a Euclidean space with the Killing form as the inner product. We choose an ordered (but arbitrary) basis $f_1, f_2, \ldots, f_l$ in $H^*_R$ and note that any $X \in H^*_R$ can be written as:

$$X = r_1 f_1 + \ldots + r_l f_l$$

where the $r_i$ $(i = 1, \ldots, l)$ are real. We now define the lexicographic ordering in $H^*_R$ by saying that a vector $X$ is higher than another vector $Y$ in $H^*_R$ if the first non-zero component of their difference $X - Y$ is greater than zero. This is notationally written as $X > Y$. Having selected a lexicographic ordering in $H^*_R$, we can now define a positive root and a simple root.

Definition 4.3.25: A root $\alpha$ is positive if $\alpha > 0$, and it is called a simple root if it is a positive root and is not a sum of two positive roots.*

* In view of (4.3.20)(b), corresponding to the $l$ simple roots $\alpha_i$ $(i = 1, \ldots, l)$ there are $l$ root vectors $e_{\alpha_i} = e_i$. One can also define $l$ other vectors as $f_i = 2e_{-\alpha_i}/\ldots$ The elements $e_i$ and $f_i$, $i = 1, 2, \ldots, l$, are called the simple raising and simple lowering elements of $A_l$. These elements together with all $h_i \in H$ generate $A_l$.
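The lexicographic ordering defined above can be sketched in a few lines of Python. This is only an illustration: the function names `is_higher` and `is_positive` are hypothetical, and the coordinate tuples stand for the components $(r_1, \ldots, r_l)$ with respect to whatever ordered basis $f_1, \ldots, f_l$ has been chosen.

```python
from typing import Sequence


def is_higher(x: Sequence[float], y: Sequence[float]) -> bool:
    """Return True if x > y in the lexicographic ordering.

    x and y are coordinate tuples (r_1, ..., r_l) taken with respect to the
    same ordered basis f_1, ..., f_l.
    """
    for xi, yi in zip(x, y):
        diff = xi - yi
        if diff != 0:
            return diff > 0   # the first non-zero component of x - y decides
    return False              # x == y, so neither vector is higher


def is_positive(x: Sequence[float]) -> bool:
    """A vector is positive if it is higher than the zero vector."""
    return is_higher(x, [0] * len(x))


# Hypothetical coordinates, chosen only for illustration.
print(is_higher((1, -5, 0), (1, -7, 3)))   # difference is (0, 2, -3)
print(is_positive((0, -1, 4)))             # first non-zero component is -1
```

Note that the ordering depends on the order of the basis vectors: permuting $f_1, \ldots, f_l$ generally changes which roots are positive.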
Basics of Algebras and Related Concepts 173
A semi-simple Lie algebra of rank $l$ carries a system of $l$ simple roots (see Section 4) with respect to a lexicographic ordering. If $\alpha$ and $\beta$ are roots, then the sequence of linear forms $\beta - p\alpha, \ldots, \beta - \alpha, \beta, \beta + \alpha, \ldots, \beta + q\alpha$ is called an $\alpha$-ladder through $\beta$ if every member of the sequence is a root and if $\beta - (p + 1)\alpha$ and $\beta + (q + 1)\alpha$ are not roots. The numbers $p$ and $q$ are non-negative integers, related as:

$$p - q = 2(\alpha, \beta)/(\alpha, \alpha) \qquad (4.3.23)$$
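Relation (4.3.23) can be checked mechanically on a small root system. The sketch below is an illustration under stated assumptions: it uses the rank-2 system of Example 4.4.16(d) later in this chapter, with roots written as integer coefficient pairs in the basis $(\alpha, \beta)$ and with Gram matrix entries $(\alpha, \alpha) = 1$, $(\beta, \beta) = 3$, $(\alpha, \beta) = -3/2$. The names `GRAM`, `ROOTS`, `inner` and `ladder` are hypothetical.

```python
from fractions import Fraction

# Assumed Gram matrix of the basis (alpha, beta) of Example 4.4.16(d).
GRAM = [[Fraction(1), Fraction(-3, 2)],
        [Fraction(-3, 2), Fraction(3)]]

# Roots as coefficient pairs (m1, m2), meaning m1*alpha + m2*beta.
POSITIVE = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)]
ROOTS = set(POSITIVE) | {(-a, -b) for a, b in POSITIVE}


def inner(u, v):
    """Inner product of two coefficient pairs via the Gram matrix."""
    return sum(GRAM[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


def ladder(alpha, beta):
    """Return (p, q) for the alpha-ladder beta - p*alpha, ..., beta + q*alpha."""
    def shift(k):
        return (beta[0] + k * alpha[0], beta[1] + k * alpha[1])

    p = 0
    while shift(-(p + 1)) in ROOTS:
        p += 1
    q = 0
    while shift(q + 1) in ROOTS:
        q += 1
    return p, q


# Check p - q = 2(alpha, beta)/(alpha, alpha) for every pair with beta != +-alpha.
for a in ROOTS:
    for b in ROOTS:
        if b == a or b == (-a[0], -a[1]):
            continue
        p, q = ladder(a, b)
        assert p - q == 2 * inner(a, b) / inner(a, a)

print(ladder((1, 0), (0, 1)))   # alpha-ladder through beta
```

For the two simple roots the ladder starts at $\beta$ itself, so $p = 0$ and $q$ is read off directly from the right-hand side of (4.3.23).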
If $\alpha$ and $\beta$ are simple roots, by Def. (4.3.25) their difference cannot be a root, hence in this case $p = 0$. We shall later use the RHS of Eq. (4.3.23) to determine the $\alpha$-ladder through $\beta$. In the next section we shall also see that this RHS, for the set of simple roots, defines the Cartan matrix. Using the induction method for constructing the ladders for simple roots, the highest root of the system can be determined (see Exercise (4.5)).

As for the determination of the highest weight in the weight system of any module which is an irreducible representation module of $L$, we note that in a semi-simple Lie algebra of rank $l$, just like roots, weights are also linear combinations of simple roots. They can be considered as a set of points lying in the Euclidean space $H^*_R$. Since $H^*_R$ is lexicographically ordered with respect to some basis, the weights also form a totally ordered set and hence the concept of highest weight is well defined. Moreover, since $L$ is finite-dimensional, there are only a finite number of different weights and one of these is highest, i.e., higher than the others. In the previous subsection we have already seen the ladder of weights constructed from a given weight $\mu$. Written out in full it is: $\mu - p\alpha, \ldots, \mu + q\alpha$. All these weights belong to the weight system. It is interesting to note that the real number $r = 2(\mu, \alpha)/(\alpha, \alpha)$, which is an integer, defines an element $\mu - r\alpha$ in the ladder. We shall see that the weight system of an irreducible module over $L$ can be obtained using the Dynkin diagram (Def. (4.4.14)).

We now state an important result (see Lemma 4.6.5 in [11(a)]) concerning the highest weight of a given representation $\rho$ of $L$ on $V$, and a consequential definition.

Result 4.3.26: Let $\rho$ be a representation of $L$ in a vector space $V$. Suppose that $v \in V$ is a non-zero vector such that$^{9}$
(i) $v \in V_\lambda$ for some $\lambda \in H^*$ ($V = \oplus V_\lambda$)
(ii) $\rho(X_i)v = 0$, $1 \le i \le l$, $X_i \in L$ (4.3.24)
(iii) $v$ is cyclic for $\rho$ (i.e., $V = \rho(L)v$).
Then $\rho$ is a representation with weights, and $\lambda$ is the highest weight of $\rho$.

Definition 4.3.27: A highest weight is called a basic weight if it is not the sum of two non-zero highest weights. The number of basic weights of a semi-simple Lie algebra equals its rank $l$. An irreducible module is said to be a basic module if its highest weight is one of the basic weights. Since the number of basic weights is the same as that of simple roots, one can set up a correspondence amongst them as:

$$\frac{2(\lambda_i, \alpha_j)}{(\alpha_j, \alpha_j)} = \delta_{ij} \qquad (4.3.25)$$

where $i$ and $j$ take the values 1 to $l$. Given the basic weights $\lambda_1, \lambda_2, \ldots, \lambda_l$, any other weight $\mu$ can be expressed as

$$\mu = \sum_{i=1}^{l} m_i \lambda_i$$

where each $m_i$ is an integer coefficient given by the relation:

$$m_i = 2(\mu, \alpha_i)/(\alpha_i, \alpha_i) \qquad (4.3.26)$$

9. $v$ is also referred to as an extreme vector. We recall that a vector $x$ in an $L$-module $M$ is an extreme vector if $e_i x = 0$ for $i = 1, \ldots, l$, where the $e_i$'s are the simple raising elements of $L = A_l$ (see [1]).
The numbers $m_i$, which can be zero, positive or negative, are called the Dynkin indices of the weight $\mu$. The Dynkin indices of the highest weight are always non-negative and they uniquely determine (up to isomorphisms) an irreducible module over a semi-simple Lie algebra. We shall return to Dynkin indices and the Dynkin diagram in the next section (see Exercise (4.4.6)).
Exercise 4.3
1. Let $L$ be a 3-dimensional non-commutative Lie algebra with centre $z$. Show that if $z = \mathcal{D}L$, then $L$ is nilpotent.
2. Prove Fact (4.3.7).
3. Prove Fact (4.3.8).
4. Prove Fact (4.3.9).
5. Prove Fact (4.3.12).
6. Show that the statements given in (4.3.13) for a Lie algebra $L$ to be nilpotent are equivalent.
7. Prove Fact (4.3.13).
8. Let $L$ be a nilpotent Lie algebra and $p$ (respectively $q$) be the smallest integer such that $C^pL = \{0\}$ (respectively $C_qL = L$). Show that $p = q + 1$ and that $C_iL \supset C^{p-i}L$.
9. Let $L$ be the 6-dimensional Lie algebra over a field $\mathcal{F}$ with basis elements $(e_1, e_2, \ldots, e_6)$ and multiplication table
$[e_1, e_2] = -[e_2, e_1] = e_4$, $[e_1, e_3] = -[e_3, e_1] = e_5$, $[e_2, e_3] = -[e_3, e_2] = e_6$,
with other brackets zero. Let $\beta$ be the bilinear form on $L$ such that $\beta(e_3, e_4) = \beta(e_4, e_3) = 1$, $\beta(e_1, e_6) = \beta(e_6, e_1) = 1$, $\beta(e_2, e_5) = \beta(e_5, e_2) = -1$, and it is zero for other ordered pairs. Then show that $\beta$ is invariant, $L$ is nilpotent, $z = \mathcal{D}L \neq \{0\}$ and $\beta$ is non-degenerate.
10. Show that the following multiplication tables define two nilpotent Lie algebras $L_3$ and $L_4$ of dimensions 3 and 4:
$L_3$: $[e_1, e_2] = e_3$, $[e_1, e_3] = [e_2, e_3] = 0$
$L_4$: $[e_1, e_2] = e_3$, $[e_1, e_3] = e_4$, $[e_1, e_4] = [e_2, e_3] = [e_2, e_4] = [e_3, e_4] = 0$.
11. Show that a Lie algebra $L$ over a field of characteristic 0 is solvable if and only if its Killing form vanishes on $L \times \mathcal{D}L$.
12. Prove Fact (4.3.23).
Hints to Exercise 4.3
1. From the definition $\mathcal{D}L = [L, L]$, we note that in this case the centre $z = [L, L] = L^2$. The algebra $L$ is 3-dimensional and non-commutative. Since $z = L^2$, it follows that $L^3 = [L^2, L] = [z, L]$, the next ideal in the series, is zero, showing that $L$ is nilpotent.
2. By induction $L^{r+1} \supset L^{(r+1)}$; thus if the LHS is zero, the RHS is also zero, hence from Def. (4.2.4) and (4.3.6) the result is obvious. (Note that in Def. (4.2.18) and (4.2.19) this is characterized as $C^{r+1}L \supset \mathcal{D}^rL$: the vanishing of $C^{r+1}L$ implies the vanishing of $\mathcal{D}^rL$.)
3. Denote $[L, L] = \mathcal{D}L$. The condition is necessary since when $L$ is a solvable algebra, the nilpotent radical of $L$ is $\mathcal{D}L$. To see the sufficiency we note that $L/\mathcal{D}L$ is commutative ... $C^{i+1}L$ and $L_{r-i}$ ... $x_1, \ldots, x_r \in L$.
But $C^rL$ is the set of linear combinations of elements of the form given above; the vanishing of one implies the vanishing of the other. Accordingly (a) and (c) are equivalent, leading to the equivalence of (a), (b) and (c). Finally, if there exists a sequence $(L_i)_{0 \le i \le r}$ of ideals with the defining properties for $L$ to be nilpotent, then it is easy to note that there exists a decreasing sequence $(V_i)_{0 \le i \le n}$ of vector subspaces of $L$ of dimensions $n, n - 1, n - 2, \ldots, 0$ and a sequence of indices $i_0 < i_1 < \ldots < i_r$ with $L_0 = V_{i_0}, L_1 = V_{i_1}, \ldots, L_r = V_{i_r}$; then as $[L, V_i] \subset V_i$, the $V_i$ are ideals and $[L, V_i] \subset V_{i+1}$ for all $i$. Hence the nilpotency of $L$ implies (d), and thus all of (a), (b), (c) and (d) are equivalent.
7. Let $S$ and $I$ be a subalgebra and an ideal of a nilpotent Lie algebra $L$. Then $L/I = Q$ is the quotient algebra, and
by treating them as subsets of $L$; thus, for instance, $C^rS \subset C^rL = \{0\}$. Finally, the assertion on products follows from (c) of the equivalence relations (4.3.13).
8. Using the equivalence of (a) and (b) in statement (4.3.13) it is obvious that if $p$ and $q$ are the smallest integers that satisfy $C^pL = \{0\}$ and $C_qL = L$ for the Lie algebra $L$, then $p$ must equal $q + 1$. Also note that in the Hint to Exercise 6 we have shown that in this case $L_i \supset C^{i+1}L$ and (replacing $r$ by $q$) $L_{q-i} \subset C_iL$. This leads to $C_iL \supset L_{q-i} \supset C^{q-i+1}L$, i.e., $C_iL \supset C^{p-i}L$.
9. Using the multiplication table we obtain that $\mathcal{D}L = [L, L] = (e_4, e_5, e_6)$. It is easy to see that this is also the centre of $L$, since every element of $L$ commutes with those of $\mathcal{D}L$. Hence $z = \mathcal{D}L \neq \{0\}$. This leads to $[L, \mathcal{D}L] = [L, z] = \{0\}$, showing that $L$ is nilpotent (we shall also verify it by computing the brackets for $\{e_i\}$). A form $\beta$ on $L$ is invariant if $\beta([x, y], z) = \beta(x, [y, z])$ for all $x, y, z \in L$ (see Def. (4.4.7)). In this case we have to check it for $\{e_i\}$, $i = 1, \ldots, 6$. Now $\beta([e_1, e_2], e_3)$ must be equal to $\beta(e_1, [e_2, e_3])$. Using the table for $\beta(e_i, e_j)$ and the Lie brackets, we note that the former is $\beta(e_4, e_3) = 1$ and the latter is $\beta(e_1, e_6) = 1$. Similarly $\beta([e_1, e_3], e_4)$ must be equal to $\beta(e_1, [e_3, e_4])$, i.e., $\beta(e_5, e_4)$ must equal $\beta(e_1, 0)$, and these are both zero. Likewise, filling in all other values of $i, j, k$ in $\beta(e_i, [e_j, e_k])$ for $i \neq j \neq k$, we have in all 20 pairs and they can all be seen to be equal. This shows that $\beta$ is invariant; the non-degeneracy of $\beta$ is obvious. To show directly that $L$ is nilpotent we compute the successive brackets. Now $[L, L] = (e_4, e_5, e_6) = \mathcal{D}L$, and since $e_4, e_5, e_6$ are central, $[L, \mathcal{D}L] = \{0\}$; in particular $\mathcal{D}^2L = [\mathcal{D}L, \mathcal{D}L] = ([e_4, e_5], [e_4, e_6], [e_5, e_6]) = (0, 0, 0)$. Thus $L$ is nilpotent.
11. We use Facts (4.3.7) and (4.3.12) (established in Exercises 2 and 5) to obtain the result.
12. In effect we have to show that if $\alpha + \beta \neq 0$ then the one-dimensional spaces $L^{\alpha}_H$ and $L^{\beta}_H$ spanned by $e_\alpha$ and $e_\beta$ are orthogonal. We use the defining equation (4.3.22), which gives $[h, e_\alpha] = \alpha(h)e_\alpha$, and the decomposition equality (4.3.23). Since $\mathrm{ad}\,h$ is a derivation on the whole of $L$, and in particular on $L^{\alpha}_H$ and $L^{\beta}_H$, we note that for $e_\alpha \in L^{\alpha}_H$ and $e_\beta \in L^{\beta}_H$ we can write:

$$\mathrm{ad}\,h\,[e_\alpha, e_\beta] = [(\mathrm{ad}\,h)e_\alpha, e_\beta] + [e_\alpha, (\mathrm{ad}\,h)e_\beta] = \alpha(h)[e_\alpha, e_\beta] + \beta(h)[e_\alpha, e_\beta] = (\alpha(h) + \beta(h))[e_\alpha, e_\beta].$$

This shows that $[e_\alpha, e_\beta]$ is an eigenvector belonging to $L^{\alpha+\beta}_H$. Thus $[L^{\alpha}_H, L^{\beta}_H] \subset L^{\alpha+\beta}_H$ holds good for any roots $\alpha$ and $\beta$. If $\alpha + \beta \neq 0$ and $\gamma$ is any other element of $H^*$, $\mathrm{ad}\,e_\alpha \circ \mathrm{ad}\,e_\beta$ will map $L^{\gamma}_H$ into $L^{\alpha+\beta+\gamma}_H$, which is different from $L^{\gamma}_H$. Hence $(e_\alpha, e_\beta) = \mathrm{Tr}(\mathrm{ad}\,e_\alpha \circ \mathrm{ad}\,e_\beta) = 0$, showing that $e_\alpha \perp e_\beta$.
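The checks asked for in Exercise 9 (invariance and non-degeneracy of the form, nilpotency of the algebra) are finite computations on basis elements, so they can also be confirmed by brute force. The following is a minimal sketch under the assumption that the multiplication table and the values of the bilinear form are exactly those stated in Exercise 9; the names `TABLE`, `BETA`, `bracket`, `beta` and `det` are hypothetical helpers, not notation from the text.

```python
from itertools import product

N = 6
# Non-zero brackets of Exercise 9 (1-based): [e_i, e_j] = e_k and [e_j, e_i] = -e_k.
TABLE = {(1, 2): 4, (1, 3): 5, (2, 3): 6}
# Non-zero values of the symmetric bilinear form on pairs of basis elements.
BETA = {(3, 4): 1, (1, 6): 1, (2, 5): -1}


def bracket(x, y):
    """Lie bracket of two coefficient vectors of length 6."""
    out = [0] * N
    for (i, j), k in TABLE.items():
        # the coefficient of e_k in [x, y] is x_i*y_j - x_j*y_i
        out[k - 1] += x[i - 1] * y[j - 1] - x[j - 1] * y[i - 1]
    return out


def beta(x, y):
    """Symmetric bilinear form extended linearly from the basis values."""
    return sum(v * (x[i - 1] * y[j - 1] + x[j - 1] * y[i - 1])
               for (i, j), v in BETA.items())


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)) if m[0][j])


E = [[int(i == k) for i in range(N)] for k in range(N)]   # basis e_1, ..., e_6

# Invariance: beta([x, y], z) = beta(x, [y, z]) on all triples of basis elements.
invariant = all(beta(bracket(x, y), z) == beta(x, bracket(y, z))
                for x, y, z in product(E, repeat=3))
# Nilpotency: every bracket of L with [L, L] vanishes.
derived = [bracket(x, y) for x, y in product(E, repeat=2)]
nilpotent = all(not any(bracket(x, d)) for x in E for d in derived)
# Non-degeneracy: the Gram matrix of beta has non-zero determinant.
gram = [[beta(x, y) for y in E] for x in E]

print(invariant, nilpotent, det(gram) != 0)   # True True True
```

Checking basis triples suffices because both sides of the invariance condition are trilinear.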
4 UNIVERSAL ENVELOPING ALGEBRA, WEYL GROUP AND CARTAN MATRIX
4.1 Universal Enveloping Algebra, Representations on Modules
In Sec. 1 we have seen how an associative algebra can be made into a Lie algebra. We shall now see the reverse process, namely how a given Lie algebra can be associated with an associative algebra. For this purpose we treat $L$ as a vector space and form its (contravariant) tensor algebra:

$$T(L) = \mathcal{F} \oplus L \oplus (L \otimes L) \oplus \cdots$$
We know $T(L)$ is an associative algebra (see Hint to Exercise 1 in Sec. (4.1)). We now collect the set of all elements of the form (differences between Lie algebra products and the corresponding commutators in $T(L)$):

$$[x, y] - (x \otimes y - y \otimes x) \qquad (4.4.1)$$

for $x, y$ in $L$, and we note that this collection generates a two-sided ideal denoted $I$. The associative quotient algebra:

$$T(L)/I = U(L) \qquad (4.4.2)$$
is called the universal enveloping algebra of $L$. Since associative algebras are easier to handle (structurally), the enveloping algebra $U(L)$ is found very useful in the study of Lie groups (via their Lie algebras). Some of the properties of $U(L)$ are listed below.

Properties 4.4.1: Every representation of a Lie algebra $L$ can be extended to a representation of $U(L)$. Every module over $L$ can be thought of as a module over $U(L)$. Conversely, a unitary left module over $U(L)$ is called a left $L$-module, in short an $L$-module. The action of $U(L)$ on a module $M$ is defined as:

$$(x_1 x_2 \cdots x_n)m = x_1(x_2(\ldots x_n(m) \ldots)) \qquad (4.4.3)$$

where $x_1, \ldots, x_n \in L$ and $m \in M$. Also, if $\{x_i\}$ is a basis of $L$, then monomials of the form

$$x_{i_1} \otimes x_{i_2} \otimes \cdots \otimes x_{i_n}, \qquad n = 0, 1, 2, \ldots$$

(where $n = 0$ gives the trivial monomial 1) span the tensor algebra $T(L)$, and hence the cosets of these monomials span $U(L)$.

Before closing this subsection we give a few definitions concerning representations that involve modules associated to the enveloping algebra of a Lie algebra and the unitary modules over a field $\mathcal{F}$ (see [1] and [8] for details).

Definition 4.4.2: Two representations $\rho$ and $\rho'$ of $L$ on $\mathcal{F}$-modules $M$ and $M'$ (over the same field $\mathcal{F}$) are called similar or isomorphic if the $L$-modules $M$ and $M'$ are isomorphic. It should be noted that for this it is necessary and sufficient that there exists an isomorphism $\psi$ of the $\mathcal{F}$-module $M$ onto the $\mathcal{F}$-module $M'$ such that

$$\rho'(x) = \psi \circ \rho(x) \circ \psi^{-1} \qquad (4.4.4)$$
for $x \in L$.

Definition 4.4.3: Let $I$ be an index set and for $i \in I$ let $\rho_i$ be a representation of $L$ on the module $M_i$, and let $M$ be the $L$-module which is the direct sum of the modules $M_i$. Then there is a corresponding representation

$$\rho = \sum_{i \in I} \rho_i$$

called the direct sum of representations, such that:

$$\rho(x) \cdot m = \oplus(\rho_i(x)m_i)_{i \in I} \qquad (4.4.5)$$

where $x \in L$ and $m = \sum_{i \in I} m_i \in M$.
Definition 4.4.4: A representation $\rho$ of $L$ on $M$ is called simple or irreducible if the associated $L$-module is simple. Note that this amounts to saying that there exists no submodule of $M$ (over the field $\mathcal{F}$) other than $\{0\}$ and $M$ which is stable under all the $\rho(x)$'s for $x \in L$. Thus a class of simple modules defines a class of simple representations.

Definition 4.4.5: A representation $\rho$ of $L$ on $M$ is called semi-simple or completely reducible if the associated $L$-module is semi-simple. Thus $\rho$ in this case is a direct sum of simple representations on submodules of $M$, none of which contains a proper non-zero submodule stable under $\rho(x)$ for every $x$ in $L$.

Definition 4.4.6: Let $L$ be a Lie algebra over a field $\mathcal{F}$, and $M$ an $L$-module. Then the $L$-module structure on $M$ and the trivial $L$-module structure on $\mathcal{F}$ define an $L$-module structure on the $\mathcal{F}$-module $N$ formed by bilinear forms on $M$. Explicitly, $N$ stands for $\mathcal{B}(M, M; \mathcal{F})$ and its structure is given as:

$$(x_N \cdot \beta)(m, m') = -\beta(x_M \cdot m, m') - \beta(m, x_M \cdot m') \qquad (4.4.6)$$

where $x \in L$, $m, m' \in M$, $\beta \in N$, and $x_N$, $x_M$ stand for the linear operators on the modules $N$ and $M$ that result from $x$ in $L$. The set of all elements $x \in L$ which satisfy $x_N \cdot \beta = 0$ for a fixed element $\beta$ of $N$ forms a subalgebra of $L$. In view of the above definition, we note that if $M$ is an $\mathcal{F}$-module and $gl(M)$ denotes the Lie algebra of endomorphisms of $M$, then for a given bilinear form $\beta$ on $M$, the set of $x \in gl(M)$ that satisfies:

$$\beta(xm, m') + \beta(m, xm') = 0 \qquad (4.4.7)$$

forms a Lie subalgebra of $gl(M)$.

Definition 4.4.7: Let $L$ be a Lie algebra over $\mathcal{F}$. The adjoint representation of $L$ on $L$ and the zero representation of $L$ on $\mathcal{F}$ define an $L$-module structure on the $\mathcal{F}$-module $N = \mathcal{B}(L, L; \mathcal{F})$ of bilinear forms on $L$. A bilinear form $\beta$ on $L$ is said to be invariant if it is invariant under the representation $x \to x_N$. From equality (4.4.6), the necessary and sufficient condition that $\beta$ be invariant is:

$$\beta([x, y], z) = \beta(x, [y, z]) \qquad (4.4.8)$$

4.2 Root Systems and the Weyl Group
In order to define a Weyl group in all generality, we recall the definition of a root system in an arbitrary finite-dimensional vector space $V$ over the field of rational numbers $\mathbb{Q}$.

Definition 4.4.8: Let the vector space $V$ carry a positive definite symmetric bilinear form. A finite subset $X$ of non-zero vectors of $V$ is called a root system in $V$ if the following four properties are satisfied by its elements (called roots) $\alpha$, $\beta$, etc.:
(i) $X$ spans $V$
(ii) if $\alpha \in X$ and $t\alpha \in X$ with $t \in \mathbb{Q}$, then $t = \pm 1$
(iii) if $\alpha, \beta \in X$, then $2(\alpha, \beta)/(\alpha, \alpha)$ is an integer (4.4.9)
(iv) if $\alpha, \beta \in X$, then $\beta - 2[(\alpha, \beta)/(\alpha, \alpha)]\alpha \in X$.
For each root $\alpha \in X$ the symmetry $S_\alpha$ is the linear map $V \to V$ defined as

$$S_\alpha(v) = v - 2[(\alpha, v)/(\alpha, \alpha)]\alpha. \qquad (4.4.10)$$
The set of all symmetries defined by (4.4.10) generates a group called the Weyl group $W$ of the root system $X$. Since $X$ is finite, the group $W$ is finite. Moreover, $W$ can be viewed as a group of permutations of $X$.
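To make the permutation picture concrete, the symmetries (4.4.10) can be applied to a small root system and the group they generate can be enumerated. The sketch below is an illustration under stated assumptions: it takes a rank-2 system with six roots $\pm\alpha_1, \pm\alpha_2, \pm(\alpha_1 + \alpha_2)$, written as coefficient pairs in a basis of simple roots normalised so that $(\alpha_i, \alpha_i) = 2$ and $(\alpha_1, \alpha_2) = -1$. All names (`GRAM`, `ROOTS`, `reflect`, `as_permutation`, `compose`) are hypothetical.

```python
from fractions import Fraction

# Assumed Gram matrix of the simple roots: (a_i, a_i) = 2, (a_1, a_2) = -1.
GRAM = [[2, -1], [-1, 2]]
POSITIVE = [(1, 0), (0, 1), (1, 1)]
ROOTS = POSITIVE + [(-a, -b) for a, b in POSITIVE]


def inner(u, v):
    return sum(GRAM[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


def reflect(alpha, v):
    """S_alpha(v) = v - 2 (alpha, v)/(alpha, alpha) alpha, as in (4.4.10)."""
    c = Fraction(2 * inner(alpha, v), inner(alpha, alpha))
    return tuple(vi - c * ai for vi, ai in zip(v, alpha))


def as_permutation(alpha):
    """S_alpha as a permutation of ROOTS (raises ValueError if a root leaves X)."""
    return tuple(ROOTS.index(reflect(alpha, r)) for r in ROOTS)


def compose(p, q):
    """Permutation obtained by applying q first and then p."""
    return tuple(p[i] for i in q)


generators = {as_permutation(a) for a in ROOTS}
identity = tuple(range(len(ROOTS)))

# Close the set of reflections under composition.
group = {identity}
frontier = [identity]
while frontier:
    g = frontier.pop()
    for s in generators:
        h = compose(s, g)
        if h not in group:
            group.add(h)
            frontier.append(h)

print(len(generators), len(group))   # 3 distinct reflections, group of order 6
```

Since $S_\alpha = S_{-\alpha}$, the six roots give only three distinct reflections, and the group they generate here has order 6.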
For all split$^{10}$ Lie algebras $L$ of finite dimension over a field of characteristic 0, a root system can be defined (as shown in the previous section) by taking the vector space $V$ over $\mathbb{Q}$ as the one spanned by the basis vectors $\alpha_1, \ldots, \alpha_n \in H^*$ (the dual of the Cartan subalgebra $H$). Denote the set of basis vectors by $X$. Then clearly $X \subset V$, and the elements of $X$ satisfy the remaining three properties (ii), (iii) and (iv). The Weyl group in this case is given by the elements $\{S_{\alpha_i}\}$, $\alpha_i \in H^*$ $(i = 1, \ldots, n)$. The following two properties relating to the above discussion can be easily checked.

Property 4.4.9: For any root system $X \subset V$ and $\alpha \in X$,

$$(S_\alpha(u), S_\alpha(v)) = (u, v) \quad \text{for all } u, v \in V \qquad (4.4.11)$$
Property 4.4.10: Let $n(\alpha, \beta) = 2(\alpha, \beta)/(\beta, \beta)$ define a function $X \times X \to \mathbb{Q}$; the only values that $n$ takes in $\mathbb{Q}$ are $0, \pm 1, \pm 2, \pm 3$.

Definition 4.4.11: A subset $Y \subset X$ is called a root system basis for $X$ in $V$ if $Y$ is a vector space basis for $V$ and for any $\beta \in X$ we have

$$\beta = \sum_{i=1}^{n} m_i \alpha_i, \qquad m_i \in \mathbb{Q}$$

where $Y = \{\alpha_1, \ldots, \alpha_n\}$, and either all the $m_i$'s are non-negative or they are all non-positive. A root system basis is irreducible if there is no non-trivial disjoint union $Y = Y_1 \cup Y_2$ with $(\alpha, \beta) = 0$ for all $\alpha \in Y_1$ and $\beta \in Y_2$. It is known that every root system $X$ in $V$ contains a root system basis (see [8]). Before closing this subsection we list the important properties of a root system basis $Y$ of the root system $X$ in $V$:
(i) $Y$ spans $V$
(ii) if $\alpha, \beta \in Y$ and $\alpha \neq \beta$ then $(\alpha, \beta) \le 0$
(iii) $Y$ is a vector space basis of $V$
(iv) if $Y = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ and $\beta \in X^+$ (the subset of all positive roots in $X$), then either $\beta \in Y$ or there is an $\alpha_i \in Y$ with $\beta - \alpha_i \in X^+$
(v) if $\beta \in X^+$, then there exist non-negative integers $m_i$ for $i = 1, 2, \ldots, n$ with $\beta = \sum_{i=1}^{n} m_i \alpha_i$.
We leave the verification of these properties as an exercise for the reader, and of course we shall use them in the remaining part of this chapter.

4.3 Cartan Matrices
Definition 4.4.12: Let $Y = \{\alpha_1, \ldots, \alpha_n\}$ denote a basis for a root system $X$ in $V$ and let $n: Y \times Y \to \{0, \pm 1, \pm 2, \pm 3\}$ be the mapping defined in Property (4.4.10). Then the matrix:

$$\|n(\alpha_i, \alpha_j)\| = \left\| \frac{2(\alpha_i, \alpha_j)}{(\alpha_j, \alpha_j)} \right\| \qquad (4.4.12)$$

is called the Cartan matrix of the root system $X$.*

10. A Lie algebra $L$ over a field $\mathcal{F}$ of characteristic 0 is said to be a split algebra if for each $x \in L$ all the characteristic values of $\mathrm{ad}\,x$ are in $\mathcal{F}$. All algebras over an algebraically closed field, in particular over the field of complex numbers, are split algebras (see [5] for details).
* From property (ii) of a root system basis it is evident that all diagonal elements of a Cartan matrix are 2, and the off-diagonal elements are from the set $\{0, -1, -2, -3\}$.
The elements $n(\alpha_i, \alpha_j) = A_{ij}$ of the Cartan matrix evidently determine the root-ladder of simple roots (see Eq. (4.3.23)), and hence also the highest root of the system. If $\alpha = \sum_{i=1}^{n} m_i \alpha_i \in X^+$, then $\sum_{i=1}^{n} m_i$ is called the height of $\alpha$ with respect to $Y$. Two root systems $X_1$ and $X_2$ in $V_1$ and $V_2$ respectively are said to be isomorphic if there exists an onto non-singular linear transformation $T: V_1 \to V_2$ such that $T(X_1) = X_2$, and for $u, v \in V_1$, $(T(u), T(v))_2 = c\,(u, v)_1$, where $c$ is a positive rational number and $(\,,)_i$ is a symmetric bilinear form on $V_i$ $(i = 1, 2)$.

Property 4.4.13: All the roots of a root system $X$ and its Cartan matrix can be determined once a basis $Y$ for the root system is given. Also, two root systems with bases having identical Cartan matrices are isomorphic.

It is important to note here that the Cartan matrix for a given root system is unique up to permutations of its rows and columns, irrespective of the choice of a root basis. This observation leads to the fact that Cartan matrices for Lie algebras are different only when these algebras are non-isomorphic. We next define another important object of this chapter, the Dynkin diagram, which turns out to be a useful tool in the determination of Cartan matrices of root systems and in the classification of Lie algebras.
4.4 Dynkin Diagram

Definition 4.4.14: The Dynkin diagram $\Delta$ of a root system $X$ in $V$ with basis $Y = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ consists of a graph in $\mathbb{R}^2$ that has $n$ vertices labelled $\alpha_1, \ldots, \alpha_n$ and has $N_{ij} = n(\alpha_i, \alpha_j)\,n(\alpha_j, \alpha_i)$ line segments joining the $\alpha_i$-th vertex to the $\alpha_j$-th vertex. Moreover, for any two roots $\alpha$ and $\beta$ (which include the $\alpha_i$'s), if $n(\alpha, \beta) \neq 0$ and $(\beta, \beta) > (\alpha, \alpha)$, then the Dynkin diagram carries an arrow from the $\beta$-vertex to the $\alpha$-vertex. From the above definition it is clear that the number of lines which join the vertices $\alpha_i$ and $\alpha_j$ can be obtained using the expression:

$$\frac{4(\alpha_i, \alpha_j)^2}{(\alpha_i, \alpha_i)(\alpha_j, \alpha_j)} = 4\cos^2 \angle(\alpha_i, \alpha_j) \qquad (4.4.13)$$
Property 4.4.15: Given a basis $Y$ of a root system $X$ in $V$, the Dynkin diagram of this root system $X$ determines the Cartan matrix of $X$.

To examine this property we need the two facts (already listed above), namely (1) $n(\alpha_i, \alpha_j) \in \{0, \pm 1, \pm 2, \pm 3\}$ for $\alpha_i, \alpha_j \in Y$; (2) the number of line segments joining the vertices $\alpha_i$, $\alpha_j$ in the Dynkin diagram is $n(\alpha_i, \alpha_j)\,n(\alpha_j, \alpha_i)$. To begin with, we note that if there is no line between $\alpha_i$ and $\alpha_j$ then $n(\alpha_i, \alpha_j) = 0$; thus the $(i, j)$-th as well as the $(j, i)$-th element of the Cartan matrix is zero. If $n(\alpha_i, \alpha_j)\,n(\alpha_j, \alpha_i) = 1$ (i.e., $\alpha_i$ and $\alpha_j$ are joined by one line) then $n(\alpha_i, \alpha_j) = n(\alpha_j, \alpha_i) = -1$ and $(\alpha_i, \alpha_i) = (\alpha_j, \alpha_j)$, giving roots of equal length. In view of Property (4.4.10) (see Hint to Exercise 2), the only other values that $n(\alpha_i, \alpha_j)\,n(\alpha_j, \alpha_i)$ can take are 2 and 3. Suppose now that $n(\alpha_i, \alpha_j)\,n(\alpha_j, \alpha_i) = 2$; the roots in this case will not be of equal length. If there is an arrow from $\alpha_j$ to $\alpha_i$, it would mean that $(\alpha_j, \alpha_j)$ is greater than $(\alpha_i, \alpha_i)$; this would therefore imply:

$$\frac{2(\alpha_i, \alpha_j)}{(\alpha_j, \alpha_j)} > \frac{2(\alpha_j, \alpha_i)}{(\alpha_i, \alpha_i)}$$

since the numerators are equal and negative. But the LHS and the RHS of this inequality are $n(\alpha_i, \alpha_j)$ and $n(\alpha_j, \alpha_i)$ respectively. Hence we have $n(\alpha_i, \alpha_j) > n(\alpha_j, \alpha_i)$, which means that $n(\alpha_i, \alpha_j) = -1$ and $n(\alpha_j, \alpha_i) = -2$. The remaining case $n(\alpha_i, \alpha_j)\,n(\alpha_j, \alpha_i) = 3$ can be argued in a similar manner. Again the roots will be of unequal lengths. An arrow from $\alpha_j$ to $\alpha_i$ would imply that $(\alpha_j, \alpha_j)$ is greater than $(\alpha_i, \alpha_i)$, leading eventually to $n(\alpha_i, \alpha_j) = -1$ and $n(\alpha_j, \alpha_i) = -3$. From our discussion on the Cartan matrix in the previous subsection, we already know that the off-diagonal elements are $\le 0$, and the Dynkin diagram helps determine these elements in all cases.

We now devote the rest of this section to illustrating the concepts introduced above by a few examples based on variations of root systems of a two-dimensional vector space.

Example 4.4.16: Let $V$ stand for a 2-dimensional vector space and $X$ stand for the root system in each of the four different variations given below.
(a) $V = \mathbb{Q}^2 = \mathbb{Q} \times \mathbb{Q}$, $\alpha = (1, 0)$, $\beta = (0, 1)$; $X = \{\pm\alpha, \pm\beta\}$
(b) $V = \{(a, \sqrt{3}\,b) : a, b \in \mathbb{Q}\}$, $\alpha = (1, 0)$, $\beta = (-1/2, \sqrt{3}/2)$; $X = \{\pm\alpha, \pm\beta, \pm(\alpha + \beta)\}$
(c) $V = \mathbb{Q}^2$, $\alpha = (1, 0)$, $\beta = (-1, 1)$; $X = \{\pm\alpha, \pm\beta, \pm(\alpha + \beta), \pm(2\alpha + \beta)\}$
(d) $V = \{(a, \sqrt{3}\,b) : a, b \in \mathbb{Q}\}$, $\alpha = (1, 0)$, $\beta = (-3/2, \sqrt{3}/2)$; $X = \{\pm\alpha, \pm\beta, \pm(\alpha + \beta), \pm(2\alpha + \beta), \pm(3\alpha + \beta), \pm(3\alpha + 2\beta)\}$

We shall now verify the compatibility of the different root systems with the corresponding vector spaces using Def. (4.4.8). It is easy to note that (i) is trivially satisfied in each of these four examples, i.e., $X$ spans $V$; in fact there is a root basis $Y$ formed by $\{\alpha, \beta\}$ in all of them which spans $V$. The condition (ii) is also quite evident in all cases. To examine (iii) and (iv) we choose the roots $(2\alpha + \beta)$ and $(3\alpha + \beta)$ of (d) and write down the required inner products. In the case of (iii) we have to show that:

$$2(2\alpha + \beta, 3\alpha + \beta)/(2\alpha + \beta, 2\alpha + \beta) \qquad (4.4.14)(a)$$

is an integer, i.e., $2((1/2, \sqrt{3}/2), (3/2, \sqrt{3}/2))/((1/2, \sqrt{3}/2), (1/2, \sqrt{3}/2))$ is an integer. Simplification yields:

$$2(3/4 + 3/4)/(1/4 + 3/4) = 3 \qquad (4.4.14)(b)$$

To establish (iv) we use the calculations of (4.4.14)(b) and write:

$$(3\alpha + \beta) - 3(2\alpha + \beta) = -3\alpha - 2\beta = -(3\alpha + 2\beta) \qquad (4.4.15)$$

Evidently $-(3\alpha + 2\beta) \in X$. It should be noted that the symmetry transformations $S_\alpha$ can be interpreted geometrically as reflections in the hyperplane perpendicular to the root $\alpha \in X$; these are called Weyl reflections. In this context property (iv) of Def. (4.4.8) merely asserts that the Weyl reflection of a root $\beta$ in the hyperplane (through the origin) perpendicular to any non-zero root $\alpha$ yields another root

$$\beta - \frac{2(\alpha, \beta)}{(\alpha, \alpha)}\,\alpha.$$
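This fact is also obvious for all four examples given above. We now use equality (4.4.12) of Def. (4.4.12) to write the Cartan matrix for example (d). For this we denote the root basis $\{\alpha, \beta\}$ as $\{\alpha_1, \alpha_2\}$. The Cartan matrix is:

$$\begin{pmatrix} \dfrac{2(\alpha_1, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_1, \alpha_2)}{(\alpha_2, \alpha_2)} \\ \dfrac{2(\alpha_2, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_2, \alpha_2)}{(\alpha_2, \alpha_2)} \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ -3 & 2 \end{pmatrix}$$

The Dynkin diagram $\Delta$ of the root system of example (d) is computed as follows:

$$n(\alpha_1, \alpha_2)\,n(\alpha_2, \alpha_1) = (-1)(-3) = 3$$

Thus there are three line segments joining $\alpha_1$ and $\alpha_2$ and there is an arrow pointing from $\alpha_2$ to $\alpha_1$.

[Dynkin diagram: vertices $\alpha_1$ and $\alpha_2$ joined by three line segments, with an arrow from $\alpha_2$ to $\alpha_1$]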
Property (4.4.15), which says that the Dynkin diagram for a given root basis determines the Cartan matrix, can be easily verified.
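The Cartan matrices and line-segment counts for all four variations of Example 4.4.16 can be reproduced numerically. The following is a minimal sketch: it assumes the usual Euclidean inner product on pairs of real coordinates and the convention $A_{ij} = 2(\alpha_i, \alpha_j)/(\alpha_j, \alpha_j)$ of (4.4.12). The function names and the `EXAMPLES` dictionary are hypothetical, and floating-point results are rounded on the strength of property (iii) of Def. (4.4.8), which says the entries are integers.

```python
import math


def inner(u, v):
    """Usual Euclidean inner product of two coordinate pairs."""
    return sum(a * b for a, b in zip(u, v))


def cartan_matrix(basis):
    """A_ij = n(alpha_i, alpha_j) = 2 (alpha_i, alpha_j) / (alpha_j, alpha_j)."""
    # round() only removes floating-point noise; the exact values are integers.
    return [[round(2 * inner(a, b) / inner(b, b)) for b in basis] for a in basis]


S3 = math.sqrt(3)
# Root bases (alpha, beta) of the four variations (a)-(d).
EXAMPLES = {
    "a": [(1, 0), (0, 1)],
    "b": [(1, 0), (-0.5, S3 / 2)],
    "c": [(1, 0), (-1, 1)],
    "d": [(1, 0), (-1.5, S3 / 2)],
}

for name, basis in EXAMPLES.items():
    A = cartan_matrix(basis)
    segments = A[0][1] * A[1][0]   # N_12 = n(a1, a2) n(a2, a1)
    print(name, A, "line segments:", segments)
```

The product of the two off-diagonal entries gives 0, 1, 2 and 3 line segments for (a), (b), (c) and (d) respectively, and the entry of larger magnitude sits in the row of the longer root.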
4.5 Casimir Element and Casimir Operator of L
Let $C(L)$ stand for the centre of the universal enveloping algebra $U(L)$. It is known that $C(L)$ is an abelian subalgebra of $U(L)$ and that it consists of those elements of $U(L)$ which commute with every element of $L$. These elements of $C(L)$ are called Casimir elements. More formally we define it as follows:

Definition 4.4.17: Let $L$ be a Lie algebra over a field $\mathcal{F}$, and $U(L)$ be its enveloping algebra. Let $J$ be an $n$-dimensional ideal of $L$ such that an invariant bilinear form $\beta$ on $L$, when restricted to $J$, is non-degenerate. Suppose that $J$ admits two bases $(e_i)$ and $(e'_i)$, $i = 1, \ldots, n$, for which $\beta(e_i, e'_j) = \delta_{ij}$.$^{11}$ The element

$$C = \sum_i e_i e'_i$$

of $U(L)$ belongs to the centre $C(L)$ and is called the Casimir element. It is evidently independent of the choice of the basis. In particular, when $\beta$ is the bilinear form associated with an $L$-module $M$, the element $C$ is called the Casimir element associated with $M$ (or with the corresponding representation).

Remark 4.4.18: When $L$ is a semi-simple Lie algebra, there always exists a Casimir element, since in this case there is a non-singular Killing form (which we again denote by $\beta$), with respect to which two bases $(e_i)$ and $(e'_i)$ can be defined satisfying the relation $\beta(e_i, e'_j) = \delta_{ij}$. The element

$$C = \sum_{i=1}^{n} e_i e'_i$$

is the required Casimir element, which commutes with all elements of $L$. The element $C$ is sometimes referred to as the second order Casimir element. In the case of a semi-simple Lie algebra (that we are talking about), it is analogous to the Laplace operator in the theory of special functions, and as such it is sometimes expressed by using a basis $\{h_1, \ldots, h_l\}$ of the Cartan subalgebra $H$ of $L$:

$$C = \sum_{i=1}^{l} h_i h^i + \sum_{\alpha \neq 0} \frac{e_\alpha e_{-\alpha}}{(e_\alpha, e_{-\alpha})} \qquad (4.4.16)$$

11. The bases $(e_i)$ and $(e'_i)$ are dual to each other with respect to $\beta$.
where $\{h^i\}$ denotes the dual basis with respect to the Killing form $(\,,)$ on $H$, and the summation in the second term is over all non-zero roots $\alpha$ of $L$, which in turn define non-zero root vectors $e_\alpha \in L^{\alpha}_H$.

Definition 4.4.19: Let $L$ be a Lie algebra over the field $\mathcal{F}$, let $J$ be an ideal of $L$ and let $\rho$ denote a representation of $L$ in a finite-dimensional vector space $V$. Define a bilinear form $\Gamma$ on $L$ by setting:

$$\Gamma(X, Y) = \mathrm{Tr}\,\rho(X)\rho(Y), \qquad X, Y \in L \qquad (4.4.17)$$

and assume that it is non-degenerate when restricted to $J$. Let $\{X_1, \ldots, X_n\}$ be a basis of $J$ and $\{X'_1, \ldots, X'_n\}$ be its dual (i.e., $\Gamma(X_i, X'_j) = \delta_{ij}$); then the element$^{12}$

$$C = \sum_i \rho(X_i) \circ \rho(X'_i) \qquad (4.4.18)$$
is an endomorphism of $V$ which commutes with every endomorphism $\rho(A)$ for $A \in L$. The element $C$ of (4.4.18) is called the Casimir operator of $J$ corresponding to the representation $\rho$. If $V$ is an irreducible $L$-module $M$, then the Casimir operator of $L$ is an automorphism of $M$. Hence $C$ is a multiple of the unit operator 1; thus, denoting the eigenvalue by $C(M)$, we have in this case:

$$C = C(M)\,1.$$

If $M$ has highest weight $\lambda$, then using Weyl's formula$^{13}$ we have

$$C(M) = \Big(\lambda, \lambda + \sum_{\alpha > 0} \alpha\Big) \qquad (4.4.19)$$
We shall return to these ideas in the next chapter, where we shall see how these concepts are used in representation theory of infinite-dimensional algebras.
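As a small concrete instance of Definition 4.4.19, the Casimir operator can be computed by hand for $2 \times 2$ matrices. The sketch below is an illustration only and rests on assumptions not made in the text: it takes $L = J$ to be the three-dimensional algebra of traceless $2 \times 2$ matrices with the usual basis $e, f, h$, lets $\rho$ be the identity (defining) representation, and uses plain nested lists for matrices. The names `matmul`, `trace`, `gamma`, `basis` and `dual` are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def gamma(x, y):
    """Trace form Gamma(X, Y) = Tr rho(X) rho(Y), with rho the identity map."""
    return trace(matmul(x, y))


# Assumed basis of traceless 2x2 matrices.
e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]
half_h = [[Fraction(1, 2), 0], [0, Fraction(-1, 2)]]

basis = [e, f, h]
dual = [f, e, half_h]   # chosen so that Gamma(X_i, X'_j) = delta_ij

# Verify the duality relation before using it.
for i, x in enumerate(basis):
    for j, y in enumerate(dual):
        assert gamma(x, y) == (1 if i == j else 0)

# C = sum_i rho(X_i) rho(X'_i), as in (4.4.18).
C = [[0, 0], [0, 0]]
for x, y in zip(basis, dual):
    p = matmul(x, y)
    C = [[C[i][j] + p[i][j] for j in range(2)] for i in range(2)]

print(C)   # a scalar multiple of the identity, with diagonal entries 3/2
print(all(matmul(C, x) == matmul(x, C) for x in basis))
```

Here $C$ comes out as $3/2$ times the unit operator, in line with the statement that on an irreducible module the Casimir operator is a multiple of 1.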
12. We have used the same symbol $C$ to denote the Casimir element as well as the Casimir operator, since it has the same properties as those of the Casimir operator defined earlier for Lie groups (see Exercise 3.2 and Sec. 3.7).
13. See Belinfante, Jacobson, Varadarajan for details on Weyl's formulae, e.g., the character formula and the dimension formula ([1], [5], [11]).

Exercise 4.4
1. Prove Property (4.4.9).
2. Prove Property (4.4.10).
3. Show that the symmetry transformations $\{S_\alpha\}$ of a vector space $V$ satisfy (i) $S_\alpha = S_{-\alpha}$ and (ii) $S_\alpha^2 = 1$ for every root $\alpha \in X$.
4. Show that the Cartan matrices for Example (4.4.16) (a), (b) and (d) are respectively:

$$\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 2 & -1 \\ -3 & 2 \end{pmatrix}$$

5. Establish the following for a simple Lie algebra $A_2$, when the Dynkin diagram in terms of simple roots $\alpha_1, \alpha_2$ is given as:
[Dynkin diagram: two vertices, $\alpha_1$ and $\alpha_2$, joined by a single line segment]
(i) its Cartan matrix;
(ii) the $\alpha_1$-ladder through $\alpha_2$;
(iii) its root system, the highest root and the root diagram;
(iv) a canonical basis for $A_2$.
6. Find the basic weights $\lambda_1, \lambda_2$ for the Lie algebra $A_2$ of Exercise 5, and the basic representation modules corresponding to them, along with their weight diagrams.
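Hints to Exercise 4.4
1. By definition $S_\alpha(u) = u - 2[(\alpha, u)/(\alpha, \alpha)]\alpha$, therefore

$$(S_\alpha(u), S_\alpha(v)) = (u, v) - \frac{2(u, \alpha)(\alpha, v)}{(\alpha, \alpha)} - \frac{2(v, \alpha)(\alpha, u)}{(\alpha, \alpha)} + \frac{4(\alpha, u)(\alpha, v)(\alpha, \alpha)}{(\alpha, \alpha)^2}.$$

Since the bilinear form $(\,,)$ is symmetric, the last three terms cancel and we have the required result.
2. To prove this property, we think of $V$ as having the base field $\mathbb{R}$ (which contains $\mathbb{Q}$), the field of real numbers; the inner product on $V$, and therefore on $X$, is the restriction of the usual inner product on $\mathbb{R}^n$, $n$ being the dimension of $V$. If $\theta$ denotes the angle between the roots $\alpha$ and $\beta$, thought of as vectors in $\mathbb{R}^n$, then

$$\cos^2\theta = \frac{(\alpha, \beta)^2}{(\alpha, \alpha)(\beta, \beta)} \qquad (i)$$

This implies that

$$n(\alpha, \beta)\,n(\beta, \alpha) = \frac{4(\alpha, \beta)^2}{(\alpha, \alpha)(\beta, \beta)} = 4\cos^2\theta \qquad (ii)$$

Since $\cos^2\theta \le 1$, $n(\alpha, \beta)\,n(\beta, \alpha) \le 4$. But $n(\alpha, \beta)$ and $n(\beta, \alpha)$ are both integers (Def. (4.4.8)(iii)), therefore we have

$$-4 \le n(\alpha, \beta) \le 4. \qquad (iii)$$

In order to show that $n(\alpha, \beta)$ takes only the values $0, \pm 1, \pm 2, \pm 3$, we have to show that it does not take the value $\pm 4$ as allowed by (iii). If possible let $n(\alpha, \beta) = 4$; then from (ii) $n(\beta, \alpha) = 1$, but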
$n(\beta, \alpha) = 2(\beta, \alpha)/(\alpha, \alpha)$, therefore $(\alpha, \alpha) = 2(\beta, \alpha) = 2(\alpha, \beta) = 4(\beta, \beta)$ ...
3. $S_\alpha(x) = x - 2[(\alpha, x)/(\alpha, \alpha)]\alpha$, and $S_{-\alpha}(x) = x - 2[(-\alpha, x)/(-\alpha, -\alpha)](-\alpha)$. Apparently $S_\alpha(x) = S_{-\alpha}(x)$ for all $x$ in $V$ and all $\alpha$ in $X$. To compute $S_\alpha^2(x)$ we write $S_\alpha(x) = y$, thus

$$S_\alpha(y) = y - 2[(\alpha, y)/(\alpha, \alpha)]\alpha.$$

On substitution this becomes:

$$S_\alpha(S_\alpha(x)) = x - \frac{2(\alpha, x)}{(\alpha, \alpha)}\alpha - \frac{2(\alpha, x)}{(\alpha, \alpha)}\alpha + \frac{4(\alpha, x)}{(\alpha, \alpha)}\alpha = x,$$
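thus $S_\alpha^2 = 1$.
4. We use Def. (4.4.12) to form the matrix $\|2(\alpha_i, \alpha_j)/(\alpha_j, \alpha_j)\|$. Since in each of these examples the root basis consists of two elements $\alpha$, $\beta$, the matrix is:

$$\begin{pmatrix} \dfrac{2(\alpha, \alpha)}{(\alpha, \alpha)} & \dfrac{2(\alpha, \beta)}{(\beta, \beta)} \\ \dfrac{2(\beta, \alpha)}{(\alpha, \alpha)} & \dfrac{2(\beta, \beta)}{(\beta, \beta)} \end{pmatrix}$$

Substitution of the different values (such as $\alpha = (1, 0)$ and $\beta = (0, 1)$ in the case of (a)) gives the required result.
5. (i) From the definition of the Dynkin diagram and Eq. (4.4.13) we have

$$n(\alpha_i, \alpha_j)\,n(\alpha_j, \alpha_i) = \frac{4(\alpha_i, \alpha_j)^2}{(\alpha_i, \alpha_i)(\alpha_j, \alpha_j)} = 4\cos^2 \angle(\alpha_i, \alpha_j). \qquad (a)$$

The LHS of the above equality represents the number of lines between the vertices $\alpha_i$ and $\alpha_j$, which can be 0, 1, 2 or 3. Also $(\alpha_i, \alpha_j) \le 0$ for simple roots; hence, using the different choices of $r$ $(r = 0, 1, 2, 3)$ in $r/4 = \cos^2 \angle(\alpha_i, \alpha_j)$, it follows that the angle between $\alpha_i$ and $\alpha_j$ can be one of these: 90°, 120°, 135°, 150°. In our case the value of $r$ is 1, hence the angle is 120°. The roots $\alpha_i$ and $\alpha_j$ are of equal length, since in the Dynkin diagram there is no arrow from one root to the other. To compute the Cartan matrix:

$$\begin{pmatrix} \dfrac{2(\alpha_1, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_1, \alpha_2)}{(\alpha_2, \alpha_2)} \\ \dfrac{2(\alpha_2, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_2, \alpha_2)}{(\alpha_2, \alpha_2)} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

we observe that, since the roots are of equal length, the off-diagonal elements are not only equal but simplify to $2\cos \angle(\alpha_1, \alpha_2) = 2\cos 120° = -1$. Hence the matrix is

$$\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$$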
(ii) Using the expression for a root ladder and the fact that the difference of two simple roots is not a root, we have p = 0, thus the root ladder $\alpha_1$ through $\alpha_2$ is simply
(b) $\alpha_2,\ \alpha_2 + \alpha_1,\ \ldots,\ \alpha_2 + q\alpha_1$
where q satisfies (4.3.26), i.e.,
$q = -\frac{2(\alpha_1, \alpha_2)}{(\alpha_2, \alpha_2)}.$
This gives q = 1, accordingly the root ladder (b) has just two roots $\alpha_2$, $\alpha_2 + \alpha_1$.
(iii) Since every simple root as well as any other positive root has its negative counterpart that belongs to the root system, we have the root system consisting of six roots $\pm\alpha_1$, $\pm\alpha_2$, $\pm(\alpha_1 + \alpha_2)$. The highest root is obviously $\alpha_1 + \alpha_2$. The root diagram is as given below:
[Figure: Root diagram of $A_2$, showing the six roots $\pm\alpha_1$, $\pm\alpha_2$, $\pm(\alpha_1 + \alpha_2)$]
Obviously the Euclidean space $\mathcal{H}_R^*$ formed by simple roots is a plane spanned by $\alpha_1$ and $\alpha_2$. From the above diagram it is also clear that the root system of $A_2$ is symmetric under the Weyl group W, and has additional symmetry under inversion in the origin (e.g., $\alpha_i \to -\alpha_i$, etc.).
(iv) Let $h_1$, $h_2$ be the elements of Cartan subalgebra given by:
*,•=
,
' .
(I'=1,2)
and let us assume that the root vector ea. corresponds to et e %.v define
2e_a. The vectors $e_1, e_2, f_1, f_2$,¹⁴ together with $h_1, h_2$ form the canonical basis and generate the Lie algebra $A_2$. The following Lie bracket relations can be easily verified:
$[h_i, h_j] = 0, \quad [e_i, f_i] = h_i, \quad [e_2, f_1] = 0 = [e_1, f_2].$
Writing similar relations for $h_1$, $h_2$ we get
$[h_1, e_1] = a_{11} e_1 = 2e_1, \quad [h_1, e_2] = a_{12} e_2 = -e_2,$
$[h_1, f_1] = -a_{11} f_1 = -2f_1, \quad [h_1, f_2] = -a_{12} f_2 = f_2,$
$[h_2, e_1] = -e_1, \quad [h_2, f_1] = f_1, \quad [h_2, e_2] = 2e_2, \quad [h_2, f_2] = -2f_2.$
Besides these six vectors there are the commutators $[e_1, e_2] = e_{12}$ and $[f_1, f_2] = f_{12}$. Thus in all we have eight vectors which span the vector space underlying $A_2$.
6. From the defining Eq. (4.3.28) for basic weights, we know that when $i \ne j$, the basic weight $\lambda_i$ is perpendicular to $\alpha_j$; in the case $i = j$ the formula implies that the projection of $\lambda_i$ on the root $\alpha_i$ is half the length of $\alpha_i$. Since the rank of $A_2$ is two, there are only two simple roots and two basic weights. To obtain the latter we must find the lengths of $\alpha_1$ and $\alpha_2$; for this we use the fact that the sum of the squares of lengths of all the roots is equal to the rank. From the root diagram of Exercise 5, there are six roots, and the end points of root vectors form the vertices of a regular hexagon, thus they are all of equal length, say $l$. From $6l^2 = 2$ we have $l = 1/\sqrt{3}$, i.e., $(\alpha_1, \alpha_1) = (\alpha_2, \alpha_2) = 1/3$, and since $\angle(\alpha_1, \alpha_2) = 120°$, $(\alpha_1, \alpha_2) = -1/6$. Thus, superimposed on a root diagram, the basic weights will be as follows:
[Figure: Weight diagram of $A_2$, showing the basic weights superimposed on the roots $\pm\alpha_1$, $\pm\alpha_2$, $\pm(\alpha_1 + \alpha_2)$]
14 The vectors $\{e_i\}$ and $\{f_i\}$ in a Lie algebra $L$ of rank $l$ are called the simple raising elements and simple lowering elements of $L$ (see Ftn. on p. 172).
Corresponding to these weights $\lambda_1$, $\lambda_2$ there are two basic modules for $A_2$. The representation which corresponds to one is just the dual of the other. The weight diagram (with highest weight $\lambda_1$) using the Weyl group can be seen to be an equilateral triangle with weights $\lambda_1$, $\lambda_2 - \lambda_1$, $-\lambda_2$ as its vertices. Likewise the weight diagram with highest weight $\lambda_2$ will be the equilateral triangle with weights $\lambda_2$, $\lambda_1 - \lambda_2$, $-\lambda_1$ as its vertices. Thus if $\mu_1, \mu_2, \mu_3$ is the weight system of basic module $M_1$ with highest weight $\lambda_1 = \mu_1$, then $-\mu_1, -\mu_2, -\mu_3$ is the weight system for basic module $M_2$ with highest weight $\lambda_2 = -\mu_3$. The weight diagrams in two cases would be:
[Figure: Weight diagrams of $M_1$ and $M_2$]
Each of these modules will be three-dimensional, since corresponding to weights $\mu_1, \mu_2, \mu_3$ there will be three distinct weight vectors $x_1, x_2, x_3$. We use $x_1, x_2, x_3$ as basis vectors and, choosing $x_1$ as the extreme vector (i.e., $e_i x_1 = 0$¹⁵ for every i), we can set the other two as $x_2 = f_1 x_1$ and $x_3 = f_2 x_2$. We use the relations given in (4.3.18) to write $h_i x_j = \mu_j(h_i) x_j$¹⁶ and $\lambda_i(h_j) = \delta_{ij}$ and then use these to note further that
$e_1 x_2 = e_1 f_1 x_1 = [e_1, f_1] x_1 + f_1 e_1 x_1 = h_1 x_1 = x_1.$
The rule $L_\alpha M^\mu \subset M^{\mu + \alpha}$ (see Sec. 3) and an inspection of the weight system further gives $e_1 x_1 = 0 = e_1 x_3$. We now use
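$x_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad x_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$
to write the matrix representation of $A_2$ corresponding to the irreducible (basic) module $M_1$. Thus we have:
$e_1 \mapsto \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad e_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad e_{12} \mapsto \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$
$f_1 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad f_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad f_{12} \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},$
$h_1 \mapsto \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad h_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$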
We are using (4.3.24) (i) and (ii) but with different notation. Comparing it with (4.3.16) it is obvious that while $h_i$ replaces x (an element of the Lie algebra L), $x_i$ replaces m (a vector in the module).
since the weight system of $M_2$ can be obtained from that of $M_1$ by inversion in the origin. The representation matrices can be written by using the dual basis $x_1^*, x_2^*, x_3^*$ (in place of $x_1, x_2, x_3$). It can be checked that they are negative transposes of the matrices given above.
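The bracket relations of Exercise 5 (iv) can be checked mechanically on the 3 x 3 matrices of the module $M_1$. The following Python sketch is only an illustration: the helper names (`unit`, `bracket`, `check_relations`) are ours and not part of the text, and the matrices are the elementary matrix units that the representation above assigns to $e_i$, $f_i$, $h_i$.

```python
# Elementary 3x3 matrix units: unit(i, j) has a 1 in row i, column j (1-based).
def unit(i, j):
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)] for r in range(3)]

def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def sub(a, b):
    return [[a[r][c] - b[r][c] for c in range(3)] for r in range(3)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return sub(mul(a, b), mul(b, a))

ZERO = [[0] * 3 for _ in range(3)]

# Images of the generators in the basic module M_1 (assumed as in the text).
e = {1: unit(1, 2), 2: unit(2, 3)}
f = {1: unit(2, 1), 2: unit(3, 2)}
h = {1: sub(unit(1, 1), unit(2, 2)), 2: sub(unit(2, 2), unit(3, 3))}
cartan = {(1, 1): 2, (1, 2): -1, (2, 1): -1, (2, 2): 2}

def check_relations():
    """Check [h1,h2]=0, [e_i,f_j]=delta_ij h_i, [h_i,e_j]=a_ij e_j, [h_i,f_j]=-a_ij f_j."""
    ok = bracket(h[1], h[2]) == ZERO
    for i in (1, 2):
        for j in (1, 2):
            expected = h[i] if i == j else ZERO
            ok = ok and bracket(e[i], f[j]) == expected
            ok = ok and bracket(h[i], e[j]) == scale(cartan[(i, j)], e[j])
            ok = ok and bracket(h[i], f[j]) == scale(-cartan[(i, j)], f[j])
    return ok

e12 = bracket(e[1], e[2])  # the commutator [e1, e2]
f12 = bracket(f[1], f[2])  # the commutator [f1, f2]
```

With this ordering of the commutators, `e12` is the matrix unit in position (1, 3), while `f12` comes out as minus the matrix unit in position (3, 1); the overall sign of $f_{12}$ therefore depends on the ordering convention chosen for $[f_1, f_2]$.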
References
1. J. G. F. Belinfante and B. Kolman, A Survey of Lie Groups and Lie Algebras with Applications and Computational Methods (SIAM, Philadelphia, 1972).
2. N. Bourbaki, Éléments de Mathématique, Groupes et Algèbres de Lie, Chapitre I (Paris: Hermann, 1960).
3. E. B. Dynkin, The Structure of Semisimple Algebras, Uspekhi Mat. Nauk 2 (1947), 59-127; English transl., Amer. Math. Soc. Transl. No. 17, 1950, reprinted in Amer. Math. Soc. Translations Series I, Vol. 9 (1962) 328-469.
4. J. E. Humphreys, Introduction to Lie Algebras and Representation Theory (New York: Springer-Verlag, 1972).
5. N. Jacobson, Lie Algebras (Interscience Publishers, 1962).
6. V. G. Kac, (a) Infinite Dimensional Lie Algebras (Birkhäuser Boston Inc., 1983); (b) (ed.) Infinite Dimensional Lie Algebras and Groups (World Scientific, 1989).
7. V. G. Kac and A. K. Raina, Highest Weight Representations of Infinite Dimensional Lie Algebras (World Scientific, 1987).
8. A. A. Sagle and R. E. Walde, Introduction to Lie Groups and Lie Algebras (Academic Press, Inc., 1973).
9. D. H. Sattinger and O. L. Weaver, Lie Groups and Algebras with Applications to Physics, Geometry, and Mechanics (Springer-Verlag, 1986).
10. A. E. Taylor and D. C. Lay, Introduction to Functional Analysis (John Wiley and Sons, 1980).
11. V. S. Varadarajan, Lie Groups, Lie Algebras, and Their Representations (a) (New Jersey: Prentice Hall, 1974); (b) (New York: Springer-Verlag, 1984).
12. H. Zassenhaus, 2.[35].
CHAPTER 5
INFINITE-DIMENSIONAL ALGEBRAS
Infinite-dimensional algebras occur in many areas of mathematics and physics; for instance, one finds them in 2-dimensional conformal field theories (Virasoro algebra), in gauge and quantum field theories, as well as in string theories. In this chapter we focus our attention mainly on affine algebras, i.e., Kac-Moody algebras with affine matrix, because it is these algebras which have deep links with combinatorics and the theory of modular forms and theta functions on the mathematical side, and have an important role (via vertex operators) in dual resonance models/quantum string theories on the physics side. Besides this, it is possible to construct root systems, and obtain representations (e.g. highest weight representations) for these algebras in much the same way as one does for finite-dimensional Lie algebras (see Dolan in Ref. [Ad], [15] and [17], as well as Chari and Pressley in [8b]). It is worth mentioning here that Kac-Moody algebras were introduced independently by Victor Kac and R. V. Moody in the mid-sixties as a class of infinite-dimensional Lie algebras, and have assumed a phenomenal role ever since. We devote the first 3 sections to these algebras, and the latter 3 sections to Heisenberg systems and the Vertex and Virasoro operators. We shall see in those sections the effectiveness of Kac-Moody algebras as a tool in other theories.
1
LIE ALGEBRAS ASSOCIATED TO CARTAN MATRICES
In the previous chapter, while studying the weight and root systems of finite-dimensional Lie algebras, we have seen that every (semi-simple)¹ Lie algebra carries at least one Cartan subalgebra, with respect to which we have a root space decomposition:
$g = \mathcal{H} \oplus \sum_{\alpha \in \Delta} g_\alpha$
We also learnt there that, using a symmetric bilinear form on g, a matrix called the Cartan matrix can be defined associated with the root system $\Delta$, which is unique up to isomorphisms. In this section we reverse the order: we consider the so-called generalized Cartan matrix and construct a Lie algebra from it. A familiarity with this approach is required for learning the basics of infinite-dimensional algebras, particularly that of Kac-Moody algebras. The material in this section is based on the (standard) text [8(a)]. For more details and enlightening exercises, the reader is referred to this text. The notations used here differ from the previous chapter in some cases; for instance, the Lie algebra there has been denoted as L and the Cartan subalgebra as H, whereas here we shall denote the Lie algebra as g and the Cartan subalgebra as $\mathcal{H}$.
1.1 Generalized Cartan Matrix and its Realization
Definition 5.1.1: A complex n × n matrix $A = (a_{ij})_{i,j=1}^{n}$ of rank $l$ is called a generalized Cartan matrix if its entries satisfy:
(i) $a_{ii} = 2$ (i = 1, ..., n)
(ii) $a_{ij}$ are non-positive integers for $i \ne j$
(iii) $a_{ij} = 0$ implies $a_{ji} = 0$ (5.1.1)
We now assume that A is real and indecomposable;² then it can be verified that just one of the three following statements (listed in Fact (5.1.2)) holds good for both A and ${}^tA$.
Fact 5.1.2:
(a) If det A ≠ 0 then there exists a column vector u > 0 (i.e., all $u_i$'s are > 0) such that Au > 0; moreover Av ≥ 0 implies either v > 0 or v = 0. Such an A is called a finite type matrix.
(b) If det A = 0, and rank A is (n - 1), then there exists u > 0 such that Au = 0, and Av ≥ 0 implies Av = 0. The matrix A is now called affine type.
(c) There exists u > 0 such that Au < 0, and Av ≥ 0, v ≥ 0 imply v = 0. The matrix in this case is known as indefinite type.
The above facts can be translated in terms of principal minors (determinants of principal submatrices) of A. For instance: (a') A is of finite type if and only if all its principal minors are positive; (b') A is of affine type if and only if all its proper principal minors are positive and det A = 0 (see Chapter 4 of [8(a)]). Using these facts on indecomposable generalized Cartan matrices a complete classification of Dynkin diagrams of these matrices can be obtained, and associated Lie algebras can be analyzed (see Thm. (4.8) of [8(a)]). Returning to our goal of defining a Lie algebra for a given generalized Cartan matrix, we first define the term realization of a matrix A.
Definition 5.1.3: Let A be an n × n matrix of rank $l$ with complex entries (to be more general), $\mathcal{H}$ a complex vector space and $\mathcal{H}^*$ its dual. Let $\Pi = \{\alpha_1, \ldots, \alpha_n\} \subset \mathcal{H}^*$ and $\tilde{\Pi} = \{\tilde{\alpha}_1, \ldots, \tilde{\alpha}_n\} \subset \mathcal{H}$ be indexed subsets of $\mathcal{H}^*$ and $\mathcal{H}$ which satisfy the following properties:³
(i) The sets $\Pi$ and $\tilde{\Pi}$ are linearly independent
(ii) $\langle \tilde{\alpha}_i, \alpha_j \rangle = a_{ij}$ (i, j = 1, 2, ..., n)
(iii) $\dim \mathcal{H} = 2n - l$. (5.1.2)
Then the triple $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$ is called a realization of A. Every matrix A possesses a realization, which is unique up to isomorphisms⁴ only if det A ≠ 0. It can be checked that if $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$ is a realization of A, then $\{\mathcal{H}^*, \tilde{\Pi}, \Pi\}$ is a realization of ${}^tA$ = the transpose of A.
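The three conditions of Definition 5.1.3 are easy to test on a concrete candidate. The Python sketch below is an illustration under stated assumptions: vectors of $\mathcal{H} = C^3$ and linear functionals on it are both stored as coordinate lists, the pairing is the ordinary dot product, and the function names (`rank`, `is_realization`) are ours. The sample data are the rank 1 matrix with $a = b = 2$ together with the rows of the augmented matrix used in the Hint to Exercise 1 of this section.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_realization(A, coroots, roots):
    """Check (i)-(iii) of Def. 5.1.3 for coroots in H and roots in H* (coordinate lists)."""
    n = len(A)
    dim_h = len(coroots[0])
    # (ii): the root alpha_j evaluated on the coroot alpha~_i must equal a_ij.
    pairing_ok = all(
        sum(roots[j][k] * coroots[i][k] for k in range(dim_h)) == A[i][j]
        for i in range(n) for j in range(n)
    )
    # (i): both indexed sets are linearly independent.
    independent = rank(coroots) == n and rank(roots) == n
    # (iii): dim H = 2n - l, where l is the rank of A.
    return pairing_ok and independent and dim_h == 2 * n - rank(A)

A = [[2, -2], [-2, 2]]               # rank 1, so dim H must be 3
coroots = [[2, -2, 0], [-2, 2, 1]]   # first two rows of the augmented matrix
roots = [[1, 0, 0], [0, 1, 0]]       # first two coordinate functions on C^3
```

Calling `is_realization(A, coroots, roots)` succeeds, whereas the naive 2-dimensional choice (rows of A itself as coroots) fails, since those rows are linearly dependent and the dimension is 2 rather than $2n - l = 3$.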
2 A matrix A is said to be decomposable if after reordering its rows/columns it takes a diagonal block form $\begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}$. In other words it can be expressed as a direct sum of two or more matrices. If this is not the case, it is called indecomposable.
3 Note that the symbol (tilde) ~ on $\tilde{\Pi}$ and $\tilde{\alpha}_1, \ldots, \tilde{\alpha}_n$ does not stand for complex conjugation. See Sec. (3.1) to appreciate (ii) of Eq. (5.1.2).
4 The word 'isomorphism' relates to vector spaces.
In analogy with the terminology used in the previous chapter the elements of $\Pi$ and $\tilde{\Pi}$ are respectively simple roots and simple co-roots (see (4.3.20)(a) and Ftn. 8 there), while $\Pi$ and $\tilde{\Pi}$ are root and co-root basis. The root lattices accordingly are:
$Q = \sum_{i=1}^{n} \mathbb{Z}\alpha_i, \quad Q_+ = \sum_{i=1}^{n} \mathbb{Z}_+\alpha_i$ (5.1.3)
A partial ordering ≥ on $\mathcal{H}^*$ can be defined by setting $\alpha \ge \beta$ if $\alpha - \beta \in Q_+$. Having set up this machinery, we now obtain an (auxiliary) Lie algebra $\tilde{g}(A)$ associated with the (n × n) complex matrix A whose realization is $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$.
Remark 5.1.4: Consider two sets of generators $\{e_i\}$, $\{f_i\}$ along with the vector space $\mathcal{H}$. Assume that the elements of those sets satisfy* (see Hint to Exercise 5 in Sec. 4.4):
(i) $[e_i, f_j] = \delta_{ij}\tilde{\alpha}_i$ (i, j = 1, 2, ..., n)
(ii) $[h, h'] = 0$ ($h, h' \in \mathcal{H}$)
(iii) $[h, e_i] = \langle \alpha_i, h \rangle e_i$ (i = 1, 2, ..., n, $h \in \mathcal{H}$)
(iv) $[h, f_i] = -\langle \alpha_i, h \rangle f_i$ (i = 1, 2, ..., n, $h \in \mathcal{H}$) (5.1.4)
Then the vector space generated by the union of $\{e_i\}$, $\{f_i\}$ and $\mathcal{H}$, equipped with the Lie brackets (5.1.4), forms the Lie algebra denoted $\tilde{g}(A)$ which is associated to the matrix A and the realization $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$. The Lie algebra $\tilde{g}(A)$ is characterized by a fundamental result, which we state without proof in the next subsection (see Hint to Exercise 2 of this section and Thm. 1.2 in [8(a)]).
1.2 Construction of Kac-Moody Algebra g(A) and its Universal Enveloping Algebra
Result 5.1.5:
(a) The Lie algebra $\tilde{g}(A)$ depends on A and equals
$\tilde{g}(A) = \tilde{n}_+ \oplus \mathcal{H} \oplus \tilde{n}_-$ (5.1.5)
where $\tilde{n}_+$ ($\tilde{n}_-$) is the subalgebra freely generated by the set $\{e_1, \ldots, e_n\}$ ($\{f_1, \ldots, f_n\}$).
(b) In view of the relations (5.1.4), there exists a mapping $e_i \to -f_i$, $f_i \to -e_i$ (i = 1, ..., n), and $h \to -h$ ($h \in \mathcal{H}$) which can be uniquely extended to an involution $\tilde{\sigma}$ of the Lie algebra $\tilde{g}(A)$.
(c) $\tilde{g}(A)$ admits the root space decomposition with respect to $\mathcal{H}$:
$\tilde{g}(A) = \Big(\bigoplus_{\alpha \in Q_+,\ \alpha \ne 0} \tilde{g}_{-\alpha}\Big) \oplus \mathcal{H} \oplus \Big(\bigoplus_{\alpha \in Q_+,\ \alpha \ne 0} \tilde{g}_{\alpha}\Big)$ (5.1.6)
where $\tilde{g}_\alpha = \{x \in \tilde{g}(A) \mid [h, x] = \alpha(h)x\}$, $\dim \tilde{g}_\alpha < \infty$,† and $\tilde{g}_\alpha \subset \tilde{n}_\pm$ for $\pm\alpha \in Q_+$.
(d) There exists a unique maximal ideal $I$ in $\tilde{g}(A)$ which satisfies:
$I = (I \cap \tilde{n}_-) \oplus (I \cap \tilde{n}_+)$ (5.1.7)
The ideal $I$ intersects $\mathcal{H}$ trivially. With $\tilde{g}(A)$ and $I$ defined above, one can obtain the quotient set:
$\tilde{g}(A)/I = g(A)$ (5.1.8)
* Repeated indices do not imply summation here.
† $\dim \tilde{g}_\alpha \le n^{|\mathrm{ht}\,\alpha|}$; this estimate links the dimension of $\tilde{g}_\alpha$ to the height of root $\alpha$. (See Def. 4.4.12).
The set $g(A)$ is the Lie algebra associated to a complex matrix A. When A is a generalized Cartan matrix (as defined in (5.1.1)), the algebra constructed in the above manner is called a Kac-Moody algebra. The quadruple $(g(A), \mathcal{H}, \Pi, \tilde{\Pi})$ is called the $(g, \mathcal{H})$-pair associated to the matrix A. Note that $\mathcal{H}$ is the Cartan subalgebra of $g(A)$. The images of the generators $\{e_i\}$, $\{f_i\}$ under the canonical mapping $\tilde{g}(A) \to \tilde{g}(A)/I$, which are again denoted as $\{e_i\}$, $\{f_i\}$, are the Chevalley generators of $g(A)$.⁵ Also the involution $\tilde{\sigma}$ on $\tilde{g}(A)$ leaves the ideal $I$ invariant and thus induces an involution $\sigma$ on $g(A)$ which satisfies:
$\sigma(e_i) = -f_i, \quad \sigma(f_i) = -e_i$ (5.1.9)
The involution $\sigma$ is called the Cartan involution of $g(A)$.
Furthermore if we set $\mathcal{H}' = \sum_{i=1}^{n} \mathbb{C}\tilde{\alpha}_i$, and consider the derived subalgebra $g'(A)$, we have
$g'(A) \cap \mathcal{H} = \mathcal{H}'; \quad g'(A) \cap g_\alpha = g_\alpha$ if $\alpha \ne 0$.
In view of (5.1.6) we therefore have:
$g(A) = \bigoplus_{\alpha \in Q} g_\alpha$ (5.1.10)
where $g_\alpha = \{x \in g(A) \mid [h, x] = \alpha(h)x \text{ for all } h \in \mathcal{H}\}$ is the root space corresponding to the root $\alpha$. Evidently $g_0 = \mathcal{H}$ (see subsec. 4 of (4.3)). Equations (5.1.9) and (5.1.10) taken together lead to the fact that if $\alpha$ is a simple root, then $g_\alpha$ is 1-dimensional, and for every simple root $\alpha$ of $g(A)$ there is a root $-\alpha$, and the root space $g_{-\alpha}$ is 1-dimensional. If $\alpha$ is not a simple root, then multiplicity of $\alpha$ = multiplicity of $(-\alpha)$, where multiplicity of $\alpha$ = $\dim g_\alpha$. In particular $\Delta_- = -\Delta_+$.
The Lie algebra g(A) is finite-dimensional if and only if all principal minors of matrix A are positive. In fact these Lie algebras are semi-simple, and as such, the classical theory of Killing-Cartan on these algebras becomes the theory of Kac-Moody algebras associated to matrices of finite type.
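The principal-minor criteria (a') and (b') of Fact 5.1.2 lend themselves to a direct computation. The following Python sketch assumes the input is an indecomposable generalized Cartan matrix with integer entries; the function names (`det`, `principal_minors`, `gcm_type`) are illustrative choices of ours, and the determinant is computed exactly with rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[col])]
    return result

def principal_minors(A, proper_only=False):
    """Yield determinants of principal submatrices (optionally only the proper ones)."""
    n = len(A)
    top = n - 1 if proper_only else n
    for size in range(1, top + 1):
        for idx in combinations(range(n), size):
            yield det([[A[i][j] for j in idx] for i in idx])

def gcm_type(A):
    """Classify an indecomposable generalized Cartan matrix (assumption: A is one)."""
    if all(m > 0 for m in principal_minors(A)):
        return "finite"
    if det(A) == 0 and all(m > 0 for m in principal_minors(A, proper_only=True)):
        return "affine"
    return "indefinite"
```

For example, the Cartan matrix of $A_2$ is classified as finite, the 2 × 2 matrix with off-diagonal entries -2 as affine, and the one with off-diagonal entries -3 as indefinite.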
5 These generators generate the (derived) subalgebra $g'(A) = [g(A), g(A)]$, and $g(A) = g'(A) + \mathcal{H}$ ($g(A) = g'(A)$ if and only if det A ≠ 0).
Definition 5.1.6: A generalized Cartan matrix A is Euclidean if it is indecomposable, symmetrizable, singular and every proper principal submatrix is of finite type. The infinite-dimensional Kac-Moody Lie algebras associated with the Euclidean generalized Cartan matrix are called affine Lie algebras. These algebras are characterized by their different types of realizations (non-isomorphic realizations), and accordingly their Cartan matrices, although of the same order, differ in row and column contents. Evidently Dynkin diagrams play an important role in their characterization (see Exercise 1). In the next section we shall give an explicit construction of affine algebras from a different perspective. But before that we have to define another important object which will be used later.
Definition 5.1.7: The canonical mapping $\tilde{g}(A) \to g(A)$ preserves the direct sum decomposition $\tilde{n}_+ \oplus \mathcal{H} \oplus \tilde{n}_-$ of $\tilde{g}(A)$ in the sense that the image of $\tilde{n}_+$ ($\tilde{n}_-$), denoted $n_+$ ($n_-$), gives the (so-called) triangular decomposition of $g(A)$:
$g(A) = n_+ \oplus \mathcal{H} \oplus n_-$ (5.1.11)
where $n_+$ ($n_-$) is the subalgebra of $g(A)$ generated by the $e_i$ ($f_i$). It should be noted that if $\alpha$ is a positive root, then $g_\alpha \subset n_+$, and if it is a negative root, $g_\alpha \subset n_-$. In the former case it is a linear span of elements of the form $[\ldots[[e_{i_1}, e_{i_2}], e_{i_3}] \ldots e_{i_s}]$ and in the latter it is a linear span of the elements of the form $[\ldots[[f_{i_1}, f_{i_2}], f_{i_3}] \ldots f_{i_s}]$ such that $\alpha_{i_1} + \cdots + \alpha_{i_s} = \alpha$ when the element is formed by $e_i$'s, and $= -\alpha$ when it is formed by $f_i$'s.
From here it follows that:
$g_{\alpha_i} = \mathbb{C}e_i, \quad g_{-\alpha_i} = \mathbb{C}f_i; \quad g_{s\alpha_i} = 0 \text{ if } |s| > 1.$ (5.1.12)
This further implies the important fact:
Fact 5.1.8: If $\beta \in \Delta_+ \setminus \{\alpha_i\}$, then $(\beta + \mathbb{Z}\alpha_i) \cap \Delta \subset \Delta_+$.
Finally, the universal enveloping algebra of $g(A)$ is denoted $U(g(A))$, and the corresponding triangular decomposition of $U(g(A))$ is:
$U(n_+) \otimes U(\mathcal{H}) \otimes U(n_-)$ (5.1.13)
Exercise 5.1
1. Show how you would obtain the realization $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$ of a matrix by choosing a generalized Cartan matrix.
2. Let $\begin{pmatrix} 2 & -a \\ -b & 2 \end{pmatrix}$ be a (2 × 2) Generalized Cartan Matrix (GCM). Show that the Lie algebra defined with the Cartan subalgebra based on this matrix is generated by six elements $e_0, e_1, f_0, f_1, h_0, h_1$ that satisfy:
(a) $[e_i, f_j] = \delta_{ij} h_i$, (b) $[h_0, h_1] = 0$,
(c) $[h_i, e_j] = a_{ij} e_j$, (d) $[h_i, f_j] = -a_{ij} f_j$,
$[e_i, [e_i, [e_i, e_j]]] = [f_i, [f_i, [f_i, f_j]]] = 0$ if $i \ne j$.
This GCM Lie algebra is the simplest version of Kac-Moody algebra; it is denoted as $A_1^{(1)}$ by Kac. The subscript 1 in $A_1^{(1)}$ indicates the rank of the Cartan matrix in this case. In general, the Lie algebra obtained through a GCM of rank $l$ and order $(l + 1)$ and having similarly defined Lie brackets with respect to $3l + 3$ generators $(e_i, f_i, h_i)$ (i = 0, 1, ..., l), is denoted $A_l^{(1)}$.
3. Show that every finite-dimensional compact connected Lie group G can be associated to an infinite-dimensional Lie algebra via a set of smooth mappings from the circle $S^1$ to G. The algebra obtained in this manner is called the untwisted affine Kac-Moody algebra.
4. Show that the centre of the affine Lie algebra $g(A)$ associated to an affine type matrix A is 1-dimensional and is defined by the element
$c = \sum_{i=0}^{l} \tilde{a}_i \tilde{\alpha}_i$
where $\tilde{\alpha}_i \in \tilde{\Pi} \subset \mathcal{H}$ (the system of simple coroots), and $\tilde{a}_i$ corresponds to $a_i \in \delta = (a_0, a_1, \ldots, a_l)$, the unique vector coming from the Dynkin diagram $S(A)$ of affine type GCM. The $a_i$'s are all positive and relatively prime (the $\tilde{a}_i$ can be viewed as the dual of $a_i$ via a non-degenerate symmetric bilinear C-valued form ( | ) on $\mathcal{H}$). More simply, $\tilde{a}_i$ (i = 0, ..., l) refers to the Dynkin diagram $S({}^tA)$ of the dual algebra. $S({}^tA)$ is obtained from $S(A)$ by reversing the directions of all arrows but keeping the same 'number' (label) for vertices. The element c is called the canonical central element of $g(A)$.
Hints to Exercise 5.1
1. Note that the matrix $\begin{pmatrix} 2 & -a \\ -b & 2 \end{pmatrix}$ is a generalized Cartan matrix as (i) and (ii) of Def. (5.1.1) are satisfied. To obtain the realization, we shall have to determine the vector space $\mathcal{H}$ and the indexed subsets $\Pi = \{\alpha_1, \alpha_2\} \subset \mathcal{H}^*$ and $\tilde{\Pi} = \{\tilde{\alpha}_1, \tilde{\alpha}_2\} \subset \mathcal{H}$. By property (iii) of Def. (5.1.3) the dimension of $\mathcal{H}$ will be 2 or 3 depending on whether the rank of the matrix is 2 or 1. The rank is 1 exactly when $ab = 4$; in the case $a = b$ this means $a = b = 2$, and accordingly $\dim \mathcal{H} = 2 \times 2 - 1 = 3$. Assume that $a = b = 2$; then we choose $\mathcal{H}$ to be $C^3$, $\alpha_1$, $\alpha_2$ the first 2 (linearly independent) coordinate functions and $\tilde{\alpha}_1$, $\tilde{\alpha}_2$ the first two rows of the augmented matrix:
$\begin{pmatrix} 2 & -a & 0 \\ -a & 2 & 1 \\ 0 & 1 & 0 \end{pmatrix}$
Next we have to check that (i) and (ii) of Def. (5.1.3) are satisfied. The property (i) is obvious since the rows $\tilde{\alpha}_1 = (2, -a, 0)$ and $\tilde{\alpha}_2 = (-a, 2, 1)$ are linearly independent, just as the coordinate functions $\alpha_1$, $\alpha_2$ are, by definition.
Property (ii), which states $\langle \tilde{\alpha}_i, \alpha_j \rangle = a_{ij}$, is also immediate as $\alpha_j$ (the jth coordinate function) assigns to $\tilde{\alpha}_i$ the jth entry of that row. Thus
$\langle \tilde{\alpha}_1, \alpha_1 \rangle = (\alpha_1 : (2, -a, 0)) = 2 = a_{11}$
$\langle \tilde{\alpha}_1, \alpha_2 \rangle = (\alpha_2 : (2, -a, 0)) = -a = a_{12}$
$\langle \tilde{\alpha}_2, \alpha_1 \rangle = (\alpha_1 : (-a, 2, 1)) = -a = a_{21}$
$\langle \tilde{\alpha}_2, \alpha_2 \rangle = (\alpha_2 : (-a, 2, 1)) = 2 = a_{22}.$
When the rank is 2 ($ab \ne 4$), the vector space $\mathcal{H}$ is 2-dimensional and $\tilde{\alpha}_1$ and $\tilde{\alpha}_2$ are respectively $(2, -a)$, $(-b, 2)$. The above equalities can easily be verified for pairs $(\tilde{\alpha}_i, \alpha_j)$.
2. To begin with we note that matrix A is of rank 1 and hence its Cartan subalgebra will differ from that of a 2-rowed matrix whose determinant is non-zero. We construct these generators using the process that holds good in a general case also (see, for instance, Sec. 2). Let $E_{ij}$ (i, j = 1, 2, ..., n) stand for an (n × n) matrix which has i, j entry as 1 and 0 elsewhere. Write
(i) $E_0 = E_{n,1}$, $E_i = E_{i,i+1}$, i = 1, 2, ..., n - 1
(ii) $F_0 = E_{1,n}$, $F_i = E_{i+1,i}$, i = 1, 2, ..., n - 1
(iii) $H_0 = E_{n,n} - E_{1,1}$, $H_i = E_{i,i} - E_{i+1,i+1}$, i = 1, 2, ..., n - 1
then since n = 2, we have:
(i') $E_0 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad E_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$
(ii') $F_0 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad F_1 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$
(iii') $H_0 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \quad H_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
Define
(iv) $e_0 = E_0 \otimes t, \quad f_0 = F_0 \otimes t^{-1}, \quad e_1 = E_1 \otimes 1, \quad f_1 = F_1 \otimes 1, \quad h_0 = H_0 \otimes 1 + c, \quad h_1 = H_1 \otimes 1$
and let the Lie bracket for the elements $e_i$, $f_j$ (i, j = 0, 1) be given by the rule:
(v) $[e_i, f_j] = [E_i, F_j] \otimes t^{k+l} + k\,\delta_{k+l,0}\,\mathrm{Tr}(E_i F_j)\,c$, ($k, l \in \mathbb{Z}$)*
where k and l are the powers of t carried by $e_i$ and $f_j$ respectively. (The element 'c' in (v) and in $h_0$ is the central element of the Lie algebra formed by $E_i$, $F_i$, $H_i$.) In order to show that the generators defined in (iv) satisfy the equalities given in the exercise, we first show that $[e_i, f_j] = \delta_{ij} h_j$. Now
$[e_0, f_0] = [E_0, F_0] \otimes t^{1 + (-1)} + 1 \cdot \delta_{1 + (-1), 0}\,\mathrm{Tr}(E_0 F_0)\,c = H_0 \otimes 1 + c = h_0.$
Similarly $[e_1, f_1] = [E_1, F_1] \otimes t^{0+0} = H_1 \otimes 1 = h_1$. (The second term does not appear since k = 0.) Two more equalities similar to (v) will be formed by pairs $(h_i, e_j)$ and $(h_i, f_j)$, for instance we shall have:
(vi) $[h_i, e_j] = [H_i, E_j] \otimes t^{k+l} + k\,\delta_{k+l,0}\,\mathrm{Tr}(H_i E_j)\,c$
(vii) $[h_i, f_j] = [H_i, F_j] \otimes t^{k+l} + k\,\delta_{k+l,0}\,\mathrm{Tr}(H_i F_j)\,c.$
Consider the case i = 0 and j = 1 in (vi) to write:
(viii) $[h_0, e_1] = [H_0, E_1] \otimes t^{0+0} = -2(E_1 \otimes t^0) = -2e_1.$
Other equalities arising from (vi), (vii) for the remaining combinations can be easily verified. Using the same Lie bracket relation for $(h_0, h_1)$ as given in (v), etc., we note that $[h_0, h_1] = 0$. Finally, the fact that the brackets $[e_i, [e_i, [e_i, e_j]]]$ equal zero can be established by suitable substitutions. Note that the Cartan subalgebra is generated by $(h_0, h_1)$.
* From (iv) it is evident that k, l take only the values -1, 0, 1.
3. Let $\Phi$ denote the set of smooth mappings (i)
$\phi : S^1 = \{z \in \mathbb{C} \mid |z| = 1\} \to G$
such that $z \mapsto \phi(z) \in G$, the compact connected Lie group. For $\phi_1, \phi_2 \in \Phi$ the group operation in G gives $\phi_1 \cdot \phi_2(z) = \phi_1(z) \cdot \phi_2(z)$, and thus defines a group structure on $\Phi$. Obviously $\Phi$ is infinite-dimensional, since there are infinitely many smooth maps from $S^1$ to G. The group $\Phi$ is called the loop group of G. Now from Exp. (4.1.7) we know that every finite-dimensional Lie group G admits a Lie algebra g, whose basis vectors $\{T^a\}$, $1 \le a \le \dim g$, satisfy:⁶
(ii) $[T^a, T^b] = i f^{ab}_{\ \ c}\, T^c$
where $f^{ab}_{\ \ c}$ are the structure constants of g. Moreover, the $T^a$'s are generators of G (which, as we know, is connected), hence an element of G can be written as
(iii) $\exp(-i T^a \theta_a)$
where $\theta_a$ ($1 \le a \le \dim g$) are the group parameters. From (i), the parameters $\theta_a$ can be viewed as functions on the unit circle $|z| = 1$. Using these functions as well as (iii), a typical element of $\Phi$ can be expressed as:
(iv) $\phi(z) = \exp(-i T^a \theta_a(z)).$
Also using the Laurent expansion
(v) $\sum_{n=-\infty}^{\infty} \theta_a^{-n} z^n$
for $\theta_a(z)$, $\phi(z)$ can be re-written:
$\phi(z) = \exp\Big[-i \sum_{n=-\infty}^{\infty} T^a \theta_a^{-n} z^n\Big].$
Thus if we write $T^a_n = T^a z^n$, we have a set of infinite number of generators $\{T^a_n\}$ and an infinite set of parameters $\theta_a^{-n}$ for $\Phi$. In terms of these, (v) becomes:
(vi) $\phi(z) \approx 1 - i \sum_n T^a_n \theta_a^{-n}$
near the identity. Since $\phi \in \Phi$, we have in the process obtained the Lie algebra $\hat{g}$ for $\Phi$, whose elements satisfy the following Lie bracket relation:
(vii) $[T^a_m, T^b_n] = i f^{ab}_{\ \ c}\, T^c_{m+n}.$
Note that by choosing n = 0 in the set $\{T^a_n\}$ we retrieve the Lie algebra g, in the sense that the subalgebra formed by $\{T^a_0\}$ is isomorphic to g. The Lie algebra $\hat{g}$ obtained by considering the group $\Phi$ of maps $S^1 \to G$ is the untwisted affine Kac-Moody algebra.⁷
4. See Prop. (1.6), Theorem (4.8) and Sec. 6.2 in [8(a)] for the proof.
6 Note that our notation for structure constants and basis vectors here is different from Chapter 4. The inclusion of $i$ in the equality is to facilitate the description of mappings. See also super Lie groups and super Lie algebras for the convention of $i$ in Chapter 7 App.
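The verification suggested in Hint 2 can be carried out mechanically. The Python sketch below is an illustration under stated assumptions: an element of the algebra is stored as a dictionary whose keys are either `("E", n)`, `("F", n)`, `("H", n)` (a basis matrix of sl(2) times $t^n$) or `"c"`, the bracket is rule (v) of Hint 2 extended bilinearly, and all names (`bracket`, `scale`, `ad_cubed`) are ours. In this encoding $E_1 = E$, $F_1 = F$, $H_1 = H$, while $E_0 = F$, $F_0 = E$ and $H_0 = -H$.

```python
from collections import defaultdict

# sl(2) brackets in the basis E, F, H (zero brackets omitted).
SL2_BRACKET = {
    ("E", "F"): {"H": 1}, ("F", "E"): {"H": -1},
    ("H", "E"): {"E": 2}, ("E", "H"): {"E": -2},
    ("H", "F"): {"F": -2}, ("F", "H"): {"F": 2},
}
# Trace form Tr(xy) on the same basis (zero entries omitted).
TRACE = {("E", "F"): 1, ("F", "E"): 1, ("H", "H"): 2}

def clean(v):
    """Drop zero coefficients so that equal elements compare equal."""
    return {k: c for k, c in v.items() if c != 0}

def bracket(u, v):
    """[x t^k, y t^l] = [x, y] t^(k+l) + k * delta(k+l, 0) * Tr(xy) * c, with c central."""
    out = defaultdict(int)
    for ku, cu in u.items():
        for kv, cv in v.items():
            if ku == "c" or kv == "c":
                continue  # the central element commutes with everything
            (x, k), (y, l) = ku, kv
            for z, s in SL2_BRACKET.get((x, y), {}).items():
                out[(z, k + l)] += cu * cv * s
            if k + l == 0:
                out["c"] += cu * cv * k * TRACE.get((x, y), 0)
    return clean(out)

def scale(s, v):
    return clean({k: s * c for k, c in v.items()})

def ad_cubed(x, y):
    """The triple bracket [x, [x, [x, y]]]."""
    return bracket(x, bracket(x, bracket(x, y)))

# Generators (iv) of Hint 2 for n = 2.
e = {0: {("F", 1): 1}, 1: {("E", 0): 1}}
f = {0: {("E", -1): 1}, 1: {("F", 0): 1}}
h = {0: {("H", 0): -1, "c": 1}, 1: {("H", 0): 1}}
A = [[2, -2], [-2, 2]]  # the generalized Cartan matrix with a = b = 2
```

With these definitions `bracket(e[0], f[0])` returns `h[0]`, including the central term, and the triple brackets of distinct generators vanish, in line with relations (a) to (d) of Exercise 2.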
2
AFFINE ALGEBRAS: AN INTRODUCTION
We devote this section to affine algebras and the next section to their representations. These algebras in many respects can be considered as analogues of simple finite-dimensional Lie algebras. As already mentioned in the introduction, the present version of the subject originated from Kac-Moody algebras (Ad. [18]). The subject has grown tremendously during the past two decades due to the applicability of these algebras. They provide a powerful and natural framework in the study of unified field theories on the physics side, and make an effective tool for learning the theory of infinite-dimensional Lie groups (via their Lie algebras) on the mathematics side. While Kac and Moody defined these algebras through realizations of generalized Cartan matrices (as we have already seen in Sec. 1), others like J. Lepowsky and I. Frenkel developed these as generalizations of simple Lie algebras using the so-called vertex representation. Many of the original research papers and survey articles that show the progress of the subject from a mathematical point of view, as well as its relation to physical systems, e.g., dual resonance models, 2-dimensional conformally invariant structures, Boson-Fermion correspondence in quantum field theory, can be found in two readable texts (V. Kac [8(a)], and P. Goddard and D. Olive [6]). Our presentation in this section is based on the formulation of the theory given by Frenkel and Kac, Goddard and Olive, and Lepowsky and Wilson. We define an affine algebra in a general form and deduce from it other affine algebras, e.g., the Heisenberg algebra, the Cartan algebra, the loop algebra and the Virasoro algebra. Finally we show that every affine algebra has a connection with a Kac-Moody algebra.
2.1
Construction of Affine Algebra
Definition 5.2.1: Let $\mathbb{C}[t, t^{-1}]$ be the algebra of Laurent polynomials in the indeterminates $t$ and $t^{-1}$ over $\mathbb{C}$, and let g be a complex simple finite-dimensional Lie algebra, which carries a non-zero invariant bilinear form ( , ). The infinite-dimensional algebra $\hat{g}$ formed by the vector space
$\mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} g \oplus \mathbb{C}c$ (5.2.1)
with the Lie bracket*
$[x \otimes t^n, y \otimes t^m] = [x, y] \otimes t^{n+m} + n\,\delta_{n,-m}\,(x, y)\,c$ (5.2.2)
where $x, y \in g$, $n, m \in \mathbb{Z}$, and c belongs to the centre of $\hat{g}$, is called the affine (Euclidean) algebra associated with g (see Frenkel in [6]). If g is replaced in (5.2.1) by its Cartan subalgebra $\mathcal{H}$, the resulting subalgebra
$\mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} \mathcal{H} \oplus \mathbb{C}c = \hat{g}_{\mathcal{H}}$ (5.2.3)
is called the Heisenberg subalgebra⁸ of $\hat{g}$. The subalgebras:
$\mathcal{H} \oplus \mathbb{C}c = \hat{\mathcal{H}}$ (5.2.4)(a)
and
$1 \otimes g$ (5.2.4)(b)
are called the Cartan and scalar subalgebras of $\hat{g}$. Note that the latter subalgebra can be identified with g, though g is not a subalgebra of $\hat{g}$ in the formal sense. The quotient algebra
$\hat{g}/\mathbb{C}c = g \otimes \mathbb{C}[t, t^{-1}] = g_0$ (5.2.5)
is called the loop algebra associated with g. This nomenclature is self-evident since the loop algebra can be realized as an algebra of g-valued functions on the unit circle ($|z| = 1$) with finite Fourier series, and with Lie bracket acting pointwise [8(b)] (see also Hint to Exercise (1.3)).
7 See Sec. (3.5) of Goddard and Olive in [6] for construction of the twisted affine Kac-Moody algebra. A twisted affine Kac-Moody algebra can be defined with the help of a compact finite-dimensional Lie algebra g and an automorphism $\sigma$ of g which is of finite order. We denote it as $\hat{g}_\sigma$, and emphasize that this 'twisting' is related to 'symmetry breaking.'
* Very often $x \otimes t^k = t^k \otimes x$ is denoted as $x(k)$. (See for instance Eq. (5.2.6)).
2.2
Derivations and the Affine Algebra
Let $x(n)$ stand for $x \otimes t^n$, and let $d(n) = t^{n+1}\frac{d}{dt}$ be the derivation of the algebra of Laurent polynomials $\mathbb{C}[t, t^{-1}]$. Then assuming that $d(n)$ acts trivially on $\mathbb{C}c$, the derivation can be extended to the whole of $\hat{g}$ by setting
$[d(n), x(m)] = m\,x(n + m), \quad n, m \in \mathbb{Z},\ x \in g$ (5.2.6)
Also using $d(n)$ (for every n), a semi-direct product vector space
$\tilde{g}_n = \hat{g} \oplus \mathbb{C}d(n)$ (5.2.7)
can be defined, and on $\tilde{g}_n$ the bilinear form ( , ) of g can be extended to $( , )_n$ as follows:
$(x(m) + \alpha_0 c + \alpha_1 d(n),\ y(m') + \beta_0 c + \beta_1 d(n))_n = \delta_{m+m', n}\,(x, y) + \alpha_0\beta_1 + \alpha_1\beta_0$ (5.2.8)
where $x, y \in g$, $m, m', n \in \mathbb{Z}$, $\alpha_0, \alpha_1, \beta_0, \beta_1 \in \mathbb{C}$.
We note that the set of derivations $\{d(n)\}$ defines a Lie algebra with Lie bracket:
$[d(n), d(m)] = (m - n)\,d(n + m).$ (5.2.9)
The equality (5.2.9) is self-evident in view of the commutation relation between $t^{n+1}\frac{d}{dt}$ and $t^{m+1}\frac{d}{dt}$. It is easy to note that the requirements on the Lie bracket (as given in Def. (4.1.3)) for the definition of a Lie algebra are met in this particular case. We shall denote this Lie algebra as $\mathcal{D}$. For every arbitrary element:
8 See Sec. 4 for explicit construction.
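$d = \sum_k a_k d(k)$ ($a_k \in \mathbb{C}$, $k \in \mathbb{Z}$)
in $\mathcal{D}$, the semi-direct product $\hat{g} + \mathbb{C}d$ can be defined. In particular when $d = d(0)$, we denote $\hat{g} + \mathbb{C}d(0) = \hat{g} + \mathbb{C}d$ as $\tilde{g}$ and note that the above construction helps in finding the root system of affine algebras; in other words it enables us to write down a root decomposition of $\hat{g}$ and $\tilde{g}$. The root decompositions of $\hat{g}$ and $\tilde{g}$ can formally be written as:
(a) $\hat{g} = \hat{\mathcal{H}} \oplus \sum \hat{g}_{\hat{\alpha}}$; (b) $\tilde{g} = \tilde{\mathcal{H}} \oplus \sum \tilde{g}_{\tilde{\alpha}}$ (5.2.10)
We obtain these decompositions in (5.2.13) and (5.2.15). In Subsection (2.4) we shall also see that $\tilde{g}$ and $( , )_n$ can be used to define the Casimir element of $\hat{g}$.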
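The commutation relation (5.2.9) can be confirmed by letting the operators act on Laurent monomials. The sketch below is illustrative only: a Laurent polynomial is stored as a dictionary mapping each power of t to its coefficient, and the function names (`d`, `commutator`, `rhs`) are ours.

```python
def d(n, poly):
    """Apply d(n) = t^(n+1) d/dt to a Laurent polynomial given as {power: coefficient}."""
    out = {}
    for m, c in poly.items():
        if m * c != 0:
            # t^(n+1) d/dt sends c t^m to m c t^(n+m)
            out[n + m] = out.get(n + m, 0) + m * c
    return out

def commutator(n, m, poly):
    """Apply [d(n), d(m)] = d(n) d(m) - d(m) d(n) to poly."""
    a, b = d(n, d(m, poly)), d(m, d(n, poly))
    out = {p: a.get(p, 0) - b.get(p, 0) for p in set(a) | set(b)}
    return {p: c for p, c in out.items() if c != 0}

def rhs(n, m, poly):
    """Apply (m - n) d(n + m) to poly, the right-hand side of (5.2.9)."""
    return {p: (m - n) * c for p, c in d(n + m, poly).items() if (m - n) * c != 0}
```

Both sides send $t^k$ to $k(m - n)\,t^{n+m+k}$, so `commutator(n, m, {k: 1})` and `rhs(n, m, {k: 1})` agree for every choice of integers n, m, k. The same function `d` also reproduces (5.2.6), since $d(n)$ sends $t^m$ to $m\,t^{n+m}$.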
2.3 The Root Decomposition of $\tilde{g}$

We have already seen (Sec. 4.3) that every finite-dimensional complex simple Lie algebra has the root decomposition^9:

$g = \mathcal{H} \oplus \sum_{\alpha \in \Delta} g_\alpha$   (5.2.11)

with respect to the Cartan subalgebra $\mathcal{H}$, where $\Delta \subset \mathcal{H}^*$ (the dual of $\mathcal{H}$) is the root system, and for every $\alpha \in \Delta$, $g_\alpha$ is the 1-dimensional space associated with $\alpha$ and defined as:

$g_\alpha = \{x \in g : [h, x] = (\alpha, h)x \text{ for every } h \in \mathcal{H}\}$   (5.2.12)

We also know that a subsystem of roots consisting of simple roots $\{\alpha_1, \ldots, \alpha_l\}$, where $l$ is the rank of $g$, can be fixed, and by using an order in the root system the highest root of the system can be selected. Obviously a similar procedure has to be adopted for obtaining the root decomposition of $\tilde{g}$. We show it explicitly as follows. We note that in this case the Cartan subalgebra $\mathcal{H}$ of $g$ is replaced by $\mathcal{H} + \mathbb{C}c + \mathbb{C}d(0) = \tilde{\mathcal{H}}$. The algebra $\tilde{\mathcal{H}}$ (as required of any Cartan subalgebra) is a maximal commutative diagonalizable subalgebra in $\tilde{g}$. Returning to the root decomposition of $\tilde{g}$, we denote the dual space of $\tilde{\mathcal{H}}$ (with respect to the invariant $\mathbb{C}$-valued non-degenerate bilinear form on $\tilde{g}$) by $\tilde{\mathcal{H}}^*$ and the corresponding root system by $\tilde{\Delta} \subset \tilde{\mathcal{H}}^*$. The root decomposition of $\tilde{g}$ now reads as:

$\tilde{g} = \tilde{\mathcal{H}} \oplus \sum_{\tilde{\alpha} \in \tilde{\Delta}} \tilde{g}_{\tilde{\alpha}}$   (5.2.13)

Next we extend any linear function $f \in \mathcal{H}^*$ to $\tilde{\mathcal{H}}^*$ (denoted as $f$ only) by setting $f(c) = f(d) = 0$, and we choose a linear function $\beta$ on $\tilde{\mathcal{H}}$ such that

$\beta|_{\mathcal{H} + \mathbb{C}c} = 0$, $\beta(d) = 1$.   (5.2.14)
9. Note that the Cartan subalgebra has been denoted there as $H$, and $g_\alpha$ as $L_\alpha$ (see Eq. (4.3.21)). We also note that the root system in Sec. 4.4 is $\Sigma$; the symbol $\Delta$ there is used for the Dynkin diagram.
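In view of the above choice with respect to the central element $c$ and $d(0) \equiv d$, we can write the decomposition of $\tilde{g}$ with respect to $\tilde{\mathcal{H}}$ as a sum of the tensor products of monomials $t^n$ and the root spaces that pertain to $\mathcal{H}$:

$\tilde{g} = \tilde{\mathcal{H}} \oplus \sum (t^n \otimes_{\mathbb{C}} g_\alpha)$   (5.2.15)

The pair $(n, \alpha)$ ranges over $\mathbb{Z} \times (\Delta \cup 0) \setminus (0, 0)$, and the root system of $\tilde{g}$ with respect to $\tilde{\mathcal{H}}$ is given as:

$\tilde{\Delta} = \{n\beta + \alpha,\ n \in \mathbb{Z},\ \alpha \in \Delta \cup 0\} \setminus \{0\}$   (5.2.16)

The root space $\tilde{g}_{\tilde{\alpha}}$ belonging to the root $\tilde{\alpha} = n\beta + \alpha$, in view of (5.2.15), can be written as $t^n \otimes_{\mathbb{C}} g_\alpha$ (with $g_0 = \mathcal{H}$). Hence the multiplicity ($\dim \tilde{g}_{\tilde{\alpha}}$) of a root $\tilde{\alpha} = n\beta + \alpha \in \tilde{\Delta}$ is 1 if $\alpha \neq 0$ and is $l$ otherwise. A root $\tilde{\alpha} = n\beta + \alpha$ with $\alpha \in \Delta$ is called real, and a root $\tilde{\alpha} = n\beta$, $n \in \mathbb{Z} \setminus 0$, is called imaginary. For every positive root $\tilde{\alpha}$ in $\tilde{\Delta}$ there is also a corresponding root $-\tilde{\alpha}$ in $\tilde{\Delta}$. The root $\tilde{\alpha}$ is said to be positive if $n > 0$, or if $n = 0$ and $\alpha \in \Delta_+$. This allows $\tilde{\Delta}$ to be expressed as a union: $\tilde{\Delta} = \tilde{\Delta}_+ \cup (-\tilde{\Delta}_+)$. The bilinear form defined on $\tilde{g}$, when restricted to $\tilde{\mathcal{H}}$ and to the root spaces, satisfies^10:

(a) $(\,,\,)|_{\tilde{\mathcal{H}}}$ is non-degenerate
(b) $(\,,\,)|_{\tilde{g}_{-\tilde{\alpha}} \oplus \tilde{g}_{\tilde{\alpha}}}$ is non-degenerate
(c) $(\tilde{g}_{\tilde{\alpha}}, \tilde{g}_{\tilde{b}}) = 0$ if $\tilde{\alpha} + \tilde{b} \neq 0$.   (5.2.17)

Just as we have a root basis in the case of $g$ which is formed by simple roots, we have in this case the set of simple roots $\{\tilde{\alpha}_0 = \beta - \bar{\alpha}, \tilde{\alpha}_1, \ldots, \tilde{\alpha}_l\}$ (where $\bar{\alpha}$ denotes the highest root^11) which forms a $\mathbb{Z}_+$-basis of $\tilde{\Delta}_+$. We denote this set by $\tilde{\Pi}$.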
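The description of the affine root system in (5.2.16) can be made concrete with a small enumeration. The following is only an illustrative sketch: it assumes $g = sl(3)$ (type $A_2$), stores a root $n\beta + \alpha$ as a pair `(n, alpha)` with `alpha` in simple-root coordinates, and all function names are hypothetical.

```python
# Finite root system of sl(3) (type A_2) in the basis of simple roots (a1, a2).
# This choice of algebra is an assumption made only to have a concrete example.
POSITIVE = [(1, 0), (0, 1), (1, 1)]
FINITE_ROOTS = POSITIVE + [(-a, -b) for a, b in POSITIVE]
HIGHEST = (1, 1)   # highest root of A_2
RANK = 2           # l = dimension of the Cartan subalgebra of g


def affine_roots(n_max):
    """All roots n*beta + alpha with |n| <= n_max, stored as (n, alpha)."""
    roots = []
    for n in range(-n_max, n_max + 1):
        for alpha in FINITE_ROOTS + [(0, 0)]:
            if (n, alpha) != (0, (0, 0)):
                roots.append((n, alpha))
    return roots


def is_real(root):
    """A root n*beta + alpha is real when alpha != 0, imaginary otherwise."""
    return root[1] != (0, 0)


def multiplicity(root):
    """t^n (x) g_alpha is 1-dimensional; t^n (x) H has dimension l."""
    return 1 if is_real(root) else RANK


def is_positive(root):
    """Positive means n > 0, or n = 0 with alpha a positive finite root."""
    n, alpha = root
    return n > 0 or (n == 0 and alpha in POSITIVE)


def simple_root_coordinates(root):
    """Coordinates in the basis (a0, a1, a2), where a0 = beta - HIGHEST."""
    n, (c1, c2) = root
    # beta = a0 + HIGHEST, so n*beta contributes n to a0 and n*HIGHEST to (a1, a2).
    return (n, c1 + n * HIGHEST[0], c2 + n * HIGHEST[1])
```

For every positive root in the sampled range the coordinates returned by `simple_root_coordinates` are non-negative integers, which is the $\mathbb{Z}_+$-basis property stated for the set of simple roots.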
10. Note that the suffix $d$ of $(\,,\,)_d$ has been suppressed here.
11. See Sec. 4.4 for definition.

2.4 Formulation of the Virasoro Algebra

Having discussed an affine algebra along with its root system, we now show how some of these affine algebras can be constructed from first principles. We illustrate this for three algebras, namely the Kac-Moody algebra, the Virasoro algebra and the Heisenberg algebra (see [6], [11]). The construction of the Kac-Moody algebra (denoted $\hat{g}$) has already been considered in the Hint to Exercise 3 of Sec. 1. We have shown there that the algebra can be obtained by using an infinite-dimensional group of smooth mappings from the unit circle $S^1$ to a compact connected Lie group $G$. In a similar manner the Virasoro algebra can be formed by considering the infinite-dimensional group of one-to-one mappings from the circle $S^1$ to $S^1$ (see Goddard and Olive in [6]). The group composition in this case is:

$\psi_1 \circ \psi_2(z) = \psi_1(\psi_2(z))$

where $\psi_1, \psi_2$ are smooth maps defined on the unit circle $S^1 = \{z \in \mathbb{C} : |z| = 1\}$. We denote this infinite-dimensional group by $\mathcal{V}$ and its Lie algebra by $\mathfrak{v}$. It can be shown that the generators of the Lie algebra in this case are^12:

$L_n = -z^{n+1}\frac{d}{dz}$, $n \in \mathbb{Z}$   (5.2.18)

which evidently satisfy:

$[L_m, L_n] = (m - n)L_{m+n}$   (5.2.19)
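The relation (5.2.19) can be checked directly by letting the operators of (5.2.18) act on Laurent polynomials in $z$. The sketch below is an illustration only: it represents a Laurent polynomial as a dictionary `{exponent: coefficient}`, and the helper names are hypothetical.

```python
def apply_L(n, poly):
    """Apply L_n = -z^(n+1) d/dz to a Laurent polynomial {exponent: coeff}."""
    result = {}
    for k, c in poly.items():
        if k != 0:
            # L_n z^k = -k z^(k+n)
            result[k + n] = result.get(k + n, 0) - k * c
    return result


def subtract(p, q):
    """Return p - q with zero coefficients dropped."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) - c
    return {k: c for k, c in out.items() if c != 0}


def scale(s, p):
    """Multiply a Laurent polynomial by the scalar s."""
    return {k: s * c for k, c in p.items() if s * c != 0}


def commutator(m, n, poly):
    """[L_m, L_n] applied to poly."""
    return subtract(apply_L(m, apply_L(n, poly)), apply_L(n, apply_L(m, poly)))
```

Comparing `commutator(m, n, f)` with `scale(m - n, apply_L(m + n, f))` for a test polynomial `f` reproduces (5.2.19) term by term: both sides send $z^k$ to $k(n - m)z^{k+m+n}$.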
Now if we were to consider the semi-direct product (5.2.7):

$\tilde{g} \equiv \hat{g} \oplus \sum_{n \in \mathbb{Z}} \mathbb{C}d(n)$

we note that

$\sum_{n \in \mathbb{Z}} \mathbb{C}d(n)$   (5.2.20)

is a $\mathbb{Z}$-graded subalgebra of the affine algebra $\tilde{g}$, and by the very definition of $d(n)$ the algebra $\mathfrak{v}$ can be identified with the Lie algebra formed by $\{d(n)\}$; hence the Virasoro algebra can be thought of as an algebra that can be deduced from the general affine algebra defined above. We further note that the direct sum of $\hat{g}$ (see Exercise 1.3) and $\mathfrak{v}$ is again a Lie algebra, denoted $\hat{g} \oplus \mathfrak{v}$. The generators of this algebra satisfy:

(a) $[T^a_m, T^b_n] = i f^{abc}\, T^c_{m+n}$
(b) $[L_m, T^a_n] = -n\, T^a_{m+n}$
(c) $[L_m, L_n] = (m - n)\, L_{m+n}$   (5.2.21)
Due to the importance of the Virasoro algebra in string theory, we shall return to it in Exercise 5 of this section and to Virasoro operators in Section 6. The third infinite-dimensional algebra that we wish to construct is the Heisenberg algebra. We postpone this to a later section, where we formulate this algebra from first principles and show how one can define Fock spaces and other operators using the so-called Heisenberg systems.

12. To obtain the Lie algebra for $\mathcal{V}$ consider its faithful (i.e. injective) representation defined by its action (i) on functions $f: S^1 \to V$, with $V$ a vector space:

$D_\xi f(z) = f(\xi^{-1}(z))$, $\xi \in \mathcal{V}$.   (i)

For an element $\xi$ close to the identity, the equality $\xi(z) = z e^{-i\varepsilon(z)}$ gives $\xi^{-1}(z) \approx z + i z \varepsilon(z)$. Hence Taylor's series for $f$ in $D_\xi f(z) = f(\xi^{-1}(z))$ near the identity yields $D_\xi f(z) \approx f(z) + i\varepsilon(z)\, z \frac{d}{dz} f(z)$. Using the Laurent expansion $\varepsilon(z) = \sum_{n=-\infty}^{\infty} \varepsilon_n z^n$, we can introduce $L_n = -z^{n+1}\frac{d}{dz}$, $n \in \mathbb{Z}$. See also Sec. (1.4). Note that we can also use $z = \exp(i\theta)$ here, $\theta$ being the parameter on $S^1$.

2.5 The Chevalley Basis and the Casimir Element in Terms of the Chevalley Basis
In the previous section we have already been introduced to Chevalley generators and the Chevalley basis in connection with the Lie algebra $g(A)$ of a generalized Cartan matrix $A$. We show here that such a basis can be obtained for any arbitrary simple Lie algebra $g$ by using a 2-cocycle on the root lattice $Q$ of $g$. Let $g$ be a complex finite-dimensional simple Lie algebra, $\mathcal{H}$ its Cartan subalgebra, and $\Delta$ the set of roots with respect to $\mathcal{H}$. Let $\Pi = \{\alpha_1, \ldots, \alpha_l\}$ be the set of simple roots. Then in view of (4.3.20)(a) we know that for every $\alpha_i \in \Pi$ there exists a unique element $h_{\alpha_i} \in \mathcal{H}$ such that:

$(h_{\alpha_i}, h) = \alpha_i(h)$, $h \in \mathcal{H}$

The elements $\{h_{\alpha_i}\}$ provide an orthonormal basis for the Cartan subalgebra $\mathcal{H}$. For notational convenience we shall sometimes denote $h_{\alpha_i}$ as $h_i$, when there is no fear of confusion.

Remark 5.2.1: Let $Q$ be the root lattice and let $(\,,\,)$ be an invariant bilinear form on $g$ normalized as $(\alpha, \alpha) = 2$ for $\alpha \in \Delta$. Define a bilinear function^13 $\varepsilon: Q \times Q \to \{\pm 1\}$ with the following properties:

$\varepsilon(\alpha, \beta)\, \varepsilon(\alpha + \beta, \gamma) = \varepsilon(\beta, \gamma)\, \varepsilon(\alpha, \beta + \gamma)$
$\varepsilon(\alpha, \beta)\, \varepsilon(\beta, \alpha) = e^{\pi i(\alpha, \beta)} = (-1)^{(\alpha, \beta)}$
$\varepsilon(\alpha, 0) = 1$
$\varepsilon(\alpha, -\alpha) = 1$ (normalizing condition)   (5.2.22)
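A sign function of this kind can be written down explicitly on a small example. The sketch below is an assumption-laden illustration, not the construction used in [5]: it takes the $A_2$ root lattice and the common bimultiplicative choice $\varepsilon(\alpha_i, \alpha_j) = -1$ for $i = j$, $(-1)^{(\alpha_i, \alpha_j)}$ for $i < j$ and $+1$ for $i > j$. This choice satisfies the first three properties in (5.2.22), but it gives $\varepsilon(\alpha, -\alpha) = (-1)^{(\alpha, \alpha)/2}$, so the normalizing condition is not imposed here. All names are hypothetical.

```python
from itertools import product

# Gram matrix (alpha_i, alpha_j) of the simple roots of A_2; an assumed example.
GRAM = [[2, -1], [-1, 2]]


def form(a, b):
    """Bilinear form (a, b) on the root lattice, in simple-root coordinates."""
    return sum(a[i] * GRAM[i][j] * b[j] for i in range(2) for j in range(2))


def eps(a, b):
    """Bimultiplicative sign fixed by its values on pairs of simple roots:
    -1 if i == j, (-1)^(alpha_i, alpha_j) if i < j, and +1 if i > j."""
    exponent = 0
    for i in range(2):
        for j in range(2):
            if i == j:
                exponent += a[i] * b[j]
            elif i < j:
                exponent += a[i] * b[j] * GRAM[i][j]
    return -1 if exponent % 2 else 1


def add(a, b):
    """Sum of two lattice vectors."""
    return (a[0] + b[0], a[1] + b[1])


def sign(n):
    """(-1)^n for an integer n of either sign."""
    return 1 if n % 2 == 0 else -1


# A finite window of the lattice on which the identities can be tested.
LATTICE_SAMPLE = list(product(range(-2, 3), repeat=2))
```

On `LATTICE_SAMPLE` one can confirm the cocycle identity, the relation $\varepsilon(\alpha, \beta)\varepsilon(\beta, \alpha) = (-1)^{(\alpha, \beta)}$ and $\varepsilon(\alpha, 0) = 1$ by brute force.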
Kac shows that using this bilinear function $\varepsilon$, the set of roots $\Delta$, and the algebra $\mathcal{H}$, a linearly independent set of basis vectors can be obtained which spans a vector space, and this vector space can be made into a Lie algebra $g'$ by assigning Lie bracket relations to these basis vectors. The Lie algebra $g'$ is isomorphic to $g$ and the basis obtained in this way is called the Chevalley basis of $g$ (see [5]). We give below this basis, along with the conditions that these basis vectors satisfy. For details see the above reference. For $\alpha, \beta, \gamma, \ldots \in \Delta$, let $E_\alpha, E_\beta, E_\gamma, \ldots$ denote the vectors that belong to the root spaces $g_\alpha, g_\beta, g_\gamma$ respectively and let $h_{\alpha_i} = h_i$ denote the vector for $\alpha_i \in \Pi$. Then the union $\{E_\alpha\} \cup \{h_i\}$ of these sets generates a Lie algebra if the following Lie bracket relations for arbitrary pairs belonging to the sets and their union are satisfied:

(a) $[E_\beta, E_\gamma] = 0$   if $\beta + \gamma \notin \Delta \cup \{0\}$
(b) $[E_\beta, E_\gamma] = \varepsilon(\gamma, \beta)\, E_{\beta+\gamma}$   if $\beta + \gamma \in \Delta$
(c) $[H_\beta, E_\gamma] = (\beta, \gamma)\, E_\gamma$
(d) $[H_\beta, H_\gamma] = 0$
(e) $[E_\gamma, E_{-\gamma}] = H_\gamma$   (5.2.23)

(where we have used the notation $H_\beta, H_\gamma$ in place of $h_{\alpha_i}$).

13. $\varepsilon$ is called the 2-cocycle of $Q$ which defines the central extension of $Q$ in the following manner. The root lattice $Q$ of a simple finite-dimensional Lie algebra $g$ of type $A_n$, $D_n$ or $E_n$ has a central extension $\hat{Q}$ by the group $\mathbb{Z}/2\mathbb{Z} = \{\pm 1\}$, given as: $1 \to \{\pm 1\} \to \hat{Q} \to Q \to 0$. This is uniquely defined by the relation $a b a^{-1} b^{-1} = e^{\pi i(\alpha, \beta)}$, where $a, b \in \hat{Q}$, $\alpha, \beta \in Q$, and $\alpha$, $\beta$ are the images of $a$, $b$ in $Q$.

Remark 5.2.2: We would like to note here that Kac's description of the Casimir operator, which appears to be different, is essentially the same as the one given below, except for the fact that we have mentioned the Cartan subalgebra $\mathcal{H}$ more explicitly (see Sec. 2.5 in [8(a)]). In fact, to write the Casimir element of an arbitrary affine Lie algebra (Subsec. 2.6), we use slightly different notations (see Frenkel in [6]) to show dependence on the cocycle $\varepsilon$. We use $x^\varepsilon_\alpha$ for $E_\alpha$ and identify $h_\alpha$ with $h \in \mathcal{H}$ where required. Accordingly (5.2.23)(a-e) become:

(a') $[x^\varepsilon_\beta, x^\varepsilon_\gamma] = 0$   if $\beta + \gamma \notin \Delta \cup \{0\}$
(b') $[x^\varepsilon_\beta, x^\varepsilon_\gamma] = \varepsilon(\gamma, \beta)\, x^\varepsilon_{\beta+\gamma}$   if $\beta + \gamma \in \Delta$
(c') $[h, x^\varepsilon_\alpha] = (h, \alpha)\, x^\varepsilon_\alpha$
(d') $[h, h'] = 0$
(e') $[x^\varepsilon_\gamma, x^\varepsilon_{-\gamma}] = h_\gamma$.   (5.2.24)
From Sec. 4 of the previous chapter we know that the Casimir element $C$ is an element of the center of the enveloping algebra of $g$ (see Eq. (4.4.16)). Using the same arguments and Eq. (5.2.24), in this case we can write it as:

$C = \sum_{i=1}^{l} h_i^2 - 2 \sum_{\alpha \in \Delta} x^\varepsilon_\alpha x^\varepsilon_{-\alpha} + 2\rho$   (5.2.25)(a)

where

$\rho = \frac{1}{2} \sum_{\alpha \in \Delta_+} \alpha$   (5.2.25)(b)

When the algebra in question is a simple Lie algebra over $\mathbb{C}$ of the type $A_l^{(1)}$ (see Exercise 1.2 for $A_l^{(1)}$), then $\rho$ is zero (see Sec. 2.8 in [8(a)] and page 279 in [6] for different (e')).
2.6 Casimir Element of $\hat{g}$

Just as we defined modules related to other Lie algebras (see Sec. 4.3 and Sec. 4.4), we can define a $\tilde{g} = \hat{g} \oplus \mathbb{C}d(k)$-module $V$ with the property that $v \in V$ can be annihilated by all $x(n)$ (for $x \in g$) when $n$ ($\in \mathbb{Z}$) is sufficiently large. Using the bilinear form $(\,,\,)_k$ (as given in (5.2.8)), we choose dual bases in $\tilde{g}_{\tilde{\alpha}}$ and $\tilde{g}_{kc - \tilde{\alpha}}$ (where $c$ is a central element of $\hat{g}$ and $k \in \mathbb{Z}$ is fixed). These are respectively $\{x_{\tilde{\alpha}, i}\}$ and $\{x_{kc - \tilde{\alpha}, i}\}$ for $1 \le i \le \dim \tilde{g}_{\tilde{\alpha}}$. The basis elements $x_{\tilde{\alpha}, i}$ and $x_{kc - \tilde{\alpha}, i}$ are now used to define the endomorphism $C(k)$ of the module $V$ which is called the Casimir element^14 (of the affine algebra):

$C(k) = 2c\, d(k) + \sum_{i=1}^{l} h_i(0) h_i(k) + 2 \sum_{\tilde{\alpha} \in \tilde{\Delta}_+} \sum_{i=1}^{\dim \tilde{g}_{\tilde{\alpha}}} x_{kc - \tilde{\alpha}, i}\, x_{\tilde{\alpha}, i} + 2\,(H \cdot d(k) + t^k \otimes \rho)$   (5.2.26)

14. See (4.4) for the definition in the case of a simple algebra $g$.

Here $H$ and $\rho$ (in the fourth term) stand respectively for the Coxeter number^15 and the sum (5.2.25)(b), and $h_i(k)$ ($i = 1, \ldots, l$ and $k \in \mathbb{Z}$) stands for an element of the Cartan subalgebra $\hat{\mathcal{H}}$. From our discussions on $d(k)$, $h_i(k)$, $x_{\tilde{\alpha}}$, and the roots, it is apparent that all elements on the RHS of (5.2.26) can be viewed as operators; accordingly $C(k)$ is an operator. This can be used to define another operator:

$l(k) = C(k) - d(k)\,(2c + 2H)$   (5.2.27)
where $c \in$ centre of $\hat{g}$ and $H$ is the Coxeter number. We shall see that the above operator can be expressed in terms of Chevalley operators by using the following ordering method between two elements of $\hat{g}$ (the integers $k$ and $k'$ are arbitrary here):

$: x(k)y(k') : \; = x(k)y(k')$   $k' > k$
$: x(k)y(k') : \; = \frac{1}{2}\,(x(k)y(k') + y(k')x(k))$   $k = k'$
$: x(k)y(k') : \; = y(k')x(k)$   $k' < k$.   (5.2.28)

In order to express $l(k)$ in terms of the Chevalley basis, we use the above ordering method amongst these basis vectors. In view of the earlier subsection, all that we need is a set of independent vectors which can be obtained with the help of the bilinear function $\varepsilon$ given in (5.2.22). A subset of these vectors forms an orthonormal set for the Cartan subalgebra $\hat{\mathcal{H}}$ of $\hat{g}$, and the remaining ones span the root spaces $\tilde{g}_{\tilde{\alpha}}$ of $\tilde{g}$. The set in question is $\{h_i(n)\} \cup \{x^\varepsilon_\alpha(m)\}$ where, for $i = 1, \ldots, l$, $m \in \mathbb{Z}$, $h_i(m)$ denotes an element of $\hat{\mathcal{H}}$, and for $\alpha \in \Delta$, $x^\varepsilon_\alpha(m)$ is an element of $\tilde{g}_{\tilde{\alpha}}$. The operator $l(k)$ can now be written as:

$l(k) = \sum_{k' \in \mathbb{Z}} \Big( \sum_{i=1}^{l} : h_i(k')\, h_i(k - k') : \; - \sum_{\alpha \in \Delta} : x^\varepsilon_\alpha(k')\, x^\varepsilon_{-\alpha}(k - k') : \Big)$   (5.2.29)

15. The Coxeter number $H$ of an affine Lie algebra is the sum of numerical labels attached to the vertices of the Dynkin diagram that is obtained from the generalized Cartan matrix pertaining to that algebra (see Exercise 1.4). Note that the Cartan subalgebra $\hat{\mathcal{H}}$ is different from $\tilde{\mathcal{H}}$, the subalgebra of $\tilde{g}$; there $k$ was chosen as zero.

2.7 Canonical Generators of the Affine Algebra $\tilde{g}$

We use the set of simple roots $\{\tilde{\alpha}_0, \tilde{\alpha}_1, \ldots, \tilde{\alpha}_l\}$ of $\tilde{g}$ to define the elements

$a_{ij} = 2\,(\tilde{\alpha}_i, \tilde{\alpha}_j)/(\tilde{\alpha}_j, \tilde{\alpha}_j)$, $i, j = 0, 1, \ldots, l$.   (5.2.30)
The matrix $\tilde{A} = (a_{ij})$ is called the Cartan matrix of the Lie algebra $\tilde{g}$. Note that this is also the extended Cartan matrix of $g$, and as such the canonical generators $E_i, F_i, H_i$ ($i = 1, 2, \ldots, l$) with their usual meaning, e.g., $E_i \in g_{\alpha_i}$, $F_i \in g_{-\alpha_i}$, $H_i = [E_i, F_i]$ and $\alpha_i(H_i) = 2$, can be used to define similar relations on $\tilde{g}$. Thus if we choose $E_0 \in g_{-\bar{\alpha}}$ and $F_0 \in g_{\bar{\alpha}}$ such that $\bar{\alpha}(H_0) = -2$ where $H_0 = [E_0, F_0]$ and set

$e_0 = t \otimes E_0$, $f_0 = t^{-1} \otimes F_0$, $h_0 = 1 \otimes H_0 + c$
$e_i = t^0 \otimes E_i$, $f_i = t^0 \otimes F_i$, $h_i = 1 \otimes H_i$, $i = 1, 2, \ldots, l$   (5.2.31)

then it can be verified that they satisfy (see Exercise 1.2)

$[e_i, f_j] = \delta_{ij} h_i$, $[h_i, h_j] = 0$, $[h_i, e_j] = a_{ij} e_j$, $[h_i, f_j] = -a_{ij} f_j$,
$(\mathrm{ad}\, e_i)^{1 - a_{ij}} e_j = 0$, $(\mathrm{ad}\, f_i)^{1 - a_{ij}} f_j = 0$, $i \neq j$.   (5.2.32)

The elements $e_i, f_i, h_i$ ($i = 0, 1, 2, \ldots, l$) are called the canonical generators; they generate the subalgebra $\hat{g}$ of $\tilde{g}$. Evidently $\hat{g}$ as a subalgebra is of codimension 1 in $\tilde{g}$. This is the (so-called) Kac-Moody Lie algebra associated with the matrix $\tilde{A}$. This shows that every affine algebra can be related to a Kac-Moody algebra.
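The Cartan matrix of (5.2.30) can be computed explicitly once the inner products of the simple roots are known. The sketch below is illustrative and rests on stated assumptions: it takes $g$ of type $A_2$, uses $\tilde{\alpha}_0 = \beta - \bar{\alpha}$ with $\beta$ orthogonal to every root (including itself), and uses hypothetical function names.

```python
from fractions import Fraction

GRAM = [[2, -1], [-1, 2]]   # (alpha_i, alpha_j) for A_2; an assumed example
HIGHEST = [1, 1]            # highest root in simple-root coordinates


def extended_gram(gram, highest):
    """Gram matrix of (a0, a1, ..., al), where a0 = beta - highest and
    beta pairs to zero with every root, so a0 behaves like -highest."""
    l = len(gram)

    def pair(u, v):
        return sum(u[i] * gram[i][j] * v[j] for i in range(l) for j in range(l))

    def unit(i):
        return [1 if k == i else 0 for k in range(l)]

    vectors = [[-c for c in highest]] + [unit(i) for i in range(l)]
    return [[pair(u, v) for v in vectors] for u in vectors]


def cartan_matrix(gram):
    """a_ij = 2 (a_i, a_j) / (a_j, a_j), as in (5.2.30)."""
    n = len(gram)
    return [[Fraction(2 * gram[i][j], gram[j][j]) for j in range(n)]
            for i in range(n)]


def determinant(m):
    """Laplace expansion along the first row; adequate for small matrices."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * determinant([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))
```

For this example `cartan_matrix(extended_gram(GRAM, HIGHEST))` is the $3 \times 3$ matrix with 2 on the diagonal and $-1$ elsewhere. Its determinant vanishes, whereas the Cartan matrix of $A_2$ itself has determinant 3, which is the usual signature of the affine case.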
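2.8 The Weyl Group of $\tilde{g}$

For a real root $\tilde{\alpha}$ (i.e., $(\tilde{\alpha}, \tilde{\alpha}) \neq 0$), let $\tilde{\alpha}^\vee = 2\tilde{\alpha}/(\tilde{\alpha}, \tilde{\alpha})$ denote the dual root, and let $r_{\tilde{\alpha}}$ be the reflection in the space $\tilde{\mathcal{H}}^*$ with respect to $\tilde{\alpha}$ given by:

$r_{\tilde{\alpha}}(\mu) = \mu - (\mu, \tilde{\alpha}^\vee)\,\tilde{\alpha}$, $\mu \in \tilde{\mathcal{H}}^*$   (5.2.33)

The collection $\{r_{\tilde{\alpha}}\}$ for $\tilde{\alpha} \in \tilde{\Delta}$ ($(\tilde{\alpha}, \tilde{\alpha}) \neq 0$) generates a group called the Weyl group of $\tilde{g}$. We denote it as $\tilde{W}$. We note that unlike the Weyl group of $g$, which is finite, the Weyl group $\tilde{W}$ is infinite. The following properties of $\tilde{W}$ can be easily checked.

Property 5.2.3:
(a) The bilinear form $(\,,\,)$ restricted to $\tilde{\mathcal{H}}^*$ is $\tilde{W}$-invariant.
(b) Any real root is a $\tilde{W}$-conjugate of a simple root.
(c) The line $\mathbb{C}\beta$ (with $\beta$ defined above in (5.2.14)) is the fixed point set of $\tilde{W}$.

Note that the group $\tilde{W}$ is generated by reflections $r_{\tilde{\alpha}_i}$ ($i = 0, 1, 2, \ldots, l$) that correspond to simple roots $\tilde{\alpha}_i$. Denoting $r_{\tilde{\alpha}_i}$ by $r_i$, it can be shown that the Weyl group $W$ of the Lie algebra $g$ can be identified with the subgroup of $\tilde{W}$ generated by $r_i$ ($i = 1, 2, \ldots, l$) (see Sec. (3.7) in [8(a)] and Sec. (1.6) of [5]).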
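The contrast between the finite Weyl group of $g$ and the infinite Weyl group of the affine algebra already shows up in the smallest case. The following sketch is an illustration under assumptions: it takes the generalized Cartan matrix of $A_1^{(1)}$, lets the simple reflections act on the root lattice in the basis $(\tilde{\alpha}_0, \tilde{\alpha}_1)$ through $r_i(\tilde{\alpha}_j) = \tilde{\alpha}_j - a_{ij}\tilde{\alpha}_i$, and uses hypothetical function names.

```python
CARTAN = [[2, -2], [-2, 2]]   # generalized Cartan matrix of A_1^(1); assumed example


def reflect(i, vec):
    """Simple reflection r_i on a root-lattice vector in the basis (a0, a1),
    extended linearly from r_i(a_j) = a_j - a_ij * a_i."""
    out = list(vec)
    out[i] -= sum(CARTAN[i][j] * vec[j] for j in range(2))
    return tuple(out)


def orbit_under_translation(vec, steps):
    """Images of vec under successive powers of the product r_0 r_1."""
    images = [vec]
    for _ in range(steps):
        vec = reflect(0, reflect(1, vec))
        images.append(vec)
    return images
```

In these coordinates the imaginary root $\beta = \tilde{\alpha}_0 + \tilde{\alpha}_1$ is `(1, 1)` and is fixed by both reflections, while the images of $\tilde{\alpha}_1$ under powers of $r_0 r_1$ are $\tilde{\alpha}_1 - 2k\beta$, all distinct, so $r_0 r_1$ has infinite order.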
Exercise 5.2

We note that some of the symbols used here differ from those of the text.

1. Let $sl(2, \mathbb{C})$ denote the set of traceless $(2 \times 2)$ matrices over $\mathbb{C}$. Show that it is a Lie algebra $g$ with the basis:

$h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, $e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$

and the bracket relations: (a) $[h, e] = 2e$ (b) $[h, f] = -2f$ (c) $[e, f] = h$. Show further that using $g$, an infinite-dimensional Lie algebra $\hat{g} = g \otimes \mathbb{C}[t, t^{-1}]$ can be defined where $\mathbb{C}[t, t^{-1}]$ is the algebra of polynomials in the indeterminate $t$ and its inverse $t^{-1}$. Specify its basis and the bracket relations.

2. Show that the generalized Cartan matrix Lie algebra $\tilde{g}$ defined in Exercise 5.1.2 (denoted there as $A_1^{(1)}$) can be identified with the direct sum of vector spaces resulting from $\hat{g}$ given in Exercise 1 and the centre of $\tilde{g}$, thus:

$\tilde{g} = \hat{g} \oplus \mathbb{C}z$

where $z = h_0 + h_1$ is in the center, i.e., $[z, \tilde{g}] = 0$.

3. Let $A$ be a generalized Cartan matrix of finite type and $\mathbb{C}[t, t^{-1}]$ denote the algebra of Laurent polynomials in $t$. Then show that the Lie algebra

$\mathbb{C}[t, t^{-1}] \otimes g(A)$

with the Lie bracket:

$[P \otimes x, P' \otimes x'] = PP' \otimes [x, x']$   ($P, P' \in \mathbb{C}[t, t^{-1}]$, $x, x' \in g(A)$)

can be identified with the Lie algebra of regular rational maps from the set of non-zero complex numbers $\mathbb{C}^* \to g(A)$ such that the element $\sum_i t^i \otimes x_i$ corresponds to the mapping $z \mapsto \sum_i z^i x_i$, where $z \in \mathbb{C}^*$.

4. Show that the following subalgebras of $\tilde{g}$:

$\tilde{n}_- = \sum_{\tilde{\alpha} \in \tilde{\Delta}_+} \tilde{g}_{-\tilde{\alpha}}$, $\tilde{n}_+ = \sum_{\tilde{\alpha} \in \tilde{\Delta}_+} \tilde{g}_{\tilde{\alpha}}$ and $\tilde{b} = \tilde{\mathcal{H}} \oplus \tilde{n}_+$

are the generalizations of maximal nilpotent and the Borel subalgebras of $g$.
Hints to Exercise 5.2

1. The bracket relations (a), (b), (c) can be easily verified; however, we examine (c) explicitly. Writing (c) in terms of the matrices $e$ and $f$, we obtain:

$[e, f] = ef - fe = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = h$   (i)

Any elements $x, y \in g$ will be of the form $\alpha_1 e + \alpha_2 f + \alpha_3 h$, $\beta_1 e + \beta_2 f + \beta_3 h$ where $\alpha_i$ and $\beta_i$ are complex numbers. To check that $[x, y] = -[y, x]$, we have to check that $[\alpha_i e_i, \beta_j e_j] = -[\beta_j e_j, \alpha_i e_i]$ where we have written $e_1 = e$, $e_2 = f$, $e_3 = h$ and have used repeated indices to imply summation. In view of (a), (b), (c) and linearity, the anti-commutativity is obvious. To check the Jacobi identity we write:

$[f, [h, e]] + [h, [e, f]] + [e, [f, h]] = [f, 2e] + [h, h] + [e, 2f]$

The RHS is zero and hence the identity holds good. Consider now the tensor product

$g \otimes \mathbb{C}[t, t^{-1}] = \hat{g}$.   (ii)

The basis elements of $\hat{g}$ will be

$(h \otimes t^m,\ e \otimes t^n,\ f \otimes t^p \mid m, n, p \in \mathbb{Z})$   (iii)

and the bracket relations using (a), (b), (c) can therefore be written as:

(a') $[h \otimes t^m, e \otimes t^n] = 2e \otimes t^{m+n}$
(b') $[h \otimes t^m, f \otimes t^p] = -2f \otimes t^{m+p}$
(c') $[e \otimes t^n, f \otimes t^p] = h \otimes t^{n+p}$   (iv)

From (iv) the arbitrary elements of the Lie algebra $\hat{g}$ are $(2 \times 2)$ matrices, tensor multiplied with terms of infinite Laurent series. The anti-commutativity and the Jacobi identity for (a'), (b'), (c') are easy to check. Also for any two elements $x$ and $y$ in $g$, the bracket relation:

$[x \otimes t^m, y \otimes t^n] = [x, y] \otimes t^{m+n}$   (v)
can be easily verified using (a'), (b') and (c').

2. To establish the identification between the two descriptions of $\tilde{g}$, we must show that there exists a correspondence between the generators of $\tilde{g}$ and those of $\hat{g} \oplus \mathbb{C}z$. This can be done by writing the Lie brackets in both cases and comparing them using the Hints to Exercises (5.1.2) and (5.2.1). Note that the Lie bracket for $\hat{g} \oplus \mathbb{C}z$ in this general situation would be:

$[x \otimes t^m, y \otimes t^n] = [x, y] \otimes t^{m+n} + m\,\delta_{m,-n}\, \mathrm{Tr}(xy)\, z$   (i)

where $x, y \in g$, $z \in$ centre of $\tilde{g}$ and $\mathrm{Tr}(xy)$ = trace of the product of matrices $x, y$. Using this for likely combinations of generators in $\hat{g}$ as given in the above exercise, we shall have, for example:

$[e \otimes t^m, f \otimes t^n] = [e, f] \otimes t^{m+n} + m\,\delta_{m,-n}\, \mathrm{Tr}(ef)\, z$   (ii)

Now the second term on the RHS is non-zero only when $m = -n$. We choose $m = 1 = -n$; there is no loss of generality in this choice since $(t, t^{-1})$ can generate all powers in $\mathbb{C}[t, t^{-1}]$. As a consequence (ii) becomes:

$[e \otimes t, f \otimes t^{-1}] = [e, f] \otimes t^0 + 1 \cdot 1 \cdot (h_0 + h_1) = h \otimes 1 + (h_0 + h_1)$   (iii)

(Note that from (i) of the Hint to Exercise 1, $\mathrm{Tr}(ef) = 1$.) In view of the earlier description of $\tilde{g}$ (given in Exercise 5.1.2) it is appropriate to write $e_0$ for $e \otimes t$ and $f_0$ for $f \otimes t^{-1}$. This gives:

$[e_0, f_0] = h \otimes 1 + z$   (iv)

But $[e_0, f_0] = h_0$, hence it follows that $h_0 \to h \otimes 1 + z$. Having obtained the correspondences:

$e_0 \to e \otimes t$, $f_0 \to f \otimes t^{-1}$, $h_0 \to h \otimes 1 + z$   (v)

we have to find those for the remaining three, $e_1, f_1, h_1$, and check that these are compatible with the set (v). One more choice for $m = -n$ can be that $m = -n$ be equal to 0 in (i). In this case we further choose that $x$ is $f$ and $y$ is $e$; then (i) becomes:

$[f \otimes t^0, e \otimes t^0] = [f, e] \otimes t^0 + 0$   (vi)

Write $e_1$ for $f \otimes t^0$ and $f_1$ for $e \otimes t^0$. This gives:

$[e_1, f_1] = [f, e] \otimes t^0$, i.e., $h_1 \to -h \otimes 1$.   (vii)

Note that if $z$ is replaced by $h_0 + h_1$ in (v), the above correspondence in (vii) becomes apparent. This shows the appropriateness of our choice. To examine the compatibility we substitute these values of $e_i$, $f_i$ and $h_i$ in (iv) of Exercise (5.1.2) to verify that they are satisfied. Write $h_0 = h \otimes 1 + z$ and $h_1 = -h \otimes 1$ in $[h_0, h_1]$ to obtain:

$[h \otimes 1 + z, -h \otimes 1] = [h \otimes 1, -h \otimes 1] + [z, -h \otimes 1] = 0$   (viii)

This shows that $h \otimes 1 + z$, $-h \otimes 1$ belong to the Cartan subalgebra of $\hat{g} \oplus \mathbb{C}z$. Now the LHS of (vi) in Exercise (5.1.2) can be written as either

$[h_0, e_1]$ or $[h_1, e_1]$ or $[h_0, e_0]$ or $[h_1, e_0]$   (ix)

We prove just one (the first one, for instance).

$[h \otimes 1 + z, f \otimes t^0] = [h \otimes 1, f \otimes 1] + [z, f \otimes 1] = -2f \otimes 1 + 0$   (x)

while $[h_0, e_1] = A_{01} e_1 = -2e_1 = -2f \otimes 1$. Similarly (vii) of Exercise (5.1.2) can also be verified. Note that we have thus established a two-way (bijective) correspondence:

$e_0 \leftrightarrow e \otimes t$, $f_0 \leftrightarrow f \otimes t^{-1}$, $h_0 \leftrightarrow h \otimes 1 + z$, $e_1 \leftrightarrow f \otimes 1$, $f_1 \leftrightarrow e \otimes 1$, $h_1 \leftrightarrow -h \otimes 1$.

This proves the identification between the two descriptions of $\tilde{g}$.

Exercises 3 and 4 are left for the reader. Hints for their solution can be found in Chapter 7 of [8(a)] for 3, and in Sec. (1.3) of [5] for 4.
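The matrix computations used in the hints to Exercises 1 and 2 are easy to reproduce numerically. The sketch below is illustrative only: it stores $2 \times 2$ matrices as nested lists, represents an element $x \otimes t^m$ by the pair `(x, m)`, and returns the central coefficient $m\,\delta_{m,-n}\,\mathrm{Tr}(xy)$ as a separate number. The function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]


# Standard basis of sl(2, C) from Exercise 1.
H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]


def affine_bracket(x, m, y, n):
    """[x (x) t^m, y (x) t^n] as a triple (matrix part, power of t, coefficient of z),
    with the central coefficient m * delta_{m,-n} * Tr(xy)."""
    central = m * trace(matmul(x, y)) if m + n == 0 else 0
    return bracket(x, y), m + n, central
```

For example, `affine_bracket(E, 1, F, -1)` returns `(H, 0, 1)`, which is the statement $[e \otimes t, f \otimes t^{-1}] = h \otimes 1 + z$ used in the hint to Exercise 2.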
3 MODULES AND REPRESENTATIONS

Having seen the connection between affine algebras and Kac-Moody algebras, it will be worthwhile to study the modules and the representations for Kac-Moody algebras (to begin with). It should be apparent that the concept of weight space decompositions, the Weyl group and the representations (via modules) established in the case of finite-dimensional simple Lie algebras can be generalized to affine Lie algebras. Let $A$ be an $l \times l$ generalized Cartan matrix and $g(A)$ be the Lie algebra associated to a realization $\{\mathcal{H}, \Pi, \Pi^\vee\}$; then a $g(A)$-module $V$ is called $\mathcal{H}$-diagonalizable if

$V = \bigoplus_{\lambda \in \mathcal{H}^*} V_\lambda$

where

$V_\lambda = \{v \in V \mid h(v) = (\lambda, h)v \text{ for } h \in \mathcal{H}\} \neq 0$   (5.3.1)

is a weight space and $\lambda \in \mathcal{H}^*$ is a weight. The dimension of $V_\lambda$ is called the multiplicity of the weight $\lambda$. If in addition the generators $\{e_i\}$, $\{f_i\}$ ($i = 1, \ldots, l$) are locally nilpotent on $V$ (i.e., for every $v$ in $V$ there exists a positive integer $n$ such that $(e_i)^n v = 0$, and similarly for $f_i$), then the module $V$ is called integrable (see Sec. 3.2-3.5 in [8(a)]). Let the triangular decompositions for $g(A)$ and its enveloping algebra $U(g(A))$ (see (5.1.11) and (5.1.13)) be given by

(a) $g(A) = n_- \oplus \mathcal{H} \oplus n_+$
(b) $U(g(A)) = U(n_-) \otimes U(\mathcal{H}) \otimes U(n_+)$   (5.3.2)

then using the above decomposition, an important type of module can be defined as follows.
3.1 Highest Weight Modules

Definition 5.3.1: A $g(A)$-module $V$ is called a highest weight module with highest weight $\Lambda \in \mathcal{H}^*$ if there exists a non-zero vector $v \in V$ such that

(a) $n_+(v) = 0$; $h(v) = (\Lambda, h)v$ for $h \in \mathcal{H}$; and
(b) $U(g(A))(v) = V$.   (5.3.3)

The vector $v$ is called a "highest weight vector." It can be easily seen that in view of (5.3.2)(b) and (5.3.3)(a), Eq. (5.3.3)(b) becomes $U(n_-)(v) = V$. Hence we have:

$V = \bigoplus_\lambda V_\lambda$; $V_\Lambda = \mathbb{C}v$, $\dim V_\lambda < \infty$   (5.3.3)(c)

Every two highest weight vectors are proportional. We now generalize these ideas to an affine Lie algebra $\tilde{g}$.

Definition 5.3.2: Let $\Lambda \in \tilde{\mathcal{H}}^*$; the highest weight module $V(\Lambda)$ is an irreducible $\tilde{g}$-module for which there exists a non-zero vector $v_0 \in V(\Lambda)$ such that

$\tilde{n}_+(v_0) = 0$, $h(v_0) = (\Lambda, h)\, v_0 = \Lambda(h)\, v_0$ for $h \in \tilde{\mathcal{H}}$   (5.3.4)

For any $\Lambda \in \tilde{\mathcal{H}}^*$, there is a unique module (up to isomorphism) of this kind; the element $\Lambda$ is called the highest weight of the module $V(\Lambda)$. As usual we have a weight decomposition of $V(\Lambda)$ with respect to $\tilde{\mathcal{H}}$ which is given as:

$V(\Lambda) = \sum_{\mu \in \tilde{\mathcal{H}}^*} V(\Lambda)_\mu$   (5.3.5)

The subscript $\mu$ in the above decomposition ranges over the set $\{\Lambda - \sum_{i=0}^{l} r_i \tilde{\alpha}_i\}$ where $r_i$ is an element of the Weyl group defined in Subsection (2.8) and $\tilde{\alpha}_i \in \tilde{\Delta}_+$. The dimension of $V(\Lambda)_\mu$, denoted $m_\mu(\Lambda)$, is called the multiplicity of $\mu$, and $\mu$ is called a weight of the module $V(\Lambda)$ if $m_\mu(\Lambda) \neq 0$. Furthermore, if $\Lambda(h_i)$ ($i = 0, 1, 2, \ldots, l$) is a non-negative integer for every $i$, then $\Lambda$ is referred to as dominant. For a module $V(\Lambda)$ with dominant highest weight $\Lambda$, it is known that $\{e_i\}$ and $\{f_i\}$ are locally nilpotent.
3.2 The Basic Representation of the Affine Algebra

In order to define the so-called basic representation of $\tilde{g}$ and the corresponding weight system denoted $P$, we define a subalgebra in $\tilde{g}$ by setting:

$\mathcal{M} = \mathbb{C}d + (\mathbb{C}[t] \otimes_{\mathbb{C}} g)$   (5.3.6)

By definition the basic representation of $\tilde{g}$ is an irreducible $\tilde{g}$-module $V$ for which there exists a non-zero vector $v_0 \in V$ such that^16

$\mathcal{M}(v_0) = 0$ and $c(v_0) = v_0$, $c = h_0 + \sum_k b_k h_k$, $b_k \in \mathbb{C}$   (5.3.7)

It can be verified that such a representation always exists when the module in question is a highest weight module $V(\Lambda_0)$, with the highest weight $\Lambda_0 \in \tilde{\mathcal{H}}^*$ defined by the following relation:

$\Lambda_0(h_0) = 1$, $\Lambda_0(h_i) = 0$ ($i = 1, \ldots, l$), $\Lambda_0(d) = 0$.   (5.3.8)

The first two equalities of (5.3.8) show that $\Lambda_0$ is a dominant weight. The weight system $P$ of the basic representation consists of the elements of the form [5]:

$\Lambda_0 + \gamma - \left(\tfrac{1}{2}(\gamma, \gamma) + k\right)\beta$, $\gamma \in Q$, $k \in \mathbb{Z}$   (5.3.9)

The multiplicity of the weight given in Eq. (5.3.9) is $p^{(n)}(k)$, the number of partitions of $k$ into parts of $n$ different objects. The partition function $p^{(n)}(k)$ satisfies the identity

$\sum_{k \ge 0} p^{(n)}(k)\, q^k = \left[\prod_{k \ge 1} (1 - q^k)\right]^{-n}$   (5.3.10)

16. See Secs. (4.3) and (4.4).
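The coefficients in the identity (5.3.10) can be generated by expanding the product as a power series. The sketch below is an illustration with a hypothetical function name; it multiplies in the factors $1/(1 - q^k)$ one at a time, each taken $n$ times, and truncates at a chosen degree.

```python
def coloured_partitions(n_colours, k_max):
    """Coefficients of prod_{k>=1} (1 - q^k)^(-n_colours) up to degree k_max."""
    coeffs = [1] + [0] * k_max
    for part in range(1, k_max + 1):
        for _ in range(n_colours):
            # Multiplying by 1/(1 - q^part) is a running sum with stride `part`.
            for degree in range(part, k_max + 1):
                coeffs[degree] += coeffs[degree - part]
    return coeffs
```

With one colour the list starts 1, 1, 2, 3, 5, 7, the ordinary partition numbers; with two colours it starts 1, 2, 5, 10, 20, 36.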
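The collection of $\tilde{g}$-modules $V$ forms a category $\mathcal{K}$ if the following two conditions are satisfied (see also Sec. (0.6)):

$V = \sum_{\mu \in \tilde{\mathcal{H}}^*} V_\mu$, where each $V_\mu = \{v \in V \mid h(v) = \mu(h)v,\ h \in \tilde{\mathcal{H}}\}$ is finite dimensional.   (5.3.11)(a)

The elements $e_i$ and $f_i$ ($i = 0, 1, 2, \ldots, l$) are nilpotent operators on $V$.   (5.3.11)(b)

It can be verified that if $V_1$ is a submodule of a $\tilde{g}$-module $V$, then $V \in \mathcal{K}$ if and only if $V_1$ and $V/V_1$ both belong to $\mathcal{K}$. The module of the adjoint representation also belongs to $\mathcal{K}$. Let $\pi$ denote a representation of $\tilde{g}$ in $V \in \mathcal{K}$. For each $\gamma \in \Delta$ (the set of roots in $g$) fix an element $E_\gamma$ in the root space $g_\gamma \subset g$ such that $\gamma([E_\gamma, E_{-\gamma}]) = 2$ (see Remark (5.2.1) and Eq. (5.2.23)(e)) and denote $E_{\alpha_i}$ by $E_i$ ($i = 1, 2, \ldots, l$). For a real root $\tilde{\alpha}$ ($= k\beta + \gamma$) $\in \tilde{\Delta}$ (the set of roots in $\tilde{g}$) we set $E_{\tilde{\alpha}} = t^k \otimes E_\gamma \in \tilde{g}_{\tilde{\alpha}}$, and define an element of the representation $\pi$ as follows:

$r^\pi_{\tilde{\alpha}} = \exp(-\pi(E_{\tilde{\alpha}}))\, \exp(\pi(E_{-\tilde{\alpha}}))\, \exp(-\pi(E_{\tilde{\alpha}}))$   (5.3.12)

For $\tilde{\alpha}_i \in \tilde{\Pi} \subset \tilde{\Delta}$, we denote the above element $r^\pi_i$. It can be checked that the collection $\{r^\pi_i\}$ ($i = 0, 1, 2, \ldots, l$) generates a group denoted $\tilde{W}^\pi$. We state below three important results (see [5] for proof) based on the representation $\pi$.

Result 5.3.3: The operator $r^\pi_{\tilde{\alpha}}$ for every $\tilde{\alpha}$ is a well defined automorphism of the space $V$ with the following properties:

(a) $r^\pi_{\tilde{\alpha}}(V_\mu) = V_{r_{\tilde{\alpha}}(\mu)}$   (5.3.13)
where $r_{\tilde{\alpha}}$ is the reflection in the space $\tilde{\mathcal{H}}^*$ with respect to $\tilde{\alpha}$ (see (5.2.33))
(b) $(r^\pi_{\tilde{\alpha}})^2|_{V_\mu} = \pm\,\mathrm{id}$.

Result 5.3.4: Let $A^\pi$ denote the subgroup of $\tilde{W}^\pi$ formed by the elements $(r^\pi_i)^2$ ($i = 0, 1, \ldots, l$). Then if $\ker \pi \subset \mathbb{C}c$, the following is true:

(a) $A^\pi$ is a normal abelian subgroup of period 2.
(b) The quotient group $\tilde{W}^\pi / A^\pi$ is isomorphic to $\tilde{W}$, the Weyl group of $\tilde{g}$ (see Subsec. (5.2.8) for the Weyl group $\tilde{W}$).

Result 5.3.5: For the adjoint representation every element $r^{ad}_{\tilde{\alpha}}$ is an automorphism of the Lie algebra $\tilde{g}$, which preserves the (standard) bilinear form on $\tilde{g}$, and has in addition the following properties:

(a) $\tilde{\mathcal{H}}$ is $r^{ad}_{\tilde{\alpha}}$-invariant, and $r^{ad}_{\tilde{\alpha}}|_{\tilde{\mathcal{H}}} = r_{\tilde{\alpha}}$.
(b) $r^{ad}_{\tilde{\alpha}}(\tilde{g}_{\tilde{b}}) = \tilde{g}_{r_{\tilde{\alpha}}(\tilde{b})}$, $\tilde{b} \in \tilde{\Delta}$.

We shall briefly return to representations in the next section after we have introduced the Heisenberg systems.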
Since our final aim in this chapter is to define the vertex operators, we devote the next section to Heisenberg systems and differential operators, the main ingredients of these operators. Although we have already introduced the Heisenberg algebra in a previous section, we establish it from first principles in the next section in order to show its action on Fock spaces. In the process we shall see the Fock space from the mathematician's point of view (as used in string theory). Recall that (in physics) the state vector space of a many-body system (formed by identical particles) is called a Fock space. The fundamental postulate of the space is that the basis vectors (given in terms of Dirac's ket notation):

$|n_1, n_2, n_3, \ldots\rangle$   (5.3.14)

which allot the eigenvalues $k_1$ (say) to $n_1$ particles and $k_2$ to $n_2$ particles, etc., constitute a complete set of orthonormal basis vectors for the system of identical particles. Note that $n_i$ represents identical particles that differ from $n_j$ ($i \neq j$) and that $n_i$, $n_j$, characterized as occupation numbers, are in fact the eigenvalues of Hermitian operators $N_i$, $N_j$ defined in the system. The operators $N_i$, $N_j$ are known as occupation number operators. In particular the no-particle (vacuum) state and one-particle state (in this system) are respectively:

$\psi^{(0)} = |0, 0, 0, \ldots\rangle$, $\psi^{(1)} = |0, 0, \ldots, n_i = 1, 0, 0, \ldots\rangle \equiv |k_i\rangle$   (5.3.15)

whereas the most general state of the system is a linear combination of the kets (5.3.14). All these are elements of the Fock space.
4 HEISENBERG SYSTEMS AND DIFFERENTIAL OPERATORS

As already mentioned we devote this section to Heisenberg systems, to the action of the Heisenberg algebra^17 on Fock spaces, and to the differential operators which follow in a natural manner while obtaining a representation of this algebra on the space of polynomials $\mathbb{C}[t_1, t_2, \ldots, t_n, \ldots]$ in infinitely many variables.

4.1 Heisenberg Systems

It may be recalled that an infinite-dimensional Lie algebra with basis vectors $(p_i, q_i)$ and an element $c$ is a Heisenberg algebra with commutation relations^18:

$[p_i, q_i] = c$   ($i = 1, 2, \ldots$)   (5.4.1)

This is a nilpotent algebra with centre $\mathbb{C}c$. In view of the above, a Heisenberg-Lie algebra can be expressed as a direct sum

$\hat{S} = S \oplus \mathbb{C}c$   (5.4.2)(a)

where the subspace $S$ (in general, infinite dimensional) is assumed to be equipped with a non-degenerate alternate bilinear form $\psi$ which satisfies:

$[\alpha, \beta] = \psi(\alpha, \beta)\, c$, $\alpha, \beta \in S$   (5.4.2)(b)
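The relations (5.4.1) can be realised on polynomials, which is the setting mentioned at the start of this section. The sketch below assumes one common realisation, stated here as an assumption rather than as the construction of the text: $p_i$ acts as $\partial/\partial t_i$, $q_i$ as multiplication by $t_i$, and $c$ as the identity. Polynomials in finitely many of the variables are stored as dictionaries from exponent tuples to coefficients, and the function names are hypothetical.

```python
def differentiate(i, poly):
    """p_i acting as d/dt_i on a polynomial {exponent tuple: coefficient}."""
    out = {}
    for mono, c in poly.items():
        if mono[i] > 0:
            lowered = mono[:i] + (mono[i] - 1,) + mono[i + 1:]
            out[lowered] = out.get(lowered, 0) + mono[i] * c
    return out


def multiply(i, poly):
    """q_i acting as multiplication by t_i."""
    return {mono[:i] + (mono[i] + 1,) + mono[i + 1:]: c
            for mono, c in poly.items()}


def commutator_pq(i, j, poly):
    """[p_i, q_j] applied to poly, with zero coefficients dropped."""
    first = differentiate(i, multiply(j, poly))
    second = multiply(j, differentiate(i, poly))
    keys = set(first) | set(second)
    diff = {k: first.get(k, 0) - second.get(k, 0) for k in keys}
    return {k: c for k, c in diff.items() if c != 0}
```

Applied to any polynomial `f`, `commutator_pq(i, i, f)` returns `f` itself, so $[p_i, q_i]$ acts as the identity, while `commutator_pq(i, j, f)` is empty for `i != j`.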
17. The notations to write the Heisenberg algebra differ from previous sections for obvious reasons.
18. $q_i$ denotes the position operator and $p_i$ the momentum operator.

Let $\mathcal{H}$ be a complex vector space of dimension $n$ with a non-degenerate bilinear form, and let $\Gamma \subset \mathcal{H}^*$ be an $n$-dimensional lattice. Construct the direct sum

$\tilde{S} = \hat{S} \oplus \mathcal{H}$   (5.4.3)

and treat $\mathcal{H}$ as a commutative Lie algebra*; then $\tilde{S}$ becomes a Lie algebra. Further, viewing $\Gamma$ as a group, we define its action as automorphisms of the Lie algebra $\tilde{S}$ by setting:

$T_\gamma(s \oplus h) = (s - \gamma(h)c) \oplus h$, $s \in \hat{S}$, $h \in \mathcal{H}$, $\gamma \in \Gamma$   (5.4.4)

where evidently $T_\gamma$ is an operator on $\tilde{S}$. The pair $(\tilde{S}, \Gamma)$ is called a "Heisenberg system." On the other hand, beginning with an $n$-dimensional lattice $\Gamma$ and a real non-degenerate bilinear form $(\,,\,)$ on it, one can construct the space

$\mathcal{H} = (\Gamma \otimes_{\mathbb{Z}} \mathbb{C})^*$   (5.4.5)

and extend the form $(\,,\,)$ given on $\Gamma$ to it, which in turn gives $\mathcal{H}^* \supset \Gamma$. Setting

$\tilde{S} = (\mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} \mathcal{H}) \oplus \mathbb{C}c = S_1 \oplus \mathbb{C}c$   (5.4.6)

and defining a bilinear alternate form $\psi$ on $S_1$ as:

$\psi(t^n \otimes h, t^m \otimes h') = n\, \delta_{n+m,0}\, (h, h')$, $h, h' \in \mathcal{H}$   (5.4.7)

we note that $\tilde{S}$ is a Lie algebra with the bracket

$[\alpha, \beta] = \psi(\alpha, \beta)\, c$   (5.4.8)

where $\alpha, \beta \in S_1$ and $[c, \tilde{S}] = 0$ (i.e., $c$ is a central element of $\tilde{S}$). This construction gives an example of Heisenberg systems $(\tilde{S}, \Gamma)$ (defined above) with $\hat{S}$ being the Heisenberg algebra pertaining to $\mathcal{H}$ (see Eq. (5.2.3)). We shall denote it as $(\tilde{S}, \Gamma)$ and call it the Heisenberg system associated with the lattice $\Gamma$. Note that $\tilde{S}$ in this case is isomorphic to a direct sum of the commutative Lie algebra $\mathcal{H}$ defined in (5.4.5) and a Heisenberg algebra $\hat{S} = S \oplus \mathbb{C}c$ where $S$ stands for:

$S = \sum_{n \neq 0} (t^n \otimes_{\mathbb{C}} \mathcal{H})$   (5.4.9)

The space $S$ admits a (canonical) polarization with respect to the bilinear form $\psi$^19:

$S = S_- \oplus S_+ = \sum_{n < 0} t^n \otimes_{\mathbb{C}} \mathcal{H} \;\oplus\; \sum_{n > 0} t^n \otimes_{\mathbb{C}} \mathcal{H}$   (5.4.10)

We shall use this polarization to define Fock spaces related to Heisenberg systems.
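The properties of the form $\psi$ in (5.4.7) that make (5.4.10) a polarization can be checked on a toy example. In the sketch below, which is only an illustration, the space $\mathcal{H}$ is replaced by $\mathbb{C}^2$ with the ordinary dot product as a stand-in for $(\,,\,)$, an element $t^n \otimes h$ is passed as the pair `n, h`, and the function names are hypothetical.

```python
def dot(u, v):
    """Stand-in symmetric bilinear form on H = C^2 (an assumption for the example)."""
    return sum(a * b for a, b in zip(u, v))


def psi(n, h, m, h_prime, pairing=dot):
    """psi(t^n (x) h, t^m (x) h') = n * delta_{n+m,0} * (h, h')."""
    return n * pairing(h, h_prime) if n + m == 0 else 0
```

The form is alternate, it vanishes whenever both powers of $t$ are negative or both are positive (so $S_-$ and $S_+$ are isotropic), and it pairs $t^{-n} \otimes h$ with $t^{n} \otimes h$ non-trivially.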
*. The $\mathcal{H}$ which stands for an arbitrary commutative Lie algebra here should not be confused with the Cartan subalgebra of earlier sections.
19. $h \otimes t^n \in S_-$ ($h \in \mathcal{H}$ and $n < 0$) is usually denoted $h(-n)$.

4.2 Fock Spaces Constructed from a Heisenberg System

We now suppose that $\Gamma$ is an even lattice of rank $l$, i.e., $(\gamma, \gamma)$ is always even for $\gamma \in \Gamma$, and as such it can be thought of as being isomorphic to $\mathbb{Z}^l$ as a group; then we have:

Definition 5.4.1: Let $S(S_-)$ denote the symmetric algebra of $S_-$ (the negative part of $S$) and let $\mathbb{C}[\Gamma]$ denote the group algebra of the lattice $\Gamma$ (i.e., when it is viewed as an abelian group). The $\mathbb{C}$-vector space given below

$V_\Gamma = S(S_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$   (5.4.11)

is called the Fock space of the lattice $\Gamma$. Sometimes the name Fock space is used for the space $S(S_-)$ also, with the understanding that unity 1 be treated as the vacuum vector.

Remark 5.4.2: We would like to note that the group algebra $\mathbb{C}[\Gamma]$ has a basis $\{e^\gamma\}$ with $\gamma \in \Gamma$, and the multiplication rule in it is $e^\gamma \cdot e^{\gamma'} = e^{\gamma + \gamma'}$. We shall use this fact to define the representation of the Heisenberg system $(\tilde{S}, \Gamma)$ in the space $V_\Gamma$.

Definition 5.4.3: Let $\Gamma'$ be the lattice dual to $\Gamma$, i.e.,^20

$\Gamma' = \{\beta \in \mathcal{H}_{\mathbb{R}} \mid (\alpha, \beta) \in \mathbb{Z} \text{ for all } \alpha \in \Gamma\}$

and set

$V_{\Gamma'} = S(S_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma']$

which carries a symmetric non-degenerate bilinear form $(\,|\,)$ satisfying:

(a) $(e^\gamma \mid e^{\gamma'}) = \delta_{\gamma, \gamma'}$ on $\mathbb{C}[\Gamma']$,
(b) $(h(-n) \mid h'(-m)) = n\,(h, h')\,\delta_{n,m}$ on $S_-$, and
(c) $(x_1 \cdots x_N \mid x'_1 \cdots x'_M) = \delta_{N,M} \sum_{\sigma \in S_N} (x_1 \mid x'_{\sigma(1)}) \cdots (x_N \mid x'_{\sigma(N)})$   (5.4.12)

on $S(S_-)$. Here $S_N$ denotes the symmetric group on $N$ letters. The space $V_{\Gamma'}$ is a Fock space.

Remark 5.4.4: For any element $\lambda \in \Gamma'$ we can define the vector space (denoted $V(\lambda)$):

$V(\lambda) = S(S_-) \otimes \mathbb{C}[\Gamma + \lambda]$   (5.4.13)

The space $V(\lambda)$ is a subspace of $V_{\Gamma'}$; in particular $V(0) = V_\Gamma$ is a subspace of $V_{\Gamma'}$. In this manner we have constructed a family of Fock spaces starting from a Heisenberg algebra. We shall use them later to obtain representations.
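The dual lattice entering the definition of $V_{\Gamma'}$ can be computed from the Gram matrix of a basis of $\Gamma$. The sketch below is an illustration under assumptions: it takes $\Gamma$ to be the $A_2$ root lattice, works in coordinates with respect to the chosen basis of $\Gamma$, and uses hypothetical function names.

```python
from fractions import Fraction

GRAM = [[2, -1], [-1, 2]]   # Gram matrix of a basis of Gamma (A_2 root lattice, assumed)


def inverse_2x2(m):
    """Exact inverse of a 2x2 integer matrix with non-zero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[Fraction(m[1][1], det), Fraction(-m[0][1], det)],
            [Fraction(-m[1][0], det), Fraction(m[0][0], det)]]


def dual_basis(gram):
    """Rows are the dual basis vectors, in coordinates of the original basis."""
    return inverse_2x2(gram)


def pair(u, v, gram):
    """Bilinear form (u, v) for coordinate vectors u, v."""
    return sum(u[i] * gram[i][j] * v[j] for i in range(2) for j in range(2))
```

The rows of `dual_basis(GRAM)` pair to 0 or 1 with the basis vectors of $\Gamma$, so they generate $\Gamma'$. Since the determinant of the Gram matrix is 3, the quotient $\Gamma'/\Gamma$ has three elements in this example, and correspondingly there are three distinct spaces $V(\lambda)$ of the form (5.4.13).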
4.3 The Canonical Representation
In Sec. 2 we defined a 2-cocycle $\varepsilon : Q \times Q \to \{\pm 1\}$ which satisfied the properties laid down in (5.2.22), and we used $\varepsilon$ to obtain the (Chevalley) basis vectors and eventually a Lie algebra. We use this now to define a representation of a Heisenberg system in a vector space V as follows.
Definition 5.4.5: Suppose we are given a Heisenberg system $(\mathcal{S}, \Gamma)$ ($\Gamma$ not necessarily an even integral lattice) and a 2-cocycle $\varepsilon : \Gamma \times \Gamma \to \{\pm 1\}$ with the properties given in (5.2.22). A pair $\pi = \{\pi_1, \pi_2\}$ defines a representation (associated with $\varepsilon$) of $(\mathcal{S}, \Gamma)$ in a vector space V if $\pi_1$ is a representation of the Lie algebra $\mathcal{S}$ in V, and $\pi_2$ is a projective representation of the group $\Gamma$ in V such that the following two conditions are satisfied:
(a) $\pi_2(\gamma, \gamma') \equiv \pi_2(\gamma + \gamma') = \varepsilon(\gamma, \gamma')\, \pi_2(\gamma)\, \pi_2(\gamma')$
(b) $\pi_2(\gamma)\, \pi_1(a)\, \pi_2(\gamma)^{-1} = \pi_1(T_\gamma(a))$ (the consistency condition)    (5.4.14)
20. $\mathcal{H}_{\mathbb{R}} = \Gamma \otimes_{\mathbb{Z}} \mathbb{R}$.
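where $\gamma, \gamma' \in \Gamma$, $a \in \mathcal{S}$ and $T_\gamma$ is the automorphism given in (5.4.4). Further let $\mathcal{S}$ be the subspace of $\hat{\mathcal{S}} = \mathcal{S} \oplus \mathbb{C}c$ with the decomposition (see 5.4.9-10): $\mathcal{S} = \mathcal{S}_- \oplus \mathcal{S}_+$. The two components of $\mathcal{S}$ are isotropic with respect to the bilinear form $\psi$ introduced above (see Subsec. 4.1). Using this polarization of $\mathcal{S}$ another representation of $(\mathcal{S}, \Gamma)$ can be established in the following manner. Consider the symmetric algebra $S(\mathcal{S}_-)$ and define the derivations on $S(\mathcal{S}_-)$ and $\mathbb{C}[\Gamma]$ with respect to $\mathcal{S}_+$, $\mathcal{H}$ and $\Gamma$, and use the multiplication on $S(\mathcal{S}_-)$. For an element $u \in \mathcal{S}_+$ a derivation $\partial_u$ of $S(\mathcal{S}_-)$ is:
$\partial_u(v) = \psi(u, v)$, where $v \in \mathcal{S}_-$    (5.4.15)
whereas for an element $h \in \mathcal{H}$, a derivation $\partial_h$ of the group algebra $\mathbb{C}[\Gamma]$ is:
$\partial_h(e^{\gamma}) = \gamma(h)\, e^{\gamma}$  ($\gamma \in \Gamma$ and $e^{\gamma} \in \mathbb{C}[\Gamma]$)    (5.4.16)
and for an element $\gamma' \in \Gamma$, it is:
$\partial_{\gamma'}(e^{\gamma}) = (\gamma', \gamma)\, e^{\gamma}$    (5.4.17)
The multiplication of $S(\mathcal{S}_-)$ by an element $x \in \mathcal{S}_-$ is denoted $L_x$. We also define a family of operators $\{\hat{T}_\gamma\}$ on $\mathbb{C}[\Gamma]$ for every $\gamma$ in $\Gamma$ by the relation:
$\hat{T}_\gamma(e^{\gamma'}) = \varepsilon(\gamma, \gamma')\, e^{\gamma + \gamma'}$.    (5.4.18)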
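We now use the vector space $V_\Gamma = S(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$ obtained above to establish the required representation $\pi = \{\pi_1, \pi_2\}$ of the Heisenberg system $(\mathcal{S}, \Gamma)$. The following definitions are made for this purpose:
$\pi_1(u) = \partial_u \otimes 1$, $u \in \mathcal{S}_+$;  $\pi_1(v) = L_v \otimes 1$, $v \in \mathcal{S}_-$;
$\pi_1(c) = 1_V$;  $\pi_1(h) = 1 \otimes \partial_h$, $h \in \mathcal{H}$;  $\pi_2(\gamma) = \hat{T}_\gamma$, $\gamma \in \Gamma$    (5.4.19)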
The consistency relation (5.4.14)(b) for the above definition of $\pi_1$ and $\pi_2$ can be easily verified (see Hint to Exercise 5.4.1). The representation defined above is called the canonical representation of $(\mathcal{S}, \Gamma)$ associated with the cocycle $\varepsilon$. The following remarks and a result (the Stone-von Neumann theorem for Heisenberg systems) concerning the canonical representations are noteworthy.
Remark 5.4.6: The representation $\pi = \{\pi_1, \pi_2\}$ defined above is irreducible, i.e., there are no nontrivial subspaces of V which are invariant with respect to both $\pi_1$ and $\pi_2$.
Remark 5.4.7: Any two canonical representations associated with two equivalent cocycles are equivalent.
21. See Ref. [5] for the proofs of the results given as Remarks (5.4.6)-(5.4.7) and Results (5.4.8)-(5.4.9), and G. Segal in [6] for projective transformations of a group.
Result 5.4.8: Let $\pi'$ be an irreducible representation of a Heisenberg system $(\mathcal{S}, \Gamma)$ (with a fixed polarization $\mathcal{S} = \mathcal{S}_- \oplus \mathcal{S}_+$) associated with a cocycle $\varepsilon$ in a vector space V. Suppose that there exists a vacuum vector $v_0 \in V$, i.e., $v_0$ is a non-zero vector satisfying:
$\pi'(u)(v_0) = 0$, $u \in \mathcal{S}_+$;  $\pi'(h)(v_0) = 0$, $h \in \mathcal{H}$;  and  $\pi'(c)(v_0) = v_0$;
then the representation $\pi'$ is equivalent to $\pi$.
Note that to obtain the representation $\pi$, we defined the Heisenberg algebra from first principles. We now revert to the defining equation (5.2.3) of the Heisenberg algebra which resulted from a simple Lie algebra g and use the root lattice Q in place of $\Gamma$; the lattice Q equals its dual (see Sec. 4.1). Accordingly the Heisenberg system is $(\mathcal{S}, Q)$ (see footnote 22), where the polarization is the same as described above, and the representation that follows is given below. Recall that we defined the so-called basic representation for the affine algebra $\hat{g}$; we use this to write the following:
Result 5.4.9: Let $\pi_1$ be the restriction of the basic representation to the subalgebra $\mathcal{S}$ and let $\pi_2$ denote the representation of Q defined by $\pi_2(\alpha) = T_\alpha$, $\alpha \in Q$. Then the representation $(\pi_1, \pi_2)$ of the Heisenberg system $(\mathcal{S}, Q)$ in the space $V(\Lambda_0)$ is equivalent to the canonical representation of this system associated to the cocycle satisfying (5.2.22).
We next use the Heisenberg system $(\mathcal{S}, \Gamma)$ and the canonical polarization to define differential operators on the vector space $V = S(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$ of this representation. We assume that the lattice $\Gamma$ is integral even. This assumption allows that the spaces $\bar{\mathcal{S}}_-$ and $\bar{\mathcal{S}}_+$, which are completions of $\mathcal{S}_-$ and $\mathcal{S}_+$, are respectively equal to $\mathcal{S}_+^*$ and $\mathcal{S}_-^*$ (the duals of $\mathcal{S}_+$ and $\mathcal{S}_-$). The bilinear form $\psi$ on $\mathcal{S}_- \oplus \mathcal{S}_+$ can be extended to $\bar{\mathcal{S}}_- \oplus \mathcal{S}_+$ as well as to $\mathcal{S}_- \oplus \bar{\mathcal{S}}_+$. Further, using $\bar{S}(\mathcal{S}_-)$, which is the completion of the symmetric algebra $S(\mathcal{S}_-)$, we set the vector space
$\bar{V} = \bar{S}(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$    (5.4.20)
which gives the canonical imbedding $V \subset \bar{V}$.
Definition 5.4.10: A differential operator on V is a linear map from $V$ to $\bar{V}$. Here are a few examples of differential operators. For any $x \in V$ the left multiplication denoted $L_x$ is a differential operator. The tensor products:
$\partial_p \otimes 1$, $p \in \mathcal{S}_+$;  $1 \otimes \partial_h$, $h \in \mathcal{H}$;  $1 \otimes \hat{T}_\gamma$, $1 \otimes \partial_\gamma$, $\gamma \in \Gamma$    (5.4.21)
are also examples of differential operators. These operators are generally denoted (in the literature) $\partial_p$, $\partial_h$, $\hat{T}_\gamma$ and $\partial_\gamma$ respectively (when there is no cause for confusion). Likewise the elements $q \otimes 1$ and $1 \otimes e^{\gamma}$ are written simply as $q$ and $e^{\gamma}$. With these basic differential operators one can construct other differential operators. We shall return to these in the section on vertex operators, but first we have to familiarize ourselves with two more operators, the creation and annihilation operators. We study them in brief in the next section.
22. Note that $\mathcal{S}$ is $\hat{g}_{\mathcal{H}}$ in Sec. (5.2) (see Eq. (5.2.3)).
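Exercise 5.4
1. Let $P = \mathbb{C}[x_1, x_2, \ldots]$ be the algebra of polynomials in infinitely many variables $x_i$ and let $\bar{P}$ denote the formal completion of P (i.e., the algebra of all formal, possibly infinite, linear combinations of (finite) monomials in the $x_i$). Suppose that $D : P \to \bar{P}$ is a linear map. Then show that the following three statements are valid:
(a) If $[x_i, D] = \alpha_i D$ ($\alpha_i \in \mathbb{C}$, $i = 1, 2, \ldots$), then $D = D(1) \exp\left(-\sum_i \alpha_i \frac{\partial}{\partial x_i}\right)$.
(b) If $\left[\frac{\partial}{\partial x_i}, D\right] = \beta_i D$ ($i = 1, 2, \ldots$), then $D(1) = c \exp\left(\sum_i \beta_i x_i\right)$ where $c \in \mathbb{C}$.
(c) If $[x_i, D] = \alpha_i D$ and $\left[\frac{\partial}{\partial x_i}, D\right] = \beta_i D$, then
$D = c \exp\left(\sum_j \beta_j x_j\right) \exp\left(-\sum_j \alpha_j \frac{\partial}{\partial x_j}\right)$ for some $c \in \mathbb{C}$.
The operator D given in (c) is called a vertex operator.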
Hints to Exercise 5.4
1. Recall that a linear map D can be a differential operator, e.g.,
$\sum p_{i_1, \ldots, i_k} \frac{\partial}{\partial x_{i_1}} \cdots \frac{\partial}{\partial x_{i_k}}$, where $p_{i_1, \ldots, i_k} \in \bar{P}$;
it can be a multiplication by an element $p \in \bar{P}$, or it can be an operator $T_\alpha$ defined as:
$T_\alpha(f(x_1, x_2, \ldots)) = f(x_1 + \alpha_1, x_2 + \alpha_2, \ldots)$, $f \in P$    (i)
where $\alpha$ stands for $(\alpha_1, \alpha_2, \ldots)$ with $\alpha_i \in \mathbb{C}$. Using Taylor's formula for the function f on the RHS we note that
$T_\alpha = \exp \sum_i \alpha_i \frac{\partial}{\partial x_i}$ and $T_0 = 1$.    (ii)
We shall use this operator to prove (a). Replace D by $DT_\alpha$. Since $[x_i, T_\alpha] = -\alpha_i T_\alpha$, the 'if part' becomes $[x_i, DT_\alpha] = 0$, whereas the 'then part' becomes $DT_\alpha = D(1)$ (note that $T_\alpha(1) = 1$). It therefore suffices to treat the case $\alpha = 0$, in which the proposition reads: if $[x_i, D] = 0$ for $i = 1, 2, \ldots$, then $D = D(1)$. But in that case $D(f) = D(1) f$ for every polynomial f in P (by induction on the degree of f), hence the result.
To prove (b) we replace D by $\exp\left(-\sum_j \beta_j x_j\right) D$ and obtain: if
$\left[\frac{\partial}{\partial x_i}, \exp\left(-\sum_j \beta_j x_j\right) D\right] = 0$    (iii)
then $\exp\left(-\sum_j \beta_j x_j\right) D(1) = \text{const}$.
Using the same argument as in (a), it suffices to choose $\beta = (\beta_1, \beta_2, \ldots)$ to be zero, and the statement becomes: if $\left[\frac{\partial}{\partial x_i}, D\right] = 0$ for $i = 1, 2, \ldots$, then $D(1) = \text{const}$. This follows from the fact that $\left[\frac{\partial}{\partial x_i}, D\right] = 0$ implies $\frac{\partial}{\partial x_i}(D(1)) = 0$ ($i = 1, 2, \ldots$), i.e., $D(1) = \text{const}$; hence the statement in (b) holds good. Part (c) follows by taking together (a) and (b) and substituting the value of D(1) in (a) from that of (b).
5
CREATION AND ANNIHILATION OPERATORS
Processes of creation and annihilation that can be compared with the present-day concept in physics have been familiar in mathematics for almost three centuries. For example, integration and differentiation on a space of polynomials have the effect of creating and destroying by adding to, and by subtracting from, the powers of monomials. In physics these processes are usually associated with pair production and removal (of a particle). We describe these operators here as they are of great use in particle physics as well as in string theory. Our introduction in Subsec. 2 is based on the linear harmonic oscillator of a particle dependent on one coordinate (see Chapter 5 of 9.[24]).
5.1 Creation and Annihilation Operators on Fock Spaces
Classically, the spaces on which these operators act are Fock spaces. For instance using the basis vector (5.3.14), we can define a creation operator:
$a_i^* \,|n_1, n_2, \ldots, n_{i-1}, n_i, n_{i+1}, \ldots\rangle \sim |n_1, n_2, \ldots, n_{i-1}, n_i + 1, n_{i+1}, \ldots\rangle$    (5.5.1)
which adds to the basis state with quantum number $k_i$ one more particle. Similarly from the definition of a Hermitian adjoint operator, it follows that an annihilation operator $a_i$ can be defined such that
$a_i \,|n_1, n_2, \ldots, n_{i-1}, n_i, n_{i+1}, \ldots\rangle \sim |n_1, n_2, \ldots, n_{i-1}, n_i - 1, n_{i+1}, \ldots\rangle$    (5.5.2)
Thus the operator $a_i$ removes from the basis state with quantum number $k_i$ one particle. Evidently the effect of these operators on the vacuum state $\psi^{(0)}$ and the one-particle state $\psi_i^{(1)}$ described in (5.3.15) would be:
$a_i^* \psi^{(0)} = \psi_i^{(1)}$,  $a_i \psi_i^{(1)} = \psi^{(0)}$    (5.5.3)
Since the vacuum state contains no particle to be destroyed, we postulate that:
$a_i \psi^{(0)} = 0$,  $a_j \psi_i^{(1)} = 0$ $(j \neq i)$  $\forall\, i, j \in \mathbb{Z} \setminus \{0\}$.    (5.5.4)
5.2
Hamiltonian in Terms of Creation and Annihilation Operators
With the introductory background of creation and annihilation operators given above, we define them now in the simplest form. For this purpose we use the quantized version of the Hamiltonian for the harmonic oscillator, with potential energy $V(q) = \frac{1}{2} w^2 q^2$; w here is a positive constant and q is the position vector of the particle. The Hamiltonian $H = \frac{1}{2} p^2 + \frac{1}{2} w^2 q^2$ (with unit mass and p as the momentum) in quantized form becomes
$H = \frac{1}{2} P^2 + \frac{1}{2} w^2 Q^2$    (5.5.5)
where P and Q are operators and satisfy by assumption (see footnote 23):
$[P, Q] = -i$    (5.5.6)
We now define the creation and annihilation operators in this case as:
$a^* = \frac{1}{\sqrt{2}} (wQ - iP)$,  $a = \frac{1}{\sqrt{2}} (wQ + iP)$    (5.5.7)
The operators $a^*$ and $a$ satisfy the following relations:
(i) $[a, a^*] = w$
(ii) $H = \frac{1}{2} (a a^* + a^* a) = a^* a + \frac{1}{2} w$    (5.5.8)
They act on the Fock space of polynomials $V = \mathbb{C}[a^*]$ by multiplication and differentiation; evidently the operator $a$ can be viewed as $w \frac{d}{da^*}$.
By taking a simple example from string theory, we shall see how these operators can be constructed with ease. Let $x(t, \theta)$ denote the position of a (closed) string in the compactified space $S^1 = \mathbb{R}/2\pi\mathbb{Z}$, where $\theta \in [0, 2\pi]$ and $t \in \mathbb{R}$ are the parameters; it is assumed that $x(t, 0) = x(t, 2\pi)$. Consider now the Hamiltonian:
$H = \frac{1}{2\pi} \int_0^{2\pi} \left\{ \frac{1}{2} \left( \frac{\partial x}{\partial t} \right)^2 + \frac{1}{2} \left( \frac{\partial x}{\partial \theta} \right)^2 \right\} d\theta$    (5.5.9)
and the following Fourier expansion for $x(t, \theta)$ in order to simplify it:
$x(t, \theta) = q_0(t) + \sum_{n=1}^{\infty} q_n(t) \sqrt{2} \cos n\theta + \sum_{n=1}^{\infty} q_{-n}(t) \sqrt{2} \sin n\theta + k\theta$    (5.5.10)
It can be checked (after some calculations, see Hint to Exercise 2) that H can be written as:
$H = \sum_{n \neq 0} \frac{1}{2} (\dot{q}_n^2 + n^2 q_n^2) + \frac{1}{2} \dot{q}_0^2 + \frac{1}{2} k^2$.    (5.5.11)
23. Planck's constant has been taken as unity.
The/: in (5.5.10) and (5.5.11) belongs to Z. Using the creation and annihilation operators a*n, an (ne Z), the Hamiltonian can be formally expressed as: H
= \ X (a*, an + an an) + i - P02 + \ 1
n*0
l
K2
(5.5.12)
l
on L 2 (S')—the set of square iptegrable
where Po is the momentum operator defined as Po - -i
dd functions on Sl; and K is a position operator which acts as K • e" = n en on C [Z]—the group algebra of Z. The operators a* and an act as multiplication and differentiation operators respectively. In fact an = \n\ —— and their Lie bracket [an, an] equals \n\. This implies that H of (5.5.12) can be written as:
H= X (anan+\\n\) + \{Pl+K2)
(5.5.13)(a)
F r o m the a b o v e d i s c u s s i o n s it f o l l o w s that t h e s p a c e o f a c t i o n o f H s h o u l d b e t h e F o c k s p a c e : V= C[a*, a*2, 03, ..., a * , , a * 2 , a!. 3> . . . ] ® L2{SX)
® C [Z]
b u t w e n o t e t h a t H d e f i n e d i n ( 5 . 5 . 1 3 ) ( a ) c a n n o t b e a n o p e r a t o r o n V, s i n c e HI-
—^ | n | - l = °o-i ^ n*0
does not converge. (Note that 1 is the vacuum vector in V (see Result (5.4.8)). This problem of nonconvergence, however, can be circumvented by replacing 4"M by —[V • Here —^ is the value of the Riemann zeta function, the analytic continuation of
n=\
n
at s = - 1 (see [12]). Thus it is the renormalized form of H: H™= I
anan+ i-P02+ \K2-
-L
(5.5.13)(b)
that one uses as an operator on V. For more details see [14]. We shall use these ideas in what follows next.
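The value $-\frac{1}{12}$ can also be made plausible numerically. The sketch below does not perform the analytic continuation referred to in the text; instead it assumes an exponential damping factor (a different, commonly used regularization chosen here purely for illustration) and shows that the finite part of $\sum_{n \geq 1} n\, e^{-\epsilon n}$ approaches $-\frac{1}{12}$ once the divergent piece $1/\epsilon^2$ is removed. The function names and the truncation length are assumptions of this example.

```python
import math


def damped_sum(eps, terms=20000):
    """Truncated sum of n * exp(-eps * n) for n = 1 .. terms."""
    return sum(n * math.exp(-eps * n) for n in range(1, terms + 1))


def finite_part(eps):
    """Damped sum with its divergent 1/eps^2 piece subtracted."""
    return damped_sum(eps) - 1.0 / eps ** 2


for eps in (0.1, 0.05, 0.01):
    # The printed values drift towards -1/12 = -0.08333... as eps shrinks.
    print(eps, finite_part(eps))
```

The closed form of the damped sum is $1/(4 \sinh^2(\epsilon/2)) = 1/\epsilon^2 - 1/12 + \epsilon^2/240 - \ldots$, so the deviation from $-\frac{1}{12}$ is of order $\epsilon^2$.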
5.3
Operators on the Fock Space Associated to Heisenberg System
Returning to the Heisenberg system $(\mathcal{S}, Q)$ associated with an integral even lattice Q, we now define the operators on the (Fock) space
$V_Q = S(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[Q]$,
the vector space which gives the canonical representation of the system associated with a cocycle $\varepsilon$. The creation operator for $h \in \mathcal{H}$ and $n > 0$ is denoted $h(-n)$ and equals (see footnote 24):
$t^{-n} \otimes h$    (5.5.14)
The annihilation operator denoted $h(n)$ equals:
$\partial_{(t^{n} \otimes h)}$,  $h \in \mathcal{H}$, $n > 0$    (5.5.15)
and operates as follows:
$\partial_{(t^{n} \otimes h)} (t^{-m} \otimes h') = n\, \delta_{n,m}\, (h, h')$    (5.5.16)
Sometimes when the element h of $\mathcal{H}$ is $h_\gamma$ for $\gamma \in Q$ and $(h, h_\gamma) = \gamma(h)$, the annihilation operator $h_\gamma(n)$ is simply denoted as $\gamma(n)$. For $h \in \mathcal{H}$ we also define the operator:
$h(0) = 1 \otimes \partial_h$, where $\partial_h(e^{\gamma}) = \gamma(h)\, e^{\gamma}$, $\gamma \in Q$ and $e^{\gamma} \in \mathbb{C}[Q]$    (5.5.17)
For $\gamma \in Q$, two more operators can be defined:
$\partial_\gamma \equiv 1 \otimes \partial_\gamma$    (5.5.18)
and
$C_\gamma \equiv 1 \otimes \hat{T}_\gamma$    (5.5.19)
Note that in the previous section we showed that $V_Q$ (denoted $V_\Gamma$ there) can be considered as a subspace of a larger Fock space. In fact each of those spaces, and V in particular, admits a $\mathbb{Z}_+$-gradation, for instance:
$V = \bigoplus_{n \geq 0} V_{-n}$
This gradation follows from:
(a) $\deg(t^{-n} \otimes h) = -n$,  $n = 1, 2, \ldots$
(b) $\deg(e^{\gamma}) = -\frac{1}{2} (\gamma, \gamma)$,  $\gamma \in Q$    (5.5.20)
On each of these $V_{-n}$'s we can define an operator known as the energy operator as follows:
$D_0(v) = -n v$ for $v \in V_{-n}$    (5.5.21)
24. See footnote 19.
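We shall use these operators along with those introduced in the previous section to define vertex operators in the next section. We use them now to define the canonical action of the Heisenberg algebra on the Fock space $V = S(\mathcal{S}_-) \otimes \mathbb{C}[\Gamma]$. We use x to denote an arbitrary element of $S(\mathcal{S}_-)$ and $x \otimes e^{\lambda}$ for an arbitrary element of V. The action by the Heisenberg system $(\mathcal{S}, \Gamma)$ is indeed the action by the operators defined in (5.5.14)-(5.5.19). We thus have:
$h(-n) \cdot x \otimes e^{\lambda} = (h(-n) \cdot x) \otimes e^{\lambda}$,  $n > 0$    (5.5.22)
$h(n) \cdot x \otimes e^{\lambda} = n \frac{\partial x}{\partial h(-n)} \otimes e^{\lambda}$,  $n > 0$    (5.5.23)
$h(0) \cdot x \otimes e^{\lambda} = x \otimes (\partial_h e^{\lambda}) = (h, \lambda)\, x \otimes e^{\lambda}$    (5.5.24)
$\lambda' \cdot x \otimes e^{\lambda} = \varepsilon(\lambda', \lambda)\, x \otimes e^{\lambda + \lambda'}$    (5.5.25)
$c \cdot x \otimes e^{\lambda} = x \otimes e^{\lambda}$    (5.5.26)
where $\lambda, \lambda' \in \Gamma$ and c is a central element of $\mathcal{S}$. The above equations (considered together) imply the following:
$\lambda' \cdot (\lambda \cdot y) = \varepsilon(\lambda', \lambda)\, (\lambda' + \lambda) \cdot y$
$\lambda' \cdot (h(0) \cdot (\lambda'^{-1} \cdot y)) = (\lambda'(h(0))) \cdot y$
$\lambda' \cdot (\lambda \cdot (\lambda'^{-1} \cdot (\lambda^{-1} \cdot y))) = \varepsilon(\lambda', \lambda)\, \varepsilon(\lambda, \lambda')^{-1}\, y$    (5.5.27)
where $y \in V$ and $\lambda, \lambda' \in \Gamma$.
This action of the Heisenberg algebra can also be defined on double Fock spaces by taking the product $(\mathcal{S} \times \mathcal{S}')$ between the algebra $\mathcal{S}$ and its copy denoted $\mathcal{S}'$. We give below a brief description of this idea. Let the prime $'$ denote the copy of an object; thus $\mathcal{S}'_-$ is a copy of $\mathcal{S}_-$, $\mathbb{C}[\Gamma]'$ is a copy of $\mathbb{C}[\Gamma]$, and $\mathbb{C}[\Gamma']'$ is a copy of $\mathbb{C}[\Gamma']$, where $\Gamma'$ is the lattice dual to $\Gamma$. The Fock spaces with primed objects are denoted with a bar for distinction (see [14]). Thus
(a) $\bar{V}_\Gamma = S(\mathcal{S}'_-) \otimes \mathbb{C}[\Gamma]'$
(b) $\bar{V}_{\Gamma'} = S(\mathcal{S}'_-) \otimes \mathbb{C}[\Gamma']'$
(c) $\bar{V}(\lambda) = S(\mathcal{S}'_-) \otimes \mathbb{C}[\Gamma + \lambda]'$, where $\lambda \in \Gamma'$.    (5.5.28)
In fact we use only (b) to form the tensor product $V_{\Gamma'} \otimes \bar{V}_{\Gamma'}$, which we identify with (see footnote 25):
$\mathrm{Sym}(\mathcal{S}_+ \oplus \mathcal{S}_-) \otimes \sum_{\lambda, \mu \in \Gamma'} \mathbb{C}\, e^{(\lambda, \mu)}$
in the following manner:
$\lambda_1(-n_1) \lambda_2(-n_2) \cdots \lambda_r(-n_r)\, e^{\lambda} \otimes \mu_1(-m_1)' \mu_2(-m_2)' \cdots \mu_s(-m_s)' (e^{\mu})' \mapsto \lambda_1(-n_1) \lambda_2(-n_2) \cdots \lambda_r(-n_r)\, \mu_1(m_1) \mu_2(m_2) \cdots \mu_s(m_s)\, e^{(\lambda, \mu)}$    (5.5.29)
The space $U_{\Gamma'}$ defined below:
$U_{\Gamma'} = \mathrm{Sym}(\mathcal{S}_+ \oplus \mathcal{S}_-) \otimes \mathbb{C}[\Gamma'] = \mathrm{Sym}(\mathcal{S}_+ \oplus \mathcal{S}_-) \otimes \sum_{\lambda \in \Gamma'} \mathbb{C}\, e^{(\lambda, \lambda)} \subset V_{\Gamma'} \otimes \bar{V}_{\Gamma'}$    (5.5.30)
is a complex vector space, and is called the double Fock space of $\Gamma'$. The canonical action of the Heisenberg algebra $\mathcal{S} \times \mathcal{S}'$ on the space $U_{\Gamma'}$, in view of equations (5.5.22)-(5.5.26), can be written down as follows:
25. $e^{(\lambda, \mu)}$ and $e^{(\lambda, \lambda)}$ denote the double exponential series. We use unprimed letters $\lambda$, $\mu$ etc. even though they belong to $\Gamma'$.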
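(a) $h(-n) \cdot u \otimes e^{(\lambda, \mu)} = (h(-n) u) \otimes e^{(\lambda, \mu)}$
(b) $h(0) \cdot u \otimes e^{(\lambda, \mu)} = (h, \lambda)\, u \otimes e^{(\lambda, \mu)}$
(c) $h(n) \cdot u \otimes e^{(\lambda, \mu)} = n \frac{\partial u}{\partial h(-n)} \otimes e^{(\lambda, \mu)}$    (5.5.31)
and
(a) $h(-n)' \cdot u \otimes e^{(\lambda, \mu)} = (h(n) u) \otimes e^{(\lambda, \mu)}$
(b) $h(0)' \cdot u \otimes e^{(\lambda, \mu)} = (h, \mu)\, u \otimes e^{(\lambda, \mu)}$
(c) $h(n)' \cdot u \otimes e^{(\lambda, \mu)} = n \frac{\partial u}{\partial h(n)} \otimes e^{(\lambda, \mu)}$    (5.5.32)
where $u \in \mathrm{Sym}(\mathcal{S}_+ \oplus \mathcal{S}_-)$ and $n > 0$. Evidently the central elements c and c' act as identity operators on the space. The difference between the action equations (5.5.31) and (5.5.32), resulting from the Heisenberg algebra $\mathcal{S}$ and its copy $\mathcal{S}'$, is worth noting. In any case this was expected in view of the identifying relation (5.5.29). We note that the operator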
is defined by dh(n)
- | ( ^ 7 = (m, n) | n \8n m an (n)
$h(n) = h \otimes t^{n}$,  $j(m) = j \otimes t^{m}$;  $h, j \in \mathcal{H}$    (5.5.33)
Also
$\frac{\partial (vw)}{\partial h(n)} = w \frac{\partial v}{\partial h(n)} + v \frac{\partial w}{\partial h(n)}$    (5.5.34)
In the above equations m, n stand for arbitrary non-zero integers, positive or negative. Although we do not use the double Fock spaces explicitly, we would like to remark that they are used in superalgebras and superstring theory, the final objective of our text (see [11] for details). We return to vertex operators in the next section.
Exercise 5.5
1. Write down the exponential analogue of (5.5.10) and obtain $\frac{\partial x}{\partial t}$ and $\frac{\partial x}{\partial \theta}$.
2. Using the Fourier expansion (5.5.10) for $x(t, \theta)$ establish the Hamiltonian H as given in (5.5.11).
3. Use $(\mu_0, \ldots, \mu_r)$ to represent $\Gamma'/\Gamma$, where $\Gamma$ and $\Gamma'$ are the root lattices given in Subsec. (5.4.2). Set $\mu_0 = 0$. Then show that the Fock space $V_{\Gamma'}$ decomposes as a direct sum of (r + 1) subspaces:
(a) $V_{\Gamma'} = V(\mu_0) \oplus \cdots \oplus V(\mu_r)$
In the case of $\Gamma$ being a root lattice, $\Gamma'$ becomes a weight lattice, and (a) gives the decomposition in terms of Fock spaces formed with the help of the weights of the algebra.
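Hints to Exercise 5.5
2. We differentiate (5.5.10) with respect to t and $\theta$ to obtain:
$\frac{\partial x}{\partial t} = \dot{q}_0 + \sum_{n=1}^{\infty} \dot{q}_n \sqrt{2} \cos n\theta + \sum_{n=1}^{\infty} \dot{q}_{-n} \sqrt{2} \sin n\theta$    (i)
$\frac{\partial x}{\partial \theta} = -\sqrt{2} \sum_{n=1}^{\infty} n q_n \sin n\theta + \sqrt{2} \sum_{n=1}^{\infty} n q_{-n} \cos n\theta + k$    (ii)
Substituting these in (5.5.9) we have after some simplification (the summation indices j, l run over the positive integers):
$H = \frac{1}{4\pi} \int_0^{2\pi} \Big[ \dot{q}_0^2 + 2 \sum_{n=1}^{\infty} \dot{q}_n^2 \cos^2 n\theta + 2 \sum_{n=1}^{\infty} \dot{q}_{-n}^2 \sin^2 n\theta + 2 \dot{q}_0 \sum_{n=1}^{\infty} \sqrt{2}\, \dot{q}_n \cos n\theta + 2 \dot{q}_0 \sum_{n=1}^{\infty} \sqrt{2}\, \dot{q}_{-n} \sin n\theta$
$\quad + 4 \sum_{j=1}^{\infty} \sum_{l > j} \{ (\cos j\theta \cos l\theta)\, \dot{q}_j \dot{q}_l + (\sin j\theta \sin l\theta)\, \dot{q}_{-j} \dot{q}_{-l} \} + 4 \sum_{j=1}^{\infty} \sum_{l=1}^{\infty} \dot{q}_j \dot{q}_{-l} \cos j\theta \sin l\theta \Big]\, d\theta$
$\quad + \frac{1}{4\pi} \int_0^{2\pi} \Big[ 2 \sum_{n=1}^{\infty} n^2 q_n^2 \sin^2 n\theta + 2 \sum_{n=1}^{\infty} n^2 q_{-n}^2 \cos^2 n\theta + k^2 + 2k \sum_{n=1}^{\infty} \sqrt{2}\, (-n q_n \sin n\theta + n q_{-n} \cos n\theta)$
$\quad - 4 \sum_{j=1}^{\infty} \sum_{l=1}^{\infty} j q_j (\sin j\theta)\, l q_{-l} (\cos l\theta) + 4 \sum_{j=1}^{\infty} \sum_{l > j} \{ j q_j (\sin j\theta)\, l q_l (\sin l\theta) + j q_{-j} (\cos j\theta)\, l q_{-l} (\cos l\theta) \} \Big]\, d\theta$.    (iii)
The first three terms of the above integral, which correspond to $\left( \frac{\partial x}{\partial t} \right)^2$, can be written as:
$\frac{1}{4\pi} \int_0^{2\pi} \Big[ \dot{q}_0^2 + \sum_{n=1}^{\infty} \dot{q}_n^2 (\cos 2n\theta + 1) + \sum_{n=1}^{\infty} \dot{q}_{-n}^2 (1 - \cos 2n\theta) \Big]\, d\theta = \frac{1}{2} \dot{q}_0^2 + \frac{1}{2} \sum_{n=1}^{\infty} \dot{q}_n^2 + \frac{1}{2} \sum_{n=1}^{\infty} \dot{q}_{-n}^2$    (iv)
as $\int_0^{2\pi} \cos 2n\theta\, d\theta = 0$. The next five terms, each of which involves derivatives of $q_n$ or $q_{-n}$ and $q_0$ with respect to t, are zero when integrated. Next we note that the first three terms in the second square bracket, corresponding to $\left( \frac{\partial x}{\partial \theta} \right)^2$, are the only terms which contribute to a non-zero part:
$\frac{1}{4\pi} \int_0^{2\pi} \Big\{ \sum_{n=1}^{\infty} n^2 q_n^2 (1 - \cos 2n\theta) + \sum_{n=1}^{\infty} n^2 q_{-n}^2 (1 + \cos 2n\theta) + k^2 \Big\}\, d\theta = \frac{1}{2} \sum_{n=1}^{\infty} n^2 q_n^2 + \frac{1}{2} \sum_{n=1}^{\infty} n^2 q_{-n}^2 + \frac{k^2}{2}$.    (v)
The rest of the terms are zero. Adding (iv) and (v) we obtain (5.5.11).
3. Use Def. (5.4.3) and Eq. (5.4.12) to obtain the solution (see also Sec. 1-A in [14]).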
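The reduction of (5.5.9) to (5.5.11) can be cross-checked numerically for a finite number of modes. The sketch below evaluates the integral (5.5.9) at a fixed time by a uniform Riemann sum (which is exact, up to rounding, for trigonometric polynomials of low degree) and compares it with the mode sum (5.5.11). The dictionaries `q` and `qdot`, keyed by the mode index n, and the sample values are assumptions made for this illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def hamiltonian_integral(q, qdot, k, samples=256):
    """(1/2pi) * integral over theta of (x_t^2 + x_theta^2)/2, cf. (5.5.9)."""
    total = 0.0
    for s in range(samples):
        theta = 2.0 * math.pi * s / samples
        x_t = qdot.get(0, 0.0)      # zero-mode velocity
        x_theta = float(k)          # winding term from k * theta
        for n, v in qdot.items():
            if n > 0:
                x_t += v * SQRT2 * math.cos(n * theta)
            elif n < 0:
                x_t += v * SQRT2 * math.sin(-n * theta)
        for n, v in q.items():
            if n > 0:
                x_theta -= n * v * SQRT2 * math.sin(n * theta)
            elif n < 0:
                x_theta += (-n) * v * SQRT2 * math.cos(-n * theta)
        total += 0.5 * x_t ** 2 + 0.5 * x_theta ** 2
    return total / samples


def hamiltonian_modes(q, qdot, k):
    """Right-hand side of (5.5.11) for finitely many non-zero modes."""
    kinetic = sum(0.5 * v * v for n, v in qdot.items() if n != 0)
    potential = sum(0.5 * n * n * v * v for n, v in q.items() if n != 0)
    return kinetic + potential + 0.5 * qdot.get(0, 0.0) ** 2 + 0.5 * k ** 2


q = {1: 0.3, -1: -0.7, 2: 1.1, -3: 0.4}      # sample mode amplitudes
qdot = {0: 0.5, 1: -0.2, -2: 0.9}            # sample mode velocities
print(hamiltonian_integral(q, qdot, 2), hamiltonian_modes(q, qdot, 2))
```

Note that the zero mode $q_0$ itself does not enter H, only its velocity does, which is why it plays the role of a free coordinate on $S^1$.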
6
THE VERTEX AND VIRASORO OPERATORS
We devote this section to vertex and Virasoro operators. Both these operators were discovered by physicists; for instance the vertex operator D defined in terms of exponentials in Sec. 4 (Exercise 5.4) was used in dual string theory, and prior to that a similar operator (a fermionic field operator) for the soliton in the sine-Gordon model was used by Skyrme [13]. Similarly the Virasoro operator was used in looking at the conformal invariance in string theory. The commutation relations of these operators correspond to the re-parametrization and conformal symmetry of the string (see Chapter 11). In recent years it has been found that they are extremely useful in the understanding and development of infinite-dimensional algebras as well as string theory. These operators have therefore led to a strong interplay between mathematics and physics. One now talks of different types of formulations of algebras (e.g., a spinor construction) and of representations of algebras (e.g., untwisted and twisted) associated with these operators, and examines their relationship with other operators and algebras (see [4]). Due to our limited scope, we content ourselves with the basics of these operators. Detailed accounts of these topics can be found in references [9] and [6], where most of the research papers have been reprinted (see also [3]).
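6.1 The Vertex Operators
Before we write down a vertex or Virasoro operator in a general form, we state two simple results based on differential operators on the vector space $V = S(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$. Recall that we obtained these operators in Sec. 4 and Sec. 5. In Sec. 4 we used the Heisenberg system $(\mathcal{S}, \Gamma)$ with a fixed polarization and a representation $\pi = \{\pi_1, \pi_2\}$ associated to a cocycle $\varepsilon$ to do so (see Result (5.4.9) and Definition (5.4.10)), and in Sec. 5 we obtained them on the Fock space $V_Q$, Q being an integral even lattice (see Subsec. 5.3).
Result 5.6.1: Let D denote the differential operator from $V \to \bar{V}$:
$D \equiv \exp(\partial_u + (\ln z)\, \partial_\gamma)$    (5.6.1)
where $u \in \bar{\mathcal{S}}_+$, $z \in \mathbb{C} \setminus 0$ and $\gamma \in \Gamma$ (see footnote 26). Then using Taylor's formula and the defining equations (5.4.15) and (5.4.17) it follows that:
$D(v) = v + \psi(u, v)$,  $D(e^{\gamma'}) = z^{(\gamma, \gamma')} e^{\gamma'}$    (5.6.2)
where $v \in \mathcal{S}_-$, $\gamma' \in \Gamma$ and $\psi$ is the alternate bilinear form defined in Sec. 4. (Note that $D(e^{\gamma'}) = \exp((\ln z)(\gamma, \gamma'))\, e^{\gamma'} \equiv z^{(\gamma, \gamma')} e^{\gamma'}$.)
Result 5.6.2: Let $A : V \to \bar{V}$ be a differential operator for which there exist $u_0 \in \bar{\mathcal{S}}_+$, $v_0 \in \bar{\mathcal{S}}_-$, $\gamma, \gamma' \in \Gamma$ and $z \in \mathbb{C} \setminus 0$ such that:
(a) $[\partial_u, A] = \psi(u, v_0)\, A$,  $u \in \mathcal{S}_+$
(b) $[L_v, A] = \psi(u_0, v)\, A$,  $v \in \mathcal{S}_-$
(c) $[\partial_{\gamma''}, A] = (\gamma, \gamma'')\, A$,  $\gamma'' \in \Gamma$
(d) $\hat{T}_{\gamma''}\, A\, \hat{T}_{\gamma''}^{-1} = z^{(\gamma', \gamma'')} A$,  $\gamma'' \in \Gamma$.    (5.6.3)
Then the operator A equals:
$a\, \hat{T}_\gamma\, (\exp v_0)\, \exp(-((\ln z)\, \partial_{\gamma'} + \partial_{u_0}))$, where $a \in \mathbb{C}$.    (5.6.4)
We use the operator D of Eq. (5.6.1) and the annihilation operator given in (5.5.15) to define the vertex operator.
Definition 5.6.3: Let $V = S(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$ and $\bar{V} = \bar{S}(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$ be the vector spaces associated to the Heisenberg system $(\mathcal{S}, \Gamma)$ and let D, $\gamma(n)$ ($= h_\gamma(n)$) be the differential operators on V. Then the operator:
$X(\gamma, z) = \exp\left( \sum_{n \geq 1} \frac{z^n}{n} \gamma(-n) \right) \exp(\gamma + (\ln z)\, \partial_\gamma) \exp\left( -\sum_{n \geq 1} \frac{z^{-n}}{n} \gamma(n) \right)$    (5.6.5)
26. $\bar{V} = \bar{S}(\mathcal{S}_-) \otimes \mathbb{C}[\Gamma]$ (see Eq. (5.4.20)).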
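for $z \in \mathbb{C} \setminus 0$ and $\gamma \in \Gamma$ is the so-called vertex operator from $V$ to $\bar{V}$. The following facts about this operator are easy to check:
Fact 5.6.4: (a) The middle term can be written as $z^{\frac{1}{2}(\gamma, \gamma)}\, e^{\gamma} \exp((\ln z)\, \partial_\gamma) = z^{\frac{1}{2}(\gamma, \gamma)}\, e^{\gamma} \sum_{r \in \mathbb{Z}} z^r A_r$, where
$A_r(e^{\gamma'}) = \delta_{r, (\gamma, \gamma')}\, e^{\gamma'}$    (5.6.6)
(b) Although $X(\gamma, z)$ is an operator from $V$ to $\bar{V}$, it can be developed as a power series in z:
$X(\gamma, z) = \sum_{r \in \mathbb{Z}} X_r(\gamma)\, z^{-r}$    (5.6.7)
involving operators $X_r(\gamma)$ ($r \in \mathbb{Z}$) which map V into itself. The operator $X(\gamma, z)$ is an analytic operator (as a function of z); accordingly it is natural to talk about its residue. Indeed the operators $X_r(\gamma)$ of (5.6.7) are the operators that are obtained by considering the residue:
$X_r(\gamma) \cdot v = \mathrm{Res}_{z=0}\, (X(\gamma, z)\, v \cdot z^{r-1})$,  $v \in V$    (5.6.8)
By the very definition of $X(\gamma, z)$ (a product of exponentials), it is evident that the RHS has a finite pole at z = 0, and therefore $X_r(\gamma)$ is well defined. Again using the cocycle $\varepsilon$, we can write the operators depending on the cocycle, thus (see footnote 27):
$X^{\varepsilon}(\gamma, z) = X(\gamma, z)\, \varepsilon_\gamma$,  $X_r^{\varepsilon}(\gamma) = X_r(\gamma)\, \varepsilon_\gamma$    (5.6.9)
These can be further used to define commutation and anticommutation relations between the vertex operators. For this purpose we set (see footnote 28)
$X^{\varepsilon}(\gamma, \gamma'; z, z_0) = {:}X(\gamma, z)\, X(\gamma', z_0){:}\ \varepsilon_{\gamma + \gamma'}$,    (5.6.10)
and state a simple result:
Result 5.6.5: Let $\gamma, \gamma' \in \Gamma$ and $v \in V$, then
$X^{\varepsilon}(\gamma, z)\, X^{\varepsilon}(\gamma', z_0) = \varepsilon(\gamma, \gamma')\, (z - z_0)^{(\gamma, \gamma')}\, (z z_0)^{-(\gamma, \gamma')/2}\, X^{\varepsilon}(\gamma, \gamma'; z, z_0)$    (i)
where $|z| > |z_0|$. If $\varepsilon(\gamma, \gamma')\, \varepsilon(\gamma', \gamma)^{-1} = \pm (-1)^{(\gamma, \gamma')}$, then
$[X_r^{\varepsilon}(\gamma), X^{\varepsilon}(\gamma', z_0)]_{\mp}\, v = \varepsilon(\gamma, \gamma')\, \mathrm{Res}_{z = z_0} \left( (z - z_0)^{(\gamma, \gamma')}\, (z z_0)^{-(\gamma, \gamma')/2}\, X^{\varepsilon}(\gamma, \gamma'; z, z_0)\, v\, z^{r-1} \right)$    (ii)
27. $\varepsilon_\gamma(\alpha) = \varepsilon(\gamma, \alpha)$, defined in subsection (5.2.5).
28. See Eq. (5.2.28) for the ordering convention.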
where $[\ ,\ ]_- = [\ ,\ ]$ denotes the commutator and $[\ ,\ ]_+ = \{\ ,\ \}$ denotes the anticommutator (see Prop. (1.2.15) in [4] for the proof). We now show (through the next result) how vertex operators can be used to give a representation of an affine algebra $\hat{g}$ associated to a complex simple finite-dimensional Lie algebra g. Recall that in this case $\mathcal{H}$ is a Cartan subalgebra of g, $\Delta$ is the associated root system and Q is the lattice in $\mathcal{H}$ generated by $\Delta$. The bilinear form $(\ ,\ )$ restricted to $\Delta$ satisfies $(\gamma, \gamma) = 2$. Moreover g has a Chevalley basis $\{E_\gamma\} \cup \{h_{\alpha_i}\}$, $\gamma \in \Delta$, $\alpha_i \in \Pi$, and also has an associated cocycle $\varepsilon$. Then setting
$\mathcal{S} = \sum_{k \in \mathbb{Z}} \mathcal{S}_k$, where $\mathcal{S}_k = t^k \otimes \mathcal{H}$ for $k \neq 0$,  $\mathcal{S}_0 = \mathbb{C}c + \mathcal{H}$,  $\mathcal{S}_- = \sum_{k<0} \mathcal{S}_k$  and  $V = S(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[Q]$
and using the operators $h(r)$, $\partial_\gamma$, $C_\gamma$, $D_0$ (defined in subsection (5.5.3)) and $X_r(\gamma)$, as well as the Hermitian form on V given by $(\ \mid\ )$ (see Subsec. (5.4.2)), we obtain the following (important) result (see [5] for the proof):
Result 5.6.6: The map $\pi : \hat{g} \to \mathrm{End}\, V$ given by
$\pi(c) = 1_V$,  $\pi(d) = D_0$
$\pi(t^r \otimes h) = h(r)$,  $r \in \mathbb{Z}$    (5.6.11)
$\pi(t^r \otimes E_\gamma) = X_r(\gamma)\, C_\gamma$,  $\gamma \in \Delta$
is a representation of the Lie algebra $\hat{g}$, which is equivalent to the basic representation (see Sec. 3 for the basic representation).
6.2
The Virasoro Operators
Just as in the case of vertex operators, here also we deal with a Heisenberg system with canonical representation $\{\pi_1, \pi_2\}$ associated to an even integral lattice Q, a cocycle $\varepsilon$, and the space
$V = S(\mathcal{S}_-) \otimes_{\mathbb{C}} \mathbb{C}[Q]$
on which the Virasoro operators act. These operators are defined in terms of elements of the Cartan subalgebra $\mathcal{H}$. We choose an orthonormal basis $\{a_i\}$ ($i = 1, \ldots, l$) for $\mathcal{H}$ and define them as follows:
Definition 5.6.7: The operators
$D_0 = -\sum_{r \geq 1} \sum_{i=1}^{l} a_i(-r)\, a_i(r) - \frac{1}{2} \sum_{i=1}^{l} (a_i(0))^2$    (5.6.12)
and
$D_m = -\frac{1}{2} \sum_{r \in \mathbb{Z}} \sum_{i=1}^{l} a_i(-r)\, a_i(r + m)$,  $m \in \mathbb{Z} \setminus 0$    (5.6.13)
are the well known Virasoro operators. They map V into itself. The operator $D_0$ (as should be evident from its form) is also known as the energy operator of the system. These operators satisfy the following relations with other operators, e.g., $\pi_1(s)$, $\hat{T}_\gamma$ and $X(\gamma, z)$:
$[D_m, \pi_1(s)] = \pi_1\left( t^{m+1} \frac{ds}{dt} \right)$,  $s \in \mathcal{S}$    (5.6.14)
$\hat{T}_\gamma\, D_m\, \hat{T}_\gamma^{-1} = D_m + \gamma(m) - \frac{1}{2} \delta_{m,0}\, (\gamma, \gamma)$    (5.6.15)
$[D_m, X(\gamma, z)] = -z^m \left( \frac{m}{2} (\gamma, \gamma) + z \frac{d}{dz} \right) X(\gamma, z)$    (5.6.16)
$[D_m, D_n] = (n - m)\, D_{m+n} + \frac{l}{12} (m^3 - m)\, \delta_{m, -n}$.    (5.6.17)
Equalities (5.6.14) and (5.6.15) can be proved by using the bracket identity:
$[O_1 O_2, O_3] = O_1 [O_2, O_3] + [O_1, O_3]\, O_2$    (5.6.18)
since every term of $D_m$ is a product of two operators. The defining equations required for establishing these two are respectively (5.4.19) and (5.4.18). For proving (5.6.16) and the constant term of (5.6.17), we have provided a few hints in Exercises (5.6.4) and (5.6.5). Recall that on the subalgebra $\tilde{g}$ of $\hat{g}$, we are already familiar with the action of the differential operator
$d(n) = t^{n+1} \frac{d}{dt}$.
Using the defining relations of Result (5.6.6), it can be shown that if $\pi : \hat{g} \to \mathrm{End}\, V$ is the representation of the Lie algebra $\hat{g}$, then:
$[D_m, \pi(x)] = \pi(d_m(x))$ for any $x \in \tilde{g} \subset \hat{g}$    (5.6.19)
where we are denoting $d(m)$ as $d_m$. Thus commutation with the Virasoro operator induces the operator
$t^{m+1} \frac{d}{dt} = d_m$.
Now the operators $\{d_m\}$ form an infinite-dimensional algebra denoted $\mathcal{D}$ (see Sec. 2) with the bracket relation:
$[d_m, d_l] = (l - m)\, d_{m+l}$    (5.6.20)
The relation (5.6.17) therefore can be viewed to give a central extension $\hat{\mathcal{D}}$ of the algebra $\mathcal{D}$. It is interesting to note that both these algebras have a 3-dimensional subalgebra which is isomorphic to sl(2, C). In the case of $\hat{\mathcal{D}}$ it is formed by $D_{-1}$, $D_0$, $D_1$ and in the case of
We feel, however, that in spite of these shortcomings we have given a sufficient flavour of the subject, and have provided adequate references for further in-depth study.
Exercise 5.6
1. Prove Result (5.6.1).
2. Prove Result (5.6.2).
3. The middle term of the vertex operator defined in (5.6.5) operates on $\mathbb{C}[\Gamma]$. Show that its action on $e^{\gamma'}$ in particular gives:
$z^{(\gamma, \gamma') + \frac{1}{2}(\gamma, \gamma)}\, e^{\gamma + \gamma'}$
4. Prove the relations (5.6.16) and (5.6.17) for the Virasoro operator $D_m$.
5. Find the value of $[D_m, D_{-m}](1)$ where $m > 0$.
Hints to Exercise 5.6
1. Note that elements of $\mathbb{C}[\Gamma]$ can be viewed as constants when the operator $\partial_u$ is concerned; likewise elements of $S(\mathcal{S}_-)$ can be treated as constants when the operator $\partial_\gamma$ is concerned. Hence using (5.4.15) and (5.4.17) we have the result given in Eq. (5.6.2).
2. We replace the operator A by the operator $B = \hat{T}_{-\gamma} \exp(-v_0) \cdot A \cdot \exp((\ln z)\, \partial_{\gamma'} + \partial_{u_0})$. Then our statement in (5.6.3) becomes: if $[\partial_u, A] = [L_v, A] = [\partial_{\gamma''}, A] = [\hat{T}_{\gamma''}, A] = 0$ for any $u \in \mathcal{S}_+$, $v \in \mathcal{S}_-$, $\gamma'' \in \Gamma$, then A = const. The latter statement is obvious (see also Prop. (3.1) in [8(c)]).
3. Note that the middle term of a vertex operator can be put as:
$z^{\frac{1}{2}(\gamma, \gamma)}\, e^{\gamma} \left( \sum_{r \in \mathbb{Z}} z^r A_r \right)$
where $A_r$ satisfies $A_r(e^{\gamma'}) = \delta_{r, (\gamma, \gamma')}\, e^{\gamma'}$. Accordingly we have:
$z^{\frac{1}{2}(\gamma, \gamma)}\, e^{\gamma} \left( \sum_{r \in \mathbb{Z}} z^r\, \delta_{r, (\gamma, \gamma')}\, e^{\gamma'} \right)$
In the above sum all terms except the one for which $r = (\gamma, \gamma')$ are zero, hence the result is $z^{(\gamma, \gamma') + \frac{1}{2}(\gamma, \gamma)}\, e^{\gamma + \gamma'}$.
4. To show the equality (5.6.16) we note that $X(\gamma, z)$ is a product of three exponentials, which (using (5.6.18)) can be treated individually to obtain the overall result. Thus we can write these commutation relations simply as:
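(a) $\left[ D_m, \exp \sum_{n \geq 1} \frac{z^n}{n} \gamma(-n) \right] = -\sum_{r \geq 1} z^r \gamma(m - r)\, \exp \sum_{n \geq 1} \frac{z^n}{n} \gamma(-n)$
(b) $[D_m, \exp(\gamma + (\ln z)\, \partial_\gamma)] = -\gamma(m)\, \exp(\gamma + (\ln z)\, \partial_\gamma)$.
On the other hand the differentiation of the two exponentials in (a) and (b) above gives:
(c) $-z^{m+1} \frac{d}{dz} \exp \sum_{n \geq 1} \frac{z^n}{n} \gamma(-n) = -\sum_{r \geq 1} z^{r+m} \gamma(-r)\, \exp \sum_{n \geq 1} \frac{z^n}{n} \gamma(-n)$
(d) $-z^{m+1} \frac{d}{dz} \exp(\gamma + (\ln z)\, \partial_\gamma) = -z^m \gamma(0)\, \exp(\gamma + (\ln z)\, \partial_\gamma)$.
Comparing the RHS of (a) and (b) with (c) and (d), it follows that
$[D_m, X(\gamma, z)]$ and $-z^{m+1} \frac{d}{dz} X(\gamma, z)$
differ only by m permutations of the factors. Each permutation implies the addition of a scalar factor $\frac{1}{2} z^m (\gamma, \gamma)$ and hence we have the equality:
$[D_m, X(\gamma, z)] = -z^m \left[ \frac{m}{2} (\gamma, \gamma) + z \frac{d}{dz} \right] X(\gamma, z)$.
We leave the proof of the second part for the reader.
5. Note that $[D_m, D_{-m}](1)$ for $m > 0$ simplifies to $D_m D_{-m}(1)$, since $D_{-m} D_m(1)$ equals zero. Now
(a) $D_m D_{-m} = \left( -\frac{1}{2} \sum_{r \in \mathbb{Z}} \sum_{i=1}^{l} a_i(-r)\, a_i(r + m) \right) \left( -\frac{1}{2} \sum_{s \in \mathbb{Z}} \sum_{j=1}^{l} a_j(-s)\, a_j(s - m) \right) = \frac{1}{4} \sum_{r \in \mathbb{Z}} \sum_{s \in \mathbb{Z}} \sum_{i=1}^{l} a_i(-r) \sum_{j=1}^{l} a_i(r + m)\, a_j(-s)\, a_j(s - m)$.
Using (5.5.16) we can write:
(b) $a_i(r + m)\, a_j(-s) = (r + m)\, \delta_{r+m, s}\, (a_i, a_j) = (r + m)\, \delta_{r+m, s}\, \delta_{ij}$.
Thus (b) is non-zero only when j = i and r + m = s. Accordingly (a) simplifies to a sum of terms of the form:
(c) $a_i(-r)\, a_i(s - m)$,  $r, s \in \mathbb{Z}$, $i = 1, \ldots, l$.
In the above expression r and s are not linearly independent, and this operator acting on $1 \in V$ will give a non-zero result only for $s = 1, \ldots, m - 1$; hence we can write $D_m D_{-m}(1)$ as:
$D_m \left( -\frac{1}{2} \sum_{s=1}^{m-1} \sum_{i=1}^{l} a_i(-s)\, a_i(-m + s) \right)(1) = \frac{l}{2} \sum_{s=1}^{m-1} s\,(m - s) = \frac{l}{12} (m^3 - m)$.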
References
1. E. Abe, Hopf Algebras, Cambridge Tracts in Math. N74 (Cambridge University Press, 1980).
2. S. L. Adler and R. F. Dashen, Current Algebras and Applications to Particle Physics (W. A. Benjamin, Inc., 1968).
3. A. J. Feingold, Spinor Construction of Vertex Operator Algebras, Triality and E., Contemp. Math. No. 121 (Am. Math. Soc., 1991).
4. I. B. Frenkel, Two Constructions of Affine Lie Algebras Representations and Boson-Fermion Correspondence in Quantum Field Theory, J. Funct. Anal. 44 (1981), 259-317. See Chap. 3 of [6].
5. I. B. Frenkel and V. G. Kac, Basic Representations of Affine Lie Algebras and Dual Resonance Models, Invent. Math. 62 (1980), 23-66. See Chapter 2 of [6].
6. P. Goddard and D. Olive (eds.), Kac-Moody and Virasoro Algebras (World Scientific, 1988).
7. O. M. Jezabek and M. Praszatowicz (eds.), Skyrmions and Anomalies (World Scientific, 1987).
8. (a) V. G. Kac, Infinite Dimensional Lie Algebras, 4.[6(a)]. (b) V. G. Kac (ed.), Infinite Dimensional Lie Algebras and Groups, 4.[6(b)]. (c) V. G. Kac, D. A. Kazhdan, J. Lepowsky, R. L. Wilson, Realization of the Basic Representations of the Euclidean Lie Algebras, Advances in Math. 42 (1981), 83-112.
9. J. Lepowsky, S. Mandelstam and I. M. Singer, Vertex Operators in Mathematics and Physics (MSRI No. 3, Springer-Verlag, 1985).
10. J. Lepowsky and R. L. Wilson, A Lie Theoretic Interpretation and Proof of the Rogers-Ramanujan Identities, Advances in Math. 45 (1982), 21-72.
11. S. A. Prevost, Vertex Algebras and Integral Bases for the Enveloping Algebras of Affine Lie Algebras, Am. Math. Soc., Vol. 96, No. 466 (1992).
12. R. Seeley, Complex Powers of an Elliptic Operator, Proc. Symp. Pure Math. 10 (1973), 515-527.
13. T. H. R. Skyrme, Selected Papers, with Commentary, ed. G. E. Brown (World Scientific, 1994). See in particular Skyrme's papers in Proc. Roy. Soc. Lond. (1958, 1959, 1961, 1962); his joint work with J. K. Perring, A model unified field equation, Nucl. Phys. 31 (1962), 550-555; and articles by V. I. Sanyuk and E. Witten.
14. H. Tsukada, String Path Integral Realization of Vertex Operator Algebras, Mem. Am. Math. Soc. 444 (1991).
15. V. G. Kac, Superconformal algebras and transitive group actions on quadrics, Comm. Math. Phys. (1997).
16. V. G. Kac, Vertex algebras for beginners, University Lecture Series, Vol. 10 (AMS, 1996).
17. S. J. Cheng and V. G. Kac, Conformal modules, Asian J. of Math. (1997).
18. R. Borcherds, Vertex algebras, Kac-Moody algebras and the Monster, Proc. Natl. Acad. Sci. USA 83 (1986), 3068-3071.
19. N. Kamran and P. J. Olver (eds.), Lie Algebras, Cohomology, and New Applications to Quantum Mechanics, Contemp. Math. 160 (AMS, 1994).
CHAPTER 6
THE ROLE OF SYMMETRY IN PHYSICS AND MATHEMATICS
1
WHAT IS SYMMETRY?¹
The word symmetry, according to the Merriam-Webster Dictionary, stands for (i) the correspondence in size, form and arrangement of parts that are on opposite sides of a plane, line or point; (ii) the excellence of proportion or the beauty based on proportion; (iii) the motions that leave a figure unchanged. Although our main concern is with (iii), we emphasize that (i) and (ii) are also present in one form or another. From early on, symmetry in nature (i.e., in the laws of nature) was recognized in terms of invariances, e.g., translational and rotational. The science of crystallography, which is based on symmetry principles, is one such example. However, the use of symmetry as a powerful tool for the development of physics became more popular with the advent of Einstein's theory of relativity. After that, one not only studied the invariance properties of physical systems but also used these properties to make predictions about theories. An example in this direction is the eightfold way of Gell-Mann and Ne'eman that led to the discovery of quarks [13]. Symmetry's role as a yardstick in physics has reached a point where no new concept is taken seriously unless the underlying symmetry of the concept is theoretically specified and the legitimacy of this symmetry is confirmed by experiments. One therefore deals these days with a whole arsenal of vocabulary pertaining to symmetry, for instance (i) the exact and approximate symmetries, (ii) the global and local symmetries, (iii) the gauge symmetries, (iv) the continuous and discrete symmetries, (v) the Lorentz and Poincare symmetries, (vi) the broken symmetries, (vii) the geometrical and non-geometrical symmetries, (viii) the hidden symmetries, (ix) the dynamical symmetries, (x) the relation between symmetries and conservation laws and currents. We shall explain these terms in brief in the next section.
But before this, we would like to observe that present day physics not only uses the existence of these symmetries to unravel the mysteries of the universe, but also engages itself in tackling more subtle questions, namely: what are the origins of different symmetries? Are they 'fundamental' or are they 'derived'? (see Def. (vii)). In fact it is this quest that has led to the so-called supersymmetry, a new frontier in physics.
¹ In this chapter, while learning the basics of symmetry, we emphasize mainly gauge symmetries, the root of gauge theories, and introduce the readers to notational differences among these theories in mathematics and physics.
Finally, we note that implicit in a symmetry principle is the assumption under which certain quantities are unobservable. This in turn implies an invariance under a related mathematical transformation, leading to a conservation law or selection rule. Hence, like every other object in physics, symmetry is defined and studied via mathematical objects. For instance, associated with any given symmetry there always exists a continuous or a discrete group of transformations. The generators of the group commute with the Hamiltonian of the physical system in question. Thus the study of symmetries can to a large extent be viewed as the study of groups, in particular that of Lie groups and their representations. Having already studied the basics of group theory in Chapter 2 and the groups $SU(2)$ and $SU(3)$ in Sec. 3.7, we shall mainly concern ourselves here with the product group $SU(2)_L \times U(1)_Y$, the symmetry group of the Glashow-Weinberg-Salam model, better known as the invariance group of electro-weak interactions; and the rotation group $O(4)$, the symmetry group of the hydrogen atom and of any particle in a Coulomb-like $1/r$ potential² (see Exp. (6.7.5)). We shall again return to principles of symmetries in Chapters 8 and 9. In Chapter 8 we shall deal with symmetries of spacetime, whereas in Chapter 9 we shall study the symmetries (antisymmetries) that relate to familiar objects such as linear and angular momentum, energy, wave function and so forth (see Table 6.1 on symmetries and conservation laws at the end of this chapter).
2
DEFINITIONS AND DESCRIPTIONS
As the title suggests, in this section we describe in brief the various types of symmetries enumerated in the previous section (see Chapter 1 of [11] and [19] for details):
(i) A symmetry of a physical system is called exact if no violations are observed; otherwise it is called approximate. Simple examples of exact symmetry are rotations and translations, whereas parity (invariance under inversion of space coordinates) is an approximate symmetry. An important class of symmetries which hold only approximately are the well known unitary symmetries* (see [15] and [19]).
(ii) A symmetry which is characterised by space-time independent parameters is called a global symmetry, while one whose parameters depend on the space-time point is called local. Thus in the former case symmetry transformations on the fields of a theory are identical at all space-time points (see Subsec. (3.2)).
(iii) A local symmetry, i.e., a symmetry whose parameters depend on the space on which it is defined, is often called a gauge symmetry, and local symmetry transformations are referred to as gauge transformations. A theory that studies these symmetries is called a gauge theory and the underlying symmetry group of the theory is called its gauge group. A gauge theory is called abelian or non-abelian depending on the nature of the symmetry group. For example, the electromagnetic theory with $U(1)$ as its symmetry group is an abelian gauge theory, and the Yang-Mills theory characterized by the symmetry group $SU(2)$ is non-abelian. Sometimes the gauge transformations that leave the field equations of a theory invariant are used to set up the so-called constraint equations involving objects of the theory, e.g. vector and scalar potentials. These choices simplify the field equations (see Exercises 2 and 3 of Sec. 5, and a footnote in Subsec. 6.4). This process is called gauge-fixing (see Secs. 5 and 6). We shall return to these ideas on gauge theory in detail from Sec. 4 onward.
² $\mathbf{r} = (x_1, x_2, x_3)$ and $r = \sqrt{x_1^2 + x_2^2 + x_3^2}$. The Coulomb potential (the basic potential of atomic physics) describes the electromagnetic forces that are responsible for the binding of electrons and nuclei into atoms and molecules.
* An invariance defined by a unitary operator (for example in quantum mechanics) or by a unitary group (e.g. $U(n)$ or $SU(n)$) is called a unitary symmetry.
236
Mathematical Perspectives on Theoretical Physics
(iv) A symmetry whose associated group is continuous (discrete) is called a continuous (discrete) symmetry. Evidently translations and rotations (with parameters $x$ and $\theta$ which can vary continuously) are continuous symmetries. The group of permutations of degree $n$ acting on $n$ objects is a discrete symmetry group and the symmetry described by it is a discrete symmetry. We note that all invariances under discrete transformations, except the ones governed by a group of permutations, seem to be approximate.
(v) The symmetries whose associated groups are Lorentz or Poincare are referred to as Lorentz or Poincare symmetries.
(vi) In his famous paper of 1894 (J. Physique 3e série, p. 393), Pierre Curie formulated the concept of broken symmetry in the following words: "C'est la dissymétrie qui crée le phénomène", which, freely translated, means: "It is the broken symmetry which creates a lot of non-trivial physics". It is an established fact that physical systems which are initially uniform and time-independent (though far from equilibrium) possess a kind of self-organizing behavior that leads to order. The emergence of order (referred to as "symmetry breaking") is often accompanied by the appearance of spatially asymmetric patterns. Such symmetry breaking phenomena occur in macroscopic physics as well as in microscopic physics* (see Chapter 2 in [11]); in subatomic physics, for instance, they happen to be the rule rather than the exception. Symmetry breaking is also of great value in systems of biological nature, where both order and asymmetry are ubiquitous. We shall, however, restrict ourselves to a group theoretic approach and shall say that the symmetry of a given physical system is broken if its underlying symmetry group can be expressed as a product of two other groups. These other groups are naturally subgroups of the given group and form separate symmetry groups of the given system.
A well known example of this phenomenon is the electro-weak theory of Glashow-Weinberg-Salam, to which we shall return later (see Section 5). We would also like to note that there are three types of symmetry breaking that take place in the physical world, namely (1) explicit breaking: the "classical action" has a term which is not invariant under a given symmetry group; (2) anomalous breaking: the classical action is invariant but the Hamiltonian of the corresponding quantum theory is not, and there is no conserved current; (3) spontaneous breaking: the action is invariant and the currents are conserved in quantum theory, but the vacuum ground state is not invariant under the transformations; we shall return to it briefly in Subsec. 5.2 (see Chapters 5 and 6 for (1), (2) and (3) in [4] and Chapters 4, 6 and 7 in [11]; see also Chapters 4 and 5 in [5], and references [9], [6], [21]). In group theoretical terms the classical Kaluza-Klein theory that unified gravitation and electromagnetism is the best understood theory of spontaneous symmetry breaking. Here the 5-dimensional group of general coordinate transformations is (spontaneously) broken to the 4-dimensional group of general coordinate transformations and a local $U(1)$ gauge group (see E. Witten, and E. Cremmer and B. Julia in [11]). We also note that when local symmetries are spontaneously broken (Subsec. 5.2) due to non-invariance of the vacuum, the zero mass particles do not appear in the physical spectrum of states; instead these particles provide longitudinal modes to gauge fields, making them massive. This turns out to be a welcome phenomenon, since here two kinds of massless particles that are not needed for physical purposes combine to give a 'vector meson state' capable of mediating short-range forces such as the weak forces. As a matter of fact this (Higgs-Kibble) mechanism forms the basis of unified gauge theories (see [22] for mathematical explanations, [17], [26] for original papers, and articles by 't Hooft and Dimopoulos, et al., in [9] for some more answers).
* Two simple examples from macroscopic physics are hydrodynamical differential equations, and mechanical properties of an organic compound such as sugar. An example from microscopic physics is the model SU(2) x U(3) (see Exp. 3 in Subsec. 5.2).
(vii) A layman's definition of a geometric symmetry would be: 'a symmetry that preserves geometric configurations is a geometric symmetry'. The translational and rotational invariances of space, in other words the symmetries of Euclidean geometry (also classified as 'fundamental symmetries'), are geometric symmetries. In the context of physical theories, however, the notion of geometric symmetry is used in a much broader sense. For instance, since the publication of Einstein's theory of relativity, the Lorentz invariance and time-translational invariance have been considered 'geometrical.' Similarly, the general coordinate-invariance or diffeomorphism symmetry of Einstein's theory of gravity is called 'geometrical' (though sometimes this is referred to as 'a questionable practice'). In this scheme of ideas even a supersymmetry (if it exists) is a geometrical symmetry, since it is intertwined in a non-trivial way with the symmetries of space and time translations (see Section 7.5). Examples of symmetries that are non-geometrical are offered by gauge symmetries of Yang-Mills theories. Nonetheless, the proponents of geometric symmetries consider these also as geometric, since the theories can be given a fiber bundle formulation (see Section 5 and Section 6). In any case, we would like to note that symmetries described in (v) are geometric symmetries.
(viii) Sometimes the symmetry is not manifest although its effects can be felt indirectly. Such a symmetry is called a hidden symmetry (see Chapter 2 in [11]).
(ix) Dynamical symmetries are not true symmetries in the sense that they are not exhibited by the laws of nature. A simple description of dynamical symmetry is as follows. Suppose there is a physical system described by a Hamiltonian $H$ with underlying symmetry group $G$, more specifically by $H = h(G_\alpha)$, $G_\alpha \in \mathfrak{g}$, the algebra of generators of $G$ (see Sec. 3.7).
In general $H$ (written in terms of generators of $G$) does not admit solutions to the eigenvalue problem in closed, analytic form (in other words $H$ does not give the spectrum of observables of the theory). If, however, $H$ contains invariant (Casimir) operators of a complete chain of subgroups of $G$
$$G \supset G_1 \supset G_2 \supset \cdots$$
then the eigenvalue problem can be solved in closed form in terms of quantum numbers, and these lead to energy (or mass) formulas of the theory. Thus dynamical symmetry is a means of classifying complex physical systems. A familiar example of dynamical symmetry is given by the Gell-Mann-Ne'eman $SU(3)$ with the group chain [13]:
$$SU(3) \supset SU(2) \otimes U(1) \supset SO(2) \otimes U(1).$$
Other examples of dynamical symmetries are those of molecules and nuclei, with their respective symmetry groups $U(4)$ and $U(6)$. The interacting objects in both cases are bosons (see [11], [21] and F. Iachello in [27]).
(x) In both classical and quantum mechanics, the conservation of the linear momentum, the angular momentum and the energy are related to the invariance of the Hamiltonian under translations, rotations and time translations respectively. More generally, in quantum mechanics whenever a conservation law holds good for a physical system, its Hamiltonian is invariant under a corresponding group of transformations. The converse of this statement is not true: even if the system has a Hamiltonian which is invariant under the group of transformations, the corresponding conservation law may be absent. A simple example is provided by time reversal in a given physical system. We devote Sec. 3 to these ideas.
3
EXACT SYMMETRIES, CONSERVATION LAWS AND CURRENTS
Very often the symmetries are not exact. In order to make them so, certain constraints are imposed on the system which in turn lead to conservation laws and currents. We illustrate this point within the framework of Lagrangian field theory.
3.1
Euler-Lagrange Equations and Currents
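We write the action $S$ as a functional of fields $\phi_i$ and their derivatives $\partial_\mu \phi_i$³:
$$S = \int d^4x\, L(\phi_i, \partial_\mu \phi_i) \tag{6.3.1}$$
Here $L$ is the Lagrange density, the index $i$ runs over the number of fields and $\mu$ takes the values 0 through 3. Under a symmetry transformation the entities $\phi_i$, $\partial_\mu \phi_i$ and $L$ in the above action are transformed to:
$$\phi_i \to \phi_i + \delta\phi_i, \qquad \partial_\mu \phi_i \to \partial_\mu \phi_i + \delta(\partial_\mu \phi_i), \qquad L \to L + \delta L \tag{6.3.2}$$
Accordingly $\delta L$ can be written:
$$\delta L = \frac{\partial L}{\partial \phi_i}\,\delta\phi_i + \frac{\partial L}{\partial(\partial_\mu \phi_i)}\,\delta(\partial_\mu \phi_i) \tag{6.3.3}$$
To simplify, we assume that there is just one field in the action; then the action $S$ will be invariant under the symmetry given in (6.3.2) only if:
$$\delta S = 0 = \int d^4x \left[ \frac{\partial L}{\partial \phi}\,\delta\phi + \frac{\partial L}{\partial(\partial_\mu \phi)}\,\delta(\partial_\mu \phi) \right] \tag{6.3.4}$$
The RHS of Eq. (6.3.4) can be simplified after the second term is integrated by parts; thus we have:
$$\delta S = \int d^4x \left[ \frac{\partial L}{\partial \phi} - \partial_\mu \frac{\partial L}{\partial(\partial_\mu \phi)} \right] \delta\phi + \int d^4x\, \partial_\mu \left[ \frac{\partial L}{\partial(\partial_\mu \phi)}\,\delta\phi \right] \tag{6.3.5}$$
If the second integral (which is the surface term) is ignored, $\delta S$ is zero when the integrand in the first term vanishes⁴:
$$\frac{\partial L}{\partial \phi} - \partial_\mu \frac{\partial L}{\partial(\partial_\mu \phi)} = 0$$
³ The action, when compared with first quantization $[p_i, x_j] = -i\delta_{ij}$, is often referred to as the second quantization in quantum theory, since the interactions amongst the fields are explicit in the action. Note that the defining equation of second quantization is $[\pi(x), \phi(y)]_{x_0 = y_0} = -i\delta^3(\mathbf{x} - \mathbf{y})$. Here $i$ takes the values 1, 2, 3, and $p_i = dx_i/dt$, whereas $\pi(x) = \partial L/\partial(\partial_0 \phi)$ (see Ftn. 5).
⁴ In 3 dimensions this is the familiar equation of motion.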
This, however, is the Euler-Lagrange equation, which can be obtained from extremization of the action $S$ that involves physical fields. Hence in order that the symmetry be exact, the integrand of the surface term must vanish as well:
$$\partial^\mu \left[ \frac{\partial L}{\partial(\partial^\mu \phi)}\,\delta\phi \right] \equiv -\epsilon^a\, \partial^\mu J^a_\mu = 0 \tag{6.3.6}$$
where $\epsilon^a$ is an arbitrary parameter of transformation for the variation of $\phi$, so that
$$\partial^\mu J^a_\mu = 0 \tag{6.3.7}$$
Thus any continuous symmetry transformation which leaves the action $S$ invariant implies the existence of a conserved current (see Chapter 2 in [16]). Next we shall use the current $J^a_\mu$ to obtain the conservation law of the system.
3.2
Conservation Law and the Conserved Charges as Generators of Symmetry Group
The integral
$$Q^a = \int d^3x\, J^a_0(x) \tag{6.3.8}$$
is called the (corresponding) charge of the system. This satisfies the equation
$$\frac{dQ^a}{dt} = 0 \quad \text{if} \quad \delta L = 0 \tag{6.3.9}$$
giving the conservation law of the charge. Reverting to (6.3.2) we note that if $L$ is invariant under a symmetry group $G$ formed by infinitesimal transformations, then we can write the variation $\delta\phi_i(x)$ as
$$\delta\phi_i = i\epsilon^a t^a_{ij}\phi_j \tag{6.3.10}$$
where $\{t^a\}$ are a set of matrices that satisfy the Lie algebra $[t^a, t^b] = i c^{abc} t^c$ with $c^{abc}$ being the structure constants of $G$. The variation (6.3.10) of $\phi_i$ implies that the current $J^a_\mu$ can be expressed as:
$$J^a_\mu = -i\,\frac{\partial L}{\partial(\partial^\mu \phi_i)}\, t^a_{ij}\phi_j \tag{6.3.11}(a)$$
or equivalently as:
$$J^a_\mu = -\frac{\partial L}{\partial(\partial^\mu \phi_i)}\,\frac{\delta\phi_i}{\delta\epsilon^a} \tag{6.3.11}(b)$$
The conserved charges $\{Q^a\}$ are the generators of the symmetry group. The symmetries explained above are global symmetries since these are characterized by space-time independent parameters $\epsilon^a$. Under these symmetries all fields $\phi_i(x)$ are transformed at all space-time points in exactly the same way. We explain these ideas further by two simple examples based on the abelian group $U(1)$ and the non-abelian group $SU(2)$.
3.3
Examples
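Example 6.3.1: Consider the Lagrange density
$$L = \frac{1}{2}\left[ (\partial_\mu \phi_1)^2 + (\partial_\mu \phi_2)^2 \right] - \frac{\mu^2}{2}\left( \phi_1^2 + \phi_2^2 \right) - \frac{\lambda}{4}\left( \phi_1^2 + \phi_2^2 \right)^2 \tag{6.3.12}$$
which is evidently invariant under the symmetry group $O(2)$ formed by rotations in the $(\phi_1, \phi_2)$-plane (see Sec. (2.1), where we denoted it as $R_2$). The corresponding transformations are:
$$\phi_1 \to \phi_1' = \phi_1 \cos\alpha - \phi_2 \sin\alpha, \qquad \phi_2 \to \phi_2' = \phi_1 \sin\alpha + \phi_2 \cos\alpha \tag{6.3.13}$$
For infinitesimal transformations $\alpha \ll 1$, these become:
$$\phi_1' = \phi_1 - \alpha\phi_2, \qquad \phi_2' = \alpha\phi_1 + \phi_2$$
thus giving the relation (6.3.10) as:
$$\delta\phi_i = i\alpha\, t_{ij}\phi_j \tag{6.3.14}$$
where matrix $t = (t_{ij})$ stands for:
$$t = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$$
Since $\partial L/\partial(\partial^\mu \phi_i) = \partial_\mu \phi_i$, we have (up to an overall sign, which does not affect conservation)
$$J_\mu = -(\partial_\mu \phi_1)\phi_2 + (\partial_\mu \phi_2)\phi_1 \tag{6.3.15}$$
as the conserved current of the system. If instead of real fields $\phi_1$, $\phi_2$ we use complex fields:
$$\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2), \qquad \phi^* = \frac{1}{\sqrt{2}}(\phi_1 - i\phi_2)$$
the Lagrangian becomes:
$$L = (\partial_\mu \phi^*)(\partial^\mu \phi) - \mu^2(\phi^*\phi) - \lambda(\phi^*\phi)^2 \tag{6.3.16}$$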
This is invariant under the $U(1)$ transformations $\phi \to \phi' = e^{i\alpha}\phi$, and the corresponding conserved current can easily be seen as
$$J_\mu = i\left[ (\partial_\mu \phi^*)\phi - (\partial_\mu \phi)\phi^* \right] \tag{6.3.17}$$
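The statements of Example 6.3.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the parameter values `mu2` and `lam`, the sample field values, and all function names are illustrative assumptions, and the matrix `T` is taken as ((0, i), (-i, 0)), the choice that reproduces the infinitesimal rotation through $\delta\phi_i = i\alpha t_{ij}\phi_j$. Only the non-derivative part of (6.3.12) is tested for invariance, and the currents (6.3.15) and (6.3.17) are compared at a single point for one fixed index $\mu$.

```python
import math

# Assumed generator t of (6.3.14): reproduces delta phi_1 = -alpha phi_2,
# delta phi_2 = +alpha phi_1.
T = [[0, 1j], [-1j, 0]]


def rotate(phi1, phi2, alpha):
    """Finite O(2) rotation (6.3.13) in the (phi1, phi2)-plane."""
    return (phi1 * math.cos(alpha) - phi2 * math.sin(alpha),
            phi1 * math.sin(alpha) + phi2 * math.cos(alpha))


def potential(phi1, phi2, mu2=1.3, lam=0.7):
    """Non-derivative part of (6.3.12); mu2 and lam are arbitrary test values."""
    s = phi1 ** 2 + phi2 ** 2
    return 0.5 * mu2 * s + 0.25 * lam * s ** 2


def infinitesimal(phi, alpha):
    """delta phi_i = i alpha t_ij phi_j, as in (6.3.14)."""
    return [1j * alpha * sum(T[i][j] * phi[j] for j in range(2))
            for i in range(2)]


def current(phi, dphi):
    """J_mu of (6.3.15) for one fixed mu; dphi = (d_mu phi_1, d_mu phi_2)."""
    return -dphi[0] * phi[1] + dphi[1] * phi[0]


def complex_current(phi, dphi):
    """J_mu of (6.3.17) with phi = (phi_1 + i phi_2)/sqrt(2)."""
    z = complex(phi[0], phi[1]) / math.sqrt(2)
    dz = complex(dphi[0], dphi[1]) / math.sqrt(2)
    return 1j * (dz.conjugate() * z - dz * z.conjugate())
```

For instance, `potential(*rotate(0.4, -1.1, 0.83))` can be compared with `potential(0.4, -1.1)`, and `current(phi, dphi)` with `complex_current(phi, dphi)` for any sample values; agreement at sample points is a consistency check, not a proof.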
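Example 6.3.2: The isospin symmetry $SU(2)$ is the simplest case of non-abelian symmetry. We consider an isodoublet (a column-vector formed by differentiable real valued functions defined on $R^3$)
$$\phi = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix}$$
and take the Lagrange density as
$$L = (\partial_\mu \phi^\dagger)(\partial^\mu \phi) - \mu^2(\phi^\dagger\phi) - \frac{\lambda}{4}(\phi^\dagger\phi)^2 \tag{6.3.18}$$
(note that $\phi^\dagger$ is the conjugate transpose of $\phi$). Recall that the infinitesimal isospin rotations of $\phi$ are given in terms of Pauli matrices $\tau^a$ ($a$ = 1, 2, 3) which are the generators of $SU(2)$ (see Sec. 3.7). It can be easily checked that the transformations
$$\phi_i \to \phi_i' = \phi_i + i\,\frac{\epsilon^a}{2}\,\tau^a_{ij}\phi_j \tag{6.3.19}$$
leave $L$ invariant. The conserved isospin current in view of (6.3.11) is given by:
$$J^a_\mu = -\frac{i}{2}\left[ (\partial_\mu \phi_i^\dagger)\,\tau^a_{ij}\phi_j - \phi_i^\dagger\,\tau^a_{ij}(\partial_\mu \phi_j) \right] \tag{6.3.20}$$
The time component of the current in this case is:
$$J^a_0 = -\frac{i}{2}\left[ (\partial_0 \phi_i^\dagger)\,\tau^a_{ij}\phi_j - \phi_i^\dagger\,\tau^a_{ij}(\partial_0 \phi_j) \right]$$
If we use the canonical momentum $\pi_i = \partial L/\partial(\partial_0 \phi_i)$ conjugate to $\phi_i$ (see Ftn. 3), and further use the commutation relations $[\pi_i(\mathbf{x}, t), \phi_j(\mathbf{y}, t)] = -i\delta_{ij}\delta^3(\mathbf{x} - \mathbf{y})$, the charges
$$Q^a = \int d^3x\, J^a_0(x) \qquad (a = 1, 2, 3) \tag{6.3.21}$$
satisfy the commutation relations of $SU(2)$ symmetry:
$$[Q^a, Q^b] = i\epsilon^{abc} Q^c$$
hence $\{Q^a\}$ are the generators of $SU(2)$ (see [4] and Chapter 4 of 9.[1](a)).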
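The algebra obeyed by the charges is the one already satisfied by the matrices $\tau^a/2$ that generate the isospin rotations. A minimal sketch of that check follows, using only the standard form of the Pauli matrices; the helper names (`matmul`, `commutator`, `epsilon`, `su2_algebra_holds`) are illustrative and not taken from the text.

```python
# Pauli matrices tau^1, tau^2, tau^3 as nested lists.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def epsilon(a, b, c):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def su2_algebra_holds(tol=1e-12):
    """Check [tau^a/2, tau^b/2] = i eps^{abc} tau^c/2 for all a, b."""
    half = [[[x / 2 for x in row] for row in m] for m in TAU]
    for a in range(3):
        for b in range(3):
            lhs = commutator(half[a], half[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * epsilon(a, b, c) * half[c][i][j]
                              for c in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True
```

Calling `su2_algebra_holds()` runs through all nine index pairs. The check concerns the 2x2 matrices only; the statement about the charges $Q^a$ additionally relies on the canonical commutation relations of the fields.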
4
GAUGE SYMMETRIES—THEIR ORIGIN
In the next two sections we study the concepts related to gauge symmetries, which as we know are also called local symmetries, to emphasize the fact that gauge symmetries are not of global nature (see also Sec. (9.4)).
4.1
A Historical Perspective
The gauge-theoretic ideas had their origin in Maxwell's electromagnetic field equations in the 19th century. However, it was only in 1929 that Weyl used the word gauge for local scale* invariance, which he had designed to incorporate the electromagnetic field into space-time geometry. The significance of Weyl's approach was recognized in due course and a theory involving the so-called gauge groups, gauge fields, etc., began to unfold (E. P. Wigner in [11]). Physicists working in elementary-particle physics and in the theory of quantum fields found the theory to be an effective and powerful tool in pursuing theoretical as well as experimental research. It is quite accidental that around the same period of time (Ehresmann, 1950, see 2. [12]) another new theory, which was later found to have great similarities with gauge theory, was being formalized by mathematicians. This was the theory of bundles, the theory that was meant to unify the algebraic, the topological and the geometric structures. Of particular interest in this realm of ideas were the principal fiber bundles, which later on turned out to be of immense use in the description of gauge theory. For almost two decades the physicists and mathematicians were unaware that the two theories they were developing had a lot in common: both used the same tools for description, namely the metric and the connection, and both had the same goals, such as objects invariant or covariant under suitable transformations or groups of transformations. It was only in the early seventies that the subtle link between the two theories was recognized. In particular, it was shown that the gauge field of Yang-Mills equations was indeed the curvature of a connection in a principal fiber bundle with gauge group $SU(2)$.
Although physicists, naturally influenced by Einstein's theory of relativity, describe the gauge potentials, the gauge fields and their interactions via the classical theory of curvature and connections (using tensor methods), the use of principal bundles as a language in gauge theory is becoming more and more popular. Our aim in these sections will be to explain these ideas using both points of view. Since the physicists' approach preceded that of mathematicians, we present two well known examples in gauge theory using the Lagrangian method. For the first one we begin by writing the Lagrangian $L_0$ of (Dirac's) free electron theory and finally obtain conditions for $L_0$ to be gauge invariant and renormalizable. In the process we obtain the theory of Quantum Electrodynamics (QED). Since the gauge group involved in the theory is abelian, this is also called an abelian gauge theory. The second example deals with the Lagrangian of Yang-Mills theory and illustrates the case of non-abelian gauge theory, since the underlying group $SU(2)$ is non-abelian.
* We note that laws of physics are not themselves invariant under a change of scale. A scaling symmetry is usually 'derived' by applying dimensional arguments to hydrodynamical differential equations. The symmetry arises when the number of fluid molecules becomes very large. Obviously it is related to macroscopic physics (see Ftn. in Def. (vi) Sec. 2).
4.2
Examples (Physicists' Point of View)
Example 6.4.1: The abelian gauge theory: Let $\Psi(x)$ stand for (Dirac's) free electron field with the following transformation rule:
$$\Psi(x) \to \Psi'(x) = e^{-i\alpha}\Psi(x), \qquad \bar\Psi(x) \to \bar\Psi'(x) = e^{i\alpha}\bar\Psi(x) \tag{6.4.1}$$
the exponent $\alpha$ being the phase belonging to the abelian group $U(1)$. The Lagrangian $L_0$ for the field $\Psi(x)$, given as⁶:
$$L_0 = \bar\Psi(x)\left( i\gamma^\mu \partial_\mu - m \right)\Psi(x) \tag{6.4.2}$$
evidently has a global $U(1)$ symmetry under the transformations (6.4.1).
⁶ $\gamma^\mu$ = Dirac matrix ($\mu$ = 0, 1, 2, 3) (see Subsec. (7.3.3)); see Sec. (9.3) for the Dirac equation.
In order to turn this symmetry into a local symmetry, i.e., to gauge the symmetry, we replace the phase $\alpha$ by $\alpha(x)$ in (6.4.1) and demand that the theory be invariant (i.e., gauge invariant) under this new set of transformations, namely:
$$\Psi(x) \to \Psi'(x) = e^{-i\alpha(x)}\Psi(x), \qquad \bar\Psi(x) \to \bar\Psi'(x) = e^{i\alpha(x)}\bar\Psi(x) \tag{6.4.3}$$
It is easy to see that the derivative term $\bar\Psi(x)\partial_\mu \Psi(x)$ in (6.4.2) is now replaced by:
$$\bar\Psi'(x)\partial_\mu \Psi'(x) = e^{i\alpha(x)}\bar\Psi(x)\,\partial_\mu\!\left( e^{-i\alpha(x)}\Psi(x) \right) = \bar\Psi(x)\partial_\mu \Psi(x) - i\bar\Psi(x)\left( \partial_\mu \alpha(x) \right)\Psi(x) \tag{6.4.4}$$
The presence of the second term in (6.4.4) disallows the invariance. If, however, we form a gauge-covariant derivative $D_\mu$ to replace $\partial_\mu$, so that $D_\mu \Psi(x)$ has the transformation property:
$$D_\mu \Psi(x) \to [D_\mu \Psi(x)]' = e^{-i\alpha(x)} D_\mu \Psi(x) \tag{6.4.5}$$
then the product $\bar\Psi(x) D_\mu \Psi(x)$ will be invariant under the gauge transformations (6.4.3), leading to a gauge-invariant Lagrangian. To achieve this, we must define a suitable covariant derivative $D_\mu$. This is done by adding a term, e.g., $eA_\mu$, to the partial derivative $\partial_\mu$, where $A_\mu$ is a vector field called the gauge field of the theory and $e$ is a free parameter which can be identified with the electric charge. We next obtain the transformation rule for $A_\mu$, so that (6.4.5) may hold. Writing $D_\mu = \partial_\mu + ieA_\mu$ and $(D_\mu)' = \partial_\mu + ieA_\mu'$ we have:
$$D_\mu \Psi(x) = (\partial_\mu + ieA_\mu)\Psi(x)$$
and
$$[D_\mu \Psi(x)]' = D_\mu' \Psi' = (\partial_\mu + ieA_\mu')\, e^{-i\alpha(x)}\Psi(x) \tag{6.4.6}(a)$$
$$= \partial_\mu\!\left( e^{-i\alpha(x)}\Psi(x) \right) + ieA_\mu' e^{-i\alpha(x)}\Psi(x) = e^{-i\alpha(x)}\partial_\mu \Psi(x) - i\left( \partial_\mu \alpha(x) \right) e^{-i\alpha(x)}\Psi(x) + ie\, e^{-i\alpha(x)} A_\mu' \Psi(x). \tag{6.4.6}(b)$$
In order to satisfy (6.4.5), the RHS of (6.4.6)(b) must equal $e^{-i\alpha(x)}(\partial_\mu + ieA_\mu)\Psi(x)$. This is possible only if:
$$-i\partial_\mu \alpha(x) + ieA_\mu' = ieA_\mu, \quad \text{i.e.,} \quad A_\mu' = A_\mu + \frac{1}{e}\partial_\mu \alpha(x) \tag{6.4.7}$$
or that $A_\mu \to A_\mu + \frac{1}{e}\partial_\mu \alpha(x)$. The Lagrangian (6.4.2) for the theory, on the inclusion of the gauge field $A_\mu$, becomes:
$$L_0' = \bar\Psi\, i\gamma^\mu (\partial_\mu + ieA_\mu)\Psi - m\bar\Psi\Psi \tag{6.4.8}$$
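The covariance property (6.4.5) under the rule (6.4.7) can be illustrated with a one-dimensional toy check. This is a sketch under stated assumptions rather than anything from the text: there is a single coordinate $x$ and no spinor structure, and the functions `psi`, `A`, `alpha` and the coupling value `E` are arbitrary test choices whose derivatives are coded by hand.

```python
import cmath
import math

E = 0.3  # coupling constant e; arbitrary test value


def psi(x):
    """Sample complex field."""
    return (1 + 0.5 * x) * cmath.exp(2j * x)


def dpsi(x):
    """Exact derivative of psi."""
    return (0.5 + 2j * (1 + 0.5 * x)) * cmath.exp(2j * x)


def A(x):
    """Sample gauge field."""
    return math.sin(x)


def alpha(x):
    """Sample local phase alpha(x)."""
    return x * x


def dalpha(x):
    """Exact derivative of alpha."""
    return 2 * x


def D_psi(x):
    """Covariant derivative (d + i e A) psi."""
    return dpsi(x) + 1j * E * A(x) * psi(x)


def D_psi_transformed(x):
    """(d + i e A') psi' with psi' = exp(-i alpha) psi and A' from (6.4.7)."""
    phase = cmath.exp(-1j * alpha(x))
    psi_t = phase * psi(x)
    dpsi_t = phase * (dpsi(x) - 1j * dalpha(x) * psi(x))  # product rule
    A_t = A(x) + dalpha(x) / E
    return dpsi_t + 1j * E * A_t * psi_t


def covariance_defect(x):
    """|D'psi' - exp(-i alpha) D psi|, which (6.4.5) says should vanish."""
    return abs(D_psi_transformed(x) - cmath.exp(-1j * alpha(x)) * D_psi(x))
```

Evaluating `covariance_defect(x)` at a few points returns values at the level of floating-point rounding. Replacing `A_t` by the untransformed `A(x)` leaves a residual of size $|\alpha'(x)\,\psi(x)|$, which is the extra term of (6.4.4).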
If, further, we make the gauge field a dynamical variable, we must add a term involving its derivatives to the Lagrangian $L_0'$. We must, however, keep in mind that any such addition does not alter the invariance property of $L_0'$. The term involving the derivatives of $A_\mu$ is denoted $L_A$:
$$L_A = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} \tag{6.4.9}$$
The second rank tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ in $L_A$ is called the field strength tensor, and the constant $-\frac{1}{4}$ provides the normalization factor. It is easy to check that $F_{\mu\nu}$ is gauge invariant (see Exc. 1), i.e.,
$$F_{\mu\nu}' = F_{\mu\nu}. \tag{6.4.10}$$
The Lagrangian $L$ obtained by adding $L_0'$ and $L_A$
$$L = \bar\Psi\, i\gamma^\mu (\partial_\mu + ieA_\mu)\Psi - m\bar\Psi\Psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu} \tag{6.4.11}$$
is the QED Lagrangian. Although our main aim (as presented above) was to give a mathematical formulation of gauge theory with $U(1)$ symmetry group, we give in the following remark the main features of the QED theory based on the Lagrangian $L$. The details can be found in [4].
Remark 6.4.2: (A) The photon is massless because a mass term proportional to $A_\mu A^\mu$ is not gauge invariant. (B) The minimal coupling of the photon to the electron is contained in the covariant derivative $D_\mu \Psi$, which is constructed from the transformation property of the electron field. Thus the coupling of the photon to any matter field is determined by its transformation property under the symmetry group. (C) The Lagrangian does not have a gauge-field self coupling since the photon does not carry a charge (or $U(1)$ quantum number). Thus if there is no matter field, the theory is a free-field theory.
Example 6.4.3: Non-abelian gauge symmetry based on Yang-Mills theory. We now consider the field $\Psi$ as a column vector
$$\Psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$$
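In physics terminology this is an isospin doublet. The transformation $\Psi \to \Psi'$ is done with the elements of $SU(2)$ (the isospin group of Yang-Mills theory), thus we have:
$$\Psi(x) \to \Psi'(x) = \exp\left\{ \frac{-i\,\boldsymbol{\tau}\cdot\boldsymbol{\theta}}{2} \right\}\Psi(x) \tag{6.4.12}$$
where $\boldsymbol{\tau} = (\tau_1, \tau_2, \tau_3)$ stands for Pauli matrices and $\boldsymbol{\theta} = (\theta_1, \theta_2, \theta_3)$ represents the group parameters of transformation (see Section 3.7). Evidently the free Lagrangian
$$L_0 = \bar\Psi(x)\left( i\gamma^\mu \partial_\mu - m \right)\Psi(x) \tag{6.4.13}$$
is invariant under the transformations when the group parameters $\theta_1$, $\theta_2$, $\theta_3$ are independent of space-time coordinates. When these become space-time dependent, i.e., $\theta_i = \theta_i(x)$, the symmetry transformations become local transformations giving:
$$\Psi(x) \to \Psi'(x) = \exp\left\{ \frac{-i\,\boldsymbol{\tau}\cdot\boldsymbol{\theta}(x)}{2} \right\}\Psi(x) \equiv U(\theta)\Psi(x). \tag{6.4.14}(a)$$
The derivative term in (6.4.13) now changes to*:
$$\bar\Psi'(x)\partial_\mu \Psi'(x) = \bar\Psi(x)\partial_\mu \Psi(x) + \bar\Psi(x)\, U^{-1}(\theta)\left[ \partial_\mu U(\theta) \right]\Psi(x). \tag{6.4.14}(b)$$
To construct a gauge-invariant Lagrangian, we follow the steps of Example (6.4.1), i.e., we first choose vector gauge fields $A^i_\mu$ ($i$ = 1, 2, 3) to form the gauge-covariant derivative
$$D_\mu \Psi(x) = \left( \partial_\mu - ig\,\frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu}{2} \right)\Psi(x) \tag{6.4.15}$$
with $g$ as the minimal coupling constant. Next, we require that $D_\mu \Psi(x)$ has the same transformation property as $\Psi(x)$ does, i.e.,
$$D_\mu \Psi(x) \to [D_\mu \Psi(x)]' = U(\theta) D_\mu \Psi(x) \tag{6.4.16}$$
Since
$$[D_\mu \Psi(x)]' = D_\mu' \Psi'(x) = \left( \partial_\mu - ig\,\frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu'}{2} \right)\Psi'(x)$$
transformation (6.4.16) will hold only if:
$$\left( \partial_\mu - ig\,\frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu'}{2} \right)\left( U(\theta)\Psi \right) = U(\theta)\left( \partial_\mu - ig\,\frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu}{2} \right)\Psi \tag{6.4.17}(a)$$
i.e., if
$$\left[ \partial_\mu U(\theta) - ig\,\frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu'}{2}\, U(\theta) \right]\Psi = -ig\, U(\theta)\left( \frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu}{2} \right)\Psi \tag{6.4.17}(b)$$
(Note that the $U(\theta)\partial_\mu \Psi$ term on the LHS cancels with the one on the RHS.) This equality implies that:
$$\frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu'}{2} = U(\theta)\,\frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu}{2}\, U^{-1}(\theta) - \frac{i}{g}\left[ \partial_\mu U(\theta) \right] U^{-1}(\theta) \tag{6.4.18}$$
The transformation law for gauge fields $A^i_\mu$ given by the above relation makes the Lagrangian $L_0$ gauge-covariant. It can be checked that if $\theta(x) \ll 1$, then (6.4.18) gives (see Exc. (6.4.2)):
$$A'^{\,i}_\mu = A^i_\mu + \epsilon^{ijk}\theta^j A^k_\mu - \frac{1}{g}\partial_\mu \theta^i \tag{6.4.19}$$
where $\epsilon^{ijk}$ are the familiar Levi-Civita (antisymmetric) symbols. The second term in (6.4.19) is the transformation for a triplet under the adjoint representation of $SU(2)$. This shows that the $A^i_\mu$ carry charges (in contrast to the abelian gauge case). See Remark (6.4.2) based on Example (6.4.1). We now construct the anti-symmetric (second-rank) tensor for the gauge fields $A^i_\mu$, by setting:
$$(D_\mu D_\nu - D_\nu D_\mu)\Psi \equiv -ig\left( \frac{\boldsymbol{\tau}\cdot\mathbf{F}_{\mu\nu}}{2} \right)\Psi \tag{6.4.20}$$
where $F^i_{\mu\nu}$ stands for:
$$F^i_{\mu\nu} = \partial_\mu A^i_\nu - \partial_\nu A^i_\mu + g\,\epsilon^{ijk} A^j_\mu A^k_\nu \tag{6.4.21}(a)$$
or equivalently for:
$$\frac{\boldsymbol{\tau}\cdot\mathbf{F}_{\mu\nu}}{2} = \partial_\mu \frac{\boldsymbol{\tau}\cdot\mathbf{A}_\nu}{2} - \partial_\nu \frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu}{2} - ig\left[ \frac{\boldsymbol{\tau}\cdot\mathbf{A}_\mu}{2}, \frac{\boldsymbol{\tau}\cdot\mathbf{A}_\nu}{2} \right] \tag{6.4.21}(b)$$
Using the property (6.4.16), it can be checked that:
$$\left[ (D_\mu D_\nu - D_\nu D_\mu)\Psi \right]' = U(\theta)(D_\mu D_\nu - D_\nu D_\mu)\Psi \tag{6.4.22}$$
which leads to the transformation property for $F_{\mu\nu}$:
$$\boldsymbol{\tau}\cdot\mathbf{F}_{\mu\nu}' = U(\theta)\left( \boldsymbol{\tau}\cdot\mathbf{F}_{\mu\nu} \right) U^{-1}(\theta) \tag{6.4.23}$$
For the infinitesimal transformations $\theta^i \ll 1$, this reduces to:
$$F'^{\,i}_{\mu\nu} = F^i_{\mu\nu} + \epsilon^{ijk}\theta^j F^k_{\mu\nu} \tag{6.4.24}$$
This shows that unlike the abelian case, the antisymmetric tensor $F^i_{\mu\nu}$ here transforms non-trivially, and as such is not invariant under gauge transformations, although its trace:
$$\mathrm{Tr}\left\{ (\boldsymbol{\tau}\cdot\mathbf{F}_{\mu\nu})(\boldsymbol{\tau}\cdot\mathbf{F}^{\mu\nu}) \right\} \propto F^i_{\mu\nu} F^{i\mu\nu} \tag{6.4.25}$$
is gauge invariant. The complete gauge-invariant Lagrangian of the theory can now be written as:
$$L = -\frac{1}{4} F^i_{\mu\nu} F^{i\mu\nu} + \bar\Psi\, i\gamma^\mu D_\mu \Psi - m\bar\Psi\Psi \tag{6.4.26}$$
with the transformation properties given by (6.4.14), (6.4.16) and (6.4.23).
* Since $U(\theta) = \exp\left[ -i\,\boldsymbol{\tau}\cdot\boldsymbol{\theta}(x)/2 \right]$, $U^\dagger(\theta) = [U(\theta)]^{-1}$.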
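Two facts used above, the unitarity of $U(\theta)$ and the invariance of the trace in (6.4.25) under the conjugation (6.4.23), can be checked numerically. The sketch below is illustrative only: it relies on the standard closed form $\exp(-i\,\boldsymbol{\tau}\cdot\boldsymbol{\theta}/2) = \cos(|\theta|/2) - i \sin(|\theta|/2)\,\hat{\boldsymbol{\theta}}\cdot\boldsymbol{\tau}$, the helper names are assumptions, and the vectors passed to `tau_dot` stand for the isospin components $F^i_{\mu\nu}$ at one fixed pair of indices $\mu$, $\nu$.

```python
import math

# Pauli matrices tau^1, tau^2, tau^3 as nested lists.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]


def trace(m):
    return m[0][0] + m[1][1]


def tau_dot(v):
    """The matrix tau . v for a real 3-vector v."""
    return [[sum(v[a] * TAU[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]


def su2(theta):
    """U(theta) = exp(-i tau.theta/2), evaluated in closed form."""
    norm = math.sqrt(sum(t * t for t in theta))
    if norm == 0.0:
        return [[1, 0], [0, 1]]
    n_tau = tau_dot([t / norm for t in theta])
    c, s = math.cos(norm / 2), math.sin(norm / 2)
    return [[c * (1 if i == j else 0) - 1j * s * n_tau[i][j]
             for j in range(2)] for i in range(2)]


def conjugate_by(u, m):
    """u m u^dagger, the transformation (6.4.23) for unitary u."""
    return matmul(matmul(u, m), dagger(u))
```

A typical use is to build `U = su2((0.3, -1.2, 0.5))`, form `F = tau_dot(f)` and `G = tau_dot(g)` for two sample vectors, and compare `trace(matmul(F, G))` with the same trace after both factors are passed through `conjugate_by(U, ...)`. Since $\mathrm{Tr}(\tau^a \tau^b) = 2\delta^{ab}$, the untransformed trace equals twice the dot product of `f` and `g`.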
The first term of L is sometimes referred to as the pure Yang-Mills term. It can be checked that when written out in full it contains factors that are trilinear and quadrilinear in $A^i_\mu$. These correspond to self-couplings of non-abelian gauge fields. Note that this self-coupling of gauge fields does not occur in abelian gauge theory. We would like to observe here that the above theory can be generalized to higher dimensions. For instance we can take a simple group G, whose generators $\{g^a\}$ satisfy the structural relation:
$[g^a, g^b] = iC^{abc}g^c$
(6.4.27)
The doublet $\Psi$ is replaced by an arbitrary column vector $\psi$ (called a multiplet). The multiplet $\psi$ belongs to a representation with corresponding representation matrices $\{T^a\}$ that obey the Lie bracket relation (6.4.27):
$[T^a, T^b] = iC^{abc}T^c$.
(6.4.28)
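As a small illustration of (6.4.28), the sketch below checks the relation for one concrete choice that is not taken from the text: the group SU(2), with $C^{abc} = \epsilon^{abc}$, in its three-dimensional adjoint representation $(T^a)_{bc} = -i\epsilon^{abc}$. The function names are hypothetical.

```python
def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

# Assumed example: adjoint representation of SU(2), (T^a)_{bc} = -i eps(a, b, c).
T = [[[-1j * eps(a, b, c) for c in range(3)] for b in range(3)]
     for a in range(3)]

def matmul(x, y):
    """Product of two 3x3 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[r][c] - yx[r][c] for c in range(3)] for r in range(3)]

def max_violation():
    """Largest entry of [T^a, T^b] - i eps^{abc} T^c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            for r in range(3):
                for c in range(3):
                    rhs = sum(1j * eps(a, b, k) * T[k][r][c] for k in range(3))
                    worst = max(worst, abs(lhs[r][c] - rhs))
    return worst

print("max violation of (6.4.28):", max_violation())
```

The same loop works for any set of matrices and structure constants supplied in place of `T` and `eps`.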
The covariant derivative in this case becomes:
$D_\mu\psi = (\partial_\mu - igT^aA^a_\mu)\psi$
(6.4.29)
and the second-rank tensor for the gauge fields ($A^a_\mu$) is given by
$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + gC^{abc}A^b_\mu A^c_\nu$
(6.4.30)
or equivalently by:
$(T\cdot F)_{\mu\nu} = \partial_\mu(T\cdot A_\nu) - \partial_\nu(T\cdot A_\mu) - ig[T\cdot A_\mu, T\cdot A_\nu]$
(6.4.31)
The Lagrangian L which is invariant under the transformations of the group G has the same form as in the Yang-Mills case:
$L = -\frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} + \bar\psi(i\gamma^\mu D_\mu - m)\psi$
(6.4.32)
though the entities $\psi$, $A^a_\mu$, $D_\mu$ and $F^a_{\mu\nu}$ that are involved here have different transformation rules. At the end of Sec. 5 we shall return to a Lagrangian of the most general nature for the standard $SU(2)\times U(1)$ model.
Exercise 6.4
1. Show that the antisymmetric tensor $F_{\mu\nu}$ of abelian gauge theory is gauge-invariant.
2. Obtain the transformation law for the gauge field $A_\mu$ when the group parameters ($\theta^i(x)$) are infinitesimally small.
3. Using Definition (6.4.15) of $D_\mu$, establish (6.4.21) and deduce the transformation property (6.4.24) when $\theta^i \ll 1$.
248 Mathematical Perspectives on Theoretical Physics
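Hints to Exercise 6.4
1. The gauge invariance of $F_{\mu\nu}$ can be established in two ways. First we note that
(i) $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$
and
(ii) $F'_{\mu\nu} = \partial_\mu A'_\nu - \partial_\nu A'_\mu = \partial_\mu\left(A_\nu + \frac{1}{e}\partial_\nu\alpha(x)\right) - \partial_\nu\left(A_\mu + \frac{1}{e}\partial_\mu\alpha(x)\right) = \partial_\mu A_\nu - \partial_\nu A_\mu + \frac{1}{e}(\partial_\mu\partial_\nu - \partial_\nu\partial_\mu)\alpha(x).$
Since $\alpha(x)$ is a smooth function, $\frac{\partial^2\alpha}{\partial x^\mu\partial x^\nu} = \frac{\partial^2\alpha}{\partial x^\nu\partial x^\mu}$, hence (i) and (ii) are equal. To see it differently we use the equality:
(iii) $(D_\mu D_\nu - D_\nu D_\mu)\Psi(x) = \{(\partial_\mu + ieA_\mu)(\partial_\nu + ieA_\nu) - (\partial_\nu + ieA_\nu)(\partial_\mu + ieA_\mu)\}\Psi(x) = ie(\partial_\mu A_\nu - \partial_\nu A_\mu)\Psi(x).$
The other six terms cancel in pairs. From (6.4.5) we also have
(iv) $[(D_\mu D_\nu - D_\nu D_\mu)\Psi]' = (D'_\mu D'_\nu - D'_\nu D'_\mu)\Psi' = e^{-i\alpha}[(D_\mu D_\nu - D_\nu D_\mu)\Psi(x)].$
This gives $F'_{\mu\nu}\Psi' = e^{-i\alpha}(F_{\mu\nu}\Psi)$, or that $F'_{\mu\nu} = F_{\mu\nu}$.
2. We equate the two operators on $\Psi$ on either side of (6.4.17)(b) to write:
(i) $\partial_\mu U(\theta) - ig\frac{\tau\cdot A'_\mu}{2}U(\theta) = -ig\,U(\theta)\frac{\tau\cdot A_\mu}{2}$
On postmultiplication by $(U(\theta))^{-1}$ we then obtain:
(ii) $\frac{\tau\cdot A'_\mu}{2} = U(\theta)\frac{\tau\cdot A_\mu}{2}[U(\theta)]^{-1} - \frac{i}{g}(\partial_\mu U(\theta))[U(\theta)]^{-1}$
When $\theta^i(x)$ are infinitesimals, i.e., $\theta(x) \ll 1$, we have
(iii) $U(\theta) \simeq 1 - i\frac{\tau\cdot\theta}{2}$
and (ii) becomes
(iv) $\frac{\tau\cdot A'_\mu}{2} = \frac{\tau\cdot A_\mu}{2} - i\theta^jA^k_\mu\left[\frac{\tau^j}{2}, \frac{\tau^k}{2}\right] - \frac{1}{g}\left(\frac{\tau}{2}\cdot\partial_\mu\theta\right)$
(The repeated indices i, j, k, l indicate summation.) The simplification of the middle term leads to:
(v) $\frac{\tau\cdot A'_\mu}{2} = \frac{\tau\cdot A_\mu}{2} + \frac{1}{2}\epsilon^{ijk}\tau^i\theta^jA^k_\mu - \frac{1}{g}\left(\frac{\tau}{2}\cdot\partial_\mu\theta\right)$
or equivalently to:
(vi) $A'^i_\mu = A^i_\mu + \epsilon^{ijk}\theta^jA^k_\mu - \frac{1}{g}\partial_\mu\theta^i.$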
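3. Recall that
$D_\mu = \partial_\mu - ig\frac{\tau\cdot A_\mu}{2}.$
Consequently
(i) $(D_\mu D_\nu - D_\nu D_\mu) = \left(\partial_\mu - ig\frac{\tau\cdot A_\mu}{2}\right)\left(\partial_\nu - ig\frac{\tau\cdot A_\nu}{2}\right) - \left(\partial_\nu - ig\frac{\tau\cdot A_\nu}{2}\right)\left(\partial_\mu - ig\frac{\tau\cdot A_\mu}{2}\right)$
After simplification, the RHS becomes:
(ii) $-ig\left\{\partial_\mu\frac{\tau\cdot A_\nu}{2} - \partial_\nu\frac{\tau\cdot A_\mu}{2} - ig\left[\frac{\tau\cdot A_\mu}{2}, \frac{\tau\cdot A_\nu}{2}\right]\right\}$
In view of the defining equation (6.4.20) we have:
(iii) $\frac{\tau\cdot F_{\mu\nu}}{2} = \partial_\mu\frac{\tau\cdot A_\nu}{2} - \partial_\nu\frac{\tau\cdot A_\mu}{2} - ig\left[\frac{\tau\cdot A_\mu}{2}, \frac{\tau\cdot A_\nu}{2}\right]$
To establish (6.4.21)(a), we write (iii) in terms of the components of $\tau$, $F_{\mu\nu}$ and $A_\nu$; thus we have:
(iv) $\frac{\tau^i}{2}F^i_{\mu\nu} = \frac{\tau^i}{2}\partial_\mu A^i_\nu - \frac{\tau^i}{2}\partial_\nu A^i_\mu - ig\left[\frac{\tau^j}{2}A^j_\mu, \frac{\tau^k}{2}A^k_\nu\right]$
(repeated indices on either side stand for summation). In order to obtain the value of $F^i_{\mu\nu}$, we simplify the third term on the RHS and note that the $\tau_i$ are linearly independent, therefore the coefficients of $\tau_1$, $\tau_2$, $\tau_3$ on either side of (iv) can be equated. Writing $\left[\frac{\tau\cdot A_\mu}{2}, \frac{\tau\cdot A_\nu}{2}\right]$ as the difference of products:
(v) $\frac{1}{4}\left\{(\tau_1A^1_\mu + \tau_2A^2_\mu + \tau_3A^3_\mu)(\tau_1A^1_\nu + \tau_2A^2_\nu + \tau_3A^3_\nu) - (\tau_1A^1_\nu + \tau_2A^2_\nu + \tau_3A^3_\nu)(\tau_1A^1_\mu + \tau_2A^2_\mu + \tau_3A^3_\mu)\right\}$
we observe that six of these eighteen terms cancel in pairs, whereas the remaining twelve combine in pairs and finally reduce to six terms. For instance consider the four terms:
(vi) $\left(\frac{\tau_1}{2}\frac{\tau_2}{2} - \frac{\tau_2}{2}\frac{\tau_1}{2}\right)A^1_\mu A^2_\nu + \left(\frac{\tau_2}{2}\frac{\tau_1}{2} - \frac{\tau_1}{2}\frac{\tau_2}{2}\right)A^2_\mu A^1_\nu$
In view of the relation
$\left[\frac{\tau_1}{2}, \frac{\tau_2}{2}\right] = i\frac{\tau_3}{2}$
these terms can be written as:
$\left(i\frac{\tau_3}{2}\right)(A^1_\mu A^2_\nu - A^2_\mu A^1_\nu)$
Collecting these facts together, we equate the coefficient of $\tau_i$ on either side of (iv) and obtain the required equality given in (6.4.21)(a):
(vii) $F^i_{\mu\nu} = \partial_\mu A^i_\nu - \partial_\nu A^i_\mu + g\epsilon^{ijk}A^j_\mu A^k_\nu.$
To obtain (6.4.24) we substitute $1 - i\frac{\tau\cdot\theta}{2}$ for $U(\theta)$ in (6.4.23) and obtain, to first order in $\theta$:
(viii) $\tau\cdot F'_{\mu\nu} = \left(1 - i\frac{\tau\cdot\theta}{2}\right)\tau\cdot F_{\mu\nu}\left(1 + i\frac{\tau\cdot\theta}{2}\right) = \tau\cdot F_{\mu\nu} - \frac{i}{2}(\tau\cdot\theta)(\tau\cdot F_{\mu\nu}) + \frac{i}{2}(\tau\cdot F_{\mu\nu})(\tau\cdot\theta)$
The last two terms written out in full are:
$-\frac{i}{2}\left[(\tau_1\theta^1 + \tau_2\theta^2 + \tau_3\theta^3)(\tau_1F^1_{\mu\nu} + \tau_2F^2_{\mu\nu} + \tau_3F^3_{\mu\nu}) - (\tau_1F^1_{\mu\nu} + \tau_2F^2_{\mu\nu} + \tau_3F^3_{\mu\nu})(\tau_1\theta^1 + \tau_2\theta^2 + \tau_3\theta^3)\right].$
Out of these eighteen terms six cancel in pairs, and the other twelve combine as they did in (vi). After this simplification, we equate the coefficient of $\tau_i$ to obtain $F'^i_{\mu\nu} = F^i_{\mu\nu} + \epsilon^{ijk}\theta^jF^k_{\mu\nu}$.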
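The bookkeeping of the eighteen terms in Hint 3 can be cross-checked numerically. The sketch below assumes that the $\tau_i$ are the Pauli matrices and uses made-up sample values for the isovector components of $A_\mu$ and $A_\nu$. It verifies that the commutator of $\tau\cdot A_\mu/2$ and $\tau\cdot A_\nu/2$ has the coefficient $i\,\epsilon^{ijk}A^j_\mu A^k_\nu$ in front of $\tau_i/2$, which is the step that produces the last term of (6.4.21)(a).

```python
# Pauli matrices tau_1, tau_2, tau_3 (assumed meaning of the tau_i in the hint).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def half_tau_dot(v):
    """(tau . v) / 2 for an isovector v."""
    return [[sum(TAU[i][r][c] * v[i] for i in range(3)) / 2 for c in range(2)]
            for r in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(2)] for r in range(2)]

def coefficients(m):
    """If m = c^i tau_i / 2, then c^i = Tr(tau_i m)."""
    return [sum(matmul(TAU[i], m)[r][r] for r in range(2)) for i in range(3)]

def cross(a, b):
    """(a x b)^i = eps^{ijk} a^j b^k."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

A_mu = [0.5, -1.0, 2.0]     # hypothetical sample components A^i_mu
A_nu = [1.5, 0.25, -0.75]   # hypothetical sample components A^i_nu

lhs = coefficients(commutator(half_tau_dot(A_mu), half_tau_dot(A_nu)))
rhs = [1j * x for x in cross(A_mu, A_nu)]
print("coefficients of tau_i/2 in the commutator:", lhs)
print("i * eps^{ijk} A^j_mu A^k_nu:              ", rhs)
```

Multiplying by $-ig$ then gives $g\,\epsilon^{ijk}A^j_\mu A^k_\nu$ as the coefficient of $\tau_i/2$, in agreement with (vii).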
5
EXAMPLES OF THEORIES WITH GAUGE SYMMETRY
We shall see in the next section how the progress in present-day gauge theory has been influenced by the theory of principal bundles. As mentioned earlier, the theorists in both areas were unaware of the implications of each other's findings. Only when it was realized that the two main ingredients of principal bundles, namely the connection and the curvature$^{7}$, could be identified with the gauge fields ($A_\mu$) and the second-rank tensor ($F_{\mu\nu}$), did the two begin to mingle. Gauge theories then received enormous research impetus from both points of view. We devote this section to enumerating the principal gauge theories currently in use. Also, to prepare ourselves for the bundle-theoretic approach in the next section, we show how geometric methods can be used to introduce the gauge concept.

7. In contrast to the terminology used in physics, in bundle-theoretic gauge theory the curvature is called the gauge field and the connection the gauge potential.
5.1
Maxwell and Yang-Mills Equations in Classical Form
We begin with Maxwell's equations and apply to them the derivation rules of differential forms defined on the Minkowski space. Recall that Maxwell's equations in classical form are:
$\mathrm{div}\,E = \rho, \quad \mathrm{div}\,B = 0; \quad \mathrm{curl}\,E = -\frac{\partial B}{\partial t}, \quad \mathrm{curl}\,B = J + \frac{\partial E}{\partial t}$
(6.5.1)
where $E = E(t)$ and $B = B(t)$ are time-dependent electric and magnetic fields defined on a subset of $R^3$, and $\rho$ and $J$ are the charge and current densities respectively (see Exc. 1). It is worth noting here that these equations unified the theories of electricity and magnetism and thus led to important advances in physics. When techniques of differential geometry and form calculus were introduced on the Minkowskian space $M^4$ (with signature $(-1, 1, 1, 1)$), it became apparent that both electric and magnetic fields could be written as components of a skew-symmetric tensor, or equivalently those of a differential 2-form F. Using the calculus of forms the equations could be succinctly written as:
$dF = 0, \quad \delta F = {*d{*F}} = j = (J, \rho)$
(6.5.2)
The d and its adjoint $\delta$ in the above equation are the differential and codifferential operators, and j denotes the current density 1-form. It is easy to check (using the standard chart on $M^4$ and the induced bases of tensor spaces) that the components of the second order tensor $F_{ij}$ ($i < j$) on $M^4$ can be identified with the components of the vectors $E_i$ and $B_i$ on $R^3$ as follows:
$F_{i4} = E_i, \quad F_{ij} = \epsilon_{ijk}B_k$
(6.5.3)
The indices i, j, k in (6.5.3) stand for 1, 2, 3 and $\epsilon_{ijk}$ is the totally antisymmetric symbol with $\epsilon_{123} = 1$. Before we look into the gauge aspect of Maxwell's equations, we note that they are globally invariant under the conformal group and in particular under the Lorentz group. In order to see their invariance under the gauge transformations we note that on $M^4$, the vanishing of dF implies that F must be the differential of a 1-form A (classically known as the 4-potential), in other words F = dA. Naturally the choice of A cannot be unique, since if A' is another 1-form such that F = dA', we shall have:
$F - F = dA' - dA = 0$
and this suggests that A' and A in turn are related to each other by an equation:
$A' = A + d\Psi \equiv A^{\Psi}$ where $\Psi \in \mathcal{F}(M^4)$
(6.5.4)
i.e., $A \to A^{\Psi} = A + d\Psi$. Now $\Psi$ is a real smooth function, therefore the above relation implies that there is a change of scale at each point. If this local scale is replaced by a local phase taking values in the unitary group U(1), then there follows a gauge-theoretic formulation of these equations. The gauge (local scale) transformation function $\Psi$ of equation (6.5.4) is replaced by the gauge transformation $g := e^{i\theta}$ that involves the phase, and as a result (6.5.4) becomes:
$iA \mapsto iA^g := g^{-1}(iA)g + g^{-1}dg$
(6.5.5)
where $g = e^{i\theta} \in \mathcal{F}(M^4, U(1))$, i.e., g is a smooth function defined on $M^4$ and taking values in U(1).
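For the abelian group U(1) the right side of (6.5.5) can be evaluated directly, since $g^{-1}(iA)g = iA$ and $g^{-1}dg = i\,d\theta$, so that $A^g = A + d\theta$, which has the same form as (6.5.4). The sketch below illustrates this along a single coordinate $x$ only. The component `A`, the phase `theta` and the finite-difference step are hypothetical choices, and the exterior derivative is replaced by an ordinary derivative.

```python
import cmath
import math

def A(x):
    """Hypothetical component of the potential along the x direction."""
    return math.sin(x)

def theta(x):
    """Hypothetical local phase; g(x) = exp(i theta(x)) takes values in U(1)."""
    return 0.5 * x * x

def g(x):
    return cmath.exp(1j * theta(x))

def transformed(x, h=1e-5):
    """Right side of (6.5.5) along x: g^{-1} (iA) g + g^{-1} dg/dx."""
    dg = (g(x + h) - g(x - h)) / (2 * h)   # central difference for dg/dx
    return (1 / g(x)) * (1j * A(x)) * g(x) + (1 / g(x)) * dg

x0 = 0.7
value = transformed(x0)
expected = 1j * (A(x0) + x0)   # i (A + d theta/dx), since d theta/dx = x here
print(value, expected)
```

The real part of the result vanishes up to discretization error and the imaginary part equals $A + d\theta/dx$; for a non-abelian group the first term would no longer reduce to $iA$.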
(Compare Eq. (6.5.5) with Eq. (ii) in the Hint to Exc. (6.4.2)). Note that in the above formulation the Minkowskian space $M^4$ can also be taken as a 4-dimensional Lorentz manifold, usually referred to as the space-time manifold in Einstein's theory of gravitation (see Chapter 8). Naturally, therefore, this allows the generalization of gauge theory to Einstein's theory of gravitation. However, we do not pursue it here. Instead we consider the Yang-Mills field equations, written for the vector potential $b_\mu$ of isotopic spin in interaction with a field $\Psi$ of isotopic spin 1/2. These field equations in classical form look like:
$\frac{\partial f_{\mu\nu}}{\partial x_\nu} + 2\epsilon(b_\nu \times f_{\mu\nu}) + J_\mu = 0$
(6.5.6)
where the symbol $\times$ stands for the cross product between $b_\nu$ and the SU(2)-valued gauge field:
$f_{\mu\nu} = \frac{\partial b_\mu}{\partial x_\nu} - \frac{\partial b_\nu}{\partial x_\mu} - 2\epsilon(b_\mu \times b_\nu)$
(6.5.7)
(Note that Eq. (6.5.7) can be identified with Eq. (6.4.21)(a)). The quantity $J_\mu$ denotes the current density of the field $\Psi$ (called the source field). These equations could be thought of as a matrix-valued generalization of the equations for the classical vector potential of Maxwell's theory. For more than a decade there did not appear to be any applicability of these equations, since the massless particles predicted by the theory could not be identified with any particle (see Chap. 7 in [16]). This was in contrast to Maxwell's equations, where the massless particle was identified as the photon, the carrier of the electromagnetic field.
5.2
Other Important Gauge Theories; Spontaneously Broken Symmetry
In the late sixties the spontaneous symmetry breaking phenomena$^{8}$ introduced by Higgs (known as Higgs' mechanism) altered the situation completely (see the Lagrangian $L_{WS}$ given in (6.5.10)). The mechanism which broke the existing symmetry eventually led to massive gauge vector bosons and thus helped in circumventing the difficulty created by predictions of massless particles in the Yang-Mills theory. Besides these two classical gauge theories, we currently know three more gauge theories with different gauge groups. These are as follows:
1. The electroweak theory$^{9}$: This theory gives a unified treatment of interactions of (long range) electromagnetic forces and (short range) weak forces. The carriers of weak forces, (universally) denoted as $W^+$, $W^-$, $Z^0$, are called weak intermediate vector bosons. Unlike the carrier particle of electromagnetic interactions, the photon, the particles $W^+$, $W^-$, $Z^0$ are massive. This is quite understandable from their short range behavior in view of the energy principle. The underlying gauge group of the theory is $U(1) \times SU(2)$. It is called the electroweak gauge group.

8. One of the ways to break the symmetry is to introduce a term in the Lagrangian $L$ which no longer allows $L$ to be invariant under the given symmetry group. We refer the reader to [4] and [16] for a readable description of this concept.
9. This is also sometimes called the standard model, as we shall see in the Lagrangian $L_{WS}$ at the end of this section (see J.C. Taylor in P. Dita, et al., [6] on Mathematical Analysis of Standard Electroweak Theory).

The $U(1)$ factor is called the weak hypercharge gauge group and is denoted $U_Y(1)$, while the $SU(2)$ factor is called the weak isospin gauge group and is denoted as $SU_L(2)$. The subscript L in $SU_L(2)$ denotes the action of SU(2) on left-handed fermions. The theory is left-right asymmetric, thus if $f_L$ and $f_R$ denote the set of left-handed and right-handed fermions, and Y is the quantum number assigned to them, then the following relations hold:
$\sum_{f_L} Y = \sum_{f_R} Y = \sum_{f_{L2}} Y = 0$
(6.5.8)
where $f_{L2}$ denotes the set of left-handed fermion doublets (see Exc. 4).
2. Quantum chromodynamics, or QCD for short: This is the theory of strong forces. The underlying gauge group, called the color gauge group, is denoted $SU_C(3)$. The elementary particles of the theory are gluons and quarks. The gluons are spin-1 massless particles, have short range behavior, and in analogy with photons carry (color) charge and (color) magnetic moment, etc. The quarks, on the other hand, are spin-1/2 particles; they are characterized by their "flavors" (called up, down, strange, charmed, bottom and top) and the "colors" (red, green and blue) in which these flavors come$^{10}$. The main features of the theory are: it is asymptotically free (i.e., the coupling strength of interacting forces decreases at short distances), and it admits renormalization (see Chapter 9 for renormalization). For an elementary description of the theories introduced in 1 and 2 the reader should refer to Chapters 11 and 16 in [4] and Chapters 9, 10 and 11 in [16]. See also Exc. 5 for a Lagrangian of the theory.
3. Standard Model (or SM for short): This model combines the Glashow-Weinberg-Salam theory of electroweak interactions and quantum chromodynamics; in other words, it is a model that unifies the strong, weak and electromagnetic forces. It can be viewed as a Yang-Mills theory of quark and lepton interactions based on the gauge group $S(U(2) \times U(3))$ defined as follows:
$S(U(2) \times U(3)) = \left\{ U = \begin{pmatrix} U(2) & 0 \\ 0 & U(3) \end{pmatrix} : \det U = 1 \right\}$
(6.5.9)
This $S(U(2) \times U(3))$ group, formed by a set of $5 \times 5$ unitary matrices, is a subgroup of SU(5). It has the same Lie algebra as that of $U(1) \times SU(2) \times SU(3)$, and therefore sometimes the underlying group of SM is also given as $SU_C(3) \times SU_L(2) \times U_Y(1)$. In spite of this similarity, however, one should remember that while the SM group $S(U(2) \times U(3))$ is symmetric under complex conjugation, the chiral quark and lepton representations are not (for a readable account of this model, refer to Chapter 3 in Ref. [11]). Having familiarized ourselves with present-day gauge theories, we move on to their mathematical formulation in the next section. But before this we give below the Lagrangian $L_{WS}$ that we mentioned above. In other words, we describe the Lagrangian of the standard model whose underlying group is $SU(2) \times U(1)$.$^{11}$ We refrain from its mathematical analysis due to our limited scope and also because this analysis is done beautifully by Ellis in [11]; we simply note that $L_{WS}$ is SU(2)-invariant and is also renormalizable.

10. Before the discovery of quarks, "protons" and "neutrons" were the "elementary" particles of nature. A proton contains two up quarks and one down quark (one of each color), and a neutron contains two down quarks and one up quark, again one of each color (see Table S).
11. The Glashow-Weinberg-Salam model was the 'standard model' before the advent of the model that incorporated their model with QCD. (See also page 800)

For ease of reading for the motivated reader, we decided to use the notations given in the above reference. The full Lagrangian, denoted $L_{WS}$ (Weinberg-Salam), equals$^{12}$:
$L_{WS} = L_G + \delta L + L_f + L_\phi + L_Y + L_V.$
(6.5.10)
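The six components of the sum are given as follows:
(a) $L_G = -\frac{1}{4}(G^a_{\mu\nu}G^{a\mu\nu}) - \frac{1}{4}F_{\mu\nu}F^{\mu\nu}$
where
$G^a_{\mu\nu} \equiv \partial_\mu W^a_\nu - \partial_\nu W^a_\mu + g\epsilon_{abc}W^b_\mu W^c_\nu$
and
$F_{\mu\nu} \equiv \partial_\mu B_\nu - \partial_\nu B_\mu$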
are respectively the field strength tensors of the SU(2) gauge fields $W^a_\mu$ and the U(1) gauge field $B_\mu$, and g is the coupling constant$^{13}$. $L_G$ represents intermediary vector boson (IVB) interactions.
(b) $\delta L = +\frac{g^2\theta}{32\pi^2}G^a_{\mu\nu}\tilde{G}^{a\mu\nu}$
where
$\tilde{G}^a_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\rho\sigma}G^{a\rho\sigma}.$
The term $\delta L$ allows for a possible P- and CP-violating SU(2) gauge term$^{14}$ (see R.D. Peccei and Helen R. Quinn in [11] or [24]).
(c) $L_f = -\sum_f\left[\bar{f}_L\gamma^\mu\left(\partial_\mu + ig\frac{\tau^a}{2}W^a_\mu + ig'Y_LB_\mu\right)f_L + \bar{f}_R\gamma^\mu\left(\partial_\mu + ig'Y_RB_\mu\right)f_R\right]$ $^{15}$
where $f_L$ and $f_R$ are left- and right-handed fermions, $f_L$ being an SU(2)-doublet and $f_R$ an SU(2)-singlet (see Chapters 7 and 11 for more on fermions and also Exc. 4). The entities $g'$ and $Y_{L,R}$ are the (arbitrary) U(1) hypercharge coupling and the fermion hypercharges respectively. $L_f$ is the fermionic kinetic part of L. The quantities in round parentheses are covariant derivative operators of fermion fields, and they yield the fermion-fermion-boson vertices.
(d)
L$=
where 0 =
K
-^d^+ig^-W^+ig'-Y^B^
is a single complex 5f/(2)-doublet of elementary Higgs' fields.
The antiparticle
/ TO \
corresponding to
. The hypercharge $Y_\phi$ is generally chosen to be 1/2. $L_\phi$ is the kinetic term for the Higgs' fields, when $\langle 0|$

12. Readers in particle physics may refer to Sec. 12.2 in [4] for an analysis of $L_{WS}$.
13. Note that our notations for $W^a_\mu$ and $B_\mu$ in Sec. 4 are $A^i_\mu$ and $A_\mu$.
14. P = parity (space inversion), CP = space inversion P followed by charge conjugation C (see Chapter 7 for C).
15. The absence of a $W^a_\mu$ interaction term in the second parenthesis is due to the fact that $f_R$ is an SU(2)-singlet.
16. The superscripts +, - and 0 in $\phi$ and $\phi^\dagger$ indicate the positive, negative and neutral charge (see also Table S).
(e)
Lr=- X[w / / '(/z.^)/« + H*fr ~fki
where Hff> is a general coupling matrix in the space of different fermion species; these give rise to fermion matrix when (0|
(f)
L v = ( 8 ^ + ) ( 3 £ ) - V ( 0 with V(0) = V ( < / > V ) + | A ( 0 > ) 2
is the Lagrangian that represents the Higgs self-interactions (see Eq. (6.3.18)). Next we shall replace the SU(2) doublet $\phi$ in (f) by a real scalar field $\phi$ and use the resulting Lagrangian to demonstrate the spontaneous symmetry breaking (SSB) phenomena. The Lagrangian (f) now reads:
$L = \frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{\mu^2}{2}\phi^2 - \frac{\lambda}{4}\phi^4$
(6.5.11)
The last two terms show that the field $\phi$ is a self-interacting field, and positivity of $\lambda$ ensures that the energy is bounded from below. Evidently the Lagrangian is invariant under the transformation
$\phi \to -\phi$
(6.5.12)
Hence from definition (vi) of Sec. 2, SSB would follow if the vacuum is not invariant. The following two figures illustrate a qualitatively different behaviour of the (physical) system underlying the above Lagrangian.
[Figure: Symmetry breaking phenomena for the potential $V(\phi)$; panel (a) $\mu^2 > 0$, panel (b) $\mu^2 < 0$.]
In case (a) the vacuum expectation value of the field is
$\langle\phi\rangle_0 = 0$
(6.5.13)
The vacuum $|0\rangle$ is invariant, thus the symmetry is not broken. The parameter $\mu$ here plays the role of a mass. This situation is called the Wigner mode.
In case (b) ($\mu^2 < 0$), the potential $V(\phi)$ has minima at
$\langle\phi\rangle_0 = \pm\sqrt{\frac{-\mu^2}{\lambda}} = \pm v$
(6.5.14)
This gives two degenerate vacuum states. The origin (in Fig. (b)) is no longer a stable point. In other words the vacuum (ground state) is not invariant. Thus the 'symmetry is spontaneously broken.' We note that the change from situation (a) to (b) (illustrated in the two figures) is called a phase transition, with $\mu^2$ playing the role of the order parameter. In conclusion, we note two important aspects of SSB phenomena in case the symmetry is a continuous one. (1) If the spontaneously broken symmetry (SBS) is a continuous global symmetry, one massless scalar field (Goldstone boson) must appear in the theory for each group generator that has been broken. (2) If the SBS is a continuous local gauge symmetry, then no Goldstone bosons are produced; instead there are gauge bosons that may acquire a mass without spoiling gauge invariance. This is the Higgs mechanism (see Chapter 10). As an exercise the reader should check that if the real scalar field in (6.5.11) is replaced by a complex scalar field, the Lagrangian is a U(1)-invariant one (under 'global' phase transformations). The minima (for $\mu^2 < 0$) of the potential $V(\phi)$
\
(6.5.15)
It is a case of spontaneously broken symmetry. There is an infinity of degenerate ground states, and it is the case (1) of SBS given above (check why?)
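The two regimes of the real scalar model can be explored with a few lines of code. This is a sketch under a stated assumption: the potential is taken to be $V(\phi) = \frac{1}{2}\mu^2\phi^2 + \frac{1}{4}\lambda\phi^4$, the normalization that reproduces the vacuum value $v = \sqrt{-\mu^2/\lambda}$ of (6.5.14). The function names and sample parameters are hypothetical.

```python
import math

def potential(phi, mu2, lam):
    """Assumed real-scalar potential V = mu^2 phi^2 / 2 + lambda phi^4 / 4."""
    return 0.5 * mu2 * phi ** 2 + 0.25 * lam * phi ** 4

def vacuum_values(mu2, lam):
    """Analytic minima: 0 for mu^2 >= 0 (Wigner mode), +-v for mu^2 < 0."""
    if lam <= 0:
        raise ValueError("lambda must be positive for a potential bounded below")
    if mu2 >= 0:
        return [0.0]
    v = math.sqrt(-mu2 / lam)
    return [-v, v]

def grid_minimum(mu2, lam, half_width=3.0, steps=6001):
    """Brute-force location of a minimum on a uniform grid (independent check)."""
    best_phi, best_value = None, float("inf")
    for k in range(steps):
        phi = -half_width + 2 * half_width * k / (steps - 1)
        value = potential(phi, mu2, lam)
        if value < best_value:
            best_phi, best_value = phi, value
    return best_phi

for mu2 in (1.0, -1.0):
    print("mu^2 =", mu2,
          "analytic minima:", vacuum_values(mu2, 0.5),
          "grid minimum:", grid_minimum(mu2, 0.5))
```

For $\mu^2 > 0$ the only minimum is the symmetric point $\phi = 0$; for $\mu^2 < 0$ the grid search lands on one of the two degenerate points $\pm v$, each of which is mapped to the other by the symmetry of the Lagrangian.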
Exercise 6.5
1. Consider the Maxwell equations in classical form:
(i) $\nabla\cdot E = \rho$ (Gauss's law)
(ii) $\nabla\times B = \frac{1}{c}J + \frac{1}{c}\frac{\partial E}{\partial t}$
(iii) $\nabla\cdot B = 0$ (No free magnetic poles)
(iv) $\nabla\times E = -\frac{1}{c}\frac{\partial B}{\partial t}$ (Faraday's law when c is not there)
where E and B are electric and magnetic fields and $\rho = \rho(x, t)$ and $J = J(x, t)$ are the charge and current densities. Using these equations show that the predictions of Maxwell's theory for observable quantities are gauge invariant.
2. Give an example of 'gauge fixing'. What is a Coulomb gauge? And what are its implications on Maxwell's equations? Show that the electric field E and the magnetic field B can be thought of as transverse fields.
3. What are the transverse electromagnetic waves in free space? Show how you would determine the energy of radiation fields.
4. Use the algebraic approach to electroweak theory to show why the underlying group $SU(2) \times U(1)$ is denoted here as $SU_L(2) \times U_Y(1)$.
5. Write a Lagrangian for QCD and point out its similarities with the Lagrangian of the Yang-Mills theory.
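Hints to Exercise 6.5
1. In order to prove that the predictions for observable quantities resulting from Maxwell's theory are gauge invariant, we must show that these equations can be expressed in terms of potentials. Using the fact that the operator $\nabla\cdot\nabla\times$ reduces any vector to zero, we note that equation (iii) implies that
(a) $B = \nabla\times A$
where $A \equiv A(x, t)$ is an arbitrary vector potential. Again from
$\frac{\partial B}{\partial t} = \frac{\partial}{\partial t}(\nabla\times A) = \nabla\times\frac{\partial A}{\partial t}$
and from equation (iv) it follows that:
(b) $E = -\nabla\phi - \frac{1}{c}\frac{\partial A}{\partial t}$
where $\phi = \phi(x, t)$ is a scalar potential. The fields E and B in (a) and (b) remain unchanged under the gauge transformations
(c) $A \to A + \nabla\chi$
(d) $\phi \to \phi - \frac{1}{c}\frac{\partial\chi}{\partial t}$
for an arbitrary smooth function $\chi(x, t)$. Substituting (b) in (i) we obtain
(e) $\nabla\cdot\left(-\nabla\phi - \frac{1}{c}\frac{\partial A}{\partial t}\right) = -\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}(\nabla\cdot A) = \rho$
We use the D'Alembert's operator (see Chap. 3, Sec. 1)
$\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2$
in the above equation to write it as:
(e') $\Box\phi - \frac{1}{c}\frac{\partial}{\partial t}\left\{\frac{1}{c}\frac{\partial\phi}{\partial t} + \nabla\cdot A\right\} = \rho$
Similarly we substitute the value of B from (a) in (ii) and obtain, after simplification and use of the operator $\Box$, the required expression in the potentials:
(f) $\Box A + \nabla\left(\frac{1}{c}\frac{\partial\phi}{\partial t} + \nabla\cdot A\right) = \frac{1}{c}J.$
As equations (e') and (f) are in terms of potentials, we have proved that Maxwell theory is gauge invariant; in other words its predictions are gauge invariant (see also Sec. 7). When $\rho$ and J are zero, the fields E and B are said to be free fields and Maxwell's equations are called source-free equations.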
2. Consider a set of equations such as Maxwell's that remain invariant under gauge transformations (for example Equations (c) and (d) of Exc. 1), and use this invariance property to choose a set of potentials $(A, \phi)$ that satisfy
(i) $\nabla\cdot A + \frac{\partial\phi}{\partial t} = 0$ (we have taken c = 1)
or choose a set $(A, \phi)$ so that
(ii) $\nabla\cdot A = 0.$
A procedure of this type is called 'fixing the gauge' (see Definition (iii) in Sec. 2). The first of these choices is called the Lorentz gauge or Lorentz condition. The second choice is the Coulomb gauge, or the radiation gauge. The solutions of $\nabla\cdot A = 0$ are called the radiation fields*. Now a vector field with vanishing divergence is called a transverse field. Thus in the Coulomb gauge A is a transverse vector field. It can be checked that when $\rho$ and J are zero, in Coulomb's gauge we have:
(a) $\nabla^2\phi = 0$
(b) $\Box A = 0$
(c) $E = -\frac{1}{c}\frac{\partial A}{\partial t}$
Evidently when $\rho = 0$, E is a transverse field, and B is a transverse field by the very definition (iii).
3. Write the wave function $A(x, t)$ as $A_0e^{i(K\cdot x - \omega t)}$. It can be easily checked that for a Coulomb gauge $K\cdot A = 0$, i.e., A is perpendicular to the propagation direction K of the wave. The solutions of $\Box A = 0$ are called the transverse electromagnetic waves in free space. From the above exercise we know that these are also referred to as radiation fields. The energy is naturally given in terms of E and B, hence it is:
$\frac{1}{2}\int(E^2 + B^2)\,d^3x.$

* The current J in (f) of Exc. 1 can be decomposed into longitudinal and transverse parts $J_\parallel$ and $J_\perp$. These parts satisfy $\nabla\times J_\parallel = 0$, $\nabla\cdot J_\perp = 0$. In the Coulomb gauge the R.H.S. of (f) reduces to the transverse part $J_\perp$, hence $\nabla\cdot A = 0$ is also called the transverse gauge (see [28]).

4. We show here how U(1) and SU(2), the gauge groups of the theories of Maxwell and Yang-Mills (respectively), are used in formulating the electroweak theory. This is done with the help of the IVB theory, where one writes the basic interaction (of weak forces) as:
(i) $L = g(J_\mu W^\mu + \text{h.c.})$
($J_\mu$ represents the weak current, $W^\mu$ a massive field, g a coupling constant and h.c. = Hermitian conjugate.) When this theory consists of an electron (e) and its neutrino ($\nu_e$), the Lagrangian in (i) is denoted as:
(ii) $L_W = g(J_\lambda W^\lambda + \text{h.c.})$
where
(iii) $J_\lambda = \bar\nu_e\gamma_\lambda(1 - \gamma_5)e$
is the V-A charged current (for details on the V-A formalism used in the low energy theory of weak interactions for charged currents see Chapters 5 and 11 in [4] and Chapter 6 in [16]),
whereas the electromagnetic interaction of these leptons (electron and electron neutrino) is given by the Lagrangian$^{17}$:
(iv) $L_{em} = eJ^{em}_\lambda A^\lambda$
where
(v) $J^{em}_\lambda = -\bar{e}\gamma_\lambda e.$
We note that the three currents $J_\lambda$, $J^\dagger_\lambda$ and $J^{em}_\lambda$ given in (iii) and (v) do not close under the Lie bracket to form an algebra (which apparently says that SU(2), the simplest group with three generators, cannot be the group of electroweak theory). We define the weak and electric charges respectively as
(vi) $T_+(t) = \frac{1}{2}\int d^3x\,J_0(x) = \frac{1}{2}\int d^3x\,\nu_e^\dagger(1 - \gamma_5)e$
(vii) $Q(t) = \int d^3x\,J^{em}_0(x) = -\int d^3x\,e^\dagger e$
and we further note that $T_-(t) = T_+^\dagger(t)$. Using the canonical commutation relation for fermions:
(viii) $\{\psi_i^\dagger(x, t), \psi_j(x', t)\} = \delta_{ij}\,\delta^3(x - x')$
we have
(ix) $[T_+(t), T_-(t)] = 2T_3(t)$
where
(x) $T_3(t) = \frac{1}{4}\int d^3x\,[\nu_e^\dagger(1 - \gamma_5)\nu_e - e^\dagger(1 - \gamma_5)e]$
Evidently $T_\pm$, Q do not form a closed algebra$^{18}$ but $T_\pm$, $T_3$ do, as shown in (ix). In spite of it, $T_\pm$, $T_3$ are not sufficient for a formulation of the theory, therefore we have to introduce another gauge boson coupled to $T_3$. These four generators will then form the group $SU(2) \times U(1)$ (required for the theory). This can be done in more than one way. For our case we choose to consider the fermionic sector of the standard model composed of e, $\nu_e$ leptons and u, d quarks only. For reasons of 'conservation of helicity' in gauge interactions, we have to have independent left-handed and right-handed fermions. Thus we have the following fifteen two-component fermions:
(xi) $\psi = \nu_{eL},\ e_L,\ e_R,\ u_L,\ d_L,\ u_R,\ d_R$
where color indices $\alpha = 1, 2, 3$ on the quark fields have been suppressed and $e_L$ and $e_R$ stand respectively for $\frac{1}{2}(1 - \gamma_5)e$, $\frac{1}{2}(1 + \gamma_5)e$. Using these we can write the weak charges as:
(xii) $T_+ = \int d^3x\,(\nu_{eL}^\dagger e_L + u_L^\dagger d_L), \quad T_- = (T_+)^\dagger, \quad T_3 = \frac{1}{2}\int d^3x\,(\nu_{eL}^\dagger\nu_{eL} - e_L^\dagger e_L + u_L^\dagger u_L - d_L^\dagger d_L)$
which form the generators of SU(2).

17. See Table S for a characterization of electrons, neutrinos and leptons; V-A = vector and axial vector operators; the superscript $\dagger$ denotes the Hermitian conjugate.
18. Q cannot be a generator of SU(2), since the charges of a complete multiplet must add up to zero in order to fulfill the requirement that the generators of SU(2) be traceless.

From the expressions for the SU(2)-generators, it also follows that
(xiii) $l_L = \begin{pmatrix}\nu_e \\ e\end{pmatrix}_L$ and $q_L = \begin{pmatrix}u \\ d\end{pmatrix}_L$
are SU(2)-doublets and $e_R$, $u_R$, $d_R$ are singlets. Next the U(1) group has to be chosen in such a manner that the electric charge
(xiv) $Q = \int d^3x\left(-e_L^\dagger e_L - e_R^\dagger e_R + \frac{2}{3}(u_L^\dagger u_L + u_R^\dagger u_R) - \frac{1}{3}(d_L^\dagger d_L + d_R^\dagger d_R)\right)$
can be a linear combination of the U(1)-generator and the generator $T_3$ of SU(2). We note that the combination
(xv) $Q - T_3 = \int d^3x\left[-\frac{1}{2}(\nu_{eL}^\dagger\nu_{eL} + e_L^\dagger e_L) + \frac{1}{6}(u_L^\dagger u_L + d_L^\dagger d_L) - e_R^\dagger e_R + \frac{2}{3}u_R^\dagger u_R - \frac{1}{3}d_R^\dagger d_R\right]$
has the property of giving the same quantum number to all members of an SU(2) doublet in (xiii); moreover, it commutes with all the other generators of SU(2):
(xvi) $[Q - T_3, T_i] = 0, \quad i = 1, 2, 3 \quad (T_\pm = T_1 \pm iT_2).$
We write $2(Q - T_3)$ as Y, and this is the required generator of the U(1). The generator Y is called the weak hypercharge. The two groups SU(2) and U(1) are thus appropriately denoted as $SU_L(2)$ and $U_Y(1)$ in the electroweak theory. The four generators mentioned above are $T_i$ (i = 1, 2, 3) and Y.
5. The underlying group of QCD is $SU_C(3)$, where C represents the three-valued quantum number called color (related to quarks). A Lagrangian for the theory is usually written as:
(i) $L_{QCD} = -\frac{1}{2}\mathrm{Tr}\,G_{\mu\nu}G^{\mu\nu} + \sum_{k=1}^{n_f}\bar{q}_k(i\gamma^\mu D_\mu - m_k)q_k$
where
(ii) $G_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - ig[A_\mu, A_\nu]$
(iii) $D_\mu q_k = (\partial_\mu - igA_\mu)q_k$
and
(iv) $A_\mu = \sum_{a=1}^{8}A^a_\mu\frac{\lambda^a}{2}$
The $\lambda^a$'s in (iv) are the Gell-Mann matrices (see Eqs. (3.7.18)-(3.7.19)) that satisfy the SU(3) commutation relations:
(v) $\left[\frac{\lambda^a}{2}, \frac{\lambda^b}{2}\right] = if^{abc}\frac{\lambda^c}{2}$
and the normalization conditions
(vi) $\mathrm{Tr}(\lambda^a\lambda^b) = 2\delta^{ab}$
whereas the $A^a_\mu$ are gluons, the strong interaction gauge fields. The $q_k$ denote the quark fields, where k stands for the flavor index 1, 2, ..., $n_f$ (the number of quark flavors u, d, s, c, b and t). The similarity of (vi) with (6.4.25) and that of (iii) with (6.4.15) is obvious. The gauge fields of the Yang-Mills theory are replaced by gluons here. Thus the strong interaction theory is described by an SU(3) colour Yang-Mills theory. Note that each flavour of quarks transforms here as the fundamental triplet representation.
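Relations (v) and (vi) of Hint 5 can be verified directly from the explicit matrices. The sketch below writes out the standard Gell-Mann matrices (the book refers to Eqs. (3.7.18)-(3.7.19) for them; the explicit entries here are supplied from the usual convention and are an assumption about the book's normalization) and extracts structure constants through $f^{abc} = \mathrm{Tr}([\lambda^a, \lambda^b]\lambda^c)/(4i)$. Indices in the code run from 0 to 7.

```python
import math

S = 1 / math.sqrt(3)

# Standard Gell-Mann matrices lambda^1 ... lambda^8 (stored at indices 0 ... 7).
LAMBDA = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]

def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def trace(x):
    return sum(x[r][r] for r in range(3))

def normalization_error():
    """Largest deviation of Tr(lambda^a lambda^b) from 2 delta^{ab}."""
    return max(abs(trace(matmul(LAMBDA[a], LAMBDA[b])) - (2 if a == b else 0))
               for a in range(8) for b in range(8))

def structure_constant(a, b, c):
    """f^{abc} = Tr([lambda^a, lambda^b] lambda^c) / (4i)."""
    ab = matmul(LAMBDA[a], LAMBDA[b])
    ba = matmul(LAMBDA[b], LAMBDA[a])
    comm = [[ab[r][k] - ba[r][k] for k in range(3)] for r in range(3)]
    return complex(trace(matmul(comm, LAMBDA[c])) / 4j).real

print("normalization error:", normalization_error())
print("f^{123} =", structure_constant(0, 1, 2))
print("f^{458} =", structure_constant(3, 4, 7))
```

With these matrices $-\frac{1}{2}\mathrm{Tr}\,G_{\mu\nu}G^{\mu\nu}$ equals $-\frac{1}{4}G^a_{\mu\nu}G^{a\mu\nu}$, which makes the resemblance to the first term of (6.4.26) explicit.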
6
BUNDLE THEORY FORMALISM IN GAUGE THEORY
We have already seen in Sec. 4 and Sec. 5 that in physical applications, the gauge group that gives the internal or local symmetry of the field is a Lie group. In fact every gauge theory that we have dealt with in these sections had a Lie group associated to it. We also know (from Sec. 2.5) that the group involved in the definition of a principal bundle is a Lie group, hence it is not surprising that principal bundle techniques can be fruitfully applied to gauge theory.
6.1
Principal Bundles as Tools in Gauge Theory
Recall that a principal bundle P(M, G) with structure group G over M is a fiber bundle $(P, M, \pi, G)$ with a free right action $\rho$ of G on P such that$^{19}$:
(i) The orbits of $\rho$ are the fibers of $\pi: P \to M$; in other words, $\pi$ can be identified with the canonical projection $P \to P/G$.
(ii) Every local trivialization of P, $\psi: U \times G \to \pi^{-1}(U)$, in relation to the action $\rho$ satisfies $\psi^{-1}(u_xg) = \psi^{-1}(u_x)\,g$, where $u_xg = \rho(u_x, g)$, $x \in M$, $u_x \in P_x$, $g \in G$.
While using the principal bundles in applications, not only does the group G vary, but M varies as well; for instance in Maxwell's and Yang-Mills' theory M is either a space-time manifold or its Euclidean version (see Sec. 1.4). In theories such as Kaluza-Klein and strings it is an arbitrary manifold (pseudo-Riemannian or Riemannian). We shall therefore opt to describe the theory in more general terms, and shall deduce the results for particular cases. But first we establish some terminology that we shall need. Let (M, g) be a pseudo-Riemannian manifold with metric g and let G be the fixed Lie group which will henceforth be referred to as the gauge group of the theory. Consider now the principal bundle P(M, G). A connection in P (see Def. (2.5.11) and (2.5.12)) is called a gauge connection, and the (corresponding) connection 1-form $\omega$ (see Def. (2.5.14)) is called the gauge connection form. A section $s \in \Gamma(P)$ is called a global gauge or simply a gauge. The gauge potential A on M in the gauge s is obtained by pull-back of the gauge connection $\omega$ on P to M by s, i.e., $A = s^*(\omega)$. When we consider an open subset U of M, and the section s is defined on the restricted bundle $P|_U$, then we call s a local gauge. Thus for instance if $t \in \Gamma(U, P)$ is a local gauge, then the 1-form $t^*(\omega) \in \Lambda^1(U, \mathfrak{g})^{20}$ is the local gauge potential denoted $A_t$. When there is no fear of confusion it is denoted as A. Thus:
$A \equiv A_t = t^*(\omega)$
(6.6.1)

19. We have used slightly different notations in Sec. (2.5).
20. $\mathfrak{g}$ denotes the Lie algebra of G; this was denoted $\mathcal{G}$ in Sec. 2.5.

As an example consider the gauge group G = U(1) of electromagnetism. We already know it is the circle group and its elements $e^{i\theta}$ are determined by the phase $\theta$. In this case a local gauge over an open set $U \subset M$ can be thought of as a choice of phase in the bundle $P|_U = U \times G$ at each point of U. It is for this reason that the total space of P here is sometimes referred to as the space of phase factors in physics. The curvature 2-form $\Omega = d^\omega\omega$, which (as we already know from Sec. 2.5) has values in the Lie algebra $\mathfrak{g}$, is called the gauge field on P. We also know that given the form $\Omega$ on P there exists a unique 2-form on M, denoted $F_\omega$, with values in the Lie algebra bundle ad P. The form $F_\omega$ belongs to $\Lambda^2(M, \mathrm{ad}\,P)$ and is called the gauge field on M. It is interesting to note that $F_\omega$ is globally defined, whereas in general there is no globally defined gauge potential on M. We thus have: Given a local gauge potential $A_t \in \Lambda^1(U, \mathfrak{g})$ the following relations are valid:
$t^*(\omega) = A_t, \quad F_\omega = d^\omega A_t$
(6.6.2)
A word of caution: in the physics literature (sometimes) the mathematicians' gauge potential A is called the gauge field, and the gauge field is called the field strength tensor (see Sec. 4 and Sec. 2.5). To illustrate the concepts introduced above we consider the principal bundle $S^3(S^2, U(1))$ over the base manifold $S^2$. The structure group of the bundle is the circle group U(1) and the bundle is determined by the Hopf fibration of $S^3$ (see Exc. 11 of Sec. 2.5). Let $\mu$ denote the connection 1-form of the canonical connection on this bundle and let $F_\mu$ be the corresponding gauge field on $S^2$. Evidently $F_\mu$ is globally defined but the gauge potential is not, since there are at least two charts $(U_1, \psi_1)$, $(U_2, \psi_2)$ that are needed to cover $S^2$. According to these charts there are two locally defined gauge potentials, say $A_1$ and $A_2$, that give rise to the globally defined gauge field $F_\mu$ on the base manifold $S^2$.
Note 6.6.1: The gauge field $F_\mu$ is equivalent to the Dirac monopole field. The Dirac monopole quantization condition corresponds to the classification of principal U(1)-bundles over $S^2$. The classification in this case is given by the first fundamental group $\pi_1(U(1)) = Z$.$^{22}$ In general the principal G-bundles over $S^2$ are classified by $\pi_1(G)$. Thus $\pi_1(SU(2)) = \mathrm{id}$ implies that there is a unique SU(2) monopole$^{23}$ on $S^2$, and $\pi_1(SO(3)) = Z_2$ implies that there are two inequivalent SO(3) monopoles on $S^2$.
Note 6.6.2: The gauge fields and gauge potentials have no physical significance unless they are made to satisfy the (field) equations of a theory. For instance the Riemannian curvature of a space-time manifold is the gauge field corresponding to the gauge potential given by the Levi-Civita connection on the orthonormal frame bundle O(M) of M. This, however, does not describe the gravitational field unless it satisfies Einstein's field equations. If instead it is made to satisfy the Yang-Mills equations, it describes a Yang-Mills field.
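The two defining properties of a principal bundle recalled above can be made concrete for the example $S^3(S^2, U(1))$. The sketch below uses one common coordinate form of the Hopf map, $\pi(z_1, z_2) = (2\,\mathrm{Re}\,z_1\bar{z}_2,\ 2\,\mathrm{Im}\,z_1\bar{z}_2,\ |z_1|^2 - |z_2|^2)$ with $(z_1, z_2) \in S^3 \subset C^2$; this formula and the helper names are assumptions for illustration, and Exc. 11 of Sec. 2.5 may use a different convention. It checks that the right action of $e^{i\theta}$ moves a point along its fiber, so that the orbits project to single points of $S^2$.

```python
import cmath
import math

def point_on_s3(a, b, c):
    """A point (z1, z2) of S^3 in C^2 with |z1|^2 + |z2|^2 = 1."""
    return math.cos(a) * cmath.exp(1j * b), math.sin(a) * cmath.exp(1j * c)

def hopf(z1, z2):
    """Assumed coordinate form of the Hopf projection pi: S^3 -> S^2."""
    w = z1 * z2.conjugate()
    return (2 * w.real, 2 * w.imag, abs(z1) ** 2 - abs(z2) ** 2)

def act(z1, z2, theta):
    """Right action of g = exp(i theta) in U(1) on the total space S^3."""
    phase = cmath.exp(1j * theta)
    return z1 * phase, z2 * phase

u = point_on_s3(0.6, 1.1, -0.4)
base = hopf(*u)
print("base point:", base, "norm^2:", sum(x * x for x in base))

# Moving along the fiber does not change the base point.
for theta in (0.3, 1.7, 3.0):
    moved = hopf(*act(*u, theta))
    print(theta, max(abs(x - y) for x, y in zip(moved, base)))
```

The action is free because $(z_1e^{i\theta}, z_2e^{i\theta}) = (z_1, z_2)$ with $(z_1, z_2) \neq 0$ forces $e^{i\theta} = 1$; a local gauge is then a choice of one point in each fiber over an open subset of $S^2$.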
6.2 The Group Aut(P) of Generalized Gauge Transformations
21. See Sec. 2.5b for details.
22. $\pi_i(X)$ denotes the $i$th fundamental group of the topological space $X$ (see [2] and 2.[29] for details).
23. See Chapter 10 for the definition of a monopole and the references there.
We shall now see how this bundle theory can be used to study the gauge invariance of field equations, which we studied in the previous two sections using the classical concepts of physics. In order to do this we consider the group of diffeomorphisms $\mathrm{Diff}(P)$ of $P$. Since this group mixes up the fibers of $P$, the group as a whole is not a good candidate for the group of gauge transformations. What we need here are those diffeomorphisms that preserve the fibers of $P$. This is achieved by demanding that the following diagram be commutative, i.e., that $\pi \circ \phi = \phi_M \circ \pi$:
$$\begin{array}{ccc} P & \xrightarrow{\ \phi\ } & P \\ \downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi} \\ M & \xrightarrow{\ \phi_M\ } & M \end{array}$$
(6.6.3)
Figure: Projectable diffeomorphism.
Definition 6.6.3: The pair $(\phi, \phi_M)$ satisfying (6.6.3) is called a projectable diffeomorphism or projectable transformation of $P$. It can be checked that the projectable diffeomorphisms form a group denoted $\mathrm{Diff}_M(P)$.
Definition 6.6.4: Let $\Psi^t$ denote a one-parameter group of projectable transformations of $P$ resulting from the vector field $X \in \mathfrak{X}(P)$, and let $\Psi^t_M$ be the corresponding one-parameter group in $\mathrm{Diff}(M)$ associated with the vector field $X_M \in \mathfrak{X}(M)$. Then $X$ is called a projectable vector field on $P$; in this case the pair $(X, X_M)$ satisfies the relation:
$\pi_*(X(u)) = X_M(\pi(u))$ for every $u$ in $P$
(6.6.4)
The collection of projectable vector fields on $P$, denoted $\mathfrak{X}_M(P)$, is a Lie subalgebra of the Lie algebra $\mathfrak{X}(P)$. It should be noted that we have not used the principal bundle properties in Def. (6.6.3); by this we mean that the commutativity condition (6.6.3) makes sense for an arbitrary bundle, thus for instance if
(6.6.5)
The group $\mathrm{Aut}(P)$ is called the group of generalized gauge transformations. It is easy to see that this group is indeed the group of principal bundle automorphisms of $P$. From Sec. (2.5) we know that the fiber-preserving property of the generalized gauge transformation $\phi$ completely determines the diffeomorphism
(6.6.6)
It can be checked that $\mathcal{G}(P)$ is a normal subgroup of $\mathrm{Aut}(P)$. More explicitly
24. Projectable transformation: physics terminology.
(6.6.7)(b)
Construction of these subgroups leads to the following exact sequence of groups:
$1 \to \mathcal{G}(P) \xrightarrow{\ i\ } \mathrm{Aut}(P) \xrightarrow{\ j\ } \mathrm{Diff}(M) \to 1$
(6.6.8)
where $i$ denotes the inclusion map and $j$ is defined by $j(\phi) = \phi_M$ for every $\phi \in \mathrm{Aut}(P)$.
If the bundle $P$ is the bundle of frames $L(M)$, then the above sequence allows additional geometric insight. For example, the connections on the frame bundle $L(M)$, when $M$ is a 4-dimensional Lorentz manifold, play the role of gravitational potentials, and action functionals involving connections and metrics on $M$ become the starting point for a gauge theory of gravitation. Returning to a gauge transformation $\phi$ in $\mathcal{G}$, we note that $\phi$ can be physically interpreted as a local (pointwise) change of gauge over $M$. It is for this reason that $G$ is sometimes called the local symmetry group and $\mathcal{G}$ is called the local gauge group. We shall however restrict to the usage given by the defining Equations (6.6.5) and (6.6.6). Given an open set $U \subset M$, the map $\phi_t : U \to G$ is a local representation of $\phi$ when it is defined in the following manner. Suppose that $t$ is a local gauge over $U$; then $t$ is a section of the bundle $P|_U$, so for $x \in U$, $t(x)$ is in $P_x = \pi^{-1}(x)$, the fiber of $P$ over $x$, and by definition of
for every x in U
(6.6.9)
When $P$ is a trivial bundle, $U$ can be taken as $M$ and the local gauge transformation $\phi$ can be identified with a map from $M$ to $G$.
6.3 The Gauge Algebra of P(M, G) and the Space of Gauge Potentials on it
Before turning to the mathematical basics of gauge theory in this section, we briefly introduce two more objects: the gauge algebra of $P$ and the space of gauge potentials on $P$. In order to do this we show that there are two more groups associated to $P$ and $G$ which are isomorphic to the group $\mathcal{G}(P)$ of gauge transformations. One of these groups results by considering the space $\mathcal{F}(P, G)$ of all smooth functions $f : P \to G$; the group operation here is the pointwise multiplication of these functions.
Definition 6.6.7: The subset of $\mathcal{F}(P, G)$ consisting of all $G$-equivariant functions (with respect to the adjoint action) forms a group denoted $\mathcal{F}_G(P, G)$:
$\mathcal{F}_G(P, G) = \{ f : P \to G \mid f(ug) = g^{-1} f(u) g \text{ for every } u \in P \text{ and for every } g \in G \}$
(6.6.10)
The group $\mathcal{F}_G(P, G)$ is isomorphic to $\mathcal{G}(P)$. To define the other group we note that the associated bundle $\mathrm{Ad}(P) = P \times_{\mathrm{Ad}} G$ over $M$ (by the adjoint action of $G$ on itself) is a bundle of Lie groups with fibre $G$.
Definition 6.6.8: The set $\Gamma(\mathrm{Ad}(P))$ of the sections of this associated bundle with pointwise multiplication is a group. This group is isomorphic to $\mathcal{G}(P)$ as well as to $\mathcal{F}_G(P, G)$. It follows that any one of these groups or their representations can be used when we deal with the group of gauge transformations. Consider now the associated vector bundle $E(M, \mathfrak{g}, \mathrm{ad}, P)$ with fiber type $\mathfrak{g}$ and the adjoint action ad of $G$ on $\mathfrak{g}$. Recall that this bundle is a bundle of Lie algebras denoted $P \times_{\mathrm{ad}} \mathfrak{g}$ or $\mathrm{ad}(P)$ (see Sec. 2.5, where the same notation was used).
Definition 6.6.9: The set of sections $\Gamma(\mathrm{ad}P) = L\mathcal{G}(P)$ is a Lie algebra under the pointwise bracket operation. The Lie algebra $L\mathcal{G}(P)$ is called the gauge algebra of $P$. If instead of $\mathcal{F}_G(P, G)$ we consider the set $\mathcal{F}_G(P, \mathfrak{g})$ of all $G$-equivariant functions (with respect to the adjoint action ad of $G$ on its Lie algebra $\mathfrak{g}$) with the pointwise bracket operation, we note that it is a Lie algebra isomorphic to $L\mathcal{G}(P)$ (see Chap. 6 in [20] for details). So here again we can use either of these to describe the theory.
We have thus far considered principal bundles on arbitrary manifolds $M$. We now consider principal bundles $P(M, G)$ where $M$ is a compact, connected and oriented manifold and the gauge group $G$ is compact and semisimple. The choice of compactness$^{25}$ gives a Riemannian metric on $M$, and a fixed orientation permits integration. The base manifolds typically used in physical theories of interest are $S^n$, $T^n$ or their products $S^n \times T^m$. Similarly the most common gauge groups in use are $U(n)$, $SU(n)$, $O(n)$, $SO(n)$ or their products. With these assumptions in place, a (natural) inner product can be defined on the space of gauge potentials (connections) on $P$. This space is denoted $\mathcal{A}(P)$, or $\mathcal{A}$ when $P$ is fixed, and is defined by:
$\mathcal{A}(P) = \{ \omega \in \Lambda^1(P, \mathfrak{g}) \mid \omega \text{ is a connection on } P \}$
(6.6.11)
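From the definition of connection we know that if $\omega_1$ and $\omega_2 \in \mathcal{A}$, then their difference $\omega_1 - \omega_2$ is horizontal and of type $(\mathrm{ad}, \mathfrak{g})$ and thus defines a unique 1-form on $M$ with values in the associated bundle $\mathrm{ad}P \equiv P \times_{\mathrm{ad}} \mathfrak{g}$.$^{26}$ In view of this, for a fixed connection $\gamma \in \mathcal{A}$, the space $\mathcal{A}$ becomes isomorphic to the set:
$\{ \gamma + \pi^* A \mid A \in \Lambda^1(M, \mathrm{ad}P) \}$
(6.6.12)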
This isomorphism implies that $\mathcal{A}$ is an affine space with underlying vector space $\Lambda^1(M, \mathrm{ad}P)$. In other words the tangent space $T_\gamma\mathcal{A}$ is isomorphic to $\Lambda^1(M, \mathrm{ad}P)$. Hence not only can a metric be defined on $\mathcal{A}$; covariant and exterior derivatives can be defined as well (Sec. 2.5b). We shall not go into these details (see [20] and [7]); instead we shall use the action of the group $\mathcal{G}(P)$ on the space $\mathcal{A}(P)$ to derive a gauge transformation relation similar to the one we obtained in Sec. 4 and Sec. 5. We note that essentially there are two different actions of $\mathcal{G}(P)$ on $\mathcal{A}(P)$, but these can be identified. These are the right and left actions, denoted $R_f$ and $L_f$ for $f \in \mathcal{G}(P)$. The first of these has the effect of pulling back the 1-form $\omega \in \mathcal{A}(P)$, thus
$f \cdot \omega = R_{f^{-1}}(\omega) = (f^{-1})^* \omega$
(6.6.13)
and the second one pushes the horizontal distribution $H_P$ (defined by $\omega$) forward, thus we have:
$L_f(\omega) = f_*(H_P)$.
25. The choice of compactness can be relaxed to include base manifolds such as $\mathbb{R}^3$ or $\mathbb{R}^3 \times S^1$ with appropriate boundary conditions.
26. See Sec. 2.5.
In view of our earlier discussions (see Def. (2.5.12) and (2.5.14)) on $H_P$ and $\omega$, it is clear that $R_{f^{-1}} = L_f$. Hence either of these actions can be used to suit a particular problem. Next we obtain a local expression for this action, choosing $G$ to be one of the (classical) matrix groups (see Sec. 4). Corresponding to the given connection $\omega$, let $A_i$, $A_j$ be the local gauge potentials in the local gauges (sections) $t_i$, $t_j$ over $U_i$ and $U_j$ respectively. From our study in Sec. 2.5, the sections $t_i$, $t_j$ satisfy:
$t_i = \psi_{ij}\, t_j$ where $\psi_{ij} : U_i \cap U_j \to G$
(6.6.14)
and the functions $\psi_{ij}$ can be viewed as the local expressions of $f \in \mathcal{G}(P)$. Hence Eq. (6.6.13) becomes:
$A_j = \psi_{ij}^{-1} A_i \psi_{ij} + \psi_{ij}^{-1}\, d\psi_{ij}$
(6.6.15)
(see (2.5.37)). Further, using the terminology of the previous section, if we write $A_i$ as $A$, $A_j$ as $A^g$ and $\psi_{ij}$ as $g$, we obtain the (familiar) local expression of the transformation rule:
$A^g = g^{-1} A g + g^{-1} dg$
(6.6.16)
(Note that we established it for $G = U(1)$ in Eq. (6.5.5).) Very often, while writing the gauge transform of a potential or a field, we use the local expression (6.6.16) in preference to (6.6.15). Thus, for instance, from Eq. (6.6.16) the gauge transform of the gauge field $F_\omega$ is:
$F_\omega^g = g^{-1} F_\omega g$
(6.6.17)
This shows that $|F_\omega|$ is a gauge invariant function on $M$, i.e., $|F_\omega^g| = |F_\omega| \in \mathcal{F}(M)$. Owing to its gauge invariance, this function is used in defining the action functional of the theory based on the gauge group $G$.
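The invariance of $|F_\omega|$ under (6.6.17) can be illustrated numerically at a single point of $M$. The following is only a minimal sketch under stated assumptions: $G = SU(2)$ is chosen for concreteness, the value of the gauge field at the point is an arbitrary element of the Lie algebra, the norm is taken as $|F|^2 = -\mathrm{tr}(F^2)$ (one common choice of invariant inner product, not necessarily the normalization used in the text), and all function names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose; for g in SU(2) this equals the inverse of g."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def su2_group_element(t, alpha, beta):
    """An element of SU(2): [[a, -conj(b)], [b, conj(a)]] with |a|^2 + |b|^2 = 1."""
    a = math.cos(t) * cmath.exp(1j * alpha)
    b = math.sin(t) * cmath.exp(1j * beta)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def su2_algebra_element(x, y, z):
    """i(x s1 + y s2 + z s3) with s1, s2, s3 the Pauli matrices (anti-Hermitian, traceless)."""
    return [[1j * z, 1j * x + y], [1j * x - y, -1j * z]]

def norm_squared(f):
    """Assumed invariant norm: |F|^2 = -tr(F F)."""
    ff = matmul(f, f)
    return -(ff[0][0] + ff[1][1]).real

g = su2_group_element(0.4, 1.1, -0.7)      # value of the gauge transformation at the point
F = su2_algebra_element(0.3, -1.2, 2.0)    # value of the gauge field at the point
F_g = matmul(dagger(g), matmul(F, g))      # F^g = g^{-1} F g, as in (6.6.17)

print(norm_squared(F), norm_squared(F_g))
```

The two printed numbers agree up to rounding, since the trace is unchanged by conjugation; this is the pointwise content of $|F_\omega^g| = |F_\omega|$.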
6.4 The Moduli Space of Gauge Potentials on P(M, G) and Gribov-Ambiguity
Definition 6.6.10: Two connections $\alpha_1, \alpha_2 \in \mathcal{A}(P)$ are said to be gauge equivalent if there exists a gauge transformation $f \in \mathcal{G}(P)$ such that $\alpha_2 = f \cdot \alpha_1$. The definition of the action of $\mathcal{G}(P)$ on $\mathcal{A}(P)$ implies that each equivalence class of gauge equivalent connections is an orbit of $\mathcal{G}(P)$ in $\mathcal{A}(P)$. The orbit space $\mathcal{O} = \mathcal{A}/\mathcal{G}$ represents the gauge inequivalent connections and is called the moduli space of gauge potentials on $P(M, G)$.* The definition of the orbit space leads to the infinite dimensional principal $\mathcal{G}$-bundle
$p : \mathcal{A} \to \mathcal{O} = \mathcal{A}/\mathcal{G}$
(6.6.18)
with $p$ as the natural projection, and it also leads to the notion of sections $s : \mathcal{O} \to \mathcal{A}$. The transformation properties of the gauge group $\mathcal{G}$ are used to write gauge invariant functionals on $\mathcal{A}$. However, to avoid the infinite contributions coming from gauge equivalent fields one has to integrate them on the orbit space $\mathcal{O}$. Unfortunately the mathematical structure of $\mathcal{O}$ is not well known, which makes the problem unwieldy. This difficulty is usually avoided by choosing a section $s$ and integrating over the image $s(\mathcal{O}) \subset \mathcal{A}$ with a suitable weight factor such as the Faddeev-Popov determinant. This procedure of choosing one connection in $\mathcal{A}$ from each equivalence class in $\mathcal{O}$ is called gauge fixing.* Naturally the procedure cannot work if the required sections do not exist. For example, Gribov showed that for the trivial $SU(2)$ bundle over $\mathbb{R}^4$ the Coulomb gauge fails to be a section and hence is not a true global gauge. The non-existence of a global gauge is referred to as the Gribov ambiguity. It can be shown that the Gribov ambiguity is related to the topological structure of a principal bundle (see, for instance, [14], [18], [25]). Due to our limited scope we are unable to devote time to these technicalities. Finally, in pursuance of our programme of learning the physicists' and mathematicians' approaches to gauge theory in a unified manner, we give an example of the Coulomb gauge (in this section) and devote the next section to some more intriguing characteristics of gauge theory.
* Note that points of the orbit space $\mathcal{O}$ correspond to equivalence classes of connections.
Example 6.6.11: Consider the principal bundle $P(S^4, SU(2))$ and let $\mathcal{A}(P)$ denote the space of gauge potentials. For a fixed connection $\alpha \in \mathcal{A}$, we know that $\mathcal{A}$ is isomorphic to the vector space of 1-forms on $S^4$ with values in the vector bundle $\mathrm{ad}P = P \times_{\mathrm{ad}} \mathfrak{g}$, i.e.,
$\mathcal{A} = \{ \alpha + \pi^* A \mid A \in \Lambda^1(S^4, \mathrm{ad}P) \}$
(6.6.19)
(Note that (6.6.19) is indeed (6.6.12) with $S^4$ in place of $M$ and with $SU(2)$ as the group $G$.) Consider the subspace $S_\alpha \subset \mathcal{A}$ defined as
$S_\alpha = \{ \alpha + \pi^* A \mid \delta^\alpha A = 0 \}$
(6.6.20)
where $\delta^\alpha$ denotes the coexterior derivation with respect to $\alpha$ (see Eq. (2.5.50)). The subspace $S_\alpha$ is called the generalized Coulomb gauge. If in particular $\alpha = 0$, then we have $S_0 = \{ \pi^* A \mid \delta^0 A = 0 \}$. The condition $\delta^0 A = 0$ can locally be written as
Now on $\mathbb{R}^4$ as a base manifold, one can find a connection whose time component is zero and which is gauge equivalent to the given connection (i.e., the two are related by a gauge transformation). The above gauge condition then reduces to the classical Coulomb gauge condition
$\mathrm{div}\, A = 0$
(6.6.21)
Through this simple example we have thus shown how the topological construct $\delta^\alpha A = 0$ can be identified with the classical condition $\nabla \cdot A = 0$.
7 MORE ON CHARACTERISTICS OF GAUGE THEORIES, AND EXAMPLES BASED ON THEM
In spite of the great progress already made in theories based on gauge symmetries (e.g., the electroweak theory), gauge theories can still shed new light on theoretical physics (see references given in Exp. 4). We devote this section to a few examples of this aspect. For instance, in the first two examples (cited below as results), using the bundle-theoretic construction of gauge theory, we establish the existence of Maxwell's and Yang-Mills fields on arbitrary manifolds. In particular we prove (in brief) the following two results:
Result 6.7.1: Given a principal bundle $P(M, U(1))$ over a compact, simply connected, oriented Riemannian manifold $M$, the Maxwell field is the unique harmonic 2-form representing the Euler class or the first Chern class $c_1(P)$.
Result 6.7.2: Let $P(M, G)$ be a principal bundle whose base manifold $M$ is Riemannian and connected and $G$ is an arbitrary Lie group, and let $\omega$ be a connection on $P(M, G)$ with finite Yang-Mills action. Then the following four statements are equivalent:
(i) $\omega$ is a critical point of the Yang-Mills functional;
(ii) $\omega$ satisfies the equation $\delta^\omega F_\omega = 0$;
(iii) $\omega$ satisfies the equation $d^\omega {*F_\omega} = 0$;
(iv) $\Delta^\omega F_\omega = 0$, where $\Delta^\omega$ is the Hodge Laplacian
$\Delta^\omega = d^\omega \delta^\omega + \delta^\omega d^\omega.$
(6.7.1)
* We emphasize that in any (sensible) physical theory 'gauge-fixing' conditions are always used, as they help to remove the redundant degrees of freedom that creep in due to the 'gauge invariance' of the theory in question (see Chapter 11).
7.1 A Generalized Maxwell's Field
Proof of Result 6.7.1: To show the existence of Maxwell's fields on arbitrary manifolds we begin by recalling the definition and existence of such a field on the Minkowski space $M^4$. Consider the principal bundle $P(M^4, U(1))$. Since any principal bundle on the Minkowski space is trivializable, it can be thought of as $M^4 \times U(1)$. Now the Lie algebra $u(1)$ of $U(1)$ can be identified with $i$ (the imaginary unit) times the real line; a connection form $\omega$ on $P$ ($\omega \in \Lambda^1(P, i\mathbb{R})$) can therefore be written as $i\omega$ by choosing $\{i\}$ as the basis for $i\mathbb{R}$. Similarly the gauge field $\Omega = d\omega \in \Lambda^2(P, i\mathbb{R})$ can be written as $i\Omega$. This leads to the Bianchi identity $d\Omega = 0$. In order to define the corresponding gauge field $F_\omega$ (see Sec. 6) on $M^4$, we note that the bundle $\mathrm{ad}(P)$ is also trivial and therefore can be expressed as $\mathrm{ad}P = M^4 \times u(1)$. Accordingly the gauge field $F_\omega \in \Lambda^2(M^4, \mathrm{ad}(P))$ can be written on the base $M^4$ as $iF$ where $F \in \Lambda^2(M^4)$. Since the bundle is trivial, we have a global gauge $s : M^4 \to P$ defined as $s(x) = (x, 1)$ for every $x \in M^4$; we use this to pull back the connection form $i\omega$ on $P$ to $M^4$ and obtain the gauge potential:
$iA = i s^*(\omega)$
(6.7.2)
The gauge potential $A \in \Lambda^1(M^4)$ in this case is evidently global and the corresponding gauge field, which as we know from Sec. 6 is always global, is:
$F = dA$
(6.7.3)
The Bianchi identity:
$dF = 0$
(6.7.4)
follows from the fact that $F$ is exact (i.e., it is the differential of a 1-form). Consider now the action given by the gauge field $F$:
$S_A = \int |F|^2 \, dv$
(6.7.5)
where $|F|$ stands for the pseudo-norm induced by the Lorentz metric on $M^4$ and the trivial inner product on the Lie algebra $u(1)$, and $dv$ stands for the infinitesimal volume element on $M^4$. Note that the action represents the total energy of the electromagnetic field. The Euler-Lagrange equation obtained by minimizing the action $S_A$ gives:
$\delta F = 0 \iff d{*F} = 0$
(6.7.6)
The equation $\delta F = 0$ coupled with $dF = 0$ gives Maxwell's equations for a source-free electromagnetic field. Consider now a gauge transformation $\phi$, a section of $\mathrm{Ad}(P) = M^4 \times U(1)$; note that this is completely determined by a smooth function $\Psi \in \mathcal{F}(M^4)$:
$\phi(x) = (x, e^{i\Psi(x)}) \in \mathrm{Ad}(P)$ for every $x \in M^4$
(6.7.7)
Thus if $iB$ denotes the gauge potential which results from the action of the gauge transformation $\phi$ on $iA$, then in view of (6.6.16) it follows that:
$iB = e^{-i\Psi} (iA)\, e^{i\Psi} + e^{-i\Psi}\, d e^{i\Psi}$
(6.7.8)
(where we have written $\Psi(x)$ as $\Psi$ for ease of notation). The above equation simplifies to the (familiar) classical formulation:
$B = A + d\Psi$
(6.7.9)
as given in (6.5.4). Thus, in essence, beginning with a principal bundle $M^4 \times U(1)$ and using a 'section' as gauge transformation, we have obtained Maxwell's equations:
$dF = 0, \qquad \delta F = 0$
(6.7.10)
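That the transformation (6.7.9) leaves the gauge field $F = dA$ unchanged can be checked numerically. The sketch below makes several simplifying assumptions that are not in the text: it works on a two-dimensional patch with coordinates $(x, y)$ instead of $M^4$, the potential $A$ and the function $\Psi$ are arbitrary illustrative choices, and the single independent component $F_{01} = \partial_x A_1 - \partial_y A_0$ is approximated by central differences.

```python
import math

def potential_A(x, y):
    """Illustrative 1-form A = A_0 dx + A_1 dy (an arbitrary choice)."""
    return (x * y, x * x - y)

def potential_B(x, y):
    """Gauge-transformed potential B = A + dPsi with Psi(x, y) = sin(x) cos(y)."""
    a0, a1 = potential_A(x, y)
    dpsi_dx = math.cos(x) * math.cos(y)
    dpsi_dy = -math.sin(x) * math.sin(y)
    return (a0 + dpsi_dx, a1 + dpsi_dy)

def field_strength(potential, x, y, h=1e-4):
    """Central-difference estimate of F_01 = d_x A_1 - d_y A_0."""
    dx_a1 = (potential(x + h, y)[1] - potential(x - h, y)[1]) / (2 * h)
    dy_a0 = (potential(x, y + h)[0] - potential(x, y - h)[0]) / (2 * h)
    return dx_a1 - dy_a0

point = (0.7, -0.3)
F_A = field_strength(potential_A, *point)
F_B = field_strength(potential_B, *point)
print(F_A, F_B)
```

For this choice of $A$ one has $F_{01} = 2x - x = x$, so both values are close to 0.7; the contribution of $d\Psi$ cancels because $d(d\Psi) = 0$.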
We would also like to remark that since the group $\mathcal{G}(P)$ of gauge transformations acts transitively on the solution space $\mathcal{A}(P)$ of gauge connections of equations (6.7.10), the moduli space $\mathcal{A}/\mathcal{G}$ consists of a single point, a fact which is of fundamental importance in the path integral approach to QED (see Chap. 9). From the above discussions it should be clear that the definition of a Maxwell field can be extended to any $U(1)$-bundle on an arbitrary base manifold with a pseudo-Riemannian metric. In order to finally establish Result (6.7.1), we therefore consider a compact, simply-connected, oriented Riemannian 4-manifold $(M, g)$ with volume form $v_g$$^{27}$ and define the Maxwell field as follows:
Definition 6.7.3: A connection $\omega$ on $P(M, U(1))$ is called a Maxwell connection or a potential if it minimizes the Maxwell action $A_M(\omega)$ defined as:
AM(co)= -L.ljFj2xdvg
(xeM)
(6.7.11)
The corresponding Euler-Lagrange equations are:
$dF_\omega = 0, \qquad \delta F_\omega = 0$
(6.7.12)
A solution of these equations is called a Maxwell field or a source-free electromagnetic field on $M$. Since the equations $dF_\omega = 0$, $\delta F_\omega = 0$ taken together imply that $F_\omega$ is harmonic, we have partially proved Result 6.7.1: the curvature 2-form of a Maxwell connection defined on $P(M, G)$ is harmonic. To show the uniqueness of this connection and to establish the fact that it represents the Euler class or the first Chern class $c_1(P)$, we have to use homotopy, homology and Hodge theory. We refer the interested reader to [2] and [20].
27. The theory based on this choice of $(M, g)$ is called the Euclidean version of Maxwell's theory.
7.2 A Generalized Yang-Mills Field
Proof of Result 6.7.2: In Sec. 6 (see (6.6.1)) we have already seen that if $\omega$ is a gauge connection on $P \equiv P(M, G)$ ($M$ a connected manifold), then in a local gauge $t \in \Gamma(U, P)$ the local gauge potential is $A_t = t^*(\omega)$, which belongs to $\Lambda^1(U, \mathrm{ad}P)$. Also the action on this potential of a gauge transformation $g \in \mathcal{G}(P)$, which locally becomes a $G$-valued function $g_t$ on $U$, is given by:
$g_t \cdot A_t = (\mathrm{ad}\, g_t^{-1}) \circ A_t + g_t^* \theta = g_t^{-1} A_t g_t + g_t^* \theta$
where $\theta$ is the canonical 1-form on $G$.$^{28}$ Recall that in (6.5.5) we wrote it as:
$A^g = g^{-1} A g + g^{-1} dg$
In Sec. (2.5) we also saw that the curvature $\Omega$ of the (gauge) connection $\omega$ defines the unique 2-form $F_\omega = s_\Omega$ on $M$ with values in the bundle $\mathrm{ad}P$. In a local gauge $t$, $F_\omega$ can be written as:
$F_\omega = dA_t + \frac{1}{2}[A_t, A_t]$
(6.7.13)
Note that the Lie bracket in the above equation stands for the bracket of bundle-valued forms. We assume that M is a compact connected oriented Riemannian manifold and G is a compact semi-simple Lie group, then we can write the Yang-Mills action (or functional) AYM as follows: A
YM («) = —^T jM\Fa>\2x
dv
s
for w e
-^( p ) and x e M
(6.7.14)
In order to write the Euler-Lagrange equation by minimizing the above action, we note that the space $\mathcal{A}(P)$ is an affine space and therefore variations can be taken along the straight lines (through $\omega$) of the form:
$\omega_t = \omega + tA$
(6.7.15)
where $A \in \Lambda^1(M, \mathrm{ad}P)$ and $t$ is the variational parameter. The gauge field corresponding to $\omega_t$ can be seen to satisfy:
$F_{\omega_t} = F_\omega + t\, d^\omega A + t^2 (A \wedge A)$
(6.7.16)
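The quadratic dependence on $t$ in (6.7.16) can be checked in a toy setting. The following is a sketch under strong simplifying assumptions that are not in the text: the forms live on a two-dimensional patch, have constant $2 \times 2$ matrix coefficients (so that the ordinary exterior derivative vanishes and the curvature reduces to $\omega \wedge \omega$), and $d^\omega A$ is taken to be $\omega \wedge A + A \wedge \omega$, following the convention $\Omega = d\omega + \omega \wedge \omega$ used in the hints to Exercise 6.7. All names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def wedge(a, b):
    """dx^dy component of a ^ b for matrix-valued 1-forms a = (a_x, a_y), b = (b_x, b_y)."""
    return sub(matmul(a[0], b[1]), matmul(a[1], b[0]))

def curvature(w):
    """Omega = dw + w ^ w; here dw = 0 because the coefficients are constant."""
    return wedge(w, w)

def covariant_d(w, a):
    """d^w A = dA + w ^ A + A ^ w, again with dA = 0."""
    return add(wedge(w, a), wedge(a, w))

def shifted(w, a, t):
    """The line w_t = w + tA of (6.7.15)."""
    return (add(w[0], scale(t, a[0])), add(w[1], scale(t, a[1])))

def expansion(w, a, t):
    """Right-hand side of (6.7.16): F_w + t d^w A + t^2 (A ^ A)."""
    return add(add(curvature(w), scale(t, covariant_d(w, a))),
               scale(t * t, wedge(a, a)))

omega = ([[0, 1], [-1, 0]], [[1, 2], [0, -1]])
A = ([[2, 0], [1, 1]], [[0, -1], [3, 0]])

print(curvature(shifted(omega, A, 3)) == expansion(omega, A, 3))
```

With integer entries the comparison is exact, so the printed value reflects the algebraic identity itself rather than a numerical approximation.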
We substitute the value of $F_{\omega_t}$ in (6.7.14) and differentiate the action using (6.7.15) to write:
$\frac{d}{dt} A_{YM}(\omega_t)\Big|_{t=0} = \frac{d}{dt}\left( \int_M |F_\omega + t\, d^\omega A + t^2 (A \wedge A)|_x^2 \, dv_g \right)\Big|_{t=0}$
(6.7.17)
Writing the RHS as an inner product and differentiating the terms with respect to $t$ at $t = 0$, we note that:
$\frac{d}{dt} A_{YM}(\omega_t)\Big|_{t=0} = 2\int_M (F_\omega, d^\omega A)\, dv_g = 2\int_M (\delta^\omega F_\omega, A)\, dv_g.$
The above equality in variational form can be written as (see Sec. 2.5 for $\langle\langle\ ,\ \rangle\rangle$):
$\delta A_{YM}(\omega)(A) = 2\langle\langle \delta^\omega F_\omega, A \rangle\rangle$
(6.7.18)
28. See Subsec. (2.5.5).
A gauge connection $\omega$ is called a critical point of the Yang-Mills functional if:
$\delta A_{YM}(\omega)(A) = 0$ for every $A \in \Lambda^1(M, \mathrm{ad}P)$
(6.7.19)
The critical points of the Yang-Mills functional are solutions of the corresponding Euler-Lagrange equations:
$\delta^\omega F_\omega = 0$
(6.7.20)
The equation (6.7.20) is called the pure (source-free) Yang-Mills equation on the manifold $M$. The gauge connection $\omega$ is called the Yang-Mills connection and its gauge field $F_\omega$ is called the Yang-Mills field. Since (using local coordinates [see Chapter 1]) $d^\omega$ and $\delta^\omega$ are related through the Hodge star operator as:
$d^\omega = \pm * \delta^\omega *$
(6.7.21)(a)
it follows that the Yang-Mills equation (6.7.20) is satisfied if and only if:
$d^\omega {*F_\omega} = 0$
(6.7.21)(b)
Through our above discussions we have already proved the equivalence of the first three statements. To show the equivalence of (iv) with the other three, we prove that (ii) $\iff$ (iv). Recall that in view of our earlier results (see also Exc. 2) we can write the identity in (iv) as$^{29}$
$\langle\langle \Delta^\omega F_\omega, F_\omega \rangle\rangle = \| d^\omega F_\omega \|^2 + \| \delta^\omega F_\omega \|^2.$
(6.7.22)(a)
When we use Bianchi's identity:
$d^\omega F_\omega = 0$
(6.7.22)(b)
in the above equation the equivalence (ii) $\iff$ (iv) becomes obvious (note that Eq. (6.7.22)(b) suggests that locally $F_\omega$ is always derived from a potential). The pair of equations $\delta^\omega F_\omega = 0$ and $d^\omega F_\omega = 0$, or equivalently $d^\omega {*F_\omega} = 0$ and $d^\omega F_\omega = 0$, are called the Yang-Mills equations. It is easy to verify that when $M$ is the Minkowski space and $G$ is $U(1)$ they reduce to Maxwell's equations. Using a local orthonormal coordinate system, it can be shown that these equations are a system of non-linear, second order, partial differential equations for the components of the gauge potential $A$ (see Exc. 3).
29. See Eqs. (2.5.47)-(2.5.49).
30. The reader will appreciate this Exp. better after Chapters 7 and 9.
Example 6.7.4:$^{30}$ Here we give an example of a model which has been constructed (see S. Chadha and H.B. Nielsen in [11] or [3]) to illustrate the assertion that certain symmetries are not really fundamental (as was believed before the advent of grand unified theories) but arise because of our low energy world. They are thus expected to be broken at high energies. The model is gauge invariant and renormalizable but is not initially Lorentz invariant. The couplings in the Lagrangian of this model are functions of the energy scale, so any change in this scale affects the physical system that the Lagrangian represents. More precisely, as the energy scale is lowered, it is shown that the model tends to acquire Lorentz invariance with perceivable accuracy, which supports the premise that 'symmetry is not fundamental.' We give below this Lagrangian using the same notation as the paper referred to above. Again we explain only its main features and ask the interested reader to look for details in the original paper cited above and in similar extended works of Nielsen and Ninomiya in [23] and Forster, et al., in [10]. See also Chapter 6, in particular Subsec. 6.2.2, in [11]. Consider the action $W$ with the Lagrangian consisting of three terms:
$W_{\text{non-int}}(A_\mu, \psi) = \int (dx)\, [\, j^\mu A_\mu + \eta \gamma^0 \psi + \mathcal{L}_{\text{non-int}}(A_\mu, \psi) \,]$
(6.7.23)
The subscript non-int stands for non-interacting, to imply that it represents a non-interacting action between photon and electron fields. The term $\mathcal{L}_{\text{non-int}}$ written out in full stands for:
-}*7V[^ l ± i ^ + ^ i z | l 5 - | l ^
(6 . 7 . 2 4)
The various terms involved in (6.7.23) and (6.7.24) are explained as follows: $j^\mu$ and $\eta$ are the photon and electron sources respectively, with $A_\mu$ and $\psi$ as their corresponding fields, and $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the field strength tensor. The vierbeins $e_{+}{}^{\mu}{}_{a}$ and $e_{-}{}^{\mu}{}_{a}$ are taken differently for the two different helicities of the fermion. The $\gamma$-matrices are in the Majorana basis (see Chapter 7); thus they are all imaginary, with $\gamma^0$ being antisymmetric and the rest being symmetric.$^{31}$ The matrix $\gamma^5 = \gamma^0 \gamma^1 \gamma^2 \gamma^3$ can easily be seen to be real. The matrix $q$ stands for the charge. The $\eta$ and $\psi$ are real eight component spinors that anticommute. As mentioned in the introduction, the (final) action of the model is gauge-invariant and renormalizable. This means that the (primitive) interaction term is obtained from the free Lagrange function by replacing every derivative by a covariant derivative $D_\mu$ (see Sec. 4), and due to renormalizability the Lagrange function contains only those terms whose mass dimension is $\le 4$. Although the model (to begin with) is non-invariant under Lorentz symmetry, it takes into account the chiral invariance, and accordingly uses a massless electron. Also, to preserve the usual translational invariance of the action it is further assumed that the coupling constants $\eta^{\mu\nu\rho\sigma}$ and the vierbeins $e_{+}{}^{\mu}{}_{a}$, $e_{-}{}^{\mu}{}_{a}$ are space-time independent. They are however dependent on a variable $\Lambda$ (called the energy scale). The model is constructed on a four dimensional space $(x^0, x^1, x^2, x^3)$ with no a priori assumption on the metric. But the stationarity of the non-interacting action $W_{\text{non-int}}$, which gives the free field equations, ultimately leads to the choice of a Minkowskian metric. The action $W$ with interaction between photons and charged particles together with the required gauge invariance is now obtained by replacing every derivative in (6.7.23) by a covariant derivative $D_\mu$ (see (6.4.6)), in other words using the correspondence:
$\partial_\mu \to \partial_\mu - ieqA_\mu$
(6.7.25)
This leads to:
$W = \int (dx)\, [\, j^\mu A_\mu + \eta \gamma^0 \psi + \mathcal{L}(A_\mu, \psi) \,].$
(6.7.26)
31. The $\gamma$-matrices in the Majorana basis can be written in block form in terms of the Pauli matrices $\tau^1, \tau^2, \tau^3$. One can find more than one such representation for the $\gamma$-matrices, related to each other by similarity transformations (see for instance 7.[19] and Exc. (7.3.6)).
It can be easily checked that: MAp 40 = £ non . int (AM, V) + / A^
(6.7.27)
with./M given by:
f(x) = j^WrV "?[<, ^f^
+ e^^^ix)
(6.7.28)
Using several mathematical arguments, it is then shown that the non-covariant model looks more covariant (implying Lorentz invariance) once the energy scale $\Lambda$ at which the model is viewed becomes very small. We would like to emphasize that the above example, given without mathematical explanations, is included in the text mainly to introduce an inquisitive mind to the mysterious characteristics of symmetry and symmetry breaking, and to motivate the reader towards further study.
Example 6.7.5: The four dimensional rotation group $O(4)$. The $O(4)$ symmetry is one of those symmetries which is important at the macroscopic as well as at the microscopic level.* At the macroscopic level it is exhibited by planetary motion via the Coulomb-like $\frac{1}{r}$ potential (see Ftn. 2); for example, the Keplerian laws become very simple in view of the $O(4)$ symmetry. At the microscopic level one encounters it in Bohr's atomic model, more specifically when one writes the Schrödinger equation for the electron in a hydrogen atom, which has the $O(4)$ symmetry. (On a historical note, both Kepler and Bohr were unaware of this symmetry at the time of their fundamental work.) The $O(4)$ symmetry of the hydrogen atom was discovered by Fock (see reprinted paper in [11]), who wrote the Schrödinger equation in momentum representation. He then identified the points of the momentum space with those of a unit sphere $S^3$ in $\mathbb{R}^4$ through stereographic projection. Through this process he was able to transform the Schrödinger equation (for the hydrogen atom) in such a manner that the Hamiltonian could be expressed as a convolution with the function:
$\dfrac{\text{constant}}{\sin^2 \frac{\omega}{2}}$
(6.7.29)
$\omega$ being the distance along a great circle on $S^3$. The transformed Hamiltonian (for the hydrogen atom) was invariant under rotations, in the 4-space, of this sphere $S^3$ around its centre. We give below the relevant equations of Fock's paper to show this $O(4)$ symmetry (the notation followed is that of the above reference). The Schrödinger equation in the momentum space (see Chapter 9) is given as:
(6.7.30)
where $(dp') = dp'_x\, dp'_y\, dp'_z$ is the volume element, $p^2 = p_x^2 + p_y^2 + p_z^2$ and $Z$ represents the hydrogen atom. Writing $p_0 = \sqrt{-2mE}$, the coordinate transformation given below is used to express (6.7.30) in terms of spherical coordinates on $S^3$ (see Exp. (0.5.1)), thus
$\xi = 2p_0 p_x / (p_0^2 + p^2) = \sin\alpha \sin\theta \cos\phi$
$\eta = 2p_0 p_y / (p_0^2 + p^2) = \sin\alpha \sin\theta \sin\phi$
$\zeta = 2p_0 p_z / (p_0^2 + p^2) = \sin\alpha \cos\theta$
$\chi = (p_0^2 - p^2) / (p_0^2 + p^2) = \cos\alpha.$
(6.7.31)
* Note that the words macroscopic and microscopic are used here because in one of the problems (involving $O(4)$ symmetry) the distance is very large and in the other it is minuscule.
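The coordinate transformation (6.7.31) can be checked numerically: every momentum vector should land on the unit sphere $S^3$, and the map should be invertible away from the projection pole $\chi = -1$. The following is a minimal sketch; the function names and the sample values of $p$ and $p_0$ are illustrative assumptions, and the inverse formula $p = p_0(\xi, \eta, \zeta)/(1 + \chi)$ is derived here from (6.7.31) rather than quoted from the text.

```python
def fock_point(p, p0):
    """Map a momentum vector p = (px, py, pz) to (xi, eta, zeta, chi) as in (6.7.31)."""
    px, py, pz = p
    p_sq = px * px + py * py + pz * pz
    denom = p0 * p0 + p_sq
    return (2 * p0 * px / denom,
            2 * p0 * py / denom,
            2 * p0 * pz / denom,
            (p0 * p0 - p_sq) / denom)

def inverse_fock_point(point, p0):
    """Recover p from a point of S^3 with chi != -1, using 1 + chi = 2 p0^2 / (p0^2 + p^2)."""
    xi, eta, zeta, chi = point
    return tuple(p0 * c / (1 + chi) for c in (xi, eta, zeta))

p = (0.3, -1.2, 2.5)
p0 = 1.7
point = fock_point(p, p0)
radius_sq = sum(c * c for c in point)
p_back = inverse_fock_point(point, p0)
print(radius_sq, p_back)
```

The squared radius equals 1 because $4p_0^2 p^2 + (p_0^2 - p^2)^2 = (p_0^2 + p^2)^2$, which is the algebraic identity behind the stereographic projection.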
Using $d\Omega = \sin^2\alpha \sin\theta \, d\alpha\, d\theta\, d\phi$, it follows that the volume element $(dp) = dp_x\, dp_y\, dp_z$ satisfies:
$(dp) = \frac{1}{8p_0^3} (p_0^2 + p^2)^3 \, d\Omega.$
(6.7.32)
Set
$\lambda = \frac{Zme^2}{\hbar p_0} = \frac{Zme^2}{\hbar\sqrt{-2mE}}$
(6.7.33)
and use the new wave function in terms of $(\alpha, \theta, \phi)$ given by the relation
$\Psi(\alpha, \theta, \phi) = \frac{\pi}{\sqrt{8}}\, p_0^{-5/2} (p_0^2 + p^2)^2\, \psi(p)$
(6.7.34)
to write the Schrödinger equation (6.7.30) in the form that we were looking for:
(6.7.35)
The denominator $4\sin^2\frac{\omega}{2}$ in (6.7.35), as mentioned above, represents the square of the distance between two points on the sphere whose spherical coordinates are $(\alpha, \theta, \phi)$ and $(\alpha', \theta', \phi')$:
$4\sin^2\frac{\omega}{2} = (\xi - \xi')^2 + (\eta - \eta')^2 + (\zeta - \zeta')^2 + (\chi - \chi')^2$
(6.7.36)
The stereographic projection used for the problem is reproduced below.
[Figure: The stereographic projection for Eq. (6.7.31) with one suppressed dimension.]
So far in this chapter we have studied only those symmetries that we observe in nature, and have obtained their so-called 'symmetry' groups (very often called the gauge groups). In the following example we shall deal with a symmetry which does not originate from the laws of nature, instead it helps in studying them (see Note (6.7.7)). Due to this distinctive nature, we would call this symmetry a 'manmade symmetry.' The symmetry (i.e., the symmetry group) we have in mind pertains to the Laplace operator. Recall that in Sec. (3.5) we defined the 'full symmetry group' of an operator L as the group G that is formed by all operators that comute with L and are invertible (Def. (3.5.2)). We shall see that for the Laplace operator A (which we denote as L) whose domain is 5 3 , the group G is SO(3). " d2 Consider the Laplace operator L s V — sdS+ Sd. Then from a standard result ,=i (dx ) of Riemannian geometry, it is known that if M is a Riemannian manifold and 0is an isometry of M, then
Example 6.7.8: The Lie algebra $so(3)$ is indeed the Lie algebra obtained from the commutator algebra $C_L$ of the Laplace operator $L$ (see Def. (3.5.1)). To see this we note that elements of $SO(3)$ can be written as:
$$R_1(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}, \quad R_2(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}, \quad R_3(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{6.7.37}$$
where $R_i(\theta)$ ($i = 1, 2, 3$) denotes the rotation around a coordinate axis of a cartesian reference frame. Consider an infinitesimal rotation of the form:
$$R(\theta) \simeq I + \epsilon^a X_a \tag{6.7.38}$$
where the $\epsilon^a$'s are infinitesimal parameters, and the $X_a$'s are the infinitesimal generators of $SO(3)$. The $X_a$'s are obtained from:
$$X_i = \lim_{\theta \to 0} \frac{d}{d\theta} R_i(\theta) \qquad (i = 1, 2, 3) \tag{6.7.39}$$
276 Mathematical Perspectives on Theoretical Physics
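i.e., from
$$X_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{6.7.40}$$
These $X_i$'s form a basis for $so(3)$ ($[X_\alpha, X_\beta] = \epsilon_{\alpha\beta\gamma} X_\gamma$). They satisfy the requirements of $C_L$, since the $X_i$'s are operators that evidently commute with $L$.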
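The commutation relations quoted for the generators can be checked mechanically. The following is a minimal sketch in plain Python, assuming the three matrices of (6.7.40) are the derivatives at $\theta = 0$ of the rotations in (6.7.37); the helper names (`matmul`, `commutator`, `levi_civita`, `structure_constants_hold`) are illustrative and not part of the text.

```python
from itertools import product

# Generators of so(3), taken as d/d(theta) of the rotation matrices at theta = 0.
X1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
X2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
X3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
GENERATORS = [X1, X2, X3]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Ordinary commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def structure_constants_hold():
    """Check [X_a, X_b] = sum_c epsilon_{abc} X_c for every pair (a, b)."""
    for a, b in product(range(3), repeat=2):
        expected = [[sum(levi_civita(a, b, c) * GENERATORS[c][i][j]
                         for c in range(3))
                     for j in range(3)] for i in range(3)]
        if commutator(GENERATORS[a], GENERATORS[b]) != expected:
            return False
    return True
```

Calling `structure_constants_hold()` runs over all nine index pairs, so it also covers the antisymmetry of the bracket and the vanishing of each $[X_a, X_a]$.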
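Exercise 6.7
1. Obtain the value of $F_{\omega_t}$ as given in (6.7.16).
2. Establish the equality (6.7.22)(a).
3. Obtain a local expression for Yang-Mills equations in an arbitrary orthonormal coordinate system.

Hints to Exercise 6.7
1. Note that $\omega_t = \omega + tA$ is a line in the affine space of gauge potentials. The element $A$ here belongs to $\Lambda^1(M, \mathrm{ad}\,P)$, i.e., $A$ is a 1-form which takes values in the bundle $P \times_{\mathrm{ad}} g$. Evidently $A$ is variable in the above equation. Now we know that the curvature form $\Omega$ corresponding to $\omega$ satisfies:
(i) $\Omega = d\omega + \omega \wedge \omega$
and $F_\omega$ is the 2-form that corresponds to this. The curvature 2-form $\Omega_t$ derived from $\omega_t$ (using (i)) would be:
$$\Omega_t = d\omega_t + \omega_t \wedge \omega_t = d(\omega + tA) + (\omega + tA) \wedge (\omega + tA)$$
$$= d\omega + t\,dA + (\omega \wedge \omega) + t(A \wedge \omega + \omega \wedge A) + t^2 A \wedge A$$
$$= (d\omega + \omega \wedge \omega) + t\,d^{\omega} A + t^2 (A \wedge A).$$
Hence the gauge field $F_{\omega_t}$ corresponding to $\Omega_t$ can be written as:
$$F_{\omega_t} = F_\omega + t\,d^{\omega} A + t^2 (A \wedge A)$$
(see Sec. 2.5 for the correspondence between $\Omega$ and $F_\omega$).
2. Recall that on a compact oriented manifold (the type we are considering here) the Hodge-de Rham operator $\Delta = d\delta + \delta d$ maps $\Lambda^k(M)$ to $\Lambda^k(M)$, and for $\beta \in \Lambda^{k+1}(M)$ we can write:
(i) $(d\alpha, \beta) = (\alpha, \delta\beta)$ for any $\alpha \in \Lambda^k(M)$.
Now both these properties can be extended to bundle-valued forms also. Thus writing $\Delta^{\omega}$ as $d^{\omega}\delta^{\omega} + \delta^{\omega} d^{\omega}$ in the LHS of equality (6.7.22)(a) we have:
(ii) $\langle\langle \Delta^{\omega} F_\omega, F_\omega \rangle\rangle = \langle\langle d^{\omega}\delta^{\omega} F_\omega, F_\omega \rangle\rangle + \langle\langle \delta^{\omega} d^{\omega} F_\omega, F_\omega \rangle\rangle = \langle\langle \delta^{\omega} F_\omega, \delta^{\omega} F_\omega \rangle\rangle + \langle\langle d^{\omega} F_\omega, d^{\omega} F_\omega \rangle\rangle = \|\delta^{\omega} F_\omega\|^2 + \|d^{\omega} F_\omega\|^2.$
3. Use the steps suggested in the Hint to Exc. 3 of Sec. 6.4.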
Table 6.1 Some Symmetries and Conservation Laws

Nonobservable | Symmetry Transformation | Conservation Law or Selection Rule
Difference between identical particles | Permutation | Bose-Einstein or Fermi-Dirac statistics
Absolute position | Space translation $x \to x + \delta x$ | Momentum
Absolute time | Time translation $t \to t + \delta t$ | Energy
Absolute direction | Rotation $\theta \to \theta'$ | Angular momentum
Absolute velocity | Lorentz transformation | Generators of the Lorentz group
Absolute right or left | $x \to -x$ | Parity
Absolute sign of electrical charge | $e \to -e$ | Charge conjugation
Relative phase between states of different charge $Q$ | $\psi \to e^{iQ\theta}\psi$ | Charge
Relative phase between states of different baryon number $N$ | $\psi \to e^{iN\theta}\psi$ | Baryon number
Relative phase between states of different lepton number $L$ | $\psi \to e^{iL\theta}\psi$ | Lepton number
Difference between coherent mixtures of $p$ and $n$ states | $\binom{p}{n} \to U\binom{p}{n}$ | Isospin

Table adapted from Guidry (1991)
References
1. A. P. Balachandran, (a) Wess-Zumino Terms and Quantum Symmetries in 5.[7]; A. P. Balachandran, G. Marmo, B. S. Skagerstam, A. Stern, (b) Gauge Symmetries and Fiber Bundles (Springer-Verlag, 1983).
2. R. Bott and L. W. Tu, 1.[5].
3. S. Chadha and H. B. Nielsen, Lorentz invariance as a low energy phenomenon, Nucl. Phys. B217 (1983), 125.
4. T. P. Cheng and L. F. Li, 9.[6].
5. S. Coleman, Aspects of Symmetry (Cambridge University Press, 1985).
6. P. Dita, V. Georgescu, R. Purice (eds.), Gauge Theories: Fundamental Interactions and Rigorous Results (Birkhäuser, 1982).
7. W. Drechsler and M. E. Mayer, Fiber Bundle Techniques in Gauge Theories (Springer-Verlag, 1977).
8. T. Eguchi, et al., 10.[24].
9. E. Farhi and R. Jackiw (eds.), Dynamical Gauge Symmetry Breaking (World Scientific, New Jersey, 1982).
10. D. Förster, H. B. Nielsen and M. Ninomiya, Dynamical stability of local gauge symmetry: creation of light from chaos, Phys. Lett. 94B (1980), 135.
11. C. D. Froggatt and H. B. Nielsen (eds.), Origin of Symmetries (World Scientific, New Jersey, 1991).
12. M. K. Gaillard and R. Stora (eds.), Gauge Theories in High Energy Physics (Parts 1 and 2; North-Holland Publishing Co., 1983).
13. M. Gell-Mann and Y. Ne'eman (eds.), The Eightfold Way (W. A. Benjamin, New York, 1964).
14. V. N. Gribov, Quantization of non-abelian gauge theories, Nucl. Phys. B139 (1978), 1.
15. M. Gourdin, Unitary Symmetries (North-Holland Publishing Co., Amsterdam, 1967).
16. M. Guidry, Gauge Field Theories (John Wiley & Sons, Inc., New York, 1991).
17. P. W. Higgs, Spontaneous symmetry breakdown without massless bosons, Phys. Rev. 145 (1966), 1156.
18. R. Jackiw, R. Muzinich and C. Rebbi, Coulomb gauge description of large Yang-Mills field, Phys. Rev. D17 (1978), 1576.
19. D. B. Lichtenberg, Unitary Symmetry and Elementary Particles (Academic Press, New York, 1978).
20. K. B. Marathe and G. Martucci, 10.[35].
21. V. A. Miransky, Dynamical Symmetry Breaking in Quantum Field Theories (World Scientific, New Jersey, 1993).
22. R. N. Mohapatra, Unification and Supersymmetry (Springer-Verlag, 1992).
23. H. B. Nielsen and M. Ninomiya, β-function in a non-covariant Yang-Mills theory, Nucl. Phys. B141 (1978), 153.
24. R. D. Peccei and H. R. Quinn, Constraints imposed by CP conservation in the presence of pseudoparticles, Phys. Rev. D16 (1977), 1791.
25. I. M. Singer, Some remarks on the Gribov ambiguity, Comm. Math. Phys. 64 (1978), 7.
26. G. 't Hooft, Renormalization of massless Yang-Mills fields, Nucl. Phys. B33 (1971), 173.
27. B. Gruber and M. Ramek (eds.), Symmetries in Science X (Plenum Press, New York, 1998).
28. J. D. Jackson, Classical Electrodynamics (Wiley, 1975).
CHAPTER 7
ALL THAT IS SUPER - AN INTRODUCTION
While concluding the previous chapter on Symmetry, we made a brief reference to the concept of supersymmetry and to why it is considered an important object in the unification scheme of field theories. In this chapter we focus our attention on those aspects of mathematics and physics that are relevant to supersymmetry and related ideas, namely: the superalgebras, the supergroups, the superspace, the superfields, and so forth. All these concepts are linked to one another. For instance, the super-Poincaré group (the superextension of the Poincaré group) is the group of motions of a particular supermanifold, the flat superspace, whereas the infinitesimal generators of this group define a superalgebra. Moreover, knowledge of all these topics is required for understanding the supertheories: supergravity, super-Yang-Mills and superstrings. We therefore present here the basics of these topics, and refer the reader for an in-depth study to the works of experts in the field (see references [2, Vols. I, II, III], [10, Vols. I, II], [19], [20], [21]).
1 GRADED-ALGEBRAS

1.1 Superalgebras and Lie Superalgebras
The first ingredient in any supertheory is its superalgebra, more precisely the Lie superalgebra. We therefore begin by giving below the definitions of superalgebra and Lie superalgebra along with a few explanatory examples. As can be expected, these definitions stem from those of algebra and Lie algebra which we studied in Chapter 4 (see Def. (4.1.1) and (4.1.3)). We recall that an algebra $\mathcal{A}$ is a vector space over a field $\mathcal{F}$ of characteristic 0 (or prime $p$) on which a distributive binary operation that commutes with scalars is defined. The algebra $\mathcal{A}$ may or may not be associative. Similarly, a vector space $\mathcal{G}$ over $\mathcal{F}$ equipped with a bilinear, anti-commutative binary operation $[\ ,\ ]$ is a Lie algebra, which is non-associative and obeys the Jacobi identity:
$$[A,[B, C]] + [B,[C, A]] + [C,[A, B]] = 0 \tag{7.1.1}$$
In this chapter we shall be dealing mostly with graded vector spaces and graded algebras.

Definition 7.1.1: Let $\Gamma$ denote one of the rings $Z$ or $Z_2 = Z/2Z$ (the ring of integers modulo 2). A vector space $V$ over the field $\mathcal{F}$ is said to be $\Gamma$-graded if it admits a family $(V_\gamma)_{\gamma \in \Gamma}$ of subspaces such that:
$$V = \bigoplus_{\gamma \in \Gamma} V_\gamma \tag{7.1.2}$$
An algebra $A$ is said to be $\Gamma$-graded if the underlying vector space is $\Gamma$-graded, i.e.,
$$A = \bigoplus_{\gamma \in \Gamma} A_\gamma \tag{7.1.3}$$
and in addition
$$A_\alpha A_\beta \subset A_{\alpha+\beta} \quad \text{for all } \alpha, \beta \in \Gamma \tag{7.1.4}$$
When $\Gamma$ is $Z_2$ we denote the subspaces in (7.1.2) (and (7.1.3)) simply as $V_0$ and $V_1$ ($A_0$ and $A_1$), and call their elements even and odd respectively.

Definition 7.1.2: A $Z_2$-graded algebra is called a superalgebra*. We denote it as $L = L_0 + L_1$. Let the multiplication in $L$ be denoted by $(\ ,\ )$; this implies in particular that
$$(L_\alpha, L_\beta) \subset L_{\alpha+\beta} \quad \text{for all } \alpha, \beta \in Z_2 \tag{7.1.5}$$
The algebra $L$ is called a Lie superalgebra if the multiplication satisfies the following identities:
$$(a, b) = -(-1)^{\alpha\beta} (b, a) \tag{7.1.6}$$
$$(-1)^{\alpha\gamma}(a,(b, c)) + (-1)^{\alpha\beta}(b,(c, a)) + (-1)^{\beta\gamma}(c,(a, b)) = 0 \tag{7.1.7}$$
for all $a \in L_\alpha$, $b \in L_\beta$, $c \in L_\gamma$ and $\alpha, \beta, \gamma \in Z_2$. (Note that (7.1.6) is the graded skew-symmetry and (7.1.7) is the graded Jacobi identity.) In keeping with physics tradition, we shall denote the Lie superalgebra as $S$ and call the elements with even grades and odd grades Bose and Fermi respectively. We shall often use the letters $B$ and $F$ to denote them. Written out in full, the axioms of $S$ over the field $\mathbb{C}$ can now be put down as follows: (i) $S$ is a $Z_2$-graded vector space over $\mathbb{C}$;
$$(-1)^{ce}\,[[C, [[D, E]]]] + (-1)^{cd}\,[[D, [[E, C]]]] + (-1)^{de}\,[[E, [[C, D]]]] = 0 \tag{7.1.8}$$
where $c$, $d$, $e$ denote the grades of $C$, $D$, $E$.
Apparently this reduces to the familiar Jacobi identity, except in the case when any two of the elements C, D, E are Fermi and the third one is Bose. Remark 7.1.3: We remind the reader that Lie superalgebras are frequently called Z2-graded Lie algebras although in general they are not Lie algebras.
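The graded Jacobi identity can be tested numerically on matrices. The sketch below assumes the sign convention $(-1)^{ce}[[C,[[D,E]]]] + (-1)^{cd}[[D,[[E,C]]]] + (-1)^{de}[[E,[[C,D]]]] = 0$, which is the form read off from cases (i) and (v) of the Hints to Exercise 7.1, and it uses $3 \times 3$ integer matrices with one Bose and two Fermi directions as sample elements. The sample matrices and all function names are illustrative choices, not taken from the text.

```python
from itertools import product


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(x, y):
    """Graded bracket [[x, y]] = xy - (-1)^(|x||y|) yx on (matrix, grade) pairs."""
    (a, grade_a), (b, grade_b) = x, y
    sign = -1 if grade_a * grade_b else 1  # -1 only when both elements are Fermi
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    value = [[ab[i][j] - sign * ba[i][j] for j in range(n)] for i in range(n)]
    return value, (grade_a + grade_b) % 2


def graded_jacobi(C, D, E):
    """Left-hand side of the graded Jacobi identity for homogeneous C, D, E."""
    c, d, e = C[1], D[1], E[1]
    terms = [
        ((-1) ** (c * e), bracket(C, bracket(D, E))[0]),
        ((-1) ** (c * d), bracket(D, bracket(E, C))[0]),
        ((-1) ** (d * e), bracket(E, bracket(C, D))[0]),
    ]
    n = len(C[0])
    return [[sum(s * t[i][j] for s, t in terms) for j in range(n)]
            for i in range(n)]


# Sample homogeneous elements: block-diagonal (grade 0) and block off-diagonal
# (grade 1) matrices for a 1 + 2 split of the three basis directions.
SAMPLES = [
    ([[2, 0, 0], [0, 1, 3], [0, -1, 4]], 0),
    ([[-1, 0, 0], [0, 0, 2], [0, 5, 1]], 0),
    ([[0, 1, 2], [3, 0, 0], [-1, 0, 0]], 1),
    ([[0, -2, 1], [4, 0, 0], [2, 0, 0]], 1),
    ([[0, 5, -1], [1, 0, 0], [3, 0, 0]], 1),
]
ZERO = [[0] * 3 for _ in range(3)]


def jacobi_holds_on_samples():
    """True if the identity vanishes for every ordered triple of samples."""
    return all(graded_jacobi(C, D, E) == ZERO
               for C, D, E in product(SAMPLES, repeat=3))
```

Because the triples are ordered, the check covers all Bose/Fermi patterns: three Bose, two Bose and one Fermi, one Bose and two Fermi, and three Fermi elements.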
* Note that an ordinary Lie algebra can also be graded (see [22] for the use of graded Lie algebras in the case of boson-fermion models).
1 Note that the adjectives before these brackets can be confusing, since for arbitrary elements $a, b \in L$ we have $[a, b] = -[b, a]$ and $\{a, b\} = \{b, a\}$; however, when these are equated to zero, the terminology makes sense, since $[a, b] = ab - ba$ and $\{a, b\} = ab + ba$.
2 We have used the double barred bracket $[[\ ]]$ to stand for $[\ ]$ as well as $\{\ \}$. This will reduce to one of these once the choice of pairs consisting of elements $B$ and $F$ is made.
Remark 7.1.4: Let $S = S_0 + S_1$ be a Lie superalgebra; then the subalgebra $S_0$ is a Lie algebra. Any ideal of $S$ is a $Z_2$-ideal.

Definition 7.1.5: A Lie superalgebra is simple if it has no non-trivial ideals. The simple Lie superalgebras over $\mathbb{C}$ in finite dimensions are fully classified (see for instance Kac (1975, 1977) [12a], [12b]; Kaplansky (1980), [13]).
1.2 Other Important Superalgebras and Bose and Fermi Sectors

Example 7.1.6: The general and special linear superalgebra: Let $V(l|m)$ denote a $Z_2$-graded vector space with an $l$-dimensional Bose subspace and an $m$-dimensional Fermi subspace. An arbitrary vector in $V(l|m)$ is a column vector whose upper $l$ entries are $B$'s and whose lower $m$ entries are $F$'s. A vector whose upper $l$ (lower $m$) entries contain $B$'s ($F$'s) only, with zero elsewhere, is called a Bose (Fermi) vector. The set of $(l + m) \times (l + m)$ matrices formed by complex linear transformations of $V(l|m)$, together with the induced grading, defines the general linear superalgebra $gl(l|m)$. The bracket operation here (to be explained shortly) is commutative in some cases and anti-commutative in others. It can be easily checked that if the matrices in $gl(l|m)$ are block diagonal of the type:
$$\begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix} \tag{7.1.9}$$
(with diagonal blocks of sizes $l \times l$ and $m \times m$), they carry a Bose vector to a Bose vector and a Fermi vector to a Fermi vector. These are called Bose linear transformation matrices. A matrix which is block off-diagonal:
$$\begin{pmatrix} 0 & * \\ * & 0 \end{pmatrix} \tag{7.1.10}$$
is called a Fermi transformation matrix of $V(l|m)$. Denote these Bose and Fermi transformation matrices by $M_B$ and $M_F$. We note that the bracket (which defines the binary operation of the Lie superalgebra) for pairs of matrices $(M_B, M_B)$, $(M_B, M_F)$ is the (usual) commutator, but for the pair $(M_F, M_F)$ it is the anti-commutator. The Lie superalgebra $gl(l|m)$ is not simple. Also, while in the ordinary case the Lie bracket of two matrices is always traceless (in finite dimensions), in this case we have an anticommutator as well, which results from the $M_F$'s; and even though individual $M_F$'s are traceless, their anticommutators are not necessarily so.$^3$ This leads to the notion of supertrace, an algebraic sum which identically vanishes for the superalgebra bracketing of two elements.

Definition 7.1.7: Given a matrix
$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$
(with $A$ of size $l \times l$ and $D$ of size $m \times m$),
3 Note that the product $M_B M_B$ is an $M_B$, $M_B M_F$ is an $M_F$ and $M_F M_F$ is an $M_B$.
the supertrace of $M$ is defined as:
$$\mathrm{str}\, M = \mathrm{tr}\, A - \mathrm{tr}\, D \tag{7.1.11}$$
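A small numerical illustration of why the supertrace, rather than the trace, is the natural object here: for two Fermi (block off-diagonal) matrices the bracket is the anticommutator, whose ordinary trace need not vanish while its supertrace does. The sketch below uses two arbitrarily chosen integer matrices acting on $V(1|2)$; the matrices and the function names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    """{a, b} = ab + ba, the bracket of two Fermi matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def supertrace(m, l):
    """str M = tr A - tr D, where A is the upper-left l x l block of M."""
    upper = sum(m[i][i] for i in range(l))
    lower = sum(m[i][i] for i in range(l, len(m)))
    return upper - lower


# Two Fermi matrices for l = 1, m = 2 (only the off-diagonal blocks are nonzero).
F1 = [[0, 1, 2], [3, 0, 0], [-1, 0, 0]]
F2 = [[0, -2, 1], [4, 0, 0], [2, 0, 0]]

BRACKET = anticommutator(F1, F2)   # a Bose (block-diagonal) matrix
ORDINARY_TRACE = trace(BRACKET)    # nonzero for this pair
SUPER_TRACE = supertrace(BRACKET, 1)
```

For this pair the anticommutator is block diagonal, its ordinary trace is 2, and its supertrace is 0.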
It can be checked that for $l \neq m$ the supertraceless elements of $gl(l|m)$ form a simple $((l + m)^2 - 1)$-dimensional superalgebra, denoted $sl(l|m)$, under the bracket operation mentioned above. When $l = m$, the unit matrix is supertraceless; it commutes with every element, and as such defines the 1-dimensional center of $sl(m|m)$. We factor out this piece and denote $sl(m|m)$ without center as $psl(m|m)$. We note that $psl(m|m)$ is a simple algebra for $m \geq 2$, with $\dim(psl(m|m)) = 4m^2 - 2$. For $m = 1$ the algebra $psl(1|1)$ is nilpotent. In view of Remark (7.1.4) and the above discussion, it is evident that the Bose portion (sector) of $S$ behaves like an ordinary Lie algebra. For obvious reasons we shall denote this as ${}^{\circ}S$ (in place of $S_0$). Returning to $gl(l|m)$, we observe that elements of ${}^{\circ}gl(l|m)$ can be viewed as forming three different categories: (i) those that shuffle Bose to Bose; (ii) those that shuffle Fermi to Fermi; and (iii) those that shuffle Bose to Bose and Fermi to Fermi but in a correlated manner. Thus the algebra ${}^{\circ}gl(l|m)$ is not simple; similarly the algebra ${}^{\circ}sl(l|m)$ is not simple. These algebras can be expressed as:
$${}^{\circ}gl(l|m) = gl(l) + gl(m)$$
$${}^{\circ}sl(l|m)_{l \neq m} = sl(l) + sl(m) + \text{1-dimensional Abelian piece}$$
$${}^{\circ}psl(m|m) = sl(m) + sl(m) \tag{7.1.12}$$
The dimensions of these Bose parts are therefore $l^2 + m^2$, $(l^2 - 1) + (m^2 - 1) + 1$ and $(m^2 - 1) + (m^2 - 1)$. Accordingly the dimensions of the Fermi parts (sectors) of these algebras are $2lm$, $2lm$ and $2m^2$. We note that just as the general and special linear algebras can be generalized to their corresponding superalgebras, the algebras with orthogonal and symplectic structures can also be made into superalgebras. In the latter case, since one of the parts (say, for instance, Fermi) has to be symplectic, we have to choose $m$ even.

Example 7.1.8: Ortho-symplectic Superalgebra: The $Z_2$-graded vector space $V(l|m)$ ($m$ even) carries a bilinear form:
$$(X, Y) = X^T M Y, \qquad X, Y \in V(l|m) \tag{7.1.13}$$
where, in block form,
$$M = \begin{pmatrix} 1_l & 0 \\ 0 & C \end{pmatrix} \tag{7.1.14}$$
with $1_l$ the $l \times l$ unit matrix and $C$ an antisymmetric $m \times m$ matrix whose nonzero entries are $\pm 1$ (see (7.1.17)). This form is symmetric (antisymmetric) on the Bose (Fermi) sector of $V(l|m)$. To define the superstructure we now consider those complex linear transformations $U$ on $V(l|m)$ which satisfy a superantisymmetry condition:
$$(X, UY) + (-1)^{ux} (UX, Y) = 0 \tag{7.1.15}$$
where $u$ and $x$ are the grades of $U$ and $X$. The collection of all such $U$'s defines the ortho-symplectic superalgebra denoted $osp(l|m)$. Using arguments similar to the case of $gl(l|m)$ and $sl(l|m)$, we note that the Bose sector of $osp(l|m)$ is $o(l) + sp(m)$ and hence it has the dimension $\frac{1}{2} l(l - 1) + \frac{1}{2} m(m + 1)$. To determine the Fermi sector, we choose $U$ as Fermi (i.e., with off-diagonal blocks only) and the vectors $X$ and $Y$ as Bose and Fermi respectively, written $B$ and $F$. In view of (7.1.13) and the superantisymmetry condition (7.1.15) we now have:
$$B^T M U F + B^T U^T M F = 0 \tag{7.1.16}$$
This means that if
$$U = \begin{pmatrix} 0 & V \\ W & 0 \end{pmatrix}$$
(with $V$ of size $l \times m$ and $W$ of size $m \times l$), then $V$ and $W$ are related to each other by:
$$V = -W^T C \tag{7.1.17}$$
where $C$ is the antisymmetric $m \times m$ block of $M$ in (7.1.14). Thus the lower (upper) half of $U$ determines the upper (lower) half, and therefore there are $lm$ fermionic generators in this case and not $2lm$ as in the case of $sl(l|m)$. As a consequence the total dimension of $osp(l|m)$ is $\frac{1}{2}[(l^2 - l) + (m^2 + m) + 2lm] = \frac{1}{2}[(l + m)^2 + m - l]$. It is interesting to note that if we had chosen both (i.e., Bose as well as Fermi) parts of $M$ as symmetric or antisymmetric, we would have found the Fermi part (sector) to be empty; this would have led to the ordinary Lie algebra $o(l + m)$ or $sp(l + m)$. On the other hand, if we had chosen $l$ even and had imposed antisymmetry via $M$ over the Bose sector and symmetry over the Fermi sector*, then we would have obtained $osp(m|l)$. Thus while it is immaterial which of the sectors is assigned the symmetry or the antisymmetry condition, it is essential that exactly one of them is assigned the symmetry condition and the other the antisymmetry condition.

Example 7.1.9: By restricting to the set of matrices of the type:
$$M = \begin{pmatrix} A & B \\ C & -A^T \end{pmatrix} \in sl(n + 1|n + 1)$$
where $\mathrm{tr}\, A = 0$ and $B$ ($C$) is symmetric (antisymmetric), we obtain the subsuperalgebra of $sl(n + 1|n + 1)$ denoted $P(n)$, with dimension $2(n + 1)^2 - 1$. Likewise, if we choose the matrices of the type:
$$M = \begin{pmatrix} A & B \\ B & A \end{pmatrix} \in sl(n + 1|n + 1)$$
with the provision $\mathrm{tr}\, B = 0$ and factor out the centre corresponding to the unit matrix, we obtain another subsuperalgebra of $sl(n + 1|n + 1)$. It is $[2(n + 1)^2 - 2]$-dimensional and is denoted as $Q(n)$.

* This can be done by interchanging the diagonal blocks of the matrix $M$ shown in (7.1.14). Note that $l$ is necessarily even now.
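The dimension counts quoted in Examples 7.1.6 to 7.1.9 are easy to mis-add by hand, so a small bookkeeping sketch may help. It simply encodes the Bose and Fermi sector dimensions stated in the text; the function names and the dictionary layout are illustrative choices.

```python
def dims_gl(l, m):
    """Bose and Fermi sector dimensions of gl(l|m)."""
    return {"bose": l * l + m * m, "fermi": 2 * l * m}


def dims_sl(l, m):
    """Sector dimensions of sl(l|m); the text treats the case l != m."""
    if l == m:
        raise ValueError("use dims_psl for l == m")
    return {"bose": (l * l - 1) + (m * m - 1) + 1, "fermi": 2 * l * m}


def dims_psl(m):
    """Sector dimensions of psl(m|m), i.e. sl(m|m) with its center removed."""
    return {"bose": 2 * (m * m - 1), "fermi": 2 * m * m}


def dims_osp(l, m):
    """Sector dimensions of osp(l|m); m must be even (symplectic Fermi block)."""
    if m % 2:
        raise ValueError("m must be even")
    return {"bose": l * (l - 1) // 2 + m * (m + 1) // 2, "fermi": l * m}


def dim_P(n):
    """Total dimension of P(n) as quoted in Example 7.1.9."""
    return 2 * (n + 1) ** 2 - 1


def dim_Q(n):
    """Total dimension of Q(n) as quoted in Example 7.1.9."""
    return 2 * (n + 1) ** 2 - 2


def total(dims):
    return dims["bose"] + dims["fermi"]
```

As a usage example, `total(dims_osp(1, 2))` gives 5, in agreement with $\frac{1}{2}[(l + m)^2 + m - l]$ for $l = 1$, $m = 2$, and `total(dims_psl(2))` gives 14, in agreement with $4m^2 - 2$.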
Exercise 7.1
1. Establish the identity (7.1.8) in different cases.
2. In the matrix representation of the $Z_2$-graded $V(l|m)$, the bracket for pairs of Bose and Fermi matrices $(M_B, M_B)$, $(M_B, M_F)$ and $(M_F, M_F)$ is the usual commutator in the first two cases but is the anti-commutator in the third case. Explain the implications of the above statement.
3. Let $Q$ be a quadratic form on a vector space $V$ (over the field $F = \mathbb{R}$ or $\mathbb{C}$): $Q : x \in V \mapsto Q(x) \in F$, such that $Q$ defines a symmetric scalar product on $V$ given by:
(a) $Q(x, y) = xy + yx = Q(x + y) - Q(x) - Q(y)$.
Show that if $V$ is $r$-dimensional and has an orthonormal basis $e_\mu$ ($\mu = 1, \ldots, r$) with respect to $Q$, then
(b) $e_\mu e_\nu + e_\nu e_\mu = 2\delta_{\mu\nu}\, Q(e_\mu) \cdot 1$.
The associative algebra with a unit element, generated by the $e_\mu$'s with the defining relations (b), is the $2^r$-dimensional Clifford algebra $Cl(V, Q)$ with respect to the quadratic form $Q$. A general element of this algebra in this basis can be written as:
(c) $a^{(0)} + a^{\mu} e_\mu + a^{\mu_1\mu_2} e_{\mu_1} e_{\mu_2} + \cdots + a^{\mu_1 \cdots \mu_r} e_{\mu_1} \cdots e_{\mu_r}$
where the coefficients $a^{\mu_1\mu_2}$ etc. are antisymmetric in $\mu_1, \mu_2$ and belong to $\mathbb{R}$ or $\mathbb{C}$ according to our choice of the field $F$ (later denoted as $\mathcal{F}$).
Hints to Exercise 7.1
1. Let $C$, $D$, $E$ all be Bose elements; then their grades, denoted $c$, $d$, $e$, are all zero (mod 2). The brackets $[[\ ]]$ in this case are the familiar Lie brackets, hence the identity is the familiar Jacobi identity, which can be verified. When $C$, $D$, $E$ are all Fermi elements, $c$, $d$, $e$ are all 1 (mod 2), and the identity becomes:
(i) $-[C, \{D, E\}] - [D, \{E, C\}] - [E, \{C, D\}] = 0$, or
(ii) $[C, DE + ED] + [D, EC + CE] + [E, CD + DC] = 0$, or
(iii) $C(DE + ED) - (DE + ED)C + D(EC + CE) - (EC + CE)D + E(CD + DC) - (CD + DC)E = 0$, or
(iv) $(CDE + CED + DEC + DCE + ECD + EDC) - (DEC + EDC + ECD + CED + CDE + DCE) = 0$.
Let $C$ and $D$ be Fermi and $E$ be Bose; then we have
(v) $\{C, [D, E]\} - \{D, [E, C]\} + [E, \{C, D\}] = 0$.
On simplification, it can be easily checked that the terms cancel with each other.
2. The linear transformation matrices $M_B$ and $M_F$, which are block-diagonal and block off-diagonal, inherit the grading structure from $V(l|m)$: $M_B$ is even-graded and $M_F$ is odd-graded. Also, while $M_B$ carries a Bose (Fermi) vector to a Bose (Fermi) vector, $M_F$ carries a Bose vector to a Fermi vector and vice versa. The space of these matrices can be given an algebra structure through the bracket operation. Obviously these bracket operations will not be the same in each case, due to the different characteristics of $M_B$ and $M_F$. Using the bracket rule for even and odd graded elements, we note that elements of the first two pairs combine with each other using the commutator bracket $[\ ]$ and the elements of the third pair are combined by the anticommutator bracket $\{\ \}$. This bracket structure amongst the linear transformations of $V(l|m)$ defines in turn the Lie superalgebra $gl(l|m)$.
3. By definition a symmetric bilinear form gives rise to a quadratic form $Q : x \in V \to Q(x) \in F$. As a result we can write $Q(x + y) = x^2 + y^2 + xy + yx$ for $x, y \in V$. The definition of the symmetric scalar product on $V$ therefore gives: $Q(x, y) = xy + yx = (x^2 + y^2 + xy + yx) - x^2 - y^2$. Thus if $x$ and $y$ are replaced by basis elements that are mutually orthogonal we have:
(i) $e_\mu e_\nu + e_\nu e_\mu = 2\delta_{\mu\nu}\, Q(e_\mu)\, 1$.
The associative algebra with unit element 1 generated by $\{e_\mu\}$ is the Clifford algebra $C(Q)$ whose defining relation is given by (i) (see Hint to Exc. (4.1.3) for the rest).
2 THE SPINORS

In Sec. 1 we presented a few easy examples of superalgebras. These did not include the super-Poincaré algebra, which happens to be an important superalgebra in the real world. In order to introduce it we have to acquaint ourselves with the basics of spinors. It is worth mentioning that spinors were the outcome of the physicists' need for an object that helped define a Lorentz invariant linear first order differential operator (see Exc. 1 and 2). However, once they were conceptualized in physics, they were studied structurally from different angles by mathematicians (see [14]).$^4$ We give below the simplest form of their definition before presenting a sophisticated version.

2.1 The Definitions and Properties of Spinors
To begin with, we define the word spinor via the quantum mechanical path, by using the dynamical variable spin. Accordingly we represent two special states (called up and down):
$$u = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad d = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{7.2.1}$$
by column matrices $u$ and $d$, and the general spin state $\chi$ by setting:
$$\chi = c_1 u + c_2 d = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \tag{7.2.2}$$

4 In his famous equation Dirac replaced Schrödinger's complex-valued wave-function by a spinor-valued wave-function (see Chapter 9 for the Dirac equation).

Obviously we have
$$\chi(+1) = c_1 \quad \text{and} \quad \chi(-1) = c_2 \tag{7.2.3}$$
The elements $c_1$ and $c_2$ of the above column matrix are complex numbers in general, and the object $\chi$ representing this matrix is a spinor. From the principles of quantum mechanics it is evident that $|c_1|^2$ is the probability of finding the particle with spin up and $|c_2|^2$ is the probability of finding it with spin down. This requires the constraint (normalization):
$$|c_1|^2 + |c_2|^2 = (c_1^* \ \ c_2^*) \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 1 \tag{7.2.4}$$
which leads to the fact that associated with every spinor $\chi = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ there is an entity $\chi^* = (c_1^* \ \ c_2^*)$ such that $\chi^*\chi = 1$. Given two spinors $\chi$ and $\chi'$, their complex scalar product is defined as:
$$\chi^* \chi' = c_1^* c_1' + c_2^* c_2' \tag{7.2.5}$$
Two spinors are said to be orthogonal if this product is zero. Evidently $u$ and $d$ in (7.2.1) are orthogonal spinors; they are also normalized, since $u^*u = d^*d = 1$, and hence they form a basis in terms of which arbitrary spinors can be expanded. In fact, every $(2 \times 2)$ Hermitian matrix $A$ with distinct eigenvalues $\lambda_1$, $\lambda_2$ gives rise to eigenvectors
$$E_1, \ E_2 \tag{7.2.6}$$
which are orthogonal. These are called eigenspinors and an arbitrary spinor can be expressed in terms of them; thus we have:
$$\chi = d_1 E_1 + d_2 E_2 \tag{7.2.7}$$
The complex coefficients $d_1$ and $d_2$ determine the state (represented by the spinor $\chi$), $|d_1|^2$ and $|d_2|^2$ being the probabilities of finding the eigenvalues $\lambda_1$ and $\lambda_2$, which give the measurement of the physical quantity represented by $A$ (see Chapter 9).
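The normalization constraint and the complex scalar product are simple enough to try out directly. The following sketch represents a two-component spinor as a pair of Python complex numbers; the function names (`inner`, `normalize`, `probabilities`) and the sample state are illustrative assumptions rather than notation from the text.

```python
import math


def inner(chi, chi_prime):
    """Complex scalar product chi* chi' = c1* c1' + c2* c2'."""
    return sum(complex(a).conjugate() * complex(b)
               for a, b in zip(chi, chi_prime))


def normalize(chi):
    """Rescale chi so that |c1|^2 + |c2|^2 = 1."""
    norm = math.sqrt(inner(chi, chi).real)
    return tuple(complex(c) / norm for c in chi)


def probabilities(chi):
    """Probabilities of spin up and spin down for the normalized state."""
    c1, c2 = normalize(chi)
    return abs(c1) ** 2, abs(c2) ** 2


# Basis spinors u (up) and d (down).
u = (1 + 0j, 0j)
d = (0j, 1 + 0j)

# An unnormalized sample state 3 u + 4i d.
sample = (3 + 0j, 4j)
p_up, p_down = probabilities(sample)
```

For the sample state the norm is 5, so the two probabilities come out as 9/25 and 16/25, and they add up to 1 as the normalization requires.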
2.2 Clifford Algebras and Spinors

In this subsection we formulate the group theoretic version of spinors. Since Clifford algebras play an important role in the definition of spinors, we shall describe this algebra again (see Chap. 4, in particular Excs. 4.1.3 and 4.1.5, and also Exc. 7.1.3), this time defining it as a quotient algebra obtained from the tensor algebra.

Definition 7.2.1: Let $V$ be a vector space over the commutative field $\mathcal{F}$ and suppose that $Q$ is a quadratic form on $V$. The Clifford algebra $Cl(V, Q)$ associated to $V$ and $Q$ is an associative algebra with unit element; it is defined as follows. Let $T(V) = \sum_{r=0}^{\infty} \otimes^r V$ denote the tensor algebra of $V$; define the ideal $I_Q(V)$ of $T(V)$ generated by all elements of the form $v \otimes v + Q(v)1$ for $v \in V$. Then the quotient $T(V)/I_Q(V)$ is the Clifford algebra $Cl(V, Q)$.$^5$ There is a natural imbedding $V \hookrightarrow Cl(V, Q)$ which is the image of $V = \otimes^1 V$ under the canonical projection $\pi_Q : T(V) \to Cl(V, Q)$. As a result of this imbedding we have: if $v \in V \subset Cl(V, Q)$ is an element such that $Q(v) \neq 0$, then $\mathrm{Ad}_v(V) = V$. In fact, for all $u \in V$, the following relation holds:
$$-\mathrm{Ad}_v(u) = u - 2\,\frac{Q(v, u)}{Q(v)}\, v \tag{7.2.8}$$
where $Q(v, u)$ denotes the symmetric bilinear form associated with $Q$.
We shall use this idea further to define the 'spin group' associated to $Cl(V, Q)$. In order to do that, we begin with the following definition.

Definition 7.2.2: Consider the subset of $Cl(V, Q)$ given by:
$$Cl^{\times}(V, Q) = \{a \in Cl(V, Q) : \exists\, a^{-1} \text{ with } a^{-1}a = aa^{-1} = 1\} \tag{7.2.8(a)}$$
The subset $Cl^{\times}(V, Q)$ forms a group called the multiplicative group of units of the Clifford algebra. This group contains all elements $v \in V$ with $Q(v) \neq 0$, and when $\dim V = n$, and the field $\mathcal{F}$ is $\mathbb{R}$ or
(7.2.9)
where $O(V, Q) = \{\lambda \in Gl(V) : \lambda^* Q = Q\}$$^6$ is the orthogonal group of the form $Q$.

Definition 7.2.3: The Pin group of $(V, Q)$ is the subgroup $Pin(V, Q)$ of $P(V, Q)$ generated by the elements $v \in V$ such that $Q(v) = \pm 1$. The associated Spin group of $(V, Q)$ is defined by$^7$:
$$Spin(V, Q) = Pin(V, Q) \cap Cl^0(V, Q) \tag{7.2.10}$$
When $V = \mathbb{R}^{p+q}$ and $Q(x) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2$, we denote $Cl(V, Q)$ as $Cl_{p,q}$ or $C(p, q)$; for $p + q = n$ the notations are $Cl_n = Cl_{n,0}$ and $Cl^*_n = Cl_{0,n}$. The Spin group defined in (7.2.10) is now denoted by $Spin_n$. With these definitions in place, we shall now define a spin structure on a vector bundle and then a real (and complex) spinor bundle (see [14] for details).

5 The notations used here are slightly different from previous ones. We shall use $Cl$ and $C$ interchangeably to denote the Clifford objects.
6 The map $\lambda^*$ which preserves $Q$ is induced by an automorphism $\lambda$ of $V$.
7 Consider the automorphism $\phi : Cl(V, Q) \to Cl(V, Q)$ which extends the map $\phi(v) = -v$ on $V$ to $Cl(V, Q)$. Since $\phi^2 = \mathrm{Id}$, there is a decomposition $Cl(V, Q) = Cl^0(V, Q) + Cl^1(V, Q)$ showing that $Cl(V, Q)$ is $Z_2$-graded. Apparently the even part $Cl^0(V, Q)$ is a subalgebra of $Cl(V, Q)$.
Definition 7.2.4: Let $E$ be an oriented $n$-dimensional Riemannian vector bundle over a manifold $X$ and let $P_{SO}(E)$ be its bundle of oriented orthonormal frames. Suppose $n \geq 3$; then a spin structure on $E$ is a principal $Spin_n$-bundle (denoted $P_{Spin}(E)$) together with a 2-sheeted covering
$$\xi : P_{Spin}(E) \to P_{SO}(E) \tag{7.2.11}$$
such that $\xi(pg) = \xi(p)\,\xi_0(g)$ for all $p \in P_{Spin}(E)$ and $g \in Spin_n$, where $\xi_0 : Spin_n \to SO_n$ is the universal covering homomorphism. When $n = 2$ the definition is analogous, with $Spin_n$ replaced by $SO_2$; the map $\xi_0 : SO_2 \to SO_2$ in this case is the connected 2-fold covering. When $n = 1$, $P_{SO}(E) = X$ and a spin structure is simply defined to be a 2-fold covering of $X$.

Definition 7.2.5: Let $E$ be an oriented Riemannian vector bundle with a spin structure $\xi : P_{Spin}(E) \to P_{SO}(E)$. A real spinor bundle is the bundle
$$S(E) = P_{Spin}(E) \times_{\mu} M \tag{7.2.12}$$
where $M$ is a left module for $Cl_n = Cl(\mathbb{R}^n)$ and where $\mu : Spin_n \to SO(M)$ is the representation given by left multiplication by elements of $Spin_n \subset Cl^0(\mathbb{R}^n)$. A complex spinor bundle can be similarly defined as:
$$P_{Spin}(E) \times_{\mu} M_{\mathbb{C}} \tag{7.2.13}$$
where $M_{\mathbb{C}}$ is a complex left module for $Cl(\mathbb{R}^n) \otimes \mathbb{C}$. In order to return to the spinors of our interest, we choose $n = 4$, and show what exactly a spinor is from this bundle-theoretic point of view. Suppose that $U$ denotes the domain of a local chart of the 4-dimensional manifold $X = X^4$, and $\rho_{0x}$ denotes a fixed field of orthonormal frames on $U$ for $x \in U$. Further, let $\Lambda_{0x}$ be a fixed differentiable mapping of $U$ into $Spin_4$. A spinor $\psi$ on $U$ is an equivalence class of triples:
$$(\rho_x, \Lambda_x, \psi_x) \sim (\rho_{0x}, \Lambda_{0x}, \psi_{0x}) \tag{7.2.14}$$
if
$$(\rho_x, \Lambda_x, \psi_x) = \left(L(\Lambda_x \Lambda_{0x}^{-1})\,\rho_{0x},\ \Lambda_x,\ \Lambda_x \Lambda_{0x}^{-1}\,\psi_{0x}\right).$$
We would like to mention that the general construction given above has been considerably simplified: the vector bundle $E$ is replaced by the tangent vector bundle $T(X)$, and the Clifford algebra $Cl(\mathbb{R}^4)$ is taken as a module over itself (replacing $M$) by left multiplication. Finally we note that in terms of an orthonormal basis $\{e_\mu\}$ of $V$ the spinors could be described as follows. From Exc. (7.1.3) we recall that an arbitrary element of a Clifford algebra $C(Q)$ with quadratic form $Q$ defined on the $r$-dimensional vector space $V_{\mathbb{R}}$$^8$ can be expressed as:
$$a^{(0)} + a^{\mu} e_\mu + a^{\mu_1\mu_2} e_{\mu_1} e_{\mu_2} + \cdots + a^{\mu_1 \cdots \mu_r} e_{\mu_1} e_{\mu_2} \cdots e_{\mu_r} \tag{7.2.15}$$
where $\{e_\mu\}$ is the orthonormal basis of $V$ and $1,\ e_\mu,\ e_{\mu_1} e_{\mu_2},\ \ldots,\ e_{\mu_1} e_{\mu_2} \cdots e_{\mu_r}$ ($\mu_1 < \mu_2 < \cdots < \mu_r$) is the basis of $C(Q)$. Note that each coefficient $a^{\mu_1 \cdots \mu_i}$ is totally antisymmetric in the $\mu$'s. If we restrict $Q$ as follows:
$$Q(e_\mu) = \begin{cases} +1, & \mu = 1, 2, \ldots, p \\ -1, & \mu = p + 1, \ldots, p + q = r \end{cases} \tag{7.2.16}$$

8 $V_{\mathbb{R}}$ = vector space over the field $F = \mathbb{R}$.
then the elements of the form $\frac{1}{2} a^{\mu\nu} (e_\mu e_\nu - e_\nu e_\mu)$ in (7.2.15) can be viewed as Lie brackets which span the Lie algebra $spin(p, q; \mathbb{R})$. A representation of the Clifford algebra $C(p, q)$ thus gives a representation of $spin(p, q; \mathbb{R})$. The elements of the corresponding representation space are called the spinors. Due to the importance of Clifford algebras for spinors, we list below a few facts about them.

Fact 7.2.6: Every Clifford algebra admits an irreducible matrix representation unique up to equivalence, and therefore all Clifford algebras can be uniquely characterized as matrix algebras [1].

Fact 7.2.7: The knowledge of $C(p, q; \mathbb{R})$ for $q = 0$ and $p = 1, 2, \ldots, 8$ leads to the determination of the other $C(p, q; \mathbb{R})$'s. In particular many properties of these algebras depend only on the signature $(p - q)$ mod (8).

Fact 7.2.8: Using the expression (7.2.15), and identifying the basis elements of $C(p, q)$ with Pauli matrices, it follows that:
$$C(0, 0) = \mathbb{R}, \quad C(1, 0) = \mathbb{R} + \mathbb{R}, \quad C(0, 1) = \mathbb{C}, \quad C(2, 0) = \mathbb{R}(2), \quad C(1, 1) = \mathbb{R}(2), \quad C(0, 2) = \mathbb{H} \tag{7.2.17}$$
where $\mathbb{R}(2)$ stands for the algebra of real $(2 \times 2)$ matrices, and $\mathbb{H}$ stands for the algebra of quaternions.
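The low-dimensional identifications in Fact 7.2.8 rest on finding matrices that satisfy the defining relation $e_\mu e_\nu + e_\nu e_\mu = 2\delta_{\mu\nu} Q(e_\mu) \cdot 1$ with the signs of (7.2.16). The sketch below checks this relation for Pauli-matrix choices of generators along the lines of the Hints to Exercise 7.2; the standard form of the Pauli matrices, the particular generator choices and the function names are assumptions made for illustration.

```python
# Standard Pauli matrices (assumed convention).
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]


def satisfies_clifford(generators, q_values):
    """Check e_mu e_nu + e_nu e_mu = 2 delta_{mu nu} Q(e_mu) 1 for all pairs."""
    n = len(generators[0])
    for mu, e_mu in enumerate(generators):
        for nu, e_nu in enumerate(generators):
            left, right = matmul(e_mu, e_nu), matmul(e_nu, e_mu)
            for i in range(n):
                for j in range(n):
                    expected = 2 * q_values[mu] if (mu == nu and i == j) else 0
                    if left[i][j] + right[i][j] != expected:
                        return False
    return True


def is_real(a):
    """True if every entry of the matrix has vanishing imaginary part."""
    return all(complex(x).imag == 0 for row in a for x in row)


# Candidate generators, each paired with the values Q(e_mu) it should realize.
C20 = ([T1, T3], [1, 1])                            # C(2, 0)
C11 = ([T1, scale(-1j, T2)], [1, -1])               # C(1, 1)
C02 = ([scale(1j, T1), scale(1j, T2)], [-1, -1])    # C(0, 2)
```

The generators chosen for $C(2, 0)$ and $C(1, 1)$ are real matrices, which fits their identification with $\mathbb{R}(2)$, whereas those for $C(0, 2)$ are not all real, in keeping with the quaternionic case.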
Fact 7.2.9: Since the algebra of 4-dimensional Dirac matrices can be represented as a direct product of two copies of Pauli algebras (formed by Pauli matrices), there follow the isomorphisms:
$$C(p, q) \otimes C(2, 0) \simeq C(q + 2, p), \quad C(p, q) \otimes C(1, 1) \simeq C(p + 1, q + 1), \quad C(p, q) \otimes C(0, 2) \simeq C(q, p + 2) \tag{7.2.18}$$
Finally we define the complex Clifford algebras; these are required for the classification of spinors of different types.
2.3 Dirac, Majorana and Weyl Spinors

Definition 7.2.10: A complex Clifford algebra is the complexification of a real Clifford algebra. Thus the complex Clifford algebra obtained from the real $C(p, q)$ is
$$C(p, q) \otimes \mathbb{C} \tag{7.2.19}$$
The elements of the representation space of $C(r)$ are called (complex) Dirac spinors. These spinors exist in all dimensions, and have $2^{[r/2]}$ complex components ($[r/2]$ stands for the integral part of $r/2$). The well known Dirac equation (in different dimensions) is written using these spinors. The Clifford algebra $C(r)$ is known to be isomorphic to the algebra of $2^{[r/2]} \times 2^{[r/2]}$ complex matrices (denoted $\mathbb{C}(2^{[r/2]})$). If the complex Clifford algebra corresponds to a given space-time dimension with the metric signature satisfying:
$$(p - q) = 0, 1, 2 \pmod 8 \tag{7.2.20}$$
then it can be checked that the Dirac equation (see Sec. 9.3) shuffles the real (imaginary) parts of any Dirac spinor amongst the real (imaginary) parts. As a consequence a reality condition can be imposed
on spinors, which says: the spinors can be taken real without creating a contradiction with the Dirac equation. In these dimensions and metric signatures, the Clifford algebra has a real representation of smaller dimension than usual, and the elements of the representation space are called Majorana spinors. If on the other hand $(p - q) \equiv 0, 6, 7 \pmod 8$, the Clifford algebra has a pure imaginary representation and the elements are called pseudo-Majorana. Apart from spinors that correspond to purely real or imaginary representations, there is another class which corresponds to the case of even dimensionality $r$ of the Clifford algebra $C(r)$. If the coefficients $a^{\mu}$, $a^{\mu_1\mu_2\mu_3}$, etc., are all chosen to be zero, then the subalgebra formed by the other elements is $2^{r-1}$-dimensional; we denote it by ${}^{\circ}C(r)$ (see Ftn. 7 for a different notation). It can be checked that the subalgebra ${}^{\circ}C(r)$ is not simple, since it decomposes into two ideals:
$${}^{\circ}C(r) = {}^{\circ}C(r)P_+ + {}^{\circ}C(r)P_- \tag{7.2.21}$$
where $P_\pm$ are the projection operators defined by generalized Dirac operators (similar to $\frac{1}{2}(1 \pm \gamma_5)$) in the following sense$^9$:
$$P_\pm = \tfrac{1}{2}(1 \pm e) = \tfrac{1}{2}\big(1 \pm (e_1 e_2 \cdots e_r)\big) \quad (e^2 = +1), \qquad P_\pm = \tfrac{1}{2}(1 \pm ie) \quad (e^2 = -1) \tag{7.2.22}$$
(7.2.23)
whenever $p - q \equiv 0 \pmod 4$. This means that Majorana spinors can be split into Majorana-Weyl spinors whenever $p - q \equiv 0, 2 \pmod 8$ and $p - q \equiv 0 \pmod 4$ simultaneously. Thus Majorana-Weyl spinors can always be defined when $p - q \equiv 0 \pmod 8$. A well known case of the existence of Majorana-Weyl spinors is 10-dimensional supersymmetric Yang-Mills theory, since in this case p = 9 and q = 1. Finally we collect this classification of spinors in the form of a table:

Spinor          Dimension
Dirac           any r
Weyl            r even
Majorana        r = 2, 3, 4 (mod 8)
Majorana-Weyl   r = 2 (mod 8)
(7.2.24)
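The classification rules quoted above can be turned into a small lookup. The sketch below is only an illustration: the function name `spinor_types` is hypothetical, and it simply encodes the stated conditions (Weyl for even r, Majorana for p - q = 0, 1, 2 (mod 8), Majorana-Weyl for p - q = 0 (mod 8)), assuming a signature C(p, q) with r = p + q.

```python
def spinor_types(p, q):
    """Spinor types allowed for C(p, q), using only the rules quoted in the text."""
    r = p + q
    types = ["Dirac"]                          # Dirac spinors exist for any r
    weyl = (r % 2 == 0)                        # Weyl spinors need even r
    majorana = ((p - q) % 8) in (0, 1, 2)      # condition (7.2.20)
    if weyl:
        types.append("Weyl")
    if majorana:
        types.append("Majorana")
    if (p - q) % 8 == 0:                       # both conditions at once
        types.append("Majorana-Weyl")
    return types


if __name__ == "__main__":
    for p, q in [(3, 1), (9, 1), (4, 1)]:
        print((p, q), spinor_types(p, q))
```

With q = 1 the condition p - q = 0, 1, 2 (mod 8) becomes r = 2, 3, 4 (mod 8), which is the form used in table (7.2.24).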
Exercise 7.2
1. Establish the representations of the Clifford algebra for the different values of p and q given in (7.2.17), and show that if the quadratic form Q = 0, then C(Q) becomes a Grassmann algebra.

Ftn. 9. See (7.3.7) for $\gamma_5$; $e^2 = +1$ for $p - q \equiv 0$ or $1 \pmod 4$; $e^2 = -1$ for $p - q \not\equiv 0$ or $1 \pmod 4$. Equation (7.2.22) is defined only when $e^2 = \pm 1$.
All that is Super—an Introduction 291
2. Show that, similar to $\gamma_5$, the product $e = e_1 \cdots e_r$ satisfies:
$e^2 = +1$ for $p - q \equiv 0, 1 \pmod 4$, and $e^2 = -1$ for $p - q \not\equiv 0, 1 \pmod 4$.
3. Show that the conditions on p, q for the definition of Majorana spinors given in (7.2.20) and (7.2.24) are compatible.
Hints to Exercise 7.2
1. To obtain the representations of C(p, q) for p, q given in (7.2.17) we use (7.2.15) in each case. When p = q = 0, there are no $e_i$'s, thus the algebra reduces to the reals, i.e., to $\mathbb{R}$. For p = 1 and q = 0, i.e., for C(1, 0), there is only one basis vector, say $e_1$. This can be viewed as the 1-dimensional unit matrix $I_1$ (Ftn. 10), and from (7.2.15) the general element is $a^{(0)} + a^{(1)} e_1$, where $a^{(0)}, a^{(1)} \in \mathbb{R}$. This gives $C(1, 0) = \mathbb{R} + \mathbb{R}$. When p = 0, q = 1, (7.2.16) gives $Q(e_1) = -1$; accordingly $e_1 = i I_1$, the general element is of the form $a^{(0)} + i a^{(1)} I_1$ and hence $C(0, 1) = \mathbb{C}$. For C(2, 0) the basis vectors are the Pauli matrices $e_1 = \tau_1$, $e_2 = \tau_3$, which gives $(e_1)^2 = I_2$, $(e_2)^2 = I_2$ and $e_1 e_2 = -i\tau_2$. But these are the basis vectors of the algebra $\mathbb{R}(2)$ of real (2 x 2) matrices, which implies $C(2, 0) = \mathbb{R}(2)$. In the case of C(1, 1), since $(e_1)^2 = I_2$ and $(e_2)^2 = -I_2$, we have $e_1 = \tau_1$, $e_2 = -i\tau_2$ and $e_1 e_2 = \tau_3$; hence the basis vectors again form $\mathbb{R}(2)$, showing that $C(1, 1) = \mathbb{R}(2)$. In the case of C(0, 2), both $(e_1)^2$ and $(e_2)^2$ equal $-I_2$, hence $e_1 = i\tau_1$, $e_2 = i\tau_2$, $e_1 e_2 = -i\tau_3$; these form a basis of the quaternion algebra $\mathbb{H}$, hence $C(0, 2) = \mathbb{H}$. If the quadratic form Q = 0, in view of the relation (ii) in Exc. 3 of (7.1), the algebra C(Q) reduces to the Grassmann algebra, whose defining equation is:
$e_\mu e_\nu + e_\nu e_\mu = 0.$
2. In order to use $p - q \equiv 0, 1 \pmod 4$ we choose p - q = 4 or p - q = 5; this gives, for q = 1 in both cases, p as 5 or 6 and r = 6 or 7. Accordingly:
(i) $e = e_1 e_2 \cdots e_6$, or
(ii) $e = e_1 e_2 \cdots e_7$.
Thus in the first case we have:
(iii) $e^2 = (e_1 e_2 \cdots e_6)(e_1 e_2 \cdots e_6)$, or
(iv) $e^2 = (-(e_1)^2)\,(e_2)^2\,(-(e_3)^2)\,(e_4)^2\,(-(e_5)^2)\,(e_6)^2$
where we have used $e_i e_j = -e_j e_i$ to move the $e_i$'s from the RHS to the LHS. Then in view of the defining relation of Q(x) given in Exc. 3 of (7.1) and (7.2.16), we note that $(e_6)^2 = -1$ whereas all other squares equal 1. This gives $e^2 = 1$. The result for e given in (ii) follows on similar lines. To prove the other part, $e^2 = -1$ when $p - q \not\equiv 0, 1 \pmod 4$, we choose p - q = 6, which gives p = 6 + q. The choice of q = 1 makes r = 8. Following the same steps, the result is obvious.
3. To establish the result we choose q = 1; then a Majorana spinor according to (7.2.20) is given for p = 9, p = 10, p = 11, which gives r = 10, 11 and 12. Apparently these satisfy the condition given in (7.2.24).

Ftn. 10. $I_1$, $I_2$ stand for the 1-dimensional and 2-dimensional unit matrices.
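The sign rule of Exercise 2 can also be checked mechanically. The following is a minimal sketch, assuming only the relations $e_i e_j = -e_j e_i$ for $i \neq j$, $(e_i)^2 = +1$ for p of the generators and $(e_i)^2 = -1$ for the remaining q; the function name `square_of_product` is hypothetical. It reorders the word $e_1 \cdots e_r e_1 \cdots e_r$ by adjacent swaps, picking up a sign for each swap, and then contracts the equal pairs.

```python
def square_of_product(p, q):
    """Return e^2 = +1 or -1 for e = e_1 ... e_r in C(p, q), r = p + q."""
    r = p + q
    squares = [1] * p + [-1] * q              # value of (e_i)^2 for each generator
    word = list(range(r)) + list(range(r))    # the word e_1...e_r e_1...e_r
    sign = 1
    swapped = True
    while swapped:                            # bubble sort with sign bookkeeping
        swapped = False
        for k in range(len(word) - 1):
            if word[k] > word[k + 1]:         # distinct generators anticommute
                word[k], word[k + 1] = word[k + 1], word[k]
                sign = -sign
                swapped = True
    for i in range(r):                        # word is now e_1 e_1 e_2 e_2 ...
        sign *= squares[i]
    return sign


if __name__ == "__main__":
    for p, q in [(5, 1), (6, 1), (7, 1)]:
        print((p, q), square_of_product(p, q))
```

The three sample signatures are the ones used in the hint (r = 6, 7 and 8 with q = 1).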
3 MORE ON SPINORS
Having introduced the concepts of superalgebra, Lie superalgebra and the Clifford algebras leading to spinors in earlier sections, we shall now use the spinors to describe the Poincaré superalgebra and finally the supersymmetry algebra in Sec. 4. Since Pauli matrices happen to be an important tool in this description, we shall present their properties as well. Before pursuing these, however, we state the Coleman-Mandula theorem [3], which happens to be one of the most precise and powerful no-go theorems. This theorem established the impossibility of nontrivial symmetries that connect particles of different spins, namely the integer spin (bosons) and half-integer spin (fermions) particles, and thus in a manner of speaking led to the concept of supersymmetry. The theorem begins with the following assumptions: (i) the scattering matrix of interacting particles (S-matrix) is based on a local, relativistic quantum field theory in 4-dimensional spacetime; (ii) there are only a finite number of different particles associated with one-particle states of a given mass; and (iii) there is an energy gap between the vacuum and the one-particle states. It concludes by asserting that the most general Lie algebra of symmetries of the S-matrix contains the energy-momentum operator $P_\mu$ and the Lorentz rotation generator $M_{\mu\nu}$, as well as a finite number of Lorentz scalar operators $T_l$ that belong to the Lie algebra of a compact Lie group.
3.1 The Poincaré Superalgebra
In view of the above theorem we make a few observations concerning the construction of these algebras:
(a) Every physically admissible supersymmetry (denoted $S$) has to satisfy two conditions, namely: (i) the Bose sector ${}^0S$ of its corresponding superalgebra $S$ must be the direct sum of a Poincaré algebra $\mathcal{P}$ (see Exc. 4.17 for $\mathcal{P}$) and an internal symmetry algebra $\mathcal{G}$; (ii) all elements of the Fermi sector ${}^1S$ (of $S$) must transform like (Lorentz invariant) spinors.
(b) Since we know that simple/semi-simple Lie algebras are easier to handle, we shall be looking for such algebras in this case also.
(c) Although the Poincaré algebra is neither simple nor semi-simple in any space-time dimension, it happens to be most suitable (see the conclusion of the above theorem), as it can be obtained by Wigner-Inönü contraction from the simple de-Sitter algebra on the one hand, and on the other it can be embedded in the (again) simple conformal algebra (see Exc. 7.3.1 for the Wigner-Inönü contraction).
From (a) it follows that the construction of a superalgebra can be based on a proper choice of ${}^0S$ and ${}^1S$. Also, in view of the theorem, such a construction involves the generators $M = (M_{\mu\nu})$ and $P = (P_\mu)$ of a Poincaré algebra $\mathcal{P}$, the generators (denoted T) of the internal symmetry algebra $\mathcal{G}$ and the generators Q that come from the Fermi sector. Hence, in the case of the Poincaré superalgebra, the generators involved are those of the Poincaré algebra and the ones that come from the Fermi sector (see Sec. 4). The Lie brackets formed by the generators M, P and T satisfy:
[M, M] ~ M, [P, P] = 0, [P, M] ~ P, [T, T] ~ T, [P, T] = 0 = [M, T]
(7.3.1)
When the above generators are coupled with the generators Q, there are four additional bracket relations:
[M, Q] ~ Q, [P, Q] ~ Q, [T, Q] ~ Q, {Q, Q} ~ M + P + T
(7.3.2)
Note that the fourth bracket in (7.3.2) is an anticommutator (written with braces), indicating the different character of Q, and the sum on the RHS of this bracket is actually a linear combination of the generators M, P and T (Ftn. 11). Since (from the previous section) the superalgebra will have a $Z_2$-graded structure, the generators P, M and T can be viewed as even and Q as odd; we can therefore succinctly write the bracket relations as:
[even, even] = even, {odd, odd} = even, [even, odd] = odd.
(7.3.3)
We shall return to the description of the above relations in indicial form in the next section. In order to do that we have to familiarize ourselves with the properties of Pauli matrices and their usage in representing the Dirac matrices, the Weyl basis and the Weyl, Majorana and Dirac spinors, which we have already defined in the most general form in Sec. 2 (see the Appendix at the end of this chapter for properties of Pauli matrices and Exc. (7.3) for verifications). We therefore devote the rest of this section to learning about them and to fixing some notation that will be required in later sections. Recall that we have already used the Pauli matrices $(\tau^m)$ (Ftn. 12), m = 0, 1, 2, 3, to define the generators $(\tau^m/2)$ of SU(2) in Chapter 2, where we emphasized that these matrices or their scalar multiples could be used as generators of a Lie algebra. We shall now employ these matrices to show a connection between SL(2, C) and the Lorentz group (Ftn. 13) and for representing the spinors in a different basis. Now SL(2, C) is the linear group of 2 x 2 complex matrices M with det M = 1. This group can be represented (element-wise) by M, or its complex conjugate $M^*$, or its transpose inverse $(M^T)^{-1}$, or its Hermitian conjugate inverse $(M^\dagger)^{-1}$. By this we mean that any of these four matrices can be selected to represent the action of the Lorentz group on two-component Weyl spinors.
3.2 Lorentz Invariance
Let P denote any (2 x 2) complex matrix ∈ SL(2, C); we note that P can be written using the Pauli matrices (Ftn. 14) as the basis, thus:
$P = P_m \tau^m = \begin{pmatrix} -P_0 + P_3 & P_1 - iP_2 \\ P_1 + iP_2 & -P_0 - P_3 \end{pmatrix}$ (7.3.4)
which shows that if P is Hermitian, the $P_m$'s are real. This leads to the fact that given a Hermitian matrix P, we can always find another matrix P', namely $P' = M P M^\dagger$, such that
$P_0'^2 - \mathbf{P}'^2 = P_0^2 - \mathbf{P}^2$, $(\mathbf{P} = (P_1, P_2, P_3))$ (7.3.5)

Ftn. 11. The equivalence sign ~ in Eqs. (7.3.1)-(7.3.2) is used to mean that the RHS is a constant multiple/linear combination of the generators M, P, T.
Ftn. 12. We denoted them as $\sigma^m$ in Chapter 2.
Ftn. 13. See the definition of the Lorentz group in Chapter 2 (Exp. (2.2.11)).
Ftn. 14. We have taken $\tau^0 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ in this case. See also the Appendix.
as det M = 1. Equation (7.3.5) shows the Lorentz invariance of P. Using the above example of a matrix P, we can check that the products $\psi^\alpha \psi_\alpha$, $\bar\psi_{\dot\alpha} \bar\psi^{\dot\alpha}$ and $\psi^\alpha \tau^m_{\alpha\dot\alpha} \partial_m \bar\psi^{\dot\alpha}$ that involve two-component spinors and derivatives of spinors are all invariant under Lorentz transformations (see Sec. 6.3); in other words they are Lorentz scalars (Ftn. 15).
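The invariance expressed by (7.3.5) is easy to test numerically. The sketch below is an illustration only: it assumes the convention $\tau^0 = -1$ of Ftn. 14, so that $\det P = P_0^2 - \mathbf{P}^2$, and all helper names (`from_components`, `to_components`, `transform`) are hypothetical.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2 x 2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def from_components(p0, p1, p2, p3):
    """P = P_m tau^m with tau^0 = -1, as in (7.3.4)."""
    return [[-p0 + p3, p1 - 1j * p2], [p1 + 1j * p2, -p0 - p3]]

def to_components(m):
    """Invert from_components: read (P0, P1, P2, P3) off a 2 x 2 matrix."""
    p0 = -(m[0][0] + m[1][1]) / 2
    p3 = (m[0][0] - m[1][1]) / 2
    p1 = (m[1][0] + m[0][1]) / 2
    p2 = (m[1][0] - m[0][1]) / 2j
    return p0, p1, p2, p3

def interval(p):
    p0, p1, p2, p3 = p
    return p0 * p0 - p1 * p1 - p2 * p2 - p3 * p3

def transform(p_matrix, m):
    """P' = M P M^dagger."""
    return matmul(matmul(m, p_matrix), dagger(m))


if __name__ == "__main__":
    M = [[2, 1j], [0, 0.5]]                 # an arbitrary matrix with det M = 1
    P = from_components(2.0, 0.3, -1.1, 0.7)
    P_new = transform(P, M)
    print(interval(to_components(P)), interval(to_components(P_new)))
```

Because det(M P M†) = |det M|² det P, any M of unit determinant leaves the quadratic form unchanged; the example matrix M is an arbitrary choice.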
3.3 Dirac Matrices and Dirac and Majorana Spinors
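Next we write the (4 x 4) Dirac matrices in terms of Pauli matrices, which we shall use to establish the connection between four- and two-component spinors, thus:
$\gamma^m = \begin{pmatrix} 0 & \tau^m \\ \bar\tau^m & 0 \end{pmatrix}$ (7.3.6)
where $(\tau^m) = (1, \boldsymbol{\tau})$, $(\bar\tau^m) = (-1, \boldsymbol{\tau})$. We set
$\gamma_5 = i \gamma^1 \gamma^2 \gamma^3 \gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ (7.3.7)
which gives $\gamma_5^2 = 1$. We also define the charge conjugation matrix (the (4 x 4) unitary matrix) $C_{\alpha\beta}$ in terms of these Gamma matrices, thus: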
fO
1
}
0
C=(Ca/3)=
-1
°
0
_! = i f r
(7-3.8)
2
Note that while C is antisymmetric, $C\gamma^m$ is symmetric. The Majorana conjugate of a four-component spinor $\chi_\alpha$ is defined as:
$\bar\chi_M = \chi^T C$ or $(\bar\chi_M)_\alpha = \chi^\beta C_{\beta\alpha}$ (7.3.9)
whereas the Dirac conjugate is defined as:
$\bar\chi_D = \chi^\dagger (i\gamma^0)$ or $(\bar\chi_D)_\alpha = (\chi^\dagger)^\beta (i\gamma^0)_{\beta\alpha}$ (Ftn. 16) (7.3.10)
In fact a spinor is called a Majorana spinor if this Dirac conjugate equals the Majorana conjugate, i.e., if
$\chi^T C = i \chi^\dagger \gamma^0$

Ftn. 15. The dotted index (which can also move up and down using Lorentz transformations) indicates the complex conjugate of a given spinor component, for instance $(\chi_\alpha)^* = \bar\chi_{\dot\alpha}$. Since Pauli matrices are used in representations of these spinors, they are also given the index structure $\tau^m_{\alpha\dot\beta}$, $\bar\tau^{m\,\dot\alpha\beta}$, etc. It should be noted, however, that for each value of m they represent the same Pauli matrix regardless of the indices $(\alpha\dot\beta)$, $(\dot\alpha\beta)$, etc. When m = 0 they stand for (+ or -) times the unit matrix.
or equivalently
Using (7.3.7) we can now write a 4-component spinor % as:
a3 u)
*••(£)
-
where the 2-component spinors % A and £, • are related to 4-component spinors by the rule: XA = j(l
+ Ys)X and lk = ~{l-y5)X
(7.3.12)
Note that %A labels the first two components and £ • the last two components of %. The charge conjugation matrix can also be expressed in this terminology as: (EAR
C=
0 "\
[ 0 eAB)
The Majorana condition XM = XD
now
(7 3 13)
''
becomes:
( £ * ) * = £ „ , (XAf = - S A
(7-3.14)
where XA = XBeBAmd^A Hence for a Majorana spinor we have:
Xa={lA\xa
= eAUs
(7-3-15)
=(XA,-XA)
(7-3.16)
Since Cfo)* = ~xK, we have C^4)* = - ~XA. From our discussions in Sec. 2 (see (7.2.22)), we also note that a Dirac spinor contains two separate Weyl spinors, e.g.,
Finally we would like to mention that Gamma-matrices are represented in more than one way. We give below three of these; to distinguish them from the previous ones as well as from each other, we have provided the $\gamma$'s with a subscript.

Ftn. 16. See Def. (7.2.10) for a Dirac spinor $\chi$; the symbol † = + = Hermitian conjugation, whereas * is the usual complex conjugation. The bar on any spinor $\chi$ indicates that it has been obtained from $\chi$ by complex conjugation. We shall use the two symbols † and + interchangeably.
y'w=\-m {t Yco -\ f
T
] (Weyl basis) 0 )
' 1 ° ^(Canonical
(7.3.17)
basis)
(7.3.18)
r°-( ° ""'I r'-f° ^1 /M~UT2 0 J 7 M " U 3 oj 7^=[o
-ilj ^"(-,-T1
0 J
(MajOrana basis)
(7-3-19)
While using the $\gamma$-matrices, the reader must make sure of their correct description in terms of Pauli matrices and the $\tau^0$ matrix that is used along with them (see Chapter 9 in [16] for a different viewpoint). Exercise 6 of this section explains this point by choosing a different set of $\gamma$'s. We would also like to note that the summation rules in the case of spinors differ from those of tensors; for instance we write $\chi_A = \chi^B \epsilon_{BA}$ and $\chi^A = \epsilon^{AB} \chi_B$ for spinors, unlike $t_A = g_{AB} t^B$, $t^A = g^{AB} t_B$ for tensors. It is because the $\epsilon$'s are antisymmetric matrices, unlike the symmetric metric tensor $g_{\mu\nu}$. The complex conjugation also works slightly differently here due to the involvement of anticommuting objects. For example, in this case we require that the complex conjugation and the Hermitian conjugation of a scalar such as
$S = \theta^A \epsilon_{AB} \theta^B$
(7.3.20)
be the same (8 being an arbitrary complex spinor), we explain this as follows. Viewing 0's as matrices the Hermitian conjugacy means that: 5+ = ( 0 V
(EAB)*
(6A? = GB{eABf
6A
(7.3.21)
In order that 5+ = S*, it would mean that the order of d's be reversed. More explicitly:
(eAeB)*=dBeA (9A0B6c)* = (9c)\eB)* (6A)* = -9c6BdA
(7.3.22)
since (GL)* = -6LP As can be expected, the derivation rules are also different for 0's. Note that the 8's are different type of entities because of their anti-commutativeness. We shall learn more about it 17
Note that a dotted index results from complex conjugation. We shall therefore drop the bar on G in general.
when we introduce superspaces, superfields, etc., in the next section. At present we only list these derivation rules (for later use) as follows:
(a)
(b)
(o
f-^-1* - -±-
f-^-V - - i -
(d)
(7.3.23)
Note that (c) and (d) can be derived from (a) and (b) by using the complex conjugation rules for $\theta$'s. All these are left derivatives (Ftn. 18) and are called the fermionic derivatives.
Exercise 7.3
1. Recall that the isometry groups of the de-Sitter and anti-de-Sitter spaces are O(4,1) and O(3,2). Denote the corresponding Lie algebras by o(4,1) and o(3,2). Let $M_{\mu\nu} = -M_{\nu\mu}$ ($\mu, \nu = 1, 2, \ldots, 5$) be the generators of these algebras, which decompose into two sets: $M = (M_{12}, M_{13}, M_{14}, M_{23}, M_{24}, M_{34})$ and $P = (M_{i5}, i = 1, \ldots, 4)$. The Lie bracket relations can now be put as:
(a) [M, M] ~ M, [M, P] ~ P, [P, P] ~ M.
We now choose a mapping (rescaling of generators) $M \to \bar M = M$, $P \to \bar P = \lambda P$, which naturally implies:
(b) $[\bar M, \bar M] \sim \bar M$, $[\bar M, \bar P] \sim \bar P$, $[\bar P, \bar P] \sim \lambda^2 \bar M$
since $\bar M = M$. Show that when $\lambda \to 0$ we have the ordinary Poincaré algebra. The above process of taking the limit $\lambda \to 0$ is called the Inönü-Wigner contraction.
2. Show that $\chi^A \chi_A = -\chi_A \chi^A = \chi^2$ and $\bar\chi_{\dot A} \bar\chi^{\dot A} = -\bar\chi^{\dot A} \bar\chi_{\dot A} = \bar\chi^2$, hence $\chi^A \chi^B = -\frac{1}{2} \epsilon^{AB} \chi^2$.
3. Verify (a) and (c) of (A.10) of the Appendix.
4. Verify that the scalar $P_m \tau^m$ of (7.3.4) is Lorentz invariant, and a (2 x 2) complex matrix.

Ftn. 18. Left derivative of $f(x) \equiv \lim_{\epsilon \to 0} \frac{f(x) - f(x - \epsilon)}{\epsilon}$.
5. Verify the Gamma-matrices product given in (7.3.7).
6. Show that the Weyl basis, the Dirac-canonical basis and the Majorana basis are related to each other by similarity transformations. Identify these matrices. Also show that the matrices of the Majorana basis satisfy the relation $(\gamma_M^m)^* = -\gamma_M^m$ for m = 0, 1, 2, 3.
7. Consider the Majorana representation of the Clifford algebra C(3, 1): $\gamma^0 = i\tau'^2 \otimes \tau^1$, $\gamma^1 = \tau'^1 \otimes \tau^0$, $\gamma^2 = \tau'^2 \otimes \tau^2$, $\gamma^3 = \tau'^3 \otimes \tau^0$, where $\tau'$ and $\tau$ are two copies of Pauli matrices and $\tau'^0$ and $\tau^0$ are 2 x 2 unit matrices. Let $M_\alpha = M_\alpha^*$ be a Majorana spinor; then show that a 2-component spinor
$W = \frac{1}{2}(1 - i\gamma_5) M$ or $\bar W = \frac{1}{2}(1 + i\gamma_5) M$
can be constructed using M, where $\gamma_5 = \gamma^0 \gamma^1 \gamma^2 \gamma^3 = -i\tau'^2 \otimes \tau^3$.
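The representation in Exercise 7 can be checked by direct multiplication. The sketch below is a hedged illustration: it assumes the standard Pauli matrices and the reading $\gamma^0 = i\tau'^2 \otimes \tau^1$, $\gamma^1 = \tau'^1 \otimes \tau^0$, $\gamma^2 = \tau'^2 \otimes \tau^2$, $\gamma^3 = \tau'^3 \otimes \tau^0$, with signature (-, +, +, +); the helper names are hypothetical. It verifies the Clifford relations, the reality of the matrices, and the stated form of $\gamma_5$.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

T0 = [[1, 0], [0, 1]]
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]

# Assumed Majorana representation of C(3, 1); first factor is tau', second is tau.
GAMMA = [kron(scale(1j, T2), T1), kron(T1, T0), kron(T2, T2), kron(T3, T0)]
ETA = [-1, 1, 1, 1]                      # (gamma^0)^2 = -1, (gamma^i)^2 = +1

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def is_real(a):
    return all(complex(x).imag == 0 for row in a for x in row)

GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))


if __name__ == "__main__":
    print(all(is_real(g) for g in GAMMA))
    print(GAMMA5 == kron(scale(-1j, T2), T3))
```

Since $(i\gamma_5)^2 = 1$ in this representation, $\frac{1}{2}(1 \mp i\gamma_5)$ are complementary projectors, which is what the exercise relies on.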
Hints for Exercise 7.3
1. We recall that the Poincaré group is the group of isometries of Minkowski space. When spacetime is not perfectly Minkowskian but de-Sitter or anti-de-Sitter instead, its curvature does not vanish but is constant. The isometry group, however, becomes a simple group: O(4,1) for de-Sitter and O(3,2) for anti-de-Sitter space. In the limit when the constant curvature of either of these two spaces goes to zero (i.e., the curvature radius becomes infinite), they tend to be Minkowskian and their isometry groups contract to the Poincaré group. This Inönü-Wigner contraction can be fully appreciated at the level of their Lie algebras: from (b) it follows that as $\lambda \to 0$, [P, P] → 0, which is precisely the requirement of a Poincaré algebra, namely that the momenta (i.e., the P's) commute.
2. To show that $\chi^A \chi_A = -\chi_A \chi^A$ (note that $\chi^A$ and $\chi_A$ behave like anticommuting elements of the familiar Fermi sector) we use (A.4) to write:
= XB^BA^CXC-
«
To use the summation rule given in (A.3) we change the order of indices in eAC to £CA, since it is an antisymmetric tensor we replace eAC by - eCA. This gives: XAXA
= XB(-£BA£CAXC) B
= X (-8£)XC = ~XAXA We thus have %A %A - %A XA = 2 X2Multiply both sides by em
or
=
-X XB
= X2-
ite XA = £AC Xc t o obtain:
XA£ACXc = X2(iii) and change the order of indices in eAC and e^ on LHS, this gives: XA(£CA
or
Wr
(«) B
£BA) XC = £AB X2
XA (~5B) XC
(iv)
= £AB X2
2XAXB = £ABX2=>XAXB
=
-^£ABX2-
The Lorentz invariance of the entities obtained in (ii), (iii) and (iv) is obvious. The other part of the exercise can be established either by using the relations (A.4)(b) or by observing the fact that
each of these can be viewed as coming from a spinor with components %A or XA through the conjugation process and using (7.3.21) together with the relations: (XA) = XA and {%) =- X 3. To verify (a) of (A. 10) we choose m = n = 0, then from (A.12)(a) we have:
(*<>)£ = y (1-1-(-1)(-1)£
(i)
=0 which shows that the second term on the RHS of (A.10)(a) makes no contribution and thus equals the LHS which is 8AC. Next we choose m = 1 and n - 3, the first term on the RHS of (A.10)(a) is zero and for the second term, (A.12)(a) gives:
< >\* irr° ~i\ (° 'iT c -'\A WC-TRI oJi-i oJL-li ojc
(11)
This is the same as the value of the LHS. Similarly choosing the other combinations of m and n, the remaining expressions can also be verified. To verify (A.10)(c) we use (A.12)(c), it can be easily checked that the first term on the RHS of (A.10)(c) makes no contribution when n * m and the second term makes no contribution when n = m. For instance choose n = m = 0, then in view of (A.5) and (A.8) we have: (T 0 0 )^ = i ( ( l ) ( - l ) - (-1)(1))S = 0
(iii)
Therefore the RHS reduces to
H 00 ^ =(-!)( J °)
(iv)
The RHS, on the other hand, is (v)
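The verification for $n \neq m$ is left as an exercise.
4. We have:
$P_m \tau^m = P_0 \tau^0 + P_1 \tau^1 + P_2 \tau^2 + P_3 \tau^3$
$= P_0 \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} + P_1 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + P_2 \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + P_3 \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
$= \begin{pmatrix} -P_0 + P_3 & P_1 - iP_2 \\ P_1 + iP_2 & -P_0 - P_3 \end{pmatrix}$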
5. We use (7.3.6) to write:
7777
U 1 o JW o J U 3 o Jl-ii o J
_/T'T2
0 V-T3
"tfl
TVJI 0 0
[0
-l)
r3)
0 ^ rVVj
/-TW
\
0^
6. We denote the Gamma matrices in three bases collectively as Tw, Tc and TM. Using the elementary rules of finding the similarity transformations, we obtain: w=
where Similarly
c
A= ^
J
Tw = BTM B~l 1
ll
1 f B = —==-
where
4l U
\ -ie)
The relation between Tc and TM is immediate. The determination of £ in B is left as an exercise.19 The second part of the exercise is too obvious for any proof. 7. Use the results of Exercise (7.2.1) to verify that y5 = - i x a ® T3. Note that the components of a Majorana spinor are all real. We put
M2
[MJ premultiplying it by —(1 - i y5) we have after simplification: fWA ( W2 _ i W3
yjA) 19
~ ~2
M{ + iM3 M2-i M4
\
/(Mj+iAfj)
{-i{M2-iMA),
' See the Ftn.47 in the Appendix for a hint.
Evidently only two of the components $W_i$ are independent, say $W_1$ and $W_2$. Thus beginning with a 4-component spinor M, we have obtained a two-component spinor
$W = \frac{1}{2} \begin{pmatrix} M_1 + iM_3 \\ M_2 - iM_4 \end{pmatrix}$.
We denote the components of W as $W_A$ where A = 1 or 2. Using the complementary projection operator $\frac{1}{2}(1 + i\gamma_5)$, it is easy to check that the resulting 2-component spinor is the conjugate of $W_A$, thus $(W_A)^* = \bar W_{\dot A}$.
4 SUPERSYMMETRY ALGEBRAS AND INTRODUCTION TO SUPERSPACES
From Chapter 6 we are already familiar with the role of Lie groups in defining the symmetries of a system. In fact we saw that associated to all the local symmetries that we studied, there was an underlying Lie group. At that point we also mentioned in passing that there were many symmetries occurring in quantum field theory which could not be assigned a Lie group on those lines. Using layman's language we shall call these symmetries supersymmetries, and the groups used to describe them supergroups. However, when symmetry began to be replaced by supersymmetry, objects such as vector spaces, vector and tensor fields, and differential forms had to be replaced by their analogues required for the description of supersymmetry. It then became essential that a formal definition be given of the manifolds on which these super objects live. It is therefore not surprising that supermanifolds followed superalgebras and supergroups, but once they were defined, the super Lie algebras and super Lie groups could be conceived from them following the familiar rules for obtaining Lie algebras and Lie groups from ordinary differentiable manifolds (see Chapter 3 in [4]). In order to define supermanifolds, we need to introduce a few objects required for their definition; these are supernumbers and their splitting, superanalytic functions, real (complex) supernumbers, and supervector spaces.
4.1 Supernumbers
We begin with a set of generators $\zeta^a$, a = 1, ..., N, which form a Grassmann algebra (i.e., $\zeta^a \zeta^b + \zeta^b \zeta^a = 0$ for all a, b) denoted $\Lambda_N$. When $N \to \infty$, this is denoted $\Lambda_\infty$. The elements 1, $\zeta^a$, $\zeta^a \zeta^b$, ..., where $a \neq b$, form a $2^N$ (infinite)-dimensional basis of $\Lambda_N$ ($\Lambda_\infty$). Moreover $\Lambda_N$ ($\Lambda_\infty$) forms a $2^N$ ($\infty$)-dimensional linear vector space under addition as well as multiplication by a complex number. We note that as algebras over the complex numbers, both of these are associative but are not commutative except in the trivial cases N = 0, 1. The elements of $\Lambda_\infty$ are called supernumbers. An arbitrary element $z \in \Lambda_\infty$ can be written as:
$z = \lambda + \sum_{r=1}^{\infty} \frac{1}{r!} c_{a_1 a_2 \ldots a_r} \zeta^{a_r} \cdots \zeta^{a_1}$ (7.4.1)
where $\lambda$ as well as all the c's, which are antisymmetric in their indices, are complex numbers. The first term is called the body of the supernumber and the second the soul; we shall denote them as $z_B$ and $z_S$, thus:
$z = z_B + z_S$ (7.4.2)
If $\Lambda_\infty$ is replaced by $\Lambda_N$ with finite N, then apparently $z_S$ is nilpotent, i.e.,
$z_S^{N+1} = 0$ (7.4.3)
A supernumber has an inverse if and only if its body is nonvanishing. The inverse, which is unique, is given by the formula
$z^{-1} = z_B^{-1} \sum_{n=0}^{\infty} (-z_B^{-1} z_S)^n$ (7.4.4)
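The body/soul split, the nilpotency of the soul and the inverse formula can be tried out on a small Grassmann algebra. The following is a minimal sketch, not a reference implementation: a supernumber is stored as a dictionary mapping a sorted tuple of generator indices to a complex coefficient (the empty tuple is the body), and all function names are hypothetical. The inverse uses the geometric series with the alternating sign, which terminates after N terms for $\Lambda_N$.

```python
def mul(x, y):
    """Product of two supernumbers stored as {sorted index tuple: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            if set(a) & set(b):
                continue                      # a repeated generator gives zero
            # sign from moving the generators of b past those of a into sorted order
            inversions = sum(1 for i in a for j in b if i > j)
            key = tuple(sorted(a + b))
            out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def add(x, y):
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def scale(c, x):
    return {k: c * v for k, v in x.items() if abs(c * v) > 1e-12}

def body(z):
    return z.get((), 0)

def soul(z):
    return {k: v for k, v in z.items() if k != ()}

def inverse(z, n_generators):
    """z^{-1} = z_B^{-1} * sum_n (-z_B^{-1} z_S)^n, truncated using z_S^{N+1} = 0."""
    zb = body(z)
    if zb == 0:
        raise ZeroDivisionError("a supernumber with vanishing body has no inverse")
    t = scale(-1 / zb, soul(z))
    term, total = {(): 1}, {(): 1}
    for _ in range(n_generators):
        term = mul(term, t)
        total = add(total, term)
    return scale(1 / zb, total)


if __name__ == "__main__":
    z = {(): 2, (1,): 1, (1, 2): 3}           # 2 + zeta^1 + 3 zeta^1 zeta^2 in Lambda_2
    print(mul(z, inverse(z, 2)))
```

The dictionary representation is chosen only for brevity; it makes the sign rule for reordering generators explicit in `mul`.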
To extend any analytic function f on the complex numbers to a supernumber-valued function on $\Lambda_\infty$, we use series such as:
$f(z) = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(z_B)\, z_S^n$ (7.4.5)
where $f^{(n)}(z_B)$ denotes the n-th derivative of f at the point $z_B$ in the complex plane, and this definition is valid for all $z_B$ as long as they are not singular points of f. We note that the Taylor series (7.4.5) terminates when N is finite. When N is infinite both (7.4.4) and (7.4.5) stand for formal infinite series. The coefficient of each term in these series is finite and unique. Similar to supernumbers, a matrix M whose elements are supernumbers has a body and a soul. The body is the ordinary matrix obtained by replacing each element with its body, and the soul is the remainder. A square matrix has an inverse and is said to be nonsingular if and only if its body is nonsingular. The inverse is unique.
Definition 7.4.1:
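Let the supernumber z defined in (7.4.1)-(7.4.2) be written as $z = u + v$, where
$u = z_B + \sum_{n=1}^{\infty} \frac{1}{(2n)!} c_{a_1 \ldots a_{2n}} \zeta^{a_{2n}} \cdots \zeta^{a_1}$, $v = \sum_{n=0}^{\infty} \frac{1}{(2n+1)!} c_{a_1 \ldots a_{2n+1}} \zeta^{a_{2n+1}} \cdots \zeta^{a_1}$ (7.4.6)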
Apparently u is the even part of z and v is the odd part. The supernumbers that are purely even or purely odd are referred to respectively as c-numbers and a-numbers.
Remark 7.4.2: We list below some of the properties of c- and a-numbers:
(i) The c-numbers commute with every supernumber, whether it is a c or an a (or a sum of c and a), whereas a-numbers anticommute among themselves.
(ii) The product of two c-numbers or two a-numbers is a c-number. The product of a c-number and an a-number is an a-number. The square of every a-number is zero.
(iii) Since a-numbers possess no body they are not invertible. The set of all a-numbers is denoted $C_a$; this set is apparently not a subalgebra of $\Lambda_\infty$.
(iv) The set of all c-numbers forms a commutative subalgebra of $\Lambda_\infty$, and is denoted $C_c$.
(v) When $\Lambda_\infty$ is replaced by $\Lambda_N$, both $C_c$ and $C_a$ are $2^{N-1}$-dimensional vector spaces.
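Property (v) can be seen by listing the basis monomials of $\Lambda_N$ by degree. The short sketch below is an illustration with a hypothetical function name; it enumerates the monomials $\zeta^{a_1} \cdots \zeta^{a_k}$ with $a_1 < \cdots < a_k$ and sorts them into even degree (spanning the c-numbers) and odd degree (spanning the a-numbers).

```python
from itertools import combinations

def basis_by_parity(n_generators):
    """Split the 2^N basis monomials of Lambda_N into even and odd degree."""
    even, odd = [], []
    for degree in range(n_generators + 1):
        for monomial in combinations(range(1, n_generators + 1), degree):
            (even if degree % 2 == 0 else odd).append(monomial)
    return even, odd


if __name__ == "__main__":
    even, odd = basis_by_parity(3)
    print(even)   # monomials spanning the c-numbers of Lambda_3
    print(odd)    # monomials spanning the a-numbers of Lambda_3
```

For every N of at least 1 the two lists have the same length, $2^{N-1}$, in agreement with the remark.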
4.2 Superanalytic Functions
Remark 7.4.3: Similar to ordinary analysis, an analytic theory of functions of c-numbers and a-numbers can be formulated by considering mappings from $C_c$ or $C_a$ to $\Lambda_\infty$.
We illustrate it for $C_a$. For finite N, $C_a$ and $\Lambda_N$ are finite-dimensional vector spaces over the complex numbers, hence a differentiable mapping f from $C_a$ to $\Lambda_N$ can be defined. Thus if $v \in C_a$, f carries v to $\Lambda_N$, and then using the formal limit $N \to \infty$, the analyticity of f (now known as superanalyticity) can be defined as follows.
Definition 7.4.4: The differentiable mapping f is said to be superanalytic at $v \in C_a$ if, corresponding to an arbitrary a-number displacement dv of v, the image f(v) in $\Lambda_\infty$ suffers a displacement of the form:
$df(v) = dv \left[ \frac{\overrightarrow{d}}{dv} f(v) \right] = \left[ f(v) \frac{\overleftarrow{d}}{dv} \right] dv$ (7.4.7)
For a function f(u) of a c-number variable u the analogous relation reads
$\frac{\overrightarrow{d}}{du} f(u) = f(u) \frac{\overleftarrow{d}}{du}$ (7.4.10)
showing that in this case there is no need to distinguish between left and right derivatives. The class of superanalytic functions of c-number variables is much richer as compared to the class of superanalytic functions of a-number variables. We list three important properties of f defined on $C_c$ in the following remark.
Remark 7.4.5:
(i) Corresponding to every ordinary analytic function f on the complex numbers, there is a superanalytic function over $C_c$ analogous to (7.4.5):
$f(u) \overset{\text{def}}{=} \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(u_B)\, u_S^n$ (7.4.11)
where $u_B$ and $u_S$ are respectively the body and soul of u.
(ii) If f is superanalytic on and inside a closed curve in the vector space $C_c$, then the curve may be continuously deformed to a point without crossing any singularity and f satisfies:
$\oint f(u)\, du = 0.$ (7.4.12)
(iii) More generally, if f is superanalytic on the curve and superanalytic inside except at a finite number of poles, then
$\oint f(u)\, du = 2\pi i \times$ (sum of residues at poles) (7.4.13)
Thus if f has the general form
/(«)= i — 4 - .«>>£""••• £ai n=0
n
(7 4 u)
--
-
the residues may be arbitrary supernumbers.
4.3 Real and Imaginary Supernumbers
To reach our objective of defining a supermanifold, we have to introduce the notion of real and imaginary amongst the supernumbers. The laws of complex conjugation (denoted *) of sums and products of two supernumbers are
$(z + z')^* = z^* + z'^*$, $(z z')^* = z'^* z^*$, $z, z' \in \Lambda_\infty$. (7.4.15)
The complex conjugate of $z_B$ is taken to be its ordinary complex conjugate, and the generators of $\Lambda_\infty$ are assumed to be real, thus $\zeta^{a*} = \zeta^a$ for all a. This implies:
$(\zeta^{a_1} \cdots \zeta^{a_n})^* = \zeta^{a_n} \cdots \zeta^{a_1}$ (7.4.16)
From this together with the anticommutation rule $\zeta^a \zeta^b = -\zeta^b \zeta^a$, it follows that the basis element $\zeta^{a_1} \cdots \zeta^{a_n}$ is real when $\frac{1}{2} n(n-1)$ is even and is imaginary when $\frac{1}{2} n(n-1)$ is odd. As for ordinary numbers, a supernumber z is said to be real if $z^* = z$ and is said to be imaginary if $z^* = -z$. A general element of $\Lambda_\infty$ is real if and only if both its body and soul are real. The subsets of all real elements of $C_c$ and $C_a$ are denoted respectively as $R_c$ and $R_a$. The subset $R_c$ is a subalgebra of $C_c$. The product of two real c-numbers is a real c-number and the product of a real c-number and a real a-number is a real a-number, whereas the product of two real a-numbers is an imaginary c-number. The symbol x is generally used to denote a real variable whether it is over $R_a$ or $R_c$. The cartesian product $R_c \times R_c \cdots R_c$ of m factors $R_c$ is denoted $R_c^m$; similarly the cartesian product of n factors $R_a$ is denoted $R_a^n$.
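The reality rule for basis monomials follows from counting the transpositions needed to undo the reversal in (7.4.16). A small sketch of that count is given below; the function names are hypothetical, and the only assumptions are that the generators are real and mutually anticommuting.

```python
def reversal_sign(n):
    """Sign picked up when zeta^{a_n}...zeta^{a_1} is reordered to zeta^{a_1}...zeta^{a_n}."""
    word = list(range(n, 0, -1))              # the reversed product
    sign = 1
    swapped = True
    while swapped:                            # each adjacent swap contributes -1
        swapped = False
        for k in range(len(word) - 1):
            if word[k] > word[k + 1]:
                word[k], word[k + 1] = word[k + 1], word[k]
                sign = -sign
                swapped = True
    return sign

def monomial_is_real(n):
    """True if a product of n distinct real generators equals its own conjugate."""
    return reversal_sign(n) == 1


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "real" if monomial_is_real(n) else "imaginary")
```

The number of swaps is n(n-1)/2, so the sign agrees with the parity criterion stated in the text.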
4.4 Supervector Spaces
As can be expected, the definition of a supervector space is formed using rules similar to those of a vector space; however the difference appears once the supernumbers are introduced.
Definition 7.4.6: A supervector space is a set $\mathfrak{S}$ of elements called supervectors, which is equipped with mappings having the special properties listed below:
(i) There exists a binary-operation mapping $+ : \mathfrak{S} \times \mathfrak{S} \to \mathfrak{S}$ called the addition such that
$+(X, Y) = X + Y$ for all X, Y in $\mathfrak{S}$ (7.4.17)
The mapping + is commutative and associative.
(ii) There exists an element $0 \in \mathfrak{S}$ such that X + 0 = X for all X in $\mathfrak{S}$. The element 0 is called the zero supervector.
(iii) For every supervector X in $\mathfrak{S}$, there exists another supervector -X in $\mathfrak{S}$ such that -X + X = 0. It is easy to verify that 0 is unique. Given a supervector X, the supervector -X, known as the negative of X, is unique.
(iv) For every supernumber $\alpha$ there exist two mappings, $\alpha_L : \mathfrak{S} \to \mathfrak{S}$ called the left multiplication and $\alpha_R : \mathfrak{S} \to \mathfrak{S}$ the right multiplication, such that:
$\alpha_L X = \alpha X$, $\alpha_R X = X\alpha$ for all X in $\mathfrak{S}$ (7.4.18)
These mappings satisfy the linear laws:
(a) $(\alpha + \beta) X = \alpha X + \beta X$; $X(\alpha + \beta) = X\alpha + X\beta$
(b) $\alpha(X + Y) = \alpha X + \alpha Y$; $(X + Y)\alpha = X\alpha + Y\alpha$
(c) $(\alpha\beta) X = \alpha(\beta X) = \alpha\beta X$; $X(\alpha\beta) = (X\alpha)\beta = X\alpha\beta$
(d) $1X = X$; $X1 = X$
(7.4.19)
for all $\alpha, \beta \in \Lambda_\infty$ and $X, Y \in \mathfrak{S}$.
(v) Left and right multiplication are related by the following rule:
$(\alpha X)\beta = \alpha(X\beta) \overset{\text{def}}{=} \alpha X \beta$ (7.4.20)
for all $\alpha, \beta$ in $\Lambda_\infty$ and all X in $\mathfrak{S}$. If $\alpha$ is a c-number, then it commutes with every X in $\mathfrak{S}$. Thus for all $\alpha$ in $C_c$ and all X in $\mathfrak{S}$ we have:
$\alpha X = X\alpha$ (7.4.21)
For every X in $\mathfrak{S}$ there exist unique supervectors U and V in $\mathfrak{S}$ such that
(a) $X = U + V$
(b) $\alpha U = U\alpha$, $\alpha V = -V\alpha$ for all $\alpha$ in $C_a$ (7.4.22)
The supervectors U and V are called the even and odd parts of X. If the odd (even) part of a supervector vanishes, the supervector is said to be of type c (type a). The zero supervector is the only supervector that is simultaneously c-type and a-type. A supervector as well as a supernumber that has a definite type is called pure. For pure supernumbers and supervectors, Eqs. (7.4.21)-(7.4.22) can be summarized in the formula:
$\alpha X = (-1)^{\alpha X} X\alpha$ (7.4.23)
where it is assumed that each symbol in the exponent of (-1) takes the value 0 or 1 according as the corresponding quantity is c-type or a-type.
(vi) There exists a mapping $* : \mathfrak{S} \to \mathfrak{S}$ called the complex conjugation and conventionally written as:
$*(X) \overset{\text{def}}{=} X^*$ for all X in $\mathfrak{S}$ (7.4.24)
The mapping * satisfies the usual properties of conjugation such as $X^{**} = X$; $(X + Y)^* = X^* + Y^*$, in addition to the property:
$(\alpha X)^* = X^* \alpha^*$, $(X\alpha)^* = \alpha^* X^*$
(7.4.25)
for all $X, Y$ in $\mathcal{S}$ and all $\alpha$ in $\Lambda_\infty$. A linearly independent set $\{{}_i e\}$ is called a complete linearly independent set or a basis if every supervector $X$ in $\mathcal{S}$ can be expressed in the form:
$X = X^i\,{}_i e$ for some $X^i$ in $\Lambda_\infty$ (7.4.26)
If a supervector space has a basis consisting of $d$ supervectors, it is said to have total dimension $d$. The total dimension, which may be finite or infinite, is an invariant of the supervector space. We shall generally be concerned with the supervector space $R_c^m \times R_a^n$ formed by the cartesian product of $m$ $R_c$'s and $n$ $R_a$'s. When $N$ is finite, the total dimension $d$ of $R_c^m \times R_a^n$ is $2^{N-1}(m + n)$, while the pair $(m, n)$ is called its dimension. A general point of this space is denoted $x$, with coordinates $x^i$ where $i$ ranges over the set $(-n, \dots, -1, 1, \dots, m)$ or sometimes over the set $(-n, \dots, -1, 0, 1, \dots, m-1)$, with the negative values distinguishing the a-number coordinates from the c-number coordinates. However, when this formalism is used in physics, instead of distinguishing the c-numbers and a-numbers by positive and negative indices, we use the coordinates $x^\mu, x^\nu, x^\sigma$ to denote the c-type and $\theta^\alpha, \theta^\beta, \theta^\gamma$ (or even $\theta^i, \theta^j, \theta^k$) to denote the a-type.
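The count $2^{N-1}(m+n)$ and the sign rule (7.4.23) can be illustrated with a toy model of the Grassmann algebra on $N$ generators. The sketch below is only an illustration under stated assumptions: supernumbers are stored as dictionaries from sorted index tuples to real coefficients, and the names `monomials`, `mul` and `total_dimension` are hypothetical helpers introduced here, not notation from the text.

```python
from itertools import combinations

def monomials(N):
    """Sorted index tuples labelling the 2**N basis monomials on N generators."""
    return [c for k in range(N + 1) for c in combinations(range(N), k)]

def mul_monomials(a, b):
    """Product of two basis monomials: (sign, monomial); sign 0 if a generator repeats."""
    if set(a) & set(b):
        return 0, ()
    seq, sign = list(a + b), 1
    # Bubble sort: every swap of two anticommuting generators costs a factor -1.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def mul(x, y):
    """Product of two supernumbers stored as {monomial: coefficient}."""
    out = {}
    for ma, ca in x.items():
        for mb, cb in y.items():
            sign, m = mul_monomials(ma, mb)
            if sign:
                out[m] = out.get(m, 0) + sign * ca * cb
    return {m: c for m, c in out.items() if c != 0}

def total_dimension(N, m, n):
    """Real dimension of m even copies and n odd copies of the algebra."""
    basis = monomials(N)
    even = sum(1 for mono in basis if len(mono) % 2 == 0)
    odd = len(basis) - even
    return even * m + odd * n

# Illustrative pure supernumbers for N = 3 (coefficients chosen arbitrarily).
c_number = {(): 2.0, (0, 1): 3.0}          # even: c-type
a_number_1 = {(0,): 1.0, (0, 1, 2): 2.0}   # odd: a-type
a_number_2 = {(1,): 1.0, (2,): -1.0}       # odd: a-type
```

In this model a c-type element commutes with everything, while two a-type elements anticommute, which is the content of (7.4.23) for pure supernumbers; `total_dimension(N, m, n)` reproduces $2^{N-1}(m+n)$ for $N \geq 1$.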
4.5 Supermanifolds; Charts and Atlases
Before we formally define a supermanifold, we would like to mention that the objects called supermanifolds are related to $R_c^m \times R_a^n$ in the same manner as ordinary manifolds are to $R^m$; that is, small regions of a supermanifold look like small regions of $R_c^m \times R_a^n$. They are said to have the same local topology. The topology we shall use here is chosen to reflect the algebraic structure of $R_c^m \times R_a^n$ irrespective of whether $N$ is finite or infinite. This requires that the topology in question have the property prescribed as follows.
Definition 7.4.7: Let $\pi : R_c^m \times R_a^n \to R^m$ 21 be the mapping that replaces each coordinate $x^i$ of the point $x$ by its body. A subset of $R_c^m \times R_a^n$ is said to be open here if and only if it has the form $\pi^{-1}(Q)$ where $Q$ is some open subset of $R^m$.
We note that $R_c^m \times R_a^n$ is not Hausdorff with this topology; however, if $x$ and $x'$ are two points of $R_c^m \times R_a^n$ lying in two distinct soul subspaces (i.e., if $\pi(x) \neq \pi(x')$), then they may be surrounded by nonintersecting neighbourhoods. Such a space is called projectively Hausdorff, and the topology here is called a coarse topology.
Definition 7.4.8: Let $\phi$ be a mapping from an open subset $\mathcal{U}$ (as defined above in the sense of the coarse topology) of $R_c^m \times R_a^n$ to an open subset $\bar{\mathcal{U}}$ of $R_c^m \times R_a^n$. The mapping is said to be differentiable if the coordinates $\bar{x}^j$ ($j = 1, \dots, m, -1, \dots, -n$) of the image point
21 Note that $R^m$ here defines the set of points whose coordinates have vanishing souls; $\pi$ may be called the natural projection of $R_c^m \times R_a^n$ onto $R^m$.
in $R_c^m \times R_a^n$ (in the coarse sense). The collection of ordered pairs is required to have the following two properties:
(i) $\bigcup_A \mathcal{U}_A = M$;
(ii) $\phi_A \circ$
defined by $\phi_A$. The pair $(\mathcal{U}_A, \phi_A)$ is called a chart, or a local coordinate patch, or a coordinate system. Property (ii) implies that every pair of overlapping coordinate systems is related by a differentiable transformation. A collection of charts $(\mathcal{U}_A, \phi_A)$ satisfying the two properties is called an atlas. Just as in the case of ordinary differentiable manifolds, we call the union of all atlases compatible with each other (two atlases are said to be compatible if their union is again an atlas) the complete atlas of the manifold. In fact it is the set of all possible coordinate systems on $M$. With the definition of supermanifold in place, objects such as supercurves, sub-supermanifolds, scalar fields, etc., can be defined. We shall defer these definitions to the sections where they will be required (see [4] for details).
4.6 Supersymmetry Generators and Construction of Superalgebras from First Principles
From our studies of Lie groups and Lie algebras, we know that every Lie group defines a unique Lie algebra; conversely, given a Lie algebra with its generators, there exists a Lie group whose structure can be determined by these generators. We shall use this approach to define the supergroup resulting from an underlying algebra. Thus we first select the generators, the brackets required to mix them and the rules that these mixings will satisfy for compatibility. In Sec. 3, we already shed some light on these aspects, showing how these selections are limited to particular groups. For instance, since most symmetries in nature are given by the Poincaré group, one set of generators (which are ten in number) consists of $(P_\mu)$ and $(M_{\mu\nu})$ (see Exc. (4.1.7)); the other set, denoted $(T_a)$, arises from the internal symmetry group $G$. The generators of these sets are labeled as even elements of a $Z_2$-graded algebra. The third set, containing the odd elements $Q = (Q^i_\alpha)$, is not based on any known group to begin with, but as we mentioned in Sec. 3, it comes from the Fermi sector. The index $i$ in $Q^i_\alpha$ stands for the supersymmetry number $1, 2, 3, \dots, N$ and $\alpha$ represents the 4-component spinorial index.22 The generators $Q^i_\alpha$ satisfy the anti-commutation relation:
$\{Q^i_\alpha, Q^j_\beta\} = Q^i_\alpha Q^j_\beta + Q^j_\beta Q^i_\alpha$ = linear combination of some other generators that must be even (7.4.27)
and are called the supercharges. The first two sets, on the other hand, satisfy the familiar relations:23
(a) $[P_\mu, P_\nu] = 0$
(b) $[P_\lambda, M_{\mu\nu}] = \eta_{\lambda\mu} P_\nu - \eta_{\lambda\nu} P_\mu$
(c) $[M_{\lambda\mu}, M_{\nu\delta}] = -(\eta_{\lambda\nu} M_{\mu\delta} + \eta_{\mu\delta} M_{\lambda\nu} - \eta_{\lambda\delta} M_{\mu\nu} - \eta_{\mu\nu} M_{\lambda\delta})$ (7.4.28)
$[T_a, T_b] = C_{ab}{}^{c}\, T_c$ (7.4.29)
($C_{ab}{}^{c}$ being the structure constants of the group) and
$[P_\mu, T_a] = [M_{\mu\nu}, T_a] = 0$ (7.4.30)
22 When it is a 2-component spinor the indices used are $A$ and $\dot{A}$ in place of $\alpha$, $\beta$, etc.
23 $\eta_{\mu\nu}$ stands for the Minkowski space-time metric $(-1, 1, 1, 1)$.
The last of these relations, as we have seen in Sec. 3, is a consequence of the Coleman-Mandula (no-go) theorem [3], which states that a symmetry group that incorporates both these symmetries is a direct product of the corresponding groups and as such permits no non-trivial mixing amongst generators. In order to find the mixing of these three sets, we first note that, like the generators of ordinary Lie algebras, they satisfy the Jacobi identity (see (7.1.8)) based on the rules given in (7.3.3):
(i) $[[E_1, E_2], E_3] + [[E_3, E_1], E_2] + [[E_2, E_3], E_1] = 0$
(ii) $[[E_1, E_2], O_3] + [[O_3, E_1], E_2] + [[E_2, O_3], E_1] = 0$
(iii) $\{[E_1, O_2], O_3\} + [\{O_2, O_3\}, E_1] + \{[E_1, O_3], O_2\} = 0$
(iv) $[\{O_1, O_2\}, O_3] + [\{O_3, O_1\}, O_2] + [\{O_2, O_3\}, O_1] = 0$ (7.4.31)
where $E_i$ and $O_i$ ($i = 1, 2, 3$) stand for even and odd elements of the algebra. Since [even, odd] = odd, we further note that the Lie bracket $[Q^i_\alpha, M_{\mu\nu}]$ can be expressed only in terms of $Q^i_\alpha$, as these alone are the odd generators. Thus the $\alpha$ index of the element $Q^i_\alpha$ can be viewed as rotated by $M_{\mu\nu}$, giving
$[Q^i_\alpha, M_{\mu\nu}] = (K_{\mu\nu})_\alpha{}^\beta\, Q^i_\beta$ (7.4.32)
where the entities $(K_{\mu\nu})_\alpha{}^\beta$ satisfy:24
$[K_{\lambda\mu}, K_{\nu\delta}]_\alpha{}^\beta = -\eta_{\lambda\nu}(K_{\mu\delta})_\alpha{}^\beta - \eta_{\mu\delta}(K_{\lambda\nu})_\alpha{}^\beta + \eta_{\lambda\delta}(K_{\mu\nu})_\alpha{}^\beta + \eta_{\mu\nu}(K_{\lambda\delta})_\alpha{}^\beta$ (7.4.33)
Equality (7.4.33) implies that the $(K_{\mu\nu})_\alpha{}^\beta$ form a representation of the Lorentz algebra; this in turn means that the $Q^i_\alpha$ carry a representation of the Lorentz group. Thus if we choose $Q^i_\alpha$ to be in the $(0, \tfrac{1}{2}) \oplus (\tfrac{1}{2}, 0)$ representation of the Lorentz group we have:
$[Q^i_\alpha, M_{\mu\nu}] = \tfrac{1}{2}(\sigma_{\mu\nu})_\alpha{}^\beta\, Q^i_\beta$ (7.4.34)
Similarly, using (7.4.31)(ii) and noticing that
$[Q^i_\alpha, E_1]$ = linear sum of $Q^j_\beta$ (7.4.35)
and that $\delta_\alpha{}^\beta$ and $(\gamma_5)_\alpha{}^\beta$ are the only invariant tensors which are scalar and pseudoscalar, we obtain the relation:
$[Q^i_\alpha, T_a] = (l_a)^i{}_j\, Q^j_\alpha + (m_a)^i{}_j\, (i\gamma_5)_\alpha{}^\beta\, Q^j_\beta$ (7.4.36)
where $(l_a)^i{}_j + i\gamma_5 (m_a)^i{}_j$ represent the Lie algebra of the internal symmetry group (the indices $i, j$ refer to
24 The equality can be checked using (7.4.27), (7.4.28)(c) and (7.4.31)(ii).
supersymmetry and $a$ refers to the internal symmetry group). The mixing of $Q^i_\alpha$ with $P_\mu$ (using the Jacobi identity and (7.4.35)) simplifies to:
$[Q^i_\alpha, P_\mu] = 0$ (7.4.37)
Thus we are finally left with the bracket $\{Q^i_\alpha, Q^j_\beta\}$. In view of (7.4.27) the most general expression that we can write is:
$\{Q^i_\alpha, Q^j_\beta\} = r(\gamma^\mu C)_{\alpha\beta} P_\mu \delta^{ij} + s(\sigma^{\lambda\mu} C)_{\alpha\beta} M_{\lambda\mu} \delta^{ij} + C_{\alpha\beta} U^{ij} + (\gamma_5 C)_{\alpha\beta} V^{ij}$ (7.4.38)
where $r$ and $s$ are natural numbers which vary with the choice of the supersymmetry number $N$, and $C_{\alpha\beta} = -C_{\beta\alpha}$ is the charge conjugation matrix (see Sec. 3). The even generators $U^{ij}$, $V^{ij}$ ($U^{ij} = -U^{ji}$ and $V^{ij} = -V^{ji}$) are called the central charges (and are appropriately denoted as $Z$). Obviously they satisfy the property:
$[U^{ij}, \text{any generator}] = 0 = [V^{ij}, \text{any generator}]$ (7.4.39)
From (7.4.39) it is evident that they belong to the centre of this algebra; being antisymmetric in $i, j$, they are non-zero only for $N \geq 2$. Now the identity $[P_\mu, \{Q^i_\alpha, Q^j_\beta\}] + \cdots = 0$ (see (7.4.31)(iii)) implies that $s = 0$; also the generator $P$ can be rescaled to set $r = 2$; accordingly (7.4.38) becomes:
$\{Q^i_\alpha, Q^j_\beta\} = 2(\gamma^\mu C)_{\alpha\beta}\, \delta^{ij} P_\mu + C_{\alpha\beta} U^{ij} + (\gamma_5 C)_{\alpha\beta} V^{ij}$ (7.4.40)
The equations (7.4.28), (7.4.30), (7.4.34), (7.4.36), (7.4.37) and (7.4.40) represent the supersymmetry algebra25 we were looking for. We want to emphasize here that we have obtained this algebra under the assumption that the $Q^i_\alpha$ are spinors under the Lorentz group (more precisely, they are in the spin-1/2 representation of the Lorentz group). If we choose $N = 1$, the supercharge $Q^i_\alpha$ is simply $Q_\alpha$; the supersymmetry algebra then consists of relations (7.4.28), which come from the generators of the Poincaré group, and the ones that involve $Q_\alpha$:
(i) $\{Q_\alpha, Q_\beta\} = 2(\gamma_\mu C)_{\alpha\beta} P^\mu$
(ii) $[Q_\alpha, P_\mu] = 0$
(iii) $[Q_\alpha, M_{\mu\nu}] = \tfrac{1}{2}(\sigma_{\mu\nu})_\alpha{}^\beta Q_\beta$
(iv) $[Q_\alpha, R] = i(\gamma_5)_\alpha{}^\beta Q_\beta$ (7.4.41)
The generator $R$ in (iv) represents the internal symmetry group algebra, showing that it is just a chiral rotation. Note that the central charges $U^{11}$ and $V^{11}$ are zero in this case since $N = 1$ (for derivations of these equations see Chap. 2 in [21]).26 Note that throughout these derivations we have regarded $Q_\alpha$ as an arbitrary spinor. If we were to choose this as a Majorana spinor, i.e. as $Q_\alpha = C_{\alpha\beta} \bar{Q}^\beta$, equations involving the supercharges $Q^i_\alpha$ could be simplified. Also observe that if we assume that the algebra under construction admits 'a complex conjugation as an involution', then it can be verified that $Q_\alpha$ satisfies the Majorana condition as a consequence of it (see [21]).
25 Note that this is also referred to as $N$-extended supersymmetry in the literature.
26 See Remark (7.5.2) on $N = 1$ supersymmetry in Sec. 7.5.
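The index structure of (7.4.38) can be motivated numerically: the left side is symmetric under the simultaneous exchange $(\alpha, i) \leftrightarrow (\beta, j)$, so matrices symmetric in $\alpha\beta$ must multiply something symmetric in $ij$ (here $\delta^{ij}$), and antisymmetric matrices must multiply antisymmetric $U^{ij}$, $V^{ij}$. The following sketch checks the relevant symmetry properties. It assumes a Dirac representation rescaled by $i$ to match the metric $(-1, 1, 1, 1)$, the choice $C = i\gamma^2_D\gamma^0_D$, and $\sigma^{\mu\nu} = \tfrac{1}{2}[\gamma^\mu, \gamma^\nu]$; these conventions are assumptions of the illustration and may differ from the ones used in the text by phases or signs.

```python
import numpy as np

# Pauli matrices and 2x2 blocks.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Dirac-representation matrices for signature (+,-,-,-) ...
dirac = [np.block([[I2, Z2], [Z2, -I2]])]
dirac += [np.block([[Z2, s], [-s, Z2]]) for s in (s1, s2, s3)]

# ... rescaled by i so that {gamma^mu, gamma^nu} = 2 eta^{mu nu}, eta = diag(-1,1,1,1).
gamma = [1j * g for g in dirac]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# Assumed charge conjugation matrix and chirality matrix.
C = 1j * dirac[2] @ dirac[0]
gamma5 = 1j * dirac[0] @ dirac[1] @ dirac[2] @ dirac[3]

# sigma^{mu nu} = (1/2)[gamma^mu, gamma^nu] for mu < nu (assumed normalization).
sigma = {(m, n): 0.5 * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m])
         for m in range(4) for n in range(m + 1, 4)}

def symmetry(M, tol=1e-12):
    """Return +1 for a symmetric matrix, -1 for an antisymmetric one, 0 otherwise."""
    if np.allclose(M.T, M, atol=tol):
        return 1
    if np.allclose(M.T, -M, atol=tol):
        return -1
    return 0

report = {
    "C": symmetry(C),
    "gamma5 C": symmetry(gamma5 @ C),
    "gamma^mu C": [symmetry(g @ C) for g in gamma],
    "sigma^{mu nu} C": [symmetry(s @ C) for s in sigma.values()],
}
print(report)
```

Under these assumptions $\gamma^\mu C$ and $\sigma^{\mu\nu} C$ come out symmetric while $C$ and $\gamma_5 C$ come out antisymmetric, which matches the pairing with $\delta^{ij}$ and with the antisymmetric central charges respectively.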
Having discussed the construction of the supersymmetry algebras in full generality, it is not hard to see that the super-Poincaré algebra is obtained by considering only the relations (7.4.28) and the first three relations of (7.4.41). We now list a few facts about the supersymmetry algebra and the generators that we have discussed above.
Fact 7.4.10: Supersymmetry is a symmetry which mixes particles of different spin (i.e., fermions with bosons). This is because $Q_\alpha$ is in the spin-1/2 representation of the Lorentz group; thus its action on a state of spin $j$ results in a state of spin $(j \pm \tfrac{1}{2})$.
Fact 7.4.11: The relation $[P_\mu, Q_\alpha] = 0$ implies that
$[P^2, Q_\alpha] = 0$ (7.4.42)
showing that $P^2$ is a Casimir operator of the supersymmetry algebra (see Chapters 4, 5 and 6 for more on Casimir operators). This means that particles in any irreducible representation of supersymmetry will have the same mass.27
Fact 7.4.12: The energy $P_0$ in supersymmetric theories is always positive.
Fact 7.4.13: In any representation of supersymmetry where $P_\mu$ is a one-one operator, there are equal numbers of fermion and boson degrees of freedom.
4.7 Supersymmetry Transformations on a Superspace
Having understood the basics of supermanifolds and supersymmetry algebras, we shall use these concepts to describe in brief supersymmetry transformations on a superspace by choosing $N = 1$. This choice, though made for ease of introduction, is of great importance: it is known that if one is to deal with renormalizable Yang-Mills interactions, with the further assumption that there are left-handed fermions in the underlying gauge group that commute with the supersymmetry generators, then one is left with the sole choice of $N = 1$. Moreover, it is in $N = 1$ supersymmetry that the linearized supergravity theory can be constructed.
Example 7.4.14: We use Def. (7.4.9) of a supermanifold $M$ whose dimension $(m, n)$ is $(4, 4)$. In view of this definition, the local regions of $M$ can be mapped to open subsets of $R_c^4 \times R_a^4$, whose total dimension (using the formula $2^{N-1}(m + n)$ for $N = 1$) is 8. Thus, as explained in Subsec. (4.5), $M$ carries the coordinate patches $(x^\mu, \theta^\alpha)$ which satisfy*:
$x^\mu x^\nu - x^\nu x^\mu = 0$
$x^\mu \theta^\alpha - \theta^\alpha x^\mu = 0$
$\theta^\alpha \theta^\beta + \theta^\beta \theta^\alpha = 0$ (7.4.43)
To describe the supersymmetry transformations, we first note that $\theta^\alpha$ are Majorana spinors†, i.e.,
$\bar{\theta}_\alpha = \theta^\beta C_{\beta\alpha}$ (7.4.44)
27 This fact about mass does not always hold. (See Chapter 4 in [21] for Facts (7.4.10)-(7.4.13).)
* Note that the equations in (7.4.43) are obvious in view of (1) in Remark (7.4.2).
† That $\theta^\alpha$ are Majorana follows from Subsections (2.3), (4.4) and (4.5).
Remark 7.4.15: We note that with the assumption (7.4.44) these coordinates can also be viewed as coming from the spinor bundle (see Def. (7.2.5)) formed over a 4-dimensional manifold (see Eq. (7.2.14)). Recall that we noted there that in the case of even $n$, the spinors in question were Majorana-Weyl spinors (see Subsec. 2.3). (Using the physicist's terminology we shall refer to this supermanifold $M$, parametrized by coordinates $(x^\mu, \theta^\alpha)$ that satisfy (7.4.43)-(7.4.44), as a superspace.) Basically, physicists consider a superspace as an extension of ordinary space-time to a space-time with spin degrees of freedom. Here we have chosen 4 degrees of freedom as this is the $N = 1$ supersymmetry case. In the group theoretic version (which we shall use in the next section), a superspace can also be viewed as the quotient space $G/L$, where $G$ is the group28 resulting from the 14-dimensional graded (super) Poincaré algebra and $L$ is the homogeneous Lorentz group. For now we simply note that supersymmetry transformations are realized as motions in a superspace. Using a triplet of infinitesimal supertranslation parameters $(a^\mu, \epsilon^\alpha, \bar{\epsilon}^{\dot\alpha})$ we thus write the translational variations for $(x^\mu, \theta^\alpha)$ as:
$\delta x^\mu = -i\,\epsilon\tau^\mu\bar{\theta} + i\,\theta\tau^\mu\bar{\epsilon} + a^\mu, \qquad \delta\theta^\alpha = \epsilon^\alpha, \qquad \delta\bar{\theta}^{\dot\alpha} = \bar{\epsilon}^{\dot\alpha}$ (7.4.45)
From the above equalities it can be checked that the composition of two supertranslations of parameters $(a_1^\mu, \epsilon_1^\alpha, \bar{\epsilon}_1^{\dot\alpha})$ and $(a_2^\mu, \epsilon_2^\alpha, \bar{\epsilon}_2^{\dot\alpha})$ gives:
$[\delta_2, \delta_1] z^M = (-2i\,\epsilon_1\tau^\mu\bar{\epsilon}_2 + 2i\,\epsilon_2\tau^\mu\bar{\epsilon}_1,\; 0,\; 0)$ (7.4.46)
where $z^M = (x^\mu, \theta^\alpha, \bar{\theta}^{\dot\alpha})$. Likewise the Lorentz transformations and translations are given by:
d'a=&x-cotlvi(y^ep
(7.4.47)
where $\omega_{\mu\nu}$ stand for the generators of the Lorentz group. In the next section we shall see how the notion of superspace is used, via supersymmetry tensor calculus, to define component fields and superfields and the interactions amongst them.
Exercise 7.4
1. Let $(B, B^\dagger)$ and $(F, F^\dagger)$ denote two pairs of creation and annihilation operators (with $B$ being bosonic and $F$ being fermionic) that satisfy the relations:
(i) $[B, B^\dagger] = \{F, F^\dagger\} = 1$.
Write the Hamiltonian:
(ii) $H = \omega_b B^\dagger B + \omega_f F^\dagger F \qquad (\omega_b, \omega_f \in \mathbb{C} \setminus 0)$
and define an operator
(iii) $Q = F^\dagger B + B^\dagger F$.
Show that $Q$ is fermionic, that it takes bosons to fermions and vice versa, and that it satisfies:
(iv) $[Q, H] = (\omega_b - \omega_f)(F^\dagger B - B^\dagger F)$.
Show further that if $\omega_b = \omega_f$ then $H$ is supersymmetric, and that in this case $Q$, $Q^\dagger$ and $H$ form an algebra which closes under anti-commutation.
2. Verify (7.4.46).
28 Note that $G$ is the group to which we referred in the introduction of this section.
Hints to Exercise 7.4
1. Recall that $B$ and $B^\dagger$ are even whereas $F$ and $F^\dagger$ are odd; hence products of $B$ and $F^\dagger$, or $B^\dagger$ and $F$, are odd, which shows that $Q$ is fermionic. To verify that it takes bosons to fermions and vice versa we have to first compute the brackets:
(a) $[Q, B^\dagger]$ and $\{Q, F^\dagger\}$.
Thus
(b) $[Q, B^\dagger] = (F^\dagger B + B^\dagger F) B^\dagger - B^\dagger (F^\dagger B + B^\dagger F)$.
We note that these pairs, in addition to relation (i), also satisfy $[B, F^\dagger] = [B^\dagger, F] = 0$. Hence we have:
$[Q, B^\dagger] = F^\dagger(1 + B^\dagger B) + B^\dagger B^\dagger F - F^\dagger B^\dagger B - B^\dagger B^\dagger F$, i.e. RHS $= F^\dagger$.
Similarly we can show that $\{Q, F^\dagger\} = B^\dagger$. Thus if $B^\dagger|0\rangle$ and $F^\dagger|0\rangle$ represent bosonic and fermionic states respectively, then $Q$ takes bosons to fermions and vice versa. This establishes the above statement. Now to compute $[Q, H]$, we first note that since $H$ is even, the bracket has to be a Lie bracket. Writing $H$ in full and using $[B, B^\dagger B] = B$, $[B^\dagger, B^\dagger B] = -B^\dagger$, $[F, F^\dagger F] = F$ and $[F^\dagger, F^\dagger F] = -F^\dagger$, we have:
$[Q, \omega_b B^\dagger B] + [Q, \omega_f F^\dagger F] = (\omega_b - \omega_f)(F^\dagger B - B^\dagger F)$.
Thus if $\omega_b = \omega_f$, the relation $[Q, H] = 0$ shows that $H$ is supersymmetric. Again writing $\omega_b = \omega_f = \omega$ and computing $\{Q, Q^\dagger\}$, we have:
(c) $\{Q, Q^\dagger\} = \frac{2}{\omega} H$.
From the result established in (c) it is clear that $Q$, $Q^\dagger$, $H$ form a closed algebra under anti-commutation. We further note a distinguishing feature of this symmetry: the charge bracket here involves the Hamiltonian $H$, unlike the charge commutations of the bosonic case, which confirms that $Q$, $Q^\dagger$, $H$ are generators of a symmetry algebra.
2. Using the triplets of supertranslation we have:
(i) $\delta_b x^\mu = -i\,\epsilon_b\tau^\mu\bar{\theta} + i\,\theta\tau^\mu\bar{\epsilon}_b + a_b^\mu$
(ii) $\delta_b\theta^\alpha = \epsilon_b^\alpha, \qquad \delta_b\bar{\theta}^{\dot\alpha} = \bar{\epsilon}_b^{\dot\alpha} \qquad (b = 1, 2)$.
Obviously $[\delta_2, \delta_1]\theta^\alpha$ and $[\delta_2, \delta_1]\bar{\theta}^{\dot\alpha}$ are both zero, since $\epsilon^\alpha$, $\bar{\epsilon}^{\dot\alpha}$ are constants and hence zero under $\delta$. In the case of $x^\mu$ we can write it as:
$(\delta_2\delta_1 - \delta_1\delta_2)\,x^\mu = \delta_2(-i\,\epsilon_1\tau^\mu\bar{\theta} + i\,\theta\tau^\mu\bar{\epsilon}_1 + a_1^\mu) - (1 \leftrightarrow 2)$
$= (-i\,\epsilon_1\tau^\mu\bar{\epsilon}_2 + i\,\epsilon_2\tau^\mu\bar{\epsilon}_1 + 0) - (1 \leftrightarrow 2) = -2i\,\epsilon_1\tau^\mu\bar{\epsilon}_2 + 2i\,\epsilon_2\tau^\mu\bar{\epsilon}_1$.
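The operator identities of Exercise 7.4.1 can also be checked numerically. The sketch below is an illustration only: it assumes a bosonic Fock space truncated to a finite number of levels (so $[B, B^\dagger] = 1$ fails on the top level and some identities are compared only below it), and the helper names `build_operators`, `hamiltonian`, `comm` and `anti` are hypothetical.

```python
import numpy as np

def build_operators(levels=6):
    """Annihilation operators on (truncated boson) x (two-level fermion) space."""
    b = np.diag(np.sqrt(np.arange(1, levels)), k=1)  # b|n> = sqrt(n)|n-1>
    f = np.array([[0.0, 1.0], [0.0, 0.0]])           # f|1> = |0>
    B = np.kron(b, np.eye(2))
    F = np.kron(np.eye(levels), f)
    return B, F

def comm(X, Y):
    return X @ Y - Y @ X

def anti(X, Y):
    return X @ Y + Y @ X

levels = 6
B, F = build_operators(levels)
Bd, Fd = B.T, F.T                 # real matrices, so dagger = transpose
Q = Fd @ B + Bd @ F               # the operator (iii); here Q equals its own adjoint

def hamiltonian(wb, wf):
    return wb * Bd @ B + wf * Fd @ F

# Fermion parity (-1)^{N_f}: a fermionic operator anticommutes with it.
parity = np.kron(np.eye(levels), np.diag([1.0, -1.0]))

# Indices of states with boson number below the truncation level.
keep = 2 * (levels - 1)

w = 1.7                            # arbitrary common frequency
H_susy = hamiltonian(w, w)
closure_ok = np.allclose(anti(Q, Q.T)[:keep, :keep], (2.0 / w) * H_susy[:keep, :keep])
print("Q is odd:", np.allclose(anti(Q, parity), 0.0))
print("[Q, H] = 0 at equal frequencies:", np.allclose(comm(Q, H_susy), 0.0))
print("{Q, Q^dagger} = (2/w) H below the cutoff:", closure_ok)
```

For unequal frequencies the same matrices give $[Q, H] = (\omega_b - \omega_f)(F^\dagger B - B^\dagger F)$, so the commutator vanishes exactly when $\omega_b = \omega_f$.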
5 THE CALCULUS ON SUPERSPACE, THE COMPONENT FIELDS AND SUPERFIELDS
To define any entities on a space, a knowledge of calculus is needed. We give below simple rules of calculus on the superspace and then follow on to the definitions of component fields and superfields.29 Due to our limited scope we shall not go into their detailed study; we would, however, like to mention that these fields provide the best means for the description of supersymmetry representations.
5.1 Infinitesimal Generators and Covariant Vector Fields
An element of the superspace is denoted by $z^M = (x^\mu, \theta^\alpha, \bar{\theta}^{\dot\alpha})$; thus $M$ stands for $\mu$, $\alpha$ and $\dot\alpha$, indicating a Lorentz 4-vector in the former case and a Lorentz spinor in the latter. A finite group element on the space can be defined as:
$G(x, \theta, \bar{\theta}) = e^{i z^M k_M}$ (7.5.1)(a)
where $k_M = (-P_\mu, Q_\alpha, \bar{Q}_{\dot\alpha})$ represents a triplet of group generators. Thus
$G(x, \theta, \bar{\theta}) = \exp i(-x^\mu P_\mu + \theta^\alpha Q_\alpha + \bar{\theta}_{\dot\alpha}\bar{Q}^{\dot\alpha}) = \exp i(-x^\mu P_\mu + \theta Q + \bar{\theta}\bar{Q})$ (7.5.1)(b)
In view of Hausdorff's formula $e^A e^B = e^{A + B + \frac{1}{2}[A, B] + \cdots}$, the product of two elements $G \equiv G(x, \theta, \bar{\theta})$ and $G' = G(y, \xi, \bar{\xi})$ can be written as:
$G(x + y - i\,\xi\tau\bar{\theta} + i\,\theta\tau\bar{\xi},\; \theta + \xi,\; \bar{\theta} + \bar{\xi})$ (7.5.2)
where we have used the following supersymmetry algebra rules:
$[P^\mu, \xi Q] = [P^\mu, \bar{\xi}\bar{Q}] = 0$ (7.5.3)
29 The study in this section is described using the notations of [19], [20], since Wess and Zumino were the first to discover the supersymmetry algebra and give it a mathematical formulation. The reader may find the equations derived in this section marginally different from those of Sec. 4 due to a different choice of $\gamma^\mu$, etc.
based on
{?,?) = {?', Qp) - = {*Vr}=0 (7.5.4) The multiplication of group elements induces a motion in the parameter space, hence if we choose y = 0 and write G (0, £ ^) = g(£ £) we have;
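$g(\xi, \bar{\xi}) : (x^\mu, \theta, \bar{\theta}) \to (x^\mu + i\,\theta\tau^\mu\bar{\xi} - i\,\xi\tau^\mu\bar{\theta},\; \theta + \xi,\; \bar{\theta} + \bar{\xi})$ (7.5.5)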
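The group law rests on the fact that the Hausdorff series stops after the first commutator whenever $[A, B]$ commutes with both $A$ and $B$; in the supersymmetric case the commutator of the exponents is proportional to $P_\mu$, which commutes with $Q$, $\bar{Q}$ and $P$. The sketch below illustrates this truncation with $3 \times 3$ Heisenberg-type matrices standing in for the exponents. The matrices and coefficients are purely illustrative assumptions and are not the supercharges themselves.

```python
import numpy as np

def exp_nilpotent(M):
    """Exact exponential of a strictly upper-triangular 3x3 matrix (M^3 = 0)."""
    return np.eye(3) + M + (M @ M) / 2.0

# Heisenberg-type generators: [X, Y] = Z, and Z commutes with X and Y.
X = np.zeros((3, 3)); X[0, 1] = 1.0
Y = np.zeros((3, 3)); Y[1, 2] = 1.0
Z = np.zeros((3, 3)); Z[0, 2] = 1.0

A = 0.7 * X + 0.2 * Z     # arbitrary illustrative coefficients
B = -1.3 * Y
AB = A @ B - B @ A        # central element, proportional to Z

lhs = exp_nilpotent(A) @ exp_nilpotent(B)
rhs = exp_nilpotent(A + B + 0.5 * AB)   # series truncated after the first commutator
naive = exp_nilpotent(A + B)            # what one would get by ignoring [A, B]

print("truncated formula is exact:", np.allclose(lhs, rhs))
print("dropping the commutator fails:", not np.allclose(lhs, naive))
```

The extra term $\tfrac{1}{2}[A, B]$ plays the role of the shift $i\,\theta\tau^\mu\bar{\xi} - i\,\xi\tau^\mu\bar{\theta}$ in the $x^\mu$ slot of the product.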
The infinitesimal generators of this motion (also known as differential operators) are:
">•-'•£?
^•#
+ '^4
(7-5-6Ka)
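which satisfy:
$\{Q_\alpha, Q_\beta\} = \{\bar{Q}_{\dot\alpha}, \bar{Q}_{\dot\beta}\} = 0$ (7.5.6)(b)
They represent the algebra resulting from the group formed by the elements $G$, $G'$, etc. These group elements (naturally) define curves in the superspace; the tangents to these curves give rise to covariant vector fields. It can be checked that these vector fields are carried into themselves by group transformations. The product of three group elements, in view of the associative law, gives:
$(G_2(x_2, \theta_2, \bar{\theta}_2)\, G_1(x_1, \theta_1, \bar{\theta}_1))\, G_3(x_3, \theta_3, \bar{\theta}_3) = G_2(x_2, \theta_2, \bar{\theta}_2)\, (G_1(x_1, \theta_1, \bar{\theta}_1)\, G_3(x_3, \theta_3, \bar{\theta}_3))$
which in turn suggests that the vector field can be obtained as the infinitesimal of the right multiplication of the group. The components of this (covariant) vector field are:
$D_\alpha = \frac{\partial}{\partial\theta^\alpha} + i\,\tau^\mu_{\alpha\dot\alpha}\bar{\theta}^{\dot\alpha}\frac{\partial}{\partial x^\mu}, \qquad \bar{D}_{\dot\alpha} = -\frac{\partial}{\partial\bar{\theta}^{\dot\alpha}} - i\,\theta^\alpha\tau^\mu_{\alpha\dot\alpha}\frac{\partial}{\partial x^\mu}$ (7.5.7)
By their very definition $D$ and $\bar{D}$ satisfy the anti-commutation relations:
$\{D_\alpha, \bar{D}_{\dot\alpha}\} = -2i\,\tau^\mu_{\alpha\dot\alpha}\,\partial_\mu, \qquad \{D_\alpha, D_\beta\} = \{\bar{D}_{\dot\alpha}, \bar{D}_{\dot\beta}\} = 0$ (7.5.8)
among themselves, and anti-commute with the operators $Q$ and $\bar{Q}$; thus we have:
$\{D_\alpha, Q_\beta\} = \{D_\alpha, \bar{Q}_{\dot\beta}\} = \{\bar{D}_{\dot\alpha}, Q_\beta\} = \{\bar{D}_{\dot\alpha}, \bar{Q}_{\dot\beta}\} = 0$ (7.5.9)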
The covariant vector fields given in (7.5.7), which are also known as the differential operators of the system, are collectively written as:
$D_N = e_N{}^M \frac{\partial}{\partial z^M}$ (7.5.10)
The entity $e_N{}^M$ can easily be seen to be the $(3 \times 3)$-matrix (see Exc. (7.6.5)):
s; e»=
*JO*
B
l-* *a*
o 0s S*
v
0
(7.5.11)
o 4)
The inverse of the above matrix determined by: e
NeA
=dN
'
e
A eM=°A
(7.5.12)
is given as follows: 5$ N
e
M=
0
-ir^P \
o'
8«p 0
idh I
op
0
(7.5.13)
b\
pJ
Note that the lower index $N$ in $e_N{}^M$ stands for the indices of the operator $D_N$, whereas the upper index $M$ stands for that of $\partial/\partial z^M$; in the case of $e_M{}^N$ they get reversed (see Exc. 2 for the verification of (7.5.12)). The entities $e_N{}^M$ and $e_M{}^N$ are called the supervielbeins. We shall return to these later in this chapter. We next define in brief the component multiplets and superfields.
5.2 Component Multiplets and Superfields
Definition 7.5.1: Consider anti-commuting parameters $\xi^\alpha$, $\bar{\xi}_{\dot\alpha}$, $\dots$ and fermionic elements $Q$ that satisfy (7.5.4) and the supersymmetry algebra (7.5.3), where30:
$\xi Q = \xi^\alpha Q_\alpha$ and $\bar{\xi}\bar{Q} = \bar{\xi}_{\dot\alpha}\bar{Q}^{\dot\alpha}$ (7.5.14)
30 Note that $Q$, $\bar{Q}$ are not necessarily differential operators here.
A component multiplet
$C = (A, \psi, \cdots)$ (7.5.15)
is a set of fields in the context of supersymmetry which can be transformed by operators formed by $\xi$, $Q$, $\bar{\xi}$, $\bar{Q}$ to give rise to another multiplet. Beginning with $C$ we can thus obtain another multiplet by defining the infinitesimal transformation formulated as:
(a) $\delta_\xi A = (\xi Q + \bar{\xi}\bar{Q}) \times A$
(b) $\delta_\xi \psi = (\xi Q + \bar{\xi}\bar{Q}) \times \psi$ (7.5.16)
The first of these satisfies:
$[\delta_{\xi'}, \delta_\xi] A = (\delta_{\xi'}\delta_\xi - \delta_\xi\delta_{\xi'}) A = 2(\xi'\tau^\mu\bar{\xi} - \xi\tau^\mu\bar{\xi}')\, P_\mu A = -2i(\xi'\tau^\mu\bar{\xi} - \xi\tau^\mu\bar{\xi}')\, \partial_\mu A$ (7.5.17)
We note here that the operator $[\delta_{\xi'}, \delta_\xi]$ acts differently on the other component $\psi$. This is because the supersymmetry transformation maps tensor fields into spinor fields and vice versa. Also, since $Q$ has mass dimension 1/2, the fields of dimension $k$ are transformed to fields of dimension $k + 1/2$ or into the derivatives of fields of lower dimension under the operation of $Q$. In the end, however, the closing requirements of the algebra always lead to the result which is acceptable within the framework of the theory. If we choose $A$ as a scalar field and $\psi$ as the spinor field into which $A$ is transformed:
$\delta_\xi A = \sqrt{2}\,\xi\psi$ (7.5.18)
then in view of the above explanation, the field $\psi$ is transformed into a tensor field of higher dimension and into the derivative of $A$ itself; accordingly:
$\delta_\xi \psi = \sqrt{2}\,\xi F + i\sqrt{2}\,\tau^\mu\bar{\xi}\,\partial_\mu A$ (7.5.19)
Note that the coefficient of $\partial_\mu A$ has been chosen in the above equation so that the commutation relation (7.5.17) is satisfied. Having introduced the new tensor field $F$ in $\delta_\xi\psi$, our objective is to find the transformed object $\delta_\xi F$. This is done by writing $[\delta_{\xi'}, \delta_\xi]\psi$ explicitly and by maintaining that the rules of the algebra given in (7.4.41) are preserved. Using (7.5.19) we thus have:
(S{, <%- 8z 8r)¥= - 2/ (
y, {% TV!,-$TV
+ -Jl (£5 { ,F-§' SfF)
I') (7.5.20)
It can be checked that the algebra closes if
$\delta_\xi F = i\sqrt{2}\,\bar{\xi}\bar{\tau}^\mu\partial_\mu\psi$ (7.5.21)
This transformation rule for $F$ also ensures that the commutator on $F$ closes as well. In view of the above discussions, the component multiplet $C$ is now enlarged to
$(A, \psi, F)$ (7.5.22)
The multiplet with transformation rules (7.5.18), (7.5.19) and (7.5.21) is called the scalar multiplet. The fields $A$, $\psi$ and $F$ form the (simplest) linear representation of the supersymmetry algebra described in the previous section. Also, as we noted above, if the dimension of $A$ equals 1, then $\psi$ has dimension 3/2 and $F$ has dimension 2. Since our primary fields are $A$ and $\psi$, $F$ is said to be an auxiliary field.
Remark 7.5.2: We note that $F$ transforms as a space-time derivative under $\delta_\xi$; we further emphasize that this is always the case for the component of highest dimension in any given multiplet. Apparently our motive in defining these fields is to write a Lagrangian in terms of them that would lead to an invariant action. The Lagrangian with this property is as follows:
$L = \mathcal{L}_0 + m\mathcal{L}_m = (i\,\partial_\mu\bar{\psi}\,\bar{\tau}^\mu\psi + A^*\Box A + F^*F) + m\,(AF + A^*F^* - \tfrac{1}{2}(\psi\psi + \bar{\psi}\bar{\psi}))$ (7.5.23)
where $m$ stands for the mass of the Weyl spinor $\psi$ and the complex scalar $A$, and $\Box$ stands for d'Alembert's operator. The field equations resulting from $L$ are:
$i\bar{\tau}^\mu\partial_\mu\psi + m\bar{\psi} = 0, \qquad F + mA^* = 0, \qquad \Box A + mF^* = 0$ (7.5.24)
which describe the Weyl spinor $\psi$ and the complex scalar $A$, both of the same mass $m$. We shall next define superfields; these can always be constructed from component fields and, conversely, given superfields, the component fields can be recovered from them. Clearly superfields are functions defined on superspace, and as such they can always be expressed as power series in terms of $\theta$ and $\bar{\theta}$. Denoting the superfield by $F(x, \theta, \bar{\theta})$ we thus have:
$F(x, \theta, \bar{\theta}) = f(x) + \theta\phi(x) + \bar{\theta}\bar{\chi}(x) + \theta\theta M(x) + \bar{\theta}\bar{\theta} N(x) + \theta\tau^\mu\bar{\theta}\, V_\mu(x) + \theta\theta\bar{\theta}\bar{\lambda}(x) + \bar{\theta}\bar{\theta}\theta\psi(x) + \theta\theta\bar{\theta}\bar{\theta} D(x)$ (7.5.25)
Due to the anti-commuting properties of the $\theta$'s, the product of more than two $\theta$'s or $\bar{\theta}$'s vanishes. As a result the expansion of $F(x, \theta, \bar{\theta})$ contains only these many terms. The coefficients $f(x)$,
0S4N(x)
(x)+ 0 00 Sf yKx) + 00 0 0S^D(x)
(7.5.26)
Just as we had the transformation for component fields, the transformation $\delta_\xi F$ here stands for:
$\delta_\xi F \equiv (\xi Q + \bar{\xi}\bar{Q}) F$ (7.5.27)
where $Q$ and $\bar{Q}$ are the differential operators defined in (7.5.6). The transformation laws for component fields can be found by substituting the values of $Q$ and $\bar{Q}$ and matching the appropriate powers of $\theta$. We shall illustrate this at the end of this section in Exc. (7.5.5) by using a slightly different route.
It can be verified that the commutator $[\delta_{\xi'}, \delta_\xi]$ in the case of these fields as well satisfies (7.5.17) because of the anti-commutation relations (7.5.6)(b). It is easy to check that the sum and product of two or more superfields is a superfield. Moreover, the operators $Q$, $\bar{Q}$ involved with them are linear operators. The collection of superfields can thus be considered as forming linear representations of the supersymmetry algebra. In general these representations are highly reducible. For example, the number of extra component fields can be reduced by imposing conditions such as:
$\bar{D}F = 0 \qquad (DF = 0)$ (7.5.28)(a)
or
$F = F^\dagger$ (the reality condition) (7.5.28)(b)
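The superfield that satisfies (a) is called chiral or scalar31 (antichiral), whereas the one that satisfies (b) is known as a vector superfield. In conclusion, we would like to mention that all supersymmetric renormalizable Lagrangians can be constructed in terms of scalar and vector superfields, while the superfields in turn can be constructed from a component multiplet by applying the operator $\exp(\theta Q + \bar{\theta}\bar{Q})$. Finally we illustrate, via the following examples, the theory discussed above.
Example 7.5.3: The scalar superfield: Consider the superfield $\Phi$ which satisfies:
$\bar{D}_{\dot A}\Phi = 0$ (7.5.29)
We are interested in the solution of the above equation. Since $\bar{D}_{\dot A} = -\frac{\partial}{\partial\bar{\theta}^{\dot A}} - i\,\theta^A\tau^\mu_{A\dot A}\frac{\partial}{\partial x^\mu}$, it is easy to note that the variables $\theta$ and $y^\mu = x^\mu + i\,\theta\tau^\mu\bar{\theta}$ are zeros of $\bar{D}_{\dot A}$. More explicitly:
$\bar{D}_{\dot A}\, y^\mu = \left(-\frac{\partial}{\partial\bar{\theta}^{\dot A}} - i\,\theta^A\tau^\nu_{A\dot A}\frac{\partial}{\partial x^\nu}\right)(x^\mu + i\,\theta\tau^\mu\bar{\theta}) = i\,\theta^A\tau^\mu_{A\dot A} - i\,\theta^A\tau^\mu_{A\dot A} = 0$ (7.5.30)
Hence any superfield constructed with only $y$ and $\theta$ is scalar, and it can be written down as:
$\Phi = A(y) + \sqrt{2}\,\theta\phi(y) + \theta\theta F(y)$ (7.5.31)
Note that it corresponds to the scalar multiplet introduced earlier in this section. Also, using Taylor's expansion it can be written in terms of the variable $x$ as:
$\Phi = A(x) + i\,\theta\tau^\mu\bar{\theta}\,\frac{\partial}{\partial x^\mu}A(x) + \frac{1}{4}\,\theta\theta\bar{\theta}\bar{\theta}\,\Box A(x) + \sqrt{2}\,\theta\phi(x) - \frac{i}{\sqrt{2}}\,\theta\theta\,\frac{\partial}{\partial x^\mu}\phi(x)\,\tau^\mu\bar{\theta} + \theta\theta F(x)$ (7.5.32)
The superfield $\Phi$ which, as we know, is chiral, is sometimes denoted as $\Phi_+$, and the one that satisfies
$D_A\Phi = 0$ (7.5.33)
is known as antichiral and is denoted as $\Phi_-$.32 It can be easily verified that $\Phi_-$ can be expressed in terms of the variables $z = z^\mu = x^\mu - i\,\theta\tau^\mu\bar{\theta}$ and $\bar{\theta}$. Thus we have:
$\Phi_- = A^*(z) + \sqrt{2}\,\bar{\theta}\bar{\phi}(z) + \bar{\theta}\bar{\theta} F^*(z)$
$= A^*(x) - i\,\theta\tau^\mu\bar{\theta}\,\frac{\partial}{\partial x^\mu}A^*(x) + \frac{1}{4}\,\theta\theta\bar{\theta}\bar{\theta}\,\Box A^*(x) + \sqrt{2}\,\bar{\theta}\bar{\phi}(x) + \frac{i}{\sqrt{2}}\,\bar{\theta}\bar{\theta}\,\theta\tau^\mu\frac{\partial}{\partial x^\mu}\bar{\phi}(x) + \bar{\theta}\bar{\theta} F^*(x)$ (7.5.34)
Apparently $(\Phi_+)^\dagger = \Phi_-$. It is easy to check that the product of two chiral or antichiral superfields is again chiral or antichiral, respectively. Under the supersymmetry transformation ($\theta \to \theta + \epsilon$, $\bar{\theta} \to \bar{\theta} + \bar{\epsilon}$) the components of the chiral field $\Phi_+$ transform as:
$\delta A = \sqrt{2}\,\epsilon\phi, \qquad \delta\phi = \sqrt{2}\,\epsilon F + i\sqrt{2}\,\tau^\mu\bar{\epsilon}\,\partial_\mu A, \qquad \delta F = i\sqrt{2}\,\bar{\epsilon}\bar{\tau}^\mu\partial_\mu\phi$ (7.5.35)
(see (7.5.18), (7.5.19) and (7.5.21)). The complex conjugation of these gives the supersymmetry transformations of the components of the antichiral field $\Phi_-$. It can also be verified that for any superfield $F = F(x, \theta, \bar{\theta})$ given in Eq. (7.5.25), $\bar{D}\bar{D}F = \Phi$ is chiral.
31 Very often the word 'scalar' stands collectively for superfields that satisfy $\bar{D}F = 0$ as well as $DF = 0$; see Hint to Exc. (7.5.9).
Example 7.5.4: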
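The vector superfield and the gauge transformation: Consider the superfield $V$ that satisfies the condition $V = V^\dagger$; from (7.5.28)(b) we know that this is a vector superfield. Evidently the following choice of components of $V$ makes this possible:
$V(x, \theta, \bar{\theta}) = C(x) + i\,\theta\chi(x) - i\,\bar{\theta}\bar{\chi}(x) + \frac{i}{2}\,\theta\theta\,[M(x) + iN(x)] - \frac{i}{2}\,\bar{\theta}\bar{\theta}\,[M(x) - iN(x)] - \theta\tau^\mu\bar{\theta}\, V_\mu(x)$
$\quad + i\,\theta\theta\bar{\theta}\left[\bar{\lambda}(x) + \frac{i}{2}\,\bar{\tau}^\mu\partial_\mu\chi(x)\right] - i\,\bar{\theta}\bar{\theta}\theta\left[\lambda(x) + \frac{i}{2}\,\tau^\mu\partial_\mu\bar{\chi}(x)\right] + \frac{1}{2}\,\theta\theta\bar{\theta}\bar{\theta}\left[D(x) + \frac{1}{2}\,\Box C(x)\right]$ (7.5.36)
Next we consider the sum $\Phi_+ + \Phi_-$ given by:
$\Phi_+(x, \theta, \bar{\theta}) + \Phi_-(x, \theta, \bar{\theta}) = (A + A^*) + \sqrt{2}\,(\theta\phi + \bar{\theta}\bar{\phi}) + \theta\theta F + \bar{\theta}\bar{\theta} F^* + i\,\theta\tau^\mu\bar{\theta}\,\partial_\mu(A - A^*)$
$\quad + \frac{i}{\sqrt{2}}\,\theta\theta\bar{\theta}\bar{\tau}^\mu\partial_\mu\phi + \frac{i}{\sqrt{2}}\,\bar{\theta}\bar{\theta}\theta\tau^\mu\partial_\mu\bar{\phi} + \frac{1}{4}\,\theta\theta\bar{\theta}\bar{\theta}\,\Box(A + A^*)$ (7.5.37)
32 Sometimes we shall denote $\Phi_-$ as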
This helps us to define the supersymmetric gauge transformation:
$V \to V + \Phi_+ + \Phi_-$ (7.5.38)
Under this gauge transformation the component fields transform as follows:
$C \to C + A + A^*$
$\chi \to \chi - i\sqrt{2}$
(7.5.40) (see Exc. 7.5.7). From (7.5.39) it is evident that the WZ gauge is based on the following relations amongst fields:
$\mathrm{Re}\,A = -\frac{1}{2}\,C, \qquad \phi = \frac{1}{\sqrt{2}\,i}\,\chi, \qquad F = \frac{1}{2i}\,(M + iN)$ (7.5.41)
thus after the gauge transformation $V$ becomes:
$V(0, 0, 0, 0, V_\mu, \lambda, D)$ (7.5.42)
In the next section we shall briefly return to these ideas.
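Exercise 7.5
1. Show that $\{D_A, \bar{D}_{\dot A}\} = -2i\,\tau^\mu_{A\dot A}$
5. Use the Hausdorff formula $e^A \cdot e^B = \exp(A + B + \frac{1}{2}[A, B] + \cdots)$ to show that:
$G(0, \epsilon, \bar{\epsilon})\, G(x^\mu, \theta, \bar{\theta}) = G(x^\mu + i\,\theta\tau^\mu\bar{\epsilon} - i\,\epsilon\tau^\mu\bar{\theta},\; \theta + \epsilon,\; \bar{\theta} + \bar{\epsilon})$.
6. Use the supersymmetry transformation
$F(x, \theta, \bar{\theta}) \to F(x^\mu + i(\theta\tau^\mu\bar{\epsilon} - \epsilon\tau^\mu\bar{\theta}),\; \theta + \epsilon,\; \bar{\theta} + \bar{\epsilon})$
to find the supersymmetry transformations of component fields.
7. Show that in the WZ gauge the vector superfield $V$ satisfies:
$V^2 = -\frac{1}{2}\,\theta\theta\bar{\theta}\bar{\theta}\, V_\mu V^\mu$ and $V^3 = 0$.
8. Show that the operators $D$ and $\bar{D}$ satisfy the following properties:
(a) $(D)^3 = 0, \qquad (\bar{D})^3 = 0$
and
(b) $D^\alpha \bar{D}^2 D_\alpha = \bar{D}_{\dot\alpha} D^2 \bar{D}^{\dot\alpha}$.
9. Show that the superfields:
$W_\alpha = -\frac{1}{4}(\bar{D})^2 D_\alpha V$ and $\bar{W}_{\dot\alpha} = -\frac{1}{4}(D)^2 \bar{D}_{\dot\alpha} V$
are chiral and antichiral respectively and are also gauge invariant.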
Hints to Exercise 7.5
1. $\{D_A, \bar{D}_{\dot A}\} = D_A\bar{D}_{\dot A} + \bar{D}_{\dot A}D_A$. In view of (7.5.7)33, the RHS can be written as:
33 Note that Greek letters there are replaced by Latin capitals here, and also the repeated indices in $\tau$, $\theta$ and $\bar{\theta}$ are used with a different convention. The reader should check these equations using the notations of (7.5.7) as well.
=
_'{_d
d_ _d
\deA
deA
d\
deA deA)
M
+ TA e
y * ° ^[w)+id*
VJ_
dxv
Tj v
* i^vde*))
In the above sum, the first and fourth terms in parentheses are zero due to the anti-commuting properties:
B
B
P TA - . —A1 = 0, l\d ,e \=o \de de f J and the third vanishes since
d f d )
.
d ( d )
are zero. Similarly writing
IO* D.) - [[£
,rj 5 ^ ) ( ^ • *J V £-) + M M .enns]
+
and simplifying on the above lines we note that all four terms (in pairs) are zero, since in this case the term
2. Written out in full the equality D
N=
e
N
T
M
becomes: d
(DA
(
V
0
Da = .-c r sj
0^
dxv
o J^ .
The inverse e^ will be given by the relation:
Using (7.5.13) we have:
'
V irj8*
0 5J
0 1 ( 8/ 0 -ix^P
-iPraSSS + iSS&fixtf
0 6f
0
0 " 0
<5/<5^
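3. In order to show that the commutator closes on A, we use the definitions of $\delta_\xi A$ and $\delta_\xi \psi$ as given in (7.5.16). Since $\delta_\xi A = \sqrt2\,\xi\psi$, we have:
$$\delta_{\xi'}\delta_\xi A = (\xi' Q + \bar\xi'\bar Q)\times[(\xi Q + \bar\xi\bar Q)\times A] = (\xi' Q + \bar\xi'\bar Q)\times(\sqrt2\,\xi\psi) = \delta_{\xi'}(\sqrt2\,\xi\psi).$$
Similarly
$$\delta_\xi\delta_{\xi'} A = (\xi Q + \bar\xi\bar Q)\times(\sqrt2\,\xi'\psi).$$
Therefore using (7.5.19) we have:
$$[\delta_{\xi'}, \delta_\xi]A = \sqrt2\,\xi\,(i\sqrt2\,\tau^\mu\bar\xi'\,\partial_\mu A + \sqrt2\,\xi' F) - \sqrt2\,\xi'\,(i\sqrt2\,\tau^\mu\bar\xi\,\partial_\mu A + \sqrt2\,\xi F)$$
$$= 2i(\xi\tau^\mu\bar\xi'\,\partial_\mu A - \xi'\tau^\mu\bar\xi\,\partial_\mu A) + 2(\xi\xi' - \xi'\xi)F.$$
The second term is zero in view of (7.5.14); this gives us (7.5.17). The other relations can be verified in a similar way.
4. Note that $\theta\theta$ can be written in more than one way, e.g.,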
==
£fig o
U
=
we choose to write it as eCD 0 6
£AB
-w -w
UQ v
—E
Ug u^ r
to obtain:
{£CD QC eD) = £AB £CD
[
= £
\-CcD (SB
S
-w(5/ 6° -eC 5»D)}
A ~ dA °B )]
= eAB(eBA
- eAB) = 2t*B eBA = 4.
Since A, B take the values 1 and 2 and 8^ = 0 for L * M we have the result. To verify the second part of the problem we write it as 34 :
and use the equality 9 9 = ECB 9^9^.
.V
A
We thus have:
A
/J
a
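= H/](VV+VI
Again since A, B take the values 1 and 2 and $\delta^L_M = 0$ for $L \neq M$ we have the result.35
5. In fact this exercise is a verification of our statement made in (7.5.2). Writing $G(0, \epsilon, \bar\epsilon)\,G(x^\mu, \theta, \bar\theta)$ explicitly in exponential form and using Hausdorff's formula we have:
$$e^{i(\epsilon Q + \bar\epsilon\bar Q)}\,e^{i(-x^\mu P_\mu + \theta Q + \bar\theta\bar Q)} = \exp\Big\{ i(-x^\mu P_\mu + \epsilon Q + \bar\epsilon\bar Q + \theta Q + \bar\theta\bar Q) + \tfrac12\big[\,i(\epsilon Q + \bar\epsilon\bar Q),\ i(-x^\mu P_\mu + \theta Q + \bar\theta\bar Q)\big]\Big\}$$
$$= \exp\Big\{ i(-x^\mu P_\mu + \epsilon Q + \bar\epsilon\bar Q + \theta Q + \bar\theta\bar Q) - \tfrac12\big[(\epsilon Q + \bar\epsilon\bar Q)(-x^\mu P_\mu + \theta Q + \bar\theta\bar Q) - (-x^\mu P_\mu + \theta Q + \bar\theta\bar Q)(\epsilon Q + \bar\epsilon\bar Q)\big]\Big\}.$$
(Note that we have just one Lie bracket term on the RHS, as all higher commutators vanish.) There are 12 terms in the second term of the above exponential; of these 8 cancel in pairs, the remaining 4 are:
34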
' We have written it this way to preserve the order in the summation rule, which prescribes: e^^eCB = 8C,.
35
Note that these results could also have been established by writing e 4 8 o9 =—
7- and by using 99 = 9C 0cfor the first one, and e • 6 —
ddA deA
c
AB
j - = -eAB igB -\QA
j o9
— = -^-.
— etc. for the
deA deB deB deh
second. Yet another way of solving the problem would be to write F(x, 6, 9) = 9dM(x) = 991 or = 9 9 1 and operate on it with appropriate operator.
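$$-\tfrac12\big\{(\epsilon Q\,\bar\theta\bar Q - \bar\theta\bar Q\,\epsilon Q) + (\bar\epsilon\bar Q\,\theta Q - \theta Q\,\bar\epsilon\bar Q)\big\}.$$
In view of (7.5.3) they can be written as:
$$-\tfrac12\big\{2\,\epsilon\tau^\mu\bar\theta\,P_\mu - 2\,\theta\tau^\mu\bar\epsilon\,P_\mu\big\} = (i)^2\,\epsilon\tau^\mu\bar\theta\,P_\mu + i(-i)\,\theta\tau^\mu\bar\epsilon\,P_\mu.$$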
Thus replacing the i back into the term and factoring out (- P^) we have the required result. 6. We expand the transformed superfield Fix? + t?, 6+ e, 6 + £) in terms of Taylor's series. Note that we have written ^ for i(6x^e - er^d). The expansion of altered superfield gives: (i)
Fix" +?,6+e,e
+ e) = F(x, 6, 9) + §" dM F + e ~ + e ~ . do
ou
Now F(x,9,
9)= f(x) + 9®(x) + 9%{x) + 99 Mix) +99 Nix)
+ 6>TV 9 Vv ix) + 999 A (x)+ 9 9 6yKx) + 99 9 Hence (i) can be written as:
9Dix).
Fix» + ?, 9 + e, 9 + e) - Fix, 9, 9) s8F s 8f(x) + 95®ix) + 9 8~xix) + ••• + 996 9 5 Dix) = I " (dMf+ 9dM® + ••• + 99998M D) + e— (0O(x) + 60M(x) + 9TV 9VV ix) + 999~Iix) + 999y/ix) + 999 9Dix)) o9 + e-~ (9%ix) + 9 6 Nix) + 9xv 9Vvix) + 999 X(x) + 9 9 6yKx) + 999 9 Dix)). o9 We now compare the coefficients of 9, 9,99, 99, 999, 9 96 etc., beginning with the constant term on both sides of (ii) to obtain (note that we have used the differentiation rules for powers of 9, 6 as illustrated in Exc. 4): (a) Sf = eO + e x (b)<5O= ie T ^ / + eM (c)Sx
=-ieT"dllf+eN
(d) SM = ii/2)e x^dft + el it) 8 N = i-ill) ET?du x
+ e
¥
if)SVM= iedfl + iedM j + c ^ A - ex^f (g) Si = ID + ( ^ Vv- dv vp e T / i t v V
ier^M
(h) 8y/= eD + ( ^ Vv - dv VM) / T £ + iT* eduN
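(i) 5D = (t/4)£T% I - ( / / 4 ) e T ^ \\i.
Note that the second term in (g) and (h) results from simplification of $\zeta^\mu\partial_\mu(\theta\tau^\nu\bar\theta\,V_\nu)$, where $\zeta^\mu$ stands for $i(\theta\tau^\mu\bar\epsilon - \epsilon\tau^\mu\bar\theta)$.
7. In WZ gauge the defining equation (7.5.36) for V simplifies to:
(i) $V = -\theta\tau^\mu\bar\theta\,V_\mu(x) + i\,\theta\theta\,\bar\theta\bar\lambda(x) - i\,\bar\theta\bar\theta\,\theta\lambda(x) + \tfrac12\,\theta\theta\,\bar\theta\bar\theta\,D(x)$.
While writing $V^2$ we note that, since all terms containing powers of $\theta$ or $\bar\theta$ beyond $\theta\theta\,\bar\theta\bar\theta$ vanish, the terms containing $\lambda$, $\bar\lambda$ and $D$ contribute nothing to the product; we thus have:
(ii) $V^2 = (-\theta\tau^\mu\bar\theta\,V_\mu(x))(-\theta\tau^\nu\bar\theta\,V_\nu(x))$.
In view of the equality resulting from spinor algebra:
(iii) $(\theta\tau^\mu\bar\theta)(\theta\tau^\nu\bar\theta) = -\tfrac12\,\theta\theta\,\bar\theta\bar\theta\,g^{\mu\nu}$
we see that:
(iv) $V^2 = -\tfrac12\,\theta\theta\,\bar\theta\bar\theta\,V_\mu V^\mu$.
From (iv) $V^3 = 0$ easily follows.
8. By the very definition of the covariant derivative operators $D$ and $\bar D$ as given in (7.5.7), the vanishing of cubic powers of $D$ and $\bar D$ is a straightforward fact. To show the validity of (b) we write: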
DaD2Da^DaDeDpDa In view of commutation relation given in (7.5.8), RHS equals = Da Dp (2i(rdfa
-DaDp)-
Similarly, replacing Da D we have: DaD2Da = (2i(Td)«) - Dp Da) (2i(T«?)J - Da Dp) = - 8D + DpD2 Dp - 2i(Td)P {Da, Dp }. The third term in the above equality is 8 D, hence we have the required result (note that we have not used the suffixes on Pauli matrices and the space-time derivative d here, for obvious reasons and have replaced (T)£ (td)p by 2 • , where • is d'Alembert's operator given in Chapter 3). The identity proved above leads to other identities such as: (i) (ii)
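$$D^2\bar D^2 + \bar D^2 D^2 - 2D^\alpha\bar D^2 D_\alpha = 16\,\Box, \qquad D^2\bar D^2 D^2 = 16\,\Box\,D^2$$
(iii) $\bar D^2 D^2\bar D^2 = 16\,\Box\,\bar D^2$
and to corresponding projection operators:
(iv) $\Pi_{0+} = \dfrac{\bar D^2 D^2}{16\,\Box}$
(v) $\Pi_{0-} = \dfrac{D^2\bar D^2}{16\,\Box}$
(vi) $\Pi_{1/2} = -\dfrac{D^\alpha\bar D^2 D_\alpha}{8\,\Box}$
which satisfy:
(vii) $\Pi_{0+} + \Pi_{0-} + \Pi_{1/2} = 1$.
The operators $\Pi_{0+}$ and $\Pi_{0-}$ acting on a scalar superfield project out the chiral and antichiral parts of the field, whereas $\Pi_{1/2}$ projects out a piece called the linear multiplet. (See Chapter 10 in [16] and Chapter 9 in [19] for more details.)
9. Since (i)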
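$$\bar D_{\dot\beta} W_\alpha = -\tfrac14\,\bar D_{\dot\beta}\,\bar D\bar D\,D_\alpha V$$
the RHS involves a cubic power of $\bar D$ and is therefore zero in view of the result proved in Exc. 8. Similarly $D_\beta \bar W_{\dot\alpha}$ is zero. Hence $W_\alpha$ and $\bar W_{\dot\alpha}$ are chiral and antichiral fields respectively. To prove their gauge invariance we have to show that
(ii) $W(V) \to W(V + \Phi_+ + \Phi_-) = W(V)$.
In other words we have to show that
(iii) $W_\alpha \to -\tfrac14\,\bar D\bar D D_\alpha (V + \Phi_+ + \Phi_-) = W_\alpha$
and similarly for $\bar W_{\dot\alpha}$. Note that in order to prove the above expression we have to use the facts $\bar D_{\dot\beta}\Phi_+ = 0$ and $D_\alpha\Phi_- = 0$. The second makes the $\Phi_-$ term vanish at once, while for the first
(iv) $\bar D\bar D D_\alpha\Phi_+ = \bar D_{\dot\beta}\{\bar D^{\dot\beta}, D_\alpha\}\Phi_+ \propto (\tau^\mu\partial_\mu)_\alpha{}^{\dot\beta}\,\bar D_{\dot\beta}\Phi_+ = 0$.
Alternatively we can also write:
(v) $\bar D\bar D D_\alpha\Phi_+ + \bar D\bar D D_\alpha\Phi_- = \bar D\{\bar D, D_\alpha\}\Phi_+ = 0$
and obtain the required result.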
6 DIFFERENTIAL FORMS AND GAUGE TRANSFORMATIONS ON SUPERSPACES
It is known that no super theory can be formulated without the use of differential forms and their derivation on superspaces. Similarly, gauge theories such as Yang-Mills cannot be extended into super-Yang-Mills without the help of the (so-called) supergauge transformations. Our task in this section is to introduce both these topics.
6.1 Differential Forms
Let $z^M = (x^\mu, \theta^\alpha, \bar\theta_{\dot\alpha})$ be an element of the superspace, where $x^\mu$ represents the four-vector and $\theta^\alpha$, $\bar\theta_{\dot\alpha}$ represent the spinors in the superspace.36 The multiplication rule among these elements is given as:
$$z^M z^N = (-)^{nm}\,z^N z^M \tag{7.6.1}$$
The letters n and m are functions of N and M respectively, and take the value 0 or 1 according as N and M stand for a vector or a spinor index; for example $x^\mu x^\nu = (-)^{0\times 0}\,x^\nu x^\mu$, $\theta^\alpha\theta^\beta = (-)^{1\times 1}\,\theta^\beta\theta^\alpha$ and $x^\mu\theta^\alpha = (-)^{0\times 1}\,\theta^\alpha x^\mu$ (see (7.4.17)). The exterior product and differential forms in superspace are defined along the lines of ordinary space; thus, for instance, the exterior product obeys the rule:
$$dz^M \wedge dz^N = -(-)^{nm}\,dz^N \wedge dz^M, \qquad dz^M z^N = (-)^{nm}\,z^N dz^M \tag{7.6.2}$$
and a q-form37 is defined as:
$$\Psi = dz^{M_1}\wedge\cdots\wedge dz^{M_q}\,W_{M_q\ldots M_1}(z) \tag{7.6.3}$$
(The indices n and m in (7.6.2) are the same as in (7.6.1).) The 0-forms in this case are functions F(z) of the superspace variable z, and a typical 1-form $\Lambda = dz^M W_M(z)$ can be written as:
$$\Lambda = dx^\mu\,W_\mu(z) + d\theta^\alpha\,W_\alpha(z) + d\bar\theta_{\dot\alpha}\,W^{\dot\alpha}(z) \tag{7.6.4}$$
It should be noted that in view of (7.6.2) the coefficient functions are of mixed symmetry and therefore, unlike the forms in the ordinary finite dimensional space, there is no value of q above which all forms vanish. It is also assumed that all coefficient functions with an even (odd) number of spinorial indices are bosonic (fermionic) in character. As a result, the usual product rules amongst differential forms are valid:
$$\Lambda\Psi = (-)^{pq}\,\Psi\Lambda, \qquad (c_1\Lambda_1 + c_2\Lambda_2)\Psi = c_1\Lambda_1\Psi + c_2\Lambda_2\Psi, \qquad \Lambda(\Psi\Sigma) = (\Lambda\Psi)\Sigma \tag{7.6.5}$$
where $\Lambda$ is a p-form and $\Psi$ is a q-form in the first equality.
36 In Sec. 5 we used $z^M = (x^\mu, \theta^\alpha, \bar\theta^{\dot\alpha})$; the present choice facilitates the summation.
37 In general the exterior product sign will be dropped in the future.
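The sign rules (7.6.1) and (7.6.2) depend only on whether each index is a vector or a spinor index. The following is a minimal Python sketch of that bookkeeping; the function names and the string labels "vector" and "spinor" are illustrative assumptions, not notation from the text.

```python
def parity(index_kind: str) -> int:
    """Grading of an index: 0 for a vector (bosonic) index, 1 for a spinor (fermionic) index."""
    return {"vector": 0, "spinor": 1}[index_kind]


def swap_sign(kind_m: str, kind_n: str) -> int:
    """Sign (-)^{nm} picked up when z^M z^N is rewritten as z^N z^M, cf. (7.6.1)."""
    return (-1) ** (parity(kind_m) * parity(kind_n))


def wedge_swap_sign(kind_m: str, kind_n: str) -> int:
    """Sign -(-)^{nm} picked up when dz^M ^ dz^N is rewritten as dz^N ^ dz^M, cf. (7.6.2)."""
    return -swap_sign(kind_m, kind_n)


if __name__ == "__main__":
    for pair in [("vector", "vector"), ("vector", "spinor"), ("spinor", "spinor")]:
        print(pair, swap_sign(*pair), wedge_swap_sign(*pair))
```

In this sketch the wedge product of two spinor differentials comes out symmetric (sign +1), which is the reason stated above why there is no value of q beyond which all forms vanish.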
The exterior derivative, denoted d, maps a 0-form to a 1-form and a q-form to a (q + 1)-form. It is defined as follows:
$$d: F \to dF = dz^M\frac{\partial}{\partial z^M}F = dz^M\,\partial_M F$$
$$d: \Psi \to d\Psi = dz^{M_1}\cdots dz^{M_q}\,dz^N\frac{\partial}{\partial z^N}\,W_{M_q\ldots M_1}(z) \tag{7.6.6}$$
The derivative d in superspace satisfies properties similar to those in ordinary space, for instance:
(a) $d(\Psi + \Lambda) = d\Psi + d\Lambda$
(b) $d(\Psi\Lambda) = \Psi\,d\Lambda + (-)^p\,d\Psi\,\Lambda$
(c) $dd = 0$ (7.6.7)
where $\Lambda$ is a p-form. Since equations written in terms of differential forms and their exterior derivatives are covariant under coordinate changes, they are found very useful in the formulation of almost all physical theories, in particular the gauge theories (see Exc. 7.6.1). From our discussions on gauge theories in Chapter 6, we know that gauge theories are covariant under general coordinate transformations as well as under a local structure group (that pertains to the theory in question). For instance this is a compact Lie group for Yang-Mills theories and is the Lorentz group for gravitational theories. Differential forms are often used to span a representation (say X) of this group, thus we have:
$$\Psi'^a = \Psi^b X_b{}^a(z) \tag{7.6.8a}$$
or
$$\Psi' = \Psi X \tag{7.6.8b}$$
where a and b take the values 1, ..., l, l being the dimension of the representation X. Now while exterior derivatives map differential forms to differential forms, they do not map tensors to tensors;38 for instance from the derivative of (7.6.8)(b) we have
$$d\Psi' = \Psi\,dX + d\Psi\,X \tag{7.6.9}$$
which contains the inhomogeneous term $\Psi\,dX$. To circumvent this problem we introduce the (familiar) Lie-algebra valued connection 1-form defined as:
$$\omega = dz^M\,\omega_M{}^r(z)\,iT^r \tag{7.6.10}$$
where the matrices $(T^r)$ are the Hermitian generators of the group and the index r runs over the dimension of the Lie algebra. The transformation law for the connection $\omega$ is as follows:
$$\omega' = X^{-1}\omega X - X^{-1}dX \tag{7.6.11}$$
The covariant derivative of $\Psi$ is then defined as:
$$\mathcal D\Psi = d\Psi + \Psi\omega \tag{7.6.12}$$
38 Recall that tensors are objects that transform linearly under a representation of the structure group.
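The role of the inhomogeneous term in the transformation law of the connection can be checked in a few lines. The sketch below assumes that X is a 0-form (so that (7.6.7)(b) gives $d(\Psi X) = \Psi\,dX + d\Psi\,X$, as in (7.6.9)), that the connection transforms as $\omega' = X^{-1}\omega X - X^{-1}dX$, and that the covariant derivative is $\mathcal D\Psi = d\Psi + \Psi\omega$:

```latex
\begin{aligned}
\mathcal{D}'\Psi' &= d\Psi' + \Psi'\omega' \\
 &= (d\Psi)X + \Psi\,dX + \Psi X\,\bigl(X^{-1}\omega X - X^{-1}dX\bigr) \\
 &= (d\Psi)X + \Psi\,dX + \Psi\omega X - \Psi\,dX \\
 &= (d\Psi + \Psi\omega)\,X = (\mathcal{D}\Psi)\,X .
\end{aligned}
```

Under these assumptions the $\Psi\,dX$ pieces cancel, so $\mathcal D\Psi$ transforms in the same way as $\Psi$ itself, which is the sense in which the covariant derivative maps tensors to tensors.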
Written out in full this is:
$$\mathcal D\Psi = dz^{M_1}\cdots dz^{M_q}\,dz^N\,\partial_N W_{M_q\ldots M_1}(z) + dz^{M_1}\cdots dz^{M_q}\,dz^N\,\omega_N{}^r\,W_{M_q\ldots M_1}(z)\,iT^r \tag{7.6.13}$$
The derivative $\mathcal D$ maps q-forms to (q + 1)-forms and tensors to tensors. The connection form $\omega$ together with its derivative gives rise to an important tensor, the curvature, which we denote as $R$:
$$R = d\omega + \omega\omega \tag{7.6.14a}$$
As can be expected it is a Lie-algebra valued two-form39:
$$R = \tfrac12\,dz^M dz^N\,R_{NM}(z) \tag{7.6.14b}$$
where
$$R_{NM}(z) = R_{NM}{}^r(z)\,iT^r \tag{7.6.14c}$$
In analogy with ordinary space, the curvature and the covariant derivative of a tensor are the only tensorial quantities which can be constructed by taking derivatives. Because of dd = 0, higher derivatives do not lead to new tensors; they only lead to identities, the Bianchi identities (see Exc. 4).
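39 For this reason $R$ is referred to as a curvature two-form.

6.2 The Gauge Invariant Lagrangian in Superspace
Let U(1) be the gauge group and let a scalar superfield $\Phi_r$ undergo global phase transformations under U(1) rotations as:
$$\Phi_r \to e^{-it_r\lambda}\,\Phi_r = \Phi_r' \tag{7.6.15}$$
where $t_r$ denotes the charge corresponding to $\Phi_r$. The Lagrangian
$$\mathcal L = \Phi_r^+\Phi_r\big|_{\theta\theta\bar\theta\bar\theta} + \Big[\big(\tfrac12 m_{ij}\Phi_i\Phi_j + \tfrac13 g_{ijk}\Phi_i\Phi_j\Phi_k\big)\Big|_{\theta\theta} + \text{h.c.}\Big] \tag{7.6.16}$$
is gauge invariant under the transformation (7.6.15). The first term in $\mathcal L$ (the $\theta\theta\bar\theta\bar\theta$ component of $\Phi_r^+\Phi_r$) is $\mathcal L_{KE}$ and the second is $\mathcal L_{PE}$, which is also called the superpotential. It is important to note here that to maintain U(1)-invariance, the coefficient $m_{ij}$ or $g_{ijk}$ must vanish whenever $t_i + t_j$ or $t_i + t_j + t_k$ is nonzero. If, on the other hand, the rotation angle $\lambda$ depends on the variable x, we denote it by $\Lambda$ and note that the following transformations:
$$\Phi_r' = e^{-it_r\Lambda}\,\Phi_r, \quad \bar D_{\dot\alpha}\Lambda = 0; \qquad \Phi_r'^+ = \Phi_r^+\,e^{it_r\Lambda^+}, \quad D_\alpha\Lambda^+ = 0 \tag{7.6.17a}$$
leave the $\mathcal L_{PE}$ portion of the above Lagrangian invariant*. To make the other portion invariant, we introduce the vector superfield V along with its gauge transformation law (see (7.5.38))40:
$$V \to V + i(\Lambda - \Lambda^+) \tag{7.6.17b}$$
in the Lagrangian and express V in terms of W (see Exc. 7.5.9). Accordingly $\mathcal L_{KE}$ is written as:
$$\mathcal L_{KE} = \tfrac14\big(W^\alpha W_\alpha\big|_{\theta\theta} + \bar W_{\dot\alpha}\bar W^{\dot\alpha}\big|_{\bar\theta\bar\theta}\big) + \Phi_r^+\,e^{t_r V}\,\Phi_r\big|_{\theta\theta\bar\theta\bar\theta} \tag{7.6.18}$$
(see Exc. 9 of Sec. 5 (in particular (iii)) for $W_\alpha$ and $\bar W_{\dot\alpha}$). Although the Lagrangian $\mathcal L = \mathcal L_{KE} + \mathcal L_{PE}$ with this change seems non-renormalizable, it is actually renormalizable in WZ gauge where, as we know, $V^3 = 0$. To write down the supersymmetric extension of $\mathcal L_{QED}$ (the Lagrangian for electrodynamics), we replace the charge $t_r$ with the electric charge e and use the scalar superfields:
$$\Phi_+ \to e^{-ie\Lambda}\,\Phi_+ = \Phi_+', \qquad \Phi_- \to e^{ie\Lambda}\,\Phi_- = \Phi_-' \tag{7.6.19}$$
to obtain:
$$\mathcal L_{QED} = \tfrac14\big(W^\alpha W_\alpha\big|_{\theta\theta} + \bar W_{\dot\alpha}\bar W^{\dot\alpha}\big|_{\bar\theta\bar\theta}\big) + \Phi_+^+\,e^{eV}\,\Phi_+\big|_{\theta\theta\bar\theta\bar\theta} + \Phi_-^+\,e^{-eV}\,\Phi_-\big|_{\theta\theta\bar\theta\bar\theta} + m\big(\Phi_+\Phi_-\big|_{\theta\theta} + \Phi_+^+\Phi_-^+\big|_{\bar\theta\bar\theta}\big) \tag{7.6.20}$$
It is worth noting here that the gauge transformation law (7.6.15) can be generalized to a non-abelian group (necessarily compact) by expressing $\Lambda$ in the following:
$$\Phi' = e^{-i\Lambda}\,\Phi, \qquad \Phi'^+ = \Phi^+\,e^{i\Lambda^+} \tag{7.6.21}$$
in terms of the Hermitian generators $T^a$ of the gauge group in the representation given by the scalar field $\Phi$. The vector superfields V, V' also become matrices and the transformation law for them is:
$$e^{V'} = e^{-i\Lambda^+}\,e^V\,e^{i\Lambda} \tag{7.6.22}$$
The superfield $W_\alpha$ is now defined as:
$$W_\alpha = -\tfrac14\,\bar D\bar D\,e^{-V} D_\alpha\,e^V \tag{7.6.23}$$
with transformation law:
$$W_\alpha' = e^{-i\Lambda}\,W_\alpha\,e^{i\Lambda} \tag{7.6.24}$$
Finally the most general supersymmetric Lagrangian for renormalizable interacting fields can be written as:
40 $\Lambda$ as well as $\Lambda^+$ were assumed to be real, hence to bring it in line with (7.5.38) we used the factor i.
* See Ftn. 32 for the notations used in (7.6.16) etc. In this section + stands for Hermitian conjugate.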
L =
±-?r(vr
Wa\ee + Wa W%)
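$$+\ \Phi^+\,e^V\,\Phi\big|_{\theta\theta\bar\theta\bar\theta} + \Big[\big(\tfrac12 m_{ij}\Phi_i\Phi_j + \tfrac13 g_{ijk}\Phi_i\Phi_j\Phi_k\big)\Big|_{\theta\theta} + \text{h.c.}\Big] \tag{7.6.25}$$
where k in the denominator is the normalizing factor resulting from: $\mathrm{Tr}\,T^a T^b = k\,\delta^{ab}$, $k > 0$.
6.3 Supergauge Transformations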
Recall that in Sec. 6.5 (see Exc. 6.5.2) as well as in an earlier section we saw how a particular choice of gauge (e.g., Coulomb's or Wess-Zumino's) could lead to easier computations. In Sec. 6.6 we also established the gauge transformation rules; in this subsection we study supergauge transformations (SGT) so that we may find convenient reparametrizations of a super theory without disturbing much of its invariance properties. We shall see that SGTs are constructed from the general coordinate and structure group transformations of the superspace. They map Lorentz tensors into Lorentz tensors and reduce to supersymmetry transformations in the limit of flat space. To obtain these transformations, we begin with supersymmetry transformations (see (7.5.17)) where the parameters $\xi$ and $\bar\xi$ are independent of x. We shall now treat them as functions of x and use $\xi = \xi(x)$ and $\bar\xi = \bar\xi(x)$ to denote them. The motions induced by these transformations:
$$x^\mu \to x^\mu - i\big(\theta\tau^\mu\bar\xi(x) - \xi(x)\tau^\mu\bar\theta\big), \qquad \theta^\alpha \to \theta^\alpha - \xi^\alpha(x), \qquad \bar\theta_{\dot\alpha} \to \bar\theta_{\dot\alpha} - \bar\xi_{\dot\alpha}(x) \tag{7.6.26}$$
generate coordinate transformations such as:
$$z^M \to y^M = z^M - \xi^M(z) \tag{7.6.27}$$
This helps one to express a given theory in the language of differential forms; the theory, as we already know, is covariant under coordinate transformations. Again our basic dynamic variables (superfields) for describing the theory here are the vielbeins and the connection form. These contain a large number of component fields, which eventually get reduced through covariant constraints and by choosing a proper coordinate system. We now proceed to substantiate this statement. Note that the parameter $\xi$ can be expressed in Einstein indices, e.g., $\xi^M$, as well as in Lorentz indices, $\xi^A$, through the vielbein relation41:
$$\xi^A = \xi^M E_M{}^A, \qquad \xi^M = \xi^A E_A{}^M \tag{7.6.28}$$
41 The indices depending on the structure group (which is Lorentz here) are denoted with the beginning of the alphabet, e.g., A, B, C, and are called Lorentz indices, whereas the indices that come from the middle of the alphabet, e.g., M, N, represent coordinates and are known as Einstein indices. We have used the letter $\xi$ in place of the letter $\epsilon$ used in Sec. 5 to distinguish the coordinate dependence of transformations; due to our choice of the Lorentz group, though, they represent the same object.
Evidently only one of these, i.e., either $\xi^A$ or $\xi^M$, can be chosen as a field independent transformation parameter. Since we want Lorentz tensors to be transformed into Lorentz tensors, we let $\xi^A$ be field independent. Consider now an arbitrary tensor superfield $V^A$; its transformation can be written as:
$$\delta V^A = -\xi^M\partial_M V^A + V^B L_B{}^A \tag{7.6.29}$$
The entities $L_B{}^A$ are Lorentz transformations that correspond to the tensor structure of V. In analogy to ordinary space, the transformation for scalar fields (again denoted V) is:
$$\delta V = -\xi^M\partial_M V = -\xi^A E_A{}^M\partial_M V = -\xi^A\mathcal D_A V \tag{7.6.30}$$
where the vielbein and its inverse satisfy
$$E_M{}^A E_A{}^N = \delta_M{}^N, \qquad E_A{}^M E_M{}^B = \delta_A{}^B \tag{7.6.31}$$
Note that while $\delta V$ is covariant under Lorentz transformations, $\delta V^A$ is not so unless the derivative in (7.6.29) is replaced by the covariant derivative:
(a) $\mathcal D_M V^A = \partial_M V^A + (-)^{mb}\,V^B\omega_{MB}{}^A$
(b) $\mathcal D_B V^A = E_B{}^M\mathcal D_M V^A$ (7.6.32)
Now (7.6.29) can also be written as:
$$\delta V^A = -\xi^B E_B{}^M\partial_M V^A + V^B L_B{}^A \tag{7.6.33}$$
Substituting the expression for $E_B{}^M\partial_M V^A$ from (7.6.32) in the above equation we have:
$$\delta V^A = -\xi^C\mathcal D_C V^A + V^B\xi^C\omega_{CB}{}^A + V^B L_B{}^A \tag{7.6.34}$$
We know that the connection $\omega_{CB}{}^A$ is Lie algebra valued, therefore $\xi^C\omega_{CB}{}^A$ acts like a field dependent Lorentz transformation on $V^B$. And hence if we set
$$L_B{}^A = -\xi^C\omega_{CB}{}^A \tag{7.6.35}$$
we obtain:
$$\delta_\xi V^A = -\xi^C\mathcal D_C V^A \tag{7.6.36}$$
which is obviously covariant under Lorentz transformations. The condition (7.6.35) (which is also called a special Lorentz transformation) helps in defining supergauge transformations. These transformations consist of a general coordinate transformation with field independent parameter $\xi^A$ followed by a (structure group) Lorentz transformation with field-dependent parameter given by $L_B{}^A = -\xi^C\omega_{CB}{}^A$. As mentioned in the introduction, our ultimate objective is to find the gauged supersymmetry transformations; we note that this has been achieved by establishing the transformation rules (7.6.30) and (7.6.36) respectively for scalar V and tensor $V^A$. It is easy to check that the commutator of two supergauge transformations based on field-independent parameters $\xi^A$ and $\eta^A$ can be written as:
$$(\delta_\eta\delta_\xi - \delta_\xi\delta_\eta)\,V^A = \xi^C\eta^B\big(\mathcal D_B\mathcal D_C - (-)^{bc}\,\mathcal D_C\mathcal D_B\big)V^A \tag{7.6.37}$$
or equivalently
$$(\delta_\eta\delta_\xi - \delta_\xi\delta_\eta)\,V^A = V^D\xi^C\eta^B R_{BCD}{}^A - \xi^C\eta^B T_{BC}{}^D\,\mathcal D_D V^A \tag{7.6.38}$$
where $R_{BCD}{}^A$ and $T_{BC}{}^D$ are components of the curvature and torsion tensors respectively. The above equality shows that the commutator $[\delta_\eta, \delta_\xi]V^A$ closes into a field-dependent Lorentz transformation (implicit in $R_{BCD}{}^A$) and a field dependent transformation given by (7.6.36). If the superspace is flat, $R_{BCD}{}^A = 0$ and the torsion is proportional to $\tau$-matrices, hence (7.6.38) reduces to the familiar form:
$$[\delta_\xi, \delta_\eta]\,V^A = -2i\big(\eta\tau^\mu\bar\xi - \xi\tau^\mu\bar\eta\big)\partial_\mu V^A. \tag{7.6.39}$$
Also if we choose the $\theta = \bar\theta = 0$ component of
$$\delta_\xi E_M{}^A = -\mathcal D_M\xi^A - \xi^B T_{BM}{}^A \tag{7.6.40}$$
$$\delta_\xi\omega_{MA}{}^B = -\xi^C R_{CMA}{}^B \tag{7.6.41}$$
We devote the final section of this chapter to the concept of integration and conformality on superspaces.
Exercise 7.6
1. Let y and z be two sets of superspace coordinates (i.e., they label the same point in superspace) related by a mapping $\phi: z \to y$ and written as $y^M = y^M(z) = \phi(z)$, then show that there is an induced mapping $\phi^*$ which relates q-forms in two coordinate systems, and has the properties:
(i) $\phi^*(\Psi + \Lambda) = \phi^*\Psi + \phi^*\Lambda$
(ii) $\phi^*(\Psi\Lambda) = (\phi^*\Psi)(\phi^*\Lambda)$
(iii) $d(\phi^*(\Psi)) = \phi^*(d\Psi)$.
2. Use the infinitesimal coordinate transformation $y^M = z^M - \xi^M$ and the induced mapping
$$E'_M{}^A(z) - E_M{}^A(z) = -\xi^L\partial_L E_M{}^A - (\partial_M\xi^L)\,E_L{}^A$$
for infinitesimal coordinate changes. Show further that their supergauge transformation is given by (7.6.40).
Hints to Exercise 7.6
1. Note that a function of y (which in form-terminology is a 0-form) corresponds to a function of z in the following sense:
(a) $F(y) = F(y(z)) = F(\phi(z)) = \phi^*(F(z))$.
This definition of <j)*F ensures that an entity in two systems takes the same value at the same point independent of the labeling scheme. As a result we have: (b)
V(y)=dyMi--dyMiWMrMi(y)
= [dz-f^...^f^w^M] <*») = dzNK..dzN« =
(5NMll...d^)WMq...Mi(y(z))
dzN\..dzN«fWNq...Nx{z)
= fC¥(z)). The first property of
0*OF(z)AU))=
dzN2W'N2Ni(z))
= ndzNdzN'dz^XN2N,Niz)) where we have put the product of two functions WN(z) and W'N Ni (z) as XN2N1N^)- Then using (b), the RHS becomes (d)
dyNdyNi dyN*
XN^NW-
Rewriting XN^N 60 as WN (y)W'Njfii (^)and noting that dyNWN(y) = W ) and dy^ dyNi =
d(pV) = d{dyMi WMi(y))
W'^(y)
$= \phi^*(d\Psi(z))$.
2. Note that in the defining equation of the coordinate transformation $y = \phi(z) = z - \xi$, the parameter $\xi$ is not a constant; it is an infinitesimal variation in z. In the case of a 0-form we have:
(a) $F(y) = \phi^*(F(z))$, i.e. $F(z^M - \xi^M) = \phi^*F(z^M)$, so that $F(z^M) - \xi^N\partial_N F(z^M) = \phi^*F(z^M)$.
Hence $\phi^*F(z^M) - F(z^M) = \delta F = -\xi^N\partial_N F(z^M)$. Let the 1-form $\Psi$ be $dy^M W_M(y)$ and $dz^N W_N(z)$ in y and z coordinates, then we have:
(b) $dy^M W_M(y) = \phi^*(dz^N W_N(z))$.
The LHS of (b) in terms of infinitesimal transformations gives:
(c) $dz^N\,\dfrac{\partial y^M}{\partial z^N}\,W_M(y(z)) = dz^N\Big(\delta_N{}^M - \dfrac{\partial\xi^M}{\partial z^N}\Big)\big(W_M(z) - \xi^L\partial_L W_M\big) = \phi^*(dz^N W_N(z))$.
The term containing both $\partial\xi^M/\partial z^N$ and $\xi^L$ is a product of two infinitesimals and is therefore dropped; hence on simplification we have:
(d) $\phi^*(dz^N W_N(z)) - dz^N W_N(z) = -dz^N\big(\xi^L\partial_L W_N\big) - dz^N\,\dfrac{\partial\xi^M}{\partial z^N}\,W_M$.
3. By definition of the covariant derivative we have:
(a) $\mathcal D\Psi' = d\Psi' + \Psi'\omega' = d(\Psi X) + (\Psi X)\big(X^{-1}\omega X - X^{-1}dX\big)$
where we have written $\Psi' = \Psi X$ (see (7.6.8)) and have used the transformation law (7.6.11) for the connection $\omega$. The RHS simplifies to:
d(W) = d(d¥ + Vco) = 0 + d^co)
All that is Super—an Introduction 337
= Vdca - dV
dzMdzNVNVMV = | W T C ^
in view of (7.6.13) and (7.6.14)(c). Bianchi identities of the second type are obtained by taking the exterior derivative of the curvature form, thus from (7.6.14)(a) we have:
(d) $dR = \omega\,d\omega - d\omega\,\omega = \omega(R - \omega\omega) - (R - \omega\omega)\,\omega = \omega R - R\,\omega$.
5. We use the results established in the Hint to Exc. 1, more precisely beginning with the defining equation
E\z)=dzMEMA(z) we denote the vielbein forms and the vielbein fields in the transformed coordinates z' =
E'(z') = dz'NE'NA(z') M
r(dz»EM*iz» =
^r\E'NA{z')
dzM[^EN\z')^.
This leads to the relation: (b)
r(EMA(z))=~^EN\z').
Using the infinitesimal change z'M' = z*1' - %M(z) in (b) and noting that E'MA when expressed in terms of (z) is actually without primes we obtain:
(c)
f(EA(z)) = [dNM -J^-J(E N A (z) - $LdLEA).
This gives:
fEMA(z) - EMA(z) = SEM\z) = -
(d)
We note that in the transformation (d), the term — ^ -
(e)
5EMA =
EMBLBA{z)
where LBA are Lorentz generators. Taking the transformations of (d) and (e) together, we obtain the transformation of the vielbein field as:
(f)
6EMA = -? dLEMA - dM Z,LEA + EMB LBA = -Z\dLEuA
- {-fMdMEA)
- dMt;A +
EUBLA
= -dMtA - ^L{TLMA - comA + ( - r coMA) + EMBLBA
(g)
where T^ denotes the torsion. We note that the torsion is the covariant derivative of the vielbein EA and can be explicitly written as: f
A _ •}
F
A
( \NM l
p A
,
NW(8
+ M)F
B
A
, -JAB F B
A
The term with the connection $\omega_M$ in (g) combines with $\partial_M\xi^A$ to give the covariant derivative. Finally we substitute the (special Lorentz) transformation (7.6.35) in the last term of (g) to obtain the supergauge transformation law (7.6.40) (see Chapter 14 in [21]).
7 THE BASICS OF INTEGRATION AND CONFORMALITY IN SUPERSPACES
In Sec. 6 we introduced the concept of Lagrangians on the superspace and discussed their invariance under supersymmetry transformations. In this section we continue this discussion for the action integral $S = \int\mathcal L$, since it is the (invariant) action integral which we always use to determine the key features of a theory. For instance we shall use $\delta S$ to write the Euler-Lagrange equations of a theory defined by superfields.
In order to achieve this, we equip ourselves with the computational skills required to define these integrals and their variations on superspace. These rules of integration are not the same as in the case of ordinary space, although they are formulated using the same principles: e.g., the invariance of the infinitesimal volume integral under admissible coordinate transformations, the existence of a $\delta$-function similar to the Dirac $\delta$-function, a relation between integration and differentiation, and so forth. Since in this case one set of variables anti-commutes, the rules of integration are different for it; we begin by setting the rules for this variable.
7.1 Integration on Superspace
Definition 7.7.1: Given just one anti-commuting variable $\theta$, the integrals $\int d\theta$ and $\int\theta\,d\theta$ satisfy:
(a) $\int d\theta = 0$ and $\int\theta\,d\theta = 1$
(b) and are invariant under the translation $\theta \to \theta + \epsilon$, $\epsilon$ being a constant. (7.7.1)
(It is assumed here that the boundary makes no contribution.) In view of the above definition, it is easy to note that integration and differentiation are identical. For example if $f(\theta)$ is a polynomial42 $a + \theta b$, then the following equation is self-evident:
$$\int f(\theta)\,d\theta = b = \frac{\partial}{\partial\theta}f(\theta) \tag{7.7.2}$$
The above definition can be used for more than one variable, thus for two variables $\theta_1$, $\theta_2$ we have:
$$\int d\theta_1\,d\theta_2\,\theta_1 = \int d\theta_1\,d\theta_2\,\theta_2 = 0, \qquad \int\theta_i\,d\theta_j = 0\ (i \neq j), \qquad \int d\theta_1\,d\theta_2\,\theta_2\theta_1 = 1 \tag{7.7.3}$$
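The rules (7.7.1) to (7.7.3) can be tried out on a small model of a Grassmann algebra. The sketch below is only an illustration under stated assumptions: elements are stored as dictionaries mapping sorted tuples of generator labels to ordinary numerical coefficients, the Berezin integral over $\theta_i$ is implemented as the left derivative with respect to $\theta_i$, and a multiple integral is taken as nested integrals with the innermost (rightmost) differential first. The helper names `gmul`, `gadd`, `theta`, `const` and `berezin` are hypothetical.

```python
def gadd(a, b):
    """Sum of two Grassmann-algebra elements."""
    out = dict(a)
    for mono, c in b.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}


def gmul(a, b):
    """Product of two elements; generators anticommute and square to zero."""
    out = {}
    for ma, ca in a.items():
        for mb, cb in b.items():
            if set(ma) & set(mb):
                continue  # theta_i * theta_i = 0
            seq, sign = list(ma + mb), 1
            # Bubble sort the generator labels, flipping the sign per transposition.
            for i in range(len(seq)):
                for j in range(len(seq) - 1 - i):
                    if seq[j] > seq[j + 1]:
                        seq[j], seq[j + 1] = seq[j + 1], seq[j]
                        sign = -sign
            key = tuple(seq)
            out[key] = out.get(key, 0) + sign * ca * cb
    return {m: c for m, c in out.items() if c != 0}


def theta(i):
    """The generator theta_i."""
    return {(i,): 1}


def const(c):
    """An ordinary number viewed as an element of the algebra."""
    return {(): c}


def berezin(f, i):
    """Integral over d(theta_i), taken here as the left derivative d/d(theta_i)."""
    out = {}
    for mono, c in f.items():
        if i in mono:
            p = mono.index(i)
            rest = mono[:p] + mono[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return {m: c for m, c in out.items() if c != 0}


if __name__ == "__main__":
    f = gadd(const(3), gmul(theta(1), const(5)))   # f = 3 + 5*theta_1
    print(berezin(f, 1))                            # coefficient of theta_1
    print(berezin(berezin(gmul(theta(2), theta(1)), 2), 1))
```

With these conventions the nested integral of $\theta_2\theta_1$ over $d\theta_1\,d\theta_2$ equals one, in agreement with the normalization in (7.7.3); a different ordering convention for the differentials would flip that sign.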
Next, to make the definition of integration on the superspace meaningful, we have to ensure that the volume element
$$d^8z = d^4x\,d^2\theta\,d^2\bar\theta = d^4x\,d^4\theta \tag{7.7.4}$$
is invariant under translations. Now this is so because the superdeterminant of the vielbein $E_M{}^A$ (obtained by the coordinate transformations $z^M \to z'^M$) is 1 (see Sec. 6). We note that the volume elements $d^2\theta$ and $d^2\bar\theta$ in (7.7.4) satisfy:
(a) (b)
j(d29)d2= — J d0l dd1eABeAeB=
— J dQx dO2 (-26l02) = 1
j(d29)d2 = - - J de[de2eki6ddA = J dWeWW = 1
In view of (7.7.2) their integrals can also be expressed as
(a) 42
| d2d =~\d2 = - 1 ^ = -JD 2 = - j ^ Z ) ,
' Functions of 0can only be polynomials due to their nilpotency character.
(7.7.5)
\d2Q =-ld2^--dAdi=--D2
(b)
= --DAD:
(7.7.6)
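The latter two equalities in (7.7.6)(a) and (b) result from the fact that when the integral is over space-time, the derivatives $\partial_A$ and $\bar\partial_{\dot A}$ can be replaced by $D_A$ and $\bar D_{\dot A}$, as the second term in $D_A$ and $\bar D_{\dot A}$ is a total space-divergence and as such its volume integral is zero by the Gauss theorem. We also observe that since for any arbitrary superfield $\Phi$, the derivative superfields $\partial_A\Phi$ or $\bar\partial_{\dot A}\Phi$ contain terms in $\theta$ and $\bar\theta$ of order not higher than three, the integrals involving them over the superspace are always zero, thus we have:
(a) $\int d^4x\,d^2\theta\,d^2\bar\theta\ D_A\Phi = \int d^4x\,d^2\theta\,d^2\bar\theta\ \partial_A\Phi = 0$
(b) $\int d^4x\,d^2\theta\,d^2\bar\theta\ \bar D_{\dot A}\Phi = \int d^4x\,d^2\theta\,d^2\bar\theta\ \bar\partial_{\dot A}\Phi = 0$ (7.7.7)
Using these relations we can write the formula for 'integration by parts' for the product of superfields $D_A\Phi_1$ and $\Phi_2$ as:
$$\int d^4x\,d^4\theta\,(D_A\Phi_1)\,\Phi_2 = -\int d^4x\,d^4\theta\ \Phi_1 D_A\Phi_2, \qquad \int d^4x\,d^4\theta\,(\bar D_{\dot A}\Phi_1)\,\Phi_2 = -\int d^4x\,d^4\theta\ \Phi_1\bar D_{\dot A}\Phi_2 \tag{7.7.8}$$
7.2 Variation of a Superfield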
A Grassmann variant of the Dirac $\delta$-function is defined as:
$$\int f(\theta)\,\delta(\theta)\,d\theta = f(0) \tag{7.7.9}$$
which gives:
$$\delta(\theta) = \theta, \qquad \int\delta(\theta)\,d\theta = 1 \qquad\text{and}\qquad \int\theta\,\delta(\theta)\,d\theta = 0 \tag{7.7.10}$$
(7.7.11)
where d%z' = d4x'd29'd2 9' and 8\z -z') = 84(x- x')82(9- 9')82 (9 - 9'). Due to the anticommuting nature of 9 and 9, it is evident that 82(9)82(9) equals 92 92. Finally to obtain the variation 55 we have to define the 5-variation of a superfield. In the case of a simple (scalar) superfield <> / (<j>(z) = <jf(z')) we have by definition:
**y,'l:eL
= #(* - *')<52(0 - 9')8\9 - 9')
(7.7.12)
o
S o
^ j d\M) = | f (0(*. 9, 9))
8(j>(x, 9, 9) J
d(j>
(7.7.13)
In order to define the functional derivative with respect to a chiral superfield $\Phi$, we have to ensure that the $\delta$-derivative mentioned above maintains the defining condition $\bar D_{\dot A}\Phi = 0$, i.e., it satisfies:
$$\bar D_{\dot A}\,\frac{\delta\Phi(x, \theta, \bar\theta)}{\delta\Phi(x', \theta', \bar\theta')} = 0 \tag{7.7.14}$$
hence we define the variation of O as: 7)2
§<j)(x 8 6)
WPTWW)
m
_
X )SH
"T^" '
_
W)
" ~ *#<*"
( 15)
"
From earlier Sections (5, 6) we know that basically all Lagrangians of interest (e.g., those of Yang-Mills, gravity and strings) on the superspace can be formulated in terms of superfields of the type $\phi$ and $\Phi$, hence the variation rules given above can be applied to study the actions formed with these Lagrangians. An example of such an action is given below.
$$S = \int d^4x\,d^2\theta\,d^2\bar\theta\ \mathcal L(V, \ldots) \tag{7.7.16}$$
where V is a general superfield, $\Phi$ is a chiral multiplet and $W(\Phi)$ is the superpotential defined in Sec. 6.2. (See Exp. (7.7.3) given at the end of this section for illustrations of these discussions.) In conclusion we sum up the distinction between ordinary integration and the one we are dealing with here in the following remark.
Remark 7.7.1:
(1) While ordinary indefinite integration is the inverse of derivation, this integration over $\theta$ is the same thing as derivation.
(2) The normalization $\int\theta\,d\theta = 1$ implies that $\theta$ and $d\theta$ have opposite dimensionalities.
(3) The supersymmetry transformation law* $\theta \to \theta + \epsilon$ and $x^\mu \to x^\mu + i\bar\theta\tau^\mu\epsilon$ suggests that $\theta$ and $\epsilon$ have the same dimensionality whereas $x^\mu$ has the square of that dimensionality; thus if L (the dimension of length) denotes the $x^\mu$-dimensionality, the dimensionality of $\theta$ and $d\theta$ is $L^{1/2}$ and $L^{-1/2}$ respectively.
(4) From our discussion in (3), it follows that in the case of integration on the superspace, each (Bose) coordinate x increases the dimension of the superspace by one but each $\theta$ coordinate decreases it by one half.
The above remark explains why in supersymmetric quantum field theories, where integrals sum over both Bose and Fermi dimensions, the convergence improves, leading very often to finite results. We explain some of these ideas in the Hint to Exc. 2. Next we devote our attention (rather briefly) to the concept of superconformality.
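The dimension counting in the remark above can be written out as a small calculation. The sketch below assumes, as in the remark, that a Bose coordinate carries one power of length, that an anticommuting coordinate carries half a power, and that its differential carries the opposite power because $\int\theta\,d\theta$ is a pure number; the constant and function names are illustrative.

```python
from fractions import Fraction

DIM_X = Fraction(1)       # [x] = L^1
DIM_THETA = DIM_X / 2     # [theta] = L^(1/2), since x shifts by a term bilinear in theta and epsilon
DIM_DTHETA = -DIM_THETA   # [d theta] = L^(-1/2), so that the integral of theta d(theta) is dimensionless


def measure_dimension(n_bose: int, n_fermi: int) -> Fraction:
    """Power of L carried by a measure with n_bose factors dx and n_fermi factors d(theta)."""
    return n_bose * DIM_X + n_fermi * DIM_DTHETA


if __name__ == "__main__":
    # d^4x d^2(theta) d^2(theta-bar): four Bose and four Fermi differentials.
    print(measure_dimension(4, 4))
```

For the full superspace measure this gives two powers of length rather than four, which is the quantitative content of item (4).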
* Note that in the supersymmetry transformation of $x^\mu$ we are taking only one term $i\bar\theta\tau^\mu\epsilon = -i\epsilon\tau^\mu\bar\theta$ in place of the usual two terms $i(\theta\tau^\mu\bar\epsilon - \epsilon\tau^\mu\bar\theta)$, since we are considering only $\theta \to \theta + \epsilon$. (See the Hint to Exc. 7.5.6.)
7.3 Superconformal Transformations
In this subsection we describe in brief the concept of conformality on superspace. In Sec. 1.4 we already defined conformal transformations on the Minkowskian/Euclidean space (see (1.4.1)) and then went on to derive the conformal algebra in two dimensions by using the light-cone coordinates (see (1.4.21) and (1.4.25)-(1.4.26)). We would like to generalize the equality (1.4.21) to the present case. For this we note that in the case of superspace the supervielbeins play an important role; it is therefore natural to define a superconformal transformation as follows:
Definition 7.7.2: Consider a coordinate reparametrization on a general (8-dimensional) supermanifold $z^\Pi \to z'^\Pi = z^\Pi + \xi^\Pi(z)$, and let $E_\Pi{}^N$43 be the corresponding supervielbeins that transform under the supergeneral coordinate transformation as (see Exc. 6.5):
$$\delta E_\Pi{}^N = \xi^\Lambda\partial_\Lambda E_\Pi{}^N + \partial_\Pi\xi^\Lambda\,E_\Lambda{}^N \tag{7.7.17}$$
Then in some restricted sense the transformation44: EAK - > eLEA\
E*l -> eL* E%
(7.7.18)
where L (in the exponential $e^L$) is a general superfield, is called a superconformal transformation. It can be checked that the variation (7.7.17) is invariant under this transformation, and that it has the property of closure. We note that the above transformation (7.7.18) needs a simplification in order to write a superconformally invariant field theory. To this end, we let $Z = Z(z; \theta)$ and write down the most arbitrary transformation of this pair as:
$$\tilde Z(Z) = \big(\tilde z(z, \theta),\ \tilde\theta(z, \theta)\big) \tag{7.7.19}$$
and then impose constraints on the system to finally arrive at a superconformal transformation. Since the variables are just two in number we can write the supersymmetry derivative as:
$$D = \frac{\partial}{\partial\theta} + \theta\,\frac{\partial}{\partial z} \tag{7.7.20}$$
It can be checked that $D^2$ satisfies:
$$D^2 = \frac{\partial}{\partial z} \tag{7.7.21}$$
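The relation $D^2 = \partial/\partial z$ can be verified on a generic superfield $F(z, \theta) = f(z) + \theta\,g(z)$. The sketch below is an illustration under the assumption that f and g are polynomials in z, stored as coefficient lists `[c0, c1, c2, ...]`; the names `dz` and `D` are chosen here for convenience.

```python
def dz(p):
    """Derivative with respect to z of a polynomial given by its coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def D(F):
    """Apply D = d/d(theta) + theta * d/dz to F = f(z) + theta*g(z), stored as the pair (f, g).

    d/d(theta) of F is g, and theta * dF/dz is theta*f' (the theta*theta*g' piece vanishes),
    so D F = g + theta*f'.
    """
    f, g = F
    return (g, dz(f))


if __name__ == "__main__":
    F = ([1, 2, 3], [4, 5])   # f = 1 + 2z + 3z^2, g = 4 + 5z
    print(D(D(F)))            # should agree with (f', g')
    print((dz(F[0]), dz(F[1])))
```

Applying D twice returns the pair of z-derivatives, which is (7.7.21) in components.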
The reparametrization (7.7.19) then leads to the transformed supersymmetric derivative $\tilde D$ given by:
$$D = (D\tilde\theta)\,\tilde D + \big[D\tilde z - \tilde\theta\,D\tilde\theta\big]\,\tilde D^2 \tag{7.7.22}$$
We shall achieve our objective of defining a conformal transformation if the above relation yields a viable composition law. This, however, is not possible since (7.7.22) is a highly non-linear equation. To remove this non-linearity we constrain our transformation by choosing:
(a) $D = (D\tilde\theta)\,\tilde D$
(b) $D\tilde z - \tilde\theta\,D\tilde\theta = 0$ (7.7.23)
43 We used the index letter M in place of $\Pi$ and A in place of N in Sec. 6 (see Eq. (7.6.28)) and a minus sign in the reparametrization, hence the index N here transforms under the tangent space group, i.e. it is Lorentzian. The index $\Lambda$ is a superspace world index.
44 We have written this transformation for the N = 1 supergravity case.
All that is Super—an Introduction
343
Using these assumptions it can be shown that the composition law for two superconformal transformations $Z \to \tilde Z \to \tilde{\tilde Z}$ closes properly. In view of the above analysis it follows from (7.7.23) that a field transformation is superconformal if it satisfies:
• dZ
(7.7.24)
(b)
9y(y+)+
90F*(y+)45
(7.7.25) +
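We note that unlike the product of two chiral superfields, the product $\Phi^+\Phi$ is not a chiral superfield. We write this product explicitly in terms of A, $\psi$ and F for two different antichiral and chiral superfields $\Phi_i^+$ and $\Phi_j$:
$$\Phi_i^+\Phi_j = A_i^*(x)A_j(x) + \sqrt2\,\theta\psi_j(x)A_i^*(x) + \theta\theta A_i^*(x)F_j(x) + \sqrt2\,\bar\theta\bar\psi_i(x)A_j(x) + \bar\theta\bar\theta F_i^*(x)A_j(x)$$
$$+\ \theta^\alpha\bar\theta^{\dot\alpha}\left[i\tau^m_{\alpha\dot\alpha}\left(A_i^*\partial_m A_j - \partial_m A_i^* A_j\right) - 2\bar\psi_{i\dot\alpha}\psi_{j\alpha}\right]$$
$$+\ \theta\theta\bar\theta^{\dot\alpha}\left[\frac{i}{\sqrt2}\tau^m_{\alpha\dot\alpha}\left(A_i^*\partial_m\psi_j^\alpha - \partial_m A_i^*\psi_j^\alpha\right) - \sqrt2 F_j\bar\psi_{i\dot\alpha}\right]$$
$$+\ \bar\theta\bar\theta\theta^\alpha\left[-\frac{i}{\sqrt2}\tau^m_{\alpha\dot\alpha}\left(\bar\psi_i^{\dot\alpha}\partial_m A_j - \partial_m\bar\psi_i^{\dot\alpha}A_j\right) + \sqrt2 F_i^*\psi_{j\alpha}\right]$$
$$+\ \theta\theta\bar\theta\bar\theta\left[F_i^*F_j + \frac14\left(A_i^*\Box A_j + \Box A_i^* A_j\right) - \frac12\left(\partial_m A_i^*\partial^m A_j - i\partial_m\bar\psi_i\bar\tau^m\psi_j + i\bar\psi_i\bar\tau^m\partial_m\psi_j\right)\right] \qquad (7.7.26)$$
45. Where $y^+$ is the Hermitian conjugate of y, for instance $(y^m)^+ = x^m - i\theta\tau^m\bar\theta$. Note that the expression in (7.7.26) is written after using the Taylor's expansion (7.5.32) for $\Phi$.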
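We remark that the $\theta\theta\bar\theta\bar\theta$ component transforms under supersymmetry into a spacetime derivative, a fact which has important implications in writing a supersymmetric renormalizable Lagrangian. Recall that in (7.6.16), we wrote the Lagrangian in terms of chiral superfields $\Phi_i$, $\Phi_i^+$ by using only the $\theta\theta\bar\theta\bar\theta$ component of the product $\Phi^+\Phi$, as:
$$L = \Phi_i^+\Phi_i\big|_{\theta\theta\bar\theta\bar\theta\ \text{component}} + \left[\left(\tfrac12 m_{ij}\Phi_i\Phi_j + \tfrac13 g_{ijk}\Phi_i\Phi_j\Phi_k + \lambda_i\Phi_i\right)\Big|_{\theta\theta\ \text{component}} + \text{h.c.}\right] \qquad (7.7.27)$$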
Also, using the transformations (7.6.17) we showed there that this was the most general supersymmetric renormalizable Lagrangian. From our discussions in Sec. 6 and Sec. 7 and in the light of our above remark, this renormalizability character of L is quite apparent. This is more so when we express it as an integral. Using the Berezin integral and in particular (7.7.10) we can write L as an integral over the superspace, thus:
+ ^(ApVpfrSie)
+ A]jk^^jOk8(d))}d26d26
(7.7.28)
If instead of considering a set of interacting superfields to write the Lagrangian, we wanted to write just for one superfield, then L denoted as Lo would be: L o = J J
d26
(7.7.29)(a)
We call it the free-field part of the Lagrangian (7.7.27). In terms of component fields it becomes:
$$L_0 = A^*\Box A + i\partial_m\bar\psi\bar\tau^m\psi + F^*F + m\left(AF + A^*F^* - \tfrac12(\psi\psi + \bar\psi\bar\psi)\right) \qquad (7.7.29)(b)$$
Also, in view of identities established in Exc. (7.5.8), we can write (7.7.29)(a) as: Lo=
j|o+d>-|mfo^-O + O+^-O+lJjV40
(7.7.30)
The Euler-Lagrange equations are now obtained by varying $L_0$ according to the rules given in (7.7.12)-(7.7.15). Written in a matrix form they are:
_lp2
oV-i^DD
i yon
(7.7.31)
Since
$$\frac{\bar D^2 D^2}{16\Box}\Phi = \Phi \quad \text{if} \quad \bar D\Phi = 0 \qquad (7.7.32)$$
(i.e. $\bar D^2 D^2/16\Box$ is a projection operator on the chiral field $\Phi$), it follows that $\left(\frac{\bar D^2 D^2}{16\Box} - 1\right)\Phi = 0$; hence we can simplify the matrix form to obtain:
$$m\Phi - \tfrac14\bar D\bar D\Phi^+ = 0, \qquad m\Phi^+ - \tfrac14 DD\Phi = 0 \qquad (7.7.33)$$
These are the field equations for a massive scalar multiplet. Returning to (7.7.29)(a) we emphasize that due to integration rules all components of $\Phi^+\Phi$ except the $\theta\theta\bar\theta\bar\theta$ component vanish, and thus what we obtain here is the familiar kinetic energy term, along with the second term which corresponds to the potential. This term (as we already mentioned) is called the superpotential of the theory.
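Exercise 7.7
1. Show that for chiral superfields $\Phi_1$ and $\Phi_2$ the superspace integral satisfies:
$$\int d^2\theta\, d^4x\, \Phi_1\Phi_2 = -\int d^4\theta\, d^4x\, \Phi_1\frac{D^2}{4\Box}\Phi_2.$$
2. Show that for a non-relativistic superpoint particle described by m scalar superfields $X^1(t, \tau), \ldots, X^m(t, \tau)$, the supersymmetric action integral can be expressed as:
$$S = \frac12\int dt\,(\dot x^a\dot x^a + i\theta^a\dot\theta^a)$$
where $X^a(t, \tau) = x^a(t) + \theta^a(t)\tau$. Also obtain the equations of motion in this case.
3. Verify the equality
$$D^2 = \frac{\partial}{\partial z}$$
where
$$D = \frac{\partial}{\partial\theta} + \theta\frac{\partial}{\partial z}.$$
4. Establish (7.7.22).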
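Hints to Exercise 7.7
1. In an earlier section we have already seen that covariant derivatives satisfy (see Hint to Exc. 7.5.8):
(i) $D^2\bar D^2 D^2 = 16\,\partial^2 D^2$, $\quad \bar D^2 D^2\bar D^2 = 16\,\partial^2\bar D^2$
(we have written here $\partial^2$ in place of $\Box$).
In particular for a chiral superfield $\Phi$ ($\bar D\Phi = 0$), in view of (7.7.32):
(ii) $\bar D^2 D^2\Phi = 16\,\partial^2\Phi$,
so that
(iii) $\int d^2\theta\, d^4x\, \Phi_1\Phi_2 = \int d^2\theta\, d^4x\, \Phi_1\left(\frac{\bar D^2 D^2}{16\,\partial^2}\Phi_2\right)$.
In view of (7.7.6)(b) we now replace $-\frac14\bar D^2$ by $\int d^2\bar\theta$ to obtain:
$$\int d^2\theta\, d^4x\, \Phi_1\Phi_2 = -\int d^4\theta\, d^4x\, \Phi_1\frac{D^2}{4\,\partial^2}\Phi_2.$$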
2. The mathematical content of this exercise can be viewed as a generalization of the point-particle field theory of classical mechanics, where the space dimension is zero and the time dimension is 1, and the coordinates of a particle are scalar fields $x^1(t), \ldots, x^m(t)$ of the time variable t. In the case of a superpoint particle field theory, one considers scalar superfields $X^1(t, \tau), \ldots, X^m(t, \tau)$ in the (1,1) superspace with coordinates $(t, \tau)$ and of course uses all the set-up of supersymmetry theory (e.g., supersymmetry generators, resulting differential operators and supersymmetry algebra) to write the action integral. This approach is due to Friedan and Windey [8] and has been utilized since then with great advantage in string theory [2], [10]. In view of the discussions in Sec. 5, each scalar superfield $X^a(t, \tau)$ can be expressed as:
(i) $X^a(t, \tau) = x^a(t) + \theta^a(t)\tau$.
Note that $x^a$ and t are commuting elements whereas $\theta^a$ and $\tau$ are anticommuting. From (7.5.6)(a) we know that there are just two supersymmetry generators; we denote them as Q and H, thus (footnote 46):
(ii) $Q = i\tau\frac{\partial}{\partial t} - \frac{\partial}{\partial\tau}$, $\quad H = i\frac{\partial}{\partial t}$.
The corresponding superalgebra of left-supertranslations and time-translations can easily be checked to satisfy:
(iii) $\{Q, Q\} = 2Q^2 = -2H$, $\quad [Q, H] = [H, H] = 0$.
46. The sign of the term containing $\partial/\partial t$ differs from what we have in (7.5.6)(a).
Evidently the right-supertranslation operator is:
(iv) $D = i\tau\frac{\partial}{\partial t} + \frac{\partial}{\partial\tau}$.
This leads to:
(v) $\{D, D\} = 2D^2 = 2H$, $\quad [D, H] = 0$
and
(vi) $\{Q, D\} = 0$ (see Eq. (7.5.9)).
The last of these relations helps us to construct the action:
(vii)
S = _['* dt J dxL
where (viii)
L=
±DxaD(DXa).
We show next that the above action is supersymmetric. For this we establish that: (ix)
8S = j dtdxL = 'surface' term.
Under the supersymmetry transformation of the parameter $\varepsilon$ (see (7.5.16)), the infinitesimal variation of the superfields $X^a$ in this simple case is given as:
(x) $\delta X^a = \varepsilon Q X^a = \varepsilon\left[i\tau\frac{\partial}{\partial t} - \frac{\partial}{\partial\tau}\right]\left(x^a(t) + \theta^a(t)\tau\right)$.
Since $\varepsilon$ is Fermi like Q, from (vi) we have $[\varepsilon Q, D] = 0$. Using this we obtain $\delta DX^a = \varepsilon Q DX^a = D\varepsilon Q X^a = D\delta X^a$, which shows that D commutes with $\delta$, and therefore (quite appropriately) D is a covariant derivation. This implies that the variation in L is due to the variation in $X^a$; accordingly, in view of (x), we have:
(xi) $\delta L = \varepsilon Q L = \varepsilon\left(i\tau\frac{\partial}{\partial t} - \frac{\partial}{\partial\tau}\right)L$.
The first term on the RHS, being an exact time derivative, does not contribute to variations, while the second term $\partial L/\partial\tau$ is $\tau$ independent (as L is at most linear in $\tau$) and therefore, from our rule of integration on Fermi coordinates, its $\tau$-integral is zero. Hence the variation of S is as given in equality (ix), which confirms that the action principle is supersymmetric. Using (i) to write $\delta X^a$ in terms of components, as well as the LHS of (x), we have upon simplification:
$$\delta x^a + \delta\theta^a\tau = i\varepsilon\tau\dot x^a + \varepsilon\theta^a$$
(note that $i\varepsilon\tau\dot\theta^a\tau$ on the RHS of Eq. (x) is the product of four Fermi objects and therefore it is zero), which gives the variation in component form
$$\delta x^a = \varepsilon\theta^a \quad \text{and} \quad \delta\theta^a = i\varepsilon\dot x^a$$
and as such defines the supersymmetry transformation laws of the system in terms of components. We now use (iv) to write
$$DX^a = -\theta^a + i\dot x^a\tau \quad \text{and} \quad DDX^a = i(\dot x^a + \dot\theta^a\tau)$$
to express the action S in component form as: (xiii)
S= —ij dtj dt(-9a + ixaf) (xa + 9ax) = -—\dt\
dx[x(xaxa + iOa6a)]
= -j dtix"** + idaea) where we have used (7.7.1)(a) to arrive at the final form of 5 in (xiii). We note that the first term on the RHS is the kinetic energy term of an ordinary point particle with mass m = 1; the second term — 0a6a is a kinetic piece due to superpoint particle's Grassman degrees of freedom. The equations of motion are obtained by varying the action S in (xiii), these are easily seen to be: xa = 0,
9a=0.
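3. We simplify the product $D^2$ of the operator D, taking into consideration that $\theta$ and z are independent variables, and that $\theta$ is an anticommuting variable. The first one gives the relation between their derivational operators as:
$$\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial z}\right) = \frac{\partial}{\partial z}\left(\frac{\partial}{\partial\theta}\right)$$
and the second implies
$$\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta}\right) = 0.$$
Hence writing the product term by term we have:
(i) $D^2 = \frac{\partial}{\partial\theta}\frac{\partial}{\partial\theta} + \frac{\partial}{\partial\theta}\left(\theta\frac{\partial}{\partial z}\right) + \theta\frac{\partial}{\partial z}\frac{\partial}{\partial\theta} + \theta\frac{\partial}{\partial z}\theta\frac{\partial}{\partial z}$.
The first and last terms vanish, and since $\frac{\partial}{\partial\theta}\left(\theta\frac{\partial}{\partial z}\right) = \frac{\partial}{\partial z} - \theta\frac{\partial}{\partial\theta}\frac{\partial}{\partial z}$, the $\theta$-dependent part of the second term cancels the third. In view of the above, only the $\partial/\partial z$ piece of the second term on the RHS survives. Hence we have:
(ii) $D^2 = \frac{\partial}{\partial z}$.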
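4. To verify the equality
$$D = (D\tilde\theta)\tilde D + [D\tilde z - \tilde\theta D\tilde\theta]\tilde D^2$$
we note that
$$\tilde D = \frac{\partial}{\partial\tilde\theta} + \tilde\theta\frac{\partial}{\partial\tilde z} \quad \text{and} \quad \tilde D^2 = \frac{\partial}{\partial\tilde z}.$$
We now express the operator D on the LHS in terms of $\partial/\partial\tilde\theta$, $\partial/\partial\tilde z$ and solve the RHS as it is.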
This gives
[d6
dz)
[do dz
(i)
de de)
{dz dz
RHS m(§ + ef}[*+9*) {dO
dz ){d9
+\§ dz)
[90
dz dej
+ e?L-J§ dz
[d8
+
ef)]±
dz ) \ dz
dd d .de 2 d +ftde d ,ftdd 2 d d9 d6 de dz dz de dz dz
[d6 dz
dz dz
d6 dz
dz dz j
The second and fourth terms on the RHS cancel with the seventh and eighth terms on the RHS and the remaining four are easily seen to be the same as on the LHS, hence the equality holds.
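As a quick sanity check of the operator identities used in Hints 2 to 4, the following sketch represents a superfield with a single odd coordinate by its two component functions and applies the operators directly. It is only an illustration under simplifying assumptions: the components are taken to be ordinary (commuting) polynomials stored as coefficient lists, the odd coordinate is written to the left of its coefficient, and the function names (`D_z`, `d_z`, `Q`, `D_t`, `H`, `anticommutator`) are ours, not the book's.

```python
# A superfield with one odd coordinate is stored as a pair (a, b), meaning
# a + (odd coordinate) * b.  Each component is a polynomial given by its
# coefficient list: p[k] multiplies the k-th power of the even variable.
# Assumption: the components are ordinary commuting polynomials.

def deriv(p):
    """Derivative of a polynomial with respect to the even variable."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def scale(c, p):
    """Multiply every coefficient of p by the number c."""
    return [c * x for x in p]

def add(p, q):
    """Sum of two polynomials of possibly different length."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]

def trim(p):
    """Drop trailing zeros so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def same(f, g):
    """True if two superfields agree component by component."""
    return trim(f[0]) == trim(g[0]) and trim(f[1]) == trim(g[1])

def D_z(f):
    """D = d/dtheta + theta d/dz acting on a(z) + theta b(z)."""
    a, b = f
    return (b, deriv(a))

def d_z(f):
    """Plain d/dz acting on both components."""
    a, b = f
    return (deriv(a), deriv(b))

def Q(f):
    """Q = i tau d/dt - d/dtau acting on a(t) + tau b(t)."""
    a, b = f
    return (scale(-1, b), scale(1j, deriv(a)))

def D_t(f):
    """D = i tau d/dt + d/dtau acting on a(t) + tau b(t)."""
    a, b = f
    return (b, scale(1j, deriv(a)))

def H(f):
    """H = i d/dt acting on both components."""
    a, b = f
    return (scale(1j, deriv(a)), scale(1j, deriv(b)))

def anticommutator(A, B, f):
    """(AB + BA) applied to the superfield f."""
    x, y = A(B(f)), B(A(f))
    return (add(x[0], y[0]), add(x[1], y[1]))

f = ([1, 2, 3], [4, 5])  # a = 1 + 2u + 3u^2, b = 4 + 5u (u = z or t)
minus_H_f = tuple(scale(-1, c) for c in H(f))
zero = ([0], [0])

checks = {
    "D^2 = d/dz": same(D_z(D_z(f)), d_z(f)),
    "Q^2 = -H": same(Q(Q(f)), minus_H_f),
    "D^2 = H": same(D_t(D_t(f)), H(f)),
    "{Q, D} = 0": same(anticommutator(Q, D_t, f), zero),
}
print(checks)
```

Each entry of `checks` should come out `True` for any choice of the polynomial components; a single test superfield is of course not a proof, only a consistency check on the signs in (ii), (iv) and (7.7.20).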
APPENDIX 7A
A.0 Notations and Pauli Matrices
We list below the properties of Pauli matrices that are extensively used in transforming the 4-component spinors into 2-component ones and conversely. Our notations are mostly based on West's book [21] with a few changes to suit our thinking. Usually the Latin capitals $A$, $\dot A$ are used to denote the 2-component spinors $\chi_A$, $\bar\chi_{\dot A}$, etc., belonging to the $(\tfrac12, 0)$, $(0, \tfrac12)$ representations of the Lorentz group, while the Greek indices $\alpha, \beta, \dot\alpha, \dot\beta$ are used for 4-component spinors. The lower case Latin indices, or Greek indices $\mu, \nu$, denote the space-time; thus, for instance, the Minkowskian metric (-1, 1, 1, 1) is $\eta_{mn}$ or $\eta_{\mu\nu}$ where m, n or $\mu, \nu$ take the values 0, 1, 2, 3. The $\epsilon$-symbols used in Lorentz transformations are the invariant tensors:
$$\epsilon^{0123} = -\epsilon_{0123} = 1 \qquad (A.1)$$
and eAB = ^B = -eAB=-eAB,
el2 = +1
(A.2)
The summation amongst these tensors is governed by the rule: eCBeBA
= - £ A B e C B = -dAc
(A.3)
which emphasizes their universality of usage.47 Using these we can raise and lower the indices of spinors as follows: XA = XBeBA,
(a)
XA=eABXB
(A.4) Consider the triplet: T = (T 1 , T 2 , T 3 ) formed by Pauli matrices. It can be checked that the matrices defined by (T'")AB
= (I,
r)AB
(A.5)
are invariant tensors under the Lorentz group. Note that using the tensors eAC, e • • we can lower the indices of (x'")AB. On the other hand, taking the complex conjugate of these matrices and using the defining relation of dotted indices (see Ftn. 15) we have:
(A.6) But Pauli matrices are self-conjugate, hence we have: (O, 47
B
= (T m ) B A
(A.7) f 0
In view of the fact that for A, B = 1, 2, eAB = -eBA and en = 1 we also use the matrix: ^ = this tensor. ^
A to denote '
where
$$(\bar\tau^m)^{\dot A A} = (-1, \tau)^{\dot A A} \qquad (A.8)$$
Using the metric $\eta_{mn}$ it can also be checked that
$$(\tau_m) = (-1, \tau) \quad \text{and} \quad (\bar\tau_m) = (1, \tau) \qquad (A.9)$$
These matrices with tensorial notations satisfy the following relations and identities:
(a)
( O ^ ( T , , ) ^ = 5 n m ^ + (T n "y c
(b)
( T " ) * (*«/* = «;«£ + (?;**
(C)
(T"')^(T")^ C = T]mn8A + ( T m Y c
(d)
\emnpg(Tpi)AB=i(Tmn)Ah
(a)
(r>")A"(Tm)dD
(b)
(Tm)AB(r»>)c* = +2eACe™
(A.10)
= 28AD8Bc (AM)
where the doubly indexed matrices are: (a)
iK)Ac=\i^n-^m)Ac
(b)
( T ? ) / = }(T m T n -T n T m )/
(c)
(T m Y c = | (T'^(f m )^ - Xmch (T"fA)
(A.12)
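The symmetric part of the product relations above can be checked numerically. The sketch below is a minimal illustration, assuming the conventions $\tau^m = (I, \tau)$ of (A.5), $\bar\tau^m = (-I, \tau)$ of (A.8) and the metric (-1, 1, 1, 1); it treats the matrices as plain 2x2 arrays and ignores the dotted/undotted index placement, so it only tests the relation $\tau^m\bar\tau^n + \tau^n\bar\tau^m = 2\eta^{mn}I$. All names in the code are ours.

```python
# 2x2 complex matrices as nested lists; no external libraries are needed.
I2 = [[1, 0], [0, 1]]
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def madd(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mscale(c, a):
    """Multiply a 2x2 matrix by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

ETA = [-1, 1, 1, 1]                      # diagonal of the metric (-1, 1, 1, 1)
TAU = [I2, T1, T2, T3]                   # assumed tau^m = (I, tau)
TAU_BAR = [mscale(-1, I2), T1, T2, T3]   # assumed taubar^m = (-I, tau)

def symmetric_relation_holds():
    """Check tau^m taubar^n + tau^n taubar^m = 2 eta^{mn} I for all m, n."""
    for m in range(4):
        for n in range(4):
            lhs = madd(matmul(TAU[m], TAU_BAR[n]), matmul(TAU[n], TAU_BAR[m]))
            eta_mn = ETA[m] if m == n else 0
            if lhs != mscale(2 * eta_mn, I2):
                return False
    return True

def lowered(mats):
    """Lower the spacetime index with the diagonal metric."""
    return [mscale(ETA[m], mats[m]) for m in range(4)]

print(symmetric_relation_holds())
```

With the same assumptions, `lowered(TAU)` reproduces the list $(-I, \tau)$ and `lowered(TAU_BAR)` the list $(I, \tau)$, in agreement with (A.9).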
A.1 Standard Bases and Components of a Supervector Every supervector space having finite total dimension has a basis that is both pure and real (see Sec. 4). But very often it is convenient to work with pure bases for which the c-type basis supervectors are real and the a-type supervectors are pure imaginary. Such bases, called the standard bases, can be constructed from pure real bases by multiplying all the a-type basis supervectors by i. A standard basis is characterized by: f* = (-1)' ,« Thus if X is a real supervector with components X' with regard to a standard basis, then A"> = X = X* = f* X '• = (-1)' ,eX'*
(A. 13) (A. 14)
If X is c-type, then X' is c-type or a-type according as the index i is c-type or a-type. When X is a-type, the type association is reversed and we therefore have (see Eq. (7.4.23) for exponents of (-1)): X'>=
( - l ) i x Xu ,e
(A. 15)
which implies Xi* = (-lfXi
(X real)
(A.16)
If X is a real c-type supervector, then all its components, with respect to a standard basis, are seen to be real. Conventionally, if X is a c-type supervector, then 'X def X1
(A.17)
A.2 Contravariant Vector-fields on Supermanifold M Let !F(M) denote the set of all scalar fields over a supermanifold M, i.e., the set of all differential mappings / : M —> AM. We note that for f,ge ?{M), as A M and p & M, (f+ g) (p) def f(p) + g(p), (af)(p) def a[f(p)], (fa)(p) def [f(p)]a and f*(p) def [/(/?)]*, hence J(M) is a sup~ervector space. Clearly if m * 0 in (m, n)-the dimensionality of M, the set!F(M) is infinite-dimensional and thus has no basis of finite total dimension. However, one can always construct a subsupervector space of T(M) by choosing a finite set {eA} of linearly independent scalar fields and use it as a basis of this space. We note that J(M), in addition to having the supervector space structure, also has the property: (fg)(p) def [f(p)][g(p)]
(A.18)
This property can be used to construct more complicated structures with the help of J(M). Let {eA} be a (p, ^-dimensional subbasis of 7{M), and let F be a differentiate mapping from the subspace x AeA{M) of C^ x Cqa to A M . Every such mapping defines a scalar field F given by F(p) def F(e(p)),
for all p in M
(A.19)
A mapping X from J(M) to itself written as: X(f) def X /,
for all / i n ^(Af)
(A.20)
is called a contravariant vector field over M if it satisfies the chain rule (see Subsec. 4.2): (XF)(p)=[(XeA)(p)]\^TF(y)) idy \y = e(P) for all differentiable mappings F: xAeA(M) T(M), and for all p in M.
(A.21)
—> A M , for all pure finite-dimensional subbases [eA] of
A.3 Super Lie Groups
In Subsec. 5 of (7.4) we mentioned that a supergroup can be constructed once its Lie algebra is known. We pursue this approach in the case of a super Lie group. To begin with, we remind the reader that if the postulates of Def. (2.1.1) of a group G are supplemented by the following two postulates:
(i) G is a supermanifold whose points are the group elements, and
(ii) the binary operation (multiplication mapping) denoted F is differentiable,
then G is a super Lie group. Just as in the case of ordinary Lie groups, the super Lie group G possesses the following features:
(a) the left and right translations $x_L$ and $x_R$ for all points x in G; their derivative mappings denoted $x'_L$ and $x'_R$ (known as left and right draggings);
(b) the left- and right-invariant vector fields (i.e., vector fields which are invariant under left and right draggings respectively) denoted $X_L$ and $X_R$ which satisfy $X_{Le} = X_{Re}$ (see the note marked * below);
(c) two distinguished super Lie algebras formed by the sets $X_L(G)$ and $X_R(G)$ (isomorphic to $T_e(G)$, the tangent space at the identity e of G) of all left and right invariant vector fields; the sets have a supervector space structure given by a bracket operation that satisfies the super Jacobi identity, and the Lie brackets formed by $X_L$ and $X_R$ are related as $[X_L, Y_L]_e = -[X_R, Y_R]_e$;
(d) the left- and right-invariant local frame fields;
(e) the left and right auxiliary functions (see Hints to Exc. 4 of (2.5) for these functions on a Lie group) that follow from left- and right-invariant vector fields and their derivatives.
It can be shown that a knowledge of either of the auxiliary functions is sufficient to determine the multiplication mapping and thus the group G itself. Now in a canonical coordinate system, the auxiliary functions are completely determined by the structure constants, while the structure constants are defined by the super Lie algebra (footnote 48); hence the group G is completely determined in a canonical chart by its own super Lie algebra. This establishes the claim made in (7.4).
A.4 Conventional Super Lie Groups
A super Lie group G is called conventional if a standard basis [ea] can be introduced in Te(G) with respect to which the souls of all the structure constants vanish. In such a basis, which is also called conventional, the only nonvanishing structure constants are of the type c^a, c^, c^ or c ^ , the first three here represent ordinary real numbers and the fourth ordinary imaginary numbers. A super Lie algebra is called conventional if it is the super Lie algebra of some conventional super Lie group. The structure of a conventional super Lie algebra can be fully determined by using only real vectors in Te(G). The components of these in a conventional basis have vanishing souls. One of the simplest example of super Lie group is the group formed by all nonsingular (m, n) x (m, n) matrices M with elements xab having the reality and type properties of the components of a real c-type rank (1,1) tensor. The group is evidently of dimension (m2 + n2, 2mn) (see Sec. 1).
A.5 Exponential Mapping
Let $E_e(G)$ denote the subspace of $T_e(G)$ consisting of all the real c-type contravariant vectors at the identity element e of G. For all s in $R_c$ and all X in $E_e(G)$, the mapping defined as
$$\exp(sX) \overset{\text{def}}{=} x_X(s) \qquad (A.22)$$
is called the exponential mapping from $E_e(G)$ to G and is denoted 'exp.' The mapping helps to define canonical coordinates $e^a \exp^{-1}(x)$ in G, where $\{e^a\}$ is dual to the basis $\{e_a\}$ of $T_e(G)$.
48. Let $\{e_a\}$ be a basis for $T_e(G)$; then the sets $\{e_{aL}\}$ and $\{e_{aR}\}$ are bases for the super Lie algebras $X_L(G)$ and $X_R(G)$, hence there exists a set of constant supernumbers $c^c{}_{ab}$ such that $[e_{aR}, e_{bR}] = e_{cR}\,c^c{}_{ab}$ or equivalently $[e_{aL}, e_{bL}] = -e_{cL}\,c^c{}_{ab}$ or $[e_a, e_b] = e_c\,c^c{}_{ab}$. It is important to note here that a super Lie algebra is not the same as a Lie superalgebra (see Sec. 4, and [12]).
* Suffix e to the vector field $X_L$ etc. indicates that this vector is at $e \in G$.
A.6 Conventions on Structure Constants
In the case of a conventional super Lie algebra, these can be determined by considering only real vectors in $T_e(G)$ with vanishing souls. Thus if X is such a vector and if it is c-type, then its nonvanishing components in the conventional basis are ordinary real numbers belonging to the set $\{X^a\}$. If X is a-type, then its nonvanishing components are ordinary imaginary numbers that belong to the set $\{X^\alpha\}$. Hence such vectors belong to the subspace $\mathbb{R}^m \oplus (i\mathbb{R}^n)$ of $T_e(G)$. Mathematicians (quite often) replace this subspace $\mathbb{R}^m \oplus i\mathbb{R}^n$ by $\mathbb{R}^m \oplus \mathbb{R}^n$ and apparently modify the bracket operation by multiplying the structure constants $c^a{}_{\alpha\beta}$ by i so that they become real. (See [4] for details on Subsec. A.1-A.6.)
References
1. M. Atiyah, R. Bott and A. Shapiro, Clifford Modules, Topology 3 (Supp. 2) (1964), 3-38.
2. L. Castellani, et al., Supergravity and Superstrings, A Geometric Perspective (New Jersey: World Scientific, 1991).
3. S. Coleman and J. Mandula, All possible symmetries of the S matrix, Phys. Rev. 159 (1967), 1251.
4. B. DeWitt, Supermanifolds (2nd ed., Cambridge University Press, 1992).
5. S. Ferrara, P. van Nieuwenhuizen and B. DeWitt (ed.), Supergravity '81 (Trieste) (Cambridge: Cambridge University Press, 1982).
6. S. Ferrara, P. van Nieuwenhuizen and B. DeWitt (ed.), Supersymmetry and Supergravity '82 (Trieste) (Singapore: World Scientific, 1983).
7. P. G. O. Freund, Introduction to Supersymmetry (Cambridge: Cambridge University Press, 1986).
8. D. Friedan and P. Windey, Supersymmetric derivation of the Atiyah-Singer index and the chiral anomaly, Nucl. Phys. B235 (1984), 395-416.
9. C. Fronsdal (ed.), Essays on Supersymmetry (Boston: Kluwer Academic Publishers, 1986).
10. M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory (Vol. I, II, Cambridge: Cambridge University Press, 1987).
11. M. T. Grisaru, W. Siegel and M. Roček, Improved methods for supergraphs, Nucl. Phys. B159 (1979), 429-450.
12. (a) V. G. Kac, Classification of simple Lie superalgebras, Funct. Anal. Appl. 9 (1975), 263-265; (b) V. G. Kac, A sketch of Lie superalgebra theory, Commun. Math. Phys. 53 (1977), 31-64.
13. I. Kaplansky, Superalgebras, Pacific J. Math. 86 (1980), 93-98.
14. H. B. Lawson, Jr. and M.-L. Michelsohn, Spin Geometry (New Jersey: Princeton University Press, 1989).
15. P. K. Mohapatra, R. N. Mohapatra and P. Pal, Z 4 symmetry and force generation, Vol. 34, Phys. Rev. D (1986), 231-234.
16. R. N. Mohapatra, Unification and Supersymmetry (2nd ed., New York: Springer-Verlag, 1992).
17. W. Nahm, Supersymmetries and their representations, Nucl. Phys. B135 (1978), 149-166.
18. A. Salam and J. Strathdee, Superfields and Fermi-Bose symmetry, Phys. Rev., Vol. 11, No. 6 (1975).
19. J. Wess and J. Bagger, Supersymmetry and Supergravity (2nd ed., Princeton: Princeton University Press, 1983).
20. J. Wess and B. Zumino, Supergauge invariant extension of quantum electrodynamics, Nucl. Phys. B78 (1974), 1.
21. P. C. West, Introduction to Supersymmetry and Supergravity (New Jersey: World Scientific, 1990).
22. N. Kamaran and P. J. Olver, 5.[19].
CHAPTER 8
GRAVITATION, RELATIVITY AND BLACK HOLES
1 GRAVITATION (FROM NEWTON TO EINSTEIN) AND AN OVERVIEW OF SPECIAL RELATIVITY
In this chapter we describe in brief the two theories, Gravitation and Relativity (the Special and the General), which have led to spectacular findings of this century such as black holes, and devote this section in particular to the gravitational principles formulated by Newton and Einstein with geometry as a prime tool. These theories as we see them today are quite different from the perception of the world that our ancestors had. They thought that the Earth was stationary and was at the centre of the universe and that the sun, the moon, the planets and the stars were moving in circular orbits around the Earth (Aristotle, 340 BC). It is worth mentioning here that these ancient philosophers (the seekers of truth) were more interested in solving the puzzles related to distant objects, e.g., the stars and the planets, than to the nearby objects around them. As a result they conceptualized the physics of celestial bodies even before writing the rules of geometry. However, when Newton wrote his 'Philosophiae Naturalis Principia Mathematica' (1687), substantial changes had already occurred, for example: (i) (the Euclidean) geometry was already a well established discipline when Newton used it to describe the physical laws, (ii) the Aristotelian model of the universe, formed solely on the basis of its "absoluteness", had been replaced by the Copernican model, where the Earth was no longer stationary but was moving around the sun. The Copernican model was confirmed by Kepler and Galilei through observations, though they were unable to provide the reasoning as to why the orbits of the Earth and the planets were elliptic and not circular. It was Newton who (for the first time) put forward a theory for describing the motion of bodies in space and thus showed why the orbits of the Earth and the planets were elliptic. To formulate this theory he also developed the complicated mathematics which was required to analyze these motions; this is our familiar calculus.
Newton's work, 'Principia Mathematica' [29] in which he laid the foundations of "gravitation theory," is considered to date, the single most important publication in physical sciences. Another gigantic contribution to the theory was made by Einstein some 225 years later in the form of General Relativity. In fact the theory of relativity, with Newton's work on gravitation as the base-line, forms the cornerstone of our present knowledge on 'gravity in the universe'—the theoretical as well as the experimental. We have devoted this section to examine the differences and similarities between the theories proposed by Newton and Einstein, namely Newton's theory of gravitation and Einstein's theories of special and general relativity leading to gravitational theory.1 '
1. In order to do this, we shall be using the two concepts interchangeably.
Gravitation, Relativity and Black Holes 357
1.1 Newton's Theory of Gravitation and his Famous Laws
We recall that Newton formulated his theory on three simple premises:
(i) Newton's first law of motion: A body remains at rest or, if in motion, it remains in uniform motion with constant speed in a straight line, unless it is acted on by an unbalanced external force.
(ii) Newton's second law of motion: The acceleration produced by an unbalanced force acting on a body is proportional to the magnitude of the net force (resultant force), in the same direction as the force, and inversely proportional to the mass of the body; thus if a denotes the acceleration, m the mass and $F_{\text{net}}$ the net force (whose direction is the same as that of a), then $a \propto F_{\text{net}}/m$.
(iii) Newton's third law (action and reaction): Whenever one body exerts a force upon a second body, the second body exerts a force upon the first body; these forces are equal in magnitude and are oppositely directed.
Although in none of these laws Newton explicitly uses the word inertia, in essence this underlies his first law. For him 'inertia' was a property of objects that described their tendency to maintain their state of motion, whether of rest or of constant velocity; in other words, according to Newton objects obeyed the 'Law of Inertia.' We shall soon see, while using the derived word 'inertial' as an adjective in Einstein's theory, that this 'Law of Inertia' has actually been turned around for the purposes of theory there. Using the above three laws, Newton postulated his 'law of universal gravitation' (footnote 2) that reads as: Every particle in the universe attracts every other particle with a force that is directly proportional to the product of the masses of the two particles and inversely proportional to the square of the distance between them. Expressed as an equation this becomes (footnote 3):
$$F = G\,\frac{m\,m'}{r^2}$$
where G is a constant of proportionality, and m and m' are the masses of two bodies separated by a distance r. Newton used the space $\mathbb{R}^3 \times \mathbb{R}$ to describe his theory. Thus an event (according to Newton) could be specified position-wise by a point belonging to 3-dimensional Euclidean space $\mathbb{R}^3$ and time-wise by a point on the line $\mathbb{R}$. The space and time are disjointed entities here, and the symmetry group of the theory is the Galilean group (footnote 3A).
2. It is worth noting here that Newton used Kepler's laws of (planetary) motion as well as his own astronomical observations to establish his law of gravitation.
3. It is assumed that m, m' are very small as compared to their separation distance r.
3A. It is worth noting here that it was Galileo Galilei who formulated the first known Principle of Relativity (see Chapter 3 in [39]). Using uncanny test objects such as fish in large bowls of water, flies, and bottles dripping drops of water, he showed that physics looks the same in a ship moving uniformly as in a ship which is at rest. Einstein replaced these sea-going ships by spaceships for his Special Relativity theory. Thus it is fair to say that Galileo had already conceptualized the notion of Inertial Frame.
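The inverse-square law stated above is easy to explore numerically. The following is a minimal sketch, not part of the original text: the function name `gravitational_force` is ours, and the value of G and the Earth and Moon figures are rounded reference values inserted purely for illustration.

```python
G = 6.674e-11  # gravitational constant in N m^2 kg^-2 (rounded, assumed value)

def gravitational_force(m1, m2, r):
    """Magnitude of the Newtonian attraction F = G m1 m2 / r^2.

    m1, m2 are masses in kilograms and r is the separation in meters.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    return G * m1 * m2 / r ** 2

# Illustrative (rounded) values for the Earth-Moon system.
EARTH_MASS = 5.972e24    # kg
MOON_MASS = 7.342e22     # kg
EARTH_MOON_DISTANCE = 3.844e8  # m

force_earth_moon = gravitational_force(EARTH_MASS, MOON_MASS, EARTH_MOON_DISTANCE)
print(f"Earth-Moon attraction: {force_earth_moon:.3e} N")

# Doubling the separation reduces the force to one quarter.
ratio = gravitational_force(1.0, 1.0, 2.0) / gravitational_force(1.0, 1.0, 1.0)
print(f"F(2r) / F(r) = {ratio}")
```

With these rounded inputs the Earth-Moon attraction comes out at roughly 2 x 10^20 N, and the last line illustrates the inverse-square dependence on the distance.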
The time ordering is a basic part of this theory. Thus for any two events A and B, it is possible to say that either A precedes B or B precedes A or they are simultaneous. The consistency in this order demands that 'simultaneity' be an 'equivalence relation,' which in turn requires that spacetime be divided into equivalence classes of mutually simultaneous events with each class representing the universe at a given time. The following figure describes this idea.
[Figure: Stratification of Newtonian spacetime, showing the planes of absolute simultaneity, a geodesic, a non-geodesic, and the temporal interval between p and q.]
In short, Newton, like his predecessors, thought of 'gravity' as a force acting through space that was (quite) mysterious and had to be explained. The concept of gravitational force in this form continued until Einstein proposed the revolutionary idea of 'eliminating gravity' in order to explain the 'force of gravity' (footnote 4).
1.2 Einstein's Proposal: the Free-float Frame and the Observer
According to Einstein 'gravitation' was not a foreign force transmitted through space and time; instead it was a force that manifested itself in the curvature of spacetime. He considered space and time on an equal footing and used Minkowskian space, and later on Riemannian space, to describe his theory. We shall describe in brief the underlying ideas of this theory by actually defining and explaining the terms that are essential ingredients of its scientific interpretation, namely: the free-float (inertial) frame, the test particle, an event, readings on synchronized clocks, and the observer's relative accelerations.
Definition 8.1.1: A reference frame is said to be an inertial or free-float or a Lorentz reference frame in a certain region of space and time when, throughout that region of spacetime and within some specified accuracy, every free test particle (see Def. (8.1.3)) initially at rest (in motion) with respect to that frame remains at rest (continues its motion with no change in speed or direction).
We note here how the "Law of Inertia" has been turned around; for a reference frame to be inertial, it is required that observers in that frame demonstrate that every free particle in that frame maintains its initial state of motion or that of rest. One can thus say that a free-float frame is defined by (can be identified with) Newton's first law of motion. We further remark:
Remark 8.1.2: A free-float frame is "local" in the sense that it is limited in space and time, and also "local" in the sense that its free-float character can be determined locally from within. We note that a free-float frame has been obtained by assuming the existence of a room so small that no effect of gravitation is felt there, but as soon as this condition of smallness is relaxed, the relative accelerations produced by different external factors come into play, and the 'state' of motion of a free particle guaranteed by an inertial frame remains no longer valid.
4. To disregard gravity he used the notion of an unpowered spaceship or a freely falling room.
The tidal waves observed in the ocean are easy examples of this phenomenon. These waves are the result of the gravitational pull of the sun and the moon on water particles. It is not possible to find a frame large enough that would include all these particles and be a free-float frame. Evidently, there would be many free-float frames required for it. It is only general relativity, the theory of gravitation propounded by Einstein, which tells us how to describe and predict orbits that traverse a string of adjacent free-float frames. Thus General Relativity is the only theory that provides the means to describe motion in unlimited regions of spacetime.
Definition 8.1.3: A small particle is called a test particle if its mass is so little that, within some specified accuracy, its presence does not affect the motion of other nearby particles.
Remark 8.1.4: A particle made of any material can be used as a test particle to determine whether a given reference frame is free-float. A frame that is free-float for a test particle is free-float for test particles of all kinds.
In Einstein's theory an event is specified by a place as well as a time. The place and time of its occurrence in a given free-float frame is determined with the help of 'synchronized clocks' (footnote 5) on a lattice constructed in this frame. One of these clocks is taken as the reference clock. It is set at time zero. A flash of light sent from here, which spreads out as a spherical wave in all directions, is supposed to reach a clock, say, ten meters away in ten meters of light-travel time. In other words, a clock at a distance of ten meters records ten meters of light-travel time. The space position of the event is taken to be the location of the clock nearest to the event and the time of the event is taken as the time recorded on this clock. In fact these 'recording' clocks read into their memory the nature of the event as well (e.g., collision, passage of light-flash or particle), besides giving the time and the location.
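Measuring time in meters of light-travel time, as in the clock construction just described, amounts to a simple unit conversion. The sketch below is only an illustration: the function names are ours, and the one physical input is the SI-defined speed of light.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second (exact by SI definition)

def meters_to_seconds(light_travel_meters):
    """Convert a time given in meters of light-travel time to seconds."""
    return light_travel_meters / SPEED_OF_LIGHT

def seconds_to_meters(seconds):
    """Convert a time in seconds to meters of light-travel time."""
    return seconds * SPEED_OF_LIGHT

# A clock ten meters from the reference clock records ten meters of time
# when the reference flash arrives.
ten_meters_in_seconds = meters_to_seconds(10.0)
print(f"10 m of light-travel time = {ten_meters_in_seconds:.4e} s")
```

Ten meters of light-travel time is thus about 33 nanoseconds, which is why the distinction between "when an event happened" and "when it was seen" only matters for widely separated events.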
A natural question that follows is: how is this information collected and who collects it? We answer this question next. The information from recording clocks is collected by the so-called 'observer,' who need not be a human being. In the following definition we state precisely what the term stands for, and then explain how the information is collected.
Definition 8.1.5: In the theory of relativity, the word observer is shorthand for the whole collection of recording clocks associated with one free-float frame. An 'observer' can be viewed as a person who goes around reading out the memories of all recording clocks under his control. The location and the time of each event are recorded by the clock nearest that event.
Owing to the importance of the free-float (inertial) frame, in the above definition the word observer is often preceded by the word "inertial." We note that the observer does not report on widely separated events that he (she) views with his (her) own eyes, for such a report can put the events in the wrong order. A mistake of this nature can happen because of the travel time of light: for instance, the light from an 'event' that occurred a million years ago at a distance of a million light-years in our frame may just be entering the observer's eye after the light from an 'event' that occurred on the moon a few seconds ago.
1.3 Acceleration and Spacetime Curvature
Having defined Newton's laws and a few useful terms used in Einstein's theory, we give in the following remark the difference between the concept of acceleration and gravitation in the two theories.
⁵ Synchronized clocks: In a given inertial frame a lattice is constructed. At every intersecting point of this lattice identical clocks, whose readings are in meters of light-travel time, are fixed. All these clocks read the "same time" as one another for observers in this frame.
360
Mathematical Perspectives on Theoretical Physics
Remark 8.1.6: In Newtonian mechanics different particles going at different speeds are all deflected away with equal acceleration from the ideal straight line. According to Newton there is no difference in principle between the fall of a projectile and the motion of a satellite. In brief, in Newton's theory there is one global reference frame and within this frame no satellite is ever gravity free, and no particle ever moves in a straight line at constant speed. In Einstein's theory, on the other hand, there are many local regions equipped with Lorentzian geometry (as in special relativity). The 'laws of gravitation' here arise from the lack of ideality in the relation between one local region and the next. One has to observe the 'relative acceleration' of two particles slightly separated from each other to have any proper measure of a 'gravitational' effect. These 'relative accelerations' double when the 'separations' are doubled. According to Einstein, tidal acceleration displays gravity as a local phenomenon. He further emphasizes that 'tide-producing' effect does not require for its explanation some (mysterious) force of gravitation propagated through spacetime which is in addition to the structure of spacetime. This (tide-producing effect) should be described in terms of the geometry of spacetime itself as the curvature of spacetime. With these philosophical differences between the two theories in place, we now turn our attention to their mathematical descriptions.*
1.4 The Coordinate Transformations: Distinction Between the Galilean and Special Relativity Theory
Let $I$ and $I'$ be two inertial frames covered by coordinates $(x, y, z, t)$ and $(x', y', z', t')$, where $(x, y, z) = \mathbf{r}$ and $(x', y', z') = \mathbf{r}'$ are the space coordinates and $t$ and $t'$ are the time coordinates. These two frames are related to each other in the following manner:
(i) a relative rotation of the space coordinates: $\mathbf{r}' = A\mathbf{r}$
(ii) a displacement of the space coordinates: $\mathbf{r}' = \mathbf{r} + \mathbf{a}$
(iii) a displacement of the time coordinate: $t' = t + b$ (8.1.1)
(iv) a type of boost.
(The vector $\mathbf{a} = (a_1, a_2, a_3)$ and $b$ in the above equations are constants.) In view of Ftn. 3A and (8.1.1) both theories are represented by $I$ and $I'$ and hence by the same set of coordinates. The distinction between the Galilean and special relativistic theories comes from the boost, which is:
$$t' = t, \qquad x' = x + vt, \qquad y' = y, \qquad z' = z \tag{8.1.2}$$
for the first and
$$ct' = \frac{ct + (v/c)\,x}{(1 - v^2/c^2)^{1/2}}, \qquad x' = \frac{x + vt}{(1 - v^2/c^2)^{1/2}}, \qquad y' = y, \qquad z' = z \tag{8.1.3}$$
for the second. The number $v$ in the above equations depends on the observer's (constant) velocity and $c$ is an absolute constant of nature with the dimensions of velocity, which has the role of conversion constant between the variables $t$ and $x$ of different physical dimensions. We note that no change in the time coordinate in the Galilean case implies that the observers agree on the definition of simultaneity. (For details on the concept of "absoluteness" in these theories see Friedman [11]. Here he develops the kinematics of both these theories by writing the field equations in curvilinear coordinates (of General Relativity).) We describe next in brief the mathematical content of Newton's and Einstein's theories in the form of equations for gravitational fields.
* We emphasize that in spite of the distinction between the two theories, their predictions (numerical results) are essentially the same on the surface of the Earth (whether it is a projectile path or an ocean flow). However, when gravitational effects are large (near white dwarfs or neutron stars, see Subsec. (5.2)), it is Einstein's theory that makes the right predictions.
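The two boosts can be compared numerically. The following is a minimal sketch in Python, assuming one space dimension and the sign convention of (8.1.2) and (8.1.3); the function names are ours and are not part of the text. It illustrates that the Lorentz boost leaves the quantity $-(ct)^2 + x^2$ unchanged, and that at everyday speeds it is numerically indistinguishable from the Galilean boost.

```python
import math

C = 299_792_458.0  # speed of light in meters/second


def galilean_boost(t, x, v):
    """Boost (8.1.2): t' = t, x' = x + v t."""
    return t, x + v * t


def lorentz_boost(t, x, v, c=C):
    """Boost (8.1.3): returns (t', x'), with t' recovered from c t'."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    ct_prime = (c * t + (v / c) * x) / root
    x_prime = (x + v * t) / root
    return ct_prime / c, x_prime


def interval_squared(t, x, c=C):
    """The quantity -(c t)^2 + x^2 for an event at (t, x)."""
    return -(c * t) ** 2 + x ** 2


if __name__ == "__main__":
    # A fast boost: the interval is preserved by (8.1.3).
    t, x, v = 1.0, 1.0e8, 0.6 * C
    print(interval_squared(t, x), interval_squared(*lorentz_boost(t, x, v)))
    # A slow boost (30 m/s): both boosts agree to high accuracy.
    print(galilean_boost(10.0, 100.0, 30.0), lorentz_boost(10.0, 100.0, 30.0))
```

The same check fails for the Galilean boost at high speed, which is one way to see that the two theories differ only through item (iv) of the list above.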
1.5
Equations of Motion in Newtonian Mechanics
The Newtonian time axis is $T = \mathbb{R}$. The pair $(r, m_i)$ represents a particle of inertial⁶ mass $m_i \in (0, \infty)$ moving on a curve $r: \mathcal{E} \to \mathbb{R}^3$. For $t \in \mathcal{E} \subset T$, $r(t) \in \mathbb{R}^3$ is the position vector of the particle at time $t$. With $r$ as the path⁷ of the particle, we have $\dot{r} = v$ as its velocity, $|v|$ as its speed, $m_i v$ its momentum, $\tfrac{1}{2} m_i |v|^2$ its kinetic energy and $\ddot{r} = \dot{v}$ as its acceleration. In this theory the concept of relative velocity, etc., gets introduced only when one considers another curve $\bar{r}: T \to \mathbb{R}^3$; the difference $r - \bar{r}$ gives the path and the velocity relative to $\bar{r}$ (this however is irrelevant in the scheme of ideas pursued by Newton). Newton's equations of motion for inertial and gravitational mass respectively are:
$$F = m_i\,\omega \tag{8.1.4}$$
($F$ = total force acting on the body, $\omega$ = acceleration)
$$F_g = m_g\, g \tag{8.1.5}$$
($F_g$ = gravitational force, $g$ = gravitational field intensity, $m_g$ = gravitational mass⁸)
We note that the gravitational field intensity $g$ is generated according to the inverse square law by a gravitating body. Further, if $r$ and $R$ denote the position vectors of a particle in an inertial frame and in a uniformly accelerating frame, say $I''$, then the two frames are related as:
⁶ The property of a body whereby it resists any attempt to change its state of motion is its 'inertial mass $m_i$.' The mass $m_i$ is measured by collision experiments that do not involve gravity.
⁷ The word path stands for 'trajectory or orbit' in the Euclidean space and 'world line' in the Lorentz spacetime.
⁸ The property which determines a body's response to a gravitational field is its gravitational mass $m_g$.
$$R = r - \tfrac{1}{2}\,\omega t^2 \tag{8.1.6}$$
and therefore while the law of motion in the inertial frame from (8.1.4) reads as:
$$m\ddot{r} = F \tag{8.1.7}$$
in the accelerating frame it becomes:
$$m\ddot{R} = F - m\omega \tag{8.1.8}$$
where we have differentiated (8.1.6) with regard to t and have substituted the value of r from (8.1.7). We shall illustrate the applicability of these equations to a system of N-bodies in Exc. 3. As mentioned above in the definitions and the remarks, there is no counterpart of these equations in Einstein's theory of gravitation, in the sense that there is no linear equation similar to (8.1.5) which describes the gravitation force. On the other hand, it is the 'curvature' of spacetime—a highly nonlinear term—that represents the gravity. This equation, whose study will be the subject matter for a major part of this chapter reads as:
The RHS (the energy-momentum tensor) of this equation represents the influence of surrounding matter and the LHS, which depends on the spacetime geometry, stands for the gravitation. The metric of spacetime which goes into the computations of $R_{\mu\nu}$ in its most general form (curvilinear coordinates) is written as:
$$ds^2 = -g_{00}\, dx_0^2 + g_{ij}\, dx_i\, dx_j \qquad (\mu, \nu = 0, 1, 2, 3;\ i, j = 1, 2, 3).$$
It is known that Einstein arrived at the above equation not by sheer coincidence but by years of hard labour in order to resolve the puzzles of nature. The theory of 'Special Relativity,' which we describe in brief below, is often viewed as a means toward Einstein's final goal: an understanding of 'gravity in the universe.'
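The passage from (8.1.7) to (8.1.8) can be checked numerically. The sketch below is in Python and assumes a one-dimensional motion under a constant force; the numerical values and the helper `second_derivative` are illustrative choices of ours. It differentiates $R = r - \tfrac{1}{2}\omega t^2$ twice by central differences and compares $m\ddot{R}$ with $F - m\omega$.

```python
def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


# Illustrative values (assumptions, not taken from the text).
m = 2.0       # mass
F = 6.0       # constant force in the inertial frame
omega = 1.5   # uniform acceleration of the second frame


def r(t):
    """A solution of m r'' = F, equation (8.1.7)."""
    return 0.5 * (F / m) * t**2 + 0.3 * t + 1.0


def R(t):
    """Position in the accelerating frame, equation (8.1.6)."""
    return r(t) - 0.5 * omega * t**2


lhs = m * second_derivative(R, 2.0)  # m R''
rhs = F - m * omega                  # right-hand side of (8.1.8)

if __name__ == "__main__":
    print(lhs, rhs)
```

The extra term $-m\omega$ is the fictitious force that appears in the accelerating frame; it is proportional to the mass, as the gravitational force (8.1.5) is.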
1.6
Special Relativity
To begin with, we wish to remind the reader that though Einstein is considered the architect of this theory, there are three others, Lorentz, Poincaré, and Minkowski, whose work led to this theory. Before the discovery of 'special relativity,' space and time were measured in different units: miles or meters for space, and seconds for time. No one thought of the advantages that would emerge from using the same unit for both. It was perhaps because the role of 'light' was not fully recognized in physics at that point.⁹ In any case the first step toward the theory was to treat space and time on the same footing by using the same unit for measurement. For instance, time in meters is just the time it takes a light flash to go that number of meters. The conversion factor between seconds and meters is the speed of light, c = 299,792,458 meters/second. The speed of light is the only natural constant that has the necessary units to convert a time to a length. The velocity of light c (meters/second) multiplied by time t (in seconds) gives ct (in meters).¹⁰ By using the same unit, space and time became one entity; one could not be separated from the other. This space-time unification followed from the concept of invariance of the spacetime interval¹¹ (between any two events), and thus this invariance showed that time and space are inseparable parts of a larger unity, though qualitatively they are different. The 'spacetime interval' is the simplest form of measure between two events, and it is 'natural' since it is invariant. Space is different for different observers just as time is, but spacetime is the same for everyone. Minkowski observed that electrical charge and particle mass, which are the same for all observers in relative motion, are similar to the spacetime interval, whereas quantities such as velocity, momentum, energy, separation in time and separation in space are relative in character, in the sense that they depend on the relative motion of observers. Having seen the role of 'light' in unifying space and time, we shall see next its importance in analyzing the travel-time of a particle in spacetime. In order to do this, we note that Einstein's postulate (given in Ftn. 9) can alternatively be expressed to say: "all observers should measure the same speed of light, no matter how fast they are moving." The postulate also implies the law that "nothing can travel faster than the speed of light."
Remark 8.1.7: We shall use these basic facts to show that: (i) a curved path (trajectory/projectile) traced by a particle in 'space' is longer than a straight-line path, whereas in spacetime a worldline is shorter when it is kinked (curved); (ii) the time elapsed along a kinked worldline is shorter than along a straight worldline (see Exc. 4).
⁹ It was the Michelson-Morley experiment which showed that the speed of light was the same in all directions. In fact this led to Einstein's fundamental postulate, the Principle of Relativity: Laws of science should be the same for all freely moving observers regardless of their speed.
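The conversion between seconds and meters of light-travel time described above amounts to a single multiplication by c. A minimal Python sketch follows; the function names are ours.

```python
C = 299_792_458  # speed of light in meters/second


def seconds_to_meters(t_seconds):
    """Time expressed in meters of light-travel time: c * t."""
    return C * t_seconds


def meters_to_seconds(d_meters):
    """Time in seconds that light needs to cover d_meters."""
    return d_meters / C


if __name__ == "__main__":
    print(seconds_to_meters(1))    # one second, in meters
    print(meters_to_seconds(10))   # ten meters of light-travel time, in seconds
```

With time measured this way, a light flash advances one meter of distance per meter of time, so the speed of light is the pure number 1.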
We recall that the path of a particle in Newton's theory is traced in the 3-dimensional Euclidean space, and as such the path is shortest when it is a straight line. In the case of spacetime, the particle travels along a time-like geodesic, say $\sigma(\tau)$. In Fig. (8.3) we see how this path is longer when it is a straight line. Given below is the equation of an arbitrary geodesic in curvilinear coordinates with the metric $ds^2 = \sum_{i,j} g_{ij}\, dx_i\, dx_j$:
$$\frac{D}{du}\left(\frac{dx_i}{du}\right) = \frac{d^2 x_i}{du^2} + \Gamma^i_{jk}\,\frac{dx_j}{du}\,\frac{dx_k}{du} = 0 \tag{8.1.9}$$
which represents the equation of motion of a free particle. We note that in an inertial coordinate system (coordinates in a free-float frame), the above equation simplifies to:
$$\frac{d^2 x_i}{du^2} = 0 \tag{8.1.10}$$
But since every free-float frame can be given a Minkowskian space structure with line element:
$$ds^2 = -dx_0^2 + dx_1^2 + dx_2^2 + dx_3^2 \tag{8.1.11}$$
¹⁰ In 1983 the General Conference on Weights and Measures officially redefined the meter in terms of the speed of light. By this definition the meter equals the distance that light travels in a vacuum in the fraction of a second that equals 1/299,792,458.
¹¹ This invariance of the spacetime interval was discovered by Einstein-Poincaré in 1905 and is formally called the Lorentz interval. This invariance demonstrates the unity of space and time while preserving (in the formula's minus sign) the distinction between the two.
it follows that time-like curves $\sigma(\tau)$ in Minkowski space are given by the curves $x_i$ = constant ($i = 1, 2, 3$), and each of these curves has the same tangent vector:
$$\left(\frac{dx_0}{d\tau},\, 0,\, 0,\, 0\right).$$
On the other hand, the curves $x_0$ = constant have the tangent vectors $\left(0,\, \frac{dx_1}{du},\, \frac{dx_2}{du},\, \frac{dx_3}{du}\right)$. These curves are known as space-like. The curves whose tangent vectors satisfy:
$$-\left(\frac{dx_0}{du}\right)^2 + \left(\frac{dx_1}{du}\right)^2 + \left(\frac{dx_2}{du}\right)^2 + \left(\frac{dx_3}{du}\right)^2 = 0 \tag{8.1.12}$$
are the null curves. Also the timelike geodesic $\sigma(\tau)$ here satisfies:
$$\frac{d^2 x_i}{d\tau^2} = 0 \tag{8.1.13}$$
in an inertial system. The solutions $x_i = a_i \tau + b_i$ of the above equation suggest that $t = x_0 = a_0 \tau + b_0$ is an affine parameter on timelike geodesics. Moreover since $\Gamma^i_{jk} = 0$ in the inertial coordinate systems, the equation for a timelike geodesic can be written as:
$$\frac{d^2 x_i}{dt^2} = 0 \tag{8.1.14}$$
This equation not only gives the law of motion for free particles in spacetime, it also represents Newton's Law of Inertia (see Subsec. 1.1). From (8.1.13) and (8.1.14) it follows that there are two types of time associated with the trajectory of a particle: the coordinate-independent proper time $\tau$ and the coordinate-dependent coordinate time $t$. Likewise we have two types of velocity four-vectors: the proper velocity $u$ with components $u^i = \frac{dx_i}{d\tau}$, and the coordinate velocity $v$, with components $v^i = \frac{dx_i}{dt}$. Obviously the coordinate velocity takes the form $(1, \vec{v})$ where $\vec{v}$ is the ordinary three-velocity. To find the relation between $\tau$ and $t$ we note that¹²
$$\tau = \int \left(-g_{ij}\,\frac{dx_i}{dt}\,\frac{dx_j}{dt}\right)^{1/2} dt,$$
thus
$$\frac{d\tau}{dt} = \sqrt{1 - \left(\frac{dx_1}{dt}\right)^2 - \left(\frac{dx_2}{dt}\right)^2 - \left(\frac{dx_3}{dt}\right)^2} = \sqrt{1 - v^2} \tag{8.1.15}$$
The relationship between the proper velocity $u$ and the coordinate velocity $v$ can now be obtained by using (8.1.15) and the fact that $u^i = \frac{dx_i}{d\tau} = \left(\frac{dx_i}{dt}\right)\left(\frac{dt}{d\tau}\right)$. This gives:
$$u^0 = \frac{dx_0}{d\tau} = \frac{1}{\sqrt{1 - v^2}}, \qquad u^i = \frac{v_i}{\sqrt{1 - v^2}} \tag{8.1.16}$$
where $v_i$ are the components of the three-velocity $\vec{v}$. From equations (8.1.10) and (8.1.14) it follows that the free particles in both theories (Newtonian and Special Relativity) move (locally) along straight lines. We give below the diagrams of null cones in these theories to illustrate the differences between them.
[Figure: (i) Light cone in Newtonian spacetime and (ii) light cone in Minkowskian spacetime. Legend: I (II) = reference frame 1 (2) adapted to the geodesic I (II); p = the point of emission of the light ray; q = the point to which it travels; $d_I$ ($d_{II}$) = distance of travel in I (II); $t_I$ ($t_{II}$) = time of travel in I (II); $v_I$ ($v_{II}$) = velocity of the light ray in I (II); S = plane of simultaneity. Panel (i): $v_{II} = v_I - v$. Panel (ii): $d_{II} \neq d_I$, $t_{II} \neq t_I$, $v_{II} = v_I$.]
¹² We have assumed here that the expression for the curve is in curvilinear coordinates.
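Equations (8.1.15) and (8.1.16) can be illustrated with a short computation. The sketch below is in Python and assumes units in which c = 1 and the flat line element (8.1.11); the function names are ours. It builds the proper velocity from a three-velocity and checks that its Minkowski norm is $-1$, which is what $d\tau^2 = -ds^2$ requires.

```python
import math


def proper_time_rate(v):
    """d(tau)/dt = sqrt(1 - |v|^2), equation (8.1.15), with c = 1."""
    speed_sq = sum(vi * vi for vi in v)
    if speed_sq >= 1.0:
        raise ValueError("three-velocity must satisfy |v| < 1 (units of c)")
    return math.sqrt(1.0 - speed_sq)


def proper_velocity(v):
    """Components (u0, u1, u2, u3) from equation (8.1.16)."""
    rate = proper_time_rate(v)
    return (1.0 / rate,) + tuple(vi / rate for vi in v)


def minkowski_norm(u):
    """-u0^2 + u1^2 + u2^2 + u3^2, following the signs of (8.1.11)."""
    return -u[0] ** 2 + sum(ui**2 for ui in u[1:])


if __name__ == "__main__":
    u = proper_velocity((0.6, 0.0, 0.0))
    print(u, minkowski_norm(u))
```

For a particle at rest the proper velocity reduces to (1, 0, 0, 0), so proper time and coordinate time agree.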
In this brief summary (including the exercises), we have only given the reader a glimpse of the 'Principle of Relativity,' which treated space and time as two pieces of one pie and attributed the reason for time differences to the measurements taken in different frameworks called the Laboratory and Rocket frames (in Relativity Theory). Essentially the time difference until then was attributed to surveying discrepancies (see Chapter 2 of [39] for details). Finally, in spite of its successes the Theory of Special Relativity was not the complete answer to questions of the physical world; for one thing, it provided no framework for gravitational interactions (see Schild in [7] and [8b]). In his (1916) paper [8c] Einstein begins by focussing on the shortcomings of the theory, thus justifying the need for a theory that incorporated both special relativity and gravitation. In short, he favoured a theory where all forces of nature could be collectively expressed, with the further provision that the equations involving them remained invariant under coordinate transformations. Einstein thus used the principles of covariance of tensor calculus to write his equation:
$$R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} = T_{\mu\nu} \tag{8.1.17(a)}$$
where the LHS represented the gravitational fields via the curvature of massive bodies and the RHS stood for matter fields (including the electromagnetic fields). Since the observed phenomena did not fully agree with the predictions of the theory, he added a term $\Lambda g_{\mu\nu}$ and called $\Lambda$ the cosmological constant. It is this full form of the Einstein equation:
$$R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} + \Lambda\, g_{\mu\nu} = T_{\mu\nu} \tag{8.1.17(b)}$$
that we shall study in the following sections, and shall find out for ourselves that this represents the best theory for gravitational interactions of extended body-systems and answers the intriguing questions concerning the gravitational waves¹³ (see Grishchuk and Polnarev in [7], K. S. Thorne in [16d] and L. M. Sokolowski in [32]), singularities, and black holes (see Miller and Sciama in [7]) of our universe.¹⁴ In conclusion perhaps one may argue that since geometry plays a dominant role in all three theories, their distinct characteristics can be attributed to the manifolds that are used there:
(i) $\mathbb{R}^3 \times \mathbb{R}$ (direct product of Euclidean spaces): Newton;
(ii) $\mathbb{M}^{1,3}$ ($g_{ii} = 1$, $i = 1, 2, 3$, $g_{00} = -1$) (Minkowskian): Special Relativity;
(iii) 4-dimensional curvilinear space which is locally Minkowskian ($g_{\mu\nu}$ = nonconstant in general): General Relativity.
(See Excs. 1 and 2 and also Yang in [27] for geometry and physics and also Chapter 6 of [9] on ideas pertaining to geometry in Newtonian and Einsteinian physics.)
¹³ Gravitational waves are an unavoidable consequence of the relativistic theory of gravitation and they occur in all physical processes where gravitational radiation participates. The two fundamental predictions of the General Relativity theory, the gravitational waves and black holes, differ because black holes often need very strong gravitational fields (gravitational potential $\to c^2$) for their formation, while gravitational waves exist even in the weak-field approximation. Exceptions to the requirement of a strong gravitational field (for formation of a black hole) are seen in the following example: near a black hole, in particular a large black hole, the gravitational field represented by $R_{\alpha\beta\gamma\delta} \sim \frac{M}{r^3} \sim \frac{1}{M^2}$ near $r = 2M$ is small if $M$ is large (see Sec. 5).
Exercise 8.1
1. State Kepler's laws.
2. Show the similarity between the laws of motion along geodesics in a normal frame of general relativity theory and a semi-Euclidean frame of special relativity (see Chapter 5 in [11]).
3. Use Newton's laws of gravitation to obtain dynamical equations for a finite system of N bodies that is isolated (see T. Damour in [16d]).
4. Establish the statement given in Remark (8.1.7) regarding the distance and the time with the help of figures and examples.
Hints to Exercise 8.1
1. Kepler's laws for orbital motion are: (a) The orbit of any planet around the sun is an ellipse, with the sun at one focus of the ellipse. (b) The line joining any planet to the sun sweeps out equal areas in equal times. (c) For any two planets in the solar system, the squares of the periods of revolution are in the same proportion as the cubes of their average distances from the sun.
2. Let $\sigma(\tau)$ be a geodesic of the general relativity theory (i.e., we have curvilinear coordinates, a metric $g_{ij}$ and a connection $\Gamma^i_{jk}$). A normal coordinate system can be constructed by choosing a quadruple of "orthonormal" vectors $\{X_i\}$ in the tangent space $T_{\sigma(\tau)}$ for each value of $\tau$, so that the metric tensor satisfies:
(i) $g(X_i, X_j) = 0$ for $i \neq j$; $g(X_i, X_j) = 1$ for $i = j = 1, 2, 3$; $g(X_i, X_j) = -1$ for $i = j = 0$.
From each point of $\sigma(\tau)$ there comes out a family of spatial geodesics orthogonal to $X_0$ whose parameters are defined by their proper distance $s$ from $\sigma(\tau)$. To each point $p$ of this geodesic we assign coordinates:
(ii) $y^0 = \tau$, $\quad y^i = g(n, X_i)\, s \quad (i = 1, 2, 3)$
where $n$ is the unit tangent vector to $\sigma(\tau)$ and $s$ is the proper distance along the geodesic from $\sigma(\tau)$ to $p$. We note that this quadruple must be smooth and that such coordinates can only be defined on a small neighbourhood of $\sigma$. The connection coefficients in these coordinates can be obtained by writing the differential in general coordinates, e.g.
(iii)
and then making use of (i) and (ii). They are thus:
(iv)(a) $\Gamma^0_{0i} = \Gamma^0_{i0} = \Gamma^i_{00} = a^i \quad (i = 1, 2, 3)$
and
(iv)(b) $\Gamma^i_{0j} = \Gamma^i_{j0} = \Omega^i_j \quad (i, j = 1, 2, 3)$
the symbol $\Omega^i_j$ is the usual antisymmetric rotation matrix in $\mathbb{R}^3$. All other $\Gamma^i_{jk}$'s are zero. The equation of motion along the geodesic $\sigma$ then becomes (see (8.1.9)):
(v) $\frac{d^2 y^i}{(dy^0)^2} + a^i + 2\,\Omega^i_j\,\frac{dy^j}{dy^0} - 2\,a^j\,\frac{dy^j}{dy^0}\,\frac{dy^i}{dy^0} = 0$
where the term $a^i$ is the inertial force, the term containing $\Omega^i_j$ is the Coriolis force, and the last term is the relativistic correction.
We recall that for special relativity the law of motion (a particle moving on a timelike geodesic) is simply (see (8.1.14)):
(vi) $\frac{d^2 x_i}{dt^2} = 0$.
Thus if the inertial force (acceleration) $a^i$ and the rotation both vanish, our normal frame¹⁵ becomes a local inertial frame and (v) and (vi) become the same. This shows that the theory of General Relativity can locally be viewed as the theory of Special Relativity.
3. Since the system is an isolated one, no force other than the mutual gravitational force affects the system. We assume (for simplicity in computations) that these bodies are made of some perfect fluid with a given isentropic equation of state that links the pressure $p$ to the mass density $\rho$, in other words (see Ref. 1, M. Mikkelson):
(i) $p = p(\rho)$.
The equations that describe the Newtonian dynamics of the system are:
(ii) $\frac{\partial \rho}{\partial t} + \frac{\partial (\rho v^i)}{\partial x^i} = 0$ (continuity equation)
(iii) $\rho\left(\frac{\partial v^i}{\partial t} + v^j \frac{\partial v^i}{\partial x^j}\right) = -\frac{\partial p}{\partial x^i} + \rho\,\frac{\partial U}{\partial x^i}$ (Euler equation)
(iv) $\Delta U = -4\pi G \rho$ (Poisson equation)
where $v^i = v^i(x, t)$ is the velocity field in Cartesian coordinates, $i, j = 1, 2, 3$, $U = U(x, t)$ is the positive gravitational potential, and $G$ denotes Newton's gravitational constant. We note that the assumption on isolation of the system implies that the gravitational potential $U$ falls off outside the system:
$$\lim_{|x| \to \infty} U(x, t) = 0, \qquad t = \text{const.}$$
where $|x|$ is the Euclidean norm of $x$. Poisson's equation can therefore be solved to give:
(v) $U(x, t) = G \int \frac{\rho(y, t)}{|x - y|}\, d^3y$.
Here $|x - y|$ gives the Euclidean distance between the field point $x$ and the source point $y$, and $d^3y$ denotes the Euclidean volume element in Cartesian coordinates. The dynamical equations of this N-body system are obtained by considering two separate problems, the external and the internal: the first relates to the determination of the motion of the centres of mass of the N bodies and the second to the motion of each body around its centre of mass. Let $m_a$ denote the total mass of the $a$-th body, which occupies the volume $V_a$ ($a, b, c, \dots = 1, 2, \dots, N$); then
(vi) $m_a = \int_{V_a} \rho(x, t)\, d^3x$
($m_a$ can be viewed as a constant due to the continuity equation). The position of its centre of mass is given by:
(vii) $z^i_a = \frac{1}{m_a} \int_{V_a} x^i \rho(x, t)\, d^3x$.
¹⁵ The frame refers to a tangent space with basis at a point of a spacetime region (with curvilinear coordinates).
Now for any smooth function $F(x, t)$ we have:
(viii) $\frac{d}{dt} \int_{V_a} F(x, t)\, \rho(x, t)\, d^3x = \int_{V_a} \frac{dF(x, t)}{dt}\, \rho(x, t)\, d^3x$
where
$$\frac{dF(x, t)}{dt} = \frac{\partial F(x, t)}{\partial t} + \frac{\partial F(x, t)}{\partial x^i}\,\frac{dx^i}{dt} = \frac{\partial F(x, t)}{\partial t} + v^i\,\frac{\partial F(x, t)}{\partial x^i}.$$
Differentiating (vii) twice with respect to $t$ and using (viii) to write the RHS we have:
(ix) $m_a \frac{d^2 z^i_a}{dt^2} = \int_{V_a} \rho\, \frac{dv^i}{dt}\, d^3x$.
From (8.1.4) we know that $\rho\,\frac{dv^i}{dt} = \mathcal{J}^i$ gives the local equation of motion, $\mathcal{J}^i$ being the local force density, and therefore (ix) can be written as:
(x) $m_a \frac{d^2 z^i_a}{dt^2} = \int_{V_a} \mathcal{J}^i\, d^3x$.
In a perfect fluid model $\mathcal{J}^i = -\frac{\partial p}{\partial x^i} + \rho\,\frac{\partial U}{\partial x^i}$. Also for every body of the system, and in particular for the $a$-th, it can be decomposed in terms of an internal force (known as the self-force),
(xi) $\mathcal{J}^i_{(s)a} = -\frac{\partial p}{\partial x^i} + \rho\,\frac{\partial U_{(s)a}}{\partial x^i}$
and an external force:
(xii) $\mathcal{J}^i_{(e)a} = \rho\,\frac{\partial U_{(e)a}}{\partial x^i}$
where the self-part $U_{(s)a}$ of the gravitational potential is:
(xiii) $U_{(s)a}(x, t) := G \int_{V_a} \frac{\rho(y, t)}{|x - y|}\, d^3y$.
The external part $U_{(e)a} = U - U_{(s)a}$ that results from integration over the other bodies is:
(xiv) $U_{(e)a}(x, t) = \sum_{b \neq a} G \int_{V_b} \frac{\rho(y, t)}{|x - y|}\, d^3y$.
If we choose an accelerated "centre of mass frame of reference," i.e., regard the centre of mass of the $a$-th body as the origin of the frame which is formed by axes parallel to the global Cartesian axes, then the position $x'^i_a$ of a point $p$ with respect to this frame is linked to its position $x^i$ in the global (Cartesian) frame by:
(xv) $x'^i_a = x^i - z^i_a$.¹⁶
¹⁶ $z^i_a$ = coordinates of the centre of mass with respect to the global frame.
Since time is absolute in Newtonian Mechanics, the relative velocity and the relative acceleration are given respectively as:
(xvi) $v'^i_a := \frac{dx'^i_a}{dt} = v^i - \frac{dz^i_a}{dt}, \qquad \frac{dv'^i_a}{dt} = \frac{dv^i}{dt} - \frac{d^2 z^i_a}{dt^2}$.
Using this notation the basic equation of the 'external problem' in view of Eq. (x) becomes:
(xvii)
whereas the basic equation of the $a$-th 'internal problem' (motion in the centre-of-mass frame of the $a$-th body) can be written as (see equations (iii), (xv) and (xvi)):
(xviii)
The above equations show that in spite of the decomposition of the force density into internal and external components, the internal and external problems are a priori coupled to each other. For instance in equation (xvii), which represents the external problem, the second term on the RHS is in fact of internal origin, as it is the total self-force:
(xix)
(The problem we have discussed above is called the N-extended body problem in Newtonian gravity.)
4. (a) Let O be the reference point and P be another point which is reached by a curved path in space and spacetime (Fig. (8.3) (i) and (ii)). It is evident that the total length along the winding path from point O to point P (in the first case) is greater than the length along the straight northward axis from O to P. We know that the particle in spacetime travels along a timelike curve, hence here we have to measure the total proper time from event O to event P. Since the total proper time is shorter along the curved line we have established the statement made in the Remark (8.1.7).
[Figure: Length along a path (i) in space, (ii) in spacetime. Panel (i), path in space (axes East and North): the curved path from O has greater length than the direct path. Panel (ii), worldline in spacetime (axes Space and Time): the curved worldline from O has shorter proper time than the straight worldline.]
(b) The spacetime map given below shows the time and space measured in years. The locations of four events are given in the following table:
SPACE AND TIME LOCATION OF EVENTS
Event      Space (years)   Time (years)
Event 1     1               0
Event 2    -1               1
Event 3     0.5             2.5
Event 4     3               8
[Figure: spacetime map showing Events 1 to 4, with Space on the horizontal axis and Time on the vertical axis.]
Using this information we compute the proper time taken by a traveller (recorded on his/her watch) who begins from Event 1, passes through Events 2 and 3 and reaches Event 4, and by another traveller who goes directly from Event 1 to Event 4. The proper time in the first case is the sum of the proper times of the three segments, each obtained by using the formula:
(interval)² = ±(space separation)² ∓ (time separation)² = ±(difference in space coordinates)² ∓ (difference in time coordinates)².¹⁷
This gives:
$$\sqrt{\left|(2)^2 - (-1)^2\right|} + \sqrt{\left|(-1.5)^2 - (-1.5)^2\right|} + \sqrt{\left|(-2.5)^2 - (-5.5)^2\right|} = \sqrt{3} + 0 + \sqrt{24} = 1.73 + 4.90 = 6.63.$$
In the second case it is:
$$\sqrt{(-8)^2 - (-2)^2} = \sqrt{60} = 7.75.$$
The above example shows that the time elapsed in the first case is less than in the second case.
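The arithmetic of this example can be reproduced in a few lines. The following Python sketch assumes the event coordinates as read from the table above, in the order (space, time), and takes the absolute value under the square root so that each interval is a positive quantity; the names `events`, `interval` and `path_interval` are ours.

```python
import math

# (space, time) in years, as read from the table of events.
events = {
    1: (1.0, 0.0),
    2: (-1.0, 1.0),
    3: (0.5, 2.5),
    4: (3.0, 8.0),
}


def interval(a, b):
    """Positive interval between two events given as (space, time)."""
    dx = b[0] - a[0]
    dt = b[1] - a[1]
    return math.sqrt(abs(dt * dt - dx * dx))


def path_interval(ids):
    """Sum of the intervals of consecutive segments along a list of events."""
    return sum(interval(events[i], events[j]) for i, j in zip(ids, ids[1:]))


kinked = path_interval([1, 2, 3, 4])  # through Events 2 and 3
direct = path_interval([1, 4])        # straight from Event 1 to Event 4

if __name__ == "__main__":
    print(round(kinked, 2), round(direct, 2))
```

We note that with these coordinates the segment from Event 1 to Event 2 has a space separation larger than its time separation, which is why the absolute value is needed there; the segment from Event 2 to Event 3 contributes zero, since its space and time separations are equal.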
2
THE EINSTEIN UNIVERSE
When Einstein propounded his theory of general relativity and wrote his famous equation
$$G_{\mu\nu} = 8\pi T_{\mu\nu} \tag{8.2.1}$$
relating the 'gravity' on the LHS with the 'matter' on the RHS, one of his goals was to unify gravitation and the electromagnetic force since he thought that these were the only forces in nature. However, from his joint work with Grossmann in 1913 (see Sec. 17.7 in [26]), it is well known that his aim initially was to put accelerated frames on the same footing as the inertial frames of reference.
¹⁷ The signs on the RHS indicate that the interval is a positive quantity.
In order to write the above equation, he used the rules of differential geometry, bearing in mind that the equation, being a tensor one, was invariant under coordinate transformations and as such incorporated within itself the principle of equivalence [24]. At present, of course, not only has the number of forces in nature changed from two to four, but the method of description has been refined as well. For example, one no longer talks of simultaneity of events (as postulated by Einstein in his theory); instead one talks of spacelike hypersurfaces. Fortunately for Einstein and for posterity, the tensor equation (8.2.1) was observed to hold ground when experiments were made involving known forms of matter. The obvious questions that followed were: (i) given a matter field, how to construct $T_{\mu\nu}$, and (ii) what were the suitable metrics that could be used to formulate the gravitational part $G_{\mu\nu}$, which as we know from Eq. (8.1.17)(b) stands for:
$$R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} + \Lambda\, g_{\mu\nu}.$$
In other words, a search for solutions of the above equation became a priority for physicists. Most of the solutions of this equation were obtained locally in the early fifties; their global properties, however, were investigated in the sixties after the pioneering work of Penrose, Chandrasekhar and Hawking. We shall study both of these ((i) and (ii)) in Sec. (8.3) and Sec. (8.4) respectively. Our attempt here (in this section) is to give the main ingredients of the theory that are required for its description, namely the mathematical model of space-time (the collection of all events), the matter fields and the postulates of local causality and local conservation of energy and momentum. With these in place, we shall be able to write the equations and study their structural properties.
2.1 The Mathematical Model Consider an equivalence class of pairs (M, g) where M is a 4-dimensional, connected, Hausdorff C°°- manifold and g is a Lorentz metric (i.e., a metric of signature +2) on 'M. Any two pairs (!M, g), (iW', g') of this class are isometric, meaning thereby that there exists a diffeomorphism 0: fW —> M' such that 9*g = g', i.e., 9 carries the metric g into the metric g'. This equivalence class (represented by one of the pairs (3tf, g)) is the mathematical model for spacetime. In the terminology of Sec. 1, this is the collection of all events. We shall refer to it, or rather to the pair (iW, g), as a spacetime manifold. Since the word manifold intuitively implies continuity, we note here that the continuity in this case has been established for distances down to approximately 10"15 cm by experiments, therefore for distances smaller than this, the manifold model of spacetime defined above may not be appropriate (see [16c]). The assumption of connectedness on !M suggests that we have the knowledge of all events, since there are no disconnected components. Finally, the Hausdorff condition together with the existence of a Lorentz metric implies that M is paracompact. It is this M which we shall coordinatize and write the field equations upon. Now the metric g allows the classification of non-zero vectors at a point p e 9vt as timelike, spacelike or null, according as a non-zero vector X e Tp (the tangent space at p) satisfies: g(X, X)<0, g(X, X) > 0 or g(X, X) - 0. The differentiability of metric plays an important role in the writing down of the field equations (as we shall soon see). If, however, the metric coordinate components gah and gah are just continuous and have locally square integrable generalized first derivatives with respect to the local coordinates, then the field equation can be set up only in a distributional sense. To avoid complications we shall take that metric to be Ck in general. 
There is still one more condition that we have to impose on the model $(\mathcal{M}, g)$ to ensure that all the nonsingular points of space-time are included. A $C^r$ pair $(\mathcal{M}', g')$ is called a $C^r$-extension of $(\mathcal{M}, g)$ if there is an isometric $C^r$-imbedding $\mu: \mathcal{M} \to \mathcal{M}'$. Evidently in this case the points of $\mathcal{M}'$ will also have to be viewed as points of spacetime. We shall therefore assume that the model $(\mathcal{M}, g)$ is $C^r$-inextendible. Although it may seem so, not all models are inextendible. A simple example of a model that is not inextendible is a pair $(\mathcal{M}_1, g_1)$ where $\mathcal{M}_1$ is a two-dimensional Euclidean space with the $x$-axis removed between the points $x_1 = -\frac{1}{2}$ and $x_1 = +\frac{1}{2}$. Obviously $(\mathcal{M}_1, g_1)$ can be extended by replacing the unit interval by an arbitrary interval. There are, of course, other ways in which it can be extended. This leads us to a still stronger condition of inextendibility as defined below.
Definition 8.2.1: A pair $(\mathcal{M}, g)$ is said to be $C^r$-locally inextendible if there is no open set $\mathcal{U} \subset \mathcal{M}$ with non-compact closure in $\mathcal{M}$ such that the pair $(\mathcal{U}, g|_{\mathcal{U}})$ has an extension $(\mathcal{U}', g')$ in which the closure of the image of $\mathcal{U}$ is compact.
2.2 The Matter Fields
The fields that describe the matter content of spacetime are called matter fields. A classic example of such fields is the familiar electromagnetic field. Since these fields are defined on a differentiable manifold $\mathcal{M}$ with metric $g$, the equations involving them are expressed via tensors, and their derivatives are covariant derivatives with respect to the symmetric connection defined by the metric $g$. If there is another connection on $\mathcal{M}$, from the rules of differential geometry we know that the difference between two connections is a tensor; this tensor is regarded here as a physical field. If $\mathcal{M}$ carries another metric, that is also viewed as another physical field. Finally, the theory one obtains depends on the matter fields that one incorporates into it. The rule of thumb here is to include all fields that have been experimentally observed and to postulate further the existence of those which are still undetected (experimentally). We will use the notation $\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d}$ to denote these matter fields. The subscript $(i)$ will denote the $i$-th field of the theory, and as usual the superscripts (subscripts) will stand for the contravariant (covariant) indices, indicating the tensor character of $\Psi$. We now describe the two postulates concerning these matter fields. Both of them are common to the two theories of relativity, the special and the general.
2.3 Postulate (a): Local Causality
The equations governing the matter fields must be such that, given a convex normal neighbourhood $\mathcal{U}$ and a pair of points $p$ and $q$ in it, a signal can be sent from $p$ to $q$ if and only if they can be joined by a $C^1$-curve lying entirely in $\mathcal{U}$ whose tangent vector is everywhere non-zero and non-spacelike.18 The above postulate can alternatively be given in terms of the Cauchy problem of the matter fields in the following manner (see also Sec. 3). Let $p \in \mathcal{U}$ be such that every non-spacelike curve through $p$ intersects the spacelike surface $x^4 = 0$ within $\mathcal{U}$; denote by $\mathcal{J}$ the set of points of $x^4 = 0$ that can be reached from $p$ by non-spacelike curves lying entirely in $\mathcal{U}$. It is required that the values of the matter fields at $p$ are uniquely determined by the values of the fields and their derivatives up to some finite order $k$ on $\mathcal{J}$, and not by the values of the fields on any proper subset $\mathcal{J}'$ of $\mathcal{J}$ to which $\mathcal{J}$ could be continuously retracted. A few important consequences that follow from the adherence to this postulate are:
18 A tangent vector $X$ which is either timelike ($g(X, X) < 0$) or null ($g(X, X) = 0$) is called non-spacelike. We shall use the coordinate $x^4$ in place of $x^0$ from now on to distinguish the general theory.
(i) The metric $g$ is a distinctively different field on $\mathcal{M}$, which is geometric in nature.
(ii) Using $\{x^a\}$ as normal coordinates in $\mathcal{U}$ around $p$, the coordinates of the points which can be reached from $p$ by non-spacelike curves in $\mathcal{U}$ are seen to satisfy $(x^1)^2 + (x^2)^2 + (x^3)^2 - (x^4)^2 \le 0$. The boundary of these points is formed by the image of the null cone $N_p$ of $p$ under the exponential map; evidently it is the set of all null geodesics through $p$. The null cone separates the timelike vectors and spacelike vectors at $p$.
(iii) One can determine the metric at $p$ up to a conformal factor once $N_p$ is known (see Exc. 1 for (ii) and (iii)).
2.4 Postulate (b): Local Conservation of Energy and Momentum
The equations governing the matter fields imply the existence of a symmetric tensor $T^{ab}$ known as the energy-momentum tensor. The tensor $T^{ab}$ depends on the fields, their covariant derivatives and the metric, and satisfies the following properties:
(i) It vanishes on $\mathcal{U}$ if and only if all the matter fields vanish on $\mathcal{U}$.
(ii) It obeys the equation:
$T^{ab}{}_{;b} = 0$ (8.2.2)
where ';' denotes covariant differentiation with respect to the given metric. The first of these conditions establishes the principle that all fields have energy, and the second gives the 'conservation law' provided the metric $g$ admits Killing vector fields (see Exc. 2). The symmetric tensor $T^{ab}$ of Postulate (b) is as yet not defined; we see next how it can be uniquely determined when the equations of the fields are derived from a Lagrangian.
2.5 Construction of the Energy-momentum Tensor $T^{ab}$
Let $L$ be the Lagrangian (a scalar) formed by the fields $\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d}$, their first covariant derivatives and the metric, and let $S$ be the action:
$S = \int_{\mathcal{D}} L \, dv$ (8.2.3)
where $\mathcal{D}$ is a 4-dimensional compact region of a spacetime manifold $\mathcal{M}$ and $dv$ is the volume element. From our earlier study of the action $S$ (in Chapters 6 and 7), we know that the equations for a physical system of fields are obtained by requiring that the action $S$ be stationary for all variations of the fields in the interior of $\mathcal{D}$. The action $S$ is said to be stationary if $\left. \frac{dS}{du} \right|_{u=0} = 0$ for all variations of the fields in $\mathcal{D}$, $u$ being a parameter used in the following definition.
Definition 8.2.2: A one-parameter family of fields $\Psi_{(i)}(u, r)$ where $u \in (-\varepsilon, \varepsilon)$ and $r \in \mathcal{M}$ is called a variation of the fields $\Psi_{(i)}(r)$ if
(i) $\Psi_{(i)}(0, r) = \Psi_{(i)}(r)$,
(ii) $\Psi_{(i)}(u, r) = \Psi_{(i)}(r)$ for $r \in \mathcal{M} - \mathcal{D}$. (8.2.4)
19 Indices a, b, ... are used when $\mathcal{M}$ is an arbitrary spacetime manifold.
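Denoting $\left. \frac{\partial \Psi_{(i)}}{\partial u} \right|_{u=0}$ by $\Delta\Psi_{(i)}$ (8.2.5)
we have:
$\left. \frac{dS}{du} \right|_{u=0} = \sum_{(i)} \int_{\mathcal{D}} \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d}} \, \Delta\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d} + \frac{\partial L}{\partial \Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}} \, \Delta(\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}) \right) dv$ (8.2.6)
Recalling that the symbol ';' denotes the covariant derivative, we note that $\Delta\Psi_{(i)}$ satisfies
$\Delta(\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}) = (\Delta\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d})_{;e}$
and hence the second term in (8.2.6) can be written as:
$\sum_{(i)} \int_{\mathcal{D}} \left[ \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}} \, \Delta\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d} \right)_{;e} - \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}} \right)_{;e} \Delta\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d} \right] dv$ (8.2.7)
The expression within the parenthesis in the first term can be regarded as the component $Q^e$ of a vector $Q$; this allows the first term to be written as:
$\int_{\mathcal{D}} Q^e{}_{;e} \, dv = \int_{\partial\mathcal{D}} Q^e \, d\sigma_e$ (8.2.8)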
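Now condition (ii) of (8.2.4) implies that $\Delta\Psi_{(i)}$ vanishes at the boundary $\partial\mathcal{D}$, which means that the first term is zero for every field. Equation (8.2.6) can now be written using the second term of (8.2.7). Putting these together, we have that $\left. \frac{dS}{du} \right|_{u=0}$ vanishes for all variations on all compact regions such as $\mathcal{D}$, if and only if the Euler-Lagrange equations:
$\frac{\partial L}{\partial \Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d}} - \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}} \right)_{;e} = 0$ (8.2.9)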
hold for all (i). These are the required equations of the fields. To obtain the energy-momentum tensor from the Lagrangian, we consider the change in the action that is induced by a variation in the metric. We assume that a variation $g_{ab}(u, r)$ leaves the fields $\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d}$ unchanged. Since the volume element $dv$ itself depends on the metric, it varies with the metric as well, according to
$\frac{\partial (dv)}{\partial g_{ab}} = \frac{1}{2} g^{ab} \, dv$ (8.2.10)
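(see Exc. 3 for the derivation). Piecing these facts together we now have:
$\left. \frac{dS}{du} \right|_{u=0} = \sum_{(i)} \int_{\mathcal{D}} \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}} \, \Delta(\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e}) + \frac{\partial L}{\partial g_{ab}} \, \Delta g_{ab} \right) dv + \int_{\mathcal{D}} L \, \frac{\partial (dv)}{\partial g_{ab}} \, \Delta g_{ab}$ (8.2.11)
with the provision that the second integral, in view of (8.2.10), be written as
$\frac{1}{2} \int_{\mathcal{D}} (L \, g^{ab} \Delta g_{ab}) \, dv$ (8.2.12)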
Now the variation in the metric induces a variation in the connection as given below (in terms of components):
$\Delta\Gamma^a{}_{bc} = \frac{1}{2} g^{ad} \{ (\Delta g_{db})_{;c} + (\Delta g_{dc})_{;b} - (\Delta g_{bc})_{;d} \}$ (8.2.13)
Using (8.2.13), $\Delta(\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d;e})$ can be expressed in terms of $(\Delta g_{lm})_{;n}$ ($l$, $m$, $n$ stand for suitable combinations of $a \ldots b$, $c \ldots d$ and $e$), and then, applying the usual integration by parts technique, the integrand can be shown to involve $\Delta g_{ab}$ only. Finally, collecting the coefficients of $\Delta g_{ab}$ from the simplified version of (8.2.11), we can write $\frac{dS}{du}$ as:
$\int_{\mathcal{D}} (T^{ab} \Delta g_{ab}) \, dv$ (8.2.14)
The symmetric tensor $T^{ab}$ is the required energy-momentum tensor of the given fields (see Exc. 4 and 5 on the construction of $T^{ab}$, and Exc. 6 for the dependence of the conservation equations on the fields). We shall now use the above two postulates, along with the information gathered in various exercises, to write the field equations of Einstein.
2.6 The Field Equations
To write these equations we shall have to choose the metric $g$ in $(\mathcal{M}, g)$, which we have so far not chosen, except for mentioning briefly that it is a Lorentz metric of signature +2. The easiest course would be to choose a flat metric as in special relativity and, since the theory of special relativity does not include gravitational effects, introduce an additional field to bring in gravitation. But the choice of a flat metric does not work, since experiments have shown that light rays travelling near the sun are deflected, which means that spacetime can neither be flat nor conformally flat. As for the introduction of a gravitational field, from Postulate (b) it follows that this field would be incorporated in the energy-momentum tensor, thus leaving the theory with no gravitational field.
From these observations it follows that the gravitational field should be the result of the curvature of spacetime. Also the equations written with the field should be such that their predictions do not contradict the Newtonian principles of gravitational theory.20 According to Newtonian principles, the active gravitational mass of a body (the mass producing a gravitational field) equals the passive gravitational mass (the mass that is acted upon by the gravitational field),21 and the field equations do not involve time. Einstein's field equations incorporate these principles in the following manner: (i) They include a constant $G$ ($G_{\mu\nu} = 8\pi G T_{\mu\nu}$) known as the Newtonian gravitational constant; (ii) They are formulated in terms of a static metric. A metric is called static if it admits a timelike Killing vector field $K$ which is orthogonal to a family of spacelike surfaces. These surfaces are regarded as surfaces of constant time and are labelled as $t = c$. The vector field $K$ defines a unit vector field $V = f^{-1} K$, where $f^2 = -K^a K_a$.* The integral curves of $V$ (which are also integral curves of $K$) define the static frame of reference (see Exc. 7); thus a particle whose history is one of these integral curves appears to be at rest in this frame. A particle released from rest and following a geodesic would appear to have an initial acceleration of $-\dot{V}$ (defined below) with respect to the static frame. Hence if $f \approx$ unity, the initial acceleration of the particle is approximately $-\nabla f$. This analysis suggests that the quantity $(f - 1)$ must be treated as analogous to the Newtonian gravitational potential. We therefore derive an equation that involves $f$ by considering the divergence of $\dot{V}^a$. Since $V$ is a timelike unit vector we can write
$V_{a;b} = -\dot{V}_a V_b$ (8.2.15)(a)
where
$\dot{V}^a = V^a{}_{;b} V^b = f^{-1} f_{;b} \, g^{ab}$ (8.2.15)(b)
Hence the divergence of $\dot{V}^a$ can be written as:
$\dot{V}^a{}_{;a} = (V^a{}_{;b} V^b)_{;a} = V^a{}_{;ba} V^b + V^a{}_{;b} V^b{}_{;a} = R_{ab} V^a V^b + (V^a{}_{;a})_{;b} V^b + (\dot{V}^b V_b)^2 = R_{ab} V^a V^b$ (8.2.16)
(where we have used (0.5) and the result of Exc. 7). Also, since $\dot{V}^a$ can be expressed in terms of $f$ and the metric tensor, we have:
$\dot{V}^a{}_{;a} = (f^{-1} f_{;b} \, g^{ab})_{;a} = -f^{-2} f_{;a} f_{;b} \, g^{ab} + f^{-1} f_{;ba} \, g^{ab}$ (8.2.17)
Again,
$f_{;ab} V^a V^b = -f_{;a} V^a{}_{;b} V^b = -f^{-1} f_{;a} f_{;b} \, g^{ab}$ (8.2.18)
Using these equations together we obtain
$f_{;ab} (g^{ab} + V^a V^b) = f R_{ab} V^a V^b$ (8.2.19)
20 See [19] for a beautiful account of the manner in which the metric arises as the carrier of the imprints of gravitation.
21 This principle has been verified experimentally in 1968 (see Sec. 40.8 in [26], and the Appendix 8A).
* Evidently a, b in this section stand for 1, 2, 3.
But the term on the left is the Laplacian of $f$ with respect to the induced metric in the 3-surface ($t = c$). If the metric is almost flat, it corresponds to the Newtonian Laplacian of the potential. Now 'almost flatness' implies a weak (gravitational) field; hence it follows that, in the limit of a weak field, if the RHS term in (8.2.19) is equal to $4\pi G$ times the matter density plus any other term which is small in the weak field limit, the theory obtained with the above prescription would agree with the Newtonian theory. To achieve this we set the relation:
$R_{ab} = K_{ab}$ (8.2.20)
where (in view of the above discussions) $K_{ab}$ must be a tensorial function of the energy-momentum tensor and the metric, and must be such that $(4\pi G)^{-1} K_{ab} V^a V^b$ equals the sum of the matter density and terms which are small in the Newtonian (i.e. weak field) limit. Suppose we take $K_{ab}$ simply as $4\pi G T_{ab}$; then, since $R_{ab}$ satisfies the contracted Bianchi identities $R_a{}^b{}_{;b} = \frac{1}{2} R_{;a}$, the equality (8.2.20) leading to (8.2.21) would eventually imply $T_{;a} = 0$, as $T_a{}^b{}_{;b} = 0$ (in view of the conservation equations). But this contradicts the actual phenomena of nature.22 Hence the expression for $K_{ab}$ which satisfies (8.2.21) must be:
$K_{ab} = k \left( T_{ab} - \frac{1}{2} T g_{ab} \right) + \Lambda g_{ab}$ (8.2.22)
where $k$ and $\Lambda$ are constants whose values can be determined from the Newtonian limit. Using the Newtonian limit we can assign the value $8\pi G$ to the constant $k$. Furthermore, if we use units of mass in which $G = 1$, the equations we are looking for are:
$R_{ab} = 8\pi \left( T_{ab} - \frac{1}{2} T g_{ab} \right) + \Lambda g_{ab}$ (8.2.23)
These are the well known Einstein equations, which can be written equivalently as:
$\left( R_{ab} - \frac{1}{2} R g_{ab} \right) + \Lambda g_{ab} = 8\pi T_{ab}$ (8.2.24)
Since both sides are symmetric, these form a set of ten coupled nonlinear partial differential equations in the metric and its first and second derivatives. Although these ten equations are not independent to begin with, we do obtain a set of independent equations in view of the fact that the covariant divergence of each side vanishes identically:
$\left( R^{ab} - \frac{1}{2} R g^{ab} + \Lambda g^{ab} \right)_{;b} = 0$ (8.2.25)(a)
$T^{ab}{}_{;b} = 0$ (8.2.25)(b)
22 The vanishing of $T_{;a}$ implies that the difference $\mu - 3p$ (energy density $-$ 3 $\times$ pressure) of a perfect fluid is constant throughout the space, which is not true (see Exc. 7 and [16c], p. 73).
Hence, taking into account their symmetries as well as (8.2.25), we note that the number of independent differential equations for the metric that result from Eq. (8.2.23) is only six. We show next that the Einstein equations can be derived by applying the variational methods established above to an appropriate action. The action under consideration is:
$S = \int_{\mathcal{D}} (\alpha (R - 2\Lambda) + L) \, dv$ (8.2.26)
where $\alpha$ is a constant and $L$ is the matter Lagrangian. We require that $S$ be stationary under variations of $g_{ab}$. The first and second terms of the integrand under this variation are:
$\Delta(\alpha (R - 2\Lambda) \, dv) = \alpha \left( (R - 2\Lambda) \frac{1}{2} g^{ab} \Delta g_{ab} + R_{ab} \Delta g^{ab} + g^{ab} \Delta R_{ab} \right) dv$ (8.2.27)
$\Delta(L \, dv) = (T^{ab} \Delta g_{ab}) \, dv$ (8.2.28)
The last term of (8.2.27) can be further written as:
$g^{ab} \Delta R_{ab} \, dv = g^{ab} \left( (\Delta\Gamma^c{}_{ab})_{;c} - (\Delta\Gamma^c{}_{ac})_{;b} \right) dv = \left( \Delta\Gamma^c{}_{ab} \, g^{ab} - \Delta\Gamma^d{}_{ad} \, g^{ac} \right)_{;c} dv$ (8.2.29)
(where we have adjusted the dummy indices and have used the fact that $g^{ab}{}_{;c} = 0$). From the RHS of (8.2.29) it is evident that the integrand is a divergence; therefore $\int g^{ab} \Delta R_{ab} \, dv$ can be transformed into an integral over the boundary $\partial\mathcal{D}$. But $\Delta\Gamma^a{}_{bc}$ vanishes on the boundary, hence we have:
$\left. \frac{dS}{du} \right|_{u=0} = \int_{\mathcal{D}} \left\{ \alpha \left[ \left( \frac{1}{2} R - \Lambda \right) g^{ab} - R^{ab} \right] + T^{ab} \right\} \Delta g_{ab} \, dv$ (8.2.30)
Thus if $\left. \frac{dS}{du} \right|_{u=0}$ vanishes for all variations $\Delta g_{ab}$, we obtain the Einstein equations after putting $\alpha = \frac{1}{8\pi}$. In conclusion, we note that the choice of the scalar $R$ in the above action $S$ is the best choice, for if this scalar were replaced by $R_{ab} R^{ab}$ or $R_{abcd} R^{abcd}$, one would obtain equations involving fourth order derivatives of the metric tensor, which would mean that one would have to specify the initial values not only of the metric and its first derivatives but of its second and third derivatives as well. Moreover, this would be contrary to other accepted rules of physics, where equations are of first or second order. In view of this, we shall assume that the field equations do not involve derivatives of the metric higher than the second, and that if they are derived from a Lagrangian, then the action must be of the form (8.2.26). Having decided upon the form of the action, one may still ask if it would be possible to restrict the metric and demand that the action be stationary under variations of that restricted metric. For example, would it be possible to consider a conformally flat metric:
$g_{ab} = \Omega^2 \eta_{ab}$ (8.2.31)
and then seek the equations based on variations of this metric. Naturally the equations would now be obtained after replacing $\Delta g_{ab}$ with $2\Omega^{-1} (\Delta\Omega) g_{ab}$. This theory based on the conformally flat metric is called the Nordström theory. The theory is in agreement with the Newtonian theory if $\Lambda$ is small or zero and $\alpha = -1/(24\pi)$; however, it is inconsistent with the observed deflection of light by massive objects and fails to account for the measured advance of the perihelion of Mercury. We may mention here that the two drawbacks of the above theory can be removed by choosing the metric as:
$g_{ab} = \Omega^2 (\eta_{ab} + X_a X_b)$ (8.2.32)
where $X_a$ is an arbitrary one-form field. The theory gives the Newtonian limit in a static metric in which $X_a$ is parallel to the timelike Killing vector. However, there could be other static metrics in which $X_a$ is not parallel to the Killing vector, and these would fail to give the Newtonian limit. From these discussions it is evident that the metric $g$ cannot be further restricted (apart from requiring that it be Lorentzian). This brings us to the third postulate:
2.7 Postulate (c): Field Equations
Einstein's field equations (8.2.24) hold on $\mathcal{M}$. The predictions of these field equations agree, within experimental error, with the observations made on the deflection of light and the advance of the perihelion of Mercury (see C. M. Will in [16d]). In the next two sections we shall study the Cauchy problem related to these gravitational field equations and give a brief description of spacetime singularities.
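To give a feeling for the size of the two effects mentioned above, the following Python sketch evaluates the standard weak-field results for light grazing the Sun and for the orbit of Mercury. It is only a sketch under stated assumptions: the two closed-form expressions (deflection $4GM/(c^2 b)$ and perihelion advance $6\pi GM/(c^2 a(1 - e^2))$ per orbit) are the usual textbook formulas and are not derived in this section, SI units are used instead of the units $G = 1$ adopted above, and the numerical constants and function names are assumptions introduced for illustration.

```python
import math

# Assumed constants (SI units), not taken from the text.
GM_SUN = 1.32712440018e20    # heliocentric gravitational constant, m^3 s^-2
C = 299792458.0              # speed of light, m/s
R_SUN = 6.957e8              # nominal solar radius, m
A_MERCURY = 5.7909e10        # semi-major axis of Mercury's orbit, m
E_MERCURY = 0.2056           # eccentricity of Mercury's orbit
T_MERCURY_DAYS = 87.969      # orbital period of Mercury, days
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def light_deflection_arcsec(impact_parameter=R_SUN):
    """Weak-field deflection angle 4GM/(c^2 b) of a light ray, in arcseconds."""
    return 4.0 * GM_SUN / (C ** 2 * impact_parameter) * ARCSEC_PER_RAD


def perihelion_advance_arcsec_per_century():
    """Relativistic perihelion advance of Mercury, in arcseconds per century."""
    per_orbit = 6.0 * math.pi * GM_SUN / (C ** 2 * A_MERCURY * (1.0 - E_MERCURY ** 2))
    orbits_per_century = 36525.0 / T_MERCURY_DAYS
    return per_orbit * orbits_per_century * ARCSEC_PER_RAD
```

With these inputs the two functions give roughly 1.75 arcseconds for a ray grazing the solar limb and roughly 43 arcseconds per century for Mercury, which are the figures usually quoted for these tests.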
Exercise 8.2
1. Show that the postulate (a) of local causality helps to determine the ratio of the magnitudes of a timelike vector and a spacelike vector at every point $p \in \mathcal{M}$, and also enables one to measure the metric up to a conformal factor.
2. Show that if the metric $g$ admits a Killing vector field $K$, then the conservation equation (8.2.2) gives the conservation law, i.e., $K_{a;b} + K_{b;a} = 0$ and $T^{ab}{}_{;b} = 0 \Rightarrow (K_a T^{ab})_{;b} = 0$.
3. Show that $\frac{\partial (dv)}{\partial g_{ab}} = \frac{1}{2} g^{ab} \, dv$.
4. Obtain the Euler-Lagrange equations and the energy-momentum tensor for a scalar field $\Psi$.
5. Show that for the electromagnetic field $A$, the energy-momentum tensor given in terms of the electromagnetic tensor field $F = 2dA$, whose components are $F_{ab} = 2A_{[b;a]}$, is:
(a) $T^{ab} = \frac{1}{4\pi} \left( F^{ac} F^{bd} g_{cd} - \frac{1}{4} g^{ab} F_{ij} F_{kl} \, g^{ik} g^{jl} \right)$
when the Lagrangian considered is:
(b) $L = -\frac{1}{16\pi} F_{ab} F_{cd} \, g^{ac} g^{bd}$.
6. Show that the tensor $T^{ab}$ satisfies the conservation equations (8.2.2) as a consequence of the field equations (8.2.9) satisfied by the fields $\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d}$. In other words, it is the field equations that lead to the conservation equation.
7. Obtain the energy-momentum tensor $T^{ab}$ for a perfect fluid.
Hints to Exercise 8.2
1. $\mathcal{U}$ is a convex normal neighbourhood, hence the coordinates of the points that can be reached from $p$ by non-spacelike curves satisfy:
(i) $(x^1)^2 + (x^2)^2 + (x^3)^2 - (x^4)^2 \le 0$.
Thus by observing which points can communicate with $p$, one can determine the null cone $N_p$ (the boundary set of points) in $T_p$. The knowledge of the null cone gives the ratio mentioned in the Exc. and determines the metric up to a conformal factor, as we shall see below. Let $X, Y \in T_p$ be respectively timelike and spacelike vectors and let $\lambda$ be the variable such that $X + \lambda Y$ is a null vector; then the quadratic equation in $\lambda$
(ii) $g(X + \lambda Y, X + \lambda Y) = g(X, X) + 2\lambda g(X, Y) + \lambda^2 g(Y, Y) = 0$
has two real roots since $g(X, X) < 0$ and $g(Y, Y) > 0$. If $N_p$ is known, $\lambda_1$ and $\lambda_2$ can be determined, giving the required ratio between the magnitudes of a timelike vector and a spacelike vector:
(iii) $\lambda_1 \lambda_2 = g(X, X)/g(Y, Y)$.
Suppose now $X'$ and $Y'$ are any two non-null vectors at $p$, then
(iv) $g(X', Y') = -\frac{1}{2} \left( g(X', X') + g(Y', Y') - g(X' + Y', X' + Y') \right)$.
Each of the magnitudes on the RHS can now be compared with the magnitudes of either $X$ or $Y$, eventually leading to the determination of $g(X', Y')/g(X, X)$ or $g(X', Y')/g(Y, Y)$. This means that the metric can be determined up to a conformal factor. (If $X' + Y'$ turns out to be a null vector, the expression (iv) may be suitably altered, replacing $X' + Y'$ by $X' + 2Y'$.)
2. If $K$ is a Killing vector field, then locally it satisfies:
(i) $K_{a;b} + K_{b;a} = 0$.
Define
(ii) $P^a = T^{ab} K_b$
then we have:
(iii) $P^a{}_{;a} = T^{ab}{}_{;a} K_b + T^{ab} K_{b;a}$.
The first term in (iii) is zero because of the conservation equation (8.2.2) and the second is zero, as $T^{ab}$ is symmetric and $K$ satisfies (i). Hence if $\mathcal{D}$ is a compact orientable region with boundary $\partial\mathcal{D}$, we have (Gauss' theorem)
(iv) $\int_{\partial\mathcal{D}} P^b \, d\sigma_b = \int_{\mathcal{D}} P^b{}_{;b} \, dv = 0$
which gives the conservation law in the following sense: the vanishing of $P^b{}_{;b}$ in $\mathcal{D}$ implies that the total flux of the $K$-component of energy-momentum over a closed surface ($\partial\mathcal{D}$) is zero. In the case of a flat metric (just as in special relativity), the Killing vectors are simply
(v) $K_\mu = \frac{\partial}{\partial x^\mu}$ ($\mu = 1, 2, 3, 4$)
and the conservation law is therefore obvious. When the metric is not flat, in general there will be no Killing vectors and so the conservation law will not hold good. However, one may introduce normal coordinates in a suitable neighbourhood of a point $p$ and then define the Killing vectors as in (v) to obtain the conservation law in that neighbourhood of $p$.
3. Recall that the volume element $dv$ depends on the metric, and therefore it varies with the metric. Now $dv = (4!)^{-1} \omega$, where $\omega$ is the four-form whose components are:
(i) $\omega_{abcd} = (-g)^{1/2} \, 4! \, \delta^1{}_{[a} \delta^2{}_b \delta^3{}_c \delta^4{}_{d]} = (-g)^{1/2} \Lambda$ 23
where $g = \det(g_{ab})$. This gives:
(ii) $\frac{\partial \omega_{abcd}}{\partial g_{ef}} = -\frac{1}{2} (-g)^{-1/2} \frac{\partial g}{\partial g_{ef}} \Lambda = -\frac{1}{2} (-g)^{-1/2} g \, g^{ef} \Lambda = \frac{1}{2} g^{ef} \omega_{abcd}$.
Hence
$\frac{\partial (dv)}{\partial g_{ab}} = \frac{1}{2} g^{ab} \, dv$.
4. Using a scalar field $\Psi$ the Lagrangian can be written as:
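(i) $L = -\frac{1}{2} \Psi_{;a} \Psi_{;b} \, g^{ab} - \frac{1}{2} \frac{m^2}{\hbar^2} \Psi^2$
where $m$ and $\hbar$ are constants (see also Chapters 3 and 9); in fact if $\Psi$ represents a particle, then $m$ is the mass of that particle and $\hbar$ is the (reduced) Planck constant. The Euler-Lagrange equations (8.2.9) in this case are:
(ii) $\frac{\partial L}{\partial \Psi} - \left( \frac{\partial L}{\partial \Psi_{;c}} \right)_{;c} = 0$.
23 We have used the notation $\Lambda$ for $4! \, \delta^1{}_{[a} \delta^2{}_b \delta^3{}_c \delta^4{}_{d]}$ here; this should not be confused with the cosmological constant introduced in Eq. (8.1.17)(b) and then used in other subsequent equations.
Now
(iii) $\frac{\partial L}{\partial \Psi_{;c}} = -\frac{1}{2} \left( \delta^c{}_a \Psi_{;b} \, g^{ab} + \Psi_{;a} \delta^c{}_b \, g^{ab} \right) = -\frac{1}{2} \left( \Psi_{;k} \, g^{ck} + \Psi_{;k} \, g^{kc} \right)$
($k$ is used to replace the dummy indices $b$ and $a$ in the line above). Using the symmetry of $g^{ck}$ we have:
(iv) $\frac{\partial L}{\partial \Psi_{;c}} = -\Psi_{;k} \, g^{ck}$.
Substituting it in (ii) and using $a$, $b$ for $k$ and $c$, we find that the Euler-Lagrange equations are:
$\Psi_{;ab} \, g^{ab} - \frac{m^2}{\hbar^2} \Psi = 0$.
To obtain the energy-momentum tensor we use the equality (8.2.11) to write the integrand for this case:
(v) Integrand $= \frac{\partial L}{\partial \Psi_{;c}} \Delta(\Psi_{;c}) + \frac{\partial L}{\partial g_{ab}} \Delta g_{ab} + \frac{1}{2} L \, g^{ab} \Delta g_{ab}$.
The first term on the RHS is zero since the variation $\Delta(\Psi_{;c}) = 0$: $\Psi$ being a scalar, $\Psi_{;c}$ does not contain a connection term. Simplifying the other two terms and adjusting the indices we have:
(vi) $T_{ab} = \Psi_{;a} \Psi_{;b} - \frac{1}{2} g_{ab} \left( \Psi_{;c} \Psi_{;d} \, g^{cd} + \frac{m^2}{\hbar^2} \Psi^2 \right)$.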
5. When we use the Lagrangian given in (b), i.e.,
(i) $L = -\frac{1}{16\pi} F_{ab} F_{cd} \, g^{ac} g^{bd}$
we note that although the initial field is $A$, the Lagrangian $L$ is in terms of the tensor field $F$; accordingly, in writing the integrand (given in (v) of Exc. 4), the tensor $F_{ab}$, etc., has to be used. Thus we have:
(ii) Integrand $= \frac{\partial L}{\partial F_{ij;k}} \Delta(F_{ij;k}) + \frac{\partial L}{\partial g_{ab}} \Delta g_{ab} + \frac{1}{2} L \, g^{ab} \Delta g_{ab}$.
Evidently the first term is zero and the other two terms when simplified give the required expression (a). We note that the relation $F_{ab} = 2A_{[b;a]}$ is required for showing $T^{ab}{}_{;b} = 0$.
6. To establish the result given in this Exc., we recall two properties of diffeomorphisms on $\mathcal{M}$. Suppose $\phi: \mathcal{M} \to \mathcal{M}$ is a diffeomorphism which is the identity everywhere except in the interior of a compact region $\mathcal{D}$ of $\mathcal{M}$; then an integral
$I = \int_{\mathcal{D}} F$
is invariant under the differential map $\phi^*$ induced by $\phi$, i.e.,
(i) $\int_{\mathcal{D}} F = \int_{\phi(\mathcal{D})} F = \int_{\mathcal{D}} \phi^*(F)$.
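In case
(ii) $L_X T|_p = \lim_{t \to 0} \frac{1}{t} \left( T|_p - \phi_{t*} T|_p \right)$
where $p$ is a point in whose neighbourhood $\phi_t$ is defined. In this particular case we again assume that diffeomorphism
(iii) $S = \int_{\mathcal{D}} L \, dv = \frac{1}{4!} \int_{\mathcal{D}} L\omega = \frac{1}{4!} \int_{\phi(\mathcal{D})} L\omega = \frac{1}{4!} \int_{\mathcal{D}} \phi^*(L\omega)$
implies that
(iv) $\frac{1}{4!} \int_{\mathcal{D}} \left( L\omega - \phi^*(L\omega) \right) = 0$.
If the diffeomorphism $\phi$ is generated by a vector field $X$ which vanishes everywhere except in the interior of $\mathcal{D}$, then (iv) amounts to:
(v) $\frac{1}{4!} \int_{\mathcal{D}} L_X(L\omega) = 0$.
Writing this Lie-derivative in full we have: (vi)
(where we have used $L_X(L\omega) = (L_X L)\omega + L(L_X \omega)$, have replaced $\frac{1}{4!}\omega$ by
(vii) $\int_{\mathcal{D}} (T^{ab} L_X g_{ab}) \, dv = 2 \int_{\mathcal{D}} \left( (T^{ab} X_a)_{;b} - T^{ab}{}_{;b} X_a \right) dv$.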
Since the vector field $X$ is zero on the boundary $\partial\mathcal{D}$, writing the first term, with $Y^b = T^{ab} X_a$, as:
$2 \int_{\mathcal{D}} Y^b{}_{;b} \, dv = 2 \int_{\partial\mathcal{D}} Y^b \, d\sigma_b$
we note that it is zero. Thus (v) will hold only if the second term in (vii) is zero for arbitrary $X$, which is true only if $T^{ab}{}_{;b} = 0$. We have thus shown that the conservation equation is a consequence of the field equations.
7. We have to obtain here the energy-momentum tensor of matter which is a fluid; as such it is described by a function $\rho$, the particle number density, and by a congruence24 of timelike curves, the world flow lines25 of the fluid elements. The congruence can be represented by a diffeomorphism:
(i)
$\gamma: [a, b] \times \mathcal{N} \to \mathcal{D}$
where $[a, b] \subset \mathbb{R}$, $\mathcal{N}$ is a 3-dimensional manifold with boundary and $\mathcal{D}$ is a sufficiently small compact region of $(\mathcal{M}, g)$. Since the congruence is timelike, the tangent vector $W = (\partial/\partial t)_\gamma$, $t \in [a, b]$, is timelike, i.e., the corresponding unit tangent vector $V = (-g(W, W))^{-1/2} W$ satisfies $g(V, V) = -1$. The fluid particle current vector is defined by $j = \rho V$, and it is required that this particle current vector be conserved, i.e., $j^a{}_{;a} = 0$. The behaviour of the fluid is determined by prescribing the elastic potential (or the internal energy) $\varepsilon$ as a function of $\rho$. We take $L = -\rho(1 + \varepsilon(\rho))$ as the Lagrangian and, to obtain $T^{ab}$, require that the action $S$ be stationary when the flowlines are varied subject to the constraint $j^a{}_{;a} = 0$, where $\rho$ is the proper particle density of the fluid. We note that while varying the flowlines, $\rho$ is always adjusted so that $j^a$ remains conserved. Let a differentiable map $\gamma: (-\delta, \delta) \times [a, b] \times \mathcal{N} \to \mathcal{D}$ define a variation of the flowlines such that $\gamma(0, [a, b], \mathcal{N}) = \gamma([a, b], \mathcal{N})$ and, for $u \in (-\delta, \delta)$, $\gamma(u, [a, b], \mathcal{N}) = \gamma([a, b], \mathcal{N})$ on $\mathcal{M} - \mathcal{D}$. Then if $K = (\partial/\partial u)_\gamma$, $\Delta W = L_K W$ (see Hint to Exc. 6). The vector $K$ represents the displacement of a point of the flowline under the above variation. In terms of the corresponding unit vector $V$, the variation can be written as:
(ii) $\Delta V^a = V^a{}_{;b} K^b - K^a{}_{;b} V^b - V^a V^b K_{b;c} V^c$.
Since $\Delta(j^a{}_{;a}) = 0 = (\Delta j^a)_{;a}$, from $j = \rho V$ it follows that
(iii) $(\Delta\rho)_{;a} V^a + \Delta\rho \, V^a{}_{;a} + \rho_{;a} \Delta V^a + \rho (\Delta V^a)_{;a} = 0$.
Substituting $\Delta V^a$ from (ii) and integrating along the flowlines, we have:
(iv) $\Delta\rho = (\rho K^b)_{;b} + \rho K_{b;c} V^b V^c$.
24 A congruence on a manifold $\mathcal{M}$ is the datum of a family of curves, with one curve through each point of $\mathcal{M}$.
25 The term 'flowlines', which is often used in 3-dimensional Euclidean space, is used in this case in 4-dimensional spacetime.
The variation of the action integral $S = \int L \, dv$ can therefore be expressed as26:
(v) $\left. \frac{dS}{du} \right|_{u=0} = \int_{\mathcal{D}} \left\{ (\mu + p) \dot{V}^a + p_{;b} (g^{ab} + V^a V^b) \right\} K_a \, dv$
(where we have integrated by parts and have used $\dot{V}^a$ for $V^a{}_{;b} V^b$). The action $S$ is stationary if the RHS is zero for all $K$; this means that:
(vi) $(\mu + p) \dot{V}^a = -p_{;b} (g^{ab} + V^b V^a)$
where we have written $\rho(1 + \varepsilon) = \mu$ and $\rho^2 \left( \frac{d\varepsilon}{d\rho} \right) = p$ to denote the 'energy density' and the 'pressure' of the fluid respectively. The above equation shows that $\dot{V}^a$, the acceleration of the flowlines, is given by the pressure gradient orthogonal to the flowlines. In order to obtain $T^{ab}$, we have to vary the metric as well (see Exc. 3 above). Now the conservation of the current can be expressed as:
(vii) $j^a{}_{;a} = (-g)^{-1/2} \frac{\partial}{\partial x^a} \left( (-g)^{1/2} j^a \right) = 0$
and since this conservation equation determines $j^a$ uniquely at every point of a flowline in terms of its initial value at some given point of that flowline, it follows that $(-g)^{1/2} j^a$ remains the same when the metric is varied. Using the variation of the expression:
$\rho^2 = g^{-1} \left( (-g)^{1/2} j^a \, (-g)^{1/2} j^b \right) g_{ab}$
we thus have:
(viii) $2\rho \Delta\rho = (j^c j_c \, g^{ab} - j^a j^b) \Delta g_{ab}$
and in view of the above discussions $T^{ab}$ can be written as:
(ix) $T^{ab} = \left( \rho(1 + \varepsilon) + \rho^2 \frac{d\varepsilon}{d\rho} \right) V^a V^b + \rho^2 \frac{d\varepsilon}{d\rho} \, g^{ab} = (\mu + p) V^a V^b + p \, g^{ab}$.
Any "matter" whose energy-momentum tensor is given by (ix) is called a 'perfect fluid', regardless of whether or not it is derived from a Lagrangian. Using the energy-momentum conservation equation (8.2.2), we obtain from (ix):
(x)(a) $\mu_{;a} V^a + (\mu + p) V^a{}_{;a} = 0$
(x)(b) $(\mu + p) \dot{V}^a + (g^{ab} + V^a V^b) p_{;b} = 0$.
26 $\varepsilon(\rho)$ is written as $\varepsilon$.
We call a perfect fluid 'isentropic' if the pressure $p$ is a function of the energy density $\mu$ only. In this case $\rho$ can be treated as a conserved quantity, and the above equations as well as $T^{ab}$ can be derived from the Lagrangian in terms of $\rho$ and the internal energy $\varepsilon$. (Note that pressure and energy have been denoted differently in Exc. (8.1.3).)
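The perfect-fluid form $T^{ab} = (\mu + p) V^a V^b + p \, g^{ab}$ obtained in the hint to Exc. 7 can be explored numerically. The Python sketch below is an illustration under stated assumptions: the metric components are taken as diag(+1, +1, +1, -1) at the point considered, the four-velocity is a boost along $x^1$ with a hypothetical rapidity parameter `chi`, and all function names are invented for the example. It checks two algebraic consequences of the formula: the trace $g_{ab} T^{ab} = -\mu + 3p$ (the combination appearing in footnote 22) and the energy density $T^{ab} V_a V_b = \mu$ measured along the flow.

```python
import math

# Assumption: metric components diag(+1, +1, +1, -1); g^{ab} and g_{ab}
# then have the same numerical components.
G = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]
G[3][3] = -1.0


def lower(v):
    """Lower the index of a vector: v_a = g_ab v^b."""
    return [sum(G[a][b] * v[b] for b in range(4)) for a in range(4)]


def boosted_velocity(chi):
    """Unit timelike vector boosted along x^1 with rapidity chi (g(V, V) = -1)."""
    return [math.sinh(chi), 0.0, 0.0, math.cosh(chi)]


def perfect_fluid(mu, p, V):
    """T^{ab} = (mu + p) V^a V^b + p g^{ab}."""
    return [[(mu + p) * V[a] * V[b] + p * G[a][b] for b in range(4)]
            for a in range(4)]


def trace(T):
    """g_ab T^{ab}."""
    return sum(G[a][b] * T[a][b] for a in range(4) for b in range(4))


def energy_density(T, V):
    """T^{ab} V_a V_b, the energy density seen by an observer moving with V."""
    Vl = lower(V)
    return sum(T[a][b] * Vl[a] * Vl[b] for a in range(4) for b in range(4))
```

For example, with `V = boosted_velocity(0.3)` and `T = perfect_fluid(2.0, 0.5, V)`, the trace evaluates to $-2.0 + 3 \times 0.5$ and `energy_density(T, V)` returns the value of $\mu$, independently of the boost.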
3 CURVATURE AND ENERGY CONDITIONS
Nowadays it is well known that the minimal condition for a spacetime to be 'singularity free' is that it be 'geodesically complete with respect to timelike and null geodesics.' However, before the appearance of Penrose's work, the word singularity was used very differently. It referred to those solutions of (8.2.24) which were ill-behaved in some sense, e.g., they were infinite or discontinuous (in some neighbourhood). Naturally the curvature and the energy-momentum tensor played a key role here. We shall describe their roles briefly and state a few results (without proof) concerning them. We shall then give 'exact solutions' of Eqs. (8.2.24) in some important cases in the next section. As the manifold $(\mathcal{M}, g)$ carries a Lorentz metric, the concepts of the motion of a particle, its acceleration, etc., differ from the case in which $g$ is Riemannian. We shall therefore consider the effect of spacetime curvature on families of timelike and null curves (these curves could be the world lines of fluids or the histories of photons), and in the process shall acquaint ourselves with terms such as the 'rate of change of vorticity, shear and expansion' of such families of curves. The formula relating to the expansion (known as Raychaudhuri's equation) plays an important role in the proofs of singularity theorems (see [16c]). We shall also discuss the inequalities satisfied by the energy-momentum tensor (known as energy conditions) to show that the gravitational effects of matter always tend to cause convergence of timelike and null curves. We shall also use these 'energy conditions' to establish that conjugate or focal points occur in families of non-rotating timelike or null geodesics in general spacetimes. Finally, we shall see that the existence of conjugate points implies the existence of variations of curves (between two points).
These variations take a null geodesic into a timelike curve and a timelike geodesic into a longer timelike curve. In order to introduce the terms mentioned above, we consider a congruence of timelike curves with (timelike) unit tangent vector $V$ ($g(V, V) = -1$). These curves could represent the histories of small (test) particles, and thus would be geodesics, or could be the flow lines of a fluid. If this were a perfect fluid, then one would have (see Exc. 8.2.7):
$(\mu + p) \dot{V}^a = -p_{;b} h^{ab}$ (8.3.1)
where $\mu$ is the energy density and $p$ is the pressure of the fluid, and $h^{ab}$ is the spacelike metric coming from the projection tensor $h^{ab} = g^{ab} + V^a V^b$ that projects every vector $X \in T_q$ (the tangent space at $q \in \mathcal{M}$) into its component in the subspace $H_q$ of $T_q$ which is orthogonal to $V$. The vector $\dot{V}^a = V^a{}_{;b} V^b$ (as we know from Sec. 2) is the acceleration vector of the flow lines.
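The action of the projection tensor described above can be made concrete with a small numerical experiment. The Python sketch below is only illustrative and makes the following assumptions: the metric components at $q$ are diag(+1, +1, +1, -1), the projector is used in its mixed-index form $h^a{}_b = \delta^a{}_b + V^a V_b$ (obtained by lowering one index), and the function names are hypothetical. It checks that the projected vector is orthogonal to $V$, that $V$ itself is annihilated, and that projecting twice changes nothing.

```python
import math

# Assumption: metric components diag(+1, +1, +1, -1) at the point q.
G = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]
G[3][3] = -1.0


def inner(x, y):
    """g(x, y) = g_ab x^a y^b."""
    return sum(G[a][b] * x[a] * y[b] for a in range(4) for b in range(4))


def lower(v):
    """v_a = g_ab v^b."""
    return [sum(G[a][b] * v[b] for b in range(4)) for a in range(4)]


def projector(V):
    """Mixed components h^a_b = delta^a_b + V^a V_b for a unit timelike V."""
    Vl = lower(V)
    return [[(1.0 if a == b else 0.0) + V[a] * Vl[b] for b in range(4)]
            for a in range(4)]


def apply(h, X):
    """Return the projected vector with components h^a_b X^b."""
    return [sum(h[a][b] * X[b] for b in range(4)) for a in range(4)]
```

As a usage example, take `V = [math.sinh(0.4), 0.0, 0.0, math.cosh(0.4)]` and any vector `X`; then `inner(apply(projector(V), X), V)` vanishes up to rounding, which is the statement that the image lies in the subspace orthogonal to $V$.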
3.1 The Separation Vector, Vorticity, Shear and Expansion
Given a curve $\Gamma(t)$ with tangent vector $Z = (\partial/\partial t)_{\Gamma}$, we construct a family of curves $\Gamma(t, s)$ by moving each point of the curve $\Gamma(t)$ a distance $s$ along the integral curves of $V$. We now define $Z$ as the tangent vector $(\partial/\partial t)_{\Gamma(t,s)}$ and note that $V$ and $Z$ are related to each other by the Lie differential equation $L_V Z = 0$; in other words, their covariant derivatives satisfy:
$$\frac{D}{\partial s} Z^a = V^a{}_{;b} Z^b$$ (8.3.2)
(see Chapter 0, Sec. 5). The vector $Z$ represents the separation of points that are equidistant from some arbitrary initial points along two neighbouring curves. If one adds a multiple of $V$ to $Z$, then this vector $(Z + aV)$ will represent the separation of points on those two curves but at different distances along the curves. Since one is interested in the separation of neighbouring curves and not of the points, one needs to consider $Z$ modulo a component parallel to $V$. In other words, one needs to consider the projection of $Z$ at each point $q$ into the space $Q_q$ formed by equivalence classes of vectors such as $(Z + aV)$. This space can be represented by $H_q$ which, as we know, is formed by vectors orthogonal to $V$. The projection of $Z$ into $H_q$ is denoted as ${}_\perp Z^a = h^a{}_b Z^b$. From (8.3.2) it follows that$^{27}$
$$\left(\frac{D}{\partial s}\right)_{\!\perp} ({}_\perp Z^a) = V^a{}_{;b}\,{}_\perp Z^b$$ (8.3.3)
This gives the rate of change of the separation of two infinitesimally neighbouring curves as measured in $H_q$. Operating again with $D/\partial s$ and taking the projection, we obtain after some adjustments:
$$h^a{}_b \frac{D}{\partial s}\left(h^b{}_c \frac{D}{\partial s}\,{}_\perp Z^c\right) = -R^a{}_{bcd}\,{}_\perp Z^c V^b V^d + h^a{}_b\,\dot V^b{}_{;c}\,{}_\perp Z^c + \dot V^a \dot V_b\,{}_\perp Z^b$$ (8.3.4)
The above equation, known as the deviation or Jacobi equation, gives the relative acceleration (second-order time derivative of the separation) of two infinitesimally neighbouring curves as measured in $H_q$. If these curves are geodesics, the deviation depends only on the Riemann tensor. Also, in analogy with Newtonian theory, where the acceleration of a particle is the gradient of the potential $\Phi$ and the relative acceleration of two particles with separation $Z^a$ is $\Phi_{;ab} Z^b$, the Riemann tensor term $R_{abcd} V^b V^d$ here represents the tidal force. In order to study it further, we introduce dual orthonormal bases $(E_i)$ and $(E^i)$ ($i = 1, \ldots, 4$) of $T_q$ and $T^*_q$ at some point $q$ on an integral curve $\lambda(s)$ of $V$, with $E_4 = V$. If $\lambda(s)$ is a geodesic, this dual basis can be parallelly propagated to any other point $q'$ maintaining the same relationship as at $q$; for a general curve, this is not possible. To overcome this we define a generalized form of the (covariant) derivative along $\lambda(s)$ known as the Fermi derivative $D_F/\partial s$. For a vector field $X$ along $\lambda(s)$, this is defined as:
$$\frac{D_F X}{\partial s} = \frac{DX}{\partial s} - g\!\left(X, \frac{DV}{\partial s}\right) V + g(X, V)\,\frac{DV}{\partial s}$$ (8.3.5)
27 Note that $(D/\partial s)_\perp$ is the projection of the operator $D/\partial s$, and can be written as $h^a{}_b\,D/\partial s$, etc.
and has the properties:
(i) $\dfrac{D_F}{\partial s} = \dfrac{D}{\partial s}$ if $\lambda(s)$ is a geodesic;
(ii) $\dfrac{D_F V}{\partial s} = 0$;
(iii) if $X$ and $Y$ are vector fields along $\lambda(s)$ such that
$$\frac{D_F X}{\partial s} = 0 = \frac{D_F Y}{\partial s}$$ (8.3.6)
then $g(X, Y)$ is constant along $\lambda(s)$;
(iv) if $X$ is a vector field along $\lambda(s)$ orthogonal to $V$, then
$$\frac{D_F X}{\partial s} = \left(\frac{DX}{\partial s}\right)_{\!\perp}$$
In view of the above properties, an orthonormal basis whose Fermi derivative is zero at all points of $\lambda(s)$ retains its orthonormality as well as the identification $E_4 = V$. The vectors $E_1$, $E_2$, $E_3$ can be considered as giving a non-rotating set of axes along $\lambda(s)$. Since $D_F/\partial s$ is a generalization of $D/\partial s$, we can extend it to tensors along with the rules that are obeyed by $D/\partial s$ (see 0.5). As a result, we note that $D_F/\partial s$ propagates the dual basis $(E^i)$ of $T^*_q$ along $\lambda(s)$. Using the Fermi derivatives we can write (8.3.3) and (8.3.4) respectively as:
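$$\frac{D_F}{\partial s}\,{}_\perp Z^a = V^a{}_{;b}\,{}_\perp Z^b$$ (8.3.7)
$$\frac{D_F^2}{\partial s^2}\,{}_\perp Z^a = -R^a{}_{bcd}\,{}_\perp Z^c V^b V^d + h^a{}_b\,\dot V^b{}_{;c}\,{}_\perp Z^c + \dot V^a \dot V_b\,{}_\perp Z^b$$ (8.3.8)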
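But ${}_\perp Z$ is orthogonal to $V$ and thus has components only with respect to $E_1$, $E_2$, $E_3$, i.e., it can be expressed as $Z^\alpha E_\alpha$ ($\alpha = 1, 2, 3$). As a result (8.3.7) and (8.3.8) can be written in terms of ordinary derivatives:
$$\frac{d}{ds} Z^\alpha = V_{\alpha;\beta} Z^\beta$$ (8.3.9)
$$\frac{d^2}{ds^2} Z^\alpha = \left(-R_{\alpha 4 \beta 4} + \dot V_{\alpha;\beta} + \dot V_\alpha \dot V_\beta\right) Z^\beta$$ (8.3.10)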
(Note that the $V_{\alpha;\beta}$ are those components of $V_{a;b}$ for which $a = \alpha$ and $b = \beta$.) As the components $Z^\alpha$ ($\alpha = 1, 2, 3$) obey a first-order linear differential equation, they can be written in terms of a $(3 \times 3)$ matrix $A_{\alpha\beta}(s)$ all along $\lambda(s)$ once their values at a point $q$ are given:
$$Z^\alpha(s) = A_{\alpha\beta}(s)\, Z^\beta|_q$$ (8.3.11)
where it is assumed that the matrix $A_{\alpha\beta}(s)$ satisfies:
$$A_{\alpha\beta}(s)|_q = \delta_{\alpha\beta}$$ (8.3.12)(a)
$$\frac{d}{ds} A_{\alpha\beta}(s) = V_{\alpha;\gamma} A_{\gamma\beta}(s)$$ (8.3.12)(b)
Using the usual properties of matrices we can write:
$$A_{\alpha\beta} = O_{\alpha\gamma} S_{\gamma\beta}$$ (8.3.13)
where $O_{\alpha\gamma}$ is an orthogonal matrix with positive determinant and $S_{\gamma\beta}$ is a symmetric matrix; we assume that they are both unit matrices at the point $q$. The matrix $O_{\alpha\beta}$ represents the rotation that neighbouring curves undergo with respect to the Fermi-propagated basis, whereas $S_{\alpha\beta}$ represents the separation of these curves from $\lambda(s)$. The determinant of $S_{\alpha\beta}$ (which equals the determinant of $A_{\alpha\beta}$) can be thought of as representing the three-volume of the surface orthogonal to $\lambda(s)$ swept out by the neighbouring curves. Now at $q$, where $A_{\alpha\beta}$ is the unit matrix, $dO_{\alpha\beta}/ds$ is antisymmetric and $dS_{\alpha\beta}/ds$ is symmetric; hence it follows that the rate of rotation of neighbouring curves at $q$ is given by the antisymmetric part of $V_{\alpha;\beta}$ and the rate of change of separation by the symmetric part of $V_{\alpha;\beta}$, while the rate of change of volume is given by the trace of $V_{\alpha;\beta}$. The vorticity and the expansion tensors are therefore defined as:
$$\omega_{ab} = h_a{}^c h_b{}^d V_{[c;d]}$$ (8.3.14)
$$\theta_{ab} = h_a{}^c h_b{}^d V_{(c;d)}$$ (8.3.15)
whereas the volume expansion is given as:
$$\theta = \theta_{ab} h^{ab} = V_{a;b} h^{ab} = V^a{}_{;a}$$ (8.3.16)
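In the Fermi-propagated orthonormal basis, all these can be expressed in terms of the matrix $A_{\alpha\beta}$ and its inverse, thus:
$$\omega_{\alpha\beta} = -A^{-1}_{\gamma[\alpha} \frac{d}{ds} A_{\beta]\gamma}$$ (8.3.17)
$$\theta_{\alpha\beta} = A^{-1}_{\gamma(\alpha} \frac{d}{ds} A_{\beta)\gamma}$$ (8.3.18)
$$\theta = (\det A)^{-1} \frac{d}{ds}(\det A)$$ (8.3.19)
Equation (8.3.10) in terms of $A_{\alpha\beta}$ can be written as
$$\frac{d^2}{ds^2} A_{\alpha\beta} = \left(-R_{\alpha 4 \gamma 4} + \dot V_{\alpha;\gamma} + \dot V_\alpha \dot V_\gamma\right) A_{\gamma\beta}$$ (8.3.20)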
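and can be used in calculating the propagation of the vorticity, shear and expansion along the integral curves of $V$ once one knows the Riemann tensor. The first two of these are:
$$\frac{d}{ds}\omega_{\alpha\beta} = 2\,\omega_{\gamma[\alpha}\theta_{\beta]\gamma} + \dot V_{[\alpha;\beta]}$$ (8.3.21)
$$\frac{d}{ds}\theta_{\alpha\beta} = -R_{\alpha 4 \beta 4} - \omega_{\alpha\gamma}\omega_{\gamma\beta} - \theta_{\alpha\gamma}\theta_{\gamma\beta} + \dot V_{(\alpha;\beta)} + \dot V_\alpha \dot V_\beta$$ (8.3.22)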
They are obtained by multiplying (8.3.20) by $A^{-1}$ and then taking the antisymmetric part in the case of (8.3.21) and the symmetric part in the case of (8.3.22). In order to write the third one, we use the trace-free part of $\theta_{ab}$, which is called the shear tensor:
$$\sigma_{ab} = \theta_{ab} - \frac{1}{3} h_{ab}\,\theta$$ (8.3.23)
We thus have:
$$\frac{d}{ds}\theta = -R_{ab} V^a V^b + 2\omega^2 - 2\sigma^2 - \frac{1}{3}\theta^2 + \dot V^a{}_{;a}$$ (8.3.24)
where $2\omega^2 = \omega_{ab}\omega^{ab} \geq 0$ and $2\sigma^2 = \sigma_{ab}\sigma^{ab} \geq 0$.
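The origin of the quadratic terms in (8.3.24) can be checked numerically. The sketch below is an illustration only and is not taken from the text: it treats an arbitrary $3 \times 3$ array `B` as a stand-in for the components $V_{\alpha;\beta}$ in a Fermi-propagated orthonormal basis, splits it into expansion, shear and vorticity, and verifies the algebraic identity $B_{\alpha\beta}B_{\beta\alpha} = \frac{1}{3}\theta^2 + 2\sigma^2 - 2\omega^2$, which is what yields the terms $2\omega^2 - 2\sigma^2 - \frac{1}{3}\theta^2$ when the trace of (8.3.22) is taken. The function names and the sample numbers are assumptions chosen for the example.

```python
# Illustrative sketch (names and numbers are hypothetical, not from the text).
# B[a][b] plays the role of V_{alpha;beta} in an orthonormal spatial basis.

def decompose(B):
    """Split B into expansion theta (trace), shear sigma (symmetric,
    trace-free part) and vorticity omega (antisymmetric part)."""
    n = 3
    theta = sum(B[i][i] for i in range(n))
    omega = [[0.5 * (B[i][j] - B[j][i]) for j in range(n)] for i in range(n)]
    sigma = [[0.5 * (B[i][j] + B[j][i]) - (theta / 3.0 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return theta, sigma, omega

def half_square(M):
    """Return (1/2) M_ab M_ab, i.e. sigma^2 or omega^2 in the text's notation."""
    return 0.5 * sum(M[i][j] ** 2 for i in range(3) for j in range(3))

B = [[0.3, 1.0, 0.0],
     [-0.2, -0.1, 0.5],
     [0.4, 0.0, 0.2]]

theta, sigma, omega = decompose(B)
sigma2 = half_square(sigma)
omega2 = half_square(omega)

# Quadratic term B_ab B_ba that appears when (8.3.22) is traced.
quad = sum(B[i][j] * B[j][i] for i in range(3) for j in range(3))
rhs = theta ** 2 / 3.0 + 2.0 * sigma2 - 2.0 * omega2
print(quad, rhs)
```

The two printed numbers agree up to rounding, which is why vorticity enters (8.3.24) with a positive sign and shear with a negative sign.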
Note that (8.3.24) is the trace of (8.3.22). This equation, which is of great importance in relativity theory, was discovered by Landau and independently by Raychaudhuri (see [16c]). From (8.3.22) and (8.3.24) the role played by the Riemann tensor in the rate of change of separation of timelike curves is obvious. We now discuss this role in the case of null curves. For simplicity we consider the case of a congruence of null geodesics (which could represent the histories of photons) with tangent vector $K$ ($g(K, K) = 0$). Since $g(K, K) = 0$ here, we do not have an arc-length with which to parametrize these curves. We can only choose an affine parameter $v$, and the tangent vector $K$ then obeys (see Def. (0.3.13)):
$$\frac{D}{\partial v} K^a = K^a{}_{;b} K^b = 0.$$
We should, however, bear in mind that this choice is not unique, since if we replace $v$ by $fv$, the tangent vector becomes $f^{-1}K$. We also note that the subspace $Q_q$, the quotient of $T_q$ by $K$, is no longer isomorphic to the subspace $H_q$ of $T_q$ formed by vectors orthogonal to $K$, since $H_q$ includes $K$ itself ($g(K, K) = 0$). We are therefore interested here in the subspace $S_q$ consisting of equivalence classes of vectors in $H_q$ that differ from each other by a multiple of $K$. We shall soon see that these spaces can be spanned in terms of dual bases $E_1, E_2, E_3, E_4$ and $E^1, E^2, E^3, E^4$ of $T_q$ and $T^*_q$ respectively at a point $q$ on a curve $\Gamma(v)$. These dual bases (quite naturally) are not taken to be orthonormal. We take $E_4 = K$ and take $E_3$ as another null vector denoted $K'$, and assume that $g(E_3, E_4) = -1$. The remaining vectors $E_1$ and $E_2$ are taken as unit spacelike vectors orthogonal to $E_3$ and $E_4$. It can be checked that the space $H_q$ is spanned by $E_1$, $E_2$ and $E_4$, whereas the projections of $E_1$, $E_2$ and $E_3$ into $Q_q$ form a basis of $Q_q$, and similarly the projections of $E_1$ and $E_2$ into $S_q$ form a basis of $S_q$. A basis with properties similar to those of $E_1, E_2, E_3, E_4$ is called a pseudo-orthonormal basis. A parallel transport of this basis along the geodesic $\Gamma(v)$ assigns a pseudo-orthonormal basis to each and every point of $\Gamma(v)$. We also note that due to the non-orthonormal character of the basis, the forms $E^3$ and $E^4$ are respectively $-K^a g_{ab}$ and $-K'^a g_{ab}$.
To write the deviation equation for null geodesics, we follow the procedure used for timelike curves; thus, denoting the separation vector between points on neighbouring curves by $Z$ and noting that $L_K Z = 0$ (as in the case of timelike curves), we have (see Def. (0.3.13))
$$\frac{D}{\partial v} Z^a = K^a{}_{;b} Z^b$$ (8.3.25)
$$\frac{D^2}{\partial v^2} Z^a = -R^a{}_{bcd} Z^c K^b K^d$$ (8.3.26)
Since $K^a{}_{;4} = 0$ ($K$ being tangent to the geodesic), Eq. (8.3.25) reduces to a system of ordinary differential equations for $Z^1$, $Z^2$, $Z^3$; hence the three components that pertain to $E_1$, $E_2$, $E_3$ obey:
$$\frac{d}{dv} Z^\alpha = K^\alpha{}_{;\beta} Z^\beta \qquad (\alpha, \beta = 1, 2, 3)$$ (8.3.27)
As $Q_q$ is spanned by the projections of $E_\alpha$ ($\alpha = 1, 2, 3$) into $Q_q$, the above equation says that the propagation of the projection of $Z$ into $Q_q$ involves only this projection, and not the component of $Z$ parallel to $K$. From $(K^a K^b g_{ab})_{;c} = 0$ it follows that $K^3{}_{;c} = 0$; this implies that $Z^3 = -Z^a K_a$ is constant along the geodesic $\Gamma(v)$. Physically, this says that light rays emitted from the same source at different times maintain a constant separation (in time). In view of this, one needs to consider only those neighbouring null geodesics which have purely spatial separations, i.e., those vectors $Z$ for which $Z^3 = 0$. The projections of these vectors lie in the subspace $S_q$ and obey the equation$^{28}$:
$$\frac{d}{dv} Z^i = K_{i;j} Z^j \qquad (i, j = 1, 2)$$ (8.3.28)
But this equation is similar to (8.3.9); hence, using similar arguments, we can express the $Z^i$ in terms of their values at some point $q$ of $\Gamma(v)$:
$$Z^i(v) = A_{ij}(v)\, Z^j|_q$$ (8.3.29)
where $A_{ij}$ is a $2 \times 2$ matrix that satisfies the following two equations:
$$\frac{d}{dv} A_{ij}(v) = K_{i;k} A_{kj}(v)$$ (8.3.30)
$$\frac{d^2}{dv^2} A_{ij}(v) = -R_{i4k4} A_{kj}(v)$$ (8.3.31)
28 The projection symbol $\perp$ in (8.3.27) and (8.3.28) is not used, although $Z^\alpha$ and $Z^i$ stand for projections into $Q_q$ and $S_q$ respectively.
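Using the same notation as in the timelike case but with a hat '^', we have equations for the rates of change of the vorticity $\hat\omega_{ij}$, the expansion $\hat\theta$ (the trace of the separation tensor $\hat\theta_{ij}$), and the shear tensor $\hat\sigma_{ij}$:
$$\frac{d}{dv}\hat\omega_{ij} = -\hat\theta\,\hat\omega_{ij} + 2\,\hat\omega_{k[i}\hat\sigma_{j]k}$$ (8.3.32)
$$\frac{d}{dv}\hat\theta = -R_{ab} K^a K^b + 2\hat\omega^2 - 2\hat\sigma^2 - \frac{1}{2}\hat\theta^2$$ (8.3.33)
$$\frac{d}{dv}\hat\sigma_{ij} = -C_{i4j4} - \hat\theta\,\hat\sigma_{ij} - \hat\sigma_{ik}\hat\sigma_{kj} - \hat\omega_{ik}\hat\omega_{kj} + \delta_{ij}(\hat\sigma^2 - \hat\omega^2)$$ (8.3.34)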
Note that the first two equations are similar to (8.3.21) and (8.3.24), Eq. (8.3.33) being the analogue of the 'Raychaudhuri equation for timelike geodesics.' From (8.3.32) and (8.3.34) we note that just as we had in the timelike case, here also vorticity causes expansion while shear causes contraction.
3.2 Energy Conditions
Having acquainted ourselves with the gravitational part (which depends solely on geometry) of the Einstein equations, we next consider the matter part represented by the energy-momentum tensor. In an actual universe, however, this tensor is made up of contributions coming from a large number of matter fields, and as such it is almost impossible to describe it exactly even if one knew the precise form of each field and the equations of motion governing it. For example, one knows very little about the behaviour of matter under extreme conditions of density and pressure. In the absence of a reliable energy-momentum tensor, one might be led to conclude that Einstein's equations cannot be used to predict the occurrence of singularities in the universe. Fortunately, this impasse is handled by using the inequalities which this tensor obeys, and which seem perfectly reasonable as assumptions. These inequalities, known as 'energy conditions,' are often sufficient to prove the occurrence of singularities, even though the exact form of the tensor may not be known. These energy conditions are referred to as: (a) the weak energy condition, (b) the dominant energy condition, and (c) the strong energy condition, depending on their role in shaping the spacetime entities. We shall list here the consequences of (results based on) these energy conditions (skipping the mathematical details, see [16c]) with a view to studying the singularities.
3.2.1 The Weak Energy Condition
The basic inequality that is assumed here is that at each $p \in \mathcal{M}$ and for any timelike vector $W \in T_p$, the energy-momentum tensor obeys:
$$T_{ab} W^a W^b \geq 0$$ (8.3.35)(a)
By continuity this holds good even when $W$ is a null vector. The inequality (8.3.35)(a) is called the weak energy condition. (Note that this assumption implies that the energy density measured by any observer is non-negative.) In order to understand the meaning of the nomenclature used for energy conditions, we express the components of $T^{ab}$ (at a point $p$) with respect to an orthonormal basis $E_1, E_2, E_3, E_4$ ($E_4$ timelike) in one of the four canonical forms:
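$$T^{ab} = \begin{pmatrix} p_1 & 0 & 0 & 0 \\ 0 & p_2 & 0 & 0 \\ 0 & 0 & p_3 & 0 \\ 0 & 0 & 0 & \mu \end{pmatrix} \qquad \text{Type (i)}$$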
Here the energy-momentum (EM) tensor has a timelike eigenvector $E_4$ which is unique unless $\mu = -p_\alpha$ ($\alpha = 1, 2, 3$). The eigenvalue $\mu$ stands for the energy density and the eigenvalues $p_\alpha$ ($\alpha = 1, 2, 3$) represent the principal pressures in the three spacelike directions $E_\alpha$. The EM tensor has this form for all observed fields with non-zero rest mass and also for zero rest mass fields except when it is of Type (ii):
$$T^{ab} = \begin{pmatrix} p_1 & 0 & 0 & 0 \\ 0 & p_2 & 0 & 0 \\ 0 & 0 & \nu - k & \nu \\ 0 & 0 & \nu & \nu + k \end{pmatrix}, \qquad \nu = \pm 1 \qquad \text{Type (ii)}$$
The EM tensor has a double null eigenvector $(E_3 + E_4)$ here. This form has been observed to occur when the fields are of zero rest mass and they represent radiation travelling entirely in the $(E_3 + E_4)$ direction. In this case $p_1 = p_2 = k = 0$. For the Type (iii) and Type (iv) forms of the EM tensor given below, there are no observed fields which have EM tensors of this form.
$$T^{ab} = \begin{pmatrix} p & 0 & 0 & 0 \\ 0 & -\nu & 1 & 1 \\ 0 & 1 & -\nu & 0 \\ 0 & 1 & 0 & \nu \end{pmatrix} \qquad \text{Type (iii)}$$
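The EM tensor has a triple null eigenvector $(E_3 + E_4)$ here.
$$T^{ab} = \begin{pmatrix} p_1 & 0 & 0 & 0 \\ 0 & p_2 & 0 & 0 \\ 0 & 0 & -k & \nu \\ 0 & 0 & \nu & 0 \end{pmatrix}, \qquad k^2 < 4\nu^2 \qquad \text{Type (iv)}$$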
The EM tensor has no timelike or null eigenvector in this case. The weak energy condition holds for Type (i) if $\mu \geq 0$ and $\mu + p_\alpha \geq 0$ ($\alpha = 1, 2, 3$). It holds for Type (ii) if $p_1$, $p_2$ and $k$ are each $\geq 0$, and $\nu = +1$. The condition does not hold for Type (iii) and Type (iv). Two important cases regarding the weak energy condition are worth noting: in one of these it holds and in the other it does not. It holds when the theory involves the scalar field $\phi$ postulated by Brans and Dicke and by Dicke (see Sec. 28.4 in [26]); the field is required to be positive everywhere, and the EM tensor is of the form given in (vi) of Exc. (2.4) with mass $m = 0$. The condition does not hold for the scalar field proposed by Hoyle and Narlikar (see [19b]). The field here is known as the C-field and it has zero mass.
3.2.2 The Dominant Energy Condition
The energy condition is said to be dominant if for every timelike vector $W_a$
$$T^{ab} W_a W_b \geq 0$$ (8.3.35)(b)
and the vector $T^{ab} W_a$ is non-spacelike. The above condition implies that to any observer, the local energy density appears non-negative and the local energy flow vector is non-spacelike. Thus in any orthonormal basis, the energy dominates the other components of $T^{ab}$, i.e.:
$$T^{00} \geq |T^{ab}| \quad \text{for each } a, b$$ (8.3.35)(c)
This holds for Type (i) if $\mu \geq 0$, $-\mu \leq p_\alpha \leq \mu$ ($\alpha = 1, 2, 3$) and for Type (ii) if $k \geq 0$, $0 \leq p_i \leq k$ ($i = 1, 2$) and $\nu = +1$. Evidently the dominant energy condition is the weak energy condition with the additional requirement that the pressure should not exceed the energy density. We would further add that this has been observed to be true (i.e., $p_\alpha \leq \mu$, $p_i \leq k$) for all known forms of matter. We now give a few results based on energy conditions without proof (see [16c] for proofs).
3.3 Results Based on Energy Conditions
Conservation Theorem 8.3.1: Consider a compact region $\mathcal{U}$ of spacetime with past and future non-timelike boundaries $(\partial\mathcal{U})_1$, $(\partial\mathcal{U})_2$, and timelike boundary $(\partial\mathcal{U})_3$ characterized by a function $t = t(x_1, x_2, x_3, x_4)$ whose gradient is everywhere timelike (see Fig. (8.4)).
[Fig. 8.4: The region $\mathcal{U}$, showing the surfaces $\{t = \text{constant}\} \cap \mathcal{U}$ and the direction in which $t$ increases. $\mathcal{U}(\tau)$ is the part of $\mathcal{U}$ that lies to the past of the surface $\mathcal{H}(\tau)$ defined by $t = \tau$. $(\partial\mathcal{U})_1$ is the boundary for which the normal form $n$ is non-spacelike and $n_a t_{;b} g^{ab} > 0$; $(\partial\mathcal{U})_2$ is the boundary for which $n$ is non-spacelike and $n_a t_{;b} g^{ab} < 0$.]
Note that the sign of the normal form $n$ is chosen so that $\langle n, X \rangle$ is positive for all vectors $X$ which point out of $\mathcal{U}$.
(The proof of this theorem is based on differential geometry techniques; see Yano and Bochner [43] for this result in particular.) Corollary 8.3.2: If the EM tensor vanishes on a spacelike set $S$, then it also vanishes on the future Cauchy development $D^+(S)$, where $D^+(S)$ is defined as the set of all points $q \in \mathcal{M}$ such that every past-directed inextendible non-spacelike curve passing through $q$ intersects $S$ (see Fig. (8.5)). Note that if $q$ is any point of $D^+(S)$, the region of $D^+(S)$ to the past of $q$ is compact and therefore can be regarded as the region $\mathcal{U}$ of the above theorem. The result can thus be interpreted to mean that the dominant energy condition implies that matter cannot travel faster than light.
[Fig. 8.5: The future Cauchy development $D^+(S)$ of a spacelike set $S$.]
Similarly to the future Cauchy development $D^+(S)$, we have the past Cauchy development $D^-(S)$ of a spacelike set $S$. This consists of all points in $\mathcal{M}$ through which every future-directed inextendible non-spacelike curve intersects $S$.
Result 8.3.3: Consider the variation equation (8.3.33) of the expansion $\hat\theta$, which reduces to:
$$\frac{d}{dv}\hat\theta = -R_{ab} K^a K^b - 2\hat\sigma^2 - \frac{1}{2}\hat\theta^2$$
when the vorticity $\hat\omega$ is zero. If $R_{ab} W^a W^b \geq 0$ for any null vector $W$, then evidently $\hat\theta$ monotonically decreases along the null geodesic. The inequality $R_{ab} W^a W^b \geq 0$ is called the null convergence condition. From the Einstein equations
$$R_{ab} - \frac{1}{2} R g_{ab} + \Lambda g_{ab} = 8\pi T_{ab}$$
it is evident that if the EM tensor obeys the weak energy condition, then the above condition always holds good (i.e., $R_{ab} W^a W^b \geq 0$ for any null vector $W$) irrespective of the value of $\Lambda$.
Result 8.3.4: Let $W$ be a timelike vector; then from Equation (8.3.24) it follows that the expansion $\theta$ of a timelike geodesic congruence with zero vorticity decreases monotonically if $R_{ab} W^a W^b \geq 0$. This is called the timelike convergence condition. In view of the Einstein equations, this condition is satisfied if the EM tensor obeys the inequality:
$$T_{ab} W^a W^b \geq W^a W_a \left(\frac{1}{2} T - \frac{1}{8\pi}\Lambda\right)$$ (8.3.36)
This inequality holds for Type (i) if
$$\mu + p_\alpha \geq 0, \qquad \mu + \sum_\alpha p_\alpha - \frac{1}{4\pi}\Lambda \geq 0$$ (8.3.37)
and holds for Type (ii) if
$$\nu = +1, \quad k \geq 0, \quad p_1 \geq 0, \quad p_2 \geq 0 \quad \text{and} \quad p_1 + p_2 - \frac{1}{4\pi}\Lambda \geq 0$$ (8.3.38)
In case the inequality is satisfied for $\Lambda = 0$, the EM tensor is said to obey the strong energy condition. The 'strong energy condition' is obviously a stricter requirement than the 'weak energy condition.' It is known to hold good for the electromagnetic field and for the scalar field with zero mass. We note that for the general case, Type (i), it is violated if the energy density is negative or the pressure is large and negative; for instance, for a perfect fluid with density 1 gm cm$^{-3}$, it can be violated only if $p < -10^{15}$ atmospheres. We further note that a breakdown of the 'strong energy condition' sometimes leads to a breakdown of the 'singularity theorems' causing a singularity eventually. For example, the EM tensor of $\pi$ mesons$^{29}$ represents a breakdown of this nature.
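The Type (i) criteria quoted in this subsection are simple inequalities on the eigenvalues $\mu$ and $p_\alpha$, so they can be collected into a few predicates. The following is a minimal sketch under that reading; the function names and the three sample fluids are assumptions introduced only for illustration, and the `timelike_convergence` check implements (8.3.37), which reduces to the strong energy condition for $\Lambda = 0$.

```python
import math

# Illustrative sketch: energy-condition checks for a Type (i) tensor with
# energy density mu and principal pressures p = (p1, p2, p3).
# Function names and sample values are hypothetical.

def weak(mu, p):
    # mu >= 0 and mu + p_alpha >= 0
    return mu >= 0 and all(mu + pa >= 0 for pa in p)

def dominant(mu, p):
    # mu >= 0 and -mu <= p_alpha <= mu
    return mu >= 0 and all(-mu <= pa <= mu for pa in p)

def timelike_convergence(mu, p, lam=0.0):
    # (8.3.37): mu + p_alpha >= 0 and mu + sum(p_alpha) - lam/(4 pi) >= 0
    return (all(mu + pa >= 0 for pa in p)
            and mu + sum(p) - lam / (4.0 * math.pi) >= 0)

def strong(mu, p):
    # strong energy condition = (8.3.37) with Lambda = 0
    return timelike_convergence(mu, p, lam=0.0)

dust = (1.0, (0.0, 0.0, 0.0))
radiation = (1.0, (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0))
negative_pressure = (1.0, (-1.0, -1.0, -1.0))   # p = -mu in every direction

for name, (mu, p) in [("dust", dust), ("radiation", radiation),
                      ("negative pressure", negative_pressure)]:
    print(name, weak(mu, p), dominant(mu, p), strong(mu, p))
```

The last sample shows how a large negative pressure can satisfy the weak and dominant conditions while violating the strong one, since $\mu + \sum_\alpha p_\alpha < 0$.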
3.4 Conjugate Points
In the previous subsection, we have seen that the energy conditions imply the inequality $R_{ab} K^a K^b \geq 0$ for the curvature tensor. We shall now see its role in determining the conjugate points on non-spacelike geodesics. These in turn will lead to the knowledge of singularities in spacetime. We recall here the definition of conjugate points for the spacetime (see Chapter 0, Sec. 5). Definition 8.3.5: Given a timelike $C^2$-geodesic $\gamma(s)$ and a variation of $\gamma(s)$ (a congruence of neighbouring timelike geodesics) represented by $F_t(s) = F(s, t)$ ($t \in (-\varepsilon, \varepsilon)$), there always exists the field of vectors $\left[\frac{\partial F}{\partial t}(s, t)\right]_{t=0}$; these vectors are called the Jacobi fields along $\gamma(s)$. A point $p$ on $\gamma(s)$ is conjugate to $q$ along $\gamma(s)$ if there is a Jacobi field along $\gamma(s)$ which is not identically zero but vanishes at both $p$ and $q$. From Sec. (0.5) we know that Jacobi fields satisfy a second-order differential equation (known as the Jacobi equation) that involves the (Riemannian) curvature tensor (see [22]). Here this equation, in view of (8.3.10), is:
$$\frac{d^2}{ds^2} Z^\alpha = -R_{\alpha 4 \beta 4} Z^\beta \qquad (\alpha = 1, 2, 3)$$ (8.3.39)
A solution of this equation, i.e., a Jacobi field, is specified by the values of $Z^\alpha$ and $dZ^\alpha/ds$ at some point of $\gamma(s)$. Evidently there are six independent Jacobi fields along $\gamma(s)$, three of which vanish at some point $q$ of $\gamma(s)$. These Jacobi fields can be expressed as:
$$Z^\alpha(s) = A_{\alpha\beta}(s)\,\frac{d}{ds} Z^\beta\Big|_q$$ (8.3.40)(a)
where
$$\frac{d^2}{ds^2} A_{\alpha\beta}(s) = -R_{\alpha 4 \gamma 4} A_{\gamma\beta}(s)$$ (8.3.40)(b)
and $A_{\alpha\beta}(s)$ is a $3 \times 3$ matrix that vanishes at $q$. From an earlier subsection, we conclude that these Jacobi fields represent the separation of neighbouring geodesics through $q$, and hence the vorticity, shear and expansion of these fields, which vanish at $q$, are given by (8.3.17), (8.3.23) and (8.3.19). In view of (8.3.40)(a) and the vanishing of $A_{\alpha\beta}(s)$ at $q$, we have:
Result 8.3.6: A point $p$ is conjugate to $q$ along $\gamma(s)$ if and only if $A_{\alpha\beta}$ is singular at $p$.
29 See Table S for $\pi$ mesons.
Again using (8.3.40)(b) we note that since $R_{\alpha 4 \gamma 4}$ is finite, $\frac{d}{ds}(\det A)$ must also be finite. From the expression
$$\theta = (\det A)^{-1} \frac{d}{ds}(\det A) = \frac{d}{ds}\left(\log(\det A)\right)$$
for the expansion $\theta$, and in view of the above result, we obtain: Result 8.3.7: A point $p$ is conjugate to $q$ along $\gamma(s)$ if and only if $\theta$ becomes infinite at $p$. We now list a few results that involve conjugate points, the curvature tensor, and the expansion $\theta$.
3.5 Results Based on Curvature, Conjugate Points and the Expansions $\theta$, $\hat\theta$
Proposition 8.3.8(a): If the expansion $\theta$ has a negative value $\theta_1 < 0$ at some point $\gamma(s_1)$ ($s_1 > 0$) of a timelike geodesic, and if $R_{ab} V^a V^b \geq 0$ everywhere, then there is a point conjugate to $q$ along $\gamma(s)$ between $\gamma(s_1)$ and $\gamma(s_1 + 3/(-\theta_1))$, provided that $\gamma(s)$ can be extended to this parameter value. We note that this may not be possible if spacetime is geodesically incomplete. A geodesic incompleteness of this type can be interpreted to mean that there exists a singularity in spacetime. The proof of the proposition follows from the Raychaudhuri equation:
$$\frac{d}{ds}\theta = -R_{ab} V^a V^b - 2\sigma^2 - \frac{1}{3}\theta^2.$$
Since all the terms on the RHS are non-positive, $d\theta/ds \leq -\theta^2/3$, and so for $s > s_1$ we have:
$$\theta \leq \frac{3}{s - (s_1 + 3/(-\theta_1))}$$ (8.3.41)
Thus $\theta$ will become infinite and there will be a point conjugate to $q$ for some value of $s$ between $s_1$ and $s_1 + 3/(-\theta_1)$.
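The focusing argument behind (8.3.41) can be tried out numerically. The sketch below is only an illustration under simplifying assumptions that are not in the text: the combination $R_{ab}V^aV^b + 2\sigma^2$ is replaced by a constant non-negative number `source`, and the resulting ordinary differential equation is integrated with a hand-written fourth-order Runge-Kutta step. With `source = 0` the bound (8.3.41) is attained exactly, and any positive `source` makes $\theta$ diverge earlier.

```python
# Illustrative sketch (hypothetical names; `source` stands in for
# R_ab V^a V^b + 2 sigma^2 and is assumed constant and non-negative).

def focusing_bound(s1, theta1):
    """Parameter value s1 + 3/(-theta1) before which theta must diverge."""
    return s1 + 3.0 / (-theta1)

def theta_exact(s, s1, theta1):
    """Solution of d(theta)/ds = -theta^2/3 with theta(s1) = theta1;
    equal to 3 / (s - (s1 + 3/(-theta1))), the right-hand side of (8.3.41)."""
    return 1.0 / (1.0 / theta1 + (s - s1) / 3.0)

def theta_numeric(s1, theta1, s_end, source=0.0, steps=10000):
    """Integrate d(theta)/ds = -source - theta^2/3 from s1 to s_end (RK4)."""
    def rhs(th):
        return -source - th * th / 3.0
    h = (s_end - s1) / steps
    th = theta1
    for _ in range(steps):
        k1 = rhs(th)
        k2 = rhs(th + 0.5 * h * k1)
        k3 = rhs(th + 0.5 * h * k2)
        k4 = rhs(th + h * k3)
        th += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return th

s1, theta1 = 0.0, -1.0
print("focusing bound:", focusing_bound(s1, theta1))
for s in (0.5, 1.5, 2.5, 2.9):
    print(s, theta_numeric(s1, theta1, s), theta_exact(s, s1, theta1))
```

As $s$ approaches the focusing bound the expansion grows without limit in the negative direction, which is the behaviour that signals a conjugate point.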
A slight variation of the above proposition, which we shall use in Sec. 5, is as follows. Proposition 8.3.8(b): Let $R_{ab} V^a V^b \geq 0$. Suppose that at some point $p = \gamma(s_1)$ the tidal force $R_{abcd} V^b V^d$ is nonzero; then the parameter $s$ of $\gamma(s)$ has values $s_0$ and $s_2$ such that $q = \gamma(s_0)$ and $r = \gamma(s_2)$ are conjugate along $\gamma(s)$, provided that $\gamma(s)$ can be extended to these values. So far we have considered an arbitrary timelike geodesic congruence; we shall now state a result for the congruence of timelike geodesics normal to a spacelike three-surface $\mathcal{H}$. A spacelike three-surface $\mathcal{H}$ is an imbedded 3-dimensional submanifold defined locally by $\psi = 0$, where $\psi$ is a differentiable function such that $g^{ab}\psi_{;a}\psi_{;b} < 0$. The unit normal vector $N$ to the surface $\mathcal{H}$ has components:
$$N^a = (-g^{bc}\psi_{;b}\psi_{;c})^{-1/2}\, g^{ad}\psi_{;d}$$ (8.3.42)
and the components of the second fundamental tensor $\chi$ are given as:
$$\chi_{ab} = h_a{}^c h_b{}^d N_{c;d}$$ (8.3.43)
where $h_{ab} = g_{ab} + N_a N_b$ is the first fundamental tensor (or the induced metric tensor) of $\mathcal{H}$. The congruence of timelike geodesics orthogonal to $\mathcal{H}$ consists of those timelike geodesics whose unit tangent vector $V$ equals the unit normal $N$ at $\mathcal{H}$, thus
$$V_{a;b} = \chi_{ab}$$ (8.3.44)
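at $\mathcal{H}$. The vector $Z$ representing the separation of a neighbouring geodesic normal to $\mathcal{H}$ from a geodesic $\gamma(s)$ normal to $\mathcal{H}$ obeys the Jacobi equation (8.3.39), and at a point $q$ on $\gamma(s)$ at $\mathcal{H}$ it satisfies the initial condition:
$$\frac{d}{ds} Z^\alpha = \chi_{\alpha\beta} Z^\beta$$ (8.3.45)
Since these Jacobi fields satisfy (8.3.40) at $q$, where $A_{\alpha\beta}$ is the unit matrix, we have:
$$\frac{d}{ds} A_{\alpha\beta} = \chi_{\alpha\gamma} A_{\gamma\beta}$$ (8.3.46)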
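Using (8.3.17) it can be checked that the vorticity tensor $\omega_{\alpha\beta}$ can be expressed as
$$A_{\gamma\alpha}\,\omega_{\gamma\delta}\,A_{\delta\beta} = \frac{1}{2}\left(A_{\gamma\alpha}\frac{d}{ds}A_{\gamma\beta} - A_{\gamma\beta}\frac{d}{ds}A_{\gamma\alpha}\right)$$ (8.3.47)
By (8.3.40)(b) and the symmetry of $R_{\alpha 4 \beta 4}$, the right-hand side is constant along $\gamma(s)$; hence the vorticity is zero on $\gamma(s)$, since $A_{\gamma\alpha}\omega_{\gamma\delta}A_{\delta\beta}$ is zero at $q$, where $A_{\alpha\beta}$ is the unit matrix and satisfies (8.3.46) with symmetric $\chi_{\alpha\beta}$. We further note that the initial value of $\theta$ at $q$ is:
$$\chi_{ab}\, g^{ab}$$ (8.3.48)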
Definition 8.3.9: A point $p$ on $\gamma(s)$ is conjugate to $\mathcal{H}$ along $\gamma(s)$ if there is a Jacobi field along $\gamma(s)$, not identically zero, that satisfies the initial condition (8.3.45) at $q$ and vanishes at $p$. Thus $p$ is conjugate to $\mathcal{H}$ along $\gamma(s)$ if and only if $A_{\alpha\beta}$ is singular at $p$. And as seen earlier, $A_{\alpha\beta}$ becomes singular where and only where the expansion $\theta$ becomes infinite (see (8.3.19)). Similar to Prop. (8.3.8)(a) and (b), we now have the result that pertains to the timelike geodesic congruence normal to a spacelike 3-surface.
Proposition 8.3.10: If $R_{ab} V^a V^b \geq 0$ and $\chi_{ab} g^{ab} < 0$, there will be a point conjugate to $\mathcal{H}$ along $\gamma(s)$ within a distance $3/(-\chi_{ab} g^{ab})$ from $\mathcal{H}$, provided that $\gamma(s)$ can be extended that far. The proof follows easily from the Raychaudhuri equation (8.3.24) using the steps mentioned in the earlier proposition. We now consider a congruence of null geodesics. A Jacobi field along a null geodesic $\gamma(v)$ is a solution of the equation:
$$\frac{d^2}{dv^2} Z^i = -R_{i4j4} Z^j \qquad (i, j = 1, 2)$$ (8.3.49)
Equation (8.3.40)(a) now becomes:
$$Z^i(v) = A_{ij}(v)\,\frac{d}{dv} Z^j\Big|_q$$ (8.3.50)
where $A_{ij}$ is the $(2 \times 2)$ matrix which vanishes at $q$, and as before $A_{li}\,\hat\omega_{lk}\,A_{kj} = 0$, showing that the vorticity of the Jacobi fields which are zero at $q$ vanishes. Here again we have that $p$ is conjugate to $q$ along $\gamma(v)$ if and only if
$$\hat\theta = (\det A)^{-1}\frac{d}{dv}(\det A)$$
becomes infinite at $p$. Finally, the propositions analogous to Propositions (8.3.8)(a) and (8.3.8)(b) read as:
Proposition 8.3.11(a): Let $K$ denote the geodesic tangent vector, and let $R_{ab} K^a K^b \geq 0$ everywhere. If at some point $\gamma(v_1)$ the expansion $\hat\theta$ has the negative value $\hat\theta_1 < 0$, then there exists a point conjugate to $q$ along $\gamma(v)$ between $\gamma(v_1)$ and $\gamma(v_1 + 2/(-\hat\theta_1))$, given that $\gamma(v)$ can be extended that far.
Proposition 8.3.11(b): Let $R_{ab} K^a K^b \geq 0$ everywhere. Suppose that at $p = \gamma(v_1)$, $K^c K^d K_{[a} R_{b]cd[e} K_{f]}$ is non-zero; then the parameter $v$ has values $v_0$ and $v_2$ such that $q = \gamma(v_0)$ and $r = \gamma(v_2)$ are conjugate along $\gamma(v)$, provided $\gamma(v)$ can be extended to these values (see Sec. 4.4 in [16c] for the proofs).
3.6 Variational Techniques
The propositions given above regarding the existence of conjugate points were based on the curvature tensor and the expansion $\theta$ or $\hat\theta$. Since the existence of conjugate points is a means of deciding whether a spacetime is singularity free, we shall state below a few more results where their existence is ensured by using arc-length variational techniques. In essence these results are based on the Lorentz metric $g$ and the curves that cover $\mathcal{M}$. Thus a breakdown (discontinuity) which signals a singularity is studied here by considering non-spacelike geodesics and the conjugate points on them. Putting it still more simply, one ensures (here) via these results that, in general, any two points can be joined by unique non-spacelike geodesic curves in a singularity-free spacetime. Due to our limited scope, we content ourselves with only the statements of a few results and ask the interested reader to consult one of the references [16c], [26], [35]. To begin with, we define a few key ingredients that are required in these results. Definition 8.3.12 (Convex Normal Neighbourhood): Consider the $C^r$-map $\exp: T_p \to \mathcal{M}$ such that the rank of $\exp$ equals $\dim(\mathcal{M})$ (see also the Appendix for the definition); then (even if $\mathcal{M}$ is not complete), in view of the implicit function theorem, there exists an open neighbourhood $\mathcal{N}_0$ of the origin in $T_p$ and an open neighbourhood $\mathcal{N}_p$ of $p$ in $\mathcal{M}$ such that $\exp$ is a $C^r$-diffeomorphism of $\mathcal{N}_0$ onto $\mathcal{N}_p$. Such a neighbourhood $\mathcal{N}_p$ is called a normal neighbourhood of $p$. Further, if any point $q$ of $\mathcal{N}_p$ can be joined to any other point $r$ in $\mathcal{N}_p$ by a unique geodesic starting at $q$ and totally contained in $\mathcal{N}_p$, then $\mathcal{N}_p$ is called a convex normal neighbourhood. Definition 8.3.13: Let $\mathcal{U}$ be a convex normal neighbourhood of a point $q$; then the length of a non-spacelike curve $\gamma(t)$ from $q$ to $p$, for $p \in \mathcal{U}$, is:
$$L(\gamma, q, p) = \int_q^p \left[-g(\partial/\partial t, \partial/\partial t)\right]^{1/2} dt$$ (8.3.51)
where the integral is taken over the differentiable portions of the curve. We shall next consider the case where $q$ and $p$ may not be contained in a convex normal neighbourhood $\mathcal{U}$. For this we introduce the concept of a 'variation' of non-spacelike curves on $\mathcal{M}$.
A variation a of a timelike curve y(t) from q to p is a C'-map a: (-£, £) x [0,
tp] -> fW such that (i) a ( 0 , r) = y(r); (ii) a is C 3 on each (-£, e) x [r;, tM ] for some subdivision 0 = tx < t2 ••• < tn = tp of [0, r p ]; (iii) a ( « , 0) = q, a(u, tp) = p ; (iv) for each constant M, a ( « , f) is a timelike curve. The vector {did M) a | u = o is called the variation vector and is denoted as Z.
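Definition 8.3.15: A two-parameter variation $\alpha$ of a geodesic curve $\gamma(t)$ from $q$ to $p$ is a $C^1$-map
$$\alpha: (-\varepsilon, \varepsilon) \times (-\varepsilon', \varepsilon') \times [0, t_p] \to \mathcal{M}$$
such that (i) $\alpha(0, 0, t) = \gamma(t)$; (ii) $\alpha$ is $C^3$ on each $(-\varepsilon, \varepsilon) \times (-\varepsilon', \varepsilon') \times [t_i, t_{i+1}]$ for some subdivision $0 = t_1 < t_2 < \cdots < t_n = t_p$ of $[0, t_p]$; (iii) $\alpha(u, u', 0) = q$, $\alpha(u, u', t_p) = p$; (iv) for all parametric pairs $(u, u')$ consisting of constants $u$ and $u'$, $\alpha(u, u', t)$ is a timelike curve. The variation vectors $Z$ and $Z'$ in this case are defined as:
(a) $Z = (\partial/\partial u)_\alpha\big|_{u=0,\,u'=0}$
(b) $Z' = (\partial/\partial u')_\alpha\big|_{u=0,\,u'=0}$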
Resulting from these variations are the length variations of geodesic curves. To write the formula for the length variation in the one-parameter case, we assume that the parameter $t$ in $\gamma(t)$ is the arc-length parameter $s$; then, denoting by $V$ the unit tangent vector $\partial/\partial s$, we have from (8.3.51):
$$\frac{\partial L}{\partial u}\bigg|_{u=0} = \sum_{i=1}^{n-1}\int_{t_i}^{t_{i+1}} g(Z, \dot V)\, ds + \sum_{i=2}^{n-1} g(Z, [V])$$ (8.3.52)
where $\dot V = DV/\partial s$ is the acceleration vector (see also Exc. 2.7 for $\dot V$), and $[V]$ represents the discontinuity at singular points of $\gamma(s)$ (see p. 107 of [16c]). The length variation under the two-parameter variation of a geodesic $\gamma(t)$ involves two variation vectors $Z$ and $Z'$, and, as can be expected, the second-order derivative of the length with respect to the two parameters is symmetric in $Z$, $Z'$. We write it as:
$$L(Z, Z') = \frac{\partial^2 L}{\partial u'\,\partial u}\bigg|_{u=0,\,u'=0}$$ (8.3.53)
and explain next the reason for this choice. Using steps similar to the one-parameter case, this derivative can be written as:
$$\frac{\partial^2 L}{\partial u'\,\partial u}\bigg|_{u=0,\,u'=0} = \sum_{i=1}^{n-1}\int_{t_i}^{t_{i+1}} g\!\left(Z, \left\{\frac{D^2}{\partial s^2}\big(Z' + g(V, Z')V\big) - R(V, Z')V\right\}\right) ds + \sum_{i=2}^{n-1} g\!\left(Z, \left[\frac{D}{\partial s}\big(Z' + g(V, Z')V\big)\right]\right)$$ (8.3.54)
It can be further checked that this length variation depends only on the projections of $Z$, $Z'$ into the space orthogonal to $V$. Thus if we denote by $T_\gamma$ the infinite-dimensional vector space consisting of all continuous, piecewise $C^2$ vector fields along $\gamma(t)$ orthogonal to $V$ and vanishing at $q$ and $p$, then the second derivative of the length is a symmetric map of $T_\gamma \times T_\gamma$ to $R^1$. This may be viewed as a symmetric tensor on $T_\gamma$, and therefore can be written as in (8.3.53):
$$L(Z, Z') = \frac{\partial^2 L}{\partial u'\,\partial u}\bigg|_{u=0,\,u'=0}$$
where $Z$, $Z'$ now belong to $T_\gamma$. Recall that while in a positive definite metric one seeks the shortest curve between two points (which as we know is a geodesic), in the case of a Lorentz metric one looks for a longest non-spacelike curve. We call a timelike geodesic curve $\gamma(t)$ from $q$ to $p$ maximal if $L(Z, Z')$ is negative semi-definite. Using the above definitions we now state the following results:
Proposition 8.3.16: Let $\mathcal{U}$ be a convex normal coordinate neighbourhood of a point $q$ in $\mathcal{M}$. Then the points that can be reached from $q$ by timelike (respectively non-spacelike) curves in $\mathcal{U}$ are only those of the form $\exp_q(X)$, where $X \in T_q$ satisfies $g(X, X) < 0$ (respectively $\leq 0$). The above proposition says that the boundary of the region in $\mathcal{U}$ which can be reached from $q$ by timelike or non-spacelike curves in $\mathcal{U}$ is formed by the null geodesics emanating from $q$.
Proposition 8.3.17: Let $q$ and $p$ be two points of a convex normal neighbourhood $\mathcal{U}$. Then if $q$ and $p$ can be joined by a non-spacelike curve in $\mathcal{U}$, the longest such curve is the unique non-spacelike geodesic curve in $\mathcal{U}$ from $q$ to $p$.
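Proposition 8.3.17 can be illustrated in the simplest setting, two-dimensional Minkowski space with metric $-dt^2 + dx^2$, where geodesics are straight lines. The sketch below is a hedged illustration rather than anything taken from the text: it evaluates the length (8.3.51) for piecewise-linear timelike curves given as lists of $(t, x)$ points, and compares the straight segment between two events with a broken curve joining the same events. The helper name `length` and the sample events are assumptions made for the example.

```python
import math

# Illustrative sketch (hypothetical helper): length (8.3.51) of a
# piecewise-linear non-spacelike curve in 2D Minkowski space, metric -dt^2 + dx^2.

def length(points):
    """Sum of sqrt(dt^2 - dx^2) over the straight segments of the curve."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(points, points[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval = dt * dt - dx * dx      # equals -g(d/dt, d/dt) per segment
        if interval < 0:
            raise ValueError("segment is spacelike")
        total += math.sqrt(interval)
    return total

q, p = (0.0, 0.0), (10.0, 0.0)

straight = length([q, p])                 # the geodesic from q to p
broken = length([q, (5.0, 3.0), p])       # a timelike curve with one corner
print(straight, broken)
```

The straight (geodesic) curve is the longer of the two, in line with the statement that in a Lorentz metric the geodesic maximizes, rather than minimizes, the length between two points of a convex normal neighbourhood.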
Thus if $\lambda(t)$ is a timelike curve in $\mathcal{U}$ from $q$ to $p$, with a representation $\lambda(t) = \alpha(f(t), t)$,$^{30}$ and if $\rho(q, p)$ denotes the length of this curve provided it (the length) exists and is zero otherwise, then $\rho(q, p)$ is a continuous function on $\mathcal{U} \times \mathcal{U}$, and from (8.3.51) we have:
$$L(\lambda, q, p) \le \int_q^p f'(t)\,dt = \rho(q, p). \tag{8.3.55}$$
The equality holds if and only if $\lambda$ is the unique geodesic curve in $\mathcal{U}$ from $q$ to $p$. Finally, using the maximality definition we have:

Proposition 8.3.18: A timelike geodesic curve $\gamma(t)$ from $q$ to $p$ is maximal if and only if there is no point conjugate to $q$ along $\gamma(t)$ in $(q, p)$.

Proposition 8.3.19: A timelike geodesic curve $\gamma(t)$ from a spacelike 3-surface $H$ to $p$ is maximal if and only if there is no point in $(H, p)$ conjugate to $H$ along $\gamma$.

We note that the maximality condition involving $L(Z, Z')$ in this case reads as follows: a timelike geodesic curve from a surface $H$ to $p$ and normal to it is maximal if $L(Z, Z')$ given in (8.3.53) is negative semi-definite, where it is assumed that the point $q$ defined by the two-parameter variation $\alpha$, $\alpha(u, u', 0) = q$, instead of being fixed varies over $H$. We now move on to Sec. 4, where we apply the ideas gathered in this section to study the exact solutions of spacetime.
4 EXACT SOLUTIONS, AND THE CAUSAL STRUCTURE
In this section we discuss the above two important topics pertaining to spacetime. We shall see that they derive their unique role for different reasons. For instance, the former helps us understand the structural properties of metrics that are solutions of Einstein equations, while the latter leads us to the study of singularity theorems and black holes.
4.1 An Exact Solution
To define an 'exact solution' we note, to begin with, that the Einstein equation:
$$R_{ab} - \tfrac{1}{2} R\, g_{ab} + \Lambda g_{ab} = 8\pi T_{ab}$$
can be theoretically satisfied by any spacetime metric, and the energy-momentum (EM) tensor (the RHS of this equation) can then be considered as a known entity. In reality, however, the matter tensor does not have physically reasonable properties in all cases (see Subsec. 4.1.2 here). In the following definition of an 'exact solution' we lay down the conditions that need to be satisfied by EM so that it may be physically compatible with a solution.

$^{30}$ The representation $\lambda(t) = \alpha(f(t), t)$ and (8.3.55) are based on the following: Let $\alpha(s, t) = \exp_q(sX(t))$ where $g(X(t), X(t)) = -1$; then writing $\lambda(t) = \alpha(f(t), t)$ one has
$$\left(\frac{\partial}{\partial t}\right)_\lambda = f'(t)\left(\frac{\partial}{\partial s}\right)_\alpha + \left(\frac{\partial}{\partial t}\right)_\alpha .$$
Since $\left(\frac{\partial}{\partial s}\right)_\alpha$ and $\left(\frac{\partial}{\partial t}\right)_\alpha$ are mutually orthogonal and $g\!\left(\left(\frac{\partial}{\partial s}\right)_\alpha, \left(\frac{\partial}{\partial s}\right)_\alpha\right) = -1$, this gives
$$g\!\left(\left(\frac{\partial}{\partial t}\right)_\lambda, \left(\frac{\partial}{\partial t}\right)_\lambda\right) \ge -\bigl(f'(t)\bigr)^2 .$$
The equality holds if and only if $\lambda(t)$ is a geodesic curve.

Definition 8.4.1: A spacetime $(\mathcal{M}, g)$ is called an exact solution of Einstein's equations if the field equations are satisfied with $T_{ab}$ (the EM tensor of some specified form of matter) that obeys the 'local causality' postulate (a) (see Sec. 8.2) as well as one of the energy conditions (see Sec. 8.3).

For example, one may look for exact solutions for the empty space ($T_{ab} = 0$), for an electromagnetic field where $T_{ab}$ has the form (a) of Exc. (8.2.5), or for a perfect fluid where it is given by (vi) of Exc. (8.2.7). We note that due to the non-linearity of the field equations, exact solutions can be found only in spaces with high symmetry. Besides, these are often idealized, in the sense that only simple matter contents may be considered, contrary to the fact that a region of spacetime may in general contain many forms of matter. As already mentioned in Sec. 8.2, most of these solutions (models) along with their local properties were discovered earlier. Their global properties, however, were examined only after the theory was developed with the help of other mathematical disciplines, e.g., topology, group theory, and algebraic geometry.

Due to our limited scope, we shall list these exact solutions along with their important local and global properties. To study their derivations, one is advised to consult the texts [16c] and [26] and the references that are cited there. Usually these solutions are named after their discoverers and are often linked to each other through coordinate transformations. We give below some of these solutions, beginning with the simplest such model. (All the notations and the diagrams to depict these models are based on Hawking and Ellis [16c].)
4.1.1 Minkowski Spacetime

This is the simplest empty spacetime of General Relativity which, as we know, is also the spacetime of Special Relativity. The pair $(\mathcal{M}, g)$ here is $(\mathbb{R}^4, \eta)$, where $\eta$ is the flat Lorentz metric $\mathrm{diag}(1, 1, 1, -1)$ expressed as:
$$ds^2 = -(dx^4)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2 \tag{8.4.1}$$
in terms of the natural coordinates $(x^1, x^2, x^3, x^4)$ on $\mathbb{R}^4$.$^{31}$ The geodesics in this case are given by:
$$x^a(v) = b^a v + c^a \tag{8.4.2}$$
where $b^a$ and $c^a$ are constants. As a result the exponential map $\exp_p : T_p \to \mathcal{M}$ can be written as:
$$x^a(\exp_p X) = X^a + x^a(p) \tag{8.4.3}$$
with $X^a$ being the components of the vector $X$ with respect to the coordinate basis $\{\partial/\partial x^a\}$ of $T_p$. The map $\exp_p$ is a diffeomorphism between $T_p$ and $\mathcal{M}$ since it is one-one and onto. Thus any two points of $\mathcal{M}$ can be joined by a unique geodesic curve. Also $\exp$ is defined everywhere on $T_p$ for all $p$, hence $(\mathcal{M}, \eta)$ is geodesically complete (see App. A.2 for the definition). Associated with this pair $(\mathbb{R}^4, \eta)$ are the surfaces $x^4$ = constant, which represent a family of Cauchy surfaces.* These surfaces foliate the whole of $\mathcal{M}$.

$^{31}$ The manifold $\mathcal{M}$ with this choice of metric is sometimes denoted as $\mathbb{R}^{3,1}$. In Sec. 1 we have used $x^0$ in place of $x^4$ and have denoted the Minkowskian space as $M$.

* For a spacelike 3-surface $S$, if $D^+(S) \cup D^-(S) = \mathcal{M}$, i.e. if every inextendible non-spacelike curve in $\mathcal{M}$ intersects $S$, then $S$ is called a Cauchy surface (see Subsec. 3.3 for $D^+(S)$ and $D^-(S)$).
[Fig. 8.6: A Cauchy surface $\{x^4 = \text{const.}\}$ in Minkowski spacetime, and spacelike surfaces $S_\sigma$, $S_{\sigma'}$ which are not Cauchy surfaces. All the normal geodesics to $S_\sigma$, $S_{\sigma'}$ intersect at $O$. Labels in the figure: null geodesic; future null cone of $O$; uniformly accelerating timelike curve; surface $\{x^4 = \text{constant}\}$; past null cone of $O$.]

It should be noted here that there are inextendible spacelike surfaces which are not Cauchy surfaces. For example, the surfaces:
$$S_\sigma : \{-(x^4)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2 = \sigma = \text{constant}\} \tag{8.4.4}$$
with $\sigma < 0$, $x^4 < 0$, are spacelike surfaces that lie totally inside the past null cone of the origin $O$ and thus are not Cauchy surfaces (see Fig. (8.6)).

We shall now give a few other coordinate systems that are used on the Minkowski spacetime in order to show how a particular coordinate system introduces or removes a singularity. Consider the spherical polar coordinates $(t, r, \theta, \phi)$ with $x^4 = t$, $x^3 = r\cos\theta$, $x^2 = r\sin\theta\cos\phi$, $x^1 = r\sin\theta\sin\phi$. The metric:
$$ds^2 = -dt^2 + dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.5}$$
is apparently singular for $r = 0$ and $\sin\theta = 0$, since $(t, r, \theta, \phi)$ are not admissible coordinates at these points. In terms of the advanced and retarded null coordinates $v = t + r$, $w = t - r$ (so that $v \ge w$), the metric becomes:
$$ds^2 = -dv\, dw + \tfrac{1}{4}(v - w)^2 (d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.6}$$
The null coordinate $v$ ($w$) represents the advanced (retarded) time coordinate and can be thought of as an incoming (outgoing) spherical wave travelling at the speed of light. The surfaces $w$ = constant and $v$ = constant are null surfaces (i.e., $w_{,a} w_{,b}\, g^{ab} = 0 = v_{,a} v_{,b}\, g^{ab}$; see Fig. (8.7)). The intersection of a surface $v$ = const. with a surface $w$ = const. is a two-sphere.
[Fig. 8.7: (i) The $v$, $w$ coordinate surfaces with one coordinate suppressed. (ii) The $(t, r)$ plane; each point represents a two-sphere of radius $r$. Labels in the figure: $w$ = constant, $v$ = constant, $t$ = constant, $r$ = constant.]

Still another set of null coordinates, denoted $p$, $q$, can be defined with the help of $v$, $w$. These coordinates have finite range and are given by the relation:
$$\tan p = v, \qquad \tan q = w, \qquad p, q \in \left(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right),\ p \ge q \tag{8.4.7}$$
The metric $\eta$ now becomes:
$$ds^2 = \sec^2 p\, \sec^2 q \left[-dp\, dq + \tfrac{1}{4}\sin^2(p - q)\,(d\theta^2 + \sin^2\theta\, d\phi^2)\right] \tag{8.4.8}$$
which is evidently conformal to another metric $\bar{g}$ given by:
$$d\bar{s}^2 = -4\, dp\, dq + \sin^2(p - q)\,(d\theta^2 + \sin^2\theta\, d\phi^2). \tag{8.4.9}$$
If we further define $t' = p + q$, $r' = p - q$, where
$$-\pi < t' + r' < \pi, \qquad -\pi < t' - r' < \pi, \qquad r' \ge 0 \tag{8.4.10}$$
then (8.4.9) can be written as:
$$d\bar{s}^2 = -(dt')^2 + (dr')^2 + \sin^2 r'\,(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.11}$$
This means that the whole of Minkowski spacetime is given by the region (8.4.10) of the metric:
$$ds^2 = \tfrac{1}{4}\sec^2\!\left(\tfrac{1}{2}(t' + r')\right)\sec^2\!\left(\tfrac{1}{2}(t' - r')\right) d\bar{s}^2$$
with $d\bar{s}^2$ being determined by (8.4.11). The coordinates $(t, r)$ of (8.4.5) are related to $(t', r')$ by:
$$2t = \tan\!\left(\tfrac{1}{2}(t' + r')\right) + \tan\!\left(\tfrac{1}{2}(t' - r')\right), \qquad 2r = \tan\!\left(\tfrac{1}{2}(t' + r')\right) - \tan\!\left(\tfrac{1}{2}(t' - r')\right) \tag{8.4.12}$$
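The chain of coordinate changes from $(t, r)$ through the null coordinates to the finite pair $(t', r')$ can be traced numerically. The following is only a minimal sketch; the function names `to_penrose`, `from_penrose` and `in_region` are illustrative choices and are not taken from the text. It assumes $v = t + r$, $w = t - r$, $\tan p = v$, $\tan q = w$, $t' = p + q$, $r' = p - q$, and uses (8.4.12) for the inverse map.

```python
import math

def to_penrose(t, r):
    """Map Minkowski (t, r) with r >= 0 to the finite coordinates (t', r')."""
    v, w = t + r, t - r                  # advanced / retarded null coordinates
    p, q = math.atan(v), math.atan(w)    # tan p = v, tan q = w, as in (8.4.7)
    return p + q, p - q                  # t' = p + q, r' = p - q

def from_penrose(tp, rp):
    """Inverse map, written directly from relation (8.4.12)."""
    a = math.tan(0.5 * (tp + rp))
    b = math.tan(0.5 * (tp - rp))
    return 0.5 * (a + b), 0.5 * (a - b)  # (t, r)

def in_region(tp, rp):
    """True if (t', r') lies in the range (8.4.10)."""
    return (-math.pi < tp + rp < math.pi
            and -math.pi < tp - rp < math.pi
            and rp >= 0)
```

Every finite event $(t, r)$ lands strictly inside the region (8.4.10), and applying `from_penrose` to the result should recover the original pair up to rounding.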
Remark 8.4.2: The various coordinates and the subsequent metrics introduced here are significant in one way or another. For instance, the metric (8.4.11) is locally identical to the metric of the Einstein static universe, which is a completely homogeneous spacetime (see the Appendix). The metric (8.4.11) can be analytically extended to the whole of the Einstein static universe $\mathbb{R}^1 \times S^3$, where now $t' \in (-\infty, \infty)$ and $(r', \theta, \phi)$ denote the coordinates on $S^3$. The coordinate singularities of $r'$ and $\theta$ at $0$ and $\pi$ can be removed by using suitable local coordinates in a neighbourhood of points where (8.4.11) is singular. In the coordinate system given by $(p, q, \theta, \phi)$, the points $(p = \tfrac{\pi}{2}, q = \tfrac{\pi}{2})$ and $(p = -\tfrac{\pi}{2}, q = -\tfrac{\pi}{2})$, denoted $i^+$ and $i^-$ (in literature), represent respectively the future and the past timelike infinity, whereas the point $(p = \tfrac{\pi}{2}, q = -\tfrac{\pi}{2})$, denoted $i^0$, represents the spacelike infinity. We also have the null surfaces $p = +\tfrac{\pi}{2}$ and $q = -\tfrac{\pi}{2}$, denoted $\mathcal{J}^+$ and $\mathcal{J}^-$, which represent the future and past null infinity.

Remark 8.4.3: In view of the above remark, it follows that the whole of Minkowski spacetime is conformal to the region (8.4.10) of the Einstein static universe (shown in Fig. (8.8) by the shaded area). The boundary of this region, given by $\mathcal{J}^+$, $\mathcal{J}^-$, $i^+$, $i^-$ and $i^0$, can be thought of as representing the conformal structure of infinity of Minkowski spacetime.

Remark 8.4.4: The coordinates $(t', r')$ were introduced by Penrose. They can be used (rather effectively) to represent the conformal structure of infinity shown in Fig. (8.8)(i) simply by means of the $(t', r')$ plane given by the diagram in Fig. (8.8)(ii). Each point of this diagram represents a sphere $S^2$, and radial null geodesics are represented by straight lines at $\pm 45°$. This representation, known as the Penrose diagram, can always be used for the structure of infinity in any spherically symmetric spacetime. The conformal structure described above can be viewed as the 'normal' behaviour of spacetime at infinity. Evidently this behaviour may not be found suitable in all the cases that one encounters in reality.

Remark 8.4.5: From the above discussions it follows that whatever can be seen from infinity is determined by the light cone structure of spacetime. This is unchanged by a conformal transformation of the metric, e.g., $g_{ab} \to \Omega^2 g_{ab}$, $\Omega$ being a smooth positive function of position. Hence it is useful to apply a suitable conformal transformation which squashes everything up near infinity and brings infinity up to a finite distance. This is exactly what has been done by introducing the coordinates $(v, w) \to (p, q) \to (t', r')$.
4.1.2 de Sitter and Anti-de Sitter Spacetimes
Consider the spacetime metrics with constant curvature. The Riemann tensor here is given (locally) as:
$$R_{abcd} = \tfrac{1}{12} R\,(g_{ac}\, g_{bd} - g_{ad}\, g_{bc}).$$
Moreover, in terms of the conformal tensor, the above relation is equivalent to:
$$C_{abcd} = 0 = R_{ab} - \tfrac{1}{4} R\, g_{ab} \tag{8.4.13}$$
[Fig. 8.8: (i) The Einstein static universe represented by an embedded cylinder (with coordinates $\theta$ and $\phi$ suppressed). (ii) The $(t', r')$ plane (Penrose diagram). Labels in the figure: $r' = 0$; two-spheres $\{r = \text{constant}\}$.]

When $R > 0$ it is the 'de Sitter spacetime.' The de Sitter space has the topology of $\mathbb{R}^1 \times S^3$, and can be considered as the hyperboloid:
$$-v^2 + w^2 + x^2 + y^2 + z^2 = a^2 \tag*{(8.4.14)(a)}$$
in $\mathbb{R}^5$ with metric:
$$-dv^2 + dw^2 + dx^2 + dy^2 + dz^2 = ds^2 \tag*{(8.4.14)(b)}$$
The use of coordinates $(t, \chi, \theta, \phi)$ defined by:
$$\begin{aligned}
a \sinh(a^{-1}t) &= v \\
a \cosh(a^{-1}t) \cos\chi &= w \\
a \cosh(a^{-1}t) \sin\chi \cos\theta &= x \\
a \cosh(a^{-1}t) \sin\chi \sin\theta \cos\phi &= y \\
a \cosh(a^{-1}t) \sin\chi \sin\theta \sin\phi &= z
\end{aligned} \tag*{(8.4.15)(a)}$$
leads to the metric:
$$ds^2 = -dt^2 + a^2 \cosh^2(a^{-1}t)\,\{d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)\} \tag*{(8.4.15)(b)}$$
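That the coordinates $(t, \chi, \theta, \phi)$ really parametrize the hyperboloid $-v^2 + w^2 + x^2 + y^2 + z^2 = a^2$ can be checked numerically. The sketch below is only illustrative; the names `embed_de_sitter` and `hyperboloid_defect` are not from the text.

```python
import math

def embed_de_sitter(t, chi, theta, phi, a=1.0):
    """Return (v, w, x, y, z) for the de Sitter coordinates, following (8.4.15)(a)."""
    c = a * math.cosh(t / a)
    v = a * math.sinh(t / a)
    w = c * math.cos(chi)
    x = c * math.sin(chi) * math.cos(theta)
    y = c * math.sin(chi) * math.sin(theta) * math.cos(phi)
    z = c * math.sin(chi) * math.sin(theta) * math.sin(phi)
    return v, w, x, y, z

def hyperboloid_defect(point, a=1.0):
    """Left side minus right side of the hyperboloid equation (8.4.14)(a)."""
    v, w, x, y, z = point
    return -v * v + w * w + x * x + y * y + z * z - a * a
```

For any choice of coordinates the defect should vanish up to rounding, because $w^2 + x^2 + y^2 + z^2 = a^2\cosh^2(t/a)$ and $\cosh^2 - \sinh^2 = 1$.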
The metric singularities are simply those that occur for polar coordinates, namely the ones given by $\chi = 0$, $\chi = \pi$ and $\theta = 0$, $\theta = \pi$. Apart from these singularities, the coordinates with the ranges $-\infty < t < \infty$, $0 \le \chi \le \pi$, $0 \le \theta \le \pi$, $0 \le \phi \le 2\pi$ cover the whole space. The spatial sections given by $t$ = constant are the spheres $S^3$ of constant positive curvature, which are Cauchy surfaces. Their geodesic normals are lines which contract monotonically to a minimum spatial separation and then re-expand to infinity (see Fig. (8.9)(i)). If one uses the coordinates:
$$\hat{t} = a \log\frac{w + v}{a}, \qquad \hat{x} = \frac{ax}{w + v}, \qquad \hat{y} = \frac{ay}{w + v}, \qquad \hat{z} = \frac{az}{w + v}$$
on the hyperboloid, the metric takes a simpler form:
$$ds^2 = -d\hat{t}^2 + \exp(2a^{-1}\hat{t})\,(d\hat{x}^2 + d\hat{y}^2 + d\hat{z}^2)$$
But these coordinates cover only half of the hyperboloid, as $\hat{t}$ is not defined for $w + v \le 0$ (see Fig. (8.9)(ii)). The region $v + w > 0$ forms the 'steady state' model of the universe proposed by Bondi and Gold [2] and Hoyle [19(a)].

[Fig. 8.9: De Sitter spacetime represented by a hyperboloid imbedded in a 5-dimensional flat space (with two dimensions suppressed). (i) The coordinates $(t, \chi, \theta, \phi)$ cover the whole hyperboloid; the sections $t$ = const. are surfaces of curvature $k = +1$. (ii) The coordinates $(\hat{t}, \hat{x}, \hat{y}, \hat{z})$ cover half of the hyperboloid; the surfaces $\hat{t}$ = const. are flat 3-spaces, their geodesic normals diverging from a point in the infinite past. Labels in the figure: null surfaces $\{\hat{t} = -\infty\}$ are boundaries of the coordinate patch; $\chi = 0$; $\chi = \pi$; $t = 0$, minimum distance between geodesic normals; geodesic normals; surfaces of constant time; timelike geodesic which does not cross surfaces $\{\hat{t} = \text{constant}\}$.]
To study the infinity in this space, a time coordinate $t'$ given as:
$$t' = 2\arctan(\exp a^{-1}t) - \tfrac{\pi}{2}, \qquad -\tfrac{\pi}{2} < t' < \tfrac{\pi}{2} \tag{8.4.16}$$
can be defined. This gives:
$$ds^2 = a^2 \cosh^2(a^{-1}t) \cdot d\bar{s}^2 \tag{8.4.17}$$
where $d\bar{s}^2$ is the same as in (8.4.11) after $r'$ is identified with $\chi$. Thus the de Sitter space is conformal to that part of the Einstein static universe defined by the range in (8.4.16) (see Fig. (8.10)(i)). The Penrose diagram, a plane in the $t'$ and $\chi$ coordinates, depicts the singularities of this space in a simple manner (see Fig. (8.10)(ii) and (iii)).

[Fig. 8.10: (i) De Sitter spacetime shown as being conformal to the region $-\tfrac{\pi}{2} < t' < \tfrac{\pi}{2}$ of the Einstein static universe. The shaded region depicts the portion that is conformal to the steady state universe. (ii) The Penrose diagram of de Sitter spacetime. (iii) The Penrose diagram of the steady state universe. In (ii) and (iii) each point represents a 2-sphere of area $4\pi \sin^2\chi$; null lines are at 45°, and $\chi = 0$ and $\chi = \pi$ are identified.]
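The conformal factor relating de Sitter space to the Einstein static universe can be cross-checked with a few lines of code. With $t' = 2\arctan(\exp a^{-1}t) - \tfrac{\pi}{2}$ one finds $\cos t' = 1/\cosh(a^{-1}t)$, so the factor $a^2\cosh^2(a^{-1}t)$ equals $a^2/\cos^2 t'$ and diverges as $t' \to \pm\tfrac{\pi}{2}$. The sketch below is illustrative only; the function name `conformal_time` is an assumption, not notation from the text.

```python
import math

def conformal_time(t, a=1.0):
    """Finite time coordinate t' defined in (8.4.16)."""
    return 2.0 * math.atan(math.exp(t / a)) - 0.5 * math.pi

def factor_mismatch(t, a=1.0):
    """cos(t') * cosh(t/a) - 1, which should vanish identically."""
    return math.cos(conformal_time(t, a)) * math.cosh(t / a) - 1.0
```

The whole range $-\infty < t < \infty$ is squeezed into $-\tfrac{\pi}{2} < t' < \tfrac{\pi}{2}$, which is why infinity appears at a finite coordinate distance in the Penrose diagram.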
Remark 8.4.6: We note that in contrast to Minkowski space, the de Sitter space has a spacelike infinity for timelike and null lines, both in the future and in the past. This is because in this space there exist both particle and event horizons for geodesic families of observers (see A.7 and A.8 in the appendix).

Definition 8.4.7: The spacetime of constant curvature with $R < 0$ is called the anti-de Sitter space. The topology of this spacetime is that of $S^1 \times \mathbb{R}^3$; it is represented by the hyperboloid:
$$-u^2 - v^2 + x^2 + y^2 + z^2 = -1,$$
embedded in flat five-dimensional space $\mathbb{R}^5$ with metric:
$$ds^2 = -(du)^2 - (dv)^2 + (dx)^2 + (dy)^2 + (dz)^2 \tag{8.4.18}$$
Remark 8.4.8: The anti-de Sitter space is not simply connected, and it has closed timelike lines. The universal covering space of this space, obtained by unwrapping the circle $S^1$ to (its covering space) $\mathbb{R}^1$, has the topology of $\mathbb{R}^4$. This (universal covering) space has no closed timelike lines, and it carries the following metric:
$$ds^2 = -dt^2 + \cos^2 t\,\{d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)\} \tag{8.4.19}$$

[Fig. 8.11: (i) The universal anti-de Sitter space, conformal to one half of the Einstein static universe; the coordinates $(t', r, \theta, \phi)$ cover the whole space, whereas $(t, \chi, \theta, \phi)$ cover only one diamond-shaped region; all the geodesics orthogonal to the surfaces $\{t = \text{constant}\}$ can be seen converging to $p$ and $q$ and then diverging out into similar diamond-shaped regions. (ii) The Penrose diagram of universal anti-de Sitter space. Infinity consists of the timelike surface $\mathcal{J}$ and the disjoint points $i^+$, $i^-$.]
(see Exc. 4 for metrics (8.4.19) and (8.4.20)). The surfaces $t'$ = constant cover the space completely and have non-geodesic normals.

Remark 8.4.9: The structure at infinity can be studied by using the coordinate $r'$:
$$r' = 2\arctan(\exp r) - \tfrac{\pi}{2}, \qquad 0 \le r' < \tfrac{\pi}{2} \tag{8.4.21}$$
The metrics $ds^2$ and $d\bar{s}^2$ of (8.4.11) are now related as:
$$ds^2 = \cosh^2 r\; d\bar{s}^2. \tag{8.4.22}$$
Thus the whole of anti-de Sitter space is conformal to the region $0 \le r' < \tfrac{\pi}{2}$ of the Einstein static cylinder. The null and spacelike infinity can be thought of as a timelike surface here, which has the topology $\mathbb{R}^1 \times S^2$.

Remark 8.4.10: No conformal transformation can be found which makes the timelike infinity finite without reducing the Einstein static universe to a point. Therefore timelike infinity is represented by the disjoint points $i^+$, $i^-$, and, as a consequence, there exists no Cauchy surface in the space (see Fig. (8.11)(ii)).
4.1.3 Robertson-Walker Space
If the universe is spatially homogeneous and admits a six-parameter group of isometries whose surfaces of transitivity are spacelike 3-surfaces of constant curvature, it is called a Robertson-Walker (or Friedmann) space. With suitable coordinates the metric here can be written as:
$$ds^2 = -dt^2 + S^2(t)\, d\sigma^2 \tag{8.4.23}$$
where $d\sigma^2$, which is independent of $t$, is the metric of a 3-space of constant curvature $K$. Evidently the geometry of these 3-surfaces is qualitatively different for the cases $K > 0$, $K < 0$ and $K = 0$. In order to study this, it is usual to rescale the function $S(t)$ so that $K$ can be taken as $1$ and $-1$ in the first two cases. The metric $d\sigma^2$ can now be written as:
$$d\sigma^2 = d\chi^2 + f^2(\chi)\,(d\theta^2 + \sin^2\theta\, d\phi^2)$$
where
$$f(\chi) = \begin{cases} \sin\chi & (0 \le \chi \le 2\pi), \quad K = 1 \\ \chi & (0 \le \chi < \infty), \quad K = 0 \\ \sinh\chi & (0 \le \chi < \infty), \quad K = -1 \end{cases}$$
In the last two cases the spaces are diffeomorphic to $\mathbb{R}^3$ and so are 'infinite'; in the first case they are diffeomorphic to a 3-sphere $S^3$ and are therefore compact ('closed' or 'finite').

Remark 8.4.11: The symmetry of the Robertson-Walker solutions requires that the energy-momentum tensor has the form of a homogeneous perfect fluid. The density $\mu$ and the pressure $p$ of this fluid are functions of time $t$, and the flow lines are the curves $\chi, \theta, \phi$ = const. The function $S(t)$ represents the separation of neighbouring flow lines (i.e., of nearby galaxies).

In view of the above remark, the equation of conservation of energy in these spaces is (see Exc. (2.7) and Exc. (4.6)):
$$\dot{\mu} = -3(\mu + p)\,\dot{S}/S \tag{8.4.24}$$
and the Raychaudhuri equation becomes:
$$4\pi(\mu + 3p) - \Lambda = -3\ddot{S}/S \tag{8.4.25}$$
Remark 8.4.12: From (8.4.24) it follows that the density $\mu$ decreases as the universe expands. This can also be interpreted to mean that the density was higher in the past, increasing without bound as $S \to 0$. An infinite density implies an infinite curvature. Hence Robertson-Walker spacetime has a singularity at $S = 0$ which, unlike the coordinate singularities, cannot be removed by coordinate transformations (see [16c] and [26] for more details on these spaces).
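The statement that the density grows without bound as $S \to 0$ can be made concrete once an equation of state is chosen. As an assumption that is not made in the text, take $p = w_{\text{eos}}\,\mu$ with constant $w_{\text{eos}}$; the conservation equation $\dot{\mu} = -3(\mu + p)\dot{S}/S$ then becomes $d\mu/dS = -3(1 + w_{\text{eos}})\mu/S$, with solution $\mu \propto S^{-3(1 + w_{\text{eos}})}$. The sketch below (names `density` and `conservation_residual` are illustrative) checks this closed form against the differential equation by a finite difference.

```python
def density(S, mu0=1.0, S0=1.0, w_eos=0.0):
    """Density mu(S) for the assumed equation of state p = w_eos * mu.

    w_eos = 0 is pressure-free dust, w_eos = 1/3 is radiation.
    """
    return mu0 * (S0 / S) ** (3.0 * (1.0 + w_eos))

def conservation_residual(S, w_eos=0.0, h=1e-6):
    """d(mu)/dS + 3 (mu + p) / S, estimated with a central difference.

    The residual should vanish if density() solves the conservation equation.
    """
    mu = density(S, w_eos=w_eos)
    dmu_dS = (density(S + h, w_eos=w_eos) - density(S - h, w_eos=w_eos)) / (2.0 * h)
    return dmu_dS + 3.0 * (mu + w_eos * mu) / S
```

For dust the product $\mu S^3$ stays constant, so halving $S$ raises the density eightfold.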
4.1.4 The Schwarzschild and the Reissner-Nordström Solution

The spatially homogeneous solutions, e.g., Robertson-Walker, which are good models for the large scale distribution of matter in the universe, are not suitable for the description of the local geometry of spacetime in the solar system. This geometry can be described to a good approximation by the Schwarzschild solution. This solution represents the spherically symmetric empty spacetime outside a spherically symmetric massive body, with the metric given by:
$$ds^2 = -\left(1 - \frac{2m}{r}\right) dt^2 + \left(1 - \frac{2m}{r}\right)^{-1} dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.26}$$
where $r > 2m$.* This spacetime is 'static', meaning thereby that $\partial/\partial t$ is a timelike Killing vector which is a gradient, and the space is spherically symmetric, i.e., it is invariant under the group of isometries SO(3) operating on the spacelike 2-spheres $\{t, r = \text{constant}\}$. One should note that the coordinate $r$ here is defined to meet the requirement that the area of these surfaces of transitivity is $4\pi r^2$. The solution is unique† and is asymptotically flat, as the metric has the form:
$$g_{ab} = \eta_{ab} + O\!\left(\frac{1}{r}\right)$$
for large $r$ (see Exc. 8.4.1 and 8.4.5).

* A comparison with Newtonian theory suggests that $m$ here (as measured from infinity) is the gravitational mass of the body producing the field.

† By uniqueness we mean that if there is any solution of the vacuum field equations that is spherically symmetric, then it must be locally isometric to the Schwarzschild solution.

When the metric is taken as an empty space solution for all values of $r$, it is obvious that $r = 0$ and $r = 2m$ make the metric singular. To remove these, the given manifold is cut into two disconnected components defined by $0 < r < 2m$ and $2m < r < \infty$, and since the manifold $\mathcal{M}$ of spacetime has to be a connected piece, the component $2m < r < \infty$ is taken as the required $\mathcal{M}$ of $(\mathcal{M}, g)$. We note that although the metric is singular at $r = 2m$ in the coordinates $(t, r, \theta, \phi)$, there are no scalar polynomials formed by the curvature tensor and the metric that diverge as $r \to 2m$; hence this singularity is not a real physical singularity, but is the result of a bad choice of coordinates. To avoid this, different coordinate transformations have been used; two of these are:
$$v = t + r^*, \qquad w = t - r^* \qquad \text{(advanced, retarded null coordinates)}$$
where
$$r^* = \int \frac{dr}{1 - 2m/r} = r + 2m \log(r - 2m) \tag{8.4.27}$$
The metric in terms of $(v, r, \theta, \phi)$ (denoted $g'$) is:
$$ds^2 = -\left(1 - \frac{2m}{r}\right) dv^2 + 2\, dv\, dr + r^2(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.28}$$
and in terms of $(w, r, \theta, \phi)$ (denoted $g''$) it becomes:
$$ds^2 = -\left(1 - \frac{2m}{r}\right) dw^2 - 2\, dw\, dr + r^2(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.29}$$
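The role of the coordinate $r^* = r + 2m\log(r - 2m)$ in the null coordinates $v = t + r^*$, $w = t - r^*$ is easier to see numerically: its derivative is $(1 - 2m/r)^{-1}$, and it runs off to $-\infty$ as $r \to 2m$. The following sketch is illustrative only; the function names are not from the text.

```python
import math

def r_star(r, m=1.0):
    """The coordinate r* of (8.4.27), defined for r > 2m."""
    return r + 2.0 * m * math.log(r - 2.0 * m)

def null_coordinates(t, r, m=1.0):
    """Advanced and retarded null coordinates (v, w) = (t + r*, t - r*)."""
    rs = r_star(r, m)
    return t + rs, t - rs

def r_star_slope(r, m=1.0, h=1e-6):
    """Central-difference estimate of dr*/dr, to compare with 1 / (1 - 2m/r)."""
    return (r_star(r + h, m) - r_star(r - h, m)) / (2.0 * h)
```

At $r = 3m$ the slope should be close to $3$, and it grows without bound as $r$ approaches $2m$, which is how the surface $r = 2m$ is pushed to $r^* = -\infty$ in the $(t, r^*)$ description.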
Both these metrics are non-singular and analytic on the larger manifolds $\mathcal{M}'$ and $\mathcal{M}''$ defined by the region $0 < r < \infty$. In fact it is through these manifolds $(\mathcal{M}', g')$ and $(\mathcal{M}'', g'')$ that the manifold $(\mathcal{M}, g)$, whose region is $2m < r < \infty$, can be extended. For instance the region of $\mathcal{M}'$ for which $0 < r < 2m$ is isometric to that region of the Schwarzschild metric for which $0 < r < 2m$. Thus by taking a different manifold the Schwarzschild metric is extended in such a manner that it is no longer singular at $r = 2m$. In the manifold $\mathcal{M}'$, $r = 2m$ is a null surface; in the section of spacetime given by $\theta$, $\phi$ constant, each point represents a 2-sphere of area $4\pi r^2$. The surface $r = 2m$ acts as a one-way membrane: it allows future-directed timelike and null curves to cross only from the outside ($r > 2m$) to the inside ($r < 2m$), whereas it does not let past-directed timelike and null curves cross from the region ($r > 2m$) to the region ($r < 2m$). Thus the representation $(\mathcal{M}', g')$ is not time symmetric, as is also obvious from the presence of the term $dv\, dr$ in $g'$. As $r \to 0$, the scalar $R_{abcd} R^{abcd}$ diverges as $m^2/r^6$, hence $r = 0$ is a real singularity here. We call $(\mathcal{M}', g')$ the advanced Finkelstein extension of $(\mathcal{M}, g)$, or alternatively $(\mathcal{M}, g)$ is said to be imbedded in $(\mathcal{M}', g')$.

The manifold $(\mathcal{M}'', g'')$, known as the retarded Finkelstein extension of $(\mathcal{M}, g)$, has much the same properties as we saw in the case of $(\mathcal{M}', g')$. For instance the region $0 < r < 2m$ of $\mathcal{M}''$ is isometric to the region $0 < r < 2m$ of the Schwarzschild metric, although the isometry here reverses the direction of time. Similar to $\mathcal{M}'$, the manifold $\mathcal{M}''$ has $r = 2m$ as a null surface which acts as a one-way membrane; the only difference is that now the past-directed timelike or null curves cross from the outside ($r > 2m$) to the inside ($r < 2m$).
Using these two manifolds we shall next see that we can define a still larger manifold $\mathcal{M}^*$ with a metric $g^*$ into which both $(\mathcal{M}', g')$ and $(\mathcal{M}'', g'')$ can be isometrically imbedded so that they coincide on the region $r > 2m$, which is isometric to $(\mathcal{M}, g)$. The construction of the pair $(\mathcal{M}^*, g^*)$ is due to Kruskal (see Secs. 31.1 and 31.4 in [26]); here $g^*$ is given by:
$$ds^2 = F^2(t', x')\,(-dt'^2 + dx'^2) + r^2(t', x')\,(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.30}$$
and the manifold $\mathcal{M}^*$ is defined by the coordinates $(t', x', \theta, \phi)$ for which $(t')^2 - (x')^2 < 2m$, the functions $r$ and $F$ being determined by:
$$(t')^2 - (x')^2 = -(r - 2m)\exp(r/2m) \tag*{(8.4.31)(a)}$$
$$F^2 = \frac{16m^2}{r}\exp(-r/2m) \tag*{(8.4.31)(b)}$$
Both these functions are positive and analytic. The coordinates $(t', x')$ are arrived at from the pair $(v, w)$ via the following relations:
$$v' = \exp(v/4m), \qquad w' = -\exp(-w/4m) \tag*{(8.4.32)(a)}$$
$$x' = \tfrac{1}{2}(v' - w'), \qquad t' = \tfrac{1}{2}(v' + w') \tag*{(8.4.32)(b)}$$
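The passage from Schwarzschild coordinates to the Kruskal pair $(t', x')$ can be followed step by step in code. Combining $v = t + r^*$, $w = t - r^*$, $r^* = r + 2m\log(r - 2m)$ with $v' = \exp(v/4m)$, $w' = -\exp(-w/4m)$ gives $(t')^2 - (x')^2 = v'w' = -\exp(r^*/2m) = -(r - 2m)\exp(r/2m)$. The sketch below, with the illustrative name `kruskal`, checks this identity for an exterior point $r > 2m$.

```python
import math

def kruskal(t, r, m=1.0):
    """Map exterior Schwarzschild coordinates (t, r), r > 2m, to Kruskal (t', x')."""
    rs = r + 2.0 * m * math.log(r - 2.0 * m)   # r* of (8.4.27)
    v, w = t + rs, t - rs                       # advanced / retarded null coordinates
    vp = math.exp(v / (4.0 * m))                # v' of (8.4.32)(a)
    wp = -math.exp(-w / (4.0 * m))              # w' of (8.4.32)(a)
    return 0.5 * (vp + wp), 0.5 * (vp - wp)     # (t', x') of (8.4.32)(b)

def kruskal_defect(t, r, m=1.0):
    """t'^2 - x'^2 + (r - 2m) exp(r / 2m), which should vanish."""
    tp, xp = kruskal(t, r, m)
    return tp * tp - xp * xp + (r - 2.0 * m) * math.exp(r / (2.0 * m))
```

Since $v' > 0$ and $w' < 0$ for every exterior point, the image always satisfies $x' > |t'|$, which is one of the two regions isometric to the exterior Schwarzschild solution.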
We note that the regions of $(\mathcal{M}^*, g^*)$ given by $x' > |t'|$ and $x' < -|t'|$ are both isometric to the region $r > 2m$ of the Schwarzschild solution $(\mathcal{M}, g)$. Also the region defined by $x' > -t'$ is isometric to the advanced Finkelstein extension $(\mathcal{M}', g')$, and the region defined by $x' > t'$ is isometric to the retarded Finkelstein extension $(\mathcal{M}'', g'')$. The Kruskal extension $(\mathcal{M}^*, g^*)$ is the unique analytic and locally inextendible extension of the Schwarzschild solution.

The Reissner-Nordström solution, which is locally similar to the Schwarzschild solution, represents the spacetime outside a spherically symmetric charged body (configuration), carrying an electric charge but with no spin or magnetic dipole. The energy-momentum tensor in this case is that of the electromagnetic field in the spacetime which results from the charge on the body.$^{32}$ For a suitable choice of coordinates the metric can be written as:
$$ds^2 = -\left(1 - \frac{2m}{r} + \frac{e^2}{r^2}\right) dt^2 + \left(1 - \frac{2m}{r} + \frac{e^2}{r^2}\right)^{-1} dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{8.4.33}$$
where $m$ represents the gravitational mass and $e$ the electric charge of the body. The solution is asymptotically flat, and if $e^2 > m^2$ the metric is non-singular everywhere except for the irremovable singularity at $r = 0$. This may, however, be thought of as the point charge which produces the field. If $e^2 \le m^2$, the metric has singularities at $r_+$ and $r_-$, where $r_\pm = m \pm (m^2 - e^2)^{1/2}$.
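The two regimes of the Reissner-Nordström metric can be told apart by looking at the zeros of the function $1 - 2m/r + e^2/r^2$ that multiplies $dt^2$. The sketch below is illustrative only; the names `rn_metric_function` and `rn_singular_radii` are not from the text.

```python
import math

def rn_metric_function(r, m, e):
    """The factor 1 - 2m/r + e^2/r^2 appearing in (8.4.33)."""
    return 1.0 - 2.0 * m / r + (e * e) / (r * r)

def rn_singular_radii(m, e):
    """Radii r_+ and r_- where the metric function vanishes.

    Returns an empty tuple when e^2 > m^2, in which case the metric is
    regular for every r > 0.
    """
    disc = m * m - e * e
    if disc < 0.0:
        return ()
    root = math.sqrt(disc)
    return (m + root, m - root)
```

For $e = 0$ the outer radius reduces to the Schwarzschild value $2m$ and the inner one collapses to $r = 0$.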
We have so far considered spherically symmetric solutions. This symmetry, however, does not always hold in reality, since astronomical bodies are in general rotating and so the solutions outside them are not exactly spherically symmetric. We now give the exact solution that takes care of this phenomenon.
4.1.5 The Kerr Solution

The Kerr solutions are the only known family of solutions that can represent the stationary axisymmetric asymptotically flat field outside a rotating massive object. In fact Kerr solutions seem to be the only possible exterior solutions for black holes (see [16e]). The metric in Boyer and Lindquist coordinates $(r, \theta, \phi, t)$ can be expressed as:
$$ds^2 = \rho^2\left(\frac{dr^2}{\Delta} + d\theta^2\right) + (r^2 + a^2)\sin^2\theta\, d\phi^2 - dt^2 + \frac{2mr}{\rho^2}\,(a\sin^2\theta\, d\phi - dt)^2 \tag{8.4.34}$$
where
$$\rho^2 = \rho^2(r, \theta) = r^2 + a^2\cos^2\theta, \qquad \Delta = \Delta(r) = r^2 - 2mr + a^2.$$
Entities $m$ and $a$ in (8.4.34) are constants; $m$ represents the mass and $ma$ the angular momentum as measured from infinity. When $a = 0$, the solution reduces to the Schwarzschild solution. The metric is invariant under simultaneous inversions of $t$ and $\phi$, i.e., $t \to -t$ and $\phi \to -\phi$, but is not invariant under their separate inversions. When $a^2 > m^2$, obviously $\Delta > 0$, and the metric is singular only when $r = 0$. The singularity here is not a point but a ring (see Exc. 7).

$^{32}$ Actually there is no "body" (per se) in spacetime. The 'charge' expresses the total electric flux trapped in the topology of the "throat" connecting the two asymptotically flat 3-dimensional spaces.

The Kerr solutions, being stationary and axisymmetric, have a 2-parameter group of isometries, which is abelian. Hence there are two independent Killing vector fields that commute. There is a unique linear combination $K^a$ of these vector fields which is timelike for large positive and negative values of $r$. The orbits of $K^a$ define the stationary frame; thus an object moving along these orbits appears to be stationary with respect to infinity. Besides $K^a$ there is another unique linear combination $\tilde{K}^a$ of the Killing vector fields which is zero on the axis of symmetry and whose orbits are closed curves that correspond to the rotational symmetry of the solution.
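Two of the statements above, the reduction to Schwarzschild at $a = 0$ and the behaviour under inversions of $t$ and $\phi$, can be read off from the individual metric components. The sketch below assumes the standard Boyer-Lindquist form of the Kerr metric, $ds^2 = \rho^2(dr^2/\Delta + d\theta^2) + (r^2 + a^2)\sin^2\theta\, d\phi^2 - dt^2 + (2mr/\rho^2)(a\sin^2\theta\, d\phi - dt)^2$, and expands the last square; the function name and dictionary keys are illustrative choices.

```python
import math

def kerr_components(r, theta, m, a):
    """Nonzero metric components in Boyer-Lindquist coordinates.

    'tphi' is g_{t phi}, so the cross term in ds^2 is 2 * g_{t phi} dt dphi.
    """
    s2 = math.sin(theta) ** 2
    rho2 = r * r + (a * math.cos(theta)) ** 2
    delta = r * r - 2.0 * m * r + a * a
    return {
        "tt": -1.0 + 2.0 * m * r / rho2,
        "tphi": -2.0 * m * r * a * s2 / rho2,
        "rr": rho2 / delta,
        "thth": rho2,
        "phph": (r * r + a * a) * s2 + 2.0 * m * r * a * a * s2 * s2 / rho2,
    }
```

Only the cross term $g_{t\phi}$ is odd under a separate inversion of $t$ or of $\phi$; it keeps its sign when both are inverted together, which is the invariance noted in the text. It also vanishes when $a = 0$.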
4.1.6 Gödel's Universe
These solutions, known as Gödel's universe, represent an example of an exact solution where the matter is a pressure-free perfect fluid, i.e., $T_{ab} = \rho\, u_a u_b$, $\rho$ being the matter density and $u^a$ the normalized 4-velocity vector. The manifold here is $\mathbb{R}^4$, and the metric can be expressed as:
$$ds^2 = -dt^2 + dx^2 - \tfrac{1}{2}\exp(2\sqrt{2}\,\omega x)\, dy^2 + dz^2 - 2\exp(\sqrt{2}\,\omega x)\, dt\, dy \tag{8.4.35}$$
where $\omega > 0$ is a constant which stands for the magnitude of the vorticity of the flow vector $u^a$. The field equations are satisfied if $u = \partial/\partial x^0$,$^{33}$ i.e., $u^a = \delta^a_0$, and $4\pi\rho = \omega^2 = -\Lambda$.

Remark 8.4.13: The Gödel spacetime $(\mathcal{M}, g)$ has a 5-dimensional group of isometries which is transitive, hence it is a completely homogeneous spacetime. It is easy to note that the metric is a direct sum of two metrics: $g_1$ with coordinates $(t, x, y)$ defined on $\mathcal{M}_1 = \mathbb{R}^3$, and $g_2$ with coordinate $z$ defined on $\mathcal{M}_2 = \mathbb{R}^1$. There are closed timelike curves through every point of $(\mathcal{M}_1, g_1)$ and hence through every point of $(\mathcal{M}, g)$. The Gödel solution is geodesically complete.
4.1.7 Taub-NUT and Misner Spaces
The Taub space is an empty-space solution of Einstein's equations; we denote it as $(\mathcal{M}, g)$.$^{34}$ It is spatially homogeneous and has the topology $\mathbb{R} \times S^3$. The metric here is given by:
$$ds^2 = -U^{-1}\,dt^2 + (2l)^2 U\,(d\psi + \cos\theta\,d\phi)^2 + (t^2 + l^2)(d\theta^2 + \sin^2\theta\,d\phi^2) \tag{8.4.36}$$
where
$$U(t) = -1 + \frac{2(mt + l^2)}{t^2 + l^2}$$
with $m$ and $l$ being positive constants. $S^3$ is covered by Euler coordinates $(\theta, \phi, \psi)$ that vary as: $0 \le \theta \le \pi$, $0 \le \phi \le 2\pi$, $0 \le \psi \le 4\pi$. The metric is singular at $t = t_\pm = m \pm (m^2 + l^2)^{1/2}$, where $U = 0$.
33 $t = x^0$.
34 The tangent space $T_p$ for $p \in \mathcal{M}$ is the direct sum of the vertical subspace $V_p$ and the horizontal subspace $H_p$. For $X, Y \in T_p$ the metric $g$ of $(\mathcal{M}, g)$ can be locally expressed as $g(X, Y) = g_V(X_V, Y_V) + (t^2 + l^2)\,g_H(\pi_* X_H, \pi_* Y_H)$, where $(X_V, Y_V)$ and $(X_H, Y_H)$ are in $V_p$ and $H_p$. The space $V_p$, which is tangent to the fiber, is spanned by $\partial/\partial t$ and $\partial/\partial\psi$, and the space $H_p$ is spanned by $\partial/\partial\theta$ and $\partial/\partial\phi - \cos\theta\,\partial/\partial\psi$. (See Sec. 2.5 for details.)
The Misner space is a 2-dimensional space with topology $S^1 \times \mathbb{R}^1$ and with the metric $g$ given by:
$$ds^2 = -t^{-1}\,dt^2 + t\,d\psi^2 \tag{8.4.37}$$
where $0 \le \psi \le 2\pi$. There is an obvious similarity between the two spaces in the sense that the singularity of the metric is given by the coefficient of $dt^2$. Apart from this similarity, there are other common features between them. Both can be extended to two inequivalent, locally inextendible analytic extensions, and in both cases these extensions are geodesically incomplete. In the case of Misner space, the metrics of these extensions (denoted $g'$ and $g''$) are given by:
$$ds^2 = 2\,d\psi'\,dt + t\,(d\psi')^2, \qquad \psi' = \psi - \log t \tag{8.4.38}$$
and
$$ds^2 = -2\,d\psi''\,dt + t\,(d\psi'')^2, \qquad \psi'' = \psi + \log t \tag{8.4.39}$$
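The first of these forms can be checked directly from the Misner metric $ds^2 = -t^{-1}dt^2 + t\,d\psi^2$. The short derivation below is only a sketch for $t > 0$, using nothing beyond the definition of $\psi'$:
$$\begin{aligned}
\psi = \psi' + \log t \;&\Longrightarrow\; d\psi = d\psi' + t^{-1}\,dt,\\
t\,d\psi^2 &= t\,(d\psi')^2 + 2\,d\psi'\,dt + t^{-1}\,dt^2,\\
-t^{-1}\,dt^2 + t\,d\psi^2 &= 2\,d\psi'\,dt + t\,(d\psi')^2 .
\end{aligned}$$
The same steps with $\psi = \psi'' - \log t$ change the sign of the cross term. In both cases the factor $t^{-1}$ has dropped out, which is why the new form of the metric remains regular at $t = 0$.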
The manifolds $\mathcal{M}'$ and $\mathcal{M}''$ are defined respectively by $\psi'$ and $-\infty < t < \infty$, and by $\psi''$ and $-\infty < t < \infty$. When $t > 0$, both $(\mathcal{M}', g')$ and $(\mathcal{M}'', g'')$ are isometric to the original Misner space $(\mathcal{M}, g)$. Inextendible analytic extensions of the Taub space are obtained by considering $\mathcal{M}$ as a fiber bundle over $S^2$ with fiber $\mathbb{R}^1 \times S^1$, and the bundle projection $\pi: \mathcal{M} \to S^2$ given by $(t, \psi, \theta, \phi) \to (\theta, \phi)$. By dropping the $\theta$, $\phi$ terms of (8.4.36), the metric can now be written in the form:
$$ds^2 = -U^{-1}\,dt^2 + 4l^2 U\,(d\psi)^2 \tag{8.4.40}$$
The similarity between the metrics given in (8.4.37) and (8.4.40) suggests that the method used in the case of Misner space can be applied here for obtaining the extensions. It should be noted that the above metric (denoted $g_V$) is on the fiber $\mathcal{F} = \mathbb{R}^1 \times S^1$, which is regarded as the $(t, \psi)$ plane. Thus in effect the extensions are those of $(\mathcal{F}, g_V)$; when these are combined with the metric $g_H$ of the 2-sphere as given in Ftn (34), the analytic extension of $(\mathcal{M}, g)$ is obtained. The metric $g_V$ has singularities at $t = t_\pm$, where $U = 0$. To avoid these singularities, we take the manifold given by $t_- < t < t_+$ and $\psi$, denote it by $\mathcal{F}_0$, and then extend $(\mathcal{F}_0, g_V)$ by defining:
$$\psi' = \psi + \frac{1}{2l}\int \frac{dt}{U(t)} \tag{8.4.41}$$
The metric $g_V$ now becomes $g'_V$:
$$ds^2 = 4l\,d\psi'\,\bigl(l\,U(t)\,d\psi' - dt\bigr) \tag{8.4.42}$$
This metric is analytic on the manifold $\mathcal{F}'$ with topology $S^1 \times \mathbb{R}^1$ defined by $\psi'$ and by $-\infty < t < \infty$. The region $t_- < t < t_+$ of $(\mathcal{F}', g'_V)$ is isometric with $(\mathcal{F}_0, g_V)$. There are no closed timelike curves in the region $t_- < t < t_+$, though there are for $t < t_-$ and $t > t_+$. Another extension is made by defining:
$$\psi'' = \psi - \frac{1}{2l}\int \frac{dt}{U(t)} \tag{8.4.43}$$
The metric $g''_V$ here is:
$$ds^2 = 4l\,d\psi''\,\bigl(l\,U(t)\,d\psi'' + dt\bigr) \tag{8.4.44}$$
which is analytic on the manifold $\mathcal{F}''$ given by $\psi''$ and $-\infty < t < \infty$. As in the previous case, $(\mathcal{F}'', g''_V)$ is isometric to $(\mathcal{F}_0, g_V)$ on the region $t_- < t < t_+$. These inextendible extensions of Taub's space were obtained by Newman, Unti and Tamburino [28], and therefore Taub's space along with its extensions is called the Taub-NUT space.
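The two facts used in this construction, namely that $U$ vanishes at $t_\pm = m \pm (m^2 + l^2)^{1/2}$ and that the change of coordinate $\psi \to \psi'$ turns the fiber metric into the form regular at $t_\pm$, can be checked numerically. The following is a minimal sketch under stated assumptions: the function names and the sample values of $m$, $l$ and of the tangent vector are illustrative choices, not taken from the text.

```python
import math

def U(t, m, l):
    """Taub-NUT function U(t) = -1 + 2(m t + l^2) / (t^2 + l^2)."""
    return -1.0 + 2.0 * (m * t + l * l) / (t * t + l * l)

def horizon_times(m, l):
    """Roots t_- and t_+ of U(t) = 0, i.e. of t^2 - 2 m t - l^2 = 0."""
    root = math.sqrt(m * m + l * l)
    return m - root, m + root

def g_fiber(t, dt, dpsi, m, l):
    """Fiber metric -U^{-1} dt^2 + 4 l^2 U dpsi^2 on the displacement (dt, dpsi)."""
    u = U(t, m, l)
    return -dt * dt / u + 4.0 * l * l * u * dpsi * dpsi

def g_extended(t, dt, dpsi_prime, m, l):
    """Extended metric 4 l dpsi' (l U dpsi' - dt) on the displacement (dt, dpsi')."""
    return 4.0 * l * dpsi_prime * (l * U(t, m, l) * dpsi_prime - dt)

# Illustrative (assumed) parameter values and tangent vector.
m, l = 1.0, 0.5
t_minus, t_plus = horizon_times(m, l)

t, dt, dpsi = 0.7, 0.3, -0.2                      # a point with t_- < t < t_+
dpsi_prime = dpsi + dt / (2.0 * l * U(t, m, l))   # from psi' = psi + (1/2l) * integral of dt/U

print("U(t_-), U(t_+):", U(t_minus, m, l), U(t_plus, m, l))
print("fiber form   :", g_fiber(t, dt, dpsi, m, l))
print("extended form:", g_extended(t, dt, dpsi_prime, m, l))
```

Note that `g_extended` can also be evaluated at `t_plus` or `t_minus`, where `g_fiber` would divide by zero; this is the numerical counterpart of the statement that the extended metric is analytic for all $t$.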
4.2 Causal Structure
Having described the exact solutions (to some extent), we now define some of the 'words' that are used in 'singularity' and 'black hole' theorems. We shall see that these 'words' describe the causal behaviour of spacetime and help in making succinct statements of results on causal structure. Due to our limited scope, we shall only give their definitions (skipping the underlying philosophy of their introduction into the literature) and state the results based on them, asking the reader to see [16c] for details.
4.2.1 Orientability
Definition 8.4.14: A spacetime is said to be time-orientable if it is possible to attach an arrow in one and the same direction at every point, in other words, if it is possible to define continuously a division of non-spacelike vectors into two classes which can be labelled as future- and past-directed. All results in this as well as in the following section are based on the assumption that spacetime is time-orientable. If the spacetime $(\mathcal{M}, g)$ is not time-orientable, e.g., in the case of the space obtained from de Sitter space by identifying antipodal points, it is customary to consider its double covering space $(\tilde{\mathcal{M}}, \tilde{g})$, $\pi: \tilde{\mathcal{M}} \to \mathcal{M}$, which is time-orientable. The projection $\pi$ carries $(p, \alpha) \in \tilde{\mathcal{M}}$ to $p \in \mathcal{M}$, where $\alpha$ denotes one of the two orientations of time at $p$.
Definition 8.4.15: A spacetime $(\mathcal{M}, g)$ is called space-orientable if it is possible to assign the terms 'right-handed' and 'left-handed' in a continuous manner to the bases of three spacelike axes at every point $p$ of $\mathcal{M}$. According to Geroch, if it is possible to define two-component spinor fields at every point of a spacetime, then the spacetime must be parallelizable, i.e., it must be possible to introduce a continuous system of bases of the tangent space at every point (see Geroch in [13](b)). It is important to note here that if one assumes that spacetime is time-orientable, then in view of experimental evidence and of the CPT theorem, it follows that spacetime must be space-orientable as well (see Chapters 7 and 9 and 7.[21] for the CPT concepts).
4.2.2 Chronological and Causal Future
Definition 8.4.16: Let $\mathcal{S}$ and $\mathcal{U}$ be two arbitrary sets of a time-orientable spacetime $(\mathcal{M}, g)$. The set of all points in $\mathcal{U}$ which can be reached from $\mathcal{S}$ by a future-directed timelike curve in $\mathcal{U}$ is called the chronological future $I^+(\mathcal{S}, \mathcal{U})$ of $\mathcal{S}$ relative to $\mathcal{U}$.$^{35}$ When $\mathcal{U} = \mathcal{M}$, we denote it simply as $I^+(\mathcal{S})$, and we note that it is an open set, since if $p \in \mathcal{M}$ can be reached by a future-directed timelike curve from $\mathcal{S}$, then there is a small neighbourhood of $p$ which can also be reached from $\mathcal{S}$ by timelike curves.
Definition 8.4.17: The union of $(\mathcal{S} \cap \mathcal{U})$ with the set of all those points in $\mathcal{U}$ which can be reached from $\mathcal{S}$ by a future-directed non-spacelike curve in $\mathcal{U}$ is called the causal future of $\mathcal{S}$ relative to $\mathcal{U}$ and is denoted $J^+(\mathcal{S}, \mathcal{U})$. When $\mathcal{U} = \mathcal{M}$, it is denoted $J^+(\mathcal{S})$. This definition (like many others that will follow) has a dual in which 'future' is replaced by 'past'; the notations with a $+$ sign are then replaced by those with a $-$ sign.
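In Minkowski space with $\mathcal{U} = \mathcal{M}$ and $\mathcal{S}$ a single point $p$, these two definitions reduce to sign conditions on the interval between $p$ and $q$, because the straight segment from $p$ to $q$ is then a curve of the required type whenever one exists. The sketch below illustrates only this special case; the function names and the convention $c = 1$ with points written as $(t, x, y, z)$ are assumptions made for the example.

```python
def interval(p, q):
    """Minkowski interval -dt^2 + dx^2 + dy^2 + dz^2 between p and q (c = 1)."""
    dt = q[0] - p[0]
    spatial = sum((b - a) ** 2 for a, b in zip(p[1:], q[1:]))
    return -dt * dt + spatial

def in_chronological_future(p, q):
    """True if q lies in I^+(p): reachable by a future-directed timelike curve."""
    return q[0] > p[0] and interval(p, q) < 0

def in_causal_future(p, q):
    """True if q lies in J^+(p): q = p, or reachable by a future-directed non-spacelike curve."""
    return tuple(q) == tuple(p) or (q[0] > p[0] and interval(p, q) <= 0)

origin = (0, 0, 0, 0)
for q in [(2, 1, 0, 0), (1, 1, 0, 0), (1, 2, 0, 0), (-2, 1, 0, 0), origin]:
    print(q, in_chronological_future(origin, q), in_causal_future(origin, q))
```

The point $(1, 1, 0, 0)$ lies on the future null cone of the origin: it belongs to $J^+$ but not to $I^+$, which is the flat-space picture of the null cone bounding both sets.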
From Sec. 3 we know that if a non-spacelike curve between two points is not a null geodesic, then it can be deformed into a timelike curve. In view of this, if $\mathcal{U}$ is an open set and $p, q, r \in \mathcal{U}$, then either of the following two statements implies the same result, namely $r \in I^+(p, \mathcal{U})$:
(i) $q \in J^+(p, \mathcal{U})$, $r \in I^+(q, \mathcal{U})$;
(ii) $q \in I^+(p, \mathcal{U})$, $r \in J^+(q, \mathcal{U})$.
From this it follows that the closures and boundaries of the sets $I^+(p, \mathcal{U})$ and $J^+(p, \mathcal{U})$ are equal; i.e.:
$$\overline{I^+(p, \mathcal{U})} = \overline{J^+(p, \mathcal{U})} \tag{8.4.45(a)}$$
and$^{36}$
$$\dot{I}^+(p, \mathcal{U}) = \dot{J}^+(p, \mathcal{U}) \tag{8.4.45(b)}$$
Remark 8.4.18: The region $J^+(\mathcal{S})$ of spacetime is causally affected by events in $\mathcal{S}$. It is not necessarily a closed set, even when $\mathcal{S}$ is a single point.
4.2.3 Horismos and Achronal Sets
Definition 8.4.19: The difference set $J^+(\mathcal{S}, \mathcal{U}) - I^+(\mathcal{S}, \mathcal{U})$ is called the future horismos of $\mathcal{S}$ relative to $\mathcal{U}$ and is denoted $E^+(\mathcal{S}, \mathcal{U})$. As usual, $E^+(\mathcal{S}, \mathcal{M}) = E^+(\mathcal{S})$. Sometimes the relations $p \in I^+(q)$, $p \in J^+(q)$ and $p \in E^+(q)$ are denoted as $q \ll p$, $q < p$ and $q \to p$ respectively.
Remark 8.4.20: If the set $\mathcal{U}$ in the above definitions is a convex normal neighbourhood about a point $p$, then $E^+(p, \mathcal{U})$ consists of the future-directed null geodesics in $\mathcal{U}$ from $p$ and thus forms the boundary in $\mathcal{U}$ of both $I^+(p, \mathcal{U})$ and $J^+(p, \mathcal{U})$. In Minkowski space the null cone of $p$ forms the boundary of the chronological and causal futures of $p$. The above notion of the boundary, defined with the help of a convex normal neighbourhood $\mathcal{U}$, can be generalized as follows.
Definition 8.4.21: A set $\mathcal{S}$ which satisfies $I^+(\mathcal{S}) \cap \mathcal{S} = \emptyset$ (i.e., there are no two points of $\mathcal{S}$ with timelike separation) is called achronal. If $\mathcal{S} \supset I^+(\mathcal{S})$, then $\mathcal{S}$ is said to be a future set. Note that if $\mathcal{S}$ is a future set, then $\mathcal{M} - \mathcal{S}$ is a past set. Also, for any set $\mathcal{S}$, the sets $I^+(\mathcal{S})$ and $J^+(\mathcal{S})$ are examples of future sets. Examples of achronal sets are given by the fundamental result:
Proposition 8.4.22: If $\mathcal{S}$ is a future set, then $\dot{\mathcal{S}}$, the boundary of $\mathcal{S}$, is a closed, imbedded, achronal three-dimensional $C^{1-}$ submanifold.
A set with the properties of $\dot{\mathcal{S}}$ listed in the above proposition is called an achronal boundary. A spacetime is said to satisfy the chronology condition if there are no closed timelike curves in it. The set of points where this condition does not hold is called the chronology violating set of $\mathcal{M}$. It can be shown that the chronology violating set of $\mathcal{M}$ is the disjoint union of sets of the form $I^+(p) \cap I^-(p)$, $p \in \mathcal{M}$; and when $\mathcal{M}$ is compact, the chronology violating set of $\mathcal{M}$ is non-empty. Similarly, if there are no closed non-spacelike curves in $\mathcal{M}$, we say that the causality condition holds in $\mathcal{M}$. The set of points where this condition is violated is the disjoint union of sets of the form $J^+(p) \cap J^-(p)$, $p \in \mathcal{M}$. Further, we say that the strong causality condition holds at $p$ if every neighbourhood $\mathcal{N}$ of $p$ contains a neighbourhood $\mathcal{N}'$ (of $p$) which no non-spacelike curve intersects more than once ($\mathcal{N}'$ can be $\mathcal{N}$). It can be verified that this condition holds in $\mathcal{M}$ if the following four conditions are satisfied there:
(a) for every null vector $K$, $R_{ab} K^a K^b \ge 0$;
(b) every null geodesic contains a point at which $K_{[a} R_{b]cd[e} K_{f]} K^c K^d \ne 0$, where $K$ is the tangent vector of the geodesic;
(c) the chronology condition holds on $\mathcal{M}$;
(d) $\mathcal{M}$ is null geodesically complete.
36 For any set $\mathcal{S}$, the boundary $\dot{\mathcal{S}} = \overline{\mathcal{S}} \cap \overline{(\mathcal{M} - \mathcal{S})}$. (Recall that the boundary of a set $\mathcal{S}$ is usually denoted as $\partial\mathcal{S}$.)
4.2.4 The Concept of Imprisonment
Definition 8.4.23: $\mathcal{M}$ is said to satisfy the future (past) distinguishing condition at a point $p \in \mathcal{M}$ if every neighbourhood of $p$ contains a neighbourhood (of $p$) which no future (past) directed non-spacelike curve from $p$ intersects more than once. An equivalent statement of the above condition is that $I^+(q) = I^+(p)$ ($I^-(q) = I^-(p)$) implies $q = p$. Evidently, if the strong causality condition holds on $\mathcal{M}$, then the past and future distinguishing conditions hold there as well. The causality conditions described above lead to the phenomenon of imprisonment (a concept required in black holes). We note that a non-spacelike curve $\gamma$ that is future-inextendible behaves in one of three ways as one follows it to the future: it can
(i) enter a compact set $\mathcal{S}$ and remain there; (8.4.46)
(ii) not remain within any compact set but continually re-enter a compact set $\mathcal{S}$;
(iii) not remain within any compact set and not re-enter any of these sets more than a finite number of times.
In the first and second cases, $\gamma$ is said to be 'totally' and 'partially future imprisoned' in $\mathcal{S}$, respectively. In the third case, $\gamma$ can be viewed as going off to the edge of spacetime, that is, either to infinity or to a singularity. It is worth noting here that imprisonment can occur even when the causality condition is not violated. An example to that effect is given by Carter (see [16c] and [26]) via Fig. (8.12) below, where imprisonment is shown without causality violation. The manifold here is $\mathbb{R}^1 \times S^1 \times S^1$ and the (Lorentz) metric is given by:
$$ds^2 = (\cosh t - 1)^2 (dt^2 - dy^2) + dt\,dy - dz^2.$$
As is evident from Fig. (8.12), it is a space with imprisoned non-spacelike lines but no closed non-spacelike curves.
Fig. 8.12: The manifold $\mathbb{R}^1 \times S^1 \times S^1$ covered by coordinates $(t, y, z)$, where $(t, y, z)$ is identified with $(t, y, z + 1)$ as well as with $(t, y + 1, z + \alpha)$, $\alpha$ being an irrational number. (i) A $\{z = \text{const.}\}$ section showing the orientation of the null cones. (ii) The $\{t = 0\}$ section showing part of a null geodesic.
Given below is one of the results that negates imprisonment:
Proposition 8.4.24: If the strong causality condition holds on a compact set $\mathcal{S}$, then there can be no future-inextendible non-spacelike curve which is totally or partially future imprisoned in $\mathcal{S}$.
The proof of the above proposition is immediate, since $\mathcal{S}$ can be covered by a finite number of convex normal neighbourhoods $\mathcal{U}_i$ (with compact closure) such that no non-spacelike curve intersects any $\mathcal{U}_i$ more than once. This implies that any future-inextendible non-spacelike curve which intersects one of these neighbourhoods must leave it, not to re-enter. Recall that we introduced the Cauchy development and Cauchy surfaces earlier while discussing the conservation theorem and the exact solutions; we use them here to generalize the concept of Cauchy surfaces to global hyperbolicity.
4.2.5 Cauchy Developments
Definition 8.4.25: Given a closed set $\mathcal{S}$, a region $D^+(\mathcal{S})$ ($D^-(\mathcal{S})$) to the future (past) of $\mathcal{S}$ is called the future (past) Cauchy development or domain of dependence of $\mathcal{S}$ if the events in the region $D^+(\mathcal{S})$ ($D^-(\mathcal{S})$) can be determined from the knowledge of data on $\mathcal{S}$.
According to the above definition, $D^+(\mathcal{S})$ is the set of all points $p \in \mathcal{M}$ such that every past-inextendible non-spacelike curve through $p$ intersects $\mathcal{S}$. We shall, however, often use the Penrose ([30b], [30c]) definition of Cauchy development, where "non-spacelike curve" is replaced by "timelike curve." This set is denoted (for distinction) as $\tilde{D}^+(\mathcal{S})$.
Definition 8.4.26: The closed achronal set given by $\overline{D^+(\mathcal{S})} - I^-(D^+(\mathcal{S}))$ is known as the future Cauchy horizon of $\mathcal{S}$ and is denoted as $H^+(\mathcal{S})$. This future boundary of $D^+(\mathcal{S})$ limits the region that can be predicted from the knowledge of data on $\mathcal{S}$. Finally, as a culmination to our discussions on causal structure, we have:
Definition 8.4.27: A set $\mathcal{N}$ is said to be globally hyperbolic if the strong causality condition holds on it and, in addition, for any two points $p, q \in \mathcal{N}$, $J^+(p) \cap J^-(q)$ is compact and is contained in $\mathcal{N}$.
There are many results that can be established involving global hyperbolicity. We state just four of these due to their simplicity and usefulness.
Proposition 8.4.28: An open globally hyperbolic set is always causally simple.
A set $\mathcal{N}$ is said to be causally simple if for every compact set $\mathcal{K}$ contained in $\mathcal{N}$, $J^+(\mathcal{K}) \cap \mathcal{N}$ and $J^-(\mathcal{K}) \cap \mathcal{N}$ are closed in $\mathcal{N}$. In other words, if for every compact set $\mathcal{K}$ in the (open) set $\mathcal{N}$ the sets $J^+(\mathcal{K}) \cap \mathcal{N}$ and $J^-(\mathcal{K}) \cap \mathcal{N}$ are closed in $\mathcal{N}$, then $\mathcal{N}$ is causally simple.
Proposition 8.4.29: Let $\mathcal{S}$ be a closed achronal set such that $J^+(\mathcal{S}) \cap J^-(\mathcal{S})$ is strongly causal and, in addition, is either (i) acausal* or (ii) compact; then $D(\mathcal{S}) = D^+(\mathcal{S}) \cup D^-(\mathcal{S})$ is globally hyperbolic.
An easy corollary to the above proposition would be:
Corollary: If $\mathcal{S}$ is a Cauchy surface in the above proposition, then $\mathcal{M}$ must be globally hyperbolic.
Proposition 8.4.30: Let $p$ and $q$ lie in a globally hyperbolic set $\mathcal{N}$ with $q \in J^+(p)$. Then there always exists a non-spacelike geodesic from $p$ to $q$ whose length is greater than or equal to that of any other non-spacelike curve from $p$ to $q$.
Our last result deals with asymptotically flat spaces, i.e., spaces whose metrics approach the Minkowski-space metric at large distances from the system (see Chapter 34 in [26]). The Schwarzschild, Reissner-Nordström and Kerr solutions are examples of spaces that have asymptotically flat regions. Spaces of this kind are needed in investigating bounded physical systems such as stars.
* $J^+(\mathcal{S}) \cap J^-(\mathcal{S})$ would be acausal if and only if $\mathcal{S}$ is acausal.
Definition 8.4.31: A time- and space-orientable space $(\mathcal{M}, g)$ is said to be asymptotically simple if there exists a strongly causal space $(\tilde{\mathcal{M}}, \tilde{g})$ and an imbedding $\theta: \mathcal{M} \to \tilde{\mathcal{M}}$ which imbeds $\mathcal{M}$ as a manifold with smooth boundary $\partial\mathcal{M}$ in $\tilde{\mathcal{M}}$, such that:
(i) there is a smooth function $\Omega$ on $\tilde{\mathcal{M}}$ which is positive on $\theta(\mathcal{M})$ and satisfies $\Omega^2 g = \theta_*(\tilde{g})$ (i.e., $\tilde{g}$ is conformal to $g$ on $\theta(\mathcal{M})$);
(ii) on $\partial\mathcal{M}$, $\Omega = 0$ and $d\Omega \ne 0$;
(iii) every null geodesic in $\mathcal{M}$ has two endpoints on $\partial\mathcal{M}$.
A space $(\mathcal{M}, g)$ is asymptotically empty and simple if it satisfies the above three conditions and also the condition:
(iv) $R_{ab} = 0$ on an open neighbourhood of $\partial\mathcal{M}$ in $\overline{\mathcal{M}} = \mathcal{M} \cup \partial\mathcal{M}$.
Proposition 8.4.32: An asymptotically simple and empty space $(\mathcal{M}, g)$ is globally hyperbolic.
For proofs of these results, consult [16c] (see also [33] for projective structures and global hyperbolicity). We now move on to the next section, where we shall use these results while studying the phenomena of singularities and black holes.
Exercise 8.4
1. Show that when spacetime is spherically symmetric, the line element can be written as:
(a) $ds^2 = g_{00}\,dt^2 + g_{0r}\,dr\,dt + g_{rr}\,dr^2 + g_{\theta\theta}\,(d\theta^2 + \sin^2\theta\,d\psi^2)$
or alternatively in the canonical form:
(b) $ds^2 = -e^{\nu(r,t)}\,dt^2 + e^{\lambda(r,t)}\,dr^2 + r^2 (d\theta^2 + \sin^2\theta\,d\psi^2)$.
Show further that when $t$ and $r$ are constants, the above line element gives the familiar Gaussian curvature of a two-sphere of radius $\sqrt{g_{\theta\theta}}$ in $\mathbb{R}^3$.
2. Obtain Einstein's equations for the spherically symmetric spacetime using the metric in canonical form (b) of Exc. 1. Show further that "if this spacetime has a region which is asymptotically flat and empty, then the metric in this region is time independent and hence independent of the dynamical properties of its source" (Birkhoff's Theorem).
3. Use the coordinate transformations of (8.4.15)(a) to write the metric (8.4.15)(b).
4. Write the coordinate transformations that are required between the coordinates $(u, v, x, y, z)$, $(t, \chi, \theta, \phi)$ and $(t', r, \theta, \phi)$ for writing the metrics of the anti-de Sitter space as a hyperboloid in the forms (8.4.19) and (8.4.20).
5. Using the above two exercises, establish the Schwarzschild solution (8.4.26).
6. Establish the equations given in (8.4.24) and (8.4.25).
7. Use the Kerr-Schild coordinates $(x, y, z, \bar{t})$ to show that the singularity $r = 0$ in the case of Kerr solutions is a ring.
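For Exc. 4, a quick numerical sanity check is possible before doing any algebra: the two coordinate parametrizations worked out in the hints should both land on the hyperboloid $-u^2 - v^2 + x^2 + y^2 + z^2 = -1$. The sketch below evaluates that quadratic form at a few sample points; the function names and the sample coordinate values are illustrative assumptions, and a numerical check of this kind is of course not a substitute for the derivation asked for.

```python
import math

def embed_first_form(t, chi, theta, phi):
    """(u, v, x, y, z) for the coordinates (t, chi, theta, phi)."""
    ct, sh = math.cos(t), math.sinh(chi)
    return (math.sin(t),
            ct * math.cosh(chi),
            ct * sh * math.cos(theta),
            ct * sh * math.sin(theta) * math.cos(phi),
            ct * sh * math.sin(theta) * math.sin(phi))

def embed_second_form(t_prime, r, theta, phi):
    """(u, v, x, y, z) for the coordinates (t', r, theta, phi)."""
    ch, sh = math.cosh(r), math.sinh(r)
    return (ch * math.cos(t_prime),
            ch * math.sin(t_prime),
            sh * math.cos(theta),
            sh * math.sin(theta) * math.cos(phi),
            sh * math.sin(theta) * math.sin(phi))

def quadric(point):
    """Value of -u^2 - v^2 + x^2 + y^2 + z^2 at an embedded point."""
    u, v, x, y, z = point
    return -u * u - v * v + x * x + y * y + z * z

samples = [(0.3, 0.8, 1.1, 2.0), (-1.2, 1.5, 0.4, 5.0), (0.0, 0.0, 0.0, 0.0)]
for s in samples:
    print(s, quadric(embed_first_form(*s)), quadric(embed_second_form(*s)))
```

Every printed value should equal $-1$ up to rounding error.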
Hints to Exercise 8.4
1. Spherical symmetry in spacetime implies that for every rotation $R \in SO(3)$ (see Chapters 2 and 3), there is an isometry $\Psi(R)$ of spacetime. Recall that in 3-dimensional space $\mathbb{R}^3$ with coordinates $(x, y, z)$, $R$ maps $(x, y, z) \mapsto (Rx, Ry, Rz)$, and this gives three distinct one-parameter families of rotations whose Killing vectors are:
$$\xi_{(1)} = z\,\partial_y - y\,\partial_z, \qquad \xi_{(2)} = x\,\partial_z - z\,\partial_x, \qquad \xi_{(3)} = y\,\partial_x - x\,\partial_y. \tag{i}$$
In terms of polar coordinates $(r\sin\theta\cos\psi,\ r\sin\theta\sin\psi,\ r\cos\theta)$ they become:
$$\xi_{(1)} = \sin\psi\,\partial_\theta + \cot\theta\cos\psi\,\partial_\psi, \qquad \xi_{(2)} = -\cos\psi\,\partial_\theta + \cot\theta\sin\psi\,\partial_\psi, \qquad \xi_{(3)} = -\partial_\psi. \tag{ii}$$
In the case of spacetime, when the action of the rotation group results in spherical orbits, we use coordinates $(t, r, \theta, \psi)$, regarding $t$ as constant. Now due to isometry, the Lie derivative of the metric tensor $(g_{ij})$ $(i, j = 0, 1, 2, 3)^{37}$ with respect to the Killing vector $\xi_{(a)}$ $(a = 1, 2, 3)$ is zero, hence we have:
$$(\mathcal{L}_{\xi_{(a)}} g)_{ij} = \xi_{(a)}^k\,\partial_k g_{ij} + g_{ik}\,\partial_j \xi_{(a)}^k + g_{kj}\,\partial_i \xi_{(a)}^k = 0. \tag{iii}$$
Simplification of the above equation after using the values of $\xi_{(a)}$ given in (ii) shows that:
$$\partial_\psi g_{ij} = 0; \quad g_{r\theta} = g_{r\psi} = g_{\theta\psi} = g_{0\theta} = g_{0\psi} = 0; \quad \partial_\theta g_{rr} = \partial_\theta g_{\theta\theta} = \partial_\theta g_{0r} = \partial_\theta g_{00} = 0; \quad g_{\psi\psi} = g_{\theta\theta}\sin^2\theta. \tag{iv}$$
This means that the only non-zero components of $g_{ij}$ are $g_{00}$, $g_{rr}$, $g_{\theta\theta}$, $g_{\psi\psi}$, $g_{0r}$, and that, apart from the factor $\sin^2\theta$ in $g_{\psi\psi}$, they are all functions of $r$ and $t$ only; the line element can thus be written as given in (a). We further choose another set of coordinates $(t', r', \theta', \psi')$ such that $t' = t'(r, t)$, $r' = r'(r, t)$, $\theta' = \theta$, $\psi' = \psi$ and the term containing $dr'\,dt'$ is zero; with this choice we have:
$$ds^2 = g'_{00}\,dt'^2 + g'_{r'r'}\,dr'^2 + g'_{\theta'\theta'}\,(d\theta'^2 + \sin^2\theta'\,d\psi'^2) \tag{v}$$
where each $g'$ is a function of $(r', t')$. From the last equalities in (iv) and (ii) and from the fact that the $\xi_{(a)}$ are spacelike, it follows that $g'_{\theta\theta}(r', t') = g_{\theta\theta}(r, t) > 0$, hence there is no loss of generality if we write:
$$r' = \sqrt{g_{\theta\theta}(r, t)}. \tag{vi}$$
We now write $g'_{00} = -e^{\nu(r', t')}$ and $g'_{r'r'} = e^{\lambda(r', t')}$ and drop the primes; then (v) becomes:
$$ds^2 = -e^{\nu(r,t)}\,dt^2 + e^{\lambda(r,t)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\psi^2). \tag{vii}$$
The last statement of the exercise is evident from (vi) and (vii).
2. From the above Exc. we know that there are only two metric coefficients, namely $e^{\nu(r,t)}$ and $e^{\lambda(r,t)}$, that need to be determined from Einstein's Equations (8.1.17) to obtain the solution. (The solution is called the 'external Schwarzschild solution.') However, our objective here is simply to write the Einstein equations. We assume that the source of the spacetime is a bounded matter-energy distribution with energy-momentum tensor $T_{ij}$. This is spherically symmetric, hence:
37 The correspondence between subscripts 0, 1, 2, 3 and $t$, $r$, $\theta$ and $\psi$ is: $0 \to t$, $1 \to r$, $2 \to \theta$, $3 \to \psi$.
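$$(\mathcal{L}_{\xi_{(a)}} T)_{ij} = 0. \tag{i}$$
Using the arguments of Exc. 1, we obtain that the only non-vanishing components of $T_{ij}$ are $T_{00}$, $T_{rr}$, $T_{\theta\theta}$, $T_{\psi\psi}$ and $T_{0r}$, and they are all functions of $(r, t)$. We also note that the non-zero connection coefficients for the metric given in (vii) of Exc. 1 are:
(ii)
$\Gamma^0_{00} = \dfrac{\dot\nu}{2}$, $\Gamma^0_{0r} = \dfrac{\nu'}{2}$, $\Gamma^0_{rr} = \dfrac{\dot\lambda}{2}\,e^{\lambda - \nu}$,
$\Gamma^r_{00} = \dfrac{\nu'}{2}\,e^{\nu - \lambda}$, $\Gamma^r_{0r} = \dfrac{\dot\lambda}{2}$, $\Gamma^r_{rr} = \dfrac{\lambda'}{2}$, $\Gamma^r_{\theta\theta} = -r\,e^{-\lambda}$, $\Gamma^r_{\psi\psi} = -r\sin^2\theta\,e^{-\lambda}$,
$\Gamma^\theta_{r\theta} = \Gamma^\psi_{r\psi} = \dfrac{1}{r}$, $\Gamma^\psi_{\theta\psi} = \cot\theta$, $\Gamma^\theta_{\psi\psi} = -\sin\theta\cos\theta$,
where the prime and the dot stand for $\partial_r$ and $\partial_t$ respectively. The Einstein equations can now be written as:
(iii)
(a) $e^{-\lambda}\left(\dfrac{\nu'}{r} + \dfrac{1}{r^2}\right) - \dfrac{1}{r^2} = k\,T^r_r(r, t)$
(b) $e^{-\lambda}\left(\dfrac{1}{r^2} - \dfrac{\lambda'}{r}\right) - \dfrac{1}{r^2} = k\,T^0_0(r, t)$
(c) $e^{-\lambda}\,\dfrac{\dot\lambda}{r} = k\,T^r_0(r, t)$
(d) $\dfrac{1}{2}e^{-\lambda}\left(\nu'' + \dfrac{\nu'^2}{2} + \dfrac{\nu' - \lambda'}{r} - \dfrac{\nu'\lambda'}{2}\right) - \dfrac{1}{2}e^{-\nu}\left(\ddot\lambda + \dfrac{\dot\lambda^2}{2} - \dfrac{\dot\lambda\dot\nu}{2}\right) = k\,T^\theta_\theta(r, t) = k\,T^\psi_\psi(r, t)$
where $k = 8\pi G/c^4$ is Einstein's coupling constant, with $G$ being the Newtonian gravitational constant and $c$ the velocity of light. To establish the second part of the exercise, which in fact is the well-known Birkhoff's theorem for spherically symmetric, asymptotically flat empty spacetime (see [9] and Appendix B in [16c]), we require that the metric of (vii) in Exc. 1 becomes the Minkowskian metric (in polar coordinates) in some region of spacetime. For this we impose the condition:
$$\lim_{r \to \infty} \nu(r, t) = \lim_{r \to \infty} \lambda(r, t) = 0. \tag{iv}$$
Suppose now that $r_0$ is the smallest value of the radial coordinate beyond which $T_{ij}$ vanishes, i.e.,
$$T_{ij} = 0 \quad \text{for} \quad r > r_0. \tag{v}$$
(Note that such an $r_0$ exists since we have assumed that the matter is bounded.) From (c) of (iii) above, it follows that when $r > r_0$
$$\dot\lambda = 0. \tag{vi}$$
This means that outside the matter source $\lambda(r, t)|_{r > r_0} = \lambda(r)$ is independent of time. Subtracting (b) from (a) we have
$$\nu' + \lambda' = k\,r\,e^{\lambda}\,(T^r_r - T^0_0). \tag{vii}$$
Integrating the above equality over $r$ from $r > r_0$ to $\infty$ and using (iv), (v) and (vi) we obtain:
$$\nu(r, t) = -\lambda(r), \qquad r > r_0. \tag{viii}$$
This shows that outside the matter distribution ($T_{ij} = 0$), $\nu$ and $\lambda$ are time independent and thus the whole metric in this region is time-independent; hence it is independent of the dynamical properties of the source.
3. From (8.4.15)(a) we have Eq. (i) below, in which $A$, $B$, $C$, $D$ label the expressions for $dw$, $dx$, $dy$, $dz$:
(i)
$dv = \cosh(\alpha^{-1}t)\,dt$;
$A$: $dw = \sinh(\alpha^{-1}t)\cos\chi\,dt - \alpha\cosh(\alpha^{-1}t)\sin\chi\,d\chi$
$B$: $dx = \sinh(\alpha^{-1}t)\sin\chi\cos\theta\,dt + \alpha\cosh(\alpha^{-1}t)\cos\chi\cos\theta\,d\chi - \alpha\cosh(\alpha^{-1}t)\sin\chi\sin\theta\,d\theta$
$C$: $dy = \sinh(\alpha^{-1}t)\sin\chi\sin\theta\cos\phi\,dt + \alpha\cosh(\alpha^{-1}t)\cos\chi\sin\theta\cos\phi\,d\chi + \alpha\cosh(\alpha^{-1}t)\sin\chi\cos\theta\cos\phi\,d\theta - \alpha\cosh(\alpha^{-1}t)\sin\chi\sin\theta\sin\phi\,d\phi$
$D$: $dz = \sinh(\alpha^{-1}t)\sin\chi\sin\theta\sin\phi\,dt + \alpha\cosh(\alpha^{-1}t)\cos\chi\sin\theta\sin\phi\,d\chi + \alpha\cosh(\alpha^{-1}t)\sin\chi\cos\theta\sin\phi\,d\theta + \alpha\cosh(\alpha^{-1}t)\sin\chi\sin\theta\cos\phi\,d\phi$
Hence:
$$-dv^2 + dw^2 + dx^2 + dy^2 + dz^2 = -\cosh^2(\alpha^{-1}t)\,dt^2 + A^2 + B^2 + C^2 + D^2. \tag{ii}$$
We note that the terms with mixed differentials contained in the RHS of (ii) cancel out, thus we have to collect only the coefficients of $dt^2$, $d\chi^2$, $d\theta^2$, $d\phi^2$. This gives:
(iii) 1st RHS term (in $dt^2$) $= dt^2\,[-\cosh^2(\alpha^{-1}t) + \sinh^2(\alpha^{-1}t)\{\cos^2\chi + \sin^2\chi\,(\cos^2\theta + \sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi)\}] = -dt^2$
(iv) 2nd RHS term (in $d\chi^2$) $= d\chi^2\,[\alpha^2\cosh^2(\alpha^{-1}t)\{\sin^2\chi + \cos^2\chi\,(\cos^2\theta + \sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi)\}] = \alpha^2\cosh^2(\alpha^{-1}t)\,d\chi^2$,
and similarly for the two terms containing $d\theta^2$ and $d\phi^2$, whose coefficients are $\alpha^2\cosh^2(\alpha^{-1}t)\sin^2\chi$ and $\alpha^2\cosh^2(\alpha^{-1}t)\sin^2\chi\sin^2\theta$. Collecting the terms gives $ds^2 = -dt^2 + \alpha^2\cosh^2(\alpha^{-1}t)\{d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\}$.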
4. Write: $u = \sin t$, $v = \cos t\cosh\chi$, $x = \cos t\sinh\chi\cos\theta$, $y = \cos t\sinh\chi\sin\theta\cos\phi$, $z = \cos t\sinh\chi\sin\theta\sin\phi$. It can be easily checked that these coordinate transformations satisfy:
$$-u^2 - v^2 + x^2 + y^2 + z^2 = -\sin^2 t - \cos^2 t\,[\cosh^2\chi - \sinh^2\chi\,\{\cos^2\theta + \sin^2\theta\,(\cos^2\phi + \sin^2\phi)\}] = -(\sin^2 t + \cos^2 t) = -1.$$
The differentials of these coordinates are:
(i)
$du = \cos t\,dt$
$dv = -\sin t\cosh\chi\,dt + \cos t\sinh\chi\,d\chi$
$dx = -\sin t\sinh\chi\cos\theta\,dt + \cos t\cosh\chi\cos\theta\,d\chi - \cos t\sinh\chi\sin\theta\,d\theta$
$dy = -\sin t\sinh\chi\sin\theta\cos\phi\,dt + \cos t\cosh\chi\sin\theta\cos\phi\,d\chi + \cos t\sinh\chi\cos\theta\cos\phi\,d\theta - \cos t\sinh\chi\sin\theta\sin\phi\,d\phi$
$dz = -\sin t\sinh\chi\sin\theta\sin\phi\,dt + \cos t\cosh\chi\sin\theta\sin\phi\,d\chi + \cos t\sinh\chi\cos\theta\sin\phi\,d\theta + \cos t\sinh\chi\sin\theta\cos\phi\,d\phi$
We collect the coefficients of $(dt)^2$, etc., and note that the coefficients of product differentials cancel out; this gives:
(ii)
$-du^2 - dv^2 + dx^2 + dy^2 + dz^2$
$= dt^2\,[-\cos^2 t - \sin^2 t\cosh^2\chi + \sin^2 t\sinh^2\chi\cos^2\theta + \sin^2 t\sinh^2\chi\sin^2\theta\,(\cos^2\phi + \sin^2\phi)]$
$\quad + d\chi^2\,[-\cos^2 t\sinh^2\chi + \cos^2 t\cosh^2\chi\cos^2\theta + \cos^2 t\cosh^2\chi\sin^2\theta\cos^2\phi + \cos^2 t\cosh^2\chi\sin^2\theta\sin^2\phi]$
$\quad + d\theta^2\,[\cos^2 t\sinh^2\chi\sin^2\theta + \cos^2 t\sinh^2\chi\cos^2\theta\,(\cos^2\phi + \sin^2\phi)]$
$\quad + d\phi^2\,[\cos^2 t\sinh^2\chi\sin^2\theta\,(\cos^2\phi + \sin^2\phi)]$
$= dt^2\,\{-\cos^2 t - \sin^2 t\,(\cosh^2\chi - \sinh^2\chi)\} + \cos^2 t\,d\chi^2\,[-\sinh^2\chi + \cosh^2\chi\,\{\cos^2\theta + \sin^2\theta\,(\cos^2\phi + \sin^2\phi)\}]$
$\quad + \cos^2 t\,d\theta^2\,[\sinh^2\chi\,(\sin^2\theta + \cos^2\theta)] + \cos^2 t\sinh^2\chi\sin^2\theta\,d\phi^2$
$= -dt^2 + \cos^2 t\,[d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)]$.
For the second form, write:
(iii) $u = \cosh r\cos t'$, $v = \cosh r\sin t'$, $x = \sinh r\cos\theta$, $y = \sinh r\sin\theta\cos\phi$, $z = \sinh r\sin\theta\sin\phi$.
This gives:
(iv) $-u^2 - v^2 + x^2 + y^2 + z^2 = -\cosh^2 r\cos^2 t' - \cosh^2 r\sin^2 t' + \sinh^2 r\,\{\cos^2\theta + \sin^2\theta\,(\cos^2\phi + \sin^2\phi)\} = -\cosh^2 r + \sinh^2 r = -1.$
The differentials in this case are:
(v)
$du = -\sin t'\cosh r\,dt' + \sinh r\cos t'\,dr$
$dv = \cos t'\cosh r\,dt' + \sinh r\sin t'\,dr$
$dx = \cosh r\cos\theta\,dr - \sinh r\sin\theta\,d\theta$
$dy = \cosh r\sin\theta\cos\phi\,dr + \sinh r\cos\theta\cos\phi\,d\theta - \sinh r\sin\theta\sin\phi\,d\phi$
$dz = \cosh r\sin\theta\sin\phi\,dr + \sinh r\cos\theta\sin\phi\,d\theta + \sinh r\sin\theta\cos\phi\,d\phi$
Hence we have:
(vi)
$ds^2 = -du^2 - dv^2 + dx^2 + dy^2 + dz^2$
$= -\cosh^2 r\,(dt')^2 + dr^2\,[-\sinh^2 r + \cosh^2 r\,\{\cos^2\theta + \sin^2\theta\,(\cos^2\phi + \sin^2\phi)\}] + \sinh^2 r\,d\theta^2\,[\sin^2\theta + \cos^2\theta\,(\cos^2\phi + \sin^2\phi)] + \sinh^2 r\,d\phi^2\,[\sin^2\theta\,(\sin^2\phi + \cos^2\phi)]$
$= -\cosh^2 r\,(dt')^2 + dr^2 + \sinh^2 r\,(d\theta^2 + \sin^2\theta\,d\phi^2)$.
(Evidently the coefficients of product terms cancel out.) While the first coordinate system $(t, \chi, \theta, \phi)$ covers only part of the space and has singularities at $t = \pm\frac{\pi}{2}$, the coordinates $(t', r, \theta, \phi)$ cover the whole space.
5. We begin with the differential equation established in (vii) of Exc. 2:
$$\nu' = -\lambda' + k\,r\,e^{\lambda}\,(T^r_r - T^0_0) \tag{i}$$
and note that if the matter is distributed regularly down to $r = 0$, the center of symmetry, then we can integrate this equation from $0$ to $r$ and obtain:
$$\nu(r, t) + \lambda(r, t) = \nu(0, t) + \lambda(0, t) + k\int_0^r y\,e^{\lambda(y,t)}\,(T^r_r - T^0_0)\,dy. \tag{ii}$$
In view of the fact that $\nu(r, t) = -\lambda(r)$ for $r > r_0$, we note that the LHS of (ii) is zero at $r = r_0$, hence we have:
$$\nu(0, t) + \lambda(0, t) = -k\int_0^{r_0} y\,e^{\lambda(y,t)}\,(T^r_r - T^0_0)\,dy. \tag{iii}$$
(We assume here that boundary conditions at $r = 0$ are derived using the 'local regularity requirement'. This requirement is as follows: the loop (enclosing $r = 0$) defined by $r = \varepsilon$ ($\varepsilon$ being small), $t = \text{const.}$, $\theta = \frac{\pi}{2}$ and parametrized by $\psi$, having the metric length $2\pi\varepsilon$ that tends to zero as $\varepsilon \to 0$, is in a regular region of spacetime, and as such the Lorentz transformation given by parallel propagated basis vectors around the loop approaches the identity.) We further need that:
(iv) $\lambda(r) \to 0$ as $r \to \infty$.
(This follows from Eq. (iv) of Exc. 2.) Then from (ii) together with the regularity of $T$, we have:
(v) $\nu(r) \to 0$ as $r \to \infty$.
We can now solve Eq. (iii)(b) of Exc. 2 in terms of $\lambda$; since this equation can also be written as:
$$(e^{-\lambda})' + \frac{1}{r}\,(e^{-\lambda}) - \frac{1}{r} - k\,r\,T^0_0 = 0, \tag{vi}$$
using (iv) and (v), we have a general solution for $\lambda(r)$ at $r > r_0$:
$$e^{-\lambda(r)} = 1 + \frac{k}{r}\int_0^{r_0} y^2\,T^0_0\,dy, \qquad r > r_0. \tag{vii}$$
We put
$$M' = -\frac{4\pi}{c^2}\int_0^{r_0} T^0_0\,y^2\,dy \tag{viii}$$
and note that $M'$ is a constant since it does not depend on $r$ or on $t$. Writing the value of $k = \frac{8\pi G}{c^4}$, we have:
$$e^{-\lambda(r)} = 1 - \frac{2M'G}{c^2 r}. \tag{ix}$$
When we substitute this value of $e^{-\lambda(r)}$ in the line element (vii) of Exc. 1 (after taking into account that $\nu(r, t) = -\lambda(r)$ for $r > r_0$), we obtain:
$$ds^2 = -\left(1 - \frac{2M'G}{c^2 r}\right)dt^2 + \left(1 - \frac{2M'G}{c^2 r}\right)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\psi^2). \tag{x}$$
This is the Schwarzschild solution (8.4.26) in the region $r > r_0$, where $\frac{G}{c^2}M' = M$.
6. Use Exc. (2.7) to establish these equations.
7. The Kerr-Schild coordinates $(x, y, z, \bar{t})$ are given by the following relations:
$$x + iy = (r + ia)\sin\theta\,\exp\left(i\int\left(d\phi + a\Delta^{-1}\,dr\right)\right), \qquad z = r\cos\theta, \tag{i}$$
$$\bar{t} = \int\left(dt + (r^2 + a^2)\Delta^{-1}\,dr\right) - r \tag{ii}$$
(see Subsec. 4.1.5 for $\Delta$ and $a$). The Kerr metric in these coordinates takes the form:
$$ds^2 = dx^2 + dy^2 + dz^2 - d\bar{t}^2 + \frac{2mr^3}{r^4 + a^2 z^2}\left(\frac{r(x\,dx + y\,dy) - a(x\,dy - y\,dx)}{r^2 + a^2} + \frac{z\,dz}{r} + d\bar{t}\right)^2 \tag{iii}$$
The $r$ in (iii) is determined (up to a sign) in terms of $x$, $y$, $z$ by the equation:
$$r^4 - (x^2 + y^2 + z^2 - a^2)\,r^2 - a^2 z^2 = 0 \tag{iv}$$
The surfaces $r = \text{const.} \ne 0$ are confocal ellipsoids in coordinates $(x, y, z)$; for $r = 0$ these degenerate to the disc $x^2 + y^2 \le a^2$, $z = 0$, and the ring $x^2 + y^2 = a^2$, $z = 0$, which is the boundary of this disc, is the 'ring of singularity' we were looking for. We note that this is a real curvature singularity, as the scalar polynomial $R_{abcd}R^{abcd}$ diverges here; we further note that no scalar polynomial diverges on the disc except at the boundary.
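The geometry of the surface $r = 0$ can be explored numerically from the quartic $r^4 - (x^2 + y^2 + z^2 - a^2)\,r^2 - a^2 z^2 = 0$, which is a quadratic in $r^2$ with exactly one non-negative root. The sketch below is only an illustration: the function names and the sample values $m = a = 1$ are assumptions, and the quantity `mass_factor` is just the scalar coefficient $2mr^3/(r^4 + a^2z^2)$ multiplying the squared bracket in the Kerr-Schild form of the metric, not a curvature invariant.

```python
import math

def r_squared(x, y, z, a):
    """Non-negative root r^2 of r^4 - (x^2 + y^2 + z^2 - a^2) r^2 - a^2 z^2 = 0."""
    b = x * x + y * y + z * z - a * a
    return 0.5 * (b + math.sqrt(b * b + 4.0 * a * a * z * z))

def mass_factor(x, y, z, m, a):
    """Coefficient 2 m r^3 / (r^4 + a^2 z^2); not defined where r = 0 and z = 0."""
    r2 = r_squared(x, y, z, a)
    return 2.0 * m * r2 ** 1.5 / (r2 * r2 + a * a * z * z)

m, a = 1.0, 1.0  # illustrative (assumed) values

print("inside the disc :", r_squared(0.5, 0.0, 0.0, a))   # r = 0 on the disc
print("outside the disc:", r_squared(2.0, 0.0, 0.0, a))   # r^2 = x^2 - a^2 in the plane z = 0
print("on the axis     :", r_squared(0.0, 0.0, 1.5, a))   # r^2 = z^2 on the axis

# Approach the ring x^2 + y^2 = a^2, z = 0 from outside in the equatorial plane.
for eps in (1e-2, 1e-4, 1e-8):
    x = math.sqrt(a * a + eps)
    print(eps, mass_factor(x, 0.0, 0.0, m, a))
```

In the equatorial plane outside the disc the coefficient reduces to $2m/r$, so the printed values grow without bound as the ring is approached, in line with the ring being the location of the singularity.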
5 THE BASICS OF SPACETIME SINGULARITIES AND BLACK HOLES
In this section we introduce the reader to two great triumphs of Einstein's theory of general relativity— singularities and black holes. Our treatment here is sketchy as it is meant more for a beginner in the area. Some of the excellent sources for detailed study and research work on the subject are: [4c], [16c,d,e], [26], [30a], [40]. We begin with the layman's version of these two terms and then go on to the scientific meaning and the role they acquired in physics. The words "singular" and "hole" in common vocabulary stand respectively for "different from the ordinary" and "the part of an object it can be seen through" (a hole in the roof or in a table-top). A mathematician, on the other hand, would define the first word as: that which is "not non-singular" is "singular," e.g., a singular matrix or a singular solution, and the second as: a "geometric discontinuity." Obviously it is very often difficult to ensure the extraordinariness (singular character) of an object, and is sometimes impossible to peep through a hole (for lack of light). When these problems are dealt within the realm of macrophysics, they are referred to as "singularities" and "black holes."
5.1 Singularities and Completeness in Spacetime
We recall that a singularity is a point in spacetime at which the spacetime curvature becomes infinite. We also know that there are models of spacetime, e.g., Robertson-Walker, Schwarzschild, Reissner-Nordström and Kerr, where a singularity occurs. Since these models of spacetime are based on group-theoretic assumptions and choice of coordinates, a natural question that one asks is: does there exist a singularity in the universe which is not a peculiarity of a model but is the result of the breakdown of physical laws? A question of this nature is closely related to the theory of the "big bang" and the "expanding universe" [19a,b]. Based on the fact that galaxies are moving away from each other as well as from our solar system, and thus the universe is expanding, the big bang theory predicts that "the universe had a beginning" and that, if the history of time were to be written, that could be the point of origin. Models such as that of Friedmann (see Subsec. 4.1.4) supported this idea, although in the absence of a rigorous proof, the idea was not fully accepted. It was Penrose who used the theory of general relativity to give a mathematical proof of the existence of a singularity related to the big bang theory. We describe in brief the concept of singularity introduced by Penrose and later on expanded by Hawking (see [30a] and [16b] for original papers). To begin with, we define the word 'b-completeness' (short for bundle-completeness), which is required in formulating the existence of singularity in Einstein's universe. The b-completeness is a generalization of our familiar geodesic completeness (of Riemannian manifolds), as we shall see below. Consider a manifold $\mathcal{M}$ with positive definite metric $g$ and let $\rho(x, y)$ be the distance function between two points $x, y \in \mathcal{M}$.
Recall that $\rho(x, y)$ is the greatest lower bound of the length of curves from $x$ to $y$ and works as a metric in the topological sense, for it provides a basis $\{\mathcal{B}(x, r)\}$ for the open sets of $\mathcal{M}$. The basis $\{\mathcal{B}(x, r)\}$ is formed by using all points $y \in \mathcal{M}$ for which $\rho(x, y) < r$ (see Chap. 0 for the definition of the metric). The pair $(\mathcal{M}, g)$ is said to be metrically complete (m-complete) if every Cauchy sequence with respect to $\rho$ converges to a point in $\mathcal{M}$. Alternatively, we say that $(\mathcal{M}, g)$ is m-complete if every $C^1$-curve of finite length has an end point (as given in Sec. 4). Hence m-completeness implies geodesic completeness (g-completeness) (see [22] for proof); this means that every geodesic can be extended to arbitrary values of its affine parameter. When $g$ stands for the Lorentz metric, there is no notion of m-completeness, and g-completeness also has to be subdivided into three distinct categories depending on the nature of the curve (timelike, spacelike or null); thus we cannot say here that $(\mathcal{M}, g)$ is g-complete on the basis of the geodesic completeness of one of these.
430
Mathematical Perspectives on Theoretical Physics
If $\mathcal{M}$ is timelike geodesically incomplete, there could be freely moving observers or particles whose histories do not exist after (or before) a finite interval of proper time; a feature such as this is more objectionable than infinite curvature in $\mathcal{M}$. A similar argument applies to null geodesically incomplete spacetimes, since null geodesics are the histories of zero rest-mass particles (see the Appendix). On the other hand, spacelike curves are not the carriers of any particles or observers, hence spacelike geodesic incompleteness is not of much significance. With these facts in view, Penrose adopted the following definition:

Definition 8.5.1 The minimum conditions for a spacetime to be singularity-free are that it be both timelike and null geodesically complete.

According to this definition, timelike or null incompleteness implies the occurrence of a singularity in spacetime; some of Penrose's and Penrose-Hawking's earlier results were based on this premise. The main disadvantage of this premise is that it gives information on the occurrence of a singularity but provides no clue about its shape, size or location. It was here that b-completeness was found useful. Consider a $C^1$-curve $\lambda(t)$ through $p \in \mathcal{M}$ with tangent vector $V = (\partial/\partial t)_{\lambda(t)}$, which is expressible as $V = V^i(t) E_i$ in terms of a basis $\{E_i\}$ parallelly propagated along $\lambda(t)$. The parameter $u$ defined as:

$$u = \int_p \Big( \sum_i V^i V^i \Big)^{1/2} dt \tag{8.5.1}$$

is called the generalized affine parameter on $\lambda$. The length of a curve $\lambda$ is finite in the parameter $u$ if and only if it is finite in any other parameter $u'$ that results from another choice of basis $\{E_{i'}\}$. If $\lambda$ is a geodesic curve, then $u$ is an affine parameter on $\lambda$.

Definition 8.5.2 The pair $(\mathcal{M}, g)$ is called b-complete if there is an endpoint for every $C^1$-curve of finite length as measured by a generalized affine parameter.
Note that if the length of $\lambda$ is finite in terms of one parameter $u$, it is so in terms of all other parameters $u'$ (assuming that the bases $\{E_i\}$ and $\{E_{i'}\}$ leading to $u$ and $u'$ are related via a non-singular matrix). This observation suggests that the bases $\{E_i\}$, $\{E_{i'}\}$ can be taken to be orthonormal without any loss of generality. It further suggests that if $g$ is positive definite, the generalized affine parameter defined via an orthonormal basis is arc-length, hence b-completeness in this particular case coincides with m-completeness. We remark:

Remark 8.5.3 On a pair $(\mathcal{M}, g)$, b-completeness can be defined even if the metric is not positive definite. As long as there is a connection on $\mathcal{M}$, b-completeness can always be defined. b-completeness implies g-completeness, but not the other way around.

In view of the above remark, we define a spacetime to be singularity-free if it is b-complete. We say that a b-incomplete curve corresponds to a scalar polynomial curvature singularity ('s.p. curvature singularity') if any of the scalar polynomials in $g_{ab}$, $\eta_{abcd}$, $R_{abcd}$ is unbounded on the incomplete curve. Similarly, it corresponds to a curvature singularity with respect to a parallelly propagated basis ('p.p. curvature singularity') if any of the components of the curvature tensor in such a basis is unbounded on the incomplete curve. Evidently an s.p. curvature singularity implies a p.p. curvature singularity.

We next show the bundle-theoretic origin of b-completeness. We said earlier that the notion of b-completeness was introduced to study the structure of singularities (shape, size and location), but the main difficulty was that the manifold $\mathcal{M}$ of a spacetime $(\mathcal{M}, g)$ was assumed to be without any singular point. In order to make room for singular points, a sort of boundary $\partial$ had to be attached to $\mathcal{M}$, giving rise to another manifold $\mathcal{M}^+ = \mathcal{M} \cup \partial$. The boundary $\partial$ had to be uniquely determined by measurements at non-singular points of $(\mathcal{M}, g)$. Hawking and Geroch (see [16a]
Gravitation, Relativity and Black Holes 431
and [13a]) in separate papers suggested a construction method for $\partial$ by defining singular points as equivalence classes of incomplete geodesics. Their method was improved upon by Schmidt's construction given below (see [16c], [36]). Schmidt used the theory of bundles to obtain $\mathcal{M}^+$. Let $O(\mathcal{M})$ be the set of all orthonormal frames $\{E_a\}$, where $E_a \in T_p$ for each $p$ in $\mathcal{M}$ and $a = 1, 2, 3, 4$. Consider a positive definite metric $e$ defined on the bundle $\pi : O(\mathcal{M}) \to \mathcal{M}$ ($\pi$ maps a basis $\{E_a\}$ at a point $p$ to the point $p$). It can be shown that $O(\mathcal{M})$, viewed as a manifold, is m-incomplete in the metric $e$ if and only if $\mathcal{M}$ is b-incomplete.38 When $O(\mathcal{M})$ is m-incomplete, one can obtain the metric space completion $\overline{O(\mathcal{M})}$ of $O(\mathcal{M})$ by Cauchy sequences. The projection $\pi$ can be extended to $\overline{O(\mathcal{M})}$, and the quotient of $\overline{O(\mathcal{M})}$ by $\pi$ is defined to be $\mathcal{M}^+$, which is the union of $\mathcal{M}$ and some additional points denoted by $\partial$. The set $\partial$ consists of the singular points of $\mathcal{M}$, since it is the set of endpoints for every b-incomplete curve in $\mathcal{M}$.

We now state some of the results that deal with singularities in spacetime and are based on the concepts of completeness and incompleteness introduced above. These results are referred to as "singularity theorems" in the literature.
Result 8.5.4 (Theorem 1, Penrose [30a]) A spacetime $(\mathcal{M}, g)$ cannot be null geodesically complete if: (i) $R_{ab} K^a K^b \geq 0$ for all null vectors $K^a$ (see Sec. 3); (ii) there is a non-compact Cauchy surface $\mathcal{H}$ in $\mathcal{M}$; (iii) there is a closed trapped surface $\mathcal{T}$ in $\mathcal{M}$.
Result 8.5.5 (Theorem 2, Hawking and Penrose [16b]) A spacetime $(\mathcal{M}, g)$ is not timelike and null geodesically complete if: (i) $R_{ab} K^a K^b \geq 0$ for every non-spacelike vector $K$ (see Sec. 3); (ii) the generic condition is satisfied, i.e., every non-spacelike geodesic contains a point at which K[a Rb] c
Result 8.5.7 (Theorem 3, Hawking [16c]) If (i) $R_{ab} K^a K^b \geq 0$ for every non-spacelike vector $K$; (ii) the strong causality condition holds on $(\mathcal{M}, g)$;

38. This is obviously equivalent to the result: $(O(\mathcal{M}), e)$ is m-complete if and only if $(\mathcal{M}, g)$ is b-complete.
(iii) there is some past-directed unit timelike vector $W$ at a point $p$ and a positive constant $b$ such that if $V$ is the unit tangent vector to the past-directed timelike geodesics through $p$, then on each such geodesic the expansion $\theta \equiv V^a{}_{;a}$ of these geodesics becomes less than $-3c/b$ within a distance $b/c$ from $p$, where $c \equiv -W^a V_a$; then there is a past incomplete non-spacelike geodesic through $p$.

Result 8.5.8 (Theorem 4, Hawking [16c]) A spacetime is not timelike geodesically complete if: (i) $R_{ab} K^a K^b \geq 0$ for every non-spacelike vector $K$; (ii) there exists a compact spacelike 3-surface $S$ (without edge); (iii) the unit normals to $S$ are everywhere converging (or everywhere diverging) on $S$.

As mentioned earlier, we focus our attention only on the significance of these theorems and omit their proofs. The interested reader is advised to consult [16c], [16e] and the original papers cited there.

Remark 8.5.9 The first of these theorems was formulated (by Penrose) to prove the occurrence of singularities in a star which has collapsed inside its Schwarzschild radius. As is evident from the hypotheses, the theorem is not based on assumptions of symmetry (spherical or axial); instead it uses a more general criterion, namely the existence of a closed trapped surface. The conclusion of the theorem is that in a collapsing star one of two things occurs: a singularity, or a Cauchy horizon. It is worth noting here that long before Penrose's work on collapsing stars, an equally fundamental result on the contraction and expansion of a star was established by S. Chandrasekhar (see [4a], [4c] and Box 24.1 in [26]). He showed that when a star contracted,39 the matter particles came very close to each other and they had different velocities. According to Pauli's exclusion principle, the particles moved away from (repelled) each other and thus made the star re-expand.
This implied that there was a point in a star's (life) history where the star maintained itself at a constant radius by a balance between the gravitational attraction and the repulsion caused by the exclusion principle. Chandrasekhar calculated that this balance was possible as long as the star's mass did not exceed (approximately) 3/2 times the mass of our sun, $\tfrac{3}{2} M_\odot$ (this is now called the Chandrasekhar limit). If the star's mass was less than this limit, the star stopped contracting to become a "white dwarf" with a (small) radius of a few thousand miles and a density of hundreds of tons per cubic inch. On the other hand, if the mass exceeded the limit, the exclusion principle could not halt the collapse, and what happened to such a star was not known. In other words, the theory of general relativity provided no answer for it. It was J. R. Oppenheimer (the atom bomb physicist) who showed that as the star contracts, the gravitational field at the surface gets stronger and stronger, and the light cones get bent inward more and more (see Box 24.1 in [26]). As a result it becomes difficult for light to escape from the star. The light appears dimmer and redder to an observer at a distance. Eventually, when the star has shrunk to a certain critical radius, the gravitational field at the surface becomes so strong that light can no longer escape. It is this phase of a collapsing star which yields what is called a 'black hole.'

Remark 8.5.10 In view of Definition (8.5.1), the failure of the non-spacelike geodesic completeness condition in Thm. 2 can be interpreted to mean that any spacetime which satisfies (i)-(iv) possesses a singularity. Whether the singularity is indeed of the "infinite curvature" type cannot be inferred from it. More precisely, the theorem implies that 'some' causal (non-spacelike) geodesic "enters a singularity" (i.e., is compelled to be geodesically incomplete) before any "repeated focusing" has time to take place.

39. When a star has lost a sufficient amount of its nuclear fuel, it begins to cool off and thus begins to contract.
Remark 8.5.11 Theorems 2 and 3 are considered the most useful theorems on singularities, since the conditions laid down in them are satisfied in a number of physical cases. It must be noted, however, that the occurrence predicted by their hypotheses may not be a singularity; it may just be a closed timelike curve violating the causality condition. An outcome of this nature is physically more objectionable than the occurrence of a singularity. A valid question that follows from this is: would causality violations prevent the occurrence of singularities? Theorem 4 shows that causality violations in general cannot prevent the occurrence of singularities, and as such singularities have to be taken seriously.

Our next remark deals with the role played by the metric and/or the curvature in the occurrence of singularities. The remark is based on the character of the singularities predicted by Theorem 4. According to this theorem, geodesic incompleteness in spacetime is the consequence of unbounded curvature or irregularities in the metric.

Remark 8.5.12 If one extends spacetime so as to try to continue the incomplete geodesics, either the metric fails to be Lorentzian or the curvature becomes locally unbounded, giving rise to a curvature singularity. We note, however, that in the latter case, even though the curvature may be locally unbounded, the metric could still be interpreted as a distributional solution of the Einstein equations if the volume integrals of the curvature components over any compact region are finite.

Remark 8.5.13 It is impossible to determine the manifold structure at points of singularity by physical measurements. In fact there are manifold structures which agree on non-singular regions but differ at the singular points. A case in point is the manifold at the $t = 0$ singularity in Robertson-Walker solutions (see Subsec. (4.1.2)). This could be described by the coordinates $\{t,\ r\cos\theta,\ r\sin\theta\cos\phi,\ r\sin\theta\sin\phi\}$ or by $\{t,\ Sr\cos\theta,\ Sr\sin\theta\cos\phi,\ Sr\sin\theta\sin\phi\}$. In the first case the singularity is a 3-surface; in the second case it is a single point.
5.2 Black Holes
We describe here in brief the formation of black holes in a universe that obeys the principles of Einstein's theory of general relativity. As mentioned above, black holes result from collapsing stars. When such a star is a static and spherically symmetric body, the solution of Einstein's equations in the region outside it is the one that follows from the Schwarzschild model, for it is this model which represents the spherically symmetric empty spacetime outside a massive spherically symmetric body (see Excs. 4.1, 4.2 and 4.5). We shall see below that the size (radius) of the collapsed star, and hence the ensuing singularity and the black hole, depends (among other things) on the 'Schwarzschild radius'40 $2m$ given in the metric:

$$ds^2 = -\left(1 - \frac{2m}{r}\right) dt^2 + \left(1 - \frac{2m}{r}\right)^{-1} dr^2 + r^2 \left(d\theta^2 + \sin^2\theta \, d\phi^2\right)$$
Suppose that $r_0$ is a fixed radial distance that corresponds to the surface of the star; then for $r > r_0$ the solution of Einstein's equations is indeed the Schwarzschild solution (S.S.) for asymptotically flat regions. We note that when the star is static, the radius $r_0$ must be greater than $2m$, since the surface of the star corresponds to the orbit of a timelike Killing vector, which exists (in the S.S.) only where the radial distance $r > 2m$.

40. Evidently the Schwarzschild radius (S.R.) differs from body to body; for instance, it is 1.0 cm for the earth and 3.0 km for the sun. The ratios of the S.R. to the radius of the earth and the sun are $7 \times 10^{-10}$ and $2 \times 10^{-6}$ respectively.
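The figures quoted in the footnote can be checked with a short numerical sketch. It assumes rounded SI values for $G$, $c$ and for the masses and radii of the earth and the sun (none of these numbers come from the text), and restores the factors of $G$ and $c$ so that $m = GM/c^2$. The names `geometric_mass`, `schwarzschild_radius`, `compactness` and `BODIES` are illustrative only. With these inputs the quoted ratios appear to agree with $m/R$ rather than $2m/R$.

```python
# Rounded SI constants; assumed reference values, not taken from the text.
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1

# Hypothetical lookup table for this sketch: (mass in kg, mean radius in m).
BODIES = {
    "earth": (5.972e24, 6.371e6),
    "sun": (1.989e30, 6.957e8),
}


def geometric_mass(mass_kg):
    """Return m = G M / c^2 in metres (the 'm' of the Schwarzschild metric)."""
    return G * mass_kg / C**2


def schwarzschild_radius(mass_kg):
    """Return the Schwarzschild radius 2m in metres."""
    return 2.0 * geometric_mass(mass_kg)


def compactness(mass_kg, radius_m):
    """Return the dimensionless ratio m / R for a body of radius R."""
    return geometric_mass(mass_kg) / radius_m


if __name__ == "__main__":
    for name, (mass_kg, radius_m) in BODIES.items():
        print(name,
              "2m [m] =", schwarzschild_radius(mass_kg),
              "m/R =", compactness(mass_kg, radius_m))
```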
If $r_0$ were less than $2m$, the surface of the star would be expanding or contracting; the latter would come into play only after the nuclear fuel of the star is exhausted, when the star begins to cool and the pressure is reduced. Again, since the solution outside the star is the S.S., there will be a closed trapped surface $\mathcal{T}$ around the star. Hence, in view of Theorem 2, a singularity will occur provided there is no causality violation and the appropriate energy conditions continue to hold. We further note that even when the star is not exactly spherically symmetric41, a closed trapped surface around the star would still occur provided the departure from spherical symmetry is not too great. This would not be due to the S.S. now, but would follow from the development of a (partial) Cauchy surface. One of the key questions here is: how large can the rotation be without preventing the occurrence of a trapped surface? This question is answered by the Kerr solution, which can be thought of as representing the exterior solution for a body with mass $m$ and angular momentum $L = am$. If $a < m$, there are closed trapped surfaces, but if $a > m$, there are none. In other words, if the angular momentum of the star is greater than its squared mass ($L > m^2$), then the contraction of the star would halt before a closed trapped surface developed. While reaching this conclusion, one must remember, however, that during the period of collapse the star will lose angular momentum, and hence the notion that angular momentum could prevent closed trapped surfaces, and thus the occurrence of a singularity, has its drawbacks.

In the following remarks we analyze the mass and the density of a star qualitatively to reach conclusions about its collapse. In the process we define the stars that are called "white dwarfs" and "neutron stars."

Remark 8.5.14 A star is equipped with a (frozen) magnetic field $B$, which increases as $\rho^{2/3}$ ($\rho$ = matter density of the star) during the star's collapse, which is assumed to be nearly spherical. Thus the magnetic pressure at any time is proportional to $\rho^{4/3}$. This rate of increase is so slow that unless the magnetic pressure was of relevance initially, it would have no significant influence on the collapse of the star.

Remark 8.5.15 A star that is (completely) burnt out cannot support itself against gravity if its mass exceeds the limit of $1.5 M_\odot$. In order to establish this limit, we note that in hot matter there is pressure produced by the thermal motions of the atoms and by the radiation (that results from hot matter); in cold matter, however, at densities lower than that of nuclear matter ($\approx 10^{14}$ g cm$^{-3}$), the only significant pressure arises from the quantum mechanical exclusion principle, as explained below.

Remark 8.5.16 Consider fermions of mass $m$ with number density $n$. By the exclusion principle, each of these fermions will occupy a volume of $n^{-1}$. In view of the uncertainty principle, it will have a spatial component of momentum of order $\hbar n^{1/3}$, where $\hbar$ denotes Planck's constant. The velocity, and consequently the pressure due to these fermions, is determined by the rule (8.5.2) given below:
Non-relativistic fermions ($\hbar n^{1/3} < m$):
    velocity of the order $\hbar n^{1/3} m^{-1}$;
    pressure (= (momentum) x (velocity) x (number density)) of the order $\hbar^2 n^{5/3} m^{-1}$.

Relativistic fermions ($\hbar n^{1/3} > m$):
    velocity of the order 1 (the speed of light);
    pressure of the order $\hbar n^{4/3}$.

41. A star departs from spherical symmetry if it is rotating or if it has a magnetic field.
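The rule can be turned into a small order-of-magnitude calculator. The sketch below works in SI units, so the speed of light $c$ is restored: the crossover between the two regimes sits at $\hbar n^{1/3} = mc$, the non-relativistic pressure is $\hbar^2 n^{5/3}/m$ and the relativistic one is $\hbar c\, n^{4/3}$. All numerical factors of order one are dropped, as in the text. The function names and the rounded constants are assumptions made for illustration.

```python
# Rounded SI constants; assumed reference values, not taken from the text.
HBAR = 1.0546e-34   # reduced Planck constant, J s
C = 2.998e8         # speed of light, m s^-1
M_E = 9.109e-31     # electron rest mass, kg
M_N = 1.675e-27     # nucleon rest mass, kg


def crossover_density(mass):
    """Number density (m^-3) at which hbar * n^(1/3) equals m c."""
    return (mass * C / HBAR) ** 3


def degeneracy_pressure(n, mass):
    """Order-of-magnitude degeneracy pressure (Pa) for number density n.

    pressure ~ (momentum) x (velocity) x (number density), with
    momentum ~ hbar n^(1/3) and velocity ~ momentum / mass below the
    crossover and ~ c above it.
    """
    momentum = HBAR * n ** (1.0 / 3.0)
    if momentum < mass * C:
        velocity = momentum / mass      # non-relativistic branch
    else:
        velocity = C                    # relativistic branch
    return momentum * velocity * n


if __name__ == "__main__":
    print("electron crossover density [m^-3]:", crossover_density(M_E))
    for n in (1e30, 1e40):
        print(n, degeneracy_pressure(n, M_E), degeneracy_pressure(n, M_N))
```

At a fixed non-relativistic density the electron pressure exceeds the nucleon pressure by the mass ratio $m_n/m_e$, while on the relativistic branch the mass drops out altogether.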
Remark 8.5.17 From the above rule it is also evident that as long as the matter is non-relativistic, the major contribution to the degeneracy pressure comes from the electrons, since $m^{-1}$ is bigger for them than it is for the baryons (see Table S). At high densities, however, when the particles become relativistic, the pressure is independent of their mass; it depends only on their number density.

Remark 8.5.18 A cold body (burnt-out star) may be so small that self-gravity can be neglected. In this case the degeneracy pressure is balanced by attractive electrostatic forces between nearest neighbouring particles arranged in a lattice. Assuming that there are equal numbers of positive and negative charges and (approximately) equal numbers of electrons and baryons, these forces will produce a negative pressure of order $e^2 n^{4/3}$. Thus the mass density of a small cold body will be of the order

$$e^6 m_e^3 m_n \hbar^{-6} \quad (\approx 1 \ \mathrm{g\,cm^{-3}}) \tag{8.5.2}$$
Here $m_e$ ($m_n$) is the electron (nucleon) rest-mass.

Remark 8.5.19 When the cold body is large enough for self-gravity to be important, the situation is very different. Gravity compresses the matter against the degeneracy pressure. Using a Newtonian order-of-magnitude argument, we note that for a star of mass $M$ and radius $r_0$ the gravitational force per unit volume is of the order $(M/r_0^2)\, n m_n$, where $n m_n \approx M/r_0^3$ is the mass density. The gravitational force is balanced by a pressure gradient of the order $P/r_0$, $P$ being the average pressure in the star. Thus the pressure $P$ can be expressed as:42

$$P = \frac{M^2}{r_0^4} \approx M^{2/3} n^{4/3} m_n^{4/3} \tag{8.5.3}$$
When the density is sufficiently low, in view of Remark (8.5.17), the main contribution to the pressure is from the degeneracy of non-relativistic electrons; hence, using the rule (8.5.2), we have:

$$P = \hbar^2 n^{5/3} m_e^{-1} \tag{8.5.4}$$

Equating this value of $P$ with that of (8.5.3) we obtain $M^{2/3} n^{4/3} m_n^{4/3} = \hbar^2 n^{5/3} m_e^{-1}$, which gives the value of the number density $n$ as:

$$n = M^2 m_n^4 m_e^3 \hbar^{-6} \tag{8.5.5}$$
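As a rough check of this balance, the sketch below evaluates both pressures in SI units, restoring Newton's constant $G$, so that the gravitational estimate reads $G M^{2/3} (n m_n)^{4/3}$ and the equilibrium density becomes $n = G^3 M^2 m_n^4 m_e^3 \hbar^{-6}$. The constants are rounded reference values and the function names are illustrative assumptions; factors of order one are ignored, so only the order of magnitude is meaningful.

```python
# Rounded SI constants; assumed reference values, not taken from the text.
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34   # reduced Planck constant, J s
M_E = 9.109e-31     # electron rest mass, kg
M_N = 1.675e-27     # nucleon rest mass, kg
M_SUN = 1.989e30    # solar mass, kg


def gravitational_pressure(mass, n):
    """Pressure needed to support a star of given mass: G M^(2/3) (n m_n)^(4/3)."""
    return G * mass ** (2.0 / 3.0) * (n * M_N) ** (4.0 / 3.0)


def electron_pressure(n):
    """Non-relativistic electron degeneracy pressure: hbar^2 n^(5/3) / m_e."""
    return HBAR**2 * n ** (5.0 / 3.0) / M_E


def equilibrium_number_density(mass):
    """Density at which the two pressures balance: G^3 M^2 m_n^4 m_e^3 / hbar^6."""
    return G**3 * mass**2 * M_N**4 * M_E**3 / HBAR**6


if __name__ == "__main__":
    n = equilibrium_number_density(M_SUN)
    print("number density [m^-3]:", n)
    print("mass density [kg m^-3]:", n * M_N)
    print("pressures [Pa]:", gravitational_pressure(M_SUN, n), electron_pressure(n))
```

For one solar mass the resulting mass density comes out in the range usually quoted for white dwarfs, which is consistent with the qualitative picture in the text.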
The above value of $n$ is based on the assumption that the self-gravity of the star comes into play. This is valid as long as this $n$ is greater than the value of $n$ given by (8.5.2), where self-gravity has no influence on the star because it is too small; this $n$ must also be $< m_e^3 \hbar^{-3}$ for Eq. (8.5.5) to be correct. In terms of pressures, the relationship between small and large stars can be stated as:

$$e^2 n^{4/3} < M^{2/3} n^{4/3} m_n^{4/3}$$

42. The approximate value of $P$ follows from the fact that $M = \tfrac{4}{3}\pi r_0^3\, n m_n$, so that

$$\frac{M^2}{r_0^4} = M^2 \left(\frac{4\pi n m_n}{3M}\right)^{4/3} = \left(\frac{4\pi}{3}\right)^{4/3} M^{2/3} n^{4/3} m_n^{4/3}.$$
or equivalently as:

$$e^3 m_n^{-2} < M \tag{8.5.6}$$
On the other hand, since

$$n < m_e^3 \hbar^{-3} \tag{8.5.7)(a}$$

rests on $\hbar n^{1/3} < m_e$, from the rule (8.5.2) it follows that the fermions are not relativistic, so their pressure does not vary as $\hbar n^{4/3}$, and this prevents continuing gravitational collapse. In view of (8.5.5), the inequality (8.5.7)(a) implies $M^2 m_n^4 m_e^3 \hbar^{-6} < m_e^3 \hbar^{-3}$, or

$$M < \hbar^{3/2} m_n^{-2} \tag{8.5.7)(b}$$
Putting both these inequalities together we have:

$$e^3 m_n^{-2} < M < \hbar^{3/2} m_n^{-2} \tag{8.5.8}$$
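To get a feeling for the size of these two limits, one can restore $G$ and $c$: the upper limit becomes $(\hbar c/G)^{3/2} m_n^{-2}$, and, writing $e^2 = \alpha \hbar c$ with $\alpha$ the fine-structure constant, the lower limit is $\alpha^{3/2}$ times the upper one. The following sketch evaluates both in kilograms. The constants are rounded reference values and the function names are assumptions made for illustration; order-one factors are again ignored.

```python
# Rounded SI constants; assumed reference values, not taken from the text.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m s^-1
HBAR = 1.0546e-34    # reduced Planck constant, J s
ALPHA = 7.2974e-3    # fine-structure constant, e^2 / (hbar c) in Gaussian units
M_N = 1.675e-27      # nucleon rest mass, kg
M_SUN = 1.989e30     # solar mass, kg


def upper_mass_limit():
    """hbar^(3/2) m_n^(-2) with G and c restored: (hbar c / G)^(3/2) / m_n^2, in kg."""
    return (HBAR * C / G) ** 1.5 / M_N**2


def lower_mass_limit():
    """e^3 m_n^(-2) with G restored: (alpha hbar c / G)^(3/2) / m_n^2, in kg."""
    return ALPHA**1.5 * upper_mass_limit()


if __name__ == "__main__":
    print("lower limit [kg]:", lower_mass_limit())
    print("upper limit [kg]:", upper_mass_limit())
    print("upper limit [solar masses]:", upper_mass_limit() / M_SUN)
```

With these inputs the upper limit is of the order of a solar mass and the lower limit is of the order of a giant planet's mass, in line with the order-of-magnitude reasoning above.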
A cold star whose mass $M$ lies between these limits is called a white dwarf. A natural question is: what would happen if the pressure varied as $\hbar n^{4/3}$? The answer is that the star would still have an equilibrium, but it would be an unstable one. It is this instability which is responsible for collapse. This might be a white dwarf collapsing towards a 'neutron star,' or a neutron star collapsing towards a 'black hole' (if the pressure is due to the neutrons) (see Remark (8.5.21)).

Remark 8.5.20 If the density is so high that the electrons are relativistic, i.e., $n > m_e^3 \hbar^{-3}$, then the pressure $P$ from the relativistic formula in the rule (8.5.2) is:

$$P = \hbar n^{4/3}$$

This, when equated with $P$ in (8.5.3), gives $\hbar n^{4/3} = M^{2/3} n^{4/3} m_n^{4/3}$, showing that a star of this nature has the mass:

$$M = M_L \approx \hbar^{3/2} m_n^{-2} \approx 1.5 M_\odot \tag{8.5.9}$$
This star can have any density greater than $m_e^3 m_n \hbar^{-3}$, i.e., any radius less than $\hbar^{3/2} m_e^{-1} m_n^{-1}$. We note that stars of mass greater than $M_L$ cannot be supported by the degeneracy pressure of electrons alone, as will be evident from our next remark.

Remark 8.5.21 When the electrons become relativistic, they tend to induce inverse beta decay with the protons and thus produce neutrons (see Table S for the notation used here): $e^- + p \to \nu_e + n$. This lessens their number and thus reduces the degeneracy pressure due to them, causing the star to contract and making the electrons still more relativistic. The star remains in an unstable situation until nearly all the electrons and protons have been converted into neutrons. When this stage is reached, the star can again be in stable equilibrium with the support of the degeneracy pressure caused by neutrons. The star in this case is called a neutron star. If the neutrons are non-relativistic, from (8.5.5) (with $m_n$ in place of $m_e$) it follows that the number density $n$ is now:

$$n = M^2 m_n^7 \hbar^{-6}$$
If, on the other hand, the neutrons are relativistic, the star must again have a mass $M_L$ and a radius $< \hbar^{3/2} m_n^{-2}$. But $M_L / (\hbar^{3/2} m_n^{-2}) \approx 1$, and so such a star is near the General Relativity limit $M_L/R \sim 2$. In conclusion, we note (see Remark (8.5.15)) that a cold star of mass greater than $M_L$ cannot be supported by degeneracy pressure, whether it comes from electrons or from neutrons. The above limits on the mass can be shown computationally by using the Newtonian equation43

$$\frac{dp}{dr} = -\rho M(r)\, r^{-2} \tag{8.5.10}$$

where

$$M(r) = 4\pi \int_0^r \rho r^2 \, dr$$
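Multiply (8.5.10) by $r^4$ and integrate the LHS by parts from $0$ to $r_0$; since $p = 0$ at $r = r_0$ and $\rho r^2 = (4\pi)^{-1}\, dM/dr$, we obtain:

$$4 \int_0^{r_0} p r^3 \, dr = \frac{M^2(r_0)}{8\pi} \tag{8.5.11}$$

On the other hand, since $dp/dr$ is never positive, (8.5.12)

Also $p$ is never greater than $\hbar n^{4/3}$ (see the rule (8.5.2) on p. 435), hence,

$$\int_0^{r_0} p r^3 \, dr < \hbar \left( \int_0^{r_0} n r^2 \, dr \right)^{4/3} = \hbar \left( M(r_0) \right)^{4/3} (4\pi m_n)^{-4/3} \tag{8.5.13}$$

From (8.5.11) we thus have, after simplification,

$$M(r_0) < (8\hbar)^{3/2} (4\pi)^{-1/2} m_n^{-2} < 8 \hbar^{3/2} m_n^{-2} \tag{8.5.14}$$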
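The integral identity obtained from (8.5.10) by multiplying by $r^4$ and integrating by parts, namely $4\int_0^{r_0} p r^3\,dr = M^2(r_0)/8\pi$, can be checked numerically on the simplest example. The sketch below assumes a Newtonian star of uniform density $\rho$ in units with $G = 1$, for which $M(r) = \tfrac{4}{3}\pi\rho r^3$ and $p(r) = \tfrac{2}{3}\pi\rho^2 (r_0^2 - r^2)$ solve (8.5.10) with $p(r_0) = 0$. The uniform-density model and the helper names are illustrative assumptions, not part of the argument in the text. The last function evaluates the numerical coefficient that appears in (8.5.14).

```python
import math


def simpson(f, a, b, steps=1000):
    """Composite Simpson rule for the integral of f over [a, b]."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even count
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def uniform_star(rho, r0):
    """Return (mass, pressure) functions for a uniform-density star, G = 1."""
    def mass(r):
        return 4.0 * math.pi * rho * r**3 / 3.0

    def pressure(r):
        # Solves dp/dr = -rho * M(r) / r^2 with p(r0) = 0.
        return 2.0 * math.pi * rho**2 * (r0**2 - r**2) / 3.0

    return mass, pressure


def identity_sides(rho, r0):
    """Return both sides of 4 * int p r^3 dr = M(r0)^2 / (8 pi)."""
    mass, pressure = uniform_star(rho, r0)
    lhs = 4.0 * simpson(lambda r: pressure(r) * r**3, 0.0, r0)
    rhs = mass(r0) ** 2 / (8.0 * math.pi)
    return lhs, rhs


def mass_bound_coefficient():
    """Numerical value of 8^(3/2) (4 pi)^(-1/2), the coefficient in (8.5.14)."""
    return 8.0**1.5 / math.sqrt(4.0 * math.pi)


if __name__ == "__main__":
    print(identity_sides(rho=1.0, r0=1.0))
    print(mass_bound_coefficient())
```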
To obtain the limits on the mass of a cold star, we have thus far used the Newtonian theory; we shall now see the effects of the theory of General Relativity on such masses. When the body is static, spherically symmetric and composed of a perfect fluid, the Einstein field equations can be reduced to (see Exc. (2.7)):

43. This equation between $dp/dr$ and the varying $M(r)$ is called the support equation; note that the pressure varies with respect to the radius $r$. Following the usual practice in the literature, we have used $p$ instead of $P$ to denote the pressure in subsequent equations.
$$\frac{dp}{dr} = -\frac{(\mu + p)\left(M(r) + 4\pi r^3 p\right)}{r\left(r - 2M(r)\right)} \tag{8.5.15}$$
where the radial coordinate is such that the area of the 2-surface ($r$ = const., $t$ = const.) is $4\pi r^2$. As in the Newtonian case, the function $M(r)$ represents the mass defined by the integral:

$$M(r) = \int_0^r 4\pi r^2 \mu \, dr \tag{8.5.16}$$
where $\mu = \rho(1 + \epsilon)$ is the total energy density, $\rho = n m_n$ ($n$ times the mass of a nucleon) and $\epsilon$ is the relativistic increase of mass associated with the momentum of the fermions (see Remark (8.5.16)). The value $M(r_0)$ equals the Schwarzschild mass of the exterior Schwarzschild solution for $r > r_0$. For a bounded star, $M(r_0)$ will be less than the conserved mass:

$$\tilde{M} = \int_0^{r_0} \frac{4\pi \rho r^2 \, dr}{\left(1 - 2M/r\right)^{1/2}} = N m_n \tag{8.5.17}$$

where $N$ is the total number of nucleons in the star. The difference $(\tilde{M} - M)$ represents the amount of energy (binding energy) radiated to infinity since the formation of the star from dispersed matter initially at rest.

Remark 8.5.22
It has been shown (Bondi [3]) that:
(l-^]Nj
(8.5.18)(a)
provided $\mu$ and $p$ are positive and $\mu$ decreases outward; similarly,
(l-f)N{
(8.5.18)(b)
provided $p < \mu$. Therefore $M < \tilde{M} < 3M$; in other words, the difference $\tilde{M} - M$ can never exceed $2M$. In reality the difference is never more than a few per cent.

Remark 8.5.23 When we compare (8.5.15) with (8.5.10), with $\mu$ in place of $\rho$, we note that all the extra terms on the RHS of (8.5.15) are negative as long as both $\epsilon$ and $p$ are $\geq 0$. Thus, just as a cold star of mass $M > M_L$ cannot support itself in the Newtonian theory, a cold star of Schwarzschild mass $M > M_L$ cannot support itself in the General Relativity theory. This means that a cold star which contains more than $3 M_L/m_n$ nucleons cannot support itself (see Remark (8.5.17)).

In conclusion, it is fair to say that some of the bodies of mass $> M_L$ will eventually collapse within their Schwarzschild radius and will thus give rise to a closed trapped surface. Since there are at least $10^9$ stars with masses greater than $M_L$ in our galaxy, there would be a large number of situations where the predictions of Thm. 2 on the existence of singularities will hold good.
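The comparison made in Remark 8.5.23 can be illustrated numerically. The sketch below evaluates the right-hand sides of the Newtonian support equation (8.5.10) and of the relativistic equation (8.5.15) at a single radius, in units with $G = c = 1$, using $\mu$ in place of $\rho$ in the Newtonian expression. The sample values and the function names are arbitrary assumptions chosen only to show that the relativistic pressure gradient is the steeper of the two whenever $p \geq 0$ and $r > 2M$, and that the two agree in the weak-field, pressureless limit.

```python
import math


def newtonian_gradient(density, mass, r):
    """RHS of dp/dr = -rho M(r) / r^2, in units G = 1."""
    return -density * mass / r**2


def relativistic_gradient(mu, p, mass, r):
    """RHS of dp/dr = -(mu + p)(M + 4 pi r^3 p) / (r (r - 2M)), G = c = 1."""
    if r <= 2.0 * mass:
        raise ValueError("static support requires r > 2M")
    return -(mu + p) * (mass + 4.0 * math.pi * r**3 * p) / (r * (r - 2.0 * mass))


if __name__ == "__main__":
    # Arbitrary sample point inside a hypothetical star (illustrative only).
    rho, eps, p, mass, r = 1.0e-3, 0.1, 1.0e-4, 1.0, 10.0
    mu = rho * (1.0 + eps)
    print("Newtonian   :", newtonian_gradient(mu, mass, r))
    print("Relativistic:", relativistic_gradient(mu, p, mass, r))
```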
Next we see how and when a collapsing star can be said to turn into a black hole. If the collapse is exactly spherical, the solution outside the body is the S.S. An observer $O$ at a large distance from the star is able to see an observer $O'$ on the surface of the star while $O'$ is still outside $r = 2m$, but is not able to see $O'$ once it passes $r = 2m$. With the passage of time, the light $O$ receives from $O'$ will have a greater and greater shift of frequency to the red and also a greater and greater decrease of intensity. The surface of the star never actually disappears from $O$'s sight, but it becomes extremely faint and so is practically invisible. The time scale for this to happen is of the order of the time for light to travel a distance $2m$. One is now left with an invisible object, but this object has the same Schwarzschild mass and still produces the same gravitational field as it did before it collapsed. The only way one can detect its presence is by its gravitational effects on nearby objects, or by the deflection of light passing near it. Since in a spherically symmetric collapse the singularity occurs within the region $r < 2m$, from which no light can escape to infinity, this singularity (predicted by Theorem 2) cannot be seen by an observer who is outside $r = 2m$. The surface $r = 2m$ is called the event horizon of the collapsing star (see Fig. (8.13)), and the matter and energy which cross the event horizon are lost forever, making the star into a black hole.

When the collapsing star is not exactly spherically symmetric, the theory of Cauchy surfaces is used to obtain solutions outside the collapsing star and, as mentioned earlier, the trapped surfaces and the event horizon pertaining to this star are seen to follow using arguments similar to those of the spherically symmetric case. Two key questions that one asks in both these cases are: (1) Can the future be predicted far away from a collapsing star? (2) Once the energy (of the collapsed star) has been radiated to infinity in the form of gravitational waves, does the solution outside the horizon approach a stationary state?
[Figure: (i) Finkelstein diagram and (ii) Penrose diagram of the collapse, showing the singularity at $r = 0$, the event horizon, the Schwarzschild vacuum solution, the origin of coordinates, and $O$'s past light cone at 1 o'clock. An observer $O$ who never falls inside the collapsing fluid sphere never sees beyond a certain time (say, 1 o'clock) in the history of another observer $O'$ on the surface of the collapsing fluid sphere.]
To answer the first question, Hawking gave a mathematical meaning to future predictability by defining it as follows:

Definition 8.5.24 Let $(\mathcal{M}, g)$ have a region which is asymptotically flat (see Sec. 4); then there is a space $(\tilde{\mathcal{M}}, \tilde{g})$ into which $(\mathcal{M}, g)$ can be conformally imbedded as a manifold $\bar{\mathcal{M}} = \mathcal{M} \cup \partial\mathcal{M}$, where $\partial\mathcal{M}$, the boundary of $\mathcal{M}$ in $\tilde{\mathcal{M}}$, consists of two null surfaces $\mathscr{I}^+$ and $\mathscr{I}^-$ that represent future and past null infinity. Suppose that $S$ is a partial Cauchy surface in $\mathcal{M}$. The space $(\mathcal{M}, g)$ is said to be (future) asymptotically predictable from $S$ if $\mathscr{I}^+$ is contained in the closure $\overline{D^+(S)}$ in the conformal manifold $\tilde{\mathcal{M}}$.

Some of the spaces which are future asymptotically predictable from some surface $S$ are: the Minkowski space, the Schwarzschild solution for $m > 0$, the Kerr solution for $m > 0$ with $|a| < m$, and the Reissner-Nordström solution for $m > 0$ with $|e| < m$. On the other hand, the Kerr solution with $|a| > m$ and the Reissner-Nordström solution with $|e| > m$ are not future asymptotically predictable, since for any partial Cauchy surface $S$ there exist past-inextendible non-spacelike curves from $\mathscr{I}^+$ that do not intersect $S$ but approach a singularity.

Remark 8.5.25 Future asymptotic predictability is regarded as the condition that there be no singularities to the future of $S$ which are "naked," i.e., visible from $\mathscr{I}^+$. In a spherical collapse, one obtains a space which is future asymptotically predictable. When the collapse is not exactly spherical (but the departure from spherical symmetry is sufficiently small), one has the following result:

Proposition 8.5.26 If (i) $(\mathcal{M}, g)$ is future asymptotically predictable from a partial Cauchy surface $S$, and (ii) $R_{ab} K^a K^b \geq 0$ for all null vectors $K^a$, then a closed trapped surface $\mathcal{T}$ in $D^+(S)$ cannot intersect $J^-(\mathscr{I}^+, \bar{\mathcal{M}})$, i.e., cannot be seen from $\mathscr{I}^+$.

The concept of predictability can be extended by making another definition. We shall see that this concept provides a prescription for two or more black holes to unite and form another black hole.

Definition 8.5.27
A spacetime is strongly future asymptotically predictable from a partial Cauchy surface $S$ if $\mathscr{I}^+$ is contained in the closure $\overline{D^+(S)}$ of $D^+(S)$ in $\tilde{\mathcal{M}}$, and $J^+(S) \cap \bar{J}^-(\mathscr{I}^+, \bar{\mathcal{M}})$ is contained in $D^+(S)$.

The definition can be interpreted to mean that a neighbourhood of the event horizon can also be predicted from $S$. If $(\mathcal{M}, g)$ is strongly future asymptotically predictable from a partial Cauchy surface $S$, then a homeomorphism $\theta : (0, \infty) \times S \to D^+(S) - S$ can be defined such that for each $\tau \in (0, \infty)$, $S(\tau) \equiv \theta(\{\tau\} \times S)$ is a partial Cauchy surface which intersects $\mathscr{I}^+$. In fact $\{S(\tau)\}$ represents a family of spacelike surfaces homeomorphic to $S$ which cover $D^+(S) - S$ and intersect $\mathscr{I}^+$. On the surface $S(\tau)$, a black hole is defined as a connected component of the set $B(\tau) \equiv S(\tau) - J^-(\mathscr{I}^+, \bar{\mathcal{M}})$. Thus it is a region of $S(\tau)$ from which particles or photons cannot escape to $\mathscr{I}^+$. The above definition and the construction of a family of partial Cauchy surfaces $\{S(\tau)\}$ with the properties given in the footnote suggest that as $\tau$ increases, black holes can merge together, and new black holes can form as a result of further collapsing bodies. We note that the reverse process does not occur, i.e., black holes can merge together but never bifurcate. *
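* The surfaces $S(\tau)$ in the family $\{S(\tau)\}$ have the following properties: (i) for $\tau_2 > \tau_1$, $S(\tau_2) \subset I^+(S(\tau_1))$; (ii) for each $\tau$, the edge of $S(\tau)$ in the conformal manifold $\bar{\mathcal{M}}$ is a spacelike 2-sphere $S_2(\tau)$ in $\mathscr{I}^+$ such that for $\tau_2 > \tau_1$, $S_2(\tau_2)$ is strictly to the future of $S_2(\tau_1)$; (iii) for each $\tau$, $S(\tau) \cup \{\mathscr{I}^+ \cap J^-(S_2(\tau), \bar{\mathcal{M}})\}$ is a Cauchy surface in $\bar{\mathcal{M}}$ for $D(S)$.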
Gravitation, Relativity and Black Holes 441
To answer the second question posed above, we set the conditions for a 'stationary regular predictable space.'
Definition 8.5.28 A spacetime $(\mathcal{M}, g)$ is a stationary regular predictable space if it has the following properties:
(i) It is a regular predictable space developing from a partial Cauchy surface.
(ii) There exists an isometry group $\theta_t : \mathcal{M} \to \mathcal{M}$ whose Killing vector $K^a$ is timelike near $\mathscr{I}^+$ and $\mathscr{I}^-$.
(iii) It is either empty or contains fields, like the electromagnetic field or scalar fields, that obey (well-behaved) hyperbolic equations and satisfy the dominant energy condition: $T_{ab} N^a L^b \geq 0$ for future-directed timelike vectors $N$, $L$.
In view of the above definition, Question (2) is answered with 'yes', as it can be expected that for large values of $\tau$, the region $J^-(\mathscr{I}^+, \bar{\mathcal{M}}) \cap J^+(S(\tau))$ of a regular predictable space containing collapsing stars would be almost isometric to a similar region of a stationary regular predictable space. One is also interested in knowing if these regular predictable spacetimes are static. This is the case if, for example, the final state of the solution outside the event horizon is static, for then the metric in the exterior region will be that of a Schwarzschild solution. On the other hand, in an empty stationary regular predictable space which is not static, the Killing vector $K^a$ is spacelike in part of the exterior region $J^+(\mathscr{I}^-, \bar{\mathcal{M}}) \cap J^-(\mathscr{I}^+, \bar{\mathcal{M}})$. The region of $J^+(\mathscr{I}^-, \bar{\mathcal{M}}) \cap J^-(\mathscr{I}^+, \bar{\mathcal{M}})$ on which $K^a$ is spacelike is called the ergosphere. Naturally, if the solution is static, there is no ergosphere. An example of a stationary non-static regular predictable space with an ergosphere is the Kerr solution for $a^2 < m^2$. Many rigorous results on the formation and structure of black holes have been proved by physicists on either side of the Atlantic using mostly the theory of General Relativity. To reach the present state of the art, various simplifications of line elements were proposed (see, for instance, [41] and a survey article by Miller and Sciama in [7]), and many diverse approaches were suggested.
Amongst these approaches, according to Bekenstein (in [27]), Gerlach's work [12] deserves a special mention, since this was the sort of analysis that would have been more convincing to Einstein, given his reservations about quantum theory. Continuous experimental activity with improved technology has led to confirmation of most of the predictions of this black hole theory. For a historical account of black holes, the reader is referred to the articles by Israel and Blandford in [16d], and for theorems on black holes to Hawking's selectively collected papers [16e]. Lately there has been immense research activity that has established a link between black holes and string theories. Two of these papers are listed here ([45], [46]). (See also subsections (B.9) and (B.14) of Chapter 11, along with the additional references there, on these ideas.) In conclusion, we reiterate that we have only introduced the reader briefly to this vast realm of knowledge, via references in the literature which we feel should be easily accessible. Due to our limited scope, we have not been able to cover many important aspects of the theory, e.g., quantum gravity (see Hawking and W. Israel in [16e]), gravitational waves (see ii in [24]), general relativity via complex structures [10] and twistors [30].
Exercise 8.5 1. Establish the equality (8.5.11) and the inequality (8.5.12).
442 Mathematical Perspectives on Theoretical Physics
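Hints to Exercise 8.5
1. Consider the Newtonian equation of support:

(i) $\dfrac{dp}{dr} = -\rho\, M(r)\, r^{-2}$

where $M(r) = 4\pi \int_0^r \rho\, r^2\, dr$ represents the mass of the body up to the radius $r$. Multiply (i) by $r^4$ and integrate by parts between the limits $0$ and $r_0$. The LHS equals:

(ii) $\displaystyle \int_0^{r_0} \frac{dp}{dr}\, r^4\, dr = \Big[\, p\, r^4 \Big]_0^{r_0} - \int_0^{r_0} 4 r^3 p\, dr.$

The first term vanishes at both limits, since $p = 0$ at $r_0$. Similarly the RHS of (i) equals:

(iii) $\displaystyle -\int_0^{r_0} \underbrace{M(r)}_{\text{1st fn.}}\ \underbrace{\rho\, r^2}_{\text{2nd fn.}}\, dr.$

Integration by parts gives:

(iv) $\displaystyle -\,(\text{1st fn.}) \Big( \int \text{2nd fn.}\, dr \Big) \Big|_0^{r_0} + \int_0^{r_0} \Big( \text{diff. (1st fn.)} \int \text{2nd fn.}\, dr \Big)\, dr.$

Since by definition

(v) $\displaystyle M(r) = 4\pi \int_0^r \rho\, r^2\, dr,$

after equating (ii) and (iv) and simplifying, we have the equality (8.5.11). To establish the inequality (8.5.12), we consider the derivative:

(vi) $\displaystyle \frac{d}{dr} \Big( \int_0^r p\, r'^3\, dr' \Big)^{3/4} = \frac{3}{4}\, p\, r^3 \Big( \int_0^r p\, r'^3\, dr' \Big)^{-1/4} \leq \frac{3}{2\sqrt{2}}\, p^{3/4}\, r^2,$

since $dp/dr$ is never positive, so that $\int_0^r p\, r'^3\, dr' \geq \tfrac{1}{4}\, p(r)\, r^4$. Thus

$\displaystyle \Big( \int_0^{r_0} p\, r^3\, dr \Big)^{3/4} \leq \frac{3}{2\sqrt{2}} \int_0^{r_0} p^{3/4}\, r^2\, dr.$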
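The integration by parts in the hint can be checked numerically on a simple model. The sketch below is only an illustration under stated assumptions: it takes a uniform-density body (the values of `rho` and `r0` are arbitrary choices, not from the text), uses units with the gravitational constant equal to one as in equation (i), and verifies that multiplying the support equation by $r^4$ and integrating leads to $4\int_0^{r_0} p\,r^3\,dr = M^2/8\pi$ with $M = M(r_0)$. It also checks the bound $\int_0^r p\,r'^3\,dr' \geq p(r)\,r^4/4$ that follows from $dp/dr \leq 0$.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# Assumed toy model (not from the text): uniform density rho inside radius r0.
rho, r0 = 1.3, 2.0


def mass(r):
    """M(r) = 4*pi * integral of rho * r^2 from 0 to r, for constant rho."""
    return 4.0 * math.pi * rho * r ** 3 / 3.0


def pressure(r):
    """Solution of dp/dr = -rho * M(r) / r^2 with p(r0) = 0."""
    return (2.0 * math.pi / 3.0) * rho ** 2 * (r0 ** 2 - r ** 2)


# Left side: 4 * integral of p r^3; right side: M^2 / (8 pi).
lhs = 4.0 * simpson(lambda r: pressure(r) * r ** 3, 0.0, r0)
rhs = mass(r0) ** 2 / (8.0 * math.pi)

# Gap in the monotonicity bound at a few sample radii (should be >= 0).
bound_gaps = [
    simpson(lambda s: pressure(s) * s ** 3, 0.0, r) - pressure(r) * r ** 4 / 4.0
    for r in (0.5, 1.0, 1.5, 2.0)
]

print(lhs, rhs)
print(bound_gaps)
```

The uniform-density profile is used only because its pressure is known in closed form; any profile with $dp/dr \leq 0$ and $p(r_0) = 0$ could be substituted.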
APPENDIX 8A
A.1
Spatially Homogeneous
The spacetime is called spatially homogeneous (SH) if there is a group of isometries which acts freely on $\mathcal{M}$, and whose surfaces of transitivity are spacelike three-surfaces. On these SH-surfaces, any one point on a surface is equivalent to any other point on it.
A.2
Geodesically Complete
Consider the differential equation:

$\dfrac{d^2 x^a}{dv^2} + \Gamma^a_{bc}\, \dfrac{dx^b}{dv}\, \dfrac{dx^c}{dv} = 0$ (8A.1)

with a $C^r$-connection ($r \geq 0$) $\nabla$ on $\mathcal{M}$. It is known that for any point $p \in \mathcal{M}$ and any vector $X_p$ at $p$, there exists a maximal geodesic $\lambda_X(v)$ such that $\lambda_X(0) = p$ and $(\partial/\partial v)_\lambda|_{v=0} = X_p$. For $r \geq 1$, this geodesic is unique and depends differentiably on $p$ and $X_p$; thus a $C^r$-map $\exp: T_p \to \mathcal{M}$ can be defined, where for each $X \in T_p$, $\exp(X)$ is the point in $\mathcal{M}$ a unit parameter distance along the geodesic $\lambda_X$ from $p$. This map is sometimes not defined for all $X \in T_p$, as the geodesic $\lambda_X(v)$ may not be defined for all $v$. If it is defined for all $v$, the geodesic $\lambda(v)$ is said to be a complete geodesic. The manifold $\mathcal{M}$ is said to be geodesically complete if all geodesics on $\mathcal{M}$ are complete, i.e., if $\exp: T_p \to \mathcal{M}$ is defined on all of $T_p$ for every $p$ in $\mathcal{M}$.
A.3
Normal Coordinates
In a convex normal neighbourhood $\mathcal{N}$, one can choose any point $q$ and a basis $\{E_a\}$ of $T_q$, and assign coordinates $(x^1, x^2, \ldots, x^n)$ to any point $r$ by the relation $r = \exp(x^a E_a)$ (i.e., using the coordinates of the point $\exp^{-1}(r)$ in $T_q$ with respect to the basis $\{E_a\}$). These are called normal coordinates based on $q$. In these coordinates:

$(\partial/\partial x^a)|_q = E_a; \qquad \Gamma^a_{(bc)}|_q = 0$ (8A.2)
A.4
Open or Closed Universe
Consider the spatially homogeneous, isotropic Robertson-Walker spacetime model with line element (see (8.4.23)):

$ds^2 = S^2(t)\, d\sigma^2 - dt^2$

where $d\sigma^2$ is the metric of the "standard" (complete, simply connected) 3-space of constant curvature $k$. The models are called spatially open, flat or closed according as $k < 0$, $k = 0$ or $k > 0$. In the Newtonian models one can always obtain these values by choosing $R_0 = R(0)$ suitably; however, this is not so in the relativistic models.

* Solutions of equation (8A.1) are geodesics with affine parameter $v$. If $|g_{ab}\, (dx^a/dv)(dx^b/dv)| = 1$, then $v$ is an arc length parameter.
A.5
Cavendish Constant $G_c$
This constant was obtained by H. Cavendish in 1798 (see Sec. 40.8 in [26], and the original paper in Phil. Trans. R. Soc. London Part II (1798)) by carrying out experiments to determine the density of the Earth. The apparatus for the experiment was made of two separated spheres suspended by fine wires. Newton's gravitational law using this constant reads:

$\text{Force} = -\,G_c\, \dfrac{m_1 m_2}{r^2}$

The constant is equal to one in general relativity, but in other metric theories it varies from event to event in spacetime. For instance, in the Dicke-Brans-Jordan theory it is determined by the distribution of matter in the universe. As a result, the expansion of the universe changes its value, thus:

$\dfrac{1}{G_c} \dfrac{dG_c}{dt} \sim -\left( \dfrac{0.1 \text{ to } 1}{\text{age of universe}} \right) \sim \dfrac{-1}{10^{10} \text{ or } 10^{11} \text{ years}}$

In some theories, the result of a Cavendish experiment depends on the chemical composition and internal structure of the test bodies. The most accurate tests of this type were done by Kreuzer in 1968 (see Sec. 40.8 in [26] and the original paper in Phys. Rev. 169 (1968)).
A.6
Closed Trapped Surface
Consider a sphere $S$ that surrounds a massive body of high density. At some initial instant, let $S$ emit a flash of light; then at a later time $t$, the ingoing and outgoing wave fronts from $S$ will form spheres $S_1$ and $S_2$ respectively. Normally the area of $S_1$ ($S_2$) is smaller (greater) than that of $S$, since it represents ingoing (outgoing) light. However, if a large amount of matter is enclosed within $S$, then the areas of both $S_1$ and $S_2$ will be smaller. The surface $S$ is then said to be a closed trapped surface. We shall denote it as $S_T$ in the text. See Fig. (8.14) below. A particular example of such a surface is as follows. Consider an orientable compact spacelike two-surface $S$ in $D^+(S)$ such that the expansion $\theta$ (see Subsec. 8.3.3) of the outgoing null geodesics orthogonal to it is non-positive; then $S$ is an outer trapped surface. We encounter these trapped surfaces when a star collapses. Thus if $S(\tau)$ is a Cauchy surface at time $\tau$ during this collapse, then the trapped region $T(\tau)$ in the surface $S(\tau)$ is the set of all points $q \in S(\tau)$ such that there is an outer trapped surface, say $T$, lying in $S(\tau)$, through $q$. The existence of the trapped region $T(\tau)$ implies the existence of a black hole.
Fig. 8.14 The envelopes (spheres) $S_1$ and $S_2$ formed by the ingoing and outgoing wavefronts due to the emission of flashes of light from $S$. (The light from a point $p$ forms a small sphere around $p$, and these small spheres form the two envelopes.)
A.7
Particle Horizon
Consider a family of particles in the de Sitter space whose histories are timelike geodesics (these geodesics originate at the past spacelike infinity $\mathscr{I}^-$ and end at the future spacelike infinity $\mathscr{I}^+$). Suppose $p$ is an event on the world-line of a particle $O$ (observer $O$) in this family; the past null cone of $p$ is the set of events in spacetime which can be observed by $O$ at that time. All those particles whose world-lines intersect this null cone are visible to $O$, whereas those particles whose world-lines do not intersect it are not visible to $O$. The division of particles seen and not seen by $O$ at $p$ is called the particle horizon for the observer $O$ at $p$. Thus the particle horizon represents the history of those particles lying at the limits of $O$'s vision. (See Fig. (8.15).)

Fig. 8.15 The particle horizon defined by a congruence of geodesic curves originating from past spacelike infinity $\mathscr{I}^-$. (Labels in the figure: $O$'s world-line; particle has been observed by $O$ at $p$; particle horizon; particles not yet observed by $O$ at $p$; past null cone of $O$ at $p$.)
A.8
Event Horizon
All events outside the past null cone of $p$ are events which are not, and have never been, observable by $O$ up to the time represented by the event $p$ (in de Sitter spacetime). Thus there is a limit to $O$'s world-line on $\mathscr{I}^+$. The surface which is the boundary between events which will at some time be observable by $O$ and those that will never be observable is called the future event horizon of $O$'s world-line. (The past event horizon can be similarly defined.)
References
1. A. Ashtekar and R. O. Hansen, A unified treatment of null and spatial infinity in general relativity I. Universal structure, asymptotic symmetries, and conserved quantities at spatial infinity, J. Math. Phys. 19 (1978), 1542.
2. J. K. Beem and P. E. Ehrlich, Global Lorentzian Geometry (New York: Marcel Dekker, 1981; Second Edition 1996).
3. (a) H. Bondi and T. Gold, The steady state theory of the expanding universe, Mon. Not. Roy. Ast. Soc. 108 (1948), 252-270. (b) H. Bondi, Massive spheres in general relativity, Proc. Roy. Soc. London A282 (1964).
4. (a) S. Chandrasekhar, The maximum mass of ideal white dwarfs, Astrophys. J. 74 (1931), 81-82. (b) S. Chandrasekhar and J. L. Friedman, On the stability of axisymmetric systems to axisymmetric perturbations in general relativity I. The equations governing nonstationary, stationary, and perturbed systems, Astrophys. J. 175 (1972), 379. (c) S. Chandrasekhar, The Mathematical Theory of Black Holes (Oxford: Clarendon Press, 1983). (d) S. Chandrasekhar in Vol. 3 of [7].
5. P. C. W. Davies, Space and Time in the Modern Universe (New York: Cambridge University Press, 1977).
6. B. DeWitt and C. M. DeWitt (ed.), Black Holes (New York: Gordon and Breach, 1973).
7. J. Ehlers (ed.), Relativity theory and astrophysics (3 volumes), American Math. Soc. (1967). (i) A. Schild (Lectures on general relativity theory), Vol. 1.
8. (a) A. Einstein, On the electrodynamics of moving bodies, Annalen der Physik 17 (1905). (b) A. Einstein, On the influence of gravitation on the propagation of light, Annalen der Physik 35 (1911). (c) A. Einstein, The foundation of the general theory of relativity, Annalen der Physik 49 (1916).
9. F. de Felice and C. J. S. Clarke, Relativity on Curved Manifolds (New York: Cambridge University Press, 1990).
10. E. J. Flaherty, Hermitian and Kählerian geometry in Relativity (New York: Springer-Verlag, 1976).
11. M. Friedman, Foundations of Space-Time Theories (New York: Princeton University Press, 1983).
12. U. Gerlach, The mechanism of blackbody radiation from an incipient black hole, Phys. Rev. D14 (1976), 1479.
13. (a) R. P. Geroch, Local characterization of singularities in General Relativity, J. Math. Phys. 9 (1968).
(b) Spinor structures of spacetime in general relativity I and II, Journ. Math. Phys. 9 (1968); Journ. Math. Phys. 11 (1970).
14. G. W. Gibbons, The time symmetric initial value problem for black holes, Commun. Math. Phys. 27 (1972), 87.
15. V. Guillemin and S. Sternberg, Variations on a theme by Kepler, Am. Math. Soc. (1990).
16. (a) S. W. Hawking, Singularities and the geometry of spacetime, Adams prize essay (1966). (b) S. W. Hawking and R. Penrose, The singularities of gravitational collapse and cosmology, Proc. Roy. Soc. London A314 (1970). (c) S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Spacetime (New York: Cambridge University Press, 1973). (d) S. W. Hawking and W. Israel, Three Hundred Years of Gravitation (New York: Cambridge University Press, 1987). (i) C. M. Will (Experimental gravitation from Newton's Principia to Einstein's general relativity); (ii) T. Damour (The problem of motion in Newtonian and Einsteinian gravity); (iii) W. Israel (Dark stars: the evolution of an idea); (iv) K. S. Thorne (Gravitational radiation); (v) A. Vilenkin (Gravitational interaction of cosmic strings); (vi) S. W. Hawking (Quantum cosmology); (vii) J. H. Schwarz (Superstring unification); (viii) R. Penrose (Newton, quantum theory and reality); (ix) R. D. Blandford (Astrophysical black holes). (e) S. W. Hawking, Hawking on the Big Bang and Black Holes (New Jersey: World Scientific, 1993).
17. A. Held, General Relativity and Gravitation (Vols. 1 and 2, Plenum Press, 1980). (i) J. C. Miller and D. W. Sciama (Gravitational collapse to the black hole state); (ii) F. J. Tipler, C. J. S. Clarke and G. F. R. Ellis (Singularities and horizons: review article); (iii) L. P. Grischuk and A. G. Polnarev (Gravitational waves and their interaction with matter and fields).
18. G. 't Hooft, Black hole quantization and a connection to string theory, in [23].
19. (a) F. Hoyle, A new model for the expanding universe, Mon. Not. Roy. Ast. Soc. 108 (1948), 372-382. (b) F. Hoyle and J. V. Narlikar, Time symmetric electrodynamics and the arrow of time in cosmology; A new theory of gravitation, Proc. Roy. Soc. London A277 (1963); A282 (1964).
20. J. A. Isenberg (ed.), Mathematics and general relativity, American Math. Soc. (1988).
21. (a) C. J. Isham, Modern Differential Geometry for Physicists (New Jersey: World Scientific, 1989). (b) C. J. Isham, An introduction to general topology and quantum topology, in [7].
22. S. Kobayashi and K. Nomizu, 1.[10].
23. H. C. Lee (ed.), Physics, geometry and topology, NATO ASI Series B (Phys. Vol. 238, Plenum Press, 1990).
24. M. A. H. MacCallum (ed.), General Relativity and Gravitation (New York: Cambridge University Press, 1987). (i) M. A. Abramowicz (Accretion disks around black holes); (ii) L. P. Grischuk (Gravity-wave astronomy); (iii) C. J. Isham (Quantum gravity); (iv) P. Mazur (Black hole uniqueness theorems); (v) R. Penrose (Twistors in general relativity).
25. A. R. Marlow (ed.), Quantum Theory and Gravitation (New York: Academic Press, 1980).
26. C. W. Misner, K. S. Thorne and J. A. Wheeler, Gravitation (San Francisco: W. H. Freeman & Co., 1973).
27. Y. Ne'eman (ed.), To Fulfill A Vision (Addison-Wesley Publishing Company, 1981). (i) C. N. Yang (Geometry of Physics); (ii) F. Gürsey (Geometrization of unified fields); (iii) J. D. Bekenstein (Gravitation, the quantum, and statistical physics); (iv) Y. Ne'eman (Gauged and affine quantum gravity).
28. E. T. Newman, L. Tamburino and J. J. Unti, Empty space generalization of Schwarzschild metric, Journ. Math. Phys. 4 (1963).
29. I. Newton, Mathematical Principles of Natural Philosophy and His System of the World (Philosophiae Naturalis Principia Mathematica), Joseph Streater (ed.), London, July 5, 1686, and Florian Cajori (ed.) (Berkeley: University of Cal. Press, 1962).
30. (a) R. Penrose, Gravitational collapse and spacetime singularities, Phys. Rev. Lett. 14 (1965). (b) General relativity, energy flux and elementary optics, in Perspectives in geometry and relativity (Hlavaty Festschrift, 1966). (c) C. M. DeWitt and J. A. Wheeler (ed.), Structure of spacetime, in Battelle Rencontres (New York: Benjamin, 1968).
31. W. Perrett and G. B. Jeffery, The Principle of Relativity, A Collection of Original Memoirs on the Special and General Theory of Relativity (New York: Dover, 1923).
32. Z. Perjes (ed.), Relativity Today (Nova Science Publishing Company, 1992). (i) L. M. Sokolowski (Gravitational waves in multi-dimensional spacetimes); (ii) R. Bartnik (The spherically symmetric Einstein Yang-Mills equations).
33. N. Prakash, Projective structures in spacetime, Indian J. Pure Appl. Math 17 (5) (1986).
34. T. Regge and J. A. Wheeler, Stability of a Schwarzschild singularity, Phys. Rev. 108 (1957), 1063.
35. R. K. Sachs and H. Wu, General Relativity for Mathematicians (New York: Springer-Verlag, 1977).
36. B. G. Schmidt, A new definition of singular points in General Relativity, J. Gen. Rel. and Gravitation 1 (1971).
37. B. F. Schutz, A First Course in General Relativity (New York: Cambridge University Press, 1985).
38. J. M. Stewart, Advanced General Relativity (New York: Cambridge University Press, 1990).
39. E. F. Taylor and J. A. Wheeler, Spacetime Physics: Introduction to Special Relativity (2nd ed., New York: W. H. Freeman and Company, 1990).
40. K. S. Thorne, Black Holes and Time Warps (New Jersey: Princeton University Press, 1993).
41. P. C. Vaidya, 'Newtonian' time in general relativity, Nature 171 (1953), 260.
42. C. V. Vishveshwara, Stability of the Schwarzschild metric, Phys. Rev. D1 (1970), 2870.
43. K. Yano and S. Bochner, Curvature and Betti Numbers (Annals of Maths. Studies No. 32, New Jersey: Princeton University Press, 1953).
44. O. Kowalski and D. Krupka (ed.), Differential Geometry and its Applications, Proc. (1993). (i) M. Mikkelsen (Standard static space-times with perfect fluid).
45. A. Strominger and C. Vafa, Microscopic origin of the Bekenstein-Hawking entropy, hep-th/9601029 v2, 14 Feb 96.
46. K. Skenderis, Black holes and branes in string theory, hep-th/9901050 v2, 19 Jan. 1999.
CHAPTER 9
BASICS OF QUANTUM THEORY
1
INTRODUCTION
We devote this chapter to the basics of quantum theories. Classically this theory was developed using two important principles, namely: the energy absorbed or emitted by a body is in multiples of a constant $h \neq 0$, called Planck's constant; and Heisenberg's uncertainty principle, expressed as $(\Delta p \times \Delta v \times m \geq \hbar)$¹. Very often an object or a property that carries $\hbar$ in its definition is called a quantum object or a quantum property. For example, a photon which carries the energy $E = \hbar\omega$ ($\omega$ = wave frequency) is a quantum object,² and a particle carrying an angular momentum which is a multiple of $(1/2)\hbar$ possesses a quantum property. The theoretical and experimental techniques that help determine the quantum nature of an object or a physical system are often referred to as components of quantum theories. The procedure used in moving from classical physics to the physics that uses the two principles cited above is called quantization. The quantization of any given system is achieved via one of two equivalent approaches, namely the canonical formalism (Sec. 2 and Sec. 3) or the path-integral formalism (Sec. 4, Sec. 5 and Sec. 6). In the former case the dynamical variables of the system are treated as operators, and these operators are postulated to satisfy the canonical commutation relations (9.2.16). The Hamiltonian of the system is constructed, which is then used to find the time evolution operator (9B.21). This eventually leads to the computation of the transition amplitude from the state at an initial time to the state at a final time. The path-integral formalism, on the other hand, allows the transition amplitude to be expressed directly as the sum over all paths between the initial and the final state. The summands of the functional integral here are weighted by $e^{ia}$ (denoted also as $e^{iS}$ or $e^{iA}$), where $a$ ($S$ or $A$) denotes the action (in units of Planck's constant $\hbar$) for the particular path.
In both these approaches, one of the key ingredients of the theory is the principle of superposition, which asserts that in a given region, every wave function $\psi$ without singularities can be expressed as a linear combination of eigenfunctions coming from Schrödinger's wave equation, provided the boundary conditions satisfied by $\psi$ are the same as those of the eigenfunctions. The principle also extends to wave functions with multiple components (such as Pauli's or Dirac's) in a natural manner via matrix methods. In the case of path-integrals, the Feynman sum over histories is based mainly on the (linear) superposition principle. In fact this principle is such a basic component of the theory that one just uses it without mentioning it explicitly.
On a historical note regarding the quantum theory, it is worth mentioning that it took more than five decades to reach the stage at which we are today. Those who contributed the most toward establishing the theory on firm ground were notably Schrödinger, Heisenberg, Pauli, Dirac and Feynman. The mathematical disciplines that were used the most in formulating the theory were analysis, algebra and geometry. As we are well aware now, Schrödinger's approach was based on differential equations and as such used analysis; Pauli and Dirac, who used operators and matrices to describe the theory, made algebra their main tool; whereas Feynman and Heisenberg, in addition to these disciplines, used geometry (graphs) as well, as a means of identifying the amplitudes of a quantum mechanical process. Since much of the theory now uses these disciplines interchangeably, we have attempted here to present the material in an integrated manner. More specifically, we have devoted Sec. 2 to the Schrödinger and Heisenberg equations and Sec. 3 to Dirac's equations along with the Klein-Gordon equations. These equations are studied for free particles as well as for particles moving in different fields. The topic of quantization of fields is studied in Sec. 4 and Sec. 5 using the diagram technique and Feynman's path-integral formalism. In Sec. 6 we use the knowledge of the previous two sections to introduce the Feynman graphs, a tool which is used with great success in string theories. Some of the terms and results that are required for an understandable account of this (giant) theory are described in brief in Appendices A, B and C. The theory of the text is illustrated by examples and exercises. The chapter also contains, in Appendix D, a brief account of Hopf algebras, known as quantum groups. Instead of making these Hopf algebras part of our chapters on algebra, we preferred to include them here because of the word 'quantum'.

¹ $\Delta p$ = uncertainty in the correct measurement of the "position" of a given particle $P$. $\Delta v$ = uncertainty in the correct measurement of the "velocity" of $P$. $m$ = mass of $P$.
² $\hbar$ is actually Planck's constant$/2\pi$; the equation $E = \hbar\omega$, where $\omega$ denotes the wave frequency, is also interpreted to mean that Planck's constant $h$ connects the wave and particle aspects of 'light' in a photon via $\omega$ and $E$.
2
PASSAGE FROM CLASSICAL TO QUANTUM
We review here in brief the similarities and differences between the tools used in the two theories (see Table 1 at the end of this section). In classical mechanics (CM) the basic entities are the topological spaces known as 'phase spaces', the 'observables', which are real-valued continuous functions on phase spaces, and the symmetry groups, i.e., the groups of self-homeomorphisms of phase spaces. In the case of quantum mechanics (QM) the phase space is replaced by a separable infinite-dimensional Hilbert space $\mathcal{H}$, the observables are self-adjoint operators which may or may not be bounded, and the symmetries are given by automorphisms of the *-algebraic structure³ of $L(\mathcal{H})$, the space of linear operators on $\mathcal{H}$. The analogues of one-forms $\omega$ and vectors $v$ in CM (the dual objects, as each one of them is a linear real-valued function on the space defined by the other, assuming that space is finite dimensional) are the Dirac kets $|\psi\rangle$ and bras $\langle\psi|$
(
(9.2.1)
(See Appendix 9A for definitions, and (9.2.41)-(9.2.43) for the relation between differential forms and operators.)
³ See Sec. (4.1) for definitions.
Basics of Quantum Theory 451
2.1
The Concept of Amplitude, Observable, and Hamiltonian
Every quantum mechanical process is associated with a complex number called the quantum amplitude. The squared modulus of the amplitude equals the probability of occurrence of the process:

$|\text{Amplitude}|^2 = \text{Prob.}$ (9.2.2)

For example, consider a particle at a point $x_0$ at time $t_0$ and at $x'$ at time $t'$. The travelling of this particle from $x_0$ to $x'$ during the time $(t' - t_0)$ is a process, to which we associate the amplitude $A$, and the probability of finding it at $x'$ is $|A|^2$.
[Figure: Particle travelling from $x_0$ to $x'$ during the interval $(t' - t_0)$.]

The amplitude $A$ for this reason is called the probability amplitude. Quantum mechanics postulates that there is a set of state vectors, symbolized as $|\ \rangle$, that describe all configurations of the system. In the case of the above example, the state of the system at $(x_0, t_0)$ is $|x_0, t_0\rangle$, and the probability amplitude $A$ is the overlap between the initial and the final states, given by a scalar product $\langle\ |\ \rangle$ (on the Hilbert space to which $|x_0, t_0\rangle$, etc., belong), thus:

$A = \langle x', t' | x_0, t_0 \rangle$ (9.2.3)
The states are normalized, hence (9.2.2) written out as:

$|\langle x', t' | x_0, t_0 \rangle|^2 = \text{Prob.}$ (9.2.4)

makes sense (the probability $p$ of occurrence of any event satisfies $0 \leq p \leq 1$). Another equally important postulate of QM asserts that all physically observable quantities be represented by operators on a Hilbert space, and that the result of a measurement of an observable must be an eigenvalue of the operator representing it. For instance, the 'position' of the particle is an observable quantity; thus if $X$ denotes the corresponding operator acting on a state $|x\rangle$ of the system, then $X|x\rangle$ gives the eigenvalue of the operator times the state. Naturally $|x\rangle$ is an eigenvector (eigenstate) of $X$. It is customary to use the same symbol $x$ for the eigenvalue as well as for the eigenstate $|x\rangle$, thus we have:

$X|x\rangle = x|x\rangle$ (9.2.5a)

Then there is the concept of time evolution in QM, which is provided through a Hermitian operator $H$ known as the Hamiltonian, which time-translates a (time-dependent) state $|\psi(t)\rangle$ from $t$ to $t + \epsilon$, $\epsilon$ being small. Using this operator we have the Schrödinger equation (see Appendix 9B):

$i\hbar\, \dfrac{d}{dt} |\psi(t)\rangle = H |\psi(t)\rangle$ (9.2.5b)
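To make the roles of the amplitude (9.2.3) and the Schrödinger equation (9.2.5b) concrete, here is a minimal numerical sketch. It is an illustration only, under assumptions that are not in the text: a two-state system with basis $|0\rangle$, $|1\rangle$, a hypothetical Hermitian Hamiltonian with off-diagonal coupling `omega`, and units with $\hbar = 1$. The state is evolved with a standard fourth-order Runge-Kutta step, the transition amplitude $\langle 1|\psi(t)\rangle$ is read off, and its squared modulus is compared with the known closed form $\sin^2(\omega t)$ for this Hamiltonian.

```python
import math


def schrodinger_rhs(psi, H):
    """Right-hand side d(psi)/dt = -i H psi for a 2-component state (hbar = 1)."""
    return [-1j * (H[r][0] * psi[0] + H[r][1] * psi[1]) for r in range(2)]


def rk4_step(psi, H, dt):
    """One classical Runge-Kutta step for the two-state Schrodinger equation."""
    def shifted(u, v, s):
        return [u[k] + s * v[k] for k in range(2)]

    k1 = schrodinger_rhs(psi, H)
    k2 = schrodinger_rhs(shifted(psi, k1, dt / 2), H)
    k3 = schrodinger_rhs(shifted(psi, k2, dt / 2), H)
    k4 = schrodinger_rhs(shifted(psi, k3, dt), H)
    return [psi[k] + dt / 6 * (k1[k] + 2 * k2[k] + 2 * k3[k] + k4[k])
            for k in range(2)]


# Hypothetical Hamiltonian (an assumption for illustration): H = omega * sigma_x.
omega = 0.7
H = [[0.0, omega], [omega, 0.0]]

psi = [1.0 + 0j, 0.0 + 0j]          # initial state |0>
t_final, steps = 1.5, 3000
dt = t_final / steps
for _ in range(steps):
    psi = rk4_step(psi, H, dt)

amplitude = psi[1]                   # overlap <1|psi(t_final)>
probability = abs(amplitude) ** 2    # |Amplitude|^2 as in (9.2.2)
norm = abs(psi[0]) ** 2 + abs(psi[1]) ** 2

print(probability, math.sin(omega * t_final) ** 2, norm)
```

Because $H$ is Hermitian, the total probability `norm` stays equal to one up to integration error, which is what makes the identification of $|A|^2$ with a probability consistent.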
In short, if the physical state of a particle at time $t$ is described by the normalized wave function $\Psi(\vec{r}, t)$, with $|\Psi(\vec{r}, t)|^2$ the probability density for finding the particle at position $\vec{r}$ (see Appendix 9B), then the expectation values⁴ of the position and momentum (of the particle) are given by using the wave function as follows:

$\langle \vec{r}\, \rangle = \displaystyle\int \Psi^*(\vec{r}, t)\, \vec{r}\, \Psi(\vec{r}, t)\, d^3r$ (9.2.6)

$\langle \vec{p}\, \rangle = \displaystyle\int \Psi^*(\vec{r}, t)\, \frac{\hbar}{i} \nabla \Psi(\vec{r}, t)\, d^3r$ (9.2.7)

While calculating $\langle \vec{p}\, \rangle$ it is assumed that $\Psi(\vec{r}, t)$ has a continuous derivative everywhere (the expectation value of the momentum operator for a non-continuous wave function can also be defined, though it is more complicated). The time-development of the wave function is determined by the wave equation:

$i\hbar\, \dfrac{\partial \Psi}{\partial t} = \left[ -\dfrac{\hbar^2}{2\mu} \nabla^2 + V(\vec{r}) \right] \Psi$ (9.2.8)

The expression within brackets represents the energy operator $\left( \dfrac{p^2}{2\mu} + V \right)$ for a particle of mass $\mu$ in a conservative field; this is denoted by $H$, showing that the Hamiltonian operator here is simply the energy operator. The above equation, known as the equation of motion for the state vector, can thus be written as (see (9B.19)):

$i\hbar\, \dfrac{\partial \Psi}{\partial t} = H \Psi$ (9.2.9)
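The expectation values (9.2.6) and (9.2.7) can be evaluated numerically for a simple wave function. The sketch below is a one-dimensional illustration under assumptions not taken from the text: a Gaussian wave packet centred at `x0` with wave number `k0`, units with $\hbar = 1$, a uniform grid, and a central difference in place of the derivative. For this packet the expected results are $\langle x \rangle = x_0$ and $\langle p \rangle = \hbar k_0$.

```python
import cmath
import math

hbar = 1.0            # assumed units
x0, k0 = 0.5, 2.0     # assumed packet centre and wave number


def psi(x):
    """Normalized Gaussian wave packet (an illustrative choice)."""
    return math.pi ** -0.25 * cmath.exp(-(x - x0) ** 2 / 2 + 1j * k0 * x)


# Uniform grid wide enough for the packet to be negligible at the ends.
half_width, n = 12.0, 4000
dx = 2 * half_width / n
xs = [x0 - half_width + k * dx for k in range(n + 1)]
vals = [psi(x) for x in xs]

# Normalization and <x> from the probability density |psi|^2.
norm = sum(abs(v) ** 2 for v in vals) * dx
mean_x = sum(x * abs(v) ** 2 for x, v in zip(xs, vals)) * dx

# <p> = integral of psi* (hbar/i) dpsi/dx, with a central difference.
mean_p = 0j
for k in range(1, n):
    dpsi = (vals[k + 1] - vals[k - 1]) / (2 * dx)
    mean_p += vals[k].conjugate() * (hbar / 1j) * dpsi * dx
mean_p = mean_p.real

print(norm, mean_x, mean_p)
```

The small deviation of `mean_p` from `hbar * k0` comes from the finite-difference derivative and shrinks as the grid is refined.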
As indicated earlier, the operator $H$ is always a Hermitian operator. We give below two examples that illustrate the above introduction. The first of these gives the translation from classical to quantum using the same differential equation.
2.2
Symmetry Group of the Motion of a Particle in 1 -Dimension
Example 9.2.1
Consider the differential equation:

$\ddot{x} + F(x) = 0$ (9.2.10)

where $F: \mathbb{R} \to \mathbb{R}$ is a given smooth function; in CM it represents a particle moving in a one-dimensional space whose position is given by $x \equiv x(t)$ at time $t$. The motion of the system is analyzed by taking the phase space $\mathbb{R}^2$,⁵ on which canonical coordinate functions $q, p: \mathbb{R}^2 \to \mathbb{R}$ are introduced as:

$q(x, y) = x, \qquad p(x, y) = y$ (9.2.11)

⁴ The expectation value of a random variable $X$ for the given probability distribution $(X_k, p_k)$ is the weighted sum $\sum_k X_k p_k$. It is denoted as $E(X)$ or $\langle X \rangle$. Here $p_k$ denotes the probability that $X$ takes the value $X_k$.
⁵ The cotangent bundle $T^*(M)$ of an $n$-dimensional $C^\infty$-manifold $M$ with coordinates $(x_1, \ldots, x_n) = (x_1(t), \ldots, x_n(t))$ defines the $2n$-dimensional phase space in CM with coordinates $(x_1, \ldots, x_n, y_1, \ldots, y_n)$; here $M$ is $\mathbb{R}$, hence the phase space is $\mathbb{R}^2$: $(x, y)$ (see also 9B).
The commutation relations satisfied by $p$ and $q$ are:

$\{p, p\} = \{q, q\} = 0, \qquad \{p, q\} = 1$ (9.2.12)

which can be easily seen to follow from the Poisson bracket $\{f, g\} = \dfrac{\partial f}{\partial y} \dfrac{\partial g}{\partial x} - \dfrac{\partial f}{\partial x} \dfrac{\partial g}{\partial y}$, for smooth real-valued functions $f, g: \mathbb{R}^2 \to \mathbb{R}$. The second order differential equation (9.2.10) reduces to a system of first order differential equations on the phase space, thus:

$\dot{x} = y, \qquad \dot{y} = -F(x)$ (9.2.13)
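The first-order system (9.2.13) can be integrated numerically to get a feel for the classical dynamics on the phase space. The following sketch is an illustration under assumptions not taken from the text: the force term is chosen as $F(x) = x^3$, whose potential $x^4/4$ is bounded below, and the flow is approximated by the leapfrog scheme. It checks two properties one expects of the exact flow: conservation of the energy $y^2/2 + x^4/4$, and the one-parameter group property (flowing for time 2 and then time 1 agrees with flowing for time 3).

```python
def F(x):
    """Hypothetical force term for illustration; its potential is x**4 / 4."""
    return x ** 3


def potential(x):
    return x ** 4 / 4.0


def flow(x, y, t, steps=20000):
    """Approximate the time-t flow of x' = y, y' = -F(x) by leapfrog steps."""
    dt = t / steps
    for _ in range(steps):
        y -= 0.5 * dt * F(x)   # half kick
        x += dt * y            # drift
        y -= 0.5 * dt * F(x)   # half kick
    return x, y


def energy(x, y):
    return 0.5 * y * y + potential(x)


x_start, y_start = 1.0, 0.0

# Flow for time 3 in one go, and as a composition of time 2 followed by time 1.
x3, y3 = flow(x_start, y_start, 3.0)
x21, y21 = flow(*flow(x_start, y_start, 2.0), 1.0)

print(energy(x_start, y_start), energy(x3, y3))
print((x3, y3), (x21, y21))
```

The leapfrog scheme is used here because it respects the phase-space structure well; any other convergent integrator would serve for the same check.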
Since we are interested in 'systems' without singularities, we require that through every point $(x_0, y_0)$ of $\mathbb{R}^2$ there be a unique smooth curve $t \mapsto (x(t), y(t))$ that satisfies (9.2.13) with initial conditions $x(0) = x_0$, $y(0) = y_0$. This requirement translates into the existence of a smooth one-parameter family $\{\phi_t : t \in \mathbb{R}\}$ of homeomorphisms of $\mathbb{R}^2$ such that the functions $q_t = q \circ \phi_t$, $p_t = p \circ \phi_t$ satisfy:

$\dfrac{d}{dt} q_t = p_t, \qquad \dfrac{d}{dt} p_t = -F(q_t)$ (9.2.14)

Let $f: \mathbb{R} \to \mathbb{R}$ be a smooth function such that $f'(x) = F(x)$ (i.e., $f(x) = \int^x F(t)\, dt$); then, if $f$ is bounded below, there exists a flow $(\phi_t : t \in \mathbb{R})$ that satisfies (9.2.14). See Chap. 8 in [18] for the proof. In the case of QM, we have to quantize the one-dimensional system given by (9.2.10); therefore we begin with the Hilbert space $\mathcal{H} = L^2(\mathbb{R})$ of complex-valued square integrable functions and choose (in place of the canonical coordinate functions (9.2.11) of CM) the canonical operators:

$Q = \text{multiplication by } x, \qquad P = \dfrac{1}{i} \dfrac{d}{dx}$ (9.2.15)

The operators $Q$ and $P$ satisfy the commutation rule:

$[Q, Q] = [P, P] = 0, \qquad [P, Q] = -i\,\mathbf{1}$ (9.2.16)
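The commutation rule (9.2.16) can be checked on a concrete function. The sketch below is an illustration under stated assumptions: $P$ is taken as $(1/i)\, d/dx$ and $Q$ as multiplication by $x$, the derivative is replaced by a central difference with step `h`, and the test function `g` is an arbitrary smooth choice not taken from the text. Applying $PQ - QP$ to `g` should reproduce $-i\, g$ at any point.

```python
import math


def g(x):
    """Arbitrary smooth test function (an illustrative choice)."""
    return (1.0 + x) * math.exp(-x * x)


def P(func, h=1e-5):
    """P = (1/i) d/dx, with the derivative approximated by a central difference."""
    return lambda x: (func(x + h) - func(x - h)) / (2 * h) / 1j


def Q(func):
    """Q = multiplication by x."""
    return lambda x: x * func(x)


def commutator_PQ(func):
    """Return the function (PQ - QP) func."""
    pq = P(Q(func))
    qp = Q(P(func))
    return lambda x: pq(x) - qp(x)


point = 0.3
value = commutator_PQ(g)(point)
expected = -1j * g(point)

print(value, expected)
```

The agreement holds at every point where `g` is smooth; the residual difference is of the order of the finite-difference error.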
Here 1 stands for the identity operator on L2(IR). In view of the above discussions, the dynamics of the system is represented by a one-parameter group (ar: t s IR) of * automorphisms of L{!tt). Thus the equations (analogous to those of a classical system (9.2.14)) satisfied by the dynamical group (a,) are: ~at{Q) = at(P) -j-a,(P) = -F(at(Q)) = -a,(F{Q)) at
(9.2.17)
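The classical flow behind (9.2.13) can be illustrated numerically. The following is only a sketch under stated assumptions: the choice F(x) = x (so that f(x) = x²/2 is bounded below), the leapfrog integrator, and all function names are illustrative and do not come from the text.

```python
import math


def force_term(x):
    """F(x) of (9.2.13). Assumed here: F(x) = x (harmonic case)."""
    return x


def primitive(x):
    """f(x) with f'(x) = F(x); x**2 / 2 is bounded below."""
    return 0.5 * x * x


def flow(x0, y0, t, steps=10000):
    """Approximate the flow map (x0, y0) -> (x(t), y(t)) of x' = y, y' = -F(x).

    Uses the leapfrog (kick-drift-kick) scheme, chosen only for illustration.
    """
    dt = t / steps
    x, y = x0, y0
    for _ in range(steps):
        y -= 0.5 * dt * force_term(x)  # half step in y
        x += dt * y                    # full step in x
        y -= 0.5 * dt * force_term(x)  # half step in y
    return x, y


def energy(x, y):
    """Conserved quantity y**2 / 2 + f(x) of the system (9.2.13)."""
    return 0.5 * y * y + primitive(x)


if __name__ == "__main__":
    x, y = flow(1.0, 0.0, math.pi / 2)
    print("numerical:", x, y)
    print("exact:    ", math.cos(math.pi / 2), -math.sin(math.pi / 2))
    print("energy drift:", energy(x, y) - energy(1.0, 0.0))
```

For this assumed F the exact solution through (1, 0) is (cos t, -sin t), so the numerical flow can be compared with it directly; composing two calls of `flow` approximates the one-parameter group property of the family of maps.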
454
Mathematical Perspectives on Theoretical Physics
2.3 Two-body Problem with Spherically Symmetric Potential
In the next example we consider two non-relativistic spinless particles interacting via a spherically symmetric potential, to obtain the associated Hamiltonian along with its eigenvalues and eigenkets.
Example 9.2.3 Consider two particles of masses $m_1$, $m_2$ with position operators $\mathbf{r}_1$ and $\mathbf{r}_2$, and momentum operators $\mathbf{p}_1$ and $\mathbf{p}_2$. Let V(r) denote a spherically symmetric potential where
$r = (\mathbf{r} \cdot \mathbf{r})^{1/2}$ and $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$
(9.2.18)
then the Hamiltonian for the system can be written as:
$H_T = \frac{\mathbf{p}_1^2}{2m_1} + \frac{\mathbf{p}_2^2}{2m_2} + V(r)$
(9.2.19)
(see (9.2.8)). In order to obtain the eigenkets of $H_T$, we reduce it to a sum:
$H_T = H_{cm} + H$
(9.2.20)
The components $H_{cm}$ and H are:
$H_{cm} = \frac{\mathbf{P}^2}{2(m_1 + m_2)}$
(9.2.21)
$H = \frac{\mathbf{p}^2}{2M} + V(r)$
(9.2.22)
where
$\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2$
(9.2.23)
$\mathbf{p} = (m_2\mathbf{p}_1 - m_1\mathbf{p}_2)/(m_1 + m_2)$
(9.2.24)
and
$M = (m_1 m_2)/(m_1 + m_2)$ (reduced mass)
(9.2.25)
The operators $H_{cm}$ and H are the Hamiltonians that are respectively associated with the (translational) motion of the centre of mass of the two particles, and their relative motion (rotational and vibrational). Note that $H_T$ could be written as a sum since $\mathbf{r}_i$ and $\mathbf{p}_i$ (i = 1, 2) are conjugate operators and $[\mathbf{r}_i, \mathbf{p}_j] = 0$ ($i \neq j$), (i, j = 1, 2) (see 9A). Also since $\mathbf{r}$ and $\mathbf{p}$ are linear combinations of $\mathbf{r}_i$ and $\mathbf{p}_i$, they are Hermitian and their corresponding triplets $(r_k)$ and $(p_k)$ (k, l = 1, 2, 3) satisfy:
$[p_k, r_l] = -i\hbar\,\delta_{kl}$
(9.2.26)
(see 9A.22). This shows that $\mathbf{r}$ and $\mathbf{p}$ are canonically conjugate. It can be checked that $\mathbf{P}$ of (9.2.23) commutes with both of them. We further define the orbital angular momentum operator (see Exc. (9.2.1))⁶:
$\mathbf{L} = \mathbf{r} \times \mathbf{p}$
(9.2.27)
associated with the motion about the centre of mass (internal angular momentum).
⁶ See also Sec. (3.7), in particular the hint to Exc. 2 of that section. There we have used the angular-momentum operator to obtain an SU(2)-representation.
Since $\mathbf{P}$ commutes with $\mathbf{r}$ and $\mathbf{p}$, the operators $H_{cm}$, H, $\mathbf{L}^2$ and $L_z$ can be seen to form a set of commuting operators; we assume that the set is complete, hence, being a c.s.c.o., it can be used to write the basis kets of $H_T$ (see 9A.3 for the definition of c.s.c.o.). Let $|E\rangle$, $|l\rangle$, $|m\rangle$ and $|E_{cm}\rangle$ be eigenkets of H, $\mathbf{L}^2$, $L_z$ and $H_{cm}$ respectively. We use $|Elm\rangle$ to denote an eigenket of the first three operators (collectively), i.e.:
$H|Elm\rangle = E|Elm\rangle$
(9.2.28)
$\mathbf{L}^2|Elm\rangle = \hbar^2 l(l + 1)|Elm\rangle$
(9.2.29)
$L_z|Elm\rangle = \hbar m|Elm\rangle$
(9.2.30)
The eigenkets $|Elm\rangle$ of (9.2.28)-(9.2.30) constitute an angular momentum basis. The basis kets of $H_T$, on the other hand, can be written as the direct-product basis $|E_{cm}\rangle \otimes |Elm\rangle$, where
$H_T(|E_{cm}\rangle \otimes |Elm\rangle) = (E_{cm} + E)(|E_{cm}\rangle \otimes |Elm\rangle)$
(9.2.31)
Since H, $\mathbf{L}^2$ and $L_z$ are Hermitian, the kets $|Elm\rangle$ are orthogonal, and if we further assume that they are normalized, then:
$\langle E'l'm'|Elm\rangle = \delta_{E'E}\,\delta_{l'l}\,\delta_{m'm}$
(9.2.32)
for discrete energy eigenvalues, and
$\langle E'l'm'|Elm\rangle = \delta(E' - E)\,\delta_{l'l}\,\delta_{m'm}$
(9.2.33)
for continuous energy eigenvalues. The eigenvalues $E_{cm}$ have a continuous spectrum and therefore:
$\langle E'_{cm}|E_{cm}\rangle = \delta(E'_{cm} - E_{cm})$
(9.2.34)
Since H commutes with $\mathbf{L}$, we also have (see Exc. (9.2.1), Hint for $L_\pm$): $H(L_\pm|Elm\rangle) = E(L_\pm|Elm\rangle)$. Finally we note that if one of the masses ($m_1$ or $m_2$) is infinite, then $H_{cm} = 0$ and the two-body problem reduces to the problem of a single particle in a spherically symmetric potential.
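The splitting (9.2.20) of the kinetic part of $H_T$ can be checked numerically for ordinary (commuting) momentum vectors. The sketch below is a classical sanity check only, not an operator computation; the function names and sample values are assumptions made for illustration.

```python
def dot(a, b):
    """Euclidean dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def total_kinetic(m1, m2, p1, p2):
    """Kinetic part of (9.2.19): p1^2 / (2 m1) + p2^2 / (2 m2)."""
    return dot(p1, p1) / (2 * m1) + dot(p2, p2) / (2 * m2)


def split_kinetic(m1, m2, p1, p2):
    """Centre-of-mass and relative kinetic terms of (9.2.21) and (9.2.22)."""
    total_p = [a + b for a, b in zip(p1, p2)]                       # (9.2.23)
    rel_p = [(m2 * a - m1 * b) / (m1 + m2) for a, b in zip(p1, p2)]  # (9.2.24)
    reduced_mass = m1 * m2 / (m1 + m2)                              # (9.2.25)
    t_cm = dot(total_p, total_p) / (2 * (m1 + m2))
    t_rel = dot(rel_p, rel_p) / (2 * reduced_mass)
    return t_cm, t_rel


if __name__ == "__main__":
    m1, m2 = 1.0, 3.0
    p1, p2 = [0.5, -1.0, 2.0], [1.5, 0.25, -0.75]
    t_cm, t_rel = split_kinetic(m1, m2, p1, p2)
    print(total_kinetic(m1, m2, p1, p2), t_cm + t_rel)
```

Letting `m2` grow very large while `p1` is fixed makes the centre-of-mass term shrink, in line with the closing remark about an infinite mass.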
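2.4 The Radial Hamiltonian of the Two-body Problem
We next consider the radial momentum operator:
$p_r = \frac{1}{2}(\hat{\mathbf{r}} \cdot \mathbf{p} + \mathbf{p} \cdot \hat{\mathbf{r}}) = \frac{1}{r}(\mathbf{r} \cdot \mathbf{p} - i\hbar)$
(9.2.35)
where $\hat{\mathbf{r}}$ denotes the unit position operator. This operator is related to the linear and the orbital angular momentum operators by the identity:
$\mathbf{L}^2 = r^2(\mathbf{p}^2 - p_r^2)$
(9.2.36)
We use (9.2.36) to eliminate $\mathbf{p}$ from (9.2.22) to obtain:
$H = \frac{1}{2M}p_r^2 + \frac{1}{2Mr^2}\mathbf{L}^2 + V(r)$
(9.2.37)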
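A classical analogue of the identity (9.2.36) can be verified with plain vectors. The sketch below assumes commuting quantities, so the $i\hbar$ ordering term of (9.2.35) is absent and $p_r$ reduces to $\hat{\mathbf{r}} \cdot \mathbf{p}$; it illustrates the structure of (9.2.37) only, and the helper names are hypothetical.

```python
def dot(a, b):
    """Euclidean dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product a x b."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def radial_split(r_vec, p_vec):
    """Return (L^2, r^2 (p^2 - p_r^2)) for classical vectors r and p."""
    r_sq = dot(r_vec, r_vec)
    p_r = dot(r_vec, p_vec) / r_sq ** 0.5  # classical radial momentum
    ang = cross(r_vec, p_vec)              # classical L = r x p
    return dot(ang, ang), r_sq * (dot(p_vec, p_vec) - p_r ** 2)


if __name__ == "__main__":
    print(radial_split([1.0, 2.0, -0.5], [0.3, -1.1, 0.7]))
```

Both returned numbers agree up to rounding, which is the classical content of splitting $\mathbf{p}^2$ into a radial part and a centrifugal part $\mathbf{L}^2/r^2$.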
If further we replace $\mathbf{L}^2$ by its eigenvalues $\hbar^2 l(l + 1)$ (see (iv) of Exc. 9.2.1), then the RHS of (9.2.37) becomes:
$H_l = \frac{1}{2M}p_r^2 + \frac{\hbar^2 l(l + 1)}{2Mr^2} + V(r)$
(9.2.38)
The operator obtained in (9.2.38) is called the radial Hamiltonian. We denote an eigenket of $H_l$ as $|El\rangle$ with eigenvalue E. If $H_l$ is Hermitian with respect to $|El\rangle$ and $|E'l\rangle$, then these eigenkets are orthogonal. Thus upon normalization one has:
$\langle E'l|El\rangle = \delta_{E'E}$
(9.2.39)
for discrete eigenvalues and
$\langle E'l|El\rangle = \delta(E' - E)$
(9.2.40)
for continuous eigenvalues. From the above discussions it follows that the operators H, $\mathbf{L}^2$ and $L_z$ can be expressed in terms of any representation (e.g., coordinate, momentum). For instance, in coordinate representation, (9.2.28)-(9.2.30) become:
$H\psi_{Elm}(\mathbf{r}') = E\psi_{Elm}(\mathbf{r}')$
(9.2.41)
$\mathbf{L}^2\psi_{Elm}(\mathbf{r}') = \hbar^2 l(l + 1)\psi_{Elm}(\mathbf{r}')$
(9.2.42)
$L_z\psi_{Elm}(\mathbf{r}') = \hbar m\,\psi_{Elm}(\mathbf{r}')$
(9.2.43)
where
$\psi_{Elm}(\mathbf{r}') = \langle\mathbf{r}'|Elm\rangle$
(9.2.44)
is the coordinate-space wave function defined by the eigenket $|\mathbf{r}'\rangle$ and eigenvalue $\mathbf{r}'$ of the operator $\mathbf{r}$. The operators H, $\mathbf{L}^2$, $L_z$ are the differential forms obtained after substituting:
$\mathbf{r} \to \mathbf{r}'$
(9.2.45)
$\mathbf{p} \to -i\hbar\nabla_{\mathbf{r}'}$
(9.2.46)
in the corresponding abstract operators. We conclude this example with the following remarks.
Remark 9.2.3 Similar to the radial operators $r = (\mathbf{r} \cdot \mathbf{r})^{1/2}$ and $p_r = \frac{1}{2}(\hat{\mathbf{r}} \cdot \mathbf{p} + \mathbf{p} \cdot \hat{\mathbf{r}}) = \frac{1}{r}(\mathbf{r} \cdot \mathbf{p} - i\hbar)$, we can define two other radial operators:
$p = (\mathbf{p} \cdot \mathbf{p})^{1/2}$
(9.2.47)
and
$r_p = \frac{1}{2}(\hat{\mathbf{p}} \cdot \mathbf{r} + \mathbf{r} \cdot \hat{\mathbf{p}}) = \frac{1}{p}(\mathbf{p} \cdot \mathbf{r} + i\hbar)$
(9.2.48)
The Hamiltonian H can then be expressed in terms of p and $r_p$.
Remark 9.2.4 If $p_r$ is Hermitian, which is always the case when the potential V(r) is a Coulomb potential, then r and $p_r$ are conjugate operators. Similarly when $r_p$ is Hermitian, p and $r_p$ are conjugate operators. The radial Hamiltonian expressed in terms of Hermitian $p_r$ and $r_p$ can be shown to be Hermitian. (See Chapter 7 of [21].)
We shall be pursuing the study of Hamiltonians in the next section by considering relativistic equations. We end this section with a comment on the so-called Schrödinger and Heisenberg pictures, which we have used alternately in the appendices as well as here, without naming them. We clarify this point in the following and show that they can be identified via simple equations.
2.5 The Relation between Schrödinger and Heisenberg Equations
Comment 9.2.5 Conventional QM begins with the Hamiltonian formulation of classical mechanics and uses observables as non-commuting operators. The dynamical law is given by the time-dependent Schrödinger equation (9B.19a):
$i\hbar\frac{\partial}{\partial t}\psi(t) = H(t)\psi(t)$
(9.2.49a)
or equivalently by Eq. (9.2.5b):
$i\hbar\frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle$
(9.2.49b)
in view of the fact that
$\langle x|\psi(t)\rangle = \psi(x, t)$.
(9.2.49c)
Thus when the Schrödinger equation represents the wave function of a particle in one dimension, we have:
$i\hbar\frac{\partial\psi(x, t)}{\partial t} = H(x)\psi(x, t) = \left(-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right)\psi(x, t)$
(9.2.50)
From (9B.21) we note that for the time independent Hamiltonian H, the time evolution operator U equals: $U(t_1, t_2) = \exp((-i/\hbar)(t_1 - t_2)H)$. We use this to link Schrödinger's time-dependent states and time-independent operators with Heisenberg's time-independent states and time-dependent operators.⁷ For instance, in the case of states we have:
$|\psi\rangle_H = |\psi(t = 0)\rangle_S = |\psi(t = 0)\rangle$
$\exp((-i/\hbar)tH)|\psi\rangle_H = \exp((-i/\hbar)tH)|\psi(t = 0)\rangle_S = |\psi(t)\rangle_S$
(9.2.51)
where to write the second line we have used (9B.11) and (9B.21) after writing $t_0 = 0$. Similarly the coordinate operator in the Heisenberg picture is related to the one in the Schrödinger picture as:
$X_H(t) = \exp((i/\hbar)tH)\,X_S\,\exp((-i/\hbar)tH)$
(9.2.52)
⁷ The operators in the two pictures are distinguished by the suffix S or H; similarly the states are designated as $|\ \rangle_S$ or $|\ \rangle_H$. The letters S and H are dropped when there is no fear of confusion or when a particular equation/statement is valid in both cases.
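Relations (9.2.51) and (9.2.52) imply that expectation values agree in the two pictures. The following sketch checks this for a finite-dimensional toy system; the diagonal Hamiltonian, the 2 x 2 observable and the choice $\hbar = 1$ are assumptions made purely for illustration.

```python
import cmath


def schrodinger_expectation(energies, x_op, psi0, t):
    """<psi_S(t)| X_S |psi_S(t)> for a Hamiltonian diagonal with the given energies."""
    psi_t = [cmath.exp(-1j * e * t) * complex(a) for e, a in zip(energies, psi0)]
    n = len(psi_t)
    return sum(
        psi_t[j].conjugate() * x_op[j][k] * psi_t[k]
        for j in range(n)
        for k in range(n)
    )


def heisenberg_operator(energies, x_op, t):
    """X_H(t) = exp(itH) X_S exp(-itH), cf. (9.2.52) with hbar = 1."""
    n = len(energies)
    return [
        [
            cmath.exp(1j * energies[j] * t) * x_op[j][k] * cmath.exp(-1j * energies[k] * t)
            for k in range(n)
        ]
        for j in range(n)
    ]


def heisenberg_expectation(energies, x_op, psi0, t):
    """<psi_H| X_H(t) |psi_H> with the time-independent state psi_H = psi_S(0)."""
    x_h = heisenberg_operator(energies, x_op, t)
    psi = [complex(a) for a in psi0]
    n = len(psi)
    return sum(psi[j].conjugate() * x_h[j][k] * psi[k] for j in range(n) for k in range(n))


if __name__ == "__main__":
    energies = [0.5, 1.5]
    x_op = [[0, 1], [1, 0]]
    psi0 = [2 ** -0.5, 2 ** -0.5]
    print(schrodinger_expectation(energies, x_op, psi0, 0.7))
    print(heisenberg_expectation(energies, x_op, psi0, 0.7))
```

For this toy choice both expectation values equal cos((E2 - E1) t), so the two printed numbers coincide.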
The eigenstates of the operator $X_H(t)$ satisfying $X_H(t)|x, t\rangle_H = x|x, t\rangle_H$ are easily seen to be in accordance with the coordinate basis of the Schrödinger picture:
$|x, t\rangle_H = \exp((i/\hbar)tH)|x\rangle$
(9.2.53)
Using (9.2.3) we now have:
${}_H\langle x_1, t_1|x_2, t_2\rangle_H = \langle x_1|\exp((-i/\hbar)t_1H)\exp((i/\hbar)t_2H)|x_2\rangle = \langle x_1|\exp((-i/\hbar)(t_1 - t_2)H)|x_2\rangle = \langle x_1|U(t_1, t_2)|x_2\rangle = U(t_1, x_1; t_2, x_2)$
(9.2.54)
Since ${}_H\langle x_1, t_1|x_2, t_2\rangle_H$ are the time ordered transition amplitudes between the coordinate basis states in the Heisenberg picture, it follows that the matrix elements of the time evolution operator $U(t_1, t_2)$ are nothing else but these transition amplitudes. We shall use this relation increasingly in Sec. 5. In the attached Table (9.2.1) we give the dynamical laws for the two approaches along with the ingredients that are used there.
Table 9.2.1 Classical and Quantum Mechanics
Item | Classical Mechanics | Quantum Mechanics
1. | Finite-dimensional phase-space | Infinite-dimensional Hilbert space
2. | Real valued functions: $f$ (one-component) | Complex-valued functions: $\psi$ (with more than one component)
3. | Variables x, p ($\{x, p\} = 1$) | Operators $\hat{x}$, $\hat{p}$ ($[\hat{x}, \hat{p}] = i\hbar$)
4. | Hamiltonian H(x, p) or H(x, p, t) | Hamiltonian $\hat{H}(\hat{x}, \hat{p})$
5. | Dynamical law: $\frac{d}{dt}f(x, p) = \{f, H\}$ | Dynamical law: (a) Heisenberg Eq: $i\hbar\frac{d\hat{f}}{dt} = [\hat{f}, \hat{H}]$; (b) Schrödinger Eq: $i\hbar\frac{\partial\psi}{\partial t} = \hat{H}\psi$
 | Hamilton-Jacobi Eq.: $\frac{\partial S}{\partial t} + H\left(x, \frac{\partial S}{\partial x}\right) = 0$ | $\hat{H} = H\left(x, -i\hbar\frac{\partial}{\partial x}\right)$
6. | Lagrangian $L(x, \dot{x})$; Action $S = \int L\,dt$ |
Exercise 9.2 (1) What are the most commonly used angular momentum operators in particle theory? Obtain their eigenvalues and eigenkets.
Hints to Exercise (9.2)
1. From Section (3.7) we are already familiar with the angular momentum operator $\mathbf{J}$ whose Cartesian components $J_i$ (i = 1, 2, 3 or x, y, z) are linear Hermitian operators that satisfy:
(i) $[J_i, J_j] = i\hbar\,\epsilon_{ijk}J_k$.
In particle theory we come across two types of angular momentum operators, namely the orbital angular momentum operator denoted $\mathbf{L}$, and the spin angular momentum operator $\mathbf{S}$. $\mathbf{L}$ is obtained from the definition of classical angular momentum after replacing the classical position and momentum vectors by linear Hermitian operators that satisfy the canonical commutation relations (9A.22). Thus for a particle
(ii) $\mathbf{L} = \mathbf{r} \times \mathbf{p}$, i.e. $L_i = \epsilon_{ijk}\,r_j p_k$
which shows that $\mathbf{L}$ is linear and satisfies:
(iii) $[L_i, L_j] = i\hbar\,\epsilon_{ijk}L_k$
Also $L_i^\dagger = \epsilon_{ijk}\,p_k^\dagger r_j^\dagger = \epsilon_{ijk}\,r_j p_k = L_i$.
Hence in view of (i) it is an angular momentum operator. Using the discussions made for $\mathbf{J}^2$ and $\mathbf{J}$ in Sec. (3.7) we note that $[\mathbf{L}^2, L_i] = 0$, and that $\mathbf{L}^2$ and one of the $L_i$ (say $L_z$) can have simultaneous eigenkets; these lead to the relations:
(iv) $L_z|lm\rangle = \hbar m|lm\rangle$, $\quad \mathbf{L}^2|lm\rangle = \hbar^2 l(l + 1)|lm\rangle$
The spin angular momentum operator $\mathbf{S}$ is postulated, so that the $S_i$ satisfy the defining commutation relations given in (i). Similar to $J_\pm$, we have here for the operator $\mathbf{L}$:
(v) $L_\pm = L_x \pm iL_y$
with corresponding relations
(vi) $[L_z, L_\pm] = \pm\hbar L_\pm$, $\quad [\mathbf{L}^2, L_\pm] = 0$
Using these one can obtain their eigenvalues and quantum numbers. We use these operators in Secs. 2 and 3 while studying the two-body problem and Dirac's equation.
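Relations (iv) and (vi) can be checked on an explicit finite-dimensional example. The sketch below uses the standard l = 1 matrices in the basis |1,1>, |1,0>, |1,-1> with $\hbar = 1$; the basis ordering, the units and the helper names are assumptions of this illustration.

```python
import math

ROOT2 = math.sqrt(2.0)

# l = 1 matrices in the ordered basis |1,1>, |1,0>, |1,-1> (hbar = 1).
L_Z = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
L_PLUS = [[0, ROOT2, 0], [0, 0, ROOT2], [0, 0, 0]]
L_MINUS = [[0, 0, 0], [ROOT2, 0, 0], [0, ROOT2, 0]]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    """Scalar multiple of a matrix."""
    return [[s * x for x in row] for row in a]


def commutator(a, b):
    """[a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def max_abs_diff(a, b):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def l_squared():
    """L^2 = L_z^2 + (L_+ L_- + L_- L_+) / 2."""
    ladder = add(matmul(L_PLUS, L_MINUS), matmul(L_MINUS, L_PLUS))
    return add(matmul(L_Z, L_Z), scale(0.5, ladder))


if __name__ == "__main__":
    print(max_abs_diff(commutator(L_Z, L_PLUS), L_PLUS))
    print(l_squared())
```

The first printed number is zero up to rounding, which is $[L_z, L_+] = L_+$, and `l_squared()` is close to 2 times the identity, i.e. l(l + 1) with l = 1.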
3 QUANTUM MECHANICAL EQUATIONS AND RELATED CONCEPTS
In this section a relativistic Hamiltonian with reference to the Klein-Gordon and Dirac equations is studied. The energy eigenvalues of free as well as charged particles are obtained in both cases. Using the Hamiltonian of a free Dirac particle, the spin and angular momentum operators (denoted S and J) are defined. The relation between the solutions of the Klein-Gordon and Dirac equations is shown, and the Feynman-Gell-Mann reduction is applied to the Dirac equation of a charged particle in an electromagnetic field.
3.1 Hamiltonian in a Relativistic Field, and Klein-Gordon Equation
Consider the classical Hamiltonian H of a free particle:
$H = (c^2\mathbf{p}^2 + m_0^2c^4)^{1/2}$
(9.3.1)
where $m_0$ is the rest mass and $\mathbf{p}$ is the relativistic momentum:
$\mathbf{p} = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} m_0\mathbf{v}$
(9.3.2)
Now the transition from a given classical system to a quantum mechanical system can be effected in more than one way; for instance, using the RHS of (9.3.1) in (9.2.10) we have:
$(c^2\mathbf{p}^2 + m_0^2c^4)^{1/2}\,\Psi(\mathbf{r}, t) = i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r}, t)$
(9.3.3)
This (quantum mechanical) equation, however, is not of much use due to the absence of symmetry in space and time coordinates. The computations based on this equation are unwieldy. Dirac circumvented this situation (absence of symmetry) by suggesting an alternative method, which we shall explain below. But before that we give an outline of another useful procedure which leads to the well known equation (9.3.4). We square both sides of (9.3.1) before operating on $\Psi(\mathbf{r}, t)$. We replace H with $i\hbar\frac{\partial}{\partial t}$ and write $c^2\mathbf{p}^2$ as $(-\hbar^2c^2\nabla^2)$ to obtain the quantum mechanical equation:
$\left(-\hbar^2c^2\nabla^2 + \hbar^2\frac{\partial^2}{\partial t^2} + m_0^2c^4\right)\Psi(\mathbf{r}, t) = 0$
(9.3.4)
When $\Psi$ is a scalar (invariant under change of inertial frames), the above equation is the Klein-Gordon equation for a free particle. The familiar form of this equation which one encounters in the literature is:
$(p_\mu^2 + m_0^2c^2)\Psi(x_\nu) = 0 \quad (\mu, \nu = 1, 2, 3, 4)$
(9.3.5)
This follows by using:
$ct = -ix_4$
(9.3.6)
and then writing
$-i\hbar\frac{\partial}{\partial x_\mu} = p_\mu \quad (\mu = 1, 2, 3, 4)$
(9.3.7)
in (9.3.4). For a particle of charge q in an electromagnetic field with vector potential $\mathbf{A}(\mathbf{r}, t)$ and scalar potential $\Phi(\mathbf{r}, t)$, the derivative $p_\mu$ is replaced by the gauge covariant derivative:
$D_\mu = p_\mu - qA_\mu$
(9.3.8)
where $A_\mu$ is the four-vector potential:
$A_\mu = \left(\mathbf{A}, \frac{i}{c}\Phi\right)$
(9.3.9)
(See Sec. 3 and Sec. 4 in Chapter 6.) Hence the Klein-Gordon equation for a charged particle in an electromagnetic field is:
$(D_\mu^2 + m_0^2c^2)\Psi(x_\nu) = 0$
(9.3.10)
3.2 The Dirac Equation
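We now return to Dirac's equation. Dirac expressed the sum of four squares in $c^2\mathbf{p}^2 + m_0^2c^4$ as a perfect square by introducing other operators $\boldsymbol{\alpha}$ and $\beta$ independent of $\mathbf{p}$ and $m_0$, thus:
$c^2\mathbf{p}^2 + m_0^2c^4 = (c\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m_0c^2)^2$
(9.3.11)
He assumed that $\boldsymbol{\alpha}$ and $\beta$ commute with the operators $\mathbf{p}$ and $\mathbf{r}$, so that the RHS of (9.3.11) could be written as:
$\frac{1}{2}c^2(\alpha_j\alpha_k + \alpha_k\alpha_j)p_jp_k + m_0c^3(\alpha_j\beta + \beta\alpha_j)p_j + m_0^2c^4\beta^2$.
(9.3.12)
Comparison of (9.3.12) with the LHS of (9.3.11) implied the following identities for $\boldsymbol{\alpha}$ and $\beta$:
$\alpha_j\alpha_k + \alpha_k\alpha_j = 2\delta_{jk}$
(9.3.13a)
$\alpha_j\beta + \beta\alpha_j = 0$
(9.3.13b)
$\beta^2 = 1$
(9.3.13c)
This showed that the $\alpha_j$ and $\beta$ anticommute, and have unit squares. Using the operators $\boldsymbol{\alpha}$ and $\beta$, the quantum-mechanical Hamiltonian is:
$H = c\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m_0c^2$
(9.3.14)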
Writing $\mathbf{p} = -i\hbar\nabla$ in the above Hamiltonian and substituting this H in (9.2.9), we have the Dirac equation for a free particle:
$(-i\hbar c\,\boldsymbol{\alpha} \cdot \nabla + \beta m_0c^2)\Psi(\mathbf{r}, t) = i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r}, t)$
(9.3.15)
The usual form of this equation in the literature is:
$(i\gamma_\mu p_\mu + m_0c)\Psi(x_\nu) = 0$
(9.3.16)
which is obtained by multiplying (9.3.15) on both sides by $\beta$, and then writing
$\boldsymbol{\gamma} = -i\beta\boldsymbol{\alpha}$ and $\gamma_4 = \beta$
(9.3.17)
The operator components $(\gamma_\mu)$ satisfy:
$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\delta_{\mu\nu}$
(9.3.18)
Evidently for each $\mu$, $(\gamma_\mu)^2$ is a unit matrix. In view of (9.3.8) and (9.3.9), the Dirac Hamiltonian for a charged particle in an electromagnetic field is:
$H = c\boldsymbol{\alpha} \cdot \mathbf{D} + \beta m_0c^2$.
(9.3.19)
The Dirac equation now becomes:
$(i\gamma_\mu D_\mu + m_0c)\Psi(x_\nu) = 0$
(9.3.20)
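The algebraic relations (9.3.13) and (9.3.18) can be verified for one concrete set of matrices. The sketch below assumes the block representation built from the Pauli matrices that is derived in the hints to Exc. 9.3.1; the helper functions are illustrative and use only the standard library.

```python
# Pauli matrices sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
NEG_ID2 = [[-1, 0], [0, -1]]


def block(a, b, c, d):
    """Assemble the 4 x 4 matrix [[a, b], [c, d]] from 2 x 2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    """Scalar multiple of a matrix."""
    return [[s * x for x in row] for row in a]


def anticommutator(a, b):
    """{a, b} = ab + ba."""
    return add(matmul(a, b), matmul(b, a))


def max_abs_diff(a, b):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY4 = block(ID2, ZERO2, ZERO2, ID2)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]   # alpha_j
BETA = block(ID2, ZERO2, ZERO2, NEG_ID2)             # beta
# gamma_j = -i beta alpha_j and gamma_4 = beta, cf. (9.3.17).
GAMMA = [scale(-1j, matmul(BETA, a)) for a in ALPHA] + [BETA]

if __name__ == "__main__":
    for mu in range(4):
        for nu in range(4):
            expected = scale(2 if mu == nu else 0, IDENTITY4)
            print(mu + 1, nu + 1, max_abs_diff(anticommutator(GAMMA[mu], GAMMA[nu]), expected))
```

Every printed deviation is zero, which is (9.3.18) for this representation; the same helpers can be applied to `ALPHA` and `BETA` to check (9.3.13a) to (9.3.13c).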
The operators $\alpha_i$, $\beta$ and $\gamma_\mu$ can all be represented by matrices that have real or complex entries. Moreover, since H is Hermitian (see Appendix 9A), these matrices are Hermitian and are therefore square, of order N say; we shall see in Exc. 3.1 that N is 4. Using the matrix representation of Exc. 9.3.1, the Hamiltonian (9.3.14) can be written as:
$H = c\begin{pmatrix} m_0c & \boldsymbol{\sigma} \cdot \mathbf{p} \\ \boldsymbol{\sigma} \cdot \mathbf{p} & -m_0c \end{pmatrix}$
(9.3.21)
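where
$|\psi\rangle = \begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix}$
(9.3.22)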
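and with appropriate choices (see Exc. 9.3.2), it can be shown that the Dirac equation (9.3.20) consists of four coupled partial differential equations:
(a) $m_0c^2|\psi_1\rangle + cD_3|\psi_3\rangle + c(D_1 - iD_2)|\psi_4\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_1\rangle$
(b) $m_0c^2|\psi_2\rangle + c(D_1 + iD_2)|\psi_3\rangle - cD_3|\psi_4\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_2\rangle$
(c) $-m_0c^2|\psi_3\rangle + cD_3|\psi_1\rangle + c(D_1 - iD_2)|\psi_2\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_3\rangle$
(d) $-m_0c^2|\psi_4\rangle + c(D_1 + iD_2)|\psi_1\rangle - cD_3|\psi_2\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_4\rangle$
(9.3.23)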
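We note that if the inertial frame in (9.3.15) is changed the Dirac equation becomes
$(-i\hbar c\,\boldsymbol{\alpha} \cdot \nabla' + \beta m_0c^2)\Psi'(\mathbf{r}', t') = i\hbar\frac{\partial}{\partial t'}\Psi'(\mathbf{r}', t')$
(9.3.24)
where
$\Psi'(\mathbf{r}', t') = \exp\left(-\frac{1}{2}\boldsymbol{\alpha} \cdot \hat{\mathbf{V}}\,\tanh^{-1}\frac{v}{c}\right)\Psi(\mathbf{r}, t)$
(9.3.25)
The unit velocity vector $\hat{\mathbf{V}}$ here represents the direction of the velocity $\mathbf{V}$ of the second frame relative to the first ($v = |\mathbf{V}|$), and $\mathbf{r}'$, $t'$ are related to $\mathbf{r}$, $t$ by a Lorentz transformation. From our discussions in Exc. (9.3.2), it follows that the Klein-Gordon and Dirac equations (9.3.10) and (9.3.16) can be written respectively as:
$(D_\mu^2 + m_0^2c^2)|\psi\rangle = 0$
(9.3.26)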
$(i\gamma_\mu D_\mu + m_0c)|\psi\rangle = 0$ ⁸
(9.3.27)
Recall that the Hamiltonian is the observable associated with energy. Therefore if the kets are taken as energy eigenkets and $A_\mu$ is regarded as time independent, then the above equations can be written as:
$c^2(\mathbf{D}^2 + m_0^2c^2)|\psi\rangle = (E - q\Phi)^2|\psi\rangle$
(9.3.28)
$c(\boldsymbol{\alpha} \cdot \mathbf{D} + \beta m_0c)|\psi\rangle = (E - q\Phi)|\psi\rangle$
(9.3.29)
where E is the energy eigenvalue (see Sec. 9.2). In view of the above discussions, it is evident that beginning with (9.3.26) and (9.3.27), we can revert to Equations (9.3.10) and (9.3.20) in coordinate representation by using the defining equation:
$\Psi(\mathbf{r}, t) = \langle\mathbf{r}|\psi(t)\rangle$
(9.3.30)
(see 9A and Sec. 2). We note that the elements of $\Psi$ are $\Psi_\lambda(\mathbf{r}, t) = \langle\mathbf{r}|\psi_\lambda\rangle$ and the normalization condition expressed for bras and kets, in the case of $\{\Psi_\lambda\}$, is given by:
$\sum_{\lambda=1}^{4}\int \Psi_\lambda^*\Psi_\lambda\,d^3r = 1$
(9.3.31)
Also, as (9.3.27) represents four coupled equations, these four equations can be expressed in coordinate representation using $\Psi_\lambda(\mathbf{r}, t) = \langle\mathbf{r}|\psi_\lambda\rangle$. For instance, the first equation of (9.3.23), in coordinate representation becomes:
(a) $m_0c^2\Psi_1 + cD_3\Psi_3 + c(D_1 - iD_2)\Psi_4 = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)\Psi_1$
(9.3.23)'
In the following we shall discuss an example of momentum representation. From Appendix 9A we know the relation that exists between position and momentum operators. Furthermore, we also know that the coordinate as well as momentum representations can always be obtained for any physical system using the appropriate relations from the set (A.23)-(A.33). To illustrate this, consider a Dirac particle in a spherically symmetric electrostatic field. The Dirac equation (9.3.29) can now be written as:
$(c\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m_0c^2)|\psi\rangle = [E - q\Phi(r)]|\psi\rangle$
(9.3.32)
Multiplication on the left by $\langle\mathbf{p}|$ and the use of (A.25), (A.42) and (A.44) gives us:
$(c\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m_0c^2 - E)\psi(\mathbf{p}) = -q(2\pi\hbar)^{-3/2}\int d^3p'\,F(\mathbf{p}' - \mathbf{p})\,\psi(\mathbf{p}')$
(9.3.33)
where $\psi(\mathbf{p}') = \langle\mathbf{p}'|\psi\rangle$ is the Fourier transform of $\psi(\mathbf{r}')$ (see Exc. 9C.7), and $F(\mathbf{p}' - \mathbf{p})$ is defined as:
$F(\mathbf{p}' - \mathbf{p}) = (2\pi\hbar)^{-3/2}\int d^3r\,\exp[i(\mathbf{p}' - \mathbf{p}) \cdot \mathbf{r}/\hbar]\,\Phi(r)$
(9.3.34)
Note that in contrast to other equations, e.g., (9.3.19) or (9.3.32), which are differential equations, (9.3.33) is an integral equation for the momentum-space wavefunction $\psi(\mathbf{p})$.
⁸ It can be easily recognized that (9.3.27) is the consolidated form of (9.3.23).
3.3 Commuting Observables for a Free Relativistic Dirac Particle
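Our objective here is to obtain a c.s.c.o. (complete set of commuting observables) for the physical system that describes a free relativistic Dirac particle. Beginning with the Hamiltonian and the orbital angular momentum (which we already know in this case), we write their commutator as:
$[H, L_j] = [c\alpha_kp_k, L_j] = c\alpha_k[p_k, L_j] = -i\hbar c\,\epsilon_{jkl}\,\alpha_kp_l$
(9.3.35)
(see (9.3.14) for the expression on the RHS). This gives:
$[H, \mathbf{L}] = -i\hbar c\,\boldsymbol{\alpha} \times \mathbf{p}$
(9.3.36)
showing that H does not commute with $\mathbf{L}$. We are looking for a commuting operator, hence we introduce the matrix operator:
$\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix}$
(9.3.37)
and calculate the bracket $[H, \boldsymbol{\Sigma}]$. In view of (9.3.21) it gives:
$[H, \boldsymbol{\Sigma}] = c\begin{pmatrix} 0 & [\boldsymbol{\sigma} \cdot \mathbf{p}, \boldsymbol{\sigma}] \\ [\boldsymbol{\sigma} \cdot \mathbf{p}, \boldsymbol{\sigma}] & 0 \end{pmatrix} \neq 0$
(9.3.38)
Now using $\sigma_i\sigma_j - \sigma_j\sigma_i = 2i\epsilon_{ijk}\sigma_k$ (see Eq. (7.3.2)) we can write
$[\boldsymbol{\sigma} \cdot \mathbf{p}, \sigma_j] = p_i[\sigma_i, \sigma_j] = 2i\,\epsilon_{ijk}\,p_i\sigma_k$
(9.3.39)
which gives:
$[H, \boldsymbol{\Sigma}] = 2ic\,\boldsymbol{\alpha} \times \mathbf{p}$
(9.3.40)
In order to find an operator that commutes with H, we define two new operators:
$\mathbf{S} = \frac{1}{2}\hbar\boldsymbol{\Sigma}$
(9.3.41a)
and
$\mathbf{J} = \mathbf{L} + \mathbf{S}$.
(9.3.41b)
The operator $\mathbf{J}$ commutes with H: $[H, \mathbf{J}] = 0$. The operators $\mathbf{S}$ and $\mathbf{J}$ are referred to as the spin and total angular momentum operators (recall that we had a brief exposure to the angular momentum operator $\mathbf{J}$ in Sec. 3.7.1 in Chapter 3). These operators satisfy the commutation relations:
$[S_i, S_j] = i\hbar\,\epsilon_{ijk}S_k, \quad [J_i, J_j] = i\hbar\,\epsilon_{ijk}J_k$
(9.3.42)
From (9.3.41)(a) we have (see hints to exercises 3.7.2 and 3.7.3 for explanations):
$\mathbf{S}^2 = \frac{1}{4}\hbar^2\begin{pmatrix} \boldsymbol{\sigma}^2 & 0 \\ 0 & \boldsymbol{\sigma}^2 \end{pmatrix} = \frac{1}{2}\left(\frac{1}{2} + 1\right)\hbar^2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
(9.3.43)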
Consequently, the quantum number s in
$\mathbf{S}^2|\psi\rangle = \hbar^2s(s + 1)|\psi\rangle$
(9.3.44)
is equal to $\frac{1}{2}$. Moreover, if $S_z$ denotes the spin with regard to the z-axis, then
$S_z^2 = \frac{1}{4}\hbar^2\,1$
(9.3.45)
where 1 is the unit 4 x 4 matrix. Thus the possible eigenvalues of $S_z$ are $\pm\frac{1}{2}\hbar$. As a consequence, the Dirac equation describes particles whose spin angular momentum is $\frac{1}{2}\hbar$. It should be noted that in general an energy eigenket is not an eigenket of $L_z$ or $S_z$ since, unlike $J_z = L_z + S_z$, these operators (separately) do not commute with H (see Chapter 3, Subsec. 2.4).
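The matrix statements (9.3.40), (9.3.43) and (9.3.45) can be checked numerically when the momentum components are replaced by ordinary numbers, which is legitimate here because only the matrix parts fail to commute. The sketch assumes the block representation of Exc. 9.3.1; the sample values and helper names are illustrative.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
NEG_ID2 = [[-1, 0], [0, -1]]


def block(a, b, c, d):
    """Assemble the 4 x 4 matrix [[a, b], [c, d]] from 2 x 2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def commutator(a, b):
    """[a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY4 = block(ID2, ZERO2, ZERO2, ID2)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]
BETA = block(ID2, ZERO2, ZERO2, NEG_ID2)
BIG_SIGMA = [block(s, ZERO2, ZERO2, s) for s in SIGMA]  # (9.3.37)


def hamiltonian(p, m0, c):
    """H = c alpha.p + beta m0 c^2 of (9.3.14) for a numerical momentum p."""
    h = scale(m0 * c * c, BETA)
    for j in range(3):
        h = add(h, scale(c * p[j], ALPHA[j]))
    return h


def alpha_cross_p(p):
    """Components of alpha x p for a numerical momentum p."""
    return [
        add(scale(p[2], ALPHA[1]), scale(-p[1], ALPHA[2])),
        add(scale(p[0], ALPHA[2]), scale(-p[2], ALPHA[0])),
        add(scale(p[1], ALPHA[0]), scale(-p[0], ALPHA[1])),
    ]


def spin_squared(hbar=1.0):
    """S^2 = sum_j (hbar Sigma_j / 2)^2, cf. (9.3.41a) and (9.3.43)."""
    total = scale(0, IDENTITY4)
    for s in BIG_SIGMA:
        half = scale(0.5 * hbar, s)
        total = add(total, matmul(half, half))
    return total


if __name__ == "__main__":
    p, m0, c = [0.3, -0.4, 1.2], 1.5, 2.0
    h = hamiltonian(p, m0, c)
    for j in range(3):
        rhs = scale(2j * c, alpha_cross_p(p)[j])
        print(max_abs_diff(commutator(h, BIG_SIGMA[j]), rhs))
    print(spin_squared())
```

The three printed deviations vanish up to rounding, reproducing (9.3.40) componentwise, and `spin_squared()` is 3/4 times the identity, i.e. s(s + 1) with s = 1/2.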
3.4 The Relationship Between Free Klein-Gordon and Dirac Particles
We conclude this section by showing the relationship that exists between a free Klein-Gordon particle and a free Dirac particle. For this purpose we consider the Equations (9.3.4) and (9.3.15) and note that energy and momentum eigenstates of these particles are given in coordinate representation by corresponding plane wave solutions. For the Klein-Gordon particle they are:
$\Psi(\mathbf{r}, t) = n(\mathbf{P})\exp[i(\mathbf{P} \cdot \mathbf{r} - Et)/\hbar]$
(9.3.46)
where $\mathbf{P}$ and E⁹ are the momentum and energy eigenvalues that satisfy the relation:
$E = \pm(\mathbf{P}^2c^2 + m_0^2c^4)^{1/2}$
(9.3.47)
and n is a normalization constant. When E is negative, the solution $\Psi(\mathbf{r}, t)$ represents an anti-particle. In the case of the Dirac equation (9.3.15), the eigenstates of energy and momentum are:
$\Psi(\mathbf{r}, t) = u(\mathbf{P})\exp[i(\mathbf{P} \cdot \mathbf{r} - Et)/\hbar]$
(9.3.48)
The u here is a column vector:
$u = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix}$
(9.3.49)
that satisfies:
$(c\boldsymbol{\alpha} \cdot \mathbf{P} + \beta m_0c^2 - E)u = 0$
(9.3.50)
Written out in full, the above equation represents four linear equations:
$(m_0c^2 - E)u_1 + 0u_2 + cP_zu_3 + cP_-u_4 = 0$
$0u_1 + (m_0c^2 - E)u_2 + cP_+u_3 - cP_zu_4 = 0$
$cP_zu_1 + cP_-u_2 - (m_0c^2 + E)u_3 + 0u_4 = 0$
$cP_+u_1 - cP_zu_2 + 0u_3 - (m_0c^2 + E)u_4 = 0$
(9.3.51)
⁹ Some texts define E only with a plus sign (see, for instance, [33]); we prefer E to stand for a positive as well as a negative eigenvalue of energy.
where $P_\pm = P_x \pm iP_y$. We know that for the existence of non-zero solutions for $u_i$, the determinant formed by the coefficients must vanish. As can be easily checked, this determinant equals $(E^2 - c^2\mathbf{P}^2 - m_0^2c^4)^2$, which shows that the energy and momentum of a free Dirac particle also satisfy (9.3.47). From (9.3.51) it follows that any two of the $u_i$'s (say $u_3$, $u_4$) can be expressed in terms of the other two (say $u_1$, $u_2$). We give below solutions corresponding to $u_2 = 0$ and $u_1 = 0$, denoting them as $u^{(1)}$ and $u^{(2)}$. They are easily seen to be:
$u^{(1)}(\mathbf{P}) = n(\mathbf{P})\begin{pmatrix} E + m_0c^2 \\ 0 \\ cP_z \\ cP_+ \end{pmatrix}, \quad u^{(2)}(\mathbf{P}) = n(\mathbf{P})\begin{pmatrix} 0 \\ E + m_0c^2 \\ cP_- \\ -cP_z \end{pmatrix}$
(9.3.52)
Here $n(\mathbf{P})$ is the normalization constant (the role of $\mathbf{P}$ as an argument in $n(\mathbf{P})$ will soon be clear when we consider the Dirac equation in the rest frame). The solutions to Eq. (9.3.50) given above are the ones for E > 0; for E < 0, they vanish in the rest frame ($E = -m_0c^2$) of the particle as $P_\pm$ and $P_z$ are zero. For E < 0, the non-zero solutions correspond to $u_4 = 0$ and $u_3 = 0$. These are respectively:
$u^{(3)}(\mathbf{P}) = n(\mathbf{P})\begin{pmatrix} -cP_z \\ -cP_+ \\ m_0c^2 - E \\ 0 \end{pmatrix}, \quad u^{(4)}(\mathbf{P}) = n(\mathbf{P})\begin{pmatrix} -cP_- \\ cP_z \\ 0 \\ m_0c^2 - E \end{pmatrix}$
(9.3.53)
The states corresponding to solution (9.3.53) (since E < 0) are associated with anti-particles (see Chapter 5 in Bjorken and Drell).
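The determinant condition and the spinor solutions of this subsection can be checked numerically. The sketch below builds the coefficient matrix of (9.3.51) for sample values (c = 1, $m_0$ = 1 and the momentum are arbitrary assumptions), evaluates its determinant by Gaussian elimination, and tests the unnormalized solutions $u^{(1)}$ and $u^{(3)}$, whose component form is taken from the reconstruction of (9.3.52) and (9.3.53).

```python
import math


def coefficient_matrix(c, m0, p, energy):
    """Coefficient matrix of the linear system (9.3.51)."""
    pz = p[2]
    p_plus = p[0] + 1j * p[1]
    p_minus = p[0] - 1j * p[1]
    rest = m0 * c * c
    return [
        [rest - energy, 0, c * pz, c * p_minus],
        [0, rest - energy, c * p_plus, -c * pz],
        [c * pz, c * p_minus, -(rest + energy), 0],
        [c * p_plus, -c * pz, 0, -(rest + energy)],
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(x) for x in row] for row in matrix]
    n = len(a)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return det


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def u1(c, m0, p, energy):
    """Unnormalized positive-energy solution with u_2 = 0."""
    return [energy + m0 * c * c, 0, c * p[2], c * (p[0] + 1j * p[1])]


def u3(c, m0, p, energy):
    """Unnormalized negative-energy solution with u_4 = 0."""
    return [-c * p[2], -c * (p[0] + 1j * p[1]), m0 * c * c - energy, 0]


if __name__ == "__main__":
    c, m0, p = 1.0, 1.0, [0.3, -0.4, 1.2]
    p_sq = sum(x * x for x in p)
    trial_energy = 0.7
    print(determinant(coefficient_matrix(c, m0, p, trial_energy)))
    print((trial_energy ** 2 - c * c * p_sq - m0 ** 2 * c ** 4) ** 2)
    e_pos = math.sqrt(c * c * p_sq + m0 ** 2 * c ** 4)
    print(apply(coefficient_matrix(c, m0, p, e_pos), u1(c, m0, p, e_pos)))
```

For an energy off the mass shell the two printed numbers agree, and on the mass shell the last line prints a vector of (numerically) vanishing entries.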
3.5 The Dirac Equation in Rest Frame
We see next that solutions (9.3.52) and (9.3.53) can be obtained by considering the Dirac equation:
$\beta m_0c^2\,\Psi = E\Psi$
(9.3.54)
in the rest frame. Evidently, as $\mathbf{P}$ is zero, from (9.3.48) we have
$\Psi = u\exp(-iEt/\hbar)$
(9.3.55)
with $E = \pm m_0c^2$.¹⁰ For $E = m_0c^2$, the analogues to (9.3.52) of the column vector u are:
$u^{(1)}(0) = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad u^{(2)}(0) = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$
(9.3.56)
¹⁰ The assumption of rest frame implies that $\mathbf{P}$ is zero; this justifies the use of $u^{(1)}(0)$ in place of $u^{(1)}(\mathbf{P})$, and shows the dependence of the normalization constant $n(\mathbf{P})$ on $\mathbf{P}$.
[Page 467 of the source, containing Eqs. (9.3.57)-(9.3.63), is illegible in this copy.]
where $\mathbf{E}$ and $\mathbf{B}$ are the electric and magnetic fields (see Sec. 6.4 and also Exc. 3). Accordingly we obtain:
$\left\{c^{-2}\left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)^2 - (\mathbf{p} - q\mathbf{A})^2 - m_0^2c^2 + q\hbar\,(\mathbf{B} + ic^{-1}\mathbf{E}) \cdot \boldsymbol{\sigma}\right\}|\phi\rangle = 0$
(9.3.64)
r) 1 ^ co • {p - <jA) + ih— - q<& + m0c2 \<j>) W)= r \ (9.3.65) d co • (p - qk) + ih— q<S? + m0c2 \
Exercise 9.3
1. Obtain a matrix representation of the operators $(\alpha_j)$, $\beta$ and $\gamma$ that are used in the Dirac equations (9.3.15) and (9.3.16), and show that using these, the Hamiltonian can be expressed as in (9.3.21).
2. Show that using the Dirac ket and bra, Dirac equations can be written as four different coupled partial differential equations given in (9.3.23).
3. Show that for a charged particle in an external electromagnetic field, the Gamma matrices $(\gamma_\mu)$ and covariant derivatives $(D_\mu)$ satisfy (9.3.63).
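Hints to Exercise 9.3
1. Each of these operators is linear, hence they can be represented by matrices with real or complex entries. However, since H is Hermitian, these matrices have to be Hermitian. In view of the Hermitian property (see [34]), they are square matrices of order N (say). Therefore from (9.3.13)(a) and (b) it follows that their determinants satisfy:
(i) $|\alpha_j||\alpha_k| = (-1)^N|\alpha_k||\alpha_j| \quad (j \neq k)$
(ii) $|\alpha_j||\beta| = (-1)^N|\beta||\alpha_j|$
And since neither $|\alpha_j|$ nor $|\beta|$ is zero, the order N has to be even. Moreover, from the property of unitary matrices, we have that if $\alpha_j$ and $\beta$ satisfy (9.3.13), then so do $U^\dagger\alpha_jU$ and $U^\dagger\beta U$ for any unitary matrix U. This means that one of these matrices can be diagonalized. We take that to be $\beta$; from (9.3.13)(c) the eigenvalues of $\beta$ are $\pm 1$, hence:
(iii) $\beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
We write $\alpha_j$ as:
(iv) $\alpha_j = \begin{pmatrix} a_j & b_j \\ c_j & d_j \end{pmatrix}$
and since $\beta$ and $\alpha_j$ are of order N, the matrix elements in them, i.e., in (iii) and (iv), are of order $\frac{1}{2}N$. Substituting these in (9.3.13)(a) and (b), we find that $a_j = d_j = 0$, and
(v) $b_jc_k + b_kc_j = c_jb_k + c_kb_j = 2\delta_{jk}$.
For j = k it follows from (v) that $c_j = b_j^{-1}$. For $j \neq k$, the equality $b_jc_k + b_kc_j = 0 = c_jb_k + c_kb_j$ cannot be satisfied for real or complex numbers, i.e., it cannot be satisfied for N = 2. For N = 4, the $b_j$ etc. are (2 x 2) matrices and they satisfy:
(vi) $b_j = l\sigma_j, \quad c_j = m\sigma_j$
where $\sigma_j$ are Pauli matrices and the constants l, m obey:
(vii) $lm = 1$.
If we choose l = 1, then we have:
(viii) $\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}$
and in view of (9.3.17)
(ix) $\boldsymbol{\gamma} = \begin{pmatrix} 0 & -i\boldsymbol{\sigma} \\ i\boldsymbol{\sigma} & 0 \end{pmatrix}, \quad \gamma_4 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
We substitute the value of $\boldsymbol{\alpha}$ and $\beta$ in $H = c\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m_0c^2$ and obtain:
(x) $H = c\begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \cdot \mathbf{p} + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}m_0c^2 = c\begin{pmatrix} m_0c & \boldsymbol{\sigma} \cdot \mathbf{p} \\ \boldsymbol{\sigma} \cdot \mathbf{p} & -m_0c \end{pmatrix}$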
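2. Recall that the Dirac equations (9.3.16) and (9.3.20), for a free particle and for a particle with charge, are written by treating all variables $x_\mu$ ($\mu$ = 1, 2, 3, 4) in the same manner. To establish the results of this exercise, we treat time as a parameter, and the space and momentum coordinates as operators along with the operators $\boldsymbol{\alpha}$ and $\beta$. These operators are then the fundamental dynamical variables of a given system. We note that the use of these operators allows us to write the equation explicitly in terms of Pauli matrices and the Dirac kets:
$|\psi\rangle = \begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix}$
We use (iii), (viii) and (ix) of Exc. (3.1) and (9.3.6)-(9.3.9) to make the required substitutions in (9.3.20) and obtain the matrix form (i) below:
(i) $\left[\begin{pmatrix} 0 & \sigma_1 \\ -\sigma_1 & 0 \end{pmatrix}D_1 + \begin{pmatrix} 0 & \sigma_2 \\ -\sigma_2 & 0 \end{pmatrix}D_2 + \begin{pmatrix} 0 & \sigma_3 \\ -\sigma_3 & 0 \end{pmatrix}D_3 + i\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}D_4 + m_0c\right]|\psi\rangle = 0$
Introducing the standard Pauli matrices:
(ii) $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
we simplify the above equation into its explicit four-component form:
(iii) $\begin{pmatrix} iD_4 & 0 & D_3 & D_1 - iD_2 \\ 0 & iD_4 & D_1 + iD_2 & -D_3 \\ -D_3 & -D_1 + iD_2 & -iD_4 & 0 \\ -D_1 - iD_2 & D_3 & 0 & -iD_4 \end{pmatrix}\begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix} + m_0c\begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix} = 0$
Note that each $D_\mu = p_\mu - qA_\mu$, in particular $D_4 = -i\hbar\frac{\partial}{\partial x_4} - q\frac{i}{c}\Phi$. We replace $D_4$ by this expression and change $\frac{\partial}{\partial x_4}$ to $\frac{1}{ic}\frac{\partial}{\partial t}$. Multiplication of the matrix with the column vector then leads to equations (9.3.23). Evidently they are coupled since we see here partial derivatives with regard to all space variables operating on all different Dirac kets.
3. We use the fact that any product ab can be put as (i)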
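$ab = \frac{1}{2}(\{a, b\} + [a, b])$
to write:
(ii) $\gamma_\mu\gamma_\nu D_\mu D_\nu = \frac{1}{2}\gamma_\mu\gamma_\nu(\{D_\mu, D_\nu\} + [D_\mu, D_\nu])$.
Since $\gamma_\mu$, $\gamma_\nu$ satisfy (9.3.18), (ii) becomes:
(iii) $\gamma_\mu\gamma_\nu D_\mu D_\nu = \frac{1}{2}\{\gamma_\mu, \gamma_\nu\}D_\mu D_\nu + \frac{1}{2}\gamma_\mu\gamma_\nu[D_\mu, D_\nu]$
In view of (9.3.6) and (9.3.8), $[D_\mu, D_\nu]$ can be written as:
(iv) $[D_\mu, D_\nu] = iq\hbar\left[\frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu}\right]$
As a result: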
(v) But
and
(vii)
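$[\gamma_i, \gamma_j] = -\beta\alpha_i\beta\alpha_j + \beta\alpha_j\beta\alpha_i = \alpha_i\alpha_j - \alpha_j\alpha_i = \begin{pmatrix} \sigma_i\sigma_j - \sigma_j\sigma_i & 0 \\ 0 & \sigma_i\sigma_j - \sigma_j\sigma_i \end{pmatrix} = 2i\,\epsilon_{ijk}\Sigma_k$
(We have used here the equalities (9.3.13), (9.3.17) and (9.3.37) and equality (viii) of Exc. 1.) Now the magnetic field for a particle of charge q is given as: $\mathbf{B} = \nabla \times \mathbf{A}$, or componentwise as:
(viii) $B_i = \epsilon_{ijk}\frac{\partial A_k}{\partial x_j}$
Hence substituting these expressions given in (vi), (vii) and (viii) into (v) and using the product of antisymmetric tensors appropriately, we have the required equality (9.3.63) in the composite form:
$\gamma_\mu\gamma_\nu D_\mu D_\nu = D_\mu^2 - q\hbar\,\boldsymbol{\Sigma} \cdot \mathbf{B} + ic^{-1}q\hbar\,\boldsymbol{\alpha} \cdot \mathbf{E}$
4 GAUGE FIELD QUANTIZATIONS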
We had a brief exposure to quantum theory techniques in the previous sections and appendices and are now in a position to follow it up with the learning of gauge field quantizations. Since gauge fields are geometric in nature (connections on a fiber bundle), they cannot be quantized by standard methods as those methods lead to difficulties and contradictions. The methods that overcame this problem were suggested for the first time in 1963 by Feynman [13b] for Yang-Mills theory and were improved upon in 1967 by DeWitt [36], and in a separate paper by Faddeev and Popov [10]. In our study here (to a large extent), we follow the approach of Faddeev and Popov (FP) which uses the functional integral formalism. Our main emphasis, however, is on learning the Feynman graph techniques, since it is these techniques that are used (in removing the unwanted divergences) in string theory and in super theories in general. The integrals in the FP approach are calculated over the surfaces of the manifolds of all gauge fields. These gauge fields are represented as points on the surface by their respective classes. Recall that two gauge fields belong to one and the same class if one is a gauge transform of another (e.g., $A_\mu$ and $A_\mu + \partial_\mu\Lambda$ in electrodynamics (see Sec. 6.4)). Hence all gauge-equivalent fields correspond to one single point on the surface.
To understand the quantization of gauge fields, we shall begin with the formulation of rules of quantization for fields of general nature, e.g., scalar fields or Bose and Fermi fields. In the process we shall introduce the notion of functional integral for fields. The Green's function that plays an important role in quantum theory (see Appen. 9C) will now be seen as a functional integral. The generating function of a field, and the propagator or the Feynman's Green function will be defined, and the functional integral for Bose and Fermi fields will be obtained. We would like to mention here that an important tool of quantum mechanics—the path integral formalism which should chronologically precede the discussions here is relegated to the next section. Due to its applicability in areas of physics other than quantum theories, e.g., string theories, we feel the topic deserves a separate section. We introduce it there from first principles and show (in Sec. 6) how it leads to Feynman graphs.
4.1 Feynman's Functional Integral Consider a classical action: S(t0, t) = \' (p(z)q(r) - H(.q(r),p(r)))dT= f / dx
(9.4.1)
that corresponds to the trajectory $(q(\tau), p(\tau))$ $(t_0 \le \tau \le t)$ defined in the phase space, where p is (as usual) the momentum canonically conjugate to q. To determine $S(t_0, t)$ we take the mean value over the intermediary trajectories (to be defined shortly). This mean value, known as Feynman's functional integral, is defined as a limit of the finite-dimensional integrals obtained from the given trajectory in the following manner (see also Sec. 5). The interval $[t_0, t]$ is divided into N equal parts by the points $\tau_1, \tau_2, \ldots, \tau_{N-1}$. The momentum function $p(\tau)$ is assumed to be constant in each of the intervals $(\tau_i, \tau_{i+1})$ $(i = 0, 1, \ldots, N-1)$, where it is denoted $p_{i+1}$. The coordinate function $q(\tau)$, on the other hand, is viewed as a continuous, piecewise linear function with values $q_i$ at the points $\tau_i$. At the end points $t_0$ and $t$, $q(\tau)$ is assumed to take the fixed values $q_0$ and $q$ respectively. The trajectory $(q(\tau), p(\tau))$ is thus defined by the parameters $q_1, \ldots, q_N, p_1, \ldots, p_N$, and since $q_N$ is the constant $q$ by our assumption, the integral (9.4.1) is replaced by a $(2N - 1)$-dimensional integral:
$$\int dp_1\,dq_1 \cdots dp_{N-1}\,dq_{N-1}\,dp_N\; I \qquad (9.4.2)$$
where the integrand $I$ is suitably altered in terms of these parameters. The limit of this integral as $N \to \infty$ is the required mean value (Feynman's functional integral). We are particularly interested in the mean value of the finite dimensional integral obtained by using the exponential of $S(t_0, t)$:
$$(2\pi)^{-N}\int dp_1\,dq_1 \cdots dq_{N-1}\,dp_N\,\exp(iS(t_0, t)) = J_N(q_0, q; t_0, t) \qquad (9.4.3)$$
The limit of this integral as $N \to \infty$ is equal to the matrix element11 of the evolution operator $U(t, t_0) = \exp(-i(t - t_0)H)$:
$$\lim_{N\to\infty} J_N(q_0, q; t_0, t) = \langle q|\exp(-i(t - t_0)H)|q_0\rangle \qquad (9.4.4)$$
11 See the definition of matrix element in the Hint to Exc. (9.4.1).
(see Eq. (9B.19), Exc. 9B.1 and Sec. 5; we have taken $\hbar = 1$ in (9.4.4)). In Exc. (4.1) we shall prove for a simple case the validity of the assertion made in (9.4.4). The functional integral obtained above is denoted symbolically as:
$$\int_{q(t_0) = q_0}^{q(t) = q} \exp(iS(t_0, t))\,\prod_{\tau}\frac{dp(\tau)\,dq(\tau)}{2\pi} \qquad (9.4.5)$$
By definition this is the functional integral expression of the evolution operator matrix element. We shall derive it in Sec. 5 using the operator formalism.
4.2 Functional Integral of a Scalar Field
In order to study the quantization of fields, we shall begin by obtaining the functional integral for a scalar field $\phi$ with self-interaction. The action integral here is:
$$S = \int \Bigl(\tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \tfrac{m^2}{2}\,\phi^2 - \tfrac{g}{3!}\,\phi^3\Bigr)\,d^4x \qquad (9.4.6)$$
The field function $\phi(x)$ enters S through the free (quadratic) part and through the integral of $-\frac{g}{3!}\phi^3$, which describes the self-interaction with coupling constant g. To define the functional integral over all fields, we use the finite-dimensional approximation described below. We take a large cubic volume V embedded in the space $V_4$ and divide it into $N^4$ equal small cubes $v_i$ $(i = 1, 2, \ldots, N^4)$. We then approximate the function $\phi(x)$ in the volume V by treating it as a constant function in each of the $v_i$'s. We assume that the first derivative $\partial\phi/\partial x^\mu$ is the finite difference:
$$\frac{1}{\Delta l}\bigl[\phi(x^\nu + \delta^\nu_\mu\,\Delta l) - \phi(x^\nu)\bigr] \qquad (9.4.7)$$
where $\Delta l$ is the length of the edge of the cube $v_i$. The function $\phi(x)$ is approximated by the values of piecewise constant functions in the volumes $v_i$. Using this approximation rule, we consider the finite-dimensional integral
$$\int \exp(iS)\,\prod_{i=1}^{N^4}\prod_{x \in v_i} d\phi(x) \qquad (9.4.8)$$
over the values of the function $\phi(x)$ in the volumes $v_i$. The action S involves these approximated values of $\phi(x)$.
Note that we have used elsewhere $\eta_{\mu\nu} = (-1, +1, +1, +1)$ for the Minkowskian metric.
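4.3 Green's Function and Generating Functional
The Green's function is the expectation value of the product of two or more field functions weighted by $\exp(iS)$. In the case of two fields, it is the two point function defined by the formula (9.4.10) below:
$$G(x, y) \equiv -i\,\langle\phi(x)\phi(y)\rangle = -i \lim_{\substack{V\to\infty \\ \Delta l\to 0}} \frac{\int \exp(iS)\,\phi(x)\phi(y)\,\prod_{i=1}^{N^4}\prod_{x\in v_i} d\phi(x)}{\int \exp(iS)\,\prod_{i=1}^{N^4}\prod_{x\in v_i} d\phi(x)} \qquad (9.4.10)$$
The limit of the expression on the RHS is usually denoted as:
$$\frac{\int \exp(iS)\,\phi(x)\phi(y)\,\prod_x d\phi(x)}{\int \exp(iS)\,\prod_x d\phi(x)} \qquad (9.4.11)$$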
Associated with the fields of a physical system are the generating functionals, which are used to determine the Green's functions. The generating functional13 in this case, in terms of an arbitrary function $J(x)$14, is:
$$Z[J] = \frac{\int \exp i\bigl(S + \int J(x)\phi(x)\,d^4x\bigr)\,\prod_x d\phi(x)}{\int \exp(iS)\,\prod_x d\phi(x)} \qquad (9.4.12)$$
The two point Green's function is now given by the formula:
$$G(x, y) = i\,\frac{\delta^2 Z[J]}{\delta J(x)\,\delta J(y)}\bigg|_{J=0} \qquad (9.4.13)$$
When the component $S_I$ (the interaction part of S) is ignored, the calculation of $G(x, y)$ using (9.4.13) reduces to
$$G_0(x, y) = D(x, y) \qquad (9.4.14)$$
where $D(x, y)$ is the solution of the operator equation:
$$-(\Box + m^2)\,D(x, y) = \delta(x, y) \qquad (9.4.15)$$
which we studied in Appendix 9C (see 9C.10 and 9C.25). (See Exc. 2 for the evaluation of $Z[J]$ and the Green's function in the case of a free field theory.) In Exc. 2 we mention the non-uniqueness of the above solution; to circumvent this problem, we replace $\exp(iS)$ by $\exp(iS_\varepsilon)$, where $S_\varepsilon$ is a complex action dependent on a nonnegative parameter $\varepsilon$:
$$S_\varepsilon = \frac{1}{2}\int \phi\,(-\Box - m^2 + i\varepsilon)\,\phi\;d^4x \qquad (9.4.16)$$
13 See Sec. (9.5) for the derivation of the generating functional.
14 In Sec. 5 we shall see that J has a physical meaning. J is denoted as $\eta$ in some texts (see, for instance, [26]).
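The action $S_\varepsilon$ is chosen in such a manner that the absolute value of $\exp(iS_\varepsilon)$ is less than one and vanishes when $\int \phi^2\,d^4x \to \infty$. The Green's function obtained from $S_\varepsilon$ when $\varepsilon \to +0$ is unambiguous, as $D(x, y)$ becomes the limit of the Green's function of the operator $(-\Box - m^2 + i\varepsilon)$. The function $D(x, y)$ depends on the difference $(x - y)$ and is given by:
$$D(x - y) = \frac{1}{(2\pi)^4}\int \frac{d^4k\; e^{ik(x-y)}}{k^2 - m^2 + i\varepsilon} \qquad (9.4.17)$$
where k is the momentum four-vector*. The limit of this function for $\varepsilon \to +0$ is denoted as $D_F(x - y)$ and is called the propagator or Feynman's Green function. Essentially we have thus shown that in the theory of free fields, the expectation value of the product of two fields is
$$\langle\phi(x)\phi(y)\rangle_0 = i\,D_F(x - y) \qquad (9.4.18)$$
More generally, with $Z_0[J]$ denoting (9.4.12) evaluated with the free action $S_0$, the expectation value of a product of n fields is
$$\langle\phi(x_1)\cdots\phi(x_n)\rangle_0 = (-i)^n\,\frac{\delta^n Z_0[J]}{\delta J(x_1)\cdots\delta J(x_n)}\bigg|_{J=0} = \frac{\int \exp(iS_0)\,\phi(x_1)\cdots\phi(x_n)\,\prod_x d\phi(x)}{\int \exp(iS_0)\,\prod_x d\phi(x)} \qquad (9.4.19)$$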
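It is worth noting here that the expectation value in (9.4.19) is zero when n is odd15, and for even n it can be expressed as the sum of products of expectation values of pairs taken over all possible combinations. This result is called Wick's theorem. For n = 4 it reads:
$$\langle\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\rangle_0 = \langle\phi(x_1)\phi(x_2)\rangle_0\langle\phi(x_3)\phi(x_4)\rangle_0 + \langle\phi(x_1)\phi(x_3)\rangle_0\langle\phi(x_2)\phi(x_4)\rangle_0 + \langle\phi(x_1)\phi(x_4)\rangle_0\langle\phi(x_2)\phi(x_3)\rangle_0 \qquad (9.4.20)$$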
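The pairing rule behind (9.4.20) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `pairings` and `wick`, and the toy table `G` standing in for the pair expectation values, are introduced here and do not come from the text.

```python
def pairings(points):
    """Yield every way of splitting the list `points` into unordered pairs."""
    if not points:
        yield []          # nothing left to pair: exactly one (empty) pairing
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick(points, pair_value):
    """Sum over all complete pairings of the product of pair values."""
    total = 0
    for pairing in pairings(list(points)):
        term = 1
        for a, b in pairing:
            term *= pair_value(a, b)
        total += term
    return total


# Hypothetical symmetric table standing in for <phi(x_a) phi(x_b)>_0.
G = {(1, 2): 2, (1, 3): 3, (1, 4): 5, (2, 3): 7, (2, 4): 11, (3, 4): 13}


def pair(a, b):
    return G[(min(a, b), max(a, b))]


four_point = wick([1, 2, 3, 4], pair)   # three terms, as in (9.4.20)
three_point = wick([1, 2, 3], pair)     # an odd number of fields cannot be fully paired
```

For an even number n of fields the generator produces (n - 1)!! pairings, so the four-point function has three terms and the six-point function fifteen; for odd n no complete pairing exists and the sum is empty.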
Next we return to the full action $S = S_0 + S_I$16 in order to compute the contribution of $\exp(iS_I)$ in $\exp(iS)$, where:
$$\exp(iS) = \exp(iS_0)\,\exp(iS_I) \qquad (9.4.21)$$
* To follow the usual practice in the literature, the 4-vectors k, x, y etc. are not denoted by bold letters here.
15 See Sec. 5 as to why the expectation value for odd n is zero.
16 $S_I = -\frac{g}{3!}\int \phi^3(x)\,d^4x$
This is done by using the perturbation theory technique17, which is based on the expansion of $\exp(iS_I)$ as a series in g:
$$\exp(iS_I) = \sum_{n=0}^{\infty}\frac{(-ig)^n}{n!\,(3!)^n}\int \phi^3(x_1)\cdots\phi^3(x_n)\;d^4x_1\cdots d^4x_n \qquad (9.4.22)$$
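This series can be integrated term by term. Substituting the result in the 2-point Green's function (9.4.10) and using (9.4.11), we obtain the following expression:
$$G(x, y) = -i\,\frac{\sum_{n=0}^{\infty}\frac{(-ig)^n}{n!\,(3!)^n}\int \exp(iS_0)\,\phi(x)\phi(y)\,\phi^3(x_1)\cdots\phi^3(x_n)\,\prod_x d\phi(x)\;d^4x_1\cdots d^4x_n}{\sum_{n=0}^{\infty}\frac{(-ig)^n}{n!\,(3!)^n}\int \exp(iS_0)\,\phi^3(x_1)\cdots\phi^3(x_n)\,\prod_x d\phi(x)\;d^4x_1\cdots d^4x_n} \qquad (9.4.23)$$
If both the numerator and the denominator are divided by
$$\int \exp(iS_0)\,\prod_x d\phi(x) \qquad (9.4.24)$$
then the n-th term of the denominator stands for the expectation value:
$$\langle\phi^3(x_1)\cdots\phi^3(x_n)\rangle_0 = \frac{\int \exp(iS_0)\,\phi^3(x_1)\cdots\phi^3(x_n)\,\prod_x d\phi(x)}{\int \exp(iS_0)\,\prod_x d\phi(x)} \qquad (9.4.25)$$
and the n-th term of the numerator for the expectation value:
$$\langle\phi(x)\phi(y)\,\phi^3(x_1)\cdots\phi^3(x_n)\rangle_0 = \frac{\int \exp(iS_0)\,\phi(x)\phi(y)\,\phi^3(x_1)\cdots\phi^3(x_n)\,\prod_x d\phi(x)}{\int \exp(iS_0)\,\prod_x d\phi(x)} \qquad (9.4.26)$$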
17 Given an operator $A_0$ with eigenvalue $e_0$, another operator $A_0 + \alpha B$, where $|\alpha|$ is small, is called a perturbation of $A_0$. The eigenvalues of this new operator that lie near $e_0$ are of great interest, and so are their relations to B and their properties as functions of $\alpha$. In quantum mechanics there are 'formal' series for the perturbed eigenvalues. These series are known as perturbation series and are often given in terms of $\alpha$. (See Chapter 17 in [24].)

4.4 Diagram Technique for Scalar Field Theory
The FP (Faddeev and Popov) procedure is based on constructing the prediagrams associated with a given expectation value. For instance, to every n-th order expectation value of the type (9.4.25) there is assigned a diagram made of n pseudo-euclidean points (with three lines jutting out from each of them). These n points are called the vertices of the prediagram. For n = 4 this prediagram (showing the vertices also) is:
[Figure (9.4.27): prediagram of four vertices, each with three lines jutting out]
Similarly (for n = 4) the prediagram assigned to the expectation value (9.4.26) is of the form:
[Figure (9.4.28): prediagram of four three-legged vertices $x_1, x_2, x_3, x_4$ together with two one-legged points x and y]
Note that we have added here two points (each having one leg) that correspond to the points x and y in $V_4$. These diagrams are symmetric with regard to the permutation of the n points $x_1, \ldots, x_n$ and also with regard to the permutation of the three lines at each point. Hence there is a symmetry group $G_n$ that leaves the prediagram invariant; the order of this group can easily be seen to be $R_n = n!\,(3!)^n$. The symmetry of the prediagram is exhibited in the symmetry of the expectation values (9.4.25) and (9.4.26), as these expectation values remain unaltered under any permutation of their arguments, and under permutation in any of the triplets of the field functions $\phi(x_i)\phi(x_i)\phi(x_i) = \phi^3(x_i)$. In view of Wick's theorem, we know that the expectation values (9.4.25) and (9.4.26) can be expressed as sums of products of all possible expectation values of field function pairs. If the pair expectation value $\langle\phi(x_i)\phi(x_j)\rangle_0$ is present in such a product, we connect the points $x_i$, $x_j$ with a line, and thus assign a diagram to every arrangement of the fields into pairs. The number of lines is equal to the number of pairs and thus equals half the number of field functions. The diagrams that result from the prediagrams (9.4.27) and (9.4.28) after using these connecting lines are respectively:
[Figure (9.4.29): diagrams obtained from the prediagram (9.4.27)]
and
[Figure (9.4.30): diagrams obtained from the prediagram (9.4.28)]
In order to obtain the expression corresponding to a diagram one has to integrate the product of the pair expectation values over $x_1, x_2, \ldots, x_n$ and multiply it by the factors $(-ig)^n/R_n$ and $R_n/r_{n,d}$. Here $r_{n,d}$ is the order of the symmetry group of the diagram constructed from the prediagram by joining its vertices with lines. Since $R_n$ is the order of the symmetry group of the prediagram, the ratio $R_n/r_{n,d}$ gives the number of ways in which a given diagram can be obtained from a prediagram.
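The counting statement $R_n/r_{n,d}$ can be checked by brute force in the smallest non-trivial case, n = 2 vertices with three legs each. The sketch below is an illustration only: the function names and the (vertex, leg) labelling are assumptions made here, and the two diagram classes are distinguished simply by the number of lines that start and end on the same vertex.

```python
from collections import Counter
from math import factorial


def pairings(points):
    """Yield every way of splitting the list `points` into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def prediagram_order(n):
    """R_n = n! (3!)^n, the order of the symmetry group of the prediagram."""
    return factorial(n) * factorial(3) ** n


# Legs of the n = 2 prediagram of type (9.4.25), labelled (vertex, leg).
legs = [(vertex, leg) for vertex in range(2) for leg in range(3)]

# Classify each complete pairing by its number of same-vertex lines.
ways = Counter()
for pairing in pairings(legs):
    self_loops = sum(1 for a, b in pairing if a[0] == b[0])
    ways[self_loops] += 1

R2 = prediagram_order(2)
# r_{n,d} recovered from "number of ways" = R_n / r_{n,d}.
orders = {loops: R2 // count for loops, count in ways.items()}
```

Of the 15 pairings of the six legs, 6 join the two vertices by three lines and 9 give two loops connected by one line; with $R_2 = 72$ this corresponds to symmetry group orders 12 and 8 for the two diagrams.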
Finally, to express a given Green's function (expectation value) in terms of these diagrams and vice versa, we need to set up the rules of correspondence between the basic elements of a diagram, namely the vertices and the lines, and the 'elementary Green's functions' for a pair of points. To achieve this, we assign the Green's function $D_F(x_i - x_j)$ to the line joining the points $x_i$, $x_j$, and a factor of the coupling constant g to every vertex-point. Since $D_F(x - y)$, in view of (9.4.18), differs from the expectation value by the factor i, in effect we have set up the correspondence rule (9.4.31) given below for the expectation value as well:
line joining $x_i$ and $x_j$ $\;\longleftrightarrow\; D_F(x_i - x_j)$
vertex $\;\longleftrightarrow\; g$ (9.4.31)
The expression for a diagram is actually obtained only when the product formed by the contributions that correspond to the elements of the diagram is integrated over the coordinates of the vertices and multiplied by the factor $(i)^{l-n-1}(r_{n,d})^{-1}$, where l is the number of lines, n is the number of vertices, and $r_{n,d}$ is the order of the symmetry group of the diagram. Before closing our study of diagram techniques in this section, we make three important comments regarding it. We shall return to these discussions again in Sec. 6.
Comment 9.4.1 In the computation of Green's functions only those diagrams enter that are 'connected.' A diagram is said to be connected if it is possible to go from any given vertex of the diagram to any other vertex by moving along the diagram lines.
Comment 9.4.2 The diagram technique in the momentum space is defined by using the Fourier transforms $\tilde\phi(k)$ of the field functions $\phi(x)$, which enter through the integral
$$\int \exp(ikx)\,\tilde\phi(k)\,d^4k \qquad (9.4.32)$$
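The expectation values of the type
$$\langle\tilde\phi(k_1)\cdots\tilde\phi(k_n)\rangle \qquad (9.4.33)$$
play the role of Green's functions in the momentum space. The correspondence rule (9.4.31) takes the following form in this case (see (9.4.17)):
line with momenta $k_1$, $k_2$ $\;\longleftrightarrow\; \delta(k_1 + k_2)\,(k_1^2 - m^2 + i\varepsilon)^{-1}$
vertex with momenta $k_1$, $k_2$, $k_3$ $\;\longleftrightarrow\; g\,\delta(k_1 + k_2 + k_3)$ (9.4.34)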
The contribution of a particular diagram is now obtained by considering the product of the expressions for all its elements using the rule (9.4.34) and then integrating it over all internal momenta. The multiplication factor here is $\bigl(\frac{i}{(2\pi)^4}\bigr)^{l-n-1}(r_{n,d})^{-1}$.
Comment 9.4.3 The diagram technique introduced here is based on the functional integral approach, the approach used by Feynman and later by Faddeev and Popov. In most books, however, the operator method is used (see for instance [7]). In the next example we show how a functional integral, which we have so far seen as an abstract entity, can be transformed into an integral of a (familiar) Hamiltonian form:
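Example 9.4.4
Consider the functional integral
$$\int \exp(iS)\,\prod_x d\phi(x) \qquad (9.4.35)$$
given in the denominator of (9.4.11). To write it down in Hamiltonian form we consider the integral whose action functional involves both the field $\phi$ and the canonical momentum $\pi$:
$$\int \exp(iS[\phi, \pi])\,\prod_x d\pi(x)\,d\phi(x) \qquad (9.4.36)$$
The action $S[\phi, \pi]$ stands for:
$$S[\phi, \pi] = \int \pi\,\partial_0\phi\;d^4x - \int H\,dx^0 \qquad (9.4.37)$$
with the Hamiltonian
$$H = \int \Bigl(\tfrac{1}{2}\pi^2 + \tfrac{1}{2}(\nabla\phi)^2 + \tfrac{m^2}{2}\phi^2 + v(\phi)\Bigr)\,d^3x \qquad (9.4.38)$$
where $v(\phi) = \frac{g}{3!}\phi^3$. It is easy to note that if $\pi$ is replaced by $\partial_0\phi + \pi$, i.e. if the shift
$$\pi(x) \to \pi(x) + \partial_0\phi(x) \qquad (9.4.39)$$
is substituted into (9.4.36) (using the expression (9.4.37) for $S[\phi, \pi]$), the integral (9.4.36) becomes the product of (9.4.35) and the integral
$$\int \exp\Bigl(-\frac{i}{2}\int \pi^2(x)\,d^4x\Bigr)\,\prod_x d\pi(x) \qquad (9.4.40)$$
over $\pi$, which leads to the product of normalization factors. Also, when a Green's function (the expectation value of a product of several fields) is calculated, integrals of the type (9.4.40) appear both in the numerator and in the denominator; hence the integral over $\pi$ cancels out and one has simply to compute the integral over $\phi$. The above process of expressing the functional integral of a scalar field theory in Hamiltonian form by artificially introducing an integral over the canonical momentum $\pi$ is found very useful in proving the Hamiltonian character of given systems of quantum field theory and statistical physics [29].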
4.5 Functional Integral Approach to Bose and Fermi Fields
As expected, the functional integral for these two important fields (the Bose and Fermi) is realized by making the necessary changes in the theory developed above. We describe these changes in brief. In the case of Bose fields, we consider the system where the large cubic volume $V = L^3$ (which we mentioned earlier) is filled with Bose particles and is subjected to some periodic boundary conditions. The functional integral in this case is an integral over the space of complex functions (fields) $\psi(x, \tau)$, $\bar\psi(x, \tau)$ where $x \in V$. These functions are periodic in the time parameter $\tau$ with period $\beta$.
The Green's function is defined as the expectation value (in V) of the product of several functions $\psi$, $\bar\psi$ with different arguments weighted with $\exp S$,18 where S itself is a functional of $\psi$ and $\bar\psi$:
$$S = \int_0^\beta d\tau\int d^3x\;\bar\psi(x, \tau)\,\partial_\tau\psi(x, \tau) - \int_0^\beta H'(\tau)\,d\tau \qquad (9.4.41)$$
The functional S represents the action here, and $H'(\tau)$ is the Hamiltonian:
$$H'(\tau) = \int d^3x\Bigl(\frac{1}{2m}\nabla\bar\psi(x, \tau)\,\nabla\psi(x, \tau) - \lambda\,\bar\psi(x, \tau)\psi(x, \tau)\Bigr) + \frac{1}{2}\int d^3x\,d^3y\;v(x - y)\,\bar\psi(x, \tau)\bar\psi(y, \tau)\psi(y, \tau)\psi(x, \tau) \qquad (9.4.42)$$
The constant $\lambda$ in (9.4.42) is the chemical potential of the system and $v(x - y)$ is the pair interaction potential of two Bose particles with coordinate vectors x and y. If the system under study has a thermodynamical character, i.e., temperature is involved, then the periodicity constant $\beta$ equals $(kT)^{-1}$, k being the Boltzmann constant and T the absolute temperature.19 The one-particle Green's function is thus:
$$G(x, \tau; x_1, \tau_1) = -\langle\psi(x, \tau)\,\bar\psi(x_1, \tau_1)\rangle = -\frac{\int e^{S}\,\psi(x, \tau)\,\bar\psi(x_1, \tau_1)\,d\bar\psi\,d\psi}{\int e^{S}\,d\bar\psi\,d\psi} \qquad (9.4.43)$$
which, as we can see, is the ratio of two functional integrals. In the case of Fermi fields, functional integrals are defined by using the integrals over anticommuting variables $x_i$, $x_i^*$ $(i = 1, \ldots, n)$ that obey the rules of a Grassman algebra20 (with a unit element and involution) (see Chapter 7, in particular Appen. 7A):
$$x_i x_j + x_j x_i = 0, \qquad x_i^* x_j^* + x_j^* x_i^* = 0, \qquad x_i x_j^* + x_j^* x_i = 0 \qquad (9.4.44)$$
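An element of this algebra is given as a polynomial:
$$p(x, x^*) = \sum_{a_i, b_i = 0, 1} c_{a_1 \ldots a_n b_1 \ldots b_n}\; x_1^{a_1}\cdots x_n^{a_n}\,(x_1^*)^{b_1}\cdots(x_n^*)^{b_n} \qquad (9.4.45)$$
The coefficients $c_{a_1 \ldots a_n b_1 \ldots b_n}$ are complex numbers. The commutation relations in (9.4.44) imply that $x_i^2 = (x_i^*)^2 = 0$, and therefore the exponents $a_i$ and $b_i$ $(i = 1, \ldots, n)$ in (9.4.45) take only the values 0 and 1. The operation of involution on the polynomial p is defined as:
$$p \to p^* = \sum_{a_i, b_i = 0, 1} \bar c_{a_1 \ldots a_n b_1 \ldots b_n}\; (x_n)^{b_n}\cdots(x_1)^{b_1}\,(x_n^*)^{a_n}\cdots(x_1^*)^{a_1} \qquad (9.4.46)$$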
We recall that the integral of an element of this algebra is defined as:
$$\int p(x, x^*)\,dx^*\,dx \equiv \int p(x_1, \ldots, x_n, x_1^*, \ldots, x_n^*)\;dx_1^*\,dx_1 \cdots dx_n^*\,dx_n \qquad (9.4.47)$$
subject to the following integration rules:
$$\int dx_i = 0, \qquad \int dx_i^* = 0, \qquad \int x_i\,dx_i = 1, \qquad \int x_i^*\,dx_i^* = 1 \qquad (9.4.48)$$
18 Note that for the Green's function in the case of a scalar field theory, the weight is $\exp(iS)$ as opposed to $\exp S$ here.
19 The absolute temperature = Celsius (centigrade) temperature + 273.15. (See Waldram [39] for k and T.)
20 The Grassman algebra here is an infinite dimensional algebra, although we are taking only a finite number n of generators. Moreover, this number has to be even due to the nature of the description here.
We note that the functional integral for Fermi fields is eventually the limit of the integral on an algebra satisfying the rules (9.4.44)-(9.4.48). Thus, for instance, if $x^* A x = \sum_{i,k} a_{ik}\,x_i^* x_k$ is a quadratic form of the generators $x_i$, $x_i^*$ corresponding to the matrix A, then
$$\int \exp(-x^* A x)\,dx^*\,dx = \det A \qquad (9.4.49)$$
and
$$\int \exp(-x^* A x + \eta^* x + x^* \eta)\,dx^*\,dx = \det A\;\exp(\eta^* A^{-1}\eta) \qquad (9.4.50)$$
The expressions $\eta^* x = \sum_i \eta_i^* x_i$ and $x^* \eta = \sum_i x_i^* \eta_i$ in (9.4.50) are linear forms of the generators $x_i$, $x_i^*$ whose coefficients $\eta_i$ and $\eta_i^*$ anticommute with each other and with the generators (we shall use these ideas in Chapter 11). The reader is advised to see [2] for details.
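The rules (9.4.44) and (9.4.48) are concrete enough to be checked mechanically for small n. The following is a minimal sketch, not a general-purpose library: the dictionary representation, the index convention (generator $x_i^*$ stored as index 2i and $x_i$ as index 2i + 1, with i counted from 0), and all function names are assumptions made here. Each differential is applied to the generator after moving it to the right end of a monomial, and the differentials are applied in the order $dx_1^*, dx_1, \ldots, dx_n^*, dx_n$.

```python
from itertools import product
from math import factorial


def mul(p, q):
    """Product of Grassmann polynomials stored as {sorted index tuple: coefficient}."""
    out = {}
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        if set(m1) & set(m2):
            continue  # a repeated generator squares to zero
        word = list(m1 + m2)
        sign = 1
        # Bubble sort: every swap of neighbouring generators flips the sign.
        for i in range(len(word)):
            for j in range(len(word) - 1 - i):
                if word[j] > word[j + 1]:
                    word[j], word[j + 1] = word[j + 1], word[j]
                    sign = -sign
        key = tuple(word)
        out[key] = out.get(key, 0) + sign * c1 * c2
    return {k: v for k, v in out.items() if v != 0}


def exp_nilpotent(p):
    """exp(p) for an even nilpotent element; the power series terminates."""
    result = {(): 1.0}
    power = {(): 1.0}
    k = 0
    while power:
        k += 1
        power = mul(power, p)
        for m, c in power.items():
            result[m] = result.get(m, 0) + c / factorial(k)
    return result


def integrate(p, g):
    """Berezin integral over generator g, using the rule that the integral of g dg is 1."""
    out = {}
    for m, c in p.items():
        if g in m:
            pos = m.index(g)
            sign = (-1) ** (len(m) - 1 - pos)  # move g to the right end first
            key = m[:pos] + m[pos + 1:]
            out[key] = out.get(key, 0) + sign * c
    return out


def gaussian(A):
    """Integral of exp(-x* A x) with measure dx_1* dx_1 ... dx_n* dx_n."""
    n = len(A)
    exponent = {}
    for i in range(n):
        for k in range(n):
            term = mul({(2 * i,): 1.0}, {(2 * k + 1,): -A[i][k]})
            for m, c in term.items():
                exponent[m] = exponent.get(m, 0) + c
    p = exp_nilpotent(exponent)
    for g in range(2 * n):
        p = integrate(p, g)
    return p.get((), 0)


value = gaussian([[2.0, 1.0], [3.0, 5.0]])  # compare with det A = 2*5 - 1*3
```

With these conventions the 2 x 2 example reproduces the determinant in (9.4.49); a different ordering of the differentials could change overall signs, which is why the convention is spelled out above.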
Exercise 9.4
1. Show that if the Hamiltonian H in the action (9.4.1) is independent of p, then the limit of the integral (9.4.3):
$$\lim_{N\to\infty}(2\pi)^{-N}\int dp_1\,dq_1 \cdots dp_{N-1}\,dq_{N-1}\,dp_N\,\exp(iS(t_0, t)) = \langle q|\exp(-i(t - t_0)H)|q_0\rangle$$
equals the matrix element of the evolution operator (the Feynman functional integral). A similar result holds when H is independent of q.
2. Evaluate the function $Z[\eta]$ for a free field theory and show further that formula (9.4.13) holds in that case.
Hints to Exercise 9.4
1. From (9.4.1) we note that the action $S(t_0, t)$, when H is a function of q only, simplifies as:
(i) $S(t_0, t) = \int_{t_0}^{t} p(\tau)\,\frac{dq}{d\tau}\,d\tau - \int_{t_0}^{t} H(q)\,d\tau.$
To evaluate the first term on the RHS, we use the fact that $p(\tau)$ takes the constant value $p_{i+1}$ in $(\tau_i, \tau_{i+1})$ while $q(\tau)$ is piecewise linear; accordingly we have:
(ii) $S(t_0, t) = p_1(q_1 - q_0) + p_2(q_2 - q_1) + \cdots + p_N(q - q_{N-1}) - \int_{t_0}^{t} H(q)\,d\tau.$
The LHS of (9.4.3) can now be written as Eq. (iii) below:
(iii) $(2\pi)^{-N}\int dp_1\,dq_1 \cdots dp_N\,\exp i\Bigl(p_1(q_1 - q_0) + \cdots + p_N(q - q_{N-1}) - \int_{t_0}^{t} H(q)\,d\tau\Bigr).$
Integration with respect to $p_i$ $(i = 1, \ldots, N)$ gives the product of N $\delta$-functions:
(iv) $\delta(q_1 - q_0)\,\delta(q_2 - q_1) \cdots \delta(q - q_{N-1}).$
This allows the expression $\exp\bigl(-i\int_{t_0}^{t} H(q)\,d\tau\bigr)$ to be set equal to $\exp(-i(t - t_0)H(q_0))$, which can now be put outside the integral sign. The integration with respect to $q_1, \ldots, q_{N-1}$ eliminates all $\delta$-functions (see (0.4.10)) except $\delta(q_0 - q)$, and hence it leads to the result
(v) $\delta(q_0 - q)\,\exp(-i(t - t_0)H(q_0))$
- J ' H(p(t))dr.
We now integrate (iii) with respect to qx ... qN_x first and then with respect to px ... pN and obtain the expression:
(vii)
— [ dp e\p{ip(q - q0) - i{t - to)H(p)} 271
J
equal to the matrix element of the evolution operator for the Hamiltonian H(p). 2. The action 5 given in (9.4.6) no longer contains the term - — 03 now. For ease of calculation we denote the other two terms of the integrand as
[ • + m2] where D is d' Alembert's operator.
Next we apply the shift: (i)
into (9.4.12), where we choose (^(x) in such a manner that while integrating over <j> in the numerator of (9.4.12), the terms linear in
(ii)
We now use our definition of Green's function (9.4.15) which gives:
(- • , - m2)D(x, y) = 5(x - y) = 5(x, y)
(iii)
to write down the solution of (ii) (in terms of this function) as: <j>0(x) = -j D(x, y) T](y)dAy.
(iv) 21
The numerator written out in full is: J exp i (} -— (D + m 2 )0 2 + i){x)
x
Basics of Quantum Theory 483
This computation finally leads to:
(v) $Z[\eta] = \exp\Bigl\{-\frac{i}{2}\int \eta(x)\,D(x, y)\,\eta(y)\,d^4x\,d^4y\Bigr\} \times \dfrac{\int \exp(iS)\,\prod_x d\phi(x)}{\int \exp(iS)\,\prod_x d\phi(x)}$
Since the remaining ratio equals one,
(vi) $Z[\eta] = \exp\Bigl\{-\frac{i}{2}\int \eta(x)\,D(x, y)\,\eta(y)\,d^4x\,d^4y\Bigr\}$
and the application of (9.4.13) leads to the two point Green's function:
(vii) $G_0(x, y) = D(x, y).$
Remark An important fact that we would like to emphasize here is that the definition (iii) of $D(x, y)$ is not unique, since it is defined only up to an additive part given by a solution of the homogeneous equation $(-\Box - m^2) f = 0$ (see Eq. (9.4.17)).
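The structure of hint 2 can be seen in a finite-dimensional toy model. The sketch below is an analogue only and rests on assumptions not in the text: the oscillating weight $\exp(iS)$ is replaced by a real Gaussian weight $\exp(-\frac{1}{2}\phi\cdot A\phi)$ on a hypothetical two-site "lattice", so that the integrals converge and can be summed on a grid. In this setting the generating function is $\exp(\frac{1}{2}J\cdot A^{-1}J)$ and the pair expectation value is the corresponding entry of $A^{-1}$, which plays the role of $D(x, y)$.

```python
import math

# Hypothetical symmetric positive-definite matrix standing in for the
# quadratic operator of the free action on a two-site lattice.
A = [[2.0, -1.0], [-1.0, 2.0]]


def average(f, half_width=8.0, points=321):
    """Gaussian average of f(phi0, phi1) with weight exp(-phi.A.phi/2), by a grid sum."""
    h = 2 * half_width / (points - 1)
    grid = [-half_width + k * h for k in range(points)]
    num = den = 0.0
    for p0 in grid:
        for p1 in grid:
            quad = A[0][0] * p0 * p0 + (A[0][1] + A[1][0]) * p0 * p1 + A[1][1] * p1 * p1
            w = math.exp(-0.5 * quad)
            num += f(p0, p1) * w
            den += w
    return num / den


# Inverse of the 2 x 2 matrix A: the toy "Green's function".
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
D = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

J = (0.3, -0.2)  # arbitrary source values chosen for the check
z_numeric = average(lambda p0, p1: math.exp(J[0] * p0 + J[1] * p1))
z_closed = math.exp(0.5 * sum(J[a] * D[a][b] * J[b] for a in range(2) for b in range(2)))
g01_numeric = average(lambda p0, p1: p0 * p1)
```

The shift used in the hint is what produces the closed form: completing the square in the exponent leaves the source-free integral, which cancels against the denominator.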
5 PATH INTEGRALS
5.1 Path Integral via Operator Formalism
In the previous section we studied the functional integral approach to quantum theory by considering a classical action defined in phase space. The foundations of this approach were laid by Feynman in his famous paper of 1948 [11a].22 There he expressed the transition matrix element for a one-dimensional quantum-mechanical system (transition/probability amplitude (9.2.3)):
$$\langle q', t'|q, t\rangle = \langle q'|e^{-iH(t'-t)}|q\rangle \;{}^{23} \qquad (9.5.1)$$
as a functional integral:
$$N\int [dq]\,\exp\Bigl\{i\int_t^{t'} L(q, \dot q)\,d\tau\Bigr\} \qquad (9.5.2)$$
22 Some of the ideas here were already discussed earlier although with a different twist, see for instance Sec. (9.4). We shall refer to them as we go along.
23 Note that $|q\rangle$ on the RHS is an eigenstate of the position operator Q in the Schrödinger picture, and $|q, t\rangle$ on the LHS, which equals $e^{iHt}|q\rangle$, denotes the state in the Heisenberg picture (see (9.2.51)-(9.2.54)).
The integral was taken over the function space $q(t)$ and it represented the sum of contributions over all paths that connect $(q, t)$ and $(q', t')$, weighted by the exponential of i times the action. The constant N was used as a normalization factor and $L(q, \dot q)$ stood for the Lagrangian. In this section we shall derive (9.5.2), the Path Integral (PI), from first principles using the operator formalism. For this we divide the interval $(t, t')$ into n equal parts $\delta t = (t' - t)/n$ and write:
$$\langle q'|e^{-iH(t'-t)}|q\rangle = \int dq_1 \cdots dq_{n-1}\,\langle q'|e^{-iH\delta t}|q_{n-1}\rangle\,\langle q_{n-1}|e^{-iH\delta t}|q_{n-2}\rangle \cdots \langle q_1|e^{-iH\delta t}|q\rangle \qquad (9.5.3)$$
by using complete sets of eigenstates of the position operator Q (in the Schrödinger picture). From our discussions in Appendix 9B, we know that for very small $\delta t$ we can write24:
$$\langle q'|e^{-iH\delta t}|q\rangle = \langle q'|e^{-iH(P, Q)\delta t}|q\rangle = \langle q'|[1 - iH(P, Q)\,\delta t]|q\rangle + O(\delta t)^2 \qquad (9.5.4)$$
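When $H(P, Q) = \frac{P^2}{2m} + V(Q)$, we have:
$$\langle q'|H(P, Q)|q\rangle = \langle q'|\frac{P^2}{2m}|q\rangle + V\Bigl(\frac{q + q'}{2}\Bigr)\,\delta(q' - q) \qquad (9.5.5)$$
In writing the second term on the RHS of (9.5.5) we have used the symmetric ordering (Weyl ordering25) of operators, and the fact that $\langle q'|q\rangle = \delta(q' - q)$ in view of Eq. (9A.24); we also use the fact that $\delta(q' - q) = \int \frac{dp}{2\pi}\,e^{ip(q'-q)}$ to simplify it further. Using (9A.19) and subsequently (9A.42), we can write the first term $\langle q'|\frac{P^2}{2m}|q\rangle$ as:
$$\langle q'|\frac{P^2}{2m}|q\rangle = \int dp\,\langle q'|p\rangle\,\langle p|\frac{P^2}{2m}|q\rangle = \int \frac{dp}{2\pi}\,e^{ip(q'-q)}\,\frac{p^2}{2m} \qquad (9.5.6)$$
so that
$$\langle q'|H(P, Q)|q\rangle = \int \frac{dp}{2\pi}\,e^{ip(q'-q)}\Bigl[\frac{p^2}{2m} + V\Bigl(\frac{q + q'}{2}\Bigr)\Bigr] \qquad (9.5.7)$$
and (9.5.4) becomes:
$$\langle q'|e^{-iH\delta t}|q\rangle \simeq \int \frac{dp}{2\pi}\,\exp\Bigl\{ip(q' - q) - i\,\delta t\,H\Bigl(p, \frac{q + q'}{2}\Bigr)\Bigr\} \qquad (9.5.8)$$
24 The variables Q and P in the Hamiltonian H(P, Q) are position and momentum operators with eigenstates $|q\rangle$ and $|p\rangle$ respectively.
25 Weyl ordering of (products of) operators: $xp \to \frac{1}{2}(XP + PX)$, $x^2 p \to \frac{1}{3}(X^2P + XPX + PX^2)$.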
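(Note that $H(p, q)$ is the classical Hamiltonian.) Substituting it in (9.5.3), we obtain after simplification (with $q_0 = q$ and $q_n = q'$):
$$\langle q'|e^{-iH(t'-t)}|q\rangle \simeq \int \frac{dp_1}{2\pi}\cdots\frac{dp_n}{2\pi}\int dq_1 \cdots dq_{n-1}\,\exp\Bigl\{i\sum_{i=1}^{n}\Bigl[p_i(q_i - q_{i-1}) - \delta t\,H\Bigl(p_i, \frac{q_i + q_{i-1}}{2}\Bigr)\Bigr]\Bigr\} \qquad (9.5.9)$$
In view of this, the transition amplitude can be symbolically written as:
$$\langle q'|e^{-iH(t'-t)}|q\rangle = \int \Bigl[\frac{dp\,dq}{2\pi}\Bigr]\exp\Bigl\{i\int_t^{t'} d\tau\,\bigl(p\dot q - H(p, q)\bigr)\Bigr\} \equiv \lim_{n\to\infty}\int \prod_{i=1}^{n}\frac{dp_i}{2\pi}\prod_{i=1}^{n-1} dq_i\,\exp\Bigl\{i\sum_{i=1}^{n}\delta t\Bigl[p_i\,\frac{q_i - q_{i-1}}{\delta t} - H\Bigl(p_i, \frac{q_i + q_{i-1}}{2}\Bigr)\Bigr]\Bigr\} \qquad (9.5.10)$$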
The extreme RHS of (9.5.10) defines the path integral. In order to bring it in line with (9.5.2), we now calculate the momentum-space part of the path integral, i.e. the $\bigl[\frac{dp}{2\pi}\bigr] = \prod_{i=1}^{n}\frac{dp_i}{2\pi}$ part. The integrand here is oscillatory, therefore we analytically continue it to the Euclidean space by (formally) treating $(i\,\delta t)$ as real. Then using the Gaussian formula:
$$\int_{-\infty}^{\infty}\frac{dx}{2\pi}\,e^{-ax^2 + bx} = \frac{1}{\sqrt{4\pi a}}\,e^{b^2/4a} \qquad (9.5.11)$$
we simplify only that part of (9.5.10) which involves the variable $p_i$; this gives26:
$$\int \frac{dp_i}{2\pi}\,\exp\Bigl\{-\frac{i\,\delta t}{2m}\,p_i^2 + i p_i(q_i - q_{i-1})\Bigr\} = \Bigl(\frac{m}{2\pi i\,\delta t}\Bigr)^{1/2}\exp\Bigl[\frac{im}{2\,\delta t}(q_i - q_{i-1})^2\Bigr] \qquad (9.5.12)$$
(9.5.13)
~ 26
(
qt+qi-A
N o t e t h a t from H\ p . , —
\ J
—-
2
P)
(qi+qi-l)
P]
= -^- + V\ —
J 2m
y
in (9.5.10) we have included
2
J
-^-.
2m
486 Mathematical Perspectives on Theoretical Physics
As n —> °o, 8t —> 0 and —
— becomes <j •, accordingly we have shown:
(q,t\q',n=(q'\e-iH«'-%)
= N j[dq] exp {/('dT^-q 2 - V(9)]}
(9.5.14)
and have thus established the equality of Feynman's path integral to the transition amplitudes of the Heisenberg and Schrodinger picture. Note that L(q, q) in (9.5.1) is taken to be — q 2 - V(q) in (9.5.14).
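The time-slicing construction in (9.5.3) and (9.5.13) can be tried out numerically in the simplest case. The sketch below makes several assumptions that are not in the text: it takes V = 0, works in Euclidean time (replacing $i\,\delta t$ by a real step, as in the continuation used for (9.5.12)), sets m = 1 by default, and approximates each intermediate $dq_i$ integral by a sum on a uniform grid. Under these assumptions, composing n short-time kernels should reproduce the kernel for the full interval.

```python
import math


def kernel(q_to, q_from, tau, m=1.0):
    """Euclidean free-particle kernel: (9.5.12) with i*delta_t replaced by tau."""
    return math.sqrt(m / (2 * math.pi * tau)) * math.exp(-m * (q_to - q_from) ** 2 / (2 * tau))


def sliced_kernel(q_to, q_from, tau, n, m=1.0, half_width=10.0, points=401):
    """Compose n >= 2 short-time kernels, integrating over the n - 1 intermediate positions."""
    h = 2 * half_width / (points - 1)
    grid = [-half_width + k * h for k in range(points)]
    step = tau / n
    # amplitude[j]: kernel from q_from to grid[j] after the slices applied so far
    amplitude = [kernel(x, q_from, step, m) for x in grid]
    for _ in range(n - 2):
        amplitude = [
            h * sum(kernel(x, y, step, m) * a for y, a in zip(grid, amplitude))
            for x in grid
        ]
    # last slice: integrate over the final intermediate position
    return h * sum(kernel(q_to, y, step, m) * a for y, a in zip(grid, amplitude))


composed = sliced_kernel(0.5, -0.2, 1.0, n=4)
direct = kernel(0.5, -0.2, 1.0)
```

For the free particle the agreement holds for every n, not only in the limit; with a potential V the sliced expression is only correct up to terms that vanish as the step goes to zero.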
5.2 Time Ordered Product of Operators
Having derived the Path Integral (9.5.2), we shall next see the advantages of using the PI formalism in quantum field theories. Consider, for instance, a product of two Heisenberg operators:
$$Q^H(t_1)\,Q^H(t_2) \qquad (9.5.15)$$
and evaluate the matrix element:
$${}_H\langle q', t'|Q^H(t_1)\,Q^H(t_2)|q, t\rangle_H, \qquad t' > t_1 > t_2 > t \qquad (9.5.16)$$
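Since $t_1 > t_2$ we can insert complete sets of coordinate basis states and write:
$${}_H\langle q', t'|Q^H(t_1)Q^H(t_2)|q, t\rangle_H = \int dq_1\,dq_2\;{}_H\langle q', t'|Q^H(t_1)|q_1, t_1\rangle_H\;{}_H\langle q_1, t_1|Q^H(t_2)|q_2, t_2\rangle_H\;{}_H\langle q_2, t_2|q, t\rangle_H$$
$$= \int dq_1\,dq_2\;q_1 q_2\;{}_H\langle q', t'|q_1, t_1\rangle_H\;{}_H\langle q_1, t_1|q_2, t_2\rangle_H\;{}_H\langle q_2, t_2|q, t\rangle_H \qquad (9.5.17)$$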
(9.5.18)
in the RHS for the operator QH(f). We further note that each inner product in the integrand represents a transition amplitude and therefore can be written as a path integral. Combining the products we can write (9.5.16) as: <', t'\(f(tx)QH(t2)\q,
t) = Nj [dq\q(tx)q(t2)eiS^
(tx > t2)
(9.5.19)
(Note that we have used the identification $q_1 = q(t_1)$, $q_2 = q(t_2)$ in the above computations and have suppressed H in ${}_H\langle$ and $\rangle_H$.) For $t_2 > t_1$ we can write the matrix element for the operator product $Q^H(t_2)Q^H(t_1)$ and simplify using the above argument to obtain:
$$\langle q', t'|Q^H(t_2)Q^H(t_1)|q, t\rangle = \int dq_1\,dq_2\,\langle q', t'|Q^H(t_2)|q_2, t_2\rangle\,\langle q_2, t_2|Q^H(t_1)|q_1, t_1\rangle\,\langle q_1, t_1|q, t\rangle = N\int [dq]\,q(t_2)\,q(t_1)\,e^{iS[q]} \qquad (t_2 > t_1) \qquad (9.5.20)$$
But $q(t_1)$ and $q(t_2)$ are classical quantities, therefore $q(t_2)q(t_1)$ in the integrand can be written as $q(t_1)q(t_2)$. This shows that the RHS of (9.5.19) and (9.5.20) are the same, which means that the path integral naturally gives the time ordered correlation functions as the moments:*
$${}_H\langle q', t'|T(Q^H(t_1)Q^H(t_2))|q, t\rangle_H = N\int [dq]\,q(t_1)\,q(t_2)\,e^{iS[q]} \qquad (9.5.21)$$
where the time ordering can be explicitly represented as:
$$T(Q^H(t_1)Q^H(t_2)) = \theta(t_1 - t_2)\,Q^H(t_1)Q^H(t_2) + \theta(t_2 - t_1)\,Q^H(t_2)Q^H(t_1) \qquad (9.5.22)$$
An important consequence of the above fact can be recapitulated in the following remark:
Remark 9.5.1 The time ordered product of any set of operators leads to correlation functions in the PI formalism as:
$${}_H\langle q', t'|T\bigl(O_1(Q^H(t_1))\,O_2(Q^H(t_2)) \cdots O_n(Q^H(t_n))\bigr)|q, t\rangle_H = N\int [dq]\,O_1(q(t_1)) \cdots O_n(q(t_n))\,e^{iS[q]} \qquad (9.5.23)$$
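so that all the factors in the path integral are c-numbers (classical quantities), i.e., there are no operators any more.27 It should be noted that the transition amplitudes obtained so far in the PI formalism have been between coordinate states; in physical applications these have to be computed for physical states. For instance, we would like to find the probability amplitude for a system which is (initially) in a state $|\psi_i\rangle_H$ at time $t_i$ and makes the transition to a state $|\psi_f\rangle_H$ at time $t_f$. By definition the wave function at time t is:
$${}_H\langle q, t|\psi\rangle_H = \psi(q, t) \qquad (9.5.24)$$
accordingly we can write:
$${}_H\langle\psi_f|\psi_i\rangle_H = \int dq_f\,dq_i\;{}_H\langle\psi_f|q_f, t_f\rangle\,\langle q_f, t_f|q_i, t_i\rangle\,\langle q_i, t_i|\psi_i\rangle_H = N\int dq_f\,dq_i\;\psi_f^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\exp((i/\hbar)S[q]) \qquad (9.5.25)$$
Hence the time ordered correlation functions between such physical states can simply be written as:
$${}_H\langle\psi_f|T\bigl(O_1(Q^H(t_1)) \cdots O_n(Q^H(t_n))\bigr)|\psi_i\rangle_H = N\int dq_f\,dq_i\;\psi_f^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,O_1(q(t_1)) \cdots O_n(q(t_n))\,\exp((i/\hbar)S[q]) \qquad (9.5.26)$$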
In particular the expectation value can be computed by using:
$${}_H\langle\psi_i|T\bigl(O_1(Q^H(t_1)) \cdots O_n(Q^H(t_n))\bigr)|\psi_i\rangle_H = N\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,O_1(q(t_1)) \cdots O_n(q(t_n))\,\exp((i/\hbar)S[q]) \qquad (9.5.27)$$
* The matrix element of a time-ordered product between ground states is $G(t_1, t_2) = \langle 0|T\,Q^H(t_1)Q^H(t_2)|0\rangle$, whereas the matrix element $\langle q, t|0\rangle = \phi_0(q)\,e^{-iE_0 t} = \phi_0(q, t)$ is the wavefunction for the ground state. Note that the letter T in an equation indicates that the equation is time ordered, i.e. $t_1 > t_2$.
27 Note that $O_1(Q^H(t_1))$ is a function of the operator $Q^H(t_1)$ and $O_1(q(t_1))$ is a similar function of $q(t_1)$, and hence the latter is a c-number.
Since the states may not necessarily be normalized (i.e., ${}_H\langle\psi_i|\psi_i\rangle_H$ may not be 1), the expectation value is:
$$\langle T\bigl(O_1(Q^H(t_1)) \cdots O_n(Q^H(t_n))\bigr)\rangle = \frac{{}_H\langle\psi_i|T\bigl(O_1(Q^H(t_1)) \cdots O_n(Q^H(t_n))\bigr)|\psi_i\rangle_H}{{}_H\langle\psi_i|\psi_i\rangle_H} = \frac{\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,O_1(q(t_1)) \cdots O_n(q(t_n))\,\exp((i/\hbar)S[q])}{\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\exp((i/\hbar)S[q])} \qquad (9.5.28)$$

5.3 Correlation Functions Using an External Source J
We next see how the PI formalism can be used to generate various correlation functions for a physical system simply by adding terms due to external sources to the original action. The altered action incorporating the source term J is:
$$S[q, J] = S[q] + \int_{t_i}^{t_f} dt\;q(t)\,J(t) \qquad (9.5.29)$$
which gives back the original one as:
$$S[q, J]\big|_{J=0} = S[q] \qquad (9.5.30)$$
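Using the action $S[q, J]$ we have28:
$$\langle\psi_i|\psi_i\rangle_J = N\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\exp((i/\hbar)S[q, J]) \qquad (9.5.31)$$
In view of (9.5.30) this gives:
$$\langle\psi_i|\psi_i\rangle_J\big|_{J=0} = \langle\psi_i|\psi_i\rangle \qquad (9.5.32)$$
Hence for a $t_1$ $(t_f > t_1 > t_i)$ we obtain the functional derivative:
$$\frac{\delta\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)} = N\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\frac{i}{\hbar}\,\frac{\delta S[q, J]}{\delta J(t_1)}\,\exp((i/\hbar)S[q, J]) = N\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\frac{i}{\hbar}\,q(t_1)\,\exp((i/\hbar)S[q, J]) \qquad (9.5.33)$$
(We have used (9C.77) and (9.5.29) to write the last line.) In view of (9.5.27) this implies:
$$\frac{\delta\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)}\bigg|_{J=0} = N\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\frac{i}{\hbar}\,q(t_1)\,\exp((i/\hbar)S[q]) = \frac{i}{\hbar}\,\langle\psi_i|Q(t_1)|\psi_i\rangle \qquad (9.5.34)$$
28 The Heisenberg symbol H on the LHS has been suppressed from now on.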
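The above result can be generalized to any finite number of $t_k$'s satisfying $t_f > t_1, \ldots, t_n > t_i$; for instance, for $t_f > t_1, t_2 > t_i$ we have:
$$\frac{\delta^2\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)\,\delta J(t_2)}\bigg|_{J=0} = N\int dq_f\,dq_i\;\psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\Bigl(\frac{i}{\hbar}\Bigr)^2 q(t_1)\,q(t_2)\,\exp((i/\hbar)S[q]) = \Bigl(\frac{i}{\hbar}\Bigr)^2\langle\psi_i|T(Q(t_1)Q(t_2))|\psi_i\rangle \qquad (9.5.35)$$
and in general
$$\frac{\delta^n\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)\cdots\delta J(t_n)}\bigg|_{J=0} = \Bigl(\frac{i}{\hbar}\Bigr)^n\langle\psi_i|T(Q(t_1)\cdots Q(t_n))|\psi_i\rangle \qquad (9.5.36)$$
Hence the expectation value (9.5.28) can be expressed as:
$$\langle T(Q(t_1)\cdots Q(t_n))\rangle = \frac{\langle\psi_i|T(Q(t_1)\cdots Q(t_n))|\psi_i\rangle}{\langle\psi_i|\psi_i\rangle} = \frac{(-i\hbar)^n}{\langle\psi_i|\psi_i\rangle_J}\,\frac{\delta^n\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)\cdots\delta J(t_n)}\bigg|_{J=0} \qquad (9.5.37)$$
From Sec. (9.4) we are already familiar with formulae of this type, where we obtained the Green's function using the generating functional. The inner product $\langle\psi_i|\psi_i\rangle_J$ is thus called the generating functional for the time ordered correlation functions.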
5.4 Vacuum Functional Z[J] and Green's Functions in the Vacuum
We shall now use the above discussions to write down the Green's functions using the PI formalism and will finally show that the end results are the same regardless of the approach (e.g., the Faddeev-Popov formulation or the PI formalism). For this purpose we consider the vacuum to vacuum transition amplitude for $J \neq 0$, beginning with the transition amplitude in coordinate space. Using (9.5.14) we can write it as:
$$\langle q_f, t_f|q_i, t_i\rangle_J = N\int [dq]\,\exp((i/\hbar)S[q, J]) = N\int [dq]\,\exp\Bigl(\frac{i}{\hbar}S[q] + \frac{i}{\hbar}\int_{t_i}^{t_f} dt\;q(t)\,J(t)\Bigr) \qquad (9.5.38)$$
In view of (9.5.23), the RHS can also be written as a matrix element of an operator, thus:
$$\langle q_f, t_f|q_i, t_i\rangle_J = \langle q_f, t_f|T\exp\Bigl[\frac{i}{\hbar}\int_{t_i}^{t_f} dt\;Q(t)\,J(t)\Bigr]|q_i, t_i\rangle \qquad (9.5.39)$$
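Now the ground state can be reached from these coordinate states through a few complicated steps. To this end we let the initial and final times tend to infinity, so we allow:
$$t_i \to -\infty, \qquad t_f \to +\infty \qquad (9.5.40)$$
As for J, we assume that it is non-zero only in a large but finite interval, i.e.,
$$J(t) = 0 \quad \text{for } |t| > T \qquad (9.5.41)$$
and since computations are done with $J(t) \to 0$, we take the limit $T \to \infty$ at the end. Accordingly equalities (9.5.38) and (9.5.39) can respectively be written as:
$$\lim_{\substack{t_i\to-\infty \\ t_f\to\infty}}\langle q_f, t_f|q_i, t_i\rangle_J = N\int [dq]\,\exp\Bigl[\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\,\bigl(L(q, \dot q) + Jq\bigr)\Bigr] \qquad (9.5.42)$$
and
$$\lim_{\substack{t_i\to-\infty \\ t_f\to\infty}}\langle q_f, t_f|q_i, t_i\rangle_J = \lim_{t_i\to-\infty}\,\lim_{t_f\to\infty}\,\langle q_f, t_f|T\exp\Bigl[\frac{i}{\hbar}\int_{-T}^{T} dt\;JQ\Bigr]|q_i, t_i\rangle \qquad (9.5.43)$$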
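We further assume that the ground state energy of our Hamiltonian is normalized to zero, i.e.:
$$H|0\rangle = 0, \qquad H|n\rangle = E_n|n\rangle \qquad (9.5.44)$$
(here we have assumed that the energy eigenstates are discrete for simplicity of calculations). Inserting complete sets of energy eigenstates in (9.5.43) we have (see (9.5.3)):
$$\lim_{\substack{t_i\to-\infty \\ t_f\to\infty}}\langle q_f, t_f|q_i, t_i\rangle_J = \lim_{t_i\to-\infty}\,\lim_{t_f\to\infty}\,\sum_{n,m}\langle q_f, t_f|n\rangle\,\langle n|T\exp\Bigl[\frac{i}{\hbar}\int_{-T}^{T} dt\;JQ\Bigr]|m\rangle\,\langle m|q_i, t_i\rangle \qquad (9.5.45)$$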
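Now note that the first and last transition amplitudes are the matrix elements of the operator $\exp(-(i/\hbar)Ht)$ and its inverse, with appropriate values of t; therefore the RHS of (9.5.45) is:
$$\lim_{t_i\to-\infty}\,\lim_{t_f\to\infty}\,\sum_{n,m}\langle q_f|\exp(-(i/\hbar)Ht_f)|n\rangle\,\langle n|T\exp\Bigl[\frac{i}{\hbar}\int_{-T}^{T} dt\;JQ\Bigr]|m\rangle\,\langle m|\exp((i/\hbar)Ht_i)|q_i\rangle$$
$$= \lim_{t_i\to-\infty}\,\lim_{t_f\to\infty}\,\sum_{n,m}\exp\bigl(-(i/\hbar)E_n t_f + (i/\hbar)E_m t_i\bigr)\,\langle q_f|n\rangle\,\langle n|T\exp\Bigl[\frac{i}{\hbar}\int_{-T}^{T} dt\;JQ\Bigr]|m\rangle\,\langle m|q_i\rangle \qquad (9.5.46)$$
(where we have used (9.5.44)). When we take the limits for $t_i$ and $t_f$, the exponentials oscillate out to zero everywhere except for the ground state, hence we have:
$$\lim_{\substack{t_i\to-\infty \\ t_f\to\infty}}\langle q_f, t_f|q_i, t_i\rangle_J = \langle q_f|0\rangle\,\langle 0|T\exp\Bigl[\frac{i}{\hbar}\int_{-T}^{T} dt\;JQ\Bigr]|0\rangle\,\langle 0|q_i\rangle = \langle q_f|0\rangle\,\langle 0|q_i\rangle\,\langle 0|T\exp\Bigl[\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\;JQ\Bigr]|0\rangle \qquad (9.5.47)$$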
<0|7exp(if dtJQ)\0) = Hm
,
ffi',
/ '
(9.5.48)
Now the LHS of the above equation is independent of end points and so the RHS must also be independent of end points; moreover from (9.5.3) we know that the RHS has the structure of a path integral (see (9.5.14)), hence it follows that:
$$\langle 0 |\, T\exp\Big(\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\, JQ\Big) | 0\rangle = \langle 0 | 0\rangle_J = N\int [dq]\,\exp\big((i/\hbar)S[q,J]\big) \tag{9.5.49}$$
where
$$S[q,J] = \int_{-\infty}^{\infty} dt\,\big(L(q,\dot q) + Jq\big) \tag{9.5.50}$$
Note that the RHS in (9.5.49) has no end point constraints. If we denote now (see also (9.4.12))
$$\langle 0 | 0\rangle_J = N\int [dq]\,\exp\big((i/\hbar)S[q,J]\big) = Z[J] \tag{9.5.51}$$
then from (9.5.37) we have:
$$\langle 0 |\, T\big(Q(t_1)\cdots Q(t_n)\big) | 0\rangle = \frac{(-i\hbar)^n}{Z[J]}\, \frac{\delta^n Z[J]}{\delta J(t_1)\cdots\delta J(t_n)}\bigg|_{J=0} \tag{9.5.52}$$
which shows that $Z[J]$ generates the time ordered correlation functions or the Green's functions in the vacuum (see also (9.4.19)). In quantum field theory $Z[J]$, known as the vacuum functional or the generating functional for vacuum Green's functions, plays a central role, because the knowledge of the vacuum Green's functions leads to the construction of the S-matrix of the theory, which in turn leads to the solution of the theory.
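The statement that $Z[J]$ generates correlation functions can be checked on a zero-dimensional toy model. The sketch below is not the oscillatory path integral of the text: it assumes a single real variable q, a damped (Euclidean) Gaussian weight exp(-A q^2/2 + J q), and hypothetical helper names (`integrate`, `Z`, `direct_moment`, `moment_from_Z`). In this toy setting each derivative with respect to the source brings down one factor of q, without the factor $-i\hbar$ that appears in (9.5.52).

```python
import math

A = 2.0  # assumed width parameter of the toy weight exp(-A q^2 / 2 + J q)


def integrate(f, lo=-12.0, hi=12.0, n=24001):
    """Trapezoidal rule; adequate here because the integrand decays rapidly."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n - 1):
        total += f(lo + k * h)
    return total * h


def Z(J):
    """Toy generating function Z(J) = integral dq exp(-A q^2/2 + J q)."""
    return integrate(lambda q: math.exp(-0.5 * A * q * q + J * q))


def direct_moment(n, J=0.0):
    """<q^n> computed directly with the weight exp(-A q^2/2 + J q)."""
    num = integrate(lambda q: q**n * math.exp(-0.5 * A * q * q + J * q))
    return num / Z(J)


def moment_from_Z(n, J=0.0, h=1e-2):
    """(1/Z) d^n Z / dJ^n for n = 1, 2, using central finite differences."""
    if n == 1:
        d = (Z(J + h) - Z(J - h)) / (2 * h)
    elif n == 2:
        d = (Z(J + h) - 2 * Z(J) + Z(J - h)) / (h * h)
    else:
        raise ValueError("only n = 1, 2 are implemented in this sketch")
    return d / Z(J)


if __name__ == "__main__":
    for n in (1, 2):
        print(n, direct_moment(n), moment_from_Z(n))
```

At J = 0 both routes give a vanishing first moment and a second moment close to 1/A, which is the toy counterpart of the two-point function obtained from two source derivatives.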
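5.5
Effective Action W[J]
While dealing with statistical deviations from the mean values in QM, it is customary to write:
$$Z[J] = \exp\big((i/\hbar)W[J]\big) \tag{9.5.53a}$$
or
$$W[J] = -i\hbar \ln Z[J] \tag{9.5.53b}$$
then the functional derivative:
$$\frac{\delta W[J]}{\delta J(t_1)}\bigg|_{J=0} = (-i\hbar)\,\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(t_1)}\bigg|_{J=0} \tag{9.5.54}$$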
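is said to define the vacuum expectation value $\langle Q(t_1)\rangle$ (of the operator $Q$). Accordingly, the second order functional derivative:
$$(-i\hbar)\,\frac{\delta^2 W[J]}{\delta J(t_1)\,\delta J(t_2)}\bigg|_{J=0} = (-i\hbar)^2\bigg[\frac{1}{Z[J]}\frac{\delta^2 Z[J]}{\delta J(t_1)\,\delta J(t_2)} - \frac{1}{Z^2[J]}\frac{\delta Z[J]}{\delta J(t_1)}\frac{\delta Z[J]}{\delta J(t_2)}\bigg]_{J=0}$$
$$= \langle T(Q(t_1)Q(t_2))\rangle - \langle Q(t_1)\rangle\langle Q(t_2)\rangle = \big\langle T\big((Q(t_1) - \langle Q(t_1)\rangle)(Q(t_2) - \langle Q(t_2)\rangle)\big)\big\rangle \tag{9.5.55}$$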
gives the second order deviation from the mean. It can obviously be generalized to any finite order m, although it becomes rather complicated beyond the fourth order. In the beginning of this section we saw that the path integral for the transition amplitude is proportional to the exponential of the classical action; it is for this reason that $W[J]$ is often referred to as an effective action. In quantum field theory $W[J]$ is known as the generating functional for the connected vacuum Green's functions (see Comment (9.4.1) for the definition of connectedness). We illustrate this by using the example of the harmonic oscillator.
Example 9.5.2 Consider the action:
$$S[q,J] = \int_{-\infty}^{\infty} dt\,\Big(\frac12 m\dot q^2 - \frac12 m\omega^2 q^2 + Jq\Big) \tag{9.5.56}$$
then $Z[J] = N\int [dq]\,\exp((i/\hbar)S[q,J])$ gives:
$$\langle Q(t_1)\rangle = (-i\hbar)\,\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(t_1)}\bigg|_{J=0} = \frac{N\int [dq]\, q(t_1)\exp((i/\hbar)S[q])}{N\int [dq]\,\exp((i/\hbar)S[q])} = 0 \tag{9.5.57}$$
This is zero, since the integrand in the numerator is odd. Hence from (9.5.55) we have:
$$\langle T(Q(t_1)Q(t_2))\rangle = (-i\hbar)^2\,\frac{1}{Z[J]}\frac{\delta^2 Z[J]}{\delta J(t_1)\,\delta J(t_2)}\bigg|_{J=0} = (-i\hbar)\,\frac{\delta^2 W[J]}{\delta J(t_1)\,\delta J(t_2)}\bigg|_{J=0} \tag{9.5.58}$$
showing that $W[J]$ is the generating functional of $\langle T(Q(t_1)Q(t_2))\rangle$, i.e., of the two-point connected vacuum Green's function. Further, in view of our discussions in Sec. 4 (see (9.4.13)-(9.4.18)) we know that the RHS stands for $\frac{i\hbar}{m}D_F(t_1 - t_2)$, which means that
$$\langle T(Q(t_1)Q(t_2))\rangle = \frac{i\hbar}{m}\,D_F(t_1 - t_2) \tag{9.5.59}$$
Hence the two-point time ordered vacuum correlation function, i.e., the two-point 'connected' vacuum Green's function, is indeed the Feynman propagator of the theory (see (9.4.17)). It is also worth mentioning here that path integrals cannot be exactly evaluated for all types of Lagrangians. When Lagrangians are Gaussian (or can be reduced to a Gaussian), they can be evaluated without difficulty; a simple deviation from that, such as in the following Lagrangian:
$$L = \frac12 m\dot q^2 - \frac12 m\omega^2 q^2 - \frac{\lambda}{4}q^4 \tag{9.5.60}$$
makes the exact evaluation impossible.29 In this case an external source $J$ is introduced to write the vacuum functional:
$$Z[J] = N\int [dq]\,\exp\big((i/\hbar)S[q,J]\big) = N\int [dq]\,\exp\Big(\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\,\Big(\frac12 m\dot q^2 - \frac12 m\omega^2 q^2 - \frac{\lambda}{4}q^4 + Jq\Big)\Big) \tag{9.5.61}$$
Then, since
$$S[q,J] = S_0[q,J] - \int_{-\infty}^{\infty} dt\,\frac{\lambda}{4}q^4 \tag{9.5.62}$$
where $S_0[q,J]$ is the action of the harmonic oscillator in the presence of a source:
$$S_0[q,J] = \int_{-\infty}^{\infty} dt\,\Big(\frac12 m\dot q^2 - \frac12 m\omega^2 q^2 + Jq\Big) \tag{9.5.63}$$
we can write $Z[J]$ as:
$$Z[J] = N\int [dq]\,\exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty} dt\, q^4\Big)\cdot\exp\big((i/\hbar)S_0[q,J]\big) \tag{9.5.64}$$
Now
$$\frac{\delta S_0[q,J]}{\delta J(t)} = q(t) \tag{9.5.65}$$
hence the operator $\delta/\delta J(t)$ while acting on $S_0[q,J]$ can be identified with $q(t)$ and as a result, the RHS of (9.5.64) can be written as:
$$N\int [dq]\,\exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty} dt\,\Big(-i\hbar\frac{\delta}{\delta J(t)}\Big)^4\Big)\exp\big((i/\hbar)S_0[q,J]\big)$$
$$= \exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty} dt\,\Big(-i\hbar\frac{\delta}{\delta J(t)}\Big)^4\Big)\Big(N\int [dq]\,\exp\big((i/\hbar)S_0[q,J]\big)\Big) \tag{9.5.66}$$
29 The Lagrangian here is usually referred to as an 'anharmonic oscillator'.
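The second factor in the above expression is the vacuum functional for the harmonic oscillator interacting with the external source $J$; we denote it as $Z_0[J]$ and note that:
$$Z_0[J] = Z_0[0]\,\exp\Big(-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dt_1\,dt_2\, J(t_1)\,D_F(t_1 - t_2)\,J(t_2)\Big) \tag{9.5.67}$$
(see Subsec. 4.3). In view of this, (9.5.64) can now be written as:
$$Z[J] = Z_0[0]\,\exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty} dt\,\Big(-i\hbar\frac{\delta}{\delta J(t)}\Big)^4\Big)\exp\Big(-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dt_1\,dt_2\, J(t_1)\,D_F(t_1 - t_2)\,J(t_2)\Big) \tag{9.5.68}$$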
For small $\lambda$ (i.e., for weak coupling), the first exponential can be Taylor expanded and the vacuum functional can be obtained as a power series in $\lambda$. The above discussions show that all the vacuum Green's functions can be calculated perturbatively using the PI formalism as well. (Note that in Sec. 4 (Ftn. 17) we arrived at this result using the perturbative series; both approaches are the same in essence.)
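The idea of expanding in the coupling can be tried out on an ordinary integral. The sketch below is only an illustration under stated assumptions: it replaces the path integral by a single real variable x with the damped weight exp(-x^2/2 - lam x^4/4!) (a Euclidean, zero-dimensional stand-in with the quartic coupling normalized by 4!), and the names `weight`, `Z_ratio` and `two_point` are hypothetical. It compares the numerically integrated quantities with their first-order Taylor expansions in the coupling.

```python
import math


def integrate(f, lo=-12.0, hi=12.0, n=24001):
    """Trapezoidal rule; adequate here because the integrand decays rapidly."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n - 1):
        total += f(lo + k * h)
    return total * h


def weight(x, lam):
    """Toy weight: Gaussian part times the quartic interaction factor."""
    return math.exp(-0.5 * x * x - lam * x**4 / 24.0)


def Z_ratio(lam):
    """Z(lam) / Z(0) for the toy integral."""
    return integrate(lambda x: weight(x, lam)) / integrate(lambda x: weight(x, 0.0))


def two_point(lam):
    """<x^2> with the full toy weight."""
    num = integrate(lambda x: x * x * weight(x, lam))
    return num / integrate(lambda x: weight(x, lam))


if __name__ == "__main__":
    lam = 0.01
    # First-order expansions: Z(lam)/Z(0) ~ 1 - lam/8 and <x^2> ~ 1 - lam/2,
    # obtained from the Gaussian moments <x^4> = 3 and <x^6> = 15.
    print(Z_ratio(lam), 1 - lam / 8)
    print(two_point(lam), 1 - lam / 2)
```

For small values of the coupling the two columns agree up to terms of order lam^2, which is the toy version of keeping only the first term of the expanded exponential.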
5.6
Path Integral Approach to Field Theory
In Sec. 4 we have discussed the scalar field theory in detail without making much distinction between non-relativistic and relativistic theory; in Sec. 3, however, we studied the relativistic aspects of quantum theory. In this subsection our aim is to point out the features of field theory when it is accessed through the PI formalism. In this connection the following comments are in order.
Comment 9.5.3 The method of path integrals that has so far been discussed for one particle systems can be generalized to systems with many particles as well as to systems with many degrees of freedom. Consider a system characterized by the coordinates $x^a(t)$,30 ($a = 1, \ldots, n$). These coordinates may denote the coordinates of n particles in one dimension or represent a single particle in n dimensions. Thus if $S[x^a]$ is the action of the system, the transition amplitude (9.5.1) generalizes to
$$\langle x_f, t_f | x_i, t_i\rangle = N\int [dx^a]\,\exp\big((i/\hbar)S[x^a]\big) \tag{9.5.69}$$
with $S[x^a]$ being the generic action:
$$S[x^a] = \int_{t_i}^{t_f} dt\, L(x^a, \dot x^a) \tag{9.5.70}$$
30 We are using the letter x in place of q to distinguish multi-dimensionality here.
Note that the integration is done over all paths originating from $x_i^a$ at $t = t_i$ and ending at $x_f^a$ when $t = t_f$. The transition amplitudes in the presence of sources introduced through appropriate couplings can now be written as:
$$\langle x_f, t_f | x_i, t_i\rangle_J = N\int [dx^a]\,\exp\big((i/\hbar)S[x^a, J^a]\big) \tag{9.5.71}$$
where
$$S[x^a, J^a] = S[x^a] + \int_{t_i}^{t_f} dt\, J^a(t)\,x^a(t) \tag{9.5.72}$$
These basic transition amplitudes allow us to derive other transition amplitudes or matrix elements similar to those in (9.5.37) and (9.5.52). The latter of these corresponds to the vacuum to vacuum transition amplitude, which was obtained by letting the time interval approach infinity in the limit and having no end point restrictions while integrating over paths (i.e., initial and final coordinates of the paths could be chosen arbitrarily). The generating functional and the action in this case are:
$$Z[J] = \langle 0 | 0\rangle_J = N\int [dx^a]\,\exp\big((i/\hbar)S[x^a, J^a]\big) \tag{9.5.73}$$
and
$$S[x^a, J^a] = \int_{-\infty}^{\infty} dt\,\big(L(x^a, \dot x^a) + J^a(t)\,x^a(t)\big) \tag{9.5.74}$$
5.7
PI-formalism and Field Theories (with Infinite Degrees of Freedom)
Comment 9.5.4 The path integral method can be extended to continuum field theories after suitable changes; in this case physical systems involve infinitely many degrees of freedom. Thus for a 1+1 dimensional theory whose basic variable is the field $\phi(x, t)$, the vacuum to vacuum transition function in the presence of a source is given as:
$$Z[J] = \langle 0 | 0\rangle_J = N\int [d\phi]\,\exp\big((i/\hbar)S[\phi, J]\big) \tag{9.5.75}$$
where
$$S[\phi, J] = S[\phi] + \int dt\,dx\, J(x,t)\,\phi(x,t) \tag{9.5.76}$$
The form of the action $S[\phi]$ is chosen depending on whether the field theory is relativistic or non-relativistic. It is to be noted, however, that the relation between the Lagrangian and the Hamiltonian must be canonical (see Examples (6.3.2) and (9.4.4)), since then alone the PI method that leads to (9.5.73) or (9.5.75) can be applied. If the relation is not canonical, the transition amplitudes are computed using the methods discussed in Sec. 4 (see, for instance, (9.4.3) and Exc. (9.4.1)), and the formalism is referred to as Feynman's path integral in phase space. Consider a relativistic field theory in spacetime (3 + 1 dimensions) with Lagrangian density:
$$\mathcal{L}(\phi, \partial_\mu\phi) = \frac12\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\phi^2 - \frac{\lambda}{4!}\phi^4 \tag{9.5.77}$$
with $\lambda > 0$, then
$$S[\phi] = \int d^4x\,\mathcal{L}(\phi, \partial_\mu\phi) \tag{9.5.78}$$
and
$$S[\phi, J] = S[\phi] + \int d^4x\, J(x)\,\phi(x) \tag{9.5.79}$$
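The Euler-Lagrange equation coming from (9.5.79) takes the form:
$$-\frac{\delta S[\phi, J]}{\delta\phi(x)} = \partial_\mu\partial^\mu\phi + m^2\phi + \frac{\lambda}{3!}\phi^3 - J(x) = 0 \tag{9.5.80}$$
(here $x$ refers to $(\mathbf{x}, t)$ collectively). The theory with the dynamical equation (9.5.80) or the action (9.5.79) is known as the $\phi^4$-theory in the literature. When $\lambda = 0$ the Lagrangian density (9.5.77) is quadratic in the field variables and hence the generating functional, denoted $Z_0[J]$ in this case, can be evaluated following the methods of quantum mechanical systems (see (9.5.61) and (9.5.67)), thus
$$Z_0[J] = N\int [d\phi]\,\exp\big((i/\hbar)S_0[\phi, J]\big) = N\int [d\phi]\,\exp\Big(\frac{i}{\hbar}\int d^4x\,\Big(\frac12\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\phi^2 + J\phi\Big)\Big)$$
$$= N\int [d\phi]\,\exp\Big(-\frac{i}{\hbar}\int d^4x\,\Big(\frac12\,\phi\,(\partial_\mu\partial^\mu + m^2)\,\phi - J\phi\Big)\Big) = Z_0[0]\,\exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x\,d^4x'\, J(x)\,D_F(x - x')\,J(x')\Big) \tag{9.5.81}$$
(We have used here the arguments given in the Hint of Exc. (5.1), in particular we have used the Eq. (v) there, see also (9.5.68).) As in the case of the harmonic oscillator, when $\lambda = 0$ we have here:
$$\langle 0 | \phi(x) | 0\rangle = (-i\hbar)\,\frac{1}{Z_0[J]}\frac{\delta Z_0[J]}{\delta J(x)}\bigg|_{J=0} = 0 \tag{9.5.82}$$
and
$$\langle 0 |\, T(\phi(x)\phi(y)) | 0\rangle = (-i\hbar)^2\,\frac{1}{Z_0[J]}\frac{\delta^2 Z_0[J]}{\delta J(x)\,\delta J(y)}\bigg|_{J=0} = i\hbar\,D_F(x - y) \tag{9.5.83}$$
(see (9.5.57) and (9.5.59)).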
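Similarly, identifying $\phi(x)$ with $-i\hbar\,\delta/\delta J(x)$, we can use the arguments of the anharmonic oscillator to write the generating functional for the case $\lambda \neq 0$ as:
$$Z[J] = N\int [d\phi]\,\exp\Big(\frac{i}{\hbar}\int d^4x\,\Big(\frac12\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\phi^2 - \frac{\lambda}{4!}\phi^4 + J\phi\Big)\Big)$$
$$= N\int [d\phi]\,\exp\Big(-\frac{i\lambda}{4!\,\hbar}\int d^4x\,\phi^4\Big)\exp\big((i/\hbar)S_0[\phi, J]\big)$$
$$= N\int [d\phi]\,\exp\Big(-\frac{i\lambda}{4!\,\hbar}\int d^4x\,\Big(-i\hbar\frac{\delta}{\delta J(x)}\Big)^4\Big)\exp\big((i/\hbar)S_0[\phi, J]\big)$$
$$= \exp\Big(-\frac{i\lambda}{4!\,\hbar}\int d^4x\,\Big(-i\hbar\frac{\delta}{\delta J(x)}\Big)^4\Big)\Big(N\int [d\phi]\,\exp\big((i/\hbar)S_0[\phi, J]\big)\Big)$$
$$Z[J] = \exp\Big(-\frac{i\lambda}{4!\,\hbar}\int d^4x\,\Big(-i\hbar\frac{\delta}{\delta J(x)}\Big)^4\Big)\,Z_0[J] \tag{9.5.84}$$
Note that (9.5.84) is quite similar to (9.5.66) except for the fact that the basic variable here is the field $\phi(x)$. Since $(-i\hbar)^4 = \hbar^4$, the Taylor expansion of the exponential in powers of $\lambda$ gives:
$$Z[J] = \bigg[1 - \frac{i\lambda\hbar^3}{4!}\int d^4x\,\frac{\delta^4}{\delta J^4(x)} + \frac{1}{2!}\Big(-\frac{i\lambda\hbar^3}{4!}\Big)^2\int d^4x\,\frac{\delta^4}{\delta J^4(x)}\int d^4y\,\frac{\delta^4}{\delta J^4(y)} + \cdots\bigg]\,Z_0[J] \tag{9.5.85}$$
where $Z_0[J]$ is given by (9.5.81). We use the above expansion to derive Green's functions by further assuming that $\lambda$ is so small that $\lambda^2$ can be neglected. To first order in $\lambda$ the two-point function is (see Hint 3 to Exercise 9.5):
$$\langle 0 |\, T(\phi(x_1)\phi(x_2)) | 0\rangle = i\hbar\,D_F(x_1 - x_2) - \frac{\lambda\hbar^2}{2}\,D_F(0)\int d^4x\, D_F(x - x_1)\,D_F(x - x_2) \tag{9.5.86}$$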
and the four-point function is:
$$\langle 0 |\, T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)) | 0\rangle = -\hbar^2\big\{D_F(x_1 - x_2)D_F(x_3 - x_4) + D_F(x_1 - x_3)D_F(x_2 - x_4) + D_F(x_1 - x_4)D_F(x_2 - x_3)\big\}$$
$$- \frac{i\lambda\hbar^3}{2}\,D_F(0)\int d^4x\,\big\{D_F(x_1 - x_2)D_F(x - x_3)D_F(x - x_4) + D_F(x_1 - x_3)D_F(x - x_2)D_F(x - x_4) + \cdots \text{4 similar terms with appropriate combinations of } x, x_1, x_2, x_3, x_4\big\}$$
$$- i\lambda\hbar^3\int d^4x\, D_F(x - x_1)D_F(x - x_2)D_F(x - x_3)D_F(x - x_4) \tag{9.5.87}$$
(Note the absence of terms which involve an odd number of $D_F(x - x_i)$'s ($i = 1, 2, 3, 4$); see Exc. (9.5.3) for derivations.) The expressions (9.5.86) and (9.5.87) represent the two-point and four-point Green's functions of the $\phi^4$-theory.
Comment 9.5.5 When instead of relativistic fields (discussed above) we consider gauge fields, we have to deal with an extra symmetry that creeps in due to gauge transformation rules, e.g.:
$$\mathbf{A}' = \mathbf{A} + \nabla\phi \tag{9.5.88}$$
which results in overcounting when integration is done. Thus while the PI formulation allows an easy demonstration of the gauge invariance of the theory, it introduces spurious degrees of freedom. Therefore, while using the PI formalism it is important that these redundant degrees of freedom (due to gauge invariance) are weeded out by restricting the theory with gauge-fixing conditions and by factoring out the (infinite) volume (of the orbit) due to overcounting (see Sec. 6.4-5). The computations based on these constraints for general non-abelian gauge theories are quite complicated (see Sec. 6.4-5), and are beyond our limited scope; we illustrate the theory by choosing a Yang-Mills non-abelian gauge.
Example 9.5.6 Consider the Lagrangian density:
$$\mathcal{L} = -\frac14\,F^a_{\mu\nu}F^{a\mu\nu}, \qquad a = 1, 2, 3 \tag{9.5.89}$$
for the SU(2) Yang-Mills field, where:
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g\,\epsilon^{abc}A^b_\mu A^c_\nu \tag{9.5.90}$$
$g$ being a coupling constant. The generating functional with external source $J$ can be written here as:
$$Z[J] = \int [dA_\mu]\,\exp\Big\{i\int d^4x\,\big[\mathcal{L}(x) + J_\mu(x)\cdot A^\mu(x)\big]\Big\} \tag{9.5.91}$$
with free-field part as:
$$Z_0[J] = \int [dA_\mu]\,\exp\Big\{i\int d^4x\,\big[\mathcal{L}_0(x) + J_\mu(x)\cdot A^\mu(x)\big]\Big\} \tag{9.5.92}$$
where
$$\int d^4x\,\mathcal{L}_0(x) = -\frac14\int d^4x\,(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)(\partial^\mu A^{a\nu} - \partial^\nu A^{a\mu})$$
$$= \frac12\int d^4x\, A^a_\mu(x)\,(g^{\mu\nu}\partial^2 - \partial^\mu\partial^\nu)\,A^a_\nu(x) \tag{9.5.93}$$
Obviously the generating functional would be obtained if the integral on the RHS of (9.5.91) or (9.5.92) could be calculated. This, however, is not possible, although the expression in the second line of (9.5.93) is similar to the scalar field theory (see (9.5.81)), because the operator:
$$B^{\mu\nu} = (g^{\mu\nu}\partial^2 - \partial^\mu\partial^\nu) \tag{9.5.94}$$
has no inverse. In other words, $\det B = \det\|B^{\mu\nu}\|$ is zero, and since this will appear in the denominator if (9.5.93) is calculated, it follows that the integral is divergent. We shall see next how this situation can be avoided. The procedure discussed below is often referred to as PI volume factoring in gauge theories.
5.8
The Faddeev-Popov Ansatz
Let $\theta$ be the spacetime-dependent parameters of the group SU(2), and $\sigma$ be the Pauli matrices, then the gauge transformation $A_\mu \to A^\theta_\mu$ defined by
$$A^\theta_\mu\cdot\frac{\sigma}{2} = U(\theta)\Big[A_\mu\cdot\frac{\sigma}{2} + \frac{1}{ig}\,U^{-1}(\theta)\,\partial_\mu U(\theta)\Big]U^{-1}(\theta) \tag{9.5.95}$$
with
$$U(\theta) = \exp\big(-i\theta(x)\cdot\sigma/2\big) \tag{9.5.96}$$
leaves the action $S$ invariant. This means that the action is constant on the orbit of the gauge group formed with all $A^\theta_\mu$'s for some fixed $A_\mu$ as $U(\theta)$ ranges over all elements of SU(2). A proper quantization will thus be obtained by restricting the path integration to a 'hypersurface' which intersects each orbit just once. We write the equation of this hypersurface as:
$$f_a(A_\mu) = 0, \qquad a = 1, 2, 3 \tag{9.5.97}$$
and note that for a given $A_\mu$, the equation:
$$f_a(A^\theta_\mu) = 0 \tag{9.5.98}$$
has a unique solution $\theta = \theta(x)$. Equation (9.5.97) is evidently a gauge fixing condition as shown in Fig. (9.3).
Fig. 9.3 Orbits in gauge group manifold; gauge fixing constraint
In order to obtain the volume of the orbit, we have to define the integration over the group space. Let $\theta$ and $\theta'$ be elements of SU(2), then in terms of the representation matrices $U(\theta)$, the multiplication of group elements takes the form:
$$U(\theta)\,U(\theta') = U(\theta\theta') = U(\theta'') \tag{9.5.99}$$
In a neighbourhood of the identity $U(\theta)$ can be written as:
$$U(\theta) = 1 - i\theta\cdot\frac{\sigma}{2} + O(\theta^2) \tag{9.5.100}$$
and the integration measure over the group space can be taken as:
$$[d\theta] = \prod_{a=1}^{3} d\theta_a \tag{9.5.101}$$
which is invariant in the sense that $d(\theta\theta') = d\theta'$. Next we define a function $D_f[A_\mu]$ by integrating over the group space:
$$D_f^{-1}[A_\mu] = \int [d\theta(x)]\,\delta\big[f_a(A^\theta_\mu)\big] \tag{9.5.102}$$
Thus
$$D_f[A_\mu] = \det R_f \tag{9.5.103}$$
where
$$(R_f)_{ab} = \frac{\delta f_a}{\delta\theta_b} \tag{9.5.104}$$
The above equation implies that $R_f$ is just the response of $f_a[A_\mu]$ to the infinitesimal gauge transformation. Using (9.5.95) and simplifying it in view of (9.5.100), the infinitesimal gauge transformation is of the form:
$$A^{\theta a}_\mu = A^a_\mu + \epsilon^{abc}\theta^b A^c_\mu - \frac{1}{g}\,\partial_\mu\theta^a \tag{9.5.105}$$
and therefore it follows that the response of $f_a[A_\mu]$ is:
$$f_a[A^\theta_\mu(x)] = f_a[A_\mu(x)] + \int d^4y\,[R_f(x, y)]_{ab}\,\theta_b(y) + O(\theta^2) \tag{9.5.106}$$
We note here that due to the uniqueness of the solution of (9.5.98), $\det R_f$ is not zero, and also that the function $D_f[A_\mu]$ is gauge invariant. The gauge invariance of $D_f[A_\mu]$ can be further examined by writing (9.5.102) as:
$$D_f^{-1}[A_\mu] = \int [d\theta'(x)]\,\delta\big[f_a(A^{\theta'}_\mu)\big] \tag{9.5.107}$$
which implies that we can write:
$$D_f^{-1}[A^\theta_\mu] = \int [d\theta'(x)]\,\delta\big[f_a(A^{\theta\theta'}_\mu(x))\big] = \int [d(\theta(x)\theta'(x))]\,\delta\big[f_a(A^{\theta\theta'}_\mu(x))\big] = \int [d\theta''(x)]\,\delta\big[f_a(A^{\theta''}_\mu(x))\big] \tag{9.5.108}$$
But the RHS of the above equality is simply $D_f^{-1}[A_\mu]$ from (9.5.107). Using (9.5.102) we can now write the path-integral representation of the vacuum-to-vacuum amplitude as:
$$\int [dA_\mu]\,\exp\Big\{i\int d^4x\,\mathcal{L}(x)\Big\} = \int [d\theta(x)][dA_\mu(x)]\,D_f[A_\mu]\,\delta\big[f_a(A^\theta_\mu)\big]\exp\Big\{i\int d^4x\,\mathcal{L}(x)\Big\}$$
$$= \int [d\theta(x)][dA_\mu(x)]\,D_f[A_\mu]\,\delta\big[f_a(A_\mu)\big]\exp\Big\{i\int d^4x\,\mathcal{L}(x)\Big\} \tag{9.5.109}$$
In writing the last line, we have used the fact that both $D_f[A_\mu]$ and $\exp\{i\int d^4x\,\mathcal{L}(x)\}$ are invariant under the gauge transformation $A_\mu \to A^\theta_\mu$. This shows that the integrand is independent of $\theta(x)$ and the integration over $\prod_x d\theta(x)$ is the infinite orbit volume that we have been looking to identify.
Hence using (9.5.103) and (9.5.109), we can write the generating functional of the gauge field $A_\mu$ (which is free from redundancy) as:
$$Z_f[J] = \int [dA_\mu]\,(\det R_f)\,\delta\big[f_a(A_\mu)\big]\exp\Big\{i\int d^4x\,\big[\mathcal{L}(x) + J_\mu\cdot A^\mu\big]\Big\} \tag{9.5.110}$$
This is called the Faddeev-Popov (FP) Ansatz. Note that essentially the quantization here has been done by restricting the functional measure with $\det|\delta f/\delta\theta|\,\delta[f(A_\mu)]$. (See Exc. (9.5.4) for an example of gauge-fixing.) We further note that the above discussions can be applied to Abelian gauge theory. Under a U(1) gauge transformation equation (9.5.105) becomes:
$$A^\theta_\mu(x) = A_\mu(x) - \frac{1}{g}\,\partial_\mu\theta(x) \tag{9.5.111}$$
Thus for any choice of linear gauge-fixing condition of (9.5.97) the response matrix $R_f$ in (9.5.104) or (9.5.106) is independent of the gauge field $A_\mu$. Hence the FP factor ($\det R_f$) plays no physical role and can be dropped from the generating functional. The generating functional in this case is:
$$Z_f[J] = \int [dA_\mu]\,\delta\big[f(A_\mu)\big]\exp\Big\{i\int d^4x\,\big[\mathcal{L}(x) + J_\mu(x)A^\mu(x)\big]\Big\} \tag{9.5.112}$$
(For further studies on the subject of gauge fields, the reader is referred to Chapter 9 in [6], and [11].) Two recent texts on path integration and its applications which make excellent reading are [7] and [19]. Having given the reader some idea of path integrals, we move on to Feynman graphs in the next section. But before this, we illustrate some of the theory of this section in the Hints to the Exercises.
Exercise 9.5
1. Show that the generating functional $Z[J]$ for the harmonic oscillator with an external source $J$ given by
(a) $$Z[J] = N\int [dq]\,\exp\Big(\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\,\Big(\frac12 m\dot q^2 - \frac12 m\omega^2 q^2 + Jq\Big)\Big)$$
can also be written as
(b) $$Z[J] = Z[0]\,\exp\Big(-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dt\,dt'\, J(t)\,D_F(t - t')\,J(t')\Big).$$
2. Show that the generating functional $Z_0[J]$ for the action $S_0[\phi, J]$ in a relativistic scalar field theory can be expressed as:
(a) $$Z_0[J] = Z_0[0]\,\exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x\,d^4x'\, J(x)\,D_F(x - x')\,J(x')\Big).$$
3. Consider the Taylor expansion up to first order in $\lambda$ for the generating functional of the $\phi^4$-theory given in (9.5.84) and show that the two-point and four-point Green's functions are given by (9.5.86) and (9.5.87).
4. Show that when the gauge-fixing condition $f_a(A_\mu) = 0$ ($a = 1, 2, 3$) given in (9.5.97) is simply
(a) $$f_a = A^a_3 = 0$$
then the response matrix $R_f$ is independent of the gauge field and thus while writing the generating functional, ($\det R_f$) can be ignored, hence the $Z_f[J]$ for Yang-Mills theory in this case becomes:
(b) $$Z_f[J] = \int [dA_\mu]\,\delta(A_3)\,\exp\{iS[J]\}$$
where
(c) $$S[J] = \int d^4x\,\Big(-\frac14\,(F^a_{\mu\nu})^2 + J^a_\mu A^{a\mu}\Big).$$
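Hints to Exercise 9.5
1. Consider the harmonic oscillator Lagrangian, which up to a total time derivative can be written as
$$\frac12 m\dot q^2 - \frac12 m\omega^2 q^2 = -\frac12\,m\,q(t)\Big(\frac{d^2}{dt^2} + \omega^2\Big)q(t).$$
In view of (9.5.51) the generating functional with external source $J$ can be written as:
$$Z[J] = N\int [dq]\,\exp\Big\{\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\,\Big(\frac12 m\dot q^2 - \frac12 m\omega^2 q^2 + Jq\Big)\Big\}$$
To use the theory developed in Subsec. (4.3) we further write it as:
(i) $$Z[J] = \lim_{\epsilon \to 0^+} N\int [dq]\,\exp\Big[-\frac{im}{2\hbar}\int_{-\infty}^{\infty} dt\,\Big(q(t)\Big(\frac{d^2}{dt^2} + \omega^2 - i\epsilon\Big)q(t) - \frac{2}{m}J(t)q(t)\Big)\Big].$$
(Note that in writing the harmonic oscillator terms on the RHS of (i) we have used the fact: $\dot q(t)\,q(t) = \text{const.}$)
From (9.4.15) and (9.4.16) we know that
(ii) $$\lim_{\epsilon \to 0^+}\Big(\frac{d^2}{dt^2} + \omega^2 - i\epsilon\Big)D_F(t - t') = -\delta(t - t').$$
We use this to define a new variable $\bar q(t)$:
(iii) $$\bar q(t) = q(t) + \frac{1}{m}\int_{-\infty}^{\infty} dt'\,D_F(t - t')\,J(t')$$
whose substitution in (i) reduces it to the form:
(iv) $$Z[J] = \lim_{\epsilon \to 0^+} N\int [d\bar q]\,\exp\Big\{-\frac{im}{2\hbar}\int_{-\infty}^{\infty} dt\,\Big[\Big(\bar q(t) - \frac{1}{m}\int_{-\infty}^{\infty} dt'\,D_F(t - t')J(t')\Big)\Big(\frac{d^2}{dt^2} + \omega^2 - i\epsilon\Big)\Big(\bar q(t) - \frac{1}{m}\int_{-\infty}^{\infty} dt'\,D_F(t - t')J(t')\Big)$$
$$\qquad - \frac{2}{m}J(t)\Big(\bar q(t) - \frac{1}{m}\int_{-\infty}^{\infty} dt'\,D_F(t - t')J(t')\Big)\Big]\Big\}.$$
In view of (ii) this simplifies to:
(v) $$Z[J] = \lim_{\epsilon \to 0^+}\Big[N\int [d\bar q]\,\exp\Big\{-\frac{im}{2\hbar}\int_{-\infty}^{\infty} dt\,\bar q(t)\Big(\frac{d^2}{dt^2} + \omega^2 - i\epsilon\Big)\bar q(t)\Big\}\Big] \times \exp\Big\{-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dt\,dt'\,J(t)\,D_F(t - t')\,J(t')\Big\}$$
The first factor in the above term is the generating functional of the harmonic oscillator without source and can be written as $Z[0]$. This establishes the required result in (b). We would like to note here that $Z[0]$ is also written in the literature as:
(vi) $$\lim_{\epsilon \to 0^+} N\Big[\det\Big(\frac{d^2}{dt^2} + \omega^2 - i\epsilon\Big)\Big]^{-1/2}$$
where det stands for the determinant of the operator. This follows from the generalization of the Gaussian integral:
(vii) $$\int_{-\infty}^{\infty} dx\,e^{iax^2} = \sqrt{\frac{i\pi}{a}}$$
to the functional integral involving a Hermitian operator $O$:
(viii) $$\int [dq]\,\exp\Big(i\int_{t_i}^{t_f} dt\,q(t)\,O(t)\,q(t)\Big) = N\,[\det O(t)]^{-1/2}.$$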
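2. The action $S_0[\phi, J]$ does not contain terms higher than second order in $\phi$, hence it is defined as
(i) $$S_0[\phi, J] = \int d^4x\,\Big(\frac12\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\phi^2 + J\phi\Big).$$
The field $\phi$ is assumed to satisfy
(ii) $$\lim_{|\mathbf{x}| \to \infty}\phi(\mathbf{x}, t) \to 0.$$
The arguments of Exc. (9.5.1) can be used here by noting that the point $q(t)$ of the trajectory in the above exercise is now replaced by the field $\phi(x)$ satisfying (ii). The new variable of integration (see (iii) in Hint 1) is:
(iii) $$\bar\phi(x) = \phi(x) + \int d^4x'\,D_F(x - x')\,J(x').$$
We substitute it in $Z_0[J]$, given below:
(iv) $$Z_0[J] = \lim_{\epsilon \to 0^+} N\int [d\phi]\,\exp\Big(-\frac{i}{\hbar}\int d^4x\,\Big(\frac12\,\phi(x)\,(\partial_\mu\partial^\mu + m^2 - i\epsilon)\,\phi(x) - J(x)\phi(x)\Big)\Big).$$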
After repeating the steps given in Exc. 1, followed by simplifications, we obtain the required result.
3. From (9.5.81) and (9.5.85), retaining the terms up to first order in $\lambda$ only, we have:
(i) $$Z[J] = Z_0[0]\Big[1 - \frac{i\lambda\hbar^3}{4!}\int d^4x\,\frac{\delta^4}{\delta J^4(x)}\Big]\exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x_1\,d^4x_2\,J(x_1)\,D_F(x_1 - x_2)\,J(x_2)\Big).$$
To compute this we have to evaluate the functional derivative with respect to $J(x)$ to the 4th order. We do this in two steps, thus:
(ii) $$\frac{\delta^2}{\delta J^2(x)}\exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x_1\,d^4x_2\,J(x_1)D_F(x_1 - x_2)J(x_2)\Big)$$
$$= \Big[-\frac{i}{\hbar}D_F(0) - \frac{1}{\hbar^2}\Big(\int d^4x_3\,D_F(x - x_3)J(x_3)\Big)\Big(\int d^4x_4\,D_F(x - x_4)J(x_4)\Big)\Big]\exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x_1\,d^4x_2\,J(x_1)D_F(x_1 - x_2)J(x_2)\Big)$$
(In taking the functional derivative we have used the rules laid down in Appendix 9C, e.g., (9C.84) to (9C.88).) This then leads to (iii) below:
(iii) $$\int d^4x\,\frac{\delta^4}{\delta J^4(x)}\exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x_1\,d^4x_2\,J(x_1)D_F(x_1 - x_2)J(x_2)\Big)$$
$$= \int d^4x\,\Big[-\frac{3}{\hbar^2}D_F(0)D_F(0) + \frac{6i}{\hbar^3}D_F(0)\Big(\int d^4x_3\,D_F(x - x_3)J(x_3)\Big)\Big(\int d^4x_4\,D_F(x - x_4)J(x_4)\Big) + \frac{1}{\hbar^4}\prod_{i=1}^{4}\Big(\int d^4x_i\,D_F(x - x_i)J(x_i)\Big)\Big]$$
$$\qquad \times \exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x_1\,d^4x_2\,J(x_1)D_F(x_1 - x_2)J(x_2)\Big)$$
Substituting this into (i) we have:
(iv) $$Z[J] = Z_0[0]\Big[1 + \frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x + \frac{\lambda}{4}D_F(0)\int d^4x\,\Big(\int d^4x_i\,D_F(x - x_i)J(x_i)\Big)^2 - \frac{i\lambda}{4!\,\hbar}\int d^4x\,\prod_{i=1}^{4}\Big(\int d^4x_i\,D_F(x - x_i)J(x_i)\Big)\Big]$$
$$\qquad \times \exp\Big(-\frac{i}{2\hbar}\int\!\!\int d^4x_1\,d^4x_2\,J(x_1)D_F(x_1 - x_2)J(x_2)\Big).$$
From (iv) it follows that:
(v) $$Z[0] = Z_0[0]\Big(1 + \frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x\Big)$$
showing the divergent nature of $Z[0]$. This divergence, however, is absorbed in the normalization constant. Taking the functional derivative of $Z[J]$ we now have:
(vi) $$\frac{\delta^2 Z[J]}{\delta J(x_1)\,\delta J(x_2)}\bigg|_{J=0} = Z_0[0]\Big[\Big(1 + \frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x\Big)\Big(-\frac{i}{\hbar}D_F(x_1 - x_2)\Big) + \frac{\lambda}{2}D_F(0)\int d^4x\,D_F(x - x_1)D_F(x - x_2)\Big].$$
Using the above computations we obtain from (9.5.83)-(9.5.85):
(vii) $$\langle 0 |\, T(\phi(x_1)\phi(x_2)) | 0\rangle = (-i\hbar)^2\,\frac{1}{Z[J]}\frac{\delta^2 Z[J]}{\delta J(x_1)\,\delta J(x_2)}\bigg|_{J=0}$$
$$= \frac{-\hbar^2}{Z_0[0]\big(1 + \frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x\big)} \times Z_0[0]\Big[\Big(1 + \frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x\Big)\Big(-\frac{i}{\hbar}D_F(x_1 - x_2)\Big) + \frac{\lambda}{2}D_F(0)\int d^4x\,D_F(x - x_1)D_F(x - x_2)\Big]$$
$$\simeq -\hbar^2\Big[-\frac{i}{\hbar}D_F(x_1 - x_2) + \frac{\lambda}{2}D_F(0)\int d^4x\,D_F(x - x_1)D_F(x - x_2)\Big]$$
$$= i\hbar\,D_F(x_1 - x_2) - \frac{\lambda\hbar^2}{2}D_F(0)\int d^4x\,D_F(x - x_1)D_F(x - x_2).$$
(The sign $\simeq$ is used to indicate that we have considered here only the leading order term coming from the expansion of $\big\{1 + \frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x\big\}^{-1}$. This is valid since we want here terms up to first order in $\lambda$.) The RHS of (vii) consists of two terms: the first of these is the familiar Feynman propagator for the free theory, and the second one is a divergent term due to the presence of $D_F(0)$ (the Feynman propagator with zero argument). The second term is referred to as a first order quantum correction to the propagator in the theory. That quantum corrections in field theories lead to divergences is a well recognized feature of quantum field theories. These divergences are, however, taken care of through the process of renormalization (see in particular Chapters 9 and 10 in [6]). We leave the computation of the 4-point Green's function as an exercise to the reader. In view of the above discussions, we note that while the second term in (9.5.87) represents a divergent term, the third one does not. This confirms the above statement that while the first order quantum correction is divergent, the second order is not, so that we can regard the second and third terms as quantum corrections to the first term, which represents the 4-point Green's function in the free-field theory (see (9.4.20)).
4. Let $f_a(A_\mu) = 0$ be given as:
(i) $$f_a = A^a_3 = 0.$$
This gauge-fixing condition is called the axial gauge. The gauge transformation (9.5.105) in this case gives:
(ii) $$f_a(A^\theta_\mu) = A^a_3 + \epsilon^{abc}\theta^b A^c_3 - \frac{1}{g}\,\partial_3\theta^a = -\frac{1}{g}\,\partial_3\theta^a.$$
Thus in view of (9.5.104) the response matrix is $[R_f(x, y)]_{ab} = -\frac{1}{g}\,\delta_{ab}\,\partial_3\,\delta^4(x - y)$, showing that it is independent of the gauge field. The factor containing ($\det R_f$) can be dropped from (9.5.110), reducing it to:
(iii) $$Z_f[J] = \int [dA_\mu]\,\delta(A_3)\,\exp\{iS[J]\}$$
with $S[J]$ as given in (c).
6
FEYNMAN GRAPHS
The final section of this chapter is devoted to one of the most interesting contributions of Feynman to quantum theories. These are known as Feynman graphs and have been found quite useful in superfield theories and string theories. We present here the basics that underlie the construction of these graphs, and show how perturbative expansions that are quite cumbersome even at low orders (see Exc. (5.3) for instance) can be diagrammatically represented using the Feynman rules. The key elements for a graphic representation of the $\phi^4$-theory are the Feynman propagator of the free theory, and the interaction. These are related to a line and to an intersection point called the vertex. Given below are the line and the point along with the mathematical expressions they stand for:
(a) [line joining $x_1$ and $x_2$] $= i\hbar\,D_F(x_1 - x_2)$
(b) [vertex at which four lines from $x_1$, $x_2$, $x_3$, $x_4$ meet]
$$= V(x_1, x_2, x_3, x_4) = \frac{i}{\hbar}\,\frac{\delta^4 S[\phi]}{\prod_{i=1}^{4}\delta\phi(x_i)}\bigg|_{\phi=0} = -\frac{i\lambda}{\hbar}\int d^4x\,\prod_{i=1}^{4}\delta(x - x_i) \tag{9.6.1}$$
(see (9.5.78) for $S[\phi]$). With the help of these two elements, other non-trivial graphs can be constructed by joining the vertex to the propagators. Added to these elements is the rule of evaluating the graphs, which says that for any evaluation of graphs the integration is to be performed over the intermediate points where a vertex connects with the propagator. We illustrate it by using the diagram:
[Diagram: a line from $x_1$ to $x_2$ with a closed loop (bubble) attached at the intermediate point $y$] (9.6.2)
We note that similar to the intersection shown in (9.6.1)(b) we have here the point characterized by the merging of $y_1, y_2, y_3, y_4$, and as in (9.6.1)(a) we have the lines $(x_1, y_1)$, $(y_2, x_2)$ and the curved line $(y_3, y_4)$ (or $(y_4, y_3)$). Hence using the mathematical expressions given in (9.6.1) and the integration rule suggested above, we have the required value of the graph (9.6.2) (subject to adjustments for symmetry) as:
$$\int d^4y_1\,d^4y_2\,d^4y_3\,d^4y_4\; i\hbar D_F(x_1 - y_1)\; i\hbar D_F(y_2 - x_2)\; i\hbar D_F(y_3 - y_4)\; V(y_1, y_2, y_3, y_4)$$
$$= (i\hbar)^3\int d^4y_1\,d^4y_2\,d^4y_3\,d^4y_4\; D_F(x_1 - y_1)D_F(y_2 - x_2)D_F(y_3 - y_4) \times \Big(-\frac{i\lambda}{\hbar}\int d^4y\;\delta(y - y_1)\delta(y - y_2)\delta(y - y_3)\delta(y - y_4)\Big)$$
$$= -\lambda\hbar^2\,D_F(0)\int d^4y\,D_F(x_1 - y)D_F(y - x_2) \tag{9.6.3}$$
We note that there is a symmetry in the diagram (9.6.2) as the bubble can be rotated through 180° leaving the diagram unaltered; there is one such bubble, hence the symmetry factor is $2^1 = 2$. Dividing by 2 we obtain the actual value of (9.6.2) in terms of Feynman propagators:
$$[\text{diagram (9.6.2)}] = -\frac{\lambda\hbar^2}{2}\,D_F(0)\int d^4y\,D_F(x_1 - y)D_F(y - x_2) \tag{9.6.4}$$
From Exc. 3 of Sec. (9.5) we know that this is the first order (linear in $\lambda$) correction to the propagator; this confirms (in a small way) that perturbative expansions can be represented by a graph. Using the above graphic representations we can write (9.5.86) and (9.5.87) as:
$\langle 0 |\, T(\phi(x_1)\phi(x_2)) | 0\rangle$ = [line joining $x_1$ and $x_2$] + [line joining $x_1$ and $x_2$ carrying one bubble] (9.6.5)
$\langle 0 |\, T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)) | 0\rangle$ = [three pairs of parallel lines joining $x_1, x_2, x_3, x_4$] + [six diagrams in which one line of a pair carries a bubble] + [vertex joining $x_1, x_2, x_3, x_4$] (9.6.6)
Note that in (9.6.6) the three sets of parallel lines correspond to the first three terms of (9.5.87). Also in view of (9.6.4) the next six diagrams correspond to the next six terms of (9.5.87), while the last term there corresponds to the diagram of (9.6.1)(b).
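The counting behind these diagrams can be reproduced by enumerating pairings of field legs. The sketch below is an illustration under stated assumptions, not the book's procedure: it treats each field as a labelled leg, pairs legs in all possible ways (Wick pairings), and uses hypothetical names (`pairings`, `count_two_point_first_order`). The four legs of a single quartic vertex are labelled y0..y3, and the 1/4! of the coupling is divided out at the end.

```python
from fractions import Fraction
from math import factorial


def pairings(legs):
    """Yield every way of grouping the labelled legs into unordered pairs."""
    if not legs:
        yield []
        return
    first, rest = legs[0], legs[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def count_two_point_first_order():
    """Count pairings of two external legs with the four legs of one vertex.

    Returns (connected, disconnected): pairings in which x1 and x2 both attach
    to the vertex, and pairings in which x1 is joined directly to x2 while the
    vertex legs pair among themselves.
    """
    legs = ["x1", "x2", "y0", "y1", "y2", "y3"]
    connected = disconnected = 0
    for p in pairings(legs):
        if ("x1", "x2") in p:
            disconnected += 1
        else:
            connected += 1
    return connected, disconnected


if __name__ == "__main__":
    # Free four-point function: three ways of pairing x1..x4.
    print(len(list(pairings(["x1", "x2", "x3", "x4"]))))  # 3
    connected, disconnected = count_two_point_first_order()
    # Dividing by 4! gives the weights of the bubble-on-a-line diagram and of
    # the line accompanied by a vacuum loop.
    print(Fraction(connected, factorial(4)), Fraction(disconnected, factorial(4)))  # 1/2 1/8
```

The weight 1/2 found for the connected pairing is the reciprocal of the symmetry factor 2 quoted for the diagram (9.6.2).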
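6.1
Connected Diagrams
In Sec. 4 and 5 we have referred to connected diagrams and connected Green's functions (see Comment (9.4.1) and Subsec. 5.5); these, as we shall see, are obtained by considering the logarithmic generating functional $W[J]$ that we had introduced in (9.5.53): $W[J] = -i\hbar\ln Z[J]$. In view of (9.5.54) and (9.5.55) for a $\phi^4$-theory, this gives:
$$\frac{\delta W[J]}{\delta J(x_1)}\bigg|_{J=0} = (-i\hbar)\,\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(x_1)}\bigg|_{J=0} = \langle 0 | \phi(x_1) | 0\rangle \tag{9.6.7}$$
and
$$(-i\hbar)\,\frac{\delta^2 W[J]}{\delta J(x_1)\,\delta J(x_2)}\bigg|_{J=0} = (-i\hbar)^2\bigg[\frac{1}{Z[J]}\frac{\delta^2 Z[J]}{\delta J(x_1)\,\delta J(x_2)} - \frac{1}{Z^2[J]}\frac{\delta Z[J]}{\delta J(x_1)}\frac{\delta Z[J]}{\delta J(x_2)}\bigg]_{J=0}$$
$$= \langle 0 |\, T(\phi(x_1)\phi(x_2)) | 0\rangle - \langle 0 | \phi(x_1) | 0\rangle\langle 0 | \phi(x_2) | 0\rangle \tag{9.6.8}$$
Since $\langle 0 | \phi(x) | 0\rangle = 0$ in a $\phi^4$-theory, we note that the second term in (9.6.8) is zero, hence,
$$(-i\hbar)\,\frac{\delta^2 W[J]}{\delta J(x_1)\,\delta J(x_2)}\bigg|_{J=0} = \langle 0 |\, T(\phi(x_1)\phi(x_2)) | 0\rangle = \langle 0 |\, T(\phi(x_1)\phi(x_2)) | 0\rangle_c \tag{9.6.9}$$
This shows that the 2-point Green's function in $\phi^4$-theory is connected and its graphic representation is the same as the one given in (9.6.5). (The subscript c in $\langle\ \rangle_c$ indicates that the Green's function is evaluated using a connected diagram.) We can similarly write the 3- and 4-point connected Green's functions by taking the third and fourth order functional derivatives of $W[J]$. In the case of $\phi^4$-theory, however, the odd order Green's functions are zero, hence we consider the 4-point connected Green's function, which is given by:
$$(-i\hbar)^3\,\frac{\delta^4 W[J]}{\delta J(x_1)\,\delta J(x_2)\,\delta J(x_3)\,\delta J(x_4)}\bigg|_{J=0} = \langle 0 |\, T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)) | 0\rangle - \langle 0 |\, T(\phi(x_1)\phi(x_2)) | 0\rangle\langle 0 |\, T(\phi(x_3)\phi(x_4)) | 0\rangle$$
$$- \langle 0 |\, T(\phi(x_1)\phi(x_3)) | 0\rangle\langle 0 |\, T(\phi(x_2)\phi(x_4)) | 0\rangle - \langle 0 |\, T(\phi(x_1)\phi(x_4)) | 0\rangle\langle 0 |\, T(\phi(x_2)\phi(x_3)) | 0\rangle \tag{9.6.10}$$
Using the graphic representations given in (9.6.5) and (9.6.6), it is evident that we are retaining just one piece of the graph in (9.6.6), hence we have:
$\langle 0 |\, T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)) | 0\rangle_c$ = [vertex joining $x_1, x_2, x_3, x_4$] (9.6.11)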
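The subtraction of the three products of two-point functions in the connected four-point function can be seen numerically on a toy integral. The sketch below rests on assumptions that are not part of the text: a single real variable x with the damped weight exp(-x^2/2 - lam x^4/4!) stands in for the field, and `moments` and `connected_four_point` are hypothetical names. In this toy setting the connected four-point quantity is the fourth cumulant, m4 - 3 m2^2.

```python
import math


def integrate(f, lo=-12.0, hi=12.0, n=24001):
    """Trapezoidal rule; adequate here because the integrand decays rapidly."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n - 1):
        total += f(lo + k * h)
    return total * h


def moments(lam):
    """Return (<x^2>, <x^4>) for the toy weight exp(-x^2/2 - lam x^4/4!)."""
    def w(x):
        return math.exp(-0.5 * x * x - lam * x**4 / 24.0)

    z = integrate(w)
    m2 = integrate(lambda x: x**2 * w(x)) / z
    m4 = integrate(lambda x: x**4 * w(x)) / z
    return m2, m4


def connected_four_point(lam):
    """Fourth cumulant: full four-point moment minus the three pairings."""
    m2, m4 = moments(lam)
    return m4 - 3.0 * m2 * m2


if __name__ == "__main__":
    for lam in (0.0, 0.01):
        print(lam, connected_four_point(lam))
```

Without interaction the connected part vanishes, and for a small coupling it is close to -lam, i.e. only the single-vertex contribution survives at first order, in line with (9.6.11).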
Using the fact that $W[J]$ generates connected Green's functions, we can write down a diagrammatic expansion of the 2-point Green's function up to the order $\lambda^2$ as:
[Diagrams: the full propagator between $x_1$ and $x_2$ = the free line joining $x_1$ and $x_2$ + the loop corrections through order $\lambda^2$] (9.6.12)
From the above discussions it is evident that perturbation series can be more easily expanded via these graph techniques. To study these further we would now like to introduce the notion of irreducibility among these diagrams. In this connection we note that $W[J]$, which generates connected diagrams, sometimes contains diagrams that are reducible to two connected diagrams when an internal line is cut. The third graph on the RHS of (9.6.12) is an example of such reducibility. These diagrams are called 1P (one particle) reducible. We shall see that 1P irreducible (one particle irreducible) diagrams are of a more fundamental nature since we can construct all the connected diagrams with their help. To achieve this goal, we extend our considerations of field theory to a broader spectrum. For instance, we know that a one-point function in the presence of an external source is given by:
$$\langle 0 | \phi(x) | 0\rangle_J = \frac{\delta W[J]}{\delta J(x)} = (-i\hbar)\,\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(x)} \tag{9.6.13}$$
and that this is zero for a $\phi^4$-theory in the absence of $J$. We now consider the one-point function (vacuum expectation value) in the case of an arbitrary field and write it as:
$$\langle 0 | \phi(x) | 0\rangle = \langle 0 | \exp((i/\hbar)P\cdot x)\,\phi(0)\,\exp(-(i/\hbar)P\cdot x) | 0\rangle \tag{9.6.14}$$
where $\exp(-(i/\hbar)P\cdot x)$ is the generator of space-time translations. If we assume that the vacuum state in the Hilbert space under consideration is unique and is Poincaré invariant:
$$\exp(-(i/\hbar)P\cdot x)|0\rangle = |0\rangle \quad\text{or}\quad P_\mu|0\rangle = 0 \tag{9.6.15}$$
then, by symmetry arguments, we conclude that
$$\langle 0 | \phi(x) | 0\rangle = \langle 0 | \phi(0) | 0\rangle = \text{constant} \tag{9.6.16}$$
(For a $\phi^4$-theory, this constant is zero.) It is also worth pointing out that the value of the one-point function plays an important role in the study of symmetries, as a non-vanishing value of it implies spontaneous breakdown of some symmetry in the theory (see Sec. 6.5). Returning to the one-point function in the presence of a source, we note that it is a functional of the source and in general it is not zero; we denote it as $\phi_c(x)$ and refer to it as the classical field of the theory (though it is only a classical variable), thus from (9.6.13) we have:
$$\phi_c(x) = \langle 0 | \phi(x) | 0\rangle_J = \frac{\delta W[J]}{\delta J(x)} \tag{9.6.17}$$
It is natural to ask why this vacuum expectation value is called a classical field. The simple answer is: because it behaves like a classical field. This can be seen from the following observations:
Observation 9.6.1 The generating functional $Z[J] = N\int [d\phi]\,\exp((i/\hbar)S[\phi, J])$ is independent of $\phi$, which is only an integration variable. Hence under an infinitesimal redefinition $\phi \to \phi + \delta\phi$:
$$\delta Z[J] = \frac{iN}{\hbar}\int [d\phi]\left(\int d^4x\,\delta\phi(x)\,\frac{\delta S[\phi, J]}{\delta\phi(x)}\right)\exp((i/\hbar)S[\phi, J]) = 0 \tag{9.6.18}$$
(We have assumed here that $[d\phi]$ does not change under this redefinition of the field variable.)
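Observation 9.6.2 Since $\delta\phi(x)$ in (9.6.18) is arbitrary, we have:
$$N\int [d\phi]\,\frac{\delta S[\phi, J]}{\delta\phi(x)}\,\exp((i/\hbar)S[\phi, J]) = 0 \tag{9.6.19}$$
By definition Eq. (9.6.19) is the vacuum expectation value equation in the presence of source $J$:
$$\langle 0|\frac{\delta S[\phi, J]}{\delta\phi(x)}|0\rangle^{J} = 0 \tag{9.6.20}$$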
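where $\delta S[\phi, J]/\delta\phi(x) = 0$ is the familiar Euler-Lagrange equation. From Sec. 5 (see (9.5.80)) we know that for the $\phi^4$-theory this equation reads:
$$\partial_\mu\partial^\mu\phi(x) + m^2\phi(x) + \frac{\lambda}{3!}\phi^3(x) - J(x) = F(\phi(x)) - J(x) = 0 \tag{9.6.21}$$
where we have written the first three terms as a functional $F(\phi(x))$. Equation (9.6.19) then takes the form:
$$N\int [d\phi]\,\big(F(\phi(x)) - J(x)\big)\exp((i/\hbar)S[\phi, J]) = 0 \tag{9.6.22}$$
In view of the equality:
$$-i\hbar\,\frac{\delta Z[J]}{\delta J(x)} = N\int [d\phi]\,\phi(x)\,\exp((i/\hbar)S[\phi(x), J]) \tag{9.6.23}$$
we can identify:
$$\phi(x) \to -i\hbar\,\frac{\delta}{\delta J(x)}$$
and thus write (9.6.22) as:
$$\left[F\!\left(-i\hbar\frac{\delta}{\delta J(x)}\right) - J(x)\right]Z[J] = 0 \tag{9.6.24}$$
which can also be written as
$$\left[F\!\left(-i\hbar\frac{\delta}{\delta J(x)}\right) - J(x)\right]\exp((i/\hbar)W[J]) = 0$$
or as
$$\exp(-(i/\hbar)W[J])\left[F\!\left(-i\hbar\frac{\delta}{\delta J(x)}\right) - J(x)\right]\exp((i/\hbar)W[J]) = 0.$$
This gives:
$$F\!\left(\frac{\delta W[J]}{\delta J(x)} - i\hbar\frac{\delta}{\delta J(x)}\right) - J(x) = 0 \tag{9.6.25}$$
and after using (9.6.17) this yields:
$$F\!\left(\phi_c(x) - i\hbar\frac{\delta}{\delta J(x)}\right) - J(x) = 0 \tag{9.6.26}$$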
Equation (9.6.26) is the (full) dynamical equation of the theory at the quantum level. Although it is not the same as the classical Euler-Lagrange equation, we note that when $\hbar \to 0$ it takes a form similar to the classical form (9.6.21). This explains the nomenclature 'classical field' for $\phi_c(x)$, the vacuum expectation value of an arbitrary field $\phi$ in the presence of a source (see (9.6.17)). Now the quantum equation (9.6.26) written out in full (using (9.6.21)) becomes:
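$$(\partial_\mu\partial^\mu + m^2)\left(\phi_c(x) - i\hbar\frac{\delta}{\delta J(x)}\right) + \frac{\lambda}{3!}\left(\phi_c(x) - i\hbar\frac{\delta}{\delta J(x)}\right)^{3} - J(x) = 0 \tag{9.6.27}$$
or
$$(\partial_\mu\partial^\mu + m^2)\phi_c(x) + \frac{\lambda}{3!}\phi_c^3(x) - \frac{i\lambda\hbar}{2!}\,\phi_c(x)\frac{\delta\phi_c(x)}{\delta J(x)} - \frac{\lambda\hbar^2}{3!}\,\frac{\delta^2\phi_c(x)}{\delta J(x)\,\delta J(x)} - J(x) = 0$$
Reintroducing $W[J]$ this can be written as:
$$(\partial_\mu\partial^\mu + m^2)\frac{\delta W[J]}{\delta J(x)} + \frac{\lambda}{3!}\left(\frac{\delta W[J]}{\delta J(x)}\right)^{3} - \frac{i\lambda\hbar}{2!}\,\frac{\delta W[J]}{\delta J(x)}\,\frac{\delta^2 W[J]}{\delta J^2(x)} - \frac{\lambda\hbar^2}{3!}\,\frac{\delta^3 W[J]}{\delta J^3(x)} - J(x) = 0 \tag{9.6.28}$$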
The above equation governs the full dynamics of the quantum theory. Since Green's functions are given by functional derivatives of $W[J]$, the dynamical equations satisfied by the various Green's functions of the theory can be obtained from it. These equations are known as the Schwinger-Dyson equations. We would like to remark here that some of the terms in (9.6.28) are not well defined, since they are products of operators evaluated at the same spacetime point (it is known that in quantum theory products of field operators at the same spacetime point are always ill defined, and they can be used only after regularization (see Appendix 9C for the definition of regularization)). It is interesting to note, however, that when $\hbar \to 0$, all those ill defined terms disappear, leaving the familiar Euler-Lagrange equation:
$$(\partial_\mu\partial^\mu + m^2)\phi_c(x) + \frac{\lambda}{3!}\phi_c^3(x) - J(x) = 0 \tag{9.6.29}$$
The above equation can be solved iteratively by writing it as an integral equation in terms of a propagator, thus
$$\phi_c(x) = -\int d^4y\,D_F(x-y)\left(J(y) - \frac{\lambda}{3!}\phi_c^3(y)\right) = -\int d^4y\,D_F(x-y)\,J(y) + \frac{\lambda}{3!}\int d^4y\,D_F(x-y)\,\phi_c^3(y) \tag{9.6.30}$$
After simplification the iterative solution has the form:
$$\phi_c(x) = -\int d^4y\,D_F(x-y)\,J(y) - \frac{\lambda}{3!}\int d^4y_1\,d^4y_2\,d^4y_3\,d^4y_4\,D_F(x-y_1)\,D_F(y_1-y_2)\,D_F(y_1-y_3)\,D_F(y_1-y_4)\,J(y_2)\,J(y_3)\,J(y_4) + \cdots \tag{9.6.31}$$
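The iteration behind (9.6.30) and (9.6.31) can be tried out on a zero-dimensional toy model. This is only a sketch under stated assumptions, not the field-theoretic calculation: the field is a single number, the operator $\partial_\mu\partial^\mu + m^2$ is replaced by $m^2$, and the propagator becomes the number $D = -1/m^2$. Equation (9.6.29) then reads $m^2\phi + (\lambda/3!)\phi^3 - J = 0$. The function names and parameter values are hypothetical.

```python
def iterate_classical_field(J, m2=1.0, lam=0.1, steps=50):
    """Toy fixed-point iteration phi <- -D * (J - lam/3! * phi**3), with D = -1/m2."""
    D = -1.0 / m2          # toy propagator (assumption): m2 * D = -1
    phi = 0.0              # start the iteration from phi = 0
    for _ in range(steps):
        phi = -D * (J - lam / 6.0 * phi ** 3)
    return phi


def truncated_series(J, m2=1.0, lam=0.1):
    """First two terms of the toy analogue of (9.6.31): -D*J - (lam/3!) * D**4 * J**3."""
    D = -1.0 / m2
    return -D * J - (lam / 6.0) * D ** 4 * J ** 3


J = 0.2
phi = iterate_classical_field(J)
# Residual of the toy equation m2*phi + lam/3! * phi**3 - J = 0 (m2 = 1, lam = 0.1).
residual = 1.0 * phi + 0.1 / 6.0 * phi ** 3 - J
series_gap = abs(phi - truncated_series(J))
```

For a weak source the iterated value and the truncated series differ only at the next order in $J$, which is the pattern that the higher terms of (9.6.31) continue.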
The iterative solution can also be diagrammatically represented as:
[Diagram (9.6.32): graphical form of the series (9.6.31) for $\phi_c(x)$.]
where we use a vertex × to describe the interaction of the field with the external source as:
[Diagram (9.6.33): definition of the source vertex ×.]
6.2 Effective Functional and Feynman Graphs with Vertex-functions
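We now describe an important feature of the relation $\phi_c(x) = \delta W[J]/\delta J(x)$: it lets us define, by a Legendre transformation, a new functional of the classical field,
$$\Gamma[\phi_c] = W[J] - \int d^4x\,J(x)\,\phi_c(x) \tag{9.6.34}$$
whose functional derivative is easily seen to be (see Exc. 9.6.1):
$$\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)} = -J(x) \tag{9.6.35}$$
We have the following remark concerning the new functional $\Gamma[\phi_c]$:
Remark 9.6.4 Equation (9.6.35), which defines the source $J(x)$ as a functional of the classical field $\phi_c(x)$, reduces in the absence of the source to:
$$\left.\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right|_{\phi_c(x) = \text{constant}} = 0 \tag{9.6.36}$$
The above (extremum) equation is often used to help determine whether a symmetry is spontaneously broken.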
We show next how the effective action $W$ and the effective action functional $\Gamma$ can be utilized in expressing the theory in a compact form. For this we write:
$$W^{(n)} = \frac{\delta^n W[J]}{\delta J(x_1)\cdots\delta J(x_n)} \tag{9.6.37a}$$
$$\Gamma^{(n)} = \frac{\delta^n \Gamma[\phi_c]}{\delta\phi_c(x_1)\cdots\delta\phi_c(x_n)} \tag{9.6.37b}$$
and note that using these the equality:
$$\int d^4x\,\frac{\delta^2 W[J]}{\delta J(x_2)\,\delta J(x)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(x)\,\delta\phi_c(x_1)} = -\delta^4(x_1 - x_2) \tag{9.6.38}$$
can be written as an 'operational equation':
$$W^{(2)}\,\Gamma^{(2)} = -1 \tag{9.6.39}$$
(See (iii) in the Hint to Exc. (6.2).) This coordinate-free equation, when applied to matrix elements in the coordinate basis, will revert to its originating form (9.6.38). From our study in Sec. 4 and Sec. 5 (in particular Eq. (9.5.59)) and Eq. (9.6.8), we notice that there exists a relation between the Feynman propagator and the effective functional $W$; to see it further we write:
$$W^{(2)}\big|_{J=0} = -D \tag{9.6.40}$$
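Moreover, when $J = 0$, (9.6.39) reads:
$$W^{(2)}\big|_{J=0}\,\Gamma^{(2)}\big|_{\phi_c} = -1 \tag*{(9.6.41)(a)}$$
or equivalently:
$$D\,\Gamma^{(2)}\big|_{\phi_c} = 1 \tag*{(9.6.41)(b)}$$
This shows that $\Gamma^{(2)}|_{\phi_c}$ is the inverse of the propagator at every order of the perturbation theory. We use this fact to write:
$$\Gamma^{(2)}\big|_{\phi_c} = \Gamma_0^{(2)} + \Sigma = D_F^{-1} + \Sigma \tag{9.6.42}$$
where $\Sigma$ denotes the quantum corrections in $\Gamma^{(2)}$ (see footnote 31). From (9.6.41)(b) it follows that: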
31. In the case of ground state $|0\rangle$, the logarithmic functional $W[J]$ becomes $W_0[J]$ and $\Gamma(\phi_c)$ is denoted $\Gamma_0(\phi_c)$; therefore $D$ in this case is $D_F$ (see (9.5.82-83)).
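$$D = \frac{1}{D_F^{-1} + \Sigma} = \frac{1}{D_F^{-1}} - \frac{1}{D_F^{-1}}\,\Sigma\,\frac{1}{D_F^{-1}} + \frac{1}{D_F^{-1}}\,\Sigma\,\frac{1}{D_F^{-1}}\,\Sigma\,\frac{1}{D_F^{-1}} - \cdots = D_F - D_F\,\Sigma\,D_F + D_F\,\Sigma\,D_F\,\Sigma\,D_F - \cdots \tag{9.6.43}$$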
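The resummation in (9.6.43) is a geometric series, and it can be checked numerically when $D_F$ and $\Sigma$ are replaced by ordinary numbers. The sketch below makes exactly that simplifying assumption (commuting scalars with $|\Sigma D_F| < 1$); the function names and the sample values are hypothetical.

```python
def dressed_propagator_exact(d_f, sigma):
    """Closed form D = 1 / (D_F^{-1} + Sigma) for scalar D_F and Sigma."""
    return 1.0 / (1.0 / d_f + sigma)


def dressed_propagator_series(d_f, sigma, n_terms):
    """Partial sum D_F - D_F*Sigma*D_F + D_F*Sigma*D_F*Sigma*D_F - ..."""
    total = 0.0
    term = d_f
    for _ in range(n_terms):
        total += term
        term = -term * sigma * d_f   # each further self-energy insertion flips the sign
    return total


d_f, sigma = 0.5, 0.3                # sample values with |sigma * d_f| < 1
exact = dressed_propagator_exact(d_f, sigma)
partial_3 = dressed_propagator_series(d_f, sigma, 3)
partial_30 = dressed_propagator_series(d_f, sigma, 30)
```

With three terms the partial sum already reproduces the closed form to within the size of the first neglected term, and thirty terms agree to machine precision.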
A diagrammatic representation of the above equation is obtained by choosing the following graph for $\Sigma$:
[Diagram (9.6.44): the blob representing $-\Sigma$.]
As a consequence the relation between the propagator and $\Sigma$ is given by:
[Diagram (9.6.45): the full propagator between $x$ and $y$ as the free line plus lines with one, two, ... insertions of the blob.]
The above diagram is known as the proper self energy diagram, and it shows that $\Sigma$ is indeed the 1P irreducible (1PI) 2-point vertex function. As a final piece to this (fascinating and vast) theory of Feynman graphs, we choose the following diagrams which represent $W^{(2)}$, $W^{(3)}$ and $\Gamma^{(3)}$ respectively:
[Diagram: line from $x$ to $y$] $= -i\hbar\,W^{(2)}(x, y)\big|_{J=0}$ (9.6.46a)
[Diagram: blob with legs $x$, $y$, $z$] $= (-i\hbar)^2\,W^{(3)}(x, y, z)\big|_{J=0}$ (9.6.46b)
[Diagram: vertex with legs $x$, $y$, $z$] $= \dfrac{i}{\hbar}\,\Gamma^{(3)}(x, y, z)\big|_{\phi_c}$ (9.6.46c)
We then use the above diagrams to obtain the diagrammatic relations for the operational equation:
$$W^{(3)} = W^{(2)}\,W^{(2)}\,W^{(2)}\,\Gamma^{(3)} \tag{9.6.47}$$
and for the connected Green's function given in (9.6.10). We thus have:
[Diagram (9.6.48): graphical form of (9.6.47).]
[Diagram (9.6.49): the connected Green's function of (9.6.10) expressed through $W$ blobs, plus permutation terms.]
Having obtained these representations we are now ready to make a few remarks concerning these graphs and $\Gamma[\phi_c]$.
Remark 9.6.5 Every connected diagram can be reduced to (1PI) diagrams, a point that we already made earlier. We note that $\Gamma^{(3)}$ in (9.6.48) is one of the simplest examples of 1PI; in fact it gives the proper 3-point vertex function (see Remark 9.6.7).
Remark 9.6.6 The effective action functional (of the theory) $\Gamma[\phi_c]$ can be expanded in terms of the functions $\Gamma^{(n)}(x_1, \ldots, x_n)$ as:
$$\Gamma[\phi_c] = \sum_{n}\int d^4x_1\cdots d^4x_n\,\frac{1}{n!}\,\Gamma^{(n)}(x_1, \ldots, x_n)\,\phi_c(x_1)\cdots\phi_c(x_n) \tag{9.6.50}$$
where it is assumed that $\phi_c = 0$ (in the expansion). (See the explanation for
$$\Gamma[\phi_c] = \int d^4x\left(-V_{\rm eff}(\phi_c(x)) + \frac{1}{2}A(\phi_c(x))\,\partial_\mu\phi_c(x)\,\partial^\mu\phi_c(x) + \cdots\right) \tag{9.6.51}$$
where $V_{\rm eff}$ is chosen to represent the potential $V = \frac{m^2}{2}\phi^2 + \frac{\lambda}{4!}\phi^4$ in terms of $\phi_c$. Note that higher order derivative terms have been neglected here. Since $\phi_c(x) = \phi_c =$ constant in the absence of sources, all the derivative terms in (9.6.51) vanish and $\Gamma[\phi_c]$ equals:
$$\Gamma[\phi_c]\big|_{J=0} = -\int d^4x\,V_{\rm eff}(\phi_c) = -V_{\rm eff}(\phi_c)\int d^4x = -(2\pi)^4\delta^4(0)\,V_{\rm eff}(\phi_c)^{*} \tag{9.6.52}$$
* The space-time volume $\int d^4x$ is usually denoted by $(2\pi)^4\delta^4(0)$.
This shows that in this limit the effective action simply picks out the effective potential including quantum corrections of all orders. Using (9.6.52) it can be easily established that the renormalized values of the masses and the coupling constants (including all quantum corrections) are:
$$\left.\frac{d^2V_{\rm eff}}{d\phi_c^2}\right|_{\phi_c = \langle\phi\rangle} = m_R^2 \tag{9.6.53a}$$
$$\left.\frac{d^4V_{\rm eff}}{d\phi_c^4}\right|_{\phi_c = \langle\phi\rangle} = \lambda_R \tag{9.6.53b}$$
(see Exc. (6.4)). So far in this section we have dealt with the description of Feynman graphs in coordinate space. However, from our discussions in Sec. 4, we know that these rules can be generalized to momentum space by taking the Fourier transforms of the functions involved and then defining the basic graphs in terms of these Fourier transforms. We have briefly illustrated this in Exc. 5 of this section by using the rules:
[Diagram: line carrying momentum $p$] $= i\hbar\,D_F(p) = \lim_{\epsilon\to 0}\dfrac{i\hbar}{p^2 - m^2 + i\epsilon}$ (9.6.54a)
[Diagram: four-point vertex with momenta $p_1, p_2, p_3, p_4$] $= -\dfrac{i\lambda}{\hbar}\,\delta^4(p_1 + p_2 + p_3 + p_4)$ (9.6.54b)
Note that (b) above refers to the $\phi^4$-theory. In the following remark we list important points concerning the evaluations in momentum space.
Remark 9.6.9
(i) In a proper vertex diagram, there are no external propagators or legs.
(ii) All the momenta associated with the internal lines (propagators) must be integrated; thus if the number of internal lines is $I$, there are $I$ momentum integrations.
(iii) Not all momenta (in integration) are independent, since at each vertex there is a momentum conserving $\delta$-function; thus if the number of vertices is $V$, the number of $\delta$-functions is also $V$. The number of momentum integrations thus reduces to $I - V$. Nevertheless, one combination of these $\delta$-functions is the overall momentum conserving $\delta$-function for the amplitude and does not constrain the internal momenta; this means that the final number of independent internal momentum integrations is $L = I - V + 1 \equiv P + 1$.
(iv) Every proper vertex diagram, i.e., (1PI) diagram with $V$ vertices and $I$ internal lines, has a total number of $\hbar$ factors associated with it which equals the number $P$. Thus the diagram behaves like $\sim\hbar^{P}$.
(v) The number $L$ of (iii) equals the number of loops in a diagram, and since $L = P + 1$, it follows from (iv) above that expanding an amplitude in powers of $\hbar$ is in fact an expansion in the number of loops. Since $\hbar$ is a small quantity, loop expansion provides an efficient perturbative expansion.
Finally, we note that while in general it is simpler to evaluate a diagram (i.e. calculate the Green's functions) in momentum space, very often the process introduces infinities. For instance, in relativistic field theory the momentum variable in the loop integration ranges all the way from zero to infinity and thus permits no intrinsic cut-off in the momenta. The divergences due to this naturally make the calculations meaningless. However, it is quite relieving to note that this 'unwanted situation' is remedied via the prescription provided by the theory of renormalization.
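The counting in items (iii) to (v) of Remark 9.6.9 can be checked mechanically for a few $\phi^4$ diagrams. The sketch below relies on one standard counting fact that is not stated in the remark and is therefore an assumption here: every $\phi^4$ vertex has four legs, each internal line uses up two legs and each external leg one, so $4V = 2I + E$ for a diagram with $E$ external legs. The function name is hypothetical.

```python
def phi4_counts(vertices, external_legs):
    """Return (internal_lines, loops, hbar_power) for a connected phi^4 diagram."""
    # Assumed leg counting: 4V = 2I + E.
    legs_left = 4 * vertices - external_legs
    if legs_left < 0 or legs_left % 2:
        raise ValueError("no phi^4 diagram with these counts")
    internal_lines = legs_left // 2
    hbar_power = internal_lines - vertices    # P = I - V, item (iv)
    loops = hbar_power + 1                    # L = I - V + 1 = P + 1, item (iii)
    return internal_lines, loops, hbar_power


examples = {
    "one-loop 2-point": phi4_counts(1, 2),
    "two-loop 2-point": phi4_counts(2, 2),
    "one-loop 4-point": phi4_counts(2, 4),
}
```

In each example the power of $\hbar$ is one less than the number of loops, which is the statement of item (v).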
The theory of renormalization suggests ways and means to isolate and remove all these infinities from physically measurable quantities. One of the methods used is the introduction of a regularization scheme under which all divergent integrals are made finite; the quantities involved are then freely manipulated to carry out the calculation. In the case of a divergent diagram, one first separates the divergent part from the finite part and then allows the divergences to be absorbed in some appropriate redefinitions of mass, coupling, and field operators. We wish to point out here that our introductory remarks on 'renormalization' should not be interpreted to imply that the 'renormalization of a theory' is sought only to expurgate the infinities. Indeed the process of renormalization is used even in finite theories; in fact the term 'renormalized' is used when a given theory is altered by removal or introduction of a source (see for example Exc. (6.4)). Due to our limited scope we are unable to cover this topic here; we refer the reader to [6] for a succinct account of the theory along with useful references. We shall, however, return to it briefly when we discuss anomalies in Sec. 10.6. In conclusion we note that the theory of Feynman graphs has wide applications in supersymmetry, where these graphs are called supergraphs. See for instance 7.[19] and 7.[16]. We shall return to these ideas briefly in Chapter 11.
Exercise 9.6
1. Verify (9.6.35).
2. Show that
$$\frac{\delta}{\delta J(y)}\left(\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right) = -\delta^4(x - y) = \int d^4z\,\frac{\delta^2W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)}.$$
3. Use the compact forms $W^{(n)}$, $\Gamma^{(n)}$ given in (9.6.37) and show that (a) $W^{(3)} = W^{(2)}W^{(2)}W^{(2)}\Gamma^{(3)}$.
4. Obtain the renormalized values of the masses and coupling constants in a $\phi^4$-theory (in effect verify (9.6.53)).
5. Show how you would use the Feynman rules in momentum space to evaluate the one particle irreducible (1PI) 2-point vertex function of a $\phi^4$-theory (in the momentum space).
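Hints to Exercise 9.6
1. The functional derivative of $\Gamma[\phi_c]$ in (9.6.34) with respect to $\phi_c(x)$ gives
(i) $$\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)} = \frac{\delta W[J]}{\delta\phi_c(x)} - \int d^4x'\,\frac{\delta J(x')}{\delta\phi_c(x)}\,\phi_c(x') - J(x)$$
Since $W[J]$ is a functional of $J(x)$, the first term in (i) equals:
(ii) $$\frac{\delta W[J]}{\delta\phi_c(x)} = \int d^4x'\,\frac{\delta W[J]}{\delta J(x')}\,\frac{\delta J(x')}{\delta\phi_c(x)}$$
But $\dfrac{\delta W[J]}{\delta J(x')} = \phi_c(x')$, so the first two terms in (i) cancel and (9.6.35) follows.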
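2. Note that $\phi_c(x)$ is an independent variable in $\Gamma[\phi_c]$; using the chain rule and (9.6.17) we can write:
(i) $$\frac{\delta}{\delta J(x)} = \int d^4y\,\frac{\delta\phi_c(y)}{\delta J(x)}\,\frac{\delta}{\delta\phi_c(y)}$$
(ii) $$\frac{\delta}{\delta J(x)} = \int d^4y\,\frac{\delta^2W[J]}{\delta J(x)\,\delta J(y)}\,\frac{\delta}{\delta\phi_c(y)}$$
and
$$\frac{\delta}{\delta J(y)}\left(\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right) = \int d^4z\,\frac{\delta^2W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta}{\delta\phi_c(z)}\left(\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right) = \frac{\delta}{\delta J(y)}\big(-J(x)\big) = -\delta^4(x - y)$$
Thus we have:
(iii) $$\int d^4z\,\frac{\delta^2W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)} = -\delta^4(x - y)$$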
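3. From (iii) above, taking the functional derivative with respect to $J(w)$, we obtain:
(i) $$\int d^4z\,\frac{\delta^3W[J]}{\delta J(w)\,\delta J(y)\,\delta J(z)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)} = -\int d^4z\,d^4z'\,\frac{\delta^2W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta^3\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)\,\delta\phi_c(z')}\,\frac{\delta^2W[J]}{\delta J(w)\,\delta J(z')}$$
(We have used (ii) of the above hint in writing the extreme RHS of (i).) Also, since $W^{(2)}\Gamma^{(2)} = -1$, we can rewrite the above equation after some adjustment of variables as:
(ii) $$\frac{\delta^3W[J]}{\delta J(x)\,\delta J(y)\,\delta J(z)} = \int d^4x'\,d^4y'\,d^4z'\,\frac{\delta^2W[J]}{\delta J(x)\,\delta J(x')}\,\frac{\delta^2W[J]}{\delta J(y)\,\delta J(y')}\,\frac{\delta^2W[J]}{\delta J(z)\,\delta J(z')}\,\frac{\delta^3\Gamma[\phi_c]}{\delta\phi_c(x')\,\delta\phi_c(y')\,\delta\phi_c(z')}$$
In compact notation this becomes:
(iii) $$W^{(3)} = W^{(2)}\,W^{(2)}\,W^{(2)}\,\Gamma^{(3)}.$$
4. Now from (9.6.52) we have: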
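(i) $$\Gamma[\phi_c]\big|_{J=0} = -(2\pi)^4\delta^4(0)\,V_{\rm eff}(\phi_c)$$
We note that the constant value of $\phi_c = \langle\phi\rangle$ is obtained by solving:
(ii) $$\left.\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right|_{J=0} = 0$$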
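Having once obtained it, we use this fact to take the derivatives of $V_{\rm eff}$, and since $\phi_c$ no longer depends on spacetime coordinates, the derivatives of $V_{\rm eff}$ are ordinary derivatives; thus we form
(iii) $$\frac{d^2V_{\rm eff}}{d\phi_c^2}$$
and
(iv) $$\frac{d^4V_{\rm eff}}{d\phi_c^4}$$
But $\phi_c = \langle\phi\rangle = 0$ in a $\phi^4$-theory, hence (iii) and (iv) give respectively the renormalized mass and the renormalized coupling constant:
(v) $$\left.\frac{d^2V_{\rm eff}}{d\phi_c^2}\right|_{\phi_c = 0} = m_R^2$$
(vi) $$\left.\frac{d^4V_{\rm eff}}{d\phi_c^4}\right|_{\phi_c = 0} = \lambda_R$$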
5. We recall from Sec. (9.4) that the basic graphs in momentum space for a field theory could be expressed as:
(i) [Diagram: line carrying momentum $p$] $= i\hbar\,D_F(p) = \lim_{\epsilon\to 0}\dfrac{i\hbar}{p^2 - m^2 + i\epsilon}$
(ii) [Diagram: four-point vertex with momenta $p_1, p_2, p_3, p_4$] $= -\dfrac{i\lambda}{\hbar}\,\delta^4(p_1 + p_2 + p_3 + p_4)$
(see (9.4.37); the change in (ii) is due to the fact that we are dealing with a $\phi^4$-theory here). While evaluating a Feynman diagram in momentum space, we shall naturally integrate over the intermediate momenta, i.e., the momenta of the internal propagators. Thus to evaluate the (1PI) 2-point vertex function up to first order in $\lambda$, given by:
(iii) [Diagram: one-loop graph with external momenta $p_1$, $p_2$ and loop momentum $k$]
we shall calculate the following integral, which is written after using the appropriate signs for $p_1$, $p_2$ and $k$ indicated by their arrows:
(iv) $$\frac{1}{2}\left(-\frac{i\lambda}{\hbar}\right)\int\frac{d^4k}{(2\pi)^4}\,\frac{i\hbar}{k^2 - m^2}\,\delta^4(p_1 - p_2 + k - k)$$
The factor $\frac{1}{2}$ in (iv) is due to the symmetry of the diagram, and though $i\epsilon$ is not explicitly given here, it should be taken into account while evaluating the integral and then sent to zero in the limit. Compare (iii) of this exercise with (9.6.44), where the (1PI) 2-point vertex function is given in coordinate space (see Chapter 2 in [6] for further details).
APPENDIX 9A: LANGUAGE OF QUANTUM MECHANICS
A.1 State Space, Kets and Bras, Hermitian Operators and Observables
In quantum theory the states of a physical system correspond to vectors in a Hilbert space over the complex numbers. A state vector is denoted by the ket $|\psi\rangle$. The kets form a space known as state space; the properties satisfied by state space are similar to those of a vector space defined earlier. We wish to recall here just two of them:
Property 9A.1 For every pair of kets $|\psi\rangle$ and $|\psi'\rangle$ there exists a unique complex number that results from the scalar (or inner) product of $|\psi\rangle$ and $|\psi'\rangle$, and it is denoted by the bracket:
$$(|\psi\rangle, |\psi'\rangle) = \langle\psi|\psi'\rangle \tag{9A.1}$$
The obvious properties of this scalar product, which is also called the Hermitian (complex) scalar product, are:
$$\langle\psi'|\psi\rangle = \langle\psi|\psi'\rangle^{*} \quad (* = \text{complex conjugation}), \qquad \langle\psi'|c\psi\rangle = c\,\langle\psi'|\psi\rangle \tag{9A.2}$$
and
$$\langle\psi|\psi\rangle \ge 0 \tag{9A.3}$$
The equality holds if and only if $|\psi\rangle$ is the null vector. We say in this case that the ket represents the ground state of the system; we denote it as $|0\rangle$. The real number $\langle\psi|\psi\rangle$ is called the norm of $|\psi\rangle$; when it is 1, $|\psi\rangle$ is called a normalized ket. If for non-zero $|\psi\rangle$ and $|\psi'\rangle$, $\langle\psi|\psi'\rangle = 0$, then the kets are orthogonal.
Property 9A.2 There exist complete orthonormal sets of bases for a state space. Each such set consists of the kets $|k\rangle$ $(k = 1, 2, \ldots, n)$ which are orthonormal:
$$\langle k|l\rangle = \delta_{kl} \tag{9A.4}$$
and has the property that any ket (state vector) $|\psi\rangle$ of the space can be expanded as:
$$|\psi\rangle = \sum_{k=1}^{n}|k\rangle\langle k|\psi\rangle \tag{9A.5}$$
(Note that in mathematics expansions of this nature are written with the eigenvectors as post factors.) Every operator $A$ on the state space assigns to each ket $|\psi\rangle$ another ket $|\psi'\rangle$ of that space:
$$A|\psi\rangle = |\psi'\rangle \tag{9A.6}$$
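(See Chapter 3 for properties of Hermitian and unitary operators, and Exc. 2 and 3 of Sec. 3.1.) Given a space of kets, a space dual to it can be constructed whose elements, denoted as $\langle\psi|$, are called bras. A ket $|\psi\rangle$ and a bra $\langle\phi|$ define an operator $|\psi\rangle\langle\phi|$ through:
$$(|\psi\rangle\langle\phi|)\,|\chi\rangle = |\psi\rangle\,\langle\phi|\chi\rangle \tag{9A.7}$$
where $|\chi\rangle$ is an arbitrary ket. The operator defined above is postulated to act to the left on a bra $\langle\chi|$, therefore
$$\langle\chi|\,(|\psi\rangle\langle\phi|) = \langle\chi|\psi\rangle\,\langle\phi| \tag{9A.8}$$
It is a linear operator and from the relation:
$$(|\psi\rangle\langle\phi|)^{\dagger} = |\phi\rangle\langle\psi| \tag{9A.9}$$
where $\dagger$ = h.c., it follows that $|\phi\rangle\langle\phi|$ is Hermitian. Using the closure relation:
$$\sum_{k=1}^{n}|k\rangle\langle k| = 1 \tag{9A.10}$$
every operator $A$ can be expressed as:
$$A = \sum_{k'}\sum_{k}|k'\rangle\langle k'|A|k\rangle\langle k| \tag{9A.11}$$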
(9A.12)
the complex number a is called an eigenvalue of A and \y/) is called an eigenket. The set of eigenvalues of A is called the spectrum of A. If there are g linearly independent eigenkets with the same eigenvalue, then this eigenvalue is said to be g-fold degenerate. If A is Hermitian, then from (9 A. 12) and from: (
(9A.13)
it follows that the eigenvalues of A are real and the eigenkets corresponding to different values are orthogonal, and as such they can be normalized. A Hermitian operator A is an observable if its eigenkets \y/k) form a basis in a state space, i.e., if an arbitrary ket \y/) can be written as:
lv>= Z k*Xv*|v>
(9A.14)
k=\
The corresponding expansion for a bra is
(9A.15)
4=1
For an operator A which is an observable, (9A.11) reduces to:
4 = 5>*|V*XV*|
(9A.16)
k
Note that this is the spectral decomposition of $A$ when the spectrum is discrete; in the case of a continuous spectrum, which happens when the space is infinite dimensional, the eigenvalue equation is:
$$A|\psi_a\rangle = a|\psi_a\rangle \tag{9A.17}$$
The parameter $a$ representing the eigenvalue here is continuous. The orthonormality condition (9A.4) and the expansion (9A.14) in this case are respectively:
$$\langle\psi_a|\psi_{a'}\rangle = \delta(a - a') \tag{9A.18}$$
$$|\psi\rangle = \int da\,|\psi_a\rangle\langle\psi_a|\psi\rangle \tag{9A.19}$$
where the integration is taken over all values of $a$.
A polynomial function $P_N(A) = \sum_{k=1}^{N}c_kA^k$ of an observable $A$ is an observable; in particular $A^2$ is an observable. Hence the expectation value of $A$ with respect to $|\psi\rangle$, defined as:
$$\langle A\rangle = \langle\psi|A|\psi\rangle \tag{9A.20a}$$
leads to the computation of the expectation value of any polynomial function of $A$. Based on this, we write below the important inequality in quantum mechanics giving the Heisenberg uncertainty principle.
Uncertainty Principle Let $A$ and $B$ be two non-commuting observables; then the physical quantities represented by $A$ and $B$ cannot be measured simultaneously with precision. In order to find this lack of precision, we write their commutator as $[A, B] = iC$. Evidently $C$ is Hermitian. Suppose that $(\Delta A)^2$ denotes the variance of $A$ (for a fixed state $\psi$), i.e.,
$$\Delta A = \left[\langle\psi|(A - \langle A\rangle)^2|\psi\rangle\right]^{1/2} \tag{9A.20b}$$
then
$$\Delta A\,\Delta B \ge \frac{1}{2}\,|\langle\psi|C|\psi\rangle| \tag{9A.21}$$
gives the measure of deviation from precision; this is known as the measure of uncertainty or the Heisenberg uncertainty relation.
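The inequality (9A.21) can be checked numerically on the smallest non-trivial example. The sketch below is an illustration under stated assumptions, not part of the text's development: it takes a two-dimensional state space, chooses the Pauli matrices $\sigma_x$ and $\sigma_y$ as the observables $A$ and $B$ (so that $C = 2\sigma_z$), and uses hypothetical helper names. Only the Python standard library is used.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expectation(op, psi):
    """<psi|op|psi> for a normalized two-component state."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(complex(psi[i]).conjugate() * op_psi[i] for i in range(2))


def spread(op, psi):
    """Delta A of (9A.20b): square root of <A^2> - <A>^2."""
    mean = expectation(op, psi).real
    mean_sq = expectation(matmul(op, op), psi).real
    return math.sqrt(max(mean_sq - mean * mean, 0.0))


def uncertainty_sides(a, b, psi):
    """Return (Delta A * Delta B, |<C>| / 2) with C defined by [A, B] = iC."""
    ab, ba = matmul(a, b), matmul(b, a)
    c = [[-1j * (ab[i][j] - ba[i][j]) for j in range(2)] for i in range(2)]
    return spread(a, psi) * spread(b, psi), 0.5 * abs(expectation(c, psi))


def spin_state(theta, phi):
    """Normalized state (cos(theta/2), exp(i*phi) * sin(theta/2))."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]


lhs, rhs = uncertainty_sides(SIGMA_X, SIGMA_Y, spin_state(1.0, 0.7))
```

For this family of states the right-hand side equals $|\cos\theta|$, and the bound is saturated at $\theta = 0$, where the state is an eigenket of $\sigma_z$.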
A.2 Position and Momentum Operators of a Particle
The observables associated with the position and momentum operators of a particle, denoted $r$ and $p$, have cartesian components:
$$r = (r_1, r_2, r_3) = (x, y, z), \qquad p = (p_1, p_2, p_3) = (p_x, p_y, p_z).$$
According to quantum mechanics postulates, they satisfy the canonical (fundamental) commutation relations:
$$\text{(a)}\ [r_i, r_j] = 0, \qquad \text{(b)}\ [p_i, p_j] = 0, \qquad \text{(c)}\ [r_i, p_j] = i\hbar\,\delta_{ij} \tag{9A.22}$$
In non-relativistic quantum mechanics it is assumed that $r_1, r_2, r_3$ form a complete set of commuting operators for a spinless free particle (see Def. (9A.3)); hence there is only one linearly independent eigenket for the operator $r$ that corresponds to the eigenvalue $r'$, and we denote it as $|r'\rangle$ (see footnote 32). Thus the eigenvalue equation for the position operator $r$ is:
$$r|r'\rangle = r'|r'\rangle \tag{9A.23}$$
where $|r'\rangle = |r'_1\,r'_2\,r'_3\rangle$. These components vary continuously from $-\infty$ to $\infty$. The orthonormality condition (9A.18) gives here:
$$\langle r''|r'\rangle = \delta^3(r'' - r') = \delta(r''_1 - r'_1)\,\delta(r''_2 - r'_2)\,\delta(r''_3 - r'_3) \tag{9A.24}$$
The closure relation (9A.10), which in this case becomes:
$$\int d^3r'\,|r'\rangle\langle r'| = 1 \tag{9A.25}$$
gives the expansion (9A.14) of an arbitrary ket $|\psi\rangle$ in terms of the basis $|r'\rangle$ as:
$$|\psi\rangle = \int d^3r'\,|r'\rangle\langle r'|\psi\rangle \tag{9A.26}$$
Here $d^3r'$ stands for $dx\,dy\,dz$ and the integrations are over the whole of coordinate space. The expansion coefficient $\langle r'|\psi\rangle$ in (9A.26) is the coordinate-space wave function $\psi(r')$, which is a complex function of the continuous real variable $r'$. Hence we can write:
$$|\psi\rangle = \int d^3r'\,c_{r'}\,|r'\rangle \tag{9A.27}$$
where $\langle r'|\psi\rangle = c_{r'}$. The normalization $\langle\psi|\psi\rangle = 1$, the closure relation (9A.25) and the equality (9A.27) then lead to:
$$\int|\psi(r')|^2\,d^3r' = 1 \tag{9A.28}$$
This shows that $|\psi(r')|^2$ is zero at infinity. In other words, it means that the probability of $r'$ attaining an infinite value in the case of a moving particle is zero; thus the motion is finite. The particle in this case is said to be in a bound state. If we pre-multiply (9A.26) by $\langle\phi|F(r)$, where $F(r)$ is a continuous function of $r$, and write $\psi(r')$ for $\langle r'|\psi\rangle$, we obtain:
$$\langle\phi|F(r)|\psi\rangle = \int d^3r'\,\phi^{*}(r')\,F(r')\,\psi(r') \tag{9A.29}$$
a relation that we use all the time in the text. (Note that writing $F(r')$ on the RHS of (9A.29) amounts to a change of basis from $r$ to $r'$.) We can likewise obtain matrix elements for the operator $p$ by using (9A.26), thus:
$$p|\psi\rangle = \int d^3r'\,|r'\rangle\,(-i\hbar\nabla_{r'})\,\psi(r') \tag{9A.30}$$
where
$$\nabla_{r'} = \left(\frac{\partial}{\partial r'_1}, \frac{\partial}{\partial r'_2}, \frac{\partial}{\partial r'_3}\right) \tag{9A.31}$$
This leads to
$$\langle\phi|p|\psi\rangle = \int d^3r'\,\phi^{*}(r')\,(-i\hbar\nabla_{r'})\,\psi(r') \tag{9A.32}$$
and, if $\langle\phi|$ is an eigenbra of $r$, then integration can be performed on the RHS and it yields:
$$\langle r'|p|\psi\rangle = -i\hbar\nabla_{r'}\langle r'|\psi\rangle \tag{9A.33}$$
For the momentum operator $p$, equations similar to (9A.23)-(9A.26) are:
$$p|p'\rangle = p'|p'\rangle \tag{9A.34}$$
$$\langle p''|p'\rangle = \delta^3(p'' - p') \tag{9A.35}$$
$$\int d^3p'\,|p'\rangle\langle p'| = 1 \tag{9A.36}$$
$$|\psi\rangle = \int d^3p'\,|p'\rangle\langle p'|\psi\rangle \tag{9A.37}$$
where integrations are over all of the momentum space. The choice of eigenkets $|p'\rangle$ as a basis gives the momentum representation, and the expansion coefficient $\langle p'|\psi\rangle = \psi(p')$ is the momentum space wave function.
32. In physics literature the eigenvalue of an operator $A$ is usually denoted $A'$; by using $r'$ as the eigenvalue and $|r'\rangle$ as the eigenket we are essentially following the physics trend.
A.3 Coordinate and Momentum Space Representations
The eigenkets $|r'\rangle$ and $|p'\rangle$ as the basis are said to give respectively the coordinate representation and the momentum representation in quantum mechanics. Obviously the two representations are related to each other, as we shall soon see. Using the left multiplication on (9A.26) and (9A.37) by $\langle p'|$ and $\langle r'|$ respectively, we get:
$$\langle p'|\psi\rangle = \int d^3r'\,\langle p'|r'\rangle\langle r'|\psi\rangle \tag{9A.38}$$
$$\langle r'|\psi\rangle = \int d^3p'\,\langle r'|p'\rangle\langle p'|\psi\rangle \tag{9A.39}$$
which shows that there are integral relations between the coordinate-space and momentum-space wave functions. Now $\langle r'|p'\rangle$ can be evaluated as an explicit function of $r'$ and $p'$. From (9A.33) and (9A.34) it follows that this function satisfies the three first order partial differential equations:
$$-i\hbar\nabla_{r'}\langle r'|p'\rangle = p'\langle r'|p'\rangle \tag{9A.40}$$
and replacing $r'$ by $r''$ and $\psi$ by $r'$ in (9A.39) we note that it satisfies the orthonormality condition:
$$\int d^3p'\,\langle r''|p'\rangle\langle p'|r'\rangle = \langle r''|r'\rangle = \delta^3(r'' - r') \tag{9A.41}$$
It can be checked that the solution to (9A.40) and (9A.41) is (see Chapter 1 in [21]):
$$\langle r'|p'\rangle = (2\pi\hbar)^{-3/2}\exp(ip'\cdot r'/\hbar) \tag{9A.42}$$
Using (9A.38), (9A.39), (9A.42) and (9A.2) we obtain:
$$\psi(p') = (2\pi\hbar)^{-3/2}\int d^3r'\,\exp(-ip'\cdot r'/\hbar)\,\psi(r') \tag{9A.43}$$
$$\psi(r') = (2\pi\hbar)^{-3/2}\int d^3p'\,\exp(ip'\cdot r'/\hbar)\,\psi(p') \tag{9A.44}$$
In Appendix 9C (in particular in (9C.7)) we shall see that $\psi(p')$ and $\psi(r')$ are Fourier transforms of each other.
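The transform (9A.43) can be illustrated numerically. The sketch below is a one-dimensional analogue, which is an assumption made for simplicity: the prefactor becomes $(2\pi\hbar)^{-1/2}$, units are chosen so that $\hbar = 1$, the test function is a normalized Gaussian, and the integral is approximated by the trapezoidal rule on a finite interval. All names and numerical parameters are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption)


def psi_x(x):
    """Normalized Gaussian wave function in coordinate space."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)


def psi_p(p, x_max=10.0, n=2000):
    """1D analogue of (9A.43), evaluated with the trapezoidal rule on [-x_max, x_max]."""
    h = 2.0 * x_max / n
    total = 0.0 + 0.0j
    for i in range(n + 1):
        x = -x_max + i * h
        weight = 0.5 if i in (0, n) else 1.0   # end points get half weight
        total += weight * cmath.exp(-1j * p * x / HBAR) * psi_x(x)
    return total * h / math.sqrt(2.0 * math.pi * HBAR)


p_value = 0.8
numeric = psi_p(p_value)
analytic = math.pi ** -0.25 * math.exp(-p_value ** 2 / 2.0)
```

For this particular Gaussian the momentum-space wave function has the same functional form as the coordinate-space one, which makes the comparison with the analytic value immediate.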
A.4 The Complete Set of Commuting Operators
Finally we introduce the important concept of the complete set of commuting operators (c.s.c.o.) in quantum mechanics. We already know that the eigenvectors of a Hermitian operator $A$ form a complete orthonormal set, and as such they can be used as basis vectors. These basis vectors in turn are characterized by the eigenvalues to which they belong. Thus if all eigenvalues of $A$ are distinct, they can be used to label the basis vectors as $|A'\rangle$, etc. If, however, there are two or more linearly independent eigenvectors that correspond to one eigenvalue, the labeling has to be done as $|A'_1\rangle$, $|A'_2\rangle$, .... In this case we look for another Hermitian operator $B$ which commutes with $A$ and has the same set of eigenvectors but gives us distinct eigenvalues, say $B'$ and $B''$:
$$B|A'_1\rangle = B'|A'_1\rangle, \qquad B|A'_2\rangle = B''|A'_2\rangle \tag{9A.45}$$
(See Ftn 32 for notations in above equations.) With the help of these eigenvalues, we now write the eigenvectors as:
$$|A'_1\rangle = |A'B'\rangle, \qquad |A'_2\rangle = |A'B''\rangle \tag{9A.46}$$
Definition 9A.3 A set of commuting Hermitian operators $A, B, C, \ldots$ whose $n$ common eigenvectors can be given in terms of $n$ distinct eigenvalues so that no two eigenvectors have an identical set of eigenvalues is said to be a complete set of commuting operators. It is denoted as c.s.c.o. The orthonormality condition satisfied by eigenvectors can be written as:
$$\langle A'B'\ldots|A''B''\ldots\rangle = \delta_{A'A''}\,\delta_{B'B''}\cdots \tag{9A.47}$$
or simply as:
$$\langle k'|k''\rangle = \delta_{k'k''} \tag{9A.48}$$
where $k$ stands for the complete set $A, B, C, \ldots$ and $k'$, $k''$ stand for the sets of eigenvalues $A'B'\ldots$, $A''B''\ldots$. The completeness of the set of eigenvectors implies that an arbitrary ket $|a\rangle \in$ state-space can be written as:
$$|a\rangle = \sum_{k'}|k'\rangle\langle k'|a\rangle \tag{9A.49}$$
(see Sec. 2 and Sec. 3 where c.s.c.o. is used)
APPENDIX 9B: A FEW DEFINITIONS AND DERIVATIONS
We give below the definitions of a few mathematical objects which bear a different name when used in classical mechanics. An $n$-dimensional manifold $X^n$ is the configuration space of a system with $n$ degrees of freedom. The coordinates used are $(q^i)$. The manifold $X^n\times\mathbb{R}$ with coordinates $(q^i, t)$ is called the configuration spacetime of the system. The natural coordinates on the tangent bundle $T(X^n)$ and the cotangent bundle $T^*(X^n)$ are respectively $(q^i, \dot q^i)$ and $(q^i, p_i)$. The latter is called the phase space of the system. The manifold $T^*(X^n)\times\mathbb{R}$ is called the state space of the system. A function $L: T(X^n)\times\mathbb{R}\to\mathbb{R}$ given locally by $L(q^i, \dot q^i, t)$ is called a Lagrangian. The vector with components $p_i = \partial L/\partial\dot q^i$ is a covariant vector on $X^n$. A function $H: T^*(X^n)\times\mathbb{R}\to\mathbb{R}$ given by $H(q^i, p_i, t)\in\mathbb{R}$ is called the Hamiltonian function*. In the following, we shall use the concept of a Hamiltonian function to define a Hamiltonian operator in quantum mechanics.
B.1 The Wave Function $\psi$ in Quantum Mechanics
Consider a one-dimensional wave equation:
$$\frac{\partial^2\eta}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\eta}{\partial t^2} = \left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right)\eta = 0 \tag{9B.1}$$
D'Alembert's solution to this equation is:
$$\eta(x, t) = f_1(x - ct) + f_2(x + ct) \tag{9B.2}$$
We are, however, interested in the form (solution) which is used in quantum mechanics. This is:
$$\eta(x, t) \equiv \eta = A\cos\frac{2\pi}{\lambda}(x - ct) \tag{9B.3}$$
$\eta$ here is known as a sinusoidal wave travelling (propagating) in the positive $x$-direction; the constants $A$, $\lambda$ and $c$ are described as follows with the help of the diagram given below. $A$ is the amplitude of the wave. $\lambda$ is the wavelength $(x_2 - x_1)$ for constant time $t_0$ (see footnote 33). Similarly, for fixed $x$ (i.e., $x = x_0$), $T = (t_2 - t_1)$ defines the period of the wave, and the constant $c = \lambda/T$. The number $\nu = 1/T$ is called the frequency of the wave and $\omega = 2\pi\nu = 2\pi/T$ is called the angular frequency or the frequency of oscillation, which is an observable phenomenon. The wave $\eta$ is an infinite harmonic plane wave which is associated with the motion of a free particle moving in the $x$-direction with the momentum $p = \hbar k$, where $\hbar$ is Planck's constant (see footnote 34) and the magnitude of $k$ is given by the wavelength $\lambda = 2\pi/k$. We note that the above relation between $p$ and $k$ is the fundamental de Broglie equation.
[Figure: Wave propagating in the x-direction.]
* In Table 9.2.1 we have denoted the Hamiltonian as $H(x, p)$, since we have used there $x$ in place of $q$; also we did not show there its dependence on $t$ explicitly.
33. The concept of wavelength in physics is very important since it characterises a particular wave, and also since it predicts the behaviour of a particle/string as described above. To see the former we note that if the wavelength is a meter or more, the waves are radio waves, whereas the waves with shorter wavelength (a few centimeters) are known as 'microwaves'. They are called 'infrared' when the wavelength is a small fraction of a centimeter. The light visible to our eye has a still shorter wavelength, and waves with even shorter wavelengths are known as 'ultraviolet' rays, X-rays and gamma rays.
Using $k$ and $\omega$, and by incorporating the amplitude $A$ with the cosine argument, we can write (9B.3) as:
$$\eta(x, t) = \cos(kx - \omega t + \alpha) \tag{9B.4}$$
Likewise, a plane wave propagating in the negative $x$-direction is:
$$\eta(x, t) = \cos(kx + \omega t + \beta) \tag{9B.5}$$
In order that (9B.4) and (9B.5) may suitably describe plane waves in quantum mechanics, we have to add additional terms to both these coming from $\left[\partial\eta/\partial t\right]_{t=0}$. Accordingly we have:
$$\eta_1(x, t) \propto \cos(kx - \omega t) + a_1\sin(kx - \omega t) \tag{9B.6}$$
$$\eta_2(x, t) \propto \cos(kx + \omega t) + a_2\sin(kx + \omega t) \tag{9B.7}$$
The constants $a_1$ and $a_2$ can be taken as $+i$ and $-i$; this follows from the fact that these waves moving in opposite directions are linearly independent at all times and that an arbitrary replacement of $x$ or $t$ does not alter the physical character of the wave. We note however that the waves (9B.4) and (9B.5) violate the above condition of linear independence at particular instants $t$.
Using the values of $a_1$ and $a_2$, we can write the expressions for waves propagating in the positive and negative directions of $x$ as:
$$\eta_1(x, t) = Ae^{i(kx - \omega t)} \tag{9B.8}$$
$$\eta_2(x, t) = Be^{-i(kx + \omega t)} \tag{9B.9}$$
The initial values of these waves are $\eta_1(x, 0) = Ae^{ikx}$ and $\eta_2(x, 0) = Be^{-ikx}$. This wave concept (discussed here in 1-dimension) leads to the definition of our familiar complex wave function which, in conformity with the literature, is denoted as $\psi(x, y, z, t)$:
$$\psi(x, y, z, t) = Ae^{i(k\cdot r - \omega t)} = Ae^{i(k_xx + k_yy + k_zz - \omega t)} \tag{9B.10}$$
Equations (9B.8) and (9B.9) follow from (9B.10) when $k_y = k_z = 0$ and $k_x = \pm k$. The wave function $\psi(x, y, z, t)$ represents the motion of a particle moving in 3-dimensional space with the momentum $(p_x, p_y, p_z) = (\hbar k_x, \hbar k_y, \hbar k_z)$. (See S. L. Sobolev, Chapter 3, Ref.[Ad] for the wave operator.)
34. Sometimes in the literature the symbol $\hbar$ is used to denote $\frac{1}{2\pi}$ times Planck's constant.
B.2 The Hamiltonian Operator H(t), and the Time Evolution Operator U(t)

According to the postulates of (non-relativistic) quantum mechanics, an initial state $\psi(t_0) = |a, t_0\rangle$ of a given system determines the subsequent states $\psi(t_1) = |a, t_1\rangle$ and $\psi(t_2) = |a, t_2\rangle$, etc., and if two initial states $|a, t_0\rangle$ and $|b, t_0\rangle$ separately evolve into $|a, t\rangle$ and $|b, t\rangle$, then their linear sum $c_1|a, t_0\rangle + c_2|b, t_0\rangle$ develops to $c_1|a, t\rangle + c_2|b, t\rangle$. These two postulates taken together imply that a state $|a, t\rangle$ can be obtained from an arbitrary state $|a, t_0\rangle$ with the help of a linear operator U:35

$\psi(t) = |a, t\rangle = U(t, t_0)|a, t_0\rangle = U(t, t_0)\psi(t_0)$  (9B.11)

(The reason for denoting the linear operator by U will soon become evident.) The operator U does not depend on $\psi(t_0)$, and as a result one has:

$\psi(t_2) = U(t_2, t_1)\psi(t_1) = U(t_2, t_1)U(t_1, t_0)\psi(t_0) = U(t_2, t_0)\psi(t_0)$  (9B.12)

which leads to the (group) property:

$U(t_2, t_0) = U(t_2, t_1)U(t_1, t_0)$  (9B.13)

From (9B.11) it is also evident that:

$U(t, t) = I$  (9B.14)

Hence,

$U(t, t_0)U(t_0, t) = U(t_0, t)U(t, t_0) = I$ or $[U(t, t_0)]^{-1} = U(t_0, t)$  (9B.15)
Using a small increment $\epsilon$ in t, we now define an operator H(t) as:

$U(t + \epsilon, t) = I - \frac{i}{\hbar}\epsilon H(t)$  (9B.16)

The presence of the factor $i/\hbar$ will be justified while writing the derivatives of state vectors and operators (see also (9B.22)). Presently we use the above definition along with the group property $U(t + \epsilon, t_0) = U(t + \epsilon, t)U(t, t_0)$ to obtain the differential equation for U, thus:

$\frac{\partial}{\partial t}U(t, t_0) = \lim_{\epsilon \to 0}\frac{U(t + \epsilon, t_0) - U(t, t_0)}{\epsilon} = -\frac{i}{\hbar}H(t)U(t, t_0)$  (9B.17a)

or

$i\hbar\frac{\partial}{\partial t}U(t, t_0) = H(t)U(t, t_0)$  (9B.17b)

with initial condition $U(t_0, t_0) = I$. The operator H(t), known as the Hamiltonian operator, is the analogue of the Hamiltonian function in classical mechanics, as already defined in the beginning of this appendix. Writing $t + \epsilon$ in place of t and t for $t_0$ in (9B.11) we have:

$\psi(t + \epsilon) = U(t + \epsilon, t)\psi(t)$  (9B.18a)

35 U is known as the evolution operator of the system.
To the first order in $\epsilon$ this gives:

$\psi(t) + \epsilon\frac{d\psi(t)}{dt} = \left[I - \frac{i}{\hbar}\epsilon H(t)\right]\psi(t)$  (9B.18b)

Hence:

$i\hbar\frac{d\psi(t)}{dt} = H(t)\psi(t)$  (9B.19a)

In Dirac notation this can be written as:

$i\hbar\frac{d}{dt}|a, t\rangle = H(t)|a, t\rangle$  (9B.19b)

where by definition:

$\frac{d}{dt}|a, t\rangle = \lim_{\epsilon \to 0}\frac{|a, t + \epsilon\rangle - |a, t\rangle}{\epsilon}.$

Remark 9B.1 Equations (9B.19), known as the Schrödinger equation for the time evolution of $\psi(t)$, give the general law of motion for any system; naturally for specific systems the operator H(t) has to be selected in accordance with the requirements of the system. For instance, it is the energy operator in the case of a wave function (see (9.2.9)) and a (2 x 2) matrix operator for spinorial systems of 2-component spinors.

Remark 9B.2 In both these examples we noticed that the Hamiltonian was Hermitian, hence it is worthwhile to assume that any operator selected to be the Hamiltonian of the system would be Hermitian. By (9B.16), this assumption implies that the operator U is unitary and, as a consequence, the length of any state vector remains unaltered during the motion. Thus if $\langle a, t_0|a, t_0\rangle$ is normalized to 1 initially, then $\langle a, t|a, t\rangle = 1$ throughout the motion. The main advantage that follows from the above is the fact that the expansion coefficients in the expansion of the state vector:

$|a, t\rangle = \sum_{k'}|k'\rangle\langle k'|a, t\rangle$  (9B.20)

can easily be computed (see also (9A.5)); for $|\langle k'|a, t\rangle|^2$ is the probability of finding the system at time t with the value k' for the observable k.
When H does not depend on time, U can still be obtained for finite time intervals by applying the group property (9B.13) repeatedly to n equal intervals of length $\epsilon = (t - t_0)/n$. Thus using (9B.16) with $U(t_0, t_0) = I$, we have:

$U(t, t_0) = \lim_{\epsilon \to 0}\left[I - \frac{i}{\hbar}\epsilon H\right]^n = \lim_{n \to \infty}\left[I - \frac{i}{\hbar}\frac{(t - t_0)}{n}H\right]^n = \exp\left[-\frac{i}{\hbar}(t - t_0)H\right]$  (9B.21)

(by the definition of the exponential function). This expresses the evolution operator U as the exponential of the Hamiltonian H (see Secs. 9.2 and 9.4).
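The limit in (9B.21) can be watched numerically. The following is a minimal sketch, assuming units with $\hbar = 1$, an elapsed time $t - t_0 = 1$, and the Pauli matrix $\sigma_x$ as an illustrative time-independent Hermitian H (this choice and all function names are ours, not the text's). For this H the exponential is known in closed form, $\exp(-i\theta\sigma_x) = \cos\theta\, I - i\sin\theta\,\sigma_x$, which is what the product is compared against.

```python
import math

def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def product_formula(h, tau, n, hbar=1.0):
    """Return [I - (i/hbar) * (tau/n) * H]^n, the finite-n form of (9B.21)."""
    eps = tau / n
    step = [[(1.0 if i == j else 0.0) - 1j * eps * h[i][j] / hbar
             for j in range(2)] for i in range(2)]
    u = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        u = matmul2(u, step)
    return u

def exact_for_sigma_x(tau, hbar=1.0):
    """Closed form exp[-(i/hbar) tau sigma_x] = cos(theta) I - i sin(theta) sigma_x."""
    theta = tau / hbar
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -1j * s], [-1j * s, c]]

def max_abs_difference(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]  # illustrative Hermitian "Hamiltonian" (assumption)

for n in (10, 100, 1000, 10000):
    approx = product_formula(SIGMA_X, 1.0, n)
    print(n, max_abs_difference(approx, exact_for_sigma_x(1.0)))
```

The printed error shrinks roughly like 1/n, which is the expected rate for a first-order step such as (9B.16).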
B.3 Dynamical Laws

Next we consider the dynamical law for an operator A. This can be obtained as follows. Let A be independent of t, then the time-variation of the expectation value $\langle A\rangle$ defined in (9A.20a) is:

$i\hbar\frac{\partial}{\partial t}\langle a, t|A|a, t\rangle = i\hbar\frac{\partial}{\partial t}\langle\psi(t)|A|\psi(t)\rangle = i\hbar\frac{\partial}{\partial t}(\psi(t), A\psi(t)) = (\psi(t), AH\psi(t)) - (\psi(t), HA\psi(t))$  (9B.22)

(To write the last step, we have used (9B.19a) and the Hermitian property of H.) The use of the partial derivative of $\psi(t)$ in (9B.22) should come as no surprise, since $\psi = \psi(x, y, z, t)$. Thus the dynamical law of an operator independent of t is:

$i\hbar\frac{d}{dt}\langle A\rangle = \langle AH - HA\rangle$  (9B.23)

where brackets indicate expectation values of the operators enclosed. The above expression shows that commutators of H with the observables play an important role in the theory. If A commutes with H, then in view of (9B.23) its expectation value is constant and it is said to be a constant of the motion. If A depends on time, instead of (9B.23) we have:

$i\hbar\frac{d}{dt}\langle A\rangle = \langle AH - HA\rangle + i\hbar\left\langle\frac{\partial A}{\partial t}\right\rangle$  (9B.24)

We now define an operator $\frac{dA}{dt}$ such that for every state $\psi$ of the system the equality of the expectation values:

$\left\langle\frac{dA}{dt}\right\rangle = \frac{d}{dt}\langle A\rangle$  (9B.25)

holds. The dynamical law in this case is given by:

$i\hbar\frac{dA}{dt} = AH - HA + i\hbar\frac{\partial A}{\partial t}$  (9B.26)

The operator $\frac{dA}{dt}$ is called the total time derivative. In Sec. 2 we obtained Hamiltonian operators for a few simple physical systems (See Exp. 9.2.2).
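As a quick check of the dynamical law (9B.23), the sketch below compares both sides for a two-level system. It is only an illustration under stated assumptions: $\hbar = 1$, $H = \sigma_x$, $A = \sigma_z$, and the initial state (1, 0), for which $\psi(t) = (\cos t, -i\sin t)$ follows from (9B.21). None of these choices come from the text.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]    # plays the role of H (assumption, hbar = 1)
SIGMA_Z = [[1, 0], [0, -1]]   # plays the role of the observable A (assumption)

def apply(m, v):
    """Apply a 2x2 matrix to a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def inner(u, v):
    """Scalar product, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def state(t):
    """psi(t) = exp(-i t sigma_x) applied to (1, 0)."""
    return [complex(math.cos(t)), -1j * math.sin(t)]

def expectation(op, t):
    psi = state(t)
    return inner(psi, apply(op, psi))

def left_side(t, dt=1e-5):
    """i * d<A>/dt, with the derivative taken by a central difference."""
    return 1j * (expectation(SIGMA_Z, t + dt) - expectation(SIGMA_Z, t - dt)) / (2 * dt)

def right_side(t):
    """<AH - HA> evaluated in the state psi(t)."""
    psi = state(t)
    ah = apply(SIGMA_Z, apply(SIGMA_X, psi))
    ha = apply(SIGMA_X, apply(SIGMA_Z, psi))
    return inner(psi, ah) - inner(psi, ha)

print(left_side(0.7), right_side(0.7))
```

Since $\sigma_z$ does not commute with $\sigma_x$, its expectation value changes in time; replacing `SIGMA_Z` by `SIGMA_X` in both functions makes both sides vanish, which is the "constant of the motion" case.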
Exercise 9B
1. Show that the formal solution of the Schrödinger equation (9B.19a) can be written as:

$\psi(t) = U(t, t_0)\psi(t_0) = \exp\left[-\frac{i}{\hbar}(t - t_0)H\right]\psi(t_0)$

where the Hamiltonian H is the energy operator denoted H.

Hints to Exercise 9B
1. In view of Remark (9B.1), the Hamiltonian in the Schrödinger equation is the energy operator H. Hence we have to solve

$i\hbar\frac{d\psi(t)}{dt} = H\psi(t).$

This can be done by using the integrating factor method. The integrating factor here is:

$e^{\int_{t_0}^{t} -\frac{i}{\hbar}H\,dt} = \exp\left[-\frac{i}{\hbar}(t - t_0)H\right].$
APPENDIX 9C: TOOLS OF PHYSICAL THEORIES In this appendix we define a few mathematical objects (along with their related examples) that are constantly used in theoretical physics, particularly in quantum theories. They are: (i) test functions, (ii) distributions, (iii) Fourier-transforms, and (iv) Green-functions.
C.1 Test Functions and Distributions

Definition 9C.1 A differentiable function $T: \mathbb{R} \to \mathbb{R}$ with compact support is called a test function. For example, given a > 0:

$T_a(x) := \begin{cases} e^{-a^2/(a^2 - x^2)}, & |x| < a \\ 0, & |x| \ge a \end{cases}$  (9C.1)

is a test function. Its support has length 2a, and the function is differentiable. For instance, the first derivative $T_a'(x)$ given by:

$T_a'(x) = \begin{cases} -\dfrac{2a^2x}{(a^2 - x^2)^2}\,e^{-a^2/(a^2 - x^2)}, & |x| < a \\ 0, & |x| \ge a \end{cases}$  (9C.2)

is well defined everywhere, as we see:

$\lim_{x \to a+} T_a'(x) = 0 = \lim_{x \to a-} T_a'(x)$  (9C.3)
The derivative of a test function is again a test function. In order to define a distribution, we need the concept of a 'weakly convergent (w.c.) sequence of functions', which is as follows:

Definition 9C.2 A sequence of differentiable functions $f_n : \mathbb{R} \to \mathbb{R}$ (n = 1, 2, 3, ...) is said to be weakly convergent if for any test function T(x) the limit

$\lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)T(x)\,dx$  (9C.4)

exists. The sequence of functions $f_n(x) = \frac{1}{\pi}\frac{n}{1 + n^2x^2}$, known as Breit-Wigner functions, is weakly convergent. In fact it can be shown that:

$\lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)T(x)\,dx = T(0)$  (9C.5)

A weakly convergent sequence which satisfies (9C.5) is said to possess the sifting property. Other examples of weakly convergent sequences with the sifting property are:

(a) $f_n(x) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2}$ (see Exc. 9C.1).

(b) $f_n(x) = \frac{1}{n\pi}\frac{\sin^2 nx}{x^2}$  (9C.6)
Definition 9C.3 A distribution D (also known as a generalized function) is an equivalence class of weakly convergent sequences of functions $[f_n]$.36 For any representative sequence $f_n$, we write it as:

$\int_{-\infty}^{\infty} D(x)T(x)\,dx := \lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)T(x)\,dx \in \mathbb{R}$  (9C.7)

The LHS is also denoted as $\int DT$ or, in the functional setting, as D(T), and reads "D evaluated on T." A distribution is in fact a linear functional over the space of test functions; thus for two test functions T and S we have:

$D(T + S) = D(T) + D(S)$  (9C.8)

and for any real number a:

$D(aT) = aD(T)$  (9C.9)

We note that a distribution can only be "localized" in a finite interval by evaluating it on a test function with support in that interval; we say that "the distribution is smeared out by the test function." In view of this, distributions are better suited than functions in formulating the uncertainty principle in quantum field theory.

Definition 9C.4 The distribution defined by the sequence $[f_n(x)]$ where

$f_n(x) = \frac{1}{\pi}\frac{n}{1 + n^2(x - \xi)^2}$  (9C.10)

is called the Dirac's delta function at $\xi \in \mathbb{R}$. It is denoted by $\delta_\xi$ or $\delta_\xi(x)$ or $\delta(x - \xi)$.37
C.2 Properties of Distributions with Respect to Operations on Them

(i) Two or more distributions can be added to give another distribution. The sum is naturally represented by the equivalence class of the sum of weakly convergent sequences.
(ii) For every $a \in \mathbb{R}$, the distribution aD is a scalar multiple of D.
(iii) The derivative of D represented by a sequence $[f_n]$ is the distribution D' defined as:

$\int D'T = -\int DT'$  (9C.11)

and is represented by the sequence $[f_n'(x)]$. For example, the derivative of the $\delta$-distribution given in (9C.10) is represented by:

$f_n'(x) = -\frac{2n^3}{\pi}\frac{x - \xi}{(1 + n^2(x - \xi)^2)^2}$  (9C.12)

The above example shows that in many ways distributions behave operationally like functions.38 However, this is not always true; for instance the product of two distributions may or may not be a distribution. An easy example to illustrate this is the square of the $\delta$-distribution, where the sequence

$\left[\frac{n^2}{\pi^2(1 + n^2x^2)^2}\right]$

does not converge weakly, and hence the square of an arbitrary $\delta$-distribution is not a distribution.
(iv) The product of a differentiable function g(x) and a distribution D represented by the sequence $[f_n(x)]$ is the distribution:

$(gD)(T) := \lim_{n \to \infty}\int g f_n T = D(gT)$  (9C.13)

36 Two weakly convergent sequences of functions $f_n(x)$, $g_n(x)$ are equivalent if their difference converges weakly to zero.
37 Sometimes it is also denoted as $\delta(x, \xi)$. It is a generalization of the Kronecker delta $\delta_{ij}$ to the case of a continuous variable, and as such it is defined by the equation $f(\xi) = \int_{-\infty}^{\infty}\delta(x, \xi)f(x)\,dx$, where f(x) is an arbitrary well behaved function. In the above equation $\delta(x, \xi) = 0$ everywhere except when $\xi$ is very close to x (See pp. 82-84 of [24].)
(v) Let h(x) be a diffeomorphism39 and let $[f_n(x)]$ be a weakly convergent sequence representing a distribution D(x), then D(h(x)) is a distribution defined as:

$\int D(h(x))T(x)\,dx := \lim_{n \to \infty}\int f_n(h(x))T(x)\,dx = \lim_{n \to \infty}\int f_n(y)\frac{T(h^{-1}(y))}{|h'(h^{-1}(y))|}\,dy$  (9C.14)

This means that substitution in a distribution is possible only if h(x) is a diffeomorphism.
(vi) In the case of distributions, integration and limit as well as differentiation and limit can be interchanged, more specifically:

$\lim_{n \to \infty}\int D_n(x)T(x)\,dx = \int\lim_{n \to \infty}D_n(x)T(x)\,dx$  (9C.15)

and

$\lim_{n \to \infty}\frac{d}{dx}D_n(x) = \frac{d}{dx}\lim_{n \to \infty}D_n(x)$  (9C.16)

(Equality (9C.15) follows from Def. (9C.5).) It should be noted that such an interchange is not possible for pointwise convergence of functions, for example

$\lim_{n \to \infty}\frac{\sin nx}{n} = 0$  (9C.17)

whereas

$\lim_{n \to \infty}\frac{d}{dx}\left(\frac{\sin nx}{n}\right) = \lim_{n \to \infty}\cos nx$  (9C.18)

fails to exist. Finally, one also has the concept of convergence amongst distributions as shown below.

Definition 9C.5 A sequence of distributions $D_n$, n = 1, 2, 3, ... is said to be convergent if there is a distribution D such that for any test function T

$\lim_{n \to \infty}\int_{-\infty}^{\infty} D_nT = \int_{-\infty}^{\infty} DT$  (9C.19)

We write it as

$\lim_{n \to \infty} D_n = D$  (9C.20)

Dirac's delta-distribution $\delta_n(x)$ for n = 1, 2, ... represents a convergent sequence of distributions:

$\lim_{n \to \infty}\delta_n(x) = 0$  (9C.21)

38 Even the support of a distribution is defined in the same way as that of a function. It is the complement of the largest open subset on which the distribution vanishes.
39 By diffeomorphism h(x) we mean here that h(x) is a differentiable function whose inverse is defined.

C.3 Green's Functions
In the following we shall define a Green's function and study some of its properties. We shall see that in general they are not functions, although they carry this name.

Definition 9C.6 Consider an N-th order homogeneous linear differential equation:

$a_0(x)f(x) + a_1(x)f'(x) + \cdots + a_k(x)f^{(k)}(x) + \cdots + a_N(x)f^{(N)}(x) = 0$  (9C.22)

in the operator form

$Af = 0$  (9C.23)

where A is the linear differential operator:

$A := \sum_{k=0}^{N} a_k(x)\left(\frac{d}{dx}\right)^k$  (9C.24)

with differentiable coefficient functions $a_k(x)$. A Green's function for the linear differential operator A given in (9C.24) is a distribution $G_\xi(x)$ such that40

$AG_\xi(x) = \delta_\xi(x), \quad \xi \in \mathbb{R}$  (9C.25)

A Green's function is also called a propagator or an elementary solution of A. An easy example of a Green's function is the Heaviside's step function41

$H(x) = \begin{cases} 0, & x < 0 \\ 1, & x > 0 \end{cases}$  (9C.26)

given by the weakly convergent sequence:

$\{f_n(x)\} = \left\{\frac{1}{2} + \frac{1}{\pi}\arctan nx\right\}$  (9C.27)

Since for $A = \frac{d}{dx}$ we have:

$\frac{d}{dx}H(x) = \frac{d}{dx}\left(\frac{1}{2} + \frac{1}{\pi}\arctan nx\right) = \frac{1}{\pi}\frac{n}{1 + n^2x^2} = \delta(x)$  (9C.28)

Green's functions defined in (9C.25) play an important role in solving inhomogeneous equations:

$Af(x) = s(x)$  (9C.29)

40 Sometimes in the literature the RHS carries a negative sign or an i.
41 Any differentiable function f(x) can be considered as a distribution in terms of the weakly convergent sequence f(x), f(x), f(x), ... formed by it. The function H(x) in this sense is a distribution, more specifically it is the Green's function $G_0(x)$.
The above equation involves two functions f(x) and s(x); the latter, called the source function, is supposed to be known. In the case of many source functions, using $G_\xi(x)$, a solution in the form of a distribution can be defined, thus:

$D(x) = \int_{-\infty}^{\infty} G_\xi(x)\,s(\xi)\,d\xi$  (9C.30)

Due to its usefulness, the relation in (9C.30) is sometimes referred to as a magic formula. Note that when $A = \frac{d}{dx}$ and $G_\xi(x) = H(x - \xi)$, this formula reduces to the relation between the function f(x) and the source function s(x) (see Exc. 9C.2):

$f(x) = \int_{-\infty}^{\infty} H(x - \xi)\,s(\xi)\,d\xi = \int_{-\infty}^{x} H(x - \xi)\,s(\xi)\,d\xi + \int_{x}^{\infty} H(x - \xi)\,s(\xi)\,d\xi$  (9C.31)

The second integral is zero since $H(x - \xi) = 0$ for $\xi$ in the interval $(x, \infty)$; the first integral exists if $s(\xi)$ is integrable in the interval $(-\infty, x]$. Accordingly we have:

$f(x) = \int_{-\infty}^{x} s(\xi)\,d\xi$  (9C.32)

(In the process we have obtained the 'fundamental theorem of differentiation and integration' in (9C.32).) If the coefficients in A are all constant, the Green's function

$G_\xi(x) = G(x - \xi)$  (9C.33)
due to translation invariance can be written as G(x). The function H{x -
(9C.34)
is solved. All these concepts introduced above can be generalized to functions of several variables with complex values after replacing the ordinary Riemann integral by multiple integrals and ordinary derivatives by partial derivatives. For instance, a weakly convergent sequence can be defined as:

Definition 9C.7 A sequence of differentiable functions $f_n : \mathbb{R}^N \to \mathbb{C}$ is said to be weakly convergent if for any differentiable function $T: \mathbb{R}^N \to \mathbb{C}$ with compact support the limit

$\lim_{n \to \infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f_n(x^1 \ldots x^N)\,T(x^1 \ldots x^N)\,dx^1 \ldots dx^N =: \lim_{n \to \infty}\int f_n(x)T(x)\,dx =: \lim_{n \to \infty}\int f_nT, \quad x := (x^1 \ldots x^N)$  (9C.35)

exists. A useful test function is:

$T_a(x) := \begin{cases} e^{-a^2/(a^2 - r^2)}, & r < a \\ 0, & r \ge a \end{cases}$  (9C.36)

where $r := |x| := \sqrt{\sum_{i=1}^{N}(x^i)^2}$ and a > 0.
Similarly a Green's function can be defined for the "divergence", the linear partial differential operator acting on differentiable vector-valued functions $E: \mathbb{R}^3 \to \mathbb{R}^3$:

$\operatorname{div}E(x) := \sum_{i=1}^{3}\frac{\partial}{\partial x^i}E^i(x)$  (9C.37)

In view of Eq. (9C.33) this is:

$G_\xi(x) = G(x - \xi) = (G^1(x - \xi), G^2(x - \xi), G^3(x - \xi))$  (9C.38)

which is a solution of the inhomogeneous equation $\operatorname{div}G(x) = \delta(x)$. The solution:

$G(x) = \frac{1}{4\pi r^2}\frac{x}{r}$  (9C.39)

is unique, if it is assumed that far away from the origin G(x) as a distribution approaches zero.

42 The convolution product of two functions f(x) and g(x) is defined as: $(f * g)(x) = \int_{-\infty}^{\infty} f(x - \xi)\,g(\xi)\,d\xi$
Using polar coordinates $(r\cos\phi\sin\theta,\ r\sin\phi\sin\theta,\ r\cos\theta)$ ($0 < r < \infty$, $0 \le \phi < 2\pi$, $0 \le \theta < \pi$) it can be shown that G(x) in (9C.39), although not a piecewise differentiable function, is a distribution given by the limit of the sequence:
Gn(x):=-^T//r--T)
(9C.40)
It has the property G(-x) = -G(x) (i.e. it is odd), and can be used in solving inhomogeneous equations of the type:

$\operatorname{div}E(x) = h(x)$  (9C.41)

Thus, with the help of the magic formula (9C.30) and (9C.39), the solution to (9C.41) can be written as:

$E(x) = \int G(x - \xi)\,h(\xi)\,d^3\xi = \frac{1}{4\pi}\int\frac{x - \xi}{|x - \xi|^3}\,h(\xi)\,d^3\xi$  (9C.42)

The above solution is the limit of superpositions of electric fields generated by point-like charges. (See Exp. (9C.15) on Green's function.)
C.4 Fourier Transforms and Related Objects

Fourier transforms play an important role in solving differential as well as integral equations. To begin with, we define here the objects that lead to these transforms.

Definition 9C.8 A real variable function f(t) is said to be periodic of period T, if for all $t \in \mathbb{R}$

$f(t + T) = f(t)$  (9C.43)

The constant functions and the so-called harmonics

$\sin\frac{2\pi}{T}t,\quad \cos\frac{2\pi}{T}t,\quad \sin\frac{4\pi}{T}t,\quad \cos\frac{4\pi}{T}t,\ \ldots$

are easy examples of periodic functions.43 Periodic functions are often treated as defined on a circle of circumference T rather than on the whole of the real axis $\mathbb{R}$. Periodicity is then reproduced by wrapping $\mathbb{R}$ on the circle. One of the central questions that is asked for periodic functions is: given the period T, when can such a function be expressed as a superposition of harmonics

$f(t) = \sum_{j=0}^{\infty} a_j\cos\left(\frac{2\pi j}{T}t\right) + \sum_{j=1}^{\infty} b_j\sin\left(\frac{2\pi j}{T}t\right)$  (9C.44)

that is, as a Fourier series with Fourier coefficients $a_j$, $b_j$? To answer this, one has to address the question of convergence. The convergence may be uniform, pointwise, weak or in the mean depending on the properties of the periodic function.44

43 The harmonics should not be confused with harmonic functions, the solutions of the Laplace equation. The terminology used here is based on the fact that they satisfy the equation of the harmonic oscillator $f'' + \omega^2f = 0$ with $\omega = \frac{2\pi}{T}, \frac{4\pi}{T}$, etc.
44 For various types of convergences mentioned above, see an analysis book, e.g., 3.[12] or Ref. [31]; see also Exc. 1.
We note that a Fourier series helps reduce a given differential equation for a periodic function into an algebraic equation for the Fourier coefficients $a_j$, $b_j$. This happens because the harmonics form a basis in the space of periodic functions, such that each basis vector is an eigenvector of the differential operator. Very often the Fourier series of a periodic function f(t) with period T can be expressed in complex form:

$f(t) = \lim_{n \to \infty}\sum_{j=-n}^{n} c_j\,e^{-i(2\pi/T)jt}$  (9C.45)

where the complex Fourier coefficients $c_j$ are:

$c_j = \begin{cases} \frac{1}{2}(a_j + ib_j), & j > 0 \\ a_0, & j = 0 \\ \frac{1}{2}(a_{-j} - ib_{-j}), & j < 0 \end{cases}$  (9C.46)

If f(t) is a real function, then the Fourier coefficients $a_j$ and $b_j$ are real and $c_{-j} = \bar{c}_j$. We define an orthonormal set of functions*:

$e_j(t) := \frac{1}{\sqrt{T}}\,e^{-i(2\pi/T)jt}, \quad j \in \mathbb{Z}$

for the complex harmonics, and note that the Fourier coefficients $c_j$ immediately follow from the scalar product:

$c_j = \frac{1}{\sqrt{T}}(e_j, f) = \frac{1}{T}\int_0^T f(t)\,e^{i(2\pi/T)jt}\,dt$  (9C.47)
Thus obtaining the Fourier series of a periodic function amounts to using (9C.45) where the $c_j$ are calculated with the help of (9C.47). (See Exc. 9C.3.) Likewise, the Fourier series of a distribution D(t) which is periodic can also be obtained by considering the Fourier coefficients $c_j^n$ of periodic functions $f_n(t)$ that represent D(t), where

$c_j^n = \frac{1}{T}\int_0^T f_n(t)\,e^{i(2\pi/T)jt}\,dt$  (9C.48a)

For instance, consider the periodic $\delta$-distribution with period 1; its Fourier coefficients are:

$c_j = \int_0^1\delta(t)\,e^{i2\pi jt}\,dt = 1$  (9C.48b)

consequently the Fourier series for $\delta(t)$ is:

$\delta(t) = \lim_{n \to \infty}\sum_{j=-n}^{n} e^{-i2\pi jt}$  (9C.49)

* The factor $1/\sqrt{T}$, the normalization constant, is chosen so that $(e_j, e_k) = \delta_{jk}$ for $j, k \in \mathbb{Z}$.

Its convergence (in the sense of distributions) can be obtained by using a test function T(t) on the circle:

$\lim_{n \to \infty}\int_0^1\sum_{j=-n}^{n} e^{-2\pi ijt}\,T(t)\,dt = T(0)$  (9C.50)
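The distributional convergence in (9C.50) can be illustrated numerically. In the sketch below the partial sum $\sum_{j=-n}^{n} e^{-2\pi ijt}$ is written in its real form $1 + 2\sum_{j=1}^{n}\cos 2\pi jt$, and it is smeared with a smooth function of period 1. The particular test function $e^{\cos 2\pi t}$ and the rectangle rule are our own illustrative choices, not taken from the text.

```python
import math

def dirichlet_partial_sum(n, t):
    """Sum of exp(-2 pi i j t) for j = -n..n, written in its real form."""
    return 1.0 + 2.0 * sum(math.cos(2.0 * math.pi * j * t) for j in range(1, n + 1))

def periodic_test_function(t):
    """A smooth function of period 1 (illustrative choice); its value at 0 is e."""
    return math.exp(math.cos(2.0 * math.pi * t))

def smeared_partial_sum(n, points=2000):
    """Rectangle-rule estimate of the integral over [0, 1) in (9C.50)."""
    dt = 1.0 / points
    return dt * sum(dirichlet_partial_sum(n, i * dt) * periodic_test_function(i * dt)
                    for i in range(points))

for n in (1, 2, 4, 8):
    print(n, smeared_partial_sum(n) - periodic_test_function(0.0))
```

The partial sums themselves do not converge pointwise, yet the smeared values approach T(0) very quickly for a smooth test function.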
We next define the Fourier Transform (F.T.) of a function f(x) along with the properties that f(x) should possess for the existence of its F.T.

Definition 9C.9 Let P be any polynomial. A function $f: \mathbb{R} \to \mathbb{C}$ is said to be of fast decrease if it is differentiable, and if the product of f or any of its derivatives with the polynomial P is bounded. Any test function as well as the Gaussian:

$f(x) = e^{-x^2/a^2}, \quad a > 0$  (9C.51)

is of fast decrease. Moreover, if f is of fast decrease, then so are its derivatives and the function resulting from its product with any polynomial.

Definition 9C.10 If f is a function of fast decrease, then its Fourier transform is defined as:45

$\hat{f}(k) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)\,e^{ikx}\,dx$  (9C.52)

(see Exc. 9C.4).

Fourier's Inversion Theorem The Fourier transform $\hat{f}(k)$ of f(x) is of fast decrease and satisfies the Fourier's inversion formula:

$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\hat{f}(k)\,e^{-ikx}\,dk$  (9C.53)

We list below in the form of a theorem some of the properties concerning the Fourier transform of a function of fast decrease.

Theorem 9C.11
(a) If f is of fast decrease, then

$\widehat{\left(\frac{df}{dx}\right)}(k) = -ik\hat{f}(k)$  (9C.54)

and

$\widehat{(xf)}(k) = -i\frac{d}{dk}\hat{f}(k)$  (9C.55)

(b) If f(x) = g(x - a), then

$\hat{f}(k) = e^{ika}\,\hat{g}(k)$  (9C.56)

(c) If f is even or odd, then so is its F.T.

45 Sometimes $e^{-ikx}$ is used in place of $e^{ikx}$, and the factor $\sqrt{2\pi}$ is given in the numerator.
(d) If f and g are of fast decrease, then their convolution product

$(f * g)(x) := \int_{-\infty}^{\infty} f(x - y)\,g(y)\,dy$  (9C.57)

is well defined and, with the normalization of (9C.52),

$\widehat{f * g} = \sqrt{2\pi}\,\hat{f}\,\hat{g}$  (9C.58)

Parseval's Theorem 9C.12 If f and g are functions of fast decrease with Fourier transforms $\hat{f}$ and $\hat{g}$, then

$\int_{-\infty}^{\infty}\bar{f}(x)\,g(x)\,dx = \int_{-\infty}^{\infty}\bar{\hat{f}}(k)\,\hat{g}(k)\,dk$  (9C.59)
where $\bar{f}(x)$ denotes the complex conjugate of f(x). (See Exc. (9C.5) and (9C.6) which deal with functions that are not of fast decrease.) Naturally one can also define Fourier transforms of distributions, though only for a restricted class: the class of 'tempered distributions.'

Definition 9C.13 A tempered distribution (T.D.) is an equivalence class of sequences of functions $f_n(x)$ of fast decrease such that for any function F(x) (also) of fast decrease, the sequence of numbers $\int_{-\infty}^{\infty} f_n(x)F(x)\,dx$ converges. We denote its limit as:

$\lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)F(x)\,dx =: \int_{-\infty}^{\infty} D(x)F(x)\,dx =: D(F)$  (9C.60)
For example, the ^-distribution and the constant function 1 is a T.D. In general every distribution of compact support is a T.D; any linear combination of tempered distributions is a T.D. and the derivative of a T.D. is a T.D. A differentiable function defined on the whole of the x-axis is said to be a function of slow increase if its increase at + °° or - °° is polynomial or slower. For instance, polynomials or e~Vx In x2 are functions of slow increase. Any T.D. multiplied by a function of slow increase is again a T.D. In particular a function of slow increase is tempered. We next define the Fourier transform of tempered distributions by Fourier transforming their representative functions. Definition 9C.14 Let D(x) be a T.D. represented by the functions fn{x) of fast decrease. By Parseval's Theorem (9C.12) their Fourier transforms fn(k) define a tempered distribution known as the F.T. of D(x) and denoted as D(k). Thus, D(k)= lim f
fn(x)F(k)dk=
lim f /„(*) F(x)dx
(9C.61)
Note that the F.T. of the $\delta$-distribution concentrated at the origin is the constant function $\frac{1}{\sqrt{2\pi}}$. Since the sequence representing the $\delta$-distribution can be represented by $f_n(x) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2} = \frac{n}{\sqrt{\pi}}e^{-x^2/(1/n)^2}$, we can use the result established in Eq. (ii) of Exc. (9C.4) to write the Fourier transform of $f_n(x)$ as:

$\hat{f}_n(k) = \frac{1}{\sqrt{2\pi}}\,e^{-k^2/4n^2}$  (9C.62)

The F.T. $\hat{f}_n(k)$ represents the F.T. of the $\delta$-distribution; evidently it approaches $\frac{1}{\sqrt{2\pi}}$ as $n \to \infty$.
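A short numerical cross-check of (9C.62) is sketched below: the transform (9C.52) of the Gaussian sequence $f_n(x) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2}$ is estimated by a midpoint sum and compared with the closed form $\frac{1}{\sqrt{2\pi}}e^{-k^2/4n^2}$. The truncation of the integral at ten widths and the grid size are assumptions made only for this illustration.

```python
import cmath
import math

def delta_sequence(n, x):
    """f_n(x) = (n/sqrt(pi)) * exp(-n^2 x^2)."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def fourier_transform(n, k, points=4000):
    """Midpoint estimate of (1/sqrt(2 pi)) * integral of f_n(x) exp(i k x) dx, as in (9C.52)."""
    half_width = 10.0 / n          # f_n is negligible beyond ten widths
    dx = 2.0 * half_width / points
    total = 0j
    for i in range(points):
        x = -half_width + (i + 0.5) * dx
        total += delta_sequence(n, x) * cmath.exp(1j * k * x)
    return total * dx / math.sqrt(2.0 * math.pi)

def closed_form(n, k):
    """Right-hand side of (9C.62)."""
    return math.exp(-k * k / (4.0 * n * n)) / math.sqrt(2.0 * math.pi)

for n in (1, 5, 50):
    print(n, abs(fourier_transform(n, 3.0) - closed_form(n, 3.0)), closed_form(n, 3.0))
```

As n grows, the closed form flattens towards the constant $1/\sqrt{2\pi} \approx 0.3989$ for every fixed k, in line with the statement about the F.T. of the $\delta$-distribution.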
The Fourier transforms of functions and distributions defined over $\mathbb{R}^N$ are obtained by replacing the single integral by a multiple integral and the factor $(2\pi)^{-1/2}$ by the factor $(2\pi)^{-N/2}$. For instance if $f: \mathbb{R}^N \to \mathbb{C}$ is a function f(x) of fast decrease, then its F.T. is:

$\hat{f}(k^1 \ldots k^N) = \frac{1}{(\sqrt{2\pi})^N}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(x^1 \ldots x^N)\,e^{i(k^1x^1 + \cdots + k^Nx^N)}\,dx^1 \ldots dx^N = \frac{1}{(\sqrt{2\pi})^N}\int_{\mathbb{R}^N} f(x)\,e^{ik \cdot x}\,dx$  (9C.63)

where $k \cdot x = \sum_{j=1}^{N} k^jx^j$. The inverse transform therefore becomes:

$f(x) = \frac{1}{(\sqrt{2\pi})^N}\int_{\mathbb{R}^N}\hat{f}(k)\,e^{-ik \cdot x}\,dk$  (9C.64)

The derivatives in Eqs. (9C.54) and (9C.55) are now replaced by partial derivatives, thus the F.T. of partial differentiation in the x-space becomes multiplication of the F.T. by the corresponding component in the k-space:

$\widehat{\left(\frac{\partial f}{\partial x^j}\right)}(k) = -ik^j\hat{f}(k), \quad j = 1, 2 \ldots N$  (9C.65)

conversely,

$\frac{\partial}{\partial k^j}\hat{f}(k) = i\,\widehat{(x^jf)}(k), \quad j = 1, 2 \ldots N$  (9C.66)

Finally we give below an example of computation of a Green's function for the damped harmonic oscillator.

Example 9C.15 Recall that a linear differential operator $A = \left(\frac{d}{dt}\right)^2 + 2\lambda\frac{d}{dt} + \omega_0^2$ where $\omega_0 > \lambda > 0$ is called a damped harmonic oscillator.46 The Green's function G(t) with respect to A satisfies:

$\left[\left(\frac{d}{dt}\right)^2 + 2\lambda\frac{d}{dt} + \omega_0^2\right]G(t) = \delta(t), \quad \omega_0 > \lambda > 0$  (9C.67)

46 It is called overdamped if $\lambda > \omega_0 > 0$ and undamped if $\lambda = 0$.
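Our objective is to compute G(t). We assume that G(t) is tempered and so the differential equation (9C.67) can be Fourier transformed to give an algebraic equation in the frequency space (See Appendix 9B). As mentioned earlier, such an algebraic equation is easier to solve. The solution can then be transformed back using the residue theorem to give finally the Green's function. The F.T. of (9C.67) leads to47:

$[(-i\omega)^2 - 2i\lambda\omega + \omega_0^2]\,\hat{G}(\omega) = \frac{1}{\sqrt{2\pi}}$  (9C.68)

This gives

$\hat{G}(\omega) = \frac{1}{\sqrt{2\pi}}\,\frac{1}{-\omega^2 - 2i\lambda\omega + \omega_0^2}$  (9C.69)

which can be checked to be square integrable, thus G(t) is indeed tempered. Also $\hat{G}(\omega)$ is integrable, therefore in view of the inversion formula (9C.53) one has:

$G(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\hat{G}(\omega)\,e^{-i\omega t}\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega t}\,d\omega}{-\omega^2 - 2i\lambda\omega + \omega_0^2} = -\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega t}\,d\omega}{(\omega - \omega_1)(\omega - \omega_2)}$  (9C.70)

where

$\omega_1 = -i\lambda - \sqrt{\omega_0^2 - \lambda^2}$  (9C.71)

and

$\omega_2 = -i\lambda + \sqrt{\omega_0^2 - \lambda^2}$  (9C.72)

are isolated singularities of the integrand. For t < 0 we consider the closed curve in Fig. (9.5) and note that the isolated singularities are not inside this closed curve C, hence applying Cauchy's theorem as $R \to \infty$, we have G(t) = 0 as the integral vanishes over the half circle C.

[Fig. 9.5: closed curve C in the complex $\omega$-plane, with axes Re $\omega$ and Im $\omega$ and endpoints -R, R on the real axis]

[Fig. 9.6: closed curve C in the complex $\omega$-plane enclosing the singularities $\omega_1$ and $\omega_2$, with axes Re $\omega$ and Im $\omega$ and endpoints -R, R on the real axis]

47 $\int_{-\infty}^{\infty} e^{i\omega t}\left[\left(\frac{d}{dt}\right)^2 + 2\lambda\frac{d}{dt} + \omega_0^2\right]G(t)\,dt = \int_{-\infty}^{\infty}\delta(t)\,e^{i\omega t}\,dt$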
For t > 0 we use the clockwise curve C in Fig. (9.6). The singularities are within this closed curve, thus we have, using the Residue theorem (See Exc. (9C.6)):

$G(t) = i\sum_{k=1}^{2}\left[(\omega - \omega_k)\,\frac{e^{-i\omega t}}{(\omega - \omega_1)(\omega - \omega_2)}\right]_{\omega = \omega_k}$  (9C.73)
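Evaluating the two residues at the poles (9C.71) and (9C.72) gives, for t > 0, the standard closed form $G(t) = e^{-\lambda t}\sin(\Omega t)/\Omega$ with $\Omega = \sqrt{\omega_0^2 - \lambda^2}$, and G(t) = 0 for t < 0. The sketch below is a hedged numerical check of that closed form: it verifies by finite differences that G satisfies the homogeneous equation for t > 0 and that its slope jumps by 1 at t = 0, which is what the $\delta(t)$ on the right of (9C.67) demands. The parameter values and step sizes are arbitrary illustrative choices.

```python
import math

def green(t, lam, omega0):
    """Closed-form causal Green's function of the damped oscillator (assumes omega0 > lam > 0)."""
    if t < 0:
        return 0.0
    big_omega = math.sqrt(omega0 ** 2 - lam ** 2)
    return math.exp(-lam * t) * math.sin(big_omega * t) / big_omega

def residual(t, lam, omega0, h=1e-4):
    """G'' + 2*lam*G' + omega0^2*G at time t > h, using central differences."""
    g_minus = green(t - h, lam, omega0)
    g_zero = green(t, lam, omega0)
    g_plus = green(t + h, lam, omega0)
    first = (g_plus - g_minus) / (2.0 * h)
    second = (g_plus - 2.0 * g_zero + g_minus) / (h * h)
    return second + 2.0 * lam * first + omega0 ** 2 * g_zero

def slope_just_after_zero(lam, omega0, h=1e-6):
    """One-sided estimate of G'(0+); the delta source requires this to equal 1."""
    return (green(h, lam, omega0) - green(0.0, lam, omega0)) / h

LAM, OMEGA0 = 0.3, 2.0  # illustrative values with OMEGA0 > LAM > 0
print(residual(1.0, LAM, OMEGA0), slope_just_after_zero(LAM, OMEGA0))
```

The function vanishes for negative times by construction, matching the t < 0 contour argument.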
Next we introduce another entity, the 'functional', which is widely used in quantum mechanics, in particular in the path integral formalism.

C.5 Functionals and their Calculus

In layman's language a functional is a function of functions, or a function of infinitely many variables. Calculations involving a functional are carried out by considering it as a function of a finite number of variables $(x^1 \ldots x^N)$ and then letting $N \to \infty$. The simplest example of a functional is our familiar action functional of a particle moving in one direction in a potential (with Lagrangian $L(x, \dot{x})$):

$S[x] = \int_{t_0}^{t_1} dt\,L(x, \dot{x}) = \int_{t_0}^{t_1} dt\left[\frac{1}{2}m\left(\frac{dx}{dt}\right)^2 - V(x)\right]$  (9C.74)

Note that unlike other functions whose value would depend on a particular point x, the value of S[x] depends on the entire trajectory (curve) along which the integration is taken from the initial point $x(t_0)$ given by $t_0$ to the final point $x(t_1)$ given by $t_1$. Thus generically a functional is:

$F[f] = \int dx\,F(f(x))$  (9C.75)

where for example F(f(x)) may simply be $(f(x))^k$. The concept of derivation of a functional can be viewed as an extension of derivation of a generalized function. Thus the functional derivative (Gateaux derivative) is defined from the linear functional as:

$F'[g] = \frac{d}{d\epsilon}F[f + \epsilon g]\Big|_{\epsilon = 0} = \int dx\,\frac{\delta F}{\delta f(x)}\,g(x)$  (9C.76)
The above definition corresponds to an equivalent expression which is more useful from a computational standpoint:

$\frac{\delta F(f(x))}{\delta f(y)} = \lim_{\epsilon \to 0}\frac{F(f(x) + \epsilon\delta(x - y)) - F(f(x))}{\epsilon}$  (9C.77)

Equation (9C.77) implies that:

$\frac{\delta f(x)}{\delta f(y)} = \delta(x - y)$  (9C.78)

The following properties (similar to those of the ordinary derivative) can be easily verified:

$\frac{\delta}{\delta f(x)}(F[f] + G[f]) = \frac{\delta F}{\delta f(x)} + \frac{\delta G}{\delta f(x)}$  (linearity)  (9C.79)

$\frac{\delta}{\delta f(x)}(F[f]\,G[f]) = \frac{\delta F}{\delta f(x)}\,G[f] + F[f]\,\frac{\delta G}{\delta f(x)}$  (product rule)  (9C.80)

$\frac{\delta}{\delta f(x)}F[G[f]] = \frac{\delta F[G]}{\delta G}\,\frac{\delta G[f]}{\delta f(x)}$  (chain rule)  (9C.81)

Any given functional F[f] can be expanded as a Taylor series in the following form:
if]
= j dxgoix) + j dx1dx2glixl,
x2)fix2)
+ ••• + j dxxdx2 ... dxkgk_x (*„ x2 ... xk)fix2)
... fixk)
...
(9C.82)
where
goix) = F(fix))\fM
= 0>
8l(xlt
x2) =
5F X
g X X
* » * "3) = M^ffl )
^ ^
(9C 83)
"
Of{x2)5f{x3) / U ) = 0 We illustrate the functional derivative for two easy functionals (a) F [ / ] = (f)3 and (b) S[x] = f1 dt'Lixit'), x it')). Writing F[f ] = f dyF( /(y)) = f dyifiy)? in the case of (a) and using (9C.77) JtQ
J
J
we have: SFjfjy))
= lim
^(/(y) + £^(y - *)) - ^(/(y»
= lim(/(y)
+ £l5(3;
~x))3~(-/'W)3
Basics of Quantum Theory 547
(f(y)f + ie(f(y))2$(y - *) + o(e2) - (f(y))3
.. - hm
1—
= 3(/(y)) 2 5(y-x)
(9C.84)
In view of (9C.75) and the fact that derivative of a functional is a functional, we obtain:
Jw§)= I dy^nxf=
i d y 3 (f{y))2Siy ~x) = 3(fix))2
(9C 85)
-
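The result (9C.85) can be reproduced on a grid. In the sketch below the functional is replaced by a Riemann sum and the delta function in (9C.77) by a spike of height 1/dx at a single grid point; both replacements, the sample function sin y, and the parameter values are assumptions of this illustration rather than part of the text.

```python
import math

def cubic_functional(values, dx):
    """Riemann-sum version of F[f] = integral of f(y)^3 dy."""
    return dx * sum(v ** 3 for v in values)

def functional_derivative(values, j, dx, eps=1e-8):
    """Quotient (9C.77) with delta(y - x_j) replaced by a spike of height 1/dx at grid point j."""
    shifted = list(values)
    shifted[j] += eps / dx
    return (cubic_functional(shifted, dx) - cubic_functional(values, dx)) / eps

DX = 0.01
GRID = [i * DX for i in range(315)]       # sample points on [0, 3.14]
SAMPLES = [math.sin(y) for y in GRID]     # f(y) = sin y, an arbitrary smooth choice
J = 100                                   # grid point x_j = 1.0

print(functional_derivative(SAMPLES, J, DX), 3.0 * SAMPLES[J] ** 2)
```

The two printed numbers agree closely as long as eps is much smaller than dx; otherwise the higher-order terms of the cube, which (9C.84) discards in the limit, become visible.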
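For (b) we write L as a sum of two separate functionals:

$L(x(t), \dot{x}(t)) = \frac{1}{2}m\dot{x}^2 - V(x) = T(\dot{x}(t)) - V(x)$  (9C.86)

and applying the differentiation rule (9C.77) to V(x) and $T(\dot{x}(t))$, we have respectively:

$\frac{\delta V(x(t'))}{\delta x(t)} = \lim_{\epsilon \to 0}\frac{V(x(t') + \epsilon\delta(t' - t)) - V(x(t'))}{\epsilon} = \frac{\partial V(x(t'))}{\partial x(t')}\,\delta(t' - t) = V'(x(t'))\,\delta(t' - t)$  (9C.87)

and

$\frac{\delta T(\dot{x}(t'))}{\delta x(t)} = \frac{\partial T(\dot{x}(t'))}{\partial\dot{x}(t')}\,\frac{\delta\dot{x}(t')}{\delta x(t)} = m\dot{x}(t')\,\frac{d}{dt'}\delta(t' - t)$  (9C.88)

where we have regarded $\dot{x}(t')$ as a function of x(t) and therefore have applied the chain rule (9C.81), and have then used the result of (a) to write the value of $\frac{\partial T(\dot{x}(t'))}{\partial\dot{x}(t')} = \frac{1}{2}m(2\dot{x}(t'))$. Thus piecing (9C.87) and (9C.88) together we have:

$\frac{\delta S[x]}{\delta x(t)} = \int_{t_0}^{t_1} dt'\left[m\dot{x}(t')\,\frac{d}{dt'}\delta(t' - t) - V'(x(t'))\,\delta(t' - t)\right] = -m\ddot{x}(t) - V'(x(t)) = -\frac{d}{dt}\frac{\partial L(x(t), \dot{x}(t))}{\partial\dot{x}(t)} + \frac{\partial L(x(t), \dot{x}(t))}{\partial x(t)}$  (9C.89)

We note that the RHS is the familiar Euler-Lagrange expression; this shows that the functional extremum of the action:

$\frac{\delta S[x]}{\delta x(t)} = 0$  (9C.90)

is indeed the classical Euler-Lagrange equation.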
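The statement (9C.89)-(9C.90) can be tested on a discretized action. The sketch below assumes a harmonic potential $V(x) = \frac{1}{2}kx^2$ with m = k = 1 and a forward-difference velocity on a uniform time grid (all of these are our illustrative assumptions). The functional derivative is approximated by the partial derivative of the discretized action with respect to one path value, divided by the time step.

```python
import math

def action(path, dt, mass=1.0, k=1.0):
    """Discretized action: sum over steps of dt * [m/2 * v^2 - V(x)] with V = k x^2 / 2."""
    total = 0.0
    for i in range(len(path) - 1):
        velocity = (path[i + 1] - path[i]) / dt
        total += dt * (0.5 * mass * velocity ** 2 - 0.5 * k * path[i] ** 2)
    return total

def functional_derivative(path, j, dt, eps=1e-6):
    """Approximate delta S / delta x(t_j): central difference in path[j], divided by dt."""
    up = list(path)
    up[j] += eps
    down = list(path)
    down[j] -= eps
    return (action(up, dt) - action(down, dt)) / (2.0 * eps * dt)

DT = 0.01
TIMES = [i * DT for i in range(201)]
CLASSICAL = [math.sin(t) for t in TIMES]   # solves m x'' = -k x for m = k = 1
OFF_SHELL = [t * t for t in TIMES]         # an arbitrary non-classical path
J = 50                                     # interior grid point, t = 0.5

print(functional_derivative(CLASSICAL, J, DT))   # close to 0, as (9C.90) requires
print(functional_derivative(OFF_SHELL, J, DT))   # close to -m*2 - k*t^2 = -2.25
```

On the classical path the derivative vanishes up to the discretization error of the second difference, while on the off-shell path it reproduces $-m\ddot{x} - V'(x)$ from (9C.89).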
Exercise 9C
1. Establish (9C.5) for $f_n(x) = \frac{1}{\pi}\frac{n}{1 + n^2x^2}$, and show that $f_n(x) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2}$ also satisfies the 'sifting property' (9C.5).
2. Use Eq. (9C.30) to show the justification of writing f(x) on the LHS of (9C.32).
3. Show that the periodic function $f(t) = t - \frac{1}{2}$ of period 1 is square integrable. Obtain its Fourier series and show that it converges in the mean.
4. Compute the Fourier transform of the Gaussian function $e^{-x^2/a^2}$ (a > 0).
5. Show that the Fourier transform of the 'transmission function' of an ideal slit of width 2a:

$f(x) = \begin{cases} 1, & |x| \le a \\ 0, & |x| > a \end{cases}$

can be defined although f(x) is not a function of fast decrease.
6. Obtain the Fourier transform of the function: f(x) =
2
*
a
2
>
0
x +a
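by using complex methods.
7. Show that $\Psi(\mathbf{p}') = (2\pi\hbar)^{-3/2}\int d^3r'\,\exp(-i\mathbf{p}' \cdot \mathbf{r}'/\hbar)\,\Psi(\mathbf{r}')$ and $\Psi(\mathbf{r}') = (2\pi\hbar)^{-3/2}\int d^3p'\,\exp(i\mathbf{p}' \cdot \mathbf{r}'/\hbar)\,\Psi(\mathbf{p}')$ are Fourier transforms of each other.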
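Hints to Exercise 9C
1. Write the integral of (9C.5) as a sum of three integrals,

(i) $\lim_{n \to \infty}\left[\int_{-\infty}^{-1/\sqrt{n}} + \int_{-1/\sqrt{n}}^{1/\sqrt{n}} + \int_{1/\sqrt{n}}^{\infty}\right]\frac{1}{\pi}\frac{n}{1 + n^2x^2}\,T(x)\,dx.$

Now every test function T(x) is bounded, hence we note that the first and third integrals tend to 0 as n becomes large. This is because

(ii) $\left|\int_{1/\sqrt{n}}^{\infty}\frac{1}{\pi}\frac{n}{1 + n^2x^2}\,T(x)\,dx\right| \le A\int_{1/\sqrt{n}}^{\infty}\frac{1}{\pi}\frac{n}{1 + n^2x^2}\,dx = A\left[\frac{1}{2} - \frac{1}{\pi}\arctan\sqrt{n}\right]$

(where A is a constant) and since $\lim_{n \to \infty}\arctan\sqrt{n} = \frac{\pi}{2}$, it follows that the third term is zero. Repeating similar steps, we can show the vanishing of the first term. The middle term equals:

(iii) $\lim_{n \to \infty}\int_{-1/\sqrt{n}}^{1/\sqrt{n}}\frac{1}{\pi}\frac{n}{1 + n^2x^2}\,T(x)\,dx.$

We apply the mean value theorem to compute it. Suppose there is a number y in the domain of integration $\left[-\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}\right]$, then (iii) equals:

(iv) $\lim_{n \to \infty}T(y)\int_{-1/\sqrt{n}}^{1/\sqrt{n}}\frac{1}{\pi}\frac{n}{1 + n^2x^2}\,dx = \lim_{n \to \infty}T(y)\,\frac{2}{\pi}\arctan\sqrt{n}$

When n becomes large, the interval becomes very small, hence the choice of y as 0 becomes quite legitimate, which gives the required result:

$\lim_{n \to \infty}\int_{-\infty}^{\infty}\frac{1}{\pi}\frac{n}{1 + n^2x^2}\,T(x)\,dx = T(0).$

For $f_n(x) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2}$, the first and third integrals reduce to zero as n becomes large; the second one gives:

(v) $\int_{-1/\sqrt{n}}^{1/\sqrt{n}}\frac{n}{\sqrt{\pi}}\,e^{-n^2x^2}\,T(x)\,dx$

Using the mean value theorem in $\left[-\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}\right]$ we have:

(vi) $T(y)\int_{-1/\sqrt{n}}^{1/\sqrt{n}}\frac{n}{\sqrt{\pi}}\,e^{-n^2x^2}\,dx = T(y)\int_{-\sqrt{n}}^{\sqrt{n}}\frac{1}{\sqrt{\pi}}\,e^{-x^2}\,dx$

where we have changed the variable x to $\frac{x}{n}$. Thus when $n \to \infty$ the integral, now taken over the interval $(-\infty, \infty)$, approaches 1, while y lying in $\left(-\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}\right)$ tends to zero, hence

$\lim_{n \to \infty}\int_{-\infty}^{\infty}\frac{n}{\sqrt{\pi}}\,e^{-n^2x^2}\,T(x)\,dx = T(0)$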
which shows that it satisfies the 'sifting property.'
2. Note that the integral on the RHS of (9C.30) should be viewed as a limit (that exists) of distributions given by Riemann sums, and as a result for an operator A, it gives:

(i) $AD(x) = A\int_{-\infty}^{\infty} G_\xi(x)\,s(\xi)\,d\xi = A\lim\sum_i G_{\xi_i}(x)\,s(\xi_i)\,\Delta\xi_i = \lim\sum_i AG_{\xi_i}(x)\,s(\xi_i)\,\Delta\xi_i = \lim\sum_i\delta_{\xi_i}(x)\,s(\xi_i)\,\Delta\xi_i$

Since D(x) = f(x) and $A = \frac{d}{dx}$, the LHS of (i) equals $\frac{d}{dx}f(x)$, hence we have:

$\frac{d}{dx}f(x) = s(x)$

which leads to (9C.32).
3. A function f(t) defined in an interval (a, b) is said to be square integrable if
(i)
fa\f(t)\2dt
Also if it is a periodic function of period T, then J |/U)| 2 dt <°°. In this case the function is a periodic function of period 1 in the interval t e [0, 1), thus since, J°
2
K
A)
13
2
4 Jo 12
it is square integrable. To obtain the Fourier series, we calculate the Fourier coefficients (see (9C.47)) by using integration by parts for j * 0:
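(ii) $$c_j = \int_0^1 \left(t - \tfrac12\right) e^{2\pi i j t}\,dt = \left[\left(t - \tfrac12\right)\frac{e^{2\pi i j t}}{2\pi i j}\right]_0^1 - \int_0^1 \frac{e^{2\pi i j t}}{2\pi i j}\,dt = \frac12\,\frac{e^{2\pi i j}}{2\pi i j} + \frac12\,\frac{1}{2\pi i j} - \frac{e^{2\pi i j} - 1}{(2\pi i j)^2} = \frac{1}{2\pi i j}.$$

For j = 0, we have:

(iii) $$c_0 = \int_0^1 \left(t - \tfrac12\right) dt = 0.$$

Hence, the Fourier series after simplification of coefficients is:

$$f(t) = -\frac{1}{\pi}\sum_{j=1}^{\infty} \frac{1}{j}\,\sin 2\pi j t.$$

To show the second part of the exercise, we consider the scalar product of two periodic functions f and g of period T:

(iv) $$(f, g) = \int_0^T \overline{f(t)}\,g(t)\,dt$$

under the assumption that the integral exists. This product is sesquilinear, i.e., antilinear in the first component and linear in the second. The norm of f:

(v) $$\sqrt{(f, f)} = \|f\|$$

measures the convergence in the mean. In this case it is: (vi)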
which shows that the series converges in the mean.
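Convergence in the mean can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the book's computation: it takes $f(t) = t - \frac12$ on $[0, 1)$ and the sine series $-\sum_{j\ge1} \sin(2\pi j t)/(\pi j)$, and it uses the orthogonality of the sines to write the mean-square error of the N-th partial sum as $\frac{1}{12} - \sum_{j=1}^{N} \frac{1}{2\pi^2 j^2}$. The function names are hypothetical.

```python
import math

def f(t):
    """Sawtooth f(t) = t - 1/2 on [0, 1), extended with period 1."""
    return (t % 1.0) - 0.5

def partial_sum(t, N):
    """N-th partial sum of the series -sum_{j>=1} sin(2 pi j t) / (pi j)."""
    return -sum(math.sin(2 * math.pi * j * t) / (math.pi * j) for j in range(1, N + 1))

def mean_square_error(N):
    """Squared norm of f minus its N-th partial sum over one period.

    Uses ||f||^2 = 1/12 and ||S_N||^2 = sum_{j<=N} 1 / (2 pi^2 j^2),
    which holds because S_N is the orthogonal projection of f.
    """
    return 1.0 / 12.0 - sum(1.0 / (2 * math.pi ** 2 * j * j) for j in range(1, N + 1))

if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        print(N, partial_sum(0.25, N), f(0.25), mean_square_error(N))
```

The error decreases roughly like $1/(2\pi^2 N)$, so the partial sums converge to f in the mean even though pointwise convergence fails at the jump points t = 0, 1, ....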
4. In view of (9C.52), the F.T. of $e^{-x^2/a^2}$ $(a > 0)$ would be:

(i) $$\hat f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-x^2/a^2 + ikx}\,dx = \frac{1}{\sqrt{2\pi}}\,e^{-k^2a^2/4}\int_{-\infty}^{\infty} \exp\left[-\left(\frac{x}{a} - \frac{ika}{2}\right)^2\right]dx.$$

Similar to Exp. (9C.15) we apply Cauchy's theorem to the closed curve C in Fig. (9.7) for evaluating the integral in (i).

[Fig. 9.7: the closed curve C in the complex x/a-plane; horizontal axis Re x/a running from -R to R, vertical axis Im x/a.]

A change of variable to $t = \frac{x}{a} - \frac{ika}{2}$ gives $dx = a\,dt$; evidently the limits of integration are still the same, so that:

(ii) $$\hat f(k) = \frac{1}{\sqrt{2\pi}}\,e^{-k^2a^2/4}\int_{-\infty}^{\infty} e^{-t^2}\,a\,dt = \frac{a}{\sqrt{2}}\,e^{-k^2a^2/4}.$$

This shows that the F.T. of a Gaussian is again a Gaussian, as also illustrated in Fig. 9.8.

[Fig. 9.8: graphs of f(x) against x and of $\hat f(k)$ against k, for small a and for large a.]
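The closed form for the transform of the Gaussian can be compared with a direct numerical integration. This is a minimal sketch assuming the convention $\hat f(k) = \frac{1}{\sqrt{2\pi}}\int f(x)e^{ikx}dx$ and $f(x) = e^{-x^2/a^2}$; the function names, the truncation at $|x| \le 10a$ and the step count are illustrative choices.

```python
import math

def gaussian_ft(k, a, steps=20000):
    """Midpoint-rule value of (1/sqrt(2 pi)) * integral of exp(-x^2/a^2) exp(i k x) dx.

    The imaginary part vanishes by symmetry, so only cos(k x) is integrated.
    The range is truncated to |x| <= 10 a, where the integrand is negligible.
    """
    half_width = 10.0 * a
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += math.exp(-(x / a) ** 2) * math.cos(k * x)
    return total * h / math.sqrt(2 * math.pi)

def gaussian_ft_closed_form(k, a):
    """The value (a / sqrt(2)) * exp(-k^2 a^2 / 4) obtained by completing the square."""
    return a / math.sqrt(2) * math.exp(-(k * a) ** 2 / 4)

if __name__ == "__main__":
    for a in (0.5, 2.0):
        for k in (0.0, 1.0, 3.0):
            print(a, k, gaussian_ft(k, a), gaussian_ft_closed_form(k, a))
```

A small a (narrow f) produces a transform that decays slowly in k, and a large a the opposite.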
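We note however that the width of the first Gaussian is:

$$\sqrt{\frac{\int_{-\infty}^{\infty} x^2\,|f(x)|^2\,dx}{\int_{-\infty}^{\infty} |f(x)|^2\,dx}} = \frac{a}{2}$$

whereas that of the second is 1/a. Thus the more f(x) is peaked, the flatter is its F.T., as is evident from the illustrations in Fig. (9.8).

5. In view of (9C.52), the F.T. of the transmission function f(x) is:

$$\hat f(k) = \frac{1}{\sqrt{2\pi}}\int_{-a}^{+a} e^{ikx}\,dx = \frac{1}{\sqrt{2\pi}}\left[\frac{e^{ikx}}{ik}\right]_{-a}^{+a} = \sqrt{\frac{2}{\pi}}\;\frac{\sin ka}{k}.$$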
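As a hedged numerical cross-check of this transform, the sketch below assumes that the transmission function equals 1 for $|x| \le a$ and 0 elsewhere (consistent with the limits of integration) and compares a midpoint-rule integral with $\sqrt{2/\pi}\,\sin(ka)/k$. The function names are hypothetical.

```python
import math

def slit_ft(k, a, steps=20000):
    """Midpoint-rule value of (1/sqrt(2 pi)) * integral over [-a, a] of exp(i k x) dx.

    Only the real part cos(k x) contributes, since sin(k x) is odd.
    """
    h = 2 * a / steps
    total = sum(math.cos(k * (-a + (i + 0.5) * h)) for i in range(steps))
    return total * h / math.sqrt(2 * math.pi)

def slit_ft_closed_form(k, a):
    """sqrt(2/pi) * sin(k a) / k, with the limiting value sqrt(2/pi) * a at k = 0."""
    if k == 0:
        return math.sqrt(2 / math.pi) * a
    return math.sqrt(2 / math.pi) * math.sin(k * a) / k

if __name__ == "__main__":
    for a in (0.5, 1.0, 4.0):
        first_zero = math.pi / a  # the central peak of the transform ends here
        print(a, first_zero, slit_ft(1.0, a), slit_ft_closed_form(1.0, a))
```

The first zero of the transform sits at $k = \pi/a$, so widening the slit narrows the central peak in k.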
The graphs of f(x) and $\hat f(k)$ given below illustrate how the uncertainty principle (see (9A.21) for definition) is verified experimentally by using the diffraction by a slit. Although $\hat f(k)$ has been computed, the integral in the Fourier inversion formula (9C.53) is not absolutely convergent (which would always be the case for f(x) if it were of fast decrease), and also Eq. (9C.54) does not hold in the sense of functions.

[Figure: graphs of f(x) against x and of $\hat f(k)$ against k, for small a and for large a. Caption: Uncertainty principle verified via diffraction by a slit.]

6. We write the F.T. $\hat f(k)$ as:

(i)
$$\hat f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{a\,e^{ikx}}{(x + ia)(x - ia)}\,dx$$

and for k > 0 we use the Residue Theorem (see 1.[1], 1.[7] and Subsec. 1.6 of Chapter 1) with the closed curve C (consisting of the interval [-R, R] and half circle) to evaluate the integral. We thus have

(ii) $$\hat f(k) = \frac{a}{\sqrt{2\pi}}\oint_C \frac{e^{ikz}}{z^2 + a^2}\,dz = \frac{a}{\sqrt{2\pi}}\left\{\int_{-R}^{R} \frac{e^{ikx}}{x^2 + a^2}\,dx + \int_{\text{half circle}} \frac{e^{ikz}}{z^2 + a^2}\,dz\right\}$$

$$= \frac{a}{\sqrt{2\pi}}\left(2\pi i\,\operatorname{Res}_{z = ia}\left\{\frac{g(z)}{h(z)}\right\}\right) = \frac{a}{\sqrt{2\pi}}\left(2\pi i\,\frac{g(ia)}{h'(ia)}\right) = \sqrt{\frac{\pi}{2}}\;e^{-ka}.$$

Here we have used g(z) and h(z) to denote the holomorphic functions $e^{ikz}$ and $(z^2 + a^2)$ respectively, hence $h'(z)|_{ia} = 2ia$. Similarly for k < 0 we apply the Residue Theorem to the lower half of the circle that encloses the singularity -ia, and obtain

(iii) $$\hat f(k) = \sqrt{\frac{\pi}{2}}\;e^{ka}.$$

When k = 0, g(z) becomes a constant function whose value at every point is 1; the result of the Residue Theorem still holds good, and accordingly

(iv) $$\hat f(k)\Big|_{k=0} = \sqrt{\frac{\pi}{2}}.$$

Collecting the results of (ii), (iii), and (iv) we have:

$$\hat f(k) = \sqrt{\frac{\pi}{2}}\;e^{-|k|a}.$$

Evidently $\hat f(k)$ is not differentiable as a function at k = 0, hence property (9C.55) of the F.T. does not hold good. The above two exercises show that if a function is not of fast decrease, its F.T. can exist, but it may not satisfy all the properties.
[Figure: the curve C in the complex plane (axes Re x and Im x, with -R and R marked on the real axis) enclosing the isolated singularity ia.]

7. Use (9C.63) and (9C.64) to solve this exercise. (Note that r' and p' respectively belong to coordinate and momentum space of a particle moving in 3-space.)
APPENDIX 9D: QUANTUM GROUPS

We describe in brief the special type of Hopf algebras which have come to be known as 'quantum groups,' since they were first discovered through quantum mechanical models, more precisely through the quantum enveloping algebra of a semisimple Lie algebra. Although the origins of these groups can be traced to the early eighties, their popularity in mathematics and physics is of much more recent origin. It is well known now that quantum groups are related to theories of low-dimensional topology on the one hand, and statistical physics on the other. We refer the reader to the article by Kirillov and Reshetikhin in 4.[6b] for a historical survey and to C. Kassel in Ref. [Ad] for the theory and more recent references.

We have divided the Appendix into two subsections. The first deals with the definitions that lead to the definition of a Hopf algebra, and the second dwells on objects (which we would like to name q-objects) that are required to define the Hopf algebras $SL_q(2)$ and $U_q(sl(2))$. These are indeed the simplest examples of quantum groups.
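D.1 Algebra, Coalgebra, Bialgebra and Hopf Algebra

Before defining these (Hopf) algebras, we wish to note that, unlike other types of algebras, the definition here is not based directly on the axioms of an algebra; instead it stems from the so-called coalgebra. In order to define a 'coalgebra,' we make a pictorial presentation of the definition of 'algebra.'

Definition 9D.1 An algebra over a field K is a triple $(A, \lambda, \mu)$ where A is a vector space and $\lambda: A \otimes A \to A$ and $\mu: K \to A$ are linear maps which satisfy the axioms (Ass) and (Un):

(Ass): The square

$$\begin{array}{ccc}
A \otimes A \otimes A & \xrightarrow{\ \lambda \otimes \mathrm{id}\ } & A \otimes A \\
\Big\downarrow{\scriptstyle \mathrm{id} \otimes \lambda} & & \Big\downarrow{\scriptstyle \lambda} \\
A \otimes A & \xrightarrow{\ \ \lambda\ \ } & A
\end{array} \tag{9D.1}$$

commutes.

(Un): The diagram

$$\begin{array}{ccccc}
K \otimes A & \xrightarrow{\ \mu \otimes \mathrm{id}\ } & A \otimes A & \xleftarrow{\ \mathrm{id} \otimes \mu\ } & A \otimes K \\
 & {\scriptstyle \cong}\searrow & \Big\downarrow{\scriptstyle \lambda} & \swarrow{\scriptstyle \cong} & \\
 & & A & &
\end{array} \tag{9D.2}$$

commutes.

The first axiom (Ass) expresses the requirement that the multiplication map $\lambda$ is associative, whereas the second axiom (Un) implies that the element $\mu(1)$ of A is a left as well as a right unit for $\lambda$. The algebra A is commutative if, in addition, it satisfies the axiom:

(Comm): The triangle

$$\begin{array}{ccc}
A \otimes A & \xrightarrow{\ \tau_{A,A}\ } & A \otimes A \\
 & {\scriptstyle \lambda}\searrow \quad \swarrow{\scriptstyle \lambda} & \\
 & A &
\end{array} \tag{9D.3}$$

commutes. Note that $\tau_{A,A}$ denotes the flip, the mapping which switches the elements of A; thus $\tau_{A,A}(a \otimes a') = a' \otimes a$.

Using the above definition, the morphism between two algebras $(A, \lambda, \mu)$ and $(A', \lambda', \mu')$ is defined as follows:

Definition 9D.2 A morphism of algebras $f: (A, \lambda, \mu) \to (A', \lambda', \mu')$ is a linear map f from A to A' such that:

$$\lambda' \circ (f \otimes f) = f \circ \lambda \quad \text{and} \quad f \circ \mu = \mu' \tag{9D.4}$$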
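By definition a coalgebra is a triple which is obtained by reversing the arrows in the diagrams (9D.1-2). More precisely, we have:

Definition 9D.3 A coalgebra is a triple $(C, \nu, \delta)$ where C is a vector space and $\nu: C \to C \otimes C$ and $\delta: C \to K$ are linear maps that satisfy the axioms (Coass) and (Coun) given below.

(Coass): The square

$$\begin{array}{ccc}
C & \xrightarrow{\ \ \nu\ \ } & C \otimes C \\
\Big\downarrow{\scriptstyle \nu} & & \Big\downarrow{\scriptstyle \mathrm{id} \otimes \nu} \\
C \otimes C & \xrightarrow{\ \nu \otimes \mathrm{id}\ } & C \otimes C \otimes C
\end{array} \tag{9D.5}$$

commutes.

(Coun): The diagram

$$\begin{array}{ccccc}
K \otimes C & \xleftarrow{\ \delta \otimes \mathrm{id}\ } & C \otimes C & \xrightarrow{\ \mathrm{id} \otimes \delta\ } & C \otimes K \\
 & {\scriptstyle \cong}\nwarrow & \Big\uparrow{\scriptstyle \nu} & \nearrow{\scriptstyle \cong} & \\
 & & C & &
\end{array} \tag{9D.6}$$

commutes.

The map $\nu$ is called the coproduct or the comultiplication, while $\delta$ is called the counit of the coalgebra. The diagrams (9D.5-6) imply that the coproduct $\nu$ is coassociative and counital. Furthermore, if the triangle

(Cocomm)

$$\begin{array}{ccc}
 & C & \\
 & {\scriptstyle \nu}\swarrow \quad \searrow{\scriptstyle \nu} & \\
C \otimes C & \xrightarrow{\ \tau_{C,C}\ } & C \otimes C
\end{array} \tag{9D.7}$$

commutes, where $\tau_{C,C}$ denotes the flip, we call the coalgebra cocommutative. A morphism between two coalgebras $(C, \nu, \delta)$ and $(C', \nu', \delta')$ is defined as follows:

Definition 9D.4 A linear map f from C to C' is a morphism of coalgebras if:

$$(f \otimes f) \circ \nu = \nu' \circ f \quad \text{and} \quad \delta = \delta' \circ f \tag{9D.8}$$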
Associated with any algebra A there is the opposite algebra, denoted $A^{op}$, whose vector space is the same as that of A but whose multiplication is defined as:

$$\lambda_{A^{op}} = \lambda_A \circ \tau_{A,A} \qquad (\lambda_{A^{op}}(a \otimes a') = a'a) \tag{9D.9}$$

From (9D.3) it is evident that A is commutative if and only if

$$\lambda_{A^{op}} = \lambda_A \tag{9D.10}$$

Similarly, the 'opposite coalgebra' of a given coalgebra $(C, \nu, \delta)$ can be defined by setting the mapping

$$\nu^{op} = \tau_{C,C} \circ \nu \tag{9D.11}$$
thus $(C, \nu^{op}, \delta)$ is a coalgebra known as the opposite coalgebra of $(C, \nu, \delta)$. We note that the field K has a natural coalgebra structure with $\nu(1) = 1 \otimes 1$ and $\delta(1) = 1$. Also for any coalgebra $(C, \nu, \delta)$ the map $\delta: C \to K$ is a morphism of coalgebras. Furthermore, we note that the dual vector space of a coalgebra is an algebra (See Exc. (9D.1)). It is called the dual algebra of C. We list below a few examples of coalgebras.

Example 9D.4 Let X be a set and $C = K[X] = \bigoplus_{x \in X} Kx$ be the vector space with basis X. Define

$$\nu(x) = x \otimes x \quad \text{and} \quad \delta(x) = 1 \tag{9D.12}$$

for $x \in X$. Then $(C, \nu, \delta)$ is a coalgebra (of a set). The dual algebra C* is the algebra of functions on X with values in K. Thus in this case a linear form f on C is determined by its values on the basis X, and if f' is another linear form, then [48]:

$$(ff')(x) = \lambda(f \otimes f')(x) = \bar\eta(f \otimes f')(\nu(x)) = f(x)f'(x) \tag{9D.13}$$
The unit of the algebra C* is given by the constant function $\delta$. While the dual vector space of a coalgebra is an algebra, in general the dual vector space of an algebra does not carry a natural coalgebra structure. If, however, the vector space of A is finite dimensional, the dual vector space has a coalgebra structure.

Example 9D.5 Consider the set $M_r(K)$ of (r x r) matrices with entries in K. Let $E_{ij}$ denote the matrix with all entries equal to zero except for the (i, j) entry, which equals 1 (see Chapter 4 for an Exp.). The set $\{E_{ij}\}$ $(1 \le i, j \le r)$ is a basis of $M_r(K)$. Let $\{x_{ij}\}$ denote the dual basis. If A denotes the algebra formed by the set $M_r(K)$, then A* is the coalgebra defined by:

$$\nu(x_{ij}) = \sum_{k=1}^{r} x_{ik} \otimes x_{kj} \quad \text{and} \quad \delta(x_{ij}) = \delta_{ij} \tag{9D.14}$$

In fact we have:

$$\delta(x_{ij}) = x_{ij}(\mu(1)) = x_{ij}\Big(\sum_k E_{kk}\Big) = \sum_k \delta_{ik}\,\delta_{kj} = \delta_{ij}$$

and

$$\lambda^*(x_{ij})(E_{kl} \otimes E_{mn}) = x_{ij}\big(\lambda(E_{kl} \otimes E_{mn})\big) = \delta_{lm}\,x_{ij}(E_{kn}) = \delta_{lm}\,\delta_{ik}\,\delta_{jn} \tag{9D.15}$$

$$= \sum_p \delta_{ik}\,\delta_{lp}\,\delta_{pm}\,\delta_{jn} = \sum_p x_{ip}(E_{kl})\,x_{pj}(E_{mn}) = \eta\Big(\sum_p x_{ip} \otimes x_{pj}\Big)(E_{kl} \otimes E_{mn}) \tag{9D.16}$$

[48] See Hint to Exc. 1 for the mappings $\eta$ and $\bar\eta$.
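The coalgebra axioms for (9D.14) can be verified mechanically for a given r. The sketch below is an illustration only: it represents an element of $C \otimes C$ (or $C \otimes C \otimes C$) as a dictionary from tuples of index pairs to coefficients, where the pair (i, j) stands for the basis element $x_{ij}$; all function names are hypothetical.

```python
from itertools import product

def nu(i, j, r):
    """Coproduct nu(x_ij) = sum_k x_ik (x) x_kj as a dict {((i, k), (k, j)): 1}."""
    return {((i, k), (k, j)): 1 for k in range(r)}

def delta(i, j):
    """Counit delta(x_ij) = Kronecker delta."""
    return 1 if i == j else 0

def nu_tensor_id(t, r):
    """Apply nu (x) id to an element of C (x) C."""
    out = {}
    for (left, right), c in t.items():
        for (l1, l2), c2 in nu(left[0], left[1], r).items():
            key = (l1, l2, right)
            out[key] = out.get(key, 0) + c * c2
    return out

def id_tensor_nu(t, r):
    """Apply id (x) nu to an element of C (x) C."""
    out = {}
    for (left, right), c in t.items():
        for (r1, r2), c2 in nu(right[0], right[1], r).items():
            key = (left, r1, r2)
            out[key] = out.get(key, 0) + c * c2
    return out

def delta_tensor_id(t):
    """Apply delta (x) id and identify K (x) C with C."""
    out = {}
    for (left, right), c in t.items():
        value = delta(*left) * c
        if value:
            out[right] = out.get(right, 0) + value
    return out

def id_tensor_delta(t):
    """Apply id (x) delta and identify C (x) K with C."""
    out = {}
    for (left, right), c in t.items():
        value = c * delta(*right)
        if value:
            out[left] = out.get(left, 0) + value
    return out

def satisfies_coalgebra_axioms(r):
    """Check (Coass) and (Coun) on every basis element x_ij."""
    for i, j in product(range(r), repeat=2):
        t = nu(i, j, r)
        if nu_tensor_id(t, r) != id_tensor_nu(t, r):
            return False
        if delta_tensor_id(t) != {(i, j): 1} or id_tensor_delta(t) != {(i, j): 1}:
            return False
    return True
```

Since all maps involved are linear, checking the axioms on the basis $\{x_{ij}\}$ suffices.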
Example 9D.6 The tensor product $C \otimes C'$ of two coalgebras $(C, \nu, \delta)$ and $(C', \nu', \delta')$ is a coalgebra with comultiplication $(\mathrm{id} \otimes \tau_{C,C'} \otimes \mathrm{id}) \circ (\nu \otimes \nu')$ and counit $\delta \otimes \delta'$.

Similar to the concepts of an ideal and a quotient algebra in the case of an algebra, we have here a coideal and a quotient coalgebra; these are defined as follows:

Definition 9D.7 Let $(C, \nu, \delta)$ be a coalgebra. A subspace I of C is a coideal if $\nu(I) \subset I \otimes C + C \otimes I$ and $\delta(I) = 0$. In this case $\nu$ factors through a map $\bar\nu$ from C/I to $C \otimes C/(I \otimes C + C \otimes I) = C/I \otimes C/I$. Similarly the counit factors through a map $\bar\delta: C/I \to K$. The triple $(C/I, \bar\nu, \bar\delta)$ is a coalgebra called the quotient coalgebra.

Sweedler's Sigma Notation 9D.8 It is usual to write elements of the tensor products $C \otimes C$ or $C \otimes C \otimes C$ for a coalgebra C in this notation. According to this, if x is an element of $(C, \nu, \delta)$, then the element $\nu(x)$ of $C \otimes C$ is of the form:

$$\nu(x) = \sum_i x'_i \otimes x''_i \tag{9D.17a}$$

This is alternatively written as:

$$\nu(x) = \sum_{(x)} x' \otimes x'' \equiv \sum_{(x)} x^{(1)} \otimes x^{(2)} \tag{9D.17b}$$

where the numeral in parentheses stands for the number of primes used. The coassociativity of $\nu$ (i.e., the commutativity of square (9D.5)) is expressed by

$$\sum_{(x)} \Big(\sum_{(x')} (x')' \otimes (x')''\Big) \otimes x'' = \sum_{(x)} x' \otimes \Big(\sum_{(x'')} (x'')' \otimes (x'')''\Big) = \sum_{(x)} x' \otimes x'' \otimes x''' \equiv \sum_{(x)} x^{(1)} \otimes x^{(2)} \otimes x^{(3)} \tag{9D.18}$$

Moreover if we apply the comultiplication to (9D.18), we obtain the following equal expressions:

$$\sum_{(x)} \nu(x') \otimes x'' \otimes x''', \qquad \sum_{(x)} x' \otimes \nu(x'') \otimes x''', \qquad \sum_{(x)} x' \otimes x'' \otimes \nu(x''').$$

Note that a coalgebra morphism (Def. (9D.4)) in Sweedler's notation can be written as:

$$\sum_{(x)} f(x') \otimes f(x'') = \sum_{(f(x))} (f(x))' \otimes (f(x))'' \tag{9D.19}$$
Definition 9D.9 Consider a vector space H equipped with an algebra structure $(H, \lambda, \mu)$, and also with a coalgebra structure $(H, \nu, \delta)$ such that the two structures satisfy the following compatibility conditions: (i) The maps $\lambda$, $\mu$ are morphisms of coalgebras. (ii) The maps $\nu$, $\delta$ are morphisms of algebras. A quintuple $(H, \lambda, \mu, \nu, \delta)$ for which the above two equivalent statements are satisfied is called a 'bialgebra.' (The equivalence between (i) and (ii) is established in the Hint to Exc. 2.) We note that a morphism of bialgebras is a morphism for the underlying algebra and coalgebra structures.

Definition 9D.10 Let $(A, \lambda, \mu)$ and $(C, \nu, \delta)$ be an algebra and a coalgebra, and consider the vector space Hom(C, A) of linear maps from C to A. Then for f and g in Hom(C, A), the composition of maps:

$$C \xrightarrow{\ \nu\ } C \otimes C \xrightarrow{\ f \otimes g\ } A \otimes A \xrightarrow{\ \lambda\ } A \tag{9D.20}$$

is the convolution map f * g. In Sweedler's sigma notation, we express it as:

$$(f * g)(y) = \sum_{(y)} f(y')\,g(y''), \qquad y \in C \tag{9D.21}$$

Evidently the convolution is bilinear. When $(H, \lambda, \mu, \nu, \delta)$ is a bialgebra, we may consider the particular case C = A = H, and thus define the convolution on the vector space End(H) of endomorphisms of H. This leads us to the definition:

Definition 9D.11 Let $(H, \lambda, \mu, \nu, \delta)$ be a bialgebra. An endomorphism $S \in \mathrm{End}(H)$ is called an antipode for the bialgebra H if

$$S * \mathrm{id}_H = \mathrm{id}_H * S = \mu \circ \delta \tag{9D.22}$$

Definition 9D.12 A Hopf algebra is a bialgebra with an antipode. A morphism of Hopf algebras is a morphism between the underlying bialgebras commuting with the antipodes.

In relation to antipodes and Hopf algebras, the following remark and result are noteworthy.

Remark 9D.13 A bialgebra may or may not have an antipode; if it does, it is unique. For if S and S' are two antipodes of a bialgebra, then

$$S = S * (\mu \circ \delta) = S * (\mathrm{id}_H * S') = (S * \mathrm{id}_H) * S' = (\mu \circ \delta) * S' = S' \tag{9D.23}$$

A Hopf algebra with an antipode S is denoted $(H, \lambda, \mu, \nu, \delta, S)$. Using Sweedler's convention, we note that an antipode S satisfies the relations:

$$\sum_{(x)} x'\,S(x'') = \delta(x)\,1 = \sum_{(x)} S(x')\,x'', \qquad (x \in H) \tag{9D.24}$$
(.v)
X *(I) ® x(2) ® S(x0)) ® x(4) ® x(5) = X * ( 0 ® S(xi2y) ® *(3) ® JC(4) (.v)
= X *
U)
(U
(2)
®*
(3)
®*
(9D.25)
(The first equality in (9D.25) is obvious from (9D.24) and (9D.17b), whereas the second follows from the axiom (Coun), diagram (9D.6).)

Result 9D.14 Let H be a finite-dimensional Hopf algebra with antipode S; then the bialgebra H* is a Hopf algebra with antipode S*.
D.2 The Quantum Plane, the Algebra $M_q(2)$, and Hopf Algebras $GL_q(2)$, $SL_q(2)$ and $U_q(sl(2))$

Definition 9D.15 Let q be an invertible element of the ground field K, and let $I_q$ be the two-sided ideal of the free algebra [49] K{x, y} generated by the element yx - qxy. The quotient algebra

$$K_q[x, y] = K\{x, y\}/I_q \tag{9D.26}$$

is called the 'quantum plane.' When $q \neq 1$, the algebra $K_q[x, y]$ is non-commutative.

For any algebra A, the algebra of 2 x 2 matrices with entries in A is denoted by $M_2(A)$. As a set $M_2(A)$ is in bijection with the set $A^4$ of 4-tuples. If further M(2) denotes the polynomial algebra K[a, b, c, d], i.e., the quotient of the free algebra K{a, b, c, d} by the two-sided ideal generated by the commutators of the generators, and A is a commutative algebra, then

$$\mathrm{Hom}_{Alg}(M(2), A) \cong M_2(A) \tag{9D.27}$$

This bijection maps an algebra morphism $f: M(2) \to A$ to the matrix

$$\begin{pmatrix} f(a) & f(b) \\ f(c) & f(d) \end{pmatrix} \tag{9D.28}$$

Furthermore, the multiplication of matrices $M_2(A) \times M_2(A) \to M_2(A)$ leads to the bijection of $M_2(A) \times M_2(A)$ with $A^8$, which eventually gives the polynomial algebra

$$M(2)^{\otimes 2} = K[a', a'', b', b'', c', c'', d', d''] \tag{9D.29}$$

The above discussions lead to an important result:

Result 9D.16 Let $\lambda: M(2) \to M(2)^{\otimes 2}$ be the algebra morphism defined by:

$$\lambda(a) = a'a'' + b'c'', \quad \lambda(b) = a'b'' + b'd'', \quad \lambda(c) = c'a'' + d'c'', \quad \lambda(d) = c'b'' + d'd'' \tag{9D.30}$$

then for any commutative algebra A the morphism $\lambda$ corresponds to the matrix multiplication in $M_2(A)$ under the identifications (9D.27) and (9D.29). The proof is obvious from the relation:

$$\lambda\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \lambda(a) & \lambda(b) \\ \lambda(c) & \lambda(d) \end{pmatrix} = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}\begin{pmatrix} a'' & b'' \\ c'' & d'' \end{pmatrix} \tag{9D.31}$$
In order to define a q-analogue of the algebra M(2), we consider the variables x, y subject to the quantum plane relation yx = qxy and in addition consider four variables a, b, c, d that commute with x and y. Next we define x', y', x'' and y'' using the matrix relations:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} x'' \\ y'' \end{pmatrix} = \begin{pmatrix} a & c \\ b & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \tag{9D.32}$$

[49] Consider the vector space K{X} whose basis is the set of all words $x_{i_1} \cdots x_{i_p}$ in the alphabet X, including the empty word $\emptyset$. A word $x_{i_1} \cdots x_{i_p}$ is called a monomial and its length p is called the degree of the monomial. The vector space K{X} equipped with the multiplication $(x_{i_1} \cdots x_{i_p})(x_{i_{p+1}} \cdots x_{i_n}) = x_{i_1} \cdots x_{i_p}\,x_{i_{p+1}} \cdots x_{i_n}$ becomes an algebra known as the 'free algebra.' When $X = \{x_1, \ldots, x_n\}$, K{X} is also denoted as $K\{x_1, \ldots, x_n\}$, as in the above definition. Note that if I is the two-sided ideal of $K\{x_1, \ldots, x_n\}$ generated by all elements of the form $x_i x_j - x_j x_i$, where i, j belong to the set {1, 2, 3, ..., n}, then the quotient algebra $K\{x_1, \ldots, x_n\}/I$ is isomorphic to the polynomial algebra $K[x_1, \ldots, x_n]$ in n variables with coefficients in the ground field K.

Result 9D.17 The two sets of variables x', y', x'', y''; a, b, c, d that are related by the matrix equations given in (9D.32) satisfy the following quantum plane relations:

(i) $$y'x' = qx'y', \qquad y''x'' = qx''y''$$

(ii) $$ba = qab, \quad db = qbd, \quad ca = qac, \quad dc = qcd, \quad bc = cb, \quad ad - da = (q^{-1} - q)bc$$

(See Hint to Exc. 5 for the proof of the equivalence of (i) and (ii).)

Definition 9D.18 The quotient of the free algebra K{a, b, c, d} by the two-sided ideal $J_q$ generated by the six relations given in (ii) of the above result is the algebra $M_q(2)$. When q = 1, the algebra $M_q(2)$ is obviously isomorphic to M(2).

Result 9D.19 The element $ad - q^{-1}bc = da - qbc$ of $M_q(2)$ is central. It is called the quantum determinant of $M_q(2)$ and is denoted $\det_q$.

To prove the result we have to show that $\det_q$ commutes with all the generators a, b, c, d. Using the relations in (ii) of Result (9D.17), we have

$$(ad - q^{-1}bc)a = a(da - qbc), \qquad (ad - q^{-1}bc)b = b(ad - q^{-1}bc),$$

$$(ad - q^{-1}bc)c = c(ad - q^{-1}bc), \qquad (da - qbc)d = d(ad - q^{-1}bc).$$

We further note that similar to $M_q(2)$ we can also define the algebra $M_{q^{-1}}(2)$ by replacing q by $q^{-1}$ in the quantum relation yx = qxy. To proceed further toward our goal of the Hopf algebras $GL_q(2)$, etc., we recall the definitions of the groups $GL_2(A)$ and $SL_2(A)$ that result from $M_2(A)$. For instance [50]:

$$GL_2(A) = \left\{\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in M_2(A) \ \text{such that}\ \alpha\delta - \beta\gamma \in A^{\times}\right\}$$

and $SL_2(A)$ is the subgroup of $GL_2(A)$ of matrices with determinant $\alpha\delta - \beta\gamma = 1$. This leads to the following result:

Result 9D.20
Define the commutative algebras

$$GL(2) = M(2)[t]/((ad - bc)t - 1) \tag{9D.33}$$

and

$$SL(2) = GL(2)/(t - 1) = M(2)/(ad - bc - 1) \ [51]; \tag{9D.34}$$

then for any commutative algebra A, there are bijections:

$$\text{(a)}\ \ \mathrm{Hom}_{Alg}(GL(2), A) \cong GL_2(A) \quad \text{and} \quad \text{(b)}\ \ \mathrm{Hom}_{Alg}(SL(2), A) \cong SL_2(A) \tag{9D.35}$$

that send an algebra morphism f to the matrix $\begin{pmatrix} f(a) & f(b) \\ f(c) & f(d) \end{pmatrix}$. (See Exc. 6 for the proof and (9D.28) for the above expression.)

[50] The algebra A is commutative and $A^{\times}$ is the group formed by all elements of A that are invertible.

[51] Recall that M(2) = K[a, b, c, d] is the quotient of the free algebra K{a, b, c, d} by the two-sided ideal generated by the commutators of the generators.

We use (9D.29) to write the following commutative algebras:

$$GL(2)^{\otimes 2} = M(2)^{\otimes 2}[t', t'']/\big((a'd' - b'c')t' - 1,\ (a''d'' - b''c'')t'' - 1\big) \tag{9D.36}$$

$$SL(2)^{\otimes 2} = GL(2)^{\otimes 2}/(t' - 1,\ t'' - 1) = M(2)^{\otimes 2}/\big(a'd' - b'c' - 1,\ a''d'' - b''c'' - 1\big) \tag{9D.37}$$

The morphism given in Result (9D.16) then leads to the algebra morphisms:

$$\text{(a)}\ \ \lambda: GL(2) \to GL(2)^{\otimes 2}, \qquad \text{(b)}\ \ \lambda: SL(2) \to SL(2)^{\otimes 2} \tag{9D.38}$$
In order to obtain the Hopf algebras from $M_q(2)$, we need to endow it with a bialgebra structure; we state it in the following result.

Result 9D.21 There exist unique morphisms of algebras:

$$\nu: M_q(2) \to M_q(2) \otimes M_q(2) \quad \text{and} \quad \delta: M_q(2) \to K \tag{9D.39}$$

given by:

$$\nu(a) = a \otimes a + b \otimes c, \qquad \nu(b) = a \otimes b + b \otimes d \tag{9D.40}$$

$$\nu(c) = c \otimes a + d \otimes c, \qquad \nu(d) = c \otimes b + d \otimes d \tag{9D.41}$$

$$\delta(a) = \delta(d) = 1, \qquad \delta(b) = \delta(c) = 0 \tag{9D.42}$$

The algebra $M_q(2)$ equipped with these morphisms becomes a bialgebra which is neither commutative nor cocommutative. Under these morphisms, we also have:

$$\nu(\det\nolimits_q) = \det\nolimits_q \otimes \det\nolimits_q, \qquad \delta(\det\nolimits_q) = 1 \tag{9D.43}$$
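The quantum determinant appearing in (9D.43) was asserted in Result 9D.19 to be central in $M_q(2)$. The sketch below is a spot check at one rational value of q, not a proof and not the book's method: it rewrites words in a, b, c, d into ordered form using the six relations of Result 9D.17 (ii), and confirms that each commutator $[\det_q, g]$ reduces to zero. The function names and the choice q = 3/2 are illustrative assumptions.

```python
from fractions import Fraction

ORDER = {"a": 0, "b": 1, "c": 2, "d": 3}

def rewrite_pair(x, y, q):
    """Express an out-of-order product x*y through ordered words (relations (ii))."""
    if (x, y) == ("c", "b"):
        return [(Fraction(1), ("b", "c"))]  # bc = cb
    if (x, y) == ("d", "a"):
        # ad - da = (1/q - q) bc, i.e. da = ad + (q - 1/q) bc
        return [(Fraction(1), ("a", "d")), (q - 1 / q, ("b", "c"))]
    return [(q, (y, x))]  # ba = q ab, ca = q ac, db = q bd, dc = q cd

def normal_form(poly, q):
    """Reduce {word: coeff} to a combination of ordered words a^i b^j c^k d^l."""
    result = {}
    stack = list(poly.items())
    while stack:
        word, coeff = stack.pop()
        for i in range(len(word) - 1):
            if ORDER[word[i]] > ORDER[word[i + 1]]:
                for c, pair in rewrite_pair(word[i], word[i + 1], q):
                    stack.append((word[:i] + pair + word[i + 2:], coeff * c))
                break
        else:  # no out-of-order neighbours: the word is already ordered
            result[word] = result.get(word, 0) + coeff
    return {w: c for w, c in result.items() if c != 0}

def multiply(p1, p2):
    """Product of two polynomials in non-commuting letters (concatenate words)."""
    out = {}
    for w1, c1 in p1.items():
        for w2, c2 in p2.items():
            out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return out

def subtract(p1, p2):
    out = dict(p1)
    for w, c in p2.items():
        out[w] = out.get(w, 0) - c
    return out

def det_q(q):
    """The quantum determinant ad - q^{-1} bc."""
    return {("a", "d"): Fraction(1), ("b", "c"): -1 / q}

def is_central(q):
    """True if det_q commutes with each generator modulo the relations."""
    det = det_q(q)
    for g in "abcd":
        gen = {(g,): Fraction(1)}
        commutator = subtract(multiply(det, gen), multiply(gen, det))
        if normal_form(commutator, q):
            return False
    return True

if __name__ == "__main__":
    q = Fraction(3, 2)
    print(is_central(q))
    # The second expression da - q bc reduces to the same ordered form.
    print(normal_form({("d", "a"): Fraction(1), ("b", "c"): -q}, q) == det_q(q))
```

Because every rewriting step uses a relation that holds in $M_q(2)$, a zero normal form shows that the commutator vanishes in $M_q(2)$ for the chosen q.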
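The relations (9D.40-42) can be written in the matrix form as:

$$\nu\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \text{and} \quad \delta\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{9D.44}$$

Proof In order to prove the result we have to check the coassociativity and counit axioms. For this we write:

$$(\nu \otimes \mathrm{id})\,\nu = (\mathrm{id} \otimes \nu)\,\nu \tag{9D.45}$$

and since both sides are morphisms of algebras, we verify it on the generators a, b, c, d. Using the matrix form (9D.44) we have:

$$\big((\nu \otimes \mathrm{id})\nu\big)\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \left(\begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix}\right) \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \left(\begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix}\right) = \big((\mathrm{id} \otimes \nu)\nu\big)\begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{9D.46}$$

Similarly, the counit axiom follows from the matrix identity:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{9D.47}$$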
The computation of $\nu(\det_q)$ in (9D.43) follows from the result established in Exc. 7.

Definition 9D.22 Consider the algebras

$$GL_q(2) = M_q(2)[t]/(t\,\det\nolimits_q - 1) \equiv G_q \quad \text{and} \quad SL_q(2) = GL_q(2)/(t - 1) = M_q(2)/(\det\nolimits_q - 1) \equiv S_q;$$

then given an algebra R, an R-point of $GL_q(2)$ (respectively of $SL_q(2)$) is defined as an R-point m = (A, B, C, D) of $M_q(2)$ whose quantum determinant

$$\mathrm{Det}_q(m) = AD - q^{-1}BC \tag{9D.48}$$

is invertible in R (respectively is equal to 1).

It can be shown that the set of R-points of $G_q$ (respectively of $S_q$) is in bijection with the set $\mathrm{Hom}_{Alg}(G_q, R)$ (respectively $\mathrm{Hom}_{Alg}(S_q, R)$) of algebra morphisms from $G_q$ to R (respectively from $S_q$ to R). We thus have the following important result:

Result 9D.23 The comultiplication $\nu$ and the counit $\delta$ of $M_q(2)$ given by (9D.40-42) equip the algebras $GL_q(2)$ and $SL_q(2)$ with Hopf algebra structures such that the antipode S is given in the matrix form by

$$\begin{pmatrix} S(a) & S(b) \\ S(c) & S(d) \end{pmatrix} = \det\nolimits_q^{-1}\begin{pmatrix} d & -qb \\ -q^{-1}c & a \end{pmatrix} \tag{9D.49}$$

The proof of the above result entails showing that $\nu$ and $\delta$ are well-defined maps on $GL_q(2)$ and $SL_q(2)$, and that both these algebras have antipodes. We refer the reader to Thm. IV.6.1 in C. Kassel, Ref. [Ad] for the proof. In the following remark we summarize the important aspect of these algebras with which we began the introduction.

Remark 9D.24 The bialgebra $M_q(2)$ and the Hopf algebras $GL_q(2)$ and $SL_q(2)$, which are obtained using the self-transformations of the quantum plane, are indeed one-parameter deformations of the bialgebras M(2), GL(2) and SL(2); the parameter here is the (quantum number) q. These are the simplest examples of quantum groups.

Finally, we give below two more examples of Hopf algebras.

Example 9D.25 Let L denote a Lie algebra, and let U(L) = T(L)/I(L) be its enveloping algebra (See Chapter 4). Define a comultiplication $\nu$ on U(L) by $\nu = \psi \circ U(d)$, where d is the diagonal map $x \mapsto (x, x)$ from L into $L \oplus L$ and $\psi$ is the isomorphism $U(L \oplus L) \to U(L) \otimes U(L)$, and also define a counit given by $\delta = U(0)$, where 0 is the zero morphism from L into the zero Lie algebra {0}. The antipode for this is defined by S = U(op), where 'op' is the isomorphism from L onto $L^{op}$ such that op(x) = -x for $x \in L$ [52].

[52] $L^{op}$ = opposite Lie algebra of L. It is the vector space L with Lie bracket $[\,,]^{op}$ given by $[x, y]^{op} = [y, x] = -[x, y]$. Also $U(L^{op}) = U(L)^{op}$.
The enveloping algebra U(L) is a cocommutative Hopf algebra for the maps $\nu$, $\delta$ and the antipode S defined above. More explicitly, for $x_1, \ldots, x_n \in L$, we have:

$$\nu(x_1 \cdots x_n) = 1 \otimes x_1 \cdots x_n + \sum_{p=1}^{n-1} \sum_{\sigma} x_{\sigma(1)} \cdots x_{\sigma(p)} \otimes x_{\sigma(p+1)} \cdots x_{\sigma(n)} + x_1 \cdots x_n \otimes 1 \tag{9D.50}$$

where $\sigma$ runs over all (p, n - p) shuffles of the symmetric group $S_n$, and the antipode S satisfies

$$S(x_1 x_2 \cdots x_n) = (-1)^n\,x_n \cdots x_2 x_1 \tag{9D.51}$$
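As a quick consistency check of (9D.50) and (9D.51) against the antipode relation (9D.24), one can take n = 1. The short computation below is a sketch that assumes $\nu(1) = 1 \otimes 1$, $S(1) = 1$ and $\delta(x) = 0$ for $x \in L$; the first two hold because $\nu$ and S are unital, and the last because $\delta = U(0)$ is induced by the zero morphism.

```latex
\begin{aligned}
\nu(x) &= 1 \otimes x + x \otimes 1, \qquad S(x) = -x, \\
\sum_{(x)} x'\,S(x'') &= 1 \cdot S(x) + x \cdot S(1) = -x + x = 0 = \delta(x)\,1, \\
\sum_{(x)} S(x')\,x'' &= S(1) \cdot x + S(x) \cdot 1 = x - x = 0 = \delta(x)\,1 .
\end{aligned}
```

Both sides of (9D.24) therefore agree on the generators $x \in L$.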
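The enveloping algebra U(L) is a Hopf algebra since the coassociativity axiom (9D.5) required for the definition is satisfied as a consequence of the commutativity of the square:

$$\begin{array}{ccc}
L & \xrightarrow{\ \ \eta\ \ } & L \oplus L \\
\Big\downarrow{\scriptstyle \eta} & & \Big\downarrow{\scriptstyle \mathrm{id} \oplus \eta} \\
L \oplus L & \xrightarrow{\ \eta \oplus \mathrm{id}\ } & L \oplus L \oplus L
\end{array} \tag{9D.52}$$

and the counit axiom (9D.6) is satisfied in view of the commutativity of the diagram

$$\begin{array}{ccccc}
0 \oplus L & \xleftarrow{\ 0 \oplus \mathrm{id}\ } & L \oplus L & \xrightarrow{\ \mathrm{id} \oplus 0\ } & L \oplus 0 \\
 & {\scriptstyle \cong}\nwarrow & \Big\uparrow{\scriptstyle \eta} & \nearrow{\scriptstyle \cong} & \\
 & & L & &
\end{array} \tag{9D.53}$$

and the cocommutativity follows from

$$\begin{array}{ccc}
 & L & \\
 & {\scriptstyle \eta}\swarrow \quad \searrow{\scriptstyle \eta} & \\
L \oplus L & \xrightarrow{\ \ \tau\ \ } & L \oplus L
\end{array} \tag{9D.54}$$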
The morphism $\eta$ should not be confused with $\nu$ (See hint to Exc. 9 for more explanations). We have taken this map to illustrate (9D.5), (9D.6) and (9D.7). Note also that the tensor product sign $\otimes$ in (9D.5)-(9D.7) is changed to the direct sum sign $\oplus$; this is obvious from the isomorphism $\psi$ mentioned in Exp. (9D.25).

Example 9D.26 Next we consider the enveloping algebra of the familiar Lie algebra sl(2). We recall that sl(2) is formed by (2 x 2) traceless matrices. Indeed, for the Lie algebra gl(2) of (2 x 2) matrices over the ground field K, the matrices

$$X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{9D.55}$$

are its basis elements. The subspace spanned by the basis {X, Y, H} is an ideal of gl(2) and is the Lie algebra sl(2). Since

$$gl(2) = sl(2) \oplus KI \tag{9D.56}$$

it follows that the results for gl(2) can be deduced from those of sl(2). The enveloping algebra $U \equiv U(sl(2))$ is isomorphic to the algebra generated by the three elements X, Y, H with the three relations:

$$[X, Y] = H, \qquad [H, X] = 2X, \qquad [H, Y] = -2Y \tag{9D.57}$$
In view of the above Exp. the algebra U(sl(2)) is a Hopf algebra. Although this algebra has many important properties that are used to study the structure theory of Hopf algebras, we limit our attention to the interesting feature of its duality with the algebra SL(2) (see (9D.34)). The notion of duality amongst bialgebras is given by the following definition.

Definition 9D.27 Let $(U, \lambda, \mu, \nu, \delta)$ and $(H, \lambda, \mu, \nu, \delta)$ be two bialgebras and let $\langle\,,\rangle$ denote a bilinear form on U x H. We say that the bilinear form realizes a duality between U and H (or that they are in duality) if the following relations hold good for all $u, v \in U$ and $x, y \in H$:

$$\langle uv, x\rangle = \sum_{(x)} \langle u, x'\rangle\,\langle v, x''\rangle \tag{9D.58}$$

$$\langle u, xy\rangle = \sum_{(u)} \langle u', x\rangle\,\langle u'', y\rangle \tag{9D.59}$$

$$\langle 1, x\rangle = \delta(x) \tag{9D.60}$$

$$\langle u, 1\rangle = \delta(u) \tag{9D.61}$$

If in addition U and H are Hopf algebras with antipodes S, then they are said to be in duality if the underlying bialgebras are in duality and if we have:

$$\langle S(u), x\rangle = \langle u, S(x)\rangle \tag{9D.62}$$

for all $u \in U$ and $x \in H$. (We leave it for the reader to examine that U(sl(2)) and SL(2) satisfy the conditions of duality given in the above definition; see Chapter 5 in C. Kassel, Ref. [Ad].)

As a final piece toward our description of quantum groups, we define the Hopf algebra $U_q = U_q(sl(2))$.

Definition 9D.28 Let $q \in \mathbb{C}$ be an element which is different from 1 and -1; then the fraction $\frac{1}{q - q^{-1}}$ is well defined. The algebra generated by four variables E, F, K, $K^{-1}$ with the relations:

$$KK^{-1} = K^{-1}K = 1 \tag{9D.63}$$

$$KEK^{-1} = q^2E, \qquad KFK^{-1} = q^{-2}F \tag{9D.64}$$

and

$$[E, F] = \frac{K - K^{-1}}{q - q^{-1}} \tag{9D.65}$$

is the algebra $U_q = U_q(sl(2))$. The algebra $U_q$ admits a unique algebra automorphism $\omega$ such that
Exercise 9D

1. Show that the dual vector space of a coalgebra is an algebra.

2. Establish the equivalence between (i) and (ii) of Def. (9D.9) using the pictorial representations of the maps $\lambda, \mu, \nu, \delta$.

3. Show that the triple $(\mathrm{Hom}(C, A), *, \mu \circ \delta)$ is an algebra, and that the map $\eta_{C,A}: A \otimes C^* \to \mathrm{Hom}(C, A)$ is a morphism of algebras, where $A \otimes C^*$ is the tensor product algebra of A and of the algebra C* (dual to the coalgebra C). Show further that when A = K the algebra structure $(\mathrm{Hom}(C, K), *, \mu \circ \delta)$ on the dual space C* is the same as the one defined in Exc. 1.

4. The quotient algebra $K\{x_1, \ldots, x_n\}/I$ is isomorphic to the polynomial algebra $K[x_1, \ldots, x_n]$ in n variables with coefficients in the ground field K. When n = 1, for any commutative algebra A, the underlying set A is in bijection with the set [53]

(a) $$\mathrm{Hom}_{Alg}(K[x], A) \cong A.$$

The algebra K[x] is called the affine line and the set $\mathrm{Hom}_{Alg}(K[x], A)$ is called the set of A-points of the affine line. The algebra of polynomials $K[x_1, x_2]$ with the bijection

(b) $$\mathrm{Hom}_{Alg}(K[x_1, x_2], A) \cong A^2$$

is called the affine plane; an element in $\mathrm{Hom}_{Alg}(K[x_1, x_2], A)$ is called an A-point of the affine plane. Suppose that $\lambda: K[x] \to K[x_1, x_2]$, $\mu: K[x] \to K$, $\nu: K[x] \to K[x]$ are the algebra morphisms defined by:

(c) $$\lambda(x) = x_1 + x_2, \qquad \mu(x) = 0, \qquad \nu(x) = -x.$$

Then show that under the identifications (a) and (b), the morphisms $\lambda, \mu, \nu$ correspond to the maps +, 0, - given by:

(d) $$+: A^2 \to A, \qquad \text{the unit } 0: \{0\} \to A, \qquad \text{and the inverse } -: A \to A.$$

5. Establish the equivalence between (i) and (ii) of Result (9D.17).

6. Prove Result (9D.20).

7. Given an algebra R, an R-point of $M_q(2)$ is a quadruple $(A, B, C, D) \in R^4$ which satisfies the relations (ii) of Result (9D.17) with (a, b, c, d) replaced by (A, B, C, D). Writing (A, B, C, D) in the matrix form $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$, show that if m = (A, B, C, D) and m' = (A', B', C', D') are two R-points of $M_q(2)$ that commute, then the element

$$m'm = \begin{pmatrix} A'' & B'' \\ C'' & D'' \end{pmatrix} = \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix}\begin{pmatrix} A & B \\ C & D \end{pmatrix}$$

is an R-point of $M_q(2)$; show further that the quadruple $\begin{pmatrix} D & -qB \\ -q^{-1}C & A \end{pmatrix}$ is an R-point of $M_{q^{-1}}(2)$ and an $R^{op}$-point of $M_q(2)$, and that $\mathrm{Det}_q(m'm) = \mathrm{Det}_q(m')\,\mathrm{Det}_q(m)$ in R, where $\mathrm{Det}_q(m) = AD - q^{-1}BC = DA - qBC$. The element $\mathrm{Det}_q(m)$ of the algebra R is called the quantum determinant of m.

8. Show that $\nu$ and $\delta$ given in (9D.39) are algebra morphisms.

[53] $\mathrm{Hom}_{Alg}(K[x], A)$ = the set of algebra morphisms from K[x] to A.
9. Given a vector space V, show that there exists a unique bialgebra structure on the tensor algebra T(V) such that $\eta(v) = 1 \otimes v + v \otimes 1$ and $\delta(v) = 0$ for any element v in V. This bialgebra is cocommutative and for all $v_1, \ldots, v_n \in V$

(a) $$\delta(v_1 \cdots v_n) = 0$$

whereas

(b) $$\eta(v_1 \cdots v_n) = 1 \otimes v_1 \cdots v_n + \sum_{p=1}^{n-1} \sum_{\sigma} v_{\sigma(1)} \cdots v_{\sigma(p)} \otimes v_{\sigma(p+1)} \cdots v_{\sigma(n)} + v_1 \cdots v_n \otimes 1$$

where $\sigma$ runs over all permutations of the symmetric group $S_n$ such that $\sigma(1) < \sigma(2) < \cdots < \sigma(p)$ and $\sigma(p+1) < \cdots < \sigma(n)$ (such a permutation is sometimes called a (p, n - p) shuffle).
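For small n, formula (b) can be compared with the product $\eta(v_1) \cdots \eta(v_n)$ by direct enumeration. The sketch below is an illustration rather than a solution of the exercise: it represents an element of $T(V) \otimes T(V)$ as a dictionary from pairs of words (tuples of letters, with the empty tuple standing for 1) to integer coefficients; the function names are hypothetical.

```python
from itertools import combinations

def tensor_multiply(s, t):
    """Product in T(V) (x) T(V): (u (x) v)(u' (x) v') = uu' (x) vv'."""
    out = {}
    for (l1, r1), c1 in s.items():
        for (l2, r2), c2 in t.items():
            key = (l1 + l2, r1 + r2)
            out[key] = out.get(key, 0) + c1 * c2
    return out

def eta_generator(v):
    """eta(v) = 1 (x) v + v (x) 1 for a single letter v."""
    return {((), (v,)): 1, ((v,), ()): 1}

def eta_by_multiplicativity(word):
    """eta(v_1 ... v_n) computed as the product eta(v_1) ... eta(v_n)."""
    result = {((), ()): 1}
    for v in word:
        result = tensor_multiply(result, eta_generator(v))
    return result

def eta_by_shuffles(word):
    """Right-hand side of formula (b), with p = 0 and p = n giving the two end terms.

    A (p, n - p) shuffle is determined by the positions sent to the left factor,
    so the sum runs over all subsets of positions of size p.
    """
    n = len(word)
    out = {}
    for p in range(n + 1):
        for chosen in combinations(range(n), p):
            rest = [i for i in range(n) if i not in chosen]
            key = (tuple(word[i] for i in chosen), tuple(word[i] for i in rest))
            out[key] = out.get(key, 0) + 1
    return out

def flip(t):
    """The flip tau exchanging the two tensor factors."""
    return {(right, left): c for (left, right), c in t.items()}

if __name__ == "__main__":
    word = ("v1", "v2", "v3")
    print(eta_by_multiplicativity(word) == eta_by_shuffles(word))
    print(flip(eta_by_shuffles(word)) == eta_by_shuffles(word))
```

The second comparison illustrates cocommutativity: exchanging a subset of positions with its complement maps the set of shuffle terms to itself.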
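Hints to Exercise 9D

1. Let $(C, \nu, \delta)$ be a coalgebra. It is known that given a finite dimensional vector space C there exists an isomorphism $\eta: C^* \otimes C^* \to (C \otimes C)^*$, where C* is the dual space of C. Let $\bar\eta = \eta \circ \tau_{C^*,C^*}$; define A = C*, $\lambda = \nu^* \circ \bar\eta$ and $\mu = \delta^*$, where the superscript * on a linear map indicates its transpose. Then $(A, \lambda, \mu)$ is an algebra, proving the result that the dual vector space of $(C, \nu, \delta)$ is an algebra.

2. In order to establish the compatibility between the two structures, we consider the tensor product $H \otimes H$ of the vector space H and the two induced structures of algebra and coalgebra on it. We then use the commutative diagrams (9D.1)-(9D.2) and (9D.5)-(9D.6) to express both statements (i) and (ii) of Def. (9D.9). The fact that $\lambda$ is a morphism of coalgebras is expressed by the commutativity of the diagrams:

(a) $$\begin{array}{ccc}
H \otimes H & \xrightarrow{\ \ \lambda\ \ } & H \\
\Big\downarrow{\scriptstyle (\mathrm{id} \otimes \tau \otimes \mathrm{id})(\nu \otimes \nu)} & & \Big\downarrow{\scriptstyle \nu} \\
(H \otimes H) \otimes (H \otimes H) & \xrightarrow{\ \lambda \otimes \lambda\ } & H \otimes H
\end{array}
\qquad
\begin{array}{ccc}
H \otimes H & \xrightarrow{\ \delta \otimes \delta\ } & K \otimes K \\
\Big\downarrow{\scriptstyle \lambda} & & \Big\downarrow{\scriptstyle \mathrm{id}} \\
H & \xrightarrow{\ \ \delta\ \ } & K
\end{array}$$

The fact that $\mu$ is a morphism of coalgebras is expressed by the commutativity of the following diagrams:

(b) $$\begin{array}{ccc}
K & \xrightarrow{\ \ \mu\ \ } & H \\
\Big\downarrow{\scriptstyle \mathrm{id}} & & \Big\downarrow{\scriptstyle \nu} \\
K \otimes K & \xrightarrow{\ \mu \otimes \mu\ } & H \otimes H
\end{array}
\qquad
\begin{array}{ccc}
K & \xrightarrow{\ \ \mu\ \ } & H \\
 & {\scriptstyle \mathrm{id}}\searrow & \Big\downarrow{\scriptstyle \delta} \\
 & & K
\end{array}$$

It is easy to note that these four commutative diagrams are exactly the same as the four diagrams given below, whose commutativity expresses the fact that $\nu$ and $\delta$ are morphisms of algebras.

(c) $$\begin{array}{ccc}
H \otimes H & \xrightarrow{\ \nu \otimes \nu\ } & (H \otimes H) \otimes (H \otimes H) \\
\Big\downarrow{\scriptstyle \lambda} & & \Big\downarrow{\scriptstyle (\lambda \otimes \lambda)(\mathrm{id} \otimes \tau \otimes \mathrm{id})} \\
H & \xrightarrow{\ \ \nu\ \ } & H \otimes H
\end{array}
\qquad
\begin{array}{ccc}
K & \xrightarrow{\ \ \mu\ \ } & H \\
\Big\downarrow{\scriptstyle \mathrm{id}} & & \Big\downarrow{\scriptstyle \nu} \\
K \otimes K & \xrightarrow{\ \mu \otimes \mu\ } & H \otimes H
\end{array}$$

and

(d) $$\begin{array}{ccc}
H \otimes H & \xrightarrow{\ \delta \otimes \delta\ } & K \otimes K \\
\Big\downarrow{\scriptstyle \lambda} & & \Big\downarrow{\scriptstyle \mathrm{id}} \\
H & \xrightarrow{\ \ \delta\ \ } & K
\end{array}
\qquad
\begin{array}{ccc}
K & \xrightarrow{\ \ \mu\ \ } & H \\
 & {\scriptstyle \mathrm{id}}\searrow & \Big\downarrow{\scriptstyle \delta} \\
 & & K
\end{array}$$

(The reader will recall that the mapping $\tau$ in these diagrams indicates the flipping of two elements.)

(Hints to 3 and 4 can be found in Chapters 3 and 1 of C. Kassel; see Ref. [Ad].)

5. We have to show that (i) $\Leftrightarrow$ (ii). We use the first matrix relation (9D.32) to write y'x' = qx'y'; we thus have:

($\alpha$)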
$$(cx + dy)(ax + by) = q(ax + by)(cx + dy).$$

Equating the coefficients of $x^2$, $y^2$ and xy on both sides we obtain:

($\beta$) $$ca = qac, \qquad db = qbd, \qquad cb + q\,da = q\,ad + q^2 bc.$$

The third equality in ($\beta$) when divided by q yields:

($\gamma$) $$ad - da = q^{-1}cb - q\,bc.$$

Similarly, using x'' and y'' we have:

($\delta$) $$ba = qab, \qquad dc = qcd, \qquad ad - da = q^{-1}bc - q\,cb.$$

(Note that ($\delta$) is actually ($\beta$) and ($\gamma$) with b and c interchanged, as should be expected from the expressions for $\begin{pmatrix} x' \\ y' \end{pmatrix}$ and $\begin{pmatrix} x'' \\ y'' \end{pmatrix}$ of (9D.32).) From ($\gamma$) and ($\delta$) we obtain

($\varepsilon$) $$(q^{-1} + q)(bc - cb) = 0.$$

Since $q^2 \neq -1$ we have the final equality bc = cb of (ii), showing that (i) implies (ii). The converse can be proved using similar arguments. (Note that we shall begin with the relations given in (ii) of Result (9D.17) and then obtain the two relations y'x' = qx'y' and y''x'' = qx''y'' using the matrix relations of (9D.32).)
6. We prove the result for GL(2). Let $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ be a matrix in $GL_2(A)$. Since A is commutative, there is a unique morphism $f: M(2)[t] \to A$ such that:

(i) $$f(a) = \alpha, \quad f(b) = \beta, \quad f(c) = \gamma, \quad f(d) = \delta, \quad \text{and} \quad f(t) = (\alpha\delta - \beta\gamma)^{-1}.$$

Also

(ii) $$f((ad - bc)t - 1) = (f(a)f(d) - f(b)f(c))\,f(t) - f(1) = (\alpha\delta - \beta\gamma)(\alpha\delta - \beta\gamma)^{-1} - 1 = 0.$$

This implies that the morphism f factors through the quotient algebra GL(2), which eventually leads to the required equality (9D.35a) of Result (9D.20). Note that in the case of SL(2), the fifth equality in (i) is simply f(t) = 1, hence (ii) is trivially satisfied.

7. To establish this exercise we consider the tensor product algebra:

$$R' = R \otimes K_q[X, Y] = R\{X, Y\}/(YX - qXY)$$
and note that the Result (9D.17) can be rewritten in the language of $R$-points of $M_q(2)$. According to this, a quadruple $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$ of elements of an algebra $R$ is an $R$-point of $M_q(2)$ if and only if the pairs $X', Y'$ and $X'', Y''$ satisfying (9D.32) (with $x, y, x', y', x'', y'', a, b, c, d$ replaced by the corresponding capitals) are $R'$-points of the quantum plane. Since $A, B, C, D$ and $A', B', C', D'$ commute, the second equality in (9D.32) becomes:
\[ \begin{pmatrix} X'' \\ Y'' \end{pmatrix} = \begin{pmatrix} A' & C' \\ B' & D' \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}. \]
Now a second application of Result (9D.32) gives that
\[ \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix} \begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} A'' & B'' \\ C'' & D'' \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} A & C \\ B & D \end{pmatrix} \begin{pmatrix} X'' \\ Y'' \end{pmatrix} = \begin{pmatrix} A'' & C'' \\ B'' & D'' \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix} \]
are $R'$-points of the quantum plane, hence $\begin{pmatrix} A'' & B'' \\ C'' & D'' \end{pmatrix}$ is an $R$-point of $M_q(2)$. To prove the second part, we set $A' = D$, $B' = -qB$, $C' = -q^{-1}C$ and $D' = A$ and note that the Result (9D.17) with $(a, b, c, d)$ replaced by $(A, B, C, D)$ gives the relations $A'B' = q\,B'A'$ etc. in terms of $(A', B', C', D')$, which means that $(A', B', C', D')$ is an $R$-point of $M_{q^{-1}}(2)$ or an $R^{op}$-point of $M_q(2)$. The third part, which is computational, is left for the reader.
8. Use the above exercise to show that $(\nu(a), \nu(b), \nu(c), \nu(d))$ is an $M_q(2) \otimes M_q(2)$-point of $M_q(2)$; similarly show that $(\delta(a), \delta(b), \delta(c), \delta(d))$ is a $K$-point of $M_q(2)$.
9. Note that by universality of the tensor algebra, there exist unique algebra morphisms $\eta: T(V) \to T(V) \otimes T(V)$ and $\delta: T(V) \to K$ such that their restrictions to $V$ are given by the formulas given in this Exc. Now consider $n$ elements $v_1, \ldots, v_n$ in $V$, and note that formula (a) is a trivial consequence of the multiplicativity of the morphism $\delta$. Next, to compute $\eta(v_1 \cdots v_n)$, use induction on $n$. Since formula (b) holds for $n = 1$ by definition, assume that it holds up to $n - 1 \geq 1$, then writing the equality (i)
\[ \eta(v_1 \cdots v_n) = \eta(v_1 \cdots v_{n-1})\,\eta(v_n) = \eta(v_1 \cdots v_{n-1})\,(1 \otimes v_n + v_n \otimes 1) \]
and substituting the value of $\eta(v_1 \cdots v_{n-1})$ from formula (b), one has the equality after using the arguments on $(p, n - 1 - p)$ shuffles of $S_{n-1}$. (We leave the rest of the proof for the reader.) We note, however, that cocommutativity is a consequence of the fact that the permutation:
\[ \begin{pmatrix} 1 & 2 & \cdots & p & p+1 & p+2 & \cdots & n \\ p+1 & p+2 & \cdots & n & 1 & 2 & \cdots & p \end{pmatrix} \tag{ii} \]
switches the two shuffles $(p, n - p)$ and $(n - p, p)$. Also, the coassociativity of $\eta$ results from the relation
\[ (d \otimes \mathrm{id}) \circ d = (\mathrm{id} \otimes d) \circ d \tag{iii} \]
where $d$ is the diagonal map $d(v) = (v, v)$ from $V$ into $V \otimes V$.
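The induction step and the cocommutativity claim of Exercise 9 can be checked on small words. The following is a minimal sketch, not the book's construction: it assumes that formula (b) is the usual sum over $(p, n-p)$-shuffles, models a monomial $v_1 \cdots v_n$ as a Python tuple of generator names, and represents an element of $T(V) \otimes T(V)$ as a `Counter` keyed by pairs of words. The function names `coproduct`, `multiply` and `flip` are illustrative only.

```python
from collections import Counter
from itertools import combinations


def coproduct(word):
    """Shuffle-type coproduct of a word: sum over all ways of sending an
    order-preserving subword to the left factor and the rest to the right."""
    n = len(word)
    terms = Counter()
    for p in range(n + 1):
        for left_positions in combinations(range(n), p):
            chosen = set(left_positions)
            left = tuple(word[i] for i in range(n) if i in chosen)
            right = tuple(word[i] for i in range(n) if i not in chosen)
            terms[(left, right)] += 1
    return terms


def multiply(x, y):
    """Product in T(V) (x) T(V): concatenate words componentwise."""
    out = Counter()
    for (l1, r1), c1 in x.items():
        for (l2, r2), c2 in y.items():
            out[(l1 + l2, r1 + r2)] += c1 * c2
    return out


def flip(x):
    """The flip map: swap the two tensor factors."""
    return Counter({(r, l): c for (l, r), c in x.items()})


if __name__ == "__main__":
    w = ("v1", "v2", "v3")
    # Induction step: eta(v1 v2 v3) = eta(v1 v2) * (1 (x) v3 + v3 (x) 1)
    print(coproduct(w) == multiply(coproduct(w[:2]), coproduct(w[2:])))
    # Cocommutativity: flipping the factors leaves the coproduct unchanged
    print(flip(coproduct(w)) == coproduct(w))
```

Choosing which positions go to the left factor is the same as choosing a $(p, n-p)$-shuffle, which is why the flip simply exchanges a subset of positions with its complement.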
References
1. A. O. Barut, New Frontiers in Quantum Electrodynamics and Quantum Optics (New York: Plenum Press, 1990).
2. F. A. Berezin, Method of Second Quantization (New York: Academic Press, 1966).
3. S. N. Bose, Plancks Gesetz und Lichtquantenhypothese, Zeitschrift für Physik 26 (1924), 178-181.
4. A. Boutet de Monvel, et al. (ed.), Recent Developments in Quantum Mechanics (Proc. of Brasov Conf., Boston: Kluwer Academic Publishers, 1991).
5. L. de Broglie, Heisenberg's Uncertainties and the Probabilistic Interpretation of Wave Mechanics (Boston: Kluwer Academic Publishers, 1990).
6. T. P. Cheng and L. F. Li, Gauge Theory of Elementary Particle Physics (New York: Clarendon Press, 1984).
7. A. Das, Field Theory: A Path Integral Approach (New York: World Scientific, 1993).
8. H. D. Doebner, J. D. Henning and T.-D. Palev (ed.), Infinite-dimensional Lie Algebras and Quantum Field Theory (Proc. of Varna Summer School, New York: World Scientific, 1988).
9. E. N. Economou, Green's Functions in Quantum Physics (New York: Springer-Verlag, 1979).
10. L. D. Faddeev and V. N. Popov, Feynman Diagrams for Yang-Mills Field, Phys. Lett. B25 (1967).
11. L. D. Faddeev and A. A. Slavnov, Gauge Fields: An Introduction to Quantum Theory (New York: Benjamin-Cummings Publishing Co., 1980).
12. J. S. Feldman and L. M. Rosen (ed.), Mathematical Quantum Field Theory and Related Topics, Proc. of the 1987 Montreal Conf., Ann. Math. Soc. (1988).
13. R. P. Feynman, (a) Space-time Approach to Nonrelativistic Quantum Mechanics, Rev. Mod. Phys. 20 (1948), 367-387; (b) Quantum Theory of Gravitation, Acta. Phys. Polon. 24 (1963).
14. A. Galindo and P. Pascual, Quantum Mechanics I (New York: Springer-Verlag, 1990).
15. J. Glimm and A. Jaffe, Quantum Physics (2nd ed., New York: Springer-Verlag, 1986).
16. W. T. Grandy, Relativistic Quantum Mechanics of Leptons and Fields (Boston: Kluwer Academic Publishers, 1991).
17. H. F. Hameka, Quantum Mechanics (New York: John Wiley, 1981).
18. M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems and Linear Algebra (New York: Academic Press, 1974).
19. D. C. Khandekar, S. V. Lawande, K. V. Bhagwat, Path Integral Methods and their Applications (New York: World Scientific, 1993).
20. L. D. Landau, Quantum Mechanics (Non-relativistic Theory) (New York: Pergamon Press, 1977).
21. O. L. De Lange and R. E. Raab, Operator Methods in Quantum Mechanics (Oxford: Clarendon Press, 1991).
22. F. Mandl, Introduction to Quantum Field Theory (New York: Interscience Publishers Inc., 1959).
23. F. Mandl and G. Shaw, Quantum Field Theory (New York: John Wiley, 1984).
24. E. Merzbacher, Quantum Mechanics (2nd ed., New York: John Wiley, 1970).
25. V. A. Miransky, Dynamical Symmetry Breaking in Quantum Field Theories (New York: World Scientific, 1993).
26. K. Moriyasu, An Elementary Primer for Gauge Theory (New York: World Scientific, 1983).
27. O. Piguet and K. Sibold, Renormalized Supersymmetry, the Perturbation Theory of Renormalized Supersymmetric Theories in Flat Space-Time (Boston: Birkhäuser, 1968).
28. M. Planck, Ueber das Gesetz der Energieverteilung im Normalspectrum, Annalen d. Physik 4 (1901), 553-563.
29. V. N. Popov, Functional Integrals in Quantum Field Theory and Statistical Physics (New York: D. Reidel Publishing Co., 1983).
30. R. M. Santilli, Foundations of Theoretical Mechanics (Vol. I and II, New York: Springer-Verlag, 1978).
31. T. Schücker, Distributions, Fourier Transforms and Some of their Applications to Physics (New York: World Scientific, 1991).
32. B. Simon, Functional Integration and Quantum Physics (New York: Academic Press, 1970).
33. H. Spohn, Large Scale Dynamics of Interacting Particles (New York: Springer-Verlag, 1991).
34. G. W. Strang, Linear Algebra and its Applications (3rd ed., Saunders, 1988).
35. V. S. Varadarajan, Geometry of Quantum Theory (2nd ed., New York: Springer-Verlag, 1985).
36. B. S. De Witt, Quantum Theory of Gravity, Phys. Rev. 160 (1967); 162 (1967).
37. T. Y. Wu, Quantum Mechanics (New York: World Scientific, 1986).
38. J. D. Bjorken and S. D. Drell, Relativistic Quantum Mechanics (New York: McGraw-Hill, 1964).
39. J. R. Waldram, The Theory of Thermodynamics (Cambridge: Cambridge University Press, 1985).
CHAPTER 10
THEORY OF YANG-MILLS AND THE YANG-MILLS-HIGGS MECHANISM
1
INTRODUCTION
This chapter is devoted to one of the most outstanding theories of our times, the theory of Yang-Mills. It is this theory which on the physical side paved the way for another important gauge theory, the electroweak theory of Glashow-Salam-Weinberg, and gave an insight into symmetry-breaking phenomena through the Higgs mechanism. On the mathematical side it led to prolific research in analysis, geometry and topology. The theory was proposed by C. N. Yang and R. L. Mills in 1954 [55] to replace the abelian gauge group U(1) of Maxwell's theory by the isospin gauge group SU(2). Unfortunately the massless particles predicted by the theory could not be identified with anything similar to photons, the massless carriers of the electromagnetic field. Hence for almost two decades the theory remained dormant. In 1966 Higgs [28] circumvented this difficulty by introducing a scalar field; this field is called the Higgs field and the ensuing process the Higgs mechanism. Soon after, in the mathematical world, the non-linear differential equations resulting from Yang-Mills theory received a new status, and the study of their solutions along with their properties became a hot field of research. In the realm of physics the theory began to be viewed with interest and with trust, in the sense that experiments were modeled using the principles of the theory (see Salam [43]). Some of the key players (mathematicians and physicists) that participated in the explosive scheme of ideas leading to a structurally sound theory were: Higgs [28], Belavin and Polyakov [15], Bogomolny [16], Prasad [40], Jackiw [30], Atiyah [2], Ward [52], Manton [34], Hitchin [10, 29], Taubes [46], 't Hooft [47], Uhlenbeck [48], Goddard [13], Bott [9], Witten [11, 54], Singer [4, 5, 12, 44] and Donaldson [22].
In the context of solutions of these non-linear differential equations (resulting from the Yang-Mills functional) new words such as "instantons" were coined, and words and phrases such as anomaly, vortices and monopoles, self-duality of curvature, exotic structures, de Rham complex, index theorems, Sobolev spaces and moduli spaces received a new meaning. Deep regularity theorems linking one discipline with another were proved (see, for instance, the work by Uhlenbeck [48], Taubes [46] and, most importantly, Donaldson [22]). It was recognized early on that differential geometry (in particular the theory of fiber bundles) provided a suitable language for the description of the theory (see Sec. 6.5 and Sec. 6.6), and therefore there appeared quite a few survey articles on the theory; some of these are [5], [9a, b], [13], [20] and [24].
Since there already exists a vast amount of literature on the subject, some of which is easily comprehensible [2a], [23], we mainly restrict ourselves to providing examples and explanations of the words mentioned above along with brief background material. For detailed studies, the reader is advised to refer to the above articles and to the books [1], [4], [10], [23], [32], [34].
2
YANG-MILLS AND YANG-MILLS-HIGGS FUNCTIONAL
For reasons mentioned in the introduction, we begin with Yang-Mills-Higgs theory and define the terms required for understanding it.
2.1
Yang-Mills-Higgs Action in $\mathbb{R}^n$ and $\mathbb{R}^{n,1}$
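The ingredients of the theory, called the dynamical variables by physicists, are a gauge potential (connection) $A = A_i(x)\,dx^i$ and a scalar field $\phi$ [...]. The potential $A$ gives the curvature on the one hand,
\[ F_A = dA + A \wedge A = \tfrac{1}{2} F_{ij}\,dx^i \wedge dx^j = \tfrac{1}{2}\big(\partial_i A_j(x) - \partial_j A_i(x) + [A_i(x), A_j(x)]\big)\,dx^i \wedge dx^j, \tag{10.2.1} \]
and gives the covariant derivative of $\phi$ on the other,
\[ D_A\phi = (\nabla_A)_i(\phi)\,dx^i = (\nabla_i\phi + \rho(A_i)\phi)\,dx^i. \tag{10.2.2a} \]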
The $\rho$ in the above equation stands for the linear representation of the Lie algebra $\mathfrak{g}$ on $L$; evidently this representation is induced by that of $G$ (on $L$). The covariant derivative (10.2.2a) thus couples $\phi$ to the connection $A$. We note that the gauge potential $A$ also defines the exterior covariant derivative of arbitrary $p$-forms $\omega$. Thus, if $\omega$ is an $L$-valued $p$-form:
\[ D_A\omega := d\omega + \rho(A) \wedge \omega \tag{10.2.2b} \]
and when it is a $\mathfrak{g}$-valued $p$-form it is:
\[ D_A\omega := d\omega + A \wedge \omega - (-1)^p\,\omega \wedge A. \tag{10.2.2c} \]
It is easy to note that when $\rho$ is the adjoint representation acting on $L = \mathfrak{g}$, equations (10.2.2b) and (10.2.2c) agree with each other. (In Exc. 5 we shall see that all these equations (10.2.2) are covariant under gauge transformation.)
¹ The Lie group $G$, being a transformation group on the vector space $L$, is a matrix Lie group here; it can always be replaced by a general (compact) Lie group $G$ to write the YMH functional. In this case the definition of $F_A$, where $\wedge$ stands for matrix multiplication, has to be changed, as the use of $\wedge$ in $(A \wedge A)$ in (10.2.1) is valid only in the case of matrix Lie groups. We note that the definition of $F_A$ in indicial notation is valid in all cases.
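The agreement of (10.2.2b) and (10.2.2c) for the adjoint representation can be checked in components. The sketch below assumes the readings $D_A\omega = d\omega + \rho(A) \wedge \omega$ and $D_A\omega = d\omega + A \wedge \omega - (-1)^p\,\omega \wedge A$ of these two equations, writes $\omega = \omega_I\,dx^I$ with a multi-index $I$ of length $p$ (a notation introduced here only for illustration), and takes $\rho(X)Y = [X, Y]$:

```latex
\begin{align*}
\omega \wedge A &= \omega_I A_i\, dx^I \wedge dx^i = (-1)^p\, \omega_I A_i\, dx^i \wedge dx^I,\\
A \wedge \omega - (-1)^p\, \omega \wedge A &= (A_i \omega_I - \omega_I A_i)\, dx^i \wedge dx^I
  = [A_i, \omega_I]\, dx^i \wedge dx^I = \rho(A) \wedge \omega .
\end{align*}
```

The sign $(-1)^p$ comes from moving the 1-form $dx^i$ past the $p$-form $dx^I$, and it is exactly what turns the two matrix products into a commutator.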
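The Euclidean Yang-Mills-Higgs action can now be written as:
\[ \mathcal{A}_{YMH}(A, \phi) = \frac{1}{2}\int_{\mathbb{R}^n}\Big\{(F_A, F_A) + (D_A\phi, D_A\phi) + \frac{\lambda}{4}\big(|\phi|^2 - 1\big)^2\Big\} \equiv \int_{\mathbb{R}^n}\mathcal{A}_n; \tag{10.2.3} \]
the notations used here are mostly those of [28]. The third term in (10.2.3) represents the Higgs self-interaction, where $\lambda > 0$ is a constant. In place of $\mathbb{R}^n$, if we use the Minkowskian space of dimension $(n + 1)$, with $x^0$ denoting the time coordinate, the action density becomes:
\[ \mathcal{A}_{n,1} = \frac{1}{2}\Big\{\tfrac{1}{2}(F_{ij}, F_{ij}) - (F_{0i}, F_{0i}) + ((\nabla_A)_i\phi, (\nabla_A)_i\phi) - ((\nabla_A)_0\phi, (\nabla_A)_0\phi) + \frac{\lambda}{4}\big(|\phi|^2 - 1\big)^2\Big\} \tag{10.2.4} \]
and accordingly the action is:
\[ \mathcal{A}^{n,1}_{YMH} = \int_{\mathbb{R}^{n,1}}\mathcal{A}_{n,1}. \tag{10.2.5a} \]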
The field configuration $(A, \phi)$ is called 'static' if $A$ and [...]
(10.2.5b)
exists and equals (10.2.3). $\mathcal{A}_{YMH}$ is then called the energy of the static configuration.
2.2
The Variational Equations and Solutions
Just as we had variational equations for the pure Yang-Mills action (see Eqs. 6.7.18-21), the variational equations for the Yang-Mills-Higgs action (10.2.3) (denoted YMH) are
\[ D_A * F = *J, \tag{10.2.6a} \]
and
\[ \nabla_A^2\phi = \frac{\lambda}{2}\,\phi\,\big(|\phi|^2 - 1\big). \tag{10.2.6b} \]
When $G$ is arbitrary, $L = \mathfrak{g}$ and $\rho$ is the adjoint representation, the current $J$ is:
\[ J = -[\phi, D_A\phi]. \tag{10.2.7} \]
We note that one is usually interested in 'finite action' solutions of the above variational equations. These are indeed the time-independent finite energy solutions to the variational equations coming from the action density (10.2.4) on Minkowskian space $\mathbb{R}^{n,1}$. We call these solutions solitons.
2.3
Instantons, Vortices and Monopoles
In the case of Euclidean pure Yang-Mills equations (i.e., $\lambda = 0$ and $\phi = 0$) when $n = 4$, these (solitons) are called instantons. Now instantons have the property that their curvatures are self-dual or anti-self-dual; i.e. $F_A$ satisfies:
\[ *F_A = F_A \quad \text{or} \quad *F_A = -F_A. \tag{10.2.8} \]
In view of the definitions (10.2.2a) of $D_A$ and (10.2.1) of $F_A$, as well as the fact that $d^2 = 0$, it follows that $F_A$ satisfies the Bianchi identity
\[ D_A F_A = 0, \tag{10.2.9} \]
therefore from (10.2.8) we have the Yang-Mills equation:
\[ D_A * F_A = 0. \tag{10.2.10} \]
Hence every curvature that satisfies the self-duality or anti-self-duality condition (10.2.8) is a critical point (see Definition (0.5.3)) of the $n = 4$ Yang-Mills action (functional):
\[ \mathcal{A}_{YM} = \frac{1}{2}\int_{\mathbb{R}^4}(F_A, F_A). \]
Recall that we obtained the YM equations in Sec. (6.7) (Eq. (6.7.20)) while studying the gauge theories from the bundle-theoretic point of view. We now list a few facts about the solutions of pure Yang-Mills and Yang-Mills-Higgs equations.
Fact 10.2.1 When the field $A$ is considered as a connection on a principal bundle and it is assumed that it approaches the flat connection sufficiently rapidly as $|x| \to \infty$, then the integral
\[ \frac{1}{8\pi^2}\int \operatorname{Tr}(F \wedge F) = N \tag{10.2.11} \]
is an integer. A sufficient condition for $A$ to satisfy the above asymptotic behaviour is that the connection $A$ be the pullback of a connection on $S^4$ via stereographic projection; for in this case, the above integral is the second Chern number (see 10A.20). The Chern numbers are known to be invariant under pullbacks. (See Chapter 5 in [35].)
Fact 10.2.2 All instanton solutions have an associated integer $N$. For a fixed $N$, the solution manifold has a well defined dimensionality which is determined by the group $G$ and the integer $N$. For instance, when $G = SU(2)$, there is an $8|N| - 3$ parameter family of solutions.
Fact 10.2.3 The instanton solutions minimize the functional $\mathcal{A}_{YM}$ as long as it is restricted to connections for which $N$ is a fixed integer. Also every local minimum of $\mathcal{A}_{YM}$ is an instanton.
Fact 10.2.4 For dimensions $< 4$, there are no finite action solutions to the pure Yang-Mills equations (10.2.10). However, finite action solutions exist for the YMH equations (10.2.6). In the case of $n = 2$ these are called vortices, the term vortex coming from superconductivity as we shall see in Sec. 5, and when $n = 3$ they are known as monopoles. The features of these Higgs models are qualitatively similar to the Yang-Mills theory in 4 dimensions.
Fact 10.2.5 For $n > 4$, there are no finite action solutions to (10.2.6).
Fact 10.2.6 For $n = 4$, the only solutions to (10.2.6) are finite action solutions, which are naturally gauge equivalent to pure YM solutions.
We shall prove a few of these facts later. We now illustrate some of the above theory by means of examples and exercises that deal with instantons, vortices and monopoles.
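The self-duality condition (10.2.8) and the minimizing property in Fact 10.2.3 rest on a piece of linear algebra that is easy to test: on 2-forms in Euclidean $\mathbb{R}^4$ the Hodge star squares to the identity, so every 2-form splits into a self-dual and an anti-self-dual part. The sketch below is a simplified illustration only. It assumes real-valued (abelian) 2-forms with constant coefficients, stored as a dictionary keyed by index pairs $(i, j)$ with $i < j$, and the orientation $dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3$; the names `hodge_star`, `inner` and `split` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

PAIRS = list(combinations(range(4), 2))  # index pairs (i, j) with i < j


def perm_sign(p):
    """Sign of a permutation of (0, 1, 2, 3), by counting inversions."""
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if p[a] > p[b])
    return -1 if inversions % 2 else 1


def hodge_star(F):
    """*(dx^i ^ dx^j) = sign(i, j, k, l) dx^k ^ dx^l, with {k, l} the complement."""
    out = {}
    for (i, j) in PAIRS:
        k, l = [m for m in range(4) if m not in (i, j)]
        out[(k, l)] = perm_sign((i, j, k, l)) * F[(i, j)]
    return out


def inner(F, G):
    """Pointwise inner product (F, G) = sum over i < j of F_ij G_ij."""
    return sum(F[p] * G[p] for p in PAIRS)


def split(F):
    """Self-dual and anti-self-dual parts (F +/- *F)/2.
    Coefficients are assumed to be integers or Fractions."""
    sF = hodge_star(F)
    plus = {p: Fraction(F[p] + sF[p], 2) for p in PAIRS}
    minus = {p: Fraction(F[p] - sF[p], 2) for p in PAIRS}
    return plus, minus


if __name__ == "__main__":
    F = {(0, 1): 1, (0, 2): 2, (0, 3): 3, (1, 2): 4, (1, 3): 5, (2, 3): 6}
    plus, minus = split(F)
    # (F, F) = |F+|^2 + |F-|^2, while (F, *F) = |F+|^2 - |F-|^2
    print(inner(F, F), inner(plus, plus) + inner(minus, minus))
    print(inner(F, hodge_star(F)), inner(plus, plus) - inner(minus, minus))
```

Since $(F, F) = |F^+|^2 + |F^-|^2$ and $(F, *F) = |F^+|^2 - |F^-|^2$, one always has $(F, F) \geq |(F, *F)|$, with equality exactly when $F^-$ or $F^+$ vanishes. In the non-abelian setting the integral of the second quantity is the topological term of (10.2.11), which is the mechanism behind Fact 10.2.3.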
2.4
An Example on Instantons
Example 10.2.7 The instanton solution (finite action solution) of Euclidean Yang-Mills theory is a connection on a principal bundle with $M = S^4$ as the base space and $G = SU(2) = S^3$ as the fiber. We obtain below an explicit expression for it (see also Exc. 2 of this section). For this we take the metric on $S^4$ with radius $a/2$ as:
\[ ds^2 = \frac{dr^2 + r^2(\sigma_x^2 + \sigma_y^2 + \sigma_z^2)}{(1 + r^2/a^2)^2} = \sum_{a=0}^{3}(e^a)^2 = \frac{dx^\mu\,dx^\mu}{(1 + r^2/a^2)^2} \tag{10.2.12} \]
which is achieved by considering the projection from the north or south pole onto $\mathbb{R}^4$. (See Exp. (0.5.1) and (0.5.2) and Exc. (3) for the notations used.) As in the Hint to Exc. 2, we split $S^4$ into hemispheres $H_+$ and $H_-$, and in the overlap region $H_+ \cap H_- \simeq S^3$ define the transition functions $h(x^\mu)$ that relate the fibers $g_+$, $g_-$ coming from these hemispheres thus²:
\[ g_- = [h(x)]^k\,g_+. \tag{10.2.13} \]
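In the hint to Exc. 3 we have seen that $h(x^\mu)$ [...] satisfies:
\[ h^{-1}dh = i\tau_k\sigma_k \quad \text{and} \quad (dh)\,h^{-1} = -i\tau_k\bar{\sigma}_k. \tag{10.2.14} \]
(We used the summation index Z there.) Also, in view of Exp. (0.5.2) the connection 1-forms in the two neighborhoods $H_+$, $H_-$ can be written as:
\[ \omega = g_+^{-1}A\,g_+ + g_+^{-1}dg_+ \ \text{ on } H_+, \qquad \omega = g_-^{-1}A'g_- + g_-^{-1}dg_- \ \text{ on } H_-, \tag{10.2.15} \]
where
\[ A'(x) = (h(x))^k A(x)\,(h(x))^{-k} + (h(x))^k\,d(h(x))^{-k}. \tag{10.2.16} \]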
When $k = 1$ we have the single instanton solution:
\[ H_+:\quad A = \frac{r^2}{r^2 + a^2}\,h^{-1}dh = \frac{r^2}{r^2 + a^2}\,i\tau_k\sigma_k, \]
\[ H_-:\quad A' = h\Big[\frac{r^2}{r^2 + a^2}\,h^{-1}dh\Big]h^{-1} + h\,dh^{-1} = \frac{r^2}{r^2 + a^2}\,(dh)h^{-1} + \big(-(dh)h^{-1}\big), \tag{10.2.17} \]
where we have used $d(h \circ h^{-1}) = (dh) \circ h^{-1} + h\,dh^{-1}$. This gives:
\[ A' = -\frac{(dh)h^{-1}}{1 + r^2/a^2} = \frac{i\tau_k\bar{\sigma}_k}{1 + r^2/a^2}. \tag{10.2.18} \]
Note that while $A$ is well defined throughout $H_+$, it is singular at the south pole at $r = \infty$. Similarly $A'$, which is well defined in $H_-$, is singular at the north pole at $r = 0$.
² $k$ in (10.2.13), (10.2.16) is an integer which corresponds to the second Chern class and represents the equivalence classes of instanton bundles; see Fact (10.2.2).
The solutions $A$ and $A'$ are the (Yang-Mills) analogues of the two gauge-equivalent Dirac monopole solutions with Dirac strings in the upper and lower hemispheres of $S^2$, as shown in the Hint to Exc. 1. The corresponding field strengths in $H_+$ and $H_-$ are given as:
\[ H_+:\ F_+ = dA + A \wedge A, \qquad H_-:\ F_- = dA' + A' \wedge A'. \tag{10.2.19} \]
Using the computations of Exp. (0.5.2) and the equality (10.2.16), these are seen to be:
\[ F_+ = i\tau_k\,\frac{2}{a^2}\Big(e^0 \wedge e^k + \tfrac{1}{2}\,\epsilon_{kij}\,e^i \wedge e^j\Big), \qquad F_- = h\,F_+\,h^{-1}. \tag{10.2.20} \]
Now $F$ is self-dual, i.e., $*F = F$; therefore in view of the Bianchi identity $D_A F = 0$ we have the Yang-Mills equations (10.2.10):
\[ D_A * F = d{*F} + A \wedge *F - *F \wedge A = 0. \tag{10.2.21} \]
Hence we have established that $A$ is the single-instanton solution of the equation. We emphasize that, while looking for a solution to the Euclidean Yang-Mills equation, we are indeed in search of a gauge potential which is regular almost everywhere; the potential $A$ described above fulfills that condition. Having obtained the solutions $A$ and $A'$ in index-free notation, we write them down locally to explain things further. Recall that $A = (A_\mu)$, and since $A$ is $su(2)$-valued we can write
\[ A_\mu = A_\mu^a\,\tau_a/2 \quad (a = 1, 2, 3) \qquad \text{and} \qquad F_{\mu\nu} = F_{\mu\nu}^a\,\tau_a/2 = [(\partial_\nu A_\mu - \ldots \tag{10.2.22} \]
(See Sec. 6.6.) In view of (10.2.17), when $r \to \infty$ the components $A_\mu \to h^{-1}\partial_\mu h$, which shows that locally it is a pure gauge (i.e. it no longer depends on the compactified manifold). From (10.2.22) the components $F_{\mu\nu}$ are seen to vanish. Thus the singular point of $A$ is characterized by the zero field strength*. For $r = 0$, $A_\mu = 0$ and $F_{\mu\nu} = 0$. Hence in both cases $A_\mu$, whether it is trivially zero or it is asymptotic, gives rise to a zero field strength and thus to a vacuum state.
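That a pure gauge potential $A_\mu = h^{-1}\partial_\mu h$ has vanishing field strength can be verified numerically. The following is a minimal sketch under loud assumptions: the explicit formula for $h$ is not legible in the text above, so the code uses the common choice $h(x) = (x^0 - i\,x^k\tau_k)/r$ with $\tau_k$ the Pauli matrices, and it takes $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]$ as in (10.2.1). The helper names (`mul`, `add`, `pure_gauge`, `curvature`) are illustrative.

```python
import math

I2 = [[1, 0], [0, 1]]
TAU = [  # Pauli matrices tau_1, tau_2, tau_3
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(a, b, s=1):
    """Return a + s*b for 2x2 matrices."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


# Basis e_0 = 1, e_k = -i tau_k, so that h = x^mu e_mu / r (assumed form of h).
E = [I2] + [scale(-1j, t) for t in TAU]


def h(x):
    r = math.sqrt(sum(c * c for c in x))
    m = [[0, 0], [0, 0]]
    for mu in range(4):
        m = add(m, scale(x[mu] / r, E[mu]))
    return m


def pure_gauge(x, mu):
    """A_mu = h^{-1} d_mu h, using h^{-1} = h^dagger and the exact derivative
    d_mu h = e_mu / r - h x_mu / r^2."""
    r = math.sqrt(sum(c * c for c in x))
    hx = h(x)
    dh = add(scale(1 / r, E[mu]), scale(x[mu] / r ** 2, hx), s=-1)
    return mul(dagger(hx), dh)


def curvature(x, mu, nu, step=1e-5):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu + [A_mu, A_nu], derivatives by central differences."""
    def d(rho, sigma):
        xp, xm = list(x), list(x)
        xp[rho] += step
        xm[rho] -= step
        diff = add(pure_gauge(xp, sigma), pure_gauge(xm, sigma), s=-1)
        return scale(1 / (2 * step), diff)

    a_mu, a_nu = pure_gauge(x, mu), pure_gauge(x, nu)
    comm = add(mul(a_mu, a_nu), mul(a_nu, a_mu), s=-1)
    return add(add(d(mu, nu), d(nu, mu), s=-1), comm)


if __name__ == "__main__":
    point = [0.7, -0.4, 1.1, 0.3]
    worst = max(abs(curvature(point, m, n)[i][j])
                for m in range(4) for n in range(4) for i in range(2) for j in range(2))
    print("largest |F| entry:", worst)  # small, limited only by finite differences
```

The potential itself is not small at the sample point, so the vanishing of $F_{\mu\nu}$ reflects the cancellation between the derivative terms and the commutator rather than a trivial zero.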
2.5
An Example on Vortices
In the following example we again use a coordinate set up to obtain the vortex solutions of a complex scalar field $\phi$ interacting with an electromagnetic field. The example illustrates the Yang-Mills-Higgs system in the case $n = 2$. Example 10.2.8 The field $\phi(x)$ (with complex conjugate $\phi^*$) can be viewed as though it was a Higgs particle with mass $m$. The Yang-Mills field $A = (A_\mu)$ is replaced by the electromagnetic field, and the gauge group $G$ in place of SU(2) is now U(1). As a result, the connection and curvature are respectively $-iA$ and $-iF_A = -i\,dA$, with $A$ real (see also Secs. (6.6) and (6.7)). $F$ is the field strength tensor.
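The covariant derivative $D_A\phi$ given in (10.2.2a) and the curvature $F_A$ in (10.2.1) can now be written as³:
\[ D_A\phi = (\partial_\mu - ieA_\mu)\,\phi\,dx^\mu \equiv D_\mu\phi\,dx^\mu, \tag{10.2.23a} \]
\[ F_A = \partial_\nu A_\mu - \partial_\mu A_\nu \qquad (\mu, \nu = 1, 2), \tag{10.2.23b} \]
whereas the Lagrangian for the interaction can be expressed as:
\[ L = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + (D_\mu\phi)^*(D^\mu\phi) - \frac{\lambda}{2}\Big(\phi^*\phi - [\ldots]\Big)^2. \tag{10.2.24} \]
(Note the similarity between this and (10.2.3).) The variational equations for the action $\int_{\mathbb{R}^2} L$ are:
\[ \partial_\nu F^{\mu\nu} = e\,j^\mu, \tag{10.2.25a} \]
\[ D_\mu D^\mu\phi = -\lambda\Big(\phi^*\phi - [\ldots]\Big)\phi, \tag{10.2.25b} \]
where the current $j = (j^\mu)$ stands for:
\[ j^\mu = i\big(\phi^* D^\mu\phi - \phi\,(D^\mu\phi)^*\big) = i\,(\ldots) \tag{10.2.26} \]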
The similarity between (10.2.25) and (10.2.6), and between (10.2.26) and (10.2.7), is obvious. We also note that the equations (10.2.25) are gauge invariant, as can be seen by using the phase transformation $\phi \to e^{i\eta}\phi$, $\phi^* \to e^{-i\eta}\phi^*$ given by the group U(1). As mentioned earlier, the solutions $(A, \phi)$, known as vortex solutions of (10.2.25), result from superconductivity phenomena. We elaborate this point below for this example, beginning not with the gauge potential $A$ but with an arbitrary field $B$. Thus let $B$ denote an external magnetic field. If the strength of $B$ is less than a (fixed) critical value $H_0$, then $B$ is not able to penetrate inside the superconductor.⁴ If, however, this strength exceeds $H_0$, the field can go through a kind of hole in a superconductor of type II, and there is a magnetic flux across the superconductor. We shall see that this magnetic flux can be quantized, and the pair $(A, \phi)$ can be determined by making suitable choices. For this purpose we assume that the magnetic field $B$ is in the $z$-direction. We denote by $C$ a circle in the $xy$-plane with the origin as its centre, and assume that over the circumference $\partial C$ the current $J$ is zero. From (10.2.26) (using the vector notation), we thus have:
\[ A = -\frac{i}{2e\,\phi^*\phi}\,\big(\phi^*\nabla\phi - \phi\nabla\phi^*\big). \tag{10.2.27} \]
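The flux quantization announced above can be previewed numerically. The sketch below is illustrative only and rests on assumptions flagged here: it takes the covariant derivative to be $\nabla - ieA$, so that the zero-current condition reads $A = (\phi^*\nabla\phi - \phi\nabla\phi^*)/(2ie\,\phi^*\phi)$, and it uses the test field $\phi = e^{in\theta}$ with integer winding number $n$, which is not taken from the text. The names `vector_potential`, `flux_through_circle` and `winding_field` are hypothetical helpers.

```python
import math


def vector_potential(phi, x, y, e=1.0, h=1e-6):
    """A = (phi* grad phi - phi grad phi*) / (2 i e |phi|^2),
    with the gradient of phi taken by central differences."""
    p = phi(x, y)
    dpx = (phi(x + h, y) - phi(x - h, y)) / (2 * h)
    dpy = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
    denom = 2j * e * abs(p) ** 2
    ax = (p.conjugate() * dpx - p * dpx.conjugate()) / denom
    ay = (p.conjugate() * dpy - p * dpy.conjugate()) / denom
    return ax.real, ay.real


def flux_through_circle(phi, radius, e=1.0, steps=2000):
    """Line integral of A around the circle of the given radius (midpoint rule),
    which by Stokes' theorem equals the magnetic flux through the disc."""
    total = 0.0
    dt = 2 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        ax, ay = vector_potential(phi, x, y, e)
        # tangent element dl = radius * (-sin t, cos t) dt
        total += (-ax * math.sin(t) + ay * math.cos(t)) * radius * dt
    return total


def winding_field(n):
    """Test field phi = exp(i n theta), written as (z / |z|)^n."""
    return lambda x, y: complex(x, y) ** n / abs(complex(x, y)) ** n


if __name__ == "__main__":
    n, e = 2, 0.5
    flux = flux_through_circle(winding_field(n), radius=3.0, e=e)
    print(flux, 2 * math.pi * n / e)  # the two numbers should agree closely
```

With $\phi = |\phi|e^{i\chi}$ the formula reduces to $A = \nabla\chi/e$, so the line integral only sees the total change of the phase $\chi$ around the circle, which is $2\pi n$ for a single-valued field. Under the opposite sign convention for the coupling the flux changes sign but remains an integer multiple of $2\pi/e$.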
In order to write the magnetic flux across the circle C, we use polar coordinates to express the complex scalar field