Mathematical Perspectives on Theoretical Physics: A Journey from Black Holes to Superstrings


Author: Nirmala Prakash


MATHEMATICAL PERSPECTIVES ON THEORETICAL PHYSICS

A Journey from Black Holes to Superstrings

NIRMALA PRAKASH
Visiting Scientist, Massachusetts Institute of Technology, USA


Imperial College Press

Published by Imperial College Press 57 Shelton Street Covent Garden London WC2H 9HE Distributed by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: Suite 202,1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

First published in 2000 by Tata McGraw-Hill Publishing Company Limited Copyright © 2000 by Tata McGraw-Hill Publishing Company Limited

MATHEMATICAL PERSPECTIVES ON THEORETICAL PHYSICS A Journey from Black Holes to Superstrings Copyright © 2003 by Imperial College Press All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 1-86094-364-0 ISBN 1-86094-365-0 (pbk)

Printed by Fulsland Offset Printing (S) Pte Ltd, Singapore

To the readers of the new millennium

PREFACE

This text, unlike others, did not grow out of seminar or classroom lectures; instead it grew out of the author's conviction that present-day physicists and mathematicians should know the basics of string and superstring theories, just as they know calculus, linear algebra, geometry and analysis. To reach this goal, however, the theory that is pursued at research level in selected schools has to be made available to a wider audience. This is possible only if the teaching (and a text) focuses not only on the string and superstring theories but also provides: (i) the elements of all the prerequisites; (ii) an overview of other great theories that have preceded it (since it uses their phenomenology); and (iii) a motivational thread to reach the end goal. The present text is organized to fulfill these objectives.

Since the target here is a much larger population of physicists and mathematicians, we avoid mathematical rigor. No theorems (with few exceptions) are proved in the main text; in fact, they are stated as 'Results' and 'Facts,' and their proofs (in important cases) are given as solutions to the exercises at the end of a section. This offers the reader (the teacher) the option of choosing the preferred level of in-depth/non-depth coverage (in a class). Besides this, the material is often presented using the two points of view (mathematics and physics), which makes the subject easily comprehensible.

The first six chapters provide the mathematical background needed for the theory and thus fulfill the criterion (i). Chapter 0 gives the definitions in topology, differentiable manifolds, analysis and algebraic topology, and Chapter 1 explains the basics of the theory of complex functions, Riemann surfaces, and two-dimensional conformal field theory. In Chapter 2 a quick review of group theory, which includes algebraic, topological and Lie groups, is given. A brief description of bundle theory from two points of view (mathematics and physics) is also part of this chapter. Chapter 3 is devoted to elementary operator theory with emphasis on spectral decomposition of Hermitian and unitary operators, on generalized Schrödinger and Dirac operators, and on the operators formed by the generators of the groups SU(2) and SU(3). Chapter 4 deals with the basics of finite-dimensional algebras. Solvable, semi-simple and simple Lie algebras along with their representations are studied, and objects such as weights and root systems, Weyl groups, Cartan matrices and Dynkin diagrams are defined. Chapter 5 explains the intricacies of infinite-dimensional algebras, in particular those of Kac-Moody algebras and Heisenberg algebras. Using the Dynkin indices the generalized Casimir operator is defined, and vertex operators (needed especially in string theory) are introduced.


Chapter 6, devoted to several aspects of symmetry (e.g., global and gauge) in nature and to symmetry breaking phenomena, is the first chapter that exposes the reader to particle physics. Examples based on different types of Lagrangians are used to explain various gauge theories, namely Maxwell's, Yang-Mills' and GSW's.

Chapter 7 is a brief review of all those objects that have emerged ever since the notion of supersymmetry gained credence amongst physicists and mathematicians. The chapter begins with the definitions of Z2-graded algebras (superalgebras), Lie superalgebras, Clifford algebras and spinors (Dirac, Majorana and Weyl). The concepts of supersymmetry transformations, superspace, supermanifold, superscalar field and supervector field, etc., are introduced in order to write a Lorentz-invariant super Lagrangian. The question of renormalizability is addressed, though rather briefly (by using the Wess-Zumino gauge). The form calculus on supermanifolds and Berezin integrals are also included in this chapter.

Chapter 8 gives an overview of the theories of gravitation, relativity and black holes. Beginning with Newton's laws and Einstein's free float frame, the principles of general relativity are explained. Well-known exact solutions of Einstein's equation (e.g., Schwarzschild), the singularity theorems of Penrose and Hawking's black holes are then studied.

Chapter 9 is devoted to quantum theories. Due to the vastness of the subject, it includes four appendices. It introduces the reader to principles of quantum mechanics, the so-called Schrödinger and Heisenberg pictures, and the Dirac equation in a non-relativistic as well as relativistic field. The chapter then develops into Feynman's path integral formalism. The Feynman propagator, Green's function, the action principles in quantum mechanics and some examples based on path integrals are given. In Appendix D a brief review of quantum groups is also given.

Chapter 10 is an introduction to Yang-Mills (YM) and Yang-Mills-Higgs (YMH) theories, in particular to those aspects that have resulted from applications of index theory, algebraic geometry, and algebraic topology to these two theories. A qualitative study of solutions of YM and YMH equations (known as instantons, vortices and monopoles) is carried out in this chapter. A section on anomalies is also included.

Chapter 11, devoted to strings and superstrings, introduces the reader to Regge trajectories, the Nambu-Goto action of a string, bosonic strings and their quantization, DDF operators, the No-ghost theorem, the Faddeev-Popov ghosts and ghosts in bosonic theory. Some global aspects of string world sheet, the world sheet supersymmetry and super-Virasoro operators in string theory are described. The super-Virasoro algebra, the anomaly, and superstrings as a theory of unification are briefly overviewed.

From the above descriptions it should be apparent that Chapters 6, 8, 9 and 10 are in accordance with (ii). Chapter 7, on the other hand, provides the tools for describing superstring theory. Finally, the solved exercises and examples (approximately 250 in number), more than 60 illustrations, explanatory footnotes and appendices, and a large number of references provide the motivational thread towards our main goal: learning a synthesized theory of 'Mathematical Physics of the 21st Century.'

The book fails to be 'consistent' with 'symbols.' The diversity of covered subjects made the 'consistency in symbols' rather impractical. Sometimes the same letters and notations have been used to represent different objects, whereas at other times the very same object (e.g., the hermitian conjugate, h.c.) is denoted differently. A chapterwise list of notations and adequate footnotes describing the symbols (wherever required) should help to alleviate this problem.
Since the book comprises chapters devoted to different subjects, one may have the impression that the chapters are separate entities. This, however, is not the case, as one would find a large number of cross-references spread through these chapters. The repeated references which are indicated by decimal notation (for instance, Ref. 10.[5] in Chapter 3 stands for Ref. [5] of Chapter 10) are another proof of a thread that binds the chapters. Every attempt has been made to make the book self-contained. A few concepts that remained undefined in former chapters, as well as concepts such as p-branes, D-branes, and dualities that led to the 'Second Superstring Revolution' and black holes in string theory, are explained at the end of the book in Appendix 11.B. Similarly, some recent titles of interest and the references not covered earlier are added in the form of a Reference Addendum at the end of 11.B.

The original inspiration for writing the book came from Raoul Bott's remark to the author. He thought the relation that existed between string theory and the Schwarzian derivative used by the author in her work on projective structures was worth examining. A greater motivational inspiration from Isadore Singer that led to the planning of the book followed soon after; the author is indebted to him for sponsoring a visiting position at the MIT mathematics department. The author acknowledges the encouragement received from Sigurdur Helgason, Victor G. Kac, R. A. Gangolli, J. N. Kapur, M. Sharma, Jim Eggleston, and David Ferriero, and the help from Dennis Porche in administrative matters, from Sylvie Besett, Suli Rocha, and Nini Wang in partial typing, from Hong Shu in proof-reading and preparing the references, and from Hayden library staff in searching the material during the course of this long project. Heartfelt thanks are due to Ulrich Gerlach and Scott Axelrod for giving their precious time to comment on portions of the book, as well as to those whose excellent texts and original works helped in writing the book. A conversation with Gerlach, Witten and Canizares that led to the section on black holes is gratefully acknowledged.
The author also wishes to acknowledge Giuseppe Castellacci's help in editing, Jean Morris' outstanding typing of the final version of the book, and Angela Chang's help in typing the subject index. In spite of their good work, and the author's commitment to accuracy of information, there will still be some errors, for which the latter is fully responsible; she would appreciate it if they were brought to her attention. Finally, a word of gratitude to the staff of Tata McGraw-Hill, New Delhi for their patience and continued support during the writing of the book. It is hoped that the book, in spite of its shortcomings, will be a useful addition to the scientific literature.

NIRMALA PRAKASH

FOREWORD

I would like to record a few thoughts for the ICP print. I never anticipated that the path of this book would ultimately end where it began. Geetha Nair, Scientific Editor at ICP, asked for a copy of the book when it came out in 2000. After browsing through it, she remarked that the book fulfilled the goals that it set for itself in the preface. Commissioning Editor Anthony Doyle was equally encouraging, and more. It was his patience and superb coordination with TMH and WS that brought about this unusual printing event. The past two years have seen new discoveries in Cosmology, Quantum theory and Superstrings. I was inclined to integrate these new developments; instead, I chose to add just a few references (96-105) on p. 798 as an aid to the reader, and to postpone the integration for a revised edition. The book in its present form remains a more than adequate precursor to the ever-changing landscape of scientific ideas. I hope that readers from all walks of life will find this simple-minded version of mathematical and physical theories, along with their historical notes, rather appealing. Finally, I wish to thank TMH management for graciously relinquishing their publication rights and for sharing the book with the global community, and World Scientific for bringing it under their banner. I also take this opportunity to acknowledge James H. Wiborg's generosity for a grant so that I can continue my work in these areas.

Nirmala Prakash 11.19.2002

ACKNOWLEDGEMENTS

The author gratefully acknowledges the grant of permission (with no charges) for the use of material in this book from the following publishers: Cambridge University Press, New York; Princeton University Press, Princeton; W.W. Norton, New York; John-Wiley and Sons, New York; Springer-Verlag GmbH & Co., Heidelberg; Birkhäuser-Verlag, Basel; Gordon and Breach Publishers, Switzerland; Elsevier Science, Oxford; Kluwer Academic Press, Netherlands; Imperial College Press, London; D. Reidel Publishing Co., Holland; W.H. Freeman & Co., New York; and World Scientific Publishing Co., Singapore. Ted Gerney of CUP, Loan Osborne of PUP and Sarah Feider of WWN, much against the conventional practice of their respective companies, waived the copyright fee upon the author's request. The author owes her sincere thanks to professors Raoul Bott, Loring Tu, Arthur Jaffe, Clifford Taubes, Christian Kassel, Brian Greene, M.B. Green, John H. Schwarz, Julius Wess, Ashok Das, Thomas Schücker, Frank Miller Jr., Jan Louis, and Dr Stefan Forste, who graciously permitted her to use their intellectual work, and wished this book every success.

NIRMALA PRAKASH

CONTENTS

Preface
Foreword
Acknowledgements

Chapter 0. Preliminaries
  1. Basic Definitions
  2. Topology
    Exercise (0.2)
    Hints to Exercise (0.2)
  3. Differentiable Manifolds
    3.1 Differentiable Manifolds
    3.2 Tangent Space
    3.3 Vector Fields, Tensors and Tensor Fields
    3.4 Riemannian Metric and Covariant Derivation
    3.5 Geodesics, Jacobi Fields, Curvature and Torsion
  4. Measure, Exp H, Dirac δ-function
    4.1 Measurable Spaces and Measurable Functions
    4.2 Haar Measure
    4.3 The Space Exp H
    4.4 Dirac δ-function
  5. Examples Based on Differential Geometry
    5.1 Critical Points
  6. Basic Definitions in Algebraic Topology
    6.1 de Rham Complex and de Rham Cohomology
    6.2 Category and Functors
    6.3 Mayer-Vietoris Sequence
    6.4 Homotopy
  References

Chapter 1. Complex Functions, Riemann Surfaces and Two-Dimensional Conformal Field Theory (an Introduction)
  1. Complex Functions
    1.1 Complex Plane
    1.2 Analytic Function
    1.3 Harmonic Functions
    1.4 Laurent Series


    1.5 Simply Connected and Multiply Connected Domain
    1.6 Residues and Poles
    1.7 Elliptic Curves
  2. Complex Structure on a Manifold, Kähler Metric
    2.1 Complex Manifold M
    2.2 Complex Structure on M
    2.3 The Tangent and Cotangent Spaces to M
    2.4 Holomorphic Vector Fields and Holomorphic Forms on M
    2.5 Some Calculus on M
    2.6 Kähler Manifold
    2.7 Harmonic Forms on a Kähler Manifold
    Exercise (1.2)
    Hints to Exercise (1.2)
  3. Riemann Surfaces
    3.1 Riemann Surface M
    3.2 Holomorphic Mappings on M
    3.3 Differential Forms on M, their Algebra and Calculus
    3.4 The Star (*) Operator on M
    3.5 Harmonic and Holomorphic Forms on M
    3.6 Square-integrable 1-forms on M
    3.7 Abelian Differentials on M
    3.8 A Few Results Based on Transformation Groups of M
    Exercise (1.3)
    Hints to Exercise (1.3)
  4. The Two-Dimensional Conformal Field Theory
    4.1 Conformal Group
    4.2 Light-cone Formalism and the Lorentz Group
    4.3 Euclidean Space Formalism
    4.4 Two-dimensional Conformal Group
    4.5 Möbius Transformation
    4.6 Conformal Tensor Calculus
    4.7 Conserved Currents
    Exercise (1.4)
    Hints to Exercise (1.4)
  References

Chapter 2. Elements of Group Theory and Group Representations
  1. Introduction
    1.1 Definition of a Group, Examples, and Conjugate Classes
    1.2 Invariant Subgroups, Factor Groups, Simple and Semi-simple Groups
    1.3 Products of Groups and Homomorphism
  2. Lie Groups and Topological Groups
    2.1 Topological Groups
    2.2 Algebraic Groups


    Exercise (2.2)
    Hints to Exercise (2.2)
  3. Basics of Group Representation
    3.1 Relation Between Two Representations
    3.2 Tensor Product of Representations
    Exercise (2.3)
  4. Specific Examples of Group Representation
    Exercise (2.4)
    Hints to Exercise (2.4)
  5. The Theory of Bundles and Related Objects
    Part A
    5.1 Fiber Bundle, Bundle Morphism
    5.2 Tangent Bundle
    5.3 Lie Group of Transformations, One-parameter Subgroups of a Lie Group, Killing Vector Fields
    5.4 Parallel Transport and Connection
    5.5 The Linear and the Metric Connection, and the Torsion Form
    Part B
    5.6 Connection and Curvature on a Bundle from a Different Point of View
    5.7 Associated Bundles
    5.8 Affine Bundle and Affine Connection
    5.9 Tensorial and Bundle-valued Forms
    Exercise (2.5)
    Hints to Exercise (2.5)
  References

Chapter 3. A Primer on Operators
  1. Definitions and Examples
    1.1 Properties of a Linear Operator
    1.2 Matrix Representation of a Linear Operator
    1.4 List of Operators (Commonly in Use)
    Exercise (3.1)
    Hints to Exercise (3.1)
  2. Eigenvalues and Eigenfunctions
    2.1 The Resolvent and the Spectrum of an Operator
    2.2 Examples and Results on Eigenvalues and Eigenfunctions of an Operator
    2.3 Hermitian Operators
    2.4 Properties of Commuting Operators
    Exercise (3.2)
    Hints to Exercise (3.2)
  3. Some Properties of Operators
    3.1 Projection Operators and their Properties
    3.2 More on Hermitian and Unitary Operators


    Exercise (3.3)
    Hints to Exercise (3.3)
  4. The Spectral Decomposition
    4.1 Results Based on Spectral Families of Operators
    Exercise (3.4)
    Hints to Exercise (3.4)
  5. Group Theoretic Aspects of Operators
  6. A Few Important Operators
    6.1 Laplace Operator
    6.2 The Riemannian Measure
    6.3 Operators other than Δ
    6.4 Dirac Operator
    Exercise (3.6)
    Hints to Exercise (3.6)
  7. Representations of SU(2) and SU(3) Using the Theory of Operators
    7.1 The Group SU(2)
    7.2 The Group SU(3) and its Irreducible Representations
    Exercise (3.7)
    Hints to Exercise (3.7)
  References

Chapter 4. Basics of Algebras and Related Concepts
  1. Some Definitions and Examples
    1.1 Associative, Jordan and Lie Algebras
    1.2 Lattices
    1.3 Examples of Algebras, and the *- and C*-Algebra
    1.4 Examples on Lie Algebra
    Exercise (4.1)
    Hints to Exercise (4.1)
  2. Solvable and Semi-simple Lie Algebras
    2.1 Lie Subalgebras, Ideals and Lattices
    2.2 Semi-simple and Simple Lie Algebras and their Levi Decomposition
    2.3 Lie Algebra of Derivations, Adjoint Mapping and Centralizer
    2.4 Modules, Lower and Upper Central Series
    Exercise (4.2)
    Hints to Exercise (4.2)
  3. Representations of Lie Algebras, Modules over Lie Algebras
    3.1 Representations of a Lie Algebra
    3.2 Representations via Modules over a Lie Algebra
    3.3 Nilpotent Lie Algebras
    3.4 Weight System and Roots of a Lie Algebra
    3.5 Lexicographic Ordering, Simple and Highest Root, and Highest Weight


    Exercise (4.3)
    Hints to Exercise (4.3)
  4. Universal Enveloping Algebra, Weyl Group and Cartan Matrix
    4.1 Universal Enveloping Algebra, Representations on Modules
    4.2 Root Systems and the Weyl Group
    4.3 Cartan Matrices
    4.4 Dynkin Diagram
    4.5 Casimir Element and Casimir Operator of L
    Exercise (4.4)
    Hints to Exercise (4.4)
  References

Chapter 5. Infinite-Dimensional Algebras
  1. Lie Algebras Associated to Cartan Matrices
    1.1 Generalized Cartan Matrix and its Realization
    1.2 Construction of Kac-Moody Algebra g(A) and its Universal Enveloping Algebra
    Exercise (5.1)
    Hints to Exercise (5.1)
  2. Affine Algebras: An Introduction
    2.1 Construction of Affine Algebra
    2.2 Derivations and the Affine Algebra
    2.3 The Root Decomposition of g
    2.4 Formulation of the Virasoro Algebra
    2.5 The Chevalley Basis and the Casimir Element in Terms of the Chevalley Basis
    2.6 Casimir Element of QJ
    2.7 Canonical Generators of the Affine Algebra g
    2.8 The Weyl Group of g
    Exercise (5.2)
    Hints to Exercise (5.2)
  3. Modules and Representations
    3.1 Highest Weight Modules
    3.2 The Basic Representation of the Affine Algebra
  4. Heisenberg Systems and Differential Operators
    4.1 Heisenberg Systems
    4.2 Fock Spaces Constructed from a Heisenberg System
    4.3 The Canonical Representation
    Exercise (5.4)
    Hints to Exercise (5.4)
  5. Creation and Annihilation Operators
    5.1 Creation and Annihilation Operators on Fock Spaces
    5.2 Hamiltonian in Terms of Creation and Annihilation Operators
    5.3 Operators on the Fock Space Associated to Heisenberg System


    Exercise (5.5)
    Hints to Exercise (5.5)
  6. The Vertex and Virasoro Operators
    6.1 The Vertex Operators
    6.2 The Virasoro Operators
    Exercise (5.6)
    Hints to Exercise (5.6)
  References

Chapter 6. The Role of Symmetry in Physics and Mathematics
  1. What is Symmetry?
  2. Definitions and Descriptions
  3. Exact Symmetries, Conservation Laws and Currents
    3.1 Euler-Lagrange Equations and Currents
    3.2 Conservation Law and the Conserved Charges as Generators of Symmetry Group
    3.3 Examples
  4. Gauge Symmetries: Their Origin
    4.1 A Historical Perspective
    4.2 Examples (Physicists' Point of View)
    Exercise (6.4)
    Hints to Exercise (6.4)
  5. Examples of Theories with Gauge Symmetry
    5.1 Maxwell and Yang-Mills Equations in Classical Form
    5.2 Other Important Gauge Theories; Spontaneously Broken Symmetry
    Exercise (6.5)
    Hints to Exercise (6.5)
  6. Bundle Theory Formalism in Gauge Theory
    6.1 Principal Bundles as Tools in Gauge Theory
    6.2 The Group Aut(P) of Generalized Gauge Transformations
    6.3 The Gauge Algebra of P(M, G) and the Space of Gauge Potentials on it
    6.4 The Moduli Space of Gauge Potentials on P(M, G) and Gribov-Ambiguity
  7. More on Characteristics of Gauge Theories, and Examples Based on Them
    7.1 A Generalized Maxwell's Field
    7.2 A Generalized Yang-Mills Field
    Exercise (6.7)
    Hints to Exercise (6.7)
  Table 6.1 Some Symmetries and Conservation Laws
  References

Chapter 7. All That's Super: An Introduction
  1. Graded-Algebras

    1.1 Superalgebras and Lie Superalgebras
    1.2 Other Important Superalgebras and Bose and Fermi Sectors
    Exercise (7.1)
    Hints to Exercise (7.1)
  2. The Spinors
    2.1 The Definitions and Properties of Spinors
    2.2 Clifford Algebras and Spinors
    2.3 Dirac, Majorana and Weyl Spinors
    Exercise (7.2)
    Hints to Exercise (7.2)
  3. More on Spinors
    3.1 The Poincare Superalgebra
    3.2 Lorentz Invariance
    3.3 Dirac Matrices and Dirac and Majorana Spinors
    Exercise (7.3)
    Hints to Exercise (7.3)
  4. Supersymmetry Algebras and Introduction to Superspaces
    4.1 Supernumbers
    4.2 Superanalytic Functions
    4.3 Real and Imaginary Supernumbers
    4.4 Supervector Spaces
    4.5 Supermanifolds; Charts and Atlases
    4.6 Supersymmetry Generators and Construction of Superalgebras from First Principles
    4.7 Supersymmetry Transformations on a Superspace
    Exercise (7.4)
    Hints to Exercise (7.4)
  5. The Calculus on Superspace, the Component Fields and Superfields
    5.1 Infinitesimal Generators and Covariant Vector Fields
    5.2 Component Multiplets and Superfields
    Exercise (7.5)
    Hints to Exercise (7.5)
  6. Differential Forms and Gauge Transformations on Superspaces
    6.1 Differential Forms
    6.2 The Gauge Invariant Lagrangian in Superspace
    6.3 Supergauge Transformations
    Exercise (7.6)
    Hints to Exercise (7.6)
  7. The Basics of Integration and Conformality in Superspaces
    7.1 Integration on Superspace
    7.2 Variation of a Superfield
    7.3 Superconformal Transformations
    Exercise (7.7)
    Hints to Exercise (7.7)


  Appendix 7A
    A.0 Notations and Pauli Matrices
    A.1 Standard Bases and Components of a Supervector
    A.2 Contravariant Vector-fields on Supermanifold M
    A.3 Super Lie Groups
    A.4 Conventional Super Lie Groups
    A.5 Exponential Mapping
    A.6 Conventions on Structure Constants
  References

Chapter 8. Gravitation, Relativity and Black Holes
  1. Gravitation (from Newton to Einstein) and an Overview of Special Relativity
    1.1 Newton's Theory of Gravitation and his Famous Laws
    1.2 Einstein's Proposal: the Free-float Frame and the Observer
    1.3 Acceleration and Spacetime Curvature
    1.4 The Coordinate Transformations: Distinction Between the Galilean and Special Relativity Theory
    1.5 Equations of Motion in Newtonian Mechanics
    1.6 Special Relativity
    Exercise (8.1)
    Hints to Exercise (8.1)
  2. The Einstein Universe
    2.1 The Mathematical Model
    2.2 The Matter Fields
    2.3 Postulate (a): Local Causality
    2.4 Postulate (b): Local Conservation of Energy and Momentum
    2.5 Construction of the Energy-momentum Tensor T_ab
    2.6 The Field Equations
    2.7 Postulate (c): Field Equations
    Exercise (8.2)
    Hints to Exercise (8.2)
  3. Curvature and Energy Conditions
    3.1 The Separation Vector, Vorticity, Shear and Expansion
    3.2 Energy Conditions
      3.2.1 The Weak Energy Condition
      3.2.2 The Dominant Energy Condition
    3.3 Results Based on Energy Conditions
    3.4 Conjugate Points
    3.5 Results Based on Curvature, Conjugate Points and the Expansions 9,9
    3.6 Variational Techniques
  4. Exact Solutions, and the Causal Structure
    4.1 An Exact Solution
      4.1.1 Minkowski Spacetime


      4.1.2 de Sitter and Anti-de Sitter Spacetimes
      4.1.3 Robertson-Walker Space
      4.1.4 The Schwarzschild and the Reissner-Nordstrom Solution
      4.1.5 The Kerr Solution
      4.1.6 Gödel's Universe
      4.1.7 Taub-NUT and Misner Spaces
    4.2 Causal Structure
      4.2.1 Orientability
      4.2.2 Chronological and Causal Future
      4.2.3 Horismos and Achronal Sets
      4.2.4 The Concept of Imprisonment
      4.2.5 Cauchy Developments
    Exercise (8.4)
    Hints to Exercise (8.4)
  5. The Basics of Spacetime Singularities and Black Holes
    5.1 Singularities and Completeness in Spacetime
    5.2 Black Holes
    Exercise (8.5)
    Hints to Exercise (8.5)
  Appendix 8A
    A.1 Spatially Homogeneous
    A.2 Geodesically Complete
    A.3 Normal Coordinates
    A.4 Open or Closed Universe
    A.5 Cavendish Constant Gc
    A.6 Closed Trapped Surface
    A.7 Particle Horizon
    A.8 Event Horizon
  References

Chapter 9. Basics of Quantum Theory
  1. Introduction
  2. Passage from Classical to Quantum
    2.1 The Concept of Amplitude, Observable, and Hamiltonian
    2.2 Symmetry Group of the Motion of a Particle in 1-Dimension
    2.3 Two-body Problem with Spherically Symmetric Potential
    2.4 The Radial Hamiltonian of the Two-body Problem
    2.5 The Relation between Schrödinger and Heisenberg Equations
    Table 9.2.1 Classical and Quantum Mechanics
    Exercise (9.2)
    Hints to Exercise (9.2)
  3. Quantum Mechanical Equations and Related Concepts
    3.1 Hamiltonian in a Relativistic Field, and Klein-Gordon Equation
    3.2 The Dirac Equation


    3.3 Commuting Observables for a Free Relativistic Dirac Particle
    3.4 The Relationship Between Free Klein-Gordon and Dirac Particles
    3.5 The Dirac Equation in Rest Frame
    3.6 The Feynman-Gell-Mann Reduction
    Exercise (9.3)
    Hints to Exercise (9.3)
  4. Gauge Field Quantizations
    4.1 Feynman's Functional Integral
    4.2 Functional Integral of a Scalar Field
    4.3 Green's Function and Generating Functional
    4.4 Diagram Technique for Scalar Field Theory
    4.5 Functional Integral Approach to Bose and Fermi Fields
    Exercise (9.4)
    Hints to Exercise (9.4)
  5. Path Integrals
    5.1 Path Integral via Operator Formalism
    5.2 Time Ordered Product of Operators
    5.3 Correlation Functions Using an External Source J
    5.4 Vacuum Functional Z[J] and Green's Functions in the Vacuum
    5.5 Effective Action W[J]
    5.6 Path Integral Approach to Field Theory
    5.7 Pi-formalism and Field Theories (with Infinite Degrees of Freedom)
    5.8 The Faddeev-Popov Ansatz
    Exercise (9.5)
    Hints to Exercise (9.5)
  6. Feynman Graphs
    6.1 Connected Diagrams
    6.2 Effective Functional and Feynman Graphs with Vertex-functions
    Exercise (9.6)
    Hints to Exercise (9.6)
  Appendix 9A: Language of Quantum Mechanics
    A.1 State Space, Kets and Bras, Hermitian Operators and Observables
    A.2 Position and Momentum Operators of a Particle
    A.3 Coordinate and Momentum Space Representations
    A.4 The Complete Set of Commuting Operators
  Appendix 9B: A Few Definitions and Derivations
    B.1 The Wave Function ψ in Quantum Mechanics
    B.2 The Hamiltonian Operator H(t), and the Time Evolution Operator U(t)
    B.3 Dynamical Laws
    Exercise (9B)
    Hints to Exercise (9B)
  Appendix 9C: Tools of Physical Theories
    C.1 Test Functions and Distributions


    C.2 Properties of Distributions with Respect to Operations on Them
    C.3 Green's Functions
    C.4 Fourier Transforms and Related Objects
    C.5 Functionals and their Calculus
    Exercise (9C)
    Hints to Exercise (9C)
  Appendix 9D: Quantum Groups
    D.1 Algebra, Coalgebra, Bialgebra and Hopf Algebra
    D.2 The Quantum Plane, the Algebra M_q(2), and Hopf Algebras GL_q(2), SL_q(2), U_q(sl(2))
    Exercise (9D)
    Hints to Exercise (9D)
  References

Chapter 10. Theory of Yang-Mills and The Yang-Mills-Higgs Mechanism
  1. Introduction
  2. Yang-Mills and Yang-Mills-Higgs Functional
    2.1 Yang-Mills-Higgs Action in R" and R"'1
    2.2 The Variational Equations and Solutions
    2.3 Instantons, Vortices and Monopoles
    2.4 An Example on Instantons
    2.5 An Example on Vortices
    Exercise (10.2)
    Hints to Exercise (10.2)
  3. Self-duality in Yang-Mills Theory and Instantons
    3.1 Self-duality in 4-dimensions
    3.2 Examples of Self-(Anti-Self) Dual Manifolds
    3.3 Self-dual Connection on a 4-manifold
    3.4 Self-duality in Spinor-bundles
    3.5 Quaternions and Yang-Mills' Instanton
    3.6 The Basic Instanton and its Asymptotic Form
    3.7 Anti-instanton in Asymptotic Gauge
    3.8 Application of Conformal Transformations to Basic Anti-instanton
    3.9 Construction of Multi-instantons
    3.10 Projective Spaces and Instantons
    Exercise (10.3)
    Hints to Exercise (10.3)
  4. More on Monopoles
    4.1 Yang-Mills-Higgs' Configuration Space
    4.2 Bogomolny Equations
    4.3 The Solution Space of Monopoles
    4.4 The Scattering and Spectral Curve
    4.5 The Metric on M_k
    4.6 Monopoles in Coordinate Form


    Exercise (10.4)
    Hints to Exercise (10.4)
  5. More on Vortices
    5.1 Characterization of Superconductivity
    5.2 Superconductivity and Multivortex Configuration
    5.3 Vortices when A = 1
    5.4 Some Existence Theorems in the Complex Framework
    Exercise (10.5)
    Hint to Exercise (10.5)
  6. Anomalies
    6.1 Renormalization (The Physicist's Approach)
    6.2 Anomaly as an "Obstruction" (A Mathematician's Point of View)
    6.3 Anomaly as a Welcome Phenomenon
    6.4 Chiral Gauge Theories
    6.5 Construction and Computation of an Anomaly
    Exercise (10.6)
    Hints for Exercise (10.6)
  Appendix 10A: Glossary
    A.1 The Projective Space P_n(C)
    A.2 Vector Bundles over Pn (
661

678

Contents

3.2 Solution of the Wave-equation in Reference to Strings
3.3 Conserved Currents, Linear and Angular Momentum
3.4 The Hamiltonian for Strings; Mode Expansions of T_{αβ} = 0
3.5 The Quantization in String Theory
3.6 The Fock Space and Virasoro Operators
3.7 Quantum Anomaly and Physical States
3.8 The Conformal Dimension of an Operator; Vertex Operator
3.9 Gauge Quantization Using the Light-Cone Formalism
3.10 DDF Operators and the No-ghost Theorem
3.11 The Spectrum of Physical States and its Analysis
Exercise (11.3)
Hints to Exercise (11.3)
4. Covariant Quantization from a Modern Point of View
4.1 Faddeev-Popov Ghosts and Virasoro Generators
4.2 BRST Quantization
4.3 Anomalies in Reference to String Theory
4.4 Calculation of the Virasoro Anomaly via World-sheet Methods
4.5 Ghosts in Bosonic Theory
Exercise (11.4)
Hints to Exercise (11.4)
5. A Few Important Topics in String Theory
5.1 Global Aspects of String World Sheet
5.2 Effect of Non-flat Metric on the Propagation of a String
5.3 Breakdown of Weyl Invariance and Beta Functions
5.4 Weyl Invariance and Vertex Operators
Exercise (11.5)
Hints to Exercise (11.5)
6. The Concept of Supersymmetry in String Theory
6.1 Bosonic Theory with Majorana Fermions
6.2 World-sheet Supersymmetry and Two-dimensional Superspace
6.3 The Action and the Constraint Equations on Σ = Σ(σ^α, θ^A)
6.4 Boundary Conditions and Super-Virasoro Operators
6.5 Super-Virasoro Algebra and the Anomaly
6.6 Superstrings: A Theory of Unification
Exercise (11.6)
Appendix 11A: Glossary
A.1 Ghost
A.2 Hadrons
A.3 s- and t-channel Scattering
A.4 Mandelstam Variables
A.5 Tachyons
References
Appendix 11B: Some Recent Developments in Superstrings' Theory (A Few Definitions)
B.1 Divergences

Table 1. Electromagnetic Radiations
B.2 Sobolev Spaces
B.3 K3 Surfaces, Orbifolds, and Narain Spaces
B.3.1 K3 Surfaces
B.3.2 Orbifolds
B.3.3 Narain Spaces
B.4 The Concept of Holonomy, and Calabi-Yau Manifolds
B.5 Dolbeault Cohomology on Manifolds of SU(N) Holonomy
B.6 Groups that Matter in String/Superstring Theory
B.6.1 The Groups SO(8), SO(16) and SO(32)
B.6.2 Exceptional Groups
B.7 Five Superstring Theories
Table 2. Superstring Theories in Ten Dimensions
B.8 Duality Symmetries of Superstring Theories
B.9 The BPS States and Blackholes
B.10 Cosmic Strings and Superstrings
B.11 D-branes and p-branes
B.11.1 Boundary-value Problems of Dirichlet (D) and Neumann (N)
B.11.2 Bosonic p-brane Action, and Supermembrane Action
B.12 M-Theory and its Relation to Other Theories
B.12.1 Here We Describe in Brief M-Theory and Kaluza-Klein Theory
B.12.2 Kaluza-Klein Theories: A Means to Duality
B.13 Second Quantization and String Theories
B.14 Black Holes and Elementary Particles
Appendix to (11B.14): Definitions/Explanations of Words Used in (11B.14)
Additional References
Table S (Subatomic Particles)
Concluding Note to the Reader

Symbols
Index

CHAPTER 0

PRELIMINARIES

1 BASIC DEFINITIONS

Definition 0.1.1: Let $f$ be a mapping from a set $E$ to a set $F$; for $x \in E$, the image is $f(x) = y \in F$. Sometimes we denote $f(x)$ as $f_x$; this is called the indicial notation of the mapping $f$, and $E$ is called the index set.

Definition 0.1.2: If $A$ is any subset of $E$, the mapping of $A$ into $E$ which associates with each $x \in A$ the very same element $x$ in $E$ is called the canonical mapping of $A$ into $E$. (We note that the terms variable, arbitrary element and generic element are used synonymously for the element $x$.)

Definition 0.1.3: If $f$ is a mapping of $E$ into itself and $X$ is a subset of $E$ such that $f(X) \subset X$, then $X$ is said to be stable under $f$.

Definition 0.1.4: An ordered set $E$ in which every finite non-empty subset of $E$ has a least upper bound and a greatest lower bound is called a lattice. The set of subsets of any set, ordered by inclusion, is a lattice. Every totally ordered set is a lattice.

Definition 0.1.5: A non-void set $E$ equipped with operations $(+)$ and $(\cdot)$ is called a linear space. The first operation is commutative and associative, and the second is multiplication by scalars (real or complex). If the set of scalars is real (complex) we call $E$ a real (complex) linear vector space. Its elements are called vectors. The set of scalars is called a field.

Definition 0.1.6: Let $X$ and $Y$ be linear spaces over a field $K$. The mapping $f : X \to Y$ is a linear mapping iff $f(\alpha x + \beta x') = \alpha f(x) + \beta f(x')$ for all $x, x' \in X$ and $\alpha, \beta \in K$. The set $\{x \in X : f(x) = 0\}$ is called the kernel of $f$ and is denoted $\ker f$.

2 TOPOLOGY

We list here some definitions together with the notations that are universally used in classical as well as quantum theories (see [2], [3], and [9]).

Definition 0.2.1: Let $X$ be a set and $\mathcal{U}$ a set of subsets of $X$ that is invariant under finite intersections and arbitrary unions, such that $X \in \mathcal{U}$ and $\emptyset \in \mathcal{U}$; then $\mathcal{U}$ defines a topology $T$ on $X$, and $X$ is called a topological space and is denoted $(X, T)$ (see [3] and [11]).

Although we have included in the definition the fact that $X$ and $\emptyset$ are members of $\mathcal{U}$, this already follows from the closure properties. Since $X$ is the intersection over the empty subset of $\mathcal{U}$ (i.e., $X = \bigcap_{B \in \emptyset} B$), it belongs to $\mathcal{U}$; similarly $\emptyset$ is the union of the empty subset of $\mathcal{U}$, hence $\emptyset \in \mathcal{U}$. The sets $G$ belonging to $\mathcal{U}$ are called open sets, while their complements $C = X \setminus G$ are called closed with respect to $T$. Given an arbitrary set $A \subset X$, we define the corresponding open set $\mathrm{Int}\, A = A^{\circ}$ (interior of $A$) as the union of all open subsets of $A$. The closed set $\bar{A}$, called the closure of $A$, is the intersection of all closed sets containing $A$. It is easy to note that $X$ is both an open and a closed set of the topology $T$. This observation will have a bearing on our next definition. But before that we would like to introduce another concept.

Note that every space $X$ (without any reference to topology) can have a family of subsets $\{S_\alpha\}_{\alpha \in A}$ such that $\bigcup_\alpha S_\alpha$ covers $X$; we call $\{S_\alpha\}$ a covering of $X$. If $X$ is a topological space, we may take each subset to be open, and call this an open covering of $X$. If $\{S_\beta\}_{\beta \in B}$ is another covering of $X$ and $B \subset A$, we call $\{S_\beta\}$ a subcovering. We call a topological space compact if any open covering of $X$ has a finite subcovering. A topological space that does not satisfy this property is noncompact (see Exp. 0.2.1c). A topological space which is not the union of two non-empty, open, disjoint subspaces is called connected (see Exp. 0.2.1d).

Definition 0.2.2: If $A$ and $B$ are two subsets of $X$ such that $A \subset \bar{B}$, then $B$ is said to be dense relative to $A$; if in addition $B \subset A$, it is said to be dense in $A$. If $X$ contains a countable dense subset then $X$ is called a separable topological space. A subset $S$ is dense in $X$ if $\bar{S} = X$.

Definition 0.2.3: A subset $V \subset X$ is called a neighbourhood of a point $x \in X$ if $x \in \mathrm{Int}\, V$, and $V$ is called a neighbourhood of another subset $A$ if $x \in A$ implies $x \in \mathrm{Int}\, V$.

Definition 0.2.4: A mapping $\phi$ from a topological space $(X, \mathcal{U})$ ($\mathcal{U}$ being the family of open sets) to another topological space $(Y, \mathcal{U}')$ is called continuous if for any open set $A' \in \mathcal{U}'$, $\phi^{-1}(A') \in \mathcal{U}$. Furthermore, if $\phi$ is bijective, it is called a homeomorphic mapping if $\phi^{-1}$ is also continuous.

$P_1 : (x, y) \to x + y$ is continuous on $X \times X$ into $X$

$P_2 : (\alpha, x) \to \alpha x$ is continuous on $K \times X$ into $X$

Ftn. 1: Note that we are using the product topology here, i.e., the minimal topology that makes the projections $P_1$, $P_2$ continuous (see Chapter 1 in [3]).

Since the continuity of $P_1$ and $P_2$ implies that $(x, y) \to x - y$ is continuous, it follows that every t.v.s. is a commutative topological group. Naturally, topological spaces in general and t.v.s. in particular can be provided with some geometric and analytical structure; the following definitions pertain to that.

Definition 0.2.8: The pair $(X, \rho)$ denotes a metric space if $\rho : X \times X \to R$ is a mapping that satisfies: (i) $\rho(x, y) > 0$ if $x \neq y$, $\rho(x, y) = 0$ if $x = y$; (ii) $\rho(x, y) = \rho(y, x)$; and (iii) $\rho(x, y) \le \rho(x, z) + \rho(z, y)$ for any $z \in X$. If $A$ is a subset of the metric space $(X, \rho)$, the diameter of $A$ and the distance from $x$ to $A$ are denoted and defined respectively by:

    $\mathrm{diam}\, A = \sup \{\rho(x, y) : x, y \in A\}$
    $\mathrm{dist}\,(x, A) = \inf \{\rho(x, y) : y \in A\}$.

The notation $B(x; r)$ is used to denote a closed ball centred at $x$ with radius $r > 0$, thus

    $B(x; r) = \{y \in X : \rho(x, y) \le r\}$.

The subset of this ball given by $\{y \in X : \rho(x, y) < r\}$ is called an open ball and is denoted $B_0(x; r)$. A set $U$ in the space $X$ is said to be open if for each point $x \in U$ there exists an open ball centred at $x$ and contained in $U$. It is easy to verify that this defines a topology on $X$. Given a topological space $X$, we call it metrizable if there exists a metric $\rho$ such that the open balls form a basis for the topology; such a metric is said to be compatible with the topology on $X$. When $X$ is a metrizable t.v.s., a compatible metric $\rho$ can be chosen to be translation invariant, i.e., $\rho(x, y) = \rho(x + z, y + z)$ for $x, y, z \in X$. Recall that the symbol $\bar{A}$ denotes the closure of $A$ in $X$. If $X$ has a topology $T$ other than the one induced by the metric, we use $\bar{A}^{T}$ to denote the closure of $A$ in $(X, T)$.

Definition 0.2.9: A subset $C \subset X$ of a t.v.s. is convex if for any $x, y \in C$, $\lambda x + (1 - \lambda) y \in C$, where $\lambda \in [0, 1]$. Given $A \subset X$, the convex hull of $A$, denoted $\mathrm{conv}\, A$, is the smallest convex subset of $X$ that contains $A$, thus:

    $\mathrm{conv}\, A = \bigcap \{K \subset X : K \supset A,\ K \text{ is convex}\}$.

Thus $x \in \mathrm{conv}\, A$ if and only if $x = \sum_{i=1}^{n} \lambda_i x_i$ where $x_i \in A$, $\lambda_i \ge 0$ and $\sum_{i=1}^{n} \lambda_i = 1$. The closure of $\mathrm{conv}\, A$ is denoted $\overline{\mathrm{conv}}\, A$. A well known theorem states: If $A$ is compact so is $\overline{\mathrm{conv}}\, A$.

Definition 0.2.10: A real valued mapping on a vector space $X$ defined over $K$ ($K$ stands for the real or complex number field) is called a norm (denoted $\|\ \|$) if it satisfies: (a) $\|\lambda x\| = |\lambda|\, \|x\|$ for all $\lambda \in K$, $x \in X$; (b) $\|x + y\| \le \|x\| + \|y\|$; and (c) $\|x\| = 0$ implies $x = 0$. A normed space is thus a pair $(X, \|\ \|)$. It is easy to check that a distance function between a pair of elements can be defined on this space by the rule $(x, y) \to \|x - y\|$. Consequently the vector space $X$ can be given a topology. Hence a normed vector space is a topological vector space.

Definition 0.2.11: In a normed linear space $X$, a sequence $\{x_n\}$ is said to be convergent if there exists an element $x$ in $X$ such that $\|x_n - x\| \to 0$. The sequence $\{x_n\}$ is said to converge to the element $x$.

Definition 0.2.12: A sequence $\{x_n\}$ in $X$ is said to be a Cauchy sequence if given $\varepsilon > 0$ we can find an integer $N(\varepsilon)$ such that $\|x_n - x_m\| < \varepsilon$ for all $n, m > N(\varepsilon)$. Evidently every convergent sequence is a Cauchy sequence, but the converse is not always true.

Definition 0.2.13: A normed linear space in which every Cauchy sequence is a convergent sequence is said to be complete, and it is called a Banach space after the Polish mathematician Stefan Banach.

One of the important spaces used in quantum theories is the Hilbert space. In order to define it, we first describe the space from which it follows.

Pre-Hilbert space: If $H$ is a (complex or real) vector space and $\langle \cdot, \cdot \rangle$ is a non-degenerate scalar product on $H$, then we call the pair $(H, \langle \cdot, \cdot \rangle)$ a vector space with scalar product or a pre-Hilbert space. Every pre-Hilbert space carries a norm in a natural way, the norm being $\|f\| = \langle f, f \rangle^{1/2}$ where $f \in H$. If in addition every Cauchy sequence is convergent, then this space in view of Def. (0.2.13) is complete. A complete pre-Hilbert space is called a Hilbert space (see Def. (0.2.14)).

Examples of Norm: In $C^m$ (or $R^m$) define the norms as

    (i)   $\|f\|_1 = \sum_{j=1}^{m} |f_j|$
    (ii)  $\|f\|_\infty = \max \{|f_j| : j = 1, \ldots, m\}$
    (iii) $\|f\| = \left( \sum_{j=1}^{m} |f_j|^2 \right)^{1/2}$

The last of these gives the Euclidean length of the vector $f = (f_1, \ldots, f_m) \in C^m$ ($R^m$), and $\|f - g\|$ here is the Euclidean distance of the points $f$ and $g$.

Let $(H, \langle \cdot, \cdot \rangle)$ be a pre-Hilbert space. A family $M = \{e_\alpha : \alpha \in A\}$ ($A$ an index set) of elements from $H$ is called an orthonormal system (ONS) if $\langle e_\alpha, e_\beta \rangle = \delta_{\alpha\beta}$ for $\alpha, \beta \in A$.

An orthonormal system $M$ is called an orthonormal basis (ONB) of a subspace $T$ of $H$ if $M$ is total in $T$ (i.e., $M \subset T$ and $\overline{L(M)} \supset T$) (Ftn. 2). If $M$ is an ONB of $H$, then $M$ is called an orthonormal basis of $H$.
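The three norms listed under Examples of Norm and the orthonormality condition $\langle e_\alpha, e_\beta \rangle = \delta_{\alpha\beta}$ can be checked numerically on small vectors. The following is a minimal sketch in Python; the function names and the sample vectors are illustrative assumptions, and the scalar product is taken to be linear in the first argument, as in Def. (0.2.14).

```python
import math

def norm_1(f):
    """Norm (i): sum of the moduli of the components."""
    return sum(abs(c) for c in f)

def norm_inf(f):
    """Norm (ii): largest modulus among the components."""
    return max(abs(c) for c in f)

def norm_2(f):
    """Norm (iii): Euclidean length."""
    return math.sqrt(sum(abs(c) ** 2 for c in f))

def inner(x, y):
    """Scalar product on C^m, linear in x and conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def is_orthonormal(vectors, tol=1e-12):
    """Check <e_a, e_b> = delta_ab for every pair in the family."""
    return all(abs(inner(u, v) - (1 if i == j else 0)) < tol
               for i, u in enumerate(vectors)
               for j, v in enumerate(vectors))

f = [3, -4j, 0]
print(norm_1(f), norm_inf(f), norm_2(f))  # 7.0 4.0 5.0

s = 1 / math.sqrt(2)
e1, e2 = [s, 1j * s], [s, -1j * s]        # a sample ONS in C^2
print(is_orthonormal([e1, e2]))           # True
```

For this vector the three values also illustrate the general chain $\|f\|_\infty \le \|f\| \le \|f\|_1$.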

Definition 0.2.14: A complete normed space whose norm $\|\ \|$ is given by a scalar product $\langle \cdot, \cdot \rangle$ is called a Hilbert space. More explicitly, a space $X$ is Hilbert if to each pair of elements $x, y$ in $X$ there is associated a scalar $\langle x, y \rangle$ that satisfies:

    (i)   $\langle a_1 x_1 + a_2 x_2, y \rangle = a_1 \langle x_1, y \rangle + a_2 \langle x_2, y \rangle$
    (ii)  $\langle x, y \rangle = \overline{\langle y, x \rangle}$ (the complex conjugate)
    (iii) $\langle x, x \rangle = \|x\|^2$ is positive definite when $x \neq 0$.

Ftn. 2: $\overline{L(M)}$ = closure of the linear hull $L(M)$ (the set of finite linear combinations of elements of $M$, or in other words, the smallest subspace of $H$ that contains $M$).

The scalar product defined above is called a positive definite Hermitian form. It is usual to denote a Hilbert space by $\mathcal{H}$. We shall use this notation throughout the text. It is important to note here that the elements of $\mathcal{H}$ are arbitrary (e.g., real or complex numbers, or real valued or complex valued functions) and that the scalar product (as is obvious from (ii)) is not necessarily real. It can also be checked that the norm and the scalar product on $\mathcal{H}$ are related by the Schwarz inequality:

    (iv) $|\langle x, y \rangle| \le \|x\|\, \|y\|$

and by the polarization identity:

    (v) $4 \langle x, y \rangle = \|x + y\|^2 - \|x - y\|^2 + i \|x + iy\|^2 - i \|x - iy\|^2$.
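Relations (iv) and (v) are easy to test on concrete vectors. The sketch below uses $C^3$ with the standard scalar product, linear in the first argument; the helper names and the sample vectors are assumptions made only for illustration.

```python
def inner(x, y):
    """Scalar product on C^m, linear in x and conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm <x, x>, which is real by property (ii)."""
    return inner(x, x).real

def polarization(x, y):
    """Right-hand side of identity (v), divided by 4."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    plus_i = [a + 1j * b for a, b in zip(x, y)]
    minus_i = [a - 1j * b for a, b in zip(x, y)]
    return (norm_sq(plus) - norm_sq(minus)
            + 1j * norm_sq(plus_i) - 1j * norm_sq(minus_i)) / 4

x = [1 + 2j, -1j, 3]
y = [2 - 1j, 4, 1j]

lhs = inner(x, y)                 # <x, y> computed directly
rhs = polarization(x, y)          # <x, y> recovered from norms alone
schwarz_ok = abs(lhs) ** 2 <= norm_sq(x) * norm_sq(y)

print(lhs, rhs, schwarz_ok)
```

The point of (v) is visible in the code: `polarization` never calls `inner` on two different vectors, so the scalar product is completely determined by the norm.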

The notion of completeness in $\mathcal{H}$ means that if a sequence of elements $\{\phi_n\}$ in $\mathcal{H}$ satisfies the condition $\|\phi_n - \phi_m\| \to 0$ for $m, n \to \infty$, then there exists an element $\phi$ in $\mathcal{H}$ such that $\|\phi_n - \phi\| \to 0$ for $n \to \infty$. In this context we would like to mention that there are two types of convergence in $\mathcal{H}$ (in fact in any normed linear space), the strong and the weak. The sequence of elements $\{\phi_n\}$ converges to the element $\phi$ strongly if, as $n \to \infty$,

    $\|\phi_n - \phi\| \to 0$

and weakly to an element $\phi'$ if for each element $\psi \in \mathcal{H}$

    $\langle \phi_n, \psi \rangle \to \langle \phi', \psi \rangle$.

The "strong convergence" is also called "convergence in the mean" and it implies:

    $\|\phi_n\| \to \|\phi\|$.

Evidently strong convergence implies weak convergence. Let $u, v \in \mathcal{H}$; we say that these elements are orthogonal if $\langle u, v \rangle = 0$. Suppose that $H$ is a subset of $\mathcal{H}$; the set of all elements $u \in \mathcal{H}$ for which $\langle u, v \rangle = 0$ for every $v \in H$ forms a subspace of $\mathcal{H}$, which we denote by $H^{\perp}$. If the subset $H$ is the whole space $\mathcal{H}$, then the space $\mathcal{H}_0$ consisting of all elements $u$ in $\mathcal{H}$ such that $u \in \mathcal{H}^{\perp}$ is called the null space of the Hermitian product. We shall in general deal with a separable Hilbert space. Its dimension is either finite or denumerably infinite. In the former case, we shall sometimes use the familiar name unitary space of dimension $n$. Even when the space is infinite dimensional, an orthonormal basis can be obtained, since it can be shown that there always exists an infinite complete orthonormal sequence, which can be obtained by applying the Gram-Schmidt orthogonalization process to a complete denumerable subset of $\mathcal{H}$.

We give below a few examples to illustrate the objects that have been defined above. The number of each example is that of the definition it illustrates, with an additional letter a, b, etc.

Example 0.2.1a: Let $X = R$ be the set of real numbers. Let a subset $U$ of $R$ be called open if for each point $x$ in $U$ there exists an open interval $I$ containing $x$ and contained in $U$. Obviously $R$ with the set $\mathcal{U}$ formed by the open sets $U$ becomes a topological space.

Example 0.2.1b: Let $X = \{a, b, c, d, e, f\}$ and consider the subsets $\{a, b, c, d\}$, $\{a, b, e, f\}$, $\{c, d, e, f\}$, $\emptyset$, $X$ (assumed open). As listed, this family is not closed under intersections, since for instance $\{a, b, c, d\} \cap \{a, b, e, f\} = \{a, b\}$; let $\mathcal{U}$ be the family obtained by adjoining the pairwise intersections $\{a, b\}$, $\{c, d\}$, $\{e, f\}$. It can be easily checked that $X$ together with $\mathcal{U}$ forms a topological space.

Example 0.2.1c: Among the simplest examples, a closed disc is compact, while an open disc and the plane extended to infinity on either side are non-compact, as shown in Fig. (0.1). Note that the topological space of (0.2.1a) is non-compact and that of (0.2.1b) is compact.
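For a finite set such as the one in Example 0.2.1b, the axioms of Def. (0.2.1) can be verified mechanically. The sketch below is an illustration only (the function names are assumptions): it tests the three four-element sets of the example together with $\emptyset$ and $X$, and then computes the smallest topology that contains them.

```python
from itertools import combinations

def is_topology(points, family):
    """Check the open-set axioms for a finite family of subsets of `points`."""
    opens = {frozenset(s) for s in family}
    if frozenset() not in opens or frozenset(points) not in opens:
        return False
    # On a finite set, closure under pairwise unions and intersections suffices.
    return all((a | b) in opens and (a & b) in opens
               for a, b in combinations(opens, 2))

def generated_topology(points, family):
    """Smallest topology containing `family` (close under unions and intersections)."""
    opens = {frozenset(s) for s in family} | {frozenset(), frozenset(points)}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(opens), 2):
            for c in (a | b, a & b):
                if c not in opens:
                    opens.add(c)
                    changed = True
    return opens

X = set("abcdef")
listed = [set("abcd"), set("abef"), set("cdef"), set(), X]

print(is_topology(X, listed))        # False: {a, b} is missing
tau = generated_topology(X, listed)
print(is_topology(X, tau), len(tau)) # True 8
```

The eight open sets found are $\emptyset$, $X$, the three four-element sets, and their pairwise intersections $\{a, b\}$, $\{c, d\}$, $\{e, f\}$.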

Fig. (0.1): Open and closed discs, and extended plane. (Panels: $|x| < r$, open disc; $|x| \le r$, closed disc; plane extended to infinity on either side; $x = (x_1, x_2)$.)

Example 0.2.1d: Connected and disconnected topological spaces are visually represented as follows.

Fig. (0.2): Connected and disconnected topological spaces.

Also note that an open, half open, or closed interval $I \subset R$ is always connected.

Example 0.2.2a: The set $Q$ of rationals is dense in the set $R$ of real numbers.

Example 0.2.3a: An open interval $(a, b)$ is a neighbourhood of each of its points. A closed interval $[a, b]$ is a neighbourhood of each point of $(a, b)$. As can be easily seen, $[a, b]$ is not a neighbourhood of $a$ or $b$.

Example 0.2.3b: The set $R$ of real numbers is a neighbourhood of each of its points. The set $Q$ of rational numbers is not a neighbourhood of any of its points.

Example 0.2.5a: If the set $\mathcal{U}$ of subsets in Exp. (0.2.1b) is replaced by $[\{a\}, \{b\}, \ldots, \{f\};\ \{a, b\}, \ldots;\ \{a, b, c\}, \ldots;\ \{a, b, c, d\}, \ldots;\ \{a, b, c, d, e\}, \ldots;\ \emptyset, X]$, the topology defined will be the discrete topology.

Example 0.2.6a: Every metric space is Hausdorff, for if $d$ is a metric on $X$ and $d(a, b) = \delta > 0$ for $a \neq b$, then the sets defined as $V_a := \{x \mid d(a, x) < \delta/2\}$, $V_b := \{x \mid d(b, x) < \delta/2\}$ are disjoint neighbourhoods of the points $a$, $b$ in $X$.

Example 0.2.6b: A topological space endowed with the discrete topology is a Hausdorff space.

Example 0.2.7a: The cartesian product $R^n = R \times \cdots \times R$ over the field of reals is the simplest example of a topological vector space; the vector in this case is $x = (x_1, \ldots, x_n)$, and the mappings $P_1$ and $P_2$ are pointwise addition and scalar multiplication of vectors in $R^n$.

Example 0.2.7b: A similar example is offered by the collection of $n \times n$ matrices defined over the field of real or complex numbers.

Example 0.2.8a: Let $X = R^2$ be a topological space whose metric is the Euclidean distance:

    $\rho(x, y) := \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$.

Denote it by $(X, T)$. Define another metric $\rho'$ on $X = R^2$ given by:

    $\rho'(x, y) := \max \{|x_1 - y_1|, |x_2 - y_2|\}$.

It can be easily verified that both $\rho$ and $\rho'$ given above satisfy the postulates of a metric on $R^2$ and that the two topologies defined by them are the same, i.e., $T(\rho) = T(\rho')$, in the sense that every $\delta$-ball constructed for $T(\rho)$ contains a ball of $T(\rho')$ with the same centre and vice versa (indeed $\rho' \le \rho \le \sqrt{2}\, \rho'$).

Example 0.2.10a: Let $X$ be the space of continuous functions that are defined on the interval $(a, b)$ and let $m$ denote a general measure on $(a, b)$. The mapping

    $f \to \left( \int_a^b |f(x)|^p\, dx \right)^{1/p}$  (more generally $\left( \int |f(x)|^p\, dm \right)^{1/p}$)

defines a norm for $1 \le p < \infty$. The space $X$ is called an $L^p$-space.

Example 0.2.11a: Let $X$ be the real line $R$. The sequence defined by $f_n(x) = \frac{\sin nx}{\sqrt{n}}$ ($x \in R$, $n = 1, 2, 3, \ldots$) is a convergent sequence, since $f(x) = \lim_{n \to \infty} f_n(x) = 0$. But the sequence of derivatives given by $f'_n(x) = \sqrt{n} \cos nx$ is not convergent to $f'$, since $f'_n(0) = \sqrt{n} \to \infty$ whereas $f'(0) = 0$.
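The behaviour in Example 0.2.11a can be observed numerically. The short sketch below is illustrative only: the sample grid on $[-3, 3]$ and the chosen values of $n$ are assumptions. It evaluates the bound $|f_n(x)| \le 1/\sqrt{n}$ and the value $f'_n(0) = \sqrt{n}$.

```python
import math

def f_n(n, x):
    """f_n(x) = sin(n x) / sqrt(n)."""
    return math.sin(n * x) / math.sqrt(n)

def df_n(n, x):
    """Derivative f_n'(x) = sqrt(n) cos(n x)."""
    return math.sqrt(n) * math.cos(n * x)

grid = [k * 0.01 for k in range(-300, 301)]   # sample points in [-3, 3]

for n in (1, 100, 10000):
    sup_f = max(abs(f_n(n, x)) for x in grid)
    # sup |f_n| shrinks like 1/sqrt(n), while f_n'(0) grows like sqrt(n)
    print(n, sup_f <= 1 / math.sqrt(n), df_n(n, 0.0))
```

The functions converge to zero uniformly, yet their derivatives at the origin are unbounded, so convergence of a sequence says nothing about convergence of the derivatives.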

Example 0.2.13a: The vector space $R^1$ and the space $C(K)$ of continuous functions defined on a compact set $K$ of $R^1$ are complete spaces (obviously this remains true when 1 is replaced by $n$).

Example 0.2.13b: The space of square summable functions defined on a measurable space (Exp. 0.2.10a for $p = 2$) is a complete space.

Example 0.2.14a: The space formed by arbitrary sequences (
Exercise 0.2

1. Show that the scalar product defined by (i) and (ii) in Def. (0.2.14) is a sesquilinear form, i.e., it is a map

    $X \times X \to C$, $(x, y) \to \langle x, y \rangle$

which is linear in the first variable and semilinear in the second variable ($\langle x, \alpha y \rangle = \bar{\alpha} \langle x, y \rangle$).

2. Let the function $u \to \|u\|$ be a seminorm on $\mathcal{H}$, i.e., $\|u\| \ge 0$ for any $u \in \mathcal{H}$. Show that $\|u\| = 0$ if and only if $u \in \mathcal{H}_0$, the null space corresponding to the Hermitian product.

3. Prove property (iv) of Def. (0.2.14), taking the Hilbert space to be real.

4. Let $(X, \{X_\alpha\}, T)$ be a topological space and let $Y$ be a subset of $X$. The topology formed by the open sets $O_\alpha = Y \cap X_\alpha$ of $Y$ is called a subspace (or relative) topology on $Y$ and is denoted $T_Y$. Show that the real line $R \subset R^2$ has the relative topology which is induced by the usual topology on $R^2$ given by the Euclidean metric.

5. Show that the sequence $\{\sin nx\}$, $0 \le x \le \pi$, converges weakly to 0 but does not converge in the mean.

6. Given an orthonormal sequence $\{\phi_n\}$ and numbers $a_n$ such that the series $\sum_{n=1}^{\infty} |a_n|^2$ converges, show that the series $\sum_{n=1}^{\infty} a_n \phi_n$ will converge in the mean to an element $g$ of the $L^2$-space, where $g$ satisfies $\langle g, \phi_n \rangle = a_n$ ($n = 1, 2, 3, \ldots$) (Riesz-Fischer theorem).

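Hints to Exercise 0.2

1. Since $\langle x, y \rangle = \overline{\langle y, x \rangle}$, we have $\langle x, a_1 y_1 + a_2 y_2 \rangle = \overline{\langle a_1 y_1 + a_2 y_2, x \rangle}$, and linearity in the first entry implies

    $\overline{a_1 \langle y_1, x \rangle + a_2 \langle y_2, x \rangle} = \bar{a}_1 \overline{\langle y_1, x \rangle} + \bar{a}_2 \overline{\langle y_2, x \rangle} = \bar{a}_1 \langle x, y_1 \rangle + \bar{a}_2 \langle x, y_2 \rangle$.

3. Consider the function $\phi : R \to R$ defined by $\phi(t) = \|x + ty\|^2 = \langle x + ty, x + ty \rangle = \langle x, x \rangle + 2t \langle x, y \rangle + t^2 \langle y, y \rangle$. Since $\phi(t) \ge 0$ for every real $t$, the discriminant of this quadratic in $t$ cannot be positive, which gives $|\langle x, y \rangle|^2 \le \|x\|^2\, \|y\|^2$.

4. Note that $R \subset R^2$ is obtained by identifying a point $x \in R$ with the point $(x, 0) \in R^2$. The topology $T$ on $R^2$ is given by the open balls of $R^2$. A basis for $R$ can be taken by considering bounded open intervals; it is easy to note that every element of this basis, i.e., a bounded open interval $(a, b)$, corresponds to the set $S$ of points on the $x$-axis in $R^2$ between $(a, 0)$ and $(b, 0)$. Let $D$ be the open disc with centre at $\left( \frac{a + b}{2}, 0 \right)$ and radius $\frac{b - a}{2}$; clearly $D$ is open in $R^2$ and $S = D \cap R$, and so $S$ is open in the relative topology $T_R$ on $R$.

(See Chapters 3 and 4 of 3.[17] for the Hints of Exercises 5 and 6.)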

3 DIFFERENTIABLE MANIFOLDS

3.1 Differentiable Manifolds

To define a differentiable manifold we need to define the ingredients that go into its definition.

Definition 0.3.1: An $n$-dimensional topological manifold $M$ is a Hausdorff topological space such that every point in $M$ has a neighbourhood homeomorphic to $R^n$.

Definition 0.3.2: Each pair $(U, \phi)$, where $U$ is an open set of $M$ and $\phi$ is a homeomorphism of $U$ to an open subset $U'$ of $R^n$, is called a coordinate neighbourhood or a chart.

Using the homeomorphism $\phi$ one can assign to each $q \in U$ the $n$ coordinates $x^1(q), \ldots, x^n(q)$ of its image $\phi(q)$ in $R^n$; each $x^i$ (referred to as the $i$-th coordinate function) is a real-valued function on $U$. If $q$ also belongs to another coordinate neighbourhood $(V, \psi)$, then it has coordinates $y^1(q), \ldots, y^n(q)$ as well, and the two sets of coordinates are related by the homeomorphism

    $\psi \circ \phi^{-1} : \phi(U \cap V) \to \psi(U \cap V)$.    (0.3.1)

The domain and range of the mapping $\psi \circ \phi^{-1}$ are the two open subsets of $R^n$ that correspond to the points of $U \cap V$ by the coordinate maps $\phi$ and $\psi$ respectively. The homeomorphisms $\psi \circ \phi^{-1}$ and $\phi \circ \psi^{-1}$ are given in coordinates by

    $y^i = f^i(x^1, \ldots, x^n)$    (0.3.2)(a)
    $x^i = g^i(y^1, \ldots, y^n)$.    (0.3.2)(b)

To define a differentiable manifold, we basically need to select a family or subcollection of neighbourhoods so that the change of coordinates is always given by differentiable functions (i.e., $f^i$, $g^i$ for each $i$ are differentiable).

Definition 0.3.3: Two charts $(U, \phi)$ and $(V, \psi)$ are said to be $C^k$-compatible if $U \cap V \neq \emptyset$ implies that the functions $f^i(x)$ and $g^i(y)$ giving the change of coordinates are $C^k$ (Ftn. 3).

Definition 0.3.4: A $C^k$-structure on a topological manifold $M$ is a family $\mathcal{U} = \{U_\alpha, \phi_\alpha\}$ of charts (Ftn. 4) such that: (i) the sets $U_\alpha$ cover $M$; (ii) for any $\alpha, \beta$ the pairs $(U_\alpha, \phi_\alpha)$ and $(U_\beta, \phi_\beta)$ are $C^k$-compatible; (iii) any chart $(V, \psi)$ compatible with every $(U_\alpha, \phi_\alpha) \in \mathcal{U}$ is itself in $\mathcal{U}$.

Fig. (0.3): Coordinate neighbourhoods and maps.

A (differentiable) $C^k$-manifold is a topological manifold which carries a $C^k$-structure. When $k$ is $r$, $\infty$ or $\omega$, we call it a $C^r$, $C^\infty$ or $C^\omega$ differentiable manifold. A $C^\infty$-manifold is generally referred to as a smooth manifold and a $C^\omega$-manifold is known as a real-analytic manifold. When $R^n$ is replaced by $C^n$ and the mappings $\phi_\alpha \circ \phi_\beta^{-1}$ are required to be holomorphic, $M$ is called a complex manifold.

Ftn. 3: $k$ stands for differentiability of order $r$, $\infty$ and $\omega$, and the collections of neighbourhoods (also denoted $C^k$) resulting from these choices satisfy $C^r \supset C^\infty \supset C^\omega$.

Ftn. 4: Chart = local coordinate system = coordinate neighbourhoods and maps.

3.2 Tangent Space

Let $M$ be a smooth ($C^\infty$-) manifold and let $C^\infty(p)$ denote the set of "germs" of smooth functions (Ftn. 5) at a point $p$ in $M$.

Definition 0.3.5: The tangent space $T_p(M)$ to $M$ at $p$ is the set of all mappings $X_p : C^\infty(p) \to R$ that satisfy, for all $\alpha, \beta \in R$ and $f, g \in C^\infty(p)$, the following two conditions:

    (i)  $X_p(\alpha f + \beta g) = \alpha (X_p f) + \beta (X_p g)$    (linearity)
    (ii) $X_p(fg) = (X_p f)\, g(p) + f(p)\, (X_p g)$    (Leibniz rule)    (0.3.3)

along with the vector space structure defined as:

    $(X_p + Y_p) f = X_p f + Y_p f$
    $(\alpha X_p) f = \alpha (X_p f)$.    (0.3.4)

A tangent vector to $M$ at $p$ is any $X_p \in T_p(M)$. A few facts resulting from the above definitions are noteworthy.

Fact 0.3.6: Given a $C^\infty$-map $F : M \to N$ (from a $C^\infty$-manifold $M$ to a $C^\infty$-manifold $N$), there exists a map $F_* : T_p(M) \to T_{F(p)}(N)$, defined by $F_*(X_p) f = X_p(F^* f)$, where the map $F^* : C^\infty(F(p)) \to C^\infty(p)$ (Ftn. 6) is defined by $F^*(f) = f \circ F$. The map $F^*$ is a homomorphism of algebras (of smooth functions) responsible for inducing the vector space homomorphism $F_*$, which gives $F_*(X_p)$ as a map of $C^\infty(F(p))$ to $R$. When $F : M \to M$ is the identity map, both $F^*$ and $F_*$ are identity isomorphisms. If $H = G \circ F$ is a composition of $C^\infty$-maps, then $H^* = F^* \circ G^*$ and $H_* = G_* \circ F_*$. The homomorphism $F_* : T_p(M) \to T_{F(p)}(N)$ is called the differential of $F$ and is sometimes denoted $dF$, $DF$ or $F'$.

Fact 0.3.7: To each coordinate neighbourhood $U$ on an $n$-dimensional manifold $M$ there corresponds a natural basis $e_1, \ldots, e_n$ of $T_p(M)$ for every $p \in U$; in particular $\dim T_p(M) = n$.

Fact 0.3.8: Given $F$, $M$ and $N$ as in Fact (0.3.6), the rank of $F$ at $p$ is the dimension of the image $F_*(T_p(M))$. $F_*$ is an isomorphism "into" if and only if this rank is the dimension of $M$; it is "onto" if and only if the rank equals $\dim N$.
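Conditions (0.3.3) can be made concrete in the simplest case $M = R^n$, where a tangent vector at $p$ acts as a directional derivative, $X_p f = \frac{d}{dt} f(p + tv)\big|_{t=0}$. The sketch below is an illustration under that assumption, restricted to polynomial functions. The `Dual` class is a hypothetical helper: it implements numbers $a + b\varepsilon$ with $\varepsilon^2 = 0$, so the derivative is computed exactly.

```python
class Dual:
    """Number a + b*eps with eps**2 = 0; the eps part carries a first derivative."""
    def __init__(self, val, eps=0.0):
        self.val, self.eps = val, eps

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = Dual._lift(other)
        return Dual(self.val + o.val, self.eps + o.eps)
    __radd__ = __add__

    def __mul__(self, other):
        o = Dual._lift(other)
        # product rule: (a + b eps)(c + d eps) = ac + (ad + bc) eps
        return Dual(self.val * o.val, self.val * o.eps + self.eps * o.val)
    __rmul__ = __mul__

def tangent_vector(p, v):
    """Return X_p: f -> d/dt f(p + t v) at t = 0, for polynomial f on R^n."""
    def X_p(f):
        return f(*[Dual(pi, vi) for pi, vi in zip(p, v)]).eps
    return X_p

f = lambda x, y: x * x * y + 3 * y
g = lambda x, y: x + 2 * x * y
fg = lambda x, y: f(x, y) * g(x, y)

p, v = (1.0, 2.0), (0.5, -1.0)
X_p = tangent_vector(p, v)

leibniz_lhs = X_p(fg)                                   # X_p(fg)
leibniz_rhs = X_p(f) * g(*p) + f(*p) * X_p(g)           # (X_p f) g(p) + f(p) (X_p g)
print(X_p(f), X_p(g), leibniz_lhs, leibniz_rhs)         # -2.0 0.5 -6.0 -6.0
```

Linearity, condition (i), can be checked the same way by applying `X_p` to a linear combination of `f` and `g`.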

3.3 Vector Fields, Tensors and Tensor Fields

Definition 0.3.9: A tangent vector field $X$ on $M$ is a function that assigns to each point $p$ of $M$ an element $X_p$ of $T_p(M)$. The domain of this function is the whole of $M$ and the range is the set $T(M)$ consisting of all tangent vectors at all points of $M$; evidently $T(M) = \bigcup_{p \in M} T_p(M)$.

Definition 0.3.10: A multilinear map $\Phi$ defined on a vector space $V$ and its dual $V^*$ as

    $\Phi : \underbrace{V \times \cdots \times V}_{r} \times \underbrace{V^* \times \cdots \times V^*}_{s} \to R$    (0.3.5)

is called a tensor of covariant order $r$ and contravariant order $s$. For a fixed $(r, s)$, the collection of all such tensors is denoted $T^r_s(V)$, and since, as real-valued functions, the elements of $T^r_s(V)$ can be added and multiplied by scalars (elements of $R$), it follows that $T^r_s(V)$ is a vector space. In particular, when $s = 0$ it is denoted by $T^r(V)$, and when $r = 0$ it is $T_s(V)$. If $V$ is $n$-dimensional, the dimension of $T^r_s(V)$ (as can be easily checked) is $n^{r+s}$. If $\{e_1, \ldots, e_n\}$ is a basis of $V$, we note that $\Phi \in T^r(V)$ is completely determined by its $n^r$ values on the basis vectors. These $n^r$ numbers $\{\Phi(e_{i_1}, \ldots, e_{i_r})\}$ are called the components of $\Phi$ in the basis $\{e_1, \ldots, e_n\}$ and are denoted (in indicial notation) as $\Phi_{i_1 \cdots i_r}$. Similarly, if $\{\omega^1, \ldots, \omega^n\}$ is the basis of $V^*$, we can obtain the components of $\Phi \in T^r_s(V)$ as:

    $\Phi_{i_1 \cdots i_r}^{j_1 \cdots j_s} = \Phi(e_{i_1}, \ldots, e_{i_r}, \omega^{j_1}, \ldots, \omega^{j_s})$.

Clearly if $V$ is replaced by $T_p(M)$, we have tensors on the tangent space $T_p(M)$ of the manifold $M$ at the point $p$. This leads to our next definition.

Ftn. 5: Smooth functions at $p$ or on $M$ are usually denoted as $\mathcal{F}(p)$, $\mathcal{F}(M)$. Note that $C^\infty(p)$ is the algebra of $C^\infty$-functions whose domain of definition includes some open neighbourhood of $p$, with functions identified if they agree on any of those neighbourhoods; these functions are called "germs" of $C^\infty$-functions.

Ftn. 6: $C^\infty(p)$ ($C^\infty(F(p))$) = smooth functions at $p \in M$ ($F(p) \in N$).

Definition 0.3.11: A $C^\infty$-covariant tensor field of order $r$ on a $C^\infty$-manifold $M$ is a function $\Phi$ which assigns to each $p \in M$ an element $\Phi_p$ of $T^r(T_p(M))$, with the additional property that given any $C^\infty$-vector fields $X_1, \ldots, X_r$ on an open set $U$ of $M$, $\Phi(X_1, \ldots, X_r)$ is a $C^\infty$-function on $U$. The set of all $C^\infty$-covariant tensor fields of order $r$ on $M$ is denoted by $T^r(M)$. The set $T^r(M)$ is a vector space over $R$ (in fact it is a $C^\infty(M)$-module), since linear combinations of covariant tensors of order $r$, with $C^\infty$-functions on $M$ as coefficients, are again covariant tensors.

A field $\Phi$ of $C^r$-bilinear forms, $r \ge 0$, on a manifold $M$ is a function which assigns to each point $p$ of $M$ a bilinear form $\Phi_p$ on $T_p(M)$ (i.e., a bilinear mapping $\Phi_p : T_p(M) \times T_p(M) \to R$) such that for any coordinate neighbourhood $(U, \phi)$ the functions $\alpha_{ij} = \Phi(e_i, e_j)$, defined by $\Phi$ and the coordinate frame $\{e_i\}$, are of class $C^r$. Usually these forms are taken to be $C^\infty$ and are written as $\Phi(X_p, Y_p)$ instead of $\Phi_p(X_p, Y_p)$.

A covariant tensor $\Phi \in T^r(V)$ is symmetric if:

    $\Phi(v_1, \ldots, v_r) = \Phi(v_{\sigma(1)}, \ldots, v_{\sigma(r)})$    (0.3.6)

for every permutation $\sigma \in S_r$ and $v_1, \ldots, v_r \in V$. It is alternating (anti-symmetric) if:

    $\Phi(v_1, \ldots, v_r) = \mathrm{sgn}\, \sigma\ \Phi(v_{\sigma(1)}, \ldots, v_{\sigma(r)})$    (0.3.7)

for every permutation $\sigma \in S_r$ (See Ftn. 11) and $v_1, \ldots, v_r \in V$.
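Conditions (0.3.6) and (0.3.7) can be tested directly by running over all permutations. The sketch below is a hedged illustration on $V = R^3$: the two sample tensors (the dot product, of order 2, and the determinant, of order 3) and the helper names are assumptions. The checks are performed on one chosen tuple of vectors only, so they illustrate the conditions rather than prove them.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of (0, ..., r-1), computed from its inversion count."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def dot(u, v):
    """A covariant tensor of order 2 on R^3."""
    return sum(a * b for a, b in zip(u, v))

def det3(u, v, w):
    """A covariant tensor of order 3 on R^3: determinant with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def is_symmetric_on(phi, vectors):
    """Condition (0.3.6) tested on the given tuple of vectors."""
    base = phi(*vectors)
    return all(phi(*[vectors[i] for i in s]) == base
               for s in permutations(range(len(vectors))))

def is_alternating_on(phi, vectors):
    """Condition (0.3.7) tested on the given tuple of vectors."""
    base = phi(*vectors)
    return all(base == sign(s) * phi(*[vectors[i] for i in s])
               for s in permutations(range(len(vectors))))

vs = [(1, 2, 3), (0, 1, 4), (5, 6, 0)]
print(det3(*vs))                                                   # 1
print(is_alternating_on(det3, vs), is_symmetric_on(det3, vs))      # True False
print(is_symmetric_on(dot, vs[:2]), is_alternating_on(dot, vs[:2]))  # True False
```

Integer vectors are used so that every comparison is exact.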

3.4 Riemannian Metric and Covariant Derivation

Definition 0.3.12: A manifold $M \subset R^n$, on which a field of symmetric, positive definite bilinear forms $\Phi$ can be defined, is called a Riemannian manifold, and $\Phi$ is called the Riemannian metric. A Riemannian metric $\Phi$ makes the tangent space at each point into a Euclidean space, with inner product defined by $\Phi(X_p, Y_p) = \Phi_p(X_p, Y_p)$. This allows us to define the length of curves, as well as the angle between curves at their point of intersection via their tangent vectors $X_p$ and $Y_p$ (say). In local coordinates the Riemannian metric is written as:

    $ds^2 = \sum_{\alpha, \beta = 1}^{m} g_{\alpha\beta}\, dx^\alpha dx^\beta$  ($m \le n$).    (0.3.8)

The existence of a vector field $Z$ on the (Riemannian) manifold $M$ implies that there is a vector $Z_p$ at every point $p$ of $M$ which can be decomposed as $Z_p = Z'_p + Z''_p$, where $Z'_p \in T_p(M)$ and $Z''_p \in T_p^{\perp}(M)$; the latter is called the normal space to $M$ at $p$. Note that $T_p(R^n) = T_p(M) \oplus T_p^{\perp}(M)$. Let $\pi'$ and $\pi''$ denote the projections $\pi'(Z_p) = Z'_p$ and $\pi''(Z_p) = Z''_p$; they are linear mappings of $T_p(R^n)$ onto the subspaces tangent and normal to $M$. If $Y$ is a tangent vector field as defined in Def. (0.3.9), then $\pi'(Y) = Y$, and $Y_p \in T_p(M)$. Suppose that $p(t)$ is a $C^r$-curve on $M$ ($r \ge 1$); then $Y(t) = Y_{p(t)}$ is a vector field along the curve, and its differential $\frac{dY}{dt}$ is another vector field along the curve. In general this is not tangent to $M$; however, at each point $p(t)$ we have the projection $\pi'\left(\frac{dY}{dt}\right)$, which is a tangent vector to $M$. This gives:

Definition 0.3.13: The projection $\pi'\left(\frac{dY}{dt}\right)$ is denoted $\frac{DY}{dt}$ and is called the covariant derivative of the tangent vector field $Y$ on $M$ along the curve $p(t)$.

If $(U, \phi)$ is a local coordinate system on $M$ such that $\phi(U) = W$ is an open subset of $R^m$, then using the local coordinates $(u^1, \ldots, u^m)$ on $M$ and noticing that $\phi^{-1} : W \to R^n$ is an imbedding of $W$, we have

    $\phi^{-1}(u^1, \ldots, u^m) = (g^1(u), \ldots, g^n(u))$.    (0.3.9)

This gives $\phi^{-1}$ in terms of the coordinate mappings $g^i(u)$. We further note that (0.3.9) is indeed a parametric representation of the $m$-dimensional manifold $M$ embedded in $R^n$ (Ftn. 7). Next we define another important object on $M$ in order to write the covariant derivative on an $n$-dimensional manifold $M$.

Definition 0.3.14: Let $\mathfrak{X}(M)$ denote the set of tangent vector fields on an arbitrary manifold $M$. A rule $\nabla$ which assigns to each tangent vector field $X \in \mathfrak{X}(M)$ a linear mapping $\nabla_X$ of $\mathfrak{X}(M)$ into itself satisfying the following conditions (Ftn. 8):

    (i)   $\nabla_X f = Xf$
    (ii)  $\nabla_{fX + gY} = f \nabla_X + g \nabla_Y$
    (iii) $\nabla_X (fY) = f \nabla_X Y + (Xf)\, Y$    (0.3.10)

for $f, g \in C^\infty(M)$ and $X, Y \in \mathfrak{X}(M)$, is called an affine connection on $M$, and $\nabla_X$ is known as the covariant differential operator with respect to $X$. Writing $g^i(u) = x^i$, we have the natural basis $\left( \frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n} \right)$ for $T_p(M)$. Thus for each $\frac{\partial}{\partial x^i} \in \mathfrak{X}(M)$, we have:

    $\nabla_{\partial / \partial x^i} \left( \frac{\partial}{\partial x^j} \right) = \sum_k \Gamma^k_{ij} \frac{\partial}{\partial x^k}$.    (0.3.11)

Ftn. 7, 8: When $M$ is not embedded, the mapping

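The functions $\Gamma^k_{ij}$ are called the connection coefficients, which represent the operator $\nabla$ in a neighbourhood $U$. When $M$ is a Riemannian manifold, they are called Christoffel symbols. Given $X = X^i \frac{\partial}{\partial x^i}$ and $Y = Y^j \frac{\partial}{\partial x^j}$, the local expression of $\nabla_X Y$ on $U$ is:

    $\nabla_X Y = \sum_k \left( \sum_i X^i \frac{\partial Y^k}{\partial x^i} + \sum_{i,j} \Gamma^k_{ij} X^i Y^j \right) \frac{\partial}{\partial x^k}$.    (0.3.12)

The local expression for $\frac{DY}{dt}$ using the coordinate frame $\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}$ can be written as:

    $\frac{DY}{dt} = \sum_k \left( \frac{dY^k}{dt} + \sum_{i,j} \Gamma^k_{ij} \frac{dx^i}{dt} Y^j \right) \frac{\partial}{\partial x^k}$.    (0.3.13)

3.5 Geodesics, Jacobi Fields, Curvature and Torsion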

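Definition 0.3.15: Let $t \to \alpha(t)$ ($t \in I$) be a curve in $M$ and let $X, Y \in \mathfrak{X}(M)$ be such that

$$X_{\alpha(t)} = X(t) = \alpha'(t) \quad (t \in I), \qquad Y_{\alpha(t)} = Y(t) \quad (a \le t \le b,\ [a, b] \subset I);$$

then $Y$ is said to be parallel along $\alpha_{[a,b]}$ if

$$(\nabla_X Y)_{\alpha(t)} = 0. \qquad (0.3.14)$$

A local expression for parallelism of $Y$ along a finite arc $\alpha_{[a,b]}$ or along the whole of $\alpha$ can be written respectively as:

$$(\nabla_X Y)_{\alpha(t)} = \sum_k \Big[ \sum_i X^i(t) \frac{\partial Y^k}{\partial x^i} + \sum_{i,j} \Gamma^k_{ij} X^i(t) Y^j \Big] \frac{\partial}{\partial x^k} = 0 \qquad (0.3.15)(a)$$

$$\frac{dY^k}{dt} + \sum_{i,j} \Gamma^k_{ij} \frac{dx^i}{dt} Y^j = 0 \qquad (0.3.15)(b)$$

where we have written $X = X^i\, \partial/\partial x^i \equiv X^i(t)\, \partial/\partial x^i$, $X^i(t) = dx^i/dt$ and $Y = Y^j\, \partial/\partial x^j$. If the vector field $Y$ is the tangent field $X$ itself, (0.3.15)(b) becomes:

$$\frac{d^2 x^k}{dt^2} + \sum_{i,j} \Gamma^k_{ij} \frac{dx^i}{dt} \frac{dx^j}{dt} = 0. \qquad (0.3.16)$$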

Definition 0.3.16: A curve $\alpha$ in $M$ is called a geodesic of $M$ if its tangent vector satisfies Equation (0.3.16).
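As an illustration of Definition 0.3.16, the following sketch integrates the geodesic equation (0.3.16) numerically on the unit 2-sphere and compares the result with the great circle through the same initial data. The choice of manifold, the coordinates $(\theta, \phi)$, the initial data and all function names are assumptions made for this example only; the connection coefficients used are the standard ones for the metric $d\theta^2 + \sin^2\theta\, d\phi^2$.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of (0.3.16) on the unit 2-sphere, as a first-order system.

    state = (theta, phi, dtheta, dphi). For ds^2 = dtheta^2 + sin^2(theta) dphi^2
    the only non-zero connection coefficients are
        Gamma^theta_{phi phi} = -sin(theta) cos(theta),
        Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
    """
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, h / 2))
    k3 = geodesic_rhs(shifted(k2, h / 2))
    k4 = geodesic_rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def embed(theta, phi):
    """Point of the unit sphere in R^3 with spherical coordinates (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def great_circle_error(total_time=2.0, steps=2000):
    """Integrate (0.3.16) and compare with the great circle through the same data."""
    # Start on the equator with unit speed: dtheta^2 + sin^2(theta) dphi^2 = 1.
    state = (math.pi / 2, 0.0, 0.6, 0.8)
    h = total_time / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    numeric = embed(state[0], state[1])
    # Great circle cos(t) p + sin(t) v with p = (1, 0, 0), v = (0, 0.8, -0.6).
    exact = (math.cos(total_time),
             0.8 * math.sin(total_time),
             -0.6 * math.sin(total_time))
    return max(abs(a - b) for a, b in zip(numeric, exact))


if __name__ == "__main__":
    print("deviation from the great circle:", great_circle_error())
```

The deviation stays at the level of the integrator error, which is consistent with the familiar fact that the geodesics of the round sphere are great circles.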

Definition 0.3.17: Let $\gamma: [0, l] \to M$ be a natural $C^\infty$-geodesic (i.e., a geodesic parametrized by the arc-length parameter $s$), and let $F: [0, l] \times (-\epsilon, \epsilon) \to M$ be a variation of $\gamma$ such that for every $t \in (-\epsilon, \epsilon)$ the map $F_t: [0, l] \to M$ given by $F_t(s) = F(s, t)$ is again a $C^\infty$-geodesic (not necessarily a natural one); then, as $t$ varies over $(-\epsilon, \epsilon)$, the field of vectors

$$\left[ \frac{\partial F}{\partial t}(s, t) \right]_{t=0}$$

is known as the variational field with reference to $F$ and $\gamma$, or the Jacobi field along $\gamma$. We denote it as $J(s)$.

Definition 0.3.18: Let $p \in M$ and let $\gamma: [0, l] \to M$ be the natural geodesic with $\gamma(0) = p$. A point $q = \gamma(s_0)$ is said to be conjugate to $p$ relative to $\gamma$ if there exists a Jacobi field $J(s)$ along $\gamma$ which is not identically zero but which satisfies $J(0) = 0 = J(s_0)$.

Definition 0.3.19: Given $C^{r+1}$ vector fields $X, Y, Z$ on a Riemannian manifold $M$, a $C^{r-1}$ vector field $R(X, Y)Z$ is defined by a $C^r$ connection $\nabla$ (see Def. (0.3.14)) as

$$R(X, Y)Z = \nabla_X(\nabla_Y Z) - \nabla_Y(\nabla_X Z) - \nabla_{[X, Y]} Z. \qquad (0.3.17)$$

This is called the Riemann (curvature) tensor. A local expression of (0.3.17), with $R(\partial_k, \partial_l)\,\partial_j = R^i{}_{jkl}\,\partial_i$ and $\partial_k = \partial/\partial x^k$, is:

$$R^i{}_{jkl} = \frac{\partial \Gamma^i_{lj}}{\partial x^k} - \frac{\partial \Gamma^i_{kj}}{\partial x^l} + \Gamma^i_{km}\Gamma^m_{lj} - \Gamma^i_{lm}\Gamma^m_{kj}. \qquad (0.3.18)$$

Definition 0.3.20: Given a $C^r$ connection $\nabla$ on a $C^\infty$ manifold $M$ (not necessarily Riemannian), a $C^{r-1}$ tensor field $T$ defined by

$$T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y] \qquad (0.3.19)$$

where $X, Y$ are arbitrary $C^r$ vector fields, is called a torsion tensor. Using a coordinate basis its components are:

$$T^i{}_{jk} = \Gamma^i_{jk} - \Gamma^i_{kj}. \qquad (0.3.20)$$

We shall require these concepts mainly in Chapter 8 while studying the subject of relativity and gravitation. (See [1], [6], [8] for more details on differentiable manifolds and related concepts.)

4 MEASURE, EXP $\mathcal{H}$, DIRAC $\delta$-FUNCTION

4.1 Measurable Spaces and Measurable Functions

Let $X$ be an arbitrary space. A ring of subsets of $X$ is a nonempty class $\mathcal{R}$ of subsets of $X$ such that $A, B \in \mathcal{R} \Rightarrow A - B \in \mathcal{R}$ and $A \cup B \in \mathcal{R}$. Evidently $\emptyset \in \mathcal{R}$, and $\mathcal{R}$ is closed under finite unions and finite intersections. If $X$ also belongs to $\mathcal{R}$, it is called a field (or boolean algebra) and is denoted $\mathcal{A}$. It is called a $\sigma$-ring (a $\sigma$-field) if $\mathcal{R}$ ($\mathcal{A}$) is closed under countable unions. A positive set function on a space $X$ is a mapping $m$ from a family $\mathcal{A}$ of subsets of $X$ containing the empty set into the extended positive real numbers: $m: \mathcal{A} \to \mathbb{R}^+ \cup \{\infty\}$.

Definition 0.4.1: A positive measure $m$ on a space $X$ is a countably additive⁹ positive set function from a $\sigma$-field $\mathcal{A}$ of $X$ into $\mathbb{R}^+ \cup \{+\infty\}$. The space $X$ equipped with the measure $m$ is called a measure space and is denoted $(X, \mathcal{A}, m)$.

Definition 0.4.2: A real function $f$ on the measure space $(X, \mathcal{A}, m)$ is said to be measurable if $\{x : a < f(x) < b\} \in \mathcal{A}$ for all $a, b \in \mathbb{R}$.

4.2 Haar Measure

Associated with a compact group $G$ is the so-called Haar measure, which can be described as follows.

Definition 0.4.3: Let $C(G)$ denote the set of complex valued continuous functions on $G$, $C_f$ the convex hull (see Def. (0.2.9)) of all left translates of $f \in C(G)$ [i.e., finite sums of the form $\sum_i a_i f(g_i x)$ with $g_i, x \in G$, $a_i > 0$ and $\sum_i a_i = 1$] and $K_f$ the closure of $C_f$ in $C(G)$. A unique linear form $m: C(G) \to \mathbb{C}$ is Haar measure¹⁰ if it satisfies the following properties:

(1) $m(f) \geq 0$ if $f \geq 0$
(2) $m(1) = 1$
(3) $m(af) = a\,m(f)$ (0.4.1)
(4) $m({}_g f) = m(f_g) = m(f)$

(where ${}_g f$ ($f_g$) are left (right) translates of $f$: ${}_g f(x) \stackrel{\text{def}}{=} f(g^{-1}x)$; $f_g(x) \stackrel{\text{def}}{=} f(xg)$, $g, x \in G$). Note that (2) implies that when $f = 1$, closure $(K_f) = \{1\}$, whereas (3) implies that closure $(K_{af}) = aK_f$. Property (4) shows the left and right invariance of $m$; it determines the uniqueness of $m$. The Haar measure is a normalized measure (in virtue of (2)). It is usually written as:

$$\int_G f(x)\, dm(x) \quad \text{or simply as} \quad \int_G f(x)\, dx. \qquad (0.4.2)$$

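Example 0.4.4: The Haar measure of the group $SO(2, \mathbb{R})$ is given by the absolute value of the form

$$\frac{d\alpha}{2\pi\beta} = -\frac{d\beta}{2\pi\alpha},$$

where $\alpha$ and $\beta$ are the parameters in the $2 \times 2$ matrix $\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$ with the restriction $\alpha^2 + \beta^2 = 1$.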

4.3 The Space Exp $\mathcal{H}$

Definition 0.4.5: Given a complex Hilbert space $\mathcal{H}$, another Hilbert space formed by the direct sum of symmetric powers of $\mathcal{H}$,¹¹ namely

$$S^0\mathcal{H} \oplus S^1\mathcal{H} \oplus S^2\mathcal{H} \oplus \cdots \oplus S^n\mathcal{H} \oplus \cdots \qquad (0.4.3)$$

is called the exponential of $\mathcal{H}$ and is denoted Exp $\mathcal{H}$. An arbitrary element of Exp $\mathcal{H}$ is:

$$\mathrm{Exp}\, v = 1 \oplus v \oplus \frac{1}{\sqrt{2!}}\, v \otimes v \oplus \frac{1}{\sqrt{3!}}\, v \otimes v \otimes v \oplus \cdots, \qquad v \in \mathcal{H}. \qquad (0.4.4)$$

9. $m$ is countably additive if for every disjoint countable family of subsets $(A_1, \dots, A_n, \dots)$ with union in $\mathcal{A}$, $m\big(\bigcup_{i=1}^{\infty} A_i\big) = \sum_{i=1}^{\infty} m(A_i)$, and $m(\emptyset) = 0$.

10. A sufficient condition for the existence of such a measure is that the group be locally compact (see A. Weil, [11]).

11. Symmetric power $S^r\mathcal{H}$: $(h_1, h_2, \dots, h_r) \mapsto \frac{1}{\sqrt{r!}} \sum_{\sigma \in S_r} (h_{\sigma(1)}, h_{\sigma(2)}, \dots, h_{\sigma(r)})$, where $(h_1, h_2, \dots, h_r) \in \mathcal{H}^r$ and $\sigma \in S_r$, the group of permutations of degree $r$.

The first element in (0.4.4), i.e., 1, is defined as

$$1 \oplus 0 \oplus \cdots \oplus 0 \oplus \cdots \qquad (0.4.5)$$

It is called the vacuum vector of Exp $\mathcal{H}$ and is evidently recovered from (0.4.4) by putting $v = 0$. The scalar products of $\mathcal{H}$ and Exp $\mathcal{H}$ are related through the following equality:

$$\langle \mathrm{Exp}\, v_1, \mathrm{Exp}\, v_2 \rangle = \exp \langle v_1, v_2 \rangle. \qquad (0.4.6)$$

The collection $\{\mathrm{Exp}\, v\}$ constitutes a total set in Exp $\mathcal{H}$, i.e., the set of finite linear combinations of elements of $\{\mathrm{Exp}\, v\}$ is dense in Exp $\mathcal{H}$ (see also Ftn. 1). The functor Exp transforms the direct sum of two Hilbert spaces into the tensor product of their images, and thus defines the canonical isomorphism:

$$\mathrm{Exp}\, \mathcal{H}_1 \otimes \mathrm{Exp}\, \mathcal{H}_2 \to \mathrm{Exp}\, (\mathcal{H}_1 \oplus \mathcal{H}_2) \qquad (0.4.7)$$

by assigning to each vector $\mathrm{Exp}\, v_1 \otimes \mathrm{Exp}\, v_2$ the vector $\mathrm{Exp}\, (v_1 \oplus v_2)$. Since $\{\mathrm{Exp}\, v_1\}$, $\{\mathrm{Exp}\, v_2\}$, and $\{\mathrm{Exp}\, (v_1 \oplus v_2)\}$ are total sets in their respective spaces, the isomorphism (0.4.7) implies that $\{\mathrm{Exp}\, v_1 \otimes \mathrm{Exp}\, v_2\}$ forms a total set in the space $\mathrm{Exp}\, \mathcal{H}_1 \otimes \mathrm{Exp}\, \mathcal{H}_2$. Given a real Hilbert space $H$ one can repeat the above procedure for $H_{\mathbb{C}}$, the complexification of $H$, and obtain Exp $H_{\mathbb{C}}$. Furthermore, by considering the nuclear extension $\Phi$ of $H$, and the standard Gaussian measure $\mu$ in it given by the Fourier transform¹²:

$$\int_{\Phi} e^{i\langle F, h \rangle}\, d\mu(F) = e^{-\frac{1}{2}\|h\|^2}, \qquad F \in \Phi,\ h \in H \qquad (0.4.8)$$

we can introduce the Hilbert space L2^ (O) of all complex square-integrable functions on 4>. If now one sets the functional Q s l a s vacuum vector in L2^ (<J>) and defines the functionals as: pv{F) = AF'v)

(0.4.9)

where $v$ is in $H$ and $\lambda$ is an arbitrary fixed complex number, then it can be shown that these functionals constitute a total set in $L^2_\mu(\Phi)$ and that $L^2_\mu(\Phi)$ is isomorphic to Exp $H_{\mathbb{C}}$ via their scalar products provided we choose $|\lambda| = 1$. In view of (0.4.9) it should be noted that for each different complex number $\lambda$ with $|\lambda| = 1$ there will be a different isomorphism $I_\lambda: \mathrm{Exp}\, H_{\mathbb{C}} \to L^2_\mu(\Phi)$ (see [4] and Exp. [2.4.1]).

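12. The notion of nuclear spaces is due to A. Grothendieck [Ref. Ad]. See App. to Chapter 10 in K. Yosida, and A. Pietsch for details in [Ref. Ad]. Standard Gaussian Measure $= \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) dx$, $x \in \mathbb{R}$.

4.4 Dirac $\delta$-function

By definition, the Dirac $\delta$-function is the function that satisfies

$$\int_{x_1}^{x_2} dx\, \delta(x - a) = \begin{cases} 1 & \text{if } x_1 < a < x_2 \\ 0 & \text{otherwise} \end{cases} \qquad (x, x_i, a \in \mathbb{R}) \qquad (0.4.10)$$

It can also be conveniently expressed as (see Chap. 9):

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} dx\, \exp[i(p_x - p'_x)x] = \delta(p_x - p'_x). \qquad (0.4.11)$$

In 3 dimensions it would be:

$$\frac{1}{(2\pi)^3} \int d^3x\, \exp[i(\mathbf{p} - \mathbf{p}') \cdot \mathbf{x}] = \delta^3(\mathbf{p} - \mathbf{p}'). \qquad (0.4.12)$$

The $\delta$-function satisfies the following properties ($a \neq 0$):

$$\delta(ax) = \frac{1}{|a|}\, \delta(x), \qquad x\,\delta(x) = 0, \qquad (0.4.13)$$

$$\delta(x^2 - a^2) = \frac{1}{2|a|}\, [\delta(x - a) + \delta(x + a)] \qquad (0.4.14)$$

$$\delta[g(x)] = \left| \frac{dg}{dx} \right|^{-1}_{x = x_0} \delta(x - x_0), \quad \text{where } g(x_0) = 0. \qquad (0.4.15)$$

(It is assumed that $g(x)$ is analytic near its zeros.)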
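Property (0.4.14) can be made plausible numerically by replacing $\delta$ with a narrow Gaussian and integrating against a smooth test function. The sketch below is only a consistency check under stated assumptions: the Gaussian regularization, its width `eps`, the test function and the midpoint rule are all choices made here, not part of the definition (0.4.10).

```python
import math


def nascent_delta(u, eps):
    """Gaussian of width eps and unit area; tends to the delta function as eps -> 0."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))


def midpoint_integral(func, lo, hi, n):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def smooth_f(x):
    """Test function with no particular symmetry (illustrative choice)."""
    return math.cos(x) + x


def check_composite_rule(a=1.5, eps=1e-2, n=200_000):
    """Compare both sides of (0.4.14) after integrating against smooth_f."""
    lhs = midpoint_integral(
        lambda x: smooth_f(x) * nascent_delta(x * x - a * a, eps), -4.0, 4.0, n)
    rhs = (smooth_f(a) + smooth_f(-a)) / (2.0 * abs(a))
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = check_composite_rule()
    print("regularized left-hand side:", lhs)
    print("right-hand side of (0.4.14):", rhs)
```

The two numbers agree up to a correction that shrinks with `eps`, which is what one expects from a regularized version of the identity.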

5 EXAMPLES BASED ON DIFFERENTIAL GEOMETRY

Example 0.5.1: We obtain structure equations in 2 and 4 dimensions using polar coordinates.

Case 1: $\mathbb{R}^2$: Cartesian coordinates are $(x, y)$; $ds^2 = dx^2 + dy^2$. In polar coordinates $(r\cos\theta, r\sin\theta)$ we have

(i) $ds^2 = dr^2 + r^2 d\theta^2$.

The vielbeins in polar coordinates given by the relation $e^a = e^a{}_\mu\, dx^\mu$ can therefore be written as:

(ii) $\begin{pmatrix} e^r = dr \\ e^\theta = r\,d\theta \end{pmatrix} = \dfrac{1}{r} \begin{pmatrix} x & y \\ -y & x \end{pmatrix} \begin{pmatrix} dx \\ dy \end{pmatrix}$

The orthonormal frame $(e^1, e^2) = (e^r, e^\theta)$ gives the following structure equations, the connection and the curvature:

(iii) $de^r - \omega \wedge e^\theta = 0 - \omega \wedge r\,d\theta = 0$, $\quad de^\theta + \omega \wedge e^r = dr \wedge d\theta + \omega \wedge dr = 0$ (structure equations)

(where we have written $\omega^2{}_1 = -\omega^1{}_2 = \omega$, and have used the fact that torsion (Def. (0.3.20)) is zero).

(iv) $\omega = d\theta$ (connection)

(v) $R = d\omega = 0$ (curvature)

Hodge star operator gives (see sec. (1.3)):

(vi) $*(dx, dy) = (dy, -dx)$, $\quad *(dr, r\,d\theta) = (r\,d\theta, -dr)$

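Case 2: $\mathbb{R}^4$, $ds^2 = dx^2 + dy^2 + dz^2 + dt^2$. Polar coordinates:

$$x = r\cos\frac{\theta}{2}\cos\frac{\psi + \phi}{2}, \quad y = r\cos\frac{\theta}{2}\sin\frac{\psi + \phi}{2}, \quad z = r\sin\frac{\theta}{2}\cos\frac{\psi - \phi}{2}, \quad t = r\sin\frac{\theta}{2}\sin\frac{\psi - \phi}{2},$$

where $0 \le \theta < \pi$, $0 \le \phi < 2\pi$, $0 \le \psi < 4\pi$. The transformation between Cartesian coordinates $(x, y, z, t)$ and polar coordinates is also written as $x + iy = r\cos\frac{\theta}{2}\exp\frac{i}{2}(\psi + \phi)$, $z + it = r\sin\frac{\theta}{2}\exp\frac{i}{2}(\psi - \phi)$. The coordinate frames in two systems are related as:

(vii) $\begin{pmatrix} e^0 = dr \\ e^1 = r\sigma_x \\ e^2 = r\sigma_y \\ e^3 = r\sigma_z \end{pmatrix} = \dfrac{1}{r} \begin{pmatrix} x & y & z & t \\ -t & -z & y & x \\ z & -t & -x & y \\ -y & x & -t & z \end{pmatrix} \begin{pmatrix} dx \\ dy \\ dz \\ dt \end{pmatrix}$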
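A quick way to see that the frame in (vii) is orthonormal, so that $ds^2 = dr^2 + r^2(\sigma_x^2 + \sigma_y^2 + \sigma_z^2)$, is to check that the matrix relating $(e^0, e^1, e^2, e^3)$ to $(dx, dy, dz, dt)$ is orthogonal. The sketch below does this at a sample point; the function names and the sample point are illustrative assumptions.

```python
import math


def frame_matrix(x, y, z, t):
    """Matrix of (vii): rows give e^0 = dr, e^1 = r sigma_x, e^2 = r sigma_y,
    e^3 = r sigma_z in terms of (dx, dy, dz, dt)."""
    r = math.sqrt(x * x + y * y + z * z + t * t)
    rows = [
        (x, y, z, t),      # e^0 = dr
        (-t, -z, y, x),    # e^1 = r sigma_x
        (z, -t, -x, y),    # e^2 = r sigma_y
        (-y, x, -t, z),    # e^3 = r sigma_z
    ]
    return [[entry / r for entry in row] for row in rows]


def gram(m):
    """Matrix of pairwise dot products of the rows of m (equals m m^T)."""
    n = len(m)
    return [[sum(m[i][k] * m[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_orthogonal(m, tol=1e-12):
    """True if the rows of m are orthonormal up to the tolerance tol."""
    g = gram(m)
    return all(abs(g[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(m)) for j in range(len(m)))


if __name__ == "__main__":
    print(is_orthogonal(frame_matrix(0.3, -1.2, 0.7, 2.0)))
```

Since the rows are orthonormal at every point away from the origin, the flat metric takes the same diagonal form in the frame $(e^0, e^1, e^2, e^3)$ as in $(dx, dy, dz, dt)$.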

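Here $\sigma_x, \sigma_y, \sigma_z$ are 1-forms that satisfy the cyclic relation:

(viii) $d\sigma_x = 2\sigma_y \wedge \sigma_z$ (and cyclic permutations of $x, y, z$).

The torsion is zero, hence the first set of structure equations written out in full is:

(ix)
$de^0 + \omega^0{}_1 \wedge e^1 + \omega^0{}_2 \wedge e^2 + \omega^0{}_3 \wedge e^3 = 0 = 0 + \omega^0{}_1 \wedge r\sigma_x + \omega^0{}_2 \wedge r\sigma_y + \omega^0{}_3 \wedge r\sigma_z$
$de^1 + \omega^1{}_0 \wedge e^0 + \omega^1{}_2 \wedge e^2 + \omega^1{}_3 \wedge e^3 = 0 = dr \wedge \sigma_x + r(2\sigma_y \wedge \sigma_z) + \omega^1{}_0 \wedge dr + \omega^1{}_2 \wedge r\sigma_y + \omega^1{}_3 \wedge r\sigma_z$
$de^2 + \omega^2{}_0 \wedge e^0 + \omega^2{}_1 \wedge e^1 + \omega^2{}_3 \wedge e^3 = 0 = dr \wedge \sigma_y + r(2\sigma_z \wedge \sigma_x) + \omega^2{}_0 \wedge dr + \omega^2{}_1 \wedge r\sigma_x + \omega^2{}_3 \wedge r\sigma_z$
$de^3 + \omega^3{}_0 \wedge e^0 + \omega^3{}_1 \wedge e^1 + \omega^3{}_2 \wedge e^2 = 0 = dr \wedge \sigma_z + r(2\sigma_x \wedge \sigma_y) + \omega^3{}_0 \wedge dr + \omega^3{}_1 \wedge r\sigma_x + \omega^3{}_2 \wedge r\sigma_y$

Using the anti-symmetry $\omega^a{}_b = -\omega^b{}_a$ and the anti-symmetry of $dx \wedge dy$, etc., it can be seen that

(x) $\omega^1{}_0 = \omega^2{}_3 = \sigma_x$, $\quad \omega^2{}_0 = \omega^3{}_1 = \sigma_y$ and $\omega^3{}_0 = \omega^1{}_2 = \sigma_z$.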

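The curvature, defined as $R^a{}_b = d\omega^a{}_b + \omega^a{}_c \wedge \omega^c{}_b$, can also be seen to vanish for all combinations of $a$ and $b$. For instance, choosing $a = 0$, $b = 1$ and using (viii) and (x), we have:

(xi) $R^0{}_1 = d\omega^0{}_1 + \omega^0{}_2 \wedge \omega^2{}_1 + \omega^0{}_3 \wedge \omega^3{}_1 = d(-\sigma_x) + (-\sigma_y \wedge -\sigma_z) + (-\sigma_z \wedge \sigma_y) = 0$.

Example 0.5.2: Case 1: The 2-sphere $S^2$ of radius $r$ carries the metric:

(i) $ds^2 = r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2 = (e^1)^2 + (e^2)^2$.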

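Structure equations for the above metric (with no torsion) are:

(ii) $0 = de^1 + \omega^1{}_2 \wedge e^2 = d(r\,d\theta) + \omega^1{}_2 \wedge e^2 = 0 + \omega^1{}_2 \wedge e^2$
$\quad\ \ 0 = de^2 + \omega^2{}_1 \wedge e^1 = d(r\sin\theta\, d\phi) + \omega^2{}_1 \wedge e^1$.

The first of these shows that $\omega^1{}_2$ is proportional to $e^2$. From the second we have:

(iii) $r\cos\theta\, d\theta \wedge d\phi = -\omega^2{}_1 \wedge e^1 = \omega^1{}_2 \wedge e^1$ $\quad (\because\ \omega^1{}_2 = -\omega^2{}_1)$.

This gives $\omega^1{}_2 = -\cos\theta\, d\phi$. The curvature, which is apparently non-zero, is thus given by:

(iv) $R^1{}_2 = d\omega^1{}_2 = \sin\theta\, d\theta \wedge d\phi = \dfrac{1}{r^2}\,(r\,d\theta \wedge r\sin\theta\, d\phi) = \dfrac{1}{r^2}\, e^1 \wedge e^2$.

But $R^1{}_2 = R^1{}_{212}\, e^1 \wedge e^2$. This implies that the Gaussian curvature $K = R^1{}_{212} = \dfrac{1}{r^2}$ is constant.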

Case 2: The 4-sphere $S^4$ can be given a metric:

(v) $ds^2 = \dfrac{dr^2 + r^2(\sigma_x^2 + \sigma_y^2 + \sigma_z^2)}{(1 + (r/a)^2)^2}$

where $\sigma_x, \sigma_y, \sigma_z$ are defined in (vii) of Exp. (0.5.1) and $a$ is the diameter of $S^4$. The metric in (v) is called the de Sitter metric. The vielbeins $e^a$ ($a = 0, 1, 2, 3$) are evidently $\dfrac{dr}{k}, \dfrac{r\sigma_x}{k}, \dfrac{r\sigma_y}{k}, \dfrac{r\sigma_z}{k}$, where $k \equiv 1 + (r/a)^2$.

5.1 Critical Points

Definition 0.5.3: Let $f: M \to \mathbb{R}$ be a smooth function on an $n$-dimensional smooth manifold $M$. A point $p \in M$ is called a critical point of $f$ if the induced map $f_*: T_pM \to T_{f(p)}\mathbb{R}$ defined on the tangent space at $p \in M$ is zero. Thus if $(x^1, x^2, \dots, x^n)$ are coordinates in a neighbourhood $U$ which contains $p$, then the above definition implies that at the critical point $p$,

$$\frac{\partial f}{\partial x^1} = \frac{\partial f}{\partial x^2} = \cdots = \frac{\partial f}{\partial x^n} = 0. \qquad (0.5.1)$$

The real number $f(p)$ is called a critical value of $f$. A critical point is called non-degenerate if and only if the matrix

$$\left( \frac{\partial^2 f}{\partial x^i\, \partial x^j}(p) \right) \qquad (0.5.2)$$

is non-singular. A critical point $p$ of $f$ defines a symmetric bilinear functional (denoted $f_{**}$) on $T_pM$, which is called the Hessian of $f$ at $p$. The matrix given in (0.5.2) represents this Hessian with regard to the basis $\left\{ \dfrac{\partial}{\partial x^i} \right\}$, $i = 1, \dots, n$.

13. These are called geodesic coordinates (see 8.[26]), and in terms of these, curvature is zero in a small neighbourhood; we therefore say that the metric is locally flat (see [8], [10]).

Definition 0.5.4: The index of a bilinear functional (say) $H$ on a vector space $V$ is the maximal dimension of a subspace of $V$ on which $H$ is negative definite. The dimension of the subspace $V' \subset V$ for which $H(v, w) = 0$, where $v \in V'$ and $w \in V$, is called the nullity of $H$. The space $V'$ is called the null space pertaining to $H$. Obviously the point $p \in M$ is a non-degenerate critical point of $f$ if and only if $f_{**}$ has nullity equal to 0 on $T_pM$. The index of $f_{**}$ on $T_pM$ is simply referred to as the index of $f$ at $p$. Given below are a few examples of critical points of $f$ defined on $\mathbb{R}$ ((i), (ii)), and on $\mathbb{R}^2$ ((iii), (iv), (v)).

(i) $f(x) = x^2$. The origin is a non-degenerate critical point.

(ii) $f(x) = x^3$. The origin is a degenerate critical point.

(iii) $f(x, y) = x^3 - 3xy^2$. $(0, 0)$ is a degenerate critical point.

(iv) $f(x, y) = x^2$. The $y$-axis (the line $x = 0$) is the set of all critical points of $f$ and they are all degenerate. Clearly this is a submanifold of the given manifold $\mathbb{R}^2$.

(v) $f(x, y) = x^2y^2$. The set of critical points consists of the union of the $x$ and $y$ axes. They are all degenerate, but this set is not a submanifold of $\mathbb{R}^2$.

Definition 0.5.5: A function, all of whose critical points are non-degenerate, is called a Morse function. Note that in the above examples only the first one gives a Morse function. Based on that, it is easy to construct Morse functions; for instance, $f(x, y) = x^2 + ky^2$ for any non-zero real number $k$ is a Morse function, since its only critical point (the origin) is non-degenerate. The function $f(x, y) = x^2 + ky^3$, on the other hand, is not Morse, since the critical point $(0, 0)$ is degenerate. Finally, we observe that every function near a non-degenerate critical point $p$ is locally equivalent to a quadratic polynomial, whose coefficients form the Hessian of $f$ at $p$. This observation follows from an important result called the Morse Lemma, which we state without proof (see [5]).
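The non-degeneracy test of Definition 0.5.3 and the index of Definition 0.5.4 are easy to try out on functions of two variables. The following sketch approximates the Hessian (0.5.2) by central differences and classifies a point that is assumed to be critical already; the step size `h`, the tolerance `tol` and the function names are assumptions of this illustration, and a numerical determinant below the tolerance is reported as degenerate.

```python
import math


def hessian_2d(f, x0, y0, h=1e-3):
    """Central-difference approximation of the Hessian entries (f_xx, f_xy, f_yy)."""
    fxx = (f(x0 + h, y0) - 2.0 * f(x0, y0) + f(x0 - h, y0)) / h ** 2
    fyy = (f(x0, y0 + h) - 2.0 * f(x0, y0) + f(x0, y0 - h)) / h ** 2
    fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
           - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4.0 * h ** 2)
    return fxx, fxy, fyy


def classify_critical_point(f, x0=0.0, y0=0.0, tol=1e-6):
    """Return ("non-degenerate", index) or ("degenerate", None).

    Assumes that (x0, y0) is already known to be a critical point of f.
    """
    fxx, fxy, fyy = hessian_2d(f, x0, y0)
    det = fxx * fyy - fxy ** 2
    if abs(det) < tol:
        return ("degenerate", None)
    trace = fxx + fyy
    disc = math.sqrt(max(trace ** 2 - 4.0 * det, 0.0))
    eigenvalues = ((trace - disc) / 2.0, (trace + disc) / 2.0)
    # The index is the number of negative eigenvalues of the Hessian.
    index = sum(1 for ev in eigenvalues if ev < 0.0)
    return ("non-degenerate", index)


if __name__ == "__main__":
    print(classify_critical_point(lambda x, y: x * x + 2 * y * y))
    print(classify_critical_point(lambda x, y: x * x - 3 * y * y))
    print(classify_critical_point(lambda x, y: x ** 3 - 3 * x * y * y))
    print(classify_critical_point(lambda x, y: x * x + 2 * y ** 3))
```

With $k = 2$ and $k = -3$ the function $x^2 + ky^2$ is reported as non-degenerate at the origin, with index 0 and 1 respectively, while $x^3 - 3xy^2$ and $x^2 + ky^3$ are reported as degenerate, in line with the examples in the text.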

Morse Lemma: Let $p \in \mathbb{R}^n$ be a non-degenerate critical point of the function $f$, and let

$$(h_{ij}) = \left( \frac{\partial^2 f}{\partial x^i\, \partial x^j}(p) \right)$$

be the Hessian of $f$ at $p$. Then there exists a local coordinate system $(x^1, \dots, x^n)$ around $p$ (with $x^i(p) = 0$ for all $i$) such that

$$f = f(p) + \sum_{i,j} h_{ij}\, x^i x^j$$

near $p$. Another result which is worth noting is the following:

Theorem 0.5.6: If $f$ is a differentiable function on a compact manifold $M$ and $f$ has only two critical points (which may or may not be non-degenerate), then $M$ is homeomorphic to a sphere. (See [7] for the proof.)

We shall be using these ideas in Chapter 10 while studying the Yang-Mills theory.

6 BASIC DEFINITIONS IN ALGEBRAIC TOPOLOGY

In this section we describe in brief a few important concepts of algebraic topology that will be required in our study of Yang-Mills and superstrings theory. More precisely we define the de Rham complex, the de Rham cohomology, homotopy and homology.

6.1 de Rham Complex and de Rham Cohomology

Let $x_1, \dots, x_n$¹⁴ be the coordinates on $\mathbb{R}^n$ and let $\Omega^*$ denote the algebra over $\mathbb{R}$ generated by the differentials $(dx_i)$ that satisfy the relation:

$$dx_i\, dx_j = \begin{cases} -dx_j\, dx_i & i \neq j \\ 0 & i = j \end{cases} \qquad (0.6.1)$$

The algebra $\Omega^*$ is a vector space over $\mathbb{R}$ with the basis:

$$1, \quad dx_i, \quad dx_i\, dx_j\ (i < j), \quad dx_i\, dx_j\, dx_k\ (i < j < k), \quad \dots, \quad dx_1\, dx_2 \cdots dx_n. \qquad (0.6.2)$$

The tensor product $C^\infty(\mathbb{R}^n) \otimes_{\mathbb{R}} \Omega^*$, denoted $\Omega^*(\mathbb{R}^n)$, is the collection of $C^\infty$ differential forms on $\mathbb{R}^n$. For instance, $\omega \in \Omega^*(\mathbb{R}^n)$ can be uniquely written as $\sum f_{i_1 \dots i_p}\, dx_{i_1} \cdots dx_{i_p}$, where $f_{i_1 \dots i_p}$ is a $C^\infty$-function in $C^\infty(\mathbb{R}^n)$. The form $\omega$ is called a $p$-form and is also written as $\sum f_I\, dx_I$ where $I = (i_1, \dots, i_p)$. The algebra $\Omega^*(\mathbb{R}^n) = \bigoplus_{p=0}^{n} \Omega^p(\mathbb{R}^n)$ is obviously graded, and any two consecutive components of the sum are related via a differential operator:

$$d: \Omega^p(\mathbb{R}^n) \to \Omega^{p+1}(\mathbb{R}^n) \qquad (0.6.3)$$

defined as:

(i) $df = \sum_i \dfrac{\partial f}{\partial x_i}\, dx_i$ for $f \in \Omega^0(\mathbb{R}^n)$ (0.6.4)

(ii) $d\omega = \sum_I df_I\, dx_I$ for $\omega = \sum_I f_I\, dx_I \in \Omega^p(\mathbb{R}^n)$.

14. In conformity with the usage in Algebraic Topology, we have used lower subscripts to denote the coordinates in $\mathbb{R}^n$.

The elements of $\Omega^p(\mathbb{R}^n)$ and $\Omega^q(\mathbb{R}^n)$ are combined to give an element of $\Omega^{p+q}(\mathbb{R}^n)$ through the wedge product $\wedge$ that obeys:

$$\omega \wedge \eta = \Big(\sum_I f_I\, dx_I\Big) \wedge \Big(\sum_J g_J\, dx_J\Big) = \sum_{I,J} f_I g_J\, dx_I\, dx_J = (-1)^{pq}\, \eta \wedge \omega. \qquad (0.6.5)$$

The operator $d$ is an antiderivation, i.e.,

$$d(\omega \wedge \eta) = d\omega \wedge \eta + (-1)^{\deg \omega}\, \omega \wedge d\eta. \qquad (0.6.6)$$

It can be easily checked that $d^2 = 0$ on $\Omega^*(\mathbb{R}^n)$. The complex $\Omega^*(\mathbb{R}^n)$ together with the differential operator $d$ is called the de Rham complex on $\mathbb{R}^n$. For an arbitrary $p$ the elements of the kernel of $d$ are the closed $p$-forms and the elements of the image of $d$ are the exact $p$-forms. Their sets are respectively denoted as $\Omega^p_{\mathrm{closed}}(\mathbb{R}^n)$ and $\Omega^p_{\mathrm{exact}}(\mathbb{R}^n)$.¹⁵ Note that the de Rham complex is indeed a set of differential equations whose solutions are closed forms. Since exact forms are necessarily closed, solutions coming from them are not as interesting. The following definition gives the space on which interesting solutions will live.

Definition 0.6.1: The $p$-th de Rham cohomology of $\mathbb{R}^n$ is the vector space:

$$H^p_{DR}(\mathbb{R}^n) = \Omega^p_{\mathrm{closed}}(\mathbb{R}^n) / \Omega^p_{\mathrm{exact}}(\mathbb{R}^n) \qquad (0.6.7)(a)$$

and the de Rham cohomology is the direct sum:

$$H^*_{DR}(\mathbb{R}^n) = \bigoplus_{p=0}^{n} H^p_{DR}(\mathbb{R}^n). \qquad (0.6.7)(b)$$
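The statement that $d^2 = 0$ is what makes the quotient in (0.6.7)(a) meaningful, since it guarantees that every exact form is closed. A short verification, written here as a sketch that uses only (0.6.1), (0.6.4) and (0.6.6), runs as follows; the splitting into the two steps is a choice of presentation made for this illustration.

```latex
% Step 1: d^2 = 0 on functions, using (0.6.4)(i) and the relations (0.6.1).
\begin{aligned}
d(df) &= d\Big(\sum_i \frac{\partial f}{\partial x_i}\,dx_i\Big)
       = \sum_{i,j} \frac{\partial^2 f}{\partial x_j\,\partial x_i}\,dx_j\,dx_i \\
      &= \sum_{i<j} \Big(\frac{\partial^2 f}{\partial x_i\,\partial x_j}
         - \frac{\partial^2 f}{\partial x_j\,\partial x_i}\Big)\,dx_i\,dx_j = 0 .
\end{aligned}

% Step 2: d^2 = 0 on a p-form f_I dx_I, using (0.6.4)(ii), the antiderivation
% property (0.6.6) and d(dx_I) = d(1 \cdot dx_I) = d(1)\,dx_I = 0.
d\big(d(f_I\,dx_I)\big) = d\big(df_I\,dx_I\big)
  = d(df_I)\,dx_I - df_I\,d(dx_I) = 0 .
```

The second-order partial derivatives cancel in pairs because they are symmetric in $i, j$ while $dx_i\, dx_j$ is antisymmetric. As a concrete instance on $\mathbb{R}^2$, the 1-form $x_1\, dx_2$ is not closed, since $d(x_1\, dx_2) = dx_1\, dx_2 \neq 0$, whereas $dx_1 = d(x_1)$ is exact and therefore closed.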

If the $C^\infty$-functions are defined on an open set $U \subset \mathbb{R}^n$, the corresponding de Rham complex and de Rham cohomology are denoted:

$$\Omega^*(U) \text{ and } H^*(U). \qquad (0.6.8)(a)$$

(Note that $\Omega^*(U) = \{C^\infty\text{-functions on } U\} \otimes_{\mathbb{R}} \Omega^*$.) If on the other hand we use $C^\infty$-functions with compact support, the corresponding de Rham complex and de Rham cohomology are denoted:

$$\Omega^*_c(\mathbb{R}^n) \text{ and } H^*_c(\mathbb{R}^n). \qquad (0.6.8)(b)$$

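We list below a few important facts about the de Rham cohomology.

Fact 0.6.2: Consider the trivial bundle $(\mathbb{R}^n \times \mathbb{R}^1, \mathbb{R}^n, \pi)$, and let $\pi: \mathbb{R}^n \times \mathbb{R}^1 \to \mathbb{R}^n$ be the projection map on the first factor (usually denoted as $p_1$), and let $s: \mathbb{R}^n \to \mathbb{R}^n \times \mathbb{R}^1$ be the zero section:

$$\pi(x, t) = x, \qquad s(x) = (x, 0).$$

These maps define the maps $\pi^*$ and $s^*$ amongst $\Omega^*(\mathbb{R}^n \times \mathbb{R}^1)$ and $\Omega^*(\mathbb{R}^n)$:

$$\pi^*: \Omega^*(\mathbb{R}^n) \to \Omega^*(\mathbb{R}^n \times \mathbb{R}^1), \qquad s^*: \Omega^*(\mathbb{R}^n \times \mathbb{R}^1) \to \Omega^*(\mathbb{R}^n), \qquad s^* \circ \pi^* = 1.$$

The mappings $\pi^*$ and $s^*$ thus give rise to inverse isomorphisms in cohomology as shown below (henceforth we will suppress the subscript "DR"):

$$H^*(\mathbb{R}^n \times \mathbb{R}^1) \underset{s^*}{\overset{\pi^*}{\rightleftarrows}} H^*(\mathbb{R}^n) \qquad (0.6.9)$$

and therefore lead to:

$$H^*(\mathbb{R}^{n+1}) \cong H^*(\mathbb{R}^n). \qquad (0.6.10)$$

15. See Chapter 1.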

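Fact 0.6.3 (Poincaré Lemma):

$$H^*(\mathbb{R}^n) = \begin{cases} \mathbb{R} & \text{in dimension } 0 \\ 0 & \text{otherwise.} \end{cases}$$

For arbitrary differentiable manifolds, the above results can be generalized as follows:

Fact 0.6.4: If $M$ is a differentiable manifold, then (0.6.10) gives¹⁶:

$$H^*(M \times \mathbb{R}^1) \cong H^*(M). \qquad (0.6.11)$$

Fact 0.6.5: If $M$ is the $n$-sphere $S^n$ we have:

$$H^*(S^n) = \begin{cases} \mathbb{R} & \text{in dimensions } 0 \text{ and } n \\ 0 & \text{otherwise.} \end{cases} \qquad (0.6.12)$$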

6.2 Category and Functors

In order to write the next facts on these ideas, we describe the objects required there. A category $\mathcal{K}$ is a class of members $(A, B, C, \dots)$ called objects together with sets Hom$(A, B)$ of morphisms from $A$ to $B$ such that if $f$ is a morphism from $A$ to $B$ and $g$ from $B$ to $C$, then the composite morphism $g \circ f$ from $A$ to $C$ is defined. The operation of composition is associative, and for every object $A$ there is an identity element $1_A$ in Hom$(A, A)$. If $\mathcal{K}_1$ and $\mathcal{K}_2$ are two categories and $F: \mathcal{K}_1 \to \mathcal{K}_2$ is a rule that assigns to each $A \in \mathcal{K}_1$ the object $F(A) \in \mathcal{K}_2$ and to every morphism $f: A \to B$ in $\mathcal{K}_1$ a morphism $F(f): F(A) \to F(B)$ in $\mathcal{K}_2$, and in addition preserves the operation of composition and the identity morphism, then $F$ is called a covariant functor. If $F$ reverses the arrow, i.e., $F(f): F(B) \to F(A)$, it is called a contravariant functor.

Fact 0.6.6: The algebra $\Omega^*$ is a contravariant functor from the category of Euclidean spaces $\{\mathbb{R}^n\}_{n \in \mathbb{Z}}$ and smooth maps $\mathbb{R}^m \to \mathbb{R}^n$ to the category of commutative differential graded algebras¹⁷ and their homomorphisms. $\Omega^*$ is unique in the sense that it is the only such functor that is the pullback of functions on $\Omega^0(\mathbb{R}^n)$.

16. See 1.[5] for $H^*_{DR}(M)$.

17. Commutative here means $\tau \cdot \omega = (-1)^{\deg \tau\, \deg \omega}\, \omega \cdot \tau$.

6.3 Mayer-Vietoris Sequence

Fact (0.6.6) leads to the important Mayer-Vietoris sequence, which is a tool in computing the cohomology of the union of two open sets. Let $U$ and $V$ be two open sets, and denote $U \cup V$ by $M$; then we have the inclusion maps:

$$M \leftarrow U \sqcup V \overset{\partial_0}{\underset{\partial_1}{\leftleftarrows}} U \cap V \qquad (0.6.13)$$

where $U \sqcup V$ denotes the disjoint union of $U$ and $V$, and $\partial_0$, $\partial_1$ denote the inclusion maps of $U \cap V$ in $V$ and $U$ respectively. When we apply the contravariant functor $\Omega^*$ to (0.6.13), we note that the mappings $\partial_0$ and $\partial_1$ induce pullback mappings $\partial_0^*$ and $\partial_1^*$ on $\Omega^*(U \sqcup V)$ which carry the forms in $\Omega^*(U \sqcup V)$ to forms on the submanifold $U \cap V$. We thus have a sequence of restrictions of forms¹⁸:

$$\Omega^*(M) \to \Omega^*(U) \oplus \Omega^*(V) \overset{\partial_0^*}{\underset{\partial_1^*}{\rightrightarrows}} \Omega^*(U \cap V). \qquad (0.6.14)$$

The sequence obtained by taking the difference of the last two maps is called the Mayer-Vietoris sequence (MV sequence):

$$0 \to \Omega^*(M) \to \Omega^*(U) \oplus \Omega^*(V) \to \Omega^*(U \cap V) \to 0, \qquad (\omega_U, \omega_V) \mapsto \omega_V - \omega_U. \qquad (0.6.15)$$

Result 0.6.7: The MV sequence is exact, and induces a (long) exact sequence in cohomology which is also called the MV sequence (see (0.6.7)):

$$\cdots \to H^q(M) \to H^q(U) \oplus H^q(V) \to H^q(U \cap V) \xrightarrow{\ d^*\ } H^{q+1}(M) \to H^{q+1}(U) \oplus H^{q+1}(V) \to H^{q+1}(U \cap V) \to \cdots \qquad (0.6.16)$$

Here $d^*$ is the coboundary operator. See 1.[5] for details. We next define another object which plays an important role in the theory of Yang-Mills and strings.
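To see the long exact sequence (0.6.16) at work, one can compute the cohomology of the circle and compare with (0.6.12) for $n = 1$. The following is a sketch of that standard computation; the particular cover by two open arcs and the symbol $\delta$ for the difference map are choices made for this illustration, and the cohomology of each arc is taken from the Poincaré Lemma (Fact 0.6.3).

```latex
% Cover S^1 by two open arcs U and V, each diffeomorphic to R^1;
% U \cap V is then a disjoint union of two open intervals.
\begin{gathered}
0 \to H^0(S^1) \to H^0(U)\oplus H^0(V) \xrightarrow{\ \delta\ } H^0(U\cap V)
  \xrightarrow{\ d^*\ } H^1(S^1) \to H^1(U)\oplus H^1(V) = 0, \\
H^0(U)\oplus H^0(V) \cong \mathbb{R}\oplus\mathbb{R}, \qquad
H^0(U\cap V) \cong \mathbb{R}\oplus\mathbb{R}, \qquad
\delta(a, b) = (b - a,\; b - a), \\
H^0(S^1) \cong \ker\delta \cong \mathbb{R}, \qquad
H^1(S^1) \cong \operatorname{coker}\delta \cong \mathbb{R}.
\end{gathered}
```

Here $(a, b)$ stands for a pair of constant functions on $U$ and $V$, and $\delta$ restricts their difference to each of the two components of $U \cap V$. The image of $\delta$ is one-dimensional, so both $H^0(S^1)$ and $H^1(S^1)$ are one-dimensional, in agreement with (0.6.12).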

6.4 Homotopy

Definition 0.6.8: A homotopy between two maps $f$ and $g$ from a manifold $M$ to another manifold $N$ is a map $F: M \times \mathbb{R}^1 \to N$ such that:

$$F(x, t) = \begin{cases} f(x) & \text{for } t \geq 1 \\ g(x) & \text{for } t \leq 0 \end{cases} \qquad (0.6.17)$$

Thus if $s_0$ and $s_1$ are the 0-section and 1-section of $M \to M \times \mathbb{R}^1$, i.e., $s_0(x) = (x, 0)$ and $s_1(x) = (x, 1)$ (see Fact (0.6.2)), then

$$f = F \circ s_1 \quad \text{and} \quad g = F \circ s_0. \qquad (0.6.18)$$

This observation leads to the fact:

Fact 0.6.9: Homotopic maps induce the same maps in cohomology.

From (0.6.18) we have:

$$f^* = (F \circ s_1)^* = s_1^* \circ F^*, \qquad g^* = (F \circ s_0)^* = s_0^* \circ F^*. \qquad (0.6.19)$$

In view of the mapping (0.6.9) it implies that both $s_0^*$ and $s_1^*$ invert $\pi^*$ and hence they are equal, and as a consequence $f^*$ and $g^*$ are equal. This establishes the above fact.

Definition 0.6.10: Two manifolds $M$ and $N$ are said to have the same homotopy type in the $C^\infty$-sense if there are $C^\infty$ maps $f: M \to N$ and $g: N \to M$ such that $f \circ g$ and $g \circ f$ are homotopic to the identity map on $N$ and $M$ respectively. The relation of homotopy, denoted $\sim$, is an equivalence relation, i.e., if $f \sim g$ and $g \sim h$, then $f \sim h$. The collection of all homotopic mappings belonging to the equivalence class of $f$ is denoted $[f]$. We use this to define one of the most fundamental concepts of topology: the homotopy groups.

Definition 0.6.11: Consider a topological space $X$ with base point¹⁹ denoted by $*$, and let $I^q$ denote the $q$-dimensional unit cube with faces $\dot{I}^q$. For $q \geq 1$ the $q$-th homotopy group $\pi_q(X)$ of $X$ is the collection of homotopy classes of maps from $I^q$ to $X$ which send $\dot{I}^q$ to the base point $*$. The group operation in $\pi_q(X)$ is defined by obtaining a new class $[\gamma]$ from the two given classes $[\alpha]$ and $[\beta]$ via the product given below:

$$\gamma(t_1, t_2, \dots, t_q) = \begin{cases} \alpha(2t_1, t_2, \dots, t_q) & \text{for } 0 \leq t_1 \leq \frac{1}{2} \\ \beta(2t_1 - 1, t_2, \dots, t_q) & \text{for } \frac{1}{2} \leq t_1 \leq 1 \end{cases} \qquad (0.6.20)$$

Equivalently $\pi_q(X)$ can be thought of as the homotopy classes of base-point preserving maps from the $q$-sphere $S^q$ to $X$. See Fig. (0.4) for the pictorial representation of the group operation of mappings $\alpha \in [\alpha]$, $\beta \in [\beta]$ from $I^q$ and $S^q$ to $X$.

Fig. 0.4 Homotopy classes of maps from the $q$-cube $I^q$ and from $S^q$.

18. The forms here are called restricted to a submanifold since they are the images under the pullback maps $\partial_0^*$ and $\partial_1^*$.

19. See 1.[5] for more technical details.


We note that when $X$ is replaced by a Lie group $G$, the homotopy groups $\pi_n(G)$ (constructed by using homotopically equivalent maps from $S^n$ to $G$) are used in the topological classification of gauge fields. More precisely, every gauge field on $S^n$ is assigned an element of $\pi_{n-1}(G)$. For example, when $n = 2$ and $G = U(1)$, the homotopy group $\pi_1(U(1)) \cong \mathbb{Z}$, and the gauge fields on $S^2$ are magnetic monopoles. We shall use the homotopy classes and homotopy groups while studying the solutions of Yang-Mills equations in Chapter 10.
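The isomorphism $\pi_1(U(1)) \cong \mathbb{Z}$ can be illustrated with the product (0.6.20) for $q = 1$: the integer attached to a loop is its winding number, and the winding number of a product of two loops is the sum of the two winding numbers. The sketch below realizes $U(1)$ as the unit circle in the complex plane with base point 1; the representative loops, the sampling density and the function names are assumptions of this example.

```python
import cmath
import math


def loop(m):
    """Loop t -> exp(2 pi i m t) in U(1), based at 1, winding m times."""
    return lambda t: cmath.exp(2j * math.pi * m * t)


def product(alpha, beta):
    """Concatenation of two based loops as in (0.6.20) with q = 1."""
    def gamma(t):
        return alpha(2.0 * t) if t <= 0.5 else beta(2.0 * t - 1.0)
    return gamma


def winding_number(curve, samples=4000):
    """Total change of the argument along the closed curve, in units of 2 pi."""
    total = 0.0
    previous = curve(0.0)
    for k in range(1, samples + 1):
        current = curve(k / samples)
        # Phase of the ratio is the small angle swept between two samples.
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2.0 * math.pi))


if __name__ == "__main__":
    alpha, beta = loop(2), loop(-5)
    print(winding_number(alpha), winding_number(beta),
          winding_number(product(alpha, beta)))
```

For the loops chosen here the three winding numbers are 2, -5 and -3, so the class of the product corresponds to the sum of the two integers.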

REFERENCES

1. W. Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry (New York: Academic Press, 1975).
2. N. Bourbaki, Topological Vector Spaces (Chapters 1-5, Springer-Verlag, 1987).
3. Y. Choquet-Bruhat, C. DeWitt-Morette, M. Dillard-Bleick, Analysis, Manifolds and Physics (North Holland Publishing Company, 1977).
4. I. M. Gelfand, et al., in 2.[19].
5. V. Guillemin and A. Pollack, Differential Topology (New Jersey: Prentice-Hall, 1974).
6. S. Kobayashi and K. Nomizu, Foundations of Differential Geometry I, 1.[10].
7. J. Milnor, (a) Morse Theory (Annals of Math. Studies 51, Princeton University Press, 1963); (b) Topology from the Differentiable Viewpoint (University of Virginia Press, 1965).
8. N. Prakash, Differential Geometry, An Integrated Approach (New Delhi: Tata-McGraw Hill, 1981).
9. F. Riesz and B. Sz. Nagy, Functional Analysis (New York: F. Ungar Publishing Company, 1955).
10. M. Spivak, A Comprehensive Introduction to Differential Geometry (Vol. 1, Publish or Perish, 1979).
11. A. Weil, L'Intégration dans les Groupes Topologiques (Paris: Hermann, 1940).
12. K. Yosida, Functional Analysis (2nd ed., Springer-Verlag, 1965).

CHAPTER 1

COMPLEX FUNCTIONS, RIEMANN SURFACES AND TWO-DIMENSIONAL CONFORMAL FIELD THEORY (AN INTRODUCTION)

The theory of complex variables and complex functions has long been used to gain a better understanding of physics. More recently it has been shown that complex formulations of physical theories (e.g., string theories, 2-dimensional conformal field theories) not only lead to a better understanding of these theories but are often instrumental in bringing out their elegance. We shall see in later chapters (4, 5 and 11) how some of the classical concepts (infinite series, transforms of functions, Möbius transformations, Riemann surfaces, etc.) are used in affine and current algebras, and in string and superstring theories. In this chapter we give a brief account of these basic concepts. In Sec. 1 we give a few definitions, and in Sec. 2 we describe the notion of complex structure on a manifold. In Sec. 3 and Sec. 4 we describe Riemann surfaces and the 2-dimensional conformal field theory, respectively.

1 COMPLEX FUNCTIONS

1.1 Complex Plane

Geometrically, the XY-plane is the complex plane or the Z-plane, when we choose to treat the Y axis as the imaginary axis by replacing the $y$-coordinate of a point by $iy$. The coordinates $(x, y)$ of a point in the plane are replaced by a single coordinate $z = x + iy$ in the Z-plane. Some geometric configurations are represented more simply using the complex coordinate. For instance, the equation of a circle with center $z_0$ and radius $r$ is $|z - z_0| = r$, where $|z|$ is the absolute value: $|z| = \sqrt{z\bar{z}} = \sqrt{(x + iy)(x - iy)}$.

1.2 Analytic Function

Definition 1.1.1: A complex-valued function $f$ of $z$ is called (complex-)analytic or holomorphic at a point $z_0$ if its derivative exists at $z_0$ as well as at each point in some neighbourhood of $z_0$. It is analytic in a region $R$ if it is analytic at every point of $R$. Note that the function $f(z) = z^n$ is analytic for all positive integral values of $n$, but $f(z) = |z|^n$ is not. An analytic function is called an entire function if it is analytic in the whole of the complex plane; $f(z) = z^n$ given above is an entire function. If a function $f(z)$ fails to be analytic at a point $z_0$ but is analytic at some point in every neighbourhood of $z_0$, then $z_0$ is called a point of singularity. For instance, for $f(z) = \frac{1}{z}$, zero is a point of singularity.

1.3 Harmonic Functions

A function $g: \mathbb{R}^2 \to \mathbb{R}$ is said to be harmonic in a given domain $D$ of $\mathbb{R}^2$ if it has continuous first and second partial derivatives there, and satisfies:

$$g_{xx}(x, y) + g_{yy}(x, y) = 0. \qquad (1.1.1)$$

Treating $D$ as a domain in the Z-plane, we note that if an analytic function $f(z)$ is written as $u(x, y) + iv(x, y)$, then its real and imaginary components $u$ and $v$ satisfy the Cauchy-Riemann equations:

$$\text{(a) } u_x = v_y \qquad \text{(b) } u_y = -v_x \qquad (1.1.2)$$

and are harmonic functions, i.e.,

$$\text{(a) } u_{xx} + u_{yy} = 0 \qquad \text{(b) } v_{xx} + v_{yy} = 0. \qquad (1.1.3)$$

Note that (1.1.3) (a) and (b) are Laplace equations for $u$ and $v$. If two given functions $u(x, y)$ and $v(x, y)$ in $D$ are harmonic and in addition satisfy the Cauchy-Riemann equations (1.1.2) (a) and (b) throughout $D$, then $v$ is said to be a harmonic conjugate of $u$. Hence a necessary and sufficient condition for a function $f(z) = u + iv$ to be analytic in $D$ is that $v$ be a harmonic conjugate of $u$ in $D$. If one relates $u$ and $v$ with fluid flow, so that the curves $v(x, y) = c$ are the stream lines of the flow with $v(x, y)$ a stream function and $u$ the velocity potential (velocity being $u_x + iu_y$), then the analytic function $f(z) = u(x, y) + iv(x, y)$ is called the complex potential of the flow. If a function $f$ is analytic at a given point of a domain, then all its derivatives are also analytic at that point. Two important formulae used for analytic functions are:

$$f(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - z_0}\, dz \qquad (1.1.4a)$$

$$f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\, dz, \qquad n = 0, 1, 2, \dots \qquad (1.1.4b)$$

where $C$ denotes a closed contour, described counterclockwise, that encloses the region $R$ in which $f$ is analytic, and $z_0$ is an interior point of $R$. Evidently $n = 0$ gives (1.1.4a). If $C$ is replaced by a circle $C_0: |z - z_0| = r_0$ enclosing the open disc $|z - z_0| < r_0$, and the function $f(z)$, which is analytic within and on the circle, satisfies $|f(z)| \leq M$ on the circle, with $M$ the maximum value of $|f(z)|$ there, then using (1.1.4) we have the Cauchy inequality:

$$|f^{(n)}(z_0)| \leq \frac{n!\, M}{r_0^{\,n}} \qquad (n = 1, 2, \dots) \qquad (1.1.5)$$

Fig. 1.1 Domain of an analytic function
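The formulae (1.1.4b) and (1.1.5) lend themselves to a direct numerical check. The sketch below evaluates the contour integral on a circle by the trapezoidal rule and compares the result with a known derivative and with the Cauchy bound; the choice $f = \exp$, the point `z0`, the radius and the function names are assumptions made for illustration.

```python
import cmath
import math


def cauchy_derivative(f, z0, n, radius=1.0, samples=256):
    """n-th derivative of f at z0 from (1.1.4b) on the circle |z - z0| = radius.

    The contour integral is approximated by the trapezoidal rule in the angle.
    """
    total = 0j
    for k in range(samples):
        w = cmath.exp(2j * math.pi * k / samples)       # point on the unit circle
        z = z0 + radius * w
        dz = 2j * math.pi * radius * w / samples        # dz = i r e^{i phi} d(phi)
        total += f(z) / (z - z0) ** (n + 1) * dz
    return math.factorial(n) / (2j * math.pi) * total


def cauchy_bound(n, max_modulus, radius):
    """Right-hand side of the Cauchy inequality (1.1.5)."""
    return math.factorial(n) * max_modulus / radius ** n


if __name__ == "__main__":
    z0 = 0.3 + 0.2j
    approx = cauchy_derivative(cmath.exp, z0, 3)
    # On |z - z0| = 1 the maximum of |exp(z)| is exp(Re(z0) + 1).
    bound = cauchy_bound(3, math.exp(z0.real + 1.0), 1.0)
    print("numerical third derivative:", approx)
    print("exact value exp(z0):       ", cmath.exp(z0))
    print("Cauchy bound:              ", bound)
```

Since every derivative of the exponential is the exponential itself, the numerical value can be compared directly with `cmath.exp(z0)`, and its modulus stays below the bound of (1.1.5).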

If $f$ is analytic at $z_0$, there is a circle $|z - z_0| = r_0$ around $z_0$ such that in $|z - z_0| < r_0$ the function $f$ is represented as:

$$f(z) = a_0 + \sum_{n=1}^{\infty} a_n (z - z_0)^n \quad \text{(Taylor series)} \qquad (1.1.6)$$

where $a_0 = f(z_0)$ and $a_n = f^{(n)}(z_0)/n!$ ($n = 1, 2, \dots$). If $z_0$ is a zero of $f(z)$, then $a_0 = 0$; if, in addition, $f'(z_0) = f''(z_0) = \cdots = f^{(k-1)}(z_0) = 0$ but $f^{(k)}(z_0) \neq 0$, then $z_0$ is called a zero of order $k$ and (1.1.6) can be written as

$$f(z) = (z - z_0)^k \sum_{n=0}^{\infty} a_{n+k} (z - z_0)^n \qquad (a_k \neq 0,\ |z - z_0| < r_0). \qquad (1.1.7)$$

The above discussion implies that if ZQ is a zero of/, then there is a neighbourhood of z0 in which /has no other zero unless/is identically zero; thus zeros of an analytic function are isolated points of a given region. We now state another series expansion for an analytic function. This series, known as Laurent series, is used in string theories—in particular in the definition of affine Lie algebras.

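1.4 Laurent Series

Let $C_1$ and $C_2$ be two positively oriented circles $|z - z_0| = r_1$, $|z - z_0| = r_2$ ($r_2 < r_1$) which enclose an annular domain $D$, and let $f$ be an analytic function in $D$ and on $C_1$, $C_2$; then at each point $z$ in $D$, $f(z)$ can be expanded as:

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n + \sum_{n=1}^{\infty} \frac{b_n}{(z - z_0)^n} \qquad (1.1.8)$$

where

$$a_n = \frac{1}{2\pi i} \oint_{C_1} \frac{f(z)\, dz}{(z - z_0)^{n+1}} \qquad (n = 0, 1, 2, \dots) \qquad (1.1.9)$$

$$b_n = \frac{1}{2\pi i} \oint_{C_2} \frac{f(z)\, dz}{(z - z_0)^{-n+1}} \qquad (n = 1, 2, \dots) \qquad (1.1.10)$$

The series given by (1.1.8) is called the Laurent series of $f(z)$.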

Y

[ Z°' ) DJ \l

I

^

^S

y/

x

E ^ ^ Q Domain of expansion of a Laurent series
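As an illustration of the coefficient integrals, the sketch below evaluates them numerically for a sample function. Everything specific here is an assumption chosen for the example: the function $f(z) = 1/(z(z - 2))$, the centre $z_0 = 0$, the contour radius 1 (any circle inside the annulus $0 < |z| < 2$ gives the same value, since the integrand is analytic there), and the helper name `laurent_coefficient`. A single routine returns the coefficient of $(z - z_0)^n$ for any integer $n$, so that $b_n$ corresponds to the index $-n$.

```python
import cmath
import math

def laurent_coefficient(f, z0, n, radius, samples=4000):
    """Coefficient of (z - z0)^n, i.e. (1/(2 pi i)) * integral of
    f(zeta) / (zeta - z0)^(n+1) over the circle |zeta - z0| = radius.
    With zeta - z0 = r e^{i theta} the integral reduces to the mean
    value of f(zeta) / (r e^{i theta})^n over the circle."""
    total = 0j
    for k in range(samples):
        w = cmath.exp(2j * math.pi * k / samples)
        total += f(z0 + radius * w) / (radius * w) ** n
    return total / samples

# Hypothetical example: f(z) = 1 / (z (z - 2)) in the annulus 0 < |z| < 2.
# Expanding by hand: f(z) = -1/(2z) - 1/4 - z/8 - ...
def f(z):
    return 1 / (z * (z - 2))

b1 = laurent_coefficient(f, 0, -1, radius=1.0)   # coefficient of 1/z
a0 = laurent_coefficient(f, 0, 0, radius=1.0)
a1 = laurent_coefficient(f, 0, 1, radius=1.0)
b2 = laurent_coefficient(f, 0, -2, radius=1.0)   # vanishes: the pole is simple
```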


1.5 Simply Connected and Multiply Connected Domain A domain D in which every simple closed contour encloses only points of D is called a simply connected domain. The set of points interior to a simple closed contour is an easy example of such a domain. (Naturally) a domain that is not simply connected is a multiply connected domain. The annular domain in Fig. 1.2 is a multiply connected domain, since the region enclosed by the circle C2 (a simple closed contour) does not contain the points of domain.

1.6 Residues and Poles

From the definition of an isolated singularity $z_0$ of an analytic function, we already know that there is a neighbourhood of $z_0$ where $f$ is analytic at all points except at $z_0$; thus there is a positive number $r$ such that $f$ is analytic at each point for which $0 < |z - z_0| < r$. The function in this domain can be expressed in terms of a Laurent series:

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n + \frac{b_1}{z - z_0} + \frac{b_2}{(z - z_0)^2} + \cdots \tag{1.1.11}$$

where $b_1, b_2, b_3, \ldots$ are given by the integral (1.1.10). The complex number

$$b_1 = \frac{1}{2\pi i}\int_C f(z)\,dz$$

where $C$ is a positively oriented simple closed contour such that $f$ is analytic on $C$ and at all points interior to $C$ except at $z_0$, is called the residue of $f$ at $z_0$. Suppose that $f$ is analytic on a positively oriented simple closed contour $C$ as well as on points interior to it except at a finite number of singular points $z_1, \ldots, z_k$; then

$$\int_C f(z)\,dz = 2\pi i\,(B_1 + B_2 + \cdots + B_k) \tag{1.1.12}$$

where $B_i$ denotes the residue of $f$ at $z_i$ $(i = 1, 2, \ldots, k)$. If the analytic function $f$ at an isolated singular point $z_0$ has the Laurent expansion (1.1.11), then the sum with negative powers of $(z - z_0)$ is called the principal part of $f$ at $z_0$. Moreover, if this principal part contains a finite number of non-zero terms, the last being $b_k \ne 0$ with $b_{k+1}, b_{k+2}, \ldots$ all zero, then the isolated singular point $z_0$ is called a pole of order $k$ of the function $f$. When $k = 1$, it is called a simple pole. For instance, the function

$$f(z) = \frac{z^2 - 3z + 5}{z - 3} = 3 + (z - 3) + \frac{5}{z - 3} \qquad (|z - 3| > 0) \tag{1.1.13}$$

has an isolated singularity at $z = 3$, which is a simple pole. The residue there is 5, whereas the function

$$\frac{\sinh z}{z^6} = \frac{1}{z^5} + \frac{1}{3!\,z^3} + \frac{1}{5!\,z} + \frac{z}{7!} + \frac{z^3}{9!} + \cdots \qquad (|z| > 0) \tag{1.1.14}$$

has an isolated singularity at $z = 0$ which is a pole of order 5, and the residue there is $1/5!$. When the principal part of $f$ at $z_0$ has an infinite number of terms, $z_0$ is called an essential singularity. An example of an essential singularity is given by the function:

$$\exp\left(\frac{1}{z}\right) = 1 + \sum_{n=1}^{\infty} \frac{1}{n!}\,\frac{1}{z^n} \qquad (|z| > 0) \tag{1.1.15}$$

where $z = 0$ is the singular point. Details on results listed above can be found in any elementary text on complex variables (see for instance [1] and [7]). Finally we give two examples to illustrate some of the ideas introduced in this section.

Example 1.1.2: Let $A$ be a real constant. The transformation

$$z = \frac{Ai - w}{Ai + w} \tag{1.1.16}$$

between the $w$-plane and the $z$-plane satisfies:

$$\text{(a)}\ |z| < 1 \ \text{iff}\ \operatorname{Im} w > 0, \qquad \text{(b)}\ |z| = 1 \ \text{iff}\ \operatorname{Im} w = 0 \tag{1.1.17}$$

and

$$\frac{dz}{iz} = \frac{2A}{A^2 + w^2}\,dw. \tag{1.1.18}$$

[Figure: Cayley transform of the circle]

The above transformation is called the Cayley transform of the circle into the upper half plane. From (1.1.16) we have:

$$z = -1 + \frac{2Ai}{Ai + w},$$

therefore

$$dz = -\frac{2Ai}{(Ai + w)^2}\,dw$$

or

$$\frac{dz}{z} = -\frac{2Ai\,(Ai + w)}{(Ai + w)^2 (Ai - w)}\,dw = \frac{2Ai}{A^2 + w^2}\,dw.$$

Statements (1.1.17) (a) and (b) can be established by writing $z = x + iy$ and $w = u + iv$ and simplifying thereafter. Note that $w$ as a function of $z$ can be written as

$$w = Ai\,\frac{1 - z}{1 + z}$$

and that when A = — in (1.1.16), it becomes the Möbius transformation (see Sec. 3).

Example 1.1.3: The line element $(dx^2 + dy^2)$ in complex coordinates can be expressed as a product $dz\,d\bar z$. This is immediate from the definition $z = x + iy$, which implies $dz = dx + i\,dy$ and $d\bar z = dx - i\,dy$.

1.7 Elliptic Curves

Let $\Gamma = \Gamma_1 + i\Gamma_2$ be an element of $\mathbb{C}$ with $\Gamma_2 > 0$. The set of points defined by:

$$E_\Gamma = \mathbb{C}/(2\pi\mathbb{Z} + 2\pi\Gamma\mathbb{Z}) \tag{1.1.19}$$

is called an elliptic curve (denoted $E_\Gamma$) associated to the complex number $\Gamma$. Given below is a simple diagram pertaining to the area enclosed by $E_\Gamma$, which can easily be seen to be $4\pi^2\Gamma_2$ (see p. 15 in 5.[14] for its use in string path integrals). The area of the parallelogram $OACB$ is evidently the same as that of $B'C'CB$ with arm lengths $2\pi$ and $2\pi\Gamma_2$, respectively.

[Figure: the parallelogram OACB with labels 2π, 2πΓ and 2π(Γ + 1), together with B'C'CB]
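The area claim can be verified directly: the parallelogram is spanned by the two periods $2\pi$ and $2\pi\Gamma$, and its area is the absolute value of the cross product of these two vectors in the plane. The sketch below does this for one sample value of $\Gamma$; the value chosen and the helper name `fundamental_area` are illustrative assumptions.

```python
import math

def fundamental_area(gamma):
    """Area of the parallelogram spanned by the periods 2*pi and 2*pi*gamma,
    computed as the absolute value of the planar cross product."""
    a = complex(2 * math.pi, 0.0)
    b = 2 * math.pi * gamma
    return abs(a.real * b.imag - a.imag * b.real)

# Hypothetical sample value with positive imaginary part.
gamma = 0.37 + 1.5j
area = fundamental_area(gamma)
expected = 4 * math.pi ** 2 * gamma.imag   # the value 4 pi^2 Gamma_2 from the text
```

The real part $\Gamma_1$ drops out, which is the shear that turns the parallelogram into a figure with the same base and height.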

2 COMPLEX STRUCTURES ON A MANIFOLD, KÄHLER METRIC

2.1 Complex Manifold M

Definition 1.2.1: In layman's language, an $n$-dimensional complex manifold is a real manifold of dimension $2n$ on which complex coordinates with holomorphic (i.e., analytic on the real manifold) transition functions can always be found. More precisely, let $M$ be a complex manifold of (complex) dimension $n$ and $z^\lambda$ $(\lambda = 1, 2, \ldots, n)$ a system of complex local coordinates on an open subset $U$ of $M$. Let $z^\lambda = x^\lambda + iy^\lambda$; then $(x^1, y^1, \ldots, x^n, y^n)$ is a system of (real) local coordinates of the differentiable manifold $M$ on $U$.

2.2

Complex Structure on M

For each $x \in U$ we define a linear transformation $J_x$ of the tangent space $T_x(M)$ that transforms the pair $\left(\dfrac{\partial}{\partial x^\lambda}, \dfrac{\partial}{\partial y^\lambda}\right)$ in the following manner:

$$J_x\left(\frac{\partial}{\partial x^\lambda}\right) = \frac{\partial}{\partial y^\lambda}, \qquad J_x\left(\frac{\partial}{\partial y^\lambda}\right) = -\frac{\partial}{\partial x^\lambda} \tag{1.2.1}$$

The linear transformation $J_x$ satisfies the condition:

$$J_x^2 = -1 \tag{1.2.2}$$

and the assignment $J : x \to J_x$ defines a tensor field of type (1,1) on the differentiable manifold $M$.¹ The tensor $J$ is called a complex structure of $M$.
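For a single coordinate pair, (1.2.1) and (1.2.2) can be made concrete with a $2 \times 2$ matrix. The sketch below is only an illustration under assumed conventions: vectors are written in the ordered basis $(\partial/\partial x, \partial/\partial y)$, the columns of `J` are the images of the basis vectors, and the helper names are hypothetical. It also checks the standard fact that, after complexification, the vector $\partial/\partial x - i\,\partial/\partial y$ is an eigenvector of $J$ with eigenvalue $i$.

```python
# J in the basis (d/dx, d/dy): J(d/dx) = d/dy and J(d/dy) = -d/dx,
# so the first column is (0, 1) and the second column is (-1, 0).
J = [[0, -1],
     [1, 0]]

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(m, v):
    """Action of a 2x2 matrix on a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

J_squared = mat_mul(J, J)          # should be minus the identity, as in (1.2.2)

# Complexified tangent vector d/dx - i d/dy, written as components (1, -i).
v_plus = [1, -1j]
J_v_plus = mat_vec(J, v_plus)      # should equal i * v_plus
```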

2.3

The Tangent and Cotangent Spaces to M

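Let $T_x(M)^c$ and $T_x^*(M)^c$ be the complexifications of the tangent and the cotangent vector spaces respectively; then the elements $\dfrac{\partial}{\partial z^\lambda}, \dfrac{\partial}{\partial \bar z^\lambda}$ and $dz^\lambda, d\bar z^\lambda$ belonging to $T_x(M)^c$ and $T_x^*(M)^c$ (respectively) are given by:

$$\text{(a)}\ \frac{\partial}{\partial z^\lambda} = \frac{1}{2}\left(\frac{\partial}{\partial x^\lambda} - i\frac{\partial}{\partial y^\lambda}\right), \qquad \frac{\partial}{\partial \bar z^\lambda} = \frac{1}{2}\left(\frac{\partial}{\partial x^\lambda} + i\frac{\partial}{\partial y^\lambda}\right)$$

$$\text{(b)}\ dz^\lambda = dx^\lambda + i\,dy^\lambda, \qquad d\bar z^\lambda = dx^\lambda - i\,dy^\lambda. \tag{1.2.3}$$

Thus the endomorphism $J_x$ of the vector space $T_x(M)$ defines an endomorphism of $T_x(M)^c$ and we have:

$$T_x(M)^c = T_x^+(M) \oplus T_x^-(M), \qquad \overline{T_x^+} = T_x^-. \tag{1.2.4}$$

where $T_x^+$ (respectively $T_x^-$) consists of all $v \in T_x(M)^c$ such that $J_x v = iv$ (respectively $J_x v = -iv$) and the bar on $T_x^+$ denotes the conjugation of $T_x(M)^c$. The elements

$$\frac{\partial}{\partial z^\lambda}, \quad \frac{\partial}{\partial \bar z^\lambda} \qquad (\lambda = 1, 2, \ldots, n)$$

form the bases of $T_x^+$ and $T_x^-$ respectively at each point $x$ of the coordinate neighbourhood $U$.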

2.4

Holomorphic Vector Fields and Holomorphic Forms on M

Definition 1.2.2: A complex vector field $X$ on $M$ is a map on $M$ such that for every $x \in M$ there is an element $X_x \in T_x(M)^c$. On a coordinate neighbourhood $U$, $X$ can be uniquely expressed as:

$$X = \sum_{\lambda}\left(\xi^\lambda \frac{\partial}{\partial z^\lambda} + \xi^{\bar\lambda} \frac{\partial}{\partial \bar z^\lambda}\right) \tag{1.2.5}$$

¹ See the definition in Chapter 0.

where $\xi^\lambda$ and $\xi^{\bar\lambda}$ are complex valued smooth functions on $U$. $X$ is a real vector field (i.e., $X_x \in T_x(M)$) if and only if $\xi^{\bar\lambda} = \overline{\xi^\lambda}$ for $\lambda = 1, 2, \ldots, n$. For any complex vector field $X$, let $\bar X$ be the vector field such that $(\bar X)_x = \overline{X_x}$ at each $x$ in $M$; then $X$ is real if and only if $X = \bar X$. $X$ is said to be of type (1, 0) (respectively of type (0, 1)) if $X_x \in T_x^+$ (respectively $T_x^-$) at each point $x$. When $X$ is of type (1, 0) we can write it locally as:

$$X = \sum_{\lambda} \xi^\lambda \frac{\partial}{\partial z^\lambda}$$

If the components $\xi^\lambda$ are holomorphic functions of the local coordinates $(z^\lambda)$, then $X$ is called a holomorphic vector field on $M$.

Definition 1.2.3: A differential $r$-form $\alpha$ is a map defined on $M$ such that for every $x \in M$, $\alpha(x)$ is an alternating $r$-linear function on $T_x(M)^c$. We denote it as $\alpha_x$. In view of the decomposition (1.2.4) we can think of it as an element of:

$$\sum_{p+q=r} \Lambda^p (T_x^+)^* \otimes \Lambda^q (T_x^-)^* \tag{1.2.6}$$

where $\Lambda^p$ (respectively $\Lambda^q$) is the $p$-th power ($q$-th power) of the exterior product on the cotangent space $(T_x^+)^*$ (respectively $(T_x^-)^*$). Since the bases for these spaces are $\{dz^\lambda\}$ and $\{d\bar z^\lambda\}$, we can write:

$$\alpha = \sum_{p+q=r} \alpha_{p,q} \tag{1.2.7}$$

with $\alpha_{p,q}$ expressed as:

$$\alpha_{p,q} = \sum \alpha_{\lambda_1 \cdots \lambda_p \bar\mu_1 \cdots \bar\mu_q}\, dz^{\lambda_1} \wedge \cdots \wedge dz^{\lambda_p} \wedge d\bar z^{\mu_1} \wedge \cdots \wedge d\bar z^{\mu_q} \tag{1.2.8}$$

The differential form $\alpha_{p,q}$ is said to be of type $(p, q)$. It can be easily checked that if $\alpha$ is of type $(p, q)$, then the conjugate $\bar\alpha$ is of type $(q, p)$. A form $\alpha$ is holomorphic if its components are holomorphic functions of $z^\lambda$.

2.5 Some Calculus on M

Definition 1.2.4: A (complex) tensor field $T$ of order $(r, s)$ is a map on $M$ that assigns to each $x$ in $M$ an element of the tensor product of $\otimes^r T_x(M)^c$ and $\otimes^s T_x^*(M)^c$. Evidently $T$ is contravariant of order $r$, and covariant of order $s$. We now give a few elementary facts on the calculus of differential forms. But before this, we would like to note that unlike tensor fields, the order of differential forms is restricted by $2n$, the real dimensionality of $M$, and this natural phenomenon provides a rich mathematical structure to differential forms in the form of Hodge theory and Chern classes, etc., which in turn play an important role in modern physical theories, e.g., Yang-Mills and string theory.

Similar to real differential forms, we define two differential operators $d'$ and $d''$ on complex differential forms in the following manner:

$$d'\alpha = \sum_{p+q=r} d'\alpha_{p,q} \qquad \text{and} \qquad d''\alpha = \sum_{p+q=r} d''\alpha_{p,q} \tag{1.2.9}$$

$$\text{(a)}\ d'\alpha_{p,q} = \sum_{\Lambda, \mu, \nu} \frac{\partial \alpha_{\Lambda\bar\mu}}{\partial z^\nu}\, dz^\nu \wedge dz^\Lambda \wedge d\bar z^\mu, \qquad \text{(b)}\ d''\alpha_{p,q} = \sum_{\Lambda, \mu, \nu} \frac{\partial \alpha_{\Lambda\bar\mu}}{\partial \bar z^\nu}\, d\bar z^\nu \wedge dz^\Lambda \wedge d\bar z^\mu. \tag{1.2.10}$$

where we have used $\Lambda$ and $\mu$ to denote the sets $(\lambda_1, \ldots, \lambda_p)$ and $(\mu_1, \ldots, \mu_q)$. Evidently $d'\alpha$ and $d''\alpha$ are forms of type $(p + 1, q)$ and $(p, q + 1)$, respectively. It can be easily checked that:

$$\text{(a)}\ (d')^2 = (d'')^2 = 0, \qquad \text{(b)}\ d = d' + d'' \qquad \text{and} \qquad \text{(c)}\ \overline{d'\alpha} = d''\bar\alpha \tag{1.2.11}$$

Fact 1.2.5: A $p$-form $\alpha$ of type $(p, 0)$ is holomorphic if and only if $d''\alpha = 0$.

Definition 1.2.6: A form $\alpha$ on $M$ is called closed if its differential $d\alpha$ vanishes on $M$.

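Definition 1.2.7: A symmetric covariant 2-tensor field $T$ is said to be Hermitian if for any real vector fields $V_1, V_2$ on $M$, $T$ satisfies:

$$T(JV_1, JV_2) = T(V_1, V_2). \tag{1.2.12}$$

Given a Hermitian tensor field $T$, define

$$w_T(V_1, V_2) = T(V_1, JV_2) \tag{1.2.13}$$

then $w_T$ is a (real) differential form of type (1,1) and $w_T$ can be written locally as:

$$w_T = -i\,T_{\lambda\bar\mu}\, dz^\lambda \wedge d\bar z^\mu \tag{1.2.14}$$

where

$$T_{\lambda\bar\mu} = T\left(\frac{\partial}{\partial z^\lambda}, \frac{\partial}{\partial \bar z^\mu}\right). \tag{1.2.15}$$

Conversely, if there is a real differential 2-form $\theta$ of type (1,1) such that for any vector fields $V_1, V_2$ on $M$

$$T(V_1, V_2) = \theta(JV_1, V_2) \tag{1.2.16}$$

then $T$ is a Hermitian symmetric covariant 2-tensor field such that $\theta = w_T$² and locally

$$\theta_{\lambda\bar\mu} = -i\,T_{\lambda\bar\mu} \tag{1.2.17}$$

where

$$\theta_{\lambda\bar\mu} = \theta\left(\frac{\partial}{\partial z^\lambda}, \frac{\partial}{\partial \bar z^\mu}\right). \tag{1.2.18}$$

² Note the placement of $J$ in (1.2.13) and (1.2.16). This explains the factor $(-i)$ in (1.2.17).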


2.6

Kahler Manifold

Definition 1.2.8: A Riemannian metric $g$ on a complex manifold is said to be Hermitian if it is Hermitian as a symmetric covariant 2-tensor field. A Hermitian metric $g$ is called Kählerian if the real differential 2-form $w$ of type (1,1) associated to $g$ is a closed 2-form. The form $w$ is called the fundamental form of the Kähler metric $g$. Thus a Kähler manifold is a complex manifold coupled with a Kähler metric. We now state a few results without proof on Kähler metrics and Kähler manifolds (see ref. [10] for proofs).

Result 1.2.9: A real closed 2-form $\theta$ of type (1,1) is the fundamental form of a Kähler metric if and only if $\theta > 0$, i.e., the Hermitian matrix $(i\theta_{\lambda\bar\mu})$ given by relation (1.2.18) is positive definite.

Result 1.2.10: Let $\nabla$ denote the operator of covariant differentiation with respect to a Hermitian metric $g$ on a complex manifold $M$; then $M$ is Kählerian if and only if any of the following statements holds good:

$$\text{(i)}\ \nabla w = 0, \qquad \text{(ii)}\ \nabla J = 0, \qquad \text{(iii)}\ \nabla_\lambda Z_{\bar\mu} = \nabla_{\bar\lambda} Z_\mu = 0 \tag{1.2.19}$$

where

$$Z_\lambda = \frac{\partial}{\partial z^\lambda}, \qquad Z_{\bar\lambda} = \frac{\partial}{\partial \bar z^\lambda}, \qquad \nabla_\lambda = \nabla_{Z_\lambda}, \qquad \nabla_{\bar\lambda} = \nabla_{Z_{\bar\lambda}}$$

and $\lambda$, $\mu$ take all values $1, 2, \ldots, n$.

Result 1.2.11: Let $M$ be a Kähler manifold and $S$ the Ricci tensor of $M$ (see Chapter 8); then $S$ is a Hermitian symmetric covariant 2-tensor field. The real 2-form $s$ of type (1,1) that results from equality (1.2.13) is called the Ricci form of $M$. Locally $s$ can be written as:

$$s = -i\,S_{\lambda\bar\mu}\, dz^\lambda \wedge d\bar z^\mu, \qquad S_{\lambda\bar\mu} = S(Z_\lambda, Z_{\bar\mu}). \tag{1.2.20}$$

2.7 Harmonic Forms on a Kähler Manifold

We see next that when $M$ is compact, we can define another operator on the collection of forms on $M$. To this end we assume that $\theta$ and $\eta$ are any two forms of type $(p, q)$; then the inner product $(\ ,\ )$ between $\theta$ and $\eta$ is given by:

$$(\theta, \eta) = \int_M \langle\theta, \eta\rangle * 1 \tag{1.2.21}$$

where $*1$ denotes the volume element with respect to the metric $g$ on $M$ and $\langle\theta, \eta\rangle(x) = \langle\theta(x), \eta(x)\rangle$ is the scalar product on $M$ (i.e., a smooth function $\forall\, x \in M$). Let $\mathcal{E}^{p,q}$ denote the space of differential forms of type $(p, q)$; then there exists an operator $\delta'' : \mathcal{E}^{p,q} \to \mathcal{E}^{p,q-1}$ $(q \ge 1)$, $\delta''\mathcal{E}^{p,0} = 0$, such that

$$(d''\theta, \eta) = (\theta, \delta''\eta) \tag{1.2.22}$$

for any $\theta \in \mathcal{E}^{p,q-1}$ and $\eta \in \mathcal{E}^{p,q}$. Define another operator:

$$\Box'' = d''\delta'' + \delta'' d''. \tag{1.2.23}$$

The zeros of $\Box''$ are called harmonic forms of $M$.³ We now state a few results that involve these operators. Recall that on a smooth compact Riemannian manifold $M$ we have the so-called Laplace operator⁴:

$$\Delta = d\delta + \delta d \tag{1.2.24}$$

This operator can be viewed as operating on complex differential forms, when we assume that $M$ is Kähler. The operators $\Delta$ and $\Box''$ (denoted $\Delta''$) obey the fundamental relation:

$$2\Delta'' = \Delta \tag{1.2.25}$$

The following results are easy to prove.

Result 1.2.12: If $M$ is a compact Kähler manifold and $\theta \in \mathcal{E}^{p,0}$, then the three statements given below are equivalent:

(i) $d\theta = 0$, (ii) $\theta$ is holomorphic, (iii) $\theta$ is harmonic (1.2.26)

Result 1.2.13: If $M$ is the same as above, and $\theta \in \mathcal{E}^{0,q}$, then the statements given below are equivalent:

(i) $d\theta = 0$, (ii) $\theta$ is antiholomorphic, (iii) $\theta$ is harmonic (1.2.27)

Exercise 1.2 1. Prove Result (1.2.12). 2. Prove Result (1.2.13).

Hints to Exercise 1.2 Both of these exercises are easy to establish. One can use the procedures suggested in Exercises 6 and 7 of the next section or by first principles, using the definitions of these objects.

3

RIEMANN SURFACES

In the theory of strings, Riemann surfaces are important ingredients. In this section we define them and state some of the well known results related to their topology and geometry (differential as well as algebraic).

3.1

Riemann Surface M

Definition 1.3.1: A Riemann surface is a connected complex analytic manifold of complex dimension one, i.e., it is a two-real-dimensional manifold, with a maximal set of charts $\{U_\alpha, Z_\alpha\}$ that satisfies:

4

We have introduced D" and

(i) $Z_\alpha : U_\alpha \to \mathbb{C}$ and

(ii) whenever $U_\alpha \cap U_\beta \ne \emptyset$, the transition functions $f_{\alpha\beta} = Z_\alpha \circ Z_\beta^{-1} : Z_\beta(U_\alpha \cap U_\beta) \to Z_\alpha(U_\alpha \cap U_\beta)$⁵ are holomorphic. (Subscripts $\alpha$ and $\beta$ in (i) and (ii) belong to an index set $A$.)

A compact Riemann surface is sometimes referred to as closed, and a non-compact one as open. The simplest example of a Riemann surface is $\mathbb{C}$ with the single coordinate chart $(\mathbb{C}, \mathrm{id})$. Of course this is open. The one-point compactification $\mathbb{C} \cup \{\infty\}$ is the familiar closed Riemann surface known as the extended complex plane or Riemann sphere. The charts used in this case are $\{U_i, Z_i\}$ $(i = 1, 2)$ with $U_1 = \mathbb{C}$, $U_2 = (\mathbb{C} \setminus \{0\}) \cup \{\infty\}$ and

$$Z_1(z) = z, \quad z \in U_1; \qquad Z_2(z) = \frac{1}{z}, \quad z \in U_2$$

with the usual convention that $1/\infty = 0$. Since $U_i \cap U_j$ always equals $U_j \cap U_i$, the only (non-trivial) transition function is $f_{12} : \mathbb{C} \setminus \{0\} \to \mathbb{C} \setminus \{0\}$, giving

$$f_{12}(z) = \frac{1}{z}.$$

3.2 Holomorphic Mappings on M

Definition 1.3.2: Let $M$ and $N$ be any arbitrary Riemann surfaces. A mapping $\phi : M \to N$ is called holomorphic or analytic if for all coordinate pairs $(U, Z)$ and $(U', Z')$ on $M$ and $N$ respectively, whenever $U \cap \phi^{-1}(U') \ne \emptyset$ the mapping $Z' \circ \phi \circ Z^{-1} : Z(U \cap \phi^{-1}(U')) \to Z'(U')$ is holomorphic (note that this is indeed a mapping from $\mathbb{C}$ to $\mathbb{C}$).

Definition 1.3.3: The mapping $\phi$ defined above is called conformal if it is one-one and onto. It is obvious that in this case $\phi^{-1} : N \to M$ is also conformal.

Definition 1.3.4: When $N$ is $\mathbb{C}$, the holomorphic mapping $\phi$ is called a holomorphic function. When $N$ is the Riemann sphere $\mathbb{C} \cup \{\infty\}$, every mapping $\phi$ into $N$ other than the constant one which sends $M$ to $\infty$ is called a meromorphic function. The mapping $\phi$ is called constant if $\phi(M)$ is a point.

⁵ See [2] for the definition in terms of conformal structure defined by a family of local homeomorphisms on a connected Hausdorff space.


Definition 1.3.5: Let $\phi$ be a non-constant holomorphic mapping between $M$ and $N$. For $p \in M$ let $\phi$ take the value $\phi(p)$ $n$ times at $p$ $(n \ge 1)$⁶; the integer $n$ (the multiplicity of $\phi$ at $p$) is called the ramification number of $\phi$ at $p$, and $(n - 1)$, denoted $b_\phi(p)$, is called the branch number at $p$. We list the following well known results (useful in string theory) without proof. The proofs can be found in any standard book on the subject (see [3], [8]).

Result 1.3.6: If $M$ and $N$ are Riemann surfaces with $M$ compact and $\phi : M \to N$ is a holomorphic mapping, then

N be a non-constant holomorphic mapping between them. Then there exists a positive integer $m$ such that every $q \in N$ happens to be the image under $\phi$ (counting multiplicities) exactly $m$ times, i.e., for all $q \in N$:

$$\sum_{p \in \phi^{-1}(q)} \left(b_\phi(p) + 1\right) = m. \tag{1.3.1}$$

Definition 1.3.8: The integer $m$ is called the degree of $\phi$, and $\phi$ is called an $m$-sheeted (ramified) cover of $N$ by $M$. Equivalently $\phi$ is said to have $m$ sheets.
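The counting formula (1.3.1) can be illustrated with a polynomial map. In the sketch below, the map $\phi(z) = z^3 - 3z$ is an assumed example (regarded as a map of the Riemann sphere to itself, with $\infty$ sent to $\infty$), and the helper names are hypothetical. For a finite value $q$, the multiplicity of $\phi$ at a preimage $p$ is the order of the zero of $\phi(z) - q$ at $p$, found here by counting how many successive derivatives vanish.

```python
def evaluate(coeffs, z):
    """Horner evaluation; coeffs are listed from the highest power down."""
    result = 0
    for c in coeffs:
        result = result * z + c
    return result

def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]

def multiplicity(coeffs, root):
    """Order of the zero of the polynomial at root (0 if root is not a zero)."""
    count = 0
    while coeffs and evaluate(coeffs, root) == 0:
        count += 1
        coeffs = derivative(coeffs)
    return count

# phi(z) = z^3 - 3z, so phi(z) - q has coefficients [1, 0, -3, -q].
# Fibres with integer preimages: phi^{-1}(-2) = {1, -2}, phi^{-1}(2) = {-1, 2}.
fibres = {-2: [1, -2], 2: [-1, 2]}

sheet_counts = {}
for q, preimages in fibres.items():
    shifted = [1, 0, -3, -q]
    branch_numbers = [multiplicity(shifted, p) - 1 for p in preimages]
    sheet_counts[q] = sum(b + 1 for b in branch_numbers)   # left side of (1.3.1)
```

Both fibres give the same total, namely the degree 3 of the map, even though each fibre contains only two distinct points; the missing point is accounted for by the branch number 1 at $z = 1$ and at $z = -1$.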

3.3

Differential Forms on M, their Algebra and Calculus

Definition 1.3.9: Let $M$ be a Riemann surface. A 0-form on $M$ is a continuous function on $M$. A 1-form $\eta$ is an (ordered) assignment of two continuous functions $f_1$ and $f_2$ to each local coordinate $z$ $(= x + iy)$ on $M$ such that

$$\eta = f_1\,dx + f_2\,dy \tag{1.3.2}$$

is invariant under coordinate transformations. A 2-form $\Omega$ on $M$ is an assignment of a continuous function $g$ to each local coordinate $z$ $(= x + iy)$ such that:

$$\Omega = g\,dx \wedge dy \tag{1.3.3}$$

is invariant under coordinate transformations.⁸ The functions in (1.3.2) and (1.3.3), after a change of coordinates, are related to their counterparts by:

$$\begin{pmatrix} \tilde f_1(w) \\ \tilde f_2(w) \end{pmatrix} = \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{pmatrix} \begin{pmatrix} f_1(z(w)) \\ f_2(z(w)) \end{pmatrix} \tag{1.3.4}$$

$$\tilde g(w) = g(z(w)) \det \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{pmatrix} \tag{1.3.5}$$

where we have used the transformed coordinate $w = u + iv$. If we use complex (analytic) coordinates and note that coordinate changes are holomorphic, the 1-form and 2-form given by (1.3.2) and (1.3.3)⁹ can be written as:

$$F_1(z)\,dz + F_2(z)\,d\bar z \tag{1.3.2 (a)}$$

$$G\,dz \wedge d\bar z. \tag{1.3.3 (a)}$$

Since $dz = dx + i\,dy$ and $d\bar z = dx - i\,dy$, these functions in (1.3.2) and (1.3.3) are respectively related as

$$f_1 = F_1 + F_2, \qquad f_2 = i(F_1 - F_2) \tag{1.3.4 (a)}$$

and

$$g = -2iG \tag{1.3.5 (a)}$$

⁶ This is possible in the following manner. Choose local coordinates $\tilde z$ on $M$ and $w$ on $N$ that vanish at $p$ and $\phi(p)$ respectively, so that $w = \phi(\tilde z) = \sum_{k \ge n} a_k \tilde z^k$, $n > 0$, $a_n \ne 0$. Use another holomorphic mapping $h$, say, to write $w = \tilde z^n h(\tilde z)^n = (\tilde z\,h(\tilde z))^n = z^n$. Note that $\tilde z \to \tilde z\,h(\tilde z) = z$ is another local coordinate vanishing at $p$, and in terms of this new coordinate the mapping $\phi$ is given by $w = z^n$, showing that $\phi$ takes the value $\phi(p)$ $n$ times.

⁷ The ring of all holomorphic mappings on $M$.

⁸ Note that the forms have been defined in (1.2); we are repeating them here in local coordinates to simplify concepts on a Riemann surface.
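The relations (1.3.4)(a) and (1.3.5)(a) are pointwise linear-algebra identities, so they can be checked by evaluating both descriptions of a form on sample tangent vectors. The sketch below does this at a single point; the coefficient values, the test vectors and all helper names are illustrative assumptions. A tangent vector is represented by its components $(v_x, v_y)$, and the wedge of two 1-forms is evaluated as $(a \wedge b)(u, v) = a(u)b(v) - a(v)b(u)$.

```python
def dx(v):
    return v[0]

def dy(v):
    return v[1]

def dz(v):
    return v[0] + 1j * v[1]        # dz = dx + i dy

def dzbar(v):
    return v[0] - 1j * v[1]        # dzbar = dx - i dy

def wedge(a, b, u, v):
    """Value of the 2-form a ^ b on the pair of vectors (u, v)."""
    return a(u) * b(v) - a(v) * b(u)

# Hypothetical coefficient values at one point of the surface.
F1, F2, G = 0.7 - 0.2j, -1.3 + 0.5j, 0.4 + 0.9j

# Real-coordinate coefficients predicted by (1.3.4)(a) and (1.3.5)(a).
f1 = F1 + F2
f2 = 1j * (F1 - F2)
g = -2j * G

u, v = (1.5, -0.5), (0.25, 2.0)    # arbitrary test vectors

one_form_complex = F1 * dz(u) + F2 * dzbar(u)
one_form_real = f1 * dx(u) + f2 * dy(u)

two_form_complex = G * wedge(dz, dzbar, u, v)
two_form_real = g * wedge(dx, dy, u, v)
```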

Remark 1.3.10: In view of the rules of exterior multiplication $(dx \wedge dy = -dy \wedge dx)$, it is obvious that on a Riemann surface all forms of degree $> 2$ are zero; thus if $\Lambda^p$ denotes the vector space of $p$-forms, then $\Lambda^p$ is a module¹⁰ over $\Lambda^0$ and $\Lambda^p = \{0\}$ for $p \ge 3$. One can also talk about the integration and differentiation of these forms.

Definition 1.3.11: An $r$-form $(r = 0, 1, 2)$ can be integrated over $r$-chains $(r = 0, 1, 2)$. When $r = 0$, the chain is a finite set of points. When $r = 1$, it is a finite union of paths $(\cup_i C_i)$, and for $r = 2$ it is a finite union of discs $(\cup_i D_i)$. Thus integrations of 0, 1, 2 forms are given by:

$$\sum_\alpha n_\alpha f(p_\alpha) \quad \text{over the 0-chain } \sum_\alpha n_\alpha p_\alpha \tag{1.3.6}$$

with $n_\alpha \in \mathbb{Z}$, $p_\alpha \in M$, and

$$\int_{\cup_i C_i} \eta = \int_0^1 \left\{ f_1(x(t), y(t))\frac{dx}{dt} + f_2(x(t), y(t))\frac{dy}{dt} \right\} dt \tag{1.3.7}$$

where we have simplified $\cup_i C_i$ by treating it as a piecewise differentiable curve $C : I = [0, 1] \to M$ and have used the local expression (1.3.2) to write the right hand side of (1.3.7). Similarly

$$\iint_{\cup_i D_i} \Omega = \iint_D g(x, y)\,dx \wedge dy \tag{1.3.8}$$

Here again we have simplified $\cup_i D_i$ to a disc $D$ with a single coordinate chart and have used (1.3.3) to write the above equality. To define the differential operator on the forms, we must assume that they are at least $C^1$, i.e., the functions defining them are continuously differentiable at least up to the first order.

Definition 1.3.12: The differential operator $d$ assigns an $(r + 1)$-form to an $r$-form on a Riemann surface. The operator $d$ satisfies $d^2 \equiv 0$ (whenever $d^2$ is defined). Accordingly $d$ operating on a 0-form (function $f$) gives:

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = f_x\,dx + f_y\,dy \tag{1.3.9}$$

whereas for a 1-form (1.3.2) and a 2-form (1.3.3) it gives:

$$\begin{aligned} d\eta = d(f_1\,dx + f_2\,dy) &= df_1 \wedge dx + df_2 \wedge dy \\ &= ((f_1)_x\,dx + (f_1)_y\,dy) \wedge dx + ((f_2)_x\,dx + (f_2)_y\,dy) \wedge dy \\ &= (f_1)_y\,dy \wedge dx + (f_2)_x\,dx \wedge dy = ((f_2)_x - (f_1)_y)\,dx \wedge dy \end{aligned} \tag{1.3.10}$$

$$d\Omega = 0 \tag{1.3.11}$$

When we use complex analytic coordinates, we obtain two other differential operators defined by¹¹:

$$\partial f = f_z\,dz, \qquad \bar\partial f = f_{\bar z}\,d\bar z \tag{1.3.12}$$

for $C^1$-functions, and

$$\begin{aligned} \partial\omega &= \partial(f_1\,dz + f_2\,d\bar z) = \partial f_1 \wedge dz + \partial f_2 \wedge d\bar z \\ \bar\partial\omega &= \bar\partial(f_1\,dz + f_2\,d\bar z) = \bar\partial f_1 \wedge dz + \bar\partial f_2 \wedge d\bar z \end{aligned} \tag{1.3.13}$$

for a $C^1$ 1-form $\omega$. It is easy to check that the operators $\partial$ and $\bar\partial$ satisfy:

$$\text{(i)}\ d = \partial + \bar\partial, \qquad \text{(ii)}\ \partial^2 = \partial\bar\partial + \bar\partial\partial = \bar\partial^2 = 0 \tag{1.3.14}$$

The following result on $C^1$-forms links the operations of integration and differentiation on Riemann surfaces.

Result 1.3.13: Let $\eta$ be a $C^1$ $r$-form $(r = 0, 1, 2)$, and let $D$ be an $(r + 1)$-chain; then¹²

$$\int_D d\eta = \int_{\partial D} \eta \tag{1.3.15}$$

where $\partial D$ is the $r$-chain obtained by applying the boundary operator $\partial$. The above result, known as Stokes' theorem, is non-trivial only when $r = 1$.

⁹ We have used capital letters to distinguish them from the previous ones.

¹⁰ See Chapter 4 for modules.
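For $r = 1$, Stokes' theorem (1.3.15) combined with (1.3.10) is Green's theorem in a coordinate disc, which is easy to test numerically. The sketch below is an illustration under assumptions: the 1-form $\eta = -x^2 y\,dx + x y^2\,dy$ on the closed unit disc is a made-up example, the coefficient $(f_2)_x - (f_1)_y = x^2 + y^2$ of $d\eta$ is worked out by hand, and the quadrature routines are simple midpoint sums rather than anything from the text.

```python
import math

def boundary_integral(f1, f2, samples=2000):
    """Integral of f1 dx + f2 dy over the unit circle, traversed
    counterclockwise as (cos t, sin t), following (1.3.7)."""
    total = 0.0
    h = 2 * math.pi / samples
    for k in range(samples):
        t = k * h
        x, y = math.cos(t), math.sin(t)
        total += (f1(x, y) * (-math.sin(t)) + f2(x, y) * math.cos(t)) * h
    return total

def disc_integral(g, radial=400, angular=400):
    """Integral of g dx ^ dy over the unit disc by a midpoint rule in
    polar coordinates (area element r dr dt)."""
    total = 0.0
    dr = 1.0 / radial
    dt = 2 * math.pi / angular
    for i in range(radial):
        r = (i + 0.5) * dr
        for j in range(angular):
            t = (j + 0.5) * dt
            total += g(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

# Hypothetical 1-form eta = f1 dx + f2 dy and, by (1.3.10), d(eta) = g dx ^ dy.
f1 = lambda x, y: -x * x * y
f2 = lambda x, y: x * y * y
g = lambda x, y: x * x + y * y      # (f2)_x - (f1)_y

lhs = disc_integral(g)              # integral of d(eta) over D
rhs = boundary_integral(f1, f2)     # integral of eta over the boundary of D
```

Both sides equal $\pi/2$ for this choice, up to the discretisation error of the midpoint rule.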

¹¹ We have used $\partial$ and $\bar\partial$ here in place of $d'$ and $d''$ used in an earlier section.

¹² For an $(r + 1)$-chain $D$, $\partial D$ is an $r$-chain. Note that the $\partial$ used here is not the same as that in (1.3.12).


3.4

Star (*) Operator on M

Besides the operators introduced above, another operator, denoted $*$ and known as the conjugacy (Hodge star) operator, can be defined on the vector space $\Lambda$ of forms by using an inner product. The operator $*$ maps $\Lambda^r$ to $\Lambda^{2-r}$ and satisfies the rule:

$$** = (-1)^r \tag{1.3.16}$$

It can be verified that using the operator $*$, a function, a 1-form defined in (1.3.2) and a 2-form defined in (1.3.3) are mapped as follows:

$$*f(z) = f(z)\,(\alpha(z)\,dx \wedge dy) \tag{1.3.17}$$

$$*\eta = -f_2\,dx + f_1\,dy \tag{1.3.18}$$

$$*\Omega = \beta(z) \tag{1.3.19}$$

where $\alpha(z)$ and $\beta(z)$ are functions defined on the same local regions as $f$ and $\Omega$ are.

Definition 1.3.14: A form $\eta$ is called closed provided it is $C^1$ and $d\eta = 0$; it is co-closed if $d(*\eta) = 0$.

Definition 1.3.15: A 1-form $\eta$ is called exact if $\eta = df$ for some $C^2$-function $f$ on $M$, and is called co-exact if $*\eta$ is exact. The latter holds if and only if $\eta = *df$ for some $C^2$-function $f$.

Remark 1.3.16: On a simply connected domain (see Sec. 1) closed (co-closed) differentials are exact (co-exact), whereas every exact (co-exact) differential is closed (co-closed). Hence locally closed $\Leftrightarrow$ exact (co-closed $\Leftrightarrow$ co-exact).
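On 1-forms the rule (1.3.18) is a quarter-turn rotation of the coefficient pair, so (1.3.16) with $r = 1$ can be checked directly. The sketch below represents a 1-form at a point by the pair $(f_1, f_2)$; this representation, the sample harmonic function $u = x^2 - y^2$ with harmonic conjugate $v = 2xy$, and the helper names are all assumptions made for illustration. It also shows that $*du = dv$ for this pair, which is the Cauchy-Riemann system in the language of forms.

```python
def star(form):
    """Hodge star on a 1-form f1 dx + f2 dy, given as the pair (f1, f2);
    by (1.3.18) the result is -f2 dx + f1 dy."""
    f1, f2 = form
    return (-f2, f1)

# ** = (-1)^r with r = 1: applying star twice flips the sign.
eta = (3.0, -4.5)
double_star = star(star(eta))

# Hypothetical example: u = x^2 - y^2 and v = 2xy at the point (x, y).
def du(x, y):
    return (2 * x, -2 * y)      # (u_x, u_y)

def dv(x, y):
    return (2 * y, 2 * x)       # (v_x, v_y)

point = (0.8, -1.1)
star_du = star(du(*point))      # expected to equal dv at the same point
```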

3.5

Harmonic and Holomorphic Forms on M

Definition 1.3.17: Let $f$ be a $C^2$-function on $M$; then $f$ is called harmonic if the 2-form $(f_{xx} + f_{yy})\,dx \wedge dy$, called the Laplacian of $f$ and denoted by $\Delta f$, is zero (see Sec. 1). A 1-form $\eta$ is called harmonic if locally it can be given as $df$ with $f$ a harmonic function.

Definition 1.3.18: A 1-form $\eta$ is called holomorphic provided locally $\eta$ equals $df$ for a holomorphic function $f$.

Remark 1.3.19: The concept of a harmonic function is a local one; thus locally every real-valued harmonic function is the real part of a holomorphic function. The following results, which are basic for integration on Riemann surfaces, are easy to check.

Result 1.3.20: A differential¹³ $\omega$ is harmonic if and only if it is closed and co-closed.

Result 1.3.21: If $u$ is a harmonic function on $M$, then $\partial u$ is a holomorphic differential.

Result 1.3.22: A differential $\eta = \phi_1\,dz + \phi_2\,d\bar z$ is holomorphic if and only if $\phi_2 = 0$ and $\phi_1$ is a holomorphic function locally.

Result 1.3.23: A differential form $\eta$ is holomorphic if and only if $\eta = \alpha + i{*\alpha}$ where $\alpha$ is a harmonic differential.

¹³ A differential 1-form on a Riemann surface is simply called a differential sometimes (see [8]).

Result 1.3.24: Suppose $D$ is a relatively compact region on a Riemann surface $M$ with piecewise differentiable boundary, and $f$ and $\omega$ are a $C^1$-function and a differential 1-form in a neighbourhood of the closure of $D$. Then

$$\int_{\partial D} f\omega = \iint_D f\,d\omega - \iint_D \omega \wedge df \tag{1.3.20}$$

(The above result uses Stokes' theorem, and as we can see, it is integration by parts.)

3.6 Square-integrable 1-forms on M

Definition 1.3.25: A 1-form $\eta$ on a fixed region $D$ of $M$ is a measurable form if it can be expressed locally as:

$$\eta = f\,dz + g\,d\bar z \tag{1.3.21}$$

in terms of measurable functions $f$ and $g$ on $D$ (see Chap. 0 for the definition). The complex Hilbert space of 1-forms defined above with norm

$$\|\eta\|_D^2 = \iint_D \eta \wedge *\bar\eta < \infty \tag{1.3.22}$$

is denoted $L^2(D)$, and is called the space of square-integrable 1-forms on $D$. In local coordinates the RHS of (1.3.22) becomes

$$\iint_D i(f\bar f + g\bar g)\,dz \wedge d\bar z = \iint_D 2(|f|^2 + |g|^2)\,dx \wedge dy \tag{1.3.23}$$

The inner product of two forms $\eta_1, \eta_2 \in L^2(D)$ is given by:

$$(\eta_1, \eta_2)_D = \iint_D \eta_1 \wedge *\bar\eta_2 \tag{1.3.24}$$

Using the local expression (1.3.21) it can be easily checked that

$$(*\eta_1, *\eta_2)_D = (\eta_1, \eta_2)_D \tag{1.3.25}$$

Obviously $L^2(M)$ denotes the Hilbert space of square-integrable 1-forms on the Riemann surface $M$. We state the following important result known as Weyl's lemma (for a proof see [8]).

Result 1.3.26: A measurable square-integrable function $f$ on the unit disc $D$ is harmonic if and only if for every $C^\infty$-function $g$ on $D$ with compact support, the following holds:

$$\iint_D f\,\Delta g = 0 \tag{1.3.26}$$

where $\Delta g$ is the Laplacian of $g$. An alternative version of the above result using local coordinates can be put as:

Result 1.3.27: Let $f$ be a measurable square-integrable function on the unit disc $D$. The function $f$ is holomorphic if and only if for every $C^\infty$-function $g$ on $D$ with compact support

$$\iint_D f(z)\,\frac{\partial g}{\partial \bar z}\,dz \wedge d\bar z = 0 \tag{1.3.27}$$


We next see that $L^2(M)$ can be decomposed in more than one way as a direct sum of orthogonal spaces. Let $E$ denote the $L^2(M)$-closure of all 1-forms defined as: $E = \{df : f$ is a smooth function on $M$ with compact support$\}$. Let $E^*$ be the collection of all forms $\eta \in L^2(M)$ such that $*\eta \in E$. Then for every $\eta \in E$ $(E^*)$ there exists a sequence $\{f_n\}$ of smooth functions with compact support on $M$ such that:

$$\eta = \lim_n df_n \quad (= \lim_n *df_n).$$

The set $L^2(M)$ has the following decompositions:

$$L^2(M) = E \oplus E^\perp, \qquad L^2(M) = E^* \oplus (E^*)^\perp, \qquad L^2(M) = E \oplus E^* \oplus H \tag{1.3.28}$$

where $H = E^\perp \cap (E^*)^\perp$. The following result is a consequence of the above decomposition. (See Hint to Exc. 13 for definitions of $E^\perp$ and $(E^*)^\perp$.)

Result 1.3.28: The Hilbert space $H$ consists of harmonic differentials in $L^2(M)$.

Remark 1.3.29: On a compact Riemann surface there do not exist exact non-zero harmonic differentials. In order to have such differentials one has to allow singularities on them (see [8]).

Result 1.3.30 (a): Given a point $P_0$ on $M$ and a local coordinate $z$ that vanishes at $P_0$, one can always find a function $\phi$ with the following properties: (i)

—

is harmonic on every sufficiently small neighbourhood 5\£of Po, and (ii)

ff

dtp A * d$ < °c

Also, (iii)

$$(d\phi, df) = (d\phi, *df) = 0$$

for all smooth functions $f$ on $M$ that have compact support and vanish on a neighbourhood of $P_0$ (see Hint to Exc. 13 for the proof of (iii)).

Result 1.3.30 (b): Given two points $P_1$ and $P_2$ on $M$ and local coordinates $z_1$ and $z_2$ which vanish respectively at $P_1$ and $P_2$, one can find a function $\phi$ such that $\phi$ is harmonic on $M \setminus \{P_1, P_2\}$, $\phi - \log|z_1|$ is harmonic in a neighbourhood $N_1$ of $P_1$, and $\phi + \log|z_2|$ is harmonic in a neighbourhood $N_2$ of $P_2$. In addition

$$\iint_{M \setminus N} d\phi \wedge *\overline{d\phi} < \infty$$

for every open set $N$ containing $P_1$ and $P_2$, and $(d\phi, df) = 0 = (d\phi, *df)$ for all smooth functions $f$ on $M$ that have compact support and vanish on a neighbourhood of $P_1$ and $P_2$. Another concept which is of great use in Riemann surfaces is the classification of differentials on them via the theory of abelian differentials. We shall list a few of these below.


3.7

Abelian Differentials on M

Definition 1.3.31: Let $r$ be an integer. A meromorphic $r$-differential $\eta$ on $M$ is an assignment of a meromorphic function $f$ to each local coordinate $z$ on $M$ such that

$$f(z)\,dz^r \tag{1.3.29}$$

is invariantly defined (i.e., it is independent of coordinates). When $r = 1$, this is called an abelian differential.

Result 1.3.32: Let $P \in M$ and let $z$ be a local coordinate on $M$ that vanishes at $P$. Then for every integer $n \ge 1$, there exists an abelian (meromorphic) differential $f(z)\,dz$ on $M$ which is holomorphic on $M \setminus \{P\}$ and has singularity $\dfrac{1}{z^{n+1}}$ at $P$ (i.e., $f$ has a pole of order $n + 1$ at $P$).

Result 1.3.33: Let $P_1$ and $P_2$ $(P_1 \ne P_2)$ be two points on $M$ with local coordinates $z_1$ and $z_2$ that vanish at $P_1$ and $P_2$ respectively. Then there exists an abelian differential $\eta$, holomorphic on $M \setminus \{P_1, P_2\}$ and with singularities $\dfrac{1}{z_1}$ at $P_1$ and $-\dfrac{1}{z_2}$ at $P_2$.

To prove the results (1.3.32) and (1.3.33) we set respectively the differentials: (i) $\eta = -(1/2)(\alpha + i{*\alpha})$ and (ii) $\eta = \alpha + i{*\alpha}$, where $\alpha = d\phi$. Existence of such a $\phi$ on $M$ for (i) follows from the Result (1.3.30) (a) and for (ii) from the Result (1.3.30) (b).

Definition 1.3.34: Let $\eta$ be an $r$-differential (1.3.29) on $M$ and $z$ a local coordinate which vanishes at $P \in M$; then the order of $\eta$ at $P$ is:

$$\operatorname{ord}_P \eta = \operatorname{ord}_0 f \qquad (0 = \text{origin}).$$

If $f(z) = z^n g(z)$, where $g(z)$ is holomorphic and non-vanishing at $P$, then $\operatorname{ord}_0 f = n$.

Remark 1.3.35: Note that local parameters are homeomorphic, hence the order of $\eta$ (defined in terms of the order of $f$) is well defined, i.e., invariant under coordinate transformations. Moreover $\{P \in M : \operatorname{ord}_P \eta \ne 0\}$ is a discrete set on $M$, hence it is a finite set if $M$ is compact.

Definition 1.3.36: If $\eta$ is an abelian differential given by (1.3.29) for $r = 1$, then the residue of $\eta$ at $P$ is defined as the coefficient $a_{-1}$ in the Laurent series expansion of $f(z)$:

$$f(z) = \sum_n a_n z^n.$$

Thus

$$\operatorname{Res}_P \eta = a_{-1}.$$

On the other hand, it is known that (see [8])

$$\operatorname{Res}_P \eta = \frac{1}{2\pi i}\int_C \eta$$

where $C$ is a simple closed curve in $M$ that bounds a disc $D$ containing $P$, and has winding number 1 around $P$, and $\eta$ is holomorphic in $\operatorname{cl} D \setminus \{P\}$. Hence the residue of a meromorphic 1-form at a point $P$ is a well defined concept.

Result 1.3.37: Suppose $P_1, P_2, \ldots, P_l$ $(l > 1)$ are distinct points on a Riemann surface $M$, and $a_1, \ldots, a_l$ are complex numbers with

$$\sum_{k=1}^{l} a_k = 0 \tag{1.3.30}$$

Then there exists a meromorphic (abelian) differential $\eta$ on $M$, which is holomorphic on $M \setminus \{P_1, \ldots, P_l\}$ and satisfies:

$$\text{(a)}\ \operatorname{ord}_{P_k} \eta = -1, \qquad \text{(b)}\ \operatorname{Res}_{P_k} \eta = a_k \quad \text{for every } k = 1, \ldots, l \tag{1.3.31}$$

Result 1.3.38: Every abelian differential $\eta$ on a compact Riemann surface has the following property:

$$\sum_{P \in M} \operatorname{Res}_P \eta = 0. \tag{1.3.32}$$

3.8 A Few Results Based on Transformation Groups of M

Next we return to some group theoretic aspects that are needed in the theory of Riemann surfaces. Recall that the group PSL(2, C),14 the projective linear group formed by 2 x 2 complex matrices of determinant one, is a group of automorphisms of (C which is better known as the group of Mbbius transformations: az + b z —> a, b, c, a e C and ad - be = 1. cz + d Definition 1.3.39: Let G be a subgroup of PSL(2, C) which acts as a group of biholomorphic automorphisms of the extended plane C u {00}. Let Q(G) denote the set of all those points ZQ e C u {00} at which G acts (properly) discontinuously.15 The set Q(G) = Q is an open G-invariant subset of C u {°°}. If Q * (j>, we call G a Kleinian group. The set A = A(G) = C u {<*>} \ Q(G) is called the ZwraV .sefo/G. Definition 1.3.40: A Kleinian group G is called Fuchsian if there is a disc A (A = \z\ < cc; a e IR) that is invariant under G. It is called an elementary group if A (G) consists of two or less points. The following facts about the Kleinian groups are well known. Fact 1.3.41:

(a) Every Kleinian group $G$ is finite or countable.

(b) A group $G$ which is Kleinian must be discrete (note that the converse is not true).

Fact 1.3.42: If $G$ is Fuchsian with invariant disc $D$, then $\Lambda(G) \subset \partial D$.

Definition 1.3.43: Let $M$ be an arbitrary Riemann surface, and $\tilde{M}$$^{16}$ its universal covering space with canonical projection $\pi : \tilde{M} \to M$ and $G$ as the covering group (group of topological automorphisms), so that the following diagram commutes:

[Commutative diagram: $g : \tilde{M} \to \tilde{M}$ over $M$, i.e., $\pi \circ g = \pi$ for every $g \in G$.]

14 See Chap. 2 for group theory.

15 $G$ acts discontinuously at $z_0 \in \mathbb{C} \cup \{\infty\}$ if: (i) the isotropy subgroup $G_{z_0}$ of $G$ at $z_0$, $\{g \in G : g(z_0) = z_0\}$, is finite; (ii) there exists a neighbourhood $V$ of $z_0$ such that $g(V) = V$ for $g \in G_{z_0}$; (iii) $g(V) \cap V = \emptyset$ for $g \in G \setminus G_{z_0}$.

16 Given a connected topological manifold $X$, a new manifold $\tilde{X}$ is called the universal covering manifold of $X$ if $\tilde{X}$ has the following properties: (i) there is a surjective local homeomorphism $\pi : \tilde{X} \to X$; (ii) the manifold $\tilde{X}$ is simply connected, i.e., the fundamental group of $\tilde{X}$ is trivial ($\pi_1(\tilde{X}) = \{1\}$) (see Chapter 0); (iii) every homotopically non-trivial closed curve on $X$ lifts to an open curve on $\tilde{X}$, and the curve on $\tilde{X}$ is uniquely determined by the curve on $X$ and the point lying over its initial point.

Then, since $\pi$ is a normal covering, the group $G \cong \pi_1(M)$, the fundamental group of $M$. It is worth noting that $\tilde{M}$ can be one of the following: (i) $\mathbb{C} \cup \{\infty\} = D_1$; (ii) $\mathbb{C} = D_2$; (iii) $\Delta \cong U = \{z \in \mathbb{C} : \operatorname{Im} z > 0\} = D_3$. Each of these has the following property: the group of conformal automorphisms of $\tilde{M}$ (i.e., the group of topological automorphisms $G$ of $\tilde{M}$) is a group of Möbius transformations:

$$z \mapsto \frac{az + b}{cz + d}.$$

Thus: (a) $\operatorname{Aut}(\mathbb{C} \cup \{\infty\}) \cong PSL(2, \mathbb{C})$; (b) $\operatorname{Aut}(\mathbb{C}) = \{z \mapsto az + b : a, b \in \mathbb{C},\ a \neq 0\}$; (c) $\operatorname{Aut}(U) \cong PSL(2, \mathbb{R})$.
Result 1.3.44: Every Riemann surface is conformally equivalent$^{17}$ to one of the homogeneous spaces:

$$D_i / G_i \qquad (i = 1, 2, 3)$$

where $D_i$ and $G_i$ are given in (i), (ii) and (iii), and (a), (b), and (c). Furthermore, these groups are fundamental groups of the corresponding Riemann surfaces. If the Riemann surface is simply connected, then (and then alone) it is conformally equivalent to one of the $D_i$'s listed above. In addition, if $\pi_1(M) \cong \mathbb{Z}$ (the ring of integers), then $M$ is conformally equivalent to one of the following: (i) $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$; (ii) $\Delta^* = \Delta \setminus \{0\}$; (iii) $\Delta_r = \{z \in \mathbb{C} : r < |z| < 1\}$ ($0 < r < 1$). If $\pi_1(M) \cong \mathbb{Z} \oplus \mathbb{Z}$, then $M$ is a torus $\mathbb{C}/G$, where $G$ is generated by $z \mapsto z + 1$ and $z \mapsto z + \tau$ with $\operatorname{Im} \tau > 0$ (see Exp. 3 in Sec. 1). Having analyzed in brief some of the topological and group theoretic aspects of $M$, we have:

Result 1.3.45: If the universal covering of a Riemann surface $M$ is a sphere, then $M$ must be a sphere.

Result 1.3.46: If the (holomorphic) universal covering space of M is
17 $M$ and $N$ are conformally equivalent if and only if there exists an analytic bijection of $M$ onto $N$. A surface is simply connected if every closed curve on it can be continuously contracted to a point.

Definition 1.3.47: Let

$$z \mapsto \frac{az + b}{cz + d}$$

be written as $z \mapsto Az$ where

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

with $ad - bc = 1$. The transformation $A$ is called parabolic if it has one fixed point. Note that the trace equation:

$$\operatorname{Trace}^2 A = (a + d)^2 \qquad (1.3.33)$$

implies that $A$ is parabolic if and only if

$$\operatorname{Trace}^2 A = 4. \qquad (1.3.34)$$

Using equations (1.3.33) and (1.3.34) we can further define:

Definition 1.3.48: $A$ is called elliptic if $\operatorname{Trace}^2 A = T \in \mathbb{R}$ and $0 \le T < 4$. It is called loxodromic if $T \notin [0, 4]$. A loxodromic transformation $A$ is called hyperbolic if $T \in \mathbb{R}$ and $T > 4$. Using the above characteristic properties of Möbius transformations, one can prove:

Result 1.3.49: If $M$ is a Riemann surface with $\pi_1(M) \cong \mathbb{Z} \oplus \mathbb{Z}$, then the holomorphic universal covering space of $M$ is $\mathbb{C}$.

Result 1.3.50: Suppose $M$ is a Riemann surface with holomorphic universal covering space $U = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$ and $\pi_1(M)$ is commutative. Then $M$ must be $\Delta$, or $\Delta^* = \Delta \setminus \{0\}$, or $\Delta_r = \{z \in \mathbb{C} : r < |z| < 1\}$, $0 < r < 1$ (compare the above two results with (1.3.44)).

We further note that this classification of fixed points of a Möbius transformation also helps in determining whether or not two topologically equivalent Riemann surfaces are conformally equivalent, a concept that is required in string theory.

We now move on to the metric properties of a Riemann surface. The Riemannian metric on our familiar (simply-connected) Riemann surfaces $M_1 = \mathbb{C}$, $M_2 = \Delta = \{z \in \mathbb{C} : |z| < 1\}$ and $M_3 = \mathbb{C} \cup \{\infty\}$ is given by:

(i) $|dz|$, $z \in M_1$;

(ii) $\dfrac{2\,|dz|}{1 - |z|^2}$, $z \in M_2$;

(iii) $\dfrac{2\,|dz|}{1 + |z|^2}$, $z \in M_3$.

It should be noted that (iii) holds only for $z \neq \infty$. We have seen earlier that $M_3$ is a compact surface of genus 0; in fact this is a sphere, as can be illustrated by using the stereographic projection and the metric on it given in (iii) (see Exercise 19). An important result pertaining to $\operatorname{Aut}(M_3)$ is given as follows:

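Result 1.3.51: Let

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

belong to $\operatorname{Aut}(M_3)$ (the group of Möbius transformations). Then $A$ defines an isometry of the metric given in (iii) if and only if the elements of $A$ satisfy

$$d = \bar{a}, \qquad b = -\bar{c}, \qquad \text{and} \qquad |a|^2 + |c|^2 = 1.$$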
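The "if" direction of Result 1.3.51 can be checked numerically. The sketch below assumes the sphere metric in the form $\phi(z)\,|dz|$ with $\phi(z) = 2/(1 + |z|^2)$, and tests the isometry condition $\phi(A(z))\,|A'(z)| = \phi(z)$ at a few sample points. The particular values of $a$ and $c$, the sample points, and the function names are arbitrary choices for this illustration.

```python
import cmath


def phi(z):
    """Density of the sphere metric phi(z)|dz| = 2|dz| / (1 + |z|^2)."""
    return 2.0 / (1.0 + abs(z) ** 2)


def mobius(a, b, c, d, z):
    return (a * z + b) / (c * z + d)


def pullback_density(a, b, c, d, z):
    # phi(A(z)) * |A'(z)|, with A'(z) = (ad - bc) / (cz + d)^2.
    derivative = (a * d - b * c) / (c * z + d) ** 2
    return phi(mobius(a, b, c, d, z)) * abs(derivative)


# A matrix satisfying d = conj(a), b = -conj(c), |a|^2 + |c|^2 = 1.
a = 0.6 * cmath.exp(0.3j)
c = 0.8 * cmath.exp(1.1j)
b = -c.conjugate()
d = a.conjugate()

samples = [0.2 + 0.1j, -1.5 + 2.0j, 3.0 - 0.7j]
max_error = max(abs(pullback_density(a, b, c, d, z) - phi(z)) for z in samples)
print("largest deviation for the unitary-type matrix:", max_error)

# For contrast: the dilation z -> 4z (matrix diag(2, 1/2)) violates the conditions.
dilation_error = abs(pullback_density(2.0, 0.0, 0.0, 0.5, 1.0 + 0j) - phi(1.0 + 0j))
print("deviation for the dilation z -> 4z at z = 1:", dilation_error)
```

The first deviation is at the level of rounding error, while the dilation changes the density noticeably, so it is not an isometry of (iii).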

In this section we have only presented the material for a beginner in the theory illustrating it with some 20 exercises. For detailed literature and deep results, see References [2], [3], [6], [8], [9], [11]. We devote the final section of this chapter to conformal field theory in 2-dimensions.
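The trace classification of Definitions 1.3.47 and 1.3.48 lends itself to a short computational sketch. The function below normalises a matrix to determinant one and then inspects $T = \operatorname{Trace}^2 A$. The function name, the tolerance, and the decision to report "hyperbolic" separately from the other loxodromic cases are choices made for this illustration.

```python
import cmath


def classify(a, b, c, d, tol=1e-9):
    """Classify the Moebius map z -> (az + b)/(cz + d) by Trace^2."""
    det = a * d - b * c
    if abs(det) < tol:
        raise ValueError("matrix is singular, not a Moebius transformation")
    s = cmath.sqrt(det)                    # rescale so that ad - bc = 1
    a, b, c, d = a / s, b / s, c / s, d / s
    if abs(b) < tol and abs(c) < tol and abs(a - d) < tol:
        return "identity"
    t = (a + d) ** 2                       # Trace^2 A
    if abs(t.imag) < tol:
        if abs(t.real - 4) < tol:
            return "parabolic"
        if -tol < t.real < 4:
            return "elliptic"
        if t.real > 4:
            return "hyperbolic"            # loxodromic with real T > 4
    return "loxodromic"


rotation = cmath.exp(0.25j * cmath.pi)
print(classify(1, 1, 0, 1))                              # translation z -> z + 1
print(classify(rotation, 0, 0, rotation.conjugate()))    # rotation z -> i z
print(classify(2, 0, 0, 1))                              # dilation z -> 2z
print(classify(2j, 0, 0, 1))                             # z -> 2i z
```

The four sample maps are the model cases of Exercises 15 and 16: a translation, and multiplications $z \mapsto \alpha z$ with $|\alpha| = 1$, with $\alpha > 0$ real, and with a general complex $\alpha$.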

Exercise 1.3

1. Using $dz = dx + i\,dy$ and $d\bar{z} = dx - i\,dy$, show that the functions $f_1$ and $f_2$ given in (1.3.2) are related to those given in (1.3.2a) as: (i) $f_1 = F_1 + F_2$, $f_2 = i(F_1 - F_2)$, and (ii) $dz \wedge d\bar{z} = -2i\, dx \wedge dy$.

2. Show that the partial derivatives $f_z$ and $f_{\bar{z}}$ in (1.3.12) satisfy: $f_z = \frac{1}{2}(f_x - i f_y)$, $f_{\bar{z}} = \frac{1}{2}(f_x + i f_y)$.

3. Establish Result (1.3.6).

4. Verify the equalities (1.3.17), (1.3.18) and (1.3.19) after mentioning explicitly the requirements for the $*$ operator.

5. Let $\Lambda^k$ denote the vector space of $k$-forms. Show that the vector space $\Lambda = \Lambda^0 \oplus \Lambda^1 \oplus \Lambda^2$ is a graded anti-commutative algebra under the form multiplication.

6. Show that every $C^2$-function $f$ satisfies: $-2i\,\bar{\partial}\partial f = \Delta f = d{*}df$.

7. Prove Result (1.3.20).

8. Prove Result (1.3.21).

9. Prove Result (1.3.22).

10. Prove Result (1.3.23) and deduce from here that a differential $\omega$ is holomorphic if and only if it is closed and $*\omega = -i\omega$.

11. Use the defining equation (1.3.21) for $\eta$ and show that
$$\iint_D \eta \wedge *\bar{\eta} = \iint_D i\,(f\bar{f} + g\bar{g})\, dz \wedge d\bar{z} = \iint_D 2\,(|f|^2 + |g|^2)\, dx \wedge dy.$$

12. Prove Result (1.3.27), i.e., Weyl's Lemma in its alternative form.

13. Show that the Hilbert space $H$ defined in (1.3.28) consists of harmonic differentials in $L^2(M)$. Deduce condition (iii) of Result (1.3.30)(a) from here.

14. Prove the results (a) (1.3.37) and (b) (1.3.38).

15. By choosing the image of a fixed point as $\infty \in \mathbb{C} \cup \{\infty\}$, show that a Möbius transformation $A$ is parabolic if and only if $A$ is conjugate to a translation $z \mapsto z + b$ (or equivalently to $z \mapsto z + 1$).

16. Show that the Möbius transformation $A$ with two fixed points is conjugate to $z \mapsto \alpha z$ where $\alpha \neq 0, 1$ and that:
A is elliptic $\iff$ $|\alpha| = 1$, $\alpha \neq 1$;
A is loxodromic $\iff$ $|\alpha| \neq 1$, $\alpha \neq 0$;
A is hyperbolic $\iff$ $\alpha \in \mathbb{R}$, $\alpha > 0$, $\alpha \neq 1$.

17. Define the complex projective space $P$ and show that it is a Riemann surface.

18. Prove Result (1.3.51); show further that $A$ here is elliptic.

19. Use the metric $\dfrac{2\,|dz|}{1 + |z|^2}$ ($z \neq \infty$) and the stereographic projection to show that the compactified complex plane $\mathbb{C} \cup \{\infty\}$ is a sphere.

20. If $\phi(z)$ denotes the coefficient of $|dz|$ in the above exercise, assuming that the curvature of this sphere is
$$-\frac{\text{Laplacian}\,(\log \phi(z))}{(\phi(z))^2},$$
show that it equals 1.

Hints to Exercise 1.3 1. Use (1.2.3) and the procedure described in the Hint to Exercise. 8. 2. The verification is an immediate consequence of (1.2.3). 3. Because of the importance of this result, we begin by explaining a few facts that were not explicitly mentioned in the text. We note that we have not drawn any distinction between an analytic mapping and a holomorphic mapping defined on a Riemann surface, although a more precise way would be: <> / is holomorphic on M if

C is analytic. Thus, in short,

(a)

$D_1 \subset D_2 \subset [U_\alpha \cap f^{-1}(U_\beta) \cap g^{-1}(U_\beta)]$. Since $D_1$ and $D_2$ are both contained in the open set within brackets, it follows that both $f_{\beta\alpha} = Z'_\beta \circ f \circ Z_\alpha^{-1}$ and $g_{\beta\alpha} = Z'_\beta \circ g \circ Z_\alpha^{-1}$ are defined and holomorphic on $Z_\alpha(D_2)$, so the set of points in the compact set $Z_\alpha(\bar{D}_1)$ where $f_{\beta\alpha} = g_{\beta\alpha}$ is either finite or is all of $Z_\alpha(\bar{D}_1)$. Hence $f$ and $g$ are the same either throughout $D_1$ or at only a finite subset of $D_1$. We use both these options as follows. Let $A$ be the set of points in $M_1$ which have a neighbourhood on which $f = g$ at only finitely many points and let $B$ be the set of points which have a neighbourhood $N$ such that $f = g$ throughout $N$. Then $A$ and $B$ are disjoint open subsets of $M_1$, and the above discussions imply that $A \cup B = M_1$. But since $M_1$ is connected, we have either $M_1 = A$ or $M_1 = B$. (ii) If $M_1$ and $M_2$ are Riemann surfaces and $\phi : M_1 \to M_2$ is holomorphic but not constant, then if $A$ is an open subset of $M_1$,

18 For $p \in M$ where $p$ is in $(Z_\alpha, U_\alpha)$, let $N = \{q : d_\alpha(q, p) < \varepsilon\} = Z_\alpha^{-1}\{z : |z - Z_\alpha(p)| < \varepsilon\}$; then $N$, which can be mapped by $Z_\alpha$ onto an open disc in $\mathbb{C}$, is called an open parametric disc at $p$ with radius $\varepsilon$. Note that we are using $z$ as the coordinate of the point.
Da

Uan
Now the mapping./^ s Z'pO
$\phi(A)$ is an open subset of $M_2$. To prove the given result (1.3.6), we assume that $\phi$ is not constant; then, from above, $\phi$ is surjective.

4. Let $T^*_x(M)$ denote the cotangent space to an $n$-dimensional Riemannian manifold $M$ and let $\Lambda$ denote the space of differential forms; then the inner product on $T^*_x(M)$ can be used to define an inner product on $\Lambda$ in the following manner:

(a) $$(dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_p} \mid dx^{j_1} \wedge dx^{j_2} \wedge \cdots \wedge dx^{j_p}) = \epsilon^{i_1 \cdots i_p}_{k_1 \cdots k_p}\, g^{j_1 k_1} \cdots g^{j_p k_p}$$

where $g^{j_r k_r}$ denote the components of the Riemann metric and $\epsilon^{i_1 \cdots i_p}_{k_1 \cdots k_p}$ is the Kronecker tensor that equals $+1$ or $-1$ according as $k_1, \dots, k_p$ is an even or odd permutation of $i_1, \dots, i_p$. Using (a), the inner product between two arbitrary $p$-forms

$$\alpha = \frac{1}{p!}\,\alpha_{i_1 \cdots i_p}\, dx^{i_1} \wedge \cdots \wedge dx^{i_p} \quad \text{and} \quad \beta = \frac{1}{p!}\,\beta_{j_1 \cdots j_p}\, dx^{j_1} \wedge \cdots \wedge dx^{j_p}$$

can be written as:

(b) $$(\alpha \mid \beta) = \frac{1}{p!}\,\alpha_{i_1 \cdots i_p}\, \beta^{i_1 \cdots i_p}.$$

When $M$ is a Riemann surface the maximum value which $p$ can take is 2, hence (a) simplifies to:

(c) $$(dx^{i_1} \wedge dx^{i_2} \mid dx^{j_1} \wedge dx^{j_2}) = \epsilon^{i_1 i_2}_{k_1 k_2}\, g^{j_1 k_1} g^{j_2 k_2} = g^{11} g^{22} - g^{12} g^{21}.$$

If in addition the manifold is oriented, we can define an $n$-form $\Gamma$ called the volume form or the volume element on $M$. In terms of an orthonormal basis $\{\theta^i\}$, this is given by

(d) $$\Gamma = \theta^1 \wedge \theta^2 \wedge \cdots \wedge \theta^n$$

or equivalently as $\Gamma = \Gamma_{1 \cdots n}\, dx^1 \wedge \cdots \wedge dx^n$. On such an oriented manifold, an isomorphism $\Lambda^p(M) \to \Lambda^{n-p}(M)$ can be defined by an operator $*$, which is related to $\Gamma$ in the following sense:


(e) T(a\p) = a A */? for every p-form a e AP(M). (Note that */} is an (n - p) - form). For example, by choosing a = dx1 A ••• A <& P , (e) gives: (f) which implies

r c ^ 1 A ••• A j^iyS) = rpu •'"

<«>

< * % • i. - . ' . = 7 7 r M - v v

+l

- ^ " ' ''"

For a Riemann surface (n = 2), we set the form F as: (h) a (z) dx A dy. The choice of p is limited to 0, 1 and 2. When a is a 0-form, jS is also a 0-form, hence Eqs. (g) and (h) give: (i) *f(z)=f(z)(a(z)dxAdy). When P is a 1-form 77 =/i(z) etc + / 2 (z) dy, using Eqs. (g) and (d) and noticing that 9'idx1) = <^' (dx1 = dx, dx2 = dy), we obtain: (j) *7] = -/2(z)
(a) $$\bar{\partial}(\partial f) = \bar{\partial}(f_z\, dz) = f_{z\bar{z}}\, d\bar{z} \wedge dz.$$

From (1.2.3) we know that $f_z = \frac{1}{2}(f_x - i f_y)$ and $f_{\bar{z}} = \frac{1}{2}(f_x + i f_y)$, whereas $dz = dx + i\,dy$ and $d\bar{z} = dx - i\,dy$. We substitute these values in (a), and since:

(b) $$\frac{\partial}{\partial \bar{z}}\left[\frac{1}{2}(f_x - i f_y)\right] = \frac{1}{4}\,(f_{xx} + i f_{xy} - i f_{yx} + f_{yy})$$

and

(c) $$d\bar{z} \wedge dz = (dx - i\,dy) \wedge (dx + i\,dy) = 2i\,(dx \wedge dy)$$

we obtain

-2iddf=Af. I f / i s harmonic, A/is zero, i.e., ddf= 0 i.e., 3/is holomorphic. 9. The differential 77 = x dz + fa. dz is holomorphic; this implies that there is a holomorphic function/such that 77 = df=fzdz- Comparison with the right-hand-side leads to
mat

V = *7i + V2 • We write 77 = udz + vdz, then using d = d + d, we

dT] = (u~z - vz) dz

A

dz

and (b)

d*ri = -i(u£ + vz) dz A dz.

In view of result (1.3.20), r\ is harmonic if and only if the right hand side of Eqs. (a) and (b) is zero; this will be so only when u and v are holomorphic. Hence v can be written in terms of holomorphic differentials as: r\ = j]i + r\i. Next we begin by assuming that a is harmonic, hence we can write it in terms of holomorphic differentials ax and a2 :

a = ax + a2 *a = -/ai + ia2 . This gives a + i*a = 2aj. Obviously 2ax is our 77. Conversely, if 77 is holomorphic, then r\ and 77 are harmonic (in view of result (1.3.21)) and so 77-77 ——- = a 2 as well as - it] - it]

*a= 2 is harmonic. This gives: 77 = a + i*a. The last part is left to be completed by the reader. 11.

Now $\eta = f\,dz + g\,d\bar{z} = f(dx + i\,dy) + g(dx - i\,dy) = (f + g)\,dx + i(f - g)\,dy$. Treat $f + g$ as $f_1$ and $i(f - g)$ as $f_2$ and use (1.3.18) to write $*\eta$ with the assumption that $*$ is defined in this region. This gives

$$*\eta = -\big(i(f - g)\big)\,dx + (f + g)\,dy = -i f\,(dx + i\,dy) + i g\,(dx - i\,dy) = -i f\,dz + i g\,d\bar{z}$$

and $*\bar{\eta} = i\bar{f}\,d\bar{z} - i\bar{g}\,dz$. Therefore

$$\iint_D \eta \wedge *\bar{\eta} = \iint_D i\,(f\bar{f} + g\bar{g})\, dz \wedge d\bar{z} = \iint_D 2\,(|f|^2 + |g|^2)\, dx \wedge dy.$$

12. The necessity part is easy to show using Stokes' theorem (although/is not defined on dD). Let / b e holomorphic and g be smooth with compact support. Then

To prove the sufficiency, we suppose that/is C1. Then, since for every g with compact support

We have -UD

jrg(z)dzAdl = \lDf^dzAdl-

=O

which leads to Cauchy-Riemann equations. Moreover if / and g are arbitrary, and g is with compact support, we have:

In view of the form of Weyl's lemma, this shows that/is C° and thus it is holomorphic. 13. By definition, E1 = {77 e L2(M); (77, df) = 0 for all smooth functions f on M with compact support}. Similarly, £*x = {77 e L2(M); (j],*df) = 0 for all/described above}. Using these we can show that if 77 6 L2(M) is C1, it e E*1 CE"1) if and only if it is closed (co-closed). For instance, consider that 77 is co-closed, and / is a smooth function with support inside a disc D (closure of D compact); then writing the inner product:

(df,rD = jjDdfA*v =

jjDd(f*n)-jjDfAd(*v)

we note that the second integral is zero since t] co-closed means that *rj is closed, and the first is zero as / is a smooth function and is integrated on a closed boundary. Thus 77 e £°~. Conversely, given that 77 G E1, we have (df, 77) = 0, which leads to


JJ M /Arf(*^) = O for all smooth functions on M with compact support. This implies that d{*r\) = 0, i.e., 77 is coclosed. Finally, to show that H = £ x n E*L consists of harmonic forms, we observe that if 77 is harmonic, it means it is closed and co-closed and hence from above it e H. For the converse, let 77 e H and let D be a coordinate disc on M with local coordinate z = x + iy. Choose a real-valued function $ which is smooth and is supported in D, denote the partial -*- by y/} and —r- by y/2 ax ay and note that dy

dx

Write 77 =fdx + gdy where/and g are measurable. Then since 77 e H and it is real-valued, we have:

0 = (77,
•IL<*+

>*>*{•&;"+&>)

Similarly

0= (77,* ^ 2 ) = Jj D r7A*(*^7)

Subtraction of the above equalities gives:

0=(r1,dyl-*dy/2)= JJD/A0.


Therefore in view of Weyl's Lemma, (1.3.26),/is harmonic and hence Cl. Again, writing these steps for *T) which also e H, we shall see that g is harmonic and hence C1. Accordingly, r\ =fdx + gdy is of class C1 and since H = E± n (E*)±, 77 e (E*)L and also to E"1, hence it is closed and co-closed. To prove (iii) of Result (1.3.30)(a) we simply note that d is in H with respect to the surface M N ON. 14. (a) To prove the result (1.3.37) we note that in view of result (1.3.33) we can always have an Abelian differential r]k which is holomorphic on M x {Po, Pk] and has only singularities — Zk

at Pk and

Z

at P o (P o ^ Pk). This result can obviously be generalized to an arbitrary number

° of finite points on M giving us a required meromorphic Abelian differential. Thus we set /

1

k=l

k=\

V= I (*krlk= I

akfk(z)dz.

We assume that singularity of an fk(z) is — whereas fj(z) (j ^ k) is holomorphic at zk. We Zk

now use definition (1.3.36) to obtain the relations (a) and (b) of (1.3.31). From this definition and our result (1.3.30)(b) it also follows that the form $\eta$ is well-defined.

(b) Since $M$ is compact it can be covered by a finite number of sets. Accordingly we can triangulate $M$ by $l$ triangles $\Delta_1, \dots, \Delta_l$ (2-simplices), assuming that each singularity of $\eta$ is in the interior of just one triangle. Then using the fact that:

$$\sum_{P \in M} \operatorname{Res}_P \eta = \frac{1}{2\pi i} \sum_{j=1}^{l} \int_{\partial \Delta_j} \eta$$

(where $\partial \Delta_j$ is the (positively oriented) boundary of $\Delta_j$), we obtain the result since each 1-simplex of $\partial \Delta_j$ appears twice with opposite sign in the above sum.

15. Recall that an element $b \in G$ (a group) is conjugate to an element $a$ of $G$ if $b = c \circ a \circ c^{-1}$ for some $c$ in $G$. We are dealing here with $G = PSL(2, \mathbb{C})$. Let $A \in G$ be parabolic, and let $z_0$ be the corresponding fixed point; let $C$ be a Möbius transformation such that $C(z_0) = \infty$. Then $D = C \circ A \circ C^{-1}$ is a Möbius transformation conjugate to $A$ such that:

(i) $$D(\infty) = C \circ A \circ C^{-1}(\infty) = \infty$$

showing that $D$ is parabolic with fixed point $\infty$. Equating (i) to $\dfrac{az + b}{cz + d}$ (and changing the coefficients) it follows that a parabolic transformation can be expressed as $z \mapsto az + b$ with $a \neq 0$. The trace condition (1.3.34) restricts that $a$ be equal to 1, and $b$ can be chosen as 1; hence we have proved the result that if $A$ is parabolic, it is conjugate to a Möbius transformation $z \mapsto z + b$ (and thus also to $z \mapsto z + 1$). Conversely, given the translation $z \mapsto z + 1$ represented by the matrix

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$

we have to show that for every

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in PSL(2, \mathbb{C})$$

the transformation

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

is parabolic. Now

$$A = \begin{pmatrix} a(d - c) - bc & a^2 \\ -c^2 & c(a - b) + ad \end{pmatrix}.$$

It can be easily checked that $A$ is Möbius and that $\operatorname{Trace}^2 A = 4$, hence $A$ is parabolic.

16. We use almost the same arguments as we did in Exercise 15. Given $A \in G = PSL(2, \mathbb{C})$ with two fixed points $z_1$ and $z_2$, we choose $C \in G$ such that $C(z_1) = 0$ and $C(z_2) = \infty$, and construct the mapping $D = C \circ A \circ C^{-1}$, which fixes both $0$ and $\infty$. Since $D(\infty) = \infty$, $D$ has the form $D(z) = az + b$ (as in Exercise 15), and $D(0) = 0$ then gives $b = 0$. Hence $D(z) = \lambda z$. Obviously, $\lambda = 0$ will mean that the mapping given by $D$ is degenerate, and $\lambda = 1$ will mean that it is the identity. Hence a mapping $D$ conjugate to $A$ with two fixed points must be such that $z \mapsto \lambda z$ where $\lambda \neq 0, 1$. We leave the converse and the rest of the exercise as a simple problem to be solved using the definition (1.3.48).

17. Consider the space:

(i) $$V = \mathbb{C} \times \mathbb{C} - \{(0, 0)\} = \{(z, w) : z, w \in \mathbb{C},\ |z|^2 + |w|^2 \neq 0\}$$

with the subspace topology derived from the product topology on $\mathbb{C} \times \mathbb{C}$. We say that two pairs $(z, w)$, $(z', w')$ are equivalent if there is a non-zero complex number $t$ such that $(z, w) = (tz', tw')$. The equivalence class containing $(z, w)$ is denoted as:

(ii) $$[z, w] = \{(tz, tw) : t \in \mathbb{C},\ t \neq 0\}.$$

The quotient map $q : (z, w) \to [z, w]$ maps $V$ onto the space $P$ of equivalence classes, which inherits the quotient topology induced by $q : V \to P$. The space $P$ is called the complex projective space (see also 10A.1). To show that it is a Riemann surface, we consider the maps $\phi : V \to \mathbb{C}_\infty$$^{19}$ and $\phi' : P \to \mathbb{C}_\infty$ defined by:

(iii) $$\phi(z, w) = \phi'([z, w]) = \begin{cases} z/w & \text{if } w \neq 0 \\ \infty & \text{if } w = 0. \end{cases}$$

Evidently, local coordinates on $P$ are given by $[z, w] \mapsto z/w$ on $U_1 = \{[z, w] : w \neq 0\}$ and $[z, w] \mapsto w/z$ on $U_2 = \{[z, w] : z \neq 0\}$.

18. We write the metric on the Riemann surface $M_3 = \mathbb{C}_\infty$ as $\phi(z)\,|dz|$ and note that $A \in \operatorname{Aut}(\mathbb{C}_\infty)$ is an isometry if and only if

(i) $$\phi(A(z))\,|A'(z)| = \phi(z) \quad \text{for } z \in \mathbb{C} \cup \{\infty\}.$$

Since

$$A(z) = \frac{az + b}{cz + d} \quad \text{implies} \quad A'(z) = \frac{d}{dz}\left(\frac{az + b}{cz + d}\right) = \frac{1}{(cz + d)^2}$$

and

$$\phi(A(z)) = \frac{2}{1 + \left|\dfrac{az + b}{cz + d}\right|^2},$$

simplification after substitution in (i) yields:

(ii) $$\frac{1}{|cz + d|^2 + |az + b|^2} = \frac{1}{1 + |z|^2}.$$

Thus $A$ is an isometry if and only if (ii) holds. If $d = \bar{a}$ and $b = -\bar{c}$ with $|a|^2 + |c|^2 = 1$, then evidently the left hand side of (ii) equals the right hand side, for

$$|cz + d|^2 + |az + b|^2 = |cz + \bar{a}|^2 + |az - \bar{c}|^2 = (cz + \bar{a})(\bar{c}\bar{z} + a) + (az - \bar{c})(\bar{a}\bar{z} - c)$$
$$= |c|^2 |z|^2 + |a|^2 + acz + \bar{a}\bar{c}\bar{z} + |a|^2 |z|^2 + |c|^2 - acz - \bar{a}\bar{c}\bar{z} = (|a|^2 + |c|^2)(|z|^2 + 1) = |z|^2 + 1.$$

To show the converse, we expand $|cz + d|^2 + |az + b|^2$ and equate it to $1 + |z|^2$; this gives:

$$(|a|^2 + |c|^2)\,|z|^2 + (c\bar{d} + a\bar{b})\,z + (d\bar{c} + b\bar{a})\,\bar{z} + (|b|^2 + |d|^2) = 1 + |z|^2.$$

This can hold only if

(iii) $$|a|^2 + |c|^2 = 1 = |b|^2 + |d|^2$$

and

(iv) $$c\bar{d} + a\bar{b} = 0 \quad \text{as well as} \quad d\bar{c} + b\bar{a} = 0.$$

Note, however, that the vanishing of one of these in (iv) implies the vanishing of the other. Since $ad - bc = 1$, we have $a d\bar{d} - b c\bar{d} = \bar{d}$, or using (iv), $a|d|^2 + a|b|^2 = \bar{d}$. Again, using (iii), we have $a = \bar{d}$ or $\bar{a} = d$. Similarly, multiplying $ad - bc = 1$ by $\bar{c}$ and using Eqs. (iv) and (iii) subsequently we get $b = -\bar{c}$. To show that $A$ is elliptic, we note that

(v) $$\operatorname{Trace} A = a + \bar{a} = 2 \operatorname{Re} a.$$

If $\operatorname{Re} a$ were to be $\pm 1$, $A$ will naturally be the identity (thus $A$ will give a trivial isometry) and this is not what we want; hence we have $-2 < \operatorname{Trace} A < 2$. Since $\operatorname{Trace}^2 A < 4$, $A$ is elliptic.

19 $\mathbb{C}_\infty = \mathbb{C} \cup \{\infty\}$.

19. Let $S^2$ be the unit sphere in $\mathbb{R}^3$ given by $\xi^2 + \eta^2 + \zeta^2 = 1$. Consider the stereographic projection through the point $(0, 0, 1)$ of this sphere onto $\mathbb{C}$, thus

$$(\xi, \eta, \zeta) \mapsto \frac{\xi + i\eta}{1 - \zeta} = z.$$

Put $z = x + iy$ and note that

$$2x = \frac{2\xi}{1 - \zeta}, \qquad 2y = \frac{2\eta}{1 - \zeta} \qquad \text{and} \qquad (x^2 + y^2) + 1 = \frac{2}{1 - \zeta}.$$

This helps us to write the inverse of the above map as:

(i) $$z \mapsto \left(\frac{2 \operatorname{Re} z}{|z|^2 + 1},\ \frac{2 \operatorname{Im} z}{|z|^2 + 1},\ \frac{|z|^2 - 1}{|z|^2 + 1}\right).$$

This in turn defines a diffeomorphism between $S^2 \setminus (0, 0, 1)$ and $\mathbb{C}$ which can be extended to a diffeomorphism between $S^2$ and $\mathbb{C} \cup \{\infty\}$. The Euclidean metric in $\mathbb{R}^3$ induces a metric on $S^2$. We show next that this metric is the same as the one given in the exercise. From (i) we have:

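(ii) $$(d\xi, d\eta, d\zeta) = \left(\frac{2(1 - x^2 + y^2)\,dx - 4xy\,dy}{(1 + x^2 + y^2)^2},\ \frac{2(1 + x^2 - y^2)\,dy - 4xy\,dx}{(1 + x^2 + y^2)^2},\ \frac{4(x\,dx + y\,dy)}{(1 + x^2 + y^2)^2}\right)$$

so that

$$d\xi^2 + d\eta^2 + d\zeta^2 = \frac{4}{(1 + x^2 + y^2)^4}\Big\{\big[(1 - x^2 + y^2)^2 + 4x^2 y^2 + 4x^2\big]\,dx^2 + \big[4x^2 y^2 + (1 + x^2 - y^2)^2 + 4y^2\big]\,dy^2$$
$$+ \big[8xy - 4xy\,(1 - x^2 + y^2 + 1 + x^2 - y^2)\big]\,dx\,dy\Big\}$$
$$= \frac{4}{(1 + x^2 + y^2)^4}\Big\{(1 + x^2 + y^2)^2\,dx^2 + (1 + x^2 + y^2)^2\,dy^2\Big\} = \frac{4}{(1 + x^2 + y^2)^2}\,(dx^2 + dy^2).$$

20. Beginning with the expression $-\text{Laplacian}\,(\log \phi(z)) / (\phi(z))^2$, we compute the Laplacian $(\log \phi(z))$, i.e.,

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\big[\log 2 - \log(1 + x^2 + y^2)\big] = -\frac{2(1 - x^2 + y^2)}{(1 + x^2 + y^2)^2} - \frac{2(1 + x^2 - y^2)}{(1 + x^2 + y^2)^2} = \frac{-4}{(1 + x^2 + y^2)^2} = -(\phi(z))^2.$$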
4

THE TWO-DIMENSIONAL CONFORMAL FIELD THEORY

As the title suggests, we shall give in this section a simplified version of conformal field theory, in the sense that we shall limit our discussion of the theory to the Minkowskian/Euclidean space of two dimensions. In later chapters (Chapters 5 and 11), we shall see that conformal groups play a dominant role in string theory via the Virasoro operators $L_n$ (see Sec. 5.2), which happen to be the generators of this group.

4.1

Conformal Group

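Given an $n$-dimensional Minkowski/Euclidean space $M$ with coordinates $x^\mu$ ($\mu = 1, \dots, n$), a conformal transformation is a diffeomorphism $x^\mu \to \tilde{x}^\mu$ such that the line (metric) element is preserved up to a scale factor, i.e.:

$$d\tilde{s}^2 = d\tilde{x}^\mu\, d\tilde{x}^\nu\, \eta_{\mu\nu} = \Omega(x)\, dx^\mu\, dx^\nu\, \eta_{\mu\nu} \qquad (1.4.1)(a)$$

In the case of an infinitesimal transformation $x^\mu \to x^\mu + \epsilon^\mu$, the infinitesimal distance $ds^2$ transforms as:

$$ds^2 \to ds^2 + (\partial_\mu \epsilon_\nu + \partial_\nu \epsilon_\mu)\, dx^\mu\, dx^\nu, \qquad \epsilon_\mu = \epsilon^\nu \eta_{\mu\nu} \qquad (1.4.1)(b)$$

These two taken together lead to:

$$\partial_\mu \epsilon_\nu + \partial_\nu \epsilon_\mu = \frac{2}{n}\, \eta_{\mu\nu}\, \partial_\rho \epsilon^\rho \qquad (1.4.1)(c)$$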

The collection of all such transformations given in (1.4.1) is the conformal group of $M$. It is known that for all dimensions $n > 2$ it is finite-dimensional, whereas for $n = 2$ it is infinite-dimensional. However, it is in this case that models in statistical mechanics can be related to string theory, since whatever be the dimension of the embedding space, the theory in effect described by the world-sheet swept by the string is 2-dimensional. This brings the two great theories, the quantum and the string, much closer and renders them more comprehensible (see the original work of Belavin, Polyakov and Zamolodchikov [4] and the references there, and 7.[21]). Since this group is infinite-dimensional, it has an infinite number of generators, as we shall see in Subsec. 4. To begin with, we consider a particular case of this group, the Lorentz group, i.e., the group of those transformations where the scale factor is $\pm 1$. In two dimensions this group has a single generator and is therefore abelian, so its irreducible representations are one-dimensional (see Chapter 2). Thus all representations of this group, such as tensors, can be decomposed into one-dimensional representations. To achieve this end, we shall use the light-cone formalism$^{20}$:

4.2

Light-cone Formalism and the Lorentz Group

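Define

$$z = x^0 + x^1, \qquad \bar{z} = x^0 - x^1 \qquad (1.4.2)$$

where $z$ and $\bar{z}$ are independent coordinates. The coordinate transformation inverse to (1.4.2) gives back the Minkowski coordinates $(x^0, x^1)$:

$$x^0 = \tfrac{1}{2}(z + \bar{z}), \qquad x^1 = \tfrac{1}{2}(z - \bar{z}) \qquad (1.4.3)$$

Under the Lorentz group with (single element) $\omega = \omega_{01}$ (see Remark (2.2.12)), the transformations for light-cone coordinates (resulting from $\delta x^0 = \omega x^1$, $\delta x^1 = \omega x^0$) are

$$\delta z = \omega z, \qquad \delta \bar{z} = -\omega \bar{z}. \qquad (1.4.4)$$

The Minkowski line element in these coordinates becomes:

$$ds^2 = -(dx^0)^2 + (dx^1)^2 = -dz\, d\bar{z} \qquad (1.4.5)$$

and written out with metric tensor this becomes:

$$ds^2 = g_{zz}\, dz\, dz + g_{z\bar{z}}\, dz\, d\bar{z} + g_{\bar{z}z}\, d\bar{z}\, dz + g_{\bar{z}\bar{z}}\, d\bar{z}\, d\bar{z}\ ^{21} \qquad (1.4.6)$$

This eventually leads to the values of these metric tensor components:

$$g_{zz} = g_{\bar{z}\bar{z}} = 0, \qquad g_{z\bar{z}} = g_{\bar{z}z} = -\tfrac{1}{2} \qquad (1.4.7)$$

in light-cone coordinates. The inverse metric can easily be verified as:

$$g^{zz} = g^{\bar{z}\bar{z}} = 0, \qquad g^{z\bar{z}} = g^{\bar{z}z} = -2. \qquad (1.4.8)$$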
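The relations (1.4.2) to (1.4.5) can be checked numerically. The sketch below applies a finite Lorentz boost of rapidity $\omega$ (the exponentiated form of $\delta x^0 = \omega x^1$, $\delta x^1 = \omega x^0$) and confirms that $z$ and $\bar{z}$ are simply rescaled by $e^{\omega}$ and $e^{-\omega}$, so that the product $z\bar{z}$ is invariant. Since all maps involved are linear, coordinates measured from the origin are used in place of differentials. The sample values and function names are arbitrary choices for this illustration.

```python
import math


def to_light_cone(x0, x1):
    """Light-cone coordinates z = x0 + x1, zbar = x0 - x1 as in (1.4.2)."""
    return x0 + x1, x0 - x1


def boost(x0, x1, omega):
    """Finite Lorentz boost whose infinitesimal form is dx0 = omega*x1, dx1 = omega*x0."""
    return (x0 * math.cosh(omega) + x1 * math.sinh(omega),
            x0 * math.sinh(omega) + x1 * math.cosh(omega))


x0, x1, omega = 1.3, -0.4, 0.7
z, zbar = to_light_cone(x0, x1)
z_new, zbar_new = to_light_cone(*boost(x0, x1, omega))

interval_minkowski = -x0 ** 2 + x1 ** 2     # -(x0)^2 + (x1)^2
interval_light_cone = -z * zbar             # -z * zbar, cf. (1.4.5)

print("z scales by e^omega:      ", z_new, math.exp(omega) * z)
print("zbar scales by e^-omega:  ", zbar_new, math.exp(-omega) * zbar)
print("interval in both forms:   ", interval_minkowski, interval_light_cone)
```

Each light-cone coordinate transforms into a multiple of itself, which is the finite counterpart of (1.4.4).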

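An arbitrary contravariant tensor with components $t^\mu$ in the $(x^\mu)$ system has components $t^z$ and $t^{\bar{z}}$ given by the rule:

$$t^z = t^0 + t^1, \qquad t^{\bar{z}} = t^0 - t^1 \qquad (1.4.9)(a)$$

and a covariant tensor $T_\mu$ (in view of (1.4.3)) is given as

$$T_z = \tfrac{1}{2}(T_0 + T_1), \qquad T_{\bar{z}} = \tfrac{1}{2}(T_0 - T_1) \qquad (1.4.9)(b)$$

The easiest examples of these tensors given by Eqs. (1.4.9)(a) and (1.4.9)(b) are $dz$, $d\bar{z}$ and $\dfrac{\partial}{\partial z}$, $\dfrac{\partial}{\partial \bar{z}}$; when written out in full they stand for:

$$dz = dx^0 + dx^1, \qquad d\bar{z} = dx^0 - dx^1 \qquad (1.4.10)(a)$$

$$\frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x^0} + \frac{\partial}{\partial x^1}\right), \qquad \frac{\partial}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial}{\partial x^0} - \frac{\partial}{\partial x^1}\right) \qquad (1.4.10)(b)$$

20 Light-cone coordinates (an accepted usage in literature) must not be confused with complex coordinates of previous sections.

21 $z\bar{z}$ in all expressions stands for $z\,\bar{z}$.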

From (1.4.4), the following fact regarding the action of the Lorentz group on these tensors is immediate.

Fact 1.4.1: Each component of a tensor in the light-cone coordinates forms an irreducible tensor of the Lorentz group:

(a) $$\delta t^z = \omega t^z, \qquad \delta t^{\bar{z}} = -\omega t^{\bar{z}},$$

(b) $$\delta T_z = -\omega T_z, \qquad \delta T_{\bar{z}} = \omega T_{\bar{z}} \qquad (1.4.11)$$

It is interesting to note that the scalar product in light-cone formalism is one of the following:

(a) $$t \cdot T = t^\mu T_\mu = t^z T_z + t^{\bar{z}} T_{\bar{z}}$$

(b) $$= -2\,(t_{\bar{z}} T_z + t_z T_{\bar{z}})$$

(c) $$= -\tfrac{1}{2}\,(t^z T^{\bar{z}} + t^{\bar{z}} T^z). \qquad (1.4.12)$$

Equality (c) implies that the sum of any tensor with two indices $z\bar{z}$, such as $T^{z\bar{z}} + T^{\bar{z}z}$, can be expressed as a trace, i.e.,

$$T^{z\bar{z}} + T^{\bar{z}z} = -2\, T^\mu{}_\mu \qquad (1.4.13)$$

Also, using the metric, one can express any tensor in terms of only upper and lower $z$ ($\bar{z}$) indices. In conclusion we note that in light-cone formalism (1.4.1)(c) becomes:

$$\partial_0 \epsilon_1 = -\partial_1 \epsilon_0 \qquad \text{and} \qquad \partial_0 \epsilon_0 + \partial_1 \epsilon_1 = 0.$$

4.3

Euclidean Space Formalism

We next show an analogue of the light-cone coordinates in Euclidean space. This is desirable for two reasons, namely (1) working in a space with positive definite metric allows one to have access to the mathematical theory of Riemann surfaces; (2) conformal field theories associated with statistical models are in Euclidean space (in string theory they are in Minkowski space). The change to Euclidean space from Minkowskian space is achieved via Wick's rotation principle:

$$x^0 \to -i x^0, \qquad x^1 \to x^1 \qquad (1.4.14)$$

Denoting the Euclidean coordinates as $x^1, x^2$ in place of Minkowskian $x^0, x^1$, the line element (1.4.5) becomes

$$ds^2 = (dx^1)^2 + (dx^2)^2 \qquad (1.4.15)$$

The coordinates $z$ and $\bar{z}$ in this format are chosen as

$$z = x^1 + i x^2, \qquad \bar{z} = x^1 - i x^2.$$

The inverse coordinate transformation is evidently

$$x^1 = \tfrac{1}{2}(z + \bar{z}), \qquad x^2 = -\tfrac{i}{2}(z - \bar{z}).$$
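The coordinates $(z, \bar{z})$ are our familiar complex coordinates. The line element (1.4.15) now becomes:

$$ds^2 = dz\, d\bar{z} \qquad (1.4.16)$$

and the metric tensor is given by:

(a) $$g_{zz} = g_{\bar{z}\bar{z}} = 0, \qquad g_{z\bar{z}} = g_{\bar{z}z} = \tfrac{1}{2},$$

(b) $$g^{zz} = g^{\bar{z}\bar{z}} = 0, \qquad g^{z\bar{z}} = g^{\bar{z}z} = 2 \qquad (1.4.17)$$

The group of transformations in the case of the Euclidean space is $SO(2)$, with the following rule of Euclidean rotation:

$$x^1 \to x^1 \cos\omega + x^2 \sin\omega, \qquad x^2 \to x^2 \cos\omega - x^1 \sin\omega \qquad (1.4.18)$$

These transformations lead to:

$$z \to e^{-i\omega} z, \qquad \bar{z} \to e^{i\omega} \bar{z} \qquad (1.4.19)$$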
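The passage from (1.4.18) to (1.4.19) is a one-line computation, $z' = (x^1 + i x^2)(\cos\omega - i \sin\omega)$, and it can also be confirmed numerically. In the sketch below the sample point, the angle, and the function name `rotate` are arbitrary choices for this illustration.

```python
import cmath
import math


def rotate(x1, x2, omega):
    """Euclidean rotation (1.4.18)."""
    return (x1 * math.cos(omega) + x2 * math.sin(omega),
            x2 * math.cos(omega) - x1 * math.sin(omega))


x1, x2, omega = 0.8, -1.1, 0.5
z = complex(x1, x2)                      # z = x1 + i x2
y1, y2 = rotate(x1, x2, omega)
z_rot = complex(y1, y2)

expected_z = cmath.exp(-1j * omega) * z                  # (1.4.19), first relation
expected_zbar = cmath.exp(1j * omega) * z.conjugate()    # (1.4.19), second relation

print("rotated z:     ", z_rot, "expected", expected_z)
print("rotated zbar:  ", z_rot.conjugate(), "expected", expected_zbar)
print("|z| preserved: ", abs(z), abs(z_rot))
```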

Given a tensor with components $t^\mu$ and $T_\mu$, their counterpart in $z, \bar{z}$ coordinates is given by:

(a) $$t^z = t^1 + i t^2, \qquad t^{\bar{z}} = t^1 - i t^2,$$

(b) $$T_z = \tfrac{1}{2}(T_1 - i T_2), \qquad T_{\bar{z}} = \tfrac{1}{2}(T_1 + i T_2) \qquad (1.4.20)$$

From these equations it is evident that all the conditions that are satisfied in light-cone formalism can be shown to hold good in this complex coordinate formalism by appropriate introduction of $\pm i$. We shall need this fact while studying Majorana and Weyl spinors (Chapter 7). We also observe that Eq. (1.4.1)(c) in the Euclidean case gives:

$$\partial_1 \epsilon_2 + \partial_2 \epsilon_1 = 0 \qquad \text{and} \qquad \partial_1 \epsilon_1 - \partial_2 \epsilon_2 = 0 \qquad (1.4.20)(a)$$

4.4 Two-dimensional Conformal Group

We now return (although briefly) to the conformal transformation group in two dimensions. We first note that since the line element $ds^2 = -dz\, d\bar{z}$ is preserved up to a scale factor, this implies that we can find a smooth function $e^{\varphi(z, \bar{z})}$ such that

$$ds^2 = -e^{\varphi(z, \bar{z})}\, dz\, d\bar{z} \qquad (1.4.21)$$

Accordingly the metric in $(z, \bar{z})$ coordinates is given by:

(a) $$g_{zz} = g_{\bar{z}\bar{z}} = 0, \qquad g_{z\bar{z}} = g_{\bar{z}z} = -\tfrac{1}{2}\, e^{\varphi(z, \bar{z})}$$

(b) $$g^{zz} = g^{\bar{z}\bar{z}} = 0, \qquad g^{z\bar{z}} = g^{\bar{z}z} = -2\, e^{-\varphi(z, \bar{z})} \qquad (1.4.22)$$

We shall soon use these to discuss the conformal tensor calculus, but first we establish the claim (made in the introduction of this section) that the conformal group for 2-dimensional spaces has infinitely many generators. However, the good part is that the Lie algebra formed by them is of great value in physical theories. Note that the transformations of the type:

(a) $$z \to f(z), \qquad \bar{z} \to g(\bar{z})$$

(b) $$z \to h(\bar{z}), \qquad \bar{z} \to k(z) \qquad (1.4.23)$$

with $f, g, h, k$ as smooth functions will preserve (1.4.21); in the case of (a), for instance, $e^{\varphi(z, \bar{z})}$ will be $f'(z)\, g'(\bar{z})\, e^{\varphi(z, \bar{z})}$. The second transformation, though, will change the orientation in view of (1.4.10)(b). To avoid complications of orientation change, we stick to the transformations Eq. (1.4.23)(a), and consider infinitesimal transformations$^{22}$:

$$z \to z + \sum_n a_n z^{n+1}, \qquad \bar{z} \to \bar{z} + \sum_n \bar{a}_n \bar{z}^{n+1}, \qquad n \in \mathbb{Z} \qquad (1.4.24)$$

In view of (1.4.13)(a) and (1.4.20)(a) it follows that these transformations are generated by:

$$L_n \equiv z^{n+1} \frac{d}{dz}, \qquad \bar{L}_n \equiv \bar{z}^{n+1} \frac{d}{d\bar{z}} \qquad (1.4.25)$$

Obviously the Lie algebra formed by them satisfies:

(a) $$[L_n, L_m] = -(n - m)\, L_{n+m},$$

(b) $$[L_n, \bar{L}_m] = 0,$$

(c) $$[\bar{L}_n, \bar{L}_m] = -(n - m)\, \bar{L}_{n+m} \qquad (1.4.26)$$

We shall return to these generators (operators) in Chapter 5 and of course shall use them in Chapter 11 (see in particular Sec. 11.6).
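The commutation relation (1.4.26)(a) can be verified mechanically by letting $L_n = z^{n+1}\, d/dz$ act on Laurent polynomials, using $L_n z^k = k\, z^{k+n}$. The sketch below represents a Laurent polynomial as a dictionary from exponents to integer coefficients. This representation, the helper names, and the test polynomial are assumptions of the example, not part of the text.

```python
def apply_L(n, poly):
    """Apply L_n = z^(n+1) d/dz to a Laurent polynomial {exponent: coefficient}."""
    result = {}
    for k, coeff in poly.items():
        if k != 0:                       # the constant term is annihilated by d/dz
            result[k + n] = result.get(k + n, 0) + k * coeff
    return result


def subtract(p, q):
    result = dict(p)
    for k, coeff in q.items():
        result[k] = result.get(k, 0) - coeff
    return result


def scale(factor, p):
    return {k: factor * coeff for k, coeff in p.items()}


def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: coeff for k, coeff in p.items() if coeff != 0}


def commutator(n, m, poly):
    """[L_n, L_m] applied to poly."""
    return subtract(apply_L(n, apply_L(m, poly)), apply_L(m, apply_L(n, poly)))


test_poly = {-2: 3, 1: 5, 4: -7}         # 3 z^-2 + 5 z - 7 z^4
checks = []
for n in range(-2, 3):
    for m in range(-2, 3):
        lhs = clean(commutator(n, m, test_poly))
        rhs = clean(scale(-(n - m), apply_L(n + m, test_poly)))
        checks.append(lhs == rhs)

print("relation [L_n, L_m] = -(n - m) L_(n+m) holds on the test polynomial:", all(checks))
```

The same computation with $\bar{z}$ in place of $z$ gives (c), and (b) holds because $L_n$ and $\bar{L}_m$ act on independent variables.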

4.5

Mobius Transformation

In the previous section we have already come across Mobius transformations. We give below an important result on these transformations—which will also serve as an example to the theory discussed above. • In writing an in the second correspondence, we have followed the usual practice in literature. This does not mean that a „ is the complex conjugate of an. The common feature they share is that they are both infinitesimals independent of z and z respectively (an of z and an of z).


Result 1.4.2: The most general transformation that maps the Riemann sphere onto itself is the Möbius transformation:

$z \to z' = \frac{az + b}{cz + d}, \quad \bar z \to \bar z' = \frac{\bar a \bar z + \bar b}{\bar c \bar z + \bar d}$

(1.4.27)

where a, b, c, d are complex parameters satisfying the equality $ad - bc = 1$. The mapping is one-one. This conformal group, also known as the Möbius group, is a six real-parameter group (see Remark (2.2.12)). Infinitesimal Möbius transformations can be written as$^{23}$:

$z' = z + a_{-1} + a_0 z + a_1 z^2, \quad \bar z' = \bar z + \bar a_{-1} + \bar a_0 \bar z + \bar a_1 \bar z^2$

(1.4.28)

and in view of (1.4.25), they are generated by $L_0$, $L_{\pm 1}$, and $\bar L_0$, $\bar L_{\pm 1}$.
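As a numerical illustration of Result 1.4.2, the sketch below checks two standard facts: composing two Möbius maps corresponds to multiplying their 2 x 2 coefficient matrices, and a matrix close to the identity reproduces the infinitesimal form (1.4.28) to first order. The function names and the particular near-identity parametrization in `near_identity` are assumptions made for this example only.

```python
def mobius(m, z):
    """Apply z -> (a z + b) / (c z + d) for m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m1, m2):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def det(m):
    (a, b), (c, d) = m
    return a * d - b * c

def near_identity(am1, a0, a1):
    """Matrix whose Moebius map is z + a_{-1} + a_0 z + a_1 z^2 up to second-order terms.

    Its determinant equals 1 up to second order in the small parameters.
    """
    return ((1 + a0 / 2, am1), (-a1, 1 - a0 / 2))

def infinitesimal(z, am1, a0, a1):
    """Right-hand side of the infinitesimal transformation for z."""
    return z + am1 + a0 * z + a1 * z ** 2

if __name__ == "__main__":
    a, b, c = 2 + 1j, 1j, 0.5
    m1 = ((a, b), (c, (1 + b * c) / a))   # d chosen so that ad - bc = 1
    m2 = ((1, 2), (0, 1))                 # the translation z -> z + 2
    z = 0.4 + 0.9j
    print(abs(mobius(m1, mobius(m2, z)) - mobius(matmul(m1, m2), z)))
```

The three complex parameters $a_{-1}, a_0, a_1$ account for the six real parameters of the group.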

4.6 Conformal Tensor Calculus

Let the coordinate changes be written as:

$z \to w(z), \quad \bar z \to \bar w(\bar z)$

then a general tensor $T^{ij}_{kl} \equiv T(z, \bar z)$ will correspond to a tensor $T'(w, \bar w)$ in $(w, \bar w)$ coordinates, with transformation rule:

$T(z, \bar z) \to T'(w, \bar w) = \left(\frac{\partial w}{\partial z}\right)^{i-k} \left(\frac{\partial \bar w}{\partial \bar z}\right)^{j-l} T(z, \bar z)$

(1.4.29)

The numerical quantities $(i - k) = h$, $(j - l) = \bar h$ are called the conformal weights or dimensions of the tensor T. As in other metric theories, we can raise and lower the covariant and contravariant indices by using the appropriate metric tensor; thus, for instance:

$T_z = g_{z\bar z}\, T^{\bar z} = \tfrac{1}{2}\, e^{\psi(z,\bar z)}\, T^{\bar z} \quad \text{and} \quad T_{\bar z} = \tfrac{1}{2}\, e^{\psi(z,\bar z)}\, T^{z}.$

(1.4.30)

We now state some important results that relate our discussions to familiar pictures in physical theories.

Result 1.4.3: A time translation $x^0 \to x^0 + c$, where c is real, is induced by $z \to z + c$, $\bar z \to \bar z + c$, and so is generated by the sum of generators $(L_{-1} + \bar L_{-1})$ in view of (1.4.25). The generator $(L_{-1} + \bar L_{-1})$ is the Hamiltonian.

Result 1.4.4: A space shift $x^1 \to x^1 + \lambda$ is induced by $z \to z + \lambda$, $\bar z \to \bar z - \lambda$ and is generated by $L_{-1} - \bar L_{-1}$, which is the total momentum.

23 Note that while coefficients $a_n$, $\bar a_n$ in (1.4.24) can stand for light-cone coordinates or for their analogue in Euclidean space, in this case they are in second formalism (see the paragraph below (1.4.13)).


Result 1.4.5: The rotations $\delta x^0 = -\phi x^1$, $\delta x^1 = +\phi x^0$ are the consequence of the transformations $z \to e^{i\phi} z$, $\bar z \to e^{-i\phi} \bar z$. The resulting generator $L_0 - \bar L_0$ is the angular momentum generator. This is nothing but the generator of the 2-dimensional Poincaré group (see Exercise 4.1.7).

The dilation $z \to \lambda z$, $\bar z \to \lambda \bar z$ for $\lambda$ real is generated by $L_0 + \bar L_0$. It can be checked that under a dilation a tensor transforms as $T \to \lambda^{(h + \bar h)}\, T$, while under a rotation $T \to e^{i(h - \bar h)\phi}\, T$. The sum $(h + \bar h)$ and the difference $(h - \bar h)$ are respectively called the dilation weight of T and the spin of T.

Finally, we list a few results dealing with conformally invariant two-dimensional theories. These will be pursued in detail in later chapters (Chapters 6 and 11).

4.7 Conserved Currents

Result 1.4.6: If the theory is Poincaré-invariant, then there exists an energy momentum tensor $T_{\alpha\beta}$ (which can always be chosen to be symmetric), which is a conserved current, i.e.,$^{24}$

$\partial^\alpha T_{\alpha\beta} = 0$

(1.4.31)

and whose charge generates translations (see Sec. 6.3). The current corresponding to Lorentz rotations is a moment of the energy-momentum tensor:

$x_\alpha T_{\beta\delta} - x_\beta T_{\alpha\delta}$

(1.4.32)

which is conserved on its $\delta$ index due to the symmetry of $T_{\alpha\beta}$ and due to its conservation.

Result 1.4.7: If the theory is dilation-invariant, i.e., it is invariant under $x^\alpha \to \lambda x^\alpha$, and the associated current $j_\beta$ is given by a moment of the energy momentum tensor:

$j_\beta = x^\alpha T_{\alpha\beta}$

(1.4.33)

then $j_\beta$ is conserved provided $T^\alpha{}_\alpha = 0$.

The above relation leads to constructions of further conserved currents; for instance, define an arbitrary current

$f^\alpha(x)\, T_{\alpha\beta}$

(1.4.34)

and demand that

$\partial^\alpha f^\beta + \partial^\beta f^\alpha - \phi\, \eta^{\alpha\beta} = 0$

(1.4.35)

where $\phi$ is an arbitrary function of x and $\eta^{\alpha\beta}$ is the metric tensor; then it can be checked that $f^\alpha(x)\, T_{\alpha\beta}$ is conserved. A simple example of (1.4.34) is given by

$x^\alpha x^\delta T_{\delta\beta} - x^2\, T^\alpha{}_\beta$

(1.4.36)

This generates the special translations of the conformal group. Moreover, these additional conserved currents define corresponding generators and, together with the Poincaré and dilation generators, they have the conformal group as their algebra.

Returning to $(z, \bar z)$-coordinates in a two-dimensional Euclidean invariant theory, the above statements lead to the following realizations: If the energy momentum tensor is symmetric and traceless, it has only two components, say $T_{00}$ and $T_{01}$. The traceless condition in coordinates $(z, \bar z)$ implies:

$T_{z\bar z} = 0$

(1.4.37)

thus there are only two components, $T_{zz}$ and $T_{\bar z\bar z}$. The conservation condition leads to

$\partial_{\bar z} T_{zz} = 0, \quad \partial_z T_{\bar z\bar z} = 0$

(1.4.38)

showing that $T_{zz}$ and $T_{\bar z\bar z}$ are functions of only z and $\bar z$, respectively. With the help of the above discussions, we can now define an infinite set of conserved currents; for instance, consider any two functions $f(z)$ and $g(\bar z)$ and form a new pair:

$f(z)\, T_{zz}, \quad g(\bar z)\, T_{\bar z\bar z}$

(1.4.39)

which evidently satisfies $\partial_{\bar z}\big(f(z)\, T_{zz}\big) = 0$ and $\partial_z\big(g(\bar z)\, T_{\bar z\bar z}\big) = 0$. The corresponding generators are:

24 Equations (1.4.31), etc., are in arbitrary dimension.

$L_n = \oint \frac{dz}{2\pi i}\, z^{n+1}\, T_{zz}, \quad \bar L_n = \oint \frac{d\bar z}{2\pi i}\, \bar z^{n+1}\, T_{\bar z\bar z}$

(1.4.40)

This shows that a theory for which $T_{z\bar z} = 0$ carries an infinite-dimensional conformal group. As mentioned above, we shall return to these generators in later chapters.

Exercise 1.4

1. Find the light-cone components of tensors of type (2, 0) and (0, 2), i.e., components of contravariant and covariant tensors of degree 2, and then write the generalized version for a mixed tensor of type (r, s).

2. Obtain the Lorentz group transformation for tensors of type (2, 0) and (0, 2) in light-cone formalism and then write its generalized version.

3. Show that for the line element (1.4.21) the Christoffel symbols $\Gamma^\alpha_{\beta\gamma}$ in $(z, \bar z)$ coordinates satisfy:

$\Gamma^z_{zz} = \partial_z \psi, \quad \Gamma^{\bar z}_{\bar z\bar z} = \partial_{\bar z} \psi$

and the remaining ones are zero.

4. Establish the results (1.4.6) and (1.4.7).

5. Show that for the line element (1.4.21), the current defined by (1.4.36) is a conserved current that generates special translations of the conformal group.

6. Establish the generators of 2-dimensional conformal field theory as given in (1.4.40).

7. Show that for a free spin-zero 2-dimensional field theory, the energy momentum tensor components are:

$T_{z\bar z} = 0, \quad T_{zz} = \partial_z \psi\, \partial_z \psi, \quad T_{\bar z\bar z} = \partial_{\bar z} \psi\, \partial_{\bar z} \psi.$


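Hints to Exercise 1.4

1. The components of a contravariant tensor of degree 2 are obtained by writing the elements of the tensor product of contravariant vector spaces. Thus denoting the tensor by s, we have

(a) $s^{z\bar z} = t^z \otimes t^{\bar z} = (t^0 + t^1) \otimes (t^0 - t^1) = t^0 \otimes t^0 - t^0 \otimes t^1 + t^1 \otimes t^0 - t^1 \otimes t^1 = t^{00} - t^{01} + t^{10} - t^{11}$

Similarly $s^{zz} = t^{00} + t^{01} + t^{10} + t^{11}$, and $s^{\bar z z}$ and $s^{\bar z\bar z}$ are respectively:

$t^{00} - t^{10} + t^{01} - t^{11} \quad \text{and} \quad t^{00} - t^{01} - t^{10} + t^{11}$

In the case of a covariant tensor, we have:

(b) $S_{z\bar z} = \tfrac{1}{2}(T_0 + T_1) \otimes \tfrac{1}{2}(T_0 - T_1) = \tfrac{1}{4}(T_{00} - T_{01} + T_{10} - T_{11})$.

And the other three are:

(c) $S_{zz} = \tfrac{1}{4}(T_{00} + T_{01} + T_{10} + T_{11})$

$S_{\bar z z} = \tfrac{1}{4}(T_{00} - T_{10} + T_{01} - T_{11})$

$S_{\bar z\bar z} = \tfrac{1}{4}(T_{00} - T_{01} - T_{10} + T_{11})$.

We denote the general tensor of type (r, s) by T and note that its components could look like: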

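(d) $T^{ij}_{kl} \equiv T^{z \cdots z\, \bar z \cdots \bar z}_{z \cdots z\, \bar z \cdots \bar z}$ (with i indices z and j indices $\bar z$ above, and k indices z and l indices $\bar z$ below)

where (i, j) and (k, l) are some partitions of r and s respectively, and z, $\bar z$ have been suppressed in $T^{ij}_{kl}$. (This is also denoted $T(z, \bar z)$ in the literature.)

2. Note that $\delta(t^z \otimes t^z) = \delta t^z \otimes t^z + t^z \otimes \delta t^z$. Similarly, $\delta(t^{\bar z} \otimes t^{\bar z}) = \delta t^{\bar z} \otimes t^{\bar z} + t^{\bar z} \otimes \delta t^{\bar z}$. Writing $t^z \otimes t^z = T^{zz}$ and $t^{\bar z} \otimes t^{\bar z} = T^{\bar z\bar z}$ and using (1.4.11)(a), we obtain:

(a) $\delta T^{zz} = 2\omega\, T^{zz}$ and $\delta T^{\bar z\bar z} = -2\omega\, T^{\bar z\bar z}$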


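In the case of the combinations $t^z \otimes t^{\bar z}$ and $t^{\bar z} \otimes t^z$, it can be easily checked that $\delta(t^z \otimes t^{\bar z})$ and $\delta(t^{\bar z} \otimes t^z)$ are zero. Using the same procedure to write $\delta(T_z \otimes T_z)$ and $\delta(T_{\bar z} \otimes T_{\bar z})$ with the help of (1.4.11)(b), we have:

(b) $\delta S_{zz} = -2\omega\, S_{zz}$ and $\delta S_{\bar z\bar z} = 2\omega\, S_{\bar z\bar z}$

and $\delta S_{z\bar z} = \delta S_{\bar z z} = 0$. In view of these computations it follows that for a general tensor, the Lorentz transformation rule would be:

(c) $\delta T^{ij}_{kl} = \omega\, [(i - k) - (j - l)]\, T^{ij}_{kl}$.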
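Hints 1 and 2 can be illustrated numerically. The sketch below assumes the light-cone convention $t^z = t^0 + t^1$, $t^{\bar z} = t^0 - t^1$ used in Hint 1, and it assumes that a finite boost with parameter $\omega$ acts on Cartesian components by the usual hyperbolic matrix, so that $t^z \to e^{\omega} t^z$ and $t^{\bar z} \to e^{-\omega} t^{\bar z}$ (the finite form of $\delta t^z = \omega t^z$). The function names are illustrative.

```python
import math

# Assumed convention (Hint 1): t^z = t^0 + t^1, t^zbar = t^0 - t^1.
# Rows are labelled (z, zbar), columns (0, 1).
M = ((1, 1), (1, -1))

def lightcone_upper(t):
    """Light-cone components s^{ab} = M^a_mu M^b_nu t^{mu nu}, with a, b in (z, zbar)."""
    return [[sum(M[a][mu] * M[b][nu] * t[mu][nu]
                 for mu in range(2) for nu in range(2))
             for b in range(2)] for a in range(2)]

def boost(t, omega):
    """Apply a finite boost to the Cartesian components of a type (2, 0) tensor."""
    ch, sh = math.cosh(omega), math.sinh(omega)
    B = ((ch, sh), (sh, ch))
    return [[sum(B[m][r] * B[n][s] * t[r][s]
                 for r in range(2) for s in range(2))
             for n in range(2)] for m in range(2)]

if __name__ == "__main__":
    t = [[1.0, 2.0], [3.0, 5.0]]
    s = lightcone_upper(t)
    s_boosted = lightcone_upper(boost(t, 0.3))
    # s^{zz} scales by e^{2 omega}, s^{z zbar} is unchanged,
    # and s^{zbar zbar} scales by e^{-2 omega}.
    print(s_boosted[0][0] / s[0][0], s_boosted[0][1] / s[0][1],
          s_boosted[1][1] / s[1][1])
```

Expanding the three scaling factors to first order in $\omega$ gives the weights $2\omega$, 0 and $-2\omega$ quoted in Hint 2.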

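3. Recall that

$\Gamma^\alpha_{\beta\gamma} = \tfrac{1}{2}\, g^{\alpha\delta}\, (\partial_\beta g_{\gamma\delta} + \partial_\gamma g_{\beta\delta} - \partial_\delta g_{\beta\gamma}).$

Write $\alpha$, $\beta$, $\gamma$ as z and $\delta$ as z, as well as $\bar z$ for summation; then since $g_{zz} = 0 = g^{zz}$, it becomes

$\Gamma^z_{zz} = \tfrac{1}{2}\, g^{z\bar z}\, (\partial_z g_{z\bar z} + \partial_z g_{z\bar z}) = e^{-\psi}\, \partial_z e^{\psi} = \partial_z \psi.$

Similarly for $\Gamma^{\bar z}_{\bar z\bar z}$, we have $\partial_{\bar z} \psi$. The mixed component

$\Gamma^z_{z\bar z} = \tfrac{1}{2}\, g^{z\bar z}\, (\partial_z g_{\bar z\bar z} + \partial_{\bar z} g_{z\bar z} - \partial_{\bar z} g_{z\bar z})$

is evidently zero. (A good source for the remaining four exercises is Ref. 7.[21].)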

References

1. L. V. Ahlfors, Complex Analysis (New York: McGraw-Hill, 1979).
2. L. V. Ahlfors and L. Sario, Riemann Surfaces (Princeton University Press, 1960).
3. A. F. Beardon, A Primer on Riemann Surfaces (New York: Cambridge University Press, 1984).
4. A. Belavin, A. M. Polyakov and A. B. Zamolodchikov, Infinite Conformal Symmetry in Two-dimensional Quantum Field Theory, Nucl. Phys. B241, 333-380 (1984).
5. R. Bott and L. W. Tu, Differential Forms in Algebraic Topology (Springer-Verlag, 1982).
6. P. Buser, Geometry and Spectra of Compact Riemann Surfaces (Birkhäuser, 1992).
7. R. V. Churchill and J. Brown, Complex Variables and Applications (McGraw-Hill, 1990).
8. H. M. Farkas and I. Kra, Riemann Surfaces (New York: Springer-Verlag, 1991).
9. O. Forster, Lectures on Riemann Surfaces (Springer-Verlag, 1981).
10. S. Kobayashi and K. Nomizu, Foundations of Differential Geometry (Interscience Publishers, 1969).
11. H. Weyl, The Concept of a Riemann Surface (3rd ed., Reading, MA: Addison-Wesley, 1964).

CHAPTER 2

ELEMENTS OF GROUP THEORY AND GROUP REPRESENTATIONS

The theory of groups as a structural theory has been developed and studied in great detail by mathematicians (e.g., the theory of topological groups and Lie groups). From the point of view of physicists, however, the abstract formulations are not of interest nor worth the effort, unless they relate to some physical system or to some observed phenomena. Thus, for instance, the groups whose actions leave some postulated laws of physics or observed phenomena unaltered are of importance to physicists. These groups are known as symmetry groups of given physical systems. We shall return to these in later chapters. We give below the basics of group theory, trading some mathematical rigor for an emphasis on applicability.

1 INTRODUCTION

1.1 Definition of a Group, Examples, and Conjugate Classes

Definition 2.1.1: A group 'G' is a set of elements (a, b, c, ...) together with a binary operation G × G → G (i.e., (a, b) → a • b = ab (closure)) that satisfies: (i) a • (b • c) = (a • b) • c = abc (associativity); (ii) there exists a unique element 'e' in G such that a • e = e • a = a; (iii) corresponding to every element a in G, there is an element b such that a • b = b • a = e; 'b' is called the inverse of a and is written $a^{-1}$.

The group 'G' is finite if the number of elements is finite, otherwise it is infinite. If the elements of G are functions a(x) of a continuous parameter x, then G obviously has an infinite number of elements and as such is an infinite group. If for every pair (a, b), ab = ba, the group G is called an abelian group. The total number of elements in a finite group G is called its order and is denoted |G| or g. In this case every element 'a' has a smallest positive integer, say n, associated with it such that $a^n = e$; n is called the order of a and is denoted |a|.

Any subset H ⊂ G that is itself a group under the same operation is called a subgroup of G. Evidently the set $\{a, a^2, ..., a^n\} = A_n \subset G$ is a subgroup of G. The group $A_n$ is called a cyclic group. To define a cyclic group we need just one element and a binary operation. This one element is called the generator of the group. Clearly every finite group G has a finite number of generators a, b, c, ..., meaning thereby that every element of G is expressible as a finite product of powers (including the negative powers) of a, b, c, .... For a cyclic group, the order |G| = g equals the order |a| of its generator.
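The axioms of Definition 2.1.1 can be tested by brute force on a small finite set. The following is a minimal sketch using the integers modulo n under addition as the example; the helper names `is_group`, `element_order` and `make_add_mod` are hypothetical and chosen only for this illustration.

```python
from math import gcd

def is_group(elements, op):
    """Brute-force check of closure, associativity, identity and inverses."""
    elements = list(elements)
    closed = all(op(a, b) in elements for a in elements for b in elements)
    assoc = all(op(a, op(b, c)) == op(op(a, b), c)
                for a in elements for b in elements for c in elements)
    identities = [e for e in elements
                  if all(op(e, a) == a == op(a, e) for a in elements)]
    if not (closed and assoc and len(identities) == 1):
        return False
    e = identities[0]
    return all(any(op(a, b) == e == op(b, a) for b in elements)
               for a in elements)

def element_order(a, op, e):
    """Smallest positive n such that a combined with itself n times gives e."""
    n, x = 1, a
    while x != e:
        x = op(x, a)
        n += 1
    return n

def make_add_mod(n):
    """Addition modulo n as a binary operation."""
    return lambda a, b: (a + b) % n

if __name__ == "__main__":
    n = 6
    add = make_add_mod(n)
    print(is_group(range(n), add))
    # The element 1 generates the whole group, so its order equals |G| = 6.
    print([element_order(a, add, 0) for a in range(1, n)])
```

In this additive example the order of an element a is n / gcd(a, n), so the generators of the cyclic group are exactly the residues coprime to n.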


Example 2.1.2:

(a) The integers, with a and b identified whenever a − b = mn for some integer m and a given integer n, form a group under addition modulo n. This is denoted $Z_n$.

(b) The set of real numbers (integers) denoted R (Z) forms a group with addition as the group operation, whereas the set of rationals Q (excluding zero) forms a group with multiplication as the group operation.$^1$

(c) The k complex numbers $\exp(2\pi i m/k)$ (m = 0, 1, ..., k − 1) form a cyclic group under multiplication.

(d) All 2 × 2 matrices

$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \quad 0 \le \theta < 2\pi$

form a group with respect to matrix multiplication. This is called the rotation group $R_2$ in two dimensions. Similarly, the set of 3 × 3 matrices

$\begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi\cos\theta & \cos\phi\cos\theta & \sin\theta \\ \sin\phi\sin\theta & -\cos\phi\sin\theta & \cos\theta \end{pmatrix}, \quad 0 \le \theta, \phi < 2\pi$

with matrix multiplication as the group operation is the rotation group $R_3$.

(e) The set of 3! permutations of {1, 2, 3} is the group $S_3$, where any two permutations $P_i$, $P_j$ are combined into one permutation $P_k$ by carrying out the permutation $P_j$ followed by $P_i$.

All of the above groups except the last two are abelian. While using groups as modes of application we treat the elements a, b, ... as operators A, B, ... and call the groups (in classical terminology) groups of transformations. Of particular interest are the groups which leave the geometry of an object or the equations defining a physical system invariant. These are called symmetry groups of the system. For example the groups in (d) and (e) above are symmetry groups.

Definition 2.1.3: An element 'b' in G is said to be conjugate to a given element 'a' in G, if for some element 'v' of G, we have $b = v a v^{-1}$. We denote this relation as $b \overset{c}{=} a$. It is evident that $a \overset{c}{=} a$, that $b \overset{c}{=} a$ implies $a \overset{c}{=} b$, and also that $a \overset{c}{=} b$, $b \overset{c}{=} c$ leads to $a \overset{c}{=} c$. The set of all elements conjugate to each other forms the so-called conjugacy class or simply the class. In view of the above definition, G can be decomposed into a disjoint union of these classes. If $|C_i|$ denotes the number of elements in the i-th class and N the total number of these classes, then $|G| = \sum_{i=1}^{N} |C_i|$. The integer N is an important characteristic of the group. It may be noted that 'e' in any G forms a class by itself, and if G is an abelian group each element forms a class by itself.
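The class decomposition of Definition 2.1.3 can be computed directly for the permutation group $S_3$ of Example 2.1.2(e). In this sketch, permutations are stored as tuples in one-line notation on {0, 1, 2} rather than {1, 2, 3}; that relabelling and the helper names are choices made for the example.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'q first, then p', i.e. i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition the group into classes {v a v^-1 : v in group}."""
    remaining, classes = set(group), []
    while remaining:
        a = min(remaining)
        cls = {compose(compose(v, a), inverse(v)) for v in group}
        classes.append(cls)
        remaining -= cls
    return classes

if __name__ == "__main__":
    S3 = list(permutations(range(3)))
    classes = conjugacy_classes(S3)
    # Class sizes add up to the order of the group: 1 + 2 + 3 = 6.
    print(sorted(len(c) for c in classes), sum(len(c) for c in classes))
```

Here N = 3: the identity alone, the two cyclic permutations, and the three transpositions.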

1 The groups formed by R, Z and Q are denoted as: (R, +); (Z, +); and (Q, •).


1.2 Invariant Subgroups, Factor Groups, Simple and Semi-simple Groups

Definition 2.1.4: Every subgroup $G_s$ of G defines left cosets $aG_s$ and right cosets $G_s a$; if $aG_s = G_s a$ for every a in G and not in $G_s$, the subgroup is said to be an invariant or normal subgroup of G. The group G is called a simple group if it does not contain an invariant subgroup $G_s$, and is called semi-simple if it does not contain an invariant abelian subgroup $G_s$. For a finite group G, the total number m of cosets for any $G_s$ satisfies: $|G| = |G_s| \times m$.

Definition 2.1.5: Every invariant subgroup $G_s$ defines another group $G/G_s = \{eG_s, a_2G_s, ..., a_mG_s\}$ with respect to the coset multiplication: $a_i G_s \times a_j G_s = a_i a_j G_s$. It is called the factor group or the quotient group of G relative to $G_s$. The order of $G/G_s$ is m, which is also called the index of $G_s$.

Example 2.1.6:

(a) The group $R_3$ does not contain an invariant subgroup and is therefore simple.

(b) The group $S_3$ = {e = (123), (132), (213), (231), (312), (321)} is not semi-simple since it contains the alternating group A = (e, (231), (312)), which is an abelian invariant subgroup of $S_3$ (obviously $S_3$ is not simple).
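The statements about $S_3$ in Example 2.1.6(b) can be verified by enumerating cosets. The sketch below again stores permutations as tuples on {0, 1, 2}, takes the alternating subgroup to be the set of even permutations, and uses a two-element subgroup generated by a transposition as a contrasting non-invariant example; all names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))

def sign(p):
    """Parity of a permutation computed from its number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

def left_coset(a, H):
    return frozenset(compose(a, h) for h in H)

def right_coset(a, H):
    return frozenset(compose(h, a) for h in H)

def is_normal(H, G):
    """H is invariant in G when aH == Ha for every a in G."""
    return all(left_coset(a, H) == right_coset(a, H) for a in G)

def is_abelian(H):
    return all(compose(a, b) == compose(b, a) for a in H for b in H)

S3 = list(permutations(range(3)))
A3 = [p for p in S3 if sign(p) == 1]          # even permutations
H2 = [(0, 1, 2), (1, 0, 2)]                   # generated by one transposition

if __name__ == "__main__":
    cosets = {left_coset(a, A3) for a in S3}
    print(is_normal(A3, S3), is_abelian(A3), len(cosets))   # index m = 2
    print(is_normal(H2, S3))
```

The two cosets of the alternating subgroup form the factor group of order 2, in agreement with $|G| = |G_s| \times m$, that is 6 = 3 × 2.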

1.3 Products of Groups and Homomorphism

Definition 2.1.7: Let G and G' be two arbitrary groups with independent binary operations. The group G × G' formed by using the combination rule: (a, a') (b, b') = (ab, a'b') is called their direct product. The order of the group is |G||G'| = gg'. Evidently |G × G'| = |G' × G|. In particular if $G_1$ and $G_2$ are two commuting subgroups of a larger group G, then the direct product group$^2$ $G_1 \times G_2 \subset G$, and the groups $G_1$, $G_2$ are invariant subgroups of $G_1 \times G_2$. If $G_1 \times G_2 = G$, then evidently $G_1$ and $G_2$ are invariant subgroups of G. If, however, only $G_1$ is invariant, $G_2 \cap G_1 = (e)$, and every member a ∈ G can be expressed as $a_1 a_2$ (a product of elements of $G_1$ and $G_2$), then the product of $G_1$ and $G_2$ equals G; the group G is called the semi-direct product of $G_1$, $G_2$ and is written $G_1 \rtimes_s G_2$ or simply $G_1 \rtimes G_2$.

Definition 2.1.8: A mapping $\phi : G \to G'$ which preserves the group structure (i.e., $\phi(a \cdot b) = \phi(a) \cdot \phi(b)$) is called a homomorphism. The set of elements of G that are mapped onto the identity e' of G' is called the kernel of $\phi$ and is written ker $\phi$. If every element of G' is the image of some element in G, the mapping is a homomorphism onto, otherwise it is a homomorphism into. For instance, if H is an invariant subgroup of G, the mapping G → G/H is a 'homomorphism onto.'

Definition 2.1.9: A 1-1 correspondence between two groups G and G' is called an isomorphism if it preserves the group structure. Thus from an abstract point of view two isomorphic groups are one and the same.

Definition 2.1.10: If a structure-preserving 1-1 correspondence of a group G onto itself associates a unique element $\phi(a)$ to every a in G, it is called an automorphism. In particular, if the mapping $\phi$ results from choosing a fixed element x in G and by defining:

$a \to x^{-1} a x = \phi(a) = a^x$

it is called an inner automorphism of G. An automorphism which is not equivalent to the transformation given by a single element is called an outer automorphism. For instance, in a cyclic group of order m, the correspondence $a \to a^k$, if k is prime to m, defines an outer automorphism.

2 $G_1 \times G_2 \subset G$ in view of the binary operation G × G → G.
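The closing example on cyclic groups can be made concrete. The sketch below writes the cyclic group of order m additively as the residues 0, ..., m − 1, so that the correspondence $a \to a^k$ becomes multiplication by k modulo m; this additive rewriting and the function names are assumptions made for the illustration.

```python
from math import gcd

def power_map(k, m):
    """The map a -> k*a (mod m), i.e. a -> a^k in multiplicative notation."""
    return {a: (k * a) % m for a in range(m)}

def is_homomorphism(phi, m):
    """Check phi(a + b) == phi(a) + phi(b) modulo m for all pairs."""
    return all(phi[(a + b) % m] == (phi[a] + phi[b]) % m
               for a in range(m) for b in range(m))

def is_bijection(phi, m):
    return sorted(phi.values()) == list(range(m))

def kernel(phi):
    """Elements mapped to the identity 0."""
    return [a for a, image in phi.items() if image == 0]

def inner_automorphism(x, m):
    """a -> x^-1 a x, written additively as -x + a + x."""
    return {a: (-x + a + x) % m for a in range(m)}

if __name__ == "__main__":
    m = 12
    autos = [k for k in range(1, m) if is_bijection(power_map(k, m), m)]
    print(autos)                      # exactly the k with gcd(k, 12) == 1
    print(kernel(power_map(4, m)))    # a homomorphism with nontrivial kernel
```

Because the group is abelian, every inner automorphism is the identity map, so each of the maps with k prime to m and k ≠ 1 is indeed not given by conjugation with a group element.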

2 LIE GROUPS AND TOPOLOGICAL GROUPS

The importance of Lie groups in mathematics and physics can hardly be overemphasized. For instance they play an important role via diffeomorphism groups in the study of various types of partial differential equations. The subject of Lie groups since the first contributions of Sophus Lie [22] has developed tremendously. One can now talk of infinite-dimensional Lie groups modelled on Banach manifolds or Frechet manifolds. The work in this direction is due to Bott (1956) [4], Eells (1958) [11], Abraham (1961) [1], Smale and Palais (1968) [28], Omori (1970) [27], and Ebin and Marsden (1970) [10]. In the late sixties through the pioneering work of Arnold (1966) [2] (followed by that of Marsden, Ebin and Fischer (1972) [25]), the applications of Lie theory to mechanics, in particular to hydrodynamics and plasma physics, shifted the emphasis from mathematics to physics. During the past decade and a half, the theory of Yang-Mills [Moncrief (1977), (1980), Singer (1977), (1980), Atiyah, Hitchin and Singer (1978),] (see Chapter 10 for these references), Kac-Moody algebras [Chap. 5] and the string and superstring theories [Chap. 11] have given another boost to interacting research areas of Lie groups. It is therefore just in order that we acquaint our readers with the primary tools of the topic, leaving the task of sophisticated technical details to specialized texts on the subject (see [16] and also Chapter 8 of 0.[8] for an introductory account). To begin with, we give the definition of topological group. These groups are of a more general nature.

2.1 Topological Groups

Definition 2.2.1: A set G is a topological group if the set is (i) a group, (ii) a Hausdorff topological space, and (iii) the mappings $\phi : G \times G \to G : (x, y) \to x \cdot y$ and $\psi : G \to G : x \to x^{-1}$ are continuous, where G × G has the product topology. The requirement (iii) is the compatibility condition between the two structures (algebraic and topological). Obviously the two mappings of (iii) can be reduced to a single continuous mapping: $(x, y) \to xy^{-1}$.

Example 2.2.2:

(a) The additive group of real numbers R is a topological group (T.G.) with metric topology.

(b) The group of (n × n) nonsingular real matrices denoted GL(n, R) is a T.G. with matrix multiplication as the group structure and with the usual topology of $R^{n^2}$ as the topological structure.

(c) The rotation groups in Exp. (2.1.2, d) are also T.G.s.

(d) The unit circle $S^1 : \{\exp(2\pi i x) \mid x \in R\}$ is a T.G. with multiplication of complex numbers as the group operation.

(In each of these examples it is an easy exercise to check that the mappings $\phi$ and $\psi$ of Def. (2.2.1) are continuous.)

Given two topological groups $G_1$ and $G_2$, one can define another T.G. called the product group $G_1 \times G_2$ by choosing the product topology on the set $G_1 \times G_2$ as the topology and by taking the operations pointwise: $(x_1, x_2)(y_1, y_2)^{-1} = (x_1 y_1^{-1}, x_2 y_2^{-1})$.


For example, the n-fold product of $S^1$ with itself gives the familiar torus $T^n = \{(e^{2\pi i x_1}, ..., e^{2\pi i x_n}) \mid (x_1, ..., x_n) \in R^n\}$.

Definition 2.2.3: Given a topological group G and an element a ∈ G, the mappings $L_a$ and $R_a$ defined as (i) $L_a : G \to G : x \to ax$, and (ii) $R_a : G \to G : x \to xa$ are called left and right translations of G with respect to a. These mappings are evidently homeomorphisms of G; they help us determine the local properties of any element a with the help of those of e, since a neighbourhood U of e defines a unique neighbourhood $L_a U = V$ of a (see subsection (5.3)).

Definition 2.2.4: A subset H of a topological group G is a topological subgroup if: (i) $HH^{-1} \subset H$, and (ii) the mapping $(h, k) \to hk^{-1}$ is continuous for h, k in H. A topological subgroup H of G is called normal if $aHa^{-1} \subset H$ for every a in G. Evidently then, there follows the quotient G/H, which is the factor group of G with respect to the operation aH • bH = abH. If in addition H is closed (i.e., a closed subset of G), then G/H can be given a quotient topology $\pi : G \to G/H$ which is Hausdorff ($\pi$ defines an open continuous homomorphism (see Chap. 0)), and in that case G/H becomes a topological group.

Definition 2.2.5: To every point p of a topological space X there is associated a unique maximal connected subset C(p) of X that contains p. In the case of a topological group G when p is the identity 'e', the subset is denoted '$G_0$' and is called the identity component of G. The set $G_0$ (and for that matter C(p)) is a closed set. It can be easily checked that $G_0$ is a closed normal topological subgroup of G, and the connected component of any element a, i.e., C(a), equals the coset $aG_0$. Moreover if G is locally connected (i.e., every point a ∈ G has a connected neighbourhood), then $G/G_0$ is discrete.$^3$

Remark 2.2.6: In general, if H is a connected topological subgroup of G and H is closed, then if the corresponding quotient group G/H is connected, the group G is connected as well. It should be noted, however, that the connectedness of G and G/H does not imply the connectedness of H. For example, the torus $T^1$ = (R/Z, +) is a connected T.G., where R is connected, but the closed subgroup Z is not connected.

Definition 2.2.7: A connected group is said to be a simply connected group, if each closed continuous curve g(t) (0 ≤ t ≤ 1, g(0) = g(1)) in it can be continuously shrunk to zero (i.e., reduced to a single point) using for instance a family of curves g(t, s), 0 ≤ s ≤ 1, g(t, 0) = g(t), g(t, 1) = e.

Definition 2.2.8: A group whose elements are defined in terms of a continuous variable is called a continuous group. The group $R_3$ of rotations given by continuous parameters $\theta$,

be an associative binary operation that can be defined only on certain pairs (x, y) ∈ N × N and let $\psi$ be another operation $\psi : x \to x^{-1}$ defined for some elements of N, then N is a local topological group provided $\phi$ and $\psi$ are continuous and wherever they are defined $xx^{-1} = x^{-1}x = e$ holds good.

3 Note that Definitions (2.2.4) and (2.2.5) are the same for Lie groups, once the topological space and continuous mappings here are replaced by analytic ($C^\infty$) manifolds and analytic ($C^\infty$) mappings.


Definition 2.2.10: A Lie group G is a set which is (a) an analytic (or a $C^\infty$) manifold, (b) a group, and (c) where the two group laws: multiplication $m : G \times G \to G : (x, y) \to xy$ and the inversion $i : G \to G : x \to x^{-1}$ are analytic (or $C^\infty$) maps with regard to the manifold structure in (a).$^4$

Example 2.2.11:

(a) All topological groups given in Exp. (2.2.2) are also Lie groups.

(b) Let R (C) denote the real (complex) numbers and H the quaternions; then the groups GL(n, R) (GL(n, C)) and GL(n, H) of n × n invertible matrices with real (complex) and quaternion entries are Lie groups.

(c) The quotient group R/Z (Remark 2.2.6), which is isomorphic to the circle group $S^1$ (= $T^1$), is a Lie group.

(d) The cartesian product of Lie groups is a Lie group. Accordingly, the n-dimensional torus $T^n = S^1 \times S^1 \times \cdots \times S^1$ formed by n copies of $S^1$ is a Lie group.

(e) The group of (n + 1) × (n + 1) matrices A with complex coefficients satisfying $A \cdot A^+ = 1$ and det A = 1 is a Lie group ($A^+$ = Hermitian conjugate/conjugate transpose of A). It is denoted SU(n + 1) and is called the special unitary group (see also (g)). In particular when n = 1, the collection of matrices

$\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$

(2.2.1)

with the restriction $|\alpha|^2 + |\beta|^2 = 1$ gives SU(2). The group SU(2) is homeomorphic to the 3-sphere $S^3$ of unit vectors in $C^2$.

(f) Let $g_r$ be the canonical pseudo-metric of signature (n − r) on $R^n$ whose infinitesimal length is

$ds^2 = dx_1^2 + dx_2^2 + \cdots + dx_r^2 - dx_{r+1}^2 - \cdots - dx_n^2$

(2.2.2)

and whose matrix representation, also denoted $g_r$, is:

$g_r = \begin{pmatrix} 1_r & 0 \\ 0 & -1_{n-r} \end{pmatrix}$

(2.2.3)

Let O(r, n − r) be the group of linear transformations A of $R^n$ such that $g_r(Ax, Ay) = g_r(x, y)$, ∀ x, y ∈ $R^n$. This group can be identified with the group of n × n matrices A such that

$A^t g_r A = g_r$

(2.2.4)

From the above equation it follows that det A = ±1, ∀ A ∈ O(r, n − r). The subgroup of O(r, n − r) of all those A's for which det A = 1 is denoted SO(r, n − r). The group O(r, n − r), as well as SO(r, n − r), are Lie groups. In particular the groups O(n, 0) ≡ O(n), SO(n, 0) ≡ SO(n) are Lie groups; these are respectively called the orthogonal group and the special orthogonal group*. Furthermore, when n = 4 and r = 1, the group O(1, 3) is called the Lorentz group and SO(1, 3) is known as the proper Lorentz group.

(g) Let $g_r$ denote the canonical non-degenerate hermitian sesquilinear form on $C^n$ of signature (n − r):

$g_r(x, y) = \bar x^1 y^1 + \cdots + \bar x^r y^r - \bar x^{r+1} y^{r+1} - \cdots - \bar x^n y^n$

(2.2.5)

4 The underlying manifold of a Lie group can be $C^\infty$ or analytic. See Sec. 0.3 for the definition of and distinction between the two structures.

* The group which leaves the quadratic form ($x^2 + y^2 + z^2 - c^2t^2$) invariant is called the (homogeneous) Lorentz group. (Notice the sign convention here.) If we add to it the group of translations, we obtain the inhomogeneous Lorentz group, commonly known as the Poincaré group.


∀ x, y ∈ $C^n$. The matrix representation of $g_r$ is the same as in Exp. (f) above. In this case, the group of linear transformations A of $C^n$ satisfying $g_r(Ax, Ay) = g_r(x, y)$, ∀ x, y ∈ $C^n$, denoted U(r, n − r), can be identified with the group of (n × n) complex matrices A such that

$A^+ g_r A = g_r$

(2.2.6)

where $A^+$ is the conjugate transpose of A. The above equation implies that |det A| = 1. The group U(r, n − r) and its subgroup SU(r, n − r) formed by matrices A with det A = 1 are Lie groups. Evidently U(n, 0) ≡ U(n) and SU(n, 0) ≡ SU(n), known as the unitary group and special unitary group of dimension n, are Lie groups.

(h) The matrix $\omega = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, where 1 denotes the n-rowed unit matrix, defines the canonical symplectic structure on $R^{2n}$. The group of linear transformations A of $R^{2n}$ satisfying $\omega(Ax, Ay) = \omega(x, y)$ ∀ x, y ∈ $R^{2n}$, denoted Sp(n, R), can be identified with the group of 2n × 2n matrices A such that

$A^t \omega A = \omega.$

(2.2.7)

The above equation implies that det A = ±1 ∀ A ∈ Sp(n, R). This group, known as the real symplectic group in 2n dimensions, is a Lie group. (See Exercise 12 for (f), (g) and (h).)

Remark 2.2.12: Since the n-dimensional spheres for n = 1 and n = 3 turned out to be Lie groups, a natural question would be, "Do all n-dimensional spheres define Lie groups?" Surely ... 'no.' All even dimensional spheres $S^{2n}$ fail to be Lie groups, since the multiplication map 'm' fails to be analytic. (See Bott [4].)

If the underlying manifold of a Lie group G is compact, G is called a compact Lie group. The group (R/Z, +) = $S^1$ is thus a compact Lie group and so is $T^n$. Compact Lie groups are important for more than one reason. For instance, they can be given a bi-invariant Riemannian metric (i.e., a metric that is left as well as right invariant).

Classically,$^5$ a Lie group can be thought of as a special type of continuous group (see Def. 2.2.8) whose elements a, b, c, etc., are labelled by r real parameters $(a_1, a_2, ..., a_r)$ which can vary over a finite or infinite range; their domain space is called the group-parameter space. The elements c = a • b and $d = (a)^{-1}$ obtained from multiplication and inverse operations are analytic functions over the group parameter space. A Lie group is said to be compact if the domain of variation of its parameters is closed and bounded*. If the number r is the smallest number that characterises the group, the parameters are called essential and the group is known as an r-parameter group.

As an example, the real transformation group GL(2, R) formed by 2 × 2 matrices

$a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$

has an arbitrary element $a = (a_{11}, a_{12}, a_{21}, a_{22})$. The parameters here are four in number, and since each of them ranges over the unbounded set R, the group is not compact. The group SO(2, R) formed by orthogonal matrices of determinant 1 is a compact group since its parameters are bounded, and it is a 1-parameter group. The Lorentz group given in (f) is a six-parameter group in four dimensions, whereas the Poincaré group is a 10-parameter group. Clearly for an arbitrary n, the Lorentz group has n(n − 1)/2 essential parameters.
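The defining condition (2.2.4) and the parameter count can be illustrated with explicit matrices. This is a minimal sketch in plain Python, assuming the metric diag(+1, −1, −1, −1) for O(1, 3); the sample boost and rotation matrices and all function names are illustrative choices.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def metric(r, n):
    """Diagonal matrix g_r with r entries +1 followed by n - r entries -1."""
    return [[(1 if i < r else -1) if i == j else 0 for j in range(n)]
            for i in range(n)]

def preserves(A, g, tol=1e-12):
    """True if A^t g A equals g up to rounding, i.e. A lies in O(r, n - r)."""
    lhs = matmul(matmul(transpose(A), g), A)
    n = len(g)
    return all(abs(lhs[i][j] - g[i][j]) < tol
               for i in range(n) for j in range(n))

def boost_x(eta):
    """Boost mixing the first two coordinates with rapidity eta."""
    ch, sh = math.cosh(eta), math.sinh(eta)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_yz(theta):
    """Rotation of the last two coordinates by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, c, s], [0, 0, -s, c]]

def lorentz_parameters(n):
    """Number of essential parameters n(n - 1)/2 quoted in the text."""
    return n * (n - 1) // 2

if __name__ == "__main__":
    g = metric(1, 4)
    A = matmul(boost_x(0.7), rotation_yz(1.1))
    print(preserves(A, g), lorentz_parameters(4), lorentz_parameters(4) + 4)
```

The rapidity of a boost is unbounded while a rotation angle lies in a bounded interval, which is the parameter-space reason why the Lorentz group is not compact whereas SO(2, R) is.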

'

Since Lie groups arose via continuous transformation groups, this definition is closer to a physicist's way of looking at them. The domain (set) is said to be closed, if the limit of every convergent sequence of points in the set is also in the set. A set of numbers (a) is said to be bounded if every a satisfies: \a\ < M where M is a given positive number.


2.2 Algebraic Groups

Our list of definitions will remain incomplete without a word about algebraic groups. These groups are often thought of as mere applications of algebraic geometry, although there is evidence to the contrary. In fact they have their origin as early as 1883, when E. Picard used them in the Galois theory of linear differential equations (see [7]). The proponents of Lie group theory would like to think of them as an algebraization of Lie groups; if that were the case one would expect to find them in Lie's work. But they are simply not there, though Lie's work contains detailed studies of some matrix groups over C (Lie [22]). The definition given below should be enough to convince the reader of our viewpoint that algebraic groups are entities independent of the two topics mentioned above, i.e., algebraic geometry and Lie groups ([3], [5], [9]).

Definition 2.2.13: Let K denote an algebraically closed field and $SL_n(K)$ the group of n × n matrices $x = (x_{ij})$ with entries in K and with det x = 1. A subgroup $G \subset SL_n(K)$ is called a linear algebraic group over K if there exists a set S of polynomials P in $K[x_{ij}]$, 1 ≤ i, j ≤ n, such that $x = (x_{ij})$ is in G if and only if $P(x_{ij}) = 0$ for every P in S. These are generally referred to as K-groups and their algebra is denoted as K[G].

Definition 2.2.14: The group $GL_n(K)^6$ of all non-singular n × n matrices defined over K can be viewed as a K-group contained in $SL_{n+1}(K)$ via the embedding:

$x \to \begin{pmatrix} x & 0 \\ 0 & (\det x)^{-1} \end{pmatrix}, \quad x \in GL_n(K)$

(2.2.8)

Note that $SL_n$, $O_n$, $SO_n$, where the last two denote the group of (n × n) orthogonal matrices and orthogonal matrices with determinant 1, are contained in $GL_n(K)$ and hence are K-groups.

Definition 2.2.15: An algebraic group G over an algebraically closed field K is an algebraic variety over K that has a group structure such that the product map G × G → G and the inversion map G → G are morphisms of algebraic varieties.

The above definition shows the relation of algebraic groups with algebraic varieties. It should be noted that being an algebraic variety in the case of linear algebraic groups implies that 'G be an affine variety.' For more literature on the topic, see [21], [26], [31].

Definition 2.2.16: A group G is called unipotent if all its elements are unipotent. An element g ∈ G is unipotent if all its eigenvalues are 1. These unipotent groups play an important role in the Jordan decomposition of algebraic groups (see for instance [3] and Chapter 0 of [9]).
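The embedding (2.2.8) can be tried out on small matrices. The sketch below works over the rational numbers with exact `Fraction` arithmetic purely for illustration (the rationals are not algebraically closed, so this only demonstrates the algebra of the map, not the setting of Definition 2.2.14); the helper names `det`, `matmul` and `embed` are hypothetical.

```python
from fractions import Fraction

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def embed(x):
    """Send an invertible n x n matrix x to the block matrix diag(x, 1/det x)."""
    n = len(x)
    d = Fraction(det(x))
    if d == 0:
        raise ValueError("matrix is singular, so it is not in GL_n")
    rows = [[Fraction(v) for v in row] + [Fraction(0)] for row in x]
    rows.append([Fraction(0)] * n + [1 / d])
    return rows

if __name__ == "__main__":
    x = [[2, 1], [3, 4]]      # det x = 5
    y = [[1, 2], [0, 3]]      # det y = 3
    print(det(embed(x)))                                   # always 1
    print(embed(matmul(x, y)) == matmul(embed(x), embed(y)))
```

The image has determinant 1 by construction, and the map respects products because the determinant is multiplicative; the image is cut out of $SL_{n+1}$ by the polynomial conditions that the off-diagonal blocks vanish.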

Exercise 2.2
1. Show that if $H$ is an open subgroup of a topological group $G$, then $H$ is closed in $G$.
2. Show that if $H$ is an open normal subgroup of the topological group $G$ which is locally connected, then $G/H$ is discrete.
3. Fill in the lines to prove that $T^1$ is topologically isomorphic to $\mathbb{R}/\mathbb{Z}$.

${}^{6}$ We are using the integer $n$ as a subscript in $GL_n(K)$, which differs from the previous notation $GL(n, \mathbb{R})$. This is done to emphasize the algebraically closed field $K$.


4. Show that the centre $C = \{x \in G \mid xa = ax \text{ for all } a \in G\}$ of $G$ is a normal subgroup of $G$. It is denoted $Z(G)$.
5. Show that the complex linear transformation group $GL(2, \mathbb{C})$ is an eight-parameter group. Show further that the group $SU(2)$ is compact, and that it is a 3-parameter group.
6. Show that the elements of a group $G$ of one-dimensional coordinate transformations ($x \to x + \epsilon$) that leave the wave function invariant can be expressed as $U = \exp(-i\epsilon p_x)$, where $p_x = \frac{1}{i}\frac{d}{dx}$.
7. Show that the group $SU(2)$ can be viewed as an isospin invariant group for the doublet $q$ consisting of the quarks $u$ ($I_3 = \frac{1}{2}$) and $d$ ($I_3 = -\frac{1}{2}$); the expressions in brackets stand for their isospin (see Subsec. 6.5.2 and Table 5).
8. Let $M$ be a four-dimensional real vector space equipped with a space-time bilinear form of signature $(+, -, -, -)$. The linear transformations which preserve the bilinear form define the group $O(M)$, which is isomorphic to $O(1, 3)$. Show that $SO_0(M)$ (the elements of $O(M)$ with determinant 1) is a connected component of $O(M)$. Show further that the group operation in the semi-direct product $M \rtimes SO_0(M)$ is defined as
$$(x, A) \cdot (y, B) = (x + A \cdot y,\ A \cdot B). \qquad \text{(i)}$$
The above group is known as the Poincaré group or inhomogeneous Lorentz group (of the Special Theory of Relativity).
9. The Galilean group (the group of transformations in non-relativistic mechanics) is isomorphic to $\mathbb{R}^4 \rtimes (\mathbb{R}^3 \rtimes SO(3))$. Define the group action of $\mathbb{R}^3 \rtimes SO(3)$ on $\mathbb{R}^4$.
10. Let $\epsilon = i\sigma_2 = i\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$ define the Levi-Civita symbol, and let $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ be the other two Pauli matrices and $\sigma_0$ be […]. Then show that the set […] $= \{\pm\epsilon, \pm\sigma_0, \pm\sigma_1,$ […]
11. […] where $\epsilon$ is $i\sigma_2$ (the Levi-Civita symbol). Show further that the generators $A_1$ and $A_2$ pertaining to infinitesimal rotations about the $x$ and $y$ axes are […] and […], and that the commutation relation satisfied by them is (b) $[A_i, A_j] = -\epsilon_{ijk} A_k$.
12. Establish that $O(r, n - r)$, $SO(r, n - r)$, $U(r, n - r)$, $SU(r, n - r)$ and $Sp(n, \mathbb{R})$ are all Lie groups by showing that they are Lie subgroups of $GL(n, \mathbb{R})$, etc.


Hints to Exercise 2.2
1. Since $H$ is open, each coset $aH$ ($a \in G$ but not in $H$) is open in $G$, and thus $K = \bigcup (aH)$ is open in $G$; hence the complement of $K$, which is $H$, is closed.
2. Use Exercise 1 and Definitions (2.2.4) and (2.2.5).
3. $\mathbb{R}$ is an additive group with metric topology, and $\mathbb{Z}$, the additive subgroup of integers, is closed in $\mathbb{R}$. Thus the quotient group $\mathbb{R}/\mathbb{Z}$ is a topological group. The isomorphism between $\mathbb{R}/\mathbb{Z}$ and $T^1$ is given by the map
$$\phi : \mathbb{R} \to T^1 = S^1 : x \mapsto e^{2\pi i x}. \qquad \text{(i)}$$
The map $\phi$ is a continuous epimorphism with kernel $\mathbb{Z}$.
6. Consider $G$ as a group that leaves the wave function $\psi(x)$ invariant, i.e., $\psi'(x') = \psi(x(x')) = \psi(x' - \epsilon)$; let $U \in G$, then $\psi'(x) = U\psi(x) = \psi(x - \epsilon)$. If $\epsilon$ is small,
$$\psi(x - \epsilon) = \psi(x) - \epsilon\frac{d\psi}{dx} + O(\epsilon^2) = \psi(x) - i\epsilon\, p_x \psi. \qquad \text{(i)}$$
Hence $U = \exp(-i\epsilon p_x)$.
7. Let $a \in SU(2)$ be written as $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ (see Eqn (2.2.1)), and let
$$q' = \begin{pmatrix} u' \\ d' \end{pmatrix} = aq = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} u \\ d \end{pmatrix}.$$
Compute $\langle q'|q'\rangle$. Since $a a^{\dagger} = a^{\dagger} a = I$, $\langle q'|q'\rangle = \langle q|q\rangle$ (see App. 9A).
8. Find the identity element of the group (which is $(0, e)$) and the inverse of a given element (e.g., $(x, A)^{-1} = (-A^{-1}x, A^{-1})$), and establish that (i) satisfies associativity; these three taken together would show that the given operation is a group operation.
9. Let $(a, A)$ be an arbitrary member of $\mathbb{R}^3 \rtimes SO(3)$. Thus $(a, A)$ maps $(x, t) \in \mathbb{R}^3 \times \mathbb{R} = \mathbb{R}^4$ to the element
$$(A \cdot x + ta,\ t) \in \mathbb{R}^3 \times \mathbb{R} = \mathbb{R}^4. \qquad \text{(i)}$$
The action (i) is well defined. For instance, another element $(\beta, B) \in \mathbb{R}^3 \rtimes SO(3)$ maps $(A \cdot x + ta, t)$ to $(BAx + t(Ba + \beta), t) \in \mathbb{R}^3 \times \mathbb{R} = \mathbb{R}^4$. (See Sec. 33 in 3.[11] on Galilei transformations.)
12. Let $S(n, \mathbb{R})$ denote the group of $(n \times n)$ symmetric matrices. Consider the mapping $f : GL(n, \mathbb{R}) \to S(n, \mathbb{R})$ defined as (see Eq. (2.2.4)):
$$f(A) = A^{t} g_r A. \qquad \text{(i)}$$
The map $f$ is continuous and therefore $f^{-1}(\{g_r\})$ is closed. Thus $O(r, n - r) = f^{-1}(\{g_r\})$ is a closed subgroup of $GL(n, \mathbb{R})$. From (iii) of Def. (2.2.4), and in view of Ftn. 3, it also follows that it is a Lie subgroup of $GL(n, \mathbb{R})$. Let $S(n, \mathbb{C})$ denote the group of $n \times n$ Hermitian matrices and let $f : GL(n, \mathbb{C}) \to S(n, \mathbb{C})$ be the map defined as (see Eq. (2.2.6)):
$$f(A) = A^{\dagger} g_r A. \qquad \text{(ii)}$$
The map $f$ is continuous and therefore $f^{-1}(\{g_r\})$ is closed. Accordingly $U(r, n - r)$ is a closed subgroup of $GL(n, \mathbb{C})$ and hence is a Lie subgroup. Similarly, let $A(2n, \mathbb{R})$ denote the group of $(2n \times 2n)$ skew-symmetric matrices and let $f : GL(2n, \mathbb{R}) \to A(2n, \mathbb{R})$ be the map defined as (see Eq. (2.2.7)):
$$f(A) = A^{t} \omega A. \qquad \text{(iii)}$$
The map $f$ is continuous and hence $f^{-1}(\{\omega\})$ is closed. As a result $Sp(n, \mathbb{R}) = f^{-1}(\{\omega\})$ is a closed subgroup of $GL(2n, \mathbb{R})$, and thus it is a Lie subgroup of $GL(2n, \mathbb{R})$. (See Chap. 0 for continuous mapping and closed set.)
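The three checks suggested in Hint 8 (identity, inverse, associativity) can be tried out numerically on the composition rule $(x, A) \cdot (y, B) = (x + A \cdot y, A \cdot B)$. The sketch below is a toy under stated assumptions: instead of Lorentz transformations it uses quarter-turn rotations of the plane, whose integer entries keep the arithmetic exact, so it illustrates the semi-direct product law for $\mathbb{R}^2 \rtimes SO(2)$ rather than the Poincaré group itself. All function names are ours.

```python
def mat_vec(a, v):
    """Apply the matrix a to the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def compose(g, h):
    """Semi-direct product law: (x, A) . (y, B) = (x + A.y, A.B)."""
    (x, a), (y, b) = g, h
    ay = mat_vec(a, y)
    return ([xi + yi for xi, yi in zip(x, ay)], mat_mul(a, b))

def inverse(g, a_inv):
    """(x, A)^(-1) = (-A^(-1) x, A^(-1)); the caller supplies A^(-1)."""
    x, _ = g
    return ([-c for c in mat_vec(a_inv, x)], a_inv)

IDENTITY = [[1, 0], [0, 1]]
ROT90 = [[0, -1], [1, 0]]          # rotation by a quarter turn
ROT90_INV = [[0, 1], [-1, 0]]      # its inverse
ROT180 = mat_mul(ROT90, ROT90)

e = ([0, 0], IDENTITY)             # identity element (0, e)
g = ([1, 2], ROT90)
h = ([3, -1], ROT180)
k = ([0, 5], IDENTITY)

is_associative = compose(compose(g, h), k) == compose(g, compose(h, k))
has_identity = compose(e, g) == g and compose(g, e) == g
has_inverse = compose(g, inverse(g, ROT90_INV)) == e
```

The same `compose` function works unchanged for any matrix group acting linearly on a vector space; only the sample matrices are specific to this toy.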

3 BASICS OF GROUP REPRESENTATION

In layman's language, a representation of a group $G$ is its realization as a group of transformations of some set $X$ with a structure which is required to be preserved under these transformations. For instance: (i) if $X$ is a linear space over a field $K$ and $GL(X)$ denotes the group of invertible linear transformations, then a homomorphism of $G$ into the group $GL(X)$ is called the linear representation of $G$ with respect to $X$ over the field $K$ (if $K = \mathbb{R}$ and $X$ has a basis $e_1, e_2, \ldots, e_n$, then $GL(X) = GL(n, \mathbb{R})$). (ii) If $X$ is a Hilbert space (usually denoted as $\mathcal{H}$) and $U(X)$ is the group of unitary operators on $X$, a homomorphism from $G$ to $U(X)$ is a unitary representation of $G$ over $X$ (see Chapter 3 for operators). More formally we have:

Definition 2.3.1: A representation of a given group $G$ is a homomorphic mapping $G \to GL(X)$, where the elements of $GL(X)$ are linear operators or matrices on a space $X$. $X$ is called the representation space and its dimension is known as the dimension or degree of the representation.

For topological groups, a representation can be defined as follows:

Definition 2.3.2: Let $E$ be a complex Banach space and $GL(E)$ (defined above) be the group of continuous transformations. A representation $\pi$ of a compact group $G$ in $E$ is a homomorphism $\pi : G \to GL(E)$ such that all maps $G \to E$, $g \mapsto \pi(g)x$ ($x \in E$) are continuous. The space $E$ is called the representation space of $G$ with reference to $\pi$ and is therefore denoted $E_\pi$. When $E = \mathcal{H}$, each $\pi(g)$ is a unitary operator in view of (ii) as given above (i.e., $(\pi(g))^{\dagger} = (\pi(g))^{-1} = \pi(g^{-1})\ \forall\, g \in G$) and the representation of $G$ is unitary. When $G$ is a Lie group, the mappings defined from $G$ into the space of representations are required to be $C^\infty$ (see Kirillov in [19] and Ref. [13] for an elementary description).

It is easy to note that a group can have more than one representation. It is therefore natural to ask what relationship, if any, exists among these representations.
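As a small illustration of Definition 2.3.1, the sketch below builds the permutation matrices of the symmetric group $S_3$ acting on $\mathbb{R}^3$ and checks the homomorphism property $R(pq) = R(p)R(q)$ for every pair of elements. The choice of $S_3$ and the helper names `perm_matrix`, `compose` and `mat_mul` are assumptions made for the example; they do not come from the text.

```python
from itertools import permutations

def perm_matrix(p):
    """Matrix of the linear map sending the basis vector e_j to e_{p[j]}."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def compose(p, q):
    """Group law in S_n: (p o q)(j) = p[q[j]]."""
    return tuple(p[j] for j in q)

def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

group = list(permutations(range(3)))      # the six elements of S_3

# Homomorphism property R(pq) = R(p) R(q), checked for every pair.
is_homomorphism = all(
    perm_matrix(compose(p, q)) == mat_mul(perm_matrix(p), perm_matrix(q))
    for p in group for q in group
)

degree = len(perm_matrix(group[0]))       # dimension of the representation space
```

Here the representation space is $X = \mathbb{R}^3$, so the degree of the representation is 3.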

3.1 Relation Between Two Representations

Definition 2.3.3: Let $R_1$ and $R_2$ be two representations of $G$ in the spaces $X_1$ and $X_2$. The operators $A : X_1 \to X_2$ that commute with $R_1$ and $R_2$:

$$A R_1(g) = R_2(g) A \qquad (2.3.1)$$

are called intertwining operators.${}^{7}$ The space of these operators is denoted $\mathrm{Hom}_G(R_1, R_2) = C(R_1, R_2)$ and its dimension, denoted $c(R_1, R_2)$, is called the intertwining number of $R_1$ and $R_2$. If the operator $A$ is invertible, $R_1$ and $R_2$ are called equivalent. If $R_1 = R_2 = R$ and $X_1 = X_2 = X$, the notations given above are simply $\mathrm{End}_G(X) = C(R)$ and $c(R)$ respectively.

${}^{7}$ If $G$ is a topological group, $A$ will be a bounded linear operator.


Given a representation $R$ of a group $G$ with respect to a space $X$, suppose that $X$ admits a subspace $X_1$ which is $R(G)$-invariant; then $R$ restricted to $X_1$ defines the so-called subrepresentation $R_1$, and the natural representation that follows on the quotient space $X/X_1$ is called the quotient representation. In case the subspace $X_2$, the complement of $X_1$ in $X$, is also $R(G)$-invariant, another subrepresentation $R_2$ is defined and $R = R_1 + R_2$ is said to be completely reducible. A representation that does not allow a non-trivial "subrepresentation" is called irreducible. It is called algebraically irreducible when $G$ is an algebraic group and topologically irreducible when $G$ is a topological group.${}^{8}$ For a completely reducible representation $R$ of degree $n$, it follows that every matrix $D(g)$ that corresponds to $g \in G$ can always be reduced to a block diagonal form using an $(n \times n)$ nonsingular matrix $M$:

$$M D(g) M^{-1} = \begin{pmatrix} D_1(g) & & & 0 \\ & D_2(g) & & \\ & & \ddots & \\ 0 & & & D_k(g) \end{pmatrix}$$

The representation is then $R = \sum_{i=1}^{k} R_i$, where the diagonal block $D_i(g)$ corresponds to $R_i$. Furthermore, if $D_i(g)$ is an $n_i \times n_i$ matrix, then $n = \sum_{i=1}^{k} n_i$.
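The reduction to block diagonal form can be seen on a concrete case. The sketch below is an illustration under our own assumptions, not an example from the text: it takes the permutation representation of $S_3$ on $\mathbb{Q}^3$, which leaves invariant both the line spanned by $(1, 1, 1)$ and the plane $x_0 + x_1 + x_2 = 0$, and checks that a single matrix $M$ brings every $D(g)$ to the form of a $1 \times 1$ block followed by a $2 \times 2$ block. The matrix $M$, its inverse and all helper names are chosen by us.

```python
from fractions import Fraction
from itertools import permutations

def perm_matrix(p):
    """Matrix D(g) of the permutation p: basis vector e_j goes to e_{p[j]}."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Change of basis; the columns of M_INV form a basis adapted to
# X_1 = span{(1, 1, 1)} and X_2 = {x : x_0 + x_1 + x_2 = 0}.
M = [[1, 1, 1], [1, -1, 0], [0, 1, -1]]
THIRD = Fraction(1, 3)
M_INV = [[THIRD * c for c in row]
         for row in ([1, 2, 1], [1, -1, 1], [1, -1, -2])]

def reduced(p):
    """M D(g) M^(-1) for the permutation p."""
    return mat_mul(mat_mul(M, perm_matrix(p)), M_INV)

def is_block_diagonal(b):
    """True if b = diag(1, 2x2 block), i.e. the off-diagonal blocks vanish."""
    return b[0][0] == 1 and b[0][1] == b[0][2] == 0 and b[1][0] == b[2][0] == 0

all_reduced = all(is_block_diagonal(reduced(p)) for p in permutations(range(3)))
```

In the notation of the text this gives $k = 2$ blocks with $n_1 = 1$ and $n_2 = 2$, so that $n = n_1 + n_2 = 3$.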

Definition 2.3.4: Every compact, connected Lie group $G$ has a "God given" representation defined by the so-called Adjoint action on the tangent space $T_e(G)$ at $e$.${}^{9}$ This action is given by $X \mapsto gXg^{-1}$ and is written as $Ad(g)X$, where $X \in T_e(G)$ and $g \in G$. This is the canonical finite-dimensional representation of the Lie group $G$ on $T_e(G)$. It can be verified that the derivative of $Ad$, denoted $ad : T_e(G) \to \mathrm{End}(T_e G)$ and written for $X, Y \in T_e(G)$ as $ad(X)(Y)$, equals the Lie bracket $[X, Y]$ (see Exercise 4).

3.2 Tensor Product of Representations

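Definition 2.3.5: Let $\mathcal{H}_1 \otimes \mathcal{H}_2$ denote the algebraic tensor product of complex Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$.${}^{10}$ Let $(\pi_1, \mathcal{H}_1)$ and $(\pi_2, \mathcal{H}_2)$ be two representations of $G$. Denote by $\mathcal{H}_1 \hat\otimes \mathcal{H}_2$ the space of all Hilbert-Schmidt operators $A : \mathcal{H}'_2 \to \mathcal{H}_1$, $\mathcal{H}'_2$ being the $\mathbb{C}$-linear dual of $\mathcal{H}_2$. By definition $A$ is Hilbert-Schmidt if${}^{11}$

$$\|A\|_2^2 = \sum_{i=1}^{\infty} \|A\xi_i\|^2 < \infty \qquad (2.3.2)$$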

${}^{8}$ We shall simply use the word irreducible, since the adjective will be clear from the context.
${}^{9}$ Note that for any $g$ in the Lie group $G$, conjugation by $g$ (Def. (2.1.3)) defines an automorphism $g(X)$.
${}^{10}$ In the real world we come across many examples of such tensor products. For example, two particles in $\mathbb{R}^3$ are associated with the tensor product $L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3) = L^2(\mathbb{R}^6)$ for their coordinate labelling.
${}^{11}$ See Sec. (0.2) and Chapter 11 of Ref. 3.[3].


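for an arbitrary orthonormal basis $\{\xi_i\}$ of $\mathcal{H}'_2$. This in turn defines an inner product

$$(A, B) = \mathrm{tr}(B^{*}A) = \sum_{i=1}^{\infty} \langle A\xi_i, B\xi_i \rangle$$

for $A, B \in \mathcal{H}_1 \hat\otimes \mathcal{H}_2$. The space $\mathcal{H}_1 \hat\otimes \mathcal{H}_2$ equipped with this inner product is a Hilbert space (see Sec. (0.2)), and is denoted as $\mathrm{Hom}_2(\mathcal{H}'_2, \mathcal{H}_1)$. The tensor product representation $\pi_1 \otimes \pi_2$ can be defined as an algebraic action on $\mathcal{H}_1 \otimes \mathcal{H}_2$ by setting:

$$(\pi_1 \otimes \pi_2)_g (\xi \otimes \eta) = \pi_1(g)\xi \otimes \pi_2(g)\eta \qquad (2.3.3)$$

where $g \in G$, $\xi \in \mathcal{H}_1$, and $\eta \in \mathcal{H}_2$. This extends to a unitary representation on $\mathcal{H}_1 \hat\otimes \mathcal{H}_2 = \mathrm{Hom}_2(\mathcal{H}'_2, \mathcal{H}_1)$ through the equality $(\pi_1 \hat\otimes \pi_2)_g A = \pi_1(g) \circ A \circ \pi'_2(g^{-1})$, $g \in G$ and $A \in \mathcal{H}_1 \hat\otimes \mathcal{H}_2$. Here $\pi'_2$ is the representation of $G$ with respect to $\mathcal{H}'_2$.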
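In finite dimensions the rule (2.3.3) is realized by the Kronecker product of matrices. The sketch below is only a finite-dimensional stand-in for the Hilbert-space construction: the spaces are $\mathbb{R}^2$, the matrices are arbitrary integer matrices rather than unitary operators (the algebraic identities being checked do not need unitarity), and the names `kron` and `kron_vec` are ours.

```python
def mat_vec(a, v):
    """Apply the matrix a to the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product: entry ((i, k), (j, l)) equals a[i][j] * b[k][l]."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def kron_vec(u, v):
    """Coordinates of the elementary tensor u (x) v in the product basis."""
    return [ui * vk for ui in u for vk in v]

# Sample operators and vectors (assumed values, for illustration only).
A1, B1 = [[0, -1], [1, 0]], [[1, 1], [0, 1]]
A2, B2 = [[1, 2], [0, 1]], [[2, 1], [1, 1]]
xi, eta = [1, 2], [3, -1]

# (A (x) B)(xi (x) eta) = (A xi) (x) (B eta), the matrix form of (2.3.3).
acts_factorwise = (mat_vec(kron(A1, B1), kron_vec(xi, eta))
                   == kron_vec(mat_vec(A1, xi), mat_vec(B1, eta)))

# (A1 A2) (x) (B1 B2) = (A1 (x) B1)(A2 (x) B2): products of operators are respected.
is_multiplicative = (kron(mat_mul(A1, A2), mat_mul(B1, B2))
                     == mat_mul(kron(A1, B1), kron(A2, B2)))
```

The second identity is what makes $g \mapsto \pi_1(g) \otimes \pi_2(g)$ a homomorphism whenever $\pi_1$ and $\pi_2$ are.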

Exercise 2.3
1. Let $(\pi, \mathcal{H})$ be a unitary representation of a topological group $G$ and let $\mathcal{H}_1 \subset \mathcal{H}$ be a closed $\pi(G)$-invariant subspace that defines the pair $(\pi_1, \mathcal{H}_1)$. Show that the orthogonal complement $(\mathcal{H}_1)^{\perp} = \mathcal{H}_2$ is also a $\pi(G)$-invariant subspace of $\mathcal{H}$ and hence $\pi = \pi_1 + \pi_2$.
2. Show that every unitary representation $(\pi, \mathcal{H}_\pi)$ of a group $G$ uniquely defines another representation $(\pi', \mathcal{H}'_\pi)$, where $\mathcal{H}'_\pi$ is the $\mathbb{C}$-linear dual of $\mathcal{H}_\pi$.
3. Let $R_1$ and $R_2$ be two irreducible representations of $G$; then show that any intertwining operator $A \in \mathrm{Hom}_G(R_1, R_2)$ is either zero or invertible (Schur's Lemma).
4. By taking $G$ to be a matrix group over the reals, show that the derivative $ad = d(Ad)$ satisfies $ad(X)(Y) = [X, Y]$ for $X, Y \in T_e(G)$.

4 SPECIFIC EXAMPLES OF GROUP REPRESENTATION

In this section we give just three examples of group representations, which relate respectively to quantum theory (in particular to current algebras), Yang-Mills theory and string theory. The following example illustrates that, beginning with a group of unitary operators and its representation space $\mathcal{H}$, another group can be defined whose representation space is $\mathcal{H}$. (This example may be viewed as a follow-up of the theory given in Subsec. 4.3 of Chap. 0, and also as an example for constructing unitary operators.)

Example 2.4.1: Let $G_0(\mathcal{H})$ denote the group of motions of the Hilbert space $\mathcal{H}$, i.e., the group of transformations of the type $x \mapsto Ax + b$ where $b \in \mathcal{H}$ and $A$ is a unitary operator (see Eq. (3.1.4)). The composition rule in $G_0(\mathcal{H})$ is defined as:

$$(A_1, b_1) \cdot (A_2, b_2) = (A_1 A_2,\ b_1 + A_1 b_2) \qquad (2.4.1)$$


since the pair $(A_2, b_2)$ carries $x$ to $A_2 x + b_2$ and the pair $(A_1, b_1)$ carries $A_2 x + b_2$ to $A_1(A_2 x + b_2) + b_1$. The central extension of $G_0(\mathcal{H})$ is another group $G(\mathcal{H})$ formed by triplets $(A, b, c)$, where $c$ belongs to the unit circle $S^1$. The composition rule in $G(\mathcal{H})$ is given by:

$$(A_1, b_1, c_1) \cdot (A_2, b_2, c_2) = (A_1 A_2,\ b_1 + A_1 b_2,\ c_1 c_2 \exp(i\,\mathrm{Im}(b_1, A_1 b_2))). \qquad (2.4.2)$$

The group $G(\mathcal{H})$ has an irreducible unitary representation in the space $\mathrm{Exp}\,\mathcal{H}$ which is given by:

$$(A, b, c) \mapsto U_{A,b,c} \qquad (2.4.3)$$

where

$$U_{A,b,c}(\mathrm{Exp}\, v) = c \exp\left(-\frac{\|b\|^2}{2} - (Av, b)\right) \mathrm{Exp}(Av + b) \qquad (2.4.4)$$
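That the rule (2.4.2) really defines a group can be probed numerically. The sketch below is a toy under explicit assumptions: the Hilbert space is taken to be one-dimensional, $\mathcal{H} = \mathbb{C}$, so a unitary operator $A$ is a complex number of modulus 1, and the inner product is assumed to be $(u, v) = u\bar{v}$. The names `inner`, `compose` and `close` are ours. The check relies on $(A_1 b_2, A_1 A_2 b_3) = (b_2, A_2 b_3)$, i.e. on $A_1$ being unitary.

```python
import cmath

def inner(u, v):
    """Inner product on the one-dimensional Hilbert space C (assumed linear in u)."""
    return u * v.conjugate()

def compose(g, h):
    """(A1, b1, c1) . (A2, b2, c2) following the rule (2.4.2)."""
    a1, b1, c1 = g
    a2, b2, c2 = h
    phase = cmath.exp(1j * inner(b1, a1 * b2).imag)
    return (a1 * a2, b1 + a1 * b2, c1 * c2 * phase)

def close(g, h, tol=1e-12):
    """Componentwise comparison of two triplets up to rounding error."""
    return all(abs(x - y) < tol for x, y in zip(g, h))

# Sample triplets (assumed values); every A and every c has modulus 1.
e = (1 + 0j, 0j, 1 + 0j)
g = (1j, 1 + 2j, 1 + 0j)
h = (-1 + 0j, 2 - 1j, 1j)
k = (-1j, 3j, -1 + 0j)

is_associative = close(compose(compose(g, h), k), compose(g, compose(h, k)))
has_identity = close(compose(e, g), g) and close(compose(g, e), g)
```

Dropping the third component of each triplet recovers the composition rule (2.4.1) of the group of motions $G_0(\mathcal{H})$.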

If $\mathcal{H}$ is the complexification of a real Hilbert space $\mathcal{H}_r$ and $A$ is a unitary real operator (i.e., an operator that leaves $\mathcal{H}_r$ invariant) with $b \in \mathcal{H}_r$, then the operators $\{U_{A,b,c}\}$ can be realized in $L^2(\Phi)$, the Gaussian model of $\mathrm{Exp}\,\mathcal{H}$ (see Chapter 0 and Gelfand et al. in [19]). To see this, consider the isomorphism $\mathrm{Exp}\,\mathcal{H}_r \to L^2(\Phi)$ defined as $\mathrm{Exp}\, v \mapsto$ […]; then in view of (0.4.9), $U_{A,b,c}$ becomes:

$$U_{A,b,c}\,\Phi(F) = c\, e^{i\langle F, b\rangle}\, \Phi(A^{*}F) \qquad (2.4.5)$$

where $A^{*}$ is the conjugate (see Def. (3.1.7)) of the operator $A_r = A|_{\mathcal{H}_r}$, […] and $\langle\ ,\ \rangle$ denotes an inner product in $\mathcal{H}$. The equation (2.4.4) thus simplifies to:

[…] (2.4.6)

Example 2.4.2: We have seen that the special unitary group $SU(n)$ is a Lie group for every $n$. It is formed by unitary matrices $U$: $U^{\dagger}U = UU^{\dagger} = I$, $\det U = 1$. Every unitary matrix $U$ can be written as $e^{iH}$ for some Hermitian matrix $H$. Using the property $\det U = 1$, it can be easily checked that $\mathrm{Tr}\, H = 0$.* In the set of $(n \times n)$ Hermitian matrices, there are $(n^2 - 1)$ linearly independent traceless Hermitian matrices. Accordingly, an element $U$ of $SU(n)$ can be expressed as:

$$U = \exp\left(i \sum_{r=1}^{n^2 - 1} \epsilon_r g_r\right) \qquad (2.4.7)$$

where $\epsilon_r$ are (real) group parameters and $g_r$ are the group generators${}^{12}$ represented by traceless Hermitian matrices. Of these $(n^2 - 1)$ traceless matrices, $(n - 1)$ can be diagonalised simultaneously. The number $(n - 1)$ is called the rank of $SU(n)$. When $n = 2$, we have the group $SU(2)$ of isospin invariance of Yang-Mills theory. Obviously the rank of this group is 1 and the parameters are 3 in number, hence (2.4.7) gives

$$U(\epsilon_1, \epsilon_2, \epsilon_3) = \exp\left(i \sum_{r=1}^{3} \epsilon_r g_r\right) \qquad (2.4.8)$$

* Express $e^{iH}$ as a series, which would terminate after a finite number of steps, evaluate its determinant, and you will note that the term involving $i$ is the trace of the matrix $H$.
${}^{12}$ A set $S \subset G$ (any arbitrary group) is said to be a system of generators of $G$ if the powers $S^n$ ($n = 0, 1, 2, \ldots$; $S^0 = e$) cover the entire group $G$ (see also Def. (2.1.1)).
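For $n = 2$ the exponential in (2.4.8) can be written in closed form when the generators $g_r$ are taken to be the Pauli matrices $\sigma_r$ (the standard choice for $SU(2)$): with $|\epsilon| = \sqrt{\epsilon_1^2 + \epsilon_2^2 + \epsilon_3^2}$ one has $\exp(i\sum_r \epsilon_r\sigma_r) = \cos|\epsilon|\, I + i\,\frac{\sin|\epsilon|}{|\epsilon|}\sum_r \epsilon_r\sigma_r$. The sketch below uses this well-known identity, rather than a series expansion, to build a sample element and confirm numerically that it is unitary with determinant 1. The function names and the sample parameters are ours.

```python
import math

def su2_element(e1, e2, e3):
    """exp(i (e1*s1 + e2*s2 + e3*s3)) for Pauli matrices s1, s2, s3, in closed form."""
    norm = math.sqrt(e1 * e1 + e2 * e2 + e3 * e3)
    if norm == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    c, s = math.cos(norm), math.sin(norm) / norm
    # e.sigma = [[e3, e1 - i*e2], [e1 + i*e2, -e3]]
    return [[c + 1j * s * e3, 1j * s * (e1 - 1j * e2)],
            [1j * s * (e1 + 1j * e2), c - 1j * s * e3]]

def dagger(u):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[u[j][i].conjugate() for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_identity(m, tol=1e-12):
    """True if m equals the 2x2 identity up to rounding error."""
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))

u = su2_element(0.3, -1.2, 0.5)            # sample group parameters (assumed values)
determinant = u[0][0] * u[1][1] - u[0][1] * u[1][0]

is_unitary = is_identity(mat_mul(dagger(u), u))
has_unit_determinant = abs(determinant - 1) < 1e-12
```

The three real parameters $\epsilon_1, \epsilon_2, \epsilon_3$ are exactly the three parameters of $SU(2)$ mentioned in the text.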


The generators $g_1, g_2, g_3$ are the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Note that if we were to use the generators $J_r = \frac{\sigma_r}{2}$ to represent $SU(2)$, we would have the commutation rule for these generators as:

$$[J_a, J_b] = i\epsilon_{abc} J_c \qquad (2.4.9)$$

where $\epsilon_{abc}$ is the anti-symmetric (Levi-Civita) tensor (see [21](b) for in-depth study).

From the previous section it is clear that an arbitrary group can have more than one inequivalent linear representation. Depending on the vector space one chooses, some of these representations are more useful than others. In the next example we illustrate the method of finding a representation of degree $(n + 1)$ for the special linear group $SL(2, \mathbb{C})$ of complex $2 \times 2$ matrices of determinant 1.

Example 2.4.3: Let $X$ denote the $(n + 1)$-dimensional complex vector space whose elements are homogeneous polynomials $P$ of degree $n$ in two variables $(z_1, z_2)$ with complex coefficients:

$$P(z_1, z_2) = \sum_{l=0}^{n} x_l\, z_1^{l} z_2^{n-l} \qquad (2.4.10)$$

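In order to show that $X$ is a representation space of $SL(2, \mathbb{C})$, we should establish a homomorphism:

$$A \mapsto R(A) \cdot P \qquad (2.4.11)$$

where $A \in SL(2, \mathbb{C})$ and $R(A) \cdot P$ is a transformed homogeneous polynomial in $X$. We write the pair $(z_1, z_2)$ as a column vector $z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}$ and let $A = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$, and then define $R(A) \cdot P(z)$ as a new homogeneous polynomial given by:

$$P(A^{-1}z) = P\left(\begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right) = \sum_{l=0}^{n} x_l\, (\delta z_1 - \beta z_2)^{l} (-\gamma z_1 + \alpha z_2)^{n-l} \qquad (2.4.12)$$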

Notationally, $P(A^{-1}z) = P_A(z) = P((\delta z_1 - \beta z_2), (-\gamma z_1 + \alpha z_2))$. Since $R(AB) = R(A)R(B)$, (2.4.11) and (2.4.12) lead to: $AB \mapsto R(A)R(B)P = P_{AB}(z)$. In order to bring this in line with physical theories that use this kind of group representation, we write $n = 2m$. Thus $m$ can be an integer or a half-integer, and we rewrite (2.4.10) as

$$P(z_1, z_2) = \sum_{k=-m}^{m} x_k\, z_1^{m-k} z_2^{m+k} \qquad (2.4.13)$$
These homogeneous polynomials can be reduced to non-homogeneous ones by writing (z,, z2) as (z, 1), where z = Z\/z2 with z2 * 0. Thus we have,

Elements of Group Theory and Group Representations

P(z,l)=

X xkzr* = Q{z)

85

(2.4.14)

k ——m

The above process is evidently reversible since $P(z_1, z_2)$ and $Q(z)$ are related by the formula:

$$P(z_1, z_2) = z_2^{2m}\, Q\left(\frac{z_1}{z_2}\right) \quad (z_2 \neq 0) \qquad (2.4.15)$$

Let $X'_m$ denote the space of all polynomials $Q$ of degree at most $2m$, and let $R'_m$ be the corresponding representation of $SL(2, \mathbb{C})$ in $X'_m$; then $A \mapsto R'_m(A) \cdot Q = Q_A$ is given by:

$$Q_A(z) = (-\gamma z + \alpha)^{2m}\, Q\left(\frac{\delta z - \beta}{-\gamma z + \alpha}\right) \qquad (2.4.16)$$
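The transformation rule (2.4.12) can be turned into explicit matrices by expanding the transformed polynomial in the monomial basis $z_1^{l} z_2^{n-l}$. The sketch below is a minimal illustration under our own conventions: a homogeneous polynomial is stored as the list of its coefficients $x_0, \ldots, x_n$, the sample matrices are integer matrices of determinant 1 (so all arithmetic is exact), and the helper names `poly_mul`, `poly_pow`, `act` and `rep_matrix` are ours. It checks the homomorphism property $R(AB) = R(A)R(B)$ for $n = 2$.

```python
def poly_mul(p, q):
    """Multiply homogeneous polynomials; p[l] is the coefficient of z1^l z2^(deg - l)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_pow(p, e):
    """e-th power of a homogeneous polynomial (the empty product is the constant 1)."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out

def act(a, x):
    """Coefficients of R(A).P for P = sum_l x[l] z1^l z2^(n-l), following (2.4.12)."""
    (al, be), (ga, de) = a            # A = [[alpha, beta], [gamma, delta]], det A = 1
    n = len(x) - 1
    u = [-be, de]                     # the linear form delta*z1 - beta*z2
    v = [al, -ga]                     # the linear form -gamma*z1 + alpha*z2
    out = [0] * (n + 1)
    for l, xl in enumerate(x):
        term = poly_mul(poly_pow(u, l), poly_pow(v, n - l))
        out = [o + xl * t for o, t in zip(out, term)]
    return out

def rep_matrix(a, n):
    """Matrix of R(A) in the monomial basis; column l is the image of z1^l z2^(n-l)."""
    cols = [act(a, [1 if i == l else 0 for i in range(n + 1)]) for l in range(n + 1)]
    return [[cols[l][i] for l in range(n + 1)] for i in range(n + 1)]

def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 1], [0, 1]]                  # sample elements of SL(2, Z), a subgroup of SL(2, C)
B = [[1, 0], [1, 1]]
n = 2                                 # degree of the polynomials, i.e. m = 1

is_homomorphism = rep_matrix(mat_mul(A, B), n) == mat_mul(rep_matrix(A, n), rep_matrix(B, n))
```

For $n = 2$ the matrices produced are $3 \times 3$, in agreement with the degree $(n + 1)$ of the representation.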

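From (2.4.14) it follows that the monomials $\{z^{m-k}\}$ for $-m$ […]

$$P(z_1, \ldots, z_{2r}) = \sum_{k_1, \ldots, k_{2r} = 0} x_{k_1 \ldots k_{2r}}\, z_1^{k_1} \cdots z_{2r}^{k_{2r}}$$

where the indices $k_i$ must satisfy $\sum_i k_i = n$. We can then write the column vector $\begin{pmatrix} z_1 \\ \vdots \\ z_{2r} \end{pmatrix}$ as $z$, treat $P(z)$ as a function in $\mathbb{C}^{2r}$, and for $A \in SL(2r, \mathbb{C})$ use the transformation rule:

$$R_n(A) \cdot P = P_A = P_A(z) = P(A^{-1} \cdot z) = P\left(\sum_{i=1}^{2r} \alpha_i z_i, \ldots, \sum_{i=1}^{2r} \nu_i z_i\right)$$

where the matrix

$$A^{-1} = \begin{pmatrix} \alpha_1 & \ldots & \alpha_{2r} \\ \vdots & & \vdots \\ \nu_1 & \ldots & \nu_{2r} \end{pmatrix}$$

Exercise 2.4
1. Verify the Lie bracket relation (2.4.9) for $J_r = \frac{\sigma_r}{2}$, where

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Hints to Exercise 2.4
1. To prove $[J_a, J_b] = i\epsilon_{abc} J_c$, we write the LHS for $a = 1$ and $b = 2$. Thus we have:

$$\left[\frac{\sigma_1}{2}, \frac{\sigma_2}{2}\right] = \frac{1}{4}[\sigma_1\sigma_2 - \sigma_2\sigma_1] = \frac{1}{4}[2i\sigma_3] = i\epsilon_{123} J_3.$$

Note that to obtain a non-zero result on the RHS, the subscript $c$ in $\epsilon_{abc}$ must be 3; otherwise $\epsilon_{abc}$ will be zero.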
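The hint treats only the case $a = 1$, $b = 2$; the remaining cases of (2.4.9) can be checked mechanically. The following sketch does so for all nine index pairs, using the Pauli matrices listed in the exercise. Indices run over 0, 1, 2 instead of 1, 2, 3, and the helper names (`commutator`, `levi_civita`, `rhs`) are ours.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Generators J_r = sigma_r / 2.
J = [[[entry / 2 for entry in row] for row in s] for s in SIGMA]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def rhs(a, b):
    """Right-hand side of (2.4.9): i * sum_c epsilon_abc J_c."""
    return [[1j * sum(levi_civita(a, b, c) * J[c][i][j] for c in range(3))
             for j in range(2)] for i in range(2)]

relation_holds = all(
    abs(commutator(J[a], J[b])[i][j] - rhs(a, b)[i][j]) < 1e-12
    for a in range(3) for b in range(3) for i in range(2) for j in range(2)
)
```

For $a = b$ both sides vanish, which is the antisymmetry of the bracket.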

5 THE THEORY OF BUNDLES AND RELATED OBJECTS

In this section we use our knowledge of groups to learn about another important topic, bundles, which originated in mathematics around 1950 as a generalization of topological products. Very soon groups, in particular Lie groups, were introduced in this generalization scheme, which led to an important mathematical structure, namely principal fiber bundles. These, as we shall soon see, turned out to be an effective tool, via their offshoot the theory of connections, in the description of other mathematical and physical theories. The notion of connection has been described in the literature in more than one way, for instance (i) by using the classical concept of parallel translation; (ii) by following the Ehresmann-Koszul construction of principal fiber bundles, more specifically by considering the smooth splitting of tangent spaces into horizontal and vertical subspaces; and (iii) by treating the cotangent bundles as the main ingredients in the description. To draw a distinction among these approaches, we have divided this section into two parts: A and B. Part A deals with introductory definitions and with the theory of connections based on (ii). The description of connections based on (iii), together with their reference in physics, is the subject matter of Part B. Other important objects, e.g., associated bundles (of principal fiber bundles), the universal connection and the metric on a bundle, are also given in this part. Due to our limited scope, we describe the theory in brief and refer the reader to other advanced and specialized texts on this subject: 1.[10], [30], [18], and [14].

Part A

5.1 Fiber Bundles, Bundle Morphisms

Definition 2.5.1: Given two topological spaces $E$ and $X$ and a continuous surjective mapping $\pi : E \to X$, the triple $(E, X, \pi)$ is called a bundle. The space $X$ is called the base.

Example 2.5.2: The triple $(X_1 \times X_2, X_1, \pi_1)$, where $E$ is the Cartesian product $X_1 \times X_2$ and $\pi$ is the projection $\pi_1$ on the first factor, is the Cartesian bundle. In particular, the triple $(S^1 \times I, S^1, \pi_1)$, which represents a cylinder formed (as though) by gluing the two ends of a piece of paper, is a Cartesian product bundle of the circle $S^1$ with the line segment $I$.

[Figure: Cartesian product bundle of a circle and a line.]

The inverse image $\pi^{-1}(x)$ for every $x \in X$ is assumed to be homeomorphic to a given space $F$; it is called the fiber at $x$ and is denoted as $F_x$ or sometimes as $E_x$. The space $F$ is called the typical fiber. An additional structure in the bundle can be provided by incorporating a group of homeomorphisms of $F$ and a covering of $X$ by open sets; we then have the following definition:

Definition 2.5.3: A bundle $(E, X, \pi)$ is a fiber bundle $(E, X, \pi, G)$ if it is equipped with a typical fiber $F$, a topological group $G$ of homeomorphisms of $F$ (onto itself) called the structural group, and a covering of $X$ by a family of open sets $\{V_j, j \in J \subset \mathbb{N}\}$ such that:
(a) Locally it is a trivial bundle, i.e., for every $j \in J$, the topological space $\pi^{-1}(V_j)$ is homeomorphic to the topological product $V_j \times F$. This homeomorphism, denoted $\phi_j$, from $\pi^{-1}(V_j)$ to $V_j \times F$ can be written as:
[…] (2.5.1)

and it satisfies the following diagram, in which the projection of $V_j \times F$ onto $V_j$ composed with $\phi_j$ equals $\pi$:

[Figure: Homeomorphic map.]

(b) […]

$$\hat\phi_{k,x} \circ \hat\phi_{j,x}^{-1} : F \to F \qquad (2.5.2)$$

[Figure: Composite mapping $\hat\phi_{k,x} \circ \hat\phi_{j,x}^{-1} : F \to F$.]

Evidently if $G$ has only one element, the bundle is trivial.

(c) The composite mapping $\hat\phi_{k,x} \circ \hat\phi_{j,x}^{-1}$ induces a continuous mapping $g_{jk} : V_j \cap V_k \to G$ as follows:

$$g_{jk}(x) = \hat\phi_{k,x} \circ \hat\phi_{j,x}^{-1} \qquad (2.5.3)(a)$$

The elements $g_{jk}(x) = \hat\phi_{k,x} \circ \hat\phi_{j,x}^{-1}$, which belong to the group of homeomorphisms $G$, are also called the transition functions of local representations. If $x \in V_i \cap V_j \cap V_k$, then it can be checked that the composite map satisfies the following:

$$g_{ij}(x) \circ g_{jk}(x) \circ g_{ki}(x) = \mathrm{Id}_F \qquad (2.5.3)(b)$$

The identity map $\mathrm{Id}_F$ on $F$ is in fact the identity element of $G$. The condition (2.5.3) is referred to as the cyclic condition of transition functions.

An important concept related to fiber bundles is that of sections, which is described as follows. Given a fiber bundle $\xi = (E, X, \pi, F)$, a $C^r$-(smooth) map $s : X \to E$ such that $\pi \circ s = \mathrm{Id}_X$ is called a $C^r$-(smooth) section${}^{13}$ of the fiber bundle $E$ over $X$. The space of these sections is denoted as $\Gamma(X, E)$ or simply as $\Gamma(E)$. For an open set $U \subset X$, the set of sections of the bundle $E$ restricted to $U$ is denoted as $\Gamma(E|_U)$. In particular, if $x \in U$ and $s \in \Gamma(E|_U)$, then $s$ is called a local section of $E$ at $x$. Note that by choosing local coordinates $\{x^i, 1 \le i \le m\}$ in a neighbourhood of $x \in X$ and $\{(x^i, y^j); 1 \le i \le m, 1 \le j \le n\}$ in a neighbourhood of $s(x) \in E$, we may think of $s$ as a function from an open subset of $\mathbb{R}^m$ to $\mathbb{R}^{n+m}$ such that${}^{14}$

$$(x^i) \mapsto (x^i,\ y^j(x^1, \ldots, x^m)) \qquad (2.5.4)$$

${}^{13}$ The section is also referred to as a cross-section. See Exc. 1. The section is $C^r$ ($C^\infty$ = smooth) only in the case of $C^r$ (smooth) fiber bundles. See Def. (2.5.4).
${}^{14}$ Note that (2.5.4) can be written only if the spaces $X$ and $E$ carry coordinate charts and $F$ is homeomorphic to $\mathbb{R}^n$. This is what we have assumed here, though we have not mentioned it explicitly.


From this point of view one can use calculus to write the Taylor expansion of $s$ at $x$ for a given set of local coordinates and eventually define an equivalence class of expansions of the same order, say $k$. This is called the $k$-jet of $s$ at $x$ (see [20] for details).

Definition 2.5.4: A bundle $(E, X, \pi, G)$ is said to be a differentiable ($C^r$) fiber bundle if $E$, $X$ and the typical fiber are differentiable ($C^r$) manifolds, $\pi$ is a differentiable ($C^r$) mapping, and $G$ is a Lie group; furthermore, for the covering of $X$ given by the domains of an admissible atlas, the mappings $g_{jk}$ are differentiable ($C^r$).

Definition 2.5.5: Consider two bundles $(E_1, X_1, \pi_1)$ and $(E_2, X_2, \pi_2)$; the mappings which map fibers into fibers (i.e., preserve the local product structure) are called bundle morphisms. Thus a bundle morphism is a pair of maps $(f, \bar f)$:

$$f : E_1 \to E_2 \quad \text{and} \quad \bar f : X_1 \to X_2 \qquad (2.5.5)$$

such that the following diagram commutes:

$$\begin{array}{ccc} E_1 & \xrightarrow{\ f\ } & E_2 \\ \pi_1 \downarrow & & \downarrow \pi_2 \\ X_1 & \xrightarrow{\ \bar f\ } & X_2 \end{array}$$

[Figure: Bundle morphism.]

Given the map $f$, the map $\bar f$, if it exists, is unique.

Definition 2.5.6: Consider a differentiable fiber bundle $(E, X, \pi)$ where $E$ and $X$ are manifolds of dimension $n + m$ and $n$. A chart $(U, \psi)$ on $E$ defines fiber coordinates on $E$ if the mapping $\psi : U \to \mathbb{R}^{n+m}$ is a bundle morphism, with $\mathbb{R}^{n+m}$ viewed as the natural (Cartesian product) bundle $\mathbb{R}^{n+m} = \mathbb{R}^n \times \mathbb{R}^m$.

5.2 Tangent Bundle

It can be checked that, given a differentiable manifold $X^n$, the triple $(T(X^n), X^n, \pi)$ is a fiber bundle, where $T(X^n)$ stands for the space of pairs $(x, v_x)$, $x \in X^n$ and $v_x \in T_x(X^n)$, the tangent space to $X^n$ at $x$. The mapping $\pi$ is the projection $\pi : (x, v_x) \mapsto x$. The fiber at $x$ is $T_x(X^n)$, while the typical fiber is $\mathbb{R}^n$. The covering of $X^n$ is given by $\{V_j\}$, where $\{V_j, \psi_j\}$ is an atlas of $X^n$. The homeomorphism $\phi_j$ (see (2.5.1)) can be written as the pair $(\pi, \psi'_j \circ \pi')$ such that

$$(\pi, \psi'_j \circ \pi') : \pi^{-1}(V_j) \to V_j \times \mathbb{R}^n \qquad (2.5.6)$$

is given by $(x, v_x) \mapsto (x, \psi'_j(v_x))$. Here $\pi'$ is the mapping that assigns $(x, v_x)$ to $v_x$, and $\psi'_j(v_x)$ is the representative of $v_x$ in the chart $(V_j, \psi_j)$. The fiber coordinates on $T(X^n)$ are given by the mappings

$$(\psi_j, \mathrm{Id}) \circ (\pi, \psi'_j \circ \pi') : \pi^{-1}(V_j) \to \mathbb{R}^n \times \mathbb{R}^n \qquad (2.5.7)$$

The coordinates of a point $p = (x, v_x) \in \pi^{-1}(V_j) \subset T(X^n)$ are thus:

$$(x^1 \ldots x^n,\ v_x^1 \ldots v_x^n) \qquad (2.5.8)$$

The structural group $G$ of this bundle is the Lie group $GL(n, \mathbb{R})$, the group of linear automorphisms of $\mathbb{R}^n$. Recall that the group consists of non-singular $n \times n$ matrices. The fiber bundle described above is the Tangent Bundle. If the manifold $X^n$ is $C^r$-differentiable, then the tangent bundle $(T(X^n), X^n, \pi)$ is $C^{r-1}$-differentiable.

Definition 2.5.7: Let $\xi = (E, X, \pi, F)$ denote a fiber bundle with structure group $G$; if $F$ is a Banach space and $G$ is the Lie group of automorphisms${}^{15}$ (linear continuous bijections) of $F$, then $\xi$ is called a vector bundle with fiber type $F$. When $F$ is $\mathbb{R}^n$ (or $\mathbb{C}^n$) and $G$ is $GL(n, \mathbb{R})$ (or $GL(n, \mathbb{C})$), we call $\xi$ a real (or complex) vector bundle of rank $n$. Evidently, the tangent bundle given above is a real vector bundle of rank $n$.

Definition 2.5.8: A fiber bundle $\xi = (P, M, \pi, F)$ with structure group $G$ is called a principal fiber bundle over $M$ with structure group $G$ if $F$ is a Lie group and $G$ is the Lie group of diffeomorphisms $h$ of $F$ (see Def. (2.5.3)) such that:

$$h(f_1 f_2) = h(f_1) f_2, \quad f_1, f_2 \in F \qquad (2.5.9)$$

It should be noted that the mapping that carries $h$ to $h(e)$, where $e$ is the identity element of $F$, establishes a group isomorphism between $G$ and $F$. Thus it is customary to write the bundle $\xi$ in short form as $P(M, G)$,${}^{16}$ and to use the right action $\rho$ of the group $G$ in setting the conditions of the definition. We shall return to this in Chapter 6 while dealing with gauge theory.

We now move on to some algebraic and geometric concepts of bundle theory. As can be expected, the literature dealing with these concepts is quite vast. With our limited scope we only list the definitions of objects that are important from the point of view of applications and give the references for further reading. In order to define the notion of a parallel translate of a point of the fiber or a vector in relation to a principal fiber bundle, we return to Lie groups, in particular to Lie groups of transformations.

5.3 Lie Groups of Transformations, One-parameter Subgroups of a Lie Group, Killing Vector Fields

Given a group $G$ and a manifold $X$, the set $\{\sigma_g : X \to X\}$ is a Lie group of transformations if the mapping

$$\sigma : G \times X \to X \qquad (2.5.10)$$

defined as $(g, x) \mapsto \sigma(g, x)$ is differentiable, and if the set of transformations $\{\sigma_g\}$ ($\sigma_g(x) = \sigma(g, x)$) in relation to the composite mapping follows the group property:

$$\sigma_g \circ \sigma_h = \sigma_{gh}, \qquad \sigma_e = \text{identity transformation} \qquad (2.5.11)$$

From the above it is evident that $\sigma_{g^{-1}} = \sigma_g^{-1}$. The group $G$ is said to operate effectively on $X$ if $\sigma_g(x) = x$ for every $x \in X$ implies $g = e$. It is said to operate transitively on $X$ if for every $x \in X$ and $y \in X$, there exists $g \in G$ such that $\sigma_g(x) = y$. If the group of transformations $\{\sigma_t\}$ is one-dimensional, most of its properties can be stated in terms of the vector field which generates it. Naturally, if the group of transformations $\{\sigma_{g(t)}; t \in \mathbb{R}\}$ results from a one-parameter subgroup $\{g(t); t \in \mathbb{R}\}$ of $G$, it has similar properties, as we shall soon see. By definition, a one-parameter subgroup of a Lie group $G$ is a differentiable curve:

$$\mathbb{R} \to G; \quad t \mapsto g(t) \qquad (2.5.12)$$

such that

$$g(t)g(s) = g(t + s), \qquad g(0) = e \qquad (2.5.13)$$

${}^{15}$ It is assumed that such a Lie group can be defined, see [21(a)].
${}^{16}$ In spite of this shorthand notation, we often use the general form $\xi = (E, X, \pi, G)$.
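The defining relations (2.5.13) are easy to test on a familiar case. The sketch below takes, as an assumed example not discussed in the text, the rotations of the plane $g(t) \in SO(2)$, which form a one-parameter subgroup of $GL(2, \mathbb{R})$, and checks $g(t)g(s) = g(t + s)$ and $g(0) = e$ up to rounding error. The function names are ours.

```python
import math

def g(t):
    """Rotation of the plane through the angle t: a curve R -> SO(2)."""
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

t, s = 0.7, -1.9                      # sample parameter values (assumed)

group_law_holds = close(mat_mul(g(t), g(s)), g(t + s))
starts_at_identity = close(g(0.0), [[1.0, 0.0], [0.0, 1.0]])
```

Acting with $g(t)$ on a fixed point $x$ of the plane traces out the circle through $x$, which is the kind of curve generated by the transformations $\sigma_{g(t)}$.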

Using the ideas developed for the group $\{\sigma_t\}$ and for the group $\{\sigma_{g(t)}\}$, it can be checked that the image of the one-parameter subgroup $\{g(t); t \in \mathbb{R}\}$ under the mapping $\sigma_x$ is indeed the curve generated by the transformations $\{\sigma_{g(t)}; t \in \mathbb{R}\}$ that operate on $x \in X$. The mapping $\sigma_x$ mentioned above is thus defined as follows:

$$\sigma_{g(t)}(x) = \sigma(g(t), x) = \sigma_x(g(t)) \qquad (2.5.14)$$

We give below a pictorial representation of the mappings in (2.5.14).

[Figure: An illustration of the relation between $\sigma_{g(t)}$ and $\sigma_x$.]

The vector field which generates the group of transformations $\{\sigma_{g(t)}; t \in \mathbb{R}\}$ is called a Killing vector field on $X$ relative to the group $G$. We denote it as $v$. The integral curve of $v$ which passes through $x$ satisfies the differential equation:

$$\frac{d}{dt}\sigma_{g(t)}(x) = v(\sigma_{g(t)}(x))$$

with initial condition $\sigma_x(e) = x$. It is well known that if $G$ acts effectively on $X$, the space of Killing vector fields on $X$ is isomorphic to the Lie algebra $\mathcal{G}$ of $G$.

The two groups of transformations that we consider next are of great importance in the study of mathematics as well as physics. They are defined only with respect to Lie groups; specifically, they are the groups of transformations formed by left and right translations of $G$, and are defined as:

Left translation: $L_g : G \to G$; $L_g(h) = gh$
Right translation: $R_g : G \to G$; $R_g(h) = hg$ $\qquad (2.5.15)$

where $h$ is an arbitrary element of $G$ and $g$ is a fixed element. We now state two important results based on the above concepts (see Hints to Exercises 4 and 5 for the proof).

Result 2.5.9: There is a one-one correspondence between the set of left (right) invariant vector fields on $G$ and the set of vectors tangent to $G$ at $e$, i.e., the tangent vector space $T_e(G)$.

A vector field $v$ on $G$ is left (right) invariant if

$$L'_g v(h) = v(L_g h) = v(gh) \quad [R'_g v(h) = v(R_g h) = v(hg)], \quad \forall\, g, h \in G$$

Result 2.5.10: The set of left (right) invariant vector fields is closed under the Lie bracket relation.

Having studied the properties of the transformation groups associated with Lie groups, we are now in a position to define the notion of parallel transport (translate) and connection on a principal bundle.

5.4 Parallel Transport and Connection

We note that in a principal fiber bundle $(E, X, \pi, G)$, while each fiber is diffeomorphic to $G$, this diffeomorphism is not canonical since it depends on the atlas $\{U_i, \psi_i\}$ of $X$. We shall see that a connection is a means to set up a canonical correspondence between any two fibers along a curve $C$ in $X$. Using this correspondence one can say that a point of the fiber over a given point of the curve is parallel translated along the curve. In view of this it is natural that the definition of connection involves mappings between tangent spaces of the base manifold $X$ and those of the bundle manifold $E$.

Definition 2.5.11: A connection on the principal fiber bundle $\xi = (E, X, \pi, G)$ is a linear mapping

$$ \sigma_p : T_x(X) \to T_p(E), \qquad x = \pi(p), $$

that satisfies:

(i) $\pi' \sigma_p$ is the identity mapping on $T_x(X)$,
(ii) $\sigma_{R_g p} = R'_g \sigma_p$,
(iii) $\sigma_p$ depends differentiably on $p$. (2.5.16)

The symbol $R_g$ stands for the right action of $G$ on $E$ with respect to $g \in G$, and $R'_g$ is its differential. From the above definition it follows that if $C : I \subset \mathbb{R} \to X$ ($t \mapsto C(t)$) is a curve in $X$ that passes through the point $x_0 = C(0)$, and $p_0$ is a point on the fiber such that $\pi(p_0) = x_0$, then the parallel translation of $p_0$ along $C$ is given by the curve $\hat{C} : I \subset \mathbb{R} \to E$ ($t \mapsto \hat{C}(t)$) which is defined by:

$$ \frac{d}{dt}\hat{C}(t) = \sigma_p \frac{d}{dt} C(t), \qquad \hat{C}(0) = p_0. \tag{2.5.17} $$

Since $\sigma_p$ is linear, the space

$$ H_p = \sigma_p(T_x(X)), \qquad x = \pi(p), \tag{2.5.18} $$

is a vector subspace of $T_p(E)$. Using the vector space $H_p$, the definition of connection can also be formulated in the following manner.

Definition 2.5.12: A connection on $\xi = (E, X, \pi, G)$ is a field (an assignment) of vector spaces $H_p$, $H_p \subset T_p(E)$, such that

(i) $\pi' : H_p \to T_x(X)$, $x = \pi(p)$, is an isomorphism,
(ii) $H_{R_g p} = R'_g H_p$,
(iii) $H_p$ depends differentiably on $p$. (2.5.19)


The complement of the subspace $H_p$ with respect to the tangent space $T_p(E)$ is denoted $V_p$; evidently $V_p = T_p(E_x)$ and $\pi'(V_p) = 0$. Thus $T_p(E) = H_p \oplus V_p$ and $v = v_h + v_v$, where $v_h \in H_p$, $v_v \in V_p$ and $v \in T_p(E)$. The elements of $H_p$ and $V_p$ are called horizontal and vertical vectors, and the spaces are referred to as the horizontal and vertical spaces. The dimension of $H_p$ is evidently the same as that of $T_x(X)$, i.e., that of $X$, whereas the dimension of $V_p$ equals that of the Lie algebra $\mathfrak{g}$ of $G$. We clarify this in the following remark.

Remark 2.5.13: The group $G$ acts effectively on $E$, hence there is a natural isomorphism between $\mathfrak{g}$ and the space of Killing vector fields on $E$ relative to $G$. Moreover, since $p$ and $R_g p$ lie in the same fiber, any Killing vector (denoted $v_k(p)$) satisfying the relation

$$ v_k(p) = \frac{d}{dt}\left(R_{g(t)}\, p\right)\Big|_{t=0} $$

is tangent at $p$ to the fiber $E_x$ over $x = \pi(p)$, that is, $v_k(p) \in V_p$. Also, a Killing vector field does not vanish at any point unless it is associated with the zero element of $\mathfrak{g}$. This implies that the dimension of the space formed by the Killing vectors $\{v_k(p)\}$ at $p$ equals that of $\mathfrak{g}$ and therefore that of $V_p$. This leads to a canonical isomorphism $V_p \to \mathfrak{g}$ given by $u \mapsto \hat{u}$, and as a consequence there follows another equivalent definition of connection.

Definition 2.5.14: A connection on a principal fiber bundle $\xi = (E, X, \pi, G)$ is a linear mapping $\omega_p : T_p(E) \to \mathfrak{g}$ (i.e., a 1-form on $E$ with values in the vector space $\mathfrak{g}$) which satisfies:

(i) $\omega_p(u) = \hat{u}$ for every $u \in V_p$,
(ii) $\omega_{R_g p}(R'_g v) = \mathrm{Ad}(g^{-1})\, \omega_p(v)$,
(iii) $\omega_p$ depends differentiably on $p$.

The equivalence of the last two definitions can be checked by using the fact:

$$ v \in H_p \iff \omega(v) = 0. \tag{2.5.20} $$

We next see that a connection on a principal fiber bundle $\xi = (E, X, \pi, G)$ can always be defined by means of a 1-form on $X$ that has values in $\mathfrak{g}$. In a manner of speaking, this is an existence theorem for a connection on $\xi$.

Result 2.5.15: On every principal fiber bundle $\xi = (E, X, \pi, G)$, where $X$ is a paracompact manifold, a connection $\omega$ with values in $\mathfrak{g}$ can be constructed. (See Hint to Exercise 6 for the proof.)

Another important geometric object that can be defined on a principal fiber bundle is the curvature.

Definition 2.5.16: Let $\xi = (E, X, \pi, G)$ be a principal fiber bundle with connection $H$ defined by the 1-form $\omega$, and let $h : T_p(E) \to H_p$ be the mapping such that $v \mapsto v_h$. Then the 2-form

$$ \Omega = D\omega \tag{2.5.21(a)} $$

is called the curvature form of the connection $\omega$ (or that of $H$), where by definition:

$$ D\omega(v_1, v_2) = d\omega(hv_1, hv_2). \tag{2.5.21(b)} $$

Thus the symbol $D$ stands for the exterior covariant derivative of forms defined on $E$, just as $d$ stands for the exterior derivative of forms. In general, the exterior covariant derivative of an $r$-form $\psi = \psi^\alpha \otimes e_\alpha$ (with values in a vector space with basis $e_\alpha$) can be written as:

$$ D\psi(v_1, v_2, \ldots, v_{r+1}) = d\psi(hv_1, hv_2, \ldots, hv_{r+1}). \tag{2.5.22} $$

The curvature form $\Omega$ defined in (2.5.21) satisfies the relation:

$$ \Omega(u, v) = d\omega(u, v) + [\omega(u), \omega(v)] \tag{2.5.23} $$

for any vector fields $u$, $v$ on $E$. The above equation, known as Cartan's structural equation,^17 is quite fundamental in the study of the geometric structure of a principal fiber bundle. We shall return to it in later chapters. (See Exercise 7 for the proof of this equation.) If $\{e_\alpha\}$ denotes a basis in $\mathfrak{g}$, then the above equation becomes:

$$ \Omega^\alpha = d\omega^\alpha + \tfrac{1}{2}\, C^\alpha_{\beta\gamma}\, \omega^\beta \wedge \omega^\gamma. \tag{2.5.24} $$

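Differentiation of (2.5.24) gives:

$$ d\Omega^\alpha = \tfrac{1}{2}\, C^\alpha_{\beta\gamma}\, d\omega^\beta \wedge \omega^\gamma - \tfrac{1}{2}\, C^\alpha_{\beta\gamma}\, \omega^\beta \wedge d\omega^\gamma. \tag{2.5.25} $$

Every term on the right contains a factor $\omega^\beta$ or $\omega^\gamma$, which vanishes on horizontal vectors. Evaluating $d\Omega^\alpha$ on horizontal arguments $(hu, hv, hw)$ and making use of (2.5.20) and (2.5.22) therefore implies that

$$ D\Omega^\alpha(u, v, w) = d\Omega^\alpha(hu, hv, hw) = 0 \quad \text{for every } u, v, w, \tag{2.5.26} $$

i.e., $D\Omega = 0$. The above equations are called the Bianchi identities on $E$.

In conclusion to our study of connections on principal fiber bundles, we shall define two more objects. One of these is the linear connection (metric connection) on $X$, when $X$ is considered as the base manifold of $\xi$. Recall that we are already familiar with the notion of metric connection on a differentiable manifold from Sec. (0.3). The other is the bundle associated to $P(X, G)$ (see Part B).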

5.5 The Linear and the Metric Connection, and the Torsion Form

Definition 2.5.17: A connection on the principal fiber bundle $L(X)$ of frames on a differentiable manifold $X$ is called a linear connection on $X$.

Now $L(X) = \bigcup_{x \in X} L_x(X)$.
Let $\phi_p : T_x(X^n) \to \mathbb{R}^n$, $p = (x, u_i)$, be the mapping that carries a vector $v \in T_x(X^n)$ to its components with respect to the basis $\{u_i\}$. Thus, if $\{\theta^i\}$ is the basis dual to $\{u_i\}$, then $\theta_p(u) = $ […] for $u \in T_p(P)$ (2.5.27) is called the canonical form of $X$. Some authors call it the canonical form of $L(X)$ (see 1.[10]).

Definition 2.5.18: The 2-form $\Theta = D\theta$ is called the torsion form of the linear connection on $X$ (see Exercise 10).

If, instead of considering the set of arbitrary frames, we consider orthonormal frames at every point $x$ of $X$, the principal fiber bundle, whose structural group is now $O(n, \mathbb{R})$, is denoted $O(X)$.^18 The bundle $O(X)$ is called a reduction bundle of $L(X)$.

^17 There are two Cartan structural equations and this is one of them. The other is Eq. (a) in Exercise (2.5.10).
^18 Evidently the presence of a Riemannian structure has been assumed here (see Subsec. (0.3.4)).


In view of the result "if $(E, X, \pi, G)$ is a reduction of $L(X)$, then a connection on $E$ determines a linear connection on $L(X)$",^19 it follows that a connection on $O(X)$ determines a linear connection on $L(X)$. This linear connection on $L(X)$ is called a metric connection. If $X$ is a Riemannian manifold, the Riemannian connection (also called the Levi-Civita connection) is the unique metric connection whose torsion form vanishes:

$$ \Theta = 0. \tag{2.5.28} $$

Having described the connection (developed by Koszul and Ehresmann based on the ideas of E. Cartan) in the fullest generality, we proceed now in the next subsection to give the definition of connection and covariant derivation from another perspective.

Part B

As already mentioned in the introduction, we shall describe here the meaning of connection on a vector bundle, say $E(E, B, \pi, F)$.^20 We make the following definition for the purpose.

5.6 Connection and Curvature on a Bundle from a Different Point of View

Definition 2.5.19: Let $T^*(B)$ denote the cotangent bundle of $B$, i.e., the bundle formed by the spaces of 1-forms (covariant vectors) at all points of $B$. A connection or a covariant differential is a linear differential operator $D$ from the space of smooth sections of $E$ to the space of smooth sections of $T^*(B) \otimes E$, i.e.,

$$ D : \Gamma(B, E) \to \Gamma(B, T^*(B) \otimes E), $$

such that $D$ satisfies the Leibniz rule (see (iii) in Eq. (0.3.10)); thus for $s \in \Gamma(B, E)$ and $f \in C^\infty(B)$ it gives:

$$ D(fs) = df \otimes s + f\, Ds. \tag{2.5.29} $$

Locally (i.e., in $\pi^{-1}(U) \cong U \times F$, $U$ an open set of $B$) the above equation can be written as^21:

$$ \nabla_i (fs)\, dx^i = (\nabla_i f \cdot s + f\, \nabla_i s)\, dx^i \qquad (i = 1, \ldots, n). \tag{2.5.30} $$

If $\{s_i\}$, $i = 1, 2, \ldots, n$, denotes a set of $n$ linearly independent sections in a neighbourhood $U$ of $B$, then the action of $D$ on the sections $s_i$ can be expanded in terms of the same local frame with coefficients in $T^*(B)$, i.e.,

$$ D s_i = \sum_j \theta_i^{\,j} \otimes s_j. \tag{2.5.31} $$

The matrix $\theta$ of one-forms $\theta_i^{\,j}$ on $U$ is called the connection one-form. Using the matrix notation for $\{s_i\}$ we can write (2.5.31) as:

$$ Ds = \theta \otimes s. \tag{2.5.32} $$

^19 See reference [30], p. 302 for the proof.
^20 Note that we use $B$ in place of $X$ to denote the base space in order to distinguish the study in this subsection from the previous subsection.
^21 See Sec. (0.3) for $\nabla_i$.


The covariant differentiation of the above equation gives the curvature two-form matrix:

$$ D(Ds) = D(\theta \otimes s) = d\theta \otimes s + \theta \otimes Ds = d\theta \otimes s + \theta \otimes (\theta \otimes s) = \{d\theta - \theta \wedge \theta\} \otimes s. \tag{2.5.33} $$

Note that we have replaced the tensor product between the $\theta$'s by a wedge product and have also changed the sign of the term. The first of these steps is taken because $\theta$ is a one-form, and the second results from the fact that $\theta$ is a matrix, and to obey the matrix multiplication rule the two connection matrices have to be interchanged. Note further that equation (2.5.33) defines the covariant exterior differentiation of the one-form matrix $\theta$, and this eventually gives the curvature two-form $\Omega$^22:

$$ \Omega = D\theta = d\theta - \theta \wedge \theta. \tag{2.5.34} $$

Let $s' = \{s'_i\}$ be another frame which is related to $s = \{s_i\}$ through local linear transformations as:

$$ s' = g(x)\, s, \tag{2.5.35} $$

where $g(x) \in GL(m, \mathbb{R})$ is a matrix formed by $C^\infty$-functions of $x \in B$. Let $\theta'$ be the connection matrix with respect to $s'$; then we have $Ds' = \theta' \otimes s'$, or equivalently, using (2.5.35) and (2.5.29), we obtain:

$$ D(g(x)s) = dg(x) \otimes s + g(x)\, Ds = dg(x) \otimes s + g(x)(\theta \otimes s) = (dg(x) + g(x)\theta) \otimes s = \theta' g(x) \otimes s. \tag{2.5.36} $$

This implies $\theta' g(x) = dg(x) + g(x)\theta$, and since $g(x)$ is invertible, we have:

$$ \theta' = dg(x) \cdot (g(x))^{-1} + g(x)\, \theta\, (g(x))^{-1}. \tag{2.5.37} $$

In Chapter 6 (Eq. (6.6.11)), we shall see that this is the transformation law of a Yang-Mills potential (with $g^{-1}$ in place of $g$). Differentiate (2.5.37) to obtain^23:

$$ D\theta' = \Omega' = D(dg \cdot g^{-1} + g\, \theta\, g^{-1}). \tag{2.5.38} $$

Simplification of the right hand side yields:

$$ \Omega' = g\, \Omega\, g^{-1}, \tag{2.5.39} $$

where $\Omega'$ is given by (2.5.34) with $\theta'$ in place of $\theta$. The transformation property (2.5.39) is indeed the transformation rule for the Yang-Mills field strength (tensor) under a gauge transformation (see Sec. 6.5). Finally, the covariant differential of the two-form matrix $\Omega$ gives the Bianchi identity:

$$ D\Omega = d\Omega - \theta \wedge \Omega = 0. \tag{2.5.40} $$

This turns out to be the Yang-Mills field equation. We shall return to these ideas in Chapter 6 and later on, in Chapter 10.

^22 The reader should return to (2.5.21) to appreciate the difference between the two approaches.
^23 $g$ and $g^{-1}$ are in fact $g(x)$ and $(g(x))^{-1}$.
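The simplification leading from (2.5.38) to (2.5.39) is not spelled out in the text. A possible sketch is given below. It assumes that products of matrix-valued forms combine matrix multiplication with the wedge product, that $d(g^{-1}) = -g^{-1}\, dg\, g^{-1}$, and that $D\theta'$ is evaluated by the rule (2.5.34). The shorthand $a = dg\, g^{-1}$ and $b = g\,\theta\, g^{-1}$ is introduced here only for illustration, so that (2.5.37) reads $\theta' = a + b$.

```latex
\begin{aligned}
d\theta' &= d(dg\, g^{-1}) + d(g\,\theta\, g^{-1}) \\
         &= \underbrace{-\,dg \wedge d(g^{-1})}_{a \wedge a}
            + \underbrace{dg \wedge \theta\, g^{-1}}_{a \wedge b}
            + g\, d\theta\, g^{-1}
            \underbrace{-\, g\,\theta \wedge d(g^{-1})}_{b \wedge a}, \\[4pt]
\theta' \wedge \theta' &= (a + b) \wedge (a + b)
            = a \wedge a + a \wedge b + b \wedge a + b \wedge b, \\[4pt]
\Omega' = d\theta' - \theta' \wedge \theta'
         &= g\, d\theta\, g^{-1} - b \wedge b
          = g\,(d\theta - \theta \wedge \theta)\, g^{-1}
          = g\, \Omega\, g^{-1}.
\end{aligned}
```

The inhomogeneous terms built from $a$ cancel between $d\theta'$ and $\theta' \wedge \theta'$, which is why the curvature transforms homogeneously even though the connection matrix does not.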

Our next definition describes an important class of bundles known as associated bundles.

5.7 Associated Bundles

Definition 2.5.20: Let $P(M, G)$ be a principal fiber bundle and $F$ a manifold on which $G$ acts on the left. A fiber bundle $E \equiv E(M, F, G, P)$ is called the associated bundle to the principal bundle $P(M, G)$ if its differential structure is constructed in the following manner:

(i) The group $G$ acts on the right on the product manifold $P \times F$ by the rule $a : P \times F \to P \times F$, $(u, \xi)a = (ua, a^{-1}\xi)$, where $u \in P$, $\xi \in F$ and $a \in G$. Denote the quotient space of $P \times F$ by this group action as $E = P \times_G F$.
(ii) Let $\phi : P \times F \to M$ be the mapping such that $\phi(u, \xi) = \pi(u)$. The mapping $\phi$ induces a mapping $\pi_E$ called the projection of $E$ onto $M$. For each $x \in M$ the set $\pi_E^{-1}(x)$ is called the fiber of $E$ over $x$.
(iii) The isomorphism $\pi^{-1}(U) \cong U \times G$ induces an isomorphism $\pi_E^{-1}(U) \cong U \times F$ for every neighbourhood $U$ of $M$ (see Exercise 12).
(iv) The differential structure in $E$ is assigned by requiring that $\pi_E^{-1}(U)$ be an open submanifold of $E$ which is diffeomorphic with $U \times F$ under the isomorphism $\pi_E^{-1}(U) \cong U \times F$. The projection $\pi_E$ is then a differentiable mapping of $E$ onto $M$.

Sometimes the group action is selected to suit the manifold $F$. For example, $F$ can be $G$ itself and the group action is then the Ad action of $G$ on $G$, or $F$ can be the Lie algebra $\mathfrak{g}$ of $G$ and the action of $G$ on $\mathfrak{g}$ can naturally be the action ad (see 1.[10]). The associated bundles in this case are denoted $(P \times_{\mathrm{Ad}} G) = \mathrm{Ad}(P)$ and $(P \times_{\mathrm{ad}} \mathfrak{g}) = \mathrm{ad}(P)$ respectively.^24 We use these bundles in Sec. (6.6) while studying gauge theory.

5.8 Affine Bundle and Affine Connection

Another bundle that is sometimes used is the so-called affine bundle $A(M)$, formed by the affine frames of $M$. The structure group $G$ in this case is $A(n, \mathbb{R})$, the group of affine transformations of $A^n$ (when $\mathbb{R}^n$ is regarded as an affine space^25 it is denoted $A^n$). An arbitrary element $\tilde{a}$ of $A(n, \mathbb{R})$ is

$$ \tilde{a} = \begin{pmatrix} a & \xi \\ 0 & 1 \end{pmatrix}, \qquad a \in GL(n, \mathbb{R}),\ \xi \in \mathbb{R}^n \text{ a column vector}. \tag{2.5.41} $$

The element $\tilde{a}$ maps a point $\eta \in A^n$ into $a\eta + \xi$. An affine frame of a manifold $M$ at $x$ consists of a point $p \in A_x(M)$ and a linear frame $(u_1, u_2, \ldots, u_n)$, and it is denoted $(p; u_1, u_2, \ldots, u_n)$. Let the origin of $\mathbb{R}^n$ and its natural basis be denoted by $0$ and $(e_1, \ldots, e_n)$; then $(0; e_1, \ldots, e_n)$ gives the canonical frame of $A^n$. Every affine frame $(p; u_1, \ldots, u_n)$ can be identified with an affine transformation $U : A^n \to A_x(M)$. This 1-1 correspondence between the set of affine frames at $x$ and the set of affine transformations of $A^n$ onto $A_x(M)$ eventually gives the projection mapping $\pi : A(M) \to M$ such that $\pi(U) = x$. The connection resulting from this bundle is called the affine connection on $M$. Let $\gamma : L(M) \to A(M)$ be the map that carries a frame $(u_1, \ldots, u_n)$ to $(0_x; u_1, \ldots, u_n)$, where $0_x \in A_x(M)$ is the point corresponding to the origin of $T_x(M)$, and let $\tilde{\omega}$ and $\omega$ be the connections corresponding to $A(M)$ and $L(M)$. Then, using the pullback map $\gamma^*$, we have:

$$ \gamma^* \tilde{\omega} = \omega + \alpha, \tag{2.5.42} $$

where $\omega$ is a $gl(n, \mathbb{R})$-valued and $\alpha$ is an $\mathbb{R}^n$-valued 1-form (see Kobayashi 1.[10] for details).

^24 Written out in full these associated bundles are $E(M, G, \mathrm{Ad}, P)$ and $E(M, \mathfrak{g}, \mathrm{ad}, P)$.
^25 An affine subspace (affine hyperplane) of a linear space $X$ is a set of elements of $X$ which can be written as $x = y + x_0$, where $x_0$ is a given point of $X$ and $y \in Y$, a linear subspace of $X$.

5.9 Tensorial and Bundle-valued Forms

In our earlier discussions we have established a correspondence between the connection form on $P$ and the 1-form on $M$. In the following paragraphs we show that there exists a one-one correspondence between tensorial forms defined on principal bundles and the bundle-valued forms defined on a paracompact manifold $M$. Let us first explain terms such as tensorial forms. Recall that the group action of the structure group $G$ on the principal bundle $P(M, G)$ is a (free) right action denoted by $\rho$ (see Def. 2.5.8), whereas when we consider the associated bundles with manifold $F$ (or vector space $V$), the group action is a left action on $F$ (or $V$). In defining the tensorial forms and the bundle-valued forms, both these actions come into play.

Definition 2.5.21: Consider a principal bundle $P(M, G)$ together with a finite-dimensional vector space $V$. Let $r : G \to GL(V)$ be a representation of $G$ on $V$, and let $\phi \in \Lambda^l(P, V)$ be an $l$-form on $P$ with values in $V$. The form $\phi$ is called pseudo-tensorial of type $(r, V)$ if:

$$ \rho_a^* \phi = r(a^{-1})\, \phi, \qquad a \in G, \tag{2.5.43} $$

where $\rho_a$ denotes the right action on $P$ corresponding to the element $a$. It is easy to note that the connection 1-form $\omega$ on $P$ is pseudo-tensorial of type $(\mathrm{ad}, \mathfrak{g})$. The form $\phi$ is called horizontal if

$$ \phi(X_1, X_2, \ldots, X_l) = 0 \tag{2.5.44} $$

whenever some $X_i$ at $p \in P$, $1 \le i \le l$, is a vertical vector. The form $\phi \in \Lambda^l(P, V)$ is called tensorial of type $(r, V)$ if it is horizontal and pseudo-tensorial of type $(r, V)$.

Corresponding to every tensorial form $\phi$ there exists a unique $l$-form, denoted $s_\phi$, on $M$ with values in the associated vector bundle $E = P \times_r V$. The form $s_\phi$ is defined as follows:

$$ s_\phi(x)(X_1, X_2, \ldots, X_l) = u\big(\phi(Y_1, Y_2, \ldots, Y_l)\big) \quad \text{for every } x \in M, \tag{2.5.45} $$

where $u \in \pi^{-1}(x)$, $Y_i \in T_u(P)$, $\pi'(Y_i) = X_i$ ($1 \le i \le l$), and $u$ is regarded as a linear mapping from $V$ onto $\pi_E^{-1}(x)$. This ensures that both sides of (2.5.45) are elements of $\pi_E^{-1}(x)$. We note that since $\phi$ is tensorial, the definition of $s_\phi$ is independent of the choice of the point $u$ of the fiber $\pi^{-1}(x)$ and of the vectors $Y_i \in T_u P$. The form $s_\phi$ defined above, belonging to $\Lambda^l(M, E)$, is called the $l$-form associated to $\phi$.

In particular, when $\Omega \in \Lambda^2(P, \mathfrak{g})$ is the curvature 2-form $d^\omega \omega$ given by the $\mathfrak{g}$-valued connection 1-form $\omega$ on $P$, it can be checked that $\Omega$ is a tensorial 2-form of type $(\mathrm{ad}, \mathfrak{g})$.^26 We have already seen in (2.5.23) that it satisfies the equality

$$ d\omega(X, Y) = \Omega(X, Y) - [\omega(X), \omega(Y)], $$

which can also be written as $d\omega = \Omega - \omega \wedge \omega$.

^26 We are using $d^\omega \omega$ in place of $D\omega$ to emphasize the role of the connection 1-form $\omega$.

The corresponding form $s_\Omega \in \Lambda^2(M, \mathrm{ad}P)$ is denoted $F_\omega$ and is known as the curvature 2-form of $M$. We shall return to this in Chapter 6 while dealing with gauge theory.

Next we devote our attention to the notion of metric on fiber bundles. Without going into details we note that this is possible only in the case of principal bundles, and then only on a particular class of these bundles, as will be evident from the following definition.

Definition 2.5.22: Let $(M, g)$ be a compact, connected, $n$-dimensional Riemannian manifold and let $G$ be a compact, semi-simple, $m$-dimensional Lie group, the structure group of the principal bundle $P(M, G)$. Let $\langle\ ,\ \rangle_{\mathfrak{g}}$ denote a $G$-invariant inner product on the Lie algebra $\mathfrak{g}$ of $G$, and suppose that $\{e^i(x)\}_{1 \le i \le m}$ is a basis for the fiber $(\mathrm{ad}P)_x$ of the Lie algebra bundle $\mathrm{ad}P$. Then, using the fact that for every $x \in M$ the metric $g_x$ induces inner products on the tensor spaces and on the spaces of differential forms, we can write the inner products for bundle-valued differential forms on $M$. More precisely, let $\alpha, \beta \in \Lambda^p(M, \mathrm{ad}P)$ and $\gamma \in \Lambda^q(M, \mathrm{ad}P)$, with local expressions^27

$$ \alpha(x) = \alpha_i(x) \otimes e^i(x), \qquad \beta(x) = \beta_j(x) \otimes e^j(x), \qquad \gamma(x) = \gamma_k(x) \otimes e^k(x), \tag{2.5.46} $$

where $\alpha_i(x), \beta_j(x) \in \Lambda^p(M)_x$, $\gamma_k(x) \in \Lambda^q(M)_x$ and $e^i(x), e^j(x), e^k(x) \in (\mathrm{ad}P)_x$ for every $i$, $j$ and $k$. We can then use the above local expressions for the following definitions:

(1) The product $(\alpha, \beta)$, belonging to $\mathcal{F}(M)$, the set of smooth functions on $M$, is given by the mapping $x \mapsto (\alpha, \beta)_x$, where

$$ (\alpha, \beta)_x = g_x(\alpha_i(x), \beta_j(x))\, \langle e^i(x), e^j(x) \rangle_{\mathfrak{g}}. \tag{2.5.47} $$

(2) The inner product $((\alpha, \beta))$, belonging to $\mathbb{R}$, is:

$$ ((\alpha, \beta)) = \int_M (\alpha, \beta)\, dv_g, \tag{2.5.48} $$

where $dv_g$ is the volume element on $M$ with respect to the metric $g$. The corresponding norms (the first of these being local) resulting from these products are:

$$ x \mapsto |\alpha|_x = \sqrt{(\alpha, \alpha)_x} \quad \text{for every } x \in M, \tag{2.5.49(a)} $$

$$ \|\alpha\| = \sqrt{((\alpha, \alpha))}. \tag{2.5.49(b)} $$

(3) The formal adjoint $\delta^\omega$ of $d^\omega : \Lambda^p(M, \mathrm{ad}P) \to \Lambda^{p+1}(M, \mathrm{ad}P)$ follows from the definition of adjoint operators; thus^28:

$$ ((d^\omega \alpha, \sigma)) = ((\alpha, \delta^\omega \sigma)), \quad \text{where } \sigma \in \Lambda^{p+1}(M, \mathrm{ad}P). \tag{2.5.50} $$

^27 Repeated indices imply summation.
^28 See Chapter 3, Def. (3.1.7).


(4) The product $\alpha \wedge \gamma \in \Lambda^{p+q}(M)$ is given by:

$$ (\alpha \wedge \gamma)_x = (\alpha_i(x) \wedge \gamma_k(x))\, \langle e^i(x), e^k(x) \rangle_{\mathfrak{g}} \in \Lambda^{p+q}(M)_x. \tag{2.5.51} $$

(5) Finally, the bracket $[\alpha, \gamma]_\wedge$ of bundle-valued forms is defined as:

$$ x \mapsto [\alpha, \gamma]_x = (\alpha_i(x) \wedge \gamma_k(x))\, [e^i(x), e^k(x)] \in \Lambda^{p+q}(M, \mathrm{ad}P)_x. \tag{2.5.52} $$

Note that the wedge product $\wedge$ in (4) and the Lie bracket $[\ ,\ ]_\wedge$ in (5) are simply written as $\wedge$ and $[\ ,\ ]$ when there is no cause for confusion. All five definitions are used in the principal bundle formalism of gauge theory in Chapter 6 (see, for instance, Sec. 6.6 and Sec. 6.7). We also remark that the above definitions can be extended to the case of principal bundles where the compactness of $M$ is replaced by paracompactness. Finally, we note that for every compact Lie group $G$ a special type of connection, known as the universal connection, can be defined in the following manner.

Remark 2.5.23: Let $G$ be a compact Lie group and $n$ a fixed positive integer. Then there exists a principal bundle $\tilde{P}(N, G)$ ($\dim N = n$) together with a connection $\tilde{\Gamma}$ such that any connection $\Gamma$ on any other principal bundle $P(M, G)$ ($\dim M \le n$) can be obtained as the inverse image of $\tilde{\Gamma}$ by a suitable bundle homomorphism $f$ of $P$ into $\tilde{P}$. In other words, if $\omega$ and $\tilde{\omega}$ are the connection forms corresponding to $\Gamma$ and $\tilde{\Gamma}$, then $\omega = f^* \tilde{\omega}$. The connection $\tilde{\Gamma}$ is called the universal connection for $G$ and the integer $n$ (see 1.[10] for examples of these connections).

Exercise 2.5

1. Show that a vector field $v$ on $X^n$ is a cross-section of the tangent bundle $T(X^n)$.
2. Show that, given a real vector bundle $E$ on a manifold $M$, a Riemannian metric on $E$ is a smooth section $s$ of the vector bundle $S^2(E)$ on $M$ such that $s(x)$ is a bilinear, symmetric, positive definite map of $E_x \times E_x$ into $\mathbb{R}$ for every $x \in M$.
3. Show that $L(M) = \bigcup_{x \in M} L_x(M)$, formed by the union of frames defined on an $n$-dimensional differentiable manifold $M$, is a principal fiber bundle $L(M)(M, GL(n, \mathbb{R}))$.
4. Establish Result (2.5.9) using the concept of auxiliary functions for the mappings involved.
5. Prove Result (2.5.10).
6. Prove Result (2.5.15).
7. Establish Cartan's structural equation on the principal fiber bundle $E$.
8. Let $P = M \times G$ be a trivial principal fiber bundle. The canonical flat connection in $P$ is defined by taking the tangent space to $M \times \{a\}$ at $p = (x, a)$ ($p \in P$, $x \in M$, $a \in G$) as the horizontal subspace at $p$. Show that a connection in $P$ is the canonical flat connection if and only if it is reducible to a unique connection in $M \times \{e\}$.
9. Show that a connection in $P(M, G)$ is flat if and only if the curvature form vanishes identically.
10. Show that the torsion form $\Theta$ satisfies the relation:

$$ \Theta(u, v) = d\theta(u, v) + \tfrac{1}{2}\,[\omega(u)\,\theta(v) - \omega(v)\,\theta(u)]. \tag{a} $$

11. Let $S^3$ be the unit sphere in $\mathbb{C}^2$ and $S^2$ the unit sphere in $\mathbb{R}^3$, and let $f : S^3 \to CP^1$ be the natural map $f(z_0, z_1) = [z_0, z_1]$, where $[z_0, z_1]$ stands for the homogeneous coordinates on $CP^1$. Show that, using the identification of $CP^1$ with $S^2$ via stereographic projection, a bundle structure giving the principal fiber bundle $S^3(S^2, U(1))$ can be defined with the structure group $U(1)$. The mapping $f : S^3 \to S^2$ is called the Hopf fibration.
12. Show that the isomorphism $\pi^{-1}(U) \cong U \times G$ induces the isomorphism $\pi_E^{-1}(U) \cong U \times F$, where $\pi : P \to M$ and $\pi_E : E \to M$ are the projection maps of the principal bundle $P$ and its associated bundle $E$.
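As a numerical companion to Exercise 11, the following sketch evaluates the Hopf map in explicit coordinates. It assumes one common convention for composing $[z_0, z_1]$ with stereographic projection, namely $(z_0, z_1) \mapsto (2\,\mathrm{Re}(z_0 \bar z_1),\ 2\,\mathrm{Im}(z_0 \bar z_1),\ |z_0|^2 - |z_1|^2)$; other conventions differ by a rotation or reflection of $S^2$. The function names are hypothetical. The code only checks two properties at sample points (the image lies on $S^2$, and the $U(1)$ action moves points along a fiber); it is not a proof.

```python
import cmath

def hopf(z0, z1):
    """Send a point (z0, z1) of S^3 in C^2 to a point of S^2 in R^3.

    Uses the convention stated in the lead-in (an assumption, not the
    text's own formula).
    """
    w = z0 * z1.conjugate()
    return (2.0 * w.real, 2.0 * w.imag, abs(z0) ** 2 - abs(z1) ** 2)

def u1_action(z0, z1, angle):
    """Right action of exp(i*angle) in U(1) on a point of S^3."""
    phase = cmath.exp(1j * angle)
    return (z0 * phase, z1 * phase)

# A sample point of S^3: |z0|^2 + |z1|^2 = 0.36 + 0.64 = 1.
z0 = complex(0.6, 0.0)
z1 = complex(0.0, 0.8)

base_point = hopf(z0, z1)
norm_squared = sum(c * c for c in base_point)

# Moving along the U(1) orbit should not change the base point.
moved = hopf(*u1_action(z0, z1, 1.234))
max_deviation = max(abs(p - q) for p, q in zip(base_point, moved))

print(base_point, norm_squared, max_deviation)
```

The first check reflects the identity $4|z_0|^2|z_1|^2 + (|z_0|^2 - |z_1|^2)^2 = (|z_0|^2 + |z_1|^2)^2$, and the second reflects the fact that the common phase cancels in $z_0 \bar z_1$.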

Hints to Exercise 2.5

1. By definition, a section (cross-section) of the bundle $(T(X^n), X^n, \pi)$ is a mapping $s : X^n \to T(X^n)$ such that $\pi \circ s = \mathrm{Id}_{X^n}$. Denoting the vector field on $X^n$ by $v$, we wish to check that $v$ fulfils this requirement. We know that $T(X^n)$ is formed by the pairs $(x, v_x)$, where $v_x \in T_x(X^n)$ is a tangent vector in the tangent space at $x \in X^n$. The mapping $\pi$ carries the pair $(x, v_x)$ to $x$. Hence if $v : x \mapsto (x, v_x)$, then evidently $\pi \circ v$ is the identity mapping on $X^n$.

2. To begin with, we recall that a Riemannian metric on a differentiable manifold $M$ is a smooth section of the vector bundle $S^2(T(M))$ on $M$, where $S^2(T(M))$ denotes the bundle whose fiber over $x \in M$ is the vector space $S^2(T_x M)$ of symmetric bilinear maps of $T_x M \times T_x M$ into $\mathbb{R}$. In fact this is also referred to as the Riemannian metric on the tangent bundle $T(M)$. Having said this, it is clear that when $T(M)$ is replaced by a general real vector bundle $E$ on $M$, a Riemannian metric on $E$ is a smooth section $s$ of the vector bundle $S^2(E)$ on $M$ such that $s(x)$ is a bilinear, symmetric and positive definite map of $E_x \times E_x$ into $\mathbb{R}$ for every $x \in M$. A Riemannian vector bundle is a pair $(E, s)$ where $E$ is a real vector bundle and $s$ is a Riemannian metric on $E$.

3. By definition, a triple $(P, M, G)$ is a principal fiber bundle $P(M, G)$ over a differentiable manifold $M$ if the Lie group $G$ can be identified with the typical fiber $F$ of the manifold $P$, and $G$ acts freely on $P$ on the right. A frame $U = (u_1, \ldots, u_n)$ at a point $x \in M$ is an ordered basis of the tangent space $T_x M$. Let $L_x(M)$ denote the set

$$ L_x(M) = \{U \mid U \text{ is a frame at } x \in M\} \tag{i} $$

and let

$$ L(M) = \bigcup_{x \in M} L_x(M). \tag{ii} $$

Define the projection $\pi : L(M) \to M$ such that $U \mapsto x$. The group $G = GL(n, \mathbb{R})$ acts freely on $L(M)$ on the right:

$$ (U, g) \mapsto Ug = \Big(\sum_i u_i\, g^i_{\,1},\ \ldots,\ \sum_i u_i\, g^i_{\,n}\Big), \tag{iii} $$

where $g = (g^i_{\,j}) \in GL(n, \mathbb{R})$. Furthermore, $L(M)$ can be given a manifold structure such that this $GL(n, \mathbb{R})$-action is smooth and $GL(n, \mathbb{R})$ can be identified with a typical fiber $L_x(M)$ of $L(M)$. Hence $L(M)$ is a principal fiber bundle, denoted $L(M)(M, GL(n, \mathbb{R}))$.

4. To establish the required result we first define the auxiliary functions for the mappings defined on $X$ and $G$, and then go on to define auxiliary functions for the mappings $L_g$ and $R_g$. Let the local coordinate expressions for $x \in X$ and $g \in G$ be $(x^i)$ and $(g^a)$, which belong to $\mathbb{R}^n$ and $\mathbb{R}^p$ respectively. The mappings

$$ \sigma : G \times X \to X \text{ with } (g, x) \mapsto \sigma(g, x), \qquad \sigma_g : X \to X \text{ with } \sigma_g(x) = \sigma(g, x), \qquad \sigma_x : G \to X \text{ with } \sigma_x(g) = \sigma(g, x) \tag{i} $$

have the same local coordinate representation $(\sigma^i(g^a, x^j))$, which belongs to $\mathbb{R}^n$. The derivatives $\sigma'_g : T_x(X) \to T_{\sigma_g(x)}(X)$ and $\sigma'_x : T_g(G) \to T_{\sigma_x(g)}(X)$ are respectively represented as

$$ (\sigma'_g(x))^i_{\,j} = \frac{\partial \sigma^i(g^a, x^k)}{\partial x^j}, \qquad (\sigma'_x(g))^i_{\,a} = \frac{\partial \sigma^i(g^b, x^j)}{\partial g^a}. \tag{ii} $$

The representative of $\sigma'_x$ at $e \in G$ (i.e., when $g = e$) is the set of auxiliary functions $\sigma^i_a(x^j)$. We now replace $X$ by $G$ in (i) to write the local expressions for the left (right) translation $L_g h = gh$ ($R_g h = hg$) as $(gh)^a = L^a(g^b, h^c)$ ($(hg)^a = R^a(g^b, h^c)$). Similarly, replacing $X$ by $G$ in (ii), we have the left auxiliary (right auxiliary) function $L^a_c(g^b)$ ($R^a_c(g^b)$), which is given by:

$$ L^a_c(g^b) = \frac{\partial L^a(g^b, h^d)}{\partial h^c}\bigg|_{h=e} \qquad \left( R^a_c(g^b) = \frac{\partial R^a(g^b, h^d)}{\partial h^c}\bigg|_{h=e} \right). \tag{iii} $$

We emphasize that although Result (2.5.9) is being proved here for left invariant vector fields, it also holds good for right invariant vector fields. Recall that a vector field $v$ is called left invariant if

$$ L'_g v(h) = v(L_g h) = v(gh) \quad \text{for every } g, h \in G. \tag{iv} $$

Denote the value of a vector field $v$ at $e$ by $\gamma$, i.e., $\gamma = v(e)$; then at $h = e$, (iv) becomes:

$$ L'_g \gamma = v(g). \tag{v} $$

This can be locally written in terms of the left auxiliary functions as

$$ v^a(g)\, \frac{\partial}{\partial g^a} = L^a_k(g)\, \gamma^k\, \frac{\partial}{\partial g^a}. \tag{vi} $$

Conversely, $L'_g v(e) = v(g)$ implies that $v$ is left invariant in view of the following equalities:

$$ v(L_g h) = v(gh) = L'_{gh}\, v(e) = (L_g \circ L_h)'\, v(e) = L'_g \circ L'_h\, v(e) = L'_g\, v(h). \tag{vii} $$

The above proof does not explicitly mention the correspondence between the left invariant vector field and the tangent vector. We emphasize this point by noting that $\gamma = v(e)$ is indeed the tangent vector at $e \in G$, i.e., $\gamma \in T_e(G)$.

5. We have to show that if $v$ and $w$ are left invariant vector fields on $G$, then their Lie bracket is also left invariant, which means that the left invariant fields are closed under the Lie bracket operation. In order to prove it, we have to establish:

$$ L'_g [v, w] = [L'_g v, L'_g w] = [v, w]. \tag{i} $$

Prior to showing this, however, we make two important remarks in this connection, namely (a) given a smooth function $f \in C^\infty(G)$, the vector fields $v$, $w$ determine smooth functions $v(f)$ and $w(f)$ that belong to $C^\infty(G)$; (b) the composition $vw$, considered as an operator on $C^\infty(G)$, does not in general determine a vector field, whereas the difference $(vw - wv)$ does, which means that the collection of vector fields on $G$ is closed under the Lie bracket operation. Thus in order to prove (i), a smooth function, which is not explicitly there, has to be brought into the picture. Note that in this context (v) and (vi) of the above Hint to Exercise 4 look like:

$$ (L'_g \gamma) f = v(f(g)), \tag{ii} $$

where $f(g)$ is the value of $f$ at the point $g \in G$, and

$$ v^a(g)\, \frac{\partial f}{\partial g^a} = L^a_k(g)\, \gamma^k\, \frac{\partial f}{\partial g^a}. \tag{iii} $$

In view of remark (b), $L'_g(vw)$ has no meaning in our context, whereas $L'_g v(w(f)) = (L'_g v)\, w(f) = v(w(f))$ does. Similarly, writing $L'_g w(v(f)) = (L'_g w)\, v(f) = w(v(f))$, we have:

$$ L'_g [v, w] f = L'_g \big((vw - wv) f\big) = (L'_g v)\, w(f) - (L'_g w)\, v(f) = v(w(f)) - w(v(f)) = (vw - wv) f. \tag{iv} $$

To show that $L'_g [v, w] = [L'_g v, L'_g w]$, we only observe that, retracing our steps in (iv) and using (ii), $(vw - wv) f$ in fact implies $(L'_g v\, L'_g w - L'_g w\, L'_g v) f$, which leads to $[L'_g v, L'_g w]$.

6. Since $X$ is paracompact, a result proved in a neighbourhood can be extended to the whole manifold through a partition of unity (see Boothby 0.[1]). Let $\{U_i\}$ be the covering of $X$ associated to the fiber bundle structure of $E$, with $\{\psi_i\}$ the corresponding mappings on $E$, and let $\bar{\omega}$ be a differential 1-form on $X$ with values in $\mathfrak{g}$, the vector space of the Lie algebra of $G$. Suppose that $p_0$ is a point in $E$ such that $\psi_i(p_0) = (x_0, g_0)$, where $x_0 \in U_i$ and $g_0 \in G$; then any vector $v \in T_{p_0}(E)$ can be expressed as:

$$ v = u_1 + u_2, \tag{i} $$

where $\psi'_i u_1 \in T_{(x_0, g_0)}(U_i \times \{g_0\})$ and $\psi'_i u_2 \in T_{(x_0, g_0)}(\{x_0\} \times G)$. This means that $u_2$ is a vertical vector. Define $\omega_{p_0}$ as the connection form at $p_0$ by the equality:

$$ \omega_{p_0}(v) = \bar{\omega}(\pi' u_1) + \hat{u}_2. \tag{ii} $$

The transitive action of $G$ on each fiber implies that $\omega_p$ can be defined at any other point $p = R_g p_0$ of the fiber by the relation:^29

$$ \omega_p(R'_g v) = \mathrm{Ad}(g^{-1})\, \omega_{p_0}(v). \tag{iii} $$

In view of this relation, we conclude that there is a form $\omega_i$ defined on $\pi^{-1}(U_i)$ with values in $\mathfrak{g}$. Next suppose that $\{\theta_i\}$ is a partition of unity on $X$ subordinate to the covering $\{U_i\}$; then the form $\omega = \sum_i (\theta_i \circ \pi)\, \omega_i$ is the required connection form on $E$.

7. To establish the equality (2.5.23) we have to consider three different combinations of vector fields $u$ and $v$, or of vectors $u_p$ and $v_p$: both $u_p$ and $v_p$ horizontal, one horizontal and the other vertical, and both vertical. Now $\Omega(u, v) = D\omega(u, v) = d\omega(hu, hv)$. This means that we have to show that

$$ d\omega(hu, hv) = d\omega(u, v) + [\omega(u), \omega(v)] \tag{i} $$

in all three cases.

^29 See (ii) of Def. (2.5.14).

Case 1: In view of (2.5.20), which says $\omega(v_p) = 0$ for $v_p \in H_p$, it follows that $\omega(u_p) = \omega(v_p) = 0$, hence $[\omega(u_p), \omega(v_p)] = 0$, and since $hu_p = u_p$ and $hv_p = v_p$, we have $D\omega(u_p, v_p) = d\omega(u_p, v_p)$.

Case 2: Extend both $u_p$ and $v_p$ in a neighbourhood of $p$ to vector fields $u$ and $v$ which are respectively horizontal and vertical. Then $\omega(u) = 0$ and, $\omega(v)$ being a fixed element of $\mathfrak{g}$, $u\,\omega(v) = u^a \frac{\partial}{\partial e^a}(\text{const.}) = 0$, where $\{e^a\}$ stands for local coordinates in $H_p$. This means that not only $[\omega(u), \omega(v)] = [0, \omega(v)] = 0$, but also $d\omega(u, v) = u\,\omega(v) - v\,\omega(u) - \omega[u, v] = 0$, as the Lie bracket $[u, v]$ in the third term is horizontal. Since $D\omega(u, v) = d\omega(hu, hv) = d\omega(hu, 0) = 0$, we note that both sides are identically zero, and hence the equality holds.

Case 3: Extend $u$ and $v$ to Killing vector fields in the neighbourhood of $p$, and note that the algebra of Killing vector fields is isomorphic to $\mathfrak{g}$. As a consequence it follows that:

$$ d\omega(u, v) = u\,\omega(v) - v\,\omega(u) - \omega[u, v] = 0 - 0 - \omega[u, v] = -[\omega(u), \omega(v)], $$

so that $d\omega(u, v) + [\omega(u), \omega(v)] = 0$. (Note that we have established an equality similar to $\omega[u, v] = [\omega(u), \omega(v)]$ in Exercise 5, while dealing with left invariant vector fields.) On the other hand, $D\omega(u, v) = d\omega(hu, hv) = d\omega(0, 0) = 0$. Hence both sides of (i) vanish identically.

8. The result to be proved here is, in a way, a restatement of the definition of the canonical flat connection on a principal fiber bundle. This bundle has to be a trivial fiber bundle for the definition to be meaningful. In order to establish the result, we use the principal bundle homeomorphism between the two bundles $P = M \times G$ and $P' = M \times \{e\}$ […] Let $f : M \times G \to G$ be the projection mapping; then

$$ \omega = f^* \theta \tag{i} $$

defines the connection form of the canonical flat connection. The curvature form corresponding to $\omega$ is zero, since

$$ d\omega = d(f^* \theta) = f^*(d\theta) = f^*\big(-\tfrac{1}{2}[\theta, \theta]\big) = -\tfrac{1}{2}[f^*\theta, f^*\theta] = -\tfrac{1}{2}[\omega, \omega] \tag{ii} $$

(see Exercise 7). The Cartan structural equation, when written for the 1-form $\theta$ on $G$, is called the Maurer-Cartan equation:

$$ d\theta = -\tfrac{1}{2}[\theta, \theta]. $$
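For a matrix group the Maurer-Cartan equation can be verified in one line. The sketch below assumes that $\theta$ is the left invariant form $\theta = g^{-1} dg$, that matrix-valued forms are multiplied by combining the matrix product with the wedge product, and that the conventions $(\theta \wedge \theta)(u, v) = \theta(u)\theta(v) - \theta(v)\theta(u)$ and $[\theta, \theta](u, v) = 2[\theta(u), \theta(v)]$ are used. None of these choices is fixed by the text above, and other normalizations change the numerical factors.

```latex
\begin{aligned}
d\theta &= d(g^{-1}) \wedge dg
         = -\,g^{-1}\, dg\, g^{-1} \wedge dg
         = -\,\theta \wedge \theta, \\[4pt]
(\theta \wedge \theta)(u, v) &= [\theta(u), \theta(v)]
         = \tfrac{1}{2}\,[\theta, \theta](u, v)
\quad\Longrightarrow\quad
d\theta = -\tfrac{1}{2}\,[\theta, \theta].
\end{aligned}
```

Under the same assumptions, pulling back by $f$ gives $d\omega + \omega \wedge \omega = 0$, which is the vanishing of the curvature stated in (ii).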

9. A connection in P(M, G) is called flat if every point x e M has a neighbourhood such that the induced connection in the restricted bundle P\u= nx (U) is isomorphic with the canonical flat connection in U x G. In other words, P admits a flat connection if there exists an isomorphism

Elements of Group Theory and Group Representations

105

(j): ifx (U) —> U x G which maps the horizontal subspace at u e n~l(U) upon the horizontal subspace at #(P) of the Lie algebra g into the Lie algebra %{P) of vector fields on P. For every A e Q, this homomorphism defines the vector field A = (j)(A) known as the fundamental vector field (on P). To every p e P, A* assigns the vertical vector A p, which as we know is tangent to the fiber at p e P. Given a linear connection on M (see Def. 2.5.17), for each a e R", we associate a vector field B(a) on P, such that for each ps P, (B(a))p is the unique horizontal vector at/? which satisfies it' ((B{a))p) =p(cc). The vector field B(a) is called the standard horizontal vector field on P. The following properties of these vector fields are well known (see (l.[10j): (i) A ^ 0 and a* 0 imply respectively that A*, B(a) never vanish on P. (ii) Given A* corresponding to A, for every g e G, (Rg)* A* is a fundamental vector field corresponding to {ad{g~x)) A e Q, and co(A*) = A.30 (iii) If 6 is the canonical form and ft) the connection form on P, then 0(B(a)) = a and a>(B(a)) = 0.31 (iv) Rg{B(a)) = B(g~l a) forge G and a e R". (v) [A*, B(a)] - B{A a) where A a denotes the image of a by A-an n x /i-matrix e gl(n, R)-the Lie algebra of GL(n, R). We emphasize the fact that a fundamental vector field on P can be defined without assigning a connection on P, whereas a standard horizontal vector field can be defined only after a linear connection has been defined on M. We now use these concepts in establishing the result. Note that in view of the Def. (2.5.16) for exterior covariant derivative of an arbitrary form on P, we can write 0 ( M , V) as: (a)

$\Theta(u, v) = D\theta(u, v) = d\theta(hu, hv)$  (a)

where $h: T_p(P) \to H$ is the mapping that carries $v \to v_h$, the horizontal component of $v$. We further note that since the connection form $\omega$ is a $\mathfrak{g}$-valued 1-form and $\theta$ is an $\mathbb{R}^n$-valued 1-form they satisfy:

$u$ vertical: $\omega(u) \neq 0$, $\theta(u) = 0$  (b)

$u$ horizontal: $\omega(u) = 0$, $\theta(u) \neq 0$.  (c)

$^{30}$ The vector field $A^*$ is induced by the 1-parameter group of transformations $R_{a_t}$ where $a_t = \exp tA$, whereas the vector field $(R_a)_* A^*$ is induced by the 1-parameter group of transformations $R_a R_{a_t} R_{a^{-1}} = R_{a^{-1} a_t a}$. The latter group is generated by $(\mathrm{ad}(a^{-1}))A \in \mathfrak{g}$.

$^{31}$ Note that these two conditions completely determine $B(\xi)$ for each $\xi \in \mathbb{R}^n$.

Now the equality (a) in the Exercise has to be verified (just as we did in Exercise 7) for different combinations of $u$ and $v$, e.g., both horizontal or both vertical, or $u$ vertical and $v$ horizontal. The first two cases turn out to be trivial (see the steps taken in Exercise 7), hence we establish the equality for the third case. Suppose that a vertical vector $u$ corresponds to the fundamental vector field $A^*$, i.e. $u = A^*_p$, and $v$ corresponds to the standard horizontal vector $(B(\alpha))_p$. We shall write $u$ and $v$ as $A^*_p$ and $(B(\alpha))_p$ when required. Evidently $\Theta(u, v) = 0$, since $d\theta(0, hv) = 0$. Also $\omega(v) \cdot \theta(u) = 0$ in view of (b) or (c), whereas $\omega(u) \cdot \theta(v) = \omega(A^*_p) \cdot \theta((B(\alpha))_p) = A\alpha$ in view of properties (ii) and (iii). On the other hand, writing $2\,d\theta(u, v) = u\,\theta(v) - v\,\theta(u) - \theta([u, v])$, we note that the first term vanishes since it is the derivative of a constant $n$-tuple, the second term vanishes as $\theta(u) = 0$, and the third term can be written as:

$\theta([A^*_p, (B(\alpha))_p])$  (d)

Using the property (v) followed by (iii), we see that (d) equals $A\alpha$; and this establishes our result.

11. To define a principal bundle $P(M, G)$ we are supposed to have a Lie group $G$ whose (free) right action $\rho$ on $P$ is such that the orbits of $\rho$ are the fibers of $\pi: P \to M$, i.e., $\pi$ can be identified with the canonical projection $P \to P/G$. Also for every $\psi_U: U \times G \to \pi^{-1}(U)$ (considered as a local trivialization) one has:

$\psi_U^{-1}(u_x g) = \psi_U^{-1}(u_x)\, g \quad \forall\, u_x \in P_x,\ g \in G$  (i)

where $u_x g = \rho(u_x, g)$ and $x$ is a point of $U \subset M$ (see Def. (2.5.3) and (2.5.8) to appreciate the use of the mapping $\psi^{-1}$ here). Now $S^3$ is a unit sphere in $\mathbb{C}^2$ and there exists a natural map $f: S^3 \to \mathbb{C}P^1$ which sends the pair $(z_0, z_1) \in S^3$ to the equivalence class $[z_0, z_1] \in \mathbb{C}P^1$. In Exercise (1.3.8) we have already seen that $S^2$ can be stereographically identified with $\mathbb{C}P^1$. Hence using the right action of the Lie group $U(1)$ on $S^3$, the requirements of a principal fiber bundle (given above) can be easily verified (see Hint to Exercise 3).

12. By definition of a principal bundle, every point $x \in M$ has a neighbourhood $U$ such that $\pi^{-1}(U) \cong U \times G$. We identify $\pi^{-1}(U)$ with $U \times G$ and note that the action of $G$ on the right on $\pi^{-1}(U) \times F$ is indeed the action on $U \times G \times F$, which is given as:

$(x, g, \alpha) \to (x, g, \alpha) \circ h = (x, gh, h^{-1}\alpha)$  (i)

where $x \in U$, $g, h \in G$ and $\alpha \in F$. This means that the isomorphism $\pi^{-1}(U) \cong U \times G$ induces an isomorphism $\pi_E^{-1}(U) \cong U \times F$.

A Note on Hints: Exercise (2.2): Exercises 4, 5, 10 and 11 can be solved with the help of Refs. [13], 4.[4], 4.[8] and 4.[9]. Exercises 6 and 7 will be appreciated better after Chapters 6 and 9. Exercise (2.3): a good source for hints to these exercises is the article by A. A. Kirillov in [19] and the text 3.[11].

References

1. R. Abraham, Piecewise differentiable manifolds and the space-time general relativity, J. Math. Mech. 11 (1962), 553-592.
2. V. I. Arnold, (a) On the topology of three-dimensional steady flows of an ideal fluid, Prikl. Mat. Meh. 30 (1966), 183-185 (Russian), translated as J. Appl. Math. Mech. 30 (1966), 223-226; (b) Singularities of smooth mappings (Russian), Uspehi Mat. Nauk 23 (1968), no. 1 (139), translated as Russian Math. Survey.
3. A. Borel, Linear Algebraic Groups (Benjamin, 1969).
4. R. Bott, An application of the Morse Theory to the Topology of Lie groups, Bull. Soc. Math. France 84 (1956), 251-282.
5. R. W. Carter, Finite Groups of Lie Type (Wiley-Interscience, 1985).
6. H. Chandra, Automorphic Forms on Semisimple Lie Groups (Springer-Verlag Lecture Notes 62, 1968).
7. C. Chevalley, Theory of Lie Groups I (Princeton University Press, 1946).
8. C. M. DeWitt and J. A. Wheeler (ed.), Lectures in Mathematics and Physics, Batelle Rencontres (1967) (New York: W. A. Benjamin, 1968); R. Bott and J. Mather, Topics in Topology and Differential Geometry (New York: W. A. Benjamin, 1968).
9. F. Digne and J. Michel, Representations of Finite Groups of Lie Type (Cambridge University Press, 1991).
10. D. G. Ebin and J. Marsden, Groups of diffeomorphism and the motion of an incompressible fluid, Ann. of Math. (2) 92 (1970), 102-163.
11. J. Eells, Jr., On the Geometry of Function Spaces. Symposium internacional de topologia algebraica [International symposium on algebraic topology], 303-308. Universidad Nacional Autonoma de Mexico and Unesco, Mexico City, 1958.
12. C. Ehresmann, Les Connections Infinitesimales Dans un espace Fibre Differentiable (Colloque de Topologie, Bruxelles, 1950).
13. M. Gourdin, Basics of Lie Groups (Editions Frontieres, 1982).
14. W. Greub, S. Halperin and R. Vanstone, Connections, Curvature, and Cohomology, Vol. 1: De Rham Cohomology of Manifolds and Vector Bundles (New York: Academic Press, 1972).
15. V. Guillemin and A. Pollack, Differential Topology (New Jersey: Prentice-Hall, Englewood Cliffs, 1974).
16. S. Helgason, Differential Geometry, Lie Groups and Symmetric Spaces (New York: Academic Press, Inc., 1978).
17. F. Hirzebruch, Topological Methods in Algebraic Geometry (3rd enlarged ed., New York: Springer-Verlag, 1966).
18. D. Husemoller, Fiber Bundles (2nd ed., Berlin: Springer-Verlag, 1975).
19. A. A. Kirillov (ed.), Representations of Lie Groups and Lie Algebras (Akademiai kiado, Budapest, 1985).
20. B. A. Kupershmidt, in G. Kaiser and J. E. Marsden, Geometric Methods in Mathematical Physics (Berlin: Springer-Verlag, 1980).
21. S. Lang, (a) Differential Manifolds (Addison-Wesley, Reading, MA, 1972); (b) SL2(R) (New York: Springer-Verlag, 1985).
22. S. Lie, Theorie der Transformations Gruppen (Leipzig: B. G. Teubner, 1888).
23. G. W. Mackey, Unitary Group Representations in Physics, Probability and Number Theory (The Benjamin/Cummings Publishing Company, 1978).
24. K. B. Marathe and G. Martucci, 10.[35].
25. J. Marsden, D. G. Ebin and A. Fischer, Diffeomorphism Groups, Hydrodynamics and Relativity (Toronto? Pub: s.n.)?
26. D. Mumford, Abelian Varieties, Bombay Lectures (2nd ed., Tata Inst. of Fund. Res., Bombay, India, 1974).
27. H. Omori, Infinite Dimensional Lie Transformation Groups (Springer-Verlag, 1974).
28. S. Smale and R. S. Palais, What is global analysis, Amer. Math. Monthly 76 (1969), 4-9.
29. N. E. Steenrod, The Topology of Fiber Bundles (Princeton University Press, 1951).
30. S. Sternberg, Lectures on Differential Geometry (Prentice-Hall, 1965).
31. D. Sundararaman, (a) Topics in Several Complex Variables (Boston: Pitman, 1985); (b) Moduli Deformation and Classification of Compact Complex Manifolds (Marshfield, MA: Pitman, 1980).
32. G. Warner, Harmonic Analysis on Semisimple Lie Groups (Springer-Verlag, 1972).
33. F. W. Warner, Foundations of Differentiable Manifolds and Lie Groups (Glenview, IL: Scott, Foresman and Company, 1971).
34. S.-T. Yau (ed.), Geometry, Topology, and Physics for Raoul Bott (International Press, 1995).
35. H. Zassenhaus, Lie Groups, Lie Algebras and Representation Theory (Les Presses De l'Universite de Montreal, 1981).

CHAPTER 3

A PRIMER ON OPERATORS

1 DEFINITIONS AND EXAMPLES

In physics as well as in mathematics, operators play an important role in physical and algebraic structures. We therefore give some elementary details of operator theory before using them as tools in other theories. It is well known that very often the search for solutions (ansatz) of a given system becomes easier when cast in operator formalism. Symbolically, given an equation $L(x) = a$, where $a$ is known and $L$ stands for a well defined operation, we solve for the unknown $x$. Thus in the set of equations:

$l_{11} x_1 + l_{12} x_2 = a_1$
$l_{21} x_1 + l_{22} x_2 = a_2$

the pairs $(a_1, a_2)$, $(x_1, x_2)$ and the $(2 \times 2)$ matrix

$\begin{pmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{pmatrix}$

represent respectively $a$, $x$ and the operator $L$. The operator may be defined only for a certain class of elements (i.e., those amongst which the unknown is supposed to lie); this set is called the domain of definition of $L$, denoted $D(L)$. As the elements of $D(L)$ vary, the results of the operation define another set called the range of $L$, denoted $R(L)$. If the solution to the equation $L(x) = a$ is unique, we say that the operator $L$ possesses an inverse denoted $L^{-1}$. Thus if the matrix in the above example is non-singular, $L$ has an inverse $L^{-1}$. The example we have chosen here illustrates an important class of operators known as linear operators. It is this class of operators with which we shall deal in general.

In order to study different aspects of operators, we begin by considering the cartesian product $\mathcal{A} \times X$ of two sets. A mapping from $\mathcal{A} \times X$ into $X$: $(a, x) \mapsto ax = y$, called an external operation on $X$, assigns the role of operators to elements of $\mathcal{A}$. Thus the operator here is a procedure that transforms a given element of $X$ into another element of $X$ (the range set of the operator in this case is included in the domain set $X$). To explain the algebraic structure of the set $\mathcal{A}$, we assume that $X$ is a set of functions (continuous or differentiable as required) defined on an arbitrary bounded domain (Euclidean, Hilbertian$^1$ or a compact smooth manifold). In this setting generic elements of $\mathcal{A}$ and $X$ are respectively $\Omega$, and $f$ and $g$; thus we have $\Omega f = g$. Simple examples of $\Omega$ are the multiplication by a constant and the derivation, i.e.,

$\Omega f = \lambda f = g$ $(\Omega = \lambda)$ and $\Omega f = \dfrac{\partial f}{\partial x} = g$ $\left(\Omega = \dfrac{\partial}{\partial x}\right)$.

An operator is called linear if it satisfies:

$\Omega(a f_1 + b f_2) = a\,\Omega f_1 + b\,\Omega f_2$.  (3.1.1)

Evidently the 'domain' of a linear operator is a linear (vector) space and the 'range' is also a linear space. Note that the operator which squares, or more generally exponentiates, a given element $f$ is not a linear operator. Operators of this type are non-linear operators.

$^1$ The Hilbert spaces considered throughout this chapter are complete and separable. (See 0.2.14 for the definition.)
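The linearity condition (3.1.1) can be tried out on a small model. The sketch below is only an illustration under stated assumptions: polynomials are stored as coefficient lists, and the helper names (`derivative`, `multiply_by`, `square`, `is_linear_on`) are hypothetical, not taken from the text. It checks (3.1.1) for the derivation, for multiplication by a constant, and for the squaring operator on one pair of test polynomials.

```python
# Polynomials are coefficient lists: [c0, c1, c2] stands for c0 + c1*x + c2*x**2.

def add(f, g):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def scale(c, f):
    return [c * a for a in f]

def trim(f):
    # Drop trailing zero coefficients so equal polynomials compare equal.
    f = list(f)
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def derivative(f):
    # Omega = d/dx
    return [k * f[k] for k in range(1, len(f))] or [0]

def multiply_by(lam):
    # Omega = lambda (multiplication by a constant)
    return lambda f: scale(lam, f)

def square(f):
    # The squaring operator f -> f*f (expected to be non-linear).
    out = [0] * (2 * len(f) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(f):
            out[i + j] += a * b
    return out

def is_linear_on(op, f1, f2, a, b):
    # Tests Omega(a f1 + b f2) == a Omega f1 + b Omega f2 for one choice of data.
    lhs = op(add(scale(a, f1), scale(b, f2)))
    rhs = add(scale(a, op(f1)), scale(b, op(f2)))
    return trim(lhs) == trim(rhs)

f1 = [1, 2, 3]        # 1 + 2x + 3x^2
f2 = [0, -1, 0, 4]    # -x + 4x^3

derivative_ok = is_linear_on(derivative, f1, f2, 2, -5)
constant_ok = is_linear_on(multiply_by(7), f1, f2, 2, -5)
square_ok = is_linear_on(square, f1, f2, 2, -5)
print(derivative_ok, constant_ok, square_ok)
```

A single passing test does not prove linearity, of course; a single failing one (as for `square`) does disprove it.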

1.1 Properties of a Linear Operator

Definition 3.1.1: Consider a linear operator $\Omega$ defined on a topological vector space $X$ (see Def. (0.2.7)). The linear operator $\Omega$ is continuous if $\Omega\psi_n \to \Omega\psi$ for any sequence of vectors $\{\psi_n\}$ in $X$ that converges to a limit vector $\psi$ in $X$. It is bounded if there exists a positive number $b$ such that $\|\Omega\psi\| \le b\,\|\psi\|$ for every vector $\psi$. The smallest number $b$ with this property is called the norm of $\Omega$ and is denoted $|\Omega|$. It can be shown that a linear operator is continuous if and only if it is bounded. The set of bounded operators on $X$ is denoted $\mathcal{B}(X)$. The sum and product of two operators are defined as:

$\Omega f = (\Omega_1 + \Omega_2) f = \Omega_1 f + \Omega_2 f$, and $\Omega f = (\Omega_1 \Omega_2) f = \Omega_1(\Omega_2 f)$.  (3.1.2)

While $\Omega_1 + \Omega_2 = \Omega_2 + \Omega_1$, $\Omega_1\Omega_2$ is not always equal to $\Omega_2\Omega_1$. For example, when $\Omega_1$ is multiplication of $f$ by $x$ and $\Omega_2 = \dfrac{\partial}{\partial x}$,

$(\Omega_1\Omega_2) f \neq (\Omega_2\Omega_1) f$.

If however $f = f(x, y)$ and $\Omega_1 = \dfrac{\partial}{\partial x}$ and $\Omega_2 = \dfrac{\partial}{\partial y}$, then

$\Omega_1\Omega_2 f = \Omega_2\Omega_1 f = \dfrac{\partial^2 f}{\partial x\,\partial y}$.$^2$

Sums, scalar multiples and products of bounded operators are bounded. In view of Def. (4.1.1), $\mathcal{B}(X)$ is an algebra.

Definition 3.1.2: A vector subspace $X' \subset X$ is said to be invariant under the action of an operator $\Omega$ if any vector $\psi$ in $X'$ is transformed to another vector $\psi'$ in $X'$. Also if $\{e_i\}$ denotes an orthonormal basis in $X$, the operator $\Omega$ has a matrix representation given by:

$\Omega(e_i) = \sum_j \Omega_{ji}\, e_j$.  (3.1.3)

$^2$ $f$ is assumed to be smooth throughout.
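The failure of commutativity for multiplication by $x$ and $\partial/\partial x$, and the commutativity of the two partial derivatives, can be checked on polynomials. The following is a minimal sketch; the coefficient-list and dictionary encodings and the function names are assumptions made for the illustration only. On a one-variable polynomial $f$ one expects $(\Omega_2\Omega_1 - \Omega_1\Omega_2) f = f$.

```python
def times_x(f):
    # Omega_1: multiply the polynomial (coefficient list) by x.
    return [0] + list(f)

def d_dx(f):
    # Omega_2: differentiate the polynomial.
    return [k * f[k] for k in range(1, len(f))] or [0]

def sub(f, g):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    out = [a - b for a, b in zip(f, g)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

f = [5, -3, 0, 2]     # 5 - 3x + 2x^3
# (d/dx x - x d/dx) f reproduces f itself, so the two operators do not commute.
commutator_on_f = sub(d_dx(times_x(f)), times_x(d_dx(f)))

# Two-variable polynomials as {(i, j): c}, meaning the sum of c * x**i * y**j.
def partial_x(p):
    return {(i - 1, j): i * c for (i, j), c in p.items() if i > 0}

def partial_y(p):
    return {(i, j - 1): j * c for (i, j), c in p.items() if j > 0}

p = {(2, 1): 3, (1, 3): -1, (0, 0): 4}   # 3x^2 y - x y^3 + 4
mixed_xy = partial_x(partial_y(p))
mixed_yx = partial_y(partial_x(p))
print(commutator_on_f, mixed_xy == mixed_yx)
```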

Definition 3.1.3: We say that a linear operator $\Omega$ has an inverse $\bar{\Omega}$ if $\bar{\Omega}\Omega = \Omega\bar{\Omega} = 1$; we denote it by $\Omega^{-1}$.

Result 3.1.4: If $\Omega$ is a linear operator on an $n$-dimensional vector space with basis $\{e_i\}$, it can be shown that each of the following four conditions is necessary and sufficient for $\Omega$ to possess an inverse.
(1) There is no non-zero vector $\psi$ that satisfies $\Omega\psi = 0$.
(2) The set of $n$ vectors $\Omega e_i$ $(i = 1, \ldots, n)$ is linearly independent.
(3) There is a linear operator $\bar{\Omega}$ such that $\bar{\Omega}\Omega = \Omega\bar{\Omega} = 1$.
(4) The matrix $[\Omega_{ij}]$ corresponding to $\Omega$ has non-zero determinant.

Among linear operators with inverses, there are those which preserve the scalar product of a vector space (assuming that a scalar product is defined on it). These are called unitary operators. We denote them as $U$ and we note that for any vector $\psi$ in a given vector space $X$

$\|U\psi\| = \|\psi\|$  (3.1.4)

holds. We would like to mention here that (3.1.4) does not imply that an operator satisfying it possesses an inverse. In the case of finite-dimensional vector spaces it has an inverse, but for infinite-dimensional vector spaces this is not always true. For instance consider the space $l^2$ formed by infinite sequences$^3$ $\psi = (x_1, x_2, \ldots, x_n, \ldots)$ and let $A$ be an operator such that

$A\psi = (0, x_1, x_2, \ldots, x_n, \ldots)$.  (3.1.5)

Evidently $\|A\psi\| = \|\psi\|$ for every vector $\psi$, but $A$ has no inverse. The following results about unitary operators can be easily checked.

Result 3.1.5: If $U$ is a unitary operator and the set of vectors $(\phi_1, \phi_2, \ldots, \phi_k, \ldots)$ is an orthonormal basis of $X$, then $(U\phi_1, U\phi_2, \ldots, U\phi_k, \ldots)$ is an orthonormal basis.

Result 3.1.6: Let $(\phi_1, \ldots, \phi_k, \ldots)$ be an orthonormal basis; if $(\Omega\phi_1, \ldots, \Omega\phi_k, \ldots)$ is an orthonormal basis, then $\Omega$ must be unitary.

Definition 3.1.7: Let $(\ ,\ )$ denote the scalar product on $X$. The adjoint of a bounded linear operator $\Omega$ is an operator denoted $\Omega^+$ that satisfies:

$(\phi, \Omega^+\psi) = (\Omega\phi, \psi)$  (3.1.6)

for any vectors $\phi$, $\psi$ in $X$. The bounded operator $\Omega$ is called self-adjoint (Hermitian) if $\Omega = \Omega^+$ (see also Eq. (3.1.13)).
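Two of the statements above lend themselves to a quick numerical check. The sketch below rests on assumptions made only for illustration: vectors are lists of complex numbers, the scalar product is taken conjugate-linear in its first argument, and the adjoint of a matrix is its conjugate transpose. It verifies (3.1.6) for one matrix and shows that the shift (3.1.5), modelled here on finitely supported sequences, preserves the norm while every image has first entry zero, which is why it cannot be onto.

```python
import math

def inner(phi, psi):
    # Scalar product, conjugate-linear in the first slot (a convention assumed here).
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def apply(M, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def adjoint(M):
    # Conjugate transpose of a matrix given as a list of rows.
    return [[complex(M[j][i]).conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

def norm(v):
    return math.sqrt(sum(abs(a) ** 2 for a in v))

def right_shift(v):
    # Models A(x1, x2, ...) = (0, x1, x2, ...) on a finitely supported sequence.
    return [0] + list(v)

Omega = [[1 + 2j, 3], [-1j, 4 - 1j]]
phi = [1 - 1j, 2]
psi = [3j, -1 + 1j]

# (phi, Omega^+ psi) versus (Omega phi, psi), cf. (3.1.6)
adjoint_gap = abs(inner(phi, apply(adjoint(Omega), psi)) - inner(apply(Omega, phi), psi))

v = [3.0, -4.0, 12.0]
shifted = right_shift(v)
norm_gap = abs(norm(shifted) - norm(v))
print(adjoint_gap, norm_gap, shifted[0])
```

Since `shifted[0]` is always zero, a sequence such as $(1, 0, 0, \ldots)$ is never reached, in line with the remark that $A$ has no inverse.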

1.2 Matrix Representation of a Linear Operator

When we talk of operators in quantum mechanics (see 9A.14-18), we denote the orthonormal basis as $|\psi_i\rangle$ and rewrite (3.1.3) as

$\Omega|\psi_i\rangle = \sum_j A_{ji}(\Omega)\,|\psi_j\rangle$.  (3.1.7)

The basis vectors $|\psi_i\rangle$ are called the basis state vectors in $X$, and $A_{ij}(\Omega)$ is the matrix representation of $\Omega$ with respect to this basis. It is easy to check that (3.1.3) and (3.1.7) lead respectively to

$(e_i, \Omega(e_j)) = \Omega_{ij}$  (3.1.8)

and

$\langle\psi_i|\Omega|\psi_j\rangle = A_{ij}(\Omega)$.$^4$  (3.1.9)

Also, as the scalar product in quantum mechanics is given by an integral, equality (3.1.9) can be written as:

$A_{ij}(\Omega) = \int \psi_i^*(x)\,\Omega\,\psi_j(x)\,dx$.$^5$  (3.1.10)

$^3$ $l^2$-space = the space formed by $(x_1, \ldots, x_n, \ldots)$ such that $\sum_{n=1}^{\infty} |x_n|^2 < \infty$, with addition and scalar multiplication defined componentwise. It is also denoted as $l_2$-space (see Exercise 6 in 0.2).

Using (3.1.10) the unitarity condition (3.1.4) can now be expressed as:

$\|U\psi\|^2 = \int (U\psi)^*\,(U\psi)\,dx = \int \psi^*\,\psi\,dx = \|\psi\|^2$.  (3.1.11)

Returning to the use of functions for further study, we note that given an operator $\Omega$, its complex conjugate operator $\Omega^*$ is defined by the relation:

$(\Omega f)^* = \Omega^* f^*$.  (3.1.12)

Pursuing this scheme of ideas we say that $\Omega$ is Hermitian if for any pair of functions $f$, $g$ it satisfies:

$\int f^*\,\Omega g\,dx = \int g\,\Omega^* f^*\,dx$.  (3.1.13)

In view of (3.1.7) and (3.1.8) it can be easily verified that $\Omega$ is Hermitian if

$\langle f|\Omega|g\rangle = \langle g|\Omega|f\rangle^*$.  (3.1.14)

Evidently a Hermitian operator is represented by a Hermitian matrix (see Sec. 2). Another important notion about operators is their completeness (not to be confused with the completeness of Hilbert spaces). We call an operator complete if it involves all variables of the domain of its definition. If that is not the case, we call it incomplete. Thus for a particle which is freely moving in space, we note that the operator

$(p_x)_{op} = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$

is incomplete, but the Hamiltonian

$H_{op} \equiv H\left(x, y, z, \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}, \dfrac{h}{2\pi i}\dfrac{\partial}{\partial y}, \dfrac{h}{2\pi i}\dfrac{\partial}{\partial z}\right)$

is complete. We close this section with three examples of operators which are used in both mathematics and physics all the time and give at the end a list of operators that we shall use as we go along.

$^4$ Note that on the LHS we have used the quantum mechanics (Dirac) notation $\langle\ ,\ \rangle$ in place of $(\ ,\ )$ to denote the scalar product; we use the two notations interchangeably. Also, when there is no fear of confusion, we shall use $\Omega_{ij}$ and not $A_{ij}(\Omega)$.

$^5$ $\psi^*$ denotes the complex conjugate of $\psi$ (a wave function in quantum-mechanics terminology).

Example (3.1.8): Recall that the Fourier transform of a function $f(x)$ ($f: \mathbb{R} \to \mathbb{C}$) is another function $F(u)$ (see Sec. 4 in App. 9c):

$F(u) = (2\pi)^{-1/2}\int_{-\infty}^{\infty} e^{-iux} f(x)\,dx$.

Hence the Fourier transform $\mathcal{F}$ can be viewed as $\Omega f$, where

$F = \Omega f = \int_{-\infty}^{\infty} \Omega(x; x')\,f(x')\,dx' = \int_{-\infty}^{\infty} (2\pi)^{-1/2}\, e^{-ixx'} f(x')\,dx'$.  (3.1.15)

Note that this is an integral representation of the operator $\Omega$; the quantity $(2\pi)^{-1/2} e^{-ixx'} = \Omega(x, x')$ is called the kernel of this operator. In the above example we chose the Fourier transform of a function to obtain an integral representation of an operator; since in the Fourier transform there is a built-in integration, this may wrongly imply that only those transformation procedures that involve integration could be given an integral operator formalism. To illustrate that this is not the case, we show how a differential operator can be given an integral representation.

Example 3.1.9: Let $D = \dfrac{\partial}{\partial x}$ denote the differential operator. In order to find its integral representation we have to determine its kernel $D(x, x')$ in the following equality (for differentiable functions with compact support defined on the interval $(-\infty, \infty)$):

$\dfrac{\partial f}{\partial x} = \int_{-\infty}^{\infty} D(x, x')\,f(x')\,dx'$.  (3.1.16)

Using the result of Exercise 4 for the integral representation of the multiplicative unit operator, we can write $\dfrac{\partial f(x)}{\partial x}$ as

$\dfrac{\partial f(x)}{\partial x} = \int_{-\infty}^{\infty} \delta(x - x')\,\dfrac{\partial f(x')}{\partial x'}\,dx'$.  (3.1.17)

Integrating the RHS by parts we have:

$\dfrac{\partial f}{\partial x} = \Big[\delta(x - x')\,f(x')\Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} \left[\dfrac{\partial}{\partial x'}\delta(x - x')\right] f(x')\,dx' = \int_{-\infty}^{\infty} \dfrac{\partial}{\partial x}\delta(x - x')\,f(x')\,dx' = \int_{-\infty}^{\infty} \delta'(x - x')\,f(x')\,dx'$.  (3.1.18)

This shows that the kernel $D(x, x') = \delta'(x - x')$, the derivative of the Dirac delta function.
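A discretized version of (3.1.16)-(3.1.18) may help to make the kernel picture tangible. The sketch below is an illustration under assumptions, not the author's construction: the integral is replaced by a Riemann sum on a uniform grid of spacing `h`, the delta function by a grid delta $\delta_{ij}/h$, and $\delta'(x - x')$ by its central difference in $x$. With these (hypothetical) choices the delta kernel reproduces $f$ and the $\delta'$ kernel reproduces the central-difference derivative of $f$.

```python
import math

def apply_kernel(K, f_vals, h):
    # Riemann-sum version of g(x_i) = integral of K(x_i, x') f(x') dx'.
    n = len(f_vals)
    return [sum(K(i, j) * f_vals[j] for j in range(n)) * h for i in range(n)]

def delta_kernel(h):
    # Grid stand-in for delta(x - x'): 1/h on the diagonal.
    return lambda i, j: (1.0 / h) if i == j else 0.0

def delta_prime_kernel(h):
    # Grid stand-in for delta'(x - x'): central difference of the grid delta in x.
    def K(i, j):
        if j == i + 1:
            return 1.0 / (2.0 * h * h)
        if j == i - 1:
            return -1.0 / (2.0 * h * h)
        return 0.0
    return K

h = 0.025
xs = [-5.0 + h * i for i in range(401)]
f = [math.exp(-x * x) for x in xs]              # rapidly decaying test function
exact_derivative = [-2.0 * x * math.exp(-x * x) for x in xs]

identity_image = apply_kernel(delta_kernel(h), f, h)
derivative_image = apply_kernel(delta_prime_kernel(h), f, h)

identity_err = max(abs(a - b) for a, b in zip(identity_image, f))
# Skip the two end points, where the central difference is one-sided.
derivative_err = max(abs(a - b) for a, b in
                     zip(derivative_image[1:-1], exact_derivative[1:-1]))
print(identity_err, derivative_err)
```

The remaining error in the derivative is the usual second-order error of the central difference and shrinks as `h` is reduced.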

Example 3.1.10: Shift operator: As the name suggests, operators that shift a parameter to right or left in a continuous system, or the entries of a sequence in a discrete system, are called right or left shift operators. For example, if $\{U_t\}$ $(t \in \mathbb{R})$ denotes a family of unitary operators and $S_\epsilon U_t = U_{t+\epsilon}$ $(\epsilon > 0)$, then $S_\epsilon$ is a right shift operator of length $\epsilon$. $S_{-\epsilon} = (S_\epsilon)^{-1}$ is a left shift operator: $S_{-\epsilon} U_t = U_{t-\epsilon}$, $\epsilon > 0$. If $B = (b_1, \ldots, b_n)$ is a sequence of real numbers of length $n$, the operator that shifts the 1st, 2nd, ... entries to the $(k+1)$th, $(k+2)$th, ... places and fills the first $k$ entries with zeros:

$R^k(B) = (0, 0, \ldots, 0, b_1, \ldots, b_{n-k})$

is the right shift operator of order $k$. The left shift operator is naturally defined by shifting the entries to the left, thus

$L^k(B) = (b_{k+1}, b_{k+2}, \ldots, b_n, 0, 0, \ldots, 0)$.

Note that $L^k(B) = R^{-k}(B)$.

We give below two operators to which we shall return later in Chapters 5 and 11. (i) Virasoro operators, (ii) Del Giudice, Di Vecchia and Fubini (DDF) operators: these operators commute with Virasoro operators; when these are applied successively to ground states they give all possible physical states. Moreover they form a closed algebra called the spectrum generating algebra. The elements of this algebra (i.e., operators) are in one-one correspondence with Fourier coefficients $\alpha^\mu_n$ of string theory.$^6$ Finally we list some of the classical operators that we use all the time.
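Before turning to the list, the finite shift operators of Example 3.1.10 can be made concrete. The following is a minimal sketch; the function names `right_shift` and `left_shift` are hypothetical, and sequences are modelled as Python lists of fixed length $n$.

```python
def right_shift(B, k):
    # R^k(B) = (0, ..., 0, b_1, ..., b_{n-k}): k leading zeros, last k entries dropped.
    n = len(B)
    k = min(k, n)
    return [0] * k + list(B[: n - k])

def left_shift(B, k):
    # L^k(B) = (b_{k+1}, ..., b_n, 0, ..., 0): first k entries dropped, k trailing zeros.
    n = len(B)
    k = min(k, n)
    return list(B[k:]) + [0] * k

B = [1, 2, 3, 4, 5]
shifted_right = right_shift(B, 2)
shifted_left = left_shift(B, 2)
round_trip = left_shift(right_shift(B, 2), 2)
print(shifted_right, shifted_left, round_trip)
```

On a sequence of fixed finite length the round trip $L^k R^k$ returns only the first $n - k$ entries of $B$ (the last $k$ are pushed off the end by $R^k$), so in this finite model the left shift undoes the right shift only on sequences whose last $k$ entries vanish.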

1.4 List of Operators (Commonly in Use)

Name of the Operator | The Notation | The Spaces and the Set of Functions on which it Commonly Acts
Position | $Q$ (or $X$) | $\mathbb{R}^n$, $\mathbb{C}^n$.
Grad (Gradient) | $\nabla = \left(\dfrac{\partial}{\partial x}, \dfrac{\partial}{\partial y}, \dfrac{\partial}{\partial z}\right)$ | Set of functions $f$ defined on $\mathbb{R}^3$: $f(x, y, z)$.
Div (Divergence) | $\nabla\cdot$ | Set of functions $\vec{f}$ defined on $\mathbb{R}^3$: $\vec{f}(x, y, z)$.
Curl | $\nabla\times$ | Set of functions $\vec{f}$ defined on $\mathbb{R}^3$: $\vec{f}(x, y, z)$.
Laplace | $\nabla^2 = \Delta = \dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2} + \dfrac{\partial^2}{\partial z^2}$ | Set of functions $f$ defined on $\mathbb{R}^3$: $f(x, y, z)$.
Momentum | $-i\nabla \equiv P$ | Set of functions $\psi$ defined on Minkowskian space: $\psi(t, x, y, z)$.
Kinetic Energy | $-\nabla^2/2m \equiv T$ | Functions on $\mathbb{R}^3$.
Energy | $-i\,\dfrac{\partial}{\partial t} \equiv E$ | Complex-valued functions $\psi(t, \vec{x})$ known as the wave functions.
Total Energy | $-\nabla^2/2m + V(r, t) \equiv T + V$ | Set of particles moving in $\mathbb{R}^3$ ($V(r, t)$ is the potential energy).
Wave or D'Alembert's | $-\left(\dfrac{\partial^2}{\partial t^2} - \nabla^2\right) \equiv -\Box^2$ | Set of functions defined on Minkowskian space.
Angular Momentum | $-i\,\vec{r} \times \nabla \equiv \vec{r} \times \vec{p} \equiv J \equiv L$ | Set of functions on $\mathbb{R}^3$.
Total Angular Momentum | $J^2 = J_x^2 + J_y^2 + J_z^2$ | Set of functions on $\mathbb{R}^3$.
Angular Momentum Operators in Wave Mechanics | $M_x = \dfrac{h}{2\pi i}\left(y\dfrac{\partial}{\partial z} - z\dfrac{\partial}{\partial y}\right) = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial \psi_x}$ | Set of functions on $\mathbb{R}^3$ ($\psi_x$ denotes the azimuth about the axis $Ox$). Similarly for $M_y$ and $M_z$.

Operators defined for Specific Spaces

The Hamiltonian Operator | $H\left(x, y, z, \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}, \dfrac{h}{2\pi i}\dfrac{\partial}{\partial y}, \dfrac{h}{2\pi i}\dfrac{\partial}{\partial z}\right) = \dfrac{1}{2m}\left(\dfrac{h}{2\pi i}\right)^2\left(\dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2} + \dfrac{\partial^2}{\partial z^2}\right) + V(x, y, z, t)$ | For Minkowskian space.
One-Dimensional Schrodinger Operator (also known as Harmonic Oscillator) | $H = -\dfrac{1}{2}\dfrac{d^2}{dx^2} + V(x)$ | For a particle moving in a one-dimensional medium where $V(x) = \int^x F(t)\,dt$ is the potential.
Schrodinger Operator (Harmonic Oscillator in a General Form) | $H = \dfrac{1}{2}\left(-\Delta + q(x)\right)$ | For a particle moving in $\mathbb{R}^n$ (here $q(x)$ is a positive quadratic form on $\mathbb{R}^n$).
Schrodinger Operator (in General) | $H(a, V) = \dfrac{1}{2}\left(-i\nabla - a\right)^2 + V$ | For Minkowskian space (here $a$ is a vector potential and $V$ is a scalar potential).
Dirac Operator | $c\,\alpha \cdot p + \beta m_0 c^2$ $^7$ | Space of spinors.

$^6$ See Chapters 11 and 11.[13] for (i) and (ii).

Exercise 3.1

1. Establish the equality (3.1.11).

2. If $\Omega$ is an operator on Hilbert space $\mathcal{H}$ and $\{\phi_n\}$ is a complete set of orthonormal functions ($\langle\phi_m, \phi_n\rangle = \delta_{m,n}$), show that the matrix representation for $\Omega$ using the set $\{\phi_n\}$ [...] The above representation of $\Omega$ is in terms of a matrix of infinite order.

3. Show that two operators commute if and only if their Lie bracket is zero.

4. Show that the multiplicative unit operator $\Omega_u$ ($\Omega_u f = f$) defined on a set of functions $\{f(x) : x \in (-\infty, \infty)\}$ can be written in terms of an integral operator as

$f(x) = \int_{-\infty}^{\infty} \Omega_u(x, x')\,f(x')\,dx'$

where the kernel $\Omega_u(x, x')$ is the Dirac delta function $\delta(x - x')$ that satisfies $\int dx\,\delta(x - x') = 1$ (see subsection 3 of (0.4) for the Dirac $\delta$-function).

5. Writing the operators $p$ and $J$ in terms of their components $(p_x, p_y, p_z)$, $(j_x, j_y, j_z)$, show that [...] and that the commutation relations satisfied by the $j$'s are:

$[j_x, j_y] = i j_z$, $[j_y, j_z] = i j_x$, $[j_z, j_x] = i j_y$

which can succinctly be written using the indices 1, 2, 3 and the anti-symmetric tensor $\epsilon_{lmn}$ as:

$[j_l, j_m] = i\,\epsilon_{lmn}\, j_n$.

6. Show that the Laplace operator $\Delta$ in cylindrical coordinates $(\rho, \phi, z)$ and spherical polar coordinates $(r, \theta, \phi)$ is respectively:

$\Delta = \dfrac{\partial^2}{\partial \rho^2} + \dfrac{1}{\rho}\dfrac{\partial}{\partial \rho} + \dfrac{1}{\rho^2}\dfrac{\partial^2}{\partial \phi^2} + \dfrac{\partial^2}{\partial z^2}$

$\Delta = \dfrac{\partial^2}{\partial r^2} + \dfrac{2}{r}\dfrac{\partial}{\partial r} + \dfrac{1}{r^2}\dfrac{\partial^2}{\partial \theta^2} + \dfrac{\cos\theta}{r^2 \sin\theta}\dfrac{\partial}{\partial \theta} + \dfrac{1}{r^2 \sin^2\theta}\dfrac{\partial^2}{\partial \phi^2}$.

7. Consider the complex space $L^2(0, 1)$ of complex square integrable functions $\psi(x)$ on the interval $0 < x < 1$. Let $w$ be a real number and $U$ a linear operator defined as $U\psi(x) = e^{iwx}\psi(x)$. Show that $U$ is a unitary operator.

8. Consider the set of operators

$d_n = i\,e^{in\theta}\dfrac{d}{d\theta}$ for $n \in \mathbb{Z}$.

Show that their Lie bracket satisfies:

$[d_n, d_m] = (n - m)\,d_{n+m}$.

The set $\{d_n\}$ is a basis for the Lie algebra of complex vector fields on the unit circle $S^1$ (see Eq. (5.2.9)).

$^7$ p = hp.

Hints to Exercise 3.1 2. Let gn be the image of
S« = X a«, * ^t

(0

hence gn = Q.(pn becomes

(ii)

X a«, * ^ = Q^«it

Premultiplication by
as

\fma<$>ndx={
and

/ 0m hdx

= <0m' 0t> = Sm, t

Using Ftn. 4 we have (iv> ",„„ = an, mHence (ii) yields the required matrix representation: k

It should be noted that the coefficients of the expansion of gn—the image of
+

JyJ' + JzK=

J

X Px

K

y Z Py Pz

and express jx, jy, jz, with the understanding that Px =

. d ~ 1 ^

etc. The commutation relations can easily be computed by writing j x , jy, y, in terms of px, py, pz. 6. Coordinate relations x = p cos , y = p sin (j), z = z give: ( d d d \ ( , d 3 - . - 3 - . 3 - = cos^^J: ^y dzj v dp

sin0 d r. p dip

•

A

sin0

d

, cos d d\ + 1 dp p d(f> dz J

Accordingly d2

( z- =

dx2

, 5 COS0

sin0
^ COS0

sin0 ^ "\ Z

{

dp p d(j)){ dp p d

2 EIGENVALUES AND EIGENFUNCTIONS

2.1 The Resolvent and the Spectrum of an Operator

In general a function $f \in X = D(\Omega)$ is transformed to another function $g$ under the action of an operator $\Omega$. In $X$ however there is a set $\{f_n\}$ of non-zero functions that satisfy the equation $\Omega f_n = \lambda_n f_n$ for some constant $\lambda_n$ ($\in \mathbb{R}$ or $\mathbb{C}$). These functions are called the eigenfunctions$^8$ of the operator and the $\lambda_n$ are called the eigenvalues. A sophisticated way to introduce the concept is as follows: If for some $\lambda \in K$ ($\mathbb{R}$ or $\mathbb{C}$) the operator $\lambda I - \Omega$ is not injective (i.e., $N(\lambda I - \Omega) = \mathrm{kernel}(\lambda I - \Omega) \neq 0$), so that there exists an $f \in D(\Omega)$, $f \neq 0$, such that $\Omega f = \lambda f$, then $\lambda$ is called the eigenvalue and $f$ is called the eigenfunction. The subspace $N(\lambda I - \Omega)$ is the eigenspace of $\lambda$, and $\dim(N(\lambda I - \Omega))$ is its multiplicity. In case $\lambda$ is not an eigenvalue (i.e., $\lambda I - \Omega$ is injective), there is a (well-defined) operator $R(\lambda, \Omega) = (\lambda I - \Omega)^{-1}$ which can be viewed as a function on the set defined below. The set $\rho(\Omega) = \{\lambda \in K : \lambda I - \Omega$ is injective, and $R(\lambda, \Omega) \in \mathcal{B}(X)\}$ is called the resolvent set of $\Omega$, and the function $R(\cdot, \Omega): \rho(\Omega) \to \mathcal{B}(X)$, $\lambda \mapsto R(\lambda, \Omega)$ is called the resolvent of $\Omega$. The set $\sigma(\Omega) = K \setminus \rho(\Omega)$ is called the spectrum of $\Omega$. The set of eigenvalues is obviously contained in $\sigma(\Omega)$; it is called the point-spectrum and is therefore denoted as $\sigma_p(\Omega)$. (For details see [17], 0.[3].)

2.2 Examples and Results on Eigenvalues and Eigenfunctions of an Operator

Example 3.2.1: Let $X$ be the space of $C^\infty$-functions defined on $\mathbb{R}^3$ and $\Omega$ be the Laplace operator $\Delta$ defined in Subsec. (1.4). The functions $f_n = \cos nx + \cos ny + \cos nz$ are eigenfunctions of the equation $\Delta f_n = \lambda_n f_n$ where $\lambda_n = -n^2$ is non-degenerate.

Example 3.2.2: The operators $M_k$ ($k$ stands for $x$, $y$, $z$) given in Subsec. (1.4) possess the eigenvalues $m(h/2\pi)$ with corresponding eigenfunctions $(2\pi)^{-1/2}\exp(-im\psi_k)$, where $\psi_k$ is the azimuth about the axis $Ok$, and $m$ equals $0, \pm 1, \pm 2, \pm 3, \ldots$. (See Sections 3 and 4 for discussions on spectrum.)

The spectrum in both examples given above is $\mathbb{Z} \cup \{0\}$. It should be noted that eigenvalues in both cases are isolated, thus the spectrum is discrete. If the eigenvalues form a continuous sequence, the spectrum is continuous and is usually referred to as band spectrum. The following results for linear operators acting on finite-dimensional vector spaces are easy to check.

Result 3.2.3: If $a$ is an eigenvalue of a linear operator $A$, then the operator $(A - a)$ has no inverse; conversely if for a scalar $a$ the operator $(A - a)$ has no inverse then $a$ must be an eigenvalue.

Result 3.2.4: The necessary and sufficient condition that a scalar $a$ be an eigenvalue of a linear operator $A$ is that $\det(A - a) = 0$.

Result 3.2.5: Let $L$ be a linear operator acting on a vector space $X$ (which may even be infinite-dimensional) and let $L^{-1}$ be its inverse; then the operators $A$ and $LAL^{-1}$ have the same eigenvalues.

$^8$ When $\Omega$ is a linear operator acting on a vector space $X$ (Euclidean or Hilbertian), we call them eigenvectors and generally denote the set as $\{\psi_n\}$.
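Results 3.2.4 and 3.2.5 are easy to try out on $2 \times 2$ matrices. The sketch below is an illustration only; the matrices `A` and `L` and all helper names are hypothetical choices, and the eigenvalues are obtained from the trace and determinant via the quadratic formula.

```python
import cmath

def eigenvalues_2x2(M):
    # Roots of det(M - a) = a^2 - (trace) a + det = 0.
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - root) / 2, (tr + root) / 2]

def det_shifted(M, lam):
    # det(M - lam * 1)
    (a, b), (c, d) = M
    return (a - lam) * (d - lam) - b * c

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 2]]
L = [[1, 2], [3, 5]]                       # invertible, det = -1
similar = matmul(matmul(L, A), inverse_2x2(L))   # L A L^{-1}

eig_A = eigenvalues_2x2(A)
eig_similar = eigenvalues_2x2(similar)
print(eig_A, eig_similar)
print(det_shifted(A, 1), det_shifted(A, 3), det_shifted(A, 2))
```

Here $\det(A - a)$ vanishes exactly at the two eigenvalues and not at the intermediate value $a = 2$, and the similar matrix has the same pair of eigenvalues.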

2.3 Hermitian Operators

We next see that Hermitian operators, denoted as $H$, have the following properties:
(1) The eigenvalues are real.
(2) The eigenfunctions corresponding to distinct eigenvalues are orthogonal.
(3) The operator $H$ can always be represented by a Hermitian matrix.

To prove (1) we pre-multiply the eigenvalue equation $Hf_n = \lambda_n f_n$ by $f_n^*$, integrate and use the defining relation (3.1.9)-(3.1.10) to obtain:

$\langle f_n|H|f_n\rangle = \lambda_n \langle f_n, f_n\rangle$.  (3.2.1)

In view of the Hermitian property (3.1.14) of $H$ we further have

$\langle f_n|H|f_n\rangle = \langle f_n|H|f_n\rangle^* = \lambda_n^* \langle f_n, f_n\rangle$

which gives $\lambda_n = \lambda_n^*$, showing that $\lambda_n$ is real. To prove (2) we pre-multiply $Hf_n = \lambda_n f_n$ and $(Hf_m)^* = \lambda_m f_m^*$ by $f_m^*$ and $f_n$ respectively, and integrate. This gives:

$\langle f_m|H|f_n\rangle = \lambda_n \langle f_m, f_n\rangle$  (3.2.2)

$\langle f_n|H|f_m\rangle^* = \lambda_m \langle f_n, f_m\rangle^*$  (3.2.3)

which equals

$\langle f_m|H|f_n\rangle = \lambda_m \langle f_m, f_n\rangle$  (3.2.4)

since $H$ is Hermitian. Equating (3.2.2) and (3.2.4) we have the orthogonality of $f_n$ and $f_m$ as $\lambda_m \neq \lambda_n$ (see Exercise 1 for equal eigenvalues). To prove (3) we begin by choosing $\Omega$ as an arbitrary operator. Suppose that $f$ is an eigenfunction of $\Omega$, i.e., $\Omega f = \lambda f$. Let $\{f_n\}$ be a complete set of orthonormal functions; then $f = \sum_k c_k f_k$ implies that the above relation can be put as

$\Omega \sum_k c_k f_k = \lambda \sum_k c_k f_k$.  (3.2.5)

We now pre-multiply by $f_m^*$ and integrate to obtain

$\sum_k \Omega_{mk}\, c_k = \lambda\, c_m$, where $\Omega_{mk} = \langle f_m|\Omega|f_k\rangle$.  (3.2.6)

If we choose now $\Omega$ as a Hermitian operator, then

$\langle f_m|\Omega|f_k\rangle = \langle f_k|\Omega|f_m\rangle^*$

implies that $\Omega_{mk} = \Omega_{km}^*$. Therefore, in view of Exercise 2 of Sec. 1, we conclude that the matrix is Hermitian. From this property it follows that the eigenvalue problem of a Hermitian operator is equivalent to the eigenvalue problem of a Hermitian matrix (which may be of infinite order sometimes, see Exercise 4). For instance, since $(\Omega_{mk})$ is a Hermitian matrix, there exists a unitary matrix $U = (U_{kl})$ that diagonalizes the matrix $(\Omega_{mk})$:

$\sum_{m,k} (U^{-1})_{lm}\,\Omega_{mk}\,U_{kl'} = \lambda_l\,\delta_{ll'}$  (3.2.7)

and thus gives the eigenvalues of the operator $\Omega$.
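Properties (1) and (2) can be checked on a small Hermitian matrix. The sketch below is an illustration under assumptions: the $2 \times 2$ matrix `H` is an arbitrary example, the helper names are hypothetical, and the scalar product is taken conjugate-linear in its first argument.

```python
import cmath

def eigenvalues_2x2(M):
    # Roots of the characteristic polynomial of a 2x2 matrix.
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - root) / 2, (tr + root) / 2]

def eigenvector_2x2(M, lam):
    # For b != 0, (b, lam - a) solves (M - lam) v = 0 whenever lam is an eigenvalue.
    (a, b), (c, d) = M
    return [b, lam - a]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def inner(phi, psi):
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))

H = [[2, 1 - 1j], [1 + 1j, 3]]             # equal to its conjugate transpose

lam1, lam2 = eigenvalues_2x2(H)
v1, v2 = eigenvector_2x2(H, lam1), eigenvector_2x2(H, lam2)

# Residuals of H v = lam v, the imaginary parts of the eigenvalues, and (v1, v2).
residual1 = max(abs(w - lam1 * x) for w, x in zip(apply(H, v1), v1))
residual2 = max(abs(w - lam2 * x) for w, x in zip(apply(H, v2), v2))
overlap = abs(inner(v1, v2))
print(lam1, lam2, overlap)
```

For this matrix the two eigenvalues come out real and distinct, and the corresponding eigenvectors have vanishing scalar product, as properties (1) and (2) require.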


Another property which makes Hermitian operators rather important is the following:

Proposition 3.2.6: Two Hermitian operators commute iff they have a basis of common eigenfunctions.

Proof: (Nec.) Let $f_1, \ldots, f_i, \ldots$ be the common set of eigenfunctions for operators $H$ and $H'$. Then the eigenequations $Hf_i = \lambda_i f_i$, $H'f_i = \lambda_i' f_i$ taken together imply

(a) $HH'f_i = H(\lambda_i' f_i) = \lambda_i'\lambda_i f_i$

and

(b) $H'Hf_i = H'(\lambda_i f_i) = \lambda_i\lambda_i' f_i$.

Since the RHS of (a) and (b) are equal, it shows that (c) $HH' = H'H$.

(Suff.) Starting with the commutativity (c) we now show that they have the same set of eigenfunctions. We assume that they have different sets of eigenfunctions $f_1, \ldots, f_i, \ldots$; $f_1', \ldots, f_i', \ldots$. Their eigenequations accordingly are:

(d) $Hf_i = \lambda_i f_i$,

and

(e) $H'f_i' = \lambda_i' f_i'$.

Pre-multiplying (d) by $H'$ and using (c) we have

(f) $H'Hf_i = H'\lambda_i f_i = \lambda_i (H'f_i) = H(H'f_i)$.

The extreme RHS of equation (f) shows that if $\lambda_i$ is treated as a non-degenerate eigenvalue, then apparently $(H'f_i)$ is an eigenfunction of $H$, which must necessarily be a constant multiple of $f_i$. In other words $H'f_i = k f_i$, i.e., $f_i$ is also an eigenfunction of $H'$. Since the choice of $f_i$ in (d) is arbitrary, it follows that the set of eigenfunctions of $H$ is also the set of eigenfunctions of $H'$. Similarly, using the equation (e) it can be shown that eigenfunctions of $H'$ are also the eigenfunctions of $H$. We wish to mention here that the conclusions will be the same even when we delete the requirement of non-degeneracy of the eigenvalue in the above arguments and use instead an $m$-fold degenerate eigenvalue (for proofs see [5], [11] and Hint to Exercise 1).
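A small matrix example may make Proposition 3.2.6 concrete. In the sketch below the three real symmetric (hence Hermitian) matrices and the helper names are hypothetical choices for illustration: `H1` and `H2` commute and share the eigenvectors $(1, 1)$ and $(1, -1)$, whereas `H3` does not commute with `H2` and does not have $(1, 1)$ as an eigenvector.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def is_eigenvector(M, v):
    # v (non-zero) is an eigenvector iff M v is parallel to v.
    w = apply(M, v)
    return w[0] * v[1] - w[1] * v[0] == 0

H1 = [[2, 1], [1, 2]]
H2 = [[0, 1], [1, 0]]
H3 = [[1, 0], [0, -1]]
common_basis = [[1, 1], [1, -1]]

commute_12 = commutator(H1, H2) == [[0, 0], [0, 0]]
shared_12 = all(is_eigenvector(H1, v) and is_eigenvector(H2, v) for v in common_basis)
commute_32 = commutator(H3, H2) == [[0, 0], [0, 0]]
shared_32 = all(is_eigenvector(H3, v) and is_eigenvector(H2, v) for v in common_basis)
print(commute_12, shared_12, commute_32, shared_32)
```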

2.4

Properties of Commuting Operators

Since in quantum mechanics measurable quantities (observables) always correspond to (bounded) operators, the result proved above leads to an important consequence.

Remark 3.2.7: In a given physical system two measurable quantities $A$ and $B$ cannot be measured simultaneously unless the operators representing them commute. For example, the coordinate $q$ and its conjugate $p$, the linear momentum, cannot be measured simultaneously. To see this consider the position operator $Q$, which assigns the coordinate $q$ to a measurable quantity, and the momentum operator⁹ $P = \frac{h}{2\pi i}\frac{\partial}{\partial q}$, and note that

$$(PQ - QP) f(q) = \frac{h}{2\pi i}\frac{\partial}{\partial q}\bigl(q f\bigr) - q\,\frac{h}{2\pi i}\frac{\partial f}{\partial q} = \frac{h}{2\pi i}\, f,$$

where $f$ is a differentiable function of $q$. This implies that $PQ - QP = h/2\pi i$, showing the non-commutativity of $Q$ and $P$. If, however, there are several $Q$'s and $P$'s with the equalities $Q_i Q_j = Q_j Q_i$, $P_i P_j = P_j P_i$, leading to $P_i Q_j = Q_j P_i$, we note that for $i \neq j$ the measurable quantities can be measured simultaneously.

⁹ $P$ and $Q$ of the previous section are the same except for the constant and the coordinate notation.
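The commutation relation of Remark 3.2.7 can be checked on polynomials, where differentiation is exact. The following is only a minimal sketch: the helper names (`apply_P`, `apply_Q`, `commutator_PQ`) and the choice of units with $h = 1$ are assumptions made for illustration, and a polynomial $f(q) = \sum_k c_k q^k$ is stored as its list of coefficients.

```python
import math

H_PLANCK = 1.0                      # assumption: units in which h = 1
C = H_PLANCK / (2j * math.pi)       # the constant h / (2 pi i)

def apply_Q(coeffs):
    """Multiply by q: every coefficient moves up one degree."""
    return [0j] + list(coeffs)

def apply_P(coeffs):
    """Apply (h / 2 pi i) d/dq to a polynomial given by its coefficients."""
    derivative = [k * c for k, c in enumerate(coeffs)][1:]
    return [C * c for c in derivative] or [0j]

def subtract(a, b):
    """Coefficient-wise difference, padding the shorter polynomial with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0j] * (n - len(a))
    b = list(b) + [0j] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def commutator_PQ(coeffs):
    """Return the coefficients of (PQ - QP) f."""
    return subtract(apply_P(apply_Q(coeffs)), apply_Q(apply_P(coeffs)))

f = [2.0, -1.0, 0.5, 3.0]           # f(q) = 2 - q + 0.5 q^2 + 3 q^3
lhs = commutator_PQ(f)              # (PQ - QP) f
rhs = [C * c for c in f]            # (h / 2 pi i) f
max_error = max(abs(x - y) for x, y in zip(lhs, rhs))
```

Here `max_error` vanishes up to rounding, in agreement with $(PQ - QP)f = (h/2\pi i) f$ for this particular $f$.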

Remark 3.2.8: If two arbitrary operators $A$ and $B$ have an eigenfunction $\phi$ in common, then either (i) they commute, or (ii) their commutator $[A, B]$ has a zero eigenvalue. Since $A$ and $B$ share the same eigenfunction $\phi$, for some constants $\alpha$ and $\beta$ ($\alpha \neq \beta$) we have $A\phi = \alpha\phi$ and $B\phi = \beta\phi$. This implies:

$$(AB - BA)\phi = [A, B]\phi = \beta A\phi - \alpha B\phi = \beta\alpha\phi - \alpha\beta\phi = 0 \tag{3.2.8}$$

Hence either $[A, B] = 0$, i.e., $A$ and $B$ commute, in which case the operators $A$ and $B$ have the same set of eigenfunctions, or $[A, B] \neq 0$ has a zero eigenvalue, as is obvious from the extreme right-hand equality of (3.2.8). To illustrate (ii) further, we consider the operators $M_x$, $M_y$, $M_z$ which satisfy the relation:

$$[M_x, M_y] = \frac{ih}{2\pi} M_z.$$

The set of eigenvalues of $M_z$, given by $m(h/2\pi)$ where $m = 0, 1, 2, \dots$, includes the zero eigenvalue for the commutator $[M_x, M_y]$, and in this case the operators $M_x$, $M_y$ share the common eigenfunction $\phi = \text{const}$. Note that the eigenfunctions of $M_x$ and $M_y$ are respectively $(2\pi)^{-1/2}\exp(-im\psi_x)$ and $(2\pi)^{-1/2}\exp(-im\psi_y)$, where $\psi_x$ and $\psi_y$ are azimuths with respect to $Ox$ and $Oy$. For $m = 0$ these eigenfunctions coincide with $(2\pi)^{-1/2}$.

Sometimes it is useful to consider not eigenvectors but generalized eigenvectors of a linear operator, which are defined as follows:

Definition 3.2.9: Associated with an eigenvalue $\lambda$ of an operator $L$, a vector $\xi$ is called a generalized eigenvector of order $\nu$ if $(L - \lambda)^{\nu}\xi = 0$, and for every $\nu' < \nu$, $(L - \lambda)^{\nu'}\xi \neq 0$. Naturally, given an eigenvalue $\lambda$ of $L$, the generalized eigenvector for $\nu = 1$ is an eigenvector. (See [14] for details.)
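Definition 3.2.9 can be illustrated on a $2 \times 2$ Jordan block. The sketch below is an assumption-laden toy: the matrix `L` and the helper `generalized_order` are hypothetical names chosen for this example, and exact integer arithmetic is used so that the test "equals zero" is meaningful.

```python
def mat_vec(m, v):
    """Matrix-vector product for a square matrix stored as nested lists."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def shifted(m, lam):
    """Return the matrix m - lam * identity."""
    n = len(m)
    return [[m[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

def generalized_order(m, lam, v, max_order=None):
    """Smallest nu >= 1 with (m - lam)^nu v = 0, or None if there is none up to max_order."""
    max_order = max_order or len(m)
    if all(x == 0 for x in v):
        return None                  # the zero vector is excluded
    s = shifted(m, lam)
    w = list(v)
    for nu in range(1, max_order + 1):
        w = mat_vec(s, w)            # w now equals (m - lam)^nu v
        if all(x == 0 for x in w):
            return nu
    return None

L = [[2, 1],
     [0, 2]]                         # Jordan block with the single eigenvalue 2

order_e1 = generalized_order(L, 2, [1, 0])   # ordinary eigenvector, order 1
order_e2 = generalized_order(L, 2, [0, 1])   # generalized eigenvector, order 2
```

The vector $(0, 1)$ is not an eigenvector of `L`, yet $(L - 2)^2$ annihilates it, so it is a generalized eigenvector of order 2, and $(L - 2)(0,1)^T = (1,0)^T$ is of order 1.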

Exercise 3.2

1. Prove the results (3.2.3) and (3.2.4).
2. Prove the result (3.2.5) and show that the result holds good for infinite-dimensional spaces as well.
3. Show that for a Hermitian operator it is always possible to select an orthogonal (orthonormal) set of eigenfunctions even in the case of degenerate eigenvalues.
4. Given a Hermitian operator $\Omega$ and its matrix representation $(\Omega_{ij})$, show that the eigenvector pertaining to an eigenvalue $\lambda_k$ of the matrix $(\Omega_{ij})$ uniquely determines the corresponding eigenfunction for $\Omega$.
5. Show that the measurable quantities represented by the operators $M_x$ and $M_y$ cannot be simultaneously measured, whereas the quantities represented by the pairs $(M^2, M_x)$, etc., can be simultaneously measured.
6. Show that the Casimir operator*
$$J^2 = \sum_{r=1}^{3} J_r^2$$
commutes with each $J_r$. Also show that the raising and lowering operators $J_\pm = J_1 \pm iJ_2$ satisfy the commutation rules $[J_+, J_-] = 2J_3$, $[J_\pm, J_3] = \mp J_\pm$ (see Sec. 7 and the hints to the exercise there).
7. Show that every eigenvalue of a unitary operator is a complex number of absolute value 1.
8. Consider
$$\exp(tA) = I + tA + \frac{t^2}{2!}A^2 + \frac{t^3}{3!}A^3 + \cdots$$
as a curve of linear operators and $\frac{d}{dt}(\exp tA)$ as its tangent vector at time $t$; then show that the tangent vector to the curve $L(\exp tA)L^{-1}$ at $t = 0$ is $LAL^{-1}$, where $L$ is any invertible operator.
9. Show that if there exists a generalized eigenvector $x$ of order $\nu$ for a given eigenvalue $\lambda$, then the elements $(L - \lambda)x, (L - \lambda)^2 x, \dots, (L - \lambda)^{\nu-1}x$ are respectively generalized eigenvectors of order $(\nu - 1), (\nu - 2), \dots, 1$ associated with $\lambda$.

* Given a non-abelian group $G$, a nonlinear function of the generators of $G$ that commutes with all of its generators is called a Casimir operator. It is usually denoted by $C$.

Hints to Exercise 3.2

1. Since $\alpha$ is an eigenvalue, there exists an eigenvector $\psi$ such that $A\psi = \alpha\psi$; this implies $(A - \alpha)\psi = 0$. In view of Result (3.1.4)(1) it follows that $(A - \alpha)$ has no inverse; conversely, if $(A - \alpha)\psi = 0$ holds for a non-zero vector $\psi$, then by definition $\alpha$ must be an eigenvalue. Result (3.2.4) is obvious from condition (4) of Result (3.1.4).

2. Suppose that $\alpha$ is an eigenvalue of $A$; then for a non-zero $\psi$ we have $A\psi = \alpha\psi$. Also, since $L$ has an inverse $L^{-1}$, from (1) of Result (3.1.4), $L\psi \neq 0$. Accordingly
$$LAL^{-1}(L\psi) = LA(L^{-1}L)\psi = L\alpha\psi = \alpha L\psi.$$
As $L\psi$ is a non-zero vector, it follows that $\alpha$ is an eigenvalue of $LAL^{-1}$. If we start by choosing $\alpha$ to be an eigenvalue of $LAL^{-1}$, we can show that it is an eigenvalue of $A$ simply by writing $A = L^{-1}(LAL^{-1})L$, together with the fact that $L^{-1}\psi$ is non-zero if $\psi$ is an eigenvector of $LAL^{-1}$.

3. When $\lambda_i \neq \lambda_j$, the eigenequations $Hf_i = \lambda_i f_i$, $Hf_j = \lambda_j f_j$ with a Hermitian operator $H$ imply¹⁰
$$\int_X \bigl(f_j^*(Hf_i) - f_i (Hf_j)^*\bigr)\,dx = 0 = (\lambda_i - \lambda_j)\int_X f_j^* f_i\,dx,$$
which shows that the eigenfunctions $f_i$ form an orthogonal set. If $\lambda_i$ is $s$-fold degenerate, there are $s$ linearly independent eigenfunctions $f_{i1}, \dots, f_{is}$ that may not be orthogonal. They can be replaced by an orthogonal set in the following manner. Choose a set of $s$ functions as:

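$$f'_{i1} = f_{i1}, \qquad f'_{i2} = f_{i2} - \frac{(f_{i1}, f_{i2})}{(f_{i1}, f_{i1})}\, f_{i1}, \qquad \dots, \qquad f'_{is} = f_{is} - \frac{(f_{i1}, f_{is})}{(f_{i1}, f_{i1})}\, f_{i1}.$$

¹⁰ It is assumed that the eigenfunctions $(f_i)$ defined on $X$ are integrable.

The new set consists of $(s - 1)$ eigenfunctions $f'_{i2}, \dots, f'_{is}$, each of which, together with their linear combinations, is orthogonal to $f'_{i1}$. Either these $(s - 1)$ functions are mutually orthogonal or, if not, the above procedure can be repeated by setting:

$$f''_{i2} = f'_{i2}, \qquad f''_{i3} = f'_{i3} - \frac{(f'_{i2}, f'_{i3})}{(f'_{i2}, f'_{i2})}\, f'_{i2}, \qquad \dots$$

7. If $u$ is an eigenvalue of a unitary operator $U$ with eigenvector $\psi$, then
$$(\psi, \psi) = (U\psi, U\psi) = (u\psi, u\psi) = \bar{u}u\,(\psi, \psi),$$
which shows that $\bar{u}u$ must be 1.

8. Since $(LAL^{-1})^n = LA^nL^{-1}$ for all $n$, it follows that $\exp t(LAL^{-1}) = L(\exp tA)L^{-1}$, and the derivative at $t = 0$ is the coefficient of $t$ in the exponential series.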
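The orthogonalization procedure of Hint 3 can be sketched numerically. In the sketch below, finite lists of complex numbers stand in for the degenerate eigenfunctions, and the inner product $(f, g) = \sum_k f_k^* g_k$ replaces the integral over $X$; both are assumptions made for illustration, and the input vectors are assumed linearly independent.

```python
def inner(f, g):
    """Inner product (f, g), conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def gram_schmidt(vectors):
    """Replace linearly independent vectors by a mutually orthogonal set."""
    orthogonal = []
    for v in vectors:
        w = [complex(x) for x in v]
        for u in orthogonal:
            # Remove from w its component along the already accepted u.
            coeff = inner(u, w) / inner(u, u)
            w = [wk - coeff * uk for wk, uk in zip(w, u)]
        orthogonal.append(w)
    return orthogonal

# Three independent but non-orthogonal vectors, playing the role of f_i1, f_i2, f_i3.
degenerate = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
result = gram_schmidt(degenerate)
```

Any linear combination of eigenfunctions belonging to the same eigenvalue is again an eigenfunction for that eigenvalue, which is why the subtraction step does not leave the eigenspace.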

3 SOME PROPERTIES OF OPERATORS

We devote this section to a few more definitions of operators and their properties, and also to questions related to the spectral decompositions of the operators defined in previous sections. Most of these results are given without proofs, but with references where the proofs can be found.

3.1 Projection Operators and their Properties

Definition 3.3.1: Let $\mathcal{H}$ be a separable Hilbert space and $M$ one of its subspaces. Every vector $\psi \in \mathcal{H}$ can be uniquely decomposed into a sum of vectors $\psi_M$ and $\psi_{M^\perp}$ belonging to $M$ and its orthogonal complement $M^\perp$ respectively. An operator which maps every $\psi$ to $\psi_M$ is called a projection operator on $M$, and is denoted $E_M$. Evidently $M$ is an invariant subspace under the action of $E_M$. The projection operator on the whole space $\mathcal{H}$ is the identity operator $I$, and on $M^\perp$ there is the projection operator $E'$, which is related to $E_M$ as:

$$E' = 1 - E_M \tag{3.3.1}$$

Thus every vector $\psi \in \mathcal{H}$, in relation to these two projection operators, satisfies:

$$(E_M + E')\psi = E_M\psi + E'\psi = \psi_M + \psi_{M^\perp} \tag{3.3.2}$$

It can be easily verified that every projection operator is self-adjoint. In fact we have the following result, which characterizes a bounded linear operator as a projection operator.

Result 3.3.2: (i) A bounded linear operator $E$ is a projection operator if and only if $E^2 = E = E^+$. (ii) The subspace onto which $E$ projects consists of the vectors $E\psi$ for all vectors $\psi$ in $\mathcal{H}$ ([6], [7]; see Exercise 1).
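The relations (3.3.1), (3.3.2) and the criterion $E^2 = E = E^+$ of Result 3.3.2 can be checked on a small example. The sketch below assumes a real two-dimensional space, so that the adjoint reduces to the transpose, and projects onto the line spanned by a hypothetical unit vector `v`; exact rational arithmetic is used so that equalities hold without rounding.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(a, v):
    """Matrix-vector product."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def transpose(a):
    """Transpose, which is the adjoint for a real matrix."""
    return [list(row) for row in zip(*a)]

# Unit vector spanning the subspace M (a hypothetical choice for illustration).
v = [Fraction(3, 5), Fraction(4, 5)]

# E_M = v v^T projects onto M; E' = 1 - E_M projects onto the orthogonal complement.
E_M = [[v[i] * v[j] for j in range(2)] for i in range(2)]
E_prime = [[(1 if i == j else 0) - E_M[i][j] for j in range(2)] for i in range(2)]

psi = [Fraction(2), Fraction(-1)]
psi_M = mat_vec(E_M, psi)          # component of psi in M
psi_perp = mat_vec(E_prime, psi)   # component of psi in the complement
```

The two components add up to `psi` and are mutually orthogonal, which is the content of (3.3.2) for this particular vector.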


Definition 3.3.3: Two arbitrary projection operators $E_1$ and $E_2$ are said to be orthogonal if the subspaces $M_1$, $M_2$ on which they act are orthogonal.* It can be easily checked that in this case:

$$E_1E_2 = E_2E_1 \tag{3.3.3}$$

(in fact both products vanish). The operators $E_M$ and $E'$ are obviously orthogonal. In case the subspaces $M_1$ and $M_2$ are such that $M_1 \subset M_2$, we say that $E_1 \leq E_2$ or equivalently $E_2 \geq E_1$. The following result uses these definitions.

Result 3.3.4: Let $E_1$ and $E_2$ be projection operators onto subspaces $M_1$ and $M_2$. We have:
(i) If $E_1E_2 = E_2E_1$, then $E_1E_2$ is a projection operator that projects on the subspace $M_1 \cap M_2$.
(ii) If $E_1$ and $E_2$ are orthogonal, then $E_1 + E_2$ is the projection operator onto $M_1 \oplus M_2$.
(iii) If $E_1 \leq E_2$, then $E_2 - E_1$ is a projection operator, and the subspace on which it projects is the orthogonal complement of $M_1$ in $M_2$.
Note that (iii) is obvious in view of Def. (3.3.1).

Definition 3.3.5: Let $X$ be a subspace of another space $Y$ and let $\mathcal{A}$ be a linear operator such that for every vector $\psi$ in $X$, $\mathcal{A}\psi$ is in $X$, whereas for every vector $\phi$ in $X^\perp$, $\mathcal{A}\phi$ is in $X^\perp$; then we say that $X$ reduces $\mathcal{A}$.

If $\alpha$ is an eigenvalue of the operator $\mathcal{A}$ on $X$ and $X_\alpha$ denotes the collection of all vectors $\psi$ that satisfy $\mathcal{A}\psi = \alpha\psi$, then it can be easily seen that $X_\alpha$ is a subspace that reduces $\mathcal{A}$. Moreover, if $\mathcal{A}$ is a Hermitian (or unitary) operator on $X$, these subspaces corresponding to different eigenvalues $\alpha_i$ are mutually orthogonal, and are naturally the spaces spanned by the corresponding eigenvectors. If $X = \oplus X_{\alpha_i}$, then $X$ reduces $\mathcal{A}$; as a matter of fact $\mathcal{A}$ acts on $X$ as a diagonal matrix. Thus every operator $\mathcal{A}$ which is Hermitian or unitary splits into two separate parts $\mathcal{A}_1$ and $\mathcal{A}_2$: the part $\mathcal{A}_1$ operates on the space spanned by its eigenvectors and $\mathcal{A}_2$ acts on its orthogonal complement. The operator $\mathcal{A}_1$ can be represented by a diagonal matrix consisting of its eigenvalues, once an orthonormal basis of eigenvectors is chosen for the space $X = \oplus X_{\alpha_i}$. The following results further characterize the unique features of Hermitian and unitary operators.

3.2 More on Hermitian and Unitary Operators

Proposition 3.3.6: Every finite-dimensional complex vector space¹¹ acted upon by Hermitian or unitary operators can be spanned by their eigenvectors.

Proof: Let $\mathcal{A}$ be a Hermitian/unitary operator that acts on a given space $X$. Let $X_{\mathcal{A}}$ denote the vector space spanned by its eigenvectors and $E$ be the projection operator on $X_{\mathcal{A}}$. Assume that $X_{\mathcal{A}} \neq X$. Now we know that $X_{\mathcal{A}}$ reduces $\mathcal{A}$, and since by our assumption $X_{\mathcal{A}} \neq X$, we shall have its orthogonal complement $X_{\mathcal{A}}^\perp \subset X$ along with the operator $\mathcal{A}(1 - E)$ that acts on it. Since $X_{\mathcal{A}}^\perp$ is finite-dimensional, $\mathcal{A}(1 - E)$ must have a non-zero eigenvector $\phi$ along with a scalar $b$ such that $\mathcal{A}(1 - E)\phi = b\phi$. Writing $\mathcal{A} = \mathcal{A}E + \mathcal{A}(1 - E)$ and noting that $E\phi = 0$, we have:

$$\mathcal{A}\phi = \mathcal{A}E\phi + \mathcal{A}(1 - E)\phi = b\phi,$$

which means that $\phi \in X_{\mathcal{A}}^\perp$ is indeed an eigenvector of $\mathcal{A}$; thus our assumption that $X_{\mathcal{A}} \neq X$ is incorrect. ∎

* Two vector spaces are orthogonal iff the sets of their basis vectors are orthogonal.
¹¹ If $X$ is a real vector space, the result holds good only for a Hermitian operator.


In view of the above result, it follows that a Hermitian/unitary operator acting on a finite-dimensional complex vector space can always be given a diagonal matrix representation. Our next proposition shows how an arbitrary linear operator on a Hilbert space can be thought of as a collection of projection operators. This naturally implies that the simple properties of projection operators can be utilized to learn about operators of a more complex nature.

Proposition 3.3.7: Let $\mathcal{H}$ be an $n$-dimensional Hilbert space and let $\mathcal{A}$ be a linear operator on it. Then for a suitable choice of orthonormal basis (a) the matrix representation of $\mathcal{A}$ is upper triangular; (b) equivalently there exist projection operators $0 = E_0 < E_1 < E_2 < \cdots < E_n = 1$ that satisfy:
(i) $(1 - E_i)\mathcal{A}E_i = 0$, $\quad 0 \leq i \leq n$;
(ii) the algebra generated by $\{E_i\}$ is maximal abelian.¹²

Proof: To prove (b), as a first step we must show that, given a basis $\{e_i\}$ of orthonormal vectors in $\mathcal{H}$, projection operators satisfying the inclusion relation can be defined. The trivial projection operator, acting on the span of $\{e_i\}$, is $E_n = 1$. Since
$$1 \cdot \mathcal{H} = E_n\mathcal{H} = \mathcal{H} \;\Rightarrow\; E_n^2\mathcal{H} = E_n\mathcal{H},$$
we have $E_n^2 = E_n$. Let $\mathcal{H}_{n-1}$ denote the space spanned by the basis vectors $e_1, e_2, \dots, e_{n-1}$ (excluding $e_n$); using the above argument we can define a projection operator $E_{n-1}$ on $\mathcal{H}_{n-1}$. Evidently $E_{n-1} < E_n$ since $\mathcal{H}_{n-1} \subset \mathcal{H}$. The spaces $\mathcal{H}_{n-2}, \dots, \mathcal{H}_1$ can be constructed likewise, and projection operators satisfying the inclusion relation can be formed. The operator $E_0$ in this sequence represents the zero operator, which carries every element to the null element of $\mathcal{H}$.

The operators $E_i$ defined on $\mathcal{H}_i$ as projection operators are in fact projection operators on $\mathcal{H}$. Since $\mathcal{H} = \mathcal{H}_i + \mathcal{H}_i^\perp$, by the very construction $E_i$ satisfies $E_i\mathcal{H}_i = \mathcal{H}_i$ and $E_i\mathcal{H}_i^\perp = 0$; therefore we have from these two equalities:
$$E_i\mathcal{H}_i + E_i\mathcal{H}_i^\perp = E_i\mathcal{H} = \mathcal{H}_i + 0 = \mathcal{H}_i$$
and $E_i(E_i\mathcal{H}) = E_i\mathcal{H}_i = E_i\mathcal{H}$, thus $E_i^2 = E_i$ on $\mathcal{H}$. More explicitly, if $f \in \mathcal{H}$ has components $h \in \mathcal{H}_i$ and $g \in \mathcal{H}_i^\perp$, then $E_i^2 h = E_i h = h$ and $E_i^2 g = E_i g = 0$.

Since $\mathcal{H}$ reduces the given operator $\mathcal{A}$ trivially, (i) is immediate in view of the results $E_i\mathcal{A} = \mathcal{A}E_i$, $(1 - E_i)\mathcal{A} = \mathcal{A}(1 - E_i)$ given in Exercise 3 (see the hint for the proof). We post-multiply the second equality by $E_i$ and obtain Eq. (i) of (b):
$$(1 - E_i)\mathcal{A}E_i = \mathcal{A}(1 - E_i)E_i = \mathcal{A}E_i - \mathcal{A}E_i^2 = 0.$$
To prove (ii) of (b), we note that in view of Def. (3.1.1) the algebra $\mathcal{B}(\mathcal{H})$ is generated by $\{E_i\}$, and from Remark (3.2.8) it is abelian.

We have so far dealt with operators (i) that are bounded and (ii) that operate on finite-dimensional spaces. From now on we shall relax these two conditions, for as we shall see in the following example, unbounded operators are not hard to find.

Example 3.3.8: Let $P$ be a linear operator on the space $L^2(-\infty, \infty)$ of square integrable functions $f(x)$, defined by the rule:

$$(Pf)(x) = x f(x).$$

The operator has the Hermitian property

$$(g, Pf) = \int_{-\infty}^{\infty} g(x)^*\, x f(x)\,dx = \int_{-\infty}^{\infty} \bigl(x g(x)\bigr)^* f(x)\,dx = (Pg, f)$$

only if the integrals converge. The operator is evidently not bounded, since

$$\|Pf\|^2 = \int_{-\infty}^{\infty} |x f(x)|^2\,dx$$

will often be many times larger than

$$\|f\|^2 = \int_{-\infty}^{\infty} |f(x)|^2\,dx.$$

¹² That is, if there is any other abelian algebra of operators on $\mathcal{H}$ that contains it, then that algebra must coincide with it. See Lecture 1 in [1] for details.

… $D(\mathcal{A})\}$ is a closed subspace of $\mathcal{H} \oplus \mathcal{H}$. More explicitly, if $\{\psi_n\}$ is a convergent sequence in $D(\mathcal{A})$ with the limit vector $\psi$, and the sequence $\{\mathcal{A}\psi_n\}$ has the limit vector $\phi$, then $\psi$ must be in $D(\mathcal{A})$ and $\mathcal{A}\psi$ must equal $\phi$.

Definition 3.3.11: An operator $\mathcal{A}: \mathcal{H} \to \mathcal{H}$ is said to be symmetric if $(\mathcal{A}\psi, \phi) = (\psi, \mathcal{A}\phi)$ for all vectors $\phi, \psi \in D(\mathcal{A})$ (i.e., it is self-adjoint) and $D(\mathcal{A})$ is dense.¹³ The corresponding adjoint operator $\mathcal{A}^+$ is now given by:

$$(\phi, \mathcal{A}^+\psi) = (\mathcal{A}\phi, \psi) \tag{3.3.4}$$

for every vector $\phi$ in $D(\mathcal{A})$. Moreover, $D(\mathcal{A})$ is dense, therefore $\mathcal{A}^+\psi = \mathcal{A}\psi$ for every $\psi$ in $D(\mathcal{A})$. Thus $\mathcal{A}^+$ can be viewed as an extension of $\mathcal{A}$; in other words, the domain $D(\mathcal{A}^+)$ can be larger than $D(\mathcal{A})$. In case they are the same, $\mathcal{A} = \mathcal{A}^+$. From the above it is clear that if an operator $\mathcal{A}$ is not symmetric, the next best thing we can do is to ask that it be closed. If the operator is not closed but is defined on a dense domain, then it can be shown that its adjoint $\mathcal{A}^+$ is closed.

Remark: We emphasize that for operators belonging to $\mathcal{B}(\mathcal{H})$ the notions of Hermitian, symmetric, and self-adjoint are equivalent.
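The unboundedness of the multiplication operator of Example 3.3.8 can be made concrete with a family of test functions. The sketch below assumes, purely for illustration, that $f_n$ is the indicator function of the interval $[n, n+1]$, for which both norms have closed forms: $\|f_n\|^2 = 1$ and $\|Pf_n\|^2 = \int_n^{n+1} x^2\,dx = \bigl((n+1)^3 - n^3\bigr)/3$.

```python
from fractions import Fraction

def norm_sq_f(n):
    """||f_n||^2 for f_n = indicator function of [n, n+1] (an assumed test family)."""
    return Fraction(1)

def norm_sq_Pf(n):
    """||P f_n||^2 = integral of x^2 over [n, n+1], evaluated in closed form."""
    return Fraction((n + 1) ** 3 - n ** 3, 3)

# Ratio ||P f_n||^2 / ||f_n||^2 for a few values of n.
ratios = [norm_sq_Pf(n) / norm_sq_f(n) for n in (0, 1, 10, 100)]
```

The ratios grow like $n^2$, so no single constant $c$ can satisfy $\|Pf\| \leq c\|f\|$ for all $f$, which is what unboundedness means here.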

Exercise 3.3

1. Prove Result (3.3.2).
2. Show that a projection operator $E$ has only two eigenvalues, which are 1 and 0.
3. Let $\mathcal{A}$ be a linear operator and $X$ a subspace that reduces $\mathcal{A}$. If $E$ is a projection operator onto $X$, then show that (i) $\mathcal{A}E = E\mathcal{A}$ and (ii) $(1 - E)\mathcal{A} = \mathcal{A}(1 - E)$. Conversely, an arbitrary operator $\mathcal{A}$ that satisfies (i) and (ii) must be such that $X$ reduces it.
4. Show that an operator $\mathcal{A}$ defined on a dense domain has an adjoint which is closed.

¹³ Since in the physics literature Hermitian means self-adjoint, using physicists' terminology, $\mathcal{A}$ is symmetric if it is Hermitian and is densely defined.


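Hints to Exercise 3.3

1. If $E$ is a projection operator on a subspace $\mathcal{H}$ of $\mathcal{H}'$, then for every vector $\psi \in \mathcal{H}'$:
$$E^2\psi = E(E\psi) = E\psi_{\mathcal{H}} = \psi_{\mathcal{H}} = E\psi,$$
thus $E^2 = E$. Also, since $\phi = \phi_{\mathcal{H}} + \phi_{\mathcal{H}^\perp}$ and $\psi = \psi_{\mathcal{H}} + \psi_{\mathcal{H}^\perp}$, we have
$$(\phi, E\psi) = (\phi, \psi_{\mathcal{H}}) = (\phi_{\mathcal{H}}, \psi_{\mathcal{H}}) = (\phi_{\mathcal{H}}, \psi) = (E\phi, \psi),$$
so that $E = E^+$. … say, but by definition of $E$ the vector $\phi = \psi_{\mathcal{H}}$.

To show the converse we assume that $E$ is a bounded operator that satisfies $E^2 = E = E^+$, and denote by $\mathcal{H}$ the set of vectors $E\psi$ for all $\psi$. We shall show that $E$ projects on $\mathcal{H}$. Linearity of $E$ implies that if $E\phi$ and $E\psi$ are in $\mathcal{H}$ and $a$ is a scalar, then $E(\phi + \psi)$ and $E(a\psi)$ are also in $\mathcal{H}$, and consequently $\mathcal{H}$ is a linear space. Suppose that there is a sequence of vectors $(E\psi_n)$ in $\mathcal{H}$ that converges to $\Psi$; then, since $E$ is bounded and linear, it is continuous, hence:
$$E\psi_n = EE\psi_n \to E\Psi.$$
But the limit vector of the sequence has to be unique, thus $E\Psi$ must equal $\Psi$; this implies that $\Psi$ is in $\mathcal{H}$, leading to the conclusion that $\mathcal{H}$ is a subspace. Again, using the self-adjointness of $E$, we can show that if $E\psi$ is in $\mathcal{H}$ then $(1 - E)\psi$ is in $\mathcal{H}^\perp$. To do this we take an arbitrary vector $E\phi \in \mathcal{H}$ and form the inner product:
$$(E\phi, (1 - E)\psi) = (\phi, E(1 - E)\psi) = (\phi, (E - E^2)\psi) = 0.$$
Thus $(1 - E)\psi$ belongs to the orthogonal complement of $\mathcal{H}$. Denote $E\psi = \psi_{\mathcal{H}}$ and $(1 - E)\psi = \psi_{\mathcal{H}^\perp}$; every vector $\psi$ is uniquely decomposed as a sum of $\psi_{\mathcal{H}}$ and $\psi_{\mathcal{H}^\perp}$ through the action of $E$, hence $E$ is a projection operator on $\mathcal{H}$, the space spanned by the vectors $E\psi$.

2. Suppose that $\psi$ is an eigenvector with eigenvalue $\lambda$; then $E\psi = \lambda\psi$ leads to:
$$\lambda^2\psi = \lambda(E\psi) = E(E\psi) = E^2\psi = E\psi = \lambda\psi.$$
This shows that $\lambda(\lambda - 1)\psi = 0$, i.e., $\lambda = 0$ or 1.

3. Since $X$ reduces $\mathcal{A}$, for every vector $\psi \in X$, $\mathcal{A}\psi = \phi \in X$; also, since $E$ projects on $X$, $E\psi = \psi$ and
$$E\mathcal{A}\psi = E(\mathcal{A}\psi) = E\phi = \phi = \mathcal{A}\psi = \mathcal{A}E\psi,$$
i.e., (i) holds, and (ii) follows trivially. To prove the converse, i.e., that given (i) and (ii) $X$ reduces $\mathcal{A}$, we have to show that if $\psi \in X$ and $\phi \in X^\perp$, then $\mathcal{A}\psi$ must be in $X$ and $\mathcal{A}\phi$ in $X^\perp$. By definition, if $E$ is a projection operator on $X$, then $(1 - E)$ is a projection operator on $X^\perp$; consequently $E\psi = \psi$ and $(1 - E)\phi = \phi$. Hence $\mathcal{A}\psi = \mathcal{A}E\psi = E(\mathcal{A}\psi)$, which shows that $\mathcal{A}\psi \in X$, and
$$\mathcal{A}\phi = \mathcal{A}[(1 - E)\phi] = [1 - E](\mathcal{A}\phi),$$
which shows that $\mathcal{A}\phi \in X^\perp$.

4. That $D(\mathcal{A})$ is dense means that for every vector $\psi \in \mathcal{H}$ there is a sequence $\psi_n \in D(\mathcal{A})$ such that $\psi$ is the limit vector of the sequence $\psi_n$, whereas $D(\mathcal{A}^+)$ is the set of all vectors $\psi$ that satisfy (3.3.4). To show that the adjoint operator $\mathcal{A}^+$ is closed we must establish that the limit vector $\psi'$ of every convergent sequence $\psi'_n \in D(\mathcal{A}^+)$, for which $\mathcal{A}^+\psi'_n$ converges to some vector $\xi$, also belongs to $D(\mathcal{A}^+)$. From (3.3.4), for $\phi \in D(\mathcal{A})$:
$$(\phi, \mathcal{A}^+\psi'_n) = (\mathcal{A}\phi, \psi'_n) \to (\mathcal{A}\phi, \psi') = (\phi, \xi).$$
This shows that $\psi'$ is in $D(\mathcal{A}^+)$. In other words $\mathcal{A}^+\psi'_n$ converges to the vector $\mathcal{A}^+\psi' = \xi$.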


4 THE SPECTRAL DECOMPOSITION

We have seen that a Hermitian or unitary operator $\Omega$ acting on a finite-dimensional space, when expressed in terms of its eigenvalues and eigenvectors, can be represented by a diagonal matrix with the eigenvalues along the diagonal. We shall next see that, using these eigenvalues, it can be expressed as a sum of projection operators, and that this notion can be generalized to define other projection operators leading to spectral decompositions of operators on infinite-dimensional spaces, although some of these operators may possess neither eigenvalues nor eigenvectors. The technique developed below is quite useful for the study of such operators of a general nature.

Consider a Hermitian operator $H$ with eigenvalues $\lambda_k$¹⁴ ($k = 1, \dots, n$) and corresponding eigenspaces $M_k$ spanned by eigenvectors. We know that they are mutually orthogonal; accordingly, if we denote by $I_k$ the projection operator onto $M_k$, the operator $H$ can be written as:

$$H = \sum_{k=1}^{n} \lambda_k I_k \tag{3.4.1}$$

Evidently these projection operators are mutually orthogonal, i.e., $I_jI_k = \delta_{jk}I_k$, and have the completeness property:

$$\sum_{k=1}^{n} I_k = 1.$$

We now define another set of projection operators:

$$E_x = \sum_{\lambda_k \leq x} I_k \tag{3.4.2}$$

where $x$ is a real number and where we have assumed that $\lambda_1 < \lambda_2 < \cdots < \lambda_n$ (recall that the $\lambda_k$ are real). $E_x$ is the projection operator on the subspace formed by the eigenvectors corresponding to all eigenvalues $\lambda_k \leq x$; it is zero for $x < \lambda_1$ and equals 1 when $x \geq \lambda_n$, due to the completeness property of $\{I_k\}$. Also, if $x \leq y$, then

$$E_xE_y = E_x = E_yE_x \tag{3.4.3}$$

from Def. (3.3.3) and Result (3.3.4). The operator $E_x$ defined in this manner increases from zero to one as $x$ increases through the set $\lambda_1$ to $\lambda_n$; more precisely, it increases by $I_k$ when $x$ reaches the value $\lambda_k$. For a given $x$, by choosing a positive number $\epsilon$ small enough so that there is no eigenvalue between $x - \epsilon$ and $x$, we set the difference between $E_x$ and $E_{x-\epsilon}$ as:

$$dE_x = E_x - E_{x-\epsilon} \tag{3.4.4}$$

We note that if $x$ equals an eigenvalue $\lambda_j$, then by the very definitions of $E_x$ and $E_{x-\epsilon}$, (3.4.4) gives:

$$dE_{\lambda_j} = (I_1 + I_2 + \cdots + I_j) - (I_1 + \cdots + I_{j-1}) = I_j \tag{3.4.5}$$

Accordingly, it follows that:

$$\int_{-\infty}^{\infty} dE_x = 1 \tag{3.4.6}$$

and (replacing $\lambda_k$ by $x$ in (3.4.1)):

$$\int_{-\infty}^{\infty} x\,dE_x = H \tag{3.4.7}$$

For any vectors $\psi$ and $\phi$ that belong to $D(H)$, the above equations imply:

$$(\phi, \psi) = \int_{-\infty}^{\infty} d(\phi, E_x\psi) \tag{3.4.6(a)}$$

$$(\phi, H\psi) = \int_{-\infty}^{\infty} x\,d(\phi, E_x\psi) \tag{3.4.7(a)}$$

¹⁴ To keep the discussion simple, we are taking the eigenvalues to be distinct.

It should be noted that by the very nature of (3.4.4), $(\phi, E_x\psi)$ (which is a complex function of $x$) jumps in value by $(\phi, I_k\psi)$ at $x = \lambda_k$.

In the case of a unitary operator $U$, we write the eigenvalues as $e^{i\theta_k}$, where $0 < \theta_1 < \cdots < \theta_n \leq 2\pi$ gives the order relation; hence for each real number $x$ we can write

$$E_x = \sum_{\theta_k \leq x} I_k.$$

We note that in this case $E_x$ defines the projection operator on the space formed by all eigenvectors that correspond to eigenvalues $e^{i\theta_k}$ with $\theta_k \leq x$. If $x \leq 0$, $E_x = 0$, and if $x \geq 2\pi$, $E_x = 1$. Using the same arguments as in the case of Hermitian operators, we can write

$$U = \sum_{k=1}^{n} e^{i\theta_k} I_k$$

as

$$U = \int_0^{2\pi} e^{ix}\,dE_x \tag{3.4.8}$$

For any two vectors $\psi$ and $\phi$:

$$(\phi, U\psi) = \int_0^{2\pi} e^{ix}\,d(\phi, E_x\psi) \tag{3.4.8(a)}$$

It is easy to note that the operators $E_x$ defined above are continuous from the right as a function of $x$, since for $\epsilon > 0$, $E_{x+\epsilon}\psi \to E_x\psi$ as $\epsilon \to 0$ for any $\psi$ and $x$, whereas they are discontinuous from the left, since $dE_x = I_k$ at $x = \lambda_k$ or $\theta_k$. Having defined these projection operators, we are now in a position to state two fundamental results on Hermitian and unitary operators defined on infinite-dimensional spaces.
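The finite-dimensional construction (3.4.1) to (3.4.5) can be traced on a small matrix. The sketch below assumes the hypothetical Hermitian matrix $H = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, whose eigenvalues 1 and 3 belong to the eigenvectors $(1, -1)$ and $(1, 1)$; the function `E` plays the role of $E_x$, and exact rational arithmetic is used throughout.

```python
from fractions import Fraction

half = Fraction(1, 2)
H = [[2, 1], [1, 2]]

# Pairs (lambda_k, I_k): eigenvalue and projection onto its eigenspace.
SPECTRUM = [
    (1, [[half, -half], [-half, half]]),   # eigenvector (1, -1)
    (3, [[half, half], [half, half]]),     # eigenvector (1, 1)
]

def zeros():
    return [[Fraction(0)] * 2 for _ in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def E(x):
    """E_x: sum of the projections I_k over all eigenvalues lambda_k <= x."""
    total = zeros()
    for lam, proj in SPECTRUM:
        if lam <= x:
            total = add(total, proj)
    return total

def weighted_sum():
    """Sum of lambda_k * I_k, the right-hand side of (3.4.1)."""
    total = zeros()
    for lam, proj in SPECTRUM:
        total = add(total, scale(lam, proj))
    return total
```

With these definitions `weighted_sum()` reproduces `H`, `E(x)` is zero below the smallest eigenvalue and the identity from the largest one onward, and the jump `sub(E(3), E(2.9))` equals the projection $I_2$, as in (3.4.5).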

4.1 Results Based on Spectral Families of Operators

Definition 3.4.1: A family of projection operators $E_x$ depending on a real parameter $x$ is called a spectral family if it has the following properties:
(i) If $x \leq y$, then $E_x \leq E_y$ or $E_xE_y = E_x = E_yE_x$.
(ii) For any vector $\psi$, any $x$, and $\epsilon > 0$, $E_{x+\epsilon}\psi \to E_x\psi$ as $\epsilon \to 0$.
(iii) For any vector $\psi$, $E_x\psi \to 0$ as $x \to -\infty$, and $E_x\psi \to \psi$ as $x \to +\infty$.

We use this definition to state a few results without proof (see Sec. 107, Sec. 109 and Sec. 120 in [12], and Theorems 5.9 and 8.4 in [15] for proofs).

Result 3.4.2: For each self-adjoint operator $\mathcal{A}$, there exists a unique spectral family of projection operators $E_x$ such that for all vectors $\psi$ and $\phi$ (if $\mathcal{A}$ is unbounded, $\psi \in D(\mathcal{A})$) the following holds:

$$(\phi, \mathcal{A}\psi) = \int_{-\infty}^{\infty} x\,d(\phi, E_x\psi) \tag{3.4.9}$$

and the operator $\mathcal{A}$ can be written as:

$$\mathcal{A} = \int_{-\infty}^{\infty} x\,dE_x \tag{3.4.10}$$

Result 3.4.3: For each unitary operator $U$ there exists a unique spectral family of projection operators $E_x$ such that $E_x = 0$ for $x \leq 0$ and $E_x = 1$ for $x \geq 2\pi$, and

$$(\phi, U\psi) = \int_0^{2\pi} e^{ix}\,d(\phi, E_x\psi) \tag{3.4.11}$$

for all vectors $\phi$ and $\psi$. The operator $U$ can be written as:

$$U = \int_0^{2\pi} e^{ix}\,dE_x \tag{3.4.12}$$

Equations (3.4.10) and (3.4.12) define the spectral decomposition (resolution) of the self-adjoint operator $\mathcal{A}$ and the unitary operator $U$ respectively. More generally (Stone's theorem), if there is a one-parameter unitary group $\{U_t : t \in \mathbb{R}\}$ having positive spectrum,¹⁵ then there is a spectral measure $E$, i.e., a measure on the real line $\mathbb{R}$, such that

$$U_t = \int_{-\infty}^{\infty} e^{itx}\,dE(x) \tag{3.4.13}$$

Since the positive spectrum condition implies that $E(x)$ is concentrated on the positive half-line $0 \leq x < \infty$, a semigroup (with operators) $P_t$, $t \geq 0$, can be defined, where:

$$P_t = \int_0^{\infty} e^{-tx}\,dE(x) \tag{3.4.14}$$

It can be checked that $P_t$ is a strongly continuous contraction operator satisfying $P_0 = 1$ and $P_t^+ = P_t$. Thus, while (3.4.13) gives unitary operators, (3.4.14) gives self-adjoint operators. This construction is important in the path integral approach to quantum theory (see [1] Lecture 3 and Sec. 7.6 in [17]). To illustrate the above two results, we give below three examples.

Example 3.4.4: Consider the Hilbert space $L^2(0, 1)$ of square summable functions $f(x)$ defined on the interval $[0, 1]$. Let $\mathcal{A}$ be the self-adjoint operator defined as $\mathcal{A}f(x) = x f(x)$ for every $f$, and let $E_x$ be the projection operators given by the rule:

$$(E_xf)(y) = f(y) \quad \text{if } y \leq x \tag{3.4.15}$$

and

$$(E_xf)(y) = 0 \quad \text{if } y > x \tag{3.4.16}$$

Then the family $\{E_x\}$ satisfies (3.4.9) and (3.4.10), and thus gives a spectral decomposition of $\mathcal{A}$.

Example 3.4.5: Let $Q$ be the self-adjoint operator defined by the action $(Qf)(x) = x f(x)$, where $f(x)$ now belongs to $L^2(-\infty, \infty)$, and let $E_x$ be defined as in Example (3.4.4); then the family $\{E_x\}$ satisfies (3.4.9) and gives the spectral decomposition of $Q$.

Example 3.4.6: Let $U$ be the unitary operator defined by $(Uf)(x) = e^{2\pi i x} f(x)$, where $f(x)$ belongs to $L^2(0, 1)$, and let $(E_xf)(y) = f(y)$ if $y \leq x/2\pi$ and $(E_xf)(y) = 0$ if $y > x/2\pi$; then the family $\{E_x\}$ satisfies (3.4.11) and gives the spectral decomposition (3.4.12).

In each of these examples $E_x$ is a continuous function of $x$, rather different from what we introduced in the beginning. This fact brings us to our next result, which distinguishes self-adjoint operators that possess no eigenvalues (or eigenvectors) from those that do possess eigenvalues, and hence eigenvectors, in the domains of their definition.

¹⁵ An operator $\mathcal{A}$ is said to have positive spectrum if all its eigenvalues are non-negative.

Proposition 3.4.7: Let $\mathcal{A}$ be a self-adjoint operator with the spectral decomposition:

$$\mathcal{A} = \int_{-\infty}^{\infty} x\,dE_x \tag{3.4.17}$$

Then $E_x$ jumps in value at $x = \lambda$ if and only if $\lambda$ is an eigenvalue of $\mathcal{A}$. If $I_\lambda$ denotes the projection operator onto the subspace spanned by the eigenvectors corresponding to $\lambda$, then $E_xI_\lambda = 0$ for $x < \lambda$ and $E_xI_\lambda = I_\lambda$ for $x \geq \lambda$, and for $\epsilon > 0$, $(E_\lambda\psi - E_{\lambda-\epsilon}\psi) \to I_\lambda\psi$ for every vector $\psi$ as $\epsilon \to 0$.

We give in Exercise 3 an outline of the proof; for more details the reader may refer to [11] and [12]. From the above result it is evident that $E_x$ for an operator $\Omega$ sometimes increases by jumps (this is so when $\Omega$ has an orthonormal basis of eigenvectors) and sometimes continuously as well as by jumps. In the latter case the eigenspace spanned by the eigenvectors is smaller than the whole space. Our next remark and definition clarify this point.

Remark 3.4.8: The set of points $x$ at which $E_x$, the projection operator for a self-adjoint operator $\Omega$, jumps is the point spectrum of $\Omega$. The point spectrum, as we already know from Subsec. (2.1), is the set of all eigenvalues of $\Omega$.

Definition 3.4.9: The set of points $x$ such that $E_x$ increases continuously in the neighbourhood of $x$ is called the continuous spectrum of $\Omega$. The point spectrum and continuous spectrum comprise the spectrum of $\Omega$, denoted $\sigma(\Omega)$ (see Subsec. (2.1)). Note that this is the totality of points at which $E_x$ increases. If $\Omega$ is a unitary operator $U$, 'the set of points $x$' is replaced by 'the set of $e^{ix}$ at points $x$' in the definitions, and thus the spectrum of $U$ is the totality of points $e^{ix}$ at points $x$ for which $E_x$ increases either by jumps or continuously.

A far-reaching¹⁶ relation between the operator and its spectrum is given by our next three results (see [12] Secs. 126-132, and [11] for the proofs).

Result 3.4.10: A self-adjoint operator is bounded if and only if its spectrum is bounded.

Result 3.4.11: A self-adjoint operator $\mathcal{A}$ is positive ($(\phi, \mathcal{A}\phi) \geq 0$ for every $\phi$) if and only if its spectrum is non-negative. (See Ftn. 15.)

Result 3.4.12: Let $f(x)$ be a complex function of the real variable $x$ and let $f(\mathcal{A})$ be the same function of the self-adjoint operator $\mathcal{A}$ whose spectral decomposition is given in (3.4.17). Then the operator $f(\mathcal{A})$ is self-adjoint if $f$ is real, and is unitary if $f^*f = 1$; also it is bounded if $|f(x)|$ is bounded over the spectrum of $\mathcal{A}$, and is positive if $f(x) \geq 0$ over the spectrum of $\mathcal{A}$ (see Exercise 3).
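In finite dimensions the integral in Result 3.4.12 reduces to a sum, $f(\mathcal{A}) = \sum_k f(\lambda_k) I_k$, which makes the statements easy to test. The sketch below assumes a hypothetical $2 \times 2$ self-adjoint matrix with eigenvalues 1 and 3 and the corresponding eigenprojections; `function_of_operator` is an illustrative name, not a construction taken from the text.

```python
import cmath

# Pairs (lambda_k, I_k) for an assumed self-adjoint 2 x 2 matrix.
SPECTRUM = [
    (1.0, [[0.5, -0.5], [-0.5, 0.5]]),
    (3.0, [[0.5, 0.5], [0.5, 0.5]]),
]

def function_of_operator(f):
    """Return f(A) as the sum of f(lambda_k) * I_k over the spectrum."""
    return [[sum(f(lam) * proj[i][j] for lam, proj in SPECTRUM)
             for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

A = function_of_operator(lambda x: x)                   # the operator itself
A_squared = function_of_operator(lambda x: x * x)       # f real: self-adjoint
U = function_of_operator(lambda x: cmath.exp(1j * x))   # f* f = 1: unitary
UdagU = mat_mul(adjoint(U), U)
```

For the real function $f(x) = x^2$ the result agrees with the matrix product $\mathcal{A}\mathcal{A}$ and is self-adjoint, while for $f(x) = e^{ix}$ the product `UdagU` is the identity up to rounding.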

Exercise 3.4

1. Fill in the lines of proof required for (a) Example (3.4.4) and (b) Example (3.4.5).
2. Obtain the spectral decomposition of the unitary operator defined as $(Uf)(x) = e^{2\pi i x} f(x)$, where $f(x) \in L^2(0, 1)$.
3. Give lines of proof for Proposition (3.4.7) and Result (3.4.12).

¹⁶ These results are of great importance in quantum mechanics, since a real physical quantity is represented by a self-adjoint operator.


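Hints to Exercise 3.4

1. (a) Given (3.4.15) and (3.4.16), we have to establish (3.4.9) for every $f$ and $g \in L^2(0, 1)$ and show that $E_x$ satisfies (i), (ii) and (iii) of Definition (3.4.1). Thus for any $f$ and $g$ we have:

$$\int_{-\infty}^{\infty} x\,d(g, E_xf) = \int_{-\infty}^{\infty} x\,d\int_0^1 g(y)^* (E_xf)(y)\,dy \tag{α}$$

where we have used the definition of the scalar product on $L^2(0, 1)$. We now write it as $\int_0^1 x\,d\int_0^x g(y)^* f(y)\,dy$. Note that the limits of the outer integral in (α) are changed since $x \in (0, 1)$, and those of the inner integral change due to (3.4.15) and (3.4.16). Obviously now,

$$\text{RHS} = \int_0^1 g(x)^*\, x f(x)\,dx = (g, \mathcal{A}f).$$

This establishes (3.4.9). By the very definition of a projection operator (see Def. (3.3.1) and Def. (3.4.1)), if $x \leq y$, $E_x \leq E_y$, then $E_xE_y = E_x = E_yE_x$, which shows that (i) is valid. Again, using the definition of the norm of an operator (which in this case is $E_{x+\epsilon} - E_x$, $\epsilon > 0$), we have:

$$\|(E_{x+\epsilon} - E_x)f\|^2 = \int_x^{x+\epsilon} |f(y)|^2\,dy. \tag{β}$$

Note that the integral on the RHS of (β) tends to zero as $\epsilon \to 0$ for any $f \in L^2(0, 1)$ and $x$ in $[0, 1)$, and this gives (ii). Obviously $E_x = 0$ if $x \leq 0$ and $E_x = 1$ if $x \geq 1$; therefore the family of projection operators $\{E_x\}$ defines a spectral family and gives the spectral decomposition of $\mathcal{A}$.

(b) The lines of proof for the operator $Q$ defined on $L^2(-\infty, \infty)$ are identical, except that in this case $E_x \to 0$ as $x \to -\infty$ and $E_x \to 1$ as $x \to +\infty$, and $E_x$ continues to increase in the interval $(-\infty, \infty)$.

2. We set a projection operator $E_x$ on the interval $(0, 1)$ by defining

$$(E_xf)(y) = f(y) \quad \text{when } y \leq x/2\pi$$

and

$$(E_xf)(y) = 0 \quad \text{when } y > x/2\pi.$$

Using the same lines of argument as given in 1(a), we note that $\{E_x\}$ defines a spectral family for which $E_x = 0$ if $x \leq 0$ and $E_x = 1$ if $x \geq 2\pi$.

3. To prove Prop. (3.4.7), we note that for $\epsilon > \delta > 0$ and any vector $\psi$, $E_\lambda - E_{\lambda-\epsilon}$ and $E_\lambda - E_{\lambda-\delta}$ are projection operators. Moreover $E_\lambda - E_{\lambda-\epsilon} \geq E_\lambda - E_{\lambda-\delta}$; thus $\|(E_\lambda - E_{\lambda-\epsilon})\psi - (E_\lambda - E_{\lambda-\delta})\psi\|^2$* can be expressed to show (i) that $\|(E_\lambda - E_{\lambda-\epsilon})\psi\|$ converges to a limit as $\epsilon \to 0$, and (ii) that $\|(E_\lambda - E_{\lambda-\epsilon})\psi\|^2 - \|(E_\lambda - E_{\lambda-\delta})\psi\|^2 \to 0$ as $\epsilon, \delta \to 0$. (ii) implies that the vectors $(E_\lambda - E_{\lambda-\epsilon})\psi$ have the Cauchy property as $\epsilon \to 0$. But since the space is complete, $(E_\lambda - E_{\lambda-\epsilon})\psi$ must converge to a limit vector, say $\psi_\lambda$, as $\epsilon \to 0$.

* $\|(E_\lambda - E_{\lambda-\epsilon})\psi - (E_\lambda - E_{\lambda-\delta})\psi\|^2 = \|(E_\lambda - E_{\lambda-\epsilon})\psi\|^2 + \|(E_\lambda - E_{\lambda-\delta})\psi\|^2 - (\psi, (E_\lambda - E_{\lambda-\epsilon})(E_\lambda - E_{\lambda-\delta})\psi) - (\psi, (E_\lambda - E_{\lambda-\delta})(E_\lambda - E_{\lambda-\epsilon})\psi)$. In view of (3.4.3) and $E_\lambda^2 = E_\lambda$, each of the last two terms equals $\|(E_\lambda - E_{\lambda-\delta})\psi\|^2$.

If $\psi_\lambda \neq 0$ and $x \geq \lambda$, then we have $E_x(E_\lambda - E_{\lambda-\epsilon}) = E_\lambda - E_{\lambda-\epsilon}$; whereas for $x < \lambda$, and thus for small $\epsilon$, $x < \lambda - \epsilon$, $E_x(E_\lambda - E_{\lambda-\epsilon}) = 0$. This shows that $E_x\psi_\lambda = 0$ for $x < \lambda$ and $E_x\psi_\lambda - \psi_\lambda = 0$ for $x \geq \lambda$. So for any vector $\phi$ we have

$$(\phi, \mathcal{A}\psi_\lambda) = \int_{-\infty}^{\infty} x\,d(\phi, E_x\psi_\lambda) = \lambda(\phi, \psi_\lambda),$$

so that $\mathcal{A}\psi_\lambda = \lambda\psi_\lambda$; that is, $\lambda$ is an eigenvalue and $\psi_\lambda$ is an eigenvector.

To prove the second part of the proposition we let $I_\lambda$ be the projection operator onto the subspace spanned by the eigenvectors that correspond to the eigenvalue $\lambda$ of $\mathcal{A}$. Now $\|E_xI_\lambda\psi\|^2 \to 0$ as $x \to -\infty$ and $\|E_xI_\lambda\psi\|^2 \to \|I_\lambda\psi\|^2$ as $x \to \infty$. Thus for any vector $\psi$, $\|E_xI_\lambda\psi\|^2 \to 0$ if $x < \lambda$ and $\|E_xI_\lambda\psi\|^2 \to \|I_\lambda\psi\|^2$ for $x \geq \lambda$. This means that $E_xI_\lambda = 0$ for $x < \lambda$ and $E_xI_\lambda = I_\lambda$ for $x \geq \lambda$. Hence for any vector $\psi$, $(E_\lambda - E_{\lambda-\epsilon})I_\lambda\psi = I_\lambda\psi$, showing that $E_x$ jumps in value at $x = \lambda$. Again, for any vector $\psi$, writing

$$\|(E_\lambda - E_{\lambda-\epsilon})(1 - I_\lambda)\psi\|^2 = ((1 - I_\lambda)\psi, (E_\lambda - E_{\lambda-\epsilon})^2(1 - I_\lambda)\psi) = ((1 - I_\lambda)\psi, (E_\lambda - E_{\lambda-\epsilon})(1 - I_\lambda)\psi),$$

we note that as $\epsilon \to 0$ it tends to zero for one of two reasons: either $(E_\lambda - E_{\lambda-\epsilon})(1 - I_\lambda)\psi$ converges to a zero limit, or it is a non-zero eigenvector, which must be orthogonal to $(1 - I_\lambda)\psi$. Hence for any vector $\psi$ we have $(E_\lambda - E_{\lambda-\epsilon})\psi \to I_\lambda\psi$ as $\epsilon \to 0$. □

In the case of Result (3.4.12), we note that if, for instance, $f(x) = c_0 + c_1x + \cdots + c_nx^n$ ($c_i \in \mathbb{C}$), then $f(\mathcal{A}) = c_0 + c_1\mathcal{A} + \cdots + c_n\mathcal{A}^n$. Let $\bar{f}(x) = f(x)^*$; for any vectors $\psi$ and $\phi$ we have

$$(\phi, [f(\mathcal{A})]^+\psi) = (\psi, f(\mathcal{A})\phi)^* = \int_{-\infty}^{\infty} f(x)^*\,d(\psi, E_x\phi)^* = \int_{-\infty}^{\infty} \bar{f}(x)\,d(\phi, E_x\psi) = (\phi, \bar{f}(\mathcal{A})\psi),$$

so $[f(\mathcal{A})]^+ = \bar{f}(\mathcal{A})$. Thus if $f$ is a real function, then $f(\mathcal{A})$ is a self-adjoint operator, and if $f^*f = 1$, then since $[f(\mathcal{A})]^+f(\mathcal{A}) = 1 = f(\mathcal{A})[f(\mathcal{A})]^+$, $f(\mathcal{A})$ is a unitary operator. Writing

$$(\phi, f(\mathcal{A})\phi) = \int_{-\infty}^{\infty} f(x)\,d\|E_x\phi\|^2,$$

we have that $f(\mathcal{A})$ is positive if $f(x) \geq 0$ over the spectrum of $\mathcal{A}$.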

5 GROUP THEORETIC ASPECTS OF OPERATORS

In the previous section we have already studied (though partially) the equation: Ly/=Xy/ (3.5.1) for a linear operator in order to learn about its eigenvalues {A} and its eigenfunctions {y/}. Since it happens to be an important equation in mathematical physics, we shall devote the next Section 6 to the study of (3.5.1) where L stands for a few specific operators of interest to both mathematicians and physicists. In this section we define a few objects related to the algebraic structure of operators. Definition 3.5.1: Let fi 1 and Q 2 be any two operators that commute with L, then their product as well as their linear combination also commutes with L; the collection of all such operators forms an algebra called the commutator algebra of L. We shall denote it by CL.


From equation (3.5.1) it is evident that $\Omega \in C_L$ implies that $\Omega\psi$, whenever it is non-zero, is an eigenvector of $L$ with the same eigenvalue $\lambda$ that corresponds to the eigenvector $\psi$. If $V_\lambda$ denotes the vector space spanned by the eigenvectors that correspond to the same eigenvalue $\lambda$, then $V_\lambda$ is invariant under $\Omega$, and it follows that it is invariant under the entire $C_L$. We further observe that if $V_\lambda$ is finite-dimensional and the operators in $C_L$ generate the full algebra of matrices on the vector space $V_\lambda$, then any solution $\psi \in V_\lambda$ can be obtained from a single solution $\psi_0 \in V_\lambda$ ($\psi_0 \neq 0$) by applying elements of the commutator algebra. We shall return to this idea in Result (3.6.1).

Definition 3.5.2: The set of all invertible operators in $C_L$ forms a group $G$ called the full symmetry group of $L$. If $G$ is a Lie group, it is of particular interest. We shall see in an example in Chapter 6 that this group for the operator $\Delta$ acting on $S^3$ is $SO(3)$ (Exp. (6.7.6)). It should be noted that the set $C_L$ can always be made into a Lie algebra, since it contains the elements $\Omega_1\Omega_2 - \Omega_2\Omega_1$ together with $\Omega_1$, $\Omega_2$. We shall show later in an example (Chapter 6, Exp. (6.7.8)) that this Lie algebra derived via the symmetry group of $\Delta$ is indeed the Lie algebra of $SO(3)$.

We have mostly talked about operators on finite-dimensional spaces; in reality, though, we have to deal with operators on infinite-dimensional spaces. The next two definitions involve such spaces.

Definition 3.5.3: An operator $\Omega$ is called transitive if the only closed subspaces $\mathcal{H}$ of the ambient Hilbert space $\mathcal{H}'$ satisfying $\Omega\mathcal{H} \subseteq \mathcal{H}$ are $\mathcal{H} = \{0\}$ and $\mathcal{H} = \mathcal{H}'$. The term intransitive simply stands for non-transitive.

Definition 3.5.4: A bounded linear mapping (operator) $\Omega$ from a Hilbert space $\mathcal{H}_1$ into a Hilbert space $\mathcal{H}_2$ is called compact if it carries bounded sets in $\mathcal{H}_1$ into subsets of compact sets in $\mathcal{H}_2$. An important property of such an operator is that the image sequence $\{\Omega x_n\}$ of every weakly convergent sequence $\{x_n\}$ in $\mathcal{H}_1$ is strongly convergent.

The following results concerning such operators, listed here without proof, are of great interest in operator algebras (see 10.[3] for proofs of Results (3.5.5) and (3.5.6) and also [17]).

Result 3.5.5: Every compact operator on an infinite-dimensional Hilbert space is intransitive.

Result 3.5.6: If an operator $\Omega$ commutes with a non-zero compact operator, then $\Omega$ must be intransitive.
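The remark following Definition 3.5.1, that an operator commuting with $L$ maps each eigenspace of $L$ into itself, can be illustrated in finite dimensions. The sketch below is only an illustration: the matrices `L` and `Omega` are hypothetical choices made for the demonstration and do not come from the text.

```python
# Finite-dimensional illustration of the commutator algebra of L.
# The matrices below are hypothetical examples, not taken from the text.

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matvec(a, v):
    """Matrix applied to a vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

# L has the doubly degenerate eigenvalue 2 (eigenspace spanned by e1, e2)
# and the simple eigenvalue 5.
L = [[2, 0, 0],
     [0, 2, 0],
     [0, 0, 5]]

# Omega mixes e1 and e2 but never leaves their span, so it commutes with L.
Omega = [[0, 1, 0],
         [1, 0, 0],
         [0, 0, 3]]

commutes = matmul(L, Omega) == matmul(Omega, L)

psi = [1, 0, 0]                  # eigenvector of L with eigenvalue 2
omega_psi = matvec(Omega, psi)   # its image under Omega
same_eigenvalue = matvec(L, omega_psi) == [2 * c for c in omega_psi]

print(commutes, omega_psi, same_eigenvalue)
```

Here `Omega` carries the eigenvector `psi` to a different vector of the same eigenspace, which is the mechanism by which all solutions in $V_\lambda$ may be reached from a single one.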

6 A FEW IMPORTANT OPERATORS

We devote this section to those operators which are of great interest from the point of view of applications, namely the Laplace operator, the Schrödinger operator and the Dirac operator. In earlier sections we have studied only self-adjoint (Hermitian) operators; in general, however, operators are not self-adjoint. In Subsection 3 we introduce operators of this general type through the examples of the last two operators mentioned above. We begin here with the Laplace operator $\Delta$, which can easily be checked to be self-adjoint, and hence to have real eigenvalues, on the domains we are considering here (see Subsecs. 3 and 4 for cases where $-\Delta$ is not necessarily self-adjoint).

6.1 Laplace Operator

For the sake of simplicity we denote the Laplace operator $\Delta$ in the next two subsections by $L$, and note that on compact manifolds without boundary, writing it as $L = d\delta + \delta d$, we can use it to determine the harmonic forms. Due to its importance, we study it on four different domains. The first of these is the Euclidean plane $\mathbb{R}^2 = \{(x_1, x_2)\}$, where $L$, often denoted $L_{\mathbb{R}^2}$, takes its simplest form:

$$L_{\mathbb{R}^2} = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}.$$

Its eigenfunction and eigenvalue in this case are respectively $e^{i\lambda(a_1 x_1 + a_2 x_2)}$ and $-\lambda^2$, $a = (a_1, a_2)$ being a unit vector in $\mathbb{R}^2$ and $\lambda$ a complex number. When $\mathbb{R}^2$ is considered as the homogeneous space $M(2)/O(2)$, formed by the group of isometries $M(2)$ modulo the orthogonal group $O(2)$ that leaves $(0, 0)$ fixed, we note that every differential operator on $\mathbb{R}^2$ which is $M(2)$-invariant turns out to be a polynomial in $L$. If we think of $\mathbb{R}^2$ as the complex plane $z = x + iy$, then $L$ becomes:

$$L = 4\,\frac{\partial^2}{\partial z\,\partial \bar z}. \tag{3.6.1}$$

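Here, we obtain its transform under conformal maps. For this we consider the group $SL(2, \mathbb{C})$ formed by complex matrices

$$g = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$

of determinant 1. The group acts transitively on the one-point compactification of the plane via the conformal maps:

$$g : z \to \frac{\alpha z + \beta}{\gamma z + \delta}. \tag{3.6.2}$$

We define the transform $D^g$ of a differential operator $D$ on $\mathbb{R}^2$ under the mapping (3.6.2) as$^{17}$:

$$D^g : f \to (D(f \circ g)) \circ g^{-1}, \qquad f \in C^\infty(\mathbb{R}^2). \tag{3.6.3(a)}$$

If the differential operator $D$ is $\partial/\partial z$ we have:

$$\left(\frac{\partial}{\partial z}\right)^{g} f = \left(\frac{\partial}{\partial z}(f \circ g)\right) \circ g^{-1}. \tag{3.6.3(b)}$$

The simplification of (3.6.3)(b) gives:

$$\left(\frac{\partial}{\partial z}\right)^{g} f = (-\gamma z + \alpha)^2\, \frac{\partial f}{\partial z}. \tag{3.6.4(a)}$$

Complex conjugation of the above equality further gives:

$$\left(\frac{\partial}{\partial \bar z}\right)^{g} f = (-\bar\gamma \bar z + \bar\alpha)^2\, \frac{\partial f}{\partial \bar z}. \tag{3.6.4(b)}$$

Thus the transform $L^g$ of $L$ under the mapping (3.6.2) satisfies:

$$L^g = |\gamma z - \alpha|^4\, L. \tag{3.6.5}$$

17 Note that (3.6.2) implies that $g(z) = \dfrac{\alpha z + \beta}{\gamma z + \delta}$, and hence $g^{-1}(z) = \dfrac{\delta z - \beta}{-\gamma z + \alpha}$.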
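The transformation rule (3.6.4(a)) can be checked numerically. The following is a minimal sketch under stated assumptions: the group element (`alpha`, `beta`, `gamma`, `delta`), the test function `f`, the sample point `z0` and the finite-difference helper `ddz` are all hypothetical choices made for the illustration, and a holomorphic `f` is used so that a central difference approximates $\partial/\partial z$.

```python
# Numerical check of (D(f o g)) o g^{-1} = (-gamma z + alpha)^2 df/dz for D = d/dz.
# All numerical values and helper names are assumptions made for this sketch.

alpha, beta, gamma = 1.2 + 0.3j, 0.5 - 0.4j, 0.2 + 0.1j
delta = (1 + beta * gamma) / alpha      # enforces alpha*delta - beta*gamma = 1

def g(z):
    """The conformal map (3.6.2)."""
    return (alpha * z + beta) / (gamma * z + delta)

def g_inv(z):
    """Inverse map, as given in footnote 17."""
    return (delta * z - beta) / (-gamma * z + alpha)

def f(z):
    """Arbitrary holomorphic test function."""
    return z ** 3 + 2 * z

def ddz(func, z, h=1e-5):
    """Central difference; approximates d/dz for a holomorphic function."""
    return (func(z + h) - func(z - h)) / (2 * h)

z0 = 0.3 + 0.2j
lhs = ddz(lambda z: f(g(z)), g_inv(z0))          # (D(f o g)) evaluated at g^{-1}(z0)
rhs = (-gamma * z0 + alpha) ** 2 * ddz(f, z0)    # right-hand side of (3.6.4(a))

roundtrip_error = abs(g(g_inv(z0)) - z0)
transform_error = abs(lhs - rhs)
print(roundtrip_error, transform_error)
```

Both printed quantities should be small, limited only by rounding and by the finite-difference step.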

Finally we consider the Poincaré model of the non-Euclidean plane and obtain the corresponding $L$ in (3.6.11)(b). The Poincaré model is given by the open disc $\mathcal{D} = \{|z| < 1\}$ in $\mathbb{R}^2$ equipped with the Riemannian structure:

$$\langle \xi, \eta \rangle_z = \frac{(\xi, \eta)}{(1 - |z|^2)^2} \tag{3.6.6}$$

where $\xi$, $\eta$ are any tangent vectors at $z \in \mathcal{D}$ and $(\ ,\ )$ is the inner product in $\mathbb{R}^2$. Using (3.6.6), the infinitesimal arc length of a segment in $\mathcal{D}$ can be written as:

$$ds^2 = \frac{dx_1^2 + dx_2^2}{(1 - x_1^2 - x_2^2)^2} \tag{3.6.7}$$

or equivalently as:

$$ds^2 = g_{ij}\, dx^i dx^j \quad \text{where} \quad g_{ij} = (1 - |z|^2)^{-2}\, \delta_{ij}. \tag{3.6.8}$$

The disc $\mathcal{D}$ can be identified with $SU(1,1)/SO(2)$ through the mapping:

$$z \to \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha}, \qquad |\alpha|^2 - |\beta|^2 = 1. \tag{3.6.9}$$

The above mapping is the onto conformal mapping of the unit disc.$^{18}$ It can be checked that the action of $SU(1,1)$ on $\mathcal{D}$ is transitive and the subgroup fixing the origin is $SO(2)$.
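That the mapping (3.6.9) sends the unit disc into itself and preserves the arc length (3.6.7) can be spot-checked numerically. This is a sketch only: the parameters `alpha`, `beta` and the sample points are hypothetical, and the check uses the identity $|w'(z)|/(1 - |w|^2) = 1/(1 - |z|^2)$ for $w$ the image of $z$, which is what invariance of $ds$ amounts to.

```python
# Spot check that an SU(1,1) map preserves the disc and the Poincare metric.
# alpha, beta and the sample points are assumptions made for this sketch.

beta = 0.7 - 0.2j
# Scale a unit-modulus phase so that |alpha|^2 - |beta|^2 = 1.
alpha = (1 + abs(beta) ** 2) ** 0.5 * complex(0.6, 0.8)

def moebius(z):
    """The mapping (3.6.9)."""
    return (alpha * z + beta) / (beta.conjugate() * z + alpha.conjugate())

def moebius_derivative(z):
    """Derivative of (3.6.9); its numerator is |alpha|^2 - |beta|^2 = 1."""
    return 1 / (beta.conjugate() * z + alpha.conjugate()) ** 2

def density(z):
    """The factor 1 / (1 - |z|^2) multiplying |dz| in (3.6.7)."""
    return 1 / (1 - abs(z) ** 2)

points = [0.0 + 0.0j, 0.5 + 0.1j, -0.3 + 0.8j, 0.0 - 0.95j]

stays_in_disc = all(abs(moebius(z)) < 1 for z in points)
metric_defect = max(
    abs(abs(moebius_derivative(z)) * density(moebius(z)) - density(z)) for z in points
)
print(stays_in_disc, metric_defect)
```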

6.2 The Riemannian Measure

Consider the Riemannian measure

$$\phi \to \int \phi\, \sqrt{G}\; dx_1 \cdots dx_n \tag{3.6.10(a)}$$

and the Laplace-Beltrami operator

$$L : \phi \to \frac{1}{\sqrt{G}} \sum_{k} \partial_k \Big( \sum_{i} g^{ik} \sqrt{G}\; \partial_i \phi \Big) \tag{3.6.10(b)}$$

in an $n$-dimensional space, where $G = |\det g_{ij}|$. In view of (3.6.8) we note that for $\mathcal{D}$ they are respectively:

$$dz = \frac{dx_1\, dx_2}{(1 - x_1^2 - x_2^2)^2} \tag{3.6.11(a)}$$

and

$$L = (1 - x_1^2 - x_2^2)^2 \left( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} \right). \tag{3.6.11(b)}$$

Since the Riemannian measure and the Laplace-Beltrami operator are invariants of isometries, and all isometries of $\mathcal{D}$ are given by the mapping (3.6.9) and the conjugation $z \to \bar z$, it follows that $dz$ and $L$ given respectively by (3.6.11)(a) and (3.6.11)(b) are invariants of these mappings (see Hints to Exercises 1 and 2).

18 See Chapter 1, Sec. 4.

Listed below are a few more results on the Laplacian, without proof; the first two of these relate the Laplace operator on $\mathbb{R}^n$ to that of $S^{n-1}$. The proofs can be found in any standard text on complex or harmonic analysis (the texts to our taste are references 1.[1] and [9]).

Result 3.6.1: An eigenfunction $\Phi$ of $L_{\mathbb{R}^2}$ with eigenvalue $(-\lambda^2)$ always satisfies the functional equation:

$$\frac{1}{2\pi} \int_0^{2\pi} \Phi(z + e^{i\theta} w)\, d\theta = \Phi(z)\, \psi_{\lambda, 0}(w) \tag{3.6.12}$$

where $z, w \in \mathbb{C}$ and $\psi_{\lambda, m}$ is given by:

$$\psi_{\lambda, m}(x) = \frac{1}{2\pi} \int_{S^1} e^{i\lambda (x, \omega)}\, e^{im\theta}\, d\omega. \tag{3.6.13}$$

Here $d\omega$ is the circular measure on $S^1$, $x = (x_1, x_2) \in \mathbb{R}^2$ and $m \in \mathbb{Z}$. Conversely, a continuous function $\Phi$ satisfying (3.6.12) is a solution to the equation $L_{\mathbb{R}^2} \Phi = -\lambda^2 \Phi$. It can be verified that if the solutions $\Phi$ of the above equation satisfy:

$$\Phi(e^{i\theta} z) = e^{im\theta}\, \Phi(z) \tag{3.6.14}$$

then they must be constant multiples of the function $\psi_{\lambda, m}$ defined in (3.6.13) (see [9], pp. 14-16).

Result 3.6.2: The Laplacian $L_{\mathbb{R}^n}$ ($n > 1$) has the form

$$L_{\mathbb{R}^n} = \frac{\partial^2}{\partial r^2} + \frac{n-1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2}\, L_{S^{n-1}} \tag{3.6.15}$$

where $L_{S^{n-1}}$ is the Laplacian on $S^{n-1}$, the $(n-1)$-dimensional unit sphere (see Exercise 6 of Sec. 1 for $n = 3$).

Result 3.6.3: The eigenspaces corresponding to $L_{S^{n-1}}$ on $S^{n-1}$ are of the form:

$$E_m = \text{Span of } \Big\{ (a_1 s_1 + \cdots + a_n s_n)^m : a = (a_1, \ldots, a_n) \in \mathbb{C}^n \text{ isotropic}^{19} \Big\} \tag{3.6.16}$$

where $(s_1, \ldots, s_n)$ are cartesian coordinates on $S^{n-1}$ and $m \in \mathbb{Z}^+$. The eigenvalue is $-m(m + n - 2)$. Each eigenspace representation is irreducible. Since they are mutually orthogonal, they give the orthogonal decomposition

$$L^2(S^{n-1}) = \sum_{m=0}^{\infty} E_m \tag{3.6.17}$$

of the Hilbert space $L^2(S^{n-1})$ formed by square integrable functions on $S^{n-1}$.

19 A vector $a = (a_1, \ldots, a_n) \in \mathbb{C}^n$ is called isotropic if $\sum_{i=1}^{n} a_i^2 = 0$.
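The role of the isotropy condition in Result 3.6.3 can be made concrete: for isotropic $a$ the polynomial $(a_1 x_1 + \cdots + a_n x_n)^m$ is harmonic on $\mathbb{R}^n$, and inserting a harmonic polynomial that is homogeneous of degree $m$ into the radial form (3.6.15) yields the eigenvalue $-m(m+n-2)$ on the sphere. The sketch below checks harmonicity by finite differences; the vector `a`, the degree `m`, the point `x0` and the helper `laplacian` are hypothetical choices for the illustration.

```python
# Finite-difference check that (a . x)^m is harmonic when a is isotropic.
# The vector a, the degree m and the sample point are assumptions of this sketch.

a = (1, 1j, 0)   # isotropic in C^3, since 1^2 + i^2 + 0^2 = 0
m = 3
n = len(a)

def p(x):
    """Homogeneous polynomial (a_1 x_1 + ... + a_n x_n)^m."""
    return sum(ai * xi for ai, xi in zip(a, x)) ** m

def laplacian(func, x, h=1e-3):
    """Sum of central second differences over all coordinates."""
    total = 0
    for i in range(len(x)):
        forward, backward = list(x), list(x)
        forward[i] += h
        backward[i] -= h
        total += (func(forward) - 2 * func(x) + func(backward)) / h ** 2
    return total

x0 = [0.4, -0.7, 0.2]
isotropy_defect = abs(sum(ai * ai for ai in a))
harmonic_defect = abs(laplacian(p, x0))

# Eigenvalue of the spherical Laplacian on E_m as stated in Result 3.6.3.
eigenvalue = -m * (m + n - 2)
print(isotropy_defect, harmonic_defect, eigenvalue)
```

For $n = 3$ and $m = 3$ the stated eigenvalue is $-12$, in agreement with the familiar value $-l(l+1)$ for spherical harmonics of degree $l = 3$.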

6.3 Operators other than $\Delta$

Continuing in this sequence of ideas we consider a few more important operators that relate to known physical theories. We again use the letter $L$ to denote a formal differential operator on the Hilbert space $\mathcal{H} = L^2(E)$ ($E$ being 3-dimensional Euclidean space) and emphasize that $L$ is not necessarily self-adjoint. However, given such an operator, a densely defined linear operator can be constructed by restricting the domain from $L^2(E)$ to $C_0^\infty(E)$, i.e., to the set of infinitely differentiable functions with compact support. This new operator turns out to be self-adjoint. We illustrate this point by choosing $L$ as the Schrödinger operator (note that the form used in Sec. 1 carries an additional factor on the R.H.S.):

$$L = -\Delta + q(x). \tag{3.6.18}$$

Obviously $L$ may not even be an operator on $\mathcal{H} = L^2(E)$ if $q(x)$ is not suitably chosen; for instance, if $q(x)$ is singular, then for $u(x) \in L^2(E)$ it is not necessary that $q(x) u(x) \in L^2(E)$. To begin with, therefore, we ensure that $q(x)$ is locally square integrable (i.e., $q \in L^2(K)$ for every compact set $K$ of $E$) and, to make $L$ into an operator for which the notion of self-adjointness is relevant, we assume that its domain is restricted to $C_0^\infty(E)$ or at least to $C_0^2(E)$. The operator obtained by restricting the domain of $L$ to $C_0^\infty(E)$ (denoted $S$) is called the minimal operator pertaining to $L$. Evidently the operator $S$ can be extended to another operator $\tilde S$ by enlarging its domain with the inclusion of all functions $u \in L^2(E)$.

The following definition of self-adjointness of a differential operator $L$ makes it clear why a densely defined operator needs to be constructed from a given $L$. A differential operator $L$ is said to be formally self-adjoint if Green's identity holds, i.e., if

(3.6.19)

where $E_0$ is any sub-domain of $E$ with compact closure and with smooth boundary $\partial E_0$; $ds$ on the RHS stands for the surface element and $\partial/\partial n$ for the inward normal derivative on $\partial E_0$. When $q(x)$ is zero, $L$ becomes the familiar Laplace operator $-\Delta$, whose minimal operator, usually denoted $T$ in the literature, is self-adjoint. The domain of $T$ in view of the above discussion is $C_0^\infty(E)$ (see Chapter 6 in [17]).

6.4 Dirac Operator

Finally we write the Dirac operator$^{20}$ for a particle moving freely in space. Using the same notation $L$ we have:

$$L = -i\,\alpha \cdot \mathrm{grad} + \beta. \tag{3.6.20}$$

20 Although we shall return to this operator in other chapters (e.g., 9 and 10), as a historical note we would like to mention here that this operator, introduced by the physicist P. A. M. Dirac in the late 1920s in search of a Lorentz-invariant first-order differential operator, has led to important discoveries in mathematics through the work of Atiyah, Bott and Singer. M. Atiyah and I. Singer, through their celebrated work on index theory, broadened the concept of this operator by giving it a bundle-theoretic formulation (we use it in the Appendix to Chapter 10 to define elliptic operators and the heat operator). As this bundle-theoretic definition of the Dirac operator is due to Atiyah and Singer, the Dirac operator in this format is referred to as the Atiyah-Singer operator (see 7.[14]).

This operator $L$ differs from the Schrödinger operator and other differential operators in the sense that its domain consists of vector-valued (or spinor-valued) functions $u(x) = (u_1(x), u_2(x), u_3(x), u_4(x))$, whose components $u_j(x)$ ($j = 1, 2, 3, 4$) are functions of the space variables $(x_1, x_2, x_3)$. More specifically, if $u(x) = (u_j(x)) \in \mathbb{C}^4$, then the components $\alpha_k$ ($k = 1, 2, 3$) of the vector $\alpha$ can be viewed as operators and can be identified with their (operator) representations given by $(4 \times 4)$ matrices with complex entries. $\beta$ is likewise a $(4 \times 4)$ matrix in this case. Thus $Lu = v = (v_1(x), \ldots, v_4(x))$ in terms of the components would be:

$$v_k(x) = \sum_{h=1}^{4} \Big( -i \sum_{j=1}^{3} (\alpha_j)_{kh}\, \frac{\partial u_h}{\partial x_j} + \beta_{kh}\, u_h(x) \Big). \tag{3.6.21}$$

The matrices $\alpha_k$ and $\beta$ are Hermitian matrices and satisfy the anticommutation relation

$$\alpha_k \alpha_h + \alpha_h \alpha_k = 2\delta_{kh} I \quad \text{for } k, h = 1, 2, 3, 4,$$

where $\beta$ has been set up as $\alpha_4$. Using this operator we can construct other operators on the Hilbert space $\mathcal{H} = (L^2(\mathbb{R}^3))^4$. Elements of $\mathcal{H}$ are $\mathbb{C}^4$-valued square integrable functions that satisfy:

$$\|u\|^2 = \int |u(x)|^2\, dx < \infty, \qquad |u(x)|^2 = \sum_{k=1}^{4} |u_k(x)|^2,$$

and whose associated inner product is

$$(u, v) = \int u(x) \cdot v(x)\, dx,$$

where the dot denotes the inner product in $\mathbb{C}^4$. The corresponding minimal operator $T$ ($Tu = Lu$) is obtained by restricting the domain $D(T)$ to $(C_0^\infty(\mathbb{R}^3))^4$. Thus it consists of all $u(x)$ with the components $u_k(x)$ lying in $C_0^\infty(\mathbb{R}^3)$. $T$ is self-adjoint. The Dirac operator for a particle in a static field with potential $q(x)$ can be written as:

$$L = -i\,\alpha \cdot \mathrm{grad} + \beta + Q \tag{3.6.22}$$

where $Q$ is the multiplication operator $q(x) I$. In order to define the minimal operator $S$ (of $L$ in (3.6.22)) with domain $(C_0^\infty(\mathbb{R}^3))^4$ we must ask that $q(x)$ be locally square integrable. Again $S$ is densely defined but not necessarily self-adjoint.
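The algebraic conditions on $\alpha_k$ and $\beta = \alpha_4$ can be verified for a concrete choice of matrices. The text does not fix a representation; the sketch below assumes the standard (Dirac) representation built from the Pauli matrices, and the helper names are hypothetical.

```python
# Check alpha_k alpha_h + alpha_h alpha_k = 2 delta_kh I and Hermiticity.
# Assumption: standard Dirac representation; the text does not specify one.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
MINUS_ONE = [[-1, 0], [0, -1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# alpha_1, alpha_2, alpha_3 and alpha_4 = beta
alphas = [block(ZERO, s, s, ZERO) for s in PAULI] + [block(ONE, ZERO, ZERO, MINUS_ONE)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def scaled_identity(c):
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]

def is_hermitian(a):
    return all(a[i][j] == complex(a[j][i]).conjugate() for i in range(4) for j in range(4))

clifford_ok = all(
    anticommutator(alphas[k], alphas[h]) == scaled_identity(2 if k == h else 0)
    for k in range(4) for h in range(4)
)
hermitian_ok = all(is_hermitian(a) for a in alphas)
print(clifford_ok, hermitian_ok)
```

Exact comparison is safe here because every entry is a product or sum of the numbers $0$, $\pm 1$, $\pm i$.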

Exercise 3.6

1. Show that the Riemannian measure and Laplace-Beltrami operator for the Poincaré model $\mathcal{D}$ are respectively $(1 - |z|^2)^{-2}\,\epsilon$ and $(1 - |z|^2)^2 L_{\mathbb{R}^2}$, where $\epsilon$ and $L_{\mathbb{R}^2}$ stand for the Euclidean measure and Laplace operator in $\mathbb{R}^2$.
2. Show that the Riemannian measure and the Laplace-Beltrami operator on $\mathcal{D}$ are invariant under isometries of $\mathcal{D}$.

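Hints to Exercise 3.6

1. By definition

$$g_{ij} = (1 - |z|^2)^{-2}\, \delta_{ij} = (1 - x_1^2 - x_2^2)^{-2}\, \delta_{ij},$$

accordingly

$$G = \det(g_{ij}) = \frac{1}{(1 - |z|^2)^4}.$$

This leads to

$$\sqrt{G} = \frac{1}{(1 - |z|^2)^2}.$$

Also the Euclidean measure for $\mathbb{R}^2$ is $dx_1\, dx_2$, hence the result is obvious for the first part of the exercise. For the second part, we note that $g^{ij}$ in this case is simply $(g_{ij})^{-1}$, therefore $g^{ij} = (1 - |z|^2)^2\, \delta^{ij}$ and $g^{ij} \sqrt{G} = \delta^{ij}$. This substitution, along with the value of $1/\sqrt{G}$, in (3.6.10)(b) gives the required result, since

$$L = (1 - |z|^2)^2 \left( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} \right) = (1 - x_1^2 - x_2^2)^2 \left( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} \right).$$

2. We show first that the mappings$^{21}$

$$z \to \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha}$$

and $z \to \bar z$ are isometries for $\mathcal{D}$, i.e. (3.6.7) or (3.6.8) are unaltered by replacing $z$ by $\bar z$ or by $\dfrac{\alpha z + \beta}{\bar\beta z + \bar\alpha}$. Note that (3.6.7) can be written as:

$$ds^2 = \frac{1}{(1 - |z|^2)^2} \left[ \left( \frac{dz + d\bar z}{2} \right)^2 + \left( \frac{dz - d\bar z}{2i} \right)^2 \right] = \frac{dz\, d\bar z}{(1 - |z|^2)^2} \tag{i}$$

which immediately shows that $z \to \bar z$ is an isometry. Now

$$z \to \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha} \quad \text{implies that} \quad dz \to \frac{\alpha(\bar\beta z + \bar\alpha) - \bar\beta(\alpha z + \beta)}{(\bar\beta z + \bar\alpha)^2}\, dz = \frac{dz}{(\bar\beta z + \bar\alpha)^2}.$$

Similarly

$$\bar z \to \frac{\bar\alpha \bar z + \bar\beta}{\beta \bar z + \alpha} \quad \text{implies that} \quad d\bar z \to \frac{d\bar z}{(\beta \bar z + \alpha)^2}.$$

We substitute the values of $z$, $\bar z$, $dz$ and $d\bar z$ in the RHS of (i) and simplify:

$$\left( 1 - \frac{(\alpha z + \beta)(\bar\alpha \bar z + \bar\beta)}{(\bar\beta z + \bar\alpha)(\beta \bar z + \alpha)} \right)^{-2} \frac{dz\, d\bar z}{(\bar\beta z + \bar\alpha)^2 (\beta \bar z + \alpha)^2} = \frac{dz\, d\bar z}{\big[ |\beta|^2 |z|^2 + |\alpha|^2 + \alpha\bar\beta z + \bar\alpha\beta \bar z - \big( |\alpha|^2 |z|^2 + |\beta|^2 + \alpha\bar\beta z + \bar\alpha\beta \bar z \big) \big]^2} = \frac{dz\, d\bar z}{(1 - |z|^2)^2}, \tag{ii}$$

where the last step uses $|\alpha|^2 - |\beta|^2 = 1$. Here we have used the basic computational method to show that

$$z \to \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha}$$

is an isometry. A more fundamental way of arriving at it would be by showing first that the absolute values

$$\left| \frac{z_1 - z_2}{1 - \bar z_1 z_2} \right|$$

are conformal invariants for any pair of points $z_1, z_2 \in \mathcal{D}$, and then by taking the limit $z_2 \to z_1$. This would lead to the fact

$$\frac{|dw|^2}{(1 - |w|^2)^2} = \frac{|dz|^2}{(1 - |z|^2)^2} = ds^2, \quad \text{where} \quad w = \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha}.$$

To show the invariance of the Laplace operator under this mapping, we simply note that $\dfrac{\partial}{\partial z}$ and $\dfrac{\partial}{\partial \bar z}$ respectively correspond to

$$(\bar\beta z + \bar\alpha)^2\, \frac{\partial}{\partial z} \quad \text{and} \quad (\beta \bar z + \alpha)^2\, \frac{\partial}{\partial \bar z},$$

while $(1 - |z|^2)^2$ corresponds to

$$\frac{(1 - |z|^2)^2}{(\bar\beta z + \bar\alpha)^2 (\beta \bar z + \alpha)^2};$$

accordingly the result is obvious. For establishing the invariance of the Riemannian measure

$$\frac{4\, dx\, dy}{(1 - |z|^2)^2}$$

we use the conformal invariance of

$$\frac{2\,|dz|}{1 - |z|^2}$$

by choosing $z_1$, $z_2$ with equal $y$-coordinates to obtain $dx$ and with equal $x$-coordinates to obtain $dy$. [Naturally this subtle approach avoids computational complications. We are of the opinion, though, that computational methods provide some insight for a layman.]

21 See also Sec. 3, Chapter 1.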

7 REPRESENTATIONS OF SU(2) AND SU(3) USING THE THEORY OF OPERATORS

We devote this section to two well-known examples where group representations and operator theory are used to explain ideas (e.g. weak and strong interactions) in particle physics.$^{22}$ The groups we have in mind are the special unitary groups $SU(2)$ and $SU(3)$, the former being the group of 'isospin invariance' of Yang-Mills theory (also known as the group of internal symmetries of weak interactions in electro-weak theory) and the latter (the 'eight fold way') that of quark theory and of QCD (the group that explained the strong interactions). (See Subsec. (6.5.2).) We have already touched upon both these groups in Chapter 2 (see Exercise 2.2.11 and Exercise 2.4.2) in connection with the Lie algebra satisfied by their generators. Our aim here is to obtain the irreducible representations of these groups using the terminology of operators studied in this chapter. The operators in this case are generators of these groups, and the space on which they act is the Hilbert space of state vectors. Thus eigenfunctions are now replaced by eigenstates (see Secs. 9.2 and 9A).

7.1 The Group SU(2)

In the case of this group we achieve our objective by using the fact that $SU(2)$ is locally isomorphic to $SO(3)$, the group of rotations (the two groups share the same Lie algebra), and as such we have the angular momentum operators $J_a$ ($a = 1, 2, 3$) (the generators of the group) and their squared sum:

$$J^2 = \sum_{a=1}^{3} J_a^2 \tag{3.7.1}$$

at our disposal. These can be identified with the generators $\{\sigma_a\}$ ($a = 1, 2, 3$) of $SU(2)$, where $\sigma_a$ are the Pauli matrices.$^{23}$ Recall that $J_a = \sigma_a/2$ satisfies the relation

$$[J_a, J_b] = i\,\epsilon_{abc} J_c. \tag{3.7.2}$$

Since the structure constants in both cases (see 2.4.9) are the same up to scalars, the equations set up in the $J$'s can be interpreted to give results for $SU(2)$; more precisely, an irreducible representation of $SU(2)$ can be obtained in terms of eigenstates of these operators. To this end, we first note that $J^2$ defined by (3.7.1) is an invariant operator (in fact it is a Casimir operator; see Exercise 6 of Sec. 2 and the footnote there for the definition) which commutes with all the other operators $J_a$:

$$[J^2, J_a] = 0. \tag{3.7.3}$$

Next, defining the raising and lowering operators

$$J_+ = J_1 + iJ_2, \qquad J_- = J_1 - iJ_2 \tag{3.7.4}$$

we can write $J^2$ as

$$J^2 = \tfrac{1}{2}(J_+ J_- + J_- J_+) + J_3^2 \tag{3.7.5}$$

and using (3.7.2) we have the bracket relations:

$$[J_+, J_-] = 2J_3 \tag{3.7.6}$$

and

$$[J_\pm, J_3] = \mp J_\pm. \tag{3.7.7}$$

We now consider an eigenstate $|\alpha, \mu\rangle$ of $J^2$ and $J_3$ with eigenvalues $\alpha$ and $\mu$ respectively:

$$J^2 |\alpha, \mu\rangle = \alpha\, |\alpha, \mu\rangle \tag{3.7.8}$$

$$J_3 |\alpha, \mu\rangle = \mu\, |\alpha, \mu\rangle. \tag{3.7.9}$$

In view of (3.7.7) it follows that $J_3$ also has $J_\pm |\alpha, \mu\rangle$ as eigenstates, with eigenvalues $(\mu \pm 1)$. It is also natural to ask for the transformed state of $|\alpha, \mu\rangle$ under the operators $J_+$ and $J_-$. Using the commutation relation (3.7.3) and the same eigenvalue $\alpha$, we have (see Hint to Exercise 2):

$$J_\pm |\alpha, \mu\rangle = K_\pm(\alpha, \mu)\, |\alpha, \mu \pm 1\rangle \tag{3.7.10}$$

where $K_\pm(\alpha, \mu)$ are constants dependent on $\alpha$ and $\mu$. Now for a fixed $\alpha$ the values of $\mu$ are bounded; more precisely, because of the equality $J^2 - J_3^2 = J_1^2 + J_2^2 \ge 0$, from (3.7.8) and (3.7.9) we have:

$$\alpha - \mu^2 \ge 0. \tag{3.7.11}$$

Let $j$ denote the largest value that an eigenvalue $\mu$ can have; then, since $J_+ |\alpha, \mu\rangle = \text{const.}\, |\alpha, \mu + 1\rangle$ for every $\mu$, it follows that

$$J_+ |\alpha, j\rangle = 0. \tag{3.7.12}$$

This in turn implies:

$$0 = J_- J_+ |\alpha, j\rangle = (J^2 - J_3^2 - J_3)\, |\alpha, j\rangle = (\alpha - j^2 - j)\, |\alpha, j\rangle \tag{3.7.13}$$

(where we have used (3.7.5) and (3.7.6)), leading to the solution:

$$\alpha = j(j + 1). \tag{3.7.14}$$

Similarly (since $\mu$ is bounded), setting $j'$ as the smallest value that $\mu$ can take, we have

$$J_- |\alpha, j'\rangle = 0 \tag{3.7.15}$$

which gives

$$\alpha = j'(j' - 1). \tag{3.7.16}$$

Equating the two values of $\alpha$, we get two solutions for $j'$: either $j' = -j$ or $j' = j + 1$. The latter being inadmissible, we finally conclude that if $j$ is the largest value of $\mu$ then $-j$ is its smallest value. Also, since $J_-$ lowers the value of $\mu$ by one unit (see (3.7.10)), $j - j' = 2j$ must be a non-negative integer, which means that $j$ is either an integer or a half integer. The states $|\alpha, \mu\rangle$ with $\mu = j, j - 1, \ldots, -j + 1, -j$ form the basis of an irreducible representation of $SU(2)$. The dimension of the representation is $(2j + 1)$. The representation matrices can be obtained by using (3.7.9) and (3.7.10) (see for example the Hint to Exercise 3).

22 The reader may like to return to this section after Chapter 9 (in particular 9A). We discuss these ideas further in Chapter 6.

23 See Equality (2.4.9) and Exercise (2.4.1) and also Example (6.4.3).
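The construction above can be turned into a small computation. The sketch below builds $J_3$, $J_+$ and $J_-$ for a given spin in the basis $\mu = j, j-1, \ldots, -j$, taking $K_+(\alpha, \mu) = \sqrt{(j - \mu)(j + \mu + 1)}$ (the value obtained in the Hint to Exercise 2), and then checks (3.7.6) together with $J^2 = j(j+1)$. The function names and the integer argument `two_j` (standing for $2j$) are hypothetical conveniences of this illustration.

```python
import math

def spin_matrices(two_j):
    """Return (J3, J_plus, J_minus) for spin j = two_j / 2, basis mu = j, j-1, ..., -j."""
    j = two_j / 2
    mus = [j - k for k in range(two_j + 1)]     # the 2j + 1 admissible values of mu
    dim = len(mus)
    j3 = [[mus[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    j_plus = [[0.0] * dim for _ in range(dim)]
    for c in range(1, dim):
        mu = mus[c]
        # K_+(alpha, mu) = sqrt((j - mu)(j + mu + 1)); J_+ sends |mu> to |mu + 1>
        j_plus[c - 1][c] = math.sqrt((j - mu) * (j + mu + 1))
    # J_- is the adjoint of J_+ (a plain transpose, since the entries are real)
    j_minus = [[j_plus[c][r] for c in range(dim)] for r in range(dim)]
    return j3, j_plus, j_minus

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][c] for k in range(n)) for c in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[r][c] - b[r][c]) for r in range(n) for c in range(n))

def check(two_j):
    """Return the defects in [J+, J-] = 2 J3 and in J^2 = j(j+1) I."""
    j = two_j / 2
    j3, jp, jm = spin_matrices(two_j)
    dim = len(j3)
    pm, mp, sq = matmul(jp, jm), matmul(jm, jp), matmul(j3, j3)
    comm = [[pm[r][c] - mp[r][c] for c in range(dim)] for r in range(dim)]
    casimir = [[0.5 * (pm[r][c] + mp[r][c]) + sq[r][c] for c in range(dim)] for r in range(dim)]
    two_j3 = [[2 * j3[r][c] for c in range(dim)] for r in range(dim)]
    target = [[j * (j + 1) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    return max_abs_diff(comm, two_j3), max_abs_diff(casimir, target)

defects = {two_j: check(two_j) for two_j in range(1, 6)}
print(defects)
```

Each matrix has size $2j + 1$, which is the dimension of the irreducible representation.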

7.2 The Group SU(3) and its Irreducible Representations

In Sec. (2.4) we have already seen that elements $U \in SU(3)$, which are unitary matrices ($U U^\dagger = U^\dagger U = I$) with $\det U = 1$, can be expressed as (see Eq. (2.4.7)):

$$U(\epsilon_1, \ldots, \epsilon_8) = \exp(i\,\epsilon_r g_r), \qquad r = 1, 2, \ldots, 8 \tag{3.7.17}$$

where $\epsilon_r$ are the group parameters and $g_r$ are the group generators represented by $(3 \times 3)$ traceless Hermitian matrices; for our present purpose we choose these as the $\lambda$-matrices of Gell-Mann, which are given as follows:

$$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

$$\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad \lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \tag{3.7.18}$$

$$\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \lambda_8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$$

Note that the $\lambda$-matrices have the commutation property

$$\left[ \frac{\lambda_a}{2}, \frac{\lambda_b}{2} \right] = i f_{abc}\, \frac{\lambda_c}{2} \tag{3.7.19}$$

where $f_{abc}$ is totally antisymmetric and has the following non-zero values:

$$f_{123} = 1, \quad f_{147} = \tfrac{1}{2}, \quad f_{156} = -\tfrac{1}{2}, \quad f_{246} = \tfrac{1}{2}, \quad f_{257} = \tfrac{1}{2}, \quad f_{345} = \tfrac{1}{2}, \quad f_{367} = -\tfrac{1}{2}, \quad f_{458} = \tfrac{\sqrt{3}}{2}, \quad f_{678} = \tfrac{\sqrt{3}}{2}. \tag{3.7.20}$$

It can also be checked that they satisfy the normalization property:

$$\mathrm{Tr}(\lambda_a \lambda_b) = 2\delta_{ab}. \tag{3.7.21}$$

From (3.7.19) it follows that the group generators $g_r$ satisfy the Lie algebra relation

$$[g_a, g_b] = i f_{abc}\, g_c \tag{3.7.22}$$

and also, since $SU(3)$ is a rank 2 group (Sec. 2.4), there are just two Hermitian matrices which are diagonal. These are $\lambda_3$ and $\lambda_8$; this leads to the fact that the corresponding generators $g_3$ and $g_8$ satisfy:

$$[g_3, g_8] = 0. \tag{3.7.23}$$

The raising and lowering operators in terms of the remaining $g_r$'s are$^{24}$:

$$T_\pm = g_1 \pm i g_2, \qquad U_\pm = g_6 \pm i g_7, \qquad V_\pm = g_4 \pm i g_5. \tag{3.7.24}$$

The generators $g_3$ and $g_8$ in this set up are denoted as:

$$T_3 = g_3, \qquad Y = \frac{2}{\sqrt{3}}\, g_8. \tag{3.7.25}$$

One of the Casimir operators* for $SU(3)$ is

$$g^2 = \sum_i g_i g_i.$$

The corresponding commutation relations using the operators of (3.7.24) and (3.7.25) are given by:

$$[T_3, T_\pm] = \pm T_\pm, \qquad [T_3, U_\pm] = \mp \tfrac{1}{2} U_\pm, \qquad [T_3, V_\pm] = \pm \tfrac{1}{2} V_\pm,$$

$$[Y, T_\pm] = 0, \qquad [Y, U_\pm] = \pm U_\pm, \qquad [Y, V_\pm] = \pm V_\pm, \tag{3.7.26}$$

$$[T_+, T_-] = 2T_3, \qquad [U_+, U_-] = \tfrac{3}{2} Y - T_3 \equiv 2U_3, \qquad [V_+, V_-] = \tfrac{3}{2} Y + T_3 \equiv 2V_3, \tag{3.7.27}$$

$$[T_+, V_+] = [T_+, U_-] = [U_+, V_+] = 0, \qquad [T_+, V_-] = -U_-, \qquad [T_+, U_+] = V_+,$$

$$[U_+, V_-] = T_-, \qquad [T_3, Y] = 0. \tag{3.7.28}$$

Note that these commutation relations can be easily verified using the values of $f_{abc}$ given in (3.7.20).

24 The use of letters $T$, $U$, $V$, $Y$ is an accepted usage in physics literature.

* Since $SU(3)$ is a rank-2 group, there are two independent Casimir operators. The other one is cubic in the $g_i$'s.
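The structure constants (3.7.20) and the normalization (3.7.21) can be recomputed directly from the matrices (3.7.18). The sketch below relies on the identity $f_{abc} = \mathrm{Tr}([\lambda_a, \lambda_b]\lambda_c)/(4i)$, which follows from (3.7.19) and (3.7.21); the helper names and the dictionary `listed` are hypothetical conveniences of this illustration.

```python
import math

s3 = 1 / math.sqrt(3)

# Gell-Mann matrices lambda_1 ... lambda_8 as listed in (3.7.18)
lam = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[s3, 0, 0], [0, s3, 0], [0, 0, -2 * s3]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def trace(a):
    return sum(a[i][i] for i in range(3))

def structure_constant(a, b, c):
    """f_abc for 1-based indices, from f_abc = Tr([lam_a, lam_b] lam_c) / (4i)."""
    la, lb, lc = lam[a - 1], lam[b - 1], lam[c - 1]
    ab, ba = matmul(la, lb), matmul(lb, la)
    comm = [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]
    return (trace(matmul(comm, lc)) / 4j).real

r = math.sqrt(3) / 2
# Non-zero values quoted in (3.7.20)
listed = {(1, 2, 3): 1.0, (1, 4, 7): 0.5, (1, 5, 6): -0.5, (2, 4, 6): 0.5, (2, 5, 7): 0.5,
          (3, 4, 5): 0.5, (3, 6, 7): -0.5, (4, 5, 8): r, (6, 7, 8): r}

f_defect = max(abs(structure_constant(*abc) - value) for abc, value in listed.items())
norm_defect = max(
    abs(trace(matmul(lam[a], lam[b])) - (2 if a == b else 0))
    for a in range(8) for b in range(8)
)
print(f_defect, norm_defect)
```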


Now, since $SU(3)$ is of rank 2 (equivalently, $T_3$ and $Y$ can be diagonalized simultaneously), the states in its irreducible representation must be labeled by two eigenvalues: $t_3$ and $y$. A representation can then be considered as a two-dimensional figure in the $t_3 y$-plane. The effect of the raising and lowering operators on the states labeled by $t_3$ and $y$ can be read off from (3.7.26)-(3.7.28) to obtain the graphic representation of $SU(3)$. For instance (using the first bracket in (3.7.26)), $T_+$ raises $t_3$ by 1 unit and $T_-$ lowers $t_3$ by 1 unit, whereas both $T_+$ and $T_-$ leave $y$ unchanged (fourth bracket in (3.7.26)). Similarly the operator $U_+$ lowers $t_3$ by 1/2 unit and raises $y$ by 1 unit, and the operator $V_+$ raises $t_3$ by 1/2 unit and $y$ by 1 unit, etc. From the above it is evident that once an appropriate scale is selected for $t_3$ and $y$, the raising and lowering operators connect points along lines (in the $t_3 y$-plane) whose inclinations are multiples of 60° with each other (see Fig. (3.1)).

[Fig. 3.1: Graphic representation of the raising and lowering operators of SU(3) on the $t_3 y$-plane (axes $t_3$ and $y$; arrows $T_\pm$, $U_\pm$, $V_\pm$ at 60° to one another).]

Each irreducible representation of $SU(3)$ is characterized by a set of two integers $(n, m)$, and graphically it forms a hexagonal boundary such that three sides of the hexagon are equal in length to $n$ units and the other three to $m$ units. The hexagon collapses to an equilateral triangle if either $n$ or $m$ is zero.

[Fig. 3.2: Boundaries of the SU(3) representations $(n, m)$, $(n, 0)$ and $(0, m)$.]

Finally, while an irreducible representation of $SU(2)$ is characterized by one integer or half integer $j$, whose graphic representation is a straight line of length $2j$ with $2j + 1$ sites, each of them occupied singly by one state, an irreducible representation of $SU(3)$ has a more complex graphic representation. As can be expected, the variations in both arguments of the pair $(n, m)$ lead to a multiplicity of states on each site in the $t_3 y$-plane. In simple words they form the following pattern: the sites on the boundary are singly occupied, on the second layer they are doubly occupied and on the third they are triply occupied; the process continues until a triangular layer is reached, after which the multiplicity ceases to increase. This stage is reached when the multiplicity becomes $m + 1$ for $n > m$ (or $n + 1$ for $m > n$).

25 Note that $U_+$, which is called the raising operator, in this case is a misnomer.

The sum of the multiplicities of states over all sites is the dimension of the representation. The formula for this dimension is:

$$d(n, m) = \tfrac{1}{2}(n + 1)(m + 1)(n + m + 2). \tag{3.7.29}$$
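The layer pattern just described can be compared with the standard dimension formula $d(n, m) = \tfrac{1}{2}(n+1)(m+1)(n+m+2)$ by direct counting. The sketch below is an illustration under stated assumptions: a hexagonal layer with sides $p$ and $q$ is taken to carry $3(p + q)$ boundary sites, and once a triangular layer is reached the successive inner triangles are taken to have sides shorter by 3 units; the function names are hypothetical.

```python
def dimension(n, m):
    """Standard dimension formula for the SU(3) representation (n, m)."""
    return (n + 1) * (m + 1) * (n + m + 2) // 2

def boundary_sites(p, q):
    """Sites on the boundary of a hexagon with sides p and q (a single point if p = q = 0)."""
    return 3 * (p + q) if p + q > 0 else 1

def dimension_by_layers(n, m):
    """Sum multiplicity times number of sites, layer by layer."""
    total, multiplicity = 0, 1
    p, q = n, m
    # Hexagonal layers: the multiplicity grows by one on each inner layer.
    while p > 0 and q > 0:
        total += multiplicity * boundary_sites(p, q)
        p, q, multiplicity = p - 1, q - 1, multiplicity + 1
    # Triangular layers: the multiplicity no longer increases.
    side = p + q
    while side >= 0:
        total += multiplicity * boundary_sites(side, 0)
        side -= 3
    return total

examples = {(1, 0): dimension(1, 0), (0, 1): dimension(0, 1),
            (1, 1): dimension(1, 1), (3, 0): dimension(3, 0)}
agree = all(dimension(n, m) == dimension_by_layers(n, m)
            for n in range(7) for m in range(7))
print(examples, agree)
```

The four entries of `examples` correspond to the triplet, its conjugate, the octet and the decuplet.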

Sometimes an irreducible representation is labeled not by the pair $(n, m)$ but by its dimension. For instance, a $d$-dimensional irreducible representation is labeled by $d$ and its complex conjugate by $d^*$. In conclusion to this section, we give below four simple cases of graphic representations of $SU(3)$ which are explained in Exercise 4. In all these examples all sites (shown by x) are singly occupied except the centre of 8 (see Section (4.2) of 9.[6] or Section (5.3) of 6.[16] for more details).

[Fig. 3.3: Examples of SU(3) representations with states labeled by $(t_3, y)$. Panels: $(n, m) = (1, 0)$, 3 (triplet); $(n, m) = (0, 1)$, 3* (triplet); $(n, m) = (1, 1)$, 8 (octet); $(n, m) = (3, 0)$, 10 (decuplet).]

Exercise 3.7

1. Verify the Lie bracket relations (3.7.6) and (3.7.7).
2. Establish (3.7.10) and determine the constants $K_\pm(\alpha, \mu)$ involved there.
3. Find the representation matrices of the irreducible representation of $SU(2)$ for $j = \tfrac{1}{2}$ and $j = 1$.
4. Give mathematical explanations for the diagrams of Fig. 3.3, which represent the sites of states for different choices of $(n, m)$.

Hints to Exercise 3.7

1. Written out in full, (3.7.6) becomes:

$$[(J_1 + iJ_2), (J_1 - iJ_2)] = 2J_3.$$

Using the additive property of the Lie bracket and the fact that $[J_a, J_a] = 0$ for $a = 1, 2$, we have on the LHS:

$$[J_1, -iJ_2] + [iJ_2, J_1] = 2i[J_2, J_1] = -2i[J_1, J_2] = -2i(i\epsilon_{123} J_3) = 2J_3.$$

Equality (3.7.7) can be verified in a similar manner.

2. We use (3.7.3) to write the equality:

$$0 = (J^2 J_3 - J_3 J^2)(J_\pm |\alpha, \mu\rangle) = J^2 J_3 (J_\pm |\alpha, \mu\rangle) - J_3 (J_\pm J^2 |\alpha, \mu\rangle)$$

$$= J^2 (\mu \pm 1) J_\pm |\alpha, \mu\rangle - \alpha J_3 J_\pm |\alpha, \mu\rangle = (\mu \pm 1) J^2 (J_\pm |\alpha, \mu\rangle) - \alpha(\mu \pm 1) J_\pm |\alpha, \mu\rangle. \tag{i}$$

To determine the constants $K_\pm(\alpha, \mu)$ in (3.7.10) we use the relation:

$$\langle \alpha, \mu | J_- J_+ | \alpha, \mu \rangle = |K_+(\alpha, \mu)|^2. \tag{ii}$$

Since $J_-$ is the Hermitian conjugate of $J_+$, it follows that

$$\langle \alpha, \mu | J_- = K_+^*(\alpha, \mu)\, \langle \alpha, \mu + 1 |. \tag{iii}$$

Also from (3.7.13) the above equality becomes:

$$\langle \alpha, \mu | J_- J_+ | \alpha, \mu \rangle = \langle \alpha, \mu | J^2 - J_3^2 - J_3 | \alpha, \mu \rangle = j(j + 1) - \mu^2 - \mu. \tag{iv}$$

Hence we obtain

$$K_+(\alpha, \mu) = \big( j(j + 1) - \mu^2 - \mu \big)^{1/2} = [(j - \mu)(j + \mu + 1)]^{1/2} \tag{v}$$

where $\mu$ varies and $j$ stands for $(\max \mu)$. Using similar steps, we have

$$K_-(\alpha, \mu) = [(j + \mu)(j - \mu + 1)]^{1/2}. \tag{vi}$$

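Note that since $j$ in the above values of $K_\pm(\alpha, \mu)$ stands for the largest value of $\mu$, which is an integer or a half integer, $\mu$ takes only those values which are admissible with this value of $j$.

3. When $j = 1/2$ the only admissible values for $\mu$ are $1/2$ and $1/2 - 1 = -1/2$, hence the eigenstates in question are $|1/2, 1/2\rangle$ and $|1/2, -1/2\rangle$. If these eigenstates are denoted in matrix form by column matrices, i.e.:

$$\left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{i}$$

then using

$$J_3 \left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle = \tfrac{1}{2} \left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle \quad \text{and} \quad J_3 \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle = -\tfrac{1}{2} \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle$$

we have the matrix for $J_3$ given by:

$$J_3 = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{ii}$$

From

$$J_+ \left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle = 0 \quad \text{and} \quad J_+ \left| \tfrac{1}{2}, -\tfrac{1}{2} \right\rangle = \left[ \left( \tfrac{1}{2} + \tfrac{1}{2} \right) \left( \tfrac{1}{2} - \tfrac{1}{2} + 1 \right) \right]^{1/2} \left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle = \left| \tfrac{1}{2}, \tfrac{1}{2} \right\rangle$$

we have

$$J_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = J_1 + iJ_2. \tag{iii}$$

(Note that we have used Eq. (v) of Exercise 2 in writing $J_+ |\tfrac{1}{2}, -\tfrac{1}{2}\rangle$.) Similarly

$$J_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = J_1 - iJ_2. \tag{iv}$$

These give the matrices for $J_1$ and $J_2$ as:

$$J_1 = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad J_2 = \frac{1}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{v}$$

The matrix representation for the irreducible representation of $SU(2)$ is therefore given by:

$$\frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \frac{1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \frac{1}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}.$$

Evidently it involves the Pauli matrices and thus confirms the identification $J_a = \sigma_a/2$ mentioned in Section 7. For the choice $j = 1$, we have $\mu = 1, 0, -1$; the eigenstates in this case (with their matrix representation) are:

$$|1, 1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad |1, 0\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad |1, -1\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{vi}$$

The eigenvalue of $J_3$ on these eigenstates is respectively given by:

$$J_3 |1, 1\rangle = 1\, |1, 1\rangle, \qquad J_3 |1, 0\rangle = 0\, |1, 0\rangle, \qquad J_3 |1, -1\rangle = -1\, |1, -1\rangle,$$

hence the matrix representation of $J_3$ is:

$$J_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{vii}$$

Using the formula $K_\pm(\alpha, \mu) = [(j \mp \mu)(j \pm \mu + 1)]^{1/2}$ for the pairs $(1, 1)$, $(1, 0)$, $(1, -1)$ we have the values of the constants needed for $J_+$ and $J_-$; they are respectively $(0, \sqrt{2}, \sqrt{2})$ and $(\sqrt{2}, \sqrt{2}, 0)$. Accordingly, from $J_+ |1, 1\rangle = 0$, $J_+ |1, 0\rangle = \sqrt{2}\, |1, 1\rangle$ and $J_+ |1, -1\rangle = \sqrt{2}\, |1, 0\rangle$ we have:

$$J_+ = \begin{pmatrix} 0 & \sqrt{2} & 0 \\ 0 & 0 & \sqrt{2} \\ 0 & 0 & 0 \end{pmatrix} = J_1 + iJ_2 \tag{viii}$$

and from $J_- |1, 1\rangle = \sqrt{2}\, |1, 0\rangle$, $J_- |1, 0\rangle = \sqrt{2}\, |1, -1\rangle$ and $J_- |1, -1\rangle = 0$ we obtain:

$$J_- = \begin{pmatrix} 0 & 0 & 0 \\ \sqrt{2} & 0 & 0 \\ 0 & \sqrt{2} & 0 \end{pmatrix} = J_1 - iJ_2. \tag{ix}$$

The solutions of (viii) and (ix) yield:

$$J_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad J_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix}.$$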

The triplet (/,, J2, 73) gives another irreducible representation of SU(2), and it is easy to check that 7 a 's satisfy the Lie algebra relations (3.7.2). 4. In order to solve this exercise we list a few more facts about the states and the sites involved in SU(3) representations. We recall that: T+ raises t3 by 1 unit and leaves y unchanged. U+ lowers t3 by 1/2 unit and raises y by 1 unit, V+ raises / 3 by 1/2 unit and raises y by 1 unit. On the other hand, since T_ = (T+)+ (adjoint of T+) etc., operators T_, V_ become the lowering operator and f/_becomes the raising operator on r3. T_ leaves y unchanged and U_, V_ lower y by 1 unit. This also implies that sites are symmetrical with respect to y-axis. We further note that since T+, U_ and V+ all increase the value of T3, for SU(3) representations there must exist a maximally stretched state ^ r a that satisfies: (i) T+ max = U_ 0 m a x = 0

For a given pair $(n, m)$ the width of the widest portion of the hexagon is $n + m$, thus the $\phi_{\max}$ state gives:

(ii) $T_3(\phi_{\max}) = \dfrac{n+m}{2}, \qquad Y(\phi_{\max}) = \dfrac{n-m}{3}$

We now consider the first diagram, for which $n = 1$, $m = 0$. From (ii), $T_3(\phi_{\max}) \equiv t_3 = \tfrac12$ and $Y(\phi_{\max}) \equiv y = \tfrac13$; thus the coordinate representation of one of the sites (with one state) is $(\tfrac12, \tfrac13)$. Due to symmetry the other site is $(-\tfrac12, \tfrac13)$, given by $T_-\,|\tfrac12, \tfrac13\rangle = |-\tfrac12, \tfrac13\rangle$. Now none of the raising operators on $T_3$ can be used in view of (i). Hence the only operator that gives an admissible result is:

(iii) $V_-\,|\tfrac12, \tfrac13\rangle = |0, -\tfrac23\rangle$

This gives the third site $(0, -\tfrac23)$. (Note that since $m = 0$, three sides of the hexagon have reduced to zero.) Using the formula (3.7.29) we have $d = 3$.

Diagram (2) is the conjugate of (1): $T_3(\phi_{\max}) = \tfrac12$ and $Y(\phi_{\max}) = -\tfrac13$. This explains the inverted triangle. The dimension is $d^* = 3$, and the representation is denoted $3^*$ or $\bar{3}$.

In the case of diagram (3), $(n, m) = (1, 1)$ gives:

(iv) $T_3(\phi_{\max}) \equiv t_3 = 1, \qquad Y(\phi_{\max}) \equiv y = 0$

The site of the state $|1, 0\rangle$ is given by the point $(1, 0)$ in the $t_3$-$y$ plane. Using the same arguments as above we use the operators $T_-$, $V_-$ and $U_+$ to obtain:

(v) $T_-|1, 0\rangle = |0, 0\rangle$, $\quad V_-|1, 0\rangle = |\tfrac12, -1\rangle$, $\quad U_+|1, 0\rangle = |\tfrac12, 1\rangle$, $\quad T_-V_-|1, 0\rangle = |-\tfrac12, -1\rangle$, $\quad T_-T_-|1, 0\rangle = |-1, 0\rangle$, $\quad T_-U_+|1, 0\rangle = |-\tfrac12, 1\rangle$, $\quad V_-U_+|1, 0\rangle = |0, 0\rangle$.

This shows that every site except the centre of the hexagon is occupied by a single state, while the centre is doubly occupied. The dimension therefore equals 8. Diagram (4) is left as an exercise for the reader.
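The bookkeeping in (v) can be replayed mechanically. The following is only a sketch under stated assumptions: it tracks the labels $(t_3, y)$ using the shift rules quoted above and does not construct the states themselves, so it cannot decide whether a product of operators annihilates a state. The names `STEP`, `apply_word` and `occupancy` are illustrative and do not come from the text.

```python
from collections import Counter
from fractions import Fraction as Fr

# Shifts (delta t3, delta y) for the three operators used in (v), as quoted in the text.
STEP = {
    "T-": (Fr(-1), Fr(0)),      # T_- lowers t3 by 1 and leaves y unchanged
    "V-": (Fr(-1, 2), Fr(-1)),  # V_- lowers t3 by 1/2 and y by 1
    "U+": (Fr(-1, 2), Fr(1)),   # U_+ lowers t3 by 1/2 and raises y by 1
}

def apply_word(word, site):
    """Shift a site (t3, y) by a product of operators; the rightmost factor acts first."""
    t3, y = site
    for op in reversed(word):
        dt3, dy = STEP[op]
        t3, y = t3 + dt3, y + dy
    return (t3, y)

phi_max = (Fr(1), Fr(0))  # maximal state of diagram (3), taken from (iv)
words = [("T-",), ("V-",), ("U+",), ("T-", "V-"),
         ("T-", "T-"), ("T-", "U+"), ("V-", "U+")]
sites = [phi_max] + [apply_word(w, phi_max) for w in words]
occupancy = Counter(sites)  # number of listed states per site

for site, k in sorted(occupancy.items()):
    print(f"t3 = {site[0]!s:>4}, y = {site[1]!s:>2}: {k} state(s)")
```

The eight labels fall on seven distinct sites, with the origin counted twice, in agreement with the doubly occupied centre found above.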

References

1. W. Arveson, Ten Lectures on Operator Algebras, A.M.S. No. 55 (1984).
2. H. Bercovici, C. Foias and C. Pearcy, Dual Algebras with Applications to Invariant Subspaces and Dilation Theory, A.M.S. No. 56 (1985).
3. M. S. Birman and M. Z. Solomjak, Spectral Theory of Self-Adjoint Operators in Hilbert Space (D. Reidel Publishing Company, 1987).
4. L. de Broglie, Heisenberg's Uncertainties and the Probabilistic Interpretation of Wave Mechanics, with Critical Notes of the Author (Boston: Kluwer Academic Publishers, 1990).
5. H. R. Dowson, Spectral Theory of Linear Operators (New York: Academic Press, 1978).
6. N. Dunford and J. Schwartz, Linear Operators (Vols. 1, 2 and 3, New York: Interscience Publishers, 1958).
7. C. Foias, C. Pearcy and B. Sz.-Nagy, (a) The Functional Model of a Contraction and the Space $L^1$, Acta Sci. Math. (Szeged) 42 (1980) 201-204; (b) Contractions with Spectral Radius One and Invariant Subspace, Acta Sci. Math. (Szeged) 43 (1981) 273-280.
8. H. F. Hameka, Quantum Mechanics (New York: Wiley, 1981).
9. S. Helgason, Topics in Harmonic Analysis on Homogeneous Spaces (Boston: Birkhäuser, 1981).
10. M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra (New York: Academic Press, 1974).
11. T. F. Jordan, Linear Operators for Quantum Mechanics (New York: Wiley, 1969).
12. F. Riesz and B. Sz.-Nagy, Functional Analysis (New York: F. Ungar Publishing Company, 1955).
13. W. Rudin, Real and Complex Analysis (New York: McGraw-Hill, 1966).
14. J. L. Soule, Linear Operators in Hilbert Space (Gordon and Breach Science Publishers, 1968).
15. M. H. Stone, Linear Transformations in Hilbert Space (Am. Math. Soc., New York, 1932).
16. B. Sz.-Nagy and C. Foias, Harmonic Analysis of Operators on Hilbert Space (Amsterdam: North-Holland, 1970).
17. J. Weidmann, Linear Operators in Hilbert Spaces (New York: Springer-Verlag, 1980).

CHAPTER 4

BASICS OF ALGEBRAS AND RELATED CONCEPTS

1 SOME DEFINITIONS AND EXAMPLES

The topic of algebra has always been important to both physicists and mathematicians. The diversity of algebras has increased manifold in recent years, and their links to applications in physics are becoming more and more apparent. Some of these algebras (well known in the literature) are the following: associative algebras, Lie algebras, Jordan algebras, Cartan algebras, Heisenberg algebras, Chevalley algebras, von Neumann algebras and, of more recent origin, current algebras, Hopf algebras, affine Kac-Moody algebras, Virasoro algebras and vertex operator algebras.¹ The last three are all infinite dimensional (see Chapter 5) and have been responsible for a better understanding of string/superstring theory, apart from the fact that they opened new channels of interpretation for some classical concepts, e.g., the Rogers-Ramanujan identities, Dirac's magnetic monopole and the soliton solutions of the KdV equation (see the Preface and Chapter 7 in [7]). One could very well say that no study in mathematical physics is complete without a knowledge of these algebras and of Lie algebras. Due to our limited scope, we give the definitions and examples (together with some exercises and their hints) of only those objects which are relevant to our main theme: the applications in quantum, Yang-Mills and string theory. To begin with we give a few definitions.

1.1 Associative, Jordan and Lie Algebras

Definition 4.1.1: An algebra $\mathcal{A}$ is a linear vector space over a field $\mathcal{F}$ of characteristic 0 or a prime number $p$, on which a distributive binary operation '$\circ$' that commutes with scalars is defined:

$$a \circ (\lambda b) = \lambda (a \circ b) = (\lambda a) \circ b, \qquad a, b \in \mathcal{A},\ \lambda \in \mathcal{F} \tag{4.1.1}$$

It is called an associative algebra if for all triples $a, b, c \in \mathcal{A}$ the associativity rule

$$(a \circ b) \circ c = a \circ (b \circ c) \tag{4.1.2}$$

is valid. A linear subspace $\mathcal{S}$ of $\mathcal{A}$ is called a subalgebra of $\mathcal{A}$ if it is closed under the operation '$\circ$'.

¹ We study here mainly Lie algebras due to their importance in applications. The study of Hopf algebras is postponed to Chapter 9, since in current terminology they are also called quantum groups (see Appendix D in Chapter 9).

Definition 4.1.2: An algebra $\mathcal{J}$ over $\mathcal{F}$ characterized by the following properties is called a Jordan algebra:

$$a \circ b = b \circ a \tag{4.1.3}$$

$$a^2 \circ (b \circ a) - (a^2 \circ b) \circ a = 0 \tag{4.1.4}$$

The above properties are also written as $[a, b] = 0$ and $[a^2, b, a] = 0$, where the three-entry bracket denotes the associator (see [5] for details).

Definition 4.1.3: An algebra $\mathcal{L}$ is called a Lie algebra if the commutator formed by all pairs of its elements is bilinear and antisymmetric; thus, denoting the commutator (called the Lie bracket) by $[a, b]$, we have:

$$[a, b] = -[b, a] \tag{4.1.5(a)}$$

and

$$[a + \lambda a', b + \mu b'] = [a, b] + \lambda [a', b] + \mu [a, b'] + \lambda\mu [a', b'] \tag{4.1.5(b)}$$

When the bracket is realized as the commutator $[a, b] = ab - ba$ of an associative product, it can be easily verified that it satisfies the Jacobi identity:

$$[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0 \tag{4.1.6}$$

For an abstractly given bracket, (4.1.6) is required in addition to (4.1.5).

It is to be noted that a Jordan algebra is commutative but in general not associative, the associativity being replaced by the identity (4.1.4); a Lie algebra is not associative either, and in this case the associativity is replaced by the Jacobi identity. Before we give examples to illustrate the above definitions, we shall define one more object, the lattice, since, as we shall soon see, the Lie algebras (with which we shall be most concerned) can very often be constructed via Lie groups and lattices. See for instance Example (4.1.7), the Hints to Exercise 8 of this section and Exercise 3 of Sec. (5.1).
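The identities (4.1.4) and (4.1.6) can be checked on concrete matrices. The sketch below is an illustration under stated assumptions, not part of the text: it realizes the Lie bracket as the matrix commutator $ab - ba$ and the Jordan product as the symmetrized product $\tfrac12(ab + ba)$, and the sample matrices and helper names (`lie`, `jordan`, `lin`) are chosen only for this example.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(alpha, a, beta, b):
    """Entrywise linear combination alpha*a + beta*b of two square matrices."""
    return [[alpha * x + beta * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def lie(a, b):
    """Lie bracket realized as the commutator ab - ba."""
    return lin(1, matmul(a, b), -1, matmul(b, a))

def jordan(a, b):
    """Jordan product realized as the symmetrized product (ab + ba)/2."""
    half = Fraction(1, 2)
    return lin(half, matmul(a, b), half, matmul(b, a))

ZERO = [[0, 0], [0, 0]]
a = [[1, 2], [3, 4]]
b = [[0, 1], [5, -2]]
c = [[2, 0], [1, 7]]

# Jacobi identity (4.1.6) for the commutator bracket.
jacobi = lin(1, lin(1, lie(a, lie(b, c)), 1, lie(b, lie(c, a))), 1, lie(c, lie(a, b)))
jacobi_holds = jacobi == ZERO

# Jordan identity (4.1.4) for the symmetrized product.
a2 = jordan(a, a)
jordan_identity_holds = jordan(a2, jordan(b, a)) == jordan(jordan(a2, b), a)

# Neither product is associative: test both on three matrix units.
e12, e21, e11 = [[0, 1], [0, 0]], [[0, 0], [1, 0]], [[1, 0], [0, 0]]
jordan_associative_here = jordan(jordan(e12, e21), e11) == jordan(e12, jordan(e21, e11))
lie_associative_here = lie(lie(e12, e21), e11) == lie(e12, lie(e21, e11))

print(jacobi_holds, jordan_identity_holds, jordan_associative_here, lie_associative_here)
```

A single numerical check of this kind does not prove the identities, but the matrix units show concretely that associativity fails for both products.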

1.2 Lattices

Definition 4.1.4: Let $V$ be a real $N$-dimensional vector space with an inner product (not necessarily non-singular) denoted $x \circ y$ for $x, y \in V$, and let $\{e_i\}$, $i = 1, \ldots, N$, denote the basis vectors of $V$. Then the set of points of $V$ of the form

$$\sum_{i=1}^{N} n_i e_i, \qquad n_i \in \mathbb{Z} \tag{4.1.7}$$

is called a lattice (denoted $\Lambda$) in $V$. If $V$ is a Euclidean space $\mathbb{R}^N$ or a Minkowskian space $\mathbb{R}^{N-1,1}$, the lattice $\Lambda$ is called a Euclidean or Lorentzian lattice respectively. If $|\det(e_i \circ e_j)| = 1$ the lattice is called unimodular. A unimodular lattice $\Lambda$ contains just one point of $V$ per unit volume.

Definition 4.1.5: The dual $\Lambda^*$ (not always a lattice) of any lattice $\Lambda$ is the set of points $y \in V$ for which $x \circ y$ is an integer for all $x \in \Lambda$. Only if the inner product is non-singular and $\Lambda$ spans $V$ is the set $\Lambda^*$ a lattice, called the dual lattice of $\Lambda$. The lattice $\Lambda$ is called an integral lattice if $x \circ y$ is an integer for every $x, y \in \Lambda$. It can be verified that in this case $\Lambda \subset \Lambda^*$. If $\Lambda$ is both unimodular and integral then it is self-dual, i.e., $\Lambda = \Lambda^*$ (see Goddard and Olive in 5.[6]).
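The conditions in Definitions 4.1.4 and 4.1.5 are easy to test from the Gram matrix $e_i \circ e_j$. The sketch below assumes a diagonal inner product with signature entries `eta`; the three example bases and the function names (`gram`, `det`, `classify`) are illustrative choices, not taken from the text.

```python
from fractions import Fraction as Fr

def gram(basis, eta):
    """Gram matrix e_i o e_j for a diagonal inner product with signature entries eta."""
    return [[sum(Fr(x) * s * Fr(y) for x, s, y in zip(u, eta, v)) for v in basis]
            for u in basis]

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small N)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def classify(basis, eta):
    g = gram(basis, eta)
    return {
        "unimodular": abs(det(g)) == 1,
        # By bilinearity it suffices to test the products of basis vectors.
        "integral": all(x.denominator == 1 for row in g for x in row),
    }

half = Fr(1, 2)
square = classify([(1, 0), (0, 1)], eta=(1, 1))                # Euclidean, Gram = identity
checkerboard = classify([(1, 1), (1, -1)], eta=(1, 1))         # Euclidean, Gram = diag(2, 2)
lorentzian = classify([(1, 1), (half, -half)], eta=(1, -1))    # Lorentzian, two null vectors

for name, result in [("square", square), ("checkerboard", checkerboard),
                     ("lorentzian", lorentzian)]:
    print(name, result)
```

In these examples the square lattice and the Lorentzian lattice are both integral and unimodular, hence self-dual by the criterion above, while the checkerboard lattice is integral but not unimodular.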


1.3 Examples of Algebras, and the *- and C*-Algebra

Returning to algebras and their examples, we first note that in physics the abstract elements $a, b, c$ belonging to an algebra are realized by specific quantities such as functions $\phi(a)$ on $T^*(M)$ (the cotangent bundle of a manifold $M$), say, or the operators $\hat{\phi}$ on a Hilbert space $\mathcal{H}$. The product $a \circ b$ in the case of functions defined on $T^*(M)$ can then be expressed as:

$$\phi(a) \circ \phi'(a) = \frac{\partial \phi}{\partial a^u}\, C^{uv}\, \frac{\partial \phi'}{\partial a^v} \tag{4.1.8}$$

where $a \in T^*(M)$ and $C^{uv}$ is a tensor with $u, v$ ranging over $1, 2, \ldots, \dim M$. (Note that the definition of this product implies that the product is real valued or complex valued depending on the manifold $M$.) The distributivity and commutativity with respect to scalars can easily be checked for the above product.

Another simple example of an algebra is the so-called polynomial algebra. The set of all polynomials (in $x$)

$$f = a_0 + \sum_{i=1}^{n} a_i x^i \qquad (a_n \neq 0)$$

defined on a field $\mathcal{F}$ forms an infinite-dimensional commutative and associative algebra over $\mathcal{F}$.

Consider now an algebra $\mathcal{A}$ of observables $a$ in quantum mechanics. Let $\omega$ denote a state on $\mathcal{A}$, that is, a linear, positive and normalized functional on $\mathcal{A}$:

$$\omega(\lambda a + \lambda' a') = \lambda\,\omega(a) + \lambda'\,\omega(a') \tag{4.1.9(a)}$$

$$\omega(a^* a) \geq 0 \tag{4.1.9(b)}$$

$$\omega(1) = 1 \tag{4.1.9(c)}$$

Then $\mathcal{A}$ equipped with a state $\omega$ is called a *-algebra. Note that $a^*$ is the conjugate of the observable $a$, $1$ is the identity and $\lambda, \lambda'$ are real numbers.

With every *-algebra $\mathcal{A}$ one can associate another algebra, known as a C*-algebra $\mathfrak{A}$, in the following manner: for each element $a \in \mathcal{A}$, a state $\omega$ and $f \in \mathfrak{A}$, define a probability measure $\mu_{\omega,a}$ on the real line such that

$$\omega(f(a)) = \mu_{\omega,a}(f) = \int f(\lambda)\, d\mu_{\omega,a}(\lambda) \tag{4.1.10}$$

The elements $f$ and $g \in \mathfrak{A}$ satisfy the properties

$$\|f + g\|_\infty \leq \|f\|_\infty + \|g\|_\infty \tag{4.1.11(a)}$$

$$\|\lambda f\|_\infty = |\lambda|\, \|f\|_\infty \tag{4.1.11(b)}$$

$$\|f\|_\infty = 0 \Rightarrow f = 0 \tag{4.1.11(c)}$$

$$\|f g\|_\infty \leq \|f\|_\infty\, \|g\|_\infty \tag{4.1.11(d)}$$

$$\|f^* f\|_\infty = \|f f^*\|_\infty = \|f\|_\infty^2 \tag{4.1.11(e)}$$

where

$$\|f\|_\infty = \sup_\lambda |f(\lambda)|.$$

It can be checked that distributivity and multiplication with scalars for (4.1.10) follow using the properties (4.1.11). These are incidentally the properties of absolute values of complex-valued functions defined on the real line $\mathbb{R}$. The property (4.1.11)(d) further ensures that multiplication is a continuous operation, i.e.,

$$f_n \to f,\ g_n \to g \ \Rightarrow\ f_n g_n \to f g$$

The above algebra is also referred to as a Banach algebra (see [10]).

Another example of an algebra is given by the collection of continuous functions with compact support on a locally compact group $G$ (denoted $C_c(G)$). The binary operation here is the convolution defined as:

$$(\varphi * \psi)(x) = \int_G \varphi(y)\, \psi(y^{-1}x)\, dy \tag{4.1.12}$$

where $dy$ is a Haar measure on $G$ (see Exercises 1-5 for other examples of algebras and Def. (0.4.3) for the Haar measure). We now give a few simple examples of Lie algebras.
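For a finite group the Haar integral in the convolution reduces to a sum over group elements, which makes the algebra structure easy to inspect. The following sketch is an illustration under assumptions: it takes $G = S_3$ realized as permutation tuples, uses the counting measure, and adopts the convention $(\varphi * \psi)(x) = \sum_y \varphi(y)\,\psi(y^{-1}x)$; the test functions `f`, `g`, `h` are arbitrary.

```python
from itertools import permutations

G = list(permutations(range(3)))  # the six elements of S_3 as tuples

def mul(p, q):
    """Composition (p q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def conv(f, g):
    """Convolution over G with the counting measure playing the role of Haar measure."""
    return {x: sum(f[y] * g[mul(inv(y), x)] for y in G) for x in G}

def delta(a):
    """Function equal to 1 at the group element a and 0 elsewhere."""
    return {x: int(x == a) for x in G}

f = {p: i + 1 for i, p in enumerate(G)}
g = {p: (i * i) % 5 for i, p in enumerate(G)}
h = {p: 2 * i - 3 for i, p in enumerate(G)}

associative = conv(conv(f, g), h) == conv(f, conv(g, h))

identity = (0, 1, 2)
has_unit = conv(delta(identity), f) == f and conv(f, delta(identity)) == f

a, b = (1, 0, 2), (0, 2, 1)  # two transpositions that do not commute
deltas_multiply = conv(delta(a), delta(b)) == delta(mul(a, b))
commutative_here = conv(delta(a), delta(b)) == conv(delta(b), delta(a))

print(associative, has_unit, deltas_multiply, commutative_here)
```

The delta functions multiply like the group elements themselves, so the convolution algebra of a non-abelian group is associative but not commutative.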

1.4 Examples of Lie Algebras

Example 4.1.6: Every vector subspace $\mathcal{L}$ of an associative algebra closed under the operation $[x, y] = xy - yx$ is a Lie algebra.

Example 4.1.7: The general linear algebra $gl(V)$ formed by the endomorphisms of a vector space $V$ (denoted $\mathrm{End}\,V$) is a Lie algebra. If $V$ is the $n$-dimensional space $\mathbb{R}^n$ we write it as $gl_n$ or $gl(n, \mathbb{R})$. The standard basis in this case is $\{e_{ij} \mid 1 \leq i, j \leq n\}$, where $e_{ij}$ is the matrix with $(i, j)$-th entry 1 and zero elsewhere. Since $e_{ij}\, e_{kl} = \delta_{jk}\, e_{il}$, it can be verified that the Lie bracket satisfies

$$[e_{ij}, e_{kl}] = \delta_{jk}\, e_{il} - \delta_{li}\, e_{kj}$$

which in turn leads to the Jacobi identity.

Example 4.1.8: The tangent space $T_e(G)$ at the identity element of every Lie group $G$ is a vector space which forms a Lie algebra. More specifically, if $G$ is an $r$-parameter group with generators $\{X_\mu\}$ ($\mu = 1, \ldots, r$), then the commutators formed by the generators are linear combinations of the generators:

$$[X_\mu, X_\nu] = C^{\lambda}_{\mu\nu}\, X_\lambda \tag{4.1.13}$$

The algebra of generators is a Lie algebra, as is evident from (4.1.13). The constants $C^{\lambda}_{\mu\nu}$, known as structure constants, take values on the representation space of $G$. Since $[X_\mu, X_\nu] = -[X_\nu, X_\mu]$, these constants are antisymmetric in $\mu\nu$. It can be verified that, in consequence of the Jacobi identity satisfied by the commutators, they also satisfy the identity:

$$\sum_{\delta=1}^{r} \left( C^{\delta}_{\mu\nu}\, C^{\varepsilon}_{\delta\lambda} + C^{\delta}_{\nu\lambda}\, C^{\varepsilon}_{\delta\mu} + C^{\delta}_{\lambda\mu}\, C^{\varepsilon}_{\delta\nu} \right) = 0 \tag{4.1.14}$$

(Note that this Lie algebra is finite dimensional since the basis vectors in this case are $\{X_\mu\}$, $\mu = 1, 2, \ldots, r$.) We devote the next section to two special classes of algebras, the solvable Lie algebras and the semi-simple Lie algebras, which will eventually help us in defining the infinite-dimensional algebras.
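The bracket of the matrix units in Example 4.1.7 and the quadratic identity on the structure constants can both be verified by brute force for small $n$. The sketch below does this for $gl(2)$; it assumes the index placement $[X_a, X_b] = \sum_d C^d_{ab} X_d$, stored as `C[a][b][d]`, and all helper names are illustrative.

```python
from itertools import product

n = 2
pairs = list(product(range(n), repeat=2))  # labels (i, j) of the basis e_ij
r = len(pairs)

def unit(i, j):
    """Matrix unit e_ij: 1 in position (i, j), 0 elsewhere."""
    return [[int((row, col) == (i, j)) for col in range(n)] for row in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def delta(a, b):
    return int(a == b)

# Check [e_ij, e_kl] = delta_jk e_il - delta_li e_kj on every pair of basis elements.
formula_ok = all(
    comm(unit(i, j), unit(k, l))
    == [[delta(j, k) * unit(i, l)[row][col] - delta(l, i) * unit(k, j)[row][col]
         for col in range(n)] for row in range(n)]
    for (i, j), (k, l) in product(pairs, repeat=2)
)

# Structure constants: C[a][b][d] is the coefficient of X_d in [X_a, X_b].
C = [[[comm(unit(*pairs[a]), unit(*pairs[b]))[pairs[d][0]][pairs[d][1]]
       for d in range(r)] for b in range(r)] for a in range(r)]

antisymmetric = all(C[a][b][d] == -C[b][a][d]
                    for a, b, d in product(range(r), repeat=3))

quadratic_identity = all(
    sum(C[a][b][d] * C[d][c][e] + C[b][c][d] * C[d][a][e] + C[c][a][d] * C[d][b][e]
        for d in range(r)) == 0
    for a, b, c, e in product(range(r), repeat=4)
)

print(formula_ok, antisymmetric, quadratic_identity)
```

Because the matrix units form a basis, the coefficient of $X_d$ in a commutator is simply the corresponding matrix entry, which is why the structure constants can be read off directly.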


Exercise 4.1

1. Show that the direct sum $T(V) = \mathcal{F} \oplus V \oplus (V \otimes V) \oplus \cdots$ of all the tensor powers of a vector space $V$ over a field $\mathcal{F}$ is an infinite-dimensional algebra called the tensor algebra. Show also that it is an associative algebra.

2. Show that the quotient vector space $E(V) = T(V)/A$, where $A$ is the two-sided ideal³ generated by all elements of the form $x \otimes x$ ($x \in V$), is an associative algebra obtained by antisymmetrizing the tensors. This is known as the exterior algebra, and unlike $T(V)$ it is finite-dimensional (for finite-dimensional $V$).

3. Let $V^{n}_{(r)}$, $r \in \mathbb{Z}_+$ ($r \leq n$), be an $n$-dimensional real vector space with basis $\{e_i\}$ and an inner product $\langle\ ,\ \rangle$ satisfying:
(i) $\langle e_i, e_j \rangle = 0$, $i \neq j$
(ii) $\langle e_i, e_i \rangle = 1$, $i = 1, 2, \ldots, r$
(iii) $\langle e_i, e_i \rangle = -1$, $i = r + 1, \ldots, n$.
Introduce a product $uv$ of vectors in $V^{n}_{(r)}$ which is associative and distributive with respect to addition and satisfies the condition:
(iv) $uv + vu = 2\langle u, v \rangle$.
Verify that the set formed by all possible sums and products of elements of $V^{n}_{(r)}$ is an algebra. This algebra, denoted $C(V^{n}_{(r)})$, is the well-known Clifford algebra. Show that $C(V^{n}_{(r)})$ is a linear space of dimension $2^n$ with basis $(1, e_i, e_{i_1}e_{i_2}, \ldots, e_1 e_2 \cdots e_n)$,⁴ and that $C(V^{1}_{(0)})$ is $\mathbb{C}$.

4. Show that the algebra $C(V^{2}_{(0)})$ is the algebra of quaternions.

5. Let $C^r$ denote the linear subspace of the Clifford algebra $C$ spanned by the $\binom{n}{r}$ products $(e_{i_1} e_{i_2} \cdots e_{i_r})$. The direct sums

$$C_+ = \bigoplus_{r\ \text{even}} C^r, \qquad C_- = \bigoplus_{r\ \text{odd}} C^r$$

are linear subspaces of $C$, and $C_+$ is also a subalgebra of $C$. The Clifford algebras $C(V^{4}_{(1)})$ and $C(V^{3}_{(0)})$ are respectively called the Dirac algebra and the Pauli algebra. Show that the even subalgebra of the Dirac algebra is the Pauli algebra.

6. Show that every real Lie algebra $\mathcal{L}$ can be extended to a complex Lie algebra with the same structure constants.

7. Let the space-time representation of the angular momentum and momentum operators $M_{\mu\nu}$ and $P_\mu$ be given as:
(a) $M_{\mu\nu} = x_\mu \dfrac{\partial}{\partial x^\nu} - x_\nu \dfrac{\partial}{\partial x^\mu}$
(b) $P_\mu = \dfrac{\partial}{\partial x^\mu}$
then show that the Lie brackets for them satisfy:
(c) $[M_{mn}, M_{pq}] = g_{np} M_{mq} - g_{mp} M_{nq} - g_{nq} M_{mp} + g_{mq} M_{np}$
(d) $[M_{mn}, P_q] = g_{nq} P_m - g_{mq} P_n$
(e) $[P_m, P_q] = 0$
where $g_{\mu\nu}$ stands for the Minkowskian metric $(- + + +)$. The algebra obtained above is called the Poincaré algebra. Show that if, on the other hand, $g_{\mu\nu}$ is replaced by the 5-dimensional metric $g_{AB} = (- + + + +)$, one obtains the de Sitter algebra resulting from the group $O(1, 4)$ (see Sec. 7.3). The generalized angular momentum operators $M_{AB}$ in this case satisfy:
(f) $[M_{AB}, M_{CD}] = g_{BC} M_{AD} - g_{AC} M_{BD} - g_{BD} M_{AC} + g_{AD} M_{BC}$
whereas identifying $M_{\mu 5} \equiv P_\mu$, it can be checked that
(g) $[P_\mu, P_\nu] = -M_{\mu\nu} = M_{\nu\mu}$.

8. Show that the vector space spanned by the generators of an $r$-parameter group forms a Lie algebra by choosing $r = 2$ and 3.

9. Show that the Lie algebra of the group of volume-preserving diffeomorphisms is the algebra of divergence-free vector fields.

³ A left ideal (right ideal) $I$ of an algebra $\mathcal{A}$ is a subalgebra of $\mathcal{A}$ such that $i \in I \Rightarrow ai \in I$ ($ia \in I$) $\forall\, a \in \mathcal{A}$ (see Section 2).

⁴ Note that $e_i$ stands collectively for the one-element set $e_1, e_2, e_3, \ldots, e_n$, while $e_{i_1} e_{i_2}$ stands for the two-element sets $e_1 e_2, e_1 e_3, \ldots, e_1 e_n, e_2 e_3, \ldots, e_2 e_n, \ldots$ etc.

Hints to Exercise 4.1

1. Let $\otimes^r V$ and $\otimes^s V$ denote the vector spaces of tensors of rank $r$ and $s$ respectively. The binary operation on $T(V)$ which identifies the tensor product between elements of $\otimes^r V$ and $\otimes^s V$ with an element of $\otimes^{r+s} V$,

$$(x_1 \otimes x_2 \otimes \cdots \otimes x_r) \otimes (y_1 \otimes y_2 \otimes \cdots \otimes y_s) = (x_1 \otimes x_2 \otimes \cdots \otimes x_r \otimes y_1 \otimes y_2 \otimes \cdots \otimes y_s)$$

makes $T(V)$ into an algebra, as Property (4.1.1) of Def. (4.1.1) is obvious with respect to scalar multiplication. The associativity is easy to check. Since the tensor product of any number of elements can be taken as many times as one likes, the algebra is infinite-dimensional.

2. The vector space $A = \{x \otimes x\}$ generated by $x \in V$ is a two-sided ideal of $T(V)$, i.e., $A\,T(V) \subset A$ as well as $T(V)\,A \subset A$ (compare this property of a two-sided ideal with that of a normal subgroup). The multiplication operation induced by $\otimes$ in $E(V)$ is denoted $\wedge$, and two cosets $t_1 + A$ and $t_2 + A$ for $t_1, t_2 \in T(V)$ satisfy:
(i) $(t_1 + A) \wedge (t_2 + A) = [(t_1 \otimes t_2) - (t_2 \otimes t_1)] + A$.
The vector space $V$ is contained in $E(V)$, since $x \in V$ can be identified with the coset $x + A \in E(V)$. $V$ can then be used to give the exterior powers of $V$, e.g.,
(ii) $\Lambda^r(V) = V \wedge V \wedge \cdots \wedge V$ ($r$ copies).
Each of these is a subspace of $E(V)$, and if $V$ is $n$-dimensional with basis $\{e_i\}$ ($i = 1, \ldots, n$), then $\Lambda^r(V)$ is $\binom{n}{r} = \dfrac{n!}{r!\,(n-r)!}$ dimensional ($0 \leq r \leq n$) with basis vectors $e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_r}$ ($i_1 < i_2 < \cdots < i_r$). Thus
(iii) $E(V) = \bigoplus_{r=0}^{n} \Lambda^r(V)$
has finite dimension $2^n$.

3. The set $V^{n}_{(r)}$ ($r = 0, 1, 2, \ldots, n$) whose vectors satisfy (iv) can easily be seen to satisfy the property (4.1.1) required for an algebra. It is associative and distributive with respect to addition, since (denoting the operation by '$\circ$') we have for distributivity:
(i) $2\langle u + \alpha u', v + \lambda v' \rangle = (u + \alpha u') \circ (v + \lambda v') = (u + \alpha u')(v + \lambda v') + (v + \lambda v')(u + \alpha u')$
$= (uv + vu) + \alpha (u'v + vu') + \lambda (uv' + v'u) + \alpha\lambda (u'v' + v'u') = 2\langle u, v \rangle + 2\alpha \langle u', v \rangle + 2\lambda \langle u, v' \rangle + 2\alpha\lambda \langle u', v' \rangle$.
The associativity for addition can also be checked in a similar manner. The properties (i), (ii) and (iii) of the exercise, in view of (iv), lead to:
(ii) $e_i e_j + e_j e_i = \pm 2\delta_{ij}$
(iii) $\langle e_i, e_i \rangle = (e_i)^2 = \pm 1$
(iv) $e_i e_j = -e_j e_i$, $i \neq j$.
Since all combinations of products are to be considered, the basis vectors for these products are naturally formed by $1, e_i, e_{i_1}e_{i_2}$ for $V^{2}_{(r)}$ and by $1, e_i, e_{i_1}e_{i_2}, e_{i_1}e_{i_2}e_{i_3}$ for $V^{3}_{(r)}$ (see Ftn. 4). When $n = 1$, the basis of $V^{1}_{(0)}$ consists of only one vector, say $e$; then in view of (iii), $e^2 = -1$. Hence an arbitrary element of $C(V^{1}_{(0)})$ is of the form $\alpha + \beta e$, which can be identified with $\alpha + i\beta$, where $\alpha, \beta \in \mathbb{R}$. Thus $C(V^{1}_{(0)})$ is $\mathbb{C}$.

4. A basis for $V^{2}_{(0)}$ is $(e_1, e_2)$ with $\langle e_i, e_j \rangle = -\delta_{ij}$, and the elements of $C(V^{2}_{(0)})$ are of the form $a + b e_1 + c e_2 + d e_1 e_2$ where $a, b, c, d \in \mathbb{R}$. Write $i = e_1$, $j = e_2$, $k = e_1 e_2$; then, using the fact that $r = 0$ implies $e_1^2 = e_2^2 = (e_1 e_2)^2 = -1$, we can write:
(i) $ij = -ji = k$, $\quad jk = -kj = i$, $\quad ki = -ik = j$ and $i^2 = j^2 = k^2 = -1$.
But the equalities set out in (i) are the defining conditions of the algebra of quaternions with the basis $(i, j, k)$, hence the result.

5. A basis for the even subalgebra of $C(V^{4}_{(1)})$ is given by:

$$(1,\ e_1e_2,\ e_1e_3,\ e_1e_4,\ e_2e_3,\ e_2e_4,\ e_3e_4,\ e_1e_2e_3e_4).$$

Let a basis for $V^{3}_{(0)}$ be $(\bar{e}_1, \bar{e}_2, \bar{e}_3)$ which satisfies (i), (ii), (iii) of Exercise 3. The elements of $C(V^{3}_{(0)})$ will then be

$$(1,\ \bar{e}_1,\ \bar{e}_2,\ \bar{e}_3,\ \bar{e}_1\bar{e}_2,\ \bar{e}_1\bar{e}_3,\ \bar{e}_2\bar{e}_3,\ \bar{e}_1\bar{e}_2\bar{e}_3).$$

Use the identification mapping

$$\bar{e}_1 \mapsto e_1e_2, \qquad \bar{e}_2 \mapsto e_1e_3, \qquad \bar{e}_3 \mapsto e_1e_4$$

and note that:

$$\bar{e}_1\bar{e}_2 = e_1e_2\,e_1e_3 = e_1(-e_1e_2)e_3 = -e_2e_3 = (-1)e_2e_3$$
$$\bar{e}_1\bar{e}_3 = e_1e_2\,e_1e_4 = -e_2e_4 = (-1)e_2e_4$$
$$\bar{e}_2\bar{e}_3 = e_1e_3\,e_1e_4 = -e_3e_4 = (-1)e_3e_4$$

which shows that the elements of $C(V^{3}_{(0)})$ (the Pauli algebra) are in one-one correspondence with the elements of $C_+$ (the even subalgebra) $\subset C(V^{4}_{(1)})$ (the Dirac algebra). In other words, there is an isomorphism between the Pauli algebra and the even part of the Dirac algebra.
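The relations used in Hints 3 and 4 can be checked with a small model of the Clifford product. The sketch below is only an illustration under stated assumptions: basis products $e_{i_1} \cdots e_{i_k}$ ($i_1 < \cdots < i_k$) are encoded as bitmasks, a general element is a dictionary from bitmask to coefficient, and the signature follows the convention of Exercise 3 (the first $r$ squares are $+1$, the rest $-1$). The function names are hypothetical.

```python
def signature(n, r):
    """Squares of e_1..e_n in V^n_(r): +1 for the first r vectors, -1 for the rest."""
    return [1] * r + [-1] * (n - r)

def reorder_sign(a, b):
    """Sign from moving the factors of blade b past those of blade a (e_i e_j = -e_j e_i)."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1

def mul(x, y, metric):
    """Clifford product of two elements stored as {bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            sign = reorder_sign(a, b)
            common, i = a & b, 0
            while common:                 # each repeated e_i contributes its square
                if common & 1:
                    sign *= metric[i]
                common >>= 1
                i += 1
            out[a ^ b] = out.get(a ^ b, 0) + sign * ca * cb
    return {m: c for m, c in out.items() if c != 0}

# n = 2, r = 0: the quaternion relations of Hint 4.
q = signature(2, 0)
i_, j_ = {0b01: 1}, {0b10: 1}
k_ = mul(i_, j_, q)
minus_one = {0: -1}
squares_ok = all(mul(u, u, q) == minus_one for u in (i_, j_, k_))
cyclic_ok = mul(j_, k_, q) == i_ and mul(k_, i_, q) == j_
anticommute_ok = mul(j_, i_, q) == {0b11: -1}

# n = 1, r = 0: e^2 = -1, so alpha + beta e multiplies like a complex number.
c = signature(1, 0)
z = mul({0: 2, 1: 3}, {0: 1, 1: -1}, c)   # (2 + 3e)(1 - e)

# The number of basis products for n generators is 2^n.
n = 4
basis_count = len(range(1 << n))

print(squares_ok, cyclic_ok, anticommute_ok, z, basis_count)
```

With $e$ playing the role of $i$, the product $(2 + 3e)(1 - e)$ evaluates to $5 + e$, as for the complex numbers $(2 + 3i)(1 - i)$.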


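6. The complex extension of a real Lie algebra $\mathcal{L}$ is denoted $\mathbb{C} \otimes \mathcal{L}$ and is called the complexification of $\mathcal{L}$. To show that it has the same structure constants, we assume that $\mathcal{L}$ is finite-dimensional and thus has a basis $\{e_i\}$ ($i = 1, 2, \ldots, n$). The Lie product

$$[e_i, e_j] = C^{k}_{ij}\, e_k$$

gives the structure constants $\{C^{k}_{ij}\}$. The elements of $\mathbb{C} \otimes \mathcal{L}$, when it is regarded as a complex vector space, are defined by:
(i) $\lambda(\mu \otimes x) = (\lambda\mu) \otimes x$
where $\lambda, \mu \in \mathbb{C}$ and $x \in \mathcal{L}$. The Lie bracket is likewise defined as:
(ii) $[\lambda \otimes x, \mu \otimes y] = (\lambda\mu) \otimes [x, y]$.
In particular, writing the basis vectors $e_i, e_j$ in (ii) we have:
(iii) $[\lambda \otimes e_i, \mu \otimes e_j] = (\lambda\mu) \otimes C^{k}_{ij}\, e_k$
which shows that the structure constants for $\mathbb{C} \otimes \mathcal{L}$ are the same as those of $\mathcal{L}$.

7. Note that (e) is trivial and that (d) follows once (c) is established. To establish (c) we use (a) and write, with $\partial_n \equiv \partial/\partial x^n$:

(i) $[M_{mn}, M_{pq}] = (x_m\partial_n - x_n\partial_m)(x_p\partial_q - x_q\partial_p) - (x_p\partial_q - x_q\partial_p)(x_m\partial_n - x_n\partial_m)$
$\quad = (x_m\partial_n\, x_p\partial_q - x_q\partial_p\, x_n\partial_m) - (x_n\partial_m\, x_p\partial_q - x_q\partial_p\, x_m\partial_n) - (x_m\partial_n\, x_q\partial_p - x_p\partial_q\, x_n\partial_m) + (x_n\partial_m\, x_q\partial_p - x_p\partial_q\, x_m\partial_n)$

Since the metric is Minkowskian we have:

$$\frac{\partial x_p}{\partial x^n} = g_{np} = g_{pn} = \frac{\partial x_n}{\partial x^p}$$

Hence, up to second-derivative terms (which cancel among the four parentheses), the expression in the first parenthesis of (i) simplifies to

(ii) $g_{np}\,(x_m\partial_q - x_q\partial_m) = g_{np}\, M_{mq}$.

Using similar simplifications as in (ii), we obtain (c). To establish (f) we consider

(iv) $M_{AB} = x_A \dfrac{\partial}{\partial x^B} - x_B \dfrac{\partial}{\partial x^A}$

where $x_A, x_B$ are coordinates in a 5-dimensional space, and repeat the steps taken in (c). Finally, to show (g), we note that

(v) $P_\mu \equiv M_{\mu 5} = x_\mu \dfrac{\partial}{\partial x^5} - x_5 \dfrac{\partial}{\partial x^\mu}$

accordingly, writing $\partial_5 \equiv \partial/\partial x^5$ and dropping the second-derivative terms, which cancel in pairs,

$$[P_\mu, P_\nu] = -x_\mu(\partial_5 x_5)\,\partial_\nu + x_\nu(\partial_5 x_5)\,\partial_\mu + \{-x_5(\partial_\mu x_\nu)\,\partial_5 + x_5(\partial_\nu x_\mu)\,\partial_5\} + \big(x_\mu(\partial_5 x_\nu)\,\partial_5 + x_5(\partial_\mu x_5)\,\partial_\nu - x_\nu(\partial_5 x_\mu)\,\partial_5 - x_5(\partial_\nu x_5)\,\partial_\mu\big)$$

The last four terms in the above expression, given in ( ), are zero as $x_\nu$ and $x_5$ are independent coordinates, while the two terms in { } cancel each other; hence we are left with the first two terms, which (since $\partial_5 x_5 = g_{55} = 1$) equal $M_{\nu\mu}$.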
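Relations (f) and (g) can also be confirmed numerically without manipulating differential operators. The sketch below rests on an assumption that should be kept in mind: each vector field $x_m\partial_n - x_n\partial_m$ is replaced by its coefficient matrix $(M_{mn})_a{}^b = g_{am}\delta^b_n - g_{an}\delta^b_m$, and brackets are taken as matrix commutators. The helper names (`metric`, `M`, `combo`) are illustrative.

```python
from itertools import product

N = 5
g = [-1, 1, 1, 1, 1]  # diagonal entries of the metric g_AB = (- + + + +)

def metric(a, b):
    return g[a] if a == b else 0

def M(m, n):
    """Coefficient matrix (M_mn)_a^b = g_am delta^b_n - g_an delta^b_m."""
    return [[metric(a, m) * (b == n) - metric(a, n) * (b == m) for b in range(N)]
            for a in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def combo(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    out = [[0] * N for _ in range(N)]
    for coeff, mat in terms:
        for row in range(N):
            for col in range(N):
                out[row][col] += coeff * mat[row][col]
    return out

# (f): [M_AB, M_CD] = g_BC M_AD - g_AC M_BD - g_BD M_AC + g_AD M_BC
relation_f = all(
    comm(M(A, B), M(C, D)) == combo([
        (metric(B, C), M(A, D)), (-metric(A, C), M(B, D)),
        (-metric(B, D), M(A, C)), (metric(A, D), M(B, C)),
    ])
    for A, B, C, D in product(range(N), repeat=4)
)

# (g): with P_mu = M_{mu 5} (index 4 when counting from zero), [P_mu, P_nu] = -M_{mu nu}
P = [M(mu, 4) for mu in range(4)]
relation_g = all(
    comm(P[mu], P[nu]) == combo([(-1, M(mu, nu))])
    for mu, nu in product(range(4), repeat=2)
)

print(relation_f, relation_g)
```

Restricting the indices in `relation_f` to the first four values gives the same check for relation (c) with the Minkowskian metric.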

2 SOLVABLE AND SEMI-SIMPLE LIE ALGEBRAS

In order to study the above classes of algebras, we have to define two basic objects that are required there: the Lie subalgebra and the ideal.

2.1 Lie Subalgebras, Ideals and Lattices

Definition 4.2.1: Let $A$ and $B$ be two subspaces of a given Lie algebra $\mathcal{L}$, and let $[A, B]$ denote the subspace spanned by all vectors of the type $[a, b]$ (formed by the Lie product) for $a \in A$ and $b \in B$. A subspace $\mathcal{S}$ which is closed under Lie multiplication, i.e., $[\mathcal{S}, \mathcal{S}] \subset \mathcal{S}$, is called a subalgebra of $\mathcal{L}$.

Definition 4.2.2: Given a subalgebra $\mathcal{S}$ of $\mathcal{L}$, if the Lie product $[l, s]$ for every $l \in \mathcal{L}$ and $s \in \mathcal{S}$ is a member of $\mathcal{S}$, then $\mathcal{S}$ is called an ideal of $\mathcal{L}$ and is denoted $I$. Thus an ideal of $\mathcal{L}$ is characterized by the relation

$$[\mathcal{L}, I] \subset I \tag{4.2.1}$$

It is easy to note that the concepts of subalgebra and ideal play the same role in Lie algebra theory as subgroups and normal subgroups play in Lie group theory. Accordingly, if one were to talk about a homomorphism $h$ between two Lie algebras $\mathcal{L}_1$ and $\mathcal{L}_2$, then it can be checked that the kernel of $h$ is an ideal of $\mathcal{L}_1$ and the image under $h$ is a subalgebra of $\mathcal{L}_2$. The sum and intersection of ideals are again ideals, and the ideals of a Lie algebra form a lattice under these two operations. To avoid any confusion with the word lattice introduced in the previous section, we wish to note here that this usage refers to the following definition:

Definition 4.2.3: Given a vector space $V$, the set $S$ of all subspaces of $V$ equipped with the operations of intersection and sum (the subspace spanned by the union) is called a lattice of $V$.

2.2 Semi-simple and Simple Lie Algebras and their Levi Decomposition

Given a Lie algebra $\mathcal{L}$, consider the Lie product $[\mathcal{L}, \mathcal{L}] = \mathcal{L}'$.⁵ Evidently $\mathcal{L} \supset \mathcal{L}'$ and $\mathcal{L}'$ is an ideal of $\mathcal{L}$. Thus with the Lie algebra $\mathcal{L}$ a series of ideals can be associated by repeating the process of taking the Lie product:

$$[\mathcal{L}', \mathcal{L}'] = \mathcal{L}'', \quad \ldots, \quad [\mathcal{L}^{(r)}, \mathcal{L}^{(r)}] = \mathcal{L}^{(r+1)}, \qquad \cdots \subset \mathcal{L}'' \subset \mathcal{L}' \subset \mathcal{L}$$

where the positive integer $r$ counts the number of times the Lie product has been taken. The ideals obtained in this manner from $\mathcal{L}$ are said to form a derived series of ideals of $\mathcal{L}$. More specifically, these ideals form a decreasing derived sequence.

Definition 4.2.4: A Lie algebra $\mathcal{L}$ is called a solvable Lie algebra if the series

$$\mathcal{L} \supset \mathcal{L}' \supset \mathcal{L}'' \supset \cdots \supset \mathcal{L}^{(r)} \supset \cdots$$

is eventually zero.

Every abelian Lie algebra (i.e., an algebra $\mathcal{L}$ for which $[\mathcal{L}, \mathcal{L}] = 0$) is trivially solvable, since $[\mathcal{L}, \mathcal{L}] = \mathcal{L}'$ is zero. The simplest example of a solvable Lie algebra which is non-abelian is offered by the two-dimensional affine Lie algebra. In this case $\mathcal{L}$ is spanned by $\{e_1, e_2\}$, and $[\mathcal{L}, \mathcal{L}] = \mathcal{L}'$ is given by the Lie product $[e_1, e_2] = e_1$; hence $\mathcal{L}''$ is zero. An ideal of a given Lie algebra may or may not be solvable. All ideals that are solvable form a sublattice (under the operations of intersection and sum) of the lattice associated to $\mathcal{L}$. Evidently the intersection as well as the sum of two solvable ideals is a solvable ideal.
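The derived series can be computed directly from the structure constants. The following sketch assumes a finite-dimensional algebra over the rationals given by the brackets of its basis vectors; it tracks the dimensions of $\mathcal{L}, \mathcal{L}', \mathcal{L}'', \ldots$ and stops when the dimension reaches zero or stops decreasing. The function names (`structure`, `bracket`, `row_basis`, `derived_series_dims`) are hypothetical.

```python
from fractions import Fraction

def structure(n, rules):
    """Structure constants C[i][j] = coordinates of [e_i, e_j], completed by antisymmetry."""
    C = [[[0] * n for _ in range(n)] for _ in range(n)]
    for (i, j), vec in rules.items():
        C[i][j] = list(vec)
        C[j][i] = [-x for x in vec]
    return C

def bracket(u, v, C, n):
    """Bracket of two coordinate vectors, extended bilinearly from the basis."""
    out = [Fraction(0)] * n
    for i in range(n):
        for j in range(n):
            if u[i] and v[j]:
                for k in range(n):
                    out[k] += u[i] * v[j] * C[i][j][k]
    return out

def row_basis(vectors):
    """Linearly independent vectors spanning the same space (incremental elimination)."""
    basis = []
    for v in vectors:
        r = [Fraction(x) for x in v]
        for b in basis:
            p = next(i for i, x in enumerate(b) if x != 0)  # pivot position of b
            if r[p] != 0:
                f = r[p] / b[p]
                r = [x - f * y for x, y in zip(r, b)]
        if any(r):
            basis.append(r)
    return basis

def derived_series_dims(C, n):
    """Dimensions of L, L', L'', ... until the series reaches 0 or stabilizes."""
    basis = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    dims = [n]
    while basis:
        new = row_basis([bracket(u, v, C, n) for u in basis for v in basis])
        dims.append(len(new))
        if len(new) == len(basis):
            break
        basis = new
    return dims

affine = structure(2, {(0, 1): (1, 0)})                       # [e1, e2] = e1
so3 = structure(3, {(0, 1): (0, 0, 1), (1, 2): (1, 0, 0), (2, 0): (0, 1, 0)})

affine_dims = derived_series_dims(affine, 2)
so3_dims = derived_series_dims(so3, 3)
print(affine_dims, so3_dims)
```

For the two-dimensional affine algebra the dimensions drop to zero, as stated above, whereas for the cross-product algebra the derived algebra equals the whole algebra and the series never reaches zero.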

⁵ In Def. (4.2.18) we denote it as $\mathcal{D}\mathcal{L}$ and thus indicate the difference between the two definitions.

Definition 4.2.5: The sum of all solvable ideals of a Lie algebra $\mathcal{L}$ is its unique maximal solvable ideal. This is called the radical of the Lie algebra $\mathcal{L}$ and is denoted $\mathcal{R}$. The supplement of the radical in $\mathcal{L}$ is called the Levi subalgebra of $\mathcal{L}$. Apparently the radical of a Lie algebra is 0 if it does not have any solvable ideals other than zero.

Definition 4.2.6: A Lie algebra is called a semi-simple Lie algebra if it has no non-trivial abelian ideals (note that every Lie algebra contains the trivial abelian ideal 0).

In view of the definitions of solvable ideals and the radical, one concludes that a Lie algebra is semi-simple if and only if its radical is zero. A Lie algebra $\mathcal{L}$ that is not semi-simple can however be expressed as a direct sum (of vector spaces) of two subalgebras, one of which is its radical $\mathcal{R}$ and the other a semi-simple subalgebra $\mathcal{S}$:

$$\mathcal{L} = \mathcal{R} \oplus \mathcal{S} \tag{4.2.2}$$

This is called the Levi decomposition of $\mathcal{L}$. According to this decomposition every element of $\mathcal{L}$ can be written uniquely as a sum of an element in $\mathcal{R}$ and an element in $\mathcal{S}$.

Definition 4.2.7: A Lie algebra is called a simple Lie algebra if it is non-abelian and has no proper non-zero ideals (note that the whole algebra $\mathcal{L}$ can be viewed as an ideal, but this is not a proper ideal). Every simple Lie algebra can be thought of as a semi-simple Lie algebra.

Example 4.2.8: The real Lie algebra $so(3, \mathbb{R}) \cong \mathcal{L}$ formed by a 3-dimensional real vector space $V_3$ equipped with the familiar cross product is a simple Lie algebra. To see this, consider a vector subspace $\mathcal{S}$ of $V_3$, which can be a line or a plane (passing through the origin); let $x$ be a vector in $\mathcal{S}$ and $y$ a vector in $\mathcal{L}$. If $\mathcal{S}$ is to be an ideal then $x \times y$ must be in $\mathcal{S}$. But the cross product of two vectors is always perpendicular to the plane formed by them: if $\mathcal{S}$ is a line, any $y$ not parallel to $x$ gives a non-zero $x \times y$ perpendicular to $\mathcal{S}$, and if $\mathcal{S}$ is a plane, two independent vectors $x, y \in \mathcal{S}$ give $x \times y$ normal to $\mathcal{S}$. In either case the product cannot be in $\mathcal{S}$, thus $\mathcal{S}$ is not an ideal of $so(3, \mathbb{R})$. Also, for $x, y \in so(3, \mathbb{R})$, $x \times y = -y \times x$, which shows that it is non-abelian and hence it is simple.

Remark 4.2.9: Furthermore, every semi-simple Lie algebra $\mathcal{S}$ can be written as a direct sum of simple ideals:

$$\mathcal{S} = I_1 \oplus I_2 \oplus \cdots \oplus I_k \tag{4.2.3}$$

Thus a Lie algebra $\mathcal{L}$ can be expressed as a direct sum of its radical and simple ideals:

$$\mathcal{L} = \mathcal{R} \oplus I_1 \oplus I_2 \oplus \cdots \oplus I_k \tag{4.2.4}$$

Before we close this section, we give a few more elementary definitions and the results that we shall use.

2.3 Lie Algebra of Derivations, Adjoint Mapping and Centralizer

Definition 4.2.10: Let $\mathcal{A}$ be an algebra over a field $\mathcal{F}$. A mapping $D : \mathcal{A} \to \mathcal{A}$ is called a derivation of $\mathcal{A}$ if it is $\mathcal{F}$-linear and satisfies the Leibniz rule:

$$D(ab) = (Da)b + a(Db) \quad \text{for } a, b \in \mathcal{A} \tag{4.2.6}$$

The kernel of a derivation $D$ is a subalgebra of $\mathcal{A}$. If $D_1$ and $D_2$ are two derivations of $\mathcal{A}$, then it can be easily checked that $D_1D_2 - D_2D_1$ is a derivation. Obviously the above definition holds (in particular) for a Lie algebra $\mathcal{L}$, and implies that the set of all derivations of $\mathcal{L}$ is a Lie algebra, denoted $\mathcal{D}\mathcal{L}$.


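Definition 4.2.11: Let $x$ be a fixed element of $\mathcal{L}$. Then the linear mapping $y \to [x, y]$ of $\mathcal{L}$ into $\mathcal{L}$ is called the adjoint linear mapping on $\mathcal{L}$, and is denoted $\mathrm{ad}_{\mathcal{L}}\,x$ or simply $\mathrm{ad}\,x$. Based on this we have:

Result 4.2.12: Let $\mathcal{L}$ denote a Lie algebra and $\mathcal{D}\mathcal{L}$ the Lie algebra formed by the derivations of $\mathcal{L}$. For every $x \in \mathcal{L}$, $\mathrm{ad}\,x$ is a derivation. The mapping $x \to \mathrm{ad}\,x$ is a homomorphism of the Lie algebra $\mathcal{L}$ into $\mathcal{D}\mathcal{L}$. Moreover, if $D \in \mathcal{D}\mathcal{L}$ and $x \in \mathcal{L}$, then

$$[D, \mathrm{ad}\,x] = \mathrm{ad}(Dx) \tag{4.2.7}$$

A mapping $\mathrm{ad}\,x$ is also called an inner derivation of $\mathcal{L}$ (see Sec. 3 and the Hint to Exercise 1 for the proof of the above result).

Definition 4.2.13: Let $\mathcal{L}$ be a Lie algebra and $X$ a subset of $\mathcal{L}$. The centralizer of $X$ in $\mathcal{L}$ is the set of all those elements of $\mathcal{L}$ which commute with the elements of $X$. The centralizer of $X$ is the intersection of all the kernels of $\mathrm{ad}\,x$ as $x$ runs through $X$. We denote it $X^c$. It can be checked that $X^c$ is a subalgebra of $\mathcal{L}$.

Result 4.2.14: Let $\mathcal{L}$ be a Lie algebra and $I$ an ideal of $\mathcal{L}$. The centralizer $I^c$ of $I$ in $\mathcal{L}$ is an ideal (see Exercise 2). The centralizer of $\mathcal{L}$ in $\mathcal{L}$ is called the centre of $\mathcal{L}$. The centre of $\mathcal{L}$ is the kernel of the homomorphism $x \to \mathrm{ad}\,x$.

Definition 4.2.15: Let $\mathcal{L}_1$ and $\mathcal{L}_2$ be two Lie algebras over $\mathcal{F}$. An extension of $\mathcal{L}_2$ by $\mathcal{L}_1$ is a sequence:

$$\mathcal{L}_1 \xrightarrow{\ f_1\ } \mathcal{L} \xrightarrow{\ f_2\ } \mathcal{L}_2 \tag{4.2.8}$$

where $\mathcal{L}$ is a Lie algebra over $\mathcal{F}$, $f_2$ is a surjective homomorphism of $\mathcal{L}$ onto $\mathcal{L}_2$, and $f_1$ is an injective homomorphism of $\mathcal{L}_1$ onto the kernel of $f_2$. The kernel $\mathcal{K}$ of $f_2$ is called the kernel of the extension. Evidently the homomorphism $f_1$ is an isomorphism of $\mathcal{L}_1$ onto $\mathcal{K}$, and $f_2$ can be viewed as an isomorphism of the quotient $\mathcal{L}/\mathcal{K}$ onto $\mathcal{L}_2$. By an abuse of language $\mathcal{L}$ is called an extension of $\mathcal{L}_2$ by $\mathcal{L}_1$.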
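Returning to Result 4.2.12 and the description of the centre as the kernel of $x \to \mathrm{ad}\,x$, both statements can be illustrated with matrices built from structure constants. The sketch below assumes the convention that column $j$ of the matrix of $\mathrm{ad}\,e_i$ holds the coordinates of $[e_i, e_j]$; the two sample algebras (the cross-product algebra and the three-dimensional Heisenberg algebra with $[e_1, e_2] = e_3$) and all function names are illustrative choices. Only basis vectors are tested for membership in the centre.

```python
def structure(n, rules):
    """Structure constants C[i][j] = coordinates of [e_i, e_j], completed by antisymmetry."""
    C = [[[0] * n for _ in range(n)] for _ in range(n)]
    for (i, j), vec in rules.items():
        C[i][j] = list(vec)
        C[j][i] = [-x for x in vec]
    return C

def ad(i, C, n):
    """Matrix of ad(e_i): entry (k, j) is the e_k-coordinate of [e_i, e_j]."""
    return [[C[i][j][k] for j in range(n)] for k in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def ad_is_homomorphism(C, n):
    """Check ad([e_i, e_j]) = [ad e_i, ad e_j] for all pairs of basis vectors."""
    for i in range(n):
        for j in range(n):
            lhs = [[sum(C[i][j][k] * ad(k, C, n)[r][c] for k in range(n))
                    for c in range(n)] for r in range(n)]
            if lhs != comm(ad(i, C, n), ad(j, C, n)):
                return False
    return True

def central_basis_vectors(C, n):
    """Indices i for which ad(e_i) vanishes, i.e. basis vectors lying in the centre."""
    return [i for i in range(n) if all(x == 0 for row in ad(i, C, n) for x in row)]

so3 = structure(3, {(0, 1): (0, 0, 1), (1, 2): (1, 0, 0), (2, 0): (0, 1, 0)})
heisenberg = structure(3, {(0, 1): (0, 0, 1)})

print(ad_is_homomorphism(so3, 3), ad_is_homomorphism(heisenberg, 3))
print(central_basis_vectors(so3, 3), central_basis_vectors(heisenberg, 3))
```

In the Heisenberg example $\mathrm{ad}\,e_3$ is the zero matrix, so $e_3$ lies in the centre, while in the cross-product algebra no basis vector does.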

2.4 Modules, Lower and Upper Central Series

According to Bourbaki, a unitary module $M$ over a field $\mathcal{F}$ is a set equipped with a binary operation $(x, y) \to xy$ of $M \times M$ into $M$ that satisfies all the axioms of an algebra except associativity (see [2]). A subset of $M$ is a submodule if it is invariant under the binary operation. Using $M$, a Lie algebra $\mathcal{L}$ can be formed; conversely, $\mathcal{L}$ (or a subset of $\mathcal{L}$) with respect to the binary operation on $\mathcal{L}$ can be viewed as a module (or a submodule, provided the subset is closed with respect to the binary operation). In Sec. 3 we shall return to modules while studying the representation theory of Lie algebras.

Definition 4.2.16: An ideal of $\mathcal{L}$ is a submodule of $\mathcal{L}$ which is stable under the inner derivations of $\mathcal{L}$.

Definition 4.2.17: A submodule of $\mathcal{L}$ which is stable under every derivation of $\mathcal{L}$ is called a characteristic ideal of $\mathcal{L}$.

Definition 4.2.18: The characteristic ideal $[\mathcal{L}, \mathcal{L}]$ is called the derived ideal of a Lie algebra $\mathcal{L}$ and is denoted $\mathcal{D}\mathcal{L}$ (note that we have denoted this as $\mathcal{L}'$ also). Every submodule of $\mathcal{L}$ containing $\mathcal{D}\mathcal{L}$ is an ideal of $\mathcal{L}$. The derived series of $\mathcal{L}$ is the decreasing sequence $\mathcal{D}^0\mathcal{L}, \mathcal{D}^1\mathcal{L}, \ldots$ of characteristic ideals of $\mathcal{L}$, which is defined inductively as: $\mathcal{D}^0\mathcal{L} = \mathcal{L}$; $\mathcal{D}^{r+1}\mathcal{L} = [\mathcal{D}^r\mathcal{L}, \mathcal{D}^r\mathcal{L}]$ (denoted $\mathcal{L}^{(r+1)}$ earlier).

Definition 4.2.19: The decreasing sequence $\mathcal{C}^1\mathcal{L}, \mathcal{C}^2\mathcal{L}, \ldots$ of characteristic ideals of $\mathcal{L}$ defined inductively as $\mathcal{C}^1\mathcal{L} = \mathcal{L}$, $\mathcal{C}^{r+1}\mathcal{L} = [\mathcal{L}, \mathcal{C}^r\mathcal{L}]$ is called the lower central series of $\mathcal{L}$. From the above it follows that $\mathcal{D}\mathcal{L} = \mathcal{C}^2\mathcal{L}$ and in general $\mathcal{C}^{r+1}\mathcal{L} \supset \mathcal{D}^r\mathcal{L}$.


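Finally we have the upper central series of a Lie algebra $\mathcal{L}$, defined as follows:

Definition 4.2.20: The increasing sequence $\mathcal{C}_0\mathcal{L}, \mathcal{C}_1\mathcal{L}, \mathcal{C}_2\mathcal{L}, \ldots$ of characteristic ideals defined as $\mathcal{C}_0\mathcal{L} = \{0\}$, $\mathcal{C}_{p+1}\mathcal{L}$ = inverse image of the centre of $\mathcal{L}/\mathcal{C}_p\mathcal{L}$ under the canonical mapping of $\mathcal{L}$ onto $\mathcal{L}/\mathcal{C}_p\mathcal{L}$, is called the upper central series of $\mathcal{L}$. Note that the ideal $\mathcal{C}_1\mathcal{L}$ is the centre of $\mathcal{L}$.⁶ We shall revert to these series in the next section, where we use them to classify Lie algebras.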

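Exercise 4.2

1. Show that for any $x \in \mathcal{L}$, $\mathrm{ad}\,x$ is a derivation, and that $x \to \mathrm{ad}\,x$ is a homomorphism of $\mathcal{L}$ into $\mathcal{D}\mathcal{L}$. Show further that for $x \in \mathcal{L}$ and $D \in \mathcal{D}\mathcal{L}$ the Lie bracket $[D, \mathrm{ad}\,x] = \mathrm{ad}(Dx)$.

2. Show that the centralizer $I^c$ of an ideal $I$ in $\mathcal{L}$ is an ideal of $\mathcal{L}$.

3. Let $\mathcal{L}$ be a Lie algebra, $I$ an ideal (respectively a characteristic ideal) and $J$ a characteristic ideal of $I$. Then show that $J$ is an ideal (respectively a characteristic ideal) of $\mathcal{L}$.

4. Show that if $\mathcal{L}$ is an $n$-dimensional Lie algebra over a field $\mathcal{F}$ and the centre of $\mathcal{L}$ is of dimension $\geq n - 1$, then $\mathcal{L}$ is commutative.

5. Let $\mathcal{L}_1$ and $\mathcal{L}_2$ be two Lie algebras over the same field $\mathcal{F}$ and let $\phi$ be a homomorphism from $\mathcal{L}_1$ onto $\mathcal{L}_2$. Then show that $\phi(\mathcal{D}^p\mathcal{L}_1) = \mathcal{D}^p\mathcal{L}_2$.

7. If the Lie algebra $\mathcal{L}$ has centre $z \neq \{0\}$ and $\mathcal{L}$ is 3-dimensional and non-commutative, then prove that $\dim z = 1$ and $\dim \mathcal{D}\mathcal{L} = 1$. Show further that if $z \neq \mathcal{D}\mathcal{L}$, then $\mathcal{L}$ is the product of $z$ and a 2-dimensional non-commutative algebra.

8. Let $I$ be an ideal of the Lie algebra $\mathcal{L}$ such that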
Hints to Exercise 4.2
1. To show that for any $x \in L$, ad x is a derivation, we must show that for $y, z \in L$, ad x(yz) satisfies (4.2.6). Now the multiplication rule in L is $(yz) \to [y, z]$, accordingly we should show that

(i) $(\mathrm{ad}\,x)([y, z]) = [(\mathrm{ad}\,x)y, z] + [y, (\mathrm{ad}\,x)z]$.

When written out in full, this is the Jacobi identity: (ii) $[x, [y, z]] = [[x, y], z] + [y, [x, z]]$. Linearity of ad x also follows, since the Lie bracket is bilinear in both arguments, hence ad x is a derivation. Let $\phi$ denote the mapping $x \to \mathrm{ad}\,x$ from L into $\mathfrak{D}L$, i.e., $\phi(x) = \mathrm{ad}\,x$. To show that it is a homomorphism, we must establish that for $x, y \in L$,

(iii) $\phi[x, y] = \phi(x)\phi(y) - \phi(y)\phi(x)$.

Note that $\phi[x, y] = \mathrm{ad}[x, y]$, hence writing

(iv) $\mathrm{ad}[x, y] \cdot z = \mathrm{ad}\,x \cdot ((\mathrm{ad}\,y) \cdot z) - \mathrm{ad}\,y \cdot ((\mathrm{ad}\,x) \cdot z)$

6. Let $\phi : L \to L/C_pL$ be the canonical mapping, and let $z_p$ denote the centre of $L/C_pL$; then $C_{p+1}L = \phi^{-1}(z_p)$. Since $z_0$ is the centre of $L/C_0L = L$, we have $C_1L$ = centre of L. (See also App. 9 of Vol. 2 of [10].)


we obtain: (v) $[[x, y], z] = [x, [y, z]] - [y, [x, z]]$

which is again the Jacobi identity given in (ii). Replacing ad x and ad y by $\phi(x)$ and $\phi(y)$ in (iv), we have (iii). Finally, to show that $[D, \mathrm{ad}\,x] = \mathrm{ad}(Dx)$, we note that $[D, \mathrm{ad}\,x]$ is a derivation on L, therefore for $y \in L$ we have: (vi) $[D, \mathrm{ad}\,x] \cdot y = (D\,\mathrm{ad}\,x - \mathrm{ad}\,x\,D) \cdot y = D((\mathrm{ad}\,x) \cdot y) - \mathrm{ad}\,x \cdot (Dy) = D([x, y]) - [x, Dy]$. Since D is a derivation on L, the first term equals $[Dx, y] + [x, Dy]$, hence (vi) is simply $[Dx, y] = \mathrm{ad}(Dx) \cdot y$. Thus the bracket $[D, \mathrm{ad}\,x] = \mathrm{ad}(Dx)$. Note that if we were to write ad y in place of D, we would simply re-assert the homomorphism shown in (iii).
2. By definition the ideal I is stable under every inner derivation of L. We prove the result using this fact on stability. Let D be an inner derivation of L and let $x \in I^c$ and $y \in I$, then we have (Leibniz rule): $D([x, y]) = [Dx, y] + [x, Dy]$. From the above observation on stability $Dy \in I$, hence the term on the LHS and the second term on the RHS are zero since $x \in I^c$. This leads to: $[Dx, y] = 0$, which means that $Dx \in I^c$, or that $I^c$ is stable under D, thus it is an ideal of L.
3. In view of Def. (4.2.16) and (4.2.17) we have that: every inner derivation (respectively every derivation) of L leaves the ideal (respectively the characteristic ideal) I stable and induces on I a derivation. This induced derivation leaves the characteristic ideal J stable, meaning thereby that J is an ideal (respectively a characteristic ideal) of L.
5. Let M and M' be any two submodules of $L_1$, then $\phi([M, M']) = [\phi(M), \phi(M')]$. In particular $\phi([L_1, L_1]) = [\phi(L_1), \phi(L_1)]$, i.e., $\phi(\mathcal{D}L_1) = \mathcal{D}(\phi(L_1))$. Since $\phi(L_1) = L_2$, we have $\phi(\mathcal{D}L_1) = \mathcal{D}L_2$. By induction on p we have the required result: $\phi(\mathcal{D}^pL_1) = \mathcal{D}^pL_2$. In the case of the sequence $C^pL$, we note that $C^2L_1 = \mathcal{D}L_1$ and $C^pL_1 = [L_1, C^{p-1}L_1]$, hence using the above result $\phi(C^2L_1) = \phi(\mathcal{D}L_1) = \mathcal{D}L_2 = C^2L_2$ and $\phi(C^pL_1) = [\phi(L_1), \phi(C^{p-1}L_1)] = [L_2, C^{p-1}L_2] = C^pL_2$.
6. In view of Exercise 5 we have a (canonical) homomorphism from L onto L/I. To say that L/I is commutative amounts to saying that $[L/I, L/I] = \mathcal{D}(L/I) = \{0\}$. But $\mathcal{D}(L/I)$ is the canonical image of $\mathcal{D}L$ in L/I, hence the result.
7. The Lie algebra L is 3-dimensional and non-commutative. In view of Exercise 4, dim z can neither be 3 nor 2, since that would imply that L is commutative. Hence dim z must be 1. We denote the basis elements of L by $(e_1, e_2, e_3)$. Note that $[L, L]$ is then spanned by $([e_1, e_2], [e_1, e_3], [e_2, e_3])$; since one of the basis elements lies in z, say $e_3$, we have $\mathcal{D}L = [L, L] = ([e_1, e_2], 0, 0)$. This shows that $\mathcal{D}L$ is one-dimensional. Now z may or may not be equal to $\mathcal{D}L$. When $z \neq \mathcal{D}L$, we have to show that L is the product of z and a 2-dimensional non-commutative algebra. This is immediate from Exercise 5, since L/z can be commutative if and only if $z \supset \mathcal{D}L$, i.e., only if dim z $\geq$ dim
8.

$D([x, y]) \in [I, I] =$

3 REPRESENTATIONS OF LIE ALGEBRAS, MODULES OVER LIE ALGEBRAS

We are already familiar with the concept of representation from Chapter 2 where we studied it for groups. We also know that information on group representations (in particular for Lie groups) is very often acquired through representations of their Lie algebras. To define a representation of a Lie algebra L over a field, we need a vector space V over which we form a Lie algebra L(V) of linear operators, and a homomorphism $\phi$ from L to L(V) that satisfies the usual properties of a homomorphism between two compatible algebraic objects. Using these ingredients we have the definition as follows:

3.1 Representations of a Lie Algebra

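Definition 4.3.1: A representation of a Lie algebra L is the pair $(V, \phi)$ where $\phi$ is a homomorphism on L such that for $x \in L$, $\phi(x)$ is a linear operator on V which satisfies:

$\phi(c_1x_1 + c_2x_2) = c_1\phi(x_1) + c_2\phi(x_2)$ (4.3.1(a))

for $x_1, x_2$ in L and for $c_1, c_2 \in \mathcal{F}$, and

$\phi([x_1, x_2]) = \phi(x_1)\phi(x_2) - \phi(x_2)\phi(x_1)$ (4.3.1(b))

An element $v \in V$ under the linear operator $\phi(x)$ becomes $\phi(x) \cdot v$, so that

$\phi([x_1, x_2]) \cdot v = \phi(x_1)\phi(x_2) \cdot v - \phi(x_2)\phi(x_1) \cdot v$ (4.3.3)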

We also know that the operator ad x is similar to a first order differential operator since it satisfies the Leibniz rule (see (4.2.6)):

$(\mathrm{ad}\,x)[y, z] = [(\mathrm{ad}\,x)y, z] + [y, (\mathrm{ad}\,x)z]$ (4.3.4)

From the previous section it is easy to note that (4.3.4) can be viewed as another way of writing the Jacobi identity. The mapping ad which takes x to the operator ad x is a linear mapping from L into the space of linear operators on L. To see that it is a representation we have to show that ad preserves the Lie products, i.e.,

$\mathrm{ad}[x, y] = [\mathrm{ad}\,x, \mathrm{ad}\,y]$ (4.3.5)

Now

$[\mathrm{ad}\,x, \mathrm{ad}\,y]z = (\mathrm{ad}\,x)(\mathrm{ad}\,y)z - (\mathrm{ad}\,y)(\mathrm{ad}\,x)z = [x, [y, z]] - [y, [x, z]] = [x, [y, z]] + [y, [z, x]] = (\mathrm{ad}[x, y])z$ (4.3.6)

Thus ad is indeed a homomorphism and (L, ad) is a representation of the Lie algebra L known as the adjoint representation of L. The mapping ad helps us to define a symmetric bilinear form on L called the Killing form by the relation:

$(x, y) = \mathrm{Tr}_L(\mathrm{ad}\,x)(\mathrm{ad}\,y)$ (4.3.7)

Remark 4.3.2:

A Lie algebra L is semi-simple if and only if its Killing form is non-singular.

Remark 4.3.3:

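A Lie algebra L whose Killing form is zero is solvable. More precisely (Cartan's criterion, over a field of characteristic 0), L is solvable if and only if its Killing form vanishes on $L \times [L, L]$.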
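The Killing form (4.3.7) can be computed directly from a multiplication table. The following is a minimal sketch, not taken from the text: the helper names `structure` and `killing` and the choice of bases are illustrative assumptions. It builds the matrices of ad $e_i$ from structure constants and evaluates the traces for sl(2) with basis (h, e, f), for the two-dimensional non-abelian algebra $[e_1, e_2] = e_1$, and for the three-dimensional algebra $[e_1, e_2] = e_3$.

```python
def structure(table):
    """Antisymmetric completion of a table {(i, j): coefficients of [e_i, e_j]}."""
    c = {}
    for (i, j), vec in table.items():
        c[(i, j)] = list(vec)
        c[(j, i)] = [-v for v in vec]
    return c

def killing(n, table):
    """Matrix of (e_x, e_y) = Tr(ad e_x ad e_y) for an n-dimensional algebra."""
    c = structure(table)
    zero = [0] * n
    # ad[i][k][j] = coefficient of e_k in [e_i, e_j] (column j of ad e_i)
    ad = [[[c.get((i, j), zero)[k] for j in range(n)] for k in range(n)]
          for i in range(n)]
    return [[sum(ad[x][k][l] * ad[y][l][k] for k in range(n) for l in range(n))
             for y in range(n)] for x in range(n)]

# sl(2) with basis (h, e, f): [h, e] = 2e, [h, f] = -2f, [e, f] = h
SL2 = {(0, 1): [0, 2, 0], (0, 2): [0, 0, -2], (1, 2): [1, 0, 0]}
# two-dimensional non-abelian algebra: [e1, e2] = e1
AFF = {(0, 1): [1, 0]}
# three-dimensional nilpotent algebra: [e1, e2] = e3
HEIS = {(0, 1): [0, 0, 1]}

K_SL2 = killing(3, SL2)
K_AFF = killing(2, AFF)
K_HEIS = killing(3, HEIS)
```

In this sketch the form of sl(2) is non-singular, the form of the nilpotent algebra vanishes identically, and the form of the solvable two-dimensional algebra is non-zero only on the pair $(e_2, e_2)$, so it vanishes whenever one argument lies in $[L, L]$.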

3.2 Representations via Modules over a Lie Algebra
We next define another algebraic object, a module over a Lie algebra, in order to give another way of describing a representation without introducing a homomorphism explicitly. We would like to note that although the following definition is that of a module over a Lie algebra, it can likewise be defined over a group or a ring. Definition 4.3.4: A module over a Lie algebra L is a vector space M along with a bilinear mapping $L \times M \to M$ which carries $(x, m) \in L \times M$ to an element (vector) xm in M and satisfies:

$[x_1, x_2]m = x_1(x_2m) - x_2(x_1m)$ (4.3.8)

The product defined by the mapping is called the module product rule. In order to see how modules are used as representation spaces, we associate to every element x of L a linear operator f(x) on M which assigns to $m \in M$ the vector xm in M. The equality (4.3.8) therefore translates into:

$f[x_1, x_2] : m \to [x_1, x_2]m = x_1(x_2m) - x_2(x_1m)$ (4.3.9)

Comparing this with the definition of a representation, it follows that (M, f) is a representation of L. More succinctly (using 4.3.2), we can say that a representation of L on a module M is a linear mapping $\rho$ of L into the endomorphism module of M such that

$\rho([x_1, x_2]) \cdot m = \rho(x_1)\rho(x_2) \cdot m - \rho(x_2)\rho(x_1) \cdot m$ (4.3.10)

We often use this simpler definition of representation in preference to Def. (4.3.1). Remark 4.3.5: If one were to define a module over an associative algebra, all that is needed is the replacement of (4.3.8) by the equality:

$(x_1x_2)m = x_1(x_2m)$ (4.3.11)

It is quite usual to obtain the representation (M, f) by using matrices to define the linear operator f(x) for $x \in L$. We choose $e_1, \ldots, e_n$ as the basis vectors in M and write:

$x e_j = \sum_{i=1}^{n} f_{ij}(x)\, e_i$

Evidently $[f_{ij}(x)]$ is the matrix (linear operator) with respect to x. This representation is called a matrix representation.
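A small numerical check of the module rule (4.3.8) for a matrix representation may be helpful here. The sketch below assumes the standard 2 x 2 matrices H, E, F of sl(2) acting on column vectors; these matrices and the helper names are illustrative choices, not notation from the text.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def commutator(a, b):
    """Matrix commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(ab, ba)]

def act(a, v):
    """Action of the operator a on the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def scale(c, a):
    return [[c * x for x in row] for row in a]

H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]

# bracket relations of sl(2) realised by commutators of the matrices
relations_hold = (commutator(H, E) == scale(2, E)
                  and commutator(H, F) == scale(-2, F)
                  and commutator(E, F) == H)

# module rule (4.3.8) on a sample vector m: [E, F]m = E(Fm) - F(Em)
m = [3, 5]
lhs = act(commutator(E, F), m)
rhs = [p - q for p, q in zip(act(E, act(F, m)), act(F, act(E, m)))]
```

Because the operators are matrices, the rule holds for every vector m; the sample vector only makes the identity concrete.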


3.3 Nilpotent Lie Algebras

We next recall the definition of another type of series of ideals than the derived series used for solvable algebras in the previous section. This other series will lead us to nilpotent and Cartan subalgebras needed for further structural theory. In the previous section we already defined this series of ideals (using different notation, see Def. (4.2.19)):

$L \supset L^2 = [L, L] \supset L^3 = [L^2, L] \supset \ldots \supset L^{i+1} = [L^i, L]$ (4.3.12)

and called it the lower central series of ideals. We now use it to define a nilpotent algebra. Definition 4.3.6: A Lie algebra L is called nilpotent if in the above series $L^{k+1}$ is zero for some integer k. The integer k is called the order of nilpotency of L. We list below a few facts about nilpotent Lie algebras. Fact 4.3.7: We note that $L^{(r)} \subset L^{2^r}$, where $L^{(r)}$ denotes the r-th derived algebra, hence every nilpotent Lie algebra is solvable, but a solvable algebra is not necessarily nilpotent. For example the two-dimensional non-abelian Lie algebra defined by $[e_1, e_2] = e_1$ is solvable (as we have already seen in the previous section) but not nilpotent, since $L^k$ is spanned by $e_1$ for all k > 1.

Fact 4.3.8: A finite-dimensional Lie algebra L is solvable if and only if its derived algebra [L, L] is nilpotent.

Fact 4.3.9: A Lie algebra L is nilpotent if and only if the linear operator ad x for every x in L is a nilpotent operator, i.e., some power of ad x is zero for every $x \in L$.

Fact 4.3.10: A finite-dimensional Lie algebra L is nilpotent if and only if all its maximal subalgebras are ideals.

Fact 4.3.11:

The centre of a non-zero nilpotent Lie algebra is non-zero.

Fact 4.3.12:

The Killing form of a nilpotent Lie algebra is zero.

Fact 4.3.13: The subalgebras, the quotient algebras and the central extensions of nilpotent Lie algebras are nilpotent. A finite product of nilpotent Lie algebras is a nilpotent Lie algebra. Since every nilpotent Lie algebra is solvable, the result given in the above fact holds good for solvable Lie algebras as well. In addition to the above facts, we wish to note that each of the following statements is equivalent to L being a nilpotent Lie algebra:
(a) $C^rL = \{0\}$ for sufficiently large r;
(b) $C_rL = L$ for sufficiently large r;
(c) There exists an integer r such that for all elements $x_1, x_2, \ldots, x_r$ in L,

$\mathrm{ad}\,x_1 \circ \mathrm{ad}\,x_2 \circ \cdots \circ \mathrm{ad}\,x_r = 0$ (4.3.13)

(d) There exists a decreasing sequence of ideals $(L_i)_{0 \le i \le n}$ of L with $L_0 = L$, $L_n = \{0\}$ such that $[L, L_i] \subset L_{i+1}$ and dim $L_i/L_{i+1} = 1$ for $0 \le i < n$.
Given below is an important result known as Engel's theorem.7 Result 4.3.14: Let V be a vector space over a field $\mathcal{F}$ and let L be a finite-dimensional subalgebra of gl(V) whose elements are nilpotent endomorphisms of V. If $V \neq \{0\}$, there exists $v \neq 0$ in V such that $x \cdot v = 0$ for all $x \in L$. (See Sec. 3 of Chapter 1 in [4].)

7. The result given in Fact (4.3.9) is also cited as Engel's theorem in literature (see [5]).
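The lower central series (4.3.12) can be followed term by term on small examples. The sketch below is an illustration under stated assumptions: the function names are hypothetical, exact arithmetic uses `fractions.Fraction`, and the three multiplication tables are the two-dimensional algebra $[e_1, e_2] = e_1$ of Fact 4.3.7 and the algebras $L_3$ and $L_4$ of Exercise 4.3(10). It returns the dimensions of $L, L^2, L^3, \ldots$ until the series reaches zero or stops shrinking.

```python
from fractions import Fraction

def structure(table):
    """Antisymmetric completion of a table {(i, j): coefficients of [e_i, e_j]}."""
    c = {}
    for (i, j), vec in table.items():
        c[(i, j)] = list(vec)
        c[(j, i)] = [-v for v in vec]
    return c

def bracket(c, n, x, y):
    """Bracket of two coordinate vectors, extended bilinearly from the table."""
    out = [Fraction(0)] * n
    for i in range(n):
        for j in range(n):
            if x[i] and y[j] and (i, j) in c:
                for k in range(n):
                    out[k] += x[i] * y[j] * c[(i, j)][k]
    return out

def span_basis(vectors):
    """Basis of the span of `vectors`, built by successive elimination."""
    basis = []
    for v in vectors:
        v = [Fraction(x) for x in v]
        for b in basis:
            p = next(i for i, x in enumerate(b) if x != 0)  # pivot of b
            if v[p] != 0:
                f = v[p] / b[p]
                v = [x - f * y for x, y in zip(v, b)]
        if any(v):
            basis.append(v)
    return basis

def lower_central_dims(n, table):
    """Dimensions of L, L^2, L^3, ... until zero or stabilisation."""
    c = structure(table)
    whole = [[int(i == j) for j in range(n)] for i in range(n)]
    term, dims = whole, [n]
    while True:
        nxt = span_basis([bracket(c, n, x, y) for x in whole for y in term])
        dims.append(len(nxt))
        if len(nxt) in (0, len(term)):
            return dims
        term = nxt

AFF = {(0, 1): [1, 0]}                          # [e1, e2] = e1
L3 = {(0, 1): [0, 0, 1]}                        # [e1, e2] = e3
L4 = {(0, 1): [0, 0, 1, 0], (0, 2): [0, 0, 0, 1]}  # [e1, e2] = e3, [e1, e3] = e4

dims_aff = lower_central_dims(2, AFF)
dims_l3 = lower_central_dims(3, L3)
dims_l4 = lower_central_dims(4, L4)
```

A series ending in 0 signals nilpotency; for the two-dimensional algebra the dimensions stay at 1, in line with the remark that $L^k$ is spanned by $e_1$ for all k > 1.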


Definition 4.3.15: Let H be a nilpotent subalgebra of L. The subalgebra of L each of whose elements y satisfies

$(\mathrm{ad}\,h)^n y = 0$ (4.3.14)

for every element h of H for some integer n, is called the Fitting null component of L with respect to H. This subalgebra is denoted $L^0_H$. From (4.3.14) it is evident that $H \subset L^0_H$. Definition 4.3.16: A nilpotent subalgebra H is called a Cartan subalgebra of L if $H = L^0_H$. Thus if H is a Cartan subalgebra and $x \in L$ satisfies, for every $h \in H$,

$[h, [h, \ldots [h, x] \ldots]] = 0$ (4.3.15)

it follows that x also lies in H. Every finite-dimensional Lie algebra L contains at least one Cartan subalgebra. Even if L contains more than one Cartan subalgebra, they all have one thing in common: their equal dimensionality. This dimension, denoted l, is called the rank of the Lie algebra L, and L is therefore often denoted as $L_l$. Remark 4.3.17: The Cartan subalgebras of a semi-simple Lie algebra are maximal abelian subalgebras; the converse, however, does not always hold, i.e., a maximal abelian subalgebra of a semi-simple Lie algebra is not necessarily a Cartan subalgebra. We next define two more important objects connected with modules and Cartan subalgebras, which help in the classification of Lie algebras (see Ref. [1], [4] and [9] for details).

3.4 Weight System and Roots of a Lie Algebra

Definition 4.3.18: Let $L^*$ denote the dual vector space of L (when L is considered as a vector space), and let M be a module over L. A linear form $\mu \in L^*$ is called a weight of M if there exists a non-zero vector m in M such that

$xm = \mu(x)m$ (4.3.16)

for $x \in L$. Note that (in view of the previous discussions) x here acts as an operator on M, and as such $\mu(x)$ can be treated as an eigenvalue corresponding to the vector m of M. Thus the weight $\mu$ can be thought of as the collection of eigenvalues. The set of all weights of a module is called its weight system. Remark 4.3.19: Any non-zero module over a solvable Lie algebra defined on a field of complex numbers always admits at least one weight (see Ref. [1], [5]). In order to consider direct sum decompositions of modules over nilpotent Lie algebras, we further define the so-called simultaneous eigenvector. Definition 4.3.20: A non-zero vector m in M is called a weight vector or a generalized simultaneous eigenvector if there exists an integer r such that

$(x - \mu(x)1)^r m = 0$ (4.3.17)

for all x in L. The set of all weight vectors corresponding to a weight $\mu$, together with the zero vector, forms a module, which is a submodule of M. It is denoted $M^\mu_L$ and is called the weight submodule for the weight $\mu$. Now in a weight submodule each element $x \in L$ is represented by an operator which is the sum of a nilpotent operator and a multiple of the unit operator. Hence it follows that for a module over a nilpotent Lie algebra there always exists a basis consisting entirely of weight vectors that expresses M as a direct sum:

$M = \bigoplus_\mu M^\mu_L$ (4.3.18(a))

In particular if L is a semi-simple complex Lie algebra and H is one of its Cartan subalgebras, then any module over L can be written as the direct sum of its weight submodules with respect to H, thus:

$M = \bigoplus_\mu M^\mu_H$ (4.3.18(b))

Consider now a Lie algebra L over complex numbers and let H be a Cartan subalgebra. The dual space $H^*$ of H is the set of all complex valued linear forms on H. Definition 4.3.21: A linear form $\alpha \in H^*$ is called a root if there exists a non-zero element $x \in L$ such that

$[h, x] = \alpha(h)x$ (4.3.19)

for every h in H. If we compare (4.3.19) with (4.3.16) it becomes apparent that roots can be thought of as special cases of weights in the sense that L can be regarded as a module over H via the adjoint representation, and therefore a direct sum decomposition of L (similar to (4.3.18)) can be obtained in terms of its root spaces $L^\alpha_H$, defined below. In order to obtain this decomposition, we assume that L is a semi-simple Lie algebra over the complex numbers. The bilinear (Killing) form restricted to H is therefore non-singular (see Remark (4.3.2)), and this means that for every $\alpha \in H^*$ there is a unique vector $h_\alpha \in H$ such that8:

$(h_\alpha, h) = \alpha(h)$ (4.3.20(a))

for all $h \in H$, where ( , ) denotes the bilinear form. Thus $H^*$ can be identified with H. Further, for any non-zero root $\alpha$, there exists a vector (denoted $e_\alpha$) in L, unique up to a scalar multiple, such that for every $h \in H$

$[h, e_\alpha] = \alpha(h)e_\alpha$ (4.3.20(b))

The vector $e_\alpha$ is called a root vector, and since $[h, e_\alpha] = \mathrm{ad}\,h(e_\alpha)$ it is a simultaneous eigenvector of the linear operators ad h acting on L. The 1-dimensional vector space spanned by $e_\alpha$ is the root space denoted $L^\alpha_H$. In particular the Cartan subalgebra H is the root space for $\alpha = 0$. Since every non-zero root $\beta \neq \alpha$ determines a distinct root space, we obtain the required decomposition:

$L = H \oplus \Big(\bigoplus_{\alpha \neq 0} L^\alpha_H\Big)$ (4.3.21)
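The decomposition (4.3.21) can be made concrete for sl(3), the traceless 3 x 3 matrices, taking the diagonal matrices as Cartan subalgebra. This is a sketch under stated assumptions: the elementary matrices $E_{ij}$, the basis $h_1, h_2$ of H and the helper names are illustrative choices, not notation from the text. Each $E_{ij}$ with $i \neq j$ is checked to satisfy (4.3.20(b)), and its root is recorded as the pair $(\alpha(h_1), \alpha(h_2))$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(ab, ba)]

def unit(i, j, n=3):
    """Elementary matrix with a single 1 in row i, column j."""
    return [[int((r, c) == (i, j)) for c in range(n)] for r in range(n)]

# basis of the Cartan subalgebra of diagonal traceless matrices
h1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
h2 = [[0, 0, 0], [0, 1, 0], [0, 0, -1]]

roots = {}
for i in range(3):
    for j in range(3):
        if i == j:
            continue
        e = unit(i, j)
        values = []
        for h in (h1, h2):
            c = commutator(h, e)
            lam = c[i][j]
            # [h, E_ij] must be a multiple of E_ij: the eigenvalue equation
            assert c == [[lam * x for x in row] for row in e]
            values.append(lam)
        roots[(i, j)] = tuple(values)

distinct_roots = set(roots.values())
dim_L, rank_L = 8, 2
```

The six distinct non-zero roots found this way agree with the count dim L minus rank L, and each root occurs together with its negative.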

From (4.3.21) it follows that the total number of non-zero roots of a semi-simple Lie algebra equals the difference: dimension of L minus the rank of L. We now list a few facts about the properties of roots and weights of semi-simple Lie algebras (see [1], [3]). Fact 4.3.22: If $\alpha$ is a root then $-\alpha$ is also a root. But there are no other non-zero roots which are multiples of $\alpha$.

8. $h_\alpha$ in H can be viewed as the image of $\alpha$ under the mapping that identifies $H^*$ with H. In Chapter 5 we shall use the terminology co-root for $h_\alpha$.


Fact 4.3.23: If $\alpha$ and $\beta$ are roots with corresponding root vectors $e_\alpha$ and $e_\beta$ and $\alpha + \beta \neq 0$, then $e_\alpha$ and $e_\beta$ are orthogonal with respect to the Killing form. Fact 4.3.24: Weights and roots have the additive property with respect to taking tensor products. Thus for instance if $M^\mu$ and $N^\nu$ are two weight submodules of the modules M and N over a nilpotent Lie algebra H, then the tensor product $M^\mu \otimes N^\nu$ is contained in the weight submodule $(M \otimes N)^{\mu + \nu}$ of the module $M \otimes N$. We now show the close relationship between weights and roots, which is not so apparent in spite of the fact that the Cartan subalgebra H plays an important role (see Eqs. (4.3.18)(b) and (4.3.19)) in both cases. This relationship is best seen through the so-called weight ladder module obtained in the following manner. Given a weight submodule $M^\mu$, repeated actions of elements $e_\alpha$ and $e_{-\alpha}$ of L generate the whole ladder of weight submodules that correspond to the entire weight ladder $\mu + z\alpha$ for $z = 0, \pm 1, \pm 2, \ldots$. However, since L is finite-dimensional, there are only a finite number of elements in the ladder, giving rise to only finitely many submodules in the weight submodule ladder. The direct sum of these weight submodules can be regarded as a module over the (ladder generating) Lie subalgebra $H + L^\alpha_H + L^{-\alpha}_H$. Due to our limited scope we shall not go into the details of this topic (we shall however use these concepts in later sections, see Hint to Exercise 6 of the next section). Interested readers can find them in other texts (see Ref. [1], [9]). Finally before we close this section, we introduce the notion of ordering amongst the roots and weights of a Lie algebra and define a simple root, a highest root and a highest weight which will eventually lead to the definitions of Cartan matrix and Weyl group.

3.5 Lexicographic Ordering, Simple and Highest Root, and Highest Weight

Let $H^*$ denote the dual of H, the nilpotent Cartan subalgebra of a given complex semi-simple Lie algebra L. Further let $H^*_R$ denote an l-dimensional real subspace of $H^*$ whose elements are real linear combinations of roots. We have seen that $H^*$ can be identified with H and therefore a metric such as given below can be defined:

$(\alpha, \beta) = (h_\alpha, h_\beta)$ (4.3.22)

for $\alpha, \beta$ in $H^*$ and $h_\alpha, h_\beta$ in H. Restricting the above metric to $H^*_R$ implies that $(\alpha, \beta)$ is real; moreover if $\alpha, \beta$ are non-zero, $(\alpha, \alpha)$ and $(\beta, \beta)$ are positive non-zero numbers, so the Killing form on $H^*_R$ is positive definite and it makes $H^*_R$ into a Euclidean space with the Killing form as the inner product. We choose an ordered (but arbitrary) basis $f_1, f_2, \ldots, f_l$ in $H^*_R$ and note that any $X \in H^*_R$ can be written as:

$X = r_1f_1 + \ldots + r_lf_l$

where $r_i$ (i = 1, ..., l) are real. We now define the lexicographic ordering in $H^*_R$ by saying that a vector X is higher than another vector Y in $H^*_R$ if the first non-zero component of their difference X - Y is greater than zero. This is notationally written as X > Y. Having selected a lexicographic ordering in $H^*_R$, we can now define a positive root and a simple root. Definition 4.3.25: A root $\alpha$ is positive if it is > 0 and is called a simple root if it is a positive root and is not a sum of two positive roots.*

* In view of (4.3.20)(b), corresponding to the l simple roots $\alpha_i$ (i = 1, ..., l) there are l root vectors $e_{\alpha_i} = e_i$. One can also define l other vectors as $f_i = 2e_{-\alpha_i}/\ldots$ Elements $e_i$ and $f_i$, i = 1, 2, ..., l are called the simple raising and simple lowering elements of $L_l$. These elements together with all $h_i \in H$ generate $L_l$.


A semi-simple Lie algebra of rank l carries a system of l simple roots (see Section 4) with respect to a lexicographic ordering. If $\alpha$ and $\beta$ are roots, then the sequence of linear forms $\beta - p\alpha, \ldots, \beta - \alpha, \beta, \beta + \alpha, \ldots, \beta + q\alpha$ is called an $\alpha$-ladder through $\beta$ if every member of the sequence is a root and if $\beta - (p + 1)\alpha$ and $\beta + (q + 1)\alpha$ are not roots. The numbers p and q, being non-negative integers, are related as:

$p - q = 2(\alpha, \beta)/(\alpha, \alpha)$ (4.3.23)
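Relation (4.3.23) can be checked by brute force on a small root system. The sketch below assumes the eight roots of the rank-2 system usually called $B_2$, realised as integer vectors in the plane with the standard dot product; this realisation and the function names are illustrative assumptions. For every pair of roots with $\beta \neq \pm\alpha$ it walks down and up the $\alpha$-ladder through $\beta$ and compares p - q with $2(\alpha, \beta)/(\alpha, \alpha)$.

```python
# the eight roots of B2 as integer vectors (an assumed realisation)
ROOTS = {(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)}

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def shift(b, k, a):
    """The vector b + k*a."""
    return (b[0] + k * a[0], b[1] + k * a[1])

def ladder(a, b):
    """Return (p, q) for the a-ladder b - p*a, ..., b + q*a through b."""
    p = 0
    while shift(b, -(p + 1), a) in ROOTS:
        p += 1
    q = 0
    while shift(b, q + 1, a) in ROOTS:
        q += 1
    return p, q

checks = []
for a in ROOTS:
    for b in ROOTS:
        if b == a or b == (-a[0], -a[1]):
            continue  # the ladder through +-a would pass through zero
        p, q = ladder(a, b)
        # (4.3.23) written without division: (p - q)(a, a) = 2(a, b)
        checks.append((p - q) * dot(a, a) == 2 * dot(a, b))

all_ok = all(checks)
example = ladder((1, 0), (-1, 1))
```

For the pair $\alpha = (1, 0)$, $\beta = (-1, 1)$, which can serve as simple roots of this system, the ladder starts at $\beta$ itself (p = 0) and climbs two steps.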

If $\alpha$ and $\beta$ are simple roots, by Def. (4.3.25) their difference cannot be a root, hence in this case p = 0. We shall later use the RHS of Eq. (4.3.23) to determine the $\alpha$-ladder through $\beta$. In the next section we shall also see that this RHS for the set of simple roots defines the Cartan matrix. Using the induction method for constructing the ladders for simple roots, the highest root of the system can be determined (see Exercise (4.5)). As for the determination of the highest weight in a weight system of any module which is an irreducible representation module of L, we would like to note that in a semi-simple Lie algebra of rank l, just like roots, weights are also linear combinations of simple roots. They can be considered as a set of points lying in the Euclidean space $H^*_R$. And since $H^*_R$ is lexicographically ordered with respect to some basis, the weights also form a totally ordered set and hence the concept of highest weight is well defined. Moreover since L is finite-dimensional, there are only a finite number of different weights and one of these is highest, i.e., higher than the others. In the previous subsection we have already seen the ladder of weights constructed from a given weight $\mu$. Written out in full it is: $\mu - p\alpha, \ldots, \mu + q\alpha$. All these weights belong to the weight system. It is interesting to note that the real number $r = 2(\mu, \alpha)/(\alpha, \alpha)$, which is an integer, defines an element $\mu - r\alpha$ in the ladder. We shall see that the weight system of an irreducible module over L can be obtained using the Dynkin diagram (Def. (4.4.14)). We now state an important result (see Lemma 4.6.5 in [11(a)]) concerning the highest weight of a given representation $\rho$ of L on V and a consequential definition. Result 4.3.26: Let $\rho$ be a representation of L in a vector space V. Suppose that $v \in V$ is a non-zero vector such that9
(i) $v \in V_\lambda$ for some $\lambda \in H^*$ ($V = \oplus V_\lambda$)
(ii) $\rho(X_i)v = 0$, $1 \le i \le l$, $X_i \in L$ (4.3.24)
(iii) v is cyclic for $\rho$ (i.e., $V = \rho(L)v$).
Then $\rho$ is a representation with weights, and $\lambda$ is the highest weight of $\rho$. Definition 4.3.27: A highest weight is called a basic weight if it is not the sum of two non-zero highest weights. The number of basic weights of a semi-simple Lie algebra equals its rank l. An irreducible module is said to be a basic module if its highest weight is one of the basic weights. Since the number of basic weights is the same as that of simple roots, one can set up a correspondence amongst them as:

$2(\lambda_i, \alpha_j)/(\alpha_j, \alpha_j) = \delta_{ij}$ (4.3.25)

where i and j take the values 1 to l. Given the basic weights $\lambda_1, \lambda_2, \ldots, \lambda_l$ any other weight $\mu$ can be expressed as

$\mu = \sum_{i=1}^{l} m_i\lambda_i$

where each $m_i$ is an integer coefficient given by the relation:

$m_i = 2(\mu, \alpha_i)/(\alpha_i, \alpha_i)$ (4.3.26)

The numbers $m_i$, which can be 0, positive or negative, are called the Dynkin indices of the weight $\mu$. The Dynkin indices of the highest weight are always non-negative and they uniquely determine (up to isomorphisms) an irreducible module over a semi-simple Lie algebra. We shall return to Dynkin indices and Dynkin diagram in the next section (see Exercise (4.4.6)).

9. v is also referred to as an extreme vector. We recall that a vector x in an L-module M is an extreme vector if $e_i x = 0$ for i = 1, ..., l, where the $e_i$ are the simple raising elements of $L = L_l$ (see [1]).
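As an illustration of basic weights and Dynkin indices, the sketch below works in the rank-2 case usually called $A_2$. The simple roots are realised as vectors in three-dimensional space with the standard dot product, and the basic weights are taken as $\lambda_1 = (2\alpha_1 + \alpha_2)/3$ and $\lambda_2 = (\alpha_1 + 2\alpha_2)/3$; this realisation and the function names are assumptions made for the example. The code checks the duality $2(\lambda_i, \alpha_j)/(\alpha_j, \alpha_j) = \delta_{ij}$ and computes the Dynkin indices (4.3.26) of the weight $\mu = \alpha_1 + \alpha_2$.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combo(c1, v1, c2, v2):
    """The linear combination c1*v1 + c2*v2."""
    return tuple(c1 * x + c2 * y for x, y in zip(v1, v2))

def dynkin_indices(mu, simple_roots):
    """m_i = 2(mu, alpha_i)/(alpha_i, alpha_i) for each simple root."""
    return [Fraction(2 * dot(mu, a), dot(a, a)) for a in simple_roots]

# simple roots of A2 (an assumed realisation)
alpha1 = (1, -1, 0)
alpha2 = (0, 1, -1)
simple = [alpha1, alpha2]

# candidate basic weights expressed through the simple roots
third = Fraction(1, 3)
lambda1 = combo(2 * third, alpha1, third, alpha2)
lambda2 = combo(third, alpha1, 2 * third, alpha2)

# duality between basic weights and simple roots
duality = [dynkin_indices(lam, simple) for lam in (lambda1, lambda2)]

# Dynkin indices of mu = alpha1 + alpha2 and reconstruction of mu
mu = combo(1, alpha1, 1, alpha2)
m = dynkin_indices(mu, simple)
rebuilt = combo(m[0], lambda1, m[1], lambda2)
```

Here both Dynkin indices of $\mu$ equal 1, and summing $m_i\lambda_i$ returns $\mu$, as the expansion of a weight in basic weights requires.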

Exercise 4.3
1. Let L be a 3-dimensional non-commutative Lie algebra with centre z. Show that if $z = \mathcal{D}L$, then L is nilpotent.
2. Prove Fact (4.3.7).
3. Prove Fact (4.3.8).
4. Prove Fact (4.3.9).
5. Prove Fact (4.3.12).
6. Show that the statements given in (4.3.13) for a Lie algebra L to be nilpotent are equivalent.
7. Prove Fact (4.3.13).
8. Let L be a nilpotent Lie algebra and p (respectively q) be the smallest integer such that $C^pL = \{0\}$ (respectively $C_qL = L$). Show that p = q + 1 and that $C_iL \supset C^{p-i}L$.
9. Let L be the 6-dimensional Lie algebra over a field $\mathcal{F}$ with basis elements $(e_1, e_2, \ldots, e_6)$ and multiplication table $[e_1, e_2] = -[e_2, e_1] = e_4$, $[e_1, e_3] = -[e_3, e_1] = e_5$, $[e_2, e_3] = -[e_3, e_2] = e_6$, with other brackets zero. Let $\beta$ be the bilinear form on L such that $\beta(e_3, e_4) = \beta(e_4, e_3) = 1$, $\beta(e_1, e_6) = \beta(e_6, e_1) = 1$, $\beta(e_2, e_5) = \beta(e_5, e_2) = -1$ and it is zero for other ordered pairs. Then show that $\beta$ is invariant, L is nilpotent, $z = \mathcal{D}L \neq \{0\}$ and $\beta$ is non-degenerate.
10. Show that the following multiplication tables define two nilpotent Lie algebras $L_3$ and $L_4$ of dimensions 3 and 4:
$L_3$: $[e_1, e_2] = e_3$, $[e_1, e_3] = [e_2, e_3] = 0$
$L_4$: $[e_1, e_2] = e_3$, $[e_1, e_3] = e_4$, $[e_1, e_4] = [e_2, e_3] = [e_2, e_4] = [e_3, e_4] = 0$.
11. Show that a Lie algebra L whose Killing form is zero is solvable.
12. Prove Fact (4.3.23).

Hints to Exercise 4.3
1. From the definition $\mathcal{D}L = [L, L]$, we note that in this case the centre $z = [L, L] = L^2$. The algebra L is 3-dimensional and non-commutative. Since $z = L^2$, it follows that $L^3$, the next ideal of the lower central series, $[L^2, L] = [z, L]$, is zero, showing that L is nilpotent.


2. By induction $L^{r+1} \supset L^{(r+1)}$, thus if the LHS is zero, the RHS is also zero, hence from Def. (4.2.4) and (4.3.6), the result is obvious. (Note that in Def. (4.2.18) and (4.2.19) this is characterized as $C^{r+1}L \supset \mathcal{D}^rL$; the vanishing of $C^{r+1}L$ implies the vanishing of $\mathcal{D}^rL$.)
3. Denote $[L, L] = \mathcal{D}L$. The condition is necessary since when L is a solvable algebra, the nilpotent radical of L is $\mathcal{D}L$. To see the sufficiency we note that $L/\mathcal{D}L$ is commutative when
$C^{i+1}L$ and $L_{r-i}$
$x_1, \ldots, x_r \in L$.

But $C^rL$ is the set of linear combinations of elements of the form given above; the vanishing of one implies the vanishing of the other, accordingly (a) and (c) are equivalent, leading to equivalence of (a), (b) and (c). Finally, if there exists a sequence $(L_i)_{0 \le i \le r}$ of ideals with the defining properties for L to be nilpotent, then it is easy to note that there exists a decreasing sequence $(V_i)_{0 \le i \le n}$ of vector subspaces of L of dimensions n, n - 1, n - 2, ..., 0 and a sequence of indices $i_0 < i_1 < \ldots < i_r$ with $L_0 = V_{i_0}, L_1 = V_{i_1}, \ldots, L_r = V_{i_r}$; then as $[L, V_{i_k}] \subset V_{i_{k+1}}$, the $V_i$ are ideals and $[L, V_i] \subset V_{i+1}$ for all i. Hence the nilpotency of L implies (d) and thus all (a), (b), (c) and (d) are equivalent.
7. Let S and I be a subalgebra and an ideal of the nilpotent Lie algebra L. Then L/I = Q is the quotient algebra, and
$(C^rL) = \{0\}$, hence the subalgebra S as well as the quotient algebra Q is nilpotent. If, on the other hand, Q is nilpotent and I is contained in the centre of L, then, using $C^rQ = \{0\}$ for some r, we have $C^rL \subset I$ and therefore $C^{r+1}L \subset [I, L] = \{0\}$ so that L is nilpotent. The nilpotency of S and I can be established trivially


by treating them as subsets of L, thus, for instance, $C^rS \subset C^rL = \{0\}$. Finally, the assertion on products follows from (c) of equivalence relations (4.3.13).
8. Using the equivalence of (a) and (b) in statement (4.3.13) it is obvious that if p and q are the smallest integers that satisfy $C^pL = \{0\}$ and $C_qL = L$ for the Lie algebra L, then p must equal q + 1. Also note that in the Hint to Exercise 6 we have shown that in this case $L_i \supset C^{i+1}L$ and (replacing r by q) $L_{q-i} \subset C_iL$. This leads to $C_iL \supset L_{q-i} \supset C^{q-i+1}L$, i.e., $C_iL \supset C^{p-i}L$.
9. Using the multiplication table we obtain that $\mathcal{D}L = [L, L] = (e_4, e_5, e_6)$. It is easy to see that this is also the centre of L, since every element of L commutes with those of $\mathcal{D}L$. Hence $z = \mathcal{D}L \neq \{0\}$. This leads to $[L, \mathcal{D}L] = [L, z] = \{0\}$, showing that L is nilpotent (we shall also prove it by computing the brackets for $\{e_i\}$). A form $\beta$ on L is invariant if $\beta([x, y], z) = \beta(x, [y, z])$ for all $x, y, z \in L$ (see Def. (4.4.7)). In this case we have to check this for $\{e_i\}$, i = 1, ..., 6. Now $\beta([e_1, e_2], e_3)$ must be equal to $\beta(e_1, [e_2, e_3])$. Using the table for $\beta(e_i, e_j)$ and the Lie brackets, we note that the former is $\beta(e_4, e_3) = 1$ and the latter is $\beta(e_1, e_6) = 1$. Similarly $\beta([e_1, e_3], e_4)$ must be equal to $\beta(e_1, [e_3, e_4])$, or $\beta(e_5, e_4)$ must equal $\beta(e_1, 0)$, and these are both zero. Likewise filling in all other values of i, j, k in $\beta(e_i, [e_j, e_k])$ for $i \neq j \neq k$, we have in all 20 cases and the two sides can be seen to be equal in each. This shows that $\beta$ is invariant; the non-degeneracy of $\beta$ is also obvious. To show that L is nilpotent by direct computation we have to compute the lower central series and show that $C^rL = [L, C^{r-1}L]$ is zero for some r. Now $C^2L = [L, L] = (e_4, e_5, e_6) = \mathcal{D}L$, and $C^3L = [L, \mathcal{D}L]$ is spanned by the brackets $[e_i, e_4], [e_i, e_5], [e_i, e_6]$ (i = 1, ..., 6), all of which are zero.

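Thus L is nilpotent.
11. We use Fact (4.3.7) and (4.3.12) (established in Exercises 2 and 5) to obtain the result.
12. In effect we have to show that if $\alpha + \beta \neq 0$ then the one-dimensional spaces $L^\alpha_H$ and $L^\beta_H$ spanned by $e_\alpha$ and $e_\beta$ are orthogonal. We use the defining equation (4.3.20(b)), which gives $[h, e_\alpha] = \alpha(h)e_\alpha$, and the decomposition equality (4.3.21). Since ad h is a derivation on the whole of L and in particular on $L^\alpha_H$ and $L^\beta_H$, we note that for $e_\alpha \in L^\alpha_H$ and $e_\beta \in L^\beta_H$ we can write:

$\mathrm{ad}\,h[e_\alpha, e_\beta] = [(\mathrm{ad}\,h)e_\alpha, e_\beta] + [e_\alpha, (\mathrm{ad}\,h)e_\beta] = \alpha(h)[e_\alpha, e_\beta] + \beta(h)[e_\alpha, e_\beta] = (\alpha(h) + \beta(h))[e_\alpha, e_\beta]$.

This shows that $[e_\alpha, e_\beta]$ is an eigenvector belonging to $L^{\alpha + \beta}_H$. Thus $[L^\alpha_H, L^\beta_H] \subset L^{\alpha + \beta}_H$ holds good for any roots $\alpha$ and $\beta$. If $\alpha + \beta \neq 0$ and $\gamma$ is any other element of $H^*$, $\mathrm{ad}\,e_\alpha \circ \mathrm{ad}\,e_\beta$ will map $L^\gamma_H$ into $L^{\alpha + \beta + \gamma}_H$, which is different from $L^\gamma_H$. Hence $(e_\alpha, e_\beta) = \mathrm{Tr}(\mathrm{ad}\,e_\alpha \circ \mathrm{ad}\,e_\beta) = 0$, showing that $e_\alpha \perp e_\beta$.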
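The claims of Exercise 9 in this set can also be verified mechanically. The following sketch encodes the multiplication table and the form $\beta$ with 0-based indices (so `e[0]` stands for $e_1$); the function names are hypothetical. It checks the invariance condition $\beta([x, y], z) = \beta(x, [y, z])$ quoted in the hint on all basis triples, the vanishing of $[L, \mathcal{D}L]$, and non-degeneracy through the fact that the matrix of $\beta$ has exactly one non-zero entry in each row and column.

```python
N = 6

# non-zero brackets [e_i, e_j] with i < j, as coefficient vectors (0-based)
TABLE = {(0, 1): 3, (0, 2): 4, (1, 2): 5}   # [e1,e2]=e4, [e1,e3]=e5, [e2,e3]=e6

def bracket(i, j):
    """Coefficient vector of [e_i, e_j]."""
    out = [0] * N
    if (i, j) in TABLE:
        out[TABLE[(i, j)]] = 1
    elif (j, i) in TABLE:
        out[TABLE[(j, i)]] = -1
    return out

# matrix of the bilinear form beta on the basis
B = [[0] * N for _ in range(N)]
for (i, j), value in {(2, 3): 1, (0, 5): 1, (1, 4): -1}.items():
    B[i][j] = B[j][i] = value

def beta(x, y):
    """beta evaluated on two coordinate vectors."""
    return sum(x[i] * B[i][j] * y[j] for i in range(N) for j in range(N))

def basis(i):
    return [int(k == i) for k in range(N)]

# invariance on all basis triples
invariant = all(
    beta(bracket(i, j), basis(k)) == beta(basis(i), bracket(j, k))
    for i in range(N) for j in range(N) for k in range(N)
)

# [L, DL] = 0: every basis element commutes with e4, e5, e6
central_derived = all(bracket(i, j) == [0] * N
                      for i in range(N) for j in (3, 4, 5))

# one non-zero entry per row and per column implies non-degeneracy
one_per_row = all(sum(1 for x in row if x) == 1 for row in B)
one_per_col = all(sum(1 for r in range(N) if B[r][c]) == 1 for c in range(N))
nondegenerate = one_per_row and one_per_col
```

The check runs over all 216 ordered triples of basis elements, which covers the 20 cases counted in the hint together with the degenerate ones.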

4 UNIVERSAL ENVELOPING ALGEBRA, WEYL GROUP AND CARTAN MATRIX

4.1 Universal Enveloping Algebra, Representations on Modules

In Sec. 1 we have seen how an associative algebra can be made into a Lie algebra. We shall now see the reverse process, namely how a given Lie algebra can be associated with an associative algebra. For this purpose we treat L as a vector space and form its (contravariant) tensor algebra:

$T(L) = \mathcal{F} \oplus L \oplus (L \otimes L) \oplus \ldots$

We know T(L) is an associative algebra (see Hint to Exercise 1 in Sec. (4.1)). We now collect the set of all elements of the form (differences between Lie algebra products and the corresponding commutators in T(L)):

$[x, y] - (x \otimes y - y \otimes x)$ (4.4.1)

for x, y in L, and we note that this collection generates a two-sided ideal denoted I. The associative quotient algebra:

$T(L)/I = U(L)$ (4.4.2)

is called the universal enveloping algebra of L. Since associative algebras are easier to handle (structurally), the enveloping algebra U(L) is found very useful in the study of Lie groups (via their Lie algebras). Some of the properties of U(L) are listed below: Properties 4.4.1: Every representation of a Lie algebra L can be extended to a representation of U(L). Every module over L can be thought of as a module over U(L). Conversely a unitary left module over U(L) is called a left L-module, in short an L-module. The action of U(L) over a module M is defined as:

$(x_1 x_2 \ldots x_n)m = x_1(x_2(\ldots x_n(m) \ldots))$ (4.4.3)

where $x_1, \ldots, x_n \in L$ and $m \in M$. Also if $\{x_i\}$ is a basis of L then monomials of the form:

$x_{i_1} \otimes x_{i_2} \otimes \ldots \otimes x_{i_n}$, n = 0, 1, 2, ...

(where n = 0 gives the trivial monomial 1) span the tensor algebra T(L) and hence the cosets of these monomials span U(L). Before closing this subsection we give a few definitions concerning representations that involve modules associated to the enveloping algebra of a Lie algebra and the unitary modules over a field $\mathcal{F}$ (see [1] and [8] for details). Definition 4.4.2: Two representations $\rho$ and $\rho'$ of L on $\mathcal{F}$-modules M and M' (over the same field $\mathcal{F}$) are called similar or isomorphic if the L-modules M and M' are isomorphic. It should be noted that for this it is necessary and sufficient that there exists an isomorphism $\psi$ of the $\mathcal{F}$-module M onto the $\mathcal{F}$-module M', such that

$\rho'(x) = \psi \circ \rho(x) \circ \psi^{-1}$ (4.4.4)

for $x \in L$. Definition 4.4.3: Let I be an index set and for $i \in I$, let $\rho_i$ be a representation of L on the module $M_i$, and let M be the L-module which is the direct sum of the modules $M_i$. Then there is a corresponding representation

$\rho = \sum_{i \in I} \rho_i$

called the direct sum of representations such that:

$\rho(x) \cdot m = \oplus(\rho_i(x)m_i)_{i \in I}$ (4.4.5)

where $x \in L$ and $m = \sum_{i \in I} m_i \in M$.


Definition 4.4.4: A representation $\rho$ of L on M is called simple or irreducible if the associated L-module is simple. Note that this amounts to saying that there exists no submodule of M (over the field $\mathcal{F}$) other than {0} and M, which is stable under all the $\rho(x)$'s for $x \in L$. Thus a class of simple modules defines a class of simple representations. Definition 4.4.5: A representation $\rho$ of L on M is called semi-simple or completely reducible if the associated L-module is semi-simple. Thus $\rho$ in this case is a direct sum of simple representations on submodules of M, none of which contains a proper non-zero submodule stable under $\rho(x)$ for every x in L. Definition 4.4.6: Let L be a Lie algebra over a field $\mathcal{F}$, and M an L-module. Then the L-module structure on M and the trivial L-module structure on $\mathcal{F}$ define an L-module structure on the $\mathcal{F}$-module N formed by bilinear forms on M. Explicitly N stands for $\mathcal{B}(M, M; \mathcal{F})$ and its structure is given as:

$(x_N \cdot \beta)(m, m') = -\beta(x_M \cdot m, m') - \beta(m, x_M \cdot m')$ (4.4.6)

where $x \in L$, $m, m' \in M$, $\beta \in N$ and $x_N$, $x_M$ stand for the linear operators on modules N and M that result from x in L. The set of all elements $x \in L$ which satisfy $x_N \cdot \beta = 0$ for a fixed element $\beta$ of N forms a subalgebra of L. In view of the above definition, we note that if M is an $\mathcal{F}$-module and gl(M) denotes the Lie algebra of endomorphisms of M, then for a given bilinear form $\beta$ on M, the set of $x \in gl(M)$ that satisfies:

$\beta(xm, m') + \beta(m, xm') = 0$ (4.4.7)

forms a Lie subalgebra of gl(M). Definition 4.4.7: Let L be a Lie algebra over $\mathcal{F}$. The adjoint representation of L on L and the zero representation of L on $\mathcal{F}$ define an L-module structure on the $\mathcal{F}$-module $N = \mathcal{B}(L, L; \mathcal{F})$ of bilinear forms on L. A bilinear form $\beta$ on L is said to be invariant if it is invariant under the representation $x \to x_N$. From equality (4.4.6), the necessary and sufficient condition that $\beta$ be invariant is:

$\beta([x, y], z) = \beta(x, [y, z])$ (4.4.8)

4.2 Root Systems and the Weyl Group

In order to define a Weyl group in all generality, we recall the definition of a root system in an arbitrary finite-dimensional vector space $V$ over the field of rational numbers $\mathbb{Q}$.

Definition 4.4.8: Let the vector space $V$ carry a positive definite symmetric bilinear form $(\,,\,)$. A finite subset $X$ of non-zero vectors of $V$ is called a root system in $V$ if the following four properties are satisfied by its elements (called roots) $\alpha$, $\beta$, etc.:

(i) $X$ spans $V$
(ii) if $\alpha \in X$ and $t\alpha \in X$ with $t \in \mathbb{Q}$, then $t = \pm 1$
(iii) if $\alpha, \beta \in X$, then $2(\alpha, \beta)/(\alpha, \alpha)$ is an integer
(iv) if $\alpha, \beta \in X$, then $\beta - 2[(\alpha, \beta)/(\alpha, \alpha)]\alpha \in X$. (4.4.9)

For each $\alpha \in X$ we define a symmetry $S_\alpha : V \to V$ as

\[ S_\alpha(v) = v - 2\,\frac{(\alpha, v)}{(\alpha, \alpha)}\,\alpha . \tag{4.4.10} \]

The set of all symmetries defined by (4.4.10) generates a group called the Weyl group $W$ of the root system $X$. Since $X$ is finite, the group $W$ is finite. Moreover, by property (iv) each $S_\alpha$ permutes the roots, so $W$ can be viewed as a group of permutations of $X$.
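The definition of the symmetries $S_\alpha$ and of the Weyl group can be tried out on a small case. The following is a minimal sketch, assuming a rank-2 root system in $\mathbb{Q}^2$ with the standard inner product; the particular eight roots and the helper names (`reflect`, `as_permutation`, `generate_group`) are illustrative choices and are not taken from the text.

```python
from fractions import Fraction

def inner(u, v):
    """Standard inner product on Q^2."""
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, v):
    """S_alpha(v) = v - 2 (alpha, v)/(alpha, alpha) alpha, as in (4.4.10)."""
    c = Fraction(2 * inner(alpha, v), inner(alpha, alpha))
    return tuple(x - c * a for x, a in zip(v, alpha))

# Assumed example: four positive roots of a rank-2 system, plus their negatives.
positive = [(1, 0), (-1, 1), (0, 1), (1, 1)]
roots = {tuple(Fraction(x) for x in r) for r in positive}
roots |= {tuple(-x for x in r) for r in roots}
ordered = sorted(roots)

def as_permutation(alpha):
    """Record how S_alpha permutes the ordered list of roots."""
    return tuple(ordered.index(reflect(alpha, r)) for r in ordered)

def generate_group(generators):
    """Close a set of permutations under composition."""
    identity = tuple(range(len(ordered)))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = tuple(s[i] for i in g)  # apply g first, then s
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

weyl = generate_group([as_permutation(r) for r in roots])
print(len(roots), len(weyl))
```

Because the roots span $V$, an element of $W$ is determined by how it permutes the roots, which is why the sketch can store group elements as permutations.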


For all split$^{10}$ Lie algebras $L$ of finite dimension over a field of characteristic 0, a root system can be defined (as shown in the previous section) by taking the vector space $V$ over $\mathbb{Q}$ as the one spanned by the basis vectors $\alpha_1, \dots, \alpha_n \in H^*$ (the dual of the Cartan subalgebra $H$). Denote the set of roots by $X$. Then clearly $X \subset V$ and the elements of $X$ satisfy the remaining three properties (ii), (iii) and (iv). The Weyl group in this case is generated by the elements $\{S_{\alpha_i}\}$, $\alpha_i \in H^*$ $(i = 1, \dots, n)$. The following two properties relating to the above discussion can be easily checked.

Property 4.4.9: For any root system $X \subset V$ and $\alpha \in X$,

\[ (S_\alpha(u), S_\alpha(v)) = (u, v) \quad \text{for all } u, v \in V. \tag{4.4.11} \]

Property 4.4.10: Let $n(\alpha, \beta) = 2(\alpha, \beta)/(\beta, \beta)$ define a function $X \times X \to \mathbb{Q}$. The only values that $n$ takes in $\mathbb{Q}$ are $0, \pm 1, \pm 2, \pm 3$.

Definition 4.4.11: A subset $Y \subset X$ is called a root system basis for $X$ in $V$ if $Y$ is a vector space basis for $V$ and for any $\beta \in X$ we have

\[ \beta = \sum_{i=1}^{n} m_i \alpha_i, \qquad m_i \in \mathbb{Q}, \]

where $Y = \{\alpha_1, \dots, \alpha_n\}$, and either all the $m_i$ are non-negative or they are all non-positive. A root system basis is irreducible if there is no non-trivial disjoint union $Y = Y_1 \cup Y_2$ with $(\alpha, \beta) = 0$ for all $\alpha \in Y_1$ and $\beta \in Y_2$. It is known that every root system $X$ in $V$ contains a root system basis (see [8]). Before closing this subsection we list the important properties of a root system basis $Y$ of the root system $X$ in $V$:

(i) $Y$ spans $V$
(ii) if $\alpha, \beta \in Y$ and $\alpha \neq \beta$ then $(\alpha, \beta) \le 0$
(iii) $Y$ is a vector space basis of $V$
(iv) if $Y = \{\alpha_1, \alpha_2, \dots, \alpha_n\}$ and $\beta \in X^+$ (the subset of all positive roots in $X$), then either $\beta \in Y$ or there is an $\alpha_i \in Y$ with $\beta - \alpha_i \in X^+$
(v) if $\beta \in X^+$, then there exist non-negative integers $m_i$ for $i = 1, 2, \dots, n$, not all zero, with $\beta = \sum_{i=1}^{n} m_i \alpha_i$.

We leave the verification of these properties as an exercise for the reader; we shall of course use them in the remaining part of this chapter.

4.3 Cartan Matrices

Definition 4.4.12: Let $Y = (\alpha_1, \dots, \alpha_n)$ denote a basis for a root system $X$ in $V$ and let $n : Y \times Y \to \{0, \pm 1, \pm 2, \pm 3\}$ be the mapping defined in Property (4.4.10). Then the matrix

\[ \| n(\alpha_i, \alpha_j) \| = \left\| \frac{2(\alpha_i, \alpha_j)}{(\alpha_j, \alpha_j)} \right\| \tag{4.4.12} \]

is called the Cartan matrix of the root system $X$.*

10. A Lie algebra $L$ over a field $\mathcal{F}$ of characteristic 0 is said to be a split algebra if for each $x \in L$ all the characteristic values of $\mathrm{ad}\,x$ are in $\mathcal{F}$. All algebras over an algebraically closed field, in particular over the field of complex numbers, are split algebras (see [5] for details).

* From property (ii) of a root system basis it is evident that all diagonal elements of a Cartan matrix are 2, and the off-diagonal elements are from the set $\{0, -1, -2, -3\}$.
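Definition 4.4.12 only needs the inner products of the basis roots, so a Cartan matrix can be computed from a Gram matrix. The following is a minimal sketch under that assumption; the function name `cartan_from_gram` and the sample Gram matrix (two simple roots with squared lengths 1 and 2 and inner product $-1$) are illustrative and not taken from the text.

```python
from fractions import Fraction

def cartan_from_gram(gram):
    """A_ij = n(alpha_i, alpha_j) = 2 (alpha_i, alpha_j) / (alpha_j, alpha_j)."""
    size = len(gram)
    return [[Fraction(2 * gram[i][j], gram[j][j]) for j in range(size)]
            for i in range(size)]

# Assumed Gram matrix G_ij = (alpha_i, alpha_j) of a rank-2 basis.
gram = [[1, -1],
        [-1, 2]]
cartan = cartan_from_gram(gram)

# Diagonal entries are 2 and off-diagonal entries are non-positive integers.
diagonal_ok = all(cartan[i][i] == 2 for i in range(2))
off_diagonal_ok = all(cartan[i][j] <= 0 and cartan[i][j].denominator == 1
                      for i in range(2) for j in range(2) if i != j)
print(cartan, diagonal_ok, off_diagonal_ok)
```

Note that the matrix is not symmetric when the two roots have different lengths, because the denominator is always the squared length of the second argument.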


The elements $n(\alpha_i, \alpha_j) = A_{ij}$ of the Cartan matrix evidently determine the root-ladder of simple roots (see Eq. (4.3.23)), and hence also the highest root of the system.

If $\alpha = \sum_{i=1}^{n} m_i \alpha_i \in X^+$, then $\sum_{i=1}^{n} m_i$ is called the height of $\alpha$ with respect to $Y$. Two root systems $X_1$ and $X_2$ in $V_1$ and $V_2$ respectively are said to be isomorphic if there exists an onto non-singular linear transformation $T : V_1 \to V_2$ such that $T(X_1) = X_2$, and for $u, v \in V_1$, $(T(u), T(v))_2 = c\,(u, v)_1$, where $c$ is a positive rational number and $(\,,\,)_i$ is a symmetric bilinear form on $V_i$ $(i = 1, 2)$.

Property 4.4.13: All the roots of a root system $X$ and its Cartan matrix can be determined once a basis $Y$ for the root system is given. Also, two root systems with bases having identical Cartan matrices are isomorphic.

It is important to note here that the Cartan matrix for a given root system is unique up to permutations of its rows and columns, irrespective of the choice of a root basis. This observation leads to the fact that Cartan matrices for Lie algebras are different only when these algebras are non-isomorphic. We next define another important object of this chapter, the Dynkin diagram, which turns out to be a useful tool in the determination of Cartan matrices of root systems and in the classification of Lie algebras.

4.4 Dynkin Diagram

Definition 4.4.14: The Dynkin diagram $\Delta$ of a root system $X$ in $V$ with basis $Y = \{\alpha_1, \alpha_2, \dots, \alpha_n\}$ consists of a graph in $\mathbb{R}^2$ that has $n$ vertices labelled $\alpha_1, \dots, \alpha_n$ and has $N_{ij} = n(\alpha_i, \alpha_j)\, n(\alpha_j, \alpha_i)$ line segments joining the $\alpha_i$-th vertex to the $\alpha_j$-th vertex. Moreover, for any two roots $\alpha$ and $\beta$ (which include the $\alpha_i$'s), if $n(\alpha, \beta) \neq 0$ and $(\beta, \beta) > (\alpha, \alpha)$, then the Dynkin diagram carries an arrow from the $\beta$-vertex to the $\alpha$-vertex. From the above definition it is clear that the number of lines which join the vertices $\alpha_i$ and $\alpha_j$ can be obtained using the expression:

\[ N_{ij} = \frac{4(\alpha_i, \alpha_j)^2}{(\alpha_i, \alpha_i)(\alpha_j, \alpha_j)} = 4\cos^2 \angle(\alpha_i, \alpha_j) \tag{4.4.13} \]
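The data of a Dynkin diagram can be read off mechanically from a Cartan matrix. The sketch below assumes the convention $A_{ij} = n(\alpha_i, \alpha_j) = 2(\alpha_i, \alpha_j)/(\alpha_j, \alpha_j)$, so that $|A_{ji}| > |A_{ij}|$ means $\alpha_j$ is the longer root; the function name `dynkin_edges` and its return format are hypothetical.

```python
def dynkin_edges(cartan):
    """Map each joined pair (i, j), i before j, to (number of lines, arrow).

    Vertices are labelled 1..n. The arrow is a pair (source, target) pointing
    from the longer root to the shorter one, or None for roots of equal length.
    """
    edges = {}
    size = len(cartan)
    for i in range(size):
        for j in range(i + 1, size):
            lines = cartan[i][j] * cartan[j][i]   # N_ij of Definition 4.4.14
            if lines == 0:
                continue
            if abs(cartan[j][i]) > abs(cartan[i][j]):
                arrow = (j + 1, i + 1)            # alpha_j is the longer root
            elif abs(cartan[i][j]) > abs(cartan[j][i]):
                arrow = (i + 1, j + 1)            # alpha_i is the longer root
            else:
                arrow = None
            edges[(i + 1, j + 1)] = (lines, arrow)
    return edges

print(dynkin_edges([[2, -1], [-3, 2]]))
```

For the sample matrix the single edge carries three lines with an arrow from vertex 2 to vertex 1.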

Property 4.4.15: Given a basis $Y$ of a root system $X$ in $V$, the Dynkin diagram of this root system $X$ determines the Cartan matrix of $X$.

To examine this property we need the two facts already listed above, namely (1) $n(\alpha_i, \alpha_j) \in \{0, \pm 1, \pm 2, \pm 3\}$ for $\alpha_i, \alpha_j \in Y$; (2) the number of line segments joining the vertices $\alpha_i$, $\alpha_j$ in the Dynkin diagram is $n(\alpha_i, \alpha_j)\, n(\alpha_j, \alpha_i)$. To begin with, we note that if there is no line between $\alpha_i$ and $\alpha_j$ then $n(\alpha_i, \alpha_j) = 0 = n(\alpha_j, \alpha_i)$, thus the $(i, j)$-th as well as the $(j, i)$-th element of the Cartan matrix is zero. If $n(\alpha_i, \alpha_j)\, n(\alpha_j, \alpha_i) = 1$ (i.e., $\alpha_i$ and $\alpha_j$ are joined by one line) then $n(\alpha_i, \alpha_j) = n(\alpha_j, \alpha_i) = -1$ and $(\alpha_i, \alpha_i) = (\alpha_j, \alpha_j)$, giving roots of equal length. In view of Property (4.4.10) (see Hint to Exercise 2), the only other values that $n(\alpha_i, \alpha_j)\, n(\alpha_j, \alpha_i)$ can take are 2 and 3. Suppose now that $n(\alpha_i, \alpha_j)\, n(\alpha_j, \alpha_i) = 2$; the roots in this case will not be of equal length. If there is an arrow from $\alpha_j$ to $\alpha_i$, it means that $(\alpha_j, \alpha_j)$ is greater than $(\alpha_i, \alpha_i)$. Since the numerators are equal and negative, this implies

\[ \frac{2(\alpha_i, \alpha_j)}{(\alpha_j, \alpha_j)} > \frac{2(\alpha_j, \alpha_i)}{(\alpha_i, \alpha_i)} . \]

But the LHS and the RHS of this inequality are $n(\alpha_i, \alpha_j)$ and $n(\alpha_j, \alpha_i)$ respectively. Hence we have $n(\alpha_i, \alpha_j) > n(\alpha_j, \alpha_i)$, which means that $n(\alpha_i, \alpha_j) = -1$ and $n(\alpha_j, \alpha_i) = -2$. The remaining case $n(\alpha_i, \alpha_j)\, n(\alpha_j, \alpha_i) = 3$ can be argued in a similar manner. The roots will again be of unequal lengths, and an arrow from $\alpha_j$ to $\alpha_i$ would imply that $(\alpha_j, \alpha_j)$ is greater than $(\alpha_i, \alpha_i)$, leading eventually to $n(\alpha_i, \alpha_j) = -1$ and $n(\alpha_j, \alpha_i) = -3$. From our discussion on the Cartan matrix in the previous subsection, we already know that the off-diagonal elements are $\le 0$, and the Dynkin diagram helps determine these elements in all cases.

We now devote the rest of this section to illustrating the concepts introduced above by a few examples based on variations of root systems of a two-dimensional vector space.

Example 4.4.16: Let $V$ stand for a 2-dimensional vector space and $X$ stand for the root system in the four different variations given below.

(a) $V = \mathbb{Q}^2 = \mathbb{Q} \times \mathbb{Q}$, $\alpha = (1, 0)$, $\beta = (0, 1)$; $X = \{\pm\alpha, \pm\beta\}$
(b) $V = \{(a, \sqrt{3}\,b) : a, b \in \mathbb{Q}\}$, $\alpha = (1, 0)$, $\beta = (-1/2, \sqrt{3}/2)$; $X = \{\pm\alpha, \pm\beta, \pm(\alpha + \beta)\}$
(c) $V = \mathbb{Q}^2$, $\alpha = (1, 0)$, $\beta = (-1, 1)$; $X = \{\pm\alpha, \pm\beta, \pm(\alpha + \beta), \pm(2\alpha + \beta)\}$
(d) $V = \{(a, \sqrt{3}\,b) : a, b \in \mathbb{Q}\}$, $\alpha = (1, 0)$, $\beta = (-3/2, \sqrt{3}/2)$; $X = \{\pm\alpha, \pm\beta, \pm(\alpha + \beta), \pm(2\alpha + \beta), \pm(3\alpha + \beta), \pm(3\alpha + 2\beta)\}$

We shall now verify the compatibility of the different root systems with the corresponding vector spaces using Def. (4.4.8). It is easy to note that (i) is trivially satisfied in each of these four examples, i.e., $X$ spans $V$; in fact there is a root basis $Y$ formed by $\{\alpha, \beta\}$ in all of them which spans $V$. The condition (ii) is also quite evident in all cases. To examine (iii) and (iv) we choose the roots $(2\alpha + \beta)$ and $(3\alpha + \beta)$ of (d) and write down the required inner products. In the case of (iii) we have to show that

\[ \frac{2(2\alpha + \beta,\, 3\alpha + \beta)}{(2\alpha + \beta,\, 2\alpha + \beta)} \tag*{(4.4.14)(a)} \]

is an integer, i.e., that $2\bigl((1/2, \sqrt{3}/2), (3/2, \sqrt{3}/2)\bigr) / \bigl((1/2, \sqrt{3}/2), (1/2, \sqrt{3}/2)\bigr)$ is an integer. Simplification yields:

\[ 2(3/4 + 3/4)/(1/4 + 3/4) = 3 \tag*{(4.4.14)(b)} \]

To establish (iv) we use the calculations of (4.4.14)(b) and write:

\[ (3\alpha + \beta) - 3(2\alpha + \beta) = -3\alpha - 2\beta = -(3\alpha + 2\beta) \tag{4.4.15} \]

Evidently $-(3\alpha + 2\beta) \in X$. It should be noted that the symmetry transformations $S_\alpha$ can be interpreted geometrically as reflections in the hyperplane perpendicular to the root $\alpha \in X$; these are called Weyl reflections. In this context property (iv) of Def. (4.4.8) merely asserts that the Weyl reflection of a root $\beta$ in the hyperplane (through the origin) perpendicular to any non-zero root $\alpha$ yields another root

\[ \beta - \frac{2(\alpha, \beta)}{(\alpha, \alpha)}\,\alpha . \]

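This fact is also obvious for all four examples given above. We now use equality (4.4.12) of Def. (4.4.12) to write the Cartan matrix for example (d). For this we denote the root basis $\{\alpha, \beta\}$ as $\{\alpha_1, \alpha_2\}$. The Cartan matrix is:

\[ \begin{pmatrix} \dfrac{2(\alpha_1, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_1, \alpha_2)}{(\alpha_2, \alpha_2)} \\[2ex] \dfrac{2(\alpha_2, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_2, \alpha_2)}{(\alpha_2, \alpha_2)} \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ -3 & 2 \end{pmatrix} \]

The Dynkin diagram $\Delta$ of the root system of example (d) is computed as follows:

\[ n(\alpha_1, \alpha_2)\, n(\alpha_2, \alpha_1) = (-1)(-3) = 3 \]

Thus there are three line segments joining $\alpha_1$ and $\alpha_2$, and there is an arrow pointing from $\alpha_2$ to $\alpha_1$.

[Figure: Dynkin diagram with vertices $\alpha_1$ and $\alpha_2$ joined by three line segments and an arrow from $\alpha_2$ to $\alpha_1$.]

Property (4.4.15), which says that the Dynkin diagram for a given root basis determines the Cartan matrix, can be easily verified.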
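Properties (ii), (iii) and (iv) of Def. (4.4.8) can be checked exhaustively for the four root systems of Example 4.4.16. The following sketch makes one representational assumption: a vector $(a, \sqrt{3}\,b)$ is stored as the rational pair $(a, b)$ and the inner product becomes $aa' + 3bb'$, so that all arithmetic stays exact. The helper names are illustrative.

```python
from fractions import Fraction as F

def make_inner(weight):
    """Inner product a*a' + weight*b*b' (weight 3 encodes the sqrt(3) factor)."""
    return lambda u, v: u[0] * v[0] + weight * u[1] * v[1]

def build(alpha, beta, combos):
    """Roots p*alpha + q*beta for (p, q) in combos, together with their negatives."""
    pos = [(p * alpha[0] + q * beta[0], p * alpha[1] + q * beta[1])
           for p, q in combos]
    return set(pos) | {(-x, -y) for x, y in pos}

def is_root_system(X, inner):
    """Check properties (ii), (iii) and (iv) of Def. (4.4.8) for a finite set X."""
    for a in X:
        for b in X:
            coeff = F(2 * inner(a, b)) / inner(a, a)
            if coeff.denominator != 1:                        # property (iii)
                return False
            if (b[0] - coeff * a[0], b[1] - coeff * a[1]) not in X:
                return False                                  # property (iv)
            parallel = a[0] * b[1] == a[1] * b[0]
            if parallel and b != a and b != (-a[0], -a[1]):   # property (ii)
                return False
    return True

# (weight, alpha, beta, positive roots as coefficient pairs (p, q))
examples = {
    "a": (1, (1, 0), (0, 1), [(1, 0), (0, 1)]),
    "b": (3, (1, 0), (F(-1, 2), F(1, 2)), [(1, 0), (0, 1), (1, 1)]),
    "c": (1, (1, 0), (-1, 1), [(1, 0), (0, 1), (1, 1), (2, 1)]),
    "d": (3, (1, 0), (F(-3, 2), F(1, 2)),
          [(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)]),
}
results = {name: is_root_system(build(al, be, co), make_inner(w))
           for name, (w, al, be, co) in examples.items()}
print(results)
```

Property (i) is not tested by the code; it holds in each case because $\alpha$ and $\beta$ are linearly independent.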

4.5 Casimir Element and Casimir Operator of L

Let $C(L)$ stand for the centre of the universal enveloping algebra $U(L)$. It is known that $C(L)$ is an abelian subalgebra of $U(L)$ and it consists of those elements of $U(L)$ which commute with every element of $L$. These elements of $C(L)$ are called Casimir elements. More formally, we define it as follows:

Definition 4.4.17: Let $L$ be a Lie algebra over a field $\mathcal{F}$, and $U(L)$ be its enveloping algebra. Let $J$ be an $n$-dimensional ideal of $L$ such that an invariant bilinear form $\beta$ on $L$, when restricted to $J$, is non-degenerate. Suppose that $J$ admits two bases $(e_i)$ and $(e'_i)$, $i = 1, \dots, n$, for which $\beta(e_i, e'_j) = \delta_{ij}$.$^{11}$ The element

\[ C = \sum_{i} e_i e'_i \]

of $U(L)$ belongs to the centre $C(L)$ and is called the Casimir element. It is evidently independent of the choice of the basis. In particular, when $\beta$ is the bilinear form associated with an $L$-module $M$, the element $C$ is called the Casimir element associated with $M$ (or with the corresponding representation).

11. The bases $(e_i)$ and $(e'_i)$ are dual to each other with respect to $\beta$.

Remark 4.4.18: When $L$ is a semi-simple Lie algebra, there always exists a Casimir element, since in this case there is a non-singular Killing form (which we again denote as $\beta$), with respect to which two bases $(e_i)$ and $(e'_i)$ can be defined satisfying the relation $\beta(e_i, e'_j) = \delta_{ij}$. The element

\[ C = \sum_{i=1}^{n} e_i e'_i \]

is the required Casimir element, which commutes with all elements of $L$. The element $C$ is sometimes referred to as the second order Casimir element. In the case of a semi-simple Lie algebra (that we are talking about), it is analogous to the Laplace operator in the theory of special functions, and as such it is sometimes expressed by using a basis $\{h_1, \dots, h_l\}$ of the Cartan subalgebra $H$ of $L$:

\[ C = \sum_{i=1}^{l} h_i h^i + \sum_{\alpha \neq 0} \frac{e_\alpha e_{-\alpha}}{(e_\alpha, e_{-\alpha})} \tag{4.4.16} \]

where $\{h^i\}$ denotes the dual basis with respect to the Killing form $(\,,\,)$ on $H$, and the summation in the second term runs over all non-zero roots $\alpha$ of $L$, which in turn define non-zero root vectors $e_\alpha \in L_\alpha$.

Definition 4.4.19: Let $L$ be a Lie algebra over the field $\mathcal{F}$, let $J$ be an ideal of $L$ and let $\rho$ denote a representation of $L$ in a finite-dimensional vector space $V$. Define a bilinear form $\Gamma$ on $L$ by setting

\[ \Gamma(X, Y) = \mathrm{Tr}\, \rho(X)\rho(Y), \qquad X, Y \in L \tag{4.4.17} \]

and assume that it is non-degenerate when restricted to $J$. Let $\{X_1, \dots, X_n\}$ be a basis of $J$ and $\{X'_1, \dots, X'_n\}$ be its dual (i.e., $\Gamma(X_i, X'_j) = \delta_{ij}$); then the element$^{12}$

\[ C = \sum_{i} \rho(X_i) \circ \rho(X'_i) \tag{4.4.18} \]

is an endomorphism of $V$ which commutes with every endomorphism $\rho(A)$ for $A \in L$. The element $C$ of (4.4.18) is called the Casimir operator of $J$ corresponding to the representation $\rho$. If $V$ is an irreducible $L$-module $M$, then the Casimir operator of $L$ is an automorphism of $M$. Hence $C$ is a multiple of the unit operator $\mathbf{1}$; thus, denoting the eigenvalue by $C(M)$, we have in this case

\[ C = C(M)\, \mathbf{1} . \]

If $M$ has highest weight $\lambda$, then using Weyl's formula$^{13}$ we have

\[ C(M) = \Bigl(\lambda,\; \lambda + \sum_{\alpha > 0} \alpha\Bigr) \tag{4.4.19} \]

We shall return to these ideas in the next chapter, where we shall see how these concepts are used in representation theory of infinite-dimensional algebras.
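Definition 4.4.19 can be carried out by hand in the smallest non-trivial case. The sketch below assumes $L = J = sl_2$ in its defining 2-dimensional representation, with the usual basis $e, f, h$; this choice of algebra and the helper names are illustrative and not taken from the text. The dual basis with respect to the trace form $\Gamma$ is $e' = f$, $f' = e$, $h' = h/2$.

```python
from fractions import Fraction as F

def mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def add(*matrices):
    """Entrywise sum of 2x2 matrices."""
    return [[sum(M[r][c] for M in matrices) for c in range(2)] for r in range(2)]

def scale(t, A):
    return [[t * x for x in row] for row in A]

def trace(A):
    return A[0][0] + A[1][1]

def bracket(A, B):
    return add(mul(A, B), scale(-1, mul(B, A)))

# Assumed example: the defining representation of sl_2.
e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]
basis = [e, f, h]
dual_basis = [f, e, scale(F(1, 2), h)]

# Gamma(X_i, X'_j) = Tr(X_i X'_j) should be the identity matrix (duality).
gram = [[trace(mul(X, Xd)) for Xd in dual_basis] for X in basis]

# Casimir operator (4.4.18): sum of rho(X_i) rho(X'_i).
casimir = add(*[mul(X, Xd) for X, Xd in zip(basis, dual_basis)])
commutes = all(bracket(casimir, X) == [[0, 0], [0, 0]] for X in basis)
print(gram, casimir, commutes)
```

The resulting operator is a scalar multiple of the identity, as expected for an irreducible module.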

12. We have used the same symbol $C$ to denote the Casimir element as well as the Casimir operator, since it has the same properties as that of the Casimir operator defined earlier for Lie groups (see Exercise 3.2 and Sec. 3.7).

13. See Belinfante, Jacobson, Varadarajan for details on Weyl's formulae, e.g., the character formula and the dimension formula ([1], [5], [11]).

Exercise 4.4

1. Prove Property (4.4.9).
2. Prove Property (4.4.10).
3. Show that the symmetry transformations $\{S_\alpha\}$ of a vector space $V$ satisfy (i) $S_\alpha = S_{-\alpha}$ and (ii) $S_\alpha^2 = 1$ for every root $\alpha \in X$.
4. Show that the Cartan matrices for Example (4.4.16) (a), (b) and (d) are respectively:

\[ \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 2 & -1 \\ -3 & 2 \end{pmatrix} \]

5. Establish the following for a simple Lie algebra $A_2$, when the Dynkin diagram in terms of simple roots $\alpha_1$, $\alpha_2$ is given as:

[Figure: Dynkin diagram with two vertices joined by a single line segment.]

(i) its Cartan matrix; (ii) the $\alpha_1$ ladder through $\alpha_2$; (iii) its root system, the highest root and the root diagram; (iv) a canonical basis for $A_2$.
6. Find the basic weights $\lambda_1$, $\lambda_2$ for the Lie algebra $A_2$ of Exercise 5, and the basic representation modules corresponding to them, along with their weight diagrams.

Hints to Exercise 4.4

1. By definition $S_\alpha(u) = u - 2[(\alpha, u)/(\alpha, \alpha)]\alpha$, therefore

\[ (S_\alpha(u), S_\alpha(v)) = (u, v) - \frac{2(u, \alpha)(\alpha, v)}{(\alpha, \alpha)} - \frac{2(v, \alpha)(\alpha, u)}{(\alpha, \alpha)} + \frac{4(\alpha, u)(\alpha, v)(\alpha, \alpha)}{(\alpha, \alpha)^2} . \]

Since the bilinear form $(\,,\,)$ is symmetric, we have the required result.

2. To prove this property, we think of $V$ as having the base field $\mathbb{R}$ (which contains $\mathbb{Q}$), the field of real numbers; the inner product on $V$ and therefore on $X$ is the restriction of the usual inner product on $\mathbb{R}^n$, $n$ being the dimension of $V$. If $\theta$ denotes the angle between the roots $\alpha$ and $\beta$, thought of as vectors in $\mathbb{R}^n$, then

\[ \cos^2\theta = \frac{(\alpha, \beta)^2}{(\alpha, \alpha)(\beta, \beta)} . \tag*{(i)} \]

This implies that

\[ n(\alpha, \beta)\, n(\beta, \alpha) = \frac{4(\alpha, \beta)^2}{(\alpha, \alpha)(\beta, \beta)} = 4\cos^2\theta . \tag*{(ii)} \]

Since $\cos^2\theta \le 1$, $n(\alpha, \beta)\, n(\beta, \alpha) \le 4$. But $n(\alpha, \beta)$ and $n(\beta, \alpha)$ are both integers (Def. (4.4.8)(iii)), therefore we have

\[ -4 \le n(\alpha, \beta) \le 4 . \tag*{(iii)} \]

In order to show that $n(\alpha, \beta)$ takes only the values $0, \pm 1, \pm 2, \pm 3$, we have to show that it does not take the value $\pm 4$ allowed by (iii). If possible let $n(\alpha, \beta) = 4$; then from (ii) $n(\beta, \alpha) = 1$, but $n(\beta, \alpha) = 2(\beta, \alpha)/(\alpha, \alpha)$, therefore $(\alpha, \alpha) = 2(\beta, \alpha) = n(\alpha, \beta)(\beta, \beta) = 4(\beta, \beta)$. Moreover $\cos^2\theta = 1$ by (ii), so $\alpha$ and $\beta$ are proportional and hence $\alpha = 2\beta$, which contradicts property (ii) of Def. (4.4.8). The value $-4$ is excluded in the same way.

3. We have $S_\alpha(x) = x - 2((\alpha, x)/(\alpha, \alpha))\alpha$ and $S_{-\alpha}(x) = x - 2((-\alpha, x)/(-\alpha, -\alpha))(-\alpha)$. Apparently $S_\alpha(x) = S_{-\alpha}(x)$ for all $x$ in $V$ and all $\alpha$ in $X$. To compute $S_\alpha^2(x)$ we write $S_\alpha(x) = y$, thus $S_\alpha(y) = y - 2((\alpha, y)/(\alpha, \alpha))\alpha$. On substitution this becomes:

\[ S_\alpha(S_\alpha(x)) = x - \frac{2(\alpha, x)}{(\alpha, \alpha)}\alpha - \frac{2(\alpha, x)}{(\alpha, \alpha)}\alpha + \frac{4(\alpha, x)}{(\alpha, \alpha)}\alpha = x , \]

thus $S_\alpha^2 = 1$.

4. We use Def. (4.4.12) to form the matrix $\| 2(\alpha_i, \alpha_j)/(\alpha_j, \alpha_j) \|$. Since in each of these examples the root basis consists of two elements $\alpha$, $\beta$, the matrix is:

\[ \begin{pmatrix} \dfrac{2(\alpha, \alpha)}{(\alpha, \alpha)} & \dfrac{2(\alpha, \beta)}{(\beta, \beta)} \\[2ex] \dfrac{2(\beta, \alpha)}{(\alpha, \alpha)} & \dfrac{2(\beta, \beta)}{(\beta, \beta)} \end{pmatrix} \]

Substitution of the different values (such as $\alpha = (1, 0)$ and $\beta = (0, 1)$ in the case of (a)) gives the required result.

5. (i) From the definition of the Dynkin diagram and Eq. (4.4.13) we have

\[ n(\alpha_i, \alpha_j)\, n(\alpha_j, \alpha_i) = \frac{4(\alpha_i, \alpha_j)^2}{(\alpha_i, \alpha_i)(\alpha_j, \alpha_j)} = 4\cos^2 \angle(\alpha_i, \alpha_j) . \tag*{(a)} \]

The LHS of the above equality represents the number of lines between the vertices $\alpha_i$ and $\alpha_j$, which can be 0, 1, 2, or 3. Also $(\alpha_i, \alpha_j) \le 0$ for simple roots, hence using the different choices of $r$ $(r = 0, 1, 2, 3)$ in $r/4 = \cos^2 \angle(\alpha_i, \alpha_j)$ it follows that the angle between $\alpha_i$ and $\alpha_j$ can be one of these: 90°, 120°, 135°, 150°. In our case the value of $r$ is 1, hence the angle is 120°. The roots $\alpha_i$ and $\alpha_j$ are of equal length, since in the Dynkin diagram there is no arrow from one root to the other. To compute the Cartan matrix

\[ \begin{pmatrix} \dfrac{2(\alpha_1, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_1, \alpha_2)}{(\alpha_2, \alpha_2)} \\[2ex] \dfrac{2(\alpha_2, \alpha_1)}{(\alpha_1, \alpha_1)} & \dfrac{2(\alpha_2, \alpha_2)}{(\alpha_2, \alpha_2)} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \]

we observe that since the roots are of equal length the off-diagonal elements are not only equal but simplify to $2\cos \angle(\alpha_1, \alpha_2) = 2\cos 120° = -1$. Hence the matrix is

\[ \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} . \]

(ii) Using the expression for a root ladder and the fact that the difference of two simple roots is not a root, we have $p = 0$; thus the root ladder $\alpha_1$ through $\alpha_2$ is simply

\[ \alpha_2,\; \alpha_2 + \alpha_1,\; \dots,\; \alpha_2 + q\alpha_1 \tag*{(b)} \]

where $q$ satisfies (4.3.26). Since the roots have equal length and $2(\alpha_1, \alpha_2)/(\alpha_2, \alpha_2) = -1$, this gives $q = 1$; accordingly the root ladder (b) has just two roots $\alpha_2$, $\alpha_2 + \alpha_1$.

(iii) Since every simple root as well as any other positive root has its negative counterpart that belongs to the root system, we have the root system consisting of the six roots $\pm\alpha_1$, $\pm\alpha_2$, $\pm(\alpha_1 + \alpha_2)$. The highest root is obviously $\alpha_1 + \alpha_2$. The root diagram is as given below:

[Figure: Root diagram of $A_2$, showing the six roots $\pm\alpha_1$, $\pm\alpha_2$, $\pm(\alpha_1 + \alpha_2)$.]

Obviously the Euclidean space $H^*_{\mathbb{R}}$ formed by the simple roots is a plane spanned by $\alpha_1$ and $\alpha_2$. From the above diagram it is also clear that the root system of $A_2$ is symmetric under the Weyl group $W$, and has an additional symmetry under inversion in the origin (e.g., $\alpha_i \to -\alpha_i$, etc.).

(iv) Let $h_1$, $h_2$ be the simple co-roots in the Cartan subalgebra (see (4.3.20)(a)), let $e_i \in L_{\alpha_i}$ denote the root vector $e_{\alpha_i}$, and let $f_i$ be the multiple of $e_{-\alpha_i}$ normalized so that $[e_i, f_i] = h_i$ $(i = 1, 2)$. The vectors $e_1, e_2, f_1, f_2$$^{14}$ together with $h_1$, $h_2$ form the canonical basis and generate the Lie algebra $A_2$. The following Lie bracket relations can be easily verified:

\[ [h_i, h_j] = 0, \qquad [e_1, f_1] = h_1, \qquad [e_2, f_2] = h_2, \qquad [e_2, f_1] = 0 = [e_1, f_2], \]
\[ [h_1, e_1] = a_{11} e_1 = 2e_1, \qquad [h_1, e_2] = a_{12} e_2 = -e_2, \]
\[ [h_1, f_1] = -a_{11} f_1 = -2f_1, \qquad [h_1, f_2] = -a_{12} f_2 = f_2 . \]

Writing similar relations for $h_2$ we get

\[ [h_2, e_1] = -e_1, \qquad [h_2, f_1] = f_1, \qquad [h_2, e_2] = 2e_2, \qquad [h_2, f_2] = -2f_2 . \]

Besides these six vectors there are the commutators $[e_1, e_2] = e_{12}$ and $[f_1, f_2] = f_{12}$. Thus in all we have eight vectors which span the vector space underlying $A_2$.

14. The vectors $\{e_i\}$ and $\{f_i\}$ in a Lie algebra $L$ of rank $l$ are called the simple raising elements and simple lowering elements of $L$ (see Ftn. on p. 172).

6. From the defining Eq. (4.3.28) for basic weights, we know that when $i \neq j$, the basic weight $\lambda_i$ is perpendicular to $\alpha_j$; in the case $i = j$ the formula implies that the projection of $\lambda_i$ on the root $\alpha_i$ is half the length of $\alpha_i$. Since the rank of $A_2$ is two, there are only two simple roots and two basic weights. To obtain the latter we must find the lengths of $\alpha_1$ and $\alpha_2$; for this we use the fact that the sum of the squares of the lengths of all the roots is equal to the rank. From the root diagram of Exercise 5, there are six roots, and the end points of the root vectors form the vertices of a regular hexagon, thus they are all of equal length, say $l$. From $6l^2 = 2$ we have $l = 1/\sqrt{3}$, i.e., $(\alpha_1, \alpha_1) = (\alpha_2, \alpha_2) = 1/3$, and since $\angle(\alpha_1, \alpha_2) = 120°$, $(\alpha_1, \alpha_2) = -1/6$. Thus, superimposed on a root diagram, the basic weights will be as follows:

[Figure: Weight diagram of $A_2$, showing the basic weights superimposed on the root diagram.]


Corresponding to these weights $\lambda_1$, $\lambda_2$ there are two basic modules for $A_2$. The representation which corresponds to one is just the dual of the other. The weight diagram (with highest weight $\lambda_1$) can be seen, using the Weyl group, to be an equilateral triangle with the weights $\lambda_1$, $\lambda_2 - \lambda_1$, $-\lambda_2$ as its vertices. Likewise the weight diagram with highest weight $\lambda_2$ will be the equilateral triangle with the weights $\lambda_2$, $\lambda_1 - \lambda_2$, $-\lambda_1$ as its vertices. Thus if $\mu_1, \mu_2, \mu_3$ is the weight system of the basic module $M_1$ with highest weight $\lambda_1 = \mu_1$, then $-\mu_1, -\mu_2, -\mu_3$ is the weight system for the basic module $M_2$ with highest weight $\lambda_2 = -\mu_3$. The weight diagrams in the two cases would be:

[Figure: Weight diagrams of $M_1$ and $M_2$.]

Each of these modules will be three-dimensional, since corresponding to the weights $\mu_1, \mu_2, \mu_3$ there will be three distinct weight vectors $x_1, x_2, x_3$. We use $x_1, x_2, x_3$ as basis vectors and, choosing $x_1$ as the extreme vector (i.e., $e_i x_1 = 0$$^{15}$ for every $i$), we can set the other two as $x_2 = f_1 x_1$ and $x_3 = f_2 x_2$. We use the relations given in (4.3.18) to write $h_i x_j = \mu_j(h_i) x_j$$^{16}$ and $\lambda_i(h_j) = \delta_{ij}$, and then use these to note further that

\[ e_1 x_2 = e_1 f_1 x_1 = [e_1, f_1] x_1 + f_1 e_1 x_1 = h_1 x_1 = x_1 . \]

The rule $L_\alpha M_\mu \subset M_{\mu + \alpha}$ (see Sec. 3) and an inspection of the weight system further give $e_1 x_1 = 0 = e_1 x_3$. We now use

\[ x_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad x_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad x_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \]

to write the matrix representation of $A_2$ corresponding to the irreducible (basic) module $M_1$. Thus we have:

\[ e_1 \mapsto \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad e_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \qquad e_{12} \mapsto \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \]
\[ f_1 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad f_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad f_{12} \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \]
\[ h_1 \mapsto \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad h_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} . \]

15. We are using (4.3.24)(i) and (ii) but with different notation.

16. Comparing it with (4.3.16), it is obvious that while $h_i$ replaces $x$ (an element of the Lie algebra $L$), $x_i$ replaces $m$ (a vector in the module).

Since the weight system of $M_2$ can be obtained from that of $M_1$ by inversion in the origin, the representation matrices for $M_2$ can be written by using the dual basis $x_1^*, x_2^*, x_3^*$ (in place of $x_1, x_2, x_3$). It can be checked that they are the negative transposes of the matrices given above.
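The bracket relations of Hint 5(iv) can be confirmed on the 3-dimensional module. The following is a minimal sketch, assuming that the generators act by the matrix units $e_1 = E_{12}$, $e_2 = E_{23}$, $f_1 = E_{21}$, $f_2 = E_{32}$ and by the diagonal matrices $h_1 = \mathrm{diag}(1, -1, 0)$, $h_2 = \mathrm{diag}(0, 1, -1)$; the helper names are illustrative. It also checks that passing to negative transposes preserves brackets, which is what makes the dual module a representation.

```python
def unit(i, j):
    """3x3 matrix unit E_ij with 1-based indices."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)]
            for r in range(3)]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def scale(t, A):
    return [[t * x for x in row] for row in A]

def bracket(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(3)] for r in range(3)]

def dual(A):
    """Negative transpose, the matrix acting on the dual module."""
    return [[-A[c][r] for c in range(3)] for r in range(3)]

e1, e2, f1, f2 = unit(1, 2), unit(2, 3), unit(2, 1), unit(3, 2)
h1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
h2 = [[0, 0, 0], [0, 1, 0], [0, 0, -1]]
zero = scale(0, h1)
e12, f12 = bracket(e1, e2), bracket(f1, f2)

relations = [
    (bracket(h1, h2), zero),
    (bracket(e1, f1), h1), (bracket(e2, f2), h2),
    (bracket(e1, f2), zero), (bracket(e2, f1), zero),
    (bracket(h1, e1), scale(2, e1)), (bracket(h1, e2), scale(-1, e2)),
    (bracket(h1, f1), scale(-2, f1)), (bracket(h1, f2), f2),
    (bracket(h2, e1), scale(-1, e1)), (bracket(h2, e2), scale(2, e2)),
    (bracket(h2, f1), f1), (bracket(h2, f2), scale(-2, f2)),
]
relations_hold = all(lhs == rhs for lhs, rhs in relations)

generators = [e1, e2, f1, f2, h1, h2]
dual_is_representation = all(
    bracket(dual(X), dual(Y)) == dual(bracket(X, Y))
    for X in generators for Y in generators)
print(relations_hold, dual_is_representation)
```

With these conventions the commutator $[f_1, f_2]$ comes out as $-E_{31}$, so the sign of $f_{12}$ depends on the order chosen in the bracket.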

References

1. J. G. F. Belinfante and B. Kolman, A Survey of Lie Groups and Lie Algebras with Applications and Computational Methods (SIAM, Philadelphia, 1972).
2. N. Bourbaki, Éléments de Mathématique, Groupes et Algèbres de Lie, Chapitre I (Paris: Hermann, 1960).
3. E. B. Dynkin, The Structure of Semisimple Algebras, Uspekhi Mat. Nauk 2 (1947), 59-127; English transl., Amer. Math. Soc. Transl. No. 17, 1950, reprinted in Amer. Math. Soc. Translations Series I, Vol. 9 (1962) 328-469.
4. J. E. Humphreys, Introduction to Lie Algebras and Representation Theory (New York: Springer-Verlag, 1972).
5. N. Jacobson, Lie Algebras (Interscience Publishers, 1962).
6. V. G. Kac, (a) Infinite Dimensional Lie Algebras (Birkhauser Boston Inc., 1983); (b) (ed.) Infinite Dimensional Lie Algebras and Groups (World Scientific, 1989).
7. V. G. Kac and A. K. Raina, Highest Weight Representations of Infinite Dimensional Lie Algebras (World Scientific, 1987).
8. A. A. Sagle and R. E. Walde, Introduction to Lie Groups and Lie Algebras (Academic Press, Inc., 1973).
9. D. H. Sattinger and O. L. Weaver, Lie Groups and Algebras with Applications to Physics, Geometry, and Mechanics (Springer-Verlag, 1986).
10. A. E. Taylor and D. C. Lay, Introduction to Functional Analysis (John Wiley and Sons, 1980).
11. V. S. Varadarajan, Lie Groups, Lie Algebras, and Their Representations (a) (New Jersey: Prentice Hall, 1974); (b) (New York: Springer-Verlag, 1984).
12. H. Zassenhaus, 2.[35].

CHAPTER 5

INFINITE-DIMENSIONAL ALGEBRAS

Infinite-dimensional algebras occur in many areas of mathematics and physics; for instance, one finds them in 2-dimensional conformal field theories (Virasoro algebra), in gauge and quantum field theories, as well as in string theories. In this chapter, we focus our attention mainly on affine algebras, i.e., Kac-Moody algebras with affine matrix, because it is these algebras which have deep links with combinatorics and the theory of modular forms and theta functions on the mathematical side, and have an important role (via vertex operators) in dual resonance models/quantum string theories on the physics side. Besides this, it is possible to construct root systems, and obtain representations (e.g. highest weight representations) for these algebras in much the same way as one does for finite-dimensional Lie algebras (see Dolan in Ref. [Ad], [15] and [17], as well as Chari and Pressley in [8b]). It is worth mentioning here that the study of Kac-Moody algebras, a class of infinite-dimensional Lie algebras, was initiated independently by Victor Kac and R. V. Moody in the mid-sixties, and these algebras have assumed a phenomenal role ever since. We devote the first 3 sections to these algebras, and the latter 3 sections to Heisenberg systems and to the Vertex and Virasoro operators. We shall see in those sections the effectiveness of Kac-Moody algebras as a tool in other theories.

1 LIE ALGEBRAS ASSOCIATED TO CARTAN MATRICES

In the previous chapter, while studying the weight and root systems of finite-dimensional Lie algebras, we have seen that every (semi-simple)$^{1}$ Lie algebra carries at least one Cartan subalgebra, with respect to which we have a root space decomposition:

\[ g = \mathcal{H} \oplus \sum_{\alpha \in \Delta} g_\alpha \]

We also learnt there that, using a symmetric bilinear form on $g$, a matrix called the Cartan matrix can be defined associated with the root system $\Delta$, which is unique up to isomorphisms. In this section we reverse the order: we consider the so-called generalized Cartan matrix and construct a Lie algebra from it. A familiarity with this approach is required for learning the basics of infinite-dimensional algebras, particularly that of Kac-Moody algebras. The material in this section is based on the (standard) text [8(a)]. For more details and enlightening exercises, the reader is referred to this text. The notations used here differ from the previous chapter in some cases; for instance, the Lie algebra there has been denoted as $L$ and the Cartan subalgebra as $H$, while here we shall denote them as $g$ and $\mathcal{H}$ respectively.

1.1 Generalized Cartan Matrix and its Realization

Definition 5.1.1: A complex $n \times n$ matrix $A = (a_{ij})_{i,j=1}^{n}$ of rank $l$ is called a generalized Cartan matrix if its entries satisfy:

(i) $a_{ii} = 2$ $(i = 1, \dots, n)$
(ii) $a_{ij}$ are non-positive integers for $i \neq j$
(iii) $a_{ij} = 0$ implies $a_{ji} = 0$ (5.1.1)

We now assume that $A$ is real and indecomposable;$^{2}$ then it can be verified that just one of the three following statements (listed in Fact (5.1.2)) holds good for both $A$ and ${}^tA$.

Fact 5.1.2:
(a) If $\det A \neq 0$, then there exists a column vector $u > 0$ (i.e., all $u_i$ are $> 0$) such that $Au > 0$; moreover $Av \ge 0$ implies either $v > 0$ or $v = 0$. Such an $A$ is called a finite type matrix.
(b) If $\det A = 0$ and the rank of $A$ is $(n - 1)$, then there exists $u > 0$ such that $Au = 0$, and $Av \ge 0$ implies $Av = 0$. The matrix $A$ is now called affine type.
(c) There exists $u > 0$ such that $Au < 0$, and $Av \ge 0$, $v \ge 0$ imply $v = 0$. The matrix in this case is known as indefinite type.

The above facts can be translated in terms of principal minors (determinants of principal submatrices) of $A$. For instance: (a') $A$ is of finite type if and only if all its principal minors are positive; (b') $A$ is of affine type if and only if all its proper principal minors are positive and $\det A = 0$ (see Chapter 4 of [8(a)]). Using these facts on indecomposable generalized Cartan matrices, a complete classification of the Dynkin diagrams of these matrices can be obtained, and the associated Lie algebras can be analyzed (see Thm. (4.8) of [8(a)]). Returning to our goal of defining a Lie algebra for a given generalized Cartan matrix, we first define the term realization of a matrix $A$.

Definition 5.1.3: Let $A$ be an $n \times n$ matrix of rank $l$ with complex entries (to be more general), $\mathcal{H}$ a complex vector space and $\mathcal{H}^*$ its dual. Let $\Pi = \{\alpha_1, \dots, \alpha_n\} \subset \mathcal{H}^*$ and $\tilde{\Pi} = \{\tilde{\alpha}_1, \dots, \tilde{\alpha}_n\} \subset \mathcal{H}$ be indexed subsets of $\mathcal{H}^*$ and $\mathcal{H}$ which satisfy the following properties:$^{3}$

(i) the sets $\Pi$ and $\tilde{\Pi}$ are linearly independent
(ii) $\langle \tilde{\alpha}_i, \alpha_j \rangle = a_{ij}$ $(i, j = 1, 2, \dots, n)$
(iii) $\dim \mathcal{H} = 2n - l$. (5.1.2)

Then the triple $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$ is called a realization of $A$. Every matrix $A$ possesses a realization, which is unique up to isomorphisms$^{4}$ only if $\det A \neq 0$. It can be checked that if $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$ is a realization of $A$, then $\{\mathcal{H}^*, \tilde{\Pi}, \Pi\}$ is a realization of ${}^tA$, the transpose of $A$.

2. A matrix $A$ is said to be decomposable if after reordering its rows/columns it takes a diagonal block form

\[ \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix} . \]

In other words it can be expressed as a direct sum of two or more matrices. If this is not the case, it is called indecomposable.

3. Note that the symbol (tilde) $\sim$ on $\Pi$ and $\alpha_1, \dots, \alpha_n$ does not stand for complex conjugation. See Sec. (3.1) to appreciate (ii) of Eq. (5.1.2).

4. The word 'isomorphism' relates to vector spaces.
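The principal-minor criteria (a') and (b') of Fact 5.1.2 lend themselves to a direct computation. The following is a minimal sketch, assuming the input is already known to be an indecomposable generalized Cartan matrix (the code does not validate this); the function names `det`, `principal_minors` and `classify` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    size, sign, result = len(M), 1, Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, size):
            factor = M[r][c] / M[c][c]
            M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return sign * result

def principal_minors(A, proper_only=False):
    """Yield determinants of principal submatrices (optionally excluding det A)."""
    size = len(A)
    top = size - 1 if proper_only else size
    for k in range(1, top + 1):
        for idx in combinations(range(size), k):
            yield det([[A[i][j] for j in idx] for i in idx])

def classify(A):
    """Return 'finite', 'affine' or 'indefinite' using criteria (a') and (b')."""
    if all(m > 0 for m in principal_minors(A)):
        return "finite"
    if det(A) == 0 and all(m > 0 for m in principal_minors(A, proper_only=True)):
        return "affine"
    return "indefinite"

print(classify([[2, -1], [-1, 2]]),
      classify([[2, -2], [-2, 2]]),
      classify([[2, -3], [-3, 2]]))
```

Exact rational arithmetic is used so that the test $\det A = 0$ is not affected by rounding.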


In analogy with the terminology used in the previous chapter, the elements of $\Pi$ and $\tilde{\Pi}$ are respectively simple roots and simple co-roots (see (4.3.20)(a) and Ftn. 8 there), while $\Pi$ and $\tilde{\Pi}$ are the root and co-root bases. The root lattices accordingly are:

\[ Q = \sum_{i=1}^{n} \mathbb{Z}\,\alpha_i, \qquad Q_+ = \sum_{i=1}^{n} \mathbb{Z}_+\,\alpha_i \tag{5.1.3} \]

A partial ordering $\ge$ on $\mathcal{H}^*$ can be defined by setting $\alpha \ge \beta$ if $\alpha - \beta \in Q_+$. Having set up this machinery, we now obtain an (auxiliary) Lie algebra $\tilde{g}(A)$ associated with the $(n \times n)$ complex matrix $A$ whose realization is $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$.

Remark 5.1.4: Consider two sets of generators $\{e_i\}$, $\{f_i\}$ along with the vector space $\mathcal{H}$. Assume that the elements of those sets satisfy* (see Hint to Exercise 5 in Sec. 4.4):

(i) $[e_i, f_j] = \delta_{ij}\,\tilde{\alpha}_i$ $(i, j = 1, 2, \dots, n)$
(ii) $[h, h'] = 0$ $(h, h' \in \mathcal{H})$
(iii) $[h, e_i] = \langle \alpha_i, h \rangle e_i$ $(i = 1, 2, \dots, n;\ h \in \mathcal{H})$
(iv) $[h, f_i] = -\langle \alpha_i, h \rangle f_i$ $(i = 1, 2, \dots, n;\ h \in \mathcal{H})$ (5.1.4)

Then the vector space generated by the union of $\{e_i\}$, $\{f_i\}$ and $\mathcal{H}$, equipped with the Lie brackets (5.1.4), forms the Lie algebra denoted $\tilde{g}(A)$, which is associated to the matrix $A$ and the realization $\{\mathcal{H}, \Pi, \tilde{\Pi}\}$. The Lie algebra $\tilde{g}(A)$ is characterized by a fundamental result, which we state without proof in the next subsection (see Hint to Exercise 2 of this section and Thm. 1.2 in [8(a)]).

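1.2 Construction of Kac-Moody Algebra g(A) and its Universal Enveloping Algebra

Result 5.1.5:

(a) The Lie algebra $\tilde{g}(A)$ depends on $A$ and equals

$$\tilde{g}(A) = \tilde{n}_+ \oplus \mathcal{H} \oplus \tilde{n}_- \tag{5.1.5}$$

where $\tilde{n}_+$ ($\tilde{n}_-$) is the subalgebra freely generated by the set $\{e_1, \ldots, e_n\}$ ($\{f_1, \ldots, f_n\}$).

(b) In view of the relations (5.1.4), there exists a mapping $e_i \to -f_i$, $f_i \to -e_i$ $(i = 1, \ldots, n)$, and $h \to -h$ $(h \in \mathcal{H})$ which can be uniquely extended to an involution $\tilde{\sigma}$ of the Lie algebra $\tilde{g}(A)$.

(c) $\tilde{g}(A)$ admits the root space decomposition with respect to $\mathcal{H}$:

$$\tilde{g}(A) = \Big(\bigoplus_{\alpha \in Q_+,\ \alpha \neq 0} \tilde{g}_{-\alpha}\Big) \oplus \mathcal{H} \oplus \Big(\bigoplus_{\alpha \in Q_+,\ \alpha \neq 0} \tilde{g}_{\alpha}\Big) \tag{5.1.6}$$

where $\tilde{g}_\alpha = \{x \in \tilde{g}(A) \mid [h, x] = \alpha(h)\,x\}$, $\dim \tilde{g}_\alpha < \infty$†, and $\tilde{g}_\alpha \subset \tilde{n}_\pm$ for $\pm\alpha \in Q_+$.

(d) Among the ideals of $\tilde{g}(A)$ that intersect $\mathcal{H}$ trivially there exists a unique maximal ideal $I$, which satisfies:

$$I = (I \cap \tilde{n}_-) \oplus (I \cap \tilde{n}_+) \tag{5.1.7}$$

With $\tilde{g}(A)$ and $I$ defined above, one can obtain the quotient:

$$\tilde{g}(A)/I = g(A) \tag{5.1.8}$$

* Repeated indices do not imply summation here.
† $\dim \tilde{g}_\alpha \leq n^{|\mathrm{ht}\,\alpha|}$; this estimate links the dimension of $\tilde{g}_\alpha$ to the height of the root $\alpha$. (See Def. 4.4.12).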

The quotient $g(A)$ is the Lie algebra associated to a complex matrix $A$. When $A$ is a generalized Cartan matrix (as defined in (5.1.1)), the algebra constructed in the above manner is called a Kac-Moody algebra. The quadruple $(g(A), \mathcal{H}, \Pi, \Pi^\vee)$ is called the $(g, \mathcal{H})$-pair associated to the matrix $A$. Note that $\mathcal{H}$ is the Cartan subalgebra of $g(A)$. The images of the generators $\{e_i\}$, $\{f_i\}$ under the canonical mapping $\tilde{g}(A) \to \tilde{g}(A)/I$, which are again denoted as $\{e_i\}$, $\{f_i\}$, are the Chevalley generators of $g(A)$ (see Ftn. 5). Also the involution $\tilde{\sigma}$ on $\tilde{g}(A)$ leaves the ideal $I$ invariant and thus induces an involution $\sigma$ on $g(A)$ which satisfies:

$$\sigma(e_i) = -f_i, \qquad \sigma(f_i) = -e_i, \qquad \sigma(h) = -h \quad (h \in \mathcal{H}) \tag{5.1.9}$$

The involution $\sigma$ is called the Cartan involution of $g(A)$.

Furthermore if we set $\mathcal{H}' = \sum_{i=1}^{n} \mathbb{C}\alpha_i^\vee$ and consider the derived subalgebra $g'(A)$, we have

$$g'(A) \cap \mathcal{H} = \mathcal{H}'; \qquad g'(A) \cap g_\alpha = g_\alpha \ \text{ if } \alpha \neq 0.$$

In view of (5.1.6) we therefore have:

$$g(A) = \bigoplus_{\alpha \in Q} g_\alpha \tag{5.1.10}$$

where $g_\alpha = \{x \in g(A) \mid [h, x] = \alpha(h)\,x \text{ for all } h \in \mathcal{H}\}$ is the root space corresponding to the root $\alpha$. Evidently $g_0 = \mathcal{H}$ (see subsec. 4 of (4.3)). Equations (5.1.9) and (5.1.10) taken together lead to the fact that if $\alpha$ is a simple root, then $g_\alpha$ is 1-dimensional, and for every simple root $\alpha$ there is a root $-\alpha$ whose root space $g_{-\alpha}$ is also 1-dimensional. If $\alpha$ is not a simple root, then

multiplicity of $\alpha$ = multiplicity of $(-\alpha)$, where multiplicity of $\alpha = \dim g_\alpha$.

In particular $\Delta_- = -\Delta_+$.

The Lie algebra g(A) is finite-dimensional if and only if all principal minors of matrix A are positive. In fact these Lie algebras are semi-simple, and as such, the classical theory of Killing-Cartan on these algebras becomes the theory of Kac-Moody algebras associated to matrices of finite type.
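The positivity criterion just stated is easy to test on small examples. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and it only checks the principal-minor condition quoted above (it does not check that the input is a generalized Cartan matrix).

```python
from fractions import Fraction
from itertools import combinations


def determinant(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # a zero column means a zero determinant
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det                  # a row swap flips the sign
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, n):
                rows[r][k] -= factor * rows[col][k]
    return det


def principal_minors(matrix):
    """Yield (index subset, minor) for every non-empty subset of indices."""
    n = len(matrix)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            yield idx, determinant([[matrix[i][j] for j in idx] for i in idx])


def all_principal_minors_positive(matrix):
    """True when every principal minor is positive (the finite-type criterion)."""
    return all(minor > 0 for _, minor in principal_minors(matrix))


if __name__ == "__main__":
    print(all_principal_minors_positive([[2, -1], [-1, 2]]))   # finite-type example
    print(all_principal_minors_positive([[2, -2], [-2, 2]]))   # singular example
```

The second matrix is the rank-1 matrix that reappears in Exercise 5.1; its determinant vanishes, so the criterion fails and the associated algebra is infinite-dimensional.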

5. These generators generate the (derived) subalgebra $g'(A) = [g(A), g(A)]$, and $g(A) = g'(A) + \mathcal{H}$ ($g(A) = g'(A)$ if and only if $\det A \neq 0$).

Definition 5.1.6: A generalized Cartan matrix $A$ is Euclidean if it is indecomposable, symmetrizable, singular and every proper principal submatrix is of finite type. The infinite-dimensional Kac-Moody Lie algebras associated with Euclidean generalized Cartan matrices are called affine Lie algebras. These algebras are characterized by their different types of realizations (non-isomorphic realizations), and accordingly their Cartan matrices, although of the same order, differ in row and column contents. Evidently Dynkin diagrams play an important role in their characterization (see Exercise 1). In the next section we shall give an explicit construction of affine algebras from a different perspective. But before that we have to define another important object which will be used later.

Definition 5.1.7: The canonical mapping $\tilde{g}(A) \to g(A)$ preserves the direct sum decomposition $\tilde{n}_+ \oplus \mathcal{H} \oplus \tilde{n}_-$ of $\tilde{g}(A)$ in the sense that the image of $\tilde{n}_+$ ($\tilde{n}_-$), denoted $n_+$ ($n_-$), gives the (so-called) triangular decomposition of $g(A)$:

$$n_+ \oplus \mathcal{H} \oplus n_- \tag{5.1.11}$$

where $n_+$ ($n_-$) is the subalgebra of $g(A)$ generated by the $e_i$ ($f_i$). It should be noted that if $\alpha$ is a positive root, then $g_\alpha \subset n_+$, and if it is a negative root, $g_\alpha \subset n_-$. In the former case it is a linear span of elements of the form $[\ldots[[e_{i_1}, e_{i_2}], e_{i_3}] \ldots e_{i_s}]$ and in the latter it is a linear span of elements of the form $[\ldots[[f_{i_1}, f_{i_2}], f_{i_3}] \ldots f_{i_s}]$ such that

$$\alpha_{i_1} + \cdots + \alpha_{i_s} = \begin{cases} \alpha, & \text{when the element is formed by } e_i\text{'s} \\ -\alpha, & \text{when the element is formed by } f_i\text{'s.} \end{cases}$$

From here it follows that:

$$g_{\alpha_i} = \mathbb{C}e_i, \qquad g_{-\alpha_i} = \mathbb{C}f_i; \qquad g_{s\alpha_i} = 0 \ \text{ if } |s| > 1. \tag{5.1.12}$$

This further implies the important fact:

Fact 5.1.8: If $\beta \in \Delta_+ \setminus \{\alpha_i\}$, then $(\beta + \mathbb{Z}\alpha_i) \cap \Delta \subset \Delta_+$.

Finally, the universal enveloping algebra of $g(A)$ is denoted $U(g(A))$, and the corresponding triangular decomposition of $U(g(A))$ is:

$$U(n_+) \otimes U(\mathcal{H}) \otimes U(n_-) \tag{5.1.13}$$

Exercise 5.1

1. Show how you would obtain the realization $\{\mathcal{H}, \Pi, \Pi^\vee\}$ of a matrix by choosing a generalized Cartan matrix
$$\begin{pmatrix} 2 & -a \\ -b & 2 \end{pmatrix}.$$

2. Let
$$A = \begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix}$$
be a $(2 \times 2)$ Generalized Cartan Matrix (GCM). Show that the Lie algebra defined with the Cartan subalgebra based on this matrix is generated by six elements $e_0, e_1, f_0, f_1, h_0, h_1$ that satisfy:

(a) $[e_i, f_j] = \delta_{ij} h_i$, (b) $[h_0, h_1] = 0$, (c) $[h_i, e_j] = a_{ij} e_j$, (d) $[h_i, f_j] = -a_{ij} f_j$,

$$[e_i, [e_i, [e_i, e_j]]] = [f_i, [f_i, [f_i, f_j]]] = 0 \quad \text{if } i \neq j.$$

This GCM Lie algebra is the simplest version of a Kac-Moody algebra; it is denoted as $A_1^{(1)}$ by Kac. The subscript 1 in $A_1^{(1)}$ indicates the rank of the Cartan matrix in this case. In general, the Lie algebra obtained through a GCM of rank $l$ and order $(l + 1)$ and having similarly defined Lie brackets with respect to $3l + 3$ generators $(e_i, f_i, h_i)$ $(i = 0, 1, \ldots, l)$ is denoted $A_l^{(1)}$.

3. Show that every finite-dimensional compact connected Lie group $G$ can be associated to an infinite-dimensional Lie algebra via a set of smooth mappings from the circle $S^1$ to $G$. The algebra obtained in this manner is called the untwisted affine Kac-Moody algebra.

4. Show that the centre of the affine Lie algebra $g(A)$ associated to an affine type matrix $A$ is 1-dimensional and is defined by the element

$$c = \sum_{i=0}^{l} a_i^\vee \alpha_i^\vee$$

where $\alpha_i^\vee \in \Pi^\vee \subset \mathcal{H}$ (the system of simple coroots), and $a_i^\vee$ corresponds to $a_i \in \delta = (a_0, a_1, \ldots, a_l)$, the unique vector coming from the Dynkin diagram $S(A)$ of the affine type GCM. The $a_i$'s are all positive and relatively prime (the $a_i^\vee$ can be viewed as the dual of $a_i$ via a non-degenerate symmetric bilinear $\mathbb{C}$-valued form $(\,|\,)$ on $\mathcal{H}$). More simply, $a_i^\vee$ $(i = 0, \ldots, l)$ refers to the Dynkin diagram $S({}^tA)$ of the dual algebra. $S({}^tA)$ is obtained from $S(A)$ by reversing the directions of all arrows but keeping the same 'number' (label) for vertices. The element $c$ is called the canonical central element of $g(A)$.

Hints to Exercise 5.1

1. Note that the matrix
$$\begin{pmatrix} 2 & -a \\ -b & 2 \end{pmatrix}$$
is a generalized Cartan matrix as (i) and (ii) of Def. (5.1.1) are satisfied. To obtain the realization, we shall have to determine the vector space $\mathcal{H}$ and the indexed subsets $\Pi = \{\alpha_1, \alpha_2\} \subset \mathcal{H}^*$ and $\Pi^\vee = \{\alpha_1^\vee, \alpha_2^\vee\} \subset \mathcal{H}$. By property (iii) of Def. (5.1.3) the dimension of $\mathcal{H}$ will be 2 or 3 according as the matrix is non-singular ($ab \neq 4$) or singular ($ab = 4$). In the latter case the rank of the matrix is 1, and accordingly $\dim \mathcal{H} = 2 \times 2 - 1 = 3$. Assume that we are in the singular case with $a = b$; then we choose $\mathcal{H}$ to be $\mathbb{C}^3$, $\alpha_1, \alpha_2$ the first 2 (linearly independent) coordinate functions and $\alpha_1^\vee, \alpha_2^\vee$ the first two rows of the augmented matrix:

$$\begin{pmatrix} 2 & -a & 0 \\ -a & 2 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$

Next we have to check that (i) and (ii) of Def. (5.1.3) are satisfied. Property (i) is obvious since the rows $\alpha_1^\vee = (2, -a, 0)$ and $\alpha_2^\vee = (-a, 2, 1)$ are linearly independent, as are the coordinate functions $\alpha_1, \alpha_2$ by definition. Property (ii), which states $\langle \alpha_i^\vee, \alpha_j \rangle = a_{ij}$, is also immediate as $\alpha_j$ (the $j$th coordinate function) assigns to $\alpha_i^\vee$ the $j$th entry of that row. Thus

$\langle \alpha_1^\vee, \alpha_1 \rangle = \langle \alpha_1 : (2, -a, 0) \rangle = 2 = a_{11}$
$\langle \alpha_1^\vee, \alpha_2 \rangle = \langle \alpha_2 : (2, -a, 0) \rangle = -a = a_{12}$
$\langle \alpha_2^\vee, \alpha_1 \rangle = \langle \alpha_1 : (-a, 2, 1) \rangle = -a = a_{21}$
$\langle \alpha_2^\vee, \alpha_2 \rangle = \langle \alpha_2 : (-a, 2, 1) \rangle = 2 = a_{22}$.

When the matrix is non-singular, the vector space $\mathcal{H}$ is 2-dimensional and $\alpha_1^\vee$ and $\alpha_2^\vee$ are respectively $(2, -a)$, $(-b, 2)$. The above equalities can easily be verified for the pairs $(\alpha_i^\vee, \alpha_j)$.

2. To begin with we note that the matrix $A$ is of rank 1 and hence its Cartan subalgebra will differ from that of a 2-rowed matrix whose determinant is non-zero. We construct the generators using a process that holds good in the general case also (see, for instance, Sec. 2). Let $E_{i,j}$ $(i, j = 1, 2, \ldots, n)$ stand for the $(n \times n)$ matrix which has $(i, j)$ entry 1 and 0 elsewhere. Write

(i) $E_0 = E_{n,1}$, $\quad E_i = E_{i,i+1}$, $\quad i = 1, 2, \ldots, n-1$
(ii) $F_0 = E_{1,n}$, $\quad F_i = E_{i+1,i}$, $\quad i = 1, 2, \ldots, n-1$
(iii) $H_0 = E_{n,n} - E_{1,1}$, $\quad H_i = E_{i,i} - E_{i+1,i+1}$, $\quad i = 1, 2, \ldots, n-1$

then since $n = 2$, we have:

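(i') $E_0 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, $\quad E_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$

(ii') $F_0 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $\quad F_1 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$

(iii') $H_0 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, $\quad H_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

Define

(iv) $e_0 = E_0 \otimes t$, $\quad f_0 = F_0 \otimes t^{-1}$, $\quad e_1 = E_1 \otimes 1$, $\quad f_1 = F_1 \otimes 1$, $\quad h_0 = H_0 \otimes 1 + c$, $\quad h_1 = H_1 \otimes 1$

and let the Lie bracket for the elements $e_i, f_j$ $(i, j = 0, 1)$, with $e_i = E_i \otimes t^k$ and $f_j = F_j \otimes t^l$, be given by the rule:

(v) $[e_i, f_j] = [E_i, F_j] \otimes t^{k+l} + k\,\delta_{k+l,0}\,\mathrm{Tr}(E_i F_j)\,c$, $\quad (k, l \in \mathbb{Z})$*.

* From (iv) it is evident that $k$, $l$ take only the values $-1, 0, 1$.

(The element $c$ in (v) and in $h_0$ is the central element of the Lie algebra formed by $E_i$, $F_i$, $H_i$.) In order to show that the generators defined in (iv) satisfy the equalities given in the exercise, we first show that $[e_i, f_j] = \delta_{ij} h_j$. Now

$$[e_0, f_0] = [E_0, F_0] \otimes t^{1+(-1)} + 1 \cdot \delta_{1+(-1),0}\,\mathrm{Tr}(E_0 F_0)\,c = H_0 \otimes 1 + c = h_0.$$

Similarly $[e_1, f_1] = [E_1, F_1] \otimes t^{0+0} = H_1 \otimes 1 = h_1$. (The second term does not appear since $k = 0$.) Two more equalities similar to (v) will be formed by the pairs $(h_i, e_j)$ and $(h_i, f_j)$; for instance we shall have: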

(vi) $[h_i, e_j] = [H_i, E_j] \otimes t^{k+l} + k\,\delta_{k+l,0}\,\mathrm{Tr}(H_i E_j)\,c$

(vii) $[h_i, f_j] = [H_i, F_j] \otimes t^{k+l} + k\,\delta_{k+l,0}\,\mathrm{Tr}(H_i F_j)\,c.$

Consider the case $i = 0$ and $j = 1$ in (vi) to write:

(viii) $[h_0, e_1] = [H_0, E_1] \otimes t^{0+0} = -2\,(E_1 \otimes t^0) = -2e_1.$

Other equalities arising from (vi), (vii) for the remaining combinations can be easily verified. Using the same Lie bracket relation for $(h_0, h_1)$ as given in (v), etc., we note that $[h_0, h_1] = 0$. Finally, the fact that the brackets $[e_i, [e_i, [e_i, e_j]]]$ equal zero can be established by suitable substitutions. Note that the Cartan subalgebra is generated by $(h_0, h_1)$.

3. Let $\Phi$ denote the set of smooth mappings

(i) $\phi : S^1 = \{z \in \mathbb{C} \mid |z| = 1\} \to G$

such that $z \mapsto \phi(z) \in G$, the compact connected Lie group. For $\phi_1, \phi_2 \in \Phi$ the group operation in $G$ gives $\phi_1 \cdot \phi_2(z) = \phi_1(z) \cdot \phi_2(z)$, and thus defines a group structure on $\Phi$. Obviously $\Phi$ is infinite-dimensional, since there are infinitely many smooth maps from $S^1$ to $G$. The group $\Phi$ is called the loop group of $G$. Now from Exp. (4.1.7) we know that every finite-dimensional Lie group $G$ admits a Lie algebra $g$, whose basis vectors $\{T^a\}$, $1 \leq a \leq \dim g$, satisfy (see Ftn. 6):

(ii) $[T^a, T^b] = i f^{ab}{}_{c}\, T^c$

where $f^{ab}{}_{c}$ are the structure constants of $g$. Moreover, the $T^a$'s are generators of $G$ (which as we know, is connected), hence an element of $G$ can be written as

(iii) $\exp(-i T^a \theta_a)$

where $\theta_a$ $(1 \leq a \leq \dim g)$ are the group parameters. From (i), the parameters $\theta_a$ can be viewed as functions on the unit circle $|z| = 1$. Using these functions as well as (iii), a typical element of $\Phi$ can be expressed as:

(iv) $\phi(z) = \exp(-i T^a \theta_a(z))$.

Also using the Laurent expansion $\sum_{n=-\infty}^{\infty} \theta_a^{-n} z^n$ for $\theta_a(z)$, $\phi(z)$ can be re-written:

(v) $\phi(z) = \exp\Big[-i \sum_{n=-\infty}^{\infty} T^a\, \theta_a^{-n} z^n\Big]$

Thus if we write $T^a_n = T^a z^n$, we have an infinite set of generators $\{T^a_n\}$ and an infinite set of parameters $\theta_a^{-n}$ for $\Phi$. In terms of these, (v) becomes:

(vi) $\phi(z) \approx 1 - i \sum_{n} T^a_n\, \theta_a^{-n}$

near the identity. Since $\phi \in \Phi$, we have in the process obtained the Lie algebra $\hat{g}$ for $\Phi$, whose elements satisfy the following Lie bracket relation:

(vii) $[T^a_m, T^b_n] = i f^{ab}{}_{c}\, T^c_{m+n}$.

6. Note that our notation for structure constants and basis vectors here is different from Chapter 4. The inclusion of $i$ in the equality is to facilitate the description of mappings. See also super Lie groups and super Lie algebras for the convention of $i$ in Chapter 7 App.

Note that by choosing $n = 0$ in the set $\{T^a_n\}$ we retrieve the Lie algebra $g$, in the sense that the subalgebra formed by $\{T^a_0\}$ is isomorphic to $g$. The Lie algebra $\hat{g}$ obtained by considering the group $\Phi$ of maps $S^1 \to G$ is the untwisted affine Kac-Moody algebra (see Ftn. 7).

4. See Prop. (1.6), Theorem (4.8) and Sec. 6.2 in [8(a)] for the proof.
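The matrix computations in Hint 2 can be checked mechanically. The sketch below is an illustration only, written under stated assumptions: an element is stored as a pair (dictionary from powers of t to 2 x 2 integer matrices, coefficient of c), the bracket is rule (v) of the hint, and the matrix A is taken to be the rank-1 matrix with entries 2 and -2. All function and variable names are hypothetical.

```python
def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(x, y):
    xy, yx = mat_mul(x, y), mat_mul(y, x)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(xy, yx)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


ZERO = [[0, 0], [0, 0]]


def bracket(u, v):
    """[X t^k, Y t^l] = [X, Y] t^(k+l) + k * delta(k+l, 0) * Tr(XY) * c.

    The central parts of u and v are ignored: c commutes with everything.
    """
    terms, central = {}, 0
    for k, X in u[0].items():
        for l, Y in v[0].items():
            old, new = terms.get(k + l, ZERO), commutator(X, Y)
            terms[k + l] = [[a + b for a, b in zip(r1, r2)]
                            for r1, r2 in zip(old, new)]
            if k + l == 0:
                central += k * trace(mat_mul(X, Y))
    return terms, central


def scale(u, s):
    return {k: [[s * a for a in row] for row in M] for k, M in u[0].items()}, s * u[1]


def same(u, v):
    powers = set(u[0]) | set(v[0])
    return u[1] == v[1] and all(u[0].get(p, ZERO) == v[0].get(p, ZERO) for p in powers)


E12, E21 = [[0, 1], [0, 0]], [[0, 0], [1, 0]]
H0, H1 = [[-1, 0], [0, 1]], [[1, 0], [0, -1]]
e = [({1: E21}, 0), ({0: E12}, 0)]    # e_0 = E_0 t,      e_1 = E_1
f = [({-1: E12}, 0), ({0: E21}, 0)]   # f_0 = F_0 t^(-1), f_1 = F_1
h = [({0: H0}, 1), ({0: H1}, 0)]      # h_0 = H_0 + c,    h_1 = H_1
A = [[2, -2], [-2, 2]]                # assumed matrix of Exercise 2
NOTHING = ({}, 0)


def relations_hold():
    """Check relations (a)-(d) and the triple-bracket relations of Exercise 2."""
    ok = same(bracket(h[0], h[1]), NOTHING)
    for i in range(2):
        for j in range(2):
            ok = ok and same(bracket(e[i], f[j]), h[i] if i == j else NOTHING)
            ok = ok and same(bracket(h[i], e[j]), scale(e[j], A[i][j]))
            ok = ok and same(bracket(h[i], f[j]), scale(f[j], -A[i][j]))
            if i != j:
                for g in (e, f):
                    triple = bracket(g[i], bracket(g[i], bracket(g[i], g[j])))
                    ok = ok and same(triple, NOTHING)
    return ok


if __name__ == "__main__":
    print(relations_hold())
```

Integer arithmetic keeps every comparison exact, so no tolerance is needed.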

2 AFFINE ALGEBRAS: AN INTRODUCTION

We devote this section to affine algebras and the next section to their representations. In many respects these algebras can be considered as analogues of simple finite-dimensional Lie algebras. As already mentioned in the introduction, the present version of the subject originated from Kac-Moody algebras Ad. [18]. The subject has grown tremendously during the past two decades due to the applicability of these algebras. They provide a powerful and natural framework for the study of unified field theories on the physics side, and make an effective tool for learning the theory of infinite-dimensional Lie groups (via their Lie algebras) on the mathematics side. While Kac and Moody defined these algebras through realizations of generalized Cartan matrices (as we have already seen in Sec. 1), others like J. Lepowsky and I. Frenkel developed them as generalizations of simple Lie algebras using the so-called vertex representation. Many of the original research papers and survey articles that show the progress of the subject from a mathematical point of view, as well as its relation to physical systems, e.g., dual resonance models, 2-dimensional conformally invariant structures, and the Boson-Fermion correspondence in quantum field theory, can be found in two readable texts (V. Kac [8(a)], and P. Goddard and D. Olive [6]). Our presentation in this section is based on the formulation of the theory given by Frenkel and Kac, Goddard and Olive, and Lepowsky and Wilson. We define an affine algebra in a general form and deduce from it other affine algebras, e.g., the Heisenberg algebra, the Cartan algebra, the loop algebra and the Virasoro algebra. Finally we show that every affine algebra has a connection with a Kac-Moody algebra.

2.1 Construction of Affine Algebra

Definition 5.2.1: Let $\mathbb{C}[t, t^{-1}]$ be the algebra of Laurent polynomials in the indeterminates $t$ and $t^{-1}$ over $\mathbb{C}$, and let $g$ be a complex simple finite-dimensional Lie algebra, which carries a non-zero invariant bilinear form $(\,,)$. The infinite-dimensional algebra $\hat{g}$ formed by the vector space

$$\mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} g \oplus \mathbb{C}c \tag{5.2.1}$$

with the Lie bracket*

$$[x \otimes t^n, y \otimes t^m] = [x, y] \otimes t^{n+m} + n\,\delta_{n,-m}\,(x, y)\,c \tag{5.2.2}$$

where $x, y \in g$, $n, m \in \mathbb{Z}$, and $c \in$ the centre of $\hat{g}$, is called the affine (Euclidean) algebra associated with $g$ (see Frenkel in [6]). If $g$ is replaced in (5.2.1) by its Cartan subalgebra $\mathcal{H}$, the resulting subalgebra

$$\mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} \mathcal{H} \oplus \mathbb{C}c = \hat{g}_{\mathcal{H}} \tag{5.2.3}$$

is called the Heisenberg subalgebra (see Ftn. 8) of $\hat{g}$. The subalgebras:

$$\mathcal{H} \oplus \mathbb{C}c = \hat{\mathcal{H}} \tag{5.2.4)(a}$$

and

$$1 \otimes g \tag{5.2.4)(b}$$

are called the Cartan and scalar subalgebras of $\hat{g}$. Note that the latter subalgebra can be identified with $g$, though $g$ is not a subalgebra of $\hat{g}$ in the formal sense. The quotient algebra

$$\hat{g}/\mathbb{C}c = g \otimes \mathbb{C}[t, t^{-1}] = g_0 \tag{5.2.5}$$

is called the loop algebra associated with $g$. This nomenclature is self-evident since the loop algebra can be realized as an algebra of $g$-valued functions on the unit circle ($|z| = 1$) with finite Fourier series, and with Lie bracket acting pointwise [8(b)] (see also Hint to Exercise (1.3)).

7. See Sec. (3.5) of Goddard and Olive in [6] for the construction of the twisted affine Kac-Moody algebra. A twisted affine Kac-Moody algebra can be defined with the help of a compact finite-dimensional Lie algebra $g$ and an automorphism $\sigma$ of $g$ which is of finite order. We denote it as $\hat{g}_\sigma$, and emphasize that this 'twisting' is related to 'symmetry breaking.'
* Very often $x \otimes t^k = t^k \otimes x$ is denoted as $x(k)$. (See for instance Eq. (5.2.6)).
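The central term in the bracket (5.2.2) is exactly what is needed for the Jacobi identity to survive. As a hedged illustration (not taken from the text), the sketch below takes $g = sl(2)$ with basis e, f, h, assumes the trace form (e, f) = 1, (h, h) = 2 as the invariant bilinear form, and represents an element of the affine algebra as a dictionary whose keys are either a pair (basis letter, power of t) or the string "c". All names are hypothetical.

```python
import random
from collections import defaultdict

# sl(2) brackets in the basis e, f, h: [h,e] = 2e, [h,f] = -2f, [e,f] = h
BRACKET = {
    ("h", "e"): {"e": 2}, ("e", "h"): {"e": -2},
    ("h", "f"): {"f": -2}, ("f", "h"): {"f": 2},
    ("e", "f"): {"h": 1}, ("f", "e"): {"h": -1},
}
# assumed invariant form (trace form of the defining 2x2 matrices)
FORM = {("e", "f"): 1, ("f", "e"): 1, ("h", "h"): 2}


def add(*elems):
    out = defaultdict(int)
    for elem in elems:
        for key, value in elem.items():
            out[key] += value
    return {key: value for key, value in out.items() if value != 0}


def affine_bracket(u, v):
    """[x t^n, y t^m] = [x,y] t^(n+m) + n * delta(n+m, 0) * (x,y) * c."""
    out = defaultdict(int)
    for ku, cu in u.items():
        for kv, cv in v.items():
            if ku == "c" or kv == "c":
                continue                      # c is central
            (x, n), (y, m) = ku, kv
            for z, s in BRACKET.get((x, y), {}).items():
                out[(z, n + m)] += cu * cv * s
            if n + m == 0:
                out["c"] += cu * cv * n * FORM.get((x, y), 0)
    return {key: value for key, value in out.items() if value != 0}


def jacobi_defect(u, v, w):
    """[u,[v,w]] + [v,[w,u]] + [w,[u,v]]; the empty dict means zero."""
    return add(affine_bracket(u, affine_bracket(v, w)),
               affine_bracket(v, affine_bracket(w, u)),
               affine_bracket(w, affine_bracket(u, v)))


def random_element(rng, n_terms=3):
    elem = {"c": rng.randint(-3, 3)}
    for _ in range(n_terms):
        elem[(rng.choice("efh"), rng.randint(-2, 2))] = rng.randint(-3, 3)
    return elem


if __name__ == "__main__":
    rng = random.Random(0)
    u, v, w = (random_element(rng) for _ in range(3))
    print(jacobi_defect(u, v, w))
```

Dropping the factor `n` in the central term (or replacing the form by a non-invariant one) makes `jacobi_defect` return a non-zero multiple of c for suitable inputs, which is a quick way to see why (5.2.2) has the shape it does.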

2.2 Derivations and the Affine Algebra

Let $x(n)$ stand for $x \otimes t^n$, and let $d(n) = t^{n+1}\frac{d}{dt}$ be the derivation of the algebra of Laurent polynomials $\mathbb{C}[t, t^{-1}]$. Then assuming that $d(n)$ acts trivially on $\mathbb{C}c$, the derivation can be extended to the whole of $\hat{g}$ by setting

$$[d(n), x(m)] = m\,x(n + m), \qquad n, m \in \mathbb{Z},\ x \in g \tag{5.2.6}$$

Also using $d(n)$ (for every $n$), a semi-direct product vector space

$$\tilde{g} = \hat{g} \oplus \mathbb{C}d(n) \tag{5.2.7}$$

can be defined, and on $\tilde{g}$ the bilinear form $(\,,)$ of $g$ can be extended to $(\,,)_n$ as follows:

$$\big(x(m) + \alpha_0 c + \alpha_1 d(n),\ y(m') + \beta_0 c + \beta_1 d(n)\big)_n = \delta_{m+m',\,n}\,(x, y) + \alpha_0\beta_1 + \alpha_1\beta_0 \tag{5.2.8}$$

where $x, y \in g$, $m, m', n \in \mathbb{Z}$, $\alpha_0, \alpha_1, \beta_0, \beta_1 \in \mathbb{C}$.

We note that the set of derivations $\{d(n)\}$ defines a Lie algebra with Lie bracket:

$$[d(n), d(m)] = (m - n)\,d(n + m). \tag{5.2.9}$$

The equality (5.2.9) is self-evident in view of the commutation relation between $t^{n+1}\frac{d}{dt}$ and $t^{m+1}\frac{d}{dt}$. It is easy to note that the requirements on the Lie bracket (as given in Def. (4.1.3)) for the definition of a Lie algebra are met in this particular case. We shall denote this Lie algebra as $\mathcal{D}$. For every arbitrary element

$$d = \sum_k a_k\,d(k) \qquad (a_k \in \mathbb{C},\ k \in \mathbb{Z})$$

in $\mathcal{D}$, the bilinear form $(\,,)_d = \sum_k a_k\,(\,,)_k$ can be defined. In particular when $d = d(0)$, we denote $\hat{g} + \mathbb{C}d(0) = \hat{g} + \mathbb{C}d$ as $\tilde{g}$ and note that the above construction helps in finding the root system of affine algebras; in other words it enables us to write down a root decomposition of $\hat{g}$ and $\tilde{g}$. The root decompositions of $\tilde{g}$ and $\hat{g}$ can formally be written as:

$$\text{(a)}\ \ \tilde{g} = \tilde{\mathcal{H}} \oplus \sum_{\tilde\alpha \in \tilde\Delta} \tilde{g}_{\tilde\alpha}; \qquad \text{(b)}\ \ \hat{g} = \hat{\mathcal{H}} \oplus \sum \hat{g}_{\tilde\alpha} \tag{5.2.10}$$

We obtain these decompositions in (5.2.13) and (5.2.15). In Subsection (2.4) we shall also see that $\tilde{g}$ and $(\,,)_n$ can be used to define the Casimir element of $\hat{g}$.

8. See Sec. 4 for explicit construction.
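The bracket (5.2.9) can be confirmed by letting the operators $d(n) = t^{n+1}\,d/dt$ act on Laurent polynomials. The following is a small sketch under the assumption that a Laurent polynomial is stored as a dictionary from exponents to coefficients; the helper names are hypothetical.

```python
def clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {m: a for m, a in poly.items() if a != 0}


def d(n, poly):
    """Apply d(n) = t^(n+1) d/dt: the monomial t^m is sent to m * t^(m+n)."""
    out = {}
    for m, a in poly.items():
        out[m + n] = out.get(m + n, 0) + m * a
    return clean(out)


def subtract(p, q):
    out = dict(p)
    for m, a in q.items():
        out[m] = out.get(m, 0) - a
    return clean(out)


def scale(p, s):
    return clean({m: s * a for m, a in p.items()})


def commutator(n, m, poly):
    """The operator [d(n), d(m)] applied to poly."""
    return subtract(d(n, d(m, poly)), d(m, d(n, poly)))


if __name__ == "__main__":
    sample = {-2: 3, 0: 5, 1: -1, 4: 2}     # 3 t^-2 + 5 - t + 2 t^4
    n, m = 2, -1
    print(commutator(n, m, sample) == scale(d(n + m, sample), m - n))
```

On a single monomial, `d(n, {m: 1})` returns `{m + n: m}`, which is the action behind (5.2.6).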

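2.3 The Root Decomposition of $\tilde{g}$

We have already seen (Sec. 4.3) that every finite-dimensional complex simple Lie algebra has the root decomposition (see Ftn. 9):

$$g = \mathcal{H} \oplus \sum_{\alpha \in \Delta} g_\alpha \tag{5.2.11}$$

with respect to the Cartan subalgebra $\mathcal{H}$, where $\Delta \subset \mathcal{H}^*$ (the dual of $\mathcal{H}$) is the root system, and for every $\alpha \in \Delta$, $g_\alpha$ is the 1-dimensional root space attached to $\alpha$, defined as:

$$g_\alpha = \{x \in g : [h, x] = \langle\alpha, h\rangle\,x \text{ for every } h \in \mathcal{H}\} \tag{5.2.12}$$

We also know that a subsystem of roots consisting of simple roots $\{\alpha_1, \ldots, \alpha_l\}$, where $l$ is the rank of $g$, can be fixed, and by using an order in the root system, the highest root of the system can be selected. Obviously a similar procedure has to be adopted for obtaining the root decomposition of $\tilde{g}$. We show it explicitly as follows. We note that in this case the Cartan subalgebra $\mathcal{H}$ of $g$ is replaced by $\mathcal{H} + \mathbb{C}c + \mathbb{C}d(0) = \tilde{\mathcal{H}}$. The algebra $\tilde{\mathcal{H}}$ (as required of any Cartan subalgebra) is a maximal commutative diagonalizable subalgebra in $\tilde{g}$. Returning to the root decomposition of $\tilde{g}$, we denote the dual space of $\tilde{\mathcal{H}}$ (with respect to the invariant $\mathbb{C}$-valued non-degenerate bilinear form on $\tilde{g}$) by $\tilde{\mathcal{H}}^*$ and the corresponding root system by $\tilde\Delta$. The root decomposition of $\tilde{g}$ now reads as:

$$\tilde{g} = \tilde{\mathcal{H}} \oplus \sum_{\tilde\alpha \in \tilde\Delta} \tilde{g}_{\tilde\alpha} \tag{5.2.13}$$

Next we extend any linear function $f \in \mathcal{H}^*$ to $\tilde{\mathcal{H}}^*$ (denoted as $f$ only) by setting $f(c) = f(d) = 0$, and we choose a linear function $\beta$ on $\tilde{\mathcal{H}}$ such that

$$\beta|_{\mathcal{H} + \mathbb{C}c} = 0, \qquad \beta(d) = 1. \tag{5.2.14}$$

9. Note that the Cartan subalgebra has been denoted there as $H$, and $g_\alpha$ as $L_\alpha$ (see Eq. (4.3.21)). We also note that the root system in Sec. 4.4 is $\Sigma$; the symbol $\Delta$ there is used for the Dynkin diagram.

In view of the above choice with respect to the central element $c$ and $d(0) \equiv d$, we can write the decomposition of $\hat{g}$ with respect to $\tilde{\mathcal{H}}$ as a sum of the tensor products of monomials $t^n$ and the root spaces that pertain to $\mathcal{H}$:

$$\hat{g} = \hat{\mathcal{H}} \oplus \sum (t^n \otimes_{\mathbb{C}} g_\alpha) \tag{5.2.15}$$

The pair $(n, \alpha)$ ranges over $\mathbb{Z} \times (\Delta \cup 0) \setminus (0, 0)$, and the root system of $\tilde{g}$ with respect to $\tilde{\mathcal{H}}$ is given as:

$$\tilde\Delta = \{n\beta + \alpha,\ n \in \mathbb{Z},\ \alpha \in \Delta \cup 0\} \setminus \{0\} \tag{5.2.16}$$

The root space $\tilde{g}_{\tilde\alpha}$ attached to the root $\tilde\alpha = n\beta + \alpha$, in view of (5.2.15), can be written as $t^n \otimes_{\mathbb{C}} g_\alpha$ (with $g_0 = \mathcal{H}$). Hence the multiplicity ($\dim \tilde{g}_{\tilde\alpha}$) of a root $\tilde\alpha = n\beta + \alpha \in \tilde\Delta$ is 1 if $\alpha \neq 0$ and is $l = \dim\mathcal{H}$ otherwise. A root $\tilde\alpha = n\beta + \alpha$ with $\alpha \in \Delta$ is called real, and a root $\tilde\alpha = n\beta$, $n \in \mathbb{Z} \setminus 0$, is called imaginary. For every positive root $\tilde\alpha$ in $\tilde\Delta$ there is also a corresponding root $-\tilde\alpha$ in $\tilde\Delta$. The root $\tilde\alpha$ is said to be positive if $n > 0$, or if $n = 0$ and $\alpha \in \Delta_+$. This allows $\tilde\Delta$ to be expressed as a union: $\tilde\Delta = \tilde\Delta_+ \cup (-\tilde\Delta_+)$. The bilinear form defined on $\tilde{g}$, when restricted to $\tilde{\mathcal{H}}$ and the root spaces, satisfies (see Ftn. 10):

(a) $(\,,)|_{\tilde{\mathcal{H}}}$ is non-degenerate
(b) $(\,,)|_{\tilde{g}_{-\tilde\alpha} \oplus \tilde{g}_{\tilde\alpha}}$ is non-degenerate
(c) $(\tilde{g}_{\tilde\alpha}, \tilde{g}_{\tilde\beta}) = 0$ if $\tilde\alpha + \tilde\beta \neq 0$. $\hfill$ (5.2.17)

Just as we have a root basis in the case of $g$ which is formed by simple roots, we have in this case the set of simple roots $\{\tilde\alpha_0 = \beta - \bar\alpha, \tilde\alpha_1, \ldots, \tilde\alpha_l\}$ (where $\bar\alpha$ denotes the highest root, see Ftn. 11), which forms a $\mathbb{Z}_+$-basis of $\tilde\Delta_+$. We denote this set by $\tilde\Pi$.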
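The description of the affine root system in (5.2.16) lends itself to direct enumeration. The sketch below is illustrative only: it assumes that a finite root is given by its integer coordinates in the simple roots, that an affine root $n\beta + \alpha$ is stored as the pair (n, α), and it uses the rank-2 root system with positive roots (1,0), (0,1), (1,1) as an example. Function names are hypothetical.

```python
def affine_roots(finite_roots, max_n):
    """All pairs (n, alpha) with |n| <= max_n, alpha a finite root or 0, except (0, 0)."""
    zero = (0,) * len(finite_roots[0])
    return [(n, alpha)
            for n in range(-max_n, max_n + 1)
            for alpha in list(finite_roots) + [zero]
            if not (n == 0 and alpha == zero)]


def is_real(root):
    """Real roots have a non-zero finite part; imaginary roots are n * beta."""
    return any(root[1])


def is_positive(root, positive_finite):
    n, alpha = root
    return n > 0 or (n == 0 and alpha in positive_finite)


def in_simple_basis(root, highest):
    """Coordinates of n*beta + alpha in the basis (beta - highest, alpha_1, ..., alpha_l)."""
    n, alpha = root
    return (n,) + tuple(n * t + a for t, a in zip(highest, alpha))


POSITIVE = [(1, 0), (0, 1), (1, 1)]                      # example: rank 2, highest root (1, 1)
FINITE = POSITIVE + [tuple(-x for x in r) for r in POSITIVE]
HIGHEST = (1, 1)

if __name__ == "__main__":
    roots = affine_roots(FINITE, 2)
    positive = [r for r in roots if is_positive(r, POSITIVE)]
    print(len(roots), len(positive))
    print(all(min(in_simple_basis(r, HIGHEST)) >= 0 for r in positive))
```

The last line checks, on this truncated sample, that every positive root has non-negative integer coordinates with respect to the simple roots of the affine algebra, which is the $\mathbb{Z}_+$-basis property stated above.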

2.4 Formulation of the Virasoro Algebra

Having discussed an affine algebra along with its root system, we now show how some of these affine algebras can be constructed from first principles. We illustrate this for three algebras, namely the Kac-Moody algebra, the Virasoro algebra and the Heisenberg algebra (see [6], [11]). The construction of the Kac-Moody algebra (denoted $\hat{g}$) has already been considered in the Hint to Exercise 3 of Sec. 1. We have shown there that the algebra can be obtained by using an infinite-dimensional group of smooth mappings from the unit circle $S^1$ to a compact connected Lie group $G$. In a similar manner the Virasoro algebra can be formed by considering the infinite-dimensional group of one-to-one mappings from the circle $S^1$ to $S^1$ (see Goddard and Olive in [6]). The group composition in this case is:

$$\psi_1 \circ \psi_2(z) = \psi_1(\psi_2(z))$$

where $\psi_1, \psi_2$ are smooth maps defined on the circle $S^1 = \{z \in \mathbb{C} : |z| = 1\}$. We denote this infinite-dimensional group by $\mathcal{V}$ and its Lie algebra by $\hat{v}$. It can be shown that the generators of the Lie algebra in this case are (see Ftn. 12):

$$L_n = -z^{n+1}\frac{d}{dz}, \qquad n \in \mathbb{Z} \tag{5.2.18}$$

which evidently satisfy:

$$[L_m, L_n] = (m - n)\,L_{m+n} \tag{5.2.19}$$

Now if we were to consider the semi-direct product (5.2.7):

$$\tilde{g} \equiv \hat{g} \oplus \sum_{n \in \mathbb{Z}} \mathbb{C}d(n)$$

we note that

$$\sum_{n \in \mathbb{Z}} \mathbb{C}d(n) \tag{5.2.20}$$

is a $\mathbb{Z}$-graded subalgebra of the affine algebra $\tilde{g}$, and by the very definition of $d(n)$ the algebra $\hat{v}$ can be identified with the Lie algebra formed by $\{d(n)\}$; hence the Virasoro algebra can be thought of as an algebra that can be deduced from the general affine algebra defined above. We further note that the direct sum of $\hat{g}$ (see Exercise 1.3) and $\hat{v}$ is again a Lie algebra, denoted $\hat{g} \oplus \hat{v}$. The generators of this algebra satisfy:

(a) $[T^a_m, T^b_n] = i f^{ab}{}_{c}\,T^c_{m+n}$
(b) $[L_m, T^a_n] = -n\,T^a_{m+n}$
(c) $[L_m, L_n] = (m - n)\,L_{m+n}$ $\hfill$ (5.2.21)

Due to the importance of the Virasoro algebra in string theory, we shall return to it in Exercise 5 of this section and to Virasoro operators in Section 6. The third infinite-dimensional algebra that we wish to construct is the Heisenberg algebra. We postpone this to a later section, where we formulate this algebra from first principles and show how one can define Fock spaces and other operators using the so-called Heisenberg systems.

10. Note that the suffix $d$ of $(\,,)_d$ has been suppressed here.
11. See Sec. 4.4 for definition.
12. To obtain the Lie algebra for $\mathcal{V}$ consider its faithful (i.e. injective) representation defined by its action (i) on functions $f : S^1 \to$ a vector space $V$:

(i) $D_\xi f(z) = f(\xi^{-1}(z)), \qquad \xi \in \mathcal{V}.$

For an element $\xi$ close to the identity, the equality $\xi(z) = z e^{-i\varepsilon(z)}$ gives $\xi^{-1}(z) \approx z + iz\varepsilon(z)$. Hence Taylor's series for $f$ in $D_\xi f(z) = f(\xi^{-1}(z))$ near the identity yields $D_\xi f(z) \approx f(z) + i\varepsilon(z)\,z\frac{d}{dz}f(z)$. Using the Laurent expansion $\varepsilon(z) = \sum_{n=-\infty}^{\infty} \varepsilon_n z^n$, we can introduce $L_n = -z^{n+1}\frac{d}{dz}$, $n \in \mathbb{Z}$. See also Sec. (1.4). Note that we can also use $z = \exp(i\theta)$ here, $\theta$ being the parameter on $S^1$.

2.5 The Chevalley Basis and the Casimir Element in Terms of the Chevalley Basis

In the previous section we have already been introduced to Chevalley generators and the Chevalley basis in connection with the Lie algebra $g(A)$ of a generalized Cartan matrix $A$. We show here that such a basis can be obtained for any arbitrary simple Lie algebra $g$ by using a 2-cocycle on the root lattice $Q$ of $g$. Let $g$ be a complex finite-dimensional simple Lie algebra, $\mathcal{H}$ its Cartan subalgebra, and $\Delta$ the set of roots with respect to $\mathcal{H}$. Let $\Pi = \{\alpha_1, \ldots, \alpha_l\}$ be the set of simple roots. Then in view of (4.3.20)(a) we know that for every $\alpha_i \in \Pi$ there exists a unique element $h_{\alpha_i} \in \mathcal{H}$ such that:

$$(h_{\alpha_i}, h) = \alpha_i(h), \qquad h \in \mathcal{H}$$

The elements $\{h_{\alpha_i}\}$ provide an orthonormal basis for the Cartan subalgebra $\mathcal{H}$. For notational convenience we shall sometimes denote $h_{\alpha_i}$ as $h_i$, when there is no fear of confusion.

Remark 5.2.1: Let $Q$ be the root lattice and let $(\,,)$ be an invariant bilinear form on $g$ normalized as $(\alpha, \alpha) = 2$ for $\alpha \in \Delta$. Define a bilinear function (see Ftn. 13) $\varepsilon : Q \times Q \to \{\pm 1\}$ with the following properties:

$$\begin{aligned}
\varepsilon(\alpha, \beta)\,\varepsilon(\alpha + \beta, \gamma) &= \varepsilon(\beta, \gamma)\,\varepsilon(\alpha, \beta + \gamma) \\
\varepsilon(\alpha, \beta)\,\varepsilon(\beta, \alpha) &= e^{\pi i(\alpha, \beta)} = (-1)^{(\alpha, \beta)} \\
\varepsilon(\alpha, 0) &= 1 \\
\varepsilon(\alpha, -\alpha) &= 1 \quad \text{(normalizing condition)}
\end{aligned} \tag{5.2.22}$$

Kac shows that using this bilinear function $\varepsilon$, the set of roots $\Delta$, and the algebra $\mathcal{H}$, a linearly independent set of basis vectors can be obtained which spans a vector space, and this vector space can be made into a Lie algebra $g'$ by assigning Lie bracket relations to these basis vectors. The Lie algebra $g'$ is isomorphic to $g$ and the basis obtained in this way is called the Chevalley basis of $g$ (see [5]). We give below this basis, along with the conditions that these basis vectors satisfy. For details see the above reference. For $\alpha, \beta, \gamma, \ldots \in \Delta$, let $E_\alpha, E_\beta, E_\gamma, \ldots$ denote the vectors that belong to the root spaces $g_\alpha, g_\beta, g_\gamma$ respectively and let $h_{\alpha_i} = h_i$ denote the vector for $\alpha_i \in \Pi$. Then the union $\{E_\alpha\} \cup \{h_i\}$ of these sets generates a Lie algebra if the following Lie bracket relations for arbitrary pairs belonging to the sets and their union are satisfied:

(a) $[E_\beta, E_\gamma] = 0$ $\quad$ if $\beta + \gamma \notin \Delta \cup \{0\}$
(b) $[E_\beta, E_\gamma] = \varepsilon(\gamma, \beta)\,E_{\beta+\gamma}$ $\quad$ if $\beta + \gamma \in \Delta$
(c) $[H_\beta, E_\gamma] = (\beta, \gamma)\,E_\gamma$
(d) $[H_\beta, H_\gamma] = 0$
(e) $[E_\gamma, E_{-\gamma}] = H_\gamma$ $\hfill$ (5.2.23)

13. $\varepsilon$ is called the 2-cocycle of $Q$ which defines the central extension of $Q$ in the following manner. The root lattice $Q$ of a simple finite-dimensional Lie algebra $g$ of type $A_n$, $D_n$ or $E_n$ has a central extension $T$ by the group $\mathbb{Z}/2\mathbb{Z} = \{\pm 1\}$, given as:

$$1 \to \{\pm 1\} \to T \xrightarrow{\ \pi\ } Q \to 0.$$

This is uniquely defined by the relation $a b a^{-1} b^{-1} = e^{\pi i(\alpha, \beta)}$, where $a, b \in T$, $\alpha, \beta \in Q$, and $\pi(a) = \alpha$, $\pi(b) = \beta$.

(where we have used the notation $H_\beta$, $H_\gamma$ in place of $h_{\alpha}$).

Remark 5.2.2: We would like to note here that Kac's description of the Casimir operator, which appears to be different, is essentially the same as given below, except for the fact that we have mentioned the Cartan subalgebra $\mathcal{H}$ more explicitly (see Sec. 2.5 in [8(a)]). In fact to write the Casimir element of an arbitrary affine Lie algebra (Subsec. 2.6), we use slightly different notations (see Frenkel in [6]) to show dependence on the cocycle $\varepsilon$. We use $x^\varepsilon_\alpha$ for $E_\alpha$ and identify $h_\alpha$ with $h \in \mathcal{H}$ where required. Accordingly (5.2.23)(a-e) become:

(a') $[x^\varepsilon_\beta, x^\varepsilon_\gamma] = 0$ $\quad$ if $\beta + \gamma \notin \Delta \cup \{0\}$
(b') $[x^\varepsilon_\beta, x^\varepsilon_\gamma] = \varepsilon(\gamma, \beta)\,x^\varepsilon_{\beta+\gamma}$ $\quad$ if $\beta + \gamma \in \Delta$
(c') $[h, x^\varepsilon_\alpha] = (h, \alpha)\,x^\varepsilon_\alpha$
(d') $[h, h'] = 0$
(e') $[x^\varepsilon_\gamma, x^\varepsilon_{-\gamma}] = h_\gamma$. $\hfill$ (5.2.24)

From Sec. 4 of the previous chapter we know that the Casimir element $C$ is an element of the center of the enveloping algebra of $g$ (see Eq. (4.4.16)). Using the same arguments and Eq. (5.2.24), in this case we can write it as:

$$C = \sum_{i=1}^{l} h_i^2 - 2\sum_{\alpha \in \Delta} x^\varepsilon_\alpha x^\varepsilon_{-\alpha} + 2\rho \tag{5.2.25)(a}$$

where

$$\rho = \frac{1}{2}\sum_{\alpha \in \Delta_+} \alpha \tag{5.2.25)(b}$$

When the algebra in question is a simple Lie algebra over $\mathbb{C}$ of the type $A_l^{(1)}$ (see Exercise 1.2 for $A_l^{(1)}$), then $\rho$ is zero (see Sec. 2.8 in [8(a)] and page 279 in [6] for a different (e')).

2.6 Casimir Element of $\hat{g}$

Just as we defined modules related to other Lie algebras (see Sec. 4.3 and Sec. 4.4), we can define a $\tilde{g} = \hat{g} \oplus \mathbb{C}d(k)$-module $V$ with the property that $v \in V$ can be annihilated by all $x(n)$ (for $x \in g$) when $n$ ($\in \mathbb{Z}$) is sufficiently large. Using the bilinear form $(\,,)_k$ (as given in (5.2.8)), we choose dual bases in $\tilde{g}_{\tilde\alpha}$ and $\tilde{g}_{kc - \tilde\alpha}$ (where $c$ is a central element of $\hat{g}$ and $k \in \mathbb{Z}$ is fixed). These are respectively $\{x_{\tilde\alpha, i}\}$ and $\{x_{kc - \tilde\alpha, i}\}$ for $1 \leq i \leq \dim \tilde{g}_{\tilde\alpha}$. The basis elements $x_{\tilde\alpha, i}$ and $x_{kc - \tilde\alpha, i}$ are now used to define the endomorphism $C(k)$ of the module $V$ which is called the Casimir element (see Ftn. 14) of the affine algebra:

$$C(k) = 2c\,d(k) + \sum_{i=1}^{l} h_i(0)\,h_i(k) + 2\sum_{\tilde\alpha \in \tilde\Delta_+}\ \sum_{i=1}^{\dim \tilde{g}_{\tilde\alpha}} x_{kc - \tilde\alpha, i}\,x_{\tilde\alpha, i} + 2\big(H \cdot d(k) + t^k \otimes \rho\big) \tag{5.2.26}$$

14. See (4.4) for the definition in the case of a simple algebra $g$.

Here $H$ and $\rho$ (in the fourth term) stand respectively for the Coxeter number (see Ftn. 15) and the sum (5.2.25)(b), and $h_i(k)$ ($i = 1, \ldots, l$ and $k \in \mathbb{Z}$) stands for an element of the Cartan subalgebra $\hat{\mathcal{H}}$*. From our discussions on $d(k)$, $h_i(k)$, $x_{\tilde\alpha, i}$ and the roots, it is apparent that all elements on the RHS of (5.2.26) can be viewed as operators; accordingly $C(k)$ is an operator. This can be used to define another operator:

$$l(k) = C(k) - d(k)\,(2c + 2H) \tag{5.2.27}$$

where $c \in$ centre of $\hat{g}$ and $H$ is the Coxeter number. We shall see that the above operator can be expressed in terms of Chevalley operators by using the following ordering method between two elements of $\hat{g}$ (the integers $k$ and $k'$ are arbitrary here):

$$:x(k)\,y(k'): \;=\; \begin{cases} x(k)\,y(k') & k' > k \\ \tfrac{1}{2}\big(x(k)\,y(k') + y(k')\,x(k)\big) & k = k' \\ y(k')\,x(k) & k' < k. \end{cases} \tag{5.2.28}$$

In order to express $l(k)$ in terms of the Chevalley basis, we use the above ordering method amongst these basis vectors. In view of the earlier subsection, all that we need is a set of independent vectors which can be obtained with the help of the bilinear function $\varepsilon$ given in (5.2.22). A subset of these vectors forms an orthonormal set for the Cartan subalgebra $\hat{\mathcal{H}}$ of $\hat{g}$, and the remaining ones span the root spaces $\hat{g}_{\tilde\alpha}$ of $\hat{g}$. The set in question is $\{h_i(m)\} \cup \{x^\varepsilon_\alpha(m)\}$ where, for $i = 1, \ldots, l$, $m \in \mathbb{Z}$, $h_i(m)$ denotes an element of $\hat{\mathcal{H}}$, and for $\alpha \in \Delta$, $x^\varepsilon_\alpha(m)$ is an element of $\hat{g}_{\tilde\alpha}$. The operator $l(k)$ can now be written as:

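$$l(k) = \sum_{k' \in \mathbb{Z}} \Big( \sum_{i=1}^{l} :h_i(-k')\,h_i(k' - k): \;-\; \sum_{\alpha \in \Delta} :x^\varepsilon_\alpha(-k')\,x^\varepsilon_{-\alpha}(k' - k): \Big) \tag{5.2.29}$$

2.7 Canonical Generators of the Affine Algebra $\hat{g}$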

We use the set of simple roots $\{\tilde\alpha_0, \tilde\alpha_1, \ldots, \tilde\alpha_l\}$ of $\tilde{g}$ to define the elements

$$a_{ij} = 2(\tilde\alpha_i, \tilde\alpha_j)/(\tilde\alpha_j, \tilde\alpha_j), \qquad i, j = 0, 1, \ldots, l. \tag{5.2.30}$$

15. The Coxeter number $H$ of an affine Lie algebra is the sum of the numerical labels attached to the vertices of the Dynkin diagram that is obtained from the generalized Cartan matrix pertaining to that algebra (see Exercise 1.4).
* Note that the Cartan subalgebra $\hat{\mathcal{H}}$ is different from $\tilde{\mathcal{H}}$, the subalgebra of $\tilde{g}$; there $k$ was chosen as zero.

The matrix $A = (a_{ij})$ is called the Cartan matrix of the Lie algebra $\tilde{g}$. Note that this is also the extended Cartan matrix of $g$, and as such the canonical generators $E_i, F_i, H_i$ $(i = 1, 2, \ldots, l)$ with their usual meaning, e.g., $E_i \in g_{\alpha_i}$, $F_i \in g_{-\alpha_i}$, $H_i = [E_i, F_i]$ and $\alpha_i(H_i) = 2$, can be used to define similar relations on $\tilde{g}$. Thus if we choose $E_0 \in g_{-\bar\alpha}$ and $F_0 \in g_{\bar\alpha}$ such that $\bar\alpha(H_0) = -2$ where $H_0 = [E_0, F_0]$, and set

$$\begin{aligned}
e_0 &= t \otimes E_0, & f_0 &= t^{-1} \otimes F_0, & h_0 &= 1 \otimes H_0 + c, \\
e_i &= t^0 \otimes E_i, & f_i &= t^0 \otimes F_i, & h_i &= 1 \otimes H_i, \qquad i = 1, 2, \ldots, l
\end{aligned} \tag{5.2.31}$$

then it can be verified that they satisfy (see Exercise 1.2)

$$\begin{aligned}
&[e_i, f_j] = \delta_{ij} h_i, \quad [h_i, h_j] = 0, \quad [h_i, e_j] = a_{ij} e_j, \quad [h_i, f_j] = -a_{ij} f_j, \\
&(\mathrm{ad}\,e_i)^{1 - a_{ij}}\,e_j = 0, \quad (\mathrm{ad}\,f_i)^{1 - a_{ij}}\,f_j = 0, \qquad i \neq j.
\end{aligned} \tag{5.2.32}$$

The elements $e_i, f_i, h_i$ $(i = 0, 1, 2, \ldots, l)$ are called the canonical generators; they generate the subalgebra $\hat{g}$ of $\tilde{g}$. Evidently $\hat{g}$ as a subalgebra is of codimension 1 in $\tilde{g}$. This is the (so-called) Kac-Moody Lie algebra associated with the matrix $A$. This shows that every affine algebra can be related to a Kac-Moody algebra.

2.8 The Weyl Group of $\tilde{g}$

For a real root $\tilde\alpha$ (i.e., $(\tilde\alpha, \tilde\alpha) \neq 0$), let $\tilde\alpha^\vee = 2\tilde\alpha/(\tilde\alpha, \tilde\alpha)$ denote the dual root, and let $r_{\tilde\alpha}$ be the reflection in the space $\tilde{\mathcal{H}}^*$ with respect to $\tilde\alpha$ given by:

$$r_{\tilde\alpha}(u) = u - (u, \tilde\alpha^\vee)\,\tilde\alpha, \qquad u \in \tilde{\mathcal{H}}^* \tag{5.2.33}$$

The collection $\{r_{\tilde\alpha}\}$ for $\tilde\alpha \in \tilde\Delta$ ($(\tilde\alpha, \tilde\alpha) \neq 0$) generates a group called the Weyl group of $\tilde{g}$. We denote it as $\tilde{W}$. We note that unlike the Weyl group of $g$, which is finite, the Weyl group $\tilde{W}$ is infinite. The following properties of $\tilde{W}$ can be easily checked.

Property 5.2.3:

(a) The bilinear form $(\,,)$ restricted to $\tilde{\mathcal{H}}^*$ is $\tilde{W}$-invariant.
(b) Any real root is a $\tilde{W}$-conjugate of a simple root.
(c) The line $\mathbb{C}\beta$ (with $\beta$ defined above in (5.2.14)) is the fixed point set of $\tilde{W}$.

Note that the group $\tilde{W}$ is generated by the reflections $r_{\tilde\alpha_i}$ $(i = 0, 1, 2, \ldots, l)$ that correspond to the simple roots $\tilde\alpha_i$. Denoting $r_{\tilde\alpha_i}$ by $r_i$, it can be shown that the Weyl group $W$ of the Lie algebra $g$ can be identified with the subgroup of $\tilde{W}$ generated by $r_i$ $(i = 1, 2, \ldots, l)$ (see Sec. (3.7) in [8a] and Sec. (1.6) of [5]).
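The contrast between the finite and the affine Weyl group can be seen by letting the simple reflections act on the root lattice. The sketch below is an illustration under stated assumptions: vectors are integer coordinates in the simple roots, the reflection $r_i$ is implemented by the standard rule $r_i(\alpha_j) = \alpha_j - a_{ij}\alpha_i$, and only symmetric Cartan matrices are used, so the index convention for $a_{ij}$ does not matter. Names are hypothetical.

```python
def reflect(i, vec, cartan):
    """Apply r_i to sum_j vec[j] * alpha_j, using r_i(alpha_j) = alpha_j - a_ij * alpha_i."""
    pairing = sum(cartan[i][j] * vec[j] for j in range(len(vec)))
    out = list(vec)
    out[i] -= pairing
    return tuple(out)


def apply_word(word, vec, cartan):
    """Apply r_{word[0]} r_{word[1]} ... to vec (the rightmost reflection acts first)."""
    for i in reversed(word):
        vec = reflect(i, vec, cartan)
    return vec


def orbit_under_power(word, vec, cartan, steps):
    """Return [vec, w(vec), w^2(vec), ...] for the group element w given by word."""
    seen = [vec]
    for _ in range(steps):
        vec = apply_word(word, vec, cartan)
        seen.append(vec)
    return seen


AFFINE_A1 = [[2, -2], [-2, 2]]     # rank-1 matrix of Exercise 5.1.2
FINITE_A2 = [[2, -1], [-1, 2]]     # a finite-type example for comparison

if __name__ == "__main__":
    # the sum of the two simple roots is fixed by both reflections in the affine case
    print(reflect(0, (1, 1), AFFINE_A1), reflect(1, (1, 1), AFFINE_A1))
    # r_0 r_1 never returns to its starting point in the affine case ...
    print(orbit_under_power((0, 1), (0, 1), AFFINE_A1, 3))
    # ... whereas the analogous product has order 3 in the finite case
    print(orbit_under_power((0, 1), (1, 0), FINITE_A2, 3))
```

In the affine example the product $r_0 r_1$ acts as a translation by a multiple of the fixed vector, which is why its powers are all distinct.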


Exercise 5.2

We note that some of the symbols used here differ from those of the text.

1. Let $sl(2, \mathbb C)$ denote the set of traceless $(2 \times 2)$ matrices over $\mathbb C$. Show that it is a Lie algebra $g$ with the basis:

$$h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$

and the bracket relations: (a) $[h, e] = 2e$, (b) $[h, f] = -2f$, (c) $[e, f] = h$. Show further that using $g$, an infinite-dimensional Lie algebra $\tilde g = g \otimes \mathbb C[t, t^{-1}]$ can be defined, where $\mathbb C[t, t^{-1}]$ is the algebra of polynomials in the indeterminate $t$ and its inverse $t^{-1}$. Specify its basis and the bracket relations.

2. Show that the generalized Cartan matrix Lie algebra $\hat g$ defined in Exercise 5.1.2 (denoted there as $A_1^{(1)}$) can be identified with the direct sum of vector spaces resulting from $\tilde g$ given in Exercise 1 and the centre of $\hat g$, thus:

$$\hat g = \tilde g \oplus \mathbb C z$$

where $z = h_0 + h_1$ is in the centre, i.e., $[z, \hat g] = 0$.

3. Let $A$ be a generalized Cartan matrix of finite type and $\mathbb C[t, t^{-1}]$ denote the algebra of Laurent polynomials in $t$. Then show that the Lie algebra

$$\mathbb C[t, t^{-1}] \otimes_{\mathbb C} g(A)$$

with the Lie bracket:

$$[P \otimes x, P' \otimes x'] = PP' \otimes [x, x'] \quad (P, P' \in \mathbb C[t, t^{-1}],\ x, x' \in g(A))$$

can be identified with the Lie algebra of regular rational maps from the set of non-zero complex numbers $\mathbb C^* \to g(A)$ such that the element $\sum_i t^i \otimes x_i$ corresponds to the mapping $z \mapsto \sum_i z^i x_i$, where $z \in \mathbb C^*$.

4. Show that the following subalgebras of $\hat g$

$$n_- = \bigoplus_{\tilde\alpha \in \tilde\Delta_-} g_{\tilde\alpha}, \quad n_+ = \bigoplus_{\tilde\alpha \in \tilde\Delta_+} g_{\tilde\alpha} \quad \text{and} \quad b = \mathcal H \oplus n_+$$

are the generalizations of the maximal nilpotent and the Borel subalgebras of $g$.

Hints to Exercise 5.2

1. The bracket relations (a), (b), (c) can be easily verified; we examine (c) as an example. Writing (c) in terms of the matrices $e$ and $f$, we obtain:

$$[e, f] = ef - fe = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = h \qquad \text{(i)}$$

Any elements $x, y \in g$ will be of the form $\alpha_1 e + \alpha_2 f + \alpha_3 h$, $\beta_1 e + \beta_2 f + \beta_3 h$ where $\alpha_i$ and $\beta_i$ are complex numbers. To check that $[x, y] = -[y, x]$, we have to check that $[\alpha_i e_i, \beta_j e_j] = -[\beta_j e_j, \alpha_i e_i]$, where we have written $e_1 = e$, $e_2 = f$, $e_3 = h$ and have used repeated indices to imply summation. In view of (a), (b), (c) and linearity, the anti-commutativity is obvious. To check the Jacobi identity we write:

$$[f, [h, e]] + [h, [e, f]] + [e, [f, h]] = [f, 2e] + [h, h] + [e, 2f]$$

The RHS is zero and hence the identity holds good. Consider now the tensor product

$$g \otimes \mathbb C[t, t^{-1}] = \tilde g. \qquad \text{(ii)}$$

The basis elements of $\tilde g$ will be

$$(h \otimes t^m,\ e \otimes t^n,\ f \otimes t^p;\quad m, n, p \in \mathbb Z) \qquad \text{(iii)}$$

and the bracket relations using (a), (b), (c) can therefore be written as:

$$\text{(a')}\ [h \otimes t^m, e \otimes t^n] = 2e \otimes t^{m+n}, \quad \text{(b')}\ [h \otimes t^m, f \otimes t^p] = -2f \otimes t^{m+p}, \quad \text{(c')}\ [e \otimes t^n, f \otimes t^p] = h \otimes t^{n+p} \qquad \text{(iv)}$$

From (iv) the arbitrary elements of the Lie algebra $\tilde g$ are $(2 \times 2)$ matrices tensored with Laurent polynomials in $t$. The anti-commutativity and the Jacobi identity for (a'), (b'), (c') are easy to check. Also for any two elements $x$ and $y$ in $g$, the bracket relation:

$$[x \otimes t^m, y \otimes t^n] = [x, y] \otimes t^{m+n} \qquad \text{(v)}$$

can be easily verified using (a'), (b') and (c').

2. To establish the identification between the two descriptions of $\hat g$, we must show that there exists a correspondence between the generators of $\hat g$ and those of $\tilde g \oplus \mathbb C z$. This can be done by writing the Lie brackets in both cases and comparing them using the Hints to Exercises (5.1.2) and (5.2.1). Note that the Lie bracket for $\hat g$ in this general situation would be:

$$[x \otimes t^m, y \otimes t^n] = [x, y] \otimes t^{m+n} + m\,\delta_{m,-n}\,\mathrm{Tr}(xy)\,z \qquad \text{(i)}$$

where $x, y \in g$, $z \in$ centre of $\hat g$ and $\mathrm{Tr}(xy)$ = trace of the product of the matrices $x$, $y$. Using this for likely combinations of generators in $\hat g$ as given in the above exercise, we shall have, for example:

$$[e \otimes t^m, f \otimes t^n] = [e, f] \otimes t^{m+n} + m\,\delta_{m,-n}\,\mathrm{Tr}(ef)\,z \qquad \text{(ii)}$$

Now the second term on the RHS is non-zero only when $m = -n$. We choose $m = 1 = -n$; there is no loss of generality in this choice since $(t, t^{-1})$ can generate all powers in $\mathbb C[t, t^{-1}]$. As a consequence (ii) becomes:

$$[e \otimes t, f \otimes t^{-1}] = [e, f] \otimes t^0 + \mathrm{Tr}(ef)\,z \qquad \text{(iii)}$$

(Note that from (i) of the Hint to Exercise 1, $\mathrm{Tr}(ef) = 1$.) In view of the earlier description of $\hat g$ (given in Exercise 5.1.2) it is appropriate to write $e_0$ for $e \otimes t$ and $f_0$ for $f \otimes t^{-1}$. This gives:

$$[e_0, f_0] = h \otimes 1 + z \qquad \text{(iv)}$$

But $[e_0, f_0] = h_0$, hence it follows that $h_0 \to h \otimes 1 + z$. Having obtained the correspondences:

$$e_0 \to e \otimes t, \quad f_0 \to f \otimes t^{-1}, \quad h_0 \to h \otimes 1 + z \qquad \text{(v)}$$

we have to find those for the remaining three, $e_1, f_1, h_1$, and check that these are compatible with the set (v). One more choice for $m = -n$ is $m = -n = 0$ in (i). In this case we further choose $x$ to be $f$ and $y$ to be $e$; then (i) becomes:

$$[f \otimes t^0, e \otimes t^0] = [f, e] \otimes t^{0+0} \qquad \text{(vi)}$$

Write $e_1$ for $f \otimes t^0$ and $f_1$ for $e \otimes t^0$. This gives:

$$[e_1, f_1] = [f, e] \otimes t^0, \quad \text{i.e.,} \quad h_1 \to -h \otimes 1. \qquad \text{(vii)}$$

Note that if $z$ is replaced by $h_0 + h_1$ in (v), the above correspondence in (vii) becomes apparent. This shows the appropriateness of our choice. To examine the compatibility we substitute these values of $e_i$, $f_i$ and $h_i$ in (iv) of Exercise (5.1.2) to verify that they are satisfied. Write $h_0 = h \otimes 1 + z$ and $h_1 = -h \otimes 1$ in $[h_0, h_1]$ to obtain:

$$[h \otimes 1 + z, -h \otimes 1] = [h \otimes 1, -h \otimes 1] + [z, -h \otimes 1] = 0 \qquad \text{(viii)}$$

This shows that $h \otimes 1 + z$ and $-h \otimes 1$ belong to the Cartan subalgebra of $\tilde g \oplus \mathbb C z$. Now the LHS of (vi) in Exercise (5.1.2) can be written as either

$$[h_0, e_1] \quad \text{or} \quad [h_1, \ldots] \quad \text{or} \quad [h_1, e_0] \qquad \text{(ix)}$$

We prove just one (the first one, for instance).

$$[h \otimes 1 + z, f \otimes t^0] = [h \otimes 1, f \otimes 1] + [z, f \otimes 1] = -2f \otimes 1 + 0 \qquad \text{(x)}$$

while $[h_0, e_1] = A_{01} e_1 = -2e_1 = -2f \otimes 1$. Similarly (vii) of Exercise (5.1.2) can also be verified. Note that we have thus established a two-way (bijective) correspondence:

$$e_0 \leftrightarrow e \otimes t, \quad f_0 \leftrightarrow f \otimes t^{-1}, \quad h_0 \leftrightarrow h \otimes 1 + z, \quad e_1 \leftrightarrow f \otimes 1, \quad f_1 \leftrightarrow e \otimes 1, \quad h_1 \leftrightarrow -h \otimes 1.$$

This proves the identification between the two descriptions of $\hat g$.
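The matrix computations used in Hints 1 and 2 can be reproduced directly. The sketch below assumes the standard $2 \times 2$ matrices for $e$, $f$, $h$ and represents an element $x \otimes t^m$ by the pair `(x, m)`; the helper names `bracket` and `affine_bracket` are hypothetical, and `affine_bracket` returns the triple (matrix part, power of $t$, coefficient of $z$) of the bracket (i) of Hint 2.

```python
import numpy as np

# Standard basis of sl(2, C).
e = np.array([[0, 1], [0, 0]])
f = np.array([[0, 0], [1, 0]])
h = np.array([[1, 0], [0, -1]])

def bracket(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    return x @ y - y @ x

def affine_bracket(x, m, y, n):
    """[x (x) t^m, y (x) t^n] = [x, y] (x) t^(m+n) + m * delta_{m,-n} * Tr(xy) * z.

    Returns (matrix part, power of t, coefficient of the central element z).
    """
    central = m * int(np.trace(x @ y)) if m + n == 0 else 0
    return bracket(x, y), m + n, central

# Relations (a), (b), (c) of Exercise 1.
relations_hold = (np.array_equal(bracket(h, e), 2 * e)
                  and np.array_equal(bracket(h, f), -2 * f)
                  and np.array_equal(bracket(e, f), h))

# Jacobi identity on the triple (f, h, e), as in Hint 1.
jacobi = (bracket(f, bracket(h, e)) + bracket(h, bracket(e, f))
          + bracket(e, bracket(f, h)))

# [e0, f0] with e0 = e (x) t and f0 = f (x) t^-1, and [e1, f1] with e1 = f (x) 1, f1 = e (x) 1.
h0_matrix, h0_power, h0_central = affine_bracket(e, 1, f, -1)
h1_matrix, h1_power, h1_central = affine_bracket(f, 0, e, 0)
print(relations_hold, h0_central, h1_central)
```

With these conventions the first bracket has matrix part $h$ and central coefficient 1, and the second has matrix part $-h$ and no central term, so their sum is the purely central element $z$, in line with $z = h_0 + h_1$.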


The exercises 3 and 4 are left for the reader. The hints for solution can be found in Chapter 7 of [8(a)] for 3, and in Sec. (1.3) of [5] for 4.

3 MODULES AND REPRESENTATIONS

Having seen the connection between affine algebras and Kac-Moody algebras, it is worthwhile to study the modules and the representations for Kac-Moody algebras (to begin with). It should be apparent that the concepts of weight space decomposition, the Weyl group and representations (via modules) established in the case of finite-dimensional simple Lie algebras can be generalized to affine Lie algebras. Let $A$ be an $l \times l$ generalized Cartan matrix and $g(A)$ be the Lie algebra associated to a realization $\{\mathcal H, \Pi, \Pi^\vee\}$; then a $g(A)$-module $V$ is called $\mathcal H$-diagonalizable if

$$V = \bigoplus_{\lambda \in \mathcal H^*} V_\lambda$$

where

$$V_\lambda = \{v \in V \mid h(v) = \langle \lambda, h \rangle v \text{ for } h \in \mathcal H\} \neq 0 \qquad (5.3.1)$$

is a weight space and $\lambda \in \mathcal H^*$ is a weight. The dimension of $V_\lambda$ is called the multiplicity of the weight $\lambda$. If in addition the generators $\{e_i\}$, $\{f_i\}$ ($i = 1, \ldots, l$) are locally nilpotent on $V$ (i.e., for every $v$ in $V$ there exists a positive integer $n$ such that $(e_i)^n v = 0$, and similarly for $f_i$), then the module $V$ is called integrable (see Sec. 3.2-3.5 in [8(a)]). Let the triangular decompositions for $g(A)$ and its enveloping algebra $U(g(A))$ (see (5.1.11) and (5.1.13)) be given by

$$\text{(a)}\ g(A) = n_- \oplus \mathcal H \oplus n_+, \qquad \text{(b)}\ U(g(A)) = U(n_-) \otimes U(\mathcal H) \otimes U(n_+) \qquad (5.3.2)$$

then using the above decomposition, an important type of module can be defined as follows.

3.1 Highest Weight Modules

Definition 5.3.1: A $g(A)$-module $V$ is called a highest weight module with highest weight $\Lambda \in \mathcal H^*$ if there exists a non-zero vector $v \in V$ such that

$$\text{(a)}\ n_+(v) = 0;\ h(v) = \langle \Lambda, h \rangle v \text{ for } h \in \mathcal H; \text{ and} \qquad \text{(b)}\ U(g(A))(v) = V. \qquad (5.3.3)$$

The vector $v$ is called a "highest weight vector." It can be easily seen that in view of (5.3.2)(b) and (5.3.3)(a), Eq. (5.3.3)(b) becomes $U(n_-)(v) = V$. Hence we have:

$$V = \bigoplus_{\lambda \leq \Lambda} V_\lambda; \quad V_\Lambda = \mathbb C v; \quad \dim V_\lambda < \infty \qquad (5.3.3)\text{(c)}$$

Every two highest weight vectors are proportional. We now generalize these ideas to an affine Lie algebra $\hat g$.

Definition 5.3.2: Let $\Lambda \in \mathcal H^*$; the highest weight module $V(\Lambda)$ is an irreducible $\hat g$-module for which there exists a non-zero vector $v_0 \in V(\Lambda)$ such that

$$n_+(v_0) = 0, \quad h(v_0) = \langle \Lambda, h \rangle v_0 = \Lambda(h) v_0 \text{ for } h \in \mathcal H \qquad (5.3.4)$$

For any $\Lambda \in \mathcal H^*$, there is a unique module (up to isomorphism) of this kind; the element $\Lambda$ is called the highest weight of the module $V(\Lambda)$. As usual we have a weight decomposition of $V(\Lambda)$ with respect to $\mathcal H$ which is given as:

$$V(\Lambda) = \sum_{\mu \in \mathcal H^*} V(\Lambda)_\mu \qquad (5.3.5)$$

The subscript $\mu$ in the above decomposition ranges over the set $\left\{\Lambda - \sum_{i=0}^{l} r_i \alpha_i\right\}$ where $r_i$ is an element of the Weyl group defined in Subsection (2.7) and $\alpha_i \in \Delta_+$. The dimension of $V(\Lambda)_\mu$, denoted $m_\mu(\Lambda)$, is called the multiplicity of $\mu$, and $\mu$ is called a weight of the module $V(\Lambda)$ if $m_\mu(\Lambda) \neq 0$. Furthermore, if $\Lambda(h_i)$ ($i = 0, 1, 2, \ldots, l$) is a non-negative integer for every $i$, then $\Lambda$ is referred to as dominant. For a module $V(\Lambda)$ with dominant highest weight $\Lambda$, it is known that $\{e_i\}$ and $\{f_i\}$ are locally nilpotent.

3.2 The Basic Representation of the Affine Algebra

In order to define the so-called basic representation of $\hat g$ and the corresponding weight system denoted $P$, we define a subalgebra in $\hat g$ by setting:

$$\mathcal M = \mathbb C d + (\mathbb C[t] \otimes_{\mathbb C} g) \qquad (5.3.6)$$

By definition the basic representation of $\hat g$ is an irreducible $\hat g$-module $V$ for which there exists a non-zero vector $v_0 \in V$ such that$^{16}$

$$\mathcal M(v_0) = 0 \text{ and } c(v_0) = v_0, \quad c = h_0 + \sum_k b_k h_k, \quad b_k \in \mathbb C \qquad (5.3.7)$$

It can be verified that such a representation always exists when the module in question is a highest weight module $V(\Lambda_0)$, with the highest weight $\Lambda_0 \in \mathcal H^*$ defined by the following relation:

$$\Lambda_0(h_0) = 1, \quad \Lambda_0(h_i) = 0\ (i = 1, \ldots, l), \quad \Lambda_0(d) = 0. \qquad (5.3.8)$$

The first two equalities of (5.3.8) show that $\Lambda_0$ is a dominant weight. The weight system $P$ of the basic representation consists of the elements of the form [5]:

$$\Lambda_0 + \gamma - \left(\tfrac{1}{2}\langle \gamma, \gamma \rangle + k\right)\beta, \quad \gamma \in Q,\ k \in \mathbb Z_+ \qquad (5.3.9)$$

The multiplicity of the weight given in Eq. (5.3.9) is $p^{(n)}(k)$, the number of partitions of $k$ into parts of $n$ different objects. The partition function $p^{(n)}(k)$ satisfies the identity

$$\sum_{k \geq 0} p^{(n)}(k)\, q^k = \left[\prod_{k \geq 1} (1 - q^k)\right]^{-n} \qquad (5.3.10)$$

16. See Secs. (4.3) and (4.4).
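The multiplicities $p^{(n)}(k)$ can be tabulated by expanding the product in (5.3.10) as a power series. The sketch below is one straightforward way to do this, assuming the identity in the form $\sum_k p^{(n)}(k) q^k = \prod_{k \geq 1} (1 - q^k)^{-n}$; the function name `colored_partitions` is hypothetical.

```python
def colored_partitions(n_colors, max_k):
    """Coefficients p^(n)(0..max_k) of prod_{k>=1} (1 - q^k)^(-n), truncated at q^max_k."""
    coeffs = [1] + [0] * max_k
    for part in range(1, max_k + 1):
        for _ in range(n_colors):
            # Multiply the truncated series in place by 1 / (1 - q^part).
            for k in range(part, max_k + 1):
                coeffs[k] += coeffs[k - part]
    return coeffs

ordinary = colored_partitions(1, 6)    # n = 1 gives the usual partition numbers
two_colors = colored_partitions(2, 5)  # n = 2 counts partitions into parts of two kinds
print(ordinary, two_colors)
```

For $n = 1$ this reproduces the ordinary partition numbers $1, 1, 2, 3, 5, 7, 11$, which is a quick sanity check on the truncation.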

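The collection of $\hat g$-modules $V$ forms a category $\mathcal K$ if the following two conditions are satisfied (see also Sec. (0.6)):

$$V = \sum_\mu V_\mu \text{ where each } V_\mu = \{v \in V \mid h(v) = \mu(h) v,\ h \in \mathcal H\} \text{ is finite dimensional.} \qquad (5.3.11)\text{(a)}$$

The elements $e_i$ and $f_i$ ($i = 0, 1, 2, \ldots, l$) are nilpotent operators on $V$. $\qquad (5.3.11)\text{(b)}$

It can be verified that if $V_1$ is a submodule of a $\hat g$-module $V$, then $V \in \mathcal K$ if and only if $V_1$ and $V/V_1$ both belong to $\mathcal K$. The module of the adjoint representation also belongs to $\mathcal K$. Let $\pi$ denote a representation of $\hat g$ in $V \in \mathcal K$. For each $\gamma \in \Delta$ (the set of roots in $g$) fix an element $E_\gamma$ in the root space $g_\gamma \subset g$ such that $\gamma([E_\gamma, E_{-\gamma}]) = 2$ (see Remark (5.2.1) and Eq. (5.2.23)(e)) and denote $E_{\alpha_i}$ by $E_i$ ($i = 1, 2, \ldots, l$). For a real root $\tilde\alpha$ ($= k\beta + \gamma$) $\in \tilde\Delta$ (the set of roots in $\hat g$) we set $E_{\tilde\alpha} = t^k \otimes E_\gamma \in g_{\tilde\alpha}$, and define an element of the representation $\pi$ as follows:

$$r^\pi_{\tilde\alpha} = \exp(-\pi(E_{\tilde\alpha}))\, \exp(\pi(E_{-\tilde\alpha}))\, \exp(-\pi(E_{\tilde\alpha})) \qquad (5.3.12)$$

For $\tilde\alpha_i \in \Pi \subset \tilde\Delta$, we denote the above element $r^\pi_i$. It can be checked that the collection $\{r^\pi_i\}$ ($i = 0, 1, 2, \ldots, l$) generates a group denoted $W^\pi$. We state below three important results (see [5] for proof) based on the representation $\pi$.

Result 5.3.3: The operator $r^\pi_{\tilde\alpha}$ for every $\tilde\alpha$ is a well defined automorphism of the space $V$ with the following properties:

(a) $r^\pi_{\tilde\alpha}(V_\mu) = V_{r_{\tilde\alpha}(\mu)}$ $\qquad (5.3.13)$

where $r_{\tilde\alpha}$ is the reflection in the space $\mathcal H^*$ with respect to $\tilde\alpha$ (see (5.2.33));

(b) $(r^\pi_{\tilde\alpha})^2 |_{V_\mu} = \pm\,\mathrm{id}$.

Result 5.3.4: Let $A^\pi$ denote the subgroup of $W^\pi$ formed by the elements $(r^\pi_i)^2$ ($i = 0, 1, \ldots, l$). Then if $\ker \pi \subset \mathbb C c$, the following is true:

(a) $A^\pi$ is a normal abelian subgroup of period 2.
(b) The quotient group $W^\pi / A^\pi$ is isomorphic to $\tilde W$, the Weyl group of $\hat g$ (see Subsec. (5.2.8) for the Weyl group $\tilde W$).

Result 5.3.5: For the adjoint representation every element $r^{ad}_{\tilde\alpha}$ is an automorphism of the Lie algebra $\hat g$, which preserves the (standard) bilinear form on $\hat g$, and has in addition the following properties:

(a) $\mathcal H$ is $r^{ad}_{\tilde\alpha}$-invariant, and $r^{ad}_{\tilde\alpha}|_{\mathcal H} = r_{\tilde\alpha}$.
(b) $r^{ad}_{\tilde\alpha}(g_{\tilde\gamma}) = g_{r_{\tilde\alpha}(\tilde\gamma)}$, $\tilde\gamma \in \tilde\Delta$.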

We shall briefly return to representations in the next section after we have introduced the Heisenberg systems.


Since our final aim in this chapter is to define the vertex operators, we devote the next section to Heisenberg systems and differential operators, the main ingredients of these operators. Although we have already introduced the Heisenberg algebra in a previous section, we establish it from first principles in the next section in order to show its action on Fock spaces. In the process we shall see the Fock space from the mathematician's point of view (as used in string theory). Recall that (in physics) the state vector space of a many-body system (formed by identical particles) is called a Fock space. The fundamental postulate of the space is that the basis vectors (given in terms of Dirac's ket notation):

$$|n_1, n_2, n_3, \ldots\rangle \qquad (5.3.14)$$

which allot the eigenvalue $k_1$ (say) to $n_1$ particles, $k_2$ to $n_2$ particles, etc., constitute a complete set of orthonormal basis vectors for the system of identical particles. Note that the $n_i$ particles are identical to one another but are in a different state from the $n_j$ particles ($i \neq j$), and that $n_i$, $n_j$, characterized as occupation numbers, are in fact the eigenvalues of Hermitian operators $N_i$, $N_j$ defined in the system. The operators $N_i$, $N_j$ are known as occupation number operators. In particular the no-particle (vacuum) state and the one-particle state (in this system) are respectively:

$$\psi^{(0)} = |0, 0, 0, \ldots\rangle, \quad \psi_i^{(1)} = |0, 0, \ldots, n_i = 1, 0, 0, \ldots\rangle = |k_i\rangle \qquad (5.3.15)$$

whereas the most general state of the system is a linear combination of the kets (5.3.14). All these are elements of the Fock space.

4 HEISENBERG SYSTEMS AND DIFFERENTIAL OPERATORS

As already mentioned we devote this section to Heisenberg systems, to the action of the Heisenberg algebra$^{17}$ on Fock spaces, and to the differential operators which follow in a natural manner while obtaining a representation of this algebra on the space of polynomials $\mathbb C[t_1, t_2, \ldots, t_n, \ldots]$ in infinitely many variables.

4.1 Heisenberg Systems

It may be recalled that an infinite-dimensional Lie algebra with basis vectors $(p_i, q_i)$ and an element $c$ is a Heisenberg algebra with commutation relations$^{18}$:

$$[p_i, q_i] = c \quad (i = 1, 2, \ldots) \qquad (5.4.1)$$

This is a nilpotent algebra with centre $\mathbb C c$. In view of the above, a Heisenberg-Lie algebra can be expressed as a direct sum

$$\hat S = S \oplus \mathbb C c \qquad (5.4.2)\text{(a)}$$

where the subspace $S$ (in general, infinite dimensional) is assumed to be equipped with a non-degenerate alternate bilinear form $\psi$ which satisfies:

$$[\alpha, \beta] = \psi(\alpha, \beta)\, c, \quad \alpha, \beta \in S \qquad (5.4.2)\text{(b)}$$

Let $\mathcal H$ be the complex vector space of dimension $n$ with the non-degenerate bilinear form, and let $\Gamma \subset \mathcal H^*$ be an $n$-dimensional lattice. Construct the direct sum

$$\tilde S = \hat S \oplus \mathcal H \qquad (5.4.3)$$

and treat $\mathcal H$ as a commutative Lie algebra*; then $\tilde S$ becomes a Lie algebra. Further, viewing $\Gamma$ as a group, we define its action as automorphisms of the Lie algebra $\tilde S$ by setting:

$$T_\gamma(s \oplus h) = (s - \gamma(h) c) \oplus h, \quad s \in \hat S,\ h \in \mathcal H,\ \gamma \in \Gamma \qquad (5.4.4)$$

where evidently $T_\gamma$ is an operator on $\tilde S$. The pair $(\tilde S, \Gamma)$ is called a "Heisenberg system." On the other hand, beginning with an $n$-dimensional lattice $\Gamma$ and a real non-degenerate bilinear form $(\ ,\ )$ on it, one can construct the space

$$\mathcal H = (\Gamma \otimes_{\mathbb Z} \mathbb C)^* \qquad (5.4.5)$$

and extend the form $(\ ,\ )$ given on $\Gamma$ to it, which in turn gives $\mathcal H^* \supset \Gamma$. Setting

$$\tilde S = (\mathbb C[t, t^{-1}] \otimes_{\mathbb C} \mathcal H) \oplus \mathbb C c = S_1 \oplus \mathbb C c \qquad (5.4.6)$$

and defining a bilinear alternate form $\psi$ on $S_1$ as:

$$\psi(t^n \otimes h, t^m \otimes h') = n\, \delta_{n+m,0}\, (h, h'), \quad h, h' \in \mathcal H \qquad (5.4.7)$$

we note that $\tilde S$ is a Lie algebra with the bracket

$$[\alpha, \beta] = \psi(\alpha, \beta)\, c \qquad (5.4.8)$$

where $\alpha, \beta \in S_1$ and $[c, \tilde S] = 0$ (i.e., $c$ is a central element of $\tilde S$). This construction gives an example of the Heisenberg systems $(\tilde S, \Gamma)$ (defined above) with $\hat S$ being the Heisenberg algebra pertaining to $\mathcal H$ (see Eq. (5.2.3)). We shall denote it as $(\tilde S, \Gamma)$ and call it the Heisenberg system associated with the lattice $\Gamma$. Note that $\tilde S$ in this case is isomorphic to a direct sum of the commutative Lie algebra $\mathcal H$ defined in (5.4.5) and a Heisenberg algebra $\hat S = S \oplus \mathbb C c$, where $S$ stands for:

$$S = \sum_{n \neq 0} (t^n \otimes_{\mathbb C} \mathcal H) \qquad (5.4.9)$$

The space $S$ admits a (canonical) polarization with respect to the bilinear form $\psi$$^{19}$:

$$S = S_- \oplus S_+ = \sum_{n < 0} t^n \otimes_{\mathbb C} \mathcal H\ \oplus\ \sum_{n > 0} t^n \otimes_{\mathbb C} \mathcal H \qquad (5.4.10)$$

We shall use this polarization to define Fock spaces related to Heisenberg systems.

17. The notations to write the Heisenberg algebra differ from previous sections for obvious reasons.
18. $q_i$ denotes the position operator and $p_i$ the momentum operator.

* The $\mathcal H$ which stands for an arbitrary commutative Lie algebra here should not be confused with the Cartan subalgebra of earlier sections.
19. $h \otimes t^{-n} \in S_-$ ($h \in \mathcal H$ and $n > 0$) is usually denoted $h(-n)$.

4.2 Fock Spaces Constructed from a Heisenberg System

We now suppose that $\Gamma$ is an even lattice of rank $l$, i.e., $(\gamma, \gamma)$ is always even for $\gamma \in \Gamma$, and as such it can be thought of as being isomorphic to $\mathbb Z^l$ as a group; then we have:

Definition 5.4.1: Let $S(S_-)$ denote the symmetric algebra of $S_-$ (the negative part of $S$) and let $\mathbb C[\Gamma]$ denote the group algebra of the lattice $\Gamma$ (i.e., when it is viewed as an abelian group). The $\mathbb C$-vector space given below

$$V_\Gamma = S(S_-) \otimes_{\mathbb C} \mathbb C[\Gamma] \qquad (5.4.11)$$

is called the Fock space of the lattice $\Gamma$. Sometimes the name Fock space is used for the space $S(S_-)$ also, with the understanding that unity 1 be treated as the vacuum vector.

Remark 5.4.2: We would like to note that the group algebra $\mathbb C[\Gamma]$ has a basis $\{e^\gamma\}$ with $\gamma \in \Gamma$, and the multiplication rule in it is $e^\gamma \cdot e^{\gamma'} = e^{\gamma + \gamma'}$. We shall use this fact to define the representation of the Heisenberg system $(\tilde S, \Gamma)$ in the space $V_\Gamma$.

Definition 5.4.3: Let $\Gamma'$ be the lattice dual to $\Gamma$, i.e.,$^{20}$

$$\Gamma' = \{\beta \in \mathcal H_{\mathbb R} \mid (\alpha, \beta) \in \mathbb Z \text{ for all } \alpha \in \Gamma\}$$

and set

$$V_{\Gamma'} = S(S_-) \otimes_{\mathbb C} \mathbb C[\Gamma']$$

which carries a symmetric non-degenerate bilinear form $(\ |\ )$ satisfying:

(a) $(e^\gamma \mid e^{\gamma'}) = \delta_{\gamma, \gamma'}$ on $\mathbb C[\Gamma']$,
(b) $(h(-n) \mid h'(-m)) = n\,(h, h')\,\delta_{n,m}$ on $S_-$, and
(c) $(x_1 \cdots x_N \mid x'_1 \cdots x'_M) = \sum_{\sigma \in S_N} (x_1 \mid x'_{\sigma(1)}) \cdots (x_N \mid x'_{\sigma(N)})\, \delta_{N,M}$ on $S(S_-)$. $\qquad (5.4.12)$

Here $S_N$ denotes the symmetric group on $N$ letters. The space $V_{\Gamma'}$ is a Fock space.

Remark 5.4.4: For any element $\lambda \in \Gamma'$ we can define the vector space (denoted $V(\lambda)$):

$$V(\lambda) = S(S_-) \otimes \mathbb C[\Gamma + \lambda] \qquad (5.4.13)$$

The space $V(\lambda)$ is a subspace of $V_{\Gamma'}$; in particular $V(0) = V_\Gamma$ is a subspace of $V_{\Gamma'}$. In this manner we have constructed a family of Fock spaces starting from a Heisenberg algebra. We shall use them later to obtain representations.

20. $\mathcal H_{\mathbb R} = \Gamma \otimes_{\mathbb Z} \mathbb R$.

4.3 The Canonical Representation

In Sec. 2 we defined a 2-cocycle $\varepsilon : Q \times Q \to \{\pm 1\}$ which satisfied the properties laid down in (5.2.22), and we used $\varepsilon$ to obtain the (Chevalley) basis vectors and eventually a Lie algebra. We use this now to define a representation of a Heisenberg system in a vector space $V$ as follows.

Definition 5.4.5: Let a Heisenberg system $(\tilde S, \Gamma)$ ($\Gamma$ not necessarily an even integral lattice) and a 2-cocycle $\varepsilon : \Gamma \times \Gamma \to \{\pm 1\}$ with the properties given in (5.2.22) be given. A pair $\pi = \{\pi_1, \pi_2\}$ defines a representation (associated with $\varepsilon$) of $(\tilde S, \Gamma)$ in a vector space $V$ if $\pi_1$ is a representation of the Lie algebra $\tilde S$ in $V$, and $\pi_2$ is a projective representation of the group $\Gamma$ in $V$ such that the following two conditions are satisfied:

(a) $\pi_2(\gamma + \gamma') = \varepsilon(\gamma, \gamma')\, \pi_2(\gamma)\, \pi_2(\gamma')$
(b) $\pi_2(\gamma)\, \pi_1(a)\, \pi_2(\gamma)^{-1} = \pi_1(T_\gamma(a))$ (the consistency condition) $\qquad (5.4.14)$

where $\gamma, \gamma' \in \Gamma$, $a \in \tilde S$ and $T_\gamma$ is the automorphism given in (5.4.4). Further let $S$ be the subspace of $\hat S = S \oplus \mathbb C c$ with the decomposition (see (5.4.9)-(5.4.10)): $S = S_- \oplus S_+$. The two components of $S$ are isotropic with respect to the bilinear form $\psi$ introduced above (see Subsec. 4.1). Using this polarization of $S$ another representation of $(\tilde S, \Gamma)$ can be established in the following manner. Consider the symmetric algebra $S(S_-)$ and define the derivations on $S(S_-)$ and $\mathbb C[\Gamma]$ with respect to $S_+$, $\mathcal H$ and $\Gamma$, and use the multiplication on $S(S_-)$. For an element $u \in S_+$ a derivation $\partial_u$ of $S(S_-)$ is:

$$\partial_u(v) = \psi(u, v) \quad \text{where } v \in S_- \qquad (5.4.15)$$

whereas for an element $h \in \mathcal H$, a derivation $\partial_h$ of the group algebra $\mathbb C[\Gamma]$ is:

$$\partial_h(e^\gamma) = \gamma(h)\, e^\gamma \quad (\gamma \in \Gamma \text{ and } e^\gamma \in \mathbb C[\Gamma]) \qquad (5.4.16)$$

and for an element $\gamma' \in \Gamma$, it is:

$$\partial_{\gamma'}(e^\gamma) = (\gamma', \gamma)\, e^\gamma \qquad (5.4.17)$$

The multiplication of $S(S_-)$ by an element $x \in S_-$ is denoted $L_x$. We also define a family of operators $\{\hat T_\gamma\}$ on $\mathbb C[\Gamma]$ for every $\gamma$ in $\Gamma$ by the relation:

$$\hat T_\gamma(e^{\gamma'}) = \varepsilon(\gamma, \gamma')\, e^{\gamma + \gamma'}. \qquad (5.4.18)$$

We now use the vector space $V_\Gamma = S(S_-) \otimes_{\mathbb C} \mathbb C[\Gamma]$ obtained above to establish the required representation $\pi = \{\pi_1, \pi_2\}$ of the Heisenberg system $(\tilde S, \Gamma)$. The following definitions are made for this purpose:

$$\pi_1(u) = \partial_u \otimes 1,\ u \in S_+; \quad \pi_1(v) = L_v \otimes 1,\ v \in S_-; \quad \pi_1(c) = 1_V; \quad \pi_1(h) = 1 \otimes \partial_h,\ h \in \mathcal H; \quad \pi_2(\gamma) = \hat T_\gamma,\ \gamma \in \Gamma \qquad (5.4.19)$$

The consistency relation (5.4.14)(b) for the above definition of $\pi_1$ and $\pi_2$ can be easily verified (see Hint to Exercise 5.4.1). The representation defined above is called the canonical representation of $(\tilde S, \Gamma)$ associated with the cocycle $\varepsilon$. The following remarks and a result (the Stone-von Neumann theorem for Heisenberg systems) concerning the canonical representations are noteworthy.

Remark 5.4.6: The representation $\pi = \{\pi_1, \pi_2\}$ defined above is irreducible, i.e., there are no nontrivial subspaces of $V$ which are invariant with respect to both $\pi_1$ and $\pi_2$.

Remark 5.4.7: Any two canonical representations associated with two equivalent cocycles are equivalent.$^{21}$

21. See Ref. [5] for the proofs of the results given as Remarks (5.4.6)-(5.4.7) and Results (5.4.8)-(5.4.9), and G. Segal in [6] for projective transformations of a group.
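The projective rule (5.4.14)(a) for the operators of (5.4.18) can be tested on a small example. The sketch below assumes the rank-one lattice $\Gamma = \mathbb Z\alpha$ with $(\alpha, \alpha) = 2$ and the bimultiplicative sign $\varepsilon(m\alpha, n\alpha) = (-1)^{mn}$; this particular cocycle is an illustrative assumption, since the properties (5.2.22) are not restated here. Elements of the group algebra are stored as dictionaries `{mu: coefficient}`, and all function names are hypothetical.

```python
def epsilon(m, n):
    """Assumed bimultiplicative cocycle on Z*alpha: epsilon(m*alpha, n*alpha) = (-1)^(m*n)."""
    return -1 if (m * n) % 2 else 1

def twisted_translation(gamma, element):
    """Apply e^mu -> epsilon(gamma, mu) * e^(gamma + mu) to an element {mu: coefficient}."""
    return {gamma + mu: epsilon(gamma, mu) * coeff for mu, coeff in element.items()}

def scale(sign, element):
    """Multiply every coefficient of a group-algebra element by a scalar."""
    return {mu: sign * coeff for mu, coeff in element.items()}

def check_projective_rule(gamma, gamma_prime, element):
    """Compare T_gamma T_gamma' with epsilon(gamma, gamma') T_(gamma + gamma') on an element."""
    lhs = twisted_translation(gamma, twisted_translation(gamma_prime, element))
    rhs = scale(epsilon(gamma, gamma_prime),
                twisted_translation(gamma + gamma_prime, element))
    return lhs == rhs

sample = {mu: mu + 5 for mu in range(-3, 4)}  # arbitrary test element of C[Gamma]
rule_holds = all(check_projective_rule(g, gp, sample)
                 for g in range(-3, 4) for gp in range(-3, 4))
print(rule_holds)
```

Because $\varepsilon$ takes values in $\{\pm 1\}$, the relation checked here, $\hat T_\gamma \hat T_{\gamma'} = \varepsilon(\gamma, \gamma') \hat T_{\gamma + \gamma'}$, is equivalent to the form in which (5.4.14)(a) is written.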


Result 5.4.8: Let $\pi'$ be an irreducible representation of a Heisenberg system $(\tilde S, \Gamma)$ (with a fixed polarization $S = S_- \oplus S_+$) associated with a cocycle $\varepsilon$ in a vector space $V$. Suppose that there exists a vacuum vector $v_0 \in V$, i.e., $v_0$ is a non-zero vector satisfying:

$$\pi'(u)(v_0) = 0,\ u \in S_+; \quad \pi'(h)(v_0) = 0,\ h \in \mathcal H; \quad \text{and} \quad \pi'(c)(v_0) = v_0$$

then the representation $\pi'$ is equivalent to $\pi$.

Note that to obtain the representation $\pi$, we defined the Heisenberg algebra from first principles. We now revert to the defining equation (5.2.3) of the Heisenberg algebra which resulted from a simple Lie algebra $g$ and use the root lattice $Q$ in place of $\Gamma$; the lattice $Q$ equals its dual $Q$ (see Sec. 4.1). Accordingly the Heisenberg system is $(\tilde S, Q)$$^{22}$ where the polarization is the same as described above, and the representation that follows is given below. Recall that we defined the so-called basic representation for the affine algebra $\hat g$; we use this to write the following:

Result 5.4.9: Let $\pi_1$ be the restriction of the basic representation to the subalgebra $\tilde S$ and let $\pi_2$ denote the representation of $Q$ defined by $\pi_2(\alpha) = T^\pi_\alpha$, $\alpha \in Q$. Then the representation $(\pi_1, \pi_2)$ of the Heisenberg system $(\tilde S, Q)$ in the space $V(\Lambda_0)$ is equivalent to the canonical representation of this system associated to the cocycle satisfying (5.2.22).

We next use the Heisenberg system $(\tilde S, \Gamma)$ and the canonical polarization to define differential operators on the vector space $V = S(S_-) \otimes_{\mathbb C} \mathbb C[\Gamma]$ of this representation. We assume that the lattice $\Gamma$ is integral even. This assumption allows that the spaces $\bar S_-$ and $\bar S_+$, which are completions of $S_-$ and $S_+$, are respectively equal to $S_+^*$ and $S_-^*$ (the duals of $S_+$ and $S_-$). The bilinear form $\psi$ on $S_- \oplus S_+$ can be extended to $\bar S_- \oplus S_+$ as well as to $S_- \oplus \bar S_+$. Further, using $\bar S(S_-)$, which is the completion of the symmetric algebra $S(S_-)$, we set the vector space

$$\bar V = \bar S(S_-) \otimes_{\mathbb C} \mathbb C[\Gamma] \qquad (5.4.20)$$

which gives the canonical imbedding $V \subset \bar V$.

Definition 5.4.10: A differential operator on $V$ is a linear map from $V$ to $\bar V$.

Here are a few examples of differential operators. For any $x \in V$ the left multiplication denoted $L_x$ is a differential operator. The tensor products:

$$\partial_p \otimes 1,\ p \in S_+; \quad 1 \otimes \partial_h,\ h \in \mathcal H; \quad 1 \otimes \hat T_\gamma,\ 1 \otimes \partial_\gamma,\ \gamma \in \Gamma \qquad (5.4.21)$$

are also examples of differential operators. These operators are generally denoted (in the literature) $\partial_p$, $\partial_h$, $\hat T_\gamma$ and $\partial_\gamma$ respectively (when there is no cause for confusion). Likewise the elements $q \otimes 1$ and $1 \otimes e^\gamma$ are written simply as $q$ and $e^\gamma$. With these basic differential operators one can construct other differential operators. We shall return to these in the section on vertex operators, but first we have to familiarize ourselves with two more operators, the creation and annihilation operators. We study them in brief in the next section.

21

Note that S is g^in Sec. (5.2) (see Eq. (5.2.3)).


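Exercise 5.4

1. Let $P = \mathbb C[x_1, x_2, \ldots]$ be the algebra of polynomials in infinitely many variables $x_i$ and let $\bar P$ denote the formal completion of $P$ (i.e., the algebra of all linear combinations of (finite) monomials in the $x_i$). Suppose that $D : P \to \bar P$ is a linear map. Then show that the following three statements are valid:

(a) If $[x_i, D] = \alpha_i D$ ($\alpha_i \in \mathbb C$, $i = 1, 2, \ldots$), then

$$D = D(1) \exp\left(-\sum_i \alpha_i \frac{\partial}{\partial x_i}\right).$$

(b) If $\left[\frac{\partial}{\partial x_i}, D\right] = \beta_i D$ ($i = 1, 2, \ldots$), then

$$D(1) = c \exp\left(\sum_i \beta_i x_i\right) \quad \text{where } c \in \mathbb C.$$

(c) If $[x_i, D] = \alpha_i D$ and $\left[\frac{\partial}{\partial x_i}, D\right] = \beta_i D$, then

$$D = c \exp\left(\sum_j \beta_j x_j\right) \exp\left(-\sum_j \alpha_j \frac{\partial}{\partial x_j}\right) \quad \text{for some } c \in \mathbb C.$$

The operator $D$ given in (c) is called a vertex operator.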

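Hints to Exercise 5.4

1. Recall that a linear map $D$ can be a differential operator, e.g.,

$$\sum P_{i_1, \ldots, i_n} \frac{\partial}{\partial x_{i_1}} \cdots \frac{\partial}{\partial x_{i_n}} \quad \text{where } P_{i_1, \ldots, i_n} \in \bar P;$$

it can be a multiplication by a polynomial $p \in P$, or can be an operator $T_\alpha$ defined as:

$$T_\alpha(f(x_1, x_2, \ldots)) = f(x_1 + \alpha_1, x_2 + \alpha_2, \ldots), \quad f \in P \qquad \text{(i)}$$

where $\alpha$ stands for $(\alpha_1, \alpha_2, \ldots)$ with $\alpha_i \in \mathbb C$. Using Taylor's formula for the function $f$ on the RHS we note that

$$T_\alpha = \exp \sum_i \alpha_i \frac{\partial}{\partial x_i} \quad \text{and} \quad T_0 = 1. \qquad \text{(ii)}$$

We shall use this operator to prove (a). From (i) one finds $[x_i, T_\alpha] = -\alpha_i T_\alpha$. Replace $D$ by $DT_\alpha$: under the hypothesis of (a) we get $[x_i, DT_\alpha] = [x_i, D]\,T_\alpha + D\,[x_i, T_\alpha] = \alpha_i DT_\alpha - \alpha_i DT_\alpha = 0$, whereas the 'then part' becomes $DT_\alpha = D(1)$ (note that $(DT_\alpha)(1) = D(1)$). It therefore suffices to treat the case $\alpha = 0$, in which the proposition reads: if $[x_i, D] = 0$ for $i = 1, 2, \ldots$, then $D = D(1)$. But $D(f) = D(1) f$ for every polynomial $f$ in $P$ (by induction on the degree of $f$), hence the result.

To prove (b) we replace $D$ by $\exp\left(-\sum_j \beta_j x_j\right) D$ and obtain:

$$\left[\frac{\partial}{\partial x_i}, \exp\left(-\sum_j \beta_j x_j\right) D\right] = -\beta_i \exp\left(-\sum_j \beta_j x_j\right) D + \exp\left(-\sum_j \beta_j x_j\right) \beta_i D = 0, \qquad \text{(iii)}$$

while the 'then part' becomes $D(1) = \text{const}$. Using the same argument as in (a), it suffices to take $\beta = (\beta_1, \beta_2, \ldots)$ to be zero, and the statement becomes: if $\left[\frac{\partial}{\partial x_i}, D\right] = 0$ for $i = 1, 2, \ldots$, then $D(1) = \text{const}$. This, however, follows from the fact that $\left[\frac{\partial}{\partial x_i}, D\right] = 0$ implies $\frac{\partial}{\partial x_i}(D(1)) = 0$ ($i = 1, 2, \ldots$), i.e., $D(1) = \text{const}$; hence the statement in (b) holds good. Part (c) follows by taking together (a) and (b) and substituting the value of $D(1)$ in (a) from that of (b).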
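The commutation relation behind part (a) can be checked on explicit polynomials. The sketch below works in a single variable (a simplifying assumption) and stores a polynomial as a list of coefficients indexed by power; it verifies that the shift $T_{-\alpha} = \exp(-\alpha\, d/dx)$ satisfies $[x, T_{-\alpha}] = \alpha T_{-\alpha}$, which is the hypothesis of (a) for $D = T_{-\alpha}$. The function names are hypothetical.

```python
from math import comb

def shift(poly, a):
    """Coefficients of p(x + a), i.e. exp(a d/dx) applied to p (Taylor's formula)."""
    out = [0] * len(poly)
    for k, c in enumerate(poly):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * a ** (k - j)
    return out

def times_x(poly):
    """Multiply a polynomial by x."""
    return [0] + list(poly)

def commutator_x_with_shift(poly, a):
    """[x, T_a] p = x * T_a(p) - T_a(x * p); both sides have the same length."""
    left = times_x(shift(poly, a))
    right = shift(times_x(poly), a)
    return [l - r for l, r in zip(left, right)]

alpha = 2
p = [1, 2, 3]  # the polynomial 1 + 2x + 3x^2
lhs = commutator_x_with_shift(p, -alpha)          # [x, T_{-alpha}] p
rhs = [alpha * c for c in shift(p, -alpha)] + [0]  # alpha * T_{-alpha} p, padded
print(lhs == rhs)
```

Since $T_{-\alpha}(1) = 1$, this is consistent with the conclusion $D = D(1)\exp(-\alpha\, d/dx)$ of part (a) in the one-variable case.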

5 CREATION AND ANNIHILATION OPERATORS

Processes comparable to the present-day physical notions of creation and annihilation have been familiar in mathematics for almost three centuries. For example, integration and differentiation on a space of polynomials have the effect of creating and destroying, by adding to and subtracting from the powers of monomials. In physics they are usually associated with pair productions and removals (of a particle). We describe these operators here as they are of great use in particle as well as in string theory. Our introduction in Subsec. 2 is based on the linear harmonic oscillator of a particle dependent on one coordinate (see Chapter 5 of [24]).

5.1 Creation and Annihilation Operators on Fock Spaces

Classically, the spaces on which these operators act are Fock spaces. For instance, using the basis vector (5.3.14), we can define a creation operator:

$$a_i^\dagger\, |n_1, n_2, \ldots, n_{i-1}, n_i, n_{i+1}, \ldots\rangle \sim |n_1, n_2, \ldots, n_{i-1}, n_i + 1, n_{i+1}, \ldots\rangle \qquad (5.5.1)$$

which adds one more particle to the basis state with quantum number $k_i$. Similarly, from the definition of a Hermitian adjoint operator, it follows that an annihilation operator $a_i$ can be defined such that

$$a_i\, |n_1, n_2, \ldots, n_{i-1}, n_i, n_{i+1}, \ldots\rangle \sim |n_1, n_2, \ldots, n_{i-1}, n_i - 1, n_{i+1}, \ldots\rangle \qquad (5.5.2)$$

Thus the operator $a_i$ removes one particle from the basis state with quantum number $k_i$. Evidently the effect of these operators on the vacuum state $\psi^{(0)}$ and the one-particle state $\psi_i^{(1)}$ described in (5.3.15) would be:

$$a_i^\dagger\, \psi^{(0)} = \psi_i^{(1)}, \quad a_i\, \psi_i^{(1)} = \psi^{(0)} \qquad (5.5.3)$$

Since the vacuum state contains no particle to be destroyed, we postulate that:

$$a_i\, \psi^{(0)} = 0, \quad a_j\, \psi_i^{(1)} = 0\ (j \neq i) \quad \forall\, i, j \in \mathbb Z \setminus \{0\}. \qquad (5.5.4)$$


5.2 Hamiltonian in Terms of Creation and Annihilation Operators

With the introductory background of creation and annihilation operators given above, we define them now in the simplest form. For this purpose we use the quantized version of the Hamiltonian for the harmonic oscillator, with potential energy $V(q) = \frac{1}{2} w^2 q^2$; $w$ here is a positive constant and $q$ is the position vector of the particle. The Hamiltonian $H = \frac{1}{2} p^2 + \frac{1}{2} w^2 q^2$ (with unit mass and $p$ as the momentum) in quantized form becomes

$$H = \frac{1}{2} P^2 + \frac{1}{2} w^2 Q^2 \qquad (5.5.5)$$

where $P$ and $Q$ are operators and satisfy by assumption$^{23}$:

$$[P, Q] = -i \qquad (5.5.6)$$

We now define the creation and annihilation operators in this case as:

$$a^* = \frac{1}{\sqrt 2}(wQ - iP), \quad a = \frac{1}{\sqrt 2}(wQ + iP) \qquad (5.5.7)$$

The operators $a^*$ and $a$ satisfy the following relations:

$$\text{(i)}\ [a, a^*] = w, \qquad \text{(ii)}\ H = \frac{1}{2}(a a^* + a^* a) = a^* a + \frac{1}{2} w \qquad (5.5.8)$$

They act on the Fock space of polynomials $V = \mathbb C[a^*]$ by multiplication and differentiation; evidently the operator $a$ can be viewed as $w \frac{\partial}{\partial a^*}$.

By taking a simple example from string theory, we shall see how these operators can be constructed with ease. Let $x(t, \theta)$ denote the position of a (closed) string in the compactified space $S^1 = \mathbb R / 2\pi \mathbb Z$, where $\theta \in [0, 2\pi]$ and $t \in \mathbb R$ are the parameters; it is assumed that $x(t, 0) = x(t, 2\pi)$. Consider now the Hamiltonian:

$$H = \frac{1}{2\pi} \int_0^{2\pi} \left\{ \frac{1}{2} \left(\frac{\partial x}{\partial t}\right)^2 + \frac{1}{2} \left(\frac{\partial x}{\partial \theta}\right)^2 \right\} d\theta \qquad (5.5.9)$$

and the following Fourier expansion for $x(t, \theta)$ in order to simplify it:

$$x(t, \theta) = q_0(t) + \sum_{n=1}^{\infty} q_n(t) \sqrt 2 \cos n\theta + \sum_{n=1}^{\infty} q_{-n}(t) \sqrt 2 \sin n\theta + k\theta \qquad (5.5.10)$$

It can be checked (after some calculations, see Hint to Exercise 2) that $H$ can be written as:

$$H = \sum_{n \neq 0} \frac{1}{2}\left(\dot q_n^2 + n^2 q_n^2\right) + \frac{1}{2} \dot q_0^2 + \frac{1}{2} k^2. \qquad (5.5.11)$$

The $k$ in (5.5.10) and (5.5.11) belongs to $\mathbb Z$. Using the creation and annihilation operators $a_n^*$, $a_n$ ($n \in \mathbb Z$), the Hamiltonian can be formally expressed as:

$$H = \frac{1}{2} \sum_{n \neq 0} (a_n^* a_n + a_n a_n^*) + \frac{1}{2} P_0^2 + \frac{1}{2} K^2 \qquad (5.5.12)$$

where $P_0$ is the momentum operator defined as $P_0 = -i \frac{\partial}{\partial \theta}$ on $L^2(S^1)$, the set of square integrable functions on $S^1$, and $K$ is a position operator which acts as $K \cdot e^n = n e^n$ on $\mathbb C[\mathbb Z]$, the group algebra of $\mathbb Z$. The operators $a_n^*$ and $a_n$ act as multiplication and differentiation operators respectively. In fact $a_n = |n| \frac{\partial}{\partial a_n^*}$ and their Lie bracket $[a_n, a_n^*]$ equals $|n|$. This implies that $H$ of (5.5.12) can be written as:

$$H = \sum_{n \neq 0} \left(a_n^* a_n + \frac{1}{2}|n|\right) + \frac{1}{2}\left(P_0^2 + K^2\right) \qquad (5.5.13)\text{(a)}$$

From the above discussion it follows that the space of action of $H$ should be the Fock space:

$$V = \mathbb C[a_1^*, a_2^*, a_3^*, \ldots, a_{-1}^*, a_{-2}^*, a_{-3}^*, \ldots] \otimes L^2(S^1) \otimes \mathbb C[\mathbb Z]$$

but we note that $H$ defined in (5.5.13)(a) cannot be an operator on $V$, since

$$H \cdot 1 = \frac{1}{2} \sum_{n \neq 0} |n| \cdot 1 = \infty \cdot 1$$

does not converge. (Note that 1 is the vacuum vector in $V$ (see Result (5.4.8)).) This problem of non-convergence, however, can be circumvented by replacing $\sum_{n=1}^{\infty} n$ by $-\frac{1}{12}$. Here $-\frac{1}{12}$ is the value of the Riemann zeta function, the analytic continuation of $\sum_{n=1}^{\infty} \frac{1}{n^s}$, at $s = -1$ (see [12]). Thus it is the renormalized form of $H$:

$$H^{ren} = \sum_{n \neq 0} a_n^* a_n + \frac{1}{2} P_0^2 + \frac{1}{2} K^2 - \frac{1}{12} \qquad (5.5.13)\text{(b)}$$

that one uses as an operator on $V$. For more details see [14]. We shall use these ideas in what follows next.

23. Planck's constant has been taken as unity.
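The value $-\frac{1}{12}$ used in the renormalization can be reproduced exactly from Bernoulli numbers. The sketch below relies on the standard formula $\zeta(-n) = -B_{n+1}/(n+1)$ for integers $n \geq 1$, which is brought in here as an outside fact rather than derived in the text; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """First `count` Bernoulli numbers B_0, B_1, ... from sum_{j<=m} C(m+1, j) B_j = 0."""
    b = [Fraction(1)]
    for m in range(1, count):
        acc = sum(comb(m + 1, j) * b[j] for j in range(m))
        b.append(-acc / (m + 1))
    return b

def zeta_at_negative_integer(n):
    """zeta(-n) = -B_{n+1} / (n + 1) for an integer n >= 1."""
    return -bernoulli_numbers(n + 2)[n + 1] / (n + 1)

vacuum_shift = zeta_at_negative_integer(1)  # regularized value assigned to 1 + 2 + 3 + ...
print(vacuum_shift)
```

Since $\frac{1}{2}\sum_{n \neq 0} |n| = \sum_{n \geq 1} n$, this regularized value is exactly the constant that appears in (5.5.13)(b).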

5.3 Operators on the Fock Space Associated to Heisenberg System

Returning to the Heisenberg system $(S, Q)$ associated with an integral even lattice $Q$, we now define the operators on the Fock space

$$V_Q = S(S_-) \otimes_{\mathbb{C}} \mathbb{C}[Q],$$

the vector space which gives the canonical representation of the system associated with a cocycle $\varepsilon$. The creation operator for $h \in \mathcal{H}$ and $n > 0$ is denoted $h(-n)$ and equals$^{24}$:

$$t^{-n} \otimes h$$ (5.5.14)

The annihilation operator denoted $h(n)$ equals:

$$\partial_{(t^{-n} \otimes h)}, \qquad h \in \mathcal{H},\ n > 0$$ (5.5.15)

and operates as follows:

$$\partial_{(t^{-n} \otimes h)}\,(t^{-m} \otimes h') = n\,\delta_{n,m}\,(h, h')$$ (5.5.16)

$^{24}$ See footnote 19.
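The creation and annihilation operators of (5.5.14) to (5.5.16) can be tried out on a toy model. The sketch below assumes a rank-one simplification that is not in the text: a single element $h$ with $(h, h) = 1$, the lattice factor $\mathbb{C}[Q]$ dropped, and $S(S_-)$ truncated to polynomials in four variables $x_1, \ldots, x_4$, where $x_n$ stands for $h(-n)$. Polynomials are stored as dictionaries from exponent tuples to coefficients; all names are hypothetical.

```python
N_MODES = 4  # illustrative cutoff: only the oscillators x_1 .. x_4 are kept

def create(n, poly):
    """h(-n): multiply the polynomial by the variable x_n."""
    out = {}
    for exps, c in poly.items():
        e = list(exps)
        e[n - 1] += 1
        out[tuple(e)] = out.get(tuple(e), 0) + c
    return out

def annihilate(n, poly):
    """h(n): act as n * d/dx_n, the rank-one form of (5.5.16)."""
    out = {}
    for exps, c in poly.items():
        k = exps[n - 1]
        if k == 0:
            continue  # the derivative kills monomials without x_n
        e = list(exps)
        e[n - 1] -= 1
        out[tuple(e)] = out.get(tuple(e), 0) + n * k * c
    return out

def subtract(p, q):
    """Return p - q with zero coefficients removed."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) - c
    return {mono: c for mono, c in out.items() if c != 0}

def commutator(m, n, poly):
    """Apply [h(m), h(-n)] to poly; the expected result is m * delta_{m,n} * poly."""
    return subtract(annihilate(m, create(n, poly)), create(n, annihilate(m, poly)))

VACUUM = {(0,) * N_MODES: 1}  # the constant polynomial 1
```

For example, `commutator(3, 3, VACUUM)` returns three times the vacuum, while `commutator(2, 3, VACUUM)` returns the empty dictionary (the zero polynomial), which is the content of the Heisenberg relation in this simplified setting.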

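Sometimes when the element $h$ of $\mathcal{H}$ is $h_\gamma$ for $\gamma \in Q$ and $(h, h_\gamma) = \gamma(h)$, the annihilation operator $h_\gamma(n)$ is simply denoted as $\gamma(n)$. For $h \in \mathcal{H}$ we also define the operator:

$$h(0) = 1 \otimes \partial_h \quad \text{where} \quad \partial_h(e^\gamma) = \gamma(h)\,e^\gamma, \quad \gamma \in Q \text{ and } e^\gamma \in \mathbb{C}[Q]$$ (5.5.17)

For $\gamma \in Q$, two more operators can be defined:

$$\partial_\gamma \equiv 1 \otimes \partial_\gamma \quad \text{where} \quad \partial_\gamma(e^{\gamma'}) = (\gamma', \gamma)\,e^{\gamma'}$$ (5.5.18)

and

$$C_\gamma \equiv 1 \otimes C_\gamma \quad \text{where} \quad C_\gamma(e^{\gamma'}) = \varepsilon(\gamma, \gamma')\,e^{\gamma'}$$ (5.5.19)

Note that in the previous section we showed that $V_Q$ (denoted $V_\Gamma$ there) can be considered as a subspace of a larger Fock space. In fact each of those spaces, and $V$ in particular, admits a $\mathbb{Z}_+$-gradation, for instance:

$$V = \bigoplus_{n \geq 0} V_{-n}$$

This gradation follows from:

(a) $\deg(t^{-n} \otimes h) = -n, \quad n = 1, 2, \ldots$

(b) $\deg(e^\gamma) = -\frac{1}{2}(\gamma, \gamma), \quad \gamma \in Q$ (5.5.20)

On each of these $V_{-n}$'s we can define an operator known as the energy operator as follows:

$$D_0(v) = -n v \quad \text{for } v \in V_{-n}$$ (5.5.21)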

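We shall use these operators along with those introduced in the previous section to define vertex operators in the next section. We use them now to define the canonical action of the Heisenberg algebra on the Fock space $V = S(S_-) \otimes \mathbb{C}[\Gamma]$. We use $x$ to denote an arbitrary element of $S(S_-)$ and $x \otimes e^\lambda$ for an arbitrary element of $V$. The action by the Heisenberg system $(S, \Gamma)$ is indeed the action by the operators defined in (5.5.14) to (5.5.19). We thus have:

$$h(-n) \cdot x \otimes e^\lambda = (h(-n) \cdot x) \otimes e^\lambda, \quad n > 0$$ (5.5.22)

$$h(n) \cdot x \otimes e^\lambda = n\,\frac{\partial x}{\partial h(-n)} \otimes e^\lambda, \quad n > 0$$ (5.5.23)

$$h(0) \cdot x \otimes e^\lambda = x \otimes (\partial_h e^\lambda) = (h, \lambda)\,x \otimes e^\lambda$$ (5.5.24)

$$\lambda' \cdot x \otimes e^\lambda = \varepsilon(\lambda', \lambda)\,x \otimes e^{\lambda + \lambda'}$$ (5.5.25)

$$c \cdot x \otimes e^\lambda = x \otimes e^\lambda$$ (5.5.26)

where $\lambda, \lambda' \in \Gamma$ and $c$ is a central element of $S$. The above equations (considered together) imply the following:

$$\lambda' \cdot (\lambda \cdot y) = \varepsilon(\lambda', \lambda)\,(\lambda' + \lambda) \cdot y$$
$$\lambda' \cdot (h(0) \cdot (\lambda'^{-1} \cdot y)) = (\lambda'(h(0))) \cdot y$$
$$\lambda' \cdot (\lambda \cdot (\lambda'^{-1} \cdot (\lambda^{-1} \cdot y))) = \varepsilon(\lambda', \lambda)\,\varepsilon(\lambda, \lambda')^{-1}\,y$$ (5.5.27)

where $y \in V$ and $\lambda, \lambda' \in \Gamma$. This action of the Heisenberg algebra can also be defined on double Fock spaces by taking the product $(S \times S')$ between the algebra $S$ and its copy denoted $S'$. We give below a brief description of this idea. Let the prime $'$ denote the copy of an object; thus $S'_-$ is a copy of $S_-$, $\mathbb{C}[\Gamma]'$ is a copy of $\mathbb{C}[\Gamma]$, and $\mathbb{C}[\Gamma']'$ is a copy of $\mathbb{C}[\Gamma']$, where $\Gamma'$ is the lattice dual to $\Gamma$. The Fock spaces with primed objects are denoted with a bar for distinction (see [14]). Thus

(a) $\bar{V}_\Gamma = S(S'_-) \otimes \mathbb{C}[\Gamma]'$

(b) $\bar{V}_{\Gamma'} = S(S'_-) \otimes \mathbb{C}[\Gamma']'$

(c) $\bar{V}_{(\lambda)} = S(S'_-) \otimes \mathbb{C}[\Gamma + \lambda]'$, where $\lambda \in \Gamma'$. (5.5.28)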

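In fact we use only (b) to form the tensor product $V_{\Gamma'} \otimes \bar{V}_{\Gamma'}$, which we identify with$^{25}$:

$$\mathrm{Sym}(S_+ \oplus S_-) \otimes \sum_{\lambda, \mu \in \Gamma'} \mathbb{C}\,e^{(\lambda, \mu)}$$

in the following manner:

$$\lambda_1(-n_1)\,\lambda_2(-n_2) \cdots \lambda_r(-n_r)\,e^\lambda \otimes \mu_1(-m_1)'\,\mu_2(-m_2)' \cdots \mu_s(-m_s)'\,(e^\mu)' \;\mapsto\; \lambda_1(-n_1)\,\lambda_2(-n_2) \cdots \lambda_r(-n_r)\,\mu_1(m_1)\,\mu_2(m_2) \cdots \mu_s(m_s)\,e^{(\lambda, \mu)}$$ (5.5.29)

The space $U_{\Gamma'}$ defined below:

$$U_{\Gamma'} = \mathrm{Sym}(S_+ \oplus S_-) \otimes \mathbb{C}[\Gamma'] = \mathrm{Sym}(S_+ \oplus S_-) \otimes \sum_{\lambda \in \Gamma'} \mathbb{C}\,e^{(\lambda, \lambda)} \subset V_{\Gamma'} \otimes \bar{V}_{\Gamma'}$$ (5.5.30)

is a complex vector space, and is called the double Fock space of $\Gamma'$. The canonical action of the Heisenberg algebra $S \times S'$, in view of equations (5.5.22) to (5.5.26), on the space $U_{\Gamma'}$ can be written down as follows:

$^{25}$ $e^{(\lambda, \mu)}$ and $e^{(\lambda, \lambda)}$ denote the double exponential series. We use unprimed letters $\lambda$, $\mu$ etc. even though they belong to $\Gamma'$.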

(a) $h(-n) \cdot u \otimes e^{(\lambda, \mu)} = (h(-n)\,u) \otimes e^{(\lambda, \mu)}$

(b) $h(0) \cdot u \otimes e^{(\lambda, \mu)} = (h, \lambda)\,u \otimes e^{(\lambda, \mu)}$

(c) $h(n) \cdot u \otimes e^{(\lambda, \mu)} = n\,\dfrac{\partial u}{\partial h(-n)} \otimes e^{(\lambda, \mu)}$ (5.5.31)

and

(a) $h(-n)' \cdot u \otimes e^{(\lambda, \mu)} = (h(n)\,u) \otimes e^{(\lambda, \mu)}$

(b) $h(0)' \cdot u \otimes e^{(\lambda, \mu)} = (h, \mu)\,u \otimes e^{(\lambda, \mu)}$

(c) $h(n)' \cdot u \otimes e^{(\lambda, \mu)} = n\,\dfrac{\partial u}{\partial h(n)} \otimes e^{(\lambda, \mu)}$ (5.5.32)

where $u \in \mathrm{Sym}(S_+ \oplus S_-)$ and $n > 0$. Evidently the central elements $c$ and $c'$ act as identity operators on the space. The difference between the action equations (5.5.31) and (5.5.32), resulting from the Heisenberg algebra $S$ and its copy $S'$, is worth noting. In any case this was expected in view of the identifying relation (5.5.29). We note that the operator

is defined by dh(n)

- | ( ^ 7 = (m, n) | n \8n m an (n)

h(n) = h® t", j{m) = j® t'"; h,je

H

(5.5.33)

Also

$$\frac{\partial (vw)}{\partial h(n)} = w\,\frac{\partial v}{\partial h(n)} + v\,\frac{\partial w}{\partial h(n)}$$ (5.5.34)

In the above equations $m$, $n$ stand for any non-zero positive and negative integers. Although we do not use the double Fock spaces explicitly, we would like to remark that they are used in superalgebras and superstring theory, the final objective of our text (see [11] for details). We return to vertex operators in the next section.

Exercise 5.5

1. Write down the exponential analogue of (5.5.10) and obtain $\dfrac{\partial x}{\partial t}$ and $\dfrac{\partial x}{\partial \theta}$.

2. Using the Fourier expansion (5.5.10) for $x(t, \theta)$, establish the Hamiltonian $H$ as given in (5.5.11).

3. Use $(\mu_0, \ldots, \mu_r)$ to represent $\Gamma'/\Gamma$, where $\Gamma$ and $\Gamma'$ are the root lattices given in Subsec. (5.4.2). Set $\mu_0 = 0$. Then show that the Fock space $V_{\Gamma'}$ decomposes as a direct sum of $(r + 1)$ subspaces:

(a) $V_{\Gamma'} = V_{(\mu_0)} \oplus \cdots \oplus V_{(\mu_r)}$

In the case of $\Gamma$ being a root lattice, $\Gamma'$ becomes a weight lattice, and (a) gives the decomposition in terms of Fock spaces formed with the help of the weights of the algebra.

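Hints to Exercise 5.5

2. We differentiate (5.5.10) with respect to $t$ and $\theta$ to obtain:

$$\frac{\partial x}{\partial t} = \dot{q}_0 + \sum_{n=1}^{\infty} \sqrt{2}\,\dot{q}_n \cos n\theta + \sum_{n=1}^{\infty} \sqrt{2}\,\dot{q}_{-n} \sin n\theta$$ (i)

$$\frac{\partial x}{\partial \theta} = -\sqrt{2} \sum_{n=1}^{\infty} n q_n \sin n\theta + \sqrt{2} \sum_{n=1}^{\infty} n q_{-n} \cos n\theta + k$$ (ii)

Substituting these in (5.5.9) we have after some simplification:

$$H = \frac{1}{4\pi} \int_0^{2\pi} \Big[ \dot{q}_0^2 + \sum_{n=1}^{\infty} 2\dot{q}_n^2 \cos^2 n\theta + \sum_{n=1}^{\infty} 2\dot{q}_{-n}^2 \sin^2 n\theta + 2\dot{q}_0 \sum_{n=1}^{\infty} \sqrt{2}\,\dot{q}_n \cos n\theta + 2\dot{q}_0 \sum_{n=1}^{\infty} \sqrt{2}\,\dot{q}_{-n} \sin n\theta$$
$$\qquad + 4 \sum_{j=1}^{\infty} \sum_{k>j} \{(\cos j\theta \cos k\theta)\,\dot{q}_j \dot{q}_k + (\sin j\theta \sin k\theta)\,\dot{q}_{-j} \dot{q}_{-k}\} + 4 \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} \dot{q}_j \dot{q}_{-k} \cos j\theta \sin k\theta \Big]\,d\theta$$
$$\qquad + \frac{1}{4\pi} \int_0^{2\pi} \Big[ 2 \sum_{n=1}^{\infty} n^2 q_n^2 \sin^2 n\theta + 2 \sum_{n=1}^{\infty} n^2 q_{-n}^2 \cos^2 n\theta + k^2 + 2k \Big\{ \sum_{n=1}^{\infty} \sqrt{2}\,(-n q_n \sin n\theta + n q_{-n} \cos n\theta) \Big\}$$
$$\qquad - 4 \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} (j q_j \sin j\theta)(k q_{-k} \cos k\theta) + 4 \sum_{j=1}^{\infty} \sum_{k>j} \{ j q_j (\sin j\theta)\,k q_k (\sin k\theta) + j q_{-j} (\cos j\theta)\,k q_{-k} (\cos k\theta) \} \Big]\,d\theta.$$ (iii)

The first three terms of the above integral, which correspond to $\left(\dfrac{\partial x}{\partial t}\right)^2$, can be written as:

$$\frac{1}{4\pi} \int_0^{2\pi} \Big[ \dot{q}_0^2 + \sum_{n=1}^{\infty} \dot{q}_n^2 (\cos 2n\theta + 1) + \sum_{n=1}^{\infty} \dot{q}_{-n}^2 (1 - \cos 2n\theta) \Big]\,d\theta = \frac{1}{2} \dot{q}_0^2 + \frac{1}{2} \sum_{n=1}^{\infty} \dot{q}_n^2 + \frac{1}{2} \sum_{n=1}^{\infty} \dot{q}_{-n}^2$$ (iv)

as $\int_0^{2\pi} \cos 2n\theta\,d\theta = 0$. The next five terms, each of which involves derivatives of $q_n$ or $q_{-n}$ and $q_0$ with respect to $t$, are zero when integrated. Next we note that the first three terms in the square bracket corresponding to $\left(\dfrac{\partial x}{\partial \theta}\right)^2$ are the only terms which contribute to a non-zero part:

$$\frac{1}{4\pi} \int_0^{2\pi} \Big\{ \sum_{n=1}^{\infty} n^2 q_n^2 (1 - \cos 2n\theta) + \sum_{n=1}^{\infty} n^2 q_{-n}^2 (1 + \cos 2n\theta) + k^2 \Big\}\,d\theta = \frac{1}{2} \sum_{n=1}^{\infty} n^2 q_n^2 + \frac{1}{2} \sum_{n=1}^{\infty} n^2 q_{-n}^2 + \frac{k^2}{2}$$ (v)

The rest of the terms are zero. Adding (iv) and (v) we obtain (5.5.11).

3. Use Def. (5.4.3) and Eq. (5.4.12) to obtain the solution (see also Sec. 1-A in [14]).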
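The orthogonality argument of Hint 2 can be cross-checked numerically. The sketch below assumes, as read off from the derivatives (i) and (ii), that the expansion has the form $x = q_0 + k\theta + \sqrt{2}\sum_n (q_n \cos n\theta + q_{-n} \sin n\theta)$ and that the Hamiltonian density carries the prefactor $1/4\pi$; both are inferences from the hints rather than quotations of (5.5.9) and (5.5.10). The function and argument names are illustrative.

```python
import math

def hamiltonian_numeric(q0dot, k, qdot, qdot_neg, q, q_neg, samples=512):
    """(1/4pi) * integral over [0, 2pi] of (x_t^2 + x_theta^2), via a uniform Riemann sum.

    The uniform sum is exact (up to rounding) for trigonometric polynomials
    whose degree is below the number of samples.
    """
    total = 0.0
    for j in range(samples):
        th = 2.0 * math.pi * j / samples
        x_t = q0dot   # time derivative of x, as in hint (i)
        x_th = k      # theta derivative of x, as in hint (ii)
        for n in range(1, len(q) + 1):
            c, s = math.cos(n * th), math.sin(n * th)
            x_t += math.sqrt(2) * (qdot[n - 1] * c + qdot_neg[n - 1] * s)
            x_th += math.sqrt(2) * n * (-q[n - 1] * s + q_neg[n - 1] * c)
        total += x_t**2 + x_th**2
    return total * (2.0 * math.pi / samples) / (4.0 * math.pi)

def hamiltonian_modes(q0dot, k, qdot, qdot_neg, q, q_neg):
    """Sum of the right-hand sides of (iv) and (v)."""
    kinetic = q0dot**2 + sum(v * v for v in qdot) + sum(v * v for v in qdot_neg)
    potential = k**2 + sum(
        n * n * (q[n - 1]**2 + q_neg[n - 1]**2) for n in range(1, len(q) + 1)
    )
    return 0.5 * (kinetic + potential)
```

With three modes and arbitrary coefficients the two functions agree to rounding error, which reflects the vanishing of all cross terms in (iii) under integration.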

6

THE VERTEX AND VIRASORO OPERATORS

We devote this section to vertex and Virasoro operators. Both these operators were discovered by physicists; for instance, the vertex operator $D$ defined in terms of exponentials in Sec. 4 (Exercise 2) was used in dual string theory, and prior to that a similar operator (fermionic field operator) for the soliton in the Sine-Gordon model was used by Skyrme [13]. Similarly the Virasoro operator was used in looking at the conformal invariance in string theory. The commutation relations of these operators correspond to the re-parametrization and conformal symmetry of the string (see Chapter 11). In recent years it has been found that they are extremely useful in the understanding and development of infinite-dimensional algebras as well as string theory. These operators have therefore led to a strong interplay between mathematics and physics. One now talks of different types of formulations of algebras (e.g., a spinor construction) and of representations of algebras (e.g., untwisted and twisted) associated with these operators, and examines their relationship with other operators and algebras (see [4]). Due to our limited scope, we content ourselves with the basics of these operators. Detailed accounts of these topics can be found in references [9] and [6], where most of the research papers have been reprinted (see also [3]).


6.1 The Vertex Operators

Before we write down a vertex or Virasoro operator in a general form, we state two simple results based on differential operators on the vector space $V = S(S_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$. Recall that we obtained these operators in Sec. 4 and Sec. 5. In Sec. 4 we used the Heisenberg system $(S, \Gamma)$ with a fixed polarization and a representation $\pi = \{\pi_1, \pi_2\}$ associated to a cocycle $\varepsilon$ to do so (see Result (5.4.9) and Definition (5.4.10)), and in Sec. 5 we obtained them on the Fock space $V_Q$, $Q$ being an integral even lattice (see Subsec. 5.3).

Result 5.6.1: Let $D$ denote the differential operator from $V \to \bar{V}$:

$$D \equiv \exp((\ln z)\,\partial_\gamma + \partial_u)$$ (5.6.1)

where $u \in S_+$, $z \in \mathbb{C} \setminus 0$ and $\gamma \in \Gamma$.$^{26}$ Then using Taylor's formula and defining equations (5.4.15) and (5.4.17) it follows that:

$$D(v) = v + \psi(u, v), \qquad D(e^{\gamma'}) = z^{(\gamma, \gamma')}\,e^{\gamma'}$$ (5.6.2)

where $v \in S_-$, $\gamma' \in \Gamma$ and $\psi$ is the alternate bilinear form defined in Sec. 4. (Note that $D(e^{\gamma'}) = \exp((\ln z)\langle \gamma', \gamma \rangle)\,e^{\gamma'} \equiv z^{(\gamma, \gamma')}\,e^{\gamma'}$, where we have retained only one term in the expansion of $\ln z$.)

Result 5.6.2: Let $A : V \to \bar{V}$ be a differential operator for which there exist $u_0 \in S_+$, $v_0 \in S_-$, $\gamma, \gamma' \in \Gamma$ and $z \in \mathbb{C} \setminus 0$ such that:

(a) $[\partial_u, A] = \psi(u, v_0)\,A, \quad u \in S_+$

(b) $[L_v, A] = \psi(u_0, v)\,A, \quad v \in S_-$

(c) $[\partial_{\gamma''}, A] = (\gamma, \gamma'')\,A, \quad \gamma'' \in \Gamma$

(d) $f_{\gamma''}\,A\,f_{-\gamma''} = z^{(\gamma', \gamma'')}\,A, \quad \gamma'' \in \Gamma.$ (5.6.3)

Then the operator $A$ equals:

$$a\,f_\gamma\,(\exp v_0)\,\exp(-((\ln z)\,\partial_{\gamma'} + \partial_{u_0})), \quad \text{where } a \in \mathbb{C}.$$ (5.6.4)
We use the operator $D$ of Eq. (5.6.1) and the annihilation operator given in (5.5.15) to define the vertex operator.

Definition 5.6.3: Let $V = S(S_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$ and $\bar{V} = \bar{S}(S_-) \otimes_{\mathbb{C}} \mathbb{C}[\Gamma]$ be the vector spaces associated to the Heisenberg system $(S, \Gamma)$ and let $D$, $\gamma(n)$ $(= h_\gamma(n))$ be the differential operators on $V$. Then the operator:

$$X(\gamma, z) = \exp\Big(\sum_{n \geq 1} \frac{z^n}{n}\,\gamma(-n)\Big)\,\exp(\gamma + (\ln z)\,\partial_\gamma)\,\exp\Big(-\sum_{n \geq 1} \frac{z^{-n}}{n}\,\gamma(n)\Big)$$ (5.6.5)

for $z \in \mathbb{C} \setminus 0$ and $\gamma \in \Gamma$ is the so-called vertex operator from $V$ to $\bar{V}$. The following facts about this operator are easy to check:

Fact 5.6.4: (a) The middle term can be written as

$$z^{\frac{1}{2}(\gamma, \gamma)}\,e^{\gamma}\,\exp((\ln z)\,\partial_\gamma) = z^{\frac{1}{2}(\gamma, \gamma)}\,e^{\gamma} \sum_{r \in \mathbb{Z}} z^r A_r, \qquad A_r(e^{\gamma'}) = \delta_{r, (\gamma, \gamma')}\,e^{\gamma'}$$ (5.6.6)

(b) Although $X(\gamma, z)$ is an operator from $V$ to $\bar{V}$, it can be developed as a power series in $z$:

$$X(\gamma, z) = \sum_{r \in \mathbb{Z}} X_r(\gamma)\,z^{-r}$$ (5.6.7)

involving operators $X_r(\gamma)$ $(r \in \mathbb{Z})$ which map $V$ into itself. The operator $X(\gamma, z)$ is an analytic operator (as a function of $z$); accordingly it is natural to talk about its residue. Indeed the operators $X_r(\gamma)$ of (5.6.7) are the operators that are obtained by considering the residue:

$$X_r(\gamma) \cdot v = \operatorname{Res}_{z=0}\,(X(\gamma, z)\,v \cdot z^{r-1}), \qquad v \in V$$ (5.6.8)

By the very definition of $X(\gamma, z)$ (product of exponentials), it is evident that the RHS has a finite pole at $z = 0$, and therefore $X_r(\gamma)$ is well defined. Again using the cocycle $\varepsilon$, we can write the operators depending on the cocycle, thus$^{27}$:

$$X^\varepsilon(\gamma, z) = X(\gamma, z)\,\varepsilon_\gamma, \qquad X_r^\varepsilon(\gamma) = X_r(\gamma)\,\varepsilon_\gamma$$ (5.6.9)

$^{26}$ $\bar{V} = \bar{S}(S_-) \otimes \mathbb{C}[\Gamma]$ (see Eq. (5.4.20)).
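These can be further used to define commutation and anticommutation relations between the vertex operators. For this purpose we set$^{28}$

$$X^\varepsilon(\gamma, \gamma'; z, z_0) = {:}X(\gamma, z)\,X(\gamma', z_0){:}\;\varepsilon_{\gamma + \gamma'}$$ (5.6.10)

and state a simple result:

Result 5.6.5: Let $\gamma, \gamma' \in \Gamma$ and $v \in V$, then

$$X^\varepsilon(\gamma, z)\,X^\varepsilon(\gamma', z_0) = \varepsilon(\gamma, \gamma')\,(z - z_0)^{(\gamma, \gamma')}\,(z z_0)^{-(\gamma, \gamma')/2}\,X^\varepsilon(\gamma, \gamma'; z, z_0)$$ (i)

where $|z| > |z_0|$. If $\varepsilon(\gamma, \gamma')\,\varepsilon(\gamma', \gamma)^{-1} = \pm(-1)^{(\gamma, \gamma')}$, then

$$[X_r^\varepsilon(\gamma), X^\varepsilon(\gamma', z_0)]_{\mp}\,v = \varepsilon(\gamma, \gamma')\,\operatorname{Res}_{z = z_0}\big((z - z_0)^{(\gamma, \gamma')}\,(z z_0)^{-(\gamma, \gamma')/2}\,X^\varepsilon(\gamma, \gamma'; z, z_0)\,v\,z^{r-1}\big)$$ (ii)

$^{27}$ $\varepsilon_\gamma(\alpha) = \varepsilon(\gamma, \alpha)$, defined in subsection (5.2.5).

$^{28}$ See Eq. (5.2.28) for the ordering convention.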

where $[\ ,\ ]_- = [\ ,\ ]$ denotes the commutator and $[\ ,\ ]_+ = \{\ ,\ \}$ denotes the anticommutator (see Prop. (1.2.15) in [4] for the proof). We now show (through the next result) how vertex operators can be used to give a representation of an affine algebra $\hat{g}$ associated to a complex simple finite-dimensional Lie algebra $g$. Recall that in this case $\mathcal{H}$ is a Cartan subalgebra of $g$, $\Delta$ is the associated root system and $Q$ is the lattice in $\mathcal{H}$ generated by $\Delta$. The bilinear form $(\ ,\ )$ restricted to $\Delta$ satisfies $(\gamma, \gamma) = 2$. Moreover $g$ has a Chevalley basis $\{E_\gamma\} \cup \{h_{\alpha_i}\}$, $\gamma \in \Delta$, $\alpha_i \in \Pi$, and also has an associated cocycle $\varepsilon$. Then setting

$$S = \sum_{k \in \mathbb{Z}} s_k \quad \text{where} \quad s_k = t^k \otimes \mathcal{H},\ k \neq 0, \qquad S_- = \sum_{k < 0} s_k, \qquad s_0 = \mathbb{C}c + \mathcal{H} \quad \text{and} \quad V = S(S_-) \otimes_{\mathbb{C}} \mathbb{C}[Q]$$

and using the operators $h(r)$, $\partial_\gamma$, $C_\gamma$, $D_0$ (defined in subsection (5.5.3)) and $X_r(\gamma)$, as well as the Hermitian form on $V$ given by $(\ |\ )$ (see Subsec. (5.4.2)), we obtain the following (important) result (see [5] for the proof):

Result 5.6.6: The map $\pi : \hat{g} \to \operatorname{End} V$ given by

$$\pi(c) = 1_V, \qquad \pi(d) = D_0, \qquad \pi(t^r \otimes h) = h(r),\ r \in \mathbb{Z}, \qquad \pi(t^r \otimes E_\gamma) = X_r(\gamma)\,C_\gamma,\ \gamma \in \Delta$$ (5.6.11)

is a representation of the Lie algebra $\hat{g}$, which is equivalent to the basic representation (see Sec. 3 for the basic representation).

6.2 The Virasoro Operators

Just as in the case of vertex operators, here also we deal with a Heisenberg system with canonical representation $\{\pi_1, \pi_2\}$ associated to an even integral lattice $Q$, a cocycle $\varepsilon$, and the space

$$V = S(S_-) \otimes_{\mathbb{C}} \mathbb{C}[Q]$$

on which the Virasoro operators act. These operators are defined in terms of elements of the Cartan subalgebra $\mathcal{H}$. We choose an orthonormal basis $\{a_i\}$ $(i = 1, \ldots, l)$ for $\mathcal{H}$ and define them as follows:

Definition 5.6.7: The operators

$$D_0 = -\sum_{r \geq 1} \sum_{i=1}^{l} a_i(-r)\,a_i(r) - \frac{1}{2} \sum_{i=1}^{l} (a_i(0))^2$$ (5.6.12)

and

$$D_m = -\frac{1}{2} \sum_{r \in \mathbb{Z}} \sum_{i=1}^{l} a_i(-r)\,a_i(r + m), \qquad m \in \mathbb{Z} \setminus 0$$ (5.6.13)

are the well known Virasoro operators. They map $V$ into itself. The operator $D_0$ (as should be evident from its form) is also known as the energy operator of the system. These operators satisfy the following relations with other operators, e.g., $\pi_1(s)$, $f_\gamma$ and $X(\gamma, z)$:

$$[D_m, \pi_1(s)] = \pi_1\Big(t^{m+1}\,\frac{d}{dt}(s)\Big), \qquad s \in S$$ (5.6.14)

$$f_\gamma\,D_m\,f_\gamma^{-1} = D_m + \gamma(m) - \frac{1}{2}\,\delta_{m,0}\,(\gamma, \gamma)$$ (5.6.15)

$$[D_m, X(\gamma, z)] = -z^m \Big(\frac{m}{2}(\gamma, \gamma) + z\,\frac{d}{dz}\Big) X(\gamma, z)$$ (5.6.16)

$$[D_m, D_n] = (n - m)\,D_{m+n} + \frac{l}{12}\,(m^3 - m)\,\delta_{m,-n}.$$ (5.6.17)

Equalities (5.6.14) and (5.6.15) can be proved by using the bracket identity:

$$[O_1 O_2, O_3] = O_1 [O_2, O_3] + [O_1, O_3]\,O_2$$ (5.6.18)

since every term of $D_m$ is a product of two operators. The defining equations required for establishing these two are respectively (5.4.19) and (5.4.18). For proving (5.6.16) and the constant term of (5.6.17), we have provided a few hints in Exercises (5.6.4) and (5.6.5). Recall that on the subalgebra $\tilde{g}$ of $\hat{g}$, we are already familiar with the action of the differential operator

$$d(n) = t^{n+1}\,\frac{d}{dt}.$$

Using the defining relations of Result (5.6.6), it can be shown that if $\pi : \hat{g} \to \operatorname{End} V$ is the representation of the Lie algebra $\hat{g}$, then:

$$[D_m, \pi(x)] = \pi(d_m(x)) \quad \text{for any } x \in \tilde{g} \subset \hat{g}$$ (5.6.19)

where we are denoting $d(m)$ as $d_m$. Thus commutation with the Virasoro operator induces the operator

$$t^{m+1}\,\frac{d}{dt} = d_m.$$

Now the operators $\{d_m\}$ form an infinite-dimensional algebra denoted $\mathcal{D}$ (see Sec. 2) with the bracket relation:

$$[d_m, d_l] = (l - m)\,d_{m+l}$$ (5.6.20)
The relation (5.6.17) can therefore be viewed as giving a central extension $\hat{\mathcal{D}}$ of the algebra $\mathcal{D}$. It is interesting to note that both these algebras have a 3-dimensional subalgebra which is isomorphic to $sl(2, \mathbb{C})$. In the case of $\hat{\mathcal{D}}$ it is formed by $D_{-1}$, $D_0$, $D_1$ and in the case of

We feel, however, that in spite of these shortcomings we have given a sufficient flavour of the subject, and have provided adequate references for further in-depth study.

Exercise 5.6

1. Prove Result (5.6.1).

2. Prove Result (5.6.2).

3. The middle term of the vertex operator defined in (5.6.5) operates on $\mathbb{C}[\Gamma]$. Show that its action on $e^{\gamma'}$ in particular gives:

$$z^{(\gamma, \gamma') + \frac{1}{2}(\gamma, \gamma)}\,e^{\gamma + \gamma'}$$

4. Prove the relations (5.6.16) and (5.6.17) for the Virasoro operator $D_m$.

5. Find the value of $[D_m, D_{-m}](1)$ where $m > 0$.

Hints to Exercise 5.6

1. Note that elements of $\mathbb{C}[\Gamma]$ can be viewed as constants when the operator $\partial_u$ is concerned; likewise elements of $S(S_-)$ can be treated as constants when the operator $\partial_\gamma$ is concerned. Hence using (5.4.15) and (5.4.17) we have the result given in Eq. (5.6.2).

2. We replace the operator $A$ by the operator $B = f_{-\gamma}\,\exp(-v_0) \cdot A \cdot \exp((\ln z)\,\partial_{\gamma'} + \partial_{u_0})$. Then our statement in (5.6.3) becomes: if $[\partial_u, A] = [L_v, A] = [\partial_{\gamma''}, A] = [f_{\gamma''}, A] = 0$ for any $u \in S_+$, $v \in S_-$, $\gamma'' \in \Gamma$, then $A = \text{const}$. The latter statement is obvious (see also Prop. (3.1) in [8(c)]).

3. Note that the middle term of a vertex operator can be put as:

$$z^{\frac{1}{2}(\gamma, \gamma)}\,e^{\gamma} \Big(\sum_{r \in \mathbb{Z}} z^r A_r\Big)$$

where $A_r$ satisfies $A_r(e^{\gamma'}) = \delta_{r, (\gamma, \gamma')}\,e^{\gamma'}$. Accordingly we have:

$$z^{\frac{1}{2}(\gamma, \gamma)}\,e^{\gamma} \Big(\sum_{r \in \mathbb{Z}} z^r\,\delta_{r, (\gamma, \gamma')}\,e^{\gamma'}\Big)$$

In the above sum all terms, except the one for which $r = (\gamma, \gamma')$, are zero; hence the result is $z^{\frac{1}{2}(\gamma, \gamma) + (\gamma, \gamma')}\,e^{\gamma + \gamma'}$.

4. To show the equality (5.6.16) we note that the terms in $X(\gamma, z)$ are products of three exponents, which (using (5.6.18)) can be treated individually to obtain the overall result. Thus we can write these commutation relations simply as:

(a) $\Big[D_m, \exp\Big(\dfrac{z^r}{r}\,\gamma(-r)\Big)\Big] = -z^r\,\gamma(m - r)\,\exp\Big(\dfrac{z^r}{r}\,\gamma(-r)\Big)$

(b) $[D_m, \exp(\gamma + (\ln z)\,\partial_\gamma)] = -\gamma(m)\,\exp(\gamma + (\ln z)\,\partial_\gamma).$

On the other hand the differentiation of these two exponents in (a) and (b) above gives:

(c) $-z^{m+1}\,\dfrac{d}{dz} \exp\Big(\dfrac{z^r}{r}\,\gamma(-r)\Big) = -z^{r+m}\,\gamma(-r)\,\exp\Big(\dfrac{z^r}{r}\,\gamma(-r)\Big)$

(d) $-z^{m+1}\,\dfrac{d}{dz} \exp(\gamma + (\ln z)\,\partial_\gamma) = -z^{m}\,\gamma(0)\,\exp(\gamma + (\ln z)\,\partial_\gamma).$

Comparing the RHS of (a) and (b) with (c) and (d), it follows that $[D_m, X(\gamma, z)]$ and $-z^{m+1}\,\dfrac{d}{dz} X(\gamma, z)$ differ only by $m$ permutations of the factors. Each permutation implies the addition of a scalar factor $\frac{1}{2} z^m (\gamma, \gamma)$ and hence we have the equality:

$$[D_m, X(\gamma, z)] = -z^m \Big[\frac{m}{2}(\gamma, \gamma) + z\,\frac{d}{dz}\Big] X(\gamma, z).$$

We leave the proof of the second part for the reader.

5. Note that $[D_m, D_{-m}](1)$ for $m > 0$ simplifies to $D_m D_{-m}(1)$, since $D_{-m} D_m(1)$ equals zero. Now

(a) $D_m D_{-m} = \Big(-\dfrac{1}{2} \sum_{r \in \mathbb{Z}} \sum_{i=1}^{l} a_i(-r)\,a_i(r + m)\Big) \Big(-\dfrac{1}{2} \sum_{s \in \mathbb{Z}} \sum_{j=1}^{l} a_j(-s)\,a_j(s - m)\Big) = \dfrac{1}{4} \sum_{r \in \mathbb{Z}} \sum_{s \in \mathbb{Z}} \sum_{i=1}^{l} \sum_{j=1}^{l} a_i(-r)\,a_i(r + m)\,a_j(-s)\,a_j(s - m).$

Using (5.5.16) we can write:

(b) $a_i(r + m)\,a_j(-s) = (r + m)\,\delta_{r+m,s}\,(a_i, a_j) = (r + m)\,\delta_{r+m,s}\,\delta_{ij}.$

Thus (b) is non-zero only when $j = i$ and $r + m = s$. Accordingly (a) simplifies to:

(c) $\dfrac{1}{4} \sum_{r \in \mathbb{Z}} \sum_{s \in \mathbb{Z}} \sum_{i=1}^{l} s\,a_i(-r)\,a_i(s - m).$

In the above expression $r$ and $s$ are not linearly independent, and this operator acting on $(1) \in V$ will give a non-zero result only for $s = 1, \ldots, m - 1$; hence we can write $D_m D_{-m}(1)$ as:

$$D_m \Big(-\frac{1}{2} \sum_{s=1}^{m-1} \sum_{i=1}^{l} a_i(-s)\,a_i(-m + s)\Big)(1) = \frac{l}{2} \sum_{s=1}^{m-1} s\,(m - s) = \frac{l}{12}\,(m^3 - m).$$
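The last step of Hint 5 rests on the elementary identity $\sum_{s=1}^{m-1} s(m - s) = (m^3 - m)/6$, which turns the Fock-space computation into the central term $\frac{l}{12}(m^3 - m)$ of (5.6.17). A short exact check, assuming only this identity and using the rank $l$ as a parameter (the function names are illustrative):

```python
from fractions import Fraction

def central_term(m: int, rank: int) -> Fraction:
    """(rank / 2) * sum_{s=1}^{m-1} s * (m - s), the value of D_m D_{-m} on the vacuum."""
    return Fraction(rank, 2) * sum(s * (m - s) for s in range(1, m))

def closed_form(m: int, rank: int) -> Fraction:
    """rank * (m^3 - m) / 12, the constant term appearing in (5.6.17) with n = -m."""
    return Fraction(rank * (m**3 - m), 12)

if __name__ == "__main__":
    for m in range(1, 7):
        # Both columns agree; the term vanishes for m = 1
        print(m, central_term(m, 1), closed_form(m, 1))
```

The vanishing at $m = 1$ is consistent with $D_{-1}, D_0, D_1$ closing into a subalgebra without any central contribution.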


References

1. E. Abe, Hopf Algebras, Cambridge Tracts in Math. N74 (Cambridge University Press, 1980).
2. S. L. Adler and R. F. Dashen, Current Algebras and Applications to Particle Physics (W. A. Benjamin, Inc., 1968).
3. A. J. Feingold, Spinor Construction of Vertex Operator Algebras, Triality, and $E_8^{(1)}$, Contemp. Math. No. 121 (Am. Math. Soc., 1991).
4. I. B. Frenkel, Two Constructions of Affine Lie Algebras Representations and Boson-Fermion Correspondence in Quantum Field Theory, J. Funct. Anal. 44 (1981), 259-317. See Chap. 3 of [6].
5. I. B. Frenkel and V. G. Kac, Basic Representations of Affine Lie Algebras and Dual Resonance Models, Invent. Math. 62 (1980), 23-66. See Chapter 2 of [6].
6. P. Goddard and D. Olive (eds.), Kac-Moody and Virasoro Algebras (World Scientific, 1988).
7. O. M. Jezabek and M. Praszatowicz (ed.), Skyrmions and Anomalies (World Scientific, 1987).
8. (a) V. G. Kac, Infinite Dimensional Lie Algebras, 4.[6(a)]. (b) V. G. Kac (ed.), Infinite Dimensional Lie Algebras and Groups, 4.[6(b)]. (c) V. G. Kac, D. A. Kazhdan, J. Lepowsky, R. L. Wilson, Realization of the Basic Representations of the Euclidean Lie Algebras, Advances in Math. 42 (1981), 83-112.
9. J. Lepowsky, S. Mandelstam and I. M. Singer, Vertex Operator in Mathematics and Physics (MSRI No. 3, Springer-Verlag, 1985).
10. J. Lepowsky and R. L. Wilson, A Lie Theoretic Interpretation and Proof of the Rogers-Ramanujan Identities, Advances in Math. 45 (1982), 21-72.
11. S. A. Prevost, Vertex Algebras and Integral Bases for the Enveloping Algebras of Affine Lie Algebras, Am. Math. Soc., Vol. 96, No. 466 (1992).
12. R. Seeley, Complex Powers of an Elliptic Operator, Proc. Symp. Pure Math. 10 (1973), 515-527.
13. T. H. R. Skyrme, Selected Papers, with Commentary, (Ed.) G. E. Brown (World Scientific, 1994). See in particular Skyrme's papers in Proc. Roy. Soc. Lond. (1958, 1959, 1961, 1962); his joint work with J. K. Perring, A model unified field equation, Nucl. Phys. 31 (1962), 550-555; and articles by V. I. Sanyuk and E. Witten.
14. H. Tsukada, String Path Integral Realization of Vertex Operator Algebras, Mem. Am. Math. Soc. 444 (1991).
15. V. G. Kac, Superconformal algebras and transitive group actions on quadrics, Comm. Math. Phys. (1997).
16. V. G. Kac, Vertex algebras for beginners, University Lecture Series, Vol. 10 (AMS, 1996).
17. S. J. Cheng and V. G. Kac, Conformal modules, Asian J. of Math. (1997).
18. R. Borcherds, Vertex algebras, Kac-Moody algebras and the Monster, Proc. Natl. Acad. Sci. USA 83, 3068-3071 (1986).
19. N. Kamran and P. J. Olver (ed.), Lie Algebras, Cohomology, and New Applications to Quantum Mechanics, Contemp. Math. 160 (AMS, 1994).

CHAPTER 6

THE ROLE OF SYMMETRY IN PHYSICS AND MATHEMATICS

1 WHAT IS SYMMETRY?$^{1}$

The word symmetry according to the Merriam Webster Dictionary stands for (i) the correspondence in size, form and arrangement of parts that are on opposite sides of a plane, line or point; (ii) the excellence of proportion or the beauty based on proportion; (iii) the notions that leave a figure unchanged. Although our main concern is with number (iii), we emphasize that (i) and (ii) are also there in one form or another. From early on, the symmetry in nature (i.e., in laws of nature) was a recognized phenomenon in terms of invariances, e.g., translational and rotational. The science of crystallography (based on symmetry principles) is one such example. However, the use of symmetry as a powerful tool for the development of physics became more popular with the advent of Einstein's theory of relativity. For after that, one not only studied the invariance properties of physical systems but also used these properties to make predictions about theories. An example in this direction is the eightfold way of Gell-Mann and Ne'eman that led to the discovery of quarks [13]. Symmetry's important role as a yardstick in physics has reached a point where no new concept is taken seriously unless the underlying symmetry of the concept is theoretically specified and the legitimacy of this symmetry is confirmed by experiments. One therefore deals these days with a whole arsenal of vocabulary pertaining to symmetry, for instance (i) the exact and approximate symmetries, (ii) the global and local symmetries, (iii) the gauge symmetries, (iv) the continuous and discrete symmetries, (v) the Lorentz and Poincare symmetries, (vi) the broken symmetries, (vii) the geometrical and non-geometrical symmetries, (viii) the hidden symmetries, (ix) the dynamical symmetries, (x) the relation between symmetries and conservation laws and currents. We shall explain these terms in brief in the next section.

But before this, we would like to observe that present day physics not only uses the existence of these symmetries to unravel the mysteries of the universe, but also engages itself in tackling more subtle questions, namely: what are the origins of different symmetries? Are they 'fundamental' or are they 'derived' (see Def. (vii))? In fact it is this quest that has led to the so-called supersymmetry, a new frontier in physics.

$^{1}$ In this chapter, while learning the basics of symmetry, we place the main emphasis on gauge symmetries, the root of gauge theories, and introduce the readers to notational differences among these theories in mathematics and physics.


Finally, we note that implicit in a symmetry principle is the assumption under which certain quantities are unobservable. This in turn implies an invariance under a related mathematical transformation leading to a conservation law or selection rule. Hence like every other object in physics, symmetry is defined and studied via mathematical objects. For instance, associated with any given symmetry there always exists a continuous or a discrete group of transformations. The generators of the group commute with the Hamiltonian of the physical system in question. Thus the study of symmetries to a large extent can be viewed as the study of groups, in particular that of Lie groups and their representations. Having already studied the basics of group theory in Chapter 2 and the groups $SU(2)$ and $SU(3)$ in Sec. 3.7, we shall mainly concern ourselves here with the product group $SU(2)_L \times U(1)_Y$, the symmetry group of the Glashow-Weinberg-Salam model, better known as the invariance group of electro-weak interactions; and the rotation group $O(4)$, the symmetry group of the hydrogen atom and of any particle in a Coulomb-like $1/r$ potential$^{2}$ (see Exp. (6.7.5)). We shall again return to principles of symmetries in Chapters 8 and 9. In Chapter 8 we shall deal with symmetries of spacetime, whereas in Chapter 9 we shall study the symmetries (antisymmetries) that relate to familiar objects such as linear and angular momentum, energy, wave function and so forth (see Table 6.1 on symmetries and conservation laws at the end of this chapter).

2 DEFINITIONS AND DESCRIPTIONS

As the title suggests, in this section we give in brief the various types of symmetries that we have enumerated in the previous section. These can be described as follows (see Chapter 1 of [11] and [19] for details):

(i) A symmetry of a physical system is called exact if no violations are observed, otherwise it is called approximate. Simple examples of exact symmetry are rotations and translations, whereas parity ('invariance under inversion of space coordinates') is an approximate symmetry. An important class of symmetries which hold only approximately are the well known unitary symmetries* (see [15] and [19]).

(ii) A symmetry which is characterised by space-time independent parameters is called a global symmetry, while the one which depends on them is called local. Thus in the former case symmetry transformations on the fields of a theory are identical at all space-time points (see Subsec. (3.2)).

(iii) A local symmetry, i.e., a symmetry whose parameters depend on the space on which it is defined, is often called a gauge symmetry, and local symmetry transformations are referred to as gauge transformations. A theory that studies these symmetries is called a gauge theory and the underlying symmetry group of the theory is called its gauge group. A gauge theory is called abelian or non-abelian depending on the nature of the symmetry group. For example, the electromagnetic theory with $U(1)$ as its symmetry group is an abelian gauge theory, and the Yang-Mills theory characterized by the symmetry group $SU(2)$ is non-abelian. Sometimes the gauge transformations that leave the field equations of a theory invariant are used to set up the so-called constraint equations involving objects such as the vector and scalar potentials of the theory. These choices simplify the field equations (see Exercises 2 and 3 of Sec. 5, and a footnote in Subsec. 6.4). This process is called gauge-fixing (see Secs. 5 and 6). We shall return to these ideas on gauge theory in detail from Sec. 4 onward.

$^{2}$ $r = (x_1, x_2, x_3)$ and $r = \sqrt{x_1^2 + x_2^2 + x_3^2}$. The Coulomb potential (the basic potential of atomic physics) describes the electromagnetic forces that are responsible for the binding of electrons and nuclei into the form of atoms and molecules.

* An invariance defined by a unitary operator (for example in quantum mechanics) or by a unitary group (e.g. $U(n)$ or $SU(n)$) is called a unitary symmetry.


(iv) A symmetry whose associated group is continuous (discrete) is called a continuous (discrete) symmetry. Evidently translations and rotations (with parameters $x$ and $\theta$ which can vary continuously) are continuous symmetries. The group of permutations of degree $n$ acting on $n$ objects is a discrete symmetry group and the symmetry described by it is a discrete symmetry. We note that all invariances under discrete transformations, except the ones governed by a group of permutations, seem to be approximate.

(v) The symmetries whose associated groups are Lorentz or Poincare are referred to as Lorentz or Poincare symmetries.

(vi) In his famous paper of 1894 (J. Physique 3e serie, p. 393), Pierre Curie formulated the concept of broken symmetry in the following words: "C'est la dissymétrie qui crée le phénomène", which when translated means: "It is the broken symmetry which creates a lot of non-trivial physics". It is an established fact that physical systems which are initially uniform and time-independent (though far from equilibrium) possess a kind of self-organizing phenomenon that leads to ordered behavior. The emergence of order (referred to as "symmetry breaking") is often accompanied by the appearance of spatially asymmetric patterns. Such (symmetry breaking) phenomena occur in macroscopic physics as well as in microscopic physics* (see Chapter 2 in [11]); for instance, in subatomic physics it happens to be the rule rather than the exception. It is also of great value in systems of a biological nature, where both order and asymmetry are ubiquitous. We shall, however, restrict ourselves to a group theoretic approach and shall say that, given a physical system, its symmetry is said to be broken if its underlying symmetry group can be expressed as a product of two other groups. These other groups are naturally subgroups of the given group and form separate symmetry groups of the given system. A well known example of this phenomenon is the electro-weak theory of Glashow-Weinberg-Salam, to which we shall return later (see Section 5).

We would also like to note that there are three types of symmetry breaking that take place in the physical world, namely (1) explicit breaking: the "classical action" has a term which is not invariant under a given symmetry group; (2) anomalous breaking: the classical action is invariant but the Hamiltonian of the corresponding quantum theory is not, and there is no conserved current; (3) spontaneous breaking: the action is invariant and the currents are conserved in quantum theory, but the vacuum ground state is not invariant under the transformations. We shall return to it briefly in Subsec. 5.2 (see Chapters 5 and 6 for (1), (2) and (3) in [4] and Chapters 4, 6 and 7 in [11]; see also Chapters 4 and 5 in [5], and references [9], [6], [21]). In group theoretical terms the classical Kaluza-Klein theory that unified gravitation and electromagnetism is the best understood theory of spontaneous symmetry breaking. Here the 5-dimensional group of general coordinate transformations is (spontaneously) broken to the 4-dimensional group of general coordinate transformations and a local $U(1)$ gauge group (see Ed. Witten, and E. Cremmer and B. Julia in [11]). We also note that when local symmetries are spontaneously broken (Subsec. 5.2) due to non-invariance of the vacuum, the zero mass particles do not appear in the physical spectrum of states; instead these particles provide longitudinal modes to gauge fields, making them massive. This turns out to be a welcome phenomenon, since here two kinds of massless particles that are not needed for physical purposes combine to give a 'vector meson state' capable of mediating short-range forces such as the weak forces. As a matter of fact this (Higgs-Kibble) mechanism forms the basis of unified gauge theories (see [22] for mathematical explanations, [17], [26] for original papers, and articles by 't Hooft and Dimopoulos, et al., in [9] for some more answers).

* Two simple examples from macroscopic physics are hydrodynamical differential equations, and the mechanical properties of an organic compound such as sugar. An example from microscopic physics is the model SU(2) x U(3) (see Exp. 3 in Subsec. 5.2).

The Role of Symmetry in Physics and Mathematics


(vii) A layman's definition of a geometric symmetry would be: 'a symmetry that preserves geometric configurations is a geometric symmetry'. The translational and rotational invariances of space, in other words the symmetries of Euclidean geometry (also classified as 'fundamental symmetries'), are geometric symmetries. In the context of physical theories, however, the notion of geometric symmetry is used in a much broader sense. For instance, since the publication of Einstein's theory of relativity, the Lorentz invariance and time-translational invariance have been considered 'geometrical'. Similarly, the general coordinate-invariance or diffeomorphism symmetry of Einstein's theory of gravity is called 'geometrical' (though sometimes this is referred to as 'a questionable practice'). In this scheme of ideas even a supersymmetry (if it exists) is a geometrical symmetry, since it is intertwined in a non-trivial way with the symmetries of space and time translations (see Section 7.5). Examples of symmetries that are nongeometrical are offered by the gauge symmetries of Yang-Mills theories. Nonetheless, the proponents of geometric symmetries consider these also as geometric, since the theories can be given a fiber bundle formulation (see Section 5 and Section 6). In any case, we would like to note that the symmetries described in (v) are geometric symmetries.

(viii) Sometimes the symmetry is not manifest although its effects can be felt indirectly. Such a symmetry is called a hidden symmetry (see Chapter 2 in [11]).

(ix) Dynamical symmetries are not true symmetries in the sense that they are not exhibited by the laws of nature. A simple description of dynamical symmetry is as follows. Suppose there is a physical system described by a Hamiltonian $H$ with underlying symmetry group $G$, more specifically by $H = h(G_a)$, where $G_a \in \mathcal{G}$, the algebra of generators of $G$ (see Sec. 3.7). In general $H$ (written in terms of the generators of $G$) does not admit solutions to the eigenvalue problem in closed, analytic form (in other words, $H$ does not give the spectrum of observables of the theory). If, however, $H$ contains invariant (Casimir) operators of a complete chain of subgroups of $G$,

$$G \supset G_1 \supset G_2 \supset \cdots$$

then the eigenvalue problem can be solved in closed form in terms of quantum numbers, and these lead to energy (or mass) formulas of the theory. Thus dynamical symmetry is a means towards the classification of complex physical systems. A familiar example of dynamical symmetry is given by the Gell-Mann-Ne'eman SU(3) with the group chain [13]:

$$SU(3) \supset SU(2) \otimes U(1) \supset SO(2) \otimes U(1).$$

Other examples of dynamical symmetries are those of molecules and nuclei, with their respective symmetry groups U(4) and U(6). The interacting objects in both cases are bosons (see [11], [21] and F. Iachello in [27]).

(x) In both classical and quantum mechanics, the conservation of linear momentum, angular momentum and energy is related to the invariance of the Hamiltonian under translations, rotations and time translations respectively. More generally, in quantum mechanics whenever a conservation law holds good for a physical system, its Hamiltonian is invariant under a corresponding group of transformations. The converse of this statement is not true: even if the system has a Hamiltonian which is invariant under a group of transformations, the corresponding conservation law may not exist. A simple example is provided by time reversal in a given physical system. We devote Sec. 3 to these ideas.


3 EXACT SYMMETRIES, CONSERVATION LAWS AND CURRENTS

Very often the symmetries are not exact. In order to make them so, certain constraints are imposed on the system which in turn lead to conservation laws and currents. We illustrate this point within the framework of Lagrangian field theory.

3.1 Euler-Lagrange Equations and Currents

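We write the action $S$ as a functional of the fields $\phi_i$ and their derivatives $\partial_\mu\phi_i$ ^3:

$$S = \int d^4x\, L(\phi_i, \partial_\mu\phi_i) \qquad (6.3.1)$$

Here $L$ is the Lagrange density, the index $i$ runs over the number of fields and $\mu$ takes the values 0 through 3. Under a symmetry transformation the entities $\phi_i$, $\partial_\mu\phi_i$ and $L$ in the above action are transformed to:

$$\phi_i \to \phi_i + \delta\phi_i, \qquad \partial_\mu\phi_i \to \partial_\mu\phi_i + \delta(\partial_\mu\phi_i), \qquad L \to L + \delta L \qquad (6.3.2)$$

Accordingly $\delta L$ can be written:

$$\delta L = \frac{\partial L}{\partial\phi_i}\,\delta\phi_i + \frac{\partial L}{\partial(\partial_\mu\phi_i)}\,\delta(\partial_\mu\phi_i) \qquad (6.3.3)$$

To simplify, we assume that there is just one field $\phi$ in the action; then the action $S$ will be invariant under the symmetry given in (6.3.2) only if:

$$\delta S = 0 = \int d^4x \left[\frac{\partial L}{\partial\phi}\,\delta\phi + \frac{\partial L}{\partial(\partial_\mu\phi)}\,\delta(\partial_\mu\phi)\right] \qquad (6.3.4)$$

The RHS of Eq. (6.3.4) can be simplified after the second term is integrated by parts (using $\delta(\partial_\mu\phi) = \partial_\mu(\delta\phi)$); thus we have:

$$\delta S = \int d^4x \left[\frac{\partial L}{\partial\phi} - \partial_\mu\frac{\partial L}{\partial(\partial_\mu\phi)}\right]\delta\phi + \int d^4x\, \partial_\mu\!\left[\frac{\partial L}{\partial(\partial_\mu\phi)}\,\delta\phi\right] \qquad (6.3.5)$$

If the second integral (which is the surface term) is ignored, $\delta S$ is zero when the integrand in the first term vanishes ^4:

$$\frac{\partial L}{\partial\phi} - \partial_\mu\frac{\partial L}{\partial(\partial_\mu\phi)} = 0$$

^3 The action, when compared with first quantization $[p_i, x_j] = -i\delta_{ij}$, is often referred to as the second quantization in quantum theory, since the interactions amongst the fields are explicit in the action. Note that the defining equation of second quantization is $[\pi(x), \phi(y)]_{x_0 = y_0} = -i\,\delta^3(x_i - y_i)$. Here $i$ takes the values 1, 2, 3, and $p_i = dx_i/dt$; whereas $\pi(x) = \delta L/\delta(\partial_0\phi)$ (see Ftn. 5).

^4 In 3-dimensions this is the familiar equation of motion.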


This, however, is the Euler-Lagrange equation, which can be obtained from extremization of the action $S$ that involves physical fields. Hence in order that the symmetry be exact, the integrand [...] (6.3.6)

where $\epsilon^a$ is an arbitrary parameter of transformation for the variation of [...] (6.3.7)

Thus any continuous symmetry transformation which leaves the action $S$ invariant implies the existence of a conserved current (see Chapter 2 in [16]). Next we shall use the current $J^a_\mu$ to obtain the conservation law of the system.
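As a brief illustration of the Euler-Lagrange equation of this subsection, the following worked example applies it to a single free real scalar field. The Lagrangian, the mass parameter $m$ and the metric signature $(+,-,-,-)$ are assumptions chosen for the illustration; they are not taken from the text above.

```latex
% Assumptions: one real field \phi, mass parameter m, metric signature (+,-,-,-)
\begin{align*}
L &= \tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}\,m^2\phi^2 \\
\frac{\partial L}{\partial\phi} &= -m^2\phi , \qquad
\frac{\partial L}{\partial(\partial_\mu\phi)} = \partial^\mu\phi \\
\frac{\partial L}{\partial\phi} - \partial_\mu\frac{\partial L}{\partial(\partial_\mu\phi)} = 0
\quad &\Longrightarrow \quad \partial_\mu\partial^\mu\phi + m^2\phi = 0
\end{align*}
```

Under these assumptions the vanishing of the first integrand in $\delta S$ reproduces the Klein-Gordon equation.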

3.2 Conservation Law and the Conserved Charges as Generators of Symmetry Group

The integral

$$Q^a = \int d^3x\, J^a_0(x) \qquad (6.3.8)$$

is called the (corresponding) charge of the system. This satisfies the equation

$$\frac{dQ^a}{dt} = 0 \quad \text{if} \quad \delta L = 0 \qquad (6.3.9)$$

giving the conservation law of the charge. Reverting to (6.3.2) we note that if $L$ is invariant under a symmetry group $G$ formed by infinitesimal transformations, then we can write the variation $\delta\phi_i(x)$ as

$$\delta\phi_i(x) = i\,\epsilon^a\, t^a_{ij}\,\phi_j(x) \qquad (6.3.10)$$

where $\{t^a\}$ are a set of matrices that satisfy the Lie algebra $[t^a, t^b] = i\,c^{abc}\,t^c$, with $c^{abc}$ being the structure constants of $G$. The variation (6.3.10) of $\phi_i$ implies that the current $J^a_\mu$ can be expressed as:

[...] (6.3.11)(a)

or equivalently as:

[...] (6.3.11)(b)

The conserved charges $\{Q^a\}$ are the generators of the symmetry group. The symmetries explained above are global symmetries, since these are characterized by space-time independent parameters $\epsilon^a$. Under these symmetries all fields $\phi_i(x)$ are transformed at all space points in exactly the same way. We explain these ideas further by two simple examples based on the abelian group U(1) and the non-abelian group SU(2).

3.3 Examples

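Example 6.3.1: Consider the Lagrange density

$$L = \tfrac{1}{2}\left[(\partial_\mu\phi_1)^2 + (\partial_\mu\phi_2)^2\right] - \frac{\mu^2}{2}\left(\phi_1^2 + \phi_2^2\right) - \frac{\lambda}{4}\left(\phi_1^2 + \phi_2^2\right)^2 \qquad (6.3.12)$$

which is evidently invariant under the symmetry group O(2) formed by rotations in the $(\phi_1, \phi_2)$-plane (see Sec. 2.1, where we denoted it as $R_2$). The corresponding transformations are:

$$\phi_1 \to \phi_1' = \phi_1\cos\alpha - \phi_2\sin\alpha, \qquad \phi_2 \to \phi_2' = \phi_1\sin\alpha + \phi_2\cos\alpha \qquad (6.3.13)$$

For infinitesimal transformations $\alpha \ll 1$, these become:

$$\phi_1' = \phi_1 - \alpha\phi_2, \qquad \phi_2' = \alpha\phi_1 + \phi_2$$

thus giving the relation (6.3.10) as:

$$\delta\phi_i = i\,\alpha\, t_{ij}\,\phi_j \qquad (6.3.14)$$

where the matrix $t = (t_{ij})$ stands for:

$$t = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$$

Since $\partial L/\partial(\partial^\mu\phi_i) = \partial_\mu\phi_i$, we obtain

$$J_\mu = -(\partial_\mu\phi_1)\,\phi_2 + (\partial_\mu\phi_2)\,\phi_1 \qquad (6.3.15)$$

as the conserved current of the system. If instead of the real fields $\phi_1$, $\phi_2$ we use the complex fields

$$\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2), \qquad \phi^* = \frac{1}{\sqrt{2}}(\phi_1 - i\phi_2)$$

the Lagrangian becomes:

$$L = (\partial_\mu\phi^*)(\partial^\mu\phi) - \mu^2(\phi^*\phi) - \lambda(\phi^*\phi)^2 \qquad (6.3.16)$$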


This is invariant under the U(1) transformations $\phi \to \phi' = e^{i\alpha}\phi$, and the corresponding conserved current can easily be seen to be

$$J_\mu = i\left[(\partial_\mu\phi^*)\,\phi - (\partial_\mu\phi)\,\phi^*\right] \qquad (6.3.17)$$
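The statements of Example 6.3.1 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the field values, derivative values, rotation angle and the parameters `mu2` and `lam` are arbitrary sample numbers, and a single component of the current is evaluated at a single point. It checks that the potential part of (6.3.12) and the current (6.3.15) are unchanged by the global rotation (6.3.13), and that the complex form (6.3.17) agrees with the real form (6.3.15).

```python
import math

def rotate(phi1, phi2, alpha):
    """Finite O(2) rotation of (6.3.13)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return phi1 * c - phi2 * s, phi1 * s + phi2 * c

def potential(phi1, phi2, mu2=1.3, lam=0.7):
    """Non-derivative part of (6.3.12); mu2 and lam are illustrative values."""
    r2 = phi1 ** 2 + phi2 ** 2
    return 0.5 * mu2 * r2 + 0.25 * lam * r2 ** 2

def current_real(phi, dphi):
    """One component of J_mu in (6.3.15): -(d phi1) phi2 + (d phi2) phi1."""
    return -dphi[0] * phi[1] + dphi[1] * phi[0]

def current_complex(phi, dphi):
    """Same component from (6.3.17), with phi = (phi1 + i phi2)/sqrt(2)."""
    z = complex(phi[0], phi[1]) / math.sqrt(2)
    dz = complex(dphi[0], dphi[1]) / math.sqrt(2)
    return 1j * (dz.conjugate() * z - dz * z.conjugate())

phi = (0.4, -1.1)     # sample field values at one point (assumption)
dphi = (0.9, 0.25)    # sample values of d_mu phi at the same point (assumption)
alpha = 0.83          # global rotation angle (assumption)

phi_rot = rotate(*phi, alpha)
dphi_rot = rotate(*dphi, alpha)  # a global rotation acts on d_mu phi in the same way

potential_change = abs(potential(*phi_rot) - potential(*phi))
current_change = abs(current_real(phi_rot, dphi_rot) - current_real(phi, dphi))
form_mismatch = abs(current_complex(phi, dphi) - current_real(phi, dphi))
print(potential_change, current_change, form_mismatch)
```

All three printed numbers should be zero up to floating-point rounding. The check relies on the rotation angle being constant; with a position-dependent angle the derivatives would not rotate in the same way as the fields.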

Example 6.3.2: The isospin symmetry SU(2) is the simplest case of a non-abelian symmetry. We consider an isodoublet (a column vector formed by differentiable real valued functions defined on R3)

$$\phi = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix}$$

and take the Lagrange density as

$$L = (\partial_\mu\phi^\dagger)(\partial^\mu\phi) - \mu^2(\phi^\dagger\phi) - \frac{\lambda}{4}(\phi^\dagger\phi)^2 \qquad (6.3.18)$$

(note that $\phi^\dagger$ is the conjugate transpose of $\phi$). Recall that the infinitesimal isospin rotations of $\phi$ are given in terms of the Pauli matrices $\tau^a$ ($a = 1, 2, 3$), which are the generators of SU(2) (see Sec. 3.7). It can be easily checked that the transformations

$$\phi_i \to \phi_i' = \phi_i + i\,\frac{\epsilon^a}{2}\,\tau^a_{ij}\,\phi_j \qquad (6.3.19)$$

leave $L$ invariant. The conserved isospin current in view of (6.3.11) is given by:

$$J^a_\mu = -\frac{i}{2}\left[(\partial_\mu\phi^\dagger_i)\,\tau^a_{ij}\,\phi_j - \phi^\dagger_i\,\tau^a_{ij}\,(\partial_\mu\phi_j)\right] \qquad (6.3.20)$$

The time component of the current in this case is:

$$J^a_0 = -\frac{i}{2}\left[(\partial_0\phi^\dagger_i)\,\tau^a_{ij}\,\phi_j - \phi^\dagger_i\,\tau^a_{ij}\,(\partial_0\phi_j)\right]$$

If we use the canonical momentum $\pi_i$ conjugate to $\phi_i$ ^5 [...] (6.3.21)

Further, using the commutation relations $[\pi_i(X, t), \phi_j(X', t)] = -i\,\delta_{ij}\,\delta^3(X - X')$ for $\pi_i$ and $\phi_j$, it can be shown that the charges

$$Q^a = \int d^3x\, J^a_0 \qquad (a = 1, 2, 3)$$

satisfy the commutation relations of the SU(2) symmetry:

$$[Q^a, Q^b] = i\,\epsilon^{abc}\,Q^c$$

hence $\{Q^a\}$ are the generators of SU(2) (see [4] and Chapter 4 of 9.[1](a)).

^5 $\pi_i = \partial L/\partial(\partial_0\phi_i)$.
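The charges $Q^a$ obey the same commutation relations as the matrices $\tau^a/2$ that generate the isospin rotations. As a minimal sketch, the code below checks that algebra for the $2 \times 2$ matrices only (it does not construct the field-theoretic charges). The helper names `matmul`, `commutator`, `levi_civita`, `half` and `max_violation` are introduced here for illustration.

```python
# Pauli matrices as nested lists of numbers
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices 0, 1, 2 with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) / 2

def half(m):
    """The matrix m / 2."""
    return [[x / 2 for x in row] for row in m]

def max_violation():
    """Largest entry of [tau_a/2, tau_b/2] - i eps_abc tau_c/2 over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(half(TAU[a]), half(TAU[b]))
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * levi_civita(a, b, c) * TAU[c][i][j] / 2
                              for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst

print(max_violation())
```

A result at the level of rounding error confirms $[\tau^a/2, \tau^b/2] = i\,\epsilon^{abc}\,\tau^c/2$, the matrix counterpart of the relation satisfied by the charges.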

4 GAUGE SYMMETRIES: THEIR ORIGIN

In the next two sections we study the concepts related to gauge symmetries, which as we know are also called local symmetries, to emphasize the fact that gauge symmetries are not of global nature (see also Sec. (9.4)).

4.1 A Historical Perspective

The gauge-theoretic ideas had their origin in Maxwell's electromagnetic field equations in the 19th century. However, it was only in 1929 that Weyl used the word gauge for local scale* invariance, which he had designed to incorporate the electromagnetic field with space-time geometry. The significance of Weyl's approach was recognized in due course, and a theory involving the so-called gauge groups and gauge fields, etc., began to unfold (E.P. Wigner in [11]). Physicists in elementary-particle physics and in the theory of quantum fields found the theory an effective and powerful tool in pursuing theoretical as well as experimental research. It is quite accidental that around the same period of time (Ehresmann, 1950, see 2.[12]) another new theory, which in later years turned out to have great similarities with gauge theory, was being formalized by mathematicians. This was the theory of bundles, a theory meant to unify the algebraic, the topological and the geometric structures. Of particular interest in this realm of ideas were the principal fiber bundles, entities which later turned out to be of immense use in the description of gauge theory. For almost two decades the physicists and mathematicians were unaware that the two theories they were developing had a lot in common: both were using the same tools for description, namely the metric and the connection, and both had the same goals, such as finding the objects invariant or covariant under suitable transformations or groups of transformations. It was only in the early seventies that the subtle link between the two theories was recognized. In particular, it was shown that the gauge field of the Yang-Mills equations was indeed the curvature of a connection in a principal fiber bundle with gauge group SU(2).

Although physicists, (naturally) influenced by Einstein's theory of relativity, describe the gauge potentials, the gauge fields and their interactions via the classical theory of curvature and connections (using tensor methods), the use of principal bundles as a language in gauge theory is becoming more and more popular. Our aim in these sections will be to explain these ideas using both points of view. Since the physicists' approach preceded that of mathematicians, we present two well known examples in gauge theory using the Lagrangian method. For the first one we begin by writing the Lagrangian $L_0$ of (Dirac's) free electron theory and finally obtain conditions for $L_0$ to be gauge invariant and renormalizable. In the process we obtain the theory of Quantum Electrodynamics (QED). Since the gauge group involved in the theory is abelian, this is also called an abelian gauge theory. The second example deals with the Lagrangian of Yang-Mills theory and illustrates the case of non-abelian gauge theory, since the underlying group SU(2) is non-abelian.

* We note that laws of physics are not themselves invariant under a change of scale. A scaling symmetry is usually 'derived' by applying dimensional arguments to hydrodynamical differential equations. The symmetry arises when the number of fluid molecules becomes very large. Obviously it is related to macroscopic physics (see Ftn. in Def. (vi), Sec. 2).

4.2 Examples (Physicists' Point of View)

Example 6.4.1: The abelian gauge theory: Let $\Psi(x)$ stand for (Dirac's) free electron field with the following transformation rule:

$$\Psi(x) \to \Psi'(x) = e^{-i\alpha}\,\Psi(x), \qquad \bar\Psi(x) \to \bar\Psi'(x) = e^{i\alpha}\,\bar\Psi(x) \qquad (6.4.1)$$

the exponent $\alpha$ being the phase belonging to the abelian group U(1). The Lagrangian $L_0$ for the field $\Psi(x)$, given as ^6:

$$L_0 = \bar\Psi(x)\,(i\gamma^\mu\partial_\mu - m)\,\Psi(x) \qquad (6.4.2)$$

evidently has a global U(1) symmetry under the transformations (6.4.1). In order to turn this symmetry into a local symmetry, i.e., to gauge the symmetry, we replace the phase $\alpha$ by $\alpha(x)$ in (6.4.1) and demand that the theory be invariant (i.e., gauge invariant) under this new set of transformations, namely:

$$\Psi(x) \to \Psi'(x) = e^{-i\alpha(x)}\,\Psi(x), \qquad \bar\Psi(x) \to \bar\Psi'(x) = e^{i\alpha(x)}\,\bar\Psi(x) \qquad (6.4.3)$$

It is easy to see that the derivative term $\bar\Psi(x)\,\partial_\mu\Psi(x)$ in (6.4.2) is now replaced by:

$$e^{i\alpha(x)}\,\bar\Psi(x)\,\partial_\mu\!\left(e^{-i\alpha(x)}\,\Psi(x)\right) = \bar\Psi(x)\,\partial_\mu\Psi(x) - i\,\bar\Psi(x)\,(\partial_\mu\alpha(x))\,\Psi(x) \qquad (6.4.4)$$

The presence of the second term in (6.4.4) disallows the invariance. If, however, we form a gauge-covariant derivative $D_\mu$ to replace $\partial_\mu$, so that $D_\mu\Psi(x)$ has the transformation property:

$$D_\mu\Psi(x) \to [D_\mu\Psi(x)]' = e^{-i\alpha(x)}\,D_\mu\Psi(x) \qquad (6.4.5)$$

then the product $\bar\Psi(x)\,D_\mu\Psi(x)$ will be invariant under the gauge transformations (6.4.3), leading to a gauge-invariant Lagrangian. To achieve this, we must define a suitable covariant derivative $D_\mu$. This is done by adding a term $ieA_\mu$ to the partial derivative $\partial_\mu$, where $A_\mu$ is a vector field called the gauge field of the theory and $e$ is a free parameter which can be identified with the electric charge. We next obtain the transformation rule for $A_\mu$, so that (6.4.5) may hold. Writing $D_\mu = \partial_\mu + ieA_\mu$ and $(D_\mu)' = \partial_\mu + ieA'_\mu$ we have:

$$D_\mu\Psi(x) = (\partial_\mu + ieA_\mu)\,\Psi(x) \qquad (6.4.6)(a)$$

and

$$[D_\mu\Psi(x)]' = D'_\mu\Psi' = (\partial_\mu + ieA'_\mu)\,e^{-i\alpha(x)}\,\Psi(x) = e^{-i\alpha(x)}\,\partial_\mu\Psi(x) - i\,(\partial_\mu\alpha(x))\,e^{-i\alpha(x)}\,\Psi(x) + ie\,e^{-i\alpha(x)}A'_\mu\,\Psi(x). \qquad (6.4.6)(b)$$

In order to satisfy (6.4.5), the RHS of (6.4.6)(b) must equal $e^{-i\alpha(x)}(\partial_\mu + ieA_\mu)\Psi(x)$. This is possible only if:

$$-i\,\partial_\mu\alpha(x) + ieA'_\mu = ieA_\mu, \quad \text{i.e.,} \quad A'_\mu = A_\mu + \frac{1}{e}\,\partial_\mu\alpha(x) \qquad (6.4.7)$$

or that $A_\mu \to A_\mu + \frac{1}{e}\,\partial_\mu\alpha(x)$. The Lagrangian (6.4.2) for the theory, on the inclusion of the gauge field $A_\mu$, becomes:

$$L_0' = \bar\Psi\, i\gamma^\mu(\partial_\mu + ieA_\mu)\,\Psi - m\,\bar\Psi\Psi \qquad (6.4.8)$$

^6 $\gamma^\mu$ = Dirac matrix ($\mu$ = 0, 1, 2, 3) (see Subsec. 7.3.3); see Sec. 9.3 for the Dirac equation.

If, further, we make the gauge field a dynamical variable, we must add a term involving its derivatives to the Lagrangian $L_0'$. We must, however, keep in mind that any such addition must not alter the invariance property of $L_0'$. The term involving the derivatives of $A_\mu$ is denoted $L_A$:

$$L_A = -\frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} \qquad (6.4.9)$$

The second rank tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ in $L_A$ is called the field strength tensor, and the constant $-\frac{1}{4}$ provides the normalization factor. It is easy to check that $F_{\mu\nu}$ is gauge invariant (see Exc. 1), i.e.,

$$F'_{\mu\nu} = F_{\mu\nu}. \qquad (6.4.10)$$
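The covariance property (6.4.5) and the invariance (6.4.10) can be checked numerically. The following sketch is an illustration only: it works in 1+1 dimensions, treats $\Psi$ as a single complex function rather than a Dirac spinor, approximates derivatives by central differences, and uses arbitrary smooth test functions (`alpha`, `psi`, `a0`, `a1`) and an arbitrary coupling value `E`, all of which are assumptions made for the example.

```python
import cmath
import math

E = 0.3    # illustrative value of the coupling e (assumption)
H = 1e-4   # finite-difference step

def d(f, mu, t, x):
    """Central-difference approximation to d_mu f (mu = 0: t, mu = 1: x)."""
    if mu == 0:
        return (f(t + H, x) - f(t - H, x)) / (2 * H)
    return (f(t, x + H) - f(t, x - H)) / (2 * H)

# Hypothetical smooth test functions in 1+1 dimensions
def alpha(t, x):
    return math.sin(t) * math.cos(2 * x)

def psi(t, x):
    return cmath.exp(1j * (0.7 * t - 1.3 * x)) * (1 + 0.2 * x * x)

def a0(t, x):
    return t * x + math.cos(x)

def a1(t, x):
    return math.sin(t + x) - 0.5 * t * t

A = [a0, a1]

def shifted(mu):
    """A'_mu = A_mu + (1/e) d_mu alpha, as in (6.4.7)."""
    return lambda t, x: A[mu](t, x) + d(alpha, mu, t, x) / E

A_PRIME = [shifted(0), shifted(1)]

def psi_prime(t, x):
    """psi' = exp(-i alpha) psi, as in (6.4.3)."""
    return cmath.exp(-1j * alpha(t, x)) * psi(t, x)

def cov_deriv(field, pot, mu, t, x):
    """D_mu field = (d_mu + i e A_mu) field."""
    return d(field, mu, t, x) + 1j * E * pot[mu](t, x) * field(t, x)

def f01(pot, t, x):
    """F_01 = d_0 A_1 - d_1 A_0."""
    return d(pot[1], 0, t, x) - d(pot[0], 1, t, x)

t0, x0 = 0.4, -0.9
phase = cmath.exp(-1j * alpha(t0, x0))
covariance_error = max(
    abs(cov_deriv(psi_prime, A_PRIME, mu, t0, x0)
        - phase * cov_deriv(psi, A, mu, t0, x0))
    for mu in (0, 1)
)
invariance_error = abs(f01(A_PRIME, t0, x0) - f01(A, t0, x0))
print(covariance_error, invariance_error)
```

Both numbers should be small, limited by the finite-difference step rather than by the physics. The first checks $[D_\mu\Psi]' = e^{-i\alpha(x)}D_\mu\Psi$, and the second checks $F'_{01} = F_{01}$, which holds because the mixed derivatives of $\alpha$ cancel.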

The Lagrangian $L$ obtained by adding $L_0'$ and $L_A$,

$$L = \bar\Psi\, i\gamma^\mu(\partial_\mu + ieA_\mu)\,\Psi - m\,\bar\Psi\Psi - \frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} \qquad (6.4.11)$$

is the QED Lagrangian. Although our main aim (as presented above) was to give a mathematical formulation of gauge theory with U(1) symmetry group, we give in the following remark the main features of the (QED) theory based on the Lagrangian $L$. The details can be found in [4].

Remark 6.4.2: (A) The photon is massless because the term $A_\mu A^\mu$ is not gauge invariant. (B) The minimal coupling of the photon to the electron is contained in the covariant derivative $D_\mu\Psi$, which is constructed from the transformation property of the electron field. Thus the coupling of the photon to any matter field is determined by its transformation property under the symmetry group. (C) The Lagrangian does not have a gauge-field self coupling, since the photon does not carry a charge (or U(1) quantum number). Thus if there is no matter field, the theory is a free-field theory.

Example 6.4.3: Non-abelian gauge symmetry based on Yang-Mills theory. We now consider the field $\Psi$ as a column vector

$$\Psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$$

In physics terminology this is an isospin doublet. The transformation $\Psi \to \Psi'$ is done with the elements of SU(2) (the isospin group of Yang-Mills theory); thus we have:

$$\Psi(x) \to \Psi'(x) = \exp\left\{-\frac{i\,\tau\cdot\theta}{2}\right\}\Psi(x) \qquad (6.4.12)$$

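where $\tau = (\tau_1, \tau_2, \tau_3)$ stands for the Pauli matrices and $\theta = (\theta^1, \theta^2, \theta^3)$ represents the group parameters of transformation (see Section 3.7). Evidently the free Lagrangian

$$L_0 = \bar\Psi(x)\,(i\gamma^\mu\partial_\mu - m)\,\Psi(x) \qquad (6.4.13)$$

is invariant under the transformations when the group parameters $\theta^1, \theta^2, \theta^3$ are independent of space-time coordinates. When these become space-time dependent, i.e., $\theta^i = \theta^i(x)$, the symmetry transformations become local transformations, giving:

$$\Psi(x) \to \Psi'(x) = \exp\left\{-\frac{i\,\tau\cdot\theta(x)}{2}\right\}\Psi(x) \equiv U(\theta)\,\Psi(x). \qquad (6.4.14)(a)$$

The derivative term in (6.4.13) now changes to*:

$$\bar\Psi'(x)\,\partial_\mu\Psi'(x) = \bar\Psi(x)\,\partial_\mu\Psi(x) + \bar\Psi(x)\,[U(\theta)]^{-1}\,[\partial_\mu U(\theta)]\,\Psi(x). \qquad (6.4.14)(b)$$

To construct a gauge-invariant Lagrangian, we follow the steps of Example 6.4.1, i.e., we first choose vector gauge fields $A^i_\mu$ ($i = 1, 2, 3$) to form the gauge-covariant derivative

$$D_\mu\Psi = \left(\partial_\mu - ig\,\frac{\tau\cdot A_\mu}{2}\right)\Psi \qquad (6.4.15)$$

with $g$ as the minimal coupling constant. Next, we require that $D_\mu\Psi(x)$ has the same transformation property as $\Psi(x)$ does, i.e.,

$$D_\mu\Psi(x) \to [D_\mu\Psi(x)]' = U(\theta)\,D_\mu\Psi(x) \qquad (6.4.16)$$

Since $[D_\mu\Psi(x)]' = D'_\mu\Psi'(x) = \left(\partial_\mu - ig\,\frac{\tau\cdot A'_\mu}{2}\right)\Psi'(x)$, transformation (6.4.16) will hold only if:

$$\left(\partial_\mu - ig\,\frac{\tau\cdot A'_\mu}{2}\right)(U(\theta)\Psi) = U(\theta)\left(\partial_\mu - ig\,\frac{\tau\cdot A_\mu}{2}\right)\Psi \qquad (6.4.17)(a)$$

i.e., if

$$\left[\partial_\mu U(\theta) - ig\,\frac{\tau\cdot A'_\mu}{2}\,U(\theta)\right]\Psi = -ig\,U(\theta)\left(\frac{\tau\cdot A_\mu}{2}\right)\Psi \qquad (6.4.17)(b)$$

(Note that the $U(\theta)\,\partial_\mu\Psi$ term on the LHS cancels with the one on the RHS.) This equality implies that:

$$\frac{\tau\cdot A'_\mu}{2} = U(\theta)\,\frac{\tau\cdot A_\mu}{2}\,[U(\theta)]^{-1} - \frac{i}{g}\,[\partial_\mu U(\theta)]\,U^{-1}(\theta) \qquad (6.4.18)$$

The transformation law for the gauge fields $A_\mu$ given by the above relation makes the Lagrangian $L_0$ gauge-covariant. It can be checked that if $\theta(x) \ll 1$, then (6.4.18) gives (see Exc. 6.4.2):

$$A'^i_\mu = A^i_\mu + \epsilon^{ijk}\,\theta^j A^k_\mu - \frac{1}{g}\,\partial_\mu\theta^i \qquad (6.4.19)$$

* Since $U(\theta) = \exp\left\{-\frac{i\,\tau\cdot\theta(x)}{2}\right\}$, $U^\dagger(\theta) = [U(\theta)]^{-1}$.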

where $\epsilon^{ijk}$ are the familiar Levi-Civita (antisymmetric) symbols. The second term in (6.4.19) is the transformation for a triplet under the adjoint representation of SU(2). This shows that the $A_\mu$ carry the charges (in contrast to the abelian gauge case); see Remark 6.4.2 based on Example 6.4.1. We now construct the antisymmetric (second-rank) tensor for the gauge fields $A_\mu$ by setting:

$$(D_\mu D_\nu - D_\nu D_\mu)\,\Psi = -ig\,\frac{\tau\cdot F_{\mu\nu}}{2}\,\Psi \qquad (6.4.20)$$

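where $F^i_{\mu\nu}$ stands for:

$$F^i_{\mu\nu} = \partial_\mu A^i_\nu - \partial_\nu A^i_\mu + g\,\epsilon^{ijk}A^j_\mu A^k_\nu \qquad (6.4.21)(a)$$

or equivalently for:

$$\frac{\tau\cdot F_{\mu\nu}}{2} = \partial_\mu\frac{\tau\cdot A_\nu}{2} - \partial_\nu\frac{\tau\cdot A_\mu}{2} - ig\left[\frac{\tau\cdot A_\mu}{2}, \frac{\tau\cdot A_\nu}{2}\right] \qquad (6.4.21)(b)$$

Using the property (6.4.16), it can be checked that:

$$[(D_\mu D_\nu - D_\nu D_\mu)\Psi]' = U(\theta)\,(D_\mu D_\nu - D_\nu D_\mu)\,\Psi \qquad (6.4.22)$$

which leads to the transformation property for $F_{\mu\nu}$:

$$\tau\cdot F'_{\mu\nu} = U(\theta)\,(\tau\cdot F_{\mu\nu})\,U^{-1}(\theta) \qquad (6.4.23)$$

For infinitesimal transformations $\theta^i \ll 1$, this reduces to:

$$F'^i_{\mu\nu} = F^i_{\mu\nu} + \epsilon^{ijk}\,\theta^j F^k_{\mu\nu} \qquad (6.4.24)$$

This shows that, unlike the abelian case, the antisymmetric tensor $F^i_{\mu\nu}$ here transforms non-trivially, and as such is not invariant under gauge transformations, although the trace

$$\mathrm{Tr}\{(\tau\cdot F_{\mu\nu})(\tau\cdot F^{\mu\nu})\} \propto F^i_{\mu\nu}F^{i\,\mu\nu} \qquad (6.4.25)$$

is gauge invariant. The complete gauge-invariant Lagrangian of the theory can now be written as:

$$L = -\frac{1}{4}\,F^i_{\mu\nu}F^{i\,\mu\nu} + \bar\Psi\, i\gamma^\mu D_\mu\Psi - m\,\bar\Psi\Psi \qquad (6.4.26)$$

with the transformation properties given by (6.4.14), (6.4.16) and (6.4.23).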
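The transformation properties (6.4.23) to (6.4.25) can be illustrated numerically for one fixed pair of indices $(\mu, \nu)$. The sketch below is a minimal check under stated assumptions: the three components in `F` and the group parameters are arbitrary sample numbers, and the closed form used for $U(\theta)$ in `su2_element` is the standard expansion of the exponential of Pauli matrices, not a formula taken from the text.

```python
import math

TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def tau_dot(v):
    """The 2x2 matrix tau . v for a 3-component vector v."""
    return [[sum(v[a] * TAU[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]

def su2_element(theta):
    """U = exp(-i tau.theta/2) = cos(|theta|/2) 1 - i sin(|theta|/2) tau.n."""
    norm = math.sqrt(sum(t * t for t in theta))
    if norm == 0.0:
        return [[1, 0], [0, 1]]
    n_tau = tau_dot([t / norm for t in theta])
    c, s = math.cos(norm / 2), math.sin(norm / 2)
    return [[c * (i == j) - 1j * s * n_tau[i][j] for j in range(2)]
            for i in range(2)]

def transform(f, theta):
    """Components F'^i from tau.F' = U (tau.F) U^{-1}; F^i = Tr(tau_i tau.F)/2."""
    u = su2_element(theta)
    m = matmul(matmul(u, tau_dot(f)), dagger(u))  # U^{-1} = U^dagger
    return [(trace(matmul(TAU[a], m)) / 2).real for a in range(3)]

def cross(a, b):
    """(a x b)^i = eps^{ijk} a^j b^k."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

F = [0.3, -1.2, 0.8]  # sample components F^i for one fixed (mu, nu) (assumption)

# Finite transformation: F.F, proportional to the trace in (6.4.25), is unchanged.
F_big = transform(F, [0.9, -0.4, 1.7])
norm_error = abs(sum(x * x for x in F_big) - sum(x * x for x in F))

# Small transformation: compare with F^i + eps^{ijk} theta^j F^k of (6.4.24).
theta = [1e-4, -2e-4, 1.5e-4]
F_small = transform(F, theta)
shift = cross(theta, F)
linear_error = max(abs(F_small[i] - (F[i] + shift[i])) for i in range(3))
print(norm_error, linear_error)
```

The first number should vanish up to rounding, showing that $F^iF^i$ is unchanged although the individual components are not. The second should be of second order in $\theta$, consistent with (6.4.24) being a first-order statement.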


The first term of $L$ is sometimes referred to as the pure Yang-Mills term. It can be checked that, when written out in full, it contains factors that are trilinear and quadrilinear in $A^i_\mu$. These correspond to self-couplings of non-abelian gauge fields. Note that this self coupling of gauge fields does not occur in abelian gauge theory. We would like to observe here that the above theory can be generalized to higher dimensions. For instance we can take a simple group $G$, whose generators $\{g^a\}$ satisfy the structural relation:

$$[g^a, g^b] = i\,C^{abc}\,g^c \qquad (6.4.27)$$

The doublet $\Psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$ is replaced by an arbitrary column vector $\psi$ (called a multiplet). The multiplet $\psi$ belongs to a representation with corresponding representation matrices $\{T^a\}$ that obey the Lie bracket relation (6.4.27):

$$[T^a, T^b] = i\,C^{abc}\,T^c. \qquad (6.4.28)$$

The covariant derivative in this case becomes:

$$D_\mu\psi = (\partial_\mu - ig\,T^aA^a_\mu)\,\psi \qquad (6.4.29)$$

and the second-rank tensor for the gauge fields ($A^a_\mu$) is given by

$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g\,C^{abc}A^b_\mu A^c_\nu \qquad (6.4.30)$$

or equivalently by:

$$(T\cdot F)_{\mu\nu} = \partial_\mu(T\cdot A_\nu) - \partial_\nu(T\cdot A_\mu) - ig\,[T\cdot A_\mu, T\cdot A_\nu] \qquad (6.4.31)$$

The Lagrangian $L$ which is invariant under the transformations of the group $G$ has the same form as in the Yang-Mills case:

$$L = -\frac{1}{4}\,F^a_{\mu\nu}F^{a\,\mu\nu} + \bar\Psi\,(i\gamma^\mu D_\mu - m)\,\Psi \qquad (6.4.32)$$

though the entities $\Psi$, $A^a_\mu$, $D_\mu$ and $F^a_{\mu\nu}$ that are involved here have different transformation rules. At the end of Sec. 5 we shall return to a Lagrangian of most general nature for the standard SU(2) x U(1) model.

Exercise 6.4

1. Show that the antisymmetric tensor $F_{\mu\nu}$ of abelian gauge theory is gauge-invariant.
2. Obtain the transformation law for the gauge field $A_\mu$ when the group parameters ($\theta^i(x)$) are infinitesimally small.
3. Using Definition (6.4.15) of $D_\mu$, establish (6.4.21) and deduce the transformation property (6.4.24) when $\theta^i \ll 1$.

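Hints to Exercise 6.4

1. The gauge invariance of $F_{\mu\nu}$ can be established in two ways. First we note that

(i) $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$

and

(ii) $F'_{\mu\nu} = \partial_\mu A'_\nu - \partial_\nu A'_\mu = \partial_\mu\left(A_\nu + \frac{1}{e}\,\partial_\nu\alpha(x)\right) - \partial_\nu\left(A_\mu + \frac{1}{e}\,\partial_\mu\alpha(x)\right) = \partial_\mu A_\nu - \partial_\nu A_\mu + \frac{1}{e}\left(\partial_\mu\partial_\nu\alpha - \partial_\nu\partial_\mu\alpha\right)$

Since $\alpha(x)$ is a smooth function, $\frac{\partial^2\alpha}{\partial x^\mu\,\partial x^\nu} = \frac{\partial^2\alpha}{\partial x^\nu\,\partial x^\mu}$; hence (i) and (ii) are equal. To see it differently we use the equality:

(iii) $(D_\mu D_\nu - D_\nu D_\mu)\,\Psi(x) = \{(\partial_\mu + ieA_\mu)(\partial_\nu + ieA_\nu) - (\partial_\nu + ieA_\nu)(\partial_\mu + ieA_\mu)\}\,\Psi(x) = ie\,(\partial_\mu A_\nu - \partial_\nu A_\mu)\,\Psi(x)$.

The other six terms cancel in pairs. From (6.4.5) we also have

(iv) $[(D_\mu D_\nu - D_\nu D_\mu)\,\Psi(x)]' = (D'_\mu D'_\nu - D'_\nu D'_\mu)\,\Psi' = e^{-i\alpha}\,[(D_\mu D_\nu - D_\nu D_\mu)\,\Psi(x)]$

This gives $F'_{\mu\nu}\Psi' = e^{-i\alpha}(F_{\mu\nu}\Psi)$, or that $F'_{\mu\nu} = F_{\mu\nu}$.

2. We equate the two operators on $\Psi$ on either side of (6.4.17)(b) to write: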

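(i) $\partial_\mu U(\theta) - ig\,\frac{\tau\cdot A'_\mu}{2}\,U(\theta) = -ig\,U(\theta)\,\frac{\tau\cdot A_\mu}{2}$

On postmultiplication by $[U(\theta)]^{-1}$ we then obtain:

(ii) $\frac{\tau\cdot A'_\mu}{2} = U(\theta)\,\frac{\tau\cdot A_\mu}{2}\,[U(\theta)]^{-1} - \frac{i}{g}\,(\partial_\mu U(\theta))\,[U(\theta)]^{-1}$

When $\theta^i(x)$ are infinitesimals, i.e., $\theta(x) \ll 1$, we can write

(iii) $U(\theta) \approx 1 - i\,\frac{\tau\cdot\theta}{2}, \qquad [U(\theta)]^{-1} \approx 1 + i\,\frac{\tau\cdot\theta}{2}$

so that, to first order in $\theta$, (ii) becomes

(iv) $\frac{\tau^i A'^i_\mu}{2} = \frac{\tau^i A^i_\mu}{2} - i\,\theta^j A^k_\mu\left[\frac{\tau^j}{2}, \frac{\tau^k}{2}\right] - \frac{1}{g}\,(\partial_\mu\theta^i)\,\frac{\tau^i}{2}$

(The repeated indices i, j, k, l indicate summation.) The simplification of the middle term, using $\left[\frac{\tau^j}{2}, \frac{\tau^k}{2}\right] = i\,\epsilon^{jkl}\,\frac{\tau^l}{2}$, leads to:

(v) $\frac{\tau^i A'^i_\mu}{2} = \frac{\tau^i A^i_\mu}{2} + \epsilon^{jkl}\,\theta^j A^k_\mu\,\frac{\tau^l}{2} - \frac{1}{g}\,(\partial_\mu\theta^i)\,\frac{\tau^i}{2}$

or equivalently to:

(vi) $A'^i_\mu = A^i_\mu + \epsilon^{ijk}\,\theta^j A^k_\mu - \frac{1}{g}\,\partial_\mu\theta^i$.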

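3. Recall that

$D_\mu = \partial_\mu - ig\,\frac{\tau\cdot A_\mu}{2}$

Consequently

(i) $(D_\mu D_\nu - D_\nu D_\mu) = \left(\partial_\mu - ig\,\frac{\tau\cdot A_\mu}{2}\right)\left(\partial_\nu - ig\,\frac{\tau\cdot A_\nu}{2}\right) - \left(\partial_\nu - ig\,\frac{\tau\cdot A_\nu}{2}\right)\left(\partial_\mu - ig\,\frac{\tau\cdot A_\mu}{2}\right)$

After simplification, the RHS becomes:

(ii) $-ig\left\{\partial_\mu\frac{\tau\cdot A_\nu}{2} - \partial_\nu\frac{\tau\cdot A_\mu}{2} - ig\left[\frac{\tau\cdot A_\mu}{2}, \frac{\tau\cdot A_\nu}{2}\right]\right\}$

In view of the defining equation (6.4.20) we have:

(iii) $\frac{\tau\cdot F_{\mu\nu}}{2} = \partial_\mu\frac{\tau\cdot A_\nu}{2} - \partial_\nu\frac{\tau\cdot A_\mu}{2} - ig\left[\frac{\tau\cdot A_\mu}{2}, \frac{\tau\cdot A_\nu}{2}\right]$

To establish (6.4.21)(a), we write (iii) in terms of the components of $\tau$, $A_\mu$ and $A_\nu$; thus we have:

(iv) $\frac{\tau^i F^i_{\mu\nu}}{2} = \partial_\mu\frac{\tau^i A^i_\nu}{2} - \partial_\nu\frac{\tau^i A^i_\mu}{2} - ig\left[\frac{\tau^j A^j_\mu}{2}, \frac{\tau^k A^k_\nu}{2}\right]$

(repeated indices on either side stand for summation). In order to obtain the value of $F^i_{\mu\nu}$, we simplify the third term on the RHS and note that the $\tau_i$ are linearly independent; therefore the coefficients of $\tau_1$, $\tau_2$, $\tau_3$ on either side of (iv) can be equated. Writing $\left[\frac{\tau^j A^j_\mu}{2}, \frac{\tau^k A^k_\nu}{2}\right]$ as the difference of products:

(v) $\frac{1}{4}\left[(\tau_1 A^1_\mu + \tau_2 A^2_\mu + \tau_3 A^3_\mu)(\tau_1 A^1_\nu + \tau_2 A^2_\nu + \tau_3 A^3_\nu) - (\tau_1 A^1_\nu + \tau_2 A^2_\nu + \tau_3 A^3_\nu)(\tau_1 A^1_\mu + \tau_2 A^2_\mu + \tau_3 A^3_\mu)\right]$

we observe that six of these eighteen terms cancel in pairs, whereas the remaining twelve combine in pairs and finally reduce to six terms. For instance consider the four terms:

(vi) $\left[\frac{\tau_1}{2}\frac{\tau_2}{2} - \frac{\tau_2}{2}\frac{\tau_1}{2}\right]A^1_\mu A^2_\nu + \left[\frac{\tau_2}{2}\frac{\tau_1}{2} - \frac{\tau_1}{2}\frac{\tau_2}{2}\right]A^2_\mu A^1_\nu$

In view of the relation

$\left[\frac{\tau_1}{2}, \frac{\tau_2}{2}\right] = i\,\frac{\tau_3}{2}$

these terms can be written as:

$\left(i\,\frac{\tau_3}{2}\right)\left(A^1_\mu A^2_\nu - A^2_\mu A^1_\nu\right)$

Collecting these facts together, we equate the coefficient of $\tau_i$ on either side of (iv) and obtain the required equality given in (6.4.21)(a):

(vii) $F^i_{\mu\nu} = \partial_\mu A^i_\nu - \partial_\nu A^i_\mu + g\,\epsilon^{ijk}A^j_\mu A^k_\nu$.

To obtain (6.4.24) we substitute $1 - i\,\frac{\tau\cdot\theta}{2}$ for $U(\theta)$ in (6.4.23) and obtain, to first order in $\theta$:

(viii) $\tau\cdot F'_{\mu\nu} = \left(1 - i\,\frac{\tau\cdot\theta}{2}\right)(\tau\cdot F_{\mu\nu})\left(1 + i\,\frac{\tau\cdot\theta}{2}\right) = \tau\cdot F_{\mu\nu} - \frac{i}{2}\,(\tau\cdot\theta)(\tau\cdot F_{\mu\nu}) + \frac{i}{2}\,(\tau\cdot F_{\mu\nu})(\tau\cdot\theta)$

The last two terms written out in full are:

$-\frac{i}{2}\left[(\tau_1\theta^1 + \tau_2\theta^2 + \tau_3\theta^3)(\tau_1 F^1_{\mu\nu} + \tau_2 F^2_{\mu\nu} + \tau_3 F^3_{\mu\nu}) - (\tau_1 F^1_{\mu\nu} + \tau_2 F^2_{\mu\nu} + \tau_3 F^3_{\mu\nu})(\tau_1\theta^1 + \tau_2\theta^2 + \tau_3\theta^3)\right]$.

Out of these eighteen terms six cancel in pairs, and the other twelve combine as they did in (vi). After this simplification, we equate the coefficient of $\tau_i$ to obtain $F'^i_{\mu\nu} = F^i_{\mu\nu} + \epsilon^{ijk}\,\theta^j F^k_{\mu\nu}$.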

5 EXAMPLES OF THEORIES WITH GAUGE SYMMETRY

We shall see in the next section how the progress in present day gauge theory has been influenced by the theory of principal bundles. As mentioned earlier, the theorists in both areas were unaware of the implications of each other's findings. Only when it was realized that the two main ingredients of principal bundles, namely the connection and the curvature ^7, could be identified with the gauge fields ($A_\mu$) and the second rank tensor ($F_{\mu\nu}$) did the two begin to mingle. Gauge theories then received enormous research impetus from both points of view. We devote this section to enumerating the principal gauge theories currently in use. Also, to prepare ourselves for the bundle theoretic approach in the next section, we show how geometric methods can be used to introduce the gauge concept.

^7 In contrast to the terminology used in physics, in bundle theoretic gauge theory the curvature is called the gauge field and the connection the gauge potential.

5.1 Maxwell and Yang-Mills Equations in Classical Form

We begin with Maxwell's equations and apply to them the derivation rules of differential forms defined on the Minkowski space. Recall that Maxwell's equations in classical form are:

$$\mathrm{div}\,\mathbf{E} = \rho, \qquad \mathrm{div}\,\mathbf{B} = 0; \qquad \mathrm{curl}\,\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \mathrm{curl}\,\mathbf{B} = \mathbf{J} + \frac{\partial\mathbf{E}}{\partial t} \qquad (6.5.1)$$

where $\mathbf{E} = \mathbf{E}(t)$ and $\mathbf{B} = \mathbf{B}(t)$ are time dependent electric and magnetic fields defined on a subset of $R^3$, and $\rho$ and $\mathbf{J}$ are the charge and current densities respectively (see Exc. 1). It is worth noting here that these equations unified the theories of electricity and magnetism and thus led to important advances in physics. When techniques of differential geometry and form calculus were introduced on the Minkowskian space $M^4$ (with metric signature $(-1, 1, 1, 1)$), it became apparent that both electric and magnetic fields could be written as components of a skew-symmetric tensor, or equivalently of a differential 2-form $F$. Using the calculus of forms the equations could be succinctly written as:

$$dF = 0, \qquad \delta F = {*d*}F = j = (\mathbf{J}, \rho) \qquad (6.5.2)$$

The $d$ and its adjoint $\delta$ in the above equation are the differential and codifferential operators, and $j$ denotes the current density 1-form. It is easy to check (using the standard chart on $M^4$ and the induced bases of tensor spaces) that the components of the second order tensor $F_{ij}$ ($i < j$) on $M^4$ can be identified with the components of the vectors $E_i$ and $B_i$ on $R^3$ as follows:

$$F_{i4} = E_i, \qquad F_{ij} = \epsilon_{ijk}B_k \qquad (6.5.3)$$

The indices $i, j, k$ in (6.5.3) stand for 1, 2, 3 and $\epsilon_{ijk}$ is the totally antisymmetric symbol with $\epsilon_{123} = 1$. Before we look into the gauge aspect of Maxwell's equations, we note that they are globally invariant under the conformal group and in particular under the Lorentz group. In order to see their invariance under the gauge transformations we note that on $M^4$, the vanishing of $dF$ implies that $F$ must be the differential of a 1-form $A$ (classically known as the 4-potential), in other words $F = dA$. Naturally the choice of $A$ cannot be unique, since if $A'$ is another 1-form such that $F = dA'$, we shall have:

$$F - F = dA' - dA = 0$$

and this suggests that $A'$ and $A$ in turn are related to each other by an equation:

$$A' = A + d\Psi \equiv A^\Psi \quad \text{where} \quad \Psi \in \mathcal{F}(M^4) \qquad (6.5.4)$$

i.e., $A \to A^\Psi = A + d\Psi$. Now $\Psi$ is a real smooth function, therefore the above relation implies that there is a change of scale at each point. If this local scale is replaced by a local phase taking values in the unitary group U(1), then there follows a gauge-theoretic formulation of these equations. The gauge (local scale) transformation function $\Psi$ of equation (6.5.4) is replaced by the gauge transformation $g := e^{i\theta}$ that involves the phase, and as a result (6.5.4) becomes:

$$iA \mapsto iA^g := g^{-1}(iA)\,g + g^{-1}dg \qquad (6.5.5)$$

where $g = e^{i\theta} \in \mathcal{F}(M^4, U(1))$, i.e., $g$ is a smooth function defined on $M^4$ and taking values in U(1).
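To see how (6.5.5) reduces to (6.5.4), the transformation can be evaluated explicitly for $g = e^{i\theta}$. The short computation below is a worked check under the assumption that $\theta$ is a real smooth function on $M^4$; it uses only the fact that U(1) is abelian and that $d^2 = 0$.

```latex
% Assumption: g = e^{i\theta} with \theta a real smooth function on M^4
\begin{align*}
g^{-1}(iA)\,g &= iA && \text{(elements of $U(1)$ commute)} \\
g^{-1}\,dg &= e^{-i\theta}\left(i\,e^{i\theta}\,d\theta\right) = i\,d\theta \\
iA^{g} &= iA + i\,d\theta \quad\Longrightarrow\quad A^{g} = A + d\theta \\
dA^{g} &= dA + d(d\theta) = dA = F
\end{align*}
```

Thus the phase $\theta$ plays the role of the function $\Psi$ in (6.5.4), and the 2-form $F$ is unchanged by the transformation.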


(Compare Eq. (6.5.5) with Eq. (ii) in the Hint to Exc. (6.4.2).) Note that in the above formulation the Minkowskian space $M^4$ can also be taken as a 4-dimensional Lorentz manifold, usually referred to as the space-time manifold in Einstein's theory of gravitation (see Chapter 8). Naturally, therefore, this allows the generalization of gauge theory to Einstein's theory of gravitation. However, we do not pursue it here. Instead we consider the Yang-Mills field equations, written for the vector potential $b_\mu$ of isotopic spin in interaction with a field $\Psi$ of isotopic spin 1/2. These field equations in classical form look like:

$$\frac{\partial f_{\mu\nu}}{\partial x_\nu} + 2\epsilon\,(b_\nu \times f_{\mu\nu}) + J_\mu = 0 \qquad (6.5.6)$$

where the symbol $\times$ stands for the cross product between $b_\nu$ and the SU(2) valued gauge field:

$$f_{\mu\nu} = \frac{\partial b_\mu}{\partial x_\nu} - \frac{\partial b_\nu}{\partial x_\mu} - 2\epsilon\,(b_\mu \times b_\nu) \qquad (6.5.7)$$

(Note that Eq. (6.5.7) can be identified with Eq. (6.4.21)(a).) The quantity $J_\mu$ denotes the current density of the field $\Psi$ (called the source field). These equations could be thought of as a matrix-valued generalization of the equations for the classical vector potential of Maxwell's theory. For more than a decade there did not appear to be any applicability of these equations, since the massless particles predicted by the theory could not be identified with any particle (see Chap. 7 in [16]). This was in contrast to Maxwell's equations, where the massless particle was identified as the photon, the carrier of the electromagnetic field.

5.2 Other Important Gauge Theories; Spontaneously Broken Symmetry

In the late sixties the spontaneous symmetry breaking phenomenon introduced by Higgs (known as the Higgs mechanism) altered the situation completely (see the Lagrangian $L_{WS}$ given in (6.5.10)). The mechanism, which broke the existing symmetry, eventually led to massive gauge vector bosons and thus helped in circumventing the difficulty created by the prediction of massless particles in Yang-Mills theory. Besides these two classical gauge theories, we currently know three more gauge theories with different gauge groups. These are as follows:

1. The electroweak theory ^9: This theory gives a unified treatment of interactions of (long range) electromagnetic forces and (short range) weak forces. The carriers of the weak forces, (universally) denoted $W^+$, $W^-$, $Z^0$, are called weak intermediate vector bosons. Unlike the photon, the carrier particle of electromagnetic interactions, the particles $W^+$, $W^-$, $Z^0$ are massive. This is quite understandable from their short range behavior in view of the energy principle. The underlying gauge group of the theory is U(1) x SU(2). It is called the electroweak gauge group. The U(1) factor is called the weak hypercharge gauge group and is denoted $U_Y(1)$, while the SU(2) factor is called the weak isospin gauge group and is denoted $SU_L(2)$. The subscript $L$ in $SU_L(2)$ denotes the action of SU(2) on left-handed fermions. The theory is left-right asymmetric; thus if $f_L$ and $f_R$ denote the sets of left-handed and right-handed fermions, and $Y$ is the quantum number assigned to them, then the following relations hold:

^8 One of the ways to break the symmetry is to introduce a term in the Lagrangian $L$ which no longer allows $L$ to be invariant under the given symmetry group. We refer the reader to [4] and [16] for a readable description of this concept.

^9 This is also sometimes called the standard model, as we shall see in the Lagrangian $L_{WS}$ at the end of this section (see J.C. Taylor in P. Dita, et al., [6] on Mathematical Analysis of Standard Electroweak Theory).

\[ \sum_{f_L} Y = \sum_{f_R} Y = \sum_{f_{L2}} Y = 0 \tag{6.5.8} \]

where $f_{L2}$ denotes the set of left-handed fermion doublets (see Exc. 4).

2. Quantum chromodynamics, or QCD for short: This is the theory of strong forces. The underlying gauge group, called the color gauge group, is denoted $SU_C(3)$. The elementary particles of the theory are gluons and quarks. The gluons are massless spin-1 particles with short range behavior; unlike photons, which are electrically neutral, they themselves carry (color) charge and (color) magnetic moment, etc. The quarks, on the other hand, are spin-1/2 particles; they are characterized by their "flavors" (called up, down, strange, charmed, bottom and top) and the "colors" (red, green and blue) in which these flavors come.¹⁰ The main features of the theory are: it is asymptotically free (i.e., the coupling strength of the interacting forces decreases at short distances), and it admits renormalization (see Chapter 9 for renormalization). For an elementary description of the theories introduced in 1 and 2 the reader should refer to Chapters 11 and 16 in [4] and Chapters 9, 10 and 11 in [16]. See also Exc. 5 for a Lagrangian of the theory.

3. Standard Model (or SM for short): This model combines the Glashow-Weinberg-Salam theory of electroweak interactions and quantum chromodynamics; in other words, it is a model that unifies the strong, weak and electromagnetic forces. It can be viewed as a Yang-Mills theory of quark and lepton interactions based on the gauge group S(U(2) × U(3)) defined as follows:

\[ S(U(2)\times U(3)) = \left\{ U = \begin{pmatrix} U(2) & 0 \\ 0 & U(3) \end{pmatrix} \;:\; \det U = 1 \right\} \tag{6.5.9} \]

This group S(U(2) × U(3)), formed by a set of 5 × 5 unitary matrices, is a subgroup of SU(5). It has the same Lie algebra as U(1) × SU(2) × SU(3), and therefore the underlying group of the SM is sometimes also given as $SU_C(3) \times SU_L(2) \times U_Y(1)$. In spite of this similarity, however, one should remember that while the SM group S(U(2) × U(3)) is symmetric under complex conjugation, the chiral quark and lepton representations are not (for a readable account of this model, refer to Chapter 3 in Ref. [11]).

Having familiarized ourselves with present day gauge theories, we move on to their mathematical formulation in the next section. But before this we give below the Lagrangian $L_{WS}$ that we mentioned above. In other words, we describe the Lagrangian of the standard model whose underlying group is SU(2) × U(1).¹¹ We refrain from its mathematical analysis due to our limited scope and also because this analysis is done beautifully by Ellis in [11]; we simply note that $L_{WS}$ is SU(2)-invariant and is also

10 Before the discovery of quarks, protons and neutrons were regarded as the "elementary" particles of nature. A proton contains two up quarks and one down quark (one of each color), and a neutron contains two down quarks and one up quark, again one of each color (see Table S).

11 The Glashow-Weinberg-Salam model was the 'standard model' before the advent of the model that combined it with QCD. (See also page 800.)


renormalizable. For the convenience of the motivated reader, we use the notation of the above reference. The full Lagrangian, denoted $L_{WS}$ (Weinberg-Salam), equals¹²:

\[ L_{WS} = L_G + \delta L + L_f + L_\phi + L_Y + L_V. \tag{6.5.10} \]

The six components of the sum are given as follows:

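(a)
\[ L_G = -\tfrac{1}{4}\, G^a_{\mu\nu} G^{a\mu\nu} - \tfrac{1}{4}\, F_{\mu\nu} F^{\mu\nu} \]
where
\[ G^a_{\mu\nu} \equiv \partial_\mu W^a_\nu - \partial_\nu W^a_\mu + g\,\epsilon_{abc}\, W^b_\mu W^c_\nu \]
and
\[ F_{\mu\nu} \equiv \partial_\mu B_\nu - \partial_\nu B_\mu \]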

are respectively the field strength tensors of the SU(2) gauge fields $W^a_\mu$ and the U(1) gauge field $B_\mu$, and g is the coupling constant¹³. $L_G$ represents intermediary vector boson (IVB) interactions.

(b)
\[ \delta L \propto \theta\, G^a_{\mu\nu}\,\tilde{G}^{a\mu\nu} \quad\text{where}\quad \tilde{G}^a_{\mu\nu} = \tfrac{1}{2}\,\epsilon_{\mu\nu\rho\sigma}\, G^{a\rho\sigma}. \]

The term $\delta L$ allows for a possible P- and CP-violating SU(2) gauge term¹⁴ (see R.D. Peccei and Helen R. Quinn in [11] or [24]).

(c)
\[ L_f = -\sum_f \Big[\, \bar{f}_L \gamma^\mu \Big( \partial_\mu + i g \frac{\tau_a}{2} W^a_\mu + i g' Y_L B_\mu \Big) f_L + \bar{f}_R \gamma^\mu \big( \partial_\mu + i g' Y_R B_\mu \big) f_R \Big] \]
(see footnote 15), where $f_L$ and $f_R$ are left- and right-handed fermions, $f_L$ being an SU(2)-doublet and $f_R$ an SU(2)-singlet (see Chapters 7 and 11 for more on fermions and also Exc. 4). The entities $g'$ and $Y_{L,R}$ are the (arbitrary) U(1) hypercharge coupling and the fermion hypercharges respectively. $L_f$ is the fermionic kinetic part of L. The quantities in round parentheses are covariant derivative operators acting on the fermion fields, and they yield the fermion-fermion-boson vertices.

(d)

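\[ L_\phi = -\Big|\, \Big( \partial_\mu + i g \frac{\tau_a}{2} W^a_\mu + i g' Y_\phi B_\mu \Big) \phi \,\Big|^2 \]
where $\phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}$ is a single complex SU(2)-doublet of elementary Higgs fields. The antiparticle corresponding to $\phi$ is $\phi^\dagger = (\phi^-, \bar{\phi}^0)$ (see footnote 16). The hypercharge $Y_\phi$ is generally chosen to be 1/2. $L_\phi$ is the kinetic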

12 Readers in particle physics may refer to Sec. 12.2 in [4] for an analysis of $L_{WS}$.

13 Note that our notations for $W^a_\mu$ and $B_\mu$ in Sec. 4 are $A^a_\mu$ and $A_\mu$.

14 P = parity (space inversion), CP = space inversion P followed by charge conjugation C (see Chapter 7 for C).

15 The absence of a $W_\mu$ interaction term in the second parenthesis is due to the fact that $f_R$ is an SU(2)-singlet.

16 The superscripts +, − and 0 in $\phi$ and $\phi^\dagger$ indicate positive, negative and neutral charge (see also Table S).


term for the Higgs fields; when $\langle 0|\phi|0\rangle \neq 0$ this interaction results in a direct W-φ coupling, thereby giving masses to the W bosons.

(e)
\[ L_Y = -\sum_{f,f'} \Big[\, H_{ff'}\, (\bar{f}_L \cdot \phi)\, f'_R + H^*_{ff'}\, \bar{f}'_R\, (\phi^\dagger \cdot f_L) \Big] \]
where $H_{ff'}$ is a general coupling matrix in the space of the different fermion species; these couplings give rise to the fermion mass matrix when $\langle 0|\phi|0\rangle \neq 0$, and also to the Cabibbo mixing angles of the Kobayashi-Maskawa matrix (see [4], [16] for details on CKM). $L_Y$ is the Lagrangian of the interaction between the fermions and the Higgs fields.

(f)
\[ L_V = (\partial_\mu \phi^\dagger)(\partial^\mu \phi) - V(\phi) \quad\text{with}\quad V(\phi) = \mu^2 (\phi^\dagger \phi) + \tfrac{1}{2}\lambda\, (\phi^\dagger \phi)^2 \]
is the Lagrangian that represents the Higgs self-interactions (see Eq. (6.3.18)). Next we replace the SU(2) doublet φ in (f) by a real scalar field φ and use the resulting Lagrangian to demonstrate the phenomenon of spontaneous symmetry breaking (SSB). The Lagrangian (f) now reads:

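\[ L = (\partial_\mu \phi)(\partial^\mu \phi) - \mu^2 \phi^2 - \tfrac{1}{2}\lambda\, \phi^4 = (\partial_\mu \phi)(\partial^\mu \phi) - V(\phi) \tag{6.5.11} \]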

The last two terms show that the field φ is self-interacting, and the positivity of λ ensures that the energy is bounded from below. Evidently the Lagrangian is invariant under the transformation

\[ \phi \to -\phi \tag{6.5.12} \]

Hence, from definition (vi) of Sec. 2, SSB would follow if the vacuum is not invariant. The following two figures illustrate the qualitatively different behaviour of the (physical) system underlying the above Lagrangian in the two cases.

[Figure: the potential $V(\phi)$ plotted against φ for (a) $\mu^2 > 0$ and (b) $\mu^2 < 0$. Caption: Symmetry breaking phenomena for the potential $V(\phi)$.]

In case (a) the vacuum expectation value of the field is

\[ \langle \phi \rangle_0 = 0 \tag{6.5.13} \]

The vacuum $|0\rangle$ is invariant, thus the symmetry is not broken. The parameter μ here plays the role of a mass. This situation is called the Wigner mode.


In case (b) ($\mu^2 < 0$), the potential $V(\phi)$ has minima at

\[ \langle \phi \rangle_0 = \pm\sqrt{-\frac{\mu^2}{\lambda}} = \pm v \tag{6.5.14} \]

This gives two degenerate vacuum states. The origin (in Fig. (b)) is no longer a stable point. In other words, the vacuum (ground state) is not invariant. Thus the 'symmetry is spontaneously broken.' We note that the change from situation (a) to (b) (illustrated in the two figures) is called a phase transition, with $\mu^2$ playing the role of order parameter.

In conclusion, we note two important aspects of the SSB phenomenon in case the spontaneously broken symmetry (SBS) is a continuous one.
(1) If the SBS is a continuous global symmetry, one massless scalar field (Goldstone boson) must appear in the theory for each group generator that has been broken.
(2) If the SBS is a continuous local gauge symmetry, then no Goldstone bosons are produced; instead there are gauge bosons that may acquire a mass without spoiling gauge invariance. This is the Higgs mechanism (see Chapter 10).

As an exercise the reader should check that if the real scalar field in (6.5.11) is replaced by a complex scalar field, the Lagrangian is U(1)-invariant (under 'global' phase transformations). The minima (for $\mu^2 < 0$) of the potential $V(\phi)$ lie in the complex φ-plane on a circle of radius

\[ |\phi| = \sqrt{-\frac{\mu^2}{\lambda}} = v \tag{6.5.15} \]

It is a case of spontaneously broken symmetry: there is an infinity of degenerate ground states, and this is case (1) of SBS given above (check why).
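The two modes of the real scalar model can also be checked numerically. The sketch below is only an illustration: it assumes the quartic potential $V(\phi) = \mu^2\phi^2 + \tfrac{1}{2}\lambda\phi^4$ for the real field of (6.5.11), and the function names are ours, not the book's. It returns the single symmetric vacuum for $\mu^2 \ge 0$ (Wigner mode) and the two degenerate vacua $\pm v$ of (6.5.14) for $\mu^2 < 0$.

```python
import math


def potential(phi, mu2, lam):
    """Assumed potential V(phi) = mu^2 phi^2 + (1/2) lambda phi^4 (real field)."""
    return mu2 * phi ** 2 + 0.5 * lam * phi ** 4


def potential_slope(phi, mu2, lam):
    """dV/dphi = 2 mu^2 phi + 2 lambda phi^3; it vanishes at every stationary point."""
    return 2.0 * mu2 * phi + 2.0 * lam * phi ** 3


def vacuum_values(mu2, lam):
    """Return the field values minimising V, assuming lam > 0."""
    if lam <= 0:
        raise ValueError("lam must be positive so that V is bounded from below")
    if mu2 >= 0:
        return [0.0]              # case (a): unique, symmetric vacuum
    v = math.sqrt(-mu2 / lam)     # case (b): two degenerate vacua +v and -v
    return [-v, v]


if __name__ == "__main__":
    for mu2 in (1.0, -1.0):
        minima = vacuum_values(mu2, lam=0.5)
        print(mu2, minima, [potential(p, mu2, 0.5) for p in minima])
```

For $\mu^2 < 0$ the value of V at $\pm v$ lies below V(0) = 0, which is the numerical counterpart of the statement that the origin is no longer a stable point.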

Exercise 6.5

1. Consider the Maxwell equations in classical form:
(i) $\nabla \cdot \mathbf{E} = \rho$ (Gauss's law)
(ii) $\nabla \times \mathbf{B} = \frac{1}{c}\mathbf{J} + \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t}$
(iii) $\nabla \cdot \mathbf{B} = 0$ (No free magnetic poles)
(iv) $\nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{B}}{\partial t}$ (Faraday's law, when c is set to 1)
where E and B are the electric and magnetic fields and $\rho = \rho(x, t)$ and $\mathbf{J} = \mathbf{J}(x, t)$ are the charge and current densities. Using these equations show that the predictions of Maxwell's theory for observable quantities are gauge invariant.

2. Give an example of 'gauge fixing'. What is a Coulomb gauge? And what are its implications for Maxwell's equations? Show that the electric field E and the magnetic field B can be thought of as transverse fields.

3. What are the transverse electromagnetic waves in free space? Show how you would determine the energy of radiation fields.

4. Use the algebraic approach to electroweak theory to show why the underlying group SU(2) × U(1) is denoted here as $SU_L(2) \times U_Y(1)$.


5. Write a Lagrangian for QCD and point out its similarities with the Lagrangian of the Yang-Mills theory.

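Hints to Exercise 6.5

1. In order to prove that the predictions for observable quantities resulting from Maxwell's theory are gauge invariant, we must show that these equations can be expressed in terms of potentials. Using the fact that the operator $\nabla \cdot \nabla \times$ reduces any vector field to zero, we note that equation (iii) implies that
(a) $\mathbf{B} = \nabla \times \mathbf{A}$
where $\mathbf{A} \equiv \mathbf{A}(x, t)$ is an arbitrary vector potential. Again from
\[ \frac{\partial \mathbf{B}}{\partial t} = \frac{\partial}{\partial t}(\nabla \times \mathbf{A}) = \nabla \times \frac{\partial \mathbf{A}}{\partial t} \]
and from equation (iv) it follows that:
(b) $\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}$
where $\phi \equiv \phi(x, t)$ is a scalar potential. Since A and φ are arbitrary, we note that (a) and (b) continue to hold, with the same E and B, when these are replaced by A' and φ' given by
(c) $\mathbf{A}' = \mathbf{A} - \nabla f$
(d) $\phi' = \phi + \frac{1}{c}\frac{\partial f}{\partial t}$
f being a smooth function on $R^3$ (or on a subset of $R^3$). Equations (c) and (d) are called the gauge transformations of the potentials. Substituting the value of E from (b) in equation (i) we obtain: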

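(e)
\[ \nabla \cdot \Big( -\nabla\phi - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} \Big) = -\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}(\nabla \cdot \mathbf{A}) = \rho \]
We use the d'Alembert operator (see Chap. 3, Sec. 1)
\[ \Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \]
in the above equation to write it as:
(e')
\[ \Box\phi - \frac{1}{c}\frac{\partial}{\partial t}\Big( \frac{1}{c}\frac{\partial \phi}{\partial t} + \nabla \cdot \mathbf{A} \Big) = \rho \]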

Similarly we substitute the value of B from (a) in (ii) and obtain, after simplification and use of the operator $\Box$, the required expression in the potentials:
(f)
\[ \Box\mathbf{A} + \nabla\Big( \frac{1}{c}\frac{\partial \phi}{\partial t} + \nabla \cdot \mathbf{A} \Big) = \frac{1}{c}\mathbf{J} \]
As equations (e') and (f) are expressed in terms of the potentials, and the fields E and B are unchanged by the gauge transformations (c) and (d), we have proved that Maxwell's theory is gauge invariant; in other words, its predictions are gauge invariant (see also Sec. 7). When ρ and J are zero, the fields E and B are said to be free fields and the Maxwell equations are called source free equations.
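The step that E and B themselves do not change under a gauge transformation can be written out in two lines. The computation below is a sketch that assumes the sign convention $\mathbf{A}' = \mathbf{A} - \nabla f$, $\phi' = \phi + \frac{1}{c}\frac{\partial f}{\partial t}$ for the transformations (c) and (d); with the opposite convention both signs flip together and the conclusion is the same.

```latex
\begin{align*}
\mathbf{B}' &= \nabla \times \mathbf{A}'
             = \nabla \times \mathbf{A} - \nabla \times \nabla f
             = \mathbf{B}, \\
\mathbf{E}' &= -\nabla \phi' - \frac{1}{c}\frac{\partial \mathbf{A}'}{\partial t}
             = -\nabla \phi - \frac{1}{c}\nabla\frac{\partial f}{\partial t}
               - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}
               + \frac{1}{c}\frac{\partial}{\partial t}\nabla f
             = \mathbf{E}.
\end{align*}
```

The first line uses $\nabla \times \nabla f = 0$, and the second uses the fact that the time and space derivatives of a smooth function f commute.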


2. Consider a set of equations such as Maxwell's that remain invariant under gauge transformations (for example Equations (c) and (d) of Exc. 1), and use this invariance property to choose a set of potentials (A, φ) that satisfy
(i) $\nabla \cdot \mathbf{A} + \frac{\partial \phi}{\partial t} = 0$ (we have taken c = 1)
or choose a set (A, φ) so that
(ii) $\nabla \cdot \mathbf{A} = 0$
A procedure of this type is called 'fixing the gauge' (see Definition (iii) in Sec. 2). The first of these choices is called the Lorentz gauge or Lorentz condition. The second choice is the Coulomb gauge, or the radiation gauge. The solutions of $\nabla \cdot \mathbf{A} = 0$ are called the radiation fields*. Now a vector field with vanishing divergence is called a transverse field. Thus in the Coulomb gauge A is a transverse vector field. It can be checked that when ρ and J are zero, in the Coulomb gauge we have:
(a) $\nabla^2\phi = 0$, $\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}$
Evidently when ρ = 0, E is a transverse field, and B is a transverse field by equation (iii) of Exc. 1 itself.

3. Write the wave function A(x, t) as $\mathbf{A}_0\, e^{i(\mathbf{K}\cdot\mathbf{x} - \omega t)}$. It can be easily checked that in the Coulomb gauge $\mathbf{K}\cdot\mathbf{A} = 0$, i.e., A is perpendicular to the direction of propagation K of the wave. The solutions of $\Box\mathbf{A} = 0$ are called the transverse electromagnetic waves in free space. From the above exercise we know that these are also referred to as radiation fields. The energy is naturally given in terms of E and B, hence it is:

\[ \tfrac{1}{2}\int (\mathbf{E}^2 + \mathbf{B}^2)\, d^3x. \]

4. We show here how U(1) and SU(2), the gauge groups of the theories of Maxwell and Yang-Mills (respectively), are used in formulating the electroweak theory. This is done with the help of the IVB theory, where one writes the basic interaction (of weak forces) as:
(i) $L = g\,(J_\mu W^\mu + \text{h.c.})$
($J_\mu$ represents the weak current, $W^\mu$ a massive field, g a coupling constant and h.c. = Hermitian conjugate.) When this theory consists of an electron (e) and its neutrino ($\nu_e$), the Lagrangian in (i) is denoted as:
(ii) $L_W = g\,(J_\lambda W^\lambda + \text{h.c.})$
where
(iii) $J_\lambda = \bar{\nu}_e \gamma_\lambda (1 - \gamma_5)\, e$
is the V-A charged current (for details on the V-A formalism used in the low energy theory of weak interactions for charged currents see Chapters 5 and 11 in [4] and Chapter 6 in [16]),

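* The current J in (f) of Exc. 1 can be decomposed into longitudinal and transverse parts $\mathbf{J}_\parallel$ and $\mathbf{J}_\perp$. These parts satisfy $\nabla \times \mathbf{J}_\parallel = 0$, $\nabla \cdot \mathbf{J}_\perp = 0$. In the Coulomb gauge the R.H.S. of (f) reduces to the transverse part $\mathbf{J}_\perp$, hence the gauge $\nabla \cdot \mathbf{A} = 0$ is also called the transverse gauge (see [28]).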


whereas the electromagnetic interaction of these leptons (electron and electron neutrino) is given by the Lagrangian¹⁷:
(iv) $L_{em} = e\, J^{em}_\lambda A^\lambda$
where
(v) $J^{em}_\lambda = -\bar{e}\gamma_\lambda e$.

We note that the three currents $J_\lambda$, $J^\dagger_\lambda$ and $J^{em}_\lambda$ given in (iii) and (v) do not close under the Lie bracket to form an algebra (which apparently says that SU(2), the simplest group with three generators, cannot be the group of electroweak theory). We define the weak and electric charges respectively as

(vi) $T_+(t) = \frac{1}{2}\int d^3x\, J_0(x) = \frac{1}{2}\int d^3x\, \nu_e^\dagger (1 - \gamma_5)\, e$
(vii) $Q(t) = \int d^3x\, J^{em}_0(x) = -\int d^3x\, e^\dagger e$
and we further note that $T_-(t) = T_+^\dagger(t)$. Using the canonical anticommutation relation for fermions:
(viii) $\{\psi_i^\dagger(x, t), \psi_j(x', t)\} = \delta_{ij}\, \delta^3(x - x')$
we have
(ix) $[T_+(t), T_-(t)] = 2T_3(t)$
where
(x) $T_3(t) = \frac{1}{4}\int d^3x\, [\nu_e^\dagger (1 - \gamma_5)\nu_e - e^\dagger (1 - \gamma_5) e]$
Evidently $T_\pm$, Q do not form a closed algebra¹⁸ but $T_\pm$, $T_3$ do, as shown in (ix). In spite of this, $T_\pm$, $T_3$ are not sufficient for a formulation of the theory, therefore we have to introduce another gauge boson coupled to $T_3$. These four generators will then generate the group SU(2) × U(1) (required for the theory). This can be done in more than one way. For our case we choose to consider the fermionic sector of the standard model composed of the e, $\nu_e$ leptons and the u, d quarks only. For reasons of 'conservation of helicity' in gauge interactions, we have to have independent left-handed and right-handed fermions. Thus we have the following fifteen two-component fermions:
(xi) $\psi = \nu_{eL},\ e_L,\ e_R,\ u_L,\ d_L,\ u_R,\ d_R$
where the color indices α = 1, 2, 3 on the quark fields have been suppressed and $e_L$ and $e_R$ stand respectively for $\frac{1}{2}(1 - \gamma_5)e$ and $\frac{1}{2}(1 + \gamma_5)e$. Using these we can write the weak charges as:
(xii) $T_+ = \int d^3x\, (\nu_{eL}^\dagger e_L + u_L^\dagger d_L)$, $\quad T_- = (T_+)^\dagger$,
$T_3 = \frac{1}{2}\int d^3x\, (\nu_{eL}^\dagger \nu_{eL} - e_L^\dagger e_L + u_L^\dagger u_L - d_L^\dagger d_L)$

17 See Table S for a characterization of electrons, neutrinos and leptons; V-A = vector and axial vector operators; the superscript † denotes the Hermitian conjugate.

18 Q cannot be a generator of SU(2), since the charges of a complete multiplet must add up to zero in order to fulfill the requirement that the generators of SU(2) be traceless.


which form the generators of SU(2). From the expressions for the SU(2)-generators, it also follows that
(xiii) $l_L = \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L$ and $q_L = \begin{pmatrix} u \\ d \end{pmatrix}_L$
are SU(2)-doublets and $e_R$, $u_R$, $d_R$ are singlets. Next the U(1) group has to be chosen in such a manner that the electric charge
(xiv) $Q = \int d^3x\, \big[ -e_L^\dagger e_L - e_R^\dagger e_R + \frac{2}{3}(u_L^\dagger u_L + u_R^\dagger u_R) - \frac{1}{3}(d_L^\dagger d_L + d_R^\dagger d_R) \big]$
can be a linear combination of the U(1)-generator and the generator $T_3$ of SU(2). We note that the combination
(xv) $Q - T_3 = \int d^3x\, \big[ -\frac{1}{2}(\nu_{eL}^\dagger \nu_{eL} + e_L^\dagger e_L) + \frac{1}{6}(u_L^\dagger u_L + d_L^\dagger d_L) - e_R^\dagger e_R + \frac{2}{3} u_R^\dagger u_R - \frac{1}{3} d_R^\dagger d_R \big]$
has the property of giving the same quantum number to all members of an SU(2) doublet in (xiii); moreover, it commutes with all the generators of SU(2):
(xvi) $[Q - T_3, T_i] = 0$, i = 1, 2, 3 ($T_\pm = T_1 \pm iT_2$).
We write $2(Q - T_3)$ as Y, and this is the required generator of U(1). The generator Y is called the weak hypercharge. The two groups SU(2) and U(1) are thus appropriately denoted as $SU_L(2)$ and $U_Y(1)$ in the electroweak theory. The four generators mentioned above are $T_i$ (i = 1, 2, 3) and Y.

5. The underlying group of QCD is $SU_C(3)$, where C represents the three-valued quantum number called color (related to quarks). A Lagrangian for the theory is usually written as:

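(i) $L_{QCD} = -\frac{1}{2}\,\mathrm{Tr}\, G_{\mu\nu} G^{\mu\nu} + \sum_{k=1}^{n_f} \bar{q}_k\, (i\gamma^\mu D_\mu - m_k)\, q_k$
where
(ii) $G_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - ig\,[A_\mu, A_\nu]$
(iii) $D_\mu q_k = (\partial_\mu - igA_\mu)\, q_k$
and
(iv) $A_\mu = \sum_{a=1}^{8} A^a_\mu\, \frac{\lambda^a}{2}$
The $\lambda^a$'s in (iv) are the Gell-Mann matrices (see Eqs. (3.7.18)-(3.7.19)) that satisfy the SU(3) commutation relations:
(v) $\Big[ \frac{\lambda^a}{2}, \frac{\lambda^b}{2} \Big] = i f^{abc}\, \frac{\lambda^c}{2}$
and the normalization conditions
(vi) $\mathrm{Tr}(\lambda^a \lambda^b) = 2\delta^{ab}$
whereas the $A^a_\mu$ are the gluons, the strong interaction gauge fields. The $q_k$ denote the quark fields, where k stands for the flavor index 1, 2, ..., $n_f$ ($n_f$ being the number of quark flavors u, d, s, c, b and t). The similarity of (vi) with (6.4.25) and that of (iii) with (6.4.15) is obvious. The gauge fields of the Yang-Mills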


theory are replaced by gluons here. Thus the strong interaction theory is described by an SU(3) colour Yang-Mills theory. Note that each flavour of quarks transforms here as the fundamental triplet representation.
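The normalization condition and the commutation relations of the Gell-Mann matrices used in Hint 5 are easy to verify by machine. The following sketch assumes the standard explicit form of the eight matrices (the explicit entries and all helper names are our assumptions; the text only refers to Eqs. (3.7.18)-(3.7.19) for them). It extracts the structure constants from $f^{abc} = \mathrm{Tr}([\lambda^a, \lambda^b]\lambda^c)/4i$, which follows from $[\lambda^a/2, \lambda^b/2] = i f^{abc}\lambda^c/2$ together with $\mathrm{Tr}(\lambda^a\lambda^b) = 2\delta^{ab}$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Trace of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


S = 3 ** -0.5  # 1/sqrt(3), the normalisation of lambda_8

# Standard Gell-Mann matrices lambda_1 ... lambda_8 (assumed explicit form).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]


def normalization_table():
    """Matrix of traces Tr(lambda_a lambda_b); it should equal 2 * identity."""
    return [[trace(matmul(a, b)) for b in GELL_MANN] for a in GELL_MANN]


def structure_constant(a, b, c):
    """f_abc for 1-based indices a, b, c, from Tr([l_a, l_b] l_c) / (4i)."""
    comm = commutator(GELL_MANN[a - 1], GELL_MANN[b - 1])
    return (trace(matmul(comm, GELL_MANN[c - 1])) / 4j).real


if __name__ == "__main__":
    print(structure_constant(1, 2, 3), structure_constant(1, 4, 7))
```

The first three matrices are the Pauli matrices embedded in the upper-left 2 × 2 block, which is why $f^{123}$ reproduces the SU(2) structure constant.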

6

BUNDLE THEORY FORMALISM IN GAUGE THEORY

We have already seen in Sec. 4 and Sec. 5 that in physical applications, the gauge group that gives the internal or local symmetry of the field is a Lie group. In fact every gauge theory that we have dealt with in these sections had a Lie group associated to it. We also know (from Sec. 2.5) that the group involved in the definition of a principal bundle is a Lie group, hence it is not surprising that principal bundle techniques can be fruitfully applied to gauge theory.

6.1

Principal Bundles as Tools in Gauge Theory

Recall that a principal bundle P(M, G) with structure group G over M is a fiber bundle (P, M, π, G) with a free right action ρ of G on P such that¹⁹:
(i) the orbits of ρ are the fibers of π: P → M; in other words, π can be identified with the canonical projection P → P/G;
(ii) every local trivialization $\psi: U \times G \to \pi^{-1}(U)$ of P is compatible with the action ρ, i.e., it satisfies $\psi^{-1}(u_x g) = \psi^{-1}(u_x)\, g$, where $u_x g = \rho(u_x, g)$, $x \in M$, $u_x \in P_x$, $g \in G$.

While using the principal bundles in applications, not only does the group G vary, but M varies as well; for instance in Maxwell's and Yang-Mills' theory M is either a space-time manifold or its Euclidean version (see Sec. 1.4). In theories such as Kaluza-Klein and strings it is an arbitrary manifold (pseudo-Riemannian or Riemannian). We shall therefore opt to describe the theory in more general terms, and shall deduce the results for particular cases. But first we establish some terminology that we shall need.

Let (M, g) be a pseudo-Riemannian manifold with metric g and let G be a fixed Lie group, which will henceforth be referred to as the gauge group of the theory. Consider now the principal bundle P(M, G). A connection in P (see Def. (2.5.11) and (2.5.12)) is called a gauge connection, and the (corresponding) connection 1-form ω (see Def. (2.5.14)) is called the gauge connection form. A section $s \in \Gamma(P)$ is called a global gauge or simply a gauge. The gauge potential A on M in the gauge s is obtained by pull-back of the gauge connection ω on P to M by s, i.e., $A = s^*(\omega)$. When we consider an open subset U of M, and the section s is defined on the restricted bundle $P|_U$, then we call s a local gauge. Thus for instance if $t \in \Gamma(U, P)$ is a local gauge, then the 1-form $t^*(\omega) \in \Lambda^1(U, \mathbf{g})$²⁰ is the local gauge potential denoted $A_t$. When there is no risk of confusion it is denoted as A. Thus:

\[ A = A_t = t^*(\omega) \tag{6.6.1} \]

19 We have used slightly different notations in Sec. (2.5).

20 $\mathbf{g}$ denotes the Lie algebra of G; this was denoted $\mathcal{G}$ in Sec. 2.5.


As an example consider the gauge group G = U(1) of electromagnetism. We already know it is the circle group and its elements $e^{i\theta}$ are determined by the phase θ. In this case a local gauge over an open set $U \subset M$ can be thought of as a choice of phase in the bundle $P|_U = U \times G$ at each point of U. It is for this reason that the total space of P is sometimes referred to in physics as the space of phase factors. The curvature 2-form $\Omega = d^\omega\omega$, which (as we already know from Sec. 2.5) has values in the Lie algebra $\mathbf{g}$, is called the gauge field on P. We also know that given the form Ω on P there exists a unique 2-form on M, denoted $F_\omega$, with values in the Lie algebra bundle ad P. The form $F_\omega$ determined by Ω belongs to $\Lambda^2(M, \mathrm{ad}P)$ and is called the gauge field on M. It is interesting to note that $F_\omega$ is globally defined, whereas in general there is no globally defined gauge potential on M. We thus have: given a local gauge potential $A_t \in \Lambda^1(U, \mathbf{g})$, the following relations are valid:

\[ t^*(\omega) = A_t, \qquad F_\omega = d^\omega A_t \tag{6.6.2} \]

A word of caution: in the physics literature the mathematicians' gauge potential A is sometimes called the gauge field, and the gauge field is then called the field strength tensor (see Sec. 4 and Sec. 2.5). To illustrate the concepts introduced above we consider the principal bundle $S^3(S^2, U(1))$ over the base manifold $S^2$. The structure group of the bundle is the circle group U(1) and the bundle is determined by the Hopf fibration of $S^3$ (see Exc. 11 of Sec. 2.5). Let μ denote the connection 1-form of the canonical connection on this bundle and let $F_\mu$ be the corresponding gauge field on $S^2$. Evidently $F_\mu$ is globally defined but the gauge potential is not, since at least two charts $(U_1, \psi_1)$, $(U_2, \psi_2)$ are needed to cover $S^2$. Corresponding to these charts there are two locally defined gauge potentials, say $A_1$ and $A_2$, that give rise to the globally defined gauge field $F_\mu$ on the base manifold $S^2$.

Note 6.6.1: The gauge field $F_\mu$ is equivalent to the Dirac monopole field. The Dirac monopole quantization condition corresponds to the classification of principal U(1)-bundles over $S^2$. The classification in this case is given by the first fundamental group $\pi_1(U(1)) = \mathbb{Z}$ (see footnote 22). In general the principal G-bundles over $S^2$ are classified by $\pi_1(G)$. Thus the triviality of $\pi_1(SU(2))$ implies that there is a unique SU(2) monopole²³ on $S^2$, and $\pi_1(SO(3)) = \mathbb{Z}_2$ implies that there are two inequivalent SO(3) monopoles on $S^2$.

Note 6.6.2: The gauge fields and gauge potentials have no physical significance unless they are made to satisfy the (field) equations of a theory. For instance the Riemannian curvature of a space-time manifold is the gauge field corresponding to the gauge potential given by the Levi-Civita connection on the orthonormal frame bundle O(M) of M. This, however, does not describe the gravitational field unless it satisfies Einstein's field equations. If instead it is made to satisfy the Yang-Mills equations, it describes a Yang-Mills field.
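The two local potentials $A_1$ and $A_2$ of the Hopf bundle example above can be written down explicitly. The formulas below are a sketch in one common convention (real-valued forms, unit monopole charge, spherical coordinates $(\theta, \varphi)$ on $S^2$, with $U_1$ and $U_2$ the complements of the south and north poles); these conventions are our assumptions and may differ from those of Sec. 2.5 by factors of i or 2.

```latex
\begin{align*}
A_1 &= \tfrac{1}{2}(1 - \cos\theta)\, d\varphi
      && \text{on } U_1 = S^2 \setminus \{\theta = \pi\}, \\
A_2 &= -\tfrac{1}{2}(1 + \cos\theta)\, d\varphi
      && \text{on } U_2 = S^2 \setminus \{\theta = 0\}, \\
A_1 - A_2 &= d\varphi
      && \text{on } U_1 \cap U_2, \\
F_\mu &= dA_1 = dA_2 = \tfrac{1}{2}\sin\theta\, d\theta \wedge d\varphi,
      && \frac{1}{2\pi}\int_{S^2} F_\mu = 1.
\end{align*}
```

The difference $A_1 - A_2 = d\varphi$ is a gauge transformation on the overlap, but $\varphi$ is not a single-valued function there, which is why no global potential exists. The integer on the right is the monopole number that labels the bundle in the classification by $\pi_1(U(1))$.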

6.2

The Group Aut(P) of Generalized Gauge Transformations

We shall now see how this bundle theory can be used to study the gauge invariance of field equations, which we did in the previous two sections using the classical concepts of physics. In order to do this we consider the group of diffeomorphisms Diff(P) of P. Since this group mixes up the fibers of P, the group as a whole is not a good candidate for the group of gauge transformations. What we need here are those

21 See Sec. 2.5b for details.

22 $\pi_i(X)$ denotes the i-th homotopy group of the topological space X, $\pi_1(X)$ being the fundamental group (see [2] and 2.[29] for details).

23 See Chapter 10 for the definition of a monopole and the references there.


diffeomorphisms that preserve the fibers of P. This is achieved by demanding that the following diagram be commutative, i.e., that the triple $(\pi, \phi, \phi_M)$ satisfy:

\[ \pi \circ \phi = \phi_M \circ \pi \tag{6.6.3} \]

\[ \begin{array}{ccc} P & \xrightarrow{\ \phi\ } & P \\ {\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi} \\ M & \xrightarrow{\ \phi_M\ } & M \end{array} \]

Figure: Projectable diffeomorphism.

Definition 6.6.3: The pair $(\phi, \phi_M)$ is called a projectable diffeomorphism or projectable transformation of P. It can be checked that the projectable diffeomorphisms form a group denoted $\mathrm{Diff}_M(P)$.

Definition 6.6.4: Let $\Psi_t$ denote a one-parameter group of projectable transformations of P resulting from the vector field $X \in \mathfrak{X}(P)$ and let $\Psi^M_t$ be the corresponding one-parameter group in Diff(M) associated with the vector field $X_M \in \mathfrak{X}(M)$. Then X is called a projectable vector field on P; in this case the pair $(X, X_M)$ satisfies the relation:

\[ \pi_*(X(u)) = X_M(\pi(u)) \quad \text{for every } u \text{ in } P \tag{6.6.4} \]

The collection of projectable vector fields on P, denoted $\mathfrak{X}_M(P)$, is a Lie subalgebra of the Lie algebra $\mathfrak{X}(P)$. It should be noted that we have not used the principal bundle properties in Def. (6.6.3); by this we mean that the commutativity condition (6.6.3) makes sense for an arbitrary bundle. Thus, for instance, if $\phi \in \mathrm{Diff}(P)$ satisfies $\phi(ug) = \phi(u)g$ for $g \in G$ and $u \in P$, the condition (6.6.3) is still satisfied. This enables us to define a group of automorphisms:

\[ \mathrm{Aut}(P) = \{ \phi \in \mathrm{Diff}(P) \mid \phi \text{ is } G\text{-equivariant} \} \tag{6.6.5} \]

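The group Aut(P) is called the group of generalized gauge transformations. It is easy to note that this group is indeed the group of principal bundle automorphisms of P. From Sec. (2.5) we know that the fiber preserving property of the generalized gauge transformation φ completely determines the diffeomorphism $\phi_M$ of M. The generalized gauge transformations that cover the identity of M form the group of gauge transformations:

\[ \mathcal{G}(P) = \{ \phi \in \mathrm{Aut}(P) \mid \phi_M = \mathrm{id}_M \} \tag{6.6.6} \]

It can be checked that $\mathcal{G}(P)$ is a normal subgroup of Aut(P). More explicitly, $\phi \in \mathcal{G}(P)$ satisfies:

\[ \pi \circ \phi = \pi \tag*{(6.6.7)(a)} \]

24 Projectable transformation: physics terminology.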


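\[ \phi(u \cdot g) = \phi(u) \cdot g \quad \text{for every } g \in G \text{ and } u \in P. \tag*{(6.6.7)(b)} \]

Construction of these subgroups leads to the following exact sequence of groups:

\[ 1 \to \mathcal{G}(P) \xrightarrow{\ i\ } \mathrm{Aut}(P) \xrightarrow{\ j\ } \mathrm{Diff}(M) \to 1 \tag{6.6.8} \]

where i denotes the inclusion map and j is defined by $j(\phi) = \phi_M$ for every $\phi \in \mathrm{Aut}(P)$.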

If the bundle P is the bundle of frames L(M), then the above sequence allows additional geometric insight. For example, the connections on the frame bundle L(M), when M is a 4-dimensional Lorentz manifold, play the role of gravitational potentials, and action functionals involving connections and metrics on M become the starting point of a gauge theory of gravitation. Returning to a gauge transformation φ in $\mathcal{G}(P)$, we note that φ can be physically interpreted as a local (pointwise) change of gauge over M. It is for this reason that G is sometimes called the local symmetry group and $\mathcal{G}$ is called the local gauge group. We shall however restrict to the usage given by the defining Equations (6.6.5) and (6.6.6). Given an open set $U \subset M$, a map $\phi_t : U \to G$ is a local representation of φ when it is defined in the following manner. Suppose that t is a local gauge over U; then t is a section of the bundle $P|_U$, thus for $x \in U$, t(x) is in $P_x = \pi^{-1}(x)$, the fiber of P over x, and by definition of φ, the image $\phi(t(x))$ lies in the same fiber. Hence there is a unique element $\phi_t(x) \in G$ such that

\[ \phi(t(x)) = t(x)\, \phi_t(x) \quad \text{for every } x \text{ in } U \tag{6.6.9} \]

When P is a trivial bundle, U can be taken as M and the local gauge transformation $\phi_t$ can be identified with a map from M to G.

6.3

The Gauge Algebra of P(M,G) and the Space of Gauge Potentials on it

Before elaborating in this section on the mathematical basics of gauge theory, we briefly introduce two more objects: the gauge algebra of P and the space of gauge potentials on P. In order to do this we show that there are two more groups associated to P and G which are isomorphic to the group $\mathcal{G}(P)$ of gauge transformations. One of these groups results by considering the space $\mathcal{F}(P, G)$ of all smooth functions $f : P \to G$; the group operation here is the pointwise multiplication of these functions.

Definition 6.6.7: The subset of $\mathcal{F}(P, G)$ consisting of all G-equivariant functions (with respect to the adjoint action) forms a group denoted $\mathcal{F}_G(P, G)$:

\[ \mathcal{F}_G(P, G) = \{ f : P \to G \mid f(ug) = g^{-1} f(u)\, g \ \text{ for every } u \in P \text{ and every } g \in G \} \tag{6.6.10} \]

The group $\mathcal{F}_G(P, G)$ is isomorphic to $\mathcal{G}(P)$. To define the other group we note that the associated bundle $\mathrm{Ad}(P) = P \times_{\mathrm{Ad}} G$ over M (defined by the adjoint action of G on itself) is a bundle of Lie groups with fibre G.

Definition 6.6.8: The set $\Gamma(\mathrm{Ad}(P))$ of the sections of this associated bundle with pointwise multiplication is a group. This group is isomorphic to $\mathcal{G}(P)$ as well as to $\mathcal{F}_G(P, G)$. In view of this statement it follows that any one of these groups or their representations can be used when we deal with the group of gauge transformations.

Consider now the associated vector bundle $E(M, \mathbf{g}, \mathrm{ad}, P)$ with fiber type $\mathbf{g}$ and the adjoint action ad of G on $\mathbf{g}$. Recall that this bundle is a bundle of Lie algebras denoted $P \times_{\mathrm{ad}} \mathbf{g}$ or ad(P) (see Sec. 2.5; we denoted it there as $P \times_{\mathrm{ad}} \mathcal{G}$).

Definition 6.6.9: The set of sections $\Gamma(\mathrm{ad}P) = L\mathcal{G}(P)$ is a Lie algebra under the pointwise bracket operation. The Lie algebra $L\mathcal{G}(P)$ is called the gauge algebra of P.

If instead of $\mathcal{F}_G(P, G)$ we consider the set $\mathcal{F}_G(P, \mathbf{g})$ of all G-equivariant functions (with respect to the adjoint action ad of G on its Lie algebra $\mathbf{g}$) with the pointwise bracket operation, we note that it is a Lie algebra and it is isomorphic to $L\mathcal{G}(P)$ (see Chap. 6 in [20] for details). So here again we can use either of these to describe the theory.

We have thus far considered principal bundles on arbitrary manifolds M. We now consider principal bundles P(M, G) where M is a compact, connected and oriented manifold and the gauge group G is compact and semisimple. The choice of compactness²⁵ gives a Riemannian metric on M and a fixed orientation permits integration. The base manifolds typically used in physical theories of interest are $S^n$, $T^n$ or their products $S^n \times T^m$. Similarly the most common gauge groups in use are U(n), SU(n), O(n), SO(n) or their products. With these assumptions in place, a (natural) inner product can be defined on the space of gauge potentials (connections) on P. This space is denoted $\mathcal{A}(P)$, or $\mathcal{A}$ when P is fixed, and is defined by:

\[ \mathcal{A}(P) = \{ \omega \in \Lambda^1(P, \mathbf{g}) \mid \omega \text{ is a connection on } P \} \tag{6.6.11} \]

From the definition of a connection we know that if $\omega_1$ and $\omega_2 \in \mathcal{A}$, then their difference $\omega_1 - \omega_2$ is horizontal and of type (ad, $\mathbf{g}$), and thus defines a unique 1-form on M with values in the associated bundle $\mathrm{ad}P = P \times_{\mathrm{ad}} \mathbf{g}$²⁶. In view of this, for a fixed connection $\gamma \in \mathcal{A}$, the space $\mathcal{A}$ becomes isomorphic to the set:

\[ \{ \gamma + \pi^* A \mid A \in \Lambda^1(M, \mathrm{ad}P) \} \tag{6.6.12} \]

This isomorphism implies that $\mathcal{A}$ is an affine space with underlying vector space $\Lambda^1(M, \mathrm{ad}P)$. In other words the tangent space $T_\gamma\mathcal{A}$ is isomorphic to $\Lambda^1(M, \mathrm{ad}P)$. Hence not only can a metric be defined on $\mathcal{A}$; covariant and exterior differentiation can be defined as well (Sec. 2.5b). We shall not go into these details (see [20] and [7]); instead we shall use the action of the group $\mathcal{G}(P)$ on the space $\mathcal{A}(P)$ to derive a gauge transformation relation similar to the one we obtained in Sec. 4 and Sec. 5. We note that essentially there are two different actions of $\mathcal{G}(P)$ on $\mathcal{A}(P)$, but these can be identified. These are the right and left actions, denoted $R_f$ and $L_f$ for $f \in \mathcal{G}(P)$. The first of these has the effect of pulling back the 1-form $\omega \in \mathcal{A}(P)$, thus

\[ f \cdot \omega = R_{f^{-1}}(\omega) = (f^{-1})^* \omega \tag{6.6.13} \]

and the second one pushes the horizontal distribution $H_P$ (defined by ω) forward, thus we have:

\[ L_f(\omega) = f_*(H_P). \]

25 The choice of compactness can be relaxed to include base manifolds such as $R^3$ or $R^3 \times S^1$ with appropriate boundary conditions.

26 See Sec. 2.5.


In view of our earlier discussions (see Def. (2.5.12) and (2.5.14)) on $H_P$ and $\omega$, it is clear that $R_{f^{-1}} = L_f$. Hence either of these actions can be used to suit a particular problem. Next we obtain a local expression for this action, choosing $G$ to be one of the (classical) matrix groups (see Sec. 4). Corresponding to the given connection $\omega$, let $A_i$, $A_j$ be the local gauge potentials in the local gauges (sections) $t_i$, $t_j$ over $U_i$ and $U_j$ respectively. From our study in Sec. 2.5, the sections $t_i$, $t_j$ satisfy

$t_i = \psi_{ij}\, t_j$, where $\psi_{ij} : U_i \cap U_j \to G$ (6.6.14)

and the $\psi_{ij}$ could be viewed as the local expressions of $f \in \mathcal{G}(P)$. Hence Eq. (6.6.13) becomes:

$A_j = \psi_{ij}^{-1} A_i \psi_{ij} + \psi_{ij}^{-1}\, d\psi_{ij}$ (6.6.15)

(see (2.5.37)). Further, using the terminology of the previous section, if we write $A_i$ as $A$, $A_j$ as $A^g$ and $\psi_{ij}$ as $g$, we obtain the (familiar) local expression of the transformation rule:

$A^g = g^{-1} A g + g^{-1}\, dg$ (6.6.16)

(Note that we established it for $G = U(1)$ in Eq. (6.5.5).) Very often, while writing the gauge transform of a potential or a field, we use the local expression (6.6.16) in preference to (6.6.15). Thus, for instance, from Eq. (6.6.16) the gauge transform of the gauge field $F_\omega$ is:

$F_\omega^{g} = g^{-1} F_\omega\, g$ (6.6.17)

This shows that $|F_\omega|$ is a gauge invariant function on $M$, i.e., $|F_\omega^{g}| = |F_\omega| \in \mathcal{F}(M)$. Due to its gauge invariance, this function is used in defining the action functional of the theory based on the gauge group $G$.
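The invariance of $|F_\omega|$ under (6.6.17) can be checked numerically. The following is a minimal sketch, assuming $G = SU(2)$, a single constant matrix standing in for the value of $F_\omega$ at one point, and the norm $|F|^2 = -\mathrm{tr}(F F)$; the helper names (`su2_element`, `su2_algebra`, `norm_sq`) and the sample numbers are illustrative choices, not notation from the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose; equals the inverse for a unitary matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def su2_element(theta, n):
    """g = cos(theta/2) I + i sin(theta/2) (n . sigma), with |n| = 1."""
    nx, ny, nz = n
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c + 1j * s * nz, 1j * s * (nx - 1j * ny)],
            [1j * s * (nx + 1j * ny), c - 1j * s * nz]]

def su2_algebra(a, b, c):
    """F = i (a sigma_1 + b sigma_2 + c sigma_3): anti-Hermitian, traceless."""
    return [[1j * c, 1j * (a - 1j * b)],
            [1j * (a + 1j * b), -1j * c]]

def norm_sq(F):
    """|F|^2 = -tr(F F), which is non-negative for anti-Hermitian F."""
    FF = matmul(F, F)
    return (-(FF[0][0] + FF[1][1])).real

# Illustrative data: one gauge transformation value g and one field value F.
g = su2_element(0.7, (1 / 3, 2 / 3, 2 / 3))
F = su2_algebra(0.3, -1.2, 0.5)

# Gauge transform (6.6.17): F^g = g^{-1} F g, with g^{-1} = g^dagger.
Fg = matmul(matmul(dagger(g), F), g)

print(norm_sq(F), norm_sq(Fg))
```

The matrix `Fg` differs from `F` entry by entry, yet the two norms agree up to rounding, which is the pointwise statement behind the gauge invariance of the action density.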

6.4 The Moduli Space of Gauge Potentials on P(M, G) and Gribov Ambiguity

Definition 6.6.10: Two connections $\alpha_1, \alpha_2 \in \mathcal{A}(P)$ are said to be gauge equivalent if there exists a gauge transformation $f \in \mathcal{G}(P)$ such that $\alpha_2 = f \cdot \alpha_1$.

The definition of the action of $\mathcal{G}(P)$ on $\mathcal{A}(P)$ implies that each equivalence class formed by equivalent connections describes an orbit of $\mathcal{G}(P)$ in $\mathcal{A}(P)$. The orbit space $\mathcal{O} = \mathcal{A}/\mathcal{G}$ represents the gauge inequivalent connections and is called the moduli space of gauge potentials on $P(M, G)$. (Note that points of the orbit space $\mathcal{O}$ correspond to equivalence classes of connections.) The definition of the orbit space leads to the infinite dimensional principal $\mathcal{G}$-bundle

$p : \mathcal{A} \to \mathcal{O} = \mathcal{A}/\mathcal{G}$ (6.6.18)

with $p$ as the natural projection, and it also leads to the notion of sections $s : \mathcal{O} \to \mathcal{A}$. The transformation properties of the gauge group $\mathcal{G}$ are used to write gauge invariant functionals on $\mathcal{A}$; however, to avoid the infinite contributions coming from gauge equivalent fields, one has to integrate them over the orbit space $\mathcal{O}$. Unfortunately the mathematical structure of $\mathcal{O}$ is not well known, which makes the problem unwieldy. This difficulty is avoided in practice by choosing a section $s$ and integrating over the image $s(\mathcal{O}) \subset \mathcal{A}$ with a suitable weight factor such as the Faddeev-Popov determinant. This procedure of choosing one connection in $\mathcal{A}$ from each equivalence class in $\mathcal{O}$ is called gauge fixing*. Naturally the procedure cannot work if the required 'sections' do not exist. For example, Gribov showed that for the trivial $SU(2)$ bundle over $\mathbb{R}^4$ the Coulomb gauge fails to be a section and hence is not a true global gauge. The non-existence of a global gauge is referred to as the Gribov ambiguity. It can be shown that the Gribov ambiguity is related to the topological structure of a principal bundle (see, for instance, [14], [18], [25]). Due to our limited scope we are unable to devote time to these technicalities. Finally, in pursuance of our programme of learning the physicists' and mathematicians' approaches to gauge theory in a unified manner, we give an example of the Coulomb gauge (in this section) and devote the next section to some more intriguing characteristics of gauge theory.

Example 6.6.11: Consider the principal bundle $P(S^4, SU(2))$ and let $\mathcal{A}(P)$ denote the space of gauge potentials. For a fixed connection $\alpha \in \mathcal{A}$, we know that $\mathcal{A}$ is isomorphic to the vector space of 1-forms on $S^4$ with values in the vector bundle $\mathrm{ad}P = P \times_{\mathrm{ad}} \mathfrak{g}$, i.e.,

$\mathcal{A} = \{\alpha + \pi^{*}A \mid A \in \Lambda^1(S^4, \mathrm{ad}P)\}$ (6.6.19)

(Note that (6.6.19) is indeed (6.6.12) with $S^4$ in place of $M$ and with $SU(2)$ as the group $G$.) Consider the subspace $S_\alpha \subset \mathcal{A}$ defined as

$S_\alpha = \{\alpha + \pi^{*}A \mid \delta^{\alpha} A = 0\}$ (6.6.20)

where $\delta^{\alpha}$ denotes the coexterior derivation with respect to $\alpha$ (see Eq. (2.5.50)). The subspace $S_\alpha$ is called the generalized Coulomb gauge. If in particular $\alpha = 0$, then we have $S_0 = \{\pi^{*}A \mid \delta^{0}A = 0\}$. The condition $\delta^{0}A = 0$ can locally be written as

$\partial^{\mu} A_{\mu} = 0$

Now on $\mathbb{R}^4$ as a base manifold, one can find a connection whose time component is zero and which is gauge equivalent to the given connection (i.e., the two are related by a gauge transformation). The above gauge condition then reduces to the classical Coulomb gauge condition

$\mathrm{div}\, \mathbf{A} = 0$ (6.6.21)

Through this simple example we have thus shown how the geometric condition $\delta^{\alpha} A = 0$ can be identified with the classical condition $\nabla \cdot \mathbf{A} = 0$.

7 MORE ON CHARACTERISTICS OF GAUGE THEORIES, AND EXAMPLES BASED ON THEM

In spite of the great progress already made in theories based on gauge symmetries (e.g., the electroweak theory), gauge theories can still shed new light on theoretical physics (see the references given in Exp. 4). We devote this section to a few examples of this aspect. For instance, in the first two examples (cited below as results), using the bundle-theoretic construction of gauge theory, we establish the existence of Maxwell and Yang-Mills fields on arbitrary manifolds. In particular we prove (in brief) the following two results:

Result 6.7.1: Given a principal bundle $P(M, U(1))$ over a compact, simply connected, oriented Riemannian manifold $M$, the Maxwell field is the unique harmonic 2-form representing the Euler class or the first Chern class $c_1(P)$.

Result 6.7.2: Let $P(M, G)$ be a principal bundle whose base manifold $M$ is Riemannian and connected and whose structure group $G$ is an arbitrary Lie group, and let $\omega$ be a connection on $P(M, G)$ with finite Yang-Mills action. Then the following four statements are equivalent:
(i) $\omega$ is a critical point of the Yang-Mills functional,
(ii) $\omega$ satisfies the equation $\delta^{\omega} F_\omega = 0$,
(iii) $\omega$ satisfies the equation $d^{\omega} {*F_\omega} = 0$,
(iv) $\Delta^{\omega} F_\omega = 0$, where $\Delta^{\omega}$ is the Hodge Laplacian

$\Delta^{\omega} = d^{\omega}\delta^{\omega} + \delta^{\omega} d^{\omega}$. (6.7.1)

* We emphasize that in any (sensible) physical theory 'gauge-fixing' conditions are always used, as they help to remove the redundant degrees of freedom that creep in due to the 'gauge invariance' of the theory in question (see Chapter 11).

7.1 A Generalized Maxwell's Field

Proof of Result 6.7.1: To show the existence of Maxwell fields on arbitrary manifolds we begin by recalling the definition and existence of the Maxwell field on the Minkowski space $M^4$. Consider the principal bundle $P(M^4, U(1))$. Since any principal bundle on Minkowski space is trivializable, it can be thought of as $M^4 \times U(1)$. Now the Lie algebra $u(1)$ of $U(1)$ can be identified with $i$ (the imaginary unit) times the real line, so a connection form on $P$ (an element of $\Lambda^1(P, i\mathbb{R})$) can be written as $i\omega$ by choosing $\{i\}$ as the basis for $i\mathbb{R}$. Similarly, the gauge field (curvature) $\Omega = d\omega$, taken with values in $i\mathbb{R}$, can be written as $i\Omega \in \Lambda^2(P, i\mathbb{R})$. This leads to the Bianchi identity $d\Omega = 0$. In order to define the corresponding gauge field $F_\omega$ (see Sec. 6) on $M^4$, we note that the bundle $\mathrm{ad}(P)$ is also trivial and therefore can be expressed as $\mathrm{ad}P = M^4 \times u(1)$. Accordingly the gauge field $F_\omega \in \Lambda^2(M^4, \mathrm{ad}(P))$ can be written on the base $M^4$ as $iF$, where $F \in \Lambda^2(M^4)$. Since the bundle is trivial, we have a global gauge $s : M^4 \to P$ defined as $s(x) = (x, 1)$ for every $x \in M^4$. We use this to pull the connection form $i\omega$ on $P$ back to $M^4$ and obtain the gauge potential:

$iA = i\, s^{*}(\omega)$ (6.7.2)

The gauge potential $A \in \Lambda^1(M^4)$ is in this case evidently global, and the corresponding gauge field, which as we know from Sec. 6 is always global, is:

$F = dA$ (6.7.3)

The Bianchi identity:

$dF = 0$ (6.7.4)

follows from the fact that $F$ is exact (i.e., it is the differential of a 1-form). Consider now the action given by the gauge field $F$:

$S_A = \int |F|^2\, dv$ (6.7.5)

where $|F|$ stands for the pseudo-norm induced by the Lorentz metric on $M^4$ and the trivial inner product on the Lie algebra $u(1)$, and $dv$ stands for the infinitesimal volume element on $M^4$. Note that the action represents the total energy of the electromagnetic field. The Euler-Lagrange equation obtained by minimizing the action $S_A$ gives:

$\delta F = 0 \iff d{*F} = 0$ (6.7.6)

The equation $\delta F = 0$ coupled with $dF = 0$ gives Maxwell's equations for a source-free electromagnetic field. Consider now a gauge transformation $\varphi$, a section of $\mathrm{Ad}(P) = M^4 \times U(1)$; note that it is completely determined by a smooth function $\Psi \in \mathcal{F}(M^4)$:

$\varphi(x) = (x, e^{i\Psi(x)}) \in \mathrm{Ad}(P)$ for every $x \in M^4$ (6.7.7)

Thus if $iB$ denotes the gauge potential which results from the action of the gauge transformation $\varphi$ on $iA$, then in view of (6.6.16) it follows that:

$iB = e^{-i\Psi}\,(iA)\,e^{i\Psi} + e^{-i\Psi}\, d e^{i\Psi}$ (6.7.8)

(where we have written $\Psi(x)$ as $\Psi$ for ease of notation). Since $U(1)$ is abelian, the first term equals $iA$ and the second equals $i\, d\Psi$, so the above equation simplifies to the (familiar) classical formulation:

$B = A + d\Psi$ (6.7.9)

as given in (6.5.4). Thus, in essence, beginning with a principal bundle $M^4 \times U(1)$ and using a 'section' as gauge transformation, we have obtained Maxwell's equations:

$dF = 0, \qquad \delta F = 0$ (6.7.10)
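The statement that $B = A + d\Psi$ leaves $F = dA$ unchanged can be illustrated numerically. The sketch below assumes a two-dimensional slice with coordinates $(x, y)$, an arbitrarily chosen potential `A` and gauge function `psi` (both hypothetical, picked only for illustration), and central finite differences for the derivatives.

```python
import math

def A(x, y):
    """Hypothetical potential components (A_x, A_y)."""
    return (math.sin(x) * y, x * x * math.cos(y))

def psi(x, y):
    """Hypothetical gauge function Psi."""
    return math.exp(-x * x) * math.sin(2 * y)

def grad_psi(x, y):
    """Analytic gradient (d_x Psi, d_y Psi) of the gauge function above."""
    e = math.exp(-x * x)
    return (-2 * x * e * math.sin(2 * y), 2 * e * math.cos(2 * y))

def B(x, y):
    """Gauge-transformed potential B = A + dPsi, as in (6.7.9)."""
    ax, ay = A(x, y)
    px, py = grad_psi(x, y)
    return (ax + px, ay + py)

def field_strength(pot, x, y, h=1e-4):
    """F_xy = d_x A_y - d_y A_x, approximated by central differences."""
    dAy_dx = (pot(x + h, y)[1] - pot(x - h, y)[1]) / (2 * h)
    dAx_dy = (pot(x, y + h)[0] - pot(x, y - h)[0]) / (2 * h)
    return dAy_dx - dAx_dy

x0, y0 = 0.4, -0.9
F_A = field_strength(A, x0, y0)
F_B = field_strength(B, x0, y0)
F_exact = 2 * x0 * math.cos(y0) - math.sin(x0)  # analytic F_xy for this A

print(F_A, F_B, F_exact)
```

The two potentials differ at the sample point, but their field strengths agree to within the discretization error, reflecting $d(d\Psi) = 0$.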

We would also like to remark that since the group $\mathcal{G}(P)$ of gauge transformations acts transitively on the solution space $\mathcal{A}(P)$ of gauge connections of equations (6.7.10), the moduli space $\mathcal{A}/\mathcal{G}$ consists of a single point, a fact which is of fundamental importance in the path integral approach to QED (see Chap. 9). From the above discussion it should be clear that the definition of a Maxwell field can be extended to any $U(1)$-bundle on an arbitrary base manifold with a pseudo-Riemannian metric. In order to finally establish Result 6.7.1, we therefore consider a compact, simply connected, oriented Riemannian 4-manifold $(M, g)$ with volume form $v_g$$^{27}$ and define the Maxwell field as follows:

Definition 6.7.3: A connection $\omega$ on $P(M, U(1))$ is called a Maxwell connection or a Maxwell potential if it minimizes the Maxwell action $A_M(\omega)$ defined as:

$A_M(\omega) = \frac{1}{8\pi^2} \int_M |F_\omega|_x^2\, dv_g \qquad (x \in M)$ (6.7.11)

The corresponding Euler-Lagrange equations are:

$dF_\omega = 0, \qquad \delta F_\omega = 0$ (6.7.12)

A solution of these equations is called a Maxwell field or a source-free electromagnetic field on $M$. Since the equations $dF_\omega = 0$, $\delta F_\omega = 0$ taken together imply that $F_\omega$ is harmonic, we have partially proved Result 6.7.1: the curvature 2-form of a Maxwell connection defined on $P(M, U(1))$ is harmonic. To show the uniqueness of this connection and to establish the fact that it represents the Euler class or the first Chern class $c_1(P)$, we have to use homotopy, homology and Hodge theory. We refer the interested reader to [2] and [20].

27. The theory based on this choice of $(M, g)$ is called the Euclidean version of Maxwell's theory.


7.2 A Generalized Yang-Mills Field

Proof of Result 6.7.2: In Sec. 6 (see (6.6.1)) we have already seen that if $\omega$ is a gauge connection on $P \equiv P(M, G)$ ($M$ a connected manifold), then in a local gauge $t \in \Gamma(U, P)$ the local gauge potential is $A_t = t^{*}(\omega)$, which belongs to $\Lambda^1(U, \mathrm{ad}P)$. Also, the action on this potential of a gauge transformation $g \in \mathcal{G}(P)$, which locally becomes a $G$-valued function $g_t$ on $U$, is given by:

$g \cdot A_t = (\mathrm{ad}\, g_t) \circ A_t + g_t^{*}\Theta = g_t^{-1} A_t\, g_t + g_t^{*}\Theta$

where $\Theta$ is the canonical 1-form on $G$.$^{28}$ Recall that in (6.5.5) we wrote it as:

$A^g = g^{-1} A g + g^{-1}\, dg$

In Sec. 2.5 we also saw that the curvature $\Omega$ of the (gauge) connection $\omega$ defines the unique 2-form $F_\omega = s_\Omega$ on $M$ with values in the bundle $\mathrm{ad}P$. In a local gauge $t$, $F_\omega$ can be written as:

$F_\omega = dA_t + \tfrac{1}{2}[A_t, A_t]$ (6.7.13)

Note that the Lie bracket in the above equation stands for the bracket of bundle-valued forms. We assume that $M$ is a compact, connected, oriented Riemannian manifold and $G$ is a compact semisimple Lie group; then we can write the Yang-Mills action (or functional) $A_{YM}$ as follows:

$A_{YM}(\omega) = \frac{1}{8\pi^2} \int_M |F_\omega|_x^2\, dv_g$ for $\omega \in \mathcal{A}(P)$ and $x \in M$ (6.7.14)

In order to write the Euler-Lagrange equation by minimizing the above action, we note that the space $\mathcal{A}(P)$ is an affine space, and therefore variations can be taken along the straight lines (through $\omega$) of the form:

$\omega_t = \omega + tA$ (6.7.15)

where $A \in \Lambda^1(M, \mathrm{ad}P)$ and $t$ is the variational parameter. The gauge field corresponding to $\omega_t$ can be seen to satisfy (see Exc. 1):

$F_{\omega_t} = F_\omega + t\, d^{\omega}A + t^2 (A \wedge A)$ (6.7.16)
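The algebraic structure of (6.7.16) can be seen in a toy setting. The sketch below is only an illustration under strong simplifying assumptions: the potentials are taken to be constant matrices on a two-dimensional patch, so the derivative terms vanish and, with the convention $\Omega = d\omega + \omega \wedge \omega$ used in the hints to the exercises, the single curvature component reduces to a commutator. The matrices `Wx`, `Wy`, `ax`, `ay` are arbitrary sample values, not data from the text.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def curvature_xy(Ax, Ay):
    """xy-component of dA + A ^ A for constant components: [A_x, A_y]."""
    return comm(Ax, Ay)

# Constant components of the connection (omega) and of the variation (A).
Wx = [[0.0, 1.0], [-1.0, 0.0]]
Wy = [[0.5, 0.0], [0.0, -0.5]]
ax = [[0.0, 2.0], [2.0, 0.0]]
ay = [[1.0, -1.0], [3.0, -1.0]]

t = 0.37
# Left-hand side: curvature of omega_t = omega + t A.
lhs = curvature_xy(add(Wx, scale(t, ax)), add(Wy, scale(t, ay)))

# Right-hand side: F_omega + t d^omega A + t^2 (A ^ A), component by component.
F_omega = curvature_xy(Wx, Wy)
d_omega_A = add(comm(Wx, ay), comm(ax, Wy))  # (omega ^ A + A ^ omega)_xy
A_wedge_A = comm(ax, ay)
rhs = add(add(F_omega, scale(t, d_omega_A)), scale(t * t, A_wedge_A))

max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The agreement is exact up to rounding because the curvature is quadratic in the connection; the linear coefficient is what the text calls $d^{\omega}A$ and the quadratic one is $A \wedge A$.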

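We substitute the value of $F_{\omega_t}$ in (6.7.14) and differentiate the action using (6.7.15) to write (suppressing the constant prefactor):

$\frac{d}{dt} A_{YM}(\omega_t)\Big|_{t=0} = \frac{d}{dt}\left( \int_M |F_\omega + t\, d^{\omega}A + t^2 (A \wedge A)|_x^2\, dv_g \right)\Big|_{t=0}$ (6.7.17)

Writing the RHS as an inner product and differentiating the terms with respect to $t$ at $t = 0$, we note that:

$\frac{d}{dt} A_{YM}(\omega_t)\Big|_{t=0} = 2 \int_M (F_\omega, d^{\omega}A)\, dv_g = 2 \int_M (\delta^{\omega}F_\omega, A)\, dv_g.$

The above equality can be written in variational form as (see Sec. 2.5 for $\langle\langle\ ,\ \rangle\rangle$):

$\delta A_{YM}(\omega)(A) = 2 \langle\langle \delta^{\omega}F_\omega, A \rangle\rangle$ (6.7.18)

28. See Subsec. (2.5.5).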


A gauge connection $\omega$ is called a critical point of the Yang-Mills functional if:

$\delta A_{YM}(\omega)(A) = 0$ for every $A \in \Lambda^1(M, \mathrm{ad}P)$ (6.7.19)

The critical points of the Yang-Mills functional are solutions of the corresponding Euler-Lagrange equations:

$\delta^{\omega} F_\omega = 0$ (6.7.20)

Equation (6.7.20) is called the pure (source-free) Yang-Mills equation on the manifold $M$. The gauge connection $\omega$ is called the Yang-Mills connection and its gauge field $F_\omega$ is called the Yang-Mills field. Since (using local coordinates [see Chapter 1]) $d^{\omega}$ and $\delta^{\omega}$ are related through the Hodge star operator as:

$d^{\omega} = \pm * \delta^{\omega} *$ (6.7.21)(a)

it follows that the Yang-Mills equation (6.7.20) is satisfied if and only if:

$d^{\omega} {*F_\omega} = 0$ (6.7.21)(b)

Through the above discussion we have already proved the equivalence of the first three statements. To show the equivalence of (iv) with the other three, we prove that (ii) $\iff$ (iv). Recall that in view of our earlier results (see also Exc. 2) we can write the identity in (iv) as$^{29}$

$\langle\langle \Delta^{\omega} F_\omega, F_\omega \rangle\rangle = \| d^{\omega} F_\omega \|^2 + \| \delta^{\omega} F_\omega \|^2.$ (6.7.22)(a)

When we use Bianchi's identity:

$d^{\omega} F_\omega = 0$ (6.7.22)(b)

in the above equation, the equivalence (ii) $\iff$ (iv) becomes obvious (note that Eq. (6.7.22)(b) suggests that locally $F_\omega$ is always derived from a potential). The pair of equations $\delta^{\omega} F_\omega = 0$ and $d^{\omega} F_\omega = 0$, or equivalently $d^{\omega} {*F_\omega} = 0$ and $d^{\omega} F_\omega = 0$, are called the Yang-Mills equations. It is easy to verify that when $M$ is Minkowski space and $G$ is $U(1)$ they reduce to Maxwell's equations. Using a local orthonormal coordinate system, it can be shown that these equations are a system of non-linear, second order partial differential equations for the components of the gauge potential $A$ (see Exc. 3).

Example 6.7.4:$^{30}$ Here we give an example of a model which has been constructed (see S. Chadha and H. B. Nielsen in [11] or [3]) to illustrate the assertion that certain symmetries are not really fundamental (as was believed before the advent of grand unified theories) but arise because we live in a low energy world. These symmetries are thus expected to be broken at high energies. The model in question is gauge invariant and renormalizable but is not initially Lorentz invariant. The couplings involved in the Lagrangian of this model are functions of the energy scale, and as such any change in this scale affects the physical system that the Lagrangian represents. More precisely, it is shown that as the energy scale is lowered the model acquires Lorentz invariance to increasing accuracy, which supports the premise that 'symmetry is not fundamental.' We give this Lagrangian below, using the same notation as the paper referred to above. We explain only its main features and ask the interested reader to look for details in the original paper cited above and in the similar extended works of Nielsen and Ninomiya in [23] and Förster et al. in [10]. See also Chapter 6, in particular Subsec. 6.2.2, in [11]. Consider the action $W$ with the Lagrangian consisting of three terms:

$W_{\text{non-int}}(A_\mu, \psi) = \int (dx)\, [J^{\mu} A_\mu + \eta \gamma^0 \psi + \mathcal{L}_{\text{non-int}}(A_\mu, \psi)]$ (6.7.23)

29. See Eqs. (2.5.47)-(2.5.49).

30. The reader will appreciate this example better after Chapters 7 and 9.

The subscript non-int stands for non-interacting, to imply that it represents a non-interacting action between photon and electron fields. The term $\mathcal{L}_{\text{non-int}}$ written out in full stands for:

-}*7V[^ l ± i ^ + ^ i z | l 5 - | l ^

(6.7.24)

The various terms involved in (6.7.23) and (6.7.24) are explained as follows. $J^{\mu}$ and $\eta$ are the photon and electron sources respectively, with $A_\mu$ and $\psi$ as their corresponding fields, and $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the field strength tensor. The vierbeins $e_{+a}^{\ \mu}$ and $e_{-a}^{\ \mu}$ are taken differently for the two different helicities of the fermion. The $\gamma$-matrices are in the Majorana basis (see Chapter 7); thus they are all imaginary, with $\gamma^0$ antisymmetric and the rest symmetric$^{31}$. The matrix $\gamma_5 = \gamma^0\gamma^1\gamma^2\gamma^3$ can easily be seen to be real. The matrix $q$ stands for the charge. The $\eta$ and $\psi$ are real eight-component spinors that anticommute.

As mentioned in the introduction, the (final) action of the model is gauge invariant and renormalizable. This means that the (primitive) interaction term is obtained from the free Lagrange function by replacing every derivative by a covariant derivative $D_\mu$ (see Sec. 4), and that, due to renormalizability, the Lagrange function contains only those terms whose mass dimension is $\le 4$. Although the model is (to begin with) not invariant under Lorentz symmetry, it takes chiral invariance into account and accordingly uses a massless electron. Also, to preserve the usual translational invariance of the action, it is further assumed that the coupling constants $\eta^{\mu\nu\rho\sigma}$ and the vierbeins $e_{+a}^{\ \mu}$, $e_{-a}^{\ \mu}$ are space-time independent. They are however dependent on a variable $\Lambda$ (called the energy scale). The model is constructed on a four dimensional space $(x^0, x^1, x^2, x^3)$ with no a priori assumption on the metric. But the stationarity of the non-interacting action $W_{\text{non-int}}$, which gives the free field equations, ultimately leads to the choice of a Minkowskian metric. The action $W$ with interaction between photons and charged particles, together with the required gauge invariance, is now obtained by replacing every derivative in (6.7.23) by a covariant derivative $D_\mu$ (see (6.4.6)), in other words using the correspondence:
(6.7.25)

This leads to:

$W = \int (dx)\, [J^{\mu} A_\mu + \eta \gamma^0 \psi + \mathcal{L}(A_\mu, \psi)].$ (6.7.26)

31. The $\gamma$-matrices in the Majorana basis are built from the Pauli matrices $\tau^1, \tau^2, \tau^3$. One can find more than one such representation for the $\gamma$-matrices, related to each other by similarity transformations (see for instance 7.[19] and Exc. (7.3.6)).


It can be easily checked that:

$\mathcal{L}(A_\mu, \psi) = \mathcal{L}_{\text{non-int}}(A_\mu, \psi) + j^{\mu} A_\mu$ (6.7.27)

with $j^{\mu}$ given by:

f(x) = j^WrV "?[<, ^f^

+ e^^^ix)

(6.7.28)

Using several mathematical arguments, it is then shown that the non-covariant model looks more and more covariant (implying Lorentz invariance) once the energy scale $\Lambda$ at which the model is viewed becomes very small. We would like to emphasize that the above example, given without mathematical explanations, is included in the text mainly to introduce an inquisitive mind to the mysterious characteristics of symmetry and symmetry breaking, and to motivate the reader towards further study.

Example 6.7.5: The four dimensional rotation group $O(4)$. The $O(4)$ symmetry is one of those symmetries which is important at the macroscopic as well as at the microscopic level*. At the macroscopic level it is exhibited by planetary motion via the Coulomb-like $1/r$ potential (see Ftn. 2); for example, the Keplerian laws become very simple in view of the $O(4)$ symmetry. At the microscopic level one encounters it in Bohr's atomic model, more specifically when one writes the Schrödinger equation for the electron in a hydrogen atom, which has the $O(4)$ symmetry. (On a historical note, both Kepler and Bohr were unaware of this symmetry at the time of their fundamental work.) The $O(4)$ symmetry of the hydrogen atom was discovered by Fock (see the reprinted paper in [11]), who wrote the Schrödinger equation in the momentum representation. He then identified the points of momentum space with those of a unit sphere $S^3$ in $\mathbb{R}^4$ through stereographic projection. Through this process he was able to transform the Schrödinger equation (for the hydrogen atom) in such a manner that the Hamiltonian could be expressed as a convolution with the function:

$\dfrac{\text{constant}}{\sin^2(\omega/2)}$ (6.7.29)

$\omega$ being the distance along a great circle on $S^3$. The transformed Hamiltonian (for the hydrogen atom) is invariant under rotations, in 4-space, of this sphere $S^3$ around its centre. We give below the relevant equations of Fock's paper to show this $O(4)$ symmetry (the notation follows that of the above reference). The Schrödinger equation in momentum space (see Chapter 9) is given as:

$\frac{1}{2m}\, p^2 \psi(\mathbf{p}) - \frac{Ze^2}{2\pi^2 \hbar} \int \frac{\psi(\mathbf{p}')\,(dp')}{|\mathbf{p} - \mathbf{p}'|^2} = E\, \psi(\mathbf{p})$ (6.7.30)

where $(dp') = dp'_x\, dp'_y\, dp'_z$ is the volume element, $p^2 = p_x^2 + p_y^2 + p_z^2$, and $Z$ is the nuclear charge number ($Z = 1$ for the hydrogen atom). Writing $p_0 = \sqrt{-2mE}$, the coordinate transformation given below is used to express (6.7.30) in terms of spherical coordinates on $S^3$ (see Exp. (0.5.1)), thus

$\xi = 2p_0 p_x / (p_0^2 + p^2) = \sin\alpha \sin\theta \cos\phi$
$\eta = 2p_0 p_y / (p_0^2 + p^2) = \sin\alpha \sin\theta \sin\phi$
$\zeta = 2p_0 p_z / (p_0^2 + p^2) = \sin\alpha \cos\theta$
$\chi = (p_0^2 - p^2) / (p_0^2 + p^2) = \cos\alpha.$ (6.7.31)

Using $d\Omega = \sin^2\alpha\, \sin\theta\, d\alpha\, d\theta\, d\phi$, it follows that the volume element $(dp) = dp_x\, dp_y\, dp_z$ satisfies:

$(dp) = \frac{1}{8p_0^3}\, (p_0^2 + p^2)^3\, d\Omega.$ (6.7.32)

Set

$\lambda = \frac{Zme^2}{\hbar p_0} = \frac{Zme^2}{\hbar \sqrt{-2mE}}$ (6.7.33)

and use the new wave function in terms of $(\alpha, \theta, \phi)$ given by the relation

$\Psi(\alpha, \theta, \phi) = \frac{\pi}{\sqrt{8}}\, p_0^{-5/2}\, (p_0^2 + p^2)^2\, \psi(\mathbf{p})$ (6.7.34)

to write the Schrödinger equation (6.7.30) in the form that we were looking for:

$\Psi(\alpha, \theta, \phi) = \frac{\lambda}{2\pi^2} \int \frac{\Psi(\alpha', \theta', \phi')\, d\Omega'}{4 \sin^2(\omega/2)}$ (6.7.35)

The denominator $4\sin^2(\omega/2)$, as mentioned above, represents the square of the (chordal) distance between the two points of the sphere whose spherical coordinates are $(\alpha, \theta, \phi)$ and $(\alpha', \theta', \phi')$, $\omega$ being the arc of the great circle joining them, thus:

$4 \sin^2\frac{\omega}{2} = (\xi - \xi')^2 + (\eta - \eta')^2 + (\zeta - \zeta')^2 + (\chi - \chi')^2$ (6.7.36)

* Note that the words macroscopic and microscopic are used here because in one of the problems (involving $O(4)$ symmetry) the distance is very large and in the other it is minuscule.
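The geometric content of (6.7.31) and (6.7.36) can be checked with a few lines of code. The sketch below assumes arbitrary sample momenta `p`, `q` and an arbitrary value of `p0` (all illustrative); it verifies that the image point lies on the unit sphere $S^3$, that the squared chord equals $4\sin^2(\omega/2)$, and that the chord is related to $|\mathbf{p} - \mathbf{p}'|^2$ by the standard stereographic identity $4p_0^2|\mathbf{p}-\mathbf{p}'|^2 / ((p_0^2+p^2)(p_0^2+p'^2))$, which is what converts the kernel of (6.7.30) into that of (6.7.35).

```python
import math

def fock_point(p, p0):
    """Map a momentum vector p to (xi, eta, zeta, chi) as in (6.7.31)."""
    px, py, pz = p
    p2 = px * px + py * py + pz * pz
    d = p0 * p0 + p2
    return (2 * p0 * px / d, 2 * p0 * py / d, 2 * p0 * pz / d,
            (p0 * p0 - p2) / d)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def chord_sq(u, v):
    """Squared Euclidean distance between two points of R^4."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Illustrative sample values (not taken from the text).
p0 = 1.3
p = (0.2, -0.7, 1.1)
q = (-0.4, 0.5, 0.3)

u, v = fock_point(p, p0), fock_point(q, p0)

# Great-circle angle omega between the two image points.
omega = math.acos(max(-1.0, min(1.0, dot(u, v))))

lhs = 4 * math.sin(omega / 2) ** 2        # left side of (6.7.36)
rhs = chord_sq(u, v)                      # right side of (6.7.36)

# Stereographic identity linking the chord to |p - q|^2.
pq2 = sum((a - b) ** 2 for a, b in zip(p, q))
stereo = 4 * p0 ** 2 * pq2 / ((p0 ** 2 + dot(p, p)) * (p0 ** 2 + dot(q, q)))

print(dot(u, u), lhs, rhs, stereo)
```

Because the chord depends only on the angle between the two points of $S^3$, any rotation of the sphere in $\mathbb{R}^4$ leaves the kernel unchanged, which is the $O(4)$ symmetry described above.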

The stereographic projection used for the problem is reproduced below.

[Figure: The stereographic projection for Eq. (6.7.31), with one dimension suppressed.]


So far in this chapter we have studied only those symmetries that we observe in nature, and have obtained their so-called 'symmetry' groups (very often called the gauge groups). In the following example we shall deal with a symmetry which does not originate from the laws of nature; instead it helps in studying them (see Note (6.7.7)). Due to this distinctive nature, we call this symmetry a 'man-made symmetry.' The symmetry (i.e., the symmetry group) we have in mind pertains to the Laplace operator. Recall that in Sec. (3.5) we defined the 'full symmetry group' of an operator $L$ as the group $G$ formed by all invertible operators that commute with $L$ (Def. (3.5.2)). We shall see that for the Laplace operator $\Delta$ (which we denote as $L$) whose domain is $\mathbb{R}^3$, the group $G$ is $SO(3)$. Consider the Laplace operator $L \equiv \sum_{i=1}^{3} \frac{\partial^2}{(\partial x^i)^2} \equiv d\delta + \delta d$. Then from a standard result of Riemannian geometry, it is known that if $M$ is a Riemannian manifold and $\phi$ is an isometry of $M$, then
Example 6.7.8: The Lie algebra $so(3)$ is indeed the Lie algebra obtained from the commutator algebra $C_L$ of the Laplace operator $L$ (see Def. (3.5.1)). To see this we note that elements of $SO(3)$ can be written as:

$R_1(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}, \quad R_2(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}, \quad R_3(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$ (6.7.37)

where $R_i(\theta)$ ($i = 1, 2, 3$) denotes the rotation around a coordinate axis of a Cartesian reference frame. Consider an infinitesimal rotation of the form:

$R(\theta) \simeq I + \epsilon_\alpha X_\alpha$ (6.7.38)

where the $\epsilon_\alpha$ are infinitesimal parameters and the $X_\alpha$ are the infinitesimal generators of $SO(3)$. The $X_\alpha$ are obtained from:

$X_i = \lim_{\theta \to 0} \frac{d}{d\theta} R_i(\theta) \qquad (i = 1, 2, 3)$ (6.7.39)

i.e., from

$X_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ (6.7.40)

These $X_i$ form a basis for $so(3)$ (with $[X_\alpha, X_\beta] = \epsilon_{\alpha\beta\gamma} X_\gamma$). They satisfy the requirements of $C_L$, since the $X_i$ are operators that evidently commute with $L$.
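The passage from (6.7.37) to (6.7.40) and the commutation relations can be verified directly. The following is a minimal sketch: the rotation matrices are coded as in (6.7.37), the generators are approximated by a central difference at $\theta = 0$ (the step size `h` is an arbitrary choice), and the relation $[X_\alpha, X_\beta] = \epsilon_{\alpha\beta\gamma} X_\gamma$ is checked for all index pairs. Indices run over 0, 1, 2 in the code instead of 1, 2, 3.

```python
import math

def rot(axis, theta):
    """Rotation matrices R_1, R_2, R_3 of (6.7.37); axis is 0, 1 or 2."""
    c, s = math.cos(theta), math.sin(theta)
    if axis == 0:
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == 1:
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def generator(axis, h=1e-6):
    """X_i = dR_i/dtheta at theta = 0, by a central difference (6.7.39)."""
    plus, minus = rot(axis, h), rot(axis, -h)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(3)]
            for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

X = [generator(i) for i in range(3)]

# Exact generators of (6.7.40), for comparison.
X_exact = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

gen_error = max(max_abs_diff(X[i], X_exact[i]) for i in range(3))

# Largest violation of [X_a, X_b] = eps_{abc} X_c over all pairs (a, b).
comm_error = 0.0
for a in range(3):
    for b in range(3):
        expected = [[sum(eps(a, b, c) * X_exact[c][i][j] for c in range(3))
                     for j in range(3)] for i in range(3)]
        comm_error = max(comm_error, max_abs_diff(comm(X[a], X[b]), expected))

print(gen_error, comm_error)
```

Both errors are at the level of rounding, confirming that the matrices in (6.7.40) are the derivatives of the rotations at the identity and that they close under the commutator.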

Exercise 6.7
1. Obtain the value of $F_{\omega_t}$ as given in (6.7.16).
2. Establish the equality (6.7.22)(a).
3. Obtain a local expression for the Yang-Mills equations in an arbitrary orthonormal coordinate system.

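Hints to Exercise 6.7
1. Note that $\omega_t = \omega + tA$ is a line in the affine space of gauge potentials. The element $A$ here belongs to $\Lambda^1(M, \mathrm{ad}P)$, i.e., $A$ is a 1-form which takes values in the bundle $P \times_{\mathrm{ad}} \mathfrak{g}$. Evidently $A$ is variable in the above equation. Now we know that the curvature form $\Omega$ corresponding to $\omega$ satisfies:
(i) $\Omega = d\omega + \omega \wedge \omega$
and $F_\omega$ is the 2-form that corresponds to this. The curvature 2-form $\Omega_t$ derived from $\omega_t$ (using (i)) would be:

$\Omega_t = d\omega_t + \omega_t \wedge \omega_t = d(\omega + tA) + (\omega + tA) \wedge (\omega + tA)$
$\quad = d\omega + t\, dA + (\omega \wedge \omega) + t(A \wedge \omega + \omega \wedge A) + t^2 A \wedge A$
$\quad = (d\omega + \omega \wedge \omega) + t\, d^{\omega}A + t^2 (A \wedge A).$

Hence the gauge field $F_{\omega_t}$ corresponding to $\Omega_t$ can be written as:

$F_{\omega_t} = F_\omega + t\, d^{\omega}A + t^2 (A \wedge A)$

(see Sec. 2.5 for the correspondence between $\Omega$ and $F_\omega$).
2. Recall that on a compact oriented manifold (the type we are considering here) the Hodge-de Rham operator $\Delta = d\delta + \delta d$ maps $\Lambda^k(M)$ to $\Lambda^k(M)$, and for $\beta \in \Lambda^{k+1}(M)$ we can write:
(i) $(d\alpha, \beta) = (\alpha, \delta\beta)$ for any $\alpha \in \Lambda^k(M)$.
Both these properties can be extended to bundle-valued forms. Thus, writing $\Delta^{\omega}$ as $d^{\omega}\delta^{\omega} + \delta^{\omega}d^{\omega}$ in the LHS of equality (6.7.22)(a), we have:
(ii) $\langle\langle \Delta^{\omega} F_\omega, F_\omega \rangle\rangle = \langle\langle d^{\omega}\delta^{\omega} F_\omega, F_\omega \rangle\rangle + \langle\langle \delta^{\omega} d^{\omega} F_\omega, F_\omega \rangle\rangle$
$\quad = \langle\langle \delta^{\omega} F_\omega, \delta^{\omega} F_\omega \rangle\rangle + \langle\langle d^{\omega} F_\omega, d^{\omega} F_\omega \rangle\rangle$
$\quad = \| \delta^{\omega} F_\omega \|^2 + \| d^{\omega} F_\omega \|^2.$
3. Use the steps suggested in the Hint to Exc. 3 of Sec. 6.4.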


Table 6.1 Some Symmetries and Conservation Laws

Nonobservable | Symmetry Transformation | Conservation Law or Selection Rule
--------------------------------------------------------------------------------
Difference between identical particles | Permutation | Bose-Einstein or Fermi-Dirac statistics
Absolute position | Space translation $\mathbf{x} \to \mathbf{x} + \delta\mathbf{x}$ | Momentum
Absolute time | Time translation $t \to t + \delta t$ | Energy
Absolute direction | Rotation $\theta \to \theta'$ | Angular momentum
Absolute velocity | Lorentz transformation | Generators of the Lorentz group
Absolute right or left | $\mathbf{x} \to -\mathbf{x}$ | Parity
Absolute sign of electrical charge | $e \to -e$ | Charge conjugation
Relative phase between states of different charge $Q$ | $\psi \to e^{iQ\theta}\psi$ | Charge
Relative phase between states of different baryon number $N$ | $\psi \to e^{iN\theta}\psi$ | Baryon number
Relative phase between states of different lepton number $L$ | $\psi \to e^{iL\theta}\psi$ | Lepton number
Difference between coherent mixtures of $p$ and $n$ states | $\binom{p}{n} \to U \binom{p}{n}$ | Isospin

Table adapted from Guidry (1991)

References
1. A. P. Balachandran, (a) Wess-Zumino Terms and Quantum Symmetries in 5.[7]; A. P. Balachandran, G. Marmo, B. S. Skagerstam, A. Stern, (b) Gauge Symmetries and Fiber Bundles (Springer-Verlag, 1983).
2. R. Bott and L. W. Tu, 1.[5].
3. S. Chadha and H. B. Nielsen, Lorentz invariance as a low energy phenomenon, Nucl. Phys. B217 (1983), 125.
4. T. P. Cheng and L. F. Li, 9.[6].
5. S. Coleman, Aspects of Symmetry (Cambridge University Press, 1985).
6. P. Dita, V. Georgescu, R. Purice (eds.), Gauge Theories: Fundamental Interactions and Rigorous Results (Birkhäuser, 1982).
7. W. Drechsler and M. E. Mayer, Fiber Bundle Techniques in Gauge Theories (Springer-Verlag, 1977).
8. T. Eguchi, et al., 10.[24].
9. E. Farhi and R. Jackiw (eds.), Dynamical Gauge Symmetry Breaking (World Scientific, New Jersey, 1982).
10. D. Förster, H. B. Nielsen and M. Ninomiya, Dynamical stability of local gauge symmetry: creation of light from chaos, Phys. Lett. 94B (1980), 135.
11. C. D. Froggatt and H. B. Nielsen (eds.), Origin of Symmetries (World Scientific, New Jersey, 1991).
12. M. K. Gaillard and R. Stora (eds.), Gauge Theories in High Energy Physics (Parts 1 and 2; North-Holland Publishing Co., 1983).
13. M. Gell-Mann and Y. Ne'eman (eds.), The Eightfold Way (W. A. Benjamin, New York, 1964).
14. V. N. Gribov, Quantization of non-abelian gauge theories, Nucl. Phys. B139 (1978), 1.
15. M. Gourdin, Unitary Symmetries (North-Holland Publishing Co., Amsterdam, 1967).
16. M. Guidry, Gauge Field Theories (John Wiley & Sons, Inc., New York, 1991).
17. P. W. Higgs, Spontaneous symmetry breakdown without massless bosons, Phys. Rev. 145 (1966), 1156.
18. R. Jackiw, I. Muzinich and C. Rebbi, Coulomb gauge description of large Yang-Mills fields, Phys. Rev. D17 (1978), 1576.
19. D. B. Lichtenberg, Unitary Symmetry and Elementary Particles (Academic Press, New York, 1978).
20. K. B. Marathe and G. Martucci, 10.[35].
21. V. A. Miransky, Dynamical Symmetry Breaking in Quantum Field Theories (World Scientific, New Jersey, 1993).
22. R. N. Mohapatra, Unification and Supersymmetry (Springer-Verlag, 1992).
23. H. B. Nielsen and M. Ninomiya, $\beta$-function in a non-covariant Yang-Mills theory, Nucl. Phys. B141 (1978), 153.
24. R. D. Peccei and H. R. Quinn, Constraints imposed by CP conservation in the presence of pseudoparticles, Phys. Rev. D16 (1977), 1791.
25. I. M. Singer, Some remarks on the Gribov ambiguity, Comm. Math. Phys. 60 (1978), 7.
26. G. 't Hooft, Renormalization of massless Yang-Mills fields, Nucl. Phys. B33 (1971), 173.
27. B. Gruber and M. Ramek (eds.), Symmetries in Science X (Plenum Press, New York, 1998).
28. J. D. Jackson, Classical Electrodynamics (Wiley, 1975).

CHAPTER 7

ALL THAT IS SUPER: AN INTRODUCTION

While concluding the previous chapter on Symmetry, we made a brief reference to the concept of supersymmetry and to why it is considered an important ingredient in the unification scheme of field theories. In this chapter we focus our attention on all those aspects of mathematics and physics that are relevant to supersymmetry and related ideas, namely: superalgebras, supergroups, superspace, superfields, and so forth. All these concepts are linked to one another. For instance, the super-Poincaré group (the superextension of the Poincaré group) is the group of motions of a particular supermanifold, the flat superspace, whereas the infinitesimal generators of this group define a superalgebra. Moreover, knowledge of all these topics is required for understanding the supertheories: supergravity, super-Yang-Mills theory and superstrings. We therefore present here the basics of these topics, and refer the reader for an in-depth study to the works of experts in the field (see references [2, Vols. I, II, III], [10, Vols. I, II], [19], [20], [21]).

1 GRADED ALGEBRAS

1.1 Superalgebras and Lie Superalgebras

The first ingredient in any supertheory is its superalgebra, more precisely its Lie superalgebra. We therefore begin by giving the definitions of superalgebra and Lie superalgebra along with a few explanatory examples. As can be expected, these definitions stem from those of algebra and Lie algebra which we studied in Chapter 4 (see Def. (4.1.1) and (4.1.3)). We recall that an algebra $\mathcal{A}$ is a vector space over a field $\mathcal{F}$ of characteristic 0 (or prime $p$) on which a distributive binary operation that commutes with scalars is defined. The algebra $\mathcal{A}$ may or may not be associative. Similarly, a vector space $\mathcal{G}$ over $\mathcal{F}$ equipped with a bilinear, anti-commutative binary operation $[\ ,\ ]$ that obeys the Jacobi identity

$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$ (7.1.1)

is a Lie algebra; it is in general non-associative. In this chapter we shall be dealing mostly with graded vector spaces and graded algebras.

Definition 7.1.1: Let $\Gamma$ denote one of the rings $\mathbb{Z}$ or $\mathbb{Z}_2 = \mathbb{Z}/2\mathbb{Z}$ (the ring of integers modulo 2). A vector space $V$ over the field $\mathcal{F}$ is said to be $\Gamma$-graded if it admits a family $(V_\gamma)_{\gamma \in \Gamma}$ of subspaces such that:

$V = \bigoplus_{\gamma \in \Gamma} V_\gamma$ (7.1.2)


An algebra A is said to be F-graded if the underlying vector space is F-graded, i.e., A = © Ay

(7.1.3)

and in addition AaAj}

c Aa+p

for all a, ft e F

(7.1.4)

When $\Gamma$ is $\mathbb{Z}_2$ we denote the subspaces in (7.1.2) (and (7.1.3)) simply as $V_0$ and $V_1$ ($A_0$ and $A_1$), and call their elements even and odd respectively.

Definition 7.1.2: A $\mathbb{Z}_2$-graded algebra is called a superalgebra*. We denote it as $L = L_0 + L_1$. Let the multiplication in $L$ be denoted by $(\ ,\ )$; the grading then implies in particular that

$$(L_\alpha, L_\beta) \subset L_{\alpha+\beta} \quad \text{for all } \alpha, \beta \in \mathbb{Z}_2. \qquad (7.1.5)$$

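The algebra $L$ is called a Lie superalgebra if the multiplication satisfies the following identities:

$$(a, b) = -(-1)^{\alpha\beta}\,(b, a) \qquad (7.1.6)$$

$$(a, (b, c)) = ((a, b), c) + (-1)^{\alpha\beta}\,(b, (a, c)) \qquad (7.1.7)$$

for all $a \in L_\alpha$, $b \in L_\beta$, $c \in L_\gamma$ and $\alpha, \beta, \gamma \in \mathbb{Z}_2$. (Note that (7.1.6) is the graded skew-symmetry and (7.1.7) is the graded Jacobi identity.) In keeping with physics tradition, we shall denote the Lie superalgebra as $S$ and call the elements with even and odd grades Bose and Fermi respectively. We shall often use the letters $B$ and $F$ to denote them. Written out in full, the axioms of $S$ over the field $\mathbb{C}$ can now be put down as follows: (i) $S$ is a $\mathbb{Z}_2$-graded vector space over $\mathbb{C}$, $S = S_0 + S_1$; (ii) $S$ carries a bilinear bracket $[[\ ,\ ]]$ that respects the grading; it is the commutator $[\ ,\ ]$ unless both entries are Fermi, in which case it is the anticommutator $\{\ ,\ \}$; (iii) for elements $C$, $D$, $E$ of grades $c$, $d$, $e$ the graded Jacobi identity holds:

$$(-1)^{ce}\,[[C,[[D,E]]]] + (-1)^{dc}\,[[D,[[E,C]]]] + (-1)^{ed}\,[[E,[[C,D]]]] = 0 \qquad (7.1.8)$$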

Apparently this reduces to the familiar Jacobi identity, except in the case when any two of the elements $C$, $D$, $E$ are Fermi and the third one is Bose.

Remark 7.1.3: We remind the reader that Lie superalgebras are frequently called $\mathbb{Z}_2$-graded Lie algebras, although in general they are not Lie algebras.

* Note that an ordinary Lie algebra can also be graded (see [22] for the use of graded Lie algebras in the case of boson-fermionic models).

1 Note that the adjectives before these brackets can be confusing, since for arbitrary elements $a, b \in L$ we have $[a, b] = -[b, a]$ and $\{a, b\} = \{b, a\}$; however, when these are equated to zero the terminology makes sense, since $[a, b] = ab - ba$ and $\{a, b\} = ab + ba$.

2 We have used the double-barred bracket $[[\ ]]$ to stand for $[\ ]$ as well as $\{\ \}$. It reduces to one of these once the choice of a pair of elements $B$ and $F$ is made.


Remark 7.1.4: Let $S = S_0 + S_1$ be a Lie superalgebra; then the subalgebra $S_0$ is a Lie algebra. Any ideal of $S$ is a $\mathbb{Z}_2$-graded ideal.

Definition 7.1.5: A Lie superalgebra is simple if it has no non-trivial ideals. The simple Lie superalgebras over $\mathbb{C}$ in finite dimensions are fully classified (see for instance Kac (1975, 1977) [12a], [12b]; Kaplansky (1980) [13]).

1.2 Other Important Superalgebras and Bose and Fermi Sectors

Example 7.1.6: The general and special linear superalgebras. Let $V(l|m)$ denote a $\mathbb{Z}_2$-graded vector space with an $l$-dimensional Bose subspace and an $m$-dimensional Fermi subspace. An arbitrary vector in $V(l|m)$ is a column vector whose upper $l$ entries are $B$'s and whose lower $m$ entries are $F$'s. A vector whose upper $l$ (lower $m$) entries are $B$'s ($F$'s) and whose remaining entries are zero is called a Bose (Fermi) vector. The set of $(l + m) \times (l + m)$ matrices formed by the complex linear transformations of $V(l|m)$, together with the induced grading, defines the general linear superalgebra $gl(l|m)$. The bracket operation here (to be explained shortly) is commutative in some cases and anti-commutative in others. It can be easily checked that if the matrices in $gl(l|m)$ are block diagonal of the type

$$\begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix}, \qquad A:\ l \times l, \quad D:\ m \times m \qquad (7.1.9)$$

they carry a Bose vector to a Bose vector and a Fermi vector to a Fermi vector. These are called Bose linear transformation matrices. A matrix which is block off-diagonal,

$$\begin{pmatrix} 0 & B \\ C & 0 \end{pmatrix}, \qquad B:\ l \times m, \quad C:\ m \times l \qquad (7.1.10)$$

is called a Fermi transformation matrix of $V(l|m)$. Denote these Bose and Fermi transformation matrices by $M_B$ and $M_F$. We note that the bracket (which defines the binary operation of the Lie superalgebra) for the pairs of matrices $(M_B, M_B)$ and $(M_B, M_F)$ is the usual commutator, but for the pair $(M_F, M_F)$ it is the anticommutator. The Lie superalgebra $gl(l|m)$ is not simple. Also, while in the ordinary case the Lie bracket of two matrices is always traceless (in finite dimensions), in this case we have an anticommutator as well, which results from the $M_F$'s; and even though individual $M_F$'s are traceless, their anticommutators are not necessarily so.$^3$ This leads to the notion of supertrace, an algebraic sum which vanishes identically for the superalgebra bracket of two elements.

Definition 7.1.7: Given a matrix

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where $A$ is $l \times l$, $B$ is $l \times m$, $C$ is $m \times l$ and $D$ is $m \times m$,

3 Note that the product $M_B M_B$ is an $M_B$, $M_B M_F$ is an $M_F$ and $M_F M_F$ is an $M_B$.


the supertrace of $M$ is defined as:

$$\mathrm{str}\, M = \mathrm{tr}\, A - \mathrm{tr}\, D \qquad (7.1.11)$$
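To illustrate why the supertrace, rather than the trace, is the natural quantity here, the following Python sketch builds two Fermi (block off-diagonal) matrices of $gl(2|1)$ and compares the trace and the supertrace of their anticommutator. The helper names (`matmul`, `odd_matrix`, `supertrace`) and the sample entries are illustrative assumptions and do not come from the text.

```python
# Minimal sketch (assumed helper names, arbitrary sample entries) for gl(2|1).

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(mat):
    return sum(mat[i][i] for i in range(len(mat)))

def supertrace(mat, l):
    """str M = tr A - tr D, where A is the upper-left l x l block."""
    n = len(mat)
    return sum(mat[i][i] for i in range(l)) - sum(mat[i][i] for i in range(l, n))

def odd_matrix(b, c, l, m):
    """Block off-diagonal matrix with upper-right block b (l x m)
    and lower-left block c (m x l)."""
    n = l + m
    mat = [[0] * n for _ in range(n)]
    for i in range(l):
        for j in range(m):
            mat[i][l + j] = b[i][j]
    for i in range(m):
        for j in range(l):
            mat[l + i][j] = c[i][j]
    return mat

L_DIM, M_DIM = 2, 1
F1 = odd_matrix([[1], [2]], [[3, 4]], L_DIM, M_DIM)
F2 = odd_matrix([[5], [6]], [[7, 8]], L_DIM, M_DIM)

# Anticommutator {F1, F2} = F1 F2 + F2 F1 (the bracket for two Fermi matrices).
P, Q = matmul(F1, F2), matmul(F2, F1)
ANTI = [[x + y for x, y in zip(rp, rq)] for rp, rq in zip(P, Q)]

print("trace      :", trace(ANTI))
print("supertrace :", supertrace(ANTI, L_DIM))
```

For these sample matrices the ordinary trace of the anticommutator is non-zero while its supertrace vanishes, which is the property that motivates Definition 7.1.7.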

It can be checked that for $l \neq m$ the supertraceless elements of $gl(l|m)$ form a simple $((l + m)^2 - 1)$-dimensional superalgebra, denoted $sl(l|m)$, under the bracket operation mentioned above. When $l = m$ the unit matrix is supertraceless; it commutes with every element and as such defines the 1-dimensional center of $sl(m|m)$. We factor out this piece and denote $sl(m|m)$ without its center as $psl(m|m)$. We note that $psl(m|m)$ is a simple algebra for $m \geq 2$, with $\dim(psl(m|m)) = 4m^2 - 2$. For $m = 1$ the algebra $psl(1|1)$ is nilpotent. In view of Remark 7.1.4 and the above discussion, it is evident that the Bose portion (sector) of $S$ behaves like an ordinary Lie algebra. For obvious reasons we shall denote it as ${}^0S$ (in place of $S_0$).

Returning to $gl(l|m)$, we observe that the elements of ${}^0gl(l|m)$ can be viewed as forming three different categories: (i) those that shuffle Bose to Bose; (ii) those that shuffle Fermi to Fermi; and (iii) those that shuffle Bose to Bose and Fermi to Fermi but in a correlated manner. Thus the algebra ${}^0gl(l|m)$ is not simple, and similarly the algebra ${}^0sl(l|m)$ is not simple. These algebras can be expressed as:

$${}^0gl(l|m) = gl(l) + gl(m)$$
$${}^0sl(l|m)_{l \neq m} = sl(l) + sl(m) + \text{1-dimensional Abelian piece}$$
$${}^0psl(m|m) = sl(m) + sl(m) \qquad (7.1.12)$$

The dimensions of these Bose parts are therefore $l^2 + m^2$, $(l^2 - 1) + (m^2 - 1) + 1$ and $(m^2 - 1) + (m^2 - 1)$. Accordingly the dimensions of the Fermi parts (sectors) of these algebras are $2lm$, $2lm$ and $2m^2$. We note that just as the general and special linear algebras can be generalized to their corresponding superalgebras, the algebras with orthogonal and symplectic structures can also be made into superalgebras. In the latter case, since one of the parts (say, for instance, Fermi) has to be symplectic, we have to choose $m$ even.

Example 7.1.8: Ortho-symplectic superalgebra. The $\mathbb{Z}_2$-graded vector space $V(l|m)$ ($m$ even) carries a bilinear form:

$$(X, Y) = X^T M Y, \qquad X, Y \in V(l|m) \qquad (7.1.13)$$

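where

$$M = \begin{pmatrix} 1_l & 0 \\ 0 & J_m \end{pmatrix} \qquad (7.1.14)$$

with $1_l$ the $l \times l$ unit matrix and $J_m$ an antisymmetric $m \times m$ matrix with entries $\pm 1$ (one standard choice is the block-diagonal matrix built from copies of $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$). This form is symmetric (antisymmetric) on the Bose (Fermi) sector of $V(l|m)$. To define the superstructure we now consider those complex linear transformations $U$ on $V(l|m)$ which satisfy a superantisymmetry condition: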

$$(X, UY) + (-1)^{ux}\,(UX, Y) = 0 \qquad (7.1.15)$$

where $u$ and $x$ are the grades of $U$ and $X$. The collection of all such $U$'s defines the ortho-symplectic superalgebra denoted $osp(l|m)$. Using arguments similar to the case of $gl(l|m)$ and $sl(l|m)$, we note that the Bose sector of $osp(l|m)$ is $o(l) + sp(m)$ and hence it has the dimension $\tfrac{1}{2} l(l - 1) + \tfrac{1}{2} m(m + 1)$. To determine the Fermi sector (i.e., those $U$'s which form the Fermi sector), we choose $U$ as Fermi (i.e., with off-diagonal blocks only) and the vectors $X$ and $Y$ as Bose and Fermi respectively. In view of (7.1.13) and the superantisymmetry condition (7.1.15) we now have:

$$B^T M U F + B^T U^T M F = 0 \qquad (7.1.16)$$

This means that if

$$U = \begin{pmatrix} 0 & V \\ W & 0 \end{pmatrix}, \qquad V:\ l \times m, \quad W:\ m \times l,$$

then $V$ and $W$ are related to each other by:

$$V = -W^T C \qquad (7.1.17)$$

where $C$ is the antisymmetric $m \times m$ matrix with entries $\pm 1$ that forms the Fermi block of $M$ in (7.1.14). Thus the lower (upper) half of $U$ determines the upper (lower) half, and therefore there are $lm$ fermionic generators in this case and not $2lm$ as in the case of $sl(l|m)$. As a consequence the total dimension of $osp(l|m)$ is $\tfrac{1}{2}[(l^2 - l) + (m^2 + m) + 2lm] = \tfrac{1}{2}[(l + m)^2 + m - l]$. It is interesting to note that if we had chosen both (i.e., Bose as well as Fermi) parts of $M$ as symmetric or as antisymmetric, we would have found the Fermi part (sector) to be empty; this would have led to the ordinary Lie algebra $o(l + m)$ or $sp(l + m)$. On the other hand, if we had chosen $l$ even and had imposed the antisymmetry via $M$ over the Bose sector and the symmetry over the Fermi sector*, then we would have obtained $osp(m|l)$. Thus while it is immaterial which of the sectors is assigned the symmetry or the antisymmetry condition, it is essential that only one of them is assigned the symmetry condition and the other the antisymmetry condition.

Example 7.1.9: By restricting to the set of matrices of the type:

$$M = \begin{pmatrix} A & B \\ C & -A^T \end{pmatrix} \in sl(n+1|n+1)$$

where $\mathrm{tr}\, A = 0$ and $B$ ($C$) is symmetric (antisymmetric), we obtain the subsuperalgebra of $sl(n+1|n+1)$ denoted $P(n)$, with dimension $2(n+1)^2 - 1$. Likewise if we choose the matrices of the type:

$$M = \begin{pmatrix} A & B \\ B & A \end{pmatrix} \in sl(n+1|n+1)$$

* This can be done by interchanging the diagonal blocks of the matrix $M$ shown in (7.1.14). Note that $l$ is necessarily even now.


with the provision $\mathrm{tr}\, B = 0$, and factor out the centre corresponding to the unit matrix, we obtain another subsuperalgebra of $sl(n+1|n+1)$. It is $[2(n+1)^2 - 2]$-dimensional and is denoted $Q(n)$.
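The dimension counts quoted in Examples 7.1.6 to 7.1.9 can be collected in a short Python sketch. The function names below are illustrative assumptions; each function simply encodes the Bose and Fermi sector dimensions stated in the text, so that the totals can be compared with the closed formulas given there.

```python
# Sketch only: function names are hypothetical, formulas are those quoted in the text.

def dims_gl(l, m):
    """(Bose, Fermi) dimensions of gl(l|m)."""
    return l * l + m * m, 2 * l * m

def dims_sl(l, m):
    """(Bose, Fermi) dimensions of sl(l|m), assuming l != m."""
    return (l * l - 1) + (m * m - 1) + 1, 2 * l * m

def dims_psl(m):
    """(Bose, Fermi) dimensions of psl(m|m)."""
    return 2 * (m * m - 1), 2 * m * m

def dims_osp(l, m):
    """(Bose, Fermi) dimensions of osp(l|m), assuming m even."""
    return l * (l - 1) // 2 + m * (m + 1) // 2, l * m

def dim_P(n):
    """Total dimension of P(n)."""
    return 2 * (n + 1) ** 2 - 1

def dim_Q(n):
    """Total dimension of Q(n)."""
    return 2 * (n + 1) ** 2 - 2

for l, m in [(2, 1), (3, 2), (1, 4)]:
    print(f"sl({l}|{m}):", sum(dims_sl(l, m)), "expected", (l + m) ** 2 - 1)
for l, m in [(1, 2), (3, 2), (4, 4)]:
    print(f"osp({l}|{m}):", sum(dims_osp(l, m)),
          "expected", ((l + m) ** 2 + m - l) // 2)
print("psl(2|2):", sum(dims_psl(2)), "expected", 4 * 2 ** 2 - 2)
```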

Exercise 7.1

1. Establish the identity (7.1.8) in different cases.

2. In the matrix representation of the $\mathbb{Z}_2$-graded $V(l|m)$, the bracket for pairs of Bose and Fermi matrices $(M_B, M_B)$, $(M_B, M_F)$ and $(M_F, M_F)$ is the usual commutator in the first two cases but is the anticommutator in the third case. Explain the implications of this statement.

3. Let $Q$ be a quadratic form on a vector space $V$ (over the field $F = \mathbb{R}$ or $\mathbb{C}$), $Q : x \in V \mapsto Q(x) \in F$, such that $Q$ defines a symmetric scalar product on $V$ given by:

(a) $Q(x, y) = xy + yx = Q(x + y) - Q(x) - Q(y)$.

Show that if $V$ is $r$-dimensional and has an orthonormal basis $e_\mu$ ($\mu = 1, \ldots, r$) with respect to $Q$, then

(b) $e_\mu e_\nu + e_\nu e_\mu = 2\delta_{\mu\nu}\, Q(e_\mu) \cdot 1$.

The associative algebra with a unit element, generated by the $e_\mu$'s with the defining relations (b), is the $2^r$-dimensional Clifford algebra $Cl(V, Q)$ with respect to the quadratic form $Q$. A general element of this algebra in this basis can be written as:

(c) $a^{(0)} + a^{\mu} e_\mu + a^{\mu_1\mu_2} e_{\mu_1} e_{\mu_2} + \cdots + a^{\mu_1 \cdots \mu_r} e_{\mu_1} \cdots e_{\mu_r}$

where the coefficients $a^{\mu_1\mu_2}$ etc. are antisymmetric in $\mu_1, \mu_2, \ldots$ and belong to $\mathbb{R}$ or $\mathbb{C}$ according to our choice of the field $F$ (later denoted $\mathbb{F}$).

Hints to Exercise 7.1

1. Let $C$, $D$, $E$ all be Bose elements; then their grades, denoted $c$, $d$, $e$, are all zero (mod 2), and the brackets $[[\ ]]$ in this case are the familiar Lie brackets. Hence (7.1.8) is the familiar Jacobi identity, which can be verified. When $C$, $D$, $E$ are all Fermi elements, $c$, $d$, $e$ are all 1 (mod 2), and the identity in this case becomes:

(i) $-[C, \{D, E\}] - [D, \{E, C\}] - [E, \{C, D\}] = 0$, or
(ii) $[C, DE + ED] + [D, EC + CE] + [E, CD + DC] = 0$, or
(iii) $C(DE + ED) - (DE + ED)C + D(EC + CE) - (EC + CE)D + E(CD + DC) - (CD + DC)E = 0$, or
(iv) $(CDE + CED + DEC + DCE + ECD + EDC) - (DEC + EDC + ECD + CED + CDE + DCE) = 0$.

Let $C$ and $D$ be Fermi and $E$ be Bose; then we have

(v) $\{C, [D, E]\} - \{D, [E, C]\} + [E, \{C, D\}] = 0$.

On simplification, it can be easily checked that the terms cancel each other.


2. The linear transformation matrices $M_B$ and $M_F$, which are block-diagonal and block off-diagonal respectively, inherit the grading structure from $V(l|m)$: $M_B$ is even-graded and $M_F$ is odd-graded. Also, while $M_B$ carries a Bose (Fermi) vector to a Bose (Fermi) vector, $M_F$ carries a Bose vector to a Fermi vector and vice versa. The space of these linear transformations can be given an algebra structure through the bracket operation. These bracket operations are not the same in each case, owing to the different characteristics of $M_B$ and $M_F$. Using the bracket rule for even- and odd-graded elements, we note that the elements of the first two pairs combine with each other using the commutator bracket $[\ ]$, while the elements of the third pair are combined by the anticommutator bracket $\{\ \}$. This bracket structure amongst the linear transformations of $V(l|m)$ defines in turn the Lie superalgebra $gl(l|m)$.

3. By definition $Q : x \in V \mapsto Q(x) \in F$ is the quadratic form of a symmetric bilinear form. Setting $y = x$ in (a) gives $x^2 = Q(x) \cdot 1$, so we can write $Q(x + y) = (x + y)^2 = x^2 + y^2 + xy + yx$ for $x, y \in V$. The definition of the symmetric scalar product on $V$ therefore gives $Q(x, y) = xy + yx = (x^2 + y^2 + xy + yx) - x^2 - y^2$. Thus if $x$ and $y$ are replaced by basis elements that are mutually orthogonal we have:

(i) $e_\mu e_\nu + e_\nu e_\mu = 2\delta_{\mu\nu}\, Q(e_\mu)\, 1$.

The associative algebra with unit element 1 and generated by $\{e_\mu\}$ is the Clifford algebra $C(Q)$ whose defining relation is given by (i) (see the Hint to Exc. (4.1.3) for the rest).
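The case-by-case verification suggested in Hint 1 can also be carried out numerically. The Python sketch below is a minimal illustration under stated assumptions: it represents homogeneous elements of $gl(1|1)$ as pairs (matrix, grade), defines the double-barred bracket as $XY - (-1)^{xy} YX$, and evaluates the left-hand side of (7.1.8) for every ordered triple drawn from a small sample. The names `bracket`, `jacobi_sum` and the sample matrices are hypothetical choices made for this example.

```python
from itertools import product

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(x, y):
    """Graded bracket [[X, Y]] = XY - (-1)^{xy} YX for homogeneous elements.
    Elements are pairs (matrix, grade) with grade 0 (Bose) or 1 (Fermi)."""
    (a, ga), (b, gb) = x, y
    sign = -1 if (ga * gb) % 2 else 1
    ab, ba = matmul(a, b), matmul(b, a)
    mat = [[p - sign * q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]
    return mat, (ga + gb) % 2

def jacobi_sum(C, D, E):
    """Left-hand side of the graded Jacobi identity (7.1.8)."""
    c, d, e = C[1], D[1], E[1]
    total = [[0, 0], [0, 0]]
    terms = ((c * e, C, bracket(D, E)),
             (d * c, D, bracket(E, C)),
             (e * d, E, bracket(C, D)))
    for exponent, outer, inner in terms:
        sign = -1 if exponent % 2 else 1
        term, _ = bracket(outer, inner)
        total = [[t + sign * v for t, v in zip(rt, rv)]
                 for rt, rv in zip(total, term)]
    return total

# Sample homogeneous elements of gl(1|1): diagonal = Bose, off-diagonal = Fermi.
ELEMENTS = [
    ([[1, 0], [0, 2]], 0),
    ([[3, 0], [0, -1]], 0),
    ([[0, 1], [2, 0]], 1),
    ([[0, 3], [-1, 0]], 1),
]

ALL_ZERO = all(jacobi_sum(C, D, E) == [[0, 0], [0, 0]]
               for C, D, E in product(ELEMENTS, repeat=3))
print("graded Jacobi identity holds on all sample triples:", ALL_ZERO)
```

Because the entries are integers, the comparison with the zero matrix is exact; the sample covers all combinations of Bose and Fermi grades.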

2

THE SPINORS

In Sec. 1 we presented a few easy examples of superalgebras. These did not include the super-Poincaré algebra, which happens to be an important superalgebra in the real world. In order to introduce it we have to acquaint ourselves with the basics of spinors. It is worth mentioning that spinors were the outcome of the physicists' need for an object that helps define a Lorentz-invariant linear first-order differential operator (see Exc. 1 and 2). However, once they were conceptualized in physics, they were studied structurally from different angles by mathematicians (see [14]).$^4$ We give below the simplest form of their definition before presenting a more sophisticated version.

2.1 The Definitions and Properties of Spinors

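To begin with, we define the word spinor along the quantum-mechanical path, using the dynamical variable spin. Accordingly we represent two special states (called up and down) by column matrices $u$ and $d$:

$$u = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad d = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (7.2.1)$$

and the general spin state $\chi$ by setting:

$$\chi = c_1 u + c_2 d = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \qquad (7.2.2)$$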

4 In his famous equation Dirac replaced Schrödinger's complex-valued wave function by a spinor-valued wave function (see Chapter 9 for the Dirac equation).


Obviously we have

$$\chi(+1) = c_1 \quad \text{and} \quad \chi(-1) = c_2 \qquad (7.2.3)$$

The elements $c_1$ and $c_2$ of the above column matrix are complex numbers in general, and the object $\chi$ representing this matrix is a spinor. From the principles of quantum mechanics it is evident that $|c_1|^2$ is the probability of finding the particle with spin up and $|c_2|^2$ is the probability of finding it with spin down. This requires the constraint (normalization):

$$|c_1|^2 + |c_2|^2 = (c_1^* \ \ c_2^*) \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 1 \qquad (7.2.4)$$

which leads to the fact that associated with every spinor $\chi = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ there is an entity $\chi^\dagger = (c_1^* \ \ c_2^*)$ such that $\chi^\dagger \chi = 1$. Given two spinors $\chi$ and $\chi'$, their complex scalar product is defined as:

$$\chi^\dagger \chi' = c_1^* c_1' + c_2^* c_2' \qquad (7.2.5)$$

Two spinors are said to be orthogonal if this product is zero. Evidently $u$ and $d$ in (7.2.1) are orthogonal spinors; they are also normalized, since $u^\dagger u = d^\dagger d = 1$, and hence they form a basis in terms of which arbitrary spinors can be expanded. In fact, every $(2 \times 2)$ Hermitian matrix $A$ with distinct eigenvalues $\lambda_1$, $\lambda_2$ gives rise to eigenvectors $E_1$, $E_2$ (7.2.6) which are orthogonal. These are called eigenspinors, and an arbitrary spinor can be expressed in terms of them; thus we have:

$$\chi = d_1 E_1 + d_2 E_2 \qquad (7.2.7)$$

The complex coefficients $d_1$ and $d_2$ determine the state (represented by the spinor $\chi$), $|d_1|^2$ and $|d_2|^2$ being the probabilities of finding the eigenvalues $\lambda_1$ and $\lambda_2$, which give the measurement of the physical quantity represented by $A$ (see Chapter 9).
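The normalization (7.2.4), the scalar product (7.2.5) and the expansion (7.2.7) can be made concrete with a small numerical example. The Python sketch below is only an illustration: the sample spinor, the choice of the Pauli matrix $\tau_1$ as the Hermitian matrix $A$, and the helper names `inner` and `normalize` are assumptions made for this example.

```python
import math

def inner(chi, chi_p):
    """Complex scalar product chi^dagger chi' of two 2-component spinors."""
    return sum(a.conjugate() * b for a, b in zip(chi, chi_p))

def normalize(chi):
    """Rescale a spinor so that chi^dagger chi = 1."""
    norm = math.sqrt(inner(chi, chi).real)
    return [c / norm for c in chi]

# Basis spinors u (up) and d (down) of (7.2.1).
u = [1 + 0j, 0j]
d = [0j, 1 + 0j]

# A sample state, normalized as in (7.2.4).
chi = normalize([3 + 0j, 4j])
p_up, p_down = abs(chi[0]) ** 2, abs(chi[1]) ** 2
print("P(up), P(down):", p_up, p_down)

# Eigenspinors of the Hermitian matrix tau_1 = [[0, 1], [1, 0]]
# (eigenvalues +1 and -1); this choice of A is an assumption for illustration.
s = 1 / math.sqrt(2)
E1 = [s + 0j, s + 0j]
E2 = [s + 0j, -s + 0j]

# Expansion coefficients of (7.2.7): d_i = E_i^dagger chi.
d1, d2 = inner(E1, chi), inner(E2, chi)
print("P(lambda_1), P(lambda_2):", abs(d1) ** 2, abs(d2) ** 2)
```

In both bases the probabilities sum to one, as the normalization requires.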

2.2 Clifford Algebras and Spinors

In this subsection we formulate the group-theoretic version of spinors. Since Clifford algebras play an important role in the definition of spinors, we shall describe this algebra again (see Chap. 4, in particular Excs. 4.1.3 and 4.1.5, and also Exc. 7.1.3), this time defining it as a quotient algebra obtained from the tensor algebra.

Definition 7.2.1: Let $V$ be a vector space over the commutative field $\mathbb{F}$ and suppose that $Q$ is a quadratic form on $V$. The Clifford algebra $Cl(V, Q)$ associated to $V$ and $Q$ is an associative algebra with unit element; it is defined as follows. Let $T(V) = \sum_{r=0}^{\infty} \otimes^r V$ denote the tensor algebra of $V$, and let $I_Q(V)$ be the ideal of $T(V)$ generated by all elements of the form $v \otimes v + Q(v) 1$ for $v \in V$. Then the quotient $T(V)/I_Q(V)$ is the Clifford algebra $Cl(V, Q)$.$^5$

There is a natural embedding $V \hookrightarrow Cl(V, Q)$, which is the image of $V = \otimes^1 V$ under the canonical projection $\pi_Q : T(V) \to Cl(V, Q)$. As a result of this embedding we have: if $v \in V \subset Cl(V, Q)$ is an element such that $Q(v) \neq 0$, then $\mathrm{Ad}_v(V) = V$. In fact, for all $u \in V$ the following relation holds:

$$\mathrm{Ad}_v(u) = u - \frac{Q(v, u)}{Q(v)}\, v \qquad (7.2.8)$$

with $Q(v, u) = Q(v + u) - Q(v) - Q(u)$ as in Exc. 7.1.3.

We shall use this idea further to define the 'spin group' associated to $Cl(V, Q)$. In order to do that, we begin with the following definition.

Definition 7.2.2: Consider the subset of $Cl(V, Q)$ given by:

$$Cl^{\times}(V, Q) = \{a \in Cl(V, Q) : \exists\, a^{-1} \text{ with } a^{-1} a = a a^{-1} = 1\} \qquad (7.2.8)(a)$$

The subset $Cl^{\times}(V, Q)$ forms a group called the multiplicative group of units of the Clifford algebra. This group contains all elements $v \in V$ with $Q(v) \neq 0$, and when $\dim V = n$ and the field $\mathbb{F}$ is $\mathbb{R}$ or […]

(7.2.9)

where $O(V, Q) = \{\lambda \in Gl(V) : \lambda^* Q = Q\}$$^6$ is the orthogonal group of the form $Q$.

Definition 7.2.3: The Pin group of $(V, Q)$ is the subgroup $Pin(V, Q)$ of $P(V, Q)$ generated by the elements $v \in V$ such that $Q(v) = \pm 1$. The associated Spin group of $(V, Q)$ is defined by$^7$:

$$Spin(V, Q) = Pin(V, Q) \cap Cl^0(V, Q) \qquad (7.2.10)$$

When $V = \mathbb{R}^{p+q}$ and $Q(x) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2$, we denote $Cl(V, Q)$ as $Cl_{p,q}$ or $C(p, q)$; for $p + q = n$ the notations are $Cl_n = Cl_{n,0}$ and $Cl_n^* = Cl_{0,n}$. The Spin group defined in (7.2.10) is now denoted by $Spin_n$. With these definitions in place, we shall now define a spin structure on a vector bundle and then a real (and complex) spinor bundle (see [14] for details).

'

6

'

7

The notations used here are slightly different from previous ones. We shall use Cl and C interchangeably to denote the Clifford objects. The map A* which preserves Q is induced by an automorphism X of V. Consider the automorphism 0 : Cl(V, Q) -» Cl(V, Q) which extends the map <j> (v} = -v on V to Cl (V, Q), since 02 = Id, there is a decomposition Cl(V, Q) = cf (V, Q) + Cl\V, Q) showing that Cl(V, Q) is Z r graded. Apparently the even part Cl°(V, Q) is a subalgebra of Cl{V, Q).


Definition 7.2.4: Let $E$ be an oriented $n$-dimensional Riemannian vector bundle over a manifold $X$ and let $P_{SO}(E)$ be its bundle of oriented orthonormal frames. Suppose $n \geq 3$; then a spin structure on $E$ is a principal $Spin_n$-bundle (denoted) $P_{Spin}(E)$ together with a 2-sheeted covering

$$\xi : P_{Spin}(E) \to P_{SO}(E) \qquad (7.2.11)$$

such that $\xi(pg) = \xi(p)\, \xi_0(g)$ for all $p \in P_{Spin}(E)$ and $g \in Spin_n$, where $\xi_0 : Spin_n \to SO_n$ denotes the covering homomorphism. When $n = 2$ the definition is analogous, the covering of $SO_2$ in this case being the connected 2-fold covering. When $n = 1$, $P_{SO}(E) = X$ and a spin structure is simply defined to be a 2-fold covering of $X$.

Definition 7.2.5: Let $E$ be an oriented Riemannian vector bundle with a spin structure $\xi : P_{Spin}(E) \to P_{SO}(E)$. A real spinor bundle is the bundle

$$S(E) = P_{Spin}(E) \times_\mu M \qquad (7.2.12)$$

where $M$ is a left module for $Cl_n = Cl(\mathbb{R}^n)$ and where $\mu : Spin_n \to SO(M)$ is the representation given by left multiplication by elements of $Spin_n \subset Cl^0(\mathbb{R}^n)$. A complex spinor bundle can be similarly defined as:

$$P_{Spin}(E) \times_\mu M_{\mathbb{C}} \qquad (7.2.13)$$

where $M_{\mathbb{C}}$ is a complex left module for $Cl(\mathbb{R}^n) \otimes \mathbb{C}$.

In order to return to the spinors of our interest, we choose $n = 4$ and show what exactly a spinor is from this bundle-theoretic point of view. Suppose that $U$ denotes the domain of a local chart of the 4-dimensional manifold $X = X^4$, and $\rho_{0x}$ denotes a fixed field of orthonormal frames on $U$ for $x \in U$. Further, let $\Lambda_{0x}$ be a fixed differentiable mapping of $U$ into $Spin_4$. A spinor $\psi$ on $U$ is an equivalence class of triples:

$$(\rho_x, \Lambda_x, \psi_x) \sim (\rho_{0x}, \Lambda_{0x}, \psi_{0x}) \quad \text{if} \quad (\rho_x, \Lambda_x, \psi_x) = \left( L(\Lambda_x \Lambda_{0x}^{-1})\, \rho_{0x},\ \Lambda_x,\ \Lambda_x \Lambda_{0x}^{-1}\, \psi_{0x} \right). \qquad (7.2.14)$$

We would like to mention that the general construction given above has been considerably simplified: the vector bundle $E$ is replaced by the tangent vector bundle $T(X)$, and the Clifford algebra $Cl(\mathbb{R}^4)$ is taken as a module over itself (replacing $M$) by left multiplication. Finally, we note that in terms of an orthonormal basis $\{e_\mu\}$ of $V$ the spinors can be described as follows. From Exc. (7.1.3) we recall that an arbitrary element of a Clifford algebra $C(Q)$ with quadratic form $Q$ defined on an $r$-dimensional vector space $V_{\mathbb{R}}$$^8$ can be expressed as:

$$a^{(0)} + a^{\mu} e_\mu + a^{\mu_1\mu_2} e_{\mu_1} e_{\mu_2} + \cdots + a^{\mu_1 \cdots \mu_r} e_{\mu_1} e_{\mu_2} \cdots e_{\mu_r} \qquad (7.2.15)$$

where $\{e_\mu\}$ is the orthonormal basis of $V$ and $1,\ e_{\mu_1},\ e_{\mu_1} e_{\mu_2},\ \ldots,\ e_{\mu_1} e_{\mu_2} \cdots e_{\mu_r}$ ($\mu_1 < \mu_2 < \cdots < \mu_r$) is the basis of $C(Q)$. Note that each coefficient $a^{\mu_1 \cdots \mu_i}$ is totally antisymmetric in the $\mu$'s. If we restrict $Q$ as follows:

$$Q(e_\mu) = \begin{cases} +1, & \mu = 1, 2, \ldots, p, \\ -1, & \mu = p + 1, \ldots, p + q = r \end{cases} \qquad (7.2.16)$$

8 $V_{\mathbb{R}}$ = vector space over the field $F = \mathbb{R}$.


then the elements of the form $\tfrac{1}{2} a^{\mu\nu} (e_\mu e_\nu - e_\nu e_\mu)$ in (7.2.15) can be viewed as Lie brackets which span the Lie algebra $spin(p, q; \mathbb{R})$. A representation of the Clifford algebra $C(p, q)$ thus gives a representation of $spin(p, q; \mathbb{R})$. The elements of the corresponding representation space are called the spinors. Due to the importance of Clifford algebras for spinors, we list below a few facts about them.

Fact 7.2.6: Every Clifford algebra admits an irreducible matrix representation, unique up to equivalence, and therefore all Clifford algebras can be uniquely characterized as matrix algebras [1].

Fact 7.2.7: The knowledge of $C(p, q; \mathbb{R})$ for $q = 0$ and $p = 1, 2, \ldots, 8$ leads to the determination of the other $C(p, q; \mathbb{R})$'s. In particular many properties of these algebras depend only on the signature $(p - q)$ mod 8.

Fact 7.2.8: Using the expression (7.2.15), and identifying the basis elements of $C(p, q)$ with Pauli matrices, it follows that:

$$C(0, 0) = \mathbb{R}, \quad C(1, 0) = \mathbb{R} + \mathbb{R}, \quad C(0, 1) = \mathbb{C}, \quad C(2, 0) = \mathbb{R}(2), \quad C(1, 1) = \mathbb{R}(2), \quad C(0, 2) = \mathbb{H} \qquad (7.2.17)$$

where $\mathbb{R}(2)$ stands for the algebra of real $(2 \times 2)$ matrices, and $\mathbb{H}$ stands for the algebra of quaternions.

Fact 7.2.9: Since the algebra of 4-dimensional Dirac matrices can be represented as a direct product of two copies of Pauli algebras (formed by Pauli matrices), there follow the isomorphisms:

$$C(p, q) \otimes C(2, 0) \simeq C(q + 2, p)$$
$$C(p, q) \otimes C(1, 1) \simeq C(p + 1, q + 1)$$
$$C(p, q) \otimes C(0, 2) \simeq C(q, p + 2) \qquad (7.2.18)$$

Finally we define the complex Clifford algebras; these are required for the classification of spinors of different types.
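The two-generator identifications in (7.2.17) can be checked with explicit $2 \times 2$ matrices. The Python sketch below is a minimal illustration under assumptions: the generator choices ($\tau_1, \tau_3$ for $C(2,0)$; $\tau_1, -i\tau_2$ for $C(1,1)$; $i\tau_1, i\tau_2$ for $C(0,2)$) are one convenient realization, and the helper names are hypothetical. It verifies the defining relations $e_\mu e_\nu + e_\nu e_\mu = 2\delta_{\mu\nu} Q(e_\mu) \cdot 1$ and that the first two realizations use real matrices only.

```python
# Pauli matrices and the 2x2 unit matrix as nested lists.
I2 = [[1, 0], [0, 1]]
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]
ZERO = [[0, 0], [0, 0]]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def is_real(a, tol=1e-12):
    return all(abs(complex(x).imag) < tol for row in a for x in row)

def satisfies_clifford(e1, e2, q1, q2):
    """Check e1^2 = q1*1, e2^2 = q2*1 and e1 e2 + e2 e1 = 0."""
    return (close(mul(e1, e1), scale(q1, I2))
            and close(mul(e2, e2), scale(q2, I2))
            and close(add(mul(e1, e2), mul(e2, e1)), ZERO))

C20 = (T1, T3)                           # Q = (+1, +1)
C11 = (T1, scale(-1j, T2))               # Q = (+1, -1)
C02 = (scale(1j, T1), scale(1j, T2))     # Q = (-1, -1)

print("C(2,0):", satisfies_clifford(*C20, 1, 1), "real:", all(map(is_real, C20)))
print("C(1,1):", satisfies_clifford(*C11, 1, -1), "real:", all(map(is_real, C11)))
print("C(0,2):", satisfies_clifford(*C02, -1, -1))
```

In the $C(0,2)$ case the three elements $e_1$, $e_2$ and $e_1 e_2$ each square to $-1$ and mutually anticommute, which is the multiplication rule of the quaternion units.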

2.3 Dirac, Majorana and Weyl Spinors

Definition 7.2.10: A complex Clifford algebra is the complexification of a real Clifford algebra. Thus the complex Clifford algebra obtained from the real $C(p, q)$ is

$$C(p, q) \otimes \mathbb{C} \qquad (7.2.19)$$

which depends only on $r = p + q$ and is written $C(r)$ in what follows.

The elements of the representation space of $C(r)$ are called (complex) Dirac spinors. These spinors exist in all dimensions, and have $2^{[r/2]}$ complex components ($[r/2]$ stands for the integral part of $r/2$). The well-known Dirac equation (in different dimensions) is written using these spinors. The Clifford algebra $C(r)$ is known to be isomorphic to the algebra of $2^{[r/2]} \times 2^{[r/2]}$ complex matrices (denoted $\mathbb{C}(2^{[r/2]})$). If the complex Clifford algebra corresponds to a given space-time dimension with the metric signature satisfying

$$(p - q) \equiv 0, 1, 2 \pmod 8 \qquad (7.2.20)$$

then it can be checked that the Dirac equation (see Sec. 9.3) shuffles the real (imaginary) parts of any Dirac spinor amongst the real (imaginary) parts. As a consequence a reality condition can be imposed on spinors, which says that the spinors be real, without creating a contradiction with the Dirac equation. In these dimensions and metric signatures the Clifford algebra has a real representation of smaller dimension than usual, and the elements of the representation space are called Majorana spinors. If on the other hand $(p - q) \equiv 0, 6, 7 \pmod 8$, the Clifford algebra has a pure imaginary representation and the elements are called pseudo-Majorana spinors.

Apart from spinors that correspond to purely real or imaginary representations, there is another class which corresponds to the case of even dimensionality $r$ of the Clifford algebra $C(r)$. If the coefficients $a^{\mu}$, $a^{\mu_1\mu_2\mu_3}$, etc., are all chosen to be zero, then the subalgebra formed by the other elements is $2^{r-1}$-dimensional; we denote it by ${}^0C(r)$ (see Ftn. 7 for a different notation). It can be checked that the subalgebra ${}^0C(r)$ is not simple, since it decomposes into two ideals:

$${}^0C(r) = {}^0C(r) P_+ + {}^0C(r) P_- \qquad (7.2.21)$$

where $P_\pm$ are the projection operators defined by generalized Dirac operators (similar to $\tfrac{1}{2}(1 \pm \gamma_5)$) in the following sense$^9$:

$$P_\pm = \tfrac{1}{2}(1 \pm \varepsilon) = \tfrac{1}{2}(1 \pm e_1 e_2 \cdots e_r) \quad (\varepsilon^2 = +1), \qquad P_\pm = \tfrac{1}{2}(1 \pm i\varepsilon) \quad (\varepsilon^2 = -1) \qquad (7.2.22)$$

From (7.2.21), when $r$ is even the spinors in the eigenspace of $P_+$ or $P_-$ are called the complex Weyl spinors. Thus for all even $r$ the complex Dirac spinor representation gives rise to Weyl spinors. When the Clifford algebra is a real $C(p, q)$ with $p + q = r$ even, then (7.2.21) reduces to:

$${}^0C(p, q) = {}^0C(p, q) P_+ + {}^0C(p, q) P_- \qquad (7.2.23)$$

whenever $p - q \equiv 0 \pmod 4$. This means that Majorana spinors can be split into Majorana-Weyl spinors whenever $p - q \equiv 0, 2 \pmod 8$ and $p - q \equiv 0 \pmod 4$ simultaneously. Thus Majorana-Weyl spinors can always be defined when $p - q \equiv 0 \pmod 8$. The well-known case of existence of Majorana-Weyl spinors is that of 10-dimensional supersymmetric Yang-Mills theory, since in this case $p = 9$ and $q = 1$. Finally we collect this classification of spinors in the form of a table:

Spinor            Dimension
Dirac             any r
Weyl              r even
Majorana          r = 2, 3, 4 (mod 8)
Majorana-Weyl     r = 2 (mod 8)

(7.2.24)
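The conditions stated above can be gathered into a small lookup routine. The following Python sketch is an illustration only: the function name `spinor_types` and the dictionary keys are assumptions, and the rules it encodes are exactly those quoted in the text ((7.2.20), the pseudo-Majorana condition, evenness of $r$ for Weyl spinors, and $p - q \equiv 0 \pmod 8$ for Majorana-Weyl spinors).

```python
def spinor_types(p, q):
    """Which spinor types the conditions quoted in the text allow for C(p, q).

    Hypothetical helper: it only restates the (p - q) mod 8 rules above.
    """
    r = p + q
    s = (p - q) % 8          # Python's % always returns a value in 0..7
    return {
        "dirac_components": 2 ** (r // 2),   # 2^[r/2] complex components
        "weyl": r % 2 == 0,
        "majorana": s in (0, 1, 2),
        "pseudo_majorana": s in (0, 6, 7),
        "majorana_weyl": s == 0,
    }

for p, q in [(3, 1), (9, 1), (1, 3), (10, 1)]:
    print((p, q), spinor_types(p, q))
```

For $q = 1$ the condition $p - q \equiv 0, 1, 2 \pmod 8$ becomes $r \equiv 2, 3, 4 \pmod 8$, which is the form in which the Majorana row of table (7.2.24) is written.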

Exercise 7.2

1. Establish the representations of the Clifford algebra for the different values of $p$ and $q$ given in (7.2.17), and show that if the quadratic form $Q = 0$, then $C(Q)$ becomes a Grassmann algebra.

2. Show that, similar to $\gamma_5$, the product $\varepsilon = e_1 \cdots e_r$ satisfies:

$$\varepsilon^2 = \begin{cases} +1 & \text{for } p - q \equiv 0, 1 \pmod 4 \\ -1 & \text{for } p - q \not\equiv 0, 1 \pmod 4 \end{cases}$$

3. Show that the conditions on $p$, $q$ for the definition of Majorana spinors given in (7.2.20) and (7.2.24) are compatible.

9 See (7.3.7) for $\gamma_5$; $\varepsilon^2 = +1$ for $p - q \equiv 0$ or $1 \pmod 4$; $\varepsilon^2 = -1$ for $p - q \not\equiv 0$ or $1 \pmod 4$. Equation (7.2.22) is defined only when $\varepsilon^2 = \pm 1$.

Hints to Exercise 7.2

1. To obtain the representations of $C(p, q)$ for the $p$, $q$ given in (7.2.17) we use (7.2.15) in each case. When $p = q = 0$ there are no $e_i$'s, thus the algebra reduces to the reals, i.e., to $\mathbb{R}$. For $p = 1$ and $q = 0$, i.e., for $C(1, 0)$, there is one basis vector, say $e_1$. This can be viewed as the 1-dimensional unit matrix $1_1$, and from (7.2.15) the general element is $a^{(0)} + a^{(1)} e_1$, where $a^{(0)}, a^{(1)} \in \mathbb{R}$. This gives $C(1, 0) = \mathbb{R} + \mathbb{R}$. When $p = 0$, $q = 1$, (7.2.16) gives $Q(e_1) = -1$; accordingly $e_1 = i 1_1$, the general element is of the form $a^{(0)} + i a^{(1)}$ and hence $C(0, 1) = \mathbb{C}$. For $C(2, 0)$ the basis vectors are the Pauli matrices $e_1 = \tau_1$, $e_2 = \tau_3$, which gives $(e_1)^2 = 1_2$, $(e_2)^2 = 1_2$ and $e_1 e_2 = -i\tau_2$. But $1_2$, $e_1$, $e_2$, $e_1 e_2$ are the basis vectors of the algebra $\mathbb{R}(2)$ of real $(2 \times 2)$ matrices, which implies $C(2, 0) = \mathbb{R}(2)$. In the case of $C(1, 1)$, since $(e_1)^2 = 1_2$ and $(e_2)^2 = -1_2$, we have $e_1 = \tau_1$, $e_2 = -i\tau_2$ and $e_1 e_2 = \tau_3$; hence the basis vectors again form $\mathbb{R}(2)$, showing that $C(1, 1) = \mathbb{R}(2)$. In the case of $C(0, 2)$, both $(e_1)^2$ and $(e_2)^2$ equal $-1_2$, hence $e_1 = i\tau_1$, $e_2 = i\tau_2$, $e_1 e_2 = -i\tau_3$; these form a basis of the quaternion algebra $\mathbb{H}$, hence $C(0, 2) = \mathbb{H}$. If the quadratic form $Q = 0$, in view of the relation (b) in Exc. 3 of (7.1), the algebra $C(Q)$ reduces to the Grassmann algebra, whose defining equation is:

$$e_\mu e_\nu + e_\nu e_\mu = 0.$$

2. In order to use $p - q \equiv 0, 1 \pmod 4$ we choose $p - q = 4$ or $p - q = 5$; for $q = 1$ in both cases this gives $p$ as 5 or 6 and $r = 6$ or 7. Accordingly:

(i) $\varepsilon = e_1 e_2 \cdots e_6$, or
(ii) $\varepsilon = e_1 e_2 \cdots e_7$.

Thus in the first case we have:

(iii) $\varepsilon^2 = (e_1 e_2 \cdots e_6)(e_1 e_2 \cdots e_6) = (-(e_1)^2)\,((e_2)^2)\,(-(e_3)^2)\,((e_4)^2)\,(-(e_5)^2)\,((e_6)^2)$

where we have used $e_i e_j = -e_j e_i$ ($i \neq j$) to move the $e_i$'s from the RHS to the LHS. Then in view of the defining relation of $Q(x)$ given in Exc. 3 of (7.1) and (7.2.16), we note that $(e_\mu)^2 = +1$ for $\mu = 1, \ldots, 5$ and $(e_6)^2 = -1$, so that $\varepsilon^2 = +1$.
3

MORE ON SPINORS

Having introduced the concepts of the superalgebra, the Lie superalgebra and the Clifford algebras leading to spinors in the earlier sections, we shall now use spinors to describe the Poincaré superalgebra and finally the supersymmetry algebra in Sec. 4. Since Pauli matrices are an important tool in this description, we shall present their properties as well. Before pursuing these, however, we state the Coleman-Mandula theorem [3], which is one of the most precise and powerful no-go theorems. This theorem established the impossibility of nontrivial symmetries that connect particles of different spins, namely the integer-spin (bosons) and half-integer-spin (fermions) particles, and thus in a manner of speaking led to the concept of supersymmetry. The theorem begins with the following assumptions: (i) the scattering matrix of interacting particles (S-matrix) is based on a local, relativistic quantum field theory in 4-dimensional spacetime; (ii) there are only a finite number of different particles associated with one-particle states of a given mass; and (iii) there is an energy gap between the vacuum and the one-particle states. It concludes by asserting that the most general Lie algebra of symmetries of the S-matrix contains the energy-momentum operator $P_\mu$ and the Lorentz rotation generator $M_{\mu\nu}$, as well as a finite number of Lorentz scalar operators $T_l$ that belong to the Lie algebra of a compact Lie group.

3.1 The Poincaré Superalgebra

In view of the above theorem we make a few observations concerning the construction of these algebras:

(a) Every physically admissible supersymmetry (denoted $S$) has to satisfy two conditions, namely: (i) the Bose sector ${}^0S$ of its corresponding superalgebra $S$ must be the direct sum of a Poincaré algebra $\mathcal{P}$ (see Exc. 4.17 for $\mathcal{P}$) and an internal symmetry algebra $\mathcal{G}$; (ii) all elements of the Fermi sector ${}^1S$ (of $S$) must transform like (Lorentz invariant) spinors.

(b) Since we know that simple/semi-simple Lie algebras are easier to handle, we shall be looking for such algebras in this case also.

(c) Although the Poincaré algebra is neither simple nor semi-simple in any space-time dimension, it happens to be most suitable (see the conclusion of the above theorem), as it can be obtained by Wigner-Inönü contraction from the simple de Sitter algebra on the one hand, and on the other it can be embedded in the (again) simple conformal algebra (see Exc. 7.3.1 for the Wigner-Inönü contraction).

From (a) it follows that the construction of a superalgebra can be based on a proper choice of ${}^0S$ and ${}^1S$. Also, in view of the theorem, such a construction involves the generators $M = (M_{\mu\nu})$ and $P = (P_\mu)$ of a Poincaré algebra $\mathcal{P}$, the generators (denoted $T$) of the internal symmetry algebra $\mathcal{G}$ and the generators $Q$ that come from the Fermi sector. Hence, in the case of the Poincaré superalgebra, the generators involved are those of the Poincaré algebra and the ones that come from the Fermi sector (see Sec. 4). The Lie brackets formed by the generators $M$, $P$ and $T$ satisfy:

$$[M, M] \sim M, \quad [P, P] = 0, \quad [P, M] \sim P, \quad [T, T] \sim T, \quad [P, T] = 0 = [M, T] \qquad (7.3.1)$$

When the above generators are coupled with the generators $Q$, there are four additional bracket relations:

$$[M, Q] \sim Q, \quad [P, Q] \sim Q, \quad [T, Q] \sim Q, \quad \{Q, Q\} \sim M + P + T \qquad (7.3.2)$$


Note that the fourth bracket in (7.3.2) is the anticommutator, indicating the different character of $Q$, and the sum on the RHS of this bracket is actually a linear combination of the generators $M$, $P$ and $T$.$^{11}$ Since (from the previous section) the superalgebra has a $\mathbb{Z}_2$-graded structure, the generators $P$, $M$ and $T$ can be viewed as even and $Q$ as odd; we can therefore succinctly write the bracket relations as:

$$[\text{even}, \text{even}] = \text{even}, \qquad \{\text{odd}, \text{odd}\} = \text{even}, \qquad [\text{even}, \text{odd}] = \text{odd}. \qquad (7.3.3)$$

We shall return to the description of the above relations in indicial form in the next section. In order to do that we have to familiarize ourselves with the properties of Pauli matrices and their use in representing the Dirac matrices, the Weyl basis, and the Weyl, Majorana and Dirac spinors, which we have already defined in their most general form in Sec. 2 (see the Appendix at the end of this chapter for the properties of Pauli matrices and Exc. (7.3) for verifications). We therefore devote the rest of this section to learning about them and to fixing some notation that will be required in later sections.

Recall that we have already used the Pauli matrices $(\tau^m)$,$^{12}$ $m = 0, 1, 2, 3$, to define the generators $(\tau^m/2)$ of $SU(2)$ in Chapter 2, where we emphasized that these matrices or their scalar multiples can be used as generators of a Lie algebra. We shall now employ these matrices to show a connection between $SL(2, \mathbb{C})$ and the Lorentz group$^{13}$ and to represent the spinors in a different basis. Now $SL(2, \mathbb{C})$ is the linear group of $2 \times 2$ complex matrices $M$ with $\det M = 1$. This group can be represented (element-wise) by $M$, or its complex conjugate $M^*$, or its transpose inverse $(M^T)^{-1}$, or its Hermitian conjugate inverse $(M^\dagger)^{-1}$. By this we mean that any of these four matrices can be selected to represent the action of the Lorentz group on two-component Weyl spinors.

3.2 Lorentz Invariance

Let P denote any (2 × 2) complex matrix ∈ SL(2, C); we note that P can be written using Pauli matrices¹⁴ as the basis, thus:

$P = P_m \tau^m = \begin{pmatrix} -P_0 + P_3 & P_1 - iP_2 \\ P_1 + iP_2 & -P_0 - P_3 \end{pmatrix}$   (7.3.4)

which shows that if P is Hermitian, the $P_m$'s are real. This leads to the fact that given a Hermitian matrix P, we can always find another matrix P′: $P' = M P M^\dagger$ such that

$P_0'^{\,2} - \mathbf{P}'^{\,2} = P_0^2 - \mathbf{P}^2, \quad (\mathbf{P} = (P_1, P_2, P_3))$   (7.3.5)

11 The equivalence sign ~ in Eqs. (7.3.1)-(7.3.2) is used to mean that the RHS is a constant multiple/linear combination of generators M, P, T.
12 We denoted them as $\sigma^m$ in Chapter 2.
13 See the definition of the Lorentz group in Chapter 2 (Exp. (2.2.11)).
14 We have taken $\tau^0 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ in this case. See also the Appendix.


as det M = 1. Equation (7.3.5) shows the Lorentz invariance of P. Using the above example of a matrix P, we can check that the products $\psi^\alpha\psi_\alpha$, $\bar\psi_{\dot\alpha}\bar\psi^{\dot\alpha}$ and $\psi^\alpha\,\tau^m_{\alpha\dot\alpha}\,\partial_m\bar\psi^{\dot\alpha}$ that involve two-component spinors and derivatives of spinors are all invariant under Lorentz transformations (see Sec. 6.3); in other words they are Lorentz scalars.¹⁵
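The invariance stated in (7.3.5) can be checked numerically. The following is a minimal sketch in plain Python, assuming the convention of (7.3.4) (with $\tau^0$ equal to minus the unit matrix) and an arbitrarily chosen matrix M of unit determinant; the helper names and the numerical values are illustrative only.

```python
# Numerical check of (7.3.5): det P = P0^2 - |P|^2 is unchanged by P -> M P M^dagger
# when det M = 1. All names and sample values below are illustrative assumptions.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def hermitian_from_components(p0, p1, p2, p3):
    """P = P_m tau^m in the convention of (7.3.4)."""
    return [[-p0 + p3, p1 - 1j * p2],
            [p1 + 1j * p2, -p0 - p3]]

def components(P):
    """Invert (7.3.4): read off (P0, P1, P2, P3) from the matrix entries."""
    p0 = -(P[0][0] + P[1][1]) / 2
    p3 = (P[0][0] - P[1][1]) / 2
    p1 = (P[0][1] + P[1][0]) / 2
    p2 = (P[1][0] - P[0][1]) / 2j
    return p0, p1, p2, p3

def interval(p):
    p0, p1, p2, p3 = (complex(c).real for c in p)
    return p0 ** 2 - p1 ** 2 - p2 ** 2 - p3 ** 2

def sl2c(a, b, c):
    """Fill in the fourth entry so that det M = 1 (requires a != 0)."""
    return [[a, b], [c, (1 + b * c) / a]]

M = sl2c(1.5 + 0.5j, 0.3 - 0.2j, -0.7j)
P = hermitian_from_components(2.0, 0.5, -1.0, 0.25)
P_new = matmul(matmul(M, P), dagger(M))

print(interval(components(P)), interval(components(P_new)))
```

The two printed numbers should agree up to rounding, and both equal det P, which is why the unit determinant of M is the only property of M that the argument uses.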

3.3 Dirac Matrices and Dirac and Majorana Spinors

Next we write (4 × 4) Dirac matrices in terms of Pauli matrices, which we shall use to establish the connection between four- and two-component spinors, thus: (7.3.6) where $(\tau^m) = (1, \boldsymbol{\tau})$, $(\bar\tau^m) = (-1, \boldsymbol{\tau})$. We set

$\gamma_5 = i\,\gamma^1\gamma^2\gamma^3\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$   (7.3.7)

which gives y$ = 1. We also define the charge conjugation matrix (the (4 x 4) unitary matrix) Ca^ in terms of these Gamma matrices, thus:

fO

1

}

0

C=(Ca/3)=

-1

°

0

_! = i f r

(7-3.8)

2

i ' . oj Note that while C is antisymmetric, Cy'" is symmetric. The Majorana conjugate of a four-component spinor %a is defined as: XM=XTCoT(xM)a=XpCl3a

(7.3.9)

whereas the Dirac conjugate is defined as:

$\bar\chi^D = \chi^\dagger (i\gamma^0)$ or $(\bar\chi^D)_\alpha = (\chi^\dagger)^\beta (i\gamma^0)_{\beta\alpha}$ ¹⁶   (7.3.10)

In fact a spinor is called a Majorana spinor if this Dirac conjugate equals the Majorana conjugate, i.e., if

$\chi^T C = i(\chi^\dagger \gamma^0)$

15 The dotted index (which can also move up and down using Lorentz transformations) indicates the complex conjugate of a given spinor component, for instance $(\chi_A)^* = \bar\chi_{\dot A}$. Since Pauli matrices are used in representations of these spinors, they are also given the index structure $\tau^m_{\alpha\dot\beta}$, $\bar\tau^{m\,\dot\alpha\beta}$, etc. It should be noted, however, that for each value of m they represent the same Pauli matrix regardless of the indices $(\alpha\dot\beta)$, $(\dot\alpha\beta)$, etc. When m = 0 they stand for (+ or −) times the unit matrix.

or equivalently

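Using (7.3.7) we can now write a 4-component spinor χ as:

$\chi = \begin{pmatrix} \chi_A \\ \bar\xi^{\dot A} \end{pmatrix}$   (7.3.11)

where the 2-component spinors $\chi_A$ and $\bar\xi^{\dot A}$ are related to 4-component spinors by the rule:

$\chi_A = \tfrac{1}{2}(1 + \gamma_5)\chi$ and $\bar\xi^{\dot A} = \tfrac{1}{2}(1 - \gamma_5)\chi$   (7.3.12)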

Note that $\chi_A$ labels the first two components and $\bar\xi^{\dot A}$ the last two components of χ. The charge conjugation matrix can also be expressed in this terminology as:

$C = \begin{pmatrix} \epsilon_{AB} & 0 \\ 0 & \epsilon^{\dot A\dot B} \end{pmatrix}$   (7.3.13)

The Majorana condition $\bar\chi^M = \bar\chi^D$ now becomes:

( £ * ) * = £ „ , (XAf = - S A

(7-3.14)

where XA = XBeBAmd^A Hence for a Majorana spinor we have:

Xa={lA\xa

= eAUs

(7-3-15)

=(XA,-XA)

(7-3.16)

Since Cfo)* = ~xK, we have C^4)* = - ~XA. From our discussions in Sec. 2 (see (7.2.22)), we also note that a Dirac spinor contains two separate Weyl spinors, e.g.,

Finally we would like to mention that Gamma-matrices are represented in more than one way. We give below three of these; to distinguish them from the previous ones as well as from each other, we have provided the γ's with a subscript.

16 See Def. (7.2.10) for a Dirac spinor ψ; the symbol † = + denotes Hermitian conjugation, whereas * is the usual complex conjugation. The bar on any spinor χ indicates that it has been obtained from χ by complex conjugation. We shall use the two symbols † and + interchangeably.


y'w=\-m {t Yco -\ f

T

] (Weyl basis) 0 )

' 1 ° ^(Canonical

(7.3.17)

basis)

(7.3.18)

r°-( ° ""'I r'-f° ^1 /M~UT2 0 J 7 M " U 3 oj 7^=[o

-ilj ^"(-,-T1

0 J

(MajOrana basis)

(7-3-19)

While using the γ-matrices, the reader must make sure of their correct description in terms of Pauli matrices and the $\tau^0$ matrix that is used along with them (see Chapter 9 in [16] for a different viewpoint). Exercise 6 of this section explains this point by choosing a different set of γ's. We would also like to note that the summation rules in the case of spinors differ from those of tensors; for instance we write $\chi^A = \chi_B\,\epsilon^{BA}$ and $\chi_A = \epsilon_{AB}\,\chi^B$ for spinors, unlike $t^A = g^{AB} t_B$, $t_A = g_{AB} t^B$ for tensors. This is because the ε's are antisymmetric matrices, unlike the symmetric metric tensor $g_{AB}$. Complex conjugation also works slightly differently here due to the involvement of anti-commuting objects. For example, in this case we require that the complex conjugation and Hermitian conjugation of a scalar such as

$S = \theta^A \epsilon_{AB}\, \theta^B$   (7.3.20)

be the same (θ being an arbitrary complex spinor). We explain this as follows. Viewing the θ's as matrices, Hermitian conjugacy means that:

$S^\dagger = (\theta^B)^\dagger (\epsilon_{AB})^* (\theta^A)^\dagger = \bar\theta^{\dot B} (\epsilon_{AB})^* \bar\theta^{\dot A}$   (7.3.21)

In order that $S^\dagger = S^*$, the order of the θ's must be reversed under complex conjugation. More explicitly:

$(\theta^A\theta^B)^* = \bar\theta^{\dot B}\bar\theta^{\dot A}, \qquad (\theta^A\theta^B\theta^C)^* = (\theta^C)^*(\theta^B)^*(\theta^A)^* = -\bar\theta^{\dot C}\bar\theta^{\dot B}\bar\theta^{\dot A}$   (7.3.22)

since $(\theta^A)^* = -\bar\theta^{\dot A}$.¹⁷ As can be expected, the derivation rules are also different for θ's. Note that the θ's are a different type of entity because of their anti-commutativity. We shall learn more about this when we introduce superspaces, superfields, etc., in the next section. At present we only list these derivation rules (for later use) as follows:

17 Note that a dotted index results from complex conjugation. We shall therefore drop the bar on θ in general.

(a)

(b)

(o

f-^-1* - -±-

f-^-V - - i -

(d)

(7.3.23)

Note that (c) and (d) can be derived from (a) and (b) by using the complex conjugation rules for θ's. All these are left derivatives¹⁸ and are called the fermionic derivatives.

Exercise 7.3

1. Recall that the isometry groups of the de-Sitter and anti-de-Sitter spaces are O(4, 1) and O(3, 2). Denote the corresponding Lie algebras by o(4, 1) and o(3, 2). Let $M_{\mu\nu} = -M_{\nu\mu}$ (μ, ν = 1, 2, ..., 5) be the generators of these algebras, which decompose into two sets: $M = (M_{12}, M_{13}, M_{14}, M_{23}, M_{24}, M_{34})$ and $P = (M_{i5},\ i = 1, \ldots, 4)$. The Lie bracket relations can now be put as:

(a) $[M, M] \sim M, \quad [M, P] \sim P, \quad [P, P] \sim M.$

We now choose a mapping (rescaling of generators) $M \to \bar M = M$, $P \to \bar P = \lambda P$, which naturally implies:

(b) $[\bar M, \bar M] \sim \bar M, \quad [\bar M, \bar P] \sim \bar P, \quad [\bar P, \bar P] \sim \lambda^2 \bar M$

since $\bar M = M$. Show that when λ → 0 we have the ordinary Poincaré algebra. The above process of taking the limit λ → 0 is called the Inönü-Wigner contraction.

2. Show that $\chi^A\chi_A = -\chi_A\chi^A = \chi^2$ and $\bar\chi_{\dot A}\bar\chi^{\dot A} = -\bar\chi^{\dot A}\bar\chi_{\dot A} = \bar\chi^2$; hence $\chi_A\chi_B = \tfrac{1}{2}\epsilon_{AB}\chi^2$.

3. Verify (a) and (c) of (A.10) of the Appendix.

4. Verify that the scalar $P_m\tau^m$ of (7.3.4) is Lorentz invariant, and is a (2 × 2) complex matrix.

18 Left derivative of $f(x) \equiv \lim_{\epsilon\to 0} \dfrac{f(x) - f(x - \epsilon)}{\epsilon}$


5. Verify the Gamma-matrices product given in (7.3.7).

6. Show that the Weyl basis, the Dirac-canonical basis and the Majorana basis are related to each other by similarity transformations. Identify these matrices. Also show that the matrices of the Majorana basis satisfy the relation $(\gamma^i_M)^* = -\gamma^i_M$ for i = 0, 1, 2, 3.

7. Consider the Majorana representation of the Clifford algebra C(3, 1): $\gamma^0 = i\,\tau'^2 \otimes \tau^1$, $\gamma^1 = \tau'^1 \otimes \tau^0$, $\gamma^2 = \tau'^2 \otimes \tau^2$, $\gamma^3 = \tau'^3 \otimes \tau^0$, where τ′ and τ are two copies of Pauli matrices and $\tau'^0$ and $\tau^0$ are 2 × 2 unit matrices. Let $M_a = M_a^*$ be a Majorana spinor; then show that a 2-component spinor

$W = \tfrac{1}{2}(1 - i\gamma_5)M$ or $\bar W = \tfrac{1}{2}(1 + i\gamma_5)M$

can be constructed using M, where $\gamma_5 = \gamma^0\gamma^1\gamma^2\gamma^3 = -i\,\tau'^2 \otimes \tau^3$.
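The matrices of Exercise 7 can be checked mechanically. The sketch below uses plain Python lists and assumes the Kronecker product is taken with the primed factor as the outer block and the metric signature (−1, 1, 1, 1); it confirms the Clifford relations, the reality of the four matrices, the stated form of $\gamma_5$, and the action of $\tfrac{1}{2}(1 - i\gamma_5)$ on a real 4-component column. The helper names are hypothetical.

```python
# Sketch: Majorana representation of Exercise 7 built from Kronecker products.
# Assumption: in a (x) b the first (primed) factor a is the outer block.

def kron(a, b):
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

T0 = [[1, 0], [0, 1]]
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]

GAMMA = [scale(1j, kron(T2, T1)), kron(T1, T0), kron(T2, T2), kron(T3, T0)]
ETA = [-1, 1, 1, 1]                       # metric signature used in this chapter
GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))

def clifford_ok():
    """Check {gamma^m, gamma^n} = 2 eta^{mn} for all m, n."""
    for m in range(4):
        for n in range(4):
            anti = add(matmul(GAMMA[m], GAMMA[n]), matmul(GAMMA[n], GAMMA[m]))
            expected = scale(2 * ETA[m] if m == n else 0, identity(4))
            if anti != expected:
                return False
    return True

def all_real():
    return all(complex(x).imag == 0 for g in GAMMA for row in g for x in row)

# Projector (1 - i gamma_5) / 2 applied to a real column M
PROJ = scale(0.5, add(identity(4), scale(-1j, GAMMA5)))

def project(column):
    return [sum(PROJ[i][k] * column[k] for k in range(4)) for i in range(4)]

print(clifford_ok(), all_real(), project([1, 2, 3, 4])[:2])
```

With this ordering of the Kronecker factors the first two projected components are $\tfrac{1}{2}(M_1 + iM_3)$ and $\tfrac{1}{2}(M_2 - iM_4)$, and the last two are multiples of them, which is the content of the hint to this exercise.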

Hints for Exercise 7.3

1. We recall that the Poincaré group is the group of isometries of Minkowski space. When space-time is not perfectly Minkowskian but de-Sitter or anti-de-Sitter instead, its curvature does not vanish but is constant. The isometry group, however, becomes a simple group: O(4, 1) for de-Sitter and O(3, 2) for anti-de-Sitter space. In the limit when the constant curvature of either of these two spaces goes to zero (i.e., the curvature radius becomes infinite), they tend to be Minkowskian and their isometry groups contract to the Poincaré group. This Inönü-Wigner contraction can be fully appreciated at the level of their Lie algebras: from (b) it follows that as λ → 0, $[\bar P, \bar P] \to 0$, which is precisely the requirement of a Poincaré algebra, namely, that the momenta (i.e., the P's) commute.

2. To show that $\chi^A\chi_A = -\chi_A\chi^A$ (note that $\chi_A$ and $\chi^A$ behave like anti-commuting elements of the familiar Fermi sector) we use (A.4) to write: $\chi^A\chi_A$

= XB^BA^CXC-

«

To use the summation rule given in (A.3) we change the order of indices in eAC to £CA, since it is an antisymmetric tensor we replace eAC by - eCA. This gives: XAXA

= XB(-£BA£CAXC) B

= X (-8£)XC = ~XAXA We thus have %A %A - %A XA = 2 X2Multiply both sides by em

or

=

-X XB

= X2-

ite XA = £AC Xc t o obtain:

XA£ACXc = X2(iii) and change the order of indices in eAC and e^ on LHS, this gives: XA(£CA

or

Wr

(«) B

£BA) XC = £AB X2

XA (~5B) XC

(iv)

= £AB X2

2XAXB = £ABX2=>XAXB

=

-^£ABX2-

The Lorentz invariance of the entities obtained in (ii), (iii) and (iv) is obvious. The other part of the exercise can be established either by using the relations (A.4)(b) or by observing the fact that

each of these can be viewed as coming from a spinor with components $\chi_A$ or $\chi^A$ through the conjugation process and using (7.3.21) together with the relations: $(\chi_A)^* = \bar\chi_{\dot A}$ and $(\chi^A)^* = -\bar\chi^{\dot A}$.

3. To verify (a) of (A.10) we choose m = n = 0, then from (A.12)(a) we have:

(*<>)£ = y (1-1-(-1)(-1)£

(i)

=0 which shows that the second term on the RHS of (A.10)(a) makes no contribution and thus equals the LHS which is 8AC. Next we choose m = 1 and n - 3, the first term on the RHS of (A.10)(a) is zero and for the second term, (A.12)(a) gives:

< >\* irr° ~i\ (° 'iT c -'\A WC-TRI oJi-i oJL-li ojc

(11)

This is the same as the value of the LHS. Similarly choosing the other combinations of m and n, the remaining expressions can also be verified. To verify (A.10)(c) we use (A.12)(c), it can be easily checked that the first term on the RHS of (A.10)(c) makes no contribution when n * m and the second term makes no contribution when n = m. For instance choose n = m = 0, then in view of (A.5) and (A.8) we have: (T 0 0 )^ = i ( ( l ) ( - l ) - (-1)(1))S = 0

(iii)

Therefore the RHS reduces to

H 00 ^ =(-!)( J °)

(iv)

The RHS, on the other hand, is (v)

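The verification for n ≠ m is left as an exercise.

4. We have:

$P_m\tau^m = P_0\tau^0 + P_1\tau^1 + P_2\tau^2 + P_3\tau^3 = P_0\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} + P_1\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + P_2\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + P_3\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} -P_0 + P_3 & P_1 - iP_2 \\ P_1 + iP_2 & -P_0 - P_3 \end{pmatrix}$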

5. We use (7.3.6) to write:

7777

U 1 o JW o J U 3 o Jl-ii o J


_/T'T2

0 V-T3

"tfl

TVJI 0 0

[0

-l)

r3)

0 ^ rVVj

/-TW

\

0^

6. We denote the Gamma matrices in three bases collectively as Tw, Tc and TM. Using the elementary rules of finding the similarity transformations, we obtain: w=

where Similarly

c

A= ^

J

Tw = BTM B~l 1

ll

1 f B = —==-

where

4l U

\ -ie)

The relation between Tc and TM is immediate. The determination of £ in B is left as an exercise.19 The second part of the exercise is too obvious for any proof. 7. Use the results of Exercise (7.2.1) to verify that y5 = - i x a ® T3. Note that the components of a Majorana spinor are all real. We put

M2

[MJ premultiplying it by —(1 - i y5) we have after simplification: fWA ( W2 _ i W3

yjA) 19

~ ~2

M{ + iM3 M2-i M4

\

/(Mj+iAfj)

{-i{M2-iMA),

' See the Ftn.47 in the Appendix for a hint.


Evidently only two of the components $W_i$ are independent, say $W_1$ and $W_2$. Thus beginning with a 4-component spinor M, we have obtained a two-component spinor

$W = \dfrac{1}{2}\begin{pmatrix} M_1 + iM_3 \\ M_2 - iM_4 \end{pmatrix}$

We denote the components of W as $W_A$ where A = 1 or 2. Using the complementary projection operator $\tfrac{1}{2}(1 + i\gamma_5)$, it is easy to check that the resulting 2-component spinor is the conjugate of $W_A$, thus $(W_A)^* = \bar W_{\dot A}$.

4 SUPERSYMMETRY ALGEBRAS AND INTRODUCTION TO SUPERSPACES

From Chapter 6 we are already familiar with the role of Lie groups in defining the symmetries of a system. In fact we saw that associated with every local symmetry that we studied, there was an underlying Lie group. At that point we also mentioned in passing that there were many symmetries occurring in quantum field theory which could not be assigned a Lie group along those lines. In layman's language we shall call these symmetries supersymmetries, and the groups used to describe them supergroups. However, when symmetry began to be replaced by supersymmetry, objects such as vector spaces, vector and tensor fields, and differential forms had to be replaced by the analogues required for the description of supersymmetry. It then became essential to give a formal definition of the manifolds on which these super objects live. It is therefore not surprising that supermanifolds followed superalgebras and supergroups; but once they were defined, the super Lie algebras and super Lie groups could be obtained from them following the familiar rules for obtaining Lie algebras and Lie groups from ordinary differentiable manifolds (see Chapter 3 in [4]). In order to define supermanifolds, we need to introduce a few objects required for their definition; these are supernumbers and their splitting, superanalytic functions, real (complex) supernumbers, and supervector spaces.

4.1 Supernumbers

We begin with a set of generators $\zeta^a$, a = 1, ..., N, which form a Grassmann algebra (i.e., $\zeta^a\zeta^b + \zeta^b\zeta^a = 0$ for all a, b) denoted $\Lambda_N$. When N → ∞, this is denoted $\Lambda_\infty$. The elements $1, \zeta^a, \zeta^a\zeta^b, \ldots$ where a ≠ b form a $2^N$ (infinite)-dimensional basis of $\Lambda_N$ ($\Lambda_\infty$). Moreover $\Lambda_N$ ($\Lambda_\infty$) forms a $2^N$ (∞)-dimensional linear vector space under addition as well as multiplication by a complex number. We note that as algebras over the complex numbers, both of these are associative but not commutative, except in the trivial cases N = 0, 1. The elements of $\Lambda_\infty$ are called supernumbers. An arbitrary element $z \in \Lambda_\infty$ can be written as:

$z = \lambda + \sum_{n=1}^{\infty} \frac{1}{n!}\, c_{a_1 a_2 \cdots a_n}\, \zeta^{a_n} \cdots \zeta^{a_1}$   (7.4.1)

where λ as well as all the c's, which are antisymmetric in their indices, are complex numbers. The first term is called the body of the supernumber and the second the soul; we shall denote them as $z_B$ and $z_S$, thus:

$z = z_B + z_S$   (7.4.2)


If $\Lambda_\infty$ is replaced by $\Lambda_N$ with finite N, then $z_S$ is nilpotent, i.e.,

$z_S^{\,N+1} = 0$   (7.4.3)
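A brief justification of (7.4.3) may be helpful; the following is a sketch of the standard counting argument, written with the index conventions of (7.4.1). Every monomial in the soul contains at least one generator, so every monomial in the (N + 1)-th power contains at least N + 1 generators drawn from only N distinct ones:

```latex
z_S = \sum_{n=1}^{N} \frac{1}{n!}\, c_{a_1 \cdots a_n}\, \zeta^{a_n} \cdots \zeta^{a_1}
\quad\Longrightarrow\quad
z_S^{\,N+1} = \sum_{k \ge N+1} (\text{coefficients}) \times \zeta^{b_1} \zeta^{b_2} \cdots \zeta^{b_k},
\qquad b_i \in \{1, \dots, N\}.
% With k > N at least two labels coincide, say b_i = b_j. Anticommuting the two
% equal generators next to each other changes the monomial at most by a sign, and
% \zeta^{b}\zeta^{b} = -\zeta^{b}\zeta^{b} = 0, so every term vanishes:
z_S^{\,N+1} = 0 .
```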

A supernumber has an inverse if and only if its body is nonvanishing. The inverse, which is unique, is given by the formula

$z^{-1} = z_B^{-1} \sum_{n=0}^{\infty} \left(-z_B^{-1} z_S\right)^n$   (7.4.4)
To extend any analytic function f on the complex numbers to a supernumber-valued function on $\Lambda_\infty$, we use series such as:

$f(z) = \sum_{n=0}^{\infty} \frac{1}{n!}\, f^{(n)}(z_B)\, z_S^{\,n}$   (7.4.5)

where $f^{(n)}(z_B)$ denotes the n-th derivative of f at the point $z_B$ in the complex plane, and this definition is valid for all $z_B$ as long as they are not singular points of f. We note that the Taylor series (7.4.5) terminates when N is finite. When N is infinite both (7.4.4) and (7.4.5) stand for formal infinite series. The coefficient of each term in these series is finite and unique.

Similar to supernumbers, a matrix M whose elements are supernumbers has a body and a soul. The body is the ordinary matrix obtained by replacing each element with its body, and the soul is the remainder. A square matrix has an inverse and is said to be nonsingular if and only if its body is nonsingular. The inverse is unique.

Definition 7.4.1: Let the supernumber z defined in (7.4.1)-(7.4.2) be written as:

$z = u + v, \qquad u = z_B + \sum_{n=1}^{\infty} \frac{1}{(2n)!}\, c_{a_1 \cdots a_{2n}}\, \zeta^{a_{2n}} \cdots \zeta^{a_1}, \qquad v = \sum_{n=0}^{\infty} \frac{1}{(2n+1)!}\, c_{a_1 \cdots a_{2n+1}}\, \zeta^{a_{2n+1}} \cdots \zeta^{a_1}$   (7.4.6)

Here u is the even part of z and v is the odd part. The supernumbers that are purely even or purely odd are referred to respectively as c-numbers and a-numbers.

Remark 7.4.2: We list below some of the properties of c- and a-numbers:
(i) The c-numbers commute with every supernumber, whether it is a c or an a (or a sum of c and a), whereas a-numbers anticommute among themselves.
(ii) The product of two c-numbers or two a-numbers is a c-number. The product of a c-number and an a-number is an a-number. The square of every a-number is zero.
(iii) Since a-numbers possess no body they are not invertible. The set of all a-numbers is denoted as $C_a$; this set is not a subalgebra of $\Lambda_\infty$.
(iv) The set of all c-numbers forms a commutative subalgebra of $\Lambda_\infty$, and is denoted as $C_c$.
(v) When $\Lambda_\infty$ is replaced by $\Lambda_N$, both $C_c$ and $C_a$ are $2^{N-1}$-dimensional vector spaces.
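Several items of Remark 7.4.2 can be confirmed on small examples. The sketch below assumes the same illustrative dictionary representation of supernumbers (sorted tuples of generator labels mapped to coefficients) and a Grassmann algebra with N = 3; all names and sample values are hypothetical.

```python
from itertools import combinations

# Sketch: checking properties (i), (ii) and (v) of c- and a-numbers for N = 3.

def sort_sign(indices):
    idx, sign = list(indices), 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def mul(x, y):
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            if set(a) & set(b):
                continue                      # zeta^a zeta^a = 0
            sign, key = sort_sign(a + b)
            out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def neg(x):
    return {k: -v for k, v in x.items()}

def even_part(z):
    return {k: v for k, v in z.items() if len(k) % 2 == 0}

def odd_part(z):
    return {k: v for k, v in z.items() if len(k) % 2 == 1}

def basis(n_generators):
    """All ordered monomials, stored as sorted label tuples (2^N of them)."""
    labels = range(1, n_generators + 1)
    return [c for r in range(n_generators + 1) for c in combinations(labels, r)]

N = 3
even_basis = [b for b in basis(N) if len(b) % 2 == 0]
odd_basis = [b for b in basis(N) if len(b) % 2 == 1]

u = {(): 2, (1, 2): 5}                 # a c-number
v = {(1,): 1, (1, 2, 3): 2}            # an a-number
w = {(2,): 3, (3,): -1}                # another a-number

print(len(even_basis), len(odd_basis))      # both equal 2**(N - 1)
print(mul(v, w), mul(w, v))                 # opposite signs: a-numbers anticommute
print(mul(v, v))                            # empty dict: the square vanishes
print(mul(u, w) == mul(w, u))               # a c-number commutes with everything
```

The splitting of a general supernumber into u and v in (7.4.6) corresponds to `even_part` and `odd_part` in this representation.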

4.2 Superanalytic Functions

Remark 7.4.3: Similar to ordinary analysis, an analytic theory of functions of c-numbers and a-numbers can be formulated by considering mappings from $C_c$ or $C_a$ to $\Lambda_\infty$. We illustrate it for $C_a$. For finite N, $C_a$ and $\Lambda_N$ are finite-dimensional vector spaces over the complex numbers, hence a differentiable mapping f from $C_a$ to $\Lambda_N$ can be defined. Thus if $v \in C_a$, f carries v to $\Lambda_N$, and then, using the formal limit N → ∞, the analyticity of f (now known as superanalyticity) can be defined as follows.

Definition 7.4.4: The differentiable mapping f is said to be superanalytic at $v \in C_a$ if, corresponding to an arbitrary a-number displacement dv of v, the image f(v) in $\Lambda_\infty$ suffers a displacement of the form:

$df(v) = dv \left[ \frac{\overrightarrow{d}}{dv} f(v) \right] = \left[ f(v) \frac{\overleftarrow{d}}{dv} \right] dv$   (7.4.7)

(7.4.9)

$\frac{\overrightarrow{d}}{du} f(u) = f(u) \frac{\overleftarrow{d}}{du}$   (7.4.10)

showing that in this case there is no need to distinguish between left and right derivatives. The class of superanalytic functions of c-number variables is much richer than the class of superanalytic functions of a-number variables. We list three important properties of f defined on $C_c$ in the following remark.

Remark 7.4.5:
(i) Corresponding to every ordinary analytic function f on the complex numbers, there is a superanalytic function over $C_c$ analogous to (7.4.5):

$f(u) \stackrel{\text{def}}{=} \sum_{n=0}^{\infty} \frac{1}{n!}\, f^{(n)}(u_B)\, u_S^{\,n}$   (7.4.11)

where $u_B$ and $u_S$ are respectively the body and soul of u.
(ii) If f is superanalytic on and inside a closed curve in the vector space $C_c$, then the curve may be continuously deformed to a point without crossing any singularity, and f satisfies:

$\oint f(u)\, du = 0.$   (7.4.12)

(iii) More generally, if f is superanalytic on the curve and superanalytic inside except at a finite number of poles, then

$\oint f(u)\, du = 2\pi i \times$ (sum of residues at poles)   (7.4.13)

Thus if f has the general form

$f(u) = \sum_{n=0}^{\infty} \frac{1}{n!}\, f_{a_1 \cdots a_n}(u)\, \zeta^{a_n} \cdots \zeta^{a_1}$   (7.4.14)

the residues may be arbitrary supernumbers.
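The extension (7.4.11) can be evaluated explicitly when N is finite, since the powers of the soul eventually vanish. The following sketch extends the exponential function to an even supernumber and checks that exp(u) exp(−u) = 1. It assumes the illustrative dictionary representation of supernumbers (sorted tuples of generator labels mapped to coefficients); `extend` and `exp_derivative` are hypothetical helper names.

```python
import math

# Sketch: Taylor extension (7.4.11) of an analytic function to a c-number u.

def sort_sign(indices):
    idx, sign = list(indices), 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def mul(x, y):
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            if set(a) & set(b):
                continue
            sign, key = sort_sign(a + b)
            out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def add(x, y):
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def scale(c, x):
    return {k: c * v for k, v in x.items()}

def extend(derivative, u):
    """Evaluate sum_n f^(n)(u_B) u_S^n / n!; derivative(n, x) returns f^(n)(x)."""
    body = u.get((), 0.0)
    soul = {k: v for k, v in u.items() if k != ()}
    total, power, n = {}, {(): 1.0}, 0
    while power:                          # u_S^n vanishes for large n
        total = add(total, scale(derivative(n, body) / math.factorial(n), power))
        power = mul(power, soul)
        n += 1
    return total

def exp_derivative(n, x):
    return math.exp(x)                    # every derivative of exp is exp

u = {(): 0.3, (1, 2): 1.0, (3, 4): 1.0}   # an even supernumber with N = 4
exp_u = extend(exp_derivative, u)
exp_minus_u = extend(exp_derivative, scale(-1, u))
print(mul(exp_u, exp_minus_u))
```

Only the derivatives at the body $u_B$ enter, so any function whose derivatives are known at that point can be extended the same way by supplying a different `derivative` callback.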

4.3 Real and Imaginary Supernumbers

To reach our objective of defining a supermanifold, we have to introduce the notions of real and imaginary amongst the supernumbers. The laws of complex conjugation (denoted *) of sums and products of two supernumbers are

$(z + z')^* = z^* + z'^*, \qquad (zz')^* = z'^*\, z^*, \qquad z, z' \in \Lambda_\infty.$   (7.4.15)

The complex conjugate of $z_B$ is taken to be its ordinary complex conjugate, and the generators of $\Lambda_\infty$ are assumed to be real, thus $\zeta^{a*} = \zeta^a$ for all a. This implies:

$(\zeta^{a_1} \cdots \zeta^{a_n})^* = \zeta^{a_n} \cdots \zeta^{a_1}$   (7.4.16)

From this together with the anticommutation rule $\zeta^a\zeta^b = -\zeta^b\zeta^a$, it follows that the basis element $\zeta^{a_1} \cdots \zeta^{a_n}$ is real when $\tfrac{1}{2}n(n-1)$ is even and is imaginary when $\tfrac{1}{2}n(n-1)$ is odd. As for ordinary numbers, a supernumber z is said to be real if $z^* = z$ and imaginary if $z^* = -z$. A general element of $\Lambda_\infty$ is real if and only if both its body and soul are real. The subsets of all real elements of $C_c$ and $C_a$ are denoted respectively as $R_c$ and $R_a$. The subset $R_c$ is a subalgebra of $C_c$. The product of two real c-numbers is a real c-number and the product of a real c-number and a real a-number is a real a-number, whereas the product of two real a-numbers is an imaginary c-number. The symbol x is generally used to denote a real variable whether it is over $R_a$ or $R_c$. The cartesian product $R_c \times R_c \times \cdots \times R_c$ of m factors $R_c$ is denoted $R_c^m$; similarly the cartesian product of n factors $R_a$ is denoted $R_a^n$.
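The reality rule for basis elements follows from counting transpositions: reversing a product of n anticommuting generators, as in (7.4.16), costs one sign per inversion, and a full reversal has n(n − 1)/2 of them. A minimal sketch (function names are illustrative) compares the sign obtained by explicit reordering with the closed formula:

```python
# Sketch: sign of (zeta^{a_1} ... zeta^{a_n})^* relative to zeta^{a_1} ... zeta^{a_n}.

def reversal_sign(n):
    """Reorder the reversed product n, ..., 2, 1 back to 1, 2, ..., n by adjacent swaps."""
    idx, sign = list(range(n, 0, -1)), 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign              # anticommuting generators: one sign per swap
    return sign

def is_real_monomial(n):
    """Closed form: the monomial is real when n(n - 1)/2 is even."""
    return (n * (n - 1) // 2) % 2 == 0

for n in range(9):
    kind = "real" if is_real_monomial(n) else "imaginary"
    print(n, reversal_sign(n), kind)
```

The case n = 2 is the statement in the text that the product of two real a-numbers is imaginary: $(\zeta^1\zeta^2)^* = \zeta^2\zeta^1 = -\zeta^1\zeta^2$.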

4.4 Supervector Spaces

As can be expected, the definition of a supervector space follows rules similar to those of a vector space; the difference appears once the supernumbers are introduced.

Definition 7.4.6: A supervector space is a set $\mathfrak{S}$ of elements called supervectors, which is equipped with mappings having the special properties listed below:

(i) There exists a binary-operation mapping $+ : \mathfrak{S} \times \mathfrak{S} \to \mathfrak{S}$ called the addition such that

$+(X, Y) = X + Y$ for all X, Y in $\mathfrak{S}$   (7.4.17)

The mapping + is commutative and associative.

(ii) There exists an element $0 \in \mathfrak{S}$ such that X + 0 = X for all X in $\mathfrak{S}$. The element 0 is called the zero supervector.

(iii) For every supervector X in $\mathfrak{S}$, there exists another supervector −X in $\mathfrak{S}$ such that −X + X = 0. It is easy to verify that 0 is unique. Given a supervector X, the supervector −X, known as the negative of X, is unique.

(iv) For every supernumber α there exist two mappings, $\alpha_L : \mathfrak{S} \to \mathfrak{S}$ called the left multiplication and $\alpha_R : \mathfrak{S} \to \mathfrak{S}$ the right multiplication, such that:

$\alpha_L X = \alpha X, \qquad \alpha_R X = X\alpha$ for all X in $\mathfrak{S}$   (7.4.18)

These mappings satisfy the linear laws:

(a) $(\alpha + \beta)X = \alpha X + \beta X; \qquad X(\alpha + \beta) = X\alpha + X\beta$
(b) $\alpha(X + Y) = \alpha X + \alpha Y; \qquad (X + Y)\alpha = X\alpha + Y\alpha$
(c) $(\alpha\beta)X = \alpha(\beta X) = \alpha\beta X; \qquad X(\alpha\beta) = (X\alpha)\beta = X\alpha\beta$
(d) $1X = X; \qquad X1 = X$   (7.4.19)

for all $\alpha, \beta \in \Lambda_\infty$ and $X, Y \in \mathfrak{S}$.

(v) Left and right multiplication are related by the following rule:

$(\alpha X)\beta = \alpha(X\beta) \stackrel{\text{def}}{=} \alpha X\beta$   (7.4.20)

for all α, β in $\Lambda_\infty$ and all X in $\mathfrak{S}$. If α is a c-number, then it commutes with every X in $\mathfrak{S}$. Thus for all α in $C_c$ and all X in $\mathfrak{S}$ we have:

$\alpha X = X\alpha$   (7.4.21)

For every X in $\mathfrak{S}$ there exist unique supervectors U and V in $\mathfrak{S}$ such that

(a) $X = U + V$
(b) $\alpha U = U\alpha, \quad \alpha V = -V\alpha$ for all α in $C_a$   (7.4.22)

The supervectors U and V are called the even and odd parts of X. If the odd (even) part of a supervector vanishes, the supervector is said to be of type c (type a). The zero supervector is the only supervector that is simultaneously c-type and a-type. A supervector as well as a supernumber that has a definite type is called pure. For pure supernumbers and supervectors, Eqs. (7.4.21)-(7.4.22) can be summarized in the formula:

$\alpha X = (-1)^{\alpha X} X\alpha$   (7.4.23)

where it is assumed that each symbol in the exponent of (−1) takes the value 0 or 1 according as the corresponding quantity is c-type or a-type.

(vi) There exists a mapping $* : \mathfrak{S} \to \mathfrak{S}$ called the complex conjugation and conventionally written as:

$*(X) \stackrel{\text{def}}{=} X^*$ for all X in $\mathfrak{S}$   (7.4.24)

The mapping * satisfies the usual properties of conjugation such as $X^{**} = X$; $(X + Y)^* = X^* + Y^*$, in addition to the property:

$(\alpha X)^* = X^*\alpha^*, \qquad (X\alpha)^* = \alpha^* X^*$   (7.4.25)

for all X, Y in $\mathfrak{S}$ and all α in $\Lambda_\infty$. A linearly independent set $\{{}_i e\}$ is called a complete linearly independent set or a basis if every supervector X in $\mathfrak{S}$ can be expressed in the form:

$X = X^i\, {}_i e$ for some $X^i$ in $\Lambda_\infty$   (7.4.26)

If a supervector space has a basis consisting of d supervectors, it is said to have total dimension d. The total dimension, which may be finite or infinite, is an invariant of the supervector space. We shall generally be concerned with the supervector space $R_c^m \times R_a^n$ formed by the cartesian product of m copies of $R_c$ and n copies of $R_a$. When N is finite, the total dimension d of $R_c^m \times R_a^n$ is $2^{N-1}(m + n)$, while the pair (m, n) is called its dimension. A general point of this space is denoted x with coordinates $x^i$, where i ranges over the set (−n, ..., −1, 1, ..., m) or sometimes over the set (−n, ..., −1, 0, 1, ..., m − 1), with the negative values distinguishing the a-number coordinates from the c-number coordinates. However, when this formalism is used in physics, instead of distinguishing the c-numbers and a-numbers by positive and negative indices, we use the coordinates $x^\mu, x^\nu, x^\sigma$ to denote the c-type and $\theta^\alpha, \theta^\beta, \theta^\gamma$ (or even $\theta^i, \theta^j, \theta^k$) to denote the a-type.

4.5 Supermanifolds; Charts and Atlases

Before we formally define a supermanifold, we would like to mention that the objects called supermanifolds are related to $R_c^m \times R_a^n$ in the same manner as ordinary manifolds are to $R^m$; that is, small regions of a supermanifold look like small regions of $R_c^m \times R_a^n$. They are said to have the same local topology. The topology we shall use here reflects the algebraic structure of $R_c^m \times R_a^n$ irrespective of whether N is finite or infinite. This requires that the topology in question have the property prescribed as follows.

Definition 7.4.7: Let $\pi : R_c^m \times R_a^n \to R^m$ ²¹ be the mapping that replaces each coordinate $x^i$ of the point x by its body. A subset of $R_c^m \times R_a^n$ is said to be open if and only if it has the form $\pi^{-1}(\Omega)$ where Ω is some open subset of $R^m$.

We note that $R_c^m \times R_a^n$ is not Hausdorff with this topology; however, if x and x′ are two points of $R_c^m \times R_a^n$ lying in two distinct soul subspaces (i.e., if $\pi(x) \neq \pi(x')$), then they may be surrounded by nonintersecting neighbourhoods. Such a space is called projectively Hausdorff, and the topology here is called a coarse topology.

Definition 7.4.8: Let φ be a mapping from an open subset $\mathcal{U}$ (as defined above in the sense of the coarse topology) of $R_c^m \times R_a^n$ to an open subset $\bar{\mathcal{U}}$ of $R_c^m \times R_a^n$. The mapping is said to be differentiable if the coordinates $\bar x^j$ (j = 1, ..., m, −1, ..., −n) of the image point

21 Note that $R^m$ here defines the set of points whose coordinates have vanishing souls; π may be called the natural projection of $R_c^m \times R_a^n$ onto $R^m$.


in $R_c^m \times R_a^n$ (in the coarse sense). The collection of ordered pairs is required to have the following two properties:

(i) $\bigcup_A \mathcal{U}_A = M$;
(ii) $\phi_A \circ$
defined by $\phi_A$. The pair $(\mathcal{U}_A, \phi_A)$ is called a chart, or a local coordinate patch, or a coordinate system. Property (ii) implies that every pair of overlapping coordinate systems is related by a differentiable transformation. A collection of charts $(\mathcal{U}_A, \phi_A)$ satisfying the two properties is called an atlas. Just as in the case of ordinary differentiable manifolds, we call the union of all atlases compatible with each other (two atlases are said to be compatible if their union is again an atlas) the complete atlas of the manifold. In fact it is the set of all possible coordinate systems on M. With the definition of a supermanifold in place, objects such as supercurves, sub-supermanifolds, scalar fields, etc., can be defined. We shall defer these definitions to the sections where they will be required (see [4] for details).

4.6 Supersymmetry Generators and Construction of Superalgebras from First Principles

From our studies of Lie groups and Lie algebras, we know that every Lie group defines a unique Lie algebra; conversely, given a Lie algebra with its generators, there exists a Lie group whose structure can be determined by these generators. We shall use this approach to define the supergroup resulting from an underlying algebra. Thus we first select the generators, the brackets required to mix them and the rules that these mixings must satisfy for compatibility. In Sec. 3, we already shed some light on these aspects, showing how these selections are limited to particular groups. For instance, since most symmetries in nature are given by the Poincaré group, one set of generators (which are ten in number) consists of $(P_\mu)$ and $(M_{\mu\nu})$ (see Exc. (4.1.7)); the other set, denoted $(T_a)$, arises from the internal symmetry group G. The generators of these sets are labeled as even elements of a Z2-graded algebra. The third set containing the odd elements $Q = (Q^i_\alpha)$ is not based on any known group to begin with, but as we mentioned in Sec. 3, it comes from the Fermi sector. The index i in $Q^i_\alpha$ stands for the supersymmetry number 1, 2, 3, ..., N and α represents the 4-component spinorial index.²² The generators $Q^i_\alpha$ satisfy the anticommutation relation:

$\{Q^i_\alpha, Q^j_\beta\} = Q^i_\alpha Q^j_\beta + Q^j_\beta Q^i_\alpha$ = linear combination of some other generators that must be even   (7.4.27)

and are called the supercharges. The first two sets, on the other hand, satisfy the familiar relations:²³

(a) $[P_\mu, P_\nu] = 0$
(b) $[P_\lambda, M_{\mu\nu}] = \eta_{\lambda\mu} P_\nu - \eta_{\lambda\nu} P_\mu$
(c) $[M_{\lambda\mu}, M_{\nu\delta}] = -(\eta_{\lambda\nu} M_{\mu\delta} + \eta_{\mu\delta} M_{\lambda\nu} - \eta_{\lambda\delta} M_{\mu\nu} - \eta_{\mu\nu} M_{\lambda\delta})$   (7.4.28)

$[T_a, T_b] = C_{ab}{}^{c}\, T_c$   (7.4.29)

($C_{ab}{}^{c}$ being the structure constants of the group) and

$[P_\mu, T_a] = [M_{\mu\nu}, T_a] = 0$   (7.4.30)

22 When it is a 2-component spinor the indices used are A and $\dot A$ in place of α, β, etc.
23 $\eta_{\mu\nu}$ stands for the Minkowski space-time metric (−1, 1, 1, 1).

The last of these relations, as we have seen in Sec. 3, is a consequence of the Coleman-Mandula (no-go) theorem [3], which states that a symmetry group that incorporates both these symmetries is a direct product of the corresponding groups and as such permits no non-trivial mixing amongst generators. In order to find the mixing of these three sets, we first note that, like the generators of ordinary Lie algebras, they satisfy the Jacobi identity (see (7.1.8)) based on the rules given in (7.3.3):

(i) $[[E_1, E_2], E_3] + [[E_3, E_1], E_2] + [[E_2, E_3], E_1] = 0$
(ii) $[[E_1, E_2], O_3] + [[O_3, E_1], E_2] + [[E_2, O_3], E_1] = 0$
(iii) $\{[E_1, O_2], O_3\} + [\{O_2, O_3\}, E_1] + \{[E_1, O_3], O_2\} = 0$
(iv) $[\{O_1, O_2\}, O_3] + [\{O_3, O_1\}, O_2] + [\{O_2, O_3\}, O_1] = 0$   (7.4.31)

where $E_i$ and $O_i$ (i = 1, 2, 3) stand for even and odd elements of the algebra. Since [even, odd] = odd, we further note that the Lie bracket $[Q^i_\alpha, M_{\mu\nu}]$ can be expressed only in terms of $Q^i_\alpha$, as these alone are the odd generators. Thus the α index of the element $Q^i_\alpha$ can be viewed as rotated by $M_{\mu\nu}$, giving

$[Q^i_\alpha, M_{\mu\nu}] = (K_{\mu\nu})_\alpha{}^\beta\, Q^i_\beta$   (7.4.32)

where the entities $(K_{\mu\nu})_\alpha{}^\beta$ satisfy:²⁴

$[K_{\lambda\mu}, K_{\nu\delta}]_\alpha{}^\beta = -\eta_{\lambda\nu}(K_{\mu\delta})_\alpha{}^\beta - \eta_{\mu\delta}(K_{\lambda\nu})_\alpha{}^\beta + \eta_{\lambda\delta}(K_{\mu\nu})_\alpha{}^\beta + \eta_{\mu\nu}(K_{\lambda\delta})_\alpha{}^\beta$   (7.4.33)

Equality (7.4.33) implies that the $(K_{\mu\nu})_\alpha{}^\beta$ form a representation of the Lorentz algebra; this in turn means that the $Q^i_\alpha$ carry a representation of the Lorentz group. Thus if we choose $Q^i_\alpha$ to be in the $(0, \tfrac{1}{2}) \oplus (\tfrac{1}{2}, 0)$ representation of the Lorentz group we have:

$[Q^i_\alpha, M_{\mu\nu}] = \tfrac{1}{2}(\sigma_{\mu\nu})_\alpha{}^\beta\, Q^i_\beta$   (7.4.34)

Similarly using the (7.4.31) (ii) and noticing that [ 0 ^ , £,] = linear sum of Q«

(7.4.35)

and that 8p and (Ys)p are the only invariant tensors which are scalar and pseudo scalar, we obtain the relation: [QL Ta] = (ijj QJa + K ) j (i Ysifi QJp

(7-4-36)

where (/a)j + i y5 (ma)j represent the Lie algebra of the internal symmetry group (the indices i, j refer to 24

The equality can be checked using (7.4.27), (7.4.28) (c) and (7.4.31) (ii).

All that is Super—an Introduction 309

supersymmetry and a refers to the internal symmetry group). The mixing of Qawith P., (using the Jacobi identity and (7.4.35)) simplifies to: [Qi,J>lJ = 0 (7.4.37) J Thus we are finally left with the bracket {Qa, Qp }- In view of (7.4.27) the most general expression that we can write is: {QL Qft = K y " Qap />„ SiJ + S ( T A % MXii 8ij + Caj} Uij + (y.Q^ ViJ (7.4.38) where r and s are natural numbers which vary with the choice of the supersymmetry number N and Cap - ~ Cap i s m e charge conjugation matrix (see Sec. 3). The even generators If*, V'-*, (f/1-7 = - Wl and V1-7' = -VJl) are called the central charges, (and are appropriately denoted as Z). Obviously they satisfy the property: [Uij, any generator] = 0 = [Vij, any generator] (7.4.39) From (7.4.39) it is evident that they belong to the centre of this algebra and they are non-zero only for N > 2. Now the identity [P^, {Qa, Q^}] + ••• = 0 (see (7.4.31) (hi)) implies that s = 0, also the generator P can be rescaled to set r = 2, accordingly (7.4.38) becomes: {Qa, G/}= 2 % Qap 5ij P» + Cap Uij+ (y5 Qaf} ViJ (7.4.40) The equations (7.4.28), (7.4.30), (7.4.34), (7.4.36), (7.4.37) and (7.4.40) represent the supersymmetry algebra25 we were looking for. We want to emphasize here that we have obtained this algebra under the assumption that Qla are spinors under the Lorentz group (more precisely they are in the spin 1/2 representation of Lorentz group). If we choose N = 1 the supercharge Qlais simply Q^ the supersymmetry algebra then consists of relations (7.4.28) which come from the generators of the Poincare group, and the ones that involve Qa. (i)

$\{Q_\alpha, Q_\beta\} = 2\,(\gamma^\mu C)_{\alpha\beta}\, P_\mu$

(ii) $[Q_\alpha, P_\mu] = 0$

(iii) $[Q_\alpha, M_{\mu\nu}] = \tfrac{1}{2}\,(\sigma_{\mu\nu})_\alpha{}^\beta\, Q_\beta$

(iv) $[Q_\alpha, R] = i\,(\gamma_5)_\alpha{}^\beta\, Q_\beta$    (7.4.41)

The generator R in (iv) represents the internal symmetry group algebra, showing that it is just a chiral rotation. Note that the central charges $U^{ij}$ and $V^{ij}$ are zero in this case since N = 1 (for derivations of these equations see Chap. 2 in [21]).²⁶ Note that throughout these derivations we have regarded $Q_\alpha$ as an arbitrary spinor. If we were to choose it as a Majorana spinor, i.e. as $Q_\alpha = C_{\alpha\beta}\bar{Q}^\beta$, the equations involving the supercharges $Q^i_\alpha$ could be simplified. Also observe that if we assume that the algebra under construction admits 'a complex conjugation as an involution', then it can be verified that $Q_\alpha$ satisfies the Majorana condition as a consequence of it (see [21]).

²⁵ Note that this is also referred to as N-extended supersymmetry in the literature.

²⁶ See Remark (7.5.2) on N = 1 supersymmetry in Sec. 7.5.


Having discussed the construction of the supersymmetry algebras in full generality, it is not hard to see that the super-Poincaré algebra is obtained by considering only the relations (7.4.28) and the first three relations of (7.4.41). We now list a few facts about the supersymmetry algebra and the generators that we have discussed above.

Fact 7.4.10: Supersymmetry is a symmetry which mixes particles of different spin (i.e., fermions with bosons). This is because $Q_\alpha$ is in the spin-1/2 representation of the Lorentz group; thus its action on a state of spin j results in a state of spin $j \pm \tfrac{1}{2}$.

Fact 7.4.11: The relation $[P_\mu, Q_\alpha] = 0$ implies that

$$[P^2, Q_\alpha] = 0 \tag{7.4.42}$$

showing that $P^2$ is a Casimir operator of the supersymmetry algebra (see Chapters 4, 5 and 6 for more on Casimir operators). This means that particles in any irreducible representation of supersymmetry will have the same mass.²⁷

Fact 7.4.12: The energy $P_0$ in supersymmetric theories is always non-negative.

Fact 7.4.13: In any representation of supersymmetry where $P_\mu$ is a one-one operator, there are equal numbers of fermion and boson degrees of freedom.
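Fact 7.4.12 can be made plausible by a short computation. The sketch below is only an illustration under stated assumptions: it uses two-component (Weyl) supercharges rather than the four-component form of (7.4.41), and a convention in which $\sigma^0$ is the 2 × 2 identity and $P_0$ is the energy. Neither choice is taken from the text above.

```latex
% Assumed two-component form of the N = 1 algebra (sigma^0 = identity):
\{Q_\alpha, Q_\beta^{\dagger}\} = 2\,(\sigma^\mu)_{\alpha\beta}\, P_\mu .

% Set beta = alpha and sum; the Pauli matrices sigma^i are traceless:
\sum_{\alpha=1}^{2} \{Q_\alpha, Q_\alpha^{\dagger}\}
   = 2\,\mathrm{tr}(\sigma^\mu)\, P_\mu = 4\, P_0 .

% Expectation value in an arbitrary state |psi>:
\langle\psi|\, P_0 \,|\psi\rangle
   = \tfrac{1}{4} \sum_{\alpha=1}^{2}
     \left( \big\| Q_\alpha^{\dagger} |\psi\rangle \big\|^2
          + \big\| Q_\alpha |\psi\rangle \big\|^2 \right) \;\ge\; 0 .
```

In this convention the bound is saturated only by states annihilated by all the supercharges.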

4.7 Supersymmetry Transformations on a Superspace

Having understood the basics of supermanifolds and supersymmetry algebras, we shall use these concepts to describe in brief supersymmetry transformations on a superspace by choosing N = 1. This choice, though made for ease of introduction, is of great importance: it is known that if one is to deal with renormalizable Yang-Mills interactions, with the further assumption that there are left-handed fermions in the underlying gauge group that commute with the supersymmetry generators, then one is left with the sole choice of N = 1. Moreover, it is in N = 1 supersymmetry that the linearized supergravity theory can be constructed.

Example 7.4.14: We use Def. (7.4.9) of a supermanifold M whose dimension (m, n) is (4, 4). In view of this definition, the local regions of M can be mapped to open subsets of $\mathbb{R}^{m} \times \mathbb{R}^{n}$, whose total dimension (using the formula $2^{N-1}(m + n)$ for N = 1) is 8. Thus, as explained in Subsec. (4.5), M carries the coordinate patches $(x^\mu, \theta^\alpha)$ which satisfy*:

$$x^\mu x^\nu - x^\nu x^\mu = 0, \qquad x^\mu \theta^\alpha - \theta^\alpha x^\mu = 0, \qquad \theta^\alpha \theta^\beta + \theta^\beta \theta^\alpha = 0 \tag{7.4.43}$$

To describe the supersymmetry transformations, we first note that $\theta^\alpha$ are Majorana spinors†, i.e.,

$$\bar\theta_\alpha = \theta^\beta C_{\beta\alpha} \tag{7.4.44}$$

²⁷ This fact about mass does not always hold. (See Chapter 4 in [21] for Facts (7.4.10)-(7.4.13).)

* Note that the equations in (7.4.43) are obvious in view of (1) in Remark (7.4.2).

† That $\theta^\alpha$ are Majorana follows from Subsections (2.3), (4.4) and (4.5).

Remark 7.4.15: We note that with the assumption (7.4.44) these coordinates can also be viewed as coming from the spinor bundle (see Def. (7.2.5)) formed over a 4-dimensional manifold (see Eq. (7.2.14)). Recall that we noted there that in the case of even n, the spinors in question were Majorana-Weyl spinors (see Subsec. 2.3). (Using the physicist's terminology we shall refer to this supermanifold M, parametrized by coordinates $(x^\mu, \theta^\alpha)$ that satisfy (7.4.43)-(7.4.44), as a superspace.)

Basically, physicists consider a superspace as an extension of ordinary space-time to a space-time with spin degrees of freedom. Here we have chosen 4 degrees of freedom as it is the N = 1 supersymmetry case. In the group theoretic version (which we shall use in the next section), a superspace can also be viewed as the quotient space G/L, where G is the group²⁸ resulting from the 14-dimensional graded (super) Poincaré algebra and L is the homogeneous Lorentz group. For now we simply note that supersymmetry transformations are realized as motions in a superspace. Using a triplet of infinitesimal supertranslation parameters $(a^\mu, \epsilon^\alpha, \bar\epsilon^{\dot\alpha})$ we thus write the translational variations for $(x^\mu, \theta^\alpha)$ as:

$$\delta x^\mu = -i\,\epsilon\tau^\mu\bar\theta + i\,\theta\tau^\mu\bar\epsilon + a^\mu, \qquad \delta\theta^\alpha = \epsilon^\alpha, \qquad \delta\bar\theta^{\dot\alpha} = \bar\epsilon^{\dot\alpha} \tag{7.4.45}$$

From the above equalities it can be checked that the composition of two supertranslations with parameters $(a_1^\mu, \epsilon_1^\alpha, \bar\epsilon_1^{\dot\alpha})$ and $(a_2^\mu, \epsilon_2^\alpha, \bar\epsilon_2^{\dot\alpha})$ gives:

$$[\delta_2, \delta_1]\, z^M = (-2i\,\epsilon_1\tau^\mu\bar\epsilon_2 + 2i\,\epsilon_2\tau^\mu\bar\epsilon_1,\; 0,\; 0) \tag{7.4.46}$$

where $z^M = (x^\mu, \theta^\alpha, \bar\theta^{\dot\alpha})$. Likewise the Lorentz transformations and translations are given by:

$$\theta'^{\alpha} = \theta^{\alpha} - \omega_{\mu\nu}\,(\sigma^{\mu\nu})^{\alpha}{}_{\beta}\,\theta^{\beta} \tag{7.4.47}$$

where $\omega_{\mu\nu}$ stand for the parameters of the Lorentz transformation. In the next section we shall see how the notion of superspace is used via supersymmetry tensor calculus to define component fields and superfields and interactions amongst them.

²⁸ Note that G is the group to which we referred in the introduction of this section.

Exercise 7.4

1. Let $(B, B^\dagger)$ and $(F, F^\dagger)$ denote two pairs of creation and annihilation operators (with B being bosonic and F being fermionic) that satisfy the relations:

(i) $[B, B^\dagger] = \{F, F^\dagger\} = 1$.

Write the Hamiltonian:

(ii) $H = \omega_b B^\dagger B + \omega_f F^\dagger F$    $(\omega_b, \omega_f \in \mathbb{C} \setminus 0)$

and define an operator

(iii) $Q = F^\dagger B + B^\dagger F$.

Show that Q is fermionic, that it takes bosons to fermions and vice versa, and that it satisfies:

(iv) $[Q, H] = (\omega_b - \omega_f)(F^\dagger B - B^\dagger F)$.

Show further that if $\omega_b = \omega_f$ then H is supersymmetric, and that in this case Q, $Q^\dagger$ and H form an algebra which closes under anti-commutation.

2. Verify (7.4.46).
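The relations of Exercise 7.4.1 can be checked numerically. The sketch below is an illustration, not part of the original text: it represents B on a Fock space truncated at an assumed cutoff `n_max`, takes F as a 2 × 2 matrix, and combines them with Kronecker products, so that B and F commute. The function name `susy_oscillator` and the sample frequencies are assumptions. With Q as defined in (iii), the matrices give $[Q, H] = (\omega_b - \omega_f)(F^\dagger B - B^\dagger F)$, which vanishes for equal frequencies. Because the truncation spoils $[B, B^\dagger] = 1$ on the highest boson level, the check of $Q^2 = H/\omega$ is restricted to states below the cutoff.

```python
import numpy as np

def susy_oscillator(n_max, w_b, w_f):
    """Matrices for B, F, H and Q on a truncated boson (x) fermion Fock space.

    n_max is an assumed cutoff on the boson number; basis index = 2*n_b + n_f.
    """
    B = np.diag(np.sqrt(np.arange(1.0, n_max + 1)), k=1)   # B|n> = sqrt(n)|n-1>
    F = np.array([[0.0, 1.0], [0.0, 0.0]])                 # F|1> = |0>
    b = np.kron(B, np.eye(2))
    f = np.kron(np.eye(n_max + 1), F)
    H = w_b * b.T @ b + w_f * f.T @ f      # real matrices: transpose = adjoint
    Q = f.T @ b + b.T @ f                  # Q = F^dagger B + B^dagger F
    return b, f, H, Q

def comm(X, Y):
    """Ordinary commutator XY - YX."""
    return X @ Y - Y @ X

n_max = 6

# Unequal frequencies: [Q, H] is proportional to (w_b - w_f).
b, f, H, Q = susy_oscillator(n_max, w_b=1.5, w_f=0.5)
commutator_matches = np.allclose(comm(Q, H), (1.5 - 0.5) * (f.T @ b - b.T @ f))

# Equal frequencies: Q commutes with H, and Q^2 = H / w below the cutoff.
w = 2.0
b, f, H, Q = susy_oscillator(n_max, w_b=w, w_f=w)
q_commutes = np.allclose(comm(Q, H), 0.0)
keep = 2 * n_max                           # drop the two states with n_b = n_max
q_squared_ok = np.allclose((Q @ Q)[:keep, :keep], H[:keep, :keep] / w)

print(commutator_matches, q_commutes, q_squared_ok)
```

Since Q is Hermitian here, $Q^2 = H/\omega$ is the same statement as $\{Q, Q^\dagger\} = (2/\omega) H$. With equal frequencies the spectrum of H has a single state at zero energy and one boson-fermion pair at every positive energy, in line with Fact 7.4.13.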

Hints to Exercise 7.4

1. Recall that B and $B^\dagger$ are even whereas F and $F^\dagger$ are odd; hence the products $F^\dagger B$ and $B^\dagger F$ are odd, which shows that Q is fermionic. To verify that it takes bosons to fermions and vice versa we first compute the brackets:

(a) $[Q, B^\dagger]$ and $\{Q, F^\dagger\}$.

Thus

(b) $[Q, B^\dagger] = (F^\dagger B + B^\dagger F) B^\dagger - B^\dagger (F^\dagger B + B^\dagger F)$.

We note that these pairs, in addition to relation (i), also satisfy $[B, F^\dagger] = [B^\dagger, F] = 0$. Hence we have:

$[Q, B^\dagger] = F^\dagger (1 + B^\dagger B) + B^\dagger B^\dagger F - F^\dagger B^\dagger B - B^\dagger B^\dagger F$, i.e. RHS $= F^\dagger$.

Similarly we can show that $\{Q, F^\dagger\} = B^\dagger$. Thus if $B^\dagger|0\rangle$ and $F^\dagger|0\rangle$ represent bosonic and fermionic states respectively, then Q takes bosons to fermions and vice versa. This establishes the above statement.

Now to compute [Q, H], we first note that since H is even, the bracket has to be a Lie bracket. Writing H in full and using $[Q, B^\dagger B] = F^\dagger B - B^\dagger F = -[Q, F^\dagger F]$, we have:

$[Q, \omega_b B^\dagger B] + [Q, \omega_f F^\dagger F] = (\omega_b - \omega_f)(F^\dagger B - B^\dagger F)$.

Thus if $\omega_b = \omega_f$, the relation [Q, H] = 0 shows that H is supersymmetric. Again writing $\omega_b = \omega_f = \omega$ and computing $\{Q, Q^\dagger\}$, we have:

(c) $\{Q, Q^\dagger\} = \dfrac{2}{\omega} H$.

From (c) it is clear that Q, $Q^\dagger$, H form a closed algebra under anti-commutation. We further note a distinguishing feature of this symmetry: the charge bracket here involves the Hamiltonian H, unlike the charge commutators of the bosonic case, which confirms that Q, $Q^\dagger$, H are generators of a symmetry algebra.

2. Using the triplets of supertranslation parameters we have:

(i)

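$\delta_b x^\mu = -i\,\epsilon_b\tau^\mu\bar\theta + i\,\theta\tau^\mu\bar\epsilon_b + a_b^\mu$

(ii) $\delta_b\theta^\alpha = \epsilon_b^\alpha$, $\quad\delta_b\bar\theta^{\dot\alpha} = \bar\epsilon_b^{\dot\alpha}$    (b = 1, 2).

Obviously $[\delta_2, \delta_1]\theta^\alpha$ and $[\delta_2, \delta_1]\bar\theta^{\dot\alpha}$ are both zero since $\epsilon^\alpha$, $\bar\epsilon^{\dot\alpha}$ are constants and hence zero under $\delta$. In the case of $x^\mu$ we can write it as:

$(\delta_2\delta_1 - \delta_1\delta_2)\, x^\mu = \delta_2(-i\,\epsilon_1\tau^\mu\bar\theta + i\,\theta\tau^\mu\bar\epsilon_1 + a_1^\mu) - (1 \leftrightarrow 2)$

$= (-i\,\epsilon_1\tau^\mu\bar\epsilon_2 + i\,\epsilon_2\tau^\mu\bar\epsilon_1 + 0) - (1 \leftrightarrow 2) = -2i\,\epsilon_1\tau^\mu\bar\epsilon_2 + 2i\,\epsilon_2\tau^\mu\bar\epsilon_1$.  □

5 THE CALCULUS ON SUPERSPACE, THE COMPONENT FIELDS AND SUPERFIELDS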

To define any entities on a space, a knowledge of calculus is needed. We give below simple rules of calculus on the superspace and then follow on to the definitions of component fields and superfields.²⁹ Due to our limited scope we shall not go into their detailed study; we would, however, like to mention that these fields provide the best means for the description of supersymmetry representations.

5.1 Infinitesimal Generators and Covariant Vector Fields

An element of the superspace is denoted by $z^M = (x^\mu, \theta^\alpha, \bar\theta^{\dot\alpha})$; thus M stands for $\mu$, $\alpha$ and $\dot\alpha$, indicating a Lorentz 4-vector in the former case and a Lorentz spinor in the latter. A finite group element on the space can be defined as:

$$G(x, \theta, \bar\theta) = e^{\,i z^M k_M} \tag{7.5.1(a)}$$

where $k_M = (-P_\mu, Q_\alpha, \bar Q_{\dot\alpha})$ represents a triplet of group generators. Thus

$$G(x, \theta, \bar\theta) = \exp i(-x^\mu P_\mu + \theta^\alpha Q_\alpha + \bar\theta_{\dot\alpha}\bar Q^{\dot\alpha}) = \exp i(-x^\mu P_\mu + \theta Q + \bar\theta\bar Q) \tag{7.5.1(b)}$$

In view of Hausdorff's formula $e^A e^B = e^{A + B + \frac{1}{2}[A, B] + \cdots}$, the product of two elements $G \equiv G(x, \theta, \bar\theta)$ and $G' = G(y, \xi, \bar\xi)$ can be written as:

$$G(x + y - i\,\xi\tau\bar\theta + i\,\theta\tau\bar\xi,\; \theta + \xi,\; \bar\theta + \bar\xi) \tag{7.5.2}$$

where we have used the following supersymmetry algebra rules:

$$[\xi Q, \bar\xi\bar Q] = 2\,\xi\tau^\mu\bar\xi\, P_\mu, \qquad [P^\mu, \xi Q] = [P^\mu, \bar\xi\bar Q] = 0 \tag{7.5.3}$$

based on

$$\{\xi^\alpha, \xi^\beta\} = \{\xi^\alpha, Q_\beta\} = \cdots = [P_\mu, \xi^\alpha] = 0 \tag{7.5.4}$$

²⁹ The study in this section is described using the notations of [19], [20], since Wess and Zumino were the first to discover the supersymmetry algebra and give it a mathematical formulation. The reader may find the equations derived in this section marginally different from those of Sec. 4 due to a different choice of $\gamma^m$, etc.

The multiplication of group elements induces a motion in the parameter space; hence if we choose y = 0 and write $G(0, \xi, \bar\xi) = g(\xi, \bar\xi)$ we have:

$$g(\xi, \bar\xi) : (x^\mu, \theta, \bar\theta) \to (x^\mu + i\,\theta\tau^\mu\bar\xi - i\,\xi\tau^\mu\bar\theta,\; \theta + \xi,\; \bar\theta + \bar\xi) \tag{7.5.5}$$

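The infinitesimal generators of this motion (also known as differential operators) are:

$$Q_\alpha = \frac{\partial}{\partial\theta^\alpha} - i\,\tau^\mu_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_\mu, \qquad \bar Q_{\dot\alpha} = -\frac{\partial}{\partial\bar\theta^{\dot\alpha}} + i\,\theta^\alpha\tau^\mu_{\alpha\dot\alpha}\partial_\mu \tag{7.5.6(a)}$$

which satisfy:

$$\{Q_\alpha, \bar Q_{\dot\alpha}\} = 2i\,\tau^\mu_{\alpha\dot\alpha}\partial_\mu, \qquad \{Q_\alpha, Q_\beta\} = \{\bar Q_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0 \tag{7.5.6(b)}$$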

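They represent the algebra resulting from the group formed by the elements G, G', etc. These group elements (naturally) define curves in the superspace; the tangents to these curves give rise to covariant vector fields. It can be checked that these vector fields are carried into themselves by group transformations. The product of three group elements, in view of the associative law, gives:

$$\big(G_2(x_2, \theta_2, \bar\theta_2)\, G_1(x_1, \theta_1, \bar\theta_1)\big)\, G_3(x_3, \theta_3, \bar\theta_3) = G_2(x_2, \theta_2, \bar\theta_2)\, \big(G_1(x_1, \theta_1, \bar\theta_1)\, G_3(x_3, \theta_3, \bar\theta_3)\big)$$

which in turn suggests that the vector field can be obtained as the infinitesimal of the right multiplication of the group. The components of this (covariant) vector field are:

$$D_\alpha = \frac{\partial}{\partial\theta^\alpha} + i\,\tau^\mu_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_\mu, \qquad \bar D_{\dot\alpha} = -\frac{\partial}{\partial\bar\theta^{\dot\alpha}} - i\,\theta^\alpha\tau^\mu_{\alpha\dot\alpha}\partial_\mu \tag{7.5.7}$$

By their very definition D and $\bar D$ satisfy the anti-commutation relations:

$$\{D_\alpha, \bar D_{\dot\alpha}\} = -2i\,\tau^\mu_{\alpha\dot\alpha}\partial_\mu, \qquad \{D_\alpha, D_\beta\} = \{\bar D_{\dot\alpha}, \bar D_{\dot\beta}\} = 0 \tag{7.5.8}$$

among themselves, and anti-commute with the operators Q and $\bar Q$; thus we have:

$$\{D_\alpha, Q_\beta\} = \{D_\alpha, \bar Q_{\dot\beta}\} = \{\bar D_{\dot\alpha}, Q_\beta\} = \{\bar D_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0 \tag{7.5.9}$$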

The covariant vector fields given in (7.5.7), which are also known as the differential operators of the system, are collectively written as:

$$D_N = e_N{}^{M}\, \frac{\partial}{\partial z^M} \tag{7.5.10}$$

The entity $e_N{}^{M}$ can easily be seen to be the (3 × 3)-matrix (see Exc. (7.6.5)):

s; e»=

*JO*

B

l-* *a*

o 0s S*

v

0

(7.5.11)

o 4)

The inverse of the above matrix, determined by:

$$e_N{}^{M}\, e_M{}^{A} = \delta_N{}^{A}, \qquad e_M{}^{A}\, e_A{}^{N} = \delta_M{}^{N} \tag{7.5.12}$$

is given as follows: 5$ N

e

M=

0

-ir^P \

o'

8«p 0

idh I

op

0

(7.5.13)

b\

pJ

Note that the lower index N in $e_N{}^{M}$ stands for the index of the operator $D_N$, whereas the upper index M stands for that of $\partial/\partial z^M$; in the case of $e_M{}^{N}$ they get reversed (see Exc. 2 for the verification of (7.5.12)). The entities $e_N{}^{M}$ and $e_M{}^{N}$ are called the supervielbeins. We shall return to these later in this chapter. We next define in brief the component multiplets and superfields.

5.2 Component Multiplets and Superfields

Definition 7.5.1: Consider anti-commuting parameters $\xi^\alpha$, $\bar\xi_{\dot\alpha}$, ... and fermionic elements Q that satisfy (7.5.4) and the supersymmetry algebra (7.5.3), where³⁰:

$$\xi Q = \xi^\alpha Q_\alpha \quad\text{and}\quad \bar\xi\bar Q = \bar\xi_{\dot\alpha}\bar Q^{\dot\alpha} \tag{7.5.14}$$

³⁰ Note that Q, $\bar Q$ are not necessarily differential operators here.

A component multiplet

$$C = (A, \psi, \cdots) \tag{7.5.15}$$

is a set of fields in the context of supersymmetry which can be transformed by operators formed from $\xi$, Q, $\bar\xi$, $\bar Q$ to give rise to another multiplet. Beginning with C we can thus obtain another multiplet by defining the infinitesimal transformation formulated as:

(a) $\delta_\xi A = (\xi Q + \bar\xi\bar Q) \times A$

(b) $\delta_\xi \psi = (\xi Q + \bar\xi\bar Q) \times \psi$    (7.5.16)

The first of these satisfies:

$$[\delta_{\xi'}, \delta_\xi]\, A = (\delta_{\xi'}\delta_\xi - \delta_\xi\delta_{\xi'})\, A = 2\,(\xi'\tau^\mu\bar\xi - \xi\tau^\mu\bar\xi')\, P_\mu A = -2i\,(\xi'\tau^\mu\bar\xi - \xi\tau^\mu\bar\xi')\, \partial_\mu A \tag{7.5.17}$$

We note here that the operator $[\delta_{\xi'}, \delta_\xi]$ acts differently on the other component $\psi$. This is because the supersymmetry transformation maps tensor fields into spinor fields and vice versa. Also, since Q has mass dimension 1/2, the fields of dimension k are transformed to fields of dimension k + 1/2 or into the derivatives of fields of lower dimension under the operation of Q. In the end, however, the closing requirements of the algebra always lead to a result which is acceptable within the framework of the theory. If we choose A as a scalar field and $\psi$ as the spinor field into which A is transformed:

$$\delta_\xi A = \sqrt{2}\,\xi\psi \tag{7.5.18}$$

then in view of the above explanation, the field $\psi$ is transformed into a tensor field of higher dimension and into the derivative of A itself; accordingly:

$$\delta_\xi \psi = \sqrt{2}\,\xi F + i\sqrt{2}\,\tau^\mu\bar\xi\,\partial_\mu A \tag{7.5.19}$$

Note that the coefficient of $\partial_\mu A$ has been chosen in the above equation so that the commutation relation (7.5.17) is satisfied. Having introduced the new tensor field F in $\delta_\xi\psi$, our objective is to find the transformed object $\delta_\xi F$. This is done by writing $[\delta_{\xi'}, \delta_\xi]\psi$ explicitly and by requiring that the rules of the algebra given in (7.4.41) are preserved. Using (7.5.19) we thus have:

(S{, <%- 8z 8r)¥= - 2/ (
y, {% TV!,-$TV

+ -Jl (£5 { ,F-§' SfF)

I') (7.5.20)

It can be checked that the algebra closes if

$$\delta_\xi F = i\sqrt{2}\,\bar\xi\,\bar\tau^\mu\partial_\mu\psi \tag{7.5.21}$$

This transformation rule for F also ensures that the commutator closes on F as well. In view of the above discussion, the component multiplet C is now enlarged to

$$(A, \psi, F) \tag{7.5.22}$$

The multiplet with transformation rules (7.5.18), (7.5.19) and (7.5.21) is called the scalar multiplet. The fields A, $\psi$ and F form the (simplest) linear representation of the supersymmetry algebra described in the previous section. Also, as we noted above, if the dimension of A equals 1, then $\psi$ has dimension 3/2 and F has dimension 2. Since our primary fields are A and $\psi$, F is said to be an auxiliary field.

Remark 7.5.2: We note that F transforms as a space-time derivative under $\delta_\xi$; we further emphasize that this is always the case for the component of highest dimension in any given multiplet.

Our motive in defining these fields is to write a Lagrangian in terms of them that leads to an invariant action. The Lagrangian with this property is as follows:

$$L = L_0 + m L_m = \big(i\,\partial_\mu\bar\psi\,\bar\tau^\mu\psi + A^*\Box A + F^* F\big) + m\big(AF + A^*F^* - \tfrac{1}{2}(\psi\psi + \bar\psi\bar\psi)\big) \tag{7.5.23}$$

where m stands for the mass of the Weyl spinor $\psi$ and the complex scalar A, and $\Box$ stands for d'Alembert's operator. The field equations resulting from L are:

$$i\,\bar\tau^\mu\partial_\mu\psi + m\bar\psi = 0, \qquad F + mA^* = 0, \qquad \Box A + mF^* = 0 \tag{7.5.24}$$

which describe the Weyl spinor $\psi$ and the complex scalar A, both of the same mass m.

We shall next define superfields. These can always be constructed from component fields, and conversely, given superfields, the component fields can be recovered from them. Clearly superfields are functions defined on superspace, and as such they can always be expressed as power series in $\theta$ and $\bar\theta$. Denoting the superfield by $F(x, \theta, \bar\theta)$ we thus have:

$$F(x, \theta, \bar\theta) = f(x) + \theta\phi(x) + \bar\theta\bar\chi(x) + \theta\theta M(x) + \bar\theta\bar\theta N(x) + \theta\tau^\mu\bar\theta\, V_\mu(x) + \theta\theta\bar\theta\bar\lambda(x) + \bar\theta\bar\theta\theta\psi(x) + \theta\theta\bar\theta\bar\theta D(x) \tag{7.5.25}$$

Due to the anti-commuting properties of the $\theta$'s, the product of more than two $\theta$'s or $\bar\theta$'s vanishes. As a result the expansion of $F(x, \theta, \bar\theta)$ contains only these terms. The coefficients f(x), $\phi(x)$, ..., D(x) are themselves component fields, and the transformation of the superfield is defined through that of its components:

$$\delta_\xi F(x, \theta, \bar\theta) = \delta_\xi f(x) + \theta\,\delta_\xi\phi(x) + \bar\theta\,\delta_\xi\bar\chi(x) + \theta\theta\,\delta_\xi M(x) + \bar\theta\bar\theta\,\delta_\xi N(x) + \theta\tau^\mu\bar\theta\,\delta_\xi V_\mu(x) + \theta\theta\bar\theta\,\delta_\xi\bar\lambda(x) + \bar\theta\bar\theta\theta\,\delta_\xi\psi(x) + \theta\theta\bar\theta\bar\theta\,\delta_\xi D(x) \tag{7.5.26}$$

Just as we had the transformation for component fields, the transformation $\delta_\xi F$ here stands for:

$$\delta_\xi F \equiv (\xi Q + \bar\xi\bar Q)\, F \tag{7.5.27}$$

where Q and $\bar Q$ are the differential operators defined in (7.5.6). The transformation laws for component fields can be found by substituting the values of Q and $\bar Q$ and matching the appropriate powers of $\theta$. We shall illustrate this at the end of this section in Exc. (7.5.5) by using a slightly different route.
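The termination of the expansion (7.5.25) can be illustrated with a toy Grassmann algebra. The sketch below is an illustration under stated assumptions, not the book's construction: it labels the four anticommuting coordinates $\theta^1, \theta^2, \bar\theta^{\dot 1}, \bar\theta^{\dot 2}$ by the integers 0 to 3 and stores a polynomial as a dictionary from sorted index tuples to coefficients. The helper names `sort_sign`, `gmul` and `gen` are hypothetical.

```python
from itertools import combinations

def sort_sign(indices):
    """Sort generator labels, returning (sign, sorted_tuple).

    The sign is the parity of the permutation; (0, ()) signals a repeated
    generator, whose product vanishes.
    """
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):                 # bubble sort, counting swaps
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def gmul(a, b):
    """Product of two Grassmann polynomials {generator_tuple: coefficient}."""
    out = {}
    for ka, ca in a.items():
        for kb, cb in b.items():
            sign, key = sort_sign(ka + kb)
            if sign:
                out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def gen(i):
    """The i-th anticommuting generator as a polynomial."""
    return {(i,): 1}

t1, t2 = gen(0), gen(1)
anticommute = gmul(t1, t2) == {(0, 1): 1} and gmul(t2, t1) == {(0, 1): -1}
square_vanishes = gmul(t1, t1) == {}
cubic_vanishes = gmul(gmul(t1, t2), t1) == {}     # three thetas from two generators

# Independent monomials in four generators versus component functions in (7.5.25).
monomials = [c for k in range(5) for c in combinations(range(4), k)]
components = {"f": 1, "phi": 2, "chi_bar": 2, "M": 1, "N": 1,
              "V_mu": 4, "lambda_bar": 2, "psi": 2, "D": 1}
print(len(monomials), sum(components.values()))
```

The 16 monomials match the 16 component functions counted from (7.5.25), with each two-component spinor counted twice and $V_\mu$ four times.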


It can be verified that the commutator $[\delta_{\xi'}, \delta_\xi]$ in the case of these fields satisfies (7.5.17) as well, because of the anti-commutation relations (7.5.6)(b). It is easy to check that the sum and product of two or more superfields is a superfield. Moreover, the operators Q, $\bar Q$ involved with them are linear operators. The collection of superfields can thus be considered as forming linear representations of the supersymmetry algebra. In general these representations are highly reducible. For example, the number of extra component fields can be reduced by imposing conditions such as:

$$\bar D F = 0 \qquad (DF = 0) \tag{7.5.28(a)}$$

or

$$F = F^\dagger \qquad \text{(the reality condition)} \tag{7.5.28(b)}$$

The superfield that satisfies (a) is called chiral or scalar³¹ (antichiral), whereas the one that satisfies (b) is known as a vector superfield. In conclusion, we would like to mention that all supersymmetric renormalizable Lagrangians can be constructed in terms of scalar and vector superfields, while the superfields in turn can be constructed from a component multiplet by applying the operator $\exp(\theta Q + \bar\theta\bar Q)$. Finally we illustrate, via the following examples, the theory discussed above.

Example 7.5.3:

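The scalar superfield: Consider the superfield $\Phi$ which satisfies:

$$\bar D_{\dot\alpha}\Phi = 0 \tag{7.5.29}$$

We are interested in the solution of the above equation. Since $\bar D_{\dot A} = -\dfrac{\partial}{\partial\bar\theta^{\dot A}} - i\,\theta^{a}\tau^\mu_{a\dot A}\dfrac{\partial}{\partial x^\mu}$, it is easy to note that the variables $\theta$ and $y^\mu = x^\mu + i\,\theta\tau^\mu\bar\theta$ are zeros of $\bar D_{\dot A}$. More explicitly:

$$\bar D_{\dot A}\, y^\mu = \left(-\frac{\partial}{\partial\bar\theta^{\dot A}} - i\,\theta^{a}\tau^\nu_{a\dot A}\frac{\partial}{\partial x^\nu}\right)\big(x^\mu + i\,\theta^{b}\tau^\mu_{b\dot B}\bar\theta^{\dot B}\big) = i\,\theta^{a}\tau^\mu_{a\dot A} - i\,\theta^{a}\tau^\mu_{a\dot A} = 0 \tag{7.5.30}$$

Hence any superfield constructed with only y and $\theta$ is scalar, and it can be written down as:

$$\Phi = A(y) + \sqrt{2}\,\theta\phi(y) + \theta\theta F(y) \tag{7.5.31}$$

Note that it corresponds to the scalar multiplet introduced earlier in this section. Also, using Taylor's expansion it can be written in terms of the variable x as:

$$\Phi = A(x) + i\,\theta\tau^\mu\bar\theta\,\frac{\partial}{\partial x^\mu}A(x) + \frac{1}{4}\theta\theta\bar\theta\bar\theta\,\Box A(x) + \sqrt{2}\,\theta\phi(x) - \frac{i}{\sqrt{2}}\,\theta\theta\,\frac{\partial}{\partial x^\mu}\phi(x)\,\tau^\mu\bar\theta + \theta\theta F(x) \tag{7.5.32}$$

The superfield $\Phi$ which, as we know, is chiral, is sometimes denoted as $\Phi_+$, and the one that satisfies

$$D\Phi = 0 \tag{7.5.33}$$

is known as antichiral and is denoted as $\Phi_-$.³² It can be easily verified that $\Phi_-$ can be expressed in terms of the variables $z = z^\mu = x^\mu - i\,\theta\tau^\mu\bar\theta$ and $\bar\theta$. Thus we have:

$$\Phi_- = A^*(z) + \sqrt{2}\,\bar\theta\bar\phi(z) + \bar\theta\bar\theta F^*(z)$$

$$= A^*(x) - i\,\theta\tau^\mu\bar\theta\,\frac{\partial}{\partial x^\mu}A^*(x) + \frac{1}{4}\theta\theta\bar\theta\bar\theta\,\Box A^*(x) + \sqrt{2}\,\bar\theta\bar\phi(x) + \frac{i}{\sqrt{2}}\,\bar\theta\bar\theta\,\theta\tau^\mu\frac{\partial}{\partial x^\mu}\bar\phi(x) + \bar\theta\bar\theta F^*(x) \tag{7.5.34}$$

³¹ Very often the word 'scalar' stands collectively for superfields that satisfy $\bar D F = 0$ as well as DF = 0; see Hint to Exc. (7.5.9).

Apparently $(\Phi_+)^\dagger = \Phi_-$. It is easy to check that the product of two chiral (antichiral) superfields is again chiral (antichiral). Under the supersymmetry transformation $(\theta \to \theta + \epsilon,\ \bar\theta \to \bar\theta + \bar\epsilon)$ the components of the chiral field $\Phi_+$ transform as:

$$\delta A = \sqrt{2}\,\epsilon\phi, \qquad \delta\phi = \sqrt{2}\,\epsilon F + i\sqrt{2}\,\tau^\mu\bar\epsilon\,\partial_\mu A, \qquad \delta F = i\sqrt{2}\,\bar\epsilon\,\bar\tau^\mu\partial_\mu\phi \tag{7.5.35}$$

(see (7.5.18), (7.5.19) and (7.5.21)). The complex conjugation of these gives the supersymmetry transformations of the components of the antichiral field $\Phi_-$. It can also be verified that for any superfield $F = F(x, \theta, \bar\theta)$ given in Eq. (7.5.25), $\bar D\bar D F$ is chiral.

Example 7.5.4: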

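The vector superfield and the gauge transformation: Consider the superfield V that satisfies the condition $V = V^\dagger$; from (7.5.28)(b) we know that this is a vector superfield. Evidently the following choice of components of V makes this possible:

$$V(x, \theta, \bar\theta) = C(x) + i\,\theta\chi(x) - i\,\bar\theta\bar\chi(x) + \frac{i}{2}\,\theta\theta\,[M(x) + iN(x)] - \frac{i}{2}\,\bar\theta\bar\theta\,[M(x) - iN(x)] - \theta\tau^\mu\bar\theta\, V_\mu(x)$$

$$\qquad + i\,\theta\theta\bar\theta\Big[\bar\lambda(x) + \frac{i}{2}\,\bar\tau^\mu\partial_\mu\chi(x)\Big] - i\,\bar\theta\bar\theta\theta\Big[\lambda(x) + \frac{i}{2}\,\tau^\mu\partial_\mu\bar\chi(x)\Big] + \frac{1}{2}\,\theta\theta\bar\theta\bar\theta\Big[D(x) + \frac{1}{2}\Box C(x)\Big] \tag{7.5.36}$$

Next we consider the sum $\Phi_+ + \Phi_-$ given by:

$$\Phi_+(x, \theta, \bar\theta) + \Phi_-(x, \theta, \bar\theta) = (A + A^*) + \sqrt{2}\,(\theta\phi + \bar\theta\bar\phi) + (\theta\theta F + \bar\theta\bar\theta F^*) + i\,\theta\tau^\mu\bar\theta\,\partial_\mu(A - A^*)$$

$$\qquad + \frac{i}{\sqrt{2}}\,\theta\theta\bar\theta\,\bar\tau^\mu\partial_\mu\phi + \frac{i}{\sqrt{2}}\,\bar\theta\bar\theta\theta\,\tau^\mu\partial_\mu\bar\phi + \frac{1}{4}\,\theta\theta\bar\theta\bar\theta\,\Box(A + A^*) \tag{7.5.37}$$

³² Sometimes we shall denote $\Phi_-$ as $\Phi^\dagger$, retaining the letter $\Phi$ for $\Phi_+$; see in particular Subsec. 6.2. The * on the side of A and F in Eq. (7.5.34) and subsequent equations stands for complex conjugation.

This helps us to define the supersymmetric gauge transformation:

$$V \to V + \Phi_+ + \Phi_- \tag{7.5.38}$$

Under this gauge transformation the component fields transform as follows:

$$C \to C + A + A^*, \qquad \chi \to \chi - i\sqrt{2}\,\phi, \qquad (M + iN) \to (M + iN) - 2iF,$$

$$V_\mu \to V_\mu - i\,\partial_\mu(A - A^*), \qquad \lambda \to \lambda, \qquad D \to D \tag{7.5.39}$$

Obviously $\lambda$ and D are gauge invariant. If one were to choose the vector superfield V with C, $\chi$, M and N as zero, the supersymmetry breaks but the familiar gauge transformation is still invariant (note that we have denoted $-i(A - A^*)$ as a, so that $V_\mu \to V_\mu + \partial_\mu a$). This particular choice of C, $\chi$, M, N is called the Wess-Zumino or WZ gauge. It can be easily verified that in WZ gauge the vector field V satisfies:

$$V^3 = 0 \tag{7.5.40}$$

(see Exc. 7.5.7). From (7.5.39) it is evident that the WZ gauge is based on the following relations amongst fields:

$$\mathrm{Re}\,A = -\frac{1}{2}\,C, \qquad \phi = \frac{1}{i\sqrt{2}}\,\chi, \qquad F = \frac{1}{2i}\,(M + iN) \tag{7.5.41}$$

thus after the gauge transformation V becomes:

$$V(0, 0, 0, 0, V_\mu, \lambda, D) \tag{7.5.42}$$

In the next section we shall briefly return to these ideas.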
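The way the relations (7.5.41) bring V to the Wess-Zumino gauge can be checked with plain numbers. The following sketch is an illustration only: it suppresses the x-dependence, treats the spinor components $\chi$ and $\phi$ as ordinary complex numbers (enough here, since the rules in (7.5.39) are linear in them), and uses the hypothetical helper names `gauge_transform` and `wz_parameters`. The sample field values are arbitrary.

```python
import math

def gauge_transform(V, A, phi, F):
    """Apply the rules (7.5.39) to the lowest components of V.

    V is a dict with keys "C", "chi" and "MN" (standing for M + iN);
    A, phi, F are the components of the chiral gauge parameter.
    """
    return {
        "C": V["C"] + 2.0 * A.real,                    # C -> C + A + A*
        "chi": V["chi"] - 1j * math.sqrt(2.0) * phi,   # chi -> chi - i sqrt(2) phi
        "MN": V["MN"] - 2j * F,                        # M + iN -> M + iN - 2iF
    }

def wz_parameters(V, im_A=0.0):
    """Gauge parameters fixed by (7.5.41); Im A is left free."""
    A = complex(-0.5 * V["C"], im_A)
    phi = V["chi"] / (1j * math.sqrt(2.0))
    F = V["MN"] / 2j
    return A, phi, F

V = {"C": 0.7, "chi": 0.3 - 1.1j, "MN": 2.0 + 0.5j}    # arbitrary sample values
A, phi, F = wz_parameters(V, im_A=0.25)
V_wz = gauge_transform(V, A, phi, F)
residual = max(abs(v) for v in V_wz.values())          # vanishes up to rounding
print(residual)
```

The imaginary part of A is not fixed by (7.5.41); it is the parameter of the remaining ordinary gauge transformation of $V_\mu$.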

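Exercise 7.5

1. Show that $\{D_A, \bar D_{\dot A}\} = -2i\,\tau^\mu_{A\dot A}\,\partial_\mu$

5. Use the Hausdorff formula $e^A \cdot e^B = \exp(A + B + \tfrac{1}{2}[A, B] + \cdots)$ to show that:

$G(0, \epsilon, \bar\epsilon)\, G(x^\mu, \theta, \bar\theta) = G(x^\mu + i\,\theta\tau^\mu\bar\epsilon - i\,\epsilon\tau^\mu\bar\theta,\; \theta + \epsilon,\; \bar\theta + \bar\epsilon)$.

6. Use the supersymmetry transformation

$F(x, \theta, \bar\theta) \to F(x^\mu + i(\theta\tau^\mu\bar\epsilon - \epsilon\tau^\mu\bar\theta),\; \theta + \epsilon,\; \bar\theta + \bar\epsilon)$

to find the supersymmetry transformations of component fields.

7. Show that in WZ gauge the vector superfield V satisfies:

$V^2 = -\tfrac{1}{2}\,\theta\theta\bar\theta\bar\theta\, V_\mu V^\mu$ and $V^3 = 0$.

8. Show that the operators D and $\bar D$ satisfy the following properties:

(a) $(D)^3 = 0$, $(\bar D)^3 = 0$

and

(b) $D^\alpha \bar D^2 D_\alpha = \bar D_{\dot\alpha} D^2 \bar D^{\dot\alpha}$.

9. Show that the superfields:

$W_\alpha = -\tfrac{1}{4}\,(\bar D)^2 D_\alpha V$ and $\bar W_{\dot\alpha} = -\tfrac{1}{4}\,(D)^2 \bar D_{\dot\alpha} V$

are chiral and antichiral respectively and are also gauge invariant.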

Hints to Exercise 7.5

1. $\{D_A, \bar D_{\dot A}\} = D_A \bar D_{\dot A} + \bar D_{\dot A} D_A$. In view of (7.5.7)³³, the RHS can be written as:

³³ Note that Greek letters there are replaced by Latin capitals here, and also the repeated indices in $\tau$, $\theta$ and $\bar\theta$ are used with a different convention. The reader should check these equations using the notations of (7.5.7) as well.

=

_'{_d

d_ _d

\deA

deA

d\

deA deA)

M

+ TA e

y * ° ^[w)+id*

VJ_

dxv

Tj v

* i^vde*))

In the above sum, the first and fourth terms in parentheses are zero due to the anti-commuting properties:

B

B

P TA - . —A1 = 0, l\d ,e \=o \de de f J and the third vanishes since

d f d )

.

d ( d )

are zero. Similarly writing

IO* D.) - [[£

,rj 5 ^ ) ( ^ • *J V £-) + M M .enns]

+

and simplifying on the above lines we note that all four terms (in pairs) are zero, since in this case the term

2. Written out in full the equality D

N=

e

N

T

M

becomes: d

(DA

(

V

0

Da = .-c r sj

0^

dxv

o J^ .

The inverse e^ will be given by the relation:


Using (7.5.13) we have:

'

V irj8*

0 5J

0 1 ( 8/ 0 -ix^P

-iPraSSS + iSS&fixtf

0 6f

0

0 " 0

<5/<5^

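3. In order to show that the commutator closes on A, we use the definitions of $\delta_\xi A$ and $\delta_\xi\psi$ as given in (7.5.16). Since $\delta_\xi A = \sqrt{2}\,\xi\psi$, we have:

$\delta_{\xi'}\delta_\xi A = (\xi' Q + \bar\xi'\bar Q) \times [(\xi Q + \bar\xi\bar Q) \times A] = (\xi' Q + \bar\xi'\bar Q) \times (\sqrt{2}\,\xi\psi) = \delta_{\xi'}(\sqrt{2}\,\xi\psi)$.

Similarly

$\delta_\xi\delta_{\xi'} A = (\xi Q + \bar\xi\bar Q) \times (\sqrt{2}\,\xi'\psi)$.

Therefore using (7.5.19) we have:

$[\delta_{\xi'}, \delta_\xi]\, A = \sqrt{2}\,\xi\,(i\sqrt{2}\,\tau^\mu\bar\xi'\,\partial_\mu A + \sqrt{2}\,\xi' F) - \sqrt{2}\,\xi'\,(i\sqrt{2}\,\tau^\mu\bar\xi\,\partial_\mu A + \sqrt{2}\,\xi F)$

$= 2i\,(\xi\tau^\mu\bar\xi'\,\partial_\mu A - \xi'\tau^\mu\bar\xi\,\partial_\mu A) + 2\,(\xi\xi' - \xi'\xi)\,F$.

The second term is zero in view of (7.5.14); this gives us (7.5.17). The other relations can be verified in a similar way.

4. Note that $\theta\theta$ can be written in more than one way, e.g.,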

==

£fig o

U

=

we choose to write it as eCD 0 6

£AB

-w -w

UQ v

—E

Ug u^ r

to obtain:

{£CD QC eD) = £AB £CD

[

= £

\-CcD (SB

S

-w(5/ 6° -eC 5»D)}

A ~ dA °B )]


= eAB(eBA

- eAB) = 2t*B eBA = 4.

Since A, B take the values 1 and 2 and 8^ = 0 for L * M we have the result. To verify the second part of the problem we write it as 34 :

and use the equality 9 9 = ECB 9^9^.

.V

A

We thus have:

A

/J

a

= H/](VV+VIAgain since A, B take values 1 and 2 and <5^ = 0 for L ± M we have the result.35 5. In fact this exercise is a verification of our statement made in (7.5.2). Writing G(0, e, e) G{xti, 9, 0) explicitly in exponential form and using the Hausdorff's formula we have: e

i(ee+ie>V(-^

+ee +

^)

= exp

U

x l t

P^+£Q

+

eQ+dQ

+ 9Q)

+ | [i(eQ + eQ), »(-*" ^ + BQ + 6 Q]\ = exp|i(-x" p^ + ee + e e + ee + ee) - i-f(ee + eQ) (-*" P^+9Q + 9Q) - {-x» Pll + 9Q + 9Q)(eQ + eQ)Jj. (Note that we have just one Lie bracket term on RHS, as all higher commutators vanish.) There are 12 terms in the second term of the above exponential, of these 8 cancel in pairs, the remaining 4 are: 34

' We have written it this way to preserve the order in the summation rule, which prescribes: e^^eCB = 8C,.

35

Note that these results could also have been established by writing e 4 8 o9 =—

7- and by using 99 = 9C 0cfor the first one, and e • 6 —

ddA deA

c

AB

j - = -eAB igB -\QA

j o9

— = -^-.

— etc. for the

deA deB deB deh

second. Yet another way of solving the problem would be to write F(x, 6, 9) = 9dM(x) = 991 or = 9 9 1 and operate on it with appropriate operator.


~

{(eQ9Q-9QeQ) + {eQ9Q-9QeQ)}.

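In view of (7.5.3) they can be written as:

$-\tfrac{1}{2}\,\{2\,\epsilon\tau^\mu\bar\theta\, P_\mu - 2\,\theta\tau^\mu\bar\epsilon\, P_\mu\} = (i)^2\,\epsilon\tau^\mu\bar\theta\, P_\mu + i(-i)\,\theta\tau^\mu\bar\epsilon\, P_\mu$.

Thus replacing the i back into the term and factoring out $(-P_\mu)$ we have the required result.

6. We expand the transformed superfield $F(x^\mu + \zeta^\mu, \theta + \epsilon, \bar\theta + \bar\epsilon)$ in terms of Taylor's series. Note that we have written $\zeta^\mu$ for $i(\theta\tau^\mu\bar\epsilon - \epsilon\tau^\mu\bar\theta)$. The expansion of the altered superfield gives:

(i) $F(x^\mu + \zeta^\mu, \theta + \epsilon, \bar\theta + \bar\epsilon) = F(x, \theta, \bar\theta) + \zeta^\mu\partial_\mu F + \epsilon\,\dfrac{\partial F}{\partial\theta} + \bar\epsilon\,\dfrac{\partial F}{\partial\bar\theta}$.

Now

$F(x, \theta, \bar\theta) = f(x) + \theta\phi(x) + \bar\theta\bar\chi(x) + \theta\theta M(x) + \bar\theta\bar\theta N(x) + \theta\tau^\nu\bar\theta\, V_\nu(x) + \theta\theta\bar\theta\bar\lambda(x) + \bar\theta\bar\theta\theta\psi(x) + \theta\theta\bar\theta\bar\theta D(x)$.

Hence (i) can be written as: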

Fix» + ?, 9 + e, 9 + e) - Fix, 9, 9) s8F s 8f(x) + 95®ix) + 9 8~xix) + ••• + 996 9 5 Dix) = I " (dMf+ 9dM® + ••• + 99998M D) + e— (0O(x) + 60M(x) + 9TV 9VV ix) + 999~Iix) + 999y/ix) + 999 9Dix)) o9 + e-~ (9%ix) + 9 6 Nix) + 9xv 9Vvix) + 999 X(x) + 9 9 6yKx) + 999 9 Dix)). o9 We now compare the coefficients of 9, 9,99, 99, 999, 9 96 etc., beginning with the constant term on both sides of (ii) to obtain (note that we have used the differentiation rules for powers of 9, 6 as illustrated in Exc. 4): (a) Sf = eO + e x (b)<5O= ie T ^ / + eM (c)Sx

=-ieT"dllf+eN

(d) SM = ii/2)e x^dft + el it) 8 N = i-ill) ET?du x

+ e

¥

if)SVM= iedfl + iedM j + c ^ A - ex^f (g) Si = ID + ( ^ Vv- dv vp e T / i t v V

ier^M

(h) 8y/= eD + ( ^ Vv - dv VM) / T £ + iT* eduN


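(i) $\delta D = \dfrac{i}{4}\,\epsilon\tau^\mu\partial_\mu\bar\lambda - \dfrac{i}{4}\,\bar\epsilon\tau^\mu\partial_\mu\psi$.

Note that the second term in (g) and (h) results from simplification of $\zeta^\mu\partial_\mu(\theta\tau^\nu\bar\theta\, V_\nu)$.

7. In WZ gauge the defining equation (7.5.36) for V simplifies to:

(i) $V = -\theta\tau^\mu\bar\theta\, V_\mu(x) + i\,\theta\theta\bar\theta\bar\lambda(x) - i\,\bar\theta\bar\theta\theta\lambda(x) + \tfrac{1}{2}\,\theta\theta\bar\theta\bar\theta\, D(x)$.

While writing $V^2$ we note that since no terms containing higher powers of $\theta$ or $\bar\theta$ beyond $\theta\theta\bar\theta\bar\theta$ are non-zero, the terms containing $\lambda$, $\bar\lambda$ and D contribute nothing to the product; we thus have:

(ii) $V^2 = (-\theta\tau^\mu\bar\theta\, V_\mu(x))\,(-\theta\tau^\nu\bar\theta\, V_\nu(x))$.

In view of the equality resulting from spinor algebra:

(iii) $(\theta\tau^\mu\bar\theta)(\theta\tau^\nu\bar\theta) = -\tfrac{1}{2}\,\theta\theta\bar\theta\bar\theta\,\eta^{\mu\nu}$

we see that:

(iv) $V^2 = -\tfrac{1}{2}\,\theta\theta\bar\theta\bar\theta\, V_\mu V^\mu$.

From (iv) $V^3 = 0$ easily follows.

8. By the very definition of the covariant derivative operators D and $\bar D$ as given in (7.5.7), the vanishing of cubic powers of D and $\bar D$ is a straightforward fact. To show the validity of (b) we write: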
From (iv) V3 = 0 easily follows. 8. By the very definition of covariant derivative operators D and D as given in (7.5.7), the vanishing of cubic powers of D and D is a straightforward fact. To show the validity of (b) we write:

DaD2Da^DaDeDpDa In view of commutation relation given in (7.5.8), RHS equals = Da Dp (2i(rdfa

-DaDp)-

Similarly, replacing Da D we have: DaD2Da = (2i(Td)«) - Dp Da) (2i(T«?)J - Da Dp) = - 8D + DpD2 Dp - 2i(Td)P {Da, Dp }. The third term in the above equality is 8 D, hence we have the required result (note that we have not used the suffixes on Pauli matrices and the space-time derivative d here, for obvious reasons and have replaced (T
D2D2+D2D2-2DaD2Da=l6n D2 D2D2 =16 • D2


(iii)

D2D2 D2 = 16 DD2

and to corresponding projection operators: (iv)

(v)

(vi)

which satisfy:

(vii) $\Pi_{0+} + \Pi_{0-} + \Pi_{1/2} = 1$.

The operators $\Pi_{0+}$ and $\Pi_{0-}$ acting on a scalar superfield project out the chiral and antichiral parts of the field, whereas $\Pi_{1/2}$ projects out a piece called the linear multiplet. (See Chapter 10 in [16] and Chapter 9 in [19] for more details.)

9. Since

(i) $\bar D_{\dot\beta} W_\alpha = -\tfrac{1}{4}\,\bar D_{\dot\beta}\bar D\bar D D_\alpha V$

the RHS involves a cubic power of $\bar D$ and is therefore zero in view of the result proved in Exc. 8. Similarly $D_\beta \bar W_{\dot\alpha}$ is zero. Hence $W_\alpha$ and $\bar W_{\dot\alpha}$ are chiral and antichiral fields respectively. To prove their gauge invariance we have to show that

(ii)

W -> (W + O + + 3>_) = W.

In other words we have to show that (iii)

Wa -> - 1 D DDa (V +
and similarly for W^. Note that in order to prove the above expression we have to use the facts D + = D
D DDa 0>+ = Dp [i(zdi (iv)

= I ( T 8 ) J ^ * + =0.

Alternatively we can also write; (v)

DDDa®++

and obtain the required result.

DD Z>aO_= D{D, Da} O + = 0.


6 DIFFERENTIAL FORMS AND GAUGE TRANSFORMATIONS ON SUPERSPACES

It is known that no super theory can be formulated without the use of differential forms and their derivation on superspaces. Similarly the gauge theories such as Yang-Mills cannot be extended into super-Yang-Mills without the help of the (so-called) super gauge transformations. Our task in this section is to introduce both these topics.

6.1 Differential Forms

Let $z^M = (x^\mu, \theta^\alpha, \bar\theta_{\dot\alpha})$ be an element of the superspace, where $x^\mu$ represents the four-vector, and $\theta^\alpha$, $\bar\theta_{\dot\alpha}$ represent the spinors in the superspace.³⁶ The multiplication rule among these elements is given as:

$$z^M z^N = (-)^{nm}\, z^N z^M \tag{7.6.1}$$

The letters n and m are functions of N and M respectively, and take the value 0 or 1 according as N and M stand for a vector or a spinor index; for example $x^\mu x^\nu = (-)^{0 \times 0}\, x^\nu x^\mu$, $\theta^\alpha\theta^\beta = (-)^{1 \times 1}\, \theta^\beta\theta^\alpha$ and $x^\mu\theta^\alpha = (-)^{0 \times 1}\, \theta^\alpha x^\mu$ (see (7.4.17)). The exterior product and differential forms in superspace are defined along the lines of ordinary space; thus for instance the exterior product obeys the rule:

$$dz^M \wedge dz^N = -(-)^{nm}\, dz^N \wedge dz^M, \qquad dz^M z^N = (-)^{nm}\, z^N dz^M \tag{7.6.2}$$

and a q-form³⁷ is defined as:

$$\Psi = dz^{M_1} \wedge \cdots \wedge dz^{M_q}\, W_{M_q \cdots M_1}(z) \tag{7.6.3}$$

(The indices n and m in (7.6.2) are the same as in (7.6.1).) The 0-forms in this case are functions F(z) of the superspace variable z, and a typical 1-form $\Lambda = dz^M W_M(z)$ can be written as:

$$\Lambda = dx^\mu\, W_\mu(z) + d\theta^\alpha\, W_\alpha(z) + d\bar\theta_{\dot\alpha}\, W^{\dot\alpha}(z) \tag{7.6.4}$$

It should be noted that in view of (7.6.2) the coefficient functions are of mixed symmetry and therefore, unlike the forms in ordinary finite dimensional space, there is no value of q above which all forms vanish. It is also assumed that all coefficient functions with an even (odd) number of spinorial indices are bosonic (fermionic) in character. As a result, the usual product rules amongst differential forms are valid:

$$\Lambda\Psi = (-)^{pq}\, \Psi\Lambda, \qquad (c_1\Lambda_1 + c_2\Lambda_2)\,\Psi = c_1\Lambda_1\Psi + c_2\Lambda_2\Psi, \qquad \Lambda(\Psi\Sigma) = (\Lambda\Psi)\Sigma \tag{7.6.5}$$

where $\Lambda$ is a p-form and $\Psi$ is a q-form in the first equality.

³⁶ In Sec. 5 we used $z^M = (x^\mu, \theta^\alpha, \bar\theta^{\dot\alpha})$; the present choice facilitates the summation.

³⁷ In general the exterior product sign will be dropped in the future.
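The sign rules (7.6.1) and (7.6.2) can be tabulated directly. The short sketch below is only an illustration; the function names `parity`, `coordinate_sign` and `form_sign` and the labels "vector" and "spinor" are assumptions introduced here.

```python
def parity(index_kind):
    """0 for a vector index (x^mu), 1 for a spinor index (theta, theta-bar)."""
    return {"vector": 0, "spinor": 1}[index_kind]

def coordinate_sign(kind_m, kind_n):
    """Sign in z^M z^N = sign * z^N z^M, following (7.6.1)."""
    return (-1) ** (parity(kind_m) * parity(kind_n))

def form_sign(kind_m, kind_n):
    """Sign in dz^M ^ dz^N = sign * dz^N ^ dz^M, following (7.6.2)."""
    return -coordinate_sign(kind_m, kind_n)

# (coordinate sign, differential sign) for every pair of index kinds.
table = {
    (m, n): (coordinate_sign(m, n), form_sign(m, n))
    for m in ("vector", "spinor")
    for n in ("vector", "spinor")
}
for kinds, signs in table.items():
    print(kinds, signs)
```

The spinor-spinor entry shows that $d\theta^\alpha \wedge d\theta^\beta$ is symmetric, so repeated products of the same $d\theta$ need not vanish. This is why there is no value of q above which all forms vanish.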


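The exterior derivative, denoted d, maps a 0-form to a 1-form and a q-form to a (q + 1)-form. It is defined as follows:

$d: F \to dF = dz^M \frac{\partial}{\partial z^M} F = dz^M \partial_M F$

$d: \Psi \to d\Psi = dz^{M_1} \cdots dz^{M_q}\, dz^N \frac{\partial}{\partial z^N} W_{M_q \cdots M_1}(z)$    (7.6.6)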

The derivative d in superspace satisfies properties similar to those in ordinary space, for instance:

(a) $d(\Psi + \Lambda) = d\Psi + d\Lambda$
(b) $d(\Psi\Lambda) = \Psi\, d\Lambda + (-)^p\, d\Psi\, \Lambda$
(c) $dd = 0$    (7.6.7)

where $\Lambda$ is a p-form. Since equations written in terms of differential forms and their exterior derivatives are covariant under coordinate changes, they are very useful in the formulation of almost all physical theories, in particular the gauge theories (see Exc. 7.6.1). From our discussions on gauge theories in Chapter 6, we know that gauge theories are covariant under general coordinate transformations as well as under a local structure group (that pertains to the theory in question). For instance, this is a compact Lie group for Yang-Mills theories and the Lorentz group for gravitational theories. Differential forms are often used to span a representation (say X) of this group; thus we have:

$\Psi'^a = \Psi^b X_b{}^a(z)$    (7.6.8)(a)

or

$\Psi' = \Psi X$    (7.6.8)(b)

where a and b take the values 1, ..., l, l being the dimension of the representation X. Now while exterior derivatives map differential forms to differential forms, they do not map tensors to tensors;38 for instance, from the derivative of (7.6.8)(b) we have

$d\Psi' = \Psi\, dX + d\Psi\, X$    (7.6.9)

which contains the inhomogeneous term $\Psi\, dX$. To circumvent this problem we introduce the (familiar) Lie-algebra valued connection 1-form defined as:

$\omega = dz^M \omega_M{}^r(z)\, iT^r$    (7.6.10)

where the matrices $T^r$ are the Hermitian generators of the group and the index r runs over the dimension of the Lie algebra. The transformation law for the connection $\omega$ is as follows:

$\omega' = X^{-1}\omega X - X^{-1} dX$    (7.6.11)

With its help the covariant derivative of a tensor-valued form $\Psi$ is defined as:

$\mathcal{D}\Psi = d\Psi + \Psi\omega$    (7.6.12)

38 Recall that tensors are objects that transform linearly under a representation of the structure group.


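Written out in full this is:

$\mathcal{D}\Psi = dz^{M_1} \cdots dz^{M_q}\, dz^N \partial_N W_{M_q \cdots M_1}(z) + dz^{M_1} \cdots dz^{M_q}\, dz^N \omega_N{}^r\, W_{M_q \cdots M_1}(z)\, iT^r$    (7.6.13)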

The derivative $\mathcal{D}$ maps q-forms to (q + 1)-forms and tensors to tensors. The connection form $\omega$ together with its derivative gives rise to an important tensor, the curvature, which we denote as $\mathcal{R}$:

$\mathcal{R} = d\omega + \omega\omega$    (7.6.14)(a)

As can be expected, it is a Lie-algebra valued two-form:39

$\mathcal{R} = \frac{1}{2}\, dz^M dz^N R_{NM}(z)$    (7.6.14)(b)

where

$R_{NM}(z) = R_{NM}{}^r(z)\, iT^r$    (7.6.14)(c)

In analogy with ordinary space, the curvature and the covariant derivative of a tensor are the only tensorial quantities which can be constructed by taking derivatives. Because dd = 0, higher derivatives do not lead to new tensors; they only lead to identities, the Bianchi identities (see Exc. 4).

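6.2 The Gauge Invariant Lagrangian in Superspace

Let U(1) be the gauge group and let a scalar superfield $\Phi_r$ undergo global phase transformations under U(1) rotations as:

$\Phi_r \to e^{i t_r \lambda}\, \Phi_r = \Phi_r'$    (7.6.15)

where $t_r$ denotes the charge corresponding to $\Phi_r$. The Lagrangian

$L = \Phi_i^+ \Phi_i \big|_{\theta\theta\bar\theta\bar\theta} + \left[ \left( \tfrac{1}{2} m_{ij} \Phi_i \Phi_j + \tfrac{1}{3} g_{ijk} \Phi_i \Phi_j \Phi_k \right) \Big|_{\theta\theta} + \text{h.c.} \right]$    (7.6.16)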

is gauge invariant under the transformation (7.6.15). The first term in L (the $\theta\theta\bar\theta\bar\theta$ component of $\Phi_r^+ \Phi_r$) is $L_{KE}$ and the second is $L_{PE}$, which is also called the superpotential. It is important to note here that to maintain U(1)-invariance, the coefficient $m_{ij}$ or $g_{ijk}$ must vanish whenever $t_i + t_j$ or $t_i + t_j + t_k$ is nonzero. If, on the other hand, the rotation angle $\lambda$ depends on the variable x, we denote it by $\Lambda$ and note that the following transformations:

$\Phi_r' = e^{-i t_r \Lambda}\, \Phi_r, \qquad \bar D_{\dot\alpha} \Lambda = 0$
$\Phi_r'^+ = e^{i t_r \Lambda^+}\, \Phi_r^+, \qquad D_\alpha \Lambda^+ = 0$    (7.6.17)(a)

leave the $L_{PE}$ portion of the above Lagrangian invariant*. To make the other portion invariant, we introduce the vector superfield V along with its gauge transformation law (see (7.5.38))40:

$V \to V + i(\Lambda - \Lambda^+)$    (7.6.17)(b)

in the Lagrangian and express V in terms of W (see Exc. 7.5.9). Accordingly the $L_{KE}$ is written as:

$L_{KE} = \frac{1}{4} \left( W^\alpha W_\alpha \big|_{\theta\theta} + \bar W_{\dot\alpha} \bar W^{\dot\alpha} \big|_{\bar\theta\bar\theta} \right) + \Phi_r^+ e^{t_r V} \Phi_r \big|_{\theta\theta\bar\theta\bar\theta}$    (7.6.18)

39 For this reason $\mathcal{R}$ is referred to as a curvature two-form.
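As a short consistency check, here is a sketch of why the vector superfield compensates the local phase. It assumes the sign conventions $\Phi_r' = e^{-i t_r \Lambda}\Phi_r$, $\Phi_r'^+ = \Phi_r^+ e^{i t_r \Lambda^+}$ and $V' = V + i(\Lambda - \Lambda^+)$, and uses the fact that all exponents commute in the abelian case:

```latex
\begin{aligned}
\Phi_r'^{+}\, e^{t_r V'}\, \Phi_r'
 &= \Phi_r^{+}\, e^{i t_r \Lambda^{+}}\; e^{t_r V + i t_r (\Lambda - \Lambda^{+})}\; e^{-i t_r \Lambda}\, \Phi_r \\
 &= \Phi_r^{+}\, e^{t_r V}\, e^{\, i t_r \Lambda^{+} + i t_r \Lambda - i t_r \Lambda^{+} - i t_r \Lambda}\, \Phi_r \\
 &= \Phi_r^{+}\, e^{t_r V}\, \Phi_r .
\end{aligned}
```

Under these assumptions the matter term of the kinetic Lagrangian is unchanged, which is the role the text assigns to V.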

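(see Exc. 9 of Sec. 5 (in particular (iii)) for $W_\alpha$ and $\bar W_{\dot\alpha}$). Although the Lagrangian $L = L_{KE} + L_{PE}$ with this change seems non-renormalizable, it is actually renormalizable in the WZ gauge where, as we know, $V^3 = 0$. To write down the supersymmetric extension of $L_{QED}$ (the Lagrangian for electrodynamics), we replace the charge $t_r$ with the electric charge e and use the scalar superfields:

$\Phi_+ \to e^{-ie\Lambda}\, \Phi_+ = \Phi_+'; \qquad \Phi_- \to e^{ie\Lambda}\, \Phi_- = \Phi_-'$    (7.6.19)

to obtain:

$L_{QED} = \frac{1}{4} \left( W^\alpha W_\alpha \big|_{\theta\theta} + \bar W_{\dot\alpha} \bar W^{\dot\alpha} \big|_{\bar\theta\bar\theta} \right) + \Phi_+^+ e^{eV} \Phi_+ \big|_{\theta\theta\bar\theta\bar\theta} + \Phi_-^+ e^{-eV} \Phi_- \big|_{\theta\theta\bar\theta\bar\theta} + m \left( \Phi_+ \Phi_- \big|_{\theta\theta} + \Phi_+^+ \Phi_-^+ \big|_{\bar\theta\bar\theta} \right)$    (7.6.20)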

It is worth noting here that the gauge transformation law (7.6.15) can be generalized to a non-abelian group (necessarily compact) by letting $\Lambda$ in the following:

$\Phi' = e^{-i\Lambda}\, \Phi, \qquad \Phi'^+ = \Phi^+ e^{i\Lambda^+}$    (7.6.21)

be matrix valued and expanded in terms of the Hermitian generators $T^a$ of the gauge group in the representation given by the scalar field $\Phi$. The vector superfields V, V' also become matrices and the transformation law for them is:

$e^{V'} = e^{-i\Lambda^+}\, e^{V}\, e^{i\Lambda}$    (7.6.22)

The superfield $W_\alpha$ is now defined as:

$W_\alpha = -\frac{1}{4}\, \bar D \bar D\, e^{-V} D_\alpha e^{V}$    (7.6.23)

with transformation law:

$W_\alpha' = e^{-i\Lambda}\, W_\alpha\, e^{i\Lambda}$    (7.6.24)

40 $\Lambda$ as well as $\bar\Lambda$ were assumed to be real, hence to bring it in line with (7.5.38) we used the factor i.

* See Ftn. 32 for the notations used in (7.6.16) etc. In this section + stands for Hermitian conjugate.

Finally the most general supersymmetric Lagrangian for renormalizable interacting fields can be written as:


L =

±-?r(vr

Wa\ee + Wa W%)

+

®+ ev<S>\eere

+ [ ( | myQPj + I gijk O,.O;.O, j |^ + h.c]

(7.6.25)

where k in the denominator is the normalizing factor resulting from:

$\mathrm{Tr}\, T^a T^b = k\, \delta^{ab}, \qquad k > 0.$
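The normalization $\mathrm{Tr}\, T^a T^b = k\,\delta^{ab}$ can be made concrete with a small numerical sketch. The choice of group is an assumption made only for illustration: SU(2) in its fundamental representation with generators $T^a = \sigma^a/2$ built from the Pauli matrices, for which one expects k = 1/2. The helper names are hypothetical.

```python
# Pauli matrices as nested lists. T^a = sigma^a / 2 is the usual choice for the
# fundamental representation of SU(2), which is an assumed example group here.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]


def generators():
    """Hermitian generators T^a = sigma^a / 2."""
    return [[[entry / 2 for entry in row] for row in s] for s in SIGMA]


def trace_table():
    """3x3 table of Tr(T^a T^b); proportional to the identity with factor k."""
    t = generators()
    return [[trace(matmul(t[a], t[b])) for b in range(3)] for a in range(3)]
```

The diagonal entries of `trace_table()` give k and the off-diagonal entries vanish. Other groups and representations give other positive values of k.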

6.3 Supergauge Transformations

Recall that in Sec. 6.5 (see Exc. 6.5.2) as well as in an earlier section we saw how a particular choice of gauge (e.g., Coulomb's or Wess-Zumino's) could lead to easier computations. In Sec. 6.6 we also established the gauge transformation rules. In this subsection we study supergauge transformations (SGT) so that we may find convenient reparametrizations of a super theory without disturbing much of its invariance properties. We shall see that SGTs are constructed from the general coordinate and structure group transformations of the superspace. They map Lorentz tensors into Lorentz tensors and reduce to supersymmetry transformations in the limit of flat space. To obtain these transformations, we begin with supersymmetry transformations (see (7.5.17)) where the parameters $\xi$ and $\bar\xi$ are independent of x. We shall now treat them as functions of x and use $\xi = \xi(x)$ and $\bar\xi = \bar\xi(x)$ to denote them. The motions induced by these transformations:

x" -> x" - i(0T" £ (x) - £(*) T " 0 )

8

a

^8a-

1«W

(7-6.26)

generate coordinate transformations such as: zM ^ y M = zM - ZM(z)

(7.6.27)

This helps one to express a given theory in the language of differential forms; the theory, as we already know, is covariant under coordinate transformations. Again our basic dynamic variables (superfields) for describing the theory here are the vielbeins and the connection form. These contain a large number of component fields, which eventually get reduced through covariant constraints and by choosing a proper coordinate system. We now proceed to substantiate this statement. Note that the parameter £ can be expressed in Einstein indices, e.g., £ M as well as in Lorentz indices A % through the vielbien relation41:

41

t.A _ p.M

T?

q - q

tM

A

m £.

TON

(7.6.28)

The indices depending on the structure group (which is Lorentz here) are denoted with the beginning of the alphabets, e.g., A, B, C and are called Lorentz indices, whereas the indices that come from the middle of the alphabet, e.g., M, Nrepresent coordinates and are known as Einstein indices. We have used the upper case letter '£" in place of the lower case letter V used in Sec. 5 to distinguish the coordinate dependence of transformations, due to our choice of the Lorentz group though, they represent the same object.


Evidently only one of these, i.e., either %A or E,M, can be chosen as field independent transformation parameter. Since we want Lorentz tensors to be transformed into Lorentz tensors, we let E,A to be field independent. Consider now an arbitrary tensor superfield VA, its transformation can be written as: SVA = - £u dM VA + VB LBA

(7.6.29)

The entities LB are Lorentz transformations that correspond to the tensor structure of V. In analogy to ordinary space, the transformation for scalar fields (again denoted V) are: SV=- ZM duV = - SA E^ dMV = - £A
(7.6.30)

E

A EM = °M > EA EM = °A (7.6.31) A Note that while 8V is covariant under Lorentz transformations, 8V is not so-unless the derivative in (7.6.29) is replaced by the covariant derivative: *>uVA =

(a)

dMVA+(-)MBVBa>UBA

(b) VBVA=EBMVMVA Now (7.6.29) can also be written- as:

(7.6.32)

8VA = -SBEBiidMVA+ VBLBA Substituting the expression for E^ dM VA from (7.6.32) in the above equation we have:

(7.6.33)

8VA = - B VA + VB $c
(7.6.34) c

A

We know that the connection coCB is Lie algebra valued, therefore £ coCB acts like a field dependent Lorentz transformation on VB. And hence if we set LBA = - ? (OC£

(7.6.35)

8^VA = -f'DcVA

(7.6.36)

we obtain: which is obviously covariant under Lorentz transformations. The condition (7.6.35) (which is also called special Lorentz transformation) helps in defining supergauge transformations. These transformations consist of a general coordinate transformation with field independent parameter %A followed by a (structure group) Lorentz transformation with fielddependent parameter given by Lg - -^C(OCB. As mentioned in the introduction, our ultimate objective is to find the gauged supersymmetry transformations; we note that this has been achieved by establishing the transformation rules (7.6.30) and (7.6.36) respectively for scalar V and tensor VA. It is easy to check that the commutator of two supergauge transformations based onfield-independentparameters E,A and r\A can be written as: (5^

- 845n) VA = ZCT1B CDB
(7.6.37)

or equivalently (8n 6,: - S&) VA = VD^B

RBCA - f

ifTBCDVDVA

(7.6.38)


where RBCp and TB® are components of curvature and torsion tensor respectively. The above equality shows that the commutator [8n, <5je] VA closes into a field-dependent Lorentz transformation (implicit in RBCQ) and a field dependent transformation given by (7.6.36). If the superspace is flat, RBC£ = 0 and the torsion is proportional to T-matrices, hence (7.6.38) reduces to the familiar form: [<% 8n] VA = -2i(7jT*| - ^ 7 7 ) 0 ^ VA.

(7.6.39)

Also if we choose the 9=6=0 component of
8^EMA = -
(7.6.40)

h<°MAB = -?Rcm

(7-6.41)

We devote the final section of this chapter to the concept of integration and conformality on superspaces.

Exercise 7.6

1. Let y and z be two sets of superspace coordinates (i.e., they label the same point in superspace) related by a mapping $\phi: z \to y$ and written as $y^M = y^M(z) = \phi(z)$. Show that there is an induced mapping $\phi^*$ which relates q-forms in the two coordinate systems, and has the properties:

(i) $\phi^*(\Psi + \Lambda) = \phi^*\Psi + \phi^*\Lambda$
(ii) $\phi^*(\Psi\Lambda) = (\phi^*\Psi)(\phi^*\Lambda)$
(iii) $d(\phi^*(\Psi)) = \phi^*(d\Psi)$.

2. Use the infinitesimal coordinate transformation $y^M = z^M - \xi^M$ and the induced mapping $\phi^*$ to obtain an infinitesimal variation in a 0-form F(z) and a 1-form $\Psi$.

3. Show that $\mathcal{D}$ maps tensors into tensors by establishing the equality $\mathcal{D}(\Psi X) = (\mathcal{D}\Psi) X$.

4. Obtain the Bianchi identities in superspace.

5. Let $E^A = dz^M E_M{}^A(z)$ be the vielbein forms that define a local reference frame for differential forms on superspace. Show that the vielbein fields $E_M{}^A$, which are the coefficient functions of the vielbein forms, satisfy

$\delta E_M{}^A = E'_M{}^A(z) - E_M{}^A(z) = -\xi^L \partial_L E_M{}^A - (\partial_M \xi^L) E_L{}^A$

for infinitesimal coordinate changes. Show further that their supergauge transformation is given by (7.6.40).


Hints to Exercise 7.6

1. Note that a function of y (which in form-terminology is a 0-form) corresponds to a function of z in the following sense:

(a) $F(y) = F(y(z)) = F(\phi(z)) = \phi^*(F(z))$.

This definition of <j)*F ensures that an entity in two systems takes the same value at the same point independent of the labeling scheme. As a result we have: (b)

V(y)=dyMi--dyMiWMrMi(y)

= [dz-f^...^f^w^M] <*») = dzNK..dzN« =

(5NMll...d^)WMq...Mi(y(z))

dzN\..dzN«fWNq...Nx{z)

= fC¥(z)). The first property of
* is immediate, it shows that this is a linear mapping. To check (ii) and (iii) we take *F and A as 1- and 2-forms. Accordingly (ii) can be written as: (c)

0*OF(z)AU))=
dzN2W'N2Ni(z))

= ndzNdzN'dz^XN2N,Niz)) where we have put the product of two functions WN(z) and W'N Ni (z) as XN2N1N^)- Then using (b), the RHS becomes (d)

dyNdyNi dyN*

XN^NW-

Rewriting XN^N 60 as WN (y)W'Njfii (^)and noting that dyNWN(y) = W ) and dy^ dyNi = *(A) we have the result. In the case of (iii) we have: (e)

d(pV) = d{dyMi WMi(y))

W'^(y)


= 0VP(z)). 2. Note that in the defining equation of coordinate transformation y = (z) = z - £, the parameter E, is not a constant, it is an infinitesimal variation in z. In the case of 0-form we have: F(y) = f(F(z)) F(zM-$M)

(a)

F(zM)-^NdNF(zM)

=fF(zM) =fF(zM).

Hence f F ( z M ) - F{zM) = 8 F = - $ N dNF{zM). Let the 1-form *P be dyMWM(y) and dzNWN(z) in >> and z coordinates, then we have: (b) dyMWM(y) = f(dzNWN(z)). The LHS of (b) in terms of infinitesimal transformations gives: (c)

dzN^-¥r

WM(y(z)) = dzN ( ^ - - ^ - J (WM(z) - $LdLWM)

= On the RHS the term containing

f(dzNWN(z)).

dz is zero due to its product in the infinitesimals

(d)

N

and ^L, hence on simplification we have:

f(dzNWN(z)) - dzNWN(z) = -dzN(^dLWM)

- dzNf|^

Wu\

3. By definition of covariant derivative we have: (a)

© 4 " = ^ ' + xV'co'

= d(Vx) + Ci'x)(X-lo>X-X-ldx) where we have written *¥' = *¥x ( s e e (7.6.8)) and have used the transformation law (7.6.11) for the connection co. The RHS simplifies to:
d(W) = d(d¥ + Vco) = 0 + d^co)


= Vdca - dV¥) co = *¥% i.e., (b) ©©¥ = TSt or equivalently: (c)

dzMdzNVNVMV = | W T C ^

in view of (7.6.13) and (7.6.14)(c). Bianchi identities of the second type are obtained by taking the exterior derivative of the curvature form, thus from (7.6.14)(a) we have: (d)

d% = wdw - dww = w(f£- ww) - ( X - ww)w = wH^-Xw.

5. We use the results established in the Hint to Exc. 1, more precisely beginning with the defining equation

E\z)=dzMEMA(z) we denote the vielbein forms and the vielbein fields in the transformed coordinates z' = (z) by using a prime, thus we have: (a)

E'(z') = dz'NE'NA(z') M

r(dz»EM*iz» =

^r\E'NA{z')

dzM[^EN\z')^.

This leads to the relation: (b)

r(EMA(z))=~^EN\z').

Using the infinitesimal change z'M' = z*1' - %M(z) in (b) and noting that E'MA when expressed in terms of (z) is actually without primes we obtain:


(c)

f(EA(z)) = [dNM -J^-J(E N A (z) - $LdLEA).

This gives:

fEMA(z) - EMA(z) = SEM\z) = -
(d)

We note that in the transformation (d), the term — ^ -
(e)

5EMA =

EMBLBA{z)

where LBA are Lorentz generators. Taking the transformations of (d) and (e) together, we obtain the transformation of the vielbein field as:

(f)

6EMA = -? dLEMA - dM Z,LEA + EMB LBA = -Z\dLEuA

- {-fMdMEA)

- dMt;A +

EUBLA

= -dMtA - ^L{TLMA - comA + ( - r coMA) + EMBLBA

(g)

where T^ denotes the torsion. We note that the torsion is the covariant derivative of the vielbein EA and can be explicitly written as: f

A _ •}

F

A

( \NM l

p A

,

NW(8

+ M)F

B

A

, -JAB F B

A

The term with connection (i)M^ in (g) combines with dM E,A to give the covariant derivative. Finally we substitute the (special Lorentz) transformation (7.6.35) in the last term of (g) to obtain the supergauge transformation law (7.6.40).(see Chapter 14 in [21].

7 THE BASICS OF INTEGRATION AND CONFORMALITY IN SUPERSPACES

In Sec. 6 we introduced the concept of Lagrangians on the superspace and discussed their invariance under supersymmetry transformations. In this section we continue this discussion for the action integral $S = \int L$, since it is the (invariant) action integral which we always use to determine the key features of a theory. For instance, we shall use $\delta S$ to write the Euler-Lagrange equations of a theory defined by superfields.


In order to achieve this, we equip ourselves with the computational skills required to define these integrals and their variations on superspace. These rules of integration are not the same as in the case of ordinary space, although they are formulated using the same principles: e.g., the invariance of the infinitesimal volume integral under admissible coordinate transformations, the existence of a $\delta$-function similar to the Dirac $\delta$-function, a relation between integration and differentiation, and so forth. Since in this case one set of variables anti-commutes, the rules of integration are different for it; we begin by setting the rules for this variable.

7.1 Integration on Superspace

Definition 7.7.1: Given just one anti-commuting variable $\theta$, the integrals $\int d\theta$ and $\int \theta\, d\theta$ satisfy:

(a) $\int d\theta = 0$ and $\int \theta\, d\theta = 1$
(b) and are invariant under the translation $\theta \to \theta + \varepsilon$, $\varepsilon$ being a constant.    (7.7.1)

(It is assumed here that the boundary makes no contribution.) In view of the above definition, it is easy to note that integration and differentiation are identical. For example, if $f(\theta)$ is a polynomial42 $a + \theta b$, then the following equation is self-evident:

$\int f(\theta)\, d\theta = b = \frac{\partial}{\partial\theta} f(\theta)$    (7.7.2)

The above definition can be used for more than one variable; thus for two variables $\theta^1$, $\theta^2$ we have:

$\int d\theta^1 d\theta^2\, \theta^1 = \int d\theta^1 d\theta^2\, \theta^2 = 0$
$\int \theta^j\, d\theta^i = 0 \quad (i \neq j)$
$\int d\theta^1 d\theta^2\, \theta^2 \theta^1 = 1$    (7.7.3)
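The rules (7.7.1)-(7.7.3) can be tried out in a minimal sketch. It makes several assumptions that are not taken from the text: elements of the Grassmann algebra are stored as dictionaries mapping a sorted tuple of generator indices to an ordinary numeric coefficient, the Berezin integral over $\theta^i$ is implemented as the left derivative with respect to $\theta^i$, and the iterated integral $\int d\theta^1 d\theta^2$ integrates over $\theta^2$ first. All function names are hypothetical.

```python
def sort_sign(indices):
    """Sort generator indices; return (sign, sorted tuple), or (0, ()) if any repeats."""
    idx = list(indices)
    sign = 1
    # Bubble sort: every transposition of two generators flips the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    if len(set(idx)) < len(idx):
        return 0, ()  # theta^i theta^i = 0
    return sign, tuple(idx)


def multiply(f, g):
    """Product of two Grassmann-algebra elements with numeric coefficients."""
    out = {}
    for mono_f, coeff_f in f.items():
        for mono_g, coeff_g in g.items():
            sign, mono = sort_sign(mono_f + mono_g)
            if sign:
                out[mono] = out.get(mono, 0) + sign * coeff_f * coeff_g
    return {m: c for m, c in out.items() if c != 0}


def berezin(f, i):
    """Integrate over theta^i, implemented as the left derivative d/d(theta^i)."""
    out = {}
    for mono, coeff in f.items():
        if i in mono:
            p = mono.index(i)  # moving theta^i to the front costs (-1)^p
            rest = mono[:p] + mono[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * coeff
    return {m: c for m, c in out.items() if c != 0}


def theta(i):
    """The generator theta^i."""
    return {(i,): 1}


def integrate_12(f):
    """Iterated integral over theta^2 first, then theta^1."""
    return berezin(berezin(f, 2), 1)
```

With these conventions `integrate_12(multiply(theta(2), theta(1)))` returns the constant 1 while a single generator integrates to zero, in line with (7.7.3). A different ordering convention for the measure would change overall signs.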

Next, to make the definition of integration on the superspace meaningful, we have to ensure that the volume element

$d^8z = d^4x\, d^2\theta\, d^2\bar\theta = d^4x\, d^4\theta$    (7.7.4)

is invariant under translations. This is so because the superdeterminant of the vielbein $E_M{}^A$ (obtained by the coordinate transformations $z^M \to z'^M$) is 1 (see Sec. 6). We note that the volume elements $d^2\theta$ and $d^2\bar\theta$ in (7.7.4) satisfy:

(a) (b)

j(d29)d2= — J d0l dd1eABeAeB=

— J dQx dO2 (-26l02) = 1

j(d29)d2 = - - J de[de2eki6ddA = J dWeWW = 1

In view of (7.7.2) their integrals can also be expressed as

(a) 42

| d2d =~\d2 = - 1 ^ = -JD 2 = - j ^ Z ) ,

Functions of $\theta$ can only be polynomials due to their nilpotent character.

(7.7.5)


\d2Q =-ld2^--dAdi=--D2

(b)

= --DAD:

(7.7.6)

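The latter two equalities in (7.7.6)(a) and (b) result from the fact that when the integral is over space-time, the derivatives $\partial_A$ and $\bar\partial_{\dot A}$ can be replaced by $D_A$ and $\bar D_{\dot A}$, as the second term in $D_A$ and $\bar D_{\dot A}$ is a total space divergence, and as such its volume integral is zero by Gauss' theorem. We also observe that since, for any arbitrary superfield $\Phi$, the derivative superfields $\partial_A \Phi$ or $\bar\partial_{\dot A} \Phi$ contain terms in $\theta$ and $\bar\theta$ of order not higher than three, the integrals involving them over the superspace are always zero; thus we have:

(a) $\int d^4x\, d^2\theta\, d^2\bar\theta\, D_A \Phi = \int d^4x\, d^2\theta\, d^2\bar\theta\, \partial_A \Phi = 0$

(b) $\int d^4x\, d^2\theta\, d^2\bar\theta\, \bar D_{\dot A} \Phi = \int d^4x\, d^2\theta\, d^2\bar\theta\, \bar\partial_{\dot A} \Phi = 0$    (7.7.7)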

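Using these relations we can write the formula for 'integration by parts' for the product of superfields $D_A \Phi_1$ and $\Phi_2$ as:

$\int d^4x\, d^4\theta\, (D_A \Phi_1)\, \Phi_2 = \left[ \int d^4x\, d^4\theta\, (D_A \Phi_1) \right] \Phi_2 - \int d^4x\, d^4\theta\, \Phi_1 D_A \Phi_2 = -\int d^4x\, d^4\theta\, \Phi_1 D_A \Phi_2$    (7.7.8)

7.2 Variation of a Superfield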

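A Grassmann variant of the Dirac $\delta$-function is defined as:

$\int f(\theta)\, \delta(\theta)\, d\theta = f(0)$    (7.7.9)

which gives:

$\delta(\theta) = \theta, \qquad \int \delta(\theta)\, d\theta = 1 \quad \text{and} \quad \int \theta\, \delta(\theta)\, d\theta = 0$    (7.7.10)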
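The statements in (7.7.9) and (7.7.10) can be checked in a one-variable sketch. It assumes that a function $f(\theta) = a + \theta b$ is stored as the pair `(a, b)` with ordinary commuting numbers a and b; the function names are hypothetical.

```python
def multiply(f, g):
    """Product of f = a + theta*b and g = c + theta*d, using theta*theta = 0.

    The coefficients are assumed to be ordinary (commuting) numbers.
    """
    a, b = f
    c, d = g
    return (a * c, a * d + b * c)


def integrate(f):
    """Berezin integral with int dtheta = 0 and int theta dtheta = 1, cf. (7.7.1)."""
    return f[1]


def evaluate_at_zero(f):
    """Value f(0) of f = a + theta*b."""
    return f[0]


# delta(theta) = theta, cf. (7.7.10)
DELTA = (0, 1)
```

For any pair `f`, `integrate(multiply(f, DELTA))` returns `evaluate_at_zero(f)`, which is the defining property (7.7.9). The other two relations in (7.7.10) follow from the same two functions.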

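For the general superspace the $\delta$-function has the property:

$\int d^8z'\, f(z')\, \delta^8(z - z') = f(z)$    (7.7.11)

where $d^8z' = d^4x'\, d^2\theta'\, d^2\bar\theta'$ and $\delta^8(z - z') = \delta^4(x - x')\, \delta^2(\theta - \theta')\, \delta^2(\bar\theta - \bar\theta')$. Due to the anticommuting nature of $\theta$ and $\bar\theta$, it is evident that $\delta^2(\theta)\, \delta^2(\bar\theta)$ equals $\theta^2 \bar\theta^2$. Finally, to obtain the variation $\delta S$ we have to define the $\delta$-variation of a superfield. In the case of a simple (scalar) superfield $\phi$ ($\phi(z) = \phi'(z')$) we have by definition:

$\frac{\delta\phi(x, \theta, \bar\theta)}{\delta\phi(x', \theta', \bar\theta')} = \delta^4(x - x')\, \delta^2(\theta - \theta')\, \delta^2(\bar\theta - \bar\theta')$    (7.7.12)

Thus for a function $f(\phi)$ (of this superfield $\phi$) we have the functional derivative:

$\frac{\delta}{\delta\phi(x, \theta, \bar\theta)} \int d^8z'\, f(\phi(z')) = \frac{\partial f}{\partial\phi}\big(\phi(x, \theta, \bar\theta)\big)$    (7.7.13)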


In order to define the functional derivative with respect to a chiral superfield O, we have to ensure that the 5-derivative mentioned above maintains the defining condition D^O = 0, i.e., it satisfies: 5*(JC, e, e)

D

(7714)

*8*(x',e:-e')=°

hence we define the variation of O as: 7)2

§<j)(x 8 6)

WPTWW)

m

_

X )SH

"T^" '

_

W)

" ~ *#<*"

( 15)

"

From earlier Sections (5, 6) we know that basically all Lagrangians of interest (e.g., those of Yang-Mills, gravity and strings) on the superspace can be formulated in terms of superfields of the type $\phi$ and $\Phi$; hence the variation rules given above can be applied to study the actions formed with these Lagrangians. An example of such an action is given below:

$S = \int d^4x\, d^2\theta\, d^2\bar\theta\, L(V, \Phi) + \left[ \int d^4x\, d^2\theta\, W(\Phi) + \text{h.c.} \right]$    (7.7.16)

where V is a general superfield, $\Phi$ is a chiral multiplet and $W(\Phi)$ is the superpotential defined in Sec. 6.2. (See Exp. (7.7.3) given at the end of this section for illustrations of these discussions.) In conclusion we sum up the distinction between ordinary integration and the one we are dealing with here in the following remark.

Remark 7.7.1:

(1) While ordinary indefinite integration is the inverse of differentiation, this integration over $\theta$ is the same thing as differentiation. (2) The normalization $\int \theta\, d\theta = 1$ implies that $\theta$ and $d\theta$ have opposite dimensionalities. (3) The supersymmetry transformation law* $\theta \to \theta + \varepsilon$ and $x^\mu \to x^\mu + i\theta\tau^\mu\bar\varepsilon$ suggests that $\theta$ and $\varepsilon$ have the same dimensionality whereas $x^\mu$ has the square of that dimensionality; thus if L (the dimension of length) denotes the $x^\mu$-dimensionality, the dimensionality of $\theta$ and $d\theta$ is $L^{1/2}$ and $L^{-1/2}$ respectively. (4) From our discussion in (3), it follows that in the case of integration on the superspace, each (Bose) coordinate x increases the dimension of the superspace by one but each $\theta$ coordinate decreases it by one half.

The above remark explains why in supersymmetric quantum field theories, where integrals sum over both Bose and Fermi dimensions, the convergence improves, very often leading to finite results. We explain some of these ideas in the Hint to Exc. 2. Next we devote our attention (rather briefly) to the concept of superconformality.
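The dimension counting of Remark 7.7.1 can be written as a small bookkeeping sketch. It assumes units in which each $x^\mu$ carries one power of length, so that $\theta$ carries $L^{1/2}$ and $d\theta$ carries $L^{-1/2}$; the constant and function names are hypothetical.

```python
from fractions import Fraction

DIM_X = Fraction(1)         # [x] = L^1, and likewise [dx] = L^1
DIM_THETA = Fraction(1, 2)  # [theta] = L^(1/2)
DIM_DTHETA = -DIM_THETA     # [d theta] = L^(-1/2), since int theta dtheta = 1


def measure_dimension(n_bose, n_fermi):
    """Power of L carried by the measure d^{n_bose}x d^{n_fermi}theta."""
    return n_bose * DIM_X + n_fermi * DIM_DTHETA
```

For example, `measure_dimension(4, 4)` gives 2 for the full measure $d^4x\, d^2\theta\, d^2\bar\theta$ and `measure_dimension(4, 2)` gives 3 for the chiral measure $d^4x\, d^2\theta$, compared with 4 for $d^4x$ alone.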

* Note that in the supersymmetry transformation of $x^\mu$ we are taking only one term, $i\theta\tau^\mu\bar\varepsilon = -i\bar\varepsilon\bar\tau^\mu\theta$, in place of the usual two terms $i(\theta\tau^\mu\bar\varepsilon - \varepsilon\tau^\mu\bar\theta)$, since we are considering only $\theta \to \theta + \varepsilon$. (See the Hint to Exc. 7.5.6.)

7.3 Superconformal Transformations

In this subsection we describe in brief the concept of conformality on superspace. In Sec. 1.4 we already defined a conformal transformation on the Minkowskian/Euclidean space (see (1.4.1)) and then went on to derive the conformal algebra in two dimensions by using the light-cone coordinates (see (1.4.21) and (1.4.25)-(1.4.26)). We would like to generalize the equality (1.4.21) to the present case. For this we note that in the case of superspace the supervielbeins play an important role; it is therefore natural to define a superconformal transformation as follows:

Definition 7.7.2: Consider a coordinate reparametrization on a general (8-dimensional) supermanifold $z^\Pi \to z'^\Pi = z^\Pi + \xi^\Pi(z)$, and let $E_\Pi{}^N$ 43 be the corresponding supervielbeins that transform under the supergeneral coordinate transformation as (see Exc. 6.5):

$\delta E_\Pi{}^N = \xi^\Lambda \partial_\Lambda E_\Pi{}^N + \partial_\Pi \xi^\Lambda E_\Lambda{}^N$    (7.7.17)

Then in some restricted sense the transformation44: EAK - > eLEA\

E*l -> eL* E%

(7.7.18)

where L (in the exponential $e^L$) is a general superfield, is called a superconformal transformation. It can be checked that the variation (7.7.17) is invariant under this transformation, and that it has the property of closure. We note that the above transformation (7.7.18) needs a simplification in order to write a superconformally invariant field theory. To this end, we let $Z = (z; \theta)$ and write down the most arbitrary transformation of this pair as:

$\tilde Z(Z) = (\tilde z(z, \theta),\ \tilde\theta(z, \theta))$    (7.7.19)

and then impose constraints on the system to finally arrive at a superconformal transformation. Since the variables are just two in number, we can write the supersymmetry derivative as:

$D = \frac{\partial}{\partial\theta} + \theta \frac{\partial}{\partial z}$    (7.7.20)

It can be checked that $D^2$ satisfies:

$D^2 = \frac{\partial}{\partial z}$    (7.7.21)

The reparametrization (7.7.19) then leads to the transformed supersymmetric derivative $\tilde D$ given by:

$D = (D\tilde\theta)\tilde D + [D\tilde z - \tilde\theta D\tilde\theta]\tilde D^2$    (7.7.22)
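The relation $D^2 = \partial/\partial z$ of (7.7.20)-(7.7.21) can be verified on explicit superfields with a short sketch. It rests on assumptions that are not part of the text: a superfield is written as $F = f(z) + \theta\, g(z)$ with $\theta$ standing to the left, f and g are polynomials stored as coefficient lists, and the $\theta$-derivative is a left derivative. The function names are hypothetical.

```python
def derivative(poly):
    """d/dz of a polynomial given as a coefficient list [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(poly)][1:] or [0]


def D(field):
    """Apply D = d/dtheta + theta d/dz to F = f + theta*g, stored as (f, g).

    d/dtheta removes theta from theta*g and leaves g. The term theta*d/dz gives
    theta*f', while theta*theta*g' vanishes. Hence D F = g + theta*f'.
    """
    f, g = field
    return (g, derivative(f))


def d_dz(field):
    """Apply d/dz to both components of F = f + theta*g."""
    f, g = field
    return (derivative(f), derivative(g))
```

For a sample field such as `F = ([1, 2, 3], [4, 5])`, applying `D` twice gives the same pair as `d_dz(F)`, which is the content of (7.7.21) in components.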

We shall achieve our objective of defining a conformal transformation if the above relation yields a viable composition law. This, however, is not possible since (7.7.22) is a highly non-linear equation. To remove this non-linearity we constrain our transformation by choosing:

(a) $D = (D\tilde\theta)\tilde D$
(b) $D\tilde z - \tilde\theta D\tilde\theta = 0$    (7.7.23)

43 We used the index letter M in place of $\Pi$ and A in place of N in Sec. 6 (see Eq. (7.6.28)) and a minus sign in the reparametrization; hence the index N here transforms under the tangent space group, i.e. it is Lorentzian. The index $\Lambda$ is a superspace world index.

44 We have written this transformation for the N = 1 supergravity case.


Using these assumptions it can be shown that the composition law for two superconformal transformations Z - > Z -» Z closes properly. In view of the above analysis it follows from (7.7.23) that a field transformation is superconformal if it satisfies:

• dZ

(7.7.24)

(Z) = $(z)(Ddf" where h is the conformal weight per variable. Note the similarity between above equation and the Eq. (1.4.29) except that conformal weight now is defined in terms of fermionic coordinate 6. (See [21] for details). We shall pursue these ideas in later chapters while studying the specific field theories. Next we given an example to show how rules of integration regarding superspace coordinates 8, 6 (known in short as Berezin integral rules) can be used to write a Lagrangian on superspace. We then go on to obtain Euler-Lagrange equations. Example 7.7.3: Let O and O + denote the chiral and antichiral superfields, and let ym = x"' + iOxm 6, then and O+ can be expressed in terms of component fields A, \ff and F introduced in (7.5.15)(7.5.22), thus: (a) O = A(y) + 4l ey/iy) + 96F(y) ®+ = A*(y+) + 4l

(b)

9y(y+)+

90F*(y+)45

(7.7.25) +

We note that unlike the product of two chiral superfields, the product O O is not a chiral superfield. We write this product explicitly in terms of A, y and F for two different antichiral and chiral superfields Ojand Oy: <&t<&, =A*(x)Aj(x)+ J2

6\I/J(X)A*(X)

+ ddA*(x)Fj(x) +

+

-J2dy/i(x)Aj(x)

edF)(x)Aj(x)

+ erea [ - ^ xa&m {Ajd^j - d^Aj) - 2Wi&yja] + eeea[-^

xj"

(A^,,^

+ eeea[--j=zat"i{w-?dmAj 45

-

C?,,,AV7

-

^W^]

- , ? m ^ . ) + V2/?>,.«]

(7.7.26)

Where y+ is the Hermitian conjugate of y, for instance (ym)+ = xm- id'fd. Note that the expression in (7.7.26) is written after using the Taylor's expansion (7.5.32) for <J>.


eede^F;Fj+^(A;DAj+DA*A.)-^(dmA;dmAj-idmWirm¥j+iYiridm¥j)^

+

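We remark that the $\theta\theta\bar\theta\bar\theta$ component is a spacetime derivative, a fact which has important implications in writing a supersymmetric renormalizable Lagrangian. Recall that in (7.6.16) we wrote the Lagrangian in terms of chiral superfields $\Phi_i$ by using only the $\theta\theta\bar\theta\bar\theta$ component of the product $\Phi_i^+ \Phi_i$, as:

$L = \Phi_i^+ \Phi_i \big|_{\theta\theta\bar\theta\bar\theta\ \text{component}} + \left[ \left( \tfrac{1}{2} m_{ij} \Phi_i \Phi_j + \tfrac{1}{3} g_{ijk} \Phi_i \Phi_j \Phi_k \right) \Big|_{\theta\theta\ \text{component}} + \text{h.c.} \right]$    (7.7.27)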

Also, using the transformations (7.6.17) we showed there that this was the most general supersymmetric renormalizable Lagrangian. From our discussions in Sec. 6 and Sec. 7 and in the light of our above remark, this renormalizability character of L is quite apparent. This is more so when we express it as an integral. Using the Berezin integral and in particular (7.7.10) we can write L as an integral over the superspace, thus:

+ ^(ApVpfrSie)

+ A]jk^^jOk8(d))}d26d26

(7.7.28)

If instead of considering a set of interacting superfields to write the Lagrangian, we wanted to write just for one superfield, then L denoted as Lo would be: L o = J J +8(d))\d2d

d26

(7.7.29)(a)

We call it the free-field part of the Lagrangian (7.7.27). In terms of component fields it becomes: L0=A*UA

+ idmxj/rm W + F*F + m(AF + A*F* - — (y/y/+ y/yr))

(7.7.29)(b)

Also, in view of identities established in Exc. (7.5.8), we can write (7.7.29)(a) as: Lo=

j|o+d>-|mfo^-O + O+^-O+lJjV40

(7.7.30)

The Euler-Lagrange equations are now obtained by varying Xo according to the rules given in (7.7.12)(7.7.15). Written in a matrix form they are:

_lp2

oV-i^DD

i yon

(7.7.31)

Since

±D^L^ 16

=

^

if

DO = 0

LJ

(i.e. D2D2/16D is a projection operator on chiral field <&)

(7.7.32)


— 1 it follows that D 16

D2D2 —— =0 , hence we can simplify the matrix form to obtain: LI /M<J>- ±-DDQ>+= o

4 m$>+ - — DD<E> = 0

(7.7.33)

These are the field equations for a massive scalar multiplet. Returning to (7.7.29)(a), we emphasize that due to the integration rules all components of $\Phi^+\Phi$ except the $\theta\theta\bar\theta\bar\theta$ component vanish, and thus what we obtain here is the familiar kinetic energy term, along with the second term which corresponds to the potential. This term (as we already mentioned) is called the superpotential of the theory.

Exercise 7.7 1. Show that for chiral superfields <J>, and <E>2 m e superspace integral satisfies:

j d2e d\ $ r $ 2 = -j d4ed\ fci-^2-. 2. Show that for a non relativistic superpoint particle described by m scalar superfields %l(t, t), ..., %'"(t, t), the supersymmetric action integral can be expressed as:

S= — j dt(xaxa+

i^O")

where

xa(t, T) = xa(t) + e\t)x. Also obtain the equations of motion in this case. 3. Verify the equality

where

do

dz

4. Establish (7.7.22).

Hints to Exercise 7.7 1. In an earlier section we have already seen that covariant derivatives satisfy (see Hint to Exc. 7.5.8)): (i)

D2D2D2=

1692D2,

D2D2D2

(we have written here d2 in place of • ) .

=

!6d2D2,


In particular for a chiral superfield O (£^ 2 D 2 O= 16
(in)

J d2e d4x *,- * 2 = J d2e d\ o, ( D ^ 2 ° 2 )

1

9

In view of (7.7.6)(b) we now replace - — D 4

f

by

9—

d 6 to obtain: J

J d2e dAx$>x •
2. The mathematical content of this exercise can be viewed as a generalization of the point particle field theory of classical mechanics, where the space dimension is zero and the time dimension is 1, and the coordinates of a particle are scalar fields $x^1(t), \ldots, x^m(t)$ of the time variable t. In the case of a superpoint particle field theory, one considers scalar superfields

$X^1(t, \tau), \ldots, X^m(t, \tau)$

in the (1,1) superspace with coordinates $(t, \tau)$ and of course uses all the set-up of supersymmetry theory (e.g., supersymmetry generators, resulting differential operators and supersymmetry algebra) to write the action integral. This approach is due to Friedan and Windey [8] and has been utilized since then with great advantage in string theory [2], [10]. In view of the discussions in Sec. 5, each scalar superfield $X^a(t, \tau)$ can be expressed as:

(i) $X^a(t, \tau) = x^a(t) + \theta^a(t)\tau$.

Note that $x^a$ and t are commuting elements whereas $\theta^a$ and $\tau$ are anti-commuting. From (7.5.6)(a) we know that there are just two supersymmetry generators; we denote them as Q and H, thus46:

(ii) $Q = i\tau \frac{\partial}{\partial t} - \frac{\partial}{\partial\tau}, \qquad H = i\frac{\partial}{\partial t}$

The corresponding superalgebra of left-supertranslations and time-translations can easily be checked to satisfy:

(iii) $\{Q, Q\} = 2Q^2 = -2H, \qquad [Q, H] = [H, H] = 0.$

46 The sign of the term containing $\partial/\partial t$ differs from what we have in (7.5.6)(a).


Evidently the right-supertranslation operator is:

(iv) $D = i\tau \frac{\partial}{\partial t} + \frac{\partial}{\partial\tau}$.

This leads to:

(v) $\{D, D\} = 2D^2 = 2H, \qquad [D, H] = 0$

and

(vi) $\{Q, D\} = 0$ (see Eq. (7.5.9)). The last of these relations helps us to construct the action:

(vii) $S = \int dt \int d\tau\, L$

where

(viii) $L = \tfrac{1}{2}\, DX^a\, D(DX^a)$.

We show next that the above action is supersymmetric. For this we establish that:

(ix) $\delta S = \int dt\, d\tau\, \delta L$ = 'surface' term.

Under the supersymmetry transformation of the parameter $\epsilon$ (see (7.5.16)), the infinitesimal variation of the superfields $X^a$ in this simple case is given as:

(x) $\delta X^a = \epsilon Q X^a = \epsilon\left(i\tau\frac{\partial}{\partial t} - \frac{\partial}{\partial \tau}\right)\left(x^a(t) + \theta^a(t)\,\tau\right)$.

Since $\epsilon$ is Fermi like $Q$, from (vi) we have $[\epsilon Q, D] = 0$. Using this we obtain $\delta DX^a = \epsilon Q DX^a = D\epsilon Q X^a = D\,\delta X^a$, which shows that $D$ commutes with $\delta$, and therefore (quite appropriately) $D$ is a covariant derivation. This implies that the variation in $L$ is due to the variation in $X^a$; accordingly, in view of (x), we have:

(xi) $\delta L = \epsilon Q L = \epsilon\left(i\tau\frac{\partial}{\partial t} - \frac{\partial}{\partial \tau}\right)L$.

The first term on the RHS, being an exact time derivative, does not contribute to the variation, while the second term $\partial L/\partial\tau$ is $\tau$-independent (as $L$ is at most linear in $\tau$) and therefore, from our rule of integration on Fermi coordinates, its $\tau$-integral is zero. Hence the variation of $S$ is as given in equality (ix), which confirms that the action principle is supersymmetric. Using (i) to write $\delta X^a$ in terms of components, as well as the LHS of (x), we have upon simplification:

(xii) $\delta x^a + \delta\theta^a\,\tau = i\epsilon\tau\dot{x}^a + \epsilon\theta^a$

(note that $i\epsilon\tau\dot{\theta}^a\tau$ on the RHS of Eq. (x) is the product of four Fermi objects and therefore it is zero), which gives the variation in component form

$\delta x^a = \epsilon\theta^a$ and $\delta\theta^a = i\epsilon\dot{x}^a$

and as such defines the supersymmetry transformation laws of the system in terms of components. We now use (iv) to write

$DX^a = -\theta^a + i\dot{x}^a\tau$ and $DDX^a = i(\dot{x}^a + \dot{\theta}^a\tau)$

to express the action $S$ in component form as:

(xiii) $S = \frac{i}{2}\int dt \int d\tau\, (-\theta^a + i\dot{x}^a\tau)(\dot{x}^a + \dot{\theta}^a\tau) = -\frac{1}{2}\int dt \int d\tau\, \left[\tau\,(\dot{x}^a\dot{x}^a + i\theta^a\dot{\theta}^a)\right] = -\int dt \left(\tfrac{1}{2}\dot{x}^a\dot{x}^a + \tfrac{i}{2}\theta^a\dot{\theta}^a\right)$

where we have used (7.7.1)(a) to arrive at the final form of $S$ in (xiii). We note that the first term on the RHS is the kinetic energy term of an ordinary point particle with mass $m = 1$; the second term $\tfrac{i}{2}\theta^a\dot{\theta}^a$ is a kinetic piece due to the superpoint particle's Grassmann degrees of freedom. The equations of motion are obtained by varying the action $S$ in (xiii); these are easily seen to be:

$\ddot{x}^a = 0, \qquad \dot{\theta}^a = 0$.
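The anticommutator $\{Q, Q\} = -2H$ quoted in Hint 2 can be checked by hand. The following is only a sketch: it assumes the operator forms $Q = i\tau\,\partial_t - \partial_\tau$ and $H = i\,\partial_t$ as read from relation (ii), an arbitrary test superfield $f(t, \tau)$ (introduced here for illustration), and the usual Grassmann rules $\tau^2 = 0$, $\partial_\tau^2 = 0$ and $\partial_\tau(\tau g) = g - \tau\,\partial_\tau g$.

```latex
\begin{align*}
Q^2 f &= \left(i\tau\,\partial_t - \partial_\tau\right)
         \left(i\tau\,\partial_t f - \partial_\tau f\right) \\
      &= -\,i\tau\,\partial_t\partial_\tau f
         \;-\; \partial_\tau\!\left(i\tau\,\partial_t f\right)
         && (\tau^2 = 0,\ \partial_\tau^2 = 0) \\
      &= -\,i\tau\,\partial_t\partial_\tau f
         \;-\; i\,\partial_t f
         \;+\; i\tau\,\partial_\tau\partial_t f \\
      &= -\,i\,\partial_t f \;=\; -H f ,
\end{align*}
% hence {Q, Q} = 2 Q^2 = -2H.
```

Under the same assumptions, repeating the computation with $D = i\tau\,\partial_t + \partial_\tau$ flips the sign of the surviving term and gives $D^2 = +H$, in line with relation (v).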

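3. We simplify the product $D^2 = D \cdot D$ of the operator $D = \frac{\partial}{\partial\theta} + \theta\frac{\partial}{\partial z}$, taking into consideration that $\theta$ and $z$ are independent variables, and that $\theta$ is an anticommuting variable. The first one gives the relation between their derivational operators as:

$\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial z}\right) = \frac{\partial}{\partial z}\left(\frac{\partial}{\partial\theta}\right)$

and the second implies

$\frac{\partial}{\partial\theta}\left(\frac{\partial}{\partial\theta}\right) = 0$.

Hence writing the product term by term we have:

(i) $D^2 = \frac{\partial^2}{\partial\theta^2} + \frac{\partial}{\partial z} + \theta\left(\frac{\partial}{\partial z}\frac{\partial}{\partial\theta} - \frac{\partial}{\partial\theta}\frac{\partial}{\partial z}\right) + \theta^2\frac{\partial^2}{\partial z^2}$.

In view of the above, only the second term on the RHS is non-zero. Hence we have:

(ii) $D^2 = \frac{\partial}{\partial z}$.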

4. To verify the equality

D= (D6)D + [Dz -

9D6]D2

we note that

DS(-L

[dO

+

e4-)

dz)

and

D2=^-. dz

We now express the operator D on the LHS in terms of ——, —— and solve the RHS as it is.

36

dz

This gives

[d6

dz)

[do dz

(i)

de de)

{dz dz

RHS m(§ + ef}[*+9*) {dO

dz ){d9

+\§ dz)

[90

dz dej

+ e?L-J§ dz

[d8

+

ef)]±

dz ) \ dz

dd d .de 2 d +ftde d ,ftdd 2 d d9 d6 de dz dz de dz dz

[d6 dz

dz dz

d6 dz

dz dz j

The second and fourth terms on the RHS cancel with the seventh and eighth terms on the RHS and the remaining four are easily seen to be the same as on the LHS, hence the equality holds.


APPENDIX 7A

A.0 Notations and Pauli Matrices

We list below the properties of Pauli matrices that are extensively used in transforming the 4-component spinors into 2-component ones and conversely. Our notations are mostly based on West's book [21] with a few changes to suit our thinking. Usually the Latin capitals $A, \dot{A}$ are used to denote the 2-component spinors $\chi_A$, $\chi_{\dot{A}}$, etc., belonging to the $(\tfrac{1}{2}, 0)$, $(0, \tfrac{1}{2})$ representations of the Lorentz group, while the Greek indices $\alpha, \beta, \dot{\alpha}, \dot{\beta}$ are used for 4-component spinors. The lower case Latin indices, or Greek indices $\mu, \nu$, denote the space-time; thus, for instance, the Minkowskian metric $(-1, 1, 1, 1)$ is $\eta_{mn}$ or $\eta_{\mu\nu}$, where $m, n$ or $\mu, \nu$ take the values 0, 1, 2, 3. The $\epsilon$-symbols used in Lorentz transformations are the invariant tensors:

$\epsilon^{0123} = -\epsilon_{0123} = 1$ (A.1)

and eAB = ^B = -eAB=-eAB,

el2 = +1

(A.2)

The summation amongst these tensors is governed by the rule: eCBeBA

= - £ A B e C B = -dAc

(A.3)

which emphasizes their universality of usage.47 Using these we can raise and lower the indices of spinors as follows: XA = XBeBA,

(a)

XA=eABXB

(A.4) Consider the triplet: T = (T 1 , T 2 , T 3 ) formed by Pauli matrices. It can be checked that the matrices defined by (T'")AB

= (I,

r)AB

(A.5)

are invariant tensors under the Lorentz group. Note that using the tensors eAC, e • • we can lower the indices of (x'")AB. On the other hand, taking the complex conjugate of these matrices and using the defining relation of dotted indices (see Ftn. 15) we have:

(A.6) But Pauli matrices are self-conjugate, hence we have: (O, 47

B

= (T m ) B A

(A.7) f 0

In view of the fact that for A, B = 1, 2, eAB = -eBA and en = 1 we also use the matrix: ^ = this tensor. ^

A to denote '

where

( f ' « ) ^ = (-1, i)hA

(A.8)

m

Using the metric rf it can also be checked that (rm) = (-l, T) and (f J = (1, T)

(A.9)

These matrices with tensorial notations satisfy the following relations and identities:

(a)

( O ^ ( T , , ) ^ = 5 n m ^ + (T n "y c

(b)

( T " ) * (*«/* = «;«£ + (?;**

(C)

(T"')^(T")^ C = T]mn8A + ( T m Y c

(d)

\emnpg(Tpi)AB=i(Tmn)Ah

(a)

(r>")A"(Tm)dD

(b)

(Tm)AB(r»>)c* = +2eACe™

(A.10)

= 28AD8Bc (AM)

where the doubly indexed matrices are: (a)

iK)Ac=\i^n-^m)Ac

(b)

( T ? ) / = }(T m T n -T n T m )/

(c)

(T m Y c = | (T'^(f m )^ - Xmch (T"fA)

(A.12)
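The basic algebraic facts about the Pauli triplet $\tau = (\tau^1, \tau^2, \tau^3)$ that the identities above rely on (self-conjugacy, $\{\tau^i, \tau^j\} = 2\delta^{ij}$ and $\tau^1\tau^2 = i\tau^3$) can be spot-checked numerically. The sketch below is only illustrative: it assumes the standard matrix representation of the Pauli matrices, ignores the dotted and undotted index placement used in this appendix, and all helper names (`matmul`, `dagger`, and so on) are hypothetical.

```python
# Standard 2x2 Pauli matrices as nested tuples (an assumed representation).
I2 = ((1, 0), (0, 1))
T1 = ((0, 1), (1, 0))
T2 = ((0, -1j), (1j, 0))
T3 = ((1, 0), (0, -1))
PAULI = (T1, T2, T3)


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))


def equal(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(
        abs(complex(a[i][j]) - complex(b[i][j])) < tol
        for i in range(2)
        for j in range(2)
    )


# Self-conjugacy: each Pauli matrix equals its conjugate transpose.
hermitian = all(equal(dagger(t), t) for t in PAULI)

# Anticommutator {tau_i, tau_j} = 2 * delta_ij * identity.
anticomm_ok = all(
    equal(
        add(matmul(PAULI[i], PAULI[j]), matmul(PAULI[j], PAULI[i])),
        scale(2 if i == j else 0, I2),
    )
    for i in range(3)
    for j in range(3)
)

# Cyclic product: tau_1 tau_2 = i tau_3.
cyclic_ok = equal(matmul(T1, T2), scale(1j, T3))

print(hermitian, anticomm_ok, cyclic_ok)
```

Plain tuples are used rather than a numerical library so that the check has no dependencies; the same tests carry over directly to any matrix package.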

A.1 Standard Bases and Components of a Supervector

Every supervector space having finite total dimension has a basis that is both pure and real (see Sec. 4). But very often it is convenient to work with pure bases for which the c-type basis supervectors are real and the a-type supervectors are pure imaginary. Such bases, called the standard bases, can be constructed from pure real bases by multiplying all the a-type basis supervectors by $i$. A standard basis is characterized by:

${}_i e^* = (-1)^i\, {}_i e$ (A.13)

Thus if $X$ is a real supervector with components $X^i$ with regard to a standard basis, then

$X^i\, {}_i e = X = X^* = {}_i e^*\, X^{i*} = (-1)^i\, {}_i e\, X^{i*}$ (A.14)

If $X$ is c-type, then $X^i$ is c-type or a-type according as the index $i$ is c-type or a-type. When $X$ is a-type, the type association is reversed and we therefore have (see Eq. (7.4.23) for exponents of $(-1)$):

$X^i\, {}_i e = (-1)^{iX} X^{i*}\, {}_i e$ (A.15)

which implies

$X^{i*} = (-1)^{iX} X^i$ ($X$ real) (A.16)

If $X$ is a real c-type supervector, then all its components, with respect to a standard basis, are seen to be real. Conventionally, if $X$ is a c-type supervector, then

${}^i X \stackrel{\rm def}{=} X^i$ (A.17)

A.2 Contravariant Vector-fields on Supermanifold M

Let $\mathcal{F}(M)$ denote the set of all scalar fields over a supermanifold $M$, i.e., the set of all differentiable mappings $f: M \to \Lambda_\infty$. We note that for $f, g \in \mathcal{F}(M)$, $\alpha \in \Lambda_\infty$ and $p \in M$, $(f + g)(p) \stackrel{\rm def}{=} f(p) + g(p)$, $(\alpha f)(p) \stackrel{\rm def}{=} \alpha[f(p)]$, $(f\alpha)(p) \stackrel{\rm def}{=} [f(p)]\alpha$ and $f^*(p) \stackrel{\rm def}{=} [f(p)]^*$; hence $\mathcal{F}(M)$ is a supervector space. Clearly if $m \neq 0$ in $(m, n)$, the dimensionality of $M$, the set $\mathcal{F}(M)$ is infinite-dimensional and thus has no basis of finite total dimension. However, one can always construct a subsupervector space of $\mathcal{F}(M)$ by choosing a finite set $\{e^A\}$ of linearly independent scalar fields and using it as a basis of this space. We note that $\mathcal{F}(M)$, in addition to having the supervector space structure, also has the property:

$(fg)(p) \stackrel{\rm def}{=} [f(p)][g(p)]$ (A.18)

This property can be used to construct more complicated structures with the help of $\mathcal{F}(M)$. Let $\{e^A\}$ be a $(p, q)$-dimensional subbasis of $\mathcal{F}(M)$, and let $F$ be a differentiable mapping from the subspace $\times_A\, e^A(M)$ of $\mathbf{C}_c^p \times \mathbf{C}_a^q$ to $\Lambda_\infty$. Every such mapping defines a scalar field $F$ given by $F(p) \stackrel{\rm def}{=} F(e(p))$,

for all p in M

(A.19)

A mapping $X$ from $\mathcal{F}(M)$ to itself, written as

$X(f) \stackrel{\rm def}{=} Xf$, for all $f$ in $\mathcal{F}(M)$ (A.20)

is called a contravariant vector field over $M$ if it satisfies the chain rule (see Subsec. 4.2):

$(XF)(p) = [(Xe^A)(p)]\left.\left(\frac{\partial}{\partial y^A}F(y)\right)\right|_{y = e(p)}$ (A.21)

for all differentiable mappings $F: \times_A\, e^A(M) \to \Lambda_\infty$, for all pure finite-dimensional subbases $\{e^A\}$ of $\mathcal{F}(M)$, and for all $p$ in $M$.

A.3 Super Lie Groups

In Subsec. 5 of (7.4) we mentioned that a supergroup can be constructed once its Lie algebra is known. We pursue this approach in the case of a super Lie group. To begin with, we remind the reader that if the postulates of Def. (2.1.1) of a group $G$ are supplemented by the following two postulates:

(i) $G$ is a supermanifold whose points are the group elements, and
(ii) the binary operation (multiplication mapping) denoted $F$ is differentiable,

then $G$ is a super Lie group. Just as in the case of ordinary Lie groups, the super Lie group $G$ possesses the following features:

(a) the left and right translations $x_L$ and $x_R$ for all points $x$ in $G$, and their derivative mappings denoted $x'_L$ and $x'_R$ (known as left and right draggings);
(b) the left- and right-invariant vector fields (i.e., vector fields which are invariant under left and right draggings respectively) denoted $X_L$ and $X_R$, which satisfy $X_{Le} = X_{Re}$;
(c) two distinguished super Lie algebras formed by the sets $X_L(G)$ and $X_R(G)$ (isomorphic to $T_e(G)$, the tangent space at the identity $e$ of $G$) of all left- and right-invariant vector fields; the sets have a supervector space structure given by a bracket operation that satisfies the super Jacobi identity, and the Lie brackets formed by $X_L$ and $X_R$ are related as $[X_L, Y_L]_e = -[X_R, Y_R]_e$;
(d) the left- and right-invariant local frame fields;
(e) the left and right auxiliary functions (see Hints to Exc. 4 of (2.5) for these functions on a Lie group) that follow from left- and right-invariant vector fields and their derivatives.

It can be shown that a knowledge of either of the auxiliary functions is sufficient to determine the multiplication mapping and thus the group $G$ itself. Now in a canonical coordinate system, the auxiliary functions are completely determined by the structure constants, while the structure constants are defined by the super Lie algebra, 48 hence the group $G$ is completely determined in a canonical chart by its own super Lie algebra. This establishes the claim made in (7.4).

A.4 Conventional Super Lie Groups

A super Lie group $G$ is called conventional if a standard basis $\{e_a\}$ can be introduced in $T_e(G)$ with respect to which the souls of all the structure constants vanish. In such a basis, which is also called conventional, the only nonvanishing structure constants are of the type $c^{a}{}_{bc}$, $c^{\alpha}{}_{\beta c}$, $c^{\alpha}{}_{b\gamma}$ or $c^{a}{}_{\beta\gamma}$; the first three here represent ordinary real numbers and the fourth ordinary imaginary numbers. A super Lie algebra is called conventional if it is the super Lie algebra of some conventional super Lie group. The structure of a conventional super Lie algebra can be fully determined by using only real vectors in $T_e(G)$. The components of these in a conventional basis have vanishing souls. One of the simplest examples of a super Lie group is the group formed by all nonsingular $(m, n) \times (m, n)$ matrices $M$ with elements $x^a{}_b$ having the reality and type properties of the components of a real c-type rank (1,1) tensor. The group is evidently of dimension $(m^2 + n^2, 2mn)$ (see Sec. 1).

A.5 Exponential Mapping Let Ee(G) denote the subspace of Te(G) consisting of all the real c-type contravariant vectors at the identity element e of G. For all s in R c and all X in Ee(G), the mapping defined as 48

' Let {
$\exp(sX) \stackrel{\rm def}{=} x_X(s)$ (A.22)

is called the exponential mapping from $E_e(G)$ to $G$ and is denoted 'exp.' The mapping helps to define canonical coordinates $e^a \exp^{-1}(x)$ in $G$, where $\{e^a\}$ is dual to the basis $\{e_a\}$ of $T_e(G)$.

A.6 Conventions on Structure Constants

In the case of a conventional super Lie algebra, these can be determined by considering only real vectors in $T_e(G)$ with vanishing souls. Thus if $X$ is such a vector and if it is c-type, then its nonvanishing components in the conventional basis are ordinary real numbers belonging to the set $\{X^a\}$. If $X$ is a-type, then its nonvanishing components are ordinary imaginary numbers that belong to the set $\{X^\alpha\}$. Hence such vectors belong to the subspace $\mathbf{R}^m \oplus (i\mathbf{R}^n)$ of $T_e(G)$. Mathematicians (quite often) replace this subspace $\mathbf{R}^m \oplus i\mathbf{R}^n$ by $\mathbf{R}^m \oplus \mathbf{R}^n$ and appropriately modify the bracket operation by multiplying the structure constants $c^{a}{}_{\beta\gamma}$ by $i$ so that they become real. (See [4] for details on Subsec. A.1-A.6.)

References

1. M. Atiyah, R. Bott and A. Shapiro, Clifford Modules, Topology 3 (Supp. 2) (1964), 3-38.
2. L. Castellani, et al., Supergravity and Superstrings, A Geometric Perspective (New Jersey: World Scientific, 1991).
3. S. Coleman and J. Mandula, All possible symmetries of the S matrix, Phys. Rev. 159 (1967), 1251.
4. B. DeWitt, Supermanifolds (2nd ed., Cambridge University Press, 1992).
5. S. Ferrara, P. van Nieuwenhuizen and B. DeWitt (ed.), Supergravity '81 (Trieste) (Cambridge: Cambridge University Press, 1982).
6. S. Ferrara, P. van Nieuwenhuizen and B. DeWitt (ed.), Supersymmetry and Supergravity '82 (Trieste) (Singapore: World Scientific, 1983).
7. P. G. O. Freund, Introduction to Supersymmetry (Cambridge: Cambridge University Press, 1986).
8. D. Friedan and P. Windey, Supersymmetric derivation of the Atiyah-Singer index and the chiral anomaly, Nucl. Phys. B235 (1984), 395-416.
9. C. Fronsdal (ed.), Essays on Supersymmetry (Boston: Kluwer Academic Publishers, 1986).
10. M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory (Vol. I, II, Cambridge: Cambridge University Press, 1987).
11. M. T. Grisaru, W. Siegel and M. Roček, Improved methods for supergraphs, Nucl. Phys. B159 (1979), 429-450.
12. (a) V. G. Kac, Classification of simple Lie superalgebras, Funct. Anal. Appl. 9 (1975), 263-265; (b) V. G. Kac, A sketch of Lie superalgebra theory, Commun. Math. Phys. 53 (1977), 31-64.
13. I. Kaplansky, Superalgebras, Pacific J. Math. 86 (1980), 93-98.
14. H. B. Lawson, Jr. and M.-L. Michelsohn, Spin Geometry (New Jersey: Princeton University Press, 1989).
15. P. K. Mohapatra, R. N. Mohapatra and P. Pal, Z_4 symmetry and force generation, Phys. Rev. D 34 (1986), 231-234.
16. R. N. Mohapatra, Unification and Supersymmetry (2nd ed., New York: Springer-Verlag, 1992).
17. W. Nahm, Supersymmetries and their representations, Nucl. Phys. B135 (1978), 149-166.
18. A. Salam and J. Strathdee, Superfields and Fermi-Bose symmetry, Phys. Rev., Vol. 11, No. 6 (1975).
19. J. Wess and J. Bagger, Supersymmetry and Supergravity (2nd ed., Princeton: Princeton University Press, 1983).
20. J. Wess and B. Zumino, Supergauge invariant extension of quantum electrodynamics, Nucl. Phys. B78 (1974), 1.
21. P. C. West, Introduction to Supersymmetry and Supergravity (New Jersey: World Scientific, 1990).
22. N. Kamran and P. J. Olver, 5.[19].

CHAPTER 8

GRAVITATION, RELATIVITY AND BLACK HOLES

1 GRAVITATION (FROM NEWTON TO EINSTEIN) AND AN OVERVIEW OF SPECIAL RELATIVITY

In this chapter we describe in brief the two theories, Gravitation and Relativity (the Special and the General), which have led to spectacular findings of this century such as black holes, and we devote this section in particular to the gravitational principles formulated by Newton and Einstein with geometry as a prime tool. These theories as we see them today are quite different from the perception of the world that our ancestors had. They thought that the Earth was stationary and at the centre of the universe and that the sun, the moon, the planets and the stars were moving in circular orbits around the Earth (Aristotle, 340 BC). It is worth mentioning here that these ancient philosophers (the seekers of truth) were more interested in solving the puzzles related to distant objects, e.g., the stars and the planets, than those related to the nearby objects around them. As a result they conceptualized the physics of celestial bodies even before writing the rules of geometry. However, when Newton wrote his 'Philosophiae Naturalis Principia Mathematica' (1687), substantial changes had already occurred, for example: (i) (Euclidean) geometry was already a well-established discipline when Newton used it to describe the physical laws; (ii) the Aristotelian model of the universe, formed solely on the basis of its "absoluteness," had been replaced by the Copernican model, where the Earth was no longer stationary but was moving around the sun. The Copernican model was confirmed by Kepler and Galilei through observations, though they were unable to provide the reasoning as to why the orbits of the Earth and the planets were elliptic and not circular. It was Newton who (for the first time) put forward a theory for describing the motion of bodies in space and thus showed why the orbits of the Earth and the planets were elliptic. To formulate this theory he also developed the complicated mathematics which was required to analyze these motions; this is our familiar calculus.

Newton's work, 'Principia Mathematica' [29], in which he laid the foundations of "gravitation theory," is considered to date the single most important publication in the physical sciences. Another gigantic contribution to the theory was made by Einstein some 225 years later in the form of General Relativity. In fact the theory of relativity, with Newton's work on gravitation as the baseline, forms the cornerstone of our present knowledge, theoretical as well as experimental, of 'gravity in the universe.' We have devoted this section to examining the differences and similarities between the theories proposed by Newton and Einstein, namely Newton's theory of gravitation and Einstein's theories of special and general relativity leading to gravitational theory. 1

1 In order to do this, we shall be using the two concepts interchangeably.

1.1 Newton's Theory of Gravitation and his Famous Laws

We recall that Newton formulated his theory on three simple premises:

(i) Newton's first law of motion: A body remains at rest or, if in motion, it remains in uniform motion with constant speed in a straight line, unless it is acted on by an unbalanced external force.
(ii) Newton's second law of motion: The acceleration produced by an unbalanced force acting on a body is proportional to the magnitude of the net force (resultant force), is in the same direction as the force, and is inversely proportional to the mass of the body; thus if $a$ denotes the acceleration, $m$ the mass and $F_{\rm net}$ the net force (whose direction is the same as that of $a$), then $a \propto F_{\rm net}/m$.
(iii) Newton's third law (action and reaction): Whenever one body exerts a force upon a second body, the second body exerts a force upon the first body; these forces are equal in magnitude and oppositely directed.

Although in none of these laws does Newton explicitly use the word inertia, in essence it underlies his first law. For him 'inertia' was a property of objects that described their tendency to maintain their state of motion, whether of rest or of constant velocity; in other words, according to Newton objects obeyed the 'Law of Inertia.' We shall soon see, while using the derived word 'inertial' as an adjective in Einstein's theory, that this 'Law of Inertia' has actually been turned around for the purposes of the theory there. Using the above three laws, Newton postulated his 'law of universal gravitation' 2 that reads as: Every particle in the universe attracts every other particle with a force that is directly proportional to the product of the masses of the two particles and inversely proportional to the square of the distance between them. Expressed as an equation this becomes:

$F = G\,\frac{m\,m'}{r^2}$ 3

where $G$ is a constant of proportionality, and $m$ and $m'$ are the masses of two bodies separated by a distance $r$. Newton used the space $\mathbf{R}^3 \times \mathbf{R}$ to describe his theory. Thus an event (according to Newton) could be specified position-wise by a point belonging to 3-dimensional Euclidean space $\mathbf{R}^3$ and time-wise by a point on the line $\mathbf{R}$. Space and time are disjoint entities here, and the symmetry group of the theory is the Galilean group. 3A

2 It is worth noting here that Newton used Kepler's laws of (planetary) motion as well as his own astronomical observations to establish his law of gravitation.

3 It is assumed that m, m' are very small as compared to their separation distance r.

3A It is worth noting here that it was Galileo Galilei who formulated the first known Principle of Relativity (see Chapter 3 in [39]). Using uncanny test objects such as fish in large bowls of water, flies, and bottles dripping drops of water, he showed that physics looks the same in a ship moving uniformly as in a ship which is at rest. Einstein replaced these sea-going ships by spaceships for his Special Relativity theory. Thus it is fair to say that Galileo had already conceptualized the notion of an Inertial Frame.
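Newton's law of universal gravitation stated above is easy to evaluate numerically. The short sketch below is only an illustration: the function name `gravitational_force` and the sample masses are hypothetical, and the value of $G$ is the CODATA 2018 figure in SI units, which is not quoted in the text. It shows the inverse-square behaviour: doubling the separation reduces the force by a factor of four.

```python
# Newtonian constant of gravitation in SI units (CODATA 2018 value; assumed here).
G = 6.67430e-11  # m^3 kg^-1 s^-2


def gravitational_force(m1, m2, r):
    """Magnitude of the attraction F = G*m1*m2/r**2 between two point masses.

    m1, m2 are masses in kilograms and r is their separation in meters.
    """
    if r <= 0:
        raise ValueError("separation r must be positive")
    return G * m1 * m2 / r**2


# Two sample masses (hypothetical values) at separations of 2 m and 4 m.
f_near = gravitational_force(5.0, 10.0, 2.0)
f_far = gravitational_force(5.0, 10.0, 4.0)

# Doubling the distance divides the force by four (inverse-square law).
ratio = f_near / f_far
print(f_near, f_far, ratio)
```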


The time ordering is a basic part of this theory. Thus for any two events A and B, it is possible to say that either A precedes B or B precedes A or they are simultaneous. The consistency in this order demands that 'simultaneity' be an 'equivalence relation,' which in turn requires that spacetime be divided into equivalence classes of mutually simultaneous events with each class representing the universe at a given time. The following figure describes this idea.

[Figure: Stratification of Newtonian spacetime. Labels: planes of absolute simultaneity; geodesic; non-geodesic; temporal interval between p and q.]

In short, Newton, like his predecessors, thought of 'gravity' as a force acting through space that was (quite) mysterious and had to be explained. The concept of gravitational force in this form continued until Einstein proposed the revolutionary idea of 'eliminating gravity' in order to explain the 'force of gravity.' 4

1.2 Einstein's Proposal: the Free-float Frame and the Observer

According to Einstein, 'gravitation' was not a foreign force transmitted through space and time; instead it was a force that manifested itself in the curvature of spacetime. He considered space and time on an equal footing and used Minkowskian space, and later on Riemannian space, to describe his theory. We shall describe in brief the underlying ideas of this theory by actually defining and explaining the 'terms' that are essential ingredients of its scientific interpretation, namely: the free-float (inertial) frame, the test particle, an event, readings on synchronized clocks, and the observer's relative accelerations.

Definition 8.1.1: A reference frame is said to be an inertial or free-float or a Lorentz reference frame in a certain region of space and time when, throughout that region of spacetime, and within some specified accuracy, every free test particle (see Def. (8.1.3)) initially at rest (in motion) with respect to that frame remains at rest (continues its motion with no change in speed or direction).

We note here how the "Law of Inertia" has been turned around; for a reference frame to be inertial, it is required that observers in that frame demonstrate that every free particle in that frame maintains its initial state of motion or that of rest. One can thus say that a free-float frame is defined by (can be identified with) Newton's first law of motion. We further remark:

Remark 8.1.2: A free-float frame is "local" in the sense that it is limited in space and time (and also "local" in the sense that its free-float character can be determined locally from within). We note that a free-float frame has been obtained by assuming the existence of a room so small that no effect of gravitation is felt there, but as soon as this condition of smallness is relaxed, the relative accelerations produced by different external factors come into play, and the 'state' of motion of a free particle guaranteed by an inertial frame no longer remains valid.
The tidal waves observed in the ocean are easy examples of this phenomenon. These waves are the result of the sun's and the moon's gravitational pull on water particles. It is not possible to find a frame large enough that would include all these particles and be a free-float frame. Evidently, many free-float frames would be required for it. It is only general relativity, the theory of gravitation propounded by Einstein, which tells us how to describe and predict orbits that traverse a string of adjacent free-float frames. Thus General Relativity is the only theory that provides the means to describe motion in unlimited regions of spacetime.

Definition 8.1.3: A small particle is called a test particle if its mass is so little that, within some specified accuracy, its presence does not affect the motion of other nearby particles.

Remark 8.1.4: A particle made of any material can be used as a test particle to determine whether a given reference frame is free-float. A frame that is free-float for a test particle is free-float for test particles of all kinds.

In Einstein's theory an event is specified by a place as well as a time. The place and time of its occurrence in a given free-float frame are determined with the help of 'synchronized clocks' 5 on a lattice constructed in this frame. One of these clocks is taken as the reference clock. It is set at time zero. A flash of light sent from here, which spreads out as a spherical wave in all directions, is supposed to reach a clock, say, ten meters away in ten meters of light-travel time. In other words, a clock at a distance of ten meters records ten meters of light-travel time. The space position of the event is taken to be the location of the clock nearest to the event, and the time of the event is taken as the time recorded on this clock. In fact these 'recording' clocks read into their memory the nature of the event as well (e.g., collision, passage of a light-flash or particle), besides giving the time and the location.

4 To disregard gravity he used the notion of an unpowered spaceship or a freely-falling room.
A natural question that follows is: how is this information collected and who collects it? We answer this question next. The information from recording clocks is collected by the so-called 'observer,' who need not be a human being. In the following definition we show precisely what it stands for, and then explain how the information is collected.

Definition 8.1.5: In the theory of relativity, the word observer is, in a manner of speaking, shorthand for the whole collection of recording clocks associated with one free-float frame. An 'observer' can be viewed as a person who goes around reading out the memories of all recording clocks under his control. The location and the time of each event is recorded by the clock nearest that event.

Owing to the importance of the free-float (inertial) frame in the above definition, the word observer is often preceded by the word "inertial." We note that the observer does not report on widely separated events that he (she) views by his (her) own eye, for such a report can cause a wrong order amongst the events that are involved. A mistake of this nature can happen due to the travel time of light; for instance, the light from an 'event' that occurred a million years ago at a distance of a million light-years in our frame may just be entering the observer's eye-domain after the entrance of light from an 'event' that occurred on the moon a few seconds ago.

1.3 Acceleration and Spacetime Curvature

Having defined Newton's laws and a few useful terms used in Einstein's theory, we give in the following remark the difference between the concepts of acceleration and gravitation in the two theories.

5 Synchronized clocks: In a given inertial frame a lattice is constructed. At every intersection point of this lattice identical clocks, whose readings are in meters of light-travel time, are fixed. All these clocks read the "same time" as one another for observers in this frame.

Remark 8.1.6: In Newtonian mechanics different particles going at different speeds are all deflected away with equal acceleration from the ideal straight line. According to Newton there is no difference in principle between the fall of a projectile and the motion of a satellite. In brief, in Newton's theory there is one global reference frame and within this frame no satellite is ever gravity free, and no particle ever moves in a straight line at constant speed. In Einstein's theory, on the other hand, there are many local regions equipped with Lorentzian geometry (as in special relativity). The 'laws of gravitation' here arise from the lack of ideality in the relation between one local region and the next. One has to observe the 'relative acceleration' of two particles slightly separated from each other to have any proper measure of a 'gravitational' effect. These 'relative accelerations' double when the 'separations' are doubled. According to Einstein, tidal acceleration displays gravity as a local phenomenon. He further emphasizes that 'tide-producing' effect does not require for its explanation some (mysterious) force of gravitation propagated through spacetime which is in addition to the structure of spacetime. This (tide-producing effect) should be described in terms of the geometry of spacetime itself as the curvature of spacetime. With these philosophical differences between the two theories in place, we now turn our attention to their mathematical descriptions.*

1.4 The Coordinate Transformations: Distinction Between the Galilean and Special Relativity Theory

Let $I$ and $I'$ be two inertial frames covered by coordinates $(x, y, z, t)$ and $(x', y', z', t')$, where $(x, y, z) = \mathbf{r}$ and $(x', y', z') = \mathbf{r}'$ are the space coordinates and $t$ and $t'$ are the time coordinates. These two frames are related to each other in the following manner:

(i) a relative rotation of the space coordinates: $\mathbf{r}' = A\mathbf{r}$
(ii) a displacement of the space coordinates: $\mathbf{r}' = \mathbf{r} + \mathbf{a}$
(iii) a displacement of the time coordinate: $t' = t + b$
(iv) a type of boost. (8.1.1)

(The vector $\mathbf{a} = (a_1, a_2, a_3)$ and $b$ in the above equations are constants.) In view of Ftn. 3A and (8.1.1), both theories are represented by $I$ and $I'$ and hence by the same set of coordinates. The distinction between the Galilean and special relativistic theories comes from the boost, which is:

$t' = t, \qquad x' = x + vt, \qquad y' = y, \qquad z' = z$ (8.1.2)

for the first and

$ct' = \frac{ct + (v/c)\,x}{(1 - v^2/c^2)^{1/2}}, \qquad x' = \frac{x + vt}{(1 - v^2/c^2)^{1/2}}, \qquad y' = y, \qquad z' = z$ (8.1.3)

for the second. The number $v$ in the above equations depends on the observer's (constant) velocity, and $c$ is an absolute constant of nature with the dimensions of velocity, which has the role of a conversion constant between the variables $t$ and $x$ of different physical dimensions. We note that no change in the time coordinate in the Galilean case implies that the observers agree on the definition of simultaneity. (For details on the concept of "absoluteness" in these theories see Friedman [11]. Here he develops the kinematics of both these theories by writing the field equations in curvilinear coordinates (of General Relativity).) We describe next in brief the mathematical content of Newton's and Einstein's theories in the form of equations for gravitational fields.

* We emphasize that in spite of the distinction between the two theories, their predictions (numerical results) are essentially the same on the surface of the Earth (whether it is a projectile path or an ocean flow). However, when gravitational effects are large (near white dwarfs or neutron stars (see Subsec. (5.2))), it is Einstein's theory that makes the right predictions.
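The difference between the two boosts (8.1.2) and (8.1.3) can be made concrete with a small numerical sketch. It assumes the sign convention $x' = x + vt$ used above; the function names and the sample event are hypothetical. The quantity $(ct)^2 - x^2$ is left unchanged by the special relativistic boost but not, in general, by the Galilean one, while at speeds small compared with $c$ the two transformations nearly coincide.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact value in SI units)


def galilean_boost(t, x, v):
    """Galilean boost (8.1.2): time is unchanged, x' = x + v*t."""
    return t, x + v * t


def lorentz_boost(t, x, v, c=C):
    """Special relativistic boost (8.1.3), written for t' rather than c*t'."""
    if abs(v) >= c:
        raise ValueError("boost speed must satisfy |v| < c")
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t_prime = gamma * (t + v * x / c**2)
    x_prime = gamma * (x + v * t)
    return t_prime, x_prime


def interval(t, x, c=C):
    """The combination (c*t)**2 - x**2, preserved by the Lorentz boost."""
    return (c * t) ** 2 - x**2


# Sample event in units where c = 1, boosted with v = 0.6.
t0, x0, v0 = 2.0, 1.0, 0.6
t_lor, x_lor = lorentz_boost(t0, x0, v0, c=1.0)
t_gal, x_gal = galilean_boost(t0, x0, v0)

print("Lorentz :", t_lor, x_lor, interval(t_lor, x_lor, c=1.0))
print("Galilean:", t_gal, x_gal, interval(t_gal, x_gal, c=1.0))
print("original:", t0, x0, interval(t0, x0, c=1.0))
```

Working in units with $c = 1$ for the sample event keeps the arithmetic transparent; the default argument `c=C` gives SI units instead.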

1.5 Equations of Motion in Newtonian Mechanics

The Newtonian time axis is $T = \mathbb{R}$. The pair $(r, m_i)$ represents a particle of inertial6 mass $m_i \in (0, \infty)$ moving on a curve $r: \mathcal{E} \to \mathbb{R}^3$. For $t \in \mathcal{E} \subset T$, $r(t) \in \mathbb{R}^3$ is the position vector of the particle at time $t$. With $r$ as the path7 of the particle, we have $\dot{r} = v$ as its velocity, $|v|$ as its speed, $m_i v$ its momentum, $\tfrac{1}{2} m_i |v|^2$ its kinetic energy and $\ddot{r} = \dot{v}$ as its acceleration. In this theory the concept of relative velocity, etc., gets introduced only when one considers another curve $\bar{r}: T \to \mathbb{R}^3$; the difference $r - \bar{r}$ gives the path and the velocity relative to $\bar{r}$ (this however is irrelevant in the scheme of ideas pursued by Newton). Newton's equations of motion for inertial and gravitational mass respectively are:

$F = m_i\,\omega$   (8.1.4)
($F$ = total force acting on the body; $\omega$ = acceleration)

$F_g = m_g\,g$   (8.1.5)
($F_g$ = gravitational force; $g$ = gravitational field intensity; $m_g$ = gravitational mass8)

We note that the gravitational field intensity $g$ is generated according to the inverse square law by a gravitating body. Further, if $r$ and $R$ denote the position vectors of a particle in an inertial frame and in a uniformly accelerating frame, say $I''$, then the two frames are related as:

$R = r - \tfrac{1}{2}\,\omega t^2$   (8.1.6)

and therefore while the law of motion in the inertial frame from (8.1.4) reads as:

$m\ddot{r} = F$   (8.1.7)

in the accelerating frame it becomes:

$m\ddot{R} = F - m\omega$   (8.1.8)

6 The property of a body whereby it resists any attempt to change its state of motion is its 'inertial mass $m_i$.' The mass $m_i$ is measured by collision experiments that do not involve gravity.

7 The word path stands for 'trajectory or orbit' in the Euclidean space and 'world line' in the Lorentz spacetime.

8 The property which determines a body's response to a gravitational field is its gravitational mass $m_g$.

where we have differentiated (8.1.6) twice with respect to $t$ and have substituted the value of $m\ddot{r}$ from (8.1.7). We shall illustrate the applicability of these equations to a system of N bodies in Exc. 3. As mentioned above in the definitions and the remarks, there is no counterpart of these equations in Einstein's theory of gravitation, in the sense that there is no linear equation similar to (8.1.5) which describes the gravitational force. On the other hand, it is the 'curvature' of spacetime (a highly nonlinear term) that represents gravity. This equation, whose study will be the subject matter for a major part of this chapter, reads as:

$R_{\mu\nu} - \tfrac{1}{2} R\,g_{\mu\nu} = T_{\mu\nu}$

The RHS (the energy-momentum tensor) of this equation represents the influence of surrounding matter and the LHS, which depends on the spacetime geometry, stands for the gravitation. The metric of spacetime which goes into the computation of $R_{\mu\nu}$ in its most general form (curvilinear coordinates) is written as:

$ds^2 = -g_{00}\,dx_0^2 + g_{ij}\,dx_i\,dx_j \qquad (\mu, \nu = 0, 1, 2, 3;\ i, j = 1, 2, 3).$

It is known that Einstein arrived at the above equation not by sheer coincidence but by years of hard labour in order to resolve the puzzles of nature. The theory of 'Special Relativity' which we describe in brief below is often viewed as a means toward Einstein's final goal: an understanding of 'gravity in the universe.'

1.6 Special Relativity

To begin with, we wish to remind the reader that though Einstein is considered the architect of this theory, there are three others, Lorentz, Poincaré, and Minkowski, whose work led to this theory. Before the discovery of 'special relativity,' space and time were measured in different units: miles or meters for space, and seconds for time. No one thought of the advantages that would emerge from using the same unit for both. It was perhaps because the role of 'light' was not fully recognized in physics at that point.9 In any case the first step toward the theory was to treat space and time on the same footing by using the same unit for measurement. For instance, time in meters is just the time it takes a light flash to go that number of meters. The conversion factor between seconds and meters is the speed of light, c = 299,792,458 meters/second. The speed of light is the only natural constant that has the necessary units to convert a time to a length. The velocity of light c (meters/second) multiplied by time t (in seconds) gives ct (in meters).10

9 It was the Michelson-Morley experiment which showed that the speed of light was the same in all directions. In fact this led to Einstein's fundamental postulate, the Principle of Relativity: Laws of science should be the same for all freely moving observers regardless of their speed.

By using the same unit, space and time became one entity; one could not be separated from the other. This space-time unification followed from the concept of invariance of the spacetime interval11 (between any two events); this invariance showed that time and space are inseparable parts of a larger unity, though qualitatively they are different. The 'spacetime interval' is the simplest form of measure between two events, and it is 'natural' since it is invariant. Space is different for different observers just as time is, but spacetime is the same for everyone. Minkowski observed that electric charge and particle mass, which are the same for all observers in relative motion, are similar to the spacetime interval in this respect, whereas quantities such as velocity, momentum, energy, separation in time and separation in space are relative in character, in the sense that they depend on the relative motion of observers. Having seen the role of 'light' in unifying space and time, we shall see next its importance in analyzing the travel-time of a particle in spacetime. In order to do this, we note that Einstein's postulate (given in Ftn. 9) can alternatively be expressed to say: "all observers should measure the same speed of light, no matter how fast they are moving." The postulate also implies the law that "nothing can travel faster than the speed of light."

Remark 8.1.7: We shall use these basic facts to show that: (i) a curved path (trajectory/projectile) traced by a particle in 'space' is longer in distance than a straight line path, whereas in spacetime a worldline is shorter when it is kinked (curved); (ii) the time lapsed along a kinked worldline is shorter than along a straight worldline (see Exc. 4).
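The contrast stated in Remark 8.1.7 can be illustrated with a small numerical sketch. It is an illustration under stated assumptions rather than the book's own construction: units are chosen so that c = 1, worldlines are piecewise straight, and the function names and the sample coordinates are hypothetical.

```python
import math


def proper_time(events):
    """Total proper time along a piecewise-straight worldline.

    `events` is a list of (t, x) pairs in units with c = 1; each segment
    contributes sqrt(dt^2 - dx^2).
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        if abs(dx) > abs(dt):
            raise ValueError("segment is spacelike; no particle can follow it")
        total += math.sqrt(dt * dt - dx * dx)
    return total


def euclidean_length(points):
    """Total length of a piecewise-straight path in the Euclidean plane."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# The same two point sets read once as paths in space, once as worldlines.
straight = [(0.0, 0.0), (10.0, 0.0)]
kinked = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]

if __name__ == "__main__":
    print(euclidean_length(straight), euclidean_length(kinked))  # kinked path is longer
    print(proper_time(straight), proper_time(kinked))            # kinked worldline is shorter
```

The only difference between the two functions is the sign in front of the squared space separation, which is exactly the minus sign that distinguishes the spacetime interval from Euclidean distance.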
We recall that the path of a particle in Newton's theory is traced in the 3-dimensional Euclidean space, and as such the path is shortest when it is a straight line. In the case of spacetime, the particle travels along a time-like geodesic, say $\sigma(\tau)$. In Fig. (8.3) we see how this path is longer (in proper time) when it is a straight line. Given below is the equation of an arbitrary geodesic in curvilinear coordinates with the metric $ds^2 = \sum_{i,j} g_{ij}\,dx_i\,dx_j$:

$\dfrac{D}{\partial u}\left(\dfrac{\partial}{\partial u}\right)_{\sigma} = \dfrac{d^2 x_i}{du^2} + \Gamma^i_{jk}\,\dfrac{dx_j}{du}\,\dfrac{dx_k}{du} = 0$   (8.1.9)

which represents the equation of motion of a free particle. We note that in an inertial coordinate system (coordinates in a free-float frame), the above equation simplifies to:

$\dfrac{d^2 x_i}{du^2} = 0$   (8.1.10)

10 In 1983 the General Conference on Weights and Measures officially redefined the meter in terms of the speed of light. By this definition the meter equals the distance that light travels in a vacuum in 1/299,792,458 of a second.

11 This invariance of the spacetime interval was discovered by Einstein and Poincaré in 1905 and is formally called the Lorentz interval. This invariance demonstrates the unity of space and time while preserving, in the formula's minus sign, the distinction between the two.

But since every free-float frame can be given a Minkowskian space structure with line element:

$ds^2 = -dx_0^2 + dx_1^2 + dx_2^2 + dx_3^2$   (8.1.11)

it follows that time-like curves $\sigma(\tau)$ in Minkowski space are given by the curves $x_i$ = constant ($i = 1, 2, 3$), and each of these curves has the same tangent vector:

$\left(\dfrac{dx_0}{d\tau},\ 0,\ 0,\ 0\right).$

On the other hand, the curves $x_0$ = constant have the tangent vectors $\left(0,\ \dfrac{dx_1}{du},\ \dfrac{dx_2}{du},\ \dfrac{dx_3}{du}\right)$. These curves are known as space-like. The curves whose tangent vectors satisfy:

$-\left(\dfrac{dx_0}{du}\right)^2 + \left(\dfrac{dx_1}{du}\right)^2 + \left(\dfrac{dx_2}{du}\right)^2 + \left(\dfrac{dx_3}{du}\right)^2 = 0$   (8.1.12)

are the null curves. Also the timelike geodesic $\sigma(\tau)$ here satisfies:

$\dfrac{d^2 x_i}{d\tau^2} = 0$   (8.1.13)

in an inertial system. The solutions $x_i = a_i \tau + b_i$ of the above equation suggest that $t = x_0 = a_0 \tau + b_0$ is an affine parameter on timelike geodesics. Moreover since $\Gamma^i_{jk} = 0$ in the inertial coordinate systems, the equation for a timelike geodesic can be written as:

$\dfrac{d^2 x_i}{dt^2} = 0$   (8.1.14)

This equation not only gives the law of motion for free particles in spacetime, it also represents Newton's Law of Inertia (see Subsec. 1.1). From (8.1.13) and (8.1.14) it follows that there are two types of time associated with the trajectory of a particle: the coordinate-independent proper time $\tau$ and the coordinate-dependent coordinate time $t$. Likewise we have two types of velocity four-vectors: the proper velocity $u$ with components $u^i = \dfrac{dx_i}{d\tau}$, and the coordinate velocity $v$, with components $v^i = \dfrac{dx_i}{dt}$. Obviously the coordinate velocity takes the form $(1, \vec{v})$ where $\vec{v}$ is the ordinary three-velocity. To find the relation between $\tau$ and $t$ we note that

$d\tau^2 = -ds^2 = dt^2 - \sum_{i=1}^{3} dx_i\,dx_i$

thus

$\dfrac{d\tau}{dt} = \sqrt{1 - \left[\left(\dfrac{dx_1}{dt}\right)^2 + \left(\dfrac{dx_2}{dt}\right)^2 + \left(\dfrac{dx_3}{dt}\right)^2\right]} = \sqrt{1 - v^2}$   (8.1.15)

The relationship between the proper velocity $u$ and the coordinate velocity $v$ can now be obtained by using (8.1.15) and the fact that $u^i = \dfrac{dx_i}{d\tau} = \left(\dfrac{dx_i}{dt}\right)\left(\dfrac{dt}{d\tau}\right)$. This gives:

$u^0 = \dfrac{dx_0}{d\tau} = \dfrac{1}{\sqrt{1 - v^2}}, \qquad u^i = \dfrac{dx_i}{d\tau} = \dfrac{v^i}{\sqrt{1 - v^2}}$   (8.1.16)

12 We have assumed here that the expression for the curve is in curvilinear coordinates.

where $v^i$ are the components of the three-velocity $\vec{v}$. From equations (8.1.10) and (8.1.14) it follows that the free particles in both theories (Newtonian and Special Relativity) move (locally) along straight lines. We give below the diagrams of null cones in these theories to illustrate the differences between them.

[Figure: (i) Light cone in Newtonian spacetime and (ii) light cone in Minkowskian spacetime. In panel (i), $v_{II} = v_I - v$; in panel (ii), $d_{II} \neq d_I$, $t_{II} \neq t_I$, $v_{II} = v_I$. Legend: I (II) = reference frame 1 (2) adapted to the geodesic I (II); p = the point of emission of the light ray; q = the point to which it travels; $d_I$ ($d_{II}$) = distance of travel in I (II); $t_I$ ($t_{II}$) = time of travel in I (II); $v_I$ ($v_{II}$) = velocity of the light ray in I (II); S = plane of simultaneity.]
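The relations (8.1.15) and (8.1.16) between proper and coordinate quantities can be checked with a few lines of code. This is a sketch under the text's convention c = 1 and the line element (8.1.11); the function names are illustrative and not taken from the book.

```python
import math


def gamma_factor(v3):
    """dt/dtau = 1/sqrt(1 - v^2) for a three-velocity v3 given in units of c."""
    speed_sq = sum(component * component for component in v3)
    if speed_sq >= 1.0:
        raise ValueError("three-velocity must be slower than light (|v| < 1)")
    return 1.0 / math.sqrt(1.0 - speed_sq)


def proper_velocity(v3):
    """Components (u^0, u^1, u^2, u^3) of the proper velocity, as in (8.1.16)."""
    g = gamma_factor(v3)
    return (g,) + tuple(g * component for component in v3)


def minkowski_norm(u):
    """-u0^2 + u1^2 + u2^2 + u3^2, following the signs of (8.1.11)."""
    return -u[0] ** 2 + sum(component**2 for component in u[1:])


if __name__ == "__main__":
    u = proper_velocity((0.6, 0.0, 0.0))
    print(u)                                   # time component is the gamma factor
    print(1.0 / gamma_factor((0.6, 0.0, 0.0)))  # d(tau)/dt from (8.1.15)
    print(minkowski_norm(u))                   # close to -1 for any admissible v
```

The norm of the proper velocity comes out as -1 independently of the three-velocity, which is another way of saying that $\tau$ is the arc-length parameter of a timelike curve.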

In this brief summary (including the exercises), we have only given the reader a glimpse of the 'Principle of Relativity' which encompassed space and time as two pieces of one pie and attributed the reason for time differences to the measurements taken in different frameworks called the Laboratory and Rocket frames (in Relativity Theory). Essentially the time difference until then was attributed to surveying discrepancies (see Chapter 2 of [39] for details). Finally, in spite of its successes the Theory of Special Relativity was not the complete answer to questions of the physical world; for one thing it provided no framework for gravitational interactions (see Schild in [7] and [8b]). In his (1916) paper [8c] Einstein begins by focussing on the shortcomings of the theory and thus justifying the need for a theory that incorporated both the theory of special relativity and gravitation. In short, he favoured a theory where all forces of nature could be collectively expressed, with the further provision that the equations involving them remained invariant under coordinate transformations. Einstein thus used the principles of covariance of tensor calculus to write his equation:

$R_{\mu\nu} - \tfrac{1}{2} R\,g_{\mu\nu} = T_{\mu\nu}$   (8.1.17)(a)

where the LHS represented the gravitational fields via the curvature of massive bodies and the RHS stood for matter fields (including the electromagnetic fields). Since the observed phenomena did not fully agree with the predictions of the theory, he added a term $\Lambda g_{\mu\nu}$ and called $\Lambda$ the cosmological constant. It is this full form of the Einstein equation:

$R_{\mu\nu} - \tfrac{1}{2} R\,g_{\mu\nu} + \Lambda g_{\mu\nu} = T_{\mu\nu}$   (8.1.17)(b)

that we shall study in the following sections, and shall find out for ourselves that this represents the best theory for gravitational interactions of extended body-systems and answers the intriguing questions concerning the gravitational waves13 (see Grishchuk and Polnarev in [7], K. S. Thorne in [16d] and L. M. Sokolowski in [32]), singularities, and black holes (see Miller and Sciama in [7]) of our universe.14 In conclusion perhaps one may argue that since geometry plays a dominant role in all three theories, their distinct characteristics can be attributed to the manifolds that are used there:

(i) $\mathbb{R}^3 \times \mathbb{R}$ (direct product of Euclidean spaces): Newton;
(ii) $\mathbb{M}^{1,3}$ ($g_{ii} = 1$, $i = 1, 2, 3$, $g_{00} = -1$) (Minkowskian): Special Relativity;
(iii) 4-dimensional curvilinear space which is locally Minkowskian ($g_{\mu\nu}$ nonconstant in general): General Relativity.

(See Excs. 1 and 2 and also Yang in [27] for geometry and physics and also Chapter 6 of [9] on ideas pertaining to geometry in Newtonian and Einsteinian physics.)

13 Gravitational waves are an unavoidable consequence of the relativistic theory of gravitation and they occur in all physical processes where gravitational radiation participates.

14 The two fundamental predictions of the General Relativity theory, the gravitational waves and black holes, differ because black holes often need very strong gravitational fields (gravitational potential $\to c^2$) for their formation, while gravitational waves exist even in the weak-field approximation. Exceptions to the requirement of a strong gravitational field (for formation of a black hole) are seen in the following example: near a black hole, in particular a large black hole, the gravitational field represented by $R_{\alpha\beta\gamma\delta} \sim \dfrac{M}{r^3} \sim \dfrac{1}{M^2}$ near $r = 2M$ is small if $M$ is large (see Sec. 5).

Exercise 8.1

1. State Kepler's laws.
2. Show the similarity between the laws of motion along geodesics in a normal frame of general relativity theory and a semi-Euclidean frame of special relativity (see Chapter 5 in [11]).
3. Use Newton's laws of gravitation to obtain dynamical equations for a finite system of N bodies that is isolated (see T. Damour in [16d]).
4. Establish the statement given in Remark (8.1.7) regarding the distance and the time with the help of figures and examples.

Hints to Exercise 8.1

1. Kepler's laws for orbital motion are:
(a) The orbit of any planet around the sun is an ellipse, with the sun at one focus of the ellipse.
(b) The line joining any planet to the sun sweeps out equal areas in equal times.
(c) For any two planets in the solar system, the squares of the periods of revolution are in the same proportion as the cubes of their average distances from the sun.

2. Let $\sigma(\tau)$ be a geodesic of the general relativity theory (i.e., we have curvilinear coordinates, a metric $g_{ij}$ and a connection $\Gamma^i_{jk}$). A normal coordinate system can be constructed by choosing a quadruple of "orthonormal" vectors $\{X_i\}$ in the tangent space $T_{\sigma(\tau)}$ for each value of $\tau$, so that the metric tensor satisfies:

(i) $g(X_i, X_j) = 0$ for $i \neq j$; $\ = 1$ for $i = j = 1, 2, 3$; $\ = -1$ for $i = j = 0$.

From each point of $\sigma(\tau)$ there comes out a family of spatial geodesics orthogonal to $X_0$ whose parameters are defined by their proper distance $s$ from $\sigma(\tau)$. To each point $p$ of such a geodesic we assign coordinates:

(ii) $y^0 = \tau, \qquad y^i = g(n, X_i)\,s \quad (i = 1, 2, 3)$

where $n$ is the unit tangent vector, at $\sigma(\tau)$, to the spatial geodesic through $p$, and $s$ is the proper distance along the geodesic from $\sigma(\tau)$ to $p$. We note that this quadruple must be smooth and that such coordinates can only be defined on a small neighbourhood of $\sigma$. The connection coefficients in these coordinates can be obtained by writing the differential in general coordinates, e.g.

(iii)

and then making use of (i) and (ii). They are thus:

(iv)(a) $\Gamma^0_{0i} = \Gamma^0_{i0} = \Gamma^i_{00} = a^i \quad (i = 1, 2, 3)$

and

(iv)(b) $\Gamma^i_{0j} = \Gamma^i_{j0} = \Omega^i{}_j \quad (i, j = 1, 2, 3);$

the symbol $\Omega^i{}_j$ is the usual antisymmetric rotation matrix in $\mathbb{R}^3$. All other $\Gamma^i_{jk}$'s are zero. The equation of motion along the geodesic $\sigma$ then becomes (see (8.1.9)):

(v) $\dfrac{d^2 y^i}{d(y^0)^2} + \underbrace{a^i}_{\text{inertial force}} + \underbrace{2\,\Omega^i{}_j\,\dfrac{dy^j}{dy^0}}_{\text{Coriolis force}} - \underbrace{2\,a_j\,\dfrac{dy^j}{dy^0}\,\dfrac{dy^i}{dy^0}}_{\text{relativistic correction}} = 0.$

We recall that for special relativity the law of motion (a particle moving on a timelike geodesic) is simply (see (8.1.14)):

(vi) $\dfrac{d^2 x_i}{dt^2} = 0.$
Thus if the inertial force (acceleration) $a^i$ and the rotation both vanish, our normal frame15 becomes a local inertial frame and (v) and (vi) become the same. This shows that the theory of General Relativity can locally be viewed as the theory of Special Relativity.

3. Since the system is an isolated one, no force other than the mutual gravitational force affects the system. We assume (for simplicity in computations) that these bodies are made of some perfect fluid with a given isentropic equation of state that links the pressure $p$ to the mass density $\rho$, in other words (see Ref. 1, M. Mikkelson):

(i) $p = p(\rho).$

The equations that describe the Newtonian dynamics of the system are:

(ii) $\dfrac{\partial \rho}{\partial t} + \dfrac{\partial (\rho v^i)}{\partial x^i} = 0$   (continuity equation)

(iii) $\rho\left(\dfrac{\partial v^i}{\partial t} + v^j\,\dfrac{\partial v^i}{\partial x^j}\right) = -\dfrac{\partial p}{\partial x^i} + \rho\,\dfrac{\partial U}{\partial x^i}$   (Euler equation)

(iv) $\Delta U = -4\pi G \rho$   (Poisson equation)

where $v^i = v^i(x, t)$ is the velocity field in Cartesian coordinates ($i, j = 1, 2, 3$), $U = U(x, t)$ is the positive gravitational potential, and $G$ denotes Newton's gravitational constant. We note that the assumption on isolation of the system implies that the gravitational potential $U$ falls off outside the system:

$\lim_{|x| \to \infty} U(x, t) = 0 \qquad (t = \text{const.})$

where $|x|$ is the Euclidean norm of $x$. Poisson's equation can therefore be solved to give:

(v) $U(x, t) = G \displaystyle\int \dfrac{\rho(y, t)}{|x - y|}\,d^3y.$

Here $|x - y|$ gives the Euclidean distance between the field point $x$ and the source point $y$, and $d^3y$ denotes the Euclidean volume element in Cartesian coordinates. The dynamical equations of this N-body system are obtained by considering two separate problems, the external and the internal: the first relates to the determination of the motion of the centres of mass of the N bodies and the second to the motion of each body around its centre of mass. Let $m_a$ denote the total mass of the a-th body which occupies the volume $V_a$ ($a, b, c, \ldots = 1, 2, \ldots, N$); then

(vi) $m_a = \displaystyle\int_{V_a} \rho(x, t)\,d^3x$

($m_a$ can be viewed as a constant due to the continuity equation). The position of its centre of mass is given by:

(vii) $z_a^i = \dfrac{1}{m_a} \displaystyle\int_{V_a} x^i \rho(x, t)\,d^3x.$

15 The frame refers to a tangent space with basis at a point of a spacetime region (with curvilinear coordinates).

Now for any smooth function $F(x, t)$ we have:

(viii) $\dfrac{d}{dt} \displaystyle\int_{V_a} F(x, t)\,\rho(x, t)\,d^3x = \displaystyle\int_{V_a} \dfrac{dF(x, t)}{dt}\,\rho(x, t)\,d^3x$

where

$\dfrac{dF(x, t)}{dt} = \dfrac{\partial F(x, t)}{\partial t} + \dfrac{\partial F(x, t)}{\partial x^j}\,\dfrac{dx^j}{dt} = \dfrac{\partial F(x, t)}{\partial t} + v^j\,\dfrac{\partial F(x, t)}{\partial x^j}.$

Differentiating (vii) twice with respect to $t$ and using (viii) to write the RHS we have:

(ix) $m_a\,\dfrac{d^2 z_a^i}{dt^2} = \displaystyle\int_{V_a} \rho\,\dfrac{dv^i}{dt}\,d^3x.$

From (8.1.4) we know that $\rho\,\dfrac{dv^i}{dt} = f^i$ gives the local equation of motion, $f^i$ being the local force density, and therefore (ix) can be written as:

(x) $m_a\,\dfrac{d^2 z_a^i}{dt^2} = \displaystyle\int_{V_a} f^i\,d^3x.$

In a perfect fluid model $f^i = -\dfrac{\partial p}{\partial x^i} + \rho\,\dfrac{\partial U}{\partial x^i}$. Also for every body of the system, and in particular for the a-th, it can be decomposed in terms of an internal force (known as the self-force),

(xi) $f^i_{(s)a} = -\dfrac{\partial p}{\partial x^i} + \rho\,\dfrac{\partial U_{(s)a}}{\partial x^i}$

and an external force:

(xii) $f^i_{(e)a} = \rho\,\dfrac{\partial U_{(e)a}}{\partial x^i}$

where the self-part $U_{(s)a}$ of the gravitational potential is:

(xiii) $U_{(s)a}(x, t) := G \displaystyle\int_{V_a} \dfrac{\rho(y, t)}{|x - y|}\,d^3y.$

The external part $U_{(e)a} = U - U_{(s)a}$ that results from integration over the other bodies is:

(xiv) $U_{(e)a}(x, t) = \displaystyle\sum_{b \neq a} G \int_{V_b} \dfrac{\rho(y, t)}{|x - y|}\,d^3y.$

If we choose an accelerated "centre of mass frame of reference," i.e., regard the centre of mass of the a-th body as the origin of the frame which is formed by axes parallel to the global Cartesian axes, then the position $x_a^i$ of a point $p$ with respect to this frame is linked to its position $x^i$ in the global (Cartesian) frame by:

(xv) $x_a^i = x^i - z_a^i.$16

16 $z_a^i$ = coordinates of the centre of mass with respect to the global frame.

Since time is absolute in Newtonian Mechanics, the relative velocity and the relative acceleration are given respectively as:

(xvi) $v_a^i := \dfrac{dx_a^i}{dt} = v^i - \dfrac{dz_a^i}{dt}, \qquad \dfrac{dv_a^i}{dt} = \dfrac{dv^i}{dt} - \dfrac{d^2 z_a^i}{dt^2}.$

Using this notation the basic equation of the 'external problem' in view of Eq. (x) becomes:

(xvii) $m_a\,\dfrac{d^2 z_a^i}{dt^2} = \displaystyle\int_{V_a} \rho\,\dfrac{\partial U_{(e)a}}{\partial x^i}\,d^3x + \int_{V_a} f^i_{(s)a}\,d^3x$

whereas the basic equation of the a-th 'internal problem' (motion in the centre-of-mass frame of the a-th body) can be written as (see equations (iii), (xv) and (xvi)):

(xviii) $\rho\,\dfrac{dv_a^i}{dt} = f^i_{(s)a} + f^i_{(e)a} - \rho\,\dfrac{d^2 z_a^i}{dt^2}.$

The above equations show that in spite of the decomposition of the force density into internal and external components, the internal and external problems are a priori coupled to each other. For instance in equation (xvii), which represents the external problem, the second term on the RHS is in fact of internal origin, as it is the total self-force:

(xix) $F^i_{(s)a} = \displaystyle\int_{V_a} f^i_{(s)a}\,d^3x.$

(The problem we have discussed above is called the N-extended body problem in Newtonian gravity.)

4. (a) Let O be the reference point and P be another point which is reached by a curved path in space and in spacetime (Fig. (8.3) (i) and (ii)). It is evident that the total length along the winding path from point O to point P (in the first case) is greater than the length along the straight northward axis from O to P. We know that the particle in spacetime travels along a timelike curve, hence here we have to measure the total proper time from event O to event P. Since the total proper time is shorter along the curved line we have established the statement made in the Remark (8.1.7).

[Fig. (8.3): Length along a path: (i) in space, (ii) in spacetime. Panel (i), "Path in space" (axes East and North): the curved path has greater length than the direct path. Panel (ii), "Worldline in spacetime" (axes Space and Time): the curved worldline has shorter proper time than the straight worldline.]

(b) The spacetime map given below shows the time and space measured in years. The locations of four events are given in the following table:

SPACE AND TIME LOCATION OF EVENTS

Event      Space (years)   Time (years)
Event 1         1               0
Event 2        -1               1
Event 3         0.5             2.5
Event 4         3               8

[Spacetime map of the four events: horizontal axis Space, vertical axis Time.]

Using this information we compute the proper time taken by a traveller (recorded on his/her watch) who begins from Event 1, passes through Events 2 and 3 and reaches Event 4, and by another traveller who goes directly from Event 1 to Event 4. The proper time in the first case is the sum of the proper times of the three segments, each obtained by using the formula: $(\text{interval})^2 = \pm\left[(\text{time separation})^2 - (\text{space separation})^2\right] = \pm\left[(\text{difference in time coordinates})^2 - (\text{difference in space coordinates})^2\right]$.17 This gives: $\sqrt{(2)^2 - (1)^2} + \sqrt{(1.5)^2 - (1.5)^2} + \sqrt{(5.5)^2 - (2.5)^2} = \sqrt{3} + 0 + \sqrt{24} = 1.73 + 4.90 = 6.63$. (In the first segment the space separation, 2, exceeds the time separation, 1, so the sign in the formula is the one that keeps the radicand positive.) In the second case it is: $\sqrt{(8)^2 - (2)^2} = \sqrt{60} = 7.75$. The above example shows that the time lapsed in the first case is less than in the second case.
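The arithmetic of this example can be reproduced with a short script. This is only a sketch: the event coordinates are those of the table, the variable and function names are illustrative, and the absolute value implements the sign convention of Ftn. 17 (an assumption needed because, with the tabulated coordinates, the first segment has a larger space separation than time separation and the second has equal separations).

```python
import math

# (time, space) coordinates of Events 1 to 4, both in years, from the table.
events = [(0.0, 1.0), (1.0, -1.0), (2.5, 0.5), (8.0, 3.0)]


def interval(e0, e1):
    """sqrt(|dt^2 - dx^2|): the interval taken as a positive quantity (Ftn. 17)."""
    dt = e1[0] - e0[0]
    dx = e1[1] - e0[1]
    return math.sqrt(abs(dt * dt - dx * dx))


# Route 1 -> 2 -> 3 -> 4: sum over the three segments.
via_events = sum(interval(a, b) for a, b in zip(events, events[1:]))
# Route 1 -> 4 directly.
direct = interval(events[0], events[-1])

if __name__ == "__main__":
    print(round(via_events, 2), round(direct, 2))
```

Rounded to two decimals the two totals agree with the values 6.63 and 7.75 quoted in the text.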

2 THE EINSTEIN UNIVERSE

When Einstein propounded his theory of general relativity and wrote his famous equation

$G_{\mu\nu} = 8\pi T_{\mu\nu}$   (8.2.1)

relating the 'gravity' on the LHS with the 'matter' on the RHS, one of his goals was to unify gravitation and the electromagnetic force since he thought that these were the only forces in nature. However, from his joint work with Grossmann in 1913 (see Sec. 17.7 in [26]), it is well known that his aim initially was to put accelerated frames on the same footing as the inertial frames of reference.

17 The signs on the RHS indicate that the interval is a positive quantity.

In order to write the above equation, he used the rules of differential geometry, bearing in mind that the equation, being a tensor one, was invariant under coordinate transformations and as such it incorporated within itself the principle of equivalence [24]. At present, of course, not only has the number of forces in nature changed from two to four but the method of description has been refined as well. For example, one no longer talks of simultaneity of events (as postulated by Einstein in his theory); instead one talks of spacelike hypersurfaces. Fortunately for Einstein and for posterity, the tensor equation (8.2.1) was observed to hold ground when experiments were made involving known forms of matter. The obvious questions that followed were: (i) given a matter field, how to construct $T_{\mu\nu}$ and (ii) what were the suitable metrics that could be used to formulate the gravitational part $G_{\mu\nu}$, which as we know from Eq. (8.1.17)(b) stands for:

$R_{\mu\nu} - \tfrac{1}{2} R\,g_{\mu\nu} + \Lambda g_{\mu\nu}.$

In other words, a search for solutions of the above equation became a priority for physicists. Most of the solutions of this equation were obtained locally in the early fifties; their global properties, however, were investigated in the sixties after the pioneering work of Penrose, Chandrasekhar and Hawking. We shall study both of these ((i) and (ii)) in Sec. (8.3) and Sec. (8.4) respectively. Our attempt here (in this section) is to give the main ingredients of the theory that are required for its description, namely the mathematical model of space-time (the collection of all events), the matter fields and the postulates of local causality and local conservation of energy and momentum. With these in place, we shall be able to write the equations and study their structural properties.

2.1 The Mathematical Model

Consider an equivalence class of pairs $(\mathcal{M}, g)$ where $\mathcal{M}$ is a 4-dimensional, connected, Hausdorff $C^\infty$-manifold and $g$ is a Lorentz metric (i.e., a metric of signature +2) on $\mathcal{M}$. Any two pairs $(\mathcal{M}, g)$, $(\mathcal{M}', g')$ of this class are isometric, meaning thereby that there exists a diffeomorphism $\theta: \mathcal{M} \to \mathcal{M}'$ such that $\theta_* g = g'$, i.e., $\theta$ carries the metric $g$ into the metric $g'$. This equivalence class (represented by one of the pairs $(\mathcal{M}, g)$) is the mathematical model for spacetime. In the terminology of Sec. 1, this is the collection of all events. We shall refer to it, or rather to the pair $(\mathcal{M}, g)$, as a spacetime manifold. Since the word manifold intuitively implies continuity, we note here that the continuity in this case has been established by experiments for distances down to approximately $10^{-15}$ cm; for distances smaller than this, the manifold model of spacetime defined above may therefore not be appropriate (see [16c]). The assumption of connectedness on $\mathcal{M}$ suggests that we have the knowledge of all events, since there are no disconnected components. Finally, the Hausdorff condition together with the existence of a Lorentz metric implies that $\mathcal{M}$ is paracompact. It is this $\mathcal{M}$ which we shall coordinatize and write the field equations upon. Now the metric $g$ allows the classification of non-zero vectors at a point $p \in \mathcal{M}$ as timelike, spacelike or null, according as a non-zero vector $X \in T_p$ (the tangent space at $p$) satisfies: $g(X, X) < 0$, $g(X, X) > 0$ or $g(X, X) = 0$. The differentiability of the metric plays an important role in the writing down of the field equations (as we shall soon see). If, however, the metric coordinate components $g_{ab}$ and $g^{ab}$ are just continuous and have locally square integrable generalized first derivatives with respect to the local coordinates, then the field equations can be set up only in a distributional sense. To avoid complications we shall take the metric to be $C^k$ in general.
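The classification of tangent vectors by the sign of $g(X, X)$ can be sketched in code as follows. The flat metric used as the default, with components diag(-1, 1, 1, 1) and the time component listed first, is an assumption made for the illustration (any Lorentz metric of signature +2 evaluated at the point $p$ could be passed instead); the names `ETA`, `g` and `classify` are hypothetical.

```python
# Components of a flat Lorentz metric of signature +2 (assumed ordering: time first).
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def g(metric, X, Y):
    """Evaluate g(X, Y) = sum over a, b of g_ab X^a Y^b."""
    return sum(metric[a][b] * X[a] * Y[b] for a in range(4) for b in range(4))


def classify(X, metric=ETA, tol=1e-12):
    """Return 'timelike', 'spacelike' or 'null' for a non-zero vector X."""
    if all(component == 0 for component in X):
        raise ValueError("the classification applies to non-zero vectors only")
    norm = g(metric, X, X)
    if norm < -tol:
        return "timelike"
    if norm > tol:
        return "spacelike"
    return "null"


if __name__ == "__main__":
    for vector in [(1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0), (2, 1, 1, 1)]:
        print(vector, classify(vector))
```

The tolerance `tol` is a numerical convenience for floating-point input; mathematically the null case is exactly $g(X, X) = 0$.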
There is still one more condition that we have to impose on the model $(\mathcal{M}, g)$ to ensure that all the nonsingular points of space-time are included.


A $C^r$ pair $(\mathcal{M}', g')$ is called a $C^r$-extension of $(\mathcal{M}, g)$ if there is an isometric $C^r$-imbedding $\mu: \mathcal{M} \to \mathcal{M}'$. Evidently in this case the points of $\mathcal{M}'$ will also have to be viewed as the points of spacetime. We shall therefore assume that the model $(\mathcal{M}, g)$ is $C^r$-inextendible. Although it may seem so, not all models are inextendible. A simple example of a non-inextendible model is given by a pair $(\mathcal{M}_1, g_1)$ where $\mathcal{M}_1$ is a two-dimensional Euclidean space with the x-axis removed between the points $x_1 = -\tfrac{1}{2}$ and $x_1 = +\tfrac{1}{2}$. Obviously $(\mathcal{M}_1, g_1)$ can be extended by replacing the unit interval by an arbitrary interval. There are, of course, other ways in which it can be extended. This leads us to a still stronger condition of inextendibility as defined below.

Definition 8.2.1: A pair $(\mathcal{M}, g)$ is said to be $C^r$-locally inextendible if there is no open set $\mathcal{U} \subset \mathcal{M}$ with non-compact closure in $\mathcal{M}$ such that the pair $(\mathcal{U}, g|_{\mathcal{U}})$ has an extension $(\mathcal{U}', g')$ in which the closure of the image of $\mathcal{U}$ is compact.

2.2 The Matter Fields

The fields that describe the matter content of spacetime are called matter fields. A classic example of such fields is the familiar electromagnetic field. Since these fields are defined on a differentiable manifold $\mathcal{M}$ with metric $g$, the equations involving them are expressed via tensors, and their derivatives are covariant derivatives with respect to the symmetric connection defined by the metric $g$. If there is another connection on $\mathcal{M}$, from the rules of differential geometry we know that the difference between two connections is a tensor; this tensor is regarded here as a physical field. If $\mathcal{M}$ carries another metric, that is also viewed as another physical field. Finally, the theory one obtains depends on the matter fields that one incorporates into the theory. The rule of thumb here is to include all fields that have been experimentally observed and to postulate further the existence of those which are still undetected (experimentally). We will use the notation $\Psi_{(i)}{}^{a \ldots b}{}_{c \ldots d}$ to denote these matter fields. The subscript $(i)$ will denote the i-th field of the theory, and as usual the superscripts (subscripts) will stand for the contravariant (covariant) indices, indicating the tensor character of $\Psi$. We now describe the two postulates concerning these matter fields. Both of them are common to the two theories of relativity, the special and the general.

2.3 Postulate (a): Local Causality

The equations governing the matter fields must be such that given a convex normal neighbourhood $\mathcal{U}$ and a pair of points $p$ and $q$ in it, a signal can be sent from $p$ to $q$ if and only if they can be joined by a $C^1$-curve lying entirely in $\mathcal{U}$ whose tangent vector is everywhere non-zero and non-spacelike.18 The above postulate can alternatively be given in terms of the Cauchy problem of the matter fields in the following manner (see also Sec. 3). Let $p \in \mathcal{U}$ be such that every non-spacelike curve through $p$ intersects the spacelike surface $x^4 = 0$ within $\mathcal{U}$; denote this set of points of $x^4 = 0$ by $\mathcal{J}$. Note that $\mathcal{J}$ consists of points that can be reached from $p$ by non-spacelike curves lying entirely in $\mathcal{U}$. It is required that the values of the matter fields at $p$ are uniquely determined by the values of the fields and their derivatives of finite order, say $k$, on $\mathcal{J}$, and not by the values of the fields on a proper subset $\mathcal{J}'$ of $\mathcal{J}$ to which $\mathcal{J}$ could be continuously retracted. A few important consequences that follow from the adherence to this postulate are:

(i) The metric $g$ is a distinctively different field on $\mathcal{M}$, which is geometric in nature.
(ii) Using $\{x^a\}$ as normal coordinates in $\mathcal{U}$ around $p$, the coordinates of the points which can be reached from $p$ by non-spacelike curves in $\mathcal{U}$ are seen to satisfy: $(x^1)^2 + (x^2)^2 + (x^3)^2 - (x^4)^2 \le 0$. The boundary of these points is formed by the image of the null cone $N_p$ of $p$ under the exponential map; evidently it is the set of all null geodesics through $p$. The null cone separates the timelike vectors and spacelike vectors at $p$.
(iii) One can determine the metric at $p$ up to a conformal factor once $N_p$ is known (see Exc. 1 for (ii) and (iii)).

18 A tangent vector $X$, which is either timelike ($g(X, X) < 0$) or null ($g(X, X) = 0$), is called non-spacelike. We shall use the coordinate $x^4$ in place of $x^0$ from now on to distinguish the general theory.

2.4

Postulate (b): Local Conservation of Energy and Momentum

The equations governing the matter fields imply the existence of a symmetric tensor $T^{ab}$ known as the energy-momentum tensor. The tensor $T^{ab}$ depends on the fields, their covariant derivatives and the metric, and satisfies the following properties:
(i) It vanishes on $\mathcal{U}$ if and only if all the matter fields vanish on $\mathcal{U}$.
(ii) It obeys the equation

$$T^{ab}{}_{;b} = 0 \qquad (8.2.2)$$

where ';' denotes covariant differentiation with respect to the given metric. The first of these conditions expresses the principle that all fields have energy, and the second gives a 'conservation law' provided the metric $g$ admits Killing vector fields (see Exc. 2). The symmetric tensor $T^{ab}$ of Postulate (b) is as yet not defined; we see next how it can be uniquely determined when the equations of the fields are derived from a Lagrangian.

2.5

Construction of the Energy-momentum Tensor Tab

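Let $L$ be the Lagrangian (a scalar) formed by the fields $\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d}$,19 their first covariant derivatives and the metric, and let $S$ be the action:

$$S = \int_{\mathcal{D}} L\, dv \qquad (8.2.3)$$

where $\mathcal{D}$ is a 4-dimensional compact region of a spacetime manifold $\mathcal{M}$ and $dv$ is the volume element. From our earlier study of the action $S$ (in Chapters 6 and 7), we know that the equations for a physical system of fields are obtained by requiring that the action $S$ be stationary for all variations of the fields in the interior of $\mathcal{D}$. The action $S$ is said to be stationary if $\left.\dfrac{dS}{du}\right|_{u=0} = 0$ for all variations of the fields in $\mathcal{D}$, $u$ being a parameter used in the following definition.

Definition 8.2.2: A one-parameter family of fields $\Psi_{(i)}(u, r)$, where $u \in (-\epsilon, \epsilon)$ and $r \in \mathcal{M}$, is called a variation of the fields $\Psi_{(i)}$ if

$$\text{(i)}\ \ \Psi_{(i)}(0, r) = \Psi_{(i)}(r), \qquad \text{(ii)}\ \ \Psi_{(i)}(u, r) = \Psi_{(i)}(r), \quad r \in \mathcal{M} - \mathcal{D}. \qquad (8.2.4)$$

19 Indices a, b, ... are used when $\mathcal{M}$ is an arbitrary spacetime manifold.

Denoting

$$\left.\frac{\partial \Psi_{(i)}(u, r)}{\partial u}\right|_{u=0} \ \text{by}\ \Delta\Psi_{(i)} \qquad (8.2.5)$$

we have:

$$\left.\frac{dS}{du}\right|_{u=0} = \sum_{(i)} \int_{\mathcal{D}} \left[ \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d}}\, \Delta\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d} + \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}}\, \Delta\!\left(\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}\right) \right] dv \qquad (8.2.6)$$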

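Recalling that the symbol ';' denotes the covariant derivative, we note that $\Delta\Psi_{(i)}$ satisfies

$$\Delta\!\left(\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}\right) = \left(\Delta\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d}\right)_{;e}$$

and hence the second term in (8.2.6) can be written as:

$$\sum_{(i)} \int_{\mathcal{D}} \left[ \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}}\, \Delta\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d} \right)_{;e} - \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}} \right)_{;e} \Delta\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d} \right] dv \qquad (8.2.7)$$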

The expression within the parenthesis in the first term can be regarded as the component $Q^e$ of a vector $Q$; this allows the first term to be written as:

$$\int_{\mathcal{D}} Q^e{}_{;e}\, dv = \int_{\partial\mathcal{D}} Q^e\, d\sigma_e \qquad (8.2.8)$$

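Now condition (ii) of (8.2.4) implies that $\Delta\Psi_{(i)}$ vanishes at the boundary $\partial\mathcal{D}$, which means that the first term is zero for every field. Equation (8.2.6) can now be written using the second term of (8.2.7). Putting these together, we have that $\left.\dfrac{dS}{du}\right|_{u=0}$ vanishes for all variations on all compact regions such as $\mathcal{D}$ if and only if the Euler-Lagrange equations:

$$\frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d}} - \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}} \right)_{;e} = 0 \qquad (8.2.9)$$

hold for all $(i)$. These are the required equations of the fields.

To obtain the energy-momentum tensor from the Lagrangian, we consider the change in the action that is induced by a variation in the metric. We assume that a variation $g_{ab}(u, r)$ leaves the fields $\Psi_{(i)}$ unchanged. Since the volume element depends on the metric, we also need

$$\frac{\partial (dv)}{\partial g_{ab}} = \frac{1}{2}\, g^{ab}\, dv \qquad (8.2.10)$$

(see Exc. 3 for the derivation). Piecing these facts together we now have:

$$\left.\frac{dS}{du}\right|_{u=0} = \int_{\mathcal{D}} \left( \sum_{(i)} \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}}\, \Delta\!\left(\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}\right) + \frac{\partial L}{\partial g_{ab}}\, \Delta g_{ab} \right) dv + \int_{\mathcal{D}} L\, \frac{\partial (dv)}{\partial g_{ab}}\, \Delta g_{ab} \qquad (8.2.11)$$

with the provision that the second integral, in view of (8.2.10), be written as

$$\frac{1}{2} \int_{\mathcal{D}} \left( L\, g^{ab}\, \Delta g_{ab} \right) dv \qquad (8.2.12)$$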

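Now the variation in the metric induces a variation in the connection as given below (in terms of components):

$$\Delta\Gamma^a{}_{bc} = \tfrac{1}{2}\, g^{ad} \left\{ (\Delta g_{db})_{;c} + (\Delta g_{dc})_{;b} - (\Delta g_{bc})_{;d} \right\} \qquad (8.2.13)$$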

Using (8.2.13), $\Delta(\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e})$ can be expressed in terms of $(\Delta g_{lm})_{;n}$ ($l, m, n$ stand for suitable combinations of $a \ldots b$, $c \ldots d$ and $e$), and then, applying the usual integration by parts technique, the integrand can be shown to involve $\Delta g_{ab}$ only. Finally, collecting the coefficients of $\Delta g_{ab}$ from the simplified version of (8.2.11), we can write $\dfrac{dS}{du}$ as:

$$\int_{\mathcal{D}} \left( T^{ab}\, \Delta g_{ab} \right) dv \qquad (8.2.14)$$

The symmetric tensor $T^{ab}$ is the required energy-momentum tensor of the given fields (see Exc. 4 and 5 on the construction of $T^{ab}$, and Exc. 6 for the dependence of the conservation equations on the field equations). We shall now use the above two postulates, along with the information gathered in the various exercises, to write the field equations of Einstein.

2.6

The Field Equations

To write these equations we have to choose the metric $g$ in $(\mathcal{M}, g)$, which we have so far not specified beyond mentioning briefly that it is locally a Lorentzian metric of signature 2. The easiest course would be to choose a flat metric as in special relativity and, since the theory of special relativity does not include gravitational effects, to introduce an additional field to bring in gravitation. But the choice of a flat metric does not work, since experiments have shown that light rays travelling near the sun are deflected, which means that spacetime can be neither flat nor conformally flat. As for the introduction of a separate gravitational field, it follows from Postulate (b) that this field would be incorporated in the energy-momentum tensor, thus leaving the theory with no gravitational field.


From these observations it follows that the gravitational field should be the result of the curvature of spacetime. Also, the equations written with the field should be such that their predictions do not contradict the Newtonian principles of gravitational theory.20 According to Newtonian principles, the active gravitational mass of a body (the mass producing a gravitational field) equals the passive gravitational mass (the mass that is acted upon by the gravitational field),21 and the field equations do not involve time. Einstein's field equations incorporate these principles in the following manner:
(i) They include a constant $G$ ($G_{\mu\nu} = 8\pi G\, T_{\mu\nu}$) known as the Newtonian gravitational constant;
(ii) They are formulated in terms of a static metric.

A metric is called static if it admits a timelike Killing vector field $K$ which is orthogonal to a family of spacelike surfaces. These surfaces are regarded as surfaces of constant time and are labelled as $t = c$. The vector field $K$ defines a unit vector field $V = f^{-1}K$, where $f^2 = -K^aK_a$.* The integral curves of $V$ (which are also integral curves of $K$) define the static frame of reference (see Exc. 7); thus a particle whose history is one of these integral curves appears to be at rest, the metric in its neighbourhood remaining unchanged in time. A particle released from rest and following a geodesic would appear to have an initial acceleration of $-\dot V$ (defined below) with respect to the static frame. Hence if $f \approx 1$, the initial acceleration of the particle is approximately $-\nabla f$. This analysis suggests that the quantity $f$ plays the role of a gravitational potential. For a static metric we have

$$V^a{}_{;b} = -\dot V^a V_b \qquad (8.2.15)(a)$$

where

$$\dot V^a = V^a{}_{;b} V^b = f^{-1} f_{;b}\, g^{ab} \qquad (8.2.15)(b)$$

Hence the divergence of $\dot V^a$ can be written as:

$$\dot V^a{}_{;a} = (V^a{}_{;b} V^b)_{;a} = V^a{}_{;ba} V^b + V^a{}_{;b} V^b{}_{;a} = R_{ab} V^a V^b + (V^a{}_{;a})_{;b} V^b + (\dot V_b V^b)^2 = R_{ab} V^a V^b \qquad (8.2.16)$$

(where we have used (0.5) and the result of Exc. 7). Also, since $\dot V^a$ can be expressed in terms of $f$ and the metric tensor, we have:

$$\dot V^a{}_{;a} = (f^{-1} f_{;b}\, g^{ab})_{;a} = -f^{-2} f_{;a} f_{;b}\, g^{ab} + f^{-1} f_{;ba}\, g^{ab} \qquad (8.2.17)$$

Again,

$$f_{;ab} V^a V^b = -f_{;a} V^a{}_{;b} V^b = -f^{-1} f_{;a} f_{;b}\, g^{ab} \qquad (8.2.18)$$

Using these equations together we obtain

$$f_{;ab} (g^{ab} + V^a V^b) = f R_{ab} V^a V^b \qquad (8.2.19)$$

20 See [19] for a beautiful account of the manner in which the metric arises as the carrier of the imprints of gravitation.
21 This principle was verified experimentally in 1968 (see Sec. 40.8 in [26], and Appendix 8A).
* Evidently a, b in this section stand for 1, 2, 3.


But the term on the left is the Laplacian of $f$ with respect to the induced metric in the 3-surface ($t = c$). If the metric is almost flat, it corresponds to the Newtonian Laplacian of the potential. Now 'almost flatness' implies a weak (gravitational) field; hence it follows that in the limit of a weak field, if the RHS term in (8.2.19) is equal to $4\pi G$ times the matter density plus any other terms which are small in the weak field limit, the theory obtained with the above prescription would agree with the Newtonian theory. To achieve this we set the relation:

$$R_{ab} = K_{ab} \qquad (8.2.20)$$

where (in view of the above discussion) $K_{ab}$ must be a tensorial function of the energy-momentum tensor and the metric, and must be such that $(4\pi G)^{-1} K_{ab} V^a V^b$ equals the sum of the matter density and terms which are small in the Newtonian (i.e. weak field) limit. Suppose we take $K_{ab}$ simply as $4\pi G\, T_{ab}$; then, since $R_{ab}$ satisfies the contracted Bianchi identities $R^b{}_{a;b} = \tfrac{1}{2} R_{;a}$, the equality (8.2.20) leading to (8.2.21) would eventually imply $T_{;a} = 0$, as $T^b{}_{a;b} = 0$ (in view of the conservation equations). But this contradicts the actual phenomena of nature.22 Hence the expression for $K_{ab}$ which satisfies (8.2.21) must be:

$$K_{ab} = \kappa \left( T_{ab} - \tfrac{1}{2} T g_{ab} \right) + \Lambda g_{ab} \qquad (8.2.22)$$

where $\kappa$ and $\Lambda$ are constants whose values can be determined from the Newtonian limit. Using the Newtonian limit we can assign the value $8\pi G$ to the constant $\kappa$. Furthermore, if we use units of mass in which $G = 1$, the equations we are looking for are:

$$R_{ab} = 8\pi \left( T_{ab} - \tfrac{1}{2} T g_{ab} \right) + \Lambda g_{ab} \qquad (8.2.23)$$

These are the well known Einstein equations, which can be written equivalently as:

$$\left( R_{ab} - \tfrac{1}{2} R g_{ab} \right) + \Lambda g_{ab} = 8\pi T_{ab} \qquad (8.2.24)$$
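The equivalence of (8.2.23) and (8.2.24) can be checked by taking a trace. The following is a sketch in four dimensions, using only $g^{ab}g_{ab} = 4$, $R = g^{ab}R_{ab}$ and $T = g^{ab}T_{ab}$:

```latex
% Contract (8.2.24) with g^{ab}, solve for R, and substitute back.
\begin{align*}
g^{ab}\left(R_{ab} - \tfrac12 R\, g_{ab} + \Lambda g_{ab}\right) = 8\pi\, g^{ab}T_{ab}
  \;&\Longrightarrow\; R - 2R + 4\Lambda = 8\pi T
  \;\Longrightarrow\; R = 4\Lambda - 8\pi T, \\
R_{ab} &= 8\pi T_{ab} + \tfrac12 R\, g_{ab} - \Lambda g_{ab} \\
       &= 8\pi T_{ab} + \tfrac12\left(4\Lambda - 8\pi T\right) g_{ab} - \Lambda g_{ab} \\
       &= 8\pi\left(T_{ab} - \tfrac12 T g_{ab}\right) + \Lambda g_{ab}.
\end{align*}
```

The last line is (8.2.23); running the same steps in reverse recovers (8.2.24).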

Since both sides are symmetric, these form a set of ten coupled nonlinear partial differential equations in the metric and its first and second derivatives. These ten equations are not all independent to begin with; the number of independent equations is reduced by the fact that the covariant divergence of each side vanishes identically:

$$\left( R^{ab} - \tfrac{1}{2} R g^{ab} + \Lambda g^{ab} \right)_{;b} = 0 \qquad (8.2.25)(a)$$

$$T^{ab}{}_{;b} = 0 \qquad (8.2.25)(b)$$

22 The vanishing of $T_{;a}$ implies that the difference $\mu - 3p$ = (energy density $-$ 3 $\times$ pressure) of a perfect fluid is constant throughout the space, which is not true (see Exc. 7 and [16c], p. 73).
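The combination $\mu - 3p$ in the footnote can be recovered by taking the trace of the perfect-fluid tensor obtained in the Hint to Exc. 7, $T^{ab} = (\mu + p)V^aV^b + p\,g^{ab}$ with $g(V, V) = -1$. A short check, assuming four spacetime dimensions:

```latex
% Trace of the perfect-fluid energy-momentum tensor, using g_{ab}V^aV^b = -1 and g_{ab}g^{ab} = 4.
T = g_{ab}T^{ab} = (\mu + p)\, g_{ab}V^aV^b + p\, g_{ab}g^{ab} = -(\mu + p) + 4p = 3p - \mu ,
\qquad\text{so}\qquad
T_{;a} = 0 \iff (\mu - 3p)_{;a} = 0 .
```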


Hence, taking into account their symmetries as well as (8.2.25), we note that the number of independent differential equations for the metric that result from Eq. (8.2.23) is only six.

We show next that the Einstein equations can be derived by applying the variational methods established above to an appropriate action. The action under consideration is:

$$S = \int_{\mathcal{D}} \left( \alpha (R - 2\Lambda) + L \right) dv \qquad (8.2.26)$$

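where $\alpha$ is a constant and $L$ is the matter Lagrangian. We require that $S$ be stationary under variations of $g_{ab}$. The variations of the first and second terms of the integrand are:

$$\Delta\!\left( \alpha (R - 2\Lambda)\, dv \right) = \alpha \left( (R - 2\Lambda)\, \tfrac{1}{2} g^{ab} \Delta g_{ab} + R_{ab}\, \Delta g^{ab} + g^{ab}\, \Delta R_{ab} \right) dv \qquad (8.2.27)$$

$$\Delta (L\, dv) = \left( T^{ab}\, \Delta g_{ab} \right) dv \qquad (8.2.28)$$

The last term of (8.2.27) can be further written as:

$$g^{ab}\, \Delta R_{ab}\, dv = g^{ab} \left( (\Delta\Gamma^c{}_{ab})_{;c} - (\Delta\Gamma^c{}_{ac})_{;b} \right) dv = \left( \Delta\Gamma^c{}_{ab}\, g^{ab} - \Delta\Gamma^d{}_{ad}\, g^{ac} \right)_{;c} dv \qquad (8.2.29)$$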

(where we have adjusted the dummy indices and have used the fact that $g^{ab}{}_{;c} = 0$). From the RHS of (8.2.29) it is evident that the integrand is a divergence; therefore $\int g^{ab}\, \Delta R_{ab}\, dv$ can be transformed into an integral over the boundary $\partial\mathcal{D}$. But $\Delta\Gamma^a{}_{bc}$ vanishes on the boundary, hence we have:

$$\left.\frac{dS}{du}\right|_{u=0} = \int_{\mathcal{D}} \left\{ \alpha \left[ \left( \tfrac{1}{2} R - \Lambda \right) g^{ab} - R^{ab} \right] + T^{ab} \right\} \Delta g_{ab}\, dv \qquad (8.2.30)$$

Thus if $\left.\dfrac{dS}{du}\right|_{u=0}$ vanishes for all variations $\Delta g_{ab}$, we obtain the Einstein equations after putting $\alpha = \dfrac{1}{8\pi}$.

In conclusion, we note that the choice of the scalar $R$ in the above action $S$ is the best choice, for if this scalar were replaced by $R_{ab}R^{ab}$ or $R_{abcd}R^{abcd}$, one would obtain equations involving fourth order derivatives of the metric tensor, which would mean that one would have to specify the initial values not only of the metric and its first derivatives but also of its second and third derivatives. Moreover, this would be contrary to other accepted rules of physics, where equations are of first or second order. In view of this, we shall assume that the field equations do not involve derivatives of the metric higher than the second, and that if they are derived from a Lagrangian, then the action must be of the form (8.2.26).

Having decided upon the form of the action, one may still ask whether it would be possible to restrict the metric and demand that the action be stationary under variations of that restricted metric. For example, would it be possible to consider a conformally flat metric:

$$g_{ab} = \Omega^2 \eta_{ab} \qquad (8.2.31)$$

and then seek the equations based on variations of this metric? Naturally the equations would now be obtained after replacing $\Delta g_{ab}$ with $2\Omega^{-1} (\Delta\Omega)\, g_{ab}$. The theory based on the conformally flat metric is called the Nordström theory. It is in agreement with the Newtonian theory if $\Lambda$ is small or zero and $\alpha = -1/(24\pi)$; however, it is inconsistent with the observed deflection of light by massive objects and fails to account for the measured advance of the perihelion of Mercury. We may mention here that these two drawbacks can be removed by choosing the metric as:

$$g_{ab} = \Omega^2 \left( \eta_{ab} + X_a X_b \right) \qquad (8.2.32)$$

where $X_a$ is an arbitrary one-form field. The theory gives the Newtonian limit in a static metric in which $X_a$ is parallel to the timelike Killing vector. However, there could be other static metrics in which $X_a$ is not parallel to the Killing vector, and these would fail to give the Newtonian limit. From these discussions it is evident that the metric $g$ cannot be further restricted (apart from requiring that it be Lorentzian). This brings us to the third postulate:

2.7

Postulate (c): Field Equations

Einstein's field equations (8.2.24) hold on $\mathcal{M}$. The predictions of these field equations agree, within experimental error, with the observations made on the deflection of light and the advance of the perihelion of Mercury (see C. M. Will in [16d]). In the next two sections we shall study the Cauchy problem related to these gravitational field equations and give a brief description of spacetime singularities.

Exercise 8.2
1. Show that the postulate (a) of local causality helps to determine the ratio of the magnitudes of a timelike vector and a spacelike vector at every point $p \in \mathcal{M}$, and also enables one to measure the metric up to a conformal factor.
2. Show that if the metric $g$ admits a Killing vector field $K$, then the conservation equation (8.2.2) gives the conservation law, i.e., $K_{a;b} + K_{b;a} = 0$ and $T^{ab}{}_{;b} = 0 \Rightarrow (K_a T^{ab})_{;b} = 0$.
3. Show that $\dfrac{\partial (dv)}{\partial g_{ab}} = \dfrac{1}{2}\, g^{ab}\, dv$.
4. Obtain the Euler-Lagrange equations and the energy-momentum tensor for a scalar field $\Psi$.
5. Show that for the electromagnetic field $A$, the energy-momentum tensor given in terms of the electromagnetic tensor field $F = 2dA$, whose components are $F_{ab} = 2A_{[b;a]}$, is:

(a) $T^{ab} = \dfrac{1}{4\pi} \left( F^{ac} F^{bd} g_{cd} - \tfrac{1}{4}\, g^{ab} F_{ij} F_{kl}\, g^{ik} g^{jl} \right)$

when the Lagrangian considered is:

(b) $L = -\dfrac{1}{16\pi}\, F_{ab} F_{cd}\, g^{ac} g^{bd}$.

6. Show that the tensor $T^{ab}$ satisfies the conservation equations (8.2.2) as a consequence of the field equations (8.2.9) satisfied by the fields $\Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d}$. In other words, it is the field equations that lead to the conservation equations.
7. Obtain the energy-momentum tensor $T^{ab}$ for a perfect fluid.

Hints to Exercise 8.2
1. $\mathcal{U}$ is a convex normal neighbourhood, hence the coordinates of the points that can be reached from $p$ by non-spacelike curves satisfy:

(i) $(x^1)^2 + (x^2)^2 + (x^3)^2 - (x^4)^2 \le 0$.

Thus by observing which points can communicate with $p$, one can determine the null cone $N_p$ (the boundary set of points) in $T_p$. The knowledge of the null cone gives the ratio mentioned in the exercise and determines the metric up to a conformal factor, as we shall see below. Let $X, Y \in T_p$ be respectively timelike and spacelike vectors and let $\lambda$ be a variable such that $X + \lambda Y$ is a null vector; then the quadratic equation in $\lambda$

(ii) $g(X + \lambda Y, X + \lambda Y) = g(X, X) + 2\lambda\, g(X, Y) + \lambda^2 g(Y, Y) = 0$

has two real roots, since $g(X, X) < 0$ and $g(Y, Y) > 0$. If $N_p$ is known, $\lambda_1$ and $\lambda_2$ can be determined, giving the required ratio between the magnitudes of a timelike vector and a spacelike vector:

(iii) $\lambda_1 \lambda_2 = g(X, X)/g(Y, Y)$.

Suppose now $X'$ and $Y'$ are any two non-null vectors at $p$; then

(iv) $g(X', Y') = -\tfrac{1}{2} \left( g(X', X') + g(Y', Y') - g(X' + Y', X' + Y') \right)$.

Each of the magnitudes on the RHS can now be compared with the magnitude of either $X$ or $Y$, eventually leading to the determination of $g(X', Y')/g(X, X)$ or $g(X', Y')/g(Y, Y)$. This means that the metric can be determined up to a conformal factor. (If $X' + Y'$ turns out to be a null vector, the expression (iv) may be suitably altered, replacing $X' + Y'$ by $X' + 2Y'$.)

2. If $K$ is a Killing vector field, then locally it satisfies:

(i) $K_{a;b} + K_{b;a} = 0$.

Define

(ii) $P^a = T^{ab} K_b$;

then we have:

(iii) $P^a{}_{;a} = T^{ab}{}_{;a} K_b + T^{ab} K_{b;a}$.

The first term in (iii) is zero because of the conservation equation (8.2.2), and the second is zero because $T^{ab}$ is symmetric and $K$ satisfies (i). Hence if $\mathcal{D}$ is a compact orientable region with boundary $\partial\mathcal{D}$, we have (Gauss' theorem)

(iv) $\displaystyle \int_{\partial\mathcal{D}} P^b\, d\sigma_b = \int_{\mathcal{D}} P^b{}_{;b}\, dv = 0$

which gives the conservation law in the following sense: the vanishing of $P^b{}_{;b}$ in $\mathcal{D}$ implies that the total flux of the $K$-component of energy-momentum over a closed surface ($\partial\mathcal{D}$) is zero. In the case of a flat metric (just as in special relativity), the Killing vectors are simply

(v) $K_\mu = \dfrac{\partial}{\partial x^\mu} \quad (\mu = 1, 2, 3, 4)$

and the conservation law is therefore obvious. When the metric is not flat, in general there will be no Killing vectors and so the conservation law will not hold. However, one may introduce normal coordinates in a suitable neighbourhood of a point $p$ and then define the Killing vectors as in (v) to obtain the conservation law in that neighbourhood of $p$.

3. Recall that the volume element $dv$ depends on the metric, and therefore varies with the metric. Now $dv$ is the four-form $\omega$, whose components are:

(i) $\omega_{abcd} = (-g)^{1/2}\, 4!\, \delta^1{}_{[a} \delta^2{}_b \delta^3{}_c \delta^4{}_{d]} = (-g)^{1/2} \Lambda$ 23

where $g = \det(g_{ab})$. This gives:

(ii) $\dfrac{\partial \omega_{abcd}}{\partial g_{ef}} = -\dfrac{1}{2} (-g)^{-1/2}\, \dfrac{\partial g}{\partial g_{ef}}\, \Lambda = -\dfrac{1}{2} (-g)^{-1/2}\, g^{ef} g\, \Lambda = \dfrac{1}{2}\, g^{ef}\, \omega_{abcd}$.

Hence

$$\frac{\partial (dv)}{\partial g_{ab}} = \frac{1}{2}\, g^{ab}\, dv.$$

4. Using a scalar field $\Psi$ the Lagrangian can be written as:

(i) $L = -\dfrac{1}{2}\, \Psi_{;a} \Psi_{;b}\, g^{ab} - \dfrac{1}{2}\, \dfrac{m^2}{\hbar^2}\, \Psi^2$

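where $m$ and $\hbar$ are constants (see also Chapters 3 and 9); in fact if $\Psi$ represents a particle, then $m$ is the mass of that particle and $\hbar$ is Planck's constant. The Euler-Lagrange equations (8.2.9) in this case are:

(ii) $\dfrac{\partial L}{\partial \Psi} - \left( \dfrac{\partial L}{\partial \Psi_{;c}} \right)_{;c} = 0$.

23 We have used the notation $\Lambda$ for $4!\, \delta^1{}_{[a} \delta^2{}_b \delta^3{}_c \delta^4{}_{d]}$ here; this should not be confused with the cosmological constant introduced in Eq. (8.1.17)(b) and then used in other subsequent equations.

Now

(iii) $\dfrac{\partial L}{\partial \Psi_{;c}} = -\dfrac{1}{2} \left( \delta^c{}_a \Psi_{;b}\, g^{ab} + \Psi_{;a} \delta^c{}_b\, g^{ab} \right) = -\dfrac{1}{2} \left( \Psi_{;k}\, g^{ck} + \Psi_{;k}\, g^{kc} \right)$

($k$ is used to replace the dummy indices $b$ and $a$ in the line above). Using the symmetry of $g^{ck}$ we have:

(iv) $\dfrac{\partial L}{\partial \Psi_{;c}} = -\Psi_{;k}\, g^{ck}$.

Substituting this in (ii) and using $a, b$ for $k$ and $c$, we find that the Euler-Lagrange equations are:

$$\Psi_{;ab}\, g^{ab} - \frac{m^2}{\hbar^2}\, \Psi = 0.$$

To obtain the energy-momentum tensor we use the equality (8.2.11) to write the integrand for this case:

(v) Integrand $= \dfrac{\partial L}{\partial \Psi_{;c}}\, \Delta(\Psi_{;c}) + \dfrac{\partial L}{\partial g_{ab}}\, \Delta g_{ab} + \dfrac{1}{2}\, L\, g^{ab}\, \Delta g_{ab}$.

The first term on the RHS is zero since the variation $\Delta(\Psi_{;c}) = 0$: $\Psi$ being a scalar, $\Psi_{;c}$ does not contain a connection term. Simplifying the other two terms and adjusting the indices we have:

(vi) $T^{ab} = \Psi^{;a} \Psi^{;b} - \dfrac{1}{2}\, g^{ab} \left( \Psi_{;c} \Psi_{;d}\, g^{cd} + \dfrac{m^2}{\hbar^2}\, \Psi^2 \right)$.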

5. When we use the Lagrangian given in (b), i.e.,

(i) $L = -\dfrac{1}{16\pi}\, F_{ab} F_{cd}\, g^{ac} g^{bd}$

we note that although the initial field is $A$, the Lagrangian $L$ is in terms of the tensor field $F$; accordingly, in writing the integrand (given in (v) of Exc. 4), the tensor $F_{ab}$, etc., has to be used. Thus we have:

(ii) Integrand $= \dfrac{\partial L}{\partial F_{ij;k}}\, \Delta(F_{ij;k}) + \dfrac{\partial L}{\partial g_{ab}}\, \Delta g_{ab} + \dfrac{1}{2}\, L\, g^{ab}\, \Delta g_{ab}$.

Evidently the first term is zero, and the other two terms when simplified give the required expression (a). We note that the relation $F_{ab} = 2A_{[b;a]}$ is required for showing $T^{ab}{}_{;b} = 0$.

6. To establish the result given in this exercise, we recall two properties of diffeomorphisms on $\mathcal{M}$. Suppose $\phi : \mathcal{M} \to \mathcal{M}$ is a diffeomorphism which is the identity everywhere except in the interior of a compact region $\mathcal{D}$ of $\mathcal{M}$; then an integral

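$$I = \int_{\mathcal{D}} F$$

is invariant under the differential map $\phi^*$ induced by $\phi$, i.e.,

(i) $\displaystyle \int_{\mathcal{D}} F = \int_{\phi(\mathcal{D})} F = \int_{\mathcal{D}} \phi^*(F)$.

In case $\phi = \phi_t$ belongs to the one-parameter family of diffeomorphisms generated by a vector field $X$, the Lie derivative of a tensor field $T$ is

(ii) $L_X T\big|_p = \lim_{t \to 0} \dfrac{1}{t} \left( T\big|_p - \phi_{t*} T\big|_p \right)$

where $p$ is a point in whose neighbourhood $\phi_t$ is defined. In this particular case we again assume that the diffeomorphism of $\mathcal{M}$ is the identity everywhere except in the interior of $\mathcal{D}$. Then the action (8.2.3), being invariant under the induced map $\phi^*$ (see Exc. 3):

(iii) $\displaystyle S = \int_{\mathcal{D}} L\, dv = \frac{1}{4!} \int_{\mathcal{D}} L\omega = \frac{1}{4!} \int_{\phi(\mathcal{D})} L\omega = \frac{1}{4!} \int_{\mathcal{D}} \phi^*(L\omega)$

implies that

(iv) $\displaystyle \frac{1}{4!} \int_{\mathcal{D}} \left( L\omega - \phi^*(L\omega) \right) = 0$.

If the diffeomorphism $\phi$ is generated by a vector field $X$ which vanishes everywhere except in the interior of $\mathcal{D}$, then (iv) amounts to:

(v) $\displaystyle \frac{1}{4!} \int_{\mathcal{D}} L_X (L\omega) = 0$.

Writing this Lie derivative in full we have:

(vi) $\displaystyle \frac{1}{4!} \int_{\mathcal{D}} L_X (L\omega) = \sum_{(i)} \int_{\mathcal{D}} \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d}} - \left( \frac{\partial L}{\partial \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d;e}} \right)_{;e} \right) L_X \Psi_{(i)}{}^{a\ldots b}{}_{c\ldots d}\, dv + \int_{\mathcal{D}} \left( T^{ab} L_X g_{ab} \right) dv$

(where we have used $L_X(L\omega) = (L_X L)\omega + L(L_X \omega)$ and have replaced $\tfrac{1}{4!}\omega$ by $dv$). The first term vanishes by virtue of the field equations (8.2.9), and since $L_X g_{ab} = X_{a;b} + X_{b;a}$ the second term gives

(vii) $\displaystyle \int_{\mathcal{D}} \left( T^{ab} L_X g_{ab} \right) dv = 2 \int_{\mathcal{D}} \left\{ \left( T^{ab} X_a \right)_{;b} - T^{ab}{}_{;b} X_a \right\} dv$.

Since the vector field $X$ is zero on the boundary $\partial\mathcal{D}$, writing the first term as:

$$2 \int_{\mathcal{D}} Y^b{}_{;b}\, dv = 2 \int_{\partial\mathcal{D}} Y^b\, d\sigma_b$$

we note that it is zero. Thus (v) will hold only if the second term in (vii) is zero for arbitrary $X$, which is true only if $T^{ab}{}_{;b} = 0$. We have thus shown that the conservation equation is a consequence of the field equations.

7. We have to obtain here the energy-momentum tensor of matter which is a fluid; as such it is described by a function $\rho$, the particle number density, and by a congruence24 of timelike curves, the world flow lines25 of the fluid elements. The congruence can be represented by a diffeomorphism:

(i)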

$$\gamma : [a, b] \times \mathcal{N} \to \mathcal{D}$$

where $[a, b] \subset \mathbb{R}$, $\mathcal{N}$ is a 3-dimensional manifold with boundary and $\mathcal{D}$ is a sufficiently small compact region of $(\mathcal{M}, g)$. Since the congruence is timelike, the tangent vector $W = (\partial/\partial t)_\gamma$, $t \in [a, b]$, is timelike, i.e., the corresponding unit tangent vector $V = (-g(W, W))^{-1/2}\, W$ satisfies $g(V, V) = -1$. The fluid particle current vector is defined by $j = \rho V$, and it is required that this particle current vector be conserved, i.e., $j^a{}_{;a} = 0$. The behaviour of the fluid is determined by prescribing the elastic potential (or the internal energy) $\epsilon$ as a function of $\rho$. We take $L = -\rho(1 + \epsilon(\rho))$ as the Lagrangian and, to obtain $T^{ab}$, require that the action $S$ be stationary when the flowlines are varied subject to the constraint $j^a{}_{;a} = 0$, where $\rho$ is the proper particle density of the fluid. We note that while varying the flowlines, $\rho$ is always adjusted so that $j^a$ remains conserved. Let a differentiable map $\gamma : (-\delta, \delta) \times [a, b] \times \mathcal{N} \to \mathcal{D}$ define a variation of the flowlines such that $\gamma(0, [a, b], \mathcal{N}) = \gamma([a, b], \mathcal{N})$ and, for $u \in (-\delta, \delta)$, $\gamma(u, [a, b], \mathcal{N}) = \gamma([a, b], \mathcal{N})$ on $\mathcal{M} - \mathcal{D}$. Then if $K := (\partial/\partial u)_\gamma$, we have $\Delta W = L_K W$ (see Hint to Exc. 6). The vector $K$ represents the displacement of a point of the flowline under the above variation. In terms of the unit vector $V$, the variation can be written as:

(ii) $\Delta V^a = V^a{}_{;b} K^b - K^a{}_{;b} V^b - V^a V^b K_{b;c} V^c$.

Since $\Delta(j^a{}_{;a}) = 0 = (\Delta j^a)_{;a}$, from $j = \rho V$ it follows that

(iii) $(\Delta\rho)_{;a} V^a + \Delta\rho\, V^a{}_{;a} + \rho_{;a} \Delta V^a + \rho (\Delta V^a)_{;a} = 0$.

Substituting $\Delta V^a$ from (ii) and integrating along the flowlines, we have:

(iv) $\Delta\rho = (\rho K^b)_{;b} + \rho K_{b;c} V^b V^c$.

24 A congruence on a manifold $\mathcal{M}$ is the datum of a family of curves, with one curve through each point of $\mathcal{M}$.
25 The term 'flowlines', which is often used in 3-dimensional Euclidean space, is used here in 4-dimensional spacetime.

The variation of the action integral $S = \int L\, dv$ can therefore be expressed as26:

(v) $\displaystyle \left.\frac{dS}{du}\right|_{u=0} = -\int_{\mathcal{D}} \left( 1 + \epsilon + \rho \frac{d\epsilon}{d\rho} \right) \left( (\rho K^a)_{;a} + \rho K_{a;b} V^a V^b \right) dv = \int_{\mathcal{D}} \left\{ (\mu + p) \dot V^a + p_{;b} \left( g^{ab} + V^a V^b \right) \right\} K_a\, dv$

(where we have integrated by parts and have used $\dot V^a$ for $V^a{}_{;b} V^b$). The action $S$ is stationary if the RHS is zero for all $K$; this means that:

(vi) $(\mu + p) \dot V^a = -p_{;b} \left( g^{ab} + V^b V^a \right)$

where we have written $\rho(1 + \epsilon) = \mu$ and $\rho^2 \left( \dfrac{d\epsilon}{d\rho} \right) = p$ to denote the 'energy density' and the 'pressure' of the fluid respectively. The above equation shows that $\dot V^a$, the acceleration of the flowlines, is given by the pressure gradient orthogonal to the flowlines. In order to obtain $T^{ab}$, we have to vary the metric as well (see Exc. 3 above). Now the conservation of the current can be expressed as:

(vii) $j^a{}_{;a} = (-g)^{-1/2}\, \dfrac{\partial}{\partial x^a} \left( (-g)^{1/2} j^a \right) = 0$

and since this conservation equation determines $j^a$ uniquely at every point of a flowline in terms of its initial value at some given point of that flowline, it follows that $(-g)^{1/2} j^a$ remains the same when the metric is varied. Using the variation of the expression:

$$\rho^2 = g^{-1} \left( (-g)^{1/2} j^a\, (-g)^{1/2} j^b \right) g_{ab}$$

we thus have:

(viii) $2\rho\, \Delta\rho = \left( j^c j_c\, g^{ab} - j^a j^b \right) \Delta g_{ab}$

and in view of the above discussion $T^{ab}$ can be written as:

(ix) $T^{ab} = \left( \rho(1 + \epsilon) + \rho^2 \dfrac{d\epsilon}{d\rho} \right) V^a V^b + \rho^2 \dfrac{d\epsilon}{d\rho}\, g^{ab} = (\mu + p) V^a V^b + p\, g^{ab}$.

Any 'matter' whose energy-momentum tensor is given by (ix) is called a 'perfect fluid', regardless of whether or not it is derived from a Lagrangian. Using the energy-momentum conservation equation (8.2.2), we obtain from (ix):

(x)(a) $\mu_{;a} V^a + (\mu + p) V^a{}_{;a} = 0$

(x)(b) $(\mu + p) \dot V^a + \left( g^{ab} + V^a V^b \right) p_{;b} = 0$.

26 $\epsilon(\rho)$ is written as $\epsilon$.

We call a perfect fluid 'isentropic' if the pressure $p$ is a function of the energy density $\mu$ only. In this case $\rho$ can be treated as a conserved quantity, and the above equations as well as $T^{ab}$ can be derived from the Lagrangian in terms of $\rho$ and the internal energy $\epsilon$. (Note that pressure and energy have been denoted differently in Exc. (8.1.3).)

3

CURVATURE AND ENERGY CONDITIONS

Nowadays it is well known that the minimal condition for a spacetime to be 'singularity free' is that it be 'geodesically complete with respect to timelike and null geodesics.' However, before the appearance of Penrose's work, the word singularity was used very differently. It referred to those solutions of (8.2.24) which were ill-behaved in some sense, e.g., they were infinite or discontinuous (in some neighbourhood). Naturally the curvature and the energy-momentum tensor played a key role here. We shall describe their roles briefly and state a few results (without proof) concerning them. We shall then give 'exact solutions' of Eqs. (8.2.24) in some important cases in the next section.

As the manifold $(\mathcal{M}, g)$ carries a Lorentz metric, the concepts of the motion of a particle, its acceleration, etc., differ from the case in which $g$ is Riemannian. We shall therefore consider the effect of spacetime curvature on families of timelike and null curves (these curves could be the world lines of fluids or the histories of photons), and in the process shall acquaint ourselves with terms such as the 'rate of change of vorticity, shear and expansion' of such families of curves. The formula relating to the expansion (known as Raychaudhuri's equation) plays an important role in the proofs of singularity theorems (see [16c]). We shall also discuss the inequalities satisfied by the energy-momentum tensor (known as energy conditions) to show that the gravitational effects of matter always tend to cause convergence of timelike and null curves. We shall also use these 'energy conditions' to establish that conjugate or focal points occur in families of non-rotating timelike or null geodesics in general spacetimes. Finally, we shall see that the existence of conjugate points implies the existence of variations of curves (between two points). These variations take a null geodesic into a timelike curve and a timelike geodesic into a longer timelike curve.

In order to introduce the terms mentioned above, we consider a congruence of timelike curves with (timelike) unit tangent vector $V$ ($g(V, V) = -1$). These curves could represent the histories of small (test) particles, and thus would be geodesics, or could be the flow lines of a fluid. If this were a perfect fluid, then one would have (see Exc. 8.2.7):

$$(\mu + p) \dot V^a = -p_{;b}\, h^{ab} \qquad (8.3.1)$$

where $\mu$ is the energy density and $p$ is the pressure of the fluid, and $h_{ab}$ is the spacelike metric coming from the projection tensor $h^{ab} = g^{ab} + V^a V^b$, which projects every vector $X \in T_q$ (the tangent space at $q \in \mathcal{M}$) into its component in the subspace $H_q$ of $T_q$ orthogonal to $V$. The vector $\dot V^a = V^a{}_{;b} V^b$ (as we know from Sec. 2) is the acceleration vector of the flow lines.

3.1

The Separation Vector, Vorticity, Shear and Expansion

Given a curve $\Gamma(t)$ with tangent vector $Z = (\partial/\partial t)_\Gamma$, we construct a family of curves $\Gamma(t, s)$ by moving each point of the curve $\Gamma(t)$ a distance $s$ along the integral curves of $V$. We now define $Z$ as the tangent vector

$$Z = \left( \frac{\partial}{\partial t} \right)_{\Gamma(t, s)}$$

and note that $V$ and $Z$ are related to each other by the Lie differential equation $L_V Z = 0$; in other words, their covariant derivatives satisfy:

$$\frac{D}{\partial s} Z^a = V^a{}_{;b} Z^b \qquad (8.3.2)$$

(see Chapter 0, Sec. 5). The vector $Z$ represents the separation of points that are equidistant from some arbitrary initial points along two neighbouring curves. If one adds a multiple of $V$ to $Z$, then this vector ($Z + aV$) will represent the separation of points on those two curves but at different distances along the curves. Since one is interested in the separation of neighbouring curves and not of the points, one needs to consider $Z$ modulo a component parallel to $V$. In other words, one needs to consider the projection of $Z$ at each point $q$ into the space $Q$ formed by equivalence classes of vectors such as ($Z + aV$). This space can be represented by $H_q$ which, as we know, is formed by vectors orthogonal to $V$. The projection of $Z$ into $H_q$ is denoted by ${}_\perp Z^a = h^a{}_b Z^b$. From (8.3.2) it follows that27

$${}_\perp \frac{D}{\partial s} \left( {}_\perp Z^a \right) = V^a{}_{;b}\, {}_\perp Z^b \qquad (8.3.3)$$

This gives the rate of change of the separation of two infinitesimally neighbouring curves as measured in $H_q$. Operating again with $\dfrac{D}{\partial s}$ and taking the projection, we obtain after some adjustments:

$$h^a{}_b \frac{D}{\partial s} \left( h^b{}_c \frac{D}{\partial s}\, {}_\perp Z^c \right) = -R^a{}_{bcd}\, {}_\perp Z^c V^b V^d + h^a{}_b \dot V^b{}_{;c}\, {}_\perp Z^c + \dot V^a \dot V_b\, {}_\perp Z^b \qquad (8.3.4)$$

The above equation, known as the deviation or Jacobi equation, gives the relative acceleration (the second order time derivative of the separation) of two infinitesimally neighbouring curves as measured in $H_q$. If these curves are geodesics, the deviation depends only on the Riemann tensor. Also, in analogy with Newtonian theory, where the acceleration of a particle is the gradient of the potential $\Phi$ and the relative acceleration of two particles with separation $Z^a$ is $\Phi_{;ab} Z^b$, the Riemann tensor term $R_{abcd} V^b V^d$ here represents the tidal force. In order to study it further, we introduce dual orthonormal bases $(E_i)$ and $(E^i)$ ($i = 1 \ldots 4$) of $T_q$ and $T^*_q$ at some point $q$ on an integral curve $\lambda(s)$ of $V$, with $E_4 = V$. If $\lambda(s)$ is a geodesic, this dual basis can be parallelly propagated to any other point $q'$, maintaining the same relationship as at $q$; for a general curve, this is not possible. To overcome this we define a generalized form of (covariant) derivative along $\lambda(s)$ known as the Fermi derivative $D_F/\partial s$. For a vector field $X$ along $\lambda(s)$, this is defined as:

$$\frac{D_F X}{\partial s} = \frac{DX}{\partial s} - g\!\left( X, \frac{DV}{\partial s} \right) V + g(X, V)\, \frac{DV}{\partial s} \qquad (8.3.5)$$

27 Note that ${}_\perp \dfrac{D}{\partial s}$ is the projection of the operator $\dfrac{D}{\partial s}$, and can be written as $h^a{}_b \dfrac{D}{\partial s}$, etc.

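and has the properties:
(i) $\dfrac{D_F X}{\partial s} = \dfrac{DX}{\partial s}$ if $\lambda(s)$ is a geodesic;
(ii) $\dfrac{D_F V}{\partial s} = 0$;
(iii) if $X$ and $Y$ are vector fields along $\lambda(s)$ such that

$$\frac{D_F X}{\partial s} = 0 = \frac{D_F Y}{\partial s} \qquad (8.3.6)$$

then $g(X, Y)$ is constant along $\lambda(s)$;
(iv) if $X$ is a vector field along $\lambda(s)$ orthogonal to $V$, then $\dfrac{D_F X}{\partial s} = {}_\perp\!\left( \dfrac{DX}{\partial s} \right)$.

In view of the above properties, an orthonormal basis whose Fermi derivative is zero at all points of $\lambda(s)$ retains its orthonormality as well as the identification $E_4 = V$. The vectors $E_1, E_2, E_3$ can be considered as giving a non-rotating set of axes along $\lambda(s)$. Since $\dfrac{D_F}{\partial s}$ is a generalization of $\dfrac{D}{\partial s}$, we can extend it to tensors along with the rules that are obeyed by $\dfrac{D}{\partial s}$ (see 0.5). As a result, we note that $D_F$ propagates the dual basis $(E^i)$ of $T^*_q$ along $\lambda(s)$. Using the Fermi derivative we can write (8.3.3) and (8.3.4) respectively as:

$$\frac{D_F}{\partial s}\, {}_\perp Z^a = V^a{}_{;b}\, {}_\perp Z^b \qquad (8.3.7)$$

$$\frac{D_F^2}{\partial s^2}\, {}_\perp Z^a = -R^a{}_{bcd}\, {}_\perp Z^c V^b V^d + h^a{}_b \dot V^b{}_{;c}\, {}_\perp Z^c + \dot V^a \dot V_b\, {}_\perp Z^b \qquad (8.3.8)$$

But ${}_\perp Z$ is orthogonal to $V$ and thus has components only with respect to $E_1, E_2, E_3$, i.e., it can be expressed as $Z^\alpha E_\alpha$ ($\alpha = 1, 2, 3$). As a result (8.3.7) and (8.3.8) can be written in terms of ordinary derivatives:

$$\frac{d}{ds} Z^\alpha = V^\alpha{}_{;\beta} Z^\beta \qquad (8.3.9)$$

$$\frac{d^2}{ds^2} Z^\alpha = \left( -R^\alpha{}_{4\beta 4} + \dot V^\alpha{}_{;\beta} + \dot V^\alpha \dot V_\beta \right) Z^\beta \qquad (8.3.10)$$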


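(Note that the $V_{\alpha;\beta}$ are those components of $V_{a;b}$ for which $a = \alpha$ and $b = \beta$.) As the components $Z^\alpha$ ($\alpha = 1, 2, 3$) obey the first-order linear differential equation (8.3.9), they can be written in terms of a (3 × 3) matrix $A_{\alpha\beta}(s)$ all along $\lambda(s)$ once their values at a point $q$ are given:
\[ Z^\alpha(s) = A_{\alpha\beta}(s)\, Z^\beta|_q \] (8.3.11)
where it is assumed that the matrix $A_{\alpha\beta}(s)$ satisfies the properties:
\[ A_{\alpha\beta}(s)|_q = \delta_{\alpha\beta} \] (8.3.12)(a)
\[ \frac{d}{ds} A_{\alpha\beta}(s) = V_{\alpha;\gamma} A_{\gamma\beta}(s) \] (8.3.12)(b)
Using the usual properties of matrices we can write:
\[ A_{\alpha\beta} = O_{\alpha\gamma} S_{\gamma\beta} \] (8.3.13)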

where $O_{\alpha\gamma}$ is an orthogonal matrix with positive determinant and $S_{\gamma\beta}$ is a symmetric matrix; we assume that they are both unit matrices at the point $q$. The matrix $O_{\alpha\beta}$ represents the rotation that neighbouring curves undergo with respect to the Fermi-propagated basis, whereas $S_{\alpha\beta}$ represents the separation of these curves from $\lambda(s)$. The determinant of $S_{\alpha\beta}$ (which equals the determinant of $A_{\alpha\beta}$) can be thought of as representing the three-volume of the surface element orthogonal to $\lambda(s)$ swept out by the neighbouring curves. Now at $q$, where $A_{\alpha\beta}$ is the unit matrix, $dO_{\alpha\beta}/ds$ is antisymmetric and $dS_{\alpha\beta}/ds$ is symmetric; hence the rate of rotation of neighbouring curves at $q$ is given by the antisymmetric part of $V_{\alpha;\beta}$, the rate of change of separation by the symmetric part of $V_{\alpha;\beta}$, and the rate of change of volume by the trace of $V_{\alpha;\beta}$. The vorticity and the expansion tensors are therefore defined as:
\[ \omega_{ab} = h_a{}^c h_b{}^d V_{[c;d]} \] (8.3.14)
\[ \theta_{ab} = h_a{}^c h_b{}^d V_{(c;d)} \] (8.3.15)
whereas the volume expansion is given as:
\[ \theta = \theta_{ab} h^{ab} = V_{a;b} h^{ab} = V^a{}_{;a} \] (8.3.16)

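In the Fermi-propagated orthonormal basis, all these can be expressed in terms of the matrix $A_{\alpha\beta}$ and its inverse, thus:
\[ \omega_{\alpha\beta} = -A^{-1}_{\gamma[\alpha} \frac{d}{ds} A_{\beta]\gamma} \] (8.3.17)
\[ \theta_{\alpha\beta} = A^{-1}_{\gamma(\alpha} \frac{d}{ds} A_{\beta)\gamma} \] (8.3.18)
\[ \theta = (\det A)^{-1} \frac{d}{ds} (\det A) \] (8.3.19)
Equation (8.3.10) in terms of $A_{\alpha\beta}$ can be written as
\[ \frac{d^2}{ds^2} A_{\alpha\beta} = \left(-R_{\alpha 4 \gamma 4} + \dot V_{\alpha;\gamma} + \dot V_\alpha \dot V_\gamma\right) A_{\gamma\beta} \] (8.3.20)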

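and can be used in calculating the propagation of the vorticity, shear and expansion along the integral curves of $V$ once one knows the Riemann tensor. The first two of these are:
\[ \frac{d}{ds}\omega_{\alpha\beta} = 2\omega_{\gamma[\alpha}\theta_{\beta]\gamma} + \dot V_{[\alpha;\beta]} \] (8.3.21)
\[ \frac{d}{ds}\theta_{\alpha\beta} = -R_{\alpha 4 \beta 4} - \omega_{\alpha\gamma}\omega_{\gamma\beta} - \theta_{\alpha\gamma}\theta_{\gamma\beta} + \dot V_{(\alpha;\beta)} + \dot V_\alpha \dot V_\beta \] (8.3.22)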

They are obtained by multiplying (8.3.20) by $A^{-1}$ and then taking the antisymmetric part in the case of (8.3.21) and the symmetric part in the case of (8.3.22). In order to write the third one, we use the trace-free part of $\theta_{ab}$, which is called the shear tensor:
\[ \sigma_{ab} = \theta_{ab} - \tfrac{1}{3} h_{ab}\,\theta \] (8.3.23)
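As a numerical illustration of the splitting in (8.3.14)-(8.3.16) and (8.3.23), the following minimal sketch decomposes a 3 × 3 matrix standing in for $V_{\alpha;\beta}$ in a Fermi-propagated orthonormal frame (so that $h_{\alpha\beta} = \delta_{\alpha\beta}$). The function names `decompose` and `contract` and the sample matrix `B` are illustrative assumptions, not taken from the text. It also checks the algebraic identity $\theta_{\alpha\beta}\theta_{\alpha\beta} = \sigma_{\alpha\beta}\sigma_{\alpha\beta} + \tfrac{1}{3}\theta^2$, which is what one needs when taking traces of products of the expansion tensor.

```python
def decompose(B):
    """Split a 3x3 matrix B[a][b] (a stand-in for V_{alpha;beta}) into
    vorticity, expansion tensor, scalar expansion and shear."""
    n = 3
    # antisymmetric part, cf. (8.3.14)
    omega = [[0.5 * (B[a][b] - B[b][a]) for b in range(n)] for a in range(n)]
    # symmetric part, cf. (8.3.15)
    theta_ab = [[0.5 * (B[a][b] + B[b][a]) for b in range(n)] for a in range(n)]
    # trace, cf. (8.3.16)
    theta = sum(B[a][a] for a in range(n))
    # trace-free part of the expansion tensor, cf. (8.3.23)
    sigma = [[theta_ab[a][b] - (theta / 3.0 if a == b else 0.0)
              for b in range(n)] for a in range(n)]
    return omega, theta_ab, theta, sigma


def contract(X, Y):
    """Full contraction X_{ab} Y_{ab} in an orthonormal spatial frame."""
    return sum(X[a][b] * Y[a][b] for a in range(3) for b in range(3))


# Arbitrary sample values, chosen only for illustration.
B = [[0.3, -1.2, 0.5],
     [0.7, 0.1, 2.0],
     [-0.4, 0.9, -0.8]]

omega, theta_ab, theta, sigma = decompose(B)
two_omega_sq = contract(omega, omega)   # omega_{ab} omega^{ab}
two_sigma_sq = contract(sigma, sigma)   # sigma_{ab} sigma^{ab}
print(theta, two_omega_sq, two_sigma_sq)
```

Both contractions are sums of squares, so they are non-negative for any input matrix.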

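We thus have:
\[ \frac{d\theta}{ds} = -R_{ab} V^a V^b + 2\omega^2 - 2\sigma^2 - \tfrac{1}{3}\theta^2 + \dot V^a{}_{;a} \] (8.3.24)
where $2\omega^2 = \omega_{ab}\omega^{ab} \ge 0$ and $2\sigma^2 = \sigma_{ab}\sigma^{ab} \ge 0$.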

Note that (8.3.24) is the trace of (8.3.22). This equation, which is of great importance in relativity theory, was discovered by Landau and independently by Raychaudhuri (see [16c]). From (8.3.22) and (8.3.24) the role played by the Riemann tensor in the rate of change of separation of timelike curves is obvious. We now discuss this role in the case of null curves. For simplicity we consider a congruence of null geodesics (which could represent the histories of photons) with tangent vector $K$ ($g(K, K) = 0$). Since $g(K, K) \neq 0$ does not hold here, we do not have an arc-length to parametrize these curves. We can only choose an affine parameter $v$, and the tangent vector $K$ then obeys (see Def. (0.3.13)):
\[ \frac{D}{\partial v} K^a = K^a{}_{;b} K^b = 0. \]

We should, however, bear in mind that this choice is not unique: if we replace $v$ by $fv$ (with $f$ constant along each curve), the tangent vector becomes $f^{-1}K$. We also note that the space $Q_q$, the quotient of $T_q$ by $K$, is no longer isomorphic to the subspace $H_q$ of $T_q$ formed by vectors orthogonal to $K$, since $H_q$ includes $K$ itself ($g(K, K) = 0$). We are therefore interested here in the space $S_q$ consisting of equivalence classes of vectors in $H_q$ that differ from each other by a multiple of $K$. We shall soon see that these spaces can be spanned in terms of dual bases $E_1, E_2, E_3, E_4$ and $E^1, E^2, E^3, E^4$ of $T_q$ and $T^*_q$ respectively at a point $q$ on a curve $\Gamma(v)$. These dual bases (quite naturally) are not taken to be orthonormal. We take $E_4 = K$, take $E_3$ as another null vector denoted $K'$, and assume that $g(E_3, E_4) = -1$. The remaining vectors $E_1$ and $E_2$ are taken as unit spacelike vectors orthogonal to each other and to $E_3$, $E_4$. It can be checked that the space $H_q$ is spanned by $E_1$, $E_2$ and $E_4$, whereas the projections of $E_1$, $E_2$ and $E_3$ into $Q_q$ form a basis of $Q_q$, and similarly the projections of $E_1$ and $E_2$ into $S_q$ form a basis of $S_q$. A basis with properties similar to those of $E_1, E_2, E_3, E_4$ is called a pseudo-orthonormal basis. A parallel transport of this basis along the geodesic $\Gamma(v)$ assigns a pseudo-orthonormal basis to each point of $\Gamma(v)$. We also note that, due to the non-orthonormal character of the basis, the forms $E^3$ and $E^4$ are respectively $-K^a g_{ab}$ and $-K'^a g_{ab}$.
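The properties of a pseudo-orthonormal basis can be checked on a concrete example. The sketch below is an illustration under stated assumptions: it works in flat space with the metric of signature (+, +, +, −), and the particular null vectors chosen for $K$ and $K'$ are just one convenient choice, not taken from the text. It builds the Gram matrix of the basis and verifies that the forms $-K^a g_{ab}$ and $-K'^a g_{ab}$ act as the dual forms $E^3$ and $E^4$.

```python
import math

# Diagonal of the flat metric in coordinates (x1, x2, x3, x4); x4 is the time coordinate.
ETA = [1.0, 1.0, 1.0, -1.0]


def g(X, Y):
    """Metric inner product g(X, Y) for the flat metric above."""
    return sum(ETA[a] * X[a] * Y[a] for a in range(4))


def lower(X):
    """Lower the index: X_a = g_{ab} X^b."""
    return [ETA[a] * X[a] for a in range(4)]


def pair(form, X):
    """Evaluate a one-form on a vector."""
    return sum(form[a] * X[a] for a in range(4))


s = 1.0 / math.sqrt(2.0)
E1 = [1.0, 0.0, 0.0, 0.0]   # unit spacelike
E2 = [0.0, 1.0, 0.0, 0.0]   # unit spacelike
E3 = [0.0, 0.0, -s, s]      # K', a second null vector (illustrative choice)
E4 = [0.0, 0.0, s, s]       # K, the null tangent vector (illustrative choice)
basis = [E1, E2, E3, E4]

# Gram matrix g(E_i, E_j): not the identity, since E3 and E4 are null.
gram = [[g(X, Y) for Y in basis] for X in basis]

form3 = [-c for c in lower(E4)]   # E^3 = -K^a g_ab
form4 = [-c for c in lower(E3)]   # E^4 = -K'^a g_ab

for row in gram:
    print([round(x, 12) for x in row])
```

The printed Gram matrix has zeros on the diagonal in the last two slots and −1 in the (3, 4) and (4, 3) positions, which is the sense in which the basis is pseudo-orthonormal rather than orthonormal.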

To write the deviation equation for null geodesics, we follow the procedure used for timelike curves; thus denoting the separation vector between points on the neighbouring curves by $Z$ and noting that $L_K Z = 0$ (as in the case of timelike curves) we have (see Def. (0.3.13))
\[ \frac{D}{\partial v} Z^a = K^a{}_{;b} Z^b \] (8.3.25)
\[ \frac{D^2}{\partial v^2} Z^a = -R^a{}_{bcd} Z^c K^b K^d \] (8.3.26)
Since $K^a{}_{;4} = 0$ ($K$ being tangent to the geodesic), Eq. (8.3.25) reduces to a system of ordinary differential equations for $Z^1$, $Z^2$, $Z^3$; hence the three components that pertain to $E_1$, $E_2$, $E_3$ are:
\[ \frac{d}{dv} Z^\alpha = K^\alpha{}_{;\beta} Z^\beta \quad (\alpha, \beta = 1, 2, 3) \] (8.3.27)

As $Q_q$ is spanned by the projections of $E_\alpha$ ($\alpha = 1, 2, 3$) into $Q_q$, the above equation can be interpreted to say that the propagation equation of the projection of $Z$ into $Q$ involves only this projection, and not the component of $Z$ parallel to $K$. From $(K^a K^b g_{ab})_{;c} = 0$ it follows that $K^3{}_{;c} = 0$; this implies that $Z^3 = -Z^a K_a$ is constant along the geodesic $\Gamma(v)$, which can be interpreted physically as saying that light rays emitted from the same source at different times maintain a constant separation (in time). In view of this, one needs to consider only those neighbouring null geodesics which have purely spatial separations, i.e., those vectors $Z$ for which $Z^3 = 0$. The projections of these vectors lie in the subspace $S_q$ and obey the equation$^{28}$:
\[ \frac{d}{dv} Z^i = K_{i;j} Z^j \quad (i, j = 1, 2) \] (8.3.28)

But this equation is similar to (8.3.9); hence using similar arguments we can express the $Z^i$ in terms of their values at some point $q$ of $\Gamma(v)$:
\[ Z^i(v) = A_{ij}(v)\, Z^j|_q \] (8.3.29)
where $A_{ij}$ is a 2 × 2 matrix that satisfies the following two equations:
\[ \frac{d}{dv} A_{ij}(v) = K_{i;k} A_{kj}(v) \] (8.3.30)
\[ \frac{d^2}{dv^2} A_{ij}(v) = -R_{i4k4} A_{kj}(v) \] (8.3.31)
28 The projection symbol $\perp$ in (8.3.27) and (8.3.28) is not used, although $Z^\alpha$ and $Z^i$ stand for projections into $Q_q$ and $S_q$ respectively.

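Using the same notation as in the timelike case but with a hat '^', we have equations for the rates of change of the vorticity $\hat\omega_{ij}$, the expansion $\hat\theta$ (the trace of the separation tensor $\hat\theta_{ij}$), and the shear tensor $\hat\sigma_{ij}$:
\[ \frac{d}{dv}\hat\omega_{ij} = -\hat\theta\,\hat\omega_{ij} + 2\hat\omega_{k[i}\hat\sigma_{j]k} \] (8.3.32)
\[ \frac{d}{dv}\hat\theta = -R_{ab}K^aK^b + 2\hat\omega^2 - 2\hat\sigma^2 - \tfrac{1}{2}\hat\theta^2 \] (8.3.33)
\[ \frac{d}{dv}\hat\sigma_{ij} = -C_{i4j4} - \hat\theta\,\hat\sigma_{ij} - \hat\sigma_{ik}\hat\sigma_{kj} - \hat\omega_{ik}\hat\omega_{kj} + \delta_{ij}\left(\hat\sigma^2 - \hat\omega^2\right) \] (8.3.34)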

Note that the first two equations are similar to (8.3.21) and (8.3.24), Eq. (8.3.33) being the analogue of the Raychaudhuri equation for timelike geodesics. From (8.3.33) we note that, just as in the timelike case, vorticity causes expansion while shear causes contraction.

3.2 Energy Conditions

Having acquainted ourselves with the gravitational part (which depends solely on geometry) of the Einstein equations, we next consider the matter part represented by the energy-momentum tensor. In an actual universe, however, this tensor is made up of contributions from a large number of matter fields, and it is almost impossible to describe it exactly even if one knew the precise form of each field and the equations of motion governing it. For example, one knows very little about the behaviour of matter under extreme conditions of density and pressure. In the absence of a reliable energy-momentum tensor, one might conclude that Einstein's equations cannot be used to predict the occurrence of singularities in the universe. Fortunately, this impasse is handled by using inequalities which this tensor obeys, and which seem perfectly reasonable as assumptions. These inequalities, known as 'energy conditions,' are often sufficient to prove the occurrence of singularities, even though the exact form of the tensor may not be known. They are referred to as: (a) the weak energy condition, (b) the dominant energy condition, and (c) the strong energy condition, depending on their role in shaping the spacetime entities. We shall list here the consequences of (results based on) these energy conditions (skipping the mathematical details, see [16c]) with a view to studying the singularities.

3.2.1 The Weak Energy Condition
The basic inequality that is assumed here is that at each $p \in \mathcal{M}$ and for any timelike vector $W \in T_p$, the energy-momentum tensor obeys:
\[ T_{ab} W^a W^b \ge 0 \] (8.3.35)(a)
By continuity this holds even when $W$ is a null vector. The inequality (8.3.35)(a) is called the weak energy condition. (Note that this assumption implies that the energy density measured by any observer is non-negative.) In order to understand the meaning of the nomenclature used for energy conditions, we express the components of $T^{ab}$ (at a point $p$) with respect to an orthonormal basis $E_1, E_2, E_3, E_4$ ($E_4$ timelike) in one of the four canonical forms:
\[ T^{ab} = \begin{pmatrix} p_1 & 0 & 0 & 0 \\ 0 & p_2 & 0 & 0 \\ 0 & 0 & p_3 & 0 \\ 0 & 0 & 0 & \mu \end{pmatrix} \] Type (i)

Here the energy-momentum (EM) tensor has a timelike eigenvector $E_4$ which is unique unless $\mu = -p_\alpha$ ($\alpha = 1, 2, 3$). The eigenvalue $\mu$ stands for the energy density and the eigenvalues $p_\alpha$ ($\alpha = 1, 2, 3$) represent the principal pressures in the three spacelike directions $E_\alpha$. The EM tensor has this form for all observed fields with non-zero rest mass and also for zero rest mass fields except when it is of Type (ii):
\[ T^{ab} = \begin{pmatrix} p_1 & 0 & 0 & 0 \\ 0 & p_2 & 0 & 0 \\ 0 & 0 & \nu - \kappa & \nu \\ 0 & 0 & \nu & \nu + \kappa \end{pmatrix}, \quad \nu = \pm 1 \] Type (ii)

The EM tensor has a double null eigenvector $(E_3 + E_4)$ here. This form is observed to occur for zero rest mass fields when they represent radiation all of which travels in the $(E_3 + E_4)$ direction. In this case $p_1 = p_2 = \kappa = 0$. There are no observed fields whose EM tensors have the Type (iii) or Type (iv) forms given below.
\[ T^{ab} = \begin{pmatrix} p & 0 & 0 & 0 \\ 0 & -\nu & 1 & 1 \\ 0 & 1 & -\nu & 0 \\ 0 & 1 & 0 & \nu \end{pmatrix} \] Type (iii)

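The EM tensor has a triple null eigenvector $(E_3 + E_4)$ here.
\[ T^{ab} = \begin{pmatrix} p_1 & 0 & 0 & 0 \\ 0 & p_2 & 0 & 0 \\ 0 & 0 & -\kappa & \nu \\ 0 & 0 & \nu & 0 \end{pmatrix}, \quad \kappa^2 < 4\nu^2 \] Type (iv)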

The EM tensor has no timelike or null eigenvector in this case. The weak energy condition holds for Type (i) if $\mu \ge 0$ and $\mu + p_\alpha \ge 0$ ($\alpha = 1, 2, 3$). It holds for Type (ii) if $p_1$, $p_2$ and $\kappa$ are each $\ge 0$ and $\nu = +1$. The condition does not hold for Type (iii) and Type (iv). Two important cases regarding the weak energy condition are worth noting: in one it holds and in the other it does not. It holds when the theory involves the scalar field postulated by Brans and Dicke and by Dicke (see Sec. 28.4 in [26]); the field is required to be positive everywhere, and the EM tensor is of the form given in (vi) of Exc. (2.4) with mass $m = 0$. The condition does not hold for the scalar field proposed by Hoyle and Narlikar (see [19b]); this field is known as the C-field and it has zero mass.

3.2.2 The Dominant Energy Condition

The energy condition is said to be dominant if for every timelike vector $W_a$
\[ T^{ab} W_a W_b \ge 0 \] (8.3.35)(b)
and the vector $T^{ab} W_a$ is non-spacelike. The above condition implies that to any observer, the local energy density appears non-negative and the local energy flow vector is non-spacelike. Thus in any orthonormal basis, the energy dominates the other components of $T^{ab}$, i.e.:
\[ T^{00} \ge |T^{ab}| \quad \text{for each } a, b \] (8.3.35)(c)

This holds for Type (i) if $\mu \ge 0$, $-\mu \le p_\alpha \le \mu$ ($\alpha = 1, 2, 3$), and for Type (ii) if $\nu = +1$, $\kappa \ge 0$, $0 \le p_i \le \kappa$ ($i = 1, 2$). Evidently the dominant energy condition is the weak energy condition with the additional requirement that the pressure should not exceed the energy density. We would further add that this has been observed to be true (i.e., $p_\alpha \le \mu$, $p_i \le \kappa$) for all known forms of matter. We now give, without proof, a few results based on energy conditions (see [16c] for proofs).

3.3 Results Based on Energy Conditions

Conservation Theorem 8.3.1: Consider a compact region $\mathcal{U}$ of spacetime with past and future non-timelike boundaries $(\partial\mathcal{U})_1$, $(\partial\mathcal{U})_2$, and timelike boundary $(\partial\mathcal{U})_3$, characterized by a function $t = t(x^1, x^2, x^3, x^4)$ whose gradient is everywhere timelike (see Fig. (8.4)). Surfaces $\{t = \text{constant}\} \cap \mathcal{U}$ [...]

[Fig. 8.4: The compact region $\mathcal{U}$, with $t$ increasing towards the future. $\mathcal{U}(t')$ is the part of $\mathcal{U}$ that lies to the past of the surface $\mathcal{H}(t')$ defined by $t = t'$. $(\partial\mathcal{U})_1$: boundary for which $n$ is non-spacelike, $n_a t_{;b} g^{ab} < 0$; [...] Note that the sign of the normal form $n$ is chosen so that $\langle n, X \rangle$ is positive for all vectors $X$ which point out of $\mathcal{U}$.]

[...] (the proof of this theorem is based on differential geometry techniques; see Yano and Bochner [43] for this result in particular).

Corollary 8.3.2: If the EM tensor vanishes on a spacelike set $S$, then it also vanishes on the future Cauchy development $D^+(S)$, where $D^+(S)$ is defined as the set of all points $q \in \mathcal{M}$ such that every past-directed inextendible non-spacelike curve passing through $q$ intersects $S$ (see Fig. (8.5)). Note that if $q$ is any point of $D^+(S)$, the region of $D^+(S)$ to the past of $q$ is compact and can therefore be taken as the region $\mathcal{U}$ of the above theorem. The result can thus be interpreted to mean that, under the dominant energy condition, matter cannot travel faster than light.

[Fig. 8.5: The future Cauchy development $D^+(S)$ of a spacelike set $S$.]

Similar to the future Cauchy development $D^+(S)$, we have the past Cauchy development $D^-(S)$ of a spacelike set $S$. This consists of all points in $\mathcal{M}$ through which every future-directed inextendible non-spacelike curve intersects $S$.

Result 8.3.3: Consider the variation equation (8.3.33) of the expansion $\hat\theta$, which reduces to:
\[ \frac{d}{dv}\hat\theta = -R_{ab}K^aK^b - 2\hat\sigma^2 - \tfrac{1}{2}\hat\theta^2 \]
when the vorticity $\hat\omega$ is zero. If $R_{ab} W^a W^b \ge 0$ for any null vector $W$, then evidently $\hat\theta$ decreases monotonically along the null geodesic. This inequality is called the null convergence condition. From the Einstein equations
\[ R_{ab} - \tfrac{1}{2} R g_{ab} + \Lambda g_{ab} = 8\pi T_{ab} \]

it is evident that if the EM tensor obeys the weak energy condition, then the above condition always holds (i.e., $R_{ab} W^a W^b \ge 0$ for any null vector $W$) irrespective of the value of $\Lambda$.

Result 8.3.4: Let $W$ be a timelike vector; then from Equation (8.3.24) it follows that the expansion $\theta$ of a timelike geodesic congruence with zero vorticity decreases monotonically if $R_{ab} W^a W^b \ge 0$. This is called the timelike convergence condition. In view of the Einstein equations, this condition is satisfied if the EM tensor obeys the inequality:
\[ T_{ab} W^a W^b \ge W^a W_a \left( \tfrac{1}{2} T - \tfrac{1}{8\pi} \Lambda \right) \] (8.3.36)
This inequality holds for Type (i) if
\[ \mu + p_\alpha \ge 0, \qquad \mu + \sum_\alpha p_\alpha - \tfrac{1}{4\pi} \Lambda \ge 0 \] (8.3.37)
and holds for Type (ii) if
\[ \nu = +1, \quad \kappa \ge 0, \quad p_1 \ge 0, \quad p_2 \ge 0 \quad \text{and} \quad p_1 + p_2 - \tfrac{1}{4\pi} \Lambda \ge 0 \] (8.3.38)

In case the inequality is satisfied for $\Lambda = 0$, the EM tensor is said to obey the strong energy condition. The 'strong energy condition' is obviously a stricter requirement than the 'weak energy condition.' It is known to hold for the electromagnetic field and for the scalar field with zero mass. We note that for the general case, Type (i), it is violated if the energy density is negative or the pressure is large and negative; for instance, for a perfect fluid with density 1 g cm$^{-3}$, it can be violated if $p < -10^{15}$ atmospheres. We further note that a breakdown of the 'strong energy condition' sometimes leads to a breakdown of the 'singularity theorems' causing a singularity eventually. For example, the EM tensor of $\pi$ mesons represents a breakdown of this nature.
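The Type (i) inequalities quoted above are easy to tabulate. The following is a minimal sketch, assuming a Type (i) tensor given by its energy density $\mu$ and principal pressures $p_\alpha$; the function name `energy_conditions_type_i` and the sample fluids are illustrative assumptions. It evaluates the weak condition, the dominant condition, the inequalities (8.3.37) for a given $\Lambda$, and the strong condition (the same inequalities with $\Lambda = 0$).

```python
import math


def energy_conditions_type_i(mu, pressures, cosmological_constant=0.0):
    """Check the Type (i) inequalities for energy density mu and the three
    principal pressures; returns a dict of booleans."""
    p = list(pressures)
    sums_ok = all(mu + pa >= 0 for pa in p)
    # weak: mu >= 0 and mu + p_alpha >= 0
    weak = mu >= 0 and sums_ok
    # dominant: mu >= 0 and -mu <= p_alpha <= mu
    dominant = mu >= 0 and all(-mu <= pa <= mu for pa in p)
    # (8.3.37) with the given cosmological constant
    timelike_convergence = sums_ok and (
        mu + sum(p) - cosmological_constant / (4.0 * math.pi) >= 0)
    # strong: (8.3.37) with Lambda = 0
    strong = sums_ok and mu + sum(p) >= 0
    return {"weak": weak, "dominant": dominant,
            "timelike_convergence": timelike_convergence, "strong": strong}


# Illustrative inputs (units with mu = 1):
dust = energy_conditions_type_i(1.0, (0.0, 0.0, 0.0))
radiation = energy_conditions_type_i(1.0, (1.0 / 3, 1.0 / 3, 1.0 / 3))
negative_pressure = energy_conditions_type_i(1.0, (-1.0, -1.0, -1.0))
print(dust, radiation, negative_pressure, sep="\n")
```

The third input, with $p_\alpha = -\mu$, satisfies the weak and dominant inequalities but not the strong one, which illustrates the remark that a large negative pressure is what violates the strong energy condition.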

3.4 Conjugate Points
In the previous subsection, we have seen that the energy conditions imply the inequality $R_{ab} K^a K^b \ge 0$ for the curvature tensor. We shall now see its role in determining the conjugate points on non-spacelike geodesics. These in turn will lead to the knowledge of singularities in spacetime. We recall here the definition of conjugate points for the spacetime (see Chapter 0, Sec. 5).

Definition 8.3.5: Given a timelike $C^2$-geodesic $\gamma(s)$ and a variation of $\gamma(s)$ (a congruence of neighbouring timelike geodesics) represented by $F_t(s) = F(s, t)$ ($t \in (-\varepsilon, \varepsilon)$), there always exists the field of vectors $\left[\frac{\partial F}{\partial t}(s, t)\right]_{t=0}$; these vectors are called the Jacobi fields along $\gamma(s)$. A point $p$ on $\gamma(s)$ is conjugate to $q$ along $\gamma(s)$ if there is a Jacobi field along $\gamma(s)$ which is not identically zero but vanishes at both $p$ and $q$.

From Sec. (0.5) we know that Jacobi fields satisfy a second-order differential equation (known as the Jacobi equation) that involves the (Riemannian) curvature tensor (see [22]). Here this equation, in view of (8.3.10), is:
\[ \frac{d^2}{ds^2} Z^\alpha = -R_{\alpha 4 \beta 4} Z^\beta \quad (\alpha = 1, 2, 3) \] (8.3.39)
A solution of this equation, i.e., a Jacobi field, is specified by the values of $Z^\alpha$ and $\frac{dZ^\alpha}{ds}$ at some point of $\gamma(s)$. Evidently there are six independent Jacobi fields along $\gamma(s)$, three of which vanish at some point $q$ of $\gamma(s)$. These Jacobi fields can be expressed as:
\[ Z^\alpha(s) = A_{\alpha\beta}(s)\, \frac{d}{ds} Z^\beta \Big|_q \] (8.3.40)(a)
where
\[ \frac{d^2}{ds^2} A_{\alpha\beta}(s) = -R_{\alpha 4 \gamma 4} A_{\gamma\beta}(s) \] (8.3.40)(b)
29 See Table S for $\pi$ mesons.

and $A_{\alpha\beta}(s)$ is a 3 × 3 matrix that vanishes at $q$. From an earlier subsection, we conclude that these Jacobi fields represent the separation of neighbouring geodesics through $q$, and hence the vorticity, shear and expansion of these fields, which vanish at $q$, are given by (8.3.17), (8.3.23) and (8.3.19). In view of (8.3.40)(a) and the vanishing of $A_{\alpha\beta}(s)$ at $q$, we have: Result 8.3.6:

A point $p$ is conjugate to $q$ along $\gamma(s)$ if and only if $A_{\alpha\beta}$ is singular at $p$.
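Result 8.3.6 can be tried out numerically in a simplified setting. The sketch below is an illustration under a strong assumption that is not made in the text: the tidal matrix $R_{\alpha 4 \beta 4}$ is taken to be constant and diagonal along the geodesic, so that $A_{\alpha\beta}$ stays diagonal and (8.3.40)(b) splits into three scalar equations $a'' = -k\,a$ with $a(0) = 0$, $a'(0) = 1$. The function names and step sizes are hypothetical. For a diagonal matrix, $\det A$ vanishes exactly when one diagonal entry does, so the code watches the entries individually (which also avoids missing a double root of the determinant).

```python
import math


def rk4_step(x, v, k, ds):
    """One classical Runge-Kutta step for x'' = -k x, written as x' = v, v' = -k x."""
    k1x, k1v = v, -k * x
    k2x, k2v = v + 0.5 * ds * k1v, -k * (x + 0.5 * ds * k1x)
    k3x, k3v = v + 0.5 * ds * k2v, -k * (x + 0.5 * ds * k2x)
    k4x, k4v = v + ds * k3v, -k * (x + ds * k3x)
    x_new = x + ds * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    v_new = v + ds * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x_new, v_new


def first_conjugate_parameter(curvature, s_max=10.0, ds=1e-3):
    """Return the first s > 0 (to within ds) at which the diagonal matrix A
    becomes singular, or None if that does not happen before s_max.

    curvature: the three assumed-constant diagonal values of R_{alpha 4 beta 4}.
    """
    a = [0.0, 0.0, 0.0]    # A vanishes at q (s = 0)
    da = [1.0, 1.0, 1.0]   # dA/ds is the unit matrix at q
    s = 0.0
    while s < s_max:
        for i in range(3):
            a[i], da[i] = rk4_step(a[i], da[i], curvature[i], ds)
        s += ds
        if any(ai <= 0.0 for ai in a):
            return s
    return None


print(first_conjugate_parameter((1.0, 1.0, 1.0)), math.pi)
print(first_conjugate_parameter((4.0, 1.0, 0.0)), math.pi / 2)
print(first_conjugate_parameter((0.0, 0.0, 0.0)))
```

With equal positive values $k$ the entries behave like $\sin(\sqrt{k}\,s)/\sqrt{k}$, so the first conjugate point sits near $s = \pi/\sqrt{k}$; with vanishing or negative values the entries grow without returning to zero and no conjugate point is found.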

Again using (8.3.40)(b) we note that since $R_{\alpha 4 \gamma 4}$ is finite, $\frac{d}{ds}(\det A)$ must also be finite. From the expression
\[ \theta = (\det A)^{-1} \frac{d}{ds}(\det A) = \frac{d}{ds}\left(\log(\det A)\right) \]
for the expansion $\theta$, and in view of the above result, we obtain: Result 8.3.7: A point $p$ is conjugate to $q$ along $\gamma(s)$ if and only if $\theta$ becomes infinite at $p$. We now list a few results that involve conjugate points, the curvature tensor, and the expansion $\theta$.

3.5 Results Based on Curvature, Conjugate Points and the Expansions $\theta$, $\hat\theta$

Proposition 8.3.8(a): If the expansion $\theta$ has a negative value $\theta_1 < 0$ at some point $\gamma(s_1)$ ($s_1 > 0$) of a timelike geodesic, and if $R_{ab} V^a V^b \ge 0$ everywhere, then there is a point conjugate to $q$ along $\gamma(s)$ between $\gamma(s_1)$ and $\gamma(s_1 + 3/(-\theta_1))$, provided that $\gamma(s)$ can be extended to this parameter value.

We note that this may not be possible if spacetime is geodesically incomplete. A geodesic incompleteness of this type can be interpreted to mean that there exists a singularity in spacetime. The proof of the proposition follows from the Raychaudhuri equation:
\[ \frac{d\theta}{ds} = -R_{ab} V^a V^b - 2\sigma^2 - \tfrac{1}{3}\theta^2. \]
Since all the terms on the RHS are non-positive, we have $\frac{d}{ds}\theta^{-1} \ge \tfrac{1}{3}$, and so for $s > s_1$:
\[ \theta \le \frac{3}{s - \left(s_1 + 3/(-\theta_1)\right)} \] (8.3.41)
Thus $\theta$ will become infinite and there will be a point conjugate to $q$ for some value of $s$ between $s_1$ and $s_1 + 3/(-\theta_1)$.
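The focusing estimate can be illustrated numerically. The sketch below is a simplified model, not the proof: it assumes the combined non-negative source term $R_{ab}V^aV^b + 2\sigma^2$ is a constant (the parameter `source`), and the function names are hypothetical. To avoid integrating through the divergence, it evolves $u = 1/\theta$, for which the Raychaudhuri equation becomes $du/ds = 1/n + \text{source}\cdot u^2$; the expansion diverges where $u$ reaches zero. Here $n = 3$ corresponds to the timelike case and $n = 2$ to the analogous null equation (8.3.33) with zero vorticity.

```python
def focusing_bound(theta1, n=3):
    """Parameter distance n / (-theta1) within which theta must diverge."""
    if theta1 >= 0:
        raise ValueError("the estimate needs a negative initial expansion")
    return n / (-theta1)


def blow_up_distance(theta1, source=0.0, n=3, ds=1e-3):
    """Distance beyond s1 at which theta diverges, for
    d(theta)/ds = -source - theta**2 / n with constant source >= 0.

    Integrates u = 1/theta with RK4; u increases from 1/theta1 < 0 to 0.
    """
    if theta1 >= 0:
        raise ValueError("the estimate needs a negative initial expansion")

    def rhs(u):
        return 1.0 / n + source * u * u

    u, s = 1.0 / theta1, 0.0
    while u < 0.0:
        k1 = rhs(u)
        k2 = rhs(u + 0.5 * ds * k1)
        k3 = rhs(u + 0.5 * ds * k2)
        k4 = rhs(u + ds * k3)
        u_next = u + ds * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if u_next >= 0.0:
            # linear interpolation for the zero crossing inside this step
            return s + ds * (-u) / (u_next - u)
        u, s = u_next, s + ds
    return s


print(focusing_bound(-1.5), blow_up_distance(-1.5))      # no source: bound is attained
print(blow_up_distance(-1.5, source=1.0))                # positive source: earlier divergence
print(focusing_bound(-1.0, n=2), blow_up_distance(-1.0, n=2))
```

With no source term the divergence occurs exactly at the bound $3/(-\theta_1)$; any positive source term brings it closer, which is the content of the inequality (8.3.41).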

A slight variation of the above proposition which we shall use in Sec. 5 is as follows.

Proposition 8.3.8(b): Let $R_{ab} V^a V^b \ge 0$. Suppose that at some point $p = \gamma(s_1)$ the tidal force $R_{abcd} V^b V^d$ is nonzero; then there are values $s_0$ and $s_2$ of the parameter $s$ such that $q = \gamma(s_0)$ and $r = \gamma(s_2)$ are conjugate along $\gamma(s)$, provided that $\gamma(s)$ can be extended to these values.

So far we have considered arbitrary timelike geodesic congruences; we shall now state a result for the congruence of timelike geodesics normal to a spacelike three-surface $\mathcal{H}$. A spacelike three-surface $\mathcal{H}$ is an imbedded 3-dimensional submanifold defined locally by $\psi = 0$, where $\psi$ is a $C^2$-function such that $g^{ab} \psi_{;a} \psi_{;b} < 0$. The unit normal vector $N$ to the surface $\mathcal{H}$ has components:
\[ N^a = \left(-g^{bc} \psi_{;b} \psi_{;c}\right)^{-1/2} g^{ad} \psi_{;d} \] (8.3.42)
and the components of the second fundamental tensor $\chi$ are given as:
\[ \chi_{ab} = h^c{}_a h^d{}_b N_{c;d} \] (8.3.43)
where $h_{ab} = g_{ab} + N_a N_b$ is the first fundamental tensor (or the induced metric tensor) of $\mathcal{H}$. The congruence of timelike geodesics orthogonal to $\mathcal{H}$ consists of those timelike geodesics whose unit tangent vector $V$ equals the unit normal $N$ at $\mathcal{H}$, thus
\[ V_{a;b} = \chi_{ab} \] (8.3.44)

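at $\mathcal{H}$. The vector $Z$ representing the separation of a neighbouring geodesic normal to $\mathcal{H}$ from a geodesic $\gamma(s)$ normal to $\mathcal{H}$ obeys the Jacobi equation (8.3.39), and at a point $q$ on $\gamma(s)$ at $\mathcal{H}$ it satisfies the initial condition:
\[ \frac{d}{ds} Z^\alpha = \chi_{\alpha\beta} Z^\beta \] (8.3.45)
Since these Jacobi fields can be written as in (8.3.11), with $A_{\alpha\beta}$ satisfying (8.3.40)(b) and equal to the unit matrix at $q$, we have at $q$:
\[ \frac{d}{ds} A_{\alpha\beta} = \chi_{\alpha\gamma} A_{\gamma\beta} \] (8.3.46)
Using (8.3.17) it can be checked that the vorticity tensor $\omega_{\alpha\beta}$ satisfies
\[ A_{\gamma\alpha}\, \omega_{\gamma\delta}\, A_{\delta\beta} = \tfrac{1}{2}\left( A_{\gamma\alpha} \frac{d}{ds} A_{\gamma\beta} - A_{\gamma\beta} \frac{d}{ds} A_{\gamma\alpha} \right) \] (8.3.47)
This quantity is constant along $\gamma(s)$ (its derivative vanishes by (8.3.40)(b) and the symmetry of $R_{\alpha 4 \gamma 4}$), and it is zero at $q$, where $A_{\alpha\beta}$ is the unit matrix and satisfies (8.3.46) with $\chi_{\alpha\beta}$ symmetric; hence the vorticity is zero on $\gamma(s)$. We further note that the initial value of $\theta$ at $q$ is:
\[ \chi_{ab}\, g^{ab} \] (8.3.48)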

Definition 8.3.9: A point $p$ on $\gamma(s)$ is conjugate to $\mathcal{H}$ along $\gamma(s)$ if there is a Jacobi field along $\gamma(s)$, not identically zero, that satisfies the initial condition (8.3.45) at $q$ and vanishes at $p$. Thus $p$ is conjugate to $\mathcal{H}$ along $\gamma(s)$ if and only if $A_{\alpha\beta}$ is singular at $p$. And as seen earlier, $A_{\alpha\beta}$ becomes singular where and only where the expansion $\theta$ becomes infinite (see (8.3.19)). Similar to Prop. (8.3.8)(a) and (b), we now have the result that pertains to the timelike geodesic congruence normal to a spacelike 3-surface.

Proposition 8.3.10: If $R_{ab} V^a V^b \ge 0$ and $\chi_{ab} g^{ab} < 0$, there will be a point conjugate to $\mathcal{H}$ along $\gamma(s)$ within a distance $3/(-\chi_{ab} g^{ab})$ from $\mathcal{H}$, provided that $\gamma(s)$ can be extended that far.

The proof follows easily from the Raychaudhuri equation (8.3.24) using the steps mentioned in the earlier proposition. We now consider a congruence of null geodesics. A Jacobi field along a null geodesic $\gamma(v)$ is a solution of the equation:

\[ \frac{d^2}{dv^2} Z^i = -R_{i4j4} Z^j \quad (i, j = 1, 2) \] (8.3.49)
The Equation (8.3.40)(a) now becomes:
\[ Z^i(v) = A_{ij}(v)\, \frac{d}{dv} Z^j \Big|_q \] (8.3.50)

where $A_{ij}$ is the (2 × 2) matrix which vanishes at $q$, and as before $A_{li}\, \hat\omega_{lk}\, A_{kj} = 0$, showing that the vorticity of the Jacobi fields which are zero at $q$ vanishes. Here again we have that $p$ is conjugate to $q$ along $\gamma(v)$ if and only if
\[ \hat\theta = (\det A)^{-1} \frac{d}{dv}(\det A) \]
becomes infinite at $p$. Finally, the propositions analogous to propositions (8.3.8)(a) and (8.3.8)(b) read as:

Proposition 8.3.11(a): Let $K$ denote the geodesic tangent vector, and let $R_{ab} K^a K^b \ge 0$ everywhere. If at some point $\gamma(v_1)$ the expansion $\hat\theta$ has the negative value $\hat\theta_1 < 0$, then there exists a point conjugate to $q$ along $\gamma(v)$ between $\gamma(v_1)$ and $\gamma(v_1 + 2/(-\hat\theta_1))$, given that $\gamma(v)$ can be extended that far.

Proposition 8.3.11(b): Let $R_{ab} K^a K^b \ge 0$ everywhere. Suppose that at $p = \gamma(v_1)$, $K^c K^d K_{[a} R_{b]cd[e} K_{f]}$ is non-zero; then there are values $v_0$ and $v_2$ of the parameter $v$ such that $q = \gamma(v_0)$ and $r = \gamma(v_2)$ are conjugate along $\gamma(v)$, provided $\gamma(v)$ can be extended to these values (see Sec. 4.4 in [16c] for the proofs).

3.6 Variational Techniques

The propositions given above regarding the existence of conjugate points were based on the curvature tensor and the expansion $\theta$ or $\hat\theta$. Since the existence of conjugate points is a means of establishing whether spacetime is singularity-free, we shall state below a few more results where their existence is ensured by using arc-length variational techniques. In essence these results are based on the Lorentz metric $g$ and the curves that cover $\mathcal{M}$. Thus a breakdown (discontinuity) which signals a singularity is studied here by considering non-spacelike geodesics and the conjugate points on them. Putting it still more simply, one ensures via these results that, in general, any two points can be joined by unique non-spacelike geodesic curves in a singularity-free spacetime. Due to our limited scope, we content ourselves with only the statements of a few results and ask the interested reader to consult one of the references [16c], [26], [35]. To begin with, we define a few key ingredients that are required in these results.

Definition 8.3.12 (Convex Normal Neighbourhood): Consider a $C^r$-map $\exp: T_p \to \mathcal{M}$ such that the rank of exp = dim($\mathcal{M}$) (see also the Appendix for the definition); then (even if $\mathcal{M}$ is not complete), in view of the implicit function theorem, there exists an open neighbourhood $N_0$ of the origin in $T_p$ and an open neighbourhood $N_p$ of $p$ in $\mathcal{M}$ such that exp is a $C^r$-diffeomorphism of $N_0$ onto $N_p$. Such a neighbourhood $N_p$ is called a normal neighbourhood of $p$. Further, if any point $q$ of $N_p$ can be joined to any other point $r$ in $N_p$ by a unique geodesic starting at $q$ and totally contained in $N_p$, then $N_p$ is called a convex normal neighbourhood.

Definition 8.3.13: Let $\mathcal{U}$ be a convex normal neighbourhood of a point $q$; then the length of a non-spacelike curve $\gamma(t)$ from $q$ to $p$, for $p \in \mathcal{U}$, is:
\[ L(\gamma, q, p) = \int_q^p \left( -g\!\left( \frac{\partial}{\partial t}, \frac{\partial}{\partial t} \right) \right)^{1/2} dt \] (8.3.51)

where the integral is taken over the differentiable portions of the curve. We shall next consider the case where $q$ and $p$ may not be contained in a convex normal neighbourhood $\mathcal{U}$. For this we introduce the concept of 'variation' of non-spacelike curves on $\mathcal{M}$.

Definition 8.3.14: A variation $\alpha$ of a timelike curve $\gamma(t)$ from $q$ to $p$ is a $C^1$-map $\alpha: (-\varepsilon, \varepsilon) \times [0, t_p] \to \mathcal{M}$ such that
(i) $\alpha(0, t) = \gamma(t)$;
(ii) $\alpha$ is $C^3$ on each $(-\varepsilon, \varepsilon) \times [t_i, t_{i+1}]$ for some subdivision $0 = t_1 < t_2 < \cdots < t_n = t_p$ of $[0, t_p]$;
(iii) $\alpha(u, 0) = q$, $\alpha(u, t_p) = p$;
(iv) for each constant $u$, $\alpha(u, t)$ is a timelike curve.
The vector $(\partial/\partial u)_\alpha|_{u=0}$ is called the variation vector and is denoted as $Z$.

Definition 8.3.15: A two-parameter variation $\alpha$ of a geodesic curve $\gamma(t)$ from $q$ to $p$ is a $C^1$-map $\alpha: (-\varepsilon, \varepsilon) \times (-\varepsilon', \varepsilon') \times [0, t_p] \to \mathcal{M}$ such that
(i) $\alpha(0, 0, t) = \gamma(t)$;
(ii) $\alpha$ is $C^3$ on each $(-\varepsilon, \varepsilon) \times (-\varepsilon', \varepsilon') \times [t_i, t_{i+1}]$ for some subdivision $0 = t_1 < t_2 < \cdots < t_n = t_p$ of $[0, t_p]$;
(iii) $\alpha(u, u', 0) = q$, $\alpha(u, u', t_p) = p$;
(iv) for all pairs $(u, u')$ of constants $u$ and $u'$, $\alpha(u, u', t)$ is a timelike curve.
The variation vectors $Z$ and $Z'$ in this case are defined as:
(a) $Z = (\partial/\partial u)_\alpha \big|_{u=0,\,u'=0}$
(b) $Z' = (\partial/\partial u')_\alpha \big|_{u=0,\,u'=0}$

Resulting from these variations are the length variations of geodesic curves. To write the formula for the length variation in the one-parameter case, we assume that the parameter $t$ in $\gamma(t)$ is the arc-length parameter $s$; then, denoting by $V$ the unit tangent vector $\partial/\partial s$, we have from (8.3.51):
\[ \frac{\partial L}{\partial u}\Big|_{u=0} = \sum_{i=1}^{n-1} \int_{s_i}^{s_{i+1}} g(Z, \dot V)\, ds + \sum_{i=2}^{n-1} g(Z, [V]) \] (8.3.52)

where $\dot V = \frac{DV}{\partial s}$ is the acceleration vector (see also Exc. 2.7 for $\dot V$), and $[V]$ represents the discontinuity at singular points of $\gamma(s)$ (see p. 107 of [16c]). The length variation under the two-parameter variation of the geodesic $\gamma(t)$ involves two variation vectors $Z$ and $Z'$, and, as can be expected, the second-order derivative of the length with respect to the two parameters is symmetric in $Z$, $Z'$; we write it as:
\[ L(Z, Z') = \frac{\partial^2 L}{\partial u'\, \partial u}\Big|_{u=0,\,u'=0} \] (8.3.53)

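and explain next the reason for this choice. Using steps similar to the one-parameter case, the derivative $\frac{\partial^2 L}{\partial u'\, \partial u}\big|_{u=0,\,u'=0}$ can be written as:
\[ \frac{\partial^2 L}{\partial u'\, \partial u}\Big|_{u=0,\,u'=0} = \sum_{i=1}^{n-1} \int_{t_i}^{t_{i+1}} g\!\left( Z, \left\{ \frac{D^2}{\partial s^2}\left( Z' + g(V, Z') V \right) - R(V, Z') V \right\} \right) ds + \sum_{i=2}^{n-1} g\!\left( Z, \left[ \frac{D}{\partial s}\left( Z' + g(V, Z') V \right) \right] \right) \] (8.3.54)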

It can be further checked that this length variation depends only on the projections of $Z$, $Z'$ into the space orthogonal to $V$. Thus if we denote by $T_\gamma$ the infinite-dimensional vector space consisting of all continuous, piecewise $C^2$ vector fields along $\gamma(t)$ orthogonal to $V$ and vanishing at $q$ and $p$, then the second derivative of the length is a symmetric map of $T_\gamma \times T_\gamma$ to $R^1$. This may be viewed as a symmetric tensor on $T_\gamma$, and can therefore be written as in (8.3.53):
\[ L(Z, Z') = \frac{\partial^2 L}{\partial u'\, \partial u}\Big|_{u=0,\,u'=0} \]
where $Z$, $Z'$ now belong to $T_\gamma$. Recall that while for a positive definite metric one seeks the shortest curve between two points (which, as we know, is a geodesic), in the case of a Lorentz metric one looks for a longest non-spacelike curve. We call a timelike geodesic curve $\gamma(t)$ from $q$ to $p$ maximal if $L(Z, Z')$ is negative semi-definite. Using the above definitions we now state the following results:

Proposition 8.3.16: Let $\mathcal{U}$ be a convex normal coordinate neighbourhood of a point $q$ in $\mathcal{M}$. Then the points that can be reached from $q$ by timelike (respectively non-spacelike) curves in $\mathcal{U}$ are only those of the form $\exp_q(X)$, where $X \in T_q$ satisfies $g(X, X) < 0$ (respectively $\le 0$).

The above proposition says that the boundary of the region in $\mathcal{U}$ which can be reached from $q$ by timelike or non-spacelike curves in $\mathcal{U}$ is formed by the null geodesics emanating from $q$.

Proposition 8.3.17: Let $q$ and $p$ be two points of a convex normal neighbourhood $\mathcal{U}$. Then if $q$ and $p$ can be joined by a non-spacelike curve in $\mathcal{U}$, the longest such curve is the unique non-spacelike geodesic curve in $\mathcal{U}$ from $q$ to $p$.

Thus if $\lambda(t)$ is a timelike curve in $\mathcal{U}$ from $q$ to $p$, with a representation $\lambda(t) = \alpha(f(t), t)$$^{30}$, and if $\rho(q, p)$ denotes the length of the non-spacelike geodesic in $\mathcal{U}$ from $q$ to $p$ when it exists and zero otherwise, then $\rho(q, p)$ is a continuous function on $\mathcal{U} \times \mathcal{U}$, and from (8.3.51) we have:
\[ L(\lambda, q, p) \le \int_q^p f'(t)\, dt = \rho(q, p). \] (8.3.55)

The equality holds if and only if $\lambda$ is the unique geodesic curve in $\mathcal{U}$ from $q$ to $p$. Finally, using the maximality definition we have:

Proposition 8.3.18: A timelike geodesic curve $\gamma(t)$ from $q$ to $p$ is maximal if and only if there is no point conjugate to $q$ along $\gamma(t)$ in $(q, p)$.

Proposition 8.3.19: A timelike geodesic curve $\gamma(t)$ from a spacelike 3-surface $\mathcal{H}$ to $p$ is maximal if and only if there is no point in $(\mathcal{H}, p)$ conjugate to $\mathcal{H}$ along $\gamma$.

We note that the maximality condition involving $L(Z, Z')$ in this case reads as: a timelike geodesic curve from a surface $\mathcal{H}$ to $p$ and normal to $\mathcal{H}$ is maximal if $L(Z, Z')$ given in (8.3.53) is negative semi-definite, where it is assumed that the point $q$ defined by the two-parameter variation $\alpha$, $\alpha(u, u', 0) = q$, instead of being fixed, varies over $\mathcal{H}$. We now move on to Sec. 4, where we apply the ideas gathered in this section to study the exact solutions of spacetime.

4 EXACT SOLUTIONS, AND THE CAUSAL STRUCTURE

In this section we discuss the above two important topics pertaining to spacetime. We shall see that they derive their unique role for different reasons. For instance, the former helps us understand the structural properties of metrics that are solutions of Einstein equations, while the latter leads us to the study of singularity theorems and black holes.

4.1 An Exact Solution

To define an 'exact solution' we note, to begin with, that the Einstein equation:
\[ R_{ab} - \tfrac{1}{2} R g_{ab} + \Lambda g_{ab} = 8\pi T_{ab} \]
can be theoretically satisfied by any spacetime metric, and the energy-momentum (EM) tensor, the RHS of this equation, can then be considered as a known entity. In reality, however, the matter tensor does not have physically reasonable properties in all cases (see Subsec. 4.1.2 here). In the following definition of an 'exact solution' we lay down the conditions that need to be satisfied by the EM tensor so that it may be physically compatible with a solution.

Definition 8.4.1: A spacetime $(\mathcal{M}, g)$ is called an exact solution of Einstein's equations if the field equations are satisfied with $T_{ab}$ (the EM tensor of some specified form of matter) that obeys the 'local causality' postulate (a) (see Sec. 8.2) as well as one of the energy conditions (see Sec. 8.3).

For example, one may look for exact solutions for empty space ($T_{ab} = 0$), for an electromagnetic field where $T_{ab}$ has the form (a) of Exc. (8.2.5), or for a perfect fluid where it is given by (vi) of Exc. (8.2.7). We note that due to the non-linearity of the field equations, exact solutions can be found only in spaces with high symmetry. Besides, these are often idealized, in the sense that only simple matter contents may be considered, contrary to the fact that a region of spacetime may in general contain many forms of matter. As already mentioned in Sec. 8.2, most of these solutions (models) along with their local properties were discovered earlier. Their global properties, however, were examined only after the theory was developed with the help of other mathematical disciplines, e.g., topology, group theory, and algebraic geometry. Due to our limited scope, we shall list these exact solutions along with their important local and global properties. To study their derivations, one is advised to consult the texts [16c] and [26] and the references that are cited there. Usually these solutions are named after their discoverers and are often linked to each other through coordinate transformations. We give below some of these solutions, beginning with the simplest such model. (All the notations and the diagrams to depict these models are based on Hawking and Ellis [16c].)

30 The representation $\lambda(t) = \alpha(f(t), t)$ and (8.3.55) are based on the following. Let $\alpha(s, t) = \exp_q(sX(t))$ where $g(X(t), X(t)) = -1$; then, writing $\lambda(t) = \alpha(f(t), t)$, one has $\left(\frac{\partial}{\partial t}\right)_\lambda = f'(t)\left(\frac{\partial}{\partial s}\right)_\alpha + \left(\frac{\partial}{\partial t}\right)_\alpha$. Since $\left(\frac{\partial}{\partial s}\right)_\alpha$, $\left(\frac{\partial}{\partial t}\right)_\alpha$ are mutually orthogonal and $g\!\left(\left(\frac{\partial}{\partial s}\right)_\alpha, \left(\frac{\partial}{\partial s}\right)_\alpha\right) = -1$, this gives $g\!\left(\left(\frac{\partial}{\partial t}\right)_\lambda, \left(\frac{\partial}{\partial t}\right)_\lambda\right) \ge -(f'(t))^2$. The equality holds if and only if $\lambda(t)$ is a geodesic curve.

4.1.1 Minkowski Spacetime

This is the simplest empty spacetime of General Relativity which, as we know, is also the spacetime of Special Relativity. The pair $(\mathcal{M}, g)$ here is $(\mathbb{R}^4, \eta)$, where $\eta$ is the flat Lorentz metric of signature $(1, 1, 1, -1)$, expressed as:

$ds^2 = -(dx^4)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2$    (8.4.1)

in terms of the natural coordinates $(x^1, x^2, x^3, x^4)$ on $\mathbb{R}^4$.31 The geodesics in this case are given by:

$x^a(v) = b^a v + c^a$    (8.4.2)

where $b^a$ and $c^a$ are constants. As a result the exponential map $\exp_p : T_p \to \mathcal{M}$ can be written as:

$x^a(\exp_p X) = X^a + x^a(p)$    (8.4.3)

with $X^a$ being the components of the vector $X$ with respect to the coordinate basis $\{\partial/\partial x^a\}$ of $T_p$. The map $\exp_p$ is a diffeomorphism between $T_p$ and $\mathcal{M}$ since it is one-one and onto. Thus any two points of $\mathcal{M}$ can be joined by a unique geodesic curve. Also, $\exp_p$ is defined everywhere on $T_p$ for all $p$, hence $(\mathcal{M}, \eta)$ is geodesically complete (see App. A.2 for the definition). Associated with this pair $(\mathbb{R}^4, \eta)$ are the surfaces $x^4$ = constant, which represent a family of Cauchy surfaces*. These surfaces foliate the whole of $\mathcal{M}$.

31 The manifold $\mathcal{M}$ with this choice of metric is sometimes denoted as $\mathbb{R}^{3,1}$. In Sec. 1 we have used $x^0$ in place of $x^4$ and have denoted the Minkowskian space by a different symbol.

* For a spacelike 3-surface $S$, if $D^+(S) \cup D^-(S) = \mathcal{M}$, i.e., if every inextendible non-spacelike curve in $\mathcal{M}$ intersects $S$, then $S$ is called a Cauchy surface (see Subsec. 3.3 for $D^+(S)$ and $D^-(S)$).

[Figure 8.6: A Cauchy surface {$x^4$ = const.} in Minkowski spacetime, and spacelike surfaces $S_\sigma$, $S_{\sigma'}$ which are not Cauchy surfaces. All the normal geodesics to $S_\sigma$, $S_{\sigma'}$ intersect at $O$. Labelled in the figure: the future and past null cones of $O$, a null geodesic, a uniformly accelerating timelike curve, and a surface {$x^4$ = constant}.]

It should be noted here that there are inextendible spacelike surfaces which are not Cauchy surfaces, for example the surfaces:

$S_\sigma : \{-(x^4)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2 = \sigma = \text{constant}\}$    (8.4.4)

where $\sigma < 0$, $x^4 < 0$, are spacelike surfaces that lie entirely inside the past null cone of the origin $O$ and thus are not Cauchy surfaces (see Fig. (8.6)).

We shall now give a few other coordinate systems that are used on the Minkowski spacetime, in order to show how a particular coordinate system introduces or removes a singularity. Consider the spherical polar coordinates $(t, r, \theta, \phi)$ with $x^4 = t$, $x^3 = r\cos\theta$, $x^2 = r\sin\theta\cos\phi$, $x^1 = r\sin\theta\sin\phi$. The metric:

$ds^2 = -dt^2 + dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.5)

is apparently singular for $r = 0$ and $\sin\theta = 0$, since $(t, r, \theta, \phi)$ are not admissible coordinates at these points. This singularity is removed by restricting the coordinates to the ranges $0 < r < \infty$, $0 < \theta < \pi$, $0 < \phi < 2\pi$. Two such coordinate neighbourhoods are needed to cover the whole of Minkowski space (see Chapter 10).

The coordinate system defined by $v = t + r$, $w = t - r$ (so that $v \ge w$), where $-\infty < v < \infty$, $-\infty < w < \infty$, is called the null coordinate system. The metric here takes the form:

$ds^2 = -dv\,dw + \tfrac{1}{4}(v - w)^2 (d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.6)

The null coordinate $v$ ($w$) represents the advanced (retarded) time coordinate and can be thought of as labelling an incoming (outgoing) spherical wave travelling at the speed of light. The surfaces $w$ = constant and $v$ = constant are null surfaces (i.e., $w_{;a} w_{;b}\, g^{ab} = 0 = v_{;a} v_{;b}\, g^{ab}$; see Fig. (8.7)). The intersection of a surface $v$ = const. with a surface $w$ = const. is a two-sphere.

[Figure 8.7: (i) The $v$, $w$ coordinate surfaces with one coordinate suppressed. (ii) The $(t, r)$ plane: each point represents a two-sphere of radius $r$.]

Still another set of null coordinates, denoted $p$, $q$, can be defined with the help of $v$, $w$. These coordinates have finite range and are given by the relations:

$\tan p = v, \quad \tan q = w \qquad (p, q \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2});\ p \ge q)$    (8.4.7)

The metric $\eta$ now becomes:

$ds^2 = \sec^2 p\, \sec^2 q\, \left[-dp\,dq + \tfrac{1}{4}\sin^2(p - q)\,(d\theta^2 + \sin^2\theta\, d\phi^2)\right]$    (8.4.8)

which is evidently conformal to another metric $\bar{g}$ given by:

$d\bar{s}^2 = -4\,dp\,dq + \sin^2(p - q)\,(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.9)

If we further define $t' = p + q$, $r' = p - q$, where

$-\pi < t' + r' < \pi, \quad -\pi < t' - r' < \pi, \quad r' \ge 0$    (8.4.10)

then (8.4.9) can be written as:

$d\bar{s}^2 = -(dt')^2 + (dr')^2 + \sin^2 r'\,(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.11)

This means that the whole of Minkowski spacetime is given by the region (8.4.10) of the metric:

$ds^2 = \tfrac{1}{4}\sec^2\!\left(\tfrac{1}{2}(t' + r')\right) \sec^2\!\left(\tfrac{1}{2}(t' - r')\right) d\bar{s}^2$

with $d\bar{s}^2$ being determined by (8.4.11). The coordinates $(t, r)$ of (8.4.5) are related to $(t', r')$ by:

$2t = \tan\!\left(\tfrac{1}{2}(t' + r')\right) + \tan\!\left(\tfrac{1}{2}(t' - r')\right)$
$2r = \tan\!\left(\tfrac{1}{2}(t' + r')\right) - \tan\!\left(\tfrac{1}{2}(t' - r')\right)$    (8.4.12)
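The chain of coordinate changes $(t, r) \to (v, w) \to (p, q) \to (t', r')$ can be checked numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative. It maps a point to the compactified coordinates, inverts the map with (8.4.12), and compares the angular coefficient of (8.4.6) with that of (8.4.8).

```python
import math

def compactify(t, r):
    """Map Minkowski (t, r), r >= 0, to the coordinates (t', r')."""
    v, w = t + r, t - r                  # null coordinates
    p, q = math.atan(v), math.atan(w)    # tan p = v, tan q = w, as in (8.4.7)
    return p + q, p - q                  # t' = p + q, r' = p - q

def decompactify(tp, rp):
    """Recover (t, r) from (t', r') using (8.4.12)."""
    a = math.tan(0.5 * (tp + rp))
    b = math.tan(0.5 * (tp - rp))
    return 0.5 * (a + b), 0.5 * (a - b)

def angular_coefficients(t, r):
    """Return the coefficient of the angular part in (8.4.6) and in (8.4.8)."""
    v, w = t + r, t - r
    p, q = math.atan(v), math.atan(w)
    in_846 = 0.25 * (v - w) ** 2
    in_848 = 0.25 * math.sin(p - q) ** 2 / (math.cos(p) ** 2 * math.cos(q) ** 2)
    return in_846, in_848

if __name__ == "__main__":
    tp, rp = compactify(3.0, 2.0)
    print("t', r' =", tp, rp)
    print("round trip:", decompactify(tp, rp))
    print("angular coefficients:", angular_coefficients(3.0, 2.0))
```

However large $t$ and $r$ are chosen, the image stays inside the finite region (8.4.10), which is the point of the construction.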

Remark 8.4.2: The various coordinates and the corresponding metrics introduced here are each significant in one way or another. For instance, the metric (8.4.11) is locally identical to the metric of the Einstein static universe, which is a completely homogeneous spacetime (see the Appendix). The metric (8.4.11) can be analytically extended to the whole of the Einstein static universe $\mathbb{R}^1 \times S^3$, where now $t' \in (-\infty, \infty)$ and $(r', \theta, \phi)$ denote the coordinates on $S^3$. The coordinate singularities of $r'$ and $\theta$ at $0$ and $\pi$ can be removed by using suitable local coordinates in a neighbourhood of the points where (8.4.11) is singular. The coordinate system $(p, q, \theta, \phi)$ is used to define the infinities of Minkowski spacetime. The points $(p = \tfrac{\pi}{2}, q = \tfrac{\pi}{2})$ and $(p = -\tfrac{\pi}{2}, q = -\tfrac{\pi}{2})$, denoted $i^+$ and $i^-$ in the literature, represent respectively the future and the past timelike infinity, whereas the point $(p = \tfrac{\pi}{2}, q = -\tfrac{\pi}{2})$, denoted $i^0$, represents the spacelike infinity. We also have the null surfaces $p = \tfrac{\pi}{2}$ and $q = -\tfrac{\pi}{2}$, denoted $\mathscr{I}^+$ and $\mathscr{I}^-$, which represent the future and past null infinity.

Remark 8.4.3: In view of the above remark, it follows that the whole of Minkowski spacetime is conformal to the region (8.4.10) of the Einstein static universe (shown in Fig. (8.8) by the shaded area). The boundary of this region, given by $\mathscr{I}^+$, $\mathscr{I}^-$, $i^+$, $i^-$ and $i^0$, can be thought of as representing the conformal structure of infinity of Minkowski spacetime.

Remark 8.4.4: The coordinates $(t', r')$ were introduced by Penrose. They can be used (rather effectively) to represent the conformal structure of infinity shown in Fig. (8.8)(i) simply by means of the $(t', r')$ plane given by the diagram in Fig. (8.8)(ii). Each point of this diagram represents a sphere $S^2$, and radial null geodesics are represented by straight lines at $\pm 45°$. This representation, known as the Penrose diagram, can always be used for the structure of infinity in any spherically symmetric spacetime. The conformal structure described above can be viewed as the 'normal' behaviour of spacetime at infinity. Evidently this behaviour may not be found in all the cases that one encounters in reality.

Remark 8.4.5: From the above discussion it follows that whatever can be seen from infinity is determined by the light cone structure of spacetime. This structure is unchanged by a conformal transformation of the metric, e.g., $g_{ab} \to \Omega^2 g_{ab}$, $\Omega$ being a smooth positive function of position. Hence it is useful to apply a suitable conformal transformation which squashes everything up near infinity and brings infinity up to a finite distance. This is exactly what has been done by introducing the coordinates $(v, w) \to (p, q) \to (t', r')$.

4.1.2 de Sitter and Anti-de Sitter Spacetimes

Consider the spacetime metrics with constant curvature. The Riemann tensor here is given (locally) as:

$R_{abcd} = \tfrac{1}{12} R\,(g_{ac}\, g_{bd} - g_{ad}\, g_{bc})$

In terms of the conformal tensor $C_{abcd}$, the above relation is equivalent to:

$C_{abcd} = 0 = R_{ab} - \tfrac{1}{4} R\, g_{ab}$    (8.4.13)

[Figure 8.8: (i) The Einstein static universe represented by an embedded cylinder; the shaded region is conformal to the whole of Minkowski spacetime. (ii) The corresponding diagram in the $(t', r')$ plane.]

When $R > 0$ the space of constant curvature is the 'de Sitter spacetime.' The de Sitter space has the topology of $\mathbb{R}^1 \times S^3$ and can be considered as the hyperboloid:

$-v^2 + w^2 + x^2 + y^2 + z^2 = a^2$    (8.4.14)(a)

in $\mathbb{R}^5$ with metric:

$-dv^2 + dw^2 + dx^2 + dy^2 + dz^2 = ds^2$    (8.4.14)(b)

The use of coordinates $(t, \chi, \theta, \phi)$ defined by:

$a \sinh(a^{-1}t) = v$
$a \cosh(a^{-1}t) \cos\chi = w$
$a \cosh(a^{-1}t) \sin\chi \cos\theta = x$
$a \cosh(a^{-1}t) \sin\chi \sin\theta \cos\phi = y$
$a \cosh(a^{-1}t) \sin\chi \sin\theta \sin\phi = z$    (8.4.15)(a)

leads to the metric:

$ds^2 = -dt^2 + a^2 \cosh^2(a^{-1}t)\,\{d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)\}$    (8.4.15)(b)
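That the parametrization (8.4.15)(a) really lies on the hyperboloid (8.4.14)(a) is a one-line identity, but it is easy to confirm numerically. The sketch below is illustrative only; the function names are not from the text.

```python
import math

def de_sitter_embedding(t, chi, theta, phi, a=1.0):
    """Return (v, w, x, y, z) from (t, chi, theta, phi) as in (8.4.15)(a)."""
    c = a * math.cosh(t / a)
    v = a * math.sinh(t / a)
    w = c * math.cos(chi)
    x = c * math.sin(chi) * math.cos(theta)
    y = c * math.sin(chi) * math.sin(theta) * math.cos(phi)
    z = c * math.sin(chi) * math.sin(theta) * math.sin(phi)
    return v, w, x, y, z

def hyperboloid_value(point):
    """Left-hand side of (8.4.14)(a): -v^2 + w^2 + x^2 + y^2 + z^2."""
    v, w, x, y, z = point
    return -v * v + w * w + x * x + y * y + z * z

if __name__ == "__main__":
    pt = de_sitter_embedding(0.7, 1.1, 0.4, 2.5, a=2.0)
    print(hyperboloid_value(pt))  # should be close to a**2
```

The spatial part contributes $a^2\cosh^2(a^{-1}t)$ and the $v$ term subtracts $a^2\sinh^2(a^{-1}t)$, leaving $a^2$.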

The metric singularities are simply those that occur for polar coordinates, namely the ones at $\chi = 0$, $\chi = \pi$ and $\theta = 0$, $\theta = \pi$. Apart from these singularities, the coordinates with the ranges $-\infty < t < \infty$, $0 \le \chi \le \pi$, $0 \le \theta \le \pi$, $0 \le \phi \le 2\pi$ cover the whole space. The spatial sections given by $t$ = constant are spheres $S^3$ of constant positive curvature, which are Cauchy surfaces. Their geodesic normals are lines which contract monotonically to a minimum spatial separation and then re-expand to infinity (see Fig. (8.9)(i)). If one uses the coordinates:

$\hat{t} = a \log\dfrac{w + v}{a}, \quad \hat{x} = \dfrac{a x}{w + v}, \quad \hat{y} = \dfrac{a y}{w + v}, \quad \hat{z} = \dfrac{a z}{w + v}$

on the hyperboloid, the metric takes a simpler form:

$ds^2 = -d\hat{t}^2 + \exp(2a^{-1}\hat{t})\,(d\hat{x}^2 + d\hat{y}^2 + d\hat{z}^2)$

But these coordinates cover only half of the hyperboloid, as $\hat{t}$ is not defined for $w + v \le 0$ (see Fig. (8.9)(ii)). The region $v + w > 0$ forms the 'steady state' model of the universe proposed by Bondi and Gold [2] and Hoyle [19(a)].

[Figure 8.9: De Sitter spacetime represented by a hyperboloid imbedded in a 5-dimensional flat space (with two dimensions suppressed). (i) The coordinates $(t, \chi, \theta, \phi)$ cover the whole hyperboloid; the sections $t$ = const. are surfaces of curvature $k = +1$. (ii) The coordinates $(\hat{t}, \hat{x}, \hat{y}, \hat{z})$ cover half of the hyperboloid; the surfaces $\hat{t}$ = const. are flat 3-spaces, their geodesic normals diverging from a point in the infinite past.]

To study the infinity in this space, a time coordinate $t'$ given as:

$t' = 2\arctan(\exp a^{-1}t) - \tfrac{\pi}{2} \qquad (-\tfrac{\pi}{2} < t' < \tfrac{\pi}{2})$    (8.4.16)

can be defined. This gives:

$ds^2 = a^2 \cosh^2(a^{-1}t)\; d\bar{s}^2$    (8.4.17)

where $d\bar{s}^2$ is the same as in (8.4.11) after $r'$ is identified with $\chi$. Thus the de Sitter space is conformal to the part of the Einstein static universe defined by the range in (8.4.16) (see Fig. (8.10)(i)). The Penrose diagram, a plane in the $t'$ and $\chi$ coordinates, depicts the singularities of this space in a simple manner (see Fig. (8.10)(ii) and (iii)).
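The conformal factor in (8.4.17) comes from the relation $dt'/dt = 1/(a\cosh(a^{-1}t))$ implied by (8.4.16), so that $dt^2 = a^2\cosh^2(a^{-1}t)\,dt'^2$. As a rough numerical check (a sketch with illustrative names, using a central finite difference):

```python
import math

def conformal_time(t, a=1.0):
    """t' = 2 arctan(exp(t / a)) - pi / 2, as in (8.4.16)."""
    return 2.0 * math.atan(math.exp(t / a)) - 0.5 * math.pi

def dtprime_dt(t, a=1.0, h=1e-6):
    """Central-difference estimate of dt'/dt."""
    return (conformal_time(t + h, a) - conformal_time(t - h, a)) / (2.0 * h)

if __name__ == "__main__":
    t, a = 0.8, 1.5
    print(dtprime_dt(t, a), 1.0 / (a * math.cosh(t / a)))
```

The two printed numbers should agree to within the finite-difference error, and $t'$ stays strictly inside $(-\pi/2, \pi/2)$ for every finite $t$.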

[Figure 8.10: (i) De Sitter spacetime shown as being conformal to the region $-\tfrac{\pi}{2} < t' < \tfrac{\pi}{2}$ of the Einstein static universe. The shaded region depicts the portion that is conformal to the steady state universe. (ii) The Penrose diagram of de Sitter spacetime. (iii) The Penrose diagram of the steady state universe. In (ii) and (iii) each point represents a 2-sphere of area $2\pi \sin^2\chi$; null lines are at 45°, and $\chi = 0$ and $\chi = \pi$ are identified.]

Remark 8.4.6: We note that, in contrast to Minkowski space, the de Sitter space has a spacelike infinity for timelike and null lines, both in the future and in the past. This difference corresponds to the existence in this space of both particle and event horizons for geodesic families of observers (see A.7 and A.8 in the appendix).

Definition 8.4.7: The spacetime of constant curvature with $R < 0$ is called the anti-de Sitter space. The topology of this spacetime is that of $S^1 \times \mathbb{R}^3$; it is represented by the hyperboloid:

$-u^2 - v^2 + x^2 + y^2 + z^2 = 1,$

embedded in the flat five-dimensional space $\mathbb{R}^5$ with metric:

$ds^2 = -(du)^2 - (dv)^2 + (dx)^2 + (dy)^2 + (dz)^2$    (8.4.18)

Remark 8.4.8: The anti-de Sitter space is not simply connected, and it has closed timelike lines. The universal covering space of this space, obtained by unwrapping the circle $S^1$ to its covering space $\mathbb{R}^1$, has the topology of $\mathbb{R}^4$. This universal covering space has no closed timelike lines, and it carries the following metric:

$ds^2 = -dt^2 + \cos^2 t\,\{d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)\}$
[Figure 8.11: (i) The universal anti-de Sitter space, conformal to one half of the Einstein static universe: the coordinates $(t', r, \theta, \phi)$ cover the whole space, whereas $(t, \chi, \theta, \phi)$ cover only one diamond-shaped region; all the geodesics orthogonal to the surfaces {$t$ = constant} can be seen converging to $p$ and $q$ and then diverging out into similar diamond-shaped regions. (ii) The Penrose diagram of the universal anti-de Sitter space. Infinity consists of the timelike surface $\mathscr{I}$ and the disjoint points $i^+$, $i^-$.]

(see Exc. 4 for metrics (8.4.19) and (8.4.20)). The surfaces $t'$ = constant cover the space completely and have non-geodesic normals.

Remark 8.4.9: The structure at infinity can be studied by using the coordinate $r'$:

$r' = 2\arctan(\exp r) - \tfrac{\pi}{2} \qquad (0 \le r' < \tfrac{\pi}{2})$    (8.4.21)

The metrics $ds^2$ and $d\bar{s}^2$ of (8.4.11) are now related as:

$ds^2 = \cosh^2 r\; d\bar{s}^2.$    (8.4.22)

Thus the whole of anti-de Sitter space is conformal to the region $0 \le r' < \tfrac{\pi}{2}$ of the Einstein static cylinder. The null and spacelike infinity can be thought of as a timelike surface here, which has the topology $\mathbb{R}^1 \times S^2$.

Remark 8.4.10: No conformal transformation can be found which makes the timelike infinity finite without reducing the Einstein static universe to a point. Therefore timelike infinity is represented by the disjoint points $i^+$, $i^-$, and, as a consequence, there exists no Cauchy surface in the space (see Fig. (8.11)(ii)).

4.1.3 Robertson-Walker Space

If the universe is spatially homogeneous and admits a six-parameter group of isometries whose surfaces of transitivity are spacelike 3-surfaces of constant curvature, it is called a Robertson-Walker (or Friedmann) space. With suitable coordinates the metric here can be written as:

$ds^2 = -dt^2 + S^2(t)\, d\sigma^2$    (8.4.23)

where $d\sigma^2$, which is independent of $t$, is the metric of a 3-space of constant curvature $K$. The geometry of these 3-surfaces is qualitatively different for the cases $K > 0$, $K < 0$ and $K = 0$. In order to study this, it is usual to rescale the function $S(t)$ so that $K$ can be taken as $1$ and $-1$ in the first two cases. The metric $d\sigma^2$ can now be written as:

$d\sigma^2 = d\chi^2 + f^2(\chi)\,(d\theta^2 + \sin^2\theta\, d\phi^2)$

where

$f(\chi) = \sin\chi \quad (0 \le \chi \le 2\pi)$ for $K = 1$,
$f(\chi) = \chi \quad (0 \le \chi < \infty)$ for $K = 0$,
$f(\chi) = \sinh\chi \quad (0 \le \chi < \infty)$ for $K = -1$.

In the last two cases the spaces are diffeomorphic to $\mathbb{R}^3$ and so are 'infinite'; in the first case they are diffeomorphic to a 3-sphere $S^3$ and are therefore compact ('closed' or 'finite').

Remark 8.4.11: The symmetry of the Robertson-Walker solutions requires that the energy-momentum tensor has the form of a homogeneous perfect fluid. The density $\mu$ and the pressure $p$ of this fluid are functions of time $t$, and the flow lines are the curves $\chi, \theta, \phi$ = const. The function $S(t)$ represents the separation of neighbouring flow lines (i.e., of nearby galaxies).

In view of the above remark, the equation of conservation of energy in these spaces is (see Exc. (2.7) and Exc. (4.6)):

$\dot{\mu} = -3(\mu + p)\,\dot{S}/S$    (8.4.24)

and the Raychaudhuri equation becomes:

$4\pi(\mu + 3p) - \Lambda = -3\ddot{S}/S$    (8.4.25)

Remark 8.4.12: From (8.4.24) it follows that the density $\mu$ decreases as the universe expands (provided $\mu + p > 0$). Equivalently, the density was higher in the past, increasing without bound as $S \to 0$. An infinite density implies an infinite curvature. Hence Robertson-Walker spacetime has a singularity at $S = 0$ which, unlike the coordinate singularities, cannot be removed by coordinate transformations (see [16c] and [26] for more details on these spaces).
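The behaviour described in Remark 8.4.12 can be illustrated by integrating (8.4.24) directly. Dividing by $\dot{S}$ gives $d\mu/dS = -3(\mu + p)/S$. The sketch below assumes, purely for illustration, a linear equation of state $p = w\mu$ with constant $w$ (this assumption and the function name are not from the text); the exact solution is then $\mu \propto S^{-3(1+w)}$.

```python
def density_after_expansion(mu0, w, s_start, s_end, steps=2000):
    """Integrate d(mu)/dS = -3 (1 + w) mu / S from s_start to s_end.

    Uses the classical fourth-order Runge-Kutta scheme. The equation of
    state p = w * mu is an illustrative assumption, not part of the text.
    """
    def rhs(s, mu):
        return -3.0 * (1.0 + w) * mu / s

    h = (s_end - s_start) / steps
    s, mu = s_start, mu0
    for _ in range(steps):
        k1 = rhs(s, mu)
        k2 = rhs(s + 0.5 * h, mu + 0.5 * h * k1)
        k3 = rhs(s + 0.5 * h, mu + 0.5 * h * k2)
        k4 = rhs(s + h, mu + h * k3)
        mu += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        s += h
    return mu

if __name__ == "__main__":
    # Pressure-free matter (w = 0): doubling S divides the density by 8.
    print(density_after_expansion(1.0, 0.0, 1.0, 2.0))
    # Running backwards towards small S, the density grows without bound.
    print(density_after_expansion(1.0, 0.0, 1.0, 0.01))
```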

4.1.4 The Schwarzschild and the Reissner-Nordström Solution

The spatially homogeneous solutions, e.g., Robertson-Walker, which are good models for the large scale distribution of matter in the universe, are not suitable for describing the local geometry of spacetime in the solar system. This geometry can be described to a good approximation by the Schwarzschild solution. This solution represents the spherically symmetric empty spacetime outside a spherically symmetric massive body, with the metric given by:

$ds^2 = -\left(1 - \dfrac{2m}{r}\right) dt^2 + \left(1 - \dfrac{2m}{r}\right)^{-1} dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.26)

where $r > 2m$.* This spacetime is 'static', meaning that $\partial/\partial t$ is a timelike Killing vector which is a gradient, and it is spherically symmetric, i.e., it is invariant under the group of isometries $SO(3)$ operating on the spacelike 2-spheres {$t$, $r$ = constant}. Note that the coordinate $r$ here is defined by the requirement that the area of these surfaces of transitivity is $4\pi r^2$. The solution is unique† and is asymptotically flat, as the metric has the form:

$g_{ab} = \eta_{ab} + O(1/r)$

for large $r$ (see Exc. 8.4.1 and 8.4.5).

If the metric is taken as an empty space solution for all values of $r$, it is singular at $r = 0$ and $r = 2m$. To remove these singularities, the manifold is cut into two disconnected components defined by $0 < r < 2m$ and $2m < r < \infty$, and since the manifold $\mathcal{M}$ of a spacetime has to be connected, the component $2m < r < \infty$ is taken as the required $\mathcal{M}$ of $(\mathcal{M}, g)$. We note that although the metric is singular at $r = 2m$ in the coordinates $(t, r, \theta, \phi)$, no scalar polynomial formed from the curvature tensor and the metric diverges as $r \to 2m$; hence this singularity is not a real physical singularity but the result of a bad choice of coordinates. To avoid it, different coordinate transformations have been used; two of these are:

$v = t + r^*, \quad w = t - r^*$    (advanced, retarded null coordinates)

where

$r^* = \displaystyle\int \dfrac{dr}{1 - 2m/r} = r + 2m \log(r - 2m)$    (8.4.27)

* A comparison with Newtonian theory suggests that $m$ here (as measured from infinity) is the gravitational mass of the body producing the field.

† By uniqueness we mean that any spherically symmetric solution of the vacuum field equations must be locally isometric to the Schwarzschild solution.
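The closed form in (8.4.27) can be checked by differentiating it numerically and comparing with the integrand $(1 - 2m/r)^{-1}$. A minimal sketch, valid only in the exterior region $r > 2m$ (function names are illustrative):

```python
import math

def r_star(r, m):
    """r* = r + 2m log(r - 2m), defined for r > 2m, as in (8.4.27)."""
    return r + 2.0 * m * math.log(r - 2.0 * m)

def dr_star_dr(r, m, h=1e-6):
    """Central-difference estimate of dr*/dr."""
    return (r_star(r + h, m) - r_star(r - h, m)) / (2.0 * h)

if __name__ == "__main__":
    r, m = 5.0, 1.0
    print(dr_star_dr(r, m), 1.0 / (1.0 - 2.0 * m / r))
```

As $r \to 2m$ from above, $r^* \to -\infty$, which is why lines of constant $v$ or $w$ reach $r = 2m$ only at infinite values of $t$.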

The metric in terms of $(v, r, \theta, \phi)$ (known as the Eddington-Finkelstein metric $g'$) takes the form:

$ds^2 = -\left(1 - \dfrac{2m}{r}\right) dv^2 + 2\,dv\,dr + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.28)

and in terms of $(w, r, \theta, \phi)$ (denoted $g''$) it becomes:

$ds^2 = -\left(1 - \dfrac{2m}{r}\right) dw^2 - 2\,dw\,dr + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.29)

Both these metrics are non-singular and analytic on the larger manifolds $\mathcal{M}'$ and $\mathcal{M}''$ defined by the region $0 < r < \infty$. It is through these manifolds $(\mathcal{M}', g')$ and $(\mathcal{M}'', g'')$ that the manifold $(\mathcal{M}, g)$, whose region is $2m < r < \infty$, can be extended. For instance, the region of $\mathcal{M}'$ for which $0 < r < 2m$ is isometric to the region of the Schwarzschild metric for which $0 < r < 2m$. Thus, by taking a different manifold, the Schwarzschild metric is extended in such a manner that it is no longer singular at $r = 2m$.

In the manifold $\mathcal{M}'$, $r = 2m$ is a null surface. In the section of spacetime given by $\theta$, $\phi$ constant, each point represents a 2-sphere of area $4\pi r^2$. The surface $r = 2m$ acts as a one-way membrane: it allows future-directed timelike and null curves to cross only from the outside ($r > 2m$) to the inside ($r < 2m$), whereas it does not let past-directed timelike and null curves cross from the region $r > 2m$ to the region $r < 2m$. Thus the representation $(\mathcal{M}', g')$ is not time symmetric, as is also obvious from the presence of the term $dv\,dr$ in $g'$. As $r \to 0$, the scalar $R_{abcd} R^{abcd}$ diverges as $m^2/r^6$; hence $r = 0$ is a real singularity. We call $(\mathcal{M}', g')$ the advanced Finkelstein extension of $(\mathcal{M}, g)$; alternatively, $(\mathcal{M}, g)$ is said to be imbedded in $(\mathcal{M}', g')$.

The manifold $(\mathcal{M}'', g'')$, known as the retarded Finkelstein extension of $(\mathcal{M}, g)$, has much the same properties as $(\mathcal{M}', g')$. For instance, the region $0 < r < 2m$ of $\mathcal{M}''$ is isometric to the region $0 < r < 2m$ of the Schwarzschild metric, although the isometry here reverses the direction of time. As in $\mathcal{M}'$, in the manifold $\mathcal{M}''$ the surface $r = 2m$ is a null surface which acts as a one-way membrane; the only difference is that now it is the past-directed timelike or null curves that cross from the outside ($r > 2m$) to the inside ($r < 2m$).
Using these two manifolds, we can define a still larger manifold $\mathcal{M}^*$ with a metric $g^*$ into which both $(\mathcal{M}', g')$ and $(\mathcal{M}'', g'')$ can be isometrically imbedded so that they coincide on the region $r > 2m$, which is isometric to $(\mathcal{M}, g)$. The construction of the pair $(\mathcal{M}^*, g^*)$ is due to Kruskal (see Secs. 31.1 and 31.4 in [26]); here $g^*$ is given by:

$ds^2 = F^2(t', x')\,(-dt'^2 + dx'^2) + r^2(t', x')\,(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.30)

and the manifold $\mathcal{M}^*$ is defined by the coordinates $(t', x', \theta, \phi)$, with the functions $r(t', x')$ and $F(t', x')$ fixed by the relations (8.4.31)(a) and (8.4.31)(b). Both these functions are positive and analytic. The coordinates $(t', x')$ are arrived at from the pair $(v, w)$ via the following relations:

$v' = \exp(v/4m), \quad w' = -\exp(-w/4m)$    (8.4.32)(a)

$x' = \tfrac{1}{2}(v' - w'), \quad t' = \tfrac{1}{2}(v' + w')$    (8.4.32)(b)
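The passage from Schwarzschild coordinates to $(t', x')$ can be traced step by step in the exterior region. The sketch below (illustrative names, $r > 2m$ only) composes (8.4.27) and (8.4.32). As a consistency check it uses the identity $t'^2 - x'^2 = v'w' = -(r - 2m)\exp(r/2m)$, which follows from those two equations by direct substitution; it is stated here as a derived relation rather than quoted from the text.

```python
import math

def kruskal_from_schwarzschild(t, r, m):
    """Map exterior Schwarzschild (t, r), r > 2m, to (t', x')."""
    rs = r + 2.0 * m * math.log(r - 2.0 * m)   # r* from (8.4.27)
    v, w = t + rs, t - rs                      # advanced, retarded null coordinates
    vp = math.exp(v / (4.0 * m))               # (8.4.32)(a)
    wp = -math.exp(-w / (4.0 * m))
    return 0.5 * (vp + wp), 0.5 * (vp - wp)    # (t', x') from (8.4.32)(b)

if __name__ == "__main__":
    t, r, m = 0.3, 3.0, 1.0
    tp, xp = kruskal_from_schwarzschild(t, r, m)
    print(tp * tp - xp * xp, -(r - 2.0 * m) * math.exp(r / (2.0 * m)))
    print(xp > abs(tp))  # exterior points land in the region x' > |t'|
```

The left-hand side of the identity depends only on $r$, so surfaces of constant $r$ are hyperbolae in the $(t', x')$ plane.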

We note that the regions of $(\mathcal{M}^*, g^*)$ given by $x' > |t'|$ and $x' < -|t'|$ are both isometric to the region $r > 2m$ of the Schwarzschild solution $(\mathcal{M}, g)$. Also, the region defined by $x' > -t'$ is isometric to the advanced Finkelstein extension $(\mathcal{M}', g')$, and the region defined by $x' > t'$ is isometric to the retarded Finkelstein extension $(\mathcal{M}'', g'')$. The Kruskal extension $(\mathcal{M}^*, g^*)$ is the unique analytic and locally inextendible extension of the Schwarzschild solution.

The Reissner-Nordström solution, which is locally similar to the Schwarzschild solution, represents the spacetime outside a spherically symmetric charged body (configuration), carrying an electric charge but with no spin or magnetic dipole. The energy-momentum tensor in this case is that of the electromagnetic field in the spacetime which results from the charge on the body.32 For a suitable choice of coordinates the metric can be written as:

$ds^2 = -\left(1 - \dfrac{2m}{r} + \dfrac{e^2}{r^2}\right) dt^2 + \left(1 - \dfrac{2m}{r} + \dfrac{e^2}{r^2}\right)^{-1} dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.33)

where $m$ represents the gravitational mass and $e$ the electric charge of the body. The solution is asymptotically flat and, if $e^2 > m^2$, the metric is non-singular everywhere except for the irremovable singularity at $r = 0$. This may, however, be thought of as the point charge which produces the field. If $e^2 < m^2$, the metric also has singularities at $r_+$ and $r_-$, where

$r_\pm = m \pm (m^2 - e^2)^{1/2}.$
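The values $r_\pm$ are simply the roots of the coefficient $1 - 2m/r + e^2/r^2$ appearing in (8.4.33). A short sketch (illustrative function names) that computes them and confirms that the coefficient vanishes there:

```python
import math

def rn_metric_function(r, m, e):
    """The coefficient 1 - 2m/r + e^2/r^2 appearing in (8.4.33)."""
    return 1.0 - 2.0 * m / r + e * e / (r * r)

def rn_singular_radii(m, e):
    """Return (r_plus, r_minus), or None when e^2 > m^2 and no real root exists."""
    disc = m * m - e * e
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return m + root, m - root

if __name__ == "__main__":
    radii = rn_singular_radii(1.0, 0.6)
    print(radii)
    print([rn_metric_function(r, 1.0, 0.6) for r in radii])
    print(rn_singular_radii(1.0, 1.5))  # None: the coefficient never vanishes
```

Setting $e = 0$ gives $r_+ = 2m$ and $r_- = 0$, in agreement with the Schwarzschild case.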

We have so far considered spherically symmetric solutions. Spherical symmetry does not hold exactly in reality, however, since astronomical bodies are in general rotating and so the solutions outside them are not exactly spherically symmetric. We now give the exact solution that takes account of this phenomenon.

4.1.5 The Kerr Solution

The Kerr solutions are the only known family of solutions that can represent the stationary axisymmetric asymptotically flat field outside a rotating massive object. In fact Kerr solutions seem to be the only possible exterior solutions for black holes (see [16e]). The metric in Boyer and Lindquist coordinates $(r, \theta, \phi, t)$ can be expressed as:

$ds^2 = \rho^2\left(\dfrac{dr^2}{\Delta} + d\theta^2\right) + (r^2 + a^2)\sin^2\theta\, d\phi^2 - dt^2 + \dfrac{2mr}{\rho^2}\,(a\sin^2\theta\, d\phi - dt)^2$    (8.4.34)

where

$\rho^2 = \rho^2(r, \theta) = r^2 + a^2\cos^2\theta$ and $\Delta = \Delta(r) = r^2 - 2mr + a^2$.

The quantities $m$ and $a$ in (8.4.34) are constants; $m$ represents the mass and $ma$ the angular momentum as measured from infinity. When $a = 0$, the solution reduces to the Schwarzschild solution. The metric is invariant under the simultaneous inversion of $t$ and $\phi$, i.e., $t \to -t$ and $\phi \to -\phi$, but is not invariant under their separate inversions. When $a^2 > m^2$ we have $\Delta > 0$, and the metric is singular only when $r = 0$. The singularity here is not a point but a ring (see Exc. 7).

The Kerr solutions, being stationary and axisymmetric, have a 2-parameter group of isometries, which is abelian. Hence there are two independent Killing vector fields that commute. There is a unique linear combination $K^a$ of these vector fields which is timelike for large positive and negative values of $r$. The orbits of $K^a$ define the stationary frame; thus an object moving along these orbits appears to be stationary with respect to infinity. Besides $K^a$ there is another unique linear combination $\tilde{K}^a$ of the Killing vector fields which is zero on the axis of symmetry and whose orbits are closed curves that correspond to the rotational symmetry of the solution.

32 Actually there is no "body" (per se) in spacetime. The 'charge' expresses the total electric flux trapped in the topology of the "throat" connecting the two asymptotically flat 3-dimensional spaces.
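The statements about $\Delta$ and the ring can be made concrete with the two functions defined under (8.4.34). In the sketch below (illustrative names), $\Delta$ has real roots only when $a^2 \le m^2$, and $\rho^2$ vanishes only at $r = 0$, $\theta = \pi/2$, which is the locus of the ring.

```python
import math

def kerr_delta(r, m, a):
    """Delta(r) = r^2 - 2 m r + a^2."""
    return r * r - 2.0 * m * r + a * a

def kerr_rho2(r, theta, a):
    """rho^2(r, theta) = r^2 + a^2 cos^2(theta)."""
    return r * r + a * a * math.cos(theta) ** 2

def kerr_delta_roots(m, a):
    """Roots m +/- sqrt(m^2 - a^2) of Delta, or None when a^2 > m^2."""
    disc = m * m - a * a
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    return m + s, m - s

if __name__ == "__main__":
    print(kerr_delta_roots(1.0, 0.0))      # Schwarzschild limit: (2m, 0)
    print(kerr_delta_roots(1.0, 1.2))      # None, Delta > 0 everywhere
    print(kerr_delta(1.0, 1.0, 1.2))       # minimum of Delta: a^2 - m^2
    print(kerr_rho2(0.0, math.pi / 2, 0.5), kerr_rho2(0.0, 0.0, 0.5))
```

At $r = 0$ on the axis ($\theta = 0$) the value of $\rho^2$ is $a^2$, not zero, so the singular set is not the whole of $r = 0$.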

4.1.6 Gödel's Universe

The solution known as Gödel's universe is an example of an exact solution in which the matter is a pressure-free perfect fluid, i.e., $T_{ab} = \rho\, u_a u_b$, $\rho$ being the matter density and $u_a$ the normalized 4-velocity vector. The manifold here is $\mathbb{R}^4$, and the metric can be expressed as:

$ds^2 = -dt^2 + dx^2 - \tfrac{1}{2}\exp(2\sqrt{2}\,\omega x)\, dy^2 + dz^2 - 2\exp(\sqrt{2}\,\omega x)\, dt\, dy$    (8.4.35)

where $\omega > 0$ is a constant which stands for the magnitude of the vorticity of the flow vector $u^a$. The field equations are satisfied if $u = \partial/\partial x^0$,33 i.e., $u^a = \delta^a_0$, and $4\pi\rho = \omega^2 = -\Lambda$.

Remark 8.4.13: The Gödel spacetime $(\mathcal{M}, g)$ has a 5-dimensional group of isometries which is transitive, hence it is a completely homogeneous spacetime. It is easy to see that the metric is a direct sum of two metrics: $g_1$, with coordinates $(t, x, y)$, defined on $\mathcal{M}_1 = \mathbb{R}^3$, and $g_2$, with coordinate $z$, defined on $\mathcal{M}_2 = \mathbb{R}^1$. There are closed timelike curves through every point of $(\mathcal{M}_1, g_1)$ and hence through every point of $(\mathcal{M}, g)$. The Gödel solution is geodesically complete.

4.1.7 Taub-NUT and Misner Spaces

The Taub space is an empty space solution of Einstein's equations; we denote it as $(\mathcal{M}, g)$.34 It is spatially homogeneous and has the topology $\mathbb{R} \times S^3$. The metric here is given by:

$ds^2 = -U^{-1} dt^2 + (2l)^2\, U\,(d\psi + \cos\theta\, d\phi)^2 + (t^2 + l^2)(d\theta^2 + \sin^2\theta\, d\phi^2)$    (8.4.36)

where

$U(t) = -1 + \dfrac{2(mt + l^2)}{t^2 + l^2}$

with $m$ and $l$ being positive constants. $S^3$ is covered by the Euler coordinates $(\theta, \phi, \psi)$, which vary as $0 \le \theta \le \pi$, $0 \le \phi \le 2\pi$, $0 \le \psi \le 4\pi$. The metric is singular at $t = t_\pm = m \pm (m^2 + l^2)^{1/2}$, where $U = 0$.

33 $t = x^0$.

34 The tangent space $T_p$ for $p \in \mathcal{M}$ is the direct sum of the vertical subspace $V_p$ and the horizontal subspace $H_p$. For $X, Y \in T_p$ the metric $g$ of $(\mathcal{M}, g)$ can be locally expressed as:

$g(X, Y) = g_V(X_V, Y_V) + (t^2 + l^2)\, g_H(\pi_* X_H, \pi_* Y_H)$

where $(X_V, Y_V)$ and $(X_H, Y_H)$ are in $V_p$ and $H_p$. The space $V_p$, which is tangent to the fiber, is spanned by $\partial/\partial t$ and $\partial/\partial\psi$, and the space $H_p$ is spanned by $\partial/\partial\theta$ and $\partial/\partial\phi - \cos\theta\, \partial/\partial\psi$. (See Sec. 2.5 for details.)

The Misner space is a 2-dimensional space with topology $S^1 \times \mathbb{R}^1$ and with the metric $g$ given by:

$ds^2 = -t^{-1} dt^2 + t\, d\psi^2$    (8.4.37)

where $0 \le \psi \le 2\pi$. There is an obvious similarity between the two spaces, in the sense that the singularity of the metric is given by the coefficient of $dt^2$. Apart from this similarity, there are other common features between them. Both can be extended to two inequivalent, locally inextendible analytic extensions. In both cases these extensions are geodesically incomplete. In the case of Misner space, the metrics of these extensions (denoted $g'$ and $g''$) are given by:

$ds^2 = 2\,d\psi'\,dt + t\,(d\psi')^2, \quad \psi' = \psi - \log t$    (8.4.38)

and

$ds^2 = -2\,d\psi''\,dt + t\,(d\psi'')^2, \quad \psi'' = \psi + \log t$    (8.4.39)

The manifolds $\mathcal{M}'$ and $\mathcal{M}''$ are defined respectively by $\psi'$ and $-\infty < t < \infty$, and by $\psi''$ and $-\infty < t < \infty$. When $t > 0$ both $(\mathcal{M}', g')$ and $(\mathcal{M}'', g'')$ are isometric to the original Misner space $(\mathcal{M}, g)$.

Inextendible analytic extensions of the Taub space are obtained by considering $\mathcal{M}$ as a fiber bundle over $S^2$ with fiber $\mathbb{R}^1 \times S^1$, and the bundle projection $\pi: \mathcal{M} \to S^2$ given by $(t, \psi, \theta, \phi) \to (\theta, \phi)$. By dropping the $\theta$, $\phi$ terms of (8.4.36), the metric can now be written in the form:

$ds^2 = -U^{-1} dt^2 + 4l^2\, U\,(d\psi)^2$    (8.4.40)

The similarity between the metrics given in (8.4.37) and (8.4.40) suggests that the method used in the case of Misner space can be applied here for obtaining the extensions. It should be noted that the above metric (denoted $g_V$) is on the fiber $\mathcal{F} = \mathbb{R}^1 \times S^1$, which is regarded as the $(t, \psi)$ plane. Thus in effect the extensions are those of $(\mathcal{F}, g_V)$; when these are combined with the metric $g_H$ of the 2-sphere as given in Ftn. (34), the analytic extension of $(\mathcal{M}, g)$ is obtained. The metric $g_V$ has singularities at $t = t_\pm$, where $U = 0$. To avoid these singularities, we take the manifold given by $t_- < t < t_+$ and $\psi$, denote it by $\mathcal{F}_0$, and then extend $(\mathcal{F}_0, g_V)$ by defining:

$\psi' = \psi + \dfrac{1}{2l}\displaystyle\int \dfrac{dt}{U(t)}$    (8.4.41)

The metric $g_V$ now becomes $g'_V$:

$ds^2 = 4l\, d\psi'\,(l\,U(t)\, d\psi' - dt)$    (8.4.42)

This metric is analytic on the manifold $\mathcal{F}'$ with topology $S^1 \times \mathbb{R}^1$ defined by $\psi'$ and by $-\infty < t < \infty$. The region $t_- < t < t_+$ of $(\mathcal{F}', g'_V)$ is isometric with $(\mathcal{F}_0, g_V)$. There are no closed timelike curves in the region $t_- < t < t_+$, though there are for $t < t_-$ and $t > t_+$. Another extension is made by defining:

$\psi'' = \psi - \dfrac{1}{2l}\displaystyle\int \dfrac{dt}{U(t)}$    (8.4.43)

The metric $g''_V$ here is:

$ds^2 = 4l\, d\psi''\,(l\,U(t)\, d\psi'' + dt)$    (8.4.44)

which is analytic on the manifold $\mathcal{F}''$ given by $\psi''$ and $-\infty < t < \infty$. As in the previous case, $(\mathcal{F}'', g''_V)$ is isometric to $(\mathcal{F}_0, g_V)$ on the region $t_- < t < t_+$. These inextendible extensions of Taub's space were obtained by Newman, Unti and Tamburino [28], and therefore Taub's space along with its extensions is called the Taub-NUT space.
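The singular values $t_\pm$ of the Taub metric quoted above are the roots of $U(t)$: clearing the denominator in $U = 0$ gives $t^2 - 2mt - l^2 = 0$. A short numerical confirmation (illustrative function names):

```python
import math

def taub_U(t, m, l):
    """U(t) = -1 + 2 (m t + l^2) / (t^2 + l^2), from the Taub metric (8.4.36)."""
    return -1.0 + 2.0 * (m * t + l * l) / (t * t + l * l)

def taub_singular_times(m, l):
    """t_plus and t_minus = m +/- sqrt(m^2 + l^2), where U vanishes."""
    s = math.sqrt(m * m + l * l)
    return m + s, m - s

if __name__ == "__main__":
    m, l = 1.0, 2.0
    t_plus, t_minus = taub_singular_times(m, l)
    print(t_plus, t_minus)
    print(taub_U(t_plus, m, l), taub_U(t_minus, m, l))  # both close to 0
    print(taub_U(0.0, m, l))                            # 1: U > 0 between the roots
```

Since $l > 0$, the two roots always have opposite signs, so $t = 0$ lies in the region $t_- < t < t_+$ where $U > 0$.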

4.2 Causal Structure

Having described the exact solutions (to some extent), we now define some of the 'words' that are used in 'singularity' and 'black hole' theorems. We shall see that these 'words' describe the causal behaviour of spacetime and help in making succinct statements of results on causal structure. Due to our limited scope, we shall only give their definitions (skipping the underlying philosophy of their introduction into the literature) and state the results based on them, asking the reader to see [16c] for details.

4.2.1 Orientability

Definition 8.4.14: A spacetime is said to be time-orientable if it is possible to attach an arrow in one and the same direction at every point, in other words, if it is possible to define continuously a division of non-spacelike vectors into two classes which can be labelled as future- and past-directed.

All results in this as well as in the following section are based on the assumption that spacetime is time-orientable. If the spacetime $(\mathcal{M}, g)$ is not time-orientable, e.g., in the case of de Sitter space with antipodal points identified, it is customary to consider its double covering space $(\tilde{\mathcal{M}}, \tilde{g})$, $\pi: \tilde{\mathcal{M}} \to \mathcal{M}$, which is time-orientable. The projection $\pi$ carries $(p, \alpha) \in \tilde{\mathcal{M}}$ to $p \in \mathcal{M}$, where $\alpha$ denotes one of the two orientations of time at $p$.

Definition 8.4.15: A spacetime $(\mathcal{M}, g)$ is called space-orientable if it is possible to assign the terms 'right-handed' and 'left-handed' in a continuous manner to the bases of three spacelike axes at every point $p$ of $\mathcal{M}$.

According to Geroch, if it is possible to define two-component spinor fields at every point of a spacetime, then the spacetime must be parallelizable, i.e., it must be possible to introduce a continuous system of bases of the tangent space at every point (see Geroch in [13](b)). It is important to note here that if one assumes that spacetime is time-orientable, then in view of experimental evidence and of the CPT theorem, it follows that spacetime must be space-orientable as well (see Chapters 7 and 9 and 7.[21] for the CPT concepts).

4.2.2 Chronological and Causal Future

Definition 8.4.16: Let $\mathcal{S}$ and $\mathcal{U}$ be two arbitrary sets of a time-orientable spacetime $(\mathcal{M}, g)$. The set of all points in $\mathcal{U}$ which can be reached from $\mathcal{S}$ by a future-directed timelike curve in $\mathcal{U}$ is called the chronological future $I^+(\mathcal{S}, \mathcal{U})$ of $\mathcal{S}$ relative to $\mathcal{U}$.³⁵ When $\mathcal{U} = \mathcal{M}$, we denote it simply as $I^+(\mathcal{S})$, and we note that it is an open set, since if $p \in \mathcal{M}$ can be reached by a future-directed timelike curve from $\mathcal{S}$, then there is a small neighbourhood of $p$ which can also be reached from $\mathcal{S}$ by timelike curves.

Definition 8.4.17: The union of $(\mathcal{S} \cap \mathcal{U})$ with the set of all those points in $\mathcal{U}$ which can be reached from $\mathcal{S}$ by a future-directed non-spacelike curve in $\mathcal{U}$ is called the causal future of $\mathcal{S}$ relative to $\mathcal{U}$ and is denoted $J^+(\mathcal{S}, \mathcal{U})$. When $\mathcal{U} = \mathcal{M}$, it is denoted $J^+(\mathcal{S})$.

This definition (like many others that will follow) has a dual in which 'future' is replaced by 'past'; the notations with a + sign are then replaced by those with a − sign.


From Sec. 3 we know that if a non-spacelike curve between two points is not a null geodesic, then it can be deformed into a timelike curve. In view of this, if $\mathcal{U}$ is an open set and $p, q, r \in \mathcal{U}$, then either of the following two statements implies the same result:

(i) $q \in J^+(p, \mathcal{U})$, $r \in I^+(q, \mathcal{U})$;
(ii) $q \in I^+(p, \mathcal{U})$, $r \in J^+(q, \mathcal{U})$;

namely $r \in I^+(p, \mathcal{U})$. From this it follows that the closures and boundaries of the sets $I^+(p, \mathcal{U})$ and $J^+(p, \mathcal{U})$ are equal; i.e.:

$$\overline{I^+(p, \mathcal{U})} = \overline{J^+(p, \mathcal{U})} \qquad (8.4.45)(a)$$

and

$$\dot{I}^+(p, \mathcal{U}) = \dot{J}^+(p, \mathcal{U})\,{}^{36} \qquad (8.4.45)(b)$$

Remark 8.4.18: The region $J^+(\mathcal{S})$ of spacetime is causally affected by events in $\mathcal{S}$. This is not necessarily a closed set even when $\mathcal{S}$ is a single point.
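The definitions of $I^+$ and $J^+$ can be made concrete in Minkowski space, where the chronological future of a point is the open future cone and the causal future is the closed one. The following is only an illustrative sketch: the function names are hypothetical, the signature is taken as $(-,+,+,+)$ with $c = 1$, and exact rational arithmetic is used so that points on the null cone are recognized without rounding problems.

```python
from fractions import Fraction


def interval(p, q):
    """Minkowski interval -dt^2 + dx^2 + dy^2 + dz^2 between events p and q."""
    dt, dx, dy, dz = (Fraction(b) - Fraction(a) for a, b in zip(p, q))
    return -dt * dt + dx * dx + dy * dy + dz * dz


def in_chronological_future(p, q):
    """True if q lies in I+(p): q - p is future-directed and timelike."""
    return q[0] > p[0] and interval(p, q) < 0


def in_causal_future(p, q):
    """True if q lies in J+(p): q = p, or q - p is future-directed and non-spacelike."""
    return tuple(q) == tuple(p) or (q[0] > p[0] and interval(p, q) <= 0)


if __name__ == "__main__":
    origin = (0, 0, 0, 0)
    for event in [(2, 1, 0, 0), (1, 1, 0, 0), (1, 2, 0, 0), origin]:
        print(event, in_chronological_future(origin, event), in_causal_future(origin, event))
```

Events on the null cone of $p$, and $p$ itself, belong to $J^+(p)$ but not to $I^+(p)$, in line with the statement that the two sets share the same closure and boundary.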

4.2.3 Horismos and Achronal Sets

Definition 8.4.19: The difference set $J^+(\mathcal{S}, \mathcal{U}) - I^+(\mathcal{S}, \mathcal{U})$ is called the future horismos of $\mathcal{S}$ relative to $\mathcal{U}$ and is denoted $E^+(\mathcal{S}, \mathcal{U})$. As usual, $E^+(\mathcal{S}, \mathcal{M}) = E^+(\mathcal{S})$. Sometimes the relations $p \in I^+(q)$, $p \in J^+(q)$ and $p \in E^+(q)$ are denoted as $q \ll p$, $q < p$ and $q \to p$ respectively.

Remark 8.4.20: If the set $\mathcal{U}$ in the above definitions is a convex normal neighbourhood about a point $p$, then $E^+(p, \mathcal{U})$ consists of the future-directed null geodesics in $\mathcal{U}$ from $p$ and thus forms the boundary in $\mathcal{U}$ of both $I^+(p, \mathcal{U})$ and $J^+(p, \mathcal{U})$. In Minkowski space the null cone of $p$ forms the boundary of the chronological and causal futures of $p$. The above notion of the boundary, defined with the help of a convex normal neighbourhood $\mathcal{U}$, can be generalized as follows.

Definition 8.4.21: A set $\mathcal{S}$ which satisfies $I^+(\mathcal{S}) \cap \mathcal{S} = \emptyset$ (i.e., there are no two points of $\mathcal{S}$ with timelike separation) is called achronal. If $\mathcal{S} \supset I^+(\mathcal{S})$, then $\mathcal{S}$ is said to be a future set. Note that if $\mathcal{S}$ is a future set, then $\mathcal{M} - \mathcal{S}$ is a past set. Also, for any set $\mathcal{S}$, the sets $I^+(\mathcal{S})$ and $J^+(\mathcal{S})$ are examples of future sets. Examples of achronal sets are given by the fundamental result:

Proposition 8.4.22: If $\mathcal{S}$ is a future set, then $\dot{\mathcal{S}}$, the boundary of $\mathcal{S}$, is a closed, imbedded, achronal three-dimensional $C^{1-}$-submanifold.

A set with the properties of $\dot{\mathcal{S}}$ listed in the above proposition is called an achronal boundary.

A spacetime is said to satisfy the chronology condition if there are no closed timelike curves in it. The set of points where this condition does not hold is called the chronology violating set of $\mathcal{M}$. It can be shown that the chronology violating set of $\mathcal{M}$ is the disjoint union of sets of the form $I^+(p) \cap I^-(p)$, $p \in \mathcal{M}$; and when $\mathcal{M}$ is compact, the chronology violating set of $\mathcal{M}$ is non-empty. Similarly, if there are no closed non-spacelike curves in $\mathcal{M}$, we say that the causality condition holds in $\mathcal{M}$. The set of points where this condition is violated is the disjoint union of sets of the form $J^+(p) \cap J^-(p)$, $p \in \mathcal{M}$.

Further, we say that the strong causality condition holds at $p$ if every neighbourhood $\mathcal{N}$ of $p$ contains a neighbourhood $\mathcal{N}'$ (of $p$) which no non-spacelike curve intersects more than once ($\mathcal{N}'$ can be $\mathcal{N}$). It can be verified that this condition holds in $\mathcal{M}$ if the following four conditions are satisfied there:
(a) for every null vector $K$, $R_{ab} K^a K^b \geq 0$;
(b) every null geodesic contains a point at which $K_{[a} R_{b]cd[e} K_{f]} K^c K^d \neq 0$, where $K$ is the tangent vector of the geodesic;
(c) the chronology condition holds on $\mathcal{M}$;
(d) $\mathcal{M}$ is null geodesically complete.

³⁶ For any set $\mathcal{S}$, the boundary $\dot{\mathcal{S}} = \bar{\mathcal{S}} \cap \overline{(\mathcal{M} - \mathcal{S})}$. (Recall that the boundary of a set $\mathcal{S}$ is usually denoted as $\partial\mathcal{S}$.)

4.2.4 The Concept of Imprisonment

Definition 8.4.23: The future (past) distinguishing condition is said to hold at a point $p \in \mathcal{M}$ if every neighbourhood of $p$ contains a neighbourhood (of $p$) which no future (past) directed non-spacelike curve from $p$ intersects more than once.

An equivalent statement of the above condition is that $I^+(q) = I^+(p)$ ($I^-(q) = I^-(p)$) implies $q = p$. Evidently, if the strong causality condition holds on $\mathcal{M}$, then the past and future distinguishing conditions hold there as well.

The causality conditions described above lead to the phenomenon of imprisonment (a concept required in the study of black holes). We note that a non-spacelike curve $\gamma$ that is future-inextendible behaves in one of three ways as one follows it to the future; it can:

(i) enter a compact set $\mathcal{S}$ and remain there;
(ii) not remain within any compact set but continually re-enter a compact set $\mathcal{S}$; (8.4.46)
(iii) not remain within any compact set and not re-enter any of these sets more than a finite number of times.

In the first and second cases, $\gamma$ is said to be 'totally' and 'partially future imprisoned' in $\mathcal{S}$, respectively. In the third case, $\gamma$ can be viewed as going off to the edge of spacetime, that is, either to infinity or to a singularity.

It is worth noting here that imprisonment can occur even when the causality condition is not violated. An example to that effect is given by Carter (see [16c] and [26]) via Fig. (8.12) below, where imprisonment is shown without causality violation. The manifold here is $R^1 \times S^1 \times S^1$ and the (Lorentz) metric is given by:

$$ds^2 = (\cosh t - 1)^2 (dt^2 - dy^2) + dt\,dy - dz^2.$$

As is evident from Fig. (8.12), it is a space with imprisoned non-spacelike lines but no closed non-spacelike curves. Given below is one of the results that negates imprisonment:

[Fig. 8.12: The manifold $R^1 \times S^1 \times S^1$ covered by coordinates $(t, y, z)$, where $(t, y, z)$ is identified with $(t, y, z + 1)$ as well as with $(t, y + 1, z + \alpha)$, $\alpha$ being an irrational number. (i) A $\{z = \text{const.}\}$ section showing the orientation of the null cones. (ii) The $\{t = 0\}$ section showing part of a null geodesic. Labels in the figure: "Identify"; "Identify after shifting an irrational amount".]


Proposition 8.4.24: If the strong causality condition holds on a compact set $\mathcal{S}$, then there can be no future-inextendible non-spacelike curve which is totally or partially future imprisoned in $\mathcal{S}$.

The proof of the above proposition is immediate, since $\mathcal{S}$ can be covered by a finite number of convex normal neighbourhoods $\mathcal{U}_i$ (with compact closure) such that no non-spacelike curve intersects any $\mathcal{U}_i$ more than once. This implies that any future-inextendible non-spacelike curve which intersects one of these neighbourhoods must leave it and never re-enter.

Recall that we introduced the Cauchy development and Cauchy surfaces earlier, while discussing the conservation theorem and the exact solutions. We use them here to generalize the concept of Cauchy surfaces to global hyperbolicity.

4.2.5 Cauchy Developments

Definition 8.4.25: Given a closed set $\mathcal{S}$, a region $D^+(\mathcal{S})$ ($D^-(\mathcal{S})$) to the future (past) of $\mathcal{S}$ is called the future (past) Cauchy development, or domain of dependence, of $\mathcal{S}$ if the events in the region $D^+(\mathcal{S})$ ($D^-(\mathcal{S})$) can be determined from the knowledge of data on $\mathcal{S}$.

According to the above definition, $D^+(\mathcal{S})$ is the set of all points $p \in \mathcal{M}$ such that every past-inextendible non-spacelike curve through $p$ intersects $\mathcal{S}$. We shall, however, often use the Penrose ([30b], [30c]) definition of Cauchy development, where "non-spacelike curve" is replaced by "timelike curve." This set is denoted (for distinction) as $\tilde{D}^+(\mathcal{S})$.

Definition 8.4.26: The closed achronal set given by $\overline{D^+(\mathcal{S})} - I^-(D^+(\mathcal{S}))$ is known as the future Cauchy horizon of $\mathcal{S}$ and is denoted as $H^+(\mathcal{S})$.

This future boundary of $D^+(\mathcal{S})$ limits the region that can be predicted from the knowledge of data on $\mathcal{S}$. Finally, as a culmination of our discussion of causal structure, we have:

Definition 8.4.27: A set $\mathcal{N}$ is said to be globally hyperbolic if the strong causality condition holds on it and, in addition, for any two points $p, q \in \mathcal{N}$, $J^+(p) \cap J^-(q)$ is compact and is contained in $\mathcal{N}$.

There are many results that can be established involving global hyperbolicity. We state just four of these because of their simplicity and usefulness.

Proposition 8.4.28: An open globally hyperbolic set is always causally simple.

A set $\mathcal{N}$ is said to be causally simple if for every compact set $\mathcal{K}$ contained in $\mathcal{N}$, the sets $J^+(\mathcal{K}) \cap \mathcal{N}$ and $J^-(\mathcal{K}) \cap \mathcal{N}$ are closed in $\mathcal{N}$.

Proposition 8.4.29: Let $\mathcal{S}$ be a closed achronal set such that $J^+(\mathcal{S}) \cap J^-(\mathcal{S})$ is strongly causal and, in addition, is either (i) acausal* or (ii) compact. Then $D(\mathcal{S}) = D^+(\mathcal{S}) \cup D^-(\mathcal{S})$ is globally hyperbolic.

An easy corollary to the above proposition is:

Corollary: If $\mathcal{S}$ is a Cauchy surface in the above proposition, then $\mathcal{M}$ must be globally hyperbolic.

Proposition 8.4.30: Let $p$ and $q$ lie in a globally hyperbolic set $\mathcal{N}$ with $q \in J^+(p)$. Then there always exists a non-spacelike geodesic from $p$ to $q$ whose length is greater than or equal to that of any other non-spacelike curve from $p$ to $q$.

Our last result deals with asymptotically flat spaces, i.e., spaces whose metrics approach the Minkowski-space metric at large distances from the system (see Chapter 34 in [26]). The Schwarzschild, Reissner-Nordstrom and Kerr solutions are examples of spaces that have asymptotically flat regions. Spaces of this kind are needed in investigating bounded physical systems such as stars.

* $J^+(\mathcal{S}) \cap J^-(\mathcal{S})$ would be acausal if and only if $\mathcal{S}$ is acausal.


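Definition 8.4.31: A time- and space-orientable space $(\mathcal{M}, g)$ is said to be asymptotically simple if there exists a strongly causal space $(\tilde{\mathcal{M}}, \tilde{g})$ and an imbedding $\theta: \mathcal{M} \to \tilde{\mathcal{M}}$ which imbeds $\mathcal{M}$ as a manifold with smooth boundary $\partial\mathcal{M}$ in $\tilde{\mathcal{M}}$, such that:
(i) there is a smooth function $\Omega$ on $\tilde{\mathcal{M}}$ which is positive on $\theta(\mathcal{M})$ and satisfies $\Omega^2 g = \theta^*(\tilde{g})$ (i.e., $\tilde{g}$ is conformal to $g$ on $\theta(\mathcal{M})$);
(ii) on $\partial\mathcal{M}$, $\Omega = 0$ and $d\Omega \neq 0$;
(iii) every null geodesic in $\mathcal{M}$ has two endpoints on $\partial\mathcal{M}$.

A space $(\mathcal{M}, g)$ is asymptotically empty and simple if it satisfies the above three conditions and also the condition:
(iv) $R_{ab} = 0$ on an open neighbourhood of $\partial\mathcal{M}$ in $\bar{\mathcal{M}} = \mathcal{M} \cup \partial\mathcal{M}$.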
Exercise 8.4

1. Show that when spacetime is spherically symmetric, the line element can be written as:

(a) $$ds^2 = g_{00}\,dt^2 + g_{0r}\,dr\,dt + g_{rr}\,dr^2 + g_{\theta\theta}(d\theta^2 + \sin^2\theta\,d\psi^2)$$

or alternatively in the canonical form:

(b) $$ds^2 = -e^{\nu(r,t)}\,dt^2 + e^{\lambda(r,t)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\psi^2).$$

Show further that when $t$ and $r$ are constants, the above line element gives the familiar Gaussian curvature of a two-sphere of radius $\sqrt{g_{\theta\theta}}$ in $R^3$.

2. Obtain Einstein's equations for the spherically symmetric spacetime using the metric in canonical form (b) of Exc. 1. Show further that "if this spacetime has a region which is asymptotically flat and empty, then the metric in this region is time independent and hence independent of the dynamical properties of its source" (Birkhoff's Theorem).

3. Use the coordinate transformations of (8.4.15)(a) to write the metric (8.4.15)(b).

4. Write the coordinate transformations that are required between the coordinates $(u, v, x, y, z)$, $(t, \chi, \theta, \phi)$ and $(t', r, \theta, \phi)$ for writing the metrics of the anti-de Sitter space as a hyperboloid in the forms (8.4.19) and (8.4.20).

5. Using Exercises 1 and 2, establish the Schwarzschild solution (8.4.26).

6. Establish the equations given in (8.4.24) and (8.4.25).

7. Use the Kerr-Schild coordinates $(x, y, z, \bar{t})$ to show that the singularity $r = 0$ in the case of Kerr solutions is a ring.

Hints to Exercise 8.4

1. Spherical symmetry in spacetime implies that for every rotation $R \in SO(3)$ (see Chapters 2 and 3), there is an isometry $\Psi(R)$ of spacetime. Recall that in 3-dimensional space $R^3$ with coordinates $(x, y, z)$, $R$ maps $(x, y, z) \mapsto (Rx, Ry, Rz)$, and this gives three distinct one-parameter families of rotations whose Killing vectors are:

(i) $$\xi_1 = z\,\partial_y - y\,\partial_z, \qquad \xi_2 = x\,\partial_z - z\,\partial_x, \qquad \xi_3 = y\,\partial_x - x\,\partial_y.$$

In terms of polar coordinates $(r \sin\theta \cos\psi,\ r \sin\theta \sin\psi,\ r \cos\theta)$ they become:

(ii) $$\xi_1 = \sin\psi\,\partial_\theta + \cot\theta \cos\psi\,\partial_\psi, \qquad \xi_2 = -\cos\psi\,\partial_\theta + \cot\theta \sin\psi\,\partial_\psi, \qquad \xi_3 = -\partial_\psi.$$

In the case of spacetime, when the action of the rotation group results in spherical orbits, we use coordinates $(t, r, \theta, \psi)$, regarding $t$ as constant. Now, because of the isometry, the Lie derivative of the metric tensor $(g_{ij})$ ($i, j = 0, 1, 2, 3$)³⁷ with respect to the Killing vector $\xi_a$ ($a = 1, 2, 3$) is zero, hence we have:

(iii) $$(\mathcal{L}_{\xi_a} g)_{ij} = \xi_a^k\,\partial_k g_{ij} + g_{ik}\,\partial_j \xi_a^k + g_{kj}\,\partial_i \xi_a^k = 0.$$

Simplification of the above equation, after using the values of $\xi_a$ given in (ii), shows that:

(iv) $$\partial_\psi g_{ij} = 0; \quad g_{r\theta} = g_{r\psi} = g_{\theta\psi} = g_{0\theta} = g_{0\psi} = 0; \quad \partial_\theta g_{rr} = \partial_\theta g_{\theta\theta} = \partial_\theta g_{0r} = \partial_\theta g_{00} = 0; \quad g_{\psi\psi} = g_{\theta\theta} \sin^2\theta.$$

This means that the only non-zero components of $g_{ij}$ are $g_{00}$, $g_{rr}$, $g_{\theta\theta}$, $g_{\psi\psi}$, $g_{0r}$, and that they are all functions of $r$ and $t$ only; the line element can thus be written as given in (a). We further choose another set of coordinates $(t', r', \theta', \psi')$ such that $t' = t'(r, t)$, $r' = r'(r, t)$, $\theta' = \theta$, $\psi' = \psi$ and the term containing $dr'\,dt'$ is zero; with this choice we have:

(v) $$ds^2 = g'_{00}\,dt'^2 + g'_{r'r'}\,dr'^2 + g'_{\theta'\theta'}(d\theta'^2 + \sin^2\theta'\,d\psi'^2)$$

where each $g'$ is a function of $(r', t')$. From the last equalities in (iv) and (ii), and from the fact that the $\xi_a$ are spacelike, it follows that $g'_{\theta\theta}(r', t') = g_{\theta\theta}(r, t) > 0$, hence there is no loss of generality if we write:

(vi) $$r' = \sqrt{g_{\theta\theta}(r, t)}.$$

We now write $g'_{00} = -e^{\nu(r', t')}$ and $g'_{r'r'} = e^{\lambda(r', t')}$ and drop the prime; then (v) becomes:

(vii) $$ds^2 = -e^{\nu(r,t)}\,dt^2 + e^{\lambda(r,t)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\psi^2).$$

The last statement of the exercise is evident from (vi) and (vii).

³⁷ The correspondence between subscripts 0, 1, 2, 3 and $t, r, \theta, \psi$ is: $0 \to t$, $1 \to r$, $2 \to \theta$, $3 \to \psi$.

2. From the above exercise we know that there are only two metric coefficients, namely $e^{\nu(r,t)}$ and $e^{\lambda(r,t)}$, that need to be determined from Einstein's equations (8.1.17) to obtain the solution. (The solution is called the 'external Schwarzschild solution.') However, our objective here is simply to write the Einstein equations. We assume that the source of the spacetime is a bounded matter-energy distribution with energy-momentum tensor $T_{ij}$. This is spherically symmetric, hence:

(i) $$[\mathcal{L}_{\xi_a} T]_{ij} = 0.$$

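Using the arguments of Exc. 1, we obtain that the only non-vanishing components of $T_{ij}$ are $T_{00}$, $T_{rr}$, $T_{\theta\theta}$, $T_{\psi\psi}$ and $T_{0r}$, and they are all functions of $(r, t)$. We also note that the non-zero connection coefficients for the metric given in (vii) of Exc. 1 are:

(ii) $$\Gamma^0_{00} = \frac{\dot\nu}{2}, \quad \Gamma^0_{0r} = \frac{\nu'}{2}, \quad \Gamma^0_{rr} = \frac{\dot\lambda}{2}\,e^{\lambda - \nu}, \quad \Gamma^r_{00} = \frac{\nu'}{2}\,e^{\nu - \lambda}, \quad \Gamma^r_{0r} = \frac{\dot\lambda}{2}, \quad \Gamma^r_{rr} = \frac{\lambda'}{2},$$
$$\Gamma^\theta_{r\theta} = \Gamma^\psi_{r\psi} = \frac{1}{r}, \quad \Gamma^\psi_{\theta\psi} = \cot\theta, \quad \Gamma^\theta_{\psi\psi} = -\sin\theta \cos\theta, \quad \Gamma^r_{\theta\theta} = -r\,e^{-\lambda}, \quad \Gamma^r_{\psi\psi} = -r \sin^2\theta\,e^{-\lambda},$$

where the prime and the dot stand for $\partial_r$ and $\partial_t$ respectively. The Einstein equations can now be written as:

(iii)
(a) $$e^{-\lambda}\left(\frac{\nu'}{r} + \frac{1}{r^2}\right) - \frac{1}{r^2} = k\,T^r_r(r, t)$$
(b) $$e^{-\lambda}\left(\frac{1}{r^2} - \frac{\lambda'}{r}\right) - \frac{1}{r^2} = k\,T^0_0(r, t)$$
(c) $$e^{-\lambda}\,\frac{\dot\lambda}{r} = k\,T^r_0(r, t)$$
(d) $$\frac{1}{2}e^{-\lambda}\left(\nu'' + \frac{\nu'^2}{2} + \frac{\nu' - \lambda'}{r} - \frac{\nu'\lambda'}{2}\right) - \frac{1}{2}e^{-\nu}\left(\ddot\lambda + \frac{\dot\lambda^2}{2} - \frac{\dot\lambda\dot\nu}{2}\right) = k\,T^\theta_\theta = k\,T^\psi_\psi$$

where $k = 8\pi G/c^4$ is Einstein's coupling constant, with $G$ being the Newtonian gravitational constant and $c$ the velocity of light.

To establish the second part of the exercise, which in fact is the well-known Birkhoff's theorem for spherically symmetric, asymptotically flat empty spacetime (see [9] and Appendix B in [16c]), we require that the metric of (vii) in Exc. 1 becomes the Minkowskian metric (in polar coordinates) in some region of spacetime. For this we impose the condition:

(iv) $$\lim_{r \to \infty} \nu(r, t) = \lim_{r \to \infty} \lambda(r, t) = 0.$$

Suppose now that $r_0$ is the smallest value of the radial coordinate beyond which $T_{ij}$ vanishes, i.e.,

(v) $$T_{ij} = 0 \quad \text{for} \quad r > r_0.$$

(Note that such an $r_0$ exists since we have assumed that the matter is bounded.) From (c) of (iii) above, it follows that when $r > r_0$

(vi) $$\dot\lambda = 0.$$

This means that outside the matter source $\lambda(r, t)|_{r > r_0} = \lambda(r)$ is independent of time. Subtracting (b) from (a) we have

(vii) $$\nu' + \lambda' = k\,r\,e^{\lambda}\,(T^r_r - T^0_0).$$

Integrating the above equality over $r$ from $r > r_0$ to $\infty$ and using (iv), (v) and (vi) we obtain:

(viii) $$\nu(r, t) = -\lambda(r), \qquad r > r_0.$$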

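This shows that outside the matter distribution ($T_{ij} = 0$), $\nu$ and $\lambda$ are time independent, and thus the whole metric in this region is time-independent; hence it is independent of the dynamical properties of the source.

3. From (8.4.15)(a) we have Eq. (i) below, where $A$, $B$, $C$, $D$ label the $dt$-terms of $dw$, $dx$, $dy$, $dz$ respectively:

(i)
$$dv = \cosh(\alpha^{-1}t)\,dt$$
$$dw = \underbrace{\sinh(\alpha^{-1}t)\cos\chi\,dt}_{A} - \alpha\cosh(\alpha^{-1}t)\sin\chi\,d\chi$$
$$dx = \underbrace{\sinh(\alpha^{-1}t)\sin\chi\cos\theta\,dt}_{B} + \alpha\cosh(\alpha^{-1}t)\cos\chi\cos\theta\,d\chi - \alpha\cosh(\alpha^{-1}t)\sin\chi\sin\theta\,d\theta$$
$$dy = \underbrace{\sinh(\alpha^{-1}t)\sin\chi\sin\theta\cos\phi\,dt}_{C} + \alpha\cosh(\alpha^{-1}t)\cos\chi\sin\theta\cos\phi\,d\chi + \alpha\cosh(\alpha^{-1}t)\sin\chi\cos\theta\cos\phi\,d\theta - \alpha\cosh(\alpha^{-1}t)\sin\chi\sin\theta\sin\phi\,d\phi$$
$$dz = \underbrace{\sinh(\alpha^{-1}t)\sin\chi\sin\theta\sin\phi\,dt}_{D} + \alpha\cosh(\alpha^{-1}t)\cos\chi\sin\theta\sin\phi\,d\chi + \alpha\cosh(\alpha^{-1}t)\sin\chi\cos\theta\sin\phi\,d\theta + \alpha\cosh(\alpha^{-1}t)\sin\chi\sin\theta\cos\phi\,d\phi$$

(ii) $$-dv^2 + dw^2 + dx^2 + dy^2 + dz^2 = \{-\cosh^2(\alpha^{-1}t)\,dt^2 + A^2 + B^2 + C^2 + D^2\} + \text{terms without } dt^2.$$

We note that the terms with mixed differentials contained in the second term on the RHS of (ii) cancel out; thus we have to collect only the coefficients of $dt^2$, $d\chi^2$, $d\theta^2$, $d\phi^2$. This gives:

(iii) $$\text{1st RHS term} = dt^2\left[-\cosh^2(\alpha^{-1}t) + \sinh^2(\alpha^{-1}t)\{\cos^2\chi + \sin^2\chi\,(\cos^2\theta + \sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi)\}\right] = -dt^2$$

(iv) $$\text{2nd RHS term} = d\chi^2\left[\alpha^2\cosh^2(\alpha^{-1}t)\{\sin^2\chi + \cos^2\chi\,(\cos^2\theta + \sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi)\}\right] + \dots = \alpha^2\cosh^2(\alpha^{-1}t)\,d\chi^2 + \text{two terms containing } d\theta^2 \text{ and } d\phi^2.$$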
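The bookkeeping in Hint 3 can be cross-checked numerically. The sketch below assumes that the embedding behind the differentials in (i) is $v = \alpha\sinh(t/\alpha)$, $w = \alpha\cosh(t/\alpha)\cos\chi$, $x = \alpha\cosh(t/\alpha)\sin\chi\cos\theta$, $y = \alpha\cosh(t/\alpha)\sin\chi\sin\theta\cos\phi$, $z = \alpha\cosh(t/\alpha)\sin\chi\sin\theta\sin\phi$, and that the target metric is $ds^2 = -dt^2 + \alpha^2\cosh^2(t/\alpha)\{d\chi^2 + \sin^2\chi(d\theta^2 + \sin^2\theta\,d\phi^2)\}$. The function names, the value of $\alpha$ and the sample point are hypothetical choices made only for the check.

```python
import math

ALPHA = 2.0  # hypothetical value of the radius alpha, used only for this check


def embed(t, chi, theta, phi, alpha=ALPHA):
    """Map intrinsic coordinates to (v, w, x, y, z) in the flat 5-dimensional space."""
    ch, sh = math.cosh(t / alpha), math.sinh(t / alpha)
    return (
        alpha * sh,
        alpha * ch * math.cos(chi),
        alpha * ch * math.sin(chi) * math.cos(theta),
        alpha * ch * math.sin(chi) * math.sin(theta) * math.cos(phi),
        alpha * ch * math.sin(chi) * math.sin(theta) * math.sin(phi),
    )


def flat_dot(e, f):
    """Flat inner product with signature (-, +, +, +, +)."""
    return -e[0] * f[0] + sum(p * q for p, q in zip(e[1:], f[1:]))


def induced_metric(coords, h=1e-5):
    """Pull back the flat metric using central-difference tangent vectors."""
    tangents = []
    for i in range(4):
        up, down = list(coords), list(coords)
        up[i] += h
        down[i] -= h
        tangents.append([(a - b) / (2 * h) for a, b in zip(embed(*up), embed(*down))])
    return [[flat_dot(tangents[i], tangents[j]) for j in range(4)] for i in range(4)]


def expected_metric(coords, alpha=ALPHA):
    """Diagonal metric -dt^2 + alpha^2 cosh^2(t/alpha) (3-sphere metric)."""
    t, chi, theta, _phi = coords
    scale = (alpha * math.cosh(t / alpha)) ** 2
    diag = [-1.0, scale, scale * math.sin(chi) ** 2,
            scale * (math.sin(chi) * math.sin(theta)) ** 2]
    return [[diag[i] if i == j else 0.0 for j in range(4)] for i in range(4)]


point = (0.3, 0.7, 1.1, 0.4)  # arbitrary sample point
g_num = induced_metric(point)
g_exp = expected_metric(point)
max_err = max(abs(g_num[i][j] - g_exp[i][j]) for i in range(4) for j in range(4))
print("hyperboloid constraint:", flat_dot(embed(*point), embed(*point)))
print("largest metric discrepancy:", max_err)
```

The off-diagonal entries of the numerical metric come out negligibly small, which is the numerical counterpart of the statement that the mixed-differential terms cancel.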

4. Write:

$$u = \sin t, \quad v = \cos t \cosh\chi, \quad x = \cos t \sinh\chi \cos\theta, \quad y = \cos t \sinh\chi \sin\theta \cos\phi, \quad z = \cos t \sinh\chi \sin\theta \sin\phi.$$

It can be easily checked that these coordinate transformations satisfy:

$$-u^2 - v^2 + x^2 + y^2 + z^2 = -\sin^2 t - \cos^2 t \cosh^2\chi + \cos^2 t \sinh^2\chi \cos^2\theta + \cos^2 t \sinh^2\chi \sin^2\theta \cos^2\phi + \cos^2 t \sinh^2\chi \sin^2\theta \sin^2\phi$$
$$= -\sin^2 t - \cos^2 t\left[\cosh^2\chi - \sinh^2\chi\{\cos^2\theta + \sin^2\theta(\cos^2\phi + \sin^2\phi)\}\right] = -1.$$

The differentials are:

(i)
$$du = \cos t\,dt$$
$$dv = -\sin t \cosh\chi\,dt + \cos t \sinh\chi\,d\chi$$
$$dx = -\sin t \sinh\chi \cos\theta\,dt + \cos t \cosh\chi \cos\theta\,d\chi - \cos t \sinh\chi \sin\theta\,d\theta$$
$$dy = -\sin t \sinh\chi \sin\theta \cos\phi\,dt + \cos t \cosh\chi \sin\theta \cos\phi\,d\chi + \cos t \sinh\chi \cos\theta \cos\phi\,d\theta - \cos t \sinh\chi \sin\theta \sin\phi\,d\phi$$
$$dz = -\sin t \sinh\chi \sin\theta \sin\phi\,dt + \cos t \cosh\chi \sin\theta \sin\phi\,d\chi + \cos t \sinh\chi \cos\theta \sin\phi\,d\theta + \cos t \sinh\chi \sin\theta \cos\phi\,d\phi.$$

We collect the coefficients of $(dt)^2$, etc., and note that the coefficients of product differentials cancel out; this gives:

(ii)
$$-du^2 - dv^2 + dx^2 + dy^2 + dz^2 = (dt)^2\left[-\cos^2 t - \sin^2 t \cosh^2\chi + \sin^2 t \sinh^2\chi \cos^2\theta + \sin^2 t \sinh^2\chi \sin^2\theta(\cos^2\phi + \sin^2\phi)\right]$$
$$+\ d\chi^2\left[-\cos^2 t \sinh^2\chi + \cos^2 t \cosh^2\chi \cos^2\theta + \cos^2 t \cosh^2\chi \sin^2\theta \cos^2\phi + \cos^2 t \cosh^2\chi \sin^2\theta \sin^2\phi\right]$$
$$+\ d\theta^2\left[\cos^2 t \sinh^2\chi \sin^2\theta + \cos^2 t \sinh^2\chi \cos^2\theta(\cos^2\phi + \sin^2\phi)\right] + d\phi^2\left[\cos^2 t \sinh^2\chi \sin^2\theta(\cos^2\phi + \sin^2\phi)\right]$$
$$= dt^2\{-\cos^2 t - \sin^2 t(\cosh^2\chi - \sinh^2\chi)\} + \cos^2 t\,d\chi^2\left[-\sinh^2\chi + \cosh^2\chi\{\cos^2\theta + \sin^2\theta(\cos^2\phi + \sin^2\phi)\}\right]$$
$$+\ \cos^2 t\,d\theta^2\left[\sinh^2\chi(\sin^2\theta + \cos^2\theta)\right] + \cos^2 t \sinh^2\chi \sin^2\theta\,d\phi^2$$
$$= -dt^2 + \cos^2 t\left[d\chi^2 + \sinh^2\chi(d\theta^2 + \sin^2\theta\,d\phi^2)\right].$$

For the second form of the metric, write:

(iii) $$u = \cosh r \cos t', \quad v = \cosh r \sin t', \quad x = \sinh r \cos\theta, \quad y = \sinh r \sin\theta \cos\phi, \quad z = \sinh r \sin\theta \sin\phi.$$

This gives:

(iv) $$-u^2 - v^2 + x^2 + y^2 + z^2 = -\cosh^2 r \cos^2 t' - \cosh^2 r \sin^2 t' + \sinh^2 r\{\cos^2\theta + \sin^2\theta(\cos^2\phi + \sin^2\phi)\} = -\cosh^2 r + \sinh^2 r = -1.$$

The differentials in this case are:

(v)
$$du = -\sin t' \cosh r\,dt' + \sinh r \cos t'\,dr$$
$$dv = \cos t' \cosh r\,dt' + \sinh r \sin t'\,dr$$
$$dx = \cosh r \cos\theta\,dr - \sinh r \sin\theta\,d\theta$$
$$dy = \cosh r \sin\theta \cos\phi\,dr + \sinh r \cos\theta \cos\phi\,d\theta - \sinh r \sin\theta \sin\phi\,d\phi$$
$$dz = \cosh r \sin\theta \sin\phi\,dr + \sinh r \cos\theta \sin\phi\,d\theta + \sinh r \sin\theta \cos\phi\,d\phi.$$

Hence we have:

(vi)
$$ds^2 = -du^2 - dv^2 + dx^2 + dy^2 + dz^2 = -\cosh^2 r\,(dt')^2 + dr^2\left[-\sinh^2 r + \cosh^2 r\{\cos^2\theta + \sin^2\theta(\cos^2\phi + \sin^2\phi)\}\right]$$
$$+\ \sinh^2 r\,d\theta^2\left[\sin^2\theta + \cos^2\theta(\cos^2\phi + \sin^2\phi)\right] + \sinh^2 r\,d\phi^2\left[\sin^2\theta(\sin^2\phi + \cos^2\phi)\right]$$
$$= -\cosh^2 r\,(dt')^2 + dr^2 + \sinh^2 r\,(d\theta^2 + \sin^2\theta\,d\phi^2).$$
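A small numerical sketch can illustrate how the two charts of Hint 4 sit on the hyperboloid $-u^2 - v^2 + x^2 + y^2 + z^2 = -1$. The function names below are hypothetical labels for the two coordinate systems $(t, \chi, \theta, \phi)$ and $(t', r, \theta, \phi)$, and the sample points are arbitrary. The point of the check is that the first chart always has $|u| = |\sin t| \le 1$, whereas the second chart reaches points with $|u| > 1$.

```python
import math


def chart_t_chi(t, chi, theta, phi):
    """Embedding (u, v, x, y, z) for the (t, chi, theta, phi) coordinates."""
    ct = math.cos(t)
    return (
        math.sin(t),
        ct * math.cosh(chi),
        ct * math.sinh(chi) * math.cos(theta),
        ct * math.sinh(chi) * math.sin(theta) * math.cos(phi),
        ct * math.sinh(chi) * math.sin(theta) * math.sin(phi),
    )


def chart_static(t_prime, r, theta, phi):
    """Embedding (u, v, x, y, z) for the (t', r, theta, phi) coordinates."""
    return (
        math.cosh(r) * math.cos(t_prime),
        math.cosh(r) * math.sin(t_prime),
        math.sinh(r) * math.cos(theta),
        math.sinh(r) * math.sin(theta) * math.cos(phi),
        math.sinh(r) * math.sin(theta) * math.sin(phi),
    )


def quadric(point):
    """Left-hand side -u^2 - v^2 + x^2 + y^2 + z^2 of the hyperboloid equation."""
    u, v, x, y, z = point
    return -u * u - v * v + x * x + y * y + z * z


first = chart_t_chi(0.4, 1.3, 0.9, 2.0)
second = chart_static(0.0, 2.0, 0.5, 0.3)
print(quadric(first), quadric(second))  # both should be close to -1
print(abs(first[0]) <= 1.0, abs(second[0]) > 1.0)
```

The second sample point has $u = \cosh 2 > 1$, so it lies on the hyperboloid but cannot be written as $u = \sin t$ for any real $t$.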

(Evidently the coefficients of product terms cancel out.) While the first coordinate system $(t, \chi, \theta, \phi)$ covers only part of the space, and has apparent singularities at $t = \pm\frac{\pi}{2}$, the coordinates $(t', r, \theta, \phi)$ cover the whole space.

5. We begin with the differential equation established in (vii) of Exc. 2:

(i) $$\nu' = -\lambda' + k\,r\,e^{\lambda}\,(T^r_r - T^0_0)$$

and note that if the matter is distributed regularly down to $r = 0$, the center of symmetry, then we can integrate this equation from $0$ to $r$ and obtain:

(ii) $$\nu(r, t) + \lambda(r, t) = \nu(0, t) + \lambda(0, t) + k\int_0^r y\,e^{\lambda(y, t)}\,(T^r_r - T^0_0)\,dy.$$

In view of the fact that $\nu(r, t) = -\lambda(r)$ for $r > r_0$, we note that the LHS of (ii) is zero at $r = r_0$, hence we have:

(iii) $$\nu(0, t) + \lambda(0, t) = -k\int_0^{r_0} y\,e^{\lambda(y, t)}\,(T^r_r - T^0_0)\,dy.$$

(We assume here that boundary conditions at $r = 0$ are derived using the 'local regularity requirement'. This requirement is as follows: the loop (enclosing $r = 0$) defined by $r = \varepsilon$ ($\varepsilon$ being small), $t = \text{const.}$, $\theta = \frac{\pi}{2}$ and parametrized by $\psi$, having the metric length $2\pi\varepsilon$ that tends to zero as $\varepsilon \to 0$, is in a regular region of spacetime, and as such the Lorentz transformation given by parallel propagated basis vectors around the loop approaches the identity.) We further need that:

(iv) $$\lambda(r) \to 0 \quad \text{as} \quad r \to \infty.$$

(This follows from Eq. (iv) of Exc. 2.) Then from (ii), together with the regularity of $T$, we have:

(v) $$\nu(r) \to 0 \quad \text{as} \quad r \to \infty.$$

We can now solve Eq. (iii)(b) of Exc. 2 in terms of $\lambda$; since this equation can also be written as:

(vi) $$(e^{-\lambda})' + \frac{1}{r}\,e^{-\lambda} - \frac{1}{r} - k\,r\,T^0_0 = 0,$$

using (iv) and (v), we have a general solution for $\lambda(r)$ at $r > r_0$:

(vii) $$e^{-\lambda(r)} = 1 + \frac{k}{r}\int_0^{r_0} y^2\,T^0_0\,dy, \qquad r > r_0.$$

We put

(viii) $$M' = -\frac{4\pi}{c^2}\int_0^{r_0} y^2\,T^0_0\,dy$$

and note that $M'$ is a constant, since it does not depend on $r$ or on $t$. Writing the value of $k = \frac{8\pi G}{c^4}$, we have:

(ix) $$e^{-\lambda(r)} = 1 - \frac{2M'G}{c^2 r}.$$

When we substitute this value of $e^{-\lambda(r)}$ in the line element (vii) of Exc. 1 (after taking into account that $\nu(r, t) = -\lambda(r)$ for $r > r_0$), we obtain:

(x) $$ds^2 = -\left(1 - \frac{2M'G}{c^2 r}\right)dt^2 + \left(1 - \frac{2M'G}{c^2 r}\right)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\psi^2).$$

This is the Schwarzschild solution (8.4.26) in the region $r > r_0$, where $\frac{G}{c^2}M' = M$.

6. Use Exc. (2.7) to establish these equations.

7. The Kerr-Schild coordinates $(x, y, z, \bar{t})$ are given by the following relations:

(i) $$x + iy = (r + ia)\sin\theta\,\exp\left(i\int(d\phi + a\Delta^{-1}dr)\right), \qquad z = r\cos\theta,$$

(ii) $$\bar{t} = \int\left(dt + (r^2 + a^2)\Delta^{-1}dr\right) - r$$

(see Subsec. 4.1.5 for $\Delta$ and $a$). The Kerr metric in these coordinates takes the form:

(iii) $$ds^2 = dx^2 + dy^2 + dz^2 - d\bar{t}^2 + \frac{2mr^3}{r^4 + a^2z^2}\left(\frac{r(x\,dx + y\,dy) - a(x\,dy - y\,dx)}{r^2 + a^2} + \frac{z\,dz}{r} + d\bar{t}\right)^2.$$

The $r$ in (iii) is determined (up to a sign) in terms of $x, y, z$ by the equation:

(iv) $$r^4 - (x^2 + y^2 + z^2 - a^2)\,r^2 - a^2z^2 = 0.$$

The surfaces $r = \text{const.} \neq 0$ are confocal ellipsoids in the coordinates $(x, y, z)$; for $r = 0$ these degenerate to the disc $x^2 + y^2 \leq a^2$, $z = 0$, and the ring $x^2 + y^2 = a^2$, $z = 0$, which is the boundary of this disc, is the 'ring of singularity' we were looking for. We note that this is a real curvature singularity, as the scalar polynomial $R_{abcd}R^{abcd}$ diverges there; we further note that no scalar polynomial diverges on the disc except at the boundary.
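The degeneration of the surfaces $r = \text{const.}$ to a disc can be illustrated by solving the defining quartic $r^4 - (x^2 + y^2 + z^2 - a^2)r^2 - a^2z^2 = 0$ as a quadratic in $r^2$. The sketch below is a hypothetical helper, not part of the text's argument: it returns the non-negative root for $r^2$, which vanishes on the disc $x^2 + y^2 \le a^2$, $z = 0$ and is positive elsewhere.

```python
import math


def kerr_schild_r_squared(x, y, z, a):
    """Non-negative root r^2 of r^4 - (x^2 + y^2 + z^2 - a^2) r^2 - a^2 z^2 = 0."""
    s = x * x + y * y + z * z - a * a
    # The two roots for r^2 have product -a^2 z^2 <= 0, so exactly one is >= 0.
    root = 0.5 * (s + math.sqrt(s * s + 4.0 * a * a * z * z))
    return max(0.0, root)  # guard against tiny negative rounding errors


a = 1.0
print(kerr_schild_r_squared(0.5, 0.0, 0.0, a))  # inside the disc
print(kerr_schild_r_squared(1.0, 0.0, 0.0, a))  # on the ring
print(kerr_schild_r_squared(2.0, 0.0, 0.0, a))  # equatorial plane, outside the ring
print(kerr_schild_r_squared(0.0, 0.0, 1.0, a))  # on the symmetry axis
```

For a point built from $x^2 + y^2 = (r^2 + a^2)\sin^2\theta$, $z = r\cos\theta$ the helper returns the original $r^2$, which is the confocal-ellipsoid statement in numerical form.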

5 THE BASICS OF SPACETIME SINGULARITIES AND BLACK HOLES

In this section we introduce the reader to two great triumphs of Einstein's theory of general relativity: singularities and black holes. Our treatment here is sketchy, as it is meant more for a beginner in the area. Some of the excellent sources for detailed study and research work on the subject are: [4c], [16c,d,e], [26], [30a], [40]. We begin with the layman's version of these two terms and then go on to the scientific meaning and the role they acquired in physics. The words "singular" and "hole" in common vocabulary stand respectively for "different from the ordinary" and "the part of an object it can be seen through" (a hole in the roof or in a table-top). A mathematician, on the other hand, would define the first word as: that which is "not non-singular" is "singular," e.g., a singular matrix or a singular solution, and the second as: a "geometric discontinuity." Obviously it is very often difficult to ensure the extraordinariness (singular character) of an object, and it is sometimes impossible to peep through a hole (for lack of light). When these problems are dealt with in the realm of macrophysics, they are referred to as "singularities" and "black holes."

5.1 Singularities and Completeness in Spacetime

We recall that a singularity is a point in spacetime at which the spacetime curvature becomes infinite. We also know that there are models of spacetime, e.g., Robertson-Walker, Schwarzschild, Reissner-Nordstrom and Kerr, where a singularity occurs. Since these models of spacetime are based on group-theoretic assumptions and a choice of coordinates, a natural question that one asks is: does there exist a singularity in the universe which is not a peculiarity of a model but is the result of the breakdown of physical laws? A question of this nature is closely related to the theory of the "big bang" and the "expanding universe" [19a,b]. Based on the fact that galaxies are moving away from each other as well as from our solar system, and thus the universe is expanding, the big bang theory predicts that "the universe had a beginning," and if the history of time were to be written, that could be the point of origin. Models such as that of Friedmann (see Subsec. 4.1.4) supported this idea, although in the absence of a rigorous proof the idea was not fully accepted. It was Penrose who used the theory of general relativity to give a mathematical proof of the existence of a singularity related to the big bang theory. We describe in brief the concept of singularity introduced by Penrose and later expanded by Hawking (see [30a] and [16b] for the original papers).

To begin with, we define the word 'b-completeness' (short for bundle-completeness), which is required in formulating the existence of a singularity in Einstein's universe. The b-completeness is a generalization of our familiar geodesic completeness (of Riemannian manifolds), as we shall see below.

Consider a manifold $\mathcal{M}$ with positive definite metric $g$ and let $\rho(x, y)$ be the distance function between two points $x, y \in \mathcal{M}$. Recall that $\rho(x, y)$ is the greatest lower bound of the lengths of curves from $x$ to $y$ and works as a metric in the topological sense, for it provides a basis $\{\mathcal{B}(x, r)\}$ for the open sets of $\mathcal{M}$. The basis $\{\mathcal{B}(x, r)\}$ is formed by using all points $y \in \mathcal{M}$ for which $\rho(x, y) < r$ (see Chap. 0 for the definition of the metric). The pair $(\mathcal{M}, g)$ is said to be metrically complete (m-complete) if every Cauchy sequence with respect to $\rho$ converges to a point in $\mathcal{M}$. Alternatively, we say that $(\mathcal{M}, g)$ is m-complete if every $C^1$-curve of finite length has an endpoint (as given in Sec. 4); hence m-completeness implies geodesic completeness (g-completeness) (see [22] for proof), which means that every geodesic can be extended to arbitrary values of its affine parameter. When $g$ stands for the Lorentz metric, there is no m-completeness, and g-completeness also has to be subdivided into three distinct categories depending on the nature of the curve (timelike, spacelike or null); thus we cannot say here that $(\mathcal{M}, g)$ is g-complete on the basis of geodesic completeness of one of these.


If $\mathcal{M}$ is timelike geodesically incomplete, there could be freely moving observers or particles with no histories after (or before) a finite interval of proper time; a feature such as this is more objectionable than infinite curvature in $\mathcal{M}$. A similar argument applies to null geodesically incomplete spacetimes, since null geodesics are the histories of zero rest-mass particles (see the Appendix). On the other hand, spacelike curves are not the carriers of any particles or observers, hence spacelike geodesic incompleteness is not of much significance. With these facts in view, Penrose adopted the following definition for singularity:

Definition 8.5.1: The minimum conditions for a spacetime to be singularity-free are that it be both timelike and null geodesically complete.

According to this definition, timelike/null incompleteness implied the occurrence of a singularity in spacetime; some of Penrose's and Penrose-Hawking's earlier results were based on this premise. The main disadvantage of this premise was that it gave information about the occurrence of a singularity but provided no clue about the shape, size or location of the singularity. It was here that b-completeness was found useful.

Consider a $C^1$-curve $\lambda(t)$ through $p \in \mathcal{M}$ with tangent vector $V = (\partial/\partial t)_{\lambda(t)}$, which is expressible as $V = V^i(t)E_i$ in terms of a parallelly propagated basis $\{E_i\}$ (along $\lambda(t)$). The parameter $u$ defined as:

$$u = \int_p \Big(\sum_i V^i V^i\Big)^{1/2} dt \qquad (8.5.1)$$

is called the generalized affine parameter on $\lambda$. The length of a curve $\lambda$ is finite in the parameter $u$ if and only if it is finite in any other parameter $u'$ that results from another choice of basis $E'_i$. If $\lambda$ is a geodesic curve, then $u$ is an affine parameter on $\lambda$.

Definition 8.5.2: The pair $(\mathcal{M}, g)$ is called b-complete if there is an endpoint for every $C^1$-curve of finite length as measured by a generalized affine parameter.
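As a short worked check of the statement that the generalized affine parameter reduces to an ordinary affine parameter on a geodesic, take the generalized affine parameter in its standard form $u = \int (\sum_i V^i V^i)^{1/2}\,dt$ and let $t$ be an affine parameter along the geodesic. The tangent vector is then itself parallelly propagated, so its components $V^i$ in a parallelly propagated basis are constants. The symbol $c_0$ below is introduced here only for illustration.

```latex
\frac{dV^i}{dt} = 0
\quad\Longrightarrow\quad
u(t) = \int_0^{t} \Big( \sum_i V^i V^i \Big)^{1/2} dt' = c_0\, t,
\qquad
c_0 = \Big( \sum_i V^i V^i \Big)^{1/2} = \text{const} > 0 .
```

Thus $u$ is a positive constant multiple of $t$ and is therefore again an affine parameter; in particular, a geodesic has finite generalized affine length exactly when it has finite affine length.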
Note that if the length of $\lambda$ is finite in terms of one parameter $u$, it is so in terms of all other parameters $u'$ (assuming that the bases $E_i$ and $E'_i$ leading to $u$, $u'$ are related via a non-singular matrix). This observation suggests that the bases $\{E_i\}$, $\{E'_i\}$ can be taken as orthonormal bases without any loss of generality, and it further suggests that if $g$ is positive definite, the generalized affine parameter defined via an orthonormal basis is arc-length; hence b-completeness in this particular case coincides with m-completeness. We remark:

Remark 8.5.3: On a pair $(\mathcal{M}, g)$, b-completeness can be defined even if the metric is not positive definite. As long as there is a connection on $\mathcal{M}$, b-completeness can always be defined. The b-completeness implies g-completeness but not the other way around.

In view of the above remark, we define a spacetime to be singularity-free if it is b-complete. We say that a b-incomplete curve corresponds to a scalar polynomial curvature singularity ('s.p. curvature singularity') if any of the scalar polynomials in $g_{ab}$, $\eta_{abcd}$, $R_{abcd}$ is unbounded on the incomplete curve. Similarly, it corresponds to a curvature singularity with respect to a parallelly propagated basis ('p.p. curvature singularity') if any of the components of the curvature tensor is unbounded on the incomplete curve. Evidently an s.p. curvature singularity implies a p.p. curvature singularity. We next show the bundle-theoretic origin of b-completeness.

We said earlier that the notion of b-completeness was introduced to study the structure of singularities (shape, size and location), but the main difficulty was that the manifold $\mathcal{M}$ of a spacetime $(\mathcal{M}, g)$ was assumed to be without any singular point. In order to make room for singular points, a sort of boundary $\partial$ had to be attached to $\mathcal{M}$, giving rise to another manifold $\mathcal{M}^+ = \mathcal{M} \cup \partial$. The boundary $\partial$ had to be uniquely determined by measurements at non-singular points of $(\mathcal{M}, g)$. Hawking and Geroch (see [16a] and [13a]) in separate papers suggested a construction method for $\partial$ by defining singular points as equivalence classes of incomplete geodesics. Their method was improved upon by Schmidt's construction given below (see [16c], [36]).

Schmidt used the theory of bundles to obtain $\mathcal{M}^+$. Let $O(\mathcal{M})$ be the set of all orthonormal frames $\{E_a\}$, where $E_a \in T_p$ for each $p$ in $\mathcal{M}$ and $a = 1, 2, 3, 4$. Consider a positive definite metric $e$ defined on the bundle $\pi: O(\mathcal{M}) \to \mathcal{M}$ ($\pi$ maps a basis $\{E_a\}$ at a point $p$ to the point $p$). It can be shown that $O(\mathcal{M})$ viewed as a manifold is m-incomplete in the metric $e$ if and only if $\mathcal{M}$ is b-incomplete.³⁸ When $O(\mathcal{M})$ is m-incomplete, one can obtain the metric space completion $\overline{O(\mathcal{M})}$ of $O(\mathcal{M})$ by Cauchy sequences. The projection $\pi$ can be extended to $\overline{O(\mathcal{M})}$, and the quotient of $\overline{O(\mathcal{M})}$ by $\pi$ is defined to be $\mathcal{M}^+$, which is the union of $\mathcal{M}$ and some additional points denoted by $\partial$. The set $\partial$ consists of singular points of $\mathcal{M}$, since it is the set of endpoints for every b-incomplete curve in $\mathcal{M}$.

We now state some of the results that deal with a singularity in spacetime and are based on the concepts of completeness/incompleteness introduced above. These results are referred to as "singularity theorems" in the literature.

Result 8.5.4 (Theorem 1, Penrose [30a]) A spacetime (M, g) cannot be null geodesically complete if: (i) $R_{ab}K^aK^b \geq 0$ for all null vectors $K^a$ (see Sec. 3); (ii) there is a non-compact Cauchy surface $\mathcal{H}$ in M; (iii) there is a closed trapped surface $\mathcal{T}$ in M.

Result 8.5.5 (Theorem 2, Hawking and Penrose [16b]) A spacetime (M, g) is not timelike and null geodesically complete if: (i) $R_{ab}K^aK^b \geq 0$ for every non-spacelike vector K (see Sec. 3); (ii) the generic condition is satisfied, i.e., every non-spacelike geodesic contains a point at which $K_{[a}R_{b]cd[e}K_{f]}K^cK^d \neq 0$, where K is the tangent vector to the geodesic;
Result 8.5.7 (Theorem 3, Hawking [16c]) If (i) $R_{ab}K^aK^b \geq 0$ for every non-spacelike vector K; (ii) the strong causality condition holds on (M, g); (iii) there is some past-directed unit timelike vector W at a point p and a positive constant b such that if V is the unit tangent vector to the past-directed timelike geodesics through p, then on each such geodesic the expansion $\theta = V^a{}_{;a}$ of these geodesics becomes less than $-3c/b$ within a distance $b/c$ from p, where $c = -W^aV_a$; then there is a past incomplete non-spacelike geodesic through p.

38. This is obviously equivalent to the result: (O(M), e) is m-complete if and only if (M, g) is b-complete.

Result 8.5.8 (Theorem 4, Hawking [16c]) A spacetime is not timelike geodesically complete if: (i) $R_{ab}K^aK^b \geq 0$ for every non-spacelike vector K; (ii) there exists a compact spacelike 3-surface S (without edge); (iii) the unit normals to S are everywhere converging (or everywhere diverging) on S.

As mentioned earlier, we focus our attention only on the significance of these theorems and omit their proofs. The interested reader is advised to consult [16c], [16e] and the original papers cited there.

Remark 8.5.9 The first of these theorems was formulated (by Penrose) to prove the occurrence of singularities in a star which collapsed inside its Schwarzschild radius. The theorem, as is evident from the hypotheses, was not based on assumptions of symmetry (spherical or axial); instead it used a more general criterion, namely the existence of a closed trapped surface. The conclusion of the theorem was that in a collapsing star one of two things occurs: a singularity, or a Cauchy horizon. It is worth noting here that long before Penrose's work on collapsing stars, an equally fundamental result on the contraction/expansion of a star was established by S. Chandrasekhar (see [4a], [4c] and Box 24.1 in [26]). He showed that when a star contracted,³⁹ the matter particles came very close to each other and, by Pauli's exclusion principle, they had to have different velocities. As a result the particles moved away from (repelled) each other and thus tended to make the star re-expand.
This implied that there was a point in a star's (life) history where the star maintained itself at a constant radius by a balance between the gravitational attraction and the repulsion caused by the exclusion principle. Chandrasekhar calculated that this balance was possible only as long as the star's mass was less than (approximately) 3/2 times the mass of our sun, $\tfrac{3}{2}M_\odot$ (this is now called the Chandrasekhar limit). If the star's mass was less than this limit, the star stopped contracting, to become a "white dwarf" with a (small) radius of a few thousand miles and a density of hundreds of tons per cubic inch. On the other hand, if the mass exceeded the limit, the repulsion due to Pauli's exclusion principle could not halt the collapse, and what happened to such a star was not known. It was J. R. Oppenheimer (the atom bomb physicist) who showed that as the star contracts, the gravitational field at the surface gets stronger and stronger, and the light cones get bent inward more and more (see Box 24.1 in [26]). As a result it becomes difficult for the light to escape from the star. The light appears dimmer and redder to an observer at a distance. Eventually, when the star has shrunk to a certain critical radius, the gravitational field at the surface becomes so strong that light can no longer escape. It is this phase of a collapsing star which yields what is called a 'black hole.'

Remark 8.5.10 In view of Definition (8.5.1), the failure of the non-spacelike geodesic completeness condition in Theorem 2 can be interpreted to mean that any spacetime which satisfies (i)-(iv) possesses a singularity. Whether the singularity is indeed of the "infinite curvature" type cannot be inferred from it. More precisely, the theorem implies that 'some' causal (non-spacelike) geodesic "enters a singularity" (i.e., is compelled to be geodesically incomplete) before any "repeated focusing" has time to take place.

39. When a star has lost a sufficient amount of its nuclear fuel, it begins to cool off and thus begins to contract.


Remark 8.5.11 Theorems 2 and 3 are considered the most useful theorems on singularities, since the conditions laid down in these theorems are satisfied in a number of physical cases. It must be noted, however, that the occurrence predicted by their hypotheses may not be a singularity; it may just be a closed timelike curve violating the causality condition. An outcome of this nature is physically more objectionable than the occurrence of a singularity. A valid question that follows from this is: would causality violations prevent the occurrence of singularities? Theorem 4 shows that causality violations in general cannot prevent the occurrence of singularities, and as such singularities have to be taken seriously. Our next remark deals with the role played by the metric and/or curvature in the occurrence of singularities. The remark is based on the character of singularities predicted by Theorem 4. According to this theorem, geodesic incompleteness in spacetime is the consequence of unbounded curvature or irregularities in the metric.

Remark 8.5.12 If one extends spacetime so as to try to continue the incomplete geodesics, either the metric fails to be Lorentzian or the curvature becomes locally unbounded, giving rise to a curvature singularity. We note, however, that in the latter case, even though the curvature may be locally unbounded, the metric could still be interpreted as a distributional solution of the Einstein equations if the volume integrals of the curvature components over any compact region are finite.

Remark 8.5.13 It is impossible to determine the manifold structure at points of singularity by physical measurements. In fact there are manifold structures which agree for non-singular regions but differ for the singular points. A case in point is the manifold at the t = 0 singularity in Robertson-Walker solutions (see Subsec. (4.1.2)). This could be described by the coordinates $\{t,\ r\cos\theta,\ r\sin\theta\cos\phi,\ r\sin\theta\sin\phi\}$ or by $\{t,\ Sr\cos\theta,\ Sr\sin\theta\cos\phi,\ Sr\sin\theta\sin\phi\}$. In the first case the singularity is a 3-surface; in the second case it is a single point.

5.2 Black Holes

We describe here in brief the formation of black holes in a universe that obeys the principles of Einstein's theory of general relativity. As mentioned above, black holes result from a collapsing star, and when such stars have a static and spherically symmetric body, the solutions to Einstein's equations in the regions outside of them are those that follow from the Schwarzschild model. For it is this model which represents the spherically symmetric empty spacetime outside a massive spherically symmetric body (see Exes. 4.1, 4.2 and 4.5). We shall see below that the size (radius) of the collapsed star, and hence the ensuing singularity and the black hole, depends (among other things) on the 'Schwarzschild radius'⁴⁰ 2m, given in the metric:

$$ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)$$

Suppose that $r_0$ is a fixed radial distance that corresponds to the surface of the star; then for $r > r_0$ the solution of Einstein's equations is indeed the Schwarzschild solution (S.S.) for asymptotically flat regions. We note that when the star is static, the radius $r_0$ must be greater than 2m, since the surface of the star corresponds to the orbit of a timelike Killing vector, which exists (in the S.S.) only where the radial distance r > 2m.

40. Evidently the Schwarzschild radius (S.R.) differs from body to body; for instance, it is about 1 cm for the earth and about 3 km for the sun. The corresponding ratios of m (half the S.R.) to the radius of the body are about $7 \times 10^{-10}$ for the earth and $2 \times 10^{-6}$ for the sun.
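The figures quoted for the Schwarzschild radius can be checked with a few lines of arithmetic. The sketch below evaluates $2m = 2GM/c^2$ and the ratio $m/R$ for the earth and the sun; the constants, masses and radii are rounded reference values assumed here for illustration and are not taken from the text.

```python
# Rounded SI reference values; these numbers are assumptions of this sketch,
# not data taken from the text.
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1

BODIES = {
    "earth": (5.972e24, 6.371e6),   # (mass in kg, radius in m)
    "sun": (1.989e30, 6.957e8),
}


def schwarzschild_radius(mass_kg):
    """Return the Schwarzschild radius 2m = 2GM/c^2 in metres."""
    return 2.0 * G * mass_kg / C**2


def mass_to_radius_ratio(mass_kg, radius_m):
    """Return the dimensionless ratio m/R = GM/(c^2 R)."""
    return G * mass_kg / (C**2 * radius_m)


if __name__ == "__main__":
    for name, (mass_kg, radius_m) in BODIES.items():
        r_s = schwarzschild_radius(mass_kg)
        ratio = mass_to_radius_ratio(mass_kg, radius_m)
        print(f"{name}: 2m = {r_s:.3e} m, m/R = {ratio:.1e}")
```

With these inputs the radius comes out a little under a centimetre for the earth and close to three kilometres for the sun, so both bodies are very far from the regime $r_0 \sim 2m$ discussed here.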


If $r_0$ were less than 2m, the surface of the star would be expanding or contracting; the latter would come into play only after the nuclear fuel of the star is exhausted, when the star begins to cool and the pressure is reduced. Again, since the solution outside the star is a S.S., there will be a closed trapped surface $\mathcal{T}$ around the star. Hence in view of Theorem 2, a singularity will occur provided there is no causality violation and the appropriate energy conditions continue to hold. We further note that even when the star is not exactly spherically symmetric,⁴¹ the above phenomenon of a closed trapped surface around the star would still occur provided the departure from spherical symmetry is not too great. This would not be due to the S.S. now, but would follow from the development of a (partial) Cauchy surface. One of the key questions here is: how large can the rotation be without preventing the occurrence of a trapped surface? This question is answered by the Kerr solution, which can be thought of as representing the exterior solution for a body with mass m and angular momentum L = am. If a < m, there are closed trapped surfaces, but if a > m, there are no closed trapped surfaces. In other words, if the angular momentum of the star is greater than its squared mass ($L > m^2$), then the contraction of the star would halt before a closed trapped surface developed. While reaching this conclusion of no closed trapped surfaces due to large angular momentum, one must remember, however, that during the period of collapse the star will lose angular momentum, and hence the notion that angular momentum could prevent closed trapped surfaces, and thus the occurrence of a singularity, has its drawbacks. In the following remarks we shall analyze the mass and the density of a star qualitatively to reach conclusive answers on its collapse. In the process we shall define the stars that are called "white dwarfs" and "neutron stars."

Remark 8.5.14 A star is equipped with a (frozen) magnetic field B, which increases as $\rho^{2/3}$ ($\rho \equiv$ matter density of the star) during the star's collapse, which is assumed to be nearly spherical. Thus the magnetic pressure at any time is proportional to $\rho^{4/3}$. This rate of increase is so slow that unless the magnetic pressure was of relevance initially, it would have no significant influence on the collapse of the star.

Remark 8.5.15 A star that is (completely) burnt out cannot support itself against gravity if its mass exceeds the limit of $1.5\,M_\odot$. In order to establish this limit, we note that in hot matter there is pressure produced by the thermal motions of the atoms and by the radiation (that results from hot matter); in cold matter, however, at densities lower than that of nuclear matter ($\approx 10^{14}$ gm cm$^{-3}$), the only significant pressure arises from the quantum mechanical exclusion principle, as explained below.

Remark 8.5.16 Consider fermions of mass m with number density n. By the exclusion principle, each of these fermions will occupy a volume of $n^{-1}$. In view of the uncertainty principle, it will have a spatial component of momentum of order $\hbar n^{1/3}$, where $\hbar$ denotes the (reduced) Planck constant. The velocity, and consequently the pressure due to these fermions, will be determined by the rule (8.5.2) given below:

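Non-relativistic fermions ($\hbar n^{1/3} < m$): the velocity is of the order $\hbar n^{1/3} m^{-1}$; the pressure ($\equiv$ (momentum) $\times$ (velocity) $\times$ (number density)) is of the order $\hbar^2 n^{5/3} m^{-1}$.

Relativistic fermions ($\hbar n^{1/3} > m$): the velocity is $\approx 1$ (the speed of light); the pressure is of the order $\hbar n^{4/3}$.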
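The rule above can be turned into a small order-of-magnitude calculator. The sketch below restores the factors of c that the text sets equal to one and works in CGS units; the constant values are rounded and the function names are hypothetical, chosen only for this illustration.

```python
# Rounded CGS values; assumptions of this sketch, not data from the text.
HBAR = 1.055e-27        # reduced Planck constant, erg s
C = 2.998e10            # speed of light, cm s^-1
M_ELECTRON = 9.109e-28  # electron rest mass, g


def typical_momentum(n):
    """Momentum of order hbar * n^(1/3) implied by the uncertainty principle."""
    return HBAR * n ** (1.0 / 3.0)


def is_relativistic(n, mass):
    """True when hbar * n^(1/3) exceeds m c (the text's hbar n^(1/3) > m with c = 1)."""
    return typical_momentum(n) > mass * C


def degeneracy_pressure(n, mass):
    """Order-of-magnitude pressure = (momentum) x (velocity) x (number density)."""
    momentum = typical_momentum(n)
    velocity = C if is_relativistic(n, mass) else momentum / mass
    return momentum * velocity * n


if __name__ == "__main__":
    for n in (1e24, 1e30, 1e36):  # number densities in cm^-3
        regime = "relativistic" if is_relativistic(n, M_ELECTRON) else "non-relativistic"
        print(f"n = {n:.0e} cm^-3: {regime}, P ~ {degeneracy_pressure(n, M_ELECTRON):.2e} dyn cm^-2")
```

The two branches reproduce the scalings $n^{5/3}$ and $n^{4/3}$ of the rule; numerical factors of order unity are ignored, as in the text.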

41. A star departs from spherical symmetry if it is rotating or it has a magnetic field.


Remark 8.5.17 From the above rule it is also evident that as long as the matter is non-relativistic, the major contribution to the degeneracy pressure comes from the electrons, since $m^{-1}$ is bigger for them than it is for the baryons (see Table S). At high densities, however, when the particles become relativistic, the pressure is independent of their mass; it depends only on their number density.

Remark 8.5.18 A cold body (burnt-out star) may be so small that self-gravity can be neglected; in this case the degeneracy pressure is balanced by attractive electrostatic forces between nearest neighbouring particles arranged in a lattice. Assuming that there are an equal number of positive and negative charges and (approximately) an equal number of electrons and baryons, these forces will produce a negative pressure of order $e^2 n^{4/3}$. Thus the mass density of a small cold body will be of the order

$$e^6 m_e^3 m_n \hbar^{-6} \quad (\approx 1\ \text{gm cm}^{-3}) \qquad (8.5.2)$$

Here $m_e$ ($m_n$) is the electron (nucleon) rest-mass.

Remark 8.5.19 When the cold body is large enough for the self-gravity to be important, it works very differently. The gravity compresses the matter against the degeneracy pressure. Using a Newtonian order-of-magnitude argument, we note that for a star of mass M and radius $r_0$, the gravitational force per unit volume is of the order $(M/r_0^2)\,n m_n$, where $n m_n \approx M/r_0^3$ is the mass density. The gravitational force is balanced by a pressure gradient of the order $P/r_0$, P being the average pressure in the star. Thus the pressure P can be expressed as:⁴²

$$P = \frac{M^2}{r_0^4} = M^{2/3} n^{4/3} m_n^{4/3} \qquad (8.5.3)$$

When the density is sufficiently low, in view of Remark (8.5.17), the main contribution to the pressure is from the degeneracy of non-relativistic electrons; hence using the rule (8.5.2) we have:

$$P = \hbar^2 n^{5/3} m_e^{-1} \qquad (8.5.4)$$

Equating this value of P with that of (8.5.3) we obtain $M^{2/3} n^{4/3} m_n^{4/3} = \hbar^2 n^{5/3} m_e^{-1}$, which gives the value of the number density n as:

$$n = M^2 m_n^4 m_e^3 \hbar^{-6} \qquad (8.5.5)$$

The above value of n is based on the assumption that the self-gravity of the star is coming into play. This will be valid as long as this n is greater than the value of n given by (8.5.2), where self-gravity has no influence on the star as it is too small; this n must also be less than $m_e^3\hbar^{-3}$ for Eq. (8.5.5) to be correct. In terms of pressures, the relationship between small and large stars can be stated as:

$$e^2 n^{4/3} < M^{2/3} n^{4/3} m_n^{4/3}$$

or equivalently as:

$$e^3 m_n^{-2} < M \qquad (8.5.6)$$

42. The approximate value of P follows from the fact that $n m_n = M / \left(\tfrac{4}{3}\pi r_0^3\right)$, i.e., $r_0 = \left(3M/(4\pi n m_n)\right)^{1/3}$, so that, up to a numerical factor, $M^2/r_0^4 \approx M^{2/3}(n m_n)^{4/3}$.

On the other hand, the electrons remain non-relativistic only if

$$n < m_e^3\hbar^{-3} \qquad (8.5.7)(a)$$

which is the condition $n^{1/3}\hbar < m_e$ of the rule (8.5.2). The fermions not being relativistic, their pressure does not vary as $\hbar n^{4/3}$, and this prevents continuing gravitational collapse. In view of (8.5.5), the inequality (8.5.7)(a) implies $M^2 m_n^4 m_e^3 \hbar^{-6} < m_e^3\hbar^{-3}$, or

$$M < \hbar^{3/2} m_n^{-2} \qquad (8.5.7)(b)$$

Putting both these inequalities together we have:

$$e^3 m_n^{-2} < M < \hbar^{3/2} m_n^{-2} \qquad (8.5.8)$$
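To get a feeling for the size of the two limits in (8.5.8), one may restore the constants G and c that are set equal to one in the text. Under the assumption of Gaussian units for the charge, $e^2 = \alpha\hbar c$ with $\alpha$ the fine-structure constant, the limits become $(\alpha\hbar c/G)^{3/2} m_n^{-2}$ and $(\hbar c/G)^{3/2} m_n^{-2}$. The following sketch evaluates them with rounded SI constants; all numerical inputs and function names are assumptions made for this illustration.

```python
# Rounded SI reference values; assumptions of this sketch, not data from the text.
G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8           # speed of light, m s^-1
HBAR = 1.055e-34      # reduced Planck constant, J s
M_NUCLEON = 1.675e-27  # nucleon rest mass, kg
ALPHA = 1.0 / 137.036  # fine-structure constant, e^2 / (hbar c) in Gaussian units
M_SUN = 1.989e30      # solar mass, kg


def upper_mass_limit():
    """hbar^(3/2) m_n^(-2) of (8.5.8), with G and c restored, in kg."""
    return (HBAR * C / G) ** 1.5 / M_NUCLEON**2


def lower_mass_limit():
    """e^3 m_n^(-2) of (8.5.8), using e^2 = alpha * hbar * c, in kg."""
    return ALPHA**1.5 * upper_mass_limit()


if __name__ == "__main__":
    print(f"lower limit ~ {lower_mass_limit() / M_SUN:.1e} solar masses")
    print(f"upper limit ~ {upper_mass_limit() / M_SUN:.2f} solar masses")
```

With these inputs the upper limit is of the order of one to two solar masses and the lower limit is roughly a thousandth of a solar mass. Since (8.5.8) is an order-of-magnitude statement, only the powers of ten are meaningful.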

A cold star whose mass M lies between the limits in (8.5.8) is called a white dwarf. A natural question that one asks is: what would happen if the pressure were $\sim \hbar n^{4/3}$? The answer to this question is: the star would still have an equilibrium, but it would be an unstable equilibrium. It is this instability which is responsible for collapse. This might be for a white dwarf collapsing towards a 'neutron star,' or a neutron star collapsing towards a 'black hole' (if the pressure is due to the neutrons) (see Remark (8.5.21)).

Remark 8.5.20 If the density is so high that the electrons are relativistic, i.e., $n > m_e^3\hbar^{-3}$, then the pressure P from the relativistic formula of the rule (8.5.2) is:

$$P = \hbar n^{4/3}$$

This, when equated with P in (8.5.3), gives $\hbar n^{4/3} = M^{2/3} n^{4/3} m_n^{4/3}$, showing that a star of this nature has the mass:

$$M = M_L \equiv \hbar^{3/2} m_n^{-2} \approx 1.5\,M_\odot \qquad (8.5.9)$$

This star can have any density greater than $m_e^3 m_n \hbar^{-3}$, i.e., any radius less than $\hbar^{3/2} m_e^{-1} m_n^{-1}$. We note that stars of mass greater than $M_L$ cannot be supported by the degeneracy pressure of electrons alone, as will be evident from our next remark.

Remark 8.5.21 When the electrons become relativistic, they tend to induce inverse beta decay with the protons and thus produce neutrons (see Table S for the notations used here): $e^- + p \to \nu_e + n$. This lessens their number and thus reduces the degeneracy pressure due to them, causing the star to contract and make the electrons still more relativistic. The star continues to remain in an unstable situation until nearly all the electrons and protons have been converted into neutrons. When this stage is reached, the star can again be in stable equilibrium with the support of degeneracy pressure caused by neutrons. The star in this case is called a neutron star. If the neutrons are non-relativistic, from (8.5.5) with $m_e$ replaced by $m_n$ it follows that the number density n is now:

$$n = M^2 m_n^7 \hbar^{-6}$$

If, on the other hand, the neutrons are relativistic, the star must again have a mass $M_L$ and a radius less than $\hbar^{3/2} m_n^{-2}$. But $M_L/(\hbar^{3/2} m_n^{-2}) \approx 1$, and so such a star is near the General Relativity limit $R \sim 2M_L$. In conclusion, we note (see Remark (8.5.15)) that a cold star of mass greater than $M_L$ cannot be supported by degeneracy pressure, whether it comes from electrons or from neutrons. The above limits on mass can be shown computationally by using the Newtonian equation:⁴³

$$\frac{dp}{dr} = -\rho M(r)\,r^{-2} \qquad (8.5.10)$$

where

$$M(r) = 4\pi\int_0^r \rho\,r'^2\,dr'$$

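Multiply (8.5.10) by $r^4$ and integrate the LHS by parts from 0 to $r_0$; since p = 0 at $r = r_0$ we obtain:

$$4\int_0^{r_0} p\,r^3\,dr = \frac{M^2(r_0)}{8\pi} \qquad (8.5.11)$$

On the other hand, since $dp/dr$ is never positive,

$$\int_0^{r_0} p\,r^3\,dr < \left(\int_0^{r_0} p^{3/4}\,r^2\,dr\right)^{4/3} \qquad (8.5.12)$$

Also p is never greater than $\hbar n^{4/3}$ (see the rule (8.5.2) on p. 435); hence,

$$\int_0^{r_0} p\,r^3\,dr < \hbar\left(\int_0^{r_0} n\,r^2\,dr\right)^{4/3} = \hbar\,\big(M(r_0)\big)^{4/3}\,(4\pi m_n)^{-4/3} \qquad (8.5.13)$$

From (8.5.11) we thus have, after simplification,

$$M(r_0) < (8\hbar)^{3/2}\,(4\pi)^{-1/2}\,m_n^{-2} < 8\,\hbar^{3/2}\,m_n^{-2} \qquad (8.5.14)$$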
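The equality (8.5.11) can be checked numerically on a simple example. The sketch below assumes a Newtonian star of uniform density $\rho$ (with G = 1), for which $M(r) = \tfrac{4}{3}\pi\rho r^3$ and the solution of (8.5.10) vanishing at $r_0$ is $p(r) = \tfrac{2}{3}\pi\rho^2(r_0^2 - r^2)$. The uniform-density model and the function names are assumptions chosen for illustration; they do not appear in the text.

```python
import math


def pressure_uniform_star(r, rho, r0):
    """Newtonian pressure of a uniform-density sphere (G = 1) with p(r0) = 0."""
    return (2.0 * math.pi / 3.0) * rho**2 * (r0**2 - r**2)


def check_support_identity(rho=1.0, r0=1.0, steps=20000):
    """Return both sides of 4 * int_0^r0 p r^3 dr = M(r0)^2 / (8 pi)."""
    h = r0 / steps
    integral = 0.0
    for i in range(steps):
        r = (i + 0.5) * h  # midpoint rule on [0, r0]
        integral += pressure_uniform_star(r, rho, r0) * r**3 * h
    total_mass = 4.0 * math.pi * rho * r0**3 / 3.0
    return 4.0 * integral, total_mass**2 / (8.0 * math.pi)


if __name__ == "__main__":
    lhs, rhs = check_support_identity()
    print(f"4 * integral = {lhs:.8f}, M^2 / (8 pi) = {rhs:.8f}")
```

Both sides equal $\tfrac{2}{9}\pi\rho^2 r_0^6$ analytically, so the numerical values agree up to the discretization error of the midpoint rule.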

To obtain the limits on the mass of a cold star, we have thus far used the Newtonian theory; we shall now see the effects of the theory of General Relativity on such masses. When the body is static, spherically symmetric and composed of a perfect fluid, the Einstein field equations can be reduced to (see Exc. (2.7)):

$$\frac{dp}{dr} = -\frac{(\mu + p)\,\big(\hat{M}(r) + 4\pi r^3 p\big)}{r\,\big(r - 2\hat{M}(r)\big)} \qquad (8.5.15)$$

43. The equation (8.5.10) between $dp/dr$ and the varying M(r) is called the support equation; note that the pressure varies with respect to the radius r. Following the usual practice in the literature, we have used p instead of P here to denote the pressure in subsequent equations.

where the radial coordinate is such that the area of the 2-surface (r = const., t = const.) is $4\pi r^2$. Similar to the Newtonian case, the function $\hat{M}(r)$ represents the mass defined by the integral:

$$\hat{M}(r) = \int_0^r 4\pi r'^2 \mu\,dr' \qquad (8.5.16)$$

where $\mu = \rho(1 + \epsilon)$ is the total energy density, $\rho = n m_n$ (n times the mass of a nucleon) and $\epsilon$ is the relativistic increase of mass associated with the momentum of the fermions (see Remark (8.5.16)). The function $\hat{M}(r_0)$ equals the Schwarzschild mass of the exterior Schwarzschild solution for $r > r_0$. For a bounded star $\hat{M}(r_0)$ will be less than the conserved mass:

$$\tilde{M} = \int_0^{r_0} \frac{4\pi\rho\,r^2\,dr}{\big(1 - 2\hat{M}/r\big)^{1/2}} = N m_n \qquad (8.5.17)$$

where N is the total number of nucleons in the star. The difference $(\tilde{M} - \hat{M})$ represents the amount of energy (binding energy) radiated to infinity since the formation of the star from dispersed matter initially at rest.

Remark 8.5.22

It has been shown (Bondi [3]) that:

(l-^]Nj

(8.5.18)(a)

provided $\mu$ and p are positive, and that $\mu$ decreases outward; similarly,

(l-f)N{

(8.5.18)(b)

provided $p < \mu$. Therefore $\hat{M} < \tilde{M} < 3\hat{M}$; in other words, the difference $\tilde{M} - \hat{M}$ can never exceed $2\hat{M}$; in reality the difference is never more than a few per cent.

Remark 8.5.23 When we compare (8.5.15) with (8.5.10), with $\mu$ and $\hat{M}$ in place of $\rho$ and M, we note that all the extra terms on the RHS of (8.5.15) are negative as long as both $\epsilon$ and p are > 0. Thus, just as in the Newtonian theory a cold star of mass $M > M_L$ cannot support itself, a cold star of Schwarzschild mass $\hat{M} > M_L$ cannot support itself in the General Relativity theory. This means that a cold star which contains more than $3M_L/m_n$ nucleons cannot support itself (see Remark (8.5.17)). In conclusion, it is fair to say that some of the bodies of mass $> M_L$ will eventually collapse within their Schwarzschild radius and will thus give rise to a closed trapped surface. Since there are at least $10^9$ stars with masses greater than $M_L$ in our galaxy, there would be a large number of situations where the predictions of Theorem 2 on the existence of singularities will hold good.
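The comparison made in Remark 8.5.23 can be illustrated by evaluating the right-hand sides of (8.5.10) and (8.5.15) at the same point. The sketch below works in units G = c = 1; the function names and the sample numbers are hypothetical and serve only to show that, for positive pressure, the relativistic pressure gradient is steeper (more negative) than the Newtonian one.

```python
import math


def newtonian_rhs(r, mass, density):
    """Right-hand side of the Newtonian support equation (8.5.10)."""
    return -density * mass / r**2


def relativistic_rhs(r, mass, energy_density, pressure):
    """Right-hand side of the relativistic equation (8.5.15), valid for r > 2M(r)."""
    if r <= 2.0 * mass:
        raise ValueError("the static equation requires r > 2 M(r)")
    numerator = (energy_density + pressure) * (mass + 4.0 * math.pi * r**3 * pressure)
    return -numerator / (r * (r - 2.0 * mass))


if __name__ == "__main__":
    # Illustrative sample point, not taken from the text.
    r, mass, mu, p = 10.0, 1.0, 1.0e-3, 1.0e-4
    print("Newtonian   dp/dr =", newtonian_rhs(r, mass, mu))
    print("relativistic dp/dr =", relativistic_rhs(r, mass, mu, p))
```

When the pressure vanishes and $r \gg 2\hat{M}$, the two expressions agree, which is the Newtonian limit of (8.5.15).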


Next we see how and when a collapsing star can be said to turn into a black hole. If the collapse is exactly spherical, the solution outside the body is the S.S.; an observer O at a large distance from the star is able to see an observer O' on the surface of the star while the surface is still outside r = 2m, but is not able to see O' once it passes r = 2m. With the passage of time, the light he receives from O' will have a greater and greater shift of frequency to the red and also a greater and greater decrease of intensity. The surface of the star never actually disappears from O's sight, but it becomes extremely faint and so is practically invisible. The time scale for this to happen is of the order of the time for light to travel a distance 2m. One is now left with an invisible object, but this object has the same Schwarzschild mass and it still produces the same gravitational field as it did before it collapsed. The only way one can detect its presence is by its gravitational effects on nearby objects, or by the deflection of light passing near it. Since in a spherically symmetric collapse the singularity occurs within the region r < 2m, from which no light can escape to infinity, this singularity (predicted by Theorem 2) cannot be seen by an observer who is outside r = 2m. The surface r = 2m is called the event horizon of the collapsing star (see Fig. (8.13)), and the matter and energy which cross the event horizon are lost forever, making the star into a black hole. When the collapsing star is not exactly spherically symmetric, the theory of Cauchy surfaces is used to obtain solutions outside the collapsing star and, as mentioned earlier, the trapped surfaces and the event horizon pertaining to this star are seen to follow using arguments similar to the spherically symmetric case. Two key questions that one asks in both these cases are:

(1) Can the future be predicted far away from a collapsing star?
(2) Once the energy (of the collapsed star) has been radiated to infinity in the form of gravitational waves, does the solution outside the horizon approach a stationary state?

[Figure 8.13: (i) Finkelstein diagram and (ii) Penrose diagram, with labels: singularity (r = 0), origin of coordinates, Schwarzschild vacuum solution, event horizon, observer O, observer O', O's past light cone, 1 o'clock. An observer O who never falls inside the collapsing fluid sphere never sees beyond a certain time (say, 1 o'clock) in the history of another observer O' on the surface of the collapsing fluid sphere.]


To answer the first question, Hawking gave a mathematical meaning to future predictability by defining it as follows:

Definition 8.5.24 Let (M, g) have a region which is asymptotically flat (see Sec. 4); then there is a space $(\tilde{M}, \tilde{g})$ into which (M, g) can be conformally imbedded as a manifold with boundary $\bar{M} = M \cup \partial M$, where $\partial M$, the boundary of M in $\tilde{M}$, consists of two null surfaces $\mathcal{J}^+$ and $\mathcal{J}^-$ that represent future and past null infinity. Suppose that S is a partial Cauchy surface in M. The space (M, g) is said to be (future) asymptotically predictable from S if $\mathcal{J}^+$ is contained in the closure of $D^+(S)$ in the conformal manifold $\tilde{M}$.

Some of the spaces which are future asymptotically predictable from some surface S are: the Minkowski space, the Schwarzschild solution for m > 0, the Kerr solution for m > 0 with |a| < m, and the Reissner-Nordström solution for m > 0 with |e| < m. On the other hand, the Kerr solution with |a| > m and the Reissner-Nordström solution with |e| > m are not future asymptotically predictable, since for any partial Cauchy surface S there exist past-inextendible non-spacelike curves from $\mathcal{J}^+$ that do not intersect S but approach a singularity.

Remark 8.5.25 Future asymptotic predictability is regarded as the condition that there are no singularities to the future of S which are "naked," i.e., visible from $\mathcal{J}^+$. In a spherical collapse, one obtains a space which is future asymptotically predictable. When the collapse is not exactly spherical (but the departure from spherical symmetry is sufficiently small), one has the following result:

Proposition 8.5.26 If (i) (M, g) is future asymptotically predictable from a partial Cauchy surface S, and (ii) $R_{ab}K^aK^b \geq 0$ for all null vectors $K^a$, then a closed trapped surface $\mathcal{T}$ in $D^+(S)$ cannot intersect $J^-(\mathcal{J}^+, \bar{M})$, i.e., cannot be seen from $\mathcal{J}^+$.

The concept of predictability can be extended by making another definition. We shall see that this concept provides a prescription for two or more black holes to unite and form another black hole.

Definition 8.5.27 A spacetime is strongly future asymptotically predictable from a partial Cauchy surface S if $\mathcal{J}^+$ is contained in the closure of $D^+(S)$ in $\tilde{M}$, and $J^+(S) \cap \bar{J}^-(\mathcal{J}^+, \bar{M})$ is contained in $D^+(S)$.

The definition can be interpreted to mean that a neighbourhood of the event horizon can also be predicted from S. If (M, g) is strongly future asymptotically predictable from a partial Cauchy surface S, then a homeomorphism $\phi: (0, \infty) \times S \to D^+(S) - S$ can be defined, such that for each $\tau \in (0, \infty)$, $S(\tau) = \phi(\{\tau\} \times S)$ is a partial Cauchy surface which intersects $\mathcal{J}^+$. In fact $\{S(\tau)\}$ represents a family of spacelike surfaces homeomorphic to S which cover $D^+(S) - S$ and intersect $\mathcal{J}^+$. On the surface $S(\tau)$, a black hole is defined as a connected component of the set $B(\tau) = S(\tau) - J^-(\mathcal{J}^+, \bar{M})$. Thus, it is a region of $S(\tau)$ from which particles or photons cannot escape to $\mathcal{J}^+$. The above definition and the construction of a family of partial Cauchy surfaces $\{S(\tau)\}$ with the properties given in the footnote* suggest that as $\tau$ increases, black holes can merge together, and new black holes can form as a result of further collapsing bodies. We note that the reverse process does not follow, i.e., black holes can merge together but never bifurcate.

* The surfaces $S(\tau)$ in the family $\{S(\tau)\}$ have the following properties: (i) for $\tau_2 > \tau_1$, $S(\tau_2) \subset J^+(S(\tau_1))$; (ii) for each $\tau$, the edge of $S(\tau)$ in the conformal manifold $\tilde{M}$ is a spacelike 2-sphere $Q(\tau)$ in $\mathcal{J}^+$ such that for $\tau_2 > \tau_1$, $Q(\tau_2)$ is strictly to the future of $Q(\tau_1)$; (iii) for each $\tau$, $S(\tau) \cup \{\mathcal{J}^+ \cap \bar{J}^-(Q(\tau), \bar{M})\}$ is a Cauchy surface in $\bar{M}$ for D(S).


To answer the second question posed above, we set the conditions for a 'stationary regular predictable space.'

Definition 8.5.28 A spacetime (M, g) is a stationary regular predictable space if it has the following properties:

(i) It is a regular predictable space developing from a partial Cauchy surface.
(ii) There exists an isometry group $\theta_t: M \to M$ whose Killing vector K is timelike near $\mathcal{J}^+$ and $\mathcal{J}^-$.
(iii) It is either empty or contains fields, like the electromagnetic field or scalar fields, that obey (well-behaved) hyperbolic equations and satisfy the dominant energy condition: $T_{ab}N^aL^b \geq 0$ for future-directed timelike vectors N, L.

In view of the above definition, Question (2) is answered with 'yes,' as it can be expected that for large values of $\tau$, the region $J^-(\mathcal{J}^+, \bar{M}) \cap J^+(S(\tau))$ of a regular predictable space containing collapsing stars would be almost isometric to a similar region of a stationary regular predictable space. One is also interested in knowing if these regular predictable spacetimes are static. This is the case if, for example, the final state of the solution outside the event horizon is static, for then the metric in the exterior region will be that of a Schwarzschild solution. On the other hand, in an empty stationary regular predictable space which is not static, the Killing vector $K^a$ is spacelike in part of the exterior region $J^+(\mathcal{J}^-, \bar{M}) \cap J^-(\mathcal{J}^+, \bar{M})$. The region of $J^+(\mathcal{J}^-, \bar{M}) \cap J^-(\mathcal{J}^+, \bar{M})$ on which $K^a$ is spacelike is called the ergosphere. Naturally, if the solution is static, there is no ergosphere. An example of a stationary non-static regular predictable space with an ergosphere is the Kerr solution for $a^2 < m^2$. Many rigorous results on the formation and structure of black holes have been proved by physicists on either side of the Atlantic using mostly the theory of General Relativity. To reach the present state of the art, various simplifications of line elements were proposed (see, for instance, [41] and a survey article by Miller and Sciama in [7]), and many diverse approaches were suggested.
Amongst these approaches, according to Bekenstein (in [27]), Gerlach's work [12] deserves a special mention, since this was the sort of analysis that would have been more convincing to Einstein, given his reservations about quantum theory. Continuous experimental activity with improved technology has led to confirmation of most of the predictions of this black hole theory. For a historical account of black holes, the reader is referred to the articles by Israel and Blandford in [16d], and for theorems on black holes to Hawking's selectively collected papers [16e]. Lately there has been immense research activity that has established a link between black holes and string theories. Two of these papers are listed here. (See also Subsections (B.9) and (B.14) of Chapter 11, along with the additional references there, on these ideas.) In conclusion, we reiterate that we have only introduced the reader briefly to this vast realm of knowledge via references in the literature which we feel should be easily accessible. Due to our limited scope, we have not been able to cover many important aspects of the theory, e.g., quantum gravity (see Hawking and W. Israel in [16e]), gravitational waves (see ii in [24]), general relativity via complex structures [10] and twistors [30].

Exercise 8.5 1. Establish the equality (8.5.11) and the inequality (8.5.12).


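Hints to Exercise 8.5

1. Consider the Newtonian equation of support:

(i) $$\frac{dp}{dr} = -\rho M(r)\,r^{-2}$$

where $M(r) = 4\pi\int_0^r \rho\,r'^2\,dr'$ represents the mass of the body up to the radius r. Multiply (i) by $r^4$ and integrate by parts between the limits 0 and $r_0$. The LHS equals:

(ii) $$\int_0^{r_0} \frac{dp}{dr}\,r^4\,dr = \Big[r^4 p\Big]_0^{r_0} - \int_0^{r_0} 4r^3 p\,dr.$$

The first term vanishes at both limits, since p = 0 at $r_0$. Similarly the RHS of (i) equals:

(iii) $$-\int_0^{r_0} \underbrace{M(r)}_{\text{1st fn.}}\;\underbrace{\rho\,r^2}_{\text{2nd fn.}}\,dr.$$

Integration by parts gives:

(iv) $$-\Big[(\text{1st fn.})\int(\text{2nd fn.})\,dr\Big]_0^{r_0} + \int_0^{r_0}\Big(\frac{d}{dr}(\text{1st fn.})\Big)\Big(\int(\text{2nd fn.})\,dr\Big)dr.$$

Since by definition

(v) $$\int_0^r \rho\,r'^2\,dr' = \frac{M(r)}{4\pi},$$

after equating (ii) and (iv) and simplifying, we have the equality (8.5.11). To establish the inequality (8.5.12), we consider the derivative:

(vi) $$\frac{d}{dr}\left(\int_0^r p^{3/4}\,r'^2\,dr'\right)^{4/3} = \frac{4}{3}\left(\int_0^r p^{3/4}\,r'^2\,dr'\right)^{1/3} p^{3/4}\,r^2 \;\geq\; \frac{4}{3}\left(\frac{p^{3/4}\,r^3}{3}\right)^{1/3} p^{3/4}\,r^2 = \frac{4}{3}\,3^{-1/3}\,p\,r^3,$$

since $dp/dr$ is never positive. Thus, integrating from 0 to $r_0$,

$$\int_0^{r_0} p\,r^3\,dr \;\leq\; \frac{3}{4}\,3^{1/3}\left(\int_0^{r_0} p^{3/4}\,r^2\,dr\right)^{4/3},$$

which is the inequality (8.5.12) up to a numerical factor close to unity.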

APPENDIX 8A

A.1 Spatially Homogeneous

The spacetime is called spatially homogeneous (SH) if there is a group of isometries which acts freely on M, and whose surfaces of transitivity are spacelike three-surfaces. On these SH-surfaces, any one point on a surface is equivalent to any other point on it.

A.2

Geodesically Complete

Consider the differential equation:

$$\frac{d^2 x^a}{dv^2} + \Gamma^a{}_{bc}\, \frac{dx^b}{dv}\, \frac{dx^c}{dv} = 0 \qquad (8A.1)$$

with a $C^r$-connection ($r \ge 0$) $\nabla$ on $\mathcal{M}$. Solutions of this equation are geodesics with affine parameter v. If $g_{ab}\, \frac{dx^a}{dv} \frac{dx^b}{dv} = 1$, then v is an arc-length parameter. It is known that for any point $p \in \mathcal{M}$ and any vector $X_p$ at p, there exists a maximal geodesic $\lambda_X(v)$ such that $\lambda_X(0) = p$ and $(\partial/\partial v)_\lambda|_{v=0} = X_p$. For $r \ge 1$, this geodesic is unique and it depends differentiably on p and $X_p$; thus a $C^r$-map exp: $T_p \to \mathcal{M}$ can be defined, where for each $X \in T_p$, exp(X) is the point in $\mathcal{M}$ a unit parameter distance along the geodesic $\lambda_X$ from p. This map is sometimes not defined for all $X \in T_p$, as the geodesic $\lambda_X(v)$ may not be defined for all v. If it is defined for all v, the geodesic $\lambda(v)$ is said to be a complete geodesic. The manifold $\mathcal{M}$ is said to be geodesically complete if all geodesics on $\mathcal{M}$ are complete, i.e., if exp is defined on all of $T_p$ for every p in $\mathcal{M}$.

A.3

Normal Coordinates

In a convex normal neighbourhood $\mathcal{N}$, one can choose any point q and a basis $\{E_a\}$ of $T_q$, and assign coordinates $(x^1, x^2, \ldots, x^n)$ to any point r by the relation $r = \exp(x^a E_a)$ (i.e., using the coordinates of the point $\exp^{-1}(r)$ in $T_q$ with respect to the basis $\{E_a\}$). These are called normal coordinates based on q. In these coordinates:

$$(\partial/\partial x^a)|_q = E_a; \qquad \Gamma^a{}_{(bc)}|_q = 0 \qquad (8A.2)$$

A.4

Open or Closed Universe

Consider the spatially homogeneous, isotropic Robertson-Walker spacetime model with line element (see (8.4.23)):

$$ds^2 = S^2(t)\, d\sigma^2 - dt^2$$

where $d\sigma^2$ is the metric of the "standard" (complete, simply connected) 3-space of constant curvature k. The models are called spatially open, flat or closed according as k < 0, k = 0 or k > 0. In the Newtonian models one can always obtain these values by choosing $R_0 = R(0)$ suitably; however, this is not so in the relativistic models.

A.5

Cavendish Constant $G_c$

This constant was obtained by H. Cavendish in 1798 (see Sec. 40.8 in [26], and the original paper, Phil. Trans. R. Soc. London Part II (1798)) by carrying out experiments to determine the density of the Earth. The apparatus was made of two separated spheres suspended by fine wires. Newton's gravitational law using this constant reads:

$$\text{Force} = -\,G_c\, \frac{m_1 m_2}{r^2}$$

The constant is equal to one in general relativity, but in other metric theories it varies from event to event in spacetime. For instance, in the Dicke-Brans-Jordan theory it is determined by the distribution of matter in the universe. As a result, the expansion of the universe changes its value, thus:

$$\frac{1}{G_c}\frac{dG_c}{dt} \sim \left( \frac{0.1 \text{ to } 1}{\text{age of universe}} \right) \sim \frac{1}{10^{10} \text{ or } 10^{11} \text{ years}}$$

In some theories, the result of a Cavendish experiment depends on the chemical composition and internal structure of the test bodies. The most accurate tests of this type were done by Kreuzer in 1968 (see Sec. 40.8 in [26] and the original paper in Phys. Rev. 169 (1968)).

A.6

Closed Trapped Surface

Consider a sphere S that surrounds a massive body of high density. At some initial instant, let S emit a flash of light; then at a later time t, the ingoing and outgoing wave fronts from S will form spheres $S_1$ and $S_2$ respectively. Normally the area of $S_1$ ($S_2$) is smaller (greater) than that of S, since it represents ingoing (outgoing) light. However, if a large amount of matter is enclosed within S, then the areas of both $S_1$ and $S_2$ will be smaller than that of S. The surface S is then said to be a closed trapped surface. We shall denote it as $S_T$ in the text. See Fig. (8.14) below. A particular example of such a surface is as follows. Consider an orientable compact spacelike two-surface S in $D^+(S)$ such that the expansion $\theta$ (see Subsec. 8.3.3) of the outgoing null geodesics orthogonal to it is non-positive; then S is an outer trapped surface. We encounter these trapped surfaces when a star collapses. Thus if $S(\tau)$ is a Cauchy surface at time $\tau$ during this collapse, then the trapped region $T(\tau)$ in the surface $S(\tau)$ is the set of all points $q \in S(\tau)$ such that there is an outer trapped surface, say T, lying in $S(\tau)$, through q. The existence of the trapped region $T(\tau)$ implies the existence of a black hole.

Fig. 8.14 The envelopes (spheres) $S_1$ and $S_2$ formed by ingoing and outgoing wavefronts due to the emission of flashes of light from S. (The light from a point p forms a sphere around p, and these small spheres form the two envelopes.)

A.7

Particle Horizon

Consider a family of particles in de Sitter space whose histories are timelike geodesics (these geodesics originate at the past spacelike infinity $\mathscr{I}^-$ and end at the future spacelike infinity $\mathscr{I}^+$). Suppose p is an event on the world-line of a particle O (observer O) in this family; the past null cone of p is the set of events in spacetime which can be observed by O at that time. All those particles whose world-lines intersect this null cone are visible to O, whereas those particles whose world-lines do not intersect it are not visible to O. The division between particles seen and not seen by O at p is called the particle horizon for the observer O at p. Thus the particle horizon represents the history of those particles lying at the limits of O's vision. (See Fig. (8.15).)

Fig. 8.15 The particle horizon defined by a congruence of geodesic curves originating from past spacelike infinity $\mathscr{I}^-$. (Labels in the figure: O's world-line; past null cone of O at p; particle horizon; particle has been observed by O at p; particles not yet observed by O at p.)

A.8

Event Horizon

All events outside the past null cone of p are events which are not, and have never been, observable by O up to the time represented by the event p (in de Sitter spacetime). Thus there is a limit to O's world-line on $\mathscr{I}^+$. The surface which is a boundary between events which will at some time be observable by O and those that will never be observable is called the future event horizon of O's world-line. (The past event horizon can be similarly defined.)

References

1. A. Ashtekar and R. O. Hansen, A unified treatment of null and spatial infinity in general relativity I. Universal structure, asymptotic symmetries, and conserved quantities at spatial infinity, J. Math. Phys. 19 (1978), 1542.
2. J. K. Beem and P. E. Ehrlich, Global Lorentzian Geometry (New York: Marcel Dekker, 1981; Second Edition 1996).
3. (a) H. Bondi and T. Gold, The steady state theory of the expanding universe, Mon. Not. Roy. Ast. Soc. 108 (1948), 252-270.
   (b) H. Bondi, Massive spheres in general relativity, Proc. Roy. Soc. London A282 (1964).
4. (a) S. Chandrasekhar, The maximum mass of ideal white dwarfs, Astrophys. J. 74 (1931), 81-82.
   (b) S. Chandrasekhar and J. L. Friedman, On the stability of axisymmetric systems to axisymmetric perturbations in general relativity I. The equations governing nonstationary, stationary, and perturbed systems, Astrophys. J. 175 (1972), 379.
   (c) S. Chandrasekhar, The Mathematical Theory of Black Holes (Oxford: Clarendon Press, 1983).
   (d) S. Chandrasekhar in Vol. 3 of [7].
5. P. C. W. Davies, Space and Time in the Modern Universe (New York: Cambridge University Press, 1977).
6. B. DeWitt and C. M. DeWitt (ed.), Black Holes (New York: Gordon and Breach, 1973).
7. J. Ehlers (ed.), Relativity theory and astrophysics (3 volumes), American Math. Soc. (1967).
   (i) A. Schild (Lectures on general relativity theory), Vol. 1.
8. (a) A. Einstein, On the electrodynamics of moving bodies, Annalen der Physik 17 (1905).
   (b) A. Einstein, On the influence of gravitation on the propagation of light, Annalen der Physik 35 (1911).
   (c) A. Einstein, The foundation of the general theory of relativity, Annalen der Physik 49 (1916).
9. F. de Felice and C. J. S. Clarke, Relativity on Curved Manifolds (New York: Cambridge University Press, 1990).
10. E. J. Flaherty, Hermitian and Kählerian Geometry in Relativity (New York: Springer-Verlag, 1976).
11. M. Friedman, Foundations of Space-Time Theories (New York: Princeton University Press, 1983).
12. U. Gerlach, The mechanism of blackbody radiation from an incipient black hole, Phys. Rev. D14 (1976), 1479.
13. (a) R. P. Geroch, Local characterization of singularities in General Relativity, J. Math. Phys. 9 (1968).
    (b) Spinor structures of spacetime in general relativity I and II, Journ. Math. Phys. 9 (1968); Journ. Math. Phys. 11 (1970).
14. G. W. Gibbons, The time symmetric initial value problem for black holes, Commun. Math. Phys. 27 (1972), 87.
15. V. Guillemin and S. Sternberg, Variations on a theme by Kepler, Am. Math. Soc. (1990).
16. (a) S. W. Hawking, Singularities and the geometry of spacetime, Adams prize essay (1966).
    (b) S. W. Hawking and R. Penrose, The singularities of gravitational collapse and cosmology, Proc. Roy. Soc. London A314 (1970).
    (c) S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Spacetime (New York: Cambridge University Press, 1973).
    (d) S. W. Hawking and W. Israel, Three Hundred Years of Gravitation (New York: Cambridge University Press, 1987). (i) C. M. Will (Experimental gravitation from Newton's Principia to Einstein's general relativity); (ii) T. Damour (The problem of motion in Newtonian and Einsteinian gravity); (iii) W. Israel (Dark stars: the evolution of an idea); (iv) K. S. Thorne (Gravitational radiation); (v) A. Vilenkin (Gravitational interaction of cosmic strings); (vi) S. W. Hawking (Quantum cosmology); (vii) J. H. Schwarz (Superstring unification); (viii) R. Penrose (Newton, quantum theory and reality); (ix) R. D. Blandford (Astrophysical black holes).
    (e) S. W. Hawking, Hawking on the Big Bang and Black Holes (New Jersey: World Scientific, 1993).
17. A. Held, General Relativity and Gravitation (Vols. 1 and 2, Plenum Press, 1980). (i) J. C. Miller and D. W. Sciama (Gravitational collapse to the black hole state); (ii) F. J. Tipler, C. J. S. Clarke and G. F. R. Ellis (Singularities and horizons: review article); (iii) L. P. Grishchuk and A. G. Polnarev (Gravitational waves and their interaction with matter and fields).
18. G. 't Hooft, Black hole quantization and a connection to string theory, in [23].
19. (a) F. Hoyle, A new model for the expanding universe, Mon. Not. Roy. Ast. Soc. 108 (1948), 372-382.
    (b) F. Hoyle and J. V. Narlikar, Time symmetric electrodynamics and the arrow of time in cosmology; A new theory of gravitation, Proc. Roy. Soc. London A277 (1963); A282 (1964).
20. J. A. Isenberg (ed.), Mathematics and general relativity, American Math. Soc. (1988).
21. (a) C. J. Isham, Modern Differential Geometry for Physicists (New Jersey: World Scientific, 1989).
    (b) C. J. Isham, An introduction to general topology and quantum topology, in [7].
22. S. Kobayashi and K. Nomizu, 1.[10].
23. H. C. Lee (ed.), Physics, geometry and topology, NATO ASI Series B (Phys. Vol. 238, Plenum Press, 1990).
24. M. A. H. MacCallum (ed.), General Relativity and Gravitation (New York: Cambridge University Press, 1987). (i) M. A. Abramowicz (Accretion disks around black holes); (ii) L. P. Grishchuk (Gravity-wave astronomy); (iii) C. J. Isham (Quantum gravity); (iv) P. Mazur (Black hole uniqueness theorems); (v) R. Penrose (Twistors in general relativity).
25. A. R. Marlow (ed.), Quantum Theory and Gravitation (New York: Academic Press, 1980).
26. C. W. Misner, K. S. Thorne and J. A. Wheeler, Gravitation (San Francisco: W. H. Freeman & Co., 1973).
27. Y. Ne'eman (ed.), To Fulfill A Vision (Addison-Wesley Publishing Company, 1981). (i) C. N. Yang (Geometry of Physics); (ii) F. Gürsey (Geometrization of unified fields); (iii) J. D. Bekenstein (Gravitation, the quantum, and statistical physics); (iv) Y. Ne'eman (Gauged and affine quantum gravity).
28. E. T. Newman, L. Tamburino and J. J. Unti, Empty space generalization of Schwarzschild metric, Journ. Math. Phys. 4 (1963).
29. I. Newton, Mathematical Principles of Natural Philosophy and His System of the World (Philosophiae Naturalis Principia Mathematica), Joseph Streater (ed.), London, July 5, 1686 and Florian Cajori (ed.) (Berkeley: University of Cal. Press, 1962).
30. (a) R. Penrose, Gravitational collapse and spacetime singularities, Phys. Rev. Lett. 14 (1965).
    (b) General relativity, energy flux and elementary optics, in Perspectives in geometry and relativity (Hlavaty Festschrift, 1966).
    (c) C. M. DeWitt and J. A. Wheeler (ed.), Structure of spacetime, in Battelle Rencontres (New York: Benjamin, 1968).
31. W. Perrett and G. B. Jeffery, The Principle of Relativity, A Collection of Original Memoirs on the Special and General Theory of Relativity (New York: Dover, 1923).
32. Z. Perjes (ed.), Relativity Today (Nova Science Publishing Company, 1992). (i) L. M. Sokolowski (Gravitational waves in multi-dimensional spacetimes); (ii) R. Bartnik (The spherically symmetric Einstein Yang-Mills equations).
33. N. Prakash, Projective structures in spacetime, Indian J. Pure Appl. Math 17 (5) (1986).
34. T. Regge and J. A. Wheeler, Stability of a Schwarzschild singularity, Phys. Rev. 108 (1957), 1063.
35. R. K. Sachs and H. Wu, General Relativity for Mathematicians (New York: Springer-Verlag, 1977).
36. B. G. Schmidt, A new definition of singular points in General Relativity, J. Gen. Rel. and Gravitation 1 (1971).
37. B. F. Schutz, A First Course in General Relativity (New York: Cambridge University Press, 1985).
38. J. M. Stewart, Advanced General Relativity (New York: Cambridge University Press, 1990).
39. E. F. Taylor and J. A. Wheeler, Spacetime Physics: Introduction to Special Relativity (2nd ed., New York: W. H. Freeman and Company, 1990).
40. K. S. Thorne, Black Holes and Time Warps (New Jersey: Princeton University Press, 1993).
41. P. C. Vaidya, 'Newtonian' time in general relativity, Nature 171 (1953), 260.
42. C. V. Vishveshwara, Stability of the Schwarzschild metric, Phys. Rev. D1 (1970), 2870.
43. K. Yano and S. Bochner, Curvature and Betti Numbers (Annals of Maths. Studies No. 32, New Jersey: Princeton University Press, 1953).
44. O. Kowalski and D. Krupka (ed.), Differential Geometry and its Applications, Proc. (1993). (i) M. Mikkelsen (Standard static space-times with perfect fluid).
45. A. Strominger and C. Vafa, Microscopic origin of the Bekenstein-Hawking entropy, hep-th/9601029 v2, 14 Feb 1996.
46. K. Skenderis, Black holes and branes in string theory, hep-th/9901050 v2, 19 Jan 1999.

CHAPTER 9

BASICS OF QUANTUM THEORY

1

INTRODUCTION

We devote this chapter to the basics of quantum theories. Classically, this theory was developed using two important principles: first, that the energy absorbed or emitted by a body comes in multiples of a constant $h \neq 0$, called Planck's constant; and second, Heisenberg's uncertainty principle, expressed as $\Delta p \times \Delta v \times m \gtrsim \hbar$.¹ Very often an object or a property that carries $\hbar$ in its definition is called a quantum object or a quantum property. For example, a photon, which carries the energy $E = \hbar\omega$ ($\omega$ = wave frequency), is a quantum object,² and a particle carrying an angular momentum which is a multiple of $(1/2)\hbar$ possesses a quantum property. The theoretical and experimental techniques that help determine the quantum nature of an object or a physical system are often referred to as components of quantum theories. The procedure used in moving from classical physics to the physics that uses the two principles cited above is called quantization.

The quantization of any given system is achieved via one of two equivalent approaches, namely the canonical formalism (Sec. 2 and Sec. 3) or the path-integral formalism (Sec. 4, Sec. 5 and Sec. 6). In the former case the dynamical variables of the system are treated as operators, and these operators are postulated to satisfy the canonical commutation relations (9.2.16). The Hamiltonian of the system is constructed, which is then used to find the time evolution operator (9B.21). This eventually leads to the computation of the transition amplitude from the state at an initial time to the state at a final time. The path-integral formalism, on the other hand, allows the transition amplitude to be expressed directly as the sum over all paths between the initial and the final state. The summands of the functional integral here are weighted by $e^{i\alpha}$ (denoted also as $e^{iS}$ or $e^{iA}$), where $\alpha$ (S or A) denotes the action (in units of Planck's constant $\hbar$) for the particular path.

In both these approaches, one of the key ingredients of the theory is the principle of superposition, which asserts that in a given region, every wave function $\psi$ without singularities can be expressed as a linear combination of eigenfunctions coming from Schrödinger's wave equation, provided the boundary conditions satisfied by $\psi$ are the same as those of the eigenfunctions. The principle also extends to wave functions with multiple components (such as Pauli's or Dirac's) in a natural manner via matrix methods. In the case of path-integrals, the Feynman sum over histories is based mainly on the (linear) superposition principle. In fact this principle is such a basic component of the theory that one just uses it without mentioning it explicitly.

On a historical note regarding the quantum theory, it is worth mentioning that it took more than five decades to reach the stage at which we are today. Those who contributed the most toward establishing the theory on firm ground were notably Schrödinger, Heisenberg, Pauli, Dirac and Feynman. The mathematical disciplines that were used the most in formulating the theory were analysis, algebra and geometry. As we are well aware now, Schrödinger's approach was based on differential equations and as such used analysis; Pauli and Dirac, who used operators and matrices to describe the theory, made algebra their main tool; whereas Feynman and Heisenberg, in addition to these disciplines, used geometry (graphs) as well, as a means of identifying the amplitudes of a quantum mechanical process.

Since much of the theory now uses these disciplines interchangeably, we have attempted here to present the material in an integrated manner. More specifically, we have devoted Sec. 2 to the Schrödinger and Heisenberg equations and Sec. 3 to Dirac's equations along with the Klein-Gordon equations. These equations are studied for free particles as well as for particles moving in different fields. The topic of quantization of fields is studied in Sec. 4 and Sec. 5 using the diagram technique and Feynman's path-integral formalism. In Sec. 6 we use the knowledge of the previous two sections to introduce the Feynman graphs, a tool which is used with great success in string theories. Some of the terms and results that are required for an understandable account of this (giant) theory are described in brief in Appendices A, B and C. The theory of the text is illustrated by examples and exercises. The chapter also contains, in Appendix D, a brief account of Hopf algebras, known as quantum groups. Instead of making these Hopf algebras part of our chapters on algebra, we preferred to include them here because of the word 'quantum'.

1 $\Delta p$ = uncertainty in the correct measurement of the "position" of a given particle P. $\Delta v$ = uncertainty in the correct measurement of the "velocity" of P. m = mass of P.

2 $\hbar$ is actually Planck's constant/$2\pi$; the equation $E = \hbar\omega$, where $\omega$ denotes the wave frequency, is also interpreted to mean that Planck's constant h connects the wave and particle aspects of 'light' in the photon via $\omega$ and E.

2

PASSAGE FROM CLASSICAL TO QUANTUM

We review here in brief the similarities and differences between the tools used in the two theories (see Table 1 at the end of this section). In classical mechanics (CM) the basic entities are the topological spaces known as 'phase spaces', the 'observables', which are real-valued continuous functions on phase spaces, and the symmetry groups, i.e., the groups of self-homeomorphisms of phase spaces. In the case of quantum mechanics (QM) the phase space is replaced by a separable infinite-dimensional Hilbert space $\mathcal{H}$, the observables are self-adjoint operators which may or may not be bounded, and symmetries are given by automorphisms of the *-algebraic structure³ of $L(\mathcal{H})$, the space of linear operators on $\mathcal{H}$. The analogues of one-forms $\omega$ and vectors v in CM (dual objects, since each of them is a linear real-valued function on the space defined by the other, assuming that space is finite dimensional) are the Dirac kets $|\psi\rangle$ and bras $\langle\phi|$, whose scalar product satisfies:

$$\langle\phi|\psi\rangle = \langle\psi|\phi\rangle^* \qquad (9.2.1)$$

(See Appendix 9A for definitions, and (9.2.41)-(9.2.43) for the relation between differential forms and operators.)

3 (See Sec. (4.1) for definitions.)


2.1

The Concept of Amplitude, Observable, and Hamiltonian

Every quantum mechanical process is associated with a complex number called the quantum amplitude. The squared modulus of the amplitude equals the probability of occurrence of the process:

$$|\text{Amplitude}|^2 = \text{Prob.} \qquad (9.2.2)$$

For example, consider a particle at a point $x_0$ at time $t_0$ and at $x'$ at time $t'$. The travelling of this particle from $x_0$ to $x'$ during the time $(t' - t_0)$ is a process, to which we associate the amplitude A, and the probability of finding it at $x'$ is $|A|^2$.

Figure: Particle travelling from $x_0$ to $x'$ during the interval $(t' - t_0)$.

The amplitude A for this reason is called the probability amplitude. Quantum mechanics postulates that there is a set of state vectors, symbolized as $|\ \rangle$, that describe all configurations of the system. In the case of the above example, the state of the system at $(x_0, t_0)$ is $|x_0, t_0\rangle$, and the probability amplitude A is the overlap between the initial and the final states, given by a scalar product $\langle\ |\ \rangle$ (on the Hilbert space to which $|x_0, t_0\rangle$, etc., belong), thus:

$$A = \langle x', t' | x_0, t_0 \rangle \qquad (9.2.3)$$

The states are normalized, hence (9.2.2) written out as:

$$|\langle x', t' | x_0, t_0 \rangle|^2 = \text{Prob.} \qquad (9.2.4)$$

makes sense (the probability p of occurrence of any event satisfies $0 \le p \le 1$). Another equally important postulate of QM asserts that all physically observable quantities be represented by operators on a Hilbert space, and that the result of a measurement of an observable must be an eigenvalue of the operator representing it. For instance, the 'position' of the particle is an observable quantity; thus if X denotes the corresponding operator acting on a state $|x\rangle$ of the system, then $X|x\rangle$ gives the eigenvalue of the operator. Naturally $|x\rangle$ is an eigenvector (eigenstate) of X. It is customary to use the same symbol x for the eigenvalue as well as for the eigenstate $|x\rangle$; thus we have:

$$X|x\rangle = x|x\rangle \qquad (9.2.5a)$$

Then there is the concept of time evolution in QM, which is provided through a Hermitian operator H known as the Hamiltonian, which time-translates a (time-dependent) state $|\psi(t)\rangle$ from t to $t + \epsilon$, $\epsilon$ being small. Using this operator we have the Schrödinger equation (see Appendix 9B):

$$i\hbar \frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle \qquad (9.2.5b)$$
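The postulates (9.2.2)-(9.2.5b) can be illustrated on the smallest possible Hilbert space. The following is a minimal sketch, assuming a two-level system with $\hbar = 1$ and a Hermitian $2 \times 2$ Hamiltonian of our own choosing; the function names are hypothetical and the closed form for $e^{-iHt}$ is the standard one for $2 \times 2$ Hermitian matrices, not a construction taken from the text. It evolves a state with (9.2.5b) and evaluates the probabilities $|\langle\phi|\psi(t)\rangle|^2$.

```python
import cmath
import math

def evolve_two_level(h11, h22, h12, psi0, t):
    """Return exp(-iHt) psi0 for H = [[h11, h12], [conj(h12), h22]], with hbar = 1."""
    h12 = complex(h12)
    mean = 0.5 * (h11 + h22)                    # part of H proportional to the identity
    dz = 0.5 * (h11 - h22)
    k = [[dz, h12], [h12.conjugate(), -dz]]     # traceless part K, which obeys K*K = w*w*I
    w = math.sqrt(dz * dz + abs(h12) ** 2)
    c = math.cos(w * t)
    s = math.sin(w * t) / w if w > 0 else t     # limit of sin(wt)/w as w -> 0
    phase = cmath.exp(-1j * mean * t)
    # exp(-iHt) = exp(-i*mean*t) * (cos(wt) I - i sin(wt) K / w)
    u = [[phase * ((c if i == j else 0.0) - 1j * s * k[i][j]) for j in range(2)]
         for i in range(2)]
    return [u[i][0] * psi0[0] + u[i][1] * psi0[1] for i in range(2)]

def amplitude(phi, psi):
    """Overlap <phi|psi> of two state vectors given as lists of numbers."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

# Illustrative choice: H swaps the two basis states; start in the first one.
psi_t = evolve_two_level(0.0, 0.0, 1.0, [1.0, 0.0], 0.4)
prob_stay = abs(amplitude([1.0, 0.0], psi_t)) ** 2
prob_flip = abs(amplitude([0.0, 1.0], psi_t)) ** 2
```

Because H is Hermitian the evolution is unitary, so the two probabilities add up to one at every time t; for this particular H they are $\cos^2 t$ and $\sin^2 t$.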


In short, if the physical state of a particle at time t is described by the normalized wave function $\Psi(\vec r, t)$, with $|\Psi(\vec r, t)|^2$ the probability density for finding the particle at position $\vec r$ (see Appendix 9B), then the expectation values⁴ of the position and momentum (of the particle) are given by using the wave function as follows:

$$\langle \vec r \rangle = \int \Psi^*(\vec r, t)\, \vec r\, \Psi(\vec r, t)\, d^3r \qquad (9.2.6)$$

$$\langle \vec p \rangle = \int \Psi^*(\vec r, t)\, \frac{\hbar}{i} \nabla \Psi(\vec r, t)\, d^3r \qquad (9.2.7)$$

While calculating $\langle \vec p \rangle$ it is assumed that $\Psi(\vec r, t)$ has a continuous derivative everywhere (the expectation value of the momentum operator for a non-continuous wave function can also be defined, though it is more complicated). The time-development of the wave function is determined by the wave equation:

$$i\hbar \frac{\partial \Psi}{\partial t} = \left[ -\frac{\hbar^2}{2\mu} \nabla^2 + V(r) \right] \Psi \qquad (9.2.8)$$

The expression within brackets represents the energy operator $\frac{p^2}{2\mu} + V$ for a particle of mass $\mu$ in a conservative field; it is denoted by H, showing that the Hamiltonian operator here is simply the energy operator. The above equation, known as the equation of motion for the state vector, can thus be written as (see (9B.19)):

$$i\hbar \frac{\partial \Psi}{\partial t} = H\Psi \qquad (9.2.9)$$

As indicated earlier, the operator H is always a Hermitian operator. We give below two examples that illustrate the above introduction. The first of these gives the translation from classical to quantum using the same differential equation.

2.2

Symmetry Group of the Motion of a Particle in 1-Dimension

Example 9.2.1 Consider the differential equation:

$$\ddot x + F(x) = 0 \qquad (9.2.10)$$

where $F: \mathbb{R} \to \mathbb{R}$ is a given smooth function; in CM it represents a particle moving in a one-dimensional space whose position is given by $x \equiv x(t)$ at time t. The motion of the system is analyzed by taking the phase space $\mathbb{R}^2$,⁵ on which canonical coordinate functions $q, p: \mathbb{R}^2 \to \mathbb{R}$ are introduced as:

$$q(x, y) = x, \qquad p(x, y) = y \qquad (9.2.11)$$

4 The expectation value of a random variable X for the given probability distribution $(X_k, p_k)$ is the weighted sum $\sum X_k p_k$. It is denoted as E(X) or $\langle X \rangle$. $p_k$ denotes the probability that X may take the value $X_k$.

5 The cotangent bundle $T^*(M)$ of an n-dimensional $C^\infty$-manifold M with coordinates $(x_1, \ldots, x_n) = (x_1(t), \ldots, x_n(t))$ defines the 2n-dimensional phase space in CM with coordinates $(x_1, \ldots, x_n, y_1, \ldots, y_n)$; here M is $\mathbb{R}$, hence the phase space is $\mathbb{R}^2$: (x, y) (see also 9B).


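The commutation relations satisfied by p and q are:

$$\{p, p\} = \{q, q\} = 0, \qquad \{p, q\} = 1 \qquad (9.2.12)$$

which can be easily seen to follow from the Poisson bracket $\{f, g\} = \frac{\partial f}{\partial y}\frac{\partial g}{\partial x} - \frac{\partial f}{\partial x}\frac{\partial g}{\partial y}$, for smooth real-valued functions $f, g: \mathbb{R}^2 \to \mathbb{R}$. The second order differential equation (9.2.10) reduces to a system of first order differential equations on the phase space, thus:

$$\dot x = y, \qquad \dot y = -F(x) \qquad (9.2.13)$$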
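The first-order system (9.2.13) is easy to integrate numerically, which gives a concrete picture of the map that carries an initial point (x, y) of the phase space to its position after time t. The following is a minimal sketch, assuming $F(x) = x$ (the harmonic oscillator, our choice for illustration) and a fixed-step fourth-order Runge-Kutta integrator; the function names and step count are hypothetical.

```python
import math

def flow(x, y, t, force, steps=2000):
    """Approximate the time-t map of x' = y, y' = -force(x) with fixed-step RK4."""
    h = t / steps
    for _ in range(steps):
        k1x, k1y = y, -force(x)
        k2x, k2y = y + 0.5 * h * k1y, -force(x + 0.5 * h * k1x)
        k3x, k3y = y + 0.5 * h * k2y, -force(x + 0.5 * h * k2x)
        k4x, k4y = y + h * k3y, -force(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x, y

def oscillator_energy(x, y):
    """Conserved quantity y^2/2 + f(x) with f(x) = x^2/2, i.e. f' = F for F(x) = x."""
    return 0.5 * y * y + 0.5 * x * x

linear_force = lambda x: x                      # assumed F for this illustration

direct = flow(1.0, 0.0, 1.0, linear_force)      # evolve for time 1.0 in one go
x_mid, y_mid = flow(1.0, 0.0, 0.3, linear_force)
composed = flow(x_mid, y_mid, 0.7, linear_force)  # evolve 0.3, then 0.7
```

For this choice of F the exact solution from (1, 0) is $(\cos t, -\sin t)$, so `direct` can be compared with it; `composed` agreeing with `direct` illustrates that evolving for time s and then t is the same as evolving for t + s, and the energy stays constant along the curve.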

Since we are interested in 'systems' without singularities, we require that through every point $(x_0, y_0)$ of $\mathbb{R}^2$ there be a unique smooth curve $t \mapsto (x(t), y(t))$ that satisfies (9.2.13) with initial conditions $x(0) = x_0$, $y(0) = y_0$. This requirement translates into the existence of a smooth one-parameter family $\{\phi_t : t \in \mathbb{R}\}$ of homeomorphisms of $\mathbb{R}^2$ such that the functions $q_t = q \circ \phi_t$, $p_t = p \circ \phi_t$ satisfy:

$$\frac{d}{dt} q_t = p_t, \qquad \frac{d}{dt} p_t = -F(q_t) \qquad (9.2.14)$$

together with $\phi_{t+s} = \phi_t \circ \phi_s$ and $\phi_0 = \mathrm{Id}$. The existence of the above symmetry group is ensured by the following result:

Result 9.2.2 Let $f: \mathbb{R} \to \mathbb{R}$ be a smooth function such that $f'(x) = F(x)$ (i.e., $f(x) = \int_0^x F(t)\, dt$); then if f is bounded below, there exists a flow $(\phi_t : t \in \mathbb{R})$ that satisfies (9.2.14).

See Chap. 8 in [18] for the proof.

In the case of QM, we have to quantize the one-dimensional system given by (9.2.10); therefore we begin with the Hilbert space $\mathcal{H} = L^2(\mathbb{R})$ of square-integrable functions on $\mathbb{R}$ and choose (in place of the canonical coordinate functions (9.2.11) of CM) the canonical operators:

$$Q = \text{multiplication by } x, \qquad P = \frac{1}{i} \frac{d}{dx} \qquad (9.2.15)$$

The operators Q and P satisfy the commutation rule:

$$[Q, Q] = [P, P] = 0, \qquad [P, Q] = -i\,\mathbf{1} \qquad (9.2.16)$$

Here $\mathbf{1}$ stands for the identity operator on $L^2(\mathbb{R})$. In view of the above discussions, the dynamics of the system is represented by a one-parameter group $(\alpha_t : t \in \mathbb{R})$ of *-automorphisms of $L(\mathcal{H})$. Thus the equations (analogous to those of a classical system (9.2.14)) satisfied by the dynamical group $(\alpha_t)$ are:

$$\frac{d}{dt}\alpha_t(Q) = \alpha_t(P), \qquad \frac{d}{dt}\alpha_t(P) = -F(\alpha_t(Q)) = -\alpha_t(F(Q)) \qquad (9.2.17)$$
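The commutation rule (9.2.16) can be verified by letting Q and P act on test functions. The following is a minimal sketch, assuming $P = -i\, d/dx$ (units with $\hbar = 1$) and using polynomials stored as coefficient lists; polynomials are not square-integrable, so this is only an algebraic check of the rule, not a statement about $L^2(\mathbb{R})$. All function names are hypothetical.

```python
def apply_q(coeffs):
    """Q: multiply the polynomial sum(c_k x^k) by x (shift coefficients up by one)."""
    return [0] + list(coeffs)

def apply_p(coeffs):
    """P = -i d/dx acting on polynomial coefficients (hbar = 1)."""
    derivative = [k * c for k, c in enumerate(coeffs)][1:]
    return [-1j * c for c in derivative] or [0]

def commutator_pq(coeffs):
    """Return the coefficients of (PQ - QP) applied to the polynomial."""
    pq = apply_p(apply_q(coeffs))
    qp = apply_q(apply_p(coeffs))
    size = max(len(pq), len(qp))
    pad = lambda v: list(v) + [0] * (size - len(v))
    return [a - b for a, b in zip(pad(pq), pad(qp))]

# f(x) = 1 + 2x + 3x^2; the rule [P, Q] = -i predicts (PQ - QP) f = -i f.
result = commutator_pq([1, 2, 3])
```

Each coefficient of `result` is $-i$ times the corresponding coefficient of f, as (9.2.16) requires.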


2.3

Two-body Problem with Spherically Symmetric Potential

In the next example we consider two non-relativistic spinless particles interacting via a spherically symmetric potential, to obtain the associated Hamiltonian along with its eigenvalues and eigenkets.

Example 9.2.3 Consider two particles of masses $m_1$, $m_2$ with position operators $r_1$ and $r_2$, and momentum operators $p_1$ and $p_2$. Let V(r) denote a spherically symmetric potential, where

$$r = (\mathbf{r} \cdot \mathbf{r})^{1/2} \quad \text{and} \quad \mathbf{r} = r_1 - r_2 \qquad (9.2.18)$$

then the Hamiltonian for the system can be written as:

$$H_T = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + V(r) \qquad (9.2.19)$$

(see (9.2.8)). In order to obtain the eigenkets of $H_T$, we reduce it to a sum:

$$H_T = H_{cm} + H \qquad (9.2.20)$$

The components $H_{cm}$ and H are:

$$H_{cm} = \frac{1}{2(m_1 + m_2)} P^2 \qquad (9.2.21)$$

$$H = \frac{1}{2M} p^2 + V(r) \qquad (9.2.22)$$

where

$$P = p_1 + p_2 \qquad (9.2.23)$$

$$p = (m_2 p_1 - m_1 p_2)/(m_1 + m_2) \qquad (9.2.24)$$

and

$$M = (m_1 m_2)/(m_1 + m_2) \quad \text{(reduced mass)} \qquad (9.2.25)$$
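The splitting (9.2.20) rests on an algebraic identity for the kinetic terms that already holds for ordinary vectors. The following is a minimal sketch, assuming classical 3-component momenta in place of the operators $p_1$, $p_2$ (so operator ordering plays no role); the function name and the sample values are hypothetical.

```python
def split_kinetic(m1, m2, p1, p2):
    """Return (p1^2/2m1 + p2^2/2m2, P^2/2(m1+m2) + p^2/2M) for 3-tuples p1, p2."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    total_p = tuple(a + b for a, b in zip(p1, p2))                        # P of (9.2.23)
    rel_p = tuple((m2 * a - m1 * b) / (m1 + m2) for a, b in zip(p1, p2))  # p of (9.2.24)
    reduced_mass = m1 * m2 / (m1 + m2)                                    # M of (9.2.25)
    direct = dot(p1, p1) / (2 * m1) + dot(p2, p2) / (2 * m2)
    split = dot(total_p, total_p) / (2 * (m1 + m2)) + dot(rel_p, rel_p) / (2 * reduced_mass)
    return direct, split

direct_value, split_value = split_kinetic(1.5, 4.0, (0.3, -1.2, 2.0), (1.1, 0.4, -0.7))
```

The two returned numbers agree for any masses and momenta, which is the classical counterpart of (9.2.19) = (9.2.21) + (9.2.22).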

The operators $H_{cm}$ and H are the Hamiltonians associated, respectively, with the (translational) motion of the centre of mass of the two particles and with their relative motion (rotational and vibrational). Note that $H_T$ could be written as a sum since $r_i$ and $p_i$ (i = 1, 2) are conjugate operators and $[r_i, p_j] = 0$ ($i \neq j$; i, j = 1, 2) (see 9A). Also, since r and p are linear combinations of $r_i$ and $p_i$, they are Hermitian, and their corresponding triplets $(r_k)$ and $(p_k)$ (k, l = 1, 2, 3) satisfy:

$$[p_k, r_l] = -i\hbar\, \delta_{kl} \qquad (9.2.26)$$

(see 9A.22). This shows that r and p are canonically conjugate. It can be checked that P of (9.2.23) commutes with both of them. We further define the orbital angular momentum operator (see Exc. (9.2.1)):⁶

$$L = r \times p \qquad (9.2.27)$$

associated with the motion about the centre of mass (internal angular momentum). Since P commutes with r and p, the operators $H_{cm}$, H, $L^2$ and $L_z$ can be seen to form a set of commuting operators; we assume that the set is complete, so that, being a c.s.c.o., it can be used to write the basis kets of $H_T$ (see 9A.3 for the definition of c.s.c.o.). Let $|E\rangle$, $|l\rangle$, $|m\rangle$ and $|E_{cm}\rangle$ be eigenkets of H, $L^2$, $L_z$ and $H_{cm}$ respectively. We use $|Elm\rangle$ to denote an eigenket of the first three operators (collectively), i.e.:

$$H|Elm\rangle = E|Elm\rangle \qquad (9.2.28)$$

$$L^2|Elm\rangle = \hbar^2 l(l+1)|Elm\rangle \qquad (9.2.29)$$

$$L_z|Elm\rangle = \hbar m|Elm\rangle \qquad (9.2.30)$$

The eigenkets $|Elm\rangle$ of (9.2.28)-(9.2.30) constitute an angular momentum basis. The basis kets of $H_T$, on the other hand, can be written as the direct-product basis $|E_{cm}\rangle \otimes |Elm\rangle$, where

$$H_T(|E_{cm}\rangle \otimes |Elm\rangle) = (E_{cm} + E)(|E_{cm}\rangle \otimes |Elm\rangle) \qquad (9.2.31)$$

Since H, $L^2$ and $L_z$ are Hermitian, the kets $|Elm\rangle$ are orthogonal, and if we further assume that they are normalized, then:

$$\langle E'l'm'|Elm\rangle = \delta_{E'E}\, \delta_{l'l}\, \delta_{m'm} \qquad (9.2.32)$$

for discrete energy eigenvalues, and

$$\langle E'l'm'|Elm\rangle = \delta(E' - E)\, \delta_{l'l}\, \delta_{m'm} \qquad (9.2.33)$$

for continuous energy eigenvalues. The eigenvalues $E_{cm}$ have a continuous spectrum and therefore:

$$\langle E'_{cm}|E_{cm}\rangle = \delta(E'_{cm} - E_{cm}) \qquad (9.2.34)$$

Since H commutes with L, we also have (see Exc. (9.2.1), Hint for $L_\pm$): $H(L_\pm|Elm\rangle) = E(L_\pm|Elm\rangle)$. Finally we note that if one of the masses ($m_1$ or $m_2$) is infinite, then $H_{cm} = 0$ and the two-body problem reduces to the problem of a single particle in a spherically symmetric potential.

6 See also Sec. (3.7), in particular the hint to Exc. 2 of that section. There we have used the angular-momentum operator to obtain an SU(2)-representation.

2.4

The Radial Hamiltonian of the Two-body Problem

We next consider the radial momentum operator:

$$p_r = \frac{1}{2}(\hat r \cdot p + p \cdot \hat r) = \frac{1}{r}(r \cdot p - i\hbar) \qquad (9.2.35)$$

where $\hat r$ denotes the unit position operator. This operator is related to the linear and the orbital angular momentum operators by the identity:

$$L^2 = r^2(p^2 - p_r^2) \qquad (9.2.36)$$

We use (9.2.36) to eliminate $p^2$ from (9.2.22) to obtain:

$$H = \frac{1}{2M} p_r^2 + \frac{1}{2Mr^2} L^2 + V(r) \qquad (9.2.37)$$
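The identity (9.2.36) has a classical counterpart that can be checked with ordinary vectors. The following is a minimal sketch, assuming commuting 3-vectors r and p, so that the radial momentum reduces to $p_r = \hat r \cdot p$ and the $-i\hbar$ ordering term of (9.2.35) is absent; the function name and sample vectors are hypothetical.

```python
import math

def angular_momentum_identity(r, p):
    """Return (|r x p|^2, r^2 (p^2 - p_r^2)) for classical 3-vectors r and p."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    cross = (r[1] * p[2] - r[2] * p[1],
             r[2] * p[0] - r[0] * p[2],
             r[0] * p[1] - r[1] * p[0])
    radial_p = dot(r, p) / math.sqrt(dot(r, r))   # classical limit of p_r
    return dot(cross, cross), dot(r, r) * (dot(p, p) - radial_p ** 2)

l_squared, radial_form = angular_momentum_identity((1.0, -2.0, 0.5), (0.3, 0.8, -1.1))
```

The two numbers coincide, which is why the kinetic term $p^2/2M$ can be traded for a radial part plus the centrifugal term $L^2/2Mr^2$ as in (9.2.37).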


If further we replace $L^2$ by its eigenvalues $\hbar^2 l(l+1)$ (see (iv) of Exc. 9.2.1), then the RHS of (9.2.37) becomes:

$$H_l = \frac{1}{2M} p_r^2 + \frac{\hbar^2 l(l+1)}{2Mr^2} + V(r) \qquad (9.2.38)$$

The operator obtained in (9.2.38) is called the radial Hamiltonian. We denote an eigenket of $H_l$ as $|El\rangle$ with eigenvalue E. If $H_l$ is Hermitian with respect to $|El\rangle$ and $|E'l\rangle$, then these eigenkets are orthogonal. Thus upon normalization one has:

$$\langle E'l|El\rangle = \delta_{E'E} \qquad (9.2.39)$$

for discrete eigenvalues and

$$\langle E'l|El\rangle = \delta(E' - E) \qquad (9.2.40)$$

for continuous eigenvalues. From the above discussions it follows that the operators H, $L^2$ and $L_z$ can be expressed in terms of any representation (e.g., coordinate, momentum). For instance, in the coordinate representation, (9.2.28)-(9.2.30) become:

$$H\psi_{Elm}(r') = E\psi_{Elm}(r') \qquad (9.2.41)$$

$$L^2\psi_{Elm}(r') = \hbar^2 l(l+1)\psi_{Elm}(r') \qquad (9.2.42)$$

$$L_z\psi_{Elm}(r') = \hbar m\, \psi_{Elm}(r') \qquad (9.2.43)$$

where

$$\psi_{Elm}(r') = \langle r'|Elm\rangle \qquad (9.2.44)$$

is the coordinate-space wave function defined by the eigenket $|r'\rangle$ and eigenvalue $r'$ of the operator r. The operators H, $L^2$, $L_z$ are the differential forms obtained after substituting:

$$r \to r' \qquad (9.2.45)$$

$$p \to -i\hbar \nabla_{r'} \qquad (9.2.46)$$

in the corresponding abstract operators. We conclude this example with the following remarks.

Remark 9.2.3 Similar to the radial operators $r = (\mathbf{r} \cdot \mathbf{r})^{1/2}$ and $p_r = \frac{1}{2}(\hat r \cdot p + p \cdot \hat r) = \frac{1}{r}(r \cdot p - i\hbar)$, we can define two other radial operators:

$$p = (\mathbf{p} \cdot \mathbf{p})^{1/2} \qquad (9.2.47)$$

and

$$r_p = \frac{1}{2}(\hat p \cdot r + r \cdot \hat p) = \frac{1}{p}(p \cdot r + i\hbar) \qquad (9.2.48)$$

The Hamiltonian H can then be expressed in terms of p and $r_p$.

Remark 9.2.4 If $p_r$ is Hermitian, which is always the case when the potential V(r) is a Coulomb potential, then r and $p_r$ are conjugate operators. Similarly, when $r_p$ is Hermitian, p and $r_p$ are conjugate operators. The radial Hamiltonian expressed in terms of Hermitian $p_r$ and $r_p$ can be shown to be Hermitian. (See Chapter 7 of [21].)

We shall pursue the study of Hamiltonians in the next section by considering relativistic equations. We end this section with a comment on the so-called Schrödinger and Heisenberg pictures, which we have used alternately in the appendices as well as here, without naming them. We clarify this point in the following and show that they can be identified via simple equations.

2.5 The Relation between Schrödinger and Heisenberg Equations

Comment 9.2.5 Conventional QM begins with the Hamiltonian formulation of classical mechanics and treats observables as non-commuting operators. The dynamical law is given by the time-dependent Schrödinger equation (9B.19a):

$$i\hbar\frac{\partial}{\partial t}\psi(t) = H(t)\,\psi(t) \qquad (9.2.49a)$$

or equivalently by Eq. (9.2.5b):

$$i\hbar\frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle \qquad (9.2.49b)$$

in view of the fact that

$$\langle x|\psi(t)\rangle = \psi(x, t). \qquad (9.2.49c)$$

Thus when the Schrödinger equation represents the wave function of a particle in one dimension, we have:

$$i\hbar\frac{\partial\psi(x,t)}{\partial t} = H(x)\,\psi(x,t) = \left(-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right)\psi(x,t) \qquad (9.2.50)$$

From (9B.21) we note that for a time-independent Hamiltonian $H$, the time evolution operator $U$ equals $U(t_1, t_2) = \exp((-i/\hbar)(t_1 - t_2)H)$. We use this to link Schrödinger's time-dependent states and time-independent operators with Heisenberg's time-independent states and time-dependent operators.7 For instance, in the case of states we have:

$$|\psi\rangle_H = |\psi(t=0)\rangle_S = |\psi(t=0)\rangle$$

$$\exp((-i/\hbar)tH)\,|\psi\rangle_H = \exp((-i/\hbar)tH)\,|\psi(t=0)\rangle_S = |\psi(t)\rangle_S \qquad (9.2.51)$$

where to write the second line we have used (9B.11) and (9B.21) after writing $t_0 = 0$. Similarly the coordinate operator in the Heisenberg picture is related to the one in the Schrödinger picture as:

$$X_H(t) = \exp((i/\hbar)tH)\,X_S\,\exp((-i/\hbar)tH) \qquad (9.2.52)$$

7 The operators in the two systems are distinguished by the suffix S or H; similarly the states are designated as $|\ \rangle_S$ or $|\ \rangle_H$. The letters S and H are dropped when there is no fear of confusion or when a particular equation/statement is valid in both cases.
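The equivalence of the two pictures can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a hypothetical two-level system with a diagonal Hamiltonian, units with ħ = 1, and arbitrary values for the energies, the initial state and the time. It evolves the state with U (Schrödinger picture), evolves the operator according to (9.2.52) (Heisenberg picture), and compares the two expectation values.

```python
import cmath

HBAR = 1.0            # assumption: units with hbar = 1
E1, E2 = 0.5, 2.0     # illustrative energy eigenvalues of a diagonal H


def matmul(a, b):
    """Product of two small matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def evolution(t):
    """U(t, 0) = exp((-i/hbar) t H) for H = diag(E1, E2)."""
    return [[cmath.exp(-1j * E1 * t / HBAR), 0.0],
            [0.0, cmath.exp(-1j * E2 * t / HBAR)]]


def expectation(state, op):
    """<state| op |state> for a column vector given as a list."""
    n = len(state)
    return sum(complex(state[i]).conjugate() * op[i][j] * state[j]
               for i in range(n) for j in range(n))


X_S = [[0.0, 1.0], [1.0, 0.0]]   # Schrodinger-picture operator (time independent)
psi_0 = [0.6, 0.8j]              # normalized state at t = 0, shared by both pictures
t = 1.3                          # illustrative time

U = evolution(t)
psi_S = [sum(U[i][j] * psi_0[j] for j in range(2)) for i in range(2)]  # |psi(t)>_S
X_H = matmul(matmul(dagger(U), X_S), U)                                # Eq. (9.2.52)

schrodinger_value = expectation(psi_S, X_S)   # time-dependent state, fixed operator
heisenberg_value = expectation(psi_0, X_H)    # fixed state, time-dependent operator
```

Both numbers agree to rounding error, which is the content of (9.2.51) and (9.2.52): the time dependence can be carried either by the state or by the operator.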


The eigenstates of the operator $X_H(t)$ satisfying $X_H(t)|x, t\rangle_H = x|x, t\rangle_H$ are easily seen to be in accordance with the coordinate basis of the Schrödinger picture:

$$|x, t\rangle_H = \exp((i/\hbar)tH)\,|x\rangle \qquad (9.2.53)$$

Using (9.2.3) we now have:

$${}_H\langle x_1, t_1|x_2, t_2\rangle_H = \langle x_1|\exp((-i/\hbar)t_1H)\exp((i/\hbar)t_2H)|x_2\rangle = \langle x_1|\exp((-i/\hbar)(t_1 - t_2)H)|x_2\rangle = \langle x_1|U(t_1, t_2)|x_2\rangle = U(t_1, x_1; t_2, x_2) \qquad (9.2.54)$$

Since ${}_H\langle x_1, t_1|x_2, t_2\rangle_H$ are the time-ordered transition amplitudes between the coordinate basis states in the Heisenberg picture, it follows that the matrix elements of the time evolution operator $U(t_1, t_2)$ are nothing else but these transition amplitudes. We shall use this relation increasingly in Sec. 5. In Table 9.2.1 we give the dynamical laws for the two approaches along with the ingredients that are used there.

Table 9.2.1 Classical and Quantum Mechanics

1. Classical Mechanics: finite-dimensional phase space. Quantum Mechanics: infinite-dimensional Hilbert space.
2. Classical Mechanics: real-valued functions $f$ (one-component). Quantum Mechanics: complex-valued functions $\psi$ (with more than one component).
3. Classical Mechanics: variables $x$, $p$ with $\{x, p\} = 1$. Quantum Mechanics: operators $\hat{x}$, $\hat{p}$ with $[\hat{x}, \hat{p}] = i\hbar$.
4. Classical Mechanics: Hamiltonian $H(x, p)$ or $H(x, p, t)$. Quantum Mechanics: Hamiltonian $\hat{H}(\hat{x}, \hat{p})$.
5. Classical Mechanics: dynamical law $\frac{d}{dt}f(x, p) = \{f, H\}$; Hamilton-Jacobi Eq. $\frac{\partial S}{\partial t} + H\left(x, \frac{\partial S}{\partial x}\right) = 0$. Quantum Mechanics: dynamical law (a) Heisenberg Eq. $i\hbar\frac{d\hat{f}}{dt} = [\hat{f}, \hat{H}]$; (b) Schrödinger Eq. $i\hbar\frac{\partial\psi}{\partial t} = \hat{H}\psi$ with $\hat{H} = H\left(x, -i\hbar\frac{\partial}{\partial x}\right)$.
6. Classical Mechanics: Lagrangian $L(x, \dot{x})$; action $S = \int L\,dt$.

Exercise 9.2 (1) What are the most commonly used angular momentum operators in particle theory? Obtain their eigenvalues and eigenkets.


Hints to Exercise (9.2)

1. From Section (3.7) we are already familiar with the angular momentum operator $J$ whose Cartesian components $J_i$ ($i = 1, 2, 3$ or $x, y, z$) are linear Hermitian operators that satisfy:

(i) $[J_i, J_j] = i\hbar\,\epsilon_{ijk}J_k$.

In particle theory we come across two types of angular momentum operators, namely the orbital angular momentum operator, denoted $L$, and the spin angular momentum operator $S$. $L$ is obtained from the definition of classical angular momentum after replacing the classical position and momentum vectors by linear Hermitian operators that satisfy the canonical commutation relations (9A.22). Thus for a particle

(ii) $L = r \times p$, i.e. $L_i = \epsilon_{ijk}\,r_j\,p_k$

which shows that $L$ is linear and satisfies:

(iii) $[L_i, L_j] = i\hbar\,\epsilon_{ijk}L_k$.

Also $L_i^\dagger = \epsilon_{ijk}\,p_k^\dagger r_j^\dagger = \epsilon_{ijk}\,r_j\,p_k = L_i$.

Hence in view of (i) it is an angular momentum operator. Using the discussions made for $J^2$ and $J$ in Sec. (3.7) we note that $[L^2, L_i] = 0$, and that $L^2$ and one of the $L_i$ (say $L_z$) can have simultaneous eigenkets. These lead to the relations:

(iv) $L_z|lm\rangle = \hbar m|lm\rangle$, $\quad L^2|lm\rangle = \hbar^2 l(l+1)|lm\rangle$.

The spin angular momentum operator $S$ is postulated, so that the $S_i$ satisfy the defining commutation relations given in (i). Similar to $J_\pm$, we have here for the operator $L$:

(v) $L_\pm = L_x \pm iL_y$

with corresponding relations

(vi) $[L_z, L_\pm] = \pm\hbar L_\pm$, $\quad [L^2, L_\pm] = 0$.

Using these one can obtain their eigenvalues and quantum numbers. We use these operators in Secs. 2 and 3 while studying the two-body problem and Dirac's equation.
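Relations (iii), (iv) and (vi) can be checked on an explicit matrix representation. The following is a minimal sketch, assuming the standard l = 1 matrices in the basis |1,1>, |1,0>, |1,-1> and units with ħ = 1; the helper names are illustrative and do not come from the text.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# l = 1 matrices in the ordered basis m = +1, 0, -1.
root2 = math.sqrt(2.0)
L_plus = scale(HBAR * root2, [[0, 1, 0], [0, 0, 1], [0, 0, 0]])
L_minus = [[L_plus[j][i] for j in range(3)] for i in range(3)]  # real, so transpose = adjoint
L_z = scale(HBAR, [[1, 0, 0], [0, 0, 0], [0, 0, -1]])

# Invert (v): L_x = (L+ + L-)/2 and L_y = (L+ - L-)/(2i).
L_x = scale(0.5, add(L_plus, L_minus))
L_y = scale(-0.5j, add(L_plus, scale(-1, L_minus)))

L_squared = add(add(matmul(L_x, L_x), matmul(L_y, L_y)), matmul(L_z, L_z))
identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

check_iii = close(commutator(L_x, L_y), scale(1j * HBAR, L_z))   # [Lx, Ly] = i hbar Lz
check_iv = close(L_squared, scale(2 * HBAR ** 2, identity3))     # l(l+1) = 2 for l = 1
check_vi = close(commutator(L_z, L_plus), scale(HBAR, L_plus))   # [Lz, L+] = +hbar L+
```

All three flags evaluate to True for this representation; the same helpers can be reused for other values of l once the corresponding ladder matrices are supplied.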

3 QUANTUM MECHANICAL EQUATIONS AND RELATED CONCEPTS

In this section a relativistic Hamiltonian is studied with reference to the Klein-Gordon and Dirac equations. The energy eigenvalues of free as well as charged particles are obtained in both cases. Using the Hamiltonian of a free Dirac particle, the spin and angular momentum operators (denoted S and J) are defined. The relation between the solutions of the Klein-Gordon and Dirac equations is shown, and the Feynman-Gell-Mann reduction is applied to the Dirac equation of a charged particle in an electromagnetic field.


3.1 Hamiltonian in a Relativistic Field, and Klein-Gordon Equation

Consider the classical Hamiltonian $H$ of a free particle:

$$H = (c^2p^2 + m_0^2c^4)^{1/2} \qquad (9.3.1)$$

where $m_0$ is the rest mass and $p$ is the relativistic momentum:

$$p = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} m_0 v \qquad (9.3.2)$$

Now the transition from a given classical system to a quantum mechanical system can be effected in more than one way. For instance, using the RHS of (9.3.1) in (9.2.10) we have:

$$(c^2p^2 + m_0^2c^4)^{1/2}\,\Psi(r, t) = i\hbar\frac{\partial}{\partial t}\Psi(r, t) \qquad (9.3.3)$$

This (quantum mechanical) equation, however, is not of much use due to the absence of symmetry in the space and time coordinates. The computations based on this equation are unwieldy. Dirac circumvented this situation (absence of symmetry) by suggesting an alternative method, which we shall explain below. But before that we give an outline of another useful procedure which leads to the well-known equation (9.3.4). We square both sides of (9.3.1) before operating on $\Psi(r, t)$. We replace $H$ with $i\hbar\,\partial/\partial t$ and write $c^2p^2$ as $(-\hbar^2c^2\nabla^2)$ to obtain the quantum mechanical equation:

$$\left(-\hbar^2c^2\nabla^2 + \hbar^2\frac{\partial^2}{\partial t^2} + m_0^2c^4\right)\Psi(r, t) = 0 \qquad (9.3.4)$$

When $\Psi$ is a scalar (invariant under change of inertial frames), the above equation is the Klein-Gordon equation for a free particle. The familiar form of this equation which one encounters in the literature is:

$$(p_\mu^2 + m_0^2c^2)\,\Psi(x_\nu) = 0 \qquad (\mu, \nu = 1, 2, 3, 4) \qquad (9.3.5)$$

This follows by using:

$$ct = -ix_4 \qquad (9.3.6)$$

and then writing

$$-i\hbar\frac{\partial}{\partial x_\mu} = p_\mu \qquad (\mu = 1, 2, 3, 4) \qquad (9.3.7)$$

in (9.3.4). For a particle of charge $q$ in an electromagnetic field with vector potential $A(r, t)$ and scalar potential $\Phi(r, t)$, the derivative $p_\mu$ is replaced by the gauge covariant derivative:

$$D_\mu = p_\mu - qA_\mu \qquad (9.3.8)$$

where $A_\mu$ is the four-vector potential:

$$A_\mu = \left(A, \frac{i}{c}\Phi\right) \qquad (9.3.9)$$


(See Sec. 3 and Sec. 4 in Chapter 6.) Hence the Klein-Gordon equation for a charged particle in an electromagnetic field is:

$$(D_\mu^2 + m_0^2c^2)\,\Psi(x_\nu) = 0 \qquad (9.3.10)$$

3.2 The Dirac Equation

We now return to Dirac's equation. Dirac expressed the sum of four squares in $c^2p^2 + m_0^2c^4$ as a perfect square by introducing other operators $\alpha$ and $\beta$ independent of $p$ and $m_0$, thus:

$$c^2p^2 + m_0^2c^4 = (c\,\alpha\cdot p + \beta m_0c^2)^2 \qquad (9.3.11)$$

He assumed that $\alpha$ and $\beta$ commute with the operators $p$ and $r$, so that the RHS of (9.3.11) could be written as:

$$\frac{1}{2}c^2(\alpha_j\alpha_k + \alpha_k\alpha_j)\,p_jp_k + m_0c^3(\alpha_j\beta + \beta\alpha_j)\,p_j + m_0^2c^4\beta^2. \qquad (9.3.12)$$

Comparison of (9.3.12) with the LHS of (9.3.11) implied the following identities for $\alpha$ and $\beta$:

$$\alpha_j\alpha_k + \alpha_k\alpha_j = 2\delta_{jk} \qquad (9.3.13a)$$

$$\alpha_j\beta + \beta\alpha_j = 0 \qquad (9.3.13b)$$

$$\beta^2 = 1 \qquad (9.3.13c)$$

This showed that the $\alpha_j$ and $\beta$ anticommute, and have unit squares. Using the operators $\alpha$ and $\beta$, the quantum-mechanical Hamiltonian is:

$$H = c\,\alpha\cdot p + \beta m_0c^2 \qquad (9.3.14)$$

Writing $p = -i\hbar\nabla$ in the above Hamiltonian and substituting this $H$ in (9.2.9), we have the Dirac equation for a free particle:

$$(-i\hbar c\,\alpha\cdot\nabla + \beta m_0c^2)\,\Psi(r, t) = i\hbar\frac{\partial}{\partial t}\Psi(r, t) \qquad (9.3.15)$$

The usual form of this equation in the literature is:

$$(i\gamma_\mu p_\mu + m_0c)\,\Psi(x_\nu) = 0 \qquad (9.3.16)$$

which is obtained by multiplying (9.3.15) on both sides by $\beta$, and then writing

$$\gamma = -i\beta\alpha \quad \text{and} \quad \gamma_4 = \beta \qquad (9.3.17)$$

The operator components ($\gamma_\mu$) satisfy:

$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\delta_{\mu\nu} \qquad (9.3.18)$$

Evidently for each $\mu$, $(\gamma_\mu)^2$ is a unit matrix. In view of (9.3.8) and (9.3.9), the Dirac Hamiltonian for a charged particle in an electromagnetic field is:

$$H = c\,\alpha\cdot D + \beta m_0c^2. \qquad (9.3.19)$$

The Dirac equation now becomes:

$$(i\gamma_\mu D_\mu + m_0c)\,\Psi(x_\nu) = 0 \qquad (9.3.20)$$


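The operators $\alpha_i$, $\beta$ and $\gamma_\mu$ can all be represented by matrices that have real or complex entries. Moreover, since $H$ is Hermitian (see Appendix 9A), these matrices are Hermitian and are therefore square, of order $N$ say; we shall see in Exc. 3.1 that $N$ is 4. Using the matrix representation of Exc. 9.3.1, the Hamiltonian (9.3.14) can be written as:

$$H = c\begin{pmatrix} m_0c & \sigma\cdot p \\ \sigma\cdot p & -m_0c \end{pmatrix} \qquad (9.3.21)$$

where

$$\Psi = \begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix} \qquad (9.3.22)$$

and with appropriate choices (see Exc. 9.3.2), it can be shown that the Dirac equation (9.3.20) consists of four coupled partial differential equations:

(a) $m_0c^2|\psi_1\rangle + cD_3|\psi_3\rangle + c(D_1 - iD_2)|\psi_4\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_1\rangle$

(b) $m_0c^2|\psi_2\rangle + c(D_1 + iD_2)|\psi_3\rangle - cD_3|\psi_4\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_2\rangle$

(c) $-m_0c^2|\psi_3\rangle + cD_3|\psi_1\rangle + c(D_1 - iD_2)|\psi_2\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_3\rangle$

(d) $-m_0c^2|\psi_4\rangle + c(D_1 + iD_2)|\psi_1\rangle - cD_3|\psi_2\rangle = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)|\psi_4\rangle$ $\qquad$ (9.3.23)

We note that if the inertial frame in (9.3.15) is changed, the Dirac equation becomes

$$(-i\hbar c\,\alpha\cdot\nabla' + \beta m_0c^2)\,\Psi'(r', t') = i\hbar\frac{\partial}{\partial t'}\Psi'(r', t') \qquad (9.3.24)$$

where

$$\Psi'(r', t') = \exp\left(-\frac{1}{2}\,\alpha\cdot\hat{V}\tanh^{-1}\frac{v}{c}\right)\Psi(r, t) \qquad (9.3.25)$$

The unit vector $\hat{V}$ here points along the velocity $V$ of the second frame relative to the first ($v = |V|$), and $r'$, $t'$ are related to $r$, $t$ by a Lorentz transformation. From our discussions in Exc. (9.3.2), it follows that the Klein-Gordon and Dirac equations (9.3.10) and (9.3.16) can be written respectively as:

$$(D_\mu^2 + m_0^2c^2)\,|\psi\rangle = 0 \qquad (9.3.26)$$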


$$(i\gamma_\mu D_\mu + m_0c)\,|\psi\rangle = 0\,{}^{8} \qquad (9.3.27)$$

Recall that the Hamiltonian is the observable associated with energy. Therefore if the kets are taken as energy eigenkets and $A_\mu$ is regarded as time independent, then the above equations can be written as:

$$c^2(D^2 + m_0^2c^2)\,|\psi\rangle = (E - q\Phi)^2\,|\psi\rangle \qquad (9.3.28)$$

$$c(\alpha\cdot D + \beta m_0c)\,|\psi\rangle = (E - q\Phi)\,|\psi\rangle \qquad (9.3.29)$$

where $E$ is the energy eigenvalue (see Sec. 9.2). In view of the above discussions, it is evident that beginning with (9.3.26) and (9.3.27), we can revert to Equations (9.3.10) and (9.3.20) in coordinate representation by using the defining equation:

$$\Psi(r, t) = \langle r|\psi\rangle \qquad (9.3.30)$$

(see 9A and Sec. 2). We note that the elements of $\Psi$ are $\Psi_\lambda(r, t) = \langle r|\psi_\lambda\rangle$ and the normalization condition

$$\sum_{\lambda=1}^{4}\langle\psi_\lambda|\psi_\lambda\rangle = 1$$

expressed for bras and kets, in the case of $\{\Psi_\lambda\}$ is given by:

$$\sum_{\lambda=1}^{4}\int \Psi_\lambda^{*}\,\Psi_\lambda\,d^3r = 1 \qquad (9.3.31)$$

Also, as (9.3.27) represents four coupled equations, these four equations can be expressed in coordinate representation using $\Psi_\lambda(r, t) = \langle r|\psi_\lambda\rangle$. For instance, the first equation of (9.3.23) in coordinate representation becomes:

(a) $m_0c^2\,\Psi_1 + cD_3\Psi_3 + c(D_1 - iD_2)\Psi_4 = \left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)\Psi_1$ $\qquad$ (9.3.23)'

In the following we shall discuss an example of momentum representation. From Appendix 9A we know the relation that exists between position and momentum operators. Furthermore, we also know that the coordinate as well as momentum representations can always be obtained for any physical system using the appropriate relations from the set (A.23)-(A.33). To illustrate this, consider a Dirac particle in a spherically symmetric electrostatic field. The Dirac equation (9.3.29) can now be written as:

$$(c\,\alpha\cdot p + \beta m_0c^2)\,|\psi\rangle = [E - q\Phi(r)]\,|\psi\rangle \qquad (9.3.32)$$

Multiplication on the left by $\langle p|$ and the use of (A.25), (A.42) and (A.44) gives us:

$$(c\,\alpha\cdot p + \beta m_0c^2 - E)\,\tilde{\psi}(p) = -q(2\pi\hbar)^{-3/2}\int d^3p'\,F(p' - p)\,\tilde{\psi}(p') \qquad (9.3.33)$$

where $\tilde{\psi}(p') = \langle p'|\psi\rangle$ is the Fourier transform of $\psi(r')$ (see Exc. 9C.7), and $F(p' - p)$ is defined as:

$$F(p' - p) = (2\pi\hbar)^{-3/2}\int d^3r\,\exp[i(p' - p)\cdot r/\hbar]\,\Phi(r) \qquad (9.3.34)$$

Note that in contrast to other equations, e.g., (9.3.19) or (9.3.32), which are differential equations, (9.3.33) is an integral equation for the momentum-space wavefunction $\tilde{\psi}(p)$.

8 It can be easily recognized that (9.3.27) is the consolidated form of (9.3.23).


3.3 Commuting Observables for a Free Relativistic Dirac Particle

Our objective here is to obtain a c.s.c.o. (complete set of commuting observables) for the physical system that describes a free relativistic Dirac particle. Beginning with the Hamiltonian and the orbital angular momentum (which we already know in this case), we write their commutator as:

$$[H, L_j] = [c\,\alpha_kp_k, L_j] = c\,\alpha_k[p_k, L_j] = -i\hbar c\,\alpha_k\,\epsilon_{jkl}\,p_l \qquad (9.3.35)$$

(see (9.3.14) for the expression on the RHS). This gives:

$$[H, L] = -i\hbar c\,\alpha\times p \qquad (9.3.36)$$

showing that $H$ does not commute with $L$. We are looking for a commuting operator, hence we introduce the matrix operator:

$$\Sigma = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma \end{pmatrix} \qquad (9.3.37)$$

and calculate the bracket $[H, \Sigma]$. In view of (9.3.21) it gives:

$$[H, \Sigma] = c\begin{pmatrix} 0 & [\sigma\cdot p, \sigma] \\ [\sigma\cdot p, \sigma] & 0 \end{pmatrix} \neq 0 \qquad (9.3.38)$$

Now using $\sigma_i\sigma_j - \sigma_j\sigma_i = 2i\,\epsilon_{ijk}\,\sigma_k$ (see Eq. (7.3.2)) we can write

$$[\sigma\cdot p, \sigma_j] = [\sigma_i, \sigma_j]\,p_i = 2i\,\epsilon_{ijk}\,\sigma_k\,p_i \qquad (9.3.39)$$

which gives:

$$[H, \Sigma] = 2ic\,\alpha\times p \qquad (9.3.40)$$

In order to find an operator that commutes with $H$, we define two new operators:

$$S = \frac{1}{2}\hbar\Sigma \qquad (9.3.41a)$$

and

$$J = L + S. \qquad (9.3.41b)$$

The operator $J$ commutes with $H$: $[H, J] = 0$. The operators $S$ and $J$ are referred to as the spin and total angular momentum operators (recall that we had a brief exposure to the angular momentum operator $J$ in Sec. 3.7.1 in Chapter 3). These operators satisfy the commutation relations:

$$[S_i, S_j] = i\hbar\,\epsilon_{ijk}\,S_k, \qquad [J_i, J_j] = i\hbar\,\epsilon_{ijk}\,J_k \qquad (9.3.42)$$

From (9.3.41)(a) we have (see hints to exercises 3.7.2 and 3.7.3 for explanations):

$$S^2 = \frac{1}{4}\hbar^2\begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix} = \frac{1}{2}\left(\frac{1}{2} + 1\right)\hbar^2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad (9.3.43)$$


Consequently, the quantum number $s$ in

$$S^2|\psi\rangle = \hbar^2 s(s+1)|\psi\rangle \qquad (9.3.44)$$

is equal to $\frac{1}{2}$. Moreover, if $S_z$ denotes the spin with regard to the z-axis, then

$$S_z^2 = \frac{1}{4}\hbar^2\,\mathbf{1} \qquad (9.3.45)$$

where $\mathbf{1}$ is the unit $4 \times 4$ matrix. Thus the possible eigenvalues of $S_z$ are $\pm\frac{1}{2}\hbar$. As a consequence, the Dirac equation describes particles whose spin angular momentum is $\frac{1}{2}\hbar$. It should be noted that in general an energy eigenket is not an eigenket of $L_z$ or $S_z$, since unlike $J_z = L_z + S_z$ these operators (separately) do not commute with $H$ (see Chapter 3, Subsec. 2.4).
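The spin relations (9.3.40), (9.3.43) and (9.3.45) lend themselves to a direct matrix check. The sketch below is illustrative only: it assumes the block representation of α and β from the hints to Exc. 9.3.1, units with ħ = c = 1, and an arbitrary numerical momentum vector standing in for the eigenvalue of p; all helper names are hypothetical.

```python
HBAR, C, M0 = 1.0, 1.0, 1.0          # assumption: natural units, unit rest mass
P = (0.3, -1.1, 0.7)                 # illustrative momentum eigenvalue

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]       # off-diagonal Pauli blocks
BETA = block(I2, Z2, Z2, scale(-1, I2))
SIGMA4 = [block(s, Z2, Z2, s) for s in SIGMA]      # Eq. (9.3.37)
I4 = block(I2, Z2, Z2, I2)

# Free Hamiltonian (9.3.14) with the operator p replaced by its eigenvalue P.
H = scale(M0 * C ** 2, BETA)
for k in range(3):
    H = add(H, scale(C * P[k], ALPHA[k]))

S = [scale(0.5 * HBAR, s) for s in SIGMA4]         # Eq. (9.3.41a)
S_squared = add(add(matmul(S[0], S[0]), matmul(S[1], S[1])), matmul(S[2], S[2]))


def alpha_cross_p(j):
    """Component j of alpha x P, using cyclic index arithmetic."""
    k, l = (j + 1) % 3, (j + 2) % 3
    return add(scale(P[l], ALPHA[k]), scale(-P[k], ALPHA[l]))


spin_is_half = close(S_squared, scale(0.75 * HBAR ** 2, I4))          # (9.3.43)
sz_squared_ok = close(matmul(S[2], S[2]), scale(0.25 * HBAR ** 2, I4))  # (9.3.45)
commutator_ok = all(close(commutator(H, SIGMA4[j]),
                          scale(2j * C, alpha_cross_p(j))) for j in range(3))  # (9.3.40)
```

The three flags come out True, so in this representation the spin quantum number is 1/2 and H fails to commute with Σ by exactly the amount needed to cancel the orbital contribution (9.3.36).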

3.4 The Relationship Between Free Klein-Gordon and Dirac Particles

We conclude this section by showing the relationship that exists between a free Klein-Gordon particle and a free Dirac particle. For this purpose we consider the Equations (9.3.4) and (9.3.15) and note that the energy and momentum eigenstates of these particles are given in coordinate representation by the corresponding plane wave solutions. For the Klein-Gordon particle they are:

$$\Psi(r, t) = n(P)\exp[i(P\cdot r - Et)/\hbar] \qquad (9.3.46)\,{}^{9}$$

where $P$ and $E$ are the momentum and energy eigenvalues that satisfy the relation:

$$E = \pm(P^2c^2 + m_0^2c^4)^{1/2} \qquad (9.3.47)$$

and $n$ is a normalization constant. When $E$ is negative, the solution $\Psi(r, t)$ represents an anti-particle. In the case of the Dirac equation (9.3.15), the eigenstates of energy and momentum are:

$$\Psi(r, t) = u(P)\exp[i(P\cdot r - Et)/\hbar] \qquad (9.3.48)$$

The $u$ here is a column vector:

$$u = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} \qquad (9.3.49)$$

that satisfies:

$$(c\,\alpha\cdot P + \beta m_0c^2 - E)\,u = 0 \qquad (9.3.50)$$

Written out in full, the above equation represents four linear equations:

$$(m_0c^2 - E)u_1 + 0u_2 + cP_zu_3 + cP_-u_4 = 0$$
$$0u_1 + (m_0c^2 - E)u_2 + cP_+u_3 - cP_zu_4 = 0$$
$$cP_zu_1 + cP_-u_2 - (m_0c^2 + E)u_3 + 0u_4 = 0$$
$$cP_+u_1 - cP_zu_2 + 0u_3 - (m_0c^2 + E)u_4 = 0 \qquad (9.3.51)$$

where $P_\pm = P_x \pm iP_y$. We know that for the existence of non-zero solutions for the $u_i$, the determinant formed by the coefficients must vanish. As can be easily checked, this determinant equals $(E^2 - c^2P^2 - m_0^2c^4)^2$, which shows that the energy and momentum of a free Dirac particle also satisfy (9.3.47). From (9.3.51) it follows that any two of the $u_i$'s (say $u_3$, $u_4$) can be expressed in terms of the other two (say $u_1$, $u_2$). We give below solutions corresponding to $u_2 = 0$ and $u_1 = 0$, denoting them as $u^{(1)}$ and $u^{(2)}$. They are easily seen to be:

$$u^{(1)}(P) = n(P)\begin{pmatrix} E + m_0c^2 \\ 0 \\ cP_z \\ cP_+ \end{pmatrix}, \qquad u^{(2)}(P) = n(P)\begin{pmatrix} 0 \\ E + m_0c^2 \\ cP_- \\ -cP_z \end{pmatrix} \qquad (9.3.52)$$

Here $n(P)$ is the normalization constant (the role of $P$ as an argument in $n(P)$ will soon be clear when we consider the Dirac equation in the rest frame). The solutions to Eq. (9.3.50) given above are the ones for $E > 0$; for $E < 0$, they vanish in the rest frame ($E = -m_0c^2$) of the particle as $P_\pm$ and $P_z$ are zero. For $E < 0$, the non-zero solutions correspond to $u_4 = 0$ and $u_3 = 0$. These are respectively:

$$u^{(3)}(P) = n(P)\begin{pmatrix} -cP_z \\ -cP_+ \\ m_0c^2 - E \\ 0 \end{pmatrix}, \qquad u^{(4)}(P) = n(P)\begin{pmatrix} -cP_- \\ cP_z \\ 0 \\ m_0c^2 - E \end{pmatrix} \qquad (9.3.53)$$

The states corresponding to solution (9.3.53) (since $E < 0$) are associated with anti-particles (see Chapter 5 in Bjorken and Drell).

9 Some of the texts define $E$ only with a plus sign (see, for instance, [33]); we prefer $E$ to stand for a positive as well as a negative eigenvalue of energy.
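The plane-wave spinors can be verified numerically against the coefficient matrix of (9.3.51). This is a hedged sketch rather than part of the original derivation: the units (c = 1), the rest mass and the momentum components are arbitrary illustrative choices, the normalization n(P) is set to 1, and the function names are hypothetical.

```python
import math

C, M0 = 1.0, 1.0                    # assumption: units with c = 1, unit rest mass
PX, PY, PZ = 0.3, -1.1, 0.7         # illustrative momentum eigenvalues
P_PLUS, P_MINUS = complex(PX, PY), complex(PX, -PY)
MC2 = M0 * C ** 2

# Matrix of c alpha.P + beta m0 c^2; subtracting E on the diagonal gives (9.3.51).
H = [[MC2, 0, C * PZ, C * P_MINUS],
     [0, MC2, C * P_PLUS, -C * PZ],
     [C * PZ, C * P_MINUS, -MC2, 0],
     [C * P_PLUS, -C * PZ, 0, -MC2]]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def residual(u, energy):
    """Largest component of (H - E) u; zero for an exact eigenvector."""
    hu = matvec(H, u)
    return max(abs(hu[i] - energy * u[i]) for i in range(4))


E_POS = math.sqrt(C ** 2 * (PX ** 2 + PY ** 2 + PZ ** 2) + MC2 ** 2)  # (9.3.47), + sign
E_NEG = -E_POS                                                         # (9.3.47), - sign

u1 = [E_POS + MC2, 0, C * PZ, C * P_PLUS]          # (9.3.52), n(P) = 1
u2 = [0, E_POS + MC2, C * P_MINUS, -C * PZ]
u3 = [-C * PZ, -C * P_PLUS, MC2 - E_NEG, 0]        # (9.3.53), n(P) = 1
u4 = [-C * P_MINUS, C * PZ, 0, MC2 - E_NEG]

residuals = [residual(u1, E_POS), residual(u2, E_POS),
             residual(u3, E_NEG), residual(u4, E_NEG)]

# H squared should be (c^2 P^2 + m0^2 c^4) times the unit matrix.
H_squared = [[sum(H[i][k] * H[k][j] for k in range(4)) for j in range(4)]
             for i in range(4)]
h_squared_ok = all(abs(H_squared[i][j] - (E_POS ** 2 if i == j else 0.0)) < 1e-12
                   for i in range(4) for j in range(4))
```

All four residuals vanish to rounding error, and the check on H squared reproduces the statement that a free Dirac particle obeys the same energy-momentum relation (9.3.47) as a Klein-Gordon particle.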

3.5 The Dirac Equation in Rest Frame

We see next that solutions (9.3.52) and (9.3.53) can be obtained by considering the Dirac equation:

$$\beta m_0c^2\,\Psi = E\,\Psi \qquad (9.3.54)$$

in the rest frame. Evidently, as $P$ is zero, from (9.3.48) we have

$$\Psi = u\exp(-iEt/\hbar) \qquad (9.3.55)$$

with $E = \pm m_0c^2$.10 For $E = m_0c^2$, the analogues to (9.3.52) of the column vector $u$ are:

$$u^{(1)}(0) = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad u^{(2)}(0) = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \qquad (9.3.56)$$

10 The assumption of rest frame implies that $P$ is zero; this justifies the use of $u^{(1)}(0)$ in place of $u^{(1)}(P)$, and shows the dependence of the normalization constant $n(P)$ on $P$.


where $E$ and $B$ are the electric and magnetic fields (see Sec. 6.4 and also Exc. 3). Accordingly we obtain:

$$\left[c^{-2}\left(i\hbar\frac{\partial}{\partial t} - q\Phi\right)^2 - (p - qA)^2 - m_0^2c^2 + q\hbar\,(B + ic^{-1}E)\cdot\sigma\right]|\phi\rangle = 0 \qquad (9.3.64)$$

Using the above two-component equations, the Dirac equation (9.3.27) can be solved more easily. From (9.3.61) and (9.3.64) we thus have:

$$|\psi\rangle = \begin{pmatrix} \left[c\,\sigma\cdot(p - qA) + i\hbar\frac{\partial}{\partial t} - q\Phi + m_0c^2\right]|\phi\rangle \\[4pt] \left[c\,\sigma\cdot(p - qA) + i\hbar\frac{\partial}{\partial t} - q\Phi - m_0c^2\right]|\phi\rangle \end{pmatrix} \qquad (9.3.65)$$

Exercise 9.3
1. Obtain a matrix representation of the operators ($\alpha_j$), $\beta$ and $\gamma_\mu$ that are used in the Dirac equations (9.3.15) and (9.3.16), and show that using these, the Hamiltonian can be expressed as in (9.3.21).
2. Show that using the Dirac ket and bra, the Dirac equations can be written as four different coupled partial differential equations given in (9.3.23).
3. Show that for a charged particle in an external electromagnetic field, the Gamma matrices ($\gamma_\mu$) and covariant derivatives ($D_\mu$) satisfy (9.3.63).

Hints to Exercise 9.3

1. Each of these operators is linear, hence they can be represented by matrices with real or complex entries. However, since $H$ is Hermitian, these matrices have to be Hermitian. In view of the Hermitian property (see [34]), they are square matrices of order $N$ (say). Therefore from (9.3.13)(a) and (b) it follows that their determinants satisfy:

(i) $|\alpha_j||\alpha_k| = (-1)^N|\alpha_k||\alpha_j| \qquad (j \neq k)$

(ii) $|\alpha_j||\beta| = (-1)^N|\beta||\alpha_j|$

And since neither $|\alpha_j|$ nor $|\beta|$ is zero, the order $N$ has to be even. Moreover, from the property of unitary matrices, we have that if $\alpha_j$ and $\beta$ satisfy (9.3.13), then so do $U^\dagger\alpha_jU$ and $U^\dagger\beta U$ for any unitary matrix $U$. This means that one of these matrices can be diagonalized. We take that to be $\beta$; from (9.3.13)(c) the eigenvalues of $\beta$ are $\pm 1$, hence:

(iii) $\beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

We write $\alpha_j$ as:

(iv) $\alpha_j = \begin{pmatrix} a_j & b_j \\ c_j & d_j \end{pmatrix}$

and since $\beta$ and $\alpha_j$ are of order $N$, the matrix elements in them, i.e., in (iii) and (iv), are of order $\frac{1}{2}N$. Substituting these in (9.3.13)(a) and (b), we find that $a_j = d_j = 0$, and

(v) $b_jc_k + b_kc_j = c_jb_k + c_kb_j = 2\delta_{jk}$.

For $j = k$ it follows from (v) that $c_j = b_j^{-1}$. For $j \neq k$, the equality $b_jc_k + b_kc_j = 0 = c_jb_k + c_kb_j$ cannot be satisfied for real or complex numbers, i.e., it cannot be satisfied for $N = 2$. For $N = 4$, the $b_j$, etc., are $(2 \times 2)$ matrices and they satisfy:

(vi) $b_j = l\sigma_j, \qquad c_j = m\sigma_j$

where the $\sigma_j$ are Pauli matrices and the constants $l$, $m$ obey:

(vii) $lm = 1$.

If we choose $l = 1$, then we have:

(viii) $\alpha = \begin{pmatrix} 0 & \sigma \\ \sigma & 0 \end{pmatrix}$

and in view of (9.3.17)

(ix) $\gamma = \begin{pmatrix} 0 & -i\sigma \\ i\sigma & 0 \end{pmatrix}$

We substitute the values of $\alpha$ and $\beta$ in $H = c\,\alpha\cdot p + \beta m_0c^2$ and obtain:

(x) $H = c\begin{pmatrix} 0 & \sigma \\ \sigma & 0 \end{pmatrix}\cdot p + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}m_0c^2 = c\begin{pmatrix} m_0c & \sigma\cdot p \\ \sigma\cdot p & -m_0c \end{pmatrix}$
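The representation found in (iii), (viii) and (ix) can be confirmed by direct multiplication. The following is a small sketch assuming the standard Pauli matrices; the helper names are illustrative and not taken from the text. It checks the defining identities (9.3.13a)-(9.3.13c) and the relation (9.3.18) for the matrices built via (9.3.17).

```python
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


I4 = block(I2, Z2, Z2, I2)
ZERO4 = scale(0, I4)
BETA = block(I2, Z2, Z2, scale(-1, I2))                     # hint (iii)
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]                # hint (viii)
GAMMA = [scale(-1j, matmul(BETA, a)) for a in ALPHA] + [BETA]  # Eq. (9.3.17)

# (9.3.13a): alpha_j alpha_k + alpha_k alpha_j = 2 delta_jk
alpha_ok = all(close(anticommutator(ALPHA[j], ALPHA[k]),
                     scale(2 if j == k else 0, I4))
               for j in range(3) for k in range(3))
# (9.3.13b) and (9.3.13c)
alpha_beta_ok = all(close(anticommutator(a, BETA), ZERO4) for a in ALPHA)
beta_ok = close(matmul(BETA, BETA), I4)
# (9.3.18): gamma_mu gamma_nu + gamma_nu gamma_mu = 2 delta_mu_nu
gamma_ok = all(close(anticommutator(GAMMA[m], GAMMA[n]),
                     scale(2 if m == n else 0, I4))
               for m in range(4) for n in range(4))
# Hint (ix): gamma_j has blocks [[0, -i sigma_j], [i sigma_j, 0]]
gamma_form_ok = all(close(GAMMA[j],
                          block(Z2, scale(-1j, SIGMA[j]), scale(1j, SIGMA[j]), Z2))
                    for j in range(3))
```

Every flag evaluates to True, so the 4 x 4 matrices of hint 1 satisfy all the algebraic conditions that Dirac's factorization (9.3.11) requires.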

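2. Recall that the Dirac equations (9.3.16) and (9.3.20) for a free particle and for a particle with charge are written by treating all variables $x_\mu$ ($\mu = 1, 2, 3, 4$) in the same manner. To establish the results of this exercise, we treat time as a parameter, and the space and momentum coordinates as operators along with the operators $\alpha$ and $\beta$. These operators are then the fundamental dynamical variables of a given system. We note that the use of these operators allows us to write the equation explicitly in terms of Pauli matrices and the Dirac kets:

$$\begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix}$$

We use (iii), (viii) and (ix) of Exc. (3.1) and (9.3.6)-(9.3.9) to make the required substitutions in (9.3.20) and obtain the matrix form (i) below:

(i) $i\left[\begin{pmatrix} 0 & -i\sigma_1 \\ i\sigma_1 & 0 \end{pmatrix}D_1 + \begin{pmatrix} 0 & -i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix}D_2 + \begin{pmatrix} 0 & -i\sigma_3 \\ i\sigma_3 & 0 \end{pmatrix}D_3 + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}D_4\right]|\psi\rangle + m_0c\,|\psi\rangle = 0$

Introducing the standard Pauli matrices:

(ii) $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

we simplify the above equation into its explicit four-component form:

(iii) $\begin{pmatrix} iD_4 & 0 & D_3 & D_1 - iD_2 \\ 0 & iD_4 & D_1 + iD_2 & -D_3 \\ -D_3 & -D_1 + iD_2 & -iD_4 & 0 \\ -D_1 - iD_2 & D_3 & 0 & -iD_4 \end{pmatrix}\begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix} + m_0c\begin{pmatrix} |\psi_1\rangle \\ |\psi_2\rangle \\ |\psi_3\rangle \\ |\psi_4\rangle \end{pmatrix} = 0$

Note that each $D_\mu = p_\mu - qA_\mu$; in particular $D_4 = -i\hbar\frac{\partial}{\partial x_4} - q\frac{i}{c}\Phi$. We replace $D_4$ by this expression and change $\frac{\partial}{\partial x_4}$ to $\frac{1}{ic}\frac{\partial}{\partial t}$. Multiplication of the matrix with the column vector then leads to equations (9.3.23). Evidently they are coupled, since we see here partial derivatives with regard to all space variables operating on all different Dirac kets.

3. We use the fact that any product $ab$ can be put as

(i)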

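$ab = \frac{1}{2}(\{a, b\} + [a, b])$

to write:

(ii) $\gamma_\mu\gamma_\nu D_\mu D_\nu = \frac{1}{2}\gamma_\mu\gamma_\nu(\{D_\mu, D_\nu\} + [D_\mu, D_\nu])$.

Since $\gamma_\mu$, $\gamma_\nu$ satisfy (9.3.18), (ii) becomes:

(iii) $\gamma_\mu\gamma_\nu D_\mu D_\nu = \frac{1}{2}\{\gamma_\mu, \gamma_\nu\}D_\mu D_\nu + \frac{1}{2}\gamma_\mu\gamma_\nu[D_\mu, D_\nu]$

In view of (9.3.6) and (9.3.8), $[D_\mu, D_\nu]$ can be written as:

(iv) $[D_\mu, D_\nu] = iq\hbar\left(\frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu}\right)$

As a result:

(v) $\gamma_\mu\gamma_\nu D_\mu D_\nu = D_\mu^2 + \frac{1}{2}iq\hbar\,\gamma_\mu\gamma_\nu\left(\frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu}\right)$

But

(vi) $[\gamma_i, \gamma_4] = -i\beta\alpha_i\beta + i\beta\beta\alpha_i = 2i\alpha_i$

and

(vii) $[\gamma_i, \gamma_j] = -\beta\alpha_i\beta\alpha_j + \beta\alpha_j\beta\alpha_i = \alpha_i\alpha_j - \alpha_j\alpha_i = \begin{pmatrix} \sigma_i\sigma_j - \sigma_j\sigma_i & 0 \\ 0 & \sigma_i\sigma_j - \sigma_j\sigma_i \end{pmatrix} = 2i\,\epsilon_{ijk}\Sigma_k$

(We have used here the equalities (9.3.13), (9.3.17) and (9.3.37) and equality (viii) of Exc. 1.) Now the magnetic field for a particle of charge $q$ is given as $B = \nabla \times A$, or componentwise as:

(viii) $B_k = \epsilon_{kij}\frac{\partial A_j}{\partial x_i}$

Hence substituting the expressions given in (vi), (vii) and (viii) into (v) and using the product of antisymmetric tensors appropriately, we have the required equality (9.3.63) in the composite form:

$$\gamma_\mu\gamma_\nu D_\mu D_\nu = D_\mu^2 - q\hbar\,\Sigma\cdot B + ic^{-1}q\hbar\,\alpha\cdot E$$

4 GAUGE FIELD QUANTIZATIONS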

We had a brief exposure to quantum theory techniques in the previous sections and appendices and are now in a position to follow it up with the study of gauge field quantizations. Since gauge fields are geometric in nature (connections on a fiber bundle), they cannot be quantized by standard methods, as those methods lead to difficulties and contradictions. The methods that overcame this problem were suggested for the first time in 1963 by Feynman [13b] for Yang-Mills theory and were improved upon in 1967 by DeWitt [36], and in a separate paper by Faddeev and Popov [10]. In our study here (to a large extent), we follow the approach of Faddeev and Popov (FP), which uses the functional integral formalism. Our main emphasis, however, is on learning the Feynman graph techniques, since it is these techniques that are used (in removing the unwanted divergences) in string theory and in super theories in general. The integrals in the FP approach are calculated over the surfaces of the manifolds of all gauge fields. These gauge fields are represented as points on the surface by their respective classes. Recall that two gauge fields belong to one and the same class if one is a gauge transform of the other (e.g., $A_\mu$ and $A_\mu + \partial_\mu\Lambda$ in electrodynamics (see Sec. 6.4)). Hence all gauge-equivalent fields correspond to one single point on the surface.


To understand the quantization of gauge fields, we shall begin with the formulation of rules of quantization for fields of general nature, e.g., scalar fields or Bose and Fermi fields. In the process we shall introduce the notion of functional integral for fields. The Green's function that plays an important role in quantum theory (see Appen. 9C) will now be seen as a functional integral. The generating function of a field, and the propagator or the Feynman's Green function will be defined, and the functional integral for Bose and Fermi fields will be obtained. We would like to mention here that an important tool of quantum mechanics—the path integral formalism which should chronologically precede the discussions here is relegated to the next section. Due to its applicability in areas of physics other than quantum theories, e.g., string theories, we feel the topic deserves a separate section. We introduce it there from first principles and show (in Sec. 6) how it leads to Feynman graphs.

4.1 Feynman's Functional Integral

Consider a classical action:

$$S(t_0, t) = \int_{t_0}^{t}\left(p(\tau)\dot{q}(\tau) - H(q(\tau), p(\tau))\right)d\tau = \int_{t_0}^{t} I\,d\tau \qquad (9.4.1)$$

that corresponds to the trajectory $(q(\tau), p(\tau))$ ($t_0 \le \tau \le t$) defined in the phase space, where $p$ is (as usual) the momentum canonically conjugate to $q$. To determine $S(t_0, t)$ we take the mean value over the intermediary trajectories (to be defined shortly). This mean value, known as Feynman's functional integral, is defined as a limit of the finite-dimensional integrals obtained from the given trajectory in the following manner (see also Sec. 5). The interval $[t_0, t]$ is divided into $N$ equal parts by the points $\tau = \tau_1, \tau_2, \ldots, \tau_{N-1}$. The momentum function $p(\tau)$, assumed to be constant in each of these intervals $(\tau_i, \tau_{i+1})$ ($i = 0, 1, \ldots, N - 1$), is denoted as $p_{i+1}$. The coordinate function $q(\tau)$, on the other hand, is viewed as a distinct continuous linear function $q_{i+1}$ in each of these different intervals. At the end points $t_0$ and $t$, $q(\tau)$ is assumed to be the fixed number $q_0$ and $q$ respectively. The trajectory $(q(\tau), p(\tau))$ is thus replaced by $N$ distinct trajectories; in other words it is defined by the parameters $q_1, \ldots, q_N, p_1, \ldots, p_N$, and since $q_N$ is the constant $q$ by our assumption, the integral (9.4.1) is replaced by a $(2N - 1)$ finite-dimensional integral:

$$\int dp_1\,dq_1 \cdots dp_{N-1}\,dq_{N-1}\,dp_N\; I \qquad (9.4.2)$$

where the integrand $I$ is suitably altered in terms of these parameters. The limit of this integral as $N \to \infty$ is the required mean value (Feynman's functional integral). We are particularly interested in the mean value of the finite-dimensional integral obtained by using the exponent of $S(t_0, t)$:

$$(2\pi)^{-N}\int dp_1\,dq_1 \cdots dq_{N-1}\,dp_N\,\exp(iS(t_0, t)) = J_N(q_0, q; t_0, t) \qquad (9.4.3)$$

The limit of this integral as $N \to \infty$ is equal to the matrix element11 of the evolution operator $U(t, t_0) = \exp(-i(t - t_0)H)$:

$$\lim_{N\to\infty} J_N(q_0, q; t_0, t) = \langle q|\exp(-i(t - t_0)H)|q_0\rangle \qquad (9.4.4)$$

11 See the definition of matrix element in the Hint to Exc. (9.4.1).


(see Eq. (9B.19), Exc. 9B.1 and Sec. 5; we have taken $\hbar = 1$ in (9.4.4)). In Exc. (4.1) we shall prove for a simple case the validity of the assertion made in (9.4.4). The functional integral obtained above is denoted symbolically as:

$$\int \exp(iS(t_0, t))\prod_{\tau}\frac{dp(\tau)\,dq(\tau)}{2\pi} \qquad (9.4.5)$$

By definition this is the functional integral expression of the evolution operator matrix element. We shall derive it in Sec. 5 using the operator formalism.
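The time-slicing prescription can be made concrete for the integrand of (9.4.1). The sketch below is an illustration under stated assumptions, not the construction used in the text: it takes a free particle with H = p²/2m (so that H depends on p only), a piecewise-constant momentum and a piecewise-linear coordinate as described above, and evaluates the discretized action on two sample paths. All numerical values are arbitrary.

```python
def discretized_action(q_points, p_values, t0, t, mass):
    """Sum over slices of p_i * (q_i - q_{i-1}) - H(p_i) * dtau for H = p^2 / (2 m).

    q_points holds q_0, q_1, ..., q_N (N + 1 values); p_values holds p_1, ..., p_N.
    With p constant and q linear on each slice, the integral of p * dq/dtau over a
    slice is exactly p_i * (q_i - q_{i-1}).
    """
    n = len(p_values)
    dtau = (t - t0) / n
    action = 0.0
    for i in range(n):
        dq = q_points[i + 1] - q_points[i]
        action += p_values[i] * dq - (p_values[i] ** 2 / (2.0 * mass)) * dtau
    return action


MASS = 1.5                     # illustrative values
T0, T1 = 0.0, 2.0
Q0, Q1 = 0.2, 1.7
N = 50
DTAU = (T1 - T0) / N

# Classical straight-line path with constant momentum p = m (q - q0) / (t - t0).
classical_q = [Q0 + (Q1 - Q0) * i / N for i in range(N + 1)]
classical_p = [MASS * (Q1 - Q0) / (T1 - T0)] * N
classical_action = discretized_action(classical_q, classical_p, T0, T1, MASS)
closed_form = MASS * (Q1 - Q0) ** 2 / (2.0 * (T1 - T0))

# A perturbed path with the same end points; p_i is set to m * dq_i / dtau on each slice.
perturbed_q = list(classical_q)
perturbed_q[N // 2] += 0.3
perturbed_p = [MASS * (perturbed_q[i + 1] - perturbed_q[i]) / DTAU for i in range(N)]
perturbed_action = discretized_action(perturbed_q, perturbed_p, T0, T1, MASS)
```

On the straight-line path the discretized sum reproduces the closed form m(q - q0)²/(2(t - t0)), while the perturbed path gives a larger value; the finite-dimensional integral (9.4.3) sums exp(iS) over all such choices of the parameters.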

4.2 Functional Integral of a Scalar Field

In order to study the quantization of fields, we shall begin by obtaining the functional integral for a scalar field 0 with self-interaction. The action integral here is: (9.4.6) The field functions / and S, corresponds to the f

g

integral of I - — 0

,\

that describes the self-interaction with coupling constant $g$. To define the functional integral over all fields, we use the finite-dimensional approximation as described below. We take a large cubic volume $V$ embedded in the space $V_4$ and divide it into $N^4$ equal small cubes $v_i$ ($i = 1, 2, \ldots, N^4$). We then approximate the function $\phi(x)$ in the volume $V$ by treating it as a constant function in the $v_i$'s. We assume that the first derivative $\partial\phi/\partial x^\mu$ is the finite difference:

$$\frac{1}{\Delta l}\left[\phi(x^\nu + \delta^\nu_\mu\,\Delta l) - \phi(x^\nu)\right] \qquad (9.4.7)$$

where $\Delta l$ is the length of the edge of the cube $v_i$. The function $\phi(x)$ is approximated by values of piecewise constant functions in the volumes $v_i$. Using this approximation rule, we consider the finite-dimensional integral

$$\int \exp(iS)\prod_{i=1}^{N^4}\prod_{x \in v_i} n(x)\,d\phi(x) \qquad (9.4.8)$$

over the values of the function $\phi(x)$ in the volumes $v_i$. The action $S$ involves these approximated values of $\phi(x)$. As $V \to \infty$ and $v_i \to 0$, the integral (9.4.8) is of the form $\exp(cV)$ with $c$ being a $V$-independent constant. Usually $n(x)$ is of the type:

$$n(x) = K(\Delta l)^k \qquad (9.4.9)$$

where $K$ and $k$ are constants independent of $x$. The only $x$-dependence of $n(x)$ is reflected by the presence of $\Delta l$. We shall use this formulation to write down the Green's function as a functional integral.

12 Note that we have used elsewhere $\eta_{\mu\nu} = (-1, +1, +1, +1)$ for the Minkowskian metric.


4.3 Green's Function and Generating Functional

The Green's function is the expectation value of the product of two or more field functions weighted by $\exp(iS)$. In the case of two fields, it is the two-point function defined by the formula (9.4.10) below:

$$G(x, y) \equiv -i\langle\phi(x)\phi(y)\rangle = -i\lim_{\substack{V\to\infty \\ v_i\to 0}}\frac{\int \exp(iS)\,\phi(x)\phi(y)\prod_{i=1}^{N^4}\prod_{x\in v_i} n(x)\,d\phi(x)}{\int \exp(iS)\prod_{i=1}^{N^4}\prod_{x\in v_i} n(x)\,d\phi(x)} \qquad (9.4.10)$$

The limit of the expression on the RHS is usually denoted as:

$$-i\,\frac{\int \exp(iS)\,\phi(x)\phi(y)\prod_x n(x)\,d\phi(x)}{\int \exp(iS)\prod_x n(x)\,d\phi(x)} \qquad (9.4.11)$$

Associated with the fields of a physical system are the generating functionals, which are used to determine the Green's function. The generating functional13 in this case, in terms of an arbitrary function $J(x)$,14 is:

$$Z[J] = \frac{\int \exp i\left(S + \int J(x)\phi(x)\,d^4x\right)\prod_x n(x)\,d\phi(x)}{\int \exp(iS)\prod_x n(x)\,d\phi(x)} \qquad (9.4.12)$$

The two-point Green's function is now given by the formula:

$$G(x, y) = i\left.\frac{\delta^2 Z[J]}{\delta J(x)\,\delta J(y)}\right|_{J=0} \qquad (9.4.13)$$

When the component $S_1$ (the interaction part of $S$) is ignored, the calculation of $G(x, y)$ using (9.4.13) reduces to

$$G_0(x, y) = D(x, y) \qquad (9.4.14)$$

where $D(x, y)$ is the solution of the operator equation:

$$-(\Box + m^2)\,D(x, y) = \delta(x - y) \qquad (9.4.15)$$

which we studied in Appendix 9C (see 9C.10 and 9C.25). (See Exc. 2 for the evaluation of $Z[J]$ and the Green's function in the case of a free field theory.) In Exc. 2 we mention the non-uniqueness of the above solution; to circumvent this problem of non-uniqueness, we replace $\exp(iS)$ by $\exp(iS_\varepsilon)$, where $S_\varepsilon$ is a complex action dependent on a nonnegative parameter $\varepsilon$:

$$S_\varepsilon = \frac{1}{2}\int \phi\,(-\Box - m^2 + i\varepsilon)\,\phi\,d^4x \qquad (9.4.16)$$

13 See Sec. (9.5) for the derivation of the generating functional.
14 In Sec. 5 we shall see that $J$ has a physical meaning. $J$ is denoted as $\eta$ in some texts (see, for instance, [26]).


The action $S_\varepsilon$ is chosen in such a manner that the absolute value of $\exp(iS_\varepsilon)$ is less than one and vanishes when $\int \phi^2 d^4x \to \infty$. The Green's function obtained from $S_\varepsilon$ when $\varepsilon \to +0$ is unambiguous, as $D(x, y)$ becomes a limit of the Green's function of the operator $(-\Box - m^2 + i\varepsilon)$. The function $D(x, y)$ depends on the difference $(x - y)$ and is given by:

$$D(x - y) = \frac{1}{(2\pi)^4}\int \frac{d^4k\; e^{ik(x-y)}}{k^2 - m^2 + i\varepsilon} \tag{9.4.17}$$

where $k$ is the momentum four-vector*. The limit of this function for $\varepsilon \to +0$ is denoted as $D_F(x - y)$ and is called the propagator or Feynman's Green function. Essentially we have thus shown that, in the theory of free fields, the expectation value of the product of the fields $\phi(x)$ and $\phi(y)$ (i.e., the two-point function) is:

$$\langle \phi(x)\phi(y)\rangle_0 = iD_F(x - y) \tag{9.4.18}$$

The method of obtaining the Green's function by taking functional derivatives of the generating functional and letting $J$ tend to zero can be generalized to an arbitrary finite number $n$. We then say that the expectation value of the $n$ fields $\phi(x_1), \ldots, \phi(x_n)$ (the Green's $n$-point function) is given by taking $n$ functional derivatives of $Z[J]$ and putting $J = 0$. Accordingly we have:

$$\langle \phi(x_1)\cdots\phi(x_n)\rangle_0 = G_0(x_1, \ldots, x_n) = (-i)^n\,\frac{\delta^n Z[J]}{\delta J(x_1)\cdots\delta J(x_n)}\bigg|_{J=0} = \frac{\int \exp(iS_0)\,\phi(x_1)\cdots\phi(x_n)\prod_x n(x)\,d\phi(x)}{\int \exp(iS_0)\prod_x n(x)\,d\phi(x)} \tag{9.4.19}$$

It is worth noting here that the expectation value in (9.4.19) is zero when $n$ is odd $^{15}$, and for even $n$ it can be expressed as the sum of products of expectation values of pairs, taken over all possible combinations. This result is called Wick's theorem. For $n = 4$ it reads:

$$\langle \phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\rangle = \langle\phi(x_1)\phi(x_2)\rangle\langle\phi(x_3)\phi(x_4)\rangle + \langle\phi(x_1)\phi(x_3)\rangle\langle\phi(x_2)\phi(x_4)\rangle + \langle\phi(x_1)\phi(x_4)\rangle\langle\phi(x_2)\phi(x_3)\rangle \tag{9.4.20}$$
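The combinatorics of Wick's theorem can be illustrated with a short sketch. It enumerates every way of splitting the points into pairs and sums the products of pair expectation values, reproducing the three terms of (9.4.20) for four points. The function names and the stand-in `toy_two_point` are assumptions made for illustration only; they are not the propagator $iD_F$ of the text.

```python
from math import prod


def pairings(points):
    """Yield every way of splitting `points` into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick_expectation(points, two_point):
    """Sum over all pairings of the product of pair expectation values.

    Returns 0 for an odd number of fields, as stated in the text.
    """
    if len(points) % 2 == 1:
        return 0.0
    return sum(prod(two_point(a, b) for a, b in p) for p in pairings(points))


def toy_two_point(a, b):
    # Hypothetical stand-in for <phi(a) phi(b)>_0; any symmetric function works.
    return 1.0 / (1.0 + (a - b) ** 2)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.5, 4.0]
    for p in pairings(xs):
        print(p)  # the three pairings that appear in (9.4.20)
    print(wick_expectation(xs, toy_two_point))
```

For $2k$ fields the number of pairings is $(2k-1)!! = 1 \cdot 3 \cdots (2k-1)$, so the enumeration grows quickly; the sketch is meant for small $n$ only.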

Next we return to the full action $S = S_0 + S_I$ $^{16}$ in order to compute the contribution of $\exp(iS_I)$ in $\exp(iS)$, where:

$$\exp(iS) = \exp(iS_0)\exp(iS_I) \tag{9.4.21}$$

* To follow the usual practice in the literature, the 4-vectors $k$, $x$, $y$, etc. are not denoted by bold letters here.
$^{15}$ See Sec. 5 as to why the expectation value for odd $n$ is zero.
$^{16}$ $S_I = -\dfrac{g}{3!}\displaystyle\int \phi^3(x)\,d^4x$


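This is done by using the perturbation theory technique $^{17}$, which is based on the expansion of $\exp(iS_I)$ as a series in $g$:

$$\exp(iS_I) = \sum_{n=0}^{\infty} \frac{(-ig)^n}{n!\,(3!)^n}\int \phi^3(x_1)\cdots\phi^3(x_n)\; d^4x_1\cdots d^4x_n \tag{9.4.22}$$

This series can be integrated term by term; hence when we substitute the result in the 2-point Green's function (9.4.10), we obtain, after using (9.4.11), the following expression:

$$G(x, y) = -i\,\frac{\displaystyle\sum_{n=0}^{\infty}\frac{(-ig)^n}{n!\,(3!)^n}\int \exp(iS_0)\,\phi(x)\phi(y)\,\phi^3(x_1)\cdots\phi^3(x_n)\; d^4x_1\cdots d^4x_n \prod_x n(x)\,d\phi(x)}{\displaystyle\sum_{n=0}^{\infty}\frac{(-ig)^n}{n!\,(3!)^n}\int \exp(iS_0)\,\phi^3(x_1)\cdots\phi^3(x_n)\; d^4x_1\cdots d^4x_n \prod_x n(x)\,d\phi(x)} \tag{9.4.23}$$

If both the numerator and the denominator of (9.4.23) are divided by

$$\int \exp(iS_0)\prod_x n(x)\,d\phi(x) \tag{9.4.24}$$

then each term of the denominator stands for the expectation value:

$$\langle \phi^3(x_1)\cdots\phi^3(x_n)\rangle_0 = \frac{\int \exp(iS_0)\,\phi^3(x_1)\cdots\phi^3(x_n)\prod_x n(x)\,d\phi(x)}{\int \exp(iS_0)\prod_x n(x)\,d\phi(x)} \tag{9.4.25}$$

and each term of the numerator for the expectation value:

$$\langle \phi(x)\phi(y)\,\phi^3(x_1)\cdots\phi^3(x_n)\rangle_0 = \frac{\int \exp(iS_0)\,\phi(x)\phi(y)\,\phi^3(x_1)\cdots\phi^3(x_n)\prod_x n(x)\,d\phi(x)}{\int \exp(iS_0)\prod_x n(x)\,d\phi(x)} \tag{9.4.26}$$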
$^{17}$ Given an operator $A_0$ with eigenvalue $e_0$, another operator $A_0 + \alpha B$, where $|\alpha|$ is small, is called a perturbation of $A_0$. The eigenvalues of this new operator that lie near $e_0$ are of great interest, and so are their relations to $B$ and their properties as functions of $\alpha$. In quantum mechanics there are 'formal' series for the perturbed eigenvalues. These series are known as perturbation series and are often given in terms of $\alpha$. (See Chapter 17 in [24].)


4.4 Diagram Technique for Scalar Field Theory

The FP (Faddeev and Popov) procedure is based on constructing the prediagrams associated to a given expectation value. For instance, to every $n$-th order expectation value of the type (9.4.25) there is assigned a diagram made of $n$ pseudo-euclidean points (with three lines jutting out from each of them). These $n$ points are called the vertices of the prediagram. For $n = 4$ this prediagram (showing the vertices also) is:

[Figure: prediagram of four vertices, each with three lines jutting out] (9.4.27)

Similarly (for $n = 4$) the prediagram assigned to the expectation value (9.4.26) is of the form:

[Figure: prediagram of the four three-legged vertices $x_1, x_2, x_3, x_4$ together with the two one-legged points $x$ and $y$] (9.4.28)

Note that we have added here two points (each having one leg) that connect points $x$ and $y$ in $V_4$. These diagrams are symmetric with regard to the permutation of the $n$ points $x_1, \ldots, x_n$ and also with regard to the permutation of the three lines at each point. Hence there is a symmetry group $G_n$ that leaves the prediagram invariant; the order of this group can easily be seen to be $R_n = n!\,(3!)^n$. The symmetry of the prediagram is exhibited in the symmetry of the expectation values (9.4.25) and (9.4.26), as these expectation values remain unaltered under any permutation of their arguments, and under permutation within any of the triplets of the field functions $\phi(x_i)\phi(x_i)\phi(x_i) = \phi^3(x_i)$.

In view of Wick's theorem, we know that the expectation values (9.4.25) and (9.4.26) can be expressed as sums of products of all possible expectation values of field function pairs. If among them the expectation value $\langle\phi(x_i)\phi(x_j)\rangle_0$ is present, we connect the pair of points $x_i$, $x_j$ with a line and thus assign a diagram to every formation of the pair expectation values. The number of lines is equal to the number of pairs and thus equals half the number of field functions. The diagrams that result from prediagrams (9.4.27) and (9.4.28) after using these connecting lines are respectively:

[Figure: diagrams obtained from prediagram (9.4.27)] (9.4.29)

and

[Figure: diagrams obtained from prediagram (9.4.28)] (9.4.30)

In order to obtain the expression corresponding to a diagram one has to integrate the product of the pair expectation values over $x_1, x_2, \ldots, x_n$ and multiply it by the factors $(-ig)^n R_n^{-1}$ and $R_n / r_{n,d}$. Here $r_{n,d}$ is the order of the symmetry group of the diagram constructed from the prediagram by joining its vertices with lines. Since $R_n$ is the order of the symmetry group of the prediagram, the ratio $R_n / r_{n,d}$ gives the number of ways through which a given diagram can be obtained from a prediagram.
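The counting statement $R_n / r_{n,d}$ = (number of ways of obtaining a diagram) can be checked by brute force in the smallest nontrivial case. The sketch below takes the prediagram of type (9.4.25) with $n = 2$ vertices, joins its six legs in all possible ways, and groups the results by the number of self-loops. The choice $n = 2$, and the use of the self-loop count to tell the diagram topologies apart, are assumptions made for this illustration; the classification is adequate only for this small case.

```python
from collections import Counter
from math import factorial


def pairings(legs):
    """Yield all ways of joining the legs in pairs (each pair is one line)."""
    if not legs:
        yield []
        return
    first, rest = legs[0], legs[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail


def prediagram_order(n):
    """R_n = n! (3!)^n, the order of the prediagram symmetry group."""
    return factorial(n) * factorial(3) ** n


n = 2
# A leg is (vertex, slot): vertex 0 or 1, slot 0, 1 or 2.
legs = [(v, s) for v in range(n) for s in range(3)]

# Classify each complete pairing by its number of self-loops
# (lines that start and end on the same vertex).
ways = Counter()
for p in pairings(legs):
    loops = sum(1 for (a, b) in p if a[0] == b[0])
    ways[loops] += 1

R = prediagram_order(n)
for loops, count in sorted(ways.items()):
    # count = R_n / r_{n,d}, so the diagram symmetry order is r_{n,d} = R_n / count
    print(f"self-loops={loops}: ways={count}, r={R // count}")
```

With $R_2 = 72$, the diagram in which all three lines join the two vertices arises in 6 ways ($r = 12$), and the diagram with one loop at each vertex arises in 9 ways ($r = 8$); together they account for all $5!! = 15$ pairings of the six legs.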


Finally, to express a given Green's function or expectation value in terms of these diagrams and vice versa, we need to set up the rules of correspondence between the basic elements of a diagram (the vertices and the lines) and the 'elementary Green's functions' for a pair of points. To achieve this, we assign the Green's function $D_F(x_i - x_j)$ to the line joining the points $x_i$, $x_j$, and a factor of the coupling constant $g$ to every vertex. Since $D_F(x - y)$, in view of (9.4.18), differs from the expectation value by the factor $i$, in effect we have set up the correspondence rule (9.4.31) given below for the expectation value as well.

$$\text{line joining } x_i \text{ and } x_j \;\longleftrightarrow\; D_F(x_i - x_j), \qquad \text{vertex} \;\longleftrightarrow\; g \tag{9.4.31}$$

The expression for a diagram is actually obtained only when the product formed by the contributions that correspond to the elements of the diagram is integrated over the coordinates of the vertices and multiplied by the factor $(i)^{l-n-1}(r_{n,d})^{-1}$, where $l$ is the number of lines, $n$ is the number of vertices, and $r_{n,d}$ is the order of the symmetry group of the diagram.

Before closing our study of diagram techniques in this section, we make three important comments regarding it. We shall return to these discussions again in Sec. 6.

Comment 9.4.1 In the computations of Green's functions only those diagrams enter that are 'connected.' A diagram is said to be connected if it is possible to go from any given vertex of the diagram to any other vertex by moving along the diagram lines.

Comment 9.4.2 The diagram technique in the momentum space is defined by using the Fourier transforms $\tilde\phi(k)$ of the field functions $\phi(x)$:

$$\phi(x) = \frac{1}{(2\pi)^4}\int \exp(ikx)\,\tilde\phi(k)\,d^4k \tag{9.4.32}$$

The expectation values of the type

$$\langle \tilde\phi(k_1)\cdots\tilde\phi(k_n)\rangle \tag{9.4.33}$$

play the role of Green's functions in the momentum space. The correspondence rule (9.4.31) takes the following form in this case (see (9.4.17)):

$$\text{line with momenta } k_1, k_2 \;\longleftrightarrow\; \delta(k_1 + k_2)\,(k_1^2 - m^2 + i\varepsilon)^{-1}, \qquad \text{vertex with momenta } k_1, k_2, k_3 \;\longleftrightarrow\; g\,\delta(k_1 + k_2 + k_3) \tag{9.4.34}$$

The contribution of a particular diagram is now obtained by considering the product of the expressions for all its elements using the rule (9.4.34) and then integrating it over all internal momenta. The multiplication factor here is $\left(\dfrac{i}{(2\pi)^4}\right)^{l-n-1}(r_{n,d})^{-1}$.

Comment 9.4.3 The diagram technique introduced here is based on the functional integral approach, the approach used by Feynman and later by Faddeev and Popov. In most books, however, the operator method is used (see for instance [7]).

In the next example we show how a functional integral, which we have so far seen as an abstract entity, can be transformed into integrals of a (familiar) Hamiltonian form:


Example 9.4.4

Consider the functional integral

$$\int \exp(iS)\prod_x n(x)\,d\phi(x) \tag{9.4.35}$$

given in the denominator of (9.4.11). To write it down in Hamiltonian form we consider the integral whose action functional involves both the field $\phi$ and a momentum variable $\pi$:

$$\int \exp(iS[\phi, \pi])\prod_x n(x)\,d\phi(x)\,d\pi(x) \tag{9.4.36}$$

The action $S[\phi, \pi]$ stands for:

$$S[\phi, \pi] = \int\left(\pi\,\partial_0\phi - \frac{1}{2}\pi^2 - \frac{1}{2}(\nabla\phi)^2 - \frac{m^2}{2}\phi^2 - \frac{g}{3!}\phi^3\right)d^4x \tag{9.4.37}$$

It is easy to note that if $\pi$ is replaced by $\partial_0\phi$ in the above integrand, then $S[\phi, \pi]$ is the action $S$ of (9.4.6). The above action (9.4.37) is of Hamiltonian form with the corresponding function:

$$H = \int\left(\frac{1}{2}\pi^2 + \frac{1}{2}(\nabla\phi)^2 + \frac{m^2}{2}\phi^2 + \frac{g}{3!}\phi^3\right)d^3x \tag{9.4.38}$$

where $\pi(x)$ plays the role of the canonical momentum. We now apply the shift

$$\pi(x) \to \pi(x) + \partial_0\phi(x) \tag{9.4.39}$$

in (9.4.36) (using the expression (9.4.37) for $S[\phi, \pi]$) and note that the integral reduces to the product of the integral (9.4.35) over $\phi$ and the integral

$$\int \exp\left(-\frac{i}{2}\int \pi^2(x)\,d^4x\right)\prod_x d\pi(x) \tag{9.4.40}$$

over $\pi$, which leads to the product of normalization factors. Also, when a Green's function (the expectation value of a product of several fields) is calculated, integrals of the type (9.4.40) appear both in the numerator and in the denominator; hence the integral over $\pi$ cancels out and one has simply to compute the integral over $\phi$. The above process of expressing the functional integral of a scalar field theory in Hamiltonian form by artificially introducing an integral over the canonical momentum $\pi$ is found very useful in proving the Hamiltonian character of given systems of quantum field theory and statistical physics [29].

4.5 Functional Integral Approach to Bose and Fermi Fields

As expected, the functional integral for these two important fields (the Bose and the Fermi) is realized by making the necessary changes in the theory developed above. We describe these changes in brief. In the case of Bose fields, we consider the system where the large cubic volume $V = L^3$ (which we mentioned earlier) is filled with Bose particles and is subjected to periodic boundary conditions. The functional integral in this case is an integral over the space of complex functions (fields) $\psi(x, \tau)$, $\bar\psi(x, \tau)$, where $x \in V$. These functions are periodic in the time parameter $\tau$ with period $\beta$.


The Green's function is defined as the expectation value (in $V$) of the product of several functions $\psi$, $\bar\psi$ with different arguments weighted with $\exp S$ $^{18}$, where $S$ itself is the functional of $\psi$ and $\bar\psi$:

$$S = \int_0^\beta d\tau \int d^3x\; \bar\psi(x, \tau)\,\partial_\tau \psi(x, \tau) - \int_0^\beta H'(\tau)\,d\tau \tag{9.4.41}$$

The functional $S$ represents the action here, and $H'(\tau)$ is the Hamiltonian:

$$H'(\tau) = \int d^3x \left[\frac{1}{2m}\nabla\bar\psi(x, \tau)\,\nabla\psi(x, \tau) - \lambda\,\bar\psi(x, \tau)\psi(x, \tau)\right] + \frac{1}{2}\int d^3x\, d^3y\; v(x - y)\,\bar\psi(x, \tau)\bar\psi(y, \tau)\psi(y, \tau)\psi(x, \tau) \tag{9.4.42}$$

The constant $\lambda$ in (9.4.42) is the chemical potential of the system and $v(x - y)$ is the pair interaction potential of two Bose particles with coordinate vectors $x$ and $y$. If the system under study has a thermodynamical character, i.e., temperature is involved, then the periodicity constant $\beta$ equals $(kT)^{-1}$, $k$ being the Boltzmann constant and $T$ the absolute temperature $^{19}$. The one-particle Green's function is thus:

$$G(x, \tau; x_1, \tau_1) = -\langle \psi(x, \tau)\,\bar\psi(x_1, \tau_1)\rangle = -\frac{\int e^{S}\,\psi(x, \tau)\,\bar\psi(x_1, \tau_1)\,d\bar\psi\,d\psi}{\int e^{S}\,d\bar\psi\,d\psi} \tag{9.4.43}$$

which, as we can see, is the ratio of two functional integrals.

In the case of Fermi fields, functional integrals are defined by using integrals over anticommuting variables $x_i$, $x_i^*$ ($i = 1, \ldots, n$) that obey the rules of a Grassmann algebra $^{20}$ (with a unit element and involution) (see Chapter 7, in particular Appen. 7A):

$$x_i x_j + x_j x_i = 0, \qquad x_i^* x_j^* + x_j^* x_i^* = 0, \qquad x_i x_j^* + x_j^* x_i = 0 \tag{9.4.44}$$

An element of this algebra is given as a polynomial:

$$p(x, x^*) = \sum_{a_i, b_i = 0, 1} c_{a_1 \ldots a_n b_1 \ldots b_n}\; x_1^{a_1}\cdots x_n^{a_n}\,(x_1^*)^{b_1}\cdots(x_n^*)^{b_n} \tag{9.4.45}$$

The coefficients $c_{a_1 \ldots a_n b_1 \ldots b_n}$ are complex numbers. The commutation relations in (9.4.44) imply that $x_i^2 = (x_i^*)^2 = 0$, and therefore the exponents $a_i$ and $b_i$ ($i = 1, \ldots, n$) in (9.4.45) take only the values 0 and 1. The operation of involution on the polynomial $p$ is defined as:

$$p \to p^* = \sum_{a_i, b_i = 0, 1} \bar{c}_{a_1 \ldots a_n b_1 \ldots b_n}\; (x_n)^{b_n}\cdots(x_1)^{b_1}\,(x_n^*)^{a_n}\cdots(x_1^*)^{a_1} \tag{9.4.46}$$

We recall that the integral of an element of this algebra is defined as:

$$\int p(x, x^*)\,dx^*\,dx \equiv \int p(x_1, \ldots, x_n, x_1^*, \ldots, x_n^*)\; dx_1^*\,dx_1 \cdots dx_n^*\,dx_n \tag{9.4.47}$$

subject to the following integration rules:

$$\int dx_i = 0, \qquad \int dx_i^* = 0, \qquad \int x_i\,dx_i = 1, \qquad \int x_i^*\,dx_i^* = 1 \tag{9.4.48}$$

We note that the functional integral for Fermi fields is eventually the limit of the integral on an algebra satisfying the rules (9.4.44)-(9.4.48). Thus, for instance, if $x^* A x = \sum_{i,k} a_{ik}\, x_i^* x_k$ is a quadratic form of the generators $x_i$, $x_i^*$ corresponding to the matrix $A$, then

$$\int \exp(-x^* A x)\,dx^*\,dx = \det A \tag{9.4.49}$$

and

$$\int \exp(-x^* A x + \eta^* x + x^* \eta)\,dx^*\,dx = \det A\,\exp(\eta^* A^{-1} \eta) \tag{9.4.50}$$

The expressions $\eta^* x = \sum_i \eta_i^* x_i$ and $x^* \eta = \sum_i x_i^* \eta_i$ in (9.4.50) are linear forms of the generators $x_i$, $x_i^*$ whose coefficients $\eta_i$ and $\eta_i^*$ anticommute with each other and with the generators (we shall use these ideas in Chapter 11). The reader is advised to see [2] for details.

$^{18}$ Note that for the Green's function in the case of a scalar field theory, the weight is $\exp(iS)$ as opposed to $\exp S$ here.
$^{19}$ The absolute temperature = Celsius (centigrade) temperature + 273°C. (See Waldram [39] for $k$ and $T$.)
$^{20}$ The Grassmann algebra here is an infinite dimensional algebra, although we are taking only a finite number $n$ of generators. Moreover, this number has to be even due to the nature of the description here.
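The Gaussian formula (9.4.49) can be verified mechanically for small matrices. The following sketch implements a minimal Grassmann algebra (elements stored as dictionaries from sorted generator tuples to coefficients), expands $\exp(-x^* A x)$ as a terminating series, and reads off the integral. Two conventions are assumptions of this sketch rather than statements from the text: the numbering of generators ($x_i \to 2i$, $x_i^* \to 2i+1$), and the reading of the rules (9.4.48) as "the integral equals the coefficient of $x_1 x_1^* x_2 x_2^* \cdots x_n x_n^*$", which corresponds to performing the innermost integration first.

```python
from itertools import product


def mul_monomials(a, b):
    """Multiply two monomials given as tuples of generator indices.

    Returns (sign, sorted_tuple), or (0, ()) if a generator repeats.
    """
    if set(a) & set(b):
        return 0, ()
    seq = list(a + b)
    sign = 1
    # Bubble sort; every swap of adjacent generators flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)


def mul(p, q):
    """Product of two algebra elements stored as {monomial: coefficient}."""
    out = {}
    for (ma, ca), (mb, cb) in product(p.items(), q.items()):
        sign, mono = mul_monomials(ma, mb)
        if sign:
            out[mono] = out.get(mono, 0) + sign * ca * cb
    return out


def berezin_gaussian(A):
    """Integral of exp(-x* A x) over dx* dx for an n x n matrix A."""
    n = len(A)
    # Assumed numbering: x_i -> 2i, x*_i -> 2i + 1.
    Q = {}
    for i in range(n):
        for k in range(n):
            sign, mono = mul_monomials((2 * i + 1,), (2 * k,))
            Q[mono] = Q.get(mono, 0) - sign * A[i][k]
    # exp(Q) = sum over orders of Q^order / order!; Q^(n+1) = 0, so the sum stops at n.
    term, total = {(): 1}, {(): 1}
    for order in range(1, n + 1):
        term = {mono: c / order for mono, c in mul(term, Q).items()}
        for mono, c in term.items():
            total[mono] = total.get(mono, 0) + c
    # Assumed convention: the integral picks out the top monomial x_1 x*_1 ... x_n x*_n.
    top = tuple(range(2 * n))
    return total.get(top, 0)


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    print(berezin_gaussian(A))  # compare with det A = 1*4 - 2*3
```

The sketch scales exponentially with $n$ and is intended only as a check of the sign bookkeeping behind (9.4.49).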

Exercise 9.4

1. Show that if the Hamiltonian $H$ in the action (9.4.1) is independent of $p$, then the limit of the integral (9.4.3):

$$\lim\, (2\pi)^{-N}\int dp_1\,dq_1 \cdots dp_{N-1}\,dq_{N-1}\,dp_N\, \exp(iS(t_0, t)) = \langle q|\exp(-i(t - t_0)H)|q_0\rangle$$

equals the matrix element of the evolution operator (the Feynman functional integral). A similar result holds when $H$ is independent of $q$.

2. Evaluate the function $Z[\eta]$ for a free field theory and show further that formula (9.4.13) holds in that case.

Hints to Exercise 9.4

1. From (9.4.1) we note that the action $S(t_0, t)$, when $H$ is a function of $q$ only, simplifies as:

(i) $\quad S(t_0, t) = \displaystyle\int_{t_0}^{t} p(\tau)\,\frac{dq}{d\tau}\,d\tau - \int_{t_0}^{t} H(q)\,d\tau.$

To evaluate the first term on the RHS, we use the fact that $p(\tau)$ takes the constant value $p_i$ in $(\tau_i, \tau_{i+1})$ while $q(\tau)$ is piecewise linear; accordingly we have:

(ii) $\quad S(t_0, t) = p_1(q_1 - q_0) + p_2(q_2 - q_1) + \cdots + p_N(q - q_{N-1}) - \displaystyle\int_{t_0}^{t} H(q)\,d\tau.$


The LHS of (9.4.3) can now be written as Eq. (iii) below:

(iii) $\quad (2\pi)^{-N}\displaystyle\int dp_1\,dq_1 \cdots dp_N\, \exp i\left(p_1(q_1 - q_0) + \cdots + p_N(q - q_{N-1}) - \int_{t_0}^{t} H(q)\,d\tau\right).$

Integration with respect to $p_i$ ($i = 1, \ldots, N$) gives the product of $N$ $\delta$-functions:

(iv) $\quad \delta(q_1 - q_0)\,\delta(q_2 - q_1) \cdots \delta(q - q_{N-1}).$

This allows $\exp\left(-i\int_{t_0}^{t} H(q)\,d\tau\right)$ to be set equal to $\exp(-i(t - t_0)H(q_0))$, which can now be put outside the integral sign. The integration with respect to $q_1, \ldots, q_{N-1}$ eliminates all $\delta$-functions (see (0.4.10)) except $\delta(q_0 - q)$, and hence it leads to the result

(v) $\quad \delta(q_0 - q)\,\exp(-i(t - t_0)H(q_0))$

which is identical to the matrix element of the evolution operator. We recall here that for any linear operator $A$ the scalar product $\langle\phi|A|\psi\rangle$ is the matrix element of $A$ between $|\psi\rangle$ and $|\phi\rangle$.

When $H$ depends on the momentum ($p$), the second term of (ii) becomes:

(vi) $\quad -\displaystyle\int_{t_0}^{t} H(p(\tau))\,d\tau.$

We now integrate (iii) with respect to $q_1, \ldots, q_{N-1}$ first and then with respect to $p_1, \ldots, p_N$, and obtain the expression:

(vii) $\quad \dfrac{1}{2\pi}\displaystyle\int dp\, \exp\{ip(q - q_0) - i(t - t_0)H(p)\}$

equal to the matrix element of the evolution operator for the Hamiltonian $H(p)$.

2. The action $S$ given in (9.4.6) no longer contains the term $-\dfrac{g}{3!}\phi^3$ now. For ease of calculation we denote the other two terms of the integrand as $-\dfrac{1}{2}\phi\,[\Box + m^2]\,\phi$, where $\Box$ is d'Alembert's operator.

Next we apply the shift:

(i) $\quad \phi(x) \to \phi(x) + \phi_0(x)$

in (9.4.12), where we choose $\phi_0(x)$ in such a manner that, while integrating over $\phi$ in the numerator of (9.4.12), the terms linear in $\phi$ vanish $^{21}$. This requires:

(ii) $\quad (\Box + m^2)\,\phi_0(x) = \eta(x).$

We now use our definition of the Green's function (9.4.15), which gives:

(iii) $\quad (-\Box - m^2)\,D(x, y) = \delta(x - y) = \delta(x, y)$

to write down the solution of (ii) (in terms of this function) as:

(iv) $\quad \phi_0(x) = -\displaystyle\int D(x, y)\,\eta(y)\,d^4y.$

$^{21}$ The numerator written out in full is: $\displaystyle\int \exp i\left(\int\left[-\frac{1}{2}\phi\,(\Box + m^2)\,\phi + \eta(x)\phi(x)\right]d^4x\right)\prod_x n(x)\,d\phi(x)$.


This computation finally leads to:

(v) $\quad Z[\eta] = \dfrac{\exp\left\{-\dfrac{i}{2}\displaystyle\int \eta(x)\,D(x, y)\,\eta(y)\,d^4x\,d^4y\right\} \times \displaystyle\int \exp(iS)\prod_x n(x)\,d\phi(x)}{\displaystyle\int \exp(iS)\prod_x n(x)\,d\phi(x)}$

Since

(vi) $\quad Z[\eta] = \exp\left\{-\dfrac{i}{2}\displaystyle\int \eta(x)\,D(x, y)\,\eta(y)\,d^4x\,d^4y\right\}$

the application of (9.4.13) leads to the two-point Green's function:

(vii) $\quad G_0(x, y) = D(x, y).$

Remark An important fact that we would like to emphasize here is that the definition (iii) of $D(x, y)$ is not unique, since it is defined only up to an additive part given by a solution of the homogeneous equation $(-\Box - m^2)f = 0$ (see Eq. (9.4.17)).

5 PATH INTEGRALS

5.1 Path Integral via Operator Formalism

In the previous section we studied the functional integral approach to quantum theory by considering a classical action defined in phase space. The foundations of this approach were laid by Feynman in his famous paper of 1948 [11a] $^{22}$. There he expressed the transition matrix element for a one-dimensional quantum-mechanical system (the transition/probability amplitude (9.2.3)):

$$\langle q', t'|q, t\rangle = \langle q'|e^{-iH(t'-t)}|q\rangle \;\; ^{23} \tag{9.5.1}$$

as a functional integral:

$$N\int [dq]\,\exp\left\{i\int_t^{t'} L(q, \dot q)\,d\tau\right\} \tag{9.5.2}$$

The integral was taken over the function space $q(t)$, and it represented the sum of contributions over all paths that connect $(q, t)$ and $(q', t')$, weighted by the exponential of $i$ times the action. The constant $N$ was used as a normalization factor and $L(q, \dot q)$ stood for the Lagrangian. In this section we shall derive (9.5.2), the Path Integral (PI), from first principles using the operator formalism. For this we divide the interval $(t', t)$ into $n$ equal parts $\delta t = (t' - t)/n$ and write:

$$\langle q'|e^{-iH(t'-t)}|q\rangle = \int dq_1 \cdots dq_{n-1}\; \langle q'|e^{-iH\delta t}|q_{n-1}\rangle\,\langle q_{n-1}|e^{-iH\delta t}|q_{n-2}\rangle \cdots \langle q_1|e^{-iH\delta t}|q\rangle \tag{9.5.3}$$

by using complete sets of eigenstates of the position operator $Q$ (in the Schrodinger picture). From our discussions in Appendix 9B, we know that for very small $\delta t$ we can write $^{24}$:

$$\langle q'|e^{-iH\delta t}|q\rangle = \langle q'|e^{-iH(P, Q)\delta t}|q\rangle = \langle q'|[1 - iH(P, Q)\,\delta t]|q\rangle + O(\delta t)^2 \tag{9.5.4}$$

When $H(P, Q) = \dfrac{P^2}{2m} + V(Q)$, we have:

$$\langle q'|H(P, Q)|q\rangle = \langle q'|\frac{P^2}{2m}|q\rangle + V\!\left(\frac{q' + q}{2}\right)\delta(q' - q) \tag{9.5.5}$$

In writing the second term on the RHS of (9.5.5) we have used the symmetric ordering (Weyl ordering $^{25}$) of operators, and the fact that $\langle q'|q\rangle = \delta(q' - q)$ in view of Eq. (9A.24); we also use the fact that $\delta(q' - q) = \displaystyle\int \frac{dp}{2\pi}\,e^{ip(q' - q)}$ to simplify it further. Using (9A.19) and subsequently (9A.42), we can write the first term $\langle q'|\dfrac{P^2}{2m}|q\rangle$ as:

$$\langle q'|\frac{P^2}{2m}|q\rangle = \int \frac{dp}{2\pi}\,e^{ip(q' - q)}\,\frac{p^2}{2m} \tag{9.5.6}$$

Thus the RHS of (9.5.5) equals:

$$\int \frac{dp}{2\pi}\,e^{ip(q' - q)}\left[\frac{p^2}{2m} + V\!\left(\frac{q' + q}{2}\right)\right] = \int \frac{dp}{2\pi}\,e^{ip(q' - q)}\,H\!\left(p, \frac{q' + q}{2}\right) \tag{9.5.7}$$

and (9.5.4) becomes:

$$\langle q'|e^{-iH\delta t}|q\rangle \simeq \int \frac{dp}{2\pi}\,\exp\left\{ip(q' - q) - i\,\delta t\,H\!\left(p, \frac{q' + q}{2}\right)\right\} \tag{9.5.8}$$

$^{22}$ Some of the ideas here were already discussed earlier, although with a different twist; see for instance Sec. (9.4). We shall refer to them as we go along.
$^{23}$ Note that $|q\rangle$ on the RHS is an eigenstate of the position operator $Q$ in the Schrodinger picture, and $|q, t\rangle$ on the LHS, which equals $e^{iHt}|q\rangle$, denotes the state in the Heisenberg picture (see (9.2.51)-(9.2.54)).
$^{24}$ The variables $Q$ and $P$ in the Hamiltonian $H(P, Q)$ are position and momentum operators with eigenstates $|q\rangle$ and $|p\rangle$ respectively.
$^{25}$ Weyl ordering of (products of) operators: $XP = PX \to \frac{1}{2}(XP + PX)$, $\;X^2P \equiv XPX \to \frac{1}{3}(X^2P + XPX + PX^2)$.


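(Note that $H(p, q)$ is the classical Hamiltonian.) Substituting it in (9.5.3), we obtain after simplification (with $q_0 = q$ and $q_n = q'$):

$$\langle q'|e^{-iH(t'-t)}|q\rangle \cong \int \frac{dp_1}{2\pi}\cdots\frac{dp_n}{2\pi}\int dq_1 \cdots dq_{n-1}\, \exp\left\{i\sum_{i=1}^{n}\left[p_i(q_i - q_{i-1}) - \delta t\,H\!\left(p_i, \frac{q_i + q_{i-1}}{2}\right)\right]\right\} \tag{9.5.9}$$

In view of this, the transition amplitude can be symbolically written as:

$$\langle q'|e^{-iH(t'-t)}|q\rangle = \int \left[\frac{dp\,dq}{2\pi}\right]\exp\left\{i\int_t^{t'} d\tau\,(p\dot q - H(p, q))\right\} \equiv \lim_{n\to\infty}\int \frac{dp_1}{2\pi}\cdots\frac{dp_n}{2\pi}\int dq_1 \cdots dq_{n-1}\, \exp\left\{i\sum_{i=1}^{n}\delta t\left[p_i\,\frac{q_i - q_{i-1}}{\delta t} - H\!\left(p_i, \frac{q_i + q_{i-1}}{2}\right)\right]\right\} \tag{9.5.10}$$

The extreme RHS of (9.5.10) defines the path integral. In order to bring it in line with (9.5.2), we now calculate the momentum-space part $\left[\dfrac{dp}{2\pi}\right]$, i.e. the $\prod_{i=1}^{n}\dfrac{dp_i}{2\pi}$ part, of the path integral. The integrand here is oscillatory; therefore we analytically continue it to the Euclidean space by (formally) treating $(i\,\delta t)$ as real. Then using the Gaussian formula:

$$\int_{-\infty}^{\infty}\frac{dx}{2\pi}\,e^{-ax^2 + bx} = \frac{1}{\sqrt{4\pi a}}\,e^{b^2/4a} \tag{9.5.11}$$

we simplify only that part of (9.5.10) which involves the variable $p_i$; this gives $^{26}$:

$$\int \frac{dp_i}{2\pi}\,\exp\left[-\frac{i\,\delta t}{2m}\,p_i^2 + i p_i(q_i - q_{i-1})\right] = \left(\frac{m}{2\pi i\,\delta t}\right)^{1/2}\exp\left[\frac{im}{2\,\delta t}(q_i - q_{i-1})^2\right] \tag{9.5.12}$$

Substituting this in (9.5.10) we now have:

$$\langle q'|e^{-iH(t'-t)}|q\rangle = \lim_{n\to\infty}\left(\frac{m}{2\pi i\,\delta t}\right)^{n/2}\int \prod_{i=1}^{n-1} dq_i\, \exp\left\{i\sum_{i=1}^{n}\delta t\left[\frac{m}{2}\left(\frac{q_i - q_{i-1}}{\delta t}\right)^2 - V\!\left(\frac{q_i + q_{i-1}}{2}\right)\right]\right\} \tag{9.5.13}$$

$^{26}$ Note that from $H\!\left(p_i, \dfrac{q_i + q_{i-1}}{2}\right) = \dfrac{p_i^2}{2m} + V\!\left(\dfrac{q_i + q_{i-1}}{2}\right)$ in (9.5.10) we have included $\dfrac{p_i^2}{2m}$.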


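As $n \to \infty$, $\delta t \to 0$ and $\dfrac{q_i - q_{i-1}}{\delta t}$ becomes $\dot q$; accordingly we have shown:

$$\langle q', t'|q, t\rangle = \langle q'|e^{-iH(t'-t)}|q\rangle = N\int [dq]\,\exp\left\{i\int_t^{t'} d\tau\left[\frac{m}{2}\dot q^2 - V(q)\right]\right\} \tag{9.5.14}$$

and have thus established the equality of Feynman's path integral to the transition amplitudes of the Heisenberg and Schrodinger pictures. Note that $L(q, \dot q)$ in (9.5.2) is taken to be $\dfrac{m}{2}\dot q^2 - V(q)$ in (9.5.14).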
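The time-slicing idea behind (9.5.3) and (9.5.12) can be tried out numerically in the simplest setting. The sketch below works in Euclidean time, i.e. it follows the text in treating $(i\,\delta t)$ as a real number $\delta\tau$, and sets $V = 0$, so that one slice of the kernel is $\left(m/2\pi\,\delta\tau\right)^{1/2}\exp\left[-m(q_i - q_{i-1})^2/2\,\delta\tau\right]$. It then composes several such slices by integrating over the intermediate points on a grid and compares the result with the single-step kernel for the full interval. The grid parameters, the choice $V = 0$ and all function names are assumptions of this illustration.

```python
import math


def kernel(q_to, q_from, tau, m=1.0):
    """Euclidean free-particle kernel for one time slice of length tau."""
    return math.sqrt(m / (2.0 * math.pi * tau)) * math.exp(
        -m * (q_to - q_from) ** 2 / (2.0 * tau)
    )


def sliced_kernel(q_to, q_from, tau, n_slices, half_width=8.0, n_grid=201, m=1.0):
    """Compose n_slices short-time kernels by integrating over intermediate points."""
    if n_slices == 1:
        return kernel(q_to, q_from, tau, m)
    dq = 2.0 * half_width / (n_grid - 1)
    grid = [-half_width + j * dq for j in range(n_grid)]
    dtau = tau / n_slices
    # amp[j] holds the amplitude to arrive at grid[j] after the slices done so far.
    amp = [kernel(g, q_from, dtau, m) for g in grid]
    for _ in range(n_slices - 2):
        amp = [
            dq * sum(kernel(g_new, g_old, dtau, m) * a for g_old, a in zip(grid, amp))
            for g_new in grid
        ]
    # The last slice ends at the fixed end point q_to.
    return dq * sum(kernel(q_to, g, dtau, m) * a for g, a in zip(grid, amp))


if __name__ == "__main__":
    exact = kernel(0.7, -0.3, 1.0)
    approx = sliced_kernel(0.7, -0.3, 1.0, n_slices=4)
    print(exact, approx)
```

The agreement of the two numbers reflects the composition property that underlies (9.5.3); with a nonzero potential the slices would carry the extra factor $\exp[-\delta\tau\,V]$ and the equality would hold only in the limit of many slices.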

5.2 Time Ordered Product of Operators

Having derived the Path Integral (9.5.2), we shall next see the advantages of using the PI formalism in quantum field theories. Consider, for instance, a product of two Heisenberg operators:

$$Q_H(t_1)\,Q_H(t_2) \tag{9.5.15}$$

and evaluate the matrix element:

$${}_H\langle q', t'|Q_H(t_1)\,Q_H(t_2)|q, t\rangle_H, \qquad t' > t_1 > t_2 > t \tag{9.5.16}$$

Since $t_1 > t_2$ we can insert complete sets of coordinate basis states and write:

$$\begin{aligned} {}_H\langle q', t'|Q_H(t_1)\,Q_H(t_2)|q, t\rangle_H &= \int dq_1\,dq_2\; {}_H\langle q', t'|Q_H(t_1)|q_1, t_1\rangle_H\; {}_H\langle q_1, t_1|Q_H(t_2)|q_2, t_2\rangle_H\; {}_H\langle q_2, t_2|q, t\rangle_H \\ &= \int dq_1\,dq_2\; q_1 q_2\; {}_H\langle q', t'|q_1, t_1\rangle_H\; {}_H\langle q_1, t_1|q_2, t_2\rangle_H\; {}_H\langle q_2, t_2|q, t\rangle_H \end{aligned} \tag{9.5.17}$$

where we have used the eigenvalue relation:

$$Q_H(t)\,|q, t\rangle = q\,|q, t\rangle \tag{9.5.18}$$

on the RHS for the operator $Q_H(t)$. We further note that each inner product in the integrand represents a transition amplitude and therefore can be written as a path integral. Combining the products we can write (9.5.16) as:

$$\langle q', t'|Q_H(t_1)\,Q_H(t_2)|q, t\rangle = N\int [dq]\; q(t_1)\,q(t_2)\,e^{iS[q]} \qquad (t_1 > t_2) \tag{9.5.19}$$

(Note that we have used the identification $q_1 = q(t_1)$, $q_2 = q(t_2)$ in the above computations and have suppressed $H$ in ${}_H\langle$ and $\rangle_H$.) For $t_2 > t_1$ we can write the matrix element for the operator product $Q_H(t_2)\,Q_H(t_1)$ and simplify using the above argument to obtain:

$$\begin{aligned} \langle q', t'|Q_H(t_2)\,Q_H(t_1)|q, t\rangle &= \int dq_1\,dq_2\; \langle q', t'|Q_H(t_2)|q_2, t_2\rangle\, \langle q_2, t_2|Q_H(t_1)|q_1, t_1\rangle\, \langle q_1, t_1|q, t\rangle \\ &= N\int [dq]\; q(t_2)\,q(t_1)\,e^{iS[q]} \qquad (t_2 > t_1) \end{aligned} \tag{9.5.20}$$


But $q(t_1)$ and $q(t_2)$ are classical quantities; therefore $q(t_2)\,q(t_1)$ in the integrand can be written as $q(t_1)\,q(t_2)$. This shows that the RHS of (9.5.19) and (9.5.20) are the same, which means that the path integral naturally gives the time ordered correlation functions as its moments:*

$${}_H\langle q', t'|T(Q_H(t_1)\,Q_H(t_2))|q, t\rangle_H = N\int [dq]\; q(t_1)\,q(t_2)\,e^{iS[q]} \tag{9.5.21}$$

where the time ordering can be explicitly represented as:

$$T(Q_H(t_1)\,Q_H(t_2)) = \theta(t_1 - t_2)\,Q_H(t_1)\,Q_H(t_2) + \theta(t_2 - t_1)\,Q_H(t_2)\,Q_H(t_1) \tag{9.5.22}$$

An important consequence of the above fact can be recapitulated in the following remark:

Remark 9.5.1 The time ordered product of any set of operators leads to correlation functions in the PI formalism as:

$${}_H\langle q', t'|T(O_1(Q_H(t_1))\,O_2(Q_H(t_2)) \cdots O_n(Q_H(t_n)))|q, t\rangle_H = N\int [dq]\; O_1(q(t_1)) \cdots O_n(q(t_n))\,e^{iS[q]} \tag{9.5.23}$$

so that all the factors in the path integral are c-numbers (classical quantities), i.e., there are no operators any more $^{27}$.

It should be noted that the transition amplitudes in the PI formalism obtained so far have been between coordinate states; in physical applications these have to be computed for physical states. For instance, we would like to find the probability amplitude for a system which is (initially) in a state $|\psi_i\rangle_H$ at time $t_i$ and makes the transition to a state $|\psi_f\rangle_H$ at time $t_f$. By definition the wave function at time $t$ is:

$${}_H\langle q, t|\psi\rangle_H = \psi(q, t) \tag{9.5.24}$$

accordingly we can write:

$${}_H\langle\psi_f|\psi_i\rangle_H = \int dq_f\,dq_i\; {}_H\langle\psi_f|q_f, t_f\rangle_H\; {}_H\langle q_f, t_f|q_i, t_i\rangle_H\; {}_H\langle q_i, t_i|\psi_i\rangle_H \tag{9.5.25}$$

Hence the time ordered correlation functions between such physical states can simply be written as:

$${}_H\langle\psi_f|T(O_1(Q_H(t_1)) \cdots O_n(Q_H(t_n)))|\psi_i\rangle_H = N\int dq_f\,dq_i\; \psi_f^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\; O_1(q(t_1)) \cdots O_n(q(t_n))\,\exp((i/\hbar)\,S[q]) \tag{9.5.26}$$

In particular the expectation value can be computed by using:

$${}_H\langle\psi_i|T(O_1(Q_H(t_1)) \cdots O_n(Q_H(t_n)))|\psi_i\rangle_H = N\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\; O_1(q(t_1)) \cdots O_n(q(t_n))\,\exp((i/\hbar)\,S[q]) \tag{9.5.27}$$

* The matrix element of a time-ordered product between ground states is $G(t_1, t_2) = \langle 0|T\,Q_H(t_1)\,Q_H(t_2)|0\rangle$, whereas the matrix element $\langle q, t|0\rangle = \phi_0(q)\,e^{-iE_0 t} = \phi_0(q, t)$ is the wavefunction for the ground state. Note that the letter $T$ in an equation indicates that the equation is time ordered, i.e. $t_1 > t_2$.
$^{27}$ Note that $O_1(Q_H(t_1))$ is a function of the operator $Q_H(t_1)$ and $O_1(q(t_1))$ is a similar function of $q(t_1)$; hence the latter is a c-number.


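Since the states may not necessarily be normalized (i.e., ${}_H\langle\psi_i|\psi_i\rangle_H$ may not be 1), the expectation value is:

$$\langle T(O_1(Q_H(t_1)) \cdots O_n(Q_H(t_n)))\rangle = \frac{{}_H\langle\psi_i|T(O_1(Q_H(t_1)) \cdots O_n(Q_H(t_n)))|\psi_i\rangle_H}{{}_H\langle\psi_i|\psi_i\rangle_H} = \frac{\displaystyle\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\; O_1(q(t_1)) \cdots O_n(q(t_n))\,\exp((i/\hbar)\,S[q])}{\displaystyle\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\exp((i/\hbar)\,S[q])} \tag{9.5.28}$$

5.3 Correlation Functions Using an External Source J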

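We next see how the PI formalism can be used to generate various correlation functions for a physical system simply by adding terms due to external sources to the original action. The altered action incorporating the source term $J$ is:

$$S[q, J] = S[q] + \int_{t_i}^{t_f} dt\; q(t)\,J(t) \tag{9.5.29}$$

which gives back the original one as:

$$S[q, J]\big|_{J=0} = S[q] \tag{9.5.30}$$

Using the action $S[q, J]$ we have $^{28}$:

$$\langle\psi_i|\psi_i\rangle_J = N\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\,\exp((i/\hbar)\,S[q, J]) \tag{9.5.31}$$

In view of (9.5.30) this gives:

$$\langle\psi_i|\psi_i\rangle_{J=0} = \langle\psi_i|\psi_i\rangle \tag{9.5.32}$$

Hence for a $t_1$ ($t_f > t_1 > t_i$) we obtain the functional derivative:

$$\begin{aligned} \frac{\delta\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)} &= N\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\;\frac{i}{\hbar}\,\frac{\delta S[q, J]}{\delta J(t_1)}\,\exp((i/\hbar)\,S[q, J]) \\ &= N\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\;\frac{i}{\hbar}\,q(t_1)\,\exp((i/\hbar)\,S[q, J]) \end{aligned} \tag{9.5.33}$$

(We have used (9C.77) and (9.5.29) to write the last line.) In view of (9.5.27) this implies:

$$\frac{\delta\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)}\bigg|_{J=0} = N\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\;\frac{i}{\hbar}\,q(t_1)\,\exp((i/\hbar)\,S[q]) = \frac{i}{\hbar}\,\langle\psi_i|Q(t_1)|\psi_i\rangle \tag{9.5.34}$$

$^{28}$ The Heisenberg symbol $H$ on the LHS has been suppressed from now on.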


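The above result can be generalized to any finite number of $t_k$'s satisfying $t_f > t_1, \ldots, t_n > t_i$; for instance, for $t_f > t_1, t_2 > t_i$ we have:

$$\frac{\delta^2\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)\,\delta J(t_2)}\bigg|_{J=0} = N\int dq_f\,dq_i\; \psi_i^*(q_f, t_f)\,\psi_i(q_i, t_i)\int [dq]\left(\frac{i}{\hbar}\right)^2 q(t_1)\,q(t_2)\,\exp((i/\hbar)\,S[q]) = \left(\frac{i}{\hbar}\right)^2\langle\psi_i|T(Q(t_1)\,Q(t_2))|\psi_i\rangle \tag{9.5.35}$$

and in general

$$\frac{\delta^n\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)\cdots\delta J(t_n)}\bigg|_{J=0} = \left(\frac{i}{\hbar}\right)^n\langle\psi_i|T(Q(t_1)\cdots Q(t_n))|\psi_i\rangle \tag{9.5.36}$$

Hence the expectation value (9.5.28) can be expressed as:

$$\langle T(Q(t_1)\cdots Q(t_n))\rangle = \frac{\langle\psi_i|T(Q(t_1)\cdots Q(t_n))|\psi_i\rangle}{\langle\psi_i|\psi_i\rangle} = \frac{(-i\hbar)^n}{\langle\psi_i|\psi_i\rangle_J}\,\frac{\delta^n\langle\psi_i|\psi_i\rangle_J}{\delta J(t_1)\cdots\delta J(t_n)}\bigg|_{J=0} \tag{9.5.37}$$

From Sec. (9.4) we are already familiar with formulae of this type, where we obtained the Green's function using the generating functional. The inner product $\langle\psi_i|\psi_i\rangle_J$ is thus called the generating functional for the time ordered correlation functions.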

5.4 Vacuum Functional Z[J] and Green's Functions in the Vacuum

We shall now use the above discussions to write down the Green's functions using the PI formalism and will finally show that the end results are the same regardless of the approach (e.g., the Faddeev-Popov formulation or the PI formalism). For this purpose we consider the vacuum to vacuum transition amplitude for $J \neq 0$, beginning with the transition amplitude in coordinate space. Using (9.5.14) we can write it as:

$$\langle q_f, t_f|q_i, t_i\rangle_J = N\int [dq]\,\exp((i/\hbar)\,S[q, J]) = N\int [dq]\,\exp\left(\frac{i}{\hbar}\,S[q] + \frac{i}{\hbar}\int_{t_i}^{t_f} dt\; q(t)\,J(t)\right) \tag{9.5.38}$$

In view of (9.5.23), the RHS can also be written as a matrix element of the operator, thus:

$$\langle q_f, t_f|q_i, t_i\rangle_J = \langle q_f, t_f|\,T\exp\left[\frac{i}{\hbar}\int_{t_i}^{t_f} dt\; Q(t)\,J(t)\right]|q_i, t_i\rangle \tag{9.5.39}$$


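Now the ground state can be reached from these coordinate states through a few complicated steps. To this end we let the initial and final times tend to infinity, so we allow:

$$t_i \to -\infty, \qquad t_f \to +\infty \tag{9.5.40}$$

As for $J$, we assume that it is non-zero in a large but finite interval, i.e.,

$$J(t) = 0, \qquad |t| > T \tag{9.5.41}$$

and since computations are done with $J(t) \to 0$, we take the limit $T \to \infty$ at the end. Accordingly the equalities (9.5.38) and (9.5.39) can respectively be written as:

$$\lim_{\substack{t_i \to -\infty \\ t_f \to \infty}} \langle q_f, t_f|q_i, t_i\rangle_J = N\int [dq]\,\exp\left[\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\,(L(q, \dot q) + Jq)\right] \tag{9.5.42}$$

and

$$\lim_{\substack{t_i \to -\infty \\ t_f \to \infty}} \langle q_f, t_f|q_i, t_i\rangle_J = \lim_{t_i \to -\infty}\,\lim_{t_f \to \infty} \langle q_f, t_f|\,T\exp\left[\frac{i}{\hbar}\int_{-T}^{T} dt\; JQ\right]|q_i, t_i\rangle \tag{9.5.43}$$

We further assume that the ground state energy of our Hamiltonian is normalized to zero, i.e.:

$$H|0\rangle = 0, \qquad H|n\rangle = E_n|n\rangle \tag{9.5.44}$$

(here we have assumed the energy eigenstates to be discrete for simplicity of calculations). Inserting complete sets of energy eigenstates in (9.5.43) we have (see (9.5.3)):

$$\lim_{\substack{t_i \to -\infty \\ t_f \to \infty}} \langle q_f, t_f|q_i, t_i\rangle_J = \lim_{t_i \to -\infty}\,\lim_{t_f \to \infty} \sum_{n, m} \langle q_f, t_f|n\rangle\, \langle n|\,T\exp\left[\frac{i}{\hbar}\int_{-T}^{T} dt\; JQ\right]|m\rangle\, \langle m|q_i, t_i\rangle \tag{9.5.45}$$

Now note that the first and last transition amplitudes are the matrix elements of the operator $\exp((-i/\hbar)Ht)$ and its inverse, with appropriate values of $t$; therefore the RHS of (9.5.45) is:

$$\begin{aligned} &\lim_{t_i \to -\infty}\,\lim_{t_f \to \infty} \sum_{n, m} \langle q_f|\exp(-(i/\hbar)\,Ht_f)|n\rangle\, \langle n|\,T\exp\left[\frac{i}{\hbar}\int_{-T}^{T} dt\; JQ\right]|m\rangle\, \langle m|\exp((i/\hbar)\,Ht_i)|q_i\rangle \\ &\quad = \lim_{t_i \to -\infty}\,\lim_{t_f \to \infty} \sum_{n, m} \exp(-(i/\hbar)\,E_n t_f + (i/\hbar)\,E_m t_i)\; \langle q_f|n\rangle\, \langle n|\,T\exp\left[\frac{i}{\hbar}\int_{-T}^{T} dt\; JQ\right]|m\rangle\, \langle m|q_i\rangle \end{aligned} \tag{9.5.46}$$

(where we have used (9.5.44)). When we take the limits for $t_i$ and $t_f$, the exponentials oscillate out to zero everywhere except for the ground state; hence we have: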


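$$\lim_{\substack{t_i \to -\infty \\ t_f \to \infty}} \langle q_f, t_f|q_i, t_i\rangle_J = \langle q_f|0\rangle\, \langle 0|\,T\exp\left[\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\; JQ\right]|0\rangle\, \langle 0|q_i\rangle = \langle q_f|0\rangle\, \langle 0|q_i\rangle\, \langle 0|\,T\exp\left[\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\; JQ\right]|0\rangle \tag{9.5.47}$$

This leads to:

$$\langle 0|\,T\exp\left(\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\; JQ\right)|0\rangle = \lim_{\substack{t_i \to -\infty \\ t_f \to \infty}} \frac{\langle q_f, t_f|q_i, t_i\rangle_J}{\langle q_f|0\rangle\, \langle 0|q_i\rangle} \tag{9.5.48}$$

Now the LHS of the above equation is independent of end points and so the RHS must also be independent of end points; moreover, from (9.5.3) we know that the RHS has the structure of a path integral (see (9.5.14)). Hence it follows that:

$$\langle 0|\,T\exp\left(\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\; JQ\right)|0\rangle = \langle 0|0\rangle_J = N\int [dq]\,\exp((i/\hbar)\,S[q, J]) \tag{9.5.49}$$

where

$$S[q, J] = \int_{-\infty}^{\infty} dt\,(L(q, \dot q) + Jq) \tag{9.5.50}$$

Note that the RHS in (9.5.49) has no end point constraints. If we denote now (see also (9.4.12))

$$\langle 0|0\rangle_J = N\int [dq]\,\exp((i/\hbar)\,S[q, J]) = Z[J] \tag{9.5.51}$$

then from (9.5.37) we have:

$$\frac{(-i\hbar)^n}{Z[J]}\,\frac{\delta^n Z[J]}{\delta J(t_1)\cdots\delta J(t_n)}\bigg|_{J=0} = \langle 0|T(Q(t_1)\cdots Q(t_n))|0\rangle \tag{9.5.52}$$

which shows that $Z[J]$ generates the time ordered correlation functions, or the Green's functions, in the vacuum (see also (9.4.19)). In quantum field theory $Z[J]$, known as the vacuum functional or the generating functional for vacuum Green's functions, plays a central role, because the knowledge of the vacuum Green's functions leads to the construction of the $S$-matrix of the theory, which in turn leads to the solution of the theory.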
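The way a generating function produces correlation functions by differentiation at $J = 0$ can be seen in a zero-dimensional toy model. In the sketch below the path integral is replaced by a finite weighted sum over values $q_k$, and the source enters through a real factor $\exp(Jq_k)$; with a real source there is no factor $(-i\hbar)^n$ as in (9.5.52). The sample values, the weights and the finite-difference step are assumptions chosen purely for illustration.

```python
import math


def Z(J, values, weights):
    """Toy generating function: sum_k w_k exp(J q_k)."""
    return sum(w * math.exp(J * q) for q, w in zip(values, weights))


def moment_from_Z(n, values, weights, h=1e-2):
    """(1/Z) d^n Z / dJ^n at J = 0 by a central finite difference (n = 1 or 2)."""
    z0 = Z(0.0, values, weights)
    zp = Z(+h, values, weights)
    zm = Z(-h, values, weights)
    if n == 1:
        return (zp - zm) / (2.0 * h) / z0
    if n == 2:
        return (zp - 2.0 * z0 + zm) / (h * h) / z0
    raise ValueError("this sketch only handles n = 1 or 2")


def direct_moment(n, values, weights):
    """Weighted average of q^n computed directly, for comparison."""
    norm = sum(weights)
    return sum(w * q ** n for q, w in zip(values, weights)) / norm


if __name__ == "__main__":
    qs = [-1.0, 0.0, 2.0]
    ws = [0.2, 0.5, 0.3]
    for n in (1, 2):
        print(n, moment_from_Z(n, qs, ws), direct_moment(n, qs, ws))
```

The two columns agree up to the $O(h^2)$ error of the finite difference, which is the discrete counterpart of reading off $\langle Q \cdots Q\rangle$ from derivatives of $Z[J]$.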

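5.5 Effective Action W[J]

While dealing with statistical deviations from the mean values in QM, it is customary to write:

$$Z[J] = \exp((i/\hbar)\,W[J]) \tag{9.5.53a}$$

or

$$W[J] = -i\hbar\,\ln Z[J] \tag{9.5.53b}$$

then the functional derivative:

$$\frac{\delta W[J]}{\delta J(t_1)}\bigg|_{J=0} = (-i\hbar)\,\frac{1}{Z[J]}\,\frac{\delta Z[J]}{\delta J(t_1)}\bigg|_{J=0} \tag{9.5.54}$$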


is said to define the vacuum expectation value (Q(t{)) (of the operator Q). Accordingly, the second order functional derivative: (_ih)

5 2 ^[-/3 8J{tx)8J{t2) ,__0

= ( ;-ft)2f

l

1 5Z[J] 8Z[J] ^ Z2[J] 8J(tx) 8J{t2))

52Z[J]

[Z[J] 8j{tx)8J{t2)

J=o

= (T(Q(tx) Qih))) - (Q(tO) (Qih)) = (TdQih) - (Q(tx))) (Q(t2) - <<2(f2)»>

(9-5.55)
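The content of (9.5.55), namely that the second derivative of W measures the deviation from the mean, can be checked numerically in a much simpler setting. The sketch below is only an illustration under assumptions that are not part of the text: the path integral is replaced by an ordinary one-dimensional integral with a real (Euclidean) weight exp(-q^2/2 - lam*q^4/4 + J*q), so that W(J) = ln Z(J) and no factors of -iħ appear. The helper names `moments`, `W` and `second_derivative` are hypothetical.

```python
import math

def moments(J, lam=0.3, L=10.0, n=4001):
    """Return (Z, <q>, <q^2>) for the weight exp(-q^2/2 - lam*q^4/4 + J*q).

    Assumption: a zero-dimensional Euclidean toy model, integrated with
    the trapezoid rule on the interval [-L, L].
    """
    h = 2 * L / (n - 1)
    z = m1 = m2 = 0.0
    for k in range(n):
        q = -L + k * h
        c = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        w = c * math.exp(-0.5 * q * q - 0.25 * lam * q**4 + J * q) * h
        z += w
        m1 += q * w
        m2 += q * q * w
    return z, m1 / z, m2 / z

def W(J):
    """Toy analogue of the effective action: W(J) = ln Z(J)."""
    return math.log(moments(J)[0])

def second_derivative(f, x, eps=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + eps) - 2 * f(x) + f(x - eps)) / eps**2

J0 = 0.7
_, mean, mean_sq = moments(J0)
connected = second_derivative(W, J0)   # d^2 W / dJ^2
variance = mean_sq - mean**2           # <q^2> - <q>^2
```

In this toy model `connected` and `variance` agree up to discretization error, which mirrors the statement that the second derivative of W gives the second order deviation from the mean.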

gives the second order deviation from the mean. It can obviously be generalized to any finite order $m$, although it becomes rather complicated beyond the fourth order. In the beginning of this section we saw that the path integral for the transition amplitude is proportional to the exponential of the classical action; it is for this reason that $W[J]$ is often referred to as an effective action. In quantum field theory $W[J]$ is known as the generating functional for the connected vacuum Green's functions (see Comment (9.4.1) for the definition of connectedness). We illustrate this by using the example of the harmonic oscillator.

Example 9.5.2 Consider the action:

$$S[q,J]=\int_{-\infty}^{\infty}dt\,\Big(\frac12m\dot q^2-\frac12m\omega^2q^2+Jq\Big) \tag{9.5.56}$$

then $Z[J]=N\int[dq]\exp((i/\hbar)S[q,J])$ gives:

$$\langle Q(t)\rangle=(-i\hbar)\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(t)}\Big|_{J=0}=\frac{N\int[dq]\,q(t)\exp((i/\hbar)S[q])}{N\int[dq]\exp((i/\hbar)S[q])}=0 \tag{9.5.57}$$

This is zero, since the integrand in the numerator is odd. Hence from (9.5.55) we have:

$$\langle T(Q(t_1)Q(t_2))\rangle=(-i\hbar)^2\,\frac{1}{Z[J]}\frac{\delta^2Z[J]}{\delta J(t_1)\delta J(t_2)}\Big|_{J=0}=(-i\hbar)\frac{\delta^2W[J]}{\delta J(t_1)\delta J(t_2)}\Big|_{J=0} \tag{9.5.58}$$

showing that $W[J]$ is the generating functional of $\langle T(Q(t_1)Q(t_2))\rangle$, i.e., of the two-point connected vacuum Green's function. Further, in view of our discussions in Sec. 4 (see (9.4.13)-(9.4.18)) we know that the RHS stands for $(i\hbar/m)D_F(t_1-t_2)$, which means that

$$\langle T(Q(t_1)Q(t_2))\rangle=\frac{i\hbar}{m}D_F(t_1-t_2) \tag{9.5.59}$$

Hence the two-point time ordered vacuum correlation function, i.e., the two-point 'connected' vacuum Green's function, is indeed the Feynman propagator of the theory (see (9.4.17)). It is also worth mentioning here that path integrals cannot be exactly evaluated for all types of Lagrangians. When the Lagrangian is quadratic, so that the path integral is Gaussian (or can be reduced to a Gaussian), it can be evaluated without difficulty; a simple deviation from that, such as in the following Lagrangian:

$$L=\frac12m\dot q^2-\frac12m\omega^2q^2-\frac{\lambda}{4}q^4 \tag{9.5.60}$$

makes an exact evaluation impossible.29 In this case an external source $J$ is introduced to write the vacuum functional:

$$Z[J]=N\int[dq]\exp((i/\hbar)S[q,J])=N\int[dq]\exp\Big(\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\,\Big(\frac12m\dot q^2-\frac12m\omega^2q^2-\frac{\lambda}{4}q^4+Jq\Big)\Big) \tag{9.5.61}$$

Then, since

$$S[q,J]=S_0[q,J]-\int_{-\infty}^{\infty}dt\,\frac{\lambda}{4}q^4 \tag{9.5.62}$$

where $S_0[q,J]$ is the action of the harmonic oscillator in the presence of a source:

$$S_0[q,J]=\int_{-\infty}^{\infty}dt\,\Big(\frac12m\dot q^2-\frac12m\omega^2q^2+Jq\Big) \tag{9.5.63}$$

we can write $Z[J]$ as:

$$Z[J]=N\int[dq]\exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty}dt\,q^4\Big)\exp((i/\hbar)S_0[q,J]) \tag{9.5.64}$$

Now

$$\frac{\delta S_0[q,J]}{\delta J(t)}=q(t) \tag{9.5.65}$$

hence the operator $-i\hbar\,\delta/\delta J(t)$, while acting on $\exp((i/\hbar)S_0[q,J])$, can be identified with $q(t)$ and as a result, the RHS of (9.5.64) can be written as:

$$N\int[dq]\exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty}dt\,\Big(-i\hbar\frac{\delta}{\delta J(t)}\Big)^4\Big)\exp((i/\hbar)S_0[q,J])$$

$$=\exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty}dt\,\Big(-i\hbar\frac{\delta}{\delta J(t)}\Big)^4\Big)\Big(N\int[dq]\exp((i/\hbar)S_0[q,J])\Big) \tag{9.5.66}$$

29 The Lagrangian here is usually referred to as that of an 'anharmonic oscillator'.

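The second factor in the above expression is the vacuum functional for the harmonic oscillator interacting with the external source $J$; we denote it as $Z_0[J]$ and note that:

$$Z_0[J]=Z_0[0]\exp\Big(-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}dt_1\,dt_2\,J(t_1)D_F(t_1-t_2)J(t_2)\Big) \tag{9.5.67}$$

(see Subsec. 4.3). In view of this, (9.5.64) can now be written as:

$$Z[J]=Z_0[0]\exp\Big(-\frac{i\lambda}{4\hbar}\int_{-\infty}^{\infty}dt\,\Big(-i\hbar\frac{\delta}{\delta J(t)}\Big)^4\Big)\exp\Big(-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}dt_1\,dt_2\,J(t_1)D_F(t_1-t_2)J(t_2)\Big) \tag{9.5.68}$$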

For small $\lambda$ (i.e., for weak coupling), the first exponential can be Taylor expanded and the vacuum functional can be obtained as a power series in $\lambda$. The above discussions show that all the vacuum Green's functions can be calculated perturbatively using the PI formalism as well. (Note that in Sec. 4 (Ftn. 17) we arrived at this result using the perturbative series; both approaches are the same in essence.)
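The idea of Taylor expanding the interaction exponential and obtaining the vacuum functional as a power series in the coupling can be illustrated with a zero-dimensional stand-in. The sketch below is an illustration only and rests on assumptions that are not in the text: the functional integral is replaced by an ordinary Gaussian-weighted integral with a real (Euclidean) weight, sources are set to zero, and the function names are hypothetical. Each term of the series is a Gaussian moment, which is what repeated differentiation of the free generating function with respect to the source would produce.

```python
import math

def gaussian_moment(n):
    """<q^n> under the unit Gaussian weight for even n, i.e. (n-1)!!."""
    result = 1
    for k in range(n - 1, 0, -2):
        result *= k
    return result

def z_exact(lam, L=10.0, n=4001):
    """Trapezoid estimate of (1/sqrt(2*pi)) * integral of exp(-q^2/2 - lam*q^4/4)."""
    h = 2 * L / (n - 1)
    total = 0.0
    for k in range(n):
        q = -L + k * h
        c = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += c * math.exp(-0.5 * q * q - 0.25 * lam * q**4) * h
    return total / math.sqrt(2 * math.pi)

def z_series(lam, order):
    """Series obtained by expanding exp(-lam*q^4/4) and averaging term by term."""
    return sum((-lam / 4) ** k / math.factorial(k) * gaussian_moment(4 * k)
               for k in range(order + 1))

lam = 0.01
exact = z_exact(lam)
err_first = abs(exact - z_series(lam, 1))    # 1 - 3*lam/4
err_second = abs(exact - z_series(lam, 2))   # ... + 105*lam^2/32
```

For a weak coupling such as `lam = 0.01` the second-order truncation is closer to the numerical value than the first-order one, which is the behaviour expected of a weak-coupling expansion at low orders.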

5.6

Path Integral Approach to Field Theory

In Sec. 4 we have discussed the scalar field theory in detail without making much distinction between non-relativistic and relativistic theory; in Sec. 3, however, we studied the relativistic aspects of quantum theory. In this subsection our aim is to point out the features of field theory when it is accessed through the PI formalism. In this connection the following comments are in order.

Comment 9.5.3 The method of path integrals that has so far been discussed for one particle systems can be generalized to systems with many particles as well as to systems with many degrees of freedom. Consider a system characterized by the coordinates $x^\alpha(t)$,30 ($\alpha=1,\ldots,n$). These coordinates may denote the coordinates of $n$ particles in one dimension or represent a single particle in $n$ dimensions. Thus if $S[x^\alpha]$ is the action of the system, the transition amplitude (9.5.1) generalizes to

$$\langle x_f,t_f|x_i,t_i\rangle=N\int[dx^\alpha]\exp((i/\hbar)S[x^\alpha]) \tag{9.5.69}$$

with $S[x^\alpha]$ being the generic action:

$$S[x^\alpha]=\int_{t_i}^{t_f}dt\,L(x^\alpha,\dot x^\alpha) \tag{9.5.70}$$

Note that the integration is done over all paths originating from $x_i^\alpha$ at $t=t_i$ and ending at $x_f^\alpha$ at $t=t_f$. The transition amplitudes in the presence of sources introduced through appropriate couplings can now be written as:

$$\langle x_f,t_f|x_i,t_i\rangle_J=N\int[dx^\alpha]\exp((i/\hbar)S[x^\alpha,J_\alpha]) \tag{9.5.71}$$

where

$$S[x^\alpha,J_\alpha]=S[x^\alpha]+\int_{t_i}^{t_f}dt\,J_\alpha(t)x^\alpha(t) \tag{9.5.72}$$

These basic transition amplitudes allow us to derive other transition amplitudes or matrix elements similar to those in (9.5.37) and (9.5.52). The latter of these corresponds to the vacuum to vacuum transition amplitude, which was obtained by letting the time interval approach infinity and imposing no end point restrictions while integrating over paths (i.e., the initial and final coordinates of the paths could be chosen arbitrarily). The generating functional and the action in this case are:

$$Z[J]=\langle 0|0\rangle_J=N\int[dx^\alpha]\exp((i/\hbar)S[x^\alpha,J_\alpha]) \tag{9.5.73}$$

and

$$S[x^\alpha,J_\alpha]=\int_{-\infty}^{\infty}dt\,\big(L(x^\alpha,\dot x^\alpha)+J_\alpha(t)x^\alpha(t)\big) \tag{9.5.74}$$

30 We are using the letter $x$ in place of $q$ to distinguish multi-dimensionality here.

5.7

PI-formalism and Field Theories (with Infinite Degrees of Freedom)

Comment 9.5.4 The path integral method can be extended to continuum field theories after suitable changes; in this case physical systems involve infinitely many degrees of freedom. Thus for a 1 + 1 dimensional theory whose basic variable is the field $\phi(x,t)$, the vacuum to vacuum transition function in the presence of a source is given as:

$$Z[J]=\langle 0|0\rangle_J=N\int[d\phi]\exp((i/\hbar)S[\phi,J]) \tag{9.5.75}$$

where

$$S[\phi,J]=S[\phi]+\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}dt\,dx\,J(x,t)\phi(x,t) \tag{9.5.76}$$

The form of the action $S[\phi]$ is chosen depending on whether the field theory is relativistic or non-relativistic. It is to be noted, however, that the relation between the Lagrangian and the Hamiltonian must be canonical (see Examples (6.3.2) and (9.4.4)), since only then can the PI method that leads to (9.5.73) or (9.5.75) be applied. If the relation is not canonical, the transition amplitudes are computed using the methods discussed in Sec. 4 (see, for instance, (9.4.3) and Exc. (9.4.1)), and the formalism is referred to as Feynman's path integral in phase space. Consider a relativistic field theory in spacetime (3 + 1 dimensions) with Lagrangian density:

$$\mathcal{L}(\phi,\partial_\mu\phi)=\frac12\partial_\mu\phi\,\partial^\mu\phi-\frac{m^2}{2}\phi^2-\frac{\lambda}{4!}\phi^4 \tag{9.5.77}$$

with $\lambda>0$, then

$$S[\phi]=\int d^4x\,\mathcal{L}(\phi,\partial_\mu\phi) \tag{9.5.78}$$

and

$$S[\phi,J]=S[\phi]+\int d^4x\,J(\mathbf{x},t)\phi(\mathbf{x},t) \tag{9.5.79}$$

The Euler-Lagrange equation coming from (9.5.79) takes the form:

$$-\frac{\delta S[\phi,J]}{\delta\phi(x)}=\partial_\mu\partial^\mu\phi+m^2\phi+\frac{\lambda}{3!}\phi^3-J(x)=0 \tag{9.5.80}$$

(here $x$ refers to $(\mathbf{x},t)$ collectively). The theory with the dynamical equation (9.5.80) or the action (9.5.79) is known as the $\phi^4$-theory in the literature. When $\lambda=0$ the Lagrangian density (9.5.77) is quadratic in the field variables and hence the generating functional, denoted $Z_0[J]$ in this case, can be evaluated following the methods of quantum mechanical systems (see (9.5.61) and (9.5.67)), thus

$$Z_0[J]=N\int[d\phi]\exp((i/\hbar)S_0[\phi,J])=N\int[d\phi]\exp\Big(\frac{i}{\hbar}\int d^4x\,\Big(\frac12\partial_\mu\phi\,\partial^\mu\phi-\frac{m^2}{2}\phi^2+J\phi\Big)\Big)$$

$$=N\int[d\phi]\exp\Big(-\frac{i}{\hbar}\int d^4x\,\Big(\frac12\phi\,(\partial_\mu\partial^\mu+m^2)\,\phi-J\phi\Big)\Big) \tag{9.5.81}$$

(We have used here the arguments given in the Hint of Exc. (5.1), in particular we have used the Eq. (v) there, see also (9.5.68).) As in the case of the harmonic oscillator, when $\lambda=0$ we have here:

$$\langle 0|\phi(x)|0\rangle=(-i\hbar)\,\frac{1}{Z_0[J]}\frac{\delta Z_0[J]}{\delta J(x)}\Big|_{J=0}=0 \tag{9.5.82}$$

and

$$\langle 0|T(\phi(x)\phi(y))|0\rangle=(-i\hbar)^2\,\frac{1}{Z_0[J]}\frac{\delta^2Z_0[J]}{\delta J(x)\delta J(y)}\Big|_{J=0}=i\hbar D_F(x-y) \tag{9.5.83}$$

(see (9.5.57) and (9.5.59)).

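Similarly, identifying $\phi(x)$ with $-i\hbar\,\delta/\delta J(x)$, we can use the arguments of the anharmonic oscillator to write the generating functional for the case $\lambda\neq0$ as:

$$Z[J]=N\int[d\phi]\exp\Big(\frac{i}{\hbar}\int d^4x\,\Big(\frac12\partial_\mu\phi\,\partial^\mu\phi-\frac{m^2}{2}\phi^2-\frac{\lambda}{4!}\phi^4+J\phi\Big)\Big)$$

$$=N\int[d\phi]\exp\Big(-\frac{i\lambda}{4!\hbar}\int d^4x\,\phi^4\Big)\exp((i/\hbar)S_0[\phi,J])$$

$$=N\int[d\phi]\exp\Big(-\frac{i\lambda}{4!\hbar}\int d^4x\,\Big(-i\hbar\frac{\delta}{\delta J(x)}\Big)^4\Big)\exp((i/\hbar)S_0[\phi,J])$$

$$=\exp\Big(-\frac{i\lambda}{4!\hbar}\int d^4x\,\Big(-i\hbar\frac{\delta}{\delta J(x)}\Big)^4\Big)\Big(N\int[d\phi]\exp((i/\hbar)S_0[\phi,J])\Big)$$

$$=\exp\Big(-\frac{i\lambda}{4!\hbar}\int d^4x\,\Big(-i\hbar\frac{\delta}{\delta J(x)}\Big)^4\Big)Z_0[J] \tag{9.5.84}$$

Note that (9.5.84) is quite similar to (9.5.66) except for the fact that the basic variable here is the field $\phi$ in place of $q$. The field $\phi$ describes the theory in 3 + 1 dimensions whereas $q$ described it in 0 + 1 dimensions. When the coupling is weak ($\lambda$ is very small), the first exponential can be Taylor expanded. For instance, retaining terms up to second order in $\lambda$ we have:

$$Z[J]=\Big[1-\frac{i\lambda\hbar^3}{4!}\int d^4x\,\frac{\delta^4}{\delta J^4(x)}+\frac{1}{2!}\Big(-\frac{i\lambda\hbar^3}{4!}\Big)^2\int d^4x\,\frac{\delta^4}{\delta J^4(x)}\int d^4y\,\frac{\delta^4}{\delta J^4(y)}+\cdots\Big]Z_0[J] \tag{9.5.85}$$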

where $Z_0[J]$ is given by (9.5.81). We use the above expansion to derive Green's functions by further assuming that $\lambda$ is so small that $\lambda^2$ can be neglected. Due to the $J\to-J$ symmetry in (9.5.81) and (9.5.84), the correlation functions, and hence the Green's functions, of odd order are zero. We give here the expressions for only the two-point and four-point functions computed from (9.5.85):

$$\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle=i\hbar D_F(x_1-x_2)-\frac{\lambda\hbar^2}{2}D_F(0)\int d^4x\,D_F(x-x_1)D_F(x-x_2) \tag{9.5.86}$$

and

$$\langle 0|T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4))|0\rangle=-\hbar^2\{D_F(x_1-x_2)D_F(x_3-x_4)+D_F(x_1-x_3)D_F(x_2-x_4)+D_F(x_1-x_4)D_F(x_2-x_3)\}$$

$$-\frac{i\lambda\hbar^3}{2}D_F(0)\int d^4x\,\{D_F(x_1-x_2)D_F(x-x_3)D_F(x-x_4)+D_F(x_1-x_3)D_F(x-x_2)D_F(x-x_4)+\cdots\ \text{4 similar terms with appropriate combinations of } x,x_1,x_2,x_3,x_4\}$$

$$-i\lambda\hbar^3\int d^4x\,D_F(x-x_1)D_F(x-x_2)D_F(x-x_3)D_F(x-x_4) \tag{9.5.87}$$
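The first line of (9.5.87) contains one product of propagators for each way of grouping the four points into pairs. The following sketch enumerates such pairings; it is a small combinatorial illustration, and the function name `pairings` and the string labels are hypothetical choices made here.

```python
def pairings(points):
    """Return all ways of grouping a list of points into unordered pairs.

    Each pairing is a list of 2-tuples. A list with an odd number of
    entries has no complete pairing, so the result is then empty.
    """
    if not points:
        return [[]]
    first, rest = points[0], points[1:]
    result = []
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            result.append([(first, partner)] + tail)
    return result

four_point = pairings(["x1", "x2", "x3", "x4"])
for pairing in four_point:
    # Each pair (a, b) stands for one factor D_F(a - b).
    print(" * ".join(f"D_F({a} - {b})" for a, b in pairing))
```

For four points the enumeration yields the three products appearing in the first line of (9.5.87), and for an odd number of points it yields nothing, in line with the vanishing of the odd-order Green's functions.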

(Note the absence of terms which involve an odd number of $D_F(x-x_i)$'s ($i=1,2,3,4$); see Exc. (9.5.3) for derivations.) The expressions (9.5.86) and (9.5.87) represent the two-point and four-point Green's functions of the $\phi^4$-theory.

Comment 9.5.5 When instead of relativistic fields (discussed above) we consider gauge fields, we have to deal with an extra symmetry that creeps in due to gauge transformation rules, e.g.:

$$\mathbf{A}\to\mathbf{A}+\nabla\phi \tag{9.5.88}$$

which results in overcounting when the integration is done. Thus while the PI formulation allows an easy demonstration of the gauge invariance of the theory, it introduces spurious degrees of freedom. Therefore, while using the PI formalism it is important that these redundant degrees of freedom (due to gauge invariance) are weeded out by restricting the theory with gauge-fixing conditions and by factoring out the (infinite) volume (of the orbit) due to overcounting (see Sec. 6.4-5). The computations based on these constraints for general non-abelian gauge theories are quite complicated (see Sec. 6.4-5), and are beyond our limited scope; we illustrate the theory with the example of a non-abelian Yang-Mills gauge field.

Example 9.5.6 Consider the Lagrangian density:

$$\mathcal{L}=-\frac14F^a_{\mu\nu}F^{a\mu\nu},\qquad a=1,2,3 \tag{9.5.89}$$

for the SU(2) Yang-Mills field, where:

$$F^a_{\mu\nu}=\partial_\mu A^a_\nu-\partial_\nu A^a_\mu+g\,\epsilon^{abc}A^b_\mu A^c_\nu \tag{9.5.90}$$

$g$ being a coupling constant. The generating functional with external source $J$ can be written here as:

$$Z[J]=\int[dA_\mu]\exp\Big\{i\int d^4x\,[\mathcal{L}(x)+J_\mu(x)\cdot A^\mu(x)]\Big\} \tag{9.5.91}$$

with the free-field part as:

$$Z_0[J]=\int[dA_\mu]\exp\Big\{i\int d^4x\,[\mathcal{L}_0(x)+J_\mu(x)\cdot A^\mu(x)]\Big\} \tag{9.5.92}$$

where

$$\int d^4x\,\mathcal{L}_0(x)=-\frac14\int d^4x\,(\partial_\mu A^a_\nu-\partial_\nu A^a_\mu)(\partial^\mu A^{a\nu}-\partial^\nu A^{a\mu})$$

$$=\frac12\int d^4x\,A^a_\mu(x)\,(g^{\mu\nu}\partial^2-\partial^\mu\partial^\nu)\,A^a_\nu(x) \tag{9.5.93}$$

Obviously the generating functional would be obtained if the integral on the RHS of (9.5.91) or (9.5.92) could be calculated. This, however, is not possible, although the expression in the second line of (9.5.93) is similar to that of the scalar field theory (see (9.5.81)), because the operator:

$$B^{\mu\nu}=g^{\mu\nu}\partial^2-\partial^\mu\partial^\nu \tag{9.5.94}$$

has no inverse. In other words, $\det B=\det\|B^{\mu\nu}\|$ is zero, and since this will appear in the denominator if (9.5.93) is calculated, it follows that the integral is divergent. We shall see next how this situation can be avoided. The procedure discussed below is often referred to as PI volume factoring in gauge theories.

5.8 The Faddeev-Popov Ansatz

Let $\theta(x)$ be the spacetime-dependent parameters of the group SU(2), and $\boldsymbol{\sigma}$ be the Pauli matrices; then the gauge transformation $A_\mu\to A^\theta_\mu$ defined by

$$\mathbf{A}^\theta_\mu\cdot\frac{\boldsymbol{\sigma}}{2}=U(\theta)\Big[\mathbf{A}_\mu\cdot\frac{\boldsymbol{\sigma}}{2}+\frac{1}{ig}U^{-1}(\theta)\,\partial_\mu U(\theta)\Big]U^{-1}(\theta) \tag{9.5.95}$$

with

$$U(\theta)=\exp(-i\boldsymbol{\theta}(x)\cdot\boldsymbol{\sigma}/2) \tag{9.5.96}$$

leaves the action $S$ invariant. This means that the action is constant on the orbit of the gauge group formed by all $A^\theta_\mu$'s for some fixed $A_\mu$ as $U(\theta)$ ranges over all elements of SU(2). A proper quantization will thus be obtained by restricting the path integration to a 'hypersurface' which intersects each orbit just once. We write the equation of this hypersurface as:

$$f_a(A_\mu)=0,\qquad a=1,2,3 \tag{9.5.97}$$

and note that for a given $A_\mu$, the equation:

$$f_a(A^\theta_\mu)=0 \tag{9.5.98}$$

has a unique solution $\theta=\theta(x)$. Equation (9.5.97) is evidently a gauge fixing condition as shown in Fig. (9.3).

Fig. 9.3 Orbits in the gauge group manifold and the gauge-fixing constraint.

In order to obtain the volume of the orbit, we have to define the integration over the group space. Let $\theta$ and $\theta'$ be elements of SU(2); then in terms of the representation matrices $U(\theta)$, the multiplication of group elements takes the form:

$$U(\theta)U(\theta')=U(\theta\theta')=U(\theta'') \tag{9.5.99}$$

In a neighbourhood of the identity $U(\theta)$ can be written as:

$$U(\theta)=1-i\boldsymbol{\theta}\cdot\frac{\boldsymbol{\sigma}}{2}+O(\theta^2) \tag{9.5.100}$$

and the integration measure over the group space can be taken as:

$$[d\theta]=\prod_{a=1}^{3}d\theta_a \tag{9.5.101}$$

which is invariant in the sense that $d(\theta\theta')=d\theta'$. Next we define a function $\Delta_F[A_\mu]$ by integrating over the group space:

$$\Delta_F^{-1}[A_\mu]=\int[d\theta(x)]\,\delta[f_a(A^\theta_\mu)] \tag{9.5.102}$$

Thus

$$\Delta_F[A_\mu]=\det R_f \tag{9.5.103}$$

where

$$(R_f)_{ab}=\frac{\delta f_a}{\delta\theta_b} \tag{9.5.104}$$

The above equation implies that $R_f$ is just the response of $f_a[A_\mu]$ to the infinitesimal gauge transformation. Using (9.5.95) and simplifying it in view of (9.5.100), the infinitesimal gauge transformation is of the form:

$$A^{\theta a}_\mu=A^a_\mu+\epsilon^{abc}\theta^bA^c_\mu-\frac{1}{g}\partial_\mu\theta^a \tag{9.5.105}$$

and therefore it follows that the response of $f_a[A_\mu]$ is:

$$f_a[A^\theta_\mu(x)]=f_a[A_\mu(x)]+\int d^4y\,[R_f(x,y)]_{ab}\,\theta_b(y)+O(\theta^2) \tag{9.5.106}$$

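We note here that due to the uniqueness of the solution of (9.5.98), $\det R_f$ is not zero, and also that the function $\Delta_F[A_\mu]$ is gauge invariant. The gauge invariance of $\Delta_F[A_\mu]$ can be further examined by writing (9.5.102) as:

$$\Delta_F^{-1}[A_\mu]=\int[d\theta'(x)]\,\delta[f_a(A^{\theta'}_\mu)] \tag{9.5.107}$$

which implies that we can write:

$$\Delta_F^{-1}[A^\theta_\mu]=\int[d\theta'(x)]\,\delta[f_a(A^{\theta\theta'}_\mu(x))]=\int[d(\theta(x)\theta'(x))]\,\delta[f_a(A^{\theta\theta'}_\mu(x))]=\int[d\theta''(x)]\,\delta[f_a(A^{\theta''}_\mu(x))] \tag{9.5.108}$$

But the RHS of the above equality is simply $\Delta_F^{-1}[A_\mu]$ from (9.5.107). Using (9.5.102) we can now write the path-integral representation of the vacuum-to-vacuum amplitude as:

$$\int[dA_\mu]\exp\Big\{i\int d^4x\,\mathcal{L}(x)\Big\}=\int[d\theta(x)][dA_\mu(x)]\,\Delta_F[A_\mu]\,\delta[f_a(A^\theta_\mu)]\exp\Big\{i\int d^4x\,\mathcal{L}(x)\Big\}$$

$$=\int[d\theta(x)][dA_\mu(x)]\,\Delta_F[A_\mu]\,\delta[f_a(A_\mu)]\exp\Big\{i\int d^4x\,\mathcal{L}(x)\Big\} \tag{9.5.109}$$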

In writing the last line, we have used the fact that both $\Delta_F[A_\mu]$ and $\exp\{i\int d^4x\,\mathcal{L}(x)\}$ are invariant under the gauge transformation $A_\mu\to A^\theta_\mu$. This shows that the integrand is independent of $\theta(x)$, and the integration over $\prod_x d\theta(x)$ is the infinite orbit volume that we have been looking to identify.

Hence using (9.5.103) and (9.5.109), we can write the generating functional of the gauge field $A_\mu$ (which is free from redundancy) as:

$$Z_f[J]=\int[dA_\mu]\,(\det R_f)\,\delta[f_a(A_\mu)]\exp\Big\{i\int d^4x\,[\mathcal{L}(x)+J_\mu\cdot A^\mu]\Big\} \tag{9.5.110}$$

This is called the Faddeev-Popov (FP) Ansatz. Note that essentially the quantization here has been done by restricting the functional measure with $\det|\delta f/\delta\theta|\,\delta[f(A_\mu)]$. (See Exc. (9.5.4) for an example of gauge-fixing.) We further note that the above discussions can be applied to Abelian gauge theory. Under a U(1) gauge transformation equation (9.5.105) becomes:

$$A^\theta_\mu(x)=A_\mu(x)-\frac{1}{g}\partial_\mu\theta(x) \tag{9.5.111}$$

Thus for any choice of linear gauge-fixing condition of (9.5.97) the response matrix $R_f$ in (9.5.104) or (9.5.106) is independent of $\theta$ and of the gauge field $A_\mu$. Hence the FP factor $(\det R_f)$ plays no physical role and can be dropped from the generating functional. The generating functional in this case is:

$$Z_f[J]=\int[dA_\mu]\,\delta[f(A_\mu)]\exp\Big\{i\int d^4x\,[\mathcal{L}(x)+J_\mu(x)A^\mu(x)]\Big\} \tag{9.5.112}$$
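The logic of factoring out the orbit volume can be illustrated with a finite-dimensional toy model, which is an assumption introduced here and not an example from the text. Take a function on the plane that is invariant under rotations (the "gauge group"), choose the slice y = 0, x > 0 (the "gauge-fixing condition", which meets each circular orbit once), and weight the slice by the response of the condition to an infinitesimal rotation, d(y)/d(theta) = x on the slice. The full integral should then equal the orbit volume 2*pi times the gauge-fixed integral. All function names below are hypothetical.

```python
import math

def invariant_function(x, y):
    """A rotation-invariant integrand, playing the role of exp(i*S)."""
    return math.exp(-(x * x + y * y))

def plane_integral(F, L=6.0, n=241):
    """Riemann sum of F over the square [-L, L]^2 (the unrestricted integral)."""
    h = 2 * L / (n - 1)
    total = 0.0
    for i in range(n):
        x = -L + i * h
        for j in range(n):
            y = -L + j * h
            total += F(x, y)
    return total * h * h

def gauge_fixed_integral(F, L=6.0, n=6001):
    """Orbit volume 2*pi times the integral over the slice y = 0, x > 0,
    weighted by the Jacobian factor x (the toy analogue of det R_f)."""
    h = L / (n - 1)
    total = 0.0
    for k in range(n):
        x = k * h
        c = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += c * x * F(x, 0.0) * h
    return 2 * math.pi * total

full = plane_integral(invariant_function)
fixed = gauge_fixed_integral(invariant_function)
```

Both numbers approximate pi for this integrand. In the toy model the orbit volume 2*pi is finite, whereas in the gauge theory the corresponding factor is infinite and is dropped from the generating functional.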

(For further studies on the subject of gauge fields, the reader is referred to Chapter 9 in [6], and [11].) Two recent texts on path integration and their applications which make excellent reading are [7] and [19]. Having given some idea of path-integrals to the reader, we move on to Feynman graphs in the next section. But before this, we illustrate some of the theory of this section in the Hints to the Exercises.

Exercise 9.5

1. Show that the generating functional $Z[J]$ for the harmonic oscillator with an external source $J$ given by

(a) $$Z[J]=N\int[dq]\exp\Big(\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\,\Big(\frac12m\dot q^2-\frac12m\omega^2q^2+Jq\Big)\Big)$$

can also be written as

(b) $$Z[J]=Z[0]\exp\Big(-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}dt\,dt'\,J(t)D_F(t-t')J(t')\Big).$$

2. Show that the generating functional $Z_0[J]$ for the action $S_0[\phi,J]$ in a relativistic scalar field theory can be expressed as:

(a) $$Z_0[J]=Z_0[0]\exp\Big(-\frac{i}{2\hbar}\iint d^4x\,d^4x'\,J(x)D_F(x-x')J(x')\Big).$$

3. Consider the Taylor expansion up to first order in $\lambda$ for the generating functional of the $\phi^4$-theory given in (9.5.84) and show that the two-point and four-point Green's functions are given by (9.5.86) and (9.5.87).

4. Show that when the gauge-fixing condition $f_a(A_\mu)=0$ ($a=1,2,3$) given in (9.5.97) is simply

(a) $$f_a=A^a_3=0$$

then the response matrix $R_f$ is independent of the gauge field and thus, while writing the generating functional, $(\det R_f)$ can be ignored; hence $Z_f[J]$ for the Yang-Mills theory in this case becomes:

(b) $$Z_f[J]=\int[dA_\mu]\,\delta(A_3)\exp\{iS[J]\}$$

where

(c) $$S[J]=\int d^4x\,\Big(-\frac14(F^a_{\mu\nu})^2+J^a_\mu A^{a\mu}\Big).$$

Hints to Exercise 9.5

1. Consider the harmonic oscillator $\frac12m\dot q^2-\frac12m\omega^2q^2=-\frac12m\,q(t)\Big(\frac{d^2}{dt^2}+\omega^2\Big)q(t)$; in view of (9.5.51) the generating functional with external source $J$ can be written as:

$$Z[J]=N\int[dq]\exp\Big\{\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\,\Big(\frac12m\dot q^2-\frac12m\omega^2q^2+Jq\Big)\Big\}$$

To use the theory developed in Subsec. (4.3) we further write it as:

(i) $$Z[J]=\lim_{\epsilon\to0^+}N\int[dq]\exp\Big[-\frac{im}{2\hbar}\int_{-\infty}^{\infty}dt\,\Big(q(t)\Big(\frac{d^2}{dt^2}+\omega^2-i\epsilon\Big)q(t)-\frac{2}{m}J(t)q(t)\Big)\Big].$$

(Note that in writing the harmonic oscillator terms on the RHS of (i) we have used the fact: $\dot q(t)\,q(t)=\text{const}$.)

From (9.4.15) and (9.4.16) we know that

(ii) $$\lim_{\epsilon\to0^+}\Big(\frac{d^2}{dt^2}+\omega^2-i\epsilon\Big)D_F(t-t')=-\delta(t-t').$$

We use this to define a new variable $\tilde q(t)$:

(iii) $$\tilde q(t)=q(t)+\frac{1}{m}\int_{-\infty}^{\infty}dt'\,D_F(t-t')J(t')$$

whose substitution in (i) reduces it to the form:

(iv) $$Z[J]=\lim_{\epsilon\to0^+}N\int[d\tilde q]\exp\Big\{-\frac{im}{2\hbar}\int_{-\infty}^{\infty}dt\,\Big[\Big(\tilde q(t)-\frac{1}{m}\int_{-\infty}^{\infty}dt'\,D_F(t-t')J(t')\Big)\Big(\frac{d^2}{dt^2}+\omega^2-i\epsilon\Big)\Big(\tilde q(t)-\frac{1}{m}\int_{-\infty}^{\infty}dt''\,D_F(t-t'')J(t'')\Big)$$
$$-\frac{2}{m}J(t)\Big(\tilde q(t)-\frac{1}{m}\int_{-\infty}^{\infty}dt'\,D_F(t-t')J(t')\Big)\Big]\Big\}.$$

In view of (ii) this simplifies to:

(v) $$Z[J]=\lim_{\epsilon\to0^+}N\Big[\int[d\tilde q]\exp\Big\{-\frac{im}{2\hbar}\int_{-\infty}^{\infty}dt\,\tilde q(t)\Big(\frac{d^2}{dt^2}+\omega^2-i\epsilon\Big)\tilde q(t)\Big\}\Big]\times\exp\Big\{-\frac{i}{2\hbar m}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}dt\,dt'\,J(t)D_F(t-t')J(t')\Big\}$$

The first factor in the above expression is the generating functional of the harmonic oscillator without source and can be written as $Z[0]$. This establishes the required result in (b). We would like to note here that $Z[0]$ is also written in the literature as:

(vi) $$Z[0]=\lim_{\epsilon\to0^+}N\Big[\det\Big(\frac{d^2}{dt^2}+\omega^2-i\epsilon\Big)\Big]^{-1/2}$$

where det stands for the determinant of the operator. This follows from the generalization of the Gaussian integral:

(vii) $$\int_{-\infty}^{\infty}dx\,e^{iax^2}=\sqrt{\frac{i\pi}{a}}$$

to the functional integral involving a Hermitian operator $O$:

(viii) $$\int[dq]\exp\Big(i\int_{t_i}^{t_f}dt\,q(t)O(t)q(t)\Big)=N[\det O(t)]^{-1/2}.$$
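The defining relation (ii) for the propagator used in Hint 1 can be checked numerically. The sketch below assumes the closed form D_F(t) = exp(-i*omega*|t|)/(2i*omega), which is the solution of (ii) with Feynman boundary conditions in the limit of vanishing epsilon; this closed form is not quoted in the passage, and the function names are hypothetical. The check uses two facts: away from t = 0 the oscillator operator annihilates D_F, and across t = 0 the slope of D_F jumps by -1, which is the coefficient of the delta function in (ii).

```python
import cmath

def d_feynman(t, omega=1.0):
    """Assumed closed form of the propagator: exp(-i*omega*|t|) / (2i*omega)."""
    return cmath.exp(-1j * omega * abs(t)) / (2j * omega)

def oscillator_operator(t, omega=1.0, h=1e-4):
    """Central-difference estimate of (d^2/dt^2 + omega^2) D_F at time t != 0."""
    second = (d_feynman(t + h, omega) - 2 * d_feynman(t, omega)
              + d_feynman(t - h, omega)) / h**2
    return second + omega**2 * d_feynman(t, omega)

def slope_jump(omega=1.0, h=1e-4):
    """Difference of the one-sided slopes of D_F at t = 0."""
    right = (d_feynman(h, omega) - d_feynman(0.0, omega)) / h
    left = (d_feynman(0.0, omega) - d_feynman(-h, omega)) / h
    return right - left

away_from_zero = oscillator_operator(0.8)   # should be close to 0
jump = slope_jump()                         # should be close to -1
equal_time = 1j * d_feynman(0.0)            # (i*hbar/m) D_F(0) in units hbar = m = omega = 1
```

With this form, the combination $(i\hbar/m)D_F(0)$ of (9.5.59) evaluates to $\hbar/(2m\omega)$, the familiar ground-state value of $\langle Q^2\rangle$ for the oscillator.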

2. The action $S_0[\phi,J]$ does not contain terms of order higher than two in $\phi$, hence it is defined as

(i) $$S_0[\phi,J]=\int d^4x\,\Big(\frac12\partial_\mu\phi\,\partial^\mu\phi-\frac{m^2}{2}\phi^2+J\phi\Big).$$

The field $\phi(x)\equiv\phi(\mathbf{x},t)$ defined on spacetime is assumed to satisfy the asymptotic condition:

(ii) $$\lim_{|\mathbf{x}|\to\infty}\phi(\mathbf{x},t)\to0.$$

The arguments of Exc. (9.5.1) can be used here by noting that the point $q(t)$ of the trajectory in the above exercise is now replaced by the field $\phi(x)$ satisfying (ii). The new variable of integration (see (iii) in Hint 1) is:

(iii) $$\tilde\phi(x)=\phi(x)+\int d^4x'\,D_F(x-x')J(x').$$

We substitute it in $Z_0[J]$, given below:

(iv) $$Z_0[J]=\lim_{\epsilon\to0^+}N\int[d\phi]\exp\Big(-\frac{i}{\hbar}\int d^4x\,\Big(\frac12\phi(x)(\partial_\mu\partial^\mu+m^2-i\epsilon)\phi(x)-J(x)\phi(x)\Big)\Big).$$

After repeating the steps given in Exc. 1, followed by simplifications, we obtain the required result.

3. From (9.5.81) and (9.5.85), retaining the terms linear in $\lambda$ only, we have:

(i) $$Z[J]=\Big(1-\frac{i\lambda\hbar^3}{4!}\int d^4x\,\frac{\delta^4}{\delta J^4(x)}\Big)Z_0[0]\exp\Big(-\frac{i}{2\hbar}\iint d^4x_1\,d^4x_2\,J(x_1)D_F(x_1-x_2)J(x_2)\Big).$$

To compute this we have to evaluate the functional derivative with respect to $J(x)$ to the 4th order. We do this in two steps, thus:

(ii) $$\frac{\delta^2}{\delta J^2(x)}\exp\Big(-\frac{i}{2\hbar}\iint d^4x_1\,d^4x_2\,J(x_1)D_F(x_1-x_2)J(x_2)\Big)$$
$$=\Big[-\frac{i}{\hbar}D_F(0)-\frac{1}{\hbar^2}\Big(\int d^4x_3\,D_F(x-x_3)J(x_3)\Big)\Big(\int d^4x_4\,D_F(x-x_4)J(x_4)\Big)\Big]\exp\Big(-\frac{i}{2\hbar}\iint d^4x_1\,d^4x_2\,J(x_1)D_F(x_1-x_2)J(x_2)\Big)$$

(In taking the functional derivative we have used the rules laid down in Appendix 9C, e.g., (9C.84) to (9C.88).) This then leads to (iii) below:

(iii) $$\int d^4x\,\frac{\delta^4}{\delta J^4(x)}\exp\Big(-\frac{i}{2\hbar}\iint d^4x_1\,d^4x_2\,J(x_1)D_F(x_1-x_2)J(x_2)\Big)$$
$$=\int d^4x\,\Big[-\frac{3}{\hbar^2}D_F(0)D_F(0)+\frac{6i}{\hbar^3}D_F(0)\Big(\int d^4x_3\,D_F(x-x_3)J(x_3)\Big)\Big(\int d^4x_4\,D_F(x-x_4)J(x_4)\Big)+\frac{1}{\hbar^4}\prod_{i=3}^{6}\Big(\int d^4x_i\,D_F(x-x_i)J(x_i)\Big)\Big]$$
$$\times\exp\Big(-\frac{i}{2\hbar}\iint d^4x_1\,d^4x_2\,J(x_1)D_F(x_1-x_2)J(x_2)\Big)$$

Substituting this into (i) we have:

(iv) $$Z[J]=Z_0[0]\Big[1+\frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x+\frac{\lambda}{4}D_F(0)\int d^4x\,\Big(\int d^4x_3\,D_F(x-x_3)J(x_3)\Big)\Big(\int d^4x_4\,D_F(x-x_4)J(x_4)\Big)$$
$$-\frac{i\lambda}{4!\hbar}\int d^4x\,\prod_{i=3}^{6}\Big(\int d^4x_i\,D_F(x-x_i)J(x_i)\Big)\Big]\exp\Big(-\frac{i}{2\hbar}\iint d^4x_1\,d^4x_2\,J(x_1)D_F(x_1-x_2)J(x_2)\Big).$$

From (iv) it follows that:

(v) $$Z[0]=Z_0[0]\Big(1+\frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x\Big)$$

showing the divergent nature of $Z[0]$. This divergence, however, is absorbed in the normalization constant. Taking the functional derivative of $Z[J]$ we now have:

(vi) $$\frac{\delta^2Z[J]}{\delta J(x_1)\delta J(x_2)}\Big|_{J=0}=Z_0[0]\Big[\Big(1+\frac{i\lambda\hbar}{8}D_F(0)D_F(0)\int d^4x\Big)\Big(-\frac{i}{\hbar}D_F(x_1-x_2)\Big)+\frac{\lambda}{2}D_F(0)\int d^4x\,D_F(x-x_1)D_F(x-x_2)\Big].$$
Using the above computations we obtain from (9.5.83)-(9.5.85):

(vii)
f™

506 Mathematical Perspectives on Theoretical Physics

=

tf Z0l0](l + ij^DF(0)DF(0)jd4x) x Z0[0][l + (l + " ^ DF(0)DF(0)jd4x^ [~jDF(Xl - x2)j + j-DF(O)j d4xDF(x - x{)DF(x - JC2)1

- -h2[-LDF(xl-X2)

+

j-DF(O)jd4xDF(x-xl)DF(x-x2)^

= ih DF(xx - x2) - — D F ( 0 ) J d*xDF(x - xx)DF(x - x2). (The sign — is used to indicate that we have considered here only the leading order term coming DF(0)DF(0)\d4x

from the expansion of \ In

(.

8

t

•'

> . This is valid since we want here terms up

J

to first order in A. The RHS of (viii) consists of two terms, the first of these is the familiar Feynman propagator for the free theory, and the second one is a divergent term due to the presence of DF (0) (the Feynman propagator with 0 argument). The second term is referred to as afirst order quantum correction to the propagator in the theory. The quantum corrections in field theories lead to divergences, is a well recognized feature of quantum field theories. These divergences are, however, taken care of through the process of renormalization (see in particular Chapter 9 and 10 in [6]). We leave the computation of the 4-point Green's function as an exercise to the reader. In view of the above discussions, we note that while the second term in (9.5.87) represents a divergent term, the third one does not. This confirms the above statement that while the first order quantum correction is divergent, the second order is not so that we can regard the second and third term as quantum corrections to the first term which represents the 4-point Green's function in the freefield theory (see (9.4.20)). 4. Let/a(A^) = 0 be given as: (i) fa = A% = 0. This gauge-fixing condition is called the Axial gauge. The gauge transformation (9.5.105) in this case gives: (ii)

/ a (A« ) = A«3 + eatK 6bA\ - ±d36a = - - ^ 3 0 " . 8 g

Thus in view of (9.5.104) the response matrix Rj=

^ 3 ^ ' showing that it is independent of

the gauge field. The factor containing (det Rj) can be dropped from (9.5.110), reducing it to: (iii)

Zf[J] = j [dAJ 5(A3) exp{iS[J]}

with S[J] as given in (c).

Basics of Quantum Theory 507

6 FEYNMAN GRAPHS

The final section of this chapter is devoted to one of the most interesting contributions of Feynman to quantum theories. These are known as Feynman graphs and have been found quite useful in superfield theories and string theories. We present here the basics that underlie the construction of these graphs, and show how perturbative expansions that are quite cumbersome even at low orders (see Exc. (5.3) for instance) can be diagrammatically represented using the Feynman rules. The key elements for a graphic representation of the $\phi^4$-theory are the Feynman propagator of the free theory, and the interaction. These are related to a line and to an intersection point called the vertex. Given below are the line and the point along with the mathematical expressions they stand for:

(a) [a line joining $x_1$ and $x_2$] $= i\hbar D_F(x_1-x_2)$

(b) [four lines from $x_1,x_2,x_3,x_4$ meeting at a point] $= V(x_1,x_2,x_3,x_4)$

$$V(x_1,x_2,x_3,x_4)=\frac{i}{\hbar}\,\frac{\delta^4S[\phi]}{\prod_{i=1}^{4}\delta\phi(x_i)}\Big|_{\phi=0}=-\frac{i\lambda}{\hbar}\int d^4x\,\prod_{i=1}^{4}\delta^4(x-x_i) \tag{9.6.1}$$

(see (9.5.78) for $S[\phi]$). With the help of these two elements, other non-trivial graphs can be constructed by joining the vertex to the propagators. Added to these elements is the rule for evaluating the graphs, which says that the integration is to be performed over the intermediate points where a vertex connects with a propagator. We illustrate it by using the diagram:

[a line from $x_1$ to $x_2$ passing through a vertex $y$, with a closed loop (bubble) attached at $y$] (9.6.2)

We note that, similar to the intersection shown in (9.6.1)(b), we have here the point characterized by the merging of $y_1,y_2,y_3,y_4$, and as in (9.6.1)(a) we have the lines $(x_1,y_1)$, $(y_2,x_2)$ and the curved line $(y_3,y_4)$ (or $(y_4,y_3)$). Hence using the mathematical expressions given in (9.6.1) and the integration rule suggested above, we have the required value of the graph (9.6.2) (subject to adjustments for symmetry) as:

$$\int d^4y_1\,d^4y_2\,d^4y_3\,d^4y_4\;i\hbar D_F(x_1-y_1)\,i\hbar D_F(y_2-x_2)\,i\hbar D_F(y_3-y_4)\,V(y_1,y_2,y_3,y_4)$$

$$=(i\hbar)^3\int d^4y_1\,d^4y_2\,d^4y_3\,d^4y_4\,D_F(x_1-y_1)D_F(y_2-x_2)D_F(y_3-y_4)\Big(-\frac{i\lambda}{\hbar}\int d^4y\,\delta^4(y-y_1)\delta^4(y-y_2)\delta^4(y-y_3)\delta^4(y-y_4)\Big)$$

$$=-\lambda\hbar^2D_F(0)\int d^4y\,D_F(x_1-y)D_F(y-x_2) \tag{9.6.3}$$

We note that there is a symmetry in the diagram (9.6.2), as the bubble can be rotated through 180° leaving the diagram unaltered; there is one such bubble, hence the symmetry factor is $2^1=2$. Dividing by 2 we obtain the actual value of (9.6.2) in terms of Feynman propagators:

$$[\text{diagram (9.6.2)}]=-\frac{\lambda\hbar^2}{2}D_F(0)\int d^4y\,D_F(x_1-y)D_F(y-x_2) \tag{9.6.4}$$
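The role of the symmetry factor 1/2 in the first-order correction to the propagator can be checked in a zero-dimensional stand-in, which is an assumption made here for illustration and not a construction from the text. With the real (Euclidean) weight exp(-q^2/2 - lam*q^4/24), the free "propagator" is 1 and the bubble diagram contributes -lam/2 at first order, the 1/2 being the same symmetry factor as in the graph above. The function name `expectation` is hypothetical.

```python
import math

def expectation(power, lam, L=10.0, n=4001):
    """<q^power> for the weight exp(-q^2/2 - lam*q^4/24), by the trapezoid rule."""
    h = 2 * L / (n - 1)
    num = den = 0.0
    for k in range(n):
        q = -L + k * h
        c = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        w = c * math.exp(-0.5 * q * q - lam * q**4 / 24.0)
        num += q**power * w
        den += w
    return num / den

lam = 0.01
two_point = expectation(2, lam)

# Free propagator (= 1) plus the one-bubble graph with symmetry factor 1/2.
with_symmetry_factor = 1.0 - lam / 2.0
# The same graph without dividing by the symmetry factor, for comparison.
without_symmetry_factor = 1.0 - lam
```

For small `lam` the numerically computed two-point function agrees with `with_symmetry_factor` up to terms of order lam squared, while `without_symmetry_factor` is off already at first order.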

From Exc. 3 of Sec. (9.5) we know that this is the first order (linear in $\lambda$) correction to the propagator; this confirms (in a small way) that perturbative expansions can be represented by a graph. Using the above graphic representations we can write (9.5.86) and (9.5.87) as:

$\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle$ = [line joining $x_1$ and $x_2$] + [line joining $x_1$ and $x_2$ with one bubble attached] (9.6.5)

$\langle 0|T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4))|0\rangle$ = [three diagrams, each consisting of two parallel lines pairing the points $x_1,\ldots,x_4$] + [six diagrams in which one of the two lines carries a bubble] + [the four-point vertex joining $x_1,x_2,x_3,x_4$] (9.6.6)

Note that in (9.6.6) the three sets of parallel lines correspond to the first three terms of (9.5.87). Also, in view of (9.6.4), the next six diagrams correspond to the next six terms of (9.5.87), while the last term there corresponds to the diagram of (9.6.1)(b).

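6.1 Connected Diagrams

In Sec. 4 and 5 we have referred to connected diagrams and connected Green's functions (see Comment (9.4.1) and Subsec. 5.5); these, as we shall see, are obtained by considering the logarithmic generating functional $W[J]$ that we had introduced in (9.5.53): $W[J]=-i\hbar\ln Z[J]$. In view of (9.5.54) and (9.5.55), for a $\phi^4$-theory this gives:

$$\frac{\delta W[J]}{\delta J(x_1)}\Big|_{J=0}=(-i\hbar)\,\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(x_1)}\Big|_{J=0}=\langle 0|\phi(x_1)|0\rangle \tag{9.6.7}$$

and

$$(-i\hbar)\frac{\delta^2W[J]}{\delta J(x_1)\delta J(x_2)}\Big|_{J=0}=(-i\hbar)^2\Big[\frac{1}{Z[J]}\frac{\delta^2Z[J]}{\delta J(x_1)\delta J(x_2)}-\frac{1}{Z^2[J]}\frac{\delta Z[J]}{\delta J(x_1)}\frac{\delta Z[J]}{\delta J(x_2)}\Big]_{J=0}$$

$$=\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle-\langle 0|\phi(x_1)|0\rangle\langle 0|\phi(x_2)|0\rangle=\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle_c \tag{9.6.8}$$

Since $\langle 0|\phi(x)|0\rangle=0$ in a $\phi^4$-theory, we note that the second term in (9.6.8) is zero, hence,

$$\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle_c=\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle \tag{9.6.9}$$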

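This shows that the 2-point Green's function in $\phi^4$-theory is connected and its graphic representation is the same as the one given in (9.6.5). (The subscript $c$ in $\langle\ \rangle_c$ indicates that the Green's function is evaluated using a connected diagram.) We can similarly write the 3- and 4-point connected Green's functions by taking the third and fourth order functional derivatives of $W[J]$. In the case of $\phi^4$-theory, however, the odd order Green's functions are zero; hence we consider the 4-point connected Green's function, which is given by:

$$(-i\hbar)^3\frac{\delta^4W[J]}{\delta J(x_1)\delta J(x_2)\delta J(x_3)\delta J(x_4)}\Big|_{J=0}=\langle 0|T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4))|0\rangle_c$$

$$=\langle 0|T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4))|0\rangle-\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle\langle 0|T(\phi(x_3)\phi(x_4))|0\rangle$$

$$-\langle 0|T(\phi(x_1)\phi(x_3))|0\rangle\langle 0|T(\phi(x_2)\phi(x_4))|0\rangle-\langle 0|T(\phi(x_1)\phi(x_4))|0\rangle\langle 0|T(\phi(x_2)\phi(x_3))|0\rangle \tag{9.6.10}$$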
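The subtraction in (9.6.10) can be followed with exact arithmetic in a zero-dimensional stand-in. This is an illustrative sketch under assumptions not made in the text: the fields are replaced by a single variable q with the real (Euclidean) weight exp(-q^2/2 - lam*q^4/24), and moments are expanded to first order in lam. The names `gaussian_moment` and `moment_orders` are hypothetical. In this setting the three subtracted products collapse to 3<q^2>^2, and what survives at first order is the bare vertex contribution.

```python
from fractions import Fraction

def gaussian_moment(n):
    """<q^n> under the weight exp(-q^2/2): (n-1)!! for even n, 0 for odd n."""
    if n % 2:
        return Fraction(0)
    result = Fraction(1)
    for k in range(n - 1, 0, -2):
        result *= k
    return result

def moment_orders(n):
    """Coefficients of lam^0 and lam^1 in <q^n> for the weight
    exp(-q^2/2 - lam*q^4/24), after dividing by the normalization."""
    zeroth = gaussian_moment(n)
    first = -(gaussian_moment(n + 4) - gaussian_moment(n) * gaussian_moment(4)) / 24
    return zeroth, first

m2_0, m2_1 = moment_orders(2)
m4_0, m4_1 = moment_orders(4)

# Connected part <q^4> - 3 <q^2>^2, expanded to first order in lam.
connected_0 = m4_0 - 3 * m2_0**2
connected_1 = m4_1 - 6 * m2_0 * m2_1
```

Here `connected_0` vanishes, so the free theory has no connected four-point function, and `connected_1` equals -1, i.e. the connected part is -lam at first order: only the vertex piece remains once the products of two-point functions are removed.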

Using the graphic representations given in (9.6.5) and (9.6.6), it is evident that we are retaining just one piece of the graph in (9.6.6), hence we have:

$\langle 0|T(\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4))|0\rangle_c$ = [the four-point vertex joining $x_1,x_2,x_3,x_4$] (9.6.11)

Using the fact that $W[J]$ generates connected Green's functions, we can write down a diagrammatic expansion of the 2-point Green's function up to the order $\lambda^2$ as:

[full propagator between $x_1$ and $x_2$] = [free line between $x_1$ and $x_2$] + [line with one bubble] + [the order-$\lambda^2$ graphs between $x_1$ and $x_2$] (9.6.12)

From the above discussions it is evident that perturbation series can be more easily expanded via these graph techniques. To study these further we would now like to introduce the notion of irreducibility among these diagrams. In this connection we note that $W[J]$, which generates connected diagrams, sometimes contains diagrams that are reducible to two connected diagrams when an internal line is cut. The third graph in the RHS of (9.6.12) is an example of such reducibility. These diagrams are called 1P (one particle) reducible. We shall see that 1P irreducible (one particle irreducible) diagrams are of a more fundamental nature, since we can construct all the connected diagrams with their help. To achieve this goal, we extend our considerations of field theory to a broader spectrum. For instance, we know that a one-point function in the presence of an external source is given by:

$$\langle 0|\phi(x)|0\rangle_J=\frac{\delta W[J]}{\delta J(x)}=(-i\hbar)\,\frac{1}{Z[J]}\frac{\delta Z[J]}{\delta J(x)} \tag{9.6.13}$$

and that this is zero for a $\phi^4$-theory in the absence of $J$. We now consider the one-point function (vacuum expectation value) in the case of an arbitrary field and write it as:

$$\langle 0|\phi(x)|0\rangle=\langle 0|\exp((i/\hbar)P\cdot x)\,\phi(0)\,\exp(-(i/\hbar)P\cdot x)|0\rangle \tag{9.6.14}$$

where $P_\mu$ is the generator of space-time translations. If we assume that the vacuum state in the Hilbert space under consideration is unique and is Poincaré invariant:

$$\exp(-(i/\hbar)P\cdot x)|0\rangle=|0\rangle\quad\text{or}\quad P_\mu|0\rangle=0 \tag{9.6.15}$$

then, by symmetry arguments, we conclude that

$$\langle 0|\phi(x)|0\rangle=\langle 0|\phi(0)|0\rangle=\text{constant} \tag{9.6.16}$$

(For a $\phi^4$-theory, this constant is zero.) It is also worth pointing out that the value of the one-point function plays an important role in the study of symmetries, as a non-vanishing value of it implies spontaneous breakdown of some symmetry in the theory (see Sec. 6.5). Returning to the one-point function in the presence of a source, we note that it is a functional of the source and in general it is not zero; we denote it as $\phi_c(x)$ and refer to it as the classical field of the theory (though it is only a classical variable), thus from (9.6.13) we have:

$$\langle 0|\phi(x)|0\rangle_J=\phi_c(x)=\frac{\delta W[J]}{\delta J(x)} \tag{9.6.17}$$

It is natural to ask here why this vacuum expectation value is called a classical field. The simple answer is: because it behaves like a classical field. This can be seen from the following observations:

Observation 9.6.1 The generating functional

$$Z[J] = N\int[d\phi]\,\exp\big((i/\hbar)S[\phi,J]\big)$$

is independent of $\phi$; thus an infinitesimal change $\delta\phi(x)$ in the integrand leaves $Z[J]$ invariant, i.e.:

$$\delta Z[J] = \frac{iN}{\hbar}\int[d\phi]\,\delta S[\phi,J]\,\exp\big((i/\hbar)S[\phi,J]\big) = \frac{iN}{\hbar}\int[d\phi]\left(\int d^4x\,\delta\phi(x)\,\frac{\delta S[\phi,J]}{\delta\phi(x)}\right)\exp\big((i/\hbar)S[\phi,J]\big) = 0 \tag{9.6.18}$$

(We have assumed here that $[d\phi]$ does not change under this redefinition of the field variable.) Since $\delta\phi(x)$ is arbitrary, (9.6.18) implies:

$$N\int[d\phi]\,\frac{\delta S[\phi,J]}{\delta\phi(x)}\,\exp\big((i/\hbar)S[\phi,J]\big) = 0 \tag{9.6.19}$$

Observation 9.6.2 By definition Eq. (9.6.19) is the vacuum expectation value equation in the presence of source $J$:

$$\left\langle 0\left|\frac{\delta S[\phi,J]}{\delta\phi(x)}\right|0\right\rangle^{J} = 0 \tag{9.6.20}$$

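where $\delta S[\phi,J]/\delta\phi(x) = 0$ is the familiar Euler-Lagrange equation. From Sec. 5 (see (9.5.80)) we know that for the $\phi^4$-theory this reads:

$$\partial_\mu\partial^\mu\phi + m^2\phi + \frac{\lambda}{3!}\phi^3 - J(x) \equiv F(\phi(x)) - J(x) = 0 \tag{9.6.21}$$

where we have written the first three terms as a functional $F(\phi(x))$. Substituting it in $\delta S[\phi,J]/\delta\phi(x)$, we note that (9.6.19) can be expressed as:

$$-N\int[d\phi]\,\big(F(\phi(x)) - J(x)\big)\exp\big((i/\hbar)S[\phi,J]\big) = 0 \tag{9.6.22}$$

In view of the equality:

$$-i\hbar\,\frac{\delta Z[J]}{\delta J(x)} = N\int[d\phi]\,\phi(x)\,\exp\big((i/\hbar)S[\phi,J]\big) \tag{9.6.23}$$

we can identify:

$$\phi(x) \to -i\hbar\,\frac{\delta}{\delta J(x)}$$

and thus write (9.6.22) as:

$$\left[F\!\left(-i\hbar\,\frac{\delta}{\delta J(x)}\right) - J(x)\right]Z[J] = 0 \tag{9.6.24}$$

which can also be written as

$$\left[F\!\left(-i\hbar\,\frac{\delta}{\delta J(x)}\right) - J(x)\right]\exp\big((i/\hbar)W[J]\big) = 0$$

or as

$$\exp\big(-(i/\hbar)W[J]\big)\left[F\!\left(-i\hbar\,\frac{\delta}{\delta J(x)}\right) - J(x)\right]\exp\big((i/\hbar)W[J]\big) = 0.$$

This gives:

$$F\!\left(\frac{\delta W[J]}{\delta J(x)} - i\hbar\,\frac{\delta}{\delta J(x)}\right) - J(x) = 0 \tag{9.6.25}$$

and after using (9.6.17) this yields:

$$F\!\left(\phi_c(x) - i\hbar\,\frac{\delta}{\delta J(x)}\right) - J(x) = 0 \tag{9.6.26}$$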

Equation (9.6.26) is the (full) dynamical equation of the theory at the quantum level. Although it is not the same as the classical Euler-Lagrange equation, we note that when $\hbar \to 0$ it takes the classical form (9.6.21). This explains the nomenclature 'classical field' for $\phi_c(x)$, the vacuum expectation value of an arbitrary field $\phi$ in the presence of a source (see (9.6.17)). Now the quantum equation (9.6.26) written out in full (using (9.6.21)) becomes:

$$(\partial_\mu\partial^\mu + m^2)\left(\phi_c(x) - i\hbar\,\frac{\delta}{\delta J(x)}\right) + \frac{\lambda}{3!}\left(\phi_c(x) - i\hbar\,\frac{\delta}{\delta J(x)}\right)^3 - J(x) = 0 \tag{9.6.27}$$

where each functional derivative acts on everything to its right and the operator expression as a whole acts on 1, or

$$(\partial_\mu\partial^\mu + m^2)\phi_c(x) + \frac{\lambda}{3!}\phi_c^3(x) - \frac{i\lambda\hbar}{2!}\,\phi_c(x)\frac{\delta\phi_c(x)}{\delta J(x)} - \frac{\lambda\hbar^2}{3!}\,\frac{\delta^2\phi_c(x)}{\delta J(x)\,\delta J(x)} - J(x) = 0$$

Reintroducing $W[J]$ this can be written as:

$$(\partial_\mu\partial^\mu + m^2)\frac{\delta W[J]}{\delta J(x)} + \frac{\lambda}{3!}\left(\frac{\delta W[J]}{\delta J(x)}\right)^3 - \frac{i\lambda\hbar}{2!}\,\frac{\delta W[J]}{\delta J(x)}\,\frac{\delta^2 W[J]}{\delta J^2(x)} - \frac{\lambda\hbar^2}{3!}\,\frac{\delta^3 W[J]}{\delta J^3(x)} - J(x) = 0 \tag{9.6.28}$$

The above equation governs the full dynamics of the quantum theory. Since Green's functions are given by functional derivatives of $W[J]$, the dynamical equations satisfied by the various Green's functions of the theory can be obtained from it. These equations are known as the Schwinger-Dyson equations. We would like to remark here that some of the terms in (9.6.28) are not well defined, since they are products of operators evaluated at the same spacetime point. (It is known that in quantum theory products of field operators at the same spacetime point are always ill defined, and they can be used only after regularization; see Appendix 9C for the definition of regularization.) It is interesting to note, however, that when $\hbar \to 0$ all those ill-defined terms disappear, leaving the familiar Euler-Lagrange equation:

$$(\partial_\mu\partial^\mu + m^2)\phi_c(x) + \frac{\lambda}{3!}\phi_c^3(x) - J(x) = 0 \tag{9.6.29}$$

The above equation can be solved iteratively by writing it as an integral equation in terms of a propagator, thus

$$\phi_c(x) = -\int d^4y\,D_F(x-y)\left(J(y) - \frac{\lambda}{3!}\phi_c^3(y)\right) = -\int d^4y\,D_F(x-y)\,J(y) + \frac{\lambda}{3!}\int d^4y\,D_F(x-y)\,\phi_c^3(y) \tag{9.6.30}$$

After simplification the iterative solution has the form:

$$\phi_c(x) = -\int d^4y\,D_F(x-y)\,J(y) - \frac{\lambda}{3!}\int d^4y_1\,d^4y_2\,d^4y_3\,d^4y_4\,D_F(x-y_1)\,D_F(y_1-y_2)\,D_F(y_1-y_3)\,D_F(y_1-y_4)\,J(y_2)\,J(y_3)\,J(y_4) + \cdots \tag{9.6.31}$$
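The iteration leading from (9.6.30) to (9.6.31) can be illustrated with a minimal sketch in a zero-dimensional toy model, where the field is a single number and the integrals disappear. This is an assumption made purely for illustration: the operator $(\partial_\mu\partial^\mu + m^2)$ is replaced by the number $m^2$, so the analogue of $D_F$ is $-1/m^2$, and the function names and parameter values below are hypothetical.

```python
def propagator(m2):
    """Toy analogue of D_F: solves m2 * D = -1 (zero-dimensional assumption)."""
    return -1.0 / m2


def iterate_classical_field(J, m2=1.0, lam=0.1, n_iter=50):
    """Iterate phi = -D * (J - lam/3! * phi**3), the toy version of (9.6.30)."""
    D = propagator(m2)
    phi = 0.0  # start from the free (lam = 0) guess of a vanishing field
    for _ in range(n_iter):
        phi = -D * (J - lam / 6.0 * phi**3)
    return phi


def truncated_series(J, m2=1.0, lam=0.1):
    """First two terms of the toy version of (9.6.31): -D J - lam/3! D^4 J^3."""
    D = propagator(m2)
    return -D * J - lam / 6.0 * D**4 * J**3


if __name__ == "__main__":
    J = 0.5
    phi = iterate_classical_field(J)
    # Residual of the toy Euler-Lagrange equation m2*phi + lam/3!*phi^3 - J = 0
    print("residual:", 1.0 * phi + 0.1 / 6.0 * phi**3 - J)
    print("iterated:", phi, "two-term series:", truncated_series(J))
```

For small coupling the iterated value and the two-term series agree up to terms of order $\lambda^2$, which is the content of the "$+\cdots$" in (9.6.31).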

The iterative solution can also be diagrammatically represented as:

[Diagram (9.6.32): graphical form of the series (9.6.31) for $\phi_c(x)$, built from propagator lines starting at $x$, $\lambda$ vertices, and source vertices $\times$.]

where we use a vertex $\times$ to describe the interaction of the field with the external source as:

$$\times = \frac{i}{\hbar}\,\frac{\delta S[\phi,J]}{\delta\phi(x)}\bigg|_{\phi=0} = \frac{i}{\hbar}\,J(x) \tag{9.6.33}$$

6.2 Effective Functional and Feynman Graphs with Vertex-functions

We now describe an important feature of $\phi_c(x) = \delta W[J]/\delta J(x)$. We note that this relation can be viewed as though the variables $J(x)$ and $\phi_c(x)$ are conjugate in some sense. Thus just as $\phi_c(x)$ is a functional of $J(x)$, by inverting their roles, $J(x)$ can be expressed (perturbatively) as a functional of $\phi_c(x)$. Using (9.6.17) we therefore define a new functional:

$$\Gamma[\phi_c] = W[J] - \int d^4x\,J(x)\,\phi_c(x) \tag{9.6.34}$$

whose functional derivative is easily seen to be (see Exc. 9.6.1):

$$\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)} = -J(x) \tag{9.6.35}$$
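The Legendre transform (9.6.34) and its derivative (9.6.35) can be checked numerically in a one-variable toy model. The sketch below assumes a hypothetical polynomial $W(J)$ (not derived from any action) and uses finite differences; it also checks the one-variable analogue of the relation between the second derivatives of $W$ and $\Gamma$, namely $W''\,\Gamma'' = -1$.

```python
A_COEFF = 2.0  # hypothetical coefficients of the toy generating function
B_COEFF = 0.5


def W(J):
    """Toy generating function W(J) = a J^2/2 + b J^4/4 (an assumption)."""
    return 0.5 * A_COEFF * J**2 + 0.25 * B_COEFF * J**4


def phi_c(J):
    """Classical variable phi_c = dW/dJ, the analogue of (9.6.17)."""
    return A_COEFF * J + B_COEFF * J**3


def Gamma(J):
    """Legendre transform W - J*phi_c, parametrised by J as in (9.6.34)."""
    return W(J) - J * phi_c(J)


def dGamma_dphi(J, h=1e-5):
    """dGamma/dphi_c by central differences, using phi_c(J) as the variable."""
    return (Gamma(J + h) - Gamma(J - h)) / (phi_c(J + h) - phi_c(J - h))


def d2Gamma_dphi2(J, h=1e-4):
    """Second derivative of Gamma with respect to phi_c."""
    return (dGamma_dphi(J + h) - dGamma_dphi(J - h)) / (phi_c(J + h) - phi_c(J - h))


def d2W_dJ2(J, h=1e-4):
    """Second derivative of W with respect to J."""
    return (phi_c(J + h) - phi_c(J - h)) / (2.0 * h)


if __name__ == "__main__":
    J = 0.7
    print("dGamma/dphi_c + J =", dGamma_dphi(J) + J)      # should be close to 0
    print("W'' * Gamma''     =", d2W_dJ2(J) * d2Gamma_dphi2(J))  # close to -1
```

The parametrisation by $J$ avoids inverting $\phi_c(J)$ explicitly, which is possible here only because $\phi_c$ is monotonic in $J$ for the chosen coefficients.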

We have the following remark concerning the new functional $\Gamma[\phi_c]$:

Remark 9.6.4 Equation (9.6.35) defines the source $J(x)$ as a functional of the classical field $\phi_c(x)$. When $J \to 0$, $\phi_c(x) \to$ a constant, hence (9.6.35) gives:

$$\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\bigg|_{\phi_c(x)\,=\,\text{constant}} = 0 \tag{9.6.36}$$

The above (extremum) equation is often used to help determine whether a symmetry is spontaneously broken.

We show next how the effective functional $W$ and the effective action functional $\Gamma$ can be utilized in expressing the theory in a compact form. For this we write:

$$W^{(n)} = \frac{\delta^n W[J]}{\delta J(x_1)\cdots\delta J(x_n)} \tag{9.6.37a}$$

$$\Gamma^{(n)} = \frac{\delta^n \Gamma[\phi_c]}{\delta\phi_c(x_1)\cdots\delta\phi_c(x_n)} \tag{9.6.37b}$$

and note that using these the equality:

$$\int d^4x\,\frac{\delta^2 W[J]}{\delta J(x_2)\,\delta J(x)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(x)\,\delta\phi_c(x_1)} = -\delta^4(x_1 - x_2) \tag{9.6.38}$$

can be written as an 'operational equation':

$$W^{(2)}\,\Gamma^{(2)} = -1 \tag{9.6.39}$$

(See (iii) in the Hint to Exc. (6.2).) This coordinate-free equation, when applied to matrix elements in the coordinate basis, reverts to its originating form (9.6.38). From our study in Sec. 4 and Sec. 5 (in particular Eq. (9.5.59)) and Eq. (9.6.8), we notice that there exists a relation between the Feynman propagator and the effective functional $W$; to see it further we write:

$$W^{(2)}\big|_{J=0} = -D \tag{9.6.40}$$

Moreover, when $J = 0$, $\phi_c$ = constant, hence (9.6.39) can be written as:

$$W^{(2)}\big|_{J=0}\,\Gamma^{(2)}\big|_{\phi_c} = -1 \tag{9.6.41a}$$

or equivalently as:

$$D\,\Gamma^{(2)}\big|_{\phi_c} = 1 \tag{9.6.41b}$$

This shows that $\Gamma^{(2)}|_{\phi_c}$ is the inverse of the propagator at every order of perturbation theory. We use this fact to write:

$$\Gamma^{(2)}\big|_{\phi_c} = \Gamma_0^{(2)} + \Sigma = D_F^{-1} + \Sigma \tag{9.6.42}$$

where $\Sigma$ denotes the quantum corrections in $\Gamma^{(2)}$.$^{31}$

$^{31}$ In the case of the ground state $|0\rangle$, the logarithmic functional $W[J]$ becomes $W_0[J]$ and $\Gamma(\phi_c)$ is denoted $\Gamma_0(\phi_c)$; therefore $D$ in this case is $D_F$ (see (9.5.82-83)).

From (9.6.41)(b) it follows that:

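$$D = \frac{1}{D_F^{-1} + \Sigma} = \frac{1}{D_F^{-1}} - \frac{1}{D_F^{-1}}\,\Sigma\,\frac{1}{D_F^{-1}} + \frac{1}{D_F^{-1}}\,\Sigma\,\frac{1}{D_F^{-1}}\,\Sigma\,\frac{1}{D_F^{-1}} - \cdots = D_F - D_F\,\Sigma\,D_F + D_F\,\Sigma\,D_F\,\Sigma\,D_F - \cdots \tag{9.6.43}$$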
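The resummation in (9.6.43) is a geometric series, which can be seen in a minimal numerical sketch. The assumption here is that $D_F$ and $\Sigma$ are treated as plain numbers (as they would be for a single momentum mode) rather than as operators; the values used are hypothetical.

```python
def full_propagator(D_F, Sigma):
    """Closed form D = 1 / (D_F^{-1} + Sigma), the left side of (9.6.43)."""
    return 1.0 / (1.0 / D_F + Sigma)


def dyson_partial_sum(D_F, Sigma, n_terms):
    """Partial sum D_F - D_F Sigma D_F + D_F Sigma D_F Sigma D_F - ..."""
    total = 0.0
    term = D_F  # zeroth-order term: the free propagator
    for _ in range(n_terms):
        total += term
        term = term * (-Sigma) * D_F  # attach one more self-energy insertion
    return total


if __name__ == "__main__":
    D_F, Sigma = 0.5, 0.3  # hypothetical values with |Sigma * D_F| < 1
    for n in (1, 2, 3, 10, 40):
        print(n, dyson_partial_sum(D_F, Sigma, n))
    print("closed form:", full_propagator(D_F, Sigma))
```

The partial sums converge only when $|\Sigma D_F| < 1$; the closed form on the left of (9.6.43) remains meaningful beyond that range, which is one reason the resummed expression is the more useful one.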

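A diagrammatic representation of the above equation is obtained by choosing the following graph for $\Sigma$:

[Diagram (9.6.44): $-\Sigma$ represented by a blob.]

As a consequence the relation between the propagator and $\Sigma$ is given by:

[Diagram (9.6.45): the full propagator between $x$ and $y$ drawn as the free propagator, plus the free propagator with one blob inserted, plus the free propagator with two blobs inserted, and so on.]

The above diagram is known as the proper self energy diagram, and it shows that $\Sigma$ is indeed the 1P irreducible (1PI) 2-point vertex function. As a final piece to this (fascinating and vast) theory of Feynman graphs, we choose the following diagrams which represent $W^{(2)}$, $W^{(3)}$ and $\Gamma^{(3)}$ respectively:

$$[\text{line joining } x \text{ and } y] = -i\hbar\,W^{(2)}(x,y)\big|_{J=0} \tag{9.6.46a}$$

$$[\text{connected graph with legs } x, y, z] = (-i\hbar)^2\,W^{(3)}(x,y,z)\big|_{J=0} \tag{9.6.46b}$$

$$[\text{vertex graph with points } x, y, z] = \frac{i}{\hbar}\,\Gamma^{(3)}(x,y,z)\big|_{\phi_c} \tag{9.6.46c}$$

We then use the above diagrams to obtain the diagrammatic relations for the operational equation:

$$W^{(3)} = W^{(2)}\,W^{(2)}\,W^{(2)}\,\Gamma^{(3)} \tag{9.6.47}$$

and for the connected Green's function given in (9.6.10). We thus have:

[Diagram (9.6.48): graphical form of (9.6.47), with three propagators attached to the vertex $\Gamma^{(3)}$.]

[Diagram (9.6.49): graphical form of the connected Green's function of (9.6.10) with external points $x$, $y$, $z$, $w$, plus permutation terms.]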

Having obtained these representations we are now ready to make a few remarks concerning these graphs and $\Gamma[\phi_c]$.

Remark 9.6.5 Every connected diagram can be reduced to (1PI) diagrams, a point that we already made earlier. We note that $\Gamma^{(3)}$ in (9.6.48) is one of the simplest examples of 1PI; in fact it gives the proper 3-point vertex function (see Remark 9.6.7).

Remark 9.6.6 The effective action functional (of the theory) $\Gamma[\phi_c]$ can be expanded in terms of the functions $\Gamma^{(n)}(x_1, \ldots, x_n)$ as:

$$\Gamma[\phi_c] = \sum_n \int d^4x_1 \cdots d^4x_n\,\frac{1}{n!}\,\Gamma^{(n)}(x_1, \ldots, x_n)\,\phi_c(x_1)\cdots\phi_c(x_n) \tag{9.6.50}$$

where it is assumed that $\phi_c = 0$ (in the expansion). (See the explanation for ...)

Remark 9.6.7 The function $\Gamma^{(n)}(x_1, \ldots, x_n)$ evaluated at $\phi_c = 0$ is called the proper (1PI) n-point vertex function of the theory, and it is these vertex functions that, along with appropriate external wave functions, lead to the scattering matrix of the theory.

Remark 9.6.8 In view of the above two remarks $\Gamma[\phi_c]$ is also known as the (1PI)-generating functional. Being an effective action it can also be expanded like the classical action $S[\phi]$ (see (9.5.78)) in powers of the derivative or momentum. Thus we have:

$$\Gamma[\phi_c] = \int d^4x\left(-V_{\text{eff}}(\phi_c(x)) + \frac{1}{2}A(\phi_c(x))\,\partial_\mu\phi_c(x)\,\partial^\mu\phi_c(x) + \cdots\right) \tag{9.6.51}$$

where $V_{\text{eff}}$ is chosen to represent the potential $-\left(-\frac{m^2}{2}\phi^2 - \frac{\lambda}{4!}\phi^4\right)$ in terms of $\phi_c$. Note that higher order derivative terms have been neglected here. Since $\phi_c(x) = \phi_c$ = constant in the absence of sources, all the derivative terms in (9.6.51) vanish and $\Gamma[\phi_c]$ equals:

$$\Gamma[\phi_c]\big|_{J=0} = -\int d^4x\,V_{\text{eff}}(\phi_c) = -V_{\text{eff}}(\phi_c)\int d^4x = -(2\pi)^4\,\delta^4(0)\,V_{\text{eff}}(\phi_c)^{*} \tag{9.6.52}$$

$^{*}$ The space-time volume $\int d^4x$ is usually denoted by $(2\pi)^4\,\delta^4(0)$.

This shows that in this limit the effective action simply picks out the effective potential, including quantum corrections of all orders. Using (9.6.52) it can be easily established that the renormalized values of the masses and the coupling constants (including all quantum corrections) are:

$$\frac{d^2V_{\text{eff}}}{d\phi_c^2}\bigg|_{\phi_c=\langle\phi\rangle} = m_R^2 \tag{9.6.53a}$$

$$\frac{d^4V_{\text{eff}}}{d\phi_c^4}\bigg|_{\phi_c=\langle\phi\rangle} = \lambda_R \tag{9.6.53b}$$

(see Exc. (6.4)).

So far in this section we have dealt with the description of Feynman graphs in coordinate space. However, from our discussions in Sec. 4, we know that these rules can be generalized to momentum space by taking the Fourier transforms of the functions involved and then defining the basic graphs in terms of these Fourier transforms. We have briefly illustrated this in Exc. 5 of this section by using the rules:

$$[\text{line carrying momentum } p] = i\hbar\,D_F(p) = \lim_{\epsilon\to 0}\frac{i\hbar}{p^2 - m^2 + i\epsilon} \tag{9.6.54a}$$

$$[\text{vertex joining lines with momenta } p_1, p_2, p_3, p_4] = -\frac{i\lambda}{\hbar}\,\delta^4(p_1 + p_2 + p_3 + p_4) \tag{9.6.54b}$$

Note that (b) above refers to the $\phi^4$-theory. In the following remark we list important points concerning the evaluations in momentum space.

Remark 9.6.9
(i) In a proper vertex diagram, there are no external propagators or legs.
(ii) All the momenta associated with the internal lines (propagators) must be integrated; thus if the number of internal lines is $I$, there are $I$ momentum integrations.
(iii) Not all momenta (in the integration) are independent, since at each vertex there is a momentum-conserving $\delta$-function; thus if the number of vertices is $V$, the number of $\delta$-functions is also $V$. The number of momentum integrations thus reduces to $I - V$. Nevertheless, there is an overall momentum-conserving $\delta$-function for the amplitude, which means that the final number of independent internal momentum integrations is $L = I - V + 1 \equiv P + 1$.
(iv) Every proper vertex diagram, i.e., (1PI) diagram with $V$ vertices and $I$ internal lines, has a total number of $\hbar$ factors associated with it which equals the number $P$. Thus the diagram behaves like $\sim\hbar^P$.
(v) The number $L$ of (iii) equals the number of loops in a diagram, and since $L = P + 1$, it follows from (iv) above that expanding an amplitude in powers of $\hbar$ is in fact an expansion in the number of loops. Since $\hbar$ is a small quantity, the loop expansion provides an efficient perturbative expansion.

Finally, we note that while in general it is simpler to evaluate a diagram (i.e. calculate the Green's functions) in momentum space, very often the process introduces infinities. For instance, in relativistic field theory the momentum variable in the loop integration ranges all the way from zero to infinity and thus permits no intrinsic cut-off in the momenta. The divergences due to this naturally make the calculations meaningless. However, this 'unwanted situation' is remedied via the prescription provided by the theory of renormalization.
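The counting in Remark 9.6.9 can be sketched as a pair of one-line functions. The function names are hypothetical, and the two example diagrams (the one-loop and two-loop self-energy graphs of a $\phi^4$-theory, with their internal-line and vertex counts) are chosen here for illustration and are not taken from the text.

```python
def hbar_power(internal_lines, vertices):
    """P = I - V: one factor of hbar per propagator, one of 1/hbar per vertex."""
    return internal_lines - vertices


def loop_count(internal_lines, vertices):
    """L = I - V + 1: independent loop momenta left after the delta-functions."""
    return internal_lines - vertices + 1


if __name__ == "__main__":
    # (internal lines I, vertices V) for two 1PI two-point graphs in phi^4
    examples = {
        "one-loop self-energy": (1, 1),
        "two-loop self-energy": (3, 2),
    }
    for name, (I, V) in examples.items():
        print(name, "L =", loop_count(I, V), "P =", hbar_power(I, V))
```

In both cases $L = P + 1$, so ordering diagrams by their power of $\hbar$ is the same as ordering them by the number of loops.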
The theory of renormalization suggests ways and means to isolate and remove all these infinities from physically measurable quantities. One of the methods used is the introduction of a regularization scheme under which all divergent integrals are made finite; the quantities involved are then freely manipulated to carry out the calculation. In the case of a divergent diagram, one first separates the divergent part from the finite part and then allows the divergences to be absorbed in some appropriate redefinitions of mass, coupling, and field operators.

We wish to point out here that our introductory remarks on 'renormalization' should not be interpreted to imply that the 'renormalization of a theory' is sought only to expurgate the infinities. Indeed the process of renormalization is used even in finite theories; in fact the term 'renormalized' is used when a given theory is altered by removal or introduction of a source (see for example Exc. (6.4)). Due to our limited scope we are unable to cover this topic here; we refer the reader to [6] for a succinct account of the theory along with useful references. We shall, however, return to it briefly when we discuss anomalies in Sec. 10.6.

In conclusion we note that the theory of Feynman graphs has wide applications in supersymmetry, where these graphs are called supergraphs. See for instance 7.[19] and 7.[16]. We shall return to these ideas briefly in Chapter 11.

Exercise 9.6

1. Verify (9.6.35).

2. Show that

$$\frac{\delta}{\delta J(y)}\left(\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right) = -\delta^4(x - y) = \int d^4z\,\frac{\delta^2 W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)}.$$

3. Use the compact forms $W^{(n)}$, $\Gamma^{(n)}$ given in (9.6.37) and show that (a) $W^{(3)} = W^{(2)}\,W^{(2)}\,W^{(2)}\,\Gamma^{(3)}$.

4. Obtain the renormalized values of the masses and coupling constants in a $\phi^4$-theory (in effect verify (9.6.53)).

5. Show how you would use the Feynman rules in momentum space to evaluate the one particle irreducible (1PI) 2-point vertex function of a $\phi^4$-theory (in the momentum space).

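Hints to Exercise 9.6

1. The functional derivative of $\Gamma[\phi_c]$ in (9.6.34) with respect to $\phi_c(x)$ gives

$$\text{(i)}\quad \frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)} = \frac{\delta W[J]}{\delta\phi_c(x)} - \int d^4x'\,\frac{\delta J(x')}{\delta\phi_c(x)}\,\phi_c(x') - J(x)$$

Since $W[J]$ is a functional of $J(x)$, the first term in (i) equals:

$$\text{(ii)}\quad \int d^4x'\,\frac{\delta W[J]}{\delta J(x')}\,\frac{\delta J(x')}{\delta\phi_c(x)}$$

But $\delta W[J]/\delta J(x') = \phi_c(x')$, hence the first two terms in (i) cancel out giving the required result:

$$\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)} = -J(x)$$

2. Note that $\phi_c(x)$ is an independent variable in $\Gamma[\phi_c]$, so that by the chain rule

$$\text{(i)}\quad \frac{\delta}{\delta J(x)} = \int d^4y\,\frac{\delta\phi_c(y)}{\delta J(x)}\,\frac{\delta}{\delta\phi_c(y)} = \int d^4y\,\frac{\delta^2 W[J]}{\delta J(x)\,\delta J(y)}\,\frac{\delta}{\delta\phi_c(y)}$$

Using this in (9.6.35) we have:

$$\text{(ii)}\quad \frac{\delta}{\delta J(y)}\left(\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right) = \int d^4z\,\frac{\delta^2 W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta}{\delta\phi_c(z)}\left(\frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\right) = -\frac{\delta J(x)}{\delta J(y)}$$

Thus we have:

$$\text{(iii)}\quad \int d^4z\,\frac{\delta^2 W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)} = -\delta^4(x - y)$$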

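3. From (iii) above, taking the functional derivative with respect to $J(w)$, we obtain:

$$\text{(i)}\quad \int d^4z\,\frac{\delta^3 W[J]}{\delta J(w)\,\delta J(y)\,\delta J(z)}\,\frac{\delta^2\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)} = -\int d^4z\,d^4\sigma\,\frac{\delta^2 W[J]}{\delta J(y)\,\delta J(z)}\,\frac{\delta^3\Gamma[\phi_c]}{\delta\phi_c(z)\,\delta\phi_c(x)\,\delta\phi_c(\sigma)}\,\frac{\delta^2 W[J]}{\delta J(\sigma)\,\delta J(w)}$$

(We have used (ii) of the above hint in writing the extreme RHS of (i).) Also since $W^{(2)}\,\Gamma^{(2)} = -1$, we can rewrite the above equation after some adjustment of variables as:

$$\text{(ii)}\quad \frac{\delta^3 W[J]}{\delta J(x)\,\delta J(y)\,\delta J(z)} = \int d^4x'\,d^4y'\,d^4z'\,\frac{\delta^2 W[J]}{\delta J(x)\,\delta J(x')}\,\frac{\delta^2 W[J]}{\delta J(y)\,\delta J(y')}\,\frac{\delta^2 W[J]}{\delta J(z)\,\delta J(z')}\,\frac{\delta^3\Gamma[\phi_c]}{\delta\phi_c(x')\,\delta\phi_c(y')\,\delta\phi_c(z')}$$

In compact notation this becomes: (iii) $W^{(3)} = W^{(2)}\,W^{(2)}\,W^{(2)}\,\Gamma^{(3)}$.

4. Now from (9.6.52) we have:

$$\text{(i)}\quad \Gamma[\phi_c]\big|_{J=0} = -(2\pi)^4\,\delta^4(0)\,V_{\text{eff}}(\phi_c)$$

We note that the constant value of $\phi_c = \langle\phi\rangle$ is obtained by solving:

$$\text{(ii)}\quad \frac{\delta\Gamma[\phi_c]}{\delta\phi_c(x)}\bigg|_{J=0} = 0$$

Having once obtained it, we use this fact to take the derivatives of $V_{\text{eff}}$, and since $\phi_c$ no longer depends on spacetime coordinates, the derivatives of $V_{\text{eff}}$ are ordinary derivatives, thus

$$\text{(iii)}\quad \frac{d^2V_{\text{eff}}}{d\phi_c^2} = -\left(-m^2 - \frac{\lambda}{2!}\phi_c^2\right)$$

and

$$\text{(iv)}\quad \frac{d^4V_{\text{eff}}}{d\phi_c^4} = -(-\lambda)$$

But $\phi_c = \langle\phi\rangle = 0$ in a $\phi^4$-theory, hence (iii) and (iv) give respectively the renormalized mass and the renormalized coupling constant:

$$\text{(v)}\quad \frac{d^2V_{\text{eff}}}{d\phi_c^2}\bigg|_{\phi_c=0} = m_R^2 \qquad \text{(vi)}\quad \frac{d^4V_{\text{eff}}}{d\phi_c^4}\bigg|_{\phi_c=0} = \lambda_R$$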

5. We recall from Sec. (9.4) that the basic graphs in the momentum space for a field theory could be expressed as:

$$\text{(i)}\quad [\text{line carrying momentum } p] = i\hbar\,D_F(p) = \lim_{\epsilon\to 0}\frac{i\hbar}{p^2 - m^2 + i\epsilon}$$

$$\text{(ii)}\quad [\text{vertex joining lines with momenta } p_1, p_2, p_3, p_4] = -\frac{i\lambda}{\hbar}\,\delta^4(p_1 + p_2 + p_3 + p_4)$$

(See (9.4.37); the change in (ii) is due to the fact that we are dealing with a $\phi^4$-theory here.) While evaluating a Feynman diagram in momentum space, we shall naturally integrate over the intermediate momenta, i.e., the momenta of the internal propagators. Thus to evaluate the (1PI) 2-point vertex function up to first order in $\lambda$, given by:

(iii) [Diagram: a single vertex with incoming momentum $p_1$, outgoing momentum $p_2$, and a closed loop carrying momentum $k$.]

we shall calculate the following integral, which is written after using the appropriate signs for $p_1$, $p_2$ and $k$ indicated by their arrows:

$$\text{(iv)}\quad I = \frac{1}{2}\left(-\frac{i\lambda}{\hbar}\right)\int\frac{d^4k}{(2\pi)^4}\,\frac{i\hbar}{k^2 - m^2}\,\delta^4(p_1 - p_2 + k - k).$$

The factor $\frac{1}{2}$ in (iv) is due to the symmetry of the diagram, and though $i\epsilon$ is not explicitly given here, it should be taken into account while evaluating the integral and then sent to zero in the limit. Compare (iii) of this exercise with (9.6.44), where the (1PI) 2-point vertex function is given in coordinate space (see Chapter 2 in [6] for further details).

APPENDIX 9A: LANGUAGE OF QUANTUM MECHANICS

A.1 State Space, Kets and Bras, Hermitian Operators and Observables

In quantum theory the states of a physical system correspond to vectors in a Hilbert space over the complex numbers. A state vector is denoted by the ket $|\psi\rangle$. The kets form a space known as state space; the properties satisfied by state space are similar to those of a vector space defined earlier. We wish to recall here just two of them:

Property 9A.1 For every pair of kets $|\psi\rangle$ and $|\psi'\rangle$ there exists a unique complex number that results from the scalar (or inner) product of $|\psi\rangle$ and $|\psi'\rangle$, and it is denoted by the bracket:

$$(|\psi\rangle, |\psi'\rangle) = \langle\psi|\psi'\rangle \tag{9A.1}$$

The obvious properties of this scalar product, which is also called the Hermitian (complex) scalar product, are:

$$\langle\psi'|\psi\rangle = \langle\psi|\psi'\rangle^{*} \quad (* = \text{complex conjugation}), \qquad \langle\psi'|c\,\psi\rangle = c\,\langle\psi'|\psi\rangle, \qquad \langle\phi|\psi + \psi'\rangle = \langle\phi|\psi\rangle + \langle\phi|\psi'\rangle \tag{9A.2}$$

and

$$\langle\psi|\psi\rangle \geq 0 \tag{9A.3}$$

The equality holds if and only if $|\psi\rangle$ is the null vector. We say in this case that the ket represents the ground state of the system, and we denote it as $|0\rangle$. The real number $\langle\psi|\psi\rangle$ is called the norm of $|\psi\rangle$; when it is 1, $|\psi\rangle$ is called a normalized ket. If for non-zero $|\psi\rangle$ and $|\psi'\rangle$, $\langle\psi|\psi'\rangle = 0$, then the kets are orthogonal.

Property 9A.2 There exist complete orthonormal sets of bases for a state space. Each such set consists of the kets $|k\rangle$ ($k = 1, 2, \ldots, n$) which are orthonormal:

$$\langle k|l\rangle = \delta_{kl} \tag{9A.4}$$

and has the property that any ket (state vector) $|\psi\rangle$ of the space can be expanded as:

$$|\psi\rangle = \sum_{k=1}^{n}|k\rangle\langle k|\psi\rangle \tag{9A.5}$$

(Note that in mathematics expansions of this nature are written with the eigenvectors as post factors.) Every operator $A$ on the state space assigns to each ket $|\psi\rangle$ another ket $|\psi'\rangle$ of that space:

$$A|\psi\rangle = |\psi'\rangle \tag{9A.6}$$

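(See Chapter 3 for properties of Hermitian and unitary operators, and Exc. 2 and 3 of Sec. 3.1.) Given a space of kets, a space dual to it can be constructed whose elements, denoted $\langle\psi|$, are called bras. The product $|\psi\rangle\langle\phi|$ of a ket and a bra is an operator, and is defined as:

$$(|\psi\rangle\langle\phi|)\,|\chi\rangle = |\psi\rangle\,\langle\phi|\chi\rangle \tag{9A.7}$$

where $|\chi\rangle$ is an arbitrary ket. The operator defined above is postulated to act to the left on a bra $\langle\chi|$, therefore

$$\langle\chi|\,(|\psi\rangle\langle\phi|) = \langle\chi|\psi\rangle\,\langle\phi| \tag{9A.8}$$

It is a linear operator and from the relation:

$$(|\psi\rangle\langle\phi|)^{\dagger} = |\phi\rangle\langle\psi| \tag{9A.9}$$

where $\dagger$ = h.c., it follows that $|\phi\rangle\langle\phi|$ is Hermitian. Using the closure relation:

$$\sum_{k=1}^{n}|k\rangle\langle k| = 1 \tag{9A.10}$$

every operator $A$ can be expressed as:

$$A = \sum_{k'}\sum_{k}|k'\rangle\langle k'|A|k\rangle\langle k| \tag{9A.11}$$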

The complex numbers $\langle k'|A|k\rangle$ are a matrix representation of $A$ (see (3.1.9)). In the equality

$$A|\psi\rangle = a|\psi\rangle \tag{9A.12}$$

the complex number $a$ is called an eigenvalue of $A$ and $|\psi\rangle$ is called an eigenket. The set of eigenvalues of $A$ is called the spectrum of $A$. If there are $g$ linearly independent eigenkets with the same eigenvalue, then this eigenvalue is said to be $g$-fold degenerate. If $A$ is Hermitian, then from (9A.12) and from:

$$\langle\psi|A^{\dagger} = a^{*}\langle\psi| \tag{9A.13}$$

it follows that the eigenvalues of $A$ are real and the eigenkets corresponding to different eigenvalues are orthogonal, and as such they can be normalized. A Hermitian operator $A$ is an observable if its eigenkets $|\psi_k\rangle$ form a basis in a state space, i.e., if an arbitrary ket $|\psi\rangle$ can be written as:

$$|\psi\rangle = \sum_{k=1}^{n}|\psi_k\rangle\langle\psi_k|\psi\rangle \tag{9A.14}$$

The corresponding expansion for a bra is

$$\langle\psi| = \sum_{k=1}^{n}\langle\psi|\psi_k\rangle\langle\psi_k| \tag{9A.15}$$

For an operator $A$ which is an observable, (9A.11) reduces to:

$$A = \sum_{k}a_k\,|\psi_k\rangle\langle\psi_k| \tag{9A.16}$$

Note that this is the spectral decomposition of $A$ when the spectrum is discrete; in the case of a continuous spectrum, which happens when the space is infinite dimensional, the eigenvalue equation is:

$$A|\psi_a\rangle = a|\psi_a\rangle \tag{9A.17}$$

The parameter $a$ representing the eigenvalue here is continuous. The orthonormality condition (9A.4) and the expansion (9A.14) in this case are respectively:

$$\langle\psi_a|\psi_{a'}\rangle = \delta(a - a') \tag{9A.18}$$

$$|\psi\rangle = \int da\,|\psi_a\rangle\langle\psi_a|\psi\rangle \tag{9A.19}$$

where the integration is taken over all values of $a$.

A polynomial function $P_N(A) = \sum_{k=1}^{N}c_k A^k$ of an observable $A$ is an observable; in particular $A^2$ is an observable. Hence the expectation value of $A$ with respect to a normalized ket $|\psi\rangle$, defined as:

$$\langle A\rangle = \langle\psi|A|\psi\rangle \tag{9A.20a}$$

leads to the computation of the expectation value of any polynomial function of $A$. Based on this, we write below the important inequality in quantum mechanics giving the Heisenberg uncertainty principle.

Uncertainty Principle Let $A$ and $B$ be two non-commuting observables; then the physical quantities represented by $A$ and $B$ cannot be measured simultaneously with precision. In order to find this lack of precision, we write their commutator as $[A, B] = iC$. Evidently $C$ is Hermitian. Suppose that $(\Delta A)^2$ denotes the variance of $A$ (for a fixed state $\psi$), i.e.,

$$\Delta A = \left[\langle\psi|(A - \langle A\rangle)^2|\psi\rangle\right]^{1/2} \tag{9A.20b}$$

then

$$\Delta A\,\Delta B \geq \frac{1}{2}\,\big|\langle\psi|C|\psi\rangle\big| \tag{9A.21}$$

gives the measure of deviation from precision; this is known as the measure of uncertainty or the Heisenberg uncertainty relation.
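The inequality (9A.21) can be checked in the smallest non-trivial case. The following is a sketch under stated assumptions: the observables are taken to be the Pauli matrices $\sigma_x$ and $\sigma_y$ on a two-dimensional state space (a choice made here for illustration, not one made in the text), and the helper names are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expectation(A, psi):
    """<psi|A|psi> for a normalized ket psi, as in (9A.20a)."""
    n = len(psi)
    return sum(psi[i].conjugate() * A[i][j] * psi[j]
               for i in range(n) for j in range(n))


def spread(A, psi):
    """Delta A of (9A.20b), computed as sqrt(<A^2> - <A>^2)."""
    mean = expectation(A, psi).real
    mean_sq = expectation(matmul(A, A), psi).real
    return math.sqrt(max(mean_sq - mean**2, 0.0))  # guard against rounding


def uncertainty_sides(A, B, psi):
    """Return (Delta A * Delta B, |<C>|/2) where [A, B] = iC."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    C = [[(AB[i][j] - BA[i][j]) / 1j for j in range(n)] for i in range(n)]
    return spread(A, psi) * spread(B, psi), 0.5 * abs(expectation(C, psi))


sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

if __name__ == "__main__":
    up = [1 + 0j, 0j]                                   # eigenket of sigma_z
    plus = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]  # eigenket of sigma_x
    for name, psi in (("up", up), ("plus", plus)):
        lhs, rhs = uncertainty_sides(sigma_x, sigma_y, psi)
        print(name, "Delta A * Delta B =", lhs, " |<C>|/2 =", rhs)
```

For the first state both sides are equal, so the bound is saturated; for the second, both sides vanish because the state is an eigenket of one of the two observables.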

A.2 Position and Momentum Operators of a Particle

The observables associated with the position and momentum of a particle, denoted $r$ and $p$, have cartesian components:

$$r = (r_1, r_2, r_3) = (x, y, z), \qquad p = (p_1, p_2, p_3) = (p_x, p_y, p_z).$$

According to the postulates of quantum mechanics, they satisfy the canonical (fundamental) commutation relations:

$$\text{(a)}\ [r_i, r_j] = 0 \qquad \text{(b)}\ [p_i, p_j] = 0 \qquad \text{(c)}\ [r_i, p_j] = i\hbar\,\delta_{ij} \tag{9A.22}$$

In non-relativistic quantum mechanics it is assumed that $r_1, r_2, r_3$ form a complete set of commuting operators for a spinless free particle (see Def. (9A.3)), hence there is only one linearly independent eigenket for the operator $r$ that corresponds to the eigenvalue $r'$; we denote it as $|r'\rangle$.$^{32}$ Thus the eigenvalue equation for the position operator $r$ is:

$$r|r'\rangle = r'|r'\rangle \tag{9A.23}$$

where $|r'\rangle = |r'_1\,r'_2\,r'_3\rangle$. These components vary continuously from $-\infty$ to $\infty$. The orthonormality condition (9A.18) gives here:

$$\langle r''|r'\rangle = \delta^3(r'' - r') = \delta(r''_1 - r'_1)\,\delta(r''_2 - r'_2)\,\delta(r''_3 - r'_3) \tag{9A.24}$$

The closure relation (9A.10), which in this case becomes:

$$\int d^3r'\,|r'\rangle\langle r'| = 1 \tag{9A.25}$$

gives the expansion (9A.14) of an arbitrary ket $|\psi\rangle$ in terms of the basis $|r'\rangle$ as:

$$|\psi\rangle = \int d^3r'\,|r'\rangle\langle r'|\psi\rangle \tag{9A.26}$$

Here $d^3r'$ stands for $dx'\,dy'\,dz'$ and the integrations are over the whole of coordinate space. The expansion coefficient $\langle r'|\psi\rangle$ in (9A.26) is the coordinate-space wave function $\psi(r')$, which is a complex function of the continuous real variable $r'$. Hence we can write:

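$$|\langle r'|\psi\rangle|^2\,d^3r' = |\psi(r')|^2\,d^3r' = \Big[\sum_i c_i^2\Big]\,d^3r' \tag{9A.27}$$

where $\langle r'_i|\psi\rangle = c_i$. The normalization $\langle\psi|\psi\rangle = 1$, the closure relation (9A.25) and the equality (9A.27) then lead to: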

$$\int|\psi(r')|^2\,d^3r' = 1 \tag{9A.28}$$

This shows that $|\psi(r')|^2$ is zero at infinity. In other words, it means that the probability of $r'$ attaining an infinite value in the case of a moving particle is zero; thus the motion is finite. The particle in this case is said to be in a bound state. If we pre-multiply (9A.26) by $\langle\phi|F(r)$, where $F(r)$ is a continuous function of $r$, and write $\psi(r')$ for $\langle r'|\psi\rangle$, we obtain:

$$\langle\phi|F(r)|\psi\rangle = \int d^3r'\,\phi^{*}(r')\,F(r')\,\psi(r') \tag{9A.29}$$

a relation that we use all the time in the text. (Note that writing $F(r')$ on the RHS of (9A.29) amounts to a change of basis from $r$ to $r'$.) We can likewise obtain matrix elements for the operator $p$ by using (9A.26), thus:

$$p|\psi\rangle = \int d^3r'\,|r'\rangle\,(-i\hbar\nabla_{r'})\,\psi(r') \tag{9A.30}$$

where

$$\nabla_{r'} = \left(\frac{\partial}{\partial r'_1}, \frac{\partial}{\partial r'_2}, \frac{\partial}{\partial r'_3}\right) \tag{9A.31}$$

This leads to

$$\langle\phi|p|\psi\rangle = \int d^3r'\,\phi^{*}(r')\,(-i\hbar\nabla_{r'})\,\psi(r') \tag{9A.32}$$

and, if $\langle\phi|$ is an eigenbra of $r$, then the integration can be performed on the RHS and it yields:

$$\langle r'|p|\psi\rangle = -i\hbar\nabla_{r'}\langle r'|\psi\rangle \tag{9A.33}$$

For the momentum operator $p$, equations similar to (9A.23)-(9A.26) are:

$$p|p'\rangle = p'|p'\rangle \tag{9A.34}$$

$$\langle p''|p'\rangle = \delta^3(p'' - p') \tag{9A.35}$$

$$\int d^3p'\,|p'\rangle\langle p'| = 1 \tag{9A.36}$$

$$|\psi\rangle = \int d^3p'\,|p'\rangle\langle p'|\psi\rangle \tag{9A.37}$$

where the integrations are over all of momentum space. The choice of eigenkets $|p'\rangle$ as a basis gives the momentum representation, and the expansion coefficient $\langle p'|\psi\rangle = \tilde\psi(p')$ is the momentum-space wave function.

$^{32}$ In physics literature the eigenvalue of an operator $A$ is usually denoted $A'$; by using $r'$ as the eigenvalue and $|r'\rangle$ as the eigenket we are essentially following the physics trend.

A.3 Coordinate and Momentum Space Representations

The eigenkets $|r'\rangle$ and $|p'\rangle$ as bases are said to give respectively the coordinate representation and the momentum representation in quantum mechanics. Obviously the two representations are related to each other, as we shall soon see. Using left multiplication on (9A.26) and (9A.37) by $\langle p'|$ and $\langle r'|$ respectively, we get:

$$\langle p'|\psi\rangle = \int d^3r'\,\langle p'|r'\rangle\langle r'|\psi\rangle \tag{9A.38}$$

$$\langle r'|\psi\rangle = \int d^3p'\,\langle r'|p'\rangle\langle p'|\psi\rangle \tag{9A.39}$$

which shows that there are integral relations between the coordinate-space and momentum-space wave functions. Now $\langle r'|p'\rangle$ can be evaluated as an explicit function of $r'$ and $p'$. From (9A.33) and (9A.34) it follows that this function satisfies the three first order partial differential equations:

$$-i\hbar\nabla_{r'}\langle r'|p'\rangle = p'\,\langle r'|p'\rangle \tag{9A.40}$$

and replacing $r'$ by $r''$ and $\psi$ by $r'$ in (9A.39) we note that it satisfies the orthonormality condition:

$$\int d^3p'\,\langle r''|p'\rangle\langle p'|r'\rangle = \langle r''|r'\rangle = \delta^3(r'' - r') \tag{9A.41}$$

It can be checked that the solution to (9A.40) and (9A.41) is (see Chapter 1 in [21]):

$$\langle r'|p'\rangle = (2\pi\hbar)^{-3/2}\exp(ip'\cdot r'/\hbar) \tag{9A.42}$$
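As a quick check, offered as a sketch rather than as the derivation given in [21], one can verify directly that the plane wave (9A.42) satisfies both (9A.40) and (9A.41). The only ingredients assumed beyond the text are the standard Fourier representation of the delta function, $\int d^3p'\,e^{ip'\cdot x/\hbar} = (2\pi\hbar)^3\,\delta^3(x)$, and the relation $\langle p'|r'\rangle = \langle r'|p'\rangle^{*}$ from (9A.2):

```latex
\begin{aligned}
-i\hbar\nabla_{r'}\langle r'|p'\rangle
  &= -i\hbar\,\frac{i p'}{\hbar}\,(2\pi\hbar)^{-3/2}\,e^{ip'\cdot r'/\hbar}
   = p'\,\langle r'|p'\rangle, \\[4pt]
\int d^3p'\,\langle r''|p'\rangle\langle p'|r'\rangle
  &= (2\pi\hbar)^{-3}\int d^3p'\,e^{ip'\cdot(r''-r')/\hbar}
   = \delta^3(r''-r').
\end{aligned}
```

The first line fixes the exponential dependence on $r'$, and the second fixes the normalization constant $(2\pi\hbar)^{-3/2}$ up to a phase.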

Using (9A.38), (9A.39), (9A.42) and (9A.2) we obtain:

$$\tilde\psi(p') = (2\pi\hbar)^{-3/2}\int d^3r'\,\exp(-ip'\cdot r'/\hbar)\,\psi(r') \tag{9A.43}$$

$$\psi(r') = (2\pi\hbar)^{-3/2}\int d^3p'\,\exp(ip'\cdot r'/\hbar)\,\tilde\psi(p') \tag{9A.44}$$

In Appendix 9C (in particular in (9C.7)) we shall see that $\tilde\psi(p')$ and $\psi(r')$ are Fourier transforms of each other.

A.4 The Complete Set of Commuting Operators

Finally we introduce the important concept of the complete set of commuting operators (c.s.c.o.) in quantum mechanics. We already know that the eigenvectors of a Hermitian operator $A$ form a complete orthonormal set, and as such they can be used as basis vectors. These basis vectors in turn are characterized by the eigenvalues to which they belong. Thus if all eigenvalues of $A$ are distinct, they can be used to label the basis vectors as $|A'\rangle$, etc. If, however, there are two or more linearly independent eigenvectors that correspond to one eigenvalue, the labeling has to be done as $|A'_1\rangle, |A'_2\rangle, \ldots$ In this case we look for another Hermitian operator $B$ which commutes with $A$ and has the same set of eigenvectors but gives us distinct eigenvalues, say $B'$ and $B''$:

$$B|A'_1\rangle = B'|A'_1\rangle, \qquad B|A'_2\rangle = B''|A'_2\rangle \tag{9A.45}$$

(See Ftn 32 for notations in above equations.) With the help of these eigenvalues, we now write the eigenvectors as:

$$|A'_1\rangle = |A'B'\rangle, \qquad |A'_2\rangle = |A'B''\rangle \tag{9A.46}$$

Definition 9A.3 A set of commuting Hermitian operators $A, B, C, \ldots$ whose $n$ common eigenvectors can be given in terms of $n$ distinct eigenvalues, so that no two eigenvectors have an identical set of eigenvalues, is said to be a complete set of commuting operators. It is denoted as c.s.c.o. The orthonormality condition satisfied by the eigenvectors can be written as:

$$\langle A'B'\ldots|A''B''\ldots\rangle = \delta_{A'A''}\,\delta_{B'B''}\cdots \tag{9A.47}$$

or simply as:

$$\langle k'|k''\rangle = \delta_{k'k''} \tag{9A.48}$$

where $k$ stands for the complete set $A, B, C, \ldots$ and $k', k''$ stand for the sets of eigenvalues $A'B'\ldots$, $A''B''\ldots$. The completeness of the set of eigenvectors implies that an arbitrary ket $|a\rangle \in$ state-space can be written as:

$$|a\rangle = \sum_{k'}|k'\rangle\langle k'|a\rangle \tag{9A.49}$$

(see Sec. 2 and Sec. 3 where c.s.c.o. is used).

APPENDIX 9B: A FEW DEFINITIONS AND DERIVATIONS

We give below the definitions of a few mathematical objects which bear a different name when used in classical mechanics. An $n$-dimensional manifold $X^n$ is the configuration space of a system with $n$ degrees of freedom. The coordinates used are $(q^i)$. The manifold $X^n \times \mathbb{R}$ with coordinates $(q^i, t)$ is called the configuration spacetime of the system. The natural coordinates on the tangent bundle $T(X^n)$ and the cotangent bundle $T^*(X^n)$ are respectively $(q^i, \dot q^i)$ and $(q^i, p_i)$. The latter is called the phase space of the system. The manifold $T^*(X^n) \times \mathbb{R}$ is called the state space of the system. A function $L : T(X^n) \times \mathbb{R} \to \mathbb{R}$ given locally by $L(q^i, \dot q^i, t)$ is called a Lagrangian. The vector with components $p_i = \partial L/\partial\dot q^i$ is a covariant vector on $X^n$. A function $H : T^*(X^n) \times \mathbb{R} \to \mathbb{R}$ given by $H(q^i, p_i, t) \in \mathbb{R}$ is called the Hamiltonian function.$^{*}$ In the following, we shall use the concept of a Hamiltonian function to define a Hamiltonian operator in quantum mechanics.

$^{*}$ In Table 9.2.1 we have denoted the Hamiltonian as $H(x, p)$, since we have used there $x$ in place of $q$; also we did not show there its dependence on $t$ explicitly.

B.1 The Wave Function $\psi$ in Quantum Mechanics

Consider a one-dimensional wave equation:

$$\frac{\partial^2\eta}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\eta}{\partial t^2} = \left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right)\eta = 0 \tag{9B.1}$$

D'Alembert's solution to this equation is:

$$\eta(x, t) = f_1(x - ct) + f_2(x + ct) \tag{9B.2}$$

We are, however, interested in the form (solution) which is used in quantum mechanics. This is:

$$\eta(x, t) \equiv \eta = A\cos\frac{2\pi}{\lambda}(x - ct) \tag{9B.3}$$

$\eta$ here is known as a sinusoidal wave travelling (propagating) in the positive $x$-direction; the constants $A$, $\lambda$ and $c$ are described as follows with the help of the diagram given below. $A$ is the amplitude of the wave. $\lambda$ is the wavelength $(x_2 - x_1)$ for constant time $t_0$.$^{33}$ Similarly, for fixed $x$ (i.e., $x = x_0$), $T = (t_2 - t_1)$ defines the period of the wave, and the constant $c = \lambda/T$. The number $\nu = 1/T$ is called the frequency of the wave and $\omega = 2\pi\nu = 2\pi/T$ is called the angular frequency or the frequency of oscillation, which is an observable phenomenon.

$^{33}$ The concept of wavelength in physics is very important since it characterises a particular wave, and also since it predicts the behaviour of a particle/string as described above. To see the former we note that if the wavelength is a meter or more, the waves are radio waves, whereas waves with shorter wavelength (a few centimeters) are known as 'microwaves'. They are called 'infrared' when the length is closer to ... of a centimeter. The light visible to our eye has a wavelength $\lambda$ in the range ... meters, and waves with still shorter wavelengths are known as 'ultraviolet', X-rays and gamma rays.

The wave $\eta$ is an infinite harmonic plane wave which is associated with the motion of a free particle moving in the $x$-direction with the momentum $p = \hbar k$, where $\hbar$ is Planck's constant$^{34}$ and the magnitude of $k$ is given by the wavelength $\lambda = 2\pi/k$. We note that the above relation between $p$ and $k$ is the fundamental de Broglie equation.
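That the sinusoidal wave (9B.3) solves the wave equation (9B.1) can be confirmed numerically with finite differences. The sketch below is illustrative only: the parameter values and function names are hypothetical, and the second derivatives are approximated by central differences, so the residual is small rather than exactly zero.

```python
import math

AMPLITUDE = 1.0   # A in (9B.3); hypothetical value
WAVELENGTH = 2.0  # lambda in (9B.3); hypothetical value
SPEED = 3.0       # c in (9B.1); hypothetical value


def eta(x, t):
    """Sinusoidal wave (9B.3): A cos(2 pi (x - c t) / lambda)."""
    return AMPLITUDE * math.cos(2.0 * math.pi / WAVELENGTH * (x - SPEED * t))


def wave_residual(x, t, h=1e-4):
    """Finite-difference value of d2(eta)/dx2 - (1/c^2) d2(eta)/dt2."""
    d2x = (eta(x + h, t) - 2.0 * eta(x, t) + eta(x - h, t)) / h**2
    d2t = (eta(x, t + h) - 2.0 * eta(x, t) + eta(x, t - h)) / h**2
    return d2x - d2t / SPEED**2


def wave_parameters():
    """Period T = lambda/c, frequency nu = 1/T, and omega = 2 pi nu."""
    period = WAVELENGTH / SPEED
    nu = 1.0 / period
    omega = 2.0 * math.pi * nu
    k = 2.0 * math.pi / WAVELENGTH  # wave number, lambda = 2 pi / k
    return period, nu, omega, k


if __name__ == "__main__":
    print("residual of (9B.1):", wave_residual(0.3, 0.7))
    T, nu, omega, k = wave_parameters()
    print("T =", T, "nu =", nu, "omega =", omega, "k =", k)
    print("omega - c*k =", omega - SPEED * k)
```

The last line checks the relation $\omega = ck$, which is what allows the argument of the cosine to be rewritten as $kx - \omega t$.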

0

~H

_y

_A

T

QH^Q

W- k

A

x

Wave propagating In the x-dlrection.

Using $k$ and $\omega$, and incorporating the amplitude $A$ with the cosine argument, we can write (9B.3) as:

$$\eta(x, t) = \cos(kx - \omega t + \alpha) \tag{9B.4}$$

Likewise, a plane wave propagating in the negative $x$-direction is:

$$\eta(x, t) = \cos(kx + \omega t + \beta) \tag{9B.5}$$

In order that (9B.4) and (9B.5) may suitably describe plane waves in quantum mechanics, we have to add to both of them additional terms proportional to $\left[\frac{\partial \eta}{\partial t}\right]_{t=0}$. Accordingly we have:

$$\eta_1(x, t) \propto \cos(kx - \omega t) + a_1 \sin(kx - \omega t) \tag{9B.6}$$

$$\eta_2(x, t) \propto \cos(kx + \omega t) + a_2 \sin(kx + \omega t) \tag{9B.7}$$

The constants $a_1$ and $a_2$ can be taken as $+i$ and $-i$; this follows from the fact that these waves moving in opposite directions are linearly independent at all times and that an arbitrary displacement of $x$ or $t$ does not alter the physical character of the wave. We note however that the waves (9B.4) and (9B.5) violate the above condition of linear independence for $t = \frac{\alpha - \beta}{2\omega}$, since at that instant both reduce to $\cos\left(kx + \frac{\alpha + \beta}{2}\right)$.

Using the values of $a_1$ and $a_2$, we can write the expressions for waves propagating in the positive and negative directions of $x$ as:

$$\eta_1(x, t) = A e^{i(kx - \omega t)} \tag{9B.8}$$

$$\eta_2(x, t) = B e^{-i(kx + \omega t)} \tag{9B.9}$$

The initial values of these waves are $\eta_1(x, 0) = A e^{ikx}$ and $\eta_2(x, 0) = B e^{-ikx}$. This wave concept (discussed here in 1 dimension) leads to the definition of our familiar complex wave function which, in conformity with the literature, is denoted as $\psi(x, y, z, t)$:

$$\psi(x, y, z, t) = A e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t)} = A e^{i(k_x x + k_y y + k_z z - \omega t)} \tag{9B.10}$$

34 Sometimes in the literature the symbol $\hbar$ is used to denote $\frac{1}{2\pi}$ times Planck's constant.

Equations (9B.8) and (9B.9) follow from (9B.10) when $k_y = k_z = 0$ and $k_x = \pm k$. The wave function $\psi(x, y, z, t)$ represents the motion of a particle moving in 3-dimensional space with the momentum $(p_x, p_y, p_z) = (\hbar k_x, \hbar k_y, \hbar k_z)$. (See S. L. Sobolev, Chapter 3, Ref. [Ad] for the wave operator.)
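The relations among $k$, $\omega$, $\lambda$ and $p$ used above can be checked numerically. The following is a minimal sketch, assuming that the equation solved by (9B.2) is the one-dimensional wave equation $\eta_{tt} = c^2 \eta_{xx}$, that $\omega = ck$, and that units are chosen with $\hbar = 1$; the function names and the sample values of $k$ and $c$ are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def plane_wave(x, t, k, omega, amplitude=1.0):
    """Complex plane wave A * exp(i(kx - omega*t)), the form of (9B.8)."""
    return amplitude * cmath.exp(1j * (k * x - omega * t))


def wave_equation_residual(x, t, k, c, h=1e-3):
    """Central-difference estimate of eta_tt - c^2 * eta_xx with omega = c*k."""
    omega = c * k

    def eta(xx, tt):
        return plane_wave(xx, tt, k, omega)

    eta_tt = (eta(x, t + h) - 2 * eta(x, t) + eta(x, t - h)) / h**2
    eta_xx = (eta(x + h, t) - 2 * eta(x, t) + eta(x - h, t)) / h**2
    return eta_tt - c**2 * eta_xx


k, c = 2.0, 3.0                  # hypothetical wave number and wave speed
wavelength = 2 * math.pi / k     # lambda = 2*pi/k
momentum = HBAR * k              # de Broglie relation p = hbar*k
residual = abs(wave_equation_residual(0.7, 0.2, k, c))
# Shifting x by one wavelength should leave the wave unchanged.
shift = abs(plane_wave(0.7 + wavelength, 0.2, k, c * k)
            - plane_wave(0.7, 0.2, k, c * k))
print(f"lambda = {wavelength:.4f}, p = {momentum:.4f}")
print(f"wave-equation residual ~ {residual:.1e}, periodicity defect ~ {shift:.1e}")
```

The residual is not exactly zero because the derivatives are replaced by finite differences; it shrinks as the step `h` is reduced.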

B.2 The Hamiltonian Operator H(t), and the Time Evolution Operator U(t)

According to the postulates of (non-relativistic) quantum mechanics, an initial state $\psi(t_0) = |a, t_0\rangle$ of a given system determines the subsequent states $\psi(t_1) = |a, t_1\rangle$ and $\psi(t_2) = |a, t_2\rangle$, etc., and if two initial states $|a, t_0\rangle$ and $|b, t_0\rangle$ separately evolve into $|a, t\rangle$ and $|b, t\rangle$, then their linear sum $c_1|a, t_0\rangle + c_2|b, t_0\rangle$ develops into $c_1|a, t\rangle + c_2|b, t\rangle$. These two postulates taken together imply that a state $|a, t\rangle$ can be obtained from an arbitrary state $|a, t_0\rangle$ with the help of a linear operator $U$:$^{35}$

$$\psi(t) = |a, t\rangle = U(t, t_0)|a, t_0\rangle = U(t, t_0)\psi(t_0) \tag{9B.11}$$

(The reason for denoting the linear operator by $U$ will soon become evident.) The operator $U$ does not depend on $\psi(t_0)$, and as a result one has:

$$\psi(t_2) = U(t_2, t_1)\psi(t_1) = U(t_2, t_1)U(t_1, t_0)\psi(t_0) = U(t_2, t_0)\psi(t_0) \tag{9B.12}$$

which leads to the (group) property:

$$U(t_2, t_0) = U(t_2, t_1)U(t_1, t_0) \tag{9B.13}$$

From (9B.11) it is also evident that:

$$U(t, t) = I \tag{9B.14}$$

Hence,

$$U(t, t_0)U(t_0, t) = U(t_0, t)U(t, t_0) = I \quad \text{or} \quad [U(t, t_0)]^{-1} = U(t_0, t) \tag{9B.15}$$

Using a small increment $\varepsilon$ in $t$, we now define an operator $H(t)$ as:

$$U(t + \varepsilon, t) = I - \frac{i}{\hbar}\varepsilon H(t) \tag{9B.16}$$

The presence of the factor $\frac{i}{\hbar}$ will be justified while writing the derivatives of state vectors and operators (see also (9B.22)). Presently we use the above definition along with the group property $U(t + \varepsilon, t_0) = U(t + \varepsilon, t)U(t, t_0)$ to obtain the differential equation for $U$, thus:

35 $U$ is known as the evolution operator of the system.

$$\frac{d}{dt}U(t, t_0) = \lim_{\varepsilon \to 0}\frac{U(t + \varepsilon, t_0) - U(t, t_0)}{\varepsilon} = -\frac{i}{\hbar}H(t)U(t, t_0) \tag{9B.17a}$$

or

$$i\hbar\frac{d}{dt}U(t, t_0) = H(t)U(t, t_0) \tag{9B.17b}$$

with initial condition $U(t_0, t_0) = I$. The operator $H(t)$, known as the Hamiltonian operator, is the analogue of the Hamiltonian function in classical mechanics, as already defined in the beginning of this appendix. Writing $t + \varepsilon$ in place of $t$ and $t$ for $t_0$ in (9B.11) we have:

$$\psi(t + \varepsilon) = U(t + \varepsilon, t)\psi(t) \tag{9B.18a}$$

To the first order in $\varepsilon$ this gives:

$$\psi(t) + \varepsilon\frac{\partial \psi(t)}{\partial t} = \left[I - \frac{i}{\hbar}\varepsilon H(t)\right]\psi(t) \tag{9B.18b}$$

Hence:

$$i\hbar\frac{\partial \psi(t)}{\partial t} = H(t)\psi(t) \tag{9B.19a}$$

In Dirac notation this can be written as:

$$i\hbar\frac{d}{dt}|a, t\rangle = H(t)|a, t\rangle \tag{9B.19b}$$

where by definition:

$$\frac{d}{dt}|a, t\rangle = \lim_{\varepsilon \to 0}\frac{|a, t + \varepsilon\rangle - |a, t\rangle}{\varepsilon}.$$

Remark 9B.1 Equations (9B.19), known as the Schrödinger equation for the time evolution of $\psi(t)$, give the general law of motion for any system; naturally, for specific systems the operator $H(t)$ has to be selected in accordance with the requirements of the system. For instance, it is the energy operator in the case of a wave function (see (9.2.9)) and a $(2 \times 2)$ matrix operator for spinorial systems of 2-component spinors.

Remark 9B.2 In both these examples we noticed that the Hamiltonian was Hermitian, hence it is worthwhile to assume that any operator selected to be the Hamiltonian of the system would be Hermitian. By (9B.16), this assumption implies that the operator $U$ is unitary and, as a consequence, the length of any state vector remains unaltered during the motion. Thus if $\langle a, t_0|a, t_0\rangle$ is normalized to 1 initially, then $\langle a, t|a, t\rangle = 1$ throughout the motion. The main advantage that follows from the above is the fact that the expansion coefficients in the expansion of the state vector:

$$|a, t\rangle = \sum_{k'} |k'\rangle\langle k'|a, t\rangle \tag{9B.20}$$

can easily be computed (see also (9A.5)); for $|\langle k'|a, t\rangle|^2$ is the probability of finding the system at time $t$ with the value $k'$ for the observable $k$.

When $H$ does not depend on time, $U$ can still be obtained for finite time intervals by applying the group property (9B.13) repeatedly to $n$ equal intervals of length $\varepsilon = \frac{t - t_0}{n}$. Thus using (9B.16) with $U(t_0, t_0) = I$, we have:

$$U(t, t_0) = \lim_{\varepsilon \to 0}\left(I - \frac{i}{\hbar}\varepsilon H\right)^n = \lim_{n \to \infty}\left[I - \frac{i}{\hbar}\frac{(t - t_0)}{n}H\right]^n = \exp\left[-\frac{i}{\hbar}(t - t_0)H\right] \tag{9B.21}$$

(by the definition of the exponential function). This expresses the evolution operator $U$ as the exponential of the Hamiltonian $H$ (see Secs. 9.2 and 9.4).
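The limit in (9B.21) can be watched converging for a simple two-level system. The sketch below is illustrative only: it assumes $\hbar = 1$ and the hypothetical Hamiltonian $H = \omega\sigma_x$, for which $\exp(-i\omega t\sigma_x) = \cos(\omega t)\,I - i\sin(\omega t)\,\sigma_x$ is known in closed form; all function names are made up for this example.

```python
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def evolution_by_product(hamiltonian, t, n, hbar=1.0):
    """Approximate U(t, 0) by n factors (I - (i/hbar)*eps*H) with eps = t/n."""
    eps = t / n
    step = [[(1.0 if i == j else 0.0) - 1j * eps * hamiltonian[i][j] / hbar
             for j in range(2)] for i in range(2)]
    u = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for _ in range(n):
        u = matmul(step, u)
    return u


def evolution_exact(omega, t):
    """exp(-i*omega*t*sigma_x) = cos(omega t) I - i sin(omega t) sigma_x."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [[c, -1j * s], [-1j * s, c]]


def max_abs_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


omega, t = 1.0, 1.0
hamiltonian = [[0.0, omega], [omega, 0.0]]  # H = omega * sigma_x (hbar = 1)
errors = [max_abs_diff(evolution_by_product(hamiltonian, t, n),
                       evolution_exact(omega, t))
          for n in (10, 100, 1000, 10000)]
print(errors)  # the error shrinks roughly like 1/n

# Group property (9B.13) for the closed form: U(1.0) = U(0.7) U(0.3).
group_defect = max_abs_diff(matmul(evolution_exact(omega, 0.7),
                                   evolution_exact(omega, 0.3)),
                            evolution_exact(omega, 1.0))
print(group_defect)
```

Each finite product is only approximately unitary; unitarity is recovered in the limit, which is why the exponential form is the one used in practice.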

B.3

Dynamical Laws

Next we consider the dynamical law for an operator $A$. This can be obtained as follows. Let $A$ be independent of $t$; then the time-variation of the expectation value $\langle A\rangle$ defined in (9A.20a) is:

$$i\hbar\frac{d}{dt}\langle a, t|A|a, t\rangle = i\hbar\frac{d}{dt}\langle\psi(t)|A|\psi(t)\rangle = i\hbar\frac{\partial}{\partial t}(\psi(t), A\psi(t)) = (\psi(t), AH\psi(t)) - (\psi(t), HA\psi(t)) \tag{9B.22}$$

(To write the last step, we have used (9B.19a) and the Hermitian property of $H$.) The use of the partial derivative of $\psi(t)$ in (9B.22) should come as no surprise, since $\psi = \psi(x, y, z, t)$. Thus the dynamical law of an operator independent of $t$ is:

$$i\hbar\frac{d}{dt}\langle A\rangle = \langle AH - HA\rangle \tag{9B.23}$$

where brackets indicate expectation values of the operators enclosed. The above expression shows that commutators of $H$ with the observables play an important role in the theory. If $A$ commutes with $H$, then in view of (9B.23) its expectation value is constant and it is said to be a constant of the motion. If $A$ depends on time, instead of (9B.23) we have:

$$i\hbar\frac{d}{dt}\langle A\rangle = \langle AH - HA\rangle + i\hbar\left\langle\frac{\partial A}{\partial t}\right\rangle \tag{9B.24}$$

We now define an operator $\frac{dA}{dt}$ such that for every state $\psi$ of the system the equality of the expectation values:

$$\left\langle\frac{dA}{dt}\right\rangle = \frac{d}{dt}\langle A\rangle \tag{9B.25}$$

holds. The dynamical law in this case is given by:

$$i\hbar\frac{dA}{dt} = AH - HA + i\hbar\frac{\partial A}{\partial t} \tag{9B.26}$$

The operator $\frac{dA}{dt}$ is called the total time derivative. In Sec. 2 we obtained Hamiltonian operators for a few simple physical systems (see Exp. 9.2.2).
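The dynamical law (9B.23) can be tested on the smallest non-trivial case. The following sketch assumes $\hbar = 1$, the hypothetical Hamiltonian $H = \omega\sigma_x$, the observable $A = \sigma_z$ and the initial state $(1, 0)$; the left-hand side is estimated with a central difference in $t$, and all names are illustrative.

```python
import math


def apply(op, psi):
    """Apply a 2x2 operator to a 2-component state."""
    return [op[0][0] * psi[0] + op[0][1] * psi[1],
            op[1][0] * psi[0] + op[1][1] * psi[1]]


def inner(phi, psi):
    """Scalar product (phi, psi), antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def state(t, omega):
    """psi(t) = exp(-i*omega*t*sigma_x) applied to (1, 0), with hbar = 1."""
    return [complex(math.cos(omega * t)), -1j * math.sin(omega * t)]


def expectation(op, psi):
    """Expectation value <psi|op|psi> for a normalised state."""
    return inner(psi, apply(op, psi))


omega, t, h = 0.8, 0.37, 1e-5
H = [[0j, omega + 0j], [omega + 0j, 0j]]   # omega * sigma_x
A = [[1 + 0j, 0j], [0j, -1 + 0j]]          # sigma_z

# Left-hand side of (9B.23): i*hbar * d<A>/dt by a central difference.
lhs = 1j * (expectation(A, state(t + h, omega))
            - expectation(A, state(t - h, omega))) / (2 * h)

# Right-hand side of (9B.23): <AH - HA>.
AH, HA = matmul(A, H), matmul(H, A)
commutator = [[AH[i][j] - HA[i][j] for j in range(2)] for i in range(2)]
rhs = expectation(commutator, state(t, omega))

print(lhs, rhs)
```

Both sides are purely imaginary here, as expected for the commutator of two Hermitian operators.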

Exercise 9B

1. Show that the formal solution of the Schrödinger equation (9B.19a) can be written as:

$$\psi(t) = U(t, t_0)\psi(t_0) = \exp\left[-\frac{i}{\hbar}(t - t_0)\hat{H}\right]\psi(t_0)$$

where the Hamiltonian $H$ is the energy operator denoted $\hat{H}$.

Hints to Exercise 9B

1. In view of Remark 9B.1, the Hamiltonian in the Schrödinger equation is the energy operator $\hat{H}$. Hence we have to solve

$$i\hbar\frac{\partial \psi(t)}{\partial t} = \hat{H}\psi(t)$$

This can be done by using the integrating factor method. The integrating factor here is:

$$e^{\int_{t_0}^{t} -\frac{i}{\hbar}\hat{H}\,dt} = \exp\left[-\frac{i}{\hbar}(t - t_0)\hat{H}\right].$$

APPENDIX 9C: TOOLS OF PHYSICAL THEORIES In this appendix we define a few mathematical objects (along with their related examples) that are constantly used in theoretical physics, particularly in quantum theories. They are: (i) test functions, (ii) distributions, (iii) Fourier-transforms, and (iv) Green-functions.

C.1

Test Functions and Distributions

Definition 9C.1 A differentiable function $T: \mathbb{R} \to \mathbb{R}$ with compact support is called a test function. For example, given $a > 0$:

$$T_a(x) := \begin{cases} e^{-a^2/(a^2 - x^2)} & |x| < a \\ 0 & |x| \ge a \end{cases} \tag{9C.1}$$

is a test function. Its support has length $2a$, and the function is differentiable. For instance, the first derivative $T_a'(x)$ given by:

$$T_a'(x) = \begin{cases} -\dfrac{2a^2 x}{(a^2 - x^2)^2}\, e^{-a^2/(a^2 - x^2)} & |x| < a \\ 0 & |x| \ge a \end{cases} \tag{9C.2}$$

is well defined everywhere, as we see:

$$\lim_{x \to a^+} T_a'(x) = 0 = \lim_{x \to a^-} T_a'(x) \tag{9C.3}$$

The derivative of a test function is again a test function. In order to define a distribution, we need the concept of a 'weakly convergent (w.c.) sequence of functions', which is as follows:

Definition 9C.2 A sequence of differentiable functions $f_n : \mathbb{R} \to \mathbb{R}$ ($n = 1, 2, 3, \ldots$) is said to be weakly convergent if for any test function $T(x)$ the limit

$$\lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)T(x)\,dx \tag{9C.4}$$

exists. The sequence of functions $f_n(x) = \frac{1}{\pi}\frac{n}{1 + n^2x^2}$, known as Breit-Wigner functions, is weakly convergent. In fact it can be shown that:

$$\lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)T(x)\,dx = T(0) \tag{9C.5}$$

A weakly convergent sequence which satisfies (9C.5) is said to possess the sifting property. Other examples of weakly convergent sequences with the sifting property are:

$$\text{(a)}\ \ f_n(x) = \frac{n}{\sqrt{\pi}}\,e^{-n^2x^2}\ \text{(see Exc. 9C.1)}, \qquad \text{(b)}\ \ f_n(x) = \frac{1}{n\pi}\frac{\sin^2 nx}{x^2} \tag{9C.6}$$

Definition 9C.3 A distribution $D$ (also known as a generalized function) is an equivalence class of weakly convergent sequences of functions $[f_n]$.$^{36}$ For any representative sequence $f_n$, we write it as:

$$\int_{-\infty}^{\infty} D(x)T(x)\,dx := \lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)T(x)\,dx \in \mathbb{R} \tag{9C.7}$$

The LHS is also denoted as $\int DT$ or, in the functional setting, as $D(T)$, and reads "$D$ evaluated on $T$". A distribution is in fact a linear functional over the space of test functions; thus for two test functions $T$ and $S$ we have:

$$D(T + S) = D(T) + D(S) \tag{9C.8}$$

and for any real number $a$:

$$D(aT) = aD(T) \tag{9C.9}$$

We note that a distribution can only be "localized" in a finite interval by evaluating it on a test function with support in that interval; we say that "the distribution is smeared out by the test function." In view of this, distributions are better suited than functions in formulating the uncertainty principle in quantum field theory.

Definition 9C.4 The distribution defined by the sequence $[f_n(x)]$ where

$$f_n(x) = \frac{1}{\pi}\frac{n}{1 + n^2(x - \xi)^2} \tag{9C.10}$$

is called Dirac's delta function at $\xi \in \mathbb{R}$. It is denoted by $\delta_\xi$ or $\delta_\xi(x)$ or $\delta(x - \xi)$.$^{37}$

C.2

Properties of Distributions with Respect to Operations on Them

(i) Two or more distributions can be added to give another distribution. The sum is naturally represented by the equivalence class of the sum of weakly convergent sequences.

(ii) For every $a \in \mathbb{R}$, the distribution $aD$ is a scalar multiple of $D$.

(iii) The derivative of $D$ represented by a sequence $[f_n]$ is the distribution $D'$ defined as:

$$\int D'T = -\int DT' \tag{9C.11}$$

and is represented by the sequence $[f_n'(x)]$. For example, the derivative of the $\delta$-distribution given in (9C.10) is represented by:

$$f_n'(x) = -\frac{2}{\pi}\frac{n^3(x - \xi)}{\left(1 + n^2(x - \xi)^2\right)^2} \tag{9C.12}$$

The above example shows that in many ways distributions behave operationally like functions.$^{38}$ However, this is not always true; for instance, the product of two distributions may or may not be a distribution. An easy example to illustrate this is the square of the $\delta$-distribution, where the sequence:

$$\left[\frac{n}{\pi(1 + n^2x^2)}\right]^2$$

does not converge weakly, and hence the square of an arbitrary $\delta$-distribution is not a distribution.

(iv) The product of a differentiable function $g(x)$ and a distribution $D$ represented by the sequence $[f_n(x)]$ is the distribution:

$$(gD)(T) := \lim_{n \to \infty}\int g f_n T = D(gT) \tag{9C.13}$$

36 Two weakly convergent sequences of functions $f_n(x)$, $g_n(x)$ are equivalent if their difference converges weakly to zero.

37 Sometimes it is also denoted as $\delta(x, \xi)$. It is a generalization of the Kronecker delta $\delta_{ij}$ to the case of a continuous variable, and as such it is defined by the equation $f(\xi) = \int_{-\infty}^{\infty}\delta(x, \xi)f(x)\,dx$, where $f(x)$ is an arbitrary well behaved function. In the above equation $\delta(x, \xi) = 0$ everywhere except when $\xi$ is very close to $x$. (See pp. 82-84 of [24].)

(v) Let $h(x)$ be a diffeomorphism$^{39}$ and let $[f_n(x)]$ be a weakly convergent sequence representing a distribution $D(x)$; then $D(h(x))$ is a distribution defined as:

$$\int D(h(x))T(x)\,dx := \lim_{n \to \infty}\int f_n(h(x))T(x)\,dx = \lim_{n \to \infty}\int f_n(y)\frac{T(h^{-1}(y))}{|h'(h^{-1}(y))|}\,dy \tag{9C.14}$$

This means that substitution in a distribution is possible only if $h(x)$ is a diffeomorphism.

(vi) In the case of distributions, integration and limit as well as differentiation and limit can be interchanged; more specifically:

$$\lim_{n \to \infty}\int D_n(x)T(x)\,dx = \int \lim_{n \to \infty} D_n(x)T(x)\,dx \tag{9C.15}$$

and

$$\lim_{n \to \infty}\frac{d}{dx}D_n(x) = \frac{d}{dx}\lim_{n \to \infty} D_n(x) \tag{9C.16}$$

(Equality (9C.15) follows from Def. 9C.5.) It should be noted that such an interchange is not possible for pointwise convergence of functions; for example

$$\lim_{n \to \infty}\frac{\sin nx}{n} = 0 \tag{9C.17}$$

38 Even the support of a distribution is defined in the same way as that of a function. It is the complement of the largest open subset on which the distribution vanishes.

39 By a diffeomorphism $h(x)$ we mean here that $h(x)$ is a differentiable function whose inverse is defined.

whereas

$$\lim_{n \to \infty}\frac{d}{dx}\left(\frac{\sin nx}{n}\right) = \lim_{n \to \infty}\cos nx \tag{9C.18}$$

fails to exist. Finally, one also has the concept of convergence amongst distributions, as shown below.

Definition 9C.5 A sequence of distributions $D_n$, $n = 1, 2, 3, \ldots$ is said to be convergent if there is a distribution $D$ such that for any test function $T$

$$\lim_{n \to \infty}\int_{-\infty}^{\infty} D_n T = \int_{-\infty}^{\infty} DT \tag{9C.19}$$

We write it as

$$\lim_{n \to \infty} D_n = D \tag{9C.20}$$

Dirac's delta-distribution $\delta_n(x)$ for $n = 1, 2, \ldots$ represents a convergent sequence of distributions:

$$\lim_{n \to \infty}\delta_n(x) = 0 \tag{9C.21}$$

C.3

Green's Functions

In the following we shall define a Green's function and study some of its properties. We shall see that in general Green's functions are not functions, although they carry this name.

Definition 9C.6 Consider an $N$-th order homogeneous linear differential equation:

$$a_0(x)f(x) + a_1(x)f'(x) + \cdots + a_k(x)f^{(k)}(x) + \cdots + a_N(x)f^{(N)}(x) = 0 \tag{9C.22}$$

in the operator form

$$Af = 0 \tag{9C.23}$$

where $A$ is the linear differential operator:

$$A := \sum_{k=0}^{N} a_k(x)\left(\frac{d}{dx}\right)^k \tag{9C.24}$$

with differentiable coefficient functions $a_k(x)$. A Green's function for the linear differential operator $A$ given in (9C.24) is a distribution $G_\xi(x)$ such that$^{40}$

$$AG_\xi(x) = \delta_\xi(x), \qquad \xi \in \mathbb{R} \tag{9C.25}$$

A Green's function is also called a propagator or an elementary solution of $A$. An easy example of a Green's function is Heaviside's step function$^{41}$

$$H(x) = \begin{cases} 0 & x < 0 \\ 1 & x > 0 \end{cases} \tag{9C.26}$$

given by the weakly convergent sequence:

$$[f_n(x)] = \left[\frac{1}{2} + \frac{1}{\pi}\arctan nx\right] \tag{9C.27}$$

This is because for $A = \frac{d}{dx}$ we have (with the limit $n \to \infty$ understood):

$$\frac{d}{dx}H(x) = \frac{d}{dx}\left(\frac{1}{2} + \frac{1}{\pi}\arctan nx\right) = \frac{1}{\pi}\frac{n}{1 + n^2x^2} = \delta(x) \tag{9C.28}$$

40 Sometimes in the literature the RHS carries a negative sign or an $i$.

41 Any differentiable function $f(x)$ can be considered as a distribution in terms of the weakly convergent sequence $f(x), f(x), f(x), \ldots$ formed by it. The function $H(x)$ in this sense is a distribution; more specifically, it is the Green's function $G_0(x)$.

Green's functions defined in (9C.25) play an important role in solving inhomogeneous equations:

$$Af(x) = s(x) \tag{9C.29}$$

The above equation involves two functions $f(x)$ and $s(x)$; the latter, called the source function, is supposed to be known. In the case of many source functions, using $G_\xi(x)$, a solution in the form of a distribution can be defined, thus:

$$D(x) = \int_{-\infty}^{\infty} G_\xi(x)\,s(\xi)\,d\xi \tag{9C.30}$$

Due to its usefulness, the relation in (9C.30) is sometimes referred to as a magic formula. Note that when $A = \frac{d}{dx}$ and $G_\xi(x) = H(x - \xi)$, this formula reduces to the relation between the function $f(x)$ and the source function $s(x)$ (see Exc. 9C.2):

$$f(x) = \int_{-\infty}^{\infty} H(x - \xi)\,s(\xi)\,d\xi = \int_{-\infty}^{x} H(x - \xi)\,s(\xi)\,d\xi + \int_{x}^{\infty} H(x - \xi)\,s(\xi)\,d\xi \tag{9C.31}$$

The second integral is zero since $x - \xi < 0$, and hence $H(x - \xi) = 0$, in the interval $(x, \infty)$; the first integral exists if $s(\xi)$ is integrable in the interval $(-\infty, x]$. Accordingly we have:

$$f(x) = \int_{-\infty}^{x} s(\xi)\,d\xi \tag{9C.32}$$

(In the process we have obtained the 'fundamental theorem of differentiation and integration' in (9C.32).) If the coefficients in $A$ are all constant, the Green's function

$$G_\xi(x) = G(x - \xi) \tag{9C.33}$$

due to translation invariance can be written as G(x). The function H{x -
(9C.34)

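is solved. All these concepts introduced above can be generalized to functions of several variables with complex values after replacing the ordinary Riemann integral by multiple integrals and ordinary derivatives by partial derivatives. For instance, a weakly convergent sequence can be defined as:

Definition 9C.7 A sequence of differentiable functions $f_n : \mathbb{R}^N \to \mathbb{C}$ is said to be weakly convergent if for any differentiable function $T: \mathbb{R}^N \to \mathbb{C}$ with compact support the limit

$$\lim_{n \to \infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f_n(x^1 \ldots x^N)\,T(x^1 \ldots x^N)\,dx^1 \ldots dx^N =: \lim_{n \to \infty}\int f_n(x)T(x)\,dx =: \lim_{n \to \infty}\int f_n T, \qquad x := (x^1 \ldots x^N) \tag{9C.35}$$

exists. A useful test function is:

$$T_a(x) := \begin{cases} e^{-a^2/(a^2 - r^2)} & r < a \\ 0 & r \ge a \end{cases} \qquad \text{where} \quad r := |x| := \sqrt{\sum_{j=1}^{N}(x^j)^2} \tag{9C.36}$$

and $a > 0$.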
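Returning to the one-dimensional magic formula (9C.30) with $G_\xi(x) = H(x - \xi)$, the reduction (9C.31) to (9C.32) can be checked numerically. The sketch below assumes a hypothetical source $s(\xi) = \xi e^{-\xi^2}$, whose antiderivative $-\frac{1}{2}e^{-x^2}$ is known in closed form, and truncates the integral to a finite interval; the names and grid parameters are illustrative.

```python
import math


def heaviside(x):
    """H(x) of (9C.26): 0 for x < 0, 1 for x > 0 (value at 0 set to 0 here)."""
    return 1.0 if x > 0 else 0.0


def source(xi):
    """Hypothetical integrable source s(xi) = xi * exp(-xi^2)."""
    return xi * math.exp(-xi * xi)


def magic_formula(x, half_width=10.0, cells=40_000):
    """Midpoint estimate of the integral of H(x - xi) s(xi) over [-L, L]."""
    dxi = 2 * half_width / cells
    total = 0.0
    for i in range(cells):
        xi = -half_width + (i + 0.5) * dxi
        total += heaviside(x - xi) * source(xi)
    return total * dxi


def antiderivative(x):
    """Closed form of the integral of s from -infinity to x: -exp(-x^2)/2."""
    return -0.5 * math.exp(-x * x)


points = (-1.0, 0.5, 2.0)
defects = [abs(magic_formula(x) - antiderivative(x)) for x in points]
print(defects)
```

The small remaining defect comes from the jump of $H(x - \xi)$ falling inside one grid cell, so it decreases in proportion to the cell width.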

Similarly a Green's function can be defined for the "divergence", the linear partial differential operator acting on differentiable vector-valued functions $E: \mathbb{R}^3 \to \mathbb{R}^3$:

$$\mathrm{div}\,E(x) := \sum_{i=1}^{3}\frac{\partial}{\partial x^i}E^i(x) \tag{9C.37}$$

In view of Eq. (9C.33) this is:

$$G_\xi(x) = G(x - \xi) = \left(G^1(x - \xi), G^2(x - \xi), G^3(x - \xi)\right) \tag{9C.38}$$

which is a solution of the inhomogeneous equation $\mathrm{div}\,G(x) = \delta(x)$. The solution:

$$G(x) = \frac{1}{4\pi r^2}\frac{x}{r} \tag{9C.39}$$

is unique, if it is assumed that far away from the origin $G(x)$ as a distribution approaches zero.

42 The convolution product of two functions $f(x)$ and $g(x)$ is defined as:

$$(f * g)(x) = \int_{-\infty}^{\infty} f(x - \xi)\,g(\xi)\,d\xi$$

Using polar coordinates $(r\cos\phi\sin\theta,\ r\sin\phi\sin\theta,\ r\cos\theta)$ ($0 < r < \infty$, $0 \le \phi < 2\pi$, $0 \le \theta < \pi$) it can be shown that $G(x)$ in (9C.39), although not a piecewise differentiable function, is a distribution given by the limit of the sequence:

$$G_n(x) := \frac{1}{4\pi r^2}\frac{x}{r}\,H\!\left(r - \frac{1}{n}\right) \tag{9C.40}$$

It has the property $G(-x) = -G(x)$ (i.e. it is odd), and can be used in solving inhomogeneous equations of the type:

$$\mathrm{div}\,E(x) = h(x) \tag{9C.41}$$

Thus, with the help of the magic formula (9C.30) and (9C.39), the solution to (9C.41) can be written as:

$$E(x) = \int G(x - \xi)\,h(\xi)\,d^3\xi = \frac{1}{4\pi}\int \frac{x - \xi}{|x - \xi|^3}\,h(\xi)\,d^3\xi \tag{9C.42}$$

The above solution is the limit of superpositions of electric fields generated by point-like charges. (See Exp. (9C.15) on Green's function.)

C.4

Fourier Transforms and Related Objects

Fourier transforms play an important role in solving differential as well as integral equations. To begin with, we define here the objects that lead to these transforms.

Definition 9C.8 A real variable function $f(t)$ is said to be periodic of period $T$ if for all $t \in \mathbb{R}$

$$f(t + T) = f(t) \tag{9C.43}$$

The constant functions and the so-called harmonics

$$\sin\frac{2\pi}{T}t, \quad \cos\frac{2\pi}{T}t, \quad \sin\frac{4\pi}{T}t, \quad \cos\frac{4\pi}{T}t, \ \ldots$$

are easy examples of periodic functions.$^{43}$ Periodic functions are often treated as defined on a circle of circumference $T$ rather than on the whole of the real axis $\mathbb{R}$. Periodicity is then reproduced by wrapping $\mathbb{R}$ on the circle. One of the central questions that is asked for periodic functions is: given a function $f$ of period $T$, when can it be expressed as a superposition of harmonics

$$f(t) = \sum_{j=0}^{\infty} a_j\cos\left(j\frac{2\pi}{T}t\right) + \sum_{j=1}^{\infty} b_j\sin\left(j\frac{2\pi}{T}t\right) \tag{9C.44}$$

that is, as a Fourier series with Fourier coefficients $a_j$, $b_j$? To answer this, one has to address the question of convergence. The convergence may be uniform, pointwise, weak or in the mean, depending on the properties of the periodic function.$^{44}$

43 The harmonics should not be confused with harmonic functions, the solutions of the Laplace equation. The terminology used here is based on the fact that they satisfy the equation of the harmonic oscillator $\ddot{f} + \omega^2 f = 0$ with $\omega = \frac{2\pi}{T}, \frac{4\pi}{T}$, etc.

44 For various types of convergences mentioned above, see an analysis book, e.g., 3.[12] or Ref. [31]; see also Exc. 1.

We note that a Fourier series helps reduce a given differential equation for a periodic function into an algebraic equation for the Fourier coefficients $a_j$, $b_j$. This happens because the harmonics form a basis in the space of periodic functions, such that each basis vector is an eigenvector of the differential operator. Very often the Fourier series of a periodic function $f(t)$ with period $T$ can be expressed in complex form:

$$f(t) = \lim_{n \to \infty}\sum_{j=-n}^{n} c_j\,e^{-i(2\pi/T)jt} \tag{9C.45}$$

where the complex Fourier coefficients $c_j$ are:

$$c_j = \begin{cases} \frac{1}{2}(a_j + ib_j) & j > 0 \\ a_0 & j = 0 \\ \frac{1}{2}(a_{-j} - ib_{-j}) & j < 0 \end{cases} \tag{9C.46}$$

If $f(t)$ is a real function, then the Fourier coefficients $a_j$ and $b_j$ are real and $c_{-j} = \bar{c}_j$. We define an orthonormal set of functions*:

$$e_j(t) := \frac{1}{\sqrt{T}}\,e^{-i(2\pi/T)jt}, \qquad j \in \mathbb{Z}$$

for the complex harmonics, and note that the Fourier coefficients $c_j$ immediately follow from the scalar product:

$$c_j = \frac{1}{\sqrt{T}}(e_j, f) = \frac{1}{T}\int_0^T f(t)\,e^{i(2\pi/T)jt}\,dt \tag{9C.47}$$

Thus obtaining the Fourier series of a periodic function amounts to using (9C.45), where the $c_j$ are calculated with the help of (9C.47). (See Exc. 9C.3.) Likewise, the Fourier series of a distribution $D(t)$ which is periodic can also be obtained by considering the Fourier coefficients $c_j^n$ of periodic functions $f_n(t)$ that represent $D(t)$, where

$$c_j^n = \frac{1}{T}\int_0^T f_n(t)\,e^{i(2\pi/T)jt}\,dt \tag{9C.48a}$$

For instance, consider the periodic $\delta$-distribution with period 1; its Fourier coefficients are:

$$c_j = \int_0^1 \delta(t)\,e^{i(2\pi)jt}\,dt = 1 \tag{9C.48b}$$

consequently the Fourier series for $\delta(t)$ is:

$$\delta(t) = \lim_{n \to \infty}\sum_{j=-n}^{+n} e^{-i(2\pi)jt} \tag{9C.49}$$

* The factor $\frac{1}{\sqrt{T}}$, the normalization constant, is chosen so that $(e_j, e_k) = \delta_{jk}$ for $j, k \in \mathbb{Z}$.

Its convergence (in the sense of distributions) can be obtained by using a test function $T(t)$ on the circle:

$$\lim_{n \to \infty}\int_0^1 \sum_{j=-n}^{n} e^{-2\pi ijt}\,T(t)\,dt = T(0) \tag{9C.50}$$
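The distributional convergence (9C.50) can be illustrated with a concrete smooth function on the circle. The following sketch assumes the hypothetical choice $T(t) = e^{\cos 2\pi t}$ of period 1, for which $T(0) = e$, and approximates the integral over $[0, 1)$ by an equally spaced Riemann sum; the names and the number of samples are illustrative.

```python
import cmath
import math


def periodic_test_function(t):
    """A smooth function of period 1 (hypothetical choice): exp(cos(2*pi*t))."""
    return math.exp(math.cos(2 * math.pi * t))


def paired_partial_sum(n, samples=256):
    """Riemann-sum estimate of the integral over [0, 1) of
    sum_{j=-n}^{n} exp(-2*pi*i*j*t) * T(t), the left side of (9C.50)."""
    total = 0j
    for m in range(samples):
        t = m / samples
        kernel = sum(cmath.exp(-2j * math.pi * j * t) for j in range(-n, n + 1))
        total += kernel * periodic_test_function(t)
    return total / samples


target = periodic_test_function(0.0)   # T(0) = e
partial = {n: paired_partial_sum(n) for n in (0, 1, 2, 5, 10)}
for n, value in partial.items():
    print(n, value.real, abs(value - target))
```

For this very smooth $T$ the partial sums approach $T(0)$ rapidly; for a less regular function on the circle the convergence would be slower.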

We next define the Fourier transform (F.T.) of a function $f(x)$ along with the properties that $f(x)$ should possess for the existence of its F.T.

Definition 9C.9 Let $P$ be any polynomial. A function $f: \mathbb{R} \to \mathbb{C}$ is said to be of fast decrease if it is differentiable, and if the product of $f$ or any of its derivatives with the polynomial $P$ is bounded. Any test function as well as the Gaussian:

$$f(x) = e^{-x^2/a^2}, \qquad a > 0 \tag{9C.51}$$

is of fast decrease. Moreover, if $f$ is of fast decrease, then so are its derivatives and the function resulting from its product with any polynomial.

Definition 9C.10 If $f$ is a function of fast decrease, then its Fourier transform is defined as:$^{45}$

$$\hat{f}(k) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)\,e^{ikx}\,dx \tag{9C.52}$$

(see Exc. 9C.4).

Fourier's Inversion Theorem The Fourier transform $\hat{f}(k)$ of $f(x)$ is of fast decrease and satisfies Fourier's inversion formula:

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\hat{f}(k)\,e^{-ikx}\,dk \tag{9C.53}$$
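The definition (9C.52) and the inversion formula (9C.53) can be tried out on the Gaussian (9C.51). The sketch below is illustrative: it replaces both integrals by midpoint sums over finite intervals and compares the result with the closed form $\hat{f}(k) = \frac{a}{\sqrt{2}}e^{-a^2k^2/4}$, which holds for the sign and normalization conventions of (9C.52); the function names, the value $a = 2$ and the grid sizes are assumptions made for this example.

```python
import cmath
import math

A_WIDTH = 2.0  # hypothetical width parameter a of the Gaussian


def gaussian(x, a=A_WIDTH):
    """f(x) = exp(-x^2/a^2), the Gaussian of (9C.51)."""
    return math.exp(-(x / a) ** 2)


def fourier_transform(func, k, half_width=12.0, cells=400):
    """Midpoint estimate of (1/sqrt(2 pi)) * integral of func(x) e^{ikx} dx."""
    dx = 2 * half_width / cells
    total = 0j
    for i in range(cells):
        x = -half_width + (i + 0.5) * dx
        total += func(x) * cmath.exp(1j * k * x)
    return total * dx / math.sqrt(2 * math.pi)


def inverse_transform(func_hat, x, half_width=6.0, cells=400):
    """Midpoint estimate of (1/sqrt(2 pi)) * integral of func_hat(k) e^{-ikx} dk."""
    dk = 2 * half_width / cells
    total = 0j
    for i in range(cells):
        k = -half_width + (i + 0.5) * dk
        total += func_hat(k) * cmath.exp(-1j * k * x)
    return total * dk / math.sqrt(2 * math.pi)


def gaussian_hat_closed_form(k, a=A_WIDTH):
    """Closed form (a/sqrt(2)) * exp(-a^2 k^2 / 4) for the convention (9C.52)."""
    return a / math.sqrt(2) * math.exp(-(a * k) ** 2 / 4)


k0, x0 = 1.5, 0.8
forward_defect = abs(fourier_transform(gaussian, k0) - gaussian_hat_closed_form(k0))
recovered = inverse_transform(lambda k: fourier_transform(gaussian, k), x0)
inversion_defect = abs(recovered - gaussian(x0))
print(forward_defect, inversion_defect)
```

The finite intervals are harmless here only because both the Gaussian and its transform are of fast decrease; a slowly decaying function would need a different treatment.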

We list below in the form of a theorem some of the properties concerning the Fourier transform of a function of fast decrease.

Theorem 9C.11

(a) If $f$ is of fast decrease, then

$$\widehat{\left(\frac{df}{dx}\right)}(k) = -ik\hat{f}(k) \tag{9C.54}$$

and

$$\widehat{(xf)}(k) = -i\frac{d}{dk}\hat{f}(k) \tag{9C.55}$$

(b) If $f(x) = g(x - a)$, then

$$\hat{f}(k) = e^{ika}\,\hat{g}(k) \tag{9C.56}$$

(c) If $f$ is even or odd, then so is its F.T.

(d) If $f$ and $g$ are of fast decrease, then their convolution product

$$(f * g)(x) := \int_{-\infty}^{\infty} f(x - y)\,g(y)\,dy \tag{9C.57}$$

is well defined and, with the normalization of (9C.52),

$$\widehat{f * g} = \sqrt{2\pi}\,\hat{f}\,\hat{g} \tag{9C.58}$$

45 Sometimes $e^{-ikx}$ is used in place of $e^{ikx}$, and the factor $\sqrt{2\pi}$ is given in the numerator.

Parseval's Theorem 9C.12 If $f$ and $g$ are functions of fast decrease with Fourier transforms $\hat{f}$ and $\hat{g}$, then

$$\int_{-\infty}^{\infty}\bar{f}(x)\,g(x)\,dx = \int_{-\infty}^{\infty}\bar{\hat{f}}(k)\,\hat{g}(k)\,dk \tag{9C.59}$$

where $\bar{f}(x)$ denotes the complex conjugate of $f(x)$. (See Exc. (9C.5) and (9C.6), which deal with functions that are not of fast decrease.) Naturally one can also define Fourier transforms of distributions, though for a restricted class: the class of 'tempered distributions.'

Definition 9C.13 A tempered distribution (T.D.) is an equivalence class of sequences of functions $f_n(x)$ of fast decrease such that for any function $F(x)$ (also) of fast decrease, the sequence of numbers $\int_{-\infty}^{\infty} f_n(x)F(x)\,dx$ converges. We denote its limit as:

$$\lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)F(x)\,dx =: \int_{-\infty}^{\infty} D(x)F(x)\,dx =: D(F) \tag{9C.60}$$

For example, the $\delta$-distribution and the constant function 1 are tempered distributions. In general every distribution of compact support is a T.D.; any linear combination of tempered distributions is a T.D., and the derivative of a T.D. is a T.D. A differentiable function defined on the whole of the $x$-axis is said to be a function of slow increase if its increase at $+\infty$ or $-\infty$ is polynomial or slower. For instance, polynomials or $e^{-1/x^2}\ln x^2$ are functions of slow increase. Any T.D. multiplied by a function of slow increase is again a T.D. In particular a function of slow increase is tempered. We next define the Fourier transform of tempered distributions by Fourier transforming their representative functions.

Definition 9C.14 Let $D(x)$ be a T.D. represented by the functions $f_n(x)$ of fast decrease. By Parseval's Theorem (9C.12) their Fourier transforms $\hat{f}_n(k)$ define a tempered distribution known as the F.T. of $D(x)$ and denoted as $\hat{D}(k)$. Thus,

$$\int_{-\infty}^{\infty}\hat{D}(k)F(k)\,dk = \lim_{n \to \infty}\int_{-\infty}^{\infty}\hat{f}_n(k)F(k)\,dk = \lim_{n \to \infty}\int_{-\infty}^{\infty} f_n(x)\hat{F}(x)\,dx \tag{9C.61}$$

Note that the F.T. of the $\delta$-distribution concentrated at the origin is the constant function $\frac{1}{\sqrt{2\pi}}$. Since the sequence representing the $\delta$-distribution can be given by $f_n(x) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2} = \frac{n}{\sqrt{\pi}}e^{-x^2/(1/n)^2}$, we can use the result established in Eq. (ii) of Exc. (9C.4) to write the Fourier transform of $f_n(x)$ as:

$$\hat{f}_n(k) = \frac{1}{\sqrt{2\pi}}\,e^{-k^2/4n^2} \tag{9C.62}$$

The F.T. $\hat{f}_n(k)$ represents the F.T. of the $\delta$-distribution; evidently it approaches $\frac{1}{\sqrt{2\pi}}$ as $n \to \infty$.

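The Fourier transforms of functions and distributions defined over $\mathbb{R}^N$ are obtained by replacing the single integral by a multiple integral and the factor $(2\pi)^{-\frac{1}{2}}$ by the factor $(2\pi)^{-\frac{N}{2}}$. For instance, if $f: \mathbb{R}^N \to \mathbb{C}$ is a function $f(x)$ of fast decrease, then its F.T. is:

$$\hat{f}(k^1 \ldots k^N) = \frac{1}{(\sqrt{2\pi})^N}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(x^1 \ldots x^N)\,e^{ik \cdot x}\,dx^1 \ldots dx^N = \frac{1}{(2\pi)^{N/2}}\int f(x)\,e^{ik \cdot x}\,dx \tag{9C.63}$$

where $k \cdot x = \sum_{j=1}^{N} k^j x^j$. The inverse transform therefore becomes:

$$f(x) = \frac{1}{(2\pi)^{N/2}}\int \hat{f}(k)\,e^{-ik \cdot x}\,dk \tag{9C.64}$$

The derivatives in Eqs. (9C.54) and (9C.55) are now replaced by partial derivatives; thus the F.T. of partial differentiation in the $x$-space becomes multiplication of the F.T. by the corresponding component in the $k$-space:

$$\widehat{\left(\frac{\partial f}{\partial x^j}\right)}(k) = -ik_j\hat{f}(k), \qquad j = 1, 2 \ldots N \tag{9C.65}$$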

conversely,

$$-i\frac{\partial}{\partial k^j}\hat{f}(k) = \widehat{(x^j f)}(k), \qquad j = 1, 2 \ldots N \tag{9C.66}$$

Finally we give below an example of the computation of a Green's function for the damped harmonic oscillator.

Example 9C.15 Recall that a linear differential operator $A = \left(\frac{d}{dt}\right)^2 + 2\lambda\frac{d}{dt} + \omega_0^2$, where $\omega_0 > \lambda > 0$, is called a damped harmonic oscillator.$^{46}$ The Green's function $G(t)$ with respect to $A$ satisfies:

$$\left[\left(\frac{d}{dt}\right)^2 + 2\lambda\frac{d}{dt} + \omega_0^2\right]G(t) = \delta(t), \qquad \omega_0 > \lambda > 0 \tag{9C.67}$$

46 It is called overdamped if $\lambda > \omega_0 > 0$ and undamped if $\lambda = 0$.

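Our objective is to compute $G(t)$. We assume that $G(t)$ is tempered, so that the differential equation (9C.67) can be Fourier transformed to give an algebraic equation in the frequency space (see Appendix 9B). As mentioned earlier, such an algebraic equation is easier to solve. The solution can then be transformed back using the residue theorem to give finally the Green's function. The F.T. of (9C.67) leads to:$^{47}$

$$\left[(-i\omega)^2 - 2i\lambda\omega + \omega_0^2\right]\hat{G}(\omega) = \frac{1}{\sqrt{2\pi}} \tag{9C.68}$$

This gives

$$\hat{G}(\omega) = \frac{1}{\sqrt{2\pi}}\,\frac{1}{-\omega^2 - 2i\lambda\omega + \omega_0^2} \tag{9C.69}$$

which can be checked to be square integrable; thus $G(t)$ is indeed tempered. Also $\hat{G}(\omega)$ is integrable, therefore in view of the inversion formula (9C.53) one has:

$$G(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\hat{G}(\omega)\,e^{-i\omega t}\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega t}}{-\omega^2 - 2i\lambda\omega + \omega_0^2}\,d\omega = -\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega t}\,d\omega}{(\omega - \omega_1)(\omega - \omega_2)} \tag{9C.70}$$

where

$$\omega_1 = -i\lambda - \sqrt{\omega_0^2 - \lambda^2} \tag{9C.71}$$

and

$$\omega_2 = -i\lambda + \sqrt{\omega_0^2 - \lambda^2} \tag{9C.72}$$

are isolated singularities of the integrand. For $t < 0$ we consider the closed curve of Fig. (9.5) and note that the isolated singularities are not inside this closed curve $C$; hence, applying Cauchy's theorem as $R \to \infty$, we have $G(t) = 0$, as the integral vanishes over the half circle $C$.

[Fig. 9.5: closed curve $C$ of radius $R$ in the complex $\omega$-plane (axes Re $\omega$, Im $\omega$).]

47 $\int_{-\infty}^{\infty} e^{i\omega t}\left[\left(\frac{d}{dt}\right)^2 + 2\lambda\frac{d}{dt} + \omega_0^2\right]G(t)\,dt = \int_{-\infty}^{\infty}\delta(t)\,e^{i\omega t}\,dt$

[Fig. 9.6: closed curve $C$ from $-R$ to $R$ in the complex $\omega$-plane (axes Re $\omega$, Im $\omega$), enclosing $\omega_1$ and $\omega_2$.]

For $t > 0$ we use the clockwise curve $C$ in Fig. (9.6). The singularities are within this closed curve; thus we have, using the residue theorem (see Exc. (9C.6)):

$$G(t) = -2\pi i\left(-\frac{1}{2\pi}\right)\left[\frac{e^{-i\omega_1 t}}{\omega_1 - \omega_2} + \frac{e^{-i\omega_2 t}}{\omega_2 - \omega_1}\right] = e^{-\lambda t}\,\frac{\sin\left(\sqrt{\omega_0^2 - \lambda^2}\;t\right)}{\sqrt{\omega_0^2 - \lambda^2}} \tag{9C.73}$$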
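A quick numerical sanity check of the result for $t > 0$ is sketched below. It assumes the closed form $G(t) = e^{-\lambda t}\sin(\Omega t)/\Omega$ with $\Omega = \sqrt{\omega_0^2 - \lambda^2}$ for $t > 0$ and $G(t) = 0$ for $t \le 0$, and verifies with finite differences that $G$ solves the homogeneous equation away from $t = 0$ while its slope jumps by 1 at $t = 0$, which is what produces the $\delta(t)$ on the right of (9C.67). The parameter values and names are hypothetical.

```python
import math


def green(t, lam, omega0):
    """Candidate G(t): exp(-lam t) sin(Omega t)/Omega for t > 0, zero otherwise."""
    if t <= 0:
        return 0.0
    big_omega = math.sqrt(omega0**2 - lam**2)   # requires omega0 > lam > 0
    return math.exp(-lam * t) * math.sin(big_omega * t) / big_omega


def oscillator_residual(t, lam, omega0, h=1e-4):
    """Central-difference estimate of G'' + 2*lam*G' + omega0^2*G at t > h."""
    g_minus = green(t - h, lam, omega0)
    g_zero = green(t, lam, omega0)
    g_plus = green(t + h, lam, omega0)
    second = (g_plus - 2 * g_zero + g_minus) / h**2
    first = (g_plus - g_minus) / (2 * h)
    return second + 2 * lam * first + omega0**2 * g_zero


lam, omega0 = 0.5, 2.0      # hypothetical damping and frequency, omega0 > lam > 0
residuals = [abs(oscillator_residual(t, lam, omega0)) for t in (0.5, 1.3, 4.0)]
# Slope just after t = 0; the slope just before t = 0 is zero.
slope_after = (green(1e-6, lam, omega0) - green(0.0, lam, omega0)) / 1e-6
print(residuals, slope_after)
```

The unit jump in $G'$ at $t = 0$ with $G$ itself continuous is the usual matching condition for a second-order operator with leading coefficient 1.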

Next we introduce another entity—the 'functional' which is always used in quantum mechanics in particular in path integral formalism.

C.5

Functional and their Calculus

In layman's language a functional is a function of functions, or a function of infinitely many variables. Calculations involving a functional are carried out by considering it as a function of a finite number of variables $(x^1 \ldots x^N)$ and then letting $N \to \infty$. The simplest example of a functional is our familiar action functional of a particle moving in one direction in a potential (with Lagrangian $L(x, \dot{x})$):

$$S[x] = \int_{t_0}^{t_1} dt\,L(x, \dot{x}) = \int_{t_0}^{t_1} dt\left[\frac{1}{2}m\left(\frac{dx}{dt}\right)^2 - V(x)\right] \tag{9C.74}$$

Note that unlike other functions whose value would depend on a particular point $x$, the value of $S[x]$ depends on the entire trajectory (curve) along which the integration is taken from the initial point $x(t_0)$ given by $t_0$ to the final point $x(t_1)$ given by $t_1$. Thus generically a functional is:

$$F[f] = \int dx\,F(f(x)) \tag{9C.75}$$

where for example $F(f(x))$ may simply be $(f(x))^k$. The concept of differentiation of a functional can be viewed as an extension of differentiation of a generalized function. Thus the functional derivative (Gateaux derivative) is defined from the linear functional as:

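$$F'[g] = \frac{d}{d\varepsilon}F[f + \varepsilon g]\Big|_{\varepsilon = 0} = \int dx\,\frac{\delta F}{\delta f(x)}\,g(x) \tag{9C.76}$$

The above definition corresponds to an equivalent expression which is more useful from a computational standpoint:

$$\frac{\delta F(f(x))}{\delta f(y)} = \lim_{\varepsilon \to 0}\frac{F(f(x) + \varepsilon\delta(x - y)) - F(f(x))}{\varepsilon} \tag{9C.77}$$

Equation (9C.77) implies that:

$$\frac{\delta f(x)}{\delta f(y)} = \delta(x - y) \tag{9C.78}$$

The following properties (similar to those of the ordinary derivative) can be easily verified:

$$\frac{\delta}{\delta f(x)}\left(F[f] + G[f]\right) = \frac{\delta F}{\delta f(x)} + \frac{\delta G}{\delta f(x)} \qquad \text{(linearity)} \tag{9C.79}$$

$$\frac{\delta}{\delta f(x)}\left(F[f]\,G[f]\right) = \frac{\delta F}{\delta f(x)}\,G[f] + F[f]\,\frac{\delta G}{\delta f(x)} \qquad \text{(product rule)} \tag{9C.80}$$

$$\frac{\delta}{\delta f(x)}F[G[f]] = \frac{\delta F}{\delta G}\,\frac{\delta G}{\delta f(x)} \qquad \text{(chain rule)} \tag{9C.81}$$

Any given functional $F[f]$ can be expanded as a Taylor series in the following form:

$$F[f] = \int dx\,g_0(x) + \int dx_1\,dx_2\,g_1(x_1, x_2)f(x_2) + \cdots + \int dx_1\,dx_2 \ldots dx_k\,g_{k-1}(x_1, x_2 \ldots x_k)f(x_2) \ldots f(x_k) + \cdots \tag{9C.82}$$

where

$$g_0(x) = F(f(x))\Big|_{f(x) = 0}, \qquad g_1(x_1, x_2) = \frac{\delta F(f(x_1))}{\delta f(x_2)}\bigg|_{f(x) = 0}, \qquad g_2(x_1, x_2, x_3) = \frac{1}{2!}\frac{\delta^2 F(f(x_1))}{\delta f(x_2)\,\delta f(x_3)}\bigg|_{f(x) = 0} \tag{9C.83}$$

We illustrate the functional derivative for two easy functionals: (a) $F[f] = (f)^3$ and (b) $S[x] = \int_{t_0}^{t_1} dt'\,L(x(t'), \dot{x}(t'))$. Writing $F[f] = \int dy\,F(f(y)) = \int dy\,(f(y))^3$ in the case of (a) and using (9C.77) we have:

$$\frac{\delta F(f(y))}{\delta f(x)} = \lim_{\varepsilon \to 0}\frac{F(f(y) + \varepsilon\delta(y - x)) - F(f(y))}{\varepsilon} = \lim_{\varepsilon \to 0}\frac{(f(y) + \varepsilon\delta(y - x))^3 - (f(y))^3}{\varepsilon}$$

$$= \lim_{\varepsilon \to 0}\frac{(f(y))^3 + 3\varepsilon(f(y))^2\delta(y - x) + O(\varepsilon^2) - (f(y))^3}{\varepsilon} = 3(f(y))^2\delta(y - x) \tag{9C.84}$$

In view of (9C.75) and the fact that the derivative of a functional is a functional, we obtain:

$$\frac{\delta F[f]}{\delta f(x)} = \int dy\,\frac{\delta (f(y))^3}{\delta f(x)} = \int dy\,3(f(y))^2\delta(y - x) = 3(f(x))^2 \tag{9C.85}$$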
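The result (9C.85) can be reproduced on a grid, in the spirit of treating a functional as a function of finitely many variables. In the sketch below the integral becomes a midpoint sum with cell width `dx`, and the delta function in (9C.77) becomes a bump of height `1/dx` in a single cell; the sample function $f(y) = \sin y$ on $[0, \pi]$, the grid size and all names are assumptions made for this illustration.

```python
import math


def cubic_functional(values, dx):
    """Discretised F[f] = integral of f(y)^3 dy as a midpoint sum."""
    return math.fsum(v**3 for v in values) * dx


def functional_derivative(functional, values, dx, index, eps=1e-8):
    """Difference quotient of (9C.77): f -> f + eps*delta, where the discrete
    delta function is 1/dx in the chosen cell and zero elsewhere."""
    bumped = list(values)
    bumped[index] += eps / dx
    return (functional(bumped, dx) - functional(values, dx)) / eps


cells = 200
dx = math.pi / cells
grid = [(i + 0.5) * dx for i in range(cells)]
values = [math.sin(y) for y in grid]          # hypothetical sample function

index = 60
numeric = functional_derivative(cubic_functional, values, dx, index)
expected = 3 * values[index] ** 2             # 3 f(x)^2 from (9C.85)
print(numeric, expected)
```

The factor `1/dx` is the essential point: without it the quotient would be an ordinary partial derivative with respect to one grid value, which vanishes as the grid is refined.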

For (b) we write $L$ as a sum of two separate functionals:

$$L(x(t), \dot{x}(t)) = \frac{1}{2}m\dot{x}^2 - V(x) = T(\dot{x}(t)) - V(x) \tag{9C.86}$$

and applying the differentiation rule (9C.77) to $V(x)$ and $T(\dot{x}(t))$, we have respectively:

$$\frac{\delta V(x(t'))}{\delta x(t)} = \lim_{\varepsilon \to 0}\frac{V(x(t') + \varepsilon\delta(t' - t)) - V(x(t'))}{\varepsilon} = \frac{\partial V(x(t'))}{\partial x(t')}\,\delta(t' - t) = V'(x(t'))\,\delta(t' - t) \tag{9C.87}$$

and

$$\frac{\delta T(\dot{x}(t'))}{\delta x(t)} = \frac{\delta T(\dot{x}(t'))}{\delta \dot{x}(t')}\,\frac{\delta \dot{x}(t')}{\delta x(t)} = m\dot{x}(t')\,\frac{d}{dt'}\delta(t' - t) \tag{9C.88}$$

where we have regarded $\dot{x}(t')$ as a function of $x(t)$ and therefore have applied the chain rule (9C.81), and have then used the result of (a) to write the value of $\frac{\delta T(\dot{x}(t'))}{\delta \dot{x}(t')} = \frac{1}{2}m(2\dot{x}(t'))$. Thus piecing (9C.87) and (9C.88) together we have:

$$\frac{\delta S}{\delta x(t)} = \int_{t_0}^{t_1} dt'\left[m\dot{x}(t')\frac{d}{dt'}\delta(t' - t) - V'(x(t'))\,\delta(t' - t)\right] = -m\ddot{x}(t) - V'(x(t)) = -\frac{d}{dt}\frac{\partial L(x(t), \dot{x}(t))}{\partial \dot{x}(t)} + \frac{\partial L(x(t), \dot{x}(t))}{\partial x(t)} \tag{9C.89}$$

We note that the RHS is the familiar Euler-Lagrange expression; this shows that the functional extremum of the action:

$$\frac{\delta S}{\delta x(t)} = 0 \tag{9C.90}$$

is indeed the classical Euler-Lagrange equation.
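The statement (9C.89) can be checked on a discretised path. The sketch below is illustrative only: it assumes a harmonic potential $V(x) = \frac{1}{2}\kappa x^2$ with hypothetical values $m = 1$, $\kappa = 4$, replaces the action by a sum over time steps of width `dt`, and estimates $\delta S/\delta x(t_i)$ as the partial derivative with respect to $x_i$ divided by `dt`.

```python
import math


def action(path, dt, mass, potential):
    """Discretised S[x]: sum of [m/2 ((x_{i+1} - x_i)/dt)^2 - V(x_i)] * dt."""
    terms = (0.5 * mass * ((path[i + 1] - path[i]) / dt) ** 2 - potential(path[i])
             for i in range(len(path) - 1))
    return math.fsum(terms) * dt


def action_derivative(path, dt, mass, potential, index, eps=1e-5):
    """Central-difference estimate of delta S / delta x(t_index), that is,
    the partial derivative of S with respect to x_index divided by dt."""
    up, down = list(path), list(path)
    up[index] += eps
    down[index] -= eps
    return (action(up, dt, mass, potential)
            - action(down, dt, mass, potential)) / (2 * eps * dt)


mass, kappa = 1.0, 4.0                       # hypothetical parameters


def potential(x):
    return 0.5 * kappa * x * x


dt, steps = 0.01, 200
times = [i * dt for i in range(steps + 1)]
index = 100                                  # interior point, t = 1.0

# A classical path x(t) = cos(omega t), omega = sqrt(kappa/m): derivative ~ 0.
omega = math.sqrt(kappa / mass)
classical = [math.cos(omega * t) for t in times]
on_shell = action_derivative(classical, dt, mass, potential, index)

# A non-classical path x(t) = t^2: -m*x'' - V'(x) = -2m - kappa*t^2.
parabola = [t * t for t in times]
off_shell = action_derivative(parabola, dt, mass, potential, index)
expected_off_shell = -2.0 * mass - kappa * times[index] ** 2
print(on_shell, off_shell, expected_off_shell)
```

The classical path makes the derivative vanish only up to the discretisation error of the second difference, which is of order `dt**2`.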

Exercise 9C

1. Establish (9C.5) for $f_n(x) = \frac{1}{\pi}\frac{n}{1 + n^2x^2}$, and show that $f_n(x) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2}$ also satisfies the 'sifting property' (9C.5).

2. Use Eq. (9C.30) to show the justification of writing $f(x)$ on the LHS of (9C.32).

3. Show that the periodic function $f(t) = t$ of period 1 is square integrable. Obtain its Fourier series and show that it converges in the mean.

4. Compute the Fourier transform of the Gaussian function $e^{-x^2/a^2}$ ($a > 0$).

5. Show that the Fourier transform of the 'transmission function' of an ideal slit of width $2a$:

$$f(x) = \begin{cases} 1 & |x| \le a \\ 0 & |x| > a \end{cases}$$

can be defined although $f(x)$ is not a function of fast decrease.

6. Obtain the Fourier transform of the function: $f(x) =$

2

*

a

2

>

0

x +a

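by using complex methods.

7. Show that $\Psi(\mathbf{p}') = (2\pi\hbar)^{-\frac{3}{2}}\int d^3r'\,\exp(-i\mathbf{p}' \cdot \mathbf{r}'/\hbar)\,\Psi(\mathbf{r}')$ and $\Psi(\mathbf{r}') = (2\pi\hbar)^{-\frac{3}{2}}\int d^3p'\,\exp(i\mathbf{p}' \cdot \mathbf{r}'/\hbar)\,\Psi(\mathbf{p}')$ are Fourier transforms of each other.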

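Hints to Exercise 9C

1. Write the integral of (9C.5) as a sum of three integrals,

(i) $$\lim_{n \to \infty} \left[ \int_{-\infty}^{-1/\sqrt{n}} + \int_{-1/\sqrt{n}}^{1/\sqrt{n}} + \int_{1/\sqrt{n}}^{\infty} \right] \frac{1}{\pi} \frac{n}{1 + n^2 x^2}\, T(x)\, dx.$$

Now every test function T(x) is bounded, hence we note that the first and third integrals tend to 0 as n becomes large. This is because

(ii) $$\left| \int_{1/\sqrt{n}}^{\infty} \frac{1}{\pi} \frac{n}{1 + n^2 x^2}\, T(x)\, dx \right| \le A \int_{1/\sqrt{n}}^{\infty} \frac{n}{1 + n^2 x^2}\, dx = A \left[ \frac{\pi}{2} - \arctan \sqrt{n} \right]$$

(where A is a constant), and since $\lim_{n \to \infty} \arctan \sqrt{n} = \pi/2$, it follows that the third term is zero. Repeating similar steps, we can show the vanishing of the first term. The middle term equals:

(iii) $$\lim_{n \to \infty} \int_{-1/\sqrt{n}}^{1/\sqrt{n}} \frac{1}{\pi} \frac{n}{1 + n^2 x^2}\, T(x)\, dx.$$

We apply the mean value theorem to compute it. Suppose there is a number y in the domain of integration $\left[ -\tfrac{1}{\sqrt{n}}, \tfrac{1}{\sqrt{n}} \right]$, then (iii) equals:

(iv) $$\lim_{n \to \infty} T(y) \int_{-1/\sqrt{n}}^{1/\sqrt{n}} \frac{1}{\pi} \frac{n}{1 + n^2 x^2}\, dx = \lim_{n \to \infty} T(y)\, \frac{2}{\pi} \arctan \sqrt{n}.$$

When n becomes large, the interval becomes very small, hence the choice of y as 0 becomes legitimate, which gives the required result:

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} \frac{1}{\pi} \frac{n}{1 + n^2 x^2}\, T(x)\, dx = T(0).$$

For $f_n(x) = \dfrac{n}{\sqrt{\pi}}\, e^{-n^2 x^2}$, the first and third integrals reduce to zero as n becomes large; the second one gives:

(v) $$\int_{-1/\sqrt{n}}^{1/\sqrt{n}} \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2}\, T(x)\, dx.$$

Using the mean value theorem in $\left[ -\tfrac{1}{\sqrt{n}}, \tfrac{1}{\sqrt{n}} \right]$ we have:

(vi) $$T(y) \int_{-1/\sqrt{n}}^{1/\sqrt{n}} \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2}\, dx = T(y) \int_{-\sqrt{n}}^{\sqrt{n}} \frac{1}{\sqrt{\pi}}\, e^{-u^2}\, du$$

where we have changed the variable x to u/n. Thus when $n \to \infty$ the integral, now taken over the interval $(-\infty, \infty)$, approaches 1, while y lying in $\left( -\tfrac{1}{\sqrt{n}}, \tfrac{1}{\sqrt{n}} \right)$ tends to zero, hence

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2}\, T(x)\, dx = T(0)$$

which shows that it satisfies the 'sifting property.'

2. Note that the integral on the RHS of (9C.30) should be viewed as a limit (that exists), in the sense of distributions, of Riemann sums, and as a result for an operator A, it gives:

(i) $$A D(x) = A \int_{-\infty}^{\infty} G_{\xi}(x)\, s(\xi)\, d\xi = A \lim \sum_i G_{\xi_i}(x)\, s(\xi_i)\, \Delta\xi_i = \lim \sum_i A G_{\xi_i}(x)\, s(\xi_i)\, \Delta\xi_i = \lim \sum_i \delta_{\xi_i}(x)\, s(\xi_i)\, \Delta\xi_i$$

Since $D(x) = f(x)$ and $A = \dfrac{d}{dx}$, the LHS of (i) equals $\dfrac{d}{dx} f(x)$, hence we have:

$$\frac{d}{dx} f(x) = s(x)$$

which apparently leads to (9C.32).

3. A function f(t) defined in an interval (a, b) is said to be square integrable if

(i) $$\int_a^b |f(t)|^2\, dt < \infty.$$

Also if it is a periodic function of period T, then $\int_0^T |f(t)|^2\, dt < \infty$. In this case the function is a periodic function of period 1 in the interval $t \in [0, 1)$, thus since

$$\int_0^1 \left( t - \tfrac{1}{2} \right)^2 dt = \left[ \frac{t^3}{3} - \frac{t^2}{2} + \frac{t}{4} \right]_0^1 = \frac{1}{12}$$

it is square integrable. To obtain the Fourier series, we calculate the Fourier coefficients (see (9C.47)) by using integration by parts for $j \ne 0$:

(ii) $$c_j = \int_0^1 \left( t - \tfrac{1}{2} \right) e^{-2\pi i j t}\, dt = \left[ \left( t - \tfrac{1}{2} \right) \frac{e^{-2\pi i j t}}{-2\pi i j} \right]_0^1 + \frac{1}{2\pi i j} \int_0^1 e^{-2\pi i j t}\, dt = -\frac{1}{2} \frac{1}{2\pi i j} - \frac{1}{2} \frac{1}{2\pi i j} + 0 = -\frac{1}{2\pi i j}.$$

For j = 0, we have:

(iii) $$c_0 = \int_0^1 \left( t - \tfrac{1}{2} \right) dt = 0.$$

Hence, the Fourier series after simplification of coefficients is:

$$f(t) = -\frac{1}{\pi} \sum_{j=1}^{\infty} \frac{1}{j} \sin 2\pi j t.$$

To show the second part of the exercise, we consider the scalar product of two periodic functions f and g of period T:

(iv) $$(f, g) = \int_0^T \overline{f(t)}\, g(t)\, dt$$

under the assumption that the integral exists. This product is sesquilinear, i.e., antilinear in the first component and linear in the second. The norm of f:

(v) $$\sqrt{(f, f)} = \| f \|$$

measures the convergence in the mean. In this case, with $S_N$ denoting the N-th partial sum of the series, it is:

(vi) $$\| f - S_N \|^2 = \int_0^1 |f(t)|^2\, dt - \sum_{0 < |j| \le N} |c_j|^2 = \frac{1}{12} - \frac{1}{2\pi^2} \sum_{j=1}^{N} \frac{1}{j^2} \longrightarrow \frac{1}{12} - \frac{1}{2\pi^2} \cdot \frac{\pi^2}{6} = 0 \quad (N \to \infty)$$

which shows that the series converges in the mean.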
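The coefficients and the mean convergence in Hint 3 can be checked with a short computation. This is a sketch under the assumption that the Fourier coefficients are defined with the kernel $e^{-2\pi i j t}$ (the convention of (9C.47) is not reproduced here); the function names and step counts are illustrative.

```python
import cmath
import math

def fourier_coefficient(k, steps=20_000):
    """Midpoint-rule value of the integral over [0, 1] of (t - 1/2) exp(-2 pi i k t)."""
    h = 1.0 / steps
    total = 0j
    for i in range(steps):
        t = (i + 0.5) * h
        total += (t - 0.5) * cmath.exp(-2j * math.pi * k * t)
    return total * h

def mean_square_error(n_terms):
    """||f - S_N||^2 = 1/12 - sum_{k=1}^{N} 1/(2 pi^2 k^2), using |c_k|^2 = 1/(4 pi^2 k^2)."""
    return 1.0 / 12.0 - sum(
        1.0 / (2.0 * math.pi ** 2 * k ** 2) for k in range(1, n_terms + 1)
    )

# Closed form from Hint 3: c_k = -1/(2 pi i k) = i/(2 pi k) for k != 0.
print(fourier_coefficient(3), 1j / (6.0 * math.pi))
print([mean_square_error(n) for n in (1, 10, 100, 1000)])
```

The mean-square error decreases roughly like $1/(2\pi^2 N)$, which is slow but does tend to zero, as the hint asserts.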

4. In view of (9C.52), the F.T. of $e^{-x^2/a^2}$ (a > 0) would be:

(i) $$\hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/a^2 + ikx}\, dx = \frac{1}{\sqrt{2\pi}}\, e^{-a^2 k^2/4} \int_{-\infty}^{\infty} \exp\left[ -\left( \frac{x}{a} - \frac{ika}{2} \right)^2 \right] dx.$$

Similar to Exp. (9C.15) we apply Cauchy's theorem to the closed curve C in Fig. (9.7) for evaluating the integral in (i).

[Fig. 9.7: the closed curve C in the complex x/a plane, with the real segment running from -R to R.]

A change of variable to $t = \dfrac{x}{a} - \dfrac{ika}{2}$ gives $dx = a\, dt$; evidently the limits of integration are still the same so that:

(ii) $$\hat{f}(k) = \frac{1}{\sqrt{2\pi}}\, e^{-a^2 k^2/4} \int_{-\infty}^{\infty} e^{-t^2}\, a\, dt = \frac{a}{\sqrt{2}}\, e^{-a^2 k^2/4}.$$

This shows that the F.T. of a Gaussian is again a Gaussian, as also illustrated in Fig. 9.8.

[Fig. 9.8: graphs of f(x) and of its F.T. for small a and for large a.]

We note however that the width of the first Gaussian is:

$$\left[ \frac{\int_{-\infty}^{\infty} x^2 |f(x)|^2\, dx}{\int_{-\infty}^{\infty} |f(x)|^2\, dx} \right]^{1/2} = \frac{a}{2}$$

whereas that of the second is 1/a. Thus the more f(x) is peaked, the flatter is its F.T., as is evident from the illustrations in Fig. (9.8).

5. In view of (9C.52), the F.T. of the transmission function f(x) is:

$$\hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-a}^{+a} e^{ikx}\, dx = \frac{1}{\sqrt{2\pi}} \left[ \frac{e^{ikx}}{ik} \right]_{-a}^{+a} = \sqrt{\frac{2}{\pi}}\, \frac{\sin ka}{k}.$$

The graphs of f(x) and $\hat{f}(k)$ given below illustrate how the uncertainty principle (see (9A.21) for definition) is verified experimentally by using the diffraction by a slit. Although $\hat{f}(k)$ has been computed, the integral in the Fourier inversion formula (9C.53) is not absolutely convergent (which would always be the case for f(x) if it was of fast decrease), and also Eq. (9C.54) does not hold in the sense of functions.

[Figure: graphs of f(x) and $\hat{f}(k)$ for small a and for large a. Caption: Uncertainty principle verified via diffraction by a slit.]

6. We write the F.T. $\hat{f}(k)$ as:

(i) $$\hat{f}(k) = \frac{a}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{e^{ikx}}{(x + ia)(x - ia)}\, dx$$

and for k > 0 we use the Residue Theorem (See 1.[1], 1.[7] and Subsec. 1.6 of Chapter 1) with the closed curve C (consisting of the interval [-R, R] and the half circle) to evaluate the integral. We thus have

(ii) $$\hat{f}(k) = \frac{a}{\sqrt{2\pi}} \oint_C \frac{e^{ikz}}{z^2 + a^2}\, dz = \frac{a}{\sqrt{2\pi}} \left\{ \int_{-R}^{R} \frac{e^{ikx}}{x^2 + a^2}\, dx + \int_{\text{half circle}} \frac{e^{ikz}}{z^2 + a^2}\, dz \right\} = \frac{a}{\sqrt{2\pi}} \left( 2\pi i\, \mathrm{Res}_{z = ia} \left\{ \frac{g(z)}{h(z)} \right\} \right) = \frac{a}{\sqrt{2\pi}} \left( 2\pi i\, \frac{g(ia)}{h'(ia)} \right) = \sqrt{\frac{\pi}{2}}\, e^{-ka}.$$

Here we have used g(z) and h(z) to denote the holomorphic functions $e^{ikz}$ and $(z^2 + a^2)$ respectively, hence $h'(z)|_{ia} = 2ia$. Similarly for k < 0 we apply the Residue Theorem to the lower half of the circle that encloses the singularity -ia, and obtain

(iii) $$\hat{f}(k) = \sqrt{\frac{\pi}{2}}\, e^{ka}.$$

When k = 0, g(z) becomes a constant function whose value at every point is 1; the result of the Residue Theorem still holds good, accordingly,

(iv) $$\hat{f}(k) \big|_{k=0} = \sqrt{\frac{\pi}{2}}.$$

Collecting the results of (ii), (iii), and (iv) we have:

$$\hat{f}(k) = \sqrt{\frac{\pi}{2}}\, e^{-|k| a}.$$

Evidently $\hat{f}(k)$ is not differentiable as a function (at k = 0), hence property (9C.55) of the F.T. does not hold good. The above two exercises show that if a function is not of fast decrease, its F.T. can exist, but it may not satisfy all the properties.

[Figure: the curve C in the complex plane, with the real segment from -R to R, enclosing the isolated singularity ia.]

7. Use (9C.63) and (9C.64) to solve this exercise. (Note that r' and p' respectively belong to the coordinate and momentum space of a particle moving in 3-space.)
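The closed forms obtained in Hints 4 to 6 can be compared with direct numerical integration. The sketch below assumes the convention $\hat{f}(k) = (2\pi)^{-1/2} \int f(x)\, e^{ikx}\, dx$ used in those hints; since all three functions are even, only the cosine part of the kernel contributes. The values of `a` and `k`, the integration windows and the helper name `fourier_even` are illustrative choices.

```python
import math

def fourier_even(f, k, half_width, steps):
    """(1/sqrt(2 pi)) * integral of f(x) cos(k x) over [-half_width, half_width],
    by the midpoint rule; valid as the full transform only for even f."""
    h = 2.0 * half_width / steps
    total = math.fsum(
        f(-half_width + (i + 0.5) * h) * math.cos(k * (-half_width + (i + 0.5) * h))
        for i in range(steps)
    )
    return total * h / math.sqrt(2.0 * math.pi)

a, k = 1.5, 0.8

gauss = fourier_even(lambda x: math.exp(-((x / a) ** 2)), k, 20.0, 40_000)
slit = fourier_even(lambda x: 1.0 if abs(x) <= a else 0.0, k, 2.0 * a, 40_000)
lorentz = fourier_even(lambda x: a / (x * x + a * a), k, 2000.0, 400_000)

gauss_exact = a / math.sqrt(2.0) * math.exp(-(a * k) ** 2 / 4.0)      # Hint 4
slit_exact = math.sqrt(2.0 / math.pi) * math.sin(k * a) / k           # Hint 5
lorentz_exact = math.sqrt(math.pi / 2.0) * math.exp(-abs(k) * a)      # Hint 6

print(gauss, gauss_exact)
print(slit, slit_exact)
print(lorentz, lorentz_exact)
```

The Lorentzian needs a much wider window than the Gaussian because it decays only like $1/x^2$, which is the numerical face of the remark that it is not of fast decrease.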


APPENDIX 9D: QUANTUM GROUPS

We describe in brief the special type of Hopf algebras which have come to be known as 'quantum groups,' since they were discovered, to begin with, through quantum mechanical models, more precisely using the quantum enveloping algebra (of a semisimple Lie algebra). Although the origins of these groups can be traced to the early eighties, their popularity in mathematics and physics is of much more recent origin. It is well known now that quantum groups are related to theories of low-dimensional topology on one hand, and statistical physics on the other. We refer the reader to the article by Kirillov and Reshetikhin in 4.[6b] for a historical survey and to C. Kassel in Ref. [Ad] for the theory and more recent references. We have divided the Appendix into two subsections. The first deals with the definitions that lead to the definition of a Hopf algebra, and the second dwells on objects (which we would like to name q-objects) that are required to define the Hopf algebras $SL_q(2)$ and $U_q(sl(2))$. These are indeed the simplest examples of quantum groups.

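D.1 Algebra, Coalgebra, Bialgebra and Hopf Algebra

Before defining these (Hopf) algebras, we wish to note that, unlike other types of algebras, the definition here is not based directly on the axioms of an algebra; instead it stems from the so-called coalgebra. In order to define a 'coalgebra,' we make a pictorial presentation of the definition of 'algebra.'

Definition 9D.1 An algebra over a field K is a triple $(A, \lambda, \mu)$ where A is a vector space and $\lambda: A \otimes A \to A$ and $\mu: K \to A$ are linear maps which satisfy the axioms (Ass) and (Un):

(Ass): The square

$$\begin{array}{ccc} A \otimes A \otimes A & \xrightarrow{\ \lambda \otimes \mathrm{id}\ } & A \otimes A \\ \downarrow {\scriptstyle \mathrm{id} \otimes \lambda} & & \downarrow {\scriptstyle \lambda} \\ A \otimes A & \xrightarrow{\ \ \lambda\ \ } & A \end{array} \qquad (9D.1)$$

commutes.

(Un): The diagram

$$\begin{array}{ccccc} K \otimes A & \xrightarrow{\ \mu \otimes \mathrm{id}\ } & A \otimes A & \xleftarrow{\ \mathrm{id} \otimes \mu\ } & A \otimes K \\ & {\scriptstyle \cong} \searrow & \downarrow {\scriptstyle \lambda} & \swarrow {\scriptstyle \cong} & \\ & & A & & \end{array} \qquad (9D.2)$$

commutes.

The first axiom (Ass) expresses the requirement that the multiplication map $\lambda$ is associative, whereas the second axiom (Un) implies that the element $\mu(1)$ of A is a left as well as a right unit for $\lambda$. The algebra A is commutative if, in addition, it satisfies the axiom:

(Comm): The triangle

$$\begin{array}{ccc} A \otimes A & \xrightarrow{\ \tau_{A,A}\ } & A \otimes A \\ & {\scriptstyle \lambda} \searrow \quad \swarrow {\scriptstyle \lambda} & \\ & A & \end{array} \qquad (9D.3)$$

commutes. Note that $\tau_{A,A}$ denotes the flip, the mapping which switches the two factors of $A \otimes A$; thus $\tau_{A,A}(a \otimes a') = a' \otimes a$. Using the above definition, the morphism between two algebras $(A, \lambda, \mu)$ and $(A', \lambda', \mu')$ is defined as follows:

Definition 9D.2 A morphism of algebras $f: (A, \lambda, \mu) \to (A', \lambda', \mu')$ is a linear map f from A to A' such that:

$$\lambda' \circ (f \otimes f) = f \circ \lambda \quad \text{and} \quad f \circ \mu = \mu' \qquad (9D.4)$$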

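By definition a coalgebra is a triple which is obtained by reversing the arrows in the diagrams (9D.1-2). More precisely, we have:

Definition 9D.3 A coalgebra is a triple $(C, \nu, \delta)$ where C is a vector space and $\nu: C \to C \otimes C$ and $\delta: C \to K$ are linear maps that satisfy the axioms (Coass) and (Coun) given below.

(Coass): The square

$$\begin{array}{ccc} C & \xrightarrow{\ \ \nu\ \ } & C \otimes C \\ \downarrow {\scriptstyle \nu} & & \downarrow {\scriptstyle \mathrm{id} \otimes \nu} \\ C \otimes C & \xrightarrow{\ \nu \otimes \mathrm{id}\ } & C \otimes C \otimes C \end{array} \qquad (9D.5)$$

commutes.

(Coun): The diagram

$$\begin{array}{ccccc} K \otimes C & \xleftarrow{\ \delta \otimes \mathrm{id}\ } & C \otimes C & \xrightarrow{\ \mathrm{id} \otimes \delta\ } & C \otimes K \\ & {\scriptstyle \cong} \nwarrow & \uparrow {\scriptstyle \nu} & \nearrow {\scriptstyle \cong} & \\ & & C & & \end{array} \qquad (9D.6)$$

commutes.

The map $\nu$ is called the coproduct or the comultiplication, while $\delta$ is called the counit of the coalgebra. The diagrams (9D.5-6) imply that the coproduct $\nu$ is coassociative and counital. Furthermore, if the triangle

(Cocomm)

$$\begin{array}{ccc} & C & \\ & {\scriptstyle \nu} \swarrow \quad \searrow {\scriptstyle \nu} & \\ C \otimes C & \xrightarrow{\ \tau_{C,C}\ } & C \otimes C \end{array} \qquad (9D.7)$$

commutes, where $\tau_{C,C}$ denotes the flip, we call the coalgebra cocommutative. A morphism between two coalgebras $(C, \nu, \delta)$ and $(C', \nu', \delta')$ is defined as follows:

Definition 9D.4 A linear map f from C to C' is a morphism of coalgebras if:

$$(f \otimes f) \circ \nu = \nu' \circ f \quad \text{and} \quad \delta = \delta' \circ f \qquad (9D.8)$$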

Associated with any algebra A there is the opposite algebra, denoted $A^{op}$, whose vector space is the same as that of A but whose multiplication is defined as:

$$\lambda_{A^{op}} = \lambda_A \circ \tau_{A,A} \qquad (\lambda_{A^{op}}(a \otimes a') = a' a) \qquad (9D.9)$$

From (9D.3) it is evident that A is commutative if and only if

$$\lambda_{A^{op}} = \lambda_A \qquad (9D.10)$$

Similarly, the 'opposite coalgebra' of a given coalgebra $(C, \nu, \delta)$ can be defined by setting the mapping

$$\nu^{op} = \tau_{C,C} \circ \nu \qquad (9D.11)$$

thus $(C, \nu^{op}, \delta)$ is a coalgebra known as the opposite coalgebra of $(C, \nu, \delta)$. We note that the field K has a natural coalgebra structure with $\nu(1) = 1 \otimes 1$ and $\delta(1) = 1$. Also for any coalgebra $(C, \nu, \delta)$ the map $\delta: C \to K$ is a morphism of coalgebras. Furthermore, we note that the dual vector space of a coalgebra is an algebra (See Exc. (9D.1)). It is called the dual algebra of C. We list below a few examples of coalgebras.

Example 9D.4 Let X be a set and $C = K[X] = \bigoplus_{x \in X} K x$ be the vector space with basis X. Define

$$\nu(x) = x \otimes x \quad \text{and} \quad \delta(x) = 1 \qquad (9D.12)$$

for $x \in X$. Then $(C, \nu, \delta)$ is a coalgebra (of a set). The dual algebra C* is the algebra of functions on X with values in K. Thus in this case a linear form f on C is determined by its values on the basis X, and if f' is another linear form, then (see footnote 48):

$$(f f')(x) = \lambda(f \otimes f')(x) = \bar{\eta}(f \otimes f')(\nu(x)) = f(x)\, f'(x) \qquad (9D.13)$$

The unit of the algebra C* is given by the constant function $\delta$. While the dual vector space of a coalgebra is an algebra, in general, the dual vector space of an algebra does not carry a natural coalgebra structure. If, however, the vector space of A is finite dimensional, the dual vector space has a coalgebra structure.

Example 9D.5 Consider the set $M_r(K)$ of (r x r) matrices with entries in K. Let $E_{ij}$ denote the matrix with all entries equal to zero except for the (i, j) entry, which equals 1 (see Chapter 4 for an Exp.). The set $\{E_{ij}\}$ ($1 \le i, j \le r$) is a basis of $M_r(K)$. Let $\{x_{ij}\}$ denote the dual basis. If A denotes the algebra formed by the set $M_r(K)$, then A* is the coalgebra defined by:

$$\nu(x_{ij}) = \sum_{k=1}^{r} x_{ik} \otimes x_{kj} \quad \text{and} \quad \delta(x_{ij}) = \delta_{ij} \qquad (9D.14)$$

In fact we have:

$$\delta(x_{ij}) = x_{ij}(\mu(1)) = x_{ij}\Big( \sum_k E_{kk} \Big) = \sum_k \delta_{ik}\, \delta_{kj} = \delta_{ij}$$

and

$$\lambda^{*}(x_{ij})(E_{kl} \otimes E_{mn}) = x_{ij}\big( \lambda(E_{kl} \otimes E_{mn}) \big) = \delta_{lm}\, x_{ij}(E_{kn}) = \delta_{lm}\, \delta_{ik}\, \delta_{jn} \qquad (9D.15)$$

$$= \sum_p \delta_{ik}\, \delta_{lp}\, \delta_{pm}\, \delta_{jn} = \sum_p x_{ip}(E_{kl})\, x_{pj}(E_{mn}) = \eta\Big( \sum_p x_{ip} \otimes x_{pj} \Big)(E_{kl} \otimes E_{mn}) \qquad (9D.16)$$

Footnote 48: See Hint to Exc. 1 for the mappings $\eta$ and $\bar{\eta}$.
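The axioms (Coass) and (Coun) for the matrix coalgebra of Example 9D.5, together with the duality computation (9D.15)-(9D.16), can be verified mechanically for a small r. The following is an illustrative sketch, not taken from the text: tensors are stored as `Counter` objects keyed by tuples of basis labels `(i, j)` standing for $x_{ij}$, and every function name is hypothetical.

```python
from collections import Counter
from itertools import product

r = 3
idx = range(r)

def delta(i, j):
    """Kronecker delta; also the counit value delta(x_ij)."""
    return 1 if i == j else 0

def nu(i, j):
    """nu(x_ij) = sum_k x_ik (x) x_kj, keyed by pairs of basis labels."""
    return Counter({((i, k), (k, j)): 1 for k in idx})

def nu_on_left(i, j):
    """(nu (x) id) applied to nu(x_ij)."""
    out = Counter()
    for (left, right), c in nu(i, j).items():
        for (l1, l2), c2 in nu(*left).items():
            out[(l1, l2, right)] += c * c2
    return out

def nu_on_right(i, j):
    """(id (x) nu) applied to nu(x_ij)."""
    out = Counter()
    for (left, right), c in nu(i, j).items():
        for (r1, r2), c2 in nu(*right).items():
            out[(left, r1, r2)] += c * c2
    return out

def counit_on_left(i, j):
    """(delta (x) id) applied to nu(x_ij), as a combination of the x_ab."""
    out = Counter()
    for (left, right), c in nu(i, j).items():
        coeff = c * delta(*left)
        if coeff:
            out[right] += coeff
    return out

def pairing_of_nu(i, j, k, l, m, n):
    """Evaluate nu(x_ij) on E_kl (x) E_mn, using x_ab(E_cd) = delta_ac delta_bd."""
    return sum(
        c * delta(a1, k) * delta(b1, l) * delta(a2, m) * delta(b2, n)
        for ((a1, b1), (a2, b2)), c in nu(i, j).items()
    )

def pairing_of_product(i, j, k, l, m, n):
    """Evaluate x_ij on the matrix product E_kl E_mn = delta_lm E_kn."""
    return delta(l, m) * delta(i, k) * delta(j, n)

coassociative = all(nu_on_left(i, j) == nu_on_right(i, j) for i in idx for j in idx)
counital = all(counit_on_left(i, j) == Counter({(i, j): 1}) for i in idx for j in idx)
dual_to_product = all(
    pairing_of_nu(*t) == pairing_of_product(*t) for t in product(idx, repeat=6)
)
print(coassociative, counital, dual_to_product)
```

The last check is the content of (9D.15)-(9D.16): the comultiplication on the dual basis is the transpose of matrix multiplication.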

Example 9D.6 The tensor product $C \otimes C'$ of two coalgebras $(C, \nu, \delta)$ and $(C', \nu', \delta')$ is a coalgebra with comultiplication $(\mathrm{id} \otimes \tau_{C,C'} \otimes \mathrm{id}) \circ (\nu \otimes \nu')$ and counit $\delta \otimes \delta'$.

Similar to the concepts of an ideal and a quotient algebra in the case of an algebra, we have here a coideal and a quotient coalgebra; these are defined as follows:

Definition 9D.7 Let $(C, \nu, \delta)$ be a coalgebra. A subspace I of C is a coideal if $\nu(I) \subset I \otimes C + C \otimes I$ and $\delta(I) = 0$. In this case $\nu$ factors through a map $\bar{\nu}$ from C/I to $C \otimes C / (I \otimes C + C \otimes I) = C/I \otimes C/I$. Similarly the counit factors through a map $\bar{\delta}: C/I \to K$. The triple $(C/I, \bar{\nu}, \bar{\delta})$ is a coalgebra called the quotient coalgebra.

Sweedler's Sigma Notation 9D.8 It is usual to write elements of the tensor products $C \otimes C$ or $C \otimes C \otimes C$ for a coalgebra C in this notation. According to this, if x is an element of $(C, \nu, \delta)$, then the element $\nu(x)$ of $C \otimes C$ is of the form:

$$\nu(x) = \sum_i x'_i \otimes x''_i \qquad (9D.17a)$$

This is alternatively written as:

$$\nu(x) = \sum_{(x)} x' \otimes x'' \equiv \sum_{(x)} x^{(1)} \otimes x^{(2)} \qquad (9D.17b)$$

where the numeral in parentheses stands for the number of primes used. The coassociativity of $\nu$ (i.e., the commutativity of the square (9D.5)) is expressed by

$$\sum_{(x)} \Big( \sum_{(x')} (x')' \otimes (x')'' \Big) \otimes x'' = \sum_{(x)} x' \otimes \Big( \sum_{(x'')} (x'')' \otimes (x'')'' \Big) = \sum_{(x)} x' \otimes x'' \otimes x''' \equiv \sum_{(x)} x^{(1)} \otimes x^{(2)} \otimes x^{(3)} \qquad (9D.18)$$

Moreover if we apply the comultiplication to (9D.18), we obtain the following equal expressions:

$$\sum_{(x)} \nu(x') \otimes x'' \otimes x''', \qquad \sum_{(x)} x' \otimes \nu(x'') \otimes x''', \qquad \sum_{(x)} x' \otimes x'' \otimes \nu(x''').$$

Note that a coalgebra morphism (Def. (9D.4)) in Sweedler's notation can be written as:

$$\sum_{(x)} f(x') \otimes f(x'') = \sum_{(f(x))} (f(x))' \otimes (f(x))'' \qquad (9D.19)$$


Definition 9D.9 Consider a vector space H equipped with an algebra structure $(H, \lambda, \mu)$, and also with a coalgebra structure $(H, \nu, \delta)$, such that the two structures satisfy the following compatibility conditions:

(i) The maps $\lambda$, $\mu$ are morphisms of coalgebras.
(ii) The maps $\nu$, $\delta$ are morphisms of algebras.

A quintuple $(H, \lambda, \mu, \nu, \delta)$ for which the above two equivalent statements are satisfied is called a 'bialgebra.' (The equivalence between (i) and (ii) is established in the Hint to Exc. 2.) We note that a morphism of bialgebras is a morphism for the underlying algebra and coalgebra structures.

Definition 9D.10 Let $(A, \lambda, \mu)$ and $(C, \nu, \delta)$ be an algebra and a coalgebra, and consider the vector space Hom(C, A) of linear maps from C to A. Then for f and g in Hom(C, A), the composition of maps:

$$C \xrightarrow{\ \nu\ } C \otimes C \xrightarrow{\ f \otimes g\ } A \otimes A \xrightarrow{\ \lambda\ } A \qquad (9D.20)$$

is the convolution map f * g. In Sweedler's sigma notation, we express it as:

$$(f * g)(y) = \sum_{(y)} f(y')\, g(y''), \qquad y \in C \qquad (9D.21)$$

Evidently the convolution is bilinear. When $(H, \lambda, \mu, \nu, \delta)$ is a bialgebra, we may consider the particular case C = A = H, and thus define the convolution on the vector space End(H) of endomorphisms of H. This leads us to the definition:

Definition 9D.11 Let $(H, \lambda, \mu, \nu, \delta)$ be a bialgebra. An endomorphism $S \in \mathrm{End}(H)$ is called an antipode for the bialgebra H if

$$S * \mathrm{id}_H = \mathrm{id}_H * S = \mu \circ \delta \qquad (9D.22)$$

Definition 9D.12 A Hopf algebra is a bialgebra with an antipode. A morphism of Hopf algebras is a morphism between the underlying bialgebras commuting with the antipodes.

In relation to antipodes and Hopf algebras, the following remark and result are noteworthy.

Remark 9D.13 A bialgebra may or may not have an antipode; if it does, it is unique. For if S and S' are two antipodes of a bialgebra, then

$$S = S * (\mu \circ \delta) = S * (\mathrm{id}_H * S') = (S * \mathrm{id}_H) * S' = (\mu \circ \delta) * S' = S' \qquad (9D.23)$$

A Hopf algebra with an antipode S is denoted $(H, \lambda, \mu, \nu, \delta, S)$. Using Sweedler's convention, we note that an antipode S satisfies the relations:

$$\sum_{(x)} x'\, S(x'') = \delta(x)\, 1 = \sum_{(x)} S(x')\, x'', \qquad (x \in H) \qquad (9D.24)$$

X *(I) ® x(2) ® S(x0)) ® x(4) ® x(5) = X * ( 0 ® S(xi2y) ® *(3) ® JC(4) (.v)

= X *

U)

(U

(2)

®*

(3)

®*

(9D.25)

(The first equality in (9D.25) is obvious from (9D.24) and (9D.17b), whereas the second follows from the axiom (Coun), diagram (9D.6).)

Result 9D.14 Let H be a finite-dimensional Hopf algebra with antipode S; then the bialgebra H* is a Hopf algebra with antipode S*.


D.2 The Quantum Plane, the Algebra $M_q(2)$, and Hopf Algebras $GL_q(2)$, $SL_q(2)$ and $U_q(sl(2))$

Definition 9D.15 Let q be an invertible element of the ground field K, and let $I_q$ be the two-sided ideal of the free algebra (see footnote 49) K{x, y} generated by the element yx - qxy. The quotient algebra

$$K_q[x, y] = K\{x, y\} / I_q \qquad (9D.26)$$

is called the 'quantum plane.' When $q \ne 1$, the algebra $K_q[x, y]$ is non-commutative.

Footnote 49: Consider the vector space K{X} whose basis is the set of all words $x_{i_1} \cdots x_{i_p}$ in the elements of the set X, including the empty word $\emptyset$. An element $x_{i_1} \cdots x_{i_p}$ is called a monomial and its length p is called the degree of the monomial. The vector space K{X} equipped with the multiplication

$$(x_{i_1} \cdots x_{i_p})(x_{i_{p+1}} \cdots x_{i_n}) = x_{i_1} \cdots x_{i_p}\, x_{i_{p+1}} \cdots x_{i_n}$$

becomes an algebra known as the 'free algebra.' When $X = \{x_1, \ldots, x_n\}$, K{X} is also denoted as $K\{x_1, \ldots, x_n\}$, as in the above definition. Note that if I denotes the two-sided ideal of $K\{x_1, \ldots, x_n\}$ generated by all elements of the form $x_i x_j - x_j x_i$, where i, j belong to the set (1, 2, 3, ..., n), then the quotient algebra $K\{x_1, \ldots, x_n\}/I$ is isomorphic to the polynomial algebra $K[x_1, \ldots, x_n]$ in n variables with coefficients in the ground field K.

For any algebra A, the algebra of 2 x 2 matrices with entries in A is denoted by $M_2(A)$. As a set $M_2(A)$ is in bijection with the set $A^4$ of 4-tuples. If further, M(2) denotes the polynomial algebra K[a, b, c, d] = K{a, b, c, d}/I, where I is the two-sided ideal generated by the commutators of the generators (see footnote 49), and A is a commutative algebra, then

$$\mathrm{Hom}_{Alg}(M(2), A) \cong M_2(A) \qquad (9D.27)$$

This bijection maps an algebra morphism $f: M(2) \to A$ to the matrix

$$\begin{pmatrix} f(a) & f(b) \\ f(c) & f(d) \end{pmatrix} \qquad (9D.28)$$

Furthermore, the multiplication of matrices $M_2(A) \times M_2(A) \to M_2(A)$ leads to the bijection of $M_2(A) \times M_2(A)$ with $A^8$, which eventually gives the polynomial algebra

$$M(2)^{\otimes 2} = K[a', a'', b', b'', c', c'', d', d''] \qquad (9D.29)$$

The above discussions lead to an important result:

Result 9D.16 Let $\lambda: M(2) \to M(2)^{\otimes 2}$ be the algebra morphism defined by:

$$\lambda(a) = a'a'' + b'c'', \quad \lambda(b) = a'b'' + b'd'', \quad \lambda(c) = c'a'' + d'c'', \quad \lambda(d) = c'b'' + d'd'' \qquad (9D.30)$$

then for any commutative algebra A the morphism $\lambda$ corresponds to the matrix multiplication in $M_2(A)$ under the identifications (9D.27) and (9D.29). The proof is obvious from the relation:

$$\lambda \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \lambda(a) & \lambda(b) \\ \lambda(c) & \lambda(d) \end{pmatrix} = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} \begin{pmatrix} a'' & b'' \\ c'' & d'' \end{pmatrix} \qquad (9D.31)$$

In order to define a q-analogue of the algebra M(2), we consider the variables x, y subject to the quantum plane relation yx = qxy and in addition consider four variables a, b, c, d that commute with x and y. Next we define x', y', x'' and y'' using the matrix relations:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} x'' \\ y'' \end{pmatrix} = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \qquad (9D.32)$$

Result 9D.17 For the two sets of variables x', y', x'', y''; a, b, c, d that are related by the matrix equations given in (9D.32), the following two sets of relations are equivalent:

(i) $$y'x' = q\, x'y', \qquad y''x'' = q\, x''y''$$

(ii) $$ba = q\, ab, \quad db = q\, bd, \quad ca = q\, ac, \quad dc = q\, cd, \quad bc = cb, \quad ad - da = (q^{-1} - q)\, bc$$

(See Hint to Exc. 5 for the proof of the equivalence of (i) and (ii).)

Definition 9D.18 The quotient of the free algebra K{a, b, c, d} by the two-sided ideal $J_q$ generated by the six relations given in (ii) of the above result is the algebra $M_q(2)$. When q = 1, the algebra $M_q(2)$ is obviously isomorphic to M(2).

Result 9D.19 The element $ad - q^{-1} bc = da - q\, bc$ of $M_q(2)$ is central. It is called the quantum determinant of $M_q(2)$ and is denoted $\det_q$.

To prove the result we have to show that $\det_q$ commutes with all the generators a, b, c, d. Using the relations in (ii) of Result (9D.17), we have

$$(ad - q^{-1} bc)\, a = a\, (da - q\, bc), \qquad (ad - q^{-1} bc)\, b = b\, (ad - q^{-1} bc)$$

$$(ad - q^{-1} bc)\, c = c\, (ad - q^{-1} bc), \qquad (da - q\, bc)\, d = d\, (ad - q^{-1} bc).$$

We further note that similar to $M_q(2)$ we can also define the algebra $M_{q^{-1}}(2)$ by replacing q by $q^{-1}$ in the quantum relation yx = qxy. To proceed further toward our goal of the Hopf algebras $GL_q(2)$, etc., we recall the definitions of the groups $GL_2(A)$ and $SL_2(A)$ that result from $M_2(A)$. For instance (see footnote 50):

$$GL_2(A) = \left\{ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in M_2(A) \ \text{such that} \ \alpha\delta - \beta\gamma \in A^{\times} \right\}$$

and $SL_2(A)$ is the subgroup of $GL_2(A)$ of matrices with determinant $\alpha\delta - \beta\gamma = 1$. This leads to the following result:

Result 9D.20 Define the commutative algebras

$$GL(2) = M(2)[t] / ((ad - bc)\, t - 1) \qquad (9D.33)$$

and (see footnote 51)

$$SL(2) = GL(2)/(t - 1) = M(2)/(ad - bc - 1) \qquad (9D.34)$$

then for any commutative algebra A, there are bijections:

$$\text{(a)}\ \ \mathrm{Hom}_{Alg}(GL(2), A) \cong GL_2(A) \quad \text{and} \quad \text{(b)}\ \ \mathrm{Hom}_{Alg}(SL(2), A) \cong SL_2(A) \qquad (9D.35)$$

that send an algebra morphism f to the matrix $\begin{pmatrix} f(a) & f(b) \\ f(c) & f(d) \end{pmatrix}$ (See Exc. 6 for the proof and (9D.28) for the above expression).

Footnote 50: The algebra A is commutative and $A^{\times}$ is the group formed by all elements of A that are invertible.

Footnote 51: Recall that M(2) = K[a, b, c, d] is the quotient of the free algebra K{a, b, c, d} by the two-sided ideal generated by the commutators of the generators.

We use (9D.29) to write the following commutative algebras:

$$GL(2)^{\otimes 2} = M(2)^{\otimes 2}[t', t''] / \big( (a'd' - b'c')\, t' - 1,\ (a''d'' - b''c'')\, t'' - 1 \big) \qquad (9D.36)$$

$$SL(2)^{\otimes 2} = GL(2)^{\otimes 2} / (t' - 1,\ t'' - 1) = M(2)^{\otimes 2} / \big( (a'd' - b'c' - 1),\ (a''d'' - b''c'' - 1) \big) \qquad (9D.37)$$

The morphism given in Result (9D.16) then leads to the algebra morphisms:

$$\text{(a)}\ \ \lambda: GL(2) \to GL(2)^{\otimes 2}, \qquad \text{(b)}\ \ \lambda: SL(2) \to SL(2)^{\otimes 2} \qquad (9D.38)$$

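In order to obtain the Hopf algebras from $M_q(2)$, we need to endow it with a bialgebra structure; we state it in the following result.

Result 9D.21 There exist unique morphisms of algebras:

$$\nu: M_q(2) \to M_q(2) \otimes M_q(2) \quad \text{and} \quad \delta: M_q(2) \to K \qquad (9D.39)$$

given by:

$$\nu(a) = a \otimes a + b \otimes c, \qquad \nu(b) = a \otimes b + b \otimes d \qquad (9D.40)$$

$$\nu(c) = c \otimes a + d \otimes c, \qquad \nu(d) = c \otimes b + d \otimes d \qquad (9D.41)$$

$$\delta(a) = \delta(d) = 1, \qquad \delta(b) = \delta(c) = 0 \qquad (9D.42)$$

The algebra $M_q(2)$ equipped with these morphisms becomes a bialgebra which is neither commutative nor cocommutative. Under these morphisms, we also have:

$$\nu({\det}_q) = {\det}_q \otimes {\det}_q, \qquad \delta({\det}_q) = 1 \qquad (9D.43)$$

The relations (9D.40-42) can be written in the matrix form as:

$$\nu \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \text{and} \quad \delta \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad (9D.44)$$

Proof In order to prove the result we have to check the coassociativity and counit axioms. For this we write:

$$(\nu \otimes \mathrm{id})\, \nu = (\mathrm{id} \otimes \nu)\, \nu \qquad (9D.45)$$

and since both sides are morphisms of algebras, we verify it on the generators a, b, c, d. Using the matrix form (9D.44) we have:

$$\big( (\nu \otimes \mathrm{id})\, \nu \big) \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right) \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right) = \big( (\mathrm{id} \otimes \nu)\, \nu \big) \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad (9D.46)$$

Similarly, the counit axiom follows from the matrix identity:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad (9D.47)$$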

The computation of $\nu({\det}_q)$ in (9D.43) follows from the result established in Exc. 7.

Definition 9D.22 Consider the algebras

$$GL_q(2) = M_q(2)[t] / (t\, {\det}_q - 1) \equiv G_q \quad \text{and} \quad SL_q(2) = GL_q(2)/(t - 1) = M_q(2)/({\det}_q - 1) \equiv S_q$$

then given an algebra R, an R-point of $GL_q(2)$ (respectively of $SL_q(2)$) is defined as an R-point m = (A, B, C, D) of $M_q(2)$ whose quantum determinant

$$\mathrm{Det}_q(m) = AD - q^{-1} BC \qquad (9D.48)$$

is invertible in R (respectively is equal to 1). It can be shown that the set of R-points of $G_q$ (respectively of $S_q$) is in bijection with the set $\mathrm{Hom}_{Alg}(G_q, R)$ (respectively $\mathrm{Hom}_{Alg}(S_q, R)$) of algebra morphisms from $G_q$ to R (respectively from $S_q$ to R). We thus have the following important result:

Result 9D.23 The comultiplication $\nu$ and the counit $\delta$ of $M_q(2)$ given by (9D.40-42) equip the algebras $GL_q(2)$ and $SL_q(2)$ with Hopf algebra structures such that the antipode S is given in the matrix form by

$$\begin{pmatrix} S(a) & S(b) \\ S(c) & S(d) \end{pmatrix} = {\det}_q^{-1} \begin{pmatrix} d & -q\, b \\ -q^{-1} c & a \end{pmatrix} \qquad (9D.49)$$

The proof of the above result entails showing that $\nu$ and $\delta$ are well-defined maps on $GL_q(2)$ and $SL_q(2)$, and that both these algebras have antipodes. We refer the reader to Thm. IV.6.1 in C. Kassel, Ref. [Ad] for the proof. In the following remark we summarize the important aspect of these algebras with which we began the introduction.

Remark 9D.24 The bialgebra $M_q(2)$ and the Hopf algebras $GL_q(2)$ and $SL_q(2)$, which are obtained using the self-transformations of the quantum plane, are indeed one-parameter deformations of the bialgebras M(2), GL(2) and SL(2); the parameter here is the (quantum number) q. These are the simplest examples of quantum groups.

Finally, we give below two more examples of Hopf algebras.

Example 9D.25 Let L denote a Lie algebra, and let U(L) = T(L)/I(L) be its enveloping algebra (See Chapter 4). Define a comultiplication $\nu$ on U(L) by $\nu = \psi \circ U(d)$, where d is the diagonal map $x \mapsto (x, x)$ from L into $L \oplus L$ and $\psi$ is the isomorphism $U(L \oplus L) \to U(L) \otimes U(L)$, and also define a counit given by $\delta = U(0)$, where 0 is the zero morphism from L into the zero Lie algebra {0}. The antipode for this is defined by S = U(op), where 'op' is the isomorphism from L onto $L^{op}$ such that op(x) = -x for $x \in L$ (see footnote 52).

Footnote 52: $L^{op}$ = opposite Lie algebra of L. It is the vector space L with Lie bracket $[\,,]^{op}$ given by $[x, y]^{op} = [y, x] = -[x, y]$. Also $U(L^{op}) = U(L)^{op}$.

The enveloping algebra U(L) is a cocommutative Hopf algebra for the maps $\nu$, $\delta$ and the antipode S defined above. More explicitly, for $x_1, \ldots, x_n \in L$, we have:

$$\nu(x_1 \cdots x_n) = 1 \otimes x_1 \cdots x_n + \sum_{p=1}^{n-1} \sum_{\sigma} x_{\sigma(1)} \cdots x_{\sigma(p)} \otimes x_{\sigma(p+1)} \cdots x_{\sigma(n)} + x_1 \cdots x_n \otimes 1 \qquad (9D.50)$$

where $\sigma$ runs over all (p, n - p) shuffles of the symmetric group $S_n$, and the antipode S satisfies

$$S(x_1 x_2 \cdots x_n) = (-1)^n\, x_n \cdots x_2 x_1 \qquad (9D.51)$$
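The shuffle formula (9D.50) can be made concrete by enumeration. The sketch below is an illustration under simplifying assumptions: it works with formal words in the tensor algebra (as in Exc. 9) rather than in U(L), so no Lie-bracket relations are imposed, and all names are hypothetical. It builds the coproduct once from the shuffle formula and once by extending $v \mapsto 1 \otimes v + v \otimes 1$ multiplicatively, and compares the two.

```python
from collections import Counter
from itertools import combinations

def shuffle_coproduct(word):
    """Shuffle formula: choose which positions go to the left tensor factor,
    keeping the original order inside each factor."""
    n = len(word)
    out = Counter()
    for p in range(n + 1):
        for left_positions in combinations(range(n), p):
            chosen = set(left_positions)
            left = tuple(word[i] for i in range(n) if i in chosen)
            right = tuple(word[i] for i in range(n) if i not in chosen)
            out[(left, right)] += 1
    return out

def multiplicative_coproduct(word):
    """Extend v -> v (x) 1 + 1 (x) v multiplicatively over the letters of word."""
    result = Counter({((), ()): 1})
    for letter in word:
        step = Counter()
        for (left, right), c in result.items():
            step[(left + (letter,), right)] += c    # factor v (x) 1
            step[(left, right + (letter,))] += c    # factor 1 (x) v
        result = step
    return result

def flip(tensor):
    """Exchange the two tensor factors."""
    return Counter({(right, left): c for (left, right), c in tensor.items()})

word = ("x1", "x2", "x3")
cop = shuffle_coproduct(word)
print(len(cop), sum(cop.values()))
```

For n distinct letters the coproduct has $2^n$ terms, each with coefficient 1, and it is unchanged by the flip, which is the cocommutativity asserted in the text.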

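The enveloping algebra U(L) is a Hopf algebra since the coassociativity axiom (9D.5) required for the definition is satisfied as a consequence of the commutativity of the square:

$$\begin{array}{ccc} L & \xrightarrow{\ \ \eta\ \ } & L \oplus L \\ \downarrow {\scriptstyle \eta} & & \downarrow {\scriptstyle \mathrm{id} \oplus \eta} \\ L \oplus L & \xrightarrow{\ \eta \oplus \mathrm{id}\ } & L \oplus L \oplus L \end{array} \qquad (9D.52)$$

and the counit axiom (9D.6) is satisfied in view of the commutativity of the diagram

$$\begin{array}{ccccc} 0 \oplus L & \xleftarrow{\ 0 \oplus \mathrm{id}\ } & L \oplus L & \xrightarrow{\ \mathrm{id} \oplus 0\ } & L \oplus 0 \\ & {\scriptstyle \cong} \nwarrow & \uparrow {\scriptstyle \eta} & \nearrow {\scriptstyle \cong} & \\ & & L & & \end{array} \qquad (9D.53)$$

and the cocommutativity follows from

$$\begin{array}{ccc} & L & \\ & {\scriptstyle \eta} \swarrow \quad \searrow {\scriptstyle \eta} & \\ L \oplus L & \xrightarrow{\ \ \tau\ \ } & L \oplus L \end{array} \qquad (9D.54)$$

The morphism $\eta$ should not be confused with $\nu$ (See hint to Exc. 9 for more explanations). We have taken this map to illustrate (9D.5), (9D.6) and (9D.7). Note also that the tensor product sign $\otimes$ in (9D.5)-(9D.7) is changed to the direct sum sign $\oplus$; this is obvious from the isomorphism $\psi$ mentioned in Exp. (9D.25).

Example 9D.26 Next we consider the enveloping algebra of the familiar Lie algebra sl(2). We recall that sl(2) is formed by (2 x 2) traceless matrices. If the ground field K is the field of complex numbers, the Lie algebra gl(2) of all (2 x 2) matrices is four-dimensional, and the matrices

$$X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad (9D.55)$$

are its basis elements. The subspace spanned by the basis {X, Y, H} is an ideal of gl(2) and is the Lie algebra sl(2). Since

$$gl(2) = sl(2) \oplus K I \qquad (9D.56)$$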

it follows that the results for gl(2) can be deduced from those of sl(2). The enveloping algebra U = U(sl(2)) is isomorphic to the algebra generated by the three elements X, Y, H with the three relations:

$$[X, Y] = H, \qquad [H, X] = 2X, \qquad [H, Y] = -2Y \qquad (9D.57)$$
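The relations (9D.57) can be confirmed directly on the standard 2 x 2 matrices. This is a small sketch that assumes the usual choice of X, Y, H for sl(2) (upper nilpotent, lower nilpotent, diagonal); the helper names are illustrative.

```python
def matmul(p, r):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(p[i][k] * r[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(p, r):
    """[p, r] = p r - r p."""
    pr, rp = matmul(p, r), matmul(r, p)
    return [[pr[i][j] - rp[i][j] for j in range(2)] for i in range(2)]

def scale(c, p):
    """Scalar multiple c * p."""
    return [[c * entry for entry in row] for row in p]

def trace(p):
    return p[0][0] + p[1][1]

X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

print(commutator(X, Y) == H)
print(commutator(H, X) == scale(2, X))
print(commutator(H, Y) == scale(-2, Y))
```

All three matrices are traceless, and the three brackets close on their span, which is what makes {X, Y, H} a basis of the Lie algebra sl(2).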

In view of the above Exp. the algebra U(sl(2)) is a Hopf algebra. Although this algebra has many important properties that are used to study the structural theory of Hopf algebras, we limit our attention to the interesting feature of its duality with the algebra SL(2) (see (9D.34)). The notion of duality amongst bialgebras is given by the following definition.

Definition 9D.27 Let $(U, \lambda, \mu, \nu, \delta)$ and $(H, \lambda, \mu, \nu, \delta)$ be two bialgebras and let $\langle\ ,\ \rangle$ denote a bilinear form on U x H. We say that the bilinear form realizes a duality between U and H (or that they are in duality) if the following relations hold good for all $u, v \in U$ and $x, y \in H$:

$$\langle uv, x \rangle = \sum_{(x)} \langle u, x' \rangle\, \langle v, x'' \rangle \qquad (9D.58)$$

$$\langle u, xy \rangle = \sum_{(u)} \langle u', x \rangle\, \langle u'', y \rangle \qquad (9D.59)$$

$$\langle 1, x \rangle = \delta(x) \qquad (9D.60)$$

$$\langle u, 1 \rangle = \delta(u) \qquad (9D.61)$$

If in addition U and H are Hopf algebras with antipodes S, then they are said to be in duality if the underlying bialgebras are in duality and if we have:

$$\langle S(u), x \rangle = \langle u, S(x) \rangle \qquad (9D.62)$$

for all $u \in U$ and $x \in H$. (We leave it for the reader to examine that U(sl(2)) and SL(2) satisfy the conditions of duality given in the above definition; see Chapter 5 in C. Kassel, Ref. [Ad].)

As a final piece toward our description of quantum groups, we define the Hopf algebra $U_q = U_q(sl(2))$.

Definition 9D.28 Let $q \in \mathbb{C}$ be an element which is different from 1 and -1; then the fraction $\dfrac{1}{q - q^{-1}}$ is well defined. The algebra generated by four variables $E, F, K, K^{-1}$ with the relations:

$$K K^{-1} = K^{-1} K = 1 \qquad (9D.63)$$

$$K E K^{-1} = q^2 E, \qquad K F K^{-1} = q^{-2} F \qquad (9D.64)$$

and

$$[E, F] = \frac{K - K^{-1}}{q - q^{-1}} \qquad (9D.65)$$

is the algebra $U_q = U_q(sl(2))$. The algebra $U_q$ admits a unique algebra automorphism $\omega$ such that

Exercise 9D

1. Show that the dual vector space of a coalgebra is an algebra.

2. Establish the equivalence between (i) and (ii) of Def. (9D.9) using the pictorial representations of the maps $\lambda$, $\mu$, $\nu$, $\delta$.

3. Show that the triple $(\mathrm{Hom}(C, A), *, \mu \circ \delta)$ is an algebra, and the map $\eta_{C,A}: A \otimes C^* \to \mathrm{Hom}(C, A)$ is a morphism of algebras, where $A \otimes C^*$ is the tensor product algebra of A and of the algebra C* (dual to the coalgebra C). Show further that when A = K the algebra structure $(\mathrm{Hom}(C, K), *, \mu \circ \delta)$ on the dual space C* is the same as the one defined in Exc. 1.

4. The quotient algebra $K\{x_1, \ldots, x_n\}/I$ is isomorphic to the polynomial algebra $K[x_1, \ldots, x_n]$ in n variables with coefficients in the ground field K. When n = 1, for any commutative algebra A, the underlying set A is in bijection with the set (see footnote 53)

(a) $$\mathrm{Hom}_{Alg}(K[x], A) \cong A.$$

The algebra K[x] is called the affine line and the set $\mathrm{Hom}_{Alg}(K[x], A)$ is called the set of A-points of the affine line. The algebra of polynomials $K[x_1, x_2]$ with the bijection

(b) $$\mathrm{Hom}_{Alg}(K[x_1, x_2], A) \cong A^2$$

is called the affine plane; an element in $\mathrm{Hom}_{Alg}(K[x_1, x_2], A)$ is called an A-point of the affine plane. Suppose that $\lambda: K[x] \to K[x_1, x_2]$, $\mu: K[x] \to K$, $\nu: K[x] \to K[x]$ are the algebra morphisms defined by:

(c) $$\lambda(x) = x_1 + x_2, \qquad \mu(x) = 0, \qquad \nu(x) = -x.$$

Then show that under the identifications (a) and (b), the morphisms $\lambda$, $\mu$, $\nu$ correspond to the maps +, 0, - given by:

(d) $$+ : A^2 \to A, \qquad \text{the unit } 0 : \{0\} \to A, \qquad \text{and the inverse } - : A \to A.$$

5. Establish the equivalence between (i) and (ii) of Result (9D.17).

6. Prove Result (9D.20).

7. Given an algebra R, an R-point of $M_q(2)$ is a quadruple $(A, B, C, D) \in R^4$ which satisfies the relations (ii) of Result (9D.17) with (a, b, c, d) replaced by (A, B, C, D). Writing (A, B, C, D) in the matrix form $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$, show that if m = (A, B, C, D) and m' = (A', B', C', D') are two R-points of $M_q(2)$ that commute, then the element

$$m'm = \begin{pmatrix} A'' & B'' \\ C'' & D'' \end{pmatrix} = \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$

is an R-point of $M_q(2)$; show further that the quadruple $\begin{pmatrix} D & -q B \\ -q^{-1} C & A \end{pmatrix}$ is an R-point of $M_{q^{-1}}(2)$ and an $R^{op}$-point of $M_q(2)$, and that $\mathrm{Det}_q(m'm) = \mathrm{Det}_q(m')\, \mathrm{Det}_q(m)$ in R, where $\mathrm{Det}_q(m) = AD - q^{-1} BC = DA - q\, BC$. The element $\mathrm{Det}_q(m)$ of the algebra R is called the quantum determinant of m.

8. Show that $\nu$ and $\delta$ given in (9D.39) are algebra morphisms.

Footnote 53: $\mathrm{Hom}_{Alg}(K[x], A)$ = the set of algebra morphisms from K[x] to A.


9. Given a vector space V, show that there exists a unique bialgebra structure on the tensor algebra T(V) such that $\eta(v) = 1 \otimes v + v \otimes 1$ and $\delta(v) = 0$ for any element v in V. This bialgebra is cocommutative and for all $v_1, \ldots, v_n \in V$

(a) $$\delta(v_1 \cdots v_n) = 0$$

whereas

(b) $$\eta(v_1 \cdots v_n) = 1 \otimes v_1 \cdots v_n + \sum_{p=1}^{n-1} \sum_{\sigma} v_{\sigma(1)} \cdots v_{\sigma(p)} \otimes v_{\sigma(p+1)} \cdots v_{\sigma(n)} + v_1 \cdots v_n \otimes 1$$

where $\sigma$ runs over all permutations of the symmetric group $S_n$ such that $\sigma(1) < \sigma(2) < \cdots < \sigma(p)$ and $\sigma(p+1) < \cdots < \sigma(n)$ (such a permutation is sometimes called a (p, n - p) shuffle).

Hints to Exercise 9D

1. Let $(C, \nu, \delta)$ be a coalgebra. It is known that given a finite dimensional vector space $C$ there exists an isomorphism $\eta: C^* \otimes C^* \to (C \otimes C)^*$, where $C^*$ is the dual space of $C$. Let $\bar\eta = \eta \circ \tau_{C^*, C^*}$; define $A = C^*$, $\lambda = \nu^* \circ \bar\eta$ and $\mu = \delta^*$, where the superscript $*$ on a linear map indicates its transpose. Then $(A, \lambda, \mu)$ is an algebra, proving the result that the dual vector space of $(C, \nu, \delta)$ is an algebra.

2. In order to establish the compatibility between the two structures, we consider the tensor product $H \otimes H$ of the vector space $H$ and the two induced structures of algebra and coalgebra on it. We then use the commutative diagrams (9D.1)-(9D.2) and (9D.5)-(9D.6) to express both statements (i) and (ii) of Def. (9D.9). The fact that $\lambda$ is a morphism of coalgebras is expressed by the commutativity of two diagrams, which read as equations:

(a) $\nu \circ \lambda = (\lambda \otimes \lambda) \circ (\mathrm{id} \otimes \tau \otimes \mathrm{id}) \circ (\nu \otimes \nu): H \otimes H \to H \otimes H$, and $\delta \circ \lambda = \delta \otimes \delta: H \otimes H \to K \otimes K \cong K$.

The fact that $\mu$ is a morphism of coalgebras is expressed by the commutativity of the following diagrams:

(b) $\nu \circ \mu = \mu \otimes \mu: K \cong K \otimes K \to H \otimes H$, and $\delta \circ \mu = \mathrm{id}: K \to K$.

It is easy to note that these four commutative diagrams are exactly the same as the four diagrams whose commutativity expresses the fact that $\nu$ and $\delta$ are morphisms of algebras:

(c) $\nu \circ \lambda = (\lambda \otimes \lambda) \circ (\mathrm{id} \otimes \tau \otimes \mathrm{id}) \circ (\nu \otimes \nu)$, and $\nu \circ \mu = \mu \otimes \mu$,

and

(d) $\delta \circ \lambda = \delta \otimes \delta$, and $\delta \circ \mu = \mathrm{id}$.

(The reader will recall that the mapping $\tau$ in these diagrams indicates the flipping of two elements.) (Hints to 3 and 4 can be found in Chapters 3 and 1 of C. Kassel; see Ref. [Ad].)

5. We have to show that (i) $\Leftrightarrow$ (ii). We use the first matrix relation of (9D.32) to write $y'x' = q\,x'y'$; we thus have:

($\alpha$) $(cx + dy)(ax + by) = q\,(ax + by)(cx + dy)$.

Equating the coefficients of $x^2$, $y^2$ and $xy$ on both sides (and using $yx = q\,xy$) we obtain:

($\beta$) $ca = q\,ac$, $db = q\,bd$, $cb + q\,da = q\,ad + q^2\,bc$.

The third equality in ($\beta$), when divided by $q$, yields:

($\gamma$) $ad - da = q^{-1}cb - q\,bc$.

Similarly, using $x''$ and $y''$ we have:

($\delta$) $ba = q\,ab$, $dc = q\,cd$, $ad - da = q^{-1}bc - q\,cb$.

(Note that ($\delta$) is just ($\beta$) and ($\gamma$) with $b$ and $c$ interchanged, as should be expected from the expressions for $\begin{pmatrix} x' \\ y' \end{pmatrix}$ and $\begin{pmatrix} x'' \\ y'' \end{pmatrix}$ in (9D.32).) From ($\gamma$) and ($\delta$) we obtain

($\epsilon$) $(q^{-1} + q)(bc - cb) = 0$.

Since $q^2 \neq -1$ we have the final equality $bc = cb$ of (ii), showing that (i) implies (ii). The converse can be proved using similar arguments. (Note that we then begin with the relations given in (ii) of Result (9D.17) and obtain the two relations $y'x' = q\,x'y'$ and $y''x'' = q\,x''y''$ using the matrix relations of (9D.32).)

6. We prove the result for $GL(2)$. Let $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ be a matrix in $GL_2(A)$. Since $A$ is commutative, there is a unique morphism $f: M(2)[t] \to A$ such that:

(i) $f(a) = \alpha$, $f(b) = \beta$, $f(c) = \gamma$, $f(d) = \delta$, and $f(t) = (\alpha\delta - \beta\gamma)^{-1}$.

Also

(ii) $f((ad - bc)t - 1) = (f(a)f(d) - f(b)f(c))f(t) - f(1) = (\alpha\delta - \beta\gamma)(\alpha\delta - \beta\gamma)^{-1} - 1 = 0$.

This implies that the morphism $f$ factors through the quotient algebra $GL(2)$, which eventually leads to the required equality (9D.35a) of Result (9D.20). Note that in the case of $SL(2)$, the fifth equality in (i) is simply 1, hence (ii) is trivially satisfied.

7. To establish this exercise we consider the tensor product algebra

$$R' = R \otimes K_q[X, Y] = R\{X, Y\}/(YX - qXY)$$

and note that Result (9D.17) can be rewritten in the language of $R$-points of $M_q(2)$. According to this, a quadruple $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$ of elements of an algebra $R$ is an $R$-point of $M_q(2)$ if and only if the pairs $(X', Y')$, $(X'', Y'')$ satisfying (9D.32) (with $x, y, x', y', x'', y'', a, b, c, d$ replaced by the corresponding capitals) are $R'$-points of the quantum plane. Since $A, B, C, D$ and $A', B', C', D'$ commute, the second equality in (9D.32) becomes:

$$\begin{pmatrix} X'' \\ Y'' \end{pmatrix} = \begin{pmatrix} A' & C' \\ B' & D' \end{pmatrix}\begin{pmatrix} X \\ Y \end{pmatrix}.$$

Now a second application of Result (9D.17) gives that

$$\begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix}\begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} A'' & B'' \\ C'' & D'' \end{pmatrix}\begin{pmatrix} X \\ Y \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} A & C \\ B & D \end{pmatrix}\begin{pmatrix} X'' \\ Y'' \end{pmatrix} = \begin{pmatrix} A'' & C'' \\ B'' & D'' \end{pmatrix}\begin{pmatrix} X \\ Y \end{pmatrix}$$

are $R'$-points of the quantum plane, hence $m'm$ is an $R$-point of $M_q(2)$. To prove the second part, we set $A' = D$, $B' = -qB$, $C' = -q^{-1}C$ and $D' = A$ and note that Result (9D.17) with $(a, b, c, d)$ replaced by $(A, B, C, D)$ gives the relations $A'B' = q\,B'A'$ etc. in terms of $(A', B', C', D')$, which means that $(A', B', C', D')$ is an $R$-point of $M_{q^{-1}}(2)$ or an $R^{op}$-point of $M_q(2)$. The third part, which is computational, is left for the reader.

8. Use the above exercise to show that $(\nu(a), \nu(b), \nu(c), \nu(d))$ is an $M_q(2) \otimes M_q(2)$-point of $M_q(2)$; similarly show that $(\delta(a), \delta(b), \delta(c), \delta(d))$ is a $K$-point of $M_q(2)$.

9. Note that by universality of the tensor algebra, there exist unique algebra morphisms $\eta: T(V) \to T(V) \otimes T(V)$ and $\delta: T(V) \to K$ such that their restrictions to $V$ are given by the formulas of this exercise. Now consider $n$ elements $v_1, \ldots, v_n$ in $V$, and note that formula (a) is a trivial consequence of the multiplicativity of the morphism $\delta$. Next, to compute $\eta(v_1 \cdots v_n)$, use induction on $n$. Formula (b) holds for $n = 1$ by definition. Assume that it holds up to $n - 1 \geq 1$; then writing the equality

(i) $\eta(v_1 \cdots v_n) = \eta(v_1 \cdots v_{n-1})\,\eta(v_n) = \eta(v_1 \cdots v_{n-1})(1 \otimes v_n + v_n \otimes 1)$

and substituting the value of $\eta(v_1 \cdots v_{n-1})$ from formula (b), one has the equality after using the arguments on $(p, n-1-p)$ shuffles of $S_{n-1}$. (We leave the rest of the proof for the reader.) We note, however, that cocommutativity is a consequence of the fact that the permutation

(ii) $\begin{pmatrix} 1 & 2 & \cdots & n-p & n-p+1 & \cdots & n \\ p+1 & p+2 & \cdots & n & 1 & \cdots & p \end{pmatrix}$

switches the two shuffles $(p, n-p)$ and $(n-p, p)$. Also, the coassociativity of $\eta$ results from the relation

(iii) $(d \oplus \mathrm{id}) \circ d = (\mathrm{id} \oplus d) \circ d$,

where $d$ is the diagonal map $d(v) = (v, v)$ from $V$ into $V \oplus V$.


References

1. A. O. Barut, New Frontiers in Quantum Electrodynamics and Quantum Optics (New York: Plenum Press, 1990).
2. F. A. Berezin, Method of Second Quantization (New York: Academic Press, 1966).
3. S. N. Bose, Plancks Gesetz und Lichtquantenhypothese, Zeitschrift für Physik 26 (1924), 178-181.
4. A. Boutet de Monvel, et al. (ed.), Recent Developments in Quantum Mechanics (Proc. of Brasov Conf., Boston: Kluwer Academic Publishers, 1991).
5. L. de Broglie, Heisenberg's Uncertainties and the Probabilistic Interpretation of Wave Mechanics (Boston: Kluwer Academic Publishers, 1990).
6. T. P. Cheng and L. F. Li, Gauge Theory of Elementary Particle Physics (New York: Clarendon Press, 1984).
7. A. Das, Field Theory: A Path Integral Approach (New York: World Scientific, 1993).
8. H. D. Doebner, J. D. Henning and T.-D. Palev (ed.), Infinite-dimensional Lie Algebras and Quantum Field Theory (Proc. of Varna Summer School, New York: World Scientific, 1988).
9. E. N. Economou, Green's Functions in Quantum Physics (New York: Springer-Verlag, 1979).
10. L. D. Faddeev and V. N. Popov, Feynman Diagrams for Yang-Mills Field, Phys. Lett. B25 (1967).
11. L. D. Faddeev and A. A. Slavnov, Gauge Fields: An Introduction to Quantum Theory (New York: Benjamin-Cummings Publishing Co., 1980).
12. J. S. Feldman and L. M. Rosen (ed.), Mathematical Quantum Field Theory and Related Topics, Proc. of the 1987 Montreal Conf., Amer. Math. Soc. (1988).
13. R. P. Feynman, (a) Space-time Approach to Nonrelativistic Quantum Mechanics, Rev. Mod. Phys. 20 (1948), 367-387; (b) Quantum Theory of Gravitation, Acta Phys. Polon. 24 (1963).
14. A. Galindo and P. Pascual, Quantum Mechanics I (New York: Springer-Verlag, 1990).
15. J. Glimm and A. Jaffe, Quantum Physics (2nd ed., New York: Springer-Verlag, 1986).
16. W. T. Grandy, Relativistic Quantum Mechanics of Leptons and Fields (Boston: Kluwer Academic Publishers, 1991).
17. H. F. Hameka, Quantum Mechanics (New York: John Wiley, 1981).
18. M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems and Linear Algebra (New York: Academic Press, 1974).
19. D. C. Khandekar, S. V. Lawande, K. V. Bhagwat, Path Integral Methods and their Applications (New York: World Scientific, 1993).
20. L. D. Landau, Quantum Mechanics (Non-relativistic Theory) (New York: Pergamon Press, 1977).
21. O. L. De Lange and R. E. Raab, Operator Methods in Quantum Mechanics (Oxford: Clarendon Press, 1991).
22. F. Mandl, Introduction to Quantum Field Theory (New York: Interscience Publishers Inc., 1959).
23. F. Mandl and G. Shaw, Quantum Field Theory (New York: John Wiley, 1984).
24. E. Merzbacher, Quantum Mechanics (2nd ed., New York: John Wiley, 1970).
25. V. A. Miransky, Dynamical Symmetry Breaking in Quantum Field Theories (New York: World Scientific, 1993).
26. K. Moriyasu, An Elementary Primer for Gauge Theory (New York: World Scientific, 1983).
27. O. Piguet and K. Sibold, Renormalized Supersymmetry, the Perturbation Theory of Renormalized Supersymmetric Theories in Flat Space-Time (Boston: Birkhauser, 1968).


28. M. Planck, Ueber das Gesetz der Energieverteilung im Normalspectrum, Annalen d. Physik 4 (1901), 553-563.
29. V. N. Popov, Functional Integrals in Quantum Field Theory and Statistical Physics (New York: D. Reidel Publishing Co., 1983).
30. R. M. Santilli, Foundations of Theoretical Mechanics (Vol. I and II, New York: Springer-Verlag, 1978).
31. T. Schücker, Distributions, Fourier Transforms and Some of their Applications to Physics (New York: World Scientific, 1991).
32. B. Simon, Functional Integration and Quantum Physics (New York: Academic Press, 1970).
33. H. Spohn, Large Scale Dynamics of Interacting Particles (New York: Springer-Verlag, 1991).
34. G. W. Strang, Linear Algebra and its Applications (3rd ed., Saunders, 1988).
35. V. S. Varadarajan, Geometry of Quantum Theory (2nd ed., New York: Springer-Verlag, 1985).
36. B. S. De Witt, Quantum Theory of Gravity, Phys. Rev. 160 (1967); 162 (1967).
37. T. Y. Wu, Quantum Mechanics (New York: World Scientific, 1986).
38. J. D. Bjorken and S. D. Drell, Relativistic Quantum Mechanics (New York: McGraw-Hill, 1964).
39. J. R. Waldram, The Theory of Thermodynamics (Cambridge: Cambridge University Press, 1985).

CHAPTER 10

THEORY OF YANG-MILLS AND THE YANG-MILLS-HIGGS MECHANISM

1

INTRODUCTION

This chapter is devoted to one of the most outstanding theories of our times: the theory of Yang-Mills. It is this theory which on the physical side paved the way for another important gauge theory, the electroweak theory of Glashow-Salam-Weinberg, and gave an insight into symmetry-breaking phenomena through the Higgs mechanism. On the mathematical side it led to prolific research in analysis, geometry and topology. The theory was proposed by C. N. Yang and R. L. Mills in 1954 [55] to replace the abelian gauge group $U(1)$ of Maxwell's theory by the isospin gauge group $SU(2)$. Unfortunately the massless particles predicted by the theory could not be identified with anything similar to photons, the massless carriers of the electromagnetic field. Hence for almost two decades the theory remained dormant. In 1966 Higgs [28] circumvented this difficulty by introducing a scalar field $\phi$; this field is called the Higgs field and the ensuing process the Higgs mechanism.

Soon after, in the mathematical world, the non-linear differential equations resulting from Yang-Mills theory received a new status, and the study of their solutions along with their properties became a hot field of research. In the realm of physics the theory began to be viewed with interest and with trust, in the sense that experiments were modeled using the principles of the theory (see Salam [43]). Some of the key players (mathematicians and physicists) that participated in the explosive scheme of ideas leading to a structurally sound theory were: Higgs [28], Belavin and Polyakov [15], Bogomolny [16], Prasad [40], Jackiw [30], Atiyah [2], Ward [52], Manton [34], Hitchin [10, 29], Taubes [46], 't Hooft [47], Uhlenbeck [48], Goddard [13], Bott [9], Witten [11, 54], Singer [4, 5, 12, 44] and Donaldson [22].

In the context of solutions of these non-linear differential equations (resulting from the Yang-Mills functional) new words such as "instantons" were coined, and words and phrases such as anomaly, vortices and monopoles, self-duality of curvature, exotic structures, de Rham complex, index theorems, Sobolev spaces and moduli spaces received a new meaning. Deep regularity theorems linking one discipline with another were proved (see, for instance, the work by Uhlenbeck [48], Taubes [46] and, most importantly, Donaldson [22]). It was recognized early on that differential geometry (in particular the theory of fiber bundles) provided a suitable language for the description of the theory (see Sec. 6.5 and Sec. 6.6), and therefore there appeared quite a few survey articles on the theory; some of these are [5], [9a, b], [13], [20] and [24].


Since there already exists a vast amount of literature on the subject, some of which is easily comprehensible [2a], [23] we mainly restrict ourselves to providing examples and explanations of the words mentioned above along with a brief background material. For detailed studies, the reader is advised to refer to the above articles and to the books [1], [4], [10], [23], [32], [34].

2

YANG-MILLS AND YANG-MILLS-HIGGS FUNCTIONAL

For reasons mentioned in the introduction, we begin with Yang-Mills-Higgs theory and define the terms required for understanding it.

2.1

Yang-Mills-Higgs Action in $R^n$ and $R^{n,1}$

The ingredients of the theory, called the dynamical variables by physicists, are a gauge potential (connection) $A = A_i(x)\,dx^i$ and a scalar field $\phi = \phi(x)$. The potential $A$ takes its values in the Lie algebra $\mathfrak{g}$ of the gauge group $G$¹, and $\phi(x)$ takes its values in a vector space $L$ on which $G$ acts as a transformation group. The space $L$ is referred to as the internal symmetry space of the system. The gauge potential $A$ defines a field (the curvature) on one hand,

$$F_A = dA + A \wedge A = \tfrac{1}{2} F_{ij}\,dx^i \wedge dx^j = \tfrac{1}{2}\big(\partial_i A_j(x) - \partial_j A_i(x) + [A_i(x), A_j(x)]\big)\,dx^i \wedge dx^j, \qquad (10.2.1)$$

and gives the covariant derivative of $\phi$ on the other,

$$D_A\phi = (\nabla_A)_i(\phi)\,dx^i = \big(\nabla_i\phi + \rho(A_i)\phi\big)\,dx^i. \qquad (10.2.2a)$$

The $\rho$ in the above equation stands for the linear representation of the Lie algebra $\mathfrak{g}$ on $L$; evidently this representation is induced by that of $G$ on $L$. The covariant derivative (10.2.2a) couples $\phi$ to the connection $A$. We note that the gauge potential $A$ also defines the exterior covariant derivative of arbitrary $p$-forms $\omega$. Thus, if $\omega$ is an $L$-valued $p$-form:

$$D_A\omega := d\omega + \rho(A) \wedge \omega, \qquad (10.2.2b)$$

and when it is a $\mathfrak{g}$-valued $p$-form it is:

$$D_A\omega := d\omega + A \wedge \omega - (-1)^p\,\omega \wedge A. \qquad (10.2.2c)$$

It is easy to note that when $\rho$ is the adjoint representation acting on $L = \mathfrak{g}$, equations (10.2.2b) and (10.2.2c) agree with each other. (In Exc. 5 we shall see that all the equations (10.2.2) are covariant under gauge transformations.)

1

The Lie group $G$, being a transformation group on the vector space $L$, is a matrix Lie group here; it can always be replaced by a general (compact) Lie group $G$ to write the YMH functional. In that case the definition of $F_A$, where $\wedge$ stands for matrix multiplication, has to be changed, as the use of $\wedge$ in $A \wedge A$ in (10.2.1) is valid only in the case of matrix Lie groups. We note that the definition of $F_A$ in indicial notation is valid in all cases.


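The Euclidean Yang-Mills-Higgs action can now be written as:

$$\mathcal{A}_{YMH}(A, \phi) = \frac{1}{2}\int_{R^n}\Big\{(F_A, F_A) + (D_A\phi, D_A\phi) + \frac{\lambda}{4}\big(|\phi|^2 - 1\big)^2\Big\} \equiv \int_{R^n}\mathfrak{a}; \qquad (10.2.3)$$

the notation used here is mostly that of [28]. The third term in (10.2.3) represents the Higgs self-interaction, where $\lambda > 0$ is a constant. In place of $R^n$, if we use the Minkowskian space of dimension $(n + 1)$, with $x^0$ denoting the time coordinate, the action density becomes:

$$\mathfrak{a}^M = \frac{1}{2}\Big\{\tfrac{1}{2}(F_{ij}, F_{ij}) - (F_{0i}, F_{0i}) + \big((\nabla_A)_i\phi, (\nabla_A)_i\phi\big) - \big((\nabla_A)_0\phi, (\nabla_A)_0\phi\big) + \frac{\lambda}{4}\big(|\phi|^2 - 1\big)^2\Big\} \qquad (10.2.4)$$

and accordingly the action is:

$$\mathcal{A}^M_{YMH} = \int_{R^{n,1}}\mathfrak{a}^M. \qquad (10.2.5a)$$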

The field configuration $(A, \phi)$ is called 'static' if $A$ and

(10.2.5b)

exists and equals (10.2.3). The action $\mathcal{A}_{YMH}$ is then called the energy of the static configuration.

2.2

The Variational Equations and Solutions

Just as we had variational equations for the pure Yang-Mills action (see Eqs. 6.7.18-21), the variational equations for the Yang-Mills-Higgs action (10.2.3) (denoted YMH) are

$$D_A * F = * J, \qquad (10.2.6a)$$

and

$$\nabla_A^2\,\phi = \frac{\lambda}{2}\,\phi\,\big(|\phi|^2 - 1\big). \qquad (10.2.6b)$$

When $G$ is arbitrary, $L = \mathfrak{g}$ and $\rho$ is the adjoint representation, the current $J$ is:

$$J = -[\phi, D_A\phi]. \qquad (10.2.7)$$

We note that one is usually interested in 'finite action' solutions of the above variational equations. These are indeed the time-independent finite energy solutions to the variational equations coming from the action density (10.2.4) on the Minkowskian space $R^{n,1}$. We call these solutions solitons.

2.3

Instantons, Vortices and Monopoles

In the case of the Euclidean pure Yang-Mills equations (i.e., $\lambda = 0$ and $\phi = 0$) when $n = 4$, these solitons are called instantons. Now instantons have the property that their curvatures are self-dual or anti-self-dual; i.e. $F_A$ satisfies:

$$*F_A = F_A \quad\text{or}\quad *F_A = -F_A. \qquad (10.2.8)$$


In view of the definitions (10.2.2a) of $D_A$ and (10.2.1) of $F_A$, as well as the fact that $d^2 = 0$, it follows that $F_A$ satisfies the Bianchi identity

$$D_A F_A = 0, \qquad (10.2.9)$$

therefore from (10.2.8) we have the Yang-Mills equation:

$$D_A * F_A = 0. \qquad (10.2.10)$$

Hence every curvature that satisfies the self-duality or anti-self-duality condition (10.2.8) is a critical point (see Definition (0.5.3)) of the $n = 4$ Yang-Mills action (functional):

$$\mathcal{A}_{YM} = \frac{1}{2}\int_{R^4}(F_A, F_A).$$

Recall that we obtained the YM equations in Sec. (6.7) (Eq. (6.7.20)) while studying gauge theories from the bundle-theoretic point of view. We now list a few facts about the solutions of the pure Yang-Mills and Yang-Mills-Higgs equations.

Fact 10.2.1 When the field $A$ is considered as a connection on a principal bundle and it is assumed that it approaches the flat connection sufficiently rapidly as $|x| \to \infty$, then the integral

$$\frac{1}{8\pi^2}\int \mathrm{Tr}\,(F \wedge F) = N \qquad (10.2.11)$$

is an integer. A sufficient condition for $A$ to satisfy the above asymptotic behaviour is that the connection $A$ be the pullback of a connection on $S^4$ via stereographic projection; for in this case, the above integral is the second Chern number (see 10A.20). The Chern numbers are known to be invariant under pullbacks. (See Chapter 5 in [35].)

Fact 10.2.2 All instanton solutions have an associated integer $N$. For a fixed $N$, the solution manifold has a well defined dimensionality which is determined by the group $G$ and the integer $N$. For instance, when $G = SU(2)$, there is an $8|N| - 3$ parameter family of solutions.

Fact 10.2.3 The instanton solutions minimize the functional $\mathcal{A}_{YM}$ as long as it is restricted to connections for which $N$ is a fixed integer. Also every local minimum of $\mathcal{A}_{YM}$ is an instanton.

Fact 10.2.4 For dimensions $n < 4$, there are no finite action solutions to the pure Yang-Mills equations (10.2.10). However, finite action solutions exist for the YMH equations (10.2.6). In the case of $n = 2$ these are called vortices, the term vortex coming from superconductivity as we shall see in Sec. 5, and when $n = 3$ they are known as monopoles. The features of these Higgs models are qualitatively similar to those of the Yang-Mills theory in 4 dimensions.

Fact 10.2.5 For $n > 4$, there are no finite action solutions to (10.2.6).

Fact 10.2.6 For $n = 4$, the only finite action solutions to (10.2.6) are those that are gauge equivalent to pure YM solutions.

We shall prove a few of these facts later. We now illustrate some of the above theory by means of examples and exercises that deal with instantons, vortices and monopoles.

2.4

An Example on Instantons

Example 10.2.7 The instanton solution (finite action solution) of Euclidean Yang-Mills theory is a connection on a principal bundle with $M = S^4$ as the base space and $G = SU(2) = S^3$ as the fiber. We obtain below an explicit expression for it (see also Exc. 2 of this section). For this we take the metric on $S^4$ with radius $a/2$ as:

$$ds^2 = \frac{dr^2 + r^2(\sigma_x^2 + \sigma_y^2 + \sigma_z^2)}{(1 + r^2/a^2)^2} = \sum_{\mu=0}^{3}(e^\mu)^2 = \frac{dx^\mu\,dx^\mu}{(1 + r^2/a^2)^2}, \qquad (10.2.12)$$

which is achieved by considering the projection from the north or south pole onto $R^4$. (See Exp. (0.5.1) and (0.5.2) and Exc. (3) for the notations used.) As in the Hint to Exc. 2, we split $S^4$ into hemispheres $H_+$ and $H_-$, and in the overlap region $H_+ \cap H_- \simeq S^3$ define the transition functions $h(x^\mu)$ that relate the fibers $g_+$, $g_-$ coming from these hemispheres thus²:

$$g_- = [h(x)]^k\,g_+. \qquad (10.2.13)$$

In the hint to Exc. 3 we have seen that $h(x^\mu) = (t\,\mathbf{1} - i\,\tau\cdot x)/r$ satisfies:

$$h^{-1}dh = i\tau_k\sigma_k \quad\text{and}\quad (dh)\,h^{-1} = -i\tau_k\bar\sigma_k. \qquad (10.2.14)$$

(We used the summation index $l$ there.) Also, in view of Exp. (0.5.2) the connection 1-forms in the two neighborhoods $H_+$, $H_-$ can be written as:

$$\omega = g_+^{-1}A\,g_+ + g_+^{-1}dg_+ \ \text{ on } H_+, \qquad \omega = g_-^{-1}A'\,g_- + g_-^{-1}dg_- \ \text{ on } H_-, \qquad (10.2.15)$$

where

$$A'(x) = (h(x))^k A(x)\,(h(x))^{-k} + (h(x))^k\,d(h(x))^{-k}. \qquad (10.2.16)$$

When $k = 1$ we have the single instanton solution:

$$H_+:\ A = \frac{r^2}{r^2 + a^2}\,h^{-1}dh = \frac{r^2}{r^2 + a^2}\,i\tau_k\sigma_k,$$
$$H_-:\ A' = h\Big[\frac{r^2}{r^2 + a^2}\,h^{-1}dh\Big]h^{-1} + h\,dh^{-1} = \frac{r^2}{r^2 + a^2}\,(dh)h^{-1} + \big(-(dh)h^{-1}\big), \qquad (10.2.17)$$

where we have used $d(h\,h^{-1}) = (dh)\,h^{-1} + h\,dh^{-1} = 0$. This gives:

$$A' = -\frac{(dh)\,h^{-1}}{1 + r^2/a^2} = \frac{i\tau_k\bar\sigma_k}{1 + r^2/a^2}. \qquad (10.2.18)$$

Note that while $A$ is well defined throughout $H_+$, it is singular at the south pole at $r = \infty$. Similarly $A'$, which is well defined in $H_-$, is singular at the north pole at $r = 0$.

2

The $k$ in (10.2.13) and (10.2.16) is an integer which corresponds to the second Chern class and labels the equivalence classes of instanton bundles; see Fact (10.2.2).


The solutions $A$ and $A'$ are the (Yang-Mills) analogues of the two gauge-equivalent Dirac monopole solutions with Dirac strings in the upper and lower hemispheres of $S^2$, as shown in the Hint to Exc. 1. The corresponding field strengths in $H_+$ and $H_-$ are given as:

$$H_+:\ F_+ = dA + A \wedge A, \qquad H_-:\ F_- = dA' + A' \wedge A'. \qquad (10.2.19)$$

Using the computations of Exp. (0.5.2) and the equality (10.2.16), these are seen to be:

$$F_+ = i\tau_k\,\frac{2}{a^2}\Big(e^0 \wedge e^k + \tfrac{1}{2}\epsilon_{kij}\,e^i \wedge e^j\Big), \qquad F_- = h\,F_+\,h^{-1}. \qquad (10.2.20)$$

Now $F$ is self-dual, i.e., $*F = F$; therefore in view of the Bianchi identity $D_A F = 0$ we have the Yang-Mills equations (10.2.10):

$$D_A * F = d * F + A \wedge *F - *F \wedge A = 0. \qquad (10.2.21)$$

Hence we have established that $A$ is the single-instanton solution of the equation. We emphasize that, while looking for a solution to the Euclidean Yang-Mills equation, we are in search of a gauge potential which is regular almost everywhere; the potential $A$ described above fulfills that condition. Having obtained the solutions $A$ and $A'$ in index-free notation, we write them down locally to explain things further. Recall that $A = (A_\mu)$, and since $A$ is $su(2)$-valued we can write

$$A_\mu = A_\mu^a\,\tau_a/2 \quad (a = 1, 2, 3) \quad\text{and}\quad F_{\mu\nu} = F_{\mu\nu}^a\,\tau_a/2 = [(\partial_\nu A_\mu - \cdots \qquad (10.2.22)$$

(See Sec. 6.6.) In view of (10.2.17), when $r \to \infty$ the components $A_\mu \to h^{-1}\partial_\mu h$, which shows that locally it is a pure gauge (i.e. it no longer depends on the compactified manifold). From (10.2.22) the components $F_{\mu\nu}$ are seen to vanish. Thus the singular point of $A$ is characterized by zero field strength. For $r = 0$, $A_\mu = 0$ and $F_{\mu\nu} = 0$. Hence in both cases $A_\mu$, whether it is trivially zero or asymptotically pure gauge, gives rise to a zero field strength and thus to a vacuum state.

2.5

An Example on Vortices

In the following example we again use a coordinate set-up to obtain the vortex solutions of a complex scalar field $\phi$ interacting with an electromagnetic field. The example illustrates the Yang-Mills-Higgs system in the case $n = 2$.

Example 10.2.8 The field $\phi(x)$ (with complex conjugate $\phi^*$) can be viewed as though it was a Higgs particle with mass $m$. The Yang-Mills field $A = (A_\mu)$ is replaced by the electromagnetic field, and the gauge group $G$, in place of $SU(2)$, is now $U(1)$. As a result, the connection and curvature are respectively $-iA$ and $-iF_A = -i\,dA$, with $A$ real (see also Secs. (6.6) and (6.7)). $F$ is the field strength tensor.

The covariant derivative $D_A\phi$ given in (10.2.2a) and the curvature $F_A$ in (10.2.1) can now be written as³:

$$D_A\phi = (\partial_\mu + ieA_\mu)\phi\,dx^\mu \equiv D_\mu\phi\,dx^\mu, \qquad (10.2.23a)$$
$$F_{\mu\nu} = \partial_\nu A_\mu - \partial_\mu A_\nu \quad (\mu, \nu = 1, 2), \qquad (10.2.23b)$$

whereas the Lagrangian for the interaction can be expressed as:

$$L = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + (D_\mu\phi)^*(D^\mu\phi) - \frac{\lambda}{2}\Big(\phi^*\phi - \frac{m^2}{\lambda}\Big)^2. \qquad (10.2.24)$$

(Note the similarity between this and (10.2.3).) The variational equations for the action $\int d^2x\,L$ are:

$$\partial_\nu F^{\mu\nu} = e\,j^\mu, \qquad (10.2.25a)$$
$$D_\mu D^\mu\phi = -\lambda\Big(\phi^*\phi - \frac{m^2}{\lambda}\Big)\phi, \qquad (10.2.25b)$$

where the current $J = (j^\mu)$ stands for:

$$j^\mu = i\big(\phi^* D^\mu\phi - \phi\,(D^\mu\phi)^*\big) = i\big(\phi^*\partial^\mu\phi - \phi\,\partial^\mu\phi^*\big) - 2eA^\mu\phi^*\phi. \qquad (10.2.26)$$

The similarity between (10.2.25) and (10.2.6), and between (10.2.26) and (10.2.7), is obvious. We also note that the equations (10.2.25) are gauge invariant, as can be seen by using the phase transformation $\phi \to e^{i\eta}\phi$, $\phi^* \to e^{-i\eta}\phi^*$ given by the group $U(1)$.

As mentioned earlier, the solutions $(A, \phi)$, known as vortex solutions of (10.2.25), result from superconductivity phenomena. We elaborate this point below for this example, beginning not with the gauge potential $A$ but with an arbitrary field $B$. Thus let $B$ denote an external magnetic field. If the strength of $B$ is less than a (fixed) critical value $H_0$, then $B$ is not able to penetrate inside the superconductor.⁴ If, however, this strength is $> H_0$, the field can go through a kind of hole in a superconductor of type II, and there is a magnetic flux across the superconductor. We shall see that this magnetic flux can be quantized, and the pair $(A, \phi)$ can be determined by making suitable choices. For this purpose we assume that the magnetic field $B$ is in the $z$-direction. We denote by $C$ a disc in the $xy$-plane with the origin as its centre, and assume that over the circumference $\partial C$ the current $J$ is zero. From (10.2.26) (using the vector notation), we thus have:

$$A = -\frac{i}{2e\,\phi^*\phi}\big(\phi^*\nabla\phi - \phi\,\nabla\phi^*\big). \qquad (10.2.27)$$

In order to write the magnetic flux across the disc $C$, we use polar coordinates to express the complex scalar field as:

3

The electric charge e is the field coupling constant which is related to the particle electric charge ep as: e = — ep.

4

When the field strength is < Ho, it is called the Meissner effect. (See Sec. 5)


$$\phi = |\phi(r)|\,e^{i\chi(\theta)}, \qquad (10.2.28)$$

where $r^2 = x^2 + y^2$. The circle $\partial C$ is described by $(r, \theta)$ with $\tan\theta = y/x$, and $\chi(\theta)$ stands for the phase of $\phi$ (in other words, $\chi$ is the phase, which depends on $\theta$). The flux in $(x, y)$ coordinates,

$$\int_C B\,dx\,dy = \oint_{\partial C} A \cdot dx, \qquad (10.2.29a)$$

over $C$ of radius $a$ can be written in terms of polar coordinates as (see Exc. 4):

$$\text{Flux} = \frac{1}{e}\int_0^{2\pi}\chi'(\theta)\,d\theta. \qquad (10.2.29b)$$

Since $\phi$ has to be single-valued throughout the region, the flux has to be quantized. Thus writing $a = 1$ we have:

$$\text{Flux} = \frac{1}{e}\big[\chi(2\pi) - \chi(0)\big] = \frac{2\pi N}{e}, \quad N = 0, \pm 1, \pm 2, \ldots \qquad (10.2.30)$$

The integer $N$ results from the map of the circle in the $xy$-plane onto the circle described by the phase $\chi$. It is called the winding number of the mapping, or the vortex number associated with the functional $\mathcal{A}_{YMH}$ when the field $\phi$ satisfies certain conditions summarized in the following remark.

Remark (10.2.9) Consider the field energy $H$ per unit length:

$$H = \int dx\,dy\,\Big\{\tfrac{1}{2}B^2 + (D\phi)^* \cdot (D\phi) + \frac{\lambda}{2}\Big(\phi^*\phi - \frac{m^2}{\lambda}\Big)^2\Big\}, \qquad (10.2.31)$$

which is $< \infty$. Then as $r \to \infty$, a smooth solution $(A, \phi)$ has the properties (i) $\phi \to$ its vacuum value $e^{i\chi(\theta)}\,m/\sqrt{\lambda}$, (ii) $|F_A| \to 0$, and (iii) $|D_A\phi| \to 0$. Also, since $B \to 0$ asymptotically, $A$ tends to a purely gauge form $A \to \nabla\alpha$, where $\alpha$ is a scalar on $R^3$. Thus the required solution $(A, \phi)$ behaves asymptotically as

$$\phi \to \frac{m}{\sqrt{\lambda}}\,e^{i\chi(\theta)}, \qquad A \to \frac{1}{e}\nabla\chi. \qquad (10.2.32)$$

Finally we emphasize that the asymptotic solutions $(A, \phi)$ given in (10.2.32) are characterized by the mapping $\theta \to e^{i\chi(\theta)}$ from the circle at infinity in the $xy$-plane to the unit circle in the complex plane. The "winding number" of this map is the integer

$$\frac{\chi(2\pi) - \chi(0)}{2\pi}.$$

Hence the flux (10.2.30) is proportional to the winding number of the phase of the Higgs field.
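The winding number of the phase map can also be estimated numerically, which makes the quantization in (10.2.30) concrete. The sketch below is an illustration only and is not taken from the text: the helper names `sample_loop`, `winding_number` and `flux`, the sample count, and the example phases are all assumptions. It samples $e^{i\chi(\theta)}$ on the circle, adds up the small phase increments, and converts the resulting integer $N$ into the flux $2\pi N/e$.

```python
import cmath
import math


def sample_loop(chi, samples=400):
    """Sample the map theta -> exp(i*chi(theta)) at equally spaced angles."""
    return [cmath.exp(1j * chi(2 * math.pi * k / samples)) for k in range(samples)]


def winding_number(loop):
    """Winding number of a closed loop of unit complex numbers.

    Each step contributes the phase of the ratio of neighbouring samples,
    which lies in (-pi, pi]; the loop must be sampled finely enough that
    no single step exceeds half a turn.
    """
    total = 0.0
    n = len(loop)
    for k in range(n):
        total += cmath.phase(loop[(k + 1) % n] / loop[k])
    return round(total / (2 * math.pi))


def flux(n, e):
    """Quantized flux 2*pi*N/e of (10.2.30) for winding number n and coupling e."""
    return 2 * math.pi * n / e


if __name__ == "__main__":
    n = winding_number(sample_loop(lambda th: 3 * th))
    print(n, flux(n, e=1.0))
```

A phase such as $\chi(\theta) = -2\theta + 0.5\sin\theta$ has the same winding number as $-2\theta$, since only the net change $\chi(2\pi) - \chi(0)$ enters.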


Exercise 10.2

1. Describe Dirac's 'magnetic monopole' and explain the meaning of Dirac strings. Show that in a principal bundle context this gives rise to the so-called "magnetic monopole bundle."

2. What is an 'instanton bundle'? Obtain its mathematical expression.

3. Show that the transition functions $h(x^\mu) = (t\,\mathbf{1} - i\,\tau\cdot x)/r$ satisfy:

$$h^{-1}dh = i\tau_k\sigma_k = i\tau_k\,\eta_{k\mu\nu}\,x^\mu dx^\nu/r^2, \qquad (dh)\,h^{-1} = -i\tau_k\bar\sigma_k = -i\tau_k\,\bar\eta_{k\mu\nu}\,x^\mu dx^\nu/r^2,$$

where $\eta$ and $\bar\eta$ are 't Hooft's eta tensors and $\sigma_k$ are left-invariant 1-forms on $SU(2)$ (see Exp. (0.5.1) for $\sigma_k$).

4. Derive (10.2.29b) using polar coordinates and the value of $A$ given in (10.2.27).

5. Show that the derivative operator $D_A$ is covariant under a gauge transformation and hence the action $\mathcal{A}_{YMH}$ is an invariant of the gauge group $G$.

Hints to Exercise 10.2

1. Dirac used the potentials $A_+$, $A_-$ defined below on $R^3$:

(i) $A_\pm = \dfrac{1}{2r}\,\dfrac{1}{z \pm r}\,(x\,dy - y\,dx)$,

where $r^2 = x^2 + y^2 + z^2$, in order to describe a magnetic charge. As can be seen, this definition of $A_\pm$ leads to the appearance of so-called "string singularities" on the $\pm z$-axis. These are referred to as Dirac strings. In modern terminology this difficulty is overcome by using coordinate patches $U_\pm$ covering the regions $z > -\epsilon$ and $z < +\epsilon$ of $R^3 - \{0\}$. The overlap region $U_+ \cap U_-$ is naturally the $xy$-plane minus the origin at $z = 0$. Using the spherical coordinates $(r\cos\phi\sin\theta,\ r\sin\phi\sin\theta,\ r\cos\theta)$ we can write $A_\pm$ as:

(ii) $A_\pm = \tfrac{1}{2}(\pm 1 - \cos\theta)\,d\phi$.

Since $d\phi$ is singular when $\theta = 0$ or $\theta = \pi$, the gauge potential $A_+$ is singular at $\theta = \pi$ (and smooth at $\theta = 0$, as it vanishes there). Similarly $A_-$ is singular at $\theta = 0$ and is smooth at $\theta = \pi$. The gauge transformation that relates $A_+$ and $A_-$ is:

(iii) $A_+ - A_- = \big\{\tfrac{1}{2}(1 - \cos\theta) - \tfrac{1}{2}(-1 - \cos\theta)\big\}\,d\phi = d\phi$.

This can be written as:

(iv) $A_+ = A_- + d\tan^{-1}\Big(\dfrac{y}{x}\Big)$.
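The gauge relation (iv) can be spot-checked numerically. Both $A_\pm$ in (i) are multiples of the 1-form $x\,dy - y\,dx$, and so is $d\tan^{-1}(y/x) = (x\,dy - y\,dx)/(x^2 + y^2)$, so it suffices to compare scalar coefficients. The sketch below is a check under that observation only; the function names are illustrative assumptions, not notation from the text.

```python
import math


def dirac_coefficient(sign, x, y, z):
    """Coefficient c in A_± = c * (x dy - y dx), read off from (i).

    sign = +1 selects A_+, sign = -1 selects A_-.
    """
    r = math.sqrt(x * x + y * y + z * z)
    return 1.0 / (2.0 * r * (z + sign * r))


def dphi_coefficient(x, y):
    """Coefficient of (x dy - y dx) in d(arctan(y/x))."""
    return 1.0 / (x * x + y * y)


def gauge_mismatch(x, y, z):
    """|coefficient of (A_+ - A_-) minus coefficient of d(phi)| at a point off the z-axis."""
    diff = dirac_coefficient(+1, x, y, z) - dirac_coefficient(-1, x, y, z)
    return abs(diff - dphi_coefficient(x, y))


if __name__ == "__main__":
    print(gauge_mismatch(0.3, -1.2, 0.7))
```

The mismatch vanishes up to rounding at every point away from the $z$-axis, where one of the two potentials is singular.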


From (ii) it is evident that both $A_+$ and $A_-$ are regular in the overlap region $\theta = \pi/2$, $r > 0$. The field (Dirac's monopole field) is given by:

(v) $F = dA_\pm$

in $U_\pm$. Using the description of $A_\pm$ given in $(x, y, z)$ coordinates we can write $F$ as:

(vi) $F = d\big(g_1(x, y, z)\,dy - g_2(x, y, z)\,dx\big)$,

where $g_1(x, y, z) = \dfrac{x}{2r(z \pm r)}$ and $g_2(x, y, z) = \dfrac{y}{2r(z \pm r)}$. Thus

$$F = \Big(\frac{\partial g_1}{\partial x} + \frac{\partial g_2}{\partial y}\Big)dx \wedge dy - \frac{\partial g_1}{\partial z}\,dy \wedge dz - \frac{\partial g_2}{\partial z}\,dz \wedge dx.$$

This can be simplified to:

(vii) $F = \dfrac{1}{2r^3}\,(x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy)$.

The above description can be given a bundle-theoretic construction as follows. In place of $R^3$ we now let $M = S^2$ with coordinates $(\theta, \phi)$; the two hemispheres $H_\pm$ of $S^2$, together with the fiber $U(1)$, give the bundle locally as:

(viii) $H_+ \times U(1)$ with coordinates $(\theta, \phi;\ e^{i\psi_+})$, and $H_- \times U(1)$ with coordinates $(\theta, \phi;\ e^{i\psi_-})$.

The transition functions (see Sec. 2.5) along $H_+ \cap H_-$ must be functions of $\phi$, so the fiber coordinates of $H_+$, $H_-$ are related by the equality:

(ix) $e^{i\psi_-} = e^{in\phi}\,e^{i\psi_+}$.

The $n$ in the above equality is an integer, for without $n$ being an integer there will be no manifold structure, and thus no principal bundle. The above description is a topological version of the Dirac monopole quantization condition. For $n = 0$ we have a trivial principal bundle $P = S^2 \times S^1$, and for $n = 1$ we have the well known Hopf fibering of $P = S^3$ (see Exercise 11 in Sec. 2.5, and Secs. 6.6 and 6.7 for examples and theory).

2. The principal bundle that corresponds to the Yang-Mills instanton (finite-action solution to the Yang-Mills equations) is called the 'instanton bundle' (see also Def. 10.3.7). From the discussion in this section, we know that these are defined on $R^4$ with gauge group $SU(2)$; accordingly the principal bundle in question has the compactified Euclidean space-time $S^4$ as its base space $M$, and the group $SU(2) = S^3$ as the fiber $F$ (see Sec. 2.5). We choose the coordinates $(\theta, \phi, \psi, r;\ 0 \le \theta < \pi,\ 0 \le \phi < 2\pi,\ 0 \le \psi < 4\pi)$ on $M = S^4$ and $(\alpha, \beta, \gamma)$ on $F = SU(2)$. We can split $S^4$ into coordinate neighbourhoods $H_+$ and $H_-$, having the common boundary $H_+ \cap H_- = S^3$, the 3-sphere. This boundary can be parametrized by the Euler angles $(\theta, \phi, \psi)$ of $S^3$. Thus a representation $h(\theta, \phi, \psi)$ of $SU(2)$ is given by:

(i) $h = \dfrac{t\,\mathbf{1} - i\,\tau\cdot x}{r}$, with
$$x = r\cos\frac{\theta}{2}\cos\Big(\frac{\psi + \phi}{2}\Big), \quad y = r\cos\frac{\theta}{2}\sin\Big(\frac{\psi + \phi}{2}\Big), \quad z = r\sin\frac{\theta}{2}\cos\Big(\frac{\psi - \phi}{2}\Big), \quad t = r\sin\frac{\theta}{2}\sin\Big(\frac{\psi - \phi}{2}\Big),$$

where $\tau = (\tau_1, \tau_2, \tau_3)$ are the Pauli matrices and $(x, y, z, t) \in R^4$. The fiber coordinates, on the other hand, are given by $SU(2)$ matrices $g(\alpha, \beta, \gamma)$, say, where $\alpha, \beta, \gamma$ are group Euler angles. Thus the local bundle patches with their respective coordinates are:

(ii) $H_+ \times SU(2)$ with coordinates $(\theta, \phi, \psi, r;\ \alpha_+, \beta_+, \gamma_+)$, and $H_- \times SU(2)$ with coordinates $(\theta, \phi, \psi, r;\ \alpha_-, \beta_-, \gamma_-)$.

The transition functions from the $SU(2)$ fibers $g(\alpha_+, \beta_+, \gamma_+)$ to the fibers $g(\alpha_-, \beta_-, \gamma_-)$ in $H_+ \cap H_-$ are obtained using the $SU(2)$ matrix $h(\theta, \phi, \psi)$:

(iii) $g(\alpha_-, \beta_-, \gamma_-) = h^k(\theta, \phi, \psi)\,g(\alpha_+, \beta_+, \gamma_+)$.

Just as in Exc. 1, where $n$ had to be an integer in (ix), we must have $k$ an integer to obtain a well-defined manifold structure. For $k = 1$ we get the Hopf fibering of $S^7$. This bundle results from the 'single-instanton' solution of Belavin, et al. [15].

3. To prove the result, we take the computational route through a coordinate chart. Thus $h(x^\mu) = h(x, y, z, t)$ can be expressed using the Pauli matrices

$$\tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

as

(i) $h = \dfrac{1}{r}\begin{pmatrix} t - iz & -y - ix \\ y - ix & t + iz \end{pmatrix}$.

It can be easily verified that $h^{-1}$ equals:

(ii) $h^{-1} = \dfrac{1}{r}\begin{pmatrix} t + iz & y + ix \\ -y + ix & t - iz \end{pmatrix} = \dfrac{1}{r}\,(t\,\mathbf{1} + i\,\tau\cdot x)$.

Therefore,

(iii) $h^{-1}dh = \dfrac{1}{r^2}\begin{pmatrix} t + iz & y + ix \\ -y + ix & t - iz \end{pmatrix}\begin{pmatrix} dt - i\,dz & -(dy + i\,dx) \\ dy - i\,dx & dt + i\,dz \end{pmatrix} - \dfrac{dr}{r}\,\mathbf{1}$,

where we have written:

(iv) $dh = \dfrac{1}{r}\,d(t\,\mathbf{1} - i\,\tau\cdot x) - (t\,\mathbf{1} - i\,\tau\cdot x)\,\dfrac{dr}{r^2}$,

582 Mathematical Perspectives on Theoretical Physics

and have thus simplified the second term of h~x dh to:

r

r

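From Exp. (0.5.1), recall that we had the 1-forms

$$\begin{aligned} \frac{dr}{r} &= \frac{1}{r^2}\,(x\,dx + y\,dy + z\,dz + t\,dt), & \sigma_x &= \frac{1}{r^2}\,(-t\,dx - z\,dy + y\,dz + x\,dt),\\ \sigma_y &= \frac{1}{r^2}\,(z\,dx - t\,dy - x\,dz + y\,dt), & \sigma_z &= \frac{1}{r^2}\,(-y\,dx + x\,dy - t\,dz + z\,dt), \end{aligned} \tag{v}$$

that defined the connection on $\mathbb{R}^4$. In this case, we note that simplification of (iii) in view of (v) yields

$$h^{-1}dh = \begin{pmatrix} \dfrac{dr}{r} + i\sigma_z & \sigma_y + i\sigma_x \\ -\sigma_y + i\sigma_x & \dfrac{dr}{r} - i\sigma_z \end{pmatrix} - \begin{pmatrix} \dfrac{dr}{r} & 0 \\ 0 & \dfrac{dr}{r} \end{pmatrix} = i\begin{pmatrix} \sigma_z & \sigma_x - i\sigma_y \\ \sigma_x + i\sigma_y & -\sigma_z \end{pmatrix}. \tag{vi}$$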

To write the 1-forms $\sigma_k$ (i.e. $\sigma_x, \sigma_y, \sigma_z$) in terms of Eta-tensors, we first define these tensors. As usual, $\mu, \nu$ range from 0 to 3 and $i, j, k$ from 1 to 3. The 't Hooft matrices, known as Eta-tensors, are by definition:

$$\eta_{i\mu\nu}:\quad \eta_{ijk} = \epsilon_{ijk}, \qquad \eta_{ij0} = \delta_{ij}, \qquad \eta_{i\mu\nu} = -\eta_{i\nu\mu}, \qquad \bar\eta_{i\mu\nu} = (-1)^{\delta_{\mu 0} + \delta_{\nu 0}}\,\eta_{i\mu\nu}, \tag{vii}$$

where $\epsilon_{ijk}$ is anti-symmetric in all its indices and vanishes for repeated indices. To show that $h^{-1}dh = i\tau_l\sigma_l = i\tau_l\,\eta_{l\mu\nu}\,x^\mu dx^\nu / r^2$, we choose $l = 1$ and thus illustrate it for $\sigma_x$ by using the expression in (v). Note that $\sigma_x$ in terms of $\eta_{i\mu\nu}$ becomes:

$$\sigma_x = \frac{1}{r^2}\left(\eta_{101}\,x^0 dx^1 + \eta_{132}\,x^3 dx^2 + \eta_{123}\,x^2 dx^3 + \eta_{110}\,x^1 dx^0\right). \tag{viii}$$

In essence, the $+$ and $-$ signs in (v) have been replaced by $\eta$'s using the relations in (vii), and $t, x, y, z$ have been changed to $x^0, x^1, x^2, x^3$. The other two forms, $\sigma_y$ and $\sigma_z$, can likewise be expressed in terms of $\eta$'s. Hence in view of (vi) we have the required result.
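The bookkeeping in (vii) and (viii) can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names `levi_civita`, `eta` and `nonzero_terms` are illustrative choices, and the index convention $(x^0, x^1, x^2, x^3) = (t, x, y, z)$ is the one stated above. It tabulates the coefficient of $x^\mu dx^\nu$ in $r^2\sigma_i = \eta_{i\mu\nu}x^\mu dx^\nu$ so that it can be compared with the signs read off from (v).

```python
def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (1, 2, 3); 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2


def eta(i, mu, nu):
    """'t Hooft symbol eta_{i mu nu} as in (vii); i in 1..3, mu and nu in 0..3."""
    if mu == 0 and nu == 0:
        return 0
    if nu == 0:
        return 1 if i == mu else 0        # eta_{ij0} = delta_{ij}
    if mu == 0:
        return -eta(i, nu, 0)             # antisymmetry in (mu, nu)
    return levi_civita(i, mu, nu)         # eta_{ijk} = epsilon_{ijk}


def nonzero_terms(i):
    """Map (mu, nu) -> coefficient of x^mu dx^nu in r^2 * sigma_i."""
    return {(mu, nu): eta(i, mu, nu)
            for mu in range(4) for nu in range(4)
            if eta(i, mu, nu) != 0}


# r^2 sigma_x = -t dx - z dy + y dz + x dt, read off from (v)
print(nonzero_terms(1))
```

For $i = 1$ the four non-vanishing entries are exactly the four terms of (viii); the same call with $i = 2, 3$ gives the analogous expansions of $\sigma_y$ and $\sigma_z$.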


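To prove that $(dh)h^{-1} = -i\tau_l\bar\sigma_l = -i\tau_l\,\bar\eta_{l\mu\nu}\,x^\mu dx^\nu/r^2$, we repeat the above process for the forms $(\bar\sigma_x, \bar\sigma_y, \bar\sigma_z)$. These forms are defined as:

$$\bar\sigma_x = (y\,dz - z\,dy - x\,dt + t\,dx)/r^2, \quad \bar\sigma_y = (z\,dx + t\,dy - x\,dz - y\,dt)/r^2, \quad \bar\sigma_z = (-y\,dx + x\,dy + t\,dz - z\,dt)/r^2.$$

They obey the same cyclic relation, $d\bar\sigma_x = 2\,\bar\sigma_y \wedge \bar\sigma_z$, as we had for $(\sigma_x, \sigma_y, \sigma_z)$.

4. For polar coordinates the gradient is $\nabla = \left(\dfrac{\partial}{\partial r}, \dfrac{1}{r}\dfrac{\partial}{\partial\theta}\right)$. On the circumference of the circle, since $r$ is a fixed constant $a$, we have: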

0) A = | - J / i g ^ ( ( / ( r ) ) 2 e^ r(-«) z'(0)^ (6) -(/(r)2 e~*™ irX'(6)e^)\

2e

e

hence

(ii)

jgcAde =

j^X'(d)d6

as $dr = 0$ on $\partial C$.

5. The elements of the gauge group $\mathcal{G}$ are gauge transformations $g: \mathbb{R}^n \to G$. These act on the fields as follows:

(i) $A \to A^g = gAg^{-1} + g\,dg^{-1}$
(ii) $\phi \to \phi^g = \rho(g)\,\phi$
(iii) $F_A \to F_{A^g} = gF_Ag^{-1}$
(iv) $D_A\phi \to D_{A^g}\phi^g = \rho(g)(D_A\phi)$.

Note that

(v) $D_{A^g}\,\omega^g = D_{A^g}(\rho(g)\,\omega) = \rho(g)\,D_A(\omega)$.

Similarly, using (iii) for (10.2.2c), where $\omega$ is $\mathfrak{g}$-valued, we have:

(vi) $D_{A^g}\,\omega^g = D_{A^g}(g\,\omega\,g^{-1}) = g\,(D_A\,\omega)\,g^{-1}$.

From (iv), (v), and (vi) it follows that $D_A$ is covariant. Substituting the gauge-transformed values in (10.2.3) from (ii), (iii) and (iv), we note that the action $\mathcal{A}_{YMH}$ is gauge invariant.

3 SELF-DUALITY IN YANG-MILLS THEORY AND INSTANTONS

We devote this section to the role of duality in Yang-Mills theory, in particular to the study of instantons on the so-called 4-dimensional self-dual manifolds. The notion of self-duality, which is now a standard part of Yang-Mills-type theories, was originally introduced in 1977 (Atiyah, Hitchin and Singer [6a]) and was later developed by the same authors in 1978 [6b]. We begin with some definitions and show via simple examples the relation of self-duality to other well-known structures, e.g., conformal and almost-complex structures. We then use these ideas to express locally, in quaternion form, the anti-instantons and instantons and, later on, the multi-instantons as well. Their expressions are then obtained in asymptotic gauge, and it is shown that the corresponding curvatures give the absolute minima of $\mathcal{A}_{YM}$. We also compute the dimension of the moduli spaces of instantons simply by counting parameters, and refer the reader to the original literature where this is done using the Atiyah-Singer index theorem [4], [7].

3.1 Self-duality in 4-dimensions

From Chapters 1 and 6 we are already familiar with the Hodge star operator $*$ on an $n$-dimensional oriented Riemannian manifold $X$, which maps a $p$-form in $\Lambda^p$, the bundle of exterior $p$-forms, to an $(n-p)$-form in $\Lambda^{n-p}$ by the rule:

$$\alpha \wedge *\beta = (\alpha, \beta)\,\omega \in \Lambda^n \tag{10.3.1}$$

where $\alpha, \beta \in \Lambda^p$, $(\ ,\ )$ denotes the inner product with respect to the Riemannian metric on $X$, and $\omega$ is the volume form. When $n = 2m$, it can be checked that for $p = m$ this operator satisfies:

$$*^2 = (-1)^m \tag{10.3.2}$$

and in addition it is conformally invariant, i.e., multiplication of the metric by a scalar $\lambda$ does not alter (10.3.1). For $m = 1$, $*^2 = -1$, and in this case $*$ defines the complex structure on a Riemann surface. When $m = 2$, $*^2 = 1$ implies that the eigenvalues $* = +1$ and $* = -1$ have eigenspaces $\Lambda^2_+$ and $\Lambda^2_-$ that result from the splitting of the 2-form bundle $\Lambda^2$:

$$\Lambda^2 = \Lambda^2_+ \oplus \Lambda^2_- \tag{10.3.3}$$
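For flat $\mathbb{R}^4$ the splitting (10.3.3) can be made concrete with a few lines of code. The sketch below is an illustration only: it assumes the Euclidean metric, the orientation $dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3$ (indices run from 0 to 3 here, not 1 to 4), and the helper names are hypothetical. It implements $*$ on the six basis 2-forms, so that one can check $*^2 = 1$ and exhibit three self-dual and three anti-self-dual combinations.

```python
from itertools import combinations


def perm_sign(p):
    """Sign of a permutation of distinct integers, via its inversion count."""
    inversions = sum(1 for a, b in combinations(range(len(p)), 2) if p[a] > p[b])
    return -1 if inversions % 2 else 1


PAIRS = list(combinations(range(4), 2))   # basis dx^a ^ dx^b with a < b


def hodge_star(form):
    """Hodge star on constant-coefficient 2-forms of flat R^4.

    `form` maps an index pair (a, b), a < b, to its coefficient.
    """
    out = {pair: 0 for pair in PAIRS}
    for (a, b), coeff in form.items():
        c, d = [k for k in range(4) if k not in (a, b)]
        out[(c, d)] += perm_sign((a, b, c, d)) * coeff
    return out


def basis_form(pair):
    return {q: (1 if q == pair else 0) for q in PAIRS}


def combine(f, g, sign=1):
    return {q: f[q] + sign * g[q] for q in PAIRS}


# e + *e is self-dual and e - *e is anti-self-dual for e = dx^0 ^ dx^k
self_dual = [combine(basis_form((0, k)), hodge_star(basis_form((0, k))))
             for k in (1, 2, 3)]
anti_self_dual = [combine(basis_form((0, k)), hodge_star(basis_form((0, k))), -1)
                  for k in (1, 2, 3)]
```

The three forms in `self_dual` and the three in `anti_self_dual` together span the six-dimensional space of 2-forms, which is the pointwise content of (10.3.3).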

These are called the bundles of 'self-dual' and 'anti-self-dual' 2-forms respectively. Since $X$ is a Riemannian manifold, the above splitting of $\Lambda^2$ leads to an important decomposition of the Riemannian curvature. In general this tensor defines a self-adjoint transformation $\mathcal{R}: \Lambda^2 \to \Lambda^2$ expressed as:

$$\mathcal{R}(e_i \wedge e_j) = \frac{1}{2}\sum_{k,l} R_{ijkl}\, e_k \wedge e_l \tag{10.3.4}$$

where $\{e_i\}$ is a basis of local orthonormal 1-forms. Now in 4 dimensions $\mathcal{R}$ can be written as a block matrix relative to the decomposition (10.3.3):

$$\mathcal{R} = \begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \tag{10.3.5}$$


where $B \in \mathrm{Hom}(\Lambda^2_-, \Lambda^2_+)$, and $A \in \mathrm{End}(\Lambda^2_+)$ and $C \in \mathrm{End}(\Lambda^2_-)$ are self-adjoint. This representation of $\mathcal{R}$ gives a complete decomposition of the Riemannian curvature tensor in terms of irreducible components (see Singer and Thorpe [45]):

$$\mathcal{R} \to \left(\mathrm{Tr}\,A,\ B,\ A - \tfrac{1}{3}\mathrm{Tr}\,A,\ C - \tfrac{1}{3}\mathrm{Tr}\,C\right) \tag{10.3.6}$$

Here $\mathrm{Tr}\,A = \mathrm{Tr}\,C = \tfrac{1}{4}\,\times$ (scalar curvature), $B$ is the traceless Ricci tensor, and the last two components, denoted $W_+$ and $W_-$, make up the conformally invariant Weyl tensor $W = W_+ + W_-$. It is well known that the Riemannian metric is Einstein if and only if $B = 0$, and is conformally flat if and only if $W \equiv 0$ [6b]; both these special types of metrics occur in higher dimensions. When the dimension is 4, the metric becomes still more specialized, as we note in the following examples and in Definition (10.3.5).

Definition 10.3.1 An oriented Riemannian manifold is self-dual if its Weyl tensor $W = W_+$, i.e., if $W_- = 0$. It is called anti-self-dual if $W_+ = 0$.

Since the Weyl tensor and the $*$ operator (in even dimensions) are conformally invariant, it follows that this is a property which depends on the underlying conformal structure and the choice of orientation.

3.2 Examples of Self-(Anti-Self) Dual Manifolds

We list below a few easy examples of self-dual and anti-self-dual manifolds.

Example 10.3.2 If $X$ is conformally flat, then since $W_+ = W_- = 0$, $X$ is evidently self-dual. As $S^4$, $S^1 \times S^3$ and the 4-torus $T^4$ are all conformally flat with respect to their natural metrics, they are all self-dual.

Example 10.3.3 The complex projective plane P2 (
3.3 Self-dual Connection on a 4-manifold

Consider now a principal $G$-bundle $P$ over $X$, with $\mathfrak{g}$ denoting the Lie algebra of $G$. Let a $\mathfrak{g}$-valued 1-form $\omega$ define a connection on $P$ and let $\Omega$ be the $\mathfrak{g}$-valued curvature 2-form:

$$\Omega = d\omega + \frac{1}{2}[\omega, \omega] \tag{10.3.7}$$

which descends to $X$ as a section of $\mathfrak{g} \otimes \Lambda^2$, where $\mathfrak{g}$ now denotes the vector bundle associated to $P$ by the adjoint representation (see Subsec. 2.5.7 and Subsec. 6.6.1). In view of our earlier studies we know that the curvature $\Omega$ can also be obtained in a different manner, from the physicists' point of view (see Sec. 2.5, Part B). This is done by considering a vector bundle $E$ over $X$. A connection is now defined by a covariant derivative $\nabla: A^0(E) \to A^1(E)$, where $A^p(E) = \Gamma(E \otimes \Lambda^p)$ is the space of smooth sections of $E \otimes \Lambda^p$. A natural extension of $\nabla$, given by $D_1: A^1(E) \to A^2(E)$, is defined by $D_1(e \otimes \alpha) = \nabla e \wedge \alpha + e \otimes d\alpha$, where $e \in A^0(E)$ and $\alpha \in A^1$. The curvature $\Omega$ then follows from the composition $D_1\nabla \in A^2(\mathrm{End}\,E)$. (Note that this is possible for a principal bundle $P$, since a representation of $G$ on a vector space $E$ can always define an associated vector bundle $P \times_G E$, and a local basis $\{e_i\}$ of $E$ via a local section of $P$; see Subsecs. 2.5.6 and 2.5.7, in particular Eqs. (2.5.29-34).) Hence, using this terminology, the curvature $\Omega \in A^2(\mathfrak{g}) = \Gamma(\mathfrak{g} \otimes \Lambda^2)$ is the physicists' gauge field and the connection form $\omega$ is the gauge potential.


We know that the connection form $\omega$ is not gauge-invariant (see Eq. 6.6.16), but the curvature form $\Omega$ is gauge-invariant whatever the dimension $n$ of $X$, odd or even. Thus for $g \in \mathcal{G}$, the infinite-dimensional group of gauge transformations⁵ formed with respect to $P$, the curvature $\Omega$ satisfies:

$$\Omega \to g^{-1}\Omega g = \Omega \tag{10.3.8}$$

When $X$ is 4-dimensional, we can define a self-dual connection form on it in the following way:

Definition 10.3.5 On a 4-manifold $X$ a connection $\omega$ is called self-dual if its curvature $\Omega$ is in $A^2_+(\mathfrak{g})$, i.e., $\Omega = *\Omega$; it is called anti-self-dual if $\Omega \in A^2_-(\mathfrak{g})$, i.e., $\Omega = -*\Omega$.

Also, as the $*$ operator is conformally invariant on 2-forms, the property of self-duality of a connection is invariant under the larger group of transformations of the principal bundle whose elements act as conformal transformations on the base space $X$.

3.4 Self-duality in Spinor-bundles

Next we see how the decomposition (10.3.3) leads to the concept of self-duality amongst spinor bundles. Note that by using the metric on $X$, the 2-forms can be identified with skew-adjoint transformations of $\Lambda^1$, and therefore the decomposition $\Lambda^2 = \Lambda^2_+ \oplus \Lambda^2_-$ can be thought of as corresponding to the Lie algebra isomorphism:

$$so(4) \cong so(3) \oplus so(3) \tag{10.3.9}$$

As a result of this isomorphism the bundles $\Lambda^2_\pm$ are now bundles of three-dimensional Lie algebras. On the Lie group level, this corresponds to the isomorphism:

$$\mathrm{Spin}(4) = SU(2) \times SU(2) \tag{10.3.10}$$

and introduces, at least locally, two complex spinor bundles $V_+$ and $V_-$, the bundles of 'self-dual' and 'anti-self-dual' spinors (see Sec. 7.2). These bundles lead to the total spin bundle:

$$V = V_+ \oplus V_- \tag{10.3.11}$$

which is isomorphic to the complexified Clifford algebra bundle of $\Lambda^1$, which in turn is isomorphic as a graded vector bundle to $\Lambda^*_{\mathbb{C}} = \oplus_p \Lambda^p_{\mathbb{C}} \cong \oplus\, \Lambda^p \otimes$
3.5 Quaternions and Yang-Mills' Instanton

In this subsection we shall see that the concept of self-duality becomes still more explicit in terms of quaternions. For that we first note some of their properties. Recall that a quaternion $x \in \mathbb{H}$ can be written as:

$$x = x_1 + x_2 i + x_3 j + x_4 k \tag{10.3.12}$$

where $i, j, k$ satisfy $i^2 = j^2 = k^2 = ijk = -1$ and $i, j, k$ anti-commute among themselves. The variables $(x_1, x_2, x_3, x_4)$ in (10.3.12) are all real numbers.

⁵ A gauge transformation on a principal $G$-bundle $P$ is a diffeomorphism $f: P \to P$ such that (i) $f(gp) = g f(p)$ for $g \in G$ and $p \in P$, (ii) $f$ preserves each fiber, i.e., acts trivially on the base space $X$. The group $\mathcal{G}$ consists of sections of the bundle of groups $P \times_G G$, where $G$ acts on itself by conjugation.


The conjugate quaternion $\bar{x}$ is defined by:

$$\bar{x} = x_1 - x_2 i - x_3 j - x_4 k \tag{10.3.13}$$

Evidently, conjugation in quaternions is an anti-involution, i.e., $\overline{xy} = \bar{y}\,\bar{x}$. From the relations satisfied by $i, j, k$ it follows that:

$$x\bar{x} = \bar{x}x = \sum_{i=1}^{4} x_i^2 = |x|^2 \tag{10.3.14a}$$

The quantity $|x|^2$, called the squared norm of the quaternion, is zero only if $x = 0$. A non-zero $x$ has a unique inverse:

$$x^{-1} = \bar{x}/|x|^2. \tag{10.3.14b}$$

The quaternions $x$ with unit norm, i.e., $|x| = 1$, form a multiplicative group denoted $Sp(1)$; from (10.3.14) it is evident that as a manifold it can be identified with $S^3$:

$$\sum_{i=1}^{4} x_i^2 = 1,$$

which is $SU(2)$ group-theoretically (see Chap. 7). The group $Sp(1)$ is a Lie group. Since $S^3$ can be identified with $SU(2)$, it follows that $Sp(1)$ is isomorphic to the group $SU(2)$. It can also be checked that the Lie algebra $su(2)$ is isomorphic to the Lie algebra $\mathrm{Im}\,\mathbb{H}$ formed by imaginary quaternions. The quaternion differentials:

$$dx = dx_1 + dx_2\, i + dx_3\, j + dx_4\, k, \qquad d\bar{x} = dx_1 - dx_2\, i - dx_3\, j - dx_4\, k$$

lead to the exterior product

$$dx \wedge d\bar{x} = -2\{(dx_1 \wedge dx_2 + dx_3 \wedge dx_4)\,i + (dx_1 \wedge dx_3 - dx_2 \wedge dx_4)\,j + (dx_1 \wedge dx_4 + dx_2 \wedge dx_3)\,k\} \tag{10.3.15}$$
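The product (10.3.15) is easy to confirm by direct computation. The sketch below is only an illustration: quaternions are stored as `(real, i, j, k)` tuples, a quaternion-valued 1-form with constant coefficients is stored as the list of its four coefficients, and the names `qmul`, `conj` and `wedge` are hypothetical helpers, not notation from the text. In the result, the key `(m, n)` stands for $dx_{m+1} \wedge dx_{n+1}$.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)


def conj(q):
    """Quaternion conjugate, as in (10.3.13)."""
    return (q[0], -q[1], -q[2], -q[3])


def wedge(alpha, beta):
    """Wedge product of two quaternion-valued 1-forms with constant coefficients.

    alpha[m] is the coefficient of dx_{m+1}. Coefficients are multiplied in the
    given order, since quaternions do not commute.
    """
    out = {}
    for m in range(4):
        for n in range(m + 1, 4):
            pq = qmul(alpha[m], beta[n])
            qp = qmul(alpha[n], beta[m])
            out[(m, n)] = tuple(s - t for s, t in zip(pq, qp))
    return out


ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
dx = [ONE, I, J, K]                   # dx = dx1 + i dx2 + j dx3 + k dx4
dxbar = [conj(q) for q in dx]         # its conjugate

dx_dxbar = wedge(dx, dxbar)           # compare with (10.3.15)
dxbar_dx = wedge(dxbar, dx)           # the product in the opposite order
```

In `dx_dxbar` the coefficient of $i$ is $-2$ on both $dx_1 \wedge dx_2$ and $dx_3 \wedge dx_4$, as in (10.3.15); in `dxbar_dx` the two coefficients have opposite signs, which is the anti-self-dual combination.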

The coefficients of $i, j, k$ in the above product form a basis for self-dual 2-forms; similarly the coefficients of $i, j, k$ in $d\bar{x} \wedge dx$ form a basis for anti-self-dual 2-forms. Hence $dx \wedge d\bar{x}$ is an $su(2)$-valued self-dual 2-form, whereas $d\bar{x} \wedge dx$ is an anti-self-dual 2-form. Recall that a gauge potential is Lie-algebra valued; hence, as $su(2) = \mathrm{Im}\,\mathbb{H}$, an $su(2)$-potential $A(x)$:

$$A(x) = \sum_\mu A_\mu(x)\,dx^\mu \tag{10.3.16}$$

is $\mathrm{Im}\,\mathbb{H}$-valued, i.e., the values of the functions $A_\mu(x)$, where $x$ is a quaternion variable, are imaginary quaternions. Thus if $h(x)$ is any function of the quaternion variable $x$ with quaternion values, we can write (10.3.16) as:

$$A(x) = \mathrm{Im}\{h(x)\,dx\} = \frac{1}{2}\{h(x)\,dx - d\bar{x}\,\bar{h}(x)\} \tag{10.3.17}$$

Note that in writing the extreme right of this equality we have used the principles of complex variable theory and the anti-involution property of quaternions. And since $h(x)\,dx = \sum_\mu h_\mu(x)\,dx^\mu$ with $h_\mu(x) \in \mathbb{H}$, it follows that


$$A_\mu(x) = \mathrm{Im}(h_\mu(x)). \tag{10.3.18}$$

It is worth noting here that though the potential is given by the imaginary part of $\mathbb{H}$, there is a potential which belongs to the group $\mathbb{H}^*$ of all non-zero quaternions, which is $SU(2)$ times a scale factor. We next express the curvature $F = dA + A \wedge A$ in quaternionic form. Now

$$dA = \sum_\nu dA_\nu \wedge dx^\nu = \frac{1}{2}\sum_{\mu,\nu}(\partial_\mu A_\nu - \partial_\nu A_\mu)\,dx^\mu \wedge dx^\nu$$

and

$$A \wedge A = \sum_{\mu,\nu} A_\mu A_\nu\,dx^\mu \wedge dx^\nu = \frac{1}{2}\sum_{\mu,\nu}[A_\mu, A_\nu]\,dx^\mu \wedge dx^\nu.$$

Therefore in view of (10.3.17) and (10.3.18), the curvature $F$ can be written as:

$$F = \mathrm{Im}\{dh \wedge dx + h\,dx \wedge h\,dx\} \tag{10.3.19}$$

3.6 The Basic Instanton and its Asymptotic Form

Consider now the problem of writing the basic instanton (or the anti-instanton), i.e., the gauge potential of a pure Yang-Mills function for $k = 1$ (or $k = -1$)⁶ in quaternionic formulation. For this purpose let $A$ be the $su(2)$-potential defined as:

$$A(x) = \mathrm{Im}\left\{\frac{\bar{x}\,dx}{1 + |x|^2}\right\} \tag{10.3.20}$$

Using (10.3.17) this can be written as:

$$A(x) = \frac{1}{2}\,\frac{\bar{x}\,dx - d\bar{x}\,x}{1 + |x|^2}, \tag{10.3.21}$$

and its components can be written explicitly from (10.3.20) as:

$$\begin{aligned} A_1(x) &= \frac{-x_2 i - x_3 j - x_4 k}{1 + |x|^2}, & A_2(x) &= \frac{x_1 i - x_4 j + x_3 k}{1 + |x|^2},\\ A_3(x) &= \frac{x_4 i + x_1 j - x_2 k}{1 + |x|^2}, & A_4(x) &= \frac{-x_3 i + x_2 j + x_1 k}{1 + |x|^2}. \end{aligned} \tag{10.3.22}$$

Treating $\dfrac{\bar{x}}{1 + |x|^2}$ as $h(x)$ of (10.3.17), we write the curvature form $F$ given in (10.3.19) as:

$$F = \mathrm{Im}\left\{\frac{d\bar{x} \wedge dx}{1 + |x|^2} + \bar{x}\,d(1 + |x|^2)^{-1} \wedge dx + \frac{\bar{x}\,dx \wedge \bar{x}\,dx}{(1 + |x|^2)^2}\right\} \tag{10.3.23}$$

The middle term on the right hand side equals

$$-\frac{1}{(1 + |x|^2)^2}\,\{\bar{x}\,dx \wedge \bar{x}\,dx + \bar{x}x\,d\bar{x} \wedge dx\}.$$

⁶ $k$ here refers to the one given in (iii) of Exc. (10.2.2).


We substitute these two terms and note that the third term on the right hand side of (10.3.23) cancels with the first term here, while the second term adds to the first term of (10.3.23), to give the purely imaginary expression:

$$F = \frac{d\bar{x} \wedge dx}{(1 + |x|^2)^2} \tag{10.3.24}$$
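The component formulas (10.3.22) can be cross-checked numerically against the compact form of the potential. The sketch below assumes that (10.3.20) reads $A(x) = \mathrm{Im}\{\bar{x}\,dx/(1 + |x|^2)\}$, which is what (10.3.21) implies, so that $A_\mu = \mathrm{Im}(\bar{x}\,e_\mu)/(1 + |x|^2)$ with $(e_1, e_2, e_3, e_4) = (1, i, j, k)$. The function names are illustrative only, and each component is returned as its $(i, j, k)$ coefficients.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)


def conj(q):
    return (q[0], -q[1], -q[2], -q[3])


UNITS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]


def potential_components(x):
    """A_mu as (i, j, k) coefficients, from Im(conj(x) * e_mu) / (1 + |x|^2)."""
    scale = 1 + sum(c * c for c in x)
    return [tuple(c / scale for c in qmul(conj(x), e)[1:]) for e in UNITS]


def components_from_10_3_22(x):
    """The same components, typed in from (10.3.22)."""
    x1, x2, x3, x4 = x
    s = 1 + x1*x1 + x2*x2 + x3*x3 + x4*x4
    return [(-x2 / s, -x3 / s, -x4 / s),
            (x1 / s, -x4 / s, x3 / s),
            (x4 / s, x1 / s, -x2 / s),
            (-x3 / s, x2 / s, x1 / s)]


point = (1.0, 2.0, 3.0, 4.0)
print(potential_components(point))
print(components_from_10_3_22(point))
```

Agreement at a few sample points does not prove (10.3.22), but it catches sign slips of the kind that are easy to make when expanding $\bar{x}\,dx$ by hand.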

From our remark after (10.3.15) on the 2-forms $dx \wedge d\bar{x}$ and $d\bar{x} \wedge dx$, we note that $F$ is anti-self-dual, i.e., $*F = -F$. From the equality $*F = -F$ it follows that the potential $A$ is an anti-instanton. We shall verify this by following the argument given below. Note that from (10.3.20), $A(x)$ can be written as $A(x) = \mathrm{Im}\left\{\dfrac{\bar{x}\,dx}{1 + |x|^2}\right\}$; therefore as $|x| \to \infty$, in view of (10.3.14b) we have:

$$A(x) \sim \mathrm{Im}(x^{-1}\,dx) = \phi(x)^{-1}\,d\phi(x), \tag{10.3.25}$$

where $\phi(x) = \dfrac{x}{|x|}$. Thus $A$ is asymptotically the gauge transform of 0 by the gauge transformation $g(x) = \phi(x)$; alternatively, if the inverse gauge transformation $(g(x))^{-1} = (\phi(x))^{-1}$ is applied to $A$, then asymptotically 0 is attained. On the unit sphere $|x| = 1$, which is $S^3$ in quaternion space, the transform $(\phi(x))^{-1}$ simplifies to $\bar{x}$, and the map $x \to \bar{x}$ of $S^3$ to itself has degree $-1$, i.e., $k = -1$; thus $A(x)$ given in (10.3.20) describes an anti-instanton. Replacing $x$ by $\bar{x}$ in (10.3.20) and in (10.3.24), we obtain an instanton with potential and field:

$$A(x) = \mathrm{Im}\left\{\frac{x\,d\bar{x}}{1 + |x|^2}\right\}, \qquad F = \frac{dx \wedge d\bar{x}}{(1 + |x|^2)^2}. \tag{10.3.26}$$

3.7 Anti-instanton in Asymptotic Gauge

We further note that if we change the gauge by $\phi(x)^{-1}$ at $|x| \to \infty$ by using (10.3.25) and introduce the quaternion coordinate $y = x^{-1}$ around the point at $\infty$, which is regarded as a point of $S^4$ (the compactified $\mathbb{R}^4$), then, from the imaginary part of the identity⁷

$$x\left\{\frac{\bar{x}\,dx}{1 + |x|^2}\right\}x^{-1} + x\,dx^{-1} = \frac{\bar{y}\,dy}{1 + |y|^2}, \tag{10.3.27}$$

it follows that the anti-instanton (10.3.20) extends to $S^4$ and has precisely the same form at $\infty$ as near 0. The same is true of the instanton. If, instead of the gauge change mentioned above in (10.3.27), we substitute $y = x^{-1}$ in (10.3.20), we obtain:

$$A(y) = -\mathrm{Im}\left\{\frac{dy\;y^{-1}}{1 + |y|^2}\right\}. \tag{10.3.28}$$

⁷ The left hand side of this identity follows from the gauge transformation rule $A \to g^{-1}Ag + g^{-1}dg$, where $g$ is $(\phi(x))^{-1} = \bar{x}/|x|$ ($x \in S^4$).


This describes the anti-instanton in the "singular" or "asymptotic" gauge, i.e., the gauge in which $A \to 0$ as $|y| \to \infty$; it is, however, not well behaved at $y = 0$, where $A(y)$ is singular. We have already seen that this singularity of $A(y)$ can be removed by a suitable gauge transformation.

3.8 Application of Conformal Transformations to Basic Anti-instanton

Recall that the conformal group of $S^4$ is $SL(2, \mathbb{H})/\{\pm 1\}$, which acts on quaternionic variables via fractional linear transformations just as $SL(2, \mathbb{C})$ does on complex variables. It can be checked that $A(x)$ given in (10.3.20) is preserved, up to a gauge transformation, by translation $x \to x + c$ and by the inversion $x \to x^{-1}$ (see Exc. 2). It is also unchanged by $x \to ax$ where the quaternion $a$ has unit norm, and $x \to xa$ produces only a constant gauge transformation.⁸ The latter two combine to give a transformation $x \to axb$ with $x, a, b$ all quaternions ($a \ne 0$, $b \ne 0$); these transformations generate the rotation group $SO(4)$ together with changes of scale. Thus $A(x)$ is invariant under $SO(4)$. As a matter of fact it is invariant, up to gauge transformations, with respect to the larger group $SO(5)$, which can be considered here as $Sp(2)/\{\pm 1\}$, where $Sp(2) \subset SL(2, \mathbb{H})$ is the compact subgroup that leaves the norms unaltered.

We now move to obtain new anti-instantons using the elements of $SL(2, \mathbb{H})/Sp(2)$. This is done by replacing $x$ with $\mu(x - b)$, where $\mu$ is a positive real scalar and $b$ is a quaternion. The parameters $(\mu, b)$ can be regarded as parametrizing $SL(2, \mathbb{H})/Sp(2)$, the space of quaternion norms on $\mathbb{H}^2$ with volume 1. This follows after we have associated to $(\mu, b)$ the positive self-adjoint matrix:

$$\begin{pmatrix} \mu & b \\ \bar{b} & \nu \end{pmatrix} \tag{10.3.29}$$

with unit determinant: $\mu\nu - |b|^2 = 1$. The transformation

$$x \to \mu(x - b), \tag{10.3.30}$$

with $\mu$ and $b$ as variables, when applied to (10.3.20) generates a 5-parameter family of anti-instantons with center $b$ and scale $\mu$. Correspondingly the change in (10.3.24) shows that the field density is a maximum at the center, with strength $\mu^2$ there. The above discussion shows that no two members of this parametrized family are gauge-equivalent. It can also be shown that every anti-instanton (i.e., $k = -1$) is gauge-equivalent to a specific member of this family [2]. Again, as we obtained the anti-instanton (10.3.25) in asymptotic gauge, we can obtain here the general anti-instanton in asymptotic gauge by using the transformation:

$$x \to \lambda(b - x)^{-1} \tag{10.3.31}$$

Note that (10.3.31) is obtained from (10.3.30) by inversion and a sign change. (For more details, see Chap. 2 in [2].)

3.9 Construction of Multi-instantons

As mentioned earlier, the solutions of pure Yang-Mills equations with topological charge $k > 1$ are called multi-instantons. These were first obtained by 't Hooft (unpublished), who found a $5k$-parameter family of solutions to the self-dual Yang-Mills field equations. A $(5k + 4)$-parameter family of solutions was obtained (independently) by Jackiw, Nohl and Rebbi in 1977 (see [31]).

⁸ The product of two quaternions is not commutative, hence $ax \ne xa$ in general.


Our construction here (which is simplistic in nature) is based on Atiyah's work [2]. It uses the quaternion framework, with which we feel the reader is comfortable by now. Consider the space $\mathbb{H}^k$ of column vectors:

$$u = \begin{pmatrix} u_1 \\ \vdots \\ u_k \end{pmatrix}$$

where each $u_a$ ($a = 1, 2, \ldots, k$) is a quaternion, and let $u^* = [\bar{u}_1 \cdots \bar{u}_a \cdots \bar{u}_k]$ denote the conjugate transpose of $u$. An $su(2)$-potential on $\mathbb{H}^k = \mathbb{R}^{4k}$ similar to $A(x)$ in (10.3.20) can now be defined as:

$$A(u) = \mathrm{Im}\left\{\frac{u^*\,du}{1 + \|u\|^2}\right\}. \tag{10.3.32}$$

The term $u^*\,du$ here stands for the matrix product

$$u^*\,du = \sum_{a=1}^{k} \bar{u}_a\,du_a,$$

and $\|u\|^2 = u^*u = \sum_a |u_a|^2$ is the Euclidean norm. Evidently on each coordinate axis (10.3.32) simplifies to (10.3.20) and is unchanged by the group $Sp(k)$ acting on $\mathbb{H}^k$. Thus it restricts to (10.3.20) on any one-dimensional subspace $\mathbb{H}$ of $\mathbb{H}^k$. This means that it has a high degree of symmetry and thus has important geometrical implications. For our purpose, we simply treat it as a means to construct potentials on $\mathbb{H} = \mathbb{R}^4$ by using suitable functions $u = f(x)$, i.e., maps $f: \mathbb{H} \to \mathbb{H}^k$. Given any such $f$ we can write (10.3.32) as:

$$A(f(x)) = A_f(x) = \mathrm{Im}\left\{\frac{f^*(x)\,df(x)}{1 + \|f(x)\|^2}\right\} = \mathrm{Im}\left\{\frac{\sum_a \bar{f}_a(x)\,df_a(x)}{1 + \|f(x)\|^2}\right\}. \tag{10.3.33}$$

We now choose our $f(x)$ as the matrix analogue of (10.3.31), but with a conjugation:

$$u = f(x) = [\lambda(B - x)^{-1}]^*. \tag{10.3.34}$$

The choice of $f(x)$ with conjugation gives us instantons in place of anti-instantons. In (10.3.34), $B$ is a symmetric $k \times k$ matrix of quaternions, $\lambda$ is a row vector $(\lambda_1 \ldots \lambda_k)$ of quaternions, and $x$ stands for the scalar quaternion matrix $xI$, with $I$ being the unit $k \times k$ matrix. For $k = 1$, the parameters were arbitrary except that $\lambda$ had to be invertible. In the general case they have to satisfy the following algebraic constraints:

(i) $B^*B + \lambda^*\lambda$ is a real $k \times k$ matrix,
(ii) for every $x \in \mathbb{H}$ the equations:


(a) $(B - x)\,\xi = 0$, (b) $\lambda\xi = 0$, with $\xi \in \mathbb{H}^k$

imply that $\xi = 0$.

Condition (i) says that all coefficients of $i, j, k$ in the $k \times k$ quaternion matrix $B^*B + \lambda^*\lambda$ vanish, and this translates into a system of quadratic relations on the coefficients of $B$ and $\lambda$. Condition (ii), which is the non-degeneracy condition, can be interpreted to say that the $(k + 1) \times k$ matrix $\begin{pmatrix} \lambda \\ B - x \end{pmatrix}$ has maximal rank $k$ for every $x$ in $\mathbb{H}$. The first condition ensures that the potential $A_{\lambda,B}$ obtained by substituting (10.3.34) into (10.3.33) is self-dual, whereas the second ensures that the solution is non-degenerate, meaning that the singularities of the potential at the points $x$ for which $(B - x)$ is singular can be removed by a gauge transformation. We now state an important result concerning the multi-instanton (see [2(b)]).

Result 10.3.6 (theorem) Every $k$-instanton for $SU(2)$ arises from a pair $(\lambda, B)$ that satisfies the conditions (i) and (ii); the potential is given in an asymptotic gauge by (10.3.32) with $u(x)$ defined by (10.3.34). The two potentials defined by $(\lambda, B)$ and $(\lambda', B')$ are gauge-equivalent if and only if $\lambda' = q\lambda T$ and $B' = T^{-1}BT$, where $q \in Sp(1)$ and $T \in O(k)$. (See [35] for the proof.)

In order to show that the number of effective parameters in the above construction is $(8k - 3)$, we first note that the pair $(\lambda, B)$ involves

$$4k + 4\left(\tfrac{1}{2}k(k + 1)\right) = 2k^2 + 6k$$

parameters, the first term $4k$ coming from $\lambda = (\lambda_1 \cdots \lambda_k) \in \mathbb{H}^k$ and the second from the fact that $B$ is a symmetric $k \times k$ matrix over $\mathbb{H}$. However, the constraints imposed by conditions (i) and (ii) above reduce this number. The requirement of reality in condition (i) implies a reduction by $3k(k - 1)/2$. Condition (ii), which says that $\mathrm{rank}\begin{pmatrix} \lambda \\ B - x \end{pmatrix} = k$, does not contribute to the reduction, but the gauge equivalence coming from the $SU(2)$-action and the $O(k)$-action leads to reductions by 3 and $k(k - 1)/2$ respectively. As a result of these computations we have:

$$(2k^2 + 6k) - \frac{3k(k - 1)}{2} - 3 - \frac{k(k - 1)}{2} = 8k - 3 \tag{10.3.35}$$

For $k = 1$ this number reduces to 5, as obtained earlier for an instanton. Note further that instantons can be described using more than one mathematical discipline (see for instance Atiyah-Ward [8] and Atiyah-Bott [9]). Having seen their description in terms of quaternions, we shall next examine how elements of algebraic geometry can be used in this direction.
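The counting that leads to (10.3.35) is simple enough to tabulate. The following sketch, with a hypothetical function name, just repeats the bookkeeping of the paragraph above term by term (raw parameters of $\lambda$ and $B$, minus the reality constraints, minus the $SU(2)$ and $O(k)$ gauge equivalences) and prints the result for small $k$.

```python
def parameter_count(k):
    """Effective number of parameters of the (lambda, B) data for charge k."""
    raw = 4 * k + 4 * (k * (k + 1) // 2)   # lambda in H^k, B symmetric k x k over H
    reality = 3 * k * (k - 1) // 2         # condition (i): B*B + lambda*lambda real
    su2 = 3                                # q in Sp(1), i.e. the SU(2)-action
    orthogonal = k * (k - 1) // 2          # T in O(k)
    return raw - reality - su2 - orthogonal


for k in range(1, 6):
    print(k, parameter_count(k))
```

The integer divisions are exact because $k(k \pm 1)$ is always even. For $k = 1$ the function returns 5, the number of parameters $(\mu, b)$ of the basic family.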

3.10 Projective Spaces and Instantons

As expected these different descriptions are related to each other through a well defined correspondence as would be evident from our Results (10.3.7) and (10.3.8). Recall that projective spaces (real or complex) are the basic tools of algebraic geometry (see App. 10.A.1 for definition). We shall see here the


relation between $P_3(\mathbb{C})$, the complex projective 3-space (the space parametrized by lines through $O$ in $\mathbb{C}^4$), and $P_1(\mathbb{H})$, the 1-dimensional quaternion projective space, as well as the relation between $P_3(\mathbb{C})$ and the 4-sphere $S^4$. The first relation emerges from the fact that $\mathbb{H}^2$ can be identified with $\mathbb{C}^4$ (see Exc. 3.1). Accordingly, to each line in $P_3(\mathbb{C})$ we can associate the quaternion line it generates, thereby getting a map:

$$P_3(\mathbb{C}) \to P_1(\mathbb{H}) \tag{10.3.36}$$

In homogeneous coordinates this map is given by the assignment:

$$(x_0, x_1, x_2, x_3) \to (x_0 + x_1 j,\ x_2 + x_3 j) \qquad (x_i \in \mathbb{C},\ i = 0, 1, 2, 3).$$

We also note that this is a bundle map with fiber $P_1(\mathbb{C})$. To establish the second relation (between $P_3(\mathbb{C})$ and $S^4$), we proceed as follows. Let $x$ and $y$ be two points in $P_3(\mathbb{C})$ with homogeneous coordinates $(x_0, x_1, x_2, x_3)$ and $(y_0, y_1, y_2, y_3)$. A point $p \in P_5(\mathbb{C})$ can now be defined with homogeneous coordinates (known as Plücker coordinates):

$$p_{ij} = x_i y_j - x_j y_i \quad \text{for } i < j \quad (i, j = 0, 1, 2, 3) \tag{10.3.37}$$

The point $p = (p_{01} \cdots p_{23})$ is uniquely determined by the line joining $x$ and $y$; we denote this line by $L_p$. On the other hand these coordinates give a skew-symmetric matrix characterising the line that joins $x$ and $y$. It can be checked that the coordinates of $p$ satisfy the equation:

$$p_{01}p_{23} - p_{02}p_{13} + p_{03}p_{12} = 0 \tag{10.3.38}$$

We further note that $L_p \to p$ defines the Plücker embedding of the Grassmannian $Gr_1(P_3(\mathbb{C}))$ into P5 ( $(-\bar{z}_1, \bar{z}_0, -\bar{z}_3, \bar{z}_2)$

(10.3.39)

This mapping is anti-linear (i.e., anti-holomorphic in local complex coordinates), without fixed points, and satisfies $\sigma^2 = \mathrm{id}$. The following properties of $\sigma$, given in Result (10.3.7), can be easily checked:

Result 10.3.7
Property (i) $\sigma$ preserves the fibration (10.3.36), acting trivially on $P_1(\mathbb{H})$ and acting as the antipodal map on each fiber (i.e., on $S^2$).
Property (ii) Given a point $p \in Q_4$, we have $p \in S^4$ if and only if $L_p$ is a fixed line of $\sigma$ (i.e., whenever $z \in L_p$, $\sigma z \in L_p$).
Property (iii) Under the real structure given by $\sigma$ on $P_3(\mathbb{C})$, the real form of $Q_4$ admits the signature (5, 1).

We note that property (ii) leads to the fibering:

$$P_3(\mathbb{C}) \xrightarrow{\ \pi\ } S^4 \tag{10.3.40}$$

⁹ $Q_4$ has 3 plus signs and 3 minus signs when it is diagonalized (i.e., when it is expressed as a quadratic form).


by assigning to $z \in P_3(\mathbb{C})$ the point $p$ corresponding to the line through $z$ and $\sigma z$. The fiber naturally is $P_1(\mathbb{C})$, and for $p \in S^4$ one has $L_p = \pi^{-1}(p)$. These lines are called the real lines of P3 (
(iii) $H^0(\mathcal{E}) = 0$
(iv) $H^1(\mathcal{E}(-2)) = 0$
(v) $\mathcal{E}|_L = \mathcal{O}_L \oplus \mathcal{O}_L$ for some line $L$ in $P_3$.

Property (v) says that the vector bundle $\mathcal{E}$, when restricted to a line, can be expressed as a direct sum of copies of the line bundle $\mathcal{O}_L$. Note that this definition is based on cohomology groups of $\mathcal{E}$, and it can be used to give more information in this direction¹⁰. For instance it can be shown that:

$H^0(\mathcal{E}(k)) = 0$ for all $k \le 0$
$H^1(\mathcal{E}(k)) = 0$ for all $k \le -2$
$H^2(\mathcal{E}(k)) = 0$ for all $k \ge -2$
$H^3(\mathcal{E}(k)) = 0$ for all $k \ge -4$

It can also be checked that $\mathcal{E}$ is self-dual. (See Barth in [21].)

Exercise 10.3 1. Show that H can be identified with
P5(
[*] = [*]}

¹⁰ $H^i(\mathcal{E}(k))$ = $i$th cohomology group of the vector bundle $\mathcal{E}(k)$, where $k \in \mathbb{Z}$.


between the real quadric $Q_4$ and $S^4$ can be established.
4. Prove property (ii) of the involutive mapping $\sigma: P_3(\mathbb{C}) \to P_3(\mathbb{C})$ given in Result (10.3.7).
5. Obtain a coordinate expression of $\pi|U_0$ for the fiber bundle (10.3.40).

Hints to Exercise 10.3

1. Note that just as in the case of a complex number $z = z_1 + iz_2$ the first term is real and the second is imaginary, in the case of a quaternion $x = x_1 + x_2 i + x_3 j + x_4 k$ the term $x_1$ is real and $x - x_1$ is imaginary. Now if $i$ is treated as the usual complex number, then $\mathbb{C}$ can be viewed as contained in $\mathbb{H}$ (with $x_3 = x_4 = 0$). Moreover, every quaternion $x$ can be uniquely expressed as $x = y_1 + y_2 j$ where $y_1 = x_1 + x_2 i$ and $y_2 = x_3 + x_4 i \in$ axb with $x, a, b$ all quaternions ($a \ne 0$, $b \ne 0$) generate the rotation group $SO(4)$ together with changes of scale. We then examine the invariance of the instanton $A(x)$ given in (10.3.20) under the following transformations: (i) $x \to x^{-1}$, (ii) $x \to ax$, (iii) $x \to xa$. The replacement of $\bar{x}$ by $\bar{x}^{-1}$, of $dx$ by $d(x^{-1})$, and of $|x|^2$ by $1/|x|^2$ in (10.3.20) verifies (i). For (ii) we write $ax$ in place of $x$ in (10.3.20) and note that $\overline{(ax)} = \bar{x}\,\bar{a}$ and $d(ax) = a\,dx$; thus $A(x)$ is invariant if we assume that $|a|^2 = 1$. In the case of (iii),

$$A(x) \to \mathrm{Im}\left\{\frac{\bar{a}\,\bar{x}\,dx\,a}{1 + |\bar{a}\,\bar{x}\,x\,a|}\right\},$$

which implies that $x \to xa$ produces only a constant gauge transformation.

3. We use the coordinates $(p_{ij})$ defined in (10.3.37) to write a set of homogeneous coordinates $(x_0 \cdots x_5)$ in $P_5(\mathbb{C})$ given by the transformations: (a)

$$\begin{aligned} x_0 &= p_{01} + p_{23}, & x_2 &= p_{13} + p_{02}, & x_4 &= p_{03} - p_{12},\\ x_1 &= p_{01} - p_{23}, & x_3 &= i(p_{13} - p_{02}), & x_5 &= i(p_{03} + p_{12}). \end{aligned}$$

It can be easily checked that the quadric $Q_4$ in these coordinates can be written as:

$$x_0^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 \tag{b}$$

since all squared terms $(p_{ij})^2$ cancel in pairs. When $[x] = [(x_0 \cdots x_5)] = [\bar{x}] = [(\bar{x}_0 \cdots \bar{x}_5)]$, (b) represents a real equation; hence the 4-sphere $S^4$ and the quadric $Q_4$ can be identified as:

$$S^4 = Q_4 \cap \{[x] \in P_5(\mathbb{C}) : [x] = [\bar{x}]\} \tag{c}$$

and for $x \in S^4$, $x_0 \ne 0$ automatically.
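The algebra behind (b) can be spot-checked numerically. The sketch below is illustrative only: the function names are hypothetical and the sample points are random complex 4-vectors. It verifies that Plücker coordinates built from (10.3.37) satisfy (10.3.38), and that under the change of coordinates (a) the expression $x_0^2 - (x_1^2 + \cdots + x_5^2)$ equals four times the left-hand side of (10.3.38) for arbitrary $p_{ij}$, which is why the two forms of the quadric agree.

```python
import random

PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]


def plucker(x, y):
    """Pluecker coordinates p_ij = x_i y_j - x_j y_i of the line through x and y."""
    return {(i, j): x[i] * y[j] - x[j] * y[i] for (i, j) in PAIRS}


def quadric_form(p):
    """Left-hand side of (10.3.38)."""
    return p[0, 1] * p[2, 3] - p[0, 2] * p[1, 3] + p[0, 3] * p[1, 2]


def to_x(p):
    """Change of coordinates (a) of the hint: (p_ij) -> (x_0, ..., x_5)."""
    return [p[0, 1] + p[2, 3], p[0, 1] - p[2, 3],
            p[1, 3] + p[0, 2], 1j * (p[1, 3] - p[0, 2]),
            p[0, 3] - p[1, 2], 1j * (p[0, 3] + p[1, 2])]


def sphere_defect(xs):
    """x_0^2 - (x_1^2 + ... + x_5^2); vanishes exactly when (b) holds."""
    return xs[0] ** 2 - sum(c ** 2 for c in xs[1:])


rng = random.Random(0)


def random_vector(n):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


x, y = random_vector(4), random_vector(4)
p = plucker(x, y)
print(abs(quadric_form(p)), abs(sphere_defect(to_x(p))))
```

Both printed numbers should be zero up to rounding error.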


4. Let $z \in L_p$ define the homogeneous coordinates $(p_{ij})$ of the point $p \in P_5(\mathbb{C})$; thus replacing $y$ by $z$ in (10.3.37) we have

$$p_{ij} = x_i z_j - x_j z_i, \qquad i < j.$$

The line $L_p$ is said to be fixed under $\sigma$ if $\sigma z \in L_p$. Replacing $z$ by $\sigma z$ and $x$ by $\sigma x$ we have the following transformations of the $p$'s:

(a) $p_{01} \to \bar{p}_{01}$, $p_{23} \to \bar{p}_{23}$, $p_{02} \to \bar{p}_{13}$, $p_{13} \to \bar{p}_{02}$, $p_{03} \to -\bar{p}_{12}$, and $p_{12} \to -\bar{p}_{03}$.

The coordinates $(p_{ij})$ satisfy

(b) $p_{01}p_{23} - p_{02}p_{13} + p_{03}p_{12} = 0$.

The corresponding elements using the involution $\sigma$, when substituted in (b), give:

(c) $\bar{p}_{01}\bar{p}_{23} - \bar{p}_{13}\bar{p}_{02} + (-\bar{p}_{12})(-\bar{p}_{03}) = 0$.

They will both represent the same quadric $Q_4$ only if $p \in S^4$, and this follows from (c) of the Hint to Exc. 3. Conversely, equality of (b) and (c) leads to the correspondence (a) and thus to the fact that both z and
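$$w_\mu = \frac{x_{\mu+1}}{x_0 + x_1} \qquad \text{for } \mu = 1, \ldots, 4.$$

Now the (standard) coordinates on the $P_3(\mathbb{C})$ neighbourhood $U_i$ are $z_\nu/z_i$, $z_i \ne 0$; in particular on $U_0$ they are $z_\nu/z_0$ ($\nu = 1, 2, 3$). Our effort should be to express $w_\mu$ in terms of these. We use the definition of $x_0 \cdots x_5$ given in (a) of the Hint to Exercise 3 to write:

$$w_1 = \frac{p_{13} + p_{02}}{2p_{01}}, \quad w_2 = \frac{i(p_{13} - p_{02})}{2p_{01}}, \quad w_3 = \frac{p_{03} - p_{12}}{2p_{01}}, \quad w_4 = \frac{i(p_{03} + p_{12})}{2p_{01}}. \tag{b}$$

Using $p^*_{ij}$ to denote $p_{ij}/p_{01}$, (b) can be written as:

$$w_1 = \tfrac{1}{2}(p^*_{13} + p^*_{02}), \quad w_2 = \tfrac{1}{2}\,i(p^*_{13} - p^*_{02}), \quad w_3 = \tfrac{1}{2}(p^*_{03} - p^*_{12}), \quad w_4 = \tfrac{1}{2}\,i(p^*_{03} + p^*_{12}). \tag{c}$$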
From (10.3.37) we know that the $p_{ij}$'s, and therefore the $p^*_{ij}$'s, are functions of the coordinates of points z e P3 (

$$\begin{aligned} p_{01}(z) &= |z_0|^2 + |z_1|^2, & p_{02}(z) &= -z_0\bar{z}_3 + z_2\bar{z}_1,\\ p_{03}(z) &= z_0\bar{z}_2 + z_3\bar{z}_1, & p_{12}(z) &= -z_1\bar{z}_3 - z_2\bar{z}_0,\\ p_{13}(z) &= z_1\bar{z}_2 - z_3\bar{z}_0, & p_{23}(z) &= |z_2|^2 + |z_3|^2. \end{aligned} \tag{d}$$

(c) and (d) taken together give the coordinate expressions of $\pi|U_0$.
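A short numerical experiment makes the projection concrete. The sketch below rests on two assumptions that should be read as such: the involution is taken to be $\sigma(z_0, z_1, z_2, z_3) = (-\bar{z}_1, \bar{z}_0, -\bar{z}_3, \bar{z}_2)$, which is the form consistent with (d), and $\pi(z)$ is computed from the Plücker coordinates of the line through $z$ and $\sigma z$ using (b) of this hint. The function names are hypothetical.

```python
def sigma(z):
    """Assumed anti-linear involution on homogeneous coordinates of P_3(C)."""
    z0, z1, z2, z3 = z
    return (-z1.conjugate(), z0.conjugate(), -z3.conjugate(), z2.conjugate())


def plucker(x, y):
    """Pluecker coordinates p_ij = x_i y_j - x_j y_i, i < j."""
    pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    return {(i, j): x[i] * y[j] - x[j] * y[i] for (i, j) in pairs}


def projection(z):
    """Affine coordinates (w_1, ..., w_4) of pi(z), following (b) of the hint.

    Uses the line through z and sigma(z); requires p_01 != 0, i.e. (z_0, z_1) != 0.
    """
    p = plucker(z, sigma(z))
    d = 2 * p[0, 1]
    return ((p[1, 3] + p[0, 2]) / d, 1j * (p[1, 3] - p[0, 2]) / d,
            (p[0, 3] - p[1, 2]) / d, 1j * (p[0, 3] + p[1, 2]) / d)


z = (1 + 2j, 0.5 - 1j, -0.3 + 0.7j, 2 - 0.1j)
w = projection(z)
print(w)
```

Under these assumptions the four values of `w` come out real up to rounding, `projection(sigma(z))` gives the same point, and so does `projection` applied to any non-zero complex multiple of `z`. This is the statement that the whole line through $z$ and $\sigma z$ is sent to a single point of $S^4$.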

4 MORE ON MONOPOLES

In Sec. (10.2) we defined the term "monopole" as a solution to the finite-action Yang-Mills-Higgs equations that result from a 3-dimensional set-up of the theory. We also dealt there briefly with Dirac's magnetic monopole and mentioned in passing its relation to monopoles of a general nature. We pursue those ideas here, giving the required mathematical details. To this end, we define the most general action (10.4.3) on the Yang-Mills-Higgs configuration space $C_H$ (see (10.4.1)) and obtain the Lagrangian equations (10.4.4). These in turn simplify to the differential equations (10.4.6) when the Bianchi identities are applied. We then show that the Higgs field $\phi$ resulting from dimensional reduction of $\mathbb{R}^n$ to $\mathbb{R}^{n-1}$ satisfies the Bogomolny equation (Result (10.4.1)). The appropriate decay conditions (10.4.12) are then used to define $M_k$, the solution spaces of monopoles with topological charge $k$. Explicit construction of monopoles is illustrated in Exc. (10.4.1) and in Exp. (10.4.5). The latter half of the section is devoted to questions concerning the smoothness and metric of moduli spaces of monopoles (Results (10.4.2) and (10.4.4)). The role of algebraic geometry in the study of these spaces is also explained in brief (Donaldson's Thm. (10.4.3)). The above theorem is then used to show that $M_k$ is hyper-Kähler (Exc. (10.4.2)).

4.1 Yang-Mills-Higgs' Configuration Space

Using the terminology introduced in Sec. (6.6) and in the previous section, we define the Yang-Mills-Higgs configuration space as:

$$ C_H := \mathcal{A}(P) \times \Gamma(\mathrm{ad}\,P) \tag{10.4.1} $$

where $\mathcal{A}(P)$ is the space of gauge potentials (connections) on a principal bundle $P(M, G)$ and $\Gamma(\mathrm{ad}\,P)$ is the group of sections of the associated bundle $P \times_{\mathrm{ad}} \mathfrak{g} = E(M, \mathfrak{g}, \mathrm{ad}, P)$, the Higgs bundle. In other words, $C_H$ can be considered as contained in:

$$ C_H \subset \{(A, \phi) \in \Lambda^1(P, \mathfrak{g}) \times \Lambda^0(M, \mathrm{ad}\,P)\}, \tag{10.4.2} $$

where $A$ in $(A, \phi)$ is a gauge potential and $\phi$ is a section of the Higgs bundle $\mathrm{ad}\,P$. From the above description it is evident that $\phi$ is the Higgs field in the adjoint representation. The metric on the manifold $M$, which is assumed to be compact, and a fixed bi-invariant metric on the (compact semi-simple) gauge group $G$ allow one to define a Yang-Mills-Higgs action with self-interaction potential $V : \mathbb{R}_+ \to \mathbb{R}_+$ on the configuration space $C$*:

$$ \mathcal{A} = \mathcal{A}_{YMH}(A, \phi) = c(M) \int_M \left[\, |F_A|^2 + c_1 |D_A\phi|^2 + c_2 V(|\phi|^2) \,\right]. \tag{10.4.3} $$

* $C$ denotes the restriction of $C_H$ to pairs with finite action, $\mathcal{A}_{YMH}(A, \phi) < \infty$. The reader should note that the space $C$ introduced in Subsec. 4.2 is more restrictive than the $C$ defined here.


Note that (10.4.3) is a generalized version of $\mathcal{A}_{YMH}$ (10.2.3); here we have used the norm notation for the inner product to indicate that an ordinary vector space may be replaced by a Sobolev space. We have also given the coupling constants explicitly: the constant $c_1$ measures the relative strengths of the gauge field and its interaction with the Higgs field and, when $c_1 \neq 0$, the constant $c_2/c_1$ measures the relative strengths of the Higgs field self-interaction and the gauge field-Higgs field interaction. The constant $c(M)$ depends on the dimension of the manifold $M$; for a 4-dimensional $M$, $c(M) = 1/8\pi^2$. Thus $c(M)$ is a normalizing constant. The critical points $(A, \phi)$ of the action (10.4.3) satisfy:

$$ *D_A * F_A + c_1 [\phi, D_A\phi] = 0 \tag{10.4.4a} $$

$$ *D_A * D_A\phi + c_2 V'(|\phi|^2)\,\phi = 0 \tag{10.4.4b} $$

where $V'(x) = dV/dx$. These equations are called the Yang-Mills-Higgs field equations with self-interaction potential $V$. It is to be noted here that in (10.4.4) $*D_A*$ is the formal $L^2$-adjoint of the corresponding map $D_A$. Thus in (10.4.4a) it is a map $*D_A* : \Lambda^2(M, \mathrm{ad}\,P) \to \Lambda^1(M, \mathrm{ad}\,P)$, and in (10.4.4b) it is $*D_A* : \Lambda^1(M, \mathrm{ad}\,P) \to \Lambda^0(M, \mathrm{ad}\,P) = \Gamma(\mathrm{ad}\,P)$. Also note that the second term in (10.4.4a) is the negative of the current $J$ as given in (10.2.7), and that the pair $(A, \phi)$ satisfies the Bianchi identities:

$$ D_A F_A = 0 \tag{10.4.5a} $$

$$ D_A(D_A\phi) = [F_A, \phi], \tag{10.4.5b} $$

irrespective of whether or not it is a solution of (10.4.4). In view of the result which asserts that the solutions to (10.4.4) on $\mathbb{R}^n$ always satisfy

$$ (n - 4)|F_A|^2 + (n - 2)\, c_1 |D_A\phi|^2 + n\, c_2 V(|\phi|^2) = 0, $$

it follows that there are no nontrivial solutions for $n > 4$ when $c_1 > 0$ and $c_2 > 0$ (see [32], [35]), since every term on the left is then non-negative. In the case $n = 4$ every solution decouples, which means that it is equivalent to a pure Yang-Mills solution, a fact that we already know from Sec. 2.

4.2 Bogomolny Equations

When $M$ is 3-dimensional and $c_2 = 0$ (the reason for choosing $c_2 = 0$ is explained below), the Yang-Mills-Higgs equations (10.4.4) (which are of second order) can be associated to the first order equations:

$$ F_A = \pm * D_A\phi \tag{10.4.6} $$

known as the Bogomolny equations [16]. For obvious reasons these are also referred to as "monopole" equations. These equations, as we shall see next, are obtained only when the Higgs field $\phi$ satisfies appropriate decay conditions at infinity. From now on let $M$ be $\mathbb{R}^3$ and the gauge group $G$ be $SU(2)$, and let $C_H$ stand for the pairs $(A, \phi)$ in $C$ which is orthogonal to the $SU(2)$-orbit through $(A, \phi)$. This means that a tangent vector $\dot{c} = (\dot{A}, \dot{\phi})$ satisfies $d_A * \dot{A} + [\phi, \dot{\phi}] = 0$, which in turn defines a Riemannian metric $h$:12

$$ h(\dot{c}, \dot{c}) = \int_{\mathbb{R}^3} (\dot{A}, \dot{A}) + (\dot{\phi}, \dot{\phi}) \tag{10.4.7} $$

and a potential function $U$:

$$ U = \tfrac{1}{2} \int_{\mathbb{R}^3} (F, F) + (D\phi, D\phi) \tag{10.4.8} $$

on the space $C$. If we now consider the motion of a particle on the infinite-dimensional manifold $C$ with potential $U$, we note that the particle evolves as though it is a representative of a solution to the Yang-Mills-Higgs equations with $c_2 = 0$ (see (10.4.3)). To understand this analogy, however, we have to translate it into simpler mathematics. Let $M'$ denote the submanifold of $C$ on which $U$ attains its minimum; then, if the motion of the particle is initially tangential to $M'$, the variation of $U$ in the subsequent motion is small, and hence the motion is determined by the kinetic energy term (10.4.7). This in turn yields geodesic motion on $M'$ with respect to the induced metric. Our next step, therefore, is to find the absolute minimum of $U$ on $C$ and to determine the induced metric. For this we consider those $SU(2)$ connections $A$ and Higgs field
\2)2. Further we rewrite the integrand in (10.4.8) as $(F, F) + (D\phi, D\phi) = (F - *D\phi,\, F - *D\phi) + 2(*D\phi, F)$ and integrate over the ball $B_R$ of radius $R$:

$$ \int_{B_R} \left[ (F - *D\phi,\, F - *D\phi) + 2(*D\phi, F) \right]. \tag{10.4.9} $$

Then, in view of the Bianchi identity $DF = 0$, which implies $d(\phi, F) = (D\phi, F) = *(*D\phi, F)$, it follows that the second term in the above integral is a surface integral of the 2-form $(\phi, F)$ over the sphere $S_R$ of radius $R$; more explicitly:

$$ \int_{B_R} 2(*D\phi, F) = 2 \int_{S_R} (\phi, F). \tag{10.4.10} $$

12 Note that with this choice of $M$ and $G$ we have again returned to the inner product $(\,,)$.


From our assumption on $\phi$ ($|\phi| \to 1$ as $R \to \infty$) and from the definition of $C$ to which $(A, \phi)$ belongs, we can take the 2-dimensional complex vector bundle with connection associated to $A$; the eigenspaces of $\phi$ in this set-up are complex line bundles over $S_R$ and as such have Chern classes $\pm k$. The curvature of these line bundles (with decay condition $|D\phi| = O(r^{-2})$) approaches the projection of the curvature $F$ onto them. Hence with $|\phi| \to 1$ we have:

$$ \lim_{R \to \infty} \int_{S_R} (\phi, F) = \pm 4\pi k. $$

Accordingly $U$ can be written as:

$$ U = \tfrac{1}{2} \int_{\mathbb{R}^3} |F - *D\phi|^2 \pm 4\pi k. $$

Therefore if $k > 0$, the absolute minimum of $U$ is $4\pi k$ and it occurs when the pair $(A, \phi)$ satisfies the Bogomolny equation (10.4.6) (with + sign). The integer $k$ coming from the Chern class is called the "charge" of the solution. In the result given below, we shall see that under suitable conditions a Yang-Mills-Higgs model in $\mathbb{R}^{n-1}$ can be viewed as coming from a pure Yang-Mills model in $\mathbb{R}^n$; hence it is appropriate to say that the Bogomolny equations arise from dimensional reduction to $\mathbb{R}^3$ of the self-dual Yang-Mills equations in $\mathbb{R}^4$. We establish this result for an arbitrary $\mathbb{R}^n$.

Result 10.4.1 Every pure Yang-Mills system in $\mathbb{R}^n$ whose gauge field (connection) is independent of one coordinate reduces to a Yang-Mills-Higgs model in $\mathbb{R}^{n-1}$.

Proof: Let the gauge field $\hat{A} = (A_1, \dots, A_n)$ be independent of $x^n$ and let $\hat{F}$ be the corresponding curvature. Write

$$ F_{jn} = \partial_j A_n + [A_j, A_n] \qquad (j = 1, 2, \dots, n-1), $$

and define $\phi = A_n$; then, since $D_A\phi = d\phi + [A, \phi]$, we have $F_{jn} = (D_A\phi)_j$. Hence if we write a 2-form $F$ in terms of its local components as

$$ F = \tfrac{1}{2} \sum F_{ij}\, dx^i \wedge dx^j, $$

then $\hat{F} = (F, F_{jn})$. Hence the pure Yang-Mills action density $\tfrac{1}{2} |\hat{F}|^2_{(\mathbb{R}^n)}$, when expressed in terms of $F$ and $F_{jn}$, gives the action density per unit $x^n$ as

$$ \tfrac{1}{2} |F_{ij}|^2_{(\mathbb{R}^{n-1})} + \tfrac{1}{2} |D_A\phi|^2_{(\mathbb{R}^{n-1})}. $$

In view of (10.2.3) we know that this is the Yang-Mills-Higgs action density for $\lambda = 0$. Recall that in (10.2.3) the Higgs field $\phi$ was defined on the vector space of the Lie algebra $\mathfrak{g}$ of $G$ and the action there came from the adjoint action of $G$, whereas here $A_j$ ($j = 1, 2, \dots, n-1$) and $\phi$ are Lie-algebra-valued $C^\infty$-functions on $\mathbb{R}^{n-1}$. ∎


It should be mentioned here that the pure Yang-Mills equations in this case are invariant under $x^n$-translation.
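As a concrete sanity check on the first order equations (10.4.6), one can test the familiar charge-one solution numerically. The sketch below is an illustration under stated assumptions, not the construction used in this section: it assumes the spherically symmetric (hedgehog) form of the fields with units chosen so that the coupling and the asymptotic Higgs value are 1, for which the Bogomolny equation is known to reduce, up to sign conventions, to the radial system $K' = -K h$, $h' = (1 - K^2)/r^2$. The profile functions $h(r) = \coth r - 1/r$ and $K(r) = r/\sinh r$ and all function names are assumptions of this example.

```python
import math

def h(r):
    # Assumed Higgs profile of the charge-one solution: coth(r) - 1/r
    return 1.0 / math.tanh(r) - 1.0 / r

def K(r):
    # Assumed gauge-field profile: r / sinh(r)
    return r / math.sinh(r)

def derivative(f, r, eps=1e-5):
    # Central finite difference approximation of f'(r)
    return (f(r + eps) - f(r - eps)) / (2.0 * eps)

def bps_residuals(r):
    # Residuals of the radial first-order system K' = -K h, h' = (1 - K^2)/r^2
    res_K = derivative(K, r) + K(r) * h(r)
    res_h = derivative(h, r) - (1.0 - K(r) ** 2) / r ** 2
    return res_K, res_h

for r in (0.5, 1.0, 2.0, 5.0):
    print(r, bps_residuals(r))
```

Both residuals vanish up to finite-difference error, and $h(r) \to 1$ as $r \to \infty$, in line with the decay assumption $|\phi| \to 1$ used above.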

4.3 The Solution Space of Monopoles

We begin here with the differential equation of monopoles,

$$ D_A\phi = D\phi = *F, \tag{10.4.11} $$

and study in brief the solution space of these monopoles. Our exposition here is based mainly on [10], and the reader is advised to refer to it for details and to other original papers, e.g., [29] and [41]. Recall that the above equation is obtained by imposing decay conditions on $|\phi|$ and $|D\phi|$:

$$ |\phi| = 1 - \frac{k}{2r} + O(r^{-2}), \qquad \frac{\partial |\phi|}{\partial \Omega} = O(r^{-2}), \qquad |D\phi| = O(r^{-2}) \tag{10.4.12} $$

and re-emphasize that they are all consequences of the finiteness of the energy (see [32] for this proof). In the asymptotic conditions listed above, $\partial/\partial\Omega$ refers to angular derivatives and $k$ to the magnetic charge of the monopole (in appropriate units). As mentioned earlier, $k$ is a positive integer that gives the degree of the map (see also Exp. (10.2.8))

$$ \frac{\phi}{|\phi|} : S^2_R \to S^2, $$

which is defined on a sphere (of large radius) in $\mathbb{R}^3$, and which takes values on a unit sphere in the Lie algebra of $SU(2)$ (see (10.4.2) to verify that $\phi$ is Lie algebra valued). Next let $N_k$ denote the moduli (or parameter) space of gauge-equivalent monopoles, and enlarge it by a circle or phase factor to obtain another space $M_k$. The easy way to do this is to use the following steps: 1. fix a direction in $\mathbb{R}^3$, say $x_1$; then 2. use the gauge $A_1 = 0$; and 3. allow only those gauge transformations that tend to the identity as $x_1 \to \infty$. Note that in view of the decay conditions (10.4.12) it is immaterial whether these conditions are imposed on all lines parallel to $x_1$ or on the $x_1$-axis only. The space $M_k$ obtained through this procedure is fibered over $N_k$ with fiber $S^1$, and it depends on the chosen direction, which means that it depends on the distinguished point, say $*$, on the sphere at infinity.13 From another description of $M_k$ given in Exc. 1 it is evident that on $M_k$ the Euclidean group acts as well as the group $U(1)$, the group of automorphisms of the bundle $E_k$. Hence, dividing out by $U(1)$ and the translations of $\mathbb{R}^3$, we obtain another space which parametrizes monopoles up to translations $T(\mathbb{R}^3)$:

$$ M_k^0 = \frac{M_k}{T(\mathbb{R}^3) \times U(1)} = \frac{N_k}{T(\mathbb{R}^3)}. \tag{10.4.13} $$

13 For another bundle theoretic approach to the definition of $M_k$, see Exc. 1 and also [35] or [10] for more details.


The space $M_k^0$ can also be considered as the parametrization space of monopoles with fixed centre, since every monopole has a well-defined centre (see Chapter 2 of [10]). For $k = 1$, $M_k^0$ reduces to a point: the unique monopole of the system, known as the 'Prasad-Sommerfield monopole' [41]. The space $M_1$ simplifies to $M_1 \cong \mathbb{R}^3 \times S^1$ in this case. Having formally defined these moduli spaces of monopoles, our next task is to learn about their qualitative features via analysis and geometry, in other words to obtain the 'space' where these monopoles reside (see the formal statement of Result (10.4.2) at the end of this subsection). The first problem in this respect is to establish the existence of a solution to the first order equation $*F = D_A\phi$. This is done by choosing a pair $(A_0, \phi_0)$ which is $C^\infty$ and satisfies the following conditions: (i) the action involving $(A_0, \phi_0)$ is finite; (ii) $(\nabla_{A_0})_i (\phi_0) = \nabla_i \phi_0 + [(A_0)_i, \phi_0]$.

(10.4.14a)

Next we define a pair $(a, \psi) = (A - A_0, \phi - \phi_0)$ which transforms under the adjoint representation of the gauge group. It can then be shown that the equations for $(a, \psi)$ have a solution, provided $(A_0, \phi_0)$ itself is sufficiently close to a solution. This happens if each component $B_{0i}$ of the deviation from the Bogomolny equation for the pair $(A_0, \phi_0)$ has a small norm in some $L^p$-space. Writing $a = \sum a_i\, dx^i$ and $(\nabla_{A_0})_i = \nabla_i + [(A_0)_i, \cdot\,]$, the equation $*F = D_A\phi$ can be expressed in terms of $(a, \psi)$ and $(A_0, \phi_0)$ as:

$$ \{\varepsilon_{ijk} (\nabla_{A_0})_j a_k - (\nabla_{A_0})_i \psi + [\phi_0, a_i]\} + [\psi, a_i] + \varepsilon_{ijk}\, a_j a_k = B_{0i}. \tag{10.4.14b} $$

The system (10.4.14b), which is not elliptic (see the App.), is made into one by making a gauge choice that subjects $(a, \psi)$ to the gauge condition:

$$ (\nabla_{A_0})_i\, a_i + [\phi_0, \psi] = 0. \tag{10.4.15} $$

Equations (10.4.14b) and (10.4.15) taken together define an elliptic system. We make these two equations into one by using unit quaternions ($1, q_j$: $j = 1, 2, 3$; $q_j q_k = -\delta_{jk} - \varepsilon_{jkl}\, q_l$, $q_k^* = -q_k$). The quaternions $q_j$ are $2 \times 2$ anti-Hermitian matrices that commute with the Lie algebra $\mathfrak{g}$; hence we can define a $\mathfrak{g}$-valued quaternion $\Phi = \psi + q_j a_j$ along with a first order elliptic operator $D_0 = -q_j (\nabla_{A_0})_j + [\phi_0, \cdot\,]$. The two equations are now combined to give one first order elliptic equation with elliptic operator $D_0$:

$$ D_0 \Phi + \Phi \wedge \Phi = q_j B_{0j} = B_0, \tag{10.4.16} $$

where $\wedge$ stands (not for the exterior product but) for the symmetric product on the $\mathfrak{g}$-valued quaternion algebra. For instance, for $a' = a'_0 + a'_j q_j$, $b' = b'_0 + b'_j q_j$:

$$ a' \wedge b' = b' \wedge a' = \tfrac{1}{2}\, q_i \{ -[a'_i, b'_0] - [b'_i, a'_0] + \varepsilon_{ijk} (a'_j b'_k + b'_j a'_k) \} \tag{10.4.17} $$

and accordingly

$$ \Phi \wedge \Phi = -q_i [a_i, \psi] + q_i\, \varepsilon_{ijk}\, a_j a_k. \tag{10.4.18} $$

Associated with the pair $(A_0, \phi_0)$ is the Hilbert space completion of the space $C_0^\infty$ formed by smooth, compactly supported sections of the vector bundle $\mathbb{R}^3 \times (\mathfrak{g} \otimes Q)$, where $\mathfrak{g}$ is the Lie algebra and $Q$ is the quaternion space; we shall denote this completion as $H(A_0, \phi_0) = H$. Let H be the space that is obtained from an (approximate) solution (Ao, 0, $|F_0| \to 0$ as $|x| \to \infty$ uniformly. Furthermore suppose that (1 + || 0. (See Chap. IV in [32].) Having obtained $\Phi$, in effect we have established the required result we were looking for:

Result (10.4.2). The pair $(A, \phi) = (A_0 + a, \phi_0 + \psi)$ is a smooth, finite action solution to $*F = D_A\phi$.

(The above result is a corollary to Thm. (2.1) in [32]; the conditions laid down in (10.4.14a) are based on Thm. (2.1).)
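The quaternion relations used in this subsection to combine (10.4.14b) and (10.4.15) can be verified on an explicit matrix model. The sketch below assumes the representation $q_j = i\sigma_j$ in terms of the Pauli matrices $\sigma_j$; this is one choice compatible with the stated relations, not a representation fixed by the text, and the helper names are hypothetical. It checks $q_j q_k = -\delta_{jk} - \varepsilon_{jkl}\, q_l$ and $q_k^* = -q_k$.

```python
def matmul(a, b):
    # Product of two 2x2 matrices given as nested lists
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    # Hermitian conjugate of a 2x2 matrix
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Assumed model of the unit quaternions: q_j = i * sigma_j
Q = [[[1j * x for x in row] for row in s] for s in PAULI]

def eps(j, k, l):
    # Levi-Civita symbol on the indices 0, 1, 2
    return (j - k) * (k - l) * (l - j) // 2

def check_relations():
    for j in range(3):
        # Anti-Hermitian property q_j^* = -q_j
        if not close(dagger(Q[j]), [[-x for x in row] for row in Q[j]]):
            return False
        for k in range(3):
            rhs = [[-(1 if j == k else 0) * (1 if r == c else 0)
                    - sum(eps(j, k, l) * Q[l][r][c] for l in range(3))
                    for c in range(2)] for r in range(2)]
            if not close(matmul(Q[j], Q[k]), rhs):
                return False
    return True

print(check_relations())
```

With the opposite choice $q_j = -i\sigma_j$ the sign of the $\varepsilon$ term flips, so the sign convention in the multiplication rule determines which model applies.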

4.4 The Scattering and Spectral Curve

Our next result on monopoles stems from the ordinary differential operator

$$ D_v = \nabla_v - i\phi \tag{10.4.19} $$

acting on sections of the monopole bundle, say $E$, over a fixed oriented line $v$. Here $\nabla_v$ is the covariant derivative along $v$, and the operator $D_v$ can be seen to define a scattering $S(v)$ via solutions to the differential equation $D_v s(t) = 0$, $t$ being a linear parameter on the line $v$. The equation has two linearly independent solutions $s_0, s_1$; these can be picked in such a manner that as $t \to \infty$ they give rise to two constants $(e_0, e_1)$ in the asymptotic gauge: $s_0(t)\, t^{-k/2} e^{t} \to e_0$, $s_1(t)\, t^{k/2} e^{-t} \to e_1$. The constants $(e_0, e_1)$ form a basis of the fiber of $E$ at $+\infty$. Note that $s_0$ is exponentially decaying as $t \to \infty$. Similarly there is a solution $s'_0$ (which we write as $s'_0 = a s_0 + b s_1$) that decays exponentially as $t \to -\infty$ and is unique up to a constant. The ratio $a/b$ is independent of the choice of $s'_0$, and it is this ratio that defines the scattering $S(v)$ associated to the line $v$ and the operator $D_v$. Thus

$$ S(v) = a/b \in P_1(\mathbb{C}). \tag{10.4.20} $$

Evidently this depends on the basis $(e_0, e_1)$. Next consider a fixed isomorphism

$$ \mathbb{R}^3 = \mathbb{R} \times \mathbb{C}, $$

where the coordinates $(x_1, x_2, x_3)$ on $\mathbb{R}^3$ are treated as $x_1 = t$, the parameter, and $x_2, x_3$ as $x_2 + i x_3 = z \in$

onto the space $R_k$ of all such rational functions. Consider the rational function $S_m(z)$ corresponding to the monopole $m$; note that the poles of this function arise when $b = 0$ in (10.4.20). In view of $s'_0 = a s_0 + b s_1$, this means that in this case the solution $s_0$ decays for $t \to \infty$ as well as for $t \to -\infty$. A line $v(z)$ with this property is called a 'spectral line'. Thus the poles of $S_m(z)$ represent $k$ spectral lines parallel to the $x_1$-axis. The subspace consisting of all spectral lines is called the 'spectral curve'. When $S(z) = S_m(z)$ (which belongs to $R_k$) has simple poles it can be written as:

$$ S(z) = \sum_{i=1}^{k} \frac{a_i}{z - b_i}. \tag{10.4.21} $$

If in addition these poles are far apart, then the monopole associated to $S(z)$ approximates a combination of $k$ simple monopoles having centers at the points $\left(\tfrac{1}{2} \log |a_i|,\ b_i\right)$ and phase angles given by the arguments of $a_i$. When $k = 1$, $S(z)$ becomes:

$$ S(z) = \frac{a}{z - b}. \tag{10.4.22} $$

The standard monopole centered at the origin is given by $b = 0$ and $a = 1$, whereas in general it is located at the point $\left(\tfrac{1}{2} \log |a|,\ b\right)$, the argument of $a$ giving the phase. Thus $M_1$ has the complex structure $M_1 = \mathbb{C} \times \mathbb{C}^*$. When $S(z) \in R_k$ has the 'normal form' it can be written as:

$$ S(z) = \frac{\sum_{i=0}^{k-1} a_i z^i}{\sum_{i=0}^{k} b_i z^i} \tag{10.4.23} $$

where the numerator and denominator have no factor in common. This is true only if the $(2k - 1) \times (2k - 1)$ determinant

$$ \Delta(a, b) = \begin{vmatrix}
a_0 & a_1 & \cdots & a_{k-1} & & \\
 & \ddots & \ddots & & \ddots & \\
 & & a_0 & a_1 & \cdots & a_{k-1} \\
b_0 & b_1 & \cdots & b_{k-1} & b_k & \\
 & \ddots & \ddots & & \ddots & \ddots \\
 & & b_0 & b_1 & \cdots & b_k
\end{vmatrix} $$

is non-zero given that $b_k = 1$. As a result, $R_k$ can be viewed as the open set in $\mathbb{C}^{2k}$ complementary to the algebraic variety $\Delta(a, b) = 0$. From our discussions in this subsection and in view of Donaldson's theorem, it follows that $M_k$ is $4k$-dimensional, and that $M_k$ as well as $M_k^0$ have a 2-parameter family of complex structures, one for each point of $S^2$ representing the preferred axis in $\mathbb{R}^3$. (See Chapters 2 and 16 of [10] for this subsection.)
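The coprimality condition on the normal form (10.4.23) can be tested with the classical Sylvester resultant. The sketch below is a minimal illustration under stated assumptions: coefficients are taken to be integers or rationals so that exact arithmetic can be used, the rows are written with coefficients in descending powers (the usual Sylvester layout), and the function names are hypothetical. For a numerator of degree $k - 1$ and a monic denominator of degree $k$ the matrix has size $2k - 1$, and the determinant vanishes exactly when the two polynomials share a root.

```python
from fractions import Fraction

def sylvester(a, b):
    # a = [a_0, ..., a_{k-1}], b = [b_0, ..., b_k], both in ascending powers of z
    p = list(reversed(a))          # numerator, descending powers
    q = list(reversed(b))          # denominator, descending powers
    m, n = len(p) - 1, len(q) - 1  # formal degrees
    size = m + n
    rows = []
    for shift in range(n):         # n shifted copies of the numerator
        rows.append([0] * shift + p + [0] * (size - m - 1 - shift))
    for shift in range(m):         # m shifted copies of the denominator
        rows.append([0] * shift + q + [0] * (size - n - 1 - shift))
    return rows

def det(matrix):
    # Exact determinant by Gaussian elimination over the rationals
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return d

def resultant(a, b):
    return det(sylvester(a, b))

# S(z) = 1/(z - 1) + 1/(z + 1) = 2z / (z^2 - 1): coprime, resultant non-zero
print(resultant([0, 2], [-1, 0, 1]))
# (z - 1) / (z^2 - 1): common factor z - 1, resultant zero
print(resultant([-1, 1], [-1, 0, 1]))
```

For $k = 1$ the matrix is $1 \times 1$ and the condition reduces to $a_0 \neq 0$, matching the description $M_1 = \mathbb{C} \times \mathbb{C}^*$.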

4.5 The Metric on $M_k$

In the result given below we prove:

Result 10.4.4 The metric on $M_k$ is complete.

In order to establish this we begin by noting that this is a Riemannian metric defined by the $L^2$-norm of "zero-modes"14 with finiteness property, i.e., with those "zero-modes" that are square-integrable (see Taubes' papers [46] as well as Uhlenbeck's [48] on these concepts). Our presentation here, which is a simplified version of these ideas, follows the reference [10].15 Let $c$ denote an arbitrary $k$-monopole $c = (A, \phi)$, and let $T_c$ be the space of pairs $(a, \psi)$ ($a \in \Lambda^1(\mathfrak{g})$, $\psi \in \Lambda^0(\mathfrak{g})$) which are square integrable and satisfy the equations:

$$ *D_A a - D_A \psi + [\phi, a] = 0 \tag{10.4.24} $$

$$ *D_A * a + [\phi, \psi] = 0 \tag{10.4.25} $$

(we shall see that $T_c = T_c(M_k)$, the tangent space to $M_k$ at the point $c$). It is worth noting here that (10.4.24) is the linearization of the Bogomolny equation, and (10.4.25) shows that $(a, \psi)$ is orthogonal to gauge directions (more precisely, to directions coming from infinitesimal gauge transformations with compact support) (see Exc. (4.2)). We further note that the Higgs field $\phi$ is itself an infinitesimal gauge transformation (non-zero at $\infty$) and it gives rise to a special vector $(D_A\phi, 0)$, which can be seen to satisfy both Eqs. (10.4.24) and (10.4.25). Hence $(D_A\phi, 0)$ belongs to $T_c$. Let $T'_c$ denote the space of all pairs that are orthogonal to $(D_A\phi, 0)$; then evidently $T'_c \subset T_c$ and $\dim T'_c = \dim T_c - 1$. The space $T'_c$ can be identified with the tangent space to the moduli space $N_k$ at the point $[c]$, the gauge-equivalence class represented by $c$ (see Taubes [46(b)]). This identification implies that all directions in $N_k$ can be viewed as square integrable variations. Using this identification, it is not hard to see the isomorphism between $T_c$ and the tangent space $T(M_k)$, since $M_k$ fibers over $N_k$ with $S^1$ as the fiber, and this fiber direction (as we have stated above) corresponds to the infinitesimal gauge transformation given by the Higgs field. Hence $T_c = T(M_k)$, and since from Donaldson's Thm. (10.4.3) $M_k$ is $4k$-dimensional, $T_c$ is $4k$-dimensional as well (see [46(b)] for analytical details).

Our next step is to endow $M_k$ with the metric coming from the $L^2$-norm on $T_c$ and then to show that this $L^2$-metric is complete. To establish the first part, we claim that "$T_c$ is a vector space over the quaternions $\mathbb{H}$."16 Let $\Psi = (a, \psi)$, and suppose that $a = \sum_{i=1}^{3} a_i\, dx^i$ and that $\{I_i\}$ is a basis of $\mathrm{Im}\, \mathbb{H}$; then by definition of $(a, \psi)$, $\Psi$ corresponds to $\psi + \sum_{i=1}^{3} a_i I_i$, where $\{1, I_1, I_2, I_3\}$ is a basis of $\mathbb{H}$. This shows that $T_c$ is a vector space over $\mathbb{H}$. Also, since the Pauli matrices form a basis of $\mathrm{Im}\, \mathbb{H}$, one can see that $\Psi$ is an $su(2) \otimes \mathbb{H}$ valued function. It can also be checked that the Equations (10.4.24) and (10.4.25) are $\mathbb{H}$ invariant. The norm in $T_c$ is given by the norm for quaternions, and since the tangent space $T(M_k)$ can be identified with $T_c$, $M_k$ inherits its metric from this norm. Moreover the basis $\{1, I_1, I_2, I_3\}$ defines isometries in $T_c$ through their action, and since $(I_i)^2 = -1$ ($i = 1, 2, 3$), each one of the $I_i$'s defines an almost complex structure there, showing eventually that $M_k$ has three distinct almost complex structures. In fact these almost complex structures are integrable.17 On the other hand, since from Donaldson's theorem (10.4.3) $M_k$ can be identified with the space $R_k$ of rational functions of a complex variable, $M_k$ is a complex manifold. Thus from the above discussion it follows that $M_k$ is a complex manifold with respect to three complex structures.

14 "zero-modes" = solutions of linearized equations.
15 Our lines of proof are slightly different from the approach in [10].
16 In Subsec. (3.5), while discussing instantons, we have seen the role played by quaternions.

Finally, to prove the completeness of the metric on $M_k$, in essence one has to show that an open curve of finite length in $M_k$ has a limit point. In other words, if one were to write a sequence of lengths in a suitable norm, then this sequence should have a convergent subsequence. For this, consider a parametric representation $c(t) = (A(t), \phi(t))$ of a curve in $M_k$ such that $\dot{c}(t) \in T_{c(t)}$ and $\|\dot{c}(t)\|_2 = 1$ with $0 \le t < s$ (i.e., the tangent vector to $c(t)$ has unit $L^2$-norm in $[0, s)$). Now use the norm $\|\ \|_c$ for pairs $(a, \psi)$ defined by Taubes [46a]:

$$ \|(a, \psi)\|_c^2 = \|\nabla_A a\|_2^2 + \|D_A \psi\|_2^2 + \|[\phi, a]\|_2^2 + \|[\phi, \psi]\|_2^2. \tag{10.4.26} $$

The norm $\|\ \|_c$ on $T_c$ is bounded by the $L^2$-norm; moreover, using the Sobolev inequality, the $L^6$-norm on $T_c$ can also be seen to be bounded by the $L^2$-norm. We now write $b(t) = c(t) - c(0)$; then, since $\dot{c}(t) \in T_{c(t)}$ for all $t$, $b(t)$ is square integrable and we deduce by integration that

$$ \|b(t)\|_p = O(s), \qquad p = 2, 6, $$

therefore by Hölder's inequality, $\|b(t)\|_p = O(s)$ for $2 \le p \le 6$. Using this one derives an estimate: $\|b(t)\|_{c(0)} = O(s)$.
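The interpolation step between the $L^2$- and $L^6$-bounds can be illustrated on a finite sample. The sketch below makes one simplifying assumption that is not in the text: the integrals over $\mathbb{R}^3$ are replaced by finite sums (counting measure), for which Hölder's inequality holds in the same form. It checks $\|b\|_p \le \|b\|_2^{\theta}\, \|b\|_6^{1-\theta}$ with $1/p = \theta/2 + (1 - \theta)/6$; the function names are hypothetical.

```python
import random

def lp_norm(values, p):
    # Discrete l^p norm, standing in for the L^p norm of the text
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def interpolation_bound(values, p):
    # theta solves 1/p = theta/2 + (1 - theta)/6 for 2 <= p <= 6
    theta = (1.0 / p - 1.0 / 6.0) / (1.0 / 2.0 - 1.0 / 6.0)
    return lp_norm(values, 2) ** theta * lp_norm(values, 6) ** (1.0 - theta)

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
for p in (2.0, 2.5, 3.0, 4.0, 5.0, 6.0):
    print(p, lp_norm(sample, p), interpolation_bound(sample, p))
```

Since both endpoint norms are $O(s)$, the bound on the right is $O(s)^{\theta} \cdot O(s)^{1-\theta} = O(s)$, which is the estimate quoted for intermediate $p$.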

This derivation can now be used to write an estimate on the energy. For instance write

$$ \|F_{A(t)} - F_{A(0)}\| = O(s) $$

and replace $t = 0$ by $t = s - \varepsilon$; then for $t, t' > s - \varepsilon$ we have:

$$ \|F_{A(t)} - F_{A(t')}\|_2 = O(\varepsilon). \tag{10.4.27} $$

Finally, to obtain the desired result that $M_k$ is complete, we use the results of Uhlenbeck [48] which assert the existence of a subsequence $\nu_i$ and gauge transformations $g_i$, so that g, (Av., \ → some constant as $|x| \to \infty$ is finite (without this finiteness all arguments will fall through).

17 It can be easily checked that each $I_i$ is covariantly constant and that any quaternion $\sum_{i=1}^{3} a_i I_i$ with unit norm defines an almost complex structure which is covariantly constant.


4.6 Monopoles in Coordinate Form

The following example illustrates the construction of a static magnetic Yang-Mills-Higgs monopole solution in a coordinate framework. This example is based on Prasad's paper [40], where it is shown that under certain restrictions (see Conditions 10.4.28) solutions to the Yang-Mills-Higgs equations with arbitrary topological charge can be obtained by a recursive formula. This formula is based on the Atiyah-Ward ansatz for the self-duality equations (10.4.41) and uses the premise that "given a solution of arbitrary topological charge $n$, a solution for topological charge $(n + 1)$ can always be found." We chose to give this example not only to give a flavour of the coordinate formalism in Yang-Mills-Higgs theory, but also to show the use of computer technology in finding solutions of YMH-systems.

Example 10.4.5

Let $A^a_\mu(x_1, \dots, x_4) = A^a_\mu$ ($a = 1, 2, 3$; $\mu = 1, \dots, 4$) be the gauge potential and let

$$ F^a_{\mu\nu} \equiv \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + e\, \varepsilon_{abc} A^b_\mu A^c_\nu \tag{10.4.28} $$

be the gauge field, where $e$ is an arbitrary gauge coupling constant. The self-duality equation (see Def. (10.3.5)) resulting from (10.4.28) is:

$$ F^a_{\mu\nu} = \tfrac{1}{2}\, \varepsilon_{\mu\nu\rho\sigma} F^a_{\rho\sigma} \tag{10.4.29} $$

(The $\varepsilon$-symbols in (10.4.28) and (10.4.29) are Levi-Civita antisymmetric tensors.) The solutions of this equation that satisfy the Conditions given in (10.4.28) are the monopole solutions of the YMH model in 4 dimensions.

Conditions 10.4.28 (i) $A^a_\mu$ is static in all gauges, i.e., $\partial_4 A^a_\mu = 0$;

$$ H^2 = f^2 - \frac{2 f n}{e r} + O(r^{-2}) \quad \text{as } r = \sqrt{x_1^2 + x_2^2 + x_3^2} \to \infty. \tag{10.4.30} $$

Here $f$ is an arbitrary constant with dimensions of inverse length and $n$ is a positive integer that represents the topological charge; it is also assumed that $ef > 0$, hence both constants share the same sign. The key to solving (10.4.29) is to use the complex-coordinate formalism and to choose a suitable gauge along with the required gauge transformations. All these choices, however, should be such that they obey the Conditions (10.4.28). Recall that the gauge potential and gauge field are $(2 \times 2)$ matrix-valued fields with the Pauli matrices $\{\tau^a\}$ as basis, thus

$$ A_\mu = e\, \frac{\tau^a}{2i} A^a_\mu \quad \text{and} \quad F_{\mu\nu} = e\, \frac{\tau^a}{2i} F^a_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]. \tag{10.4.31} $$

To satisfy the condition that the gauge fields be real, $A_\mu$ and $F_{\mu\nu}$ have to be anti-Hermitian traceless matrices. Define complex coordinates $z$ and $w$ as:

$$ \sqrt{2}\, z = x_1 + i x_2, \qquad \sqrt{2}\, w = x_3 - i x_4. \tag{10.4.32} $$


This gives 4 coordinates $(z, \bar{z}, w, \bar{w})$ in place of the coordinates $(x_1, x_2, x_3, x_4)$. Obviously the gauge potential $\{A_\mu\}$ and the gauge field $\{F_{\mu\nu}\}$ can be viewed as matrix-valued functions of these 4 complex variables.18 To obtain an expression of $A_\mu$ in terms of the complex variables we first note that the self-duality relation (10.4.29) simplifies to:

$$ F_{zw} = 0, \qquad F_{\bar{z}\bar{w}} = 0, \qquad F_{z\bar{z}} + F_{w\bar{w}} = 0. \tag{10.4.33} $$

Since $F_{zw} = 0$ and $F_{\bar{z}\bar{w}} = 0$ are pure gauge, they can be integrated to give the matrix valued gauge potentials as:

$$ A_z = M^{-1} M_z, \quad A_w = M^{-1} M_w, \quad A_{\bar{z}} = \bar{M}^{-1} \bar{M}_{\bar{z}}, \quad A_{\bar{w}} = \bar{M}^{-1} \bar{M}_{\bar{w}}, \tag{10.4.34} $$

where $M$ and $\bar{M}$ are arbitrary $(2 \times 2)$ complex matrix functions of $(z, w, \bar{z}, \bar{w})$ with determinants 1, and where $M_z = \partial M / \partial z$ etc.19 We now consider the gauge transformations for $M$ and $\bar{M}$ as:

$$ M \to \bar{V}(\bar{z}, \bar{w})\, M L, \qquad \bar{M} \to V^{-1}(z, w)\, \bar{M} L. \tag{10.4.35} $$

The entities $L$, $V$ and $\bar{V}$ in (10.4.35) are arbitrary complex matrix functions of $(z, w, \bar{z}, \bar{w})$, $(z, w)$ and $(\bar{z}, \bar{w})$ respectively, with determinants 1. It is easy to check that the gauge potential $A_\mu$ and gauge field $F_{\mu\nu}$ transform under the gauge transformations (10.4.35) as:

$$ A_\mu \to L^{-1} A_\mu L + L^{-1} \partial_\mu L, \qquad F_{\mu\nu} \to L^{-1} F_{\mu\nu} L. \tag{10.4.36} $$

It is easy to note that the energy density $F^a_{\mu\nu} F^a_{\mu\nu} = -\dfrac{2}{e^2}\, \mathrm{Tr}(F_{\mu\nu} F_{\mu\nu})$ is invariant under them. We next define a matrix $N$ in terms of $M$ and $\bar{M}$:

$$ N = M \bar{M}^{-1}. \tag{10.4.37} $$

This transforms under (10.4.35) as

$$ N \to \bar{V}(\bar{z}, \bar{w})\, N\, V(z, w) \tag{10.4.38} $$

and reduces the remaining self-duality equation $F_{z\bar{z}} + F_{w\bar{w}} = 0$ to a simpler form:

$$ (N^{-1} N_z)_{\bar{z}} + (N^{-1} N_w)_{\bar{w}} = 0. \tag{10.4.39} $$

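Since $N$ is an arbitrary complex $(2 \times 2)$ matrix function with $\det N = 1$, we can choose a convenient parametrization for $N$ that will eventually lead to the solution. We thus write:

$$ N = \frac{1}{\alpha} \begin{pmatrix} 1 & \bar{\beta} \\ \beta & \alpha^2 + \beta\bar{\beta} \end{pmatrix} \tag{10.4.40} $$

where $\alpha$, $\beta$ and $\bar{\beta}$20 are arbitrary and independent complex functions of $(z, w, \bar{z}, \bar{w})$. Using (10.4.40) the self-duality equation (10.4.39) can now be put as:

$$ (\partial_z \partial_{\bar{z}} + \partial_w \partial_{\bar{w}}) \ln \alpha + (\beta_z \bar{\beta}_{\bar{z}} + \beta_w \bar{\beta}_{\bar{w}}) / \alpha^2 = 0 \tag{10.4.41a} $$

$$ \left( \frac{\beta_z}{\alpha^2} \right)_{\bar{z}} + \left( \frac{\beta_w}{\alpha^2} \right)_{\bar{w}} = 0 \tag{10.4.41b} $$

$$ \left( \frac{\bar{\beta}_{\bar{z}}}{\alpha^2} \right)_{z} + \left( \frac{\bar{\beta}_{\bar{w}}}{\alpha^2} \right)_{w} = 0 \tag{10.4.41c} $$

18 Analytical continuation of $A_\mu$ to the complex space makes this possible.
19 The notation $\bar{M}$ for an arbitrary $(2 \times 2)$ complex matrix function should not be taken as the complex conjugate of $M$ (see for instance (10.4.42)).
20 $\bar{\beta}$ is not the complex conjugate of $\beta$; we could have used any other parameter, say $\gamma$; however, to facilitate things for the interested reader we have stayed with Prasad's terminology.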

Since our objective is to obtain $A_\mu$, i.e., solutions to the self-duality equations using $N = N(\alpha, \beta, \bar{\beta})$, we have to select a suitable gauge, in other words a suitable factorization of $N$. Apparently there are infinitely many ways in which such a choice can be made, as is evident from (10.4.37). The choice that is made here is known as Yang's gauge $R$, which is defined by $N \equiv R \bar{R}^{-1}$, where:

$$ R = \begin{pmatrix} 1/\sqrt{\alpha} & 0 \\ \beta/\sqrt{\alpha} & \sqrt{\alpha} \end{pmatrix}, \qquad \bar{R} = \begin{pmatrix} \sqrt{\alpha} & -\bar{\beta}/\sqrt{\alpha} \\ 0 & 1/\sqrt{\alpha} \end{pmatrix}. \tag{10.4.42} $$

The gauge potential in the $R$ gauge can be written as:

$$ A_u = \begin{pmatrix} -\alpha_u / 2\alpha & 0 \\ \beta_u / \alpha & +\alpha_u / 2\alpha \end{pmatrix}, \qquad A_{\bar{u}} = \begin{pmatrix} +\alpha_{\bar{u}} / 2\alpha & -\bar{\beta}_{\bar{u}} / \alpha \\ 0 & -\alpha_{\bar{u}} / 2\alpha \end{pmatrix} \tag{10.4.43} $$

where $u$ stands for $z, w$. From (i) of Condition (10.4.28) we know that the gauge potentials given in (10.4.43) must be independent of $x_4$, and since this has to be true in all gauges, in view of (10.4.36) it follows that the gauge transformation matrix $L$ must be $x_4$ independent as well, i.e.,

$$ \partial_4 L = 0. \tag{10.4.44} $$

It should be noted here that $x_4$ independence of $A_\mu$ does not necessarily lead to part (ii) of the Condition, i.e., to the requirement that the gauge potentials $A_\mu$ satisfy the anti-Hermitian property $A_\mu = -A_\mu^\dagger$. A necessary and sufficient condition for this to happen is that $\bar{V}(\bar{z}, \bar{w})$ and $V(z, w)$ be such that the product $\bar{V}(\bar{z}, \bar{w})\, R \bar{R}^{-1}\, V(z, w)$ is a positive definite Hermitian matrix. It can be verified that in this case the gauge transformation matrix $L$ is a square root of the matrix:

$$ L L^\dagger \doteq (R^\dagger \bar{V}^\dagger V^{-1} \bar{R})^{-1}. \tag{10.4.45} $$

The symbol $\doteq$ indicates that the equation is valid only for real values of $x_1, \dots, x_4$. We further note that the following choice of $\alpha$, $\beta$, $\bar{\beta}$:

$$ \alpha = e^{i e f x_4} F_\alpha, \qquad \beta = e^{i e f x_4} F_\beta, \qquad \bar{\beta} = e^{i e f x_4} F_{\bar{\beta}}, $$

where $F_\alpha$, $F_\beta$, $F_{\bar{\beta}}$ are functions of $x_1, x_2, x_3$ only, ensures that the gauge potentials (10.4.43) are independent of $x_4$. Next we use the Atiyah-Ward result on monopoles, which states that if $\mathcal{A}_n = ({}_n\alpha, {}_n\beta, {}_n\bar{\beta})$ are solutions to the self-duality Equations (10.4.41), then so are $\mathcal{A}_{n+1} = ({}_{n+1}\alpha, {}_{n+1}\beta, {}_{n+1}\bar{\beta})$ given in (10.4.46). We apply this result to obtain the required solutions of Prasad [40] in (10.4.41). The triplet $({}_{n+1}\alpha, {}_{n+1}\beta, {}_{n+1}\bar{\beta})$ is defined in terms of $({}_n\alpha, {}_n\beta, {}_n\bar{\beta})$ by using


"a2

+ J rP

„«

'

- COO.

,2 - * - - PM, 22 * _ - ?(„,. + npj na + Jn/3

ncc

Hence it can be written as: 2

n+lPz=-Q

n +

(n)[P(n)-\w

+ Q2(n)[P(n)]w

n+iB-z=

A = +Q\n)[P(n)]z

n+lBw

(10.4.46)

= - Q\n)[P(n)]z

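(See Excs. 4, 5 and 6 for explanations on these.) The induction here starts with the Ansatz $\mathcal{A}_1$ defined as:

$$ {}_1\beta_z = {}_1\alpha_{\bar{w}}, \quad {}_1\beta_w = -{}_1\alpha_{\bar{z}}; \qquad {}_1\bar{\beta}_{\bar{z}} = {}_1\alpha_w, \quad {}_1\bar{\beta}_{\bar{w}} = -{}_1\alpha_z; \tag{10.4.47} $$

which implies

$$ {}_1\alpha_{z\bar{z}} + {}_1\alpha_{w\bar{w}} = 0. \tag{10.4.48} $$

In view of the above discussion we look for solutions of (10.4.48) in the form:

$$ {}_1\alpha = e^{i e f x_4} \Lambda_0 \tag{10.4.49} $$

where $\Lambda_0$ is a function of $(x_1, x_2, x_3)$. Since $\partial_z \partial_{\bar{z}} + \partial_w \partial_{\bar{w}}$ is one half of the Laplacian in $(x_1, \dots, x_4)$, the above choice reduces (10.4.48) to:

$$ \nabla^2 \Lambda_0 = (ef)^2 \Lambda_0 \qquad \text{(see Exc. 7)}. \tag{10.4.50} $$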

This gives the general solution of (10.4.50) as:

$$ \Lambda_0 = \sum_{k} \rho_k\, \frac{\sinh(e f r_k)}{r_k} \tag{10.4.51a} $$

with

$$ r_k^2 = x_1^2 + x_2^2 + (x_3 - \delta_k)^2, \tag{10.4.51b} $$

where $\rho_k$ and $\delta_k$ are arbitrary complex constants subject to the condition that $\Lambda_0$ be a real function of $x_1, x_2, x_3$.
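That a superposition of the form (10.4.51a) solves (10.4.50) can be checked with a finite-difference Laplacian. The sketch below is an illustration under stated assumptions: it takes real shifts $\delta_k$ and real weights $\rho_k$ for simplicity (the text allows complex constants), the numerical values of $ef$, $\rho_k$, $\delta_k$ and the evaluation point are arbitrary choices, and the function names are hypothetical.

```python
import math

EF = 1.3                 # assumed value of the product e*f
RHO = (1.0, 2.0)         # assumed real weights rho_k
DELTA = (-0.7, 0.4)      # assumed real shifts delta_k along x3

def lambda0(x1, x2, x3):
    # Lambda_0 = sum_k rho_k sinh(ef r_k) / r_k with r_k^2 = x1^2 + x2^2 + (x3 - delta_k)^2
    total = 0.0
    for rho, delta in zip(RHO, DELTA):
        r = math.sqrt(x1 * x1 + x2 * x2 + (x3 - delta) ** 2)
        total += rho * math.sinh(EF * r) / r
    return total

def laplacian(f, x, h=1e-3):
    # Second-order central difference approximation of the 3D Laplacian at x
    total = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (f(*xp) - 2.0 * f(*x) + f(*xm)) / h ** 2
    return total

def helmholtz_residual(x):
    # Residual of  laplacian(Lambda_0) - (ef)^2 Lambda_0
    return laplacian(lambda0, x) - EF ** 2 * lambda0(*x)

point = (0.3, -0.2, 0.5)
print(lambda0(*point), helmholtz_residual(point))
```

The residual is at the level of the discretization error, reflecting the fact that each term $\sinh(ef\, r_k)/r_k$ is a regular radial solution of the equation centred at $x_3 = \delta_k$.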

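Next we define a real function $\Lambda_1$ by writing:

$$ \Lambda_0 \equiv \bar{z}^{-1} \partial_z \Lambda_1 = z^{-1} \partial_{\bar{z}} \Lambda_1, \tag{10.4.52} $$

and use it to integrate equations (10.4.47). This gives:

$$ {}_1\alpha = e^{i e f x_4} \Lambda_0, \qquad {}_1\beta = e^{i e f x_4} (\sqrt{2}\, \bar{z})^{-1} (\partial_3 + ef) \Lambda_1, \qquad {}_1\bar{\beta} = e^{i e f x_4} (\sqrt{2}\, z)^{-1} (\partial_3 - ef) \Lambda_1. \tag{10.4.53} $$

(Note that constants of integration have been ignored here.) Further define the real functions $\Lambda_n$ for $n \ge 2$ by:

$$ \Lambda_n = \bar{z}^{-1} \partial_z \Lambda_{n+1} = z^{-1} \partial_{\bar{z}} \Lambda_{n+1}, \tag{10.4.54} $$

then the $\Lambda$'s (i.e., the solutions we are looking for) are:21

21 We avoided giving the technical details since we feel that the reader can solve these equations by integrating them appropriately (see Prasad [40]).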


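$$ \Lambda_0 = (ef) \sum_{k} \rho_k\, \frac{\sinh(e f r_k)}{e f r_k} $$

$$ \Lambda_1 = (ef)^{-1} \sum_{k} \rho_k \cosh(e f r_k) $$

$$ \Lambda_2 = (ef)^{-3} \sum_{k} \rho_k\, (e f r_k) \sinh(e f r_k) - (ef)^{-2} \Lambda_1 $$

(10.4.55)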

A3 = (ef)~5 2 L i P* (
(10456)
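The defining relation between $A_0$ and $A_1$ can be checked numerically. The sketch below assumes the convention $\sqrt{2}\,z = x_1 + i x_2$, so that $\partial_z = (\partial_1 - i\partial_2)/\sqrt{2}$, and takes a single term with $p_k = 1$; the names `a0`, `a1`, `d_dz`, the value c = 1.3 for the product ef and the sample point are illustrative assumptions of ours.

```python
import math

SQRT2 = math.sqrt(2.0)


def radius(x1, x2, x3, delta):
    """r with r^2 = x1^2 + x2^2 + (x3 - delta)^2."""
    return math.sqrt(x1 * x1 + x2 * x2 + (x3 - delta) ** 2)


def a0(x1, x2, x3, c, delta):
    """Single-term A_0 = sinh(c r)/r."""
    r = radius(x1, x2, x3, delta)
    return math.sinh(c * r) / r


def a1(x1, x2, x3, c, delta):
    """Single-term A_1 = cosh(c r)/c."""
    return math.cosh(c * radius(x1, x2, x3, delta)) / c


def d_dz(f, x1, x2, x3, h=1e-5):
    """d/dz = (d/dx1 - i d/dx2)/sqrt(2), assuming sqrt(2) z = x1 + i x2."""
    d1 = (f(x1 + h, x2, x3) - f(x1 - h, x2, x3)) / (2.0 * h)
    d2 = (f(x1, x2 + h, x3) - f(x1, x2 - h, x3)) / (2.0 * h)
    return complex(d1, -d2) / SQRT2


c, delta = 1.3, 0.2            # assumed illustrative values
x1, x2, x3 = 0.7, -0.4, 1.1
z_bar = complex(x1, -x2) / SQRT2

# A_0 recovered from A_1 as zbar^{-1} d/dz A_1.
recovered = d_dz(lambda u, v, w: a1(u, v, w, c, delta), x1, x2, x3) / z_bar
direct = a0(x1, x2, x3, c, delta)
mismatch = abs(recovered - direct)
print(mismatch)
```

The recovered value is real up to rounding, in line with the requirement that the $A_n$ be real functions.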

r This enables us to write the ansatz Ax = da, ,/3, xfl) using the relations in (10.4.47) and (10.4.53). But every solution has to satisfy the requirements of Condition (10.4.28), hence in view of (10.4.44-45) we must choose the matrices V (z, w) and V(z, w) that make V RR "' V a positive definite Hermitian matrix. The required matrices V and V are:

fl 0\ V(z,w)=l = l

1;

{ 0 V(z, w ) =

-^A -JlzY

(10.4.57)

The gauge transformation matrix equality (10.4.45) in this case becomes22: LL+ = (R+ V+ V~lRTl = e"ef<-Tx).

(10.4.58)

This gives L= e 2 confirming the assertion made for LL+ in (10.4.45). Note that AQ in (10.4.56) never vanishes and it provides a real non singular spherically symmetric n = 1 monopole solution. The n = 2 Monopole Solution (Ward [52]): We write AQ as: fMHefrO ^{efrj

^

(1Q4 5 9 )

where
$$r_1^2 = x_1^2 + x_2^2 + (x_3 - \delta_0)^2, \qquad r_2^2 = x_1^2 + x_2^2 + (x_3 + \delta_0)^2. \tag{10.4.60}$$

22 We remind the reader that the subscript + on L, R and V stands for Hermitian conjugate, and $T \cdot x = (T^i x_i)$ (i = 1, 2, 3).


TheK (z, w) and V(z, w) that make (V RR -1V) a positive definite hermitian matrix for the solution (2«, 2p,

jj) are: V ( z , w ) = l;

V(z,w)=\

,^,

2

'

K

(10.4.61)

Here $\gamma$ is a real constant such that $-(4\gamma\delta_0)^2 = +1$ and $\delta_0 = +ic$ with $c$ a real constant. This is required for the gauge fields to be real, as shown in [40]. The n = 3 monopole solution is obtained by choosing $A_0$ as: $A_0 =$

ainhfrA)

+

sinh(efr2)

r\

+

^rnhjefr)

r7

^

(1Q 4 6 2 )

r

with rx and r2 as given in (10.4.60) and r 2 = ^ 2 + x\ + Xj. The matrices V and V in this case are: VQ,w)

= l,

V(z,w)=\

.

V

M

(10.4.63)

The constants $\delta_0$ and $\gamma$ now satisfy:
$$(8\gamma\delta_0^2)^2 = +1, \qquad \gamma = \text{a real constant}. \tag{10.4.64}$$

Finally the choice of $A_0$ for an arbitrary monopole solution of topological charge n is given by23:
$$A_0 = \sum_{k=1}^{n} p_k\,\frac{\sinh(e f r_k)}{r_k} \tag{10.4.65a}$$
where
$$r_k^2 = x_1^2 + x_2^2 + (x_3 - \delta_k)^2, \tag{10.4.65b}$$
$$p_k = \frac{(n-1)!}{(k-1)!\,(n-k)!}. \tag{10.4.65d}$$
Note that $p_k$ is a binomial coefficient and hence real, whereas the difference $\delta_k - \delta_{k-1} = i\pi/ef$ is imaginary. The matrices V and V here are:

V(z,w) = l, 23'

( V(z, w)= \

0 r

Y-x(J2zYn) K ' I

(10.4.66)

It is worth noting here that these solutions involved long algebraic computations, and so to obtain the final results, the symbol manipulation computer programme MACSYMA had to be used.
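The reality of $A_0$ in (10.4.65) can be illustrated numerically. The sketch below uses the binomial weights of (10.4.65d) and shifts whose consecutive differences equal $i\pi/ef$; the symmetric placement $\delta_k = (i\pi/ef)\,(k - (n+1)/2)$ is an assumption made here for illustration, as are the function names and the sample point.

```python
import cmath
import math


def binomial_weights(n):
    """p_k = (n-1)! / ((k-1)! (n-k)!) for k = 1..n, as in (10.4.65d)."""
    return [math.comb(n - 1, k - 1) for k in range(1, n + 1)]


def shifts(n, ef):
    """Assumed symmetric shifts; consecutive values differ by i*pi/ef."""
    return [1j * math.pi / ef * (k - (n + 1) / 2) for k in range(1, n + 1)]


def a0(x1, x2, x3, n, ef):
    """Sum of p_k sinh(ef r_k)/r_k with complex r_k, as in (10.4.65a)."""
    total = 0j
    for p, delta in zip(binomial_weights(n), shifts(n, ef)):
        # sinh(ef r)/r is even in r, so the branch of the square root is immaterial.
        r = cmath.sqrt(x1 * x1 + x2 * x2 + (x3 - delta) ** 2)
        total += p * cmath.sinh(ef * r) / r
    return total


value_n2 = a0(0.3, 0.5, 0.2, n=2, ef=1.0)
value_n3 = a0(0.3, 0.5, 0.2, n=3, ef=1.0)
print(binomial_weights(3), value_n2.imag, value_n3.imag)
```

With this placement the terms occur in complex-conjugate pairs, so the imaginary parts cancel up to rounding.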


It can be verified that VRR'W is a positive definite matrix for Ansdtz (na,nl3, nf5) constructed from (10.4.65). Having obtained the monopole solutions of arbitrary topological charge n that satisfy (i) and (ii) of Conditions (10.4.28), we have to examine their compatibility with part (iii) as well. In this connection we note the following facts: (a) Choice of Ao in (10.4.51) and subsequent derivations in (10.4.52-53) imply that: In {a -» ef(r+

ix4) + 0(ln r) as r -> «,.

(10.4.67)

(b) The Atiyah-Ward transformations (10.4.46) that lead from one Ansatz to another do not change the asymptotic behaviour of ${}_1\alpha$; accordingly we have:
$$\ln {}_1\alpha \to ef\,(r + i x_4) + O(\ln r) \;\Rightarrow\; \ln {}_n\alpha \to ef\,(r + i x_4) + O(\ln r) \tag{10.4.68}$$
as $r \to \infty$.

(c) The gauge invariant quantity $H^2 = A_4^a A_4^a$ for the Ansatz $\mathcal{A}_n$ can be written as:
$$H^2 = f^2 + \frac{1}{e^2}\sum_{k=1}^{n}\left[-\nabla^2 \ln {}_k\alpha\right]. \tag{10.4.69}$$

Substituting (10.4.68) into (10.4.69) we thus have that as $r \to \infty$
$$H^2 \to f^2 - \frac{2nf}{er} + O(r^{-2}),$$
and this is the condition (iii) as given in (10.4.30) (see Prasad [40]). We now move on to the next section of this chapter, after illustrating some of the computational aspects of the theory in the exercises.

Exercise 10.4
1. Show that the space $M_k$ can also be given a bundle theoretic description which is independent of direction. Mention the advantages of this description.
2. Show that $(a, \psi)$ is orthogonal to gauge directions that result from infinitesimal gauge transformations (with compact support).
3. Derive the self-duality equations (10.4.41) from (10.4.39) using the matrix N as given in (10.4.40).
4. Show that if $(\alpha, \beta, \bar\beta)$ satisfies (10.4.41), then the transformed triple $(T(\alpha), T(\beta), T(\bar\beta))$ defined by:

Tia)

-^TW m=^TW

m =

A

also satisfies it. Moreover, the gauge potentials obtained from $(T(\alpha), T(\beta), T(\bar\beta))$ are gauge transforms of those derived from $(\alpha, \beta, \bar\beta)$.
5. Given that $(\alpha, \beta, \bar\beta)$ satisfies (10.4.41), then $(S(\alpha), S(\beta), S(\bar\beta))$ also satisfies it, where

S(a)=-i 5(/U = -|i-,S(/U = i ,


s(jgz-) = % a1

S(/3*) = -2f-. a1

6. Show that the equalities given in (10.4.47), which define the Ansatz $\mathcal{A}_{n+1}$ in terms of $\mathcal{A}_n$, are the results of the product transformation ST, where S and T are given in Exercises (4) and (5).
7. Establish (10.4.50).
8. Obtain the values of ${}_1\beta$ and ${}_1\bar\beta$ as given in (10.4.53).
9. Show the mathematical calculations that lead Eq. (10.4.69) to (10.4.30).

Hints to Exercise 10.4
1. To obtain a bundle theoretic description for $M_k$ we consider the Hopf line bundle H over $S^2$; in other words, we form the vector bundle $H^k \oplus H^{-k}$ over $S^2$. We extend this bundle radially to $\mathbb{R}^3 - \{0\} = S^2 \times \mathbb{R}^+$; this gives an SU(2)-bundle $E_k$ with connection and Higgs field on $\mathbb{R}^3 - \{0\}$; $E_k$ is acted upon by SO(3). From asymptotic analysis of monopoles it is known that any monopole of charge k is asymptotically isomorphic to $E_k$ (see [29a]); and since the group of automorphisms of $E_k$ is U(1), the asymptotic isomorphism (denoted by $\alpha$) is unique up to U(1). But monopole constructions for $k \neq 0$ are irreducible, hence the only element of U(1) which extends to a monopole automorphism is $-1$. We now assume that a given monopole is made 'rigid' by fixing $\alpha$ up to sign. The isomorphism classes of such 'rigidified' monopoles give the required moduli space $M_k$ fibered over $N_k$ with the circle $S^1$ as a fiber. In other words $U(1)/\{\pm 1\}$ acts freely on $M_k$ with quotient $N_k$. We further note that since an automorphism of $E_k$ is determined by its value at one point, the spaces $M_k$ and $M_k(*)$ can be identified. The advantage of this description is evident, since this definition of $M_k$ exhibits the natural action of the group of Euclidean motions of $\mathbb{R}^3$ and in particular of the rotation group SO(3), which acts naturally up to $(\pm 1)$ on the bundle $E_k$. In this context the SO(2) subgroup of SO(3) which fixes a preferred direction acts naturally on $M_k(*)$. It should be noted here that this action is not compatible with the identification $M_k \to M_k(*)$, as a point of $M_k$ stands for a monopole bundle E along with an asymptotic isomorphism $\alpha: E(*) \to E_k(*)$, whereas a point of $M_k(*)$ represents E with an isomorphism $E(*) \to \mathbb{C}^2$.
2. Let X be a zero-form with compact support (we have to assume this since an arbitrary X or

((a, y), (DAA, [A, 0]))

is zero for any X. Now (a) equals: (b)

f

(aADAA)+

y/[X, ],

where D.{) is an open set in R3 on which integrand can be defined. If we use the operator *DA *-the adjoint of DA, the first term in (b) can be written as (*DA*a, X), and thus (b) can be re-written as <(*D A * a + {, yA\

This vanishes for all X, only if

X).


(c) $*D_A * a + [\phi, \psi] = 0.$

But Eq. (c) is (10.4.25), showing that $(a, \psi)$ is orthogonal to gauge directions.
3. We write $(N^{-1} N_z)_{\bar z}$ explicitly in terms of $(\alpha, \beta, \bar\beta)$; the second term follows after replacing $z, \bar z$ by $w, \bar w$. Thus we have:

\(a+M. - i i f -£«. (a) (AT1 N)

a

=— A

_

«x

a

a

a 7 1^ a

a2

a

a2

£._&. _ a

«2

a

a2

z

a2

y

a2 a

K

z

zH>).

_ a2

a2

J?

Note that in view of the self-duality Eq. (10.4.39), the term belonging to the second row and the first column of (a) when added to similar term coming from (W 1 Nw)^ gives: (b)

but this is (10.4.41b). Likewise taking into account (10.4.39) each of the other three entries must satisfy separately equations similar to (b), now the diagonal terms are the same except for the sign and they give (10.4.41a). The entry in the first row and the second column of (a) simplifies to

-^-j- after

some calculations based on/?^ = /3iz, and this leads to the remaining Eq. (10.4.41c) 4. To prove this we only need to write the transformed matrix T(N) in terms of T(a), etc., using (10.4.40), thus:

' (a)

T(N)=

i T

T(P)

np) /<«> (T{a))2 + 7X/3) 7-Q3)

T(a)

I T(cc)

{ a2+fP = v

)

P)

I

j _ = U o)N[i oj-

a

a>


Evidently N is gauge transformed to T(N) (see (10.4.38)) where now V (z, w) =

(0 V=

.

i\

(0

-i\

\-i

°J

and

, and hence the self-duality equations (10.4.41) are satisfied by (T(a), T(p), T(fi)).

5. We assume that (3 andyS are integrable, i.e., fizv/= j3wv similarlyfizW = fiWi-, and we recall that the variables a, ft, ft are independent of each other. We replace a (3Z, /? ? , Pw,f3w by their transformed values, e.g., S(a) = —, S(B.) = —%a ' a1 and (10.4.41c) to obtain respectively:

- i n the left hand side of (10.4.41a), (10.4.41b)

(i) (ii) and (iii)

(-^xa2) +(--^xa2)

)z \ a2

Va2

Jw

Since In— = - In a and fizw = fiwz, etc., we note that (10.4.41) are satisfied by (S(a), S(p),

S(P)). 6. We note that both f2 and S 2 are identity transformations; in fact computation of S2 is quite trivial. It can be easily checked that: (i)

^ — ^

A2

ST

) ^3

and thus (il)

* „ — ^ - > *n+V

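7. We use (10.4.32) to write (10.4.48), i.e. $\partial_z\partial_{\bar z}({}_1\alpha) + \partial_w\partial_{\bar w}({}_1\alpha) = 0$, as

(a) $\dfrac{1}{2}\left(\dfrac{\partial^2}{\partial x_1^2} + \dfrac{\partial^2}{\partial x_2^2}\right)({}_1\alpha) + \dfrac{1}{2}\left(\dfrac{\partial^2}{\partial x_3^2} + \dfrac{\partial^2}{\partial x_4^2}\right)({}_1\alpha) = 0.$

In view of (10.4.49) this becomes:

(b) $e^{iefx_4}\left(\dfrac{\partial^2}{\partial x_1^2} + \dfrac{\partial^2}{\partial x_2^2} + \dfrac{\partial^2}{\partial x_3^2}\right)A_0 + \dfrac{\partial^2}{\partial x_4^2}\left(e^{iefx_4} A_0\right) = 0.$

Since $A_0$ is independent of $x_4$ we have: $\nabla^2 A_0 = (ef)^2 A_0$. The above equation is the Helmholtz equation for $A_0$ and hence gives the standard solution in terms of hyperbolic functions.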


8. From (10.4.47) we have: i.e.,

(a) Now the transformation-Jlw = x3 + ix4 implies —— = —==dw V2 ^ dx3

i—— , thus: dx4 J

(b) (since AQ is independent of x4). This gives:

(c) Writing AQ in terms of Ai as given in the first equality of (10.4.52), we have the required result after simplification:

(d)

(Note that z~l = V2~

is a zero of the operator (xl-ix2)

, similarly etefxA is a zero of the ydxiJ

operator $\partial/\partial z$.) The expression for ${}_1\bar\beta$ can be obtained in a similar manner by using either of the last two equalities of (10.4.47).
9. Since for every k, $\ln {}_k\alpha \to ef\,(r + i x_4) + O(\ln r)$ as $r \to \infty$, (10.4.69) can be written as:

(a) $H^2 = f^2 - \dfrac{n}{e^2}\nabla^2(\ln {}_1\alpha) = f^2 - \dfrac{n}{e^2}\left[ef\,\nabla^2 r + O(\nabla^2(\ln r))\right].$

Note that since $\nabla^2 = \dfrac{\partial^2}{\partial x_1^2} + \dfrac{\partial^2}{\partial x_2^2} + \dfrac{\partial^2}{\partial x_3^2}$, the term $i x_4$ is annihilated by it while writing the RHS of (a). Using $\nabla^2 (x_1^2 + x_2^2 + x_3^2)^{1/2} = 2/r$ and $\nabla^2(\ln r) = O(r^{-2})$ we have the required expression.
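The two identities used in Hint 9 can be confirmed numerically. The following Python sketch evaluates the three-dimensional Laplacian of $r$ and of $\ln r$ by central differences and then forms the leading correction $2nf/(er)$; the function names, the sample point and the values chosen for n, e and f are illustrative assumptions only.

```python
import math


def r_of(p):
    """Euclidean radius of a point p = (x1, x2, x3)."""
    return math.sqrt(sum(c * c for c in p))


def laplacian(f, p, h=1e-3):
    """Second-order central-difference Laplacian in three dimensions."""
    total = 0.0
    for axis in range(3):
        plus = list(p)
        minus = list(p)
        plus[axis] += h
        minus[axis] -= h
        total += f(plus) - 2.0 * f(p) + f(minus)
    return total / (h * h)


point = [3.0, -2.0, 1.5]
r = r_of(point)

lap_r = laplacian(r_of, point)                               # expected 2/r
lap_log_r = laplacian(lambda p: math.log(r_of(p)), point)    # expected 1/r^2


def leading_correction(n, e, f, radius):
    """(n/e^2) * e*f * (2/radius), i.e. the 2nf/(er) term subtracted from f^2."""
    return (n / e ** 2) * (e * f) * (2.0 / radius)


print(lap_r - 2.0 / r, lap_log_r - 1.0 / r ** 2)
print(leading_correction(2, 1.0, 1.0, r))
```

Since $\nabla^2 \ln r = 1/r^2$ in three dimensions, the logarithmic term contributes only at order $r^{-2}$, as stated.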

5 MORE ON VORTICES

Unlike other texts, we study vortices after learning more about monopoles. This order seems more logical to us, as it retains the descending order of the spaces $\mathbb{R}^4 \to \mathbb{R}^3 \to \mathbb{R}^2$ on which instantons, monopoles and vortices can be defined. Also, most of the results related to smoothness of monopoles can be viewed as direct derivations of those of the instantons (see [32] and [10]); hence a study of monopoles soon after instantons seemed more appropriate. Apart from this, one of the compelling reasons to follow this order is to emphasize the fact that vortices represent a symmetry-breaking phenomenon in nature and thus, in that sense, are more akin to anomalies (than to instantons), the entities resulting from symmetry violations in physical systems. We shall be discussing anomalies in the next section. In the present section we shall describe in brief the relation between vortices and superconductivity, the outcome of symmetry-breaking phenomena, and also make a few remarks on the existence of vortex solutions $(A, \phi)$ to the abelian (U(1)) Higgs model.

5.1 Characterization of Superconductivity

We recall that a large number of materials, when cooled below a critical temperature ($T_c$), exhibit the phenomenon of superconductivity, and so are referred to as superconductors. We shall see that in some of these cases the equilibrium state of the system is a multivortex configuration. To begin with we note that at the macroscopic level superconductivity is characterized by three properties: (1) electric currents flow without resistance, (2) magnetic fields vanish inside the superconducting medium (leading to the state of "magnetic expulsion"), and (3) in the transition from the normal state to the superconducting state, there is no net energy release. At the microscopic level a superconductor is described by the BCS (Bardeen, Cooper and Schrieffer) theory, according to which superconductivity sets in on account of the formation of bound electron pairs (known as Cooper pairs). Against small applied forces, these electron pairs behave like a single (Boson) particle with twice the charge of an electron. Large external forces are required to disrupt this pairing and force the system back to the normal state. The state of superconductivity is now characterized by the collection of these Cooper pairs, which in turn are described by a scalar field $\phi$. When $|\phi| \approx 0$, the state of the system is normal, whereas when $|\phi|$ is bounded away from zero it signals the state of condensed pairs and superconductivity. The field $\phi$, in view of this property, is called the order parameter. It is this action of $\phi$ which connects the superconductivity phenomena of macroscopic and microscopic levels.

5.2 Superconductivity and Multivortex Configuration

In order to achieve the goal of relating the physical phenomenon of superconductivity with our abstract model $A_{YMH}$ (10.2.3), we have to give a mathematical representation to superconductivity which is similar to this model in 2 dimensions. This is best done by using the free energy density model proposed by Ginzburg and Landau for the field $\phi$:
$$a = a_0 + a_2|\phi|^2 + a_4|\phi|^4 + a_1\left|(-i\hbar\nabla - eA/c)\,\phi\right|^2 \tag{10.5.1}$$
where $\hbar$ = Planck's constant, $e = 2e_{\text{electron}}$ is the electric charge of a Cooper pair and c = velocity of light*. The energy density a is based on thermodynamical principles, and uses the following assumptions (10.5.2):
(i) a has an expansion in $|\phi|$ and its derivatives.
(ii) a is an even function of $\phi$.
(iii) The coefficients $a_j$ in a are constants that depend on factors such as the temperature and the composition of the material.
To ensure stability it is further assumed that $a_1, a_4 > 0$. On the other hand, to discuss the theory, $a_2$ is assumed to be < 0; since if $a_2 > 0$, in view of the maximum principle this will mean that the solutions of the variational equations of (10.5.1) with $\int a < \infty$ must satisfy $\phi = 0$, thus limiting the theory to describe only the normal state. After some scale change and replacement of constants by our familiar constant $\lambda$, we write (10.5.1) as

$$a = \tfrac{1}{2}|\nabla_A\phi|^2 + \tfrac{\lambda}{8}\left(|\phi|^2 - 1\right)^2 \tag{10.5.3}$$
and emphasize that the magnetic potential A here is a known entity. In order to make the above density useful for the description of the internal state of the superconductor, it is essential that we treat A as a dynamic variable, and therefore we add the free energy density of the magnetic field, namely $|F_A|^2/2$, to (10.5.3) and obtain:
$$a = \tfrac{1}{2}|F_A|^2 + \tfrac{1}{2}|\nabla_A\phi|^2 + \tfrac{\lambda}{8}\left(|\phi|^2 - 1\right)^2 \tag{10.5.4}$$

From the manner in which a is constructed, it is evident that it represents 3-dimensional abelian Higgs model with G = U(1) and with L = ) inside a superconductor. The potential function $V(\phi) = \frac{\lambda}{8}(|\phi|^2 - 1)^2$ illustrated below, with its minima scaled to $|\phi| = 1$, shows that the state with $|\phi| \approx 1$ is superconducting, while $|\phi| \approx 0$ is normal.

[Figure: The Ginzburg-Landau Potential Function, plotted against $|\phi|$]
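The shape of the Ginzburg-Landau potential can be tabulated directly. The sketch below assumes the form $V = \frac{\lambda}{8}(|\phi|^2 - 1)^2$ and checks that $|\phi| = 1$ is a minimum and $|\phi| = 0$ a local maximum; the function names and the value $\lambda = 2$ are illustrative choices of ours.

```python
def gl_potential(phi_abs, lam):
    """Assumed form V = (lam/8) * (|phi|^2 - 1)^2."""
    return lam / 8.0 * (phi_abs ** 2 - 1.0) ** 2


def gl_potential_slope(phi_abs, lam):
    """Derivative dV/d|phi| = (lam/2) * |phi| * (|phi|^2 - 1)."""
    return lam / 2.0 * phi_abs * (phi_abs ** 2 - 1.0)


lam = 2.0  # illustrative coupling
samples = [k / 10.0 for k in range(0, 16)]        # |phi| from 0.0 to 1.5
table = [(x, gl_potential(x, lam)) for x in samples]

value_at_normal_state = gl_potential(0.0, lam)     # lam/8 at |phi| = 0
value_at_superconducting = gl_potential(1.0, lam)  # zero at |phi| = 1
for x, v in table:
    print(f"{x:.1f}  {v:.4f}")
```

The slope vanishes at both $|\phi| = 0$ and $|\phi| = 1$, but only the latter is a minimum, which is the superconducting state in the scaling used here.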

* Note that the gradient term $(-i\hbar\nabla - eA/c)\,\phi$ is the standard minimal coupling of a scalar field to an electromagnetic potential.

As mentioned above, superconductivity differs from material to material and therefore the external magnetic force needed to disrupt it varies as well. In general however (as we shall see below) these superconductors can be classified into two categories depending on whether the vortex-vortex force that comes into play is attractive or repulsive. We note that this classification is related to $\lambda$. For instance when $\lambda < 1$, the superconductor is called a type I superconductor. Its characteristic properties are: (i) the magnitude of the magnetic field is bounded by the order parameter: |F| < 1 - \
[Figure: The Meissner Effect in a Type I Superconductor. Axis: $|H|$, with the critical value $H_{cr}$ marked. Labels: superconducting state, normal state.]

When $\lambda > 1$, we have superconductors of type II. These exhibit two critical values of the field H; we denote them $H_1$ and $H_2$. As shown in Fig. (10.3), for $H_1 < H < H_2$ the magnetic flux penetrates the superconductor gradually. Although the figure is deceptive (it shows a continuous curve in the interval $(H_1, H_2)$), it is not hard to note that the penetration of flux into the superconductor is discontinuous, since it enters in tubes whose 2-dimensional cross-section is a vortex. The flux carried in each tube is $hc/e$, which equals $2\pi$ in the units in which the action density (10.5.4) has been written. The region $(H_1, H_2)$ is characterized by a series of small increments, and the total flux is an integer multiple of $hc/e$. Hence in our accepted units it is $2\pi$ times an integer:
$$\int F = 2\pi N. \tag{10.5.6}$$
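To give a feeling for the size of the flux carried by one tube, the following Python sketch evaluates the flux quantum in SI units, where the Gaussian expression $hc/e$ with $e = 2e_{\text{electron}}$ takes the form $h/(2e_{\text{electron}})$. The use of SI units and the function names are assumptions made here for illustration; the text itself works in units where the flux per tube is $2\pi$.

```python
import math

PLANCK_H = 6.62607015e-34          # J s (exact SI value)
ELECTRON_CHARGE = 1.602176634e-19  # C (exact SI value)


def flux_quantum_si():
    """Flux per tube in webers: h / (2 e_electron), the SI analogue of hc/e."""
    return PLANCK_H / (2.0 * ELECTRON_CHARGE)


def total_flux_book_units(n_vortices):
    """Total flux 2*pi*N in the units of (10.5.6)."""
    return 2.0 * math.pi * n_vortices


phi0 = flux_quantum_si()
print(phi0)                      # about 2.07e-15 Wb
print(total_flux_book_units(3))  # flux of a 3-vortex configuration in book units
```

The quantization rule itself is unit independent: the total flux is always an integer multiple of the flux carried by a single tube.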

[Fig. 10.3: The superconducting transition for a Type II material. Labels: Type I, Type II; superconducting state (total flux expulsion), partial flux expulsion between $H_1$ and $H_2$, normal state.]

The diameter of each flux tube (as observed in experiments) in these units is of order 1. This happens to be the length scale defined by the mass of the photon, i.e., $m_{\text{photon}} = 1$. The length scale is called the penetration depth. On a length scale $m_{\text{Higgs}}^{-1}$, the magnitude of $\phi$ becomes 1; this second length scale (in terms of the Higgs mass) characterizes the spatial variation of the order parameter $\phi$ and is called the correlation length. In Fig. (10.4) we show the order parameter $\phi$ in the vicinity of the cross-section of a flux tube, which illustrates the vortex-like structure that we have been talking about. Finally, in the following two remarks we conclude the relation between the vortices and superconductors when the latter are assumed to be finite.

Remark 10.5.1 For type II materials (as observed in experiments) the flux tubes arrange themselves in a regular lattice, as the vortex-vortex force is repulsive, the lattice spacing being dependent on the geometry of the superconductor and the total flux given in (10.5.6).

Fig. 10.4 The order parameter $\phi = \phi(x)$ near a zero (vortex), treated as a vector whose horizontal (vertical) component is Re $\phi$ (Im $\phi$).

Figure (10.5) shows how for fixed N the $\lambda > 1$ vortices are arranged in a lattice that maximizes the vortex-vortex separation. The square in the bottom right here stands for the area mentioned in the case of Fig. (10.4).

[Fig. 10.5: Regular lattices of vortices for Type II superconductor]

Remark 10.5.2 For $\lambda > 1$ a vortex-configuration is unstable as the vortices are moving apart. A critical point of $\mathcal{A} = \int a$ is not stable for $|N| > 1$. Similarly a configuration with N vortices and $\lambda < 1$ can be stationary only if the positions of all vortices coincide.

5.3 Vortices when $\lambda = 1$

Having considered the cases $\lambda \neq 1$ for the Higgs model in $\mathbb{R}^2$, we now concentrate our attention on results based on $\lambda = 1$ in (10.5.4). We recall that the action:

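$$\mathcal{A} = \int_{\mathbb{R}^2} a \tag{10.5.7}$$
in terms of components can be written as:
$$\mathcal{A} = \frac{1}{2}\int_{\mathbb{R}^2}\left\{\left|(\partial_\mu - iA_\mu)\phi\right|^2 + \frac{1}{2}F_{\mu\nu}F_{\mu\nu} + \frac{\lambda}{4}\left(|\phi|^2 - 1\right)^2\right\}d^2x \tag{10.5.8}$$
where $\mu, \nu$ stand for 1, 2. Similarly the variational equations:
$$d * F_A = \frac{i}{2} * \left(\phi\,\overline{D_A\phi} - \bar\phi\,D_A\phi\right) \tag{10.5.9a}$$
$$D_A * D_A\phi = \frac{\lambda}{2} * (\bar\phi\phi - 1)\,\phi \tag{10.5.9b}$$
for (10.5.7) in terms of components are:
$$\partial_\mu F_{\mu\nu} = j_\nu \tag{10.5.10a}$$
$$\nabla_A^2\phi = \frac{\lambda}{2}\,\phi\,(|\phi|^2 - 1), \tag{10.5.10b}$$
where
$$j_\nu = -\frac{i}{2}\left(\phi\,(\partial_\nu + iA_\nu)\bar\phi - \bar\phi\,(\partial_\nu - iA_\nu)\phi\right). \tag{10.5.10c}$$
(The complex conjugate $\bar\phi$ of the complex scalar field $\phi$ was denoted $\phi^*$ in Exp. (10.2.8).) In Sec. 2 we have already seen that for finiteness of the action $\mathcal{A}$, it is required that $(A, \phi)$ satisfy the condition:
$$|\phi| \to 1, \quad D_A\phi \to 0, \quad \text{as } |x| \to \infty. \tag{10.5.11}$$
Moreover if $|\phi|$ and $D_A\phi$ reach these limits sufficiently uniformly, then the total curvature is proportional to the vortex number which is an integer:
$$N = \frac{1}{2\pi}\int_{\mathbb{R}^2} F_A = \frac{1}{2\pi}\int_{\mathbb{R}^2} F_{12}\,d^2x. \tag{10.5.12}$$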

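We shall show here the applicability of the conditions laid down in (10.5.11) by choosing the constant $\lambda = 1$. The first thing we note is that when $\lambda = 1$, using integration by parts, the action (10.5.8) can be expressed as (see [16] and [32]):
$$\mathcal{A} = \frac{1}{2}\int_{\mathbb{R}^2}\Big\{\big[(\partial_1\phi_1 + A_1\phi_2) \mp (\partial_2\phi_2 - A_2\phi_1)\big]^2 + \big[(\partial_2\phi_1 + A_2\phi_2) \pm (\partial_1\phi_2 - A_1\phi_1)\big]^2 + \big[F_{12} \pm \tfrac{1}{2}(\phi_1^2 + \phi_2^2 - 1)\big]^2\Big\}\,d^2x \pm \frac{1}{2}\int_{\mathbb{R}^2} F_{12}\,d^2x \tag{10.5.13}$$

The expression in braces { } contains only square terms, which implies that $\mathcal{A}$ has a lower bound:
$$\mathcal{A} \ge \pi|N| \tag{10.5.14}$$
(Note that we have used here (10.5.12).) Assuming N > 0, we obtain that (10.5.14) is an equality if and only if the square terms in the above integrand are separately zero, i.e.,
$$(\partial_1\phi_1 + A_1\phi_2) - (\partial_2\phi_2 - A_2\phi_1) = 0 \tag{10.5.15a}$$
$$(\partial_2\phi_1 + A_2\phi_2) + (\partial_1\phi_2 - A_1\phi_1) = 0 \tag{10.5.15b}$$
$$F_{12} + \tfrac{1}{2}(\phi_1^2 + \phi_2^2 - 1) = 0 \tag{10.5.15c}$$
On the other hand, if N < 0, $\mathcal{A} = -\pi N$ if and only if:
$$(\partial_1\phi_1 + A_1\phi_2) + (\partial_2\phi_2 - A_2\phi_1) = 0 \tag{10.5.16a}$$
$$(\partial_2\phi_1 + A_2\phi_2) - (\partial_1\phi_2 - A_1\phi_1) = 0 \tag{10.5.16b}$$
$$F_{12} - \tfrac{1}{2}(\phi_1^2 + \phi_2^2 - 1) = 0 \tag{10.5.16c}$$
The pairs $(A, \phi)$ that result from solving any one of the above sets are naturally solutions of (10.5.9) when $\lambda = 1$. (We emphasize that the solutions $(A, \phi)$ here have been obtained by using (applying in effect) the condition (10.5.11).)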
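The first two first-order equations for N > 0 can be read as the real and imaginary parts of the single complex equation $D_1\phi + iD_2\phi = 0$ with $D_\mu = \partial_\mu - iA_\mu$. The Python sketch below checks this algebra at one point, writing the two real equations in the form $(\partial_1\phi_1 + A_1\phi_2) - (\partial_2\phi_2 - A_2\phi_1)$ and $(\partial_2\phi_1 + A_2\phi_2) + (\partial_1\phi_2 - A_1\phi_1)$; all numerical values and names are arbitrary illustrative assumptions, not a solution of the equations.

```python
def covariant_derivative(d_phi, a_mu, phi):
    """D_mu phi = (d_mu - i A_mu) phi for a complex scalar phi."""
    return d_phi - 1j * a_mu * phi


# Arbitrary sample values of the fields and their derivatives at one point.
phi1, phi2 = 0.8, -0.3
d1_phi1, d1_phi2 = 0.25, 0.40
d2_phi1, d2_phi2 = -0.15, 0.60
a1, a2 = 0.5, -0.7

phi = complex(phi1, phi2)
d1 = covariant_derivative(complex(d1_phi1, d1_phi2), a1, phi)
d2 = covariant_derivative(complex(d2_phi1, d2_phi2), a2, phi)
combined = d1 + 1j * d2

# Left-hand sides of the two real first-order equations.
lhs_a = (d1_phi1 + a1 * phi2) - (d2_phi2 - a2 * phi1)
lhs_b = (d2_phi1 + a2 * phi2) + (d1_phi2 - a1 * phi1)

print(combined.real - lhs_a, combined.imag - lhs_b)
```

Replacing `d1 + 1j * d2` by `d1 - 1j * d2` gives, in the same way, the pair of equations appropriate to N < 0.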

5.4 Some Existence Theorems in the Complex Framework

In order to solve the above equations and obtain some existence results on these solutions (Thms. (10.5.4) and (10.5.5)), we use the complex framework. Thus, writing $z = x_1 + ix_2$, $\phi = \phi_1 + i\phi_2$ and $A = \alpha\,dz + \bar\alpha\,d\bar z$ where $\alpha = \frac{1}{2}(A_1 - iA_2)$, we can express $D_A\phi$ as:
$$D_A\phi \equiv d\phi - iA\phi = \big((\partial_z - i\alpha)\phi\big)\,dz + \big((\partial_{\bar z} - i\bar\alpha)\phi\big)\,d\bar z \tag{10.5.17}$$
In this set up equations (10.5.15)(a) and (b) become the real and imaginary parts of:
$$D_A\phi - i * D_A\phi = 2\big((\partial_{\bar z} - i\bar\alpha)\phi\big)\,d\bar z = 0 \tag{10.5.18}$$
But the above equality implies:

^r = ia, dz

(10.5.19a)

^-^-iaQ dz For 0 * 0 , the latter can be written as:

(10.5.19b)

or equivalently

a=;a,ln0

(10.5.20)


Using the complex framework and replacing <j> by its log function, the first order equations (10.5.15) have been reduced to a single nonlinear elliptic equation in terms of an unknown In |0| 2 . This representation makes them more easily accessible to variational techniques, besides since, poles of In \) is a smooth solution, and as such a is smooth and it extends by continuity to the zero set of 0. In order to make the zeros or the zero set of
$$\phi = \exp\tfrac{1}{2}(U + iV), \tag{10.5.21}$$
in terms of real valued functions U and V, where U is single-valued and V is multivalued. (We shall see that it is the multivalued character of V which links to the vortex number N.) Now (10.5.21) can also be written as:
$$U + iV = 2\ln\phi \tag{10.5.22}$$
which leads to:
$$\alpha = \frac{i}{2}\frac{\partial}{\partial z}(U - iV) \tag{10.5.23}$$
showing that once $\phi$ is known, $\alpha$ and hence A is also known. From (10.5.21) it follows that:
$$\frac{\phi}{|\phi|} = e^{iV/2}, \tag{10.5.24}$$
which defines a map of the circle at $\infty$ to the circle parametrized by V = arg z; denoting this map by $\beta$ we thus have:
$$\beta(V) = \lim_{|z|\to\infty}\frac{\phi}{|\phi|} = \lim_{|z|\to\infty} e^{iV/2}. \tag{10.5.25}$$
The winding number of this map $\beta$ (as we already know from Exp. (10.2.8)) equals the vortex number N:
$$\text{winding number of } \beta = N = \frac{1}{2\pi}\int F \tag{10.5.26}$$
For $N \neq 0$, by arguments of Homotopy Theory it follows that $\phi$ must have a zero and that $V(|z|, \theta)$ cannot be smooth. We have thus established the fact that whenever $N \neq 0$, under appropriate conditions the scalar field $\phi$ pertaining to solutions of the variational equations (10.5.15-16) has a non-empty zero set. We shall use this fact in the three results stated below.

24 This assumption is based on the Thm. (10.5.4), which we decided to state after actually using the consequences of it.
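The link between the winding number of $\phi/|\phi|$ on a large circle and the number of zeros of $\phi$ can be illustrated numerically. The following Python sketch accumulates the phase of a test field along $|z| = R$; the polynomial test field, the radius and the function names are hypothetical choices made for illustration and do not represent a solution of the vortex equations.

```python
import cmath
import math


def winding_number(phi, radius=50.0, samples=4000):
    """Number of times phi/|phi| winds around the origin along |z| = radius."""
    total = 0.0
    previous = cmath.phase(phi(cmath.rect(radius, 0.0)))
    for k in range(1, samples + 1):
        theta = 2.0 * math.pi * k / samples
        current = cmath.phase(phi(cmath.rect(radius, theta)))
        # Bring each phase increment into [-pi, pi) before accumulating.
        step = (current - previous + math.pi) % (2.0 * math.pi) - math.pi
        total += step
        previous = current
    return round(total / (2.0 * math.pi))


def sample_field(z):
    """Hypothetical field with a simple zero at z = 1 and a double zero at z = -2."""
    return (z - 1.0) * (z + 2.0) ** 2


def conjugate_field(z):
    """Complex conjugate of the sample field; it winds in the opposite sense."""
    return sample_field(z).conjugate()


n_vortex = winding_number(sample_field)
n_antivortex = winding_number(conjugate_field)
print(n_vortex, n_antivortex)  # prints: 3 -3
```

The zeros are counted with multiplicity, and a non-zero result forces the field to vanish somewhere inside the circle.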


The first of these results (Thm. (10.5.4)) is an existence theorem for solutions to (10.5.15) and (10.5.16), the second shows (in effect) that these are either vortex or anti-vortex solutions. The third one (Thm. (10.5.7)) shows that, for a smooth solution, the set of points where 0 vanishes equals the set of points where V is not smooth. Furthermore this result enables one to express 0 analytically in terms of its zeros. Before stating these results we make a useful remark in this connection: Remark 10.5.3 Any solution (A', <j)') with N < 0 can be obtained from corresponding solution (A, 0) with N > 0 by writing cc'(z) = -cc (- z) and 0' (z) = 0 ( - z) where a, a, etc., are as given in (10.5.17). The solutions for N > 0 are called N-vortex solutions and for N < 0 they are known as N-anti-vortex solutions. For N = 0 the solution is trivial with 0 = 1 and A = 0 modulo gauge transformations as will be evident from Thm. (10.5.4) (see Exc. (10.5.1)). Theorem 10.5.4 When A < °°, given an integer N > 0 and a set {z,} of N points in C (not necessarily distinct), there always exists a solution to (10.5.15) which is unique up to gauge equivalence and has the properties: (i) It is globally C°. (10.5.27) 1 (ii) The zeros of 0 are the set of points {z^}, and as z —> z,,
0 and the const = const (8) depends on the choice of S.

(ii) N = — f , FA = •6/t

T n , = n~x& distinct z,

When the integer N in the above theorem is replaced by TV < 0, the solution satisfying the above conditions exists for equations (10.5.16), but with the following changes: 0(z, z) ~ c ; (z - z,-)"'

as

z -> Zi

and

N=

~

S distinct z,

», = " £ £ 2 ^ = - Of1 *>• *•"••

Theorem 10.5.5 Every critical point of the finite action functional $\mathcal{A}$ as given in (10.5.8) with $\lambda = 1$ is a solution that is described in Thm. (10.5.4).

(Out of many existence results on vortices, we choose to give this simplistic result due to the following remark:)

Remark 10.5.6 The above theorem categorically says that there do not exist any solutions to the variational equations (10.5.9) that are not solutions to (10.5.15) or (10.5.16). Putting it more simply, for $\mathcal{A} < \infty$ the variational equations of first and second order are equivalent with regard to their solutions. This fact justifies the reduction of second order equations to the first order equations.

Theorem 10.5.7 Let $Z(\phi) = \{z \in \mathbb{C}: \phi(z) = 0\}$ and $S(V) = \mathbb{C}\setminus\{z \in \mathbb{C}: V(z) \in C^\infty\}$ denote the zero set and the singular set pertaining to a smooth solution $(A, \phi)$ of (10.5.15). Suppose that $(A, \phi)$ is a finite action solution to the first order equations (10.5.15) such that its components there are locally square integrable. Then $(A, \phi)$ is a smooth solution and there exist N (not necessarily distinct) points $\{z_1, z_2, \dots, z_N \in \mathbb{C}\}$ for which:
$$Z(\phi) = S(V) = \{z_1, z_2, \dots, z_N\} \tag{10.5.29}$$
Also for each distinct $z_j$ there exists a neighborhood, say $N_j$, in which $\phi(z)$ can be written as:
$$\phi(z) = (z - z_j)^{n_j}\,h_j(z) \tag{10.5.30}$$
where $n_j$ is the multiplicity of $z_j$ in $\{z_1, z_2, \dots, z_N\}$ and $h_j(z)$ is a non-vanishing $C^\infty$-function in $N_j$. (Note that (10.5.30) implies that the zeros of the Higgs field $\phi$ can be interpreted to give the location of vortices.)

Due to our limited scope we do not give the verbatim proofs of the first two theorems; the interested reader should refer to [32] where these are Thms. (1.1) and (1.2) of Chap. 3. We prove the third Thm. (10.5.7) in some detail, essentially to give the flavor of the topic to our readers. The following two results are used in the proof.

Result 10.5.8 Let (A, 0. If there exists a gauge in which the components of $(A, \phi)$ and their first order derivatives are locally square integrable, then $(A, \phi)$ is gauge equivalent to a globally $C^\infty$-solution and locally gauge equivalent to a real analytic solution in

(xl,x2). Result 10.5.9 For a smooth finite action solution (A, ) to (10.5.9), given e> 0, there exists an M that depends on e and (A, )) such that: |Re(0£V)| + (1 -
\lm(0DA
(10.5.31)

(10.5.32)

The mL and mT are the notations for Higgs' particle mass and photon mass; in this case they are:

mL = min (VX, 2), m r s 1. Proof In proving the Thm. (10.5.7) we establish (10.5.30) before (10.5.29). To write the expansion (10.5.30) of ) is also a smooth solution of (10.5.15)). (ii) Since (a, ) satisfies the first order equation (10.5.19a) (a)

$$(\partial_{\bar z} - i\bar\alpha)\,\phi = 0,$$
we use the $\bar\partial$-Poincaré lemma which says: "given a $C^\infty$-function $i\bar\alpha(z)$ on a closed disc $B \subset \mathbb{C}$, the differential equation

(b) $\partial_{\bar z}\,\omega(z) = i\bar\alpha(z)$

has a $C^\infty$-solution $\omega(z)$ in the interior of B" to write a complex analytic function:
$$\Omega(z) = e^{-\omega(z)}\,\phi(z)$$
($\Omega(z)$ is complex analytic as $\partial_{\bar z}\big(e^{-\omega(z)}\phi(z)\big) = 0$ in view of (a) and (b)).

We now note that $\phi = e^{\omega}\,\Omega$ is the product of a non-vanishing $C^\infty$-function $e^{\omega}$ and a complex analytic function $\Omega$. As the zeros of a complex analytic function are discrete, only a finite number of them occur in B. If $z_j$ is such a zero with multiplicity $n_j$, then in a neighbourhood of $z_j$ there is an analytic function $\Omega_j$ such that

(c) $\Omega(z) = (z - z_j)^{n_j}\,\Omega_j(z).$

Hence writing $h_j = \Omega_j\,e^{\omega}$ we obtain (10.5.30). To establish that the total number of zeros $z_1 \dots z_N$ is finite, we use the decay rates of $\phi$ given by the following inequality25:

0 < 1 - |0| < O exp f- M'J

(10.5.33)

in view of this $\phi(z)$ is non-zero outside B. Hence the zeros of $\phi$ are inside B and they are finite in number. To establish (10.5.29) we first note that, from (10.5.23), though V is gauge dependent, the sets $Z(\phi)$ and $S(V)$ are invariant under smooth gauge transformations. We then use the inclusion argument on these sets in both directions. Since $\phi$ is smooth, any singularity of V must be a zero of $\phi$; hence $S(V) \subset Z(\phi)$. Conversely, let $z_j \in Z(\phi)$; then the representation (10.5.30) holds near $z_j$, moreover we can write
$$\phi(z) = |z - z_j|^{n_j}\,|h_j(z)|\,e^{i n_j \arg(z - z_j)}\,e^{i\arg h_j(z)} = e^{\frac{1}{2}(U + iV)} \tag{10.5.34}$$
(we have used here the equality $z = |z|e^{i\arg z}$ and also the Eq. (10.5.21)). Thus in a neighbourhood of $z_j$, $V = 2\,(n_j \arg(z - z_j) + \arg h_j(z))$, mod $2\pi$. Since $h_j(z)$ is a non-vanishing $C^\infty$-function, $\arg h_j(z)$ is also $C^\infty$. Hence $z_j \in S(V)$, i.e., $Z(\phi) \subset S(V)$, giving us the equality (10.5.29). □

We would like to note here that in the proof of the theorem, we have used the Result (10.5.8) (although we have not explicitly mentioned it) when we took $(A, \phi)$ as a smooth solution. In conclusion to our discussions in this section, we state the following two results which the reader will realize are quite important.

Theorem 10.5.10 Let $(A, \phi)$ be a finite action smooth solution to the variational equations (10.5.9). Then either $|\phi| = 1$ or else $|\phi(x)| < 1$. Moreover for $\lambda < 1$,

and

|*F| < —(1 - |0|2)

(10.5.35a)

|D A 0| < - | (1 - \
(10.5.35b)

Hence if $|\phi| = 1$, then both F and $D_A\phi$ are identically zero.

Result 10.5.11 The parameter space of smooth finite action solutions to Eqs. (10.5.15) and (10.5.16) for fixed N is 2|N|-dimensional.

25 This inequality follows from (10.5.31).


Exercise 10.5 1 Show that when N = 0, a finite action solution is trivial with 0 = 1 and A = 0 modulo gauge transformations.

Hints to Exercise 10.5 1. From the statement of Thm. (10.5.4), when N = 0 the set $\{z_1, z_2, \dots, z_N\}$ has no points. In other words,

6 ANOMALIES

In this section we shall study anomalies in brief with greater emphasis on those that are related to the Yang-Mills theory. An anomaly is said to occur in nature if the symmetry-laws of a physical system, after it undergoes a change, are no longer valid. In simplistic terms it is a 'breakdown' of some accepted physical law. We shall see that some of these 'breakdowns' produce a welcome feature into the theory while others lead to inconsistencies. According to physicists this 'breakdown' is a "failure to renormalizability" of the theory, and it appears in the end equation (e.g., the equation of conserved current) in the form of some extra terms known as anomalous terms. For mathematicians, however, anomaly shows up as an "obstruction" to the existence of a group invariant theory. The group in question can be a gauge group, a group of diffeomorphisms or a group of conformal transformations; the anomalies resulting from these choices are respectively called gauge, gravitational and conformal anomalies; they are respectively relevant to gauge theories e.g. Yang-Mills, electromagnetism, or electroweak, to the theory of gravitation and to the theory of strings. We discuss below the basics of this important26 topic from both points of view, illustrating with the help of examples. Due to our limited scope we skip the rigorous proofs; however we feel the material here along with the references is sufficient to motivate an inquisitive reader.

6.1 Renormalization (The Physicist's Approach)

From our introductory paragraph it follows that renormalization and symmetry (in particular gauge symmetry) must be closely related. The symmetry relations among Green's functions are generally known as Ward identities (see [51], 9.[6]). It can be shown that in a theory with non-trivial symmetry, renormalizability depends on the cancellation of divergences from different sectors of the theory, and these cancellations are enforced by the Ward identities. This relation between renormalization and symmetry is even closer in gauge theories, where spurious degrees of freedom are introduced to make sense of the theory, and the Ward identities are then used to cancel these unphysical states when writing the physical S-matrix elements.

26 The importance of the topic is borne out by the various conferences held on it during the last decade, see [36], [54], (ii) in [49], and the literature that followed in the form of dissertations and research papers [11], [12], [46(c), (d)].

Theory of Yang-Mills and The Yang-Mills-Higgs Mechanism 629

Thus, for instance, when we make perturbative calculations in a gauge field theory (see Chapter 9, in particular Sec. 9.6), there are loop diagrams with ultraviolet divergences*. The renormalization process separates away these divergent parts and re-interprets them as multiplicative renormalizations of the fields, couplings and masses in the original Lagrangian, known as the bare Lagrangian. The bare Lagrangian $\mathcal{L}_B$ is written as a sum of the renormalized Lagrangian $\mathcal{L}_R$ and some counter terms:

$\mathcal{L}_B(\phi_B) = \mathcal{L}_R(\phi_R) + \text{C.T.}$ (10.6.1)

The renormalized Lagrangian is of the same functional form as the bare Lagrangian, with the original couplings ($g$) and masses ($m$) replaced by renormalized ones, thus:

$\mathcal{L}_R(\phi_R, g_R, m_R) = \mathcal{L}_B(\phi_B, g_B, m_B)$ (10.6.2)

In addition $\mathcal{L}_R$ is invariant under a set of local gauge transformations isomorphic to those which leave $\mathcal{L}_B$ invariant. For any computation, one now uses the Lagrangian $\mathcal{L}_R$ with canonical vertices and propagators and omits the divergent parts (which are already accounted for in the renormalization). While separating away the divergent parts (as required in the renormalization process), one has to modify the integrals so that they are finite and can be used in the equations. This modification of integrals, known as 'regularization' (see App. 9C), is eventually removed before comparison with experiments. We note that regularization must violate some physical laws to make sense of renormalization of the given theory. In the case of non-abelian gauge theories (e.g., Yang-Mills), 'dimensional regularization' is used as a means to renormalization. Here the space-time dimensionality is analytically continued from the real physical value 4 to a complex generic value $n$. With real $n$ sufficiently small, the ultraviolet behavior is convergent; eventually the limit $n \to 4$ is taken, while the poles in $1/(n-4)$ define the counter terms. As long as the theory is free of fermions, the Ward/generalized Ward identities implied by gauge invariance are maintained for general $n$ (see A.13). Before one applies dimensional regularization, it is customary to rewrite the propagators in a convenient representation by using the Feynman parameter formulae. These formulae help in writing the integral in an arbitrary dimension $n$ (see Exc. 61). For details and examples on dimensional regularization, the reader should see [26] and 9.[6]. In one of the following two examples, we show how an anomaly can be obtained in gauge theory via renormalization. In particular we show that the appearance of an anomaly implies that Ward identities are no longer satisfied after renormalization. (These examples use our earlier study on quantization (Sec. 9.3-6) and then dimensional regularization; see footnote 27.)

Example 10.6.1

Recall that the $\lambda\phi^4$ theory of a complex scalar field has a conserved vector current $J_\mu$. The canonical commutation relation for the complex fields $\phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2)$,

* A divergence at high energy or large $k$ (momentum-space variable) for internal lines of a Feynman graph is called an ultraviolet divergence; it gives rise to divergent integrals. This is typical of diagrams containing loops in quantum field theories (see Fig. 10.7).

27 We have followed here the notations of Cheng & Li 9.[6], which in some places are different from Chapter 9.


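$[\partial_0\phi^*(\mathbf{x},t),\, \phi(\mathbf{x}',t)] = -i\,\delta^3(\mathbf{x}-\mathbf{x}')$ (10.6.3)

leads to the commutators:

$[J_0(\mathbf{x},t),\, \phi(\mathbf{x}',t)] = i\,[\partial_0\phi^*(\mathbf{x},t),\, \phi(\mathbf{x}',t)]\,\phi(\mathbf{x},t) = \delta^3(\mathbf{x}-\mathbf{x}')\,\phi(\mathbf{x},t)$ (10.6.4a)

$[J_0(\mathbf{x},t),\, \phi^*(\mathbf{x}',t)] = -\delta^3(\mathbf{x}-\mathbf{x}')\,\phi^*(\mathbf{x},t)$ (10.6.4b)

($J_0$ is the time component of $J_\mu = i[(\partial_\mu\phi^*)\phi - (\partial_\mu\phi)\phi^*]$.) We use these to write the three-point Green's function given in Fig. (10.6) as:

[Figure 10.6: The Green's function of two scalar fields coupled to a vector current.]

$G_\mu(p,q) = \int d^4x\, d^4y\; e^{-iq\cdot x - ip\cdot y}\, \langle 0|T(J_\mu(x)\,\phi(y)\,\phi^*(0))|0\rangle$ (10.6.5)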

Using the current algebra this can be written as:

$q^\mu G_\mu(p,q) = -i\int d^4x\, d^4y\; e^{-iq\cdot x - ip\cdot y}\, \partial^\mu_x \langle 0|T(J_\mu(x)\,\phi(y)\,\phi^*(0))|0\rangle$

$\quad = -i\int d^4x\, d^4y\; e^{-iq\cdot x - ip\cdot y}\, \{\langle 0|T(\partial^\mu J_\mu(x)\,\phi(y)\,\phi^*(0))|0\rangle + \langle 0|T(\delta(x_0-y_0)\,[J_0(x),\phi(y)]\,\phi^*(0))|0\rangle + \langle 0|T(\delta(x_0)\,[J_0(x),\phi^*(0)]\,\phi(y))|0\rangle\}$ (10.6.6)

Now the first term on the RHS of (10.6.6) vanishes because of current conservation ($\partial^\mu J_\mu = 0$), and the other two terms simplify in view of (10.6.4); as a result (10.6.6) gives:

$q^\mu G_\mu(p,q) = -i\int d^4x\; e^{-i(p+q)\cdot x}\, \langle 0|T(\phi(x)\,\phi^*(0))|0\rangle + i\int d^4y\; e^{-ip\cdot y}\, \langle 0|T(\phi(y)\,\phi^*(0))|0\rangle$ (10.6.7)

But the two terms on the RHS are just the propagators for the scalar field, hence we have:

$-i\,q^\mu G_\mu(p,q) = \Delta(p+q) - \Delta(p) = (p+q)^2 - p^2$ (10.6.8)

giving us the vector-current Ward identity. We show in brief, using dimensional regularization for one-loop diagrams, that the Ward identity is satisfied in this case. We recall that the Green's function can be expressed in terms of the amputated Green's function $\Gamma_\mu(p,q)$ and the 1PI self-energy $\Sigma$ (see Subsec. 9.6.2) as:

$\Gamma_\mu(p,q) = [i\Delta(p+q)]^{-1}\, G_\mu(p,q)\, [i\Delta(p)]^{-1}$, (10.6.9a)

with

$[\Delta(p)]^{-1} = p^2 - m^2 - \Sigma(p)$ (10.6.9b)

The Ward identity (10.6.8) now becomes:

$i\,q^\mu\,\Gamma_\mu(p,q) = (p+q)^2 - p^2 - \Sigma(p+q) + \Sigma(p)$. (10.6.10)

For the contributions to $\Gamma_\mu(p,q)$, Fig. (10.7)(i) shows the vertex function in the tree approximation.

[Figure 10.7: Tree and one-loop diagrams (i)-(iv) for the vertex function of the vector current and scalar fields.]

For the one-loop diagram (10.7)(ii), we use the dimensional regularization (as mentioned above) and have:

(10.6.12)

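For $n < 2$ the first integral on the RHS is convergent; moreover, by shifting the integration variable from $k$ to $k - q$, we note that:

$i\,q^\mu\,\Gamma^{(ii)}_\mu(p,q) \propto \int \frac{d^nk}{(2\pi)^n}\left[\frac{1}{(k+q)^2 - m^2} - \frac{1}{k^2 - m^2}\right] = 0$ (10.6.13)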

This is true even when we analytically continue to n > 2 and in particular to n = 4. The contribution of Fig. (10.7)(iii) is given by: iq* r f ' (p, q) = iq" (-/) (2p + g)

' _

2

[2(p + 9 ) - S(0)],

(10.6.14)

where (10.6.15)


This is independent of the external momentum, thus the subtracted self-energy $\Sigma(p) - \Sigma(0)$ vanishes, so the contribution from (iii) is zero. The same is true for Fig. (10.7)(iv). Using dimensional regularization we have thus shown that the Ward identity (10.6.8) is satisfied (in perturbation theory) up to one-loop order; in other words, there is no anomaly in this case. Our next example shows the appearance of anomalies from renormalization.

Example 10.6.2 Let $V_\mu$, $A_\mu$ and $P$ denote the vector, axial-vector and pseudoscalar currents (in electrodynamics) given by:

$V_\mu(x) = \bar\psi(x)\gamma_\mu\psi(x)$, $\quad A_\mu(x) = \bar\psi(x)\gamma_\mu\gamma_5\psi(x)$, $\quad P(x) = \bar\psi(x)\gamma_5\psi(x)$ (10.6.16)

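in terms of the fermion field $\psi(x)$. Consider the three-point functions:

$T_{\mu\nu\lambda}(k_1, k_2, q) = i\int d^4x_1\, d^4x_2\; \langle 0|T(V_\mu(x_1)\,V_\nu(x_2)\,A_\lambda(0))|0\rangle\; e^{ik_1\cdot x_1 + ik_2\cdot x_2}$ (10.6.17)

and

$T_{\mu\nu}(k_1, k_2, q) = i\int d^4x_1\, d^4x_2\; \langle 0|T(V_\mu(x_1)\,V_\nu(x_2)\,P(0))|0\rangle\; e^{ik_1\cdot x_1 + ik_2\cdot x_2}$ (10.6.18)

where $q = k_1 + k_2$.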

In order to obtain the Ward identities relating $T_{\mu\nu\lambda}$ and $T_{\mu\nu}$, the divergences of $V_\mu$ and $A_\mu$ are required. These are calculated from the equations of motion:

$\partial^\mu V_\mu(x) = 0$ (10.6.19a)

$\partial^\mu A_\mu(x) = 2im\,P(x)$, (10.6.19b)

where $m$ is the mass of the fermion field. The current $J_\mu$ in relation to a local operator $O(y)$ satisfies the following (see footnote 28):

$\partial^\mu_x\, T(J_\mu(x)\,O(y)) = T(\partial^\mu J_\mu(x)\,O(y)) + [J_0(x),\, O(y)]\,\delta(x_0 - y_0)$ (10.6.20)

We also know that in this case the equal-time commutators vanish, i.e.

$[V_0(x),\, A_0(y)]\,\delta(x_0 - y_0) = 0$ (10.6.21)

Hence, using all this information, we can (formally) derive the vector and axial-vector Ward identities given below:

$k_1^\mu\, T_{\mu\nu\lambda} = k_2^\nu\, T_{\mu\nu\lambda} = 0$ (10.6.22)

and

$q^\lambda\, T_{\mu\nu\lambda} = 2m\,T_{\mu\nu}$ (10.6.23)

It can be checked that once the theory is renormalized, the Ward identities are not satisfied, even when only the lowest-order contributions (shown in Figs. (10.8)(a) and (b)) are taken into account in writing $T_{\mu\nu\lambda}$ and $T_{\mu\nu}$.

28 Eq. (10.6.20) is written using the current algebra techniques (see [1]).

[Figure 10.8: (a) Lowest order contributions to $T_{\mu\nu\lambda}$ of Eq. (10.6.17). (b) Lowest order contributions to $T_{\mu\nu}$ of Eq. (10.6.18).]

The $T_{\mu\nu\lambda}$ and $T_{\mu\nu}$ obtained from these contributions are:

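$T_{\mu\nu\lambda} = i\int \frac{d^4p}{(2\pi)^4}\,(-1)\,\mathrm{Tr}\left[\frac{i}{\slashed{p} - m}\,\gamma_\lambda\gamma_5\,\frac{i}{(\slashed{p} - \slashed{q}) - m}\,\gamma_\nu\,\frac{i}{(\slashed{p} - \slashed{k}_1) - m}\,\gamma_\mu\right] + (k_1 \leftrightarrow k_2,\ \mu \leftrightarrow \nu)$ (10.6.24)

$T_{\mu\nu} = i\int \frac{d^4p}{(2\pi)^4}\,(-1)\,\mathrm{Tr}\left[\frac{i}{\slashed{p} - m}\,\gamma_5\,\frac{i}{(\slashed{p} - \slashed{q}) - m}\,\gamma_\nu\,\frac{i}{(\slashed{p} - \slashed{k}_1) - m}\,\gamma_\mu\right] + (k_1 \leftrightarrow k_2,\ \mu \leftrightarrow \nu)$ (10.6.25)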

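Since $m\gamma_5 = \gamma_5 m$ and $\slashed{p}$ as well as $\slashed{q}$ anticommute with $\gamma_5$, we can use the relation:

$\slashed{q}\gamma_5 = \gamma_5(\slashed{p} - \slashed{q} - m) + (\slashed{p} - m)\gamma_5 + 2m\gamma_5$ (10.6.26)

to obtain:

$q^\lambda\, T_{\mu\nu\lambda} = 2m\,T_{\mu\nu} + \Delta^{(1)}_{\mu\nu} + \Delta^{(2)}_{\mu\nu}$, (10.6.27)

where

$\Delta^{(1)}_{\mu\nu} = \int \frac{d^4p}{(2\pi)^4}\,\mathrm{Tr}\left[\frac{i}{\slashed{p} - m}\,\gamma_5\gamma_\nu\,\frac{i}{(\slashed{p} - \slashed{k}_1) - m}\,\gamma_\mu - \frac{i}{(\slashed{p} - \slashed{k}_2) - m}\,\gamma_5\gamma_\nu\,\frac{i}{(\slashed{p} - \slashed{q}) - m}\,\gamma_\mu\right]$ (10.6.28)

and $\Delta^{(2)}_{\mu\nu}$ is obtained from $\Delta^{(1)}_{\mu\nu}$ by interchanging $k_1 \leftrightarrow k_2$ and $\mu \leftrightarrow \nu$.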


The Ward identity (10.6.23) would hold if the integrals $\Delta^{(i)}_{\mu\nu}$ ($i = 1, 2$) were zero. But this is not the case: the integrals are linearly divergent, and a translation of the integration variable, such as $p \to p + k_2$ in the second term of $\Delta^{(1)}_{\mu\nu}$, produces extra finite terms, eventually giving a non-zero result. Hence the Ward identity does not hold after renormalization; in other words, Ward identities involving axial-vector currents are spoiled by renormalization via the triangle fermion loops. Both these examples dealt with abelian gauge theory. These ideas can also be extended to non-abelian gauge theories, and the presence of anomalous terms in the divergence of the axial-vector current can be shown after renormalization (see for instance Bardeen [14] and Exc. 3).
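The algebraic relation (10.6.26) behind this result can be checked in two lines. The sketch below assumes only the properties stated in the example, namely that $\gamma_5$ anticommutes with $\slashed{p}$ and $\slashed{q}$ and commutes with $m$; the slashed notation $\slashed{p} = \gamma^\mu p_\mu$ is an assumed shorthand.

```latex
\begin{align*}
\gamma_5(\slashed{p}-\slashed{q}-m) + (\slashed{p}-m)\gamma_5 + 2m\gamma_5
  &= (\gamma_5\slashed{p} + \slashed{p}\gamma_5) - \gamma_5\slashed{q}
     + (-m - m + 2m)\,\gamma_5 \\
  &= 0 + \slashed{q}\gamma_5 + 0 \;=\; \slashed{q}\gamma_5 .
\end{align*}
```

Inserted between the propagators of the triangle graph, the first two terms each cancel one propagator and produce the integrals $\Delta^{(i)}_{\mu\nu}$, while the $2m\gamma_5$ term reproduces $2m\,T_{\mu\nu}$.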

6.2 Anomaly as an "Obstruction" (A Mathematician's Point of View)

Here one deals with two different types of anomalies, namely the local and the global. The latter, which are more difficult to comprehend, appear in a theory when the gauge group $\mathcal{G}$ or the group Diff$^+(M)$ has more than one connected component. We shall only discuss the local anomalies, based on the text [39]; for an account of global anomalies see Witten's papers in [54] and [36], the text [39], and our remarks (10.6.4-7) indicating their importance in symmetry breaking phenomena. For simplicity we consider only Riemannian manifolds of even dimension. The first choice gives space-time a Euclidean character and the second allows the existence of chirality in the theory. We can therefore couple the gauge field to chiral fermions and use quantum methods to study the problem at hand. To begin with, we note that the introduction of fermions in the theory restricts $M$ to be a spin manifold*. We shall see that the anomaly here turns out to be an obstruction to the existence of a gauge invariant determinant of the chiral Dirac operator $\slashed{\partial}_A$ pertaining to the gauge field $A$. We recall that if $\omega$ and $(d + \omega)$ respectively denote the Riemannian spin connection and the covariant derivative on $M$, and $(e^\mu_a(x))$ define the orthonormal frame (vielbein) at $x \in M$, then the Dirac operator $\slashed{D}$ is the map defined over the sections of the spinor bundle $E$ (see Subsec. 7.2.2):

$\slashed{D} : \Gamma(M, E) \to \Gamma(M, E)$ (10.6.29)

with

$\slashed{D} = \gamma^a\, e^\mu_a(x)\,(\partial_\mu + \omega_\mu)$, (10.6.30)

where $\gamma^a$ are the Dirac matrices, and $e^a_\mu$ satisfies $e^a_\mu(x)\,e^b_\nu(x)\,\delta_{ab} = g_{\mu\nu}(x)$ in terms of the Riemannian metric $g_{\mu\nu}$ on $M$. The chirality operator $\gamma_5 = i^{n(n+1)/2}\,\gamma^1 \cdots \gamma^n$ anticommutes with $\slashed{D}$, and allows one to define another operator:

$\slashed{\partial} = \slashed{D}\,(1 + \gamma_5)/2$, (10.6.31)

* A manifold M with spin structure is called a spin manifold (see Subsec. 7.2.2 in particular Def. (7.2.3-5)). A good description of these ideas can be found on pages 272-278 in 11.[13], where it is shown that superstring theories can be described only on spin manifolds. See also 7.[14].

which is called the chiral Dirac operator. When the gauge field $A$ is introduced we have:

$\slashed{\partial}_A = \slashed{D}_A\,(1 + \gamma_5)/2 = \gamma^a\, e^\mu_a(x)\,(\partial_\mu + \omega_\mu + A_\mu)\,(1 + \gamma_5)/2$ (10.6.32)

The operator $\slashed{\partial}_A$ is now defined on the sections:

$\slashed{\partial}_A : \Gamma(M, E^+ \otimes F) \to \Gamma(M, E^- \otimes F)$, (10.6.33)

where $E^\pm$ are the chiral spin bundles ($E = E^+ \oplus E^-$) and $F$ is the bundle through which fermions are coupled to the connection $A$. It is known that if $n_+$ and $n_-$ are the numbers of positive and negative chirality massless fermions and $M$ is the sphere $S^{2m}$, then:

index $\slashed{\partial}_A = n_+ - n_- = \int_{S^{2m}} \mathrm{Ch}(F) = k$, (10.6.34)

where $\mathrm{Ch}(F)$ is the Chern character of the bundle $F$ and $k$ is the integer $C_m(F)$ which classifies bundles over $S^{2m}$; when $n = 4$, $-k$ is the familiar instanton number. Returning to the problem of the anomaly, we consider the action $S$ (with $\mathcal{G}$ as the gauge group) given by:

$S \equiv S(A, \psi) = \|F\|^2 + \langle\bar\psi,\, \slashed{\partial}_A\psi\rangle = -\mathrm{Tr}\int_M F \wedge *F + \int_M \bar\psi(x)\,\slashed{\partial}_A\,\psi(x)$ (10.6.35)

and define the functional integral:

$Z = \int \mathcal{D}A\, \mathcal{D}\psi\, \mathcal{D}\bar\psi\; \exp\left[-\|F\|^2 - \langle\bar\psi,\, \slashed{\partial}_A\psi\rangle\right]$ (10.6.36)

The fermionic integration is done using the expression:

$\int \mathcal{D}\psi\, \mathcal{D}\bar\psi\; \exp\left[-\langle\bar\psi,\, \slashed{\partial}_A\psi\rangle\right] = \sqrt{\det(\slashed{\partial}_A^{\,*}\,\slashed{\partial}_A)}$. (10.6.37)

Thus $Z$ can be written as:

$Z = \int_{\mathcal{A}} \mathcal{D}A\; \sqrt{\det(\slashed{\partial}_A^{\,*}\,\slashed{\partial}_A)}\; \exp\left[-\|F\|^2\right]$. (10.6.38)

Here $\mathcal{A}$ denotes the space of connections and

(10.6.39)

Following the usual practice in the literature we use the notation $\mathcal{A}/\mathcal{G}$ for the quotient. Since (homotopically) $\mathcal{A}$ is contractible (see Subsec. 0.6.6), $\mathcal{A}$ is the total space of a universal bundle, and hence $\mathcal{A}/\mathcal{G}$ is the classifying space for $\mathcal{G}$-bundles, meaning that it is the base space $B\mathcal{G}$. (See [17], 2.[18] and [39].)

although $\slashed{\partial}_{A^g} = g^{-1}\,\slashed{\partial}_A\, g$ is an expression which is similar to the gauge transformation of $F$. The failure of this gauge invariance stems from the fact that we are in infinite dimensions and therefore the Dirac determinant requires regularization. Moreover, if $A$ and $B$ are infinite dimensional operators then in general $\det AB \neq \det A \det B$ (see footnote 29). The det {$*

6.3 Anomaly as a Welcome Phenomenon

As mentioned in the introduction, 'symmetry breakdowns' are sometimes welcome features, since the resulting anomalies help in pinning down the changes in the theory. We enumerate some of these in the following remarks.

Remark 10.6.4 Consider the classical scale invariance of QCD (see footnote 30) with massless quarks. This invariance (a global symmetry) needs to be broken. The breakdown here is represented by the anomaly in the trace of the energy-momentum tensor, which leads to the "emergence of hadrons as bound states with non-zero mass." Another example, again from QCD, is given in the next remark.

Remark 10.6.5 In QCD with $n$ massless quarks there is a global $U_L(n) \times U_R(n)$ symmetry, since the left and right handed components of the quarks rotate independently of each other. In order to account for the approximate vectorial $SU(n)$ symmetry of the observed hadrons in the theory, this symmetry needs to be broken to a diagonal $SU(n)$ subgroup.

We re-emphasize that the consequences of local (gauge) symmetry breakdowns are quite different, since absence of gauge invariance in a quantum theory with gauge fields suggests that there are inconsistencies in the theory. We further note that gauge invariance here is required for decoupling longitudinally polarized gauge fields, since the longitudinal states must decouple from physical processes in order for the theory to be unitary. Evidently the anomalies mentioned in the first two remarks are interpreted as representing a change in physical phenomena, whereas in the third case (local gauge symmetry) one looks for cancellation of the anomaly. We shall briefly explain this idea in Example (10.6.8) of the last subsection. In the next subsection we briefly explain the meaning of the term 'chiral gauge theory.'

29 If $A$ and $B$ are trace class then $\det AB = \det A \det B$, and there is no need for regularization. Recall that an operator $T$ is trace class if it is compact and the sequence of eigenvalues of $(T^*T)^{1/2}$ ($T^*$ = adjoint of $T$), counted with their multiplicity, is summable.

30 QCD = Quantum chromodynamics; see Chapters 10 and 11 of 9.[6] for details.

6.4 Chiral Gauge Theories

Theories where parity is violated by gauge couplings are called chiral gauge theories. In a chiral gauge theory the left and right handed fermions transform differently under the gauge group of the theory. Suppose $R$ and $\tilde{R}$ are the representations of the gauge group resulting from left and right handed fermions; then, since in four dimensions the anti-particles of left handed fermions are right handed and vice-versa, CPT symmetry requires that $\tilde{R}$ be the complex conjugate representation of $R$. Hence in 4 dimensions a gauge theory is a chiral gauge theory only if the representation $R$ is complex, i.e. not equivalent to its complex conjugate, as $R$ and $\tilde{R}$ have to be distinct (in chiral gauge theories). It follows that gauge groups which have no complex representations, such as $SO(32)$ or $E_8 \times E_8$, do not give a chiral theory. In the following two remarks we show the effect of dimensionality on chirality, and the cases where one may expect gauge anomalies.

Remark 10.6.6 In $4k$ dimensions for $k > 1$, the situation is very similar to that in 4 dimensions. In $4k + 2$ dimensions, however, it is different and of great interest, since $k = 2$ gives the 10 dimensions of superstring theory and $k = 0$ the 2 dimensions of the world sheet. In $4k + 2$ dimensions the antiparticle of a left handed particle is again left handed, and therefore one can construct theories containing fermions of one chirality only. These fermions of given chirality in $4k + 2$ dimensions belong to a real representation of the gauge group: if we start with a complex representation $Q$, then the antiparticles must belong to $\bar{Q}$, which means we have the real representation $Q \oplus \bar{Q}$. Thus in $4k + 2$ dimensions $R$ and $\tilde{R}$ are always real, though they may not be equal. Only when they are distinct do the gauge couplings violate parity, and hence gauge anomalies may arise. In the case of 4 dimensions we have seen that $R$ and $\tilde{R}$ are complex, and so anomalies arise when they are different.

Remark 10.6.7 In odd dimensions there are no Weyl spinors, and so there is no left-right asymmetry, i.e., no parity violation, and so there are no anomalies. (See the articles by A. A. Slavnov and R. D. Ball in [36] on anomalies and chiral fermions.) In the next subsection we show in brief the construction of an anomaly in a simple model and then the computation of anomalies via differential forms.

6.5 Construction and Computation of an Anomaly

We illustrate this through the following example.

Example 10.6.8 We consider a Lagrangian formed by external gauge fields coupled to fermions. The fermions are assumed to be left handed and are supposed to span a representation of an internal, semisimple compact Lie group. The generators of the group are matrices $T_i$ that satisfy:

$[T_i, T_j] = i f_{ij}^{\ \,k}\, T_k$ (10.6.40)

with $f_{ij}^{\ \,k}$ as structure constants. We write the Lagrangian in question as:

$\mathcal{L} = i\,\bar\psi\,\gamma^\mu\,(\partial_\mu - iA^i_\mu T_i)\,\psi$ (10.6.41a)

From the classical point of view, the corresponding currents can be defined as:

$J^\mu_i(x) = \bar\psi(x)\,\gamma^\mu\, T_i\, \psi(x)$, (10.6.41b)

and these (as is evident from the classical equations of motion) are "covariantly" conserved (see Secs. 6.3 and 6.7):

$D_\mu J^\mu_i \equiv \partial_\mu J^\mu_i + f_{ij}^{\ \,k}\, A^j_\mu\, J^\mu_k = 0$. (10.6.42)

From our discussions in earlier subsections we know that when fermion fields are quantized, it may not be possible to define local currents that satisfy the conservation law (10.6.42) (see footnote 31). To study these changes, we consider the generating functional for time-ordered matrix elements of the currents (see footnote 32):

$W[A] = e^{iZ[A]} = \langle 0|T^*\, e^{\,i\int A^i_\mu(x)\,J^\mu_i(x)\,dx}|0\rangle \equiv \langle 0|T^*\, e^{iJA}|0\rangle$ (10.6.43)

This gives:

$\frac{\delta}{\delta A^i_\mu(x)}\, W[A] = i\,\langle 0|T^*\, J^\mu_i(x)\, e^{iJA}|0\rangle$ (10.6.44)

$\frac{1}{i}\frac{\delta}{\delta A^i_\mu(x)}\,\frac{1}{i}\frac{\delta}{\delta A^k_\nu(y)}\, W[A] = \langle 0|T^*\, J^\mu_i(x)\, J^\nu_k(y)\, e^{iJA}|0\rangle$ (10.6.45)

Now $J^\mu_i(x)$ in the above equations is supposed to be formed in terms of free, quantized fermion fields; however, an expression like (10.6.41b) cannot be used, since a product of two fermion-field operators taken at the same space-time point is singular. To make the definition of local currents meaningful one should either invoke a regularization scheme, or use two points (instead of one) at infinitesimal distance and take a symmetrized limit. In either case the currents are not likely to be conserved before taking the limit. However, there is something that can be usefully computed, and that is the functional $W[A]$; this functional will be meaningful after removing the cutoff (see footnote 33) of the regularization scheme or after taking the limits in the point-split approach. This, as we shall see later, will be an expression in the external field $A_\mu(x)$. For this purpose we take the divergence of (10.6.44) and note that in the term on the RHS there is a contribution from the time-ordering and also from the divergence of the current inside the time-ordering before removing the cutoff. Once the cutoff is removed we can write it as:

$X_i(x)\, W[A] = G_i[A](x)$, (10.6.46)

where the operator $X_i(x)$ contains the contributions from the time-ordering:

$X_i(x) = -\frac{\partial}{\partial x^\mu}\frac{\delta}{\delta A^i_\mu(x)} - f_{ij}^{\ \,k}\, A^j_\mu(x)\, \frac{\delta}{\delta A^k_\mu(x)}$ (10.6.47)

(Note the similarity between (10.6.42) and (10.6.47).) If the conservation law of the current is preserved in the quantized theory, then $G_i[A](x)$ is zero; if this is not the case, it is called the anomaly. We further note that, if the operator $X_i(x)$ acts on the external field $A$, then after a partial integration it gives rise to a gauge transformation:

$\int \lambda^i(w)\, X_i(w)\, dw\; A^k_\mu(x) = \partial_\mu\lambda^k(x) - \lambda^i(x)\, f_{ij}^{\ \,k}\, A^j_\mu(x) = \delta_\lambda A^k_\mu(x)$ (10.6.48)

31 The definition of local currents (suited for preservation of a conservation law) depends on the representation of an internal symmetry group.

32 Note that we have mostly used $T$ with parentheses for time-ordering in place of $T^*$.

33 See App. 9C and 10A.13 for regularization.

From the above discussion it is evident that $\lambda$, used for writing the index-free operator $X(z)$ as $\lambda^i(z)\,X_i(z)$, is indirectly related to the external field $A_\mu$ and the structure constants $f_{ij}^{\ \,k}$. This fact in turn implies that the operators $X_i(x)$ form an algebra:

$[X_i(x),\, X_j(y)] = f_{ij}^{\ \,k}\, X_k(x)\, \delta(x - y)$ (10.6.49)

Using (10.6.46) we have the so-called consistency condition for an anomaly:

$X_i(x)\, G_j(y) - X_j(y)\, G_i(x) = f_{ij}^{\ \,k}\, G_k(x)\, \delta(x - y)$ (10.6.50)
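The step from the algebra (10.6.49) to the consistency condition (10.6.50) can be written out explicitly. The following is a short sketch, assuming only that the operators act on the functional $W[A]$ as in (10.6.46):

```latex
\begin{align*}
[X_i(x), X_j(y)]\,W[A]
  &= X_i(x)\bigl(X_j(y)W[A]\bigr) - X_j(y)\bigl(X_i(x)W[A]\bigr) \\
  &= X_i(x)\,G_j(y) - X_j(y)\,G_i(x), \\
f_{ij}^{\ \,k}\,\delta(x-y)\,X_k(x)\,W[A]
  &= f_{ij}^{\ \,k}\,G_k(x)\,\delta(x-y).
\end{align*}
```

Equating the two right-hand sides via (10.6.49) gives (10.6.50). The condition is therefore not an extra assumption: it is an integrability requirement that any $G_i$ arising from some functional must satisfy.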

Now any functional, say $G_i[A]$, obtained from another local functional $\hat{W}[A]$ using (10.6.46) can be written as:

$G_i[A](x) = X_i(x)\, \hat{W}[A]$ (10.6.51)

Such a $G_i[A]$ evidently satisfies the consistency condition (10.6.50), which shows that an anomaly $G_i$ can be changed to another one with the help of a local functional:

$G_i' = G_i + X_i \hat{W}$, (10.6.52)

hence anomalies are not unique; instead they form an equivalence class. In view of this it is always an interesting question to find a non-trivial solution to (10.6.50). For Lagrangian models such as (10.6.41) it can be shown that the anomaly $G_i(x)$ is:

$G_i(x) = \frac{1}{24\pi^2}\, \epsilon^{\mu\nu\rho\sigma}\, \mathrm{Tr}\; T_i\, \partial_\mu\left[A_\nu\,\partial_\rho A_\sigma + \tfrac{1}{2}\, A_\nu A_\rho A_\sigma\right]$, (10.6.53)

where $A_\rho = -i\,T_j\, A^j_\rho$.

Furthermore it can be checked, by using all local polynomials in $A$, that there is no local functional $\hat{W}[A]$ which can produce this anomaly; hence the anomaly for model (10.6.41) is uniquely determined. From these discussions it follows that finding an anomaly amounts to finding an "equivalence class". In conclusion we emphasize that a detailed understanding of the origin of anomalies requires renormalization and regularization. In spite of this, cohomological considerations sometimes go a long way in pinning down the anomaly. Finding an anomaly then reduces to a problem in differential forms, their exterior algebra and derivations defined on the manifold $M$, the Lie group $G$ and the principal bundle $P(M, G)$ (see Chapters 0 and 1 for differential forms, Secs. 6.5 and 6.6 for gauge potentials and gauge transformations, and App. 10A for cohomology).

To set up this machinery here, we consider the differentials $(dx^1 \ldots dx^n)$ coming from the coordinates $(x^1 \ldots x^n)$ of the manifold $M$, and differentials $(d\lambda^1 \ldots d\lambda^N)$ corresponding to a set of parameters $\lambda^1 \ldots \lambda^N$ on which the elements of the group-space and of the gauge transformations depend (see footnote 34). We treat $\{d\lambda^l\}$, $l = 1, \ldots, N$, as formally independent of $x$ and write the grading rules as:

$dx^\mu\, dx^\nu = -dx^\nu\, dx^\mu$, $\quad d\lambda^l\, d\lambda^m = -d\lambda^m\, d\lambda^l$, $\quad dx^\mu\, d\lambda^l = -d\lambda^l\, dx^\mu$ (10.6.54)

The exterior derivative in the group manifold is denoted $s$ (see footnote 35):

$s = d\lambda^l\, \frac{\partial}{\partial\lambda^l}$, (10.6.55)

which satisfies:

$s^2 = 0$, $\quad sd + ds = 0$ (10.6.56)

and $d$, as we already know, satisfies:

$d^2 = 0$ (10.6.57)

We note that the sum

$\Delta = s + d$, (10.6.58)

is also an exterior derivation, which is valid on $M$ as well as on the group-space. Evidently,

$\Delta^2 = 0$ (10.6.59)
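The nilpotency of the combined derivation follows directly from (10.6.56) and (10.6.57); written out as a one-line check:

```latex
\begin{equation*}
\Delta^2 = (s + d)^2 = s^2 + (sd + ds) + d^2 = 0 + 0 + 0 = 0 .
\end{equation*}
```

The cross term vanishes only because $s$ and $d$ anticommute, which is why the mixed grading rule $dx^\mu\, d\lambda^l = -d\lambda^l\, dx^\mu$ in (10.6.54) is needed.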

Now the element $g(x, \lambda) = \exp(-i\lambda^i(x)\,T_i)$ can be viewed as a 0-form in group-space as well as in $M$; applying $s$ (respectively $d$) to it gives a 1-form in group-space (respectively $M$). At this point, however, we are interested in a particular 1-form given in (10.6.60), as our goal here is to examine the consistency relation (10.6.50) for an anomaly. With this aim in view, we consider the (Lie-algebra valued) one-form defined as

$\eta = g^{-1}\, sg$, (10.6.60)

and note that:

$s\eta = (sg^{-1})\, sg = -g^{-1}\, sg\; g^{-1}\,(sg) = -\eta^2$ (10.6.61)
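The only non-obvious ingredient in (10.6.61) is the action of $s$ on $g^{-1}$. A minimal sketch of that step, assuming the ordinary Leibniz rule for $s$ acting on the 0-forms $g$ and $g^{-1}$ and using $s^2 = 0$:

```latex
\begin{align*}
0 = s(g^{-1}g) &= (sg^{-1})\,g + g^{-1}\,sg
  \quad\Longrightarrow\quad sg^{-1} = -g^{-1}\,(sg)\,g^{-1}, \\
s\eta = s(g^{-1}\,sg) &= (sg^{-1})\,sg + g^{-1}\,s^2 g
  = -g^{-1}(sg)\,g^{-1}(sg) = -\eta^2 .
\end{align*}
```

This is the group-space analogue of the Maurer-Cartan equation for the form $g^{-1}dg$ on $M$.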

Next we use the vector field $A_\mu$ by writing it as a Lie-algebra valued 1-form

$A = -i\,A^i_\mu\, dx^\mu\, T_i$ (10.6.62)

34 Note that since the problem here uses the physicists' approach there is no principal bundle $P(M, G)$. It is the Lie group $G$ of the theory which plays an important role. The manifold $M$ represents the space on which $A$ is defined, and is linked to $G$ via the elements $g(x, \lambda) = \exp(-i\lambda^i(x)\,T_i)$. These elements define a parametrization of the group-space (from the physicists' point of view) and provide the gauge transformation relation as given in (10.6.63).

35 $s$ should actually be defined as a functional derivative: $s = \int dz\; d\lambda^l(z)\, \frac{\delta}{\delta\lambda^l(z)}$, but since $\int dz\; d\lambda^l(z)\, \frac{\delta}{\delta\lambda^l(z)}\, \lambda^k(x) = d\lambda^k(x)$, definition (10.6.55) is valid.

to obtain a generalized vector field $\mathcal{A}(x, \lambda)$:

$\mathcal{A}(x, \lambda) = g^{-1}(x, \lambda)\, A\, g(x, \lambda) + g^{-1}(x, \lambda)\, dg(x, \lambda)$ (10.6.63)

Note that $\mathcal{A}(x, \lambda)$ is the gauge-transformed vector field defined in (10.6.48), and the expression on the RHS is the familiar one for a gauge-transformed connection. Furthermore $sA(x) = 0$, while for $\mathcal{A}(x, \lambda)$, using (10.6.61), we have:

$s\mathcal{A}(x, \lambda) = -g^{-1}\,sg\; g^{-1}A\,g - g^{-1}A\,g\; g^{-1}\,sg - g^{-1}\,sg\; g^{-1}\,dg - g^{-1}\, d(g\, g^{-1}\, sg)$, (10.6.64)

which gives

$s\mathcal{A}(x, \lambda) = -g^{-1}\,sg\,(g^{-1}A\,g + g^{-1}\,dg) - g^{-1}A\,g\,(g^{-1}\,sg) - g^{-1}\,dg\,(g^{-1}\,sg) - g^{-1}g\; d(g^{-1}\,sg) = -\eta\mathcal{A} - \mathcal{A}\eta - d\eta$ (10.6.65)

(To simplify (10.6.64) we have combined the first and third terms and have written the fourth term as the sum of the last two terms in (10.6.65).) Equations (10.6.61) and (10.6.65) taken together imply that $\mathcal{A}$ and $\eta$ form a basis for an algebraic action of $s$. Also, since $\eta$ is Lie-algebra valued, we can write $\eta$ as:

$\eta = -i\,v^l(x)\,T_l$ (10.6.66)

and accordingly express (10.6.61) in terms of anticommuting fields $v^l$. This gives:

$sv^l = -\tfrac{1}{2}\, f_{rs}^{\ \,l}\, v^r v^s$ (10.6.67)

(To write this we have used (10.6.40).) Similarly we can express $s\mathcal{A}^l_\mu(x)$ also in terms of $v^l$ as:

$s\mathcal{A}^l_\mu(x) = \partial_\mu v^l(x) - v^i(x)\, f_{ij}^{\ \,l}\, \mathcal{A}^j_\mu(x) = \int dw\; v^i(w)\, X_i(w)\, \mathcal{A}^l_\mu(x)$ (10.6.68)
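The passage from $s\eta = -\eta^2$ to the component form (10.6.67) can be made explicit. The sketch below assumes the expansion $\eta = -i\,v^l T_l$ of (10.6.66), anticommuting $v^l$, and the commutator (10.6.40):

```latex
\begin{align*}
s\eta &= -i\,(sv^l)\,T_l, \\
-\eta^2 &= -(-i)^2\, v^r v^s\, T_r T_s = v^r v^s\, T_r T_s
        = \tfrac{1}{2}\, v^r v^s\, [T_r, T_s]
        = \tfrac{i}{2}\, f_{rs}^{\ \,l}\, v^r v^s\, T_l, \\
\Longrightarrow\quad sv^l &= -\tfrac{1}{2}\, f_{rs}^{\ \,l}\, v^r v^s .
\end{align*}
```

The antisymmetrization in the second line uses $v^r v^s = -v^s v^r$, so only the commutator part of $T_r T_s$ survives.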

Equations (10.6.67) and (10.6.68) are known as BRS transformations (see footnote 36). We shall use these equations to obtain an alternative form of the consistency equation (10.6.50), which will eventually lead to the computation of an anomaly. Consider the action of $s$ on the integral involving $G_l[A]$, the anomaly for a given vector potential $A$:

$s\int dx\; v^l(x)\, G_l[A](x) = \int dx\; s\bigl(v^l(x)\, G_l[A](x)\bigr) = \int dx\; (sv^l(x))\, G_l[A](x) - \int dx\; v^l(x)\, sG_l[A](x)$ (10.6.69)

Since the anomaly depends only on $A$, we can replace the action of $s$ in the second term by $\int dw\; v^s(w)\, X_s(w)$, and after using (10.6.67) to write the first term, we have:

RHS $= \int dx\, \left(-\tfrac{1}{2}\, f_{rs}^{\ \,l}\, v^r(x)\, v^s(x)\, G_l[A](x)\right) - \int dx\; v^l(x) \left(\int dw\; v^s(w)\, X_s(w)\right) G_l[A](x)$ (10.6.70)

36 These transformations were obtained by C. Becchi, A. Rouet and R. Stora in 1974 (see Phys. Lett. 52B, p. 344).

Next we write the first term of (10.6.70), using the Dirac-delta property, as a double integral, and use the antisymmetry of $v^r(x)\,v^s(w)$ to write the second term. After simplification (10.6.69) becomes:

$s\int dx\; v^l(x)\, G_l[A](x) = \int dx\, dw \left(-\tfrac{1}{2}\, v^r(x)\, v^s(w)\right) \{ f_{rs}^{\ \,l}\, G_l(w)\, \delta(x - w) + X_s(w)\, G_r(x) - X_r(x)\, G_s(w) \}$ (10.6.71)

The expression within braces vanishes by the consistency condition (10.6.50). Hence

$s\int dx\; v^l(x)\, G_l = 0$ (10.6.72)

is an alternative way of expressing the consistency condition for an anomaly. Finally, to compute the anomaly we note that in the above integrand the volume element $dx$ is a 4-form in $M$-space and $v^l(x)\, G_l$ is a 1-form in group-space; thus we can write the integrand as:

$v^l(x)\, G_l\; dx^0\, dx^1\, dx^2\, dx^3 = \Omega^1_4(x)$ (10.6.73)

(we have used the upper index for the form-degree in the group manifold and the lower one for the form-degree in $M$). In view of (10.6.52), our problem of finding a true anomaly, i.e., of finding a solution to the consistency equation, reduces to solving the equation:

$s\,\Omega^1_4 = -d\,\Omega^2_3$, (10.6.74a)

subject to

$\Omega^1_4 \neq s\,\Omega^0_4$. (10.6.74b)
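The link between the integrated condition (10.6.72) and the local equation (10.6.74a) can be sketched as follows. This is an illustrative check, under the assumption (not stated explicitly in the text) that $M$ has no boundary or that the fields fall off fast enough for Stokes' theorem to give no surface term:

```latex
\begin{equation*}
s\int_M \Omega^1_4 = \int_M s\,\Omega^1_4 = -\int_M d\,\Omega^2_3
  = -\int_{\partial M} \Omega^2_3 = 0 .
\end{equation*}
```

Conversely, a form $\Omega^1_4 = s\,\Omega^0_4$ satisfies (10.6.74a) trivially because $s^2 = 0$; it corresponds to an anomaly of the removable type $G_i = X_i \hat{W}$ in (10.6.52), which is why (10.6.74b) excludes it.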

Thus a 5-form (with degree 4 in $M$-space and degree 1 in group-space) that satisfies (10.6.74a) subject to (10.6.74b) is a true anomaly. From (10.6.74b) it is evident that the solution (true anomaly) cannot be an exterior derivative in group-space. We illustrate the theory developed above in the following example based on Yang-Mills theory.

Example 10.6.9 As usual we denote the connection and the field strength as $A$ and $F$, where $A$ is the Lie-algebra-valued 1-form defined in (10.6.62) and $F$ is the Lie-algebra-valued curvature 2-form:

$F = dA + A^2 = \tfrac{1}{2}\, F_{\mu\nu}\, dx^\mu\, dx^\nu$ (10.6.75)

(we have denoted $[A, A]$ as $A^2$ here). The transformation properties of $A$ and $F$ under a finite group element $h$ are (as we already know):

$A^h = h^{-1} A\, h + h^{-1}\, dh$ (10.6.76)

and

$F^h = dA^h + (A^h)^2 = h^{-1} F\, h$ (10.6.77)

We also know that $F$ satisfies the Bianchi identity (BI):

$DF = dF + [A, F] = 0$ (10.6.78)

When the manifold $M$ is $2n$-dimensional, an important $2n$-form known as the Chern character (Ch) can be defined as (see App. 10A):

$\mathrm{Ch}_n(A) = \mathrm{Tr}\, F^n$ (10.6.79)

The Chern form is a homogeneous polynomial of order $n$ and is gauge invariant, i.e.:

$\mathrm{Ch}_n(A^h) = \mathrm{Tr}\,(h^{-1} F\, h)^n = \mathrm{Tr}\, F^n = \mathrm{Ch}_n(A)$ (10.6.80)
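Both the gauge invariance (10.6.80) and the closedness of the Chern form follow from cyclicity of the trace. A brief sketch, assuming the Bianchi identity (10.6.78) and that $F$, being a 2-form, commutes with other forms under the trace without extra signs:

```latex
\begin{align*}
\mathrm{Tr}\,(h^{-1}Fh)^n &= \mathrm{Tr}\,(h^{-1}F^n h) = \mathrm{Tr}\,F^n, \\
d\,\mathrm{Tr}\,F^n &= n\,\mathrm{Tr}\,(dF\,F^{n-1})
   = -n\,\mathrm{Tr}\,\bigl([A,F]\,F^{n-1}\bigr) \\
  &= -n\,\bigl(\mathrm{Tr}\,A F^n - \mathrm{Tr}\,F A F^{n-1}\bigr) = 0 .
\end{align*}
```

Closedness is what makes it possible to write $\mathrm{Tr}\,F^n$ locally as an exact form, the starting point of the construction that follows.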

Remark 10.6.10 The Chern character has an important property: as an element of de Rham cohomology it is independent of $A$ (see App. 10A and [17]), i.e.:

$\mathrm{Ch}_n(A) - \mathrm{Ch}_n(B) = d\,(\mathrm{TCh}_n(A, B))$ (10.6.81)

where $A$ and $B$ are two connections and $\mathrm{TCh}_n(A, B)$ is a $(2n - 1)$-form based on them (see Exc. (10.6.3)). As a result of the above remark, by using an interpolated connection

$A_t = B + t\,(A - B)$, $\quad 0 \le t \le 1$ (10.6.82)

we can write:

$\mathrm{Ch}_n(A) - \mathrm{Ch}_n(B) = n\, d\int_0^1 dt\; \mathrm{Tr}\,(A - B)\, F_t^{\,n-1}$ (10.6.83)

and further, by choosing a frame where the connection $B = 0$, we have:

$\mathrm{Ch}_n(A) = n\, d\int_0^1 dt\; \mathrm{Tr}\, A\, F_t^{\,n-1} = d\,\omega^0_{2n-1}(A, F)$ (10.6.84)

($A$ and $F_t$ are respectively a 1-form and a 2-form in $M$-space, hence the RHS can be written as a $2n$-form in $M$-space and a 0-form in group-space.) Note that $F_t$ in (10.6.83) and (10.6.84) is respectively:

$F_t = dA_t + A_t A_t = F_B + t\,(F_A - F_B) + (t^2 - t)\,(A - B)^2$ (10.6.85)

and

$F_t = tF + (t^2 - t)\, A^2$ (10.6.86)
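As a worked illustration of (10.6.84) and (10.6.86), the interpolated curvature and the lowest non-trivial case $n = 2$ can be computed by hand. This is a sketch under the frame choice $B = 0$ (so $A_t = tA$); the resulting 3-form is the familiar Chern-Simons form, a name not used in the text above:

```latex
\begin{align*}
F_t &= d(tA) + (tA)^2 = t\,dA + t^2 A^2
     = t\,(dA + A^2) + (t^2 - t)\,A^2 = tF + (t^2 - t)\,A^2, \\
\omega^0_3(A,F) &= 2\int_0^1 dt\;\mathrm{Tr}\,A\bigl(tF + (t^2 - t)A^2\bigr)
     = \mathrm{Tr}\,\bigl(AF - \tfrac{1}{3}A^3\bigr)
     = \mathrm{Tr}\,\bigl(A\,dA + \tfrac{2}{3}A^3\bigr).
\end{align*}
```

The $t$-integrals used are $\int_0^1 t\,dt = \tfrac12$ and $\int_0^1 (t^2 - t)\,dt = -\tfrac16$, so that $\mathrm{Tr}\,F^2 = d\,\mathrm{Tr}\,(A\,dA + \tfrac23 A^3)$.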

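To proceed further we replace the exterior derivative d of the x-space by the operator $\Delta = s + d$, and denote the transformed fields by $\mathcal{A}$ and $\mathcal{F}$; thus (10.6.76) and (10.6.77) become

$$ \mathcal{A}_h = h^{-1} \mathcal{A} h + h^{-1} \Delta h \tag{10.6.87} $$

and

$$ \mathcal{F}_h = \Delta \mathcal{A}_h + \mathcal{A}_h^2. \tag{10.6.88} $$

The Bianchi identity (BI) for $\mathcal{F}$ is

$$ \mathcal{D}\mathcal{F} = \Delta \mathcal{F} + [\mathcal{A}, \mathcal{F}] = 0. \tag{10.6.89} $$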

Further, when we use the group element $g(x, \lambda)$ as in (10.6.63) we have the most generalized form of the transformed fields:

$$ \mathcal{A} = g^{-1} A g + g^{-1} \Delta g = A + \eta \tag{10.6.90} $$

$$ \mathcal{F} = \Delta \mathcal{A} + \mathcal{A}^2 = g^{-1} F g = F \tag{10.6.91} $$

(see Exc. (6.4) for verification of the above equalities). The corresponding BI for $\mathcal{F}$ will be

$$ \mathcal{D}\mathcal{F} = \Delta \mathcal{F} + [\mathcal{A}, \mathcal{F}] = 0, \tag{10.6.92} $$

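and the Chern character given by (10.6.84) will now be

$$ \mathrm{Ch}_n(\mathcal{A}) = n\,\Delta \int_0^1 dt\, \mathrm{Tr}\, \mathcal{A} \mathcal{F}_t^{\,n-1} = \Delta\, \omega^0_{2n-1}(\mathcal{A}, \mathcal{F}), \tag{10.6.93} $$

where

$$ \mathcal{F}_t = t\mathcal{F} + (t^2 - t)\mathcal{A}^2. \tag{10.6.94} $$

In view of the identity (10.6.91) and the definition (10.6.79) of the Chern character, it follows that

$$ \mathrm{Ch}_n(\mathcal{A}) = \mathrm{Tr}\,(g^{-1} F g)^n = \mathrm{Ch}_n(A). \tag{10.6.95} $$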

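This implies that

$$ d\,\omega^0_{2n-1}(A, F) = \Delta\, \omega^0_{2n-1}(\mathcal{A}, \mathcal{F}). \tag{10.6.96} $$

Now

$$ \omega^0_{2n-1}(A + \eta, F) = \omega^0_{2n-1}(A, F) + \omega^1_{2n-2} + \omega^2_{2n-3} + \cdots + \omega^{2n-1}_0. \tag{10.6.97} $$

We substitute this into (10.6.96) and compare the differential forms according to their degree in the group manifold to obtain the equalities:

$$ \begin{aligned}
d\,\omega^0_{2n-1}(A, F) &= d\,\omega^0_{2n-1}(A, F) \\
s\,\omega^0_{2n-1}(A, F) &= -d\,\omega^1_{2n-2}(A, F) \\
s\,\omega^1_{2n-2}(A, F) &= -d\,\omega^2_{2n-3}(A, F) \\
s\,\omega^2_{2n-3}(A, F) &= -d\,\omega^3_{2n-4}(A, F) \\
&\;\;\vdots \\
s\,\omega^{2n-1}_0(A, F) &= 0
\end{aligned} \tag{10.6.98} $$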

Equations (10.6.98) are known as the descent equations, and are used in solving the anomaly problem via cohomology theory. For example, the third of these equations is similar to (10.6.74); thus when the form $\omega^1_{2n-2}$ is reduced to (2n - 2) dimensions and is suitably normalized, it can be identified with the anomaly. For n = 3, $\omega^1_4$ is in the same de Rham class as (10.6.53); therefore, for an appropriate choice of a current, it is the anomaly in 4 dimensions. Again, the fourth equality, which relates $\omega^2_{2n-3}$ with $\omega^3_{2n-4}$, gives the Schwinger terms in field theory when n = 3. (See Wess in [36] for chiral matter fields.) Finally, we close this section with a remark on gravitational anomalies.

Remark 10.6.10 If in the above example the Yang-Mills curvature F is replaced by the Riemann curvature tensor R, which like F is also Lie-algebra valued for the orthogonal group, then the above procedure can be replicated to solve the anomaly problems in gravitational theory. Since the Chern character is zero in odd dimensions, gravitational anomalies occur only when n = 2m; thus, for instance, equality (3) of (10.6.98) is meaningful only for solutions of the type $\omega^2_5$, $\omega^2_9$. (For more details on the material in Subsection 6.5, refer to the references given in Wess in [36].)

Exercise 10.6

1. Obtain the Feynman parameter formulae in arbitrary dimensions.

2. Show that $T\mathrm{Ch}_n$, known as the transgression of the Chern character, can be expressed as:

(a) $\mathrm{Ch}_n(A) - \mathrm{Ch}_n(B) = d\{T\mathrm{Ch}_n(A, B)\} = n\, d \int_0^1 dt\, \mathrm{Tr}\,(A - B)\, F_t^{\,n-1}$

where $F_t$ is the curvature 2-form for the interpolated connection:

(b) $A_t = B + t(A - B), \qquad 0 \le t \le 1.$

Show further that:

$$ \mathrm{Ch}_n(A) = n\, d \int_0^1 dt\, \mathrm{Tr}\, A F_t^{\,n-1} = d\,\omega^0_{2n-1}(A, F) $$

(see A.11 for the definition of the Chern character).

3. Verify (10.6.91).

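Hints for Exercise 10.6

1. Consider the equality

(i) $\dfrac{1}{ab} = \displaystyle\int_0^1 \frac{dx}{(ax + b(1-x))^2}$

which can be verified by re-writing the RHS as:

(ii) $\displaystyle\int_1^\infty \frac{dy}{(a + b(y-1))^2} = \int_0^\infty \frac{dz}{(a + bz)^2} = \left.\frac{-1}{b(a + bz)}\right|_0^\infty = \frac{1}{ab}$

where $y = 1/x$ and $z = y - 1$. The derivative of (i) with respect to a gives:

(iii) $\dfrac{1}{a^2 b} = 2 \displaystyle\int_0^1 \frac{x\, dx}{(ax + b(1-x))^3}.$

Similarly the derivative of (iii) with respect to b gives:

(iv) $\dfrac{1}{a^2 b^2} = 2\cdot 3 \displaystyle\int_0^1 \frac{x(1-x)\, dx}{(ax + b(1-x))^4}.$

More generally, applying the derivative $\left(\dfrac{\partial}{\partial a}\right)^{\alpha-1} \left(\dfrac{\partial}{\partial b}\right)^{\beta-1}$ to (i) we have:

(v) $\dfrac{(\alpha-1)!\,(\beta-1)!}{a^\alpha b^\beta} = (\alpha+\beta-1)! \displaystyle\int_0^1 \frac{x^{\alpha-1}(1-x)^{\beta-1}\, dx}{(ax + b(1-x))^{\alpha+\beta}}.$

Using the Gamma function we can write it as:

(vi) $\dfrac{1}{a^\alpha b^\beta} = \dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \displaystyle\int_0^1 dx\, \frac{x^{\alpha-1}(1-x)^{\beta-1}}{(ax + b(1-x))^{\alpha+\beta}}.$

Suppose we consider the equality:

(vii) $\dfrac{1}{abc} = 2 \displaystyle\int_0^1 dx \int_0^{1-x} dy\, \frac{1}{(a(1-x-y) + bx + cy)^3};$

then following the same procedure as above we have:

(viii) $\dfrac{1}{a^\alpha b^\beta c^\gamma} = \dfrac{\Gamma(\alpha+\beta+\gamma)}{\Gamma(\alpha)\Gamma(\beta)\Gamma(\gamma)} \displaystyle\int_0^1 dx \int_0^{1-x} dy\, \frac{(1-x-y)^{\alpha-1} x^{\beta-1} y^{\gamma-1}}{(a(1-x-y) + bx + cy)^{\alpha+\beta+\gamma}}.$

Hence the generalization of this for n parameters can be written as:

(ix) $\dfrac{1}{\prod_{i=1}^n a_i^{\alpha_i}} = \dfrac{\Gamma\bigl(\sum_{i=1}^n \alpha_i\bigr)}{\prod_{i=1}^n \Gamma(\alpha_i)} \displaystyle\int_0^1 dx_1 \int_0^{1-x_1} dx_2 \cdots \int_0^{1-x_1-\cdots-x_{n-2}} dx_{n-1}\, \frac{\prod_{i=1}^{n-1} x_i^{\alpha_i-1}\,\bigl(1 - \sum_{i=1}^{n-1} x_i\bigr)^{\alpha_n-1}}{\bigl(\sum_{i=1}^{n-1} a_i x_i + a_n (1 - \sum_{i=1}^{n-1} x_i)\bigr)^{\sum_i \alpha_i}}.$

This is Feynman's parameter formula in n-dimensions, which is used to great advantage in regularizing the integrals of a given theory; more simply, when the generic dimension is n (1 time and (n - 1) space dimensions), formula (ix) provides the means to simplify calculations.

2. Given two connections A and B we can define an interpolated connection using a smooth parameter t: (i)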

$A_t = B + t(A - B), \qquad 0 \le t \le 1.$

This is possible since (A - B) is tensorial. The corresponding curvature form $F_t$ is given by:

(ii) $F_t = dA_t + A_t A_t = F_B + t(F_A - F_B) + (t^2 - t)(A - B)^2.$

(Note that we have seen (ii) in (6.7.15) and Exc. (6.7.1) from a slightly different point of view.) Differentiation of (ii) with respect to t gives:

(iii) $\dfrac{d}{dt} F_t = d\,\dfrac{dA_t}{dt} + \left(\dfrac{dA_t}{dt}\right) A_t + A_t \left(\dfrac{dA_t}{dt}\right).$

Using (i) we have after simplification:

(iv) $\dfrac{d}{dt} F_t = d(A - B) + (A - B) A_t + A_t (A - B).$

We denote the RHS as $D_t(A - B)$ and note that $F_t$ satisfies the Bianchi identity,

(v) $D_t F_t = 0.$

The difference $\mathrm{Ch}_n(A) - \mathrm{Ch}_n(B)$ can now be written as:

(vi) $\mathrm{Ch}_n(A) - \mathrm{Ch}_n(B) = \displaystyle\int_0^1 \left[\frac{d}{dt} \mathrm{Ch}_n(A_t)\right] dt = \int_0^1 dt\, \frac{d}{dt} \mathrm{Tr}\, F_t^{\,n} = n \int_0^1 dt\, \mathrm{Tr}\, D_t(A - B)\, F_t^{\,n-1} = n\, d \int_0^1 dt\, \mathrm{Tr}\,(A - B)\, F_t^{\,n-1}$

(we have used (v) in writing the last equality). Choosing a frame where the connection B is zero (since the Chern character is invariant under such transformations) the above equality gives:

(vii) $\mathrm{Ch}_n(A) = n\, d \displaystyle\int_0^1 dt\, \mathrm{Tr}\, A F_t^{\,n-1},$

where in virtue of (ii) $F_t$ reduces to:

(viii) $F_t = t F_A + (t^2 - t) A^2.$

Since A is a 1-form and $F_t$ is a 2-form we can write $n \int_0^1 dt\, \mathrm{Tr}\, A F_t^{\,n-1}$ as a (2n - 1)-form $\omega^0_{2n-1}(A, F)$; the index 0 in the notation $\omega^0_{2n-1}(A, F)$ indicates that it is a 0-form on the group-space.

3. We have to prove the identity: (i)

$\mathcal{F} = \Delta \mathcal{A} + \mathcal{A}^2 = g^{-1} F g = F.$

Using $\Delta = (d + s)$ and $\mathcal{A} = A + \eta$ in (i) we have:

(ii) $\mathcal{F} = (d + s)(A + \eta) + (A + \eta)(A + \eta) = dA + d\eta + sA + s\eta + \{A^2 + A\eta + \eta A + \eta^2\}.$

In view of (10.6.61) and (10.6.65), (ii) becomes

(iii) $\mathcal{F} = dA + d\eta - \eta A - A\eta - d\eta - \eta^2 + \{A^2 + A\eta + \eta A + \eta^2\} = dA + A^2 = F = g^{-1} F g.$

The last equality follows from (10.6.63) and (10.6.77).
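The two-parameter formula (vi) of Hint 1 lends itself to a quick numerical sanity check. The sketch below is an illustration added here, not part of the text: it assumes positive a, b and integer exponents, and the helper names `simpson` and `feynman_rhs` are hypothetical. It compares the integral on the right of (vi) with $1/(a^\alpha b^\beta)$ using composite Simpson quadrature.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        # Odd interior nodes carry weight 4, even interior nodes weight 2.
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def feynman_rhs(a, b, alpha, beta):
    """Right-hand side of formula (vi) for a, b > 0 and integers alpha, beta >= 1."""
    prefactor = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))

    def integrand(x):
        numerator = x ** (alpha - 1) * (1 - x) ** (beta - 1)
        return numerator / (a * x + b * (1 - x)) ** (alpha + beta)

    return prefactor * simpson(integrand, 0.0, 1.0)


if __name__ == "__main__":
    for a, b, alpha, beta in [(2.0, 3.0, 1, 1), (2.0, 3.0, 2, 1), (1.5, 0.7, 2, 3)]:
        lhs = 1.0 / (a ** alpha * b ** beta)
        print(a, b, alpha, beta, lhs, feynman_rhs(a, b, alpha, beta))
```

Restricting to integer exponents keeps the integrand smooth at the endpoints, so plain Simpson quadrature suffices; non-integer exponents would need a quadrature rule that copes with endpoint singularities.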


APPENDIX 10A: GLOSSARY

A.1 The Projective Space $P_n(\mathbb{C})$ ^37

The complex projective space $P_n(\mathbb{C})$ is the set of complex lines through the origin in $\mathbb{C}^{n+1}$:

(i) $P_n(\mathbb{C}) := \{\, [z] \mid z \in \mathbb{C}^{n+1} \setminus \{0\} \,\}.$

The neighbourhoods $U_i$ in $P_n(\mathbb{C})$ are formed by the set of lines for which $z_i \neq 0$. The ratio $z_j/z_i = \lambda z_j / \lambda z_i$ is well defined on $U_i$, and by taking $z_j/z_i$, $j \neq i$ (which are complex-valued functions on $U_i$), as coordinates on it, each neighbourhood $U_i$ can be identified with $\mathbb{C}^n$. The set $\{U_i\}$ (i = 1, 2, ..., n + 1) is called the standard open cover of $P_n(\mathbb{C})$. The set $P_n(\mathbb{C})$ defined in (i), together with the differential structure provided by the neighbourhoods $U_i$ and the maps

(ii) $\varphi_i : U_i \to \mathbb{C}^n$ such that $\varphi_i(z_1, z_2, \ldots, z_{n+1}) = (z_1/z_i, z_2/z_i, \ldots, z_{n+1}/z_i)$

is a differential manifold; it is also denoted as $P^n(\mathbb{C})$, and this is the complex projective space we are looking for.

A.2 Vector Bundles over $P_n(\mathbb{C})$

In view of Def. (2.5.7) a vector bundle $\xi$ of rank r over $P_n$ is given by the matrix-valued functions

$$ C_{ij} : U_i \cap U_j \to GL(r, \mathbb{C}), \qquad i, j = 1, \ldots, n + 1 \tag{10A.1} $$

that satisfy the cocycle conditions

$$ C_{ii} = 1_r; \quad C_{ij} \cdot C_{ji} = 1_r \ \text{on}\ U_i \cap U_j; \quad C_{ij} \cdot C_{jk} \cdot C_{ki} = 1_r \ \text{on}\ U_i \cap U_j \cap U_k, \tag{10A.2} $$

where $1_r$ is the r-rowed unit matrix. The total space of the bundle $\xi$ is obtained by glueing together the disjoint union of the sets $U_i \times \mathbb{C}^r$. We identify $u \times v \in U_i \times$ ...

37 We also use the notation $P^n(\mathbb{C})$ in place of $P_n(\mathbb{C})$ (see Sec. 10.4).

A.3 Line Bundle

When r = 1 and $C_{ij} = (z_j/z_i)^k$, $k \in \mathbb{Z}$, the vector bundle $\xi$ is called a line bundle and is denoted $\mathcal{O}_{P_n}(k)$. (In A.9 we use L to denote a line bundle over a manifold M.)
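That the functions $C_{ij} = (z_j/z_i)^k$ really satisfy the cocycle conditions (10A.2) can be confirmed numerically at a sample point. The following is a minimal sketch added for illustration; the function names are hypothetical, indices are 0-based, and the point is assumed to have all homogeneous coordinates non-zero so that it lies in every $U_i$.

```python
import itertools


def transition(z, i, j, k):
    """C_ij = (z_j / z_i)**k at the point with homogeneous coordinates z."""
    return (z[j] / z[i]) ** k


def cocycle_defect(z, k):
    """Largest deviation of C_ij * C_jl * C_li from 1 over all index triples.

    Triples with repeated indices cover the conditions C_ii = 1 and
    C_ij * C_ji = 1 as special cases.
    """
    worst = 0.0
    for i, j, l in itertools.product(range(len(z)), repeat=3):
        product = transition(z, i, j, k) * transition(z, j, l, k) * transition(z, l, i, k)
        worst = max(worst, abs(product - 1))
    return worst


if __name__ == "__main__":
    point = [1 + 2j, -0.5 + 1j, 3 - 1j]  # a sample point of P_2(C), all z_i nonzero
    for k in (-2, 0, 1, 3):
        print(k, cocycle_defect(point, k))
```

The defect is zero up to rounding for every integer k, which is the reason each k defines a line bundle.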

A.4 Tangent Bundle $T_{P_n}$

$T_{P_n}$ is the rank n-bundle where $C_{ij}$ is the functional matrix obtained by differentiating the coordinates $z_k/z_i$ on $U_i$ with respect to the coordinates $z_l/z_j$ on $U_j$. To illustrate the writing of $C_{ij}$, we choose n = 6; then for i < j the matrix has the entries

$$ (C_{ij})_{kl} = \frac{\partial (z_k/z_i)}{\partial (z_l/z_j)} = \frac{z_j}{z_i}\left(\delta_{kl} - \delta_{il}\,\frac{z_k}{z_i}\right), \qquad k \neq i,\ l \neq j. \tag{10A.3} $$

A.5 Cotangent Bundle $T^*_{P_n}$

The rank n-bundle where $C_{ij}$ is the transposed inverse of the matrix given in (10A.3).

A.6 Algebraic Vector Bundles

The $r^2$ matrix-entries in this case are polynomials in $z_i/z_j$ and $z_j/z_i$. (Clearly the tangent and cotangent bundles are algebraic.) Moreover, there is a natural projection $\pi : \xi \to P_n$, which on $U_i \times \mathbb{C}^r$ is given by $(u, v) \to u$. The fibers $\xi_x = \pi^{-1}(x)$, $x \in P_n$, are r-dimensional vector spaces. A section $\sigma : P_n \to \xi$ ($\pi \circ \sigma = \mathrm{id}_{P_n}$) when restricted to $U_i$ defines a map $U_i \to$ ...

The fibers of $\mathrm{Hom}(\xi, \xi')$ are the vector spaces $\xi^*(x) \otimes \xi'(x)$ of linear maps from $\xi_x$ to $\xi'_x$ for $x \in P_n$. So the sections in $\mathrm{Hom}(\xi, \xi')$ define bundle morphisms $\gamma : \xi \to \xi'$, i.e., algebraic maps that preserve the fibers and act linearly on them. More specifically, if $\xi$ is defined by a cocycle $C_{ij}$ and $\xi'$ by $C'_{ij}$, then $\gamma$ is given by polynomial maps $\gamma_i$ on $U_i$ with $C'_{ij}\,\gamma_j = \gamma_i\, C_{ij}$. A word of caution here: the description of bundles by cocycles is not always unique, as two cocycles $C_{ij}$ and $C'_{ij}$ define essentially the same bundle (isomorphic bundles) if there is a bijective bundle map between their bundles.

A.7 The (Dolbeault) Cohomology Groups $H^q(\xi)$

Let $\xi$ be a fixed bundle on $P_n$ with cocycle $C_{ij}$. A global (0, q)-form $\omega$ on $\xi$ is defined by patching together the forms $\omega_i$ (i = 1, ..., n + 1):^38

$$ \omega_i = \sum f^{(i)}_{k_1 \ldots k_q}\, d\bar{u}_{k_1} \wedge \cdots \wedge d\bar{u}_{k_q} \tag{10A.4} $$

subject to the conditions $\omega_i = C_{ij}\,\omega_j$. The coefficients $f^{(i)}_{k_1 \ldots k_q}$ in (10A.4) are $C^\infty$-maps $U_i \to \mathbb{C}^r$ and as such the multiplication by $C_{ij}$ of these maps gives a known object. Again, as the $C_{ij}$ are polynomial functions, they are holomorphic on $U_i \cap U_j$, i.e., $\bar\partial C_{ij} = 0$; the differential $\bar\partial \omega_i$ patches with ... $\bar\partial$-(all (0, q - 1)-forms), denoted $H^q(\xi)$, is the q-th cohomology group of the bundle $\xi$.^39 All these groups are complex vector spaces of finite dimension. The groups $H^0(\xi)$ and $\Gamma(\xi)$ (defined in A.6) are isomorphic. We note that the map $\Gamma(\xi) \to H^0(\xi)$ can be described locally as follows: a section $\sigma \in \Gamma(\xi)$ when restricted to $U_i$ (i.e. $\sigma|_{U_i}$) is a polynomial map $\sigma_i$, which is holomorphic; accordingly it is a $\bar\partial$-closed 0-form.

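A.8 Dimension of the Cohomology Group for Line Bundles $\mathcal{O}_{P_n}(k)$

$$ \dim_{\mathbb{C}} H^q(\mathcal{O}_{P_n}(k)) =
\begin{cases}
\dbinom{k+n}{n} & q = 0,\ k \ge 0 \\[4pt]
0 & q = 0,\ k \le -1 \\
0 & q = 1, \ldots, n-1,\ \text{all } k \\
0 & q = n,\ k \ge -n \\[2pt]
\dbinom{-k-1}{n} & q = n,\ k \le -n-1
\end{cases} \tag{10A.6} $$

38 For simplicity the differentials $d\bar{u}_l$ have been used in place of $d(\bar{z}_l/\bar{z}_i)$, $l \neq i$.

39 Note that $\bar\partial$-(all (0, q - 1)-forms) = all exact (0, q)-forms.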
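The case distinction in (10A.6) translates directly into a small function. This is an illustrative sketch rather than anything from the text; the name `h_q` is hypothetical and n is assumed to be at least 1.

```python
from math import comb


def h_q(n, k, q):
    """dim_C H^q(O_{P_n}(k)) following the case distinction of (10A.6), for n >= 1."""
    if n < 1 or not 0 <= q <= n:
        raise ValueError("need n >= 1 and 0 <= q <= n")
    if q == 0:
        return comb(k + n, n) if k >= 0 else 0
    if q == n:
        return comb(-k - 1, n) if k <= -n - 1 else 0
    return 0  # intermediate cohomology vanishes for every k


if __name__ == "__main__":
    n = 2
    for k in range(-5, 4):
        print(k, [h_q(n, k, q) for q in range(n + 1)])
```

A useful consistency check is the symmetry `h_q(n, k, 0) == h_q(n, -k - n - 1, n)`, which follows from the two binomial entries of (10A.6).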


A.9 The First Chern Class of a Line Bundle

Every complex vector bundle $\xi$ of rank r has an underlying real vector bundle $\xi_{\mathbb{R}}$ of rank 2r, which is obtained by discarding the complex structure on each fiber. The structure groups of these two bundles are evidently isomorphic. In view of the isomorphism of U(1) with SO(2), there is a one-one correspondence between the complex line bundles and the oriented rank 2 real bundles. The first Chern class of a complex line bundle L over a manifold M is equal to the Euler class of its underlying real bundle $L_{\mathbb{R}}$:

$$ c_1(L) = e(L_{\mathbb{R}}) \in H^2(M). \tag{10A.7} $$

If L and L' are complex line bundles with transition functions $\{C_{ij}\}$ and $\{C'_{ij}\}$,

$$ C_{ij},\ C'_{ij} : U_i \cap U_j \to \mathbb{C}^*, \tag{10A.8} $$

then $L \otimes L'$ is the complex line bundle with transition functions $\{C_{ij} \cdot C'_{ij}\}$. The Euler class of a tensor product vector bundle can be expressed as:

$$ e(L \otimes L') = -\frac{1}{2\pi i} \sum_k d\bigl(\rho_k\, d\log(C_{kj} \cdot C'_{kj})\bigr) \quad \text{on } U_j, \tag{10A.9} $$

where $\{\rho_k\}$ is a partition of unity subordinate to $\{U_i\}$. From the logarithmic expression on the RHS, it follows that

$$ c_1(L \otimes L') = e(L \otimes L') = c_1(L) + c_1(L'). \tag{10A.10} $$

In particular when L' = L*, the dual of L, the line bundle $L \otimes L^* = \mathrm{Hom}(L, L)$ has a nowhere vanishing section given by the identity map, and as such $L \otimes L^*$ is a trivial bundle. As the Euler class of a trivial bundle is zero, it follows that

$$ c_1(L^*) = -c_1(L). \tag{10A.11} $$

A.10 Chern Classes and the Chern Numbers

Both these are defined only on complex vector bundles. For instance, to define the Chern classes (cohomology classes) we have to consider a complex vector bundle of dimension k (i.e., a principal bundle with the group $GL(k, \mathbb{C})$) and write an 'invariant polynomial' in terms of the entries of (k x k) complex matrices a. A polynomial P(a) is called an 'invariant or characteristic polynomial' if

$$ P(a) = P(g^{-1} a g) \tag{10A.12} $$

for all $g \in GL(k, \mathbb{C})$. If a has eigenvalues $\lambda_1, \ldots, \lambda_k$ and $S_j(\lambda)$ denotes the j-th elementary symmetric function of these eigenvalues,

$$ S_j(\lambda) = \sum_{i_1 < \cdots < i_j} \lambda_{i_1} \cdots \lambda_{i_j}, \tag{10A.13} $$

then P(a) is a polynomial in $S_j(\lambda)$ (j = 0, 1, 2, ..., k):

$$ P(a) = a_1 + a_2 S_1(\lambda) + a_3 S_2(\lambda) + a_4 (S_1(\lambda))^2 + a_5 S_3(\lambda) + \cdots \tag{10A.14} $$

It can be checked that the two polynomials

(i) $\det(I + a) = 1 + S_1(\lambda) + S_2(\lambda) + \cdots + S_k(\lambda)$

(ii) $\mathrm{Tr}(\exp a) = \sum_j \dfrac{1}{j!}\, \mathrm{Tr}\,(a)^j$ (10A.15)

are invariant polynomials. Also, if a matrix-valued curvature 2-form $\Omega$ replaces a in an invariant polynomial, then the following holds good:

$P(\Omega)$ is closed (10A.16a)

$P(\Omega)$ has topologically invariant integrals. (10A.16b)

Polynomials (i) and (ii) of (10A.15) along with the above properties (10A.16) are used to define the Chern classes and the Chern character respectively. We consider a complex vector bundle E over a (Riemannian) manifold M with $GL(k, \mathbb{C})$-transition functions, and let $\omega$, $\Omega$ denote the connection 1-form and the curvature 2-form that take values in the Lie algebra $gl(k, \mathbb{C})$. The total Chern form is

$$ c(\Omega) = \det\left(I + \frac{i}{2\pi}\Omega\right) = 1 + c_1(\Omega) + c_2(\Omega) + \cdots \tag{10A.17} $$

where the Chern forms $c_j(\Omega)$ are polynomials of degree j in $\Omega$. Thus we have:

$$ \begin{aligned}
c_0(\Omega) &= 1, \qquad c_1(\Omega) = \frac{i}{2\pi}\, \mathrm{Tr}\,\Omega, \qquad c_2(\Omega) = \frac{1}{8\pi^2}\{\mathrm{Tr}(\Omega \wedge \Omega) - \mathrm{Tr}\,\Omega \wedge \mathrm{Tr}\,\Omega\}, \\
c_3(\Omega) &= \frac{i}{48\pi^3}\{-2\, \mathrm{Tr}(\Omega \wedge \Omega \wedge \Omega) + 3\, \mathrm{Tr}(\Omega \wedge \Omega) \wedge \mathrm{Tr}\,\Omega - \mathrm{Tr}\,\Omega \wedge \mathrm{Tr}\,\Omega \wedge \mathrm{Tr}\,\Omega\}.
\end{aligned} \tag{10A.18} $$

These expressions for $c_j$ are obtained from the eigenvalue expansion of $a = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$. We use

$$ \det\left(I + \frac{i}{2\pi} a\right) = \prod_{j=1}^{k}\left(1 + \frac{i}{2\pi}\lambda_j\right) = 1 + \frac{i}{2\pi} S_1(\lambda) + \left(\frac{i}{2\pi}\right)^2 S_2(\lambda) + \cdots, $$

and replace the matrix a by $\Omega$ to obtain (10A.17-18). Since $c_j(\Omega) \in \Lambda^{2j}(M)$, it follows that $c_j(\Omega) = 0$ for 2j > n = dim M, showing that the total Chern form $c(\Omega)$ is finite. Note that for pure Yang-Mills theory on $\mathbb{R}^4$, the only non-zero Chern forms are $c_0$, $c_1$ and $c_2$. We also note that since the Chern forms $c_j(\Omega)$ are elementary symmetric functions, any invariant polynomial P(a) can be expressed in terms of $c_j(\Omega)$; the Chern forms are thus said to generate the 'characteristic ring.' In view of property (10A.16a) it also follows that any homogeneous polynomial in the expansion of the invariant polynomial $P(\Omega)$ is closed, i.e., $dP_j(\Omega) = 0$ for all j. This implies that the Chern forms $c_j(\Omega)$ define 2j-th cohomology classes in the sense that:

$$ c_j(\Omega) \in H^{2j}(M). \tag{10A.19} $$
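The step from the determinant expansion to the trace formulas (10A.18) rests on Newton's identities between elementary symmetric functions and power sums. The sketch below, an illustration added here with hypothetical function names, checks this for a diagonal matrix $a = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$ with ordinary numbers as eigenvalues. It assumes that wedge products of 2-forms commute, so the same algebra carries over when a is replaced by $\Omega$.

```python
import itertools
import math


def elementary_symmetric(lams, j):
    """S_j(lambda): sum over all products of j distinct eigenvalues, as in (10A.13)."""
    return sum(math.prod(combo) for combo in itertools.combinations(lams, j))


def power_sum(lams, j):
    """Tr a^j for a = diag(lams)."""
    return sum(lam ** j for lam in lams)


def chern_from_determinant(lams):
    """c_1, c_2, c_3 read off from det(I + (i/2pi) a) = sum_j (i/2pi)^j S_j."""
    x = 1j / (2 * math.pi)
    return tuple(x ** j * elementary_symmetric(lams, j) for j in (1, 2, 3))


def chern_from_traces(lams):
    """c_1, c_2, c_3 from the trace expressions (10A.18), with Tr a^j as power sums."""
    t1, t2, t3 = (power_sum(lams, j) for j in (1, 2, 3))
    c1 = 1j / (2 * math.pi) * t1
    c2 = (t2 - t1 * t1) / (8 * math.pi ** 2)
    c3 = 1j / (48 * math.pi ** 3) * (-2 * t3 + 3 * t2 * t1 - t1 ** 3)
    return c1, c2, c3


if __name__ == "__main__":
    eigenvalues = [0.3, -1.2, 2.0, 0.5]
    for det_value, trace_value in zip(chern_from_determinant(eigenvalues),
                                      chern_from_traces(eigenvalues)):
        print(det_value, trace_value)
```

Both routes give the same three numbers, which confirms the signs and the factors $1/8\pi^2$ and $i/48\pi^3$ in (10A.18).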

This cohomology class ($\in H^{2j}(M)$) is often denoted as $c_j(E)$ or simply as $c_j$ and is referred to as a Chern class. Each of these Chern classes is independent of the connection form $\omega$, since the difference $P(\Omega) - P(\Omega')$ is exact for any characteristic (invariant) polynomial. We mention two important facts about these classes: (1) the cohomology classes to which $c_j(\Omega)$ belong are 'integer' classes, (2) the integer obtained by integrating $c_j(\Omega)$ over any 2j-cycle (with integral coefficients) in M is independent of the connection. The integrals $\int_M c_j(\Omega)$ (for varying j) computed over the entire manifold using characteristic polynomials are called the 'Chern numbers' of the bundle E and are denoted as $C_j(E)$, etc. When n = 4, naturally the only Chern numbers are:

$$ C_2(E) = \int_M c_2(\Omega); \qquad C_1^2(E) = \int_M c_1(\Omega) \wedge c_1(\Omega). \tag{10A.20} $$

We list here three more properties of Chern classes given in terms of the total Chern class $c(E) = c_0(E) + c_1(E) + \cdots + c_k(E)$ of a k-dimensional complex vector bundle E over M:

$$ c(E \oplus F) = c(E) \wedge c(F) \tag{10A.21}(1) $$

where $E \oplus F$ is the direct sum of two k-dimensional complex vector bundles E and F,

$$ c_1(L \otimes L') = c_1(L) + c_1(L') \tag{10A.21}(2) $$

for line bundles L and L', and

$$ c(f^* E) = f^* c(E) \tag{10A.21}(3) $$

where $f : M' \to M$, and $E' = f^* E$ is the pullback of E over M'.

A.11 Chern Characters

The Chern character Ch(E) is defined by writing $\Omega$ in the invariant polynomial (10A.15)(ii), thus:

$$ \mathrm{Ch}(\Omega) = \mathrm{Tr}\, \exp\left(\frac{i}{2\pi}\Omega\right) = \sum_j \frac{1}{j!}\, \mathrm{Tr}\left(\frac{i}{2\pi}\Omega\right)^j. \tag{10A.22} $$

Since $\mathrm{Tr}(\Omega)^j = 0$ for 2j > n, Ch(E) stands for a finite sum. It can be checked that Ch(E) has the splitting principle expansion, i.e.:

$$ \mathrm{Ch}(E) = k + c_1(E) + \tfrac{1}{2}(c_1^2 - 2c_2)(E) + \cdots \tag{10A.23} $$

Finally we conclude that while the total Chern class has the generating function $\prod_j (1 + x_j)$, the Chern character has the generating function $\sum_j \exp x_j$, where $x_j = \frac{i}{2\pi}\Omega_j$ is the variable coming from:

$$ c(E) = \det\left(1 + \frac{i}{2\pi}\Omega\right) = \det \begin{pmatrix} 1 + \frac{i}{2\pi}\Omega_1 & & 0 \\ & \ddots & \\ 0 & & 1 + \frac{i}{2\pi}\Omega_k \end{pmatrix}. $$

(See the original work of Chern [19] and [24] for more details; some of these ideas are explained in the Hints to Exc. (6.7.1) and (6.2).)

A.12 Elliptic Operator, Index of an Operator, and Heat Operator

These operators are used mostly with index theory problems. This is why we did not study them in Chap. 3 along with other operators. Here too we merely give their definitions, and refer the reader to other texts devoted primarily to these operators. (See Atiyah in Ref. [Ad], Gilkey in Ref. [Ad], and 7.[14].) The reader will soon realize that even the definitions are complicated enough to puzzle a beginner. In order to avoid this, we begin with a simple elliptic equation, e.g. $a_1\, \partial^2 f/\partial x^2 + a_2\, \partial^2 f/\partial y^2 + a_3\, \partial^2 f/\partial z^2 = \partial^2 f/\partial t^2$, where f is a $C^\infty$-function on $\mathbb{R}^4$ and $a_i > 0$ (i = 1, 2, 3) are real constants that are not simultaneously zero; and a one-dimensional heat equation $\lambda\, \partial^2 F/\partial x^2 = \partial F/\partial t$, where F is a function on $\mathbb{R}^2$ and $\lambda \neq 0$. We shall see that there are some similarities amongst these equations and the definitions that we give below. The following notations will be used in these definitions: let $|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n$ denote the length of an ordered n-tuple of non-negative integers $\alpha = (\alpha_1, \ldots, \alpha_n)$, and let

$$ \partial_x^\alpha = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} \quad \text{and} \quad D_x^\alpha := (-i)^{|\alpha|}\, \partial_x^\alpha \tag{10A.24} $$

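denote the derivatives with respect to the variable $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$. Consider a trivial complex vector bundle $U \times$ ...

$$ P = \sum_{|\alpha| \le k} a_\alpha(x)\, D_x^\alpha \tag{10A.25} $$

where $a_\alpha$ is a (j x h) matrix valued function on U. Let $U = \mathbb{R}^m$ and let ... denote the space of sections of $\mathbb{R}^m \times$ ...

$$ Pf(x) = (2\pi)^{-m/2} \int_{\mathbb{R}^m} e^{i x \cdot y}\, p(x, y)\, (Ff)(y)\, d\mu(y) \tag{10A.26} $$

p(x, y) being the matrix valued function defined by

$$ p(x, y) = \sum_{|\alpha| \le k} a_\alpha(x)\, y^\alpha. \tag{10A.27} $$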

40 $S^m$ denotes the space of (rapidly decreasing) smooth, complex valued functions on $\mathbb{R}^m$ such that for every $p \in \mathbb{N}$ and for every multi-index $\alpha$, there exists a real positive constant $C_{\alpha p}$ such that $(1 + |x|^2)^p\, |D^\alpha f(x)| \le C_{\alpha p}$ for every x in $\mathbb{R}^m$. The space $S^m$ is called the Schwartz space; this is a dense subspace of $L^2(\mu)$, the space of complex valued functions on $\mathbb{R}^m$ which are square summable with respect to the measure $\mu$ on $\mathbb{R}^m$. The space $S^m$ can be isometrically mapped onto itself by the Fourier transform $Ff(x) := (2\pi)^{-m/2} \int_{\mathbb{R}^m} f(y)\, e^{-i x \cdot y}\, d\mu(y)$.

41 $Ff(x) \equiv \hat{f}(x) := (2\pi)^{-m/2} \int_{\mathbb{R}^m} f(y)\, e^{-i x \cdot y}\, d\mu(y)$; here $\mu$ denotes the Lebesgue measure on $\mathbb{R}^m$, f is a Lebesgue integrable function $f : \mathbb{R}^m \to \mathbb{C}$, and $x \cdot y = \sum_{i=1}^{m} x_i y_i$.

The function p is called the total symbol of P, while the function $\sigma_k(P)$ defined by

$$ \sigma_k(P)(x, y) = \sum_{|\alpha| = k} a_\alpha(x)\, y^\alpha \tag{10A.28} $$

is called the k-symbol or the principal symbol^42 (or simply the symbol) of P. Let E and F be Hermitian vector bundles of rank h and j respectively over an m-dimensional differential manifold M. Let $P : \Gamma(E) \to \Gamma(F)$ be a linear operator from sections of E to sections of F. The operator P is called a (linear) differential operator of order k from E to F if locally, in a coordinate neighbourhood U, P has the expression (10A.25). The space of linear differential operators of order k from E to F is denoted as $D_k(E, F)$. To define the symbol of an operator $P \in D_k(E, F)$ we consider the cotangent bundle $T^*(M)$, set $T_0^*(M) = T^*M \setminus \{\text{the image of the zero section}\}$, and let $p_0 : T_0^*(M) \to M$ be the restriction of the canonical projection $p : T^*(M) \to M$ to $T_0^*(M)$. We use the pull-backs $p_0^* E$, $p_0^* F$ of the bundles E and F to the space $T_0^*(M)$ with respective projections $\pi_E^*$ and $\pi_F^*$ to define a vector bundle morphism $\sigma : p_0^* E \to p_0^* F$ over $T_0^*(M)$.

The set of k-symbols is the set $Sm_k(E, F)$ defined as:

$$ Sm_k(E, F) := \{\, \sigma \in \mathrm{Hom}(p_0^* E, p_0^* F) \mid \sigma(c\,\alpha_x, e) = c^k\, \sigma(\alpha_x, e),\ c > 0,\ \alpha_x \in T_0^*(M) \,\}. \tag{10A.29} $$

The k-symbol (or simply the symbol) of P, denoted by $\sigma_k(P) \in Sm_k(E, F)$, is then defined as:

$$ \sigma_k(P)(\alpha_x, e) = P(t)(x) \tag{10A.30} $$

where $(\alpha_x, e) \in p_0^* E$ and $t \in \Gamma(E)$ are defined as follows. Choose $s \in \Gamma(E)$ and $f \in \mathcal{F}(M)$ such that s(x) = e and $df(x) = \alpha_x$ ($x \in M$); then

$$ t = \frac{i^k}{k!}\, (f - f(x))^k\, s. $$

The operator (P(t))(x) depends only on $\alpha_x$ and e, and is independent of the local choices made. An operator $P \in D_k(E, F)$ is called elliptic if $\sigma_k(P)$ is an isomorphism. If P is elliptic, then rank(E) = rank(F). The space of these elliptic operators from E to F is generally denoted as $El_k(E, F)$. (See [35] for more details.) In order to define a heat operator we introduce the concept of pseudo-differential operators. For this we consider a smooth compact Riemannian m-dimensional manifold M. A linear operator $P : C^\infty(M) \to C^\infty(M)$ is called a pseudo-differential operator of order d, denoted generally as $P \in \Psi_d(M)$, if for every open chart U on M and for every $\psi$, ...

42 Sometimes the principal symbol is called the leading symbol and is denoted as $\sigma_L$.

The very definition of a pseudo-differential operator P on M implies that there can exist such operators of all orders on M. We define

$$ \Psi(M) = \bigcup_d \Psi_d(M) \quad \text{and} \quad \Psi_{-\infty}(M) = \bigcap_d \Psi_d(M) $$

to be the set of all pseudo-differential operators on M, and the set of infinitely smoothing operators on M. We often use the notation $\Psi$DO of order d for an operator P of order d defined above. Let V be a graded vector bundle, i.e. a collection of vector bundles $\{V_j\}_{j \in \mathbb{Z}}$ such that $V_j \neq \{0\}$ for only a finite number of indices j. We let P be a graded $\Psi$DO of order d, i.e. P is a collection of d-th order pseudo-differential operators $P_j : C^\infty(V_j) \to C^\infty(V_{j+1})$. We call the pair (P, V) a complex if

$$ P_{j+1} P_j = 0 \quad \text{and} \quad \sigma_L P_{j+1}\, \sigma_L P_j = 0. $$

We say that the complex (V, P) is elliptic if

$$ N(\sigma_L P_j)(x, \xi) = R(\sigma_L P_{j-1})(x, \xi) \quad \text{for } \xi \neq 0 \tag{10A.31} $$

i.e. if the complex is exact on the symbol level. (In (10A.31) N stands for the null space of $\sigma_L P_j$ and R stands for the range of $\sigma_L P_{j-1}$.) The cohomology of this complex is defined as:

$$ H^j(V, P) = N(P_j)/R(P_{j-1}). \tag{10A.32} $$

If (V, P) is an elliptic complex, $H^j(V, P)$ is finite dimensional. The Euler characteristic of this complex is the index:

$$ \mathrm{index}(P) = \sum_j (-1)^j \dim H^j(V, P). \tag{10A.33} $$

Finding the index and the spectrum of an operator has always been a challenging problem in mathematics, for which several analytical and differential geometric methods have been used. One of these is the heat equation method. It can be shown that the heat equation provides a local formula for the index of any elliptic complex. (See [5] and Patodi in Ref. [Ad].) We define below the heat equation of a $\Psi$DO operator $P : C^\infty(V) \to C^\infty(V)$ of order d > 0 which is self-adjoint with positive definite leading symbol. We note that Spec(P) of such an operator is contained in $[-C, \infty)$ for some constant C. The heat equation for P is the partial differential equation:

$$ \left(\frac{\partial}{\partial t} + P\right) f(x, t) = 0 \quad \text{for } t > 0, \text{ with } f(x, 0) = f(x). \tag{10A.34} $$

A formal solution of this equation is:

$$ f(x, t) = e^{-tP} f(x). \tag{10A.35} $$

(Note that f(x, t) represents a column of functions.) Writing $f(x) = \sum_n C_n \phi_n(x)$ as a generalized Fourier series,^43 the solution becomes:

$$ f(x, t) = \sum_n e^{-t\lambda_n} C_n\, \phi_n(x), \qquad \lambda_n \in [-C, \infty). \tag{10A.36} $$

43 $\{\phi_n(x)\}$ represents a complete orthonormal system of eigenvectors here and $C_n = (f, \phi_n)$.

Furthermore, using the kernel map:

$$ K(t, x, y) = \sum_n e^{-t\lambda_n}\, \phi_n(x) \otimes \phi_n(y) : V_y \to V_x \tag{10A.36} $$

we have:

$$ e^{-tP} f(x) = \int_M K(t, x, y)\, f(y)\, d\mathrm{vol}(y) = \sum_n e^{-t\lambda_n}\, \phi_n(x) \int_M f(y)\, \phi_n(y)\, d\mathrm{vol}(y). \tag{10A.37} $$

From the above it is evident that K(t, x, y) can be regarded as an endomorphism from the fiber of V over y to the fiber of V over x. By abuse of language we shall call $e^{-tP}$ the heat operator corresponding to P. In conclusion we note a simple result which uses the heat equation to define the index of an elliptic $\Psi$DO (of order d > 0) $P : C^\infty(V) \to C^\infty(W)$.

Result: If P is an elliptic $\Psi$DO with the properties given above, then for t > 0, $e^{-tP^*P}$ and $e^{-tPP^*}$ are in $\Psi_{-\infty}$ with smooth kernel functions and

$$ \mathrm{index}(P) = \mathrm{Tr}\, e^{-tP^*P} - \mathrm{Tr}\, e^{-tPP^*} \tag{10A.38} $$

where P* is the operator adjoint to P. (See Gilkey in Ref. Ad [25] for more details.)
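A finite-dimensional analogue makes the t-independence in (10A.38) easy to see. For a matrix $P : \mathbb{C}^n \to \mathbb{C}^m$ the non-zero eigenvalues of $P^*P$ and $PP^*$ coincide, so the difference of heat traces only counts zero modes and equals dim ker P - dim coker P = n - m for every t. The sketch below is an illustration under that finite-dimensional assumption, not the operator-theoretic statement itself; all function names are hypothetical and the matrix exponential is summed as a plain Taylor series.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def adjoint(X):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[X[i][j].conjugate() for i in range(len(X))] for j in range(len(X[0]))]


def trace_exp(M, t, terms=80):
    """Tr exp(-t M) from the Taylor series; adequate for small matrices of modest norm."""
    n = len(M)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    total = float(n)  # trace of the zeroth-order term
    for order in range(1, terms):
        term = matmul(term, M)
        term = [[-t * entry / order for entry in row] for row in term]
        total += sum(term[i][i] for i in range(n))
    return total


def heat_index(P, t):
    """Tr exp(-t P*P) - Tr exp(-t P P*) for a rectangular matrix P."""
    P_star = adjoint(P)
    return trace_exp(matmul(P_star, P), t) - trace_exp(matmul(P, P_star), t)


if __name__ == "__main__":
    P = [[1.0, 2.0, 0.0],
         [0.0, 1.0, -1.0]]  # maps C^3 to C^2
    for t in (0.1, 0.5, 1.0):
        print(t, heat_index(P, t))
```

For this 2 x 3 example the difference stays at 3 - 2 = 1, up to rounding, for each value of t.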

A.13 Renormalization

A renormalization algorithm can be divided into two general parts: (i) a 'regularization procedure' that makes integrals well-defined, and thus accessible to formal manipulations, and (ii) a 'subtraction procedure' that cancels divergences in the physical matrix elements. In fact (i) is the most powerful method of demonstrating renormalizability for gauge field theories; more specifically, it is dimensional regularization that is used in problems involving multiloop Feynman diagrams. It respects those 'algebraic relations' among Green's functions that do not depend on the dimensionality of spacetime. In a gauge theory with a high degree of symmetry, e.g. in QED, it is important that these algebraic relations remain unaffected by renormalization. These algebraic relations are the well known Ward identities in the case of QED, and are referred to as generalized Ward identities or Taylor-Slavnov identities in the case of non-abelian gauge theories.

References

1. S. L. Adler and R. F. Dashen, Current Algebra and Application to Particle Physics; (New York: Benjamin 1968).
2. M. F. Atiyah, (a) Geometry of Yang-Mills Fields; Scuola Normale Superiore, (Pisa 1979); (b) Instantons in Two and Four Dimensions; Commn. Math. Phys. 93 (1984); (c) Magnetic Monopoles in Hyperbolic Spaces, in Proc. Int'l Coll. on Vector Bundles in Algebraic Geometry, Tata Inst., Bombay (1984).
3. M. F. Atiyah, R. Bott and A. Shapiro, Clifford Modules, Topology 3 (Suppl. 1) (1964).
4. M. F. Atiyah and I. M. Singer, The Index of Elliptic Operators; I, III, IV, Ann. of Math. 87 (1968), 484-530; 87 (1968), 546-604; 93 (1971), 119-138.
5. M. F. Atiyah, V. K. Patodi and I. M. Singer, Spectral Asymmetry and Riemannian Geometry; I, II, III, Math. Proc. Camb. Phil. Soc. 77, 78, 79 (1975), (1975), (1976), 44-60, 44-60, 71-99.
6. M. F. Atiyah, N. Hitchin and I. M. Singer, (a) Deformations of Instantons; Proc. Nat. Acad. 74 (1977); (b) Self-duality in Four-dimensional Riemannian Geometry; Proc. R. Soc. London A 362 (1978), 425-461.
7. M. F. Atiyah, V. G. Drinfeld, N. J. Hitchin, Yu. I. Manin, Construction of Instantons; Phys. Lett. 65A (1978), 185-187.
8. M. F. Atiyah and R. S. Ward, Instantons and Algebraic Geometry; Commn. Math. Phys. 55 (1977), 117-124.
9. M. F. Atiyah and R. Bott, (a) Yang-Mills and Bundles over Algebraic Curves in Geometry and Analysis; Patodi Memorial Volume, Ind. Acad. of Sc. (1980); (b) The Yang-Mills Equations over Riemann Surfaces; Phil. Trans. R. Soc. London A308 (1982).
10. M. F. Atiyah and N. Hitchin, The Geometry and Dynamics of Magnetic Monopoles; (Princeton Univ. Press, 1988).
11. S. Axelrod, S. D. Pietra and E. Witten, Geometric Quantization of Chern-Simons Gauge Theory; J. Diff. Geom. 36 (1991), 787-902.
12. S. Axelrod and I. M. Singer, (a) Chern-Simons Perturbation Theory; Proc. XXth Conf. on Differential Geometric Methods in Phys.; (b) Chern-Simons Perturbation Theory; J. Diff. Geom. 39 (1994), 173-213.
13. A. de Azcarraga (Ed), Topics in Quantum Field Theory and Gauge Theories; (Springer-Verlag 1978); (i) P. Goddard, (Magnetic Monopoles and Related Topics).
14. W. A. Bardeen, Anomalous Ward Identities in Spinor Field Theories; Phys. Rev. 184 (1969) 1848.
15. A. A. Belavin, A. M. Polyakov, A. Schwarz and Y. Tyupkin, Pseudoparticle Solutions of the Yang-Mills Equations; Phys. Lett. 59B (1975), 85-87.
16. E. B. Bogomolny, Stability of Classical Solutions; Sov. J. Nuc. Phys. 24 (1976), 861-870.
17. R. Bott and L. Tu, 1.[5].
18. W. Chen, G. W. Semenoff and Y. S. Wu, Finite Renormalization of Chern-Simons Gauge Theory, 8.[23].
19. S. Chern, Complex Manifolds without Potential Theory; (Springer-Verlag 1979).
20. M. Daniel and C. M. Viallet, The Geometrical Setting of Gauge Theories of the Yang-Mills Type; Rev. Mod. Phys. Vol. 52, 1 (1980), 175-197.
21. P. Dita, V. Georgescu, R. Purice (eds.), Gauge Theories, Fundamental Interactions and Rigorous Results; (Boston: Birkhauser, 1982). (i) J. C. Taylor (Standard Electroweak Theory and Decoupling Theorems); (ii) D. Olive (The Structure of Self-dual Monopoles); (iii) G. Trautman (Yang-Mills and Vector Bundles); (iv) W. Barth (Mathematical Instanton Bundles); (v) M. F. Atiyah (Solution of Classical Equations); (vi) A. Jaffe (The Self-duality Problem for Gauge Theories).
22. S. K. Donaldson, (a) Instantons and Geometric Invariant Theory; Commn. Math. Phys. 93 (1984), 453-460; (b) The Yang-Mills Equations on Euclidean Space in Perspectives in Mathematics, 93-109; (Birkhauser-Verlag-Basel, 1984); (c) Anti-Self-dual Yang-Mills Connections over Complex Algebraic Surfaces and Stable Vector Bundles, Proc. London Math. Soc. 3 (1985), 1-26;

(d) Compactification and Completion of Yang-Mills Moduli Spaces in Lect. Notes in Math. # 1410 (Springer-Verlag 1989), 145-160.
23. W. Drechsler and M. E. Mayer, Fiber Bundle Techniques in Gauge Theories; (Springer-Verlag, 1977).
24. T. Eguchi, P. B. Gilkey and A. J. Hanson, Gravitation, Gauge Theories and Differential Geometry; Phys. Rep. Vol. 66, 6 (1980), 213-393.
25. D. S. Freed and K. Uhlenbeck, Instantons and Four-manifolds; (Springer-Verlag 1984).
26. P. H. Frampton, Gauge Field Theories; (Benjamin/Cummings 1987).
27. A. Heil, et al., Anomalies from the Point of View of G-Theory; J. Geom. Phys. 6 (1989).
28. P. W. Higgs, 6.[17].
29. N. Hitchin, (a) Monopoles and Geodesics; Commn. Math. Phys. 83 (1982), 579-602; (b) On the Construction of Monopoles; Commn. Math. Phys. 89 (1983), 145-190.
30. R. Jackiw, Topics in Planar Physics, in 8.[23].
31. R. Jackiw, C. Nohl and C. Rebbi, Classical and Semiclassical Solutions to Yang-Mills Theory, in Proc. Banff School, (Plenum Press, 1977).
32. A. Jaffe and C. H. Taubes, Vortices and Monopoles; (Birkhauser, 1980).
33. J. L. Lopes, Gauge Field Theories, An Introduction; (Pergamon Press, 1981).
34. N. S. Manton, (a) The force between t'Hooft-Polyakov monopoles; Nuc. Phys. B126 (1977), 525-541; (b) A remark on the scattering of BPS monopoles; Phys. Lett. 110B (1982), 54-56.
35. K. B. Marathe and G. Martucci, The Mathematical Foundations of Gauge Theories, (North-Holland, 1991).
36. M. Martinis and I. Andric (Ed.), Superstrings, Anomalies and Unification; World Scientific (1987). (i) J. Wess (Anomalies, Schwinger Terms and Chern Forms); (ii) A. A. Slavnov (Effective Lagrangians, Anomalies and Spontaneous Symmetry Breaking); (iii) R. D. Ball (Effective Action for Chiral Fermions).
37. V. Moncrief, Gauge Symmetries of Yang-Mills Fields, Ann. Phys. 108 (1977), 387-400.
38. W. Nahm, The Construction of all Self-dual Monopoles by the ADHM Method, in Monopoles in Quantum Field Theory (eds.) N. S. Craigie, P. Goddard and W. Nahm, (World Scientific, 1982).
39. C. Nash, Differential Topology and Quantum Field Theory, (Academic Press, 1991).
40. M. K. Prasad, (a) Instantons and Monopoles in Yang-Mills Gauge Field Theories; Physica 1D (1980), 167-; (b) Yang-Mills-Higgs Monopole Solutions of Arbitrary Topological Charge; Commn. Math. Phys. 80 (1981), 137-149.
41. M. K. Prasad and C. M. Sommerfield, Exact Classical Solutions for the t'Hooft Monopole and Julia-Zee Dyon; Phys. Rev. Lett. 35 (1975), 760-762.
42. R. Rennie, Geometry and Topology of Chiral Anomalies in Gauge Theories, Adv. in Phys. 39 (1990).
43. A. Salam, Gauge Unification of Fundamental Forces; Rev. Mod. Phys. 92 (1980), 525-536.
44. I. M. Singer, (a) The Geometry of the Orbit Space for Non-abelian Gauge Theories; Physica Scripta 24 (1981), 817-820; (b) Some Problems in the Quantization of Gauge Theories and String Theories in the Mathematical Heritage of Hermann Weyl (ed.) R. Wells, Jr., Proc. Symp. Pure Math., Vol. 48 (AMS 1988), 199-216.
45. I. M. Singer and J. Thorpe, (a) Lecture Notes on Elementary Topology and Geometry; (Scott Foresman, Glenview, ILL, 1967); (b) The Curvature of 4-dimensional Einstein Spaces, in Global Analysis, Papers in honor of K. Kodaira, (ed.) D. C. Spencer and S. Iyanaga, (Princeton Univ. Press, 1969), 355-365.

660

Mathematical Perspectives on Theoretical Physics

46. C. H. Taubes, (a) Stability in Yang-Mills Theories, Commn. Math. Phys. 91 (1983), 235-263; (b) Min-max Theory for the Yang-Mills-Higgs Equations; Commn. Math. Phys. 97 (1985), 473-540; (c) The Seiberg-Witten Invariants and Symplectic Forms, Math. Rev. Lett. 1 (1994), 809-822; (d) More Constraints on Symplectic Forms From Sieberg-Witten Invariants, (Harvard Preprint). 47. G. t'Hooft, (a) Gauge Theories of the Forces between Fundamental Particles; Sci. Amer. 242 6 (1980), 104-138; (b) Some Twisted Self-dual Solutions for the Yang-Mills Equations on a Hypertorus, Commn. Math. Phys. 81 (1981), 267-275. 48. K. Uhlenbeck, (a) Removable Singularities in Yang-Mills Fields, Commn. Math. Phys. 83 (1982), 31-42; (b) Connections with If Bounds on Curvature; Commn. Math. Psys. 83 (1982), 31-42. 49. G. Velo and A. S. Wightman, Fundamental Problems of Gauge Field Theory; (New York Plenum Press, 1986), (i) A. Wightman, (Fundamental Problems of Gauge Field Theory.), (ii) L. Alvarez-Gaumme (An Introduction to Anomalies.) 50. C. M. Viallet, Symmetry and Functional Integration, in 8.[23]. 51. J. C. Ward, An Identity in Quantum Electrodynamics; Phys. Rev. 78 (1950), 1824. 52. R. S. Ward, (a) A Yang-Mills-Higgs Monopole of Charge 2, Commn. Math. Phys. 79 (1981), 317-325, (b) Ansatz for Self-dual Yang-Mills Fields; Commn. Math. Phys. 80 (1981), 563-574; (c) Slowly Moving Lumps in the CP1 Model in (2 + 1) Dimensions, Phys. Lett. 58B (1985), 424-428. 53. R. Weder, Absence of Stationary Solutions to Einstein-Yang-Mills Equations; Phys. Rev., Vol. 25, No. 10, (1982), 2515. 54. E. Witten, (a) Some Exact Multipseudoparticle Solutions of Classical Yang-Mills Theory, Phys. Rev. Lett. 38 (1977), 121-124; (b) An Interpretation of Classical Yang-Mills Theory, Phys. Lett. 77B (1978), 394-398; (c) An SU(2) Anomaly, Phys. Lett. 117B (1982), 324-328; (d) Global Gravitational Anomalies, Commn. Math. Phys. 100 (1985), 197-229. 55. C. N. Yang and R. L. 
Mills, (a) Isotopic Spin Conservation and a Generalised Gauge Invariance, Phys. Rev. 95 (1954), 631-; (b) Conservation of Isotopic Spin and Isotopic Gauge Invariance, Phys. Rev. 96 (1954), 191-195.

CHAPTER 11

STRINGS AND SUPERSTRINGS (ELEMENTARY ASPECTS)

1

INTRODUCTION

This final chapter of our work brings us to the theory of strings and superstrings, the so-called 'Theory of Everything' according to Davies in Ref. [7]. In a way, this chapter is the culmination of the previous chapters, as it uses the mathematical techniques explained in Chapters 1 to 7 and unifies the physical theories summarized in Chapters 8 to 10. We begin with a brief historical perspective of this phenomenal theory. The theory is barely 30 years old, since the earlier notion of hadronic strings due to hadrons (see App. 11A for definition) has very little to do with the present theory, which emerged from considerations of dual models and their miraculous consequences in the form of M-theory. These dual models, as we shall see in Sec. 2, were constructed, to begin with, in order to help explain Regge's resonances in scattering experiments of high-energy particles of spin J > 1. The very first paper that established the role of duality in string theory is due to Veneziano [34]. As we shall see below, it was this paper that led to the scattering amplitude A(s, t) of strongly interacting particles of different mass and spin that might be exchanged in a t-channel (in the present form):*

$A(s, t) = -\sum_J \frac{g_J^2\,(-s)^J}{t - M_J^2}$ (11.1.1)

Here $g_J$ and $M_J$ are respectively the coupling constant and the mass of a particle of spin J. The variables s and t, which also refer to channels (see the App.), are Mandelstam variables defined by the momenta of strongly interacting high-energy particles. In fact they represent respectively the creation of a 'resonance' and an 'interaction' when these particles collide (in Sec. 2 we shall see their relation to Regge poles). To write the amplitude (11.1.1) Veneziano used the duality principle of the (mathematically) well-known Euler beta function:

$B(u, v) = \int_0^1 dx\, x^{u-1} (1 - x)^{v-1} = B(v, u)$ (11.1.2a)

He postulated:

$A(s, t) = \Gamma(-\alpha(s))\,\Gamma(-\alpha(t)) / \Gamma(-\alpha(s) - \alpha(t))$ (11.1.2b)

in terms of the gamma function

$\Gamma(u) = \int_0^\infty t^{u-1} e^{-t}\, dt$ (11.1.2c)

and the Regge trajectory $\alpha(s)$.

* Formula (11.1.1) for A(s, t) in the asymptotic region of large s and fixed t is the simplified version of a sum of partial waves in the t-channel expressed as $A(s, t) = \sum_J (2J + 1)\, P_J(\cos\theta_t)\, f_J(t)$. Note that this is given in terms of the Legendre polynomial $P_J(\cos\theta_t)$ and the partial wave amplitude $f_J(t)$, $\theta_t$ being the t-channel centre-of-mass scattering angle: $\cos\theta_t = (u - s)/(t - 4m^2)$. (See A.3 and A.4 and M. B. Green in [26].)
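The duality property used by Veneziano can be illustrated numerically. The following is a minimal sketch, not part of the original treatment: it compares the integral (11.1.2a) with its gamma-function form, checks the s-t symmetry of (11.1.2b), and shows the amplitude growing near a pole. The linear trajectory and the values `alpha0 = 0.5` and `slope = 0.9` are illustrative assumptions only.

```python
import math

def euler_beta(u, v):
    """B(u, v) written through gamma functions."""
    return math.gamma(u) * math.gamma(v) / math.gamma(u + v)

def euler_beta_integral(u, v, steps=20000):
    """Midpoint-rule estimate of the integral in (11.1.2a), intended for u, v >= 1."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** (u - 1) * (1 - (i + 0.5) * h) ** (v - 1)
                   for i in range(steps))

def regge_alpha(x, alpha0=0.5, slope=0.9):
    """Linear trajectory; alpha0 and slope are illustrative assumptions."""
    return alpha0 + slope * x

def veneziano(s, t):
    """A(s, t) of (11.1.2b) for the assumed linear trajectory."""
    a_s, a_t = regge_alpha(s), regge_alpha(t)
    return math.gamma(-a_s) * math.gamma(-a_t) / math.gamma(-a_s - a_t)

# Integral and gamma-function forms of the beta function agree.
print(euler_beta(2.5, 3.5), euler_beta_integral(2.5, 3.5))
# Duality: the amplitude is unchanged when s and t are exchanged.
print(veneziano(-1.3, -0.7), veneziano(-0.7, -1.3))
# Close to alpha(t) = 1 the amplitude becomes very large (a simple pole).
t_pole = (1 - 0.5) / 0.9
print(veneziano(-1.3, t_pole - 1e-6))
```

The symmetry under exchange of s and t is the numerical counterpart of B(u, v) = B(v, u).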

The function B(u, v) had all the properties of an S-matrix for hadronic scattering except unitarity. Unfortunately, however, dual models were not very popular in the theories of strong interactions at that time, and as such, Veneziano's model received little support from physicists. A few years later, Virasoro et al. [35] suggested through their theoretical work that the beta function (11.1.2a) should be treated as the lowest-order Born term in a Feynman-like perturbation series involving multiloops. Almost at the same time, experimentalists confirmed that in the Regge region the predictions of the Veneziano model agreed with the high-energy behaviour of strong interactions [10]. This changed the picture in favour of Veneziano's approach. Merits of the model, such as an easy generalization to n dimensions and its suitability as an appropriate format for relativistic string theory, were soon recognized. More importantly, interest in dual models as a means to solve complicated problems, such as quantum gravity, revived. A large number of physicists and mathematicians met regularly to discuss and present new ideas. Countless research papers flowed out from these meetings, to be collected later in the form of proceedings (see [5], [20], [29], [31], [37], [1], [2], [3], [4], [5], [37] and still newer references added in the proof, e.g. [1], [2], [3], [4], [5]). The major thrust of these meetings was the search for a 'flawless theory', that is, a theory without divergences, ghosts, anomalies, tachyons, symmetry violations, etc. In short, this amounted to establishing a theory which had a built-in structure to overcome all conceivable inconsistencies, and whose 'governing equations', like the equations of other theories, met the needed requirements of invariance under reparametrization and under conformal or gauge change; in addition, they were expected to be accessible to quantization.

In the process, the concepts of supersymmetry and graded Lie algebras entered the dual models, and these models began to be viewed as theories of quantum gravity in the restricted dimensions 26 and 10, the dimensions that arose as consistency conditions (see (11.2.21)). The former was valid for the Veneziano model of bosons and the latter for the Ramond-Neveu-Schwarz model of bosons and fermions. The choice of these (abnormal) dimensions is justified using the arguments of 5-dimensional Kaluza-Klein theory, where the one additional dimension is compactified into a circle (see [36a]). Here the numbers of dimensions to be compactified were 22 and 6, respectively. Soon after, the two-dimensional world-sheet supersymmetry of the Ramond-Neveu-Schwarz string model was generalized to four-dimensional spacetime supersymmetry by Wess and Zumino [6]. In the early 1980s, through the efforts of Green and Schwarz, it was established that the above model represented the tachyon-free spinning string theory, in which one-loop diagrams were finite; it was also shown that the theory had no ultraviolet divergences. What emerged as a result of their work were the supersymmetric string theories that came (quite naturally) to be known as superstring theories. These theories differ from each other in the symmetry/supersymmetry groups that are involved*. However, they all have one aspect in common: the capability of unifying the two earlier great physical theories, namely quantum mechanics and general relativity. For it is here (i.e., in superstring theory) that quantum gravity makes sense. Since there exists sufficient expository material on the subject, e.g., [13], [15], [19], and [32], we devote ourselves to the basic mathematics of string and superstring theory. In other words, we concentrate simply on learning the principles that govern the theory. In order to accomplish this, we study various types of 'invariances', e.g. Poincaré and Weyl, that are required to formulate a consistent theory, and also the 'formalisms', e.g. light-cone and covariant path integral, needed to describe it. We use the bosonic string theory as the basis of this introductory chapter on strings. Due to our limited scope we are only able to give a glimpse of world-sheet supersymmetry and space-time supersymmetry in string theory. Our study in this chapter is mainly based on the two volumes of [13]; the notation used is also borrowed from there.

For a linear Regge trajectory $\alpha(s) \sim \alpha' s$, the asymptotic behaviour of the Veneziano amplitude (11.1.2) for large s and fixed t is $A(s, t) \sim s^{\alpha(t)}$.

* During the past five years much more work has been done in this area, which shows that these theories are 'really' not different (see Ref. [Ad] and the references added in the proof).

2

FROM PARTICLES TO STRINGS

Before passing on to the 'definition/description of a string', which is loaded with the complexities of the past 30 years of research, we would like to recall the context in which the word string was used earlier. One usually talked of an elastic or inelastic string attached to a massive body, and of a string attached to a musical instrument. The only parameters that distinguished strings in general from one another were the "tension" and the "vibration".¹ To measure the "tension" T of a string in a given configuration, "Hooke's law"² was invoked, while the "vibration" was represented by the velocity $v_w$ of a transverse wave propagating along the string:

$v_w = \sqrt{\frac{T}{m/L}}$, m/L = mass per unit length (11.2.1)

Evidently these two parameters are not independent of each other. We shall soon see the relation of these parameters to the present theory when a Regge trajectory is considered. We also note that the concept of dimensionality played no role in the description of these strings. Here we shall begin by defining the 'string' as a one-dimensional object which traces a world-sheet as it moves in physical space-time.³ As such, we shall write the action of a string along lines similar to that of a point particle, which as we know is a 0-dimensional object that traces a world-line as it moves from one position to another in space-time.
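As a small worked instance of (11.2.1), the sketch below evaluates the transverse wave speed of an ordinary string and inverts the relation to recover the tension. The numerical values (a 1 m string of mass 4 g under a tension of 100 N) are purely illustrative assumptions.

```python
import math

def transverse_wave_speed(tension, mass, length):
    """v_w = sqrt(T / (m/L)) as in (11.2.1), in SI units."""
    mass_per_length = mass / length
    return math.sqrt(tension / mass_per_length)

def tension_from_speed(speed, mass, length):
    """Inverse relation: T = v_w^2 * (m/L)."""
    return speed ** 2 * mass / length

speed = transverse_wave_speed(tension=100.0, mass=0.004, length=1.0)
print(speed)                                   # wave speed in m/s
print(tension_from_speed(speed, 0.004, 1.0))   # recovers the tension in N
```

The second function makes explicit why "tension" and "vibration" are not independent parameters: once the mass per unit length is known, either one fixes the other.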

2.1

Regge Trajectories

In order to understand the implications of this approach from particles to strings, we have to take a detour to Regge's work on some mathematical properties of S-matrices. For it was here that it was observed for the first time that the characteristic properties of particles resonating from collisions of high-energy particles resembled strings more than they did point-like particles. These Regge resonances⁴ behaved like temporary localizations of matter and energy, referred to as "ringing of energy", and they were extended in space. They possessed a very large value of spin, as compared to the spin quantum number 1/2 of the hadron particles. And since these resonances were particles with high spin, they could be considered as having a sort of angular momentum, and as such they were more like a spinning ball of finite size than a particle of 0-dimensionality (represented by a point). Regge showed that these resonances ("ringings") of S-matrices (though too many in number to be all fundamental) could be plotted on a graph in an orderly fashion. Obviously for different hadrons the graphs were different. These graphs are called Regge trajectories. Given below are two figures (11.1a and 11.1b) that display the scattering data of baryon and meson resonances.

¹ If any particle of a tight string whose ends are fixed is moved aside and released (perturbed), then it vibrates back and forth, perpendicular to the length of the string, and this disturbance spreads out along the string in both directions.
² According to Hooke's law, associated with any material substance there is a constant (similar to specific gravity) which is independent of the shape and size of the body and is determined by the proportionality between the applied force and the stretch in the substance. This constant is known as the 'modulus of elasticity' of the substance.
³ The usage of the word space-time in place of spacetime indicates that the space dimension here is arbitrary.
⁴ A resonance is a very short-lived phenomenon (particle) that occurs during a scattering experiment.

Fig. 11.1 (a) A plot of baryon resonances. (b) A plot of meson resonances. Horizontal axis: t (GeV²).

These figures confirm that resonances have angular momentum and that they can be considered as extended objects, or temporary particles smeared out over a finite region of space. The graphs also show that, to a very good approximation, a Regge trajectory can be taken as a straight line:

$\alpha(t) = \alpha_0 + \alpha' t$ (11.2.2)

When a linear approximation is taken for a Regge trajectory, $\alpha(t)$ is referred to as the Regge function. We shall see that this function is related to the mass of the particles of the scattering data and that it takes discrete values. Recall that the amplitude A(s, t) as defined in (11A.1) has simple poles at $\alpha(t) = n$ $(n = 0, 1, 2, \ldots, \infty)$. Thus in view of (11.2.2) it has poles at:

$t = \frac{n - \alpha_0}{\alpha'}, \quad n = 0, 1, 2, \ldots, \infty$ (11.2.3)

But from (11.1.1) the poles of A(s, t) are also given at

$t = M_J^2$ (11.2.4)

These two values of t taken together show that in essence the singularities of A(s, t) are simple poles that correspond to the t-channel exchange of particles of mass

$M^2 = \frac{n - \alpha_0}{\alpha'}$ (11.2.5)

In view of Eq. (11A.1), we note that the residue at these poles is an n-th order polynomial in s, which reflects the fact that the particles of mass squared $(n - \alpha_0)/\alpha'$ can have spin at most n. As a result, the smallest possible mass squared of a particle of spin J is $(J - \alpha_0)/\alpha'$. It is for this reason that $\alpha'$ is called the Regge slope. The particles of mass $M^2 = (J - \alpha_0)/\alpha'$ are said to lie on the leading Regge trajectory.
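The pole positions (11.2.3) and the masses on the leading trajectory can be tabulated directly. The sketch below assumes a linear trajectory with the illustrative values $\alpha_0 = 0.5$ and $\alpha' = 0.9$ GeV⁻²; these numbers are assumptions chosen for the example, not values quoted in the text.

```python
def regge_function(t, alpha0, slope):
    """Linear Regge function alpha(t) = alpha0 + slope * t of (11.2.2)."""
    return alpha0 + slope * t

def pole_position(n, alpha0, slope):
    """Solve alpha(t) = n for t, as in (11.2.3)."""
    return (n - alpha0) / slope

def leading_mass_squared(spin, alpha0, slope):
    """Smallest mass squared allowed for a given spin J, as in (11.2.5)."""
    return (spin - alpha0) / slope

ALPHA0, SLOPE = 0.5, 0.9  # illustrative assumptions (slope in GeV^-2)

for n in range(5):
    t_n = pole_position(n, ALPHA0, SLOPE)
    # A negative t_n would correspond to a state of negative mass squared.
    print(n, t_n, regge_function(t_n, ALPHA0, SLOPE))
```

Successive poles are spaced by $1/\alpha'$ in t, so a steeper trajectory packs the resonances more closely in mass squared.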

The slope $\alpha'$ is, for obvious reasons, assumed to be > 0, for otherwise all or most of the particles would be tachyons. One of the assumptions in Regge's theory is that the amplitude contains a pole at $J = \alpha(t)$; this in turn means:

$\alpha(M_J^2) = J$ (11.2.6)

i.e., $\alpha(t)$ is fixed at discrete positive values of t by the positions of the t-channel resonances (see M. B. Green in [26]). The following remarks in relation to the above discussion are noteworthy:

Remark 11.2.1 By making use of the analyticity of scattering amplitudes, it can be shown that the scattering data can be fitted by assuming that the amplitude is expressed either as a sum over resonances in the s-channel or as a sum over Regge poles* (corresponding to the exchange of resonances in the t-channel). This dual relationship is depicted figuratively in Fig. 11.2.

Fig. 11.2 Equality of the sum over t-channel poles and s-channel poles.

Remark 11.2.2 Dual resonance models carry a natural scale, set by $\alpha'$, which has dimensions [length]². In the case of string theory this is related to the string tension T (which has dimensions of [length]⁻² or [mass]²) by the equality:

$\alpha' = \frac{1}{2\pi T}$ (11.2.7)
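To get a feeling for the scale set by (11.2.7), the following sketch converts a Regge slope into a string tension, first in natural units (GeV²) and then in GeV per fermi. The slope of 0.9 GeV⁻² is an illustrative assumption; the conversion uses the standard value of ħc.

```python
import math

HBAR_C_GEV_FM = 0.1973269804  # hbar * c in GeV * fm

def string_tension(regge_slope):
    """T = 1 / (2 * pi * alpha') from (11.2.7); slope in GeV^-2, result in GeV^2."""
    return 1.0 / (2.0 * math.pi * regge_slope)

def regge_slope(tension):
    """Inverse relation alpha' = 1 / (2 * pi * T)."""
    return 1.0 / (2.0 * math.pi * tension)

slope = 0.9                                # illustrative assumption, GeV^-2
tension = string_tension(slope)            # in GeV^2
print(tension, tension / HBAR_C_GEV_FM)    # GeV^2 and GeV per fm
```

With this assumed slope the tension comes out a little under 1 GeV per fermi, which is the order of magnitude one associates with hadronic strings.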

2.2

An Action of a Classical Point Particle and Nambu-Goto Action of a String

We recall from Chapter 9 that the motion of a classical point particle can be described by an action:

$S = \int d\tau\, e^{-1}(\tau)\, \eta_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}$ (11.2.8)

where the indices $\mu, \nu$ in the Minkowski metric $\eta_{\mu\nu}$ of D-dimensional space-time take the values 0, 1, ..., D - 1, $\tau$ is an arbitrary parameter along the trajectory, $x^\mu(\tau)$ is the position of the particle, and $e(\tau)$ is a kind of metric along the particle world-line that ensures the reparametrization invariance of S. If one were to use the gauge invariance property of S by picking a gauge with e = 1, then (11.2.8) simplifies to

$S = \int d\tau\, \eta_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}$ (11.2.9)

* A Regge pole determines not only the positions of the t-channel resonances but also the power behaviour of the high-energy s-channel scattering amplitude.

This action is evidently quadratic, with variational equations:

$\frac{d^2 x^\mu}{d\tau^2} = 0$ (11.2.10)

The solutions of this equation are straight lines in Minkowski space. It does not follow, however, that an arbitrary straight line in Minkowski space is a solution of the variational equations obtained from the original action S. We further note that by choosing the gauge e = 1 we have eliminated e altogether; the right course to follow is therefore to impose first the equation of motion

$\frac{\delta S}{\delta e} = 0$ (11.2.11)

and then fix the gauge e = 1. The above equation of motion implies that the gauge invariant quantity

$T = \eta_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}$ (11.2.12)

must vanish, which shows that the solutions of our equations are the light-like geodesics in Minkowski space, implying finally that (11.2.8) is indeed the action for a 'massless' classical point particle. We also make the following two remarks:

Remark 11.2.3

The variational Eq. (11.2.10) derived from the gauge-fixed action S does not imply T = 0; it does imply, though, that T is a conserved quantity, i.e., $dT/d\tau = 0$. Thus if we have a gauge-fixed action, then T = 0 is in fact a constraint on the initial data. When the system is quantized, the canonical momentum $p_\mu = \eta_{\mu\nu}\, dx^\nu/d\tau$ becomes $p_\mu = -i\, \partial/\partial x^\mu$. This means that T becomes the Lorentz invariant wave operator $\eta^{\mu\nu}\, \partial^2/\partial x^\mu \partial x^\nu = \Box$ (see the operator-list in Sec. 3.1; it is denoted as D there). Just as classically we allowed orbits with T = 0 to represent the point particle, here we should allow only those quantum states $\phi = \phi(x^\mu)$ in space-time which are annihilated by T, i.e., those that satisfy $T\phi = 0$.⁵

Remark 11.2.4 The action (11.2.8) is covariant under the Poincaré transformations $x^\mu \to a^\mu{}_\nu x^\nu + b^\mu$, where $(a^\mu{}_\nu)$ stands for a Lorentz transformation and $(b^\mu)$ is a constant vector. This implies that the Schrödinger equation derived from (11.2.8) is Poincaré invariant. Also, symmetry (of the quantum system) under reparametrization of the time coordinate leads to the fact that the time derivative $d/d\tau$ does not appear in the Schrödinger equation.

We shall use these features of the classical point particle to write the classical equations of a free string. A string can be either open or closed. The open string is conventionally described by a coordinate $\sigma$ that runs from 0 to $\pi$ (assuming that the string has endpoints). A closed string, which can topologically be viewed as a circle, is described by the closed interval $0 \le \sigma \le \pi$. The motion of these strings is described with the help of another parameter $\tau$, which is timelike in the sense that it is a sort of time coordinate for an observer sitting on the string at the position $\sigma$. As the string propagates in space-time (Fig. (11.3)), it traces a world-sheet, which is the generalization of the world-line of a point particle. Mathematically this world-sheet is described by specifying the coordinates $X^\mu(\sigma, \tau)$ giving the position of the string for parametric values $\sigma$ and $\tau$.

⁵ This is the massless Klein-Gordon equation $\Box\phi = 0$, and it is the Schrödinger equation for the quantum system pertaining to (11.2.8). We studied these in Chapter 9 (see Sec. (9.2)).
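Returning briefly to the point particle, the content of Remark 11.2.3 can be made concrete with a short numerical sketch. Every straight line solves (11.2.10) and has a constant value of T, but only light-like lines satisfy the constraint T = 0. The metric signature (-, +, +, +) and the two sample velocities are assumptions made for the example.

```python
def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); the sign convention is an assumption."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def straight_line(start, velocity):
    """Return the path tau -> start + velocity * tau, a solution of (11.2.10)."""
    def path(tau):
        return [s + v * tau for s, v in zip(start, velocity)]
    return path

def tangent(path, tau, step=1e-3):
    """Central-difference estimate of dx/dtau."""
    ahead, behind = path(tau + step), path(tau - step)
    return [(a - b) / (2 * step) for a, b in zip(ahead, behind)]

def constraint_T(path, tau):
    """T of (11.2.12) evaluated on the path at parameter tau."""
    v = tangent(path, tau)
    return minkowski_dot(v, v)

light_like = straight_line([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0])
time_like = straight_line([0.0, 0.0, 0.0, 0.0], [1.0, 0.5, 0.0, 0.0])

for tau in (0.0, 1.0, 2.5):
    # T is the same at every tau (conserved), but vanishes only on the light-like line.
    print(tau, constraint_T(light_like, tau), constraint_T(time_like, tau))
```

The time-like line is a perfectly good solution of the gauge-fixed equations, which is why T = 0 has to be imposed separately as a condition on the initial data.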

Fig. 11.3 (i) An open string, (ii) a closed string propagating in Minkowski space sweeps out a world-sheet. The vertical direction is time.

Just as the action of the point particle was proportional to the length of the world-line, the action here (suggested originally by Nambu and Goto (see Ref. [Ad])) is proportional to the area of this world-sheet. Since the world-sheet is embedded in Minkowski space, the action in terms of area can simply be written as:

$S = T \int d\sigma\, d\tau\, \sqrt{\dot X^2 X'^2 - (\dot X \cdot X')^2}$ (11.2.13)

where

$\dot X^\mu = \frac{\partial X^\mu(\sigma, \tau)}{\partial \tau}, \quad X'^\mu = \frac{\partial X^\mu(\sigma, \tau)}{\partial \sigma}$ (11.2.14)

and T is the constant of proportionality that makes the expression dimensionless. From the fact that the action is expressed in terms of area, it follows that the solutions of the classical equations of the free string are world-sheets of minimal (or at least extremal) area. We note that this is the generalization of the point-particle case, where the solutions are geodesics, or curves of minimal length. Now (11.2.13) is highly nonlinear, especially because of the square root; to make the action tractable one therefore introduces another variable $h_{\alpha\beta}$ (in addition to $X^\mu(\sigma, \tau)$), which is viewed as a metric tensor of the world-sheet. The action in terms of $h_{\alpha\beta}$ becomes:⁶

$S = -\frac{T}{2} \int d^2\sigma\, \sqrt{h}\, h^{\alpha\beta}\, \eta_{\mu\nu}\, \partial_\alpha X^\mu\, \partial_\beta X^\nu$ (11.2.15)

where $h^{\alpha\beta}$ is the inverse of $h_{\alpha\beta}$ and $h = |\det(h_{\alpha\beta})|$. Since (11.2.15) does not involve derivatives of $h_{\alpha\beta}$, its equation of motion is a constraint equation, and $h_{\alpha\beta}$ can be eliminated or integrated out, which eventually gives back (11.2.13). It is worth noting here that classically (11.2.15) describes the propagation of a string in Minkowski space of arbitrary dimension, but quantum mechanically it is of interest only when D = 26 (see Eq. (11.2.21)).

⁶ It is customary to regard the pair $(\sigma, \tau)$ as a two-vector $\sigma^\alpha = (\sigma, \tau)$ and to use $d^2\sigma$ as a synonym for $d\tau\, d\sigma$.
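The claim that eliminating the world-sheet metric from (11.2.15) gives back the area form (11.2.13) can be checked pointwise. The sketch below compares the two integrands (up to the overall factors of T and the overall sign) at a single world-sheet point. To keep the square root real without further discussion it assumes a Euclidean target space, which is a simplification of the Minkowski setting used in the text; the tangent vectors and the trial metric are arbitrary sample values.

```python
import math

def dot(a, b):
    """Euclidean inner product (assumption: Euclidean target space)."""
    return sum(x * y for x, y in zip(a, b))

def induced_metric(x_tau, x_sigma):
    """2x2 matrix of inner products of the two tangent vectors."""
    return [[dot(x_tau, x_tau), dot(x_tau, x_sigma)],
            [dot(x_sigma, x_tau), dot(x_sigma, x_sigma)]]

def nambu_goto_density(x_tau, x_sigma):
    """Area element sqrt(Xdot^2 X'^2 - (Xdot . X')^2) of (11.2.13)."""
    g = induced_metric(x_tau, x_sigma)
    return math.sqrt(g[0][0] * g[1][1] - g[0][1] ** 2)

def polyakov_density(h, x_tau, x_sigma):
    """(1/2) sqrt(h) h^{ab} d_a X . d_b X, the integrand of (11.2.15) without -T."""
    det_h = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    h_inv = [[h[1][1] / det_h, -h[0][1] / det_h],
             [-h[1][0] / det_h, h[0][0] / det_h]]
    g = induced_metric(x_tau, x_sigma)
    contraction = sum(h_inv[a][b] * g[a][b] for a in range(2) for b in range(2))
    return 0.5 * math.sqrt(abs(det_h)) * contraction

x_tau, x_sigma = (1.0, 0.2, -0.3), (0.1, 0.8, 0.5)   # sample tangent vectors
area = nambu_goto_density(x_tau, x_sigma)
on_shell = polyakov_density(induced_metric(x_tau, x_sigma), x_tau, x_sigma)
off_shell = polyakov_density([[2.0, 0.3], [0.3, 1.0]], x_tau, x_sigma)
print(area, on_shell, off_shell)
```

When h is taken to be the induced metric the two densities coincide, whereas a generic positive-definite h gives a value that is at least as large; this is the pointwise version of the statement that the constraint equation for h selects the area functional.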

2.3

Reparametrization and Constraint Equations

We note that each of the actions (11.2.13) and (11.2.15) is invariant under general coordinate transformations of the world-sheet, $\tau, \sigma \to \tau'(\tau, \sigma), \sigma'(\tau, \sigma)$; in other words, both these actions describe a field theory that is generally covariant in (1 + 1) dimensions. We further note that under such transformations the world-sheet metric $h_{\alpha\beta}$ (as given in (11.2.15)) transforms according to the usual transformation law of a metric tensor. In this (1 + 1)-dimensional field theory the coordinates $X^\mu$ enter as scalar fields. They transform as vectors under 26-dimensional Poincaré transformations, whereas they transform as scalars under reparametrizations of the world-sheet. We also remark:

Remark 11.2.5 The action (11.2.15) represents the standard form for the coupling of massless scalar fields $X^\mu$ to (1 + 1)-dimensional gravity. The two-dimensional actions in (11.2.13) and (11.2.15), together with their supersymmetric generalizations, are the only known actions that can describe generally covariant field theories in a space-time of any dimension.

Now the symmetric tensor $h_{\alpha\beta}$ ($\alpha, \beta = 1, 2$) has three independent components, and two of these can be eliminated by a suitable choice of coordinate transformation $\sigma, \tau \to \sigma', \tau'$.⁸ A standard choice, known as the conformal gauge choice, is the one where the world-sheet is reparametrized by setting $h_{\alpha\beta} = \eta_{\alpha\beta}\, e^{\phi}$. The tensor $\eta_{\alpha\beta}$ here is the metric of a flat world-sheet and $e^{\phi}$ is a conformal factor. In two dimensions, as we already know (see Sec. (1.4)), the factor $e^{\phi}$ drops out of (11.2.15), since $\sqrt{h}$ is proportional to $e^{\phi}$ and $h^{\alpha\beta}$ is proportional to $e^{-\phi}$. As a result, in conformal gauge (11.2.15) reduces to a simple free field action:

$S = -\frac{T}{2} \int d^2\sigma\, \eta^{\alpha\beta}\, \partial_\alpha X^\mu\, \partial_\beta X_\mu$ (11.2.16)
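The cancellation of the conformal factor can be verified with a few lines of arithmetic. The sketch below forms the combination $\sqrt{h}\, h^{\alpha\beta}$ that appears in (11.2.15) and checks that it is unchanged when the 2x2 metric is multiplied by $e^{\phi}$; the sample metric and the value of $\phi$ are arbitrary assumptions.

```python
import math

def weyl_rescale(h, phi):
    """Return the rescaled metric exp(phi) * h."""
    factor = math.exp(phi)
    return [[factor * entry for entry in row] for row in h]

def density_weighted_inverse(h):
    """sqrt(|det h|) times the inverse of the 2x2 metric h."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    root = math.sqrt(abs(det))
    inverse = [[h[1][1] / det, -h[0][1] / det],
               [-h[1][0] / det, h[0][0] / det]]
    return [[root * entry for entry in row] for row in inverse]

h = [[-1.0, 0.2], [0.2, 1.5]]      # a sample Lorentzian world-sheet metric
before = density_weighted_inverse(h)
after = density_weighted_inverse(weyl_rescale(h, phi=0.7))
print(before)
print(after)

# In conformal gauge h = exp(phi) * eta the combination reduces to the flat one.
flat = [[-1.0, 0.0], [0.0, 1.0]]
print(density_weighted_inverse(weyl_rescale(flat, phi=-1.3)))
```

The cancellation is special to two dimensions: for a d x d metric the determinant factor contributes $e^{d\phi/2}$ while the inverse contributes $e^{-\phi}$, and these compensate only for d = 2.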

The equation of motion that results from this is the linear wave equation:

$\left( \frac{\partial^2}{\partial \sigma^2} - \frac{\partial^2}{\partial \tau^2} \right) X^\mu = 0$ (11.2.17)

Since this simple equation results from a gauge-fixed action, we have to supplement it with a constraint equation, just as we did in the point-particle case. This constraint equation is evidently

$\frac{\delta S}{\delta h^{\alpha\beta}} = 0$ (11.2.18)

and, since in (1 + 1)-dimensional quantum field theory the energy-momentum (EM) tensor is defined as

$T_{\alpha\beta} = -\frac{2\pi}{\sqrt{h}}\, \frac{\delta S}{\delta h^{\alpha\beta}}$⁹ (11.2.19)

(see also Sec. (8.2)), the constraint equations are simply $T_{\alpha\beta} = 0$. If one were to choose light-cone coordinates $\sigma^\pm = \tau \pm \sigma$ on the world-sheet (see Sec. (1.4)), then $T_{++} = \partial_+ X^\mu \partial_+ X_\mu$ and $T_{--} = \partial_- X^\mu \partial_- X_\mu$, whereas $T_{+-} = 0$. But $T_{+-}$ is the trace of the 2-dimensional energy-momentum tensor; its vanishing implies that the massless scalar field theory given by the action (11.2.16) is conformally invariant in (1 + 1) dimensions, a fact which is of great importance in string theory. We shall see in the next section that classically the general solution of the wave equation together with the constraint equation can be found without much difficulty. Quantum mechanically, though, it is not so easy, since ghost coordinates need to be introduced when the constraint equations are considered. We note that the constraint equations require that a physical state $|\phi\rangle$ must satisfy:

$T_{\alpha\beta}\, |\phi\rangle = 0$ (11.2.20)

These equations (like the Schrödinger equation for point particles) are linear. In order to study their significance, one writes the equal-time commutation relations of the $T_{\alpha\beta}$ (in light-cone coordinates):

$[T_{++}(\sigma), T_{++}(\sigma')] = i\,\left( T_{++}(\sigma) + T_{++}(\sigma') \right) \delta'(\sigma - \sigma') + \frac{i}{24}\,(26 - D)\, \delta'''(\sigma - \sigma')$

$[T_{++}, T_{--}] = 0$ (11.2.21)

(The expression for $T_{--}$ is similar to that for $T_{++}$.) The first term on the RHS arises in the classical theory by taking the Poisson bracket, but the second term arises quantum mechanically via an anomaly. A physical state $|\phi\rangle$ can therefore satisfy $T_{++}|\phi\rangle = 0$ only when the second term is zero, i.e., when the dimension D of space-time is 26. We shall pursue these ideas in later sections. We further note (see Rem. (11.2.4)) that, apart from the local reparametrization invariance of string world-sheets, the action (and hence the resulting theory) is invariant under the global transformation $X^\mu \to \Lambda^\mu{}_\nu X^\nu + b^\mu$ as well. The entities $\Lambda^\mu{}_\nu$ represent a constant orthogonal matrix and $b^\mu$ represents a constant vector ($\mu, \nu = 0, 1, \ldots, 25$). For an observer in space-time these are Poincaré transformations, but for a creature living on the string they are internal symmetries of the 26 free, massless quantum fields propagating on the string. The Poincaré invariance of the theory implies that the Hilbert space would provide a unitary representation of the Poincaré group, and thus particle states would be labelled by their mass and spin. In the case of a string which is free to oscillate, there would be an infinite number of harmonics, which would give rise to an infinite number of particle states. So far we have discussed a free string, but just as in the case of particles, in nature we also come across more than one string interacting with each other. We shall discuss below a simple form of action in this case. Here again the discussion is based on point-particle theory.

⁷ Invariance under reparametrizations of the world-sheet is essential for solving the minimal surface equations derived from (11.2.15).
⁸ This is possible since in 2 dimensions a general coordinate transformation $\sigma, \tau \to \sigma', \tau'$ depends only on two free functions, namely the new coordinates $\sigma', \tau'$.
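A concrete classical solution helps to see how the wave equation and the constraints $T_{++} = T_{--} = 0$ work together. The sketch below uses a rigidly rotating open string in 2 + 1 dimensions as an example; this particular configuration, the signature (-, +, +) and the finite-difference step sizes are all assumptions made for the illustration and are not taken from the text.

```python
import math

def rotating_string(tau, sigma, amplitude=1.0):
    """X^mu(tau, sigma) = (X^0, X^1, X^2) for a rigidly rotating open string."""
    return (amplitude * tau,
            amplitude * math.cos(sigma) * math.cos(tau),
            amplitude * math.cos(sigma) * math.sin(tau))

def minkowski_dot(a, b):
    """Signature (-, +, +); the sign convention is an assumption."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def shifted(f, tau, sigma, wrt, step):
    """Evaluate f with either tau or sigma shifted by step."""
    return f(tau + step, sigma) if wrt == "tau" else f(tau, sigma + step)

def first_derivative(f, tau, sigma, wrt, step=1e-5):
    """Central-difference first derivative with respect to tau or sigma."""
    up = shifted(f, tau, sigma, wrt, step)
    down = shifted(f, tau, sigma, wrt, -step)
    return tuple((u - d) / (2 * step) for u, d in zip(up, down))

def second_derivative(f, tau, sigma, wrt, step=1e-4):
    """Central-difference second derivative with respect to tau or sigma."""
    up = shifted(f, tau, sigma, wrt, step)
    down = shifted(f, tau, sigma, wrt, -step)
    mid = f(tau, sigma)
    return tuple((u - 2 * m + d) / step ** 2 for u, m, d in zip(up, mid, down))

def wave_residual(f, tau, sigma):
    """Left-hand side of the wave equation (11.2.17), component by component."""
    x_ss = second_derivative(f, tau, sigma, "sigma")
    x_tt = second_derivative(f, tau, sigma, "tau")
    return tuple(a - b for a, b in zip(x_ss, x_tt))

def light_cone_constraints(f, tau, sigma):
    """(T_++, T_--) built from d_+- X = (Xdot +- X') / 2."""
    x_dot = first_derivative(f, tau, sigma, "tau")
    x_prime = first_derivative(f, tau, sigma, "sigma")
    d_plus = tuple(0.5 * (a + b) for a, b in zip(x_dot, x_prime))
    d_minus = tuple(0.5 * (a - b) for a, b in zip(x_dot, x_prime))
    return minkowski_dot(d_plus, d_plus), minkowski_dot(d_minus, d_minus)

print(wave_residual(rotating_string, 0.3, 1.1))
print(light_cone_constraints(rotating_string, 0.3, 1.1))
```

Both printed tuples are zero up to finite-difference error. Note that the constraints tie the scale of $X^0$ to the size of the rotating string: rescaling only the spatial components would still solve the wave equation but would violate $T_{++} = T_{--} = 0$.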

2.4

Interacting Strings and Vertex Operators

From Chapter 9 we know that the propagator for a massless scalar field between space-time points x and y is given by the formula:

$\langle y |\, \Box^{-1} | x \rangle = \int_0^\infty d\tau\, \langle y |\, e^{-\tau \Box} | x \rangle$ (11.2.22)

where $\Box$ is the d'Alembertian wave operator $\Box = \eta^{\mu\nu}\, \partial^2 / \partial x^\mu \partial x^\nu$. On the other hand, the Hamiltonian of a non-relativistic particle of mass m, which is $p^2/2m = \Box/2m$, simplifies to $\Box$ for m = 1/2. Hence the operator $e^{-\tau \Box}$ on the RHS of (11.2.22) is merely the operator that propagates the non-relativistic particle through the imaginary proper time $\tau$. If one uses the path integral formula, then (11.2.22) can be written as:

$\langle y |\, \Box^{-1} | x \rangle = \int_0^\infty d\tau \int_x^y Dx(t)\, \exp\left\{ -\frac{1}{4} \int_0^\tau \dot x^2\, dt \right\}$ (11.2.23)

where the exponent is the action of a classical point particle and $\int_x^y Dx(t)$ stands for an integral over all paths x(t) originating at x(0) = x and ending at $x(\tau) = y$. We now consider a Feynman diagram of four external particles originating at space-time points A, B, C, D with interactions occurring at p and q, and we recall that each line in the figure corresponds to a propagator. Thus with the representation (11.2.23) for the propagator (see Fig. (11.4)) each line stands for an integration over the trajectory of a particle that propagated in space-time between the indicated points. While evaluating the diagram one has to integrate over the interaction points p and q and has to include at the vertices the factors that depend on the theory considered.

⁹ Here $h = -\det h_{\alpha\beta}$, and the factor of $2\pi$ is added on account of string theory.
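The proper-time representation (11.2.22) can be checked numerically in a simple setting. The sketch below replaces Minkowski space-time by Euclidean 3-space, an assumption made so that the integral converges, and uses the kernel of a free non-relativistic particle of mass 1/2. Integrating this kernel over all proper times should reproduce the familiar Green's function $1/(4\pi r)$ of the Laplacian. The integration range and step count are ad hoc choices.

```python
import math

def heat_kernel_3d(r, tau):
    """Kernel <y| exp(-tau * H) |x> for H = p^2 (mass 1/2) in Euclidean 3-space,
    with r = |x - y|."""
    return (4.0 * math.pi * tau) ** -1.5 * math.exp(-r * r / (4.0 * tau))

def propagator_from_proper_time(r, x_min=-30.0, x_max=30.0, steps=6000):
    """Integrate the kernel over tau in (0, infinity) with tau = exp(x),
    using the trapezoidal rule in x."""
    dx = (x_max - x_min) / steps
    total = 0.0
    for i in range(steps + 1):
        tau = math.exp(x_min + i * dx)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * heat_kernel_3d(r, tau) * tau * dx
    return total

def laplacian_green_function(r):
    """Closed-form inverse of the (negative) Laplacian in three dimensions."""
    return 1.0 / (4.0 * math.pi * r)

for r in (0.5, 1.0, 2.0):
    print(r, propagator_from_proper_time(r), laplacian_green_function(r))
```

The logarithmic substitution is used because the integrand falls off only like $\tau^{-3/2}$ at large proper time, so a uniform grid in $\tau$ would converge slowly.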

Fig. 11.4 Four external particles originating at space-time points A, B, C, D undergo a tree-level scattering process; p and q are the interaction points.

Fig. 11.5 Interaction vertices in field theory and in string theory. (i) A point particle splits into two. (ii) A closed string splits into two. There are two different Lorentz frames (Frame 1 and Frame 2), giving rise to two different families of surfaces of constant time. These are shown by solid and dashed lines.

In order to formulate the theory of interacting strings, we shall naturally make use of the results that are valid for point particles, although, as we shall soon see, there are great differences between the two theories. We begin by displaying a diagram (followed by remarks) for interaction vertices. The diagram and these remarks immediately show the origin of the differences, and thus confirm the statement we just made (see Fig. 11.5).

Remark 11.2.6 In the case of a particle there is a well-defined Lorentz invariant space-time point at which the splitting occurs. In the case of a string, however, the splitting occurs at different points for different frames. For frame 1 (say) it occurs at the point indicated by the solid dot, and for frame 2 at the point given by the open dot. This phenomenon of string splitting, though seemingly complicated, works in favour of string theory, as is evident from the following remark.

Remark 11.2.7 The Lorentz invariance of the interaction vertex in Fig. (11.5)(i) allows one to choose special factors that could be included at this vertex while defining the Feynman amplitude. Each such choice gives a different quantum field theory. From Fig. (11.5)(ii), however, it is evident that any part of the diagram looks locally like the propagation of a free string. Thus, once the rules for the propagation of a free string are chosen, the form of the interaction is uniquely determined, owing to the absence of a Lorentz invariant interaction point. This indeed is an important feature of string theory.

We give below two more diagrams to explain the difference between the two theories. We note that the diagrams in the case of strings are formed by world tubes of small radius. These diagrams, for particles as well as for strings, are evaluated by integrating over the trajectories in space-time of the propagating points or strings; since the radii of the tubes are very small, in the limit the string diagrams approximate the particle diagrams. This is the reason why one says that field theory emerges as the long-wavelength limit of string theory. The following two remarks state in brief the advantages of pursuing string theory in preference to particle field theory.

Remark 11.2.8 The interaction vertices A, B, C, D in Fig. (11.6)(i) give rise to an ultraviolet divergence, since when A = B = C = D the propagators connecting the vertices simultaneously blow up. In the string diagram (ii) there is no well-defined analog of these interaction vertices, hence (supposedly) there is no dangerous region where A = B = C = D, and consequently no ultraviolet divergence.¹⁰ It is because of this that string theories are said to be free of ultraviolet divergences, and as such they are considered to be finite theories.

Fig. 11.6 (i) A one-loop Feynman diagram with interaction points A, B, C, D; (ii) the corresponding string diagram for closed strings.

10 In fact divergences do arise in the Bosonic string theory when the world-sheet metric degenerates.


Remark 11.2.9 From Fig. (11.7) it is evident that, compared to field theory, there are far fewer string diagrams: the string diagrams obtained by blowing up the world-lines of the distinct field theory diagrams in (i) into world tubes are isomorphic to each other. They all have the same topology and therefore represent different integration regions of one and the same string diagram. We note that in the theory of oriented closed strings there is one and only one Feynman diagram in any given order of perturbation theory. In the case of unoriented closed strings or open strings this uniqueness is lost, but there are still far fewer string diagrams than Feynman diagrams in the corresponding field theories.

Fig. 11.7 (i) Feynman diagrams representing different one-loop corrections to a four-particle amplitude in point-particle (field) theory; (ii) the corresponding closed string diagrams.

To evaluate an integral over a world-sheet, a task which appears formidable to begin with, one uses the invariance of (11.2.15) under a conformal rescaling of the world-sheet metric $h_{\alpha\beta} \to e^{\phi} h_{\alpha\beta}$. A suitable choice of

Fig. 11.8 Compactification of a world-sheet via conformal mappings.


The following figure (Fig. (11.9)(i), (ii), (iii)) shows how a world-sheet with one incoming and one outgoing string (see (i)) can be conformally mapped in different ways. For instance, in (ii) it is mapped to the plane, where the incoming string appears as the origin and the outgoing string is at infinity (not shown in the figure), and in (iii) it is mapped to the sphere, with the south and north poles depicting the incoming and outgoing strings.

Fig. 11.9 Different representations of a world-sheet under conformal mappings.

The above images under conformal mappings can be obtained by using the transformations of the metric given below in Remark (11.2.10). The metric $d\tilde{s}^2$ there represents the conformally transformed metric.

Remark 11.2.10 (a) We begin with the metric $ds^2 = dz^2 + d\phi^2$, $-\infty < z < \infty$, $0 \le \phi \le 2\pi$, and write $z = \ln r$, which gives $ds^2 = r^{-2}(dr^2 + r^2 d\phi^2)$. By a conformal change of metric $ds^2 \to d\tilde{s}^2 = r^2 ds^2$ we get the new metric $d\tilde{s}^2 = dr^2 + r^2 d\phi^2$, which is the metric of the plane. By this choice the incoming (closed) string, which was a circle in the far past ($z = -\infty$, $0 \le \phi \le 2\pi$), has been mapped to a point at a finite distance ($r = 0$), as shown in Fig. (11.9)(ii), whereas the outgoing string has been mapped to the point at infinity. (b) If instead we rescale the metric as $ds^2 \to d\tilde{s}^2 = (dr^2 + r^2 d\phi^2)/(1 + r^2/a^2)^2$, we obtain the standard metric on the sphere. The incoming and outgoing strings are now both mapped to finite points, namely the south and north poles, as shown in (iii).

Remark 11.2.11 For a string diagram with many external lines, different conformal factors $e^{\phi}$ have to be chosen to map each of them to finite points. The thing to remember is that only the asymptotic behaviour of $e^{\phi}$ comes into play at the far out points (of any given string L) that have to be projected to points at a finite distance. These asymptotic behaviours of the different $e^{\phi}$'s can be chosen independently for each L.

Remark 11.2.12 In order to ensure that no information on the quantum numbers of the external string states is lost when they are conformally mapped to finite points, one should provide at each such point a 'local operator' carrying the quantum numbers of the string state that was mapped there. This means that for each string state we must find a local operator in the (1 + 1)-dimensional quantum field theory that describes string propagation. These local operators are called vertex operators. The vertex operator pertaining to a string state $|\Lambda\rangle$, which is responsible for the emission or absorption of $|\Lambda\rangle$, is usually denoted $W_\Lambda(\sigma, \tau)$.
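Remark 11.2.10(b) can be illustrated numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, and the check relies on the elementary integral of the area element $2\pi r\,dr/(1 + r^2/a^2)^2$, which tends to $\pi a^2$ as the disc radius grows. It shows that the rescaled metric assigns a finite total area to the whole plane, as one expects of a sphere, while the flat area grows without bound.

```python
import math


def conformal_area(a, r_max, steps=20000):
    """Area of the disc r <= r_max in the metric (dr^2 + r^2 dphi^2) / (1 + r^2/a^2)^2.

    With r = a*tan(theta) the area element 2*pi*r*dr / (1 + r^2/a^2)^2 becomes
    2*pi*a^2*sin(theta)*cos(theta)*dtheta, integrated here by the midpoint rule.
    """
    theta_max = math.atan(r_max / a)
    dtheta = theta_max / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        total += math.sin(theta) * math.cos(theta)
    return 2.0 * math.pi * a * a * total * dtheta


def flat_area(r_max):
    """Area of the same disc in the flat plane metric dr^2 + r^2 dphi^2."""
    return math.pi * r_max * r_max


if __name__ == "__main__":
    # The flat area diverges with r_max; the rescaled area saturates near pi*a^2.
    for r_max in (1.0, 10.0, 1.0e3, 1.0e6):
        print(r_max, flat_area(r_max), conformal_area(1.0, r_max))
```

The saturation of the rescaled area is the numerical counterpart of the statement that the point at infinity, where the outgoing string sits, lies at a finite distance in the new metric.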


Remark 11.2.13 In the case of closed strings, for each particle type $\Lambda$ we can find a local operator $W_\Lambda(\sigma, \tau)$ which is a scalar under reparametrizations of $(\sigma, \tau)$ and has the same Lorentz quantum numbers as $\Lambda$. Moreover, $W_\Lambda$ is a (suitable) polynomial in $X^\mu$ and its derivatives. For instance, if $\Lambda$ is the tachyon, which has spin zero under 26-dimensional Lorentz transformations, we can simply take $W_\Lambda = 1$. This operator is known as the tachyon operator, denoted $W$. If $\Lambda$ is the massless dilaton $D$ (see the Appendix for the definition), which also has spin zero, we again have to pick an operator with zero spin; the minimal choice that is orthogonal to the tachyon operator $W$ is $W_D \sim \partial_\alpha X^\mu \partial^\alpha X_\mu$. If on the other hand $\Lambda$ is the graviton $G$, we have to pick an operator with spin two. The minimal spin-two operator for a graviton of polarization $\mu\nu$ is $W_G^{\mu\nu} = \partial_\alpha X^\mu \partial^\alpha X^\nu$.

The operators $W_\Lambda$ defined above transform correctly under Lorentz transformations; we must, however, also take into account the global symmetry of space-time translations $X^\mu \to X^\mu + a^\mu$. Under this symmetry the position of each string is shifted by a constant $a^\mu$, and the wave function of an external state of momentum $k^\mu$ is multiplied by $e^{ik\cdot a}$. The simplest quantum field operator that transforms in this way under $X^\mu \to X^\mu + a^\mu$ is $e^{ik\cdot X}$. We therefore postulate that this factor is always present for the emission or absorption of a string of momentum $k^\mu$. It is also worth noting that the choice of the point marked x in Fig. (11.8), at which a given vertex operator is inserted, is quite arbitrary: this point can be anywhere on the surface. Collecting all these facts together, we now define the operator (see footnote 11)

$$V_\Lambda(k) = \int d^2\sigma\,\sqrt{h}\;W_\Lambda(\sigma, \tau)\,e^{ik\cdot X} \tag{11.2.24}$$

for the emission or absorption of a string state of type $\Lambda$ and momentum $k^\mu$. This is the well-known vertex operator; recall that we introduced it briefly while studying infinite dimensional algebras in Sec. (5.6). We next use these vertex operators to write the formula for the string scattering amplitude. In view of the discussion that led to Eq. (11.2.23), the amplitude for the scattering of particles of types $\Lambda_1, \ldots, \Lambda_n$ and momenta $k_1, \ldots, k_n$ should be a path integral in the (1 + 1)-dimensional quantum field theory that governs string propagation, with insertions of the operators $V_\Lambda$. Setting $T = 1/\pi$ we have:

$$A(\Lambda_1, k_1; \Lambda_2, k_2; \ldots; \Lambda_n, k_n) = \kappa^{n-2}\int DX(\sigma, \tau)\,Dh_{\alpha\beta}(\sigma, \tau)\,\exp\left\{-\frac{1}{2\pi}\int d^2\sigma\,\sqrt{h}\,h^{\alpha\beta}\,\partial_\alpha X^\mu\,\partial_\beta X_\mu\right\}\prod_{i=1}^{n} V_{\Lambda_i}(k_i) \tag{11.2.25}$$

where $\kappa$ is a coupling constant and the symbols $DX^\mu$ and $Dh_{\alpha\beta}$ denote path integrals on the compact string world-sheet of Fig. (11.9). We further note that to evaluate tree diagrams we require a surface that is topologically a sphere; similarly, to evaluate a one-loop diagram we require a torus, and to evaluate an n-loop diagram we need a surface with n handles (i.e., a Riemann surface of genus n). Although the integrand in (11.2.25) seems complicated, we shall see later in Exc. 1 that it can be simplified by a proper conformal choice.

11 In view of Remark (11.2.13) the vertex operators for the tachyon, graviton and dilaton are respectively $V_T = \int d^2\sigma\,e^{ik\cdot X}$, $V_G^{\mu\nu} = \int d^2\sigma\,\partial_\alpha X^\mu\,\partial^\alpha X^\nu\,e^{ik\cdot X}$, and $V_D = \int d^2\sigma\,\partial_\alpha X^\mu\,\partial^\alpha X_\mu\,e^{ik\cdot X}$. The vertex operator for an antisymmetric tensor is $V^{\mu\nu} = \int d^2\sigma\,\epsilon^{\alpha\beta}\,\partial_\alpha X^\mu\,\partial_\beta X^\nu\,e^{ik\cdot X}$. Except for the tachyon, whose mass satisfies $m^2 = -8$, these represent massless particles. The masses are determined by using the fact that $V_\Lambda$ is of dimension two.


Finally we note that open-string scattering amplitudes can be evaluated in a way similar to that for closed strings. Thus, to evaluate the n-point function for the scattering of tachyons, one conformally maps the world-sheet onto the upper half plane (see Fig. 11.10), uses the vertex operator $V(k) = \int dx\,e^{ik\cdot X}$ to represent a tachyon of momentum $k$, and writes the function as:

$$A(k_1, \ldots, k_n) = g^{n-2}\int dx_1 \cdots dx_n\,\left\langle\prod_{i=1}^{n} e^{ik_i\cdot X(x_i)}\right\rangle \tag{11.2.26}$$

Here $g$ is the coupling constant and each $x_i$ runs over the real axis.

Fig. 11.10 A planar world-sheet in (i) is conformally mapped in (ii) onto the disc and in (iii) onto the upper half complex plane. The external open-string states appear as insertions marked on the boundary in both (ii) and (iii).

Exercise 11.2

1. Use the conformal transformation and complex coordinate formalism to evaluate the string scattering amplitude (11.2.25) in 26 dimensions when the external particles are tachyons.

2. Show that the string scattering amplitude for four tachyons can be written as:

$$A = \kappa^2\int d^2z_4\,|z_4|^{k_1\cdot k_4/2}\,|1 - z_4|^{k_2\cdot k_4/2}.$$

Hints to Exercise 11.2

1. We note that in 26-dimensional space-time the string scattering amplitude (11.2.25) is expressed in terms of a correlation function in an auxiliary (1 + 1)-dimensional quantum field theory. Because of this particular choice of dimensionality, it can be simplified using conformal invariance without introducing a quantum anomaly. We also note that the integrand on the RHS contains $h_{\alpha\beta}$ and derivatives of $h_{\alpha\beta}$. In order to simplify it we reparametrize the world-sheet so as to reduce the three independent components of the 2 x 2 metric tensor to one component: $h_{\alpha\beta} = e^{\phi} h^0_{\alpha\beta}$, where $h^0_{\alpha\beta}$ is a suitable metric on the string world-sheet. This reparametrization of $(\sigma, \tau)$, which is always valid locally, holds globally when the string world-sheet is a sphere (i.e., in the case of a tree diagram). The metric $h^0_{\alpha\beta}$ in this case is the standard (spherical) metric on $S^2$. It can be further simplified by taking a stereographic projection of $S^2$ onto the x-y plane $\mathbb{R}^2$; we can thus write $h_{\alpha\beta} = e^{\phi}\delta_{\alpha\beta}$, i.e. $ds^2 = e^{\phi}(dx^2 + dy^2)$. The factor $e^{\phi}$ drops out from (11.2.25) due to conformal invariance, and the amplitude simplifies to:

$$A = \kappa^{n-2}\int DX(x, y)\,\exp\left\{-\frac{1}{2\pi}\int dx\,dy\;\partial_\alpha X^\mu\,\partial_\alpha X_\mu\right\}\prod_{i=1}^{n} V_{\Lambda_i}(k_i). \tag{i}$$

From Chapter 9 we know that the path integral on the RHS is the expectation value of $\prod_i V_{\Lambda_i}(k_i)$. Hence:

$$A = \kappa^{n-2}\left\langle\prod_{i=1}^{n} V_{\Lambda_i}(k_i)\right\rangle. \tag{ii}$$

It is important to note here that since (i) represents a free field theory on a flat two-dimensional world, the tools of the complex-coordinate formalism are available. In terms of the complex coordinate $z = x + iy$ the metric $ds^2 = e^{\phi}(dx^2 + dy^2)$ becomes:

$$ds^2 = e^{\phi}\,dz\,d\bar{z}. \tag{iii}$$

This can be further changed by use of an analytic function $w(z)$ to another metric:

$$ds^2 = e^{\phi}\left|\frac{\partial z}{\partial w}\right|^2 dw\,d\bar{w}, \tag{iv}$$

which is again in conformal gauge. Now these are the coordinate transformations that are permitted by the gauge choice $ds^2 = e^{\phi}(dx^2 + dy^2)$; hence it follows that infinitesimally the residual gauge invariances are the transformations $\delta z = \epsilon(z)$, where $\epsilon(z)$ is an analytic function subject to the restriction (see footnote 12):

$$\frac{\epsilon(z)}{z^2}\ \text{is finite as}\ z \to \infty. \tag{v}$$

We use the set of complex coordinates $(z_1, \ldots, z_n)$ for writing (ii) when the external particles are tachyons. The vertex operator in the case of a tachyon is:

$$V_0 = \int d^2z\,e^{ik\cdot X}. \tag{vi}$$

Hence (ii) can be written as:

$$A = \kappa^{n-2}\int\prod_{i=1}^{n} d^2z_i\,\left\langle\prod_{j=1}^{n}\exp\{ik_j\cdot X(z_j)\}\right\rangle. \tag{vii}$$

Here $\langle\ \rangle$ represents an expectation value with respect to the Gaussian measure defined by the free field path integral of (i).

2. To establish the required result, we use the expression in (vii) of Exc. 1. In order to evaluate it we recall the formula for Gaussian integrals (obtained by completing a square):

$$\left\langle\exp\left\{i\int d^2z\,J_\mu(z)X^\mu(z)\right\}\right\rangle = \exp\left\{\frac{1}{2}\int d^2z\,d^2z'\,J_\mu(z)\,G(z, z')\,J^\mu(z')\right\} \tag{i}$$

12 Since $S^2 = \mathbb{C}\cup\{\infty\}$ (see Sec. (1.1 and 1.3)), to show that the infinitesimal coordinate transformation $\delta z = \epsilon(z)$ does not have a pole at infinity, one has to use the new coordinate $\tilde{z} = 1/z$, so that the point at infinity in the z-coordinate is the origin $\tilde{z} = 0$ and $\delta z \to \delta\tilde{z} = -\delta z/z^2 = -\epsilon(z)/z^2$.

where $J_\mu(z)$ is an arbitrary source and $G(z, z')$ is the propagator of the free field $X^\mu$. In this case $J^\mu(z) = \sum_{i=1}^{n} k_i^\mu\,\delta^2(z - z_i)$, hence (vii) of Exc. 1 can be written as:

$$A = \kappa^{n-2}\int\prod_{i=1}^{n} d^2z_i\,\prod_{i<j}\exp\left\{\tfrac{1}{2}\,k_i\cdot k_j\,G(z_i, z_j)\right\} \tag{ii}$$

(due to normal ordering the term i = j is not included in the product). The propagator in (ii) is the Green function for the 2-dimensional Laplace equation, satisfying:

$$\Delta_z G(z, z') = 2\pi\,\delta^2(z - z'), \tag{iii}$$

$\Delta_z$ being the Laplacian with respect to the variable $z$. The solution to this equation is:

$$G(z, z') = -2\pi\int\frac{d^2q}{4\pi^2}\,\frac{e^{iq\cdot(z - z')}}{q^2} = \ln(\mu|z - z'|). \tag{iv}$$

The coefficient $\mu$ on the extreme RHS is due to an infrared cutoff introduced to avoid the divergence at $q = 0$ in (iv). We absorb this coefficient in the unknown coupling constant and obtain from (ii):

$$A = \kappa^{n-2}\int\prod_{i=1}^{n} d^2z_i\,\prod_{i<j}|z_i - z_j|^{k_i\cdot k_j/2}. \tag{v}$$
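The normalization in (iii) and (iv) can be checked numerically. The sketch below is an illustration, not part of the source: the helper names are hypothetical and the finite-difference step sizes are arbitrary choices. It verifies that $\ln(\mu|z|)$ is harmonic away from the origin and that the flux of its gradient through a circle around the origin equals $2\pi$, which is the integrated form of $\Delta G = 2\pi\delta^2$.

```python
import math


def green(x, y, mu=1.0):
    """Candidate propagator G = ln(mu * |z|) with the source placed at the origin."""
    return math.log(mu * math.hypot(x, y))


def laplacian(f, x, y, h=1.0e-3):
    """Five-point finite-difference Laplacian of f at the point (x, y)."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4.0 * f(x, y)) / (h * h)


def flux_through_circle(f, radius, n=2000, h=1.0e-5):
    """Integral of the outward radial derivative of f over a circle centred on the origin."""
    total = 0.0
    dphi = 2.0 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        c, s = math.cos(phi), math.sin(phi)
        radial = (f((radius + h) * c, (radius + h) * s)
                  - f((radius - h) * c, (radius - h) * s)) / (2.0 * h)
        total += radial * radius * dphi
    return total


if __name__ == "__main__":
    print(laplacian(green, 0.7, -0.4))      # close to zero away from the source
    print(flux_through_circle(green, 2.0))  # close to 2*pi
```

By the divergence theorem the flux equals the integral of the Laplacian over the enclosed disc, so a value of $2\pi$ independent of the radius is what the delta-function source in (iii) requires. The cutoff $\mu$ only shifts $G$ by a constant and does not affect either check.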

We note that (v) expresses the n-point amplitude not as a path integral but as an (infinite) integral over a finite number of variables $z_1, \ldots, z_n$, the points at which the external strings were attached to the world-sheet. The integral in (v) contains an integral over the infinite volume of the group SL(2, C). This is because the gauge fixing that we used while deriving (v) did not completely remove the reparametrization invariance but left a residual symmetry $\delta z = a + bz + cz^2$. Now the three complex parameters of SL(2, C) can be used to set $z_1 = 0$, $z_2 = 1$ and $z_3 = \infty$; as a result, in the limit $z_3 \to \infty$ each term $|z_3 - z_j|^{k_3\cdot k_j/2}$ can be dropped in (v), since these terms become independent of $z_j$ ($j \neq 3$). In view of momentum conservation, their product is also momentum independent; thus

$$\prod_{j\neq 3}|z_3|^{k_3\cdot k_j/2} = |z_3|^{m^2/2}, \tag{vi}$$

where $m^2$ is the mass squared of the ground state (see footnote 13). Since it is independent of the external momenta, we discard this factor and write the scattering amplitude as:

$$A = \kappa^{n-2}\int\prod_{i=4}^{n} d^2z_i\,\prod_{j=4}^{n}|z_j|^{k_1\cdot k_j/2}\,|1 - z_j|^{k_2\cdot k_j/2}\,\prod_{4\le i<j\le n}|z_i - z_j|^{k_i\cdot k_j/2}. \tag{vii}$$

When n = 4 this reduces to the four-point function introduced (for the first time) by Virasoro [35]:

$$A = \kappa^2\int d^2z_4\,|z_4|^{k_1\cdot k_4/2}\,|1 - z_4|^{k_2\cdot k_4/2}. \tag{viii}$$

Generalizations of this function are due to Shapiro Ref. [Ad].

13 The requirement of SL(2, C) invariance mandates that $m^2 = -8$.
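The gauge choice $z_1 = 0$, $z_2 = 1$, $z_3 = \infty$ used above can be made concrete with the standard fractional linear (Möbius) map built from a cross-ratio. The following sketch is illustrative only: the function name is hypothetical, and the map shown is the finite SL(2, C) transformation whose infinitesimal form is $\delta z = a + bz + cz^2$.

```python
def mobius_to_standard(z1, z2, z3):
    """Return the fractional linear map sending z1 -> 0, z2 -> 1, z3 -> infinity.

    The three points must be distinct; the map is the cross-ratio
    f(z) = (z - z1)(z2 - z3) / ((z - z3)(z2 - z1)).
    """
    def f(z):
        return (z - z1) * (z2 - z3) / ((z - z3) * (z2 - z1))
    return f


if __name__ == "__main__":
    f = mobius_to_standard(1 + 2j, -3j, 0.5 + 0j)
    print(f(1 + 2j))              # the first point lands on 0
    print(f(-3j))                 # the second point lands on 1
    print(abs(f(0.5 + 1.0e-9)))   # points near z3 are sent far out towards infinity
```

Because any three distinct insertion points can be moved to these standard positions, only the remaining $n - 3$ positions are genuine integration variables, which is why the products in (vii) start at index 4.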

3 BOSONIC STRINGS AND THEIR QUANTIZATION: A CLASSICAL APPROACH

Having studied the basics of the string action in Minkowski space, we are now ready to discuss it in a more general setting (e.g., when the background metric is semi-Riemannian of Minkowskian signature). Here again we shall use the point-particle action to write the action of a string. The action of a point particle of mass m in a background gravitational field given by a metric tensor $g_{\mu\nu}(x)$ can be written as:

$$S = -m\int ds \tag{11.3.1}$$

where ds is the invariant interval given by

$$ds^2 = -g_{\mu\nu}(x)\,dx^\mu\,dx^\nu. \tag{11.3.2}$$

If $\tau$ is an arbitrary parameter that labels the points along the particle world-line (trajectory $x^\mu(\tau)$), then (11.3.1) can be rewritten as:

$$S = -m\int d\tau\,\sqrt{-g_{\mu\nu}(x)\,\dot{x}^\mu\dot{x}^\nu}, \tag{11.3.3}$$

where a dot denotes differentiation with respect to $\tau$. This action is invariant under a reparametrization $\tau \to \tilde{\tau}(\tau)$ of the particle trajectory. The formula (11.3.3), however, cannot be applied to massless particles; moreover, the presence of a square root makes it unwieldy. To avoid these shortcomings an auxiliary coordinate $e(\tau)$ is introduced, and the action (11.3.3) is replaced by:

$$S = \frac{1}{2}\int\left(e^{-1}\dot{x}^2 - e\,m^2\right)d\tau, \tag{11.3.4}$$

with $\dot{x}^2 = g_{\mu\nu}(x)\,\dot{x}^\mu\dot{x}^\nu$. We note that the action (11.3.3) can be recovered from (11.3.4) if one substitutes the value of e from its equation of motion:

$$\dot{x}^2 + e^2 m^2 = 0. \tag{11.3.5}$$

It can be checked that (11.3.4) is invariant under the infinitesimal transformations:

$$\delta x = \xi\,\dot{x}, \qquad \delta e = \frac{d}{d\tau}(\xi e), \tag{11.3.6}$$

where $\xi(\tau)$ is an arbitrary infinitesimal parameter that depends on $\tau$. The above transformations describe the $\tau$ reparametrization symmetry of (11.3.4), and hence allow us to make a suitable gauge choice. The gauge choice here is $e = 1/m^2$, which makes the conjugate momentum:

$$p_\mu = m^2\,g_{\mu\nu}\,\dot{x}^\nu, \tag{11.3.7}$$

and the equation of motion is obtained using the familiar rules; we note, however, that (11.3.5) remains as the constraint equation. This constraint equation can be viewed as the mass-shell condition, generalized to propagation in a curved background.*

* In its simplest form the term 'mass-shell condition' means that a particle satisfies the equation $p^2 = -m^2$, hence the particle in question is physical (and not virtual). This condition is also referred to as 'on-shell'. In this context, if a particle does not satisfy $p^2 = -m^2$, it is called 'off-shell'.
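The statement that (11.3.3) is recovered from (11.3.4) by eliminating e can be checked with a few lines of arithmetic. The sketch below uses hypothetical function names and treats $\dot{x}^2$ as a single negative number (a timelike velocity, with the mostly-plus signature used in (11.3.2)). It compares the two Lagrangians on the solution of (11.3.5) and confirms that this value of e is a stationary point.

```python
import math


def einbein_lagrangian(xdot_sq, e, m):
    """Integrand of (11.3.4): (1/2) * (xdot^2 / e - e * m^2)."""
    return 0.5 * (xdot_sq / e - e * m * m)


def einbein_on_shell(xdot_sq, m):
    """Positive root of xdot^2 + e^2 m^2 = 0, valid for timelike xdot_sq < 0 and m > 0."""
    return math.sqrt(-xdot_sq) / m


def sqrt_lagrangian(xdot_sq, m):
    """Integrand of (11.3.3): -m * sqrt(-xdot^2)."""
    return -m * math.sqrt(-xdot_sq)


if __name__ == "__main__":
    xdot_sq, m = -2.5, 1.7
    e = einbein_on_shell(xdot_sq, m)
    print(einbein_lagrangian(xdot_sq, e, m), sqrt_lagrangian(xdot_sq, m))
```

Substituting $e = \sqrt{-\dot{x}^2}/m$ gives $\tfrac{1}{2}(-m\sqrt{-\dot{x}^2} - m\sqrt{-\dot{x}^2}) = -m\sqrt{-\dot{x}^2}$, which is what the code confirms numerically. For m = 0 the root is undefined, while (11.3.4) itself remains well defined, which is the reason for introducing e.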


Quantum-mechanical propagation of point particles is now described by path integrals of the form:

$$\int Dx\,De\;e^{iS(x, e)} \tag{11.3.8}$$

where the 'gauge symmetry' of (11.3.4) has still to be handled (appropriately). The generalization of the first term in (11.3.4) (which will eventually take us to the string action) to arbitrary (n + 1) dimensions can be written as:

$$S = -\frac{T}{2}\int d^{n+1}\sigma\,\sqrt{h}\,h^{\alpha\beta}(\sigma)\,g_{\mu\nu}(X)\,\partial_\alpha X^\mu\,\partial_\beta X^\nu. \tag{11.3.9}$$

Here $h^{\alpha\beta}(\sigma)$ is the inverse of the metric $h_{\alpha\beta}(\sigma)$ that describes the geometry of an (n + 1)-dimensional manifold. The coordinate $\sigma^0 = \tau$, and the spatial coordinates $\sigma^i$ describe an n-dimensional object. The functions $X^\mu(\sigma)$ give a map of the world-manifold (line, sheet, tube, ...) into the physical space-time, and $g_{\mu\nu}(X)$ describes the geometry of the D-dimensional space-time. It is mandatory that $D \ge n + 1$. Eq. (11.3.9) is independent of any particular choice of the coordinates $\sigma$, since $\sqrt{h}\,d^{n+1}\sigma$ is the invariant volume element and $h^{\alpha\beta}\,\partial_\alpha X^\mu\,\partial_\beta X^\nu$ is also invariant, the tensor indices being properly contracted.

As in the point-particle case, one is interested in knowing whether it is possible to choose a gauge in which $h_{\alpha\beta}$ (which corresponds to e) can be eliminated. Now there are (n + 1) independent reparametrization gauge invariances; using these invariances, the $(n + 1)(n + 2)/2$ components of $h_{\alpha\beta}$ can be reduced to $n(n + 1)/2$. Thus for n > 0, h cannot be eliminated by reparametrization of the world surface. However, there is another local symmetry which should be taken into account here; this symmetry is represented by the Weyl scaling of the metric:

$$h_{\alpha\beta} \to \Lambda(\sigma)\,h_{\alpha\beta} \tag{11.3.10}$$

under which

$$\sqrt{h}\,h^{\alpha\beta} \to \Lambda^{\frac{n+1}{2} - 1}\,\sqrt{h}\,h^{\alpha\beta}. \tag{11.3.11}$$

Since for a string n = 1, the correspondence (11.3.10) leaves S invariant. Accordingly, for a free Bosonic string the action is:

$$S = -\frac{T}{2}\int d^2\sigma\,\sqrt{h}\,h^{\alpha\beta}(\sigma)\,g_{\mu\nu}(X)\,\partial_\alpha X^\mu\,\partial_\beta X^\nu. \tag{11.3.12}$$

The parameter T in the above action has dimensions of (length)$^{-2}$ or (mass)$^2$ and can be identified as the string tension (see Remark (11.2.2)). We recall that it is related to the universal Regge slope parameter $\alpha'$ (for open strings) by:

$$T = (2\pi\alpha')^{-1}. \tag{11.3.13}$$

To study this string propagation action further, we shall replace the space-time manifold by the flat Minkowskian space in the next subsection.
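The scaling law (11.3.11), and the fact that only n = 1 gives Weyl invariance, can be checked on a simple example. The sketch below is an illustration under explicit assumptions: it uses a constant diagonal metric with arbitrary entries (one timelike direction), a constant scale factor, and hypothetical function names. It measures the power of $\Lambda$ picked up by $\sqrt{h}\,h^{\alpha\beta}$ in n + 1 dimensions.

```python
import math


def density(diag):
    """Diagonal entries of sqrt(|det h|) * h^{-1} for a diagonal metric h = diag(...)."""
    root_h = math.sqrt(abs(math.prod(diag)))
    return [root_h / d for d in diag]


def weyl_exponent(n, lam=2.0):
    """Power of lam acquired by sqrt(h) * h^{alpha beta} under h -> lam * h in n + 1 dimensions."""
    # Arbitrary illustrative metric: one timelike entry followed by n spacelike entries.
    diag = [-1.0] + [1.0 + 0.5 * i for i in range(1, n + 1)]
    before = density(diag)
    after = density([lam * d for d in diag])
    return math.log(after[0] / before[0]) / math.log(lam)


if __name__ == "__main__":
    for n in range(0, 4):
        print(n, weyl_exponent(n), (n + 1) / 2 - 1)
```

The measured exponent agrees with $(n + 1)/2 - 1$ and vanishes only for n = 1, that is, for a two-dimensional world-sheet. This is the special property of strings that allows the Weyl symmetry to be used, together with reparametrizations, to remove $h_{\alpha\beta}$.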

3.1 Symmetries of the Free String in Minkowski Space

The string action (11.3.12) in the flat Minkowskian space reduces to:

$$S = -\frac{T}{2}\int d^2\sigma\,\sqrt{h}\,h^{\alpha\beta}(\sigma)\,\partial_\alpha X^\mu\,\partial_\beta X_\mu. \tag{11.3.14}$$

The coordinate $\sigma$ is assumed to have the range $0 \le \sigma \le \pi$ just for the sake of convenience. It is worth mentioning here that one could add two additional terms $S_1$ and $S_2$ to this action without violating the compatibility with D-dimensional Poincare invariance and with power-counting renormalizability of the 2-dimensional theory. These terms are:

$$S_1 = \lambda\int d^2\sigma\,\sqrt{h}, \qquad S_2 = \frac{1}{2\pi}\int d^2\sigma\,\sqrt{h}\,R^{(2)}(h) \tag{11.3.15}$$

where $R^{(2)}(h)$ is the intrinsic 2-dimensional scalar curvature of the world-sheet formed from the metric $h_{\alpha\beta}$. The first of these, a 2-dimensional 'cosmological constant' term, does not have the Weyl symmetry of S and thus leads to inconsistent classical field equations. The second one, though significant in the case of interacting strings, contributes nothing here (unless the world-sheet has higher genus), since $\sqrt{h}\,R^{(2)}(h)$ is a total derivative in 2 dimensions. We therefore consider the symmetries of (11.3.14) only. Regardless of the choice of background, these are the reparametrization invariances:

$$\delta X^\mu = \xi^\alpha\,\partial_\alpha X^\mu, \qquad \delta h^{\alpha\beta} = \xi^\gamma\,\partial_\gamma h^{\alpha\beta} - \partial_\gamma\xi^\alpha\,h^{\gamma\beta} - \partial_\gamma\xi^\beta\,h^{\alpha\gamma}, \tag{11.3.16}$$

and the Weyl scaling

$$\delta h^{\alpha\beta} = \Lambda\,h^{\alpha\beta}. \tag{11.3.17}$$

In addition to these local symmetries, there are global symmetries resulting from the symmetry of the background in which the string is propagating. Since this is Minkowskian space, these symmetries are the Poincare invariances described by:

$$\delta X^\mu = a^\mu{}_\nu X^\nu + b^\mu \tag{11.3.18}$$

and

$$\delta h^{\alpha\beta} = 0 \tag{11.3.19}$$

where $a_{\mu\nu} = \eta_{\mu\rho}\,a^\rho{}_\nu$ ($\eta_{\mu\nu}$ being the Minkowski metric) is antisymmetric. We note that the variational entities $\xi^\alpha$ and $\Lambda$ in (11.3.16) and (11.3.17) are arbitrary (infinitesimal) functions of $\sigma^\alpha$, while $a_{\mu\nu}$ and $b^\mu$ in (11.3.18) are constants.

As we have already seen in Eq. (11.2.19), the variational derivative of S with respect to the 2-dimensional metric $h^{\alpha\beta}$ defines the energy-momentum tensor $T_{\alpha\beta}$ as:

$$T_{\alpha\beta} = -\frac{2}{T}\,\frac{1}{\sqrt{h}}\,\frac{\delta S}{\delta h^{\alpha\beta}}. \tag{11.3.20}$$

Simplifying the RHS we have:

$$T_{\alpha\beta} = \partial_\alpha X^\mu\,\partial_\beta X_\mu - \frac{1}{2}\,h_{\alpha\beta}\,h^{\alpha'\beta'}\,\partial_{\alpha'} X^\mu\,\partial_{\beta'} X_\mu. \tag{11.3.21}$$

As a result of the Weyl symmetry, $h^{\alpha\beta}\,T_{\alpha\beta} = 0$, showing that this tensor is traceless. Also the field equation $\delta S/\delta h^{\alpha\beta} = 0$ requires that $T_{\alpha\beta} = 0$. If we write $\partial_\alpha X^\mu\,\partial_\beta X_\mu = G_{\alpha\beta}$, then the vanishing of $T_{\alpha\beta}$ gives:

$$G_{\alpha\beta} = \frac{1}{2}\,h_{\alpha\beta}\,h^{\alpha'\beta'}\,G_{\alpha'\beta'}, \qquad G = |\det G_{\alpha\beta}| = \frac{1}{4}\,h\,\left(h^{\alpha'\beta'}\,G_{\alpha'\beta'}\right)^2. \tag{11.3.22}$$

Using these we can write (11.3.14) on the world-sheet $\Sigma$ as:

$$-\frac{T}{2}\int_\Sigma d^2\sigma\,\sqrt{h}\,h^{\alpha\beta}\,G_{\alpha\beta} = -T\left(\int_\Sigma d^2\sigma\,\sqrt{G}\right) \tag{11.3.23}$$

(the term within parentheses is the formula for 'the area of the world-sheet $\Sigma$' proposed by Nambu). We now choose a convenient gauge to simplify the action, and this is:

$$h_{\alpha\beta} = \eta_{\alpha\beta} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \tag{11.3.24}$$

(note that three local symmetries, two reparametrizations and one Weyl scaling, have been used to write (11.3.24)). With this choice, which is (rightly) referred to as classical covariant-gauge fixing, the action simplifies to

$$S = -\frac{T}{2}\int d^2\sigma\,\eta^{\alpha\beta}\,\partial_\alpha X^\mu\,\partial_\beta X_\mu. \tag{11.3.25}$$

The Euler-Lagrange equation obtained from (11.3.25) is the (familiar) 2-dimensional wave equation:

$$\Box X^\mu = \left(\frac{\partial^2}{\partial\sigma^2} - \frac{\partial^2}{\partial\tau^2}\right)X^\mu = 0. \tag{11.3.26}$$

If one considers a general variation:

$$X^\mu \to X^\mu + \delta X^\mu \tag{11.3.27}$$

then one notices that for closed strings (11.3.26) together with the periodicity of X is necessary and sufficient to ensure that S be stationary under this variation, whereas for open strings (11.3.26) is necessary but not sufficient. Indeed, the variation of S under (11.3.27) contains a volume term proportional to (11.3.26) as well as a surface term:

$$-T\int d\tau\,\left[X'_\mu\,\delta X^\mu\big|_{\sigma=\pi} - X'_\mu\,\delta X^\mu\big|_{\sigma=0}\right]. \tag{11.3.28}$$

The vanishing of this surface term provides the open-string boundary conditions.

3.2 Solution of the Wave Equation in Reference to Strings

Since the above wave equation is on (1 + 1)-dimensional space-time, light-cone coordinates and 2-dimensional conformal theory provide good tools for the study of string dynamics and quantization. We recall that the general solution to the massless wave equation in two dimensions can be written as a sum of two arbitrary functions (see App. 9B.1):

$$X^\mu(\sigma) = X^\mu_R(\sigma^-) + X^\mu_L(\sigma^+) \tag{11.3.29}$$

where

$$\sigma^- = \tau - \sigma, \qquad \sigma^+ = \tau + \sigma \tag{11.3.30}$$

are the light-cone coordinates (see footnote 14 and Sec. (1.4)), and $X^\mu_R$ ($X^\mu_L$) describe the 'right-moving' ('left-moving') modes of the string. The derivatives conjugate to $\sigma^\pm$ are defined as:

$$\partial_\pm = \frac{1}{2}(\partial_\tau \pm \partial_\sigma). \tag{11.3.31}$$

14 In Sec. (1.4) these are denoted as $z$ and $\bar{z}$, thus $+ \leftrightarrow z$, $- \leftrightarrow \bar{z}$.

As a result the Minkowski world-sheet metric tensor in light-cone coordinates becomes

$$\eta_{+-} = \eta_{-+} = -\frac{1}{2}, \qquad \eta_{++} = \eta_{--} = 0 \tag{11.3.32}$$

along with its inverse $\eta^{+-} = \eta^{-+} = -2$. The world-sheet indices are therefore raised and lowered by the rule

$$V^+ = -2V_-, \qquad V^- = -2V_+. \tag{11.3.33}$$

To obtain a meaningful solution of the wave Eq. (11.3.26) we have to consider the constraint equations $T_{\alpha\beta} = 0$. If a dot denotes the $\tau$ derivative and a prime stands for the $\sigma$ derivative, the constraint equations take the form:

$$T_{10} = T_{01} = \dot{X}\cdot X' = 0, \qquad T_{00} = T_{11} = \frac{1}{2}\left(\dot{X}^2 + X'^2\right) = 0. \tag{11.3.34}$$

In terms of light-cone coordinates, they can be expressed as:

$$T_{++} = \frac{1}{2}(T_{00} + T_{01}) = \partial_+ X\cdot\partial_+ X, \qquad T_{--} = \frac{1}{2}(T_{00} - T_{01}) = \partial_- X\cdot\partial_- X. \tag{11.3.35}$$

The tracelessness of the energy-momentum tensor, $h^{\alpha\beta}\,T_{\alpha\beta} = 0$, thus becomes the statement $T_{+-} = T_{-+} = 0$, which is equivalent to the assertion $T_{00} = T_{11}$ that we made in (11.3.34). All these facts put together imply that the constraint equations $T_{++} = T_{--} = 0$ become the statement:

$$\dot{X}_R^2 = \dot{X}_L^2 = 0. \tag{11.3.36}$$

In 2-dimensional quantum field theory, the law of energy-momentum conservation takes the form:

$$\partial_- T_{++} + \partial_+ T_{-+} = 0 \tag{11.3.37}$$

with a similar equation for $- \leftrightarrow +$. In the conformally invariant case, since $T_{+-} = 0$, the above conservation equation reduces to:

$$\partial_- T_{++} = 0, \qquad (\partial_+ T_{--} = 0). \tag{11.3.38}$$

As already mentioned in Sec. (1.4) and Sec. (6.3), this leads to the existence of an infinite set of conserved quantities. For, if $f(\sigma^+)$ is any function of
(11.3.39)

we note that (in view of (11.3.16)) Eq. (11.3.39) preserves the gauge choice. Further, if we write $\xi^\pm = \xi^0 \pm \xi^1$, then it follows that $\xi^+$ and $\xi^-$ may be arbitrary functions of $\sigma^+$ and $\sigma^-$ respectively. Thus, if the world-sheet reparametrization $\delta\sigma^\alpha = \xi^\alpha$ is considered as being generated by the operator $U = \xi^\alpha\,\partial/\partial\sigma^\alpha$, then the generators of the residual symmetries would be:

$$U^+ = \xi^+(\sigma^+)\,\frac{\partial}{\partial\sigma^+}, \qquad U^- = \xi^-(\sigma^-)\,\frac{\partial}{\partial\sigma^-}. \tag{11.3.40}$$

It is easy to note that the conserved charges mentioned above, with $f \sim \xi^+$, generate (11.3.40). We emphasize that the operators in (11.3.40) are the generators of the group of conformal transformations of two-dimensional Minkowskian space; this group, as we already know, is infinite dimensional (see Sec. (1.4)). We now consider the boundary conditions on the solutions (11.3.29). These boundary conditions, as can be expected (see Fig. (11.11)), are different for closed and open strings.

Fig. 11.11 For a 1-dimensional compact manifold there are two different topologies, corresponding to the closed and open strings of (i) and (ii).

Closed strings are loops (with no free ends), topologically equivalent to circles as shown in Fig. (11.11)(i); for them the boundary condition is just the periodicity of the coordinates:

$$X^\mu(\sigma, \tau) = X^\mu(\sigma + \pi, \tau). \tag{11.3.41}$$

Hence the two components of the general solution (11.3.29) can be written as:

$$X^\mu_R = \frac{1}{2}x^\mu + \frac{1}{2}l^2 p^\mu(\tau - \sigma) + \frac{i}{2}\,l\sum_{n\neq 0}\frac{1}{n}\,\alpha^\mu_n\,e^{-2in(\tau - \sigma)} \tag{11.3.42}$$

$$X^\mu_L = \frac{1}{2}x^\mu + \frac{1}{2}l^2 p^\mu(\tau + \sigma) + \frac{i}{2}\,l\sum_{n\neq 0}\frac{1}{n}\,\tilde{\alpha}^\mu_n\,e^{-2in(\tau + \sigma)}. \tag{11.3.43}$$

The coefficients $\alpha^\mu_n$ and $\tilde{\alpha}^\mu_n$ are Fourier components, which are to be interpreted as oscillator coordinates, and $x^\mu$ and $p^\mu$ are to be interpreted as the centre-of-mass position and the momentum of the string. The constant $l$ stands for a length, which is related to the Regge slope $\alpha'$ and the tension T (in units $\hbar = c = 1$) by:

$$l = \sqrt{2\alpha'} = 1/\sqrt{\pi T}. \tag{11.3.44}$$

We note that the terms linear in $\sigma$ cancel in the sum $X^\mu_R + X^\mu_L$, showing that the closed-string boundary conditions are indeed obeyed. In order that $X^\mu_R$ and $X^\mu_L$ be real functions, it is required that $x^\mu$ and $p^\mu$ are real and that $\alpha^\mu_{-n}$ ($\tilde{\alpha}^\mu_{-n}$) is the adjoint of $\alpha^\mu_n$ ($\tilde{\alpha}^\mu_n$).

It can be checked that the Poisson brackets of $\alpha^\mu_m$, $\tilde{\alpha}^\mu_n$, and those of $p^\mu$, $x^\nu$, satisfy the following relations (see footnote 15):

$$[\alpha^\mu_m, \alpha^\nu_n]_{P.B.} = [\tilde{\alpha}^\mu_m, \tilde{\alpha}^\nu_n]_{P.B.} = i\,m\,\delta_{m+n}\,\eta^{\mu\nu} \tag{11.3.45}$$

$$[\alpha^\mu_m, \tilde{\alpha}^\nu_n]_{P.B.} = 0 \tag{11.3.46}$$

$$[p^\mu, x^\nu]_{P.B.} = \eta^{\mu\nu}. \tag{11.3.47}$$

The expressions in (11.3.45) confirm the comment made above that the Fourier modes $\alpha^\mu_n$ for $n \neq 0$ are harmonic oscillator coordinates, as is the case in other free field theories. Similarly, Eq. (11.3.47) shows (as was expected) that the centre-of-mass position and the momentum of the string are canonically conjugate variables.

In the case of the open string, since it is required that the boundary terms (11.3.28) vanish in the variation of the action, the boundary conditions are:

$$X'^\mu = 0 \quad \text{for } \sigma = 0 \text{ and } \sigma = \pi. \tag{11.3.48}$$

This means that the normal derivative of $X^\mu$ must vanish at the string boundary. These are 'free boundary conditions'; they prevent momentum from flowing off the ends of the string. The general solution of (11.3.26) with these boundary conditions is given by:

$$X^\mu(\sigma, \tau) = x^\mu + l^2 p^\mu\tau + i\,l\sum_{n\neq 0}\frac{1}{n}\,\alpha^\mu_n\,e^{-in\tau}\cos n\sigma. \tag{11.3.49}$$

As a result of the boundary conditions (11.3.48), the left- and right-moving components combine into standing waves. In particular, setting $\alpha^\mu_0 = l\,p^\mu$ we have:

$$2\,\partial_\pm X^\mu = \dot{X}^\mu \pm X'^\mu = l\sum_{n=-\infty}^{\infty}\alpha^\mu_n\,e^{-in(\tau \pm \sigma)}. \tag{11.3.50}$$
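The properties claimed for the open-string solution (11.3.49) can be checked numerically on a truncated mode sum. The sketch below makes several simplifying assumptions that are not in the text: a single space-time component, $l = 1$, three arbitrary oscillator amplitudes, the reality condition $\alpha_{-n} = \alpha_n^*$, and hypothetical function names. It tests the free boundary conditions (11.3.48) and the wave equation (11.3.26) by finite differences.

```python
import cmath
import math


def open_string_x(sigma, tau, x0, p, alphas, ell=1.0):
    """One component of (11.3.49), truncated to the modes listed in `alphas`.

    `alphas` maps n >= 1 to alpha_n; alpha_{-n} is taken to be its complex
    conjugate, so the terms n and -n add up to twice the real part.
    """
    total = x0 + ell * ell * p * tau
    for n, a in alphas.items():
        term = (1j * ell / n) * a * cmath.exp(-1j * n * tau) * math.cos(n * sigma)
        total += 2.0 * term.real
    return total


def d_sigma(f, sigma, tau, h=1.0e-5):
    """Central-difference sigma derivative X'."""
    return (f(sigma + h, tau) - f(sigma - h, tau)) / (2.0 * h)


def wave_operator(f, sigma, tau, h=1.0e-3):
    """Finite-difference value of (d^2/dsigma^2 - d^2/dtau^2) f."""
    f_ss = (f(sigma + h, tau) - 2.0 * f(sigma, tau) + f(sigma - h, tau)) / (h * h)
    f_tt = (f(sigma, tau + h) - 2.0 * f(sigma, tau) + f(sigma, tau - h)) / (h * h)
    return f_ss - f_tt


if __name__ == "__main__":
    alphas = {1: 0.3 + 0.2j, 2: -0.1 + 0.4j, 3: 0.05 - 0.25j}
    x = lambda s, t: open_string_x(s, t, 0.5, 1.2, alphas)
    print(d_sigma(x, 0.0, 0.37), d_sigma(x, math.pi, 0.37))  # both close to zero
    print(wave_operator(x, 1.1, 0.37))                        # close to zero
```

The boundary derivative vanishes because every mode depends on $\sigma$ only through $\cos n\sigma$, whose derivative is zero at $\sigma = 0$ and $\sigma = \pi$; this is the standing-wave structure mentioned above.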

Similarly for closed strings we have:

$$\partial_- X^\mu_R = \dot{X}^\mu_R = l\sum_{n=-\infty}^{\infty}\alpha^\mu_n\,e^{-2in(\tau - \sigma)}, \qquad \partial_+ X^\mu_L = \dot{X}^\mu_L = l\sum_{n=-\infty}^{\infty}\tilde{\alpha}^\mu_n\,e^{-2in(\tau + \sigma)}. \tag{11.3.51}$$

In the following remark we list the differences between open and closed strings.

Remark 11.3.1 From Eq. (11.3.51) for the closed string it is evident that the right- and left-moving modes are independent of each other. We also note that $\alpha^\mu_0 = \tilde{\alpha}^\mu_0 = \frac{1}{2}\,l\,p^\mu$ here, as opposed to $\alpha^\mu_0 = l\,p^\mu$ in the open-string case, and that the exponents have an additional factor of two.

15 Eqs. (11.3.45) and (11.3.46) follow from the P.B. of $X^\mu$ and $\dot{X}^\mu$ at equal $\tau$: $[\dot{X}^\mu(\sigma), \dot{X}^\nu(\sigma')]_{P.B.} = [X^\mu(\sigma), X^\nu(\sigma')]_{P.B.} = 0$; $[\dot{X}^\mu(\sigma), X^\nu(\sigma')]_{P.B.} = T^{-1}\,\delta(\sigma - \sigma')\,\eta^{\mu\nu}$. When Poisson brackets are replaced by commutators, i disappears. Note that we are using $\delta_{m+n}$ in place of $\delta_{m+n,0}$ here. (In Sections (7.1) and (9.2) the Poisson bracket is denoted as { , }.)

3.3 Conserved Currents, Linear and Angular Momentum

Returning to the global symmetry resulting from the Poincare transformations (11.3.18), we note that from the point of view of the 2-dimensional theory, the 'Noether method' (see Sec. (6.3)) can be used for constructing the conserved current $J_\alpha$ associated with the global symmetry transformation

$$\phi(\sigma) \to \phi(\sigma) + \epsilon\,\delta\phi(\sigma) \tag{11.3.52}$$

where $\phi(\sigma)$ is any field and $\epsilon$ is an infinitesimal parameter. To construct the current, one lets the parameter depend on the world-sheet point, replacing $\epsilon$ by $\epsilon(\sigma)$. The action is then not in general invariant, but since S is invariant for a constant $\epsilon$, the variation of S is proportional to the derivative of $\epsilon$, and so $\delta S$ is of the general form

$$\delta S = \int d^2\sigma\,J^\alpha\,\partial_\alpha\epsilon \tag{11.3.53}$$

for some current $J^\alpha$. From Sec. (6.3) we know that the current defined in this manner is always conserved, provided the equations of motion are obeyed; in fact, in that case the action is stationary under any variation, and in particular under a variation of the type (11.3.52). This means that when the equations of motion are obeyed, $\delta S$ in (11.3.53) is zero for any $\epsilon(\sigma)$, and this is possible only if $\partial_\alpha J^\alpha = 0$. Using this method, the conserved currents associated with the Poincare transformations of $X^\mu$ are:

$$P^\mu_\alpha = T\,\partial_\alpha X^\mu; \qquad J^{\mu\nu}_\alpha = T\left(X^\mu\,\partial_\alpha X^\nu - X^\nu\,\partial_\alpha X^\mu\right) \tag{11.3.54}$$

where the current $P^\mu_\alpha$ corresponds to translation invariance and $J^{\mu\nu}_\alpha$ to Lorentz invariance. Conservation of these currents implies that:

$$\partial^\alpha P^\mu_\alpha = 0 = \partial^\alpha J^{\mu\nu}_\alpha. \tag{11.3.55}$$

Now these currents describe the density of D-dimensional momentum and angular momentum on the two-dimensional world-sheet. For instance, the amount of momentum flowing across an arbitrary line segment $(d\sigma, d\tau)$ of the world-sheet is given by:

$$dP^\mu = P^\mu_\tau\,d\sigma + P^\mu_\sigma\,d\tau. \tag{11.3.56}$$

In view of the boundary conditions (11.3.48) of an open string, (11.3.56) implies that there is no momentum flowing out of the ends of the string. The same is true in the case of $J^{\mu\nu}_\alpha$, the current of angular momentum. When we integrate the currents given in (11.3.56) and (11.3.54) over $\sigma$ at $\tau = 0$, we obtain the total (conserved) momentum and angular momentum of a string. Thus the total momentum of a closed string is:

$$P^\mu = T\int_0^\pi d\sigma\,\frac{dX^\mu(\sigma)}{d\tau} = \pi T\left(l\,\alpha^\mu_0 + l\,\tilde{\alpha}^\mu_0\right) = p^\mu \tag{11.3.57}$$

which shows that the total momentum of a string is the same as the 'momentum' $p^\mu$ of the zero mode (see (11.3.44) and Rem. (11.3.1)). The same result holds good in the case of the open string as well. Similarly, the total angular momentum is given by:

$$J^{\mu\nu} = T\int_0^\pi d\sigma\,\left(X^\mu\,\frac{dX^\nu}{d\tau} - X^\nu\,\frac{dX^\mu}{d\tau}\right). \tag{11.3.58}$$

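Using the mode expansions we obtain $J^{\mu\nu}$, respectively, for open and closed strings:

$$J^{\mu\nu} = l^{\mu\nu} + E^{\mu\nu} \tag{11.3.59}$$

$$J^{\mu\nu} = l^{\mu\nu} + E^{\mu\nu} + \tilde{E}^{\mu\nu} \tag{11.3.60}$$

where

$$l^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu \tag{11.3.61}$$

and

$$E^{\mu\nu} = -i\sum_{n=1}^{\infty}\frac{1}{n}\left(\alpha^\mu_{-n}\,\alpha^\nu_n - \alpha^\nu_{-n}\,\alpha^\mu_n\right), \tag{11.3.62}$$

with a similar expression for $\tilde{E}^{\mu\nu}$ in terms of $\tilde{\alpha}^\mu_n$, etc.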

3.4

The Hamiltonian for Strings; Mode Expansions of Taf} = 0

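We recall the definition of the Hamiltonian from Chapter 9 for a 2-dimensional theory to write:

$$H = \int_0^\pi d\sigma\,(\dot X\cdot P_\tau - L) = \frac{T}{2}\int_0^\pi (\dot X^2 + X'^2)\,d\sigma \tag{11.3.63}$$

Using (11.3.49), we thus have:

$$H = \frac{1}{2}\sum_{n=-\infty}^{\infty}\alpha_{-n}\cdot\alpha_n, \qquad H = \frac{1}{2}\sum_{n=-\infty}^{\infty}\left(\alpha_{-n}\cdot\alpha_n + \tilde\alpha_{-n}\cdot\tilde\alpha_n\right) \tag{11.3.64}$$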

for open and closed strings, respectively. The Hamiltonian generates the $\tau$-evolution of the string, and since $\tau$ is chosen to be dimensionless, the Hamiltonian is dimensionless. To write the mode expansions of the constraints $T_{\alpha\beta} = 0$, we note that these equations lead simply to $\dot X_R^2 = \dot X_L^2 = 0$ in the case of closed strings. Accordingly, using the mode expansions of (11.3.51) and (11.3.52), these have Fourier components (evaluated at $\tau = 0$):

$$L_m = \frac{T}{2}\int_0^\pi e^{-2im\sigma}\,T_{--}\,d\sigma = \frac{T}{2}\int_0^\pi e^{-2im\sigma}\,\dot X_R^2\,d\sigma = \frac{1}{2}\sum_{n=-\infty}^{\infty}\alpha_{m-n}\cdot\alpha_n \tag{11.3.65}$$

$$\tilde L_m = \frac{T}{2}\int_0^\pi e^{2im\sigma}\,T_{++}\,d\sigma = \frac{T}{2}\int_0^\pi e^{2im\sigma}\,\dot X_L^2\,d\sigma = \frac{1}{2}\sum_{n=-\infty}^{\infty}\tilde\alpha_{m-n}\cdot\tilde\alpha_n \tag{11.3.66}$$

In the case of open strings, the functions $e^{im\sigma}$ are not orthogonal on the interval $0 \le \sigma \le \pi$. To overcome this problem we extend the definition of $X_R$ and $X_L$ in (11.3.29) beyond this interval by assuming:

$$X_R(\sigma + \pi) = X_L(\sigma), \qquad X_L(\sigma + \pi) = X_R(\sigma) \tag{11.3.67}$$


As a result, the open-string boundary conditions imply that $X_R$ as well as $X_L$ is a periodic function of $\sigma$ with period $2\pi$. The constraint equations now reduce to the vanishing of $T_{++}$ for $-\pi \le \sigma \le \pi$; this in turn implies the vanishing of the Fourier components:

$$L_m = T\int_0^\pi\left(e^{im\sigma}\,T_{++} + e^{-im\sigma}\,T_{--}\right)d\sigma = \frac{T}{4}\int_{-\pi}^{\pi} e^{im\sigma}\,(\dot X + X')^2\,d\sigma = \frac{1}{2}\sum_{n=-\infty}^{\infty}\alpha_{m-n}\cdot\alpha_n \tag{11.3.68}$$

Putting $m = 0$ in (11.3.65), (11.3.66) and (11.3.68) we find that the Hamiltonians defined in (11.3.64) give:

$$H = L_0 + \tilde L_0, \qquad H = L_0 \tag{11.3.69}$$

for closed and open strings. In the case of closed strings, the constraint equations also imply the vanishing of $L_0 - \tilde L_0$, a combination that does not contain the momentum $p^\mu$. The following remark sums up important consequences of the above discussion.

Remark 11.3.2 For a given state of oscillation the string mass $M$ satisfies $M^2 = -p_\mu p^\mu$; the constraint equation $L_0 = 0$ determines $M^2$ in terms of the internal modes of oscillation of the string:

$$M^2 = \frac{1}{\alpha'}\sum_{n=1}^{\infty}\alpha_{-n}\cdot\alpha_n \quad\text{(open strings)}, \qquad M^2 = \frac{2}{\alpha'}\sum_{n=1}^{\infty}\left(\alpha_{-n}\cdot\alpha_n + \tilde\alpha_{-n}\cdot\tilde\alpha_n\right) \quad\text{(closed strings)} \tag{11.3.70}$$

Eq. (11.3.70) gives the mass-shell conditions for open and closed strings, respectively. The mass-shell condition is the relativistic analog of the equation that expresses the energy of a nonrelativistic violin string in terms of its oscillator coordinates. We also note that the equality $L_0 = \tilde L_0$ (for the closed string) implies that the two terms in (11.3.64) or (11.3.70) give equal contributions.

Definition 11.3.3 The Fourier modes (components) $L_m$ and $\tilde L_m$ of the energy-momentum tensor are called the Virasoro operators.

The Poisson brackets of these operators can be calculated using the Poisson brackets of the individual oscillators; thus we have:

$$[L_m, L_n]_{P.B.} = \frac{1}{4}\sum_{k,l}\left[\alpha_{m-k}\cdot\alpha_k,\ \alpha_{n-l}\cdot\alpha_l\right]_{P.B.}$$

$$= \frac{i}{4}\sum_{k,l}\Big(k\,\alpha_{m-k}\cdot\alpha_l\,\delta_{k+n-l} + k\,\alpha_{m-k}\cdot\alpha_{n-l}\,\delta_{k+l} + (m-k)\,\alpha_k\cdot\alpha_l\,\delta_{m-k+n-l} + (m-k)\,\alpha_k\cdot\alpha_{n-l}\,\delta_{m-k+l}\Big) \tag{11.3.71}$$

where as usual $\delta_n = 1$ when $n = 0$ and is $0$ otherwise. The above equation simplifies to:

$$[L_m, L_n]_{P.B.} = \frac{i}{2}\sum_k k\,\alpha_{m-k}\cdot\alpha_{k+n} + \frac{i}{2}\sum_k (m-k)\,\alpha_{m-k+n}\cdot\alpha_k \tag{11.3.72}$$

and when the summation variable in the first sum is shifted so that $k + n$ is replaced by $k$, it gives:

$$[L_m, L_n]_{P.B.} = i\,(m-n)\,L_{m+n} \tag{11.3.73}$$

This is the Virasoro algebra, which is of great importance in the study of string theory and in particular in the quantum anomalies of the theory. We remind the reader that we had obtained this relation in Chapter 5, Subsec. (2.4), by considering an infinitesimal general coordinate transformation of the circle, $\theta \to \theta + a(\theta)$. The operators that we defined there were:

$$D_n = i\,e^{in\theta}\,\frac{d}{d\theta}$$

which can be easily seen to satisfy (11.3.73).
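The claim about the circle generators can be checked mechanically. The following is a minimal sketch, not part of the original text: it represents $D_n = i e^{in\theta}\,d/d\theta$ by its action on Fourier modes $e_k = e^{ik\theta}$, namely $D_n e_k = -k\,e_{k+n}$, and tests the commutator form $[D_m, D_n] = (m-n)D_{m+n}$ of the algebra. The helper names (`apply_D`, `witt_holds`) and the dictionary representation are illustrative assumptions.

```python
# Functions on the circle are stored as {mode index k: coefficient of e^{i k theta}}.
# D_n = i e^{i n theta} d/dtheta acts on a mode as D_n e_k = i * (i k) e_{k+n} = -k e_{k+n}.

def apply_D(n, f):
    """Apply D_n to f = sum_k f[k] e_k and return the result in the same format."""
    out = {}
    for k, c in f.items():
        if k != 0:  # the constant mode is annihilated by d/dtheta
            out[k + n] = out.get(k + n, 0) - k * c
    return out

def scaled(c, f):
    """Multiply f by the number c, dropping zero coefficients."""
    return {k: c * v for k, v in f.items() if c * v != 0}

def commutator(m, n, f):
    """Compute (D_m D_n - D_n D_m) f, dropping zero coefficients."""
    a = apply_D(m, apply_D(n, f))
    b = apply_D(n, apply_D(m, f))
    diff = {k: a.get(k, 0) - b.get(k, 0) for k in set(a) | set(b)}
    return {k: v for k, v in diff.items() if v != 0}

def witt_holds(m, n, f):
    """Check [D_m, D_n] f == (m - n) D_{m+n} f for the test function f."""
    return commutator(m, n, f) == scaled(m - n, apply_D(m + n, f))

if __name__ == "__main__":
    test_function = {k: k + 2 for k in range(-3, 4)}  # arbitrary integer coefficients
    ok = all(witt_holds(m, n, test_function)
             for m in range(-3, 4) for n in range(-3, 4))
    print("algebra satisfied on test function:", ok)
```

Integer coefficients are used so that the comparison is exact; the check covers only the finite range of $m$, $n$ and modes that is sampled.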

3.5 The Quantization in String Theory

Having seen some of the classical concepts of string theory, our next task is to learn the two most popular methods used in quantizing the bosonic string theory. Both methods have one thing in common: the 'covariant approach.' The first of these approaches uses the $X^\mu$ coordinates for its description. The restrictions in this approach are on the physical Fock space, corresponding to the Virasoro constraint conditions; these restrictions are similar to the Gupta-Bleuler conditions in electrodynamics (see Ftn. 16). The second one, which is usually referred to as the modern covariant quantization approach, involves the introduction of Faddeev-Popov ghosts and the identification of BRST symmetries and currents. As can be expected, when compared to the first approach the second has a deeper geometric basis.

As in the classical case, we shall use the reparametrization invariances and the Weyl scaling symmetries to set the world-sheet metric $h_{\alpha\beta}$ to $\eta_{\alpha\beta}$. In the quantum theory, however, this leads to anomalies; for instance, in general there would be an anomaly in the trace of $T_{\alpha\beta}$. The anomaly, as we shall see later, cancels under special choices of, e.g., the dimension of space-time and the mass of the ground state. We illustrate the first approach by using the light-cone quantization procedure. Here one starts by setting $h_{\alpha\beta} = \eta_{\alpha\beta}$ and then imposes additional gauge restrictions. As we already know, in this gauge (i.e., $h_{\alpha\beta} = \eta_{\alpha\beta}$) the classical string dynamics is described by the action:

$$S = -\frac{T}{2}\int d^2\sigma\,\partial_\alpha X\cdot\partial^\alpha X \tag{11.3.74}$$

along with the supplementary conditions:

$$(\dot X \pm X')^2 = 0 \tag{11.3.75}$$

corresponding to $T_{++} = T_{--} = 0$, and the boundary conditions that go with open and closed strings. In order to pass from classical to quantum, we replace the Poisson brackets by commutators:

$$[\ ,\ ]_{P.B.} \to -i\,[\ ,\ ] \tag{11.3.76}$$

The coordinate $X^\mu$ can now be interpreted as a quantum operator, and the algebraic relations of the classical theory can be replaced by canonical commutation relations at equal $\tau$ (see Ftn. 15). Thus,

$$[P^\mu_\tau(\sigma, \tau),\ X^\nu(\sigma', \tau)] = -i\,\delta(\sigma - \sigma')\,\eta^{\mu\nu} \tag{11.3.77}$$

$$[X^\mu(\sigma, \tau),\ X^\nu(\sigma', \tau)] = [P^\mu_\tau(\sigma, \tau),\ P^\nu_\tau(\sigma', \tau)] = 0 \tag{11.3.78}$$

where $P^\mu_\tau = T\,\partial_\tau X^\mu$ (the momentum conjugate to $X^\mu$) is the $\tau$ component of the momentum current given in (11.3.54). Similarly, (11.3.45), (11.3.46) and (11.3.47) are replaced by equal-time commutators:

$$[\alpha^\mu_m, \alpha^\nu_n] = m\,\delta_{m+n}\,\eta^{\mu\nu} = [\tilde\alpha^\mu_m, \tilde\alpha^\nu_n] \tag{11.3.79}$$

$$[\alpha^\mu_m, \tilde\alpha^\nu_n] = 0 \tag{11.3.80}$$

$$[x^\mu, p^\nu] = i\,\eta^{\mu\nu} \tag{11.3.81}$$

Ftn. 16: Here the classical constraint equation $\partial_\mu A^\mu = 0$ in electrodynamics, known as the Gupta-Bleuler condition, is replaced by the requirement that the positive frequency components of the corresponding quantum operator annihilate the physical photon states.

The harmonic oscillator $\alpha_m$ is interpreted here as a raising or lowering operator for negative or positive $m$, respectively. The oscillator ground state $|0\rangle$ is defined to be annihilated by the $\alpha_m$ for $m > 0$. To determine a state of the string completely, one needs to know not only that the oscillators are in their ground state but also the centre-of-mass momentum $p^\mu$. Thus a state annihilated by the $\alpha_m$ with $m > 0$ and having centre-of-mass momentum $p^\mu$ is denoted $|0; p^\mu\rangle$.

3.6 The Fock Space and Virasoro Operators

The operators $\alpha^\mu_m$ are related to normalized harmonic oscillators $a^\mu_m$ by the rule:

$$\alpha^\mu_m = \sqrt{m}\,a^\mu_m, \qquad m > 0 \tag{11.3.82}$$

$$\alpha^\mu_{-m} = \sqrt{m}\,a^{\mu\dagger}_m, \qquad m > 0 \tag{11.3.83}$$

Now the Fock space formed by applying the raising operators $a^{\mu\dagger}_m$ to the ground state $|0\rangle$ is not positive definite because of the commutation relation $[a^0_m, a^{0\dagger}_m] = -1$ for the time component, which means that the state $a^{0\dagger}_m|0\rangle$ has negative norm for every $m$:

$$\langle 0|\,a^0_m\,a^{0\dagger}_m\,|0\rangle = -1 \tag{11.3.84}$$

The physical space of strings, in view of (11.3.82-83), is a subspace of the complete Fock space specified by some 'subsidiary conditions.' For instance, in order to have a viable causal theory, this space is assumed to be without negative-norm states (see Ftn. 17). In the case of the classical theory, as we already know, these subsidiary conditions corresponded to the vanishing of $T_{++}$ and $T_{--}$, whose Fourier modes gave in turn the Virasoro generators

$$L_m = \frac{1}{2}\sum_{n=-\infty}^{\infty}\alpha_{m-n}\cdot\alpha_n \tag{11.3.85}$$

and in addition $\tilde L_m$ for closed strings. In the quantum theory, the $\alpha_m$ are operators and so the ordering ambiguity of operators has to be taken into account. Since $\alpha_{m-n}$ commutes with $\alpha_n$ unless $m = 0$, the only ambiguity that needs to be addressed is in the expression for $L_0$. In order to resolve this ambiguity, we simply define (to begin with) $L_0$ to be given by the normal-ordered expression:

$$L_0 = \frac{1}{2}\alpha_0^2 + \sum_{n=1}^{\infty}\alpha_{-n}\cdot\alpha_n \tag{11.3.86}$$

Ftn. 17: These negative-norm states are called 'ghosts.' These ghosts, however, are different from the BRST ghosts.

Remark 11.3.3 Since the expression on the RHS could also have a constant, it is customary to add a constant, say $a$, to all formulas containing $L_0$. This constant can always be determined with the help of other physical conditions. Furthermore, just as in the classical theory one imposed the condition that '$L_0$ must vanish for the allowed motions of the string,' here one requires that '$L_0$ should annihilate physical states.' Hence a physical state $|\phi\rangle$ should satisfy:

$$(L_0 - a)\,|\phi\rangle = 0 \tag{11.3.87}$$

where $a$ is a constant that we shall determine later. As in the classical theory, the above equation determines the mass of a string state in terms of its internal state of oscillation. For open strings (with $\alpha' = 1/2$) this gives:

$$M^2 = -2a + 2\sum_{n=1}^{\infty}\alpha_{-n}\cdot\alpha_n \tag{11.3.88}$$

which shows that the mass squared of an oscillator ground state is $-2a$, and that the mass squared of any excited state is larger than this by a multiple of 2. For closed strings (11.3.87) is supplemented by

$$(\tilde L_0 - a)\,|\phi\rangle = 0 \tag{11.3.89}$$

and the two together give:

$$M^2 = -8a + 8\sum_{n=1}^{\infty}\alpha_{-n}\cdot\alpha_n = -8a + 8\sum_{n=1}^{\infty}\tilde\alpha_{-n}\cdot\tilde\alpha_n \tag{11.3.90}$$

From the above equation we have:

$$\sum_{n=1}^{\infty}\alpha_{-n}\cdot\alpha_n = \sum_{n=1}^{\infty}\tilde\alpha_{-n}\cdot\tilde\alpha_n \tag{11.3.91}$$

showing that this constraint equation couples the left-moving modes to the right-moving modes. When $m \ne 0$, $L_m$ and $\tilde L_m$ correspond to terms of definite non-zero frequency in $T_{++}$ and $T_{--}$. In the classical theory they are supposed to vanish; their quantum analog here requires that the positive frequency components annihilate a physical state, i.e.,

$$L_m\,|\phi\rangle = 0, \qquad m = 1, 2, \ldots \tag{11.3.92}$$

The above equation, along with a suitable ordering convention, ensures that the operators $(L_m - a\,\delta_m)$ for both positive and negative $m$ have vanishing matrix elements between pairs of physical states.

Remark 11.3.4 If $|\phi\rangle$ and $|\phi'\rangle$ are two physical states that obey (11.3.87) and (11.3.92), then the value of the expression (see Ftn. 19):

$$\langle\phi'|\,L_{n_1}L_{n_2}\cdots L_{n_k}\,|\phi\rangle \tag{11.3.93}$$

depends on the operator ordering. Moreover, if the $L_{n_k}$ are placed to the right for $n_k$ positive and to the left for $n_k$ negative, then (11.3.93) vanishes in view of the physical state conditions and the Hermitian property $L_{-n} = L_n^\dagger$.

Ftn. 18: Note that (11.3.91) in fact follows from the condition $(L_0 - \tilde L_0)\,|\phi\rangle = 0$.

Remark 11.3.5 In the classical case all $L_n$'s are zero for allowed motions of the string. In the quantum case, the anomalous commutation relations of the $L_n$ make it impossible to find states annihilated by all of them. The closest we can come to the classical case is described in Remark (11.3.4).

Remark 11.3.6 There is no ordering ambiguity in the angular momentum operators, which for open strings read, by (11.3.59), (11.3.61) and (11.3.62):

$$J^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu - i\sum_{n=1}^{\infty}\frac{1}{n}\left(\alpha^\mu_{-n}\alpha^\nu_n - \alpha^\nu_{-n}\alpha^\mu_n\right) \tag{11.3.94}$$

Hence the operators introduced in (11.3.59-62) can be considered unambiguously as quantum operators. As a result, the Poincaré algebra can be obtained quantum mechanically without the possibility of an anomaly. Furthermore, since $[L_m, J^{\mu\nu}] = 0$, the physical state conditions remain invariant under Lorentz transformations; this also ensures that physical states form Lorentz multiplets.

3.7 Quantum Anomaly and Physical States

As mentioned above, the Fock space formed by $a^\mu_m$ (and $a^{\mu\dagger}_m$) is not positive definite (see (11.3.84)), whereas the space required here (for string theory) has to be free of negative norms. This is accomplished by considering the subspace formed by physical states that satisfy the Virasoro conditions $(L_m - a\,\delta_m)\,|\phi\rangle = 0$ for $m \ge 0$. As these conditions are in one-one correspondence with the timelike oscillators, their number is just sufficient to guarantee a positive definite Fock space. The following remarks in this connection are noteworthy.

Remark 11.3.7 Since $L_m \sim p\cdot\alpha_m$ + terms quadratic in oscillators, the absence of the quadratic terms would imply that the $L_m$-conditions (Virasoro conditions) decouple the timelike modes, as $p^\mu$ is timelike in almost all cases (barring a few low-lying states). In the rest frame, therefore, the physical states would be generated by the space components of the oscillators.

Remark 11.3.8 In view of the above remark, the counting of $L_m$-conditions is sufficient to give a chance of decoupling the ghosts. We shall see, however, that a ghost-free spectrum exists only for certain values of the constant $a$ and the space-time dimension $D$.

In order to determine the value of $a$ and that of $D$, we return to the operator $L_m$ and note that the Virasoro algebra

$$[L_m, L_n] = (m-n)\,L_{m+n}$$

of the classical theory requires a quantum mechanical correction here. We therefore alter the above formula to the form:*

$$[L_m, L_n] = (m-n)\,L_{m+n} + A(m)\,\delta_{m+n} \tag{11.3.95}$$

with $A(m)$ being an $m$-dependent c-number. It can be checked that $A(m)$ satisfies:

$$A(m) = c_3 m^3 + c_1 m \tag{11.3.96}$$

where $c_1$ and $c_3$ are constants.

Remark 11.3.9 The constant $c_1$ can be changed by shifting the definition of $L_0$ by a constant. Naturally this shifting of $L_0$ would also shift the constant $a$ in (11.3.87). We note that this shifting does not disturb the Virasoro algebra (11.3.95), and that it is only the relation between the constants $a$ and $c_1$ that has an invariant meaning.

Now the contribution of the anomaly depends on the values of $c_1$ and $c_3$; to determine these values we need to evaluate the commutator $[L_m, L_{-m}]$. More precisely, we calculate its expectation value in an oscillator ground state $|0; 0\rangle$, i.e., with $p^\mu = 0$. For $m = 1$ and $m = 2$ these are:

$$\langle 0; 0|\,[L_1, L_{-1}]\,|0; 0\rangle = 0 \tag{11.3.97}$$

(since every term in $L_1$ or $L_{-1}$ annihilates a zero-momentum ground state), and

$$\langle 0; 0|\,[L_2, L_{-2}]\,|0; 0\rangle = \langle 0; 0|\,L_2 L_{-2}\,|0; 0\rangle = \frac{1}{4}\langle 0; 0|\,\alpha_1\cdot\alpha_1\ \alpha_{-1}\cdot\alpha_{-1}\,|0; 0\rangle = \frac{1}{2}\,\eta_{\mu\nu}\,\langle 0; 0|\,\alpha^\mu_1\alpha^\nu_{-1}\,|0; 0\rangle = \frac{1}{2}\,\eta_{\mu\nu}\eta^{\mu\nu} = \frac{1}{2}D \tag{11.3.98}$$

Using these we obtain from (11.3.95) and (11.3.96) the values of $c_3$ and $c_1$; thus:

$$A(m) = \frac{1}{12}\,D\,(m^3 - m) \tag{11.3.99}$$

* The generalized algebra (11.3.95) is known as a central extension of the Virasoro algebra; the additional (c-number) term is the anomaly term here.
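The step from (11.3.97) and (11.3.98) to (11.3.99) is a two-equation linear solve, sketched below with exact fractions. The function names are illustrative. The second function, `mode_sum`, is an assumption that does not appear in the text: it uses the standard oscillator counting $\langle 0;0|L_mL_{-m}|0;0\rangle = \frac{D}{2}\sum_{n=1}^{m-1} n(m-n)$ as an independent cross-check for $m > 2$.

```python
from fractions import Fraction

def fit_anomaly(D):
    """Solve A(1) = 0 and A(2) = D/2 for (c3, c1) in A(m) = c3*m**3 + c1*m."""
    a1, a2 = Fraction(0), Fraction(D, 2)
    # Linear system: 1*c3 + 1*c1 = a1 and 8*c3 + 2*c1 = a2, solved by Cramer's rule.
    det = Fraction(1 * 2 - 1 * 8)
    c3 = (a1 * 2 - a2 * 1) / det
    c1 = (a2 * 1 - a1 * 8) / det
    return c3, c1

def anomaly(m, D):
    """Anomaly term A(m) built from the fitted constants."""
    c3, c1 = fit_anomaly(D)
    return c3 * m**3 + c1 * m

def mode_sum(m, D):
    """Assumed cross-check: (D/2) * sum_{n=1}^{m-1} n*(m-n) for m >= 1."""
    return Fraction(D, 2) * sum(n * (m - n) for n in range(1, m))

if __name__ == "__main__":
    D = 26
    print("c3, c1 =", fit_anomaly(D))
    for m in range(1, 7):
        print(m, anomaly(m, D), mode_sum(m, D), Fraction(D, 12) * (m**3 - m))
```

For any $D$ the fit returns $c_3 = D/12$ and $c_1 = -D/12$, which is the content of (11.3.99).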

As already mentioned in earlier chapters (see Chapter 5 and Chapter 10), the structure of the Virasoro algebra and of the anomaly here is such that the operators $L_1$, $L_0$ and $L_{-1}$ generate a closed subalgebra without anomaly, which is isomorphic to $SU(1, 1)$ or $SL(2, R)$.

Our next task is to obtain the conditions that ensure that there are no negative-norm physical states. We shall see that there are negative-norm states only for certain regions of the parameter $a$ and the space-time dimension $D$. To delimit these regions, it is useful to look for physical states of zero norm. Naturally, the regions of physical states with negative norm and with positive norm have a common boundary on which zero-norm states appear. This amounts to saying that there is a 'critical' region in which the physical Hilbert space is on the verge of developing negative norms (ghosts). We shall study these for open strings. The case of closed strings follows easily by simply doubling the oscillators $\alpha$ as well as the Virasoro conditions. In the following paragraphs we show that the first condition for the absence of these ghosts is given by:

$$a \le 1 \tag{11.3.100}$$

We denote the open-string ground state of momentum $k^\mu$ as $|0; k\rangle$. From the mass-shell condition $L_0 = a$, we have $\alpha' k^2 = a$. Now let $\xi_\mu(k)$ be a polarization vector with $D$ independent components (assuming that no gauge constraints are taken into account); then $\xi\cdot\alpha_{-1}|0; k\rangle$ gives the states at the first excited level. The mass-shell condition now implies

$$\alpha' k^2 = a - 1 \tag{11.3.101}$$


In addition, the $L_1$ subsidiary condition gives

$$\xi\cdot k = 0 \tag{11.3.102}$$

This in turn means that there are $(D - 1)$ allowed polarizations. The norm of these states is given by $\xi\cdot\xi$. Further, if the vector $k$ lies in the $(0, 1)$ plane, then the $(D - 2)$ states with (spacelike) polarizations normal to that plane naturally have positive norms. Consider now the cases (i) $k^2 > 0$, (ii) $k^2 < 0$, (iii) $k^2 = 0$, which by (11.3.101) correspond to $a > 1$, $a < 1$ and $a = 1$. In case (i) the first excited state is a tachyon; $k$ can be chosen to have no time component, and it follows that the last $\xi$ is timelike and has negative norm. In case (ii) $k$ can be chosen as a vector with a time component only, and so the last $\xi$ is spacelike with positive norm. In case (iii) the last $\xi$ is proportional to $k$ and thus has zero norm. The conditions (11.3.101-102), along with the above discussion, lead to the condition (11.3.100) on the constant $a$ for the absence of ghosts.

When $a = 1$ (the boundary case) the vector particle is massless and the scalar ground state is a tachyon. Here the $L_1$ subsidiary condition corresponds to $\partial_\mu A^\mu = 0$, the covariant gauge condition of electrodynamics, and as in the covariant Gupta-Bleuler (see Ftn. 20) quantization of electrodynamics, this condition leaves $D - 2$ positive-norm states with transverse polarization and one longitudinal state $\xi^\mu = k^\mu$ of zero norm. It can be shown that this zero-norm state decouples from the $S$ matrix. One of the ways to do this is via the gravitational Ward identities. It is worth noting that in the case of field theory, this decoupling follows from gauge invariance and current conservation. At the first excited level (when $a = 1$), the 'null' state that appears is just the first of an infinite number of such states. We shall see that the result associated with the first excited level can be generalized to give further insight into the state spectrum. For this purpose we make the following definitions.

Definition 11.3.10 An arbitrary state $|\phi\rangle$ is called a physical state if it satisfies the constraints $L_m|\phi\rangle = 0$ for $m > 0$ and $(L_0 - a)\,|\phi\rangle = 0$. A state $|\psi\rangle$ is called spurious if it satisfies $(L_0 - a)\,|\psi\rangle = 0$ and

$$\langle\phi|\psi\rangle = 0 \tag{11.3.103}$$

for all physical states $|\phi\rangle$. A spurious state $|\psi\rangle$ can always be written as:

$$|\psi\rangle = \sum_{n>0} L_{-n}\,|\chi_n\rangle \tag{11.3.104}$$

where $|\chi_n\rangle$ is a state that satisfies:

$$(L_0 - a + n)\,|\chi_n\rangle = 0 \tag{11.3.105}$$

But $L_{-n}$ for $n \ge 3$ can be represented as iterated commutators of $L_{-1}$ and $L_{-2}$ (e.g., $L_{-3} \sim [L_{-1}, L_{-2}]$); hence the infinite series in (11.3.104) can be truncated to give:

$$|\psi\rangle = L_{-1}\,|\chi_1\rangle + L_{-2}\,|\chi_2\rangle \tag{11.3.106}$$

Ftn. 20: See [15].


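where $|\chi_1\rangle$ and $|\chi_2\rangle$ obey (11.3.105). The spurious state given above is evidently orthogonal to all physical states, since

$$\langle\phi|\psi\rangle = \sum_{m=1}^{2}\langle\phi|\,L_{-m}\,|\chi_m\rangle = \sum_{m=1}^{2}\langle\chi_m|\,L_m\,|\phi\rangle^{*} = 0 \tag{11.3.107}$$

Thus a spurious state can always be expressed either as given in (11.3.104) or as in (11.3.106) (see Exc. 4). When a state $|\psi\rangle$ is both spurious and physical, i.e.,

$$\langle\phi|\psi\rangle = 0, \qquad L_m|\psi\rangle = 0 \ \text{ for } m > 0, \quad\text{and}\quad (L_0 - a)\,|\psi\rangle = 0 \tag{11.3.108}$$

then in view of (11.3.105) it follows that it has zero norm:

$$\langle\psi|\psi\rangle = \sum_{m>0}\langle\chi_m|\,L_m\,|\psi\rangle = 0 \tag{11.3.109}$$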

States of this kind are orthogonal not only to physical states but also to themselves. They are sometimes called 'null' physical states. It is interesting to note that these null states can be constructed by considering spurious states of the type:

$$|\psi\rangle = L_{-1}\,|\chi\rangle, \tag{11.3.110}$$

where $|\chi\rangle$ is an arbitrary state satisfying $L_m|\chi\rangle = 0$ for $m > 0$ and $(L_0 - a + 1)\,|\chi\rangle = 0$. The state $|\chi\rangle$ could be the zero-momentum state $|0; 0\rangle$ or any physical state with suitably shifted $p^\mu$. The spurious state $|\psi\rangle$ given by (11.3.110) satisfies all the conditions for being physical apart from the $L_1$ condition. The action of $L_1$ is given by:

$$L_1|\psi\rangle = L_1 L_{-1}\,|\chi\rangle = 2L_0\,|\chi\rangle = 2(a - 1)\,|\chi\rangle \tag{11.3.111}$$

which vanishes for $a = 1$. From the above it is evident that by applying $L_{-1}$ to an arbitrary state $|\chi\rangle$ of this kind, an infinite number of zero-norm states can be constructed. A simple example of $|\chi\rangle$, as mentioned earlier, is $|\chi\rangle = |0; 0\rangle$, which gives the massless vector state. In the following example we show that the number of zero-norm states is much larger when the dimension $D = 26$.

Example 11.3.12

Consider spurious states having the structure:

$$|\psi\rangle = (L_{-2} + \gamma L_{-1}^2)\,|\chi\rangle \tag{11.3.112}$$

Here we take $a = 1$ and require that $L_m|\chi\rangle = 0$ for $m > 0$ and $(L_0 + 1)\,|\chi\rangle = 0$; this gives $(L_0 - 1)\,|\psi\rangle = 0$. For $|\psi\rangle$ to have zero norm it must be physical, and as such it should satisfy $L_m|\psi\rangle = 0$ for $m > 0$. From (11.3.112) it is evident that $L_m|\psi\rangle = 0$ trivially for $m \ge 3$. Therefore all we need to examine is the result of imposing the conditions $L_1|\psi\rangle = L_2|\psi\rangle = 0$, using the Virasoro algebra with the anomaly (11.3.99). This gives the equations $(3 - 2\gamma) = 0$ and $D - (8 + 12\gamma) = 0$, implying that $\gamma = 3/2$ and $D = 26$. Hence we note that (with $a = 1$) in this case there are many more zero-norm states, of the form

$$\left(L_{-2} + \tfrac{3}{2}L_{-1}^2\right)|\chi\rangle \tag{11.3.113}$$

We emphasize that unlike the first infinite class of zero-norm states, the norm of states of the type (11.3.113) is zero if and only if $D = 26$.
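The two conditions quoted in Example 11.3.12 can be reproduced from the algebra (11.3.95) with the anomaly (11.3.99). The sketch below is a reconstruction under stated assumptions, not the author's derivation: for $|\chi\rangle$ with $L_0|\chi\rangle = h|\chi\rangle$ and $L_m|\chi\rangle = 0$ for $m > 0$, commuting $L_1$ and $L_2$ through gives $L_1|\psi\rangle = (3 + \gamma(4h + 2))\,L_{-1}|\chi\rangle$ and $L_2|\psi\rangle = (4h + D/2 + 6\gamma h)\,|\chi\rangle$. The function name and the parameter `h` are illustrative.

```python
from fractions import Fraction

def null_state_parameters(h):
    """Return (gamma, D) making (L_-2 + gamma * L_-1**2)|chi> physical.

    h is the L_0 eigenvalue of |chi>. The two conditions solved are
        3 + gamma * (4h + 2) = 0        (from L_1)
        4h + D/2 + 6 * gamma * h = 0    (from L_2, with A(2) = D/2)
    Undefined for h = -1/2, where the first condition has no solution.
    """
    h = Fraction(h)
    gamma = Fraction(-3) / (4 * h + 2)
    D = -2 * (4 * h + 6 * gamma * h)
    return gamma, D

if __name__ == "__main__":
    # In the example a = 1, so (L_0 + 1)|chi> = 0 means h = -1.
    gamma, D = null_state_parameters(-1)
    print("gamma =", gamma, " D =", D)
```

With $h = -1$ the first condition reduces to $3 - 2\gamma = 0$ and the second to $D - (8 + 12\gamma) = 0$, the equations stated in the example.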


Example 11.3.13 Consider a state of the form (11.3.112) with $\gamma = 3/2$ and $|\chi\rangle = |0; p\rangle$ with $p^2 = -2$, i.e.:

$$\left(L_{-2} + \tfrac{3}{2}L_{-1}^2\right)|0; p\rangle = \left[\tfrac{1}{2}\,\alpha_{-1}\cdot\alpha_{-1} + \tfrac{5}{2}\,p\cdot\alpha_{-2} + \tfrac{3}{2}\,(p\cdot\alpha_{-1})^2\right]|0; p\rangle \tag{11.3.114}$$

It can be checked that this state has norm $(D - 26)/2$. As was to be expected, this norm vanishes for $D = 26$. For $D < 26$ the above state has negative norm; however, in this case the state does not satisfy the physical conditions and hence does not qualify as a state to be considered.

Example 11.3.14 When $D > 26$, a physical state with negative norm can be constructed. Consider states of the form:

$$|\phi\rangle = \left[c_1\,\alpha_{-1}\cdot\alpha_{-1} + c_2\,p\cdot\alpha_{-2} + c_3\,(p\cdot\alpha_{-1})^2\right]|0; p\rangle \tag{11.3.115}$$

with $p^2 = -2$. They satisfy $(L_0 - 1)\,|\phi\rangle = 0$, and obey $L_1|\phi\rangle = L_2|\phi\rangle = 0$ when $c_1$, $c_2$ and $c_3$ are related as:

$$c_2 = c_1\,\frac{D-1}{5}, \qquad c_3 = c_1\,\frac{D+4}{10} \tag{11.3.116}$$

The norm in this case is:

$$\langle\phi|\phi\rangle = \frac{2c_1^2}{25}\,(D-1)(26-D) \tag{11.3.117}$$

The RHS shows that there are ghosts in the physical spectrum for $D > 26$. Finally we remark:

Remark 11.3.15 The spectrum is ghost free when $a = 1$ and $D = 26$, or when $a \le 1$ and $D \le 25$. In the first case, there are many zero-norm states and the physical spectrum has as many propagating modes as are generated by 24 sets of $\alpha$ oscillators, whereas in the second case there are far fewer zero-norm states and the physical spectrum corresponds to $D - 1$ sets of oscillators.

Remark 11.3.16 In the first case, one may say that the string has only transverse excitations, whereas in the second case it possesses both longitudinal and transverse excitations. Based on Remark (11.3.15) we infer that the occurrence of extra zero-norm states in the $D = 26$ case implies that the theory has an enlarged gauge invariance there, and as such it presents a more compelling arena for study than the one for which $D \ne 26$.

Remark 11.3.17 In view of the above discussion, the open-string ground state is a tachyon of mass squared $-2$, corresponding to $a = 1$, and the first excited level is a massless vector meson. The existence of this massless gauge particle is an important aspect of the special properties of string theory in the critical dimension ($a = 1$).
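The norms and constraints quoted in Examples 11.3.13 and 11.3.14 can be verified with exact arithmetic. The sketch below assumes real coefficients, $\alpha' = 1/2$ (so $\alpha_0^\mu = p^\mu$), and the oscillator algebra (11.3.79); the closed-form inner products in the comments were worked out from that algebra for this illustration and are not quoted from the text. All function names are hypothetical.

```python
from fractions import Fraction

P2 = Fraction(-2)  # p^2 = -2, as in Examples 11.3.13 and 11.3.14

def norm(c1, c2, c3, D, p2=P2):
    """Norm of [c1 a_-1.a_-1 + c2 p.a_-2 + c3 (p.a_-1)^2]|0;p>.

    Inner products used (from [a_m, a_n] = m delta_{m+n} eta):
      <a_-1.a_-1 | a_-1.a_-1>      = 2 D
      <p.a_-2    | p.a_-2>         = 2 p^2
      <(p.a_-1)^2 | (p.a_-1)^2>    = 2 (p^2)^2
      <a_-1.a_-1 | (p.a_-1)^2>     = 2 p^2
    """
    return (2 * D * c1**2 + 2 * p2 * c2**2
            + 2 * p2**2 * c3**2 + 4 * p2 * c1 * c3)

def l1_condition(c1, c2, c3, p2=P2):
    """Coefficient of p.a_-1|0;p> in L_1|phi>, divided by 2."""
    return c1 + c2 + p2 * c3

def l2_condition(c1, c2, c3, D, p2=P2):
    """Coefficient of |0;p> in L_2|phi>."""
    return D * c1 + 2 * p2 * c2 + p2 * c3

def physical_coefficients(D, c1=Fraction(1)):
    """Coefficients (11.3.116) for a given dimension D."""
    return c1, c1 * Fraction(D - 1, 5), c1 * Fraction(D + 4, 10)

if __name__ == "__main__":
    half, five_halves, three_halves = Fraction(1, 2), Fraction(5, 2), Fraction(3, 2)
    for D in (10, 26, 30):
        spurious = norm(half, five_halves, three_halves, D)
        c = physical_coefficients(D)
        print(D, spurious, l1_condition(*c), l2_condition(*c, D), norm(*c, D))
```

The state (11.3.114) satisfies the $L_1$ condition for every $D$ but the $L_2$ condition only at $D = 26$, which is why its negative norm for $D < 26$ does not signal a physical ghost.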

3.8 The Conformal Dimension of an Operator; Vertex Operator

In this section we describe in brief how vertex operators can be used as a tool for analyzing the spectrum of physical states using the concept of conformal dimension of operators. Recall that we studied


vertex operators in Chapter 9 to analyze the interactions of particles (see in particular Sec. (9.6)) and extended this study to strings in Subsec. (2.4). To make things simpler we discuss it here only for open strings (see Fig. (11.12)).

[Fig. 11.12: An open string splitting into two; (i) all three are off mass shell, (ii) one of them (numbered 2) is on mass shell.]

As mentioned in Subsec. (2.4), the basic open-string interaction can be viewed either as the joining of two strings to make one, or as the breaking of a string into two separate ones. All three strings may be off mass shell, as in Fig. (11.12)(i). We consider here the case when at least one of them is a physical on-shell mass eigenstate, as in Fig. (11.12)(ii).

Remark 11.3.18 The concept of a mass eigenstate of the string is quantum mechanical. Thus when Planck's constant is restored in the formulas, a mass eigenstate has a width and a mass squared, both of which are of order $\hbar$. Hence in the classical limit any mass eigenstate of the string is, in a way, similar to a point particle.

Since we are dealing with quantum mechanics here, the transition $1 \to 1' + 2$ (with emission of the on-shell state 2) implies that the quantum state of $1'$ must be related to that of 1 by some linear transformation that depends on the state of the string 2. To put it more simply, with string 2 being pointlike, string $1'$ is obtained from 1 by the action of a local operator at the end of the string where 2 was emitted. This local operator for the emission of the on-shell state 2 is the familiar vertex operator (see Subsec. 2.4), denoted $V_2$. The above example implies that associated with every on-shell physical state $|\phi\rangle$ there should be a vertex operator $V_\phi$ with suitable properties. In the following we shall obtain a vertex operator (along with the required properties) for a general open string. The first step in this direction is to show that associated with every local operator there is a conformal dimension.

Let $A(\sigma, \tau)$ denote a local operator in the open-string Hilbert space. Since the vertex operator $V_2$ was defined as an operator at the endpoint of the string, we shall study $A(\sigma, \tau)$ at the string endpoint given by $\sigma = 0$ or $\sigma = \pi$. We shall denote $A(0, \tau)$ as $A(\tau)$. We note that $A(\tau)$ can be written as:

$$A(\tau) = e^{i\tau L_0}\,A(0)\,e^{-i\tau L_0} \tag{11.3.118}$$
since $(L_0 - a)$ is the string Hamiltonian. The operators $A(\tau)$ would be meaningful only if they are transformed into one another by the Virasoro algebra. We recall that an arbitrary operator $O(\tau)$ is said to have conformal dimension $J$ (see Sec. (1.4)) if and only if under an arbitrary change of variables $\tau \to \tau'(\tau)$ it transforms as

$$O'(\tau') = \left(\frac{d\tau}{d\tau'}\right)^{J} O(\tau) \tag{11.3.119}$$


Thus if the transformation $\tau \to \tau'$ is an infinitesimal transformation:

$$\tau' = \tau + \epsilon(\tau) \tag{11.3.120}$$

then expanding (11.3.119) to first order in $\epsilon$ gives:

$$\delta O(\tau) = -\epsilon\,\frac{dO}{d\tau} - J\,O(\tau)\,\frac{d\epsilon}{d\tau} \tag{11.3.121}$$

(This is the transformation law for a field of conformal dimension $J$.) Now the Virasoro operators $L_m$ generate the transformations (11.3.120) with $\epsilon = i e^{im\tau}$, so using (11.3.121) the condition for the operator $A$ to have conformal dimension $J$ is:

$$[L_m, A(\tau)] = e^{im\tau}\left(-i\,\frac{d}{d\tau} + mJ\right)A(\tau) \tag{11.3.122}$$

When $A(\tau)$ is expanded in Fourier modes as:

$$A(\tau) = \sum_{m=-\infty}^{\infty} A_m\,e^{-im\tau} \tag{11.3.123}$$

the condition of conformal dimension $J$ for the Fourier modes is given by:

$$[L_m, A_n] = \{m(J - 1) - n\}\,A_{m+n} \tag{11.3.124}$$
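The compatibility of the mode relation (11.3.124) with the Virasoro algebra and the Jacobi identity can be spot-checked numerically. The sketch below is an illustration with hypothetical function names: it compares the coefficient of $A_{m+n+k}$ in $[L_m, [L_n, A_k]] - [L_n, [L_m, A_k]]$ with that in $(m - n)[L_{m+n}, A_k]$. The central term of (11.3.95) is a c-number, commutes with $A_k$, and therefore drops out of this check.

```python
from fractions import Fraction

def coeff(m, n, J):
    """Coefficient c in [L_m, A_n] = c * A_{m+n}, as given by (11.3.124)."""
    return m * (J - 1) - n

def jacobi_defect(m, n, k, J):
    """Coefficient of A_{m+n+k} in
    [L_m, [L_n, A_k]] - [L_n, [L_m, A_k]] - (m - n) [L_{m+n}, A_k].
    It vanishes when (11.3.124) is consistent with the Jacobi identity."""
    nested = (coeff(n, k, J) * coeff(m, n + k, J)
              - coeff(m, k, J) * coeff(n, m + k, J))
    direct = (m - n) * coeff(m + n, k, J)
    return nested - direct

if __name__ == "__main__":
    dims = [Fraction(0), Fraction(1), Fraction(1, 2), Fraction(-1), Fraction(3)]
    ok = all(jacobi_defect(m, n, k, J) == 0
             for J in dims
             for m in range(-4, 5)
             for n in range(-4, 5)
             for k in range(-4, 5))
    print("Jacobi identity satisfied on sample:", ok)
```

Algebraically the defect reduces to $(m^2 - n^2)(J-1) - k(m-n) - (m-n)\{(m+n)(J-1) - k\} = 0$, so the sampled check is expected to pass for every $J$.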

Remark 11.3.19 It can be checked that the above relation is compatible with the Virasoro algebra and the Jacobi identity. Using this definition and the expression for $X^\mu(\tau)$, it can be verified that the conformal dimension $J$ of the string coordinate $X^\mu(\tau)$ is 0, and that of the momentum operator $\dot X^\mu(\tau)$ is 1.

All operators that satisfy the transformation law (11.3.122) for some definite $J$ are said to have definite conformal dimension. Operators of this type are rather special in the sense that they transform nicely under the Virasoro algebra and can be used to build new physical states from old ones (see Exc. 4; this exercise shows in particular that an operator which maps a given physical state to another one has conformal dimension $J = 1$). Now the vertex operator $V_2$ associated with the emission of a mass eigenstate 2 mapped an initial physical state 1 to a final physical state $1'$; this suggests that open-string vertex operators are operators of conformal dimension 1. In the following paragraph we obtain the expression for an open-string vertex operator which depends on the momentum, and show that for a particular value of the momentum the conformal dimension $J$ of this operator is 1.

The vertex operator $V(k, 0, \tau) = V(k, \tau)$ (at time $\tau$ and $\sigma = 0$) for emission (absorption) of a physical state of momentum $-k^\mu$ ($+k^\mu$), among other things, changes the momentum of a state on which it acts by an amount $k^\mu$. This means that its dependence on the centre-of-mass coordinate of the string consists of a factor $e^{ik\cdot x(\tau)}$, where

$$x^\mu(\tau) = x^\mu + p^\mu\tau \tag{11.3.125}$$

is the centre-of-mass position of the string at time $\tau$. From the expression for $X^\mu(\tau)$ in (11.3.49), it is obvious that for this to happen $V(k, \tau)$ should contain a factor $\exp[ik\cdot X(0, \tau)]$. Now a string that absorbs a mass eigenstate of momentum $k^\mu$ at world-sheet position $(0, \tau)$ and space-time position $X^\mu(0, \tau)$ should also have its wave function modified by a factor $\exp[ik\cdot X(0, \tau)]$. Thus if the absorbed or emitted string state has no quantum number other than its momentum, then it is usual to treat $\exp[ik\cdot X(0, \tau)]$ as a vertex operator $V(k, \tau)$. However, the expression $\exp[ik\cdot X(0, \tau)]$ for $V(k, \tau)$ requires normal ordering; hence using (11.3.125) we have:

$$V(k, \tau) = {:}e^{ik\cdot X(0,\tau)}{:} = \exp\left(k\cdot\sum_{n=1}^{\infty}\frac{\alpha_{-n}}{n}\,e^{in\tau}\right)e^{ik\cdot x(\tau)}\exp\left(-k\cdot\sum_{n=1}^{\infty}\frac{\alpha_n}{n}\,e^{-in\tau}\right) \tag{11.3.126}$$

Remark 11.3.20 We note that if there were no normal ordering, the exponent would differ by the divergent sum $\alpha' k^2\sum_{n}\frac{1}{n}$ from the exponent given above. Hence in the special case $k^2 = 0$, the normal ordering has no effect.

In order to compute the conformal dimension of $V(k, \tau)$, we make a few observations.

Observation 11.3.21 If the operators $A_1(\tau)$ and $A_2(\tau)$ have conformal dimensions $J_1$ and $J_2$ respectively, then, provided the product operator $A_1(\tau)A_2(\tau)$ is unambiguously well-defined (i.e., there is no short-distance singularity in the operator product $A_1(\tau)A_2(\tau')$ as $\tau' \to \tau$), the conformal dimension of the product is $J_1 + J_2$.

Observation 11.3.22 In view of the above observation, one might expect that the conformal dimension of $X^\mu(\tau)X^\nu(\tau)$ would be zero, since $J = 0$ for $X^\mu(\tau)$ and $X^\nu(\tau)$. This, however, is not true; in fact the normal-ordered product ${:}X^\mu(\tau)X^\nu(\tau){:}$ has no definite conformal dimension. On the other hand, the normal-ordered expression for $V(k, \tau)$ in (11.3.126) has a definite conformal dimension, which (as we shall see) can be obtained by evaluating the commutator (11.3.122) after $A(\tau)$ is replaced by $V(k, \tau)$.

To begin with we note that

$$[\alpha^\mu_p,\ e^{k\cdot\alpha_{-n}}] = p\,\delta_{p-n}\,k^\mu\,e^{k\cdot\alpha_{-n}} \tag{11.3.127}$$

therefore, since $L_m = \frac{1}{2}\sum_{r=-\infty}^{\infty}\alpha_{m-r}\cdot\alpha_r$, we have:

$$[L_m,\ e^{k\cdot\alpha_{-n}}] = \frac{1}{2}\,n\,\{k\cdot\alpha_{m-n},\ e^{k\cdot\alpha_{-n}}\} \tag{11.3.128}$$

If one were not to use the normal ordering, it can be checked that $[L_m, V]$ with $V$ in place of $A$ (for $m > 0$ or $m < 0$) gives a result of the form (11.3.122) with $J = 0$. But $V$ is defined to be the normal-ordered expression ${:}\exp[ik\cdot X]{:}$, which means that $dV/d\tau$ is also normal-ordered. When one uses (11.3.128) to obtain an expression for $[L_m, V]$, however, the resulting expression is not normal-ordered. Out of the infinite number of terms that arise in $[L_m, V]$ from the infinite product (11.3.126), a finite number are not in normal-ordered form. The terms that have lowering operators to the left of raising operators in $V$ are the ones that are not in normal-ordered form. More explicitly they are:

$$\frac{1}{2}\sum_{n=1}^{m} k\cdot\alpha_{m-n}\,e^{in\tau}\,V(k, \tau) \tag{11.3.129}$$

Once the normal ordering of the expression (11.3.129) is done, the commutator contribution becomes:

$$\frac{1}{2}\left[\sum_{n=1}^{m} k\cdot\alpha_{m-n}\,e^{in\tau},\ V(k, \tau)\right] = \frac{1}{2}\,k^2\,e^{im\tau}\sum_{n=1}^{m} V(k, \tau) = \frac{1}{2}\,m\,k^2\,e^{im\tau}\,V(k, \tau) \tag{11.3.130}$$

Hence we have:

$$[L_m, V(k, \tau)] = e^{im\tau}\left(-i\,\frac{d}{d\tau} + \frac{1}{2}\,m\,k^2\right)V(k, \tau) \tag{11.3.131}$$

Comparison with (11.3.122) shows that $J = k^2/2$. Accordingly the vertex $V_0(k) \equiv V(k, 0)$ is a physical vertex operator with conformal dimension $J = 1$ provided that $k^2 = 2$. We note that $V_0(k)$ is the proper vertex for emission of the ground state tachyon, whose mass squared has been assumed to be $M^2 = -2$. Furthermore, if $k^2 = 0$, $V(k, \tau)$ requires no normal-ordering (Remark (11.3.19)), and its conformal dimension is $J = 0$. This, however, is not the right vertex operator for every state with $k^2 = 0$: for instance, $k^2 = 0$ is correct for a massless vector meson, but $J = 0$ is the wrong conformal dimension for its vertex operator. The following remark explains why the conformal dimension $J$ of the vertex operator of a massless vector meson is 1 and not 0.

Remark 11.3.23 From our discussions in Remarks (11.3.18) and (11.3.19), $dX^\mu/d\tau$ has conformal dimension 1. Hence one may interpret

$$V_\zeta(k,\tau) = \zeta\cdot\frac{dX}{d\tau}\,\exp[ik\cdot X] \tag{11.3.132}$$

as the vertex operator for emission of a massless vector meson of polarization $\zeta^\mu(k)$, whose conformal dimension is $J = 1$. This is so since the operator product of $\zeta\cdot dX/d\tau$ and $\exp[ik\cdot X]$ is free of short-distance singularities (it is assumed here that $k\cdot\zeta = 0$).$^{21}$ The above remark illustrates in particular that vertex operators of conformal dimension 1 are in one-one correspondence with physical states. The vertex operators for the other states in the spectrum are more complicated. In the remark given below, we obtain a vertex operator for a state of zero norm.

Remark 11.3.24 Let $W(k, \tau)$ be an operator of conformal dimension 0 that contains a factor of $:\exp[ik\cdot X]:$. Then the operator

$$V(k,\tau) = -i\,\frac{dW(k,\tau)}{d\tau} = [L_0, W(k,\tau)]$$

has conformal dimension $J = 1$.

21. This restriction on the allowed polarization of a vector particle is similar to what one finds in electrodynamics. Note that we used it in analyzing the physical state spectrum (see Subsec. 3.7).


This is easy to verify. For instance, if $k^2 = 0$ we can choose $W(k, \tau)$ simply as $\exp[ik\cdot X]$; then $V(k, \tau)$ can be seen to be the vertex operator for emission of a massless vector meson with longitudinal polarization $\zeta^\mu = k^\mu$. In conclusion, vertex operators of the form $V(k, \tau) = -i\,(dW/d\tau)$ with $W$ of conformal dimension 0 always describe emission of states of zero norm. Consider now emission vertices for the second excited level states with $\alpha' k^2 = -1$. The operator $V_0(k)$ (corresponding to zero frequency) now has $J = -1$ and therefore

$$\zeta_{\mu\nu}\,:\dot X^\mu \dot X^\nu \exp[ik\cdot X]: \tag{11.3.133}$$

has $J = 1$ provided there are no short-distance singularities in the operator product (11.3.133). The short-distance singularities are absent if $k^\mu\zeta_{\mu\nu} = \mathrm{tr}\,\zeta = 0$. It is important to note that these are exactly the conditions for $\zeta_{\mu\nu}(k)$ to be the polarization tensor of a massive spin-two state, i.e., a symmetric traceless tensor of SO(D - 1).

3.9

Gauge Quantization Using the Light-Cone Formalism

The light-cone formalism, though not manifestly covariant, can be shown to be equivalent to the covariant formalism. As we shall see, it retains the advantages of covariant quantization while being manifestly free of ghosts. Apart from this, the light-cone picture is very 'physical' and provides a better understanding of why one has to choose $a = 1$ and $D = 26$ for a viable string theory. We have already seen that in the covariant gauge $h_{\alpha\beta} = \eta_{\alpha\beta}$, the string coordinates with open-string boundary conditions have the mode expansions:

$$X^\mu(\sigma,\tau) = x^\mu + p^\mu\tau + i\sum_{n\neq 0}\frac{1}{n}\,\alpha^\mu_n\,e^{-in\tau}\cos n\sigma \tag{11.3.134}$$

and satisfy the Virasoro subsidiary conditions $T_{++} = T_{--} = 0$. We also saw in (11.3.33) and (11.3.34) that besides the gauge choice $h_{\alpha\beta} = \eta_{\alpha\beta}$, there was still a residual gauge symmetry that could be used to impose an additional gauge condition. In the following paragraphs we shall see how a noncovariant yet quite convenient gauge condition can be imposed. Similar to the light-cone coordinates $\sigma^\pm$ on the world-sheet, we introduce light-cone coordinates in space-time by defining$^{22}$:

$$x^+ = \frac{x^0 + x^{D-1}}{\sqrt{2}}, \qquad x^- = \frac{x^0 - x^{D-1}}{\sqrt{2}} \tag{11.3.135}$$

and let the other (transverse) space-like coordinates $X^i$, $i = 1, \ldots, D - 2$, remain the same. The nonzero components of the Minkowski metric in this set-up are $\eta_{ij} = 1$ ($i = j$) and $\eta_{+-} = \eta_{-+} = -1$. In terms of these coordinates, the components of a vector $V^\mu$ are

$$V^\pm = \frac{1}{\sqrt{2}}\left(V^0 \pm V^{D-1}\right) \tag{11.3.136}$$

and $V^i$, $i = 1, \ldots, D - 2$. The indices are raised and lowered by the rules:

$$V^+ = -V_-, \qquad V^- = -V_+ \qquad\text{and}\qquad V^i = V_i \tag{11.3.137}$$

22. Note that the coordinates $x^0$ and $x^{D-1}$ in (11.3.135) have been chosen in an arbitrary and noncovariant manner.


The inner product of two vectors is:

$$V\cdot W = V^iW^i - V^+W^- - V^-W^+ \tag{11.3.138}$$
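As a quick consistency check of (11.3.136) and (11.3.138), the following sketch converts two vectors to light-cone components and compares the light-cone inner product with the ordinary Minkowski one. The function names are illustrative only, and the mostly-plus signature $(-,+,\ldots,+)$ is an assumption inferred from $\eta_{ij} = 1$, $\eta_{+-} = -1$.

```python
import math

def to_light_cone(v):
    """Split (V^0, V^1, ..., V^{D-1}) into (V^+, V^-, transverse components)."""
    v_plus = (v[0] + v[-1]) / math.sqrt(2)
    v_minus = (v[0] - v[-1]) / math.sqrt(2)
    return v_plus, v_minus, list(v[1:-1])

def dot_minkowski(v, w):
    """Inner product with assumed signature (-, +, ..., +)."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def dot_light_cone(v, w):
    """Inner product V^i W^i - V^+ W^- - V^- W^+ as in (11.3.138)."""
    vp, vm, vt = to_light_cone(v)
    wp, wm, wt = to_light_cone(w)
    return sum(a * b for a, b in zip(vt, wt)) - vp * wm - vm * wp

# Example in D = 4: both expressions agree up to rounding.
v = [1.0, 2.0, 3.0, 4.0]
w = [0.5, -1.0, 2.0, 3.0]
print(dot_minkowski(v, w), dot_light_cone(v, w))
```

The agreement holds for any pair of vectors, since $-V^+W^- - V^-W^+ = -V^0W^0 + V^{D-1}W^{D-1}$.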

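We recall that in terms of $\sigma^\pm$ the residual invariance corresponds to the possibility of arbitrary reparametrizations:

$$\sigma^+ \to \tilde\sigma^+(\sigma^+), \qquad \sigma^- \to \tilde\sigma^-(\sigma^-) \tag{11.3.139}$$

These reparametrizations transform $\tau = \frac{1}{2}(\sigma^+ + \sigma^-)$ and $\sigma = \frac{1}{2}(\sigma^+ - \sigma^-)$ into

$$\tilde\tau = \frac{1}{2}\left[\tilde\sigma^+(\tau+\sigma) + \tilde\sigma^-(\tau-\sigma)\right] \quad (a), \qquad \tilde\sigma = \frac{1}{2}\left[\tilde\sigma^+(\tau+\sigma) - \tilde\sigma^-(\tau-\sigma)\right] \quad (b) \tag{11.3.140}$$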

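Eq. (11.3.140)(a) shows that $\tilde\tau$ may be an arbitrary solution of the free massless wave equation:

$$\left(\frac{\partial^2}{\partial\sigma^2} - \frac{\partial^2}{\partial\tau^2}\right)\tilde\tau = 0 \tag{11.3.141}$$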

Once $\tilde\tau$ is chosen, $\tilde\sigma$ in (11.3.140)(b) is completely determined (up to a rigid translation of $\sigma$ in the case of closed strings). A natural way to choose the solution $\tilde\tau$ of (11.3.141) is to make a reparametrization such that $\tilde\tau$ equals one of the space-time coordinates $X^\mu$. The reparametrization given by $\tilde\tau = X^+/p^+ + \text{constant}$ is called the light-cone gauge choice. This can alternatively be expressed as:

$$X^+(\sigma,\tau) = x^+ + p^+\tau \tag{11.3.142}$$

The $X^+$ component of the string coordinates here corresponds to the time coordinate in a frame in which the string is travelling at infinite momentum. Also, since $X^+$ is independent of $\sigma$, in this gauge every point on the string is at the same value of 'time.' In view of (11.3.142), i.e. having fixed $X^+$, the Virasoro constraint equations $(\dot X \pm X')^2 = 0$ become:

$$(\dot X^- \pm X'^-) = \frac{1}{2p^+}\left(\dot X^i \pm X'^i\right)^2 \tag{11.3.143}$$

Some simple algebra shows that the above equation can be solved for $X^-$ in terms of $X^i$ (with an integration constant). Thus both $X^+$ and $X^-$ can be eliminated in light-cone gauge, leaving only the transverse oscillators $X^i$. Using the mode expansion of $X^-$:

$$X^- = x^- + p^-\tau + i\sum_{n\neq 0}\frac{1}{n}\,\alpha^-_n\,e^{-in\tau}\cos n\sigma \tag{11.3.144}$$

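it follows that the explicit solution of (11.3.143) (in terms of oscillator coefficients) is:

$$\alpha^-_n = \frac{1}{p^+}\left(\frac{1}{2}\sum_{i=1}^{D-2}\sum_{m=-\infty}^{\infty} :\alpha^i_{n-m}\alpha^i_m: \; - \; a\,\delta_n\right) \tag{11.3.145}$$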

For closed strings O"+ and

Remark 11.3.25 In light-cone gauge the identification of $\alpha^-_0$ with $p^-$ is the mass-shell condition. For $n = 0$, (11.3.145) gives:

$$M^2 = (2p^+p^- - p^ip^i) = 2(N - a) \tag{11.3.146}$$

where

$$N = \sum_{n=1}^{\infty}\alpha^i_{-n}\alpha^i_n \tag{11.3.147}$$

The mass-shell condition obtained here differs from the covariant case in the definition of $N$: in (11.3.147) it is expressed only in terms of transverse oscillators (see (11.3.88)). We further note that the Virasoro algebra relation (11.3.95) satisfied by $p^+\alpha^-_n$ is:

$$[p^+\alpha^-_m,\, p^+\alpha^-_n] = (m - n)\,p^+\alpha^-_{m+n} + \left[\frac{D-2}{12}(m^3 - m) + 2am\right]\delta_{m+n} \tag{11.3.148}$$

Having obtained the basic formulae of light-cone quantization, our next task is to examine whether the theory is Lorentz invariant in this gauge. The following remarks elucidate this point.

Remark 11.3.26 In the light-cone gauge all string excitations are generated by the transverse oscillators $\alpha^i_n$; for instance, the first excited state is given by $\alpha^i_{-1}|0; p\rangle$. This state is a $(D - 2)$-component vector representation of the transverse rotation group SO(D - 2). It is known that a transversely polarized vector acquires a longitudinal polarization under a Lorentz transformation,$^{25}$ unless it is massless. Hence in light-cone gauge the theory is Lorentz invariant only if the vector state $\alpha^i_{-1}|0; p\rangle$ is massless, i.e., the parameter $a = 1$.

Remark 11.3.27 In order to find the restriction that Lorentz invariance of the theory places on the space-time dimension $D$, we calculate the normal-ordering constant with the help of the formula:

$$\frac{1}{2}\sum_{i=1}^{D-2}\sum_{n=-\infty}^{\infty}\alpha^i_{-n}\alpha^i_n = \frac{1}{2}\sum_{i=1}^{D-2}\sum_{n=-\infty}^{\infty}:\alpha^i_{-n}\alpha^i_n: \; + \; \frac{D-2}{2}\sum_{n=1}^{\infty} n \tag{11.3.149}$$

To obtain this normal-ordering constant we have to regularize the second sum on the RHS, which is manifestly divergent. For this we use the fact that the sum $\sum_{n=1}^{\infty} n^{-s}$ converges for $\mathrm{Re}\,s > 1$ to the Riemann zeta function $\zeta(s)$, which has a unique analytic continuation to the point $s = -1$, giving $\zeta(-1) = -\frac{1}{12}$. Substituting this value for $\sum_{n=1}^{\infty} n$ in (11.3.149), we obtain that the required normal-ordering constant in $L_0$ is (see Remark (11.3.9)):

$$-\frac{D-2}{24} \tag{11.3.150}$$
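The arithmetic behind (11.3.150) can be checked exactly. The sketch below is an illustration rather than part of the derivation: it obtains $\zeta(-1)$ from the standard identity $\zeta(-n) = -B_{n+1}/(n+1)$ for the Bernoulli numbers (an identity assumed here, not stated in the text), and then evaluates the constant $\frac{D-2}{2}\,\zeta(-1)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli(n):
    """Return [B_0, ..., B_n] (convention B_1 = -1/2) from the usual recurrence."""
    B = [Fraction(0)] * (n + 1)
    B[0] = Fraction(1)
    for m in range(1, n + 1):
        # sum_{k=0}^{m} C(m+1, k) B_k = 0, solved for B_m
        B[m] = -sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1)
    return B

def zeta_at_negative_integer(n):
    """zeta(-n) = -B_{n+1} / (n + 1) for integer n >= 1."""
    return -bernoulli(n + 1)[n + 1] / (n + 1)

def normal_ordering_constant(D):
    """(D - 2)/2 * zeta(-1), i.e. the regularized constant of (11.3.150)."""
    return Fraction(D - 2, 2) * zeta_at_negative_integer(1)

print(zeta_at_negative_integer(1))    # exact value of zeta(-1)
print(normal_ordering_constant(26))   # constant in L_0 for D = 26
```

Setting the constant equal to $-a$ with $a = 1$ amounts to solving $(D-2)/24 = 1$, which is the condition used in the text that follows.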

25. This statement is in fact the well known result of field theory: the 'spin' of a massive (massless) particle is labelled by an irreducible representation of SO(D - 1) (SO(D - 2)).


From (11.3.87) we know that this equals $-a$, and since the constant $a$ is 1 for Lorentz invariance, we have $D = 26$. The restrictions on $a$ and $D$ given in the above two remarks can be established rigorously by considering the Lorentz generators $J^{\mu\nu}$. In fact it can be shown that $a = 1$ and $D = 26$ are the necessary and sufficient conditions for Lorentz invariance of the theory in the light-cone formalism. We summarize this approach in the following remarks and refer the interested reader to [13a and b] and [15].

Remark 11.3.28 We note to begin with that some of the Lorentz transformations rotate the + direction into another direction, so a reparametrization (i.e., gauge transformation) has to be performed in the transformed system so as to restore the gauge condition.$^{26}$ The transformations that affect $X^+$, and hence the gauge condition, are generated by $J^{+-}$ and $J^{i-}$, and these are the transformations that may have an anomaly. Cancellation of this anomaly gives the restrictions on $a$ and $D$. The remaining Lorentz generators are associated with the transverse space and generate an SO(D - 2) subgroup which, as we know, is a manifest symmetry of the light-cone gauge formalism.

Remark 11.3.29 Consider the general expression for an infinitesimal Lorentz transformation on the coordinates (in the classical theory) that allows for an arbitrary reparametrization $\xi^\alpha(\sigma, \tau)$:$^{27}$

$$\delta X^\mu(\sigma,\tau) = a^\mu{}_\nu\,X^\nu(\sigma,\tau) + \xi^\alpha(\sigma,\tau)\,\partial_\alpha X^\mu(\sigma,\tau) \tag{11.3.151}$$

these new parameters
(11.3.152)

The + component of (11.3.151) when compared with (11.3.152) leads to the form of the 'compensating' reparametrization £,a. For instance one now has: a+vXv+

Z°p+=a+v(xv+pvT)=a+vxv(T)

(11.3.153)

which implies:

(11.3.154)

p Using (11.3.39) it follows that

^J>'f^

(11.3.155)

The substitution of the expressions for $\xi^\alpha$ into (11.3.151) gives the form of the action of Lorentz transformations that takes into account the noncovariant gauge fixing (see (11.3.142)).

Remark 11.3.30 For those transformations that involve $a^{i-}$, the Lorentz transformations act nonlinearly on the transverse coordinates, since there are terms on the RHS of (11.3.151) that are quadratic in the transverse coordinates. Now in the quantum theory one has to invoke normal-ordering to deal with these

26. This process is called a compensating reparametrization.

27. $\alpha$ takes just the two values 0 and 1, and $\partial_0 = \partial_\sigma$, $\partial_1 = \partial_\tau$.


bilinear terms, and therefore one may obtain an anomaly in the Lorentz algebra. In order to detect this anomaly one has to check that the operators $J^{\mu\nu} = l^{\mu\nu} + E^{\mu\nu}$ (given in (11.3.94)) really generate the Lorentz algebra (see Exc. 6), as they are supposed to. It is easy to check that most of the commutators give the correct result for any $D$, except the ones formed by the $J^{i-}$ transformations. The commutator $[J^{i-}, J^{j-}]$ has to vanish for Lorentz invariance; instead it leads to an anomaly except under certain restrictions.

Remark 11.3.31 In the light-cone gauge $E^{i+} = E^{+-} = 0$, whereas $E^{i-}$ is cubic in the transverse oscillators when the light-cone gauge expansion of $\alpha^-_n$ is substituted. As a result the commutator $[J^{i-}, J^{j-}]$ contains terms that are quadratic or quartic in the oscillators.$^{28}$ The terms in $[J^{i-}, J^{j-}]$ that contain four oscillators are the same as in the classical case and they cancel just as in the classical computation, and hence there is no anomaly on their account. Thus the anomaly that can arise is due to terms that are quadratic in the oscillators. In other words, the nonvanishing part of $[J^{i-}, J^{j-}]$ is given by:

$$[J^{i-}, J^{j-}] = -\frac{1}{(p^+)^2}\sum_{m=1}^{\infty}\Delta_m\left(\alpha^i_{-m}\alpha^j_m - \alpha^j_{-m}\alpha^i_m\right) \tag{11.3.156}$$

where the coefficients $\Delta_m$ are c-numbers. These coefficients can be computed by using essentially the oscillator commutation relations and relations such as $[x^-, 1/p^+] = i(p^+)^{-2}$. These computations yield:

$$\Delta_m = m\left(\frac{26 - D}{12}\right) + \frac{1}{m}\left(\frac{D - 26}{12} + 2(1 - a)\right) \tag{11.3.157}$$

which shows that $\Delta_m$ vanishes for all $m$, i.e., Lorentz invariance holds, only when $D = 26$ and $a = 1$. To conclude this section, we show in brief how transverse physical states can be constructed using DDF operators.
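Returning briefly to (11.3.157), the claim that $\Delta_m$ vanishes for every $m$ only at $D = 26$, $a = 1$ is easy to check with exact arithmetic. The sketch below (the function name `delta_m` is illustrative) evaluates $\Delta_m = m\,\frac{26-D}{12} + \frac{1}{m}\left(\frac{D-26}{12} + 2(1-a)\right)$ as written above.

```python
from fractions import Fraction

def delta_m(m, D, a):
    """Anomaly coefficient Delta_m of (11.3.157) as an exact rational number."""
    return m * Fraction(26 - D, 12) + Fraction(1, m) * (Fraction(D - 26, 12) + 2 * (1 - a))

# Delta_m vanishes identically at D = 26, a = 1.
print(all(delta_m(m, 26, 1) == 0 for m in range(1, 100)))

# For m = 1 the D-dependence cancels, so Delta_1 = 2(1 - a) tests a alone;
# m = 2 then tests D.
print(delta_m(1, 25, 1), delta_m(1, 26, 0), delta_m(2, 25, 1))
```

The $m = 1$ coefficient therefore fixes $a = 1$ independently of $D$, and any single $m \geq 2$ then fixes $D = 26$.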

3.10

DDF Operators and the No-ghost Theorem

In Subsecs. 3.4 and 3.7 we established the (Virasoro) conditions (in covariant quantization) that should be obeyed by physical states but did not give a general description of the states that obey these conditions. With this goal in view, we shall explicitly construct all the physical excited states. In the process we shall make contact with the light-cone gauge, and since that formalism is (manifestly) ghost-free, we shall be in a position to prove a no-ghost theorem for the covariant formalism. This objective will be achieved by constructing a set of operators that commute with the Virasoro operators. These operators, when applied successively to the ground state, give all possible physical states, and they form a closed algebra known as the 'spectrum generating algebra.' They are called the DDF operators after Del Giudice, Di Vecchia and Fubini, who pioneered the work on them (see [8]). The DDF operators are denoted $A^i_n$, where $i$ runs over the $D - 2$ transverse dimensions of space-time and $n$ is an arbitrary integer. These operators are in one-one correspondence with the transverse components of $\alpha^\mu_n$ and they describe the transverse modes of the string. For each value of $n$ (in $A^i_n$), the Virasoro constraints provide one restriction; thus a spectrum generating algebra contains $(D - 1)$ operators for each value of $n$.

28. Since $[J^{i-}, J^{j-}]$ transforms nontrivially under the transverse rotation group, there cannot be a c-number without any oscillators (see Chapter 9 for c-number).

We note that the longitudinal operators $A^-_n$ also enter the theory, though they are not explicitly mentioned here. In the following paragraphs we shall construct these DDF operators and show that they are indeed (definite) integrals of vertex operators. Let $|0; p_0\rangle$ be the tachyonic ground state of the bosonic open string. When $a = 1$, we have $p_0^2 = 2$ for this state. Suppose that a particular state of motion of the tachyon is described by $p_0^+ = 1$, $p_0^- = -1$ and $p_0^i = 0$; this choice clearly satisfies the mass-shell condition $p_0^2 = 2$. Further, let $k_0^\mu$ be a null vector with components $k_0^- = -1$ and $k_0^+ = k_0^i = 0$; then we have $k_0\cdot p_0 = 1$. The states with the property that, if the mass of the state is given by $\alpha' M^2 = N - 1$, then its momentum is $p^\mu = p_0^\mu - Nk_0^\mu$, are called 'allowed' states. These states turn out to be more amenable to a simple study than others. Moreover, since any physical state can be Lorentz transformed to an 'allowed' state, the study of these states is sufficient. In short, an understanding of the 'allowed' states implies an understanding of all of the physical states with $p^i = 0$ and $p^+ = 1$, and all other states can be reached by a Lorentz transformation thereafter. We further note that the massless vertex operator $V_\zeta(k, \tau)$ (see (11.3.132)) plays an important role in the construction of the spectrum generating algebra. We recall that it is a periodic function of $\tau$ with period $2\pi$, except for the factor $\exp(ik\cdot p\,\tau)$, which results from the term $p^\mu\tau$ in the expansion of $X^\mu(0, \tau)$. Now if we choose to study the massless vector vertex operator with $k^\mu = nk_0^\mu$ with integer $n$, then while acting on 'allowed' states, $k\cdot p$ is an integer and the factor $\exp(ik\cdot p\,\tau)$ is periodic in $\tau$ also. As a result, the vertex operator corresponding to transverse polarizations is

$$V^i(nk_0, \tau) = \dot X^i(\tau)\,e^{inX^+(\tau)} \tag{11.3.158}$$

Since this is periodic in the 'allowed' subspace of the Hilbert space, we can define in this subspace the Fourier components:

$$A^i_n = \frac{1}{2\pi}\int_0^{2\pi}\dot X^i(\tau)\,e^{inX^+(\tau)}\,d\tau = \frac{1}{2\pi}\int_0^{2\pi}V^i(nk_0, \tau)\,d\tau \tag{11.3.159}$$

which are indeed the DDF operators. These operators have two important properties: they commute with the $L_m$, and they obey a simple algebra. To examine the first one, we note that, since the conformal dimension of $V^i(nk_0, \tau) \equiv V(\tau)$ is $J = 1$, we have:

$$[L_m, V(\tau)] = -i\,\frac{d}{d\tau}\left(e^{im\tau}\,V(\tau)\right) \tag{11.3.160}$$

If the massless vertex operator $V(\tau)$ is periodic as described above, then from (11.3.159) it follows that $[L_m, A^i_n] = 0$, and using the $L_0$ condition (see Remarks (11.3.3) and (11.3.7)) one also has $[N, A^i_{-n}] = nA^i_{-n}$, where $N$ is defined in (11.3.147). As a result it follows that an arbitrary state of the form

$$A^{i_1}_{-n_1}A^{i_2}_{-n_2}\cdots A^{i_m}_{-n_m}|0; p_0\rangle \tag{11.3.161}$$

satisfies the Virasoro conditions and has $N = \sum_r n_r$. The algebra of the $A^i_n$ can be determined by using the commutators of the $\dot X^i(\tau)$ at unequal $\tau$. Thus:

$$[A^i_m, A^j_n] = \frac{1}{(2\pi)^2}\int_0^{2\pi}\!\!\int_0^{2\pi} d\tau\,d\tau'\,[\dot X^i(\tau), \dot X^j(\tau')]\,\exp\!\left(imX^+(\tau) + inX^+(\tau')\right) = m\,\delta^{ij}\,\delta_{m+n} \tag{11.3.162}$$

(see Exc. 7 for computation).


The above algebra is identical to that of the transverse oscillators $\alpha^i_m$. We also note that, like $\alpha^i_m$, the operators $A^i_m$ have the reality property $(A^i_m)^\dagger = A^i_{-m}$, as well as the property $A^i_m|0; p_0\rangle = 0$, $m > 0$. These facts ensure that the physical states (11.3.161) obtained from the ground state by the action of the DDF operators are all of positive norm. We recall that the states obtained in light-cone gauge by acting with transverse oscillators on tachyon states also satisfied the positive-norm condition. The states (11.3.161) generated by the $A^i_m$ operators are called the DDF states. Since for $D > 26$ there are ghost states in the physical subspace, it follows that for a general $D$ the $A^i_n$ do not generate the whole spectrum of physical states. In the following paragraphs, however, we shall show that when $D = 26$ and $a = 1$, the DDF states account for all the physical states, thus proving the absence of physical negative-norm states (ghosts) in that particular case. We note that the absence of ghosts for $D < 26$ with $a < 1$ is a simple corollary of the result for $D = 26$, $a = 1$. Consider the DDF operators defined in (11.3.159) and recall that if there are no ghosts among the 'allowed' states, then Lorentz invariance of the covariant formalism ensures that there are no ghosts in the physical Hilbert space. Denote the space of DDF states by $F$, with $|f\rangle$ being a generic element of $F$. Define the operators

$$K_m = k_0\cdot\alpha_m \tag{11.3.163}$$

where $k_0$ is the light-like vector used in the construction of the DDF states. It can be easily checked that the operators $K_m$ obey the algebra:

$$[K_m, L_n] = mK_{m+n}, \qquad [K_m, K_n] = 0 \tag{11.3.164}$$

and satisfy:

$$K_n|f\rangle = 0, \qquad n > 0 \tag{11.3.165}$$

Consider now a state obtained by the action of a product of operators $L_{-n}$ and $K_{-m}$ on a DDF state $|f\rangle$:$^{29}$

$$L_{-1}^{\lambda_1}L_{-2}^{\lambda_2}\cdots L_{-n}^{\lambda_n}\,K_{-1}^{\mu_1}\cdots K_{-m}^{\mu_m}|f\rangle \tag{11.3.166}$$

denoted symbolically as $|\{\lambda, \mu\}, f\rangle$. Further, define the sum:

$$\sum_r r\lambda_r + \sum_s s\mu_s = P \tag{11.3.167}$$

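then it can be shown that for any $P$ the states defined in (11.3.166) are linearly independent. This is done by considering the matrix of inner products of the states at a given value of $P$. Denoting this matrix by $\mathcal{M}^P$, we have:

$$\mathcal{M}^P_{\{\lambda,\mu\},\{\lambda',\mu'\}} = \langle f|\,K_m^{\mu_m}\cdots K_1^{\mu_1}\,L_n^{\lambda_n}\cdots L_1^{\lambda_1}\;L_{-1}^{\lambda'_1}\cdots L_{-n}^{\lambda'_n}\,K_{-1}^{\mu'_1}\cdots K_{-m}^{\mu'_m}\,|f\rangle \tag{11.3.168}$$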

where $\sum_r r\lambda_r + \sum_s s\mu_s = \sum_r r\lambda'_r + \sum_s s\mu'_s = P$. From (11.3.164) and (11.3.165) it is evident that $\mathcal{M}^P$ is a function only of the $K_0$ and $L_0$ values of the state $|f\rangle$ (with $K_0 = k_0\cdot\alpha_0 \neq 0$). It can be checked that $\det\mathcal{M}^P$ is non-zero for any given value of $P$, which means that the states (11.3.166) of given $P$ are linearly independent. For example, for $P = 1$ the determinant equals:

$$\det\mathcal{M}^1 = \det\begin{pmatrix} 2L_0 & K_0 \\ K_0 & 0 \end{pmatrix} = -K_0^2 \tag{11.3.169}$$

29. It is important to note that since the $L$'s do not commute, a definite order in the placement of the $L_{-r}$ (e.g., $r$ increasing from left to right) should always be maintained.

which is non-zero. For a general $P$ the rows and columns of the matrix can always be ordered in such a manner that the matrix has zeros below its minor diagonal and non-zero elements along the minor diagonal. The determinant of such a matrix is known (from elementary algebra) to be non-zero.$^{30}$ The ordering of the states $|\{\lambda, \mu\}, f\rangle$ described above is, for $P = 2$ (mass level 2), given by:

$$L_{-1}^2, \quad L_{-2}, \quad L_{-1}K_{-1}, \quad K_{-2}, \quad K_{-1}^2 \tag{11.3.170}$$

Remark 11.3.32 Note that to evaluate an inner product (11.3.168), we commute $L$'s and $K$'s past each other; the number of $K$'s, however, can never be reduced in this process. Therefore, to avoid the $K$'s simply annihilating the conjugate end states, we have to ensure that there are enough $L$'s to turn all of the $K$'s into factors of $K_0$.$^{31}$ The arrangement of the states given in (11.3.170), which puts the matrix of inner products in upper triangular form, can be generalized to higher mass levels by following the rule given in the next remark.

Remark 11.3.33 Note that we have used $\{\mu\}$ or $\{\lambda\}$ to denote a collection of $K$'s or $L$'s respectively. We now define an ordering between two collections of $L$'s as:

$$\begin{aligned}
\{\lambda\} > \{\lambda'\} \quad &\text{if} \quad \sum_r r\lambda_r > \sum_r r\lambda'_r, \\
&\text{or} \quad \sum_r r\lambda_r = \sum_r r\lambda'_r \ \text{ and } \ \lambda_1 > \lambda'_1, \\
&\text{or} \quad \sum_r r\lambda_r = \sum_r r\lambda'_r \ \text{ and } \ \lambda_1 = \lambda'_1,\ \lambda_2 > \lambda'_2, \ \text{ etc.}
\end{aligned} \tag{11.3.171}$$

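where $\lambda_1$, $\lambda_2$, etc., are the superscripts (exponents) of the $L$'s as given in (11.3.168). The rule that we use for combined collections $\{\lambda, \mu\}$ of $L$'s and $K$'s is:

$$\{\lambda, \mu\} < \{\lambda', \mu'\} \quad \text{if} \quad \{\lambda\} < \{\lambda'\}, \quad \text{or} \quad \{\lambda\} = \{\lambda'\} \ \text{ and } \ \{\mu\} > \{\mu'\} \tag{11.3.172}$$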

It can be easily checked that the ordering given in (11.3.170) is a special case of this rule, and that this rule always gives $\mathcal{M}$ of the desired form, with zero elements everywhere below the minor diagonal and $K_0$ along it. This is seen (trivially) in (11.3.169). We further note that the nonsingularity of $\mathcal{M}$ depends crucially on the presence of the $K_{-n}$, since the matrix for states constructed only from the $L_{-n}$ leads to a determinant (the Kac determinant) which can be singular. In the following paragraph (Remark (11.3.34)) we show that the states (11.3.166), where $|f\rangle$ runs over all DDF states and $\{\lambda, \mu\}$ runs over all collections of $L$'s and $K$'s, are linearly independent.

30. A $(4 \times 4)$ matrix of the form described above looks like

$$A = \begin{pmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ A_{21} & A_{22} & A_{23} & 0 \\ A_{31} & A_{32} & 0 & 0 \\ A_{41} & 0 & 0 & 0 \end{pmatrix},$$

with $\det A \neq 0$ when the entries on the minor diagonal are non-zero.

31. If $|f\rangle$ and $|f'\rangle$ are DDF states, a matrix element $\langle f'|K_{-1}^{\mu_1}\cdots K_{-r}^{\mu_r}|f\rangle = 0$ unless $\mu_1 = \mu_2 = \cdots = \mu_r = 0$; this is so in view of (11.3.163) and (11.3.165).

Remark 11.3.34 Let $|f\rangle$ and $|g\rangle$ be two different DDF states with $\langle f|g\rangle = 0$, chosen to be eigenstates of $L_0$. Further, let $|\tilde f\rangle = |\{\lambda, \mu\}, f\rangle$ and $|\tilde g\rangle = |\{\lambda', \mu'\}, g\rangle$ be the states obtained from $|f\rangle$ and $|g\rangle$ with strings of $L$'s and $K$'s. Write $|\tilde f\rangle$ and $|\tilde g\rangle$ explicitly as in (11.3.166) and simplify $\langle\tilde f|\tilde g\rangle$ by commuting $L_{-n}$ and $K_{-n}$ to the left and $L_n$ and $K_n$ to the right (for $n > 0$). This shows that $\langle\tilde f|\tilde g\rangle$ is indeed a multiple of $\langle f|g\rangle$. As a result

$$\langle\tilde f|\tilde g\rangle = 0 \tag{11.3.173}$$

In other words, the states $\{|\tilde f\rangle\}$ that we constructed using (11.3.166) are linearly independent. The above remark implies that every state in the bosonic string Fock space can be expressed as a linear combination of states of the form (11.3.166). Now any state in the Fock space can be written in terms of states of the form:

$$\prod_{\mu=0}^{25}\prod_{n=1}^{\infty}\left(\alpha^\mu_{-n}\right)^{\epsilon_{n,\mu}}|0\rangle \tag{11.3.174}$$

This shows that infinitely many operators come into play here. We note that, although the total number of states in Fock space is infinite, there are only finitely many with a given eigenvalue of

$$N = \sum_{\mu=0}^{25}\sum_{n=1}^{\infty}\alpha^\mu_{-n}\,\alpha_{n\mu} \tag{11.3.175}$$

The states in (11.3.174) are linearly independent and they are eigenstates of $N$ with eigenvalue:

$$\langle N\rangle = \sum_{n,\mu} n\,\epsilon_{n,\mu} \tag{11.3.176}$$

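We now note that a general state (11.3.166) is of the form

$$\prod_{n=1}^{\infty} L_{-n}^{\lambda_n}\,K_{-n}^{\mu_n}\prod_{i=1}^{24}\left(A^i_{-n}\right)^{\beta_{n,i}}|0\rangle \tag{11.3.177}$$

for some $\lambda_n$, $\mu_n$ and $\beta_{n,i}$; these states are also eigenstates of $N$, with eigenvalue

$$\langle N\rangle = \sum_n n\left(\lambda_n + \mu_n + \sum_{i=1}^{24}\beta_{n,i}\right) \tag{11.3.178}$$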

Comparing the two expressions for $\langle N\rangle$, it is evident that the number of states of the form (11.3.174) of given $N$ is exactly the same as the number of states of the form (11.3.166) pertaining to the same $N$. This equality is due to the fact that the combinatorics of 26 $\epsilon$'s is the same as that of one $\lambda$, one $\mu$ and 24 $\beta$'s. Since the states of the form (11.3.166) are linearly independent and are as numerous (at each mass level) as the states of the form (11.3.174), it follows that the former states can be taken to form a basis for the Hilbert space. This fact eventually leads to the no-ghost theorem, as it enables one to show that there are no negative-norm states in the physical Hilbert space. We close this subsection with some guidelines on the proof of this theorem (Comment (11.3.35)) and refer the interested reader to Sec. (2.3) of [13a].


Comment 11.3.35 As the states (11.3.166) are a basis for the Fock space, any (general physical) state $|\phi\rangle$ in the Fock space can be written as:

$$|\phi\rangle = |s\rangle + |k\rangle \tag{11.3.179}$$

where $|s\rangle \in S$ is a spurious state (defined earlier, see Def. (11.3.11)) and $|k\rangle \in K$ is a state of the form:

$$\prod_{n=1}^{\infty}\left(K_{-n}\right)^{\mu_n}|f\rangle \tag{11.3.180}$$

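i.e., a state obtained only by the action of strings of $K$'s on the DDF state $|f\rangle$. This decomposition is possible since every state of the type (11.3.166) either has some $L$'s in its expansion, and so is spurious, or has no $L$'s, and so belongs to $K$. It is also unique, as the states (11.3.166) are linearly independent. Since, for $D = 26$ and $a = 1$, $|s\rangle$ and $|k\rangle$ can be shown to be separately physical, and a spurious physical state has zero norm and is orthogonal to every physical state, we have

$$\langle\phi|\phi\rangle = \langle s|s\rangle + \langle s|k\rangle + \langle k|s\rangle + \langle k|k\rangle = \langle k|k\rangle \tag{11.3.181}$$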

Further, a general state in $K$ can be written as:

$$|k\rangle = |f\rangle + |\tilde k\rangle \tag{11.3.182}$$

where $|\tilde k\rangle$ stands for:$^{32}$

$$\sum_a {\prod_n}'\left(K_{-n}\right)^{\mu_{n,a}}|f_a\rangle \tag{11.3.183}$$

From the elementary properties of the states in $K$ and of the DDF states, we have $\langle\tilde k|\tilde k\rangle = \langle f|\tilde k\rangle = 0$. As a result, $\langle k|k\rangle = \langle f|f\rangle \geq 0$, where equality holds if and only if $|f\rangle = 0$. In view of this, a general physical state $|\phi\rangle$ cannot have negative norm; in other words, when $D = 26$ and $a = 1$, the physical Hilbert space is ghost-free. Moreover, if $|k\rangle$ in (11.3.182) is physical, then in view of the commutator relation $[L_m, K_n] = -nK_{m+n}$ and the fact that the $L_m$ for $m > 0$ annihilate the DDF states $|f\rangle$ and $|f_a\rangle$, it follows that $|\tilde k\rangle = 0$ and $|k\rangle = |f\rangle$. Hence the general physical state $|\phi\rangle$ can eventually be written as:

$$|\phi\rangle = |f\rangle + |s\rangle \tag{11.3.184}$$

where $|f\rangle$ is a DDF state and $|s\rangle$ is a spurious physical state. It is interesting to note here that the transformation $|f\rangle \to |f\rangle + |s\rangle$ is a 'string-theoretic analog of a gauge transformation.' The expression of \ 0 (see [13a] for details).

(11.3.185)

32. The $|f_a\rangle$'s are DDF states. The primed $\prod$ indicates that not all of the $\mu_{n,a}$'s are zero.


3.11

The Spectrum of Physical States and its Analysis

We conclude this section with a few remarks on the analysis of the spectrum that has been generated by using the DDF and Virasoro operators. To begin with, we note that the DDF operators, which have $(D - 2)$ components, can be extended to operators with $D$ components by using the vertex operator with the transverse index replaced by $+$ or $-$. This gives the longitudinal operators:

$$A^+_n = \frac{1}{2\pi}\int_0^{2\pi}V^+(nk_0, \tau)\,d\tau = \frac{1}{2\pi}\int_0^{2\pi}:\dot X^+ e^{inX^+}:\,d\tau = \delta_n \tag{11.3.186}$$

and

$$A^-_n = \frac{1}{2\pi}\int_0^{2\pi}V^-(nk_0, \tau)\,d\tau = \frac{1}{2\pi}\int_0^{2\pi}:\dot X^- e^{inX^+}:\,d\tau \tag{11.3.187}$$

While the first one, $A^+_n$, is trivial, the integrand in $A^-_n$ does not have conformal dimension $J = 1$, as can be seen from the following commutator:

$$[L_m, V^-(nk_0, \tau)] = e^{im\tau}\left(-i\frac{d}{d\tau} + m\right)V^-(nk_0, \tau) + \frac{1}{2}\,nm^2\,e^{im\tau}e^{inX^+} \tag{11.3.188}$$

We note that $V^-$ fails to have $J = 1$ due to the presence of the second, anomalous term; this term has the same origin as the anomalous dimension $k^2/2$ of the vertex operator $:\exp(ik\cdot X):$. To overcome this problem, we replace the vertex operator in (11.3.187) by $\tilde V^-(nk_0, \tau)$ to define $\tilde A^-_n$. This is obtained from:

$$\tilde V^\mu(k, \tau) = \,:\dot X^\mu e^{ik\cdot X}: \, + \, \frac{1}{2}\,ik^\mu\,\frac{d}{d\tau}\left(\log k\cdot\dot X\right)e^{ik\cdot X} \tag{11.3.189}$$

which has conformal dimension $J = 1$. It is important to note here that the only nonzero component of $k^\mu$ is $k^- = -n$. Hence (11.3.189) reduces to (11.3.158) for $\mu = i$ ($i = 1, \ldots, D - 2$), and for $\mu = -$ it gives:

$$\tilde V^-(nk_0, \tau) = \,:\dot X^- e^{inX^+}: \, - \, \frac{1}{2}\,in\,\frac{d}{d\tau}\left(\log\dot X^+\right)e^{inX^+} \tag{11.3.190}$$

The operator

$$\tilde A^-_n = \frac{1}{2\pi}\int_0^{2\pi}d\tau\,\tilde V^-(nk_0, \tau) \tag{11.3.191}$$

is now suitable to generate physical states. It can be checked that the algebra of the operators $A^i_m$ and $\tilde A^-_n$ obeys the following relations:

$$[A^i_m, A^j_n] = m\,\delta^{ij}\,\delta_{m+n} \tag{11.3.192}$$

$$[\tilde A^-_m, A^i_n] = -nA^i_{m+n} \tag{11.3.193}$$

$$[\tilde A^-_m, \tilde A^-_n] = (m - n)\,\tilde A^-_{m+n} + 2m^3\,\delta_{m+n} \tag{11.3.194}$$
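The c-number term $2m^3\delta_{m+n}$ in (11.3.194) can be compared with the anomaly term of the light-cone operators $p^+\alpha^-_n$ in (11.3.148), namely $\frac{D-2}{12}(m^3 - m) + 2am$. The following sketch (the function name is illustrative) checks that the two agree for $D = 26$, $a = 1$, and not for other sample values.

```python
from fractions import Fraction

def light_cone_anomaly(m, D, a):
    """c-number coefficient of delta_{m+n} in (11.3.148): (D-2)/12 (m^3 - m) + 2 a m."""
    return Fraction(D - 2, 12) * (m**3 - m) + 2 * a * m

# At D = 26, a = 1 this reduces to 2 m^3, the central term of (11.3.194).
print(all(light_cone_anomaly(m, 26, 1) == 2 * m**3 for m in range(-20, 21)))

# Away from the critical values the two terms differ, e.g. at m = 2:
print(light_cone_anomaly(2, 25, 1), light_cone_anomaly(2, 26, 0))
```

This is only a consistency check between the two formulas as written here; it does not replace the derivation of either algebra.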

These spectrum generating operators $A^i_m$ and $\tilde A^-_n$, together with the Virasoro operators $L_m$, generate all the states in the string Fock space. As for the analysis of this spectrum, we refer the reader to [13a] for details and list a few important points as follows.

Remark 11.3.37 The light-cone-gauge description of string states has the great advantage of giving the physical states explicitly in a positive-definite Hilbert space, but it presents the states only as multiplets of SO(D - 2), the transverse rotation group, even though Lorentz invariance for $D = 26$ guarantees that the massive levels fill out complete multiplets of SO(D - 1). In the next remark we shall see (for low mass levels) how the SO(24) multiplets fit together into SO(25) multiplets as they rotate into one another under suitable Lorentz transformations.

Remark 11.3.38 Consider the open-string states, where the only state with $\alpha' M^2 = -1$ is the ground state tachyon and the only states with $\alpha' M^2 = 0$ are the 24 polarization states of the massless vector boson. The first states with positive $M^2$ occur at $\alpha' M^2 = 1$. These are $(D - 2)$ and $\frac{1}{2}(D - 2)(D - 1)$ states, written explicitly as:

$$\alpha^i_{-2}|0; p\rangle \qquad\text{and}\qquad \alpha^i_{-1}\alpha^j_{-1}|0; p\rangle \tag{11.3.195}$$

We note that the sum $(D - 2) + \frac{1}{2}(D - 2)(D - 1) = \frac{1}{2}(D - 2)(D + 1) = 324$ is the dimensionality of the symmetric traceless second-rank tensor representation of SO(D - 1) (Young diagram $\square\!\square$). This representation is called the 'spin two' representation. Similarly, at the mass level $\alpha' M^2 = 2$ the possible states are:

$$\alpha^i_{-3}|0\rangle, \qquad \alpha^i_{-2}\alpha^j_{-1}|0\rangle \qquad\text{and}\qquad \alpha^i_{-1}\alpha^j_{-1}\alpha^k_{-1}|0\rangle \tag{11.3.196}$$

Thus in all there are $24 + 576 + 2600 = 3200$ states. The SO(25) representations in this case are the symmetric traceless third-rank tensor $\square\!\square\!\square$ (2900) and the antisymmetric second-rank tensor (300).

Remark 11.3.39 For any mass level $M$ the maximum 'spin' is given by the number $n$ that satisfies $n = \alpha' M^2 + 1$. We note that the $n$-th rank symmetric traceless tensor representation is built from portions of $\alpha^{i_1}_{-1}\cdots\alpha^{i_n}_{-1}|0\rangle$ plus other terms that are required to complete the SO(25) multiplet. A decomposition in terms of an SO(3) subgroup gives one term of spin $n$, which is the unique highest spin state at this mass level. Hence, as in the case of classical bosonic strings, here also (i.e., using the quantum approach) the spin $J$ obeys the inequality:

$$J \leq \alpha' M^2 + 1 \tag{11.3.197}$$
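The state counts quoted in Remark 11.3.38 can be verified with a few binomial coefficients. The sketch below uses the standard dimension formulas for symmetric, symmetric traceless and antisymmetric tensors of SO(n); these formulas and the function names are supplied here for illustration and are not taken from the text.

```python
from math import comb

def sym_dim(n, rank):
    """Dimension of the rank-`rank` symmetric tensors in n dimensions."""
    return comb(n + rank - 1, rank)

def sym_traceless_dim(n, rank):
    """Symmetric tensors with the trace (a symmetric tensor of rank - 2) removed."""
    return sym_dim(n, rank) - (sym_dim(n, rank - 2) if rank >= 2 else 0)

def antisym_dim(n, rank):
    """Dimension of the rank-`rank` antisymmetric tensors in n dimensions."""
    return comb(n, rank)

D = 26
t = D - 2  # number of transverse directions

# alpha' M^2 = 1: alpha_{-2}^i and the symmetric pairs alpha_{-1}^i alpha_{-1}^j
level_1 = t + sym_dim(t, 2)
# alpha' M^2 = 2: alpha_{-3}^i, alpha_{-2}^i alpha_{-1}^j, symmetric triples of alpha_{-1}
level_2 = t + t * t + sym_dim(t, 3)

print(level_1, sym_traceless_dim(D - 1, 2))
print(level_2, sym_traceless_dim(D - 1, 3) + antisym_dim(D - 1, 2))
```

In both cases the number of light-cone states built from SO(24) oscillators matches the total dimension of the SO(25) multiplets named in the remark.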

Our final remark on the string spectrum concerns closed strings.

Remark 11.3.40 The spectrum of closed strings can be (easily) deduced from that of the open strings. We note that it is in the closed string sector that one finds the graviton, the massless spin two state. In the light-cone gauge these strings are described by two sets of transverse oscillators, $\{\alpha^i_n\}$ and $\{\tilde\alpha^i_n\}$, corresponding to left and right movers. Moreover, due to the restriction $L_0 = \tilde L_0$, there is an equal amount of excitation of the left and right movers, i.e.,

$$\sum_{n=1}^{\infty}\alpha^i_{-n}\alpha^i_n = \sum_{n=1}^{\infty}\tilde\alpha^i_{-n}\tilde\alpha^i_n \tag{11.3.198}$$

As a result the closed-string multiplet with $\alpha' M^2 = 4(N - 1)$ is given by the tensor product of the open-string multiplet having $\alpha' M^2 = N - 1$ with itself. For example, we saw that the ground state in the case of the open string was a tachyon with $\alpha' M^2 = -1$; in view of the above rule it is a scalar tachyon here with $\alpha' M^2 = -4$. The next level for open strings ($\alpha' M^2 = 0$) consists of the 24 polarization states of the massless vector boson. In the closed-string case we have a set of massless states of the form:

$$|\Omega^{ij}\rangle = \alpha^i_{-1}\tilde\alpha^j_{-1}|0\rangle \tag{11.3.199}$$

with SO(24) quantum numbers, which correspond to the tensor product of a massless vector of SO(24) from the left-moving modes with a massless vector of SO(24) from the right-moving modes. The portion of $|\Omega^{ij}\rangle$ which is symmetric in $i$ and $j$ and traceless transforms under SO(24) as a massless spin two particle, which is the graviton. The trace term $\delta^{ij}|\Omega^{ij}\rangle$ is a massless scalar called the dilaton. The antisymmetric part $|\Omega^{ij}\rangle - |\Omega^{ji}\rangle$ transforms under SO(24) as an antisymmetric second rank tensor.$^{33}$ The method described above can be continued to closed-string states of positive mass squared by taking suitable tensor products of the left- and right-moving open-string Hilbert spaces. For instance, the level $\alpha' M^2 = 4$ has representations given by the decomposition of $\square\!\square \times \square\!\square$. Finally, we note that the spectrum described above is that of oriented closed strings, often referred to as the spectrum of the extended Shapiro-Virasoro model (see Ref. [4d]), and that it is equally possible to restrict the spectrum to states that correspond to an unoriented string.

Exercise 11.3

1. Consider a closed string at rest at time t = 0 (Fig. (11.13)(a)). The string is in the form of a circle of radius R in the x-y plane, with its arclength proportional to σ at τ = t = 0, thus:

(a) $x = R\cos 2\sigma, \qquad y = R\sin 2\sigma.$

Show that if we assume that $t = 2R\tau$ near t = τ = 0, then it satisfies the equation of motion (11.3.26), as well as the constraint eqs. (11.3.34). Show further that for this string $p^0 = 2\pi R T$, which confirms that T is the energy per unit length of the string.

2. Consider an open string spinning in the x-y plane according to the formula

(a) $x = A\cos\tau\cos\sigma, \qquad y = A\sin\tau\cos\sigma, \qquad t = A\tau.$

Show that it obeys the equations of motion (11.3.26) and the constraint equations (11.3.34), and that the energy and the angular momentum are respectively $\pi A T$ and $\pi A^2 T/2$. Using this information show that $\alpha'$ is the Regge slope and that the endpoints of the string move at the speed of light.

33 All these have their counterparts in supersymmetric string theory, and they all play a fundamental role.

Fig. 11.13 (a) A closed string initially at rest with radius R. (b) An open string that is spinning in the x-y plane.

3. Show that the infinitesimal transformation $\tau \to \tau' = \tau + \varepsilon(\tau)$ leads to the transformation law (11.3.121) for a field of conformal dimension J.

4. Show that a spurious state can always be expressed as an infinite sum (11.3.104) or as a sum of two terms given in (11.3.106).

5. Use an operator of conformal dimension 1 to obtain a physical state from a given physical state.

6. Verify the Poincaré algebra relations:

$$[p^\mu, p^\nu] = 0; \qquad [p^\mu, J^{\nu\rho}] = -i\eta^{\mu\nu}p^\rho + i\eta^{\mu\rho}p^\nu;$$

$$[J^{\mu\nu}, J^{\rho\lambda}] = -i\eta^{\nu\rho}J^{\mu\lambda} + i\eta^{\mu\rho}J^{\nu\lambda} + i\eta^{\nu\lambda}J^{\mu\rho} - i\eta^{\mu\lambda}J^{\nu\rho}.$$

7. Obtain the commutator relation (11.3.162) for the DDF operators $A^i_m$.

Hints to Exercise 11.3

1. The first part of this exercise can be verified directly. To prove the second part we make use of the first equation of (11.3.54). This gives $p^0 = 2\pi R T$,³⁴ and since $p^0$ is the momentum of the zero mode, this implies that T is the energy per unit length of the string.

2. Verification of the first part is simple calculus. To check the second part we use the expressions given in (11.3.54) and the subsequent relations (11.3.56) and (11.3.58); we note that the energy of this configuration is $\pi A T$ and the angular momentum is $\pi A^2 T/2$. Hence the maximum angular momentum per unit energy squared is $1/(2\pi T)$. In view of (11.3.44) this confirms the interpretation of $\alpha' = 1/(2\pi T)$ as the Regge slope. Furthermore, at the string end points $|dx/dt|^2 + |dy/dt|^2 = 1$; since we are taking the speed of light c = 1 in our units, it follows that these end points move with the speed of light. We emphasize that this is a consequence of the boundary condition $X'^\mu = 0$ (see (11.3.48)) together with the constraint eqs. (11.3.34).

3. By definition an operator A(τ) is said to have conformal dimension J when under an arbitrary change $\tau \to \tau'(\tau)$ it satisfies the transformation relation: (i)

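$$A'(\tau') = \left(\frac{d\tau}{d\tau'}\right)^{J} A(\tau).$$

Now the transformation $\tau' = \tau + \varepsilon(\tau)$ for small $\varepsilon(\tau)$ gives, to first order: (ii)

$$\left(\frac{1}{1 + d\varepsilon/d\tau}\right)^{J} \approx 1 - J\,\frac{d\varepsilon}{d\tau}.$$

Also, since $\varepsilon(\tau)$ is small, we have:

34 Note that μ takes the values 0 and 1 here.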


(iii)

$$A'(\tau') = A'(\tau + \varepsilon(\tau)) \approx A'(\tau) + \varepsilon\,\frac{dA}{d\tau}.$$

We use (ii) and (iii) in rewriting (i) and then simplify it to obtain the required result, i.e. the condition (11.3.121).

4. We note that if $|\psi\rangle$ is a spurious state, then there exists an operator $O = |\psi\rangle\langle\psi|$ which annihilates all physical states. Since the only restriction on general physical states is that they are annihilated by $L_m$ for m > 0, it follows that O can be written as:

(i) for some operators X_n. Comparing |y/) (I/A| with the RHS of (i), we note that \y/) has a representation (11.3.104) where the states \%n) can be expressed in terms of operators X_n using the relation: Xn = X \ V

(")

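Moreover, since for n ≥ 3 the $L_{-n}$ can be represented as iterated commutators of $L_{-1}$ and $L_{-2}$, e.g. $L_{-3} \sim [L_{-1}, L_{-2}]$, a spurious state can be written as in (11.3.106), where $|\chi_n\rangle$ obeys (11.3.105) for every n.

5. Let $|\phi\rangle$ be a physical state, so that (i)

$$(L_m - a\,\delta_m)|\phi\rangle = 0, \qquad m \ge 0.$$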

If an operator A(τ) has conformal dimension J = 1, then from (11.3.124), $[L_m, A_n] = \{m(J - 1) - n\}A_{m+n}$, it follows that for n = 0 (i.e., for the zero Fourier mode $A_0$) (ii)

$$[L_m, A_0] = 0.$$

This implies (iii)

$$[L_m, A_0]|\phi\rangle = (L_m A_0 - A_0 L_m)|\phi\rangle = 0.$$

Hence $L_m A_0|\phi\rangle = A_0 L_m|\phi\rangle$, and by (i) the state $A_0|\phi\rangle$ satisfies the same conditions as $|\phi\rangle$; it is therefore a physical state, say $|\chi\rangle$. Thus, using an operator A(τ) of definite conformal dimension J = 1, we have obtained a new physical state from an old one.

6. Use Eqs. (11.3.77-81) and (11.3.94) to obtain these relations.

7. Use the mode expansion $\dot X^i(\tau) = \sum_m \alpha^i_m e^{-im\tau}$ to write (see (11.3.50-51)): (i)

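$$[\dot X^i(\tau), \dot X^j(\tau')] = 2\pi i\,\delta^{ij}\,\delta'(\tau - \tau').$$

Since $V^i(nk_0, \tau)$ (the important ingredient in the definition of $A^i_n$) equals $\dot X^i(\tau)\,e^{inX^+(\tau)}$ (where $X^+$ is known to commute with itself and with the $X^i$, even at unequal τ), we can write the commutator for the $A^i_n$'s as (see (11.3.159)-(11.3.160)): (ii)

$$[A^i_m, A^j_n] = \frac{1}{(2\pi)^2}\int_0^{2\pi}\! d\tau\, d\tau'\, [\dot X^i(\tau), \dot X^j(\tau')]\, \exp\!\big(imX^+(\tau) + inX^+(\tau')\big) = \frac{m}{2\pi}\,\delta^{ij}\int_0^{2\pi}\! d\tau\, \dot X^+(\tau)\, \exp\!\big(i(m+n)X^+(\tau)\big) = m\,\delta^{ij}\,\delta_{m+n}.$$

(Note that we have used here the commutativity of $X^+$ and $X^j$ and the fact that $p^+ = 1$.)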


4

COVARIANT QUANTIZATION FROM A MODERN POINT OF VIEW

Recall that in Chapter 9 we briefly mentioned BRS techniques. We revisit them here, as they are important to the study of questions of symmetry not only in particle field theory but also in string theory. This is done by introducing ghosts into the theory through the quantization procedure, followed by the introduction of BRST operators.³⁵ We note that important concepts such as the cancellation of Virasoro anomalies can be studied only after ghosts and antighosts have been incorporated in the theory. Using the presence of ghosts, we are also able to set up a correspondence between Bose and Fermi theories.

In Chapter 9 we have already seen the advantage of path-integral quantization in systems with local symmetries. Now the propagation of a string in a background (which is space-time here) is governed by a two-dimensional field theory that has local reparametrization invariance; hence the path-integral formalism is a natural tool in the theory of strings and superstrings. We note that the free field theory action

$$S_0[X] = -\frac{1}{2\pi}\int d^2\sigma\, \partial_\alpha X^\mu\, \partial^\alpha X_\mu$$

(11.4.1)

along with the required constraints, describes the propagation of a free string. The action (11.4.1) is a gauge-fixed form of the action:

$$S[h, X] = -\frac{1}{2\pi}\int d^2\sigma\, \sqrt{h}\, h^{\alpha\beta}\, \partial_\alpha X^\mu\, \partial_\beta X_\mu$$

(11.4.2)

It is this action which we shall use to write the path integral:³⁶

$$Z = \int Dh(\sigma)\, DX(\sigma)\, e^{-S[h, X]}$$

(11.4.3)

and then apply the (modern) Faddeev-Popov techniques for our further study.

4.1

Faddeev-Popov Ghosts and Virasoro Generators

We note that in (11.4.3) the integral $\int Dh(\sigma)$ stands for an integral over the three independent components $h_{++}(\sigma)$, $h_{--}(\sigma)$ and $h_{+-}(\sigma)$ (in light-cone coordinates), and that the integrand has apparent symmetries. In order to preserve these symmetries and avoid the introduction of anomalies, it is essential to define a suitable measure. To accommodate the three gauge invariances (two reparametrizations and the Weyl scaling), we choose a gauge slice with particular choices for each of the three functions in $h_{\alpha\beta}(\sigma)$. Thus we begin by imposing the gauge choice:

$$h_{\alpha\beta} = e^{\phi}\,\eta_{\alpha\beta}$$

(11.4.4)

35

The BRST operators, named after C. Becchi, A. Rouet, R. Stora and I. V. Tyutin, result from the quantization method suggested by these authors (see Ref. [Ad]). This method combines the best aspects of the Gupta-Bleuler and light-cone quantization methods: it is Lorentz-invariant like the first, and it easily extracts the physical states and the D = 26 constraint like the second.

36

We are using $e^{-S}$ in place of $e^{iS}$ to keep conformity with the notation of [13].


which leads to:

$$h_{++} = h_{--} = 0$$

(11.4.5)

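in light-cone coordinates. Under a world-sheet reparametrization $\sigma^+ \to \sigma^+ + \xi^+$, $\sigma^- \to \sigma^- + \xi^-$, the metric components fixed by these gauge conditions transform as:³⁷

$$\delta h_{++} = 2\nabla_+\xi_+, \qquad \delta h_{--} = 2\nabla_-\xi_-.$$

(11.4.6)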

To impose a gauge condition in the path integral (11.4.3), we begin with the group G of reparametrizations of the string world-sheet Σ and denote by Dg (g ∈ G) an integration over the group manifold. The metric h transformed by a reparametrization g is denoted by $h^g$. Using these we now write the identity that serves as a tool in obtaining the gauge-fixed path integral:

$$1 = \int Dg(\sigma)\, \delta(h^g_{++})\, \delta(h^g_{--})\, \det(\delta h^g_{++}/\delta g)\, \det(\delta h^g_{--}/\delta g)$$

(11.4.7)

where $\det(\delta h^g_{++}/\delta g)$ and $\det(\delta h^g_{--}/\delta g)$ are the usual gauge-fixing determinants. It is these determinants that make the above integral equal to one. Inserting 1 in the form (11.4.7) into the path integral (11.4.3) we have:

$$Z = \int Dg(\sigma) \int Dh(\sigma)\, DX(\sigma)\, e^{-S[h, X]}\, \delta(h^g_{++})\, \delta(h^g_{--})\, \det(\delta h^g_{++}/\delta g)\, \det(\delta h^g_{--}/\delta g)$$

(11.4.8)

The reparametrization invariance of the action S ($S[h, X] = S[h^g, X]$) implies that the integrand in (11.4.8) depends on h and g only in the combination $h^g$; hence the integrand can be simplified by changing the integration variables from g and h to g and $h' = h^g$, and by discarding the $\int Dg$ integral, which represents an infinite multiplicative factor. The gauge-fixed path integral now is:

$$Z = \int Dh'(\sigma)\, DX(\sigma)\, e^{-S[h', X]}\, \delta(h'_{++})\, \delta(h'_{--})\, \det(\delta h'_{++}/\delta g)\, \det(\delta h'_{--}/\delta g)$$

(11.4.9)

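The delta functions in (11.4.9) simply mean that the integral $\int Dh'$ reduces to an integral over $h'_{+-}$, or equivalently to an integral over the conformal factor φ. Now $\delta h'_{++}/\delta g = \delta h'_{++}/\delta\xi_+$ and $\delta h'_{--}/\delta g = \delta h'_{--}/\delta\xi_-$, and this in view of (11.4.6) gives (dropping the constant factor 2, which only rescales the determinant):

$$\frac{\delta h'_{++}(\sigma)}{\delta\xi_+(\sigma')} = \nabla_+\,\delta(\sigma - \sigma')$$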

(11.4.10)

37 This formula is similar to the one in general relativity where, under an infinitesimal coordinate transformation, the metric tensor satisfies the transformation rule $\delta h_{\alpha\beta} = \nabla_\alpha\xi_\beta + \nabla_\beta\xi_\alpha$, with ∇ being the covariant derivative (see Chapters 0 and 8).


The δ-function in (11.4.10) is just the identity operator in coordinate space; in fact, to calculate the determinants in (11.4.9) one needs the determinants of the operators $\nabla_\pm$. To represent the first determinant there, we introduce an anticommuting ghost $c^-$ and an antighost $b_{--}$ and write:

$$\det\left(\frac{\delta h'_{++}}{\delta g}\right) = \int Dc^-(\sigma)\, Db_{--}(\sigma)\, \exp\left\{-\frac{1}{\pi}\int d^2\sigma\, c^-\,\nabla_+ b_{--}\right\}$$

(11.4.11)

Similarly, the second determinant in (11.4.9) is represented as an integral over a ghost $c^+$ and an antighost $b_{++}$, thus:

$$\det\left(\frac{\delta h'_{--}}{\delta g}\right) = \int Dc^+(\sigma)\, Db_{++}(\sigma)\, \exp\left\{-\frac{1}{\pi}\int d^2\sigma\, c^+\,\nabla_- b_{++}\right\}$$

(11.4.12)

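Since h can be solved in terms of the conformal factor φ, the path integral (11.4.9) now takes the form:

$$Z = \int D\phi(\sigma) \int DX(\sigma)\, Dc(\sigma)\, Db(\sigma)\, e^{-S(X, b, c)}$$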

(11.4.13)

The action in (11.4.13) includes the ghost terms defined above as well as the free field action of (11.4.1). We note that the Dφ integral, which gives rise to an infinite factor, can be regarded as irrelevant in 26 dimensions, as it decouples there (see Remark (11.4.10)). Furthermore, it is in 26 dimensions that the c-number anomaly in the Virasoro algebra cancels if the ghost contributions are included.

In order to understand the ghosts that have been introduced in the integrals above, we recall the principles of complex tensor calculus (see Chapter 1) on the world-sheet. Using a complex coordinate $z = \tau + i\sigma$ and its conjugate $\bar z = \tau - i\sigma$, we can write the metric $ds^2 = e^{\phi}(d\sigma^2 + d\tau^2)$ on a world-sheet as:³⁸

$$ds^2 = e^{\phi}\, dz\, d\bar z$$

(11.4.14)

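In the $z$, $\bar z$ coordinate system, the components of a vector are:

$$t^{\pm} = t^0 \pm i\,t^1$$

(11.4.15)

and those of the gradient $\partial/\partial\sigma^\alpha$ are:

$$\partial_{\pm} = \frac{1}{2}\left(\frac{\partial}{\partial\tau} \mp i\,\frac{\partial}{\partial\sigma}\right)$$

(11.4.16)

From (11.4.14) it is evident that the metric components are $h_{++} = h_{--} = 0$ and $h_{+-} = h_{-+} = \tfrac12 e^{\phi}$, and as such the indices are raised and lowered by the rule:

$$t_+ = \tfrac12 e^{\phi}\, t^-, \qquad t_- = \tfrac12 e^{\phi}\, t^+$$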

(11.4.17)

38 We note that a world-sheet can always be given the metric $ds^2 = e^{\phi}(d\sigma^2 + d\tau^2)$, at least locally. We further recall that on a world-sheet with Minkowski signature z and $\bar z$ were referred to as $\sigma^+$ and $\sigma^-$.

A change of coordinates $z \to \tilde z = f(z)$, where f is a holomorphic function of z, preserves the conformally flat form of the metric; thus if $\rho = e^{\phi}$, we have:

$$\rho \to \rho' = \left|\frac{d\tilde z}{dz}\right|^{-2}\rho$$

(11.4.18)

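A general tensor t is transformed as:

$$t \to \left(\frac{d\tilde z}{dz}\right)^{n_u - n_l}\left(\frac{d\bar{\tilde z}}{d\bar z}\right)^{\tilde n_u - \tilde n_l}\, t$$

(11.4.19)

where $n_u$ ($\tilde n_u$) and $n_l$ ($\tilde n_l$) are the numbers of upper and lower holomorphic (antiholomorphic) indices, whereas $n_l - n_u$ and $\tilde n_l - \tilde n_u$ are the holomorphic and antiholomorphic conformal dimensions of the tensor t. The Christoffel connection $\Gamma^\alpha_{\beta\gamma}$ for this conformally flat metric simplifies to only two nonzero components, given as:

$$\Gamma^{+}_{++} = \partial_+\phi$$

(11.4.20)

$$\Gamma^{-}_{--} = \partial_-\phi$$

(11.4.21)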

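Thus the covariant derivative of a tensor with n lower or upper + indices has the expression:

$$\nabla_+ t_{++\cdots+} = (\partial_+ - n\,\partial_+\phi)\,t_{++\cdots+}, \qquad \nabla_- t_{++\cdots+} = \partial_- t_{++\cdots+}$$

(11.4.22)

$$\nabla_+ t^{++\cdots+} = (\partial_+ + n\,\partial_+\phi)\,t^{++\cdots+}, \qquad \nabla_- t^{++\cdots+} = \partial_- t^{++\cdots+}$$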

(11.4.23)

Next we use this formalism to write the ghost action $S_g$ on a world-sheet with any metric (which need not be in the conformal gauge). We note that the ghost fields $c^+$ and $c^-$ can be viewed as the components of a vector field $c^\alpha$, and the antighost fields $b_{++}$ and $b_{--}$ can be thought of as the components of a symmetric traceless tensor $b_{\alpha\beta}$ (since $b_{+-} = b_{-+} = 0$ for a traceless symmetric tensor). The ghost action in terms of $c^\alpha$ and $b_{\alpha\beta}$ can now be written as:

$$S_g = -\frac{1}{2\pi}\int d^2\sigma\, \sqrt{h}\, h^{\alpha\beta}\, c^\gamma\, \nabla_\alpha b_{\beta\gamma}$$

(11.4.24)

The ghost fields b and c represented by covariant symmetric traceless tensor and contravariant vector (respectively) are Grassmann valued since they are anticommuting quantities. To examine the ghost contribution into the theory, we also consider the world-sheet energy-momentum tensor defined earlier (see (11.2.19)): (11.4.25) The action (11.4.24) then leads to the ghost contribution to the EM tensor: T%= ^cv(Vabpv+

Vpbav) + Vacvbpv+ Vpcvbav-

trace

(11.4.26)

Since $T_{\alpha\beta}$ and $b_{\alpha\beta}$ are both symmetric and traceless, we have (using (11.4.22)):

$$T^{(c)}_{++} = \tfrac12\, c^+\partial_+ b_{++} + (\partial_+ c^+)\,b_{++}; \qquad T^{(c)}_{--} = \tfrac12\, c^-\partial_- b_{--} + (\partial_- c^-)\,b_{--}$$

(11.4.27)

Returning to (11.4.24), we note that in the conformal gauge the action $S_g$ simplifies to:

$$S_{FP} = \frac{1}{\pi}\int \left(c^+\partial_- b_{++} + c^-\partial_+ b_{--}\right) d^2\sigma$$

(11.4.28)

which is the Faddeev-Popov ghost action. In order to quantize the ghosts and to express them in Fourier modes, we consider the canonical anticommutation relations and the open-string boundary conditions. The first of these gives:³⁹

$$\{b_{++}(\sigma, \tau),\, c^+(\sigma', \tau)\} = 2\pi\,\delta(\sigma - \sigma')$$

(11.4.29)

$$\{b_{--}(\sigma, \tau),\, c^-(\sigma', \tau)\} = 2\pi\,\delta(\sigma - \sigma')$$

(11.4.30)

In the conformal gauge, their equations of motion are seen to be

$$\partial_- c^+ = \partial_- b_{++} = 0$$

(11.4.31)

$$\partial_+ c^- = \partial_+ b_{--} = 0$$

(11.4.32)

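In view of the open-string boundary conditions $c^+ = c^-$ and $b_{++} = b_{--}$ at the ends of the string, and since:

$$c^+ = \sum_{n=-\infty}^{\infty} c_n\, e^{-in(\tau+\sigma)}; \qquad c^- = \sum_{n=-\infty}^{\infty} c_n\, e^{-in(\tau-\sigma)}$$

(11.4.33)

$$b_{++} = \sum_{n=-\infty}^{\infty} b_n\, e^{-in(\tau+\sigma)}; \qquad b_{--} = \sum_{n=-\infty}^{\infty} b_n\, e^{-in(\tau-\sigma)}$$

(11.4.34)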

the canonical anticommutation relations (11.4.29-30) in terms of the modes are:

$$\{c_m, b_n\} = \delta_{m+n}$$

(11.4.35)

$$\{c_m, c_n\} = \{b_m, b_n\} = 0$$

(11.4.36)

In the case of closed strings, the boundary condition is just periodicity in σ; therefore (similar to the left-moving and right-moving X coordinates) the ghosts $c^+$ and $c^-$ have independent mode expansions:

$$c^+ = \sqrt{2}\sum_{n=-\infty}^{\infty} \tilde c_n\, e^{-2in(\tau+\sigma)}$$

(11.4.37)

$$c^- = \sqrt{2}\sum_{n=-\infty}^{\infty} c_n\, e^{-2in(\tau-\sigma)}$$

(11.4.38)

Likewise, the components $b_{++}$ and $b_{--}$ involve modes $\tilde b_n$ and $b_n$. The following remark regarding b and c is noteworthy.

39 Eqs. (11.4.29)-(11.4.32) show that $b_{++}$ ($b_{--}$) is conjugate to $c^+$ ($c^-$), a fact that follows from the ghost action $S_{FP}$.


Remark 11.4.1 On a flat world-sheet the ghost and antighost fields c and b enter symmetrically in all formulas despite their asymmetric (tensorial) appearance, e.g., $c^-$ and $b_{--}$; the ghost Lagrangian treats b and c symmetrically in this case. On a curved world-sheet, however, this is not so, as is evident from (11.4.24). Nor do these fields enter symmetrically in the world-sheet energy-momentum tensor, even on a flat world-sheet, because the EM tensor is derived by varying with respect to the world-sheet metric.

We now insert the mode expansions (11.4.33)-(11.4.34) in the expression (11.4.27) of $T_{++}$ and $T_{--}$ and, extracting the Fourier modes $L_m \propto \int d\sigma\, e^{im\sigma}\, T_{++}$ (for the open string), obtain the (ghost) Virasoro generators:

$$L^{(c)}_m = \sum_{n=-\infty}^{\infty} [m(J - 1) - n]\, b_{m+n}\, c_{-n}$$

(11.4.39)

We note that J = 2 is the conformal dimension of the antighost, while the conformal dimension of the ghost is 1 - J = -1. These generators satisfy the usual Virasoro algebra relations (11.3.95):

$$[L^{(c)}_m, L^{(c)}_n] = (m - n)\,L^{(c)}_{m+n} + A^{(c)}(m)\,\delta_{m+n}$$

(11.4.40)

with an anomaly term

$$A^{(c)}(m) = \frac{1}{12}\left[1 - 3(2J - 1)^2\right]m^3 + \frac{1}{6}\,m$$

(11.4.41)

For J = 2 this reduces to:

$$A^{(c)}(m) = \frac{1}{6}\,(m - 13m^3)$$

(11.4.42)

When m = 1, the anomaly term is:

$$A^{(c)}(1) = -2$$

(11.4.43)
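The reduction from (11.4.41) to (11.4.42) and the value (11.4.43) are simple enough to check with exact rational arithmetic. The following is a minimal sketch; the function names are illustrative choices and not notation from the text.

```python
from fractions import Fraction

def ghost_anomaly(m, J=2):
    """A^(c)(m) = (1/12)[1 - 3(2J - 1)^2] m^3 + m/6, as in (11.4.41)."""
    return Fraction(1 - 3 * (2 * J - 1) ** 2, 12) * m**3 + Fraction(m, 6)

def ghost_anomaly_J2(m):
    """Closed form (11.4.42) for J = 2: (m - 13 m^3) / 6."""
    return Fraction(m - 13 * m**3, 6)

# The general formula at J = 2 agrees with the closed form for a range of m.
for m in range(-5, 6):
    assert ghost_anomaly(m, J=2) == ghost_anomaly_J2(m)

print(ghost_anomaly(1))  # value quoted in (11.4.43)
```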

We note that for open strings, when σ = 0 the expressions (11.4.33) and (11.4.34) simplify to:

$$c(\tau) = \sum_{n=-\infty}^{\infty} c_n\, e^{-in\tau}$$

(11.4.44)

and

$$b(\tau) = \sum_{n=-\infty}^{\infty} b_n\, e^{-in\tau}$$

(11.4.45)

Using these Fourier modes we obtain:

$$[L^{(c)}_m, b_n] = (m - n)\,b_{m+n}$$

(11.4.46)

$$[L^{(c)}_m, c_n] = -(2m + n)\,c_{m+n}$$

(11.4.47)

40 For closed strings there is also a second set of ghost Virasoro generators.


Hence it follows that c and b have conformal dimensions J = -1 and J = +2 respectively. This confirms the assertion that we made earlier. In view of the above discussion, the complete Virasoro generators corresponding to $S_0 + S_{gh}$ can be defined as:⁴¹

$$L_m = L^{(\alpha)}_m + L^{(c)}_m - a\,\delta_m$$

(11.4.48)

The anomaly (adding ghost and matter contributions) is now:

$$A(m) = \frac{D}{12}\,(m^3 - m) + \frac{1}{6}\,(m - 13m^3) + 2am$$

(11.4.49)

This vanishes for all m if and only if D = 26 and a = 1, showing that only for these values is the theory (of strings) conformally invariant. With the incorporation of ghost and antighost fields, the Virasoro generators obtained here are free of anomaly. However, the Fock space is now much larger, since it contains the excitations of the ghost and antighost along with the excitations of the coordinates $X^\mu$. This makes it harder to identify the physical states. In the next subsection we shall see that BRST quantization provides a useful tool for sorting out this problem.
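The cancellation condition can be confirmed by a brute-force scan. The sketch below evaluates the total anomaly (11.4.49) exactly and searches a small grid of integer values of D and a; the grid bounds and the function names are arbitrary choices made for illustration.

```python
from fractions import Fraction

def total_anomaly(m, D, a):
    """A(m) = (D/12)(m^3 - m) + (1/6)(m - 13 m^3) + 2 a m, as in (11.4.49)."""
    matter = Fraction(D, 12) * (m**3 - m)
    ghost = Fraction(m - 13 * m**3, 6)
    shift = 2 * a * m
    return matter + ghost + shift

def cancels(D, a, m_max=6):
    """True when the anomaly vanishes for every mode number in the scan."""
    return all(total_anomaly(m, D, a) == 0 for m in range(-m_max, m_max + 1))

solutions = [(D, a) for D in range(1, 41) for a in range(-3, 4) if cancels(D, a)]
print(solutions)  # [(26, 1)]
```

Since A(m) is a cubic polynomial in m with only m³ and m terms, requiring it to vanish at m = 1 and m = 2 already fixes both coefficients, which is why the scan finds a single solution.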

4.2

BRST Quantization

It is worth mentioning at the outset that the topic under discussion is another example where physics and mathematics merge, since the BRST operator is precisely the operator that computes the cohomology of a given Lie algebra. Let $K_i$ denote the symmetry operators of a physical system, forming a closed Lie algebra g:

$$[K_i, K_j] = f_{ij}{}^{k}\,K_k$$

(11.4.50)

with $f_{ij}{}^{k}$ as its structure constants. The BRST quantization involves the introduction of 'antighosts' $b_i$ and 'ghosts' $c^i$. These transform in the adjoint representation and the dual of the adjoint representation of g respectively (in simple language, the $b_i$ carry a covariant index of the Lie algebra g, whereas the $c^i$ carry a contravariant index). The $b_i$ and $c^j$ obey the canonical anticommutation relations:

$$\{c^i, b_j\} = \delta^i_j$$

(11.4.51)

and also lead to the so-called ghost number U:

$$U = \sum_i c^i\, b_i$$

(11.4.52)

The eigenvalues of U are integers running from 0 to n, where n = dim g.⁴² The operator Q defined as:

$$Q = c^i K_i - \tfrac12\, f_{ij}{}^{k}\, c^i c^j b_k$$

(11.4.53)

41 By adding the term $(-a\,\delta_m)$ here, $L_0$ is defined in such a manner that the zeroth constraint is simply $L_0 = 0$.

42 When g is infinite-dimensional, a normal-ordering constant has to be subtracted to make U meaningful.


is the physicists' BRST operator, which from the mathematicians' point of view is the operator that computes the cohomology of the Lie algebra g with values in the representation defined by the $K_i$. Using (11.4.50) and the identity (that follows from (11.4.50) via the Jacobi identity):

$$f_{ij}{}^{m} f_{mk}{}^{l} + f_{jk}{}^{m} f_{mi}{}^{l} + f_{ki}{}^{m} f_{mj}{}^{l} = 0$$

(11.4.54)

it can be checked that:

$$Q^2 = 0$$

(11.4.55)
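The nilpotency (11.4.55) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not a construction from the text: it takes g = su(2) with $f_{ij}{}^{k} = \epsilon_{ijk}$, represents the $K_i$ by $-\tfrac{i}{2}\sigma_i$ on a two-dimensional space, and realizes the ghosts as 8 × 8 matrices through a Jordan-Wigner construction, with $c^i$ as creation and $b_i$ as annihilation operators. All helper names are hypothetical.

```python
from itertools import product

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker (tensor) product of two matrices."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def add(A, B, scale=1):
    """Return A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def chain(factors):
    """Tensor product of a list of matrices, left to right."""
    out = [[1]]
    for f in factors:
        out = kron(out, f)
    return out

def max_abs(A):
    return max(abs(x) for row in A for x in row)

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # (-1)^occupation, supplies the fermionic signs
RAISE = [[0, 0], [1, 0]]     # empty -> occupied
LOWER = [[0, 1], [0, 0]]     # occupied -> empty
N = 3                        # dim su(2)

def ghost_op(k, local):
    """Jordan-Wigner operator on ghost mode k, identity on the matter factor."""
    return chain([Z] * k + [local] + [I2] * (N - k - 1) + [I2])

c = [ghost_op(k, RAISE) for k in range(N)]   # ghosts c^i
b = [ghost_op(k, LOWER) for k in range(N)]   # antighosts b_i

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# K_i = -(i/2) sigma_i satisfies [K_i, K_j] = eps_ijk K_k.
K = [chain([I2] * N + [[[-0.5j * x for x in row] for row in s]]) for s in PAULI]

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

dim = len(c[0])
Q = [[0] * dim for _ in range(dim)]
U = [[0] * dim for _ in range(dim)]
for i in range(N):
    Q = add(Q, matmul(c[i], K[i]))           # c^i K_i
    U = add(U, matmul(c[i], b[i]))           # ghost number sum_i c^i b_i
for i, j, k in product(range(N), repeat=3):
    if eps(i, j, k):
        Q = add(Q, matmul(matmul(c[i], c[j]), b[k]), scale=-0.5 * eps(i, j, k))

q_squared = max_abs(matmul(Q, Q))
ghost_shift = max_abs(add(add(matmul(U, Q), matmul(Q, U), scale=-1), Q, scale=-1))
print(q_squared, ghost_shift)
```

Both printed numbers should vanish up to rounding: the first checks $Q^2 = 0$, the second checks $[U, Q] = Q$, i.e. that Q raises the ghost number by one.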

From the form of Q, it is also evident that Q changes the ghost number of any state that it acts on by +1. Let $\mathcal{H}_k$ denote the Hilbert space of states of ghost number U = k. A state χ in $\mathcal{H}_k$ is said to be BRST-invariant if

$$Q\chi = 0$$

(11.4.56)

We note that BRST-invariant states can be trivially found, since in view of (11.4.55) any state $\chi = Q\lambda$ is BRST-invariant; furthermore, the state λ necessarily has ghost number k - 1. The interesting solutions of (11.4.56), however, are the states that cannot be expressed as $\chi = Q\lambda$. Two solutions χ and χ' are said to be equivalent if χ - χ' is a trivial solution, i.e.

$$\chi - \chi' = Q\lambda$$

(11.4.57)

for some λ. The equivalence relation (11.4.57) defines the equivalence classes of solutions of (11.4.56) of a given ghost number, say n. In the mathematics literature these equivalence classes form what is known as the nth cohomology group of the Lie algebra g (see Sec. (0.6) and [5]) with values in the matrix representation R of the operators $K_i$, which is denoted by $H^n(g, R)$. The following remark shows that BRST-invariant states of ghost number zero are of particular interest.

Remark 11.4.2 The form of the ghost number operator U suggests that a state χ of ghost number zero must be annihilated by all of the antighosts $b_i$, which implies that the action of the operator Q on such a state consists of just the first term in (11.4.53):

$$Q\chi = \sum_i c^i K_i\,\chi$$

(11.4.58)

Since a state annihilated by $b_i$ cannot be annihilated by $c^i$, from (11.4.58) it follows that BRST-invariance of χ (i.e., Qχ = 0) is equivalent to saying that

$$K_i\,\chi = 0, \qquad i = 1, \ldots, n$$

(11.4.59)

Hence a state χ of ghost number zero is BRST-invariant if and only if it is g-invariant. It is important to note that a state χ of ghost number zero cannot be written as $\chi = Q\lambda$, since there are no states of ghost number -1.

Remark 11.4.3 From the above remark it follows that the cohomology group $H^0(g, R)$ is the same as the space of g-invariant states of ghost number zero. Moreover, a state of ghost number zero is annihilated by every ghost annihilation operator $b_i$ and therefore contains no ghosts. Thus looking for BRST-invariant states of ghost number zero is, in fact, a way to isolate the g-invariant states that do not contain ghosts.

We shall now apply these ideas, after suitable modifications, to infinite-dimensional Lie algebras such as the Virasoro algebra, which is of great use in string theory.


These modifications are required for two reasons: (i) the equation $Q^2 = 0$ may be affected by the presence of an anomaly, and (ii) the ghost number U has to be altered by a normal-ordering constant.

Remark 11.4.4 The physical states of the string are BRST cohomology classes χ (modulo gauge transformations $\chi \to \chi + Q\lambda$) of some definite ghost number, which is not necessarily zero. In fact the ghost number of the physical states is a normal-ordering constant that depends on the physical system one chooses to consider.

In order to carry out the BRST program in the case of the Virasoro algebra, we denote this algebra by the same symbol g. Corresponding to the generators $L_m$, where m is an arbitrary integer, we now introduce ghosts $c_m$ and antighosts $b_m$.⁴³ The BRST operator for the Virasoro algebra can now be written as:

$$Q = \sum_{m=-\infty}^{\infty} L^{(\alpha)}_{-m}\, c_m - \frac12 \sum_{m,n=-\infty}^{\infty} (m - n)\, {:}c_{-m}\, c_{-n}\, b_{m+n}{:} - a\,c_0$$

(11.4.60)

(note that we have used here the Virasoro structure constants in their explicit form). Comparing with the form of $L_m$ given in (11.4.48), we can write Q in (11.4.60) as:

$$Q = \sum_{m=-\infty}^{\infty} {:}\left(L^{(\alpha)}_{-m} + \tfrac12 L^{(c)}_{-m} - a\,\delta_m\right) c_m{:}$$

(11.4.61)

Similarly, the ghost number is

$$U = \sum_{m=-\infty}^{\infty} {:}c_{-m}\, b_m{:}$$

(11.4.62)

Evidently the normal ordering is necessary in the above formulae for Q and U. We also make two important observations regarding them.

Observation 11.4.5 For closed strings one has to add to these formulas the contribution of a second set of left-moving ghosts and oscillators.

Observation 11.4.6 The factor $\tfrac12$ before $L^{(c)}_{-m}$ in (11.4.61) (though seemingly odd) is a consequence of evaluating (11.4.60). We note that this factor $\tfrac12$ is essential for the validity of the equation $Q^2 = 0$, and also for many other formulas such as $L_m = \{Q, b_m\}$, where $L_m = L^{(\alpha)}_m + L^{(c)}_m$.

In the following paragraph we show that both of these operators Q and U can be obtained as integrals of conserved charge densities. For this purpose we define the BRST current as:

$$J^B_+ = 2c^+\left(T^{(\alpha)}_{++} + \tfrac12\, T^{(c)}_{++}\right)$$

(11.4.63)

where $T^{(\alpha)}_{++} = (\partial_+ X)^2$ and $T^{(c)}_{++}$ are respectively defined in (11.3.35) and (11.4.27). (The expression for $J^B_-$ follows via + ↔ -.)

43 $c_m$ and $b_n$ are the Fourier modes considered in (11.4.33)-(11.4.36) in the process of quantizing the ghosts.


The ghost-number current is likewise defined as:

$$J_+ = c^+ b_{++}; \qquad J_- = c^- b_{--}$$

(11.4.64)

From the equations of motion of b and c ((11.4.31) and (11.4.32)) and the law of energy-momentum conservation on the world-sheet, it can be checked that these currents are conserved, i.e.:

$$\partial_- J^B_+ = \partial_- J_+ = 0; \qquad \partial_+ J^B_- = \partial_+ J_- = 0$$

(11.4.65)

and that the corresponding conserved charges are the BRST charge:

$$Q = \frac{1}{2\pi}\int_0^{\pi} d\sigma\, (J^B_+ + J^B_-)$$

(11.4.66)

and the ghost number:⁴⁴

$$U = \frac{1}{2\pi}\int_0^{\pi} d\sigma\, (J_+ + J_-)$$

(11.4.67)

In order to proceed further we include normal ordering in the definition of $L^{(c)}_m$ and in the second term of Q in (11.4.60), and emphasize that the resulting ambiguities can be absorbed in the third term, which is linear in $c_0$ with a free coefficient a. The above discussion ensures that $Q^2 = 0$ in the classical sense. We shall now investigate whether this is true at the quantum level as well. We therefore compute $Q^2$ using (11.4.60) and (11.4.61):

$$Q^2 = \tfrac12\{Q, Q\} = \tfrac12 \sum_{m,n=-\infty}^{\infty} \left([L_m, L_n] - (m - n)L_{m+n}\right) c_{-m}\, c_{-n}$$

(11.4.68)

with $L_m$ as given in (11.4.48). We note that for D = 26 and a = 1, $Q^2 = 0$, since the anomaly A(m) (see (11.4.49)) vanishes for these values. (See Exc. 2 on the converse of this result.)

Definition 11.4.7

The BRST transformation of an arbitrary physical quantity Y is defined as:

$$\delta Y = [\lambda Q, Y]$$

(11.4.69)

where λ is a constant Grassmann parameter. It is easily seen that the square of this transformation is zero and that it corresponds to an invariance of the gauge-fixed action. Using this definition it can be checked that the coordinates, the ghosts, the antighosts, and the EM tensor satisfy respectively:

$$\delta X^\mu = \lambda\, c^+\partial_+ X^\mu$$

(11.4.70)

$$\delta c^+ = \lambda\, c^+\partial_+ c^+$$

(11.4.71)

$$\delta b_{++} = 2\lambda\, T_{++}$$

(11.4.72)

$$\delta T_{++} = 0$$⁴⁵

(11.4.73)

(Similarly for -, with + ↔ -.)

44 Formulae (11.4.66) and (11.4.67) reduce to the previous ones in the case of open strings, while in the case of closed strings both left-moving and right-moving modes have been added to write these formulas.

45 $T_{++} = T^{(\alpha)}_{++} + T^{(c)}_{++}$ is the complete energy-momentum tensor.


Our next task, which completes the discussion of BRST quantization, is to establish the following result.

Result 11.4.8 The physical states in the bosonic string theory are BRST cohomology classes of ghost number -1/2.

To achieve this objective, we study the normal ordering of the ghost number operator:⁴⁶

$$U = \tfrac12\,(c_0 b_0 - b_0 c_0) + \sum_{n=1}^{\infty} (c_{-n} b_n - b_{-n} c_n)$$

(11.4.74)

The zero modes $c_0$ and $b_0$ both commute with the Hamiltonian, so the ground state is degenerate; as we shall soon see, the degenerate ground states furnish a representation of these operators. Now $c_0$ and $b_0$ have the anticommutation relations:

$$c_0^2 = b_0^2 = 0, \qquad \{c_0, b_0\} = 1$$

(11.4.75)

The irreducible representation of the above relations requires two states; following [13a] we denote them $|\uparrow\rangle$ and $|\downarrow\rangle$ (and call them, for our purpose, the up and down states). We choose the up state and the down state to be annihilated, respectively, by $c_0$ and $b_0$, and for this they must obey:

$$b_0|\uparrow\rangle = |\downarrow\rangle, \qquad c_0|\downarrow\rangle = |\uparrow\rangle$$

(11.4.76)
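The two-state representation of (11.4.75)-(11.4.76) can be written out as 2 × 2 matrices, which also shows what the symmetric zero-mode term $\tfrac12(c_0b_0 - b_0c_0)$ of (11.4.74) does on these states. This is a small sketch; the basis ordering (index 0 for the up state, index 1 for the down state) is an arbitrary choice made here.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Basis: column index 0 = |up>, column index 1 = |down>.
b0 = [[0, 0], [1, 0]]   # b0|up> = |down>, b0|down> = 0
c0 = [[0, 1], [0, 0]]   # c0|down> = |up>, c0|up> = 0

cb = matmul(c0, b0)
bc = matmul(b0, c0)

anticommutator = [[cb[i][j] + bc[i][j] for j in range(2)] for i in range(2)]
U0 = [[Fraction(cb[i][j] - bc[i][j], 2) for j in range(2)] for i in range(2)]

print(anticommutator)         # identity matrix, i.e. {c0, b0} = 1
print(U0[0][0], U0[1][1])     # zero-mode ghost numbers of |up> and |down>
```

The diagonal entries of `U0` are +1/2 for the up state and -1/2 for the down state, so the two ghost numbers differ by one, as the relations (11.4.75)-(11.4.76) require.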

A natural question to ask is: what are the 'ghost numbers' of $|\uparrow\rangle$ and $|\downarrow\rangle$? We denote them by $U_\uparrow$ and $U_\downarrow$, and note that in view of (11.4.75) and (11.4.76) they satisfy the relation $U_\uparrow = U_\downarrow + 1$. In order to find their separate values, we have to consider a normal-ordering constant in the definition of U. The most symmetrical choice is $U_\uparrow = +1/2$, $U_\downarrow = -1/2$; this choice (fortunately) corresponds to the precise normal-ordering prescription in (11.4.74), and it implies that all eigenvalues of the ghost-number operator are half-integral. Moreover, it is this choice that makes gauge-invariant string field theory as simple as possible (see for instance [32]).

With this groundwork in place, we now discuss how a given physical state can be characterized as a BRST cohomology class of some definite ghost number. Since physical states are expected not to contain ghost excitations, it should be possible to put a physical state ψ (perhaps after a transformation $\psi \to \psi + Q\lambda$) in a form in which the ghost wave function is proportional to one of the two ground states $|\uparrow\rangle$ or $|\downarrow\rangle$. Thus the possible choices for the ghost number of ψ are ±1/2. It is important to note here that the choice of one of these options (+1/2 or -1/2) is not a matter of convention; it is dictated by the nature of the ghost and antighost fields. They do not enter the theory symmetrically and, as we know, have different conformal dimensions, -1 and 2. It turns out that the correct choice of ghost number (of a physical state under discussion) is -1/2. To examine this, we let χ be a state that is annihilated by the ghost and antighost annihilation operators:

$$c_n|\chi\rangle = b_n|\chi\rangle = 0, \qquad n > 0$$

(11.4.77)

We may think of χ as a state that 'contains no ghosts or antighosts.' Let us suppose in addition that χ has ghost number -1/2, and so $b_0|\chi\rangle = 0$. Acting on a state of this form, the condition of BRST invariance reduces to:

$$0 = Q|\chi\rangle = \Big(c_0(L_0 - 1) + \sum_{n>0} c_{-n} L_n\Big)|\chi\rangle$$

(11.4.78)

46 $c_0$ and $b_0$, the ghost and antighost zero modes, have been separated in (11.4.74) since they require special treatment; moreover, there is no natural way to normal order them.

and as a result the single condition $Q|\chi\rangle = 0$ reproduces all of the physical state conditions of the older covariant quantization. (If we were to choose +1/2 as the ghost number, and so allow $|\chi\rangle$ to be annihilated by $c_0$, then the first term on the RHS of (11.4.78) would drop out and we would not quite get all of the physical state conditions.) We now consider a state $|\chi\rangle$ that obeys (11.4.77) as well as (11.4.78) and examine whether χ can be written as $\chi = Q\lambda$ for some λ. It is easy to see that χ in that case would have to be expressed as:

$$|\chi\rangle = \sum_{n>0} L_{-n}|\lambda_n\rangle$$

(11.4.79)

for some states $|\lambda_n\rangle$; this in turn would imply that the state χ is null, for

$$\langle\chi|\chi\rangle = \sum_{n>0} \langle\lambda_n|L_n|\chi\rangle = 0$$

(11.4.80)

(where evidently (11.4.78) has been used). One can easily recognize (see Sec. 11.3) that χ is a physical spurious state. The above arguments show, as stated in Result (11.4.8), that the states obeying the traditional physical-state conditions of the bosonic string theory give rise to BRST cohomology classes of ghost number -1/2, and that a physical state is trivial as a cohomology class (i.e., can be written as Qλ for some state λ) if and only if it is a null state as defined in Sec. (11.3). We note that to establish the statement of Result (11.4.8) in full, one must also prove the converse: given a state χ of ghost number -1/2 that is BRST-invariant, it can be written in the form $\chi = \chi' + Q\lambda$, where χ' obeys (11.4.77) and therefore corresponds to a physical state of the old type (discussed in Sec. (11.3)), embedded in the enlarged Fock space in the manner described above.

4.3

Anomalies in Reference to String Theory

We have already familiarized ourselves with the concept of anomalies in Section 10.6, where we studied the origin of anomalies in general and anomalies in the theory of Yang-Mills in particular by using some specific examples. In this subsection we discuss them in the context of string theory. For instance, we establish here the fact (mentioned earlier in Sec. (11.3)) that Weyl rescaling of the metric is possible only when the Virasoro anomaly cancels—which happens just when D = 26. We illustrate these ideas, using free fermions and the properties of conformally-invariant two-dimensional theories. In the process we come across a new kind of anomaly—known as the gravitational anomaly (see also Exp. (10.6-8)). For simplicity we consider the real right-moving Majorana fermion y/+ with action: S = — f d2a y+djif+

(11.4.81)

From Chapter 9 we know that the two-point function of the energy-momentum tensor is given by:47 = j(CF+47.

ff'V

(11.4.82)

The reader may note the absence of the ordering symbol T that was used in Chapter 9. This is because we are using the Euclidean signature here.

Strings and Superstrings (Elementary Aspects) 727

We shall write the above equation in momentum space. We note that
(11.4.83)

Now the fermions are propagating on a curved world-sheet, which can be assumed to deviate from a flat world-sheet metric to first order; thus we set:

$h_{\alpha\beta} = \eta_{\alpha\beta} + f_{\alpha\beta}$

(11.4.84)

where fap is the disturbance of lowest possible order in the metric. The interaction of matter with a gravitational field is given by: M=-^\d2af^Tap

(11.4.85)

Since in the case under consideration (i.e., (11.4.82)), the only nonzero component of Ta» is T++, so the coupling is simply: f++T —

(11.4.86)

lit The expectation value of the induced fermion energy momentum in a gravitational field can therefore be written (using (11.4.83)) as: (T^ip)) = - ^ - - ^ - / + » (11.4.87) 24 p_ From the principle of conservation of energy-momentum, we know that in a background gravitational field: {DaTa/3) = 0

(11.4.88)

Now the only nonzero component here is $T_{++}$, and since to lowest order in the gravitational field the covariant derivative can be replaced by an ordinary derivative, this equation simply reduces to $\langle\partial_- T_{++}\rangle = 0$. In momentum space this amounts to $p_-\langle T_{++}\rangle = 0$, which is obviously not true, since we have instead:

$p_-\langle T_{++}\rangle = -\frac{1}{24}\,p_+^3 f^{++}(p) = -\frac{1}{6}\,p_+^3 f_{--}(p)$

(11.4.89)

(see (11.4.17) for raising and lowering of indices with modification for Weyl rescaling). The LHS of this equation would vanish only if $\langle T_{++}\rangle = 0$, which would mean that there is no coupling at all to gravity. The breakdown (11.4.89) of energy-momentum conservation in the coupling of a chiral fermion to gravity is called a gravitational anomaly. This means that in two dimensions the coupling of a chiral fermion to gravity does not work unless extra degrees of freedom are added to cancel the anomaly. We note that the RHS of (11.4.89) is a polynomial in the momentum, and so $\langle\partial_- T_{++}\rangle$ is in effect a local functional of $f_{--}$. We further note that even though (11.4.89) is local, the formula (11.4.87) from which it is derived is not so because of the $1/p_-$

singularity there. Thus it is not possible to add a local term to P(11.4.87) to eliminate the anomaly in (11.4.89). In order to overcome this difficulty, we consider a theory which incorporates the left-moving fermions y/_ along with the right-moving fermions y/+. The action of i//_ is S'= — f d2ayd+y

(11.4.90)

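As a result (11.4.87) is replaced by:

$\langle T_{++}\rangle = -\frac{1}{6}\frac{p_+^3}{p_-}f_{--}, \qquad \langle T_{--}\rangle = -\frac{1}{6}\frac{p_-^3}{p_+}f_{++}$

(11.4.91)

and (11.4.89) by

$p_-\langle T_{++}\rangle = -\frac{1}{6}\,p_+^3 f_{--}, \qquad p_+\langle T_{--}\rangle = -\frac{1}{6}\,p_-^3 f_{++}$

(11.4.92)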

At first sight, the anomaly problem does not seem to have disappeared from (11.4.92). On closer examination, however, we find that this (anomalous) violation of energy-momentum conservation can be removed by adding local counterterms to (11.4.91). For instance, we now write:

$\langle T_{++}\rangle = -\frac{1}{6}\frac{p_+}{p_-}\left(p_+^2 f_{--} - 2p_+p_- f_{+-} + p_-^2 f_{++}\right)$

$\langle T_{--}\rangle = -\frac{1}{6}\frac{p_-}{p_+}\left(p_-^2 f_{++} - 2p_+p_- f_{+-} + p_+^2 f_{--}\right)$

$\langle T_{+-}\rangle = \frac{1}{6}\left(p_+^2 f_{--} - 2p_+p_- f_{+-} + p_-^2 f_{++}\right)$

(11.4.93)

It is easy to see that energy-momentum conservation is now obeyed since:

$p_-\langle T_{++}\rangle + p_+\langle T_{-+}\rangle = 0$

(11.4.94)

Hence it follows that the two-dimensional theory described by the sum of actions (11.4.81) and (11.4.90) of right- and left-moving fermions can be consistently coupled to gravity. The following remark, however, shows that there is a small flaw in the theory that should be accounted for.

Remark 11.4.9 The actions (11.4.81) and (11.4.90) appear to be invariant under Weyl rescaling of the metric, corresponding to a theory with $T_{+-} = 0$, but the third equality in (11.4.93) shows that this is not so. Thus in order to achieve a theory that obeys energy-momentum conservation (a fundamental


principle of physics), we have in turn disturbed the tracelessness of tensor Tap, since its trace (to lowest order i n / ) is given by the third equation in (11.4.93). We note that this is actually an approximation to the formula: T

+- = ~\R

(11.4.95)

R being the scalar curvature of the string world-sheet. In conclusion, while the two-dimensional theory with massless fermions coupled to gravity respects the principle of energy-momentum conservation, it does not possess at the quantum level the Weyl invariance that is present in the classical Lagrangian.

Consider now a general two-dimensional theory that is scale invariant on a flat world-sheet, so that $T_{+-} = 0$, and in which, for some constants $c$ and $d$, the components $T_{++}$, $T_{--}$ satisfy:

$\langle T_{++}(p)T_{++}(-p)\rangle = \frac{c}{24}\frac{p_+^3}{p_-}$

$\langle T_{--}(p)T_{--}(-p)\rangle = \frac{d}{24}\frac{p_-^3}{p_+}$

(11.4.96)

We shall see in the next subsection that $c$ and $d$ can respectively be thought of as the Virasoro anomaly of right- and left-moving modes. When the theory is coupled to a curved world-sheet, energy-momentum conservation does not hold unless $c = d$; and when $c = d$, the Weyl invariance is lost with an anomaly of the form (11.4.95), which vanishes only when $c = d = 0$. This analysis leads to an important remark.

Remark 11.4.10 In the case of the Veneziano model [34] (with ghosts included), $c = d = D - 26$ in any space-time of dimension $D$, so the world-sheet energy-momentum tensor is conserved even on a curved world-sheet for any $D$. However, this tensor is traceless only when $D = 26$ (since it is only then that $c = d = 0$), and it is actually in this case that the Weyl invariance used to eliminate the $D\phi$ integral in (11.4.13) is valid.
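Returning to the conservation law (11.4.94), the cancellation can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text. It assumes that the counterterm-improved expectation values (11.4.93) are all proportional to the single polynomial $X = p_+^2 f_{--} - 2p_+p_- f_{+-} + p_-^2 f_{++}$, with coefficients $-\frac{1}{6}\,p_+/p_-$, $-\frac{1}{6}\,p_-/p_+$ and $+\frac{1}{6}$, and that conservation in momentum space reads $p_-\langle T_{++}\rangle + p_+\langle T_{+-}\rangle = 0$ together with its mirror image. All function names are illustrative.

```python
from fractions import Fraction

SIXTH = Fraction(1, 6)


def curvature_polynomial(p_plus, p_minus, f_pp, f_pm, f_mm):
    """X = p_+^2 f_-- - 2 p_+ p_- f_+- + p_-^2 f_++ (assumed form of the bracket in (11.4.93))."""
    return p_plus**2 * f_mm - 2 * p_plus * p_minus * f_pm + p_minus**2 * f_pp


def improved_stress_tensor(p_plus, p_minus, f_pp, f_pm, f_mm):
    """<T_++>, <T_-->, <T_+-> with the local counterterms included, as assumed for (11.4.93)."""
    x = curvature_polynomial(p_plus, p_minus, f_pp, f_pm, f_mm)
    t_pp = -SIXTH * p_plus / p_minus * x
    t_mm = -SIXTH * p_minus / p_plus * x
    t_pm = SIXTH * x
    return t_pp, t_mm, t_pm


def unimproved_stress_tensor(p_plus, p_minus, f_pp, f_mm):
    """<T_++>, <T_-->, <T_+-> before counterterms, as assumed for (11.4.91); the trace part vanishes."""
    t_pp = -SIXTH * p_plus**3 / p_minus * f_mm
    t_mm = -SIXTH * p_minus**3 / p_plus * f_pp
    return t_pp, t_mm, Fraction(0)


def divergences(p_plus, p_minus, t_pp, t_mm, t_pm):
    """Both components of the momentum-space conservation law; (0, 0) means conserved."""
    return (p_minus * t_pp + p_plus * t_pm, p_plus * t_mm + p_minus * t_pm)


if __name__ == "__main__":
    p_plus, p_minus = Fraction(3), Fraction(5)
    f_pp, f_pm, f_mm = Fraction(2), Fraction(-1), Fraction(7)
    improved = improved_stress_tensor(p_plus, p_minus, f_pp, f_pm, f_mm)
    plain = unimproved_stress_tensor(p_plus, p_minus, f_pp, f_mm)
    print("with counterterms:   ", divergences(p_plus, p_minus, *improved))
    print("without counterterms:", divergences(p_plus, p_minus, *plain))
```

In this sketch the improved tensor is conserved for any choice of momenta and metric perturbation, while its trace component $\langle T_{+-}\rangle$ is nonzero whenever $X \neq 0$, which is the trade-off described in Remark 11.4.9.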

4.4 Calculation of the Virasoro Anomaly via World-sheet Methods

In the previous section we calculated the anomaly 'a' of the Virasoro commutation relation:48

$[L_m, L_n] = (m - n)L_{m+n} + (am^3 + bm)\delta_{m+n}$

(11.4.97)

by using mode expansions. This alternative method of computing 'a' is based on the free bosonic field theory on a world-sheet as well as on some (well-known) rules of current algebra.49 (Note that we are interested in computing only 'a', as the other constant $b$ can be absorbed in the normal-ordering constant of $L_0$ (see Sec. (3)).) For our purpose, we begin with the action of a free bosonic field $\phi$ described on a world-sheet which is the whole complex plane:

SB = -—j

d2GdJda5Q

(11.4.98)

48 In Sec. 3 this relation is (11.3.95-96), and $a$ and $b$ are respectively $c_3$ and $c_1$ there.

49 This computational method is also known as covariant calculation.

50 Before this, we had only considered the bosonic fields that were space-time coordinates $X^\mu(\sigma)$.

Our assumption in (11.4.98) that the world-sheet is a complex plane suggests that $S_B$ represents a conformally invariant theory, a fact which we shall require as we go along. The propagator of the field $\phi$ is

d

A\

£

' ,V

=-±-\n{\o-o'\ii)

(11.4.99)

where $\mu$ is an infrared cutoff that cancels out of all relevant formulas (see Sec. (9-4-6) where this propagator was derived). Writing $\sigma^\pm = \tau \pm \sigma$, we note that the free wave equation obeyed by $\phi$ is:

$0 = \partial_+\partial_-\phi$

(11.4.100)

which implies that ij> can be expressed as: 0(<7+, o~) = —
(11.4.101)

The splitting in (11.4.101) has an ambiguity in the sense that one could add to if and subtract from
\°°da'^$-

if (a, T) = (0, x) + \"da'^-

(11.4.102)

It can be easily checked that Eq. (11.4.102) obeys (11.4.101). Using the equation of motion (d2T-d^) if) = 0, it also follows that (
(11.4.103)

Now (f (+) and {if if) are functions only of cr+ and = - ln((cr+ - O

(0- - 0") A/2)

(11.4.104)

in view of the observation made above, we can further write the two point functions in separate pieces, e.g., <0+( = - l n [ ( a + - O / f ] <0-(<7-) = - l n [ ( o - - a'-) /i] (The above equalities in (11.4.105) can be checked using (11.4.102).) The world-sheet energy-momentum tensor (in terms of i/)) can be written as:

T++(0+) = djdj

= ±dj+d+
(11.4.105)


T_ (<7~) = djpdj = —djfdjf 4

(11.4.106)

and it obeys: d_T++=d+T__=Q

(11.4.107)

In order to evaluate the Virasoro anomaly (which is our final goal), we use the current algebra and consider the time-ordered two-point function $\langle T(T_{++}(\sigma^+)T_{++}(\sigma'^+))\rangle$.51 From our study in earlier chapters, we know that this is not conserved; instead it obeys the Ward-like identity:

$\partial_-\langle T(T_{++}(\sigma, \tau)T_{++}(\sigma', \tau'))\rangle = \frac{1}{2}\delta(\tau - \tau')\,\langle[T_{++}(\sigma, \tau), T_{++}(\sigma', \tau')]\rangle$

(11.4.108)

We note that this identity results (as is always the case in current algebra) from pulling $\partial_-$ inside the $T$-product, since while doing that one picks up an equal-time commutator. The expectation value of the commutator $[T_{++}(\sigma), T_{++}(\sigma')]$ extracts the c-number piece of this commutator, which is the Virasoro anomaly. This suggests that the Virasoro anomaly can be computed by evaluating the LHS of (11.4.108). Now in free field theory, the two-point function of the energy-momentum tensor is given by the simple one-loop diagram (see Fig. 11.14).

Fig. 11.14 One-loop diagram for the two-point function $\langle T_{++}T_{++}\rangle$.

No integration is required to evaluate the above figure in coordinate space; all one needs to do is to take the product of the propagators. Thus using (11.4.105) and (11.4.106) one obtains:

$\langle T(T_{++}(\sigma^+)T_{++}(\sigma'^+))\rangle = \frac{1}{8}(\sigma^+ - \sigma'^+)^{-4}$

(11.4.109)

and as the LHS of (11.4.108) involves
(11.4.110) (11.4.111)

This gives: d_{(f- G/+)-4=--did_(v+< T ' V = - i — d l S 2 ( a - a') (11.4.112) 6 6 In view of this, we conclude that (11.4.108) and (11.4.109) correspond to an anomalous part of the energy-momentum commutator at equal T, given explicitly as:

51

[T^ia, T), r ++ (a', f)]A = -i~8'"(a24 In fact this function is T-ordered.

a').

(11.4.113)


The subscript A on the LHS means that only the anomalous (c-number) part of the commutator has been evaluated. We note that to evaluate this anomaly the free field theory has been formulated on the plane, but the anomaly (11.4.113) is determined only by the short-distance behaviour of the free field theory, and as such it is valid even when the theory is formulated on a world-sheet formed by a closed string. We shall in that case define the Virasoro generators as Fourier moments of $T_{++}$:

Ln=±j*doe2inaT+,(o)

(11.4.114)

It can be checked that (11.4.113) then gives a formula for the coefficient $a$ in (11.4.97):

$a = \frac{1}{12}$

(11.4.115)

a value which agrees with earlier calculations. Finally we remark:

Remark 11.4.11 The dependence of (11.4.109) on $(\sigma - \sigma')^{-4}$ is completely determined by scale invariance and holds in any conformally invariant theory in 1 + 1 dimensions. Thus only the coefficient of $(\sigma - \sigma')^{-4}$ might be different in another conformally invariant theory. Moreover, the coefficient of $(\sigma - \sigma')^{-4}$ is always positive for a field theory with physical degrees of freedom only, as the two-point function of the Hermitian operator $T_{++}$ is positive; hence only ghosts can cancel the Virasoro anomaly, a fact that we established in the previous section.

The above remark shows that the incorporation of ghosts is of utmost importance to make the theory viable; in view of this we devote the next subsection to ghosts.

4.5 Ghosts in Bosonic Theory

Consider the conformally invariant action involving the (right-moving) ghost c + and the antighost b++: S= — f d2ac+d_b++52

(11.4.116)

The ghost equation of motion derived from (11.4.116) is: d_c+=d_b++ = 0

(11.4.117) +

which is the same equation as obeyed by (bosonic field)
+. Since it is known that the fermion free field theory is specified by the two-point function (see (11.4.33-34) and (11.4.99)):

(c+(a+)b++(a'+)) = 4nd+\ £jL e '*(a-
*

(11.4.119)

52 It is noteworthy that the energy-momentum tensor cannot be determined from this flat world-sheet action (see Exc. 2 for explanations).

we attempt to find operators in the bosonic theory that would reproduce this two-point function. With this end in view, we choose an operator:

$D_t(\sigma^+) = \mu^{t^2/2} :e^{it\phi^+(\sigma^+)}:$

(11.4.120)

where $\mu$ is the infrared cutoff present in (11.4.105). Using (11.4.105) and the method to compute the expectation value of a product of operators (see Chapter 9), we write the two-point function of $D_t$ as:

$\langle D_t(\sigma^+)\, D_{-t}(\sigma'^+)\rangle = (\sigma^+ - \sigma'^+)^{-t^2}$

(11.4.121)

We note that the infrared cutoff $\mu$ cancels out of this formula. When (11.4.119) is compared with (11.4.121), the following identification is derived:

$c^+(\sigma^+) \sim :e^{i\phi^+(\sigma^+)}:, \qquad b_{++}(\sigma^+) \sim :e^{-i\phi^+(\sigma^+)}:$

(11.4.122)

To establish the above identification, however, we shall have to show that the bosonic operators in (11.4.122) obey the correct fermion anticommutation relations. For instance, we show that the equal-$\tau$ anticommutator of $c^+$ satisfies:

$\{c^+(\sigma, \tau), c^+(\sigma', \tau)\} = 0$

(11.4.123)

To establish this, we study the product:

Dx{a, x)Dx{o', T) = e'** T> expf-f H / S ^ - ) e'^''

V

a

T

> expf-.f" dc'^\

ox)

V

(11.4.124)

dr)

where we have used the explicit formula for (j>+ given in (11.4.102). We now use the formula eAeB = eBeAe^A' B' which holds when [A, B] is a c-number, to interchange the positions of (b and ——, and dt finally using the canonical commutation relations of <j) and —— we write the product: dx D,{a, T)D,(cr', T) = eine{c>- o) ema- T) el*(a'' T) XeJ-ifdd^)-eJ-irda'^-) a JCT

V

dx)

V

(11.4.125)

dx)

Here 6{a' - a) is +1 when {a' - a) > 0 and is 0 otherwise. Since the phase factor 6 {a' - o) on the RHS side of (11.4.125) is an odd function of ( c r -
lim :c+(<j)b++(a'):

(11.4.126)

(T'-»(T

In view of (11.4.122) i.e., using the bosonic language this becomes: J+(O) = lim : ^ + ( f f + ) e-i$+(a'+): cr'-»cr

(11.4.127)


To take the limit a' —> a, we expand the second exponent, thus: e -;<s

+

«r'+) = e -.-^(«r+) (1 _

/(cr'

+ _
(11.4.128)

and note that the first term gives a discardable c-number (upon normal-ordering), whereas the second contributes to a non-zero term, thus we have: J+ = -id+
(11.4.129) +(CT+)

l a

+

(The above limit is obtained using the fact that as a' -> a, e'^ > has a short distance e~ ^ singularity proportional to (<7 + - a'+)~l). The expression in (11.4.129) is the bosonized version of the ghost-number current. Using the canonical commutation relation when we write [J+(
f)] = ^S(a4

a')

(11.4.130)

we note that the symmetry under the constant shift in 0+ corresponds to ghost number. We use the above discussions to write down the energy-momentum tensor in the bosonized language as well as to obtain the conformal dimension of the operator Dt. In this connection we recall that on a flat world-sheet, the ghost theory has a 'ghost conjugation' symmetry b <-> c; and the ghost current J+ = c+ b++ is odd under ghost conjugation. Hence from (11.4.129) or for that matter from (11.4.122), it follows that ghost conjugation can be interpreted as tp —> -. In Exc. 2 of this section, we formulate a one-parameter family of energy-momentum tensors, such that a member of the family is obtained from another by adding the derivation of the ghost current. The corresponding one-parameter family of energy-momentum tensors in the bosonic formulation turns out to be: T+k+= ^d+^d+<j)+-

^ikdlf.

(11.4.131)

The particular member of the family given by $k = 0$ is unique, in the sense that it is invariant under ghost conjugation. We note that at $k = 0$ the operators $D_t$ and $D_{-t}$ are related by ghost conjugation, and therefore they have the same conformal dimension. Thus if we denote the conformal dimension of $D_t$ ($D_{-t}$) by $d_t$ ($d_{-t}$), and note that in view of (11.4.121)

$d_t + d_{-t} = t^2,$

(11.4.132)

then using the ghost conjugation property we have $d_t = t^2/2$ (since $d_t = d_{-t}$). When $k \neq 0$ the conformal dimension can be determined using the value at $k = 0$. Since $D_t$ is an operator of ghost number $t$, in view of the identification of the ghost number in bosonic terms, it follows that the conformal dimension of $D_t$ at general $k$ can now be written as (see (v) of Exc. 2):

$d_t(k) = \frac{t^2}{2} - \frac{kt}{2}$

(11.4.133)
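As a quick numerical illustration of (11.4.132) and (11.4.133), the sketch below evaluates the conformal dimension with exact fractions. It assumes the formula $d_t(k) = t^2/2 - kt/2$ and the identification of $c^+$ with $t = +1$ and $b_{++}$ with $t = -1$ from (11.4.122); the function name is illustrative and does not come from the text.

```python
from fractions import Fraction


def conformal_dimension(t, k=0):
    """Conformal dimension d_t(k) = t^2/2 - k*t/2 of the operator D_t (assumed form of (11.4.133))."""
    t = Fraction(t)
    k = Fraction(k)
    return t * t / 2 - k * t / 2


if __name__ == "__main__":
    # At k = 0 ghost conjugation t -> -t leaves the dimension unchanged.
    for t in (1, 2, Fraction(3, 2)):
        print(t, conformal_dimension(t), conformal_dimension(-t))
    # Ghost (t = +1) and antighost (t = -1) at the value k = 3 singled out in Exc. 2.
    print(conformal_dimension(1, 3), conformal_dimension(-1, 3))
```

Under these assumptions the $t = \pm 1$ values reproduce the dimensions $(1 - k)/2$ for $c$ and $(k + 1)/2$ for $b$ quoted in the hint to Exc. 2.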

We close this subsection with the following two remarks that show the affinity between the fermionic and bosonic theory.


Remark 11.4.12 Similar to the fermionic case (see Subsec. (4.3)), in the bosonic case also the existence of a one-parameter family of energy-momentum tensor corresponds to the existence of a oneparameter family of couplings of the free field (j) defined on a curved world-sheet. The family that we are talking about is given by the action:

Sk= -J rfWft ( j - < ? a ^ - i * | ? > )

(11.4.134)

where $R^{(2)}$ is the scalar curvature of the string world-sheet. Evidently on a flat world-sheet $R^{(2)}$ is zero, hence there is no $k$-dependent term. We vary (11.4.134) with respect to the world-sheet metric to derive the EM tensor and then, setting the metric to $\eta_{\alpha\beta}$, we derive the $k$-dependent EM tensor (11.4.131). It can be checked that the ghost number is conserved (the action $S_k$ is invariant under $\phi \to \phi + \text{const.}$) only if $k = 0$.

Remark 11.4.13 In the above remark we have discussed the bosonization of fermions for (11.4.134) formed on a (1 + 1)-dimensional world of infinite volume. It is sometimes worthwhile to consider the bosonization of fermions that are propagating on a circle. For this we study the free theory (11.4.134) on a finite one-dimensional world with $0 \le \sigma \le 2\pi$ together with our familiar periodic boundary conditions,53 then
(11.4.135)

Here $[\phi_0, p_0] = i$, and $[\phi_n, \phi_m] = n\delta_{n+m}$. Moreover, $p_0$ is the ghost-number operator that shifts $\phi$ by a constant. Just as in the case of infinite volume (see (11.4.122)) we again define

$c^+(\sigma) = :e^{i\phi^+(\sigma)}:, \qquad b_{++}(\sigma) = :e^{-i\phi^+(\sigma)}:$

(11.4.136)

and using the appropriate normal ordering we obtain:

c\o) = expl - Y -e~in°<$>n L'*0g'"(P0 + 1/2> x exp [ - Y -e~ina<^n I V no « J

(11.4.137)

6 ++ (o) = e x p [ - X -^ I > 1 C T 0 n V 1 > o e - i ( J ( ™ + 1 / 2 ) x e x p | - X - « - * " > „ ] (H-4.138) [

n<0

n

J

{ n>0 A B

n

)

B A [A B]

53 We recall that the closed-string periodicity is $\pi$, whereas for open strings after doubling of the interval, the periodicity is naturally $2\pi$ and this is what we are using here.

It can be checked (with the help of the identity $e^A e^B = e^B e^A e^{[A,B]}$) that $c^+$ and $b_{++}$ defined above obey the correct anticommutation relations. We note that the factor $(p_0 + 1/2)$ (and not $p_0$) is needed for a correct Bose-Fermi correspondence. The periodicity $c^+(\sigma + 2\pi) = c^+(\sigma)$, $b_{++}(\sigma + 2\pi) = b_{++}(\sigma)$ requires that the ghost number operator $p_0$ in (11.4.137) and (11.4.138) should have half-integral eigenvalues, which confirms our earlier findings that the ghost number of the states of the bosonic open string is half-integral. Also, since $p_0$ is canonically conjugate to the zero-mode coordinate $\phi_0$ ($p_0 = -i\,\partial/\partial\phi_0$), the fact that it has half-integral eigenvalues means that $\phi_0$ is an angular variable, i.e., $\phi_0$ and
instance, one can write the Hamiltonian H = I

tf=iT+

da T++ of the free field theory as:

i>0-»0,,-^7

where U = pQ is the ghost number and

(11.4.139)

is the normal-ordering constant for bosons with periodic

24 boundary condition. (This constant is denoted e + in the literature.) (See [13a] for other results.) In the next two sections we shall study a few other important concepts on an introductory level e.g., the global aspects of the string world-sheet, the strings in background fields, and supersymmetry in string-theory.

Exercise 11.4 1 Show that the two-dimensional scalar curvature for a conformally flat metric is given by: (a)

R(2) = - — d+d . P

2. Show that it is not possible to determine the energy-momentum tensor uniquely from the flat world-sheet action (11.4.116), since one can always define a tensor: (a)

$T^k_{++} = \frac{i}{4}\left[(\partial_+ c^+ \cdot b_{++} - c^+\partial_+ b_{++}) + k\,\partial_+(c^+ b_{++})\right]$

which is conserved for any $k$. Using this expression, show further that the conformal dimension of any physical field $Z$ depends only on the ghost number of $Z$.

Hints to Exercise 11.4 1. For an arbitrary tensor t++ ... +, using the formulae given in (11.4.22) with p = e*, we can write: (•) [V_, V + ] f + + ... + = [
-n(d+d_logp)t++...+.

On the other hand, we also know that the RHS equals (see Sec. (0.3) for the definition of Riemannian curvature): (ii)

np R(2) t++ ... +


where /?( ) denotes the two-dimensional scalar curvature for a conformally flat metric. Comparison of (i) and (ii) establishes that Rw = _Ldd 0.

p 2. We recall that the energy-momentum tensor on a world-sheet with metric hap is defined as:

where S is some appropriate action. When we considered the ghost action (11.4.24) Sg=±jd2eJhhaPc?Vabpy we had obtained the energy-momentum tensor (11.4.27): (ii)

T(ci=jc+d+b++

+ {d+c+)b++

But in this particular case the action S in (11.4.116) stands for (iii)

S= — f

d2ac+d_b++

and thus using the formula given in (i) it is not practical to calculate the EM tensor. Now, the tensor given in (a) can be written as: (iv)

$T^k_{++} = \frac{i}{4}\left[(k + 1)\,\partial_+ c^+ \cdot b_{++} - (1 - k)\,c^+\partial_+ b_{++}\right]$

where $k$ is a parameter. In view of (11.4.31), which gives the ghost equations of motion $\partial_- c^+ = \partial_- b_{++} = 0$, we immediately note that $\partial_- T^k_{++} = 0$ for any value of $k$. We further note that when $k = 3$, (iv) reduces to (ii), which confirms the statement that $k = 3$ is the correct value for giving the EM tensor. From (iv), when $k$ is any number, due to linearity in $k$ the conformal dimension of $b$ is $(k + 1)/2$ and that of $c$ is $(1 - k)/2$. Moreover, the $k$-dependent piece in (a) is precisely the derivative of the ghost-number current $J_+ = c^+ b_{++}$; this can be interpreted to mean that the conformal dimension of any physical field $Z$ depends only on the ghost number of $Z$. Thus if $d_0(Z)$ denotes the conformal dimension of $Z$ at $k = 0$ and $g(Z)$ denotes its ghost number, then the conformal dimension $d_k(Z)$ of $Z$ at general $k$ is (v)

$d_k(Z) = d_0(Z) - \frac{k\,g(Z)}{2}$.

5 A FEW IMPORTANT TOPICS IN STRING THEORY

The heading of this section is rather controversial for two reasons: (i) the topics selected by us may not appear equally important to the reader; (ii) can we describe a theory as vast as this one via a few topics of our choice? We feel, however, that as we move on with the text, the skeptical reader will agree with our limited choices, and that our attempt merely to give a flavour of the theory, thereby motivating an inquisitive mind, will not be in vain. In an earlier section we have mostly dealt with the local theory of the string world-sheet; here we shall study some of the global aspects.

5.1 Global Aspects of String World-Sheets

We recall that the three independent components of a world-sheet metric $h_{\alpha\beta}$ can be gauged away via world-sheet reparametrization invariance and Weyl rescaling invariance; in other words, using these invariances the metric $h_{\alpha\beta}$ can be put in any prescribed form. We shall see that the theory of Riemann surfaces, which we studied in Sec. (1.3), will be very useful in examining these ideas from a global point of view. We use oriented closed strings for our illustrations. The world sheet of an $n$-loop closed-string diagram is a sphere with $n$ handles, which as we know is a (compact) Riemann surface of genus $n$.

Fig. 11.15 World sheet of a 4-looped closed string (as in (a)) shown as a sphere with 4 handles.

The following diagrams (Fig. (11.16)) represent the world sheets of a string that is a: (a) tree, (b) one-loop and (c) two-loop. The first of these has no handles and is topologically an ordinary sphere, while the second one corresponds to our familiar torus, and the third to a donut with two holes.

Fig. 11.16 World-sheet diagrams for oriented closed strings: (a) tree: sphere, (b) one-loop: torus, (c) two-loop: donut with two holes.

The distinction between these surfaces is noteworthy.

Remark 11.5.1 In the tree-level case we have a Riemann surface of genus 0, and according to a theorem due to Riemann, the metric of this surface can be put globally in a standard form by a diffeomorphism plus Weyl rescaling. Thus if $h_0$ denotes the spherical metric on $S^2$, then any other metric $h$ is, up to a diffeomorphism, of the form $h = e^{\phi}h_0$. We note that if one shows that there are no antighost zero modes on $S^2$, then one can further show that any infinitesimal deformation of $h_0$ gives another metric which is related to it by a diffeomorphism plus Weyl rescaling. From Secs. 3 and 4 of Chapter 1 we also recall that two Riemann surfaces that are related to each other by a reparametrization as well as by Weyl rescaling are conformally equivalent; hence according to Riemann's theorem, any two metrics on $S^2$ are conformally equivalent.


Our next remark deals with a Riemann surface of genus one and shows that two such surfaces are not, in general, conformally equivalent.

Remark 11.5.2 On a torus, two metrics are in general not globally equivalent up to a diffeomorphism plus Weyl rescaling. For instance, the two tori in Fig. (11.17) cannot be related in this way.

Fig. 11.17 Conformally inequivalent tori.

To understand the nature of this inequivalence analytically, we recall that a torus can be constructed by identifying the opposite sides of a parallelogram as shown in Fig. (11.18).

Fig. 11.18 A torus made by identifying the marked line segments in the complex z-plane.

We take this parallelogram to be part of the complex $z$-plane and take two complex numbers $\lambda_1$ and $\lambda_2$ such that

$\tau = \lambda_2/\lambda_1$

(11.5.1)

is not real. We assume that $\operatorname{Im}\tau > 0$ (this can always be arranged, if necessary, by exchanging $\lambda_2$ with $\lambda_1$; in (11.5.3) we have taken $\operatorname{Im}\tau > 0$), thus $\tau$ defines a point in the upper half plane.54 We next define a torus by identifying:

$z \sim z + n\lambda_1 + m\lambda_2$

(11.5.2)

for arbitrary integers $n$ and $m$, as shown in the above figure. We note that this torus inherits a flat metric from the $z$-plane. A natural question is whether or not the tori obtained by using different values of $\lambda_1$ and $\lambda_2$ are equivalent under diffeomorphisms plus Weyl rescalings. The answer to this question, as suggested by Fig. (11.17), is: no, in general. However, the variable $\tau$ obtained as the ratio $\lambda_2/\lambda_1$ is a conformal invariant, meaning that it does not change under diffeomorphisms plus Weyl rescalings. In Exc. 1 we shall see that only by defining a suitable group action:

$\tau \to \frac{a\tau + b}{c\tau + d}$

(11.5.3)

on $\tau$, we can obtain tori that are conformally equivalent; otherwise in general they are inequivalent. Hence the complex variable $\tau$ subject to the equivalence relation (11.5.3) is the only feature of the metric of a torus that cannot be absorbed in a diffeomorphism plus Weyl rescaling (see Subsec. 1 of Sec. (3.6)). In conclusion we have shown here that the world-sheet corresponding to a one-loop (oriented) closed string, which is a surface of genus 1, carries only one conformally invariant parameter.

54 It is hoped that the variable $\tau$ used here in keeping with the conventional usage will not be confused with the $\tau$ coordinate on the string world-sheet.

In the next remark we discuss the world-sheet that corresponds to a $g$-loop closed string. The world-sheet has $g$ handles and as such is a (Riemann) surface of genus $g$; we denote this surface as $\Sigma$.

Remark 11.5.3 In order to construct a surface of genus $(g + 1)$ from that of genus $g$, all we need to do is to attach a handle to the given surface as shown in Fig. (11.19)(a).

Fig. 11.19 Adding a handle to a surface of genus $g$ to make a surface of genus $g + 1$.

This is done in two stages: we puncture two holes as in (a) and attach two half-tubes of a given length to these holes, and then glue the ends together as in (b). The gluing of these ends, as seen in (c), involves an ambiguous relative angle which we call the twist angle. In obtaining the surface with an additional handle, we have introduced six new real parameters (four to specify the positions of the two punctures, one for the length of the tube, and one for the 'twist'), or equivalently three complex parameters. Thus if $B_g$ stands for the number of conformally invariant (complex) parameters that describe a surface $\Sigma$ of genus $g$, then the above discussion implies the equality:

$B_{g+1} = B_g + 3$

(11.5.4)

It is important to note here that (11.5.4) holds only for $g \ge 2$. For in the case of $g = 0$, due to continuous symmetry, all positions of the punctures are equivalent; we are thus left with only two real parameters, pertaining to the length of the tube and the twist. Hence when $g = 0$, the above equality reduces to

$B_1 = B_0 + 1 = 1$

(11.5.5)

which confirms our findings about the complex parameter $\tau$ (see Remark (11.5.2)). To construct a surface of genus 2 from the surface of genus 1, we note that the latter has only the rigid translations of the $z$-plane for continuous symmetries; thus while the position of one puncture can be shifted away, the other puncture accounts for one invariant complex parameter, hence we have $B_2 = B_1 + 2 = 3$. The surfaces of genus 2 or greater have no continuous symmetries, hence (11.5.4) implies that the number of conformally invariant parameters on a surface of genus $g$ ($g \ge 2$) is:

$B_g = 3g - 3$

(11.5.6)

($B_3 = B_2 + 3 = 3 + 3$; $B_4 = B_3 + 3 = 6 + 3$; $B_5 = B_4 + 3 = 9 + 3 = 3 \cdot 5 - 3$.)
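The handle-counting argument above can be replayed step by step. The following is a minimal sketch, with an illustrative function name that is not taken from the text; it assumes $B_0 = 0$ (the sphere carries no conformally invariant parameter, as in Remark 11.5.1) and adds 1, 2 or 3 complex parameters per handle according to how many continuous symmetries the starting surface has.

```python
def moduli_count(genus):
    """Number B_g of complex conformally invariant parameters, built handle by handle."""
    if genus < 0:
        raise ValueError("genus must be non-negative")
    count = 0  # B_0 = 0 for the sphere
    for g in range(genus):
        if g == 0:
            count += 1  # sphere: both puncture positions are removed by symmetry, (11.5.5)
        elif g == 1:
            count += 2  # torus: rigid translations remove one puncture position
        else:
            count += 3  # genus >= 2: no continuous symmetries, (11.5.4)
    return count


if __name__ == "__main__":
    for g in range(7):
        print(g, moduli_count(g))
```

For $g \ge 2$ the values produced by this recursion coincide with the closed form $3g - 3$ of (11.5.6), while $g = 0$ and $g = 1$ are the two exceptional cases.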

Before closing this subsection we discuss in the remark below the global symmetry of the string world-sheet that pertains to ghosts and antighosts.

Remark 11.5.4 In the previous section when we wrote the gauge-fixed path integral (11.4.13), we mentioned that the presence of ghosts was required there for cancellation of the Virasoro anomaly and that this happened only when $D = 26$. We pursue these ideas a little further by asking whether $c^+$ and $b_{++}$ have normalizable zero modes on the string world-sheet. For this we consider the equations:

$\nabla_- c^+ = 0, \qquad \nabla_+ c^- = 0$

(11.5.7)

$\nabla_- b_{++} = 0, \qquad \nabla_+ b_{--} = 0$

(11.5.8)

and recall that the transformation law of the metric under infinitesimal coordinate reparametrizations $\sigma^\alpha \to \sigma^\alpha + \xi^\alpha$ is given by:

$\delta h_{++} = \nabla_+\xi_+, \qquad \delta h_{--} = \nabla_-\xi_-$

(11.5.9)

We compare (11.5.7) and (11.5.9) and note that a zero mode of $c$ is the generator of a conformal symmetry, that is, a world-sheet reparametrization that changes the metric to its own multiple. Obviously this change in the metric can be absorbed in a Weyl rescaling. At tree level the world-sheet is only a sphere, hence its stereographic projection onto the complex plane followed by the choice $z = \tau + i\sigma$, $\bar{z} = \tau - i\sigma$ reduces the Eq. (11.5.7) for $c^+$ to:

$\frac{\partial}{\partial\bar{z}}\,c^+ = 0$

(11.5.10)

The above equation implies that $c^+$ must be an analytic function of $z$ (see Chap. 1). According to our discussions on an analytic function $\epsilon(z)$ in Sec. (1.4), it follows that the conformal symmetry generated by $c^+(z)\,\frac{d}{dz}$ would have no pole at infinity only if $c^+(z)$ grows at infinity at most like $z^2$. This means that (11.5.10) has three solutions, namely $c^+ = 1$, $c^+ = z$, $c^+ = z^2$. The conformal transformations that result from these solutions generate a closed Lie algebra. In fact it is the familiar Lie algebra sl(2,
(11.5.11)

where R is the scalar curvature of the surface. We multiply the RHS by c + (the complex conjugate of c+) and integrate over Z to obtain: 0 = J£ c+*(V+V_ + V_V+ + R)c+ = - j z (|V_c + | 2 + | V + c + | 2 - R\c+\2)

(11.5.12)

742 Mathematical Perspectives on Theoretical Physics

When the genus is 1, from Remark (11.5.2) we know the surface is a torus with flat metric, so $R = 0$, and the vanishing of the other two terms (separately) implies that $c^{+}$ is covariantly constant, or rather simply constant, since the metric is flat. This means that there is precisely one normalizable ghost zero mode on a surface of genus 1. Hence the only conformal symmetry on a surface with $g = 1$ (in this case a torus) is the rigid translation $z \to z + \lambda$, with complex $\lambda$. When the genus is greater than one, it can be shown that $\Sigma$ admits a metric with negative scalar curvature everywhere; from (11.5.12) it is evident that in this case $c^{+} = 0$. Hence there are no normalizable ghost zero modes on a surface of genus greater than one. Thus if $C_g$ denotes the number of normalizable ghost zero modes on a surface of genus $g$, then in effect we have shown that:

$$C_{0} = 3, \qquad C_{1} = 1, \qquad C_{g} = 0 \ \text{ for } g \ge 2 \tag{11.5.13}$$
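The count $C_0 = 3$ can be made concrete by checking that the three sphere solutions $1, z, z^{2}$ close under commutation when they are treated as vector fields on the plane. The labels $\ell_{-1}, \ell_{0}, \ell_{1}$ and the sign convention $\ell_n = -z^{n+1}\partial_z$ are assumptions introduced here for illustration, not notation taken from the text.

```latex
% Assumed convention: \ell_n = -z^{n+1}\,\partial_z, so that
% c^{+} = 1, z, z^{2} correspond to n = -1, 0, 1 (up to sign).
\begin{align*}
[\ell_m,\ell_n]\,f
  &= z^{m+1}\partial_z\!\left(z^{n+1}\partial_z f\right)
   - z^{n+1}\partial_z\!\left(z^{m+1}\partial_z f\right) \\
  &= \bigl[(n+1)-(m+1)\bigr]\,z^{m+n+1}\,\partial_z f \\
  &= (m-n)\,\ell_{m+n}\,f .
\end{align*}
% For m, n in {-1, 0, 1}:
%   [\ell_0,\ell_{-1}] = \ell_{-1}, \qquad
%   [\ell_0,\ell_{1}]  = -\ell_{1}, \qquad
%   [\ell_1,\ell_{-1}] = 2\,\ell_0 .
```

No power of $z$ higher than $z^{2}$ is generated, so the three vector fields span a closed three-dimensional algebra with the commutation relations of $sl(2, \mathbb{C})$.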

In order to discuss the normalizable antighost zero modes, we note that in view of Remarks (11.5.2) and (11.5.3) every surface of genus > 0 carries conformally inequivalent metrics. Treating this as a qualitative feature of a world-sheet surface, we now examine it on a quantitative basis. For this we choose a background metric $h_{\alpha\beta}$ and look for the conditions under which a general perturbation $\delta h_{\alpha\beta}$ of $h$ can be absorbed in a reparametrization plus Weyl rescaling. In the local coordinate system, as we already know, $h_{++} = h_{--} = 0$ and $h_{+-} = e^{\phi}$, which means that $\delta h_{+-}$ can be absorbed in a Weyl rescaling; the question is whether $\delta h_{++}$ and $\delta h_{--}$ can be absorbed in a diffeomorphism. As $\delta h_{--}$ is the complex conjugate of $\delta h_{++}$, from (11.5.9) it follows that $\delta h_{++}$ (and therefore $\delta h_{--}$) can be absorbed in a diffeomorphism if and only if there is some globally defined $\xi_{+}$ with $\delta h_{++} = \nabla_{+}\xi_{+}$. Failing this, we have that

$$S = \int_{\Sigma}\left|\delta h_{++} - \nabla_{+}\xi_{+}\right|^{2} \tag{11.5.14}$$

is nonzero for all
(11.5.15)

If we set $b_{++} = \delta h_{++} - \nabla_{+}\xi_{+}$, we notice that the above equation is the equation (11.5.8) for antighost zero modes. In fact antighost zero modes are in one-one correspondence with the choices of $\delta h_{++}$ for which (11.5.14) does not vanish. This means that for a surface (world-sheet) $\Sigma$ the antighost zero modes count the deformations of its metric that cannot be absorbed in reparametrizations plus Weyl rescaling. In short, the number of antighost zero modes on a surface of genus $g$ is the number $B_g$ that we defined in (11.5.6). Finally we note that though $C_g$ and $B_g$ are not regular functions separately when $g$ is small, their difference $C_g - B_g = A_g$ is a smooth function of $g$:

$$A_{g} = C_{g} - B_{g} = -3(g-1) \tag{11.5.16}$$
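A short tabulation makes the statement about smoothness explicit. The values of $C_g$ are those of (11.5.13); the entries for $B_g$ are inferred here from $B_g = C_g + 3(g-1)$, which is simply (11.5.16) rearranged, so the table is a consistency sketch rather than an independent computation.

```latex
% B_g is inferred from (11.5.13) and (11.5.16): B_g = C_g + 3(g-1).
\begin{array}{c|c|c|c}
g        & C_g & B_g    & C_g - B_g = -3(g-1) \\ \hline
0        & 3   & 0      & 3        \\
1        & 1   & 1      & 0        \\
2        & 0   & 3      & -3       \\
g \ge 2  & 0   & 3g - 3 & -3(g-1)
\end{array}
```

Each of $C_g$ and $B_g$ changes its functional form at small genus, while the difference is linear in $g$ throughout.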

The difference $A_g$ is given by a classical theorem known as the Riemann-Roch theorem; the modern generalization of this theorem is the famous Atiyah-Singer index theorem. Both theorems, along with their consequences, have been studied widely over the years. The reader can find the literature on these in Palais, Ref. [Ad], and in 10.[35], among other references. The study for closed strings has analogs for open strings and world-sheets with boundary. For instance, for open strings at one-loop level, we have a world-sheet that is topologically a cylinder, and another one that is a twisted cylinder (Möbius strip).


See Fig. (11.20)(a) and (11.20)(c) below.

Fig. (11.20) (a), (b) and (c).

It is well known that any metric on the cylinder in Fig. (11.20)(a) is conformally equivalent to the standard flat metric on the annulus $\lambda_1 < |z| < \lambda_2$ in the complex $z$-plane, for some $\lambda_1$ and $\lambda_2$, as in (b).$^{55}$ By scaling $z \to z/\lambda_2$, we can set $\lambda_2 = 1$, so the conformal structure of the annulus is described by a real parameter $x = \lambda_1/\lambda_2$ that ranges between 0 and 1. Using similar arguments, we obtain a one-parameter description of the open-string world-sheet given in (c).

5.2 Effect of Non-flat Metric on the Propagation of a String

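We consider a string propagating on a general 26-dimensional manifold $M$, with metric tensor $g_{\mu\nu}$. The string action $S_0$ of flat Minkowski space with metric $\eta_{\mu\nu}$,

$$S_{0} = -\frac{1}{4\pi\alpha'}\int d^{2}\sigma\,\sqrt{h}\,h^{\alpha\beta}\,\partial_{\alpha}X^{\mu}\,\partial_{\beta}X^{\nu}\,\eta_{\mu\nu} \tag{11.5.17}$$

is now replaced by the action:

$$S = -\frac{1}{4\pi\alpha'}\int d^{2}\sigma\,\sqrt{h}\,h^{\alpha\beta}\,\partial_{\alpha}X^{\mu}\,\partial_{\beta}X^{\nu}\,g_{\mu\nu}(X^{\rho}) \tag{11.5.18}$$

Suppose that $g_{\mu\nu}$ differs from $\eta_{\mu\nu}$ by a perturbation $f_{\mu\nu}$:

$$g_{\mu\nu}(X^{\rho}) = \eta_{\mu\nu} + f_{\mu\nu}(X^{\rho}) \tag{11.5.19}$$

then it is easy to check that the world-sheet path integral

$$Z_{0} = \int DX^{\mu}\,Dh_{\alpha\beta}\,e^{-S_{0}} \tag{11.5.20}$$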

pertaining to (11.5.17), when derived from the action (11.5.18) (using (11.5.19)), can be written as:

$$Z = \int DX^{\mu}\,Dh_{\alpha\beta}\,e^{-S} = \int DX^{\mu}\,Dh_{\alpha\beta}\,e^{-S_{0}} \times \left(1 + \left\{\frac{1}{4\pi\alpha'}\int d^{2}\sigma\,\sqrt{h}\,h^{\alpha\beta}\,\partial_{\alpha}X^{\mu}\,\partial_{\beta}X^{\nu}\,f_{\mu\nu}(X^{\rho})\right\} + \frac{1}{2}\{\ \}^{2} + \cdots\right) \tag{11.5.21}$$

Here $V = \dfrac{1}{4\pi\alpha'}\displaystyle\int d^{2}\sigma\,\sqrt{h}\,h^{\alpha\beta}\,\partial_{\alpha}X^{\mu}\,\partial_{\beta}X^{\nu}\,f_{\mu\nu}(X^{\rho}) = \{\ \}$ is the vertex operator for emission of a graviton of wave function $f_{\mu\nu}(X^{\rho})$. We note that an insertion of $V$ in the Minkowskian path integral $Z_0$ given in (11.5.20) would represent the interaction of strings with an external graviton of wave function $f_{\mu\nu}$, whereas the insertion of $e^{V}$ in this integral would describe the interaction with a coherent state of gravitons in this wave function, and it is this that corresponds to string propagation in the metric $g_{\mu\nu} = \eta_{\mu\nu} + f_{\mu\nu}$.

$^{55}$ The two boundaries in (b) can be changed by the conformal mapping of the complex plane $z \to Z$, which changes the metric only by Weyl rescaling.

In the following remark we give some simple properties of (11.5.18).

Remark 11.5.5 Both actions (11.5.17) and (11.5.18) represent two-dimensional quantum field theories, but with a significant difference. Namely, while (11.5.17) becomes a free field theory in the conformal gauge $h_{\alpha\beta} = \eta_{\alpha\beta}$, (11.5.18) does not. In fact the action in this gauge, given as:

$$S' = -\frac{1}{4\pi\alpha'}\int d^{2}\sigma\,\partial_{\alpha}X^{\mu}\,\partial^{\alpha}X^{\nu}\,g_{\mu\nu}(X^{\rho}) \tag{11.5.22}$$

describes a nontrivial quantum field theory known as the nonlinear sigma model. Just as in the case of flat Minkowski space, the action (11.5.22) has to be supplemented by Virasoro conditions:

$$T_{\alpha\beta} = 0 \tag{11.5.23}$$

which (as we know) are conjugate to the gauge choice $h_{\alpha\beta} = \eta_{\alpha\beta}$. Since $S'$ is invariant under rescalings or conformal mappings of the $\sigma$'s, we have at the classical level:

$$T_{+-} = 0 \tag{11.5.24}$$

just as we had in the case of string propagation in flat space (see Subsec. (3.1)). This leads to two sets of Virasoro conditions:

$$T_{++} = T_{--} = 0 \tag{11.5.25}$$

and these are enough to eliminate the modes of negative norm and eventually give an interesting theory. We further note that even in the case of flat Minkowski space ($g_{\mu\nu} = \eta_{\mu\nu}$), where (11.5.18) reduces to a free theory, there can be an anomaly in $T_{+-}$ if the string world-sheet is curved. As we already know, this arises in $D \ne 26$ and is of the form $T_{+-} \sim R$, $R$ being the scalar curvature of the world-sheet. Therefore if in formulating (11.5.18) the world-sheet geometry is known, then $R$ is a definite c-number function of the world-sheet coordinates $\sigma$ and $\tau$; this implies that the conformal anomaly described in (Subsec. 4.4) is simply a c-number. When $g_{\mu\nu} \ne \eta_{\mu\nu}$ and (11.5.18) describes a nonlinear theory, the anomaly in $T_{+-}$ is no longer so simple; in this case it is called a q-number anomaly of the theory. To follow the usual practice in the literature, we have explicitly used the parameter $\alpha'$ instead of using $\alpha' = \frac{1}{2}$ in $S'$.
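The expansion (11.5.21) follows in one line once the perturbation $f_{\mu\nu}$ is split off from the action. The sketch below assumes the overall normalization $-1/(4\pi\alpha')$ for (11.5.17) and (11.5.18) and the weight $e^{-S}$ in the path integral; with a different normalization only the prefactor of $V$ would change.

```latex
% Assumption: S and S_0 share the prefactor -1/(4\pi\alpha'),
% and g_{\mu\nu} = \eta_{\mu\nu} + f_{\mu\nu} as in (11.5.19).
\begin{align*}
S &= S_0 - V, \qquad
V \equiv \frac{1}{4\pi\alpha'}\int d^2\sigma\,\sqrt{h}\,h^{\alpha\beta}\,
   \partial_\alpha X^\mu\,\partial_\beta X^\nu\, f_{\mu\nu}(X^\rho), \\
e^{-S} &= e^{-S_0}\,e^{V}
       = e^{-S_0}\Bigl(1 + V + \tfrac{1}{2}V^2 + \cdots\Bigr).
\end{align*}
```

The term linear in $V$ is a single graviton insertion, while the full exponential resums arbitrarily many insertions, which is the coherent-state interpretation used in the text.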


5.3 Breakdown of Weyl Invariance and Beta Functions

Very often scale invariance breaks down in (11.5.22), since there is no way to regularize the theory while preserving world-sheet scale or conformal invariance. Neither of the two standard methods, Pauli-Villars regularization and dimensional regularization (see Chapter 9 and Sec. (10.6) for the concept of regularization), serves the purpose completely. The former, where one subtracts the contribution of a massive regulator field from loops, is unable to maintain scale invariance; similarly the latter violates scale invariance in general, as the nonlinear sigma model (11.5.22) is scale invariant only in two dimensions. As the regularization needs to be done in any case, dimensional regularization is usually adopted and the resulting breakdown of scale invariance is handled via the so-called beta ($\beta$) functions. In the following remark we describe a $\beta$ function and show that Weyl invariance implies global scale invariance, which in turn implies the vanishing of the $\beta$ function.

Remark 11.5.6 In quantum field theory a nonzero $\beta$ function arises from ultraviolet divergences in Feynman diagrams; in the case of string theory, it is related to Weyl invariance (i.e., if $S'$ is formulated on a curved world-sheet, we ask whether it is Weyl invariant), thus the vanishing of the $\beta$ function here means ultraviolet finiteness.* We compute the one-loop $\beta$ function from (11.5.22), treating this action as representing a quantum field theory with quantum field $X^{\mu}(\sigma, \tau)$. We denote a vacuum expectation value for this field as $X_0^{\mu}$, and expand the quantum field around it, thus

$$X^{\mu}(\sigma, \tau) = X_{0}^{\mu} + x^{\mu}(\sigma, \tau) \tag{11.5.26}$$

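with $x^{\mu}$ being the quantum fluctuation. Since the action $S'$ is a geometrical expression, it is invariant under redefinition of the field variables $X^{\mu} \to X^{\mu}(X^{\rho})$ as long as the space-time metric tensor $g_{\mu\nu}$ is also appropriately transformed. We assume that on the space-time manifold $M$ the coordinates $X^{\mu}$ are locally inertial at the point $X_0^{\rho}$, so the metric $g_{\mu\nu}(X^{\rho})$ equals the Minkowski metric $\eta_{\mu\nu}$ at $X^{\mu} = X_0^{\mu}$ and differs from it only in the order $(x^{\mu})^2$. Furthermore if the coordinates $X^{\mu}$ are Riemann normal coordinates, we can expand $g_{\mu\nu}$ in terms of $\eta_{\mu\nu}$, the Riemann tensor $R_{\mu\lambda\nu\kappa}(X_0^{\rho})$ (at the point $X_0^{\rho}$), and the derivative $D_{\rho}R_{\mu\lambda\nu\kappa}$, neglecting terms of order higher than four in $x^{\mu}$.$^{57}$ This simplifies (11.5.22) to the form:

$$S' = -\frac{1}{4\pi\alpha'}\int d^{2}\sigma\left(\partial_{\alpha}x^{\mu}\,\partial^{\alpha}x_{\mu} - \frac{1}{3}R_{\mu\lambda\nu\kappa}(X_0^{\rho})\,(\partial_{\alpha}x^{\mu}\,\partial^{\alpha}x^{\nu})\,x^{\lambda}x^{\kappa} + O(x^{5})\right) \tag{11.5.27}$$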

From (11.5.22) we note that it is in the limit of very small $\alpha'$ that $S'$ is large while the quantum corrections are small (recall that quantum mechanical perturbation theory is here an expansion in powers of $\alpha'$). If we rescale the metric $g_{\mu\nu} \to t^{2}g_{\mu\nu}$, then large $t$ is equivalent to small $\alpha'$. Since all lengths on $M$ are rescaled by a factor of $t$, we can say that large $t$ is the limit in which the size of $M$ is very large in units of $\sqrt{\alpha'}$; hence if we denote by $r$ the characteristic length or 'radius' of $M$, then we can view $\sqrt{\alpha'}/r$ as the dimensionless parameter. The interesting part of this argument is that the expansion in powers of $\sqrt{\alpha'}/r$ is equivalent to an expansion in powers of $x$ in (11.5.27), as the curvature tensor of $M$ is of order $1/r^{2}$. Thus the lowest-order counterterm is obtained by contracting two of the $x^{\mu}$'s that appear in the quartic term in (11.5.27).

* This ultraviolet finiteness is sometimes obtained only modulo wave function renormalization.

$^{57}$ $g_{\mu\nu}(X^{\rho}) = \eta_{\mu\nu} - \frac{1}{3}R_{\mu\lambda\nu\kappa}(X_0^{\rho})\,x^{\lambda}x^{\kappa} - \frac{1}{6}D_{\rho}R_{\mu\lambda\nu\kappa}\,x^{\rho}x^{\lambda}x^{\kappa} + O((x^{\mu})^{4})$.


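In dimensional regularization poles come only from logarithmically divergent integrals. The only logarithmically divergent integral that one has to worry about comes from the contraction $\langle x^{\lambda}x^{\kappa}\rangle$, as the contraction $\langle\partial_{\alpha}x^{\mu}\,\partial^{\alpha}x^{\nu}\rangle$ can be discarded in dimensional regularization.$^{58}$ In $2 + \epsilon$ dimensions this integral is:

$$\langle x^{\lambda}(\sigma)\,x^{\kappa}(\sigma')\rangle\Big|_{\sigma'\to\sigma} = \pi\eta^{\lambda\kappa}\lim_{\sigma'\to\sigma}\int\frac{d^{2+\epsilon}k}{(2\pi)^{2+\epsilon}}\,\frac{e^{ik\cdot(\sigma-\sigma')}}{k^{2}} \tag{11.5.28}$$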

The infinite term in the one-loop effective action is thus given as:

$$\Delta S = \frac{1}{12\pi\epsilon}\int d^{2}\sigma\,\partial_{\alpha}X^{\mu}\,\partial^{\alpha}X^{\nu}\,R_{\mu\nu}(X^{\rho}) \tag{11.5.29}$$

where $R_{\mu\nu} = R^{\lambda}{}_{\mu\lambda\nu}$ is the Ricci tensor of $M$. The RHS of (11.5.29) is the counterterm that has to be subtracted from (11.5.22) to obtain a finite theory. In quantum field theory a $\beta$ function is obtained from the $1/\epsilon$ poles. In this case the field theory given by (11.5.22) depends not on an arbitrary coupling constant but on a set of arbitrary coupling functions $g_{\mu\nu}(X^{\rho})$. The one-loop infinity (11.5.29) corresponds to a renormalization of these coupling functions, which is described by the one-loop beta functional:

$$\beta_{\mu\nu}(X^{\rho}) = -\frac{1}{4\pi}R_{\mu\nu}(X^{\rho}) \tag{11.5.30}$$

From Equation (11.5.30) it follows that the condition for the vanishing of the one-loop beta functional (equivalently the one-loop counterterm) is that:

$$R_{\mu\nu}(X^{\rho}) = 0 \tag{11.5.31}$$
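The passage from the quartic term of (11.5.27) to the Ricci tensor in (11.5.29) is a single index contraction. In the sketch below the coincident-limit propagator is written as $\langle x^{\lambda}x^{\kappa}\rangle \to P(\epsilon)\,\eta^{\lambda\kappa}$, where $P(\epsilon)$ is a placeholder introduced here for the pole factor, and the normalization $1/(4\pi\alpha')$ of the action is assumed.

```latex
% Quartic term of the action under the assumed normalization:
%   +\frac{1}{4\pi\alpha'}\cdot\frac{1}{3}\,R_{\mu\lambda\nu\kappa}\,
%    \partial_\alpha x^\mu\,\partial^\alpha x^\nu\, x^\lambda x^\kappa
\begin{align*}
R_{\mu\lambda\nu\kappa}\,\langle x^\lambda x^\kappa\rangle
  &\to P(\epsilon)\,\eta^{\lambda\kappa}R_{\mu\lambda\nu\kappa}
   = P(\epsilon)\,R^{\lambda}{}_{\mu\lambda\nu}
   = P(\epsilon)\,R_{\mu\nu}, \\
\Delta S
  &= \frac{P(\epsilon)}{12\pi\alpha'}\int d^2\sigma\,
     \partial_\alpha X^\mu\,\partial^\alpha X^\nu\,R_{\mu\nu}(X^\rho).
\end{align*}
% Matching the coefficient 1/(12\pi\epsilon) of (11.5.29) requires
% P(\epsilon) = \alpha'/\epsilon, i.e. 1/(2\epsilon) when \alpha' = 1/2.
```

The contraction uses only the pair antisymmetries of the Riemann tensor, $R_{\mu\lambda\nu\kappa} = R_{\lambda\mu\kappa\nu}$, so the structure of the counterterm does not depend on the precise value of the pole factor.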

In view of the above arguments and the Hint to Exc. 3, we conclude that the action (11.5.18) gives a Weyl invariant quantum theory if and only if $R_{\mu\nu} = 0$. But this is the Einstein equation (in vacuum), hence we reaffirm our assertion that all physical theories converge to the 'theory of strings.' The following remark, which can legitimately be captioned 'stringy corrections to general relativity', further supports our point (see [13a]).

Remark 11.5.7 We note that if a $\beta$ function is computed including the two-loop contributions as well, it becomes

$$\beta_{\mu\nu}(X^{\rho}) = -\frac{1}{4\pi}\left(R_{\mu\nu} + \frac{\alpha'}{2}R_{\mu\kappa\lambda\tau}R_{\nu}{}^{\kappa\lambda\tau}\right) \tag{11.5.32}$$

where the second term is written using the Riemann normal coordinates. Since Einstein's equation $R_{\mu\nu} = 0$ corresponds to the vanishing of $\beta_{\mu\nu}$, it follows that the second term, which vanishes for $\alpha' \to 0$, or equivalently when the radius $r$ of $M$ becomes very large compared to $\sqrt{\alpha'}$, can be viewed as a ('stringy') correction to Einstein's equation, in other words to general relativity. The next subsection shows the relation between Weyl invariance, the vertex operators and the concept of classical solutions in string theory.

$^{58}$ The contraction $\langle\partial x^{\mu}\,\partial x^{\nu}\rangle$ gives a quadratically divergent integral, and though it can be neglected in dimensional regularization, its existence has a physical significance, as it can be related to the possibility of adding to (11.5.27) non-derivative couplings that correspond to an expectation value of the tachyon field.
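The relative size of the two-loop term can be estimated by simple power counting. The sketch assumes that the two-loop beta function has the form $-\frac{1}{4\pi}\bigl(R_{\mu\nu} + \frac{\alpha'}{2}R_{\mu\kappa\lambda\tau}R_{\nu}{}^{\kappa\lambda\tau}\bigr)$ and that every curvature component scales as $1/r^{2}$, with $r$ the 'radius' of $M$; both are working assumptions for the estimate.

```latex
\begin{align*}
R_{\mu\nu} &\sim \frac{1}{r^{2}}, \qquad
\frac{\alpha'}{2}\,R_{\mu\kappa\lambda\tau}R_{\nu}{}^{\kappa\lambda\tau}
  \sim \frac{\alpha'}{2\,r^{4}}, \\
\frac{\text{two-loop term}}{\text{one-loop term}}
  &\sim \frac{\alpha'}{2\,r^{2}}
   = \frac{1}{2}\left(\frac{\sqrt{\alpha'}}{r}\right)^{2}
   \;\longrightarrow\; 0
   \quad\text{for } r \gg \sqrt{\alpha'} .
\end{align*}
```

The correction is therefore controlled by the same dimensionless parameter $\sqrt{\alpha'}/r$ that organizes the sigma-model perturbation expansion.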


5.4 Weyl Invariance and Vertex Operators

In order to show the significant role played by vertex operators in the Weyl invariance of (11.5.18), which as we know corresponds to finding a classical solution, we begin with fields $\Phi^{k}$, $k = 1, \ldots$, and describe a vacuum state by choosing vacuum expectation values:

$$\Phi_{0}^{k} = \langle\Phi^{k}\rangle \tag{11.5.33}$$

We then write the field $\Phi^{k}$ as a sum of $\Phi_0^{k}$ and the quantum fluctuation $\phi^{k}$:

$$\Phi^{k} = \Phi_{0}^{k} + \phi^{k} \tag{11.5.34}$$

The vacuum expectation values of the product of these $\phi^{k}$'s can then be computed to describe the scattering amplitudes:

$$A_{n} = \langle\phi^{k_1}\phi^{k_2}\cdots\phi^{k_n}\rangle \tag{11.5.35}$$

In string theory, corresponding to each field $\Phi^{k}$, there is a vertex operator $V^{k}$, thus (11.5.35) in terms of the $V^{k}$'s becomes:

$$\tilde A_{n} = \langle V^{k_1}V^{k_2}\cdots V^{k_n}\rangle \tag{11.5.36}$$

Although $\tilde A_n$ looks similar to $A_n$, it is not the same, since $\tilde A_n$ is computed on the string world-sheet, whereas $A_n$ is computed in space-time. We note, however, that both $A_n$ and $\tilde A_n$ describe scattering amplitudes for $n \ge 4$, vertex corrections for $n = 3$ and mass shifts for $n = 2$. Our interest here lies mainly in the case $n = 1$. In field theory, the expectation value $A_n$ for $n = 1$ is of fundamental importance, since its vanishing

$$A_{1} = \langle\phi^{k}\rangle = 0, \qquad k = 1, 2, \ldots \tag{11.5.37}$$

is the statement that the vacuum state around which we are expanding, with expectation values $\Phi_0^{k}$ of the respective quantum fields $\Phi^{k}$, is in fact a solution of the classical field equations. At the quantum level, this vacuum state is an extremum of the effective potential. Based on this concept, its analog in string theory would be: if $V$ denotes the vertex operator corresponding to a physical state, then the vanishing of the expectation value:

$$\langle V\rangle = 0 \tag{11.5.38}$$

is the condition that the vacuum state be a classical solution in string theory (or be an extremum of the effective potential). We shall show that (11.5.38) is indeed the condition for existence of a classical solution in string theory. The condition is a consequence of world-sheet conformal invariance at the tree level, for it is at the tree level that a conformally invariant nonlinear sigma model (11.5.18) or (11.5.22) corresponds to a solution of string theory at the classical level. We recall from the previous subsection that for closed strings, the world-sheet at tree level is a sphere which can be stereographically projected to the $x$-$y$ plane. While computing the expectation value $\langle V\rangle$, we assume that the vertex operator $V$ is inserted on the plane at the origin $(0, 0)$. Now conformal invariance of (11.5.22) implies in particular the invariance under the scaling transformation:

$$x \to \lambda x, \qquad y \to \lambda y \tag{11.5.39}$$

and since a physical closed-string vertex operator $V$ has dimension two (see Remarks (11.2.12-13) and Ftn. 11), it transforms under (11.5.39) as

$$V \to \lambda^{-2}V. \tag{11.5.40}$$

Invariance under the transformation further implies

$$\langle V\rangle = \langle\lambda^{-2}V\rangle. \tag{11.5.41}$$

Since $\lambda$ is arbitrary, this equation leads to $\langle V\rangle = 0$, which in turn establishes that (11.5.38) is the condition for a classical solution. The following two remarks on vertex operators are noteworthy.

Remark 11.5.8 All of the vertex operators in (11.5.36) are the vertex operators of on-shell physical states. The operators k, and their (tadpoles') summation shifts the invalid vacuum state (around which $\Phi^{k}$ is being expanded) to a valid one, provided there is a nearby valid vacuum state to shift to. In string theory such tadpoles are automatically included in any calculation since such an addition to a propagating string does not change the topology (see Fig. 11.21). (See Chap. 3 in [13a] for more details.)

Fig. 11.21 (a) A 'tadpole' insertion on a Feynman diagram; (b) world-sheet of a string after the insertion of a tadpole, showing that there is no change in topology.

Remark 11.5.9 Both graviton and dilaton vertex operators are of the form

$$V = \zeta_{\mu\nu}\,\partial_{\alpha}X^{\mu}\,\partial^{\alpha}X^{\nu}\,e^{ik\cdot X} \tag{11.5.42}$$

where $k\cdot X \equiv k_{\mu}X^{\mu}$, and $k^{2} = 0$. The polarization tensor $\zeta_{\mu\nu}$ is symmetric and satisfies:

$$k^{\mu}\zeta_{\mu\nu} = 0 \tag{11.5.43}$$

so that (11.5.42) may have the correct conformal dimension. The tensor $\zeta_{\mu\nu}$ is traceless in the case of the graviton, since the trace describes a spin-zero state. In the case of the dilaton, $\zeta_{\mu\nu}$ can be taken to be $\sim \eta_{\mu\nu}$ in (11.5.42), but $\eta_{\mu\nu}$ does not satisfy (11.5.43); we therefore write:

$$\zeta_{\mu\nu} = \eta_{\mu\nu} - k_{\mu}\bar k_{\nu} - k_{\nu}\bar k_{\mu} \tag{11.5.44}$$


where $\bar k^{\mu}$ is an arbitrary vector satisfying $\bar k\cdot\bar k = 0$ and $k\cdot\bar k = 1$. The last two terms correspond to the longitudinal part that decouples in physical processes, leaving the final equation unaffected. Thus except at $k^{\mu} = 0$, the graviton and dilaton conditions are distinct and so are the corresponding vertex operators, although they share the same form (11.5.42) to begin with.
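It is straightforward to confirm that the dilaton polarization satisfies the transversality condition (11.5.43). The sketch assumes the form $\zeta_{\mu\nu} = \eta_{\mu\nu} - k_{\mu}\bar k_{\nu} - k_{\nu}\bar k_{\mu}$ with $k^{2} = 0$, $\bar k\cdot\bar k = 0$ and $k\cdot\bar k = 1$; the symbol $\zeta$ for the polarization tensor and the bar on the auxiliary vector are notational choices made here.

```latex
\begin{align*}
k^\mu\zeta_{\mu\nu}
  &= k_\nu - k^2\,\bar k_\nu - (k\cdot\bar k)\,k_\nu
   = k_\nu - 0 - k_\nu = 0, \\
\eta^{\mu\nu}\zeta_{\mu\nu}
  &= D - 2\,(k\cdot\bar k) = D - 2 .
\end{align*}
```

The nonvanishing trace is what separates this polarization from the traceless graviton polarization, even though both enter the same vertex operator (11.5.42).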

Exercise 11.5
1. Show that the equivalence relation (11.5.3) given in terms of integer-valued matrices of determinant 1 (that belong to the modular group $SL(2, Z)$) defines conformally equivalent tori.
2. Using the fact that the number of antighost zero modes on a surface of genus $g$ is $B_g$, obtain $B_g$ explicitly in the case of $g = 0$ and $g = 1$.
3. Show that the action (11.5.18) possesses Weyl symmetry at the one-loop level by using the gauge choice $h_{\alpha\beta} = e^{\phi}\eta_{\alpha\beta}$ and the dimension $(2 + \epsilon)$ (for the theory), where $\epsilon \to 0$ in the limit.

Hints to Exercise 11.5
1. As mentioned in the text, the torus inherits the metric of the complex $z$-plane, which is simply (see Chapter 1):

(i) $ds^{2} = dz\,d\bar z$.

Thus a complex rescaling of $z$

(ii) $z \to z' = kz$

where $k$ is a nonzero complex number, changes (i) by $|k|^{2}$, which can be absorbed in a conformal rescaling. It also rescales $\lambda_1$ and $\lambda_2$ but leaves the ratio $\tau$ fixed. This means that only $\tau$ matters for the conformal structure. In order to establish the result required in the exercise, we consider the identification:

(iii) $z \sim z + n'\lambda_1' + m'\lambda_2'$

(in place of $z \sim z + n\lambda_1 + m\lambda_2$ given in (11.5.2)) to construct the torus. The complex numbers $\lambda_1'$, $\lambda_2'$ are related to $\lambda_1$, $\lambda_2$ by the rule

(iv) $\begin{pmatrix}\lambda_1' \\ \lambda_2'\end{pmatrix} = \begin{pmatrix} a & b \\ c & d\end{pmatrix}\begin{pmatrix}\lambda_1 \\ \lambda_2\end{pmatrix}$

where $a, b, c, d$ are integers subject to the condition $ad - bc = 1$. From Chaps. 1 and 2 we know that the $(2 \times 2)$ matrix $\begin{pmatrix} a & b \\ c & d\end{pmatrix}$ belongs to the modular group $SL(2, Z)$. It can be easily checked that the torus constructed by using (iii) is the same as that defined by (11.5.2), provided the integers $n'$ and $m'$ are related to $n$, $m$ via the matrix $\begin{pmatrix} d & -b \\ -c & a\end{pmatrix}$, the (integer-valued) inverse of the matrix $\begin{pmatrix} a & b \\ c & d\end{pmatrix}$. We write $\tau = \lambda_1/\lambda_2$ and note that in view of equality (iv)

$$\tau' = \frac{\lambda_1'}{\lambda_2'} = \frac{a\lambda_1 + b\lambda_2}{c\lambda_1 + d\lambda_2} = \frac{a\tau + b}{c\tau + d}$$

defines an equivalence relation, showing that two tori which use the identification (iii) and (11.5.2) are conformally equivalent; otherwise in general they are inequivalent.

2. When $g = 0$, we can use the stereographic projection of $S^{2}$ onto the complex $z$-plane; this reduces (11.5.8) to

(i) $\dfrac{\partial}{\partial\bar z}\,b_{++} = 0$

which means that $b_{++}$ is an analytic function of $z$. However, our discussions of $b_{++}$ imply that $b_{++}$ is required to approach 0 when $z \to \infty$. But this is impossible for a nonzero analytic function, meaning thereby that $b_{++}$ has no normalizable zero modes. This confirms that $B_0 = 0$. In the case of genus 1, we can repeat the arguments of $c^{+}$ for $b_{++}$ (see (11.5.12)) and arrive at the result that $b_{++}$ is covariantly constant and hence constant (due to the flat metric of the torus). Thus there is just one antighost zero mode for $g = 1$, which shows that $B_1 = 1$.

3. We shall show here that another way to discuss the Weyl symmetry of (11.5.18) is to make the gauge choice $h_{\alpha\beta} = e^{\phi}\eta_{\alpha\beta}$ and work in $2 + \epsilon$ dimensions. The action (11.5.18) then becomes

(a) $S = -\dfrac{1}{4\pi\alpha'}\displaystyle\int d^{2+\epsilon}\sigma\,e^{\epsilon\phi}\,\partial_{\alpha}X^{\mu}\,\partial^{\alpha}X^{\nu}\,g_{\mu\nu}(X^{\rho})$

The expansion of $e^{\epsilon\phi}$ as well as $g_{\mu\nu}(X^{\rho})$ then gives

(b) $S = -\dfrac{1}{4\pi\alpha'}\displaystyle\int d^{2+\epsilon}\sigma\left[(\partial_{\alpha}x^{\mu}\,\partial^{\alpha}x^{\nu})(1 + \epsilon\phi)\,\eta_{\mu\nu} - \frac{1}{3}\,\partial_{\alpha}x^{\mu}\,\partial^{\alpha}x^{\nu}\,x^{\lambda}x^{\kappa}R_{\mu\lambda\nu\kappa}(X_0^{\rho})(1 + \epsilon\phi)\right]$

Now the $\phi$-dependence of $S$ in (b) does not disappear in the limit $\epsilon \to 0$, since as $\epsilon \to 0$ the term $\partial_{\alpha}x^{\mu}\,\partial^{\alpha}x^{\nu}\,x^{\lambda}x^{\kappa}R_{\mu\lambda\nu\kappa}(X_0^{\rho}) \to -[1/(2\epsilon)]\,\partial_{\alpha}X^{\mu}\,\partial^{\alpha}X^{\nu}R_{\mu\nu}$ (evidently the pole here comes from the contraction $\langle x^{\lambda}x^{\kappa}\rangle$). To make the integrand $\phi$-independent, we use the transformation $x^{\mu} = (1 - [\epsilon\phi/2])\,y^{\mu}$ to rewrite (b) as (c)

5 =-^j

d2^c{day^ay»

+ |y" V « ^

- y ( l - e
Fig. (i) The diagram for the one-loop counterterm in the nonlinear sigma model; (ii) this diagram represents the additional term due to the Weyl invariance formalism, where the cross represents insertion of a soft mass term.

When we add the two diagrams and disregard the terms proportional to $\partial_{\alpha}\partial^{\alpha}$

(c)

S=

-±-jd2oR^day^ayv

which is the counterpart of (11.5.30). This shows that with the above choices, (11.5.18) leads to a Weyl invariant ($\phi$-independent) quantum theory if and only if $R_{\mu\nu} = 0$.
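For Hint 1, the lattice bookkeeping can be written out explicitly. The sketch assumes that relation (iv) reads $\lambda_1' = a\lambda_1 + b\lambda_2$, $\lambda_2' = c\lambda_1 + d\lambda_2$ with integers $a, b, c, d$ and $ad - bc = 1$; the row-vector form of the integer relation is a presentational choice made here.

```latex
\begin{align*}
n'\lambda_1' + m'\lambda_2'
  &= (a n' + c m')\,\lambda_1 + (b n' + d m')\,\lambda_2
   \equiv n\,\lambda_1 + m\,\lambda_2, \\
\begin{pmatrix} n' & m' \end{pmatrix}
  &= \begin{pmatrix} n & m \end{pmatrix}
     \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},
  \qquad ad - bc = 1, \\
\tau' = \frac{\lambda_1'}{\lambda_2'}
  &= \frac{a\lambda_1 + b\lambda_2}{c\lambda_1 + d\lambda_2}
   = \frac{a\tau + b}{c\tau + d},
  \qquad \tau = \frac{\lambda_1}{\lambda_2}.
\end{align*}
```

Because the inverse matrix is again integer-valued, every lattice point of one description is a lattice point of the other, so the two identifications define the same torus. For instance, $a = b = d = 1$, $c = 0$ gives $\tau' = \tau + 1$.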

6 THE CONCEPT OF SUPERSYMMETRY IN STRING THEORY

Having worked our way to string theory using various mathematical disciplines (covered in earlier chapters of this text), we have at last reached the final point of our study: 'superstrings', the theory of everything. A meaningful treatment of superstrings would require almost half as many pages again as we have already written. The task, though desirable, is beyond our reach at present. In the following few pages we explain in brief the supersymmetric aspects of string theory, and define the word 'superstring' that follows from there. The discussion here uses the basics of superalgebras, superspaces and superfields, etc., that were covered in Chap. 7, 'All that is Super.' As the treatment here is sketchy, the reader may like to browse through Chap. 7 before going through this section.

6.1 Bosonic Theory with Majorana Fermions

We begin here with the (classical) bosonic string action in conformal gauge:

$$S = -\frac{1}{2\pi}\int d^{2}\sigma\,\partial_{\alpha}X^{\mu}\,\partial^{\alpha}X_{\mu} \tag{11.6.1}$$

which as we know represents a free field theory in two dimensions, with $X^{\mu}(\sigma, \tau)$, $\mu = 0, 1, \ldots, D-1$ as the coordinates of a string propagating in $D$ space-time dimensions. In order to bring the notion of supersymmetry into the theory in a simple form, we introduce a free fermion field $\psi_{A}(\sigma, \tau)$. The capital letters $A, B, C$, etc., stand for world-sheet spinor indices; in two dimensions, if both chiralities are included, the spinor index takes only two values. We know that this fermion can either be a Dirac or a Majorana fermion and can carry additional quantum numbers (see Secs. 7.2 and 7.3 for Dirac and Majorana spinors). There are not many feasible choices that one can make here, but the one which gives an interesting theory is the inclusion of a $D$-plet of Majorana fermions in the action (11.6.1). This Majorana fermion $\psi^{\mu}_{A}(\sigma, \tau)$ transforms in the vector representation of the Lorentz group $SO(D-1, 1)$. Thus the action in question is:

$$S = -\frac{1}{2\pi}\int d^{2}\sigma\left\{\partial_{\alpha}X^{\mu}\,\partial^{\alpha}X_{\mu} - i\bar\psi^{\mu}\rho^{\alpha}\partial_{\alpha}\psi_{\mu}\right\}, \tag{11.6.2}$$

where the symbols $\rho^{\alpha}$ represent 2-dimensional Dirac matrices. It is usual to use the basis:

(11.6.3)

to write the fermion $\psi$ as:

$$\psi = \begin{pmatrix}\psi_{-} \\ \psi_{+}\end{pmatrix} \tag{11.6.4}$$
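For concreteness, one purely imaginary basis that is often used for the two-dimensional Dirac matrices is shown below. It is an assumption made for illustration (a choice commonly found in the string literature) and may differ from the basis intended in (11.6.3) by signs or ordering; the world-sheet metric is taken as $\eta^{\alpha\beta} = \mathrm{diag}(-1, +1)$.

```latex
% Assumed basis (illustrative; not taken from the text):
\rho^{0} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\rho^{1} = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.
% Direct multiplication gives
\begin{align*}
(\rho^{0})^{2} &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, &
(\rho^{1})^{2} &= -\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \\
\rho^{0}\rho^{1} &= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, &
\rho^{1}\rho^{0} &= \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},
\end{align*}
% so that \{\rho^{\alpha},\rho^{\beta}\} = -2\,\eta^{\alpha\beta}.
```

In this basis both matrices are purely imaginary and $\rho^{0}$ is antisymmetric, which are the two properties the subsequent discussion relies on.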

Since $\rho^{\alpha}$ in (11.6.3) is purely imaginary, the Dirac operator $i\rho^{\alpha}\partial_{\alpha}$ is real, and accordingly the components of the world-sheet spinor $\psi^{\mu}$ are chosen to be real. The two-component real spinor (11.6.4) is the Majorana spinor, and the symbol $\bar\psi$ in (11.6.2) equals $\psi^{\dagger}\rho^{0}$ as usual. The following remark lists some of the properties of the Majorana spinor $\psi$ introduced above.

Remark 11.6.1 If $\chi$ is another Majorana fermion in two dimensions, then from Eq. (7.2.6) it can be checked that

$$\bar\chi\psi = \bar\psi\chi. \tag{11.6.5}$$

This shows that $\chi$ and $\psi$ are anticommuting variables. We further note that $\bar\chi\psi$ can be written as $\rho^{0}_{AB}\chi_{A}\psi_{B}$, and since $\rho^{0}$ is an antisymmetric matrix, from (11.6.5) it follows that the expression $\rho^{0}_{AB}\chi_{A}\psi_{B}$ is symmetric in $\chi$ and $\psi$. The anticommuting field $\psi^{\mu}_{A}$ that transforms as a 'vector,' which is a bosonic representation of $SO(D-1, 1)$, actually maps bosons to bosons and fermions to fermions in the space-time sense. At first sight this does not seem to be correct, but looking at things more closely, one realizes that the action (11.6.2) gives a two-dimensional field theory (and not a field theory in space-time) where $\psi^{\mu}_{A}$ transforms as a spinor under 2-dimensional world-sheet transformations and thus turns out to be in agreement with spin-statistics relations. This is so since from the world-sheet point of view the Lorentz group $SO(D-1, 1)$ is simply an internal symmetry group, and the spin-statistics theorem has no bearing on whether anticommuting fields transform as vectors or as spinors under an internal symmetry group.$^{60}$ To study $\psi^{\mu}$ further, we return to (11.6.1) and recall (see Ftn. 15) that the equal-$\tau$ commutation relations of the bosonic coordinates:

$$[\dot X^{\mu}(\sigma), X^{\nu}(\sigma')] = -i\pi\eta^{\mu\nu}\delta(\sigma - \sigma') \tag{11.6.6}$$

introduced ghosts into the theory (via $X^{0}(\sigma)$ oscillators) on account of the Lorentz metric $\eta^{\mu\nu}$. Fortunately (11.6.1) has an infinite-dimensional symmetry algebra, the Virasoro algebra, with the help of which these ghosts could be eliminated for $D = 26$. In the case of fermions that appear in (11.6.2), the equal-$\tau$ anticommutation relations are:$^{61}$

$$\{\psi^{\mu}_{A}(\sigma), \psi^{\nu}_{B}(\sigma')\} = \pi\eta^{\mu\nu}\delta_{AB}\,\delta(\sigma - \sigma'). \tag{11.6.7}$$

Since $\eta^{00} = -1$, the 'timelike' fermions $\psi^{0}_{A}(\sigma)$ create wrong-metric states similar to the 'timelike' bosons $X^{0}(\sigma)$.

$^{59}$ See Sec. (7.3) and (7.A) for Pauli and Dirac matrices, where we have used the symbol $\tau$; obviously this symbol could not be used here as it is one of the world-sheet parameters. In this section we use the symbols $\gamma^{\mu}$ and $\Gamma^{\mu}$ for 4-dimensional and $D$-dimensional space-time Gamma matrices.

$^{60}$ According to the spin-statistics theorem, an anticommuting field in local quantum field theory in two dimensions has half-integral Lorentz quantum numbers (see Ref. [Ad]).
6.2 World-sheet Supersymmetry and Two-dimensional Superspace

In order to introduce this concept of supersymmetry, we consider an anticommuting infinitesimal Majorana spinor $\epsilon$ which is constant (i.e., independent of $\sigma$ and $\tau$), and we assert that the action (11.6.2) is invariant under the infinitesimal transformations:

$$\delta X^{\mu} = \bar\epsilon\psi^{\mu}, \qquad \delta\psi^{\mu} = -i\rho^{\alpha}\partial_{\alpha}X^{\mu}\,\epsilon \tag{11.6.8}$$

These transformations, which mix bosonic and fermionic coordinates, are called the supersymmetry transformations (see Fact. (7.4.10)). As the commutator of two supersymmetry transformations gives a spatial translation in view of (7.4.46), here it means a translation of the string world-sheet. Thus for instance:

$$[\delta_1, \delta_2]X^{\mu} = \delta_1(\bar\epsilon_2\psi^{\mu}) - \delta_2(\bar\epsilon_1\psi^{\mu}) = (2i\,\bar\epsilon_1\rho^{\alpha}\epsilon_2)\,\partial_{\alpha}X^{\mu} = a^{\alpha}\partial_{\alpha}X^{\mu} \tag{11.6.9}$$

Similarly,

$$[\delta_1, \delta_2]\psi^{\mu} = a^{\alpha}\partial_{\alpha}\psi^{\mu} \tag{11.6.10}$$
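The steps behind (11.6.9) can be spelled out as follows. The sketch assumes the transformations $\delta X^{\mu} = \bar\epsilon\psi^{\mu}$, $\delta\psi^{\mu} = -i\rho^{\alpha}\partial_{\alpha}X^{\mu}\epsilon$ and the Majorana flip property $\bar\epsilon_1\rho^{\alpha}\epsilon_2 = -\bar\epsilon_2\rho^{\alpha}\epsilon_1$ for anticommuting parameters.

```latex
\begin{align*}
\delta_1(\bar\epsilon_2\psi^\mu)
  &= \bar\epsilon_2\,\delta_1\psi^\mu
   = -i\,\bar\epsilon_2\rho^\alpha\epsilon_1\,\partial_\alpha X^\mu, \\
[\delta_1,\delta_2]X^\mu
  &= \delta_1(\bar\epsilon_2\psi^\mu) - \delta_2(\bar\epsilon_1\psi^\mu) \\
  &= -i\,\bar\epsilon_2\rho^\alpha\epsilon_1\,\partial_\alpha X^\mu
     + i\,\bar\epsilon_1\rho^\alpha\epsilon_2\,\partial_\alpha X^\mu \\
  &= 2i\,\bar\epsilon_1\rho^\alpha\epsilon_2\,\partial_\alpha X^\mu
   \equiv a^\alpha\,\partial_\alpha X^\mu .
\end{align*}
```

The flip property is what converts the first term into a second copy of the other one, producing the factor of 2 in $a^{\alpha} = 2i\,\bar\epsilon_1\rho^{\alpha}\epsilon_2$.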

We note here that in writing (11.6.9) we have used the fact that for Majorana spinors in 1 + 1 dimensions, $\bar\epsilon_1\rho^{\alpha}\epsilon_2 = -\bar\epsilon_2\rho^{\alpha}\epsilon_1$, and in writing (11.6.10) we have used the Dirac equation $\rho^{\alpha}\partial_{\alpha}\psi = 0$ (derived from (11.6.2)). We now use (11.6.2) and (11.6.8) to write the supercurrent and energy-momentum tensor. If $\epsilon$ is constant, it leaves (11.6.2), i.e., $S$, invariant; when $\epsilon$ is not a constant, $S$ is not invariant, however its variation is of the general form:

$$\delta S = \frac{2}{\pi}\int d^{2}\sigma\,(\partial_{\alpha}\bar\epsilon)\,J^{\alpha} \tag{11.6.11}$$

where $J^{\alpha}$ is the conserved (Noether) current (see Sec. (6.3) and (Eq. 11.3.53)). Applying the same ideas to the transformations (11.6.8), we derive the so-called supercurrent here, which equals:

$$J_{\alpha} = \frac{1}{2}\,\rho^{\beta}\rho_{\alpha}\psi^{\mu}\,\partial_{\beta}X_{\mu} \tag{11.6.12}$$

$^{61}$ This anticommutation relation is the quantum version of the Poisson bracket for Grassmann variables.
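A condensed version of the Noether argument is sketched below for a parameter $\epsilon$ that depends on the world-sheet coordinates. It assumes the normalization $S = -\frac{1}{2\pi}\int d^{2}\sigma\,(\partial_{\alpha}X^{\mu}\partial^{\alpha}X_{\mu} - i\bar\psi^{\mu}\rho^{\alpha}\partial_{\alpha}\psi_{\mu})$, the convention $\{\rho^{\alpha}, \rho^{\beta}\} = -2\eta^{\alpha\beta}$, and the Majorana flip identities quoted in the comments; total derivatives are dropped throughout and $\partial^{2} \equiv \partial_{\alpha}\partial^{\alpha}$.

```latex
% Majorana flips assumed:
%   \bar\chi\lambda = \bar\lambda\chi, \qquad
%   \bar\chi\rho^\alpha\lambda = -\bar\lambda\rho^\alpha\chi, \qquad
%   \bar\chi\rho^\alpha\rho^\beta\lambda = \bar\lambda\rho^\beta\rho^\alpha\chi .
\begin{align*}
\delta S_{X}
  &= -\frac{1}{\pi}\int d^2\sigma\,
     \partial_\alpha(\bar\epsilon\psi^\mu)\,\partial^\alpha X_\mu
   = \frac{1}{\pi}\int d^2\sigma\,\bar\epsilon\psi^\mu\,\partial^2 X_\mu, \\
\delta S_{\psi}
  &= \frac{i}{\pi}\int d^2\sigma\,
     \bar\psi^\mu\rho^\alpha\partial_\alpha(\delta\psi_\mu)
   = \frac{1}{\pi}\int d^2\sigma\,\Bigl[
     -\bar\psi^\mu\epsilon\,\partial^2 X_\mu
     + \bar\psi^\mu\rho^\alpha\rho^\beta(\partial_\alpha\epsilon)\,\partial_\beta X_\mu
     \Bigr], \\
\delta S
  &= \delta S_X + \delta S_\psi
   = \frac{1}{\pi}\int d^2\sigma\,
     (\partial_\alpha\bar\epsilon)\,\rho^\beta\rho^\alpha\psi^\mu\,\partial_\beta X_\mu
   = \frac{2}{\pi}\int d^2\sigma\,(\partial_\alpha\bar\epsilon)\,J^\alpha, \\
J^\alpha
  &= \tfrac{1}{2}\,\rho^\beta\rho^\alpha\psi^\mu\,\partial_\beta X_\mu .
\end{align*}
```

The $\partial^{2}X$ terms cancel between the bosonic and fermionic variations because $\bar\psi\epsilon = \bar\epsilon\psi$, which is why the variation vanishes for constant $\epsilon$ and only the term proportional to $\partial_{\alpha}\epsilon$ survives.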

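Also using these transformations in the case of the translation $\delta\sigma^{\alpha}$ = constant, we obtain the formula for the EM tensor (see (11.3.20-21)):

$$T_{\alpha\beta} = \partial_{\alpha}X^{\mu}\,\partial_{\beta}X_{\mu} + \frac{i}{4}\,\bar\psi^{\mu}\rho_{\alpha}\partial_{\beta}\psi_{\mu} + \frac{i}{4}\,\bar\psi^{\mu}\rho_{\beta}\partial_{\alpha}\psi_{\mu} - (\text{trace}) \tag{11.6.13}$$

Using the equations of motion $\rho^{\alpha}\partial_{\alpha}\psi = 0$, we can easily check that both $J_{\alpha}$ and $T_{\alpha\beta}$ are conserved. We also note that just as in the case of a (purely) bosonic theory, the EM tensor here too is traceless, and hence in terms of light-cone coordinates, the components $T_{+-}$ and $T_{-+}$ are zero. Moreover, in view of the two-dimensional identity $\rho^{\alpha}\rho^{\beta}\rho_{\alpha} = 0$, the supercurrent, which is conserved, satisfies in addition:

$$\rho^{\alpha}J_{\alpha} = 0 \tag{11.6.14}$$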
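The two-dimensional identity quoted above follows from the Clifford algebra alone. The sketch assumes the convention $\{\rho^{\alpha}, \rho^{\beta}\} = -2\eta^{\alpha\beta}$, so that $\rho^{\alpha}\rho_{\alpha} = -2$ in two dimensions, and the form $J_{\alpha} = \frac{1}{2}\rho^{\beta}\rho_{\alpha}\psi^{\mu}\partial_{\beta}X_{\mu}$ for the supercurrent.

```latex
\begin{align*}
\rho^\alpha\rho^\beta\rho_\alpha
  &= \bigl(-2\eta^{\alpha\beta} - \rho^\beta\rho^\alpha\bigr)\rho_\alpha
   = -2\rho^\beta - \rho^\beta\,(\rho^\alpha\rho_\alpha)
   = -2\rho^\beta + 2\rho^\beta = 0, \\
\rho^\alpha J_\alpha
  &= \tfrac{1}{2}\,\rho^\alpha\rho^\beta\rho_\alpha\,\psi^\mu\,\partial_\beta X_\mu = 0 .
\end{align*}
```

In $d$ dimensions the same steps give $\rho^{\alpha}\rho^{\beta}\rho_{\alpha} = (d-2)\rho^{\beta}$, so the vanishing is special to $d = 2$.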

This concept of supersymmetry does not go very far unless we show how it manifests itself in our two-dimensional field theory given by (11.6.2). For this purpose we formulate the theory in a two-dimensional superspace (denoted $\Sigma$) that consists of world-sheet coordinates $\sigma^{\alpha}$ and two Grassmann coordinates $\theta^{A}$. The coordinates $\theta^{A}$ form a two-component Majorana spinor. A general function $Y^{\mu}$ in $\Sigma$ is a (familiar) superfield (see Eq. (7.5.25)) which depends generally on both the bosonic and fermionic coordinates, thus:$^{62}$

$$Y^{\mu}(\sigma, \theta) = X^{\mu}(\sigma) + \bar\theta\psi^{\mu}(\sigma) + \frac{1}{2}\,\bar\theta\theta\,B^{\mu}(\sigma) \tag{11.6.15}$$

Recall that due to the anticommutation properties of $\theta$, any term involving a product of more than two $\theta$'s will be zero. The superfield $Y^{\mu}$ combines the fields $X^{\mu}$ and $\psi^{\mu}$ with another field $B^{\mu}$ known as an auxiliary field. The supersymmetry is represented on the superspace by the generator (see (7.5.6)):*

$$Q_{A} = \frac{\partial}{\partial\bar\theta^{A}} + i(\rho^{\alpha}\theta)_{A}\,\partial_{\alpha} \tag{11.6.16}$$

When an arbitrary anticommuting parameter $\epsilon^{A}$ is used as the infinitesimal parameter of a supersymmetry transformation, then using $\bar\epsilon Q$ in place of $Q_{A}$, the superspace coordinate transformations are obtained as:

$$\delta\theta^{A} = [\bar\epsilon Q, \theta^{A}] = \epsilon^{A}, \qquad \delta\sigma^{\alpha} = [\bar\epsilon Q, \sigma^{\alpha}] = i\bar\epsilon\rho^{\alpha}\theta \tag{11.6.17}$$

In this way supersymmetry is realized in superspace as a geometrical transformation. If instead we use the supercharge $Q$ to define:

$$\delta Y^{\mu} = [\bar\epsilon Q, Y^{\mu}] \tag{11.6.18}$$

then after expanding $Y^{\mu}$ componentwise (see (11.6.15)), and using the two-dimensional Fierz relation:$^{63}$

$$\theta_{A}\bar\theta_{B} = -\frac{1}{2}\,\delta_{AB}\,\bar\theta_{C}\theta_{C} \tag{11.6.19}$$

we have:

$$\delta X^{\mu} = \bar\epsilon\psi^{\mu}, \qquad \delta\psi^{\mu} = -i\rho^{\alpha}\epsilon\,\partial_{\alpha}X^{\mu} + B^{\mu}\epsilon, \qquad \delta B^{\mu} = -i\bar\epsilon\rho^{\alpha}\partial_{\alpha}\psi^{\mu} \tag{11.6.20}$$

Also since

$$[\bar\epsilon_1 Q, \bar\epsilon_2 Q] = 2i\,\bar\epsilon_1\rho^{\alpha}\epsilon_2\,\partial_{\alpha} \tag{11.6.21}$$

it follows that

$$[\delta_1, \delta_2]Y^{\mu} = a^{\alpha}\partial_{\alpha}Y^{\mu} \qquad (a^{\alpha} = 2i\,\bar\epsilon_1\rho^{\alpha}\epsilon_2) \tag{11.6.22}$$

(see Sec. (7.5) for derivations). From the above equalities it is evident that with the help of the auxiliary field $B^{\mu}$, the closure of the supersymmetry algebra is achieved without the use of the equations of motion $\rho^{\alpha}\partial_{\alpha}\psi^{\mu} = 0$. Also, if one sets $B^{\mu} = 0$ and uses the equations of motion along with the transformations, then the set (11.6.20) reduces to the set (11.6.8). Finally, if $Y_1, \ldots, Y_k$ are superfields, then their transformation law under supersymmetry is $\delta Y_k = \bar\epsilon Q Y_k$, and it is easy to check that for the product of any two superfields:

$$\delta(Y_1 Y_2) = \bar\epsilon Q(Y_1 Y_2) \tag{11.6.23}$$

since the first-order differential operator $\bar\epsilon Q$ (in superspace) satisfies the Leibniz rule:

$$Q(Y_1 Y_2) = Q(Y_1)\,Y_2 + Y_1\,Q(Y_2) \tag{11.6.24}$$

A natural question that we ask now is how one writes a Lagrangian and an action in this superspace, and what the sets of 'constraint equations' are in this new setup.

$^{62}$ Note that the triplet $(X^{\mu}(\sigma), \psi^{\mu}(\sigma), B^{\mu})$ corresponds to $(A, \psi, F)$ of Sec. (7.5).

* The reader should note the similarity as well as the dissimilarity between (7.5.6) and (11.6.16).
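The Fierz relation (11.6.19) can be checked directly in components. The sketch assumes $\bar\theta = \theta^{T}\rho^{0}$ with $\rho^{0} = \begin{pmatrix} 0 & -i \\ i & 0\end{pmatrix}$, a basis choice made here for illustration, and uses only $\theta_1^{2} = \theta_2^{2} = 0$ and $\theta_1\theta_2 = -\theta_2\theta_1$.

```latex
% Assumed: \bar\theta_B = \theta_A\,\rho^{0}_{AB} with the basis above.
\begin{align*}
\bar\theta_1 &= i\,\theta_2, \qquad
\bar\theta_2 = -i\,\theta_1, \qquad
\bar\theta\theta = \bar\theta_C\theta_C
   = i\,\theta_2\theta_1 - i\,\theta_1\theta_2 = -2i\,\theta_1\theta_2, \\
\theta_1\bar\theta_1 &= i\,\theta_1\theta_2, \qquad
\theta_2\bar\theta_2 = -i\,\theta_2\theta_1 = i\,\theta_1\theta_2, \qquad
\theta_1\bar\theta_2 = \theta_2\bar\theta_1 = 0, \\
\theta_A\bar\theta_B &= i\,\theta_1\theta_2\,\delta_{AB}
   = -\tfrac{1}{2}\,\delta_{AB}\,\bar\theta_C\theta_C .
\end{align*}
```

The same computation also reproduces the value $\bar\theta\theta = -2i\,\theta_1\theta_2$ that enters the Berezin integral of the superspace action.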

6.3 The Action and the Constraint Equations on $\Sigma = \Sigma(\sigma^{\alpha}, \theta^{A})$

To write an action for a field theory which would be invariant under a symmetry (supersymmetry), we have to first formulate a Lagrangian that is invariant. In this case we have to write a Lagrangian that is invariant under the supersymmetry transformations (11.6.20). This is done by using the superspace covariant derivative operator:

$$D = \frac{\partial}{\partial\bar\theta} - i\rho^{\alpha}\theta\,\partial_{\alpha} \tag{11.6.25}$$

which is invariant under supersymmetry, and which satisfies (see (7.5.8)-(7.5.9)):

$$\{D_{A}, Q_{B}\} = 0, \qquad \{D_{A}, \bar D_{B}\} = 2i(\rho^{\alpha})_{AB}\,\partial_{\alpha}, \qquad \{D_{A}, D_{B}\} = 2i(\rho^{\alpha}\rho^{0})_{AB}\,\partial_{\alpha} \tag{11.6.26}$$

$^{63}$ See the Apps. in Ref. 9.[6] or in 7.[21] for explanations on the Fierz product, and Sec. 7.5 for some explanations.


From the first equation of (11.6.26), it follows that if $Y$ transforms as $\delta Y = \bar\epsilon Q\,Y$, then so does its covariant derivative $D_A Y$; this is a useful fact in writing a Lagrangian. Another ingredient required for an invariant action is the integral over 'all' of superspace:

$$\int d^2\sigma\, d^2\theta \tag{11.6.27}$$

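where $d^2\theta$ is the fermionic integration, which satisfies the Berezin integration rule for fermions:

$$\int d^2\theta\,\left(a + \theta^1 b_1 + \theta^2 b_2 + \theta^1\theta^2 c\right) = c \tag{11.6.28}$$

This shows that the integral $\int d^2\theta$ picks out only the coefficient of $\theta^1\theta^2$ in (11.6.28) (see Eqs. (7.1.1-3) and 9.[2]). Since $\bar\theta\theta = \theta\rho^0\theta = -2i\,\theta^1\theta^2$, we have

$$\int d^2\theta\;\bar\theta\theta = -2i \tag{11.6.29}$$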
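The rule (11.6.28) can be made concrete in a few lines of code. What follows is only a minimal sketch, not part of the original derivation: it models the Grassmann algebra on two generators with a Python dictionary, and the names `gmul`, `berezin`, `theta1` and `theta2` are hypothetical helpers introduced purely for illustration. The relation $\bar\theta\theta = -2i\,\theta^1\theta^2$ is taken from the text, not derived here.

```python
from itertools import product


def gmul(x, y):
    """Multiply two elements of the Grassmann algebra on theta1, theta2.

    An element is a dict mapping a sorted tuple of generator indices
    ((), (1,), (2,) or (1, 2)) to a numeric coefficient.
    """
    out = {}
    for (kx, cx), (ky, cy) in product(x.items(), y.items()):
        if set(kx) & set(ky):
            continue  # theta_i * theta_i = 0
        merged = list(kx + ky)
        sign = 1
        # Bubble sort the generators; each transposition flips the sign.
        for i in range(len(merged)):
            for j in range(len(merged) - 1 - i):
                if merged[j] > merged[j + 1]:
                    merged[j], merged[j + 1] = merged[j + 1], merged[j]
                    sign = -sign
        key = tuple(merged)
        out[key] = out.get(key, 0) + sign * cx * cy
    return out


def berezin(x):
    """Integral over d^2 theta: the coefficient of theta1*theta2."""
    return x.get((1, 2), 0)


theta1 = {(1,): 1}
theta2 = {(2,): 1}

# A generic function a + theta1*b1 + theta2*b2 + theta1*theta2*c
generic = {(): 2.0, (1,): 3.0, (2,): 5.0, (1, 2): 7.0}
print(berezin(generic))  # only the theta1*theta2 coefficient survives

# theta-bar theta = -2i theta1 theta2, as quoted before (11.6.29)
theta_bar_theta = gmul({(): -2j}, gmul(theta1, theta2))
print(berezin(theta_bar_theta))
```

The anticommutation of the generators is carried entirely by the sign bookkeeping in `gmul`, which is why `gmul(theta2, theta1)` comes out with the opposite sign to `gmul(theta1, theta2)`.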

Like bosonic integrals, the Berezin integral can be integrated by parts:

$$\int d^2\theta\,\frac{\partial Y}{\partial\theta^A} = 0 \tag{11.6.30}$$

for any $Y$. Using these notions of differentiation and integration, we now write a simple action on $\Sigma$ in terms of an elementary $D$-tuple superfield $Y^\mu$ that transforms in the vector representation of $SO(D-1,1)$:

$$S = \frac{i}{4\pi}\int d^2\sigma\, d^2\theta\;\bar D Y^\mu\, D Y_\mu \tag{11.6.31}$$

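The derivatives $DY^\mu$ and $\bar D Y^\mu$ in (11.6.31) can be written explicitly as:

$$DY^\mu = \psi^\mu + \theta B^\mu - i\rho^\alpha\theta\,\partial_\alpha X^\mu + \frac{i}{2}\,\bar\theta\theta\,\rho^\alpha\partial_\alpha\psi^\mu \tag{11.6.32}$$

$$\bar D Y^\mu = \bar\psi^\mu + B^\mu\bar\theta + i\,\partial_\alpha X^\mu\,\bar\theta\rho^\alpha - \frac{i}{2}\,\bar\theta\theta\,\partial_\alpha\bar\psi^\mu\rho^\alpha \tag{11.6.33}$$

Retaining the terms that are quadratic in $\theta$, we have:

$$\bar D Y^\mu\, DY_\mu = \partial_\alpha X^\mu\,\partial_\beta X_\mu\;\bar\theta\rho^\alpha\rho^\beta\theta + \frac{i}{2}\left(\bar\psi^\mu\rho^\alpha\partial_\alpha\psi_\mu - \partial_\alpha\bar\psi^\mu\rho^\alpha\psi_\mu\right)\bar\theta\theta + B^\mu B_\mu\,\bar\theta\theta \tag{11.6.34a}$$

After some simplification the above equation becomes:

$$\bar D Y^\mu\, DY_\mu = \left(-\partial^\alpha X^\mu\,\partial_\alpha X_\mu + i\,\bar\psi^\mu\rho^\alpha\partial_\alpha\psi_\mu + B^\mu B_\mu\right)\bar\theta\theta \tag{11.6.34b}$$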


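Hence, using the integration rule (11.6.29), the action (11.6.31) can be written in the expanded form as:

$$S' = -\frac{1}{2\pi}\int d^2\sigma\left(\partial_\alpha X^\mu\,\partial^\alpha X_\mu - i\,\bar\psi^\mu\rho^\alpha\partial_\alpha\psi_\mu - B^\mu B_\mu\right) \tag{11.6.35}$$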

We note that the field equations derived from $S'$ imply that $B^\mu = 0$, which means that we can simply set $B^\mu$ to zero and retrieve (11.6.2) from $S'$. Since $B^\mu$ can be disposed of in this manner, one may question the usefulness of including this field in the theory. In this connection we confirm that if (11.6.2), with its timelike fermion $\psi^0$ of wrong metric, is to represent an acceptable theory without unwanted modes, then $B^\mu$ has to be considered and, as such, the action (11.6.35) is a step in the right direction. We note, however, that to achieve our objective of an action invariant under enlarged supersymmetry, we must study the infinite-component symmetry algebra as well as the corresponding constraint equations. The following remark explains this aspect of the theory.

Remark 11.6.2 The symmetries given in (11.6.8) (that we have discussed so far) are global supersymmetry transformations with a constant supersymmetry parameter $\epsilon$, although constant translations of the world-sheet coordinates $(\sigma, \tau)$ are implicitly involved there. This is so because the commutator of two global supersymmetries $Q_A$ is a translation of the world-sheet coordinates $\sigma$ and $\tau$. In the case of bosonic string theory, these translations are generated by the two Virasoro generators $L_0$ and $\tilde L_0$, which taken together define a subalgebra of the infinite-dimensional symmetry algebra of the bosonic theory. In order to extend the $Q_A$ to an infinite-component 'supersymmetry,' we have to consider the fermion equation of motion derived from (11.6.2). This, as we already know, is the two-dimensional Dirac equation $\rho^\alpha\partial_\alpha\psi = 0$. If we use (11.6.3) as the basis for $\rho^\alpha$, this equation decomposes into a pair of decoupled equations:

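$$\left(\frac{\partial}{\partial\sigma} + \frac{\partial}{\partial\tau}\right)\psi^\mu_- = 0, \qquad \left(\frac{\partial}{\partial\sigma} - \frac{\partial}{\partial\tau}\right)\psi^\mu_+ = 0 \tag{11.6.36}$$

which shows that the upper and lower components $\psi_-$ and $\psi_+$ of $\psi$ describe the right- and left-moving modes respectively. We now use the light-cone coordinates

$$\sigma^\pm = \tau \pm \sigma, \qquad \partial_\pm = \tfrac{1}{2}\left(\partial_\tau \pm \partial_\sigma\right) \tag{11.6.37}$$

When we consider the boson equation $\partial_\alpha\partial^\alpha X^\mu = 0$ as well in the light-cone coordinates, the two equations taken together can be written as:

$$\partial_+\psi^\mu_- = 0 = \partial_+(\partial_- X^\mu), \qquad \partial_-\psi^\mu_+ = 0 = \partial_-(\partial_+ X^\mu) \tag{11.6.38}$$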

presenting the symmetry between bosons and fermions in a still more transparent form. These equations also show that both $\psi^\mu_-$ and $\partial_- X^\mu$ are functions of $\sigma^-$, while $\psi^\mu_+$ and $\partial_+ X^\mu$ are functions of $\sigma^+$. Thus supersymmetry is indeed the symmetry between $\psi^\mu_\pm$ and $\partial_\pm X^\mu$.
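The statement that right-movers depend only on $\sigma^- = \tau - \sigma$ and left-movers only on $\sigma^+ = \tau + \sigma$ can be checked numerically. The sketch below is an illustration only: the profiles `right_mover` and `left_mover` are arbitrary test functions chosen here, and the derivatives $\partial_\pm = \tfrac12(\partial_\tau \pm \partial_\sigma)$ are approximated by central differences rather than computed exactly.

```python
import math


def d_plus(f, tau, sigma, h=1e-5):
    """Approximate 0.5*(d/dtau + d/dsigma) f by central differences."""
    d_tau = (f(tau + h, sigma) - f(tau - h, sigma)) / (2 * h)
    d_sigma = (f(tau, sigma + h) - f(tau, sigma - h)) / (2 * h)
    return 0.5 * (d_tau + d_sigma)


def d_minus(f, tau, sigma, h=1e-5):
    """Approximate 0.5*(d/dtau - d/dsigma) f by central differences."""
    d_tau = (f(tau + h, sigma) - f(tau - h, sigma)) / (2 * h)
    d_sigma = (f(tau, sigma + h) - f(tau, sigma - h)) / (2 * h)
    return 0.5 * (d_tau - d_sigma)


def right_mover(tau, sigma):
    """An arbitrary profile depending on tau - sigma only."""
    u = tau - sigma
    return math.sin(3 * u) + u ** 2


def left_mover(tau, sigma):
    """An arbitrary profile depending on tau + sigma only."""
    return math.cos(2 * (tau + sigma))


# d_plus annihilates the right-mover and d_minus the left-mover,
# while the "wrong" derivative is generically non-zero.
print(d_plus(right_mover, 0.3, 1.1), d_minus(right_mover, 0.3, 1.1))
print(d_minus(left_mover, 0.3, 1.1), d_plus(left_mover, 0.3, 1.1))
```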

From the above discussion and the action (a) given in Exc. 3 for the fermion part of the action (11.6.2) of this section, it is apparent that one can set $\psi_+$ ($\psi_-$) to zero and then discuss a two-dimensional Lagrangian with right- (left-) moving fermions only. Also, since the two-dimensional chirality operator $\bar\rho = \rho^0\rho^1$ has $\psi_\pm$ for its eigenstates, the assumption $\psi_+ = 0$ amounts to working with a spinor field of positive chirality. In view of the decoupling of positive- and negative-chirality modes, the world-sheet supersymmetry current (11.6.12) and the EM tensor (11.6.13) also become simple. For instance, in terms of light-cone coordinates, the supercurrent (11.6.12) has components $J_{+A}$ and $J_{-A}$, where the subscripts $\pm$ represent the vector component and $A$ the spinor index, so that both $J_{+A}$ and $J_{-A}$ can be considered as two-component spinors. Moreover, as $\rho^\alpha J_\alpha = 0$ (see (11.6.14)), only the positive-chirality spinor component of $J_{+A}$ and the negative-chirality component of $J_{-A}$ are non-zero. We denote these non-zero components simply as $J_+$ and $J_-$ respectively. Hence we have:

$$J_+ = \psi^\mu_+\,\partial_+ X_\mu, \qquad J_- = \psi^\mu_-\,\partial_- X_\mu \tag{11.6.39}$$

leading to the conservation equations:

$$\partial_- J_+ = 0 = \partial_+ J_- \tag{11.6.40}$$

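Using the equal-$\tau$ anticommutators and commutators:

$$\{\psi^\mu_+(\sigma), \psi^\nu_+(\sigma')\} = \{\psi^\mu_-(\sigma), \psi^\nu_-(\sigma')\} = \pi\,\eta^{\mu\nu}\,\delta(\sigma - \sigma')$$

$$[\partial_\pm X^\mu(\sigma), \partial_\pm X^\nu(\sigma')] = \pm\frac{i\pi}{2}\,\eta^{\mu\nu}\,\delta'(\sigma - \sigma')$$

$$\{\psi^\mu_+, \psi^\nu_-\} = 0 = [\partial_+ X^\mu, \partial_- X^\nu] \tag{11.6.41}$$

we can compute the algebra:

$$\{J_+(\sigma), J_+(\sigma')\} = \pi\,\delta(\sigma - \sigma')\,T_{++}(\sigma), \qquad \{J_-(\sigma), J_-(\sigma')\} = \pi\,\delta(\sigma - \sigma')\,T_{--}(\sigma), \qquad \{J_+(\sigma), J_-(\sigma')\} = 0 \tag{11.6.42}$$

The light-cone components $T_{++}$ and $T_{--}$ of the EM tensor in the above equation are as follows:

$$T_{++} = \partial_+ X^\mu\,\partial_+ X_\mu + \frac{i}{2}\,\psi^\mu_+\,\partial_+\psi_{+\mu}, \qquad T_{--} = \partial_- X^\mu\,\partial_- X_\mu + \frac{i}{2}\,\psi^\mu_-\,\partial_-\psi_{-\mu} \tag{11.6.43}$$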

Having obtained the algebra (11.6.42), we are now in a position to write the constraint equations that help eliminate the (unwanted) timelike components of both $\psi^\mu$ and $X^\mu$. We know that in the bosonic case these components are eliminated in 26 dimensions by using the Virasoro constraints $T_{++} = T_{--} = 0$. Naturally we would like to have similar equations in the fermionic case as well. In view of (11.6.42) we note, however, that we cannot set $T_{++}$ and $T_{--}$ to zero without also setting $J_+$ and $J_-$ to zero. Hence the required constraint equations for the supersymmetric theory are:

$$J_+ = J_- = T_{++} = T_{--} = 0 \tag{11.6.44}$$

These are known as the super-Virasoro constraints, and though here we have just postulated them, they can actually be derived by gauge fixing of a suitable two-dimensional supergravity Lagrangian (see Sec. 4.3.5 of Ref. [13a] for details).

64. Quantum mechanically, there is an anomaly term in the algebra relations, which we have not considered.

6.4 Boundary Conditions and Super-Virasoro Operators

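So far we have not discussed the boundary conditions that apply once the fermionic field is included. In the next remark we consider the two types of boundary conditions for open strings, the Ramond (R) and the Neveu-Schwarz (NS), and determine the 'unconstrained' theory that results from these two choices.

Remark 11.6.3 In the case of Fermi coordinates, when the Lagrangian is varied to obtain the Euler-Lagrange equations, the surface terms that arise from the variation are required to vanish. This is possible only when $\psi_+\delta\psi_+ - \psi_-\delta\psi_-$ vanishes at each end of an open string. Evidently this is satisfied by letting $\psi_+ = \pm\psi_-$ (which implies $\delta\psi_+ = \pm\delta\psi_-$) at each end. With no loss of generality we set

$$\psi^\mu_+(0, \tau) = \psi^\mu_-(0, \tau) \tag{11.6.45}$$

The relative sign between $\psi_+$ and $\psi_-$, though a matter of convention at one end, becomes meaningful at the other end once we have made the choice (11.6.45); thus there are two cases to consider. In the first case we set

$$\psi^\mu_+(\pi, \tau) = \psi^\mu_-(\pi, \tau) \qquad \text{(R boundary condition)} \tag{11.6.46}$$

and the mode expansions of the solutions of the Dirac equation are:

$$\psi^\mu_-(\sigma, \tau) = \frac{1}{\sqrt{2}}\sum_{n\in\mathbb{Z}} d^\mu_n\,e^{-in(\tau-\sigma)}, \qquad \psi^\mu_+(\sigma, \tau) = \frac{1}{\sqrt{2}}\sum_{m\in\mathbb{Z}} d^\mu_m\,e^{-im(\tau+\sigma)} \tag{11.6.47}$$

In the second case we set

$$\psi^\mu_+(\pi, \tau) = -\psi^\mu_-(\pi, \tau) \qquad \text{(NS boundary condition)} \tag{11.6.48}$$

and the mode expansions now are:

$$\psi^\mu_-(\sigma, \tau) = \frac{1}{\sqrt{2}}\sum_{r\in\mathbb{Z}+\frac12} b^\mu_r\,e^{-ir(\tau-\sigma)}, \qquad \psi^\mu_+(\sigma, \tau) = \frac{1}{\sqrt{2}}\sum_{s\in\mathbb{Z}+\frac12} b^\mu_s\,e^{-is(\tau+\sigma)} \tag{11.6.49}$$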

We note that the sum in (11.6.47) runs over all integers, whereas the sum in (11.6.49) runs over half-integers (the summation indices in these equations are denoted by $n, m$ and by $r, s$ respectively). The boundary condition (11.6.46) with its integer modes describes string states that are space-time fermions, and the boundary condition (11.6.48) with its half-integer modes gives the bosonic states. We must mention, though, that these bosonic states are different from those of the bosonic string theory discussed in Sec. 3.

Remark 11.6.4 In the case of closed strings, the surface terms vanish when the boundary conditions are periodicity or antiperiodicity for each component of $\psi$ separately, and thus the mode expansions are:

$$\psi^\mu_-(\sigma, \tau) = \sum_{n\in\mathbb{Z}} d^\mu_n\,e^{-2in(\tau-\sigma)} \quad\text{or}\quad \psi^\mu_-(\sigma, \tau) = \sum_{r\in\mathbb{Z}+\frac12} b^\mu_r\,e^{-2ir(\tau-\sigma)} \tag{11.6.50}$$

and

$$\psi^\mu_+(\sigma, \tau) = \sum_{n\in\mathbb{Z}} \tilde d^\mu_n\,e^{-2in(\tau+\sigma)} \quad\text{or}\quad \psi^\mu_+(\sigma, \tau) = \sum_{r\in\mathbb{Z}+\frac12} \tilde b^\mu_r\,e^{-2ir(\tau+\sigma)} \tag{11.6.51}$$

Since there are four different pairings of left-moving and right-moving modes, there are four distinct closed-string sectors. These are referred to as the NS-NS, NS-R, R-NS and R-R sectors. The first and last of these describe closed-string states that are bosons, while the remaining two describe fermions. These mode expansions lead to our final goal, the super-Virasoro operators. We define them in the following remark.

Remark 11.6.5 As in the case of the Virasoro operators defined for open strings in the bosonic theory, we define one independent set of $L_m$ by a linear combination of $T_{++}$ and $T_{--}$; we thus have:

$$L_m = \frac{1}{\pi}\int_0^\pi d\sigma\left(e^{im\sigma}\,T_{++} + e^{-im\sigma}\,T_{--}\right) = \frac{1}{\pi}\int_{-\pi}^{\pi} d\sigma\,e^{im\sigma}\,T_{++} \tag{11.6.52}$$

For the fermionic generators of the algebra, we now define:

$$F_m = \frac{\sqrt{2}}{\pi}\int_0^\pi d\sigma\left(e^{im\sigma}\,J_+ + e^{-im\sigma}\,J_-\right) = \frac{\sqrt{2}}{\pi}\int_{-\pi}^{\pi} d\sigma\,e^{im\sigma}\,J_+ \tag{11.6.53}$$

with R boundary conditions, whereas with NS boundary conditions this definition is:

$$G_r = \frac{\sqrt{2}}{\pi}\int_0^\pi d\sigma\left(e^{ir\sigma}\,J_+ + e^{-ir\sigma}\,J_-\right) = \frac{\sqrt{2}}{\pi}\int_{-\pi}^{\pi} d\sigma\,e^{ir\sigma}\,J_+ \tag{11.6.54}$$

In the case of closed strings, the super-Virasoro generators form two distinct sets: one corresponds to the modes of $T_{++}$ and $J_+$, and the other to the modes of $T_{--}$ and $J_-$. In the classical string theory these expressions are required to vanish, whereas in the quantum theory, as in the bosonic case (discussed in Sec. 3), there are different options for dealing with them. We describe these options briefly in the following paragraphs.

To quantize the superstring, we begin by noting that in the covariant gauge the dynamics of the coordinates $X^\mu(\sigma, \tau)$ and $\psi^\mu(\sigma, \tau)$ is given by a free two-dimensional Klein-Gordon equation and a free Dirac equation, supplemented by certain constraints. The quantization of these coordinates is given by:

$$[\dot X^\mu(\sigma, \tau), X^\nu(\sigma', \tau)] = -i\pi\,\delta(\sigma - \sigma')\,\eta^{\mu\nu} \tag{11.6.55}$$

for the bosonic coordinates, and

$$\{\psi^\mu_A(\sigma, \tau), \psi^\nu_B(\sigma', \tau)\} = \pi\,\delta(\sigma - \sigma')\,\eta^{\mu\nu}\,\delta_{AB} \tag{11.6.56}$$

for the fermionic coordinates. These lead respectively to the commutation and the anticommutation relations

$$[\alpha^\mu_m, \alpha^\nu_n] = m\,\delta_{m+n}\,\eta^{\mu\nu} \tag{11.6.57}$$

and

$$\{d^\mu_m, d^\nu_n\} = \eta^{\mu\nu}\,\delta_{m+n} \quad\text{or}\quad \{b^\mu_r, b^\nu_s\} = \eta^{\mu\nu}\,\delta_{r+s} \tag{11.6.58}$$


for their modes*. As in the bosonic theory, the zero-frequency part of the Virasoro constraint gives in this case the mass-shell condition:

$$\alpha' M^2 = N + \text{'constant'} \tag{11.6.59}$$

The 'constant' here represents a normal-ordering effect, and $N$ stands for

$$N = N^\alpha + N^d \quad\text{or}\quad N^\alpha + N^b \tag{11.6.60}$$

where

$$N^\alpha = \sum_{m=1}^{\infty} \alpha_{-m}\cdot\alpha_m, \qquad N^d = \sum_{m=1}^{\infty} m\,d_{-m}\cdot d_m, \qquad N^b = \sum_{r=1/2}^{\infty} r\,b_{-r}\cdot b_r \tag{11.6.61}$$

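The state of lowest mass-squared corresponds to the Fock-space ground state:

$$\alpha^\mu_m|0\rangle = d^\mu_m|0\rangle = 0,\ m > 0 \quad\text{or}\quad \alpha^\mu_m|0\rangle = b^\mu_r|0\rangle = 0,\ m, r > 0 \tag{11.6.62}$$

An excitation by a raising operator $\alpha^\mu_{-m}$ or $d^\mu_{-m}$ (or $b^\mu_{-r}$) increases the eigenvalue of $\alpha' M^2$ by $m$ (or $r$) units. In the case of half-integer modes, a unique nondegenerate ground state can be chosen, which can be identified with a spin-zero state. For integer modes this is not possible, since the $d^\mu_0$ (for all $\mu$) satisfy the algebra $\{d^\mu_0, d^\nu_0\} = \eta^{\mu\nu}$ and also commute with the $M^2$ operator. The algebra formed by the $d^\mu_0$ is just the Dirac algebra (of Dirac matrices), $\{\Gamma^\mu, \Gamma^\nu\} = -2\eta^{\mu\nu}$, up to a normalization constant. Hence we can write $d^\mu_0 = \frac{i}{\sqrt{2}}\,\Gamma^\mu$.

Using the mode expansions of $X^\mu(\sigma, \tau)$ and $\psi^\mu(\sigma, \tau)$, the super-Virasoro operators can be expressed as:

$$L_m = L^{(\alpha)}_m + L^{(b)}_m \ \text{(NS)}, \quad\text{or}\quad L_m = L^{(\alpha)}_m + L^{(d)}_m \ \text{(R)} \tag{11.6.63}$$

where

$$L^{(\alpha)}_m = \frac{1}{2}\sum_{n=-\infty}^{\infty} :\alpha_{-n}\cdot\alpha_{m+n}:, \qquad L^{(b)}_m = \frac{1}{2}\sum_{r=-\infty}^{\infty}\left(r + \tfrac{1}{2}m\right):b_{-r}\cdot b_{m+r}:$$

and

$$L^{(d)}_m = \frac{1}{2}\sum_{n=-\infty}^{\infty}\left(n + \tfrac{1}{2}m\right):d_{-n}\cdot d_{m+n}: \tag{11.6.64}$$

(In each case the normal ordering is required only for $m = 0$.) Likewise for the fermionic generators we have

$$G_r = \sum_{n=-\infty}^{\infty} \alpha_{-n}\cdot b_{r+n} \ \text{(NS)}, \qquad F_m = \sum_{n=-\infty}^{\infty} \alpha_{-n}\cdot d_{m+n} \ \text{(R)} \tag{11.6.65}$$

(For obvious reasons no normal ordering is required here.)

* Relations (11.6.57) and (11.6.58) hold for open- or closed-string modes; in the case of closed strings there are three other sets as well, which involve $\tilde\alpha^\mu_m$, $\tilde d^\mu_m$ and $\tilde b^\mu_r$.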

6.5 Super-Virasoro Algebra and the Anomaly

Finally, the super-Virasoro algebra in the bosonic sector (i.e., the NS sector) is

$$[L_m, L_n] = (m - n)\,L_{m+n} + A(m)\,\delta_{m+n}$$

$$[L_m, G_r] = \left(\tfrac{1}{2}m - r\right)G_{m+r}$$

$$\{G_r, G_s\} = 2L_{r+s} + B(r)\,\delta_{r+s} \tag{11.6.66}$$

where $A(m)$ and $B(r)$ are c-number anomaly terms similar to the ones in the bosonic theory. Using the arguments of that theory, they can be determined by evaluating expectation values in the Fock-space ground state:

$$A(m) = \tfrac{1}{8}D\,(m^3 - m), \qquad B(r) = \tfrac{1}{2}D\left(r^2 - \tfrac{1}{4}\right) \tag{11.6.67}$$

Similarly the algebra in the fermionic case, i.e., in the R sector, is:

$$[L_m, L_n] = (m - n)\,L_{m+n} + A(m)\,\delta_{m+n}$$

$$[L_m, F_n] = \left(\tfrac{1}{2}m - n\right)F_{m+n}$$

$$\{F_m, F_n\} = 2L_{m+n} + B(m)\,\delta_{m+n} \tag{11.6.68}$$

The anomaly terms in this case are (see footnote 65):

$$A(m) = \tfrac{1}{8}D\,m^3, \qquad B(m) = \tfrac{1}{2}D\,m^2 \tag{11.6.69}$$

The sets (11.6.66)-(11.6.69) describe the closed superalgebra for both sectors. In the NS case the five generators that form it are $L_1, L_0, L_{-1}, G_{1/2}, G_{-1/2}$, and in the R case these generators are $L_1, L_0, L_{-1}, F_0$. Using these it can be shown that $D = 10$ is the (critical) dimension for the bosonic as well as the fermionic sector. In other words, for superstring theory the dimension $D$ equals 10.

We note that we have barely introduced the reader to superstring theory, and we are ready to quit. We do hope, however, that by deriving an important ingredient of the theory, the super-Virasoro algebra (using the other super-objects), we have sufficiently motivated the searching mind to reach his or her own goals. We close the section, the chapter and the book by drawing the reader's attention to some important points of this 'fascinating' theory.
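The anomaly terms (11.6.67) and (11.6.69) differ only by terms that a constant shift of $L_0$ can absorb, since $L_0$ appears on the right-hand side of the algebra only when the two indices sum to zero. The sketch below checks this with exact rational arithmetic. It is an illustration under stated assumptions: the shift is taken to be $L_0 \to L_0 + a$ in the R sector, and the value $a = D/16$ is worked out here from the two sets of formulas rather than quoted from the text. All function names are hypothetical.

```python
from fractions import Fraction


def A_ns(m, D):
    """NS-sector anomaly A(m) of (11.6.67)."""
    return Fraction(D, 8) * (m ** 3 - m)


def B_ns(r, D):
    """NS-sector anomaly B(r) of (11.6.67)."""
    return Fraction(D, 2) * (r ** 2 - Fraction(1, 4))


def A_r(m, D):
    """R-sector anomaly A(m) of (11.6.69)."""
    return Fraction(D, 8) * m ** 3


def B_r(m, D):
    """R-sector anomaly B(m) of (11.6.69)."""
    return Fraction(D, 2) * m ** 2


def shifted_r_anomalies(m, D, a):
    """Central terms of the R sector after the assumed shift L_0 -> L_0 + a.

    [L_m, L_-m] = 2m L_0 + A(m) = 2m (L_0 + a) + (A(m) - 2 m a)
    {F_m, F_-m} = 2 L_0 + B(m)  = 2 (L_0 + a)  + (B(m) - 2 a)
    """
    return A_r(m, D) - 2 * m * a, B_r(m, D) - 2 * a


D = 10
a = Fraction(D, 16)  # assumed shift, derived here for illustration
for m in range(-3, 4):
    new_A, new_B = shifted_r_anomalies(m, D, a)
    # After the shift the R-sector terms take the NS functional form.
    print(m, new_A == A_ns(m, D), new_B == B_ns(m, D))
```

With this shift both sectors share the same functional form of the anomaly, which is the sense in which the discrepancy is only a matter of convention for $L_0$.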

6.6 Superstrings: A Theory of Unification

In layman's language, superstring theory unites the various forces of nature and particles in much the same way as a violin string does in producing musical tones. Just as one physical object, the violin string, is fundamental in creating the varieties of musical notes (and the harmonies), the superstring provides a unifying description of elementary particles and forces. One can therefore say that the "music" created by the superstring is "the forces and the particles of nature."

65. The discrepancy in the definitions of anomalies given by (11.6.67) and (11.6.69) could be removed by redefining $L_0$ with an additional constant.


We list below some of the features of this theory that make it exceptionally promising and attractive.

(1) The gauge group of the theory includes $E_8 \otimes E_8$, which is much larger than the minimal group $SU(3) \otimes SU(2) \otimes U(1)$; hence there is enough scope for phenomenology in the theory.
(2) The theory has no anomalies in some restricted dimensions, as the symmetries of superstring theory are able, by a series of "miracles," to cancel all of its potential anomalies in the critical dimension (see [13]).
(3) The theory is finite to all orders in perturbation theory. This follows from the application of the techniques of Riemann surface theory, as the two theories (superstrings and Riemann surfaces) have much in common (see Rabin in [37]).
(4) Superstring models cannot be altered without destroying their miraculous properties. Hence there is very little freedom to play with the theory; in other words, one does not have to deal with the problem of too many (e.g., 20 or more) arbitrary constants.
(5) The theory includes GUT, super-Yang-Mills, supergravity (see 7.[21]) and Kaluza-Klein theories (see [18] and [36(a)(b)]) as subsets. Thus many of the phenomenological features developed for these theories carry over into this theory.

The theory has links with classical concepts such as projective diffeomorphisms via a third-order non-linear differential operator, the well-known Schwarzian differential operator (see [11(a) and (b)] and [27c]).

In view of the above points in favour of the theory, one may conclude that it is a perfect theory in every sense of the term. But this is not the case, as can be seen from the list given below.

(1) The theory is not (experimentally) testable and hence is not a good candidate as a physical theory.
(2) There is no experimental evidence to confirm the existence of supersymmetry or superstrings.
(3) The predictions of superstring theory regarding the orders of magnitude between 100 and $10^{19}$ GeV do not tally with anything in the history of science.
(4) The theory that claims to be 'a theory of everything' is not able to explain the vanishing of the cosmological constant.

In spite of the shortcomings mentioned above, it is generally believed that if superstring theory is broken dynamically, then it would be able to make predictions down to the level of everyday energies. Hence the fundamental problem facing superstrings is not only of an 'experimental' nature; it also calls for more sustained work on the 'theoretical' side. The rigorous research that has been going on in various institutions of the world on the subject has removed some of the deficiencies of the theory. Yet more work is required to give definitive answers. A brief summary of recent research in Appendix 11B provides some insight into the theory. The route to the top (the superstrings) is long and difficult, but surely it is within our reach (see footnote 66). In any case it was this faith in the theory and the quest for a clearer technical understanding (of theories converging to superstrings) that led this author to the project: Mathematical Perspectives on Theoretical Physics.

Exercises 11.6

1. Verify (11.6.29).
2. Show that the integral $\int d^2\sigma\, d^2\theta$ is invariant under supersymmetry.
3. Show that by introducing on a world-sheet the light-cone coordinates $\sigma^\pm = \tau \pm \sigma$ and $\partial_\pm = \tfrac{1}{2}(\partial_\tau \pm \partial_\sigma)$, the two-dimensional Dirac action can be written in a form that makes apparent the decoupling of $\psi_+$ and $\psi_-$ and also gives the decoupled pair similar to (11.6.36). Show further that in terms of these variables, the fermion part of the action (11.6.2) is:

   (a) $S_F = \frac{i}{\pi}\int d^2\sigma\,\left(\psi_-\,\partial_+\psi_- + \psi_+\,\partial_-\psi_+\right)$

4. Use the techniques explained in Subsec. 3.1 to derive the formula (11.6.13) for $T_{\alpha\beta}$.

(No hints are given, as all these exercises can be solved using the material in Chapters 7, 9 and this section.)

66. See some recent papers in Ref. [Ad] and Appendix B for the relationship between superstrings and black holes, where a further list of references is given.

APPENDIX 11A GLOSSARY

A.1 Ghost

Ghosts are particles with negative metric. They propagate in the Green's function with the wrong sign and, as expected, they are associated with the negative sign appearing in the Lorentz metric. The presence of ghosts in a physical theory implies the existence of negative probabilities, which means that a theory with ghosts is unacceptable. If, however, the ghosts cancel against other ghosts in the S-matrix, or by some other means, then the theory becomes consistent. The Faddeev-Popov ghosts that appear in string theory (conformal field theory) are the best-known examples of ghosts that cancel out, leaving an acceptable theory. The physical space of string theory (in the critical dimension) is known to be ghost-free because of the Ward identities generated by the Virasoro algebra.

A.2 Hadrons

The subnuclear particles that experience the strong nuclear force and are (generally) heavy are known as hadrons. Because of the large angular momenta (spins) involved in their scattering processes, a string (see footnote 67) framework is required for the description of their spectrum and high-energy behavior. There are no massless hadrons with spin 1 or 2. It has been observed that the spin 1 states behave like gauge particles, while the spin 2 states behave like a graviton.

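A.3 s- and t-channel Scattering

These are scattering channels that give 'dual' descriptions of the same physics. This is evident from the two diagrams given below.

[Figure: (i) particles 1 and 2 merge into a resonance R, which then breaks up into particles 3 and 4; (ii) particles 1 and 2 exchange a particle and emerge as 3 and 4.]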

67. Although the strings that resulted from hadrons are always mentioned when writing a historical perspective of string theory, these hadronic strings in fact have no relation to the present theory of superstrings.

As is typical of any elementary particle experiment, the two particles 1 and 2 come rushing toward each other. In one case they collide and merge for a short period to form a new collective mode called a resonance (denoted R in Fig. (i)), which persists for a while and then breaks up, giving rise to particles 3 and 4. In the other case (Fig. (ii)), 1 and 2 approach each other and exchange a particle, which is an expression of the force or interaction between them. As a result, 1 and 2 modify their identities to become 3 and 4. These two different interpretations of scattering were shown to be related to each other by Veneziano in his dual model, given by the amplitude function:

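$$A(s, t) = -\sum_{n=0}^{\infty} \frac{(\alpha(s)+1)(\alpha(s)+2)\cdots(\alpha(s)+n)}{n!}\;\frac{1}{\alpha(t) - n} \tag{11A.1}$$

Since Veneziano assumed $A(s, t) = A(t, s)$, this could also be written as:

$$A(s, t) = -\sum_{n=0}^{\infty} \frac{(\alpha(t)+1)(\alpha(t)+2)\cdots(\alpha(t)+n)}{n!}\;\frac{1}{\alpha(s) - n} \tag{11A.2}$$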
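The equivalence of the two series can be checked numerically. The following is a rough sketch, not taken from the text: it assumes the sum starts at $n = 0$ (with the empty product equal to 1), in which case the series equals the Euler Beta function $\Gamma(-\alpha(s))\,\Gamma(-\alpha(t))/\Gamma(-\alpha(s)-\alpha(t))$, which is manifestly symmetric. The sample values of $\alpha(s)$ and $\alpha(t)$ are arbitrary and chosen negative so that the truncated series converges quickly. The function names are hypothetical.

```python
import math


def veneziano_series(alpha_s, alpha_t, n_terms=20000):
    """Truncated series -sum_n [(alpha_s+1)...(alpha_s+n)/n!] / (alpha_t - n)."""
    total = 0.0
    coeff = 1.0  # (alpha_s+1)...(alpha_s+n)/n!, equal to 1 for n = 0
    for n in range(n_terms):
        total += coeff / (alpha_t - n)
        coeff *= (alpha_s + n + 1) / (n + 1)  # step from n to n + 1
    return -total


def veneziano_gamma(alpha_s, alpha_t):
    """Closed form Gamma(-a_s) Gamma(-a_t) / Gamma(-a_s - a_t)."""
    return (math.gamma(-alpha_s) * math.gamma(-alpha_t)
            / math.gamma(-alpha_s - alpha_t))


a_s, a_t = -1.5, -2.5  # arbitrary sample values in the convergent region
print(veneziano_series(a_s, a_t))   # series in the form (11A.1)
print(veneziano_series(a_t, a_s))   # series in the form (11A.2)
print(veneziano_gamma(a_s, a_t))    # symmetric closed form
```

The poles of the series at $\alpha(t) = n$ are the resonances of one channel, while the closed form shows that the same function has poles at $\alpha(s) = n$ as well, which is the duality described above.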

A.4 Mandelstam Variables

Consider the diagram with incoming particles $A_1, A_2$ of momenta $p_1, p_2$ and outgoing particles $A_3, A_4$ of momenta $-p_3, -p_4$.

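[Fig. 11.24: scattering diagram with incoming momenta $p_1, p_2$ and outgoing momenta $-p_3, -p_4$.]

The Mandelstam variables $s$, $t$ and $u$ are the kinematic variables that give the squared centre-of-mass energy in the channels $12 \to 34$, $13 \to 24$ and $14 \to 23$. Written out in full they are (see footnote 68):

$$s = -(p_1 + p_2)^2, \qquad t = -(p_1 + p_3)^2, \qquad u = -(p_1 + p_4)^2 \tag{11A.3}$$

They obey the identity:

$$s + t + u = \sum_{i=1}^{4} m_i^2 \tag{11A.4}$$

where

$$m_i^2 = -p_i^2 \tag{11A.5}$$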
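The identity (11A.4) can be verified directly for explicit momenta. The sketch below is illustrative only: the integer four-vectors are arbitrary sample values chosen here (not physical data), all momenta are treated as incoming so that $p_1 + p_2 + p_3 + p_4 = 0$, and the metric signature $(-, +, +, +)$ of footnote 68 is assumed. The helper names are hypothetical.

```python
def minkowski_sq(p):
    """p.p with signature (-, +, +, +)."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2


def add(p, q):
    """Component-wise sum of two four-vectors."""
    return tuple(a + b for a, b in zip(p, q))


def mandelstam(p1, p2, p3, p4):
    """Return (s, t, u) as defined in (11A.3)."""
    s = -minkowski_sq(add(p1, p2))
    t = -minkowski_sq(add(p1, p3))
    u = -minkowski_sq(add(p1, p4))
    return s, t, u


# Arbitrary sample momenta; p4 is fixed by momentum conservation.
p1 = (5, 1, 2, 3)
p2 = (4, -1, 0, 2)
p3 = (-6, 2, -1, -4)
p4 = tuple(-(a + b + c) for a, b, c in zip(p1, p2, p3))

s, t, u = mandelstam(p1, p2, p3, p4)
mass_sq_sum = sum(-minkowski_sq(p) for p in (p1, p2, p3, p4))
print(s + t + u, mass_sq_sum)  # the two numbers agree
```

Because only momentum conservation is used, the check goes through for any choice of the first three vectors.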

A.5 Tachyons

Particles with negative squared mass are called tachyons. They move faster than the speed of light. Superstring theories have no tachyons.

68. The metric has the signature $\{-1, 1, 1, 1\}$. (11A.4) can easily be checked by using the fact that $(p_1 + p_2) = -(p_3 + p_4)$. One can also use other choices for $s, t, u$, e.g., $s = -(p_1 + p_2)^2$, $t = -(p_2 + p_3)^2$, $u = -(p_1 + p_3)^2$.

References

1. G. Aktas, C. Saelioglu, M. Serdaroglu (eds.), Strings and Symmetries (Proc. Gürsey Mem. Conf. I), LNP 447 (Berlin: Springer-Verlag, 1995).
2. T. Banks, D. Horn and H. Neuberger, Bosonization of the SU(N) Thirring Models, Nucl. Phys. B108, No. 3 (1976), 119-129.
3. I. Bars, Compactification of Superstrings and Torsion, Phys. Rev. 33, No. 2 (1986), 383.
4. I. Bars, D. Nemeschansky and S. Yankielowicz, Compactified Superstrings and Torsion, Nucl. Phys. B278, No. 3 (1986), 632-656.
5. P. G. Bergmann and V. de Sabbata (eds.), Proceedings of the 6th Course of the International School of Cosmology and Gravitation on Spin, Torsion, Rotation and Supergravity (Italy: Plenum Press, 1980). (i) F. W. Hehl (Four Lectures on Poincaré Gauge Field Theory).
6. P. Candelas, G. T. Horowitz, A. Strominger and E. Witten, Vacuum Configurations for Superstrings, Nucl. Phys. B258 (1986), 46-74.
7. P. C. Davies, Superstrings, A Theory of Everything (Cambridge University Press, 1988).
8. (a) E. Del Giudice and P. Di Vecchia, Factorization and Operator Formalism in the Generalized Virasoro Model, Nuovo Cim. 5A (1971), 90; (b) E. Del Giudice, P. Di Vecchia and S. Fubini, General Properties of the Dual Resonance Model, Ann. Phys. 70 (1972), 378.
9. M. Dine (ed.), String Theory of Four Dimensions (North-Holland, 1988).
10. R. Dolen, D. Horn and C. Schmid, (a) Prediction of Regge Parameters of ρ Poles from Low-energy πN Data, Phys. Rev. Lett. 19 (1967), 402; (b) Finite Energy Sum Rules and their Application to πN Charge Exchange, Phys. Rev. 166 (1968), 1768.
11. D. Friedan, (a) Introduction to Polyakov's Theory, in Recent Advances in Field Theory and Statistical Mechanics, (eds.) J. B. Zuber and R. Stora (Cambridge Univ. Press, 1984), 839; (b) D. Friedan, E. Martinec and S. Shenker, Conformal Invariance and Critical Exponents in Two Dimensions, Journ. of Magnetism and Magnetic Materials 54-57 (1986), 655-657; (c) Conformal Invariance, Supersymmetry, and String Theory, Nucl. Phys. B271, No. 1 (1986), 93.
12. (a) M. B. Green and J. H. Schwarz, Anomaly Cancellations in Supersymmetric D = 10 Gauge Theory and Superstring Theory, Phys. Lett. 149B, No. 1, 2, 3 (1984); (b) M. B. Green and D. Gross (eds.), Unified String Theories (World Scientific, 1986).
13. M. Green, J. H. Schwarz and E. Witten, Superstring Theory, (a) Vol. 1; (b) Vol. 2 (Cambridge Univ. Press, 1987).
14. A. Jaffe, Ordering the Universe. The Role of Mathematics, SIAM Rev. 26, No. 4 (1984).
15. M. Kaku, (a) Introduction to Superstrings (Springer-Verlag, 1989); (b) Strings, Conformal Fields, and Topology (Springer-Verlag, 1991).
16. D. Kastler (ed.), Algebraic Theory of Superselection Sectors (World Scientific, 1989).
17. J. Kowalski and L. G. Glikman, Doubly Graded Sigma Model with Torsion, Phys. Lett. 180, No. 4 (1986), 358.
18. H. C. Lee (ed.), An Introduction to Kaluza-Klein Theories (World Scientific, 1984).
19. D. Lust and S. Theisen, Lectures on String Theory (New York: Springer-Verlag, 1989).
20. M. Martinis and I. Andric, Superstrings, Anomalies and Unification (World Scientific, 1987).
21. A. N. Mitra, Basic Building Blocks Began (...) Big Bang, Golden Jubilee Publication, Ind. Nat. Sci. Acad. (1984).
22. R. N. Mohapatra, Unification and Supersymmetry (7.[16]).


23. G. Moore, P. Nelson and J. Polchinski, Strings and Supermoduli, Phys. Lett. 169B, No. 1 (1986), 47-53.
24. P. Nath, R. Arnowitt and A. H. Chamseddine, Applied N=1 Supergravity (World Scientific, 1984).
25. S. D. Peats, Superstrings and Search for Theory of Everything (Chicago: Contemporary Books, 1988).
26. T. Piran and S. Weinberg (eds.), Strings and Superstrings Vol. 3 (World Scientific, 1986).
27. N. Prakash, (a) Projective Structures in Space-time, Ind. Journal of Pure and Appl. Math 17 (5) (1986), 629-663; (b) Projective Mappings on Differentiable Manifolds, Rocky Mountain Journal 17, No. 3 (1987), 511-534; (c) Projective Structures and String Theory-A Distant Kinship via Schwartzian Operator, Proc. of Mukhopadhyay Centenary Celebrations, Calcutta Math Soc. (1992).
28. A. Rosenblum (ed.), Relativity, Supersymmetry, and Strings (Plenum Press, 1990).
29. R. Ruffini (ed.), Proceedings of the Fourth Marcel Grossmann Meeting on General Relativity (Elsevier Science Publishers B.V., 1986); P. Candelas, G. T. Horowitz, A. Strominger and E. Witten (Superstring Phenomenology).
30. A. Salam and J. Strathdee, Super-Gauge Transformations, Nucl. Phys. B76 (1974), 477-482.
31. J. H. Schwarz, (a) Superstring Theory, Phys. Reports 89 (1982), 223; (b) (ed.) Superstrings, the First Fifteen Years of Superstring Theory, Vols. I and II (World Scientific, 1985); (c) Introduction to Superstrings, in Superstrings and Supergravity, (eds.) A. T. Davis and D. G. Sutherland (Edinburgh, 1985).
32. W. Siegel, Introduction to String Field Theory (World Scientific, 1988).
33. A. Strominger, Superstrings with Torsion, Nucl. Phys. B274 (1986), 253-284.
34. G. Veneziano, (a) Construction of a Crossing-symmetric, Regge-behaved Amplitude for Linearly Rising Trajectories, Nuovo Cim. 57A (1968), 190; (b) An Introduction to Dual Models of Strong Interactions and their Physical Motivations, Phys. Reports C9 (1974), 199; (c) Ward Identities in Dual String Theory, Phys. Lett. 167B (1986), 388.
35. M. A. Virasoro, (a) Alternative Constructions of Crossing-Symmetric Amplitudes with Regge Behaviour, Phys. Rev. 177 (1969), 2309; (b) Generalization of Veneziano's Formula for the Five-point Function, Phys. Rev. Lett. 22 (1969), 37; (c) Subsidiary Conditions and Ghosts in Dual-Resonance Models, Phys. Rev. D1 (1970), 2933.
36. E. Witten, (a) Search for a Realistic Kaluza-Klein Theory, Nucl. Phys. B186 (1981), 412; (b) Fermion Quantum Numbers in Kaluza-Klein Theory, in Shelter Island II: Proceedings of the 1983 Shelter Island Conference on Quantum Field Theory and the Fundamental Problems of Physics, (ed.) R. Jackiw (M.I.T. Press, 1985); (c) Mirror Manifolds and Topological Field Theory, in Essays on Mirror Manifolds, (ed.) S.-T. Yau (International Press, 1992); (d) The Verlinde Algebra and the Cohomology of the Grassmannian, in Geometry, Topology, and Physics for Raoul Bott, (ed.) S.-T. Yau (International Press, 1994).
37. S. T. Yau (ed.), Mathematical Aspects of String Theory (World Scientific, 1986).

APPENDIX 11B: SOME RECENT DEVELOPMENTS IN SUPERSTRINGS' THEORY (A FEW DEFINITIONS)

In the main text of Chapter 11 we have only introduced some elementary aspects of the theory. We would like to believe, however, that we have sufficiently prepared the reader for independent research work via simple expositions and extensive reference lists. As an aid towards this goal of self-study, we have listed in Appendix B some of the objects whose knowledge has become a technical necessity for understanding the current literature. In one of the sections we have given in brief the five theories that were considered to be different but are now known to be related through the dualities S, T, U described in (B.8). A few paragraphs are devoted to the important topic of p-branes and D-branes and M-theory (the theory of miracles). We have also ventured into the theory of black holes by giving a description of BPS states and their relationship to elementary particles in (B.14). Our paragraph on cosmic strings shows the relation of strings to theories such as QCD and Yang-Mills.

B.1 DIVERGENCES

We recall that in relativistic (quantum) field theory, while calculating the Feynman diagrams (for Green's functions and the S-matrix) that contain loops, we are confronted with 'infinities.' This is because in the case of loop diagrams the range of integration for the momentum variable extends from zero to infinity. We call these 'infinities' in a theory the divergences, and note that they occur not only for want of a finite range of integration, but also for lack of invariance under a symmetry group (symmetry breakdown). As mentioned earlier in the text, there are two types of divergences that occur in field theories as well as in string theories, namely the ultraviolet divergences (see Sec. 10.6) and the infrared divergences. In point-particle field theory the ultraviolet divergences result from vertex corrections, mass corrections and tadpole diagrams. In string theories, the Feynman diagrams for all of the above correspond to different corners of the integration region of one and the same string diagram. Very often, by using conformal invariance, these ultraviolet divergences (difficulties) can be converted into infrared divergences (difficulties), which are easier to handle, e.g., through an infrared cut-off (see Subsec. 4.4). Such an occurrence of divergences does not imply an inconsistency in the theory; rather, it signals interesting features, e.g., symmetry breaking or quark confinement. Thus ultraviolet divergences, if not renormalizable, signify an inconsistency in a theory, whereas if they can be interpreted as infrared effects, then physical principles such as supersymmetry enable us to show that the theory is free of infrared divergences. We further note that the parameter $1/\alpha'$, $\alpha'$ being the Regge slope, acts in string theory as an ultraviolet regulator in the integrals over loop momenta; this makes the string free from ultraviolet divergences.

In field theories divergences are removed by the introduction of a finite number of empirical quantities; for instance, in Q.E.D. one uses measured parameters, e.g., the electron mass and charge, to make its amplitudes finite. This process is called renormalization. In string theory, the elimination of divergences is achieved by the introduction of (Faddeev-Popov) ghosts (though only in space-time dimension 26). An example of a field theory that is free of ultraviolet divergences but suffers from infrared divergences is quantum chromodynamics (Q.C.D.), whereas string theory in 26 dimensions is ultraviolet finite as well as infrared free. In mathematical terms these divergences manifest themselves as logarithmic or quadratic divergences in a given integrand (see (11.5.27)) during the process of dimensional regularisation.


The origin of divergences can be attributed to the collision or interaction of particles or strings. Collision of particles in particular results in wave formation and consequent radiation. Hence we thought it worthwhile to give the definitions of these two words (ultraviolet and infrared) as they appear in physics textbooks. The following table gives the frequency and wavelength ranges of infrared and ultraviolet radiation.

Table 1 Electromagnetic Radiations

Type of radiation       Usual source     Approximate frequency range, Hz (cycles/s)     Approximate wavelength range, meters
Infrared radiation      Hot bodies       $10^{11}$ to $3.8 \times 10^{14}$              $3 \times 10^{-3}$ to $8 \times 10^{-7}$
Ultraviolet radiation   Electric arcs    $7.5 \times 10^{14}$ to $3 \times 10^{17}$     $4 \times 10^{-7}$ to $10^{-9}$

B.2

SOBOLEV SPACES

Let $U \subset \mathbb{R}^n$ be an open set, and let $\mathcal{D}(U)$ be the topological vector space of compactly supported $C^\infty$ functions on $U$, whose topology is the 'inductive limit topology'* given by semi-norms** on $\mathcal{D}(U)$. Also let $L^p(U)$ be the space of p-th power integrable functions on $U$ (see Exp. (0.2.10 a)). The Sobolev space denoted $W^{m,p}(U)$ ($m$ a nonnegative integer) is a Banach space such that:

$$L^p(U) \supset W^{m,p}(U) \supset \mathcal{D}(U) \qquad (11B.1)$$

where $W^{m,p}(U)$ is the space of functions in $L^p(U)$ whose derivatives of order $\le m$, in the sense of distribution differentiation (see App. 9C), are also in $L^p(U)$$^{69}$:

$$W^{m,p}(U) = \{ f \,;\ D^j f \in L^p(U),\ |j| \le m \}. \qquad (11B.2)$$

The norm in $W^{m,p}(U)$ is:

$$\|f\|_{m,p} = \Big( \sum_{|j| \le m} \|D^j f\|_{L^p}^{p} \Big)^{1/p} \qquad (11B.3)$$

where

$$\|D^j f\|_{L^p} = \Big\{ \int_U |D^j f|^p \, dx \Big\}^{1/p} \qquad (11B.4)$$

* Let $\mathcal{K}$ denote a family of compact sets $k$, and let $\mathcal{D}_k(U)$ be the space formed by $C^\infty$ functions with compact support $k$. The topology of $\mathcal{D}(U)$ is the topology of all semi-norms on $\mathcal{D}(U)$ whose restrictions to each $\mathcal{D}_k(U)$ are continuous. This definition can be viewed as the 'locally convex inductive limit' of the Banach topologies $\mathcal{D}_k(U)$, due to the inclusion properties: (i) $\mathcal{D}(U) = \bigcup_k \mathcal{D}_k(U)$, (ii) $\mathcal{D}_k(U) \subset \mathcal{D}_{k'}(U)$ if $k \subset k'$ ($k, k' \in \mathcal{K}$).
** A semi-norm on a vector space $X$ over $K$ is a mapping $p_s : X \to \mathbb{R}$ such that (a) $p_s(x + y) \le p_s(x) + p_s(y)$, (b) $p_s(\lambda x) = |\lambda|\, p_s(x)$. Similar to the norm definition (0.2.10a), this definition implies $p_s(0) = 0$ and $p_s(x) \ge 0$ for all $x$, but it does not have the property: $p_s(x) = 0$ implies $x = 0$.
69 The notation $D^j f$ stands for partial differentiation of $f$; thus $D^j f = \partial^{|j|} f / \partial x_1^{j_1} \partial x_2^{j_2} \cdots \partial x_n^{j_n}$, where $|j| = j_1 + j_2 + \cdots + j_n$ is the length of the ordered multi-index $j$, the set $\{j_1, j_2, \ldots, j_n\}$. (See also 10A.24). Note that here we are dealing with $\mathbb{R}^n$.


When $p = 2$, $W^{m,2}(U)$ is denoted $H^m(U)$ and (11B.3) becomes:

$$\|f\|_{H^m} = \Big( \sum_{|j| \le m} \int_U |D^j f|^2 \, dx \Big)^{1/2} \qquad (11B.5)$$

A fundamental theorem of functional analysis states: the Sobolev space $W^{m,p}(U)$ is a Banach space. The space $H^m(U)$ is a Hilbert space (see Sobolev in Ad.[59]).
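As a small numerical illustration of the norm (11B.3) with $m = 1$, the following sketch approximates the $W^{1,p}$ norm of a function of one variable by the midpoint rule. The helper name `sobolev_norm_1d`, the choice $U = (0, \pi)$ and the test function $\sin x$ are our own assumptions for the example; the derivative is supplied by hand rather than computed in the distributional sense.

```python
import math

def sobolev_norm_1d(f, df, a, b, p=2.0, steps=20000):
    """Approximate (||f||_p^p + ||f'||_p^p)^(1/p) on (a, b) by the midpoint rule.

    f is the function and df its (classical) derivative; both are assumed
    to be supplied by the caller.
    """
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += (abs(f(x)) ** p + abs(df(x)) ** p) * h
    return total ** (1.0 / p)

# f(x) = sin x on U = (0, pi): ||f||_2^2 = ||f'||_2^2 = pi/2, so the H^1 norm is sqrt(pi).
h1_norm = sobolev_norm_1d(math.sin, math.cos, 0.0, math.pi)
print(h1_norm, math.sqrt(math.pi))
```

For $p = 2$ the integrand is $\sin^2 x + \cos^2 x = 1$, so the quadrature reproduces $\sqrt{\pi}$ up to rounding.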

B.3

K3 SURFACES, ORBIFOLDS, AND NARAIN SPACES

B.3.1

K3 Surfaces

Definition (11B.2) A K3 surface is a compact Kähler manifold of complex dimension two. Denoting it by $S$, we note that it satisfies: (i) $h^{1,0}(S) = 0$, $h^{2,0}(S) = 1$ and (ii) $c_1(T_S) = 0$, where $h^{p,q}(S)$ denotes the Hodge number of $S$, i.e. the dimension of the Dolbeault cohomology group $H^{p,q}(S)$*, $T_S$ is the holomorphic tangent bundle of $S$, and $c_1(T_S)$ stands for the first Chern class. (See (10A.10) for Chern class). A simple example of a K3 surface is the hypersurface of degree 4:

(iii) $f : x_0^4 + x_1^4 + x_2^4 + x_3^4 = 0$

in the complex projective space $P^3(\mathbb{C})$ with homogeneous coordinates $[x_0, x_1, x_2, x_3]$. We note that any two K3 surfaces are diffeomorphic to each other, and thus with the help of one example of a K3 surface one can obtain all topological invariants. A K3 surface admits a globally defined 2-form, which is holomorphic and does not vanish anywhere. This form is unique up to a constant. Also, a K3 surface is hyperkähler, provided it admits a Ricci-flat metric. (See Sec. 10.4 for the hyperkähler condition).
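To illustrate how one example yields topological invariants, the following sketch computes the Chern classes of a smooth degree-$d$ hypersurface $X \subset P^n(\mathbb{C})$. It relies on the standard adjunction formula $c(T_X) = (1 + H)^{n+1}/(1 + dH)$ and on $\int_X H^{n-1} = d$, neither of which is stated in the text above; the function names are ours and purely illustrative.

```python
from math import comb

def hypersurface_chern(n, d):
    """Coefficients c_0, ..., c_{n-1} (as multiples of powers of H) of
    c(T_X) = (1 + H)^(n+1) / (1 + d H) for a smooth degree-d hypersurface in P^n."""
    top = n - 1                                   # complex dimension of X
    numer = [comb(n + 1, k) for k in range(top + 1)]
    inv = [(-d) ** k for k in range(top + 1)]     # 1/(1 + dH) = sum_k (-d)^k H^k
    return [sum(numer[i] * inv[k - i] for i in range(k + 1))
            for k in range(top + 1)]

def euler_number(n, d):
    """Integral of the top Chern class over X, using the fact that H^(n-1) integrates to d."""
    return d * hypersurface_chern(n, d)[-1]

quartic = hypersurface_chern(3, 4)
print(quartic, euler_number(3, 4))   # [1, 0, 6] 24
```

The vanishing middle entry is condition (ii), $c_1(T_S) = 0$, for the quartic (iii); the Euler number 24 is then shared by every K3 surface, since all of them are diffeomorphic.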

B.3.2

Orbifolds

Orbifolds are objects which fall short of the conditions required to define manifolds; one can view them as a special type of manifold containing singularities produced by quotients. More formally, we define orbifolds as follows.

Definition (11B.3) An orbifold is a space with an open covering $\{U_i\}$ ($U_i \subset \mathbb{R}^n$), such that each patch is diffeomorphic to $\mathbb{R}^n/G_i$, where the $G_i$'s are discrete groups which can be taken as groups that fix the origin of $\mathbb{R}^n$, and which may in addition be trivial.

The above definition is often referred to as the mathematicians' definition, as opposed to the physicists' definition given below (the reader will note that the latter is more global).

Definition (11B.4) An orbifold is a space of the form $M/G$, where $M$ is a manifold and $G$ is a discrete group which leaves some of the points of $M$ fixed.

Obviously it is the first definition, or rather its complex version, that is used in string theory. The open sets $U_i$ are now subsets of $\mathbb{C}^n$.

(i) Orbifolds are flat; (ii) they can break $N = 4$ supersymmetry; (iii) they can produce chiral fermions; and (iv) they are easy to generate. The simplest example of an orbifold is the cone, which is a two-dimensional space divided by the action of the discrete symmetry group $\mathbb{Z}_n$:

(v) $\mathbb{R}^2/\mathbb{Z}_n$.

The origin, which is a fixed point under this group action, is a singularity. A natural question to ask is: can an orbifold be a K3 surface? To answer this in the affirmative, we construct the 4-torus as a 2-dimensional complex manifold by dividing the complex plane $\mathbb{C}^2$ with affine coordinates $(z_1, z_2)$ by the group $\mathbb{Z}^4$ generated by

(vi) $z_k \mapsto z_k + 1$, $z_k \mapsto z_k + i$, $k = 1, 2$.

Next consider the $\mathbb{Z}_2$ group of isometries generated by $(z_1, z_2) \mapsto (-z_1, -z_2)$. Clearly this $\mathbb{Z}_2$ generator fixes 16 points: $(0, 0)$, $(0, \tfrac{1}{2})$, $(0, \tfrac{1}{2} i)$, $(0, \tfrac{1}{2} + \tfrac{1}{2} i)$, $(\tfrac{1}{2}, 0)$, ..., $(\tfrac{1}{2} + \tfrac{1}{2} i, \tfrac{1}{2} + \tfrac{1}{2} i)$. Hence we have an orbifold, which we denote by $S_0$. $S_0$ is a K3 surface, since it can be checked that $\pi_1(S_0) = 0$ and $K = 0$. (See Aspinwall in Ad [69]).
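The count of 16 fixed points can be checked by direct enumeration. In the sketch below a torus point is written in four real coordinates $(\mathrm{Re}\, z_1, \mathrm{Im}\, z_1, \mathrm{Re}\, z_2, \mathrm{Im}\, z_2)$, each taken modulo 1; the quarter-lattice grid and the helper name `is_fixed` are our own choices for illustration.

```python
from fractions import Fraction
from itertools import product

def is_fixed(point):
    """A torus point x (coordinates mod 1) is fixed by x -> -x exactly when
    2x lies in the lattice, i.e. when every coordinate of 2x is an integer."""
    return all((2 * x).denominator == 1 for x in point)

# Scan a grid of quarter-lattice points in the unit cell; only coordinates
# equal to 0 or 1/2 survive the fixed-point condition.
grid = [Fraction(k, 4) for k in range(4)]
fixed_points = [p for p in product(grid, repeat=4) if is_fixed(p)]
print(len(fixed_points))   # 16
```

Each of the four real coordinates can independently be $0$ or $\tfrac{1}{2}$, which gives $2^4 = 16$ fixed points, in agreement with the list above.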

B.3.3

Narain Spaces

Our next definition deals with a particular type of moduli space known as Narain space. These spaces are found to be very useful in heterotic string theory. Narain spaces $\mathcal{M}_{k,l}$ are defined as:

(vi) $\mathcal{M}_{k,l} = SO(k, l; \mathbb{Z}) \backslash SO(k, l) / (SO(k) \times SO(l))$

From Chapter 2 we are already familiar with $SO(k, l)$, the non-compact form of $SO(k + l)$ which preserves a metric with $k$ plus signs and $l$ minus signs. The group $SO(k) \times SO(l)$ is its maximal compact subgroup, and the quotient space $SO(k, l)/(SO(k) \times SO(l))$ is a $kl$-dimensional homogeneous space. The group $SO(k, l; \mathbb{Z})$ is the discrete infinite group consisting of all matrices in $SO(k, l)$ with integer entries. When the above homogeneous space is modded out by this discrete group, singularities are introduced on account of the fixed points of the discrete group in the moduli space; thus $\mathcal{M}_{k,l}$ is an orbifold. Narain has shown (see Ad [77, 78]) that when $l = k + 16$, $SO(k, l; \mathbb{Z})$ is a subgroup of $SO(k, l)$ that preserves an even self-dual lattice of signature $(k, l)$. For the 4d heterotic string the T duality group is $G_T = SO(6, 22; \mathbb{Z})$.
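The dimension of the homogeneous space follows from a simple count of generators, sketched below. It uses only the fact that $SO(k, l)$ has the same dimension as $SO(k + l)$; the function names are illustrative.

```python
def dim_so(n):
    """Dimension of SO(n): the number of independent antisymmetric n x n generators."""
    return n * (n - 1) // 2

def narain_dimension(k, l):
    """dim SO(k,l) - dim SO(k) - dim SO(l), where dim SO(k,l) = dim SO(k+l)."""
    return dim_so(k + l) - dim_so(k) - dim_so(l)

print(narain_dimension(6, 22))   # 132
```

The result is always the product $kl$; for the four-dimensional heterotic string with $(k, l) = (6, 22)$ this gives 132 moduli.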

B.4

THE CONCEPT OF HOLONOMY, AND CALABI-YAU MANIFOLDS

Holonomy is a classical concept based on the Levi-Civita connection. It was used by differential geometers to study the flatness (or non-flatness) of surfaces (or manifolds) (see O.[6]). It has now acquired a renewed importance with the advent of Calabi-Yau manifolds in string theory (see Chap. 15 in [13 b]). We describe this concept in brief before defining the Calabi-Yau manifolds.

Definition (11B.5) Holonomy is the process of assigning to each contractible closed curve $\gamma$ on a Riemannian manifold $K$ the linear transformation which measures the rotation resulting from parallel transport of a vector around this curve. The matrices corresponding to these linear transformations are called holonomy matrices. If $M_1$ and $M_2$ are two such matrices corresponding to closed curves $\gamma_1$ and $\gamma_2$, then the product $M_1 \cdot M_2$ is the matrix that corresponds to the composite curve $\gamma_1 \cdot \gamma_2$. The set of these matrices forms a group, say $H$, known as the holonomy group. Generically $H$ is all of $SO(n)$, where $n = \dim K$. However, one is generally interested in cases where $H$ is a 'proper subgroup' of $SO(n)$*. When $K$ is flat, $H$ consists of the identity element only, since the orientation of parallel-transported vectors does not change.
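A concrete holonomy matrix can be computed on the unit sphere, which is not flat. The sketch below transports a tangent vector around the geodesic triangle with vertices $(1,0,0)$, $(0,1,0)$, $(0,0,1)$. It assumes the standard fact (not derived in the text) that parallel transport along an arc of a great circle coincides with the rotation of $\mathbb{R}^3$ that carries the arc along itself; the matrix names are ours.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

# Quarter-turn rotations about the z, x and y axes (exact integer entries).
RZ = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]    # carries (1,0,0) to (0,1,0)
RX = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]    # carries (0,1,0) to (0,0,1)
RY = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]    # carries (0,0,1) back to (1,0,0)

# Composite transport around the closed curve: first RZ, then RX, then RY.
holonomy = matmul(RY, matmul(RX, RZ))

print(apply(holonomy, [1, 0, 0]))   # [1, 0, 0]: the base point returns to itself
print(apply(holonomy, [0, 1, 0]))   # [0, 0, 1]: the tangent vector is rotated
```

The tangent vector comes back rotated by $\pi/2$ in the tangent plane at the base point, which equals the area of the triangle (one eighth of the sphere). The holonomy matrix here is an element of $SO(2)$ acting on that tangent plane, built as a product of the matrices for the three legs.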


Remark (11B.6) We note that the above definition holds good for any physical field $\psi$ that can be parallel-transported around closed paths on a manifold; in particular it holds for spinors. Furthermore, as in gauge theories, a covariantly constant spinor field $\psi$ (i.e. $D_i \psi = 0$) always returns to its original value upon parallel transport around a contractible closed curve. In string theory we are interested in the case $n = 6$, since it is here that we can find a covariantly constant spinor $\psi$ which obeys $U\psi = \psi$ where $U \in H$. The group $H$ is $SU(3)$.

Definition (11B.7) A Calabi-Yau manifold $K$ is a compact (complex) Kähler manifold which has $SU(d)$ holonomy, $d$ being the complex dimension of $K$. From subsection (1.2.6) we know that $K$ carries a Hermitian metric $ds^2 = g_{i\bar{j}}\, dz^i d\bar{z}^j$ whose Kähler form $(i/2)\, g_{i\bar{j}}\, dz^i \wedge d\bar{z}^j$ is closed. Using these, an equivalent statement that $K$ be a Calabi-Yau manifold is that the metric be Ricci-flat:

$$R_{i\bar{j}} = 0.$$

As a historical perspective we add: it was Calabi who conjectured, and Yau who proved, that a compact Kähler manifold of vanishing first Chern class always admits a Kähler metric of $SU(d)$ holonomy. These manifolds play an important role in string theories, especially when the complex dimension is 3. (The quintic hypersurface in complex projective four-space $P^4(\mathbb{C})$ with homogeneous coordinates $(z_1, \ldots, z_5)$, given by the zero locus of a quintic polynomial $P(z_1, \ldots, z_5) = 0$, is a Calabi-Yau manifold of holonomy $SU(3)$.) The Hodge numbers of these manifolds with $SU(d)$ holonomy satisfy: $h^{0,s} = h^{s,0} = 0$ for $1 \le s < d$, and $h^{0,d} = h^{d,0} = 1$. Using the fact that $K$ is connected we also have $h^{0,0} = 1$.

B.5

DOLBEAULT COHOMOLOGY ON MANIFOLDS OF SU(N) HOLONOMY

Let $K$ be a Kähler manifold of $SU(N)$ holonomy; then a spinor field on $K$ is the same as a collection of $(0, q)$ forms ($q = 1, 2, \ldots, N$). Thus a spinor field $\psi_{x\alpha}$ with values in some holomorphic vector bundle $X$ is the same as a collection of $(0, q)$ forms $\psi_{\bar{a}_1 \cdots \bar{a}_q\, x}$ with values in $X$. In such a set-up the Dirac operator $\Gamma^i D_i$ and the Dolbeault operator $\bar{D} + \bar{D}^*$ are the same. Here $\Gamma^i$ stands for the sum of creation and annihilation operators* $(a_i^* + a_i)$, and $D_i$ stands for the covariant derivative. The operator $\bar{D}$ is the gauge covariant exterior derivative operator and $\bar{D}^*$ is its adjoint; $\bar{D}$ maps $(0, q)$ forms with values in $X$ to $(0, q + 1)$ forms with values in $X$ by the formula:

$$(\bar{D}\psi)_{\bar{a}_1 \bar{a}_2 \cdots \bar{a}_{q+1}} = \frac{1}{q+1} \left\{ D_{\bar{a}_1} \psi_{\bar{a}_2 \cdots \bar{a}_{q+1}} \pm \text{cyclic permutations} \right\}$$

Just as we have $d^2 = 0$, we have here $\bar{D}^2 = 0$. A form $\psi$ is closed if $\bar{D}\psi = 0$, and is $\bar{D}$-exact if $\psi = \bar{D}\chi$ for some $\chi$. Hence the equivalence classes of $(0, q)$ forms that are closed but not exact form the q-th Dolbeault cohomology group $H^q(X)$. The zero modes of the Dirac operator are the zero modes of the Dolbeault operator, i.e. they are the Dolbeault cohomology classes, the elements of $H^q(X)$. These groups are of great value in string theory due to their properties; e.g. $H^q(X)$ and $H^{N-q}(\bar{X})$ have the same dimension, where $\bar{X}$ is the dual or complex conjugate bundle of $X$. (See [13b] for details).

* The operators $a_i^*$ and $a_i$ are defined as follows: (i) $(a_i^* \Psi)_{j_1 \cdots j_{p+1}} \propto \{ g_{i j_1} \Psi_{j_2 \cdots j_{p+1}} \pm \text{cyclic permutations} \}$, (ii) $(a_i \Psi)_{j_1 \cdots j_{p-1}} = \Psi_{i j_1 \cdots j_{p-1}}$, where $g_{ij}$ in (i) denotes a component of the metric $g$ on $K$. Evidently $a_i^*$ acting on a p-form $\Psi$ that does not have an index of type $i$ makes it into a $(p + 1)$-form by adding an index of type $i$.

B.6

GROUPS THAT MATTER IN STRING/SUPERSTRING THEORIES

The theory of strings and superstrings uses groups of almost all types, e.g. $SU(2)$, $SL(2, \mathbb{R})$, $SL(2, \mathbb{Z})$, $SO(2n)$, $SO(2n + 1)$ and $E_i$ ($i = 6, 7, 8$). We shall describe in brief the properties of $SO(3)$, $SO(8)$, $SO(16)$ and $SO(32)$. We recall (see Chapter 2) that $SO(3)$ has a covering group $SU(2)$. The representations of $SU(2)$ fall into two conjugacy classes (integer spin and half-integer spin); only integer spin representations are representations of $SO(3)$. $SU(2)$ is simply connected while $SO(3)$ is not.

B.6.1

The Groups SO(8), SO(16) and SO(32).

In the case of $SO(2n)$, the simply connected covering group is Spin$(2n)$. It has a four-element center, and its representations fall into four conjugacy classes, as will be seen later. The various covering groups differ from each other from the global topological point of view, though in the neighbourhood of the identity they are the same. This means that there is no distinction at the level of Lie algebras. The group $SO(2n)$ and its various covering groups have rank $n$, which means that they have a maximal subset of $n$ commuting generators. The groups are 'simply laced', that is, all of the root vectors have the same length. If we take the (normalised) length squared of a root vector as 2, we can define the Cartan ($n \times n$) matrix by using the entries:

$$A_{ij} = \frac{2\, e_i \cdot e_j}{e_j \cdot e_j} = e_i \cdot e_j$$

where the $e_i$ are simple positive roots. (See Sec. 4.3 - 4). The diagonal elements of the Cartan matrix are 2, whereas the off-diagonal elements are 0 or $-1$. The Dynkin diagrams of these groups are used to obtain the symmetries of these groups.

[Figure: Dynkin diagrams: (i) SO(2n), (ii) SO(8)]
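The Cartan matrix of $SO(2n)$ can be generated directly from a set of simple roots. The sketch below assumes the usual choice $e_i = u_i - u_{i+1}$ ($i < n$) and $e_n = u_{n-1} + u_n$ in an orthonormal basis $u_i$; this choice is standard but is not spelled out in the text, and the function names are illustrative.

```python
def simple_roots_so2n(n):
    """Simple roots of SO(2n) in an orthonormal basis (assumed standard choice):
    u_i - u_{i+1} for i < n, together with u_{n-1} + u_n."""
    roots = []
    for i in range(n - 1):
        r = [0] * n
        r[i], r[i + 1] = 1, -1
        roots.append(r)
    last = [0] * n
    last[n - 2], last[n - 1] = 1, 1
    roots.append(last)
    return roots

def cartan_matrix(roots):
    """A_ij = 2 e_i.e_j / e_j.e_j; exact here because every root has length squared 2."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return [[2 * dot(a, b) // dot(b, b) for b in roots] for a in roots]

for row in cartan_matrix(simple_roots_so2n(4)):
    print(row)
```

For $n = 4$ the second simple root is joined to each of the other three, while those three are mutually orthogonal. This is the threefold symmetry of the $SO(8)$ diagram.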


Thus, for instance, the Dynkin diagrams of $SO(2n)$ for all $n$ have a 2-fold symmetry under reflection through the horizontal plane. When $n = 4$ this symmetry is larger, as is obvious from (ii). This extra symmetry (extra automorphism*) of $SO(8)$, called 'triality', relates the spinor representations of $SO(8)$ to the vector representation. In Sections (2.3), (2.4) and (3.7) we briefly touched upon representations. In the theory of strings it is the spinor representations that are important. To obtain such representations for $SO(2n)$ we consider a system with symmetry group $U(n)$. We introduce $n$ fermion creation operators $b_j^*$ and $n$ annihilation operators $b_i$ ($i, j = 1, \ldots, n$). The usual anticommutation relations satisfied by them are

$$\{b_i, b_k\} = \{b_i^*, b_k^*\} = 0; \qquad \{b_i, b_j^*\} = \delta_{ij} \qquad (11B.3)$$

This system can be represented in a Hilbert space of dimension $2^n$. Denoting the 'Fock vacuum' by $|\Omega\rangle$, the other states are obviously

$$b_i^* |\Omega\rangle = |\Omega_i\rangle, \qquad b_{j_1}^* \cdots b_{j_k}^* |\Omega\rangle = |\Omega_{j_1 \cdots j_k}\rangle$$

This system possesses a $U(n)$ symmetry under which $b_j$ transforms as $\mathbf{n}$ and $b_j^*$ as $\bar{\mathbf{n}}$. The $U(n)$ generators are $[b_i, b_j^*]/2$. The system also possesses a natural $SO(2n)$ symmetry (in fact a Spin$(2n)$ symmetry, as the $SO(2n)$ quantum numbers are half-integral). For this we define the 'gamma matrices' $\gamma_k$ ($k = 1, \ldots, 2n$) as:

$$\gamma_k = b_k + b_k^*, \quad k = 1, 2, \ldots, n; \qquad \gamma_k = (b_{k-n} - b_{k-n}^*)/i, \quad k = n + 1, \ldots, 2n \qquad (11B.4)$$

and note that they satisfy the Dirac anticommutation relations:

$$\{\gamma_k, \gamma_l\} = 2\delta_{kl} \qquad (11B.5)$$

Equation (11B.5) leads to the commutation relations:

$$[\sigma_{kl}, \sigma_{jm}] = \delta_{lj}\, \sigma_{km} \pm \text{permutations} \qquad (11B.6)$$

where $\sigma_{kl} = [\gamma_k, \gamma_l]/4$.

Thus we have constructed a $2^n$-dimensional representation of $SO(2n)$. This is known as the spinor representation of $SO(2n)$. The representation is not irreducible. To obtain an irreducible representation we define an operator

$$\bar{\gamma} = i^{n(2n-1)}\, \gamma_1 \gamma_2 \cdots \gamma_{2n} \qquad (11B.7)$$

which commutes with all of the generators of $SO(2n)$ and satisfies $\bar{\gamma}^2 = 1$. Spinor states of $\bar{\gamma} = +1$ or $-1$ are known as spinors of positive or negative chirality, respectively. Using this one has two irreducible representations $S_\pm$, each of dimension $2^{n-1}$, namely the spinors of positive and negative chirality. Thus if $|\Omega\rangle$ has positive chirality, then the positive chirality spinors of $SO(2n)$ are the states made by acting on $|\Omega\rangle$ with an even number of $b_j^*$'s:

$$S_+ = \bigoplus_{k\ \text{even}} b_{j_1}^* \cdots b_{j_k}^* |\Omega\rangle = \bigoplus_{k\ \text{even}} |\Omega_{j_1 \cdots j_k}\rangle \qquad (11B.8)$$

Similarly

$$S_- = \bigoplus_{k\ \text{odd}} |\Omega_{j_1 \cdots j_k}\rangle \qquad (11B.9)$$

the collection of negative chirality spinors is formed by the action of an odd number of operators. We recall that the states of an irreducible representation can be described by their weights (a set of points in an n-dimensional weight space (see Sec. 4.3 and Sec. 5.3)). The four representations of $SO(2n)$ (or of Spin$(2n)$), namely the fundamental, the adjoint, and the two spinor ones, belong to different conjugacy classes. This is because two representations belong to the same conjugacy class if and only if their weight vectors differ by a vector in the root lattice, and this is not the case here. Returning to $SO(8)$, the number 8 can be decomposed as $8 = 4 + 4$, and the spinor representations are $8_s$ and $8_c$, whereas the fundamental vector representation is $8_v$. All three groups mentioned in the heading are used in superstring theories, as we shall see in the next section. For example, if we give up the assumption of space-time supersymmetry, we can formulate a modular-invariant and tachyon-free heterotic string theory with gauge group $SO(16) \times SO(16)$.

* Note that these are outer (and not inner) automorphisms (see Def. 2.1.10).
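The oscillator construction of the spinor representation can be checked with explicit matrices. The sketch below realises the $b_j$ on the $2^n$-dimensional Fock space through a Jordan-Wigner tensor product, which is our own choice of realisation and is not prescribed by the text. It then forms the gamma matrices of (11B.4) and the chirality operator of (11B.7) and verifies (11B.5) numerically. All function names are illustrative.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

I2, Z, A = [[1, 0], [0, 1]], [[1, 0], [0, -1]], [[0, 1], [0, 0]]

def annihilators(n):
    """b_1..b_n in an assumed Jordan-Wigner form: the Z factors supply the
    signs that make different modes anticommute."""
    ops = []
    for j in range(n):
        m = [[1]]
        for site in range(n):
            m = kron(m, Z if site < j else (A if site == j else I2))
        ops.append(m)
    return ops

def gamma_matrices(n):
    """gamma_k = b_k + b_k^* for k <= n, and (b_{k-n} - b_{k-n}^*)/i for k > n."""
    first, second = [], []
    for b in annihilators(n):
        bd, dim = dagger(b), len(b)
        first.append([[b[r][c] + bd[r][c] for c in range(dim)] for r in range(dim)])
        second.append([[(b[r][c] - bd[r][c]) * (-1j) for c in range(dim)] for r in range(dim)])
    return first + second

def clifford_ok(gammas, tol=1e-12):
    """Check {gamma_k, gamma_l} = 2 delta_kl entry by entry."""
    dim = len(gammas[0])
    for k, gk in enumerate(gammas):
        for l, gl in enumerate(gammas):
            ab, ba = matmul(gk, gl), matmul(gl, gk)
            for r in range(dim):
                for c in range(dim):
                    target = 2.0 if (k == l and r == c) else 0.0
                    if abs(ab[r][c] + ba[r][c] - target) > tol:
                        return False
    return True

def chirality(n):
    """i^{n(2n-1)} gamma_1 gamma_2 ... gamma_{2n}."""
    gammas = gamma_matrices(n)
    prod = gammas[0]
    for g in gammas[1:]:
        prod = matmul(prod, g)
    phase = [1, 1j, -1, -1j][(n * (2 * n - 1)) % 4]
    return [[phase * x for x in row] for row in prod]

n = 3
gbar = chirality(n)
plus = sum(1 for r in range(2 ** n) if abs(gbar[r][r] - 1) < 1e-12)
print(clifford_ok(gamma_matrices(n)), plus)   # True 4
```

In this basis the chirality operator is diagonal, and its sign depends only on whether the number of occupied modes is even or odd. This is the splitting of the $2^n$ states into $S_+$ and $S_-$, each of dimension $2^{n-1}$; which of the two contains the vacuum depends on sign conventions.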

B.6.2

Exceptional Groups

Exceptional groups are Lie groups that do not fit in the usual classification scheme of our familiar groups, e.g. $O(N)$, $SO(N)$, $SU(N)$ or $Sp(N)$. These groups are $G_2$, $F_4$, $E_6$, $E_7$ and $E_8$. The last three of these are used in string theory. In particular, the group $E_8 \otimes E_8$ is used as a symmetry group of heterotic string theory. $E_8$ is a simply-laced group (i.e. all non-zero roots have the same length squared (see Chap 5 Subsec 2.5)) of rank 8 and dimension 248. Thus there are 240 roots of length squared two in the root lattice. In terms of eight orthonormal unit vectors $u_i$ they are:

($\alpha$) $\pm u_i \pm u_j$, $i \ne j$, $i, j = 1, 2, \ldots, 8$

($\beta$) $\tfrac{1}{2}(\pm u_1 \pm u_2 \pm \cdots \pm u_8)$, even number of $+$ signs.

The roots in ($\alpha$), which are 112 in number, give the root system of the spin(16) subalgebra of $E_8$, and the 128 roots in ($\beta$) describe the weights of one of the spinor representations of spin(16). The lattice $\Gamma_8$ is generated by this root system. This lattice is an even integral lattice and is self-dual. The metric for $E_8$ is the matrix $g_{ij} = e_i \cdot e_j$, given in a basis of simple positive roots by the following $8 \times 8$ matrix:

(j) $$g_{ij} = \begin{pmatrix}
2 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 2 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 2 & -1 & 0 & -1 \\
0 & 0 & 0 & 0 & -1 & 2 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 2 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & 2
\end{pmatrix}$$

The reader may recall that this is the familiar Cartan matrix (Subsec. 4.4.3) that can be obtained using the Dynkin diagram of $E_8$ given below.

[Figure: Dynkin diagram of $E_8$]
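The root counts quoted above, and the self-duality of the lattice, can be verified by enumeration. The sketch below lists the roots of types ($\alpha$) and ($\beta$) with exact rational arithmetic and computes the determinant of the Cartan matrix as read from the text (the matrix entries below are our transcription, and the helper names are illustrative). A determinant equal to 1 is the standard criterion for an integral lattice to be self-dual.

```python
from fractions import Fraction
from itertools import combinations, product

def e8_roots():
    roots = []
    for i, j in combinations(range(8), 2):            # type (alpha): +-u_i +-u_j
        for si, sj in product((1, -1), repeat=2):
            r = [Fraction(0)] * 8
            r[i], r[j] = Fraction(si), Fraction(sj)
            roots.append(tuple(r))
    for signs in product((1, -1), repeat=8):           # type (beta): half-integer entries
        if signs.count(1) % 2 == 0:                    # even number of + signs
            roots.append(tuple(Fraction(s, 2) for s in signs))
    return roots

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    det, n = Fraction(1), len(m)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return det

E8_CARTAN = [
    [2, -1, 0, 0, 0, 0, 0, 0],
    [-1, 2, -1, 0, 0, 0, 0, 0],
    [0, -1, 2, -1, 0, 0, 0, 0],
    [0, 0, -1, 2, -1, 0, 0, 0],
    [0, 0, 0, -1, 2, -1, 0, -1],
    [0, 0, 0, 0, -1, 2, -1, 0],
    [0, 0, 0, 0, 0, -1, 2, 0],
    [0, 0, 0, 0, -1, 0, 0, 2],
]

roots = e8_roots()
print(len(roots), determinant(E8_CARTAN))   # 240 1
```

Together with the 8 Cartan generators, the 240 roots account for the dimension $240 + 8 = 248$ of $E_8$.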


Another exceptional group used in superstring theory is the non-compact form $E_{7,7}$ of the group $E_7$, whose maximal compact subgroup is $SU(8)$. The Cartan matrix for $E_7$ (and $E_6$) is obtained by deleting one row and one column (two rows and two columns) from the matrix given in (j).

B.7

FIVE SUPERSTRING THEORIES

Through a process of elimination it was determined in the early eighties that all five superstring theories require ten dimensions (one time and nine space) for their description. The supersymmetry coordinate $\theta$ had to be a Majorana-Weyl spinor (see Sec. 7.2, where we showed that the dimension satisfies $D = 2 \pmod 8$ in the case of such a spinor), and therefore the coordinates $\theta^1$ and $\theta^2$ had to be assigned a definite handedness*. This meant that they could both have the same handedness or have opposite handedness; these choices naturally resulted in different types of theories. We give below the specific characteristics of these five theories, and show in another section how they can be considered as five different perturbative expansions of a single underlying theory about five different points.

To begin with we note that for open strings $\theta^1$ and $\theta^2$ must be equated at the ends of the string due to boundary conditions; hence they should both be of the same handedness (a left-handed spinor cannot be equal to a right-handed spinor). In the case of closed strings the boundary condition is imposed via periodicity in $\sigma$, which has nothing to do with $\theta^1$ and $\theta^2$; hence both choices, same handedness and opposite handedness, are permissible here.

Type I-Theory: The theory is based on open superstrings. The open-string boundary conditions reduce the space-time supersymmetry to $N = 1$ only. The theory consists of interacting unoriented open and closed strings. For consistency of the quantum theory the ends of an open string must be allowed to join, which means in turn that closed strings have to be included here. The underlying group of the theory is the familiar group $SO(32)$ (see B.6.1).

Type IIA-Theory: This theory is based on closed superstrings only. When $\theta^1$ and $\theta^2$ have opposite handedness, the theory necessarily involves oriented strings, as $\theta^1$ describes modes that propagate one way around the string, while modes described by $\theta^2$ propagate in the opposite direction. The theory has two conserved $D = 10$ supersymmetries of opposite handedness. It has two conserved supercharges of opposite chirality. However, the theory itself is nonchiral, since it is left-right symmetric.

Type IIB-Theory: The theory is based on closed superstrings and uses two $\theta$ coordinates of the same handedness. Here one has two options: symmetrize right and left modes and obtain a theory of unoriented closed superstrings, or leave it as a theory of oriented closed strings. The former case is nothing else but the closed-string sector of the $SO(32)$ theory. In the latter case there are two space-time supersymmetries of the same handedness. This is what we call the Type IIB-Theory. This theory is left-right asymmetric; in other words it is a chiral theory.

Heterotic-string theory: This theory is formulated by using only one $\theta$ coordinate. It is a closed string theory with the special feature that it treats the compactification of the left- and right-moving sectors separately. The left-moving sector, purely bosonic in character, has a 26-dimensional string. Sixteen of these dimensions are compactified on a lattice**; the remaining ten get reduced to eight in light-cone gauge. The right-movers, on the other hand, are supersymmetric; they contain the GS (Green-Schwarz) Majorana-Weyl fermionic fields $S^a(\tau - \sigma)$ and the space-time string fields $X^i(\tau - \sigma)$. Indices $a$ and $i$ in light-cone gauge run from 1 to 8 (see Subsec. 11.3.9). In the form of a table these left and right movers are:

Left movers              Right movers
$X^i(\tau + \sigma)$     $X^i(\tau - \sigma)$
$X^I(\tau + \sigma)$     $S^a(\tau - \sigma)$

The index $I$, which labels the directions of the 16-dimensional lattice, runs from 1 to 16. The lattice used in the theory comes from either the exceptional group $E_8 \otimes E_8$ or from Spin$(32)/\mathbb{Z}_2$. This is because the only even, self-dual lattices in 16 dimensions are the root lattices of the groups $E_8 \otimes E_8$ and Spin$(32)/\mathbb{Z}_2$. In general, a member of the Fock space of the heterotic string is the product of the left- and right-moving sectors: $|0\rangle_L \times |0\rangle_R$.

* A spinor is said to be right (left) handed if the physical system it represents has right (left) handed rotation.
** The compactification of the 16-dimensional space by a lattice (spanned by basis vectors $e_i^I$, say) means that if we walk in the direction $L^I$ specified by one of the lattice vectors, we eventually wind up back at the same point. (The indices $I$ and $i$ stand for $1, \ldots, 16$ and $1, \ldots, 8$ respectively).

The right-moving modes give the supersymmetric degrees of freedom, whereas the left-moving modes are responsible for gauge symmetries. The symmetry group of the theory is either $E_8 \times E_8$ or $SO(32)$. (See Chap. 10 in [15a] and Chap. 6 in [13a] for details). We give below all five theories in the form of a table (see Forste and Louis in [70]). The following notation is used in the description: $g_{\mu\nu}$ is the metric tensor, $b_{\mu\nu}$ the antisymmetric tensor, $A_\mu$ the R-R vector, $C_{\mu\nu\rho}$ the R-R three-form, $A^a_\mu$ the Yang-Mills field, and $D$ the dilaton (the metric $g_{\mu\nu}$, the tensor $b_{\mu\nu}$ and $D$ are common to all five vacua, i.e. to the vacuum states of these theories). The indices $\mu, \nu, \rho$ stand for $0, 1, \ldots, d - 1$.

Table 2: Superstring Theories in Ten Dimensions

Type                        No. of Q's (supercharges)   No. of $\psi_\mu$'s (gravitinos)   Bosonic spectrum
IIA                         32                          2                                  NS-NS: $g_{\mu\nu}$, $b_{\mu\nu}$, $D$
                                                                                           R-R: $A_\mu$, $C_{\mu\nu\rho}$
IIB                         32                          2                                  NS-NS: $g_{\mu\nu}$, $b_{\mu\nu}$, $D$
                                                                                           R-R: $c^*_{\mu\nu\rho\sigma}$, $b'_{\mu\nu}$, $D'$
Heterotic $E_8 \times E_8$  16                          1                                  $g_{\mu\nu}$, $b_{\mu\nu}$, $D$; $A^a_\mu$ in the adjoint of $E_8 \times E_8$
Heterotic $SO(32)$          16                          1                                  $g_{\mu\nu}$, $b_{\mu\nu}$, $D$; $A^a_\mu$ in the adjoint of $SO(32)$
Type I $SO(32)$             16                          1                                  NS-NS: $g_{\mu\nu}$, $D$
                                                                                           open string: $A^a_\mu$ in the adjoint of $SO(32)$
                                                                                           R-R: $b_{\mu\nu}$

In conclusion we note that the heterotic string theories have non-abelian gauge symmetries, and thus vector bosons and their superpartners (gauginos) belonging to the adjoint representation of the gauge group. The bosonic spectrum of the Type I and Type II theories, on the other hand, can appear in two distinct sectors (NS-NS or R-R), depending on the boundary conditions of the world-sheet fermions (see Sec. 11.6). In Sec. (B.12) we shed some light on their relation amongst themselves and their relation to M-theory.


B.8

DUALITY SYMMETRIES OF SUPERSTRING THEORIES

In this section we describe the three dualities S, T and U, which have linked the five (seemingly) different superstring theories and have thus brought about a better understanding of the theory as a whole. Schwarz calls the discovery that followed this understanding 'The Second Superstring Revolution' (see Ad [53(b)], and Schwarz in Ad [69]). Let A and B be two theories; they are said to be 'S dual' if theory A at strong coupling is equivalent to theory B at weak coupling (and vice versa). There is an exact map between the descriptions of the two theories. In particular, if $\phi_A$ and $\phi_B$ denote the dilaton fields of A and B, which determine the string coupling constant according to the rule $\lambda = \exp(\phi)$, then

B.9

THE BPS STATES AND BLACK HOLES

Here we describe the so-called BPS condition, which the 'states' of an N-extended 4d supergravity theory have to satisfy for making the theory viable. We shall see that these states make possible the comparison between string states and black holes. From Sec. (7.4) we recall that one of the anticommutators in the N-extended 4-dimensional supersymmetry algebra (in 2-component notation) is:

$$\{Q^I_\alpha, Q^J_\beta\} = \epsilon_{\alpha\beta}\, Z^{IJ}$$

where $Z^{IJ} = -Z^{JI}$ are $N(N - 1)/2$ central charges. They are in fact complex numbers, whose real and imaginary parts give the electric and magnetic charges associated with the $N(N - 1)/2$ $U(1)$ gauge fields in the N-extended 4d supergravity multiplet. According to the rules of the supersymmetry algebra, the mass of any state is bounded below by its central charges. This bound is known as the 'Bogomolnyi bound'. When the mass of a state attains the minimum value allowed for given charges (and moduli), the state is said to be BPS saturated. BPS states belong to smaller representations of the algebra than are possible when the bound is not saturated. To see how it works, we take $N = 4$ and make an $SO(4)$ change of basis to write:

$$Z^{IJ} = \begin{pmatrix} 0 & Z_1 & 0 & 0 \\ -Z_1 & 0 & 0 & 0 \\ 0 & 0 & 0 & Z_2 \\ 0 & 0 & -Z_2 & 0 \end{pmatrix}$$

Thus we note that even though the supergravity multiplet has six $U(1)$ gauge fields, we can obtain a generic configuration by considering two electric and two magnetic charges. As a result there are two ways to achieve BPS saturation. One of these is where the mass satisfies the condition:

$$M = |Z_1| = |Z_2|$$

This gives a 16-dimensional gauge multiplet. The second possibility for a BPS state is:

$$M = |Z_1| > |Z_2|.$$

The first case occurs when the electric and magnetic charge vectors are parallel, whereas the second occurs when they are not parallel. But in the perturbative string spectrum BPS states are purely electric, hence they are of the first type. We note that static extremal black hole configurations with $M = |Z_1| = |Z_2|$ have an event horizon of vanishing area, i.e. no Bekenstein-Hawking entropy*, and they turn out to preserve one-half of the supersymmetry. On the other hand, the BPS states with $M = |Z_1| > |Z_2|$ preserve only one-quarter of the supersymmetry and have an event horizon of finite area. One of the important features of these BPS states is that they receive no quantum corrections (perturbatively or non-perturbatively) to their masses as long as the supersymmetry remains unbroken. This result of Witten and Olive from the earlier years of string theory (see [82] and Schwarz in [69]) has led to insights in the duality theories of today. For instance, S duality predicts (non-trivially) that the multiplicities of BPS states are $SL(2, \mathbb{Z})$-invariant. We emphasize here that BPS (Bogomolnyi-Prasad-Sommerfield) states in the context of string theory are an outcome of $N = 2$ supersymmetric Yang-Mills theory. For $SU(2)$ Yang-Mills theory we studied these while finding the monopole solution (see Sec. 10.4). We shall return to them again in Section B.14, when we study black holes.
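The two ways of saturating the bound can be summarised in a few lines of code. The sketch below only encodes the statements made above for $N = 4$ (mass bounded below by the larger of $|Z_1|$, $|Z_2|$; one-half or one-quarter of the supersymmetry preserved); the function names, the returned labels and the numerical tolerance are our own illustrative choices.

```python
def central_charge_matrix(z1, z2):
    """Antisymmetric 4 x 4 matrix of central charges in the block-diagonal basis."""
    return [[0, z1, 0, 0], [-z1, 0, 0, 0], [0, 0, 0, z2], [0, 0, -z2, 0]]

def number_of_central_charges(n_susy):
    """N(N-1)/2 independent entries of an antisymmetric N x N matrix."""
    return n_susy * (n_susy - 1) // 2

def classify_state(mass, z1, z2, tol=1e-12):
    """Compare the mass with the Bogomolnyi bound max(|Z1|, |Z2|)."""
    big, small = sorted((abs(z1), abs(z2)), reverse=True)
    if mass < big - tol:
        raise ValueError("mass violates the Bogomolnyi bound")
    if abs(mass - big) > tol:
        return "not BPS"
    return "BPS, 1/2 supersymmetry" if abs(big - small) <= tol else "BPS, 1/4 supersymmetry"

print(number_of_central_charges(4))            # 6
print(classify_state(5.0, 3 + 4j, 5.0))        # BPS, 1/2 supersymmetry
print(classify_state(5.0, 3 + 4j, 1.0))        # BPS, 1/4 supersymmetry
```

Complex values of $Z_1$, $Z_2$ are allowed because their real and imaginary parts carry the electric and magnetic charges; only the moduli enter the bound.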

B.10

COSMIC STRINGS AND SUPERSTRINGS

It is not surprising that even during the early years of string/superstring theory, serious attempts were made to link it directly to gauge theories as well as to theories that involve gravity. The strings that carry the adjective 'cosmic' (cosmic strings) are an outcome of such attempts. The subject of cosmic strings, which may be viewed as an intersection of cosmology and particle physics, is quite advanced now; see [85] and the long list of references there. In this paragraph we describe cosmic strings as they were first given by Witten [80] in 1985. The vortex lines associated with gauge symmetry-breaking, in their astrophysical setting, can be thought of as 'cosmic strings'. The quantized QCD flux tube associated with colour confinement is a string-like object. Witten goes on to say that the superstring is a string-like object of potential significance; in particular, a type II string crossing the horizon in today's universe would be stable and would be detectable as a gravitational lens, the deflection angle in this case being $\Delta = 2G/\alpha'$, with $G$ being Newton's constant and $\alpha'$ being string tension. (See Chap. 10 in the above reference). In this context one talks of 'global string dynamics and radiation', 'global monopoles' and 'string superconductivity'. Naturally this shows the relationship of string theory to other physical theories.

* In the early seventies a similarity (analogy) was discovered between the laws of black hole dynamics and the laws of thermodynamics. This is known as the Bekenstein-Hawking entropy, and it states: one quarter the area of the event horizon behaves in every way like a thermodynamic entropy. Using the BPS states one can derive this entropy (including the numerical factor) by counting black hole microstates. (See [76] and footnotes 3 and 4 in B.14).

B.11

D-BRANES AND p-BRANES

We give here a simplistic definition of these branes (not brains), just to bring our readers to the frontiers of the mathematics and physics of the 21st century. In keeping with the tradition of this text, we begin by showing their relation to the PDE theory of the early 20th century. We shall see in the next section that it is the p-branes and the D-branes that have led to the 11-dimensional M-theory. This theory, which is more fundamental in nature, is related to all five superstring theories of the eighties (see B.7) through duality relations. (See Sec. B.12).

B.11.1 Boundary-value Problems of Dirichlet (D) and Neumann (N)

Let $U$ be a bounded open set in $\mathbb{R}^n$ with closure $\bar{U}$ and boundary $\partial U$. The problem of finding $u \in C^2(U) \cap C^1(\bar{U})$ such that

$\Delta u = f, \quad u = u_0$ on $\partial U$    (11B.10)

and $\partial u/\partial\nu = u_1$ on $\partial U$ is called a (D) or a (N) problem provided it is as given in (11B.11) or in (11B.12), respectively (see footnote 71).

(D) Find $u$ such that

$\Delta u = 0$ in $U$, $\quad u \in C^2(U) \cap C^0(\bar{U})$,    (11B.11)

$u(y) = u_0(y)$, $\quad y \in \partial U$, $\quad u_0 \in C^0(\partial U)$.

(N) Find $u$ such that

$\Delta u = 0$ in $U$, $\quad u \in C^2(U) \cap C^1(\bar{U})$,    (11B.12)

$(\partial u/\partial\nu)(y) = u_1(y)$, $\quad y \in \partial U$, $\quad u_1 \in C^0(\partial U)$.

When n = 3, the boundary is sometimes called a membrane (a term borrowed from biology, where it means the covering of an organ). In string phenomenology objects with p dimensions (p > 1) are called p-branes. A p-brane, as it evolves in time, sweeps out a (p + 1)-dimensional world volume. Since there has to be enough room for the p-brane to move about in space-time, (p + 1) has to be $\le$ the space-time dimension D. We use d to denote (p + 1). In a space-time equipped with supersymmetry the word p-brane is replaced by super p-brane. Evidently a string is a 1-brane in this description, and as such one can write p-brane actions and study problems similar to those of string theory. Before we do this, we define a D-brane by using the principle of dualities in conjunction with the action S for the open string in conformal gauge (see Sec. 11.2-3):

$$S = \frac{1}{4\pi\alpha'} \int_{M} d^2\sigma\, \partial_a X^\mu\, \partial^a X_\mu \qquad (11B.13)$$

(Here M denotes the worldsheet; its 'spatial' coordinate $\sigma^1$, one of the pair $(\sigma^1, \sigma^2)$, runs over a finite interval starting at 0.) Varying the action gives

$$\delta S = -\frac{1}{2\pi\alpha'} \int_{M} d^2\sigma\, \delta X^\mu\, \partial^2 X_\mu + \frac{1}{2\pi\alpha'} \int_{\partial M} d\sigma\, \delta X^\mu\, \partial_n X_\mu \qquad (11B.14)$$

where $\partial_n$ is the derivative normal to the boundary, and $\mu$ takes the values 1, 2, ..., 25 (showing the bosonic character of the string). At the boundary, the only Poincaré invariant condition is the Neumann condition (see footnote 72)

$$\partial_n X^\mu = 0 .$$

We would like to remark here that it is consistent with space-time Poincaré invariance and worldsheet conformal invariance to add non-dynamical degrees of freedom to the ends of the string. These Chan-Paton degrees of freedom have no dynamics; thus an end of the string prepared in one of the Chan-Paton states always remains in that state. Hence when we consider the case of the oriented open string with U(N) symmetry (see footnote 73) and let the open string endpoints be confined to N (p + 1)-dimensional hyperplanes, then in view of T-duality (see B.8), the Neumann conditions $\partial_n X^m(\sigma^1, \sigma^2) = 0$ become Dirichlet conditions $\partial_t X'^m(\sigma^1, \sigma^2) = 0$ for the dual coordinates $X'^m$.

71. $\partial u/\partial\nu$ denotes the derivative with respect to the normal to the boundary.
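As a short worked step, the two admissible boundary conditions can be read off from the variation (11B.14). The following is only a sketch using the symbols of (11B.14); the labels (N) and (D) and the name $\partial M$ for the worldsheet boundary are added here for reference and are not taken from the text.

```latex
% Requiring \delta S = 0 in (11B.14) for arbitrary \delta X^\mu:
\begin{align*}
\text{bulk:}\quad & \partial^2 X_\mu = 0, \\
\text{boundary:}\quad & \delta X^\mu\, \partial_n X_\mu \,\big|_{\partial M} = 0 .
\end{align*}
% For each \mu the boundary term can vanish in one of two ways:
\begin{align*}
\text{(N)}\quad & \partial_n X_\mu = 0
   \quad (\delta X^\mu \text{ left free at the ends}), \\
\text{(D)}\quad & \delta X^\mu = 0
   \;\Longleftrightarrow\; X^\mu = \text{constant at the ends}.
\end{align*}
```

Condition (N) is the Neumann condition of the text; condition (D) fixes the endpoint in the direction $\mu$, which is why it singles out a hyperplane and is not invariant under translations in that direction.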

[Fig. 11.25]

72. The Dirichlet condition $X^\mu$ = constant is also consistent with the equation of motion. Our purpose here is slightly different.

73. Chan-Paton factor: A technique for introducing U(2) or U(3) gauge symmetry in the open-string sector of the bosonic string theory was proposed by Chan and Paton (see 11.[13a], and Paton and Chan in Nucl. Phys. B10 (1969), 516), whose generalization to arbitrary N was quite apparent. The 'quark' and 'antiquark' at the two ends of an oriented open string transform in the adjoint representation of U(N), and the U(N) quantum number is given by a U(N) generator $\lambda = (\lambda^a_{ij})$. The indices i and j (of the matrix element) correspond to the U(N) states of the quark and antiquark respectively, and the N × N matrices $\lambda^a_{ij}$ form a basis in terms of which a string wavefunction can be decomposed:

(a) $|k; a\rangle = \sum_{i,j=1}^{N} |k; ij\rangle\, \lambda^a_{ij}$

If more than one open string (say 4) is attached to a disc in cyclic order, then, because the quark and antiquark indices of neighbouring strings must match, one can sum over all basis states and obtain:

(b) $\lambda^1_{ij}\, \lambda^2_{jk}\, \lambda^3_{kl}\, \lambda^4_{li} = \mathrm{Tr}(\lambda^1 \lambda^2 \lambda^3 \lambda^4)$

The wave functions in (a) are the Chan-Paton factors. Each vertex operator carries such a factor. The sum in (b) gives the trace of the product of Chan-Paton factors, which appears in each open string amplitude. All such amplitudes are invariant under the U(N) symmetry.


We note that while the "Neumann condition" does not display the dynamical feature of the hyperplane, the "Dirichlet condition" does. Thus the hyperplane is a dynamical object, a Dirichlet membrane (D-brane) or more specifically a Dp-brane. In this terminology the original Type I theory contains N D25-branes. The main distinction between D-branes and super p-branes is that the fields of the world-volume theory include an abelian vector gauge field $A_\mu$ in addition to the superspace coordinates $(X^\mu, \theta)$ of the ambient D-dimensional space-time. D-branes and super p-branes are the most recent tools for studying the non-perturbative aspects of string theory. D-branes can be viewed as topological defects in string theory, and are defined by the property that strings can end on them (see Ad.[48(a)] and Ad.[48(b)]). Among other things, the D-branes in these references are shown as "instantons" and "black holes." We pursue this topic in the next subsection (B.11.2).

B.11.2 Bosonic p-brane Action, and Supermembrane Action

Consider an extended object with 1 time and (d - 1) space dimensions moving in a D-dimensional spacetime. The dynamics of this object (p-brane) is governed by minimizing the world volume which the object sweeps out:

$$S = -T_d \int d^d\xi\, \left( -\det\, \partial_i x^\mu\, \partial_j x^\nu\, \eta_{\mu\nu} \right)^{1/2} \qquad (11B.15)$$

where $\xi^i$, $i = 0, \dots, d-1$ are worldvolume coordinates and $x^\mu$, $\mu = 0, \dots, D-1$ are space-time coordinates in Minkowski space of metric $\eta_{\mu\nu}$. The constant $T_d$ is the tension of this object, which makes S dimensionless. The classical equations of motion that follow from (11B.15) can also be obtained from the action:

$$S = T_d \int d^d\xi \left[ -\frac{1}{2}\sqrt{-\gamma}\, \gamma^{ij}\, \partial_i x^\mu\, \partial_j x^\nu\, \eta_{\mu\nu} + \frac{1}{2}(d-2)\sqrt{-\gamma} \right] \qquad (11B.16)$$

where $\gamma_{ij}(\xi)$ are the components of an auxiliary field, $\gamma^{ij}$ is its inverse, and $\gamma$ is its determinant. Varying with respect to $\gamma_{ij}$ gives the equation of motion and shows that $\gamma^{kl}\, \partial_k x^\mu\, \partial_l x^\nu\, \eta_{\mu\nu} = d$, and hence $\gamma_{ij} = \partial_i x^\mu\, \partial_j x^\nu\, \eta_{\mu\nu}$ is the induced metric on the world volume. It is interesting to note that when d = 2, the cosmological term in (11B.16) drops out, and the action displays a conformal symmetry

$$\gamma_{ij}(\xi) \to \Omega^2(\xi)\, \gamma_{ij}(\xi), \qquad x^\mu(\xi) \to x^\mu(\xi) \qquad (11B.17)$$

where $\Omega$ is some arbitrary function of $\xi$. A conformally invariant action for any d can now be written as:

$$S = -T_d \int d^d\xi\, \sqrt{-\gamma} \left[ \frac{1}{d}\, \gamma^{ij}\, \partial_i x^\mu\, \partial_j x^\nu\, \eta_{\mu\nu} \right]^{d/2} \qquad (11B.18)$$
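Returning to the auxiliary-field action (11B.16), the step from its equation of motion to the induced metric can be written out explicitly. This is a sketch under two stated assumptions: the shorthand $g_{ij}$ for $\partial_i x^\mu \partial_j x^\nu \eta_{\mu\nu}$ is introduced here only for brevity, and the overall sign of the Lagrangian is the one for which the result reproduces (11B.15).

```latex
% Assumed shorthand: g_{ij} := \partial_i x^\mu \partial_j x^\nu \eta_{\mu\nu}
\begin{align*}
\mathcal{L} &= T_d\left[-\tfrac12\sqrt{-\gamma}\,\gamma^{ij} g_{ij}
              + \tfrac12 (d-2)\sqrt{-\gamma}\right],
\qquad \delta\sqrt{-\gamma} = -\tfrac12\sqrt{-\gamma}\,\gamma_{ij}\,\delta\gamma^{ij}, \\
0 = \frac{\delta\mathcal{L}}{\delta\gamma^{ij}} \;&\Longrightarrow\;
g_{ij} - \tfrac12\,\gamma_{ij}\,\gamma^{kl}g_{kl} + \tfrac12 (d-2)\,\gamma_{ij} = 0, \\
\text{trace with } \gamma^{ij}: \quad
& \tfrac12 (d-2)\left(d - \gamma^{kl}g_{kl}\right) = 0
\;\Longrightarrow\; \gamma^{kl}g_{kl} = d \quad (d \neq 2), \\
\text{hence} \quad & \gamma_{ij} = g_{ij}, \qquad
\mathcal{L} = T_d\left[-\tfrac{d}{2} + \tfrac{d-2}{2}\right]\sqrt{-\det g_{ij}}
            = -T_d\,\sqrt{-\det g_{ij}} .
\end{align*}
```

The last line is the integrand of (11B.15), so the two actions agree on the solutions of the $\gamma_{ij}$ equation. For d = 2 the trace condition is empty, which is the freedom expressed by the conformal symmetry (11B.17).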

We next write the super p-brane action (often called the Green-Schwarz action). Such an action can be written whenever there is a closed (p + 2)-form in superspace. For this we use the coordinates $Z^M = (x^\mu, \theta^\alpha)$ of a curved superspace and the supervielbeins $E_M{}^A(Z)$, where $M = \mu, \alpha$ are world indices and $A = a, \alpha$ are tangent space indices (see Sec. 7.7). The pullback of $E_M{}^A$ is defined as $E_i{}^A = \partial_i Z^M E_M{}^A$, and the action reads

$$S = T_d \int d^d\xi \left[ -\frac{1}{2}\sqrt{-\gamma}\, \gamma^{ij} E_i{}^a E_j{}^b\, \eta_{ab} + \frac{1}{2}(d-2)\sqrt{-\gamma} + \frac{1}{d!}\, \epsilon^{i_1 \cdots i_d}\, E_{i_1}{}^{A_1} \cdots E_{i_d}{}^{A_d}\, B_{A_d \cdots A_1} \right] \qquad (11B.19)$$

where $B_{A_d \cdots A_1}(Z)$ is a super d-form, the potential of the closed (p + 2)-form. The above action contains the kinetic term, a world-volume cosmological term, and a Wess-Zumino term. It reduces to the Green-Schwarz superstring action when d = 2. For an action for D-branes we refer the reader to the articles of M. Duff, J. Schwarz and J. Polchinski in the reference [69] added in proof.

B.12 M-THEORY AND ITS RELATION TO OTHER THEORIES

B.12.1 Here We Describe in Brief M-Theory and Kaluza-Klein Theory

Some twenty years ago Cremmer et al. [84] proposed this theory as an 11-dimensional low-energy theory of supergravity. This may very well be called the seed of 'M-theory', though it is completely transformed now, in the sense that it can be related to all other superstring theories. (See the figure captioned Grand Finale.) The theory admits 'soliton' solutions; when the core of such a solution extends over p spatial dimensions and one time dimension, the 'soliton' is called a p-brane. The p-brane solitons that saturate a BPS bound (i.e. those that preserve some fraction of the underlying supersymmetry) are of special interest, because the BPS condition ensures that solutions of the classical low-energy supergravity field equations exhibit features of the exact quantum string theory, such as the relationship between the tension and the charge. It is worth noting here that when supergravity solutions are 'non-singular', the energy is smoothly spread over a region surrounding a p-dimensional subspace; in other cases there are delta-function singularities at the core. The situation is saved by postulating the presence of "fundamental p-branes". One therefore finds in the literature two categories of p-branes: the 'solitonic' and the 'fundamental'. Having introduced the theory via supergravity, we next show how the 'duality amongst two theories' added another dimension to the 10-dimensional superstring theories, which in effect confirmed the existence of an 11-dimensional theory. For instance, consider the case of the IIA and $E_8 \times E_8$ theories: in each case there is an eleventh dimension (the tenth spatial dimension, whose size is denoted here by $L_{11}$) that becomes large at strong coupling; the size of this dimension scales as $L_{11} \sim \lambda^{2/3}$, where $\lambda$ is the 10-dimensional coupling constant. This remained unnoticed, since in perturbation theory such a compact dimension is hardly visible.
In the IIA case it is a circle (whose dimension is hidden), whereas in the $E_8 \times E_8$ case it is a line interval I (or, putting it differently, the orbifold $S^1/Z_2$). Thus in the $E_8 \times E_8$ case the 11-dimensional space-time can be visualized with two 10d faces. They are referred to as 'end-of-the-world 9-branes'. One of the $E_8$ gauge groups is associated to each face. At strong coupling the faces move apart, and the theory is described by the same 11d bulk theory that describes the IIA theory at strong coupling. We further note that the IIA/IIB T-duality and the IIA/M S-duality can be combined as a duality between IIB theory on $R^9 \times S^1$ and M-theory on $R^9 \times T^2$, $T^2$ being the 2-torus.


One of the concepts that is important in string theories is that of chirality. Through the introduction of branes (the 2-brane and 5-brane, as well as the end-of-the-world 9-branes) it has been shown that there exists a left-right asymmetry (consistent with anomaly cancellation requirements) in the theory; hence M-theory is a 'chiral theory'. The theory incorporates super Yang-Mills as well as supergravity theory, and uses the Kaluza-Klein principle of curling up dimensions. According to Witten, the five string theories and D = 11 supergravity represent six different special points in the moduli space of M-theory.

[Figure: diagram with M at D = 11; IIB, IIA, I, $E_8 \times E_8$ and O(32) at D = 10; and a further level at D = 9.]

Grand Finale: Connection between superstring theories (Adapted from Schwarz [53])

B.12.2 KALUZA-KLEIN THEORIES: A MEANS TO T-DUALITY

The 5-dimensional classical Kaluza-Klein theory is generalized to 10 dimensions, and the curling up of one dimension is replaced here by the compactification of 6 dimensions. If we consider one compact dimension in the form of a circle of radius R, then there are two kinds of excitations to consider. The first are Kaluza-Klein momentum excitations on the circle, which contribute $(n/R)^2$ to the energy squared, n being an integer; these excitations are not special to string theory. The winding-mode excitations, which result from the winding of a closed string m times around the circular dimension, are the ones that are special to string theory. If $T = (2\pi L_{ST}^2)^{-1}$ denotes the string tension (energy per unit length; see footnote 74), the contribution to the energy squared is $(2\pi R m T)^2$. T-duality exchanges these two kinds of excitations by mapping $m \leftrightarrow n$ and $R \leftrightarrow L_{ST}^2/R$. This is part of an exact map between a T-dual pair A and B.
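The exchange of momentum and winding contributions can be checked numerically. The following is a minimal sketch, assuming only the two contributions quoted above, $(n/R)^2$ and $(2\pi R m T)^2$ with $T = (2\pi L_{ST}^2)^{-1}$; the function names and the sample numbers are illustrative and not taken from the text.

```python
import math


def energy_squared(n, m, radius, l_st=1.0):
    """Momentum plus winding contribution to E^2 for one compact circle.

    n      -- Kaluza-Klein momentum number (integer)
    m      -- winding number (integer)
    radius -- radius R of the circle
    l_st   -- string length scale L_ST (illustrative default of 1)
    """
    tension = 1.0 / (2.0 * math.pi * l_st ** 2)                  # T = (2 pi L_ST^2)^(-1)
    momentum_part = (n / radius) ** 2                            # (n/R)^2
    winding_part = (2.0 * math.pi * radius * m * tension) ** 2   # (2 pi R m T)^2
    return momentum_part + winding_part


def t_dual(n, m, radius, l_st=1.0):
    """Apply the T-duality map: swap m and n, send R to L_ST^2 / R."""
    return m, n, l_st ** 2 / radius


if __name__ == "__main__":
    n, m, radius, l_st = 3, 2, 0.7, 1.3
    before = energy_squared(n, m, radius, l_st)
    after = energy_squared(*t_dual(n, m, radius, l_st), l_st)
    print(before, after)  # the two values agree up to rounding
```

Under the map the momentum term of one description becomes the winding term of the other, so the sum is unchanged for any integers n, m and any positive R.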

74. $L_{ST}$ here denotes the unit used in measuring the string length; it is nearly two orders of magnitude larger than the Planck length $L_{PL} = (\hbar G/c^3)^{1/2} \sim 10^{-33}$ cm.

B.13 SECOND QUANTIZATION AND STRING THEORIES

The importance of second quantization in field theories is a well known fact (see 6.[16], 9.[24], 11.[15] a and [24] added in proof). In earlier chapters (particularly in 9 and 11), whenever we dealt with the topic of quantization, we made no distinction between first and second quantization, though we used both of them implicitly. For instance Eq. (5.5.5), which expresses the Hamiltonian $H = \frac{1}{2}p^2 + \omega^2 q^2$ of a single point-particle in the quantized form $H = \frac{1}{2}P^2 + \omega^2 Q^2$, is an example of first quantization. This is based on the quantization rule for position and momentum, e.g.:

$$[P_i, Q_j] = -i\,\delta_{ij} \quad \text{and} \quad p_i \to P_i, \quad q_i \to Q_i .$$

The first quantization approach, however, does not go very far with interacting point-particles/strings that can split and re-form. In this case one has to consider different topologies for the trajectories/world-sheets of these particles/strings*, leading to unwieldy computations. Second quantization, which is a procedure where 'the wave equation is quantized as if it were a classical field', avoids this problem. We note that Eq. (9B.19a), $i\hbar\, \partial\psi/\partial t = H\psi$, in second quantized form would be

$$i\hbar\, \frac{\partial \psi}{\partial t} = H\psi$$

where $\psi(\mathbf{r}, t)$ is now a field operator. We must emphasize at this point that second quantization does not add any new physics: first and second quantization, when applied to any theory, produce equivalent physical content. It is just that second quantization handles the many-body problem more effectively. We list below the advantages of the second quantized approach (SQA) as compared to the first quantized approach (FQA).

1. In the case of SQA all information can be derived with the help of a single off-shell action, in contrast to the FQA, which is an on-shell approach and where the interactions of the theory need to be introduced by hand.
2. Unitarity is manifest in SQA, but not in FQA, where the counting of graphs needs to be checked at every stage to avoid errors.
3. SQA is accessible to non-perturbative calculations (e.g. dimensional breaking), whereas the formulation of FQA is perturbative.

It is interesting to note here that string theory evolved historically as a first quantized theory, and even to this day the formulation of second quantized field theory is written in terms of the first quantized form. With recent progress in string/superstring theories, where non-perturbative formulations are required, a knowledge of SQA is essential; for this the reader is referred to the text of F.A. Berezin (The Method of Second Quantization, Acad. Press 1967). Incidentally, the representation space Exp H (of infinite-dimensional Lie groups) introduced in Sec. (0.4) appeared for the first time in connection with the second quantization of field theories. The operators of Exp H (which is known as Bose-Fock space) are easy to express in terms of creation and annihilation operators, and they are at the base of the formalism of second quantization. It is hoped that the second quantized theory, due to its unusual features, will eventually be able to reveal the underlying geometry on which the 'entire superstring theory' model is based.
* Number of such topologies (as we already know) is much smaller in the case of strings.
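The creation and annihilation operators mentioned above can be illustrated with finite matrices. The following is a minimal sketch, assuming a single bosonic mode truncated to the states |0>, ..., |dim-1>; the truncation and all function names are assumptions made for illustration, since the true Bose-Fock space is infinite-dimensional.

```python
import math


def annihilation(dim):
    """Matrix of the annihilation operator a on the span of |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for k in range(1, dim):
        a[k - 1][k] = math.sqrt(k)  # a|k> = sqrt(k) |k-1>
    return a


def transpose(mat):
    """Transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*mat)]


def matmul(x, y):
    """Product of two square matrices of equal size."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(x, y):
    """Commutator [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    size = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(size)] for i in range(size)]


if __name__ == "__main__":
    dim = 6
    a = annihilation(dim)
    a_dag = transpose(a)  # creation operator; the matrix is real, so transpose suffices
    comm = commutator(a, a_dag)
    number = matmul(a_dag, a)
    print([round(comm[k][k], 12) for k in range(dim)])    # diagonal of [a, a^dagger]
    print([round(number[k][k], 12) for k in range(dim)])  # occupation numbers
```

On every state below the cutoff the commutator acts as the identity and the number operator counts quanta; the last diagonal entry of the commutator deviates only because of the truncation.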

B.14 BLACK HOLES AND ELEMENTARY PARTICLES

As the last few pages of the book were going into print, there came (on Aug. 19th) the exciting news of the images received from the X-ray telescope Chandra, named after S. Chandrasekhar, an Indian Nobel Laureate (see Remark (8.5.9) for the Chandrasekhar limit). The images consisted of a collision of (space) debris, shock waves rushing into interstellar space at a speed of millions of miles per hour, and a bright point near the center of a collapsing star. The Chandra telescope, third in a series (the previous two were the Hubble Space Telescope and the Compton Gamma Ray Observatory, launched in 1990 and 1991), was launched on July 23rd with the capability of conducting scientific observations of the high energy universe. This new observatory has made us better equipped for searching for dark matter and for obtaining a better estimate of the size of the universe. It would also enable us to study black holes as well as the origin of the elements. These findings would go a long way in confirming the link between black holes, elementary particles, quantum gravity and superstrings, a matter that has already been settled to a large extent, as the following paragraphs show.

Up to 1994 all attempts to link (Wheeler and Hawking's) black holes to elementary particles and strings had failed, even though both black holes and elementary particles had the same distinguishing features, namely mass, spin, and force charges. Obviously, relating a black hole (whose mass in many cases was far greater than the solar mass $M_\odot$) to an object such as a proton seemed ludicrous at best. In 1995 the situation changed completely due to a brilliant strategy on the part of Strominger. Using Seiberg and Witten's fundamental work on N = 2 supersymmetric Yang-Mills theory, he exploited the singularity concept of black holes to interpret a conifold (see footnote 1) singularity as a phenomenon that would imply massless black holes in string theory.
More specifically, he proposed that in compactified IIB string theory there were electrically or magnetically charged black hole states that become massless at conifold points (see footnote 2). Strominger's important contribution was immediately followed by a flurry of research papers (a few hours after Strominger posted his paper on the physics archives, Greene and Morrison posted their paper with further suggestions towards this work). Soon after, in a series of articles (see [79] and [83], and Schwarz in [69]), Sen answered many questions on dualities in string theory and on the correspondence (see footnote 3) between black holes and elementary string states. He did this by computing the entropy (see footnote 4) of the elementary string spectrum from a different point of view. Instead of using the area of the 'event horizon' (EH) as a measure for the entropy of the black hole, he used the area of a surface close to EH for the purpose. He called this surface the 'stretched horizon' (SH). To determine the location of SH he argued that, since the string world-sheet geometry greatly influenced the physics near the horizon, the surface should be such that the space-time curvature associated with the string metric and other field strengths was large there. He further showed that for electrically charged extremal black holes his definition of SH coincided with that of a surface where the local Unruh temperature (see footnote 5) was infinite. (See [83] for mathematical computations of entropy; Sen shows there that the entropy depends on three parameters: the mass, the left-handed charge of the black hole, and the string coupling constant, all of which are mutually independent.)

1. See the appendix for its definition.
2. Seiberg and Witten, in their paper [87] on N = 2 supersymmetric Yang-Mills theory, had shown that states which were massive generically in moduli space become massless at singular points; the states that became massless were BPS saturated magnetic monopoles or dyons. (See 11B.9 for BPS saturated states, and Peskin in [69], Ref. Ad.)
3. One of the troubling points (in the correspondence) was that, while the logarithm of the degeneracy of elementary string states increased linearly with mass, the Bekenstein-Hawking entropy of the black hole increased as the square of the mass.
4. Hawking had shown (see 8.[16 e]) that all black holes (massive or not so massive) radiated (quantum mechanically) because of their temperature. According to his calculations, a larger mass implied (in the same proportion) a lower temperature and less radiation, but a greater entropy (i.e. a greater disorder in its microscopic constituents). The reasons underlying the phenomenon 'the laws of black hole dynamics are identical to the laws of thermodynamics' were not clear to him.

Another major contribution in this stream of ideas came from Polchinski (see Ref. Ad. [48, 49] and [72]), who analyzed Strominger's paper in terms of p-branes and Dp-branes (see B.11 for the definitions). This amounted to providing a new language for black hole theory (see in particular the lectures by Duff, Schwarz, and Green in [69], TASI (96), and 8.[46] added in proof). Strominger himself, in a joint paper with Vafa ([76] added in proof), used the D-brane technology to shed light on the microscopic constituents of a certain class of black holes, and went on to calculate their associated entropy and establish the equation $S_{BH} = A/4$, where $S_{BH}$ denotes the entropy of the extremal black hole and A the area of its horizon. Among many other texts that have appeared on the subject since these developments, two need a special mention. These are Ref. [72] and [86]. Ref. [72], authored by Polchinski, consists of two volumes. The author begins with the theory of bosonic strings, introduces string duality and D-branes, and uses these tools to explain the theory of superstrings, M-theory (see footnote 6), and black hole entropy. Ref.
[86], due to Greene, is more of a philosophical and historical account of this great theory, the theory of everything. In this book (The Elegant Universe) Greene uses an innovative approach (giving examples from daily life and familiar environments, e.g. a cluttered desk (footnote 7), a donut and a beach ball (footnote 8)) to explain the terse 'mathematical concepts' of his expository article in TASI (96). There he describes string theory on Calabi-Yau manifolds, and uses in particular mirror manifolds, monodromy, flop transitions and homology theory to explain the "linkage" between the black holes of string theory and elementary particles. (Our non-mathematical description here is based on Greene's and Polchinski's texts. A few mathematical explanations are given in the appendix to this section.) Now, in order to show the linkage mentioned above, there had to be a mechanism that changed the topology of the Calabi-Yau background. This was achieved through homology relations amongst the vanishing cycles, which gave rise to new flat directions in the scalar black hole potential V (see the appendix for its definition). Moving along these flat directions one could smoothly reach the new branches of the type II theory moduli space. These new branches

corresponded to string propagation on topologically distinct Calabi-Yau manifolds. Thus in type II theory, by varying the expectation values of suitable scalar fields, one could smoothly go from one Calabi-Yau to another. This implied that black holes of type II string theory could undergo a smooth transition without losing their characteristic properties, thus signaling a link with an elementary particle. Examining the problem from a physical perspective, we note that black hole soliton states are massive in the Coulomb phase and become massless at the conifold point. Once the phase becomes Higgs, some of these massless states are gobbled up by the Higgs mechanism, while the remainder continue as massless. With respect to the topology of the new Calabi-Yau in the Higgs phase, these states are known as the perturbative string excitations, referred to as "elementary particles". In short, a massive black hole sheds its mass, becomes massless and eventually re-appears as an elementary particle-like excitation. Thus from the string theorist's point of view there is no invariant distinction between black hole states and elementary perturbative string states; they smoothly transform into one another through the space-tearing (see Fig. 11B.26) conifold transitions. Greene argues that as a 3-brane wrapping around the 3-sphere (which appears to us as a black hole) continues to shrink, the sphere (i.e. black hole) continues to lose its mass, until it reaches the final point of collapse, when it becomes massless. It then transmutes itself into a massless particle, such as a photon. In string theory a 'photon' is a single string executing a 'particular vibrational pattern'. To help visualize our discussion we reproduce three figures from 'The Elegant Universe'. The first shows the transition from one shape to another, the second a p-brane wrapping a p-dimensional object, and the third a sphere within the curled-up dimensions being wrapped by a brane; the sphere appears as a black hole in the familiar extended dimensions.

5. SH was defined by Susskind [91] as the surface where the local Unruh temperature for an observer, who was stationary in the Schwarzschild coordinates, was of the order of the Hagedorn temperature of the string theory (see [83] iii). See Ad. [22]; there Gerlach uses the Unruh temperature for the study of 'quantum radiation and accelerated frames.'
6. Although we have devoted a small section (11B.12) to this theory, we would like to emphasize here that the theory is widely accepted as a theory of unification with an open investigational field.
7. A cluttered desk exemplifies 'disorder', a layman's definition of entropy.
8. The transformation of a 'donut' to a 'beach ball' is used to convince the reader that a Calabi-Yau shape could be transformed to another Calabi-Yau (with different topology).

Fig. (11B.26): A doughnut (a torus) is transformed to a beach ball (a sphere); the circle (one-dimensional sphere) in (i) is pinched until it begins to tear as in (iii); in (iv) it becomes two points (i.e. a zero-dimensional sphere); when it is plugged it becomes a warped banana in (v) and a kidney bean in (vi), and is reshaped into a beach ball in (vii).

Fig. (11B.27): (i) A string encircling a one-dimensional curled-up piece of the spatial fabric; (ii) A two-dimensional membrane wrapping around a two-dimensional piece.


Fig. (11B.28): A sphere within the curled-up dimensions, when wrapped around by a brane, appears as a black hole.

We prefer to call this analogy between cosmological black holes and these "designer" black holes (that link them to elementary particles) a dialogue 'between black holes and (particle-like) string excitations.' Evidently, like the Chandra observatory, string theory is indeed quite capable of providing answers on black holes.
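The entropy relation $S_{BH} = A/4$ and the mass dependence described in footnotes 3 and 4 of this section can be illustrated numerically. The following is a minimal sketch under a stated assumption: it uses the uncharged Schwarzschild black hole in Planck units (G = c = ħ = k = 1), for which the horizon area is $16\pi M^2$ and the Hawking temperature is $1/(8\pi M)$, rather than the extremal black holes treated by Strominger and Vafa. The function names are illustrative.

```python
import math


def horizon_area(mass):
    """Horizon area A = 16 pi M^2 of a Schwarzschild black hole (Planck units)."""
    return 16.0 * math.pi * mass ** 2


def bh_entropy(mass):
    """Bekenstein-Hawking entropy S = A / 4."""
    return horizon_area(mass) / 4.0


def hawking_temperature(mass):
    """Hawking temperature T = 1 / (8 pi M)."""
    return 1.0 / (8.0 * math.pi * mass)


if __name__ == "__main__":
    for mass in (1.0, 2.0, 4.0):
        print(mass, bh_entropy(mass), hawking_temperature(mass))
```

Doubling the mass quadruples the entropy and halves the temperature, which is the behaviour the footnotes describe: entropy grows as the square of the mass while the temperature falls in inverse proportion to it.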


APPENDIX TO (11B.14): DEFINITIONS/EXPLANATIONS OF WORDS USED IN (11B.14)

Axion: a hypothetical neutral pseudoscalar boson with mass roughly of order 100 keV to 1 MeV. It is postulated to preserve the parity and time-reversal invariance of strong interactions despite the effects of instantons.

Conifold: a Calabi-Yau manifold with a singular complex structure, corresponding to the collapse of a three-cycle. The string theory on this space is singular. In simple terms it is an algebraic variety with a finite number of nodes, where by a node we mean an isolated singularity locally of the form $x^2 + y^2 + z^2 + w^2 = 0$. Deformations of complex structure replace the node by $S^3$, and small resolutions replace it by $S^2$.

Dyons: particles that come from classical solutions of Yang-Mills theory. These particles can carry electric as well as magnetic charge.

Mirror manifolds: a pair of Calabi-Yau manifolds $(M, \tilde{M})$ with SU(d) holonomy and equipped with the same conformal field theory is said to define mirror manifolds if their Hodge numbers ($h^{i,j}(M) = \dim H^{i,j}(M)$, see B.3) satisfy the following relations:

$$h^{1,1}(M) = h^{d-1,1}(\tilde{M}), \qquad h^{d-1,1}(M) = h^{1,1}(\tilde{M}).$$

Monodromy: the principle of monodromy asserts that if G is an admissible (see footnote 9) simply connected topological group, then any local homomorphism of G into G' possesses a unique extension to a global homomorphism of G into G'. We use this principle in string theory to transform a locally single-valued quantity to a multi-valued quantity around non-trivial closed paths.

Flop transition: the term stands for a change of topology that occurs in weakly coupled string theory, where a two-cycle collapses and a different two-cycle blows up. In algebraic geometry a flop means that the area of a rational curve is shrunk to zero (blown down) and then expanded back to positive volume (blown up) in a "transverse" direction.

Extremal black holes: black holes that carry a charge and have the minimal possible mass consistent with this charge. In the following paragraph we give Polchinski's description of the 'extremal black hole'. We write the action that involves a graviton, a dilaton and a q-form field strength in IIB theory as:

$$\int d^{10}x\, (-G)^{1/2} \left[ e^{-2\Phi} \left( R + 4 (\nabla\Phi)^2 \right) - \frac{1}{2}\, e^{2\alpha\Phi}\, |F_q|^2 \right] \qquad (11B.20)$$

(Here G, $\Phi$, and $F_q$ represent the graviton, dilaton and field strength q-form respectively; $\alpha$ is -1 for an NS-NS field and 0 for an R-R field. However, this does not concern us while defining the extremal black hole.) Next we look for a solution which is spherically symmetric in (q + 1) directions and is independent of the remaining 10 - (q + 1) directions (including time); we want this solution to have a fixed 'magnetic' charge:

$$\int_{S^q} F_q = Q \qquad (11B.21)$$

where $S^q$ is the q-sphere in the spherically symmetric directions. If instead the charge is fixed by

$$\int_{S^{10-q}} * e^{2\alpha\Phi} F_q = Q', \qquad (11B.22)$$

this solution would be a (q - 2)-brane. In view of Birkhoff's theorem (see Chap. 8) there exists a unique solution for given mass M and charge Q. When M/Q exceeds a critical value $(M/Q)_c$ the solution is a black hole with a singularity behind a horizon. By this we mean that it is a black p-brane, i.e. it is extended in p spatial dimensions and has a black hole geometry in the other (9 - p) dimensions. The source for the field strength is hidden in the singularity. When $M/Q < (M/Q)_c$ there is a 'naked singularity'. The solution that satisfies $M/Q = (M/Q)_c$ is called the 'extremal black hole'. In most cases it is a supersymmetric solution that saturates the BPS bound (see B.9 for BPS states, and Chap. 14 in [72]; a calculation of BPS bounds is given by Peskin in [69]). When p = 3, there are 3 + 1 dimensional extremally charged extended soliton solutions with a horizon, thus giving us black 3-branes. These solitons carry R-R charge, which can be calculated by integrating $F_5$ over a surrounding Gaussian 5-cycle $\Sigma_5$:

$$Q_{\Sigma_5} = \int_{\Sigma_5} F_5 \qquad (11B.23)$$

9. A topological group G is said to be admissible if its underlying space is admissible, i.e. it is arcwise connected, locally arcwise connected and locally simply connected.

Upon compactification to 4 dimensions via a Calabi-Yau 3-fold, the three spatial dimensions of the black soliton can wrap around non-trivial 3-cycles on the Calabi-Yau and thus appear to a 4-dimensional observer as black hole states. The effective electric and magnetic charges of these black hole states can then be computed by integrating $F_5$ over $A_J \times S^2$ and $B^J \times S^2$ (see footnote 10). Using the quantization assumption these are:

$$\int_{A_J \times S^2} F_5 = g_5\, n_J, \qquad \int_{B^J \times S^2} F_5 = g_5\, m^J \qquad (11B.24)$$

where $g_5$ is the 5-form coupling and $n_J$, $m^J$ are some integers. Since these are BPS saturated we have the mass relation (see footnote 11)

$$M = g_5\, e^{K/2}\, \left| m^I G_I - n_I z^I \right| \qquad (11B.25)$$

where $K = -\ln\left( i \bar{z}^I G_I - i z^I \bar{G}_I \right)$ is the Kähler potential on the complex structure moduli space $\mathcal{M}$. Suppose $n_I = \delta_{IJ}$ and $m^I = 0$ for all I and fixed J; then in the conifold limit, as $z^J \to 0$, the mass of the corresponding electrically charged black hole vanishes.

10. The homology group $H_3(M, \mathbb{Z})$ of the Calabi-Yau M has a 3-cycle basis $\{A_I, B^J\}$, where $I, J = 0, \dots, h^{2,1}(M)$, such that $A_I \cap B^J = \delta_I^J$, $A_I \cap A_J = B^I \cap B^J = 0$.
11. For a holomorphically varying 3-form $\Omega$ on the family of Calabi-Yau spaces, $z^J = \int_{B^J} \Omega$ and $G_I(z^J) = \int_{A_I} \Omega$ are used as local projective coordinates and functions of $z^J$ on the moduli space of complex structures. In terms of these coordinates, a point where $z^J$ vanishes is a conifold point. The homology cycle $B^J$ is collapsing at such a point.

The scalar potential V: let $H^a$ denote the black hole hypermultiplet associated with the vanishing cycle $\gamma^a$ (a = 1, ..., 16); then the charge of $H^a$ under the I-th U(1) is given by the flux $Q^a_I = \int F_5$, where $F_5$ now is the self-dual part of $\sum_I \alpha_I F^I$, with $\alpha_I$ dual to $A_I$. Thus the black hole states have the charges:

$$Q^a_I = \delta^a_I, \quad 1 \le a, I \le 15, \qquad Q^{16}_I = -1, \quad 1 \le I \le 15 \qquad (11B.26)$$

whereas all other charges are zero. The scalar potential governing the black hole hypermultiplets is given as

V=E'apE,ttP

(11B.27)

where 16 E

U = X Q'a £ay h*^h(% - ( « < - » #

(11B.28)

a=\

The indices involved here satisfy 7 = 1 , •••, 15, a, /5, y= 1, 2; and the fields h("l h(a} are the two complex scalar fields in the hypermultiplet Ha. The flat directions12 for V are those for which = 0 with non-zero values for the scalar-fields in the vector multiplets.
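As a rough numerical illustration of the mass relation (11B.25), the sketch below evaluates $M = g_5 e^{K/2} |m^I G_I - n_I z^I|$ for made-up period values and shows the mass of an electrically charged state shrinking as $z^J \to 0$. The function name `bps_mass`, the sample periods, and the choice to pass K in as a fixed number are assumptions made only for this illustration; in the theory K itself depends on the periods.

```python
import math

def bps_mass(g5, K, m, n, z, G):
    """Evaluate M = g5 * exp(K/2) * |m^I G_I - n_I z^I| as in (11B.25).

    m, n : integer charge vectors; z, G : complex periods (hypothetical values).
    K is passed in as a plain number, which is an assumption of this sketch.
    """
    central = (sum(mI * GI for mI, GI in zip(m, G))
               - sum(nI * zI for nI, zI in zip(n, z)))
    return g5 * math.exp(K / 2) * abs(central)

# Electrically charged state: n_I = delta_{IJ} with J = 1, and m^I = 0.
n, m = [0, 1], [0, 0]
G = [0.5j, 0.2j]  # made-up values of G_I; they drop out because m^I = 0

# Let z^J approach the conifold point z^J = 0.
masses = [bps_mass(1.0, 0.0, m, n, [1 + 0j, zJ], G) for zJ in (1e-1, 1e-2, 1e-3)]
```

With these inputs the mass is simply proportional to $|z^J|$, so the list `masses` decreases towards zero, which is the vanishing-mass statement made for the conifold limit.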

Counting the Quantum States of a Black Hole

Even though it has been fully established that black holes have thermodynamical properties (see articles by Wald and Sorkin in [93]), some people still argue that black holes have no associated quantum states; hence, according to them, counting of states is a useless exercise. To contradict this viewpoint, we simply observe that any physical system with thermodynamical properties always possesses entropy$^{13}$, and since entropy is a measure of the number of underlying quantum states, it follows that astrophysical black holes do possess (associated) quantum states. These quantum states predicted by black hole thermodynamics can be counted in the context of string theory.

In Sec. B.9 as well as in the earlier part of this section we have seen that D-branes provide a useful tool in the theory of superstrings while dealing with the question of quantum states$^{14}$. Here we consider only strings and show that the Bekenstein-Hawking entropy $S_{bh}$ (which equals 1/4 of the area of the event horizon) is comparable to the string entropy$^{15}$ $S_s$. Thus a 'count' of states in string theory implies the 'count in black holes'.

We recall that when a string is quantized in flat space-time there is an infinite tower of massive states (see Sections 11.2 and 11.3, and Kaku, Chapters 3 and 10 [15] (a)). For every integer N there are states whose mass M satisfies $M^2 \sim N/l_s^2$, $l_s$ being the length scale defined by the string tension. These states are highly degenerate, and the number of states at excitation level $N \gg 1$ is $e^{S_s}$, where $S_s \sim \sqrt{N}$. This shows that the string entropy is proportional to the mass in string units. We also note that Newton's constant G is related to the coupling constant g of string interactions via the relation $G \sim g^2 l_s^2$, and that the space-time metric is well defined in string theory only when the curvature is less than $1/l_s^2$. With this in view, we use the familiar Schwarzschild black hole solution given by:

$$ds^2 = -\left(1 - \frac{r_0}{r}\right) dt^2 + \left(1 - \frac{r_0}{r}\right)^{-1} dr^2 + r^2 \left(d\theta^2 + \sin^2\theta\, d\varphi^2\right) \tag{11B.29}$$

with the black hole mass $M_{bh} = r_0/2G$ (see Exc. 8.4). In order to compare the two entropies we have to use equal masses, i.e. equate $M_{bh}$ to the string mass $M_s$ at zero string coupling. Setting these masses equal when $r_0 \sim l_s$ gives:

$$M_{bh}^2 \sim \frac{l_s^2}{G^2} \sim \frac{N}{l_s^2}. \tag{11B.30}$$

Hence we have:

$$S_{bh} \sim \frac{r_0^2}{G} \sim \sqrt{N} \sim S_s, \tag{11B.31}$$

which suggests that strings have enough states to reproduce the entropy of black holes, and thus lead to the counting of quantum states there, even though numerical coefficients in the entropy formulas cannot be calculated. Finally we note that (11B.30) gives $G \sim l_s^2/\sqrt{N}$; combined with $G \sim g^2 l_s^2$, this shows that the transition from a string state to a black hole state occurs when the string coupling is of the order $g \sim 1/N^{1/4} \ll 1$ for large N. (See [95] for details.)

$^{12}$ These flat directions are important, since moving along these directions takes one back to the Coulomb phase in which the black hole states are massive. Mathematically this 'moving' gives positive volume back to the degenerated $S^3$s and thus resolves the singularity by deformation.

$^{13}$ Note that a ball of radiation with radius R and temperature T has mass $M \sim T^4 R^3$ and entropy $S \sim T^3 R^3$ in Planck units, i.e. when $G = c = \hbar = k = 1$. The radiation forms a black hole when $R \sim M$, which implies $T \sim 1/\sqrt{M}$ and hence $S \sim M^{3/2}$. However, the entropy of the black hole satisfies $S_{bh} \sim M^2$. This shows that any black hole whose mass $M \gg 1$ (1 is the Planck mass here) has an entropy which is much larger than the entropy of the ball of radiation that formed it. We shall use this black hole entropy relation $S_{bh} \sim M^2$ later in our computations.

$^{14}$ In spite of enough evidence in support of the procedure for counting (the black hole quantum states), through a large number of research papers and texts (given in Additional References), there is still skepticism about it (for some clarifications see Hawking in [93]).

$^{15}$ String entropy is defined in terms of the string scale, i.e. the mass scale $1/\sqrt{\alpha'}$ characterizing the tower of string excitations, and the string tension. We recall that the string tension is the 'mass per unit length of a string at rest', related to the Regge slope by $1/2\pi\alpha'$.
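The scaling argument of (11B.30) and (11B.31), together with the radiation-ball estimate of footnote 13, can be checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude illustration: all numerical coefficients (including the factor 2 in $M_{bh} = r_0/2G$) are dropped, and the function names and the default $l_s = 1$ are assumptions introduced here.

```python
import math

def correspondence_point(N, l_s=1.0):
    """Order-of-magnitude quantities at the string/black-hole transition.

    Numerical coefficients are dropped throughout, as in (11B.30)-(11B.31).
    """
    M_s = math.sqrt(N) / l_s   # string mass:            M^2 ~ N / l_s^2
    g = N ** -0.25             # transition coupling:    g ~ N^(-1/4)
    G = g**2 * l_s**2          # Newton's constant:      G ~ g^2 l_s^2
    r0 = G * M_s               # horizon radius:         r0 ~ G M
    S_bh = r0**2 / G           # black hole entropy:     S_bh ~ r0^2 / G
    S_s = math.sqrt(N)         # string entropy:         S_s ~ sqrt(N)
    return {"M": M_s, "g": g, "G": G, "r0": r0, "S_bh": S_bh, "S_s": S_s}

def radiation_vs_black_hole(M):
    """Footnote 13 in Planck units: S_rad ~ M^(3/2) versus S_bh ~ M^2."""
    return M ** 1.5, M ** 2

point = correspondence_point(10**8)
s_rad, s_bh = radiation_vs_black_hole(100.0)
```

For large N the horizon radius in `point` comes out at the string length and the two entropies agree, which is the content of (11B.31); `s_bh` exceeds `s_rad` for any mass well above the Planck mass.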

Additional References

1. M. Aganagic, C. Popescu and J. H. Schwarz, Gauge-Invariant and Gauge-Fixed D-Brane Actions, California Inst. of Tech. (preprint 68-2088) (1996). ((11))*
2. M. Artin, Geometry of Quantum Planes in Azumaya Algebras, Actions and Modules, Contemp. Math. 124 (Amer. Math. Soc., 1992), 1-15. ((9))
3. P. S. Aspinwall and D. R. Morrison, (a) String Theory of K-3 Surfaces; (b) P. S. Aspinwall, Resolution of Orbifold Singularities in String Theory in Mirror-symmetry II, Stud. Adv. Math 1 (Amer. Math. Soc., 1997), 703-716, 355-379. ((11))
4. T. Banks and M. Dine, Quantum Moduli Spaces of N=1 String Theories, Phys. Rev. D53 (3), No. 10 (1996), 5790-5798. ((11))
5. C. Becchi, A. Rouet and R. Stora, Renormalization of Gauge Theories, Ann. Phys. 98 (1976), 287. ((11))
6. M. Berkooz, et al., Anomalies, Dualities and Topology of D = 6, N = 1 Superstring Vacua, Nuc. Phys. B475, No. 1-2 (1996), 115-148. ((11))
7. B. Bhattacharya, et al., Geometrical Origin of Black Hole Radiation, Commn. Theoret. Phys. 4, No. 1 (1995), 91-108. ((8))
8. N. D. Birrell and P. C. W. Davies, Quantum Fields in Curved Space (New York: Cambridge Univ. Press, 1984). ((8)), ((9))
9. A. K. Biswas, et al., Symmetries of Heterotic String Theory, Nuc. Phys. B453, No. 1-2 (1995), 181-198. ((11))

* Numbers in double parentheses stand for chapters where they are used.

10. R. Brout, et al., A Primer for Black Hole Quantum Physics, Phys. Rep. 260, No. 6 (1995), 329-446. ((8)), ((9))
11. A. Dabholkar, Quantum Corrections to Black Hole Entropy in String Theory, Phys. Lett. B347, No. 3-4 (1995), 222-229. ((8)), ((9)), ((11))
12. T. Damour and A. M. Polyakov, String Theory and Gravity, Gen. Relativity and Gravitation 26, No. 12 (1994), 1171-1176. ((8)), ((11))
13. C. de Concini, V. G. Kac and C. Procesi, (a) Some Quantum Analogues of Solvable Lie Groups in Geometry and Analysis (Tata Inst. Fund. Res. Bombay, 1995), 41-65; (b) Some Remarkable Degenerations of Quantum Groups, Comm. Math. Phys. 157, No. 2 (1993), 405-427. ((9))
14. H. J. de Vega, et al., The General Solution of the 2D Sigma Model Stringy Black Hole and the Massless Complex Sine-Gordon Model, Phys. Lett. B323 (1994), 133-138. ((8)), ((11))
15. J. Dieudonné, Foundations of Modern Analysis (Academic Press, 1969). Vol. 1 ((3)), Vol. 7 ((9))
16. R. Dijkgraaf, et al. (eds.), String Theory, Gauge Theory and Quantum Gravity (Amsterdam: North-Holland Pub. Co., 1996); (i) S. R. Das (Degrees of Freedom in Two-dimensional String Theory, 224-233); (ii) A. Sen (Duality Symmetries in String Theory, 46-58). ((11))
17. R. Distler, Physical States of the String in a Black Hole Background in Strings and Symmetries, 1991 (ed.) P. Nelson (World Scientific, 1992), 146-153. ((9)), ((11))
18. L. Dolan, The Beacon of Kac-Moody Symmetry for Physics, AMS Notices 42, No. 12 (1995), 1489. ((4)), ((5))
19. K. L. Duggal, Curvature Inheritance Symmetry in Riemannian Spaces with Applications to Fluid Space-Times, Journ. Math. Phys. 33, No. 9 (1992), 2898-2997. ((8))
20. R. Engelking, Dimension Theory (North-Holland Publ. Co., 1978). ((11))
21. O. Ganor, et al., The String Theory Approach to Generalized 2D Yang-Mills Theory, Nuc. Phys. B434, No. 1-2 (1995), 139-178. ((10)), ((11))
22. U. H. Gerlach, Paired Accelerated Frames, Int'l Journ. of Mod. Phys. A11, No. 20 (1996), 3667-3688; also in the Proc. of the Seventh Marcel Grossmann Meeting on General Relativity (eds.) R. T. Jantzen and M. Keiser (Singapore: World Scientific, 1996). ((8))
23. D. Ghoshal and S. Mukhi, Topological Landau-Ginzburg Model of Two-Dimensional String Theory, Nuc. Phys. B425, No. 1-2 (1994), 173-190. ((11))
24. S. B. Giddings, et al., Four-dimensional Black Holes in String Theory, Phys. Rev. D48 (3) (1993), 5784-5797. ((8)), ((11))
25. P. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah-Singer Index Theorem (Publish or Perish Press: Boston, 1984). ((10))
26. T. Goto, Relativistic Quantum Mechanics of One-Dimensional Mechanical Continuum and Subsidiary Condition of Dual Resonance Model, Prog. Theor. Phys. 46 (1971), 1560. ((11))
27. A. Grothendieck, Topological Vector Spaces (Gordon and Breach, 1973). ((0)), ((3))
28. B. Gruber, L. C. Biedenharn and H. O. Doebner, Algebraic Systems, their Representations, Realizations and Physical Applications, Proc. of Symp. Symmetries in Science V (Plenum Press, 1991). ((6))
29. R. T. Hammond, Mass Spectrum in a Conformally Invariant Brans-Dicke Theory, Phys. Rev. D25, No. 10 (1982), 2699. ((8))
30. S. W. Hawking, G. T. Horowitz and F. S. Ross, Entropy, Area and Black Hole Pairs, Phys. Rev. D51 (3), No. 8 (1995), 4302-4314. ((8))
31. G. T. Horowitz and A. A. Tseytlin, (a) Exact Solutions and Singularities in String Theory, Phys. Rev. D50 (3), No. 8 (1994), 5204-5224; (b) New Class of Exact Solutions in String Theory, Phys. Rev. D50 (3), No. 8 (1995), 2896-2917. ((8)), ((11))

32. N. Ishibashi, 2D String Theory Coupled to Quantum Gravity, Phys. Lett. B312, No. 4 (1993), 411-416. ((8)), ((11))
33. B. Jensen, Stability of Black Hole Event Horizons, Phys. Rev. D51 (3), No. 10 (1995), 5511-5516. ((8))
34. V. Kac, Vertex Algebras for Beginners, Univ. Lecture Series 10 (Am. Math. Soc., 1997). ((5))
35. V. Kac and W. Wang, Vertex Operator Superalgebras and their Representations in Mathematical Aspects of Conformal and Topological Field Theories and Quantum Groups, Contemp. Math. 175 (Am. Math. Soc., 1994), 161-191. ((5)), ((7))
36. S. Kar, J. Maharana and H. Singh, S-Duality and Cosmological Constant in String Theory, Phys. Lett. B374, No. 1-3 (1996), 43-48. ((8)), ((11))
37. C. Kassel, Quantum Groups (Springer-Verlag, 1995). ((4)), ((9))
38. A. I. Kostrikin and I. R. Shafarevich (eds.), Algebra 1 (Springer-Verlag, 1990). ((4))
39. H. W. Lee, et al., Two-dimensional Black Hole in the Three-dimensional Black String, Phys. Rev. D52 (3), No. 4 (1995), 2214-2220. ((8)), ((11))
40. D. A. Lowe, The Planckian Conspiracy: String Theory and the Black Hole Information Paradox, Nuc. Phys. B456, No. 1-2 (1995), 257-268. ((8)), ((11))
41. E. Martinec, (a) String Calculus in Superstrings 87 (World Scientific, 1987), 107-156 ((11)); (b) Integrable Structures in Supersymmetric Gauge Theory and String Theory, Phys. Lett. B367, No. 1-4 (1996), 91-96.
42. Y. Nambu, (a) Quark Model and the Factorization of the Veneziano Amplitude in Symmetries and Quark Models (ed.) R. Chand (Gordon and Breach, 1970), 269; (b) Lectures at the Copenhagen Symposium. ((11))
43. A. Neveu, Dual Resonance Models and Strings in QCD in Recent Advances in Field Theory and Statistical Mechanics (eds.) J. B. Zuber and R. Stora (Cambridge Univ. Press, 1984), 760. ((11))
44. R. S. Palais, Seminar on the Atiyah-Singer Index Theorem, Ann. of Math Studies No. 57 (Princeton Univ. Press, 1974). ((10))
45. V. K. Patodi, Curvature and Eigenvalue of the Laplace Operator, J. Diff. Geo. 5 (1971), 233-249. ((10))
46. M. E. Peskin, Chiral Symmetry and Chiral Symmetry Breaking in Ad. [43], 221. ((7)), ((11))
47. A. Pietsch, Nuclear Locally Convex Spaces (Springer-Verlag, 1972). ((0)), ((3))
48. J. Polchinski, (a) TASI Lectures on D-Branes (1996); (b) J. Polchinski, S. Chaudhuri and C. Johnson, Notes on D-Branes, preprint NSF-ITP-96-003, hep-th/9602052. ((11))
49. J. Polchinski and A. Strominger, New Vacua for Type II String Theory, Phys. Lett. B388, No. 4 (1996), 736-742. ((11))
50. J. Polchinski and E. Witten, Evidence for Heterotic-Type I String Duality, Nuc. Phys. B460 (1996), 525-546. ((11))
51. A. R. Prasanna, External Magnetic Field of a Static Spherically Symmetric Star in Rosen's Bimetric Theory of Gravitation, Phys. Rev. D25, No. 10 (1982), 2701. ((8))
52. S. A. Ridgway and E. J. Weinberg, Static Black Hole Solutions without Rotational Symmetry, Phys. Rev. D52 (3), No. 6 (1995), 3440-3456. ((8))
53. J. H. Schwarz, (a) Anomaly-Free Supersymmetric Models in Six Dimensions, Phys. Lett. B371, No. 3-4 (1996), 223-230; (b) The Second Superstring Revolution, Colloquium Lecture-Sakharov Conf. (Moscow, May 1996). ((11))
54. A. Sen, (a) Duality Symmetry Group of Two-Dimensional Heterotic String Theory, Nuc. Phys. B447, No. 1 (1995), 62-84; (b) Novel Symmetries in String Theory, Current Trends in Math. and Phys. (Narosa, New Delhi, 1995), 169-177. ((11))

55. J. A. Shapiro, Electrostatic Analogue for the Virasoro Model, Phys. Rev. D5 (1972), 1945. ((11))
56. A. I. Singh, Completely Positive Hypergroup Actions, Mem. of AMS 124, No. 593 (Am. Math. Soc., 1996). ((0)), ((5)), ((9))
57. T. H. R. Skyrme, Particle States of a Quantized Meson Field, Proc. Roy. Soc. London A262 (1961), 237-245. ((9))
58. J. A. Smoller, et al., Einstein-Yang-Mills Black Hole Solutions in Papers in Honour of Chen Ning-Yang (Int'l Press: Cambridge, MA, 1995), 202-220. ((8)), ((10))
59. S. L. Sobolev, Applications of Functional Analysis in Mathematical Physics, Trans. Math. Monographs, Vol. 7 (Am. Math. Soc., 1963), Trudy Seminar Series No. 1 (Russian Publication, 1983). ((11))
60. W. Szczyrba, Hamiltonian Dynamics of Gauge Theories of Gravity, Phys. Rev. D25, No. 10 (1982), 2548. ((8))
61. R. Tikekar and L. K. Patel, Radiating Black Hole with Internal Monopole in Einstein and de Sitter Universe, Math. Today 13 (1995), 3-6. ((8))
62. I. V. Tyutin, Gauge Invariance in Field Theory and in Statistical Physics in the Operator Formulation, Lebedev preprint FIAN, No. 39 (unpublished) (1975). ((11))
63. G. Veneziano, Classical and Quantum Gravity from String Theory in Classical and Quantum Gravity (World Scientific, 1993), 134-179. ((8)), ((11))
64. J. A. Wheeler, A Journey into Gravity and Space-Time (New York: Scientific American Library, 1990). ((8))
65. A. Wightman, PCT-Spin and Statistics (Benjamin Cumming, 1978). ((9)), ((11))
66. E. Witten, (a) String Theory Dynamics in Various Dimensions, Nuc. Phys. B443, No. 1-2 (1995), 85-126; (b) Chern-Simons Gauge Theory as a String Theory in The Floer Memorial Volume (Basel: Birkhauser, 1995), 637-678; (c) Small Instantons in String Theory, Nuc. Phys. B460, No. 3 (1996); (d) Non-Perturbative Superpotentials in String Theory, Nuc. Phys. B474, No. 2 (1996), 343-360. ((11))
67. L. Witten, Gravitation: an Introduction to Current Research (John Wiley & Sons, Inc., 1962). ((8))
68. K. Yosida, Functional Analysis (3rd edn., Springer-Verlag, 1971). ((0))
69. C. Efthimiou and B. Greene (eds.), Fields, Strings and Duality; TASI 96 (World Scientific, 1997).
70. A. Sevrin, K. S. Stelle, K. Thielemans and A. Van Proeyen (eds.), Gauge Theories, Applied Supersymmetry and Quantum Gravity II (Imperial College Press, 1997).
71. N. Sanchez (ed.), String Theory in Curved Space Times (World Scientific, 1998).
72. J. Polchinski, String Theory, Superstring Theory and Beyond (Cambridge University Press, Vols. 1 and 2, 1998).
73. I. Bars, P. Bouwknegt, J. Minahan, D. Nemeschansky, K. Pilch, H. Saleur and N. Warner (eds.), Strings '95 (World Scientific, 95).
74. J. Wess and B. Zumino, Supergauge Transformations in Four Dimensions, Nucl. Phys. B70 (1974), 39.
75. J. K. Beem and K. L. Duggal (eds.), Differential Geometry and Mathematical Physics, Contemporary Mathematics 170 (Am. Math. Soc., 1993).
76. A. Strominger and C. Vafa, Microscopic Origin of the Bekenstein-Hawking Entropy, 8.[45]
77. K. Narain, New Heterotic String Theories in Uncompactified Dimensions < 10, Phys. Lett. 169B, No. 1 (1986), 41.
78. K. Narain, M. H. Sarmadi and E. Witten, A Note on Toroidal Compactification of Heterotic String Theory, Nucl. Phys. B279 (1987), 369.

79. J. H. Schwarz and A. Sen, (a) Duality Symmetries of 4D Heterotic Strings, Phys. Lett. B312 (1993), 105; (b) Duality Symmetric Actions, Nucl. Phys. B411 (1994), 35.
80. E. Witten, Cosmic Strings, Phys. Lett. 153B (1985), 1138.
81. P. G. O. Freund and K. T. Mahanthappa (eds.), Superstrings, NATO ASI Series B: Phys. Vol. 175 (Plenum Press, 1987).
82. E. Witten and D. Olive, Supersymmetry Algebras that Include Topological Charges, Phys. Lett. 78B (1978), 97.
83. A. Sen, (i) Duality Symmetries in String Theories in Dimensions < 4 in [73] cited above; (ii) SL(2,Z) Duality and Magnetically Charged Strings, Int. J. Mod. Phys. A8 (1993), 5079; (iii) Extremal Black Holes and Elementary String States, Mod. Phys. Lett. A, Vol. 10 (1995), 2081; (iv) Remark on Marginally Stable Bound States in Type II String Theory, Phys. Rev. D, Vol. 54, No. 4 (1996), 2964.
84. E. Cremmer, B. Julia and J. Scherk, Supergravity Theory in 11 Dimensions, Phys. Lett. 76B, No. 4 (1978), 409.
85. A. Vilenkin and E. P. S. Shellard, Cosmic Strings and Other Topological Defects (Cambridge Univ. Press, 1994).
86. B. Greene, The Elegant Universe (W. W. Norton & Co., 1999).
87. N. Seiberg and E. Witten, Electric-Magnetic Duality, Monopole Condensation, and Confinement in N = 2 Supersymmetric Yang-Mills Theory, Nucl. Phys. B426 (1994), 19.
88. E. Witten, String Theory Dynamics in Various Dimensions, Nucl. Phys. B443 (1995), 85.
89. A. Strominger, Massless Black Holes and Conifolds in String Theory, Nucl. Phys. B451 (1995), 96.
90. C. Vafa, Instantons on D-branes, Nucl. Phys. B463 (1996), 435.
91. L. Susskind, Some Speculations about Black Hole Entropy in String Theory, preprint RU-93-44 [hep-th/9309145]; String, Black Holes, and Lorentz Contraction, Phys. Rev. D49 (1994), 6606.
92. M. Kaku, Introduction to Superstrings and M-theory, 2nd Edition (Springer, 1998).
93. R. M. Wald (ed.), Black Holes and Relativistic Stars (Univ. of Chicago Press, 1998).
94. R. A. Matzner, Black Hole Horizons, Science Vol. 282, No. 5394 (1998), 1651.
95. G. T. Horowitz, Quantum States of Black Holes in [93].
96. Robert Irion, Ashes to Ashes: The Inner Lives of Neutron Stars, Science Vol. 297, No. 5990 (2002), 2199-2201. ((8))
97. Charles Seife, Nobels Run the Gamut: From Cells to the Cosmos (Neutrino Traps and X-ray Eyes), Science Vol. 298, No. 5593 (2002), 526-528. ((8)), ((9)), ((10))
98. N. J. C. Spooner and V. Kudryavtsev (eds.), The Identification of Dark Matter; Proc. of the Third Intl. Workshop (World Scientific, Sept 2000). ((8)), ((9))
99. Stephen Hawking, The Universe in a Nutshell (Bantam Books, New York, 2001). ((8)), ((9)), ((11))
100. R. Gopakumar, S. Minwalla, N. Seiberg and A. Strominger, OM Theory in Diverse Dimensions, J. High Energy Phys. 05 (2000), 08. ((11))
101. R. Gopakumar, J. Maldacena, S. Minwalla and A. Strominger, S-duality and Noncommutative Gauge Theory, J. High Energy Phys. 05 (2000), 20. ((11))
102. A. Strominger, Inflation and the dS/CFT Correspondence, [hep-th/0110087]. ((1)), ((11))
103. Heinrich Reitberger, Leopold Vietoris (1891-2002), Notices of the AMS Vol. 49, No. 10 (Nov 2002) (see 6.3 in Chapter 0).
104. W. Ketterle, et al., Realization of Bose-Einstein Condensates in Lower Dimensions, Phys. Rev. Lett. Vol. 87, No. 13 (24 Sept 2001). ((8)), ((9)), ((10))
105. H. L. Bray, Black Holes, Geometric Flows, and the Penrose Inequality in General Relativity, Notices of the AMS Vol. 49, No. 11 (Dec 2002). ((8)), ((11B))

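Table S Subatomic (Fundamental) Particles*

| Particle | Symbol | Mass (MeV/c^2) | Charge | Spin | Isospin | Strangeness | Mean lifetime (s) | Typical decay modes | Quark content |
|---|---|---|---|---|---|---|---|---|---|
| Gravitational field particle | | | | | | | | | |
| graviton | - | 0 | - | 2 | - | 0 | stable | - | - |
| Electromagnetic field particle | | | | | | | | | |
| photon | $\gamma$ | 0 | - | 1 | - | 0 | stable | - | - |
| Leptons | | | | | | | | | |
| electron | $e^-$ ($e^+$) | 0.511 | $-1$ ($+1$) | 1/2 | - | 0 | stable | - | - |
| muon | $\mu^-$ ($\mu^+$) | 105.7 | $-1$ ($+1$) | 1/2 | - | 0 | $2.2 \times 10^{-6}$ | $e^- \bar{\nu}_e \nu_\mu$ | - |
| tau | $\tau^-$ ($\tau^+$) | 1777 | $-1$ ($+1$) | 1/2 | - | 0 | $3 \times 10^{-13}$ | $\mu^- \bar{\nu}_\mu \nu_\tau$ | - |
| electron's neutrino | $\nu_e$ ($\bar{\nu}_e$) | $< 15 \times 10^{-6}$ | 0 | 1/2 | - | 0 | stable | - | - |
| muon's neutrino | $\nu_\mu$ ($\bar{\nu}_\mu$) | $< 0.17$ | 0 | 1/2 | - | 0 | stable | - | - |
| tau's neutrino | $\nu_\tau$ | $< 19$ | 0 | 1/2 | - | 0 | stable | - | - |
| Mesons | | | | | | | | | |
| pion | $\pi^0$ | 135.0 | 0 | 0 | 1 | 0 | $0.83 \times 10^{-16}$ | $\gamma\gamma$ | $u\bar{u}$, $d\bar{d}$ |
| | $\pi^+$ ($\pi^-$) | 139.6 | $+1$ ($-1$) | 0 | 1 | 0 | $2.6 \times 10^{-8}$ | $\mu^+ \nu_\mu$ | $u\bar{d}$, $\bar{u}d$ |
| kaon | $K^+$ ($K^-$) | 493.7 (492.67) | $+1$ ($-1$) | 0 | 1/2 | $+1$ ($-1$) | $1.24 \times 10^{-8}$ | $\mu^+ \nu_\mu$, $\pi^+ \pi^0$ | $u\bar{s}$, $\bar{u}s$ |
| eta | $\eta^0$ | 548.8 | 0 | 0 | 0 | 0 | $\sim 0.8 \times 10^{-18}$ | $\gamma\gamma$ | $u\bar{u}$, $d\bar{d}$, $s\bar{s}$ |
| Baryons | | | | | | | | | |
| nucleons: proton | $p$ ($\bar{p}$) | 938.3 | $+1$ ($-1$) | 1/2 | 1/2 | 0 | stable | - | uud |
| neutron | $n$ ($\bar{n}$) | 939.6 | 0 | 1/2 | 1/2 | 0 | 917 | $p e^- \bar{\nu}_e$ | udd |
| hyperons: lambda | $\Lambda$ ($\bar{\Lambda}$) | 1116 | 0 | 1/2 | 0 | $-1$ | $2.63 \times 10^{-10}$ | $p\pi^-$ or $n\pi^0$ | uds |
| sigma | $\Sigma^+$ ($\bar{\Sigma}^+$) | 1189 | $+1$ ($-1$) | 1/2 | 1 | $-1$ | $0.80 \times 10^{-10}$ | $p\pi^0$ or $n\pi^+$ | uus |
| delta | $\Delta^{++}$ | 1232 | $+2$ | 3/2 | 3/2 | 0 | $\sim 10^{-23}$ | $p\pi^+$ | uuu |
| xi | $\Xi^-$ ($\bar{\Xi}^-$) | 1321 | $-1$ | 1/2 | 1/2 | $-2$ | $1.64 \times 10^{-10}$ | $\Lambda\pi^-$ | dss |
| omega | $\Omega^-$ ($\bar{\Omega}^-$) | 1672 | $-1$ | 3/2 | 0 | $-3$ | $0.82 \times 10^{-10}$ | $\Lambda K^-$, $\Xi^0 \pi^-$ | sss |
| Weak interaction particles | | | | | | | | | |
| W particle | $W^+$ ($W^-$) | $> 20000$ | | 1 | | | $\sim 10^{-17}$ | | |
| | $W^0 = Z^0$ | $> 80000$ | | 1 | | | $\sim 10^{-25}$ | | |

* [Based on College Physics: F. Miller, Jr. (5th Edition, 1999), and Phys. Rev. D54 (1996) 1]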

Added in the reprint: The science of physics is open-ended and continuously evolves through corrections and subsequent modifications. One such example is the recent result regarding the 'precise measurement of the positive muon' (see Phys. Rev. Lett. Vol. 86, No. 11, March 2001, p. 2227). This has led not only to corrections in the Standard Model (denoted SM; pp. 252, 253), but also to some new information on speculative theories of superstrings.

Legend
1. Antiparticles are shown in parentheses; $\gamma$, $\pi^0$ and $\eta^0$ are their own antiparticles.
2. All spins are in multiples of $h/2\pi$.
3. Decay products using the most prevalent modes of decay are shown (in multiplicative form) only for the particles, as the decays for antiparticles are analogous. Each term here stands for the sum of its components.
4. In contrast to the charge (the electric charge), the isospin and the strangeness are quantum quantities that have no classical analogs. The isospin describes the charge independence of nuclear forces (the strong short-range forces that hold protons and neutrons together in the nucleus), and strangeness is used to classify the particle production and decay reactions.
5. Quarks, the spin 1/2 fermions denoted q, are constituents of hadrons (baryons and mesons). They are of three types (known as flavours): u (up), d (down) and s (strange); $\bar{u}$, $\bar{d}$, $\bar{s}$ denote the antiquarks. While baryons are composed of three quarks (qqq), the mesons are composed of quark-antiquark pairs ($q\bar{q}$).
6. All particles with 0 or integer spin are bosons, and all particles with half-integer spin are fermions.
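The quark contents listed in Table S can be cross-checked against the charge and strangeness columns by adding up quark quantum numbers. The sketch below assumes the standard values (charge +2/3 for u, -1/3 for d and s; strangeness -1 for s), which are not stated in the table itself; the function name and the trailing "~" used to mark an antiquark are conventions invented for this example.

```python
from fractions import Fraction

# Standard quark quantum numbers (assumed here; not listed in Table S).
CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3)}
STRANGENESS = {"u": 0, "d": 0, "s": -1}

def quantum_numbers(quarks):
    """Return (charge, strangeness) for a list of quarks such as ["u", "u", "d"].

    An antiquark is written with a trailing "~" (e.g. "s~"); it contributes
    the opposite charge and strangeness of the corresponding quark.
    """
    charge, strangeness = Fraction(0), 0
    for q in quarks:
        sign = -1 if q.endswith("~") else 1
        flavour = q.rstrip("~")
        charge += sign * CHARGE[flavour]
        strangeness += sign * STRANGENESS[flavour]
    return charge, strangeness

proton = quantum_numbers(["u", "u", "d"])
omega = quantum_numbers(["s", "s", "s"])
kaon_plus = quantum_numbers(["u", "s~"])
```

Under these assumptions the proton (uud) comes out with charge +1 and strangeness 0, the omega (sss) with charge -1 and strangeness -3, and the positive kaon with charge +1 and strangeness +1, in agreement with the corresponding rows of the table.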

CONCLUDING NOTE TO THE READER

Having devoted quite some years to writing the material (which I considered) essential to 'acquire skills' in superstring theory, I was ready to put my pen down when I learnt of exciting new developments in the theory*: the so-called 'second superstring revolution' (SSR). This 'revolution' meant not only a transformed picture of the theory through the 'duality principle', after replacing 'classical geometry' by 'quantum geometry'; it also signified the much desired linkage between the 'gigantic stars' (the black holes) and the minutest of the minute (the elementary) particles of 'matter'. Furthermore, it led to the discovery of the 'theory of branes' (the p-branes and the D-branes), a new powerful ingredient that helped solve the 'mysteries of the universe' and gave rise to a theory nicknamed M-theory, after words such as miracle, mysterious, magical or master. A name that has not been suggested so far for the theory is 'Mount Everest'. Looking at the mystery and the serenity that surrounds this 5-pronged theory, 'Mount Everest' (of string theory) may be as good a name as any other.$^a$ The following figure, giving five peaks in the vicinity of Mount Everest, illustrates our viewpoint.

The news of SSR along with its ramifications was very gratifying as it fully agreed with my thinking. For instance, I always believed that the 'five siblings' of string theory were indeed 'identical quintuplets'; that the physics of 'the large' as well as that of 'the small' shared the same universal laws; and that a human eye could not see the extra spatial dimensions (of space-time) since they were curled up beyond recognition. It was this firm belief that compelled me to write on topics of gravity, quantum theory, and gauge theories as a precursor to writing on string theory. Besides these physical theories, I have also introduced the reader to all disciplines in mathematics that are needed to learn the theory. In addition I have explained in brief the terminology used in the new developments, the SSR. This forms Appendix B of Chapter 11. And of course, like many others I could not contain my joy over the launching of the Chandra Telescope; hence, in order to be a part of the festive crowd, I have recounted the benefits that this experimental physics would bring to the vast arena of theoretical physics, namely the 'physics of superstrings'.

As I end this lonely enterprise, I fondly hope that readers of the book will approach the subject with the same enthusiasm as their predecessors did when theories of gravity, quantum and gauge fields were discovered, and that they will continue to search for yet more clarifications until all questions are fully answered.

a Courtesy I. M. Singer.
* Perhaps the 'aura' of Everest will inspire string theorists to rename it M-E-theory.

SYMBOLS$^1$

Chapter 0
$T$: topology (p. 1)
$(X, T)$; $X$: topological space X with topology T, denoted as (X, T); a vector field (on p. 14) (p. 1)
$U$: a set of subsets of X (p. 1)
$A^0$, ($\bar{A}$): interior of a set A (closure of A) (p. 2)
ONS; ONB: orthonormal system; orthonormal basis (p. 4)
$H$: Hilbert space (p. 5)
$R^n = R \times R \times \cdots \times R$: (n times) cartesian product (p. 6)
$C^r$, $C^\omega$, $C^\infty$: differentiable of order r, real analytic, smooth (p. 9)
$M$; $T_p(M)$: a smooth manifold; tangent space to M at a point $p \in M$ (p. 10)
$\chi(M)$: set of tangent vector fields (p. 12)
$\nabla_X$: covariant differential operator with respect to $X \in \chi(M)$ (p. 12)
$\Gamma^k_{ij}$: connection coefficients (p. 13)
Exp $H$: exponential of Hilbert space H (p. 16)
$S^r H$: symmetric power of H (p. 15)
$\pi_q(X)$: q-th homotopy group of X (p. 25)

Chapter 1
$M$: arbitrary complex manifold (p. 32)
$T_x(M)^C$ ($T^*_x(M)^C$): complexification of the tangent (cotangent) vector space to the manifold M (p. 33)
$M$, $N$: arbitrary Riemannian surface (p. 38)
$g_{jk}$: components of Riemannian metric (p. 51)
$\delta^i_j$: Kronecker tensor (p. 51)
$C_\infty$: $C \cup \{\infty\}$ (p. 57)
$S^2$ ($S^n$): unit sphere in $R^3$ ($R^{n+1}$) (p. 59)
$T_{ab}$ ($T_{zz}$, $T_{\bar{z}\bar{z}}$, $T_{z\bar{z}}$): components of energy momentum tensor (in $(z, \bar{z})$ coordinates) (pp. 66-7)

Chapter 2
$G$; $(a, b, c)$: group; (elements of a group) (p. 70)
$R_3$ ($S_3$): rotation group (permutation group) of order 3 (p. 71)
$GL(n, R)$ ($GL(n, C)$): group of n x n real (complex) invertible matrices (p. 75)
$GL(n, H)$: group of n x n invertible matrices with quaternion entries (p. 75)
$O(n)$ ($SO(n)$): orthogonal (special orthogonal) group of order n x n real matrices (p. 75)
$O(1, 3)$ ($SO(1, 3)$): Lorentz (proper Lorentz) group (p. 75)
$U(n)$ ($SU(n)$): unitary (special unitary) group (p. 76)
$SL_n(K)$: linear algebraic group over field K (p. 77)
$GL_n(K)$: group of n x n nonsingular matrices over K (p. 77)
$(E, X, \pi, G)$: arbitrary fiber bundle (p. 87)
$V_j$; $\phi_j$: open set of X; homeomorphic map (p. 87)
$g_{ij}(x)$: element of G (p. 88)
$P(M, G)$: principal bundle with base manifold M and bundle group G (p. 90)
$\sigma_g(x) = \sigma(g, x)$: transformation map on X induced by $\sigma: G \times X \to X$ (p. 90)
$(P \times_{Ad} G) = Ad(P)$: associated bundle with adjoint action (Ad) of G on G (p. 97)
$(P \times_{ad} g) = ad(P)$: associated bundle with adjoint action of Lie algebra g of G (p. 97)

$^1$ Description of symbols in parentheses is given in parentheses along with corresponding page numbers.

Chapter 3 L; D(L)(R(L)) Q; U Q (or X) V; V»; Vx

operator; domain (range) of L linear operator; unitary operator position operator gradient; divergence; curl

109 111 114 114

V2

V ,- iV and

=T

Laplace, momentum and kinetic energy operator

114, 115

2m -i—

=E at f + V(r,t)=

energy operator

115

total energy operator

115

wave or D 'Alembert's operator

115

= L angular momentum operator the operator adjoint to A Hilbert space of square summable functions defined on [0, 1] X; E^- lx eigenvalue of an operator; space formed by eigenvectors corresponding to A; projection operator corresponding to X Em eigenspace corresponding to Laplacian on S""' (A,, X2, ••• Ag); ga, gb, gc Gell-Mann matrices; group generators

115 127 131

f +V

- —— - V 2 = - D2 yd*

)

-irxV = rXp=J A+ L2 (0, 1)

132 138 145, 146

Chapter 4 A; 7 gl(Y) ((gln) = gl(n, R))

an arbitrary algebra; a field of characteristic 0 or prime p general linear algebra formed by endomorphisms of V (R")

153 156


Te(G) {XJ (ju= 1 ••• f) CXllv C(V^) C(V(1)); CiV^y, CiV^) M £;£>£;(])

tangent space at the identity e of Lie group G generators of a r parameter group structure constants Clifford algebra Dirac algebra; Pauli algebra; algebra of quaternions module over afieldJ arbitrary Lie algebra; derived ideal of L; (an ideal on 183) H; H*(H%) Cartan subalgebra of L; dual space of H (real subspace of H on 172) adx adjoint operator due to x e L T(L) (U(L)) contravariant tensor algebra (universal enveloping algebra of L) gl{M) Lie algebra of endomorphisms of a jF-module M (a,, «2, ••• an); n{at, ccp root system basis; element of Cartan matrix of the root system A Dynkin diagram

156 156 156 157 157 164 164 171 167 177 178 179 180

Chapter 5 g; A; a "H A = (otpij = 1 J{\ ^ / n = {or,, •••, an] fl - [ a u •••, a j

an arbitrary Lie algebra; root space; a root in A Cartan subalgebra generalised Cartan matrix (GCM) a complex vector space; dual of "H indexed subset
190 190 191 191 191 191

{#, n , ft} {{!>{, ft, IT}) realization of A{oi 'A = the transpose of A) g(A) Kac-Moody algebra a Cartan involution of g(A) fyP structure constant of g g affine algebra associated with g gn Heisenberg algebra of g

191 193 193 197 198 199

Si; 1 ® g C[r, T1] D

Cartan subalgebra of g; scalar subalgebra of g algebra of Laurent polynomials Lie algebra of derivations

199 199 199

g - g ©
semi-direct product vector space semi-direct product vector space

199 200

A(A) Q

root system of g with respect to !H(tt) root lattice

200 203

W V A

Weyl group of g g(A)-module highest weight

206 210 210


P X a, /?, y a n T; V Vr

weight system category roots in A root in A representation of g in V n-dimensional lattice; lattice dual to T on 244 Fock space corresponding to lattice F

211 212 212 212 212 213 215

(S,Q) VQ Ur.

Heisenberg system Fock space of the lattice Q double Fock space of T

221 221 223

Lagrange density Noether current charge of a system structure constants of symmetry group G Pauli matrices gauge field (on 245) Dirac's Lagrangian of free electron theory gauge-covariant derivative Lagrangian on the inclusion of the field Pauli triplet group parameters of transformation coupling constant Levi-Civita (anti-symmetry) symbols anti-symmetric (second rank) tensor Space of smooth functions on Minkowskian space current density of the field y/ massive particles of electroweak theory facors of gauge group of electroweak theory (color) gauge group of QCD Weinberg-Salam Lagrangian free right action of group G fundamental group of G group of generalized gauge transformations on principal bundle P G-equivariant function with respect to the action Ad of G (action ad of G on Lie algebra g on 265) set of section of bundle ad (P) = PxadQ space of gauge potentials (connections) on P a connection on P(M, G) Hodge Laplacian

238 239 239 239 241 243 243 243 244 245 245 245 246 246 251 252 252 253 253 254 261 262 263

Chapter 6 L JMa Qa C"Py f (a = 1,2, 3) AM(AM = (AlM, A \ A^)) Lo 0^ L'o r = (ru T2, Tj) 0=(dx,62,6i) g &k F£v, fuv J(MA) Jy W, W, Z° t/y(l), SUJ2) SUC(3) Lws p nx{G) Aut(P) 7G(P,

G) C^GC 3 '
T(adP) = LQ{P) R(P) co A10 = dw 8m + Sw dw

264 264 265 268 268


Chapter 7 !A Q F Cl(V, Q) = C(Q) C/(R") = Cln O(V, Q) T(V) (IQ(V)) P(V, Q) Pin (V, Q) Spin (V, Q) (Spin,,) E ^Spin (£) Mc spin (p, q; R) °S, lS <2 M = (Muv\ P = ( / y Q; T (Q) o(4, 1) o(3, 2) 5/7(4, R) SL(2, C) T"'(T"72)

^"a/3' ^"ap /" C = (C"^) t =+ X, XM, XD d 0(4, 1) 0(3, 2) yl^AJ £ a ( a = l, ••• N) z; zB(zs) Ca(Cc)

— f(v)\ fiv) dv y &

arbitrary algebra over a field J{ox F ) 279 vector space over J(or F) 279 the ring Z or Z 2 - the ring of integers modulo 2. 279 Clifford algebra with vector space V and quadratic 285-6 form Q Clifford algebra when V = R" 288 orthogonal group of form Q 287 tensor algebra of V (ideal of °°) 301 generators of A N 301 supernumber; body (soul) of a supernumber 301 set of all a-numbers-purely odd supernumbers 302 (c-numbers-purely even supernumbers) left (right) derivative of/with respect to v

303

supervector space

304

dv)


X, Y aL, aR R c (R a ) R'" x R"a xM, xy, xa 0*, &, 67{ox 6\ d\0k) O Q'a E,M ( £ A ) ; EMA; LBA

supervectors left (right) multiplication mapping subset of real elements of C_c (C_a) supervector space formed by the cartesian product of m R_c's and n R_a's c-type coordinates a-type coordinates open subset of Rm supercharge generators where / = 1, ••• N and a = spinorial index parameters with Einstein (Lorentz) indices; vielbein field that relates %A and £,M; Lorentz transformations

304 305 304 306 306 306 306 307 332-3

Chapter 8 T, t A IVD3 R fi\>(Rab) gfjvigab)', Vab OW, g) W, g) T'b G K

364 366 366 372 372 372 373 374 377 377

V

proper time; coordinate time Cosmological constant Minkowskian space Ricci tensor (on 391) metric tensor (on 392); flat metric tensor on 379 spacetime manifold C-extension of (M, g) Energy-momentum tensor Newtonian gravitational constant timelike Killing vector field, tangent vector of null geodesics on 391 unit vector field defined by K

—-—

Fermi derivative

388

coab\ 6ab oab Ra p

vorticity tensor; expansion tensor shear tensor Riemann tensor

390 391 391

E,(E'),. = , . . ,4 9{; N Xab^ab) a Z Np(Zl) Cabcd (3W, g*) t (S, Zl) (t (S)) J+ (5, 11) (/" (5)) ftt) = X; u 9vC = M u d

basis vectors of Tq{fq) spacelike-3 surface; unit normal to "H components of second (first) fundamental tensor of 9i variation of a timelike curve y(t) variation vector open (convex normal) neighbourhood of a point p'va'M conformal tensor Kruskal extension chronological future of 5 relative to 11 (if
391 398 399 401 401 401 407 415 418 418 430 430

377


Chapter 9 !tt L(#) |y)«0|) (x\ t'\x0, t0) «
Hilbert space 450 space of linear operators on H 450 Dirac ket (Dirac bra) 450 probability amplitude (on 483) 451 time-dependent state; normalized wave function 451, 452 Hamiltonian operators 451, 454 time ordered transition amplitude in Heisenberg picture 458 time-evolution operator (matrix representation of SU(2) on 499) 457 coordinate operator in Heisenberg (Schrödinger) picture 457 gauge-covariant derivative 460 Pauli spin matrices 462 orbital angular momentum operator 464 orbital angular momentum (spin) operator 464 momentum (energy) eigenvalue 465 electric (magnetic) fields; gauge field 468 action functional (on 479); (action functional with external source J, without J on 495) 472 Green's two point function 474 generating functional with external source J 474 Feynman's Green function 475 effective action (functional) based on external source J 491 vacuum functional for the harmonic oscillator 494 generating functional (Faddeev-Popov ansatz) 501 variable (functional of J(x) on 512) 507 a test function; a distribution 533, 534 Heaviside's step function 537 algebra with vector space A or H with linear maps X and M 554

C* (H, X, ju, V, S) = H (C, v, 5) GLq(2), SLq(2), Uq(sl(2))

dual algebra bialgebra (also dual space of U on 564) coalgebra Hopf algebras

H* S S* Kq [x, y], M(2)

bialgebra related to H antipode of H antipode of H* quantum plane, polynomial algebra K[a, b, c, d] = K{a, b, c, d}/(ad-bc) two-sided ideal; algebra K{a, b, c, d}/Jq quantum determinant of Mq(2) enveloping algebra

Jq; Mq(2) det? U(L) = T(L)/I(L)

556 558 555, 558 554, 559 562, 564 558 558 558 559 560 561 562


Chapter 10 [/(I) SU(2) (A, (/>)

field

AYMH(Sl =SiYl) AYM P2(C) IH S (1) H* su(2) H* 0{k) G/-,(P3(C)) Q4 QL CH Nk Mk{Mk{*)) Sm(z) A ^ F//v Tjjip, q) 8 CH (F) (CH(E)) Gt[A](x) Pn((C) Hq(t) \a\ = ax + «2 + ••" an dax =
abelian gauge group of Maxwell's theory isospin gauge group of Yang-Mills' theory configuration formed by Yang-Mills' potential A and Higgs' scalar field Yang-Mills-Higgs' action functional (on 598) Yang-Mills' action functional Projective plane collection of quaternions the multiplicative group of quaternions with unit norm group of all non-zero quaternions Lie algebra of SU(2) space of column vectors with k quaternion entries orthogonal group Grassmannian quadric line bundle Yang-Mills-Higgs' configuration space (see also 599) moduli space of gauge-equivalent monopoles of magnetic charge k Nk enlarged by a circle or phase factor (on 603) Scattering function where m ∈ Mk(*) gauge potential; gauge field amputated Green's function chiral Dirac operator Chern character of the bundle F (of the bundle E on 653) anomaly dependent on gauge field A Projective space Dolbeault Cohomology Groups length of an ordered n-tuple of non-negative integers derivatives with respect to x = (x1, ..., xn)

571 571 573 573 574 585 586 587 588 587 591 592 593 593 594 597 601 601 604 607 630 635 635 638 648 650 654 654

a

D x = (- ipd x A a = ajix) r'j,

(j x h) matrix-valued function on U c \R'" space of smooth sections of trivial vector bundle U x C*, U c R"1

654 654

P = X | a i < * a « D"

linear map from r * -> r'j, (elliptic operator on 655)

654

total symbol of P

654

k-symbol or the principal symbol of P

655

p = p(x, y)

=

a

a

llal
ak s <jk{P) (x, y) ~ Xi \-kaa(x)

Ψ(M) = ∪_d Ψ_d(M)

set of all pseudo-differential operators P on M

656

Ψ_{-∞}(M) = ∩_d Ψ_d(M) ΨDO (V, P)

set of infinitely smoothing operators P on M operator P in Ψ(M) or Ψ_{-∞}(M) complex formed by P, a graded ΨDO, and a graded vector bundle V j-th cohomology of (V, P) kernel map

656 656 656

scattering amplitude; Mandelstam variables; local operator on 696 Euler-beta function coupling constant and mass of a particle of spin J also known as oscillator coordinates tension (mass per unit length) of a string (see also 679 for T and 687 for mass m)

661

velocity of a transverse wave propagating in string

663

Hj(V, P) K(t, x, y)

656 657

Chapter 11 A(s, t); s, t; A(a, z) and A(0, f) - A{f) B(u, v) gj and Mj T(mlL) \ T ® = V—Tf

v

direction Regge trajectory or Regge function (slope on 665) Minkowskian metric of D-dimensional space-time (in light-cone coordinates on 682) Pft\u =o,i, —D- i (P1' P') momentum in space-time (momentum in light-cone coordinates of space-time on 701, 702) cif) (a') f]f,v(JJ++ etc)

rfv

=•

Lorentz invariant wave operator (See also 670)

661 661 663

664 665 666

666

O>JT dx

(T, O); hap (h±±, h+_)

world-sheet parameters; world-sheet metric (in lightcone coordinates on 716) XM= X** (a, f)\it = 0'\--D-i coordinates in Minkowskian space-time (see 681 and 683 for X"R and X^)

667

XM, Xffl (X*, Xr±) cr* (X1)

667 669

|A)(A(<7) = A) U

derivatives with respect to rand cr(see on 701) light-cone coordinates on worldsheet (light cone coordinates on space-time on 700) components of EM tensor Tagin light-cone coordinates vertex operator for emission or absorption of string state |A) (with momentum k = (k^)) (vertex operator on 675, 697; on 699) string state (Weyl scaling on 679) an operator (ghost number on 721, open set on 770)

aun, (5r^)

Fourier components of closed string solution X^R (XML)

T++, T__, T_+ WA(a, f) (VA(k)); (V(k); V(k, z); V£k, f))

667

669 673-4

673 683 683


PMa, J%v; (J+, Ja) /*"( JMV) v

v

v

currents corresponding to Poincare transformations of XM; (ghost-number current on 733, supercurrent on 754) total (angular) momentum of a closed string MV

f , E^ , E^

components of J

Lo, Lm, L m (m/an integer)

Fourier modes of EM tensor known as Virasoro operators or generators harmonic oscillator physical (spurious on 693) state; bosonic field on 729 conformal dimension (also spin on 711) Riemann zeta function constants of Lorentz transformation c-numbers DDF operators; longitudinal operators on 710 antighost (ghost) fields (Faddeev-Popov on 719) ghost action

am - (aMm) |0)> (If)); $ J £(s) aMv (a, f) {a\, a*) A,?1 A'n\i = l, 2 - D-2' A* b++, b__ (c + , O (SFP) Sg F(\f))'>\f) Km(K)

in terms of mode expansion

space of DDF states (a generic element of F); an element of F (i.e. a DDF state) on 708 operator k0- am where k0 is the light-like vector (Riemannian manifold on 772)

685 685 686 686-7 689 690 696 702 703 704 705 718 718 706 706

U}({//})

collection of the product of operators e.g. Lx_\ ••• hb;n (K?\ •••K?" )

707

S[h, x], S0[X] (SB) cm, bn Q{QA) y°A(cf); X°((f) Y"(a, 9) NS;R o W'JU) (W'"p(U)) H"\U) 5 (Sbh) R (T) M ls Mbh (Ms)

action (for a bosonic field on 729) Fourier modes of ghosts and antighosts BRST operator (supersymmetry generator on 754) time-like fermions; time-like bosons Superfield Neveu-Schwarz Sector; Ramond Sector Sobolev space (closure of Sobolev space) Hilbert space entropy (of a black hole) radius (temperature) of a ball of radiation Mass of a ball of radiation length scale of the string defined by its tension mass of a black hole (a string)

715 719 721 753 754 759 770 771 793 793 793 794 794

INDEX

*-algebra 155 1P irreducible (1PI) 515, 516 -generating functional 516 -2-point vertex function 515 -n-point vertex function 516 1-parameter group of transformation Ra 105 2-cocycle 203 3-brane 789 a-numbers 302, 304 Abelian differential 45-6 Abelian gauge theory 242 Acceleration 360-61 relative- 360 Achronal -boundary 419 -set 431 Action 341, 488, 495 effective- 491 Green-Schwarz- 783 Nambu and Goto- 667 O(k)- 592 SU(2)- 592 of a string- 663, 667 Yang-Mills-Higgs- 572, 597 adeaoadep 176 Adjoint -linear mapping 164 -operator ft ill -representation 167, 721 Affine -algebra 198 -connection 97 -frame 97 -line 565 -plane 565 -type 191 Algebra(s) 153 affine- 194, 198

anti-de Sitter- 297 associative- 153 commutative- 554, 559-60 commutator- 134, 275 complex Clifford- 289 derivation of- 199 de Sitter- 158, 292 dual- 556 enveloping- 561-62 gauge- 265 Grassmann- 290-91, 301 Hopf- 554, 558 Heisenberg- 198, 213 infinite-dimensional- 230 infinite-dimensional Lie- 207 internal symmetry- 292 Kac-Moody Lie- 195, 206 Kq[x, y] 559 morphism- 561 -of polynomials 218 Poincare- 292 quantum enveloping- 554 quaternion- 289, 291 quotient- 169, 175, 565, 567 spectrum generating- 114 super Lie- 353 super Lie conventional- 353 super-Poincare- 310 untwisted affine Kac-Moody- 195 Virasoro- 198, 202 Z2-graded- 307 Algebraic vector bundles 649 Algebraically irreducible 81 'Allowed' states 705 Amplitude(s) open-string scattering- 675 probability- 451 quantum- 451


scattering- 661, 674 transition/probability- 483 transition- 490 vacuum-to-vacuum- 501 Analytic mapping 38, 50 Angular -frequency 527 -momentum operator 156, 464 Anharmonic oscillator 497 Annihilation operators 219-20, 311 Anomalies 571, 662 Anomaly 571, 634, 720-21 global- 634 local- 634 -G,[A] 638-9 gravitational- 727 -in Lorentz algebra 704 q-number- 744 quantum- 691 Virasoro- 729, 731 Ansatz 614 An + l- 614, 616 Faddeev-Popov- 499, 501 Anti-de Sitter space 411-12 -time 407 universal- 411 Anti-ghosts 716 Antipode 558, 563-64 Anti-quark 782, 799 Anti-self-dual 586 -2 form 587 Approximate symmetry 234-35 Associated bundle 97 Asymptotic behaviour 673 Asymptotically predictable 440 strongly future- 440 Asymptotically simple 422 Atiyah-Singer operator 139 Automorphism 73 inner- 73 outer- 73 -of SO(2n) Auxiliary field 317 Auxiliary functions 102 Axial gauge 506 Axion 791 b-complete 43

b-completeness 429, 430 Banach space 4 Baryons 435, 664, 799 Basic -module 173 -representation 211, 217 -weight 173 Basis 296, 306 canonical- 296 Dirac canonical- 298 Majorana- 296, 298 Weyl- 296, 298 BCS-theory 618 BPS saturation 780 BPS states 779-60 BPS bound 780 Berezin integral rule(s) 343, 756 function 745-46 Bi-spinor Dirac equation 467 Bialgebra 558, 564 Bianchi identities 94, 271, 330, 336, 642-44 Bilinear forms 11 Birkhoff's theorem 422, 791 Black holes 366, 429, 432, 436, 440, 786-89 Body 301, 303 Bogomolny equations 597-98 Boltzmann's constant 480 Boolean algebra 14 Borel subalgebras 207 Bose 280 -dimensions 341 -elements 284 -fields 479 -linear transformation matrix 281 -matrices 284 -particles 479 -sector 283, 292 -vector 281, 285 Bosons 312 Bosonic field 729 Bosonic string theory physical states in- 725 Bosonic strings 678 Bosonization -of fermions 735 Bound state 524 Boyer and Lindquist coordinates 415 Branch number 39


Bras 521-22 Breit-Wigner functions 533 BRS transformations 641 BRST -cohomology class 725-26 -invariance 725 -invariant 722 -operators 715, 721-22 -program 723 -quantization 721 -transformation 724 Bundle(s) 86 algebraic vector- 649 cartesian- 86 complex vector space- 654 cotangent- 649 -completeness 429 first Chern class of a line- 650 Higgs'- 597 line- 600, 648 magnetic monopole- 579 -morphism 89 spinor- 586 tangent- 90, 648 vector- 90,648 c-numbers 302, 304, 487, 704 c-type supervector 352 C*-algebra 155 C™-manifold 9 CAMV 156 C*-compatible 9 C*-structure 9 C'-manifold 9 C-extension 373 C-locally inextendible 373 C-manifold 9 Calabi-Yau manifold 773, 788, 789 Canonical -center 195 -central element 195 -commutational relations 579 _form -generators 205-6 -momentum 241 -relation 549 Cartan -automorphism 564

-involution 193 -matrices 184 -subalgebra 170, 196, 199, 228 Cartan matrix 179-80, 183 generalized- 191, 195 Cartan's structural equation 94, 100 Casimir -element(s) 182, 200, 204 -operator 122, 144, 183, 204, 310 Category 23 Cauchy -inequality 28 -sequence 4 -surfaces 404, 409, 434 Cauchy development future- 396, 421 past- 421 Cauchy-Riemann equation 28, 54 Causal future 418 Causality condition strong- 419, 431 Causally simple 421 Cavendish constant 444 Cayley transform 31 Center 164-65, 175 Central -element 195, 204, 224 -extension(s) 169, 203 -charges 309 Centralizer 164 Chan-Paton factor 782 Characteristic -ideal 164 -polynomial 651 -ring 652 Charge -densities 251 -conjugation matrix 294-95, 309 "Charge" of the solution 600 Chart 8, 307 Chern class(es) 575, 600, 652 first- 268, 771, 773 Chern character 642, 653 -of the bundle 635 transgression of- 645 Chern form 643 total- 652 Chern number 574, 652, 720


Chevalley basis 203,205 Chevalley generators 193, 203 Chiral -Dirac operator 635 -gauge theory 636 -spin bundles 635 -invariance 272 -superfield 345 Chirality 775 Christoffel symbols 13, 67 Chronological future 418 Chronology condition 419,431 Chronology violating set 419 Circular measure 138 Classical field 510 Clifford algebra 157,284,286,288 -Cm4) 288 -C(p, q; K) 288 Closed -trapped surface 432, 444 -universe 443 Closed string -g-loop 740 Closure 2 Coalgebra 554-55 opposite- 556 quotient- 557 Coarse topology 306 Cohomology group 722,771 Color -gauge group 253 -indices 259 Commutative 165, 554 -algebra 558 -diagrams 566 Commutativity 563 co- 563 Commutator 323, 334 -algebra 275 Commuting operators complete set of- 526 Compact 2 -Lie group 76 -support 589 'Compensating' reparametrization 703 Complete -atlas 307 -orthonormal set of bases 521 -set of commuting observables (c.s.c.o.) 464

Completely reducible 81 Complex -analytic 27 -extension 160 -plane 27 -potential 28 -structure 32 Component field 313 Component multiplet 316 Comultiplication 555, 562, 661 Concept of derivation of a functional 545 Conditions Dirichlet- 781-83 Free boundary- 684 Mass-shell- 687, 692, 702 Neumann- 781-83 Virasoro- 691, 744 Configuration space 527 Conformal 38 -dimension 696, 713, 721, 737 -dimensions 65 -gauge 676 -group 63-5 -structure of infinity 407 -mappings 673 -symmetry 741 -tensor 407 -transformation 60 -weight(s) 65, 343 Conformal group 60 two-dimensional- 63 Conformally equivalent 47 Congruence 385 Conifold 791 -point 789 -transition 789 Conjugacy class 71 Conjugate -element 71 -points 397 -to q along y(s) 397 -to Oi along ?"(s) 399 Connected 2 -Green's function 515 -diagram 478, 509, 516 -simply 47 Connection 18, 92, 338 affine- 97

canonical flat- 104 -form 103, 330 Levi-Civita- 95 linear- 94 universal- 100 Conserved charge(s) 239-40 Conserved current(s) 240, 685 Conservation law 239, 374 Conservation of helicity 259 Continuity equation 368 Continuous group 74 Continuous symmetry 234-35 Contravariant functor 23 Convergence 5 -amongst distributions 536 -in the mean 5, 550 strong- 5 weak- 5 Convergence condition timelike- 396 null- 396 Convex 3 -hull 3 -normal coordinate neighbourhood 402 -normal neighbourhood 373, 400 Convolution map 558 Convolution product 538 Cooper pair(s) 619 electric charge of a- 619 Coordinate -neighbourhood 8 -time t 364 Coordinate-space wave function 524 Coordinates advanced null- 413 Boyer and Lindquist- 415 cylindrical- 117 Euler- 417 normal- 443 retarded null- 413 spherical polar- 117 Coordinatise 372 Coproduct 555 Coriolis force 367 Correlation functions 488 -based on external source J 487 time ordered- 489 generating functional for time ordered- 491

Correlation length 685 Cotangent bundle 649 Coulomb gauge 256, 258, 267 Coulomb phase 788 Coulomb potential 235 Counit 555, 560-61 -axioms 561-62 Covariant -calculation 729 -derivative 12, 329, 429, 433 -derivative operator 755 -differential 95 -functor 23 -quantization 688 -vector fields in superspace 314 Coxeter number 205 CP-violating gauge term 254 Creation operator(s) 219-20, 311 Critical point 268, 271 -of f 20 -of Yang-Mills action 633 Critical value of f 20 Cross-section 100 Current 66 conserved- 66-67 -densities 251 Curvature 18, 93, 330 infinite- 432 -2-form 99 -of spacetime 460 -singularity 430, 433 -two-form matrix 95 Damped harmonic oscillator 543 D-brane 781-83 DDF operators 704 DDF states 706 de Broglie equation 528 Decreasing sequence 175 Degrees of freedom spurious- 498 δ-derivative 341 δ-distribution 540, 542 Dense 2, 6 de Rham cohomology 22 de Rham complex 22, 571 Derivation 163, 165 inner- 166


kernel of a- 163 Derivative functional- 545 Gateaux- 545 Derivation of a functional 545 Derived ideal 164 Derived series 162 Derived subalgebra 193 Descent equations 644 de Sitter -algebra 158, 292 anti- 297 -space 408, 410 -spacetime 407 Deviation 388 Diagram(s) commutative- 566 connected- 478, 509, 516 irreducibility of- 509 particle- 671 string- 671 -technique 476, 478 Differential operator(s) 217-18, 227 -of the system 315 Diffraction 552 Dilation 66 Dilaton 712, 778, 790 -vertex operators 748 Dimensional regularization 629 Dirac('s) -algebra 157, 159 -conjugate 294 -δ-function 339, 340 -delta function 16, 116, 534 -delta distribution 536 -kets 469 -Hamiltonian 461 -magnetic monopole 579 -wavefunction 462 Dirac equation 461-62 -bi-spinor 467 -for a charged particle 468 -free 760 Dirac monopole -field 262, 580 -quantization condition 262 -solutions 575 Dirac operator 116, 139, 634

generalized- 290 Dirac strings 576, 579 Dirichlet -conditions 782-83 -membrane (D-brane) 783 Discrete 2 -symmetry 234-35 -topology 2,6 Distinguishing condition future- 420 past- 420 Distribution(s) 534, 539 convergence amongst- 536 derivative of- Sg 534 Dirac delta- 536 localized- 534 -of Riemannian sum 549 support of a- 535 tempered- 542 Divergence 538 Divergences 662, 769 infrared- 769 logarithmic- 769 quadratic- 769 ultraviolet- 671, 745, 769 Divergent integral(s) 517,746 logarithmically- 746 quadratically- 746 Dolbeault Cohomology Group 650,771 Domain 109 multiply connected- 30 simply connected- 30 Dominant 211 -energy condition 395,441 Dual algebra 555 Dualities 779, 787 Duality 564,779 Dynamical -equation 511 -law of an operator 532 -symmetry 234,237 Dynkin -diagram 173, 180, 184-85, 195, 266, 774-75 -indices 174 Dyons 784,791 f-symbols 350 Eddington-Finkelstein metric 414


Effective -action 491, 513-14 -action functional 513-14 -functional 513 Effective potential extremum of- 747 Eigenbra 525 Eigenfunction 119, 121 Eigenkets 455, 522 energy- 463 Eigenstates 150 Eigenvalues 119, 130, 522 g-fold degenerate- 522 m-fold degenerate- 121 Eigenvector 122 generalized- 122 Eightfold way 234 Einstein's coupling constant 422 Einstein's static universe 406 Einstein equations 378 Einstein indices 332 Electric charge 259 Electromagnetic field total energy of- 268 Electron(s) 258-59, 272, 435 non-relativistic- 434-35 Electroweak gauge group 252 Electroweak theory 258 group of- 259 Elementary solution 536 Elements of a diagram 530 Elliptic complex 656 Euler characteristic of a- 656 Elliptic curve 32 Endomorphism (END) 81, 645 Engel's theorem 169 Energy conditions 393-95 dominant- 395, 441 strong- 397 weak- 394-95, 397 Energy-momentum (EM) -conservation 682 -tensor 362, 376, 669, 681, 754 -of the matter 387 Energy operator 532 Entropy 786, 793 Bekenstein-Hawking- 787, 793 Enveloping algebra 562-63

quantum- 554 Equation(s) constraint- 668,755 continuity- 368 Dirac- 460-62 Einstein- 378 Euler- 368 Euler-Lagrange- 238, 270, 338, 375, 382-83, 496, 511, 547, 598, 681 Hamilton-Jacobi- 458 Heisenberg- 457 Jacobi(deviation)- 388 Klein-Gordon- 460,462 -of motion 348 -of motion for the state vector 452 Newtonian- 437 Poisson- 368 Quantum mechanical- 459 Raychaudhuri- 387, 391 Schrodinger- 457 Schwinger-Dyson- 512 Ergosphere 441 Eta tensors 582 Euclidean -distance 6 -rotation 63 Euler angles -group 581 Euler beta function 661 Euler class 718 Euler equation 368 Euler-Lagrange equation(s) 238, 270, 375, 382-83, 496, 511, 547, 598, 681 Even elements 307 Event 358-59 -horizon 439, 787 Exact -solution 404 -symmetry 234-35,238-39 Exceptional groups 776 Exclusion principle 434 quantum mechanical- 434 Expansion tensor 390 Expectation value 475, 523 vacuum- 510 Exponential mapping 353 Extended Shapiro-Virasoro model 712 Extension(s) 414-15


advanced Finkelstein- 414-15 Kruskal- 415 retarded Finkelstein- 414-15 Exterior -algebra 157 -covariant derivative 572 -derivative 329 -Schwarzschild solution 438 External source J 488 Extremal black holes 780,787,791 Faddeev-Popov -Ghosts 715 -Ansatz 499,501 Faithful representation 202 Fast decrease 541, 543, 553 Fermi 280 -coordinates 347 -derivative 388-89 -dimensions 341 -elements 284 -fields 480 -matrices 284 -transformation matrices 281 -propagated orthonormal basis 390 -sector 283, 292, 298 -vector 281, 285 Fermionic 312 -derivatives 297 -determinant 636 -generators 283 -integration 635 Fermion(s) 434,436 -equation of motion 757 left-handed- 253-54 right-handed- 253-54 time-like- 753 Feynman -functional integral 472 -graphs 507, 636 -graphs with vertex-functions 512 -Green function 475 -parameter formula 629, 646 -parameter formulae 645 -propagator(s) 507-8 Feynman diagrams -in the momentum space 521 Fiber

-at x 87 -bundle 87 typical- 87, 89 Field(s) 14 Bose- 479 Fermi- 480 gauge- 608 Field configuration 573 static- 573 Field equations 317,380 Field theory -in Pi-formalism 495 -with infinite degrees of freedom 495 Fierz relation 755 Finite type 191 First Chern class 268, 761 First fundamental tensor 399 Fitting null component 170 Flop transition 791 Flow lines 385-86 Flux 578 Fock space 213,219,221,709,726 boson string- 708 double- 221 Form 39 closed- 42 co-closed- 42, 52, 54 exact- 42 harmonic- 42 holomorphic- 42 measurable- 43 Formal adjoint 99 Formally self-adjoint operator 139 Fourier('s) -coefficients 539, 540, 550 -expansion 220 -series 539, 540, 550 -transform 541, 543, 548 -inversion theorem 541 Frame free-float- 358-59 inertial- 358-59 Lorentz reference- 358 Frame of reference center of mass- 369 Free -boundary conditions 684 -Dirac particle 465


-Klein-Gordon particle 461 Friedmann space 412 Fuchsian group 46 Full symmetry group 135 Function 2-point vertex- 515 Breit-Wigner- 533 entire- 27 Green's- 536-38 Heaviside's step- 536 periodic- 539-40, 550 proper (1PI) n-point vertex- 16 source- 537 square integrable- 43 stream- 28 test- 533, 540 Functional -derivative 545 derivation of a- 545 generating- 492, 495, 510 vacuum- 489 Functional integral Feynman- 472 -of a scalar field 473 Fundamental -form 36 -group 47 Future asymptotically predictable 440 strongly- 440 Future -event horizon 446 -horismos 419 -horizon 421 g-completeness 429 G-equivariant 263 Galilean relativity theory 360 γ-matrices 272 Gamma-matrices 298, 300, 775 canonical basis of- 297 Weyl basis of- 297 Majorana basis of- 297 Gateaux derivative 545 Gauge -algebra 265 axial- 506 asymptotic- 590 -boson 259

-connection 261 -connection form 261 Coulomb- 258 -covariant derivative 243, 460 -equivalent 266 -field(s) 242, 250, 262, 608 -group 242, 261, 571, 579 -invariant 320 -invariance 327 -potential(s) 250, 261, 268, 579, 587, 608 pure- 576 singular- 590 Gauge-fixing -condition 499 -determinants 716 Gauge-invariant Lagrangian 245-46 Gauge symmetry 234-35 non-abelian- 244 Gauge theory chiral- 636 volume factoring in- 499 Gauge transformation(s) 269, 319, 579, 589 chiral- 636 infinite-dimensional group of- 586 supersymmetric- 320 Gaussian 541, 551 -curvature 19 -formula 485 -function 548 -integral(s) 676 -model 83 Gell-Mann's λ-matrices 145 Gell-Mann matrices 260 General relativity limit 437 Generalized -affine parameter 430 -Coulomb gauge 267 -Maxwell's Field 268 Generating functional 492, 495, 510, 638 Generators 193, 196 Generators of SU(2) 259 infinitesimal- 275 Geodesic 14 -completeness 429 Geodesically complete 443 null- 430-31 timelike- 430 Geodesically incomplete 430, 432



Geometrical symmetry 234, 237 Germs 6 'Ghost conjugation' symmetry 734 Ghost-free spectrum 691 Ghost number 721, 724, 737 -current 724, 733 Ghosts 662, 706, 716 -in Bosonic Theory 732 Faddeev-Popov- 715 Glashow-Salam-Weinberg theory 571 Global gauge 261 Global symmetry 234-35, 240 Globally hyperbolic 421 Gluons 253 Goto 667 Graded r - 280 Z2- 280 -Jacobi-identity 280 -skew-symmetry 280 Grassmann -algebra 290-91, 301 -degrees of freedom 348 -variant of Dirac δ-function 340 Grassmannian 593 Gravitational -anomalies 644 -anomaly 727 -force 361 -mass 361 -potentials 264 -radiation 366 -waves 366 Gravitational field -intensity 361 weak- 378 Graviton 674, 748, 790, 798 Gravitinos 778 Gravity 371 Green's function 509, 536-38 connected- 515 -for the divergence 538 -Tpip, q) 630 -of φ⁴ theory 498 three-point- 630 -with a connected diagram 509 Gribov ambiguity 267 Group(s)

abelian- 70 cyclic- 70 linear algebraic- 77 quotient- 72 rank of a- 83 representation of a- 80 semi-simple- 82 simple- 82 simply laced- 774, 776 special unitary- 75-6 unitary- 76 Haar measure 15, 156 Hadrons 661, 765 Hadronic strings 661 Hadronic scattering 662 Hamiltonian 65, 220-21, 451, 454, 479-80, 736 -for strings 686 -in a relativistic field 459 -function 527, 530 -operator 112, 530-31 radial- 455 string- 696 Hamilton-Jacobi equation 458 Harmonic(s) 28, 52, 539-40 complex- 540 -conjugate 28 -differential 44, 49 -forms 36-7, 53, 55, 136 -function 52 -oscillator 501, 689 Hausdorff 2 -topological space 8 Hausdorff's formula 313, 321 𝔥-diagonalizable module 210 Heat equation 656 Heaviside's step function 536 Heisenberg -algebra 198-99, 213, 223 -Lie algebra 213 -equation 457 -system 213-14, 221, 227 -uncertainty principle 449, 523, 551 Hermitian -adjoint operator 219 -generators 329, 331 -operator 112, 120, 122, 125, 130, 503, 521 -matrix 36


-metric 36 Hessian of/ 20 Heterotic string theories 777-78 Hidden symmetry 234, 237 Higgs' 571 -bundle 597 -field 571, 578 -mechanism 571 -particle 576 -scalar field 572 -field self-interaction 598 -solution 574 Highest weight 173 -module 210 -vector 210 Hilbert space 4,116,450,771 physical- 709 Hodge star operator 18 Holomorphic 27 -differentials 53 -mapping 50 -p-form 35 -(p,q)-form 34 -universal covering space 47 -vector field 34 Holonomy 772 -group 772 -matrices 772 SU(M)- 773 Homeomorphic mapping 2 Homeomorphism(s) 89 one parameter family of- 453 Homogeneous Lorentz group 311 Homomorphism 72, 164 Homotopy 24 -type 25 Hooke's law 663 Hopf -algebras 553, 557 -fibering 580-81 -fibration 101 Horizontal 103 -form 98 -vector 93 -vector field 103 Hyper-Kahler 597-771 Ideal

162, 175 left- 157 right- 157 Identity component 74 Imprisonment 420 Increasing derived sequence 162 Indefinite type 191 Index 20 -set 1 Atiyah-Singer- 139 Induced metric tensor 399 Inertial force 368 Inertial system 364 Infinite orbit volume 501 Infinite-dimensional -algebra 230 -group 585 -Lie algebra 207 Infinitesimal -generators 275, 279, 314 -transformation 336 Infinities 517 Infrared -cutoff 677, 733, 769 -divergences 769 -radiations 770 Inner product 99, 521 Instanton(s) 573 basic- 588 -bundle 580, 594 k- 592 multi- 590 Instanton solution(s) 574 basic- 694 single- 576, 581 Integrable functions 126 Integral operator 116 Integration of a /--form 40 Interacting strings 671 Interior 2 Internal symmetry algebra 292 Intertwining -number 80 -operators 80 Intransitive operator 135 Invariance Poincare- 680 reparametrization- 735 Inverse transform 543


Irreducible representations 145, 147 ^•dimensional- 147 Irreducibility 509 Isentropic 387 -equation of state 368 Isolated -points 29 -singularities 544 Isometries 137 Isomorphic 176 Isomorphism 72 Isospin -group 245 -symmetry 241 Jacobi -(deviation) equation 388 -field(s) 14, 397 -identity 166 Jordan algebra 153

K3 surfaces 771-72 Kac-Moody algebra 193-94 Kac-Moody Lie algebra 206 Kaluza-Klein -theory 236, 662, 785 -momentum excitations 785 Kahler -manifold 36-7 -metric 36 Kepler's laws 357, 367 Kernel(s) 116,163 -map 656 -of a derivation 162 Ket 521 normalized- 521 Killing -form 167-68,170-71,174,181 -vector field 91, 93, 104, 377, 381 Kinked worldline 363 Klein-Gordon equation(s) 457, 459, 460, 462 free- 760 Kleinian group 46 Kruskal extension 414 Lagrange density 238, 240 Lagrangian 317, 382-83 bare- 629

-gauge invariant 330 -renormalized 629 -supersymmetric 331 Laplace -equation 28 -operator 52, 117, 15 Laplace-Beltrami operator 137, 140 Laplacian LK,, 138 Lattice 1, 153, 161, 193, 213 Euclidean- 153 even- 214 even self-dual- 772 integral- 153 Lorentzian- 153 unimodular- 153 Laurent polynomials 198 algebra of- 198 derivation of the algebra of- 199 Laurent series 29 infinite- 208 Left -derivatives 297 -supertranslations 346 -translation 91 Left and right -invariant vector fields 353 -invariant local frame fields 353 -draggings 353 Leibnitz rule 95, 755 Leptons 259 Levi subalgebra 163 Levi-Civita connection 95,262 Levi-Civita symbol 78 Lexicographic ordering 172-73 Lie algebra(s) 156, 192, 289 affine- 193 -automorphism 212 rank of a- 170 representation of a- 167 solvable- 156, 162 Lie group 76 Lie product 175 Lie superalgebra 279-80 Light-cone -coordinates 61 -coordinates in space-time 700 -formalism 61 -gauge choice 701


Light travel time 259 Line bundle(s) 648 complex- 600 Linear algebraic group 77 Linear mapping 1 Linear differential operator propagator of- 536 elementary solution of- 536 Linearized supergravity 310 Local -causality 373 -causality postulate 404 -coordinate patch 307 -gauge 261 -gauge potential 261 -phase 251 -symmetry 234-35 Localized distribution 534 Logarithmic 769 -divergences 769 Loop algebra 198 Loop group 197 Lorentz -algebra 308 -covariant 338 -gauge 258 -generators 338 -group 61,78,308,311 -indices 332 -interval 363 -invariant 297 -quantum numbers 674 -spinor 313 -translation 311 Lorentz group 61,78,308 inhomogeneous- 78 -transformation 67, 69 -symmetry 234-35 Lorentz transformation 311 infinitesimal- 703 Lower central series 164 m-sheeted (ramified) cover 39 m-complete 429,431,618 Macrophysics 429 Macroscopic level 618 Magic formula 537 Magnetic field 577,618

Magnetic flux 577 "Magnetic expulsion" 618 Magnetic monopole 579, 787 Majorana -basis 295, 297 -condition 295 -conjugate 294 -fermion 752 -representation 298 -spinor 295, 752 -Weyl fermionic fields 778 -Weyl spinor 290 Mandelstam variables 661,766 Manifold 8 Calabi-Yau- 773 mirror- 788, 790 n-dimensional topological- 8 Riemannian- 8 Mass of a cold star 438 Mass-shell conditions 678, 687, 692, 702 Matrix element of evolution operator 481-82 Matrix representation 111,168 Matter 371 -fields 373 Maximal 402 -commutative diagonalizable subalgebra 200 -dimension 20 -ideal 192 Maximally stretched state 151 Maxwell('s) -connection 269 -equations 251 -field 269 -theory 256,579 Mayer-Vietoris sequence 24 Measurable quantities 121 Meissner effect 577, 620 Membrane 857 Meromorphic -abelian differential 56 -r-differential 45 -function 38 Meson(s) 664, 799 Metric 11,372,377 -space 3 static- 377 Ricci-flat- 771 Riemannian- 11,


Metrizable 3 Metrically complete 429 Microscopic 787 -level 618 Minkowskian metric 350 Möbius group 65 Möbius transformation 32, 48, 56 elliptic- 48, 49 hyperbolic- 48, 49 infinitesimal- 65 loxodromic- 48 parabolic- 56 Mode expansions 714, 761 -of ghosts 719 Modes left-moving- 681 right-moving- 681 Modules 168 representations via- 168 -product rule 168 Moduli space of gauge potentials 266 Modulus of elasticity 663 Momentum -operator 156, 523 relativistic- 459 total- 65, 685 Momentum-space wavefunction 463 Monodromy 788, 791 Monopole(s) magnetic- 787 the solution space of- 602 -solution n=1 611 -solution n=2 611 -solution of topological charge n 612 -solutions of YMH model 607 Morphism(s) algebra- 561 -of algebras 560, 566 -of coalgebras 566 zero- 562 Morse function 20 Multiplicative group of units 287 Multiplicity -of an eigenvalue 119 -of a root 201 -of the weight 210 Multivortex 618

Nambu and Goto 667 Neighbourhood 2 Neumann conditions 781-83 Neutrino 258 Neutron star 436-37 Neutrons 253 Newton's first law of motion 357 Newton's second law of motion 357 Newton's law of inertia 357 Newtonian equation 437 Newtonian gravitational constant 377 Newtonian gravity 370 N-extended body problem in- 370 Nilpotency 175 Nilpotent 169, 175-76 -endomorphisms 169 -operator 169 -radical 175 Noether current 244 Non-abelian gauge symmetry 244 Nonchiral 777 Non-commutative 165 Non-compact 2, 5 Nonlinear sigma model 751 Nordstrom theory 379 Norm 4,521 L2- 606 L 6 - 606 Normal coordinates 443 Normal form 604 Normal ordering 698 -constant 702 Normal subgroup 72 Normalization 57 Normalized -mass 520 -coupling constant 520 -ket 521 Nuclear extension 16 Nucleons 439 Null cones 365 past- 445 Null coordinate system 405 Null coordinates advanced- 413 retarded- 413 Null infinity

future- 407 past- 407 'Null' physical states 694 Nullity 20 0 (^-action 592 O(4)-symmetry 273 Observer 358-59 Observable 521-22 Off mass shell 696 One-parameter family of homeomorphisms 453 On mass shell 696 Open parametric disc 50 Open string 675 Open universe 443 Operator(s) 109-111 adjoint- 127 angular momentum- 157, 464 bounded linear- 124 BRST- 715, 721 closed- 127 compact- 135 complete- 112 conjugacy- 42 continuous- 110 DDF- 704 densely defined- 127 Dirac chiral- 634 Dolbeault- 774 domain of- 109 dynamical law of an- 531 eigenfunctions of an- 119 eigenvalues of an- 119 elliptic- 655 energy- 532 evolution- 529 Hamiltonian- 115, 530-31 heat- 657 Hermitian- 112, 120, 503, 521 Hodge star- 42 incomplete- 112 invariant under the action of- 110 inverse of an- 111 kernel of an- 113 Laplace- 116, 135 Laplace-Beltrami- 137, 140 matrix element of evolution- 481-82 minimal- 139

momentum- 157, 523 position- 523 pseudo-differential- 655 quantum field- 674 radial momentum- 455 range of- 109 shift- 114 spin- 464 superspace covariant derivative- 755 super-Virasoro- 759-60 tachyon- 674 time evolution- 457, 529 unitary- 111,118,123,125,132 vertex- 218, 222, 228, 231, 673-75, 696-99 Virasoro- 230, 687 Operator product time ordered- 487 Opposite -algebra 555 -coalgebra 556 Orbifolds 771-72 Orbital angular momentum operator 454 Order parameter 618 Oriented manifold 51 Orthonormal basis (ONB) 4 Fermi-propagated- 390 pseudo- 391 Orthonormal system (ONS) 4 Ortho-symplectic superalgebra 282-83 Oscillator anharmonic- 497 harmonic- 501,689 Outer product 521 p-brane(s) 781 D- 783 bosonic- 783 fundamental- 784 solitonic- 784 super- 783 p-chains 40 p-form 328 Parallel along 13 Parallel translate 90,92 Parity 234 Parseval's theorem 542 Particle Bose- 479


-electric charge 577 free Dirac- 464 free Klein-Gordon- 464 -horizon 445 Particle field theory superpoint- 346 Partition function 211 Path integral 715 gauge-fixed- 716 Path integral (PI) formalism 495 field theory with- 494-98 Pauli algebra 157, 159 Pauli matrices 144, 150, 296, 350 Penetration depth 621 Penrose diagram 410 -of de Sitter spacetime 410 -of steady state universe 410 Perfect fluid 378, 386 -model 369 Perturbation series 476 Periodic function 539-40, 550 φ⁴ theory 496, 498 Phase -space 452, 527 -transformation(s) 577-78 Photon 244, 272, 788 Physical states 693, 714 null- 694 spectrum of- 710 Physical systems invariance properties of- 234 Pin group 287 Planck length 785 Planck's constant 449 Plücker coordinates 593 Plücker embedding 593 Poincaré -algebra 158, 292 -group 66, 78, 309 -symmetry 234-35 -invariance 680 -lemma 23 -invariant 857 -transformation(s) 666, 669 Point-spectrum 119 Poisson equation 368 Poisson bracket 453 Pole 30

Positive -definite Hermitian form 5 -measure m 15 -set function 14 -root 172 Potential function Ginzburg-Landau- 619 Pre-Hilbert space 4 Prediagrams 477 Primary field 317 Principal -bundle 98, 268 -fiber bundle 90, 101 -leading symbol 654 Principle of analytic continuation 50 Principle of Relativity 365 Probability -amplitude 451 -distribution 452 Product(s) 99 -group 73 inner (scalar)- 521 -of groups 72 outer- 521 Projectable -diffeomorphism 263 -vector fields 263 Projection operator(s) 124, 126, 130, 290, 327 completeness property of- 129 Projective -bundles 586 -plane 585 Projective space complex- 50, 57, 648 Projectively Hausdorff 307 Propagator -of field φ 730 Propagator 475, 536 Feynman- 507 first order quantum correction to- 506 Proper -self energy diagram 515 -(1PI) n-point vertex function 516 -time τ 364, 370, 670 Proton(s) 253, 799 Pseudo-Majorana 290 Pseudo-tensorial 98 Pure

-gauge 576 -Yang-Mills equation 271

q-form 328 q-number anomaly 744 Quadratic 769 -divergences 769 Quadric 595 Quantization 449 Quaternion anti-involution property of- 587 g-valued- 602 Quantum -amplitude 451 -correction to propagator 506 -corrections 514,780 -chromodynamics 253 -determinant 560,561 -fluctuation 747

-level spin two- 711 -mechanical equation 459 -plane relation 559-60 -states 666, 792-93 Quantum field 745 -operator 675 Quantum theory dynamical equation of- 511 Quark(s) 253, 636, 782, 799 Quotient -coalgebra 557 -representation 81 Radial Hamiltonian 455 Radial momentum operator 455 Radiation fields 256, 258 Radiation gauge 258 Radical 163, 175 Ramification number 39 Raychaudhuri equation 387, 391, 413 Real spinor bundle 288 Realization 191 -of a matrix 191 Reduces 125 Reduction bundle 94 Regge slope 664, 683, 713 Regge trajectory 664 leading- 664

Relativistic 434 -correction 367 -momentum 459 non- 434 Relativity theory 356 general- 356 special- 356, 360 Renormalization(s) 506, 517, 694-95 Reparametrization 668 compensating- 703 Repeated focusing 432 Representation(s) 80-82, 177, 191, 580 canonical- 215-16 degree (dimension) of- 81 local- 264 quotient-space- 81 projective- 215 semi-simple (completely reducible)- 178 simple (irreducible)- 178 spinor- 776 SU(2)- 580 tensor product of- 81 unitary- 82 Representation(s) of a Lie algebra 167, 177 Residue -of η 45 -theorem 53 Resolvent 119

Resonance 661 Regge's- 661, 663 Response 50 Ricci flat metric 771 Ricci tensor 36, 585, 746 Riemann surface 37, 57 conformally equivalent- 47 holomorphic mapping on- 38 simply connected- 47 Riemann zeta function 221, 702 Riemannian -curvature tensor 14, 585 -manifold 11 -measure 137, 140 -metric 11 Riemannian sum distribution of- 549


Right-shift operator 114 Right supertranslation 347 "Ringing of energy" 663 Robertson-Walker space 412 Root(s) 171 -decomposition 200 -diagram 186 height of a- 180 highest- 201 -ladder 186 -lattice(s) 192, 203 -system basis 179 -system(s) 178, 181, 200, 229 -vector 171 Rotation group 273 S dual 779 SU(2) action 592 Scalar(s) 669 -multiplet 317 -subalgebra 199 -superfield 327 Scattering amplitude 674 Schrödinger -operator 115 -equation of motion 457 -equations 669 Schur's lemma 82 Schwartz space 654 Schwarzschild -mass 438 -radius 433, 438 -solution 413, 421-22, 428 Schwinger term 644 Schwinger-Dyson equations 512 Second quantization 238, 785-86 Self-dual 585-86 anti- 585-86 -connection 586 -spinors 586 -2 form 582, 587 Self-dual bundles 584 anti- 584 Self-duality 571 Semi-direct product 200 Semi-norm 770 Semi-simple 193 -Lie algebra(s) 155, 163

Separable 2 Sesqui-linear 550 Shear tensor 391 Shuffle (p,n-p)- 566 Sifting property 548-49 Simple -Lie algebra(s) 163 -lowering elements 198 -raising elements 198 -root(s) 172, 184, 186 Simply connected 47 Simultaneity 358 Simultaneous eigenvector 170 Singular points 431 Singularities 366, 410, 412, 429, 545 coordinate- 413 isolated- 544 short distance- 700 Singularity 45,429,439-40 curvature 430 essential- 31 irremovable- 415 point of- 27 -free 430 -theorems 397, 431 Sinusoidal wave 527 Slow increase 542 Sobolev space 770-71 Soliton(s) 573, 784 Solitonic 784 Solution exact- 403 exterior Schwarzschild- 438 external Schwarzschild- 423 Godel- 416 Kerr- 415,421,434,441 Reissner-Nordstrom- 413-15, 421 Robertson-Walker- 412 Schwarzschild- 413, 421-22, 428 -space of monopoles 601 Solvable 169 -algebra 175 Soul 301, 303 Source free equations 257 Source function 537 Space Bose-Fock- 786


configuration- 527 -divergence 340 Friedmann- 412 homogeneous- 136 Lp- 1 h~ 8 linear- Misner- 417 Narain- 772 state- 527 phase- 527 Robertson-Walker- 412 Schwartz- 654 Taub- 416 Taub-NUT- 418 universal covering- 411 orientable- 417 Space of phase factors 262 Space of gauge potentials 264-65 Spacetime interval 363 invariance of- 363 Spatially -closed 444 -flat 444 -homogeneous 443 -open 444 Special unitary groups 75, 143 Spectral -curve 604 -family 130 -line 604 -measure 131 Spectral decomposition 129, 132, 522 Spectrum 119, 132, 522 band- 119 continuous- 132 ghost-free- 691 point- 119 Spherical symmetry 422 Spherically symmetric potential 454 Spin 285 -group 287 -operator 464 -structure 288 Spin-statistics theorem 752 Spinor 285-86 2-component- 295, 301 4-component- 295, 301

complex- 296 Dirac- 289, 295 eigen- 286 Lorentz- 312 Majorana- 289-91 Majorana-Weyl- 290 orthogonal- 286 pseudo-Majorana- 289 Weyl- 290 Spinorial indices 328 Split 179 Spurious state 693, 714 Square integrable 43, 544, 548 Stable 1 Standard -bases 351 -Gaussian measure 16 -model 253 State 'allowed'- 706 bound- 524 ground- 677 'null' physical- 694 physical- 693 -space 521, 527 spurious- 693, 714 vacuum- 576 zero-norm- 694 Static field configuration 573 Stationary regular predictable space 441 Steady state model 409 Stereographic projection 50, 59, 274, 596 Stream function 28 Streamline 28 Stretched horizon 787 String(s) cosmic- 780 -diagram 671-72 -Hamiltonian 696 -singularities 579 -tension 663, 785 -wave function 782 String theory quantization in- 688 Type I- 777-78 Type IIA- 777-78 Type IIB- 777 GS- 777


NS-R- 778 Strong energy conditions 397 Structure equations 17 su(2)-valued self-dual 2-form 587 Subalgebra 164, 183 Cartan- 196, 199, 228 derived- 193 maximal commutative diagonalizable- 200 scalar- 199 Subcovering 2 Subgroup 70 one-parameter- 91 open- 77 topological- 74 Subrepresentation 81 Subsuperalgebra 284 Super Jacobi identity 280 Super p-branes 783 Super Poincaré algebra 285, 311 Super-Virasoro -operators 759, 761 -constraints 759 Superanalytic 303-4 -functions 303 Superanalyticity 303 Superantisymmetry condition 282 Supercharges 307, 312 Superconformal transformation 342 Superconductors 577, 618 Superconductivity 574, 577, 618 Supercurrent 754, 758 Superfield(s) 313, 317 chiral or antichiral- 319 chiral (antichiral)- 318, 327, 346 chiral or scalar- 318 scalar- 327, 330 vector- 318, 319, 331 Supergauge transformations 332 Supergroups 301 Super Lie groups 352 conventional- 353 Supermanifolds 301, 306, 310 Supermembrane 783 Supernumbers 301 complex- 304 real- 304 Superposition of harmonics 539 Superpotential 330, 345

Superspace 310, 328 covariant derivative operator in- 755 differential forms in- 328 exterior derivative in- 328 exterior product in- 328 integration on- 339 Supersymmetric -extension of LQED 331 -Lagrangian 331 -renormalizable Lagrangian 344 Supersymmetry 301,313 -algebra 313 -derivative 342 -generator(s) 346, 754 -transformation(s) 347, 753 Supertrace 282 Supertranslation 311 Supervector 304 a-type- 305 c-type- 305, 352 pure- 305 -space 304, 352 zero- 305 Supervielbeins 342, 783 Sweedler's Sigma Notation 557 Symbol total- 654 principal- 655 Symmetric operator 127 Symmetric power 5 rM 15 Symmetry 234 approximate- 234-35 -breaking 236 -breaking phenomena 255,634 continuous- 234,236 derived- 242 dynamical- 234, 237 discrete- 234, 236 exact- 234-35, 238-39 fundamental- 234 gauge- 234-35 geometrical- 234, 237 global- 234-35, 240 hidden- 234, 237 isospin- 241 local- 234-35 Lorentz- 234, 236 O(4)- 273


scaling- 242 Poincaré- 234, 236 spontaneous breakdown of- 510 spontaneously broken- 236 -transformation(s) 181, 183, 238 -violations 662 Symmetry group 71, 239-40 generators of- 239 Synchronized clocks 358 't Hooft matrices 582 T dual 779 T-duality 784 Tangent bundle 90, 648 Tachyon(s) 662, 699 -free 662 -operator 674 Tadpoles 748 Taub-NUT space 416-17 Taylor series 302, 325, 546 Tempered 544 -distribution 542 Tensorial 98 Tensor(s) 10 covariant- 11, 68 contravariant- 11, 68 field strength- 254 Riemannian curvature- 14 Test -function 533, 541 -particle 359 Theorem Coleman-Mandula no-go- 308 Fourier inversion- 541 Hawking and Penrose- 431 Hawking's- 431 Parseval's- 542 Penrose's- 431 Wick's- 475 Tidal force 388 Time evolution operator 457, 529 Time-like infinity future- 407 past- 407 Time ordered -correlation function 489 -operator product 486

Time-orientable 418 Topological group 3, 73, 74 local- 74 Topology 1, 7 relative- 78 subspace- 7 Torsion 338 -form 94 Total angular momentum 685 Total dimension 306 Transformation(s) BRS- 641 BRST- 724 gauge- 579, 585 phase- 577-78 Poincaré- 666, 669 Lorentz- 666, 703 Transgression of Chern character 645 Transition function 88 Transitive operator 135 Translations 74 left- 74 right- 74, 91 Transverse field 258 Trapped region 444 Triality 775 Triangular decomposition 194 Twist angle 740 Type I string theory 777-78 Type IIA string theory 777-78 Type IIB string theory 777 U dual 779 U(N) symmetry 782 Ultraviolet 629 -divergent integrals 629 -divergences 629, 671, 745, 769 -radiations 770 -regulator 769 Uncertainty principle 449, 523, 552 Unipotent 77 Unitary -module 164 -operator(s) 111, 118, 123, 125, 132 -space 5 -symmetries 235 Universal covering 46


Universal enveloping algebra 177 Universe closed- 443 high energy- 787 open- 443 Untwisted affine Kac-Moody algebra 195 Upper central series 165

Vacuum expectation value (VEV) 510, 779 Vacuum functional 489 Vacuum-to-vacuum amplitude 501 Vacuum vector 221 -of Exp # 16 Variation -of a timelike curve 401 -of the field 374 -two-parameter 401 Variational -equations 573, 577 -parameter 270 Vector -current Ward identity 630 -fields 10 -meson 699 -meson state 236 Vector bundle 90, 100-1, 648 fundamental- 105 horizontal- 104 Riemannian- 101 standard horizontal- 105 vertical- 104 Vertex operator 218, 222, 228, 231, 673-75, 696-99 dilaton- 748 graviton- 748 -of on-shell physical state 748 Vertical vectors 93, 103 Vielbein -fields 334, 337 -forms 334, 337 -in polar coordinates 17 Virasoro -algebra 202, 688, 720 -anomaly 729, 731 -constraint equations 701 -condition(s) 691, 744 -generator(s) 689, 720, 732 -operator(s) 230, 687 Volume elements 339

Volume form 269 Volume factoring in gauge theory 499 Vortex number 578 Vortex solutions 577 N- 625 N-anti- 625 Vortices 571,574,618 Vorticity tensor 390 Ward identity(ies) 629 axial-vector- 632 vector- 632 Wave function 79 coordinate space- 524 momentum space- 463 Wave operator 115,666 Weak -energy conditions 394-95, 397 -gravitational field 378 -hypercharge 260 -hypercharge gauge group 252-53 -intermediate vector bosons 252 -isospin gauge group 253 Weakly convergent sequence 533, 538 Weight 170 -diagrams 187-88 -ladder module 172 -submodule 170, 172 -system 170, 211 -vector 170 Weyl's -invariance 747, 751 -invariant 751 -scaling 679, 682 -symmetry 681 -ordering 484 -group 179, 206, 212 -spinors 290, 295 -tensor 585 Wess-Zumino (WZ) 313,320 -gauge 320-21 Wick's theorem 475 Wigner-mode 255 Winding number 578 -of/? 624 White dwarf 432,434, 436 World-sheet 663, 667 area of- 681


Yang-Mills -action 270 -connection 271 -field 271 -field equations 252 -potential 110, 631 Yang-Mills' theory 571 Yang-Mills-Higgs -action 572, 597

-action density 600 -configuration space 597 -model in iR""' 600 -variational equations 573 Z2-graded algebra 307 Zero morphism 562 Zero-modes 605, 774 Zero-norm state 694

