Physics Reports 369 (2002) 1 – 109 www.elsevier.com/locate/physrep
Weak decay of Λ-hypernuclei

W.M. Alberico (a,*), G. Garbarino (b)

(a) INFN, Sezione di Torino and Dipartimento di Fisica Teorica, Università di Torino, Via P. Giuria 1, I-10125 Torino, Italy
(b) Departament d'Estructura i Constituents de la Matèria, Universitat de Barcelona, E-08028 Barcelona, Spain

Received 1 March 2002; editor: W. Weise
Abstract

In this review we discuss the present status of strange nuclear physics, with special attention to the weak decay of Λ-hypernuclei. The models proposed for the evaluation of the decay widths are summarized and their results are compared with the data. The rates $\Gamma_{NM} = \Gamma_n + \Gamma_p\,(+\Gamma_2)$, $\Gamma_{\pi^0}$ and $\Gamma_{\pi^-}$ are well explained by several calculations. Despite the intensive investigations of recent years, the main open problem remains a sound theoretical interpretation of the large experimental values of the ratio $\Gamma_n/\Gamma_p$. However, the large uncertainties involved in the experimental determination of the ratio do not allow one to reach any definitive conclusion. The $\Gamma_n/\Gamma_p$ puzzle is strongly related to the so-called $\Delta I = 1/2$ rule on the isospin change in the non-mesonic decay, whose possible violation cannot be established at present, again due to the insufficient precision of the data. Although recent works offer a step forward in the solution of the puzzle, further efforts (especially on the experimental side) must be invested in order to understand the detailed dynamics of the non-mesonic decay. Even if, by means of single nucleon spectra measurements, the error bars on $\Gamma_n/\Gamma_p$ have been considerably reduced very recently at KEK (however, with central data compatible with older experiments), a clean extraction of $\Gamma_n/\Gamma_p$ is needed. What is missing at present, but planned for the near future, are measurements of (1) nucleon energy spectra in double coincidence and (2) nucleon angular correlations: such observations allow one to disentangle the nucleons produced in one- and two-body induced decays and lead to a direct determination of $\Gamma_n/\Gamma_p$. Notably, the two-body component of the non-mesonic decay rates has not been measured yet, due to the too low counting rates expected for a coincidence experiment. For the asymmetric non-mesonic decay of polarized hypernuclei the situation is even more puzzling. Indeed, strong inconsistencies appear already among data. A recent experiment obtained a positive intrinsic asymmetry parameter, $a_\Lambda$, for $^{5}_{\Lambda}\mathrm{He}$. This is in complete disagreement with a previous measurement, which obtained a large and negative $a_\Lambda$ for p-shell hypernuclei, and with theory, which predicts a negative value moderately dependent on nuclear structure effects. Also in this case, improved experiments establishing with certainty the sign and magnitude of $a_\Lambda$ for s- and p-shell hypernuclei will provide guidance for a deeper understanding of hypernuclear dynamics and decay mechanisms.

© 2002 Elsevier Science B.V. All rights reserved.
0370-1573/02/$ - see front matter. PII: S0370-1573(02)00199-0

* Corresponding author.

PACS: 21.80.+a; 13.75.Ev; 25.40.−h

Keywords: Production and structure of hypernuclei; Mesonic and non-mesonic decay of Λ-hypernuclei; $\Gamma_n/\Gamma_p$ puzzle; $\Delta I = 1/2$ isospin rule; Decay of polarized Λ-hypernuclei
Contents

1. Hyperons and hypernuclei
   1.1. Introduction
   1.2. Hyperon–nucleon, hyperon–hyperon interactions and hypernuclear structure
      1.2.1. ΛN interaction and Λ-hypernuclei
      1.2.2. ΣN interaction and Σ-hypernuclei
      1.2.3. Strangeness S = −2 hypernuclei and the H-dibaryon
2. Production of hypernuclei
   2.1. Introduction
   2.2. The $(K^-, \pi^\pm)$ strangeness exchange reactions
   2.3. The $n(\pi^+, K^+)\Lambda$ strangeness associated production reaction
   2.4. The $p(e, e'K^+)\Lambda$ reaction
3. Weak decay modes of Λ-hypernuclei
   3.1. Introduction
   3.2. Mesonic decay
   3.3. Non-mesonic decay
      3.3.1. The $\Gamma_n/\Gamma_p$ puzzle
4. Present status of experiment and theory
   4.1. Experiment
   4.2. Theory
      4.2.1. Table 2. Decay width for a Λ in nuclear matter
      4.2.2. Table 3. Non-mesonic decay width for $^{12}_{\Lambda}\mathrm{C}$
      4.2.3. Table 4. Non-mesonic decay width for $^{5}_{\Lambda}\mathrm{He}$
      4.2.4. Table 5. Mesonic decay rate for $^{12}_{\Lambda}\mathrm{C}$
      4.2.5. Table 6. Mesonic decay rate for $^{5}_{\Lambda}\mathrm{He}$
      4.2.6. Table 7. $\Gamma_n/\Gamma_p$ ratio for nuclear matter
      4.2.7. Table 8. $\Gamma_n/\Gamma_p$ ratio for $^{12}_{\Lambda}\mathrm{C}$
      4.2.8. Table 9. $\Gamma_n/\Gamma_p$ ratio for $^{5}_{\Lambda}\mathrm{He}$
5. Models for calculation
   5.1. Introduction
   5.2. Wave function method: mesonic decay
   5.3. Wave function method: non-mesonic decay
   5.4. Polarization propagator method and local density approximation
      5.4.1. Nuclear matter
      5.4.2. Finite nuclei
      5.4.3. Phenomenological 2p–2h propagator
   5.5. Functional approach to the Λ self-energy
      5.5.1. The bosonic effective action
      5.5.2. Semiclassical expansion
   5.6. Results of the phenomenological calculation
      5.6.1. Short range correlations and Λ wave function: $^{12}_{\Lambda}\mathrm{C}$
      5.6.2. Decay widths of light to heavy Λ-hypernuclei
   5.7. Results of the microscopic calculation
6. The $\Gamma_n/\Gamma_p$ puzzle
   6.1. Introduction
   6.2. Two-body induced decay and nucleon final state interactions
      6.2.1. Recent experimental spectra
      6.2.2. Possible improvements
      6.2.3. Potentialities of coincidence experiments
   6.3. Phenomenological analysis of s-shell hypernuclei
      6.3.1. Experimental data and $\Delta I = 1/2$ rule
7. Non-mesonic decay of polarized Λ-hypernuclei: the asymmetry puzzle
   7.1. Introduction
   7.2. Spin-polarization observables
   7.3. Experiments
   7.4. Theory versus experiment
8. Summary and perspective
Acknowledgements
Appendix A. Spin–isospin NN → NN and ΛN → NN interactions
References
1. Hyperons and hypernuclei

1.1. Introduction

Hyperons (Λ, Σ, Ξ, Ω) have lifetimes of the order of $10^{-10}$ s (apart from the $\Sigma^0$, which decays into Λγ). They decay weakly, with a mean free path $\lambda \approx c\tau = O(10\ \mathrm{cm})$. A hypernucleus is a bound system of neutrons, protons and one or more hyperons. We will denote with $^{A+1}_{Y}Z$ a hypernucleus with Z protons, A − Z neutrons and a hyperon Y. A crucial point in describing the structure of these strange nuclei is the knowledge of the elementary hyperon–nucleon (YN) and hyperon–hyperon (YY) interactions. Hyperon masses differ remarkably from the nucleonic mass, hence the flavour SU(3) symmetry is broken. The amount of this breaking is a fundamental question in order to understand the baryon–baryon interaction in the strange sector. Among hyperons and nucleons the following exoenergetic strong reactions ($\Delta S = 0$) are allowed:

$$\Sigma^- p \to \Lambda n, \qquad \Sigma^+ n \to \Lambda p \qquad (Q \simeq 78\ \mathrm{MeV}),$$
$$\Xi^- p \to \Lambda\Lambda, \qquad \Xi^0 n \to \Lambda\Lambda \qquad (Q \simeq 26\ \mathrm{MeV}),$$
$$\Omega^- p \to \Lambda\Xi^0, \qquad \Omega^- n \to \Lambda\Xi^- \qquad (Q \simeq 178\ \mathrm{MeV})$$

(the average released energies, the so-called Q-values, are quoted in parentheses). Hence, only the lightest hyperon (Λ) is generally stable with respect to the strong processes which occur in nuclear systems. In this review we shall be mainly concerned with Λ-hypernuclei. The existence of hypernuclei is interesting since it gives a new dimension to the traditional world of nuclei (states with new symmetries, selection rules, etc.). In fact, they represent the first kind of flavoured nuclei, in the direction of other exotic systems (charmed nuclei and so on).
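The Q-values and the decay length quoted in this introduction can be checked with a few lines of arithmetic. The sketch below is only illustrative: the particle masses and the Λ lifetime are rounded tabulated values supplied here as assumptions (they are not quoted in the review), and the helper names `q_value`, `average_q` and `decay_length_cm` are hypothetical.

```python
# Rounded particle masses in MeV. These are assumed inputs for illustration,
# not numbers taken from the review.
MASS_MEV = {
    "p": 938.272,
    "n": 939.565,
    "Lambda": 1115.683,
    "Sigma+": 1189.37,
    "Sigma-": 1197.449,
    "Xi0": 1314.86,
    "Xi-": 1321.71,
    "Omega-": 1672.45,
}

# Strangeness-conserving conversion reactions, grouped in pairs as in the text.
REACTIONS = {
    "Sigma N -> Lambda N": [
        (("Sigma-", "p"), ("Lambda", "n")),
        (("Sigma+", "n"), ("Lambda", "p")),
    ],
    "Xi N -> Lambda Lambda": [
        (("Xi-", "p"), ("Lambda", "Lambda")),
        (("Xi0", "n"), ("Lambda", "Lambda")),
    ],
    "Omega N -> Lambda Xi": [
        (("Omega-", "p"), ("Lambda", "Xi0")),
        (("Omega-", "n"), ("Lambda", "Xi-")),
    ],
}


def q_value(initial, final):
    """Released energy in MeV: initial rest masses minus final rest masses."""
    return sum(MASS_MEV[x] for x in initial) - sum(MASS_MEV[x] for x in final)


def average_q(label):
    """Average Q-value (MeV) over the two charge channels of one group."""
    pairs = REACTIONS[label]
    return sum(q_value(i, f) for i, f in pairs) / len(pairs)


def decay_length_cm(lifetime_s):
    """Decay length c*tau in cm for a given proper lifetime in seconds."""
    speed_of_light_cm_per_s = 2.99792458e10
    return speed_of_light_cm_per_s * lifetime_s


if __name__ == "__main__":
    for label in REACTIONS:
        print(f"{label}: average Q = {average_q(label):.1f} MeV")
    # Assumed Lambda lifetime of about 2.63e-10 s.
    print(f"c*tau(Lambda) = {decay_length_cm(2.632e-10):.2f} cm")
```

With these inputs the averages land within about 1 MeV of the quoted 78, 26 and 178 MeV, and the Λ decay length comes out at a few centimetres, consistent with the order of magnitude stated above.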
Hypernuclear physics was born in 1952, when the first hypernucleus was observed through its decays [1]. Since then, it has gone through several phases of development and has been characterized by increasingly challenging questions and answers. However, this field has experienced great advances only in the last 10–15 years. We can look at hypernuclear physics as a good tool to bridge nuclear and particle physics. Nowadays, the knowledge of hypernuclear phenomena is rather good, but some open problems still remain. Actually, the study of this field may help in understanding some crucial questions, related, to list a few, to:

• some aspects of the baryon–baryon weak interactions;
• the YN and YY strong interactions in the $J^P = 1/2^+$ baryon octet;
• the possible existence of di-baryon particles;
• the renormalization of hyperon and meson properties in the nuclear medium;
• the nuclear structure: for instance, the long standing question of the origin of the spin–orbit interaction and other aspects of the many-body nuclear dynamics;
• the role played by quark degrees of freedom, flavour symmetry and chiral models in nuclear and hypernuclear phenomena.

In this review we will discuss many of these problems in detail.

1.2. Hyperon–nucleon, hyperon–hyperon interactions and hypernuclear structure

We summarize here the phenomenological information available nowadays on YN and YY interactions and on the structure of Λ-, Σ-, ΛΛ- and Ξ-hypernuclei. One of the main reasons of interest in hypernuclear physics lies in the possibility of extracting information about the characteristics of the YN and YY interactions. Obviously, measurements of YN and YY cross sections would give more direct information. However, such experiments are very difficult due to the short lifetime of the hyperons, which gives flight paths limited to less than 10 cm: nowadays, no scattering data are available on the YY interaction and those for the ΛN, ΣN and ΞN interactions are very limited (especially in the last case). Moreover, we recall that the inverse reaction pn → pΛ in free space is under investigation at COSY (Jülich) [2] and KEK [3]. Unfortunately, the experimental observation of this process is difficult because of its very low cross section [4] with respect to the huge background. The NN interaction can be understood in terms of one-meson-exchange (OME) models, usually combined with a proper parameterization of the repulsive component at short distance, which originates from quark exchanges between the hadrons and has a range of about 0.5 fm, corresponding to a transferred momentum $q \gtrsim 400$ MeV.
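As a rough consistency check of the correspondence between a range of about 0.5 fm and a momentum of about 400 MeV, a distance $r$ can be converted into a momentum scale through $q \sim \hbar c / r$. The value $\hbar c \simeq 197.3$ MeV fm is a standard constant that is not quoted in the text, and the estimate is only an order-of-magnitude sketch:

```latex
q \sim \frac{\hbar c}{r}
  \simeq \frac{197.3\ \mathrm{MeV\,fm}}{0.5\ \mathrm{fm}}
  \simeq 395\ \mathrm{MeV}
  \approx 400\ \mathrm{MeV}
```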
The extension of the OME description to strange particles of the $J^P = 1/2^+$ baryon octet is still unsatisfactory. Several models of the YN and YY interactions are available. For instance, with the help of the flavour SU(3) symmetry, the Bonn–Jülich [5] and Nijmegen [6–9] groups have developed several potentials using the OME picture, also including, in some cases, two-meson-exchange. In addition to meson-exchange potentials, other groups (Tokyo [10], Tübingen [11] and Kyoto-Niigata [12]) use quark cluster models to explain the short-range interactions. Unfortunately, none of these potentials is fully satisfactory and there are large discrepancies among the different models (especially on the spin–isospin dependence). Since the data on YN scattering are very limited (they consist almost exclusively of spin-averaged cross sections), it is impossible to fit the YN interaction unambiguously: different YN potentials can reproduce the data equally well, but they exhibit differences on a more detailed level, especially where the spin structure is concerned (compare for example Refs. [5,7,12]). The measurement of spin observables in YN scattering as well as in the weak process pn → pΛ could discriminate among the various interaction models. On the other hand, the study of the hypernuclear structure and weak decays is helpful in order to get useful information on the YN and YY interactions.

Fig. 1. NN and ΛN strong amplitudes in the one-meson-exchange model.

Fig. 2. Two pion (ρ) exchange contribution to the ΛN potential.

1.2.1. ΛN interaction and Λ-hypernuclei

The strong ΛN interaction displays some different aspects with respect to the NN one. For instance, due to isospin conservation in strong interactions, the fact that the Λ has isospin I = 0 forbids the emission of a pion ($\Lambda \nrightarrow \Lambda\pi$). In Fig. 1 we depict the NN and ΛN strong potentials in the OME model. The strong Λ → Σπ and Σ → Λπ couplings are allowed, and the Λ hyperon can interact with a nucleon by exchanging an even number of pions and/or of ρ mesons (see Fig. 2). The dominant part of the ΛN interaction comes from the two-pion-exchange, hence it has a shorter range than the NN one. Moreover, the ΛN potential is weaker than the NN potential: roughly speaking, from the diagrams of Figs. 1 and 2, for the tensorial components we have $V^{T}_{\Lambda N}/V^{T}_{NN} \simeq 1/4$. Besides, three-body interactions and two-body interactions with strangeness exchange are also allowed (Fig. 3). The ΛNN three-body force, whose pionic component is depicted in Fig. 3(a), is an important ingredient to investigate the structure of Λ-hypernuclei [13–15], especially in light systems. This is due to the ΛN–ΣN strong coupling, which is sizeable in the nuclear medium [16–20], and, on the other hand, leads to a non-negligible second order tensor force in the ΛN interaction (Fig. 2).
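The remark that a two-pion-exchange interaction has a shorter range than one-pion exchange can be illustrated with the usual Yukawa estimate, $r \sim \hbar c/(m c^2)$, where $m$ is the mass carried by the exchanged system. The sketch below assumes an isospin-averaged pion mass and treats the two-pion system as having an effective mass of twice the pion mass; both are simplifying assumptions made here, and the function name `yukawa_range_fm` is hypothetical.

```python
HBAR_C_MEV_FM = 197.327   # hbar*c in MeV*fm (standard constant)
PION_MASS_MEV = 138.04    # isospin-averaged pion mass, assumed for illustration


def yukawa_range_fm(exchanged_mass_mev):
    """Order-of-magnitude range (fm) of a force mediated by a given mass (MeV)."""
    return HBAR_C_MEV_FM / exchanged_mass_mev


# One-pion exchange, the longest-range part of the NN force.
one_pion_range = yukawa_range_fm(PION_MASS_MEV)

# Two-pion exchange, taken here as an effective exchanged mass of 2 * m_pi.
two_pion_range = yukawa_range_fm(2 * PION_MASS_MEV)

if __name__ == "__main__":
    print(f"one-pion exchange range: {one_pion_range:.2f} fm")
    print(f"two-pion exchange range: {two_pion_range:.2f} fm")
```

In this crude picture the two-pion range is exactly half the one-pion range, which is the sense in which the ΛN interaction is shorter ranged than the NN one.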
By assuming a repulsive NN potential, the small binding energies in light hypernuclei and the depth of the –nucleus mean potential for heavy systems can be reproduced [14] without requiring a strong spin-dependence of the N interaction; the latter seems, at present, to be excluded (see following discussion in this paragraph). In particular, the NN interaction is essential to explain
W.M. Alberico, G. Garbarino / Physics Reports 369 (2002) 1 – 109
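Fig. 3. $\Lambda NN$ three-body interaction (a) and two-body $\Lambda N$ interaction with strangeness exchange (b). [Diagram labels: (a) $\Lambda$, $N$, $N$ lines with $\pi$, $\Sigma$, $\pi$; (b) $\Lambda$, $N$ lines with $K$ exchange.]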
the existence of the lightest hypernucleus, the hypertriton ($^3_\Lambda$H $\equiv pn\Lambda$), which is weakly bound. The $\Lambda$ binding energy, defined as
$$B_\Lambda(^{A+1}_{\Lambda}Z) \equiv M(^{A+1}_{\Lambda}Z) - M(^{A}Z) - m_\Lambda < 0 ,$$
is as small as 130 keV (in absolute value) in the hypertriton. In Ref. [21], within a microscopic many-body scheme, the authors showed how the coupling to intermediate $\Sigma N$ states in the $\Lambda N$ interaction (Fig. 2) is crucial for a correct evaluation of the $\Lambda$ binding energy in nuclear matter. In hypernuclei the $\Lambda N$–$\Sigma N$ coupling is more important [especially because of the relatively small $\Sigma$–$\Lambda$ mass difference ($\simeq 78$ MeV)] than the $NN$–$\Delta N$ coupling in conventional nuclei, where the latter plays a very small role in binding few-nucleon systems ($m_\Delta - m_N \simeq 293$ MeV). Another signal of the $\Lambda N$–$\Sigma N$ coupling comes from the observation that in $S$-wave relative states the $\Lambda p$ interaction is more attractive than the $\Lambda n$ interaction. This follows from a comparison of the experimental binding energies in the $A = 3$, $I = 1/2$ doublet:
$$\frac{B_\Lambda(^4_\Lambda{\rm He}) - B_\Lambda(^4_\Lambda{\rm H})}{B_\Lambda(^4_\Lambda{\rm He})} \simeq \frac{0.35\ {\rm MeV}}{2.39\ {\rm MeV}} = 0.15$$
($^4_\Lambda$He $= ppn\Lambda$ and $^4_\Lambda$H $= pnn\Lambda$ should differ only because of Coulomb effects, if the $\Lambda n$ and $\Lambda p$ interactions were of equal strength). The $\Lambda N$–$\Sigma N$ coupling gives a charge symmetry breaking more important than the one observed in ordinary nuclei by comparing the neutron separation energies in $^3$H and $^3$He (after correcting for the Coulomb interaction in $^3$He). The Coulomb energy in $^4_\Lambda$He is expected to be only a little more repulsive than in $^3$He: $E_C(^4_\Lambda{\rm He}) - E_C(^3{\rm He}) \simeq 0.02$ MeV. A large part of the charge symmetry breaking observed in light $\Lambda$-hypernuclei is due to the coupling between the $\Lambda N$ and $\Sigma N$ channels and turns out to be quite sensitive to the mass difference between the initial and
final state: $\Delta m(\Lambda p \to \Sigma^+ n) \simeq 75$ MeV $<$ $\Delta m(\Lambda n \to \Sigma^- p) \simeq 80.5$ MeV (for transitions without charge exchange, $\Lambda p \to \Sigma^0 p$ and $\Lambda n \to \Sigma^0 n$, $\Delta m \simeq 77$ MeV).

Another important aspect of the $\Lambda N$ interaction is its spin-dependence. A qualitative indication of the difference between the singlet ($J = 0$) and triplet ($J = 1$) $\Lambda N$ interactions comes from the comparison of the $\Lambda$ binding energy in isobar nuclei not related by charge symmetry. For example, $|B_\Lambda(^7_\Lambda{\rm Li})|$ is larger than $|B_\Lambda(^7_\Lambda{\rm Be})|$ by about 0.42 MeV. The greater $|B_\Lambda|$ value corresponds to the hypernucleus whose nucleons' core has non-zero spin (being an odd–odd system, while the core of $^7_\Lambda$Be is even–even); it can be explained by the effect of the spin-dependent $\Lambda$ interaction with the unpaired nucleons, a proton and a neutron, in $^7_\Lambda$Li. However, the $\Lambda N$ spin–spin interaction is weak. In Ref. [22] the $\Lambda$-core spin doublet splittings ($J = |J_{\rm core} \pm 1/2|$), which give the strength of the $\Lambda N$ spin–spin interaction, are predicted to be of the order of 0.1 MeV for $p$-shell hypernuclei, and only for $^7_\Lambda$Li in the ground state is this splitting sizeably larger ($\simeq 0.6$ MeV). The recent measurements, at KEK and BNL [23,24], of the energy spacing of the $^7_\Lambda$Li ground state doublet, $M1(\frac{3}{2}^+ \to \frac{1}{2}^+) = 691.7 \pm 1.2$ keV, and of various $\gamma$-ray transitions in other $p$-shell hypernuclei confirmed this prediction [25]. Experiments with high energy resolution are therefore essential to study the spin-dependence of the $\Lambda N$ interaction. From the analysis of the spins of ground and excited states in $\Lambda$-hypernuclei one expects the $\Lambda N$ interaction to be more attractive in the spin-singlet state than in the spin-triplet state [26]. In Ref. [11] the authors found that the quark cluster model gives more attraction in the triplet interaction; moreover, their meson-exchange potentials are almost spin-independent.
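The $\Lambda N \to \Sigma N$ threshold differences and the charge symmetry breaking ratio quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only illustrative: the baryon masses are rounded tabulated values that we assume here (they are not given in the text), and the names `MASS`, `mass_gap` and `CHANNELS` are hypothetical helpers.

```python
# Approximate baryon masses in MeV. These rounded values are an assumption
# of this sketch (standard tabulated masses), not numbers quoted in the text.
MASS = {
    "p": 938.272, "n": 939.565, "Lambda": 1115.683,
    "Sigma+": 1189.37, "Sigma0": 1192.642, "Sigma-": 1197.449,
}


def mass_gap(initial, final):
    """Sum of final-state masses minus sum of initial-state masses (MeV)."""
    return sum(MASS[b] for b in final) - sum(MASS[b] for b in initial)


# Lambda N -> Sigma N transitions discussed in the text.
CHANNELS = {
    "Lambda p -> Sigma+ n": (("Lambda", "p"), ("Sigma+", "n")),
    "Lambda n -> Sigma- p": (("Lambda", "n"), ("Sigma-", "p")),
    "Lambda p -> Sigma0 p": (("Lambda", "p"), ("Sigma0", "p")),
    "Lambda n -> Sigma0 n": (("Lambda", "n"), ("Sigma0", "n")),
}
gaps = {name: mass_gap(ini, fin) for name, (ini, fin) in CHANNELS.items()}

# Relative charge symmetry breaking in the A = 3 core doublet, formed from
# the two numbers quoted in the text (0.35 MeV and 2.39 MeV).
csb_ratio = 0.35 / 2.39

if __name__ == "__main__":
    for name, gap in gaps.items():
        print(f"{name}: {gap:.1f} MeV")
    print(f"CSB ratio: {csb_ratio:.2f}")
```

With these assumed masses the charge-exchange channels bracket the charge-conserving ones, which is the ordering stated in the text.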
In the phenomenological OME models of the $\Lambda N$ interaction the situation is not clear [5–7]: since there is no direct empirical information about the spin structure of the potential, some versions favour the singlet interaction, while others favour the triplet one. It has been found [27,28] that the $\Lambda N$ effective interaction has repulsive character in the spin-parity $J^P = 0^+$ channel, while for the $NN$ interaction an attractive $0^+$ pairing is well known. This antipairing effect originates from a delicate balance between the $\Lambda N$ inner repulsion and the attraction at intermediate distances.

The spin–orbit component of the $\Lambda$–nucleus mean potential is rather small. The spin–orbit separation of the $\Lambda$ levels is at least one order of magnitude smaller than the one typical of the $N$–nucleus interaction [19,22,25,29–34]. Such an effect could originate from the weak tensor component of the $\Lambda N$ interaction, whose most important contributions come from the exchange, forbidden at the lowest order, of pions and rhos. This supports the hypothesis that the strong one-body spin–orbit potential experienced in ordinary nuclei (a central point in explaining their exact shell structure) originates from a two-body tensor force. However, this point is not completely clear yet. In fact, $\Lambda N$ forces besides the spin–orbit one [25] as well as core excitations [33,35,36] may contribute to the observed splittings. We will further discuss the problem of the spin–orbit interaction in $\Lambda$-hypernuclei in the next section.

When a hyperon is embedded in the nucleus, one has to take into account the influence of the medium on the hyperon, which originates, as in ordinary nuclei, from the strong two-body $\Lambda N$ interaction. A simple mean field picture turns out to be a good description of the bulk hypernuclear properties (for example the hyperon binding energies and the excitation functions), in agreement with experimental data [37].
Within this approximation the hyperon maintains its single particle behaviour in the medium, and it is well known that this occurs even for states well below the Fermi surface [38,39], a property which is not observed for the nucleons. This is due to the fact that
in hypernuclei the $\Lambda$ is a distinguishable particle, which has a relatively "weak" interaction with the nucleons' core. However, deviations from the independent particle description can be produced, for instance, by the $\Lambda N$–$\Sigma N$ coupling, $\Lambda NN$ three-body forces, QCD effects at the nuclear level and non-localities due to relativistic effects.

On the other hand, the presence of a hyperon influences the nuclear medium: hence, the Hartree–Fock approximation acquires a new self-consistency requirement in strange nuclei. In spite of the relative weakness of the $\Lambda N$ interaction (with respect to the $NN$ interaction), for particular nucleon configurations the single particle levels may be considerably lowered by the presence of a $\Lambda$: for the deepest ones the energy shift can reach 3–5 MeV, while for the valence orbits a value of about 1 MeV is frequent [28]. For example, the extra $1p$ neutron binding energy for a $p$-shell hypernucleus $^{A+1}_{\Lambda}Z$ due to the addition of the $\Lambda$ to the nucleus $^{A}Z$ is calculable with the following relation:
$$B_n^{1p}(^{A+1}_{\Lambda}Z) - B_n^{1p}(^{A}Z) = B_\Lambda(^{A+1}_{\Lambda}Z) - B_\Lambda(^{A}_{\Lambda}Z) < 0 .$$
For a $1s$ neutron:
$$B_n^{1s}(^{A+1}_{\Lambda}Z) = B_n^{1p}(^{A+1}_{\Lambda}Z) + M(^{A}_{\Lambda}Z) - M^{*}(^{A}_{\Lambda}Z) ,$$
where $M^{*}(^{A}_{\Lambda}Z)$ is the mass of the $1s$ neutron–hole excited state of $^{A}_{\Lambda}Z$, which can be produced by the $K^- n \to \Lambda\pi^-$ reaction on $^{A+1}Z$, through the transformation of a $1s$ neutron into a $1s$ $\Lambda$-hyperon (see next section). Similar relations hold for the proton levels.

The stability of the nucleons' core is increased by the presence of the $\Lambda$ particle, which thus plays a "glue-like" role. Remarkable examples are $^5$He and $^8$Be versus $^6_\Lambda$He and $^9_\Lambda$Be, the former being unstable and the latter stable with respect to strong particle emission. A very recent $\gamma$-ray spectroscopy experiment at KEK [40] showed that the size of the $^6$Li core in $^7_\Lambda$Li is reduced with respect to that of the loosely bound $^6$Li nucleus. In a $^5_\Lambda$He–$d$ ($^4$He–$d$) cluster model for $^7_\Lambda$Li ($^6$Li) [41], the rms distance between $^5_\Lambda$He and $d$ in $^7_\Lambda$Li is about 19% smaller than the one between $^4$He and $d$ in $^6$Li. The stabilizing role of the $\Lambda$ in nuclei is due to its position in the inner part of the nucleons' core, on single particle levels which are forbidden, by the Pauli principle, to the nucleons. On the other hand, the weak decay of the $\Lambda$ may cause the delayed fission of the host nucleus (because of the decreased stability and the energy release of the decay). This process has been used to measure the lifetime of heavy hypernuclei [42,43].

We now consider the different behaviours, in nuclei, of the neutron and the $\Lambda$ due to their decay modes. In free space the neutron and the $\Lambda$ are unstable; they decay through the following weak channels:
$$n \to p\, e^- \bar{\nu}_e \quad (100\%) ,$$
$$\Lambda \to \pi^- p \quad (63.9\%) , \qquad \Lambda \to \pi^0 n \quad (35.8\%) .$$
The energy released in the free neutron decay is $Q_n^{\rm free} \equiv m_n - (m_p + m_e) \simeq 0.78$ MeV, while the binding energy of a nucleon in the nucleus is (on average) $B_N \simeq -8$ MeV; therefore a neutron in a nucleus is generally stable (namely, its decay is kinematically forbidden). On the other hand, for the free $\Lambda$ decay the released energy is $Q_\Lambda^{\rm free} \equiv m_\Lambda - (m_\pi + m_N) \simeq 40$ MeV. This is larger than the nuclear $\Lambda$ separation energy, $|B_\Lambda| \lesssim 27$ MeV (especially in light hypernuclei), hence the $\Lambda$ is kinematically unstable even if embedded in nuclear systems. The binding energies of nucleon and $\Lambda$ in nuclei, $B_N$ and $B_\Lambda$, have different behaviours as a function of the mass number: $B_N$ saturates at about $-8$ MeV for nuclei with $A \gtrsim 10$, while $|B_\Lambda|$
monotonically increases with $A$ up to 27 MeV for $^{208}_{\Lambda}$Pb. Indeed, the $\Lambda$ can occupy any single particle state, the ground state of the hypernucleus always corresponding to the hyperon in the $1s$ level. It is then clear that a $\Lambda$ particle is a good probe of the inner part of nuclei. Actually, the Pauli principle is active on the $u$ and $d$ quarks of nucleons and $\Lambda$ when they are very close to each other. For example, in the case of $^5_\Lambda$He $\equiv ppnn\Lambda$, if the constituent baryons maintain their identity, both the hyperon and the nucleons occupy $s$ levels, while at the quark level an up quark in the $\Lambda p$ short range interaction (and a down quark in $\Lambda n$) has to occupy the $p$-level. The Pauli blocking effect at the quark level could be an important ingredient to explain the anomalously small $^5_\Lambda$He binding energy with respect to calculations performed within the baryon picture. A study of the role played by the quark Pauli principle on the binding energies of single- and double-$\Lambda$ $s$-shell hypernuclei can be found in Ref. [44]. The authors have found significant effects when the assumed size of the baryons is of the order of the proton charge radius: $b \simeq 0.86$ fm.

With the exception of hypernuclei of the $s$-shell, the depth of the $\Lambda$–nucleus mean field is about 30 MeV [37], namely it is less attractive than the one typical for a nucleon ($\simeq 50$–$55$ MeV). This characteristic reflects the smaller range and the weakness of the $\Lambda N$ interaction at intermediate distances with respect to the $NN$ one. It is possible to reproduce the experimental $\Lambda$ single particle levels using Woods–Saxon wells with the above depth and appropriate radii. For $s$-shell hypernuclei the $\Lambda$ single particle potential displays a repulsive soft core at short distances [45–47]. A measure of this effect is given by the rms radii for a nucleon and a $\Lambda$ in these hypernuclei: the hyperon rms radius is larger than the one for a nucleon.

1.2.2. $\Sigma N$ interaction and $\Sigma$-hypernuclei

The investigation of the $\Sigma N$ interaction is richer but more difficult than that of the $\Lambda N$ interaction. We remind the reader that it exhibits a long-range OPE component, its central part is weaker than the $\Lambda N$ one and it is very sensitive to spin and isospin [48–50]. Very roughly, the strengths of the averaged $NN$, $\Lambda N$ and $\Sigma N$ two-body potentials are in the following ratios: $NN/\Lambda N \simeq 3/2$, $NN/\Sigma N \simeq 3$. The strong spin–isospin dependence in the $\Sigma N$ interaction is natural in OME models and it is due to the exchange of both isoscalar ($\omega$, $\eta$) and isovector ($\pi$, $\rho$) mesons. The $\Sigma N$ spin–orbit strength is expected to be about 0.5–1 times the $NN$ one [48]. Calculations and experimental observations have shown that both the $\Lambda N$ and $\Sigma N$ effective two-body potentials are strongly repulsive at short distances, and a repulsive core even remains in the $\Lambda$–$\alpha$ and $\Sigma^0$–$\alpha$ folding potentials [47,50] (which describe the $\Lambda$ and $\Sigma$ dynamics in $^5_\Lambda$He and $^5_\Sigma$He, respectively); however, differently from $^5_\Lambda$He ($|B_\Lambda| \simeq 3.12$ MeV), because of the large repulsion in the inner region, the $\Sigma^0$–$\alpha$ potential does not support bound states. On the contrary, for the $NN$ interaction the attraction at intermediate distances is so strong that the $N$–$\alpha$ potential obtained by the folding procedure does not contain the inner repulsive component. For heavy nuclei a repulsive bump could appear on the surface of the $\Sigma$–nucleus potential because of the particular balance of repulsion and attraction (which is less effective on the nuclear surface) in the $\Sigma N$ interaction [50].

The first production signals interpreted as $\Sigma$-hypernuclear states, 20 years ago at CERN, showed unexpectedly narrow peaks (less than 8 MeV, instead of the 20–30 MeV estimated for nuclear matter [48,51]) which were assigned to the formation of $^9_\Sigma$Be, $^{12}_{\Sigma}$C, $^{12}_{\Sigma}$Be and $^{16}_{\Sigma}$C [52]. The first observation reported two narrow peaks above the $\Sigma$ binding threshold, separated by about 12 MeV, in the $^9$Be$(K^-,\pi^-)$ strong reaction, which were ascribed to the formation of $^9_{\Sigma^0}$Be. However, the measurements
were carried out with very low statistics and the identification of the peaks involved large ambiguities. Moreover, none of the reported states could be assigned to hypernuclear ground states. Recently, at BNL-AGS [53], with 10 times better statistics, the existence of such narrow structures for $p$-shell hypernuclei has been excluded.

Due to the relevance of the $\Sigma N$–$\Lambda N$ coupling, $\Sigma$-hypernuclei can also be regarded as resonant states of $\Lambda$-hypernuclei. On the other hand, in a $\Sigma$-hypernucleus the $\Sigma N \to \Lambda N$ conversion creates a $\Lambda$ with a kinetic energy of about 40 MeV. Since the $\Lambda$–nucleus well depth is smaller than this energy, the $\Lambda$-hyperon has a high probability of escaping from the nucleus and decays after travelling about 2 cm.

From the theoretical point of view, the existence of narrow $\Sigma$ states in nuclei cannot be explained under the (plausible) hypothesis of a sizeable $\Sigma N \to \Lambda N$ strong conversion process alone. Among the mechanisms introduced in order to suppress the calculated widths of $\Sigma$-hypernuclei [28,48,54–56], the most relevant ones are the Pauli blocking effect on the final nucleon in $\Sigma N \to \Lambda N$, the suppression of particular spin–isospin transitions and the medium polarization effect. The latter is accounted for in Refs. [54,55] through the so-called induced interaction approach. Moreover, it is also possible that the $\Sigma \to \Lambda$ conversion is less efficient in finite nuclei because the $\Sigma$–nucleus potential has such a small depth that the $\Sigma$ wave function is considerably pushed out of the nucleus. As already pointed out, it has been established that for $s$-shell $\Sigma$-hypernuclei the hyperon is pushed towards the nuclear surface by a central repulsion in the $\Sigma$–nucleus potential. The above effects can reduce the $\Sigma N \to \Lambda N$ width down to 5–10 MeV in $p$-shell hypernuclei [28,55].

There are many ambiguities in our knowledge of the properties of the $\Sigma$–nucleus potential, as obtained from studies of hypernuclear and $\Sigma N$ scattering data. If this potential had a small depth, in the production of heavy systems there would be the problem of resolving the small spacing among the $\Sigma$ single particle levels. In fact, if the energy separation among the $\Sigma$-levels is lower than their widths, these states cannot be resolved by the experiment. The analysis of the few existing data on $\Sigma^-$-atoms and of $(K^-,\pi^\pm)$ production indicates [28,48,50,52,57] a $\Sigma$ single particle potential depth in the range $8 \lesssim |V_0| \lesssim 15$ MeV for hypernuclei beyond the $s$-shell. Very shallow depths ($-V_0 \lesssim 10$ MeV) are consistent with the $(K^-,\pi^\pm)$ analysis, which, in fact, has not proved the existence of $\Sigma$ nuclear states beyond the $s$-shell.$^{1}$ Instead, $-V_0 \simeq 20$ MeV is more consistent with $\Sigma$-atoms data, which, however, are not sensitive to the interior part of the nucleus: hence, 20 MeV probably overestimates $|V_0|$. Moreover, from the above cited analysis, the $\Sigma$–nucleus potential turns out to be strongly spin- and isospin-dependent, with a spin–orbit part comparable with the $N$–nucleus one.

From further theoretical speculations [56] and experiments carried out in recent years at KEK [58,59] and BNL [53,60], the existence of $\Sigma^0$ and $\Sigma^+$ bound states for nuclear mass numbers $A > 4$ seems to be highly unlikely. On the other hand, the existence of the predicted [61] $^4_{\Sigma^+}$He bound state has been proved (with binding energy $-B_{\Sigma^+} \simeq 2.8$–$4.4$ MeV and width $\Gamma_{\Sigma^+} \simeq 7.0$–$12.1$ MeV) both at KEK [58] and BNL [60]. Actually, only for very light systems are the widths expected to be narrower than the separation among the $\Sigma$-levels; moreover, for hypernuclei other than the $s$-shell ones, the $\Sigma$–nucleus potential might not be deep enough to accommodate $\Sigma^0$ or $\Sigma^+$ bound states.

$^{1}$ The use of the $(K^-,\pi^\pm)$ production reaction to extract information on the $\Sigma$–nucleus potential is spoiled by the following defect. Distorted pion and kaon waves are needed to study this reaction. The optical potentials normally used in the calculations remove from the flux of pions and kaons the particles which undergo not only absorption but also quasi-elastic scatterings. The latter, however, will continue the reaction and contribute to the measured $(K^-,\pi^\pm)$ cross section.
Phenomenological analyses of $\Sigma^-$ atoms support the presence of a substantial repulsive component in the $\Sigma$–nucleus potential also in medium and heavy hypernuclei [57,62]. The conclusion of these works is that, although the magnitude of the repulsive component of the $\Sigma$–nucleus potential cannot be determined unambiguously by the atomic data, the smallness of the attractive part of such a potential does not provide sufficient binding to form $\Sigma$-hypernuclei.

We also remind the reader that Coulomb-assisted hybrid bound states for $\Sigma^-$ hyperons in nuclei have been predicted by Yamazaki et al. [63]. The energy and radial distribution of these states vary from the "deepest bound state with the smallest radius" up to the "shallow atomic states with large radii". Up to now, no experimental search for these states has appeared. It is then clear that further experiments and theoretical work are needed to properly understand the existence of $\Sigma$ bound states in nuclei.

1.2.3. Strangeness $S = -2$ hypernuclei and the $H$-dibaryon

Some experiments have revealed the existence of $\Xi$-hypernuclei [28,64–67]. They are produced through the $K^- p \to K^+ \Xi^-$, $K^- p \to K^0 \Xi^0$ and $K^- n \to K^0 \Xi^-$ strong reactions, which, because of the relatively large momentum transferred ($\simeq 500$ MeV), preferentially excite high total spin hypernuclear states. The measured $1s$ $\Xi^-$ binding energies (old emulsion data) have been fitted by using a Woods–Saxon potential with radius $R = 1.1 A^{1/3}$ fm, depth $20 \lesssim |V_0| \lesssim 28$ MeV and surface diffuseness $a = 0.65$ fm [64]. The depth $V_0$ compares well with theoretical predictions based on Nijmegen OME models and allows for the binding of several $\Xi$ levels. More recent speculations favour smaller well depths, around 12–16 MeV [67,68] for $^{12}_{\Xi^-}$C and $^{12}_{\Xi^-}$Be. However, improved experiments are needed to extract precise information concerning the $\Xi$–nucleus potential [69]; for example it is not yet clear whether the potential depth exhibits a mass number dependence [70]. The authors of Ref.
[50] obtained a $\Xi^-$–$\alpha$ potential characterized by a quite strong inner repulsion and a shallow attraction at intermediate distances: the $\Xi^-$ wave function is pushed to the nuclear surface and the small $\Xi^-$ binding energy has been reproduced. They have also found that in the formation of $^5_{\Xi^-}$He an important role is played by the Coulomb interaction.

When a $\Xi^-$-hypernucleus (or a $\Xi^-$-atom) is formed, the hyperon strongly interacts with a nucleon of the medium (exchanging a strange meson, $K$ or $K^*$) and produces two $\Lambda$'s with an energy release of only 28 MeV (further reduced by binding effects): $\Xi^- p \to \Lambda\Lambda$. This offers the possibility of producing double-$\Lambda$-hypernuclei [71,72], which were observed for the first time during the 1960s in emulsion experiments. The formation probability of a $\Lambda\Lambda$-hypernucleus is sizeable because the 28 MeV energy release in $\Xi^- p \to \Lambda\Lambda$ is only 0.1% larger than the separation energy in an $\alpha$-particle. Therefore, if an $\alpha$-cluster is broken as a consequence of the $\Xi^-$ absorption, the final $\Lambda$'s will not have enough energy to escape from the nucleus. The production probability of a double-$\Lambda$ or twin-$\Lambda$ hypernucleus turned out to be $(18 \pm 13)\%$ in the experiment of Ref. [73], which used $\Xi^-$ atomic capture on $^{12}$C.

The strangeness $S = -2$ hypernuclei are quite interesting because they represent the unique way of getting information on the $\Xi N$ and $\Lambda\Lambda$ interactions. In Ref. [64] the conversion width due to the process $\Xi^- p \to \Lambda\Lambda$ has been estimated to be quite narrow (of the order of 5 MeV or less), as one expects because of the small energy released in the process. More recent calculations have found conversion widths narrower than the spacing among the $\Xi$ levels: typically 1.6 MeV for $s$-states and 0.9 MeV for $p$-states [67]. For $^5_{\Xi^-}$He the calculation of Ref. [74] obtained a very small width, $\Gamma_{\Xi^-} = 0.76$ MeV, which results from a small overlap between the $\Xi^-$ wave function
and the nuclear core (the $\Xi^-$ binding energy being only 1.7 MeV). Therefore, if the experimental energy resolution is good enough, the smallness of $\Gamma_\Xi$ makes it feasible to perform spectroscopic studies. Because of the small mass difference between initial and final states, the $\Lambda\Lambda$–$\Xi N$ coupling plays an important role in double-$\Lambda$ hypernuclei. However, a suppression of this coupling coming from the Pauli blocking on the nucleon becomes sizeable in medium–heavy hypernuclei. On the contrary, the $\Lambda\Lambda$–$\Sigma\Sigma$ strong coupling will be less important because of the large mass difference ($\Delta m \simeq 155$ MeV).

The study of $\Xi$- and $\Lambda\Lambda$-hypernuclei is closely related to the observation of hyperon mixed states due to the $\Lambda\Lambda$–$\Xi N$–$\Sigma\Sigma$ couplings [18] and, in particular, to the search for a stable $H$-particle. The latter is predicted to be a six-quark state containing two $u$, two $d$ and two $s$ quarks coupled into a singlet $SU(3)$ state of both colour and flavour: it should have $J^P = 0^+$, $I = 0$ and it should be stable against strong decays (obviously, if its mass is smaller than twice the $\Lambda$ mass). This object has baryonic number 2 but it is not an ordinary nuclear state, namely the three-quark clusters contained in $H$ are deconfined. This kind of di-baryon was predicted by Jaffe in 1977 [75] within a quark bag model. Search experiments are currently running [65,73,76–79]. From observations on double-$\Lambda$-hypernuclei, the expected mass is $m_H \equiv 2m_\Lambda + B_H \gtrsim 2m_\Lambda - 28$ MeV [11,65,71]. The first calculation by Jaffe found a large value for $B_H$ ($\simeq -80$ MeV). Should the binding energy of the $H$-dibaryon be more attractive than the binding energy of two $\Lambda$'s in nuclei, $B_H < B_{\Lambda\Lambda} \equiv M(^{A+2}_{\Lambda\Lambda}Z) - M(^{A}Z) - 2m_\Lambda$, then the di-baryon should be strongly emitted from the nucleus ($^{A+2}_{\Lambda\Lambda}Z \to {}^{A}Z + H$), and the hypernucleus would have a very short lifetime. On the contrary, if $B_H > B_{\Lambda\Lambda}$, successive decays of the two hyperons (weak processes) should be observed, but this would not necessarily imply the non-existence of the $H$ di-baryon: the $\Lambda\Lambda$ interaction could also be attractive, although weaker than $B_{\Lambda\Lambda}$. It is then clear that, in this sense, the stability of double-$\Lambda$ hypernuclei may hinder the experimental detection of the $H$-particle: the observation of the weak decay of a double-$\Lambda$ hypernucleus only excludes the $H$ mass in the region $m_H < 2m_\Lambda + B_{\Lambda\Lambda}$. From the present experimental searches there is no unambiguous evidence which supports the existence of di-baryon resonances in the strange sector.

Studies of the $\Lambda\Lambda$ contribution to the experimental binding energy $B_{\Lambda\Lambda}$ [80] are quite difficult because of the few data available on double-$\Lambda$ hypernuclei and of the density dependence of the $\Lambda\Lambda$ interaction ($\Lambda\Lambda$–$\Xi N$ coupling, three-body forces, etc.).
This interaction occurs by the exchange of $I = 0$ mesons at lowest order, which favours an attractive character for $V_{\Lambda\Lambda}$. Nuclear emulsion experiments reported the observation of three double-$\Lambda$ hypernuclei: $^{6}_{\Lambda\Lambda}$He, $^{10}_{\Lambda\Lambda}$Be and $^{13}_{\Lambda\Lambda}$B. From these events, an effective $\Lambda\Lambda$ matrix element $-\langle V_{\Lambda\Lambda}\rangle \simeq \Delta B_{\Lambda\Lambda} \equiv |B_{\Lambda\Lambda}| - 2|B_\Lambda| \simeq 4$–$5$ MeV [18] was determined, $|B_{\Lambda\Lambda}|$ being the separation energy of the $\Lambda\Lambda$ pair from the $^{A+2}_{\Lambda\Lambda}Z$ hypernucleus and $|B_\Lambda|$ the hyperon separation energy from the $^{A+1}_{\Lambda}Z$ hypernucleus. However, a very recent counter-emulsion hybrid experiment, performed at KEK [81], favours a much weaker $\Lambda\Lambda$ interaction: $\Delta B_{\Lambda\Lambda}(^{6}_{\Lambda\Lambda}{\rm He}) = 1.01^{+0.27}_{-0.23}$ MeV. The authors of Ref. [81] advanced the hypothesis that in the previous emulsion experiments the single-$\Lambda$ hypernuclei were produced in excited states. In this case, a value of $\Delta B_{\Lambda\Lambda}$ around 1 MeV is expected also from these experiments. The production of $^{4}_{\Lambda\Lambda}$H hypernuclei has been reported, very recently, in a counter experiment at BNL [82]. Unfortunately, due to the limited statistics, the authors have not determined $\Delta B_{\Lambda\Lambda}$. The quantity $\Delta B_{\Lambda\Lambda}$, called the $\Lambda\Lambda$ bond energy, is expected to decrease as the nuclear mass number $A$ increases and to go to zero in the limit $A \to \infty$: for increasing $A$ the attraction between the $\Lambda$'s becomes weaker because of the larger average distance. We note here
that in Ref. [83] the author, by using the Skyrme–Hartree–Fock approach, found the approximation $-\langle V_{\Lambda\Lambda}\rangle \simeq \Delta B_{\Lambda\Lambda}$ between $\Lambda\Lambda$ interaction strength and bond energy to be questionable. Indeed these quantities seem to be sizeably affected by the interplay of several factors, such as the spatial distributions of the $\Lambda$'s and the core polarization: for $^{13}_{\Lambda\Lambda}$B the author evaluated $-\langle V_{\Lambda\Lambda}\rangle \simeq 2.1$–$5.6$ MeV (depending on the various parameterizations used for the Skyrme potential) once the $\Lambda\Lambda$ potential was adjusted to reproduce the value $\Delta B_{\Lambda\Lambda} = 4.8$ MeV of the old emulsion experiment.

In a double-$\Lambda$ hypernucleus the two hyperons are in the $^1S_0$ relative state (the $^3S_1$ state is not allowed by the Pauli principle). We can then compare the $\Lambda\Lambda$ interaction matrix elements with the $^1S_0$ ones for $\Lambda n$ and neutron–neutron interactions in light systems: $-\langle V_{\Lambda n}\rangle \simeq 2$–$3$ MeV, $-\langle V_{nn}\rangle \simeq 6$–$7$ MeV. We know that the $^1S_0$ $nn$ system is not bound. However, a $^1S_0$ $\Lambda\Lambda$ bound system, which has a smaller matrix element than $nn$, cannot be excluded on this basis because of the unknown balance between the short range repulsion and the intermediate distance attraction in the $\Lambda\Lambda$ interaction. On the other hand, the $\Lambda\Lambda$–$\Xi N$ coupling must also be taken into account [84]. Measurements of $B_{\Lambda\Lambda}$ in medium and heavy double-$\Lambda$-hypernuclei are expected, too.

We conclude this section by recalling that hypernuclei are always unstable with respect to weak decay. A variety of processes are in principle accessible (which do not have a counterpart in the non-strange sector). Limiting ourselves to $\Delta S = 1$ transitions we have:
• for $\Lambda$-hypernuclei: $\Lambda N \to nN$ ($Q \simeq 176$ MeV);
• for $\Sigma$-hypernuclei: $\Sigma N \to NN$ ($Q \simeq 255$ MeV);
• for $\Xi$-hypernuclei: $\Xi N \to \Lambda N$ ($Q \simeq 202$ MeV), $\Xi N \to \Sigma N$ ($Q \simeq 123$ MeV);
• for $\Lambda\Lambda$-hypernuclei: $\Lambda\Lambda \to \Lambda n$ ($Q \simeq 176$ MeV), $\Lambda\Lambda \to \Sigma N$ ($Q \simeq 97$ MeV);   (1)
• and many other processes for multi-strangeness systems ($S \le -3$), for example: $\Lambda\Xi \to \Xi N$ ($Q \simeq 174$ MeV), $\Lambda\Xi \to \Lambda\Lambda$ ($Q \simeq 199$ MeV), $\Xi\Xi \to \Xi\Lambda$ ($Q \simeq 199$ MeV).

These decays are expected to have lifetimes of the order of $10^{-10}$ s or less. However, when hyperons other than the $\Lambda$ are embedded in a nucleus, strong processes, which have very short lifetimes ($\simeq 10^{-22}$–$10^{-21}$ s), dominate over the quoted weak decays and prevent them from occurring. For double-$\Lambda$ hypernuclei, the $\Lambda\Lambda$-induced weak decay rates [Eq. (1)] of $s$-shell systems are estimated [85,86] to be suppressed by a factor 25–70 with respect to the free $\Lambda$ width, and are impossible to detect at present.
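The $Q$-values quoted in this section, both for the free neutron and $\Lambda$ decays and for the baryon–baryon weak processes listed at the end of Section 1.2.3, are simple mass differences. The following sketch reproduces them approximately; it assumes rounded tabulated masses and isospin-averaged $N$, $\Sigma$ and $\Xi$ masses (our choice for this illustration, which is why the results agree with the quoted numbers only to within a few MeV). The names `MASS` and `q_value` are hypothetical.

```python
# Rounded particle masses in MeV (assumed values for this sketch).
MASS = {
    "p": 938.272, "n": 939.565, "e": 0.511,
    "pi-": 139.570, "pi0": 134.977,
    "Lambda": 1115.683,
}
# Isospin-averaged multiplet masses: an assumption made here for simplicity.
MASS["N"] = (938.272 + 939.565) / 2
MASS["Sigma"] = (1189.37 + 1192.642 + 1197.449) / 3
MASS["Xi"] = (1314.86 + 1321.71) / 2


def q_value(initial, final):
    """Energy released (MeV): initial-state masses minus final-state masses."""
    return sum(MASS[x] for x in initial) - sum(MASS[x] for x in final)


# Free decays (the antineutrino is taken as massless).
q_neutron = q_value(["n"], ["p", "e"])
q_lambda_charged = q_value(["Lambda"], ["pi-", "p"])
q_lambda_neutral = q_value(["Lambda"], ["pi0", "n"])

# Non-mesonic weak processes in hypernuclei.
q_weak = {
    "Lambda N -> N N": q_value(["Lambda", "N"], ["N", "N"]),
    "Sigma N -> N N": q_value(["Sigma", "N"], ["N", "N"]),
    "Xi N -> Lambda N": q_value(["Xi", "N"], ["Lambda", "N"]),
    "Xi N -> Sigma N": q_value(["Xi", "N"], ["Sigma", "N"]),
    "Lambda Lambda -> Lambda N": q_value(["Lambda", "Lambda"], ["Lambda", "N"]),
    "Lambda Lambda -> Sigma N": q_value(["Lambda", "Lambda"], ["Sigma", "N"]),
}

if __name__ == "__main__":
    print(f"n free decay: {q_neutron:.2f} MeV")
    print(f"Lambda -> pi- p: {q_lambda_charged:.1f} MeV")
    print(f"Lambda -> pi0 n: {q_lambda_neutral:.1f} MeV")
    for name, q in q_weak.items():
        print(f"{name}: {q:.0f} MeV")
```

The contrast between the roughly 40 MeV released in the free $\Lambda$ decay and the 100–250 MeV released in the two-baryon processes is what the list above expresses.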
2. Production of hypernuclei

2.1. Introduction

The development (in the last 15 years) of counter experiments has opened a new phase of hypernuclear physics. In fact, the old experiments in the 1960s used emulsion and bubble chamber techniques and, in practice, they only measured the hyperon binding energies. Through counter techniques, the experiments have discovered new and interesting features of the hypernuclear structure, although several questions still remain unsolved.

Hypernuclei can be produced by using strong processes in which a particle (generally a pion or a kaon) hits a nucleus. Since strangeness has to be conserved, one can use the following production reactions:

(1) Processes with strangeness exchange:
$K^- n \to \Lambda\pi^-$; $\Sigma^0\pi^-$; $\Sigma^-\pi^0$;
$K^- p \to \Sigma^-\pi^+$; $\Sigma^+\pi^-$; $\Lambda\pi^0$; $\Sigma^0\pi^0$;

(2) Processes with associated production of strange hadrons:
$\pi^+ n \to \Lambda K^+$; $\Sigma^0 K^+$; $\Sigma^+ K^0$;
$\pi^+ p \to \Sigma^+ K^+$;
$\gamma p \to \Lambda K^+$ (photoproduction);
$e^- p \to e^- \Lambda K^+$ (electroproduction);
$p N \to \Lambda K^+ N$ (proton-induced);

(3) Reactions in which strangeness exchange and associated production of strangeness are combined (used for the production of $S = -2, -3$ hypernuclei):
$K^- p \to \Xi^- K^+$; $\Xi^0 K^0$; $K^+ K^0 \Omega^-$;
$K^- n \to \Xi^- K^0$;
$K^- p \to \Lambda\pi^0$, followed by $\pi^0 p \to \Lambda K^+$;
$p p \to \Xi^0 K^+ K^+ n$.
Fig. 4. Momentum $q_Y$ transferred to the hyperon $Y$ as a function of the projectile momentum in the laboratory frame $p_{\rm Lab}$ for the reaction $aN \to Yb$ at $\theta_{b,{\rm Lab}} = 0^\circ$ (taken from Ref. [28]).
In the following, we denote with $N(a,b)Y$, or simply with $(a,b)$, the process
$$aN \to Yb ,$$
where $N$ is a nucleon and $Y$ a hyperon. The considered reactions have different characteristics depending on their kinematics. In the following we shall see that, because of the complementarity of the reactions, the combined use of various production modes is important for exhaustive spectroscopic studies.

In order to produce a hypernucleus, the hyperon emerging from the reaction has to remain inside the nuclear system. The formation probability of a hypernucleus depends on the energy transferred in the production. When the momentum transferred to the hyperon, $q_Y$, is much larger than the nuclear Fermi momentum $k_F$, the hyperon has a very small sticking probability and it leaves the nucleus. Instead, when $q_Y \lesssim k_F$, the hyperon is created, with a high probability, in a bound state. In Fig. 4 the momenta transferred to the hyperon $Y$ in the reactions $N(a,b)Y$ are shown as a function of the projectile momentum $p_a$ at $\theta_b = 0^\circ$ in the laboratory frame. With the exception of the $(K^-,\pi^\pm)$ reactions, the other ones reported in the figure are endoergic, therefore the hyperon cannot be produced at rest: $q_Y$ decreases as the projectile momentum increases but it remains finite for high $p_a$. In this situation the hyperon is produced with a non-negligible probability above its emission threshold, namely with $B_Y > 0$ (quasi-free production). Some hypernuclear states in the continuum may be quasi-bound states: they do not emit the hyperon but nucleons and/or clusters of nucleons.

2.2. The $(K^-,\pi^\pm)$ strangeness exchange reactions

In the $(K^-,\pi^\pm)$ production reactions the incident $K^-$ transforms the struck neutron (proton) into a $\Lambda$ or $\Sigma^0$ ($\Sigma^-$) and a $\pi^-$ ($\pi^+$) is emitted with an energy spectrum which is directly related to the populated hypernuclear level.
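The dependence of $q_Y$ on the projectile momentum discussed for Fig. 4 follows from two-body kinematics alone. The sketch below is an illustration under stated assumptions: a free target nucleon at rest (no Fermi motion or nuclear binding), the meson $b$ emitted at $0^\circ$ in the laboratory, and rounded tabulated masses that are not taken from the text. The function name `hyperon_lab_momentum` and the dictionary `M` are hypothetical.

```python
import math

# Rounded masses in MeV (assumed values for this sketch).
M = {"K": 493.677, "pi": 139.570, "n": 939.565, "Lambda": 1115.683}


def hyperon_lab_momentum(p_a, m_a, m_N, m_b, m_Y):
    """Signed lab momentum (MeV) of Y in a N -> Y b, for N at rest and b at 0 deg.

    Positive values point along the beam. Returns None below threshold.
    """
    e_tot = math.hypot(p_a, m_a) + m_N        # total lab energy
    s = e_tot**2 - p_a**2                     # invariant mass squared
    sqrt_s = math.sqrt(s)
    if sqrt_s < m_b + m_Y:
        return None
    # Centre-of-mass momentum of the final two-body state.
    p_cm = math.sqrt((s - (m_b + m_Y)**2) * (s - (m_b - m_Y)**2)) / (2 * sqrt_s)
    e_Y_cm = math.hypot(p_cm, m_Y)
    beta, gamma = p_a / e_tot, e_tot / sqrt_s  # boost from CM to lab
    # b goes forward in the CM frame, so Y goes backward there.
    return gamma * (beta * e_Y_cm - p_cm)


def q_Y(p_a, m_a, m_N, m_b, m_Y):
    """Magnitude of the momentum transferred to the hyperon, or None."""
    p_Y = hyperon_lab_momentum(p_a, m_a, m_N, m_b, m_Y)
    return None if p_Y is None else abs(p_Y)


# n(K-, pi-)Lambda: scan the beam momentum for the smallest momentum transfer.
scan = range(100, 1501)
p_min_transfer = min(scan, key=lambda p: q_Y(p, M["K"], M["n"], M["pi"], M["Lambda"]))

# n(pi+, K+)Lambda at 1050 MeV: an endoergic reaction, so q_Y stays finite.
q_pi_K = q_Y(1050.0, M["pi"], M["n"], M["K"], M["Lambda"])

if __name__ == "__main__":
    print(f"(K-, pi-): q_Y is smallest near p_K = {p_min_transfer} MeV")
    print(f"(pi+, K+) at 1050 MeV: q_Y = {q_pi_K:.0f} MeV")
```

Under these assumptions the $(K^-,\pi^-)$ momentum transfer vanishes near a kaon momentum of 530 MeV, while the $(\pi^+,K^+)$ reaction on a free neutron gives a momentum transfer of several hundred MeV, comparable to or larger than typical Fermi momenta.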
The reactions n(K − ; − ); p(K − ; + ) − (used for the 4rst time at CERN [87] and BNL [31] to produce - and -hypernuclei) are esoenergetic and can create the hyperon at rest (qY = 0). By considering, as an approximation, the initial neutron in n(K − ; − )
16
W.M. Alberico, G. Garbarino / Physics Reports 369 (2002) 1 – 109
Table 1 Peaks observed at CERN in Peak
16 O
production (taken from Ref. [29]) B (MeV)
Con4guration −1 (1p3=2 ; 1p3=2 )J P =0+
#1
3.5
#2
−2.5
−1 (1p1=2 ; 1p1=2 )0 +
#3
−7
−1 (1p3=2 ; 1s1=2 )1 −
#4
−13
−1 (1p1=2 ; 1s1=2 )1 −
◦
at rest, the transferred momentum is zero, p ˜K = p ˜ ≡ p ˜ , and the pion is emitted at 3 = 0 in the laboratory frame. Thus, from energy–momentum conservation: p ˜ 2 + m2K + mN = m + p ˜ 2 + m2 and the momentum for the production of the at rest (called magic momentum) can be derived as follows: m2 − m2 + (m − mN )2 ˜ 2 + m2K = K EK ≡ p ⇒ p 530 MeV : 2(m − mN ) If the production reaction is p(K − ; + ) − , the kaon magic momentum is p 280 MeV. Since both the initial K − and the 4nal pion are strongly absorbed in the nucleus (they have a small mean free path), the kaon induced reactions preferentially populate less bound -levels and they have been only employed for s- and p-shell hypernuclear studies. Moreover, the low intensity and poor resolution of the kaon beams hinder the use of the (K − ; ± ) reactions. ◦ By using the strangeness exchange reaction at 3 = 0 , the hyperon is predominantly produced in a state with the same quantum numbers of the struck nucleon, namely the neutron hole and the are coupled to J P = 0+ and 5l = 0 (substitutional reaction). By increasing 3 the relative importance of 5l = 1; 2; etc. transitions increases, and hypernuclear states with higher spin can be produced. ◦ ◦ From measures at both 3 0 and 3 ¿ 10–15 it has been possible to study a large part of the level structure of light hypernuclei [31,34]. 16 Spectroscopic studies with the reaction n(K − ; − ) in a few hypernuclei (13 C, O and others) have shown that the spin–orbit part of the –nucleus mean potential is very small compared to the one of a nucleon [29 –31], although the exact magnitude is not known yet. Taking, for instance, the case of 16 O, the measured and nucleon p1=2 –p3=2 spin–orbit shifts are [29] 16 5E (16 O; 1p1=2 –1p3=2 ) 6 0:3 MeV5EN ( O; 1p1=2 –1p3=2 ) 6 MeV :
This estimate comes from the observed peaks in the excitation spectrum, which are reported in Table 1 with the corresponding $(N^{-1}, \Lambda)$ configurations. The $p_{1/2}$–$p_{3/2}$ spin–orbit separation for the nucleon is obtained by subtracting the energies of peaks #3 and #4. Since almost the same separation exists between peaks #1 and #2, we can infer that the analogous spin–orbit separation for the $\Lambda$ is compatible with zero. Subsequent $(\pi^+, K^+)$ experiments have confirmed small spin–orbit splittings. Very recently, the hyperon $1p_{1/2}$–$1p_{3/2}$ splitting of $^{13}_\Lambda$C hypernuclei,
W.M. Alberico, G. Garbarino / Physics Reports 369 (2002) 1 – 109
produced by the $^{13}$C$(K^-, \pi^-)^{13}_\Lambda$C reaction, has been measured at BNL [88], with the result:
$$\Delta E_\Lambda(^{13}_\Lambda\mathrm{C};\ 1p_{1/2}\text{–}1p_{3/2}) = 0.152 \pm 0.065\ \mathrm{MeV}, \qquad \Delta E_N(^{13}\mathrm{C};\ 1p_{1/2}\text{–}1p_{3/2}) \simeq 3\text{–}5\ \mathrm{MeV} .$$

To the best of our knowledge, only the analysis of Dalitz et al. [89] of old emulsion data on $^{16}_\Lambda$O found larger effects: $\Delta E_\Lambda(^{16}_\Lambda\mathrm{O};\ 1p_{1/2}\text{–}1p_{3/2}) = 1.30$–$1.45$ MeV. On the other hand, the smallness of the $\Lambda$–nucleus spin–orbit interaction arises naturally in a relativistic mean field description [32]. The $p_{1/2}$–$p_{3/2}$ splitting is generally considered to originate predominantly from the $\Lambda N$ spin–orbit force acting on the $\Lambda$ in the $p_{1/2}$ and $p_{3/2}$ levels of the $\Lambda$–$^{15}$O system. However, excitations of the core may contribute to the spin–orbit splitting as well [33,35], especially in heavy hypernuclei [36,90]. Hence, the smallness of the spin–orbit splittings does not necessarily imply a weak $\Lambda N$ spin–orbit interaction:$^2$ we also have to take into account the response of the nuclear core to the added $\Lambda$, which can itself modify the mentioned shifts. Indeed, there is evidence [35] that the core response is able to reduce the spin–orbit splitting significantly, already in $^{13}_\Lambda$C. Hypernuclear structure calculations with core-excited states [22,25] will be important in future analyses.

In the last 15 years the strangeness exchange reaction has been used at BNL [92,93] for production and decay studies of hypernuclei from $^4_\Lambda$H to $^{12}_\Lambda$C. However, because of the small momentum transfer and the large background coming from the in-flight kaon decays, the measurement could not be extended to heavy hypernuclei. At BNL [53,60] and KEK [58,59], the $(K^-, \pi^-)$ reaction confirmed the existence of the $^4_\Sigma$He bound state, which was under discussion for about 10 years. The (stopped $K^-$, $\pi^-$) reaction has been used at KEK [47,94,95], and, in the near future, will be employed at DaΦne [96], the Frascati $\phi$-factory. Moreover, this process was the standard method to produce $\Lambda$-hypernuclei in emulsion and bubble chamber experiments during the 1960s.
When the $K^-$ is stopped in the target, it is captured into an atomic level and then, after cascading down to inner levels, it is absorbed at the nuclear surface, converting a nucleon into a $\Lambda$ or a $\Sigma$. The momentum transferred to the produced $\Lambda$ is close to $k_F$ (for $0^\circ$ scattering angle, $q \simeq 250$ MeV), while when a $\Sigma$ is produced, $q \simeq 180$ MeV. The process with absorption of a kaon at rest in nuclei has the good feature of large production yields, especially for $\Lambda$-hypernuclei [94], and a large number of hypernuclear states is accessible. Moreover, unlike the in-flight reaction, it allows a clean separation of the quasi-free hypernuclear production (because of the larger transferred momentum), resulting in a better determination of the weak decay rates, especially in light systems [47]. At KEK [97], the mesonic decay widths ($\Lambda \to \pi N$) for $^4_\Lambda$H and $^4_\Lambda$He, produced by the in-flight reaction, have been measured quite accurately. The decay into $\pi^0 n$ has been directly identified for the first time in $^4_\Lambda$He. Similarly, a measurement of the $\pi^0 n$ decay channel for $^{11}_\Lambda$B and $^{12}_\Lambda$C is presented in Ref. [95]. Observations of this kind are of great importance also in connection with a proper parameterization of the $\pi^0$–nucleus optical potential.
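The kinematic figures quoted in this subsection (a magic momentum of about 530 MeV and momentum transfers of about 250 and 180 MeV for kaons stopped in the target) can be checked with a few lines of two-body kinematics. The sketch below is only illustrative: it treats the struck nucleon as free and at rest, ignores all binding energies and Fermi motion, and uses rounded particle masses that are assumptions of this example rather than values taken from the text. The function names are hypothetical.

```python
import math

# Rounded particle masses in MeV (assumed inputs, not taken from the text).
M_KAON = 493.677       # K-
M_PION = 139.570       # charged pion
M_NEUTRON = 939.565
M_PROTON = 938.272
M_LAMBDA = 1115.683
M_SIGMA_PLUS = 1189.37


def magic_momentum(m_beam, m_target, m_meson, m_hyperon):
    """Beam momentum (lab frame) for which the hyperon is produced at rest."""
    delta = m_hyperon - m_target
    e_beam = (m_beam**2 - m_meson**2 + delta**2) / (2.0 * delta)
    return math.sqrt(e_beam**2 - m_beam**2)


def two_body_momentum(sqrt_s, m1, m2):
    """Momentum of each particle in a two-body final state of invariant mass sqrt_s."""
    s = sqrt_s**2
    return math.sqrt((s - (m1 + m2) ** 2) * (s - (m1 - m2) ** 2)) / (2.0 * sqrt_s)


# In-flight n(K-, pi-)Lambda: kaon momentum that leaves the Lambda at rest.
p_magic = magic_momentum(M_KAON, M_NEUTRON, M_PION, M_LAMBDA)

# Kaon absorbed at rest: the whole system is at rest, so the hyperon
# momentum equals the two-body decay momentum at sqrt(s) = m_K + m_N.
q_lambda = two_body_momentum(M_KAON + M_NEUTRON, M_LAMBDA, M_PION)
q_sigma = two_body_momentum(M_KAON + M_PROTON, M_SIGMA_PLUS, M_PION)

print(f"magic momentum        : {p_magic:6.1f} MeV")
print(f"stopped K-, Lambda    : {q_lambda:6.1f} MeV")
print(f"stopped K-, Sigma+    : {q_sigma:6.1f} MeV")
```

The $\Sigma^+ \pi^-$ channel is chosen here only as one representative $\Sigma$ final state; other charge states shift the result by a few MeV.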
$^2$ Yet, we know that it is smaller than the $NN$ spin–orbit force, with a ratio $V^{l\cdot s}_{NN}/V^{l\cdot s}_{\Lambda N} \simeq 3$–$10$ between the strengths expected from phenomenological studies of the baryon–baryon potentials [5,7,12]. Very recent results from hypernuclear high-precision $\gamma$-ray spectroscopy experiments [23,91] seem to suggest an even smaller $\Lambda N$ spin–orbit interaction: $V^{l\cdot s}_{NN}/V^{l\cdot s}_{\Lambda N} \simeq 10$ [24].
2.3. The $n(\pi^+, K^+)\Lambda$ strangeness associated production reaction

When a $\pi^+$ hits a neutron, the creation of an $s\bar s$ quark pair leads to the associated production of two strange hadrons in the final state: the $s$-quark becomes a constituent of a $\Lambda$ and the $\bar s$ is transferred to the meson, which becomes a $K^+$. The $n(\pi^+, K^+)\Lambda$ reaction is complementary to the $n(K^-, \pi^-)\Lambda$ one. In fact, unlike the latter, the former is best suited for studying deeply bound states in medium and heavy hypernuclei [38,39]. It produces almost background-free spectra and has the advantage of using good quality, high intensity pion beams. In addition, the final $K^+$ is only moderately distorted by the nucleus ($\lambda_{K^+} \simeq 2\lambda_{K^-} \simeq 2\lambda_{\pi^\pm} \simeq 4$ fm). The reaction $n(\pi^+, K^+)\Lambda$ thus preferentially populates bound states with high-spin $(n^{-1}, \Lambda)$ configurations. Since the mass of the final pair of strange hadrons is sizeably larger than the mass of the initial particles, the $n(\pi^+, K^+)\Lambda$ reaction is endothermic, with a quite large momentum transferred to the hyperon: $q_Y \simeq 300$–$400$ MeV at $0^\circ$ scattering angle (see Fig. 4). Hence, this reaction is able to populate all possible $\Lambda$ levels, from the deepest one up to the quasi-free region. We note that when the $\Lambda$ is produced above its emission threshold, namely in the quasi-free region, it may leave the nucleus or spread its energy inside the nucleus. In the latter case, by the emission of nucleons and/or photons, a variety of hypernuclear states are accessible. Because of the relatively large momenta transferred to the hyperon, the relevant cross section for the associated production reaction is one to two orders of magnitude smaller than the one typical of the strangeness exchange reaction. However, this drawback is more than compensated by the high intensity of the available pion beams. From experiments using this reaction we have high quality information about the spectroscopy of many light to heavy $\Lambda$-hypernuclei [36–39,98].
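To make the contrast between the two production reactions concrete, the forward-angle momentum transfer can be estimated from relativistic two-body kinematics. The following sketch assumes a free target neutron at rest (no binding, no Fermi motion) and rounded masses; the beam momentum of 1.05 GeV for the pion is an illustrative choice, not a value given in the text, and the function name is hypothetical.

```python
import math

# Rounded masses in MeV (assumed inputs).
M_PION, M_KAON = 139.570, 493.677
M_NEUTRON, M_LAMBDA = 939.565, 1115.683


def forward_momentum_transfer(p_beam, m_beam, m_target, m_meson, m_hyperon):
    """Signed lab momentum of the hyperon when the outgoing meson is emitted at 0 degrees."""
    e_tot = math.hypot(p_beam, m_beam) + m_target
    s = e_tot**2 - p_beam**2
    sqrt_s = math.sqrt(s)
    if sqrt_s < m_meson + m_hyperon:
        raise ValueError("beam momentum below production threshold")
    # Centre-of-mass momentum and energy of the outgoing meson.
    p_cm = math.sqrt((s - (m_meson + m_hyperon) ** 2)
                     * (s - (m_meson - m_hyperon) ** 2)) / (2.0 * sqrt_s)
    e_cm = math.hypot(p_cm, m_meson)
    # Boost the forward-going meson back to the laboratory frame.
    gamma = e_tot / sqrt_s
    beta_gamma = p_beam / sqrt_s
    p_meson_lab = gamma * p_cm + beta_gamma * e_cm
    return p_beam - p_meson_lab


# Associated production n(pi+, K+)Lambda at an illustrative beam momentum.
q_associated = forward_momentum_transfer(1050.0, M_PION, M_NEUTRON, M_KAON, M_LAMBDA)

# Strangeness exchange n(K-, pi-)Lambda near the magic momentum: q should vanish.
q_exchange = forward_momentum_transfer(530.5, M_KAON, M_NEUTRON, M_PION, M_LAMBDA)

print(f"(pi+, K+) at 1050 MeV : q = {q_associated:6.1f} MeV")
print(f"(K-, pi-) at 530.5 MeV: q = {q_exchange:6.1f} MeV")
```

Under these assumptions the associated production reaction transfers roughly 400 MeV to the hyperon, at the upper end of the quoted range, whereas the strangeness exchange reaction near the magic momentum transfers essentially nothing.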
The associated production reaction was used for the first time at BNL [99,100] for $^{12}_\Lambda$C, while more recently it has been employed at KEK [36,39,101]. Here, it made it possible to measure accurately the lifetime of $\Lambda$-hypernuclei over a broad range of mass numbers [102] (from $^{12}_\Lambda$C to $^{56}_\Lambda$Fe, and data on $^{89}_\Lambda$Y are under analysis now), with the explicit identification of the produced hypernuclei. Moreover, $(\pi^+, K^+)$ spectroscopy experiments at KEK [36,103] observed double-peak structures in $^{12}_\Lambda$C, $^{16}_\Lambda$O, $^{51}_\Lambda$V and $^{89}_\Lambda$Y, interpretable as spin–orbit splittings. The magnitude of the shifts suggests a $\Lambda$–nucleus spin–orbit interaction stronger than the one extracted from $(K^-, \pi^-)$ experiments. However, the interpretation of the measured spectra is still under discussion. At KEK [104] the $n(\pi^+, K^+)\Lambda$ reaction has also been used to measure the weak decay width for $\Lambda \to \pi^- p$ in $^{12}_\Lambda$C. This measurement has been carried out with a relatively small error and allowed a quite precise determination of the medium distortion acting on the pion coming out of the decay, a useful input for a better understanding of the pion–nucleus interactions. The KEK Superconducting Kaon Spectrometer worked with an energy resolution of 1.5–2 MeV FWHM.$^3$ Nowadays, there is an effort towards sub-MeV resolution spectroscopy (and pion beams with high statistics and intensity), again by using the $(\pi^+, K^+)$ reaction, at the Japan Hadron Facility (JHF) [105]. High resolution is important, in particular, for the observation of the hypernuclear fine structure and, in turn, for a better understanding of the $\Lambda N$ spin–isospin dependent interactions.
$^3$ Full width at half maximum.
2.4. The $p(e, e'K^+)\Lambda$ reaction

The electroproduction reaction is characterized by large momentum transfer ($\simeq 350$ MeV) and by the dominance of the spin–flip amplitudes in the elementary process $p(\gamma, K^+)\Lambda$. Thus, the electroproduction cross sections are small and the reaction mainly populates stretched and unnatural parity hypernuclear states. The smallness of the $(e, e'K^+)$ reaction cross section is partially compensated by the high intensity of the initial electron beam relative to that of kaon beams. This reaction could complement our knowledge of hypernuclear spectroscopy derived from studies performed with meson beams. Indeed, the high precision of electron beams can considerably improve the quality of experimental data. Moreover, the $(K^-, \pi^\pm)$ and $(\pi^+, K^+)$ reactions hardly produce ground states and deep-hole states in heavy hypernuclei, because of the strong pion and kaon absorption in the nuclear medium. Unnatural parity states are also difficult to excite in $(K^-, \pi^\pm)$ and $(\pi^+, K^+)$ experiments, due to their moderate spin–flip amplitudes. At TJNAF laboratories [106], by using the electroproduction reaction, hypernuclear levels will be observed with high resolution ($\simeq 0.6$ MeV FWHM) and, through fission fragment detection techniques, the lifetimes of heavy hypernuclei will be measured with great accuracy and precise identification of the decayed system [107].

3. Weak decay modes of $\Lambda$-hypernuclei

3.1. Introduction

In the production of hypernuclei, the populated state may be highly excited, above one or more threshold energies for particle decays. These states are unstable with respect to the emission of the hyperon, of photons and of nucleons. The spectroscopic studies of strong and electromagnetic de-excitations give information on the hypernuclear structure which is complementary to what we can extract from studies of excitation functions and angular distributions.
Once the hypernucleus is stable with respect to electromagnetic and strong processes, it is in the ground state, with the hyperon in the 1s level, and can only decay via a strangeness-changing weak interaction, through the disappearance of the hyperon. This is the most important decay mechanism, because it opens the possibility to study some very interesting questions, which have been mentioned in the introduction (Section 1). Now we come to the main subject of the review, the study of the weak decay of $\Lambda$-hypernuclei. In the next two subsections we briefly discuss the main characteristics of the decay channels for these systems. In Section 3.2 we will introduce the mesonic mode ($\Lambda \to \pi N$), which resembles what happens to the $\Lambda$ in free space, and in Section 3.3 the so-called non-mesonic modes ($\Lambda N \to NN$, $\Lambda NN \to NNN$, etc.), which can only occur in nuclear systems. Semi-leptonic and weak radiative $\Lambda$ decay modes:

$\Lambda \to n\gamma$ (B.R. $= 1.75 \times 10^{-3}$);
$\Lambda \to p\pi^-\gamma$ (B.R. $= 8.4 \times 10^{-4}$);
$\Lambda \to p e^- \bar\nu_e$ (B.R. $= 8.32 \times 10^{-4}$);
$\Lambda \to p \mu^- \bar\nu_\mu$ (B.R. $= 1.57 \times 10^{-4}$)
are neglected in the following, being, in free space, orders of magnitude less important than the mesonic decay (B.R. $= 0.997$).

3.2. Mesonic decay

The free $\Lambda$ decays via the pionic channels:

$\Lambda \to \pi^- p$ ($\Gamma^{\rm free}_{\pi^-}/\Gamma^{\rm free}_\Lambda = 0.639$),
$\Lambda \to \pi^0 n$ ($\Gamma^{\rm free}_{\pi^0}/\Gamma^{\rm free}_\Lambda = 0.358$),

with a lifetime $\tau^{\rm free}_\Lambda \equiv \hbar/\Gamma^{\rm free}_\Lambda = 2.632 \times 10^{-10}$ s.

The experimental ratio of the relevant widths, $\Gamma^{\rm free}_{\pi^-}/\Gamma^{\rm free}_{\pi^0} \simeq 1.78$, and the $\Lambda$ polarization observables are compatible with the $\Delta I = 1/2$ rule on the isospin change (for $\Gamma^{\rm free}_{\pi^-}/\Gamma^{\rm free}_{\pi^0}$ this follows from a simple Clebsch–Gordan analysis), which is also valid for the decay of the $\Sigma$ hyperon and for pionic kaon decays (namely in non-leptonic strangeness-changing processes). Actually, this rule is slightly violated in the free $\Lambda$ decay, since it predicts $\Gamma^{\rm free}_{\pi^-}/\Gamma^{\rm free}_{\pi^0} = 2$ (taking the same phase space for the two channels and neglecting the final state interactions). Nevertheless, the ratio $A_{1/2}/A_{3/2}$ between the $\Delta I = 1/2$ and the $\Delta I = 3/2$ transition amplitudes is very large (of the order of 30). This isospin rule is based on experimental observations but its dynamical origin is not yet understood on theoretical grounds. On the other hand, it is not clear whether it is a universal characteristic of all non-leptonic processes with $\Delta S \neq 0$. The free $\Lambda$ decay in the Standard Model can occur through both $\Delta I = 1/2$ and $\Delta I = 3/2$ transitions, with comparable strengths: an $s$ quark converts into a $u$ quark through the exchange of a $W$ boson. Moreover, the effective 4-quark weak interaction derived from the Standard Model including perturbative QCD corrections (box and penguin quark diagrams, namely one-loop gluon radiative corrections) gives too small $A_{1/2}/A_{3/2}$ ratios ($\simeq 3$–$4$, as calculated at the hadronic scale of about 1 GeV by using renormalization group techniques [108,109]).
Therefore, non-perturbative QCD effects at low energy (such as hadron structure and reaction mechanism), which are more difficult to handle, and/or final state interactions could be responsible for the enhancement of the $\Delta I = 1/2$ amplitude and/or the suppression of the $\Delta I = 3/2$ amplitude. In the low energy regime, chiral perturbation theory is the effective theory which is usually employed for describing hadronic phenomena [110]. However, it is well known that, when used in connection with perturbative QCD corrections, it is not able to reproduce the rates for hyperon non-leptonic weak decays.

Taking into account energy–momentum conservation in the mesonic decay, $m_\Lambda$ is equal to $\sqrt{\vec p^{\,2} + m_\pi^2} + \sqrt{\vec p^{\,2} + m_N^2}$ in the center-of-mass system, thus the momentum of the final nucleon is $p \simeq 100$ MeV and corresponds to an energy release $Q \simeq m_\Lambda - m_N - m_\pi \simeq 40$ MeV. We have neglected the binding energies of the recoil nucleon and of the $\Lambda$, which tend to decrease $p$. Hence, in nuclei the $\Lambda$ mesonic decay is disfavoured by the Pauli principle, particularly in heavy systems. It is strictly forbidden in normal infinite nuclear matter (where the Fermi momentum is $k_F^0 \simeq 270$ MeV), while in finite nuclei it can occur because of three important effects:
• In nuclei the hyperon has a momentum distribution (being confined in a limited spatial region) that allows larger momenta to be available to the final nucleon;
• The final pion feels an attraction by the medium, such that for fixed momentum $\vec q$ it has an energy smaller than the free one [$\omega(\vec q) < \sqrt{\vec q^{\,2} + m_\pi^2}$], and consequently, due to energy conservation, the final nucleon again has more chance to come out above the Fermi surface. It has been shown [111,112] that the pion distortion increases the mesonic width by one or two orders of magnitude for very heavy hypernuclei ($A \simeq 200$) with respect to the value obtained without the medium distortion. For light and medium hypernuclei this enhancement factor is smaller, being about 2 for $A \simeq 16$;
• At the nuclear surface the local Fermi momentum is considerably smaller than $k_F^0$ (namely the Pauli blocking is less effective) and favours the decay.

Nevertheless, the mesonic width rapidly decreases as the nuclear mass number $A$ of the hypernucleus increases [111,112]. A further (but very small) effect which reduces the mesonic rate, especially in medium and heavy hypernuclei, is the absorption of the final pion in the medium. Actually, while energy–momentum conservation forbids the absorption of an on-shell pion by a free nucleon, the absorption by a correlated pair of nucleons is allowed for both on- and off-shell pions, and the corresponding decay is observed as non-mesonic, resulting in a final state with 3 nucleons: $\Lambda NN \to NNN$. Hence, the mesonic channel is strictly related to the three-body non-mesonic decay.

From the study of the mesonic channel it is possible to extract important information on the pion–nucleus optical potential, which is not yet known in a complete and unambiguous form from pionic atoms and low energy pion–nucleus scattering experiments (moreover, no data are available for neutral pions). The nuclear mesonic rate, $\Gamma_M = \Gamma_{\pi^-} + \Gamma_{\pi^0}$, is very sensitive to the pion self-energy in the medium [111–113]: it is significantly enhanced by the attractive P-wave part of the optical potential, but exclusive decays to closed shell nuclei mainly select the repulsive S-wave interaction and reduce the mesonic rate with respect to the calculation using non-distorted (free) pion waves. The mesonic width is also extremely sensitive to the Q-value of the process, $Q \simeq 40\ \mathrm{MeV} + B_\Lambda - B_N$, which is in fact very small and decreases with the nuclear mass number.
This implies a great sensitivity of the available phase space to the mass of the final light particle, i.e. the pion (in analogy with the problem of determining the neutrino mass from nuclear $\beta$ decay), and to the $\Lambda$ and final nucleon binding energies. It is then clear that a systematic measurement of the mesonic decays in medium–heavy systems is strongly advisable. Unfortunately, no data are available at present on the mesonic decay for $A > 56$ hypernuclei, apart from some old emulsion and bubble chamber limits for $40 < A < 100$ [114].

From calculations and experiments on mesonic decays of s-shell hypernuclei we have evidence for a central repulsion in the $\Lambda$–nucleus mean potential [47,50,115] (named, for this reason, "Isle" potential). This is an indication for a particular balance between the strongly repulsive $\Lambda N$ interaction at short range, which automatically appears in quark based models [11,46], and the weak (with respect to the $NN$ one) $\Lambda N$ attraction at intermediate distances.

The following consideration about the Pauli principle in very light systems is interesting. We have discussed how the Pauli exclusion principle suppresses the nuclear mesonic decay. However, in $A = 3$ hypernuclei the mesonic decays into two-body final states are enhanced (with respect to the corresponding free decays) as a result of the antisymmetrization of the nucleons in the particular final states [115,116]. For example, the experimental rate for the two-body process $^4_\Lambda\mathrm{H} \to {}^4\mathrm{He} + \pi^-$ ($\simeq 0.69\,\Gamma^{\rm free}_\Lambda$) is almost equal to the $\Lambda \to \pi^- p$ free rate ($\simeq 0.64\,\Gamma^{\rm free}_\Lambda$). Adding also the contribution of three-body mesonic decays with a $\pi^-$ in the final state, the rate is about the total free $\Lambda$ width: $\Gamma(^4_\Lambda\mathrm{H} \to \pi^- + \mathrm{all}) > \Gamma^{\rm free}_{\pi^-}$. Moreover, again from data, $\Gamma(^4_\Lambda\mathrm{He} \to \pi^0 + \mathrm{all}) \gtrsim \Gamma^{\rm free}_{\pi^0}$.
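The free-space kinematics behind the Pauli suppression discussed earlier in this subsection (a final nucleon momentum of about 100 MeV, an energy release of about 40 MeV, and a nuclear-matter Fermi momentum of about 270 MeV) can be reproduced with the short sketch below. It considers the $\Lambda \to \pi^- p$ channel for a $\Lambda$ at rest, neglects binding energies as the text does, and uses rounded masses that are assumptions of this example.

```python
import math

# Rounded masses in MeV (assumed inputs).
M_LAMBDA, M_PROTON, M_PION = 1115.683, 938.272, 139.570
K_FERMI = 270.0  # MeV, nuclear-matter Fermi momentum quoted in the text


def decay_momentum(m_parent, m1, m2):
    """Momentum of each decay product for a two-body decay of a parent at rest."""
    s = m_parent**2
    return math.sqrt((s - (m1 + m2) ** 2) * (s - (m1 - m2) ** 2)) / (2.0 * m_parent)


p_nucleon = decay_momentum(M_LAMBDA, M_PROTON, M_PION)
q_value = M_LAMBDA - M_PROTON - M_PION
pauli_blocked = p_nucleon < K_FERMI  # nucleon would land inside the Fermi sea

print(f"nucleon momentum : {p_nucleon:.1f} MeV")
print(f"Q-value          : {q_value:.1f} MeV")
print(f"below k_F        : {pauli_blocked}")
```

Since the nucleon momentum is far below the Fermi momentum, the decay is blocked in infinite nuclear matter, which is why the three finite-nucleus effects listed in the text are needed for the mesonic channel to survive at all.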
[Figure: meson-exchange diagrams; the exchanged mesons are $\pi, \rho, \omega, \eta, K, K^*$.]
Fig. 5. One-nucleon (a) and two-nucleon (b) induced $\Lambda$ decay in nuclei.
From theoretical calculations [111,112] and experimental measurements (see Ref. [117] for a review) there is evidence that the $\Gamma_{\pi^-}/\Gamma_{\pi^0}$ ratio in nuclei strongly oscillates around the value 2, predicted by the $\Delta I = 1/2$ rule for a nucleus with an equal number of neutrons and protons and closed shells. However, this is essentially due to nuclear shell effects and might not be directly related to the weak process itself. In fact, in the calculation of Refs. [111,112] the $\Delta I = 1/2$ rule is enforced in the $\Lambda \to \pi N$ free vertex; however, shell effects, also related to the Pauli blocking for the available final nuclear states, make $\Gamma_{\pi^-}/\Gamma_{\pi^0}$ strongly dependent on the hypernuclear structure. We remind the reader that $\Gamma_{\pi^-}/\Gamma_{\pi^0}$ is also sensitive to final state interactions and Coulomb effects.

3.3. Non-mesonic decay

When the pion emitted from the weak hadronic vertex $\Lambda \to \pi N$ is virtual, it will be absorbed by one or more nucleons of the medium, resulting in a non-mesonic process of the following type:

$$\Lambda n \to nn \quad (\Gamma_n) , \tag{2}$$
$$\Lambda p \to np \quad (\Gamma_p) , \tag{3}$$
$$\Lambda NN \to nNN \quad (\Gamma_2) . \tag{4}$$

The total weak decay rate of a $\Lambda$-hypernucleus is then:

$$\Gamma_T = \Gamma_M + \Gamma_{NM} ,$$

where

$$\Gamma_M = \Gamma_{\pi^-} + \Gamma_{\pi^0} , \qquad \Gamma_{NM} = \Gamma_1 + \Gamma_2 , \qquad \Gamma_1 = \Gamma_n + \Gamma_p ,$$

and the lifetime is $\tau = \hbar/\Gamma_T$. The channel (4) can be interpreted by assuming that the pion is absorbed by a pair of nucleons correlated by the strong interaction. Obviously, the non-mesonic processes can also be mediated by the exchange of mesons more massive than the pion (see Fig. 5).

The non-mesonic mode is only possible in nuclei and, at present, the systematic study of the hypernuclear decay is the only practical way to get information on the weak process $\Lambda N \to NN$ (which provides the first extension of the weak $\Delta S = 0$ $NN \to NN$ interaction to strange baryons),
especially on its parity-conserving part, which is masked by the strong interaction in the weak $NN \to NN$ reaction. In fact, there are no experimental observations of the process $\Lambda N \to NN$ using $\Lambda$ beams: however, the measurement of the (low) cross section for the inverse reaction $pn \to p\Lambda$, which could give much cleaner information, is under study (at COSY [2] and KEK [3]). The precise measurement of $\Gamma_n$ and $\Gamma_p$ in s-shell hypernuclei is very important for the study of the spin–isospin dependence and of the validity of the $\Delta I = 1/2$ rule in the non-mesonic processes (see the analysis presented in Section 6.3); on the other hand, it is relevant in connection with the hypernuclear structure dependence, which is rather important in these very light systems. In s-shell hypernuclei all nucleons are confined (like the hyperon) to the s-level, while complications arise with increasing mass number, due to the appearance of more $\Lambda N$ initial states and to the rescattering of the nucleons inside the residual nucleus, which entangles the kinematics of the measured nucleons.

The final nucleons in the non-mesonic processes emerge with large momenta: disregarding the $\Lambda$ and nucleon binding energies and assuming the available energy $Q = m_\Lambda - m_N \simeq 176$ MeV to be equally split among the final nucleons, it turns out that $p_N \simeq 420$ MeV for the one-nucleon induced channels [Eqs. (2) and (3)] and $p_N \simeq 340$ MeV in the case of the two-nucleon induced mechanism (4). Therefore, the non-mesonic decay mode is not forbidden by the Pauli principle: on the contrary, the final nucleons escape from the nucleus with great probability and the non-mesonic mechanism dominates over the mesonic mode for all but the s-shell hypernuclei. For very light systems the two decay modes are competitive, the smallest value for the non-mesonic width corresponding to the hypertriton, where it is evaluated to be 1.7% of the free $\Lambda$ decay rate [118].
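The nucleon momenta quoted above follow from sharing the released energy equally among the outgoing nucleons. A minimal sketch of that arithmetic, assuming an average nucleon mass (an input chosen for this example) and neglecting all binding energies as the text does, is:

```python
import math

# Rounded masses in MeV (assumed inputs).
M_LAMBDA = 1115.683
M_NUCLEON = 938.919   # average of proton and neutron masses
K_FERMI = 270.0       # MeV, nuclear-matter Fermi momentum quoted in the text


def final_nucleon_momentum(n_final):
    """Momentum of each nucleon if Q = m_Lambda - m_N is shared equally as kinetic energy."""
    kinetic = (M_LAMBDA - M_NUCLEON) / n_final
    energy = M_NUCLEON + kinetic
    return math.sqrt(energy**2 - M_NUCLEON**2)


p_one_nucleon_induced = final_nucleon_momentum(2)   # Lambda N -> N N
p_two_nucleon_induced = final_nucleon_momentum(3)   # Lambda N N -> N N N

print(f"one-nucleon induced: {p_one_nucleon_induced:.0f} MeV")
print(f"two-nucleon induced: {p_two_nucleon_induced:.0f} MeV")
```

Both values lie well above the Fermi momentum, in contrast with the roughly 100 MeV nucleon of the mesonic channel, which is the quantitative content of the statement that the non-mesonic mode is not Pauli blocked.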
The non-mesonic channel is characterized by large momentum transfer; thus, apart from very light hypernuclei, the details of the hypernuclear structure do not have a substantial influence (so that the channel provides useful information directly on the hadronic weak interaction). On the other hand, the $NN$ and $\Lambda N$ short range correlations turn out to be very important.

There is an anticorrelation between mesonic and non-mesonic decay modes such that the experimental lifetime is quite stable from light to heavy hypernuclei [114,117], apart from some fluctuation in light systems because of shell structure effects: $\tau_\Lambda = (0.5$–$1)\,\tau^{\rm free}_\Lambda$. Since the mesonic width is less than 1% of the total width for $A > 100$, the above consideration implies that the non-mesonic rate is rather constant in the region of heavy hypernuclei. This can be simply understood as follows. If one naively assumes a zero range approximation for the non-mesonic process $\Lambda N \to NN$ (actually, the range is not zero, but very small, due to the large transferred momenta), $\Gamma_1$ is proportional to the overlap between the $\Lambda$ and nuclear densities:

$$\Gamma_1(A) \propto \int d\vec r\, |\psi_\Lambda(\vec r)|^2 \rho_A(\vec r) ,$$

where the $\Lambda$ wave function $\psi_\Lambda$ (nuclear density $\rho_A$) is normalized to unity (to the nuclear mass number $A$). This overlap integral increases with the mass number and reaches a constant value: by using, for simplicity, harmonic oscillator wave functions (with frequency $\omega$ adjusted to the experimental hyperon levels in hypernuclei) and Fermi distributions for the nuclear densities, we find $\Gamma_1(^{12}_\Lambda\mathrm{C})/\Gamma_1(^{208}_\Lambda\mathrm{Pb}) \simeq 0.56$, while $\Gamma_1$ is 90% of the saturation value for $A \simeq 65$. In Fig. 6 the qualitative behaviour of mesonic, non-mesonic and total widths as a function of the nuclear mass number $A$ is shown. For $A \le 11$ the experimental data are quite well fitted by $\Gamma_{NM}/\Gamma^{\rm free}_\Lambda \simeq 0.1A$: $\Gamma_1$ (namely the probability of the $\Lambda N \to NN$ process) is proportional to the number of $\Lambda N$ pairs, $A$, as expected from the above simple description, where we neglect the contribution of $\Gamma_2$.
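The saturation argument can be illustrated numerically. The sketch below evaluates the overlap integral for a 1s harmonic-oscillator $\Lambda$ density and a two-parameter Fermi nuclear density. All numerical inputs are assumptions made for this illustration and are not the parameters used by the authors: the oscillator energies (11 MeV for carbon, 5 MeV for lead) are rough guesses for the hyperon level spacing, and the Fermi radius and diffuseness are generic textbook-style values. The result should therefore be read only as a qualitative check that the overlap grows from light to heavy systems, not as a reproduction of the quoted ratio.

```python
import math

HBARC = 197.327       # MeV fm
M_LAMBDA = 1115.683   # MeV


def fermi_shape(r, mass_number, r0=1.12, diffuseness=0.52):
    """Unnormalized Fermi profile; r0 and diffuseness (fm) are assumed values."""
    radius = r0 * mass_number ** (1.0 / 3.0)
    return 1.0 / (1.0 + math.exp((r - radius) / diffuseness))


def lambda_1s_density(r, hbar_omega):
    """|psi_Lambda(r)|^2 for a 1s harmonic-oscillator state, normalized to unity."""
    b2 = HBARC**2 / (M_LAMBDA * hbar_omega)  # oscillator length squared (fm^2)
    return math.exp(-r * r / b2) / (math.pi * b2) ** 1.5


def volume_integral(f, r_max=25.0, steps=5000):
    """Trapezoidal estimate of the integral of a spherically symmetric f over all space."""
    h = r_max / steps
    # End-point terms are dropped: the integrand is zero at r = 0 and negligible at r_max.
    return h * sum(4.0 * math.pi * (i * h) ** 2 * f(i * h) for i in range(1, steps))


def overlap(mass_number, hbar_omega):
    """Integral of |psi_Lambda|^2 * rho_A, with rho_A normalized to the mass number."""
    norm = mass_number / volume_integral(lambda r: fermi_shape(r, mass_number))
    return volume_integral(
        lambda r: lambda_1s_density(r, hbar_omega) * norm * fermi_shape(r, mass_number)
    )


# Illustrative (assumed) oscillator energies in MeV.
overlap_c12 = overlap(12, hbar_omega=11.0)
overlap_pb208 = overlap(208, hbar_omega=5.0)
ratio = overlap_c12 / overlap_pb208

print(f"overlap(A=12)  = {overlap_c12:.3f} fm^-3")
print(f"overlap(A=208) = {overlap_pb208:.3f} fm^-3")
print(f"ratio          = {ratio:.2f}")
```

In heavy systems the $\Lambda$ density sits almost entirely inside the flat central region of the nuclear density, so the overlap approaches the central density and stops growing; in light systems a sizeable part of the $\Lambda$ density lies in the nuclear surface, which lowers the overlap.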
Fig. 6. Qualitative behaviour of mesonic, non-mesonic and total decay widths as a function of the baryonic number A + 1.
Actually, the observed saturation of the $\Lambda N \to NN$ interaction is strictly related to its range: the saturation occurs when the radius of the hypernucleus becomes appreciably larger than the range of the interaction. By inspecting the experimental data of Refs. [43,93,101,102] we can conclude that the decaying $\Lambda$ can interact at most with about 15–20 neighbouring nucleons, namely almost exclusively with s- and p-shell nucleons. However, for a more quantitative explanation it will be important to collect data (with good precision, as in the KEK experiment [102] or in the planned FINUDA [96]) for hypernuclei between $^{12}_\Lambda$C and $^{28}_\Lambda$Si and in the region $A = 100$–$200$. Yet, from the available data one can say, very roughly, that the long distance component of the $\Lambda N \to NN$ interaction has a range of about 1.5 fm and corresponds, as we expect, to the OPE component of the interaction.

3.3.1. The $\Gamma_n/\Gamma_p$ puzzle

At present, the main problem concerning the weak decay rates is to reproduce the experimental value for the ratio $\Gamma_n/\Gamma_p$ between the neutron- and proton-induced widths $\Lambda n \to nn$ and $\Lambda p \to np$. The theoretical calculations underestimate the central data for all considered hypernuclei (see Tables 7–9):
$$\left\{\frac{\Gamma_n}{\Gamma_p}\right\}^{\rm Th} \ll \left\{\frac{\Gamma_n}{\Gamma_p}\right\}^{\rm Exp} , \qquad 0.5 \lesssim \left\{\frac{\Gamma_n}{\Gamma_p}\right\}^{\rm Exp} \lesssim 2$$

(only for $^4_\Lambda$He this ratio seems to be less than 0.5), although the large experimental error bars do not allow any definitive conclusion to be reached. The data are quite limited and not precise because of the difficulty in detecting the products of the non-mesonic decays, especially for the neutron-induced one. Moreover, the present experimental energy resolution for the detection of the outgoing nucleons does not allow the final state of the residual nuclei to be identified in the processes $^A_\Lambda Z \to {}^{A-2}Z + nn$ and $^A_\Lambda Z \to {}^{A-2}(Z-1) + np$. As a consequence, the measurements supply decay rates averaged over several nuclear final states.
In the one-pion-exchange approximation, by assuming the $\Delta I = 1/2$ rule in the $\Lambda \to \pi^- p$ and $\Lambda \to \pi^0 n$ free couplings, the calculations (discussed in the next section) give small ratios, in the range 0.05–0.20. This is due to the $\Delta I = 1/2$ rule, which fixes the vertex ratio $V_{\Lambda\pi^- p}/V_{\Lambda\pi^0 n} = -\sqrt{2}$ (both in S- and P-wave interactions), and to the particular form of the OPE potential, which has a strong tensor force and weak central and parity-violating forces: the large tensor transition $\Lambda N(^3S_1) \to NN(^3D_1)$ requires, in fact, $I = 0$ $np$ pairs in the antisymmetric final state. In p-shell and heavier hypernuclei the relative $\Lambda N$ $L = 1$ state is found to give only a small contribution to tensor transitions for the neutron-induced decay, so it cannot improve the OPE ratio. The contribution of the $\Lambda N$ $L = 1$ relative state to $\Gamma_{NM}$ seems to be about 5–15% in p-shell hypernuclei [119–121]. For these systems we expect the dominance of the S-wave interaction in the initial state, due to the small $\Lambda N$ relative momentum. By using a simple argument about the isospin structure of the transition $\Lambda N \to NN$ in OPE, it is possible to estimate that for pure $\Delta I = 3/2$ transitions ($V_{\Lambda\pi^- p}/V_{\Lambda\pi^0 n} = 1/\sqrt{2}$) the OPE ratio can increase up to about 0.5. However, the OPE model with $\Delta I = 1/2$ couplings has been able to reproduce the one-body stimulated non-mesonic rates $\Gamma_1 = \Gamma_n + \Gamma_p$ for light and medium hypernuclei [120–124]. Hence, the problem seems to consist in overestimating the proton-induced rate and underestimating the neutron-induced one. In order to solve this puzzle (namely to explain both $\Gamma_n + \Gamma_p$ and $\Gamma_n/\Gamma_p$), many attempts have been made up to now, but without success.
We recall the inclusion in the $\Lambda N \to NN$ transition potential of mesons heavier than the pion (also including the exchange of correlated or uncorrelated two-pions) [120–123,125–128], the inclusion of interaction terms that explicitly violate the $\Delta I = 1/2$ rule [129–131] and the description of the short range baryon–baryon interaction in terms of quark degrees of freedom (by using a hybrid quark model in [132] and a direct quark mechanism in [124,133,134]), which automatically introduces $\Delta I = 3/2$ contributions. The calculations of Refs. [120,121,128,134,135] are the only ones which have found a sizeable increase of the neutron to proton ratio with respect to the OPE value. We shall come back to the problem of the $\Gamma_n/\Gamma_p$ ratio more extensively in Sections 4.2 and 6.
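The two vertex ratios quoted in this subsection, $-\sqrt{2}$ for a pure $\Delta I = 1/2$ transition and $1/\sqrt{2}$ for a pure $\Delta I = 3/2$ one, follow from Clebsch–Gordan coefficients for coupling the pion (isospin 1) and the nucleon (isospin 1/2) to a final state with third component $-1/2$. The sketch below evaluates them with a small self-written implementation of Racah's closed formula in the Condon–Shortley phase convention; the helper names are hypothetical and this is an illustration rather than the method used in the cited calculations.

```python
from fractions import Fraction
from math import factorial, sqrt

HALF = Fraction(1, 2)


def clebsch_gordan(j1, m1, j2, m2, j, m):
    """<j1 m1; j2 m2 | j m> from Racah's closed formula (Condon-Shortley phases)."""
    j1, m1, j2, m2, j, m = (Fraction(x) for x in (j1, m1, j2, m2, j, m))
    if m1 + m2 != m or not (abs(j1 - j2) <= j <= j1 + j2):
        return 0.0
    if abs(m1) > j1 or abs(m2) > j2 or abs(m) > j:
        return 0.0

    def fact(x):
        return factorial(int(x))  # x is a non-negative integer stored as a Fraction

    prefactor = Fraction(
        (2 * j + 1) * fact(j + j1 - j2) * fact(j - j1 + j2) * fact(j1 + j2 - j),
        fact(j1 + j2 + j + 1),
    )
    prefactor *= (fact(j + m) * fact(j - m) * fact(j1 - m1) * fact(j1 + m1)
                  * fact(j2 - m2) * fact(j2 + m2))

    k_min = int(max(0, j2 - j - m1, j1 + m2 - j))
    k_max = int(min(j1 + j2 - j, j1 - m1, j2 + m2))
    total = Fraction(0)
    for k in range(k_min, k_max + 1):
        denom = (fact(k) * fact(j1 + j2 - j - k) * fact(j1 - m1 - k)
                 * fact(j2 + m2 - k) * fact(j - j2 + m1 + k) * fact(j - j1 - m2 + k))
        total += Fraction((-1) ** k, denom)
    return sqrt(prefactor) * float(total)


def vertex_ratio(final_isospin):
    """Amplitude ratio (pi- p)/(pi0 n) for a final pion-nucleon state |I, I3 = -1/2>."""
    pi_minus_p = clebsch_gordan(1, -1, HALF, HALF, final_isospin, -HALF)
    pi_zero_n = clebsch_gordan(1, 0, HALF, -HALF, final_isospin, -HALF)
    return pi_minus_p / pi_zero_n


ratio_half = vertex_ratio(HALF)                  # pure Delta I = 1/2
ratio_three_half = vertex_ratio(Fraction(3, 2))  # pure Delta I = 3/2
rate_ratio_half = ratio_half**2
measured_rate_ratio = 0.639 / 0.358              # free branching fractions from the text

print(f"Delta I = 1/2 amplitude ratio: {ratio_half:+.4f}")
print(f"Delta I = 3/2 amplitude ratio: {ratio_three_half:+.4f}")
print(f"Delta I = 1/2 rate ratio     : {rate_ratio_half:.3f}")
print(f"measured free rate ratio     : {measured_rate_ratio:.3f}")
```

Squaring the $\Delta I = 1/2$ amplitude ratio gives the rate ratio of 2 for the free decay when phase-space differences and final state interactions are neglected, to be compared with the measured value of about 1.78.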
4. Present status of experiment and theory

4.1. Experiment

We briefly summarize here the main experiments which have observed the weak decay of $\Lambda$-hypernuclei. The decay of a hypernucleus was observed for the first time in 1952 [1] in a nuclear emulsion used for cosmic-ray observations. The experiments on the weak decays started in the early 1960s and employed negative kaons stopped in emulsions and bubble chambers [136]. They were mostly based on the detection of the emitted negative pions, and only established rough limits on the total lifetimes of s-shell $\Lambda$-hypernuclei. In the following years [137], until the early 1970s, although with great difficulties (the identification of hypernuclei was hard, statistics and precision were very low, etc.), the experiments succeeded in separating the mesonic and non-mesonic decays and established the first limits on the partial rates. In these experiments hypernuclei were produced by using kaon or pion beams, as explained
in Section 2. They showed [93,114,138] that:
• for s-shell hypernuclei the mesonic and non-mesonic widths were comparable, $\Gamma_{NM}/\Gamma_{\pi^-} \simeq 0.3$–$1.5$, and $0.3 \lesssim \Gamma_n/\Gamma_p \lesssim 2$;
• for p-shell hypernuclei: $\Gamma_{NM}/\Gamma_{\pi^-} \simeq 2$–$7$ and $0.6 \lesssim \Gamma_n/\Gamma_p \lesssim 2$;
• for medium and heavy hypernuclei ($40 < A < 100$) the non-mesonic processes were dominant, $\Gamma_{NM}/\Gamma_{\pi^-} \simeq 100$–$200$, and $1.5 \lesssim \Gamma_n/\Gamma_p \lesssim 9$;
• the total lifetimes for light hypernuclei ($A \le 15$) oscillated in the interval $\tau_\Lambda/\tau^{\rm free}_\Lambda \simeq 0.3$–$1$.

The interest in the detection of hypernuclear decays waned in the early 1970s, until the first half of the 1980s [31,99], when at the Brookhaven synchrotron, by using modern techniques (scintillators, proportional chambers, etc., which allow direct timing observations), the lifetime of $^{11}_\Lambda$B and $^{12}_\Lambda$C [92], produced by the $(K^-, \pi^-)$ reaction, was measured. After some years, through the detection of protons, neutrons (from non-mesonic decays) and negative pions (from mesonic decays), the partial rates for $^5_\Lambda$He and $^{12}_\Lambda$C were measured [93]. The total lifetime was measured directly, and the mesonic rate into $\pi^0 n$ obtained by subtraction: $\Gamma_{\pi^0} = \Gamma_T - \Gamma_n - \Gamma_p - \Gamma_{\pi^-}$. It must be noted that lifetime measurements are free from nuclear final state interactions and material effects, which, on the contrary, strongly affect the measurement of the partial rates $\Gamma_n$ and $\Gamma_p$. The so-called "modern era" of hypernuclear physics starts with counter experiments like these, which very much improved the quality of data. More recently, with the same techniques, $^4_\Lambda$He and $^9_\Lambda$Be hypernuclei have been studied at BNL [139].

In the middle of the 1980s, at CERN LEAR, the lifetimes of $^{209}_\Lambda$Bi and $^{238}_\Lambda$U (produced by antiproton annihilation) were measured [140], although with very large error bars, with results comparable with the lifetimes of light hypernuclei. More recent results, obtained with an improved apparatus, are published in Ref. [42]: large uncertainties remain because of the limited precision of the recoil shadow method.
The experiment measures the fission fragments of the produced hypernucleus, with a delay time which is equal to the hypernuclear lifetime. In fact, the fission events are mainly induced by the energy released in the non-mesonic decay (the probability of a time delayed fission due to the $\Lambda$ decay is up to two orders of magnitude or more higher than that of prompt fission due to other sources [42]). Experiments of this kind are very difficult to perform (the produced hypernucleus cannot be unambiguously identified) and, as already mentioned, the lifetimes are generally measured with large errors. Only very recently [43,141,142] have more accurate results been obtained from delayed fission experiments. Nevertheless, there is a certain disagreement among these new data, and the saturation value of the $\Lambda$ lifetime for very heavy hypernuclei is not established with precision. It is important to remind the reader that, for the decay of very heavy hypernuclei, the application of more accurate techniques, employing direct timing methods (used, for instance, in the BNL experiment of Ref. [93]), is practically impossible due to the large background of light particles.

In the last 15 years there has been a rapid development of various experiments, which have led to a more systematic investigation of hypernuclei, although no experiment has been able to measure directly the whole set of partial decay rates. At Brookhaven [38,99] (since 1983) and KEK [39,101,143] (since 1989), the $(\pi^+, K^+)$ strangeness associated production reaction has been used. At Jülich (COSY) [43,142], by using proton–nucleus scattering processes, the total lifetime of very heavy hypernuclei (in the region of bismuth and uranium: $A \simeq 200$–$240$) has been measured by delayed fission observations. By using the same techniques [141], again at COSY, the lifetime for
W.M. Alberico, G. Garbarino / Physics Reports 369 (2002) 1 – 109
hypernuclei with mass numbers $A = 180 \pm 5$, produced in p–Au collisions, has been obtained. At BNL AGS [144], the $^{12}_\Lambda$B hypernucleus has been produced by the (stopped $K^-$, $\pi^0$) reaction, the $\Lambda$ being created on a proton, with the final $\pi^0$ detected by a high energy resolution (less than 1 MeV FWHM) neutral meson spectrometer. At BNL [66] and KEK [65] the process $(K^-,K^+)$ produces strangeness $-2$ hypernuclei, which are important for the study of the $\Lambda\Lambda$ and $\Xi N$ interactions.

Several experiments are planned for the future. At TJNAF laboratories [106], by using the electroproduction reaction $(e,e'K^+)$, hypernuclear levels will be observed with high resolution (about 0.6 MeV FWHM) and, through fission fragment detection techniques, the lifetimes of heavy hypernuclei will be measured with great accuracy and precise identification of the decayed system [107]. In the near future, within the Japan Hadron Facility (JHF) project, at KEK a germanium detector system will measure the hypernuclear $\gamma$-ray transitions with an energy resolution around 300 keV FWHM [105]. Germanium detectors with a few keV resolution are already collecting $\gamma$-spectroscopy data at BNL and KEK [23,91]. Experiments of this kind will be essential for a better understanding of the $\Lambda N$ spin-dependent interactions. Finally, the FINUDA facility [96] will make use of very thin targets (about 0.1 g/cm$^2$) and large detector acceptance (about $2\pi$ sr). The (stopped $K^-$, $\pi^-$) production reaction, already employed at KEK [47,94,95], will be used, with low energy $K^-$ (about 16 MeV) coming from the decay of the $\phi$ mesons ($\phi \to K^+K^-$, B.R. = 49.1%). These mesons will be created at the DA$\Phi$NE $e^+e^-$ collider at a center-of-mass energy of 1.02 GeV.
The experiment has been designed to work with a high production rate (about 80 hypernuclei produced per hour at the $e^+e^-$ luminosity of $10^{32}$ cm$^{-2}$ s$^{-1}$), high-resolution spectroscopy (about 0.7 MeV FWHM) and high precision in the measurements of the weak decay rates (2% statistical error on the total lifetimes for one week of data taking at $L = 10^{32}$ cm$^{-2}$ s$^{-1}$). The $\pi^-$ coming from the hypernuclear production could be detected by the FINUDA spectrometer in coincidence with all the particles emitted in the subsequent decay. It will be possible to measure the $\Gamma_n/\Gamma_p$ ratio with a precision better than that of the other running experiments and to use about 10 different targets, covering the whole mass range, for a systematic study of the production and decay of hypernuclei. We think that the wide program of FINUDA could represent a new step forward in understanding hypernuclear phenomena. The main results obtained with the experiments listed above will be quoted at the end of the next subsection, for a comparison with theoretical predictions.

4.2. Theory

We summarize here the historical evolution of the various theoretical approaches used to evaluate the weak decay of $\Lambda$-hypernuclei. Some details of the formalisms employed in the calculations are given in the next section. The numerical results of the different calculations are reported and discussed at the end of this subsection, in Tables 2–9.

The first calculations of the mesonic rate for light hypernuclei date back to the end of the 1950s [116]. The Pauli blocking effect for nuclear decay was estimated and used in order to assign the spin to the ground state of s-shell hypernuclei. The possibility of non-mesonic hypernuclear decay was suggested for the first time in 1953 [145] and interpreted in terms of the free space $\Lambda \to \pi N$ decay, where the pion was considered as virtual and then absorbed by a bound nucleon.
In the 1960s Block and Dalitz [146–148] developed a phenomenological model, which has been more recently updated [149–151]. Within this approach, some important characteristics of
Table 2 Decay width for a $\Lambda$ in nuclear matter ($\Gamma_T \equiv \Gamma_{NM}$) Ref.
Unc
+ SRC
Adams [154]
3.47
0.38 1.57
Dalitz [148] Cheung et al. [132]
0.99
McKellar–Gibson [155]
4.13 1.13
Oset–Salcedo [166]
4.3
Dubach et al. [125]
3.89
2.31 0.72
+ SRC + FF
Model OPE No tensor SRC
2.0
Contact int.
0.77 3.0
OPE ¿ 0:8 fm Hybrid
1.06 0.10
OPE +!
2.2
PPM
1.82 1.55 1.23
OPE +! OME
Nardulli [156]
0.7–2.1
+!
Alberico et al. [167]
1.74
PPM with 2B
Shinmura [172]
2.92 3.97
OPE Rel OPE
Shinmura [164] Dubach et al. [126] Sasaki et al. [134]
1.73 1.85 4.66
1.85 1.38 2.819 2.068 2.863
OPE + unc 2 OPE OME
1.850 1.216 1.906 2.456
OPE +K OME + K + DQ
the non-mesonic decays (for instance the validity of the $\Delta I = 1/2$ rule) of s-shell hypernuclei can be reproduced in terms of elementary spin-dependent branching ratios for the $\Lambda n \to nn$ and $\Lambda p \to np$ processes, by fitting the available experimental data (see the discussion in Section 6.3). Although this kind of analysis makes use of several delicate assumptions, it has the good feature that it does not require knowledge of the effective Hamiltonian for the reaction mechanism. An interesting empirical conclusion of Ref. [147], never explained on theoretical grounds, is the dominance of the $\Lambda N(^3S_1) \to NN(^3P_1)$ transition, which leads to large $\Gamma_n/\Gamma_p$ ratios (about 1–2) for $^5_\Lambda$He. Following the phenomenological approach, it emerges that, in order to establish the degree of violation of the $\Delta I = 1/2$ rule in the non-mesonic decays of s-shell hypernuclei, we need more precise measurements of $\Gamma_n$ and $\Gamma_p$, especially for $^4_\Lambda$H. With the present data one cannot exclude a large violation of the $\Delta I = 1/2$ rule [150–153]. In Ref. [153] the authors came to the conclusion that the $\Delta I = 1/2$ rule is strongly violated by observing that the experimental lifetimes of heavy hypernuclei (in the region $A \simeq 180$–$220$) seem to favour $\Gamma_n/\Gamma_p$ ratios larger than 2, while $\Gamma_n/\Gamma_p$ should be $\le 2$ if the $\Delta I = 1/2$
Table 3 Non-mesonic decay width for $^{12}_\Lambda$C
Ref.
Unc
Cheung et al. [132]
0.48
+ SRC
Oset–Salcedo [166] Ramos–Bennhold [160]
1.58 4.30
Ramos et al. [168]
+ SRC + FF
Model
0.41 1.28
OPE ¿ 0:8 fm Hybrid
1.5
PPM
0.87 0.98
OPE OME
1.72
PPM with 2B
Parreño et al. [173]
1.641
1.186
0.964
OPE
Parreño et al. [157]
1.716
1.239
1.110 0.991
OPE +!
Dubach et al. [126]
3.4
0.5 0.2
Parreño et al. [123]
1.682 2.055 2.301
1.232
OPE OME 0.885 0.859 0.753
OPE +! OME
Parreño et al. [131]
0.753 0.753– 0.796
OME OME + 5I = 3=2
Itonaga et al. [120]
1.05
+ 2=! + 2=?
Zhou–Piekarewicz [171]
0.413
Rel PPM
Jun et al. [135]
0.468 1.174
OPE OPE + 4BPI
Jido et al. [128]
1.075 0.795 0.769 1.039
OPE +K + K + 2 + ! Full with 2B
Parreño–Ramos [121] (correction of [123])
0.751– 0.762 0.413– 0.485 0.554 – 0.726
OPE +K OME
Exp BNL [93]
1:14 ± 0:20
Exp KEK [101]
0:89 ± 0:18
Exp KEK [174]
0:83 ± 0:11
rule were not violated. However, we point out that, apart from the poor approximations adopted in the calculations of Ref. [153], in the phenomenological analysis the inequality $\Gamma_n/\Gamma_p \le 2$ for nuclear matter is valid if the $\Lambda$ decays by interacting only with s-shell nucleons, while in heavy systems p-shell nucleons are expected to contribute too [see Eqs. (68), (71) and the discussion of paragraph 6.3.1 for $^{12}_\Lambda$C].
Table 4 Non-mesonic decay width for $^5_\Lambda$He

Ref.                                         $\Gamma_{NM}$      Model
Dalitz [148]                                 0.5                Contact int.
Takeuchi et al. [158]                        0.144              OPE
                                             0.033              $\pi + \rho$
Oset–Salcedo [166]                           1.15               PPM
Oset–Salcedo–Usmani [175]                    0.54               PPM
Itonaga et al. [122]                         0.20               OPE
                                             0.30               $\pi + 2\pi/\sigma$
Parreño et al. [173]                         0.56               OPE
Dubach et al. [126]                          0.9                OPE
                                             0.5                OME
Inoue et al. [133]                           0.333              OPE
                                             0.381              DQ only
Parreño et al. [123]                         0.414              OME
Itonaga et al. [120]                         0.39               $\pi + 2\pi/\rho + 2\pi/\sigma$
Inoue et al. [124]                           0.216              OPE
                                             0.627              OPE + DQ
Sasaki et al. [134]                          0.370              OPE
                                             0.302              $\pi + K$
                                             0.519              $\pi + K$ + DQ
Jun et al. [135]                             0.158              OPE
                                             0.426              OPE + 4BPI
Parreño–Ramos [121] (correction of [123])    0.424–0.425        OPE
                                             0.235–0.272        $\pi + K$
                                             0.317–0.425        OME
Exp BNL [93]                                 $0.41 \pm 0.14$
Exp KEK [176]                                $0.50 \pm 0.07$
After the first analysis by Block and Dalitz, microscopic models of the $\Lambda N \to NN$ interaction began to be developed. The first paper, for nuclear matter, including only $L = 0$ $\Lambda N$ relative states, is due to Adams [154] (his numerical results are quoted in Tables 2 and 7). He used the OPE description with $\Delta I = 1/2$ $\Lambda N\pi$ couplings within a Fermi gas model and found a large sensitivity of the decay widths to the $NN$ and $\Lambda N$ short range repulsive correlations. For $\Lambda N$ they were described through the arbitrary insertion, in the two-body transition matrix element, of an analytical function which was an approximation to the exact solution of the Bethe–Goldstone equation with a hard-core potential ($r_{\rm core} \simeq 0.4$ fm). The results obtained were not realistic because the $\Lambda N\pi$ coupling employed
Table 5 Mesonic decay rate for $^{12}_\Lambda$C

Ref.                                         $\Gamma_M$         Model
Oset–Salcedo [166]                           0.41               PPM
Itonaga–Motoba–Bandō [113]                   0.233–0.303        WFM
Ericson–Bandō [178]                          0.229              WFM
Nieves–Oset [111]                            0.245              WFM
Itonaga–Motoba [112]                         0.228              WFM
Ramos et al. [168]                           0.31               PPM
Zhou–Piekarewicz [171]                       0.112              Rel PPM
Exp BNL [93]                                 $0.11 \pm 0.27$
Exp KEK [101]                                $0.36 \pm 0.13$
Exp KEK [177]: $\Gamma_{\pi^-}$              $0.113 \pm 0.014$
  $\Gamma_M$ (with $\Gamma_{\pi^0}$ from [95])   $0.31 \pm 0.07$
Table 6 Mesonic decay rate for $^5_\Lambda$He

Ref.                                         $\Gamma_M$         Model
Oset–Salcedo [166]                           0.65               PPM
Oset–Salcedo–Usmani [175]                    0.54               PPM
Itonaga–Motoba–Bandō [113]                   0.331–0.472        WFM
Motoba et al. [45]                           0.608              WFM + Quark Model
Motoba [179]                                 0.61               WFM
Straub et al. [46]                           0.670              WFM + Quark Model
Kumagai–Fuse et al. [115]                    0.60               WFM
Exp BNL [93]                                 $0.59^{+0.44}_{-0.31}$
was too small to reproduce the free $\Lambda$ lifetime. Taking this into account, Adams' results for $\Gamma_{NM}$ should be multiplied by 6.81, as is done in Table 2.

Afterwards, in order to improve the OPE model, mesons heavier than the pion were introduced as mediators of the $\Lambda N \to NN$ interaction. McKellar and Gibson [155] evaluated the width for a $\Lambda$ in nuclear matter, adding the exchange of the $\rho$-meson and taking into account $\Lambda N$ relative S states only. They calculated the $\Lambda N\rho$ weak vertex (experimentally not accessible) by using the factorization approximation (which, however, contains many ambiguities) and a pole model. The authors assumed the $\Delta I = 1/2$ rule and made the calculation by using the two possible relative signs (unknown at the time and not fixed by their model) between the pion and rho potentials,
Table 7 $\Gamma_n/\Gamma_p$ ratio for nuclear matter

Ref.                                         $\Gamma_n/\Gamma_p$    Model
Adams [154]                                  0.35               OPE
Dubach et al. [125,126]                      0.06               OPE
                                             0.08               $\pi + \rho$
                                             0.34               OME
Nardulli [156]                               0.67–1.25          $\pi + \rho$
Shinmura [172]                               0.255              Rel OPE
Shinmura [164]                               0.07               OPE
                                             0.08               OPE + unc $2\pi$
Sasaki et al. [134]                          0.087              OPE
                                             0.430              $\pi + K$
                                             0.398              OME
                                             0.716              $\pi + K$ + DQ
$V_\pi + |V_\rho|$ and $V_\pi - |V_\rho|$, with very different results in the two cases. In Table 2 the listed results are the ones with the (nowadays fixed) right sign, namely $V_\pi - |V_\rho|$. It is important to note that for mesons heavier than the pion, no experimental indication supports the validity of the $\Delta I = 1/2$ rule for their couplings with baryons (for example, see Ref. [130] for an evaluation of the violation of the $\Delta I = 1/2$ rule in the $\Lambda \to \rho N$ vertex). Some years later, Nardulli [156] determined the relative sign ($-$) between $\pi$ and $\rho$ exchange by using a somewhat different pole model, also implementing the available information from weak non-leptonic and radiative decays. Refs. [155,156] obtained a non-mesonic width in the $(\pi + \rho)$-exchange model smaller than the OPE one. This feature resulted from a destructive interference between the two mesons and was confirmed by later calculations. In [156] the $\Gamma_n/\Gamma_p$ ratio in $(\pi + \rho)$-exchange was sizeably increased with respect to the OPE value (see Table 7). However, more recent calculations [123,157] showed a small effect of the $\rho$-exchange on $\Gamma_n/\Gamma_p$. Takeuchi et al. [158,159] applied a model with $(\pi + \rho)$-exchange to $^4_\Lambda$H, $^4_\Lambda$He and $^5_\Lambda$He, finding quite small non-mesonic rates when pion and rho have negative relative phase (see Table 4). The same result was obtained in Ref. [155] for nuclear matter. More recently [157], a $(\pi + \rho)$ model has been applied to $^{12}_\Lambda$C. The authors found the central potential from $\rho$-exchange (omitted in the previous calculations) to be more important than the tensor part. Moreover, the $\Gamma_n/\Gamma_p$ ratio remained unchanged when the rho-meson was included (see Tables 3 and 8). The conclusion we can draw from the calculations that include the $\rho$-exchange is that the results strongly depend on the model used for the evaluation of the $\Lambda N\rho$ vertex. Nevertheless, today there is a general consensus that the inclusion of the $\rho$ cannot improve the calculation of $\Gamma_n/\Gamma_p$ [123,126].

In 1986 Dubach et al.
[125] introduced an OME model with $\pi$, $\rho$, $K$, $K^*$, $\omega$ and $\eta$, for a calculation in nuclear matter. The $\Gamma_n/\Gamma_p$ ratio is expected to be sensitive to the isospin structure of the transition potential. Therefore, the inclusion of mesons heavier than the pion could give ratios in better agreement with the data. In order to evaluate the meson–baryon–baryon vertices which are not accessible to experiment, a quite large number of different models (pole model, $SU(6)_w$
Table 8 $\Gamma_n/\Gamma_p$ ratio for $^{12}_\Lambda$C

Ref.                                         $\Gamma_n/\Gamma_p$    Model
Ramos–Bennhold [160]                         0.19               OPE
                                             0.27               OME
Parreño et al. [157]                         0.12               OPE
                                             0.12               $\pi + \rho$
Dubach et al. [126]                          0.20               OPE
                                             0.83               OME
Parreño et al. [123]                         0.104              OPE
                                             0.095              $\pi + \rho$
                                             0.068              OME
Parreño et al. [131]                         0.068              OME
                                             0.034–0.136        OME + $\Delta I = 3/2$
Itonaga et al. [120]                         0.10               OPE
                                             0.36               $\pi + 2\pi/\rho + 2\pi/\sigma$
Jun et al. [135]                             0.08               OPE
                                             1.14               OPE + 4BPI
Jido et al. [128]                            0.12               OPE
                                             0.52               $\pi + K$
                                             0.53               $\pi + K + 2\pi + \omega$
Parreño–Ramos [121] (correction of [123])    0.078–0.079        OPE
                                             0.205–0.343        $\pi + K$
                                             0.288–0.341        OME
Exp [138]                                    $0.59 \pm 0.15$
Exp BNL [93]                                 $1.33^{+1.12}_{-0.81}$
Exp KEK [101]                                $1.87^{+0.67}_{-1.16}$
Exp KEK [180]                                $1.17^{+0.22}_{-0.20}$
symmetry, PCAC, Goldberger–Treiman relations, etc.) have been used, also enforcing the $\Delta I = 1/2$ rule. The calculation of the above vertices is strongly model-dependent, and this makes the use of potentials with mesons other than the pion almost impracticable. The above cited paper only reports preliminary results, while the final ones are published in Ref. [126] (see Tables 2–4 and 7–9). Here the model is also extended, within the extreme single particle shell model, to finite nuclei ($^5_\Lambda$He and $^{12}_\Lambda$C). The controversial results presented in [126] caused debate and critical discussions in the literature. Because of the few details available from Ref. [126] (which, on the other hand, does not take into account the hadronic form factors and quotes some results different from the preliminary ones of Ref. [125]), it is not possible to compare the model used in this work for the OME potential with other ones proposed afterwards, for example by Ramos and Bennhold [160] and Parreño et al. [123], where the decay is again treated in a shell model framework. In Ref. [123], differently from [126],
Table 9 $\Gamma_n/\Gamma_p$ ratio for $^5_\Lambda$He

Ref.                                         $\Gamma_n/\Gamma_p$    Model
Itonaga et al. [122]                         0.13               OPE
                                             0.17               $\pi + 2\pi/\sigma$
Inoue et al. [133]                           0.12               OPE
                                             0.95               DQ only
Dubach et al. [126]                          0.05               OPE
                                             0.48               OME
Parreño et al. [123]                         0.073              OME
Inoue et al. [124]                           0.132              OPE
                                             0.489              OPE + DQ
Sasaki et al. [134]                          0.133              OPE
                                             0.450              $\pi + K$
                                             0.701              $\pi + K$ + DQ
Jun et al. [135]                             0.10               OPE
                                             1.30               OPE + 4BPI
Parreño–Ramos [121] (correction of [123])    0.086              OPE
                                             0.288–0.498        $\pi + K$
                                             0.343–0.457        OME
Exp BNL [93]                                 $0.93 \pm 0.55$
Exp KEK [176]                                $1.97 \pm 0.67$
all the possible $\Lambda N$ and $NN$ relative states have been included. The unknown hadronic vertices have been obtained from the pole model, soft meson theorems and $SU(6)_w$ symmetry. The repulsive baryon–baryon correlations were based on Nijmegen and Jülich $\Lambda N$ and $NN$ interactions. The calculation of the unknown hadronic vertices turned out to be model-dependent and the obtained decay rates were very different from the ones of Ref. [126]. Ref. [123] calculated the non-mesonic widths in the OME picture to differ by at most 15% from the OPE ones for $^{12}_\Lambda$C and $^5_\Lambda$He (see Tables 3, 4, 8 and 9). The $\Gamma_n/\Gamma_p$ ratio in the full OME turned out to be 30% smaller than the OPE value for $^{12}_\Lambda$C, in contrast with the improved ratio of Ref. [126], even if it was quite sensitive to the isospin structure of the exchanged mesons (the largest changes corresponding to the inclusion of the strange meson $K$). This was mainly due to the destructive interference between the exchange of mesons with the same isospin [$(\pi, \rho)$; $(K, K^*)$; $(\omega, \eta)$]. Moreover, the contributions of mesons heavier than the pion were suppressed by form factors and short range correlations. Very recently, in Ref. [121], the authors of Ref. [123] corrected a mistake they made in the inclusion of the $K$- and $K^*$-exchange. This correction had the effect of increasing the $\Gamma_n/\Gamma_p$ ratio: $(\Gamma_n/\Gamma_p)^{\rm OME} \simeq 4\,(\Gamma_n/\Gamma_p)^{\rm OPE}$. Including only the $K$-meson in addition to the pion leads to a smaller $\Gamma_n + \Gamma_p$ and significantly enhances the $\Gamma_n/\Gamma_p$ ratio (see Tables 3, 4, 8 and 9). This behaviour has been confirmed by other recent calculations [128,134] (even if the different numerical results are not always compatible), which are
discussed in the following. In [121] the authors also presented a detailed ($T$-matrix) study of the final state interactions acting between the nucleons emitted in the non-mesonic decay.

Recently, the authors of Ref. [118] studied the decay of the hypertriton ($^3_\Lambda$H) in the full OME picture of Ref. [123]. They worked in the framework of the Faddeev equation, which allows one to calculate exactly (at least in principle) wave functions and final scattering states for three-body systems. They reproduced the experimental separation energy and the total lifetime of $^3_\Lambda$H, obtaining $\Gamma_T/\Gamma^{\rm free}_\Lambda = 1.03$. The non-mesonic width was found to be 1.7% of the free $\Lambda$ decay rate, only a little less than in the calculation with pion-exchange alone. In view of the corrections made in Ref. [121] to the OME model of Ref. [123], the results obtained in Ref. [118] should be updated. The OME model of Refs. [121,123] has also been employed, in Ref. [161], to discuss the effects of nuclear deformation on the non-mesonic decay of p-shell hypernuclei. By using the Nilsson model with realistic values of the deformation parameter, the authors found that, due to nuclear deformation, both $\Gamma_n + \Gamma_p$ and $\Gamma_n/\Gamma_p$ can change by at most about 10% with respect to the spherical limit.

In addition to the exchange of one pion, other authors have included in the non-mesonic transition potential the exchange of two pions (correlated [120,122,127] or not [162–164] into $\sigma$ and $\rho$ resonances). In Ref. [162] the two-pion-exchange mechanism contains the $\Sigma N$ intermediate state, and the $\Delta I = 1/2$ rule is enforced in the $\Lambda N\pi$ and $\Sigma N\pi$ vertices. The intermediate $NN$ state has to be excluded in order to avoid double counting when the uncorrelated $2\pi$-exchange is employed in connection with short range correlations. The authors of [162] found that the $\Sigma$ component further reduces (with respect to the OPE value) the $\Gamma_n/\Gamma_p$ ratio. On the contrary, a 15% increase of the OPE ratio was found in Ref.
[164], due to the $\Sigma N$ intermediate state in uncorrelated $2\pi$-exchange (see Tables 2 and 7). From the results of Ref. [163] on the $\Lambda N \to NN$ matrix elements we can point out that the inclusion of both the $\Sigma N$ and $\Delta N$ intermediate states in the $2\pi$-exchange could sizeably increase the OPE $\Gamma_n/\Gamma_p$ ratio. This conclusion comes from the observation that large $^1S_0 \to {}^1S_0$ transitions were obtained for uncorrelated $2\pi$-exchange. The same finding about the importance of $^1S_0 \to {}^1S_0$ transitions comes from Ref. [127] for correlated $2\pi$-exchange. In Ref. [120], by employing correlated two-pion-exchange ($2\pi/\rho + 2\pi/\sigma$) in addition to the OPE, an improvement of $\Gamma_n/\Gamma_p$ for $^{12}_\Lambda$C was found, together with $\Delta I = 3/2$ contributions (introduced by the boson–boson coupling model) less important than the $\Delta I = 1/2$ ones (see Tables 3, 4 and 8). The authors also studied the $A$-dependence of the non-mesonic decay rate and reproduced the data for light hypernuclei but not the saturation of $\Gamma_{NM}$ at large $A$.

The baryon–baryon short range interactions have been studied by Cheung et al. [132] by using a hybrid model, in which the decay is described by two separate mechanisms: the long range interactions ($r > 0.8$ fm) are treated in terms of hadronic degrees of freedom (OPE with the $\Delta I = 1/2$ rule), while the short range interactions (which cannot be explained in terms of meson exchange) are described by a 6-quark cluster model, which includes both $\Delta I = 1/2$ and $\Delta I = 3/2$ components (see Tables 2 and 3). In Ref. [129], Maltman and Shmatikov combined, again in a hybrid model, an OME potential containing $(\pi + K)$-exchange for long distance interactions with a quark model picture at short distances. By also employing the effective weak Hamiltonian of Ref. [109], modified by perturbative QCD effects, the authors obtained significant violations of the $\Delta I = 1/2$ rule in the $J = 0$ $\Lambda N \to NN$ amplitudes. As they pointed out, this should sizeably modify the value of $\Gamma_n/\Gamma_p$ in nuclei. In Ref. [130] the same authors evaluated the $\Delta I = 3/2$ contribution to the $\Lambda N\rho$
coupling by using the factorization approximation and obtained for it a magnitude comparable with the $\Delta I = 1/2$ contribution.
More recently, Inoue et al. [124,133] treated the non-mesonic decay in s-shell hypernuclei within a direct quark model combined with the OPE description (enforcing here the $\Delta I = 1/2$ rule). In their model the $NN$ and $\Lambda N$ repulsion at short distance originates from quark exchange between baryons (induced by the quark antisymmetrization) and gluon exchange between quarks. The main uncertainties in this kind of approach come from the parameterization of the effective weak Hamiltonian for quarks, obtained through the so-called operator product expansion [109], which contains perturbative QCD effects and, by construction, terms associated with both $\Delta I = 1/2$ and $\Delta I = 3/2$ transitions. The authors found that the direct quark (DQ) mechanism was significant, giving sizeable $\Delta I = 3/2$ contributions in the $J = 0$ channel, in agreement with Ref. [129]. The results on the $\Gamma_n/\Gamma_p$ ratio are more consistent with experiment (even if sizeable discrepancies still remain), because of a large increase (with respect to the OPE) of the neutron-induced decay rate (see Tables 4 and 9). Unfortunately, the calculation is only made for s-shell hypernuclei (and, as we will mention just below, for symmetric nuclear matter in [134]), and the employed quark Hamiltonian is not able to reproduce the large ratio between the $\Delta I = 1/2$ and $\Delta I = 3/2$ amplitudes observed in the free $\Lambda$ decay. At present, the data on hypernuclei do not allow the extraction of the $\Delta I = 3/2$ amplitude in $\Lambda N \to NN$ (see the discussion of Section 6.3). Notice, however, that in Ref. [165] Oka showed that the $\Delta I = 3/2$ component is probed, in a clean way, by the soft $\pi^+$ emission observed in the nuclear $\Lambda$ decay ($\Lambda \to \pi^0 n$ followed by $\pi^0 p \to \pi^+ n$), because of the absence of the $\Delta I = 1/2$ component. Another delicate point in models with a direct quark contribution is related to double counting and superposition problems, which could arise when both the quark and hadronic descriptions are employed together.
However, this does not seem to be a problem in the calculations by Inoue et al., because their relativistic formalism does not allow the exchange of quark–antiquark pairs in the direct quark baryon–baryon interaction. Moreover, in the soft pion limit, they determined the relative phase between the OPE and the direct quark contributions. Apart from the quoted problems, it would be interesting to establish the connection between the quark effective weak Hamiltonian and the phenomenological weak vertex. On the other hand, the possibility that subnucleonic degrees of freedom play a role in nuclear systems remains an interesting field of research, and the non-mesonic decay of $\Lambda$-hypernuclei could reveal itself as the only good tool to study such effects. The systematic measurement of the partial non-mesonic rates will be useful in distinguishing the different decay mechanisms (meson exchange and direct quark interactions).

Very recently [134], the direct quark mechanism has been combined with a full OME potential ($\pi$, $\rho$, $K$, $K^*$, $\eta$, $\omega$), for calculations in nuclear matter and in s-shell hypernuclei (see Tables 2, 4, 7 and 9). The authors compared the OME model with the light mesons ($\pi$, $K$) + direct quark model: the short range repulsion is given, respectively, by heavy-meson-exchange and by the direct quark mechanism. Heavy-meson-exchange and direct quark contributions employed together could cause double counting problems: in any case, the authors found that the OME + DQ description does not improve the results. Both the previous pictures, namely OME and $\pi + K$ + DQ, gave the best results of the calculation: also the $\Gamma_n/\Gamma_p$ ratio is significantly improved when the $\pi + K$ + DQ model is employed, both for $^5_\Lambda$He hypernuclei and nuclear matter (considered as an approximate description of heavy hypernuclei).

Effects of a violation of the $\Delta I = 1/2$ rule in $\Lambda N \to NN$ have been studied in Ref. [131] by Parreño et al. The OME model employed is the same as that of Ref.
[123], with hadronic couplings evaluated in the factorization approximation. The conclusion reached by the authors is different from that of Inoue et al. [124]: even by introducing large $\Delta I = 3/2$ contributions (of the order of the $\Delta I = 1/2$
ones), to take into account the ambiguities in the factorization approximation, the calculated $\Gamma_n/\Gamma_p$ ratio for $^{12}_\Lambda$C remains well below the experimental values (see Tables 3 and 8). We recall that the OME model employed in Refs. [123,131] has been recently corrected in Ref. [121]. A similar correction could affect the conclusions obtained with a $\Delta I = 3/2$ contribution.

Concerning the $\Gamma_n/\Gamma_p$ puzzle we quote here a different mechanism, which has been suggested by the authors of Ref. [135]. In order to describe the process $\Lambda N \to NN$, they employed, in addition to the OPE at large distances, a phenomenological 4-baryon point interaction for short range interactions, including $\Delta I = 3/2$ contributions as well. Such a 4-baryon interaction was initially considered by Block and Dalitz [147] as an approximation of the short range interactions mediated by heavy mesons. By properly fixing the different phenomenological coupling constants of the problem (in particular, by using a small $\Delta I = 3/2$ component), the authors of Ref. [135] could fit fairly well both the experimental $\Gamma_n + \Gamma_p$ and $\Gamma_n/\Gamma_p$ in $^4_\Lambda$He, $^5_\Lambda$He and $^{12}_\Lambda$C (see Tables 3, 4, 8 and 9).

Up to this point we have essentially discussed theoretical evaluations of the non-mesonic hypernuclear decay, which is the dominant channel for all hypernuclei except the s-shell ones. Concerning the mesonic decay, Refs. [45,46] presented a study of the $^5_\Lambda$He decay using a quark model based hypernuclear wave function. The authors have shown how the short range $\Lambda N$ repulsion (which naturally arises in a quark model) is relevant to reproduce the observed mesonic rates in s-shell hypernuclei (see Table 6). In 1993, Nieves and Oset [111] calculated the mesonic widths for a broad range of $\Lambda$-hypernuclei (from $^{12}_\Lambda$C to $^{209}_\Lambda$Pb). They used a shell model picture and distorted pionic wave functions, solutions of a pion–nucleus optical potential.
The results showed wild oscillations of $\Gamma_{\pi^-}/\Gamma_{\pi^0}$ around the value (equal to 2) predicted by the $\Delta I = 1/2$ rule for $N = Z$ closed shell hypernuclei. This was due to effects of the hypernuclear shell structure. Similar calculations have been carried out by Itonaga et al. in Refs. [113,159]. With respect to Ref. [111], they use different optical potentials and descriptions of the energy balance in the decays (more accurate in [111]), and obtain somewhat dissimilar results (especially in very heavy systems). Motoba and Itonaga updated the calculations in Ref. [112] by using an improved optical potential (see Tables 5 and 6). Both the evaluations of Nieves–Oset and Motoba–Itonaga–Bandō showed how the mesonic rate strongly depends on the competition between the Pauli blocking, which suppresses the decay, and the enhancement due to the pion wave distortion in the medium. When the pion wave is distorted by the optical potential, for $A > 100$ the mesonic width is enhanced by one to two orders of magnitude with respect to the calculation without pion distortion. For the decay into $\pi^- p$ the Coulomb distortion alone gives rise to a non-negligible enhancement. The results of the above calculations for light to heavy hypernuclei are shown in Fig. 13 of Section 5.6.

A different approach, which allows a unified treatment of mesonic and non-mesonic channels and automatically includes all the partial waves of the relative $\Lambda N$ motion, has been suggested by Oset and Salcedo [166] (see Tables 2–6) and utilizes the random phase approximation (RPA) within the framework of the polarization propagators. We shall discuss this method in detail in the next section. Here we only remind the reader that the crucial point for the evaluation of the decay rates is a realistic description of the pion self-energy in the medium and (especially for the evaluation of $\Gamma_{NM}$) of the baryon–baryon short range correlations.
More recently, this model has been applied to the calculation, in nuclear matter, of the three-body decay $\Lambda NN \to NNN$ [167] (see Table 2), through a purely phenomenological parameterization (by means of data on deep inelastic $(e,e')$ scattering and pionic atoms) of the 2p–2h excitations in the pion self-energy. A more detailed analysis of $\Gamma_2$, also implemented in finite nuclei via the local density approximation, has been made in Ref. [168]
(see Tables 3 and 5). Here, the authors employed a more realistic 2p–2h polarization propagator, based again on an empirical analysis of pionic atoms but also extended to kinematical regions not accessible by this phenomenology. The introduction of a new non-negligible (as found in the above mentioned calculations) two-nucleon induced non-mesonic channel requires a reanalysis of the $\Gamma_n/\Gamma_p$ ratio. The most recent calculations performed within the polarization propagator method can be found in Refs. [128,169,170]. They reproduce quite well both the mesonic and non-mesonic rates for light to heavy hypernuclei, although problems related to the $\Gamma_n/\Gamma_p$ ratio still remain. The results obtained in [169,170] will be discussed in detail in Sections 5.6 and 5.7; those of Ref. [128] are listed in Tables 3 and 8. In Ref. [128] the one-nucleon induced non-mesonic decay has been studied within a meson-exchange approach including one pion, one kaon, correlated and uncorrelated $2\pi$-exchange and $\omega$-exchange ($\pi + K + 2\pi/\sigma$ + unc $2\pi$ + $\omega$). The correlated $2\pi$-exchange (in the $\sigma$ channel) has been treated in terms of a chiral unitary approach to the $\pi\pi$ scattering. For the $NN$ interaction this approach leads to a $\sigma$-meson-exchange potential with a moderate attraction at $r \gtrsim 0.9$ fm and a repulsion at shorter distances, in contrast with the attraction of the conventional $\sigma$-exchange. Once the correlated and uncorrelated $2\pi$-exchange are added, a net $NN$ attraction is obtained for all distances. In order to restore the behaviour of realistic $NN$ potentials, which present a moderate attraction only at intermediate distances, the authors of Ref. [128] introduced the exchange of the $\omega$-meson to produce the required repulsion. A large cancellation between $\sigma$-exchange and uncorrelated $2\pi$-exchange has been found for momenta around the relevant value of 420 MeV. Consequently, the total $2\pi$-exchange contribution to the decay turned out to be small (around 10% on $\Gamma_n$ and $\Gamma_p$).
The $\omega$-exchange gave a contribution of the same order of magnitude. On the contrary, the $K$-exchange, also constrained by chiral unitary theory, has been of primary importance to reproduce the experimental non-mesonic rate $\Gamma_{NM} = \Gamma_n + \Gamma_p$ and to improve the OPE $\Gamma_n/\Gamma_p$ ratio for $^{12}_\Lambda$C: $(\Gamma_n/\Gamma_p)^{\rm Full} \simeq 4.5\,(\Gamma_n/\Gamma_p)^{\rm OPE}$. In Ref. [171], still within the polarization propagators framework, by using a relativistic mean-field approximation to the Walecka model, the authors evaluated the ring OPE non-mesonic decay widths to be considerably smaller than the non-relativistic ones of Refs. [128,169,170] (see Tables 3 and 5 and Sections 5.6 and 5.7). This also seems to be unrealistic when compared with the findings of Ref. [172]. Here, by employing the Walecka model within the wave function formalism, the relativistic OPE calculation gave nuclear matter non-mesonic rates larger (by about 40%) than the non-relativistic ones (see Tables 2 and 7). However, we remind the reader that this calculation does not include the effects of vertex form factors and short range correlations, which significantly reduce the non-mesonic rates, both in the relativistic and non-relativistic descriptions.

In Tables 2–9 the numerical results obtained within the models discussed above are summarized and compared with experimental data. The decay widths are in units of the free $\Lambda$ width.

4.2.1. Table 2. Decay width for a $\Lambda$ in nuclear matter

The results of Adams are corrected for the small $\Lambda N\pi$ coupling constant he used, as explained above. All the uncorrelated OPE decay widths are compatible with a value of about 4. The result by Cheung et al. is sizeably smaller than 4, but we recall that in their calculation the pion-exchange is only active for $r > 0.8$ fm, while a large OPE contribution comes from smaller distances. This is equivalent to using very strong short range correlations (SRC), which prevent the process for $r < 0.8$ fm.
Differences among the various calculations are observed when the effects of SRC and form factors (FF) are included in the OPE models. They reduce the uncorrelated widths by
W.M. Alberico, G. Garbarino / Physics Reports 369 (2002) 1 – 109
a factor $\gtrsim 2$. Adams used an inappropriate (too strong) correlation for the tensorial transition $^3S_1 \to {}^3D_1$. Neglecting the tensorial SRC, his correlated result (1.57) is more realistic. The differences among the other calculations may be understood by taking into account the parameterizations used for SRC and FF. For example, in the polarization propagator method (PPM) of Ref. [166], a monopole FF with cut-off $\Lambda_\pi = 1.3$ GeV is used, while in Ref. [155] a stronger FF is employed ($\Lambda_\pi \simeq 0.6$ GeV). This is responsible for the ratio 2.1 between the results of Refs. [166] and [155]. The inclusion of the $\rho$-exchange in the transition potential decreases the decay rate (this characteristic has been confirmed in finite nucleus calculations): in Ref. [155] the $\rho$-meson leads to an unrealistic, almost complete cancellation of the OPE contribution. The results of Nardulli refer to different choices for the FF. Also the one-meson-exchange (OME) models (we refer, here and in the following, to OME models when the transition potential contains the exchange of $\pi$, $\rho$, $K$, $K^*$, $\omega$ and $\eta$ mesons) tend to reduce the rate with respect to the pure OPE calculation. This is also true, as we shall see in the next tables, for finite hypernuclei. In particular, the K-meson-exchange considerably cancels the OPE contribution [134]. From inspection of the experimental data on heavy hypernuclei one concludes that realistic values of the decay rate in nuclear matter lie in the range 1.5–2.

4.2.2. Table 3. Non-mesonic decay width for $^{12}_{\Lambda}$C

The OPE results of Cheung et al. underestimate the experiment for the same reason explained in connection with the calculation in nuclear matter. Note that here the reduction obtained in going from the uncorrelated case to the correlated one is even smaller than what occurred in nuclear matter. In this calculation, the SRC in OPE play a minor role because the $\pi$-exchange is only active for distances $r > 0.8$ fm. However, the complete result (hybrid model) of Ref. [132] is realistic.
The relativistic polarization propagator method (Rel PPM) of Ref. [171] predicts too small a decay rate. On the contrary, the non-relativistic PPMs of Refs. [166,168] overestimate the data, although the decay rates are reduced when an improved $\Lambda$ wave function is used [169,170], and particularly if other SRC in the strong interaction and weak transition are used, as done in Refs. [169,170]. These results will be discussed in Sections 5.6 and 5.7. Antisymmetrization of the final nucleons, as in Ref. [128], would also moderately decrease the non-mesonic rates. The calculation by Dubach et al. [126] provides too large an uncorrelated OPE rate [the reduction with respect to their calculation for nuclear matter (3.89 in Ref. [125] and 4.66 in Ref. [126]) is too small] and too small correlated results, both in OPE and in the full OME. The available computational details of Refs. [125,126] do not allow one to explain these controversial results. All the other correlated OPE calculations (apart from the case of Ref. [135]) are compatible with the experiment and give rates reduced with respect to the uncorrelated ones by a factor 1.5–2. The $\pi + \rho$, $\pi + 2\pi/\rho + 2\pi/\sigma$ and OME rates are quite similar to the OPE estimates. In Ref. [131], $\Delta I = 3/2$ contributions to the $\Lambda N \to NN$ transition are evaluated in the factorization approximation: their effect on the non-mesonic rate seems very small. In Ref. [128], the authors used the polarization propagator method with ($\pi + K + 2\pi/\sigma + \mathrm{unc}\ 2\pi + \omega$)-exchange. The result for the one-nucleon induced non-mesonic rate of the full calculation is reduced with respect to the OPE value by about 30%. This is due, almost completely, to K-exchange. The full result including the two-body induced contribution (2B) has been obtained by adding the value $\Gamma_2 = 0.27$ obtained in Ref. [168]. Realistic calculations supply non-mesonic widths in $^{12}_{\Lambda}$C reduced by a factor 1.5–2 with respect to the values for nuclear matter. The results of Parreño and Ramos of Ref. [121] correct those of Ref. [123] (due to a mistake in the inclusion of the $K$ and $K^*$ contributions) and
correspond to the use of different Nijmegen models [7,8] for the hadronic coupling constants. The authors also made an accurate evaluation of the final state interactions between the outgoing nucleons, by using the scattering NN wave function from the Lippmann–Schwinger (T-matrix) equation obtained with the Nijmegen NN potentials. The K-exchange decreases the rate $\Gamma_n + \Gamma_p$ with respect to the one calculated in OPE by about 26% in Ref. [128] and 37–45% in Ref. [121].

4.2.3. Table 4. Non-mesonic decay width for $^{5}_{\Lambda}$He

In this and in the following tables, only the results obtained including FF and SRC are listed. Unrealistic rates are predicted by Refs. [158,166]. The result of Ref. [158] presents a strong cancellation between $\pi$- and $\rho$-exchange. In [166] the authors overestimated $\Gamma_{NM}$ because they employed a wave function for the hyperon that overlaps too strongly with the nuclear core. We remind the reader that $\Lambda$–$^4$He potentials consistent with experimental observations have a repulsive core. By using the same model, with a more realistic $\Lambda$ wave function (calculated from a variational method), the same authors obtained [175] a non-mesonic width compatible with the experiment. There are remarkable differences among the several OPE estimates, ranging from 0.144 (Takeuchi et al.) to 0.9 (Dubach et al.). Because of the lack of technical details, the calculation of Dubach et al. cannot be easily compared with the other ones. We remark that they do not take into account the FF, which reduce the non-mesonic width, especially the OPE one. The large difference between their OPE and OME results could originate from a double counting between heavy-meson-exchange and SRC. It is also rather strange that the uncorrelated OPE result of Dubach et al. (0.6, not shown in the table) is smaller than the correlated one (0.9). Another point to recall is that in Ref.
[126] the correlated OPE and OME non-mesonic rates for $^{12}_{\Lambda}$C (Table 3) are smaller than the corresponding rates for $^{5}_{\Lambda}$He, while, from experiment, we know that $\Gamma_{NM}(^{12}_{\Lambda}{\rm C}) \simeq 2\,\Gamma_{NM}(^{5}_{\Lambda}{\rm He})$. The calculations by Inoue et al. [124,133] and Sasaki et al. [134] show different OPE results. They can be understood in terms of the different FF and SRC employed. The calculation of Ref. [121] is an update of that presented in [123]: the intervals shown correspond to the use of different Nijmegen models for the hadronic coupling constants. We note that for ($\pi + K$)-exchange the results of Ref. [121] are substantially compatible with the value of Ref. [134]. The reduction of the $\pi + K$ rate with respect to the OPE one is larger in Ref. [121] (36–45%) than in Ref. [134] (26%).

4.2.4. Table 5. Mesonic decay rate for $^{12}_{\Lambda}$C

The results reported in the table are all compatible with the data, which, however, have very large error bars. The only exception is the calculation of Ref. [171], which supplies a decay rate that underestimates the recent KEK data [177]. The estimates obtained with the wave function method (WFM) of Refs. [111–113] are consistent with the experimental ratio $\Gamma_{\pi^0}/\Gamma_{\pi^-} \simeq 1$–$2 > (\Gamma_{\pi^0}/\Gamma_{\pi^-})^{\rm free} = 1/2$, which reflects the particular nuclear shell structure of $^{12}_{\Lambda}$C.

4.2.5. Table 6. Mesonic decay rate for $^{5}_{\Lambda}$He

The theoretical results agree with the experimental data. This is also true for $\Gamma_{\pi^-}/\Gamma_{\pi^0}$, which does not deviate much from the $\Delta I = 1/2$ value (= 2) for free decays. We expect this result, since $^{5}_{\Lambda}$He has a closed shell core with $N = Z$. A repulsive core in the $\Lambda$–$\alpha$ mean potential (used in all but the calculation of Ref. [166]) is favoured. Moreover, it comes out naturally in the quark model descriptions of Refs. [45,46]. The results of Refs. [113,179] refer to the use of different pion–nucleus optical potentials.
4.2.6. Table 7. $\Gamma_n/\Gamma_p$ ratio for nuclear matter

The OPE ratios of Adams [154] and Shinmura [172] seem unrealistic: in fact, they are considerably larger than the other OPE estimates. We note, however, that Adams' (Shinmura's) calculation did not include hadronic FF (SRC and FF). The ($\pi + \rho$) calculation by Nardulli supplies values of $\Gamma_n/\Gamma_p$ (the interval in the table corresponds to the use of different FF) close to the experimental indication for $^{12}_{\Lambda}$C. However, no other estimate that employed a ($\pi + \rho$)-exchange potential has confirmed an important role of the $\rho$-meson in the calculation of $\Gamma_n/\Gamma_p$. In Refs. [125,126,134] the introduction of heavier mesons supplies improved ratios: a great improvement, due to both the exchange of the K-meson and the DQ process, has been found by Sasaki et al. [134].

4.2.7. Table 8. $\Gamma_n/\Gamma_p$ ratio for $^{12}_{\Lambda}$C

All the calculations but the ones of Refs. [120,121,126,128,135] strongly underestimate the observed ratios. However, we must notice that the various data have very large error bars and there are still problems with the methods employed by the experiments to extract $\Gamma_n/\Gamma_p$ (see the discussion of Section 6). In Ref. [135], in addition to the OPE at large distances, a 4-baryon point interaction (4BPI), including $\Delta I = 3/2$ contributions as well, is employed to describe the short range interactions through a purely phenomenological model which fits the partial non-mesonic rates for light hypernuclei. However, the values of some of the parameters used in this model are questionable. The large $\Gamma_n/\Gamma_p$ ratio obtained by Dubach et al. in OME is not confirmed by the calculations of Refs. [121,123,160]. Moreover, we note that the calculation of Dubach et al. obtains a realistic $\Gamma_n/\Gamma_p$ but strongly underestimates $\Gamma_n + \Gamma_p$ (see Table 3). Also surprising is the large difference between the results of Ref. [126] for $^{12}_{\Lambda}$C and nuclear matter (see Table 7). The OME calculation in Ref. [123] overestimates $\Gamma_p$ and underestimates $\Gamma_n$: $\Gamma_p \simeq 2\,\Gamma_p^{\rm exp}$, $\Gamma_n \simeq 0.1\,\Gamma_n^{\rm exp}$ (we refer, here, to the data of Ref. [93]). In Ref. [121], the results of [123] have been corrected for a mistake made in the inclusion of the strange-meson exchange (a sign error in certain transitions mediated by $K$- and $K^*$-exchange). The new calculation shows an improvement of the OME $\Gamma_n/\Gamma_p$ ratio, mainly due to K-exchange. The results quoted in the table have been obtained by means of different models for the calculation of the unknown hadronic vertices and by using the Lippmann–Schwinger equation to obtain the scattering wave function for the final NN states. In Ref. [131], by introducing $\Delta I = 3/2$ contributions in the OME $\Lambda N \to NN$ transition amplitude (OME + $\Delta I = 3/2$) of Ref. [123] (which, we remind the reader, contains the error discussed above), variations of $\Gamma_n$ only have been obtained. The inclusion of correlated $2\pi$-exchange in [120] (both in the $\sigma$ and $\rho$ channels) improves the calculated ratio. In Ref. [128], thanks to the K-exchange, a significant improvement of the OPE ratio has been obtained. The two-pion-exchange (correlated in the $\sigma$ channel and uncorrelated) as well as the $\omega$-exchange turned out to have small effects on the decay rates. The ($\pi + K$) calculation of Ref. [128] provides a ratio about 52% larger than the maximum value obtained in Ref. [121].

4.2.8. Table 9. $\Gamma_n/\Gamma_p$ ratio for $^{5}_{\Lambda}$He

Also for $^{5}_{\Lambda}$He, apart from the phenomenological fit of Ref. [135] and the ($\pi + K$ + DQ) calculation of Sasaki et al. [134], the theory underestimates the experiment. In Refs. [124,133] Inoue et al. showed that the direct quark (DQ) mechanism is an important ingredient in the evaluation of $\Gamma_n/\Gamma_p$. The calculation of Sasaki et al. [134] found a large improvement of the ratio, due to the combined effects of K-exchange and DQ mechanism. However, this model tends to overestimate the
observed total non-mesonic rates for heavy hypernuclei (see results for nuclear matter in Table 2). As explained by the authors, this effect could originate from the fact that the short range baryon–baryon correlations used in the calculation were not sufficiently strong. The results of Ref. [123] have been revisited in Ref. [121]: here, in addition to a correction of an error in the previous OME calculation, the authors made a detailed analysis of the final state NN interactions and found a considerable improvement of the ratio. The ($\pi + K$) calculation of this paper agrees with that of Ref. [134]. The theoretical calculations quoted in the tables for the non-mesonic decay show that further efforts (both on the theoretical and experimental side) must be focused on a better understanding of the detailed dynamics of this channel. Some models find an overall agreement with the experimental total non-mesonic rates, but for the partial rates, neutron- and proton-induced, there are large discrepancies. Only the calculations of Refs. [120,121,128,134,135] obtained improved $\Gamma_n/\Gamma_p$ ratios as well as realistic total rates. Recent calculations showed the importance of both the K-meson-exchange and the direct quark mechanism [121,128,134] for a considerable improvement of $\Gamma_n/\Gamma_p$. On the other hand, the mesonic widths are well explained by the proposed models.
5. Models for calculation

5.1. Introduction

In this section we present the frameworks utilized in the literature for the formal derivation of $\Lambda$ decay rates in nuclei. In Sections 5.2 and 5.3 we discuss the general features of the approach used for direct finite nucleus calculations. It is usually called the wave function method (WFM) and it has been employed by a large part of the authors [111,112,121,123,126,134]. This method makes use of shell model nuclear and hypernuclear wave functions (both at hadronic and quark level) as well as pion wave functions generated by pion–nucleus optical potentials. In Section 5.4 the polarization propagator method (PPM), applied for the first time to hypernuclear decay in Ref. [166] and subsequently in Refs. [128,167–171], is summarized. We shall see how the decay widths can be evaluated, in nuclear matter, by means of a many-body description of the hyperon self-energy. The local density approximation (LDA) then allows one to implement the calculation in finite nuclei. Finally, a microscopic approach, based again on the PPM, is presented in Section 5.5: here the Feynman diagrams contributing to the $\Lambda$ self-energy are classified by means of a functional integral approach, according to the prescriptions of the so-called bosonic loop expansion. The numerical results of the literature obtained with WFM and PPM calculations have already been discussed in the previous section. Those obtained by the authors of the present review by applying the formalism of Sections 5.4 and 5.5 are the subject of Sections 5.6 and 5.7.

5.2. Wave function method: mesonic decay

The weak effective Hamiltonian for the $\Lambda \to \pi N$ decay can be parameterized in the form:

$$ \mathcal{H}^W_{\Lambda\pi N} = i G m_\pi^2\, \bar\psi_N (A + B\gamma_5)\, \vec\tau\cdot\vec\phi_\pi\, \psi_\Lambda , \tag{5} $$
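where the values of the weak coupling constants $G = 2.211\times 10^{-7}/m_\pi^2$, $A = 1.06$ and $B = -7.10$ are fixed on the free $\Lambda$ decay. The constants $A$ and $B$ determine the strengths of the parity-violating and parity-conserving $\Lambda \to \pi N$ amplitudes, respectively. In order to enforce the $\Delta I = 1/2$ rule (which fixes $\Gamma^{\rm free}_{\pi^-}/\Gamma^{\rm free}_{\pi^0} = 2$), in Eq. (5) the hyperon is assumed to be an isospin spurion with $I = 1/2$, $I_z = -1/2$. In the non-relativistic approximation, the free $\Lambda$ decay width $\Gamma^{\rm free}_\Lambda = \Gamma^{\rm free}_{\pi^-} + \Gamma^{\rm free}_{\pi^0}$ is given by

$$ \Gamma^{\rm free}_\alpha = c_\alpha (G m_\pi^2)^2 \int \frac{d\vec q}{(2\pi)^3\, 2\omega(\vec q)}\; 2\pi\, \delta[m_\Lambda - \omega(\vec q) - E_N] \left( S^2 + \frac{P^2}{m_\pi^2}\,\vec q^{\,2} \right) , $$

where $c_\alpha = 1$ for $\Gamma_{\pi^0}$ and $c_\alpha = 2$ for $\Gamma_{\pi^-}$ (expressing the $\Delta I = 1/2$ rule), $S = A$, $P = m_\pi B/(2m_N)$, whereas $E_N$ and $\omega(\vec q)$ are the total energies of nucleon and pion, respectively. One then easily finds the well-known result:

$$ \Gamma^{\rm free}_\alpha = c_\alpha (G m_\pi^2)^2\, \frac{1}{2\pi}\, \frac{m_N\, q_{\rm c.m.}}{m_\Lambda} \left( S^2 + \frac{P^2}{m_\pi^2}\, q_{\rm c.m.}^2 \right) , $$

which reproduces the observed rates. In the previous equation, $q_{\rm c.m.} \simeq 100$ MeV is the pion momentum in the center-of-mass frame. In a finite nucleus approach, the mesonic width $\Gamma_M = \Gamma_{\pi^-} + \Gamma_{\pi^0}$ is calculable by means of the following formula:

$$ \Gamma_\alpha = c_\alpha (G m_\pi^2)^2 \sum_{N \notin F} \int \frac{d\vec q}{(2\pi)^3\, 2\omega(\vec q)}\; 2\pi\, \delta[E_\Lambda - \omega(\vec q) - E_N] \left\{ S^2 \left| \int d\vec r\, \phi_\Lambda(\vec r)\, \phi_\pi(\vec q,\vec r)\, \phi_N^*(\vec r) \right|^2 + \frac{P^2}{m_\pi^2} \left| \int d\vec r\, \phi_\Lambda(\vec r)\, \vec\nabla \phi_\pi(\vec q,\vec r)\, \phi_N^*(\vec r) \right|^2 \right\} , $$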
where the sum runs over non-occupied nucleonic states, and $E_\Lambda$ is the hyperon total energy. The $\Lambda$ and nucleon wave functions $\phi_\Lambda$ and $\phi_N$ are obtainable within a shell model. The pion wave function $\phi_\pi$ corresponds to an outgoing wave, solution of the Klein–Gordon equation with proper pion–nucleus optical potential $V_{\rm opt}$:

$$ \left\{ \vec\nabla^2 - m_\pi^2 - 2\omega V_{\rm opt}(\vec r) + [\omega - V_C(\vec r)]^2 \right\} \phi_\pi(\vec q,\vec r) = 0 , $$

where $V_C(\vec r)$ is the nuclear Coulomb potential and the energy eigenvalue $\omega$ depends on $\vec q$. Different calculations [111–113] have shown how strongly the mesonic decay depends on the pion–nucleus optical potential, which can be parameterized in terms of the nuclear density, as discussed in Refs. [112,113], or evaluated microscopically, as in Ref. [111].

5.3. Wave function method: non-mesonic decay

Within the meson-exchange mechanism, the weak transition $\Lambda N \to NN$ is assumed to proceed via the mediation of virtual mesons of the pseudoscalar ($\pi$, $\eta$ and $K$) and vector ($\rho$, $\omega$ and $K^*$) octets [121,123,126,134] (see Fig. 5). Two-pion-exchange has been considered in the literature as well [120,127,162,164]. The fundamental ingredients for the calculation of the $\Lambda N \to NN$ transition within an OME model are the weak and strong hadronic vertices. The $\Lambda N\pi$ weak Hamiltonian is given in Eq. (5). For the
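strong $NN\pi$ Hamiltonian one has the usual pseudoscalar coupling:

$$ \mathcal{H}^S_{NN\pi} = i g_{NN\pi}\, \bar\psi_N \gamma_5\, \vec\tau\cdot\vec\phi_\pi\, \psi_N , $$

$g_{NN\pi}$ being the strong coupling constant for the $NN\pi$ vertex. In momentum space, the non-relativistic transition potential in the OPE approximation is then

$$ V_\pi(\vec q) = -G m_\pi^2\, \frac{g_{NN\pi}}{2m_N} \left( A + \frac{B}{2\bar m}\, \vec\sigma_1\cdot\vec q \right) \frac{\vec\sigma_2\cdot\vec q}{\vec q^{\,2} + m_\pi^2}\; \vec\tau_1\cdot\vec\tau_2 , $$

where $\bar m = (m_\Lambda + m_N)/2$ and $\vec q$ is the momentum of the exchanged pion (directed towards the strong vertex), whose static free propagator is $-(\vec q^{\,2} + m_\pi^2)^{-1}$. One can ignore relativistic effects and use the above non-relativistic potential for calculations [117]. Given the large momentum ($\simeq 420$ MeV) exchanged in the $\Lambda N \to NN$ transition, the OPE mechanism describes the long range part of the interaction, and more massive mesons are expected to contribute at shorter distances. A difficulty appears when one wants to include other mesons in the exchange potential. In fact, for mesons $m$ other than the pion, the weak and strong vertices $\Lambda N m$ and $NNm$ are experimentally unknown; moreover, their theoretical evaluation turned out to be quite model-dependent, as explained in the previous section. For example, if one includes in the calculation the contribution of the $\rho$-meson, the weak $\Lambda N\rho$ and strong $NN\rho$ Hamiltonians:

$$ \mathcal{H}^W_{\Lambda N\rho} = G m_\pi^2\, \bar\psi_N \left( \alpha\gamma^\mu - \beta\, i\, \frac{\sigma^{\mu\nu} q_\nu}{2\bar m} + \varepsilon\, \gamma^\mu\gamma_5 \right) \vec\tau\cdot\vec\rho_\mu\, \psi_\Lambda , $$

$$ \mathcal{H}^S_{NN\rho} = \bar\psi_N \left( g^V_{NN\rho}\, \gamma^\mu + i\, g^T_{NN\rho}\, \frac{\sigma^{\mu\nu} q_\nu}{2\bar m} \right) \vec\tau\cdot\vec\rho_\mu\, \psi_N $$

are needed [123]. They give the following $\rho$-meson transition potential:

$$ V_\rho(\vec q) = G m_\pi^2 \left[ g^V_{NN\rho}\, \alpha - \frac{(\alpha + \beta)(g^V_{NN\rho} + g^T_{NN\rho})}{4 m_N \bar m}\, (\vec\sigma_1\times\vec q)\cdot(\vec\sigma_2\times\vec q) + i\, \frac{\varepsilon\,(g^V_{NN\rho} + g^T_{NN\rho})}{2 m_N}\, (\vec\sigma_1\times\vec\sigma_2)\cdot\vec q \right] \frac{\vec\tau_1\cdot\vec\tau_2}{\vec q^{\,2} + m_\rho^2} , $$

where the weak coupling constants $\alpha$, $\beta$ and $\varepsilon$ must be evaluated theoretically. The $\Lambda N \to NN$ potential for an OME calculation accounting for the exchange of pseudoscalar and vector mesons can be expressed through the following decomposition:

$$ V(\vec r) = \sum_m V_m(\vec r) = \sum_m \sum_\alpha V_m^\alpha(r)\, \hat O^\alpha(\hat r)\, \hat I_m , \tag{6} $$

where $m = \pi, \rho, K, K^*, \omega, \eta$; the spin operators $\hat O^\alpha$ are (PV stands for parity-violating):

$$ \hat O^\alpha(\hat r) = \begin{cases} \hat 1 & \text{central spin-independent,} \\ \vec\sigma_1\cdot\vec\sigma_2 & \text{central spin-dependent,} \\ S_{12}(\hat r) = 3(\vec\sigma_1\cdot\hat r)(\vec\sigma_2\cdot\hat r) - \vec\sigma_1\cdot\vec\sigma_2 & \text{tensor,} \\ \vec\sigma_2\cdot\hat r & \text{PV for pseudoscalar mesons,} \\ (\vec\sigma_1\times\vec\sigma_2)\cdot\hat r & \text{PV for vector mesons,} \end{cases} $$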
whereas the isospin operators $\hat I_m$ are

$$ \hat I_m = \begin{cases} \hat 1 & \text{isoscalar mesons } (\eta, \omega), \\ \vec\tau_1\cdot\vec\tau_2 & \text{isovector mesons } (\pi, \rho), \\ \text{linear combination of } \hat 1 \text{ and } \vec\tau_1\cdot\vec\tau_2 & \text{isodoublet mesons } (K, K^*). \end{cases} $$

For details concerning the potential (6), see Refs. [123,126]. Assuming the initial hypernucleus to be at rest, the one-body induced non-mesonic decay rate can then be written as

$$ \Gamma_1 = \int \frac{d\vec p_1}{(2\pi)^3} \int \frac{d\vec p_2}{(2\pi)^3}\; 2\pi\, \delta({\rm E.C.})\; \overline{\sum}\, |\mathcal{M}(\vec p_1, \vec p_2)|^2 , \tag{7} $$

where $\delta({\rm E.C.})$ stands for the energy-conserving delta function:

$$ \delta({\rm E.C.}) = \delta\!\left( m_H - E_R - 2m_N - \frac{\vec p_1^{\,2}}{2m_N} - \frac{\vec p_2^{\,2}}{2m_N} \right) ; $$

moreover:

$$ \mathcal{M}(\vec p_1, \vec p_2) \equiv \langle \Psi_R;\, N(\vec p_1) N(\vec p_2) |\, \hat T_{\Lambda N \to NN}\, | \Psi_H \rangle $$

is the amplitude for the transition of the initial hypernuclear state $\Psi_H$ of mass $m_H$ into a final state composed of a residual nucleus $\Psi_R$ with energy $E_R$ and an antisymmetrized two-nucleon state $N(\vec p_1)N(\vec p_2)$, $\vec p_1$ and $\vec p_2$ being the nucleon momenta. The sum $\overline{\sum}$ in Eq. (7) indicates an average over the third component of the hypernuclear total spin and a sum over the quantum numbers of the residual system and over the spin and isospin third components of the outgoing nucleons. Customarily, in shell model calculations the weak-coupling scheme is used to describe the hypernuclear wave function $\Psi_H$, the nuclear core wave function being obtained through the technique of fractional parentage coefficients [123]. The many-body transition amplitude $\mathcal{M}(\vec p_1, \vec p_2)$ is then expressed in terms of two-body amplitudes $\langle NN | V | \Lambda N \rangle$ of the OME potential of Eq. (6). Since the $\Lambda$ decays from an orbital angular momentum $l = 0$ state, in the non-mesonic decay rate one can easily isolate the contributions of neutron- and proton-induced transitions [123], and the $\Gamma_n/\Gamma_p$ ratio can be directly evaluated. The NN final state interactions and the $\Lambda N$ correlations (which are absent in an independent particle shell model) can also be implemented in the calculation [121,123,134,173].

5.4. Polarization propagator method and local density approximation

The $\Lambda$ decay in nuclear systems can be studied by using the polarization propagator method [182], which is usually employed within the random phase approximation (RPA). The calculation of the widths is performed in nuclear matter and then extended to finite nuclei via the LDA. This many-body technique has been applied for the first time to hypernuclear decays in Ref. [166].
It provides a unified picture of the different decay channels and it is equivalent to the WFM [183] (in the sense that it is a semiclassical approximation of the exact quantum mechanical problem). For the calculation of the mesonic rates the WFM is more reliable than the PPM in LDA, this channel being rather sensitive to the shell structure of the hypernucleus, due to the small energies involved. In general it is advisable to avoid the use of the LDA to describe very light systems. On the other
Fig. 7. $\Lambda$ self-energy in nuclear matter (diagram with external $\Lambda$ lines of four-momentum $k$, an intermediate nucleon $N$ with $k - q$ and a pion $\pi$ with $q$).
hand, the propagator method in LDA offers the possibility of calculations over a broad range of mass numbers, while the WFM is hardly exploitable for medium and heavy hypernuclei.

5.4.1. Nuclear matter

To calculate the $\Lambda$ width one needs the imaginary part of the $\Lambda$ self-energy:

$$ \Gamma_\Lambda = -2\, {\rm Im}\, \Sigma_\Lambda . \tag{8} $$
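Since the widths obtained from Eq. (8) are quoted in units of the free $\Lambda$ width, it can be useful to have the scale of the latter at hand. The following Python sketch evaluates the closed-form free width of Section 5.2 with the couplings quoted there. The particle masses, the value of $\hbar$ and the use of a single set of masses (proton, charged pion) for both charge channels are assumptions made for illustration and are not taken from the text.

```python
import math

# Couplings quoted in Section 5.2
G_MPI2 = 2.211e-7        # G * m_pi^2 (dimensionless)
A, B = 1.06, -7.10       # parity-violating / parity-conserving couplings

# Assumed inputs (standard values, not given in the text), in MeV and MeV s
M_LAMBDA, M_N, M_PI = 1115.68, 938.27, 139.57
HBAR_MEV_S = 6.582e-22


def q_cm(m_parent, m1, m2):
    """Two-body decay momentum in the rest frame of the parent particle."""
    s = m_parent ** 2
    return math.sqrt((s - (m1 + m2) ** 2) * (s - (m1 - m2) ** 2)) / (2.0 * m_parent)


def free_width(c_alpha, q):
    """Closed-form free width for one charge channel (c_alpha = 1 or 2).

    Returns the width in MeV and the fraction carried by the P-wave term.
    """
    s_wave = A ** 2                              # S = A
    p_coupling = M_PI * B / (2.0 * M_N)          # P = m_pi B / (2 m_N)
    p_wave = (p_coupling / M_PI) ** 2 * q ** 2
    prefactor = c_alpha * G_MPI2 ** 2 / (2.0 * math.pi) * M_N * q / M_LAMBDA
    return prefactor * (s_wave + p_wave), p_wave / (s_wave + p_wave)


q = q_cm(M_LAMBDA, M_N, M_PI)
gamma_pi0, p_fraction = free_width(1, q)
gamma_pim, _ = free_width(2, q)
gamma_free = gamma_pi0 + gamma_pim
lifetime = HBAR_MEV_S / gamma_free

print(f"q_cm = {q:.1f} MeV, P-wave fraction = {p_fraction:.3f}")
print(f"free width = {gamma_free:.3e} MeV, lifetime = {lifetime:.3e} s")
```

Under these assumptions the pion momentum comes out close to 100 MeV and the parity-conserving term carries roughly a tenth of the width, in line with the statements made in the text.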
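By using the customary Feynman rules, from Fig. 7 the $\Lambda$ self-energy in the non-relativistic limit is obtained as

$$ \Sigma_\Lambda(k) = 3i\, (G m_\pi^2)^2 \int \frac{d^4 q}{(2\pi)^4} \left( S^2 + \frac{P^2}{m_\pi^2}\, \vec q^{\,2} \right) F_\pi^2(q)\, G_N(k - q)\, G_\pi(q) , \tag{9} $$

the factor 3 being a consequence of the $\Delta I = 1/2$ rule. The nucleon and pion propagators in nuclear matter are, respectively:

$$ G_N(p) = \frac{\theta(|\vec p| - k_F)}{p_0 - E_N(\vec p) - V_N + i\varepsilon} + \frac{\theta(k_F - |\vec p|)}{p_0 - E_N(\vec p) - V_N - i\varepsilon} \tag{10} $$

and

$$ G_\pi(q) = \frac{1}{q_0^2 - \vec q^{\,2} - m_\pi^2 - \Sigma^*_\pi(q)} . \tag{11} $$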
The above form of the non-relativistic nucleon propagator refers to a non-interacting Fermi system but includes corrections due to the Pauli principle and an average binding. Other effects of the nucleon renormalization in the medium are found to be negligible in the processes we are treating [184]. In the previous equations, $p = (p_0, \vec p)$ and $q = (q_0, \vec q)$ denote four-vectors, $k_F$ is the Fermi momentum, $E_N$ is the nucleon total free energy, $V_N$ the nucleon binding energy (which is density-dependent), and $\Sigma^*_\pi$ is the pion proper self-energy in nuclear matter. Moreover, in Eq. (9) we have included a monopole form factor describing the hadronic structure of the $\pi\Lambda N$ vertex:

$$ F_\pi(q) = \frac{\Lambda_\pi^2 - m_\pi^2}{\Lambda_\pi^2 - q_0^2 + \vec q^{\,2}} ; $$
Fig. 8. Lowest order terms for the $\Lambda$ self-energy in nuclear matter, panels (a)–(h). The meaning of the various diagrams is explained in the text.
which is normalized to unity for on-shell pions. Since at present there is no reason to introduce a different form factor in the weak vertex, one utilizes here the same expression usually employed for the $\pi NN$ strong vertex. For instance, in the pole dominance description of the parity-conserving weak vertex, a form factor identical to the strong one is assigned. From empirical studies on the NN interaction it follows that $\Lambda_{\pi NN} \simeq 1.3$ GeV, and the same value can be used for $\Lambda_{\pi\Lambda N}$. We note here that the parity-conserving term ($l = 1$ term) in Eq. (9) contributes only about 12% of the total free $\Lambda$ decay width. However, the P-wave interaction becomes dominant in the nuclear non-mesonic decay, because of the larger exchanged momenta. In Fig. 8 we show the lowest order Feynman diagrams for the $\Lambda$ self-energy in nuclear matter. Diagram (a) represents the bare self-energy term, including the effects of the Pauli principle and of binding on the intermediate nucleon. In (b) and (c) the pion couples to a particle–hole (p–h) and a $\Delta$–h pair, respectively. Diagram (d) is an insertion of S-wave pion self-energy at lowest order. In diagram (e) we show a 2p–2h excitation coupled to the pion through S-wave $\pi N$ interactions. Other 2p–2h excitations, coupled in P-wave, are shown in (f) and (g), while (h) is an RPA iteration of diagram (b). In Eq. (9) there are two different sources of imaginary part. The analytical structure of the integrand allows the integration over $q_0$ [166]. After performing this integration, an imaginary part is obtained from the (renormalized) pion–nucleon pole and physically corresponds to the mesonic decay of the hyperon. Moreover, the pion proper self-energy $\Sigma^*_\pi(q)$ has an imaginary part itself for $(q_0, \vec q)$ values which correspond to the excitation of p–h, $\Delta$–h, 2p–2h, etc. states on the mass shell. By expanding the pion propagator $G_\pi(q)$ as in Fig. 8 and integrating Eq. (9) over $q_0$, the nuclear matter $\Lambda$ decay
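width of Eq. (8) becomes [166]:

$$ \Gamma_\Lambda(\vec k, \rho) = -6\, (G m_\pi^2)^2 \int \frac{d\vec q}{(2\pi)^3}\, \theta(|\vec k - \vec q| - k_F)\, \theta(k_0 - E_N(\vec k - \vec q) - V_N)\; {\rm Im}\, [\alpha(q)]_{q_0 = k_0 - E_N(\vec k - \vec q) - V_N} , \tag{12} $$

where

$$ \alpha(q) = \left( S^2 + \frac{P^2}{m_\pi^2}\, \vec q^{\,2} \right) F_\pi^2(q)\, G^0_\pi(q) + \frac{\tilde S^2(q)\, U_L(q)}{1 - V_L(q) U_L(q)} + \frac{\tilde P_L^2(q)\, U_L(q)}{1 - V_L(q) U_L(q)} + 2\, \frac{\tilde P_T^2(q)\, U_T(q)}{1 - V_T(q) U_T(q)} . \tag{13} $$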
In Eq. (12) the first $\theta$ function forbids intermediate nucleon momenta smaller than the Fermi momentum (see Fig. 7), while the second one requires the pion energy $q_0$ to be positive. Moreover, the $\Lambda$ energy, $k_0 = E_\Lambda(\vec k) + V_\Lambda$, contains a phenomenological binding term. With the exception of diagram (a), the pion lines of Fig. 8 have been replaced, in Eq. (13), by the effective interactions $\tilde S$, $\tilde P_L$, $\tilde P_T$, $V_L$, $V_T$ (L and T stand for spin-longitudinal and spin-transverse, respectively), which include $\pi$- and $\rho$-exchange modulated by the effect of short range repulsive correlations. The potentials $V_L$ and $V_T$ represent the (strong) p–h interaction and include a Landau parameter $g'$, which accounts for the short range repulsion, while $\tilde S$, $\tilde P_L$ and $\tilde P_T$ correspond to the lines connecting weak and strong hadronic vertices and contain another Landau parameter, $g'_\Lambda$, which is related to the strong $\Lambda N$ short range correlations. For details on these potentials see Appendix A. Furthermore, in Eq. (13):

$$ G^0_\pi(q) = \frac{1}{q_0^2 - \vec q^{\,2} - m_\pi^2} $$

is the free pion propagator, while $U_L(q)$ and $U_T(q)$ contain the Lindhard functions for p–h and $\Delta$–h excitations [185] and also account for the irreducible 2p–2h polarization propagator:

$$ U_{L,T}(q) = U^{ph}(q) + U^{\Delta h}(q) + U^{2p2h}_{L,T}(q) . \tag{14} $$
They appear in Eq. (13) within the standard RPA expression. The decay width (12) depends both explicitly and through $U_{L,T}(q)$ on the nuclear matter density $\rho = 2k_F^3/3\pi^2$. The Lindhard function for the p–h excitation is defined by [185]

$$ U^{ph}(q) = -4i \int \frac{d^4 p}{(2\pi)^4}\, G^0_N(p)\, G^0_N(p + q) , $$

where

$$ G^0_N(p) = \frac{\theta(|\vec p| - k_F)}{p_0 - T_N(\vec p) + i\varepsilon} + \frac{\theta(k_F - |\vec p|)}{p_0 - T_N(\vec p) - i\varepsilon} $$

is the free nucleon propagator. In the above equation, $T_N$ is the nucleon kinetic energy. The Lindhard function $U^{\Delta h}$ is obtained from $U^{ph}$ by replacing the p–h propagators with the $\Delta$–h ones. Analytical expressions of $U^{ph}$ and $U^{\Delta h}$ are given in Refs. [55,185]. For the evaluation of $U^{2p2h}_{L,T}$ we discuss two different approaches. In Refs. [168,169] a phenomenological parameterization was adopted: as we shall see in Section 5.4.3, this consists in relating $U^{2p2h}_{L,T}$ to the available phase space for on-shell 2p–2h excitations in order to extrapolate for off-mass
shell pions the experimental data of P-wave absorption of real pions in pionic atoms. In an alternative approach [170], as we shall discuss in detail in Section 5.5, $U^{2p2h}_{L,T}$ is evaluated microscopically, starting from a classification of the relevant Feynman diagrams according to the so-called bosonic loop expansion, which will be obtained by means of a functional approach. In the spin-longitudinal channel, $U_L(q)$ is related to the P-wave pion proper self-energy through:

$$ \Sigma^{(P)*}_\pi(q) = \frac{\vec q^{\,2}\, (f_\pi^2/m_\pi^2)\, F_\pi^2(q)\, U_L(q)}{1 - (f_\pi^2/m_\pi^2)\, g_L(q)\, U_L(q)} , $$

where the Landau function $g_L(q)$ is given in Appendix A. The full pion (proper) self-energy:

$$ \Sigma^*_\pi(q) = \Sigma^{(S)*}_\pi(q) + \Sigma^{(P)*}_\pi(q) $$

also contains an S-wave term, which, by using the parameterization of Ref. [186], can be written as

$$ \Sigma^{(S)*}_\pi(q) = -4\pi \left( 1 + \frac{m_\pi}{m_N} \right) b_0\, \rho , $$

with $b_0 = -0.0285/m_\pi$. The function $\Sigma^{(S)*}_\pi$ is real (constant and positive), therefore it contributes only to the mesonic decay [diagram (d) in Fig. 8 is the corresponding lowest-order term]. On the contrary, the P-wave self-energy is complex and attractive: ${\rm Re}\, \Sigma^{(P)*}_\pi(q) < 0$. The propagator method provides a unified picture of the decay widths. A non-vanishing imaginary part in a self-energy diagram requires placing simultaneously on-shell the particles of the considered intermediate state. For instance, diagram (b) in Fig. 8 has two sources of imaginary part. One comes from cut 1, where the nucleon and the pion are placed on-shell. This term contributes to the mesonic channel: the final pion eventually interacts with the medium through a p–h excitation and then escapes from the nucleus. Diagram (b) and further iterations lead to a renormalization of the pion in the medium which may increase the mesonic rate even by one or two orders of magnitude in heavy nuclei [111,112,166]. The cut 2 in Fig. 8(b) places a nucleon and a p–h pair on shell, so it is the lowest order contribution to the physical process $\Lambda N \to NN$; analogous considerations apply to all the considered diagrams. In order to evaluate the various contributions to the width stemming from Eq. (12), it is convenient to consider all the intervening free meson propagators as real. Then the imaginary part of (13) will develop the following contributions:

$$ {\rm Im}\, \frac{U_{L,T}(q)}{1 - V_{L,T}(q)\, U_{L,T}(q)} = \frac{{\rm Im}\, U^{ph}(q) + {\rm Im}\, U^{\Delta h}(q) + {\rm Im}\, U^{2p2h}_{L,T}(q)}{|1 - V_{L,T}(q)\, U_{L,T}(q)|^2} . \tag{15} $$
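The decomposition in Eq. (15) rests on the interaction $V_{L,T}$ being real, so that the imaginary part of the RPA-dressed propagator is additive in the three polarization contributions. A minimal numerical sketch of this identity follows; the function name and the complex input values are hypothetical and chosen only for illustration, not taken from any calculation in the literature.

```python
def rpa_imaginary_parts(u_ph, u_dh, u_2p2h, v):
    """Check Im[U/(1 - V U)] = (Im U_ph + Im U_dh + Im U_2p2h) / |1 - V U|^2.

    u_ph, u_dh, u_2p2h: complex polarization propagators (illustrative values).
    v: real effective interaction.
    Returns the left-hand side and the three separate right-hand-side terms.
    """
    u_total = u_ph + u_dh + u_2p2h
    denominator = abs(1.0 - v * u_total) ** 2
    lhs = (u_total / (1.0 - v * u_total)).imag
    terms = tuple(u.imag / denominator for u in (u_ph, u_dh, u_2p2h))
    return lhs, terms


# Hypothetical inputs, in arbitrary units
lhs, (one_body, delta_hole, two_body) = rpa_imaginary_parts(
    u_ph=-0.8 - 0.30j, u_dh=-0.2 - 0.01j, u_2p2h=-0.1 - 0.05j, v=0.5
)
print(lhs, one_body + delta_hole + two_body)
```

The three returned terms are the pieces that the text associates with separate decay mechanisms; the identity would fail for a complex `v`, which is why the free meson propagators entering the interaction are taken as real.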
The three terms in the numerator of Eq. (15) can be interpreted as different decay mechanisms of the hypernucleus. The term proportional to ${\rm Im}\, U^{ph}$ provides the one-nucleon induced non-mesonic rate, $\Gamma_1$. There is no overlap between ${\rm Im}\, U^{ph}(q)$ and the pole $q_0 = \omega(\vec q)$ in the (dressed) pion propagator $G_\pi(q)$: thus the separation of the mesonic and one-body stimulated non-mesonic channels is unambiguous. Further, ${\rm Im}\, U^{\Delta h}$ accounts for the $\Lambda N \to \Delta N$ decay width, thus representing a contribution to the mesonic decay.
The third contribution of Eq. (15), proportional to $\mathrm{Im}\,U^{2p2h}_{L,T}$, intervenes in a wide kinematical range, in which the above-mentioned cuts put on the mass shell not only the 2p–2h lines, but possibly also the pionic line. Indeed the renormalized pion pole in Eq. (11) is given by the dispersion relation:

$$\omega^2(\vec q) - \vec q^{\,2} - m_\pi^2 - \mathrm{Re}\,\Sigma^*_\pi[\omega(\vec q),\vec q] = 0$$

with the constraint:

$$\omega(\vec q) = k_0 - E_N(\vec k - \vec q) - V_N .$$

At the pion pole, $\mathrm{Im}\,U^{2p2h}_{L,T} \neq 0$, thus the two-body induced non-mesonic width, $\Gamma_2$, cannot be disentangled from the mesonic width, $\Gamma_M$. In other words, part of the decay rate calculated from $\mathrm{Im}\,U^{2p2h}_{L,T}$ is due to the excitations of the renormalized pion and gives in fact $\Gamma_M$, with the exception of the mesonic contribution originating from $\mathrm{Im}\,U^{\Delta h}$, which is, however, only a small fraction of $\Gamma_M$. In order to separate $\Gamma_M$ from $\Gamma_2$, in the numerical calculation it is convenient to evaluate the mesonic width by adopting the following prescription. We start from Eq. (12), setting:

$$\alpha(q) = \alpha_M(q) \equiv \left(S^2 + \frac{P^2}{m_\pi^2}\,\vec q^{\,2}\right) F_\pi^2(q)\,G_\pi(q) \tag{16}$$

and omitting $\mathrm{Im}\,\Sigma^*_\pi$ in $G_\pi$ (which corresponds to setting $\mathrm{Im}\,U^{ph} = \mathrm{Im}\,U^{\Delta h} = \mathrm{Im}\,U^{2p2h}_{L,T} = 0$). Then $\mathrm{Im}\,\alpha_M(q)$ only accounts for the (real) contribution of the pion pole:

$$\mathrm{Im}\,G_\pi(q) = -\pi\,\delta[q_0^2 - \vec q^{\,2} - m_\pi^2 - \mathrm{Re}\,\Sigma^*_\pi(q)] .$$

We notice that the compact relation (16) between $\alpha(q)$ and the pion propagator is valid only for the calculation of the mesonic decay mode. In fact in this case the following substitutions must be performed in Eq. (13) (see also Appendix A):

$$\tilde S(q) \to \frac{f_\pi}{m_\pi}\,S\,F_\pi^2(q)\,G_\pi^0(q)\,|\vec q| , \qquad \tilde P_L(q) \to \frac{f_\pi}{m_\pi}\,\frac{P}{m_\pi}\,\vec q^{\,2} F_\pi^2(q)\,G_\pi^0(q) , \qquad \tilde P_T(q) \to 0$$

and hence the various terms in $\alpha(q)$ can be combined to give expression (16). Obviously this implies that no correlation other than the pion is active between the $\Lambda$ and the strong vertices ($g'_\Lambda = 0$). Once the mesonic decay rate is known, one can calculate the three-body non-mesonic rate $\Gamma_2$ by subtracting $\Gamma_M$ and $\Gamma_1$ from the total rate $\Gamma_T$, which one gets via the full expression for $\alpha(q)$ [Eq. (13)].

5.4.2. Finite nuclei

Using the polarization propagator approach, the decay widths in finite nuclei are obtained from the ones evaluated in nuclear matter via the LDA: the Fermi momentum is made r-dependent (namely a local Fermi sea of nucleons is introduced) and related to the nuclear density by the same relation which holds in nuclear matter:

$$k_F(\vec r) = \left\{\frac{3}{2}\,\pi^2 \rho(\vec r)\right\}^{1/3} . \tag{17}$$
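As a minimal numerical sketch of the LDA relation (17), the snippet below evaluates the local Fermi momentum for a few densities. The function names and the reference density of 0.17 fm^-3 are illustrative assumptions and do not come from the text.

```python
import math


def local_fermi_momentum(rho):
    """Local Fermi momentum k_F (fm^-1) from Eq. (17) for a density rho (fm^-3)."""
    if rho < 0.0:
        raise ValueError("density must be non-negative")
    return (1.5 * math.pi ** 2 * rho) ** (1.0 / 3.0)


def density_from_fermi_momentum(k_f):
    """Inverse of Eq. (17): rho = 2 k_F^3 / (3 pi^2)."""
    return 2.0 * k_f ** 3 / (3.0 * math.pi ** 2)


if __name__ == "__main__":
    rho_0 = 0.17  # fm^-3, assumed illustrative central density (not from the text)
    for fraction in (1.0, 0.5, 0.1):
        rho = fraction * rho_0
        print(f"rho = {rho:.3f} fm^-3 -> k_F = {local_fermi_momentum(rho):.3f} fm^-1")
```

Because k_F scales as the cube root of the density, it falls only slowly towards the nuclear surface, which is why a local Fermi sea remains a usable picture over most of the nuclear volume.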
Moreover, the nucleon binding potential $V_N$ also becomes r-dependent in LDA. In Thomas–Fermi approximation one assumes:

$$\epsilon_F(\vec r) + V_N(\vec r) \equiv \frac{k_F^2(\vec r)}{2 m_N} + V_N(\vec r) = 0 .$$

For the $\Lambda$ binding energy, $V_\Lambda$, the experimental values [38,39] can be used. With these prescriptions one can then evaluate the decay width in finite nuclei by using the semiclassical approximation, through the relation:

$$\Gamma_\Lambda(\vec k) = \int d\vec r\, |\psi_\Lambda(\vec r)|^2\, \Gamma_\Lambda[\vec k, \rho(\vec r)] , \tag{18}$$

where $\psi_\Lambda$ is the appropriate $\Lambda$ wave function and $\Gamma_\Lambda[\vec k, \rho(\vec r)]$ is given by Eqs. (12) and (13). This decay rate can be regarded as the $\vec k$-component of the $\Lambda$ decay rate in the nucleus with density $\rho(\vec r)$. It can be used to estimate the decay rates by averaging over the $\Lambda$ momentum distribution $|\tilde\psi_\Lambda(\vec k)|^2$. One then obtains the following total width:

$$\Gamma_\Lambda = \int d\vec k\, |\tilde\psi_\Lambda(\vec k)|^2\, \Gamma_\Lambda(\vec k) , \tag{19}$$

which can be compared with the experimental results.

5.4.3. Phenomenological 2p–2h propagator

Coming to the phenomenological evaluation of the 2p–2h contributions in the $\Lambda$ self-energy, we recall that the authors of Ref. [168] employed the following equation for the imaginary part of $U^{2p2h}_{L,T}$:

$$\mathrm{Im}\,U^{2p2h}_{L,T}(q_0,\vec q;\rho) = \frac{P(q_0,\vec q;\rho)}{P(m_\pi,\vec 0;\rho_{\rm eff})}\;\mathrm{Im}\,U^{2p2h}_{L,T}(m_\pi,\vec 0;\rho_{\rm eff}) , \tag{20}$$

where $\rho_{\rm eff} = 0.75\rho$. By neglecting the energy and momentum dependence of the p–h interaction, the phase space available for on-shell 2p–2h excitations [calculated, for simplicity, from diagram 8(e)] at energy–momentum $(q_0,\vec q)$ and density $\rho$ turns out to be

$$P(q_0,\vec q;\rho) \propto \int \frac{d^4k}{(2\pi)^4}\;\mathrm{Im}\,U^{ph}\!\left(\frac{q}{2}+k;\rho\right)\mathrm{Im}\,U^{ph}\!\left(\frac{q}{2}-k;\rho\right)\theta\!\left(\frac{q_0}{2}+k_0\right)\theta\!\left(\frac{q_0}{2}-k_0\right) .$$

In the region of $(q_0,\vec q)$ where the p–h and $\Delta$–h excitations are off-shell, the relation between $U_L^{2p2h}$ and the P-wave pion–nucleus optical potential $V_{\rm opt}$ is given by

$$\frac{\vec q^{\,2}\,(f_\pi^2/m_\pi^2)\,F_\pi^2(q)\,U_L^{2p2h}(q)}{1 - (f_\pi^2/m_\pi^2)\,g_L(q)\,U_L(q)} = 2 q_0 V_{\rm opt}(q) ; \tag{21}$$

at the pion threshold $V_{\rm opt}$ is usually parameterized as

$$2 q_0 V_{\rm opt}(q_0 \simeq m_\pi, \vec q \simeq \vec 0; \rho) = -4\pi\,\vec q^{\,2}\rho^2 C_0 , \tag{22}$$

where $C_0$ is a complex number which can be extracted from experimental data on pionic atoms. By combining Eqs. (21) and (22) it is possible to parameterize the proper 2p–2h excitations in the
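spin-longitudinal channel through Eq. (20), by setting

$$\vec q^{\,2}\,\frac{f_\pi^2}{m_\pi^2}\,F_\pi^2(q_0 \simeq m_\pi, \vec q \simeq \vec 0)\,U_L^{2p2h}(q_0 \simeq m_\pi, \vec q \simeq \vec 0; \rho) = -4\pi\,\vec q^{\,2}\rho^2 C_0^* . \tag{23}$$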
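As a small worked step, Eq. (23) can be rearranged to give the threshold value that normalizes the ratio in Eq. (20). This is only a sketch: it assumes $\vec q^{\,2} \neq 0$, so that the common factor can be divided out before the limit is taken, and it assumes a real form factor $F_\pi$ at threshold when the imaginary part is isolated.

```latex
U_L^{2p2h}(q_0 \simeq m_\pi, \vec q \simeq \vec 0; \rho)
  = -\,\frac{4\pi\, m_\pi^2}{f_\pi^2\, F_\pi^2(q_0 \simeq m_\pi, \vec q \simeq \vec 0)}\;\rho^2\, C_0^* ,
\qquad
\mathrm{Im}\, U_L^{2p2h}
  = -\,\frac{4\pi\, m_\pi^2}{f_\pi^2\, F_\pi^2}\;\rho^2\, \mathrm{Im}\, C_0^* .
```

In this form the quadratic dependence on the density is explicit, as expected for a process involving two nucleons.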
The value of $C_0^*$ also depends on the correlation function $g_L$. From the analysis of pionic atoms data made in Ref. [187] and taking $g' \equiv g_L(0) = 0.615$, one obtains

$$C_0^* = (0.105 + i\,0.096)/m_\pi^6 .$$

The spin-transverse component of $U^{2p2h}$ is assumed to be equal to the spin-longitudinal one, $U_T^{2p2h} = U_L^{2p2h}$, and the real parts of $U_L^{2p2h}$ and $U_T^{2p2h}$ are considered constant [by using Eq. (23)] because they are not expected to be too sensitive to variations of $q_0$ and $\vec q$. The assumption $U_T^{2p2h} = U_L^{2p2h}$ is not a priori a good approximation, but it is the only one which can be employed in the present phenomenological description. Yet, the differences between $U_L^{2p2h}$ and $U_T^{2p2h}$ (which will be discussed in Section 5.7; see, in particular, Fig. 16) can only mildly change the partial decay widths: in fact, the $U^{2p2h}_{L,T}$ are summed to $U^{ph}$, which gives the dominant contribution. Moreover, for $U_L^{2p2h} = U_T^{2p2h}$ the transverse contribution to $\Gamma_2$ [fourth term in the right-hand side of Eq. (13)] is only about 16% of $\Gamma_2$ (namely 2–3% of the total width) in medium-heavy hypernuclei.

5.5. Functional approach to the $\Lambda$ self-energy

As an alternative to the above-mentioned phenomenological approach for the two-body induced decay, we discuss here a microscopic approach. In particular, we will show how the most relevant Feynman diagrams for the calculation of the $\Lambda$ self-energy can be obtained in the framework of a functional method: following Ref. [170] we will briefly derive a classification of the diagrams according to the prescription of the so-called bosonic loop expansion (BLE). The baryon–baryon strong interactions cannot be treated with the standard perturbative method. Indeed, in the study of nuclear phenomena we always need to sum, up to infinite order, the series of pertinent diagrams. For instance, one usually performs the summation of the infinite classes of diagrams entailed by the RPA and Dyson equations.
However, in the above-quoted schemes no prescription is given to evaluate the “next-to-leading” order. The functional techniques can provide a theoretically founded derivation of new classes of expansion in terms of powers of suitably chosen parameters. On the other hand, as we will see, the ring approximation (a subclass of RPA) automatically appears in this framework at the mean field level. This method has been extensively applied to the analysis of different processes in nuclear physics [188–190]. Here it will be employed for the calculation of the $\Lambda$ decay rates in nuclear matter, which can be expressed through the nuclear responses to pseudoscalar–isovector and vector–isovector fields. The polarization propagators obtained in this framework include ring-dressed meson propagators (which represent the mean field level of the theory) and almost the whole spectrum of 2p–2h excitations (expressed in terms of a one-loop expansion with respect to the ring-dressed meson propagators), which are required for the evaluation of $\Gamma_2$. Actually, the semiclassical expansion leads to the prescription of grouping the relevant Feynman diagrams in a consistent many-body description of the “in medium” meson self-energies: the general theorems and sum rules of the theory are preserved.
Let us first consider the polarization propagator in the pionic (spin-longitudinal) channel. As an example, it is useful to start from a Lagrangian describing a system of nucleons interacting with pions through a pseudoscalar–isovector coupling:

$$\mathcal{L}_{\pi N} = \bar\psi\,(i\partial\!\!\!/ - m_N)\,\psi + \frac12\,\partial_\mu\vec\phi\cdot\partial^\mu\vec\phi - \frac12\,m_\pi^2\,\vec\phi^{\,2} - i\,\bar\psi\,\vec\Gamma\,\psi\cdot\vec\phi ,$$

where $\psi$ ($\vec\phi$) is the nucleonic (pionic) field, and

$$\vec\Gamma = g\,\gamma_5\,\vec\tau \qquad (g = 2 f_\pi m_N/m_\pi)$$

is the spin–isospin matrix in the spin-longitudinal isovector channel. We remind the reader that in the calculation of the hypernuclear decay rates one also needs the polarization propagator in the transverse channel [see Eqs. (12) and (13)]: hence, we will have to include in the model another mesonic degree of freedom, the $\rho$ meson. This is relatively straightforward, since the semiclassical expansion is characterized by the topology of the diagrams, so the same scheme can be easily applied to mesonic fields other than the pionic one. In this subsection we present a relativistic formalism, its non-relativistic reduction being trivial.

Let us now introduce a classical external field $\vec\varphi$ with the quantum numbers of the pion. The Lagrangian then becomes:

$$\mathcal{L}_{\pi N} \to \mathcal{L}_{\pi N} - i\,\bar\psi\,\vec\Gamma\,\psi\cdot\vec\varphi .$$

The corresponding generating functional in terms of Feynman path integrals has the form:

$$Z[\vec\varphi] = \int \mathcal{D}[\bar\psi,\psi,\vec\phi]\,\exp\left\{ i\int dx\,[\mathcal{L}_{\pi N}(x) - i\,\bar\psi(x)\,\vec\Gamma\,\psi(x)\cdot\vec\varphi(x)]\right\} \tag{24}$$

(here and in the following the coordinate integrals are 4-dimensional). All the fields in the functional integrals have to be considered as classical variables, but with the correct commuting properties (hence the fermionic fields are Grassmann variables). The physical quantities of interest for the problem are deduced from the generating functional by means of functional differentiations. In particular, by introducing a new functional $Z_c$ such that

$$Z[\vec\varphi] = \exp\{ i Z_c[\vec\varphi]\} , \tag{25}$$

the spin-longitudinal, isovector polarization propagator turns out to be the second functional derivative of $Z_c$ with respect to the source $\vec\varphi$ of the pionic field:

$$\Pi_{ij}(x,y) = -\left[\frac{\delta^2 Z_c[\vec\varphi]}{\delta\varphi_i(x)\,\delta\varphi_j(y)}\right]_{\vec\varphi = 0} . \tag{26}$$

We notice that the use of $Z_c$ instead of $Z$ in Eq. (26) amounts to cancelling the disconnected diagrams of the corresponding perturbative expansion (linked cluster theorem). From the generating functional $Z$ one can obtain different approximation schemes according to the order in which the functional integrations are performed.

By integrating Eq. (24) over the mesonic degrees of freedom first, the generating functional can be written in terms of a fermionic effective action $S^F_{\rm eff}$. Up to an irrelevant multiplicative constant:

$$Z[\vec\varphi] = \int \mathcal{D}[\bar\psi,\psi]\,\exp\{ i S^F_{\rm eff}[\bar\psi,\psi]\} .$$
The remaining integration variables are interpreted as physical fields and, beyond the kinetic term, $S^F_{\rm eff}$ describes a quadrilinear non-local, time- or energy-dependent nucleon–nucleon interaction induced by the exchange of one pion:

$$S^F_{\rm eff}[\bar\psi,\psi] = \int dx\,dy\,\left[\bar\psi(x)\,G_N^{-1}(x-y)\,\psi(y) + \frac12\sum_{i=1}^{3}\bar\psi(x)\Gamma_i\psi(x)\,G_\pi^0(x-y)\,\bar\psi(y)\Gamma_i\psi(y)\right] , \tag{27}$$

where $G_N$ and $G_\pi^0$ are the nucleon and free pion propagators, respectively, which satisfy the following field equations:

$$(i\partial\!\!\!/_x - m_N - i\,\vec\Gamma\cdot\vec\varphi)\,G_N(x-y) = \delta(x-y) ,$$

$$(\Box_x + m_\pi^2)\,G_\pi^0(x-y) = -\delta(x-y) .$$
The pion propagator is diagonal in the isospin indices: $(G_\pi^0)_{ij} = \delta_{ij}\,G_\pi^0$. The effective action (27) can then be utilized in the framework of ordinary perturbation theory and does not bring significant novelties with respect to the usual calculations; furthermore, it cannot be correctly renormalized due to the absence of a term proportional to $\vec\phi^{\,4}$, which is needed to cancel the divergence of the 4-point fermion loops.

5.5.1. The bosonic effective action

Alternatively it is possible to eliminate, in Eq. (24), the nucleonic degrees of freedom first (without destroying the renormalizability of the theory [189]). By introducing the change of variable $\vec\phi \to \vec\phi - \vec\varphi$, Eq. (24) becomes

$$Z[\vec\varphi] = \exp\left\{\frac{i}{2}\int dx\,dy\;\vec\varphi(x)\cdot[G_\pi^0]^{-1}(x-y)\,\vec\varphi(y)\right\} \times \int \mathcal{D}[\bar\psi,\psi,\vec\phi]\,\exp\left\{ i\int dx\,dy\left[\bar\psi(x)\,G_N^{-1}(x-y)\,\psi(y) + \frac12\,\vec\phi(x)\cdot[G_\pi^0]^{-1}(x-y)\,(\vec\phi(y) + 2\vec\varphi(y))\right]\right\} , \tag{28}$$

where the integral over $[\bar\psi,\psi]$ is Gaussian:

$$\int \mathcal{D}[\bar\psi,\psi]\,\exp\left\{ i\int dx\,dy\;\bar\psi(x)\,G_N^{-1}(x-y)\,\psi(y)\right\} = (\det G_N)^{-1} .$$

Hence, after multiplying Eq. (28) by the unessential factor $\det G_N^0$ ($G_N^0$ being the free nucleon propagator), which only redefines the normalization constant of the generating functional, and using the property $\det X = \exp\{\mathrm{Tr}\ln X\}$, one obtains:

$$Z[\vec\varphi] = \exp\left\{\frac{i}{2}\int dx\,dy\;\vec\varphi(x)\cdot[G_\pi^0]^{-1}(x-y)\,\vec\varphi(y)\right\}\int \mathcal{D}[\vec\phi]\,\exp\{ i S^B_{\rm eff}[\vec\phi]\} \tag{29}$$
[Figure 9: sum of diagrams with one closed fermion loop and an increasing number of pionic legs]
Fig. 9. Diagrammatic representation of the bosonic effective action (30).
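with

$$S^B_{\rm eff}[\vec\phi] = \int dx\,dy\;\frac12\,\vec\phi(x)\cdot[G_\pi^0]^{-1}(x-y)\,[\vec\phi(y) + 2\vec\varphi(y)] + V_\pi[\vec\phi] ; \tag{30}$$

$$V_\pi[\vec\phi] = i\,\mathrm{Tr}\sum_{n=1}^{\infty}\frac1n\,(i\,\vec\Gamma\cdot\vec\phi\,G_N^0)^n = \frac12\sum_{i,j}\mathrm{Tr}(\Gamma_i\Gamma_j)\int dx\,dy\;\Pi^0(x,y)\,\phi_i(x)\phi_j(y) + \frac13\sum_{i,j,k}\mathrm{Tr}(\Gamma_i\Gamma_j\Gamma_k)\int dx\,dy\,dz\;\Pi^0(x,y,z)\,\phi_i(x)\phi_j(y)\phi_k(z) + O(\vec\phi^{\,4}) . \tag{31}$$

In the above (see footnote 4):

$$-i\,\Pi^0(x,y) = iG_N^0(x-y)\,iG_N^0(y-x) , \tag{32}$$

$$-i\,\Pi^0(x,y,z) = iG_N^0(x-y)\,iG_N^0(y-z)\,iG_N^0(z-x) , \quad \text{etc.} \tag{33}$$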
With this procedure we have thus derived an effective action for the bosonic field $\vec\phi$. This action contains a term for the free pion field and also a highly non-local pion self-interaction $V_\pi$, which is illustrated by the Feynman diagrams shown in Fig. 9. This effective interaction is given by the sum of all diagrams containing one closed fermion loop and an arbitrary number of pionic legs. We note that the function in Eq. (32) is the free particle–hole polarization propagator, namely the Lindhard function. Moreover, the functions $\Pi^0(x,y,\ldots,z)$ are symmetric for cyclic permutations of the arguments.

5.5.2. Semiclassical expansion

The next step is the evaluation of the functional integral over the bosonic degrees of freedom in Eq. (29). A perturbative approach to the bosonic effective action (30) does not seem to provide any valuable results within the capabilities of the present computing tools and we will follow here another approximation scheme, namely the semiclassical method.

Footnote 4: Eq. (31) is written in compact form: for example, the n = 2 term must be interpreted as

$$\frac{i}{2}\,\mathrm{Tr}(i\,\vec\Gamma\cdot\vec\phi\,G_N^0)^2 = \frac{i}{2}\sum_{i,j}\int dx\,dy\;\mathrm{Tr}[\,i\Gamma_i\,G_N^0(x-y)\,i\Gamma_j\,G_N^0(y-x)]\,\phi_i(x)\phi_j(y) ,$$

and so on, where the trace in the right-hand side acts on the vertices $\vec\Gamma$.
5.5.2.1. Mean field level. The lowest order of the semiclassical expansion is the stationary phase approximation (also called saddle point approximation in the Euclidean space): the bosonic effective action is required to be stationary with respect to arbitrary variations of the fields $\phi_i$:

$$\frac{\delta S^B_{\rm eff}[\vec\phi]}{\delta\phi_i(x)} = 0 .$$

From the functional derivative of Eq. (30) one obtains the following equation of motion for the classical field $\vec\phi$:

$$(\Box + m_\pi^2)\,\phi_i(x) = \int dy\,[G_\pi^0]^{-1}(x-y)\,\varphi_i(y) + \frac{\delta V_\pi[\vec\phi]}{\delta\phi_i(x)} , \tag{34}$$

whose solutions are functionals of the external source $\vec\varphi$. The exact solution cannot be written down explicitly. However, due to the particular form of $V_\pi[\vec\phi]$, when $\vec\varphi \to 0$ one solution is $\vec\phi = 0$; the general solution of Eq. (34) can then be expressed as an expansion in powers of $\vec\varphi$:

$$\phi_i(x) = \sum_j\int dy\;A_{ij}(x,y)\,\varphi_j(y) + \frac12\sum_{j,k}\int dy\,dz\;B_{ijk}(x,y,z)\,\varphi_j(y)\varphi_k(z) + O(\vec\varphi^{\,3}) . \tag{35}$$

By substituting Eqs. (35) and (31) into (34) and keeping only terms linear in $\varphi_i$, one obtains the following relation for $A_{ij}$:

$$A_{ij}(x,y) - \mathrm{Tr}(\Gamma_i^2)\int du\,dv\;G_\pi^0(x-u)\,\Pi^0(u,v)\,A_{ij}(v,y) = \delta_{ij}\,\delta(x-y) . \tag{36}$$

Finally, by introducing the ring-dressed pion propagator $G_\pi^{\rm ring}$, which satisfies the Dyson equation:

$$G_\pi^{\rm ring}(x-y) = G_\pi^0(x-y) + \mathrm{Tr}(\Gamma_i^2)\int du\,dv\;G_\pi^0(x-u)\,\Pi^0(u,v)\,G_\pi^{\rm ring}(v-y) ,$$

or, formally:

$$G_\pi^{\rm ring} = \frac{G_\pi^0}{1 - \mathrm{Tr}(\Gamma_i^2)\,G_\pi^0\,\Pi^0} ,$$

the solution of Eq. (36) reads:

$$A_{ij}(x,y) = \delta_{ij}\int dz\;G_\pi^{\rm ring}(x-z)\,[G_\pi^0]^{-1}(z-y) . \tag{37}$$

Thus, the saddle point solution of Eq. (30) at first order in the source $\vec\varphi$ is

$$\phi_i^{\rm ring}(x) = \int dy\,dz\;G_\pi^{\rm ring}(x-z)\,[G_\pi^0]^{-1}(z-y)\,\varphi_i(y) \equiv \int dy\;\left(G_\pi^{\rm ring}[G_\pi^0]^{-1}\right)(x-y)\,\varphi_i(y)$$

and the corresponding bosonic effective action reads:

$$S^B_{\rm eff}[\vec\phi^{\,\rm ring}] = -\frac12\int dx\,dy\,du\,dv\;[G_\pi^0]^{-1}(x-u)\,\vec\varphi(u)\cdot G_\pi^{\rm ring}(x-y)\,[G_\pi^0]^{-1}(y-v)\,\vec\varphi(v) . \tag{38}$$
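The formal solution of the Dyson equation above can be illustrated with a toy calculation. The sketch below treats the propagators as plain complex numbers at a single energy–momentum point, which is an assumption made only for illustration (in momentum space the convolutions become products). The function names and the sample values are hypothetical.

```python
def ring_propagator_closed(g0, pi0, trace_factor):
    """Closed form G_ring = G0 / (1 - Tr(Gamma^2) G0 Pi0) at one momentum point."""
    denominator = 1.0 - trace_factor * g0 * pi0
    if denominator == 0:
        raise ZeroDivisionError("pole of the ring-dressed propagator")
    return g0 / denominator


def ring_propagator_iterated(g0, pi0, trace_factor, n_iter=200):
    """Iterate the Dyson equation G = G0 + Tr(Gamma^2) G0 Pi0 G starting from G0.

    Each iteration adds one more particle-hole bubble to the pion line; the
    series converges only if |Tr(Gamma^2) G0 Pi0| < 1.
    """
    g = g0
    for _ in range(n_iter):
        g = g0 + trace_factor * g0 * pi0 * g
    return g


if __name__ == "__main__":
    # Arbitrary illustrative numbers, not taken from the text.
    g0 = -0.8 + 0.0j
    pi0 = 0.3 - 0.1j
    trace_factor = 2.0
    print(ring_propagator_closed(g0, pi0, trace_factor))
    print(ring_propagator_iterated(g0, pi0, trace_factor))
```

When the geometric series converges the two functions agree, which is the content of the formal expression. The closed form remains defined outside the convergence region, where the iteration fails.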
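Now, the generating functional of Eq. (29) takes the form:

$$Z[\vec\varphi] = \exp\left\{\frac{i}{2}\int dx\,dy\,du\,dv\;\vec\varphi(u)\cdot[G_\pi^0]^{-1}(x-u)\,[G_\pi^0(x-y) - G_\pi^{\rm ring}(x-y)]\,[G_\pi^0]^{-1}(y-v)\,\vec\varphi(v)\right\}$$

and the polarization propagator can then be evaluated by using Eqs. (25) and (26). One obtains that in the saddle point approximation it coincides with the well-known ring expression:

$$\Pi_{ij}(x,y) = \delta_{ij}\left[\Pi^0(x,y) + \mathrm{Tr}(\Gamma_i^2)\int du\,dv\;\Pi^0(x,u)\,G_\pi^{\rm ring}(u-v)\,\Pi^0(v,y)\right] \equiv \delta_{ij}\,\Pi^{\rm ring}(x,y) ,$$

or, formally:

$$\Pi = \frac{\Pi^0}{1 - \mathrm{Tr}(\Gamma_i^2)\,G_\pi^0\,\Pi^0} \equiv \Pi^{\rm ring} .$$

Hence, the ring approximation corresponds to the mean field level of the present effective theory.

5.5.2.2. Quantum fluctuations around the mean field solution (one-boson-loop corrections). In the next step of the semiclassical expansion we write the bosonic effective action as

$$S^B_{\rm eff}[\vec\phi] = S^B_{\rm eff}[\vec\phi^{\,0}] + \frac12\sum_{i,j}\int dx\,dy\left[\frac{\delta^2 S^B_{\rm eff}[\vec\phi]}{\delta\phi_i(x)\,\delta\phi_j(y)}\right]_{\vec\phi=\vec\phi^{\,0}}[\phi_i(x) - \phi_i^0(x)]\,[\phi_j(y) - \phi_j^0(y)] ,$$

where now $\vec\phi^{\,0}$ also contains the second order term in the source $\vec\varphi$ [see Eq. (35)]. Then, after performing the Gaussian integration over $\vec\phi$, the generating functional (29) reads

$$Z[\vec\varphi] = \exp\left\{\frac{i}{2}\int dx\,dy\;\vec\varphi(x)\cdot[G_\pi^0]^{-1}(x-y)\,\vec\varphi(y)\right\} \times \exp\left\{ i S^B_{\rm eff}[\vec\phi^{\,0}] - \frac12\,\mathrm{Tr}\ln\left[\frac{\delta^2 S^B_{\rm eff}[\vec\phi]}{\delta\phi_i(x)\,\delta\phi_j(y)}\right]_{\vec\phi=\vec\phi^{\,0}}\right\} \tag{39}$$

and the polarization propagator is

$$\Pi_{ij}(x,y) = -\left[\frac{\delta^2}{\delta\varphi_i(x)\,\delta\varphi_j(y)}\left\{ S^B_{\rm eff}[\vec\phi^{\,0}] + \frac{i}{2}\,\mathrm{Tr}\ln\left[\frac{\delta^2 S^B_{\rm eff}[\vec\phi]}{\delta\phi_k(x)\,\delta\phi_l(y)}\right]_{\vec\phi=\vec\phi^{\,0}}\right\}\right]_{\vec\varphi=0} . \tag{40}$$

In the above, the second derivative of the effective action (30) at the order $\vec\phi^{\,2}$ turns out to be

$$\frac{\delta^2 S^B_{\rm eff}[\vec\phi]}{\delta\phi_i(x)\,\delta\phi_j(y)} = \delta_{ij}\,[G_\pi^0]^{-1}(x-y) + \mathrm{Tr}(\Gamma_i\Gamma_j)\,\Pi^0(x,y) + \sum_k\int du\,[\mathrm{Tr}(\Gamma_i\Gamma_j\Gamma_k)\,\Pi^0(x,y,u) + \mathrm{Tr}(\Gamma_j\Gamma_i\Gamma_k)\,\Pi^0(y,x,u)]\,\phi_k(u)$$
$$\qquad + \sum_{k,l}\int du\,dv\,[\mathrm{Tr}(\Gamma_i\Gamma_j\Gamma_k\Gamma_l)\,\Pi^0(x,y,u,v) + \mathrm{Tr}(\Gamma_j\Gamma_i\Gamma_k\Gamma_l)\,\Pi^0(y,x,u,v) + \mathrm{Tr}(\Gamma_i\Gamma_l\Gamma_j\Gamma_k)\,\Pi^0(x,v,y,u)]\,\phi_k(u)\phi_l(v) . \tag{41}$$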
The second term in the right-hand side of Eq. (41) does not affect the calculation of Eq. (40). By substituting Eq. (35) in the equation of motion (34), from the terms of order $\vec\varphi^{\,2}$ one gets for the $B_{ijk}$ functions the following expression:

$$B_{ijk}(x,y,z) = 2\,\mathrm{Tr}(\Gamma_i\Gamma_j\Gamma_k)\int du\,dv\,dt\;\Pi^0(u,v,t)\,G_\pi^{\rm ring}(x-u) \times \left(G_\pi^{\rm ring}[G_\pi^0]^{-1}\right)(v-y)\,\left(G_\pi^{\rm ring}[G_\pi^0]^{-1}\right)(t-z) . \tag{42}$$
There remains now to calculate the logarithm in Eq. (40) up to second order in $\vec\varphi$. One can multiply the generating functional (39) by the factor $(\det G_\pi^0)^{-1/2}$, inessential in the calculation of the polarization propagator (this corresponds to multiplying Eq. (41) by $G_\pi^0$). Then, after calculating Eq. (41) for $\vec\phi = \vec\phi^{\,0}$, with $\vec\phi^{\,0}$ given by Eqs. (35), (37) and (42), we expand the logarithm up to $\vec\varphi^{\,2}$ and take the trace to the same order. This is rather tedious, but, at the end, the differentiation with respect to the external source provides the following total polarization propagator:

$$\Pi_{ij}(x,y) = \delta_{ij}\,\Pi(x,y) ,$$

where

$$\Pi(x,y) = \Pi^{\rm ring}(x,y) + \sum_{kl}\mathrm{Tr}(\Gamma_k\Gamma_l)\int du\,dv\;G_\pi^{\rm ring}(u-v)\,\Pi^0(x,u,y,v)$$
$$\qquad + \sum_{kl}\mathrm{Tr}(\Gamma_k\Gamma_l)\int du\,dv\;G_\pi^{\rm ring}(u-v)\,[\Pi^0(x,u,v,y) + \Pi^0(x,y,v,u)]$$
$$\qquad + \sum_{klmn}\int du\,dv\,dw\,ds\;G_\pi^{\rm ring}(u-w)\,G_\pi^{\rm ring}(v-s)\,\Pi^0(x,u,v) \times [\mathrm{Tr}(\Gamma_k\Gamma_l\Gamma_m\Gamma_n)\,\Pi^0(y,w,s) + \mathrm{Tr}(\Gamma_k\Gamma_l\Gamma_n\Gamma_m)\,\Pi^0(y,s,w)] . \tag{43}$$

We remind the reader that the second derivative of $S^B_{\rm eff}[\vec\phi^{\,\rm ring}]$ and $S^B_{\rm eff}[\vec\phi^{\,0}]$ with respect to the external source, with $\vec\phi^{\,\rm ring}$ [$\vec\phi^{\,0}$] given by Eq. (38) [Eqs. (35), (37) and (42)], gives the same result (the ring polarization propagator) when evaluated at $\vec\varphi = 0$.

The Feynman diagrams corresponding to Eq. (43) are depicted in Fig. 10. Diagram (a) represents the Lindhard function $\Pi^0(x,y)$, which is just the first term of $\Pi^{\rm ring}(x,y)$. In (b) we have an exchange diagram (the thick dashed lines representing ring-dressed pion propagators); (c) and (d) are self-energy diagrams, while in (e) and (f) we show the correlation diagrams of the present approach. The approximation scheme developed here is also referred to as bosonic loop expansion (BLE). The practical rule to classify the Feynman diagrams according to their order in the BLE is to reduce to a point all their fermionic lines and to count the number of bosonic loops left. In this case the diagrams (b)–(f) of Fig. 10 reduce to a one-boson-loop. Diagrams (b)–(d) can be represented by the loop (A) of Fig. 11, while (e) and (f) correspond to the loop (B) of the same figure.
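The counting rule just stated can be sketched as a small graph computation. The snippet is a hedged illustration, not the procedure of Ref. [170]: it assumes that each closed fermion loop is contracted to a single vertex and that the BLE order equals the cycle rank (edges minus vertices plus connected components) of the remaining graph of dressed boson lines. The function name and the diagram encodings are hypothetical.

```python
def boson_loop_order(n_fermion_loops, boson_lines):
    """Number of independent boson loops after contracting each fermion loop to a point.

    n_fermion_loops: number of closed fermion loops (vertices after contraction).
    boson_lines: list of (i, j) pairs, one per dressed boson line joining loops i and j.
    Returns the cycle rank E - V + C of the contracted graph.
    """
    parent = list(range(n_fermion_loops))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in boson_lines:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    components = len({find(i) for i in range(n_fermion_loops)})
    return len(boson_lines) - n_fermion_loops + components


if __name__ == "__main__":
    # Assumed encodings of the topologies described in the text.
    print("Lindhard function (a):", boson_loop_order(1, []))
    print("exchange/self-energy (b)-(d):", boson_loop_order(1, [(0, 0)]))
    print("correlation (e)-(f):", boson_loop_order(2, [(0, 1), (0, 1)]))
    print("ring chain of three bubbles:", boson_loop_order(3, [(0, 1), (1, 2)]))
```

With these encodings a chain of bubbles joined by single boson lines has no boson loop, consistent with the ring series belonging to the mean field level, while both loop (A) and loop (B) topologies count as one boson loop.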
[Figure 10: panels (a)–(f)]
Fig. 10. Feynman diagrams for the polarization propagator of Eq. (43): (a) particle–hole; (b) exchange; (c) and (d) self-energy-type; (e) and (f) correlation diagrams. Only the first contribution to the ring expansion has been drawn. The dashed lines represent ring-dressed pion propagators.

[Figure 11: panels (A) and (B)]
Fig. 11. First order diagrams in the bosonic loop expansion. Diagrams (b)–(d) of Fig. 10 reduce to diagram (A), while (e) and (f) reduce to (B).
The polarization propagator of Eq. (43) is the central result of this microscopic approach, which will be used in the calculation of the $\Lambda$ decay width in nuclear matter. Notice that the model can easily include the excitation of baryonic resonances, by replacing the fermionic field with multiplets. The topology of the diagrams remains the same as in Fig. 10 but, introducing for example the $\Delta$ resonance (as has been done in the calculation of Ref. [170] and paragraph 5.7), each fermionic line represents either a nucleon or a $\Delta$, taking care of isospin conservation. One thus obtains 15 exchange, 14 self-energy and 98 correlation diagrams (see Ref. [190] for the whole diagrammology).
Moreover, since the BLE is characterized by the topology of the diagrams, one can include in the model additional mesonic degrees of freedom, together with phenomenological short range correlations. In particular, the extension to other spin–isospin channels simply amounts to changing the definition of the vertices $\Gamma_i$ in Eq. (43) and the same occurs for the non-relativistic reduction of the theory. Accordingly, for the non-relativistic pion-exchange, $\Gamma_i$ becomes (apart from the coupling constant) $(\vec\sigma\cdot\vec q)\,\tau_i$, for the $\rho$-exchange it reads $(\vec\sigma\times\vec q)_k\,\tau_i$, $k$ being a spatial index, and for the $\omega$-exchange $\Gamma_i \propto (\vec\sigma\times\vec q)_i$. The exchange of $\omega$-mesons is taken into account only inside the one-boson-loop diagrams (b)–(f) of Fig. 10, but not in the mesonic lines stemming from the $\Lambda$ decay vertex, where the exchanged meson is, necessarily, of isovector nature ($\pi$ or $\rho$). Beyond $\pi$, $\rho$ and $\omega$ mesons, the present approach also contains (partly) the exchange of the scalar–isoscalar $\sigma$-meson: indeed, in the phenomenology of the Bonn NN potential [191], the latter is described through box diagrams (which are contained in the correlation diagrams of Fig. 10), namely by the exchange of two pions with the simultaneous excitation of one or both the intermediate nucleons to a $\Delta$ resonance.

A further difficulty arises if one starts from a potential model rather than from a Lagrangian containing bosons as true degrees of freedom. However this difficulty is easily overcome by means of a Hubbard–Stratonovich transformation, which enables one to replace a two-body potential interaction between nucleons with the coupling to a suitably introduced auxiliary field. As an example, for a scalar–isoscalar potential $V$, the relevant identity reads:

$$\exp\left\{\frac{i}{2}\int dx\,dy\;\bar\psi(x)\psi(x)\,V(x-y)\,\bar\psi(y)\psi(y)\right\} = \sqrt{\det V}\int \mathcal{D}[\sigma]\,\exp\left\{\frac{i}{2}\int dx\,dy\;\sigma(x)\,V^{-1}(x-y)\,\sigma(y) + i\int dx\;\bar\psi(x)\psi(x)\,\sigma(x)\right\} ,$$

where $\sigma$ is the auxiliary field.
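The structure of this identity can be checked in its simplest setting. The sketch below is a zero-dimensional, real-exponent (Euclidean-like) analogue and not the Minkowski functional identity itself: a single number x stands in for the bilinear, a positive number v for the potential, and the normalization is the one appropriate to an ordinary Gaussian integral. All names and values are illustrative assumptions.

```python
import math


def quartic_side(x, v):
    """Left-hand side of the toy identity: exp(v x^2 / 2)."""
    return math.exp(0.5 * v * x * x)


def auxiliary_field_side(x, v, half_width=10.0, n_steps=2000):
    """Right-hand side: (2 pi v)^(-1/2) * integral over sigma of exp(-sigma^2/(2v) + sigma x).

    The integrand is a Gaussian centred at sigma = v x with width sqrt(v), so the
    trapezoidal rule on a window of +/- half_width widths is sufficient.
    """
    if v <= 0.0:
        raise ValueError("v must be positive for the Gaussian integral to converge")
    width = math.sqrt(v)
    lower = v * x - half_width * width
    step = 2.0 * half_width * width / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        sigma = lower + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * math.exp(-sigma * sigma / (2.0 * v) + sigma * x)
    return total * step / math.sqrt(2.0 * math.pi * v)


if __name__ == "__main__":
    x, v = 1.2, 0.5  # arbitrary illustrative values
    print(quartic_side(x, v), auxiliary_field_side(x, v))
```

The two sides agree to numerical precision: a term quadratic in x has been traded for a term linear in x coupled to an auxiliary variable with a Gaussian weight set by the inverse of v, which mirrors the role of $V^{-1}$ in the "free" part of the action.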
Clearly, the previous derivation will remain valid, provided one substitutes the inverse propagator of the auxiliary field with the inverse potential in the “free” part of the action. Finally, a relevant point for the feasibility of the calculations is that all fermion loops in Fig. 10 can be evaluated analytically [192], so that each diagram reduces to a 3-dimensional (numerical) integral. In particular, the formalism can be applied to evaluate the functions $U_{L,T}$ of Eq. (14), which are required in Eqs. (12) and (13). In the one-boson-loop (OBL) approximation of Eq. (43) and Fig. 10 we have to replace Eq. (13) with

$$\alpha(q) = \left(S^2 + \frac{P^2}{m_\pi^2}\,\vec q^{\,2}\right)F_\pi^2(q)\,G_\pi^0(q) + \frac{\tilde S^2(q)\,U_1(q)}{1 - V_L(q)\,U_1(q)} + \frac{\tilde P_L^2(q)\,U_1(q)}{1 - V_L(q)\,U_1(q)} + 2\,\frac{\tilde P_T^2(q)\,U_1(q)}{1 - V_T(q)\,U_1(q)} + [\tilde S^2(q) + \tilde P_L^2(q)]\,U_L^{\rm OBL}(q) + 2\,\tilde P_T^2(q)\,U_T^{\rm OBL}(q) , \tag{44}$$

where $U_1 = U^{ph} + U^{\Delta h}$, while $U^{\rm OBL}_{L,T}$ are evaluated from the diagrams 10(b)–(f) using the standard Feynman rules. The normalization of these functions is such that $U^{ph}(x,y) = 4\,\Pi^0(x,y)$, $\Pi^0$ being given by Eq. (32). One relevant difference between the OBL formula (44) and the RPA expression of Eq. (13) lies in the fact that in the former, to be consistent with Eq. (43), the 2p–2h diagrams (which contribute to $U^{\rm OBL}_{L,T}$) are not RPA-iterated.
5.6. Results of the phenomenological calculation

We shall illustrate here and in the following subsection the results which can be obtained for hypernuclear decay widths by employing the two approaches (phenomenological and microscopic) illustrated above. To start with let us consider the PPM combined with the LDA: in order to evaluate the width from Eqs. (18) and (19) one needs to specify the nuclear density and the wave function for the $\Lambda$. The former is assumed to be a Fermi distribution (normalized to the nuclear mass number A):

$$\rho_A(r) = \frac{A}{\frac43\,\pi R^3(A)\left\{1 + \left[\frac{\pi a}{R(A)}\right]^2\right\}}\;\frac{1}{1 + \exp\left[\frac{r - R(A)}{a}\right]} , \tag{45}$$

with radius $R(A) = 1.12\,A^{1/3} - 0.86\,A^{-1/3}$ fm and thickness $a = 0.52$ fm. The $\Lambda$ wave function is obtained from a $\Lambda$–nucleus potential of Woods–Saxon shape, with fixed diffuseness and with radius and depth such that it exactly reproduces the first two single particle eigenvalues (s and p $\Lambda$ levels) of the hypernucleus under analysis.

5.6.1. Short range correlations and $\Lambda$ wave function: $^{12}_{\Lambda}$C

A crucial ingredient in the calculation of the $\Lambda$ decay widths is the short range part of the strong NN and $\Lambda$N interactions. They are expressed by the functions $g_{L,T}(q)$ and $g^\Lambda_{L,T}(q)$ reported in Appendix A and contain the Landau parameters $g'$ and $g'_\Lambda$, respectively. No experimental information is available on $g'_\Lambda$, while many constraints have been set on $g'$, for example by the well known quenching of the Gamow–Teller resonance. Realistic values of $g'$ within the framework of the ring approximation are in the range 0.6–0.7 [182]. However, in the present context $g'$ correlates not only p–h pairs but also p–h with 2p–2h states. In order to fix the correlation parameters in this new context, in Ref. [169] the calculated non-mesonic width of $^{12}_{\Lambda}$C has been compared with the experimental one. In Fig. 12 we see how the total non-mesonic width for carbon depends on the Landau parameters (see footnote 5). The rate decreases as $g'$ increases. This characteristic is well established in RPA [see Eq. (13)].
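The normalization of the Fermi distribution (45) introduced earlier in this subsection can be verified numerically. The sketch below integrates the density over the nuclear volume with a simple trapezoidal rule; the function names, the integration cutoff and the grid are illustrative assumptions. The analytic prefactor of Eq. (45) neglects terms of order exp(-R/a), so the integral is expected to reproduce A only approximately.

```python
import math

THICKNESS_FM = 0.52  # thickness a quoted with Eq. (45)


def nuclear_radius(mass_number):
    """R(A) = 1.12 A^(1/3) - 0.86 A^(-1/3), in fm."""
    return 1.12 * mass_number ** (1.0 / 3.0) - 0.86 * mass_number ** (-1.0 / 3.0)


def fermi_density(r, mass_number, a=THICKNESS_FM):
    """Fermi distribution of Eq. (45), in fm^-3, at radius r (fm)."""
    radius = nuclear_radius(mass_number)
    volume = (4.0 / 3.0) * math.pi * radius ** 3 * (1.0 + (math.pi * a / radius) ** 2)
    return (mass_number / volume) / (1.0 + math.exp((r - radius) / a))


def integrated_mass_number(mass_number, r_max=25.0, n_steps=5000):
    """Trapezoidal estimate of the volume integral of rho_A over a sphere of radius r_max."""
    step = r_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        r = i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * 4.0 * math.pi * r * r * fermi_density(r, mass_number)
    return total * step


if __name__ == "__main__":
    for mass_number in (12, 56, 208):
        print(mass_number, nuclear_radius(mass_number), integrated_mass_number(mass_number))
```

The residual deviation from A is largest for light systems, where the surface thickness is not small compared with the radius.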
Fig. 12. Dependence of the non-mesonic width on the Landau parameters $g'$ and $g'_\Lambda$ for $^{12}_{\Lambda}$C. The experimental value from BNL [93] (KEK [101]) lies in between the horizontal solid (dashed) lines (taken from Ref. [169]).

Footnote 5: The calculations discussed in paragraphs 5.6.1 and 5.6.2 are affected by the following conceptual flaw, which, however, has only slightly altered the numerical results. The decay widths $\Gamma_1$ and $\Gamma_2$ have always been evaluated by using the value of the parameter $C_0^*$ of Eq. (23) corresponding to $g' = 0.615$. However, we do not expect a dramatic variation of $\Gamma_1$ and $\Gamma_2$ due to the change of $C_0^*$ when $g'$ increases up to the value (0.8) which provides the best fit of the data. In fact, both $\mathrm{Re}\,C_0^*$ and $\mathrm{Im}\,C_0^*$ at $g' = 0.8$ are increased by only 10–15% with respect to their values at $g' = 0.6$. Moreover, the analysis of pionic atoms [187] employed to extract the value of $C_0^*$ ($g' = 0.615$) used in the calculations is affected by theoretical approximations and experimental uncertainties.

Moreover, for fixed $g'$, there is a minimum for $g'_\Lambda \simeq 0.4$ (almost independent of the value of $g'$). This is due to the fact that for $g'_\Lambda \lesssim 0.4$ the longitudinal P-wave contribution in Eq. (13) dominates
over the transverse one and the opposite occurs for $g'_\Lambda \gtrsim 0.4$ (we also remind the reader that the S-wave interaction [Eq. (A.5)] is independent of $g'_\Lambda$). Moreover, the longitudinal P-wave $\Lambda N \to NN$ interaction [Eq. (A.3)] contains the pion exchange plus short range correlations, while the transverse P-wave $\Lambda N \to NN$ interaction [Eq. (A.4)] only contains repulsive correlations, so with increasing $g'_\Lambda$ the P-wave longitudinal contribution to the width decreases, while the P-wave transverse part increases.

From Fig. 12 we see that there is a broad range of choices of $g'$ and $g'_\Lambda$ values which fit the “experimental band”: $\Gamma^{\rm exp}_{NM}/\Gamma^{\rm free}_\Lambda = 0.94$–$1.07$. The latter represents decay widths which are compatible with both the BNL [93] and KEK [101] experiments. One should notice that the theoretical curves reported in Fig. 12 contain the contribution of the three-body process $\Lambda NN \to NNN$; should the latter be neglected (ring approximation), then one could get equivalent results with $g'$ values smaller than the ones reported in the figure (typically $\Delta g' \simeq -0.1$). The phenomenology of the $(e,e')$ quasi-elastic scattering suggests, in ring approximation, $g'$ values in the range 0.6–0.7. Here, by taking into account also 2p–2h contributions, “equivalent” $g'$ values larger than in ring approximation are used. From Fig. 12, the experimental band appears to be compatible with $g'$ in the range 0.75–0.85 and $g'_\Lambda$ in the range 0.3–0.5. On the other hand, the new KEK results [102,104,174,177] set an upper limit of about 1.03 for the non-mesonic width, which practically forced us to choose $g' \gtrsim 0.8$ and $g'_\Lambda$ in the above-mentioned interval; considering that $\Gamma_{NM}$ does not change dramatically in this range, $g'_\Lambda = 0.4$ is a reasonable choice.

The new KEK data are: $\Gamma_T/\Gamma^{\rm free}_\Lambda = 1.14 \pm 0.08$ and $\Gamma_{\pi^-}/\Gamma^{\rm free}_\Lambda = 0.113 \pm 0.014$; taking for $\Gamma_{\pi^0}$ the data from [93], $\Gamma_{\pi^0}/\Gamma^{\rm free}_\Lambda = 0.06^{+0.08}_{-0.05}$, and [95], $\Gamma_{\pi^0}/\Gamma_T = 0.174 \pm 0.058$, which gives $\Gamma_{\pi^0}/\Gamma^{\rm free}_\Lambda = 0.198 \pm 0.067$ (the calculations of Refs. [111,112] supply $\Gamma_{\pi^0}$ values which lie in between the above central data), by subtraction from the total width one obtains $\Gamma_{NM}/\Gamma^{\rm free}_\Lambda = 0.97^{+0.11}_{-0.10}$ or $\Gamma_{NM}/\Gamma^{\rm free}_\Lambda = 0.83 \pm 0.11$ for the two choices of $\Gamma_{\pi^0}$. Finally, from Fig. 12 we see that the values compatible with both these intervals (0.87–0.94) require $g' = 0.85$–$0.90$. This argument somewhat enlarges the above considered experimental band of Fig. 12 (0.94–1.07) from below, giving a new interval, 0.87–1.07, whose central value is reproduced by fixing $g' = 0.8$, $g'_\Lambda = 0.4$.

Using these values for the Landau parameters, we illustrate now the sensitivity of the calculation of Ref. [169] to the $\Lambda$ wave function in $^{12}_{\Lambda}$C. In addition to the Woods–Saxon potential (new W–S) that reproduces the s and p $\Lambda$-levels, other choices have also been used. In particular: a harmonic oscillator wave function (H.O.) with an “empirical” frequency $\omega$ [38,39], obtained from the s–p energy shift, the Woods–Saxon wave function of Ref. [37] (Dover W–S) and the microscopic wave function (Micr.) calculated, in Ref. [193], from a non-local self-energy using a realistic $\Lambda$N interaction. The results are shown (in units of the free $\Lambda$ width) in Table 10, where they are compared with the experimental data from BNL [93] and KEK [101,102,174,177].

Table 10
Sensitivity of the decay rates to the $\Lambda$ wave function for $^{12}_{\Lambda}$C (taken from Ref. [169])

                Micr.   Dover W–S   H.O.   New W–S   BNL [93]      KEK [101]     KEK new [102,174,177]
$\Gamma_M$      0.25    0.25        0.26   0.25      0.11 ± 0.27   0.36 ± 0.13   0.31 ± 0.07
$\Gamma_1$      0.69    0.77        0.78   0.82
$\Gamma_2$      0.13    0.15        0.15   0.16
$\Gamma_{NM}$   0.81    0.92        0.93   0.98      1.14 ± 0.20   0.89 ± 0.18   0.83 ± 0.11
$\Gamma_T$      1.06    1.17        1.19   1.23      1.25 ± 0.18   1.25 ± 0.18   1.14 ± 0.08

By construction, the chosen $g'$ and $g'_\Lambda$ reproduce the experimental non-mesonic width using the W–S wave function which gives the right s and p hyperon levels in $^{12}_{\Lambda}$C (column new W–S). We note that it is possible to generate the microscopic wave function of Ref. [193] for carbon via a local hyperon–nucleus W–S potential with radius 2.92 fm and depth −23 MeV. Although this potential reproduces fairly well the experimental s-level for the $\Lambda$ in $^{12}_{\Lambda}$C, it does not reproduce the p-level. A completely phenomenological $\Lambda$–nucleus potential, that can easily be extended to heavier nuclei and reproduces the experimental $\Lambda$ single particle levels as well as possible, has been preferably adopted in Ref. [169].
Except for s-shell hypernuclei, where the experimental data require $\Lambda$–nucleus potentials with a repulsive core at short distances, the $\Lambda$ binding energies have been well reproduced by W–S potentials. The authors of Ref. [169] use a W–S potential with fixed diffuseness (a = 0.6 fm) and adjust the radius and depth to reproduce the s and p $\Lambda$-levels. The parameters of the potential for carbon are R = 2.27 fm and $V_0$ = −32 MeV. To analyze the results of Table 10, we note that the microscopic wave function is substantially more extended than all the other wave functions used in the present study. Dover's parameters [37], namely R = 2.71 fm and $V_0$ = −28 MeV, give rise to a $\Lambda$ wave function that is somewhat more extended than the new W–S one but is very similar to the one obtained from a harmonic oscillator with an empirical frequency $\hbar\omega = 10.9$ MeV. Consequently, the non-mesonic width from Dover's wave function is very similar to the one obtained from the harmonic oscillator and slightly smaller than the new W–S one. The microscopic wave function predicts the smallest non-mesonic widths due to the more extended $\Lambda$ wave function, which explores regions of lower density, where the probability of interacting with one or more nucleons is smaller. From Table 10 we also see that,
against intuition, the mesonic width is quite insensitive to the $\Lambda$ wave function. On this point we remind the reader that the more extended the $\Lambda$ wave function is in r-space, the larger the mesonic width, since the Pauli blocking effects on the emitted nucleon are reduced. However, the integral over the $\Lambda$ momenta in Eq. (19) is weighted by the momentum distribution $|\tilde\psi_\Lambda(\vec k)|^2$, which correspondingly tends to cancel the above-mentioned effect: as a result, $\Gamma_M$ is insensitive to the different wave functions used in the calculation and it is consistent with both the BNL and KEK data. In summary, different (but realistic) $\Lambda$ wave functions give rise to total decay widths which may differ at most by 15%.

5.6.2. Decay widths of light to heavy $\Lambda$-hypernuclei

Using the new W–S wave functions and the Landau parameters $g' = 0.8$ and $g'_\Lambda = 0.4$, in Refs. [169,194] the calculation has been extended to hypernuclei from $^{5}_{\Lambda}$He to $^{208}_{\Lambda}$Pb. We note that, in order to reproduce the experimental s and p levels for the hyperon in the different nuclei one must use potentials with nearly constant depth, around 28–32 MeV, in all but the lightest hypernucleus ($^{5}_{\Lambda}$He). Radii and depths of the employed W–S potentials are quoted in Table 11.

Table 11
W–S parameters (taken from Ref. [169])

$^{A+1}_{\Lambda}$Z      R (fm)   $V_0$ (MeV)
$^{12}_{\Lambda}$C       2.27     −32.0
$^{28}_{\Lambda}$Si      3.33     −29.5
$^{40}_{\Lambda}$Ca      4.07     −28.0
$^{56}_{\Lambda}$Fe      4.21     −29.0
$^{89}_{\Lambda}$Y       5.07     −28.5
$^{139}_{\Lambda}$La     6.81     −27.5
$^{208}_{\Lambda}$Pb     5.65     −32.0

In the case of helium, the $\Lambda$–nucleus mean potential has a repulsive core. For this hypernucleus the most convenient $\Lambda$ wave function turns out to be the one derived in Ref. [46], within a quark model description of $^{5}_{\Lambda}$He. The resulting hypernuclear decay rates are shown in Table 12 [169,194]. We observe that the mesonic rate rapidly vanishes with increasing nuclear mass number A. This is well known and it is related to the decreasing phase space allowed for the mesonic channel, and to smaller overlaps between the $\Lambda$ wave function and the nuclear surface, as A increases. In Fig. 13 the results of Refs.
[169,194] for M (thick solid line) are compared with the ones of Nieves–Oset [111] (dashed line) and Motoba–Itonaga–BandoV [112,113] (solid line), which were obtained within a shell model framework. Also the central values of the available experimental data [93,101,177] are shown. Although the wave function method (WFM) is more reliable than the LDA for the evaluation of the mesonic rates (because of the small energies involved in the decay, which amplify the e7ects of the nuclear shell structure), we see that this LDA calculation agrees with the WFM ones (apart from 12 28 the case of 208 Pb) and with the data. In particular, the results for C and Si are in agreement 12 free free with the recent KEK measurement [177]: M ( C)= = 0:31 ± 0:07, − (28 Si)= = 0:047 ± 40 56 89 0:008. The results for Ca, Fe and Y are in agreement with the old emulsion data (quoted in Ref. [114]), which indicates − =NM (0:5–1) × 10−2 in the region 40 ¡ A ¡ 100. Moreover, the
W.M. Alberico, G. Garbarino / Physics Reports 369 (2002) 1 – 109
65
Table 12 Mass dependence of the hypernuclear weak decay rates A+1 Z
M
1
2
T
5 He 12 C 28 Si 40 Ca 56 Fe 89 Y 139 La 208 Pb
0.60 0.25 0.07 0.03 0.01 6 × 10−3 6 × 10−3 1 × 10−4
0.27 0.82 1.02 1.05 1.12 1.16 1.14 1.21
0.04 0.16 0.21 0.21 0.21 0.22 0.18 0.19
0.91 1.23 1.30 1.29 1.35 1.38 1.33 1.40
Fig. 13. Mesonic width as a function of the nuclear mass number A.The results of Ref. [169,194] (thick solid line) are compared with the calculations of Nieves–Oset [111] (dashed line) and Motoba–Itonaga–BandoV [112,113] (solid line). Available experimental data [93,101,177] are also shown. See text for details on data. free recent KEK experiments [177] obtained the limit: − (56 Fe)= ¡ 0:015. It is worth noticing, in Fig. 13, the rather pronounced oscillations of M in the calculation of Refs. [112,113], which are caused by shell e7ects.
Fig. 14. Partial $\Lambda$ decay widths in finite nuclei as a function of the nuclear mass number A. Experimental data are taken from Refs. [42,43,93,101,102].

Fig. 15. Total $\Lambda$ lifetime in finite nuclei as a function of the nuclear mass number A. Experimental data are taken from Refs. [42,43,93,101,102].
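The entries of Table 12 can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it verifies that $\Gamma_T = \Gamma_M + \Gamma_1 + \Gamma_2$ holds within rounding, evaluates the two-body fraction $\Gamma_2/\Gamma_T$, and converts $\Gamma_T$ into a lifetime through $\tau = \hbar/\Gamma_T$. The free $\Lambda$ lifetime of about 263 ps is an input assumed here and is not quoted in the text; the function names are hypothetical.

```python
# Table 12: widths in units of the free Lambda decay width.
# Tuple order: (Gamma_M, Gamma_1, Gamma_2, Gamma_T).
TABLE_12 = {
    "5He":   (0.60,   0.27, 0.04, 0.91),
    "12C":   (0.25,   0.82, 0.16, 1.23),
    "28Si":  (0.07,   1.02, 0.21, 1.30),
    "40Ca":  (0.03,   1.05, 0.21, 1.29),
    "56Fe":  (0.01,   1.12, 0.21, 1.35),
    "89Y":   (0.006,  1.16, 0.22, 1.38),
    "139La": (0.006,  1.14, 0.18, 1.33),
    "208Pb": (0.0001, 1.21, 0.19, 1.40),
}

# Assumed input (not from the text): free Lambda lifetime in picoseconds.
TAU_FREE_PS = 263.0


def sum_is_consistent(row, tol=0.015):
    """True if Gamma_M + Gamma_1 + Gamma_2 equals Gamma_T within rounding."""
    gamma_m, gamma_1, gamma_2, gamma_t = row
    return abs(gamma_m + gamma_1 + gamma_2 - gamma_t) <= tol


def two_body_fraction(row):
    """Fraction of the total width due to the two-nucleon induced channel."""
    return row[2] / row[3]


def lifetime_ps(row):
    """tau = hbar / Gamma_T, expressed through the assumed free lifetime."""
    return TAU_FREE_PS / row[3]


if __name__ == "__main__":
    for name, row in TABLE_12.items():
        print(f"{name:6s} sum ok: {sum_is_consistent(row)}  "
              f"Gamma_2/Gamma_T = {two_body_fraction(row):.3f}  "
              f"tau = {lifetime_ps(row):.0f} ps")
```

With these numbers the two-body fraction stays between roughly 13% and 16% for all hypernuclei except $^{5}_{\Lambda}$He, which is the behaviour described in the text.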
Coming back to Table 12, we note that, with the exception of $^{5}_{\Lambda}$He, the two-body induced decay is rather independent of the hypernuclear dimension and it is about 15% of the total width. Previous works [167,168] gave more emphasis to this new channel, without, however, reproducing the experimental non-mesonic rates. The total width does not change much with A, as is also shown by experiment. In Fig. 14 the results of Table 12 are compared with recent (after 1990) experimental data for $\Gamma_{NM}$ and $\Gamma_T$ [42,43,93,101,102], while in Fig. 15 the same comparison concerns the total lifetime $\tau = \hbar/\Gamma_T$. The theoretical results are in good agreement with the data over the whole hypernuclear mass range explored. The saturation of the $\Lambda N \to NN$ interaction in nuclei is well reproduced.

5.7. Results of the microscopic calculation

The results presented in this subsection have been obtained by applying the formalism developed in Sections 5.4 and 5.5 for nuclear matter. Although, in principle, one could extend this calculation to finite nuclei through the local density approximation, as in the previous subsection, in practice this would require prohibitive computing times. Indeed the latter are already quite conspicuous for the evaluation of the diagrams of Fig. 10 at fixed Fermi momentum. Hence, in order to compare the results with the experimental data in finite nuclei, different Fermi momenta, fixed on the following basis, have been employed in the calculation for nuclear matter. First we remind the reader that in LDA the local Fermi momentum $k_F^A(r)$ is related to the nuclear density (45) by Eq. (17). For the present purpose, the average, fixed Fermi momentum can be obtained by weighting each local $k_F$ with the probability density of the hyperon in the considered nucleus:

$$\langle k_F \rangle_A = \int d\vec{r}\; k_F^A(\vec{r})\, |\psi_\Lambda(\vec{r})|^2 . \qquad (46)$$

Table 13
Average Fermi momenta for three representative mass regions. The experimental data are in units of the free $\Lambda$ decay rate (taken from Ref. [170])

Mass region: hypernuclei                                   $\Gamma^{\rm exp}_{NM}$      $\langle k_F \rangle$ (fm$^{-1}$)
Medium–light: $^{11}_{\Lambda}$B – $^{12}_{\Lambda}$C       0.94–1.07 [93,101]           1.08
Medium: $^{28}_{\Lambda}$Si – $^{56}_{\Lambda}$Fe           1.20–1.30 [102]              1.2
Heavy: $^{209}_{\Lambda}$Bi – $^{238}_{\Lambda}$U           1.45–1.70 [42,43]            1.36
In Ref. [169] $\psi_\Lambda(\vec{r})$ has been calculated from a $\Lambda$–nucleus Woods–Saxon potential with thickness $a = 0.6$ fm and with radius and depth which reproduce the measured s and p $\Lambda$-levels. It is possible to classify the hypernuclei for which experimental data on the non-mesonic decay rate are available into three mass regions (medium–light: $A \simeq 10$; medium: $A \simeq 30$–$60$; and heavy hypernuclei: $A \gtrsim 200$), as shown in Table 13. The experimental bands include values of the non-mesonic widths which are compatible with the quoted experiments. For medium and heavy hypernuclei the available experimental data actually refer to the total decay rate. However, from experiments and various estimates it turns out that the mesonic width for medium hypernuclei is at most 5% of the total width and rapidly decreases as A increases. Therefore, because of the low precision of the data, one can safely approximate $\Gamma^{\rm exp}_{NM}$ with $\Gamma^{\rm exp}_T$ for medium and heavy systems. In the third column of Table 13 we report the average Fermi momenta obtained with Eq. (46). In the calculations we discuss next we have then used the following average Fermi momenta: $k_F = 1.1$ fm$^{-1}$ for medium–light, $k_F = 1.2$ fm$^{-1}$ for medium and $k_F = 1.36$ fm$^{-1}$ for heavy hypernuclei.

In addition to $k_F$, other parameters enter into the microscopic calculation of hypernuclear decay widths, which are specifically related to the baryon–meson vertices and to the short range correlations. In Ref. [170], with the exception of the Landau parameters $g'$ and $g'_\Lambda$, the values of these parameters have not been left free: rather, they have been kept fixed on the basis of the existing phenomenology (for example in the analysis of quasi-elastic electron–nucleus scattering, spin–isospin nuclear response functions, etc.). For the complete list of these quantities we refer to Ref. [170].
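The averaging of Eq. (46) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the calculation of Ref. [169]: the local Fermi momentum is taken as $k_F(r) = [3\pi^2\rho(r)/2]^{1/3}$ (the standard LDA relation for symmetric nuclear matter, assumed here to be the content of Eq. (17)), the density is a two-parameter Fermi shape with hypothetical parameters, and the $\Lambda$ probability density is replaced by a Gaussian of hypothetical width `b` instead of the Woods–Saxon eigenfunction used in the text.

```python
import math


def fermi_density(r, rho0, radius, a):
    """Two-parameter Fermi density in fm^-3 (hypothetical parameters)."""
    return rho0 / (1.0 + math.exp((r - radius) / a))


def local_kf(rho):
    """Assumed LDA relation: k_F = (3 pi^2 rho / 2)^(1/3), in fm^-1."""
    return (1.5 * math.pi ** 2 * rho) ** (1.0 / 3.0)


def average_kf(radius, a, rho0, b, r_max=20.0, n=4000):
    """Average of k_F(r) weighted by |psi(r)|^2, as in Eq. (46).

    |psi(r)|^2 is modelled by exp(-(r/b)^2); the normalization is
    handled by dividing by the integral of the weight itself.
    """
    h = r_max / n
    numerator = 0.0
    norm = 0.0
    for i in range(n + 1):
        r = i * h
        w = 0.5 if i in (0, n) else 1.0          # trapezoidal weights
        prob = r * r * math.exp(-(r / b) ** 2)   # r^2 |psi|^2 up to a constant
        numerator += w * prob * local_kf(fermi_density(r, rho0, radius, a))
        norm += w * prob
    return numerator / norm


if __name__ == "__main__":
    # Hypothetical heavy nucleus: central density 0.17 fm^-3, radius 6.6 fm.
    for b in (1.5, 3.0, 5.0):
        print(b, round(average_kf(radius=6.6, a=0.5, rho0=0.17, b=b), 3))
```

A compact wave function samples mostly the nuclear interior and gives an average close to the central value of $k_F$, while a more extended one feels the surface and lowers the average; the actual values of Table 13 depend on the realistic wave functions and densities used in Ref. [169].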
An important ingredient in the calculation of the decay rates is the short range part of the $NN$ and $\Lambda N$ strong interactions: in fact, the momenta involved in the non-mesonic processes are very large. These short range correlations can be parameterized with the functions reported in Appendix A. The zero energy and momentum limits of these correlations, $g'$ and $g'_\Lambda$, are considered as free parameters. We remind the reader once again that no experimental constraint is available on $g'_\Lambda$, while in the framework of the ring approximation (namely by neglecting the 2p–2h states in the self-energy), realistic values of $g'$ lie in the range 0.6–0.7 [182]. However, in the present context, $g'$ enters into the one-boson-loop contributions; moreover, in some diagrams [for instance (f) and (g) of Fig. 8] two consecutive $g'$ are “connected” to the same fermionic line, introducing a sort of double counting, which imposes a renormalization of $g'$. In the picture of Figs. 8 and 10 the self-energy acquires an energy and momentum behaviour which cannot be explained and simulated on the basis of the simple ring approximation. Therefore, the physical meaning of the Landau parameters is different in the present scheme with respect to the customary phenomenology. Hence in Ref. [170] $g'$ has been used as a free parameter, to be fixed in order to reproduce the experimental hypernuclear decay rates.

In Fig. 16 we report the real and imaginary parts of the spin-longitudinal (L) and spin-transverse (T) polarization propagators in one-boson-loop approximation, $U_{L,T}(q_0, \vec{q})$ ($U_{L,T} = U_1 + U^{\rm OBL}_{L,T}$), which are needed in Eq. (44), as a function of $q_0$, $\vec{q}$ being related to $q_0$ by the constraint of Eq. (12), $q_0 = k_0 - E_N(\vec{k} - \vec{q}) - V_N$. The Landau parameter $g'$ has been fixed to 0.7 and the Fermi momentum to $k_F = 1.36$ fm$^{-1}$. In (a) and (b) we show real and imaginary parts of $U_1 = U^{ph} + U^{\Delta h}$ (dashed lines) and $U_1 + U^{\rm OBL}_L$ (solid lines), respectively. The former is the sum of the p–h and $\Delta$–h Lindhard functions of Fig. 10(a), the latter has been calculated by adding the OBL diagrams 10(b)–(f). For reasons related to the technique employed in the numerical evaluation of the Feynman diagrams, it is not possible to separate, in the OBL contributions, the imaginary parts which arise from placing on shell p–h and 2p–2h excitations. In (c) and (d), the above quantities are plotted for the spin-transverse channel.

Fig. 16. Polarization propagators in one-boson-loop approximation $U_{L,T}(q_0, \vec{q})$ ($U_{L,T} = U_1 + U^{\rm OBL}_{L,T}$) of Fig. 10 and Eq. (44) as a function of $q_0$, with $q_0 = k_0 - E_N(\vec{k} - \vec{q}) - V_N$. In the figure $g' = 0.7$ and $k_F = 1.36$ fm$^{-1}$.

As discussed in the previous subsection, for fixed $g'$ the non-mesonic width (the total width in nuclear matter, where $\Gamma_M = 0$) has a minimum as a function of $g'_\Lambda$, which is almost independent of the value of $g'$ (see Fig. 12). This characteristic does not depend on the set of diagrams taken into account in the calculation, but it is simply due to the interplay between the longitudinal and transverse parts of the P-wave $\Lambda N \to NN$ potential [$\tilde{P}_L$ and $\tilde{P}_T$ functions of Eqs. (13) and (44)]. Thus, also in the microscopic calculation the minimum of $\Gamma_{NM}$ is obtained for $g'_\Lambda \simeq 0.4$. Fixing $g'_\Lambda = 0.4$, in ring approximation one can reproduce the experimental decay rates by using $g'$ values which are compatible with the existing literature. In Fig. 17 we show, as a function of $g'$ (for $g'_\Lambda = 0.4$), the calculated non-mesonic decay widths (in units of the free $\Lambda$ width) for the three mass regions of Table 13. The thick solid curves refer to the one-boson-loop approximation of Eq. (44) and Fig. 10, while the dot-dashed curves are obtained through an RPA iteration of both the particle–hole and the one-boson-loop diagrams, namely by using Eq. (13). However, we remind the reader that only the former approximation has a theoretically founded basis, in line with the semiclassical scheme introduced in Section 5.5; moreover, this “inconsistent” RPA calculation has the tendency to overestimate, in the acceptable range of $g'$ values, the experimental non-mesonic widths.
The dashed lines represent the pure ring approximation. The calculated widths are compatible with the experimental bands for the $g'$ values reported in Table 14. As we have already noticed, the intervals corresponding to the ring calculation are in agreement with the phenomenology of other processes, like the $(e, e')$ quasi-elastic scattering. However, only the full calculation (column OBL) allows for a good description (keeping the same $g'$ value) of the rates in the whole range of $k_F$ considered here. In Fig. 18 we see the dependence of the non-mesonic widths on the Fermi momentum. The solid lines correspond to the one-loop approximation, with $g' = 0.7, 0.8, 0.9$ from the top to the bottom, while the dashed lines refer to the ring approximation, with $g' = 0.5, 0.6, 0.7$, again from the top to the bottom. We can then conclude that for the one-loop calculation the best choice for the Landau parameters is the following:

$$g' = 0.8, \qquad g'_\Lambda = 0.4 .$$
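The statement that only the OBL calculation describes all three mass regions with a single $g'$ can be read off Table 14 by intersecting the allowed intervals. The following sketch does this; the entry “$\gtrsim 0.75$” is modelled as a half-open interval starting at 0.75, which is a simplification made here, and the function name is hypothetical.

```python
import math

# Table 14: g' intervals compatible with the experiments,
# listed for k_F = 1.1, 1.2 and 1.36 fm^-1.
TABLE_14 = {
    "OBL":  [(0.75, math.inf), (0.75, 0.90), (0.70, 1.00)],
    "Ring": [(0.45, 0.65), (0.55, 0.65), (0.65, 0.75)],
}


def common_interval(intervals):
    """Intersection of closed intervals, or None if it is empty."""
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    return (low, high) if low <= high else None


if __name__ == "__main__":
    for model, intervals in TABLE_14.items():
        print(model, common_interval(intervals))
```

For the OBL column the common range is 0.75–0.90, which contains the adopted value $g' = 0.8$; for the ring column the three intervals share only the boundary value 0.65.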
This parameterization turns out to be the same as the one employed in Section 5.6. However, we must point out that in the other calculation the 2p–2h contributions in the self-energy have been evaluated by using a phenomenological parameterization of the pion–nucleus optical potential. Here we are considering a microscopic evaluation of all the relevant diagrams which contribute at the one-boson-loop level: it is evident that the role played by the Landau parameters is different in the two approaches.

In order to compare the results of the phenomenological and the microscopic approaches, it is appropriate to consider the former as obtained with constant density rather than in LDA. In Table 15 we show the comparison between the one-boson-loop approximation (column OBL) and the phenomenological model (column PM) of paragraph 5.4.3 at fixed $k_F$. Both calculations have been carried out with $g' = 0.8$ and $g'_\Lambda = 0.4$, and reproduce the data with the same accuracy. For technical reasons, the OBL calculation does not allow one to identify precisely the partial rates $\Gamma_1$ and $\Gamma_2$ which contribute to the total $\Gamma_{NM} = \Gamma_1 + \Gamma_2$. In fact, one cannot separate in the imaginary parts of the diagrams (b)–(f) of Fig. 10 the contributions coming from cuts on p–h and 2p–2h states, and hence the partial width ($\Gamma_2$) stemming from the two-nucleon induced decay. The values listed in the table for $\Gamma_2^{\rm OBL}$ have been obtained from the total imaginary part of the diagrams 10(b)–(f) [namely from the last two terms in the right-hand side of Eq. (44)]. In this approximation, $\Gamma_1^{\rm OBL} = \Gamma^{\rm ring}$ [second, third and fourth terms in the right-hand side of Eq. (44)]. As a matter of fact, one would expect $\Gamma_2$ to increase with $k_F$ (and this is the case for the PM calculation), but the OBL results do not follow this expectation. From the study of the $g'$-dependence of $\Gamma_2^{\rm OBL}$, which has not been discussed here, the only reasonable conclusion we can draw on the two-body induced processes in OBL approximation is that for $1.1\ {\rm fm}^{-1} \lesssim k_F \lesssim 1.36\ {\rm fm}^{-1}$ and $g' \simeq 0.8$, $\Gamma_2/\Gamma^{\rm free}_\Lambda = 0.1$–$0.3$, in agreement with the results of Table 12 obtained for finite nuclei with the phenomenological model in LDA.

Fig. 17. Dependence of the non-mesonic width on the Landau parameter $g'$, for $g'_\Lambda = 0.4$. The three plots correspond to the classification of Table 13. The thick solid curves refer to the one-boson-loop approximation of Eq. (44), the dot-dashed ones to the RPA calculation of Eq. (13) and the dashed ones to the ring approximation. The experimental bands of Table 13 lie in between the horizontal solid lines (taken from Ref. [170]).

Table 14
$g'$ values compatible with the experiments (taken from Ref. [170])

          $k_F = 1.1$ fm$^{-1}$    $k_F = 1.2$ fm$^{-1}$    $k_F = 1.36$ fm$^{-1}$
OBL       $\gtrsim 0.75$           0.75–0.90                0.70–1.00
Ring      0.45–0.65                0.55–0.65                0.65–0.75

Fig. 18. Dependence of the non-mesonic width on the Fermi momentum of nuclear matter. The solid curves refer to the one-boson-loop approximation (with $g' = 0.7, 0.8, 0.9$ from the top to the bottom), while the dashed lines refer to the ring approximation ($g' = 0.5, 0.6, 0.7$). The experimental data are also shown (taken from Ref. [170]).

Table 15
Comparison between the one-boson-loop approximation (column OBL) and the phenomenological model (column PM) of paragraph 5.4.3 for $g' = 0.8$, $g'_\Lambda = 0.4$. The decay rates are in units of the free $\Lambda$ width (taken from Ref. [170])

                            $k_F = 1.1$ fm$^{-1}$    $k_F = 1.2$ fm$^{-1}$    $k_F = 1.36$ fm$^{-1}$
                            OBL      PM              OBL      PM              OBL      PM
$\Gamma_1$                  0.82     0.81            1.02     1.00            1.36     1.33
$\Gamma_2$                  0.22     0.13            0.26     0.18            0.19     0.26
$\Gamma_{NM}$               1.04     0.94            1.28     1.19            1.55     1.59
$\Gamma^{\rm exp}_{NM}$     0.94–1.07                1.20–1.30                1.45–1.70

We conclude by noticing that the PM results at fixed $k_F$ of Table 15 are consistent, when we follow the mass classification of Table 13, with the ones for finite nuclei obtained with the same model in LDA and presented in Section 5.6 (see Table 16). There is only some disagreement (at the level of 12% on $\Gamma_{NM}$) for $k_F = 1.36$ fm$^{-1}$. This comparison provides an indication of the reliability of using fixed Fermi momenta to simulate the decay in finite nuclei.

Table 16
Comparison between the phenomenological model for finite nuclei (column LDA) and at fixed $k_F$ (column PM). The decay rates are in units of the free $\Lambda$ width and for $g' = 0.8$, $g'_\Lambda = 0.4$

                            $k_F = 1.1$ fm$^{-1}$    $k_F = 1.2$ fm$^{-1}$      $k_F = 1.36$ fm$^{-1}$
                            LDA      PM              LDA          PM            LDA      PM
$\Gamma_1$                  0.82     0.81            1.02–1.12    1.00          1.21     1.33
$\Gamma_2$                  0.16     0.13            0.21         0.18          0.19     0.26
$\Gamma_{NM}$               0.98     0.94            1.23–1.33    1.19          1.40     1.59
$\Gamma^{\rm exp}_{NM}$     0.94–1.07                1.20–1.30                  1.45–1.70

6. The $\Gamma_n/\Gamma_p$ puzzle

6.1. Introduction

The most relevant open problem in the study of the weak hypernuclear decay is to understand, theoretically, the large experimental values of the ratio $\Gamma_n/\Gamma_p$. Actually, the large experimental uncertainties involved in the extraction of the ratio do not allow one to reach any definitive conclusion. The data are quite limited and not precise due to the difficulty of detecting the products of the non-mesonic decays, especially the neutrons. Moreover, up to now it has not been possible to distinguish between nucleons produced by the one-body induced and the (non-negligible) two-body induced decay mechanism.

The polarization propagator method used to obtain the results discussed in Sections 5.6 and 5.7 does not distinguish between neutron- and proton-induced processes, but makes an “average” over these reactions. However, within a $\Lambda N \to NN$ OPE model, a simple counting of the isospin factors in the diagrams contributing to the non-mesonic width at lowest order (low density limit) and for a $\Lambda$ at rest gives [55] $\Gamma_n/\Gamma_p \simeq N/(14 Z)$ for $N, Z \lesssim 10$ ($N$ and $Z$ are the number of neutrons and protons of the hypernucleus, respectively) when only the (dominating) parity-conserving part of the $\Lambda\pi N$ vertex is taken into account. For heavier systems, a nearly constant ratio ($\simeq 1/14$) is expected, as a result of the saturation of the $\Lambda n \to nn$ and $\Lambda p \to np$ interactions. The inclusion of the $\Lambda\pi N$ parity-violating term tends to increase the OPE ratio [128]. As we have seen in Section 4, more refined calculations in OPE agree with the previous naive expectation, with values in the interval

$$\left[\frac{\Gamma_n}{\Gamma_p}\right]^{\rm OPE} \simeq 0.05\text{–}0.20 \qquad (47)$$

for all the considered systems.
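The naive isospin counting quoted above, $\Gamma_n/\Gamma_p \simeq N/(14 Z)$, is easy to evaluate for specific hypernuclei. The sketch below is only an illustration of that estimate; the function name and the choice of examples are not from the original text.

```python
def ope_ratio_estimate(n_neutrons, n_protons):
    """Lowest-order OPE counting estimate Gamma_n/Gamma_p ~ N / (14 Z).

    N and Z are the numbers of neutrons and protons of the hypernucleus;
    the text quotes this estimate for N, Z up to about 10.
    """
    return n_neutrons / (14.0 * n_protons)


if __name__ == "__main__":
    # Lambda-5He has N = 2, Z = 2; Lambda-12C has N = 5, Z = 6.
    for name, n, z in (("5He", 2, 2), ("12C", 5, 6)):
        print(name, round(ope_ratio_estimate(n, z), 3))
```

Both examples give values of order 0.06–0.07, at the lower end of the interval of Eq. (47) and far below the experimental ratios discussed in this section.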
The small OPE ratios are due to the $\Delta I = 1/2$ rule, which fixes the vertex ratio $V_{\Lambda\pi^- p}/V_{\Lambda\pi^0 n} = -\sqrt{2}$ (both in S- and P-wave interactions), and to the particular form of the OPE potential, which has a strong tensor and weak central and parity-violating components: the large tensor transition $\Lambda N(^3S_1) \to NN(^3D_1)$ requires, in fact, $I = 0$ $np$ pairs in the antisymmetric final state. In p-shell and heavier hypernuclei the relative $\Lambda N$ $L = 1$ state is found to give only a small contribution to tensor transitions for the neutron-induced decay, so it cannot improve the ratio (47). The contribution of the $\Lambda N$ $L = 1$ relative state to $\Gamma_n + \Gamma_p$ seems to be of about 5–15% in p-shell hypernuclei [119–121]. For these systems we expect the dominance of the S-wave interaction in the initial state, due to the small $\Lambda N$ relative momentum. By using again a simple argument about the isospin structure of the $\Lambda N \to NN$ interaction in OPE, it is possible to estimate that for pure $\Delta I = 3/2$ transitions (for which $V_{\Lambda\pi^- p}/V_{\Lambda\pi^0 n} = 1/\sqrt{2}$) the OPE ratio is increased by a factor $\simeq 2.5$ with respect to the value obtained for pure $\Delta I = 1/2$ transitions. On the other hand, the OPE model with $\Delta I = 1/2$ couplings has been able to reproduce the one-body stimulated non-mesonic rates $\Gamma_1 = \Gamma_n + \Gamma_p$ for s- and p-shell hypernuclei [120–124]. Hence, the problem rather consists in overestimating the proton-induced rate and underestimating the neutron-induced one.

Other ingredients beyond the OPE might be responsible for the large experimental ratios. A few calculations with $\Lambda N \to NN$ transition potentials including heavy-meson-exchange or direct quark contributions have improved the situation, without providing, nevertheless, a satisfactory theoretical explanation of the puzzle: very recent evaluations showed the importance of both K-meson-exchange [121,128,134] and direct quark mechanism [134] to obtain larger ratios. The tensor component of K-exchange has opposite sign with respect to the one for $\pi$-exchange, resulting in a reduction of $\Gamma_p$. The parity violating $\Lambda N(^3S_1) \to NN(^3P_1)$ transition, which contributes to both the n- and p-induced processes, is considerably enhanced by K-exchange and direct quark mechanism and tends to increase $\Gamma_n/\Gamma_p$ [121,134].
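The two vertex ratios quoted above, $-\sqrt{2}$ for $\Delta I = 1/2$ and $1/\sqrt{2}$ for pure $\Delta I = 3/2$, follow from isospin Clebsch–Gordan coefficients. The sketch below assumes the usual picture in which the weak vertex couples the $\pi N$ pair to total isospin $I = 1/2$ (or $I = 3/2$) with $I_3 = -1/2$, and it hardcodes the standard $1 \otimes 1/2$ coefficients in the Condon–Shortley convention; signs may differ under other phase conventions, and the names used are hypothetical.

```python
import math
from fractions import Fraction

# Clebsch-Gordan coefficients <1 m_pi; 1/2 m_N | I, -1/2> stored as
# (sign, squared magnitude), Condon-Shortley convention.
CG = {
    (Fraction(1, 2), "pi- p"): (-1, Fraction(2, 3)),
    (Fraction(1, 2), "pi0 n"): (+1, Fraction(1, 3)),
    (Fraction(3, 2), "pi- p"): (+1, Fraction(1, 3)),
    (Fraction(3, 2), "pi0 n"): (+1, Fraction(2, 3)),
}


def amplitude(total_isospin, channel):
    """Isospin factor of the vertex for the given pi-N charge channel."""
    sign, squared = CG[(total_isospin, channel)]
    return sign * math.sqrt(squared)


def vertex_ratio(total_isospin):
    """Ratio V(Lambda pi- p) / V(Lambda pi0 n) for a pure isospin transition."""
    return amplitude(total_isospin, "pi- p") / amplitude(total_isospin, "pi0 n")


if __name__ == "__main__":
    print("Delta I = 1/2:", vertex_ratio(Fraction(1, 2)))
    print("Delta I = 3/2:", vertex_ratio(Fraction(3, 2)))
```

The factor $\simeq 2.5$ enhancement of the OPE ratio for pure $\Delta I = 3/2$ transitions quoted in the text requires the full spin–isospin structure of the transition potential and is not reproduced by this counting of vertex factors alone.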
In Table 17 we summarize the calculations that predicted ratios considerably enhanced with respect to the OPE values. Experimental data are given for comparison. Almost all calculations reproduce the observed non-mesonic widths $\Gamma_n + \Gamma_p$, as one can see in Table 18 [we remind the reader that the experimental data also include (at least a part of) the two-body induced channel]: only Parreño and Ramos tend to underestimate the data for $^{12}_{\Lambda}$C, whereas Sasaki et al. overestimate the most accurate experiments for very heavy hypernuclei. Itonaga et al. predict a too small $\Gamma_n/\Gamma_p$. The results of Sasaki et al. for $\Gamma_n/\Gamma_p$ and $\Gamma_n + \Gamma_p$ in $^{5}_{\Lambda}$He are compatible with data, but for nuclear matter the authors underestimate $\Gamma_n/\Gamma_p$ and overestimate $\Gamma_n + \Gamma_p$. The phenomenological fit of Jun et al. reproduces $\Gamma_n/\Gamma_p$ and $\Gamma_n + \Gamma_p$ for $^{5}_{\Lambda}$He and $^{12}_{\Lambda}$C. However, the values of some of the coupling constants of their 4-baryon point interaction, which are required to fit the data, are questionable. Jido et al. give a ratio for $^{12}_{\Lambda}$C compatible with the lower limits of the data. Finally, Parreño and Ramos obtain a ratio compatible with the lower limits of the data for $^{5}_{\Lambda}$He but they underestimate the experiments for $^{12}_{\Lambda}$C. Clearly, there is a variety of situations, sometimes contradictory, which gives a flavour of the difficulties inherent to $\Gamma_n/\Gamma_p$.

Table 17
$\Gamma_n/\Gamma_p$ ratio

Ref. and model                                                      $^{5}_{\Lambda}$He     $^{12}_{\Lambda}$C           Nuclear matter
Sasaki et al. [134] ($\pi + K$ + DQ)                                0.701                                               0.716
Jun et al. [135] (OPE + 4BPI)                                       1.30                   1.14
Itonaga et al. [120] ($\pi + 2\pi/\rho + 2\pi/\sigma$)                                     0.36
Jido et al. [128] ($\pi + K + 2\pi + \omega$)                                              0.53
Parreño–Ramos [121] ($\pi + \rho + K + K^* + \omega + \eta$)        0.343–0.457            0.288–0.341
Exp BNL [93]                                                        $0.93 \pm 0.55$        $1.33^{+1.12}_{-0.81}$
Exp KEK [101]                                                                              $1.87^{+0.67}_{-1.16}$
Exp KEK [176]                                                       $1.97 \pm 0.67$
Exp 1974 [138]                                                                             $0.59 \pm 0.15$
Exp KEK [180,181]                                                                          $1.17^{+0.22}_{-0.20}$       $^{56}_{\Lambda}$Fe: $2.54^{+0.61}_{-0.81}$

6.2. Two-body induced decay and nucleon final state interactions

The analysis of the ratio $\Gamma_n/\Gamma_p$ is influenced by the two-nucleon induced process $\Lambda NN \to NNN$, whose experimental identification is rather difficult and is a challenge for the future. By assuming that the meson produced in the weak vertex is mainly absorbed by an isoscalar $NN$ correlated pair (quasi-deuteron approximation), the three-body process turns out to be $\Lambda np \to nnp$, so that a considerable fraction of the measured neutrons could come from this channel and not only from $\Lambda n \to nn$ and $\Lambda p \to np$. In this way it might be possible to explain the large experimental $\Gamma_n/\Gamma_p$ ratios, which originally have been analyzed without taking into account the two-body stimulated process. Nevertheless, the situation is far from being clear and simple, both from the theoretical and experimental viewpoints. The new non-mesonic mode was introduced in Ref. [167] and its calculation was improved in Ref. [168], where the authors found that the inclusion of the new channel would lead one to extract from the experiment even larger values for the $\Gamma_n/\Gamma_p$ ratios, thus worsening the disagreement with the theoretical estimates. However, in the hypothesis that only two out of the three nucleons coming from the three-body decay are detected, the reanalysis of the experimental data would lead back to smaller ratios [195].

The above hypothesis is plausible for the following reason. The two-body induced decay mode takes place when the pion emitted by the $\Lambda$ vertex is not too far from being on its renormalized mass-shell (on the contrary, the particle–hole region of the in-medium pion excitation spectrum, which contributes to the one-body induced decay, would be quite far from the pionic branch in the medium). It occurs that the pionic branch (which is a delta function on the energy–momentum dispersion relation in free space) is renormalized in the medium and has a width associated to its capacity to excite 2p–2h states. Part of this strength overcomes the Pauli blocking, giving rise to the two-body induced decay. As a consequence of the emission of an “almost on-shell” pion, the nucleon coming out from the $\Lambda$ vertex will have a small kinetic energy ($T_N \simeq 5$ MeV for a rigorously on-shell pion) and hence will be, most probably, below the experimental detection threshold, which was around 30–40 MeV in the experiments quoted in Table 17.
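The estimate $T_N \simeq 5$ MeV for a rigorously on-shell pion can be checked with free two-body decay kinematics, $\Lambda \to p\,\pi^-$ with the $\Lambda$ at rest. The sketch below uses particle masses that are assumed inputs (approximate tabulated values, not quoted in the text) and ignores all medium effects such as binding and the nucleon potential.

```python
import math

# Assumed inputs (approximate tabulated masses, in MeV).
M_LAMBDA = 1115.683
M_PROTON = 938.272
M_PION_MINUS = 139.570


def two_body_momentum(m_parent, m1, m2):
    """Momentum of either decay product in the parent rest frame (MeV/c)."""
    s = m_parent ** 2
    return math.sqrt((s - (m1 + m2) ** 2) * (s - (m1 - m2) ** 2)) / (2.0 * m_parent)


def kinetic_energy(momentum, mass):
    """Relativistic kinetic energy T = sqrt(p^2 + m^2) - m (MeV)."""
    return math.sqrt(momentum ** 2 + mass ** 2) - mass


if __name__ == "__main__":
    p = two_body_momentum(M_LAMBDA, M_PROTON, M_PION_MINUS)
    t_nucleon = kinetic_energy(p, M_PROTON)
    print(f"p = {p:.1f} MeV/c, T_N = {t_nucleon:.2f} MeV")
```

The nucleon momentum is about 100 MeV/c and its kinetic energy a little above 5 MeV, well below the 30–40 MeV detection thresholds mentioned above.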
Table 18
Non-mesonic width $\Gamma_n + \Gamma_p$ (in units of $\Gamma^{\rm free}_\Lambda$)

Ref. and model                                                      $^{5}_{\Lambda}$He     $^{12}_{\Lambda}$C     Nuclear matter
Sasaki et al. [134] ($\pi + K$ + DQ)                                0.519                                         2.456
Jun et al. [135] (OPE + 4BPI)                                       0.426                  1.174
Itonaga et al. [120] ($\pi + 2\pi/\rho + 2\pi/\sigma$)                                     1.05
Jido et al. [128] ($\pi + K + 2\pi + \omega$)                                              0.769
Parreño–Ramos [121] ($\pi + \rho + K + K^* + \omega + \eta$)        0.317–0.425            0.554–0.726
Exp BNL [93]                                                        $0.41 \pm 0.14$        $1.14 \pm 0.20$
Exp CERN [42]                                                                                                     $\bar{p}$ + Bi: $1.46^{+1.83}_{-0.52}$
                                                                                                                  $\bar{p}$ + U: $2.02^{+1.74}_{-0.63}$
Exp KEK [101]                                                                              $0.89 \pm 0.18$
Exp KEK [176]                                                       $0.50 \pm 0.07$
Exp COSY [43]                                                                                                     p + Bi: $1.63^{+0.19}_{-0.14}$
Exp KEK [102,174]                                                                          $0.83 \pm 0.11$        $^{56}_{\Lambda}$Fe: $1.22 \pm 0.08$
Exp COSY [141]                                                                                                    p + Au: $2.02^{+0.56}_{-0.35}$
Exp COSY [142]                                                                                                    p + U: $1.91^{+0.28}_{-0.22}$
These observations show that $\Gamma_n/\Gamma_p$ is sensitive to the detection threshold and to the detailed kinematics of the process. For instance, the calculated energy spectra of the emitted nucleons clearly display the above statement about the slow nucleon emitted in the weak vertex [196]; their calculation also requires a careful treatment of the nucleon final state interactions. In Ref. [196] the nucleon energy distributions have been calculated by using a Monte Carlo simulation to describe the nucleons' rescattering inside the nucleus: the ratio $\Gamma_n/\Gamma_p$ has been taken as a free parameter and extracted by comparing the simulated spectra with the experimental data. The momentum distributions of the primary nucleons were determined within the polarization propagator scheme discussed in Section 5.4. On their way out of the nucleus, the nucleons, due to the collisions with other nucleons, continuously change energy, direction and charge, and secondary nucleons are emitted as well. Then, the energy distribution of the observable nucleons, which also lose energy through interactions with the experimental set-up, is different from the one at the level of the primary nucleons.

The shape of the proton spectrum obtained in Ref. [196] is sensitive to the ratio $\Gamma_n/\Gamma_p$. The protons from the three-nucleon mechanism $\Lambda NN \to NNN$ appear mainly at low energies, while, for $^{12}_{\Lambda}$C, those from the one-nucleon stimulated process peak around 75 MeV. Since the experimental spectra show a fair amount of protons in the low energy region, they would favour a relatively larger two-body induced decay rate and/or a reduced number of protons from the one-body induced process. Consequently, for $\Gamma_2 = 0.27\,\Gamma^{\rm free}_\Lambda$ the authors of Ref. [196] found that the experimental spectra of Refs. [93,138] were compatible with values of $\Gamma_n/\Gamma_p$ around 3 for $^{12}_{\Lambda}$C, in strong contradiction with the theoretical predictions. However, by using available data on the total number of emitted neutrons and protons, the same calculation shows that the experimental error bars on $\Gamma_n/\Gamma_p$ are increased by the inclusion of the three-body channel, leading to values which, within one standard deviation, can be even compatible with the OPE values. In Ref. [196] the convenience of measuring the number of outgoing protons per decay event was also pointed out. This observable, which can be measured from delayed fission events in the decay of heavy hypernuclei, gives a more reliable neutron to proton ratio and it is less sensitive to the details of the intra-nuclear cascade calculation determining the final shape of the spectra.

The excellent agreement of the calculations discussed in Section 5.6 with the experimental total non-mesonic decay rates made it worthwhile to explore again the predictions for the nucleon spectra [169]. The question is whether the model used in Section 5.6 affects the momentum distribution of the primary emitted nucleons strongly enough to obtain good agreement with the experimental proton spectra without requiring very large values for $\Gamma_n/\Gamma_p$. The nucleon spectra from the decay of several hypernuclei have thus been generated by using the Monte Carlo simulation of Ref. [196]. The spectra obtained for various values of $\Gamma_n/\Gamma_p$, used again as a free parameter, are compared in Fig. 19 (Fig. 20) with the data from the BNL experiment of Ref. [93] (from Ref. [138]). We remark that, although the total non-mesonic widths are smaller than those of Ref. [196] by about 35%, the resulting nucleon spectra, once they are normalized to the same non-mesonic rate, are practically identical. The reason is that the ratio $\Gamma_2/\Gamma_1$ of two-body induced versus one-body induced decay rates is essentially the same in both models (between 0.2 and 0.15 from medium to heavy hypernuclei), and the momentum distributions for the primary emitted protons are also very similar. As a consequence, the conclusions drawn in Ref. [196] still hold and the new calculation also favours very large values of $\Gamma_n/\Gamma_p$ when compared with experimental spectra.

Fig. 19. Proton spectrum from the decay of $^{12}_{\Lambda}$C for different values of $\Gamma_n/\Gamma_p$. The experimental data are from Ref. [93] (taken from Ref. [169]). [Plot: counts (arb. units) versus $T_p$ (MeV); curves for $\Gamma_n/\Gamma_p = 0.1, 1.0, 2.0, 3.0$.]

Fig. 20. Proton spectrum from the decay of $^{12}_{\Lambda}$C for different values of $\Gamma_n/\Gamma_p$. The experimental data are taken from Ref. [138]. [Plot: counts (arb. units) versus $T_p$ (MeV); curves for $\Gamma_n/\Gamma_p = 0.1, 1.0, 2.0, 3.0$.]

On the basis of the above considerations, the origin of the discrepancy between theory and experiment for $\Gamma_n/\Gamma_p$ is far from being resolved. On the theoretical side, there is still room for improving the numerical simulation of the nuclear final state interactions: Coulomb distortion, multiple scattering and the evaporating process should be incorporated in the calculation. In particular, multiple scattering and the evaporating process are important ingredients, which increase the nucleon spectra at low energies. On the experimental side, although more recent spectra are available [101,197], they have not been corrected for the energy losses inside the target and detector as well as for the geometry of the detector, so a direct comparison with theoretical predictions is not possible. Attempts to incorporate these corrections by combining a theoretical model for the nucleon rescattering in the nucleus with a simulation of the interactions in the experimental set-up have been made at KEK [180,181,198]. The results reported in Refs. [180,181] show that $\Gamma_n/\Gamma_p$ increases with the hypernucleus mass number, with values in the range 1–3 for $^{12}_{\Lambda}$C, $^{28}_{\Lambda}$Si and $^{56}_{\Lambda}$Fe. In the next paragraph we shall discuss in detail these recent results. A decisive forward step towards a clean extraction of $\Gamma_n/\Gamma_p$ would be obtained if the nucleons from the different non-mesonic processes, $\Lambda N \to NN$ and $\Lambda NN \to NNN$, were disentangled.
Through the measurement of coincidence spectra and angular correlations of the outgoing nucleons, it could be possible, in the near future, to split the non-mesonic decay width into its two components 1 and 2 [96,139,199] and obtain a more precise and direct measurement of n =p . We shall discuss this important point more extensively in paragraph 6.2.3. By using a simple argument about the detection e6ciency in coincidence measurements, the authors of Ref. [139] evaluated the inFuence
78
W.M. Alberico, G. Garbarino / Physics Reports 369 (2002) 1 – 109
Fig. 21. Proton energy spectra measured at KEK-E307 (taken from Ref. [180]).
of the final state interactions and of the two-body induced process on $\Gamma_{NM}$ to be (15 ± 15)% for s-shell hypernuclei. According to the calculation reported in Table 12, for $^{5}_{\Lambda}$He the effect of the two-body stimulated decay alone is $\Gamma_2/\Gamma_{NM} = 13\%$.

6.2.1. Recent experimental spectra

Very recently, at KEK-E307 [180,181], the proton spectra for $^{12}_{\Lambda}$C, $^{28}_{\Lambda}$Si and $^{56}_{\Lambda}$Fe have been measured and compared with theoretical simulations of the intra-nuclear cascades after the weak processes, obtained with the MC code of Ref. [196]. Corrections for the detector geometry and the nucleonic interactions inside the target and detector materials have also been implemented, through a GEANT MC code. The proton energy spectra have been measured by means of a coincidence counter system that identifies the instant of hypernuclear production through the detection of the kaon emitted in the $n(\pi^+, K^+)\Lambda$ production reaction. In Fig. 21 the spectra obtained for $^{12}_{\Lambda}$C and $^{28}_{\Lambda}$Si are shown. The vertical axes are normalized to the number of protons per hypernuclear non-mesonic decay. The results of KEK-E307 supply a $\Gamma_n/\Gamma_p$ ratio, again estimated by fitting the proton spectra, which increases with the mass number [180,181]:

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{12}_{\Lambda}{\rm C}\right) = 1.17^{+0.22}_{-0.20}, \qquad
\frac{\Gamma_n}{\Gamma_p}\left(^{28}_{\Lambda}{\rm Si}\right) = 1.38^{+0.30}_{-0.27}, \qquad
\frac{\Gamma_n}{\Gamma_p}\left(^{56}_{\Lambda}{\rm Fe}\right) = 2.54^{+0.61}_{-0.81}, \tag{48}
$$
the last one being preliminary. Because of the non-zero experimental energy threshold for proton detection, the fits for $\Gamma_n/\Gamma_p$ turn out to be only slightly sensitive to the two-body induced process, although this mechanism gives a non-negligible contribution to the non-mesonic rate: for $^{12}_{\Lambda}$C and $^{28}_{\Lambda}$Si, the central values of $\Gamma_n/\Gamma_p$ are reduced by about 16% when the two-body induced process is taken into account, with results that are in any case compatible within the error bars in both descriptions [180]. The fits including the two-nucleon stimulated decay have been performed by using $\Gamma_2/(\Gamma_1 + \Gamma_2) \simeq 0.3$ as input. This small effect of the two-nucleon stimulated decay is principally due to the rather large value of the proton energy detection threshold ($E_{\rm Th} \simeq 40$ MeV) with respect to the average energy of the protons from the process $\Lambda NN \to NNN$. The results reported in Eq. (48) and Fig. 21 refer to the analysis in which only the one-nucleon induced process is taken into account. The fits which included the two-nucleon induced processes lead to $\Gamma_n/\Gamma_p(^{12}_{\Lambda}{\rm C}) = 0.96^{+0.24}_{-0.23}$ and $\Gamma_n/\Gamma_p(^{28}_{\Lambda}{\rm Si}) = 1.18^{+0.33}_{-0.31}$. At the present level of precision, the observation of signals from the two-body induced decay is thus impossible. However, the accuracy of the new KEK measurements made it possible to reduce the error bars significantly with respect to the previous experiments (see data listed in Table 17): this excludes neutron to proton ratios smaller than 0.73 (0.50) at the $1\sigma$ ($2\sigma$) level for $^{12}_{\Lambda}$C, in the analysis including the two-nucleon induced process.

We want to make the following remark on the mass dependence of the KEK-E307 results. The ratio $\Gamma_n/\Gamma_p$ increases sizeably in going from $^{28}_{\Lambda}$Si [180] to $^{56}_{\Lambda}$Fe [181] (we remind the reader that the data for iron are only preliminary). This is in disagreement with the well known behaviour of the $\Lambda N \to NN$ interaction in nuclei, namely its saturation for large mass numbers.
In fact, should we estimate, as in paragraph 5.6.2, the mesonic rate for $^{28}_{\Lambda}$Si and $^{56}_{\Lambda}$Fe to be 0.07 and 0.01, respectively (here and in the following the decay rates are in units of $\Gamma^{\rm free}_{\Lambda}$), and use the total decay rates measured in the same experiments [102], then for the central values of the non-mesonic decay width we would obtain $\Gamma_{NM}(^{28}_{\Lambda}{\rm Si}) = \Gamma_{NM}(^{56}_{\Lambda}{\rm Fe}) = 1.21$. This value, together with the ratios of Eq. (48), provides $\Gamma_n(^{28}_{\Lambda}{\rm Si}) = 0.70$, $\Gamma_p(^{28}_{\Lambda}{\rm Si}) = 0.51$, $\Gamma_n(^{56}_{\Lambda}{\rm Fe}) = 0.87$ and $\Gamma_p(^{56}_{\Lambda}{\rm Fe}) = 0.34$. As a consequence, $\Gamma_n$ and $\Gamma_p$ do not follow the saturation behaviour (the contrary occurs for the observed total rate $\Gamma_{NM} = \Gamma_n + \Gamma_p$), which predicts n- and p-stimulated rates increasing with N and Z, respectively, and saturating for $N, Z \simeq 10$: $\Gamma_n/\Gamma_p \simeq \Gamma^{\rm sat}_n/\Gamma^{\rm sat}_p$ for $N, Z \gtrsim 10$. Since one also expects a saturation for the neutron- and proton-induced decay rates separately, this result could signal a systematic error in the experimental analysis employed to extract the $\Gamma_n/\Gamma_p$ ratios.

6.2.2. Possible improvements

As discussed above, the $\Gamma_n/\Gamma_p$ ratios of Eq. (48), extracted from the recently measured proton spectra, confirm the results of previous experiments: the neutron- and proton-induced decay rates are of the same order of magnitude over a large hypernuclear mass number range. However, since the new experiments have significantly improved the quality of the data, small values of $\Gamma_n/\Gamma_p$ (say smaller than 0.7 for $^{12}_{\Lambda}$C) are now excluded with good precision. After having inspected both the experimental procedures and the theoretical models developed until now to determine the ratio, we want to summarize here some interesting features which could
lead to future perspectives:

(1) The proton spectra originating from neutron- and proton-induced processes (and, possibly, from two-nucleon stimulated decays) are added incoherently in the Monte Carlo intra-nuclear cascade calculations used to determine $\Gamma_n/\Gamma_p$. In this way a possible quantum-mechanical interference effect between the two channels is lost. Therefore, in an experiment like KEK-E307, in which only charged particles are detected, one cannot trace an observed proton back to the decay mechanism which produced it. The conclusion of paragraph 6.2.1 about a possible systematic error in the experimental analyses of $\Gamma_n/\Gamma_p$ could then be due to the incorrect procedure of summing incoherently, i.e. in a classical picture, the proton spectra from the $\Lambda n \to nn$ and $\Lambda p \to np$ processes. We think that the consequences of this idea, which we put forward for the first time here, should be explored in future analyses.

(2) The experimental spectra of Fig. 21 do not exhibit a peak around the energy ($\simeq 75$ MeV) which corresponds to the kinematical situation of back-to-back nucleon pairs coming from one-nucleon induced decays. The shape of the spectra just above the 30 MeV detection threshold is quite flat, or even decreasing for increasing energy, and is not well fitted by the simulations used to extract $\Gamma_n/\Gamma_p$. This is in principle due to different, hardly distinguishable, effects: (i) the nucleon energy losses in the nucleus, (ii) the nucleon energy losses in the target and detector materials, and (iii) the relevance of the two-nucleon stimulated non-mesonic decay. Because of the present level of experimental accuracy, an analysis which takes into account the two-nucleon induced decay alone cannot improve the comparison between experiment and theory. It would then be advisable to explore the effects of stronger nucleon final state interactions on the simulations used to determine the ratio from the experimental spectra.
(3) The direct observation of three-body emission events is quite difficult and up to now no signal has been found. The calculated nucleon spectra [196] for this channel present a maximum at energies below the detection threshold, and only a fraction (about 40% for $\Gamma_n/\Gamma_p = 1$ and $E_{\rm Th} = 40$ MeV) of the nucleons from three-body emission can be detected. Moreover, for $E > E_{\rm Th}$ the nucleon distribution from one-body induced processes is superimposed on the previous one. The spectrum simulated for the two-nucleon emission dominates and, for $^{12}_{\Lambda}$C, peaks at an energy ($\simeq 75$ MeV) that corresponds to the situation in which the two nucleons come out back-to-back. This observation shows that the separation of the nucleons from the two non-mesonic channels is only possible by angular correlation measurements. In Ref. [139] the authors studied how the back-to-back kinematics is able to select the one-nucleon induced process, and NN coincidence measurements (of energies and angular distributions) are expected in the near future at DaΦne [96], KEK [199] and BNL [200].

(4) In order to disentangle the two-nucleon induced decay events from the one-nucleon induced ones, the direct observation of the outgoing neutrons is thus needed. Neutron spectra can be measured down to about 10 MeV kinetic energy, since they are less affected than the proton ones by energy losses in the target and detector materials. The joint observation of proton and neutron spectra could then help to disentangle the set-up material effects from the nucleon final state interactions occurring inside the residual nucleus. A very recent experiment, KEK-E369, measured neutron spectra from $^{12}_{\Lambda}$C and $^{89}_{\Lambda}$Y non-mesonic decays [201]. A preliminary analysis of the data is consistent with a $\Gamma_n/\Gamma_p$ ratio in the range 0.5–1 for $^{12}_{\Lambda}$C, obtained through a new intra-nuclear calculation. The newly developed Monte Carlo code with the same range of $\Gamma_n/\Gamma_p$ values also
seems to be able to reproduce the $^{12}_{\Lambda}$C proton spectrum observed at KEK-E307. These analyses have been performed by neglecting the contribution of the two-nucleon induced decay.

(5) In our opinion, the key to avoiding the possible deficiencies of the single-nucleon spectra measurements discussed in point (1) will be to employ coincidence detection of the final nucleons. Only such a procedure leads to a direct and unambiguous determination of $\Gamma_n/\Gamma_p$. In the experiment KEK-E462 [199], an angular and energy correlation measurement will study the decay of $^{5}_{\Lambda}$He hypernuclei. Very low detection thresholds ($\simeq 10$ MeV for neutrons and $\simeq 20$ MeV for protons) and improved statistics with respect to previous measurements will be used. A light, spin–isospin saturated hypernucleus such as $^{5}_{\Lambda}$He has the advantage that the nuclear final state interactions are considerably reduced. The use of low nucleon threshold energies will make it possible to observe essentially all the final state interaction effects. Another experiment, at BNL [200], will measure $\Gamma_n/\Gamma_p$ for $^{4}_{\Lambda}$H, again by nn and np coincidence measurements.

6.2.3. Potentialities of coincidence experiments

We concentrate here on the potentialities of future experiments employing double coincidence nucleon detection. The purpose is to stress the importance of this kind of measurement both for the solution of the $\Gamma_n/\Gamma_p$ puzzle and for the observation of two-nucleon stimulated decay events. A simplistic analysis supplies the following expressions for the numbers of detected neutrons, $N_n = N_n^{1B} + N_n^{2B}$, and protons, $N_p = N_p^{1B} + N_p^{2B}$, in terms of the non-mesonic partial decay widths:

$$
N_n^{1B} = \epsilon_n \Omega_n R_n \frac{2\Gamma_n + \Gamma_p}{\Gamma_T} N, \qquad
N_p^{1B} = \epsilon_p \Omega_p R_p \frac{\Gamma_p}{\Gamma_T} N,
$$
$$
N_n^{2B} = \epsilon_n \Omega_n R_n \frac{2\Gamma_2}{\Gamma_T} N, \qquad
N_p^{2B} = \epsilon_p \Omega_p R_p \frac{\Gamma_2}{\Gamma_T} N. \tag{49}
$$
Here, $\epsilon_n$ ($\epsilon_p$) is the neutron (proton) detection efficiency, $\Omega_n$ ($\Omega_p$) the detector acceptance for neutrons (protons) and $R_n$ ($R_p$) the fraction of outgoing neutrons (protons) with kinetic energy above the detection threshold. The quantities $R_n$ and $R_p$ take into account the nucleon rescattering effects in the nucleus, which influence, as previously discussed, the numbers of observed neutrons and protons. Moreover, in the relations for the number of neutrons and protons originating from two-body induced decays, $N_n^{2B}$ and $N_p^{2B}$, we have employed the quasi-deuteron approximation, in which the three-body processes proceed mainly through the channel $\Lambda np \to nnp$. By imposing $\epsilon_n \Omega_n R_n = \epsilon_p \Omega_p R_p = 1$, the previous equations supply the number of nucleons at the weak decay vertex. Finally, $\Gamma_T$ is the total decay width ($\Gamma_T = \Gamma_n + \Gamma_p + \Gamma_2 + \Gamma_M$) and N the total number of decayed hypernuclei. In an experimental analysis, the ratio $\Gamma_n/\Gamma_p$ can then be obtained from the measurement of $N_n$, $N_p$, N and $\Gamma_T$ and the theoretical evaluation of $\Gamma_2$ as follows:

$$
\frac{\Gamma_n}{\Gamma_p} =
\frac{\dfrac{N_n}{N_p}\,\eta - 1 + \left(\dfrac{N_n}{N_p}\,\eta - 2\right)\dfrac{\Gamma_2}{\Gamma_n + \Gamma_p}}
     {2 - \left(\dfrac{N_n}{N_p}\,\eta - 2\right)\dfrac{\Gamma_2}{\Gamma_n + \Gamma_p}}, \tag{50}
$$

where

$$
\Gamma_n + \Gamma_p = \frac{1}{2}\left(\frac{N_n}{\epsilon_n \Omega_n R_n} + \frac{N_p}{\epsilon_p \Omega_p R_p}\right)\frac{\Gamma_T}{N} - \frac{3}{2}\,\Gamma_2 \tag{51}
$$

and

$$
\eta \equiv \frac{\epsilon_p \Omega_p R_p}{\epsilon_n \Omega_n R_n}.
$$
On the other hand, the numbers of nn and np coincidence detections for an opening angle $\theta$ between the two nucleons of the pair are

$$
N_{nn}^{1B}(\theta) = \epsilon_n^2\, \Omega_{NN}(\theta)\, f_{NN}^{1B}(\theta)\, R_{nn}^{1B}\, \frac{\Gamma_n}{\Gamma_T} N, \qquad
N_{np}^{1B}(\theta) = \epsilon_n \epsilon_p\, \Omega_{NN}(\theta)\, f_{NN}^{1B}(\theta)\, R_{np}^{1B}\, \frac{\Gamma_p}{\Gamma_T} N, \tag{52}
$$

respectively, when two-body stimulated decays are neglected. With $\Omega_{NN}(\theta)$ we have denoted the average acceptance for nucleon pairs detected at an opening angle $\theta$, while $f_{NN}^{1B}(\theta)$ is the NN angular correlation function for $\Lambda N \to NN$. Finally, $R_{nn}^{1B}$ and $R_{np}^{1B}$ are, respectively, the fractions of nn and np pairs from one-body induced processes which leave the residual nucleus with energies above the detection thresholds. The ratio $\Gamma_n/\Gamma_p$ can then be measured through the following relation:

$$
\frac{\Gamma_n}{\Gamma_p} = \frac{N_{nn}^{1B}}{N_{np}^{1B}}\, \frac{\epsilon_p R_{np}^{1B}}{\epsilon_n R_{nn}^{1B}}, \tag{53}
$$

$N_{nn}^{1B}$ and $N_{np}^{1B}$ being the total numbers of detected nn and np pairs from one-body induced decays, respectively. Angular two-nucleon correlation measurements make it possible to disentangle the two-body stimulated decays from the total set of data. The numbers of nn and np pairs detected at an angle $\theta$ and originating from three-body decays are

$$
N_{nn}^{2B}(\theta) = \epsilon_n^2\, \Omega_{NN}(\theta)\, f_{NN}^{2B}(\theta)\, R_{nn}^{2B}\, \frac{\Gamma_2}{\Gamma_T} N, \qquad
N_{np}^{2B}(\theta) = \epsilon_n \epsilon_p\, \Omega_{NN}(\theta)\, f_{NN}^{2B}(\theta)\, R_{np}^{2B}\, \frac{2\Gamma_2}{\Gamma_T} N, \tag{54}
$$

respectively, the factor 2 in the second equation being the number of np pairs in the three-particle final state nnp. Besides, $f_{NN}^{2B}(\theta)$ is the NN angular correlation function for three-body decays, while $R_{nn}^{2B}$ ($R_{np}^{2B}$) is the fraction of nn (np) pairs from two-nucleon induced decays leaving the nucleus with energies above the detection thresholds. Nucleon pairs from one-body induced decays are mainly emitted back-to-back with $\simeq 75$ MeV kinetic energy. A detailed study [139] of the NN angular correlation function in $\Lambda N \to NN$, $f_{NN}^{1B}(\theta)$, shows that the NN opening angles are “with great probability” larger than $140^\circ$ for s-shell hypernuclei. On the contrary, the function $f_{NN}^{2B}(\theta)$ peaks around $120^\circ$. By using the approximation that all pairs detected at angles $\theta > 140^\circ$ ($\theta < 140^\circ$) come from one-nucleon (two-nucleon) induced processes (this assumption is realistic only for light hypernuclei, where a small effect of the final state interactions is expected), one can give an estimate of the various N's. To do this, we refer to the case of the experiment KEK-E462 [199], which will study the decay of $^{5}_{\Lambda}$He hypernuclei. Let us start by assuming that $\Gamma_n = \Gamma_p = 0.20$ and $\Gamma_T = 1.00$ (the widths are in units of $\Gamma^{\rm free}_{\Lambda}$). These values agree with the results of the 1991 BNL experiment [93]. Moreover, the calculation presented in Table 12 gives a ratio $\Gamma_2/(\Gamma_n + \Gamma_p) = 0.15$. It is not easy to
evaluate $R_n$, $R_p$, $R_{nn}^{1B}$, $R_{np}^{1B}$, $R_{nn}^{2B}$ and $R_{np}^{2B}$: neglecting the nucleon rescattering effects in the residual nucleus and assuming a 0 MeV detection threshold, these quantities are equal to 1. The nuclear final state interactions increase the total number of nucleons outgoing from the nucleus with respect to the number of nucleons at the weak decay vertex, but a non-zero energy detection threshold decreases the number of final nucleons which can be observed [196]. The R factors depend on a delicate balance between these two effects. A simulation of single and coincidence nucleon spectra fitted to experimental data makes it possible to determine these quantities. Here, we can use as guidance the results of Refs. [93,139,196] to estimate that, very roughly, for the detection thresholds which will be used at KEK-E462 (around 10–20 MeV), $R_n$, $R_p$, $R_{nn}^{1B}$ and $R_{np}^{1B}$ are sufficiently close to 1 and $R_{nn}^{2B} \simeq R_{np}^{2B} \simeq 0.8$ for $^{5}_{\Lambda}$He. Further, the following parameters of this experiment are needed [199]:

$$
\epsilon_n = 0.23, \quad \epsilon_p = 0.85, \quad \Omega_n = 0.27, \quad \Omega_p = 0.18, \quad \Omega_{NN}^{1B} = 0.143, \quad \Omega_{NN}^{2B} = 0.05,
$$

where $\Omega_{NN}^{1B}$ and $\Omega_{NN}^{2B}$ are the total detector acceptances for NN pairs coming from one-body and two-body induced processes, respectively. The latter quantities are obtained by averaging the functions $\Omega_{NN}(\theta) f_{NN}^{1B}(\theta)$ and $\Omega_{NN}(\theta) f_{NN}^{2B}(\theta)$ of Eqs. (52) and (54) over the intervals $\theta > 140^\circ$ and $\theta < 140^\circ$, respectively. By observing N = 100 000 decays of $^{5}_{\Lambda}$He hypernuclei, we thus expect the following total numbers of counts:
$$
\begin{array}{llll}
N_n^{1B} = 3726, & N_p^{1B} = 3060, & N_n^{2B} = 745, & N_p^{2B} = 918, \\
N_{nn}^{1B} = 151, & N_{np}^{1B} = 559, & N_{nn}^{2B} = 13, & N_{np}^{2B} = 94.
\end{array} \tag{55}
$$
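The counts of Eq. (55) can be cross-checked with a few lines of arithmetic. The following Python sketch is only illustrative: it evaluates Eqs. (49), (52) and (54) with the inputs quoted above, under the assumption (ours, matching the rough estimates in the text) that $R_n = R_p = R_{nn}^{1B} = R_{np}^{1B} = 1$ and $R_{nn}^{2B} = R_{np}^{2B} = 0.8$, and with the angle-averaged acceptances $\Omega_{NN}^{1B}$ and $\Omega_{NN}^{2B}$ used in place of $\Omega_{NN}(\theta) f_{NN}(\theta)$. All variable names are hypothetical.

```python
# Inputs quoted in the text for the KEK-E462 estimate
# (decay widths in units of the free Lambda width).
N_DECAYS = 100_000
GAMMA_T = 1.00
GAMMA_N = GAMMA_P = 0.20
GAMMA_2 = 0.15 * (GAMMA_N + GAMMA_P)

EPS_N, EPS_P = 0.23, 0.85                 # detection efficiencies
OMEGA_N, OMEGA_P = 0.27, 0.18             # single-nucleon acceptances
OMEGA_NN_1B, OMEGA_NN_2B = 0.143, 0.05    # angle-averaged pair acceptances

# Assumed rescattering/threshold factors (rough values, see lead-in).
R_SINGLE = 1.0
R_PAIR_1B = 1.0
R_PAIR_2B = 0.8


def expected_counts():
    """Evaluate Eqs. (49), (52) and (54) for the inputs above."""
    per_width = N_DECAYS / GAMMA_T
    single_n = EPS_N * OMEGA_N * R_SINGLE * per_width
    single_p = EPS_P * OMEGA_P * R_SINGLE * per_width
    pair_nn = EPS_N * EPS_N * per_width
    pair_np = EPS_N * EPS_P * per_width
    return {
        "Nn_1B": single_n * (2 * GAMMA_N + GAMMA_P),
        "Np_1B": single_p * GAMMA_P,
        "Nn_2B": single_n * 2 * GAMMA_2,
        "Np_2B": single_p * GAMMA_2,
        "Nnn_1B": pair_nn * OMEGA_NN_1B * R_PAIR_1B * GAMMA_N,
        "Nnp_1B": pair_np * OMEGA_NN_1B * R_PAIR_1B * GAMMA_P,
        "Nnn_2B": pair_nn * OMEGA_NN_2B * R_PAIR_2B * GAMMA_2,
        "Nnp_2B": pair_np * OMEGA_NN_2B * R_PAIR_2B * 2 * GAMMA_2,
    }


counts = expected_counts()

# Eq. (53): recover Gamma_n/Gamma_p from the one-body coincidence counts.
ratio_from_pairs = (counts["Nnn_1B"] / counts["Nnp_1B"]) * (
    EPS_P * R_PAIR_1B) / (EPS_N * R_PAIR_1B)

for name, value in counts.items():
    print(f"{name:7s} {value:8.1f}")
print(f"Gamma_n/Gamma_p from Eq. (53): {ratio_from_pairs:.3f}")
```

Applying Eq. (53) to the computed one-body coincidence counts returns the input ratio $\Gamma_n/\Gamma_p = 1$, which is a simple consistency check of Eqs. (52) and (53).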
If, on the contrary, one assumes $\Gamma_n + \Gamma_p = 0.3$ (this value agrees with the calculation presented in Table 12) and $\Gamma_n/\Gamma_p = 0.5$, the numbers of counts are

$$
\begin{array}{llll}
N_n^{1B} = 2484, & N_p^{1B} = 3060, & N_n^{2B} = 559, & N_p^{2B} = 689, \\
N_{nn}^{1B} = 76, & N_{np}^{1B} = 559, & N_{nn}^{2B} = 10, & N_{np}^{2B} = 70.
\end{array} \tag{56}
$$

From these estimates one reaches the following important conclusion: if the two-nucleon induced decay rate is about 15% of the total non-mesonic width, from existing data and calculations on $\Gamma_n$ and $\Gamma_p$ one expects a non-negligible number of NN coincidence counts (of the order of 100) coming from two-body induced processes for an ensemble of N = 100 000 hypernuclear decays. In an experiment which could measure the quantities of Eqs. (55) and (56), N and $\Gamma_T$ with sufficient statistics, one would have two independent ways to determine the ratio $\Gamma_n/\Gamma_p$ [by using Eqs. (50) and (53)] and two independent ways to determine $\Gamma_2$ [Eqs. (54)]. Careful analyses of
the nucleon final state interactions must be carried out in order to estimate the different R factors. We must also note that the use of Eq. (50) to obtain the ratio could be affected by problems related to interference effects between neutron- and proton-stimulated decays, as mentioned in point (1) of paragraph 6.2.2. In particular, in the low energy kinematical region, the measured protons can be primary or secondary protons from the process $\Lambda p \to np$ but also secondary protons from $\Lambda n \to nn$. A determination of the ratio with both Eqs. (50) and (53) will thus be able to quantify these interference effects.

To study the effect of the two-nucleon stimulated decay on the determination of $\Gamma_n/\Gamma_p$ for experiments which do not detect the nucleons in double coincidence, let us consider the data from the BNL experiment of Ref. [93] for $^{5}_{\Lambda}$He:

$$
\frac{N_n}{\epsilon_n \Omega_n}\left(^{5}_{\Lambda}{\rm He}\right) = 3000 \pm 1300, \qquad
\frac{N_p}{\epsilon_p \Omega_p}\left(^{5}_{\Lambda}{\rm He}\right) = 1730 \pm 260.
$$

In order to calculate $\Gamma_n/\Gamma_p$ with Eq. (50), an estimate of the nuclear final state interaction effects for the outgoing nucleons is required. By using as guidance the analyses of Refs. [93,139,196,201], one has that, very roughly, $R_p/R_n \simeq 1$–$1.1$ for the energy thresholds of the BNL experiment ($\simeq 30$–$40$ MeV). By assuming $\Gamma_2/(\Gamma_n + \Gamma_p) = 0.15$, Eq. (50) then supplies:

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{5}_{\Lambda}{\rm He}\right) = 0.44^{+0.53}_{-0.44} \qquad (1B + 2B;\ R_p/R_n = 1.1), \tag{57}
$$

while neglecting the two-nucleon induced channel:

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{5}_{\Lambda}{\rm He}\right) = 0.45 \pm 0.44 \qquad (1B\ {\rm only};\ R_p/R_n = 1.1). \tag{58}
$$

For $R_p/R_n = 1$, the ratios of Eqs. (57) and (58) become slightly smaller, namely:

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{5}_{\Lambda}{\rm He}\right) = 0.34^{+0.47}_{-0.34} \qquad (1B + 2B;\ R_p/R_n = 1), \tag{59}
$$

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{5}_{\Lambda}{\rm He}\right) = 0.37^{+0.40}_{-0.37} \qquad (1B\ {\rm only};\ R_p/R_n = 1), \tag{60}
$$
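respectively. A similar analysis can be performed on $^{12}_{\Lambda}$C data, again from the BNL experiment of Ref. [93]:

$$
\frac{N_n}{\epsilon_n \Omega_n}\left(^{12}_{\Lambda}{\rm C}\right) = 3400 \pm 1100, \qquad
\frac{N_p}{\epsilon_p \Omega_p}\left(^{12}_{\Lambda}{\rm C}\right) = 1410 \pm 200.
$$

The calculation of Table 12 supplies a ratio $\Gamma_2/(\Gamma_n + \Gamma_p) = 0.2$ for $^{12}_{\Lambda}$C. Thus, from Eq. (50) one obtains:

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{12}_{\Lambda}{\rm C}\right) = 1.14 \pm 0.80 \qquad (1B + 2B;\ R_p/R_n = 1.2), \tag{61}
$$

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{12}_{\Lambda}{\rm C}\right) = 0.95 \pm 0.51 \qquad (1B\ {\rm only};\ R_p/R_n = 1.2), \tag{62}
$$

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{12}_{\Lambda}{\rm C}\right) = 0.78 \pm 0.60 \qquad (1B + 2B;\ R_p/R_n = 1), \tag{63}
$$

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{12}_{\Lambda}{\rm C}\right) = 0.71 \pm 0.43 \qquad (1B\ {\rm only};\ R_p/R_n = 1). \tag{64}
$$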
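The central values of Eqs. (57)–(64) follow directly from Eq. (50). The short Python sketch below is an illustration under our reading of that equation: the combination of counts and efficiencies entering Eq. (50) is written as $x = (R_p/R_n)\,[N_n/(\epsilon_n\Omega_n)]/[N_p/(\epsilon_p\Omega_p)]$, and the function and argument names are hypothetical. Only central values are computed; the error bars are not propagated.

```python
def gamma_n_over_gamma_p(nn_corrected, np_corrected, rp_over_rn, g2_fraction=0.0):
    """Central value of Gamma_n/Gamma_p from Eq. (50).

    nn_corrected -- N_n / (eps_n * Omega_n)
    np_corrected -- N_p / (eps_p * Omega_p)
    rp_over_rn   -- assumed ratio R_p / R_n of the rescattering factors
    g2_fraction  -- Gamma_2 / (Gamma_n + Gamma_p); 0 means "1B only"
    """
    x = rp_over_rn * nn_corrected / np_corrected
    numerator = x - 1.0 + (x - 2.0) * g2_fraction
    denominator = 2.0 - (x - 2.0) * g2_fraction
    return numerator / denominator


# BNL data of Ref. [93] as quoted in the text (central values only).
HE5 = (3000.0, 1730.0)
C12 = (3400.0, 1410.0)

results = {
    "Eq. (57)": gamma_n_over_gamma_p(*HE5, rp_over_rn=1.1, g2_fraction=0.15),
    "Eq. (58)": gamma_n_over_gamma_p(*HE5, rp_over_rn=1.1),
    "Eq. (59)": gamma_n_over_gamma_p(*HE5, rp_over_rn=1.0, g2_fraction=0.15),
    "Eq. (60)": gamma_n_over_gamma_p(*HE5, rp_over_rn=1.0),
    "Eq. (61)": gamma_n_over_gamma_p(*C12, rp_over_rn=1.2, g2_fraction=0.2),
    "Eq. (62)": gamma_n_over_gamma_p(*C12, rp_over_rn=1.2),
    "Eq. (63)": gamma_n_over_gamma_p(*C12, rp_over_rn=1.0, g2_fraction=0.2),
    "Eq. (64)": gamma_n_over_gamma_p(*C12, rp_over_rn=1.0),
}

for label, value in results.items():
    print(f"{label}: {value:.2f}")
```

Setting `g2_fraction=0.0` reduces Eq. (50) to $(x - 1)/2$, which is the "1B only" case; comparing the two variants shows how weakly the central value depends on $\Gamma_2$ for these inputs.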
Again, because of the large error bars, the effect of $\Gamma_2$ on the determination of $\Gamma_n/\Gamma_p$ with Eq. (50) is negligible [this also occurs for very large (and unrealistic) values of $\Gamma_2$]. In order to determine the ratios of Eqs. (57) and (61) with relative errors of about 20%, $N_n$ and $N_p$ must be measured with very small errors: $\Delta N_n/N_n \simeq 2\Delta N_p/N_p \simeq 8\%$. Thus, to determine $\Gamma_2$ one must resort to NN correlation measurements, as previously discussed. It is worth noticing that the ratios of Eqs. (57)–(60) are considerably smaller than the result published in Ref. [93] for $^{5}_{\Lambda}$He ($\Gamma_n/\Gamma_p = 0.93 \pm 0.55$) and compatible with the OPE calculations. Only sizeable final state interactions ($R_p/R_n \simeq 1.7$) can give ratios around 1 by employing Eq. (50) for $^{5}_{\Lambda}$He. The values of Eqs. (61)–(64), instead, are closer to the published result for $^{12}_{\Lambda}$C ($\Gamma_n/\Gamma_p = 1.33^{+1.12}_{-0.81}$) and disagree with the OPE calculations within one standard deviation. Interestingly, the ratio of Eq. (50) for $^{12}_{\Lambda}$C becomes equal to the central data point (1.33) when $R_p/R_n \simeq 1.3$.

6.3. Phenomenological analysis of s-shell hypernuclei

The analysis of the non-mesonic decays in s-shell hypernuclei offers an important tool both for the solution of the $\Gamma_n/\Gamma_p$ puzzle and for testing the validity of the related $\Delta I = 1/2$ rule. Since in these hypernuclei the $\Lambda N$ pair is necessarily in the L = 0 relative state, the only possible $\Lambda N \to NN$ transitions are the following ones (we use the spectroscopic notation $^{2S+1}L_J$):

$$
\begin{array}{ll}
^1S_0 \to {}^1S_0 & (I_f = 1) \\
\phantom{^1S_0} \to {}^3P_0 & (I_f = 1) \\
^3S_1 \to {}^3S_1 & (I_f = 0) \\
\phantom{^3S_1} \to {}^1P_1 & (I_f = 0) \\
\phantom{^3S_1} \to {}^3P_1 & (I_f = 1) \\
\phantom{^3S_1} \to {}^3D_1 & (I_f = 0).
\end{array} \tag{65}
$$
The $\Lambda n \to nn$ process has final states with isospin $I_f = 1$ only, while for $\Lambda p \to np$ both $I_f = 1$ and $I_f = 0$ are allowed. We discuss in this subsection an analysis performed by the authors of the present review [151] in order to explore the validity of the $\Delta I = 1/2$ rule in the one-nucleon induced $\Lambda$-decay. This analysis is based on the phenomenological model of Block and Dalitz [146,147], which we briefly outline now. The interaction probability of a particle which crosses an infinite homogeneous system of thickness ds is, classically, $dP = ds/\lambda$, where $\lambda = 1/(\sigma\rho)$ is the mean free path of the projectile, $\sigma$ is the relevant cross section and $\rho$ is the density of the system. Then, if we refer to the process $\Lambda N \to NN$, the
width $\Gamma_{NM} = dP_{\Lambda N \to NN}/dt$ can be written as

$$
\Gamma_{NM} = v \sigma \rho,
$$

$v = ds/dt$ being the $\Lambda$ velocity in the rest frame of the homogeneous system. For a finite nucleus of density $\rho(\vec{r})$, by introducing a local Fermi sea of nucleons, one can write, within the semiclassical approximation:

$$
\Gamma_{NM} = \langle v\sigma \rangle \int d\vec{r}\, \rho(\vec{r})\, |\psi_{\Lambda}(\vec{r})|^2,
$$

where $\psi_{\Lambda}(\vec{r})$ is the $\Lambda$ wave function in the hypernucleus and $\langle\ \rangle$ denotes an average over spin and isospin states. In the above equation the nuclear density is normalized to the mass number A = N + Z, hence the integral gives the average nucleon density $\rho_A$ at the position of the $\Lambda$ particle. In this scheme, the non-mesonic width $\Gamma_{NM} = \Gamma_n + \Gamma_p$ of the hypernucleus $^{A+1}_{\Lambda}Z$ turns out to be

$$
\Gamma_{NM}\left(^{A+1}_{\Lambda}Z\right) = \frac{N \bar{R}_n\left(^{A+1}_{\Lambda}Z\right) + Z \bar{R}_p\left(^{A+1}_{\Lambda}Z\right)}{A}\, \rho_A,
$$

where $\bar{R}_n$ ($\bar{R}_p$) denotes the spin-averaged rate for the neutron-induced (proton-induced) process appropriate for the considered hypernucleus. Furthermore, by introducing the rates $R_{N,J}$ for the spin-singlet ($R_{n0}$, $R_{p0}$) and spin-triplet ($R_{n1}$, $R_{p1}$) elementary $\Lambda N \to NN$ interactions, the non-mesonic decay widths of s-shell hypernuclei are [146,147]

$$
\begin{aligned}
\Gamma_{NM}\left(^{3}_{\Lambda}{\rm H}\right) &= (3R_{n0} + R_{n1} + 3R_{p0} + R_{p1})\, \frac{\rho_2}{8}, \\
\Gamma_{NM}\left(^{4}_{\Lambda}{\rm H}\right) &= (R_{n0} + 3R_{n1} + 2R_{p0})\, \frac{\rho_3}{6}, \\
\Gamma_{NM}\left(^{4}_{\Lambda}{\rm He}\right) &= (2R_{n0} + R_{p0} + 3R_{p1})\, \frac{\rho_3}{6}, \\
\Gamma_{NM}\left(^{5}_{\Lambda}{\rm He}\right) &= (R_{n0} + 3R_{n1} + R_{p0} + 3R_{p1})\, \frac{\rho_4}{8}.
\end{aligned} \tag{66}
$$

These relations take into account that the total hypernuclear angular momentum is 0 for $^{4}_{\Lambda}$H and $^{4}_{\Lambda}$He and 1/2 for $^{3}_{\Lambda}$H and $^{5}_{\Lambda}$He. In terms of the rates associated with the partial-wave transitions (65), the $R_{NJ}$'s of Eqs. (66) read

$$
\begin{aligned}
R_{n0} &= R_n(^1S_0) + R_n(^3P_0), \\
R_{p0} &= R_p(^1S_0) + R_p(^3P_0), \\
R_{n1} &= R_n(^3P_1), \\
R_{p1} &= R_p(^3S_1) + R_p(^1P_1) + R_p(^3P_1) + R_p(^3D_1),
\end{aligned}
$$

the quantum numbers of the NN final state being reported in brackets. If one assumes that the $\Lambda N \to NN$ weak interaction occurs with a change $\Delta I = 1/2$ of the isospin, the following relations (simply derived from angular momentum coupling coefficients) hold among the rates for transitions to I = 1 final states:

$$
R_n(^1S_0) = 2R_p(^1S_0), \qquad R_n(^3P_0) = 2R_p(^3P_0), \qquad R_n(^3P_1) = 2R_p(^3P_1). \tag{67}
$$

Hence

$$
\frac{R_{n1}}{R_{p1}} \leq \frac{R_{n0}}{R_{p0}} = 2. \tag{68}
$$
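For pure $\Delta I = 3/2$ transitions, the factors 2 in Eqs. (67) are replaced by 1/2. Hence, by further introducing the ratio

$$
r = \frac{\langle I_f = 1 \| A_{1/2} \| I_i = 1/2 \rangle}{\langle I_f = 1 \| A_{3/2} \| I_i = 1/2 \rangle}
$$

between the $\Delta I = 1/2$ and $\Delta I = 3/2$ $\Lambda N \to NN$ transition amplitudes for isospin 1 final states (r being real, as required by time reversal invariance), for a general $\Delta I = 1/2$–$3/2$ mixture one gets

$$
\frac{R_{n1}}{R_{p1}} = \frac{4r^2 - 4r + 1}{2r^2 + 4r + 2 + 6\lambda^2} \leq \frac{R_{n0}}{R_{p0}} = \frac{4r^2 - 4r + 1}{2r^2 + 4r + 2}, \tag{69}
$$

where

$$
\lambda = \frac{\langle I_f = 0 \| A_{1/2} \| I_i = 1/2 \rangle}{\langle I_f = 1 \| A_{3/2} \| I_i = 1/2 \rangle}. \tag{70}
$$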
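As an illustration of Eq. (69), the following Python sketch evaluates the two ratios as functions of r and $\lambda$ and checks the limits stated above: the value 2 for pure $\Delta I = 1/2$ transitions ($|r| \to \infty$, Eq. (68)) and 1/2 for pure $\Delta I = 3/2$ transitions ($r = 0$). The function names are hypothetical and the snippet is only a numerical aid.

```python
def singlet_ratio(r):
    """R_n0 / R_p0 of Eq. (69) as a function of the amplitude ratio r."""
    # The denominator 2(r + 1)^2 vanishes at r = -1, where the ratio diverges.
    return (4.0 * r * r - 4.0 * r + 1.0) / (2.0 * r * r + 4.0 * r + 2.0)


def triplet_ratio(r, lam):
    """R_n1 / R_p1 of Eq. (69); lam is the ratio defined in Eq. (70)."""
    return (4.0 * r * r - 4.0 * r + 1.0) / (
        2.0 * r * r + 4.0 * r + 2.0 + 6.0 * lam * lam)


pure_three_halves = singlet_ratio(0.0)   # pure Delta I = 3/2
nearly_pure_half = singlet_ratio(1.0e6)  # |r| -> infinity: pure Delta I = 1/2

print(f"r = 0      : R_n0/R_p0 = {pure_three_halves:.3f}")
print(f"r = 2      : R_n0/R_p0 = {singlet_ratio(2.0):.3f}")
print(f"r = 1e6    : R_n0/R_p0 = {nearly_pure_half:.3f}")
print(f"r = 2, lambda = 1: R_n1/R_p1 = {triplet_ratio(2.0, 1.0):.3f}")
```

Since the same singlet value (for example 0.5) is reached both at $r = 0$ and at $r = 2$, a measured $R_{n0}/R_{p0}$ does not by itself fix r uniquely.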
The partial rates of Eq. (69) supply the $\Gamma_n/\Gamma_p$ ratios for s-shell hypernuclei through Eqs. (66), which provide the sum $\Gamma_n + \Gamma_p$ for each considered hypernucleus. For example, for $^{5}_{\Lambda}$He one has

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{5}_{\Lambda}{\rm He}\right) = \frac{R_{n0} + 3R_{n1}}{R_{p0} + 3R_{p1}}. \tag{71}
$$

By using Eqs. (66) and (69) together with the available experimental data it is possible to extract the spin and isospin behaviour of the $\Lambda N \to NN$ interaction without resorting to a detailed knowledge of the interaction mechanism. This reasoning was applied for the first time by Block and Dalitz [146,147]. Unfortunately, up to now, the large experimental error bars have not allowed definitive conclusions to be drawn about the validity of the $\Delta I = 1/2$ rule in non-mesonic decays by employing the previous model. There are indications for a sizeable violation of this rule [150,152,153], but more precise measurements are needed, especially for $^{3}_{\Lambda}$H and $^{4}_{\Lambda}$H. If confirmed, this would represent the first evidence of such a violation in non-leptonic strangeness changing processes. By using the phenomenological model of Block and Dalitz, in the next paragraph we shall discuss, through a new analysis [151] which employs recent data, the validity of the $\Delta I = 1/2$ rule in the process $\Lambda N \to NN$.

Before proceeding, we note that Eqs. (66) make use of several assumptions, which cannot be easily tested: the decays are treated incoherently on the stimulating nucleons within a simple four-baryon point interaction model, thus interference effects originating from antisymmetrization of the two-nucleon final state as well as final state interactions are neglected. Moreover, the calculation requires the average nuclear density at the $\Lambda$ position and does not take into account non-mesonic decays induced by more than one nucleon. However, given the high momentum of the outgoing nucleons and the present level of accuracy of the data, the above approximations can be considered as satisfactory.

6.3.1. Experimental data and $\Delta I = 1/2$ rule

In Ref.
[151] a phenomenological analysis of experimental data on non-mesonic decay of s-shell hypernuclei is employed to study the possible violation of the $\Delta I = 1/2$ rule in the $\Lambda N \to NN$ interaction. In that paper we have analyzed recent data (which are summarized in Table 19) by using a method quite different from the previous works of Refs. [149,150,152].
Table 19
Experimental data (in units of $\Gamma^{\rm free}_{\Lambda}$) for s-shell hypernuclei (taken from Ref. [151])

Hypernucleus          $\Gamma_n$      $\Gamma_p$      $\Gamma_{NM}$   $\Gamma_n/\Gamma_p$   Ref.
$^{4}_{\Lambda}$H     -               -               0.22 ± 0.09     -                     Reference value
                      -               -               0.17 ± 0.11     -                     KEK [47]
                      -               -               0.29 ± 0.14     -                     [147]
$^{4}_{\Lambda}$He    0.04 ± 0.02     0.16 ± 0.02     0.20 ± 0.03     0.25 ± 0.13           BNL [139]
$^{5}_{\Lambda}$He    0.20 ± 0.11     0.21 ± 0.07     0.41 ± 0.14     0.93 ± 0.55           BNL [93]
Unfortunately, no data are available on the non-mesonic decay of hypertriton and on $\Gamma_n/\Gamma_p$ for $^{4}_{\Lambda}$H. Indeed, we shall see in the following that the future measurement of $\Gamma_n/\Gamma_p$ for $^{4}_{\Lambda}$H at BNL [200] will be of great importance for a test of the $\Delta I = 1/2$ rule. The BNL data [93,139] for $^{4}_{\Lambda}$He and $^{5}_{\Lambda}$He of Table 19 together with the reference value for $^{4}_{\Lambda}$H have been used in our analysis. This last number is the weighted average of the previous estimates [47,147], which have not been obtained from direct measurements but rather by using theoretical constraints. One then has 5 independent data which make it possible to fix, from Eqs. (66), the 4 rates $R_{N,J}$ and $\rho_3$. Indeed, the average nucleon density $\rho_4$ at the $\Lambda$ position for $^{5}_{\Lambda}$He, also entering into Eqs. (66), has been estimated to be $\rho_4 = 0.045$ fm$^{-3}$ by employing the $\Lambda$ wave function of Ref. [46] (which was obtained through a quark model description of the $\Lambda N$ interaction) and the Gaussian density for $^{4}$He that reproduces the experimental mean square radius of the nucleus. For $^{4}_{\Lambda}$H and $^{4}_{\Lambda}$He, instead, no realistic hyperon wave function is available and we can obtain the value $\rho_3 = 0.026$ fm$^{-3}$ from the data of Table 19, by imposing that [see Eqs. (66)]:

$$
\frac{\Gamma_p\left(^{5}_{\Lambda}{\rm He}\right)}{\Gamma_p\left(^{4}_{\Lambda}{\rm He}\right)} = \frac{3}{4}\, \frac{\rho_4}{\rho_3}.
$$

The best choice to determine the rates $R_{N,J}$ by fitting experimental data is to use the relations for the observables

$$
\Gamma_{NM}\left(^{4}_{\Lambda}{\rm H}\right), \quad \Gamma_{NM}\left(^{4}_{\Lambda}{\rm He}\right), \quad \Gamma_{NM}\left(^{5}_{\Lambda}{\rm He}\right), \quad \frac{\Gamma_n}{\Gamma_p}\left(^{4}_{\Lambda}{\rm He}\right),
$$

which have the smallest experimental uncertainties. After solving these equations we obtained the following partial rates (as usual, the decay widths of Eqs. (66) are considered in units of the free $\Lambda$ decay width):

$$
R_{n0} = (4.7 \pm 2.1)\ {\rm fm}^3, \tag{72}
$$
$$
R_{p0} = (7.9^{+16.5}_{-7.9})\ {\rm fm}^3, \tag{73}
$$
$$
R_{n1} = (10.3 \pm 8.6)\ {\rm fm}^3, \tag{74}
$$
$$
R_{p1} = (9.8 \pm 5.5)\ {\rm fm}^3, \tag{75}
$$
$$
\bar{R}_n\left(^{5}_{\Lambda}{\rm He}\right) \equiv \tfrac{1}{4}(R_{n0} + 3R_{n1}) = (8.9 \pm 6.5)\ {\rm fm}^3, \qquad
\bar{R}_p\left(^{5}_{\Lambda}{\rm He}\right) \equiv \tfrac{1}{4}(R_{p0} + 3R_{p1}) = (9.3 \pm 5.8)\ {\rm fm}^3.
$$
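The central values of Eqs. (72)–(75) can be recovered by solving Eqs. (66) with the central values of Table 19. The Python sketch below is a minimal illustration, not the fitting procedure of Ref. [151]: it assumes that the neutron- and proton-induced parts of each width in Eqs. (66) can be read off term by term (for example $\Gamma_n(^{4}_{\Lambda}{\rm He}) = 2R_{n0}\rho_3/6$), uses the unrounded $\rho_3$ obtained from the relation above, and ignores all uncertainties. Variable names are hypothetical.

```python
# Central values from Table 19 (units of the free Lambda width).
GAMMA_NM_H4 = 0.22                       # reference value for 4-Lambda-H
GAMMA_N_HE4, GAMMA_P_HE4 = 0.04, 0.16    # 4-Lambda-He
GAMMA_P_HE5, GAMMA_NM_HE5 = 0.21, 0.41   # 5-Lambda-He

RHO_4 = 0.045  # fm^-3, nucleon density at the Lambda position in 5-Lambda-He

# Gamma_p(5He) / Gamma_p(4He) = (3/4) rho_4 / rho_3  ->  solve for rho_3.
rho_3 = 0.75 * RHO_4 * GAMMA_P_HE4 / GAMMA_P_HE5

# 4-Lambda-He: Gamma_n = 2 R_n0 rho_3 / 6 and Gamma_p = (R_p0 + 3 R_p1) rho_3 / 6.
r_n0 = 3.0 * GAMMA_N_HE4 / rho_3
proton_sum = 6.0 * GAMMA_P_HE4 / rho_3            # R_p0 + 3 R_p1

# 5-Lambda-He: Gamma_NM = (R_n0 + 3 R_n1 + R_p0 + 3 R_p1) rho_4 / 8.
neutron_sum = 8.0 * GAMMA_NM_HE5 / RHO_4 - proton_sum   # R_n0 + 3 R_n1
r_n1 = (neutron_sum - r_n0) / 3.0

# 4-Lambda-H: Gamma_NM = (R_n0 + 3 R_n1 + 2 R_p0) rho_3 / 6.
r_p0 = (6.0 * GAMMA_NM_H4 / rho_3 - neutron_sum) / 2.0
r_p1 = (proton_sum - r_p0) / 3.0

# Spin-averaged rates for 5-Lambda-He.
rbar_n = neutron_sum / 4.0
rbar_p = proton_sum / 4.0

print(f"rho_3 = {rho_3:.4f} fm^-3")
print(f"R_n0 = {r_n0:.1f}, R_p0 = {r_p0:.1f}, R_n1 = {r_n1:.1f}, R_p1 = {r_p1:.1f} fm^3")
print(f"Rbar_n = {rbar_n:.1f}, Rbar_p = {rbar_p:.1f} fm^3")
```

Because the system is linear in the rates once $\rho_3$ is fixed, simple back-substitution is enough here; a least-squares fit would only be needed if more observables than unknowns were used.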
The errors have been obtained with the standard formula:

$$
\delta[O(r_1, \ldots, r_N)] = \sqrt{\sum_{i=1}^{N} \left(\frac{\partial O}{\partial r_i}\right)^2 \delta r_i^2},
$$

namely by treating the data as independent and uncorrelated. Due to the large relative errors [especially in the measurements of $\Gamma_{NM}(^{4}_{\Lambda}{\rm H})$ and $\Gamma_{NM}(^{5}_{\Lambda}{\rm He})$] involved in the extraction of the above rates, the Gaussian propagation of the uncertainties has to be regarded as a poor approximation. For the ratios of Eq. (69) we then have

$$
\frac{R_{n0}}{R_{p0}} = 0.6^{+1.3}_{-0.6}, \tag{76}
$$

$$
\frac{R_{n1}}{R_{p1}} = 1.0^{+1.1}_{-1.0}, \tag{77}
$$

while the ratios of the spin-triplet to the spin-singlet interaction rates are

$$
\frac{R_{n1}}{R_{n0}} = 2.2 \pm 2.1, \qquad \frac{R_{p1}}{R_{p0}} = 1.2^{+2.7}_{-1.2}.
$$

The large uncertainties do not allow definite conclusions to be drawn about the possible violation of the $\Delta I = 1/2$ rule and the spin-dependence of the transition rates. Eqs. (76) and (77) are still compatible with Eq. (68), namely with the $\Delta I = 1/2$ rule, although the central value in Eq. (76) is more in agreement either with a pure $\Delta I = 3/2$ transition ($r \simeq 0$) or with $r \simeq 2$ [see Eq. (69)]. Actually, Eq. (76) is compatible with r in the range $-1/4$ to 40, while the ratio $\lambda$ of Eqs. (69) and (70) is completely undetermined.

By using the results of Eqs. (72)–(75) we can predict the neutron to proton ratio for $^{3}_{\Lambda}$H, $^{4}_{\Lambda}$H and $^{5}_{\Lambda}$He, which turn out to be:

$$
\frac{\Gamma_n}{\Gamma_p}\left(^{3}_{\Lambda}{\rm H}\right) = 0.7^{+1.1}_{-0.7}, \qquad
\frac{\Gamma_n}{\Gamma_p}\left(^{4}_{\Lambda}{\rm H}\right) = 2.3^{+5.0}_{-2.3}, \qquad
\frac{\Gamma_n}{\Gamma_p}\left(^{5}_{\Lambda}{\rm He}\right) = 0.95 \pm 0.92
$$

and, by using $\rho_2 = 0.001$ fm$^{-3}$ [147],

$$
\Gamma_{NM}\left(^{3}_{\Lambda}{\rm H}\right) = 0.007 \pm 0.006.
$$

The latter is of the same order of magnitude as the detailed 3-body calculation of Ref. [118], which provides a non-mesonic width equal to 1.7% of the free $\Lambda$ width. The ratio obtained for $^{5}_{\Lambda}$He is in
good agreement with the data of Table 19. An accurate measurement of $\Gamma_{NM}(^{3}_{\Lambda}{\rm H})$ and of $\Gamma_n/\Gamma_p$ for $^{3}_{\Lambda}$H and $^{4}_{\Lambda}$H would then provide a test of the weak decay model of Eqs. (66) if the rates of Eqs. (72)–(75) could be extracted with less uncertainty from data.

The compatibility of the data with the $\Delta I = 1/2$ rule can be discussed in a different way: by assuming this rule, we fix $R_{n0}/R_{p0} = 2$. Then, by using the observables

$$
\Gamma_{NM}\left(^{4}_{\Lambda}{\rm He}\right), \quad \Gamma_{NM}\left(^{5}_{\Lambda}{\rm He}\right), \quad \frac{\Gamma_n}{\Gamma_p}\left(^{4}_{\Lambda}{\rm He}\right),
$$

the extracted partial rates ($R_{n0}$, $R_{n1}$, $\bar{R}_n$ and $\bar{R}_p$ are unchanged with respect to the above derivation) are

$$
\begin{aligned}
R_{n0} &= (4.7 \pm 2.1)\ {\rm fm}^3, \\
R_{p0} &\equiv R_{n0}/2 = (2.3 \pm 1.0)\ {\rm fm}^3, \\
R_{n1} &= (10.3 \pm 8.6)\ {\rm fm}^3, \\
R_{p1} &= (11.7 \pm 2.4)\ {\rm fm}^3.
\end{aligned}
$$

These values are compatible with the ones in Eqs. (72)–(75). For pure $\Delta I = 1/2$ transitions the spin-triplet interactions seem to dominate over the spin-singlet ones:

$$
\frac{R_{n1}}{R_{n0}} = 2.2 \pm 2.1, \qquad \frac{R_{p1}}{R_{p0}} = 5.0 \pm 2.4.
$$

Moreover, since

$$
\frac{R_{n1}}{R_{p1}} = 0.9 \pm 0.8,
$$

from Eq. (69) one obtains the following estimate for the ratio between the $\Delta I = 1/2$ amplitudes:

$$
\left| \frac{\langle I_f = 0 \| A_{1/2} \| I_i = 1/2 \rangle}{\langle I_f = 1 \| A_{1/2} \| I_i = 1/2 \rangle} \right| \simeq \frac{1}{3.7}\text{–}2.3.
$$

The other independent observables which have not been used here are then predicted to be

$$
\Gamma_{NM}\left(^{4}_{\Lambda}{\rm H}\right) = 0.17 \pm 0.11 \qquad {\rm and} \qquad \frac{\Gamma_n}{\Gamma_p}\left(^{5}_{\Lambda}{\rm He}\right) = 0.95 \pm 0.72,
$$

in good agreement with the values of Table 19, with a $\chi^2$ for one degree of freedom of 0.31 (corresponding to a $0.56\sigma$ deviation). This means that the data are consistent with the hypothesis of validity of the $\Delta I = 1/2$ rule at the level of 60%. In other words, the $\Delta I = 1/2$ rule is excluded at the 40% confidence level.
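The quoted $\chi^2$ and its conversion to a number of standard deviations and to a probability can be illustrated as follows. This Python sketch rests on an assumption of ours: the $\chi^2$ is built from the two predicted observables compared with the Table 19 values, weighting each difference by the experimental uncertainty only. With that choice the value 0.31 is reproduced, but the text does not spell out the construction, so this should be read as one plausible reconstruction.

```python
import math

# (predicted value, measured value, experimental uncertainty)
COMPARISONS = [
    (0.17, 0.22, 0.09),  # Gamma_NM(4-Lambda-H), reference value of Table 19
    (0.95, 0.93, 0.55),  # Gamma_n/Gamma_p(5-Lambda-He), BNL value of Table 19
]

chi2 = sum(((pred - meas) / err) ** 2 for pred, meas, err in COMPARISONS)

# For one degree of freedom, chi2 corresponds to sqrt(chi2) Gaussian sigmas,
# and the probability of a larger chi2 is erfc(sqrt(chi2 / 2)).
n_sigma = math.sqrt(chi2)
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"chi2 = {chi2:.2f}")
print(f"deviation = {n_sigma:.2f} sigma")
print(f"probability of a larger chi2 = {p_value:.2f}")
```

The resulting probability is close to the "level of 60%" quoted in the text, and its complement corresponds to the roughly 40% confidence level at which the rule is said to be excluded.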
Table 20
Experimental data for $^{12}_{\Lambda}$C (taken from Ref. [151])

$\Gamma_{NM}$     $\Gamma_n/\Gamma_p$          Ref.
1.14 ± 0.20       $1.33^{+1.12}_{-0.81}$       BNL [93]
0.89 ± 0.18       $1.87^{+0.67}_{-1.16}$       KEK [101]
1.01 ± 0.13       $1.61^{+0.57}_{-0.66}$       Average
The observables for which experimental data are not available at present are predicted to be

$$\frac{\Gamma_n}{\Gamma_p}(^{3}_{\Lambda}\mathrm{H}) = 1.3 \pm 0.6, \qquad \frac{\Gamma_n}{\Gamma_p}(^{4}_{\Lambda}\mathrm{H}) = 7.6 \pm 6.2$$

and, for $\rho_2 = 0.001\ \mathrm{fm}^{-3}$,

$$\Gamma_{NM}(^{3}_{\Lambda}\mathrm{H}) = 0.005 \pm 0.003.$$

We note that the central value of $\Gamma_n/\Gamma_p$ for $^{4}_{\Lambda}$H in the analysis which enforces the $\Delta I = 1/2$ rule is considerably larger than the central value obtained in the general analysis previously discussed. Thus, the future measurement [200] of this quantity will represent an important test of the $\Delta I = 1/2$ rule.

We conclude this subsection by considering a simple extension to hypernuclei of the p-shell. In Table 20 the data on weak non-mesonic decay of $^{12}_{\Lambda}$C are quoted. The relevant decay rate can be written in the following form:

$$\Gamma_{NM}(^{12}_{\Lambda}\mathrm{C}) = \frac{\rho_s^{11}}{\rho_4}\,\Gamma_{NM}(^{5}_{\Lambda}\mathrm{He}) + \frac{\rho_p^{11}}{7}\left[3\bar{R}_n(p) + 4\bar{R}_p(p)\right], \tag{78}$$

where $\rho_s^{11}$ ($\rho_p^{11}$) is the mean s-shell (p-shell) nucleon density at the hyperon position, while $\bar{R}_n(p)$ [$\bar{R}_p(p)$] is the spin-averaged p-shell neutron-induced (proton-induced) rate. By using the previous results from s-shell hypernuclei and the weighted average values in Table 20, we obtain

$$\bar{R}_n(p) = (18.3 \pm 10.7)\ \mathrm{fm}^3, \qquad \bar{R}_p(p) = (3.6^{+12.6}_{-3.6})\ \mathrm{fm}^3.$$

The densities $\rho_s^{11}$ ($= 0.064\ \mathrm{fm}^{-3}$) and $\rho_p^{11}$ ($= 0.043\ \mathrm{fm}^{-3}$) have been calculated from the appropriate nucleon s- and p-shell Woods–Saxon wave functions. The s- and p-shell contributions in Eq. (78) are $0.58 \pm 0.20$ and $0.43 \pm 0.24$, respectively. The contribution of the $\Lambda N$ P partial waves to $\Gamma_{NM}$ is estimated to be only 5–15% in p-shell hypernuclei [119–121]. Thus, about 10–30% of the $^{12}_{\Lambda}$C p-shell contribution is expected to originate from $\Lambda N$ relative states with $L = 1$.
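The p-shell term of Eq. (78) can be evaluated directly from the central values quoted above. The following sketch is only an arithmetic illustration: the function and variable names are hypothetical, and the error propagation (which involves the asymmetric error on $\bar{R}_p(p)$) is deliberately left out.

```python
def p_shell_contribution(rho_p, rn_p, rp_p, n_neutrons=3, n_protons=4):
    """p-shell term of Eq. (78): (rho_p / 7) * [3 Rn(p) + 4 Rp(p)].

    The weights 3 and 4 count the p-shell neutrons and protons appearing
    in Eq. (78); their sum gives the factor 7 in the denominator.
    """
    n_total = n_neutrons + n_protons
    return rho_p / n_total * (n_neutrons * rn_p + n_protons * rp_p)


rho_p11 = 0.043    # fm^-3, mean p-shell nucleon density quoted in the text
rn_bar_p = 18.3    # fm^3, central value of the neutron-induced rate
rp_bar_p = 3.6     # fm^3, central value of the proton-induced rate

value = p_shell_contribution(rho_p11, rn_bar_p, rp_bar_p)
print(f"p-shell contribution = {value:.2f}")
```

The central value obtained this way is consistent with the p-shell contribution of 0.43 quoted in the text.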
7. Non-mesonic decay of polarized $\Lambda$-hypernuclei: the asymmetry puzzle

7.1. Introduction

Lambda hypernuclear states can be produced with a sizeable amount of polarization [202]. The development of angular distribution measurements of decay particles (photons, pions and protons) from polarized hypernuclei is of crucial importance in order to extract new information on hypernuclear production, structure and decay.

A new open problem, of very recent origin, in the study of the weak hypernuclear decay concerns large discrepancies among the results of two experiments [203,204], performed at KEK, which observed the asymmetric emission of non-mesonic decay protons from polarized hypernuclei. Theoretical predictions are able to reproduce, although not very accurately, the older measurement, but very recent observations have completely changed the situation, leading to a puzzling status. We analyze the problem in this section.

Thanks to the large momentum transfer involved, the $n(\pi^+, K^+)\Lambda$ reaction has been used, at $p_\pi = 1.05$ GeV and small $K^+$ laboratory scattering angles ($2^\circ \lesssim \theta_K \lesssim 15^\circ$), to produce hypernuclear states with a substantial amount of spin-polarization, preferentially aligned along the axis normal to the reaction plane [203,204]. The origin of hypernuclear polarization is twofold [202]. It is known that the distortions (absorptions) of the initial ($\pi^+$) and final ($K^+$) meson-waves produce a small polarization of the hypernuclear orbital angular momentum up to laboratory scattering angles $\theta_K \simeq 15^\circ$ (at larger scattering angles, the orbital polarization increases with a negative sign). At small but non-zero angles, the main source of polarization is due to an appreciable spin-flip interaction term in the elementary reaction $\pi^+ n \to \Lambda K^+$, which interferes with the spin-nonflip amplitude. In a typical experimental situation with $p_\pi = 1.05$ GeV and $\theta_K \simeq 15^\circ$, the polarization of the hyperon spin in the free $\pi^+ n \to \Lambda K^+$ process is about 0.75.

The KEK experiment of Ref. [203] measured for the first time the asymmetry of the angular distribution of protons produced in the non-mesonic decay, $\vec{\Lambda} p \to np$, of polarized p-shell hypernuclei, produced on a $^{12}$C target. The difference between the number of protons emitted along the polarization axis and the number of protons outgoing in the opposite direction must be determined. As we shall briefly discuss in the next subsection, this proton asymmetry is related to the interference between the parity-violating and parity-conserving transition amplitudes with different values of the NN isospin [28]. Due to the antisymmetry of the NN state, the $\Lambda N \to NN$ parity-violating and parity-conserving amplitudes correspond to $S + I = $ even and $S + I = $ odd final states, respectively ($S$ = spin, $I$ = isospin). This means that the interference terms contributing to the proton asymmetry occur between amplitudes with the same NN intrinsic spin $S$. The non-mesonic partial rates are dominated by the parity-conserving amplitudes. Thanks to the information on the spin-parity structure of the process, which can be obtained with the study of the asymmetric emission of protons from polarized hypernuclei, new constraints can then be imposed on the $\Lambda N \to NN$ decay mechanism.

7.2. Spin-polarization observables

In this subsection we briefly outline the formal derivation of the proton asymmetry parameter and its relation with the other spin observables. More details can be found in Ref. [205]. The intensity of protons emitted in the non-mesonic decay of a polarized hypernucleus along a direction forming
an angle $\Theta$ with the polarization axis is defined by

$$I(\Theta, J) \equiv \mathrm{Tr}[\mathcal{M}\rho(J)\mathcal{M}^\dagger](\Theta) = \sum_{F,M,M'} \langle F; \Theta|\mathcal{M}|I; J, M\rangle \langle I; J, M|\rho(J)|I; J, M'\rangle \langle I; J, M'|\mathcal{M}^\dagger|F; \Theta\rangle. \tag{79}$$

Here, $\mathcal{M}$ is the operator describing the $\vec{\Lambda} p \to np$ transition, $|I; J, M\rangle$ is the initial hypernuclear state, $M$ denoting the third component of the hypernuclear total spin $J$, $|F; \Theta\rangle$ the many-body final state (given by the residual nucleus and the outgoing nucleons, with a proton emerging at an angle $\Theta$) and $\rho$ is the density matrix of the polarized hypernucleus. With reference to the $(\pi^+, K^+)$ production reaction, the density matrix for pure vector polarization along $\vec{k}_\pi \times \vec{k}_K$ is given by [205]

$$\rho(J) = \frac{1}{2J+1}\left[1 + \frac{3}{J+1}\,P_y(J)\,S_y\right] \tag{80}$$

in the Madison frame, in which the $z_M$-axis is along the direction of the incoming pion and the $y_M$-axis is along $\vec{k}_\pi \times \vec{k}_K$. In Eq. (80) $P_y$ is the hypernuclear polarization and $S_y$ the projection along the $y_M$-axis of the spin operator $J$. From Eq. (79) one then obtains the proton distribution in the form:

$$I(\Theta, J) = I_0(J)[1 + A(\Theta, J)],$$

where

$$I_0(J) = \frac{\mathrm{Tr}(\mathcal{M}\mathcal{M}^\dagger)}{2J+1}$$

is the (isotropic) intensity for an unpolarized hypernucleus. The asymmetry of the angular distribution for the outgoing protons is expressed by

$$A(\Theta, J) = P_y(J)\,\frac{3}{J+1}\,\frac{\mathrm{Tr}(\mathcal{M} S_y \mathcal{M}^\dagger)(\Theta)}{\mathrm{Tr}(\mathcal{M}\mathcal{M}^\dagger)}.$$

One easily obtains that this proton asymmetry parameter is proportional to $\cos\Theta$ [205]:

$$A(\Theta, J) = P_y(J)\,A_y(J)\cos\Theta. \tag{81}$$

Here, the quantity

$$A_y(J) = \frac{3}{J+1}\,\frac{\sum_M M\,\sigma(J, M)}{\sum_M \sigma(J, M)}, \tag{82}$$

which is a property of the hypernuclear non-mesonic decay only, is usually referred to as the hypernuclear asymmetry parameter. The hypernuclear polarization $P_y$ depends both on the kinematics ($p_\pi$ and $\theta_K$) and dynamics of the production reaction. In Eq. (82),

$$\sigma(J, M) = \sum_F |\langle F|\mathcal{M}|I; J, M\rangle|^2 \tag{83}$$

is the intensity of protons emitted along the quantization axis $z$ for a projection $M$ of the hypernuclear total spin. The transition amplitudes appearing in Eqs. (79) and (83) are evaluated in the proton helicity frame, whose $z$-axis is along the direction of the outgoing proton.
In the shell model weak-coupling scheme with the $1s$ $\Lambda$-hyperon coupled to the nuclear core ground state, $P_y$ is directly related to the polarization $p_\Lambda$ of the $\Lambda$ spin in the hypernucleus as follows:

$$p_\Lambda(J) = \begin{cases} -\dfrac{J}{J+1}\,P_y(J) & \text{if } J = J_C - \tfrac{1}{2}, \\[2mm] P_y(J) & \text{if } J = J_C + \tfrac{1}{2}, \end{cases} \tag{84}$$

$J_C$ being the total spin of the nuclear core. It is useful to introduce an intrinsic lambda asymmetry parameter $a_\Lambda$, which is characteristic of the elementary process $\vec{\Lambda} p \to np$ and should be independent of the hypernucleus, such that

$$A(\Theta, J) = p_\Lambda(J)\,a_\Lambda \cos\Theta. \tag{85}$$

From Eqs. (81) and (84) it then follows that

$$a_\Lambda = \begin{cases} -\dfrac{J+1}{J}\,A_y(J) & \text{if } J = J_C - \tfrac{1}{2}, \\[2mm] A_y(J) & \text{if } J = J_C + \tfrac{1}{2}, \end{cases} \tag{86}$$
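and $a_\Lambda = A_y(J) = 0$ if $J = 0$. In the case of $^{5}_{\Lambda}\vec{\mathrm{He}}$, $J_C = 0$ and $J = 1/2$, thus:

$$a_\Lambda \equiv A_y(^{5}_{\Lambda}\vec{\mathrm{He}}) = \frac{\sigma(^{5}_{\Lambda}\vec{\mathrm{He}}, +1/2) - \sigma(^{5}_{\Lambda}\vec{\mathrm{He}}, -1/2)}{\sigma(^{5}_{\Lambda}\vec{\mathrm{He}}, +1/2) + \sigma(^{5}_{\Lambda}\vec{\mathrm{He}}, -1/2)}$$

and $-1 \leq a_\Lambda \leq 1$.

7.3. Experiments

Experimentally, the proton asymmetry parameter is obtained by comparing the number of protons emerging parallel and antiparallel to the $y_M$-axis:

$$A(0^\circ) = \frac{I(0^\circ) - I(180^\circ)}{I(0^\circ) + I(180^\circ)}. \tag{87}$$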
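The relations in Eqs. (84), (86) and (87) are simple enough to be sketched in a few lines of code. The snippet below is an illustration under stated assumptions, not the analysis code of any experiment: the function names are hypothetical, and the proton counts used at the end are invented purely to exercise Eq. (87).

```python
def lambda_polarization(p_y, j, j_core):
    """Eq. (84): Lambda polarization from the hypernuclear polarization P_y."""
    if abs(j - (j_core + 0.5)) < 1e-9:
        return p_y
    if abs(j - (j_core - 0.5)) < 1e-9:
        return -j / (j + 1.0) * p_y
    raise ValueError("weak coupling requires J = J_C +/- 1/2")


def intrinsic_asymmetry(a_y, j, j_core):
    """Eq. (86): intrinsic asymmetry a_Lambda from the hypernuclear A_y."""
    if abs(j - (j_core + 0.5)) < 1e-9:
        return a_y
    if abs(j - (j_core - 0.5)) < 1e-9:
        # For J = 0 the text states that a_Lambda = A_y = 0.
        return 0.0 if j == 0 else -(j + 1.0) / j * a_y
    raise ValueError("weak coupling requires J = J_C +/- 1/2")


def counting_asymmetry(n_parallel, n_antiparallel):
    """Eq. (87): asymmetry from proton yields along and against the y_M axis."""
    return (n_parallel - n_antiparallel) / (n_parallel + n_antiparallel)


# 5-Lambda-He: J_C = 0, J = 1/2, so p_Lambda = P_y and a_Lambda = A_y.
print(lambda_polarization(0.4, 0.5, 0.0))

# A case with J = J_C - 1/2 (J = 1, J_C = 3/2): the product P_y * A_y
# must equal p_Lambda * a_Lambda, as required by Eqs. (81) and (85).
p_y, a_y = 0.3, 0.2
product = lambda_polarization(p_y, 1.0, 1.5) * intrinsic_asymmetry(a_y, 1.0, 1.5)
print(product, p_y * a_y)

# Hypothetical proton counts, for illustration only.
print(counting_asymmetry(1200, 1000))
```

The second check makes explicit why Eqs. (84) and (86) carry reciprocal factors: the measured asymmetry $A(\Theta, J)$ is the same whether it is written as $P_y A_y$ or as $p_\Lambda a_\Lambda$.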
The asymmetry $A(0^\circ)$ measured by the KEK experiments [198,203,204] suffered from large uncertainties, principally due to limited statistics, final state interaction effects (which attenuate the weak decay vertex proton asymmetry) and to the poor knowledge of the hypernuclear polarization. Moreover, two-nucleon induced decays, not taken into account in the experimental analyses, are expected to contribute.

In the first experiment [203], KEK-E160, $^{11}_{\Lambda}\vec{\mathrm{B}}$, $^{12}_{\Lambda}\vec{\mathrm{C}}$ and other p-shell hypernuclei were produced by the $(\pi^+, K^+)$ reaction on $^{12}$C. At about 10 MeV excitation energy with respect to the $^{12}_{\Lambda}$C($1^-$) ground state, the reaction can create proton-unbound states, which then populate the $^{11}_{\Lambda}$B($\frac{1}{2}^+$) ground state by proton and photon emissions. The high excitation energy region, around 20 MeV, is called the quasi-bound region since, even if here the $\Lambda$ has a finite escape probability, de-excitations via the emission of one or more nucleons are also possible, and lead to a light hyperfragment (LH) with $A \leq 10$: for example, the emission of a $p$, $n$, $d$, $^3$He or $\alpha$ particle produces a final $^{11}_{\Lambda}$B, $^{11}_{\Lambda}$C, $^{10}_{\Lambda}$B, $^{9}_{\Lambda}$Be or $^{8}_{\Lambda}$Be hypernucleus, respectively. The statistics and energy resolution (5–7 MeV) of the kaon spectrometer were limited at KEK-E160; moreover, the polarization of the produced hypernuclei, whose decay protons were observed, had to be evaluated theoretically in order to determine the intrinsic asymmetry, $a_\Lambda$, from the measured $A$. Such a calculation requires a delicate analysis of (1) the polarization of the hypernuclear states directly produced in the $(\pi^+, K^+)$ reaction and (2) the depolarization effects due to strong and electromagnetic transitions of the populated excited states, which take place before the weak decay. In Table 21 we list the observed asymmetries.

Table 21
Asymmetries observed at KEK-E160 [203]

                                        $^{12}_{\Lambda}\vec{\mathrm{C}}$        $^{11}_{\Lambda}\vec{\mathrm{B}}$        LH ($A \leq 10$)
$A(0^\circ)$                            $-0.01 \pm 0.11$      $-0.19 \pm 0.10$      $-0.24 \pm 0.09$
$p_\Lambda$                             0.06–0.09             0.16–0.21             0.15–0.26
$a_\Lambda = A(0^\circ)/(k p_\Lambda)$  $-0.17 \pm 1.83$      $-1.33 \pm 0.72$      $-1.50 \pm 0.68$
$p_\Lambda$                             0.095                 0.31
$a_\Lambda = A(0^\circ)/(k p_\Lambda)$  $-0.13 \pm 1.45$      $-0.77 \pm 0.41$

According to Eq. (85), the proton asymmetry $A(0^\circ)$ should depend linearly on the polarization $p_\Lambda$ of the $\Lambda$ hyperon in the nucleus (see second and fourth line of the table), which is always positive, reflecting the positive sign of the $\Lambda$ polarization in the elementary $\pi^+ n \to \Lambda K^+$ reaction. The values for the intrinsic asymmetry, $a_\Lambda = A(0^\circ)/(k p_\Lambda)$, of the third line are obtained by using the theoretical evaluations of $P_y$ originally employed in the analysis of Ref. [203]. The attenuation factor $k$, estimated to be around 0.8, is due to the Fermi motion and the rescattering of the emitted protons. The main reason for the attenuation in the observed asymmetry is the detection of secondary protons, emitted as a consequence of the scattering of decay neutrons and protons with the nucleons of the residual nucleus. By assuming that $a_\Lambda$ is independent of the hypernucleus, the weighted average of the three results supplies a very large and negative asymmetry: $a_\Lambda = -1.3 \pm 0.4$, namely in the physically acceptable range between $-0.9$ and $-1$. In the fifth line of the table, more realistic evaluations of the $\Lambda$ polarization, extracted from Refs. [206–208], are used to obtain $a_\Lambda$. A weighted average among the improved results for $^{11}_{\Lambda}\vec{\mathrm{B}}$ and $^{12}_{\Lambda}\vec{\mathrm{C}}$ and the original one for lighter hyperfragments gives a smaller asymmetry value: $a_\Lambda = -0.9 \pm 0.3$.

More recently [209], it has been possible to measure the polarization of $^{5}_{\Lambda}\vec{\mathrm{He}}$ hypernuclei, which, from Eq. (84), coincides with the $\Lambda$ polarization: $P_y(^{5}_{\Lambda}\vec{\mathrm{He}}) = p_\Lambda(^{5}_{\Lambda}\vec{\mathrm{He}})$. The $^{6}\mathrm{Li}(\pi^+, K^+)^{6}_{\Lambda}\mathrm{Li}$ reaction is used to produce a polarized $^{6}_{\Lambda}$Li hypernucleus. The ground state of $^{6}_{\Lambda}$Li lies above the $^{5}_{\Lambda}\mathrm{He} + p$ threshold, thus a $^{5}_{\Lambda}$He hypernucleus in the $0^+$ ground state is exclusively produced by the emission of a proton. The polarization of $^{5}_{\Lambda}$He is measured by observing the asymmetric emission of negative pions in its mesonic decay, $A^{\pi^-} = P_y A_y^{\pi^-}$. To obtain the polarization from the observed $A^{\pi^-}$, $A_y^{\pi^-}$ was assumed [210] to be equal to the value for the free $\Lambda \to \pi^- p$ decay: $\alpha_{\pi^-} = -0.642 \pm 0.013$ [211]. This approximation is reasonable, since in $^{5}_{\Lambda}$He the hyperon is coupled to a spin-parity $0^+$ $^{4}$He core. Unfortunately, the small branching ratio and asymmetry parameter for the mesonic decay of p-shell hypernuclei make such a measurement very difficult for these systems. The distorted wave impulse approximation of Ref. [210] reproduces quite well the measured values of $P_y(^{5}_{\Lambda}\vec{\mathrm{He}})$ [209]. However, it is not clear whether such a model is able to account for the polarization mechanism of p-shell hypernuclei.
Table 22
Asymmetries observed at KEK-E278 [204] for $^{5}_{\Lambda}\vec{\mathrm{He}}$. The values of $p_\Lambda$ in the third line are taken from Ref. [209]

$K^+$ scattering-angle                                $2^\circ < |\theta_K| < 7^\circ$      $7^\circ < |\theta_K| < 15^\circ$
$A(0^\circ)$                                          $0.082 \pm 0.060$                     $0.035 \pm 0.080$
$p_\Lambda$                                           $0.247 \pm 0.082$                     $0.393 \pm 0.094$
$a_\Lambda = A(0^\circ)/(\epsilon k p_\Lambda)$       $0.441 \pm 0.356$                     $0.120 \pm 0.271$
Weighted average                                      $0.24 \pm 0.22$
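The third data row of Table 22 follows from the first two through $a_\Lambda = A(0^\circ)/(\epsilon k p_\Lambda)$, with $\epsilon = 0.804$ and $k = 0.935$ as quoted in the discussion of this experiment. The sketch below reproduces the tabulated values approximately. It assumes first-order uncorrelated error propagation with no uncertainty assigned to $\epsilon$ and $k$; whether Ref. [204] proceeds in exactly this way is an assumption, and the function name is hypothetical.

```python
import math

EPSILON = 0.804   # acceptance reduction factor quoted for KEK-E278
K_ATT = 0.935     # nuclear attenuation factor quoted for KEK-E278


def asymmetry_from_measurement(a0, a0_err, p_lam, p_lam_err,
                               eps=EPSILON, k=K_ATT):
    """Intrinsic asymmetry a_Lambda = A(0 deg) / (eps * k * p_Lambda).

    Errors on A(0 deg) and p_Lambda are combined in quadrature to first
    order; eps and k are treated as exact (an assumption).
    """
    denom = eps * k * p_lam
    a_lam = a0 / denom
    err_from_a0 = a0_err / denom
    err_from_p = abs(a_lam) * p_lam_err / p_lam
    return a_lam, math.hypot(err_from_a0, err_from_p)


# The two K+ scattering-angle regions of Table 22
a1, e1 = asymmetry_from_measurement(0.082, 0.060, 0.247, 0.082)
a2, e2 = asymmetry_from_measurement(0.035, 0.080, 0.393, 0.094)
print(f"2-7 deg : a_Lambda = {a1:.3f} +/- {e1:.3f}")
print(f"7-15 deg: a_Lambda = {a2:.3f} +/- {e2:.3f}")
```

The small residual differences with respect to the table are at the level expected from rounding of the inputs.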
The experimental values of $P_y(^{5}_{\Lambda}\vec{\mathrm{He}})$ have been employed, very recently, to determine $a_\Lambda \equiv A_y(^{5}_{\Lambda}\vec{\mathrm{He}})$ from a measurement of the proton asymmetry in $^{5}_{\Lambda}\vec{\mathrm{He}}$ (KEK-E278) [204]. Again, $^{5}_{\Lambda}\vec{\mathrm{He}}$ hypernuclei have been produced by the $^{6}\mathrm{Li}(\pi^+, K^+)^{6}_{\Lambda}\mathrm{Li}$ reaction. When compared with an experiment employing a p-shell hypernucleus, the use of $^{5}_{\Lambda}\vec{\mathrm{He}}$ has evident virtues: the measured hypernuclear polarization is larger and approximately equal to that of the $\Lambda$-hyperon, since $J^P(^{4}\mathrm{He}) = 0^+$; the nuclear effects on the observed asymmetry $A$ are smaller; finally, only the relative S-wave in the initial $\Lambda p$ system is active. All these features help the theoretical interpretation of data. In Table 22 the obtained results are quoted. The proton asymmetry $A(0^\circ)$ has been measured for two $K^+$ scattering-angle regions, $2^\circ < |\theta_K| < 7^\circ$ and $7^\circ < |\theta_K| < 15^\circ$. The reduction factor, $\epsilon = 0.804$, is due to the finite acceptance of the decay counter system, while the attenuation factor, $k = 0.935$, is again due to nuclear effects. Both these quantities, estimated through Monte Carlo simulations, and the $\Lambda$ polarization in $^{5}_{\Lambda}\vec{\mathrm{He}}$ are required in order to derive the intrinsic asymmetry. A statistical fluctuation caused a remarkable difference between the values of $a_\Lambda$ in the two scattering-angle regions. However, in the hypothesis that this observable depends on the one-body induced non-mesonic decay only, a weighted average is permitted and leads to a relatively small, positive value, within 2 standard deviations.

The experiments thus revealed an opposite sign of $a_\Lambda$ for $^{5}_{\Lambda}\vec{\mathrm{He}}$ [204] and p-shell hypernuclei [203]. This is puzzling, since from its definition one expects $a_\Lambda$ not to be very sensitive to nuclear structure effects: Ref. [205] ([121]) demonstrated that this is true within 25% (6%) in a calculation for $^{5}_{\Lambda}\vec{\mathrm{He}}$ and $^{12}_{\Lambda}\vec{\mathrm{C}}$ (see next subsection). The weak coupling scheme is known to be a good approximation for describing the ground states of $\Lambda$ hypernuclei. However, one must note that, in the experiment on p-shell hypernuclei, due to the low energy resolution, several excited hypernuclear states come into play. The procedure used to calculate the hypernuclear polarization in this case is complicated and could have led to an unrealistic value of $a_\Lambda$. For example, in Ref. [205], a sizeable reduction (increase) of the hypernuclear polarization $P_y$ has been found for $^{12}_{\Lambda}\vec{\mathrm{C}}$ ($^{11}_{\Lambda}\vec{\mathrm{B}}$) once the spin depolarization of possibly populated excited states of these hypernuclei is taken into account. It is difficult, however, to think that the sign difference is only due to this effect. A statistical fluctuation can also hardly cause such a difference between the two experiments. Another possible explanation, suggested in Ref. [204], could arise from a dominance of the $L = 1$ $\Lambda N$ interaction in p-shell hypernuclei. However, this hypothesis is incompatible with calculations [119–121] which showed that the $\Lambda N$ $L = 0$ relative state is the dominant one in the non-mesonic decay of those hypernuclei, giving about 85–95% of the total non-mesonic rate.
The preliminary results of KEK-E307 [198], which employs carbon, silicon and iron targets, show a large and positive value of $a_\Lambda$ for $^{12}_{\Lambda}\vec{\mathrm{C}}$ within 2 standard deviations: they find $a_\Lambda = 0.85 \pm 0.39$, in complete disagreement with the outcome of KEK-E160. To improve statistics, in the E307 experiment an analysis of $a_\Lambda$ including all the non-mesonic decay events gated to both the bound and continuum regions of $^{12}_{\Lambda}\vec{\mathrm{C}}$, $^{27}_{\Lambda}\vec{\mathrm{Al}}$, $^{28}_{\Lambda}\vec{\mathrm{Si}}$ and $_{\Lambda}\vec{\mathrm{Fe}}$ is in progress. A preliminary result confirms a positive value of $a_\Lambda$ [198] for hypernuclei beyond the s-shell. However, future data analysis as well as improved statistical and systematic uncertainties are needed before this conclusion can be ensured.

7.4. Theory versus experiment

Within the model of Block and Dalitz [146,147], discussed in Section 6.3, the intrinsic lambda asymmetry parameter of Eqs. (82) and (86) is evaluated through the following formula [212]:

$$a_\Lambda \equiv A_y(^{5}_{\Lambda}\vec{\mathrm{He}}) = \frac{2\sqrt{3}\,\mathrm{Re}\left[a_p e_p^* - \frac{1}{\sqrt{3}}\,b_p(c_p^* - \sqrt{2}\,d_p^*) + f_p(\sqrt{2}\,c_p^* + d_p^*)\right]}{|a_p|^2 + |b_p|^2 + 3\left(|c_p|^2 + |d_p|^2 + |e_p|^2 + |f_p|^2\right)}, \tag{88}$$

where

$$a_p = \langle np, {}^1S_0|t|\Lambda p, {}^1S_0\rangle, \qquad b_p = \langle np, {}^3P_0|t|\Lambda p, {}^1S_0\rangle,$$
$$c_p = \langle np, {}^3S_1|t|\Lambda p, {}^3S_1\rangle, \qquad d_p = \langle np, {}^3D_1|t|\Lambda p, {}^3S_1\rangle,$$
$$e_p = \langle np, {}^1P_1|t|\Lambda p, {}^3S_1\rangle, \qquad f_p = \langle np, {}^3P_1|t|\Lambda p, {}^3S_1\rangle$$

are the elementary $\Lambda p \to np$ transition amplitudes. The use of Eq. (88) to estimate $a_\Lambda$ only provides approximate results. Indeed, within the model of Block and Dalitz, interference effects as well as final state interactions of the two outgoing nucleons with the residual nucleus are neglected. However, it is evident from Eq. (88) that the asymmetry is due to the interference between parity-conserving ($a_p$, $c_p$ and $d_p$) and parity-violating ($b_p$, $e_p$ and $f_p$) $\Lambda p \to np$ amplitudes with the same value of the $np$ intrinsic spin $S$. Hence, interference terms between spin-singlet ($J = 0$) and spin-triplet ($J = 1$) amplitudes (terms in $a_p e_p^*$, $b_p c_p^*$ and $b_p d_p^*$) enter $a_\Lambda$.

In Table 23 we summarize the calculations of the intrinsic asymmetry. Previously discussed experimental data are reported for comparison.
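Eq. (88) is straightforward to evaluate once the six amplitudes are known. The following sketch is an illustration only: the function name is hypothetical, the amplitudes passed to it are arbitrary numbers rather than results of any model, and the optional flag simply drops the $J = 0$–$1$ interference terms ($a_p e_p^*$, $b_p c_p^*$, $b_p d_p^*$) identified in the text.

```python
import math


def block_dalitz_asymmetry(a, b, c, d, e, f, include_j01_interference=True):
    """Intrinsic asymmetry of Eq. (88) from the six Lambda p -> np amplitudes.

    a, c, d are the parity-conserving amplitudes and b, e, f the
    parity-violating ones; all may be complex.
    """
    a, b, c, d, e, f = (complex(x) for x in (a, b, c, d, e, f))
    sqrt2, sqrt3 = math.sqrt(2.0), math.sqrt(3.0)

    # Interference among amplitudes that all start from the 3S1 (J = 1) state.
    interference = f * (sqrt2 * c + d).conjugate()
    if include_j01_interference:
        # Terms mixing the 1S0 (J = 0) and 3S1 (J = 1) initial states.
        interference += a * e.conjugate()
        interference -= b * (c - sqrt2 * d).conjugate() / sqrt3

    numerator = 2.0 * sqrt3 * interference.real
    denominator = (abs(a) ** 2 + abs(b) ** 2
                   + 3.0 * (abs(c) ** 2 + abs(d) ** 2
                            + abs(e) ** 2 + abs(f) ** 2))
    return numerator / denominator


# Arbitrary illustrative amplitudes (not taken from any calculation).
amps = dict(a=0.5, b=0.2 + 0.1j, c=1.0, d=-0.8, e=0.3, f=-0.4j)
full = block_dalitz_asymmetry(**amps)
truncated = block_dalitz_asymmetry(**amps, include_j01_interference=False)
print(full, truncated)
```

With all parity-violating amplitudes set to zero the function returns zero, which makes explicit that the asymmetry is a pure interference effect between parity-conserving and parity-violating amplitudes.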
Table 23
Calculations of the intrinsic lambda parameter $a_\Lambda$. Note that $A_y(^{5}_{\Lambda}\vec{\mathrm{He}}) = a_\Lambda$ and $A_y(^{12}_{\Lambda}\vec{\mathrm{C}}) = -a_\Lambda/2$

Ref. and model            $^{5}_{\Lambda}\vec{\mathrm{He}}$            $^{12}_{\Lambda}\vec{\mathrm{C}}$            NM
Ramos et al. [205]
  OPE                     $-0.524$                  $-0.397$
  $\pi + K$               $-0.509$                  $-0.375$
Dubach et al. [126]
  OPE                                                                         $-0.192$
  OME                                                                         $-0.443$
Sasaki et al. [213]
  OPE                                                                         $-0.441$
  $\pi + K$                                                                   $-0.362$
  $\pi + K$ + DQ                                                              $-0.678$
Parreño et al. [121]
  OPE                     $-0.252$                  $-0.340$
  $\pi + K$               $-0.572$ to $-0.606$      $-0.626$ to $-0.640$
  OME                     $-0.675$ to $-0.682$      $-0.716$ to $-0.734$
Exp KEK-E160 [203]                                  $-0.9 \pm 0.3$
Exp KEK-E278 [204]        $0.24 \pm 0.22$
Exp KEK-E307 [198]                                  $0.85 \pm 0.39$ (prel.)

All evaluations provide a negative asymmetry, between $-0.38$ and $-0.73$ for the complete results, in fair agreement with the old KEK result of 1992, but in strong disagreement with the positive sign revealed by the recent experiments. As expected, the calculations show a moderate sensitivity of the asymmetry to the details of nuclear structure. The work of Ramos et al. [205] has been performed in a relativistic nuclear model by applying formula (82), which defines, through Eq. (86), the intrinsic asymmetry. The nuclear matter calculation of Dubach et al. [126] refers to an OME model including the exchange of $\pi$, $\rho$, $K$, $K^*$, $\omega$ and $\eta$ mesons. In this case, only relative S-wave interactions are considered in the initial $\Lambda p$ state; moreover, $a_\Lambda$ has been calculated through Eq. (88) by neglecting the above-mentioned interference terms between the $J = 0$ and 1 $\Lambda p \to np$ transitions. These terms must be included in the calculation, and are quantitatively important: for example, in the OPE calculation of Sasaki et al. [213], the complete formula supplies an asymmetry equal to $-0.441$, to be compared with the result, $-0.159$ [214], obtained by disregarding the $J = 0$–1 interference terms. Incidentally, this approximation is allowed only when the $\Lambda p \to np$ process occurs in free space: more precisely, in free space only a spin-triplet $\Lambda p$ initial state contributes to the asymmetry [see Eq. (88)]. Finally, Parreño et al. [121] applied Eq. (82) to $^{5}_{\Lambda}\vec{\mathrm{He}}$ and $^{12}_{\Lambda}\vec{\mathrm{C}}$ hypernuclei within a shell model framework with an OME transition potential including the exchange of $\pi$, $\rho$, $K$, $K^*$, $\omega$ and $\eta$ mesons. We note that these authors find a considerable increase of the asymmetry when the K-meson is added to the pion. On the contrary, Sasaki et al. [213] obtained a lower asymmetry in the $\pi + K$ calculation with respect to the pure OPE value. However, the OME and $\pi + K$ + DQ results of the two calculations agree with each other, with values around $-0.7$. At variance with the above-discussed results, the calculation of Ramos et al. [205] supplies practically the same asymmetry in the OPE and $\pi + K$ models. The origin of these discrepancies is unknown: the differences among the various OPE calculations are due to the use of different $\Lambda N$ form factors and short range correlations for the initial $\Lambda p$ and final $np$ states.

In conclusion, further investigations are required to clarify the situation: on the theoretical side there seems to be no way (even by forcing the model parameters to unrealistic values) to obtain positive asymmetry values [215]; on the experimental side the present anomalous discrepancy between different data needs to be resolved. The hypothesis has been advanced that the asymmetry
puzzle could have the same origin as the previously discussed puzzle on the $\Gamma_n/\Gamma_p$ ratio [204]. At present there is no firm evidence of this relation. Indeed, the situation is even more confused for the asymmetry than for the $\Gamma_n/\Gamma_p$ ratio: in the former case, the experiments cannot provide any guidance for new theoretical speculations. We hope that future experimental studies of the inverse reaction $\vec{p}\,n \to p\Lambda$ in free space could help in disentangling the puzzling situation. Indeed, the weak production of the $\Lambda$-hyperon through the scattering of longitudinally polarized protons on neutron targets can give a richer and cleaner (with respect to the non-mesonic hypernuclear decay) piece of information on the $\Lambda$ polarization-observables [212].

8. Summary and perspective

In this review we have discussed the present status of hypernuclear physics. Beyond an extensive and updated description of our present understanding of weak hypernuclear decay processes, which are the main topic of the paper, we have also illustrated some phenomenological aspects of the YN, YY interaction and the hypernuclear structure and reviewed the reactions which are used to produce hypernuclei.

Measurements of the YN and YY cross sections are very difficult to perform, because of the very short lifetimes of hyperons. As a consequence, the various phenomenological models developed to describe these interactions are not completely satisfactory. One of the major reasons for interest in hypernuclear phenomena thus lies in the information which can be extracted about the YN and YY interactions (both of strong and weak nature, the former being relevant for hypernuclear structure studies and the latter for hypernuclear weak decays).

Further, we have introduced the weak decay modes of $\Lambda$-hypernuclei: beyond the mesonic channel, which is observed also for a free $\Lambda$, the hypernuclear decay proceeds through non-mesonic processes, mainly induced by one nucleon or by a pair of correlated nucleons. This channel is the dominant one in medium–heavy hypernuclei, where the Pauli principle strongly suppresses the mesonic decay.

The results obtained within the various models proposed to describe the mesonic and non-mesonic decay rates as well as the asymmetry parameters in the decay of the $\Lambda$-hyperon in nuclei have been thoroughly discussed. The mesonic rates have been reproduced quite well by calculations performed in different frameworks. The non-mesonic rates have been considered within a variety of phenomenological and microscopic models, most of them being based on the exchange of a pion between the decaying $\Lambda$ and the nucleon(s). More complex meson exchange potentials, as well as direct quark models, have also been considered for the evaluation of non-mesonic decay rates. In this context, particular interest has been devoted to the partial rates $\Gamma_n$ and $\Gamma_p$ and to their ratio. In spite of the fact that several calculations have been able to reproduce, already at the OPE level, the total non-mesonic width, $\Gamma_{NM} = \Gamma_n + \Gamma_p\,(+\Gamma_2)$, the values thereby obtained for $\Gamma_n/\Gamma_p$ reveal a strong disagreement with the measured central data. Actually, due to the large experimental uncertainties involved in the extraction of $\Gamma_n/\Gamma_p$, at present one cannot draw definite conclusions, and different and more refined experimental analyses are required to correct for possible deficiencies of the models.

Notably, the non-mesonic partial rates $\Gamma_n$ and $\Gamma_p$ are dominated by parity-conserving transition amplitudes. The asymmetric emission of protons from proton-induced non-mesonic decays of polarized hypernuclei is related to the interference between the parity-conserving and parity-violating
transition amplitudes to NN states with the same intrinsic spin $S$. Therefore, the study of the decay asymmetries complements that of the non-mesonic partial rates, providing, at least in principle, new constraints on the $\Lambda N \to NN$ decay mechanism.

Nuclear structure uncertainties are under control, and cannot influence very much the calculation of the hypernuclear observables for the non-mesonic decay. The total non-mesonic widths turn out to be relatively insensitive to the details of the weak interaction model. On the contrary, the ratio $\Gamma_n/\Gamma_p$ strongly depends on the decay mechanism. Nevertheless, the OME calculations, including the exchange of mesons more massive than the pion such as the $\rho$, $K$, $K^*$, $\omega$ and $\eta$, as well as the (correlated or uncorrelated) two-pion exchange models, have not sufficiently improved the comparison with the experimental $\Gamma_n/\Gamma_p$ ratios and total non-mesonic rates. These evaluations are rather sensitive to the models used for the required meson–baryon–baryon strong and weak vertices. However, only by using rather unrealistic coupling constants is it possible to fit, simultaneously, the data on $\Gamma_n + \Gamma_p$ and $\Gamma_n/\Gamma_p$ for different hypernuclei [215]. The OPE mechanism alone is able to reproduce the observed total non-mesonic widths, but strongly underestimates (by about one order of magnitude) the central data for the ratio. Only the K-meson exchange turned out to be important to obtain considerably larger $\Gamma_n/\Gamma_p$ ratios, but the central data remain underestimated. The inclusion in the non-mesonic transition potential of quark degrees of freedom suffers from large theoretical uncertainties. The models that implemented direct quark interactions in OME calculations found $\Gamma_n/\Gamma_p$ values considerably larger than the OPE estimates, also as a result of the K-meson exchange, but problems remain in reproducing both the ratio and $\Gamma_n + \Gamma_p$ for the considered systems.

Although some of the discussed improvements could represent a step forward in the solution of the $\Gamma_n/\Gamma_p$ puzzle, further efforts (especially on the experimental side) must be invested in order to understand the detailed dynamics of the non-mesonic decay. From the theoretical point of view, it is not easy to imagine new mechanisms as responsible for the large observed ratios. Very recent experiments at KEK have considerably reduced the error bars on $\Gamma_n/\Gamma_p$, by means of single nucleon spectra measurements. The new experiments confirmed previous data, with improved accuracy. However, in order to avoid possible deficiencies of this kind of observations, a direct and unambiguous extraction of the ratio is compulsory. As widely discussed in the present review, for such a determination good statistics coincidence measurements of the $nn$ and $np$ emitted pairs are required. These correlation measurements will also allow for the identification of the nucleons which come out from the different one- and two-nucleon induced processes.

As far as the asymmetry parameters are concerned, the situation is even more puzzling. Indeed, strong inconsistencies already appear at the experimental level: the two existing experiments revealed an opposite sign of the intrinsic asymmetry parameter, $a_\Lambda$, for $^{5}_{\Lambda}\vec{\mathrm{He}}$ and p-shell hypernuclei. This is in strong contradiction with the theoretical expectation of an intrinsic asymmetry which should be, in principle, rather insensitive to nuclear structure effects. Some calculations reproduced the first measurement of $a_\Lambda$, which found a large and negative value for p-shell hypernuclei, but no calculation could obtain a positive value for $^{5}_{\Lambda}\vec{\mathrm{He}}$. The experiments thus cannot provide any guidance for further theoretical evaluations. Improved experiments, establishing with certainty the sign and magnitude of $a_\Lambda$ for s- and p-shell hypernuclei, are then strongly awaited.

We conclude this work by reminding the reader that hypernuclear physics is 49 years old, yet much work remains to be done, both experimentally and theoretically, in order to fully understand the hyperon dynamics and decay inside the nuclear medium. The impressive progress experienced
in the last few years is promising and we hope that it will lead to a definite answer to the intriguing open questions which we have illustrated here.

Acknowledgements

Fruitful and friendly discussions with H.C. Bhang, R.H. Dalitz, O. Hashimoto, A. Molinari, E. Oset, H. Outa, A. Parreño, K. Sasaki and Y. Sato are acknowledged. We are especially grateful to the members of KEK for providing us with a paper of theirs prior to publication; we also thank E. Botta and A. Feliciello for technical support and assistance during and after the VII International Conference on Hypernuclear and Strange Particle Physics. We also take the opportunity to warmly acknowledge our colleagues, R. Cenni, A. De Pace and A. Ramos, who collaborated with us in obtaining some of the results discussed in the review. A special recognition goes to A. Parreño, for her suggestions and comments after reading part of the manuscript. The work has been partially supported by the EEC through TMR Contract CEE-0169.

Appendix A. Spin–isospin $NN \to NN$ and $\Lambda N \to NN$ interactions

In this appendix we show how the repulsive $NN$ and $\Lambda N$ strong correlations at short distances are implemented in the $NN \to NN$ and $\Lambda N \to NN$ interactions and then in the hypernuclear decay width calculated within the polarization propagator method of Section 5.4.

The $NN \to NN$ interaction can be described through an effective potential given by

$$G(r) = g(r)V(r). \tag{A.1}$$

Here $g(r)$ is a two-body correlation function, which vanishes as $r \to 0$ and goes to 1 as $r \to \infty$, while $V(r)$ is the meson exchange potential, which in our case contains $\pi$ and $\rho$ exchange: $V = V_\pi + V_\rho$. A practical and realistic form for $g(r)$ is [182]:

$$g(r) = 1 - j_0(q_c r), \tag{A.2}$$

where $j_0$ is the spherical Bessel function of order 0. With $q_c = m_\omega \simeq 780$ MeV one gets a good reproduction of realistic $NN$ correlation functions obtained from G-matrix calculations. The inverse of $q_c$ is indicative of the hard core radius of the interaction. Since there are no experimental indications, the same correlation momentum is generally used for the strong $\Lambda N$ interaction. On the other hand, we remind the reader that $q_c$ is not necessarily the same in the two cases, given the different nature of the repulsive forces involved.

Using the correlation function (A.2) it is easy to get the effective interaction, Eq. (A.1), in momentum space. It reads

$$G_{NN \to NN}(q) = V_\pi(q) + V_\rho(q) + \frac{f_\pi^2}{m_\pi^2}\left\{g_L(q)\,\hat{q}_i\hat{q}_j + g_T(q)(\delta_{ij} - \hat{q}_i\hat{q}_j)\right\}\sigma_i\sigma_j\,\vec{\tau}\cdot\vec{\tau},$$

where the correlations are embodied in the functions $g_L$ and $g_T$. Then, the spin–isospin $NN \to NN$ interaction can be separated into a spin-longitudinal and a spin-transverse part, as follows:

$$G_{NN \to NN}(q) = \left\{V_L(q)\,\hat{q}_i\hat{q}_j + V_T(q)(\delta_{ij} - \hat{q}_i\hat{q}_j)\right\}\sigma_i\sigma_j\,\vec{\tau}\cdot\vec{\tau} \qquad (\hat{q}_i = q_i/|\vec{q}|),$$
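where:

$$V_L(q) = \frac{f_\pi^2}{m_\pi^2}\left\{\vec{q}^{\,2} F_\pi^2(q)\,G_\pi^0(q) + g_L(q)\right\},$$

$$V_T(q) = \frac{f_\pi^2}{m_\pi^2}\left\{\vec{q}^{\,2}\,C_\rho F_\rho^2(q)\,G_\rho^0(q) + g_T(q)\right\}.$$

In the above, $F_\pi$ and $F_\rho$ are the $\pi NN$ and $\rho NN$ form factors, respectively, while $G_\pi^0$ and $G_\rho^0$ are the corresponding free meson propagators: $G_m^0 = 1/(q_0^2 - \vec{q}^{\,2} - m_m^2)$.

The $\Lambda N \to NN$ transition potential, modified by the effect of the strong $\Lambda N$ correlations, splits into a P-wave (again spin-longitudinal and spin-transverse) part:

$$G_{\Lambda N \to NN}(q) = \left\{\tilde{P}_L(q)\,\hat{q}_i\hat{q}_j + \tilde{P}_T(q)(\delta_{ij} - \hat{q}_i\hat{q}_j)\right\}\sigma_i\sigma_j\,\vec{\tau}\cdot\vec{\tau}$$

with

$$\tilde{P}_L(q) = \frac{f_\pi}{m_\pi}\frac{P}{m_\pi}\left\{\vec{q}^{\,2} F_\pi^2(q)\,G_\pi^0(q) + g_L^\Lambda(q)\right\}, \tag{A.3}$$

$$\tilde{P}_T(q) = \frac{f_\pi}{m_\pi}\frac{P}{m_\pi}\,g_T^\Lambda(q) \tag{A.4}$$

and an S-wave part:

$$\tilde{S}(q) = \frac{f_\pi}{m_\pi}\,S\left\{F_\pi^2(q)\,G_\pi^0(q) - \tilde{F}_\pi^2(q)\,\tilde{G}_\pi^0(q)\right\}|\vec{q}|. \tag{A.5}$$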
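Form factors and propagators with a tilde are calculated by replacing $\vec{q}^{\,2} \to \vec{q}^{\,2} + q_c^2$, while $C_\rho$ is given by

$$C_\rho = \frac{f_\rho^2}{m_\rho^2}\left[\frac{f_\pi^2}{m_\pi^2}\right]^{-1}. \tag{A.6}$$

The expressions for the correlation functions are the following ones:

$$g_L(q) = -\left(\vec{q}^{\,2} + \tfrac{1}{3}q_c^2\right)\tilde{F}_\pi^2(q)\,\tilde{G}_\pi^0(q) - \tfrac{2}{3}q_c^2\,C_\rho\,\tilde{F}_\rho^2(q)\,\tilde{G}_\rho^0(q),$$

$$g_T(q) = -\tfrac{1}{3}q_c^2\,\tilde{F}_\pi^2(q)\,\tilde{G}_\pi^0(q) - \left(\vec{q}^{\,2} + \tfrac{2}{3}q_c^2\right)C_\rho\,\tilde{F}_\rho^2(q)\,\tilde{G}_\rho^0(q),$$

$$g_L^\Lambda(q) = -\left(\vec{q}^{\,2} + \tfrac{1}{3}q_c^2\right)\tilde{F}_\pi^2(q)\,\tilde{G}_\pi^0(q),$$

$$g_T^\Lambda(q) = -\tfrac{1}{3}q_c^2\,\tilde{F}_\pi^2(q)\,\tilde{G}_\pi^0(q).$$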
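As a numerical illustration of the correlation functions just listed, the following minimal Python sketch evaluates them at zero energy and momentum. It rests on assumptions that are not stated in this appendix: a monopole shape for the form factors, $F_m(q) = (\Lambda_m^2 - m_m^2)/(\Lambda_m^2 - q_0^2 + \vec{q}^{\,2})$, and rounded meson masses $m_\pi = 140$ MeV and $m_\rho = 770$ MeV. The values of $q_c$, $\Lambda_\pi$, $\Lambda_\rho$ and $C_\rho$ are those of the parameter set quoted in the text that follows; all function and variable names are illustrative only.

```python
# Units: GeV. A sketch under stated assumptions, not the authors' code.
M_PI = 0.140        # pion mass (assumed rounded value)
M_RHO = 0.770       # rho mass (assumed rounded value)
Q_C = 0.780         # correlation momentum q_c (from the text)
LAMBDA_PI = 1.2     # pi NN cutoff (from the text)
LAMBDA_RHO = 2.5    # rho NN cutoff (from the text)
C_RHO = 2.0         # relative rho coupling C_rho (from the text)


def form_factor(q0, q, mass, cutoff, shift=0.0):
    """Monopole form factor (assumed shape). shift = q_c^2 gives the tilde version."""
    return (cutoff**2 - mass**2) / (cutoff**2 - q0**2 + q**2 + shift)


def propagator(q0, q, mass, shift=0.0):
    """Free meson propagator 1/(q0^2 - q^2 - m^2), with q^2 -> q^2 + shift."""
    return 1.0 / (q0**2 - q**2 - shift - mass**2)


def tilde_term(q0, q, mass, cutoff):
    """Product F~^2(q) G~^0(q), i.e. with the replacement q^2 -> q^2 + q_c^2."""
    qc2 = Q_C**2
    return form_factor(q0, q, mass, cutoff, qc2) ** 2 * propagator(q0, q, mass, qc2)


def correlation_functions(q0, q):
    """Return (g_L, g_T, g_L^Lambda, g_T^Lambda) at energy q0 and momentum |q|."""
    qc2 = Q_C**2
    pion = tilde_term(q0, q, M_PI, LAMBDA_PI)
    rho = C_RHO * tilde_term(q0, q, M_RHO, LAMBDA_RHO)
    g_l = -(q**2 + qc2 / 3) * pion - (2 * qc2 / 3) * rho
    g_t = -(qc2 / 3) * pion - (q**2 + 2 * qc2 / 3) * rho
    g_l_lam = -(q**2 + qc2 / 3) * pion
    g_t_lam = -(qc2 / 3) * pion
    return g_l, g_t, g_l_lam, g_t_lam


if __name__ == "__main__":
    g_l, g_t, g_l_lam, g_t_lam = correlation_functions(0.0, 0.0)
    print(f"g_L(0) = {g_l:.3f}, g_T(0) = {g_t:.3f}")
    print(f"g_L^Lambda(0) = {g_l_lam:.3f}, g_T^Lambda(0) = {g_t_lam:.3f}")
```

Under these assumptions the longitudinal and transverse functions coincide at $q = 0$, and the resulting values lie close to the zero-momentum limits quoted in the text. A different form-factor shape or different meson masses would shift the numbers slightly.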
The functions $g_L$ and $g_T$ [$g_L^\Lambda$ and $g_T^\Lambda$] have been obtained from Eqs. (A.1) and (A.2) with $V = V_\pi + V_\rho$ [$V = V_\pi$]. Using the set of parameters:

$$q_c = 780\ \mathrm{MeV}, \qquad \Lambda_\pi = 1.2\ \mathrm{GeV}, \qquad \Lambda_\rho = 2.5\ \mathrm{GeV}, \qquad f_\pi^2/4\pi = 0.08, \qquad C_\rho = 2,$$

at zero energy and momentum we have:

$$g_L(0) = g_T(0) = 0.615, \qquad g_L^\Lambda(0) = g_T^\Lambda(0) = 0.155,$$

which can be identified with the customary Landau–Migdal parameters. However, if one wishes to keep the zero energy and momentum limit of $g_{L,T}$ and $g_{L,T}^\Lambda$ as free parameters, a replacement of the previous functions by

$$g_{L,T}(q) \to g'\,\frac{g_{L,T}(q)}{g_{L,T}(0)}, \qquad g_{L,T}^\Lambda(q) \to g'_\Lambda\,\frac{g_{L,T}^\Lambda(q)}{g_{L,T}^\Lambda(0)}$$
is required. Now we come to the implementation of the above spin–isospin effective potentials in the nuclear matter self-energy of Eq. (9). From the graphs of Fig. 8, by applying the Feynman rules we obtain

$$\begin{aligned}
\Sigma_\Lambda(k) = {} & 3i\,(G m_\pi^2)^2 \int \frac{d^4 q}{(2\pi)^4}\, G_N(k-q) \left(S^2 + \frac{P^2}{m_\pi^2}\,\vec{q}^{\,2}\right) F_\pi^2(q) \Big\{ G_\pi^0(q) \\
& + G_\pi^{0\,2}(q)\,\frac{f_\pi^2}{m_\pi^2}\,F_\pi^2(q)\,U(q)\,q_i q_j \big[\delta_{ij} + \{V_L(q)\,\hat{q}_i\hat{q}_j + V_T(q)\,(\delta_{ij} - \hat{q}_i\hat{q}_j)\}\,U(q) \\
& + \{V_L(q)\,\hat{q}_j\hat{q}_k + V_T(q)\,(\delta_{jk} - \hat{q}_j\hat{q}_k)\}\{V_L(q)\,\hat{q}_k\hat{q}_i + V_T(q)\,(\delta_{ki} - \hat{q}_k\hat{q}_i)\}\,U^2(q) + \cdots \big] \Big\},
\end{aligned} \tag{A.7}$$

where only the $NN$ short range correlations are taken into account. The function $U = U^{ph} + U^{\Delta h} + U^{2p2h}$ contains the p–h, $\Delta$–h and irreducible 2p–2h proper polarization propagators. Now we must include in the previous equation the repulsive correlations in the lines connecting weak and strong vertices. For the P-wave interaction this corresponds to performing the replacement:

$$\frac{f_\pi}{m_\pi}\frac{P}{m_\pi}\,\vec{q}^{\,2} F_\pi^2(q)\,G_\pi^0(q)\,\hat{q}_i\hat{q}_j \to \tilde{P}_L(q)\,\hat{q}_i\hat{q}_j + \tilde{P}_T(q)\,(\delta_{ij} - \hat{q}_i\hat{q}_j),$$

while the interaction which connects the S-wave weak vertex and the P-wave strong vertex becomes:

$$\frac{f_\pi}{m_\pi}\,S\,F_\pi^2(q)\,G_\pi^0(q)\,|\vec{q}|\,\hat{q}_i \to \tilde{S}(q)\,\hat{q}_i.$$

The functions $\tilde{P}_L$, $\tilde{P}_T$ and $\tilde{S}$ are given by Eqs. (A.3)–(A.5). Moreover, the polarization propagator $U$ in the modified Eq. (A.7) has to be understood as $U_L$ when multiplied by a spin-longitudinal potential ($V_L$, $\tilde{P}_L$, $\tilde{S}$), while it is $U_T$ when multiplied by $V_T$ or $\tilde{P}_T$. By introducing these prescriptions in Eq. (A.7) and summing the two geometric series (there is no interference between longitudinal and transverse modes) one obtains:

$$\Sigma_\Lambda(k) = 3i\,(G m_\pi^2)^2 \int \frac{d^4 q}{(2\pi)^4}\, G_N(k-q)\,\alpha(q)$$

with $\alpha(q)$ given by Eq. (13). Then, integrating over $q_0$, for the decay width one finally obtains Eq. (12).

References

[1] M. Danysz, J. Pniewski, Philos. Mag. 44 (1953) 348.
[2] J. Haidenbauer, K. Holinde, K. Kilian, T. Sefzick, Phys. Rev. C 52 (1995) 3496.
[3] T. Kishimoto, Nucl. Phys. A 629 (1998) 369c; T. Kishimoto, et al., Nucl. Phys. A 663 (2000) 509.
[4] A. Parreño, A. Ramos, N.G. Kelkar, C. Bennhold, Phys. Rev. C 59 (1999) 2122.
[5] B. Holzenkamp, K. Holinde, J. Speth, Nucl. Phys. A 500 (1989) 485; A. Reuber, K. Holinde, J. Speth, Nucl. Phys. A 570 (1994) 543.
[6] M.M. Nagels, Th.A. Rijken, J.J. de Swart, Phys. Rev. D 20 (1979) 1633; P.M.M. Maessen, Th.A. Rijken, J.J. de Swart, Phys. Rev. C 40 (1989) 2226; Th.A. Rijken, V.G.J. Stoks, Phys. Rev. C 54 (1996) 2851.
[7] Th.A. Rijken, V.G.J. Stoks, Y. Yamamoto, Phys. Rev. C 59 (1999) 21.
[8] V.G.J. Stoks, Th.A. Rijken, Phys. Rev. C 59 (1999) 3009.
[9] Th.A. Rijken, Nucl. Phys. A 691 (2001) 322.
[10] M. Oka, K. Shimizu, K. Yazaki, Nucl. Phys. A 464 (1987) 700; K. Yazaki, in: T. Yamazaki, K. Nakai, K. Nagamine (Eds.), Perspectives of Meson Science, Elsevier Science Publisher, Amsterdam, 1992, p. 795.
[11] U. Straub, Z.Y. Zhang, K. Bräuer, A. Faessler, S.B. Khadkikar, G. Lübeck, Nucl. Phys. A 483 (1988) 686; U. Straub, Z.Y. Zhang, K. Bräuer, A. Faessler, S.B. Khadkikar, G. Lübeck, Nucl. Phys. A 508 (1990) 385c.
[12] Y. Fujiwara, C. Nakamoto, Y. Suzuki, Phys. Rev. C 54 (1996) 2180.
[13] Q.N. Usmani, A.R. Bodmer, Nucl. Phys. A 639 (1998) 147c; Q.N. Usmani, A.R. Bodmer, Phys. Rev. C 60 (1999) 055215.
[14] A.R. Bodmer, Q.N. Usmani, Nucl. Phys. A 477 (1988) 621; M. Shoeb, N. Neelofer, Q.N. Usmani, M.Z. Rahman Khan, Phys. Rev. C 59 (1999) 2807.
[15] Y. Akaishi, et al., Phys. Rev. Lett. 84 (2000) 3539.
[16] T. Motoba, H. Bandō, Prog. Theor. Phys. 76 (1986) 1321.
[17] B.F. Gibson, Nucl. Phys. A 479 (1988) 115c.
[18] B.F. Gibson, I.R. Afnan, J.A. Carlson, D.R. Lehman, Prog. Theor. Phys. Suppl. 117 (1994) 339.
[19] B.F. Gibson, E.V. Hungerford, Phys. Rep. 257 (1995) 349.
[20] E. Hiyama, M. Kamimura, T. Motoba, T. Yamada, Y. Yamamoto, Phys. Rev. C 65 (2002) 011301.
[21] M. Hjorth-Jensen, A. Polls, A. Ramos, H. Müther, Nucl. Phys. A 605 (1996) 458.
[22] D.J. Millener, C.B. Dover, A. Gal, R.H. Dalitz, Phys. Rev. C 31 (1985) 499.
[23] H. Tamura, et al., Nucl. Phys. A 639 (1998) 83c; in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 411; Phys. Rev. Lett. 84 (2000) 5963.
[24] H. Akikawa, et al., Nucl. Phys. A 691 (2001) 134c; H. Tamura, Hirschegg 2001, Structure of Hadrons, 2001, p. 290.
[25] D.J. Millener, Nucl. Phys. A 691 (2001) 93c.
[26] T. Hasegawa, et al., Phys. Rev. Lett. 74 (1995) 224.
[27] H. Bandō, T. Motoba, Prog. Theor. Phys. 76 (1986) 1321.
[28] H. Bandō, T. Motoba, J. Žofka, Int. J. Mod. Phys. A 5 (1990) 4021.
[29] W. Brückner, et al., Phys. Lett. B 79 (1978) 157.
[30] A. Bouyssy, Phys. Lett. B 84 (1979) 41; A. Bouyssy, Phys. Lett. B 91 (1980) 15.
[31] M. May, et al., Phys. Rev. Lett. 47 (1981) 1106; M. May, et al., Phys. Rev. Lett. 51 (1983) 2085.
[32] B.K. Jennings, Phys. Lett. B 246 (1990) 325.
[33] O. Hashimoto, in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 116.
[34] A. Sakaguchi, et al., in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 231.
[35] E. Hiyama, M. Kamimura, T. Motoba, T. Yamada, Y. Yamamoto, Nucl. Phys. A 639 (1998) 173c; Phys. Rev. Lett. 85 (2000) 270.
[36] H. Hotchi, et al., Phys. Rev. C 64 (2001) 044302.
[37] C.B. Dover, A. Gal, D.J. Millener, Phys. Rev. C 38 (1988) 2700.
[38] P.H. Pile, et al., Phys. Rev. Lett. 66 (1991) 2585.
[39] T. Hasegawa, et al., Phys. Rev. C 53 (1996) 1210.
[40] K. Tanida, et al., Phys. Rev. Lett. 86 (2001) 1982.
[41] E. Hiyama, M. Kamimura, K. Miyazaki, T. Motoba, Phys. Rev. C 59 (1999) 2351.
[42] T.A. Armstrong, et al., Phys. Rev. C 47 (1993) 1957.
[43] H. Ohm, et al., Phys. Rev. C 55 (1997) 3062; P. Kulessa, et al., Phys. Lett. B 427 (1998) 403.
[44] H. Nemura, Y. Suzuki, Y. Fujiwara, C. Nakamoto, Prog. Theor. Phys. 101 (1999) 981.
[45] T. Motoba, H. Bandō, T. Fukuda, J. Žofka, Nucl. Phys. A 534 (1991) 597.
[46] U. Straub, J. Nieves, A. Faessler, E. Oset, Nucl. Phys. A 556 (1993) 531.
[47] H. Outa, et al., Nucl. Phys. A 639 (1998) 251c.
[48] C.B. Dover, D.J. Millener, A. Gal, Phys. Rep. 184 (1989) 1.
[49] T. Harada, S. Shinmura, Y. Akaishi, H. Tanaka, Nucl. Phys. A 507 (1990) 715.
[50] Y. Akaishi, T. Yamazaki, Prog. Part. Nucl. Phys. 39 (1997) 565.
[51] R.H. Dalitz, Nucl. Phys. A 354 (1981) 101c.
[52] R. Bertini, et al., Phys. Lett. B 90 (1980) 375; R. Bertini, et al., Phys. Lett. B 136 (1984) 29; R. Bertini, et al., Phys. Lett. B 158 (1985) 19.
[53] S. Bart, et al., Phys. Rev. Lett. 83 (1999) 5238.
[54] R. Brockmann, E. Oset, Phys. Lett. B 118 (1982) 33.
[55] E. Oset, P. Fernández de Córdoba, L.L. Salcedo, R. Brockmann, Phys. Rep. 188 (1990) 79.
[56] T. Yamada, K. Ikeda, Prog. Theor. Phys. Suppl. 177 (1994) 201.
[57] J. Mareš, E. Friedman, A. Gal, B.K. Jennings, Nucl. Phys. A 594 (1995) 311.
[58] R.S. Hayano, et al., Phys. Lett. B 231 (1989) 355.
[59] H. Outa, T. Yamazaki, M. Iwasaki, R.S. Hayano, Prog. Theor. Phys. Suppl. 117 (1994) 177.
[60] T. Nagae, et al., Phys. Rev. Lett. 80 (1998) 1605.
[61] Y. Akaishi, Few-body Syst. Suppl. 1 (1986) 120.
[62] C.J. Batty, E. Friedman, A. Gal, Phys. Lett. B 335 (1994) 273.
[63] T. Yamazaki, R.S. Hayano, O. Morimatsu, K. Yazaki, Phys. Lett. B 207 (1988) 393.
[64] C.B. Dover, A. Gal, Ann. Phys. 146 (1983) 309.
[65] K. Nakazawa, Nucl. Phys. A 639 (1998) 345c.
[66] M. May, Nucl. Phys. A 639 (1998) 363c.
[67] T. Fukuda, Nucl. Phys. A 639 (1998) 355c; Phys. Rev. C 58 (1998) 1306.
[68] P. Khaustov, et al., Phys. Rev. C 61 (2000) 054603.
[69] T. Motoba, Nucl. Phys. A 691 (2001) 213c.
[70] S. Tadokoro, H. Kobayashi, Y. Akaishi, Nucl. Phys. A 585 (1995) 225c.
[71] S. Aoki, et al., Prog. Theor. Phys. 85 (1991) 1287.
[72] A. Ichikawa, et al., Phys. Lett. B 500 (2001) 37.
[73] J.K. Ahn, et al., Phys. Rev. C 62 (2000) 055201.
[74] I. Kumagai-Fuse, Y. Akaishi, Prog. Theor. Phys. 94 (1995) 151.
[75] R.L. Jaffe, Phys. Rev. Lett. 38 (1977) 195.
[76] K. Yamamoto, et al., Nucl. Phys. A 639 (1998) 371c.
[77] R.E. Chrien, Nucl. Phys. A 629 (1998) 388c.
[78] K. Yamamoto, et al., Phys. Lett. B 478 (2000) 401.
[79] A. Alavi-Harati, et al., Phys. Rev. Lett. 84 (2000) 2593.
[80] J. Caro, C. Garcia-Recio, J. Nieves, Nucl. Phys. A 646 (1999) 299.
[81] H. Takahashi, et al., Phys. Rev. Lett. 87 (2001) 212503.
[82] J.K. Ahn, et al., Phys. Rev. Lett. 87 (2001) 132504.
[83] D.E. Lanskoy, Phys. Rev. C 58 (1998) 3351.
[84] I.R. Afnan, in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 149.
[85] K. Itonaga, T. Ueda, T. Motoba, Nucl. Phys. A 691 (2001) 197c.
[86] A. Parreño, A. Ramos, C. Bennhold, Phys. Rev. C 65 (2002) 015205.
[87] R. Bertini, et al., Nucl. Phys. A 360 (1981) 315; R. Bertini, et al., Nucl. Phys. A 368 (1981) 365.
[88] S. Ajimura, et al., Phys. Rev. Lett. 86 (2001) 4255; H. Kohri, et al., Phys. Rev. C 65 (2002) 034607.
[89] R.H. Dalitz, D.H. Davis, T. Motoba, D.N. Tovee, Nucl. Phys. A 625 (1997) 71.
[90] T. Motoba, Nuovo Cimento A 102 (1989) 345.
[91] H. Tamura, et al., Nucl. Phys. A 670 (2000) 249c; K. Tanida, et al., Nucl. Phys. A 684 (2001) 560c; K. Tanida, et al., Nucl. Phys. A 691 (2001) 115c.
[92] R. Grace, et al., Phys. Rev. Lett. 55 (1985) 1055.
[93] J.J. Szymanski, et al., Phys. Rev. C 43 (1991) 849.
[94] T. Yamazaki, et al., Nucl. Phys. 450 (1986) 1c.
[95] A. Sakaguchi, et al., Phys. Rev. C 43 (1991) 73.
[96] V. Lucherini, et al., Nucl. Phys. A 639 (1998) 529c; V. Filippini, et al., Nucl. Phys. A 639 (1998) 537c; T. Bressani, et al., in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 383; A. Feliciello, Nucl. Phys. A 691 (2001) 170c; P. Gianotti, Nucl. Phys. A 691 (2001) 483c.
[97] H. Outa, et al., Nucl. Phys. A 585 (1995) 109c.
[98] O. Hashimoto, Nucl. Phys. A 639 (1998) 93c.
[99] C. Milner, et al., Phys. Rev. Lett. 54 (1985) 1237.
[100] J.C. Peng, Nucl. Phys. A 450 (1986) 129c.
[101] H. Noumi, et al., Phys. Rev. C 52 (1995) 2936.
[102] H.C. Bhang, et al., Phys. Rev. Lett. 81 (1998) 4321; H. Park, et al., Phys. Rev. C 61 (2000) 054004.
[103] N. Nagae, Nucl. Phys. A 691 (2001) 76c.
[104] Y. Sato, et al., Nucl. Phys. A 639 (1998) 279c.
[105] H. Noumi, Nucl. Phys. A 639 (1998) 121c; K. Imai, in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 419.
[106] E.V. Hungerford, in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 356.
[107] L. Tang, in: Il.-T. Cheon, S.W. Hong, T. Motoba (Eds.), Proceedings of the APCTP Workshop SNP99, World Scientific, Singapore, 2000, p. 393.
[108] F.J. Gillman, M.B. Wise, Phys. Rev. D 20 (1979) 2392.
[109] E.A. Paschos, T. Schneider, Y.L. Wu, Nucl. Phys. B 332 (1990) 285.
[110] K. Takayama, M. Oka, e-print archive hep-ph/9809388; hep-ph/9811435.
[111] J. Nieves, E. Oset, Phys. Rev. C 47 (1993) 1478.
[112] T. Motoba, K. Itonaga, Prog. Theor. Phys. Suppl. 117 (1994) 477.
[113] K. Itonaga, T. Motoba, H. Bandō, Z. Phys. A 330 (1988) 209; Nucl. Phys. A 489 (1988) 683.
[114] J. Cohen, Prog. Part. Nucl. Phys. 25 (1990) 139.
[115] I. Kumagai-Fuse, S. Okabe, Y. Akaishi, Phys. Lett. B 345 (1995) 386.
[116] R.H. Dalitz, Phys. Rev. 112 (1958) 605; R.H. Dalitz, L. Liu, Phys. Rev. 116 (1959) 1312.
[117] E. Oset, A. Ramos, Prog. Part. Nucl. Phys. 41 (1998) 191.
[118] J. Golak, K. Miyagawa, H. Kamada, H. Witala, W. Glöckle, A. Parreño, A. Ramos, C. Bennhold, Phys. Rev. C 55 (1997) 2196; Erratum: J. Golak, K. Miyagawa, H. Kamada, H. Witala, W. Glöckle, A. Parreño, A. Ramos, C. Bennhold, Phys. Rev. C 56 (1997) 2892; H. Kamada, J. Golak, K. Miyagawa, H. Witala, W. Glöckle, Phys. Rev. C 57 (1998) 1595.
[119] C. Bennhold, A. Ramos, Phys. Rev. C 45 (1992) 3017.
[120] K. Itonaga, T. Ueda, T. Motoba, Nucl. Phys. A 639 (1998) 329c.
[121] A. Parreño, A. Ramos, Phys. Rev. C 65 (2002) 015204.
[122] K. Itonaga, T. Ueda, T. Motoba, Nucl. Phys. A 577 (1994) 301c; K. Itonaga, T. Ueda, T. Motoba, Nucl. Phys. A 585 (1995) 331c.
[123] A. Parreño, A. Ramos, C. Bennhold, Phys. Rev. C 56 (1997) 339.
[124] T. Inoue, M. Oka, T. Motoba, K. Itonaga, Nucl. Phys. A 633 (1998) 312.
[125] J.F. Dubach, G.B. Feldman, B.R. Holstein, L. de la Torre, Nucl. Phys. A 450 (1986) 71c.
[126] J.F. Dubach, G.B. Feldman, B.R. Holstein, Ann. Phys. 249 (1996) 146.
[127] M. Shmatikov, Nucl. Phys. A 580 (1994) 538.
[128] D. Jido, E. Oset, J.E. Palomar, Nucl. Phys. A 694 (2001) 525.
[129] K. Maltman, M. Shmatikov, Phys. Lett. B 331 (1994) 1; Nucl. Phys. A 585 (1995) 343c.
[130] K. Maltman, M. Shmatikov, Phys. Rev. C 51 (1995) 1576.
[131] A. Parreño, A. Ramos, C. Bennhold, K. Maltman, Phys. Lett. B 435 (1998) 1.
[132] C.-Y. Cheung, D.P. Heddle, L.S. Kisslinger, Phys. Rev. C 27 (1983) 335; D.P. Heddle, L.S. Kisslinger, Phys. Rev. C 33 (1986) 608.
[133] T. Inoue, S. Takeuchi, M. Oka, Nucl. Phys. A 597 (1996) 563.
[134] K. Sasaki, T. Inoue, M. Oka, Nucl. Phys. A 669 (2000) 331; Erratum: K. Sasaki, T. Inoue, M. Oka, Nucl. Phys. A 678 (2000) 455.
[135] J.-H. Jun, H.C. Bhang, Nuovo Cimento A 112 (1999) 649; J.-H. Jun, Phys. Rev. C 63 (2001) 044012.
[136] V.L. Telegdi, Sci. Am. 206 (1962) 50.
[137] B. Povh, Rep. Prog. Phys. 39 (1976) 823.
[138] A. Montwill, et al., Nucl. Phys. A 234 (1974) 413.
[139] V.J. Zeps, Nucl. Phys. A 639 (1998) 261c.
[140] J.P. Bocquet, et al., Phys. Lett. B 182 (1986) 146; J.P. Bocquet, et al., Phys. Lett. B 192 (1987) 312.
[141] B. Kamys, et al., Eur. Phys. J. A 11 (2001) 1.
[142] P. Kulessa, et al., Acta Phys. Polon B 33 (2002) 603–618.
[143] A. Hashimoto, et al., Nuovo Cimento A 102 (1989) 679.
[144] A. Rusek, Nucl. Phys. A 639 (1998) 111c.
[145] W. Cheston, H. Primakov, Phys. Rev. 92 (1953) 1537.
[146] R.H. Dalitz, G. Rajasekaran, Phys. Lett. 1 (1962) 58.
[147] M.M. Block, R.H. Dalitz, Phys. Rev. Lett. 11 (1963) 96.
[148] R.H. Dalitz, Proceedings of the Summer Study Meeting on Nuclear and Hypernuclear Physics with Kaon Beams, BNL Report No. 18335, 1973, p. 41.
[149] C.B. Dover, Few-Body Syst. Suppl. 2 (1987) 77.
[150] R.A. Schumacher, Nucl. Phys. A 547 (1992) 143c.
[151] W.M. Alberico, G. Garbarino, Phys. Lett. B 486 (2000) 362.
[152] J. Cohen, Phys. Rev. C 42 (1990) 2724.
[153] Z. Rudy, et al., Eur. Phys. J. A 5 (1999) 127; W. Cassing, et al., e-print archive nucl-ex/0109012.
[154] J.B. Adams, Phys. Rev. 156 (1967) 1611.
[155] B.H.J. McKellar, B.F. Gibson, Phys. Rev. C 30 (1984) 322.
[156] G. Nardulli, Phys. Rev. C 38 (1988) 832.
[157] A. Parreño, A. Ramos, C. Bennhold, Phys. Rev. C 52 (1995) R1768; Erratum: A. Parreño, A. Ramos, C. Bennhold, Phys. Rev. C 54 (1996) 1500.
[158] K. Takeuchi, H. Takaki, H. Bandō, Prog. Theor. Phys. 73 (1985) 841.
[159] H. Bandō, Prog. Theor. Phys. Suppl. 81 (1985) 181.
[160] A. Ramos, C. Bennhold, Nucl. Phys. A 577 (1994) 287c.
[161] K. Hagino, A. Parreño, Phys. Rev. C 63 (2001) 044318.
[162] H. Bandō, Y. Shono, H. Takaki, Int. J. Mod. Phys. A 3 (1988) 1581.
[163] M. Shmatikov, Phys. Lett. B 322 (1994) 311.
[164] S. Shinmura, Nucl. Phys. A 585 (1995) 357c.
[165] M. Oka, Nucl. Phys. A 647 (1999) 97.
[166] E. Oset, L.L. Salcedo, Nucl. Phys. A 443 (1985) 704.
[167] W.M. Alberico, A. De Pace, M. Ericson, A. Molinari, Phys. Lett. B 256 (1991) 134.
[168] A. Ramos, E. Oset, L.L. Salcedo, Phys. Rev. C 50 (1994) 2314.
[169] W.M. Alberico, A. De Pace, G. Garbarino, A. Ramos, Phys. Rev. C 61 (2000) 044314.
[170] W.M. Alberico, A. De Pace, G. Garbarino, R. Cenni, Nucl. Phys. A 668 (2000) 113.
[171] L. Zhou, J. Piekarewicz, Phys. Rev. C 60 (1999) 024306.
[172] S. Shinmura, Prog. Theor. Phys. 89 (1993) 1115.
[173] A. Parreño, A. Ramos, E. Oset, Phys. Rev. C 51 (1995) 2477.
[174] H. Outa, et al., Nucl. Phys. A 670 (2000) 281c.
[175] E. Oset, L.L. Salcedo, Q.N. Usmani, Nucl. Phys. A 450 (1986) 67c.
[176] H. Noumi, et al., in: H. Ejiri, T. Kishimoto, T. Sato (Eds.), Proceedings of the IV International Symposium on Weak and Electromagnetic Interactions in Nuclei, World Scientific, Singapore, 1995, p. 550.
[177] Y. Sato, et al., Nucl. Phys. A 691 (2001) 189c.
[178] M. Ericson, H. Bandō, Phys. Lett. B 237 (1990) 169.
[179] T. Motoba, Nucl. Phys. A 547 (1992) 115c.
[180] O. Hashimoto, et al., Phys. Rev. Lett. 88 (2002) 042503.
[181] Y. Sato, et al., Ph.D. Thesis, Tohoku University, 1999; Proceedings of the Workshop on Hypernuclear Physics with Electromagnetic Probes, December 2–4, 1999, Hampton, VA, USA.
[182] E. Oset, H. Toki, W. Weise, Phys. Rep. 83 (1982) 281.
[183] E. Oset, P. Fernández de Córdoba, J. Nieves, A. Ramos, L.L. Salcedo, Prog. Theor. Phys. Suppl. 117 (1994) 461.
[184] P. Fernández de Córdoba, E. Oset, Nucl. Phys. A 528 (1991) 736.
[185] A.L. Fetter, J.D. Walecka, Quantum Theory of Many Particle Systems, McGraw-Hill, New York, 1971.
[186] R. Seki, K. Masutani, Phys. Rev. C 27 (1983) 2799.
[187] C. Garcia-Recio, J. Nieves, E. Oset, Nucl. Phys. A 547 (1992) 473.
[188] J.W. Negele, Rev. Mod. Phys. 54 (1982) 813.
[189] W.M. Alberico, R. Cenni, A. Molinari, P. Saracco, Ann. Phys. 174 (1987) 131.
[190] R. Cenni, F. Conte, P. Saracco, Nucl. Phys. A 623 (1997) 391.
[191] R. Machleidt, K. Holinde, Ch. Elster, Phys. Rep. 149 (1987) 1.
[192] R. Cenni, P. Saracco, Nucl. Phys. A 487 (1988) 279; R. Cenni, F. Conte, A. Cornacchia, P. Saracco, Nuovo Cimento 15 (12) (1992) 1.
[193] J. Vidaña, A. Polls, A. Ramos, M. Hjorth-Jensen, Nucl. Phys. A 644 (1998) 201.
[194] G. Garbarino, Ph.D. Thesis, University of Turin, 2000; Nucl. Phys. A 691 (2001) 193c.
[195] A. Gal, in: H. Ejiri, T. Kishimoto, T. Sato (Eds.), Weak and Electromagnetic Interactions in Nuclei, World Scientific, Singapore, 1995, p. 573.
[196] A. Ramos, M.J. Vicente-Vacas, E. Oset, Phys. Rev. C 55 (1997) 735.
[197] H.C. Bhang, et al., Nucl. Phys. A 629 (1998) 412c.
[198] H.C. Bhang, et al., Nucl. Phys. A 691 (2001) 156c; H.C. Bhang, invited talk presented at the VII International Conference On Hypernuclear and Strange Particle Physics, October 23–27, 2000, Torino, Italy.
[199] H. Outa, et al., Proposal of KEK-PS E462 (2000).
[200] R.L. Gill, Nucl. Phys. A 691 (2001) 180c.
[201] T. Nagae, in: S. Bianco et al. (Eds.), Proceedings of the III International Workshop on Physics and detectors for DAPHNE, November 16–19, 1999, Frascati, Italy, p. 701; H. Bhang, in: International Symposium on Hadrons and Nuclei, February 20–22, 2001, Seoul, Korea; T. Nagae, et al., in: Proceedings of the INPC 2001, July 30–August 3, 2001, Berkeley, USA, to appear.
[202] H. Bandō, T. Motoba, M. Sotona, J. Žofka, Phys. Rev. C 39 (1989) 587; T. Kishimoto, H. Ejiri, H. Bandō, Phys. Lett. B 232 (1989) 24.
[203] S. Ajimura, et al., Phys. Lett. B 282 (1992) 293.
[204] S. Ajimura, et al., Phys. Rev. Lett. 84 (2000) 4052.
[205] A. Ramos, E. van Meijgaard, C. Bennhold, B.K. Jennings, Nucl. Phys. A 544 (1992) 703.
[206] E.H. Auerbach, et al., Ann. Phys. 148 (1983) 381.
[207] H. Ejiri, T. Kishimoto, H. Noumi, Phys. Lett. B 225 (1989) 35.
[208] K. Itonaga, T. Motoba, O. Richter, M. Sotona, Phys. Rev. C 49 (1994) 1045.
[209] S. Ajimura, et al., Phys. Rev. Lett. 80 (1998) 3471.
[210] T. Motoba, K. Itonaga, Nucl. Phys. A 577 (1994) 293c.
[211] E. Gromm, et al., Review of Particle Physics, Eur. Phys. J. C 15 (2000) 1.
[212] H. Nabetani, T. Ogaito, T. Sato, T. Kishimoto, Phys. Rev. C 60 (1999) 017001.
[213] K. Sasaki, T. Inoue, M. Oka, Nucl. Phys. A 691 (2001) 201c.
[214] K. Sasaki, private communication.
[215] A. Parreño, private communication.
Physics Reports 369 (2002) 111 – 176 www.elsevier.com/locate/physrep
Submillimeter galaxies

Andrew W. Blain^{a,b,*}, Ian Smail^{c}, R.J. Ivison^{d}, J.-P. Kneib^{e}, David T. Frayer^{f}

a Department of Astronomy, Caltech, Pasadena, CA 91125, USA
b Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
c Department of Physics, University of Durham, South Road, Durham DH1 3LE, UK
d Institute for Astronomy, University of Edinburgh, Edinburgh EH9 3HJ, UK
e Observatoire Midi-Pyrénées, 14 Avenue E. Belin, F-31400 Toulouse, France
f SIRTF Science Center, Caltech, Pasadena, CA 91125, USA

Received 1 January 2002; editor: M.P. Kamionkowski

Abstract

A cosmologically significant population of very luminous high-redshift galaxies has recently been discovered at submillimeter (submm) wavelengths. Advances in submm detector technologies have opened this new window on the distant Universe. Here we discuss the properties of the high-redshift submm galaxies, their significance for our understanding of the process of galaxy formation, and the selection effects that apply to deep submm surveys. The submm galaxies generate a significant fraction of the energy output of all the galaxies in the early Universe. We emphasize the importance of studying a complete sample of submm galaxies, and stress that because they are typically very faint in other wavebands, these follow-up observations are very challenging. Finally, we discuss the surveys that will be made using the next generation of submm-wave instruments under development. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 98.80.Es; 98.62.−g

Keywords: Dust extinction; Observational cosmology; Galaxy evolution; Galaxy formation; Gravitational lensing; Radio continuum
* Corresponding author. Department of Astronomy, Caltech, Pasadena, CA 91125, USA. E-mail address: [email protected] (A.W. Blain).

0370-1573/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0370-1573(02)00134-5

Contents

1. Introduction
2. Submm-wave emission from galaxies
2.1. The power source for dusty galaxies
2.2. Continuum emission from dust
2.2.1. The emission spectrum, dust mass and temperature
2.3. The observed SEDs of dusty galaxies
2.4. Line emission
2.4.1. Line emission contribution to continuum detections
2.5. The observability of high-redshift dusty galaxies
2.6. Submm-wave selection effects
2.7. Deep submm-wave surveys
2.8. Submm observations of known high-redshift galaxies and QSOs
2.9. Alternative strategy for deep submm surveys
2.10. Determining redshifts of submm galaxies
2.10.1. Photometric redshifts from far-IR SEDs
2.10.2. Radio–submm photometric redshifts
3. The observed properties of submm-selected galaxies
3.1. Confusion
3.1.1. Confusion and follow-up observations of submm galaxies
3.2. Multi-waveband follow-up studies
3.2.1. Optical/near-IR
3.2.2. Ultradeep radio images
3.2.3. CO rotation line emission and continuum mm-wave interferometry
3.2.4. X-ray observations
3.2.5. Mid- and far-IR observations
3.3. A gallery of follow-up results
3.4. Clustering properties
4. Submm galaxy luminosity functions and their relationship with other populations
4.1. Optically selected Lyman-break galaxies (LBGs)
4.2. Extremely red objects (EROs)
4.3. Faint radio galaxies
4.4. Active galaxies and X-ray sources
4.5. Gamma-ray burst (GRB) host galaxies
4.6. Prospects for the follow-up observations in the future
5. Modeling the evolution of submm galaxies
5.1. An array of possible treatments
5.2. Observational tests of models
5.3. Modeling the detailed astrophysics of the submm galaxies
5.4. The global evolution of dust-enshrouded galaxies
6. Gravitational lensing in the submm waveband
6.1. Magnification bias
6.2. Conditions for exploiting submm lensing by galaxies
6.3. Prospects for the lensing studies in the future
7. Future developments in submm cosmology
7.1. New technologies for instrumentation
7.2. New telescopes
7.3. Future capabilities and progress
8. Summary: key questions and targets for the future
Acknowledgements
References
1. Introduction Discovering the process by which the dense, gravitationally bound galaxies formed in the Universe from an initially almost uniform gas, and understanding the way their constituent populations of stars were born is a key goal of modern physical cosmology. A wide range of well understood physical processes are involved; including general relativity, gas dynamics and cooling physics, nuclear reactions and radiative transfer. However, the range of possible initial conditions and the non-linear nature of most of the events, starting with the collapse of primordial density perturbations, ensure that these intimately connected processes can generate a very wide range of possible scenarios and outcomes. Galaxy formation can be studied by attempting to reproduce the observed Universe via analytical models and numerical simulations. The information required to constrain these models is provided by both forensic studies of the current constituents of the Universe, including stellar ages, chemical abundances and the sizes and shapes of galaxies, and by direct observations of the galaxy formation process taking place in the young Universe at great distances. Direct observations exploit both the light emitted by distant galaxies, and the signature of absorption due to intervening structures along the line of sight, and began almost 50 years ago using sensitive optical and radio telescopes. Astronomers must now use all available frequencies of radiation to probe the properties of the Universe, from the lowest energy radio waves to the highest-energy -rays. It is vital to combine the complementary information that can be determined about the constituents of the Universe at di7erent wavelengths in order to make progress in our understanding. 
This review discusses the results of a new type of direct observation of the galaxy formation process, made possible by the development of powerful new radiation detectors sensitive to wavelengths in the range 200 μm to about 1 mm: the submillimeter (submm) waveband. The detection of submm radiation from distant galaxies is one of the most recent developments in observational cosmology, and has finally brought this region of the electromagnetic spectrum into use for making cosmological observations not directly connected with the cosmic microwave background (CMB; Partridge and Peebles, 1967). With the possible exception of the hardest X-ray wavebands, studies of distant galaxies in the submm waveband remained elusive for the longest period. We will also discuss some observations at the mid- and far-infrared (IR) wavebands that bound the submm waveband at short wavelengths, usually defined as the wavelength ranges from about 5–40 and 40–200 μm, respectively.

The most significant reason for the late flowering of submm cosmology is the technical challenge of building sensitive receivers that work efficiently at the boundary between radio-type coherent and optical-like incoherent detection techniques. In addition, atmospheric emission and absorption permit sensitive submm observations only from high mountain sites, and only in specific atmospheric windows. The zenith opacity from the best sites in the clearest submm atmospheric window at 850 μm is typically about 0.1. Furthermore, the long wavelength of submm radiation limits spatial resolution unless very large filled or synthetic apertures are available. The largest single apertures available at present are in the 10–30-m class, providing spatial resolution of order 10 arcsec. This resolution is much coarser than the sub-arcsec resolution of optical and near-IR observations. The appearance of the same region of sky at optical and submm wavelengths is compared in Fig.
1 to illustrate this point: the multicolor optical image was obtained using the Hale 5-m telescope at Mt. Palomar, while the 850-μm submm image was obtained using the 15-m James Clerk Maxwell Telescope (JCMT) on Mauna Kea. Interferometers can dramatically enhance the resolution of images, but so far have
[Fig. 1 image labels: 14010+0253; cD galaxy; 14011+0253; 14009+0252; 14010+0252]
Fig. 1. A comparison of deep optical and submm views of the sky. The background image is a 3-color optical image of the rich cluster of galaxies Abell 1835 at the low/moderate redshift z = 0.25 (Smail et al., 1998b) taken using the 5-m Hale telescope, overlaid with the 14-arcsec resolution contours of a SCUBA 850-μm submm-wave image of the same field (Ivison et al., 2000a). North is up and East to the left. The brightest SCUBA galaxies at (−45, −15), (65, 0) and (20, −60), and the central cD galaxy (Edge et al., 1999), all have clear radio detections at a frequency of 1.4 GHz in images with higher spatial resolution than the SCUBA contours, obtained at the Very Large Array (VLA), supporting their reality. The bright SCUBA galaxy at (−45, −15) is associated with SMM J14011+0253, an interacting pair of galaxies at redshift z = 2.56 in the background of the cluster (Frayer et al., 1999). Spectacular fragmented structure appears in the easterly red component of this galaxy in Hubble Space Telescope (HST) images (Fig. 18).
only operated at longer mm wavelengths. The commissioning of the 8-element Sub-Millimeter Array (SMA; Ho, 2000)¹ on Mauna Kea in Hawaii with baselines of up to about 500 m, the first dedicated submm-wave interferometer, will provide images with sub-arcsecond resolution. The much larger 64-element Atacama Large Millimeter Array (ALMA; Wootten, 2001)² will be in service at the end of the decade.

A key development was the commissioning of the Submillimetre Common-User Bolometer Array (SCUBA) camera at the JCMT in 1997 (Holland et al., 1999). SCUBA images the sky in the atmospheric windows at both 450 and 850 μm in a 2.5-arcmin-wide field, using hexagonal close-packed arrays of 91 and 37 bolometer detectors at the respective wavelengths. SCUBA provided a dramatic leap forward from the pre-existing single-pixel or one-dimensional array instruments available. The
¹ http://sma2.harvard.edu. ² http://www.alma.nrao.edu.
combination of field of view and sensitivity was sufficient to enable the first searches for submm-wave emission from previously unknown distant galaxies. The Max-Planck Millimetre Bolometer Array (MAMBO; Kreysa et al., 1998) is a 1.25-mm camera with similar capabilities to SCUBA, which operates during the winter from the Institut de Radio Astronomie Millimétrique (IRAM) 30-m telescope on Pico Veleta in Spain. A similar device, the SEST Imaging Bolometer Array (SIMBA), designed at Onsala in Sweden, is soon to begin operation on the 15-m Swedish–ESO Submillimetre Telescope (SEST) in Chile, providing a sensitive submm imaging capability in the South. The capability of mm and submm-wave observatories is not standing still: a number of larger, more sensitive mm- and submm-wave cameras are under construction, including the SHARC-II (Dowell et al., 2001), BOLOCAM (Glenn et al., 1998) and SCUBA-II instruments.³ Bolometer technology continues to advance. The advent of extremely stable superconducting bolometers that require no bias current, and can be read out using multiplexed cold electronics, should ultimately allow the construction of very large submm detector arrays of order $10^{4}$–$10^{5}$ elements (for example Benford et al., 1999). SCUBA-II is likely to be the first instrument to exploit this technology, providing an 8 × 8-arcmin² field of view at the resolution limit of the JCMT.

The first extragalactic submm/mm surveys using SCUBA and MAMBO revealed a population of very luminous high-redshift galaxies, which, as a population, were responsible for the release of a significant fraction of the energy generated by all galaxies over the history of the Universe (Blain et al., 1999b).
Almost 200 of these galaxies are now known (Smail et al., 1997; Barger et al., 1998, 1999a; Hughes et al., 1998; Eales et al., 1999, 2000; Lilly et al., 1999; Bertoldi et al., 2000; Borys et al., 2002; Chapman et al., 2002a; Cowie et al., 2002; Dannerbauer et al., 2002; Fox et al., 2002; Scott et al., 2002; Smail et al., 2002; Webb et al., 2002a). There is strong evidence that almost all of these galaxies are at redshifts greater than unity, and that the median redshift of the population is likely to be of order 2–3 (Smail et al., 2000, 2002). However, only a handful of these objects have certain redshifts and well-determined properties at other wavelengths (Frayer et al., 1998, 1999; Ivison et al., 1998a, 2001; Kneib et al., 2002). The results of these mm/submm surveys provide complementary information to deep surveys for galaxies made in the radio (Richards, 2000), far-IR (Puget et al., 1999), mid-IR (Elbaz et al., 1999) and optical (Steidel et al., 1999) wavebands. Submm observations are a vital component of the search for a coherent picture of the formation and evolution of galaxies, which draws on data from all wavebands where the distant Universe can be observed.

In this review, we describe the key features of the submm emission processes in galaxies. We summarize the current, developing state of submm-wave observations of distant galaxies, including the results of both blank-field surveys and targeted observations of known high-redshift galaxies, including radio galaxies, optically selected quasars/QSOs, X-ray detected active galactic nuclei (AGNs) and optically selected Lyman-break galaxies (LBGs). Submm-wave surveys are not immune to selection effects, and we discuss their strengths and weaknesses. We describe the properties of the class of submm-luminous galaxies, and discuss the key results that are required to make significant progress in understanding them.
We consider the relationship between the submm-selected galaxies and other populations of high-redshift galaxies, and describe models that can account for the
³ Details can be found in Table 3. The next-generation SCUBA-II camera for the JCMT is under development at the United Kingdom Astronomy Technology Centre (UKATC). See http://www.jach.hawaii.edu/ JACpublic/JCMT/Continuum observing/SCUBA-2/home.html.
properties of submm-selected galaxies. We introduce the unusually significant effects of the magnification of distant submm-selected galaxies due to gravitational lensing (Schneider et al., 1992). Finally, we recap the key developments that are keenly awaited in the field, and describe some of the exciting science that will be possible in the next decade using future instruments.

The cosmological parameter values assumed are generally listed where they appear. We usually adopt a flat world model with a Hubble constant $H_0 = 65$ km s$^{-1}$ Mpc$^{-1}$, a density parameter in matter $\Omega_m = 0.3$ and a cosmological constant $\Omega_\Lambda = 0.7$.

2. Submm-wave emission from galaxies

There are two major sources of submm radiation from galaxies: thermal continuum emission from dust grains, the solid phase of the interstellar medium (ISM), and line emission from atomic and molecular transitions in the interstellar gas. The ladder of carbon monoxide (CO) rotational transitions, spaced every 115 GHz, is the most important source of molecular line emission, but there is a rich zoo of other emitting molecules in the denser phases of the ISM. Submm surveys for distant galaxies have so far been made using cameras that detect only continuum dust emission, and so this will be the main focus of the review. However, the search for line emission is already important, and its study will become increasingly significant. The spectral resolution provided by line observations reveals much more about the physical and chemical conditions in the ISM, for studies of kinematics, metallicity and excitation conditions. Molecular lines can also be used to obtain a very accurate spectroscopic redshift for the ISM in high-redshift galaxies with prior optical redshifts (for example, Frayer et al., 1998). Searches for redshifts at cm and (sub)mm wavelengths using CO lines will be possible using future telescopes.
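To make the CO ladder concrete, the following minimal Python sketch estimates where each rotational line would be observed for a galaxy at a given redshift. It assumes an evenly spaced ladder with a CO J = 1–0 rest frequency of 115.271 GHz, which is only approximate for high-J lines; the function name and the example redshift (z = 2.56, the value quoted for SMM J14011+0253 in the Fig. 1 caption) are chosen purely for illustration.

```python
# Approximate observed frequencies of the CO rotational ladder.
# Assumption: lines are evenly spaced at multiples of the J = 1-0 rest
# frequency; real high-J lines fall slightly below this simple estimate.

CO_1_0_GHZ = 115.271  # rest frequency of CO J = 1-0 in GHz


def co_observed_ghz(j_upper, z):
    """Observed frequency (GHz) of the CO J -> J-1 line at redshift z."""
    if j_upper < 1:
        raise ValueError("j_upper must be at least 1")
    return j_upper * CO_1_0_GHZ / (1.0 + z)


if __name__ == "__main__":
    z = 2.56  # illustrative redshift taken from the Fig. 1 caption
    for j in range(1, 8):
        print(f"CO({j}-{j - 1}): {co_observed_ghz(j, z):7.2f} GHz")
```

Under these assumptions the CO(3–2) line of a z = 2.56 galaxy is redshifted from the submm band into the 3-mm atmospheric window, which is why mm-wave interferometers can confirm redshifts for such objects.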
The best-studied regions of the Universe in the submm waveband are giant molecular clouds (GMCs) in the Milky Way, in which star formation is ongoing (Hollenbach and Tielens, 1997). GMCs are perhaps very low-luminosity archetypes for distant dusty galaxies, although these galaxies have far-IR luminosities that are up to 4 orders of magnitude greater than that of the whole Milky Way. Detailed, resolved submm-wave images and spectra only exist for low-redshift galaxies (for example Regan et al., 2001; Sakamoto et al., 1999), and it is often necessary to use them as templates to interpret the properties of more distant galaxies.

A very important class of well-studied galaxies similar in luminosity, and perhaps in physical properties, to high-redshift submm galaxies are the ultraluminous IR galaxies (ULIRGs) discovered in the InfraRed Astronomy Satellite (IRAS) all-sky survey in the mid-1980s (see the review by Sanders and Mirabel, 1996). ULIRGs are usually defined as having a bolometric luminosity, integrated over all wavelengths at which dust emission dominates the SED (from about 1 mm to 8 μm), in excess of $10^{12}\,L_\odot$.⁴ They are amongst the most luminous of all galaxies, but account for less than 0.1% of galaxies in the local Universe. Due to their selection by IRAS, they are typically at relatively low redshifts, less than about 0.3. The first IRAS-detected high-redshift ULIRG was identified by Rowan-Robinson et al. (1991) at z = 2.3. The current record redshift for a galaxy detected by IRAS is z = 3.9 for APM 08279+5255 (Irwin et al., 1998). Both these galaxies appear to be extremely luminous; however, their luminosities are boosted by at least
⁴ 1 $L_\odot = 3.84 \times 10^{26}$ W.
Fig. 2. Various observed restframe spectral energy distributions (SEDs) of galaxies from the radio to the near-IR wavebands. Two examples of the most luminous low-redshift galaxies detected by IRAS are included (I). Five very luminous high-redshift galaxies that have been, or could have been, detected directly in deep submm surveys (S), three high-redshift galaxies serendipitously magnified and made easier to study by the gravitational lensing effect of foreground galaxies and also detected by IRAS (L), and five high-redshift AGNs detected in optical or radio surveys (H) are also shown. In addition, three template SEDs are shown. One includes the properties of CO and atomic fine-structure emission lines in the (sub)mm waveband at wavelengths from 100 to 3000 μm (Blain et al., 2000), one includes polycyclic aromatic hydrocarbon (PAH) molecular emission features at wavelengths ∼ 10 μm in the mid-IR waveband (Guiderdoni et al., 1998), and one is normalized to the typical SED of a sample of low-redshift IRAS galaxies (Dunne et al., 2000). For further information on far-IR SEDs see Dale et al. (2001). With the exception of the high-redshift AGNs and the lensed galaxies, the templates tend to provide a reasonable description of the SED at wavelengths around and longer than its peak, the regime probed by submm surveys. Less luminous galaxies like the Milky Way have dust spectra that peak at a wavelength about a factor of 2 longer than these templates (Reach et al., 1995).
a factor of 10 due to gravitational lensing by foreground galaxies. A compilation of the properties of some of the most extreme ULIRGs is given by Rowan-Robinson (2000). The IR spectral energy distributions (SEDs) of some low-redshift ULIRGs and a compilation of results for the more sparsely sampled SEDs of high-redshift dusty galaxies are illustrated in Fig. 2.

2.1. The power source for dusty galaxies

About 99% of the energy released by galaxies in the submm and far-IR wavebands is produced by thermal emission from dust grains; the remainder comes from fine-structure atomic and molecular rotational line emission. However, the source of the energy to power this emission by heating dust is
often unclear. Any intense source of optical/ultraviolet (UV) radiation, either young high-mass stars or an accretion disk surrounding an AGN, would heat dust grains. Because dust emits a featureless modified blackbody spectrum, submm continuum observations can reveal little information about the physical conditions within the source. Regions of intense dust emission are very optically thick, and so little information can be obtained by observing optical or UV radiation.

In typical spiral galaxies, with relatively low far-IR luminosities of several $10^{10}\,L_\odot$ (for example, Alton et al., 2000, 2001), the dust emission is known to be significantly extended, on the same scale as the 10-kpc stellar disk.⁵ The emission is certainly associated with molecular-gas-rich star-forming regions distributed throughout the galaxy (Regan et al., 2001), in which dust is heated by the hot, young OB stars. In intermediate-luminosity galaxies, such as the interacting pair of spiral galaxies NGC 4038/4039 'the Antennae' (Mirabel et al., 1998; Wilson et al., 2000), the most intense knots of star-formation activity, from which most of the luminosity of the system emerges, are not coincident with either nucleus of the merging galaxies, but occur in a deeply dust-enshrouded overlap region of the ISM of the galaxies. This provides a strong argument that almost all of the energy in this system is being generated by star formation rather than an AGN. In more luminous ULIRGs that are at sufficiently low redshift for their internal structure to be resolved, the great majority of the dust emission arises in a much smaller, sub-kpc region (Downes and Solomon, 1998; Sakamoto et al., 1999) within a merging system of galaxies.
It is plausible that a significant fraction of the energy could be derived from an AGN surrounded by a very great column density of gas and dust that imposes many tens of magnitudes of extinction on the emission from the AGN in the optical and UV wavebands, and which remains optically thick even at near-IR wavelengths. Alternatively, an ongoing centrally condensed burst of star-formation activity, fueled by gas funneled into the center of the potential well of a pair of interacting galaxies by a bar instability (Mihos, 2000), is an equally plausible power source.

If the geometry of absorbing and scattering material is known or assumed, then radiative transfer models can be used to predict the SED of a galaxy, which should differ depending on whether the source of heating is a very small AGN with a very hard UV SED, or a more extended, softer-spectrum nuclear star-forming region (for example, Granato et al., 1996). Note that the results are expected to be very sensitive to the assumed geometry (Witt et al., 1992). In merging galaxies this geometry is highly unlikely to be spherical or cylindrical, and is uncertain for the high-redshift galaxies of interest here. In the case of AGN heating, the SED would be expected to peak at shorter wavelengths and the mid-IR SED would be expected to be flatter as compared with a more extended star-formation power source. Both these features would correspond to a greater fraction of hot dust expected in AGNs (see Fig. 2), and are seen clearly in the SEDs of low-redshift IRAS-detected QSOs (Sanders and Mirabel, 1996).

An alternative route to probing energy sources in these galaxies is provided by near- and mid-IR spectroscopy. At these longer wavelengths, the optical depth to the nucleus is less than in the optical/UV, and so the effects of the more intense, harder UV radiation field expected in the environs of an AGN can be observed directly.
These include the excitation of characteristic highly ionized lines, and the destruction of relatively fragile polycyclic aromatic hydrocarbon (PAH) molecules (Rigopoulou et al., 1999; Laurent et al., 2000; Tran et al., 2001), leading to the suppression of
⁵ 1 pc $= 3.09 \times 10^{16}$ m.
their distinctive emission and absorption features. Mid-IR spectroscopic observations with the successor to IRAS, the Infrared Space Observatory (ISO), in the mid-1990s indicated that most of the energy from low-redshift ULIRGs is likely generated by star-formation activity rather than AGN accretion. However, the fraction of ULIRGs containing AGN appears to increase at the highest luminosities (Sanders, 1999). This could be important at high redshifts, where the typical luminosity of dust-enshrouded galaxies is greater than in the local Universe. In addition, duty-cycle effects may cause an AGN to accrete, and perhaps to be visible, for only a fraction of the duration of a ULIRG phase in the evolution of the galaxy (Kormendy and Sanders, 1998; Sanders et al., 1988; Archibald et al., 2002).

X-ray observations also offer a way to investigate the power source, as all but the densest, most gas-rich galaxies, with particle column densities greater than $10^{24}$ cm$^{-2}$, are transparent to hard (> 2 keV) X-rays. Ultra-high-resolution radio observations provide a route to probing the innermost regions of ULIRGs (Smith et al., 1998; Carilli and Taylor, 2000). By detecting the diffuse emission and multiple point-like radio sources expected from multiple supernova remnants, rather than a single point-like core and accompanying jet structures expected from an AGN, these observations suggest that high-mass star formation contributes at least a significant part of the luminosity of the ULIRGs Arp 220 and Mrk 273.

It is interesting to note that the observed correlation between the inferred mass of the black holes in the centers of galaxies and the stellar velocity dispersion of the surrounding galactic bulges, in which most of the stars in the Universe reside (Fukugita et al., 1999), might inform this discussion (Magorrian et al., 1998; Ferrarese and Merritt, 2000; Gebhardt et al., 2000).
The mass of the bulge appears to exceed that of the black hole by a factor of about 200. When hydrogen is processed in stellar nucleosynthesis, the mass–energy conversion efficiency is about $0.007\epsilon_*$, where $\epsilon_*$ ($\simeq 0.4$) is the fraction of hydrogen burned in high-mass stars. When mass is accreted onto a black hole, the mass–energy conversion efficiency is expected to be about $0.1\epsilon_{\rm BH}$, with $\epsilon_{\rm BH} \sim 1$ defined analogously. If accretion and nucleosynthesis were to generate the same amount of energy during the formation of a galaxy, then the ratio of mass contained in both processed stars and stellar remnants to that of a supermassive black hole is expected to be about $0.1\epsilon_{\rm BH}/0.007\epsilon_*$. For $\epsilon_* = 0.4$ and $\epsilon_{\rm BH} = 1$, this ratio is $0.1/(0.007 \times 0.4) \simeq 36$. As a mass ratio of about 200 is observed, this implies that a greater amount of energy, by a factor of about $200/36 \simeq 6$, is generated by high-mass star-formation activity than by gravitational accretion. If the bulge-to-black-hole mass ratio is in fact greater than 200, then either the factor by which star formation dominates will exceed 6, or the accretion must have been more than 10% efficient; that is, $\epsilon_{\rm BH} > 1$. If low-efficiency accretion dominates the process of the build-up of mass in the central black hole, then less than 1 part in 7 of the luminosity generated during galaxy formation will be attributed to accretion as compared with high-mass star formation. A greater amount of energy generated by star formation as compared with accretion processes appears to be favored by these circumstantial arguments.

2.2. Continuum emission from dust

The dust emission process is thermal, with dust grains emitting a modified blackbody spectrum. Grains of interstellar dust, distributed throughout the ISM of a galaxy, are heated to temperatures between about 20 and 200 K, depending on the spectrum and intensity of the interstellar radiation
field (ISRF), and the size and optical properties of the grains. Higher dust temperatures can be produced close to a powerful source of radiation, with dust subliming at temperatures of order 2000 K. Very small grains can be heated far above their equilibrium temperatures by absorbing hard-UV photons (see Draine and Li, 2001). Lower dust temperatures, always exceeding the CMB temperature, are possible in opaque regions of the ISM that are shielded from intense heating, in the intergalactic medium or in regions with an intrinsically weak ISRF. Unless dust is heated by the ISRF in addition to the CMB, the galaxy will not be detectable. We now consider the properties of the dust emission that are relevant to observations of high-redshift galaxies.

2.2.1. The emission spectrum, dust mass and temperature

The minimum parameters necessary to describe the emission from dust grains are a temperature $T_d$ and a form of the emissivity function $\epsilon_\nu$. In any galaxy there will be a distribution of dust temperatures, reflecting the different nature and environment of each grain. It is useful to use $T_d$ to describe the coolest grains that contribute significantly to the energy output of a galaxy when discussing submm observations. In most cases, spatially and spectrally resolved images of galaxies are not available, and so it is reasonable to assume a volume-averaged description of the emissivity function as a function of frequency $\nu$, $\epsilon_\nu \propto \nu^\beta$. Values of $\beta$ in the range 1–2 are usually assumed. Scattering theory predicts that $\beta \to 2$ at low frequencies, while a value $\beta \simeq 1$ at high frequencies matches the general trend of the interstellar extinction curve that describes the properties of absorption of optical and UV radiation by the ISM (see Calzetti et al., 2000 and Section 2 of the review by Franceschini, 2002).

The simplest form of the emission spectrum/SED, $f_\nu$, is given by assuming that $f_\nu \propto \epsilon_\nu B_\nu$, in which $B_\nu$ is the Planck function ($2kT_d\nu^2/c^2$ in the Rayleigh–Jeans limit, in units of W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$).
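As a hedged illustration of the optically thin form just described, the following Python sketch evaluates $f_\nu \propto \nu^\beta B_\nu$ and compares the Planck function with its Rayleigh–Jeans limit. The function names, the reference frequency `nu_ref` used to keep the numbers well scaled, and the example values of $T_d$ and $\beta$ are assumptions made for illustration only.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J / K
C = 2.99792458e8     # speed of light, m / s


def planck(nu, t_d):
    """Planck function B_nu in W m^-2 Hz^-1 sr^-1."""
    x = H * nu / (K_B * t_d)
    return 2.0 * H * nu**3 / C**2 / math.expm1(x)


def rayleigh_jeans(nu, t_d):
    """Rayleigh-Jeans limit of B_nu: 2 k T_d nu^2 / c^2."""
    return 2.0 * K_B * t_d * nu**2 / C**2


def modified_blackbody(nu, t_d, beta, nu_ref=3.0e11):
    """Optically thin SED shape f_nu ~ nu^beta B_nu (arbitrary normalization).

    nu_ref is a hypothetical scaling frequency, not a physical parameter.
    """
    return (nu / nu_ref) ** beta * planck(nu, t_d)


if __name__ == "__main__":
    t_d, beta = 38.0, 1.5  # illustrative values only
    for wavelength_um in (3000.0, 850.0, 450.0, 100.0):
        nu = C / (wavelength_um * 1.0e-6)
        ratio = planck(nu, t_d) / rayleigh_jeans(nu, t_d)
        print(f"{wavelength_um:7.0f} um: f_nu = "
              f"{modified_blackbody(nu, t_d, beta):.3e}, B_nu/RJ = {ratio:.3f}")
```

The printed ratio shows how far each wavelength lies from the Rayleigh–Jeans regime for the assumed temperature; at long wavelengths it approaches unity and the SED scales as $\nu^{2+\beta}$.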
The $\epsilon_\nu B_\nu$ form assumes that the emitting source is optically thin. For fitting spectra of galaxies found in deep submm surveys, we assume the simple $\epsilon_\nu B_\nu$ function to describe the SED. Dunne et al. (2000) and Dunne and Eales (2001) also use this functional form to fit the observed submm spectra of low-redshift galaxies. At the expense of adding another parameter to describe the SED, there is some physical motivation for an SED that includes an optical depth term,

$$f_\nu \propto [1 - \exp(-\tau_\nu)]\,B_\nu, \tag{1}$$

where $\tau_\nu$ is the frequency-dependent optical depth of the cloud, and is a multiple of $\epsilon_\nu$. This equation tends to the simpler $\epsilon_\nu B_\nu$ function at long wavelengths, and is assumed by, for example, Benford et al. (1999), Omont et al. (2001), Priddey and McMahon (2001) and Isaak et al. (2002), whose submm data for high-redshift AGNs tend to correspond to rest-frame frequencies that are relatively close to the peak of the SED. The extra parameter required to relate $\tau_\nu$ and $\epsilon_\nu$ can be defined as the frequency at which $\tau_\nu = 1$ and the cloud becomes optically thick. If the opacity near a wavelength of 100 μm is large, then the form of the peak of the SED tends to that of a blackbody spectrum. This suppresses the emission near to the SED peak relative to the emission in the Rayleigh–Jeans regime, and so this functional form provides a good fit to a set of submm and far-IR data with a higher value of $T_d$ as compared with the $\epsilon_\nu B_\nu$ function, usually by about 10–20%. However, because most observed SEDs for high-redshift galaxies have fewer than four data points (see Fig. 2), the difference is unlikely to be very significant.

It is reasonable to assume that the mid-IR SED can be smoothly interpolated from a modified blackbody function at low frequencies to a power law $f_\nu \propto \nu^\alpha$ in the mid-IR waveband on the
high-frequency side of the spectral peak, in order to prevent the high-frequency SED from falling exponentially with a Wien spectrum. Hotter components of dust, emitting at shorter wavelengths, and ultimately stellar emission in the near-IR waveband, are certain to be present to reduce the steepness of the SED in the Wien regime. That an exponential Wien spectrum is inappropriate can be seen from the well-defined power-law mid-IR SEDs of Arp 220 and Mrk 231 shown in Fig. 2.

It is not always necessary to relate the SED $f_\nu$ and luminosity $L$ of a galaxy to the mass of dust $M_d$ that it contains; this can of worms can remain closed by normalizing $f_\nu$ in a self-consistent way. However, if a dust mass is required, perhaps in order to estimate the metal content of the ISM, and so provide information about the integrated star-formation activity in the galaxy at earlier times (Hughes et al., 1997; Omont et al., 2001), then it is conventional to define a frequency-dependent mass-absorption coefficient $\kappa_\nu$ (Draine and Lee, 1984; with units of m$^2$ kg$^{-1}$), which is proportional to $\epsilon_\nu$. $\kappa_\nu$ is the 'effective area' for blackbody emission by a certain mass of dust,

$$\frac{L}{4\pi \int f_{\nu'}\,{\rm d}\nu'}\, f_\nu = \kappa_\nu B_\nu M_d. \tag{2}$$
Values of $\kappa_\nu$ at a conventional wavelength of around 1 mm are in the range 0.04–0.15 m$^2$ kg$^{-1}$ (Hughes, 1996). Recent comparisons of optical extinction and submm emission from partially resolved edge-on spiral galaxies have tended to give values of 0.05–0.4 m$^2$ kg$^{-1}$ (see Fig. 4 of Alton et al., 2001). Domingue et al. (1999) derive 0.09 m$^2$ kg$^{-1}$ from similar far-IR, optical and submm data. Dunne et al. (2000) adopt a value of 0.077 m$^2$ kg$^{-1}$. Note that there is at least a factor of 3 uncertainty in these conversion factors.

An alternative dimensionless function $Q_\nu$ (Hildebrand, 1983) is sometimes used, which includes information about the mass/volume and surface area of a typical grain. If grains are assumed to be spherical (a big if), with bulk density $\rho$, radius $a$, and an emissive cross section $\pi a^2$, then $\kappa_\nu = 3Q_\nu/4a\rho$. $Q_\nu B_\nu$ is the effective emissivity function describing the energy flux from unit area of the dust grain surface. However, dust grains are more likely to be irregular in shape, possibly colloidal or in the form of whiskers. In that case, the emissivity per unit mass would be increased, and the dust mass associated with a fixed luminosity would be overestimated. This geometrical uncertainty will inevitably result in uncertainty about the mass of dust. Hence, dust masses quoted in papers must be treated with caution, and may be best used as a comparative measure to distinguish galaxies. In general, we will avoid quoting dust masses, as this is unlikely to provide a reliable physical measure of the properties of galaxies until detailed resolved images are available, which is likely to require observations with the ALMA interferometer. This will be a recurring theme: observations with excellent sensitivity and spatial resolution using a large interferometer will resolve many of the questions raised throughout the paper.

Working from submm data, it is also difficult to assess the dust mass of a galaxy, even subject to the caveats above, without knowing its dust temperature.
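The relations above can be sketched numerically. The Python example below is a minimal illustration, not the authors' procedure: it assumes the optically thin SED $f_\nu = \nu^\beta B_\nu$ and a mass-absorption coefficient that scales as $\kappa_\nu = \kappa_{\rm ref}(\nu/\nu_{\rm ref})^\beta$ with $\nu_{\rm ref}$ corresponding to 1 mm, so that Eq. (2) can be solved for $M_d$ in closed form. All function names and the input values are hypothetical.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J / K
C = 2.99792458e8      # speed of light, m / s
L_SUN = 3.84e26       # solar luminosity in W, as quoted in the text


def kappa_from_grain(q_nu, radius, density):
    """kappa_nu = 3 Q_nu / (4 a rho) for spherical grains (SI units)."""
    return 3.0 * q_nu / (4.0 * radius * density)


def zeta(s, terms=1000):
    """Riemann zeta function by direct summation; adequate for s >= 4."""
    return sum(n ** -s for n in range(1, terms + 1))


def integral_nu_beta_planck(t_d, beta):
    """Analytic integral of nu^beta B_nu over all frequencies.

    Uses int x^(s-1) / (e^x - 1) dx = Gamma(s) zeta(s) with s = 4 + beta.
    """
    s = 4.0 + beta
    return 2.0 * H / C**2 * (K_B * t_d / H) ** s * math.gamma(s) * zeta(s)


def dust_mass(lum, t_d, beta, kappa_ref, nu_ref=C / 1.0e-3):
    """Dust mass in kg from Eq. (2) under the assumptions stated above.

    With f_nu = nu^beta B_nu and kappa_nu = kappa_ref (nu / nu_ref)^beta,
    the ratio f_nu / (kappa_nu B_nu) is the constant nu_ref^beta / kappa_ref.
    """
    return lum * nu_ref**beta / (
        4.0 * math.pi * kappa_ref * integral_nu_beta_planck(t_d, beta)
    )


if __name__ == "__main__":
    # Illustrative inputs: a 1e12 L_sun galaxy, T_d = 38 K, beta = 1.5.
    for kappa in (0.04, 0.077, 0.15):
        m_d = dust_mass(1.0e12 * L_SUN, 38.0, 1.5, kappa)
        print(f"kappa(1 mm) = {kappa:5.3f} m^2/kg -> M_d = {m_d:.2e} kg")
```

Because $M_d$ is inversely proportional to the adopted $\kappa_\nu$, the factor of 3 uncertainty in the conversion factor carries over directly into the inferred dust mass, and the analytic integral makes the $L \propto M_d T_d^{4+\beta}$ scaling explicit.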
In the Rayleigh–Jeans spectral regime, the flux density from a galaxy $S_\nu \propto \nu^{2+\beta} M_d T_d$. If $T_d$ is uncertain to within a factor, then $M_d$ is uncertain to within the same factor. The dust mass is at least easier to estimate from a single long-wavelength observation than the luminosity $L$. As $L \propto M_d T_d^{4+\beta}$, or equivalently $L \propto S_\nu T_d^{3+\beta}$, an uncertainty in $T_d$ corresponds to a proportionally much larger uncertainty in the inferred value of $L$. However, even if the dust mass can be determined reliably at low redshifts, it remains unclear whether the same procedure can be applied to determine the dust mass in more luminous and
more distant systems. In order to determine the dust properties of high-redshift galaxies, data of the same quality as have been obtained for nearby galaxies are required. High-frequency submm/far-IR observations are necessary to provide information about the rest-frame frequency of the peak of the SED for a high-redshift galaxy.

Given the current lack of resolved images of distant galaxies in the submm and far-IR wavebands, it is important neither to over-parameterize the descriptions nor to overinterpret the results of observations of their SEDs. When spatially resolved, high-spectral-resolution images are available, building on existing interferometric images of low-redshift dusty galaxies (Downes and Solomon, 1998; Sakamoto et al., 1999; Wilson et al., 2000), it should be possible to study the radiative transfer from sites of intense star formation and AGN in these geometrically complex opaque galaxies (see Ivison et al., 2000a, 2001). Models of the SEDs of dust-enshrouded AGN at different viewing angles have been developed by Granato et al. (1996), while star-forming regions embedded in a disk geometry have been analyzed by Devriendt et al. (1999). More powerful and efficient radiative-transfer codes are being developed (for example Abel et al., 1999), and it should be practical to develop detailed models of the appearance of galaxies with realistic geometries to account for future, high-resolution multi-band submm images.

At present, we prefer to use a few simple parameters, $\alpha$, $\beta$ and $T_d$, to describe the essential features of the SEDs of dusty galaxies. Although such a model can encapsulate only a small part of the true complexity of the astrophysics in a galaxy, it can account for the existing SED data for a wide variety of dusty galaxies. A simple parametrization is preferable to a more baroque, and necessarily at present unconstrained, combination of geometry, dust mass and temperature.
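One way to realize such a three-parameter description is sketched below in Python. It is an assumption of this sketch, not a statement of the authors' exact prescription, that the modified blackbody is joined to the power law $f_\nu \propto \nu^\alpha$ at the frequency where the logarithmic slope of the modified blackbody equals $\alpha$, which makes both the SED and its first derivative continuous. The function names are hypothetical and the normalization is arbitrary.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J / K


def log_slope(x, beta):
    """d ln f / d ln nu for f ~ x^(3 + beta) / (e^x - 1), x = h nu / k T_d."""
    return 3.0 + beta - x / (-math.expm1(-x))


def transition_x(alpha, beta):
    """Bisection for the x at which the modified-blackbody slope equals alpha.

    Requires alpha < 2 + beta, the Rayleigh-Jeans slope.
    """
    if alpha >= 2.0 + beta:
        raise ValueError("alpha must be smaller than 2 + beta")
    lo, hi = 1.0e-6, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_slope(mid, beta) > alpha:
            lo = mid      # slope still too steep: transition lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sed(nu, t_d, beta, alpha):
    """SED shape: modified blackbody below the transition, power law above."""
    x = H * nu / (K_B * t_d)
    x_t = transition_x(alpha, beta)

    def mbb(y):
        return y ** (3.0 + beta) / math.expm1(y)

    if x <= x_t:
        return mbb(x)
    return mbb(x_t) * (x / x_t) ** alpha


if __name__ == "__main__":
    t_d, beta, alpha = 38.0, 1.5, -1.95  # illustrative values only
    nu_t = transition_x(alpha, beta) * K_B * t_d / H
    print(f"transition frequency: {nu_t:.3e} Hz")
```

With this choice the transition frequency scales linearly with $T_d$, so the three parameters remain easy to interpret: $\beta$ sets the long-wavelength slope, $T_d$ the position of the peak and $\alpha$ the mid-IR slope.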
In the following section we list plausible values of our SED parameters and discuss the associated degeneracies in fitted values.

2.3. The observed SEDs of dusty galaxies

Information about the submm SEDs of galaxies has been gathered from targeted mm and submm observations of samples of low-redshift far-IR-selected galaxies from the IRAS catalog (Andreani and Franceschini, 1996; Dunne et al., 2000; Lisenfeld et al., 2000; Dunne and Eales, 2001), and from far-IR and submm observations of high-redshift galaxies (see Fig. 2). The most extensive local survey (SLUGS; Dunne et al., 2000) consists of 850-μm SCUBA observations of 104 galaxies selected from the low-redshift IRAS Bright Galaxy Sample (BGS; Soifer et al., 1987). After fitting single-temperature $\epsilon_\nu B_\nu$ SEDs to the galaxies, Dunne et al. found that $\beta = 1.3 \pm 0.2$ and $T_d = 38 \pm 3$ K described the sample as a whole, with a natural dispersion in the properties from galaxy to galaxy. This IRAS-selected sample could be biased against less dusty galaxies. Dunne et al. are currently addressing this issue by observing a complementary sample of B-band selected low-redshift galaxies, which should be representative of optically luminous low-redshift galaxies as a whole. Note, however, that when fitting only a few data points, there is a significant correlation between values of $\beta$ and $T_d$ that can account for the data (left panel of Fig. 3). This can lead to ambiguity in the results, further emphasizing the difficulty in associating the dust mass or temperature inferred from a galaxy SED with the real physical properties of the galaxy.

The addition of 450-μm data for 19 of the 104 galaxies in the SLUGS sample (Dunne and Eales, 2001) tends to split the galaxy SEDs into two categories: those that retain a definite 40-K spectrum after including the 450-μm data, and those for which cooler single-temperature SEDs, more similar
A.W. Blain et al. / Physics Reports 369 (2002) 111 – 176
Fig. 3. An illustration of some of the issues involved in describing the SEDs of dusty galaxies. On the left is a probability contour plot that shows the 0.5, $5 \times 10^{-3}$ and $5 \times 10^{-5}$ probability contours for a fit to an SED model defined by the variable parameters $\beta$ and $T_d$ with a fixed value of $\alpha = -1.95$, taking into account four SED datapoints for the galaxy NGC 958 as shown in the right-hand panel (Dunne and Eales, 2001). Note that 1 Jy = $10^{-26}$ W m$^{-2}$ Hz$^{-1}$. There is a very significant degeneracy in the fitted parameters. Adding additional data points with small errors close to the peak of the SED at 200 μm reduces the extent of the probability contours by about 50%, but they remain elongated in the same direction. Note that $\beta > 2$ is not expected physically. On the right the data are compared with fitted single-temperature SEDs. The solid line is the best fit to the data. The dashed lines correspond to SEDs from the ends of the probability 'banana' shown in the left-hand panel. Without the 450-μm point, the thick dashed curve describes the best-fit SED, which is defined by a significantly greater dust temperature. This SED is similar to that of a typical luminous IR galaxy, whereas the best-fitting model with all four data points is much more like the SED of the Milky Way. The shift in the best-fit model on adding 450-μm data is generally less significant than in this case.
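The degeneracy between the emissivity index and the dust temperature can be illustrated with a deliberately reduced problem. The sketch below is not the four-point fit of Fig. 3: it assumes a pure single-temperature greybody, $f_\nu \propto \nu^{3+\beta}/[\exp(h\nu/kT_d) - 1]$, with no mid-IR power law, and uses only the rest-frame 450/850-μm flux-density ratio. The function names are hypothetical. It finds the cooler temperature that, with a larger value of $\beta$, reproduces the same colour as the SLUGS values $\beta = 1.3$ and $T_d = 38$ K.

```python
import math

H_OVER_K = 4.799243e-11  # Planck constant / Boltzmann constant, in K per Hz
C_M_S = 2.99792458e8     # speed of light, m/s


def greybody(nu_hz, beta, t_dust):
    """Unnormalized greybody SED: nu^(3+beta) / (exp(h nu / k T) - 1)."""
    x = H_OVER_K * nu_hz / t_dust
    return nu_hz ** (3.0 + beta) / math.expm1(x)


def colour_450_850(beta, t_dust):
    """Rest-frame flux-density ratio S(450 um) / S(850 um)."""
    nu_450 = C_M_S / 450e-6
    nu_850 = C_M_S / 850e-6
    return greybody(nu_450, beta, t_dust) / greybody(nu_850, beta, t_dust)


def matching_temperature(beta, target_colour, t_lo=5.0, t_hi=200.0):
    """Bisect for the dust temperature that gives target_colour at fixed beta."""
    for _ in range(100):
        t_mid = 0.5 * (t_lo + t_hi)
        # At fixed beta the colour rises monotonically with temperature.
        if colour_450_850(beta, t_mid) < target_colour:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)


reference = colour_450_850(beta=1.3, t_dust=38.0)
t_cool = matching_temperature(beta=2.0, target_colour=reference)
print(f"450/850 colour = {reference:.2f}")
print(f"beta = 2.0 reproduces it with T_d = {t_cool:.1f} K")
```

With only one colour, a single measurement cannot separate a warm SED with small $\beta$ from a much cooler one with large $\beta$. Shorter-wavelength points near the SED peak narrow this range considerably, but, as the caption notes, the contours remain elongated in the same direction.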
to the SEDs of normal spiral galaxies, then provide a better fit. The first group are typically the more luminous galaxies in the sample, while the second includes 3 of the 5 lowest luminosity galaxies from the sample. Dunne and Eales (2001) propose a two-temperature model to account for the changes in light of the new 450-μm data; however, a cooler single-temperature model with a larger value of $\beta$ provides a fit of similar quality. The results for one of the most significantly different fits are shown in the right-hand panel of Fig. 3. With the addition of the 450-μm data, the nature of the SEDs of low-redshift, low-luminosity galaxies becomes more diverse. However, the more luminous galaxies, which are likely to be the most similar to typical high-redshift submm galaxies, are still described reasonably well by the original Dunne et al. (2000) 38-K SED.

An alternative approach is to determine an SED that can describe the observed flux density distribution of galaxies in the far-IR and submm wavebands, which are sensitive to galaxies at low, moderate and high redshifts (Blain et al., 1999b; Trentham et al., 1999; Barnard and Blain, 2002). Using the $\nu^\beta B_\nu$ functional form, values of $\beta \simeq 1.5$ and $T_d \simeq 40$ K are required to provide a good description of the data, rather similar to the values derived for temperatures of individual low-redshift luminous dusty galaxies in Dunne et al. (2000) and Lisenfeld et al. (2000), and for both the small number of high-redshift submm-selected galaxies with known redshifts and mid-IR spectral constraints (Ivison et al., 1998a, 2000a) and typical high-redshift QSOs (for example Benford et al., 1999). These temperatures are significantly less than those determined for the most extreme
high-redshift galaxies (Lewis et al., 1998), and significantly greater than the $T_d = 17$ K inferred from the maps of the Milky Way made using the all-sky survey from the FIRAS instrument on the Cosmic Background Explorer (COBE) satellite in the early 1990s (Reach et al., 1995). Note that there are examples of moderate-redshift infrared-selected galaxies with both hotter and colder typical dust temperatures than 40 K: see Deane and Trentham (2001) and Chapman et al. (2002d), respectively.

At present it seems likely that a 40-K dust temperature is a reasonable assumption for high-redshift submm-selected galaxies. Inevitably, however, there will be a population of hotter high-redshift galaxies (Wilman et al., 2000; Trentham and Blain, 2001). These galaxies would be underrepresented in existing submm surveys, but may make a significant contribution to the 240-μm background radiation intensity (Blain and Phillips, 2002). Further observational information to test the assumption of a 40-K dust temperature is keenly awaited. As we discuss below, in Section 2.6, the assumed dust temperature has a significant effect on the selection function of submm galaxy surveys, and on the properties that are inferred for the galaxies that are found in these surveys.

2.4. Line emission

Emission from molecular rotation and atomic fine-structure transition lines can be used to diagnose physical conditions within molecular clouds and photodissociation regions, and to trace out the velocity structure within. Some lines, such as those from CS, HCN and HCO$^+$, are excited only in high-density gas, while others, including the most abundant polar species CO, trace more typical regions in the ISM. Studies of many emission lines from molecular cloud regions in nearby galaxies are possible using existing mm and submm-wave telescopes (Wilson et al., 2000; Helfer, 2000).
However, for more distant galaxies only CO lines have so far been detected in significant numbers, almost exclusively from galaxies which have been subject to strong gravitational lensing by foreground galaxies (see the summary in Combes et al., 1999). These observations are useful for deriving physical conditions within the sources, especially if multiple lines are detected (as in the case of APM 08279+5255; Downes et al., 1999b). The improved capabilities of the forthcoming mm/submm interferometer arrays (SMA, upgrades to the IRAM Plateau de Bure interferometer (PdBI), and the Combined Array for Research in Millimeter-wave Astronomy (CARMA; footnote 6)), and ultimately the dramatically increased sensitivity of ALMA, will make high-redshift lines much easier to observe over the next decade (Combes et al., 1999; Blain et al., 2000).

One of the most important uses of CO-line observations of distant submm galaxies found in continuum surveys is their ability to confirm an identification absolutely, by tying together an optical and submm redshift at the position of the galaxy. So far this has been achieved for only three submm galaxies (Frayer et al., 1998, 1999; Kneib et al., 2002: see Figs. 14, 18 and 19). In principle, these observations could be made for all continuum-selected galaxies. The difficulty is the narrow fractional bandwidth available for receivers and correlators. Even at the relatively low frequency of 90 GHz, the redshift of the target must be known to better than 0.5% to ensure that a 300 km s$^{-1}$ wide
Footnote 6: http://www.mmarray.org/.
CO line, typical of a massive galaxy, with a width equivalent to 0.1% in redshift, falls entirely within a 1-GHz band. Future cm-, mm-, and submm-wave instruments with wider bandwidths will significantly assist the search for redshifts using molecular lines. Specially designed low-resolution, ultra-wideband dispersive spectrometers covering many tens of GHz simultaneously on single-antenna mm-wave telescopes also promise to provide redshifts for submm galaxies (Glenn, 2001).

A complementary search for redshifted cm-wave OH megamaser emission to pinpoint the redshifts and positions of ultraluminous high-redshift galaxies could be possible using radio telescopes (Townsend et al., 2001). However, there are very stringent requirements on the acceptable level of radio frequency interference from terrestrial and satellite communications. Observations of low-redshift megamasers are described by Darling and Giovanelli (2001). Megamaser emission at high redshifts is discussed by Briggs (1999) in the context of the proposed Square Kilometer Array (SKA) meter/centimeter-wave radio telescope. If it can operate at frequencies of several tens of GHz, then the SKA is also likely to be an efficient detector of low-excitation high-redshift CO lines (Carilli and Blain, 2002).

2.4.1. Line emission contribution to continuum detections

An interesting feature of the CO line emission from low-redshift galaxies is that lines can lie in the passbands of continuum instruments, and could contribute to the continuum flux inferred. For low-redshift galaxies, the 345-GHz CO(3 → 2) line lies within the 850-μm atmospheric window, while the 691-GHz CO(6 → 5) and 230-GHz CO(2 → 1) lines lie in the 450-μm and 1.25-mm windows, respectively. Assuming a reasonable template spectrum (Blain et al., 2000), the equivalent width in frequency of the CO(3 → 2) line is 7.4 GHz.
The passband of the current SCUBA 850-μm (353-GHz) filter is about 120 μm (50 GHz) wide, and so about 15% of the measured continuum flux density of a low-redshift galaxy in the 850-μm channel is likely to be from the CO line. The high-frequency SCUBA passband in the 450-μm atmospheric window is 75 GHz wide, while the equivalent width of the CO(6 → 5) transition is 3.3 GHz. Hence, a smaller 5% contribution to the continuum flux density from the line is expected at 450 μm. The CO(2 → 1) line has an expected equivalent width of 9.2 GHz, while the wide MAMBO passband has half-power points at 210 and 290 GHz. Contamination of the flux densities detected by MAMBO by about 10% may thus be expected.

The largest of these correction factors is comparable to the calibration uncertainty in submm-wave observations, and could be relevant to the detailed interpretation of low-redshift observations. For example, the presence of the CO(3 → 2) line in the 850-μm window would shift the inferred continuum emissivity spectral index in the SLUGS survey from 1.3 to 1.52. At high redshifts, any corrections are likely to be less significant, both because the relatively bright CO(3 → 2) line redshifts out of the 850-μm passband, and because the equivalent width of lines in frequency space decreases as $(1+z)^{-1}$. Although the contribution to measured submm-wave flux densities from line emission could be significant at the level of order 10%, only a small fraction of the bolometric luminosity from galaxies is detected in the submm waveband. More than 99% of the bolometric luminosity still appears in the continuum, predominantly at shorter far-IR wavelengths.
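The percentages quoted above follow from dividing each equivalent width by the width of the corresponding passband. The short sketch below repeats that arithmetic using only the numbers given in the text; it assumes idealized top-hat passbands, and the function and variable names are illustrative.

```python
# (equivalent width in GHz, passband width in GHz), as quoted in the text.
bands = {
    "CO(3-2) in SCUBA 850 um": (7.4, 50.0),
    "CO(6-5) in SCUBA 450 um": (3.3, 75.0),
    "CO(2-1) in MAMBO 1.25 mm": (9.2, 290.0 - 210.0),  # half-power points
}


def line_to_continuum(equivalent_width_ghz, passband_ghz):
    """Line flux relative to the continuum flux collected in a top-hat band."""
    return equivalent_width_ghz / passband_ghz


def observed_equivalent_width(rest_width_ghz, z):
    """Equivalent width in observed frequency, which scales as 1 / (1 + z)."""
    return rest_width_ghz / (1.0 + z)


fractions = {name: line_to_continuum(*widths) for name, widths in bands.items()}
for name, fraction in fractions.items():
    print(f"{name}: {100.0 * fraction:.1f}%")
```

The three ratios come out at roughly 15%, 4% and 12%, in line with the rounded values of 15%, 5% and 10% given in the text.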
Fig. 4. The predicted flux density of a dusty galaxy as a function of redshift in various submm atmospheric windows, and at shorter wavelengths that will be probed by forthcoming space missions. Note the powerful K correction in the mm and submm wavebands at wavelengths longer than about 250 μm, which yields a flux density that is almost independent of redshift. The template spectrum is chosen to reproduce the typical properties of distant submm-selected galaxies (Fig. 2). Subtle effects due to the additional heating of dust by the CMB, and fine details of the radio SED of galaxies are not included; these effects are illustrated in Fig. 8.
2.5. The observability of high-redshift dusty galaxies

The detectable flux density at an observed frequency $\nu$ from a galaxy with bolometric luminosity $L$ at redshift $z$ with an intrinsic SED $f_\nu$ is

$$S_\nu = \frac{1+z}{4\pi D_L^2}\, L\, \frac{f_{\nu(1+z)}}{\int f_{\nu'}\,\mathrm{d}\nu'}, \qquad (3)$$
where $D_L$ is the luminosity distance to redshift $z$ (for example Peebles, 1993). The key feature that makes submm-wave observations of distant galaxies interesting is the ability to sample the SED of a target galaxy at wavelengths for which the SED is a strongly increasing function of frequency (Fig. 2). This ensures that distant galaxies are observed at a rest-frame wavelength closer to the peak of their SED. There is thus a strong, negative K correction, which leads to high-redshift galaxies being relatively easy to detect at submm wavelengths as compared with their low-redshift counterparts. This effect is illustrated in Fig. 4 for the template SED from Blain et al. (1999b) shown in Fig. 2. The strong K-correction effect applies at wavelengths longer than about 250 μm. At these wavelengths the flux density from galaxies at $z > 1$ ceases to decline with the inverse square of distance, but instead remains approximately constant with increasing redshift. A window is thus opened to the detection of all galaxies with similar SEDs at redshifts up to $z \simeq 10$–20. The effect is more pronounced at longer wavelengths: in the mm waveband more distant galaxies are expected to produce greater flux densities than their more proximate counterparts.
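The behaviour described above can be reproduced by evaluating Eq. (3) numerically. The sketch below is a simplified stand-in for the template used in Fig. 4, not that template itself. It assumes a pure single-temperature greybody with $\beta = 1.5$ and $T_d = 40$ K (the values discussed in Section 2.3), with no mid-IR power law, no radio component and no CMB heating. The flat cosmology with $\Omega_m = 0.3$ and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and the luminosity of $10^{13}\,L_\odot$ are illustrative assumptions, as are all function names.

```python
import math

H_OVER_K = 4.799243e-11  # Planck constant / Boltzmann constant, K per Hz
C_M_S = 2.99792458e8     # speed of light, m/s
MPC_M = 3.0856776e22     # one megaparsec in metres
L_SUN_W = 3.828e26       # solar luminosity, W


def luminosity_distance_m(z, h0=70.0, omega_m=0.3, n_steps=2000):
    """Luminosity distance (m) in a flat Lambda-CDM model, by Simpson's rule."""
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    step = z / n_steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * step)
    comoving_mpc = (C_M_S / 1e3 / h0) * total * step / 3.0
    return (1.0 + z) * comoving_mpc * MPC_M


def flux_density_mjy(wavelength_um, z, lum_lsun, beta=1.5, t_dust=40.0):
    """Eq. (3) for a greybody f(x) = x^(3+beta) / (exp(x) - 1), x = h nu / k T."""
    nu_rest = (1.0 + z) * C_M_S / (wavelength_um * 1e-6)
    x = H_OVER_K * nu_rest / t_dust
    f_rest = x ** (3.0 + beta) / math.expm1(x)
    # Integral of f over frequency: (k T / h) * Gamma(4+beta) * zeta(4+beta).
    s = 4.0 + beta
    zeta = sum(n ** -s for n in range(1, 2000))
    sed_integral_hz = (t_dust / H_OVER_K) * math.gamma(s) * zeta
    d_l = luminosity_distance_m(z)
    s_nu = ((1.0 + z) * lum_lsun * L_SUN_W * f_rest
            / (4.0 * math.pi * d_l ** 2 * sed_integral_hz))
    return s_nu / 1e-29  # 1 mJy = 1e-29 W m^-2 Hz^-1


for z in (0.5, 1.0, 2.0, 5.0, 10.0):
    row = ", ".join(f"{lam:.0f} um: {flux_density_mjy(lam, z, 1e13):.3g} mJy"
                    for lam in (175.0, 850.0, 1250.0))
    print(f"z = {z:4.1f}  {row}")
```

In this simplified model the 850-μm flux density changes little between $z = 1$ and $z = 5$, the 1.25-mm flux density rises, and the 175-μm flux density falls steeply. Because the greybody has no mid-IR power law, the short-wavelength decline is steeper than it would be for a realistic template.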
Note that both the radio and optical flux-density–redshift relations decline steeply with increasing redshift, and so high-redshift galaxies are not selected preferentially in those wavebands. The advantage that faint radio and optical galaxy surveys have over submm surveys comes from the complementary probe of astrophysical signatures, and the combination of greater fields of view and finer angular resolution.

A submm telescope that is sufficiently sensitive to detect a certain class of galaxy at redshift $z \simeq 0.5$ can detect any similar galaxies out to a redshift $z \sim 10$ (Blain and Longair, 1993a). Note, however, that surveys to exploit this unusual K correction are not immune to selection effects. The K correction can also only be exploited at redshifts for which sufficient heavy elements are present in the ISM of the target galaxy to form enough dust to reprocess optical radiation. Nor does the K-correction effect overcome cosmological surface brightness dimming for progressively more distant submm galaxies: the normal $(1+z)^{-4}$ reduction in surface brightness still applies; however, it is not expected to become significant until redshifts in excess of about 5. Because submm-wave telescopes do not yet resolve distant galaxies, this effect cannot be observed at present. It may provide an opportunity to estimate redshifts for the most distant submm-selected galaxies when they can be resolved using ALMA.

2.6. Submm-wave selection effects

Deep submm-wave observations image the high-redshift Universe with very little contamination from low-redshift galaxies, and can potentially find a population of galaxies that is quite different to those detected in conventional deep optical surveys, and which could be undetectable in these surveys. The complementarity of submm and optical observations is illustrated by the very limited overlap between galaxies detected in the deep submm-optical image shown in Fig. 1. However, submm surveys are certainly subject to selection effects. In Fig.
5 the flux-density–redshift relation for a submm-luminous galaxy with a fixed bolometric luminosity is presented as a function of its SED parameters ($T_d$, $\beta$ and $\alpha$). The relatively minor effects of different assumed cosmological models are also shown. Changing the dust temperature has the greatest effect. The inferred luminosity of a dusty galaxy for a fixed observed submm flux density goes up by a factor of about 10 if the dust temperature is doubled, at all but the very highest redshifts. There is thus a significant potential bias in submm surveys against the detection of galaxies with hotter dust temperatures for a given bolometric luminosity. This effect was noted by Eales et al. (1999), when investigating the evolution of galaxies in the context of the results of deep SCUBA surveys. They suggested that the submm galaxies may be cooler than the temperatures of about 60 K usually assumed, and so their significance as a population of strongly evolving high-redshift galaxies may have been overestimated.

As discussed in Section 2.3, a cooler dust temperature of 40 K is compatible with observations of the SEDs of individual submm galaxies with confirmed redshifts detected in submm surveys (Ivison et al., 1998a, 2000a) and with the results of targeted observations of luminous low-redshift IRAS galaxies and high-redshift QSOs. If this temperature is assumed, then the inferences about galaxy evolution made from the results of submm surveys (Blain et al., 1999b, c; Eales et al., 2000; Smail et al., 2002) should be reliable. However, until a large sample of submm galaxies with redshifts and multi-waveband SEDs is available, the possibility that a cold or hot population of high-redshift dusty galaxies could be missing from or misidentified in submm surveys cannot be ruled out (Eales et al.,
Fig. 5. Flux-density–redshift relations illustrating some of the uncertainties that apply to the interpretation of submm and far-IR surveys. In the top-left panel the relatively small effects of changing the world model parameters are shown. Note that changes to the volume element and flux density received counteract each other, and so the chosen cosmology has a small effect on the interpretation of surveys. The most significant effect, that of changing the dust temperature, is shown in the top-right panel. At 175 μm for galaxies at moderate redshifts, the effect of temperature is rather small. However, at 850 μm, the effects are very significant, and must be remembered when interpreting the results of 850-μm observations: doubling the dust temperature corresponds to increasing the luminosity associated with a given flux density by a factor of about 10. The less significant effects of changing the dust emissivity index or the mid-IR spectral index are shown in the bottom-left and bottom-right panels respectively.
1999; Blain and Phillips, 2002). The possible effects on inferred luminosities of different forms of the SED shown in Fig. 5 need to be taken seriously, especially when describing the properties of individual galaxies selected in submm surveys.
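The sensitivity of the inferred luminosity to the assumed dust temperature can be estimated from Eq. (3). For a pure greybody with $x = h\nu(1+z)/kT_d$, the luminosity implied by a fixed observed flux density scales as $T_d\,(e^x - 1)/x^{3+\beta}$. The sketch below evaluates this scaling at 850 μm; it assumes $\beta = 1.5$ and neglects the mid-IR power law, so it is only a rough check on the factor of about 10 quoted in the text. The function names are hypothetical.

```python
import math

H_OVER_K = 4.799243e-11  # Planck constant / Boltzmann constant, K per Hz
C_M_S = 2.99792458e8     # speed of light, m/s


def relative_luminosity(t_dust, z, wavelength_um=850.0, beta=1.5):
    """Luminosity (arbitrary units) implied by a fixed observed flux density."""
    nu_rest = (1.0 + z) * C_M_S / (wavelength_um * 1e-6)
    x = H_OVER_K * nu_rest / t_dust
    return t_dust * math.expm1(x) / x ** (3.0 + beta)


def doubling_factor(t_dust, z, **kwargs):
    """Increase in inferred luminosity when t_dust is doubled at fixed flux."""
    return (relative_luminosity(2.0 * t_dust, z, **kwargs)
            / relative_luminosity(t_dust, z, **kwargs))


for t_dust in (20.0, 40.0):
    factor = doubling_factor(t_dust, z=2.0)
    print(f"{t_dust:.0f} K -> {2 * t_dust:.0f} K at z = 2: factor {factor:.1f}")
```

In this approximation, doubling the temperature at $z = 2$ raises the inferred luminosity by a factor of about 10 when starting from 20 K and by a somewhat larger factor when starting from 40 K, so the order of magnitude quoted above is recovered.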
There is little reliable evidence for a systematic relationship between dust temperature and redshift. Observations of low-redshift IRAS galaxies (Andreani and Franceschini, 1996; Dunne et al., 2000) indicate that any variation of dust temperature with luminosity appears to be gradual. However, there is evidence for a significant and systematic change in the temperature of dusty galaxies with a wider range of luminosities, from about 20 K for low-redshift spirals (Reach et al., 1995; Alton et al., 2000; Dunne and Eales, 2001) to about 40 K for more luminous objects typical of the galaxies detected in the IRAS survey. Temperatures of up to 110 K are found for some extremely luminous high-redshift galaxies (Lewis et al., 1998). We stress that there could be a significant selection effect in submm surveys that depends on the range of dust temperatures in the source population. The importance of such an effect can be quantified once a complete redshift distribution is available for a submm-selected galaxy sample.

2.7. Deep submm-wave surveys

Images of the redshift $z = 0.25$ cluster of galaxies Abell 1835 in both the optical and submm wavebands were compared in Fig. 1. This image provides a realistic impression of the appearance of deep optical and submm images of the sky. Note that the only relatively low-redshift or cluster member galaxy that contributes any submm-wave flux is the central cD galaxy: the other cluster galaxies are either quiescent, neither forming stars nor heating dust, or are insufficiently luminous to be detectable at 850 μm using SCUBA. Background galaxies at much greater redshift, which have faint optical counterparts as compared with the cluster member galaxies, dominate the image. This is a direct visual demonstration of the strong bias towards the detection of distant galaxies in submm-wave surveys that was illustrated in Fig. 4.
These background galaxies are magnified by a factor of order 2–3 due to the gravitational lensing potential of the foreground cluster over the full extent of the image. The effects of gravitational lensing can be determined using accurate models of the cluster potential that are constrained with the help of data from Hubble Space Telescope (HST) images and spectroscopic redshifts for multiply imaged optically selected galaxies. The uncertainty in the results is comparable to the uncertainty in the calibration of the submm images.

As shown in Fig. 1, existing deep submm images are much less visually stimulating than deep optical images, because their angular resolution is not sufficient to image the internal structure in distant galaxies. The limited resolution also imposes a confusion limit to the depth for submm surveys, at which the noise level is dominated not by atmospheric or instrumental noise but by the telescope resolution blurring together signals from faint unresolved galaxies. It takes about 50 h of integration using SCUBA to reach the practical confusion limit in a single field. Confusion is discussed in more detail in Section 3.1.

There are a variety of published results from deep submm galaxy surveys. Surveys aim to detect high-redshift galaxies exploiting the powerful K-correction effect in the submm waveband. Over 500 arcmin$^2$ of blank sky has been surveyed using SCUBA by several groups (Barger et al., 1998, 1999a; Hughes et al., 1998; Eales et al., 1999, 2000; Borys et al., 2002; Fox et al., 2002; Scott et al., 2002; Webb et al., 2002a). These range from an extremely deep survey in the area of the Hubble Deep Field-North (HDF-N; Williams et al., 1995) by Hughes et al. (1998) searching for the faintest detectable populations of submm galaxies, to wider-field shallower surveys to detect brighter sources that might be easier to follow up and could be used to trace large-scale structure (Borys et al., 2002; Scott et al., 2002).
About 30 lensed cluster fields, each of 5 arcmin$^2$, have been imaged using
SCUBA (Smail et al., 1997, 2002; Chapman et al., 2002a; Cowie et al., 2002; Kraiberg Knudsen et al., 2001; van der Werf and Kraiberg Knudsen, 2001), to various RMS depths between 0.5 and 8 mJy (footnote 7). By exploiting the magnification effect of gravitational lensing, which extends over fields several arcminutes across, due to rich clusters of galaxies at moderate redshifts, the population of distant galaxies in the source plane behind the lensing cluster can be probed to greater depths than is possible in a blank field (Blain, 1998).

The detection rate of galaxies using SCUBA based on published papers appears to have declined over time since 1998. In significant part this is due to the absence of the sustained excellent observing conditions on Mauna Kea that were experienced during the El Niño winter of 1997–1998, just after SCUBA was commissioned.

A wide variety of complementary, and sometimes overlapping, surveys have been made using MAMBO at the IRAM 30-m telescope (Bertoldi et al., 2000, 2001; Carilli et al., 2001), during several winters. These fields include the cluster Abell 2125 and the ESO-NTT Deep Field (Arnouts et al., 1999). A compilation of the results from all the SCUBA and MAMBO surveys is presented in Figs. 9 and 10. A full summary of deep projects that have been undertaken or are underway can be found in Ivison (2001). In addition, larger shallower submm surveys of the Galaxy (Pierce-Price et al., 2000; footnote 8), and perhaps the CMB images obtained using the BOOMERANG balloon-borne experiment (Masi et al., 2001), can be used to search for brighter submm-wave galaxies.

2.8. Submm observations of known high-redshift galaxies and QSOs

The advent of SCUBA and MAMBO has also provided the opportunity to study the submm properties of large samples of interesting high-redshift galaxies, including almost all types of previously known distant galaxies.
Isolated detections of high-redshift AGN-powered radio galaxies and QSOs were made in the mid-1990s using single-element bolometer detectors (for example Dunlop et al., 1994; Isaak et al., 1994); however, the compilation of statistical samples, and the secure rejection of contamination from fluctuating atmospheric noise, have only been possible more recently, using SCUBA and MAMBO, and the 350-μm one-dimensional bolometer array SHARC at the 10.4-m aperture Caltech Submillimeter Observatory (CSO) on Mauna Kea. A key advantage of observing these sources is that both their redshifts and some of their astrophysical properties are already known, in contrast with the submm-selected galaxies discovered in blank-field surveys.

Some of the targeted galaxies (very faint non-AGN radio galaxies, mid-IR-selected ISO galaxies, and X-ray selected AGNs) have only been detected very recently. As the relationship between these populations of galaxies and submm-selected galaxies is still unclear, many of the limits will be discussed in the context of following up submm surveys in Section 4.

Targeted surveys include a search for submm-wave continuum emission from high-redshift AGN-powered radio galaxies (Archibald et al., 2001), and observations of various samples of optically selected QSOs (for example Carilli et al., 2001; Isaak et al., 2002). In these observations a single bolometer is aimed at the position of the target. While this does not lead to a fully sampled

Footnote 7: 1 Jy = $10^{-26}$ W m$^{-2}$ Hz$^{-1}$.
Footnote 8: A JCMT project (Vicki Barnard et al.) is currently searching for candidate high-redshift galaxies detected in wide-field SCUBA images of star-forming regions in the Milky Way (Barnard et al., 2002).
image of the sky, it provides a more rapid measurement of the flux density at a chosen position. The results have been the detection and characterization of the dust emission spectra for a range of luminous high-redshift galaxies and QSOs, including APM 08279+5255 (Lewis et al., 1998), the galaxy with the greatest apparent luminosity in the Universe. Barvainis and Ivison (2002) have targeted all the known galaxies magnified into multiple images by the gravitational lensing effect of foreground galaxies from the CASTLES gravitational lens imaging project (footnote 9), significantly expanding the list of high-redshift galaxies magnified by a foreground mass concentration with a submm detection. The SEDs of several of these galaxies are shown in Fig. 2.

Archibald et al. (2001) find evidence for significant evolution in the properties of dust emission with increasing redshift in a carefully selected sample of AGN radio galaxies, whose radio properties were chosen to be almost independent of the redshift of the observed galaxy. The results perhaps indicate that more intense star-formation activity, as traced by the submm emission, takes place alongside the radio source activity at higher redshifts, and so provide a possible clue to the formation and evolution of the massive elliptical galaxies thought to host radio galaxies. Hughes et al. (1997), Ivison et al. (1998b) and Omont et al. (2001) discuss the consequences of finding large masses of dust at high redshifts, in terms of the limited cosmic time available for the formation of the stars required to produce the metals and dust required to generate sufficiently intense submm emission from the host galaxy.

The radio galaxies detected by Archibald et al. (2001) in pointed single-bolometer SCUBA observations were followed up by imaging observations of the surrounding 5-arcmin$^2$ fields, to search for submm-loud companions. Ivison et al.
(2000b) found that the surface density of submm galaxies in some of these fields is about an order of magnitude greater than that in a typical blank field, indicating a significant overdensity of sources. This is likely due to some radio galaxies being found in high-density regions of biased high-redshift galaxy formation, which are possibly 'protoclusters': rich clusters of galaxies in the process of formation.

A similar targeted approach has been taken to try to detect submm-wave emission from optically selected LBGs at redshifts between 2.5 and 4.5 (Steidel et al., 1999). The Lyman-break technique (Steidel et al., 1996) detects the rest-frame 91.2-nm neutral hydrogen absorption break in the SED of a galaxy as it passes through several broad-band filters. Large samples of candidate LBGs can be gathered using multi-color optical images from 4-m class telescopes. The efficiency of the selection method is of order 70% after spectroscopic confirmation of the candidates using 8/10-m class telescopes. The LBGs are the largest sample of spectroscopically confirmed high-redshift galaxies, with a well-defined luminosity function (Adelberger and Steidel, 2000), a surface density of order 10 arcmin$^{-2}$, and inferred star-formation rates between 1 and 10 $M_\odot$ yr$^{-1}$. They appear to be typical of the population of distant galaxies, and their spectra provide useful astrophysical information.

Observing the LBGs at submm wavelengths is an important goal, as an accurate determination of their submm-wave properties will investigate the link (if any) between the large well-studied LBG sample and the more enigmatic submm galaxy population (Blain et al., 1999c; Lilly et al., 1999; Adelberger and Steidel, 2000; Granato et al., 2001).
At present, the typical very faint limits to the optical counterparts of the subset of submm galaxies with accurate positions (Smail et al., 1998a, 2002; Downes et al., 1999a; Dannerbauer et al., 2002), and the detection of extremely red object (ERO) galaxies (with $R - K > 6$) as counterparts to a significant fraction of submm galaxies

Footnote 9: http://cfa-www.harvard.edu/castles.
(Smail et al., 1999, 2002; Gear et al., 2000; Frayer et al., 2002; Ivison et al., 2001; Lutz et al., 2001), argue against a large overlap between the two populations. The direct submm detection of LBGs using SCUBA has been largely unsuccessful at the 0.5-mJy RMS level: a single galaxy out of 16 was detected by Chapman et al. (2000, 2002c), while Webb et al. (2002b) describe a low significance of overlap between LBGs and SCUBA galaxies in a wide-field survey. The LBG cB58 at $z = 2.72$ (Ellingson et al., 1996; Frayer et al., 1997; Seitz et al., 1998; Pettini et al., 2000), which is magnified strongly (by a factor of 10–20) by a foreground cluster of galaxies at $z = 0.37$, and is at least 10 times brighter than a typical LBG, was detected in the mm and submm by Baker et al. (2001) and van der Werf et al. (2002). However, after correcting for lensing, its 850-μm flux density is only about 0.1 mJy, below the level of confusion noise in SCUBA images, and similar to the flux density level of the statistical detection of high-redshift LBGs in the 850-μm SCUBA image of the HDF-N (Peacock et al., 2000). In the field surrounding an overdensity of LBGs at $z = 3.09$, Chapman et al. (2001a) were successful in detecting bright submm emission that appears to be associated with diffuse sources of Lyman-α emission at the redshift of the overdensity, but that were not included in the Lyman-break catalog.

A key point to note is that the limits on submm-wave emission from LBGs are typically lower than expected if the relationship between UV spectral slope and far-IR luminosity observed for low-redshift low-luminosity starburst galaxies (Meurer et al., 1999) continues to high redshifts. Goldader et al. (2002) indicate that the relationship does not appear to hold for the most luminous galaxies. The required sensitivity for successful submm observations of typical LBGs seems to be deeper than can be achieved using existing instruments.
Observations using future very sensitive, high-resolution interferometers (certainly ALMA, and perhaps CARMA and SMA) will shed more light on the submm–LBG connection.

The advent of the current generation of very sensitive X-ray observatories, Chandra and XMM-Newton, is generating a large sample of faint, hard X-ray sources, the luminosity of which is assumed to be dominated by high-redshift AGN (Fabian, 2000). Absorption and Compton scattering in large column densities of gas preferentially deplete soft X-rays, hardening the X-ray SEDs of gas-rich AGN. Such a population of hard, absorbed X-ray sources is required in order to account for the cosmic X-ray background radiation spectrum, which is harder than the typical SEDs of individual low-redshift AGN (Fabian and Barcons, 1992; Hasinger et al., 1996). Observations of the limited areas of the sky where both submm and X-ray data are available (Fabian et al., 2000; Hornschemeier et al., 2000; Mushotzky et al., 2000; Almaini et al., 2002) have tended to show little direct overlap between the X-ray and submm galaxies, although there are examples of X-ray-detected submm-wave galaxies (Bautz et al., 2000). The combined results of Bautz et al. and Fabian et al. reveal that 2 out of 9 SCUBA galaxies are detected by Chandra. In the larger-area brighter 8-mJy survey, Almaini et al. (2002) identify only 1 out of 17 SCUBA galaxies using Chandra. Page et al. (2002) discuss further the submm properties of X-ray sources. Perhaps of order 10% of known submm galaxies have faint hard X-ray counterparts that would be typical of dust-enshrouded AGN. There is also a statistical detection of excess submm-wave emission from the positions of faint high-redshift hard X-ray sources (Barger et al., 2001) and a positive submm–X-ray galaxy correlation function (Almaini et al., 2002).
The lack of strong X-ray emission from a majority of submm galaxies lends circumstantial support to the idea that much of their luminosity is derived from star formation and not from AGN accretion. However, some submm galaxies may have hydrogen column densities, and thus optical depths to Compton scattering, that are sufficiently great to obscure soft X-ray radiation
A.W. Blain et al. / Physics Reports 369 (2002) 111 – 176
entirely (> 10²⁴ cm⁻²). Even if they contained powerful AGN, these submm galaxies would be very faint in Chandra surveys, which reach detection limits of order 10⁻¹⁷ erg cm⁻² s⁻¹ in the soft 0.5–2 keV band (Giacconi et al., 2002). They may be found in very deep observations using the greater collecting area of XMM-Newton for hard X-ray photons. However, note that the 15-arcsec resolution of XMM-Newton leads to confusion due to unresolved faint sources in the beam that is likely to impose a practical limit of order 10⁻¹⁵ erg cm⁻² s⁻¹ to the depth of a survey in the hard X-ray 2–8 keV band (Barcons et al., 2002). Deconvolution of the images from joint XMM-Newton/Chandra deep fields, exploiting the sub-arcsec positional information from Chandra, will perhaps allow surveys to reach below this limit.

Finally, galaxies detected in far-IR surveys using ISO (for example Puget et al., 1999) out to redshifts z ∼ 1 have been targeted for SCUBA submm observations (Scott et al., 2000). The large arcmin-scale observing beam in 170-μm ISO surveys makes identification of submm counterparts difficult, but progress has been made by combining sub-arcsec resolution deep radio images. The results include some sources with apparently rather cool dust temperatures of order 30 K (Chapman et al., 2002d), and are generally consistent with redshifts less than unity and dust temperatures of less than about 50 K for most of the galaxies.

2.9. Alternative strategy for deep submm surveys

An alternative strategy for finding submm galaxies has also been tried: targeted submm observations of faint radio-selected galaxies with accurate positions and radio spectral index information, detected using the VLA at a 1.4-GHz flux density level brighter than 40 μJy (for example, Richards, 2000).
The radio source population at these faint flux density levels is expected to include mostly high-redshift star-forming galaxies, and only a minority of sources with the more powerful synchrotron-emitting jets and lobes associated with particles accelerated by AGN. By selecting those faint radio sources with faint K-band counterparts (Barger et al., 2000; Chapman et al., 2001b), it should be possible to sift out low-redshift galaxies from the target sample, and so generate a concentrated sample of luminous high-redshift galaxies to study in the submm. From Fig. 4, it is clear that a high-redshift faint 20–100-μJy non-AGN 1.4-GHz radio source is likely to be a very luminous galaxy. Chapman et al. (2001b) claim that high-redshift submm galaxies can be detected using SCUBA at a rate of about one every hour using this method, which is significantly more rapid than the rate of about one every 10 h achieved in blank-field surveys. The efficiency of submm detection of optically faint 1.4-GHz 20-μJy radio galaxies at an 850-μm flux density greater than about 5 mJy is of order 30–40%. Barger et al. (2000) detect 5/15 with I > 24 at 6 mJy, and Chapman et al. (2001b) detect 20/47 with I > 25 at 4.5 mJy.

The difficulty comes in the interpretation of the results. The additional selection conditions of requiring first a radio detection, and then a faint optical counterpart, will inevitably lead to the omission of a certain fraction of the submm galaxy population. Galaxies missing would include both the roughly 15% of distant submm-selected galaxies that have relatively bright optical counterparts (the ‘Class-2’ SCUBA galaxies; Ivison et al., 2000a; Smail et al., 2002) and any very high-redshift submm galaxies that cannot be detected at the VLA, despite lying on the far-IR–radio correlation (Condon, 1992; see Fig. 8).
There is also likely to be a bias (at all redshifts) both towards detecting AGNs, in which radio flux densities are boosted above the level expected from the standard radio–far-IR correlation, and towards excluding a small population of low-redshift submm galaxies, which
would be too bright in the radio and K band to be included in the survey. The rare, and perhaps especially interesting, submm galaxies at the lowest and highest redshifts are thus likely to be missing from radio pre-selected surveys, as are the distant submm galaxies with brighter optical magnitudes that are likely to be easiest to follow up and investigate. The size of these selection effects is difficult to quantify at present. However, their existence can be inferred from the diverse multiwaveband properties displayed by submm-selected galaxies from imaging surveys, a significant fraction of which are known not to be detected in deep radio observations (Downes et al., 1999b; Smail et al., 2000). In some cases, the depth of the radio images used to make the comparison could be improved by a factor of several; it is possible, but we suspect unlikely, that all submm sources have radio counterparts lurking just below existing detection thresholds.

It is instructive to compare the results of the blank-field and faint radio pre-selected submm surveys. In a true 260-arcmin² blank-field survey, Scott et al. (2002) found surface densities of $550^{+100}_{-170}$ and 180 ± 60 deg⁻² submm galaxies brighter than 6 and 10 mJy respectively, while Borys et al. (2002) estimate 164 ± 28 deg⁻² brighter than 12 mJy in a 125-arcmin² survey, with a rather conservative error estimate from 12 detections. The surface densities resulting from the faint radio-selected investigations of Barger et al. (2000) and Chapman et al. (2001b) are 430 deg⁻² brighter than 4 mJy and 135 deg⁻² brighter than 10 mJy.¹⁰ This indicates incompleteness in the faint radio-selected counts by at least about 25%.
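The 25% figure follows directly from the two surface densities quoted at 10 mJy. The short sketch below reproduces that arithmetic and, as an added illustration, propagates only the ±60 deg⁻² uncertainty quoted for the blank-field count; the variable names are chosen here and no error is assumed on the radio-selected value because none is quoted.

```python
# Surface densities (deg^-2) of 850-um sources brighter than 10 mJy, from the text
blank_field = 180.0       # Scott et al. (2002), blank-field survey
blank_field_err = 60.0    # quoted uncertainty on the blank-field value
radio_selected = 135.0    # faint radio pre-selected surveys

def incompleteness(selected, reference):
    """Fraction of the reference population missing from the selected sample."""
    return 1.0 - selected / reference

central = incompleteness(radio_selected, blank_field)

# Bracket the estimate using only the blank-field uncertainty
if_counts_low = incompleteness(radio_selected, blank_field - blank_field_err)
if_counts_high = incompleteness(radio_selected, blank_field + blank_field_err)

print(f"central incompleteness: {central:.0%}")
print(f"range from blank-field error: {if_counts_low:.0%} to {if_counts_high:.0%}")
```

The central value is 25%, but the bracket is wide and even extends to negative values, which illustrates why the size of the selection effect is described as difficult to quantify.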
This is perfectly acceptable for gathering a list of high-redshift galaxies for further study; however, for statistical analysis of a large sample of many tens of submm galaxies, the uncertainty introduced in the derived properties of the population due to incompleteness is expected to dominate the Poisson uncertainty, and so limit the accuracy of the inferences that can be made. Once the bright counts of distant 850-μm galaxies are known from direct, hopefully unbiased, large-area SCUBA imaging surveys (Fox et al., 2002; Scott et al., 2002) and matched with very deep radio images (Ivison et al., 2002), then the value of this shortcut for compiling a large sample of submm galaxies can be assessed. The same strategy would be very valuable for wide-field MAMBO 1.2-mm surveys (Dannerbauer et al., 2002). At present, some care needs to be taken in the interpretation of faint radio pre-selected surveys.

2.10. Determining redshifts of submm galaxies

Because submm galaxies have a nearly uniform selection function with redshift, it is impossible even to indicate their redshift from single-wavelength submm flux densities alone. The relatively poor positional accuracy of submm images also makes it very difficult to identify an unambiguous optical counterpart for spectroscopic follow-up. This will be discussed in more detail when the properties of individual galaxies are described in Section 3. Here we discuss some general features of the prospects for determining their redshifts from submm, far-IR and radio observations.

2.10.1. Photometric redshifts from far-IR SEDs

The submm–far-IR SED of dusty galaxies is thermal, and so redshifting a fixed template SED affects the observed colors in exactly the same way as changing the dust temperature. This means
¹⁰ Barger et al. (2000) also impose a limit of less than 840 deg⁻² brighter than 6 mJy at 850 μm, based on serendipitous detections of sources in their 2.5-arcmin-wide SCUBA images around the AGN radio galaxies.
Fig. 6. The ratio of flux densities expected in different observing bands as a function of the degenerate redshift/dust temperature parameter, compared with the flux density expected in the 70-μm band at which the SIRTF satellite will be very sensitive. Where the lines have steep gradients, measured colors from multi-band data locate the peak of the dust SED accurately in the observer’s frame, providing a measurement of temperature–redshift. The degeneracy between T_d and z can be lifted slightly by including radio data (Blain, 1999a; Yun and Carilli, 2002), if the dust temperature is greater than about 60 K (see also Fig. 7). If deep near-IR and optical images can be included, and the optical counterpart to the galaxy can be readily identified, then conventional photometric redshifts can be determined from stellar synthesis models. However, care must be taken as it is unclear whether the SEDs of very dusty galaxies have familiar restframe-optical spectral breaks.
that even when multi-frequency far-IR data is available, the redshift will be uncertain, unless there is information about the intrinsic dust temperature of the source. This effect is related to the selection effect in favor of cold galaxies illustrated in Fig. 5. The expected colors of dusty galaxies, as a function of (1 + z)/T_d, the parameter that can be constrained in light of this degeneracy, are shown in Fig. 6. Colors in a variety of observing bands in both submm atmospheric windows and for the observing bands of the MIPS instrument on the Space InfraRed Telescope Facility (SIRTF)¹¹ satellite are included. It is vital to stress that without knowledge of the dust temperature, it is impossible to determine a redshift from any combination of multicolor broadband far-IR/submm data. The degeneracy between T_d and z is lifted partially by combining information derived from radio observations, if T_d > 60 K. It may also become clear that there is a Universal temperature–luminosity relation that extends to high redshifts, and can be exploited to determine redshift information using a far-IR–submm color–magnitude relation. In the absence of such a Universal relation, far-IR and submm colors can only be used to fix the parameter (1 + z)/T_d reliably.
¹¹ sirtf.caltech.edu.
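The temperature–redshift degeneracy described in Section 2.10.1 can be demonstrated numerically. The sketch below assumes an optically thin modified blackbody with emissivity index β = 1.5 (an assumption made here for illustration, not a value taken from this section) and compares the observed 450-μm:850-μm color of two galaxies with the same (1 + z)/T_d. The distance and bandwidth factors cancel in a flux ratio, so they are omitted.

```python
import math

H_OVER_K = 0.0479924      # Planck constant / Boltzmann constant, in K per GHz
C_GHZ_UM = 299792.458     # speed of light in GHz * micron

def sed_shape(nu_obs_ghz, z, t_dust, beta=1.5):
    """Unnormalised modified-blackbody SED evaluated at the rest-frame frequency.

    beta = 1.5 is an assumed dust emissivity index.
    """
    nu_rest = nu_obs_ghz * (1.0 + z)
    x = H_OVER_K * nu_rest / t_dust
    return nu_rest ** (3.0 + beta) / math.expm1(x)

def color(wavelength1_um, wavelength2_um, z, t_dust, beta=1.5):
    """Observed flux-density ratio between two bands for a given z and T_d."""
    nu1 = C_GHZ_UM / wavelength1_um
    nu2 = C_GHZ_UM / wavelength2_um
    return sed_shape(nu1, z, t_dust, beta) / sed_shape(nu2, z, t_dust, beta)

# Two galaxies with identical (1 + z) / T_d = 0.1 K^-1
cool_nearby = color(450.0, 850.0, z=1.0, t_dust=20.0)
hot_distant = color(450.0, 850.0, z=3.0, t_dust=40.0)

# A galaxy with a different (1 + z) / T_d for comparison
cool_distant = color(450.0, 850.0, z=3.0, t_dust=20.0)

print(cool_nearby, hot_distant, cool_distant)
```

The first two colors agree to rounding error, while the third differs, so multi-band far-IR/submm photometry alone constrains only the combination (1 + z)/T_d.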
2.10.2. Radio–submm photometric redshifts

In Fig. 4, an estimate of the flux density of radio emission of a template dusty galaxy was shown as a function of redshift. We now investigate the radio properties of submm galaxies, which provide useful information about their SED and redshift. There is an excellent observed correlation between the radio and far-IR (60- and 100-μm) flux densities of low-redshift galaxies over 4 orders of magnitude in luminosity, reviewed by Condon (1992) and recently investigated out to z ≃ 0.3 by Yun et al. (2001). If this correlation is assumed to hold to high redshifts, then submm-selected galaxies should be detectable in the deepest 10-μJy-RMS 1.4-GHz VLA radio images out to redshifts of order 3. Note that as a result of this correlation, optical spectroscopy of faint non-AGN radio galaxies alone can be used to trace the evolution of star-formation activity to redshifts z ≃ 1.2, beyond which spectroscopic redshifts are hard to determine (Haarsma et al., 2000).

The far-IR–radio correlation is thought to be due to a match between the rate at which the optical/UV radiation from young stars is absorbed by dust on a local scale in star-forming regions of galaxies, and re-emitted as thermal far-IR radiation, and the radio luminosity from the same regions (Harwit and Pacini, 1975). The radio luminosity is due to both free–free emission in HII regions, and more importantly at frequencies less than about 10 GHz, to the synchrotron emission from relativistic electrons accelerated in supernova shocks. If an AGN is present, then it is likely that its accretion disk will provide an additional source of UV photons to heat dust, and both the disk and outflows will generate shocks to accelerate relativistic electrons. There is little reason to expect these effects to be proportional, unlike UV heating by massive stars and particle acceleration by supernova shocks in star-forming regions.
Radio-quiet QSOs tend to lie on the radio-loud side of the far-IR–radio correlation, while radio-loud AGN can lie up to three orders of magnitude away.

Carilli and Yun (1999, 2000) demonstrated that the radio–submm color is a useful redshift indicator, assuming that dusty galaxies have simple synchrotron SEDs in the radio waveband and thermal dust spectra in the submm and far-IR wavebands. The radio–submm color was also considered in the interpretation of the redshifts of galaxies detected in the submm surveys by Hughes et al. (1998), Lilly et al. (1999) and Eales et al. (2000). The Carilli–Yun redshift indicator is subject to a degeneracy between dust temperature and redshift, at dust temperatures less than about 60 K: see Fig. 7 in which its form is shown for a variety of SEDs. Hot, distant galaxies are difficult to distinguish from cool, low-redshift ones (Blain, 1999a, b). Despite this degeneracy, the Carilli–Yun redshift indicator is very useful, especially for investigating optically faint submm galaxies for which almost no other information is available (Smail et al., 2000). AGN, which are expected to be radio-loud as compared with the standard far-IR–radio correlation, lead to conservatively low Carilli–Yun estimated redshifts.

At the highest redshifts, two additional factors arise to modify the relation expected. First, synchrotron emission from relativistic electrons is likely to be suppressed. The intensity of synchrotron emission depends on only the energy density in the interstellar magnetic field, while the total cooling rate of the electrons depends on the sum of the energy densities in the interstellar magnetic field, for synchrotron losses, and in the ISRF, for inverse Compton scattering losses. The tight low-redshift far-IR–radio correlation implies that these energy densities are proportional over a very wide range of galaxy properties (Völk, 1998).
However, above some critical redshift, the energy density in the CMB will always rise to dominate the ISRF, upsetting this balance. Inverse-Compton electron cooling will then dominate and the amount of synchrotron emission will decline. Note, however, that free–free emission does not suffer this suppression at high redshifts, and so should remain detectable
Fig. 7. The behavior of the Carilli–Yun 1.4-GHz to 850-μm radio–submm redshift indicator. The left panel shows the ratios of 1.4-GHz:850-μm flux density predicted from empirical SEDs by Carilli and Yun (1999; dotted lines) and Dunne et al. (2001; dashed line; see also Carilli and Yun, 2000). Predictions for the ratio based on the results of Blain (1999a) are also shown assuming radio–far-IR SEDs with various dust temperatures, but which all lie on the far-IR–radio correlation (Yun et al., 2001; solid lines). The flux ratio is a good indicator of redshift, clearly separating high- and low-redshift galaxies. Both synchrotron and free–free radio emission are included, and the dust temperature and radio properties evolve with redshift self-consistently, as modified by the CMB. In the right panel, the solid curves are replotted as a combined function of temperature and redshift, emphasizing that for T_d < 60 K, the inferred temperature and redshift are degenerate, just as for a thermal spectrum (Fig. 6). For T_d > 60 K the flux ratio becomes a non-degenerate redshift indicator.
out to any practical redshift. The almost flat SED of free–free emission ensures a more favorable K correction for high-redshift galaxies than expected for a pure synchrotron emission spectrum. The free–free emission spectrum cuts off only at photon energies greater than the thermal energy of emitting electrons in HII regions. In the absence of free–free optical depth effects, for 10⁴–10⁵ K this corresponds to optical frequencies (Yun and Carilli, 2002). In Fig. 8 we show the effects of CMB suppression of synchrotron emission, for a ratio of energy densities in the magnetic field and the ISRF of 0.33, which is reasonable for M82 and the Milky Way (Hummel, 1986). We assume a galaxy SED template that lies on the standard far-IR–radio correlation at z = 0 (Condon, 1992).

Secondly, again most significant at high redshifts, a minimum dust temperature is imposed by the rising CMB temperature: dust must be hotter than the CMB. Given that observed dust temperatures in ultraluminous galaxies seem to lie in the range 40–100 K, then this may become an important factor at redshifts 10 < z < 30, if an early generation of stars generates the heavy elements required to form dust prior to these redshifts. For cooler Milky-Way-like SEDs, this effect would be important at z ≃ 5. An increase in dust temperature due to CMB heating at high redshifts shifts the peak of the dust SED to higher frequencies, counteracting the beneficial K correction illustrated in Fig. 4. As a result, there is a firm upper limit to the redshift at which submm continuum radiation can be exploited to image the most distant galaxies efficiently, even if these galaxies do contain dust.
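Both high-redshift effects described above scale simply with redshift, since the CMB temperature rises as (1 + z) and its energy density as (1 + z)⁴. The sketch below makes the scalings concrete. The present-day CMB temperature and energy density are standard values; the magnetic-to-ISRF energy density ratio of 0.33 is taken from the text; the absolute ISRF energy density `U_ISRF_EXAMPLE` is a hypothetical illustrative value chosen here so that the CMB overtakes the ISRF near z ∼ 5, and is not taken from the source.

```python
T_CMB0 = 2.725        # K, present-day CMB temperature
U_CMB0 = 4.17e-14     # J m^-3, present-day CMB energy density (a * T^4)

def cmb_temperature(z):
    """CMB temperature at redshift z, which sets a floor on the dust temperature."""
    return T_CMB0 * (1.0 + z)

def redshift_where_cmb_reaches(t_dust):
    """Redshift at which the CMB temperature equals a given dust temperature."""
    return t_dust / T_CMB0 - 1.0

def synchrotron_fraction(z, u_b, u_isrf):
    """Fraction of electron cooling emerging as synchrotron radiation.

    The remainder is lost to inverse Compton scattering off the ISRF and the CMB.
    """
    u_cmb = U_CMB0 * (1.0 + z) ** 4
    return u_b / (u_b + u_isrf + u_cmb)

def redshift_where_cmb_dominates(u_isrf):
    """Redshift at which the CMB energy density equals that of the ISRF."""
    return (u_isrf / U_CMB0) ** 0.25 - 1.0

# HYPOTHETICAL starburst-like ISRF energy density, for illustration only
U_ISRF_EXAMPLE = 5.0e-11            # J m^-3
U_B_EXAMPLE = 0.33 * U_ISRF_EXAMPLE  # ratio of 0.33 as quoted in the text

for t_dust in (20.0, 40.0, 100.0):
    print(f"T_d = {t_dust:5.1f} K equals T_CMB at z = "
          f"{redshift_where_cmb_reaches(t_dust):.1f}")

print(f"CMB overtakes ISRF at z = {redshift_where_cmb_dominates(U_ISRF_EXAMPLE):.1f}")
for z in (0, 5, 10):
    print(z, round(synchrotron_fraction(z, U_B_EXAMPLE, U_ISRF_EXAMPLE), 3))
```

The redshifts at which the CMB temperature equals the dust temperature come out somewhat higher than the ranges quoted in the text, which is expected because CMB heating begins to matter before the two temperatures are equal.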
Fig. 8. Some key features of flux density–redshift relations expected at a range of wavelengths, extending to very high redshifts. CMB heating of dust at z > 10 prevents the mm-wave K correction from assisting the detection of very high redshift galaxies: the flux density–redshift relation has the same redshift dependence beyond z ≃ 15 at 230, 90 and 30 GHz. CMB cooling of relativistic electrons suppresses synchrotron radio emission beyond z ∼ 5, as shown by the thin dashed line. Realistic free–free emission is also included in the model represented by the thick dashed line (Condon, 1992; Yun et al., 2001), significantly increasing the radio emission expected from very high redshifts. An estimate of the flux density from a 3 × 10⁴-K stellar photosphere at 5 μm is also shown, cut off at the redshift beyond which Lyman-α absorption is redshifted through the band. Note that there is probably a maximum redshift above which dust does not exist, and so beyond which thermal emission from dust can never be detected; this effect is not included here.
This is illustrated by the flux density–redshift relations at 230, 90 and 30 GHz shown in Fig. 8: at redshifts greater than about 15 all three curves have the same form.

Both of these factors were included in the derivation of the curves in Fig. 7, which should thus provide an accurate guide to the usefulness of the Carilli–Yun redshift indicator out to the highest redshifts. Note that the indicator becomes of little use for the most distant galaxies, for which an almost constant radio:submm flux ratio of about 10⁻³ is expected. In addition, the hottest dusty galaxies may be more likely to contain AGN, and thus to lie on the radio-loud side of the far-IR–radio correlation. This effect could make the interpretation of Fig. 7 for determining redshifts ambiguous, even if T_d > 60 K. However, it should always guarantee a conservative estimate of the redshift for any observed galaxy.

Foreground absorption is not a problem for very high-redshift sources of any radio, submm or far-IR radiation. It is thus likely that the most sensitive future instruments at submm and radio wavelengths, ALMA and the SKA, will both be able to detect ‘first light’ galaxies. Note, however, that a mask of foreground structure may be significant for radio observations (Waxman and Loeb, 2000). A practical limit to the capability of determining the history of very early star formation from an ‘SKA deep field’ could also be set by the low, cosmologically dimmed surface brightness of galaxies at the highest redshifts, and their potentially overlapping emission regions. The importance
of both of these factors is expected to depend critically on the unknown physical sizes of the first galaxies. ALMA will probably be limited to observe galaxies out to a maximum redshift set by the requirement that sufficient metals have been generated to form obscuring dust at that early epoch. To compare what might be possible in the near-IR waveband, the emission from a 3 × 10⁴-K blackbody stellar photosphere at 5 μm is shown in Fig. 8, cut off at the redshift beyond which absorption by redshifted Lyman-α becomes important in the band. This indicates the potential for probing the earliest galaxies using a near-IR camera on the Next Generation Space Telescope (NGST). Free–free emission and redshifted near-IR stellar emission may thus be the best routes to the detection of the first galaxies at redshifts greater than 10, if they exist.

3. The observed properties of submm-selected galaxies

Well over 100 submm-selected galaxies are now known (see Fig. 9), although their redshifts and detailed astrophysical properties are very largely uncertain. The key information available about their properties comes from observations of discrete galaxies made using the SCUBA and MAMBO bolometer array cameras at wavelengths of 450, 850 and 1200 μm. Counts of distant galaxies at far-IR wavelengths of 95 and 175 μm have also been measured using the PHOT instrument aboard ISO. Limits to the counts at 2.8 mm have been obtained using the Berkeley–Illinois–Maryland Association (BIMA) mm-wave interferometer. The results of all the relevant observations are compiled in Figs. 9 and 10. Information is also available about the population of mid-IR 15-μm sources using the CAM instrument aboard ISO (Altieri et al., 1999; Elbaz et al., 1999): see Fig. 22. Measurements of the observed intensity of background radiation from the radio to far-UV wavebands are shown in Fig. 11. The background measurements include lower limits obtained by summing over the observed counts plotted in Figs. 9 and 10.
The deepest counts at 850 and 15 μm come from surveys made in fields magnified by gravitational lensing clusters. Surface brightness conservation in the lensing process ensures that the mean background intensity in the direction of a lens should be the same as that in a blank field.

From the properties of the counts and backgrounds alone, without any details of the individual galaxies involved, it is possible to infer important details about the population of distant dust-enshrouded galaxies. The significant surface density of the faint SCUBA and MAMBO galaxies, when coupled to plausible SEDs (Blain et al., 1999b; Trentham et al., 1999; Dunne et al., 2000), clearly indicates that the luminosity function of distant dusty submm galaxies is much greater than that of low-redshift IRAS galaxies (Saunders et al., 1990; Soifer and Neugebauer, 1991), and undergoes very strong evolution. An extrapolation of the low-redshift luminosity function without evolution predicts a surface density of galaxies brighter than 5 mJy at 850 μm of only about 0.25 deg⁻², as compared with the observed density of several hundred deg⁻² (Fig. 9). Because of the flat flux density–redshift relation in the submm shown in Fig. 4, a 5-mJy SCUBA galaxy at any moderate or high redshift (z > 0.5) has a luminosity greater than about 8 × 10¹² L⊙. Immediately, this tells us that the comoving density of high-redshift galaxies with luminosities in excess of about 10¹³ L⊙ is 400 times greater than at z = 0. We stress that the submm K correction ensures that the redshift has little effect on the results: the count would be approximately the same whether the population is concentrated at z ≃ 1 or extends from z ≃ 2 to 10.
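The flat submm flux density–redshift relation invoked above can be reproduced with a short numerical sketch. It assumes a flat ΛCDM cosmology with H₀ = 70 km s⁻¹ Mpc⁻¹, Ω_m = 0.3 and Ω_Λ = 0.7, and a fixed-luminosity modified blackbody with T_d = 40 K and β = 1.5. These parameter choices are assumptions made here for illustration and may differ from those used for Fig. 4.

```python
import math

H0 = 70.0                 # km/s/Mpc (assumed)
OMEGA_M = 0.3             # assumed matter density parameter
OMEGA_L = 0.7             # assumed cosmological-constant density parameter
C_KM_S = 299792.458       # speed of light in km/s
C_GHZ_UM = 299792.458     # speed of light in GHz * micron
H_OVER_K = 0.0479924      # Planck / Boltzmann constant in K per GHz

def luminosity_distance_mpc(z, steps=2000):
    """Luminosity distance in flat LCDM, by trapezoidal integration of 1/E(z)."""
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        zi = i * dz
        e = math.sqrt(OMEGA_M * (1.0 + zi) ** 3 + OMEGA_L)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight / e
    comoving = C_KM_S / H0 * total * dz
    return (1.0 + z) * comoving

def relative_flux_density(wavelength_um, z, t_dust=40.0, beta=1.5):
    """Observed flux density, in arbitrary units, of a fixed-luminosity dusty galaxy."""
    nu_rest = C_GHZ_UM / wavelength_um * (1.0 + z)   # rest-frame frequency in GHz
    sed = nu_rest ** (3.0 + beta) / math.expm1(H_OVER_K * nu_rest / t_dust)
    return (1.0 + z) * sed / luminosity_distance_mpc(z) ** 2

for wavelength in (850.0, 100.0):
    ref = relative_flux_density(wavelength, 1.0)
    row = [relative_flux_density(wavelength, z) / ref for z in (1, 2, 5, 10)]
    print(wavelength, ["%.3g" % value for value in row])
```

Under these assumptions the 850-μm flux density changes by well under a factor of two between z = 1 and z = 5, whereas at 100 μm it falls by several orders of magnitude, which is why a submm flux limit corresponds to a nearly redshift-independent luminosity limit.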
Fig. 9. A summary of count data from several mm, submm and far-IR surveys, shown at wavelengths of 2.8 mm, 850 μm and 95 μm in order up the figure. The overplotted curves are derived in models that provide good fits to the compilation of data, and are updated from the results in the listed MNRAS papers (Blain et al., 1999b, c). Identical symbols represent post-1999 data from the same source. The errors are shown as 1σ values unless stated. The 2.8-mm data (square) is from Wilner and Wright (1997). At 850 μm, in order of increasing flux (less than 15 mJy), data is from Blain et al. (1999a), Cowie et al. (2002) with 90% confidence limits; Hughes et al. (1998), Chapman et al. (2001b), Barger et al. (1998, 1999a), Smail et al. (1997), Eales et al. (1999, 2000) consistent with the increased area reported by Webb et al. (2002a), Borys et al. (2002), Barger et al. (2000) and Scott et al. (2002). The data points between about 2 and 10 mJy are consistent with a steep integral source count $N(>S) \propto S^{\alpha}$, with a power-law index $\alpha \simeq -1.6$. The counts at brighter flux densities are likely to steepen considerably; note that the counts must turn over at fainter flux densities, flattening to $\alpha > -1$, to avoid the background radiation intensity diverging. At 95 μm, the data is from Juvela et al. (2000), Kawara et al. (1998), Matsuhara et al. (2000), Serjeant et al. (2001) and Linden-Vørnle et al. (2000).
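The requirement that the counts flatten at faint flux densities follows from summing the flux of all sources below some flux density S_max. As a brief sketch, assuming a single power law for the integral count, the background intensity contributed by sources fainter than S_max is:

```latex
N(>S) \propto S^{\alpha}
\;\Rightarrow\;
\left|\frac{\mathrm{d}N}{\mathrm{d}S}\right| \propto S^{\alpha-1},
\qquad
I(<S_{\max}) = \int_{0}^{S_{\max}} S \left|\frac{\mathrm{d}N}{\mathrm{d}S}\right| \mathrm{d}S
\;\propto\; \int_{0}^{S_{\max}} S^{\alpha}\,\mathrm{d}S
= \frac{S_{\max}^{\alpha+1}}{\alpha+1}
\quad (\alpha > -1).
```

The integral converges at its lower limit only for a slope shallower than −1, so a slope of about −1.6 cannot persist to arbitrarily faint flux densities without the background intensity diverging.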
This estimate is subject only to an uncertainty in the dust temperature, which is assumed to be about 40 K. Even if the dust temperature of some of the galaxies is as low as the 20 K found for low-redshift spiral galaxies, then their luminosity is still about 8 × 10¹¹ L⊙, considerably greater than the several times 10¹⁰ L⊙ expected for typical spiral galaxies. This issue can be addressed by taking into account both the observed background spectrum and the counts at different wavelengths.

The submm-wave background radiation spectrum can also be exploited to provide information about the form of evolution of the luminosity function. The submm-wave background, measured directly using COBE–FIRAS (Puget et al., 1996; Hauser et al., 1998; Schlegel et al., 1998), reasonably exceeds the sum of the measured flux densities of discrete galaxies detected in SCUBA surveys (Smail et al., 1997, 2002; Blain et al., 1999a). However, the submm background makes up only a small fraction of the total energy density in the far-IR background, which peaks at a wavelength of about 200 μm and is generated by galaxies at redshift z ∼ 1. The relatively flat source SEDs and
Fig. 10. Counterpart to Fig. 9 for three other observing bands, shown at wavelengths of 1.2 mm, 450 μm and 175 μm in order up the figure. The data at 1.2 mm (circles at flux densities less than 10 mJy) are from Bertoldi et al. (2001), Carilli et al. (2001) and Carilli (2001). The data at 450 μm (circles at 10–50 mJy) are from Smail et al. (2002), with limits from Smail et al. (1997) and Barger et al. (1998). The data at 175 μm (100 mJy) are from Kawara et al. (1998), Puget et al. (1999), Matsuhara et al. (2000), Juvela et al. (2000), Dole et al. (2001) and Stickel et al. (1998).
the rate of change of the cosmic volume element at this redshift conspire to generate most of the background light, just as in the radio, X-ray, optical and near-IR wavebands. The mm and submm background radiation is unique in originating at a higher redshift. Very little of the background is expected to be generated at z < 1, and so it is an important signature of high-redshift galaxy formation. Despite representing only a small fraction of the total energy density in the cosmic background radiation, the mm-wave background is one of the cleanest measures of activity in the distant Universe.

There are significant consequences for the evolution of galaxies at high redshifts due to the observed smooth power-law form of the background spectrum, $I_\nu \propto \nu^{2.64}$ for $\nu < 500$ GHz (Fixsen et al., 1998), which originates at moderate to high redshifts, on account of the submm-wave K correction. The shape of the background radiation spectrum at frequencies greater than about 100 GHz can be approximated quite accurately by associating an evolving comoving volume emissivity ($\rho_L$ in units of W m⁻³) with an SED that peaks at a single frequency $\nu_0$, so that $\epsilon_\nu \propto \rho_L(z)\,\delta(\nu - \nu_0)$ (Blain and Longair, 1993b), and then integrating over cosmic volume over a fixed angle on the sky. If the SED, via $\nu_0$, is assumed not to evolve strongly with redshift (there is no clear evidence that it does), then in order to reproduce the observed slope of the mm/submm background spectrum, $\rho_L(z) \propto (1+z)^{-1.1}$ is required for $z \gg 1$, and so the comoving luminosity density of dust-enshrouded galaxies must decline at large redshifts. If it did not decline, then the background spectrum measured
Fig. 11. The observed intensity of cosmic background radiation between the radio and far-UV wavebands. The great majority of the background energy density in the Universe derived from sources other than the CMB is represented in this figure. Almost all of the rest appears in the X-ray waveband. Some significant uncertainty remains, but the combination of measurements and limits indicates that a comparable amount of energy is incorporated in the far-IR background, which peaks at a wavelength of about 200 μm, and in the near-IR/optical background, which peaks at a wavelength between 1 and 2 μm. The data originates from a wide range of sources: 1. Fixsen et al. (1998); 2. Puget et al. (1996); 3. Blain et al. (1999a); 4. Schlegel et al. (1998); 5. Hauser et al. (1998); 6. Lagache et al. (2000a) see also Kiss et al. (2001); 7. Puget et al. (1999); 8. Kawara et al. (1998); 9. Finkbeiner et al. (2000); 10. Stanev and Franceschini (1998); 11. Altieri et al. (1999); 12. Dwek and Arendt (1998); 13. Wright and Johnson (2002); 14. Pozzetti et al. (1998); 15. Bernstein (1999) and Bernstein et al. (2002); 16. Toller et al. (1987); 17. Armand et al. (1994); 18. Lampton et al. (1990); and 19. Murthy et al. (1999). For a detailed review of cosmic IR backgrounds see Hauser and Dwek (2001). Note that Lagache et al. (2000a, b) claim that the Finkbeiner et al. points (9) could be affected by diffuse zodiacal emission. Where multiple results are available in the literature the most sensitive result is quoted.
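The link between the slope of the mm/submm background spectrum and the high-redshift evolution of the luminosity density, discussed in the main text, can be sketched analytically. The outline below assumes the delta-function SED described in the text and, as an additional simplification made here, a matter-dominated expansion with H(z) ∝ (1+z)^{3/2} at high redshift; the exponent γ is a symbol introduced for this sketch.

```latex
I_\nu = \frac{c}{4\pi}\int \epsilon_{\nu(1+z)}(z)\,\left|\frac{\mathrm{d}t}{\mathrm{d}z}\right|\mathrm{d}z,
\qquad
\epsilon_{\nu'}(z) \propto \rho_L(z)\,\delta(\nu' - \nu_0),
\qquad
\left|\frac{\mathrm{d}t}{\mathrm{d}z}\right| = \frac{1}{(1+z)\,H(z)}.

\delta\big(\nu(1+z) - \nu_0\big) = \frac{1}{\nu}\,\delta(z - z_0),
\quad 1+z_0 = \frac{\nu_0}{\nu}
\;\Rightarrow\;
I_\nu \propto \frac{\rho_L(z_0)}{\nu\,(1+z_0)\,H(z_0)}
\propto \nu^{-1}\,\rho_L(z_0)\,(1+z_0)^{-5/2}.

\rho_L \propto (1+z)^{\gamma}
\;\Rightarrow\;
I_\nu \propto \nu^{-1}\left(\frac{\nu_0}{\nu}\right)^{\gamma - 5/2} \propto \nu^{3/2-\gamma},
\qquad
\tfrac{3}{2} - \gamma = 2.64 \;\Rightarrow\; \gamma \simeq -1.1.
```

Under these assumptions the observed spectral index of 2.64 corresponds to a luminosity density that declines approximately as (1+z)^{-1.1}, in line with the value quoted in the text; a constant luminosity density (γ = 0) would give the flatter spectrum I_ν ∝ ν^{1.5}.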
by COBE would be too flat, with too much energy appearing at long wavelengths. This argument has been made using Monte-Carlo simulations of $\rho_L(z)$ by Gispert et al. (2000). A similar set of simulations has been carried out by Eales et al. (2000), taking into account the observed background radiation spectrum, counts and inferred redshift distribution of submm-selected galaxies.

An approximately equal fraction of the cosmic background radiation energy density emerges in the near-IR/optical and far-IR wavebands (Fig. 11). Because dusty galaxies do not dominate the total volume emissivity at low redshifts (Sanders, 1999; Yun et al., 2001), the volume emissivity of dusty galaxies must increase by a factor of at least 10, matching the significant evolution of the population of galaxies observed in the optical waveband at z < 1 (Lilly et al., 1996), to avoid the intensity of the far-IR background radiation being significantly less than observed. Only a very small fraction of the total far-IR luminosity from all low-redshift galaxies comes from galaxies more luminous than 10¹² L⊙, yet as discussed above in the context of the submm-wave counts, these luminous galaxies are much more numerous at high redshifts, by a factor of several hundred. These twin constraints demand that the form of evolution of the luminosity function of dusty galaxies
cannot be pure density evolution, a simple increase in the comoving space density of all far-IR-luminous galaxies. If the counts were to be reproduced correctly in such a model, then the associated background radiation spectrum would be much greater than observed. A form of evolution similar to pure luminosity evolution, in which the comoving space density of galaxies remains constant, but the value of L*, the luminosity that corresponds to the knee in the luminosity function, increases (in this case by a factor of order 20) is consistent with both the submm-wave counts and background intensity.

By a more rigorous process, taking into account all available information, including the need to normalize the results to the observed low-redshift population of dust-enshrouded galaxies from the IRAS luminosity function and the populations of galaxies observed by ISO at z ≃ 1, the evolution of the luminosity density $\rho_L$ can be constrained. The results have been discussed by Blain et al. (1999b, c), as updated in Smail et al. (2002), and by Eales et al. (2000). They are discussed further in Section 5 below.

3.1. Confusion

Source confusion, the contribution to noise in an image due to the superimposed signals from faint unresolved sources clustering on the scale of the observing beam (Condon, 1974; Scheuer, 1974), is a significant problem for observations in the submm waveband (Blain et al., 1998; Eales et al., 2000; Hogg, 2001). This is due to the relatively coarse (≃10 arcsec) spatial resolution currently available. In fact, a significant fraction of the noise in the deepest 850-μm SCUBA image of the HDF-N (Hughes et al., 1998) can be attributed to confusion (Peacock et al., 2000). At present, the practical confusion limit for galaxy detection in SCUBA observations at the atmospherically favored 850-μm wavelength is about 2 mJy.
This limit makes it difficult to determine accurate sub-arcsec positions for the centroids of the submm emission from faint SCUBA-selected galaxies, rendering follow-up observations more challenging. Unfortunately, experience has shown that many known high-redshift galaxies, especially optically selected LBGs, are typically fainter than the confusion limit, and so are difficult to study using SCUBA. The variety of count data for dusty galaxies shown in Figs. 9 and 10 can be used to estimate the effect of source confusion in observations made at a wide range of frequencies and angular scales. The distribution of flux density values from pixel to pixel in an image due to confusion noise depends on the underlying counts of detected galaxies. Confusion noise is always an important factor when the surface density of sources exceeds about 0.03 beam⁻¹ (Condon, 1974). The results of a confusion simulation for the 14-arcsec SCUBA 850-μm beam are shown in Fig. 12: see also Hogg (2001), Eales et al. (2000) and Scott et al. (2002). Note that Eales et al. assume a very steep count and obtain larger values of confusion noise than those shown here. Observations made with finer beams at the same frequency suffer reduced confusion noise, while for those made in coarser beams the effects are more severe: compare the results for the much larger 5-arcmin beam in the three highest frequency submm-wave observing bands of the planned Planck Surveyor space mission all-sky survey shown in Blain (2001a). The simulated confusion noise distribution is non-Gaussian (Fig. 12), but can be represented quite accurately by a log-normal distribution, leading to many more high-flux-density peaks in an image than expected assuming a Gaussian distribution of the same width. The width of the central peak of the distribution in flux density is approximately the same as the flux density at which the count of
Fig. 12. Histograms showing the simulated effects of confusion noise in deep SCUBA integrations at 850 μm. Left: the expected distribution of pixel flux densities when the telescope samples the sky in a standard (−0.5, 1, −0.5) chopping scheme, with no additional noise terms present. The flux distribution is non-Gaussian, with enhanced high- and low-flux tails as compared with the overplotted Gaussian, which has the width predicted by simple calculations (Fig. 13). Right: the same confusion noise distribution, but convolved with Gaussian instrument and sky noise with an RMS value of 1.7 mJy, which is typical of the noise level in the SCUBA Lens Survey (Smail et al., 2002). At this noise level, the additional effect of confusion noise is small.
sources exceeds 1 beam⁻¹. This provides a useful indication of the angular scales and frequencies for which confusion noise is likely to be significant, and of the limit imposed on the effective depth of surveys by confusion for specific instruments: see Fig. 13.

3.1.1. Confusion and follow-up observations of submm galaxies

The real problem of confusion for identifying and conducting multiwaveband studies of submm-selected galaxies is illustrated by the results of the first generation of surveys. The very deepest optical image that matches a submm-wave survey is the HDF-N, in which there are several tens of faint optical galaxies (at R > 26) that could be the counterpart to each SCUBA detection (Hughes et al., 1998; Downes et al., 1999a). It is thus impossible to be certain that a correct identification has been made from the submm detection image and optical data alone: compare the identifications in Smail et al. (1998a, 2002). In some cases, EROs and faint non-AGN radio galaxies (Smail et al., 1999, 2000; Gear et al., 2000; Lutz et al., 2001) can be associated with submm galaxies, especially after higher-resolution mm-wave interferometry observations have provided more accurate astrometry for the submm detection (Downes et al., 1999b; Frayer et al., 2000; Gear et al., 2000; Lutz et al., 2001). Such observations reduce the effects of submm confusion, but require the investment of significant amounts of observing time. The surface density of both EROs and faint non-AGN radio galaxies is less than that of the faintest optical galaxies, and so the probability of a chance coincidence between one and a submm galaxy is reduced. A very red color and detectable radio emission from a high-redshift galaxy are both likely to indicate significant star-formation/AGN activity and/or dust extinction, making such galaxies better candidate counterparts even in the presence of confusion-induced positional uncertainties.
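The argument that a lower surface density of candidate counterparts reduces the chance of a spurious association can be illustrated with a simple Poisson estimate. The sketch below is not taken from the analyses cited here: the function name, the search radius and both surface densities are hypothetical values chosen only for illustration, and the candidates are assumed to be unclustered and uniformly distributed on the sky.

```python
import math


def chance_coincidence_probability(density_per_arcmin2, search_radius_arcsec):
    """Probability that at least one unrelated source falls inside the search circle.

    Assumes unclustered (Poisson) sources with a uniform surface density.
    """
    radius_arcmin = search_radius_arcsec / 60.0
    expected_sources = density_per_arcmin2 * math.pi * radius_arcmin ** 2
    return 1.0 - math.exp(-expected_sources)


# Hypothetical surface densities (per square arcminute), for illustration only.
FAINT_OPTICAL_DENSITY = 500.0
ERO_DENSITY = 1.0
SEARCH_RADIUS_ARCSEC = 3.0  # assumed positional uncertainty of the submm centroid

if __name__ == "__main__":
    for label, density in [("faint optical", FAINT_OPTICAL_DENSITY), ("ERO", ERO_DENSITY)]:
        p = chance_coincidence_probability(density, SEARCH_RADIUS_ARCSEC)
        print(f"{label:>14s}: P(chance) = {p:.3f}")
```

With any such choice of numbers, the rarer population yields the smaller chance-coincidence probability for a fixed search radius, which is the sense in which EROs and faint radio sources make more secure counterparts than the faintest optical galaxies.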
Fig. 13. An approximate measure of the 1-σ confusion noise expected as a function of both observing frequency and angular scale from the mm to mid-IR waveband (updated from Blain et al., 1998). The contributions from extragalactic and Galactic sources are shown in the left and right panels, respectively. Radio-loud AGN may make a significant contribution to the top left of the jagged solid line (Toffolatti et al., 1998). A Galactic cirrus surface brightness of B₀ = 1 MJy sr⁻¹ at 100 μm is assumed. The ISM confusion noise is expected to scale as B₀^1.5 (Helou and Beichman, 1990; Kiss et al., 2001). The bands and beamsizes of existing and future experiments (see Tables 1 and 2) are shown by: circles, Planck Surveyor; squares, BOOMERANG; empty stars, the SuZIE mm-wave Sunyaev–Zeldovich instrument; triangles, BOLOCAM, as fitted to CSO (upper 3 points) and the 50-m Large Millimeter Telescope (LMT; lower 3 points); filled stars, SCUBA (and SCUBA-II); diamonds, Herschel; asterisks, Stratospheric Observatory for Infrared Astronomy (SOFIA); crosses, SIRTF. The resolution limits of the interferometric experiments ALMA and SPECS lie far below the bottom of the panels. The confusion performance of the 2.5-m aperture BLAST balloon-borne instrument is similar to that of SOFIA. Confusion from extragalactic sources is expected to dominate over that from the Milky Way ISM for almost all of these instruments.
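The rule of thumb used in Section 3.1, that the width of the confusion noise distribution is roughly the flux density at which the count reaches 1 source per beam, can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the calculation behind Fig. 13: it assumes a single power-law integral count N(>S) = N0 (S/S0)^(−α) and a circular Gaussian beam, and the count parameters N0, S0 and α are hypothetical placeholders rather than fits to the counts in Figs. 9 and 10.

```python
import math

ARCSEC2_PER_DEG2 = 3600.0 ** 2


def gaussian_beam_area_deg2(fwhm_arcsec):
    """Solid angle of a circular Gaussian beam, in square degrees."""
    area_arcsec2 = math.pi * fwhm_arcsec ** 2 / (4.0 * math.log(2.0))
    return area_arcsec2 / ARCSEC2_PER_DEG2


def cumulative_count_per_deg2(flux_mjy, n0_per_deg2, s0_mjy, alpha):
    """Assumed power-law integral count N(>S) = N0 * (S / S0)**(-alpha)."""
    return n0_per_deg2 * (flux_mjy / s0_mjy) ** (-alpha)


def flux_at_sources_per_beam(fwhm_arcsec, n0_per_deg2, s0_mjy, alpha, sources_per_beam=1.0):
    """Flux density at which N(>S) times the beam area equals `sources_per_beam`."""
    beam = gaussian_beam_area_deg2(fwhm_arcsec)
    return s0_mjy * (n0_per_deg2 * beam / sources_per_beam) ** (1.0 / alpha)


# Hypothetical count parameters, not fitted to any data set.
N0_PER_DEG2 = 1000.0
S0_MJY = 5.0
ALPHA = 1.5

if __name__ == "__main__":
    for fwhm in (7.0, 14.0, 300.0):
        s_conf = flux_at_sources_per_beam(fwhm, N0_PER_DEG2, S0_MJY, ALPHA)
        print(f"FWHM = {fwhm:6.1f} arcsec -> 1 source per beam at S = {s_conf:.3g} mJy")
```

For any positive slope α, the flux density at 1 source per beam rises with beam size, which reproduces the qualitative statement that finer beams suffer less confusion and coarser beams more. Setting `sources_per_beam=0.03` gives the flux density corresponding to the Condon (1974) threshold instead.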
3.2. Multi-waveband follow-up studies

A great deal of telescope time has been spent so far to detect and study submm-selected galaxies in other wavebands. In many cases, rich archival data predated the submm observations: most notably in HDF-N (Hughes et al., 1998). Considerable data were also available in the fields of rich clusters (Smail et al., 1997, 1998a, 2002; Cowie et al., 2002), in the region of the Eales et al. (1999, 2000) surveys in Canada–France Redshift Survey (CFRS) fields, which include the Groth Strip, and in the deep Hawaii survey fields (Barger et al., 1999a). The results of follow-up deep optical and near-IR imaging (Frayer et al., 2000) and spectroscopy (Barger et al., 1999b), mm-wave continuum imaging (Downes et al., 1999a; Bertoldi et al., 2000; Frayer et al., 2000; Gear et al., 2000; Lutz et al., 2001; Dannerbauer et al., 2002) and molecular line spectroscopy (Frayer et al., 1998, 1999; Kneib et al., 2002) have been published, and many additional studies are under way. The time spent following up the SCUBA Lens Survey (Smail et al., 2002) exceeds by almost an order of magnitude
Fig. 14. Multi-waveband images of SMM J02399−0136 (S₈₅₀ = 23 mJy; Ivison et al., 1998b; Frayer et al., 1998). The format of this figure is the template for those that follow. Note that these multiwaveband figures are presented in order of decreasing 850-μm flux density, without correcting for gravitational lensing amplification. The leftmost panel shows black contours of 850-μm emission superimposed on a grayscale I-band image. The second panel shows black contours of faint 1.4-GHz radio emission superimposed on a K-band image. These two left-hand images are both 30 arcsec on a side. The third panel shows a 10-arcsec zoom of the K-band image (from UKIRT unless otherwise stated; Smail et al., 2002). The rightmost panel shows a B-band CFHT image in this figure; in the figures that follow this panel shows an HST image. Here and in the figures that follow, white contours are added to show contrast in saturated regions of the grayscale. North is up and East is to the left. SMM J02399−0136 is a merging galaxy with a confirmed optical/radio counterpart, and a CO redshift z = 2.808: see Vernet and Cimatti (2001) for a new high-quality spectrum showing Lyman-α emission from this galaxy extended over 12 arcsec.
Fig. 15. Images of SMM J00266+1708 (18.6 mJy; Frayer et al., 2000). The left-hand K-band image is from UKIRT; the right-hand K-band image is from Keck-NIRC. The K-band detection is located at the position of the very red galaxy M12 in a 1.1-mm continuum image obtained using the OVRO MMA.
the time required to make the submm discovery observations (Smail et al., 1997). The difficulty of the task is highlighted by the identification of plausible counterparts to these galaxies being only about 60% complete over 4 yr later at the start of 2002. The follow-up results from a well-studied subset of galaxies in the SCUBA Lens Survey are shown in Figs. 14–20, in order of decreasing 850-μm flux density. These are chosen neither to be a representative sample of submm galaxies, nor to be a sufficiently large sample for statistical studies, but rather to present a flavor of the range of galaxies that can be detected in submm-wave surveys for which high-quality multi-waveband data are available. The galaxies that are presented are chosen to have good positional information, and redshifts where possible. Other detections for which excellent multi-wavelength follow-up data are available include the brightest source in the HDF-N SCUBA image (Hughes et al., 1998; Downes et al., 1999a), an ERO detected in the CUDSS survey by Eales et al. (1999) (Gear et al., 2000),
Fig. 16. Images of SMM J09429+4658 (17.2 mJy), an ERO counterpart (Smail et al., 1999). Faint radio emission and extended, rather bright K-band emission make this a good candidate for the source of the submm emission. H1 is a low-redshift spiral galaxy in the foreground of Abell 851.
Fig. 17. Multi-waveband images of SMM J14009+0252 (14.5 mJy), the bright radio-detected submm galaxy in Abell 1835 (Fig. 1; Ivison et al., 2000a). Two faint near-IR counterparts can be seen in the K-band image. Of these, J5 is extremely red, has no counterpart in the HST-F702W image, and is aligned accurately with the centroid of the radio emission.
Fig. 18. Multi-waveband images of SMM J14011+0252 (12.3 mJy; Ivison et al., 2000b, 2001). This complex merging system has a confirmed optical/radio counterpart, and a CO redshift z = 2.565 (Frayer et al., 1999). High-resolution CO and radio images are presented in Ivison et al. (2001). Note that the Northern extension of J1 is extremely red, and is close to the centroid of the radio emission. J2 is blue, while J1 is red. The complexity of this system is a caution against simple treatment of extinction as a uniform screen in submm galaxies: for a detailed discussion see Goldader et al. (2002) and references therein.
a z = 2.8 QSO in a cluster field (Kraiberg Knudsen et al., 2001), and the substantially overlapping catalogs of galaxies detected by Cowie et al. (2002) in deeper images of 3 of the 7 clusters in the Smail et al. lens survey.
Fig. 19. Multi-waveband images of SMM J02399−0134 (11.0 mJy; Kneib et al., 2002). This ring galaxy has a confirmed optical/radio counterpart, and a CO redshift z = 1.06. Its low redshift accounts for its very bright K-band image and mid-IR ISO detection at 15 μm. The other galaxy in the K-band image is a member of Abell 370.
Fig. 20. Images of SMM J04431+0210 (7.2 mJy), a very red counterpart (Smail et al., 1999). A tentative Hα redshift of z = 2.5 is determined from a near-IR Keck-NIRSPEC observation (Frayer et al., 2002). Unlike SMM J09429+4658, this galaxy has no radio emission.
3.2.1. Optical/near-IR

The properties of submm galaxies in the optical waveband, corresponding to the rest-frame UV waveband, appear to be very diverse (Ivison et al., 2000a). This may be due in part to their expected broad redshift distribution. However, given that two submm galaxies at the reasonably high redshifts z = 2.5 and 2.8 are known to be readily detectable at B ∼ 23 (Ivison et al., 1998a, 2000a, 2001), before correcting each for the magnitude (factor of about 2.5) of amplification due to the foreground cluster lens, while most others are very much fainter (Smail et al., 2002; Dannerbauer et al., 2002), it is likely that much of the spread in their observed properties is intrinsic. As most counterparts are extremely faint, confirmation of their nature requires a large, completely identified sample of submm galaxies with known redshifts, which is likely to be some time away. It is possible that optically faint submm galaxies have similar properties, an issue that can be addressed when deep near-IR observations are available. Ivison et al. (2000a) and Smail et al. (2002) proposed a 3-tier classification system to stress the varied nature of submm galaxies (three being an eminently sensible number of classes for 15 galaxies!). Class-0 galaxies are extremely faint in both the observed optical and near-IR wavebands. Class-1 galaxies are EROs, very faint in the optical but detectable in the near-IR, while Class-2 galaxies are relatively bright in both bands. It is unclear how closely this scheme reflects the underlying astrophysics of the submm galaxies; however, the classification separates the optically bright
galaxies (Class 2s), for which the acquisition of optical redshifts and confirming CO redshifts are likely to be practical, and the fainter galaxies, for which this will be a great challenge (Class 0s). A similar approach for MAMBO sources has been discussed by Dannerbauer et al. (2002). Note that submm galaxies could change classification by having different redshifts, despite identical intrinsic SEDs.

3.2.2. Ultradeep radio images

The surface density of the faintest radio sources that can be detected using the VLA (Richards, 2000) is significantly less than that of optical galaxies, and so an incorrect radio counterpart to a submm-selected galaxy is relatively unlikely to be assigned by chance. If 1.4-GHz VLA images are available at a flux limit approximately 1000 times deeper than 850-μm images, then the radio counterparts to non-AGN submm galaxies should be detectable to any redshift (see Fig. 7). Deep radio follow-up observations of submm-selected galaxies yielding cross identifications have been discussed by Smail et al. (2000), and further information about very faint radio sources in the field of the UK 8-mJy SCUBA survey (Fox et al., 2002; Scott et al., 2002) should soon be available in Ivison et al. (2002). Despite an extremely deep radio image (Richards, 2000), the brightest submm galaxy detected in HDF-N (Hughes et al., 1998) does not have a radio detection, probably indicating a very high redshift. The survey results reported by Eales et al. (2000) and Webb et al. (2002a) were discussed in the context of radio data covering the same fields; however, the radio survey is not deep enough to detect a significant fraction of the relatively faint submm sources. As can be seen for the specific submm galaxies shown in Figs. 14–20, when they are available, deep high-resolution radio images are very useful for determining accurate positions and even astrophysical properties of submm galaxies: see Ivison et al. (2001).

3.2.3.
CO rotation line emission and continuum mm-wave interferometry

The detection of CO line emission from submm-selected galaxies is a crucial step in the confirmation of their identification. It has been demonstrated in only three cases so far (see Figs. 14, 18 and 19), using the OVRO Millimeter Array (MMA; Frayer et al., 1998, 1999), in one case in combination with the BIMA array (Ivison et al., 2001), and the IRAM PdBI (Kneib et al., 2002). These observations are very time-consuming, typically requiring tens of hours of observing time. The 1 GHz instantaneous bandwidth of existing line-detection systems also means that a redshift accurate to at least 0.5% must be known before attempting a CO detection. In other cases, continuum emission is detected using the interferometers, confirming the reality of the initial submm detection and providing a better position (Downes et al., 1999a; Bertoldi et al., 2000; Frayer et al., 2000; Gear et al., 2000; Lutz et al., 2001; Dannerbauer et al., 2002), but no absolute confirmation of a correct optical/near-IR identification or a crucial redshift. The ALMA interferometer array will have the collecting area and bandwidth to make rapid searches for CO line emission in the direction of known submm continuum sources from about 2010. Specialized wide-band mm-wave spectrographs to search for multiple high-redshift CO lines separated by 115 GHz/(1 + z) are currently under development (Glenn, 2001). Wide-band cm-wave receivers at the 100-m clear-aperture Green Bank Telescope (GBT) could detect highly redshifted 115-GHz CO(1 → 0) line emission.
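The link between receiver bandwidth and the redshift accuracy needed for a CO search can be made concrete with a short calculation. The sketch below is illustrative only: it approximates the rest frequency of CO(J → J−1) as J × 115.27 GHz (a rigid-rotor approximation, adequate at this level of precision), and it assumes that the receiver is tuned to the predicted line frequency so that the line centre may drift by at most half the bandwidth. The function names are hypothetical.

```python
CO_1_0_REST_GHZ = 115.27  # rest frequency of CO(1->0), in GHz


def observed_co_frequency_ghz(upper_j, redshift):
    """Observed frequency of CO(J -> J-1), approximating the rest ladder as J * 115.27 GHz."""
    return upper_j * CO_1_0_REST_GHZ / (1.0 + redshift)


def required_redshift_accuracy(observed_ghz, bandwidth_ghz=1.0):
    """Largest fractional error dz / (1 + z) that keeps the line centre inside the band.

    Assumes the receiver is tuned to the expected line frequency, so the line
    may drift by at most half the bandwidth in either direction.
    """
    return 0.5 * bandwidth_ghz / observed_ghz


if __name__ == "__main__":
    z = 2.808  # CO redshift quoted for SMM J02399-0136
    for upper_j in (1, 2, 3, 4):
        nu = observed_co_frequency_ghz(upper_j, z)
        tol = required_redshift_accuracy(nu)
        print(f"CO({upper_j}->{upper_j - 1}): {nu:7.2f} GHz, dz/(1+z) <= {100 * tol:.2f}%")
```

Because dν/ν = −dz/(1 + z), a 1 GHz band centred near 100 GHz tolerates a fractional redshift error of about 0.5%, of the same order as the accuracy quoted above. Adjacent lines in the ladder are separated by 115.27 GHz/(1 + z) in the observed frame, which is the spacing a wide-band spectrograph would exploit.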
3.2.4. X-ray observations

Based on synthesis models of the X-ray background radiation intensity (Fabian and Barcons, 1992; Hasinger et al., 1996), Almaini et al. (1999) and Gunn and Shanks (2002) suggested that 10–20% of the submm galaxy population could be associated with the hard X-ray sources that contribute this background. Observations of fields with common deep Chandra and SCUBA images were discussed in Section 2.8. The small degree of observed overlap between the submm and X-ray sources implies that if a significant fraction of submm galaxies are powered by accretion in AGN, then the accretion must occur behind an extremely thick absorbing column of gas, and less than 1% of the X-ray emission from the AGN can be scattered into the line of sight (Fabian et al., 2000; Barger et al., 2001; Almaini et al., 2002). In order to avoid detection using SCUBA, high-redshift hard X-ray Chandra sources must either contain a very small amount of gas and dust, and thus have only a small fraction of their energy reprocessed into the far-IR waveband, which seems unlikely given their hard spectra; or they must contain dust at temperatures much higher than appears to be typical for submm-selected galaxies. The detection of Chandra sources in Abell 2390 using ISO at 15 μm, but not using SCUBA at 850 μm, argues in favor of at least some Chandra sources having very hot dust temperatures (Wilman et al., 2000). Comparison of larger, deep ISO 15-μm images with Chandra images shows that most of the faint, red AGN detected by Chandra are detected in the mid-IR (Franceschini et al., 2002). This should be readily confirmed using wide-field sensitive mid-IR observations of Chandra fields using SIRTF, images which will also yield well-determined SEDs for the dust emission from the detected galaxies.

3.2.5. Mid- and far-IR observations

Distant submm galaxies are too faint at far- and mid-IR wavelengths to have been detected in the all-sky IRAS survey.
However, the first submm-selected galaxies were detected while the next-generation ISO space observatory was still operating, and there were both late-time ISO observations of submm fields, and some serendipitous overlap of fields. In general, the small aperture and small-format detector arrays of ISO still led to relatively little overlap between SCUBA and ISO galaxies, for example in the HDF-N (Hughes et al., 1998; Elbaz et al., 1999) and Abell 2390 (Fabian et al., 1999; Wilman et al., 2000). The brightest SCUBA galaxies in the cluster Abell 370 (Ivison et al., 1998a) have fluxes measured by ISO at 15 μm (Metcalfe, 2001), providing valuable, and otherwise difficult to obtain, constraints on their short-wavelength SEDs (Fig. 2), and thus their dust temperatures. The much larger detector arrays aboard SIRTF should allow many more constraints to be imposed on the mid-IR SEDs of submm galaxies. Existing deep submm fields are included for imaging within the SIRTF guaranteed time programs, and individual submm galaxies have been targeted for mid-IR spectroscopy.¹²

3.3. A gallery of follow-up results

In this section we show some of the submm-selected galaxies with the most complete and comprehensive follow-up information, including all three with confirmed redshifts (Figs. 14, 18 and 19).

12
Details of SIRTF observing programs can be found at sirtf.caltech.edu.
To reveal more of the diversity of counterparts to the SCUBA galaxies, we also show a relatively strong radio source with a faint red K-band counterpart (Fig. 17), two K ∼ 19.5 galaxies, one a formal ERO and the other with very red colors (Smail et al., 1999; Figs. 16 and 20), which are likely to be the correct counterparts on the grounds of the relatively low surface density of EROs; and a mm-continuum source located by the OVRO MMA at the position of a SCUBA-selected galaxy, with a very faint K-band counterpart observed using the NIRC instrument on the Keck telescope (Frayer et al., 2000; Fig. 15). Similar enigmatic faint red galaxies have been reported as counterparts to well-located submm-selected galaxies from other submm surveys by Gear et al. (2000) and Lutz et al. (2001), while Dannerbauer et al. (2002) do not find counterparts to three well-located MAMBO galaxies to a 3-σ limit K_s = 21.9. The galaxies shown in Figs. 14–20 are certainly an unrepresentative sample of submm-selected galaxies, missing galaxies that are either intrinsically very faint at other wavelengths or lie at the highest redshifts. It is to be hoped that within the next few years, deep follow-up observations, especially near-IR ground-based observations¹³ and mid-IR observations using SIRTF, will reveal the nature of the majority of submm-selected galaxies.¹⁴

3.4. Clustering properties

The deep submm-wave surveys made to date provide relatively little information about the spatial distribution of the detected galaxies. There is some indication from the UK 8-mJy SCUBA survey (Almaini et al., 2002; Fox et al., 2002; Ivison et al., 2002; Scott et al., 2002) and from the widest-field MAMBO surveys (Carilli et al., 2001) that the clustering strength of SCUBA-selected galaxies is greater than that of faint optically selected galaxies, yet less than that of K ∼ 20 ERO samples (Daddi et al., 2000). Webb et al.
(2002b) point out that the angular clustering signal expected in submm surveys is likely to be suppressed by smearing in redshift, as the submm galaxies should have a wide range of redshifts, and so the spatial correlation function might in fact be stronger than that of the EROs. As most K ∼ 20 EROs are evolved elliptical galaxies at z ∼ 1, which are likely to be amongst the first galaxies to form in the most overdense regions of the Universe, their strong clustering is easily explained. However, a definitive result on the clustering of submm galaxies awaits a much larger sample of galaxies than considered by Scott et al. (2002) and Webb et al. (2002b). Characteristic brightness fluctuations on the angular scales expected from faint unresolved dusty galaxies have been found in both the 850-μm SCUBA image of the HDF-N (Peacock et al., 2000) and in deep, confused 175-μm ISO images (Lagache and Puget, 2000; Kiss et al., 2001). Haiman and Knox (2000) have discussed the details of measurements of the correlation function of unresolved submm galaxies on arcminute angular scales in the context of CMB experiments, finding that the correlated signal can carry important information about the nature and evolution of the submm galaxy population. A simpler investigation by Scott and White (1999) drew similar conclusions, while there is further discussion by Magliocchetti et al. (2001).

13
Near-IR imaging to 2σ K ∼ 24 of all of the SCUBA Lens Survey submm galaxies is underway using the NIRC camera at the Keck telescopes: see Fig. 15.
14 More information about this sample can be found in the catalog paper of the SCUBA Lens Survey (Smail et al., 2002 and references therein).
4. Submm galaxy luminosity functions and their relationship with other populations

Submm-selected galaxies are an important component of the Universe, but are typically very faint in other wavebands, and so difficult to study. This immediately implies that the submm population does not overlap significantly with other types of high-redshift galaxies, although it may be possible to infer their properties if detailed information about these other classes is available (Adelberger and Steidel, 2000). The likely lack of overlap is reinforced by the relatively low surface density of submm galaxies as compared with the faintest optically selected galaxies. Nevertheless, a detailed understanding of the process of galaxy formation demands that the relationship of the submm galaxies to other populations of high-redshift galaxies is determined.

4.1. Optically selected Lyman-break galaxies (LBGs)

The LBGs (Steidel et al., 1999) are sufficiently numerous to have a well-defined luminosity function (Adelberger and Steidel, 2000). The effects of dust extinction on the inferred luminosity of a small subset of LBGs have been estimated reliably from near-IR observations of Hα emission: corrections by factors of 4–7 are indicated (Pettini et al., 1998, 2001; Goldader et al., 2002). At present, it is difficult to confirm this degree of extinction directly, as attempts to detect LBGs using submm-wave instruments have not so far been successful: see Section 2.8. Observations suggest that a typical LBG has an 850-μm flux density of order 0.1 mJy, well below the confusion limit at the resolution of existing submm-wave images. Adelberger and Steidel (2000) have discussed the various selection effects associated with submm, optical and faint radio selection of high-redshift galaxy samples.
They assumed that the relation between the slope of the UV SED of a galaxy and the fraction of its luminosity emitted in the far-IR waveband that is observed for low-redshift IUE starbursts with luminosities less than about 10¹¹ L⊙ (Meurer et al., 1999) holds at greater redshifts and luminosities. A common, smooth luminosity function can then account for the properties of LBGs and submm galaxies. A priori, there must be an underlying multi-waveband luminosity function of all high-redshift galaxies from which both classes of galaxies are drawn. However, while observations of some submm galaxies (a key example being SMM J14011+0252; Ivison et al., 2000a, 2001; Fig. 18) seem to support this interpretation at first sight, it is clear that only the J2 region of this galaxy would be identified as an LBG, while the submm emission is concentrated nearer to J1. Further discussion can be found in Goldader et al. (2002). Because of the apparent diversity of optical–submm properties of submm galaxies (Ivison et al., 2000a; Smail et al., 2002), this simple transformation is unlikely to hold. Hence, a fraction of submm galaxies will probably never be detected in rest-frame UV continuum surveys because of their extreme faintness. The confirmed ERO submm galaxies (Smail et al., 1999; Gear et al., 2000; Lutz et al., 2001) are clear examples of such a population.

4.2. Extremely red objects (EROs)

The development of large format near-IR detectors has enabled relatively deep, wide-field IR surveys, and led to the discovery of a class of faint EROs (galaxies with colors in the range R − K > 5.5–6), supplementing traditional low-mass stellar EROs (Lockwood, 1970). At first EROs were found one by one (Hu and Ridgway, 1994; Graham and Dey, 1996), but statistical samples
are now detected in K < 20 near-IR surveys (Thompson et al., 1999; Yan et al., 2000; Daddi et al., 2000; Totani et al., 2001), in parallel to their identification in multiwaveband surveys (Smail et al., 1999; Pierre et al., 2001; Smith et al., 2001; Gear et al., 2000; Lutz et al., 2001). The clustering of relatively bright K < 19.2 EROs (Daddi et al., 2000) is observed to be very strong, fueling speculation that EROs are associated with the deepest potential wells that have the greatest density contrast at any epoch. This has been used as an argument in favor of their association with submm galaxies, which could be good candidates for massive elliptical galaxies in formation (Eales et al., 1999; Lilly et al., 1999; Dunlop, 2001). There are two obvious categories of extragalactic EROs: very evolved galaxies, containing only cool low-mass stars, and strongly reddened galaxies, with large amounts of dust absorption, but which potentially have a very blue underlying SED. Only the second are good candidates for identification with submm-luminous galaxies. Detections of faint radio emission associated with young supernova remnants in EROs, and determinations of signatures of ongoing star-formation in their rest-frame UV colors should allow these cases to be distinguished. Radio and submm follow-up observations of EROs have tended to show that most are passively evolved non-star-forming galaxies without detectable radio emission (Mohan et al., 2002), with at most about 10–20% being candidates for dust-enshrouded starbursts/AGN. It is important to remember that few EROs selected from wide-field near-IR surveys, which reach limits of K ∼ 20, are actively star-forming submm galaxies (see the summary of results in Smith et al., 2001). A small but significant fraction of submm galaxies appear to be associated with EROs at bright magnitudes K < 20 (Smail et al., 1999, 2002).
Many more submm galaxies probably fit the ERO color criterion, but at much fainter magnitudes; see for example the K = 22.5 SCUBA galaxy shown in Fig. 15 (Frayer et al., 2000). It is certainly possible that future fainter ERO samples with K > 22 could contain a greater fraction of submm-luminous galaxies and fewer passive ellipticals than the K ∼ 20 samples. Note also that EROs have unfavorable K corrections for detection at high redshifts (Dey et al., 1999; Gear et al., 2000): beyond z ∼ 2.5 any ERO would be extremely faint at even near-IR wavelengths. This could account, in part or in whole, for the extreme faintness of counterparts to submm galaxies, if many do have extremely red intrinsic colors.

4.3. Faint radio galaxies

As discussed above, the faintest radio galaxies should be detectable in the submm waveband if the far-IR–radio correlation remains valid at high redshifts. The narrow dispersion of this correlation suggests that submm galaxies and faint radio galaxies are perhaps the most likely populations of high-redshift galaxies to overlap substantially. Surveys made using SCUBA to search for optically faint, and thus presumably high-redshift, galaxies with radio flux densities close to the detection threshold of the deepest radio surveys (Barger et al., 2000; Chapman et al., 2001b) have been used to detect many tens of high-redshift dusty galaxies much more rapidly than blank-field surveys. The selection effects at work when making a radio-detected, optically faint cut from a radio survey are not yet sufficiently well quantified to be sure that these catalogs are representative of all submm galaxies. The typical optical magnitudes of the radio-selected objects with submm detections are clustered around I ∼ 24 and greater. Hence, bright optical counterparts to mJy-level submm galaxies are rare (Chapman et al., 2002b).
Because relatively accurate positions are available from the radio observations, it should be possible to determine spectroscopic redshifts for a significant number of
these galaxies, providing a valuable contribution to our knowledge of the distances to at least a subset of submm galaxies.

4.4. Active galaxies and X-ray sources

An important category of objects that could be associated with submm galaxies are accreting AGN. As discussed in Sections 2.8 and 3.2.4, the overlap between Chandra and SCUBA galaxies does not appear to be very great, at about the 10% level. However, the gravitational energy released when forming the supermassive black holes in the centers of galactic bulges (Magorrian et al., 1998) is likely to be only a few times less than the energy released by stellar nucleosynthesis over the lifetime of the stars in the bulge, and so a case remains for the existence of both a significant population of Compton-thick AGN submm sources with no detectable X-ray emission at energies less than 10 keV, and for hot dusty AGN detectable in the mid- and far-IR but not the submm wavebands (Wilman et al., 2000; Blain and Phillips, 2002).

4.5. Gamma-ray burst (GRB) host galaxies

An interesting new development is the idea that if GRBs are likely to be associated with the deaths of massive stars, then the rate of GRBs and the global high-mass star-formation rate should be linked (Krumholz et al., 1998). By searching for submm emission from the directions of GRBs, it may be possible to test whether either submm or UV-bright galaxies are the dominant population to host high-mass stars, and what fraction of the submm galaxies are powered by non-GRB-generating AGN (Blain and Natarajan, 2000). About 10% of GRBs are expected to be in hosts with 850-μm flux densities greater than 5 mJy (Ramirez-Ruiz et al., 2002), if submm galaxies dominate the cosmic star-formation rate and are not typically powered by AGN. Two excellent candidates for submm-loud host galaxies of GRBs are now known (Berger et al., 2001; Frail et al., 2002), the first based on deep VLA radio images, the second on direct SCUBA and MAMBO mm/submm observations.
Most GRB hosts appear to be associated with R ≃ 25 optical galaxies (Bloom et al., 2002), which could also be typical of the submm galaxy population. It is difficult to detect GRB host galaxies without hitting the confusion limit using SCUBA, but attempts are underway. As a byproduct of surveys for submm afterglow emission, Smith et al. (1999, 2002) imposed limits to the submm host galaxy emission from the direction of 12 GRBs. An ongoing JCMT program (Nial Tanvir et al.) is searching directly for submm emission from the host galaxies of accurately located GRBs (Barnard et al., 2002). Detecting and resolving the submm emission from GRB host galaxies should ultimately be very simple using ALMA, requiring observations of only a few minutes per burst.

4.6. Prospects for the follow-up observations in the future

In order to make detailed submm-wave studies of the astrophysics of high-redshift galaxies, high-resolution images will be required. Existing mm-wave interferometers can provide high-quality images of the brightest submm galaxies (for example Frayer et al., 2000; Gear et al., 2000; Lutz et al., 2001); however, in order to rapidly detect typical LBGs, EROs and hard X-ray sources, the
Table 1
Wavelengths λ, sensitivities (as noise equivalent flux density, NEFD), fields of view (FOV), and confusion limits due to galaxies and the ISM (in brackets) for existing and future ground-based and airborne instruments

Name            λ (μm)     NEFD (mJy/√Hz)   FOV (arcmin²)   Confusion (mJy)
SCUBA           850        80               1.7             0.12 (9 × 10⁻⁴)
                450        160              1.7             0.053 (3 × 10⁻³)
MAMBO (a, b)    1250       95               1.0             0.05 (7 × 10⁻⁵)
SCUBA-II        850        28               64              0.12 (9 × 10⁻⁴)
                450        90               64              0.053 (3 × 10⁻³)
HAWC-SOFIA      200        408              9.0             1.2 (0.30)
BOLOCAM-CSO     1100       42               44              0.32 (2 × 10⁻³)
BOLOCAM-LMT     1100       2.8              2.5             6 × 10⁻³ (4 × 10⁻⁵)
BLAST           750        115              10              3.9 (0.25)
                450        130              10              6.8 (0.7)
                300        150              10              7.8 (1.1)
SMA             850        170              0.2             < 10⁻⁷ (< 10⁻⁶)
                450        1700             0.05            < 10⁻⁸ (< 10⁻⁴)
ALMA            870        1.9              0.050           < 10⁻⁷ (< 10⁻⁶)
                450        11               0.013           < 10⁻⁸ (< 10⁻⁴)
Extended VLA    20.5 cm    0.40             700             ∼ 0 (∼ 0)
SKA             20.5 cm    ∼ 10⁻²           TBD             ∼ 0 (∼ 0)

An estimate of the speed of a survey down to a chosen depth can be obtained by dividing the FOV by the square of the NEFD value. The approximate extragalactic confusion noise values are the flux density at which there is one brighter source per beam (Blain et al., 1998; Fig. 13). This corresponds approximately to the width of the peak in the non-Gaussian confusion noise distribution (see Fig. 12). The expected ISM confusion noise (in brackets) is calculated for a 100-μm surface brightness $B_0 = 1$ MJy sr⁻¹ (Helou and Beichman, 1990), and scales as $B_0^{1.5}$. Other instruments under development that have not published detailed performance estimates include the 350-μm SHARC-II camera for the CSO. The FOV and NEFD values are chosen to provide the correct results for making a fully sampled image of the sky, not measuring the flux from a single galaxy. Updated from Table 1 in Blain (1999b). Relevant references are listed in Table 3. TBD: to be decided.
(a) Note that the FOV of MAMBO is expanded by a factor of 3 for the winter of 2001/2002, with a 117-bolometer detector array.
(b) A similar device, SIMBA, is being commissioned at SEST.
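As a rough illustration of how the NEFD and FOV columns combine, the sketch below takes the speed of reaching a fixed depth to scale as 1/NEFD² for a single source and as FOV/NEFD² for mapping an area (a lower NEFD means a faster survey). This scaling is an assumption chosen to be consistent with the speed gains quoted for SCUBA-II in Section 7.1; the dictionary layout and function names are hypothetical, and only a subset of Table 1 is included.

```python
# Illustrative subset of Table 1, keyed by (instrument, wavelength in microns).
# Values are (NEFD in mJy/sqrt(Hz), FOV in arcmin^2).
TABLE1 = {
    ("SCUBA", 850): (80.0, 1.7),
    ("SCUBA", 450): (160.0, 1.7),
    ("SCUBA-II", 850): (28.0, 64.0),
    ("SCUBA-II", 450): (90.0, 64.0),
}


def point_source_speed(nefd: float) -> float:
    """Relative speed of reaching a fixed depth on one source: 1 / NEFD^2."""
    return 1.0 / nefd ** 2


def mapping_speed(nefd: float, fov: float) -> float:
    """Relative speed of mapping sky area to a fixed depth: FOV / NEFD^2."""
    return fov / nefd ** 2


def gains(new_name: str, old_name: str, wavelength_um: int) -> tuple:
    """Return (point-source gain, mapping gain) of one instrument over another.

    Only instruments at the same wavelength are compared, because the
    depth needed to detect a given galaxy differs between wavebands.
    """
    nefd_new, fov_new = TABLE1[(new_name, wavelength_um)]
    nefd_old, fov_old = TABLE1[(old_name, wavelength_um)]
    point = point_source_speed(nefd_new) / point_source_speed(nefd_old)
    area = mapping_speed(nefd_new, fov_new) / mapping_speed(nefd_old, fov_old)
    return point, area


if __name__ == "__main__":
    for wavelength in (850, 450):
        point, area = gains("SCUBA-II", "SCUBA", wavelength)
        print(f"{wavelength} um: point-source x{point:.1f}, mapping x{area:.0f}")
```

With the tabulated values the point-source gain of SCUBA-II over SCUBA at 850 μm comes out at roughly 8, and the mapping gain is larger still because of the wider field of view.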
additional sensitivity and resolution that should be provided by ALMA is required. Improved information on the SEDs of dusty galaxies will be obtained using the SIRTF, SOFIA, ASTRO-F and Herschel space- and air-borne observatories: see Tables 1 and 2.
Table 2
The equivalent to Table 1 for space-borne instrumentation

Name                    λ (μm)   NEFD (mJy/√Hz)   FOV (arcmin²)   Confusion (mJy)
Herschel-SPIRE          500      114              40              2.9 (0.16)
                        350      90               40              2.6 (0.12)
                        250      84               40              1.6 (0.24)
Herschel-PACS           170      24               6.1             0.80 (0.16)
                        90       24               6.1             0.03 (0.01)
SIRTF-MIPS              160      18               2.5             6.6 (3.1)
                        70       4.5              25              0.28 (0.07)
                        24       1.8              25              6 × 10⁻⁴ (2 × 10⁻⁴)
SIRTF-IRAC              8.0      0.15             26              8 × 10⁻² (∼ 10⁻⁶)
Planck Surveyor         350      26               All-sky         50 (70)
                        550      19               All-sky         22 (12)
                        850      16               All-sky         8.1 (1.6)
SPECS testbed SPIRIT    250      0.17             4               ∼ 10⁻⁵ (∼ 10⁻³)

Note that the values listed for Planck Surveyor apply to an all-sky survey. Another instrument under development, which has not published detailed performance estimates, is the 50–200-μm sky survey from the Japanese ASTRO-F/IRIS satellite.
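The bracketed ISM confusion values in Tables 1 and 2 are quoted for a reference 100-μm surface brightness of 1 MJy sr⁻¹ and, according to the notes to Table 1, scale as the 1.5 power of that brightness. A minimal sketch of rescaling a tabulated value to a different field follows; the function name and the example brightness are hypothetical.

```python
def scale_ism_confusion(conf_ref_mjy: float,
                        b0_mjy_sr: float,
                        b0_ref_mjy_sr: float = 1.0,
                        index: float = 1.5) -> float:
    """Rescale a tabulated ISM confusion value to another surface brightness.

    conf_ref_mjy  : bracketed ISM confusion from the tables (mJy)
    b0_mjy_sr     : 100-micron surface brightness of the target field (MJy/sr)
    b0_ref_mjy_sr : reference brightness used in the tables (1 MJy/sr)
    index         : power-law index of the scaling (1.5 in the table notes)
    """
    if b0_mjy_sr < 0 or b0_ref_mjy_sr <= 0:
        raise ValueError("surface brightness values must be positive")
    return conf_ref_mjy * (b0_mjy_sr / b0_ref_mjy_sr) ** index


if __name__ == "__main__":
    # Herschel-SPIRE at 500 microns: 0.16 mJy at the reference brightness.
    # A field four times brighter (hypothetical) is eight times noisier.
    print(scale_ism_confusion(0.16, 4.0))
```

Because the scaling is steeper than linear, the choice of a low-cirrus field matters most for the instruments whose bracketed values are already comparable to their extragalactic confusion limits.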
5. Modeling the evolution of submm galaxies

As soon as the first submm galaxies were detected in 1997, it was clear that they made a significant contribution to the luminosity density in the high-redshift Universe, subject to the plausible assumptions that their SEDs were similar to those of luminous low-redshift dusty galaxies, and that their redshifts were not typically less than about z = 0.5. These assumptions remain plausible and have been confirmed to an acceptable level by subsequent observations (Smail et al., 2000, 2002). Despite an initial suggestion that 30% of the SCUBA galaxies could be at z < 1 (Eales et al., 1999), it now seems that the median redshift of submm-selected galaxies is of order 2–3 (Eales et al., 2000; Smail et al., 2002). We have already discussed that evolution by a factor of about 20 in the value of L* is required to account for the properties of the submm source counts and background radiation intensity. That is the key result from submm surveys, but how can it be explained in models of galaxy evolution?

In this section we will not describe the modeling process in great detail, but we highlight the key features of such analyses, and the most important future tests of our current understanding. It is important to be aware that there is still considerable uncertainty in the exact form of evolution required to explain the submm observations. While strong luminosity evolution of the dusty galaxy population is required out to z ≃ 1 and beyond, the detailed form of that evolution is rather loosely constrained by count and background data. Redshift distributions are an essential requirement in order to determine the form of evolution accurately.
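To make the quoted factor of about 20 more concrete, the sketch below converts it into a power-law index under the purely illustrative assumption of pure luminosity evolution of the form L*(z) = L*(0)(1 + z)^γ, with the full factor reached by z = 1. Both the functional form and the reference redshift are assumptions made for this example; as noted above, the real form of the evolution is only loosely constrained.

```python
import math


def evolution_index(boost: float, z: float) -> float:
    """Index gamma such that (1 + z)**gamma equals the given boost in L*.

    Assumes pure luminosity evolution, L*(z) = L*(0) * (1 + z)**gamma,
    which is an illustrative parameterization rather than a fitted model.
    """
    if boost <= 0.0 or z <= 0.0:
        raise ValueError("boost and z must be positive")
    return math.log(boost) / math.log1p(z)


def lstar_boost(z: float, gamma: float) -> float:
    """Factor by which L* is raised at redshift z for a given index gamma."""
    return (1.0 + z) ** gamma


if __name__ == "__main__":
    gamma = evolution_index(20.0, 1.0)
    print(f"gamma = {gamma:.2f}")
    print(f"boost at z = 0.5: {lstar_boost(0.5, gamma):.1f}")
```

Under these assumptions the index is a little above 4, which illustrates how steep the required evolution is compared with a non-evolving population (γ = 0).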
5.1. An array of possible treatments

A variety of approaches have been taken to making predictions for and interpreting the results of submm surveys. This work began after the deep 60-μm counts of IRAS galaxies were derived, and it was clear that strong evolution was being observed out to z ∼ 0.1 (Hacking and Houck 1987; Saunders et al., 1990; Bertin et al., 1997). Observations of more distant galaxies at longer wavelengths could probe the extrapolated form of evolution, and disentangle the degenerate signatures of density and luminosity evolution (Franceschini et al., 1988; Oliver et al., 1992). Before the first deep submm survey observations in 1997, Franceschini et al. (1991), Blain and Longair (1993a, b, 1996) and Pearson and Rowan-Robinson (1996) made a variety of predictions of what might be detected in submm surveys. Guiderdoni et al. (1998) and Toffolatti et al. (1998) did similarly as the first observational results became available. Generally, the observed surface density of submm galaxies was underpredicted. Prior to the recognition that an isotropic signal in the far-IR COBE-FIRAS data was an extragalactic background (Puget et al., 1996) and not Zodiacal emission (Mather et al., 1994), this could be accounted for by the use of an unduly restrictive limit on the intensity of the submm background spectrum.

Once submm count data became available after SCUBA was commissioned, it was possible to take either a more empirical or a more theoretical view of the consequences for dusty galaxy evolution. On the empirical side, forms of evolution of the low-redshift luminosity function of dusty galaxies (Saunders et al., 1990; Soifer and Neugebauer, 1991) that were required to fit the count and background data could be determined (Malkan and Stecker, 1998, 2001; Blain et al., 1999b; Tan et al., 1999; Pearson, 2001; Rowan-Robinson, 2001).
At redshifts less than about unity, this is done by requiring that the predicted counts of low-redshift IRAS and moderate-redshift ISO galaxies are in reasonable agreement with observations. At greater redshifts, where the form of evolution is constrained by submm count and background data, there is significant degeneracy in the models: strong evolution could proceed all the way out to a relatively low cutoff redshift (z ∼ 2–3), or the strong evolution could terminate at a lower redshift z ∼ 1, followed by a tail of either non-evolving or declining luminosity density out to greater redshifts (z > 5) (see Fig. 9 of Blain et al., 1999b). This degeneracy occurs because the far-IR background radiation (like almost all backgrounds) is generated predominantly at z ∼ 1, while submm galaxies can contribute to the counts equally at almost any redshift 1 < z < 10. It can be broken by determining a redshift distribution of submm galaxies, which would be very different in the two cases. The form of evolution that is consistent with the latest observational constraints and radio-derived redshift information for submm-selected galaxies (Smail et al., 2002) is shown by the thick solid and dashed lines in Fig. 21.

Note that the assumptions that underlie these derivations are not yet all verified by observations. It is unclear whether all high-redshift dusty galaxies detected in submm surveys have similar SEDs. It is possible that the properties of the dust grains in galaxies evolve with redshift, leading to a systematic modification to the temperature or emissivity index. It is reasonable to expect the dust-to-gas ratio in the highest redshift galaxies to be lower than in low-redshift galaxies, as less enrichment has taken place. However, note that enrichment proceeds very rapidly once intense star formation activity is underway. Even the very first regions of intense star formation could thus be readily visible in the submm, despite the global metallicity being extremely low.
While it seems unlikely, based on a handful of observations (Fig. 2), it is certainly possible that a population of dusty galaxies with a significantly different SED is missing from current calculations (Blain and Phillips, 2002).
Fig. 21. The history of energy generation in the Universe, parameterized as a star-formation rate per unit comoving volume. The absolute normalization of the curves depends on the assumed stellar IMF and the fraction of the dust-enshrouded luminosity of galaxies that is generated by AGN. The points show results derived from a large number of optical and near-IR studies, for which detailed references can be found in Blain et al. (1999b) and Smail et al. (2002). The most important results are from Lilly et al. (1996; filled stars) and Steidel et al. (1999; high-redshift diagonal crosses). The up-pointing arrow comes from the submm-based estimate of Hughes et al. (1998). An important new measurement of the extinction-free low-redshift star-formation rate from radio data, which is not plotted, has been obtained by Yun et al. (2001): 0.015 ± 0.005 M⊙ yr⁻¹. The thick solid and dashed lines represent the current best fits to far-IR and submm data in a simple luminosity evolution model and a hierarchical model of luminous merging galaxies, respectively, as updated to reflect additional data and a currently favored non-zero-Λ cosmology. The thinner solid lines show the approximate envelope of 68% uncertainty in the results of the luminosity evolution model. The thin and thick dotted lines represent the best-fitting results obtained in the original derivations (Blain et al., 1999b, c).
A more theoretically motivated approach, based on making assumptions about the astrophysical processes at work in galaxy evolution and then predicting the observational consequences, has rightly become popular in recent years. These ‘semi-analytic’ models, which were generally developed to explain optical and near-IR observations, take a representative set of dark-matter halos that evolve and merge over cosmic time, from the results of N-body simulations, and determine their star-formation histories and appearance using a set of recipes for star-formation and feedback (White and Frenk, 1991; Kauffmann and White, 1993; Cole et al., 1994, 2000; Guiderdoni et al., 1998; Granato et al., 2000, 2001; Benson et al., 2001; Somerville et al., 2001; Baugh et al., 2001). Unfortunately, at present there is insufficient information from submm observations to justify a model that contains more than a handful of uncertain parameters, and so it is difficult to exploit the full machinery of semi-analytic models to explain the submm observations. Despite the free parameters available, semi-analytic models have had limited success in accounting for the observed population of high-redshift submm galaxies, without adding in an extra population of more luminous galaxies to the standard prescription (Guiderdoni et al., 1998), or breaking away from their traditional reliance on a universal initial mass function (IMF). As more information becomes available, the full capabilities of the semi-analytic models can hopefully be applied to address dusty galaxy evolution.
In a blend of these approaches, assuming that the SCUBA galaxies are all associated with merging galaxies (Ivison et al., 1998a, 2001; Figs. 14 and 18), and yet not being sure of the physical processes by which the mergers generate the luminosity we observe, Blain et al. (1999c) used a minimally parametrized semi-analytic model to investigate the change in the properties of merging galaxies required to reproduce the submm and far-IR counts and backgrounds: see also Jameson (2000) and Longair (2000). The observed background spectrum can only be reproduced for strong evolution of the total luminosity density out to redshift z ≃ 1, by a factor of about 20 (see Fig. 21). In addition, the lifetime of the luminous phase associated with mergers, and thus the mass-to-light ratio, also had to be reduced by a large factor at high redshift in order to reproduce the observed submm counts. The physical reason for this change must be an increased efficiency of star formation during starburst activity or of AGN fueling at increasing redshifts; both make sense in light of the greater gas densities expected at high redshifts. The model has the advantage of being able to reproduce the faint optical counts, if blue LBGs are also associated with merging galaxies. The submm galaxies release about four times more energy in total than the LBGs, and do so over a period of time during a merger that is about ten times shorter. It is likely that the total baryonic mass and geometry of the merging galaxies also play important roles in determining the details of star-formation activity or AGN fueling during a merger.

When reviewing the predictions and results of any model, note that it is easy to produce a model that can account for the far-IR–submm background radiation intensity; more difficult to account for the submm counts; and more difficult again to reproduce a plausible redshift distribution.

5.2. Observational tests of models

The key observational test of models of submm-wave galaxy formation is the redshift distribution, which is known in outline from radio–submm observations (Smail et al., 2000). Determining the redshift distribution is a key goal of extensive ongoing follow-up observations, but the process has proved to be difficult and time-consuming, as documented extensively by Smail et al. (2002). The crucial problems are the faintness of the counterparts, combined with the relatively poor positional accuracy of the centroids of the submm galaxies, which are unresolved due to the coarse spatial resolution of existing submm images.

Detailed measurements of the counts of galaxies at both brighter and fainter flux densities than those shown in Fig. 9 would also constrain models. However, determining the bright counts requires a large-area survey, which is likely to be relatively inefficient (Fig. 23), while determining the faint counts requires greater angular resolution than can be provided by the telescopes used to make existing surveys, to avoid source confusion. The very bright counts will certainly be probed directly towards the end of the decade by the Planck Surveyor all-sky survey at a resolution of 5 arcmin, and sooner by large-area surveys using forthcoming large-format mm/submm-wave bolometer arrays on ground-based telescopes, including BOLOCAM (Glenn et al., 1998) and SCUBA-II. Limits on the bright submm-wave counts can be imposed from the number of candidate point sources that can be found in large-area submm maps of Galactic fields (Pierce-Price et al., 2000; Barnard et al., 2002). The faint counts will ultimately be determined directly using the SMA, CARMA and ALMA interferometers.

The results of deep ISO surveys have been regularly cited as a useful constraint on galaxy evolution (Rowan-Robinson et al., 1997; Xu, 2000; Chary and Elbaz, 2001). This is certainly true out to
Fig. 22. A summary of observed (Elbaz et al., 1999) and predicted (Blain et al., 1999b, c) differential counts of galaxies in the 15-μm ISO band. The model predictions assume only a power-law SED in the mid-IR, with $f_\nu \propto \nu^{-1.95}$: no fine-tuning with PAH features in the SED is included. The hierarchical model (Blain et al., 1999c) provides a better fit to the data, but both agree reasonably well with the observations. Both models asymptote to the same count at greater flux densities. Note that the relative form of the model counts reflects that seen in Fig. 9, with the hierarchical model having the steepest rise.
z ≃ 1. However, when estimating a total luminosity density from 15-μm data, it is vital that the correct SED is used to extrapolate to longer wavelengths, as it is easy to overestimate the amount of luminosity associated with a 15-μm source by assuming a mid-IR SED that is too steep. For example, compare the inferred luminosity density results at redshifts z ≃ 0.7 quoted by Rowan-Robinson et al. (1997) and Flores et al. (1999). The results differ by a factor of 5; Flores et al. (1999) obtain the lower result by using radio observations to constrain the total luminosity of the galaxies detected at 15 μm. Extrapolating mid-IR data towards the peak of the SED at longer wavelengths is more difficult than extrapolating submm observations to fix the position of the peak of the SED that lies at shorter wavelengths. This is both because the form of the SED is intrinsically simpler on the long-wavelength side of the peak, and because the well-determined spectrum of the far-IR background radiation can be used to constrain the luminosity-averaged dust temperature of the submm galaxies. Mid-IR observations with SIRTF after 2002 will provide much more information about the SEDs and evolution of dusty galaxies to redshifts z ≃ 2.

In Fig. 22 we show the deep 15-μm counts predicted by models designed to account for the submm data (Blain et al., 1999b, c), updated to the current data and cosmology. If the mid-IR SED is chosen appropriately, then the fit is quite acceptable. Including PAH emission features or varying the mid-IR SED index has relatively little effect on the result. The same approach can be used to estimate the deep cm-wave radio counts. If we assume just the form of the radio–far-IR correlation (Condon, 1992), without any fine tuning, and a radio SED of the form $f_\nu \propto \nu^{-0.6}$, then the predicted 8.4-GHz counts brighter than 10 μJy, based on the submm-based models, are 1.05 and 0.98 arcmin⁻²,
Fig. 23. The detection rates expected in a variety of forthcoming mm, submm and far-IR surveys. The names and wavelengths in microns of the relevant instruments are listed: see Tables 1 and 2. The higher- and lower-peaking SCUBA-II curves correspond to wavelengths of 450 and 850 μm, respectively. References to instrument performance for these calculations (Blain and Longair, 1996) can be found in Table 3. The surface density of galaxies assumed follows the models of Blain et al. (1999b): see Figs. 9 and 10. Curves stop on the right if the surface density is expected to fall below a single galaxy on the whole sky. Curves stop at the left when a relatively optimistic definition of the 5σ confusion noise level for detection is reached (see Fig. 13).
respectively; the corresponding power-law indices of the count function $N(>S) \propto S^{\alpha}$ are $\alpha = -1.4$ and $-1.3$, respectively. The results in both models match the observed 8.4-GHz 10-μJy count of 1.01 ± 0.14 arcmin⁻² with $\alpha = -1.25 \pm 0.2$ (Partridge et al., 1997). The reasonable agreement between the predictions of the models, which are constrained only by observations in the submm and far-IR, and the observed deep mid-IR and radio counts confirms that the models are reliable. The source confusion estimates shown in Fig. 13, which are based on the same models, should thus be reliable over a wide wavelength range from about 10 cm to 10 μm.

5.3. Modeling the detailed astrophysics of the submm galaxies

It is not possible to separate the modeling of the evolution of the population of submm galaxies fully from studies of the nature of the galaxies themselves. Their luminosities and masses
(see Frayer et al., 1999) demand that the submm-luminous phase be short-lived as compared with the age of the Universe. The observational information for most of the submm galaxies is insufficient to be confident that their nature is understood at present. Of the galaxies with reliable counterparts, there are three bright Class-2 galaxies (Soucail et al., 1999; Ivison et al., 1998a, 2000a, 2001; Vernet and Cimatti, 2001) with optical redshifts and CO detections (Frayer et al., 1998, 1999; Kneib et al., 2002), and a total of four Class-1 galaxies (Smail et al., 1999; Bertoldi et al., 2000; Gear et al., 2000; Lutz et al., 2001), which are known to be either very red galaxies or formal EROs. Other submm-selected galaxies with accurate positions from radio observations (Smail et al., 2000) or mm-wave interferometry (Downes et al., 1999b; Bertoldi et al., 2000; Frayer et al., 2000; Dannerbauer et al., 2002) remain enigmatic. All that can be said about these galaxies is that they all appear to have thermal dust spectra, are all very faint at optical wavelengths, and most also appear to be very faint at near-IR wavelengths.

The Class-2 galaxies are all clearly undergoing mergers or interactions. Much less is known about the morphology of the faint, but typically extended Class-1 galaxies (Figs. 16, 17 and 20). It is certainly possible that they too are involved in interactions, which appear to trigger the dramatic luminosity of almost all the low-redshift ULIRGs (Sanders and Mirabel, 1996). Programs of ultradeep near-IR imaging on 10-m-class telescopes should soon test this idea. Hydrodynamical simulations of gas-rich mergers by Mihos and Hernquist (1996), Bekki et al. (1999) and Mihos (2000) show the formation of very dense concentrations of gas, which could be associated with short-lived, very intense bursts of star formation.
However, at present it is not possible to simulate a representative sample of mergers with the range of geometries likely to be encountered, the necessary time resolution, and a sufficiently accurate treatment of the detailed astrophysics of star-formation to make a reliable connection between the limited observations and the underlying galaxy properties. The spatial extent of the three bright Class-2 galaxies in the optical waveband appears to be considerably greater than that of most low-redshift ULIRGs. It is thus difficult to be sure that simulations of well-studied low-redshift ULIRGs adequately represent the properties of the high-redshift submm galaxies. Note, however, that the precise spatial relationship between the optical and submm emission in these objects is still unclear (Ivison et al., 2001); the submm emission could be more compact than the optical galaxy. Larger submm galaxy samples will be available over the next few years, boosting the likelihood that examples of the full range of submm galaxies will be available to be studied in detail. More sensitive observations of the properties of the known galaxies will also improve our knowledge of their astrophysics.

One key question is the relationship of the submm galaxies to the formation of elliptical galaxies (Lilly et al., 1999). Whether the bulk of submm galaxies are high-redshift low-angular-momentum gas clouds, forming elliptical galaxies in a single episode by a ‘monolithic collapse’ (Eggen et al., 1962), as advocated by, for example, Archibald et al. (2002), or galaxies observed during one of a series of repeated mergers of gas-rich, but pre-existing galaxy sub-units, likely to take place at relatively lower redshifts, as discussed by Sanders (2001), and which might ultimately yield elliptical merger remnants, is an important question that future follow-up observations will address.
Existing observations of extended and disturbed counterparts to submm galaxies (Ivison et al., 1998a, 2001; Lutz et al., 2001) tend to favor the second explanation, in which well-defined pre-existing stellar systems merge. However, in either scenario it is likely that the bulk of the stellar population in the resulting galaxies forms during the submm-luminous phase.
5.4. The global evolution of dust-enshrouded galaxies

Fig. 21 summarizes the current state of knowledge of the strong evolution of the comoving luminosity density contributed by luminous far-IR galaxies, whose emission is redshifted into the submm (Blain et al., 1999b, c; Smail et al., 2002). The derivation of these results was discussed briefly above, and is explained in much more detail in these papers. The results of both the luminosity evolution and hierarchical models, which both include strong luminosity evolution of the population of low-redshift IRAS galaxies out to high redshifts, are fully compatible with the redshift information available for submm galaxy samples. This would not be the case if the redshift evolution of the luminosity density was markedly different from the forms shown in Fig. 21. For example, if the luminosity density of submm galaxies were to match rather than exceed the value denoted by the datapoints at z ≃ 1 in Fig. 21, and remain at the same high level out to z ≃ 10, then the submm-wave spectrum of the cosmic background radiation (Fig. 11) would tend to be too flat, the intensity of the far-IR background radiation would fall short of the observed level, and the predicted redshift distribution of submm galaxies would be biased strongly to the highest redshifts, which at present seems not to be the case (Smail et al., 2002). The determination of complete redshift distributions for existing samples of submm galaxies, and of more redshifts for individual very luminous galaxies drawn from large, future submm-selected galaxy catalogs, will ultimately allow the evolution of distant dusty galaxies to be traced in detail.

6. Gravitational lensing in the submm waveband

The first submm-wave surveys for distant galaxies exploited both the weak to moderate gravitational lensing magnification, by about a factor of 2–3, experienced throughout the inner few square arcminutes of rich foreground clusters of galaxies at a moderate redshift in the range z ≃ 0.2–0.4 (Blain, 1997) and the greater magnification produced along critical lines for much smaller areas of the background sky. A 5-arcmin² SCUBA field centered on a moderate redshift cluster includes both these regions, enhancing the flux density from all high-redshift background galaxies (Smail et al., 1997). More, and in some cases deeper, SCUBA images of clusters have been taken (Smail et al., 2002; Chapman et al., 2002a; Cowie et al., 2002), especially in Abell 2218, where a multiply-imaged source has been detected (van der Werf and Kraiberg Knudsen, 2001). Whether the magnification acts to increase the surface density of background galaxies on the sky depends on the form of their counts.

Lensing by both galaxies and clusters could have significant applications in future submm surveys, especially those sampling the steep counts of bright submm galaxies in wide fields, including the all-sky survey from Planck Surveyor (Blain, 1998), and surveys using BLAST, Herschel-SPIRE and SCUBA-II (Tables 1 and 2) covering many tens of square degrees.

6.1. Magnification bias

Because surface brightness is conserved by all gravitational lenses, the net effect of magnifying a population of background galaxies depends on the slope of the counts $dN(>S)/dS$, where $N(>S)$ is the number of galaxies per unit area on the sky brighter than $S$ (Schneider et al., 1992). Subject to a magnification $\mu$, the count becomes $[1/\mu^{2}]\, dN[>(S/\mu)]/dS$. For a power-law count with
$dN(>S)/dS \propto S^{\alpha}$, a value of $\alpha < -2$ corresponds to an increase in surface density if $\mu > 1$. This threshold value corresponds to a slope of −1 for the integral counts $N(>S)$ shown in Fig. 9. Note that for a uniform non-evolving population of galaxies $\alpha = -2.5$. As shown in Fig. 9, submm-wave counts are expected to be steep, and to change slope sharply at mJy flux density levels, as compared with deep optical or radio counts. The significant changes in the count slope are particularly unusual, and not found in any other waveband. As a result, the magnification bias can be large, increasing the number of detectable bright galaxies (Blain, 1996, 1997), and providing a way to investigate very faint counts by comparing lensed and unlensed fields; for example in the innermost regions of clusters of galaxies (Blain, 2002).

That a significant magnification bias can be exploited using relatively weak lensing by clusters of galaxies can be seen by comparing the number of 10–20-mJy 850-μm galaxies detected in the SCUBA Lens Survey (Smail et al., 1997, 2002), and in the larger field of the unlensed 8-mJy survey (Scott et al., 2002). The ratio is about 3:1, showing a clear positive magnification bias, and indicating that if the 850-μm counts at flux densities greater than 10 mJy are represented by a power-law, then the index $\alpha < -2$.

6.2. Conditions for exploiting submm lensing by galaxies

The key advantage of observing background galaxies that are gravitationally lensed by foreground mass concentrations in the submm waveband is that the K correction (Fig. 4) acts to brighten the distant background lensed galaxy as compared with the lens. This is already very familiar from surveys of lensed radio AGN (Rusin, 2001), and is illustrated clearly in Fig. 1, in which only the central cD galaxy in the lensing cluster shows any significant submm emission.
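Returning briefly to the power-law argument of Section 6.1: substituting a power-law differential count with index α into the lensed count gives a lensed-to-unlensed ratio of μ^−(α+2) at fixed flux density, which exceeds unity for μ > 1 only when α < −2. The sketch below evaluates this ratio and, as a purely illustrative exercise, inverts it to find the slope implied by a given count ratio. The function names are hypothetical, and a single uniform magnification is assumed, which is a simplification of a real cluster lens.

```python
import math


def magnification_bias(alpha: float, mu: float) -> float:
    """Lensed / unlensed differential counts at fixed flux density.

    For dN/dS proportional to S**alpha, magnifying by mu rescales the
    counts by mu**(-(alpha + 2)).
    """
    if mu <= 0.0:
        raise ValueError("magnification must be positive")
    return mu ** (-(alpha + 2.0))


def integral_slope(alpha: float) -> float:
    """Slope of the integral counts N(>S) for differential slope alpha."""
    return alpha + 1.0


def implied_alpha(count_ratio: float, mu: float) -> float:
    """Differential slope that yields a given count ratio for a uniform mu.

    Inverts magnification_bias; requires mu != 1 and count_ratio > 0.
    """
    if count_ratio <= 0.0 or mu <= 0.0 or mu == 1.0:
        raise ValueError("need count_ratio > 0, mu > 0 and mu != 1")
    return -math.log(count_ratio) / math.log(mu) - 2.0


if __name__ == "__main__":
    for alpha in (-1.5, -2.0, -2.5, -3.5):
        print(alpha, integral_slope(alpha), magnification_bias(alpha, 2.5))
```

For example, under the uniform-magnification assumption a count ratio of about 3 combined with a magnification in the quoted range of 2–3 would correspond to a slope noticeably steeper than the threshold value of −2.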
In SCUBA cluster lens surveys, both the image separations and the extent of the high-magnification regions are of order 1 arcmin, a scale which is well matched both to the 15-arcsec resolution of the JCMT and to the 2.5-arcmin field of view of SCUBA. The magnification ensures that a significantly greater fraction of the submm-wave background radiation intensity is thus resolved into detectable galaxies in surveys in the fields of gravitational lensing clusters than in even the deepest blank-field surveys (Blain et al., 1999a). However, for background sources lensed by galaxies rather than clusters, the relevant image separations and the extent of the high-magnification region are only of order 1 arcsec, and so cannot be resolved using any single-antenna telescope. High-resolution submm observations are required to disentangle lensed and unlensed galaxies; this capability will be provided by ALMA (Blain, 2002), while pilot studies should be possible using the CARMA, SMA and IRAM PdBI interferometers. The most luminous lensed sources can already be resolved into multiple images using the IRAM mm-wave interferometer (Alloin et al., 1997).

The only caveat for exploiting galaxy-scale lensing is that the source size must be small as compared with the area of sky behind the lens that is strongly magnified. The intense far-IR and submm emission from low-redshift ULIRGs is typically very compact (several hundred pc across; Downes and Solomon, 1998), and would easily meet this condition; however, there are indications that the dust emission from at least some luminous high-redshift submm galaxies could extend over scales greater than 10 kpc (Papadopoulos et al., 2001; Chapman et al., 2001a; Lutz et al., 2001; Isaak et al., 2002; Ivison et al., 2001). The whole area of sky covered by these galaxies would not then be lensed efficiently by an intervening galaxy, although bright knots of emission within them
A.W. Blain et al. / Physics Reports 369 (2002) 111 – 176
could still be magnified by large factors. This concern about lensing efficiency and the angular size of distant submm galaxies does not apply to lensing by much larger clusters of galaxies, which will always be effective.

6.3. Prospects for the lensing studies in the future

Larger area surveys for brighter samples of luminous dusty galaxies using the array of forthcoming ground-based, air- and space-borne instruments, including BOLOCAM, BLAST, SOFIA, SCUBA-II, SIRTF, Herschel and Planck Surveyor (see Table 2), should be subject to an enhanced magnification bias. High-resolution follow-up observations using ALMA should then yield a large sample of strongly magnified high-redshift lensed systems to complement the systematically selected CLASS sample of gravitationally lensed AGN identified at radio wavelengths (Rusin, 2001). These surveys will not be subject to any extinction bias due to absorption by dust in the lensing galaxies, and should yield a very complete and reliable catalog of up to several thousand lenses (Blain, 1998).

7. Future developments in submm cosmology

During the last 4 years, the first steps have been taken towards investigating the Universe using direct submm-wave surveys. The technologies of the class of detectors that made these initial surveys possible are still developing rapidly. Many instrumentation projects are underway, which will allow us to increase the sizes of samples of distant submm galaxies, and to study known examples in more detail; some of their key features are outlined in Tables 1–3.

7.1. New technologies for instrumentation

A key technology under development is for bolometers with superconducting temperature-sensitive elements, including transition-edge sensors (TESs). These are much more stable than the semiconducting thermistors used in existing systems, and so can be read out using multiplexed, and therefore much simpler, cold electronics.
Another advantage of TES devices is that they require no bias current, and so need fewer heat-conducting, difficult-to-assemble connections to each device. The prototype Fabry–Perot spectroscopic device FIBRE, which uses TES bolometers (Benford et al., 2001), was tested successfully at the CSO in May 2001. TES devices offer the prospect of increasing the size of the arrays of detector elements in mm/submm-wave cameras from of order 100 to of order $10^{4}$–$10^{5}$, providing much larger fields of view. Filled-array detector devices using conventional semiconducting bolometers are being demonstrated in the SHARC-II and HAWC cameras for the CSO and SOFIA. The SCUBA-II camera, which has a goal of at least an 8 × 8 arcmin field of view (about 25 times greater than the field of view of SCUBA), is under development in Edinburgh and is expected to integrate large arrays and superconducting bolometers. SCUBA-II will supplement its much wider field of view with an enhanced point source sensitivity: the same galaxies should be detectable about 8 and 4 times faster using SCUBA-II as compared with SCUBA at wavelengths of 850 and 450 μm respectively: see Table 1. A 10-m telescope operating at 850 μm with a $10^{5}$-element detector would have a square field of
Table 3
References to instruments listed in Tables 1 and 2

Name                  Information
SCUBA                 Holland et al. (1999)
MAMBO                 Kreysa et al. (1998)
SCUBA-II              www.jach.hawaii.edu/JACpublic/JCMT/Continuum observing/SCUBA-2/home.html
SOFIA                 Davidson (1999); sofia.arc.nasa.gov
BOLOCAM               Glenn et al. (1998); www-lmt.phast.umass.edu/ins/continuum/bolocam.html
BLAST                 Devlin (2001); www.hep.upenn.edu/blast
SMA                   Ho (2000); sma2.harvard.edu
ALMA                  Wootten (2001); www.alma.nrao.edu
SKA                   www.nfra.nl/skai
Herschel-SPIRE        www.ssd.rl.ac.uk/spire
Herschel-PACS         pacs.ster.kuleuven.ac.be
SIRTF-MIPS & IRAC     sirtf.caltech.edu
Planck Surveyor       astro.estec.esa.nl/Planck
SPECS/SPIRIT          Mather et al. (1998); space.gsfc.nasa.gov/astro/specs
view about 1 deg on a side, at a Nyquist-sampled plate scale: much larger than the 5-arcmin² fields of view of SCUBA and MAMBO. With such large fields of view, it is not unreasonable to survey most of the sky down to the confusion limit of a 10–30-m telescope in an observing campaign lasting for several years.

By combining large numbers of bolometer detectors with dispersive mm-wave optics (Glenn, 2001), it should be possible to obtain low-resolution mm-wave spectra of galaxies over a very wide band, perhaps 100 GHz, to search for CO and atomic fine-structure line emission from high-redshift submm galaxies detected in continuum surveys, while simultaneously carrying out unbiased surveys for line-emitting galaxies within the field of view (Blain et al., 2000). Development of several such systems is underway.

Phase-sensitive heterodyne submm detectors are already very efficient; however, only small numbers of these detectors can currently be fabricated into an array. Their strength is in very high resolution submm-wave spectroscopy, and as sensitive coherent detectors in existing mm-wave interferometers. They will be fitted to the forthcoming SMA and CARMA, and will be exploited to the full with the large collecting area of the ALMA array.

Sensitive arrays of mid- and far-IR detectors should soon be flying, both in space aboard SIRTF and ASTRO-F, and in the upper atmosphere, on balloons such as BLAST, and aboard SOFIA. Limits to the continuum flux densities of the most luminous high-redshift galaxies derived using these facilities, measured close to the peak of their SED (see Fig. 2), will provide valuable information about their properties. Spectrographs, both aboard these facilities and on ground-based telescopes, will provide detections of and sensitive limits to the line radiation from the same objects, providing redshift information and astrophysical diagnostics.
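The field-of-view estimate given earlier in this section for a 10-m telescope at 850 μm can be reproduced with a short calculation. The Python sketch below is illustrative only and is not the authors' calculation: it assumes a square array, takes Nyquist sampling to mean a pixel pitch of $\lambda/(2D)$, and ignores optical distortion and vignetting. The function name `nyquist_fov_deg` is hypothetical.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def nyquist_fov_deg(wavelength_m: float, diameter_m: float, n_pixels: float) -> float:
    """Side length (degrees) of a square, Nyquist-sampled detector array.

    Assumption: one pixel subtends lambda / (2 D), i.e. half the
    diffraction scale lambda / D of a telescope of diameter D.
    """
    pixel_rad = wavelength_m / (2.0 * diameter_m)   # angular size of one pixel
    pixels_per_side = math.sqrt(n_pixels)           # square array
    side_arcsec = pixels_per_side * pixel_rad * ARCSEC_PER_RAD
    return side_arcsec / 3600.0


# Values quoted in the text: 10-m aperture, 850 micron, 10^5 detector elements.
side = nyquist_fov_deg(850e-6, 10.0, 1e5)
print(f"field of view: {side:.2f} deg on a side")
```

Under these assumptions the array spans a little under 1 deg on a side, consistent with the "about 1 deg" quoted in the text.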
7.2. New telescopes

At present, a wide range of submm-wave telescopes is available. Single-antenna telescopes include the 10.4-m CSO on Mauna Kea, the 10-m Heinrich Hertz Telescope (HHT) on Mount Graham, the 15-m JCMT on Mauna Kea, the 15-m SEST at La Silla in Chile, the 30-m IRAM telescope in Spain, and the 45-m antenna at Nobeyama in Japan. The Large Millimeter Telescope (LMT), a 50-m mm-wave telescope, is under construction on a 5000-m peak near Puebla in Mexico, and it is hoped that the 100-m Green Bank Telescope (GBT) in West Virginia can operate at 90 GHz/3 mm during the winter. New single-antenna telescopes with large survey cameras have been proposed for the excellent submm observing sites at the South Pole and the ALMA site in Chile.

The Planck Surveyor CMB imaging mission will generate an all-sky map in the submm at a resolution of 5 arcmin, and the 3.5-m Herschel space telescope will carry out pointed submm imaging and spectroscopic observations of known galaxies, and carry out deep confusion-limited cosmological surveys over fields several hundred square degrees in size. Cameras exploiting the 2.5-m telescope aboard SOFIA and BLAST and other dedicated ultra-long-duration balloon instruments will allow far-IR and submm-wave observations from the upper atmosphere.

Existing mm-wave interferometers include the 6 × 15-m IRAM PdBI, the 6 × 10.4-m OVRO MMA, the 10 × 6-m BIMA array at Hat Creek in California and the 6 × 10-m Nobeyama Millimeter Array. The 8 × 6-m SMA, the first imaging submm-wave interferometer, is under construction on Mauna Kea, while it is planned to combine 9 of the BIMA antennas with the OVRO MMA at a high site in the Inyo Mountains east of Owens Valley in California to form CARMA.
The international 64 × 12-m ALMA submm interferometer array in Chile will provide a tremendous increase in the capability of submm-wave spectral line and continuum imaging, providing 10- to 30-arcsec resolution, and detailed images of even the most distant galaxies. The most luminous submm galaxies so far discovered, with 850-μm flux densities of about 25 mJy, could be detected at a 10-σ significance by ALMA in about a second. Its excellent sensitivity and wide 8-GHz instantaneous bandwidth will allow a significant fraction of the galaxies detected in deep surveys to be detected simultaneously in the continuum and CO rotation lines, providing direct and exact redshifts. As the lines of the redshifted CO ladder are separated by 115/(1 + z) GHz, about 25% of galaxies at z ≈ 2.5 will have a CO line lying within the 8-GHz-wide ALMA band (Blain et al., 2000).

7.3. Future capabilities and progress

The enhanced capabilities of this array of new facilities are illustrated in Fig. 23. The rate at which galaxies can be detected is likely to grow dramatically from a few per day at present to many hundreds per hour. Note that the various instruments operate at different wavelengths, and so each is most sensitive to galaxies at different redshifts and with different luminosities. However, sample sizes are certain to increase dramatically, especially when the $10^{4}$–$10^{5}$ galaxies that will be detected in the Planck Surveyor all-sky survey are taken into account.

Multiwavelength follow-up observations of all these new submm galaxies are likely to remain a time-consuming challenge. However, the likely availability of 30-m-aperture ground-based optical/near-IR telescopes in the next decades, and the extremely deep imaging capability of NGST, should help us to study a complete sample of submm galaxies down to luminosities that are only a fraction of $L^{*}$.
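The fraction of galaxies with a CO line in the ALMA band, quoted in Section 7.2, can be checked with a one-line estimate. The Python sketch below is a simplification rather than the calculation of Blain et al. (2000): it assumes that the observing band falls at a random position relative to the redshifted CO ladder and that every rung of the ladder is bright enough to detect, so the chance of catching a line is simply the bandwidth divided by the observed line spacing. The function name `co_line_fraction` is hypothetical.

```python
CO_LADDER_SPACING_GHZ = 115.0  # rest-frame spacing of CO rotational lines, as quoted in the text


def co_line_fraction(z: float, bandwidth_ghz: float = 8.0) -> float:
    """Approximate probability that a CO line falls inside the observing band.

    Assumption: the band is placed at random with respect to a ladder of
    lines separated by 115 / (1 + z) GHz in the observed frame.
    """
    observed_spacing = CO_LADDER_SPACING_GHZ / (1.0 + z)
    return min(1.0, bandwidth_ghz / observed_spacing)


# Fraction for an 8-GHz band at the redshift used in the text and its neighbours.
for z in (1.0, 2.5, 4.0):
    print(f"z = {z}: spacing = {CO_LADDER_SPACING_GHZ / (1.0 + z):.1f} GHz, "
          f"fraction = {co_line_fraction(z):.2f}")
```

At z = 2.5 the observed line spacing is about 33 GHz, so an 8-GHz band covers roughly a quarter of it, in line with the figure of about 25% given in the text.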
8. Summary: key questions and targets for the future

The first generation of extragalactic submm-wave surveys has provided an important complement to more traditional optical and radio searches for distant galaxies, and has discovered a cosmologically significant population of very luminous, high-redshift dusty galaxies. We have found that it is very hard to study a complete sample of submm galaxies at other wavelengths (Smail et al., 2002). The similar experience of other groups involved in deep mm/submm surveys (Barger et al., 1999a; Eales et al., 2000; Carilli et al., 2001; Scott et al., 2002; Webb et al., 2002b) is reflected in the relatively few papers describing the individual multi-waveband properties of the almost 200 galaxies detected. The most sensitive follow-up observations are required in the radio, near-IR and optical wavebands to identify and study them (Frayer et al., 2000; Ivison et al., 2001), that is, very faint detection thresholds of order 10 μJy at 1.4 GHz, K ≈ 23 and B ≈ 26, respectively.

Much more time has been devoted to multi-waveband follow-up observations than was spent on the initial submm detections. Typical examples of the submm population can be detected in imaging-mode SCUBA observations in about 10 h of integration. However, at least 2 h of near-IR observations at the 10-m Keck telescope and about 24 h of integration at the VLA are then required in order to find likely counterparts to typical submm galaxies. The advantage of the VLA radio observations over those at optical and near-IR wavelengths is the very large field of view, which allows many galaxies to be detected simultaneously.
The very brightest optical counterparts to submm galaxies can be identified spectroscopically in about 7 h of integration using 4-m class telescopes (Ivison et al., 1998b) and higher-quality spectra can be obtained in a comparable time using 8-m class telescopes (see the results of a 5-h integration using the UVES spectrograph at the European Southern Observatory (ESO) VLT by Vernet and Cimatti, 2001). In all cases, identifying a plausible counterpart, where possible, is only a first step; finding a redshift for these typically faint, red galaxies is much more challenging. In this context, the unusual sensitivity of submm surveys to the most distant galaxies is almost a drawback, making it very hard to detect a complete sample of submm galaxies at other wavelengths.

Key questions for understanding submm galaxies in the future include:

What are the properties of typical submm galaxies in other wavebands, and what is their relationship to other high-redshift galaxy samples? The submm-selected galaxies appear to be a diverse mixture of types, including bright merging systems (Ivison et al., 1998a, 2000a, 2001), optical QSOs (Kraiberg Knudsen et al., 2001), EROs with K < 20 (Smail et al., 1999, 2002; Gear et al., 2000; Lutz et al., 2001), and much fainter IR-detected galaxies (Frayer et al., 2000), which may also turn out to have very red colors. It seems that the overlap between the 850-μm submm galaxy population and both the LBGs and faint Chandra X-ray sources is small. Note that some of this apparent diversity is sure to be due to the very wide redshift distribution of the submm galaxies.

What is the redshift distribution of the submm galaxies? Models of the evolution of submm galaxies that do not grossly violate basic observational constraints on the source counts and cosmic background radiation are easy to generate.
However, it is vital to predict a plausible redshift distribution, with only a small fraction at redshifts less than unity, and a probable median redshift of at least 2–3. It is easy to generate a redshift distribution that is biased too high. The observational determination of a redshift distribution for a well-defined sample of submm galaxies remains a crucial goal. This will be easy with ALMA. In the meantime, concerted and time-consuming
campaigns of optical and near-IR spectroscopy will pay off gradually, while observations of cm-wave megamasers and the development of wide-band mm- and cm-wave spectrometers may offer alternative routes. The forthcoming (sub)mm interferometers CARMA and SMA, and developments of the IRAM PdBI will also provide accurate positions and some CO redshifts for submm galaxies.

What are the details of the astrophysics responsible for the luminosity of the submm galaxies? This is very important, as the submm galaxies appear to be signposts to some of the most luminous and violent phases of galaxy evolution, and could be associated with the formation of the bulk of galactic bulges, elliptical galaxies and supermassive black holes (Lilly et al., 1999; Dunlop, 2001). Whether these galaxies are formed in a single event, or as a series of lesser bursts, is a key question for our understanding of the process of galaxy formation and evolution. Detailed comparisons of the luminosity derived from dust continuum emission, the dynamical mass inferred from molecular line profiles, the evolved stellar mass inferred from near-IR observations, and the spatial extent of the activity from various high-resolution observations will all be important for disentangling the complex astrophysics of these systems.

When can submm instruments be used to resolve and study high-redshift galaxies in detail? This is already practical given enough observing time at the OVRO MMA and the IRAM PdBI. The CARMA and SMA interferometers will soon have important roles to play in these studies. In about 10 years, ALMA will provide the first real chance to detect and study galaxies rapidly and in great detail using submm observations alone. Luminosities, redshifts, dynamical masses and metallicities could all be determined without needing to resort to radio, optical and near-IR observations as a matter of course.
However, because ALMA has a relatively small field of view, the most efficient survey strategy may be to detect large numbers of galaxies using wide-field mm/submm cameras like BOLOCAM, SCUBA-II and their successors on single-antenna 10–50-m aperture survey telescopes, and the Herschel and Planck Surveyor space missions, and then use ALMA to provide detailed images and spectra of all the detected galaxies.

What is the fundamental limit to making submm observations of distant galaxies? Submm observations rely on the presence of metals, in the form of molecular gas or dust grains, in order to detect galaxies. While submm radiation is able to travel unattenuated across the Universe from prior to the epoch of reionization, it is possible that a large fraction of pre-reionization 'first-light' sources are insufficiently dusty and metal rich to be detectable as continuum sources. Low-metallicity galaxies should be detectable by fine-structure C and O far-IR line emission, however. It would be tremendously exciting to see the birth of the first metal-enriched dusty systems with ALMA, and so perhaps to determine directly the redshift limit for submm surveys. Of course, even if this were possible, ALMA would still have a long and fruitful career studying the detailed astrophysics of galaxies out to and beyond redshift 5, while the search for the most primitive galaxies in the second and third decades of the century is taken up by space-based mid-IR interferometers and the SKA radio telescope (see Fig. 8).

Submm observations of the distant Universe are a new tool for probing the earliest and most dramatic stages of the evolution of galaxies. Over the years to come, the capabilities of submm-wave observatories, and our understanding of the Universe in this new window, should continue to advance dramatically.
Acknowledgements

This work is heavily based on results obtained from the SCUBA Lens Survey. The following have all been involved with aspects of the SCUBA Lens Survey: Lee Armus, Amy Barger, Jocelyn Bezecourt, Leo Blitz, Len Cowie, John Davies, Alastair Edge, Aaron Evans, Andy Fabian, Allon Jameson, Tom Kerr, Jean-Francois Le Borgne, Malcolm Longair, Leo Metcalfe, Glenn Morrison, Frazer Owen, Naveen Reddy, Nick Scoville, Genevieve Soucail, Jack Welch, Mel Wright and Min Yun. We thank the staff of the JCMT for operating and the UK ATC for providing SCUBA. We thank Omar Almaini, Vicki Barnard, Frank Bertoldi, Jamie Bock, Chris Carilli, Helmut Dannerbauer, Darren Dowell, Steve Eales, Jason Glenn, Sunil Golwala, Dean Hines, Kate Isaak, Kirsten Kraiberg Knudsen, Attila Kovacs, Andrew Lange, Simon Lilly, Ole Möller, Priya Natarajan, Max Pettini, Tom Phillips, Kate Quirk, Enrico Ramirez-Ruiz, Nial Tanvir, Neil Trentham, Paul van der Werf, the editor Marc Kamionkowski, an anonymous referee and Roberta Bernstein for useful conversations and comments on the manuscript. AWB was supported in Cambridge by the Raymond and Beverly Sackler Foundation as part of the Foundation's Deep Sky Initiative Program at the IoA. IRS is supported by the Leverhulme Trust and the Royal Society. JPK is supported by CNRS. Full references and acknowledgement to the instruments and telescopes used in this research can be found in Smail et al. (2002). This research has made use of the NASA/IPAC Extragalactic Database (NED) which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration.

References

Abel, T., Norman, M.L., Madau, P., 1999. Astrophys. J. 523, 66.
Adelberger, K.L., Steidel, C.C., 2000. Astrophys. J. 544, 218. ∗
Alloin, D., Guilloteau, S., Barvainis, R., Antonucci, R., Tacconi, L., 1997. Astron. Astrophys. 321, 24.
Almaini, O., Lawrence, A., Boyle, B.J., 1999. Mon. Not. R. Astron. Soc. 305, L59.
Almaini, O., et al., 2002. Mon. Not. R. Astron. Soc., submitted, astro-ph/0108400.
Altieri, B., et al., 1999. Astron. Astrophys. 343, L65. ∗
Alton, P.B., Xilouris, E.M., Bianchi, S., Davies, J., Kylafis, N., 2000. Astron. Astrophys. 356, 795.
Alton, P.B., Lequeux, J., Bianchi, S., Churches, D., Davies, J., Combes, F., 2001. Astron. Astrophys. 366, 451.
Andreani, P., Franceschini, A., 1996. Mon. Not. R. Astron. Soc. 283, 85.
Archibald, E.N., Dunlop, J.S., Hughes, D.H., Rawlings, S., Eales, S.A., Ivison, R.J., 2001. Mon. Not. R. Astron. Soc. 323, 417.
Archibald, E.N., Dunlop, J.S., Jimenez, R., Friaca, A.C.S., McLure, R.J., Hughes, D.H., 2002. Astrophys. J., submitted, astro-ph/0108122.
Armand, C., Milliard, B., Deharveng, J.M., 1994. Astron. Astrophys. 284, 12.
Arnouts, S., D'Odorico, S., Christiani, S., Zaggia, S., Fontana, A., Giallongo, E., 1999. Astron. Astrophys. 341, 641.
Baker, A.J., Lutz, D., Genzel, R., Tacconi, L.J., Lehnert, M.D., 2001. Astron. Astrophys. 372, L37.
Barcons, X., et al., 2002. Astron. Astrophys. 582, 522.
Barger, A.J., Cowie, L.L., Sanders, D.B., Fulton, E., Taniguchi, Y., Sato, Y., Kawara, K., Okuda, H., 1998. Nature 394, 248.
Barger, A.J., Cowie, L.L., Sanders, D.B., 1999a. Astrophys. J. 518, L5. ∗
Barger, A.J., Cowie, L.L., Smail, I., Ivison, R.J., Blain, A.W., Kneib, J.-P., 1999b. Astron. J. 117, 2656.
Barger, A.J., Cowie, L.L., Richards, E.A., 2000. Astron. J. 119, 2092.
Barger, A.J., et al., 2001. Astrophys. J. 560, L23.
Barnard, V.E., Blain, A.W., 2002. Mon. Not. R. Astron. Soc., submitted for publication.
Barnard, V.E., et al., 2002. Mon. Not. R. Astron. Soc., submitted for publication.
Barvainis, R., Ivison, R.J., 2002. Astrophys. J., submitted, astro-ph/0201424.
Baugh, C.M., Benson, A.J., Cole, S., Frenk, C.S., Lacey, C.G., 2001. In: Marquez et al. (Eds.), QSO Hosts and their Environments, Kluwer/Plenum, New York, p. 295.
Bautz, M.W., et al., 2000. Astrophys. J. 543, L119.
Bekki, K., Shioya, Y., Tanaka, I., 1999. Astrophys. J. 520, L99.
Benford, D.J., Cox, P., Omont, A., Phillips, T.G., McMahon, R.G., 1999. Astrophys. J. 518, L65.
Benford, D.J., Maffei, B., Moseley, S.H., Pajot, F., Phillips, T.G., Rioux, C., Shafer, R.A., 2001. Astron. Astrophys. Suppl. 198, 0510.
Benson, A.J., Frenk, C.S., Baugh, C.M., Cole, S., Lacey, C.G., 2001. Mon. Not. R. Astron. Soc. 327, 1041.
Berger, E., Kulkarni, S.R., Frail, D.A., 2001. Astrophys. J. 560, 652.
Bernstein, R.A., 1999. In: Bunker, A.J., van Breughel, W.J.M. (Eds.), The Hy-Redshift Universe, ASP Conf. Series, Vol. 193, ASP, San Francisco, p. 487.
Bernstein, R.A., Freedman, W.L., Madore, B.F., 2002. Astrophys. J. 571, 56.
Bertin, E., Dennefeld, M., Moshir, M., 1997. Astron. Astrophys. 323, 685.
Bertoldi, F., et al., 2000. Astron. Astrophys. 360, 92.
Bertoldi, F., Menten, K.M., Kreysa, E., Carilli, C.L., Owen, F., 2001. In: Highlights of Astronomy, Vol. 12, ASP, San Francisco, astro-ph/0010553.
Blain, A.W., 1996. Mon. Not. R. Astron. Soc. 283, 1340.
Blain, A.W., 1997. Mon. Not. R. Astron. Soc. 290, 553.
Blain, A.W., 1998. Mon. Not. R. Astron. Soc. 297, 511.
Blain, A.W., 1999a. Mon. Not. R. Astron. Soc. 304, 669.
Blain, A.W., 1999b. In: Weymann, R.J., et al. (Eds.), Photometric Redshifts and High Redshift Galaxies, ASP Conf. Series, Vol. 191, ASP, San Francisco, p. 255, astro-ph/9906141.
Blain, A.W., 2001a. In: Tran Thanh Van, J., Mellier, Y., Moniez, M. (Eds.), Cosmological Physics with Gravitational Lensing.
EDP Sciences, Gif sur Yvette, p. 245, astro-ph/0007196.
Blain, A.W., 2002. Mon. Not. R. Astron. Soc. 330, 219.
Blain, A.W., Longair, M.S., 1993a. Mon. Not. R. Astron. Soc. 264, 509. ∗
Blain, A.W., Longair, M.S., 1993b. Mon. Not. R. Astron. Soc. 265, L21.
Blain, A.W., Longair, M.S., 1996. Mon. Not. R. Astron. Soc. 279, 847.
Blain, A.W., Natarajan, P., 2000. Mon. Not. R. Astron. Soc. 312, L39.
Blain, A.W., Phillips, T.G., 2002. Mon. Not. R. Astron. Soc. 333, 222.
Blain, A.W., Ivison, R.J., Smail, I., 1998. Mon. Not. R. Astron. Soc. 296, L29.
Blain, A.W., Kneib, J.-P., Ivison, R.J., Smail, I., 1999a. Astrophys. J. 512, L87. ∗
Blain, A.W., Smail, I., Ivison, R.J., Kneib, J.-P., 1999b. Mon. Not. R. Astron. Soc. 302, 632. ∗
Blain, A.W., Jameson, A., Smail, I., Longair, M.S., Kneib, J.-P., Ivison, R.J., 1999c. Mon. Not. R. Astron. Soc. 309, 715.
Blain, A.W., Frayer, D.T., Bock, J.J., Scoville, N.Z., 2000. Mon. Not. R. Astron. Soc. 313, 559.
Bloom, J.S., Kulkarni, S.R., Djorgovski, S.G., 2002. Astron. J. 123, 111.
Borys, C., Chapman, S.C., Halpern, M., Scott, D., 2002. Mon. Not. R. Astron. Soc. 330, L63.
Briggs, F.H., 1999. Preprint, astro-ph/9910415.
Calzetti, D., Armus, L., Bohlin, R.C., Kinney, A.L., Koornneef, J., Storchi-Bergmann, T., 2000. Astrophys. J. 533, 682.
Carilli, C.L., 2001. Private communication.
Carilli, C.L., Taylor, G.B., 2000. Astrophys. J. 532, L95.
Carilli, C.L., Yun, M.S., 1999. Astrophys. J. 513, L13. ∗
Carilli, C.L., Yun, M.S., 2000. Astrophys. J. 530, 618.
Carilli, C.L., Blain, A.W., 2002. Astrophys. J. 569, 605.
Carilli, C.L., et al., 2001. Astrophys. J. 555, 625.
Carilli, et al., 2001. In: Lowenthal, J., Hughes, D. (Eds.), Deep Millimeter Surveys: Implications for Galaxy Formation and Evolution. World Scientific, Singapore, p. 27, astro-ph/0009298.
Chapman, S.C., et al., 2000. Mon. Not. R. Astron. Soc. 319, 318. ∗
Chapman, S.C., et al., 2001a. Astrophys. J. 548, L17.
Chapman, S.C., Richards, E.A., Lewis, G.F., Wilson, G., Barger, A.J., 2001b. Astrophys. J. 548, L147.
Chapman, S.C., Scott, D., Borys, C., Fahlman, G.G., 2002a. Mon. Not. R. Astron. Soc. 330, 92.
Chapman, S.C., Lewis, G.F., Scott, D., Borys, C., Richards, E.A., 2002b. Astrophys. J. 570, 557.
Chapman, S.C., et al., 2002c. Astrophys. J., in press, astro-ph/0203068.
Chapman, S.C., et al., 2002d. Astrophys. J., in press, astro-ph/0203068.
Chary, R., Elbaz, D., 2001. Astrophys. J. 556, 562.
Cole, S., Aragon-Salamanca, A., Frenk, C.S., Navarro, J.F., Zepf, S.E., 1994. Mon. Not. R. Astron. Soc. 271, 781.
Cole, S., Lacey, C.G., Baugh, C.M., Frenk, C.S., 2000. Mon. Not. R. Astron. Soc. 319, 168.
Combes, F., Maoli, R., Omont, A., 1999. Astron. Astrophys. 345, 369.
Condon, J.J., 1974. Astrophys. J. 188, 279.
Condon, J.J., 1992. Annu. Rev. Astron. Astrophys. 30, 575. ∗
Cowie, L.L., Barger, A.J., Kneib, J.-P., 2002. Astron. J. 123, 2197.
Daddi, E., et al., 2000. Astron. Astrophys. 361, 535.
Dale, D.A., Helou, G., Contursi, A., Silbermann, N.A., Kolhatkar, S., 2001. Astrophys. J. 549, 215.
Dannerbauer, H., Lehnert, M.D., Lutz, D., Tacconi, L., Bertoldi, F., Carilli, C., Genzel, R., Menten, K., 2002. Astrophys. J., in press, astro-ph/0201104.
Darling, J., Giovanelli, R., 2001. Astron. J. 121, 1278.
Davidson, J.A., 1999. Astrophys. Space Sci. 266, 35.
Devlin, M.J., 2001. Preprint, astro-ph/0012327.
Deane, J.R., Trentham, N., 2001. Mon. Not. R. Astron. Soc. 326, 1467.
Devriendt, J.E.G., Guiderdoni, B., Sadat, R., 1999. Astron. Astrophys. 350, 381.
Dey, A., Graham, J.R., Ivison, R.J., Smail, I., Wright, G.S., Liu, M.C., 1999. Astrophys. J. 519, 610.
Dole, H., et al., 2001. Astron. Astrophys. 372, 364.
Domingue, D.L., Keel, W.C., Ryder, S.D., White, R.E., 1999. Astron. J. 118, 1542.
Dowell, C.D., et al., 2001. Astron. Astrophys. Suppl. 198, 0509.
Downes, D., Solomon, P.M., 1998. Astrophys. J. 507, 615. ∗
Downes, D., et al., 1999a. Astron. Astrophys. 347, 809.
Downes, D., Neri, R., Wiklind, T., Wilner, D.J., Shaver, P.A., 1999b. Astrophys. J. 513, L1.
Draine, B.T., Lee, H.M., 1984. Astrophys. J. 285, 89.
Draine, B.T., Li, A., 2001. Astrophys. J. 551, 807.
Dunlop, J.S., Hughes, D.H., Rawlings, S., Eales, S.A., Ward, M.J., 1994. Nature 370, 347.
Dunlop, J.S., 2001. In: Lowenthal, J., Hughes, D. (Eds.), Deep Millimeter Surveys. World Scientific, p. 11, astro-ph/0011077.
Dunne, L., Eales, S., Edmunds, M., Ivison, R., Alexander, P., Clements, D.L., 2000. Mon. Not. R. Astron. Soc. 315, 115. ∗
Dunne, L., Eales, S., 2001. Mon. Not. R. Astron. Soc. 327, 697.
Dunne, L., Clements, D.L., Eales, S.A., 2001. Mon. Not. R. Astron. Soc. 319, 813.
Dwek, E., Arendt, R., 1998. Astrophys. J. 508, L9.
Eales, S., Lilly, S., Gear, W., Dunne, L., Bond, J.R., Hammer, F., Le Fèvre, O., Crampton, D., 1999. Astrophys. J. 515, 518. ∗
Eales, S., Lilly, S., Webb, T., Dunne, L., Gear, W., Clements, D., Yun, M., 2000. Astron. J. 120, 2244.
Edge, A.C., Ivison, R.J., Smail, I., Blain, A.W., Kneib, J.-P., 1999. Mon. Not. R. Astron. Soc. 306, 599.
Eggen, O.J., Lynden-Bell, D., Sandage, A.R., 1962. Astrophys. J. 136, 748.
Elbaz, D., et al., 1999. Astron. Astrophys. 351, L37.
Ellingson, E., Lee, H.K.C., Bechtold, J., Elston, R., 1996. Astrophys. J. 466, L35.
Fabian, A.C., Barcons, X., 1992. Annu. Rev. Astron. Astrophys. 30, 429.
Fabian, A.C., 2000. In: Giacconi, R., Stella, L., Seiro, S. (Eds.), X-Ray Astronomy 2000, ASP Conf. Series, ASP, San Francisco, Vol. 234, in press, astro-ph/0103431.
Fabian, A.C., et al., 2000. Mon. Not. R. Astron. Soc. 315, L8.
Ferrarese, L., Merritt, D., 2000. Astrophys. J. 539, L9.
Finkbeiner, D.P., Davis, M., Schlegel, D.J., 2000. Astrophys. J. 544, 81.
Fixsen, D.J., Dwek, E., Mather, J.C., Bennett, C.L., Shafer, R.A., 1998. Astrophys. J. 508, 123.
Flores, H., et al., 1999. Astrophys. J. 517, 148.
Fox, M.J., et al., 2002. Mon. Not. R. Astron. Soc. 331, 839.
Frail, D.A., et al., 2002. Astrophys. J. 565, 829.
Franceschini, A., Danese, L., de Zotti, G., Xu, C., 1988. Mon. Not. R. Astron. Soc. 233, 175.
Franceschini, A., Toffolatti, L., Mazzei, P., Danese, L., de Zotti, G., 1991. Astron. Astrophys. Suppl. 89, 285.
Franceschini, A., 2002. In: Perez-Fournon, I., Balcells, M., Moreno-Insertis, F., Sanchez, F. (Eds.), Galaxies at High Redshift, Cambridge University Press, Cambridge, in press, astro-ph/0009121.
Franceschini, A., Fadda, D., Cesarsky, C., Elbaz, D., Flores, H., Granato, G.L., 2002. Astrophys. J. 568, 470.
Frayer, D.T., et al., 1997. Astron. J. 113, 562.
Frayer, D.T., Scoville, N.Z., Yun, M., Evans, A.S., Smail, I., Blain, A.W., Kneib, J.-P., 1998. Astrophys. J. 506, L7. ∗
Frayer, D.T., et al., 1999. Astrophys. J. 514, L13.
Frayer, D.T., Smail, I., Ivison, R.J., Scoville, N.Z., 2000. Astron. J. 120, 1668.
Frayer, D.T., et al., 2002. Astron. J., submitted for publication.
Fukugita, M., Hogan, C.J., Peebles, P.J.E., 1999. Astrophys. J. 503, 518.
Gear, W.K., Lilly, S.J., Stevens, J.A., Clements, D.L., Webb, T.M., Eales, S.A., Dunne, L., 2000. Mon. Not. R. Astron. Soc. 316, L51.
Gebhardt, K., et al., 2000. Astrophys. J. 539, L13.
Giacconi, R., et al., 2002. Astrophys. J., in press, astro-ph/0112184.
Gispert, R., Lagache, G., Puget, J.-L., 2000. Astron. Astrophys. 360, 1.
Glenn, J., et al., 1998. SPIE 3357, 326.
Glenn, J., 2001. Private communication.
Goldader, J.D., Meurer, G., Heckman, T.M., Seibert, M., Sanders, D.B., Calzetti, D., Steidel, C.C., 2002. Astrophys. J., in press, astro-ph/0112352.
Graham, J.R., Dey, A., 1996. Astrophys. J. 471, 720.
Granato, G.L., Danese, L., Franceschini, A., 1996. Astrophys. J. 460, L11.
Granato, G.L., Lacey, C.G., Silva, L., Bressan, A., Baugh, C.M., Cole, S., Frenk, C.S., 2000. Astrophys. J. 542, 710.
Granato, G.L., Silva, L., Monaco, P.L., Panuzzo, P., Salucci, P., De Zotti, G., Danese, L., 2001. Mon. Not. R. Astron. Soc. 324, 757.
Guiderdoni, B., Hivon, E., Bouchet, F.R., Maffei, B., 1998. Mon. Not. R. Astron. Soc. 295, 877. ∗
Gunn, K.F., Shanks, T., 2002. Mon. Not. R. Astron. Soc., submitted, astro-ph/9909089.
Haarsma, D.B., Partridge, R.B., Windhorst, R.A., Richards, E.A., 2000. Astrophys. J. 544, 641.
Hacking, P.B., Houck, J.R., 1987. Astrophys. J. Suppl. 63, 311.
Haiman, Z., Knox, L., 2000. Astrophys. J. 530, 124.
Harwit, M., Pacini, F., 1975. Astrophys. J. 200, 127.
Hasinger, G., et al., 1996. Astron. Astrophys. Suppl. 120, 607.
Hauser, M.G., et al., 1998. Astrophys. J. 508, 25.
Hauser, M.G., Dwek, E., 2001. Annu. Rev. Astron. Astrophys. 39, 249. ∗
Helfer, T.T., 2000. In: Mangum, J. (Ed.), Imaging at Radio through Submillimeter Wavelengths, ASP Conf. Series, Vol. 217, ASP, San Francisco, p. 25.
Helou, G., Beichman, C.A., 1990. ESA SP-314, 117.
Hildebrand, R.H., 1983. Q. J. R. Astron. Soc. 24, 267.
Ho, P.T.P., 2000. In: Mangum, J. (Ed.), Imaging at Radio through Submillimeter Wavelengths, ASP Conf. Series, ASP, San Francisco, Vol. 217, p. 227.
Hogg, D.W., 2001. Astron. J. 121, 1336.
Holland, W., et al., 1999. Mon. Not. R. Astron. Soc. 303, 659.
Hollenbach, D.J., Tielens, A.G.G.M., 1997. Annu. Rev. Astron. Astrophys. 35, 179.
Hornschemeier, A.E., et al., 2000. Astrophys. J. 541, 49.
Hu, E., Ridgway, S.E., 1994. Astron. J. 107, 1303.
Hughes, D., 1996. In: Bremer, M.N., van der Werf, P.P., Röttgering, H.J.A., Carilli, C.J. (Eds.), Cold Gas at High Redshift. Kluwer, Dordrecht, p. 311.
Hughes, D.H., Dunlop, J.S., Rawlings, S., 1997. Mon. Not. R. Astron. Soc. 289, 766.
Hughes, D., et al., 1998. Nature 394, 241. ∗
174
A.W. Blain et al. / Physics Reports 369 (2002) 111 – 176
Hummel, E., 1986. Astron. Astrophys. 160, L1. Irwin, M.J., Ibata, R.A., Lewis, G.F., Totten, E.J., 1998. Astrophys. J. 505, 529. Isaak, K.G., McMahon, R.G., Hills, R.E., Withington, S., 1994. Mon. Not. R. Astron. Soc. 369, L28. Isaak, K.G., et al., 2002. Mon. Not. R. Astron. Soc. 329, 149. Ivison, R.J., 2001. In: Tran Thanh Van, J., Mellier, Y., Moniez, M. (Eds.), Cosmological Physics with Gravitational Lensing, EDP Sciences, Gif sur Yvette, p. 233. Ivison, R.J., Smail, I., Le Borgne, J.-F., Blain, A.W., Kneib, J.-P., Bezecourt, J., Kerr, T.H., Davies, J.K., 1998a. Mon. Not. R. Astron. Soc. 298, 583. ∗ Ivison, R.J., et al., 1998b. Astrophys. J. 494, 211. Ivison, R.J., Smail, I., Barger, A.J., Kneib, J.-P., Blain, A.W., Owen, F.N., Kerr, T.H., Cowie, L.L., 2000a. Mon. Not. R. Astron. Soc. 315, 209. Ivison, R.J., Dunlop, J.S., Smail, I., Dey, A., Liu, M.C., Graham, J.R., 2000b. Astrophys. J. 542, 27. Ivison, R.J., Smail, I., Frayer, D.T., Kneib, J.-P., Blain, A.W., 2001. Astrophys. J. 561, L45. Ivison, R.J., et al., 2002. Mon. Not. R. Astron. Soc., submitted for publication. Jameson, A., 2000. Ph.D. Thesis. University of Cambridge. Juvela, M., Mattila, K., Lemke, D., 2000. Astron. Astrophys. 360, 813. Kau7mann, G., White, S.D.M., 1993. Mon. Not. R. Astron. Soc. 261, 921. Kawara, K., et al., 1998. Astron. Astrophys. 336, L9. ∗ Kiss, C., Abraham, P., Klaas, U., Juvela, M., Lemke, D., 2001. Astron. Astrophys. 380, 388. Kneib, J.-P. et al, 2002. Astron. Astrophys., submitted for publication. Kormendy, J., Sanders, D.B., 1998. Astrophys. J. 325, 74. Kraiberg Knudsen, K., van der Werf, P.P., Ja7e, W., 2001. In: Lowenthal, J., Hughes, D. (Eds.), Deep Millimeter Surveys, World Scienti4c, Singapore, p. 168, astro-ph=0009024. Kreysa, E., et al., 1998. SPIE 3357, 319. Krumholz, M., Thorsett, S.E., Harrison, F.A., 1998. Astrophys. J. 506, L81. Lagache, G., Puget, J.L., 2000. Astron. Astrophys. 355, L17. Lagache, G., Abergel, A., Boulanger, F., Desert, F.-X., Puget, J.-L., 2000a. 
Astron. Astrophys. 344, 322. Lagache, G., Ha7ner, L.M., Reynolds, R.J., Tufte, S.L., 2000b. Astron. Astrophys. 354, 247. Lampton, M., Bowyer, S., Deharveng, J.M., 1990. In: Bowyer, S., Leinert, C. (Eds.), The Galactic and Extragalactic Background Radiation, Proc. IAU, Vol. 139. Kluwer, Dordrecht, p. 449. Laurent, O., Mirabel, I.F., Charmandaris, V., Gallais, P., Madden, S.C., Sauvage, M., Vigroux, L., Cesarsky, C., 2000. Astron. Astrophys. 359, 887. Lewis, G.F., Chapman, S.C., Ibata, R.A., Irwin, M.J., Totten, E.J., 1998. Astrophys. J. 505, L1. Lilly, S.J., Le F[evre, O., Hammer, F., Crampton, D., 1996. Astrophys. J. 460, L1. Lilly, S.J., Eales, S.A., Gear, W.K.P., Hammer, F., Le F[evre, O., Crampton, D., Bond, J.R., Dunne, L., 1999. Astrophys. J. 518, 641. Linden-Vornle, M.J.D., et al., 2000. Astron. Astrophys. 359, 51. Lisenfeld, U., Isaak, K.G., Hills, R.E., 2000. Mon. Not. R. Astron. Soc. 312, 433. Lockwood, G.W., 1970. Astrophys. J. 160, L47. Longair, M.S., 2000. In: Dingus, B.L. et al. (Eds.), AIP Conference Proceedings, Vol. 516, AIP, Woodbury, New York, p. 3. Lutz, D., et al., 2001. Astron. Astrophys. 378, 70. Magliocchetti, M., et al., 2001. Mon. Not. R. Astron. Soc. 325, 1553. Magorrian, J., et al., 1998. Astron. J. 115, 2285. Malkan, M.A., Stecker, F.W., 1998. Astrophys. J. 496, 13. Malkan, M.A., Stecker, F.W., 2001. Astrophys. J. 555, 641. Masi, S., et al., 2001. Astrophys. J. 553, 93. Mather, J.C., et al., 1994. Astrophys. J. 420, 439. Mather, J.C. et al, 1998. SPECS de4nition preprint, astro-ph=9812454. Matsuhara, H., et al., 2000. Astron. Astrophys. 361, 407. Metcalfe, L., 2001. Private communication. Meurer, G.R., Heckman, T.M., Calzetti, D., 1999. Astrophys. J. 521, 183.
A.W. Blain et al. / Physics Reports 369 (2002) 111 – 176
175
Mihos, J.C., Hernquist, L., 1996. Astrophys. J. 464, 641. Mihos, C., 2000. Preprint, astro-ph=9903115. Mirabel, I.F., et al., 1998. Astron. Astrophys. 333, L1. Mohan, N.R., Cimatti, A., RVottgering, H.J.A., Andreani, P., Severgnini, P., Tilanus, R.P.J., Carilli, C.L., Stanford, S.A., 2002. Astron. Astrophys. 383, 440. Murthy, J., Hall, D., Earl, M., Henry, R.C., Holberg, J.B., 1999. Astrophys. J. 522, 904. Mushotzky, R.F., Cowie, L.L., Barger, A.J., Arnaud, K.A., 2000. Nature 404, 459. Oliver, S., Rowan-Robinson, M., Saunders, W., 1992. Mon. Not. R. Astron. Soc. 256, P15. Omont, A., Cox, P., Bertoldi, F., McMahon, R.G., Carilli, C., Isaak, K.G., 2001. Astron. Astrophys. 374, 371. Page, M.J., Stevens, J.A., Mittaz, J.P.D., Carrera, F.J., 2002. Science 294, 2516. Partridge, R.B., Peebles, P.J.E., 1967. Astrophys. J. 147, 868. Partridge, R.B., Richards, E.A., Fomalont, E.B., Kellerman, K.I., Windhorst, R.A., 1997. Astrophys. J. 483, 38. Papadopoulos, P., Ivison, R., Carilli, C., Lewis, G., 2001. Nature 409, 58. Peacock, J.A., et al., 2000. Mon. Not. R. Astron. Soc. 318, 535. Pearson, C., 2001. Mon. Not. R. Astron. Soc. 325, 1511. Pearson, C., Rowan-Robinson, M., 1996. Mon. Not. R. Astron. Soc. 283, 174. Peebles, P.J.E., 1993. Principles of Physical Cosmology, Princeton, NJ. Pettini, M., Kellogg, M., Steidel, C.C., Dickinson, M., Adelberger, K.L., Giavalisco, M., 1998. Astrophys. J. 508, 539. Pettini, M., Steidel, C.C., Adelberger, K.L., Dickinson, M., Giavalisco, M., 2000. Astrophys. J. 528, 96. Pettini, M., Shapley, A.E., Steidel, C.C., Cuby, J.-G., Dickinson, M., Moorwood, A.F.M., Adelberger, K.L., Giavalisco, M., 2001. Astrophys. J. 554, 981. Pierce-Price, D., et al., 2000. Astrophys. J. 545, L121. Pierre, M., et al., 2001. Astron. Astrophys. 372, L45. Pozzetti, L., Madau, P., Zamorani, G., Ferguson, H., Bruzual, A., 1998. Mon. Not. R. Astron. Soc. 298, 1133. Priddey, R.S., McMahon, R.G., 2001. Mon. Not. R. Astron. Soc. 324, L17. 
Puget, J.-L., Abergel, A., Bernard, J.-P., Boulanger, F., Burton, W.B., DNesert, F.-X., Hartmann, D., 1996. Astron. Astrophys. 308, L5. ∗ Puget, J.-L., et al., 1999. Astron. Astrophys. 345, 29. Ramirez-Ruiz, E., Trentham, N., Blain, A.W., 2002. Mon. Not. R. Astron. Soc. 329, 465. Reach, W.T., et al., 1995. Astrophys. J. 451, 188. Regan, M.W., Thornley, M.D., Helfer, T.T., Sheth, K., Wong, T., Vogel, S.N., Blitz, L., Bock, D.C.J., 2001. Astrophys. J. 561, 218. Richards, E.A., 2000. Astrophys. J. 533, 611. Rigopoulou, D., Spoon, H.W.W., Genzel, R., Lutz, D., Moorwood, A.F.M., Tran, Q.D., 1999. Astron. J. 118, 2625. Rowan-Robinson, M., et al., 1991. Nature 351, 719. Rowan-Robinson, M., et al., 1997. Mon. Not. R. Astron. Soc. 289, 490. Rowan-Robinson, M., 2000. Mon. Not. R. Astron. Soc. 316, 885. Rowan-Robinson, M., 2001. Astrophys. J. 549, 745. Rusin, D., 2001. Astron. Astrophys. Suppl. 197, 6704. Sakamoto, K., Scoville, N.Z., Yun, M.S., Crosas, M., Genzel, R., Tacconi, L., 1999. Astrophys. J. 514, 68. Sanders, D.B., 1999. Astrophys. Space. Sci. 266, 331. Sanders, D.B., 2001. Preprint, astro-ph=0109138. Sanders, D.B., Mirabel, F., 1996. Annu. Rev. Astron. Astrophys. 34, 749. ∗ Sanders, D.B., Soifer, B.T., Elias, J.H., Neugebauer, G., Matthews, K., 1988. Astrophys. J. 328, L35. Saunders, W., Rowan-Robinson, M., Lawrence, A., Efstathiou, G., Kaiser, N., Ellis, R.S., Frenk, C.S., 1990. Mon. Not. R. Astron. Soc. 242, 318. ∗ Scheuer, P.A.G., 1974. Mon. Not. R. Astron. Soc. 166, 329. Schlegel, D.J., Finkbeiner, D.P., Davis, M., 1998. Astrophys. J. 500, 525. ∗ Schneider, P., Ehlers, J., Falco, E.E., 1992. Gravitational Lenses. Springer, Berlin. Scott, D., White, M., 1999. Astron. Astrophys. 346, 1. Scott, D., et al., 2000. Astron. Astrophys. 357, L5. Scott, S.E. et al., 2002. Mon. Not. R. Astron. Soc. 331, 817.
176
A.W. Blain et al. / Physics Reports 369 (2002) 111 – 176
Seitz, S., Saglia, R.P., Bender, R., Hopp, U., Belloni, P., Ziegler, B., 1998. Mon. Not. R. Astron. Soc. 298, 945. Serjeant, S., et al., 2001. Mon. Not. R. Astron. Soc. 322, 262. Smail, I., Ivison, R.J., Blain, A.W., 1997. Astrophys. J. 490, L5. ∗ Smail, I., Ivison, R.J., Blain, A.W., Kneib, J.-P., 1998a. Astrophys. J. 507, L21. Smail, I., Edge, A.C., Ellis, R.S., Blandford, R.D., 1998b. Mon. Not. R. Astron. Soc. 293, 124. Smail, I., et al., 1999. Mon. Not. R. Astron. Soc. 308, 1061. Smail, I., Ivison, R.J., Owen, F.N., Blain, A.W., Kneib, J.-P., 2000. Astrophys. J. 528, 612. ∗ Smail, I., Ivison, R.J., Blain, A.W., Kneib, J.-P., 2002. Mon. Not. R. Astron. Soc. 331, 495. ∗ Smith, H.E., Lonsdale, C.J., Lonsdale, C.L., Diamond, P.J., 1998. Astrophys. J. 493, L17. Smith, G.P., Treu, T., Ellis, R., Smail, I., Kneib, J.-P., Frye, B.L., 2001. Astrophys. J. 562, 635. Smith, I.A., et al., 1999. Astron. Astrophys. 347, 92. Smith, I.A., Tilanus, R.P.J., Wijers, R.A.M.J., Tanvir, N., Vreeswijk, P., Rol, E., Kouveliotou, C., 2002. Astron. Astrophys. 380, 81. Soifer, B.T., et al., 1987. Astron. J. 320, 238. Soifer, B., Neugebauer, G., 1991. Astron. J. 101, 354. Somerville, R.S., Primack, J.R., Faber, S.M., 2001. Mon. Not. R. Astron. Soc. 320, 504. Soucail, G., Kneib, J.-B., BNezecourt, J., Metcalfe, L., Altieri, B., LeBorgne, J.-F., 1999. Astron. Astrophys. 343, L70. Stanev, T., Franceschini, A., 1998. Astrophys. J. 494, L159. Steidel, C.C., Giavalisco, M., Pettini, M., Dickinson, M., Adelberger, K.L., 1996. Astrophys. J. 624, L17. Steidel, C.C., Adelberger, K.L., Giavalisco, M., Dickinson, M., Pettini, M., 1999. Astrophys. J. 519, 1. ∗ Stickel, M., et al., 1998. Astron. Astrophys. 336, 116. Tan, J.C., Silk, J., Balland, C., 1999. Astrophys. J. 522, 579. Thompson, D., et al., 1999. Astrophys. J. 523, 100. To7olatti, L., et al., 1998. Mon. Not. R. Astron. Soc. 297, 117. Toller, G., Tanabe, H., Weinberg, J.L., 1987. Astron. Astrophys. 188, 24. 
Totani, T., Yoshii, Y., Iwamuro, F., Maihara, T., Motohara, K., 2001. Astrophys. J. 559, 592. Townsend, R., Ivison, R., Smail, I., Blain, A., Frayer, D., 2001. Mon. Not. R. Astron. Soc. 328, L19. Tran, Q.D., et al., 2001. Astrophys. J. 552, 527. Trentham, N., Blain, A.W., 2001. Mon. Not. R. Astron. Soc. 323, 547. Trentham, N., Blain, A.W., Goldader, J., 1999. Mon. Not. R. Astron. Soc. 305, 61. van der Werf, P.P., Kraiberg Knudsen, K., LabbNe, I., Franx, M., 2002. In: van Bemmel, I., Wilkes, B., Barthel, P. (Eds.), New Astronomy Reviews. Elsevier, Amsterdam, in press, astro-ph=0011217. van der Werf, P.P., Kraiberg Knudsen, K., 2001. Private communication. Vernet, J., Cimatti, A., 2001. Astron. Astrophys. 380, 409. VVolk, H.J., 1998. Astron. Astrophys. 218, 67. Waxman, E., Loeb, A., 2000. Astrophys. J. 545, L11. Webb, T.M., Eales, S.A., Lilly, S.J., Dunne, L., Gear, W.K., Flores, H., Yun, M., 2002a. Astrophys. J., submitted, astro-ph=0201180. Webb, T.M. et al, 2002b. Astrophys. J., submitted, astro-ph=0201181. White, S.D.M., Frenk, C.S., 1991. Astrophys. J. 379, 52. Williams, R.E., et al., 1995. Astron. J. 112, 1335. Wilman, R.J., Fabian, A.C., Ghandhi, P., 2000. Mon. Not. R. Astron. Soc. 318, L11. Wilner, D.J., Wright, M.C.H., 1997. Astrophys. J. 488, 67. Wilson, C.D., Scoville, N., Madden, S.C., Charmandaris, V., 2000. Astrophys. J. 542, 120. Witt, A.N., Thronson, H.A., Capuano, J.M., 1992. Astrophys. J. 393, 611. Wootten, A. (Ed.), 2001. Science with the Atacama Large Millimeter Array, ASP Conference Series, Vol. 235. ASP, San Francisco. Wright, E.L., Johnson, B.D., 2002. Astrophys. J., submitted, astro-ph=0107205. Xu, C., 2000. Astrophys. J. 541, 134. Yan, L., et al., 2000. Astron. J. 120, 575. Yun, M.S., Carilli, C.L., 2002. Astrophys. J. 568, 88. Yun, M.S., Reddy, N.A., Condon, J.J., 2001. Astrophys. J. 554, 803. ∗
Physics Reports 369 (2002) 177 – 326 www.elsevier.com/locate/physrep
Study of diatomic van der Waals complexes in supersonic beams
Jarosław Koperski ∗
Instytut Fizyki im. Mariana Smoluchowskiego, Uniwersytet Jagielloński, ul. Reymonta 4, 30-059 Kraków, Poland
Received 1 March 2002; editor: Dr. J. Eichler
Abstract
Laser spectroscopy of van der Waals diatoms produced in supersonic beams is a source of information on the ground- and excited-state interatomic potentials. The goal of this review article is to provide a comprehensive characterization of the MeRG and Me2 diatoms, where Me and RG are 12-group (Zn, Cd, Hg) and rare gas atoms, respectively. As a result, the ground and a number of excited states of the molecules are characterized over a broad range of internuclear separations. Analytical functions are proposed to represent the potential energy curves in three separate regions of internuclear separation: short-range region, vicinity of the equilibrium internuclear separation, and long-range limit. Several models, trends, and regularities of dispersive interaction in the studied diatoms are observed and described. The molecular characteristics presented here are compared with experimental and ab initio results of other investigators. © 2002 Elsevier Science B.V. All rights reserved.
PACS: 33.15.Fm; 33.20.Lg; 33.20.Vq; 33.20.−t
Keywords: van der Waals molecules; Potential energy curves; Molecular potential characteristics; Supersonic expansion beams
Contents
1. Introduction
2. Spectroscopical characterization of weakly bound complexes
2.1. Molecular potential in different regions of internuclear separation
2.2. Characteristics of diatomic molecule
3. Electronic structure of molecular states
3.1. Hund's coupling case (a)
3.2. Hund's coupling case (c)

∗ Tel.: +48-12-632-4888 ext. 5789; fax: +48-12-633-8494. E-mail address: [email protected] (J. Koperski).
0370-1573/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0370-1573(02)00200-4
J. Koperski / Physics Reports 369 (2002) 177 – 326
3.3. Wave function symmetries and electronic selection rules
3.4. 12-group metal–rare gas molecules and 12-group metal dimers
3.5. Potential energy curves
3.5.1. Heteronuclear molecules
3.5.2. Homonuclear molecules
3.5.3. Ab initio calculations
3.5.4. The Morse potential
3.5.5. The Lennard–Jones (n − m) potential
3.5.6. The Maitland–Smith (n0, n1) potential
3.5.7. Combined Morse–vdW potential
3.5.8. Other forms of combined potentials
3.5.9. Hybrid potential
3.5.10. Double-well potential
3.6. Vibrational structure of electronic transitions. Vibrational bands
3.6.1. Bound–bound and bound–free transitions
3.6.2. Franck–Condon principle and Born–Oppenheimer approximation
3.6.3. Isotope structure of vibrational band. Isotope shift
3.7. Rotational structure of a vibrational band. Determination of rotational characteristics
4. Determination of a potential energy curve in different regions of internuclear separations
4.1. Analysis of excitation spectra
4.1.1. Birge–Sponer method
4.1.2. Limiting and generalized near-dissociation expansions
4.1.3. Simulation of bound–bound vibrational spectra
4.1.4. Simulation of rotational structures
4.2. Analysis of fluorescence spectra
4.2.1. Modelling of bound–bound discrete transitions and bound–free continuous spectra
4.3. Complementary results of excitation- and fluorescence-spectra simulations
4.4. Approximations for long-range characteristics
4.4.1. Kramer–Herschbach combination rule
4.4.2. London–Drude theory of dispersion interactions
4.4.3. Slater–Kirkwood formula
4.5. Approximations for internuclear separations
4.5.1. Liuti–Pirani method
4.6. Le Roy radius evaluation
5. Experimental considerations
5.1. Supersonic expansion – a source of molecules
5.1.1. Phenomenological characteristics of supersonic expansion
5.1.2. Pulsed supersonic beam
5.1.3. Continuous supersonic beam
5.2. Laser systems
5.2.1. Laser dyes
5.2.2. Second and third harmonics generation
5.2.3. Tuning, stability, and calibration of the laser wavelength
5.3. Detection systems
5.4. Experimental procedure and data acquisition systems
6. Interpretation of results
6.1. CdRG molecules
6.1.1. X0+ singlet, and A0+ and B1 triplet states
6.1.2. D1 singlet states
6.1.3. E1 triplet Rydberg states
6.1.4. Conclusions – CdRG family
6.2. HgRG molecules
6.2.1. X0+ singlet, and A0+ and B1 triplet states
6.2.2. Conclusions – HgRG family
6.3. ZnRG molecules
6.3.1. X0+ and D1 singlet states
6.3.2. Absence of evidence for the A0+ and B1 triplet states
6.3.3. Conclusions – ZnRG family
6.4. MeRG families of molecules – comparison
6.5. Me2 dimers
6.5.1. Hg2 interatomic potentials from excitation and fluorescence spectra
6.5.2. Cd2 and Zn2 interatomic potentials from excitation spectra
6.6. Me2 dimers – comparison
7. Summary and conclusions
Notation
Acknowledgements
References
1. Introduction

The goal of this review article is to characterize as completely as possible van der Waals (vdW) diatoms such as MeRG and Me2 molecules, where Me and RG stand for a metal atom of the 12-group of the periodic table (i.e. Zn, Cd, Hg) and a rare gas atom (i.e. He, Ne, Ar, Kr, or Xe), respectively.¹ VdW molecules are of current interest for several reasons.

(i) They represent a unique class of simple heteronuclear and homonuclear diatomic species with very small (10–1000 cm−1) dissociation energies, thus filling some of the last gaps in a periodic table of dimers [1].
(ii) Experimental determination of the unusual vdW nature of these species provides a good test for theoretical formulations which aid our understanding of simple models as well as theoretical regularities that govern the long-range dispersion forces (e.g. London–Drude (L–D) theory [2–6], Slater–Kirkwood (S–K) [7] and Kramer–Herschbach (K–H) [8] models, or Liuti–Pirani (L–P) regularity [9]).
(iii) The properties of small vdW clusters, of which dimers are the simplest prototypes, are of current interest with a view to understanding the forces that hold liquids and solids together, as well as the transition from molecular properties to bulk-metal properties [10–12].
(iv) Determining and understanding the mechanism of simple vdW forces in elementary diatoms makes it possible to build a picture of bonding in larger complexes of this type (e.g. [13–22]). This bridges the gap between vdW molecules and clusters.
(v) The electronic properties of excimers and exciplexes (i.e. excited-state homo- and heteronuclear dimers, respectively) allow a relatively easy production of population inversions, thus making them likely candidates for laser media [12,23,24].

¹ The abbreviations MeRG and Me2 are used throughout this review to denote the ZnRG, CdRG, HgRG and Zn2, Cd2, Hg2 systems, respectively.
(vi) In different branches of atomic and molecular physics there is rapidly growing interest in long-range forces acting between atoms interacting in a variety of traps, in experiments on matter-wave interferometry and in photoassociation of cold molecules [25]. The interaction of an RG or closed-shell Me atom with a ground-state or electronically excited Me atom is an example of such a vdW interaction, and it has been an important area of study [23,26–28].

Determination of accurate potential energy (PE) curves for specific MeRG or Me2 ground- and excited-state interactions in different regions of internuclear separation is important to broaden our knowledge of these basic interactions. Processes such as thermal collisions, quenching of excited states, pressure broadening of spectral lines, intra-multiplet transitions, collision-induced light (Raman) scattering, collision-induced absorption, collisional redistribution of radiation, speed-dependent collisional effects of atomic lines perturbed by RG atoms, relaxation processes of Me2 molecules in solid RG, molecular dynamics, etc., can be understood and interpreted properly only if the interatomic potentials are known. The experimentally derived PE curves are frequently compared with ab initio calculations. In the rapidly evolving field of matter-wave interferometry, determining the index of refraction of an RG medium for Me atomic waves requires knowledge of the interatomic potential of MeRG systems in the long- and short-range regions as well as in the bound-well region [29,30]. Hence, information about the interatomic potentials over the widest possible range of internuclear separations (R) is highly desirable. The recent advances in laser cooling and optical trapping techniques, as well as in photoassociation of cold molecules, have been largely responsible for the renewed interest in studies of the long-range region of R.
Among the various available techniques, diatomic molecular spectroscopy has proved to be the most effective and precise way to obtain information about the interaction between the two entities (i.e. Me and RG) [25,31]. Recently, precise information about the PE curves of the mercury dimer has been applied in experiments on femtosecond photoassociation spectroscopy. Photoassociation of Hg2, femtosecond dynamics, and a quantum dynamical wave packet description of these reactions [32–34], as well as coherent bond formation of Hg2 on the femtosecond time scale [35], have been reported. Furthermore, results of mercury dimer spectroscopy are used in planned fundamental experimental tests of quantum mechanics, particularly in a realization of the famous Einstein–Podolsky–Rosen gedankenexperiment [36] and a loophole-free test of the Bell inequalities in conditions different from those using photons [37,38]. In addition, a cooling mechanism for the mercury dimer in a magneto-optical trap has recently been proposed and its experimental realization is in progress [39]. The mechanism is based on the previously determined electronic and vibronic properties of Hg2, thus including this species in the family of potentially cooled diatoms.

Although an attempt has been made to present a broad account of the field, it is perhaps inevitable that the research interests of the author have received strong emphasis. This review is based on studies of the ZnRG [40–42], CdRG [43–50,63], HgRG [51–57] as well as Zn2 [58], Cd2 [58,59] and Hg2 [51,57,60–62] vdW complexes produced in a variety of supersonic beams and studied using methods of laser spectroscopy. The experiments were carried out in the Department of Physics, University of Windsor, Windsor, Ontario, Canada, and with new experimental apparatus constructed in the Institute of Physics, Jagiellonian University, Kraków, Poland. As a result, the ground and a number of excited states of the molecules have been characterized, several of them for the first time.
Analytical functions have been proposed to represent the PE curves in three separate regions of R: the short-range region, the vicinity of the equilibrium internuclear separation, and the long-range limit. This provided a
J. Koperski / Physics Reports 369 (2002) 177 – 326
181
characterization of the interatomic potentials over a broad range of R. The E1(63 S1 ) Rydberg triplet states in CdRG (RG=Ne, Ar, Kr) have been investigated as well [63]. Several models, trends, and regularities of dispersive interaction in studied diatoms have been observed and described. The ground state of the mercury dimer has been found to possess an unusually soft repulsive wall, supporting a hypothesis of the short-range induction eAects playing an important role in the stabilization of Hg2 . A theoretical prediction of covalent bonding contributions to the Me2 ground-state interaction potential has been conLrmed in the experimental observations. The molecular characteristics presented here are compared with experimental and ab initio results of the others investigators. It is diRcult to cover all aspects of experimental results obtained during the studies in the Leld [40 – 63]. Some of them are emphasized and treated in detail. On the other hand, some of the results are pointed out only and a reference is made to the appropriate article. Furthermore, the theoretical part of this review is devoted only to those issues, which are directly related to the presented experimental aspects of the study. Therefore, the theoretical interpretation of these aspects is frequently illustrated with speciLc experimental examples from the results of the investigations. To have a complete overview of all theoretical facets the reader is referred to textbooks on molecular physics and molecular spectroscopy. Sections 2– 4 cover theoretical and descriptive aspects of the vdW molecules as physical objects, their characteristics and methods of extracting them from the experimental data. They are illustrated with examples from the author’s studies. Section 5 presents the experimental part of the study. Among others, instrumental arrangements in laboratories in Windsor and KrakPow are described there. 
In Section 6, results are interpreted and discussed, while Section 7 summarizes the review. This review article covers the state of knowledge of the field up to February 2002.
2. Spectroscopical characterization of weakly bound complexes

2.1. Molecular potential in different regions of internuclear separation

If a diatomic molecule is to exist in a physically stable state, its PE curve must have a minimum. The internuclear separation corresponding to that minimum is then said to be the equilibrium value (or bond length), $R_e$ (see footnote 2). In the simplest picture, if the atoms are brought closer than $R_e$, the PE must rise sharply due to the work done against the increasing Coulomb repulsive force between the electrons and between the charged nuclei. If the atoms are drawn apart, the PE curve must also rise due to work against the now superior electronic binding force. As the separation between the atoms increases, the PE curve must approach a limiting value, which will be the well depth, $D_e$, of the molecule. This simple physical picture of the molecular system leads to an internuclear potential with a minimum at $R_e$, a sharp rise towards infinity as the nuclei are brought together, and a less sharp rise towards the dissociation limit as the separation is increased (Fig. 1).
Footnote 2. PE curves for diatomic systems fall mainly into two categories: those with appreciable minima (bound states) and those that exhibit a very shallow minimum (vdW states) or none at all (repulsive states).
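The qualitative picture above (a minimum at $R_e$, a steep short-range wall, and a slower rise towards the dissociation limit $D_e$) can be illustrated with a minimal sketch. It assumes a Morse form, which the review itself discusses only later (Section 3.5.4); the function name and all numerical parameters below are hypothetical and chosen purely for illustration, not taken from the review.

```python
import math


def morse_potential(r, d_e, r_e, beta):
    """Morse model potential, with U(r_e) = 0 and U -> d_e as r -> infinity.

    r, r_e in angstrom; d_e in cm^-1; beta in 1/angstrom.
    """
    return d_e * (1.0 - math.exp(-beta * (r - r_e))) ** 2


# Hypothetical illustrative parameters (assumptions, not data from the review).
D_E = 100.0   # well depth, cm^-1
R_E = 4.0     # equilibrium internuclear separation, angstrom
BETA = 1.2    # range parameter, 1/angstrom

if __name__ == "__main__":
    for r in (3.0, 3.5, 4.0, 5.0, 8.0, 20.0):
        u = morse_potential(r, D_E, R_E, BETA)
        print(f"R = {r:5.1f} A   U = {u:10.3f} cm^-1")
```

Note that a Morse function stays finite at R = 0, so it reproduces the "sharp rise" only qualitatively; it is used here because it is the simplest closed form with the three regions described in the text.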
Fig. 1. A diagram showing the potential energy, E, of a diatomic molecule vs. internuclear separation, R. Owing to its vibrations and rotations the molecule has v- and J-levels, respectively (see inset). For the sake of clarity, the individual J-levels are represented by shorter horizontal lines than the "pure" v-levels (J = 0). The potential well is characterized by the equilibrium internuclear separation, $R_e$, the well depth, $D_e$, or the dissociation energy, $D_0$. Three characteristic regions of R can be described separately: strong repulsion (short-range region, $R < R_e$), bound well (an intermediate region, vicinity of $R_e$), and attraction in the long-range limit ($R \gg R_e$).
Any mathematical expression representing the PE as a function of internuclear distance, R, ought to fulfil these simple conditions. In practice, however, it is very difficult to find a simple analytical representation that is derived from experimental data and, moreover, accurately describes the real interatomic potential in different regions of R (for an overview of several analytical representations see Section 3.5). Generally, the interatomic potential of a specific molecule can be characterized in different regions of R using an appropriate method of investigation (see Section 4). Combining all the experimental data available for a particular molecular system, one can develop a single function (or a hybrid function) that represents the interatomic potential in the widest attainable range of R.

2.2. Characteristics of diatomic molecule

There are several approximations of nuclear motion in a diatomic molecule. The simplest are the rigid rotator and harmonic oscillator approximations for rotations and vibrations, respectively. However, to take into consideration all aspects of the nuclear dynamics one has to employ more complex models, such as the non-rigid rotator and the anharmonic oscillator, which add the centrifugal force and anharmonicity to the simplest model [64]. Therefore, in molecular spectroscopy the anharmonically vibrating non-rigid rotator is the model most often used for nuclear motion in a diatomic molecule when interpreting experimentally observed spectra. Aside from nuclear motion, one has to consider the motion of the electrons. In the first approximation, the electronic and nuclear (vibrational and rotational) motions in the molecule can be treated independently, and their mutual interaction is considered as a first correction (Born–Oppenheimer approximation [65], Section 3.6.2). Consequently, it is justified to characterize a molecular state using the symmetry of the electronic part
(i.e. electronic wave function), and the nuclear part (i.e. vibrational, $v$, and rotational, $J$, quantum numbers). The total energy $E$ of the molecule is, to a very good approximation, a sum of three components,

\[ E = E_e + E_{vib} + E_{rot}, \tag{1} \]

where $E_e$, $E_{vib}$ and $E_{rot}$ are the electronic, vibrational and rotational energies, respectively. Traditionally, in the field of molecular physics and molecular spectroscopy the energy is expressed in wave number units (cm$^{-1}$), and Eq. (1) can be rewritten using so-called term values (see footnote 3),

\[ T = T_e + G(v) + F(J), \tag{2} \]

where $T_e$, $G(v)$ and $F(J)$ are the electronic, vibrational and rotational terms, respectively. This approximation has consequences for the definition of the parameters that characterize a diatomic molecule. Firstly, $T_e$ does not depend on the nuclear-motion characteristics $G(v)$ and $F(J)$. Secondly, the vibrational energy does not depend on the rotational energy, and can be expressed as

\[ G(v) = \omega_e (v + \tfrac{1}{2}) - \omega_e x_e (v + \tfrac{1}{2})^2 + \omega_e y_e (v + \tfrac{1}{2})^3 + \cdots, \tag{3} \]

where $\omega_e$, $\omega_e x_e$ and $\omega_e y_e$ are the vibrational frequency (from the harmonic oscillator approximation), the anharmonicity and the "second-order" anharmonicity, respectively ($\omega_e \gg \omega_e x_e$ and $\omega_e x_e \gg \omega_e y_e$). The vibrational characteristics are the same for a given electronic term described by $T_e$. The rotational term, $F(J)$, has the form (generally $F(J) < G(v)$)

\[ F(J) = B_v J(J + 1) - D_v J^2 (J + 1)^2 + \cdots, \tag{4} \]

where $B_v$ and $D_v$ are rotational constants ($D_v \ll B_v$), which reflect the influence of the nuclear moment of inertia (indicating the rigidity of the molecule) and of the centrifugal force (indicating the centrifugal stretching of the molecule), respectively. The subscript $v$ denotes that $B_v$ and $D_v$ are given for a particular vibrational state,

\[ B_v = B_e - \alpha_e (v + \tfrac{1}{2}) + \cdots, \tag{5} \]

\[ D_v = D_e + \beta_e (v + \tfrac{1}{2}) + \cdots, \tag{6} \]

where $\alpha_e$ (see footnote 4) and $\beta_e$ are constants that characterize the interaction between the vibrational and rotational motions ($\alpha_e < B_e$, $\beta_e < D_e$), and $B_e$ and $D_e$ refer to the completely vibrationless state of
Footnote 3. The wave number of an energy quantum (in cm$^{-1}$) is expressed as $\tilde{\nu} = E/hc$, where $h$ and $c$ are the Planck constant and the speed of light, respectively. Therefore, the electronic, $T_e$, vibrational, $G(v)$, and rotational, $F(J)$, term values correspond to $E_e/hc$, $E_{vib}/hc$ and $E_{rot}/hc$, respectively.

Footnote 4. If the potential of the electronic state is represented by a Morse function (Section 3.5.4) then $\alpha_e = 6B_e(\sqrt{\omega_e x_e B_e} - B_e)/\omega_e$; empirically it has been found [64] that $\alpha_e \approx (\omega_e x_e/\omega_e) B_e$, which is very useful in practical analyses of band spectra (Section 3.7).
the molecule and correspond to the equilibrium internuclear separation $R_e$. They are expressed by the formulas (see footnote 5):

\[ B_e = \frac{h}{8\pi^2 c \mu R_e^2}, \tag{7} \]

\[ D_e = \frac{4B_e^3}{\omega_e^2}, \tag{8} \]

where $\mu$ is the reduced mass of the molecule. Taking into consideration the interaction of vibration and rotation in the way described above (Eqs. (2)–(4)), the term values of an anharmonically vibrating non-rigid rotator can be expressed as follows:

\[ T = T_e + \omega_e (v + \tfrac{1}{2}) - \omega_e x_e (v + \tfrac{1}{2})^2 + \omega_e y_e (v + \tfrac{1}{2})^3 + \cdots + B_v J(J + 1) - D_v J^2 (J + 1)^2 + \cdots. \tag{9} \]

To provide as complete a description as possible of the molecular term values, it is necessary to determine all the electronic ($T_e$), vibrational ($\omega_e$, $\omega_e x_e$, $\omega_e y_e$) and rotational ($B_v$, $D_v$, $B_e$, $D_e$) constants either from the experimental data or from ab initio calculations.
3. Electronic structure of molecular states

In order to describe the electronic motion in the field of two fixed nuclei, it is necessary to take into consideration the fact that the actual rotation and vibration of the nuclei in a molecule take place simultaneously with the motion of the electrons, and that these different motions influence one another. The mutual interaction of vibrational and electronic motions is taken into account if the v-levels correspond to the rotationless eigenvalues of the Schrödinger equation for a mass moving in the potential given by the PE curve of the electronic state (Fig. 1, Section 2), since this PE curve represents the dependence of the electronic energy (including nuclear repulsion) on R. The mutual interaction of vibration and rotation has already been discussed above (Section 2.2). Therefore, it remains to consider the influence of the rotational and electronic motions on each other and to define the quantum numbers that describe molecular electronic terms. These quantum numbers depend on the character of the coupling between the different angular momenta in the molecule (electron spin $\mathbf{S}$, electronic orbital angular momentum $\mathbf{L}$, angular momentum of nuclear rotation $\mathbf{N}$) into one resulting $\mathbf{J}$ (disregarding the nuclear spin). The character of the coupling is determined by three types of interaction in a molecule: between $\mathbf{L}$ and the internuclear axis (i.e. the line joining the nuclei) that represents the electrostatic field produced by the two nuclear charges,
Footnote 5. Note that the centrifugal stretching constant $D_e$ (Eq. (8)) should not be confused with the well depth, $D_e$, of the molecular potential (Fig. 1).
Fig. 2. Vector diagram for Hund's coupling case (a). The solid-line ellipse represents the nutation of $\mathbf{J}$, while the much more rapid precessions of $\mathbf{L}$ and $\mathbf{S}$ about the internuclear axis are indicated by the broken-line ellipses. The precession of $\boldsymbol{\Omega}$ and $\mathbf{N}$ around the direction of $\mathbf{J}$ is not shown.
between $\mathbf{L}$ and $\mathbf{S}$, and between the total electronic angular momentum $\boldsymbol{\Omega}$ (see footnote 6) and $\mathbf{N}$. The classification of the different kinds of coupling was first made by Hund [66], according to whom one can distinguish five (Hund's) cases: from (a) to (e). Here, for the sake of brevity, only cases (a) and (c) will be briefly described, as they directly apply to the investigated MeRG and Me$_2$ (Me = Zn, Cd, Hg) molecules. Hund's case (b) applies particularly to light molecules (weak or zero coupling of $\mathbf{S}$ to the internuclear axis). For an explanation of the (b), (d), and rare (e) cases, the reader is referred to textbooks and articles on molecular spectroscopy (e.g. [64,67,68]).

3.1. Hund's coupling case (a)

In the most commonly encountered Hund's case (a), depicted in Fig. 2, it is assumed that the interaction of the nuclear rotation with the electronic motion is very weak ($\mathbf{N}$ is weakly coupled to $\mathbf{S}$ and $\mathbf{L}$), whereas the electronic motion itself, which is well defined even in the rotating molecule, is coupled very strongly to the internuclear axis ($\boldsymbol{\Omega}$ strongly coupled to $\mathbf{N}$). Consequently, $\mathbf{S}$ and $\mathbf{L}$ are individually coupled to the internuclear axis (strong electrostatic correlation), then their projections
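Footnote 6. The total electronic angular momentum about the internuclear axis, denoted by $\boldsymbol{\Omega}$ (with magnitude $\Omega\hbar$, where $\hbar = h/2\pi$), is obtained by adding $\boldsymbol{\Lambda}$ and $\boldsymbol{\Sigma}$, where $\boldsymbol{\Lambda}$ and $\boldsymbol{\Sigma}$ are the components of the $\mathbf{L}$ and $\mathbf{S}$ momenta (with magnitudes $\Lambda\hbar$ and $\Sigma\hbar$) along the internuclear axis, respectively ($\Omega = |\Lambda + \Sigma|$). Note that for molecules (as distinct from atoms) an algebraic addition is sufficient, since the vectors $\boldsymbol{\Lambda}$ and $\boldsymbol{\Sigma}$ both lie along the internuclear axis.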
on the axis form $\boldsymbol{\Omega}$, and finally $\boldsymbol{\Omega}$ and $\mathbf{N}$ form the resulting $\mathbf{J}$, which is constant in magnitude (in units of $\hbar$) and direction. $\boldsymbol{\Omega}$ and $\mathbf{N}$ rotate about the direction of $\mathbf{J}$. The rotational energy in Hund's case (a) is expressed by the formula

\[ F_v(J) = B_v [J(J + 1) - \Omega^2], \tag{10} \]

where $J = \Omega + N$, $N = 0, 1, 2, \ldots$, i.e. $J \geq \Omega$ [64]. Since $\boldsymbol{\Omega}$ is the component of $\mathbf{J}$, it follows that $J$ is integer when $\Omega$ is integer (i.e. for an even number of electrons), whereas $J$ is half-integer when $\Omega$ is half-integer (i.e. for an odd number of electrons). The total energy in excess of the vibronic contribution is $F_v(J)$, expressed by Eq. (10), plus $A\Omega^2$, where $A$ is essentially the constant of molecular spin–orbit coupling for a given electronic state; this term can therefore be included in the electronic energy $T_e$. The notation of electronic terms follows the general scheme $^{2S+1}(\Lambda)_{\Omega}$, where $(\Lambda) = \Sigma, \Pi, \Delta, \ldots$ for $\Lambda = 0, 1, 2, \ldots$, respectively (see footnote 7) or, in short, $^{2S+1}(\Lambda)$.

3.2. Hund's coupling case (c)

In certain cases, particularly for heavy molecules, the interaction between $\mathbf{L}$ and $\mathbf{S}$ may be stronger than the interaction with the internuclear axis (strong spin–orbit coupling). In this case, called Hund's case (c), $\boldsymbol{\Lambda}$ and $\boldsymbol{\Sigma}$ are not defined. $\mathbf{L}$ and $\mathbf{S}$ first form a resulting $\mathbf{J}_a$, which is then coupled to the internuclear axis with a component $\boldsymbol{\Omega}$ (Fig. 3). Then the total electronic angular momentum $\boldsymbol{\Omega}$ and the angular momentum of nuclear rotation $\mathbf{N}$ form the resulting angular momentum $\mathbf{J}$ just as in Hund's case (a). The energy expression for this coupling case is the same as for case (a) (Eq. (10)), except that the constant $A$ is so large that the manifold of levels appears as several distinct electronic states, rather than as a splitting of rotational levels in a singlet state. The notation of electronic terms in this case follows the scheme $^{2S+1}\Omega$ or, in short, the terms are simply denoted by the proper value of $\Omega$.

3.3. Wave function symmetries and electronic selection rules

In diatomic molecules, the quantum numbers introduced above (i.e. $\Lambda$, $S$, $\Sigma$ or $\Omega$) are not sufficient to classify the molecular electronic states and to specify the selection rules for transitions between two electronic energy states.
It is necessary to use the symmetry properties of the electronic eigenfunction $\psi_e$. The symmetry properties of $\psi_e$ depend on the symmetry properties of the field in which the electrons move. If the electronic eigenfunction $\psi_e$ of a non-degenerate state ($\Sigma$ state) either remains unchanged or changes sign upon reflection at any plane passing through both nuclei (see footnote 8), then the state is called a $\Sigma^+$ or a $\Sigma^-$ state, respectively. For degenerate states ($\Pi, \Delta, \ldots$ states), the symmetry operation is more complicated. However, it is always possible to find a combination of eigenfunctions ($\psi_e^+$ and $\psi_e^-$) that will change or remain unchanged upon the reflection, and therefore one also distinguishes $\Pi^+, \Pi^-, \Delta^+, \Delta^-, \ldots$ states.

Footnote 7. $\Sigma$ states (for which $\Lambda = 0$) should not be confused with the vector $\boldsymbol{\Sigma}$, which is the projection of $\mathbf{S}$ on the internuclear axis (Fig. 2).

Footnote 8. Any plane through the internuclear axis is a plane of symmetry.
Fig. 3. Vector diagram for Hund's coupling case (c). The solid-line ellipse represents the nutation of $\mathbf{J}$, while the much more rapid precessions of $\mathbf{L}$ and $\mathbf{S}$ about the direction of $\mathbf{J}_a$ are indicated by the broken-line ellipses. The precession of $\mathbf{J}_a$ about the internuclear axis, and of $\boldsymbol{\Omega}$ and $\mathbf{N}$ around the direction of $\mathbf{J}$, is not indicated.
If the two nuclei in the molecule have the same charge (which includes homonuclear molecules such as Hg$_2$), then in addition to the symmetry axis the molecule has a centre of symmetry (see footnote 9). In consequence of this symmetry, the electronic eigenfunctions remain either unchanged or only change sign when reflected at the centre. In the first case, the state to which the eigenfunction belongs is called an even state, in the second case an odd state (the subscripts 'g' and 'u' stem from the German "gerade" and "ungerade", respectively). Thus, for homonuclear molecules one has $\Sigma_g, \Sigma_u, \Pi_g, \Pi_u, \ldots$ states. The distinction between even and odd electronic states is independent of whether the molecule is homonuclear or not, provided the nuclei have the same charge. The quantum numbers and symmetry properties affect the spectra of diatomic molecules through the selection rules that hold for them. Since in this review only ro-vibrational transitions between two different electronic-energy states are considered and analysed, only the selection rules for these transitions, which are mostly of the electric dipole type, will be given:

A. General selection rules:
1. $\Delta J = 0, \pm 1$, with the restriction $J = 0 \nleftrightarrow J = 0$ (rigorous for electric dipole radiation);
2. $+ \leftrightarrow -$; $+ \nleftrightarrow +$; $- \nleftrightarrow -$ (positive terms combine only with negative and vice versa);
3. $g \leftrightarrow u$; $g \nleftrightarrow g$; $u \nleftrightarrow u$ (even electronic states combine only with odd).

Footnote 9. The field in which the electron moves remains unaltered by a reflection of the nuclei at the centre of symmetry.
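The general selection rules A.1–A.3 can be written down as a small predicate. This is a minimal sketch only: the function name, the argument layout, and the string encoding of parity ('+'/'-') and inversion symmetry ('g'/'u') are hypothetical choices made for illustration, and the g/u test is applied only when both labels are supplied (i.e. for nuclei of equal charge).

```python
def dipole_allowed(j_upper, j_lower, parity_upper, parity_lower,
                   inversion_upper=None, inversion_lower=None):
    """Check general electric-dipole selection rules A.1-A.3.

    j_upper, j_lower: total angular momentum quantum numbers J (integer or half-integer).
    parity_upper, parity_lower: '+' or '-'.
    inversion_upper, inversion_lower: 'g', 'u', or None when the nuclei
    carry different charges and no centre of symmetry exists.
    """
    # Rule A.1: Delta J = 0, +/-1, with J = 0 -/-> J = 0.
    delta_j = j_upper - j_lower
    if delta_j not in (-1, 0, 1):
        return False
    if j_upper == 0 and j_lower == 0:
        return False
    # Rule A.2: positive terms combine only with negative ones.
    if parity_upper == parity_lower:
        return False
    # Rule A.3: g <-> u only (checked only when both labels are defined).
    if inversion_upper is not None and inversion_lower is not None:
        if inversion_upper == inversion_lower:
            return False
    return True


if __name__ == "__main__":
    print(dipole_allowed(1, 0, '+', '-'))             # heteronuclear case
    print(dipole_allowed(0, 0, '+', '-'))             # violates J = 0 -/-> J = 0
    print(dipole_allowed(1, 0, '+', '-', 'g', 'g'))   # violates g <-> u
```

Leaving the inversion labels optional mirrors the text: rule A.3 is meaningful only when the nuclei have the same charge.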
B. Selection rules holding for Hund's case (a):
1. $\Delta\Lambda = 0, \pm 1$;
2. $\Sigma^+ \leftrightarrow \Sigma^+$, $\Sigma^- \leftrightarrow \Sigma^-$; $\Sigma^+ \nleftrightarrow \Sigma^-$ (relevant only for $\Sigma \leftrightarrow \Sigma$ transitions);
3. $\Delta S = 0$ (only states with the same multiplicity combine with one another; see footnote 10);
4. $\Delta\Sigma = 0$ (the component of the spin along the internuclear axis does not alter);
5. $\Delta\Omega = 0, \pm 1$;
6. if $\Omega = 0$ for both electronic states, then $\Delta J = 0$ is forbidden for $\Omega = 0 \rightarrow \Omega = 0$.

C. Selection rules holding only for Hund's case (c):
1. $\Delta\Omega = 0, \pm 1$;
2. $0^+ \leftrightarrow 0^+$; $0^- \leftrightarrow 0^-$; $0^+ \nleftrightarrow 0^-$ (where '0' refers to the value of $\Omega$).

To complete the above discussion, it is necessary to recall the established rules for denoting the electronic energy terms and the transitions between them. First, the ground electronic state is always distinguished by the letter X (e.g. X0$_g^+$). Moreover, for the excited electronic states there is an alphabetical convention (which is commonly but not always used), i.e. letters placed in front of the term symbol start from A, B, C, ...; a, b, c, ... in order of increasing energy (e.g. A0$^+$, B1$_u$, ...; a0$^-$, b1$_g$) (see footnote 11). For all types of transitions, the upper state is given first and the lower state second (e.g. D1$_u$ ← X0$_g^+$ for excitation, B1 → X0$^+$ for fluorescence). Finally, the lower- and upper-state ro-vibrational levels $(v, J)$ are marked by $(v'', J'')$ and $(v', J')$, respectively. More detailed aspects of allowed and forbidden electronic transitions are discussed in Ref. [64].

3.4. 12-group metal–rare gas molecules and 12-group metal dimers

The ZnRG, CdRG, HgRG, Zn$_2$, Cd$_2$ and Hg$_2$ ground-state diatoms form real vdW complexes, as the RG and 12-group Me atoms have closed-shell spherical electronic configurations, i.e. have no valence electrons. When no valence forces of either the homopolar or the heteropolar kind are acting between two atoms, a very weak attraction between them still remains. This attraction is responsible for the long-range force present, and is called the vdW force (see footnote 12). London [2,3] has treated these forces on the basis of quantum mechanics and has shown that they are due to the perturbation of the repulsive ground state by the higher electronic states of the system consisting of two atoms. This perturbation at large R gives rise to a PE variation as $-1/R^6$. At
Footnote 10. This prohibition holds less and less rigorously with increasing interaction of $\mathbf{S}$ and $\boldsymbol{\Lambda}$, i.e. with increasing nuclear charge. For example, triplet–singlet transitions are strictly forbidden in H$_2$, but in Cd$_2$ or HgRG the $^3\Pi_u \leftarrow {}^1\Sigma_g^+$ or A0$^+$($^3\Pi$) ← X0$^+$($^1\Sigma^+$) transitions, respectively, are observed.

Footnote 11. Labels A, B, C, ... and a, b, c, ... are for states of the same and of different multiplicity, respectively, than that of the ground state. However, this convention is sometimes broken and a general usage A, B, C, ... is applied.

Footnote 12. The very weak attraction between two neutral atoms is responsible for the deviations of the behaviour of a real molecular gas from the ideal gas laws. These deviations are represented by the well-known vdW equation, and therefore the residual attraction is called a vdW force. For an early review, the reader is referred to [69].
Fig. 4. Induced dipole–induced dipole (London dispersion) interaction. It is due to the electron density fluctuations that occur in every atom and give rise to an instantaneous dipole moment in one atom. Subsequently, this dipole induces an instantaneous dipole moment in the neighbouring atom and the two dipoles attract one another. Because the directions and magnitudes of these dipole moments are correlated, a non-zero average interaction takes place and manifests itself at large atom–atom separation in a molecule. The orientations of both induced dipoles are depicted with arrows.
smaller R, the strong repulsion of the closed-shell atoms becomes dominant, resulting in a shallow minimum at relatively large R (see footnote 13). The London (dispersion) forces are present in every molecular state, including excited ones. However, in general the valence forces overshadow them. The dispersion forces are noticeable only when the valence forces are very weak or non-existent, i.e. for closed-shell spherical atoms or, in general, at large R. As a result of these dispersion forces, two atoms will attract one another very weakly at large R due to the induced (fluctuating) dipole–induced dipole interaction, and repel one another strongly at small R as their electron clouds overlap (Fig. 4). This is the case, e.g., for an Me (Zn, Cd or Hg) atom and an RG atom, or for two Me atoms, with filled-shell electronic configurations ($4s^2$, $5s^2$, $6s^2$, $1s^2$, $2p^6$, $3p^6$, $4p^6$ and $5p^6$ for the Zn, Cd, Hg, He, Ne, Ar, Kr and Xe ground-state configurations, respectively). The vdW-type interaction will dominate the long-range tail of a diatomic molecular potential if at least one of the two neutral atoms forming the molecule is in an S electronic energy state (e.g. Hg($6^1S_0$) + Ar($3^1S_0$); Cd($5^3P_1$) + Ne($2^1S_0$)) [237]. If the two atoms are sufficiently far apart that the overlap of their electron clouds is negligible, the interatomic potential in the long-range region can be written as

\[ U(R) = D - \sum_{k=3}^{\infty} \frac{C_{2k}}{R^{2k}} = D - \frac{C_6}{R^6} - \frac{C_8}{R^8} - \frac{C_{10}}{R^{10}} - \cdots, \tag{11} \]

where $D$ is the dissociation energy limit relative either to the well minimum or to the energy of the lowest ground-state vibrational level $v = 0$ (Fig. 1), so that $D = D_e$ or $D = D_0$, respectively. For
Footnote 13. Traditionally these attractive forces are called dispersion forces because, as first revealed by London, they depend on the same quantities as the dispersion of light by the RG atoms.
Fig. 5. Energies of the nsnp $^1P_1$ and nsnp $^3P_J$ states of Zn, Cd and Hg (n = 4, 5 and 6, respectively), and of the 5s6s $^3S_1$ state in Cd. The wavelengths associated with transitions from the ground state to the $^1P_1$ and $^3P_1$ energy states, as well as from the 5s5p $^3P_1$ to the 5s6s $^3S_1$ energy state in Cd, are expressed in ångströms (Å). The diagram also shows a comparison between the magnitudes of the spin–orbit splitting in the Me $^3P_J$ multiplets (in rectangles).
large R, one power of R will usually dominate and the potential can be approximated as $U(R) = D - C_m/R^m$, where $m$ is in general not an integer, and $m = 6$ for pure vdW interaction. The additional terms represent contributions arising from the interactions of higher-order instantaneous multipoles, such as dipole–quadrupole ($\sim 1/R^8$), and quadrupole–quadrupole and dipole–octupole ($\sim 1/R^{10}$) interactions, etc. Although at large R only the leading term makes a substantial contribution to the attractive energy, at smaller R near $R_e$ it is found [4] that the higher-order terms may contribute as much as 20% of the total dispersion energy. At still smaller R, these higher-order terms might be expected to become increasingly important.

3.5. Potential energy curves

It is convenient to begin the discussion of the PE curves of the MeRG and Me$_2$ vdW systems by comparing the atomic energy diagrams of the Me atoms, two of which (or one of which together with a ground-state RG atom) form the molecule under consideration. Fig. 5 presents the atomic energy levels drawn for the nsnp $^1P_1$ and nsnp $^3P_J$ configurations, where n = 4, 5 and 6 for Zn, Cd and Hg, respectively, as well as for the 5s6s $^3S_1$ configuration in Cd. The electric dipole transitions from the $^1S_0$ ground state to the $^1P_1$ and $^3P_1$ (and $6^3S_1$ in Cd) excited states are also depicted, showing the corresponding wavelengths in air expressed in ångströms (1 Å = 0.1 nm = $10^{-10}$ m) [70]. All of the transitions shown in Fig. 5 (except $6^1P_1$–$6^1S_0$ in Hg, which is in the VUV) take place in the UV or VIS region of electromagnetic radiation and are accessible with conventional dye lasers, using either the fundamental frequency or conversion of the fundamental to the second or third harmonic. Therefore, laser spectroscopy of the MeRG and Me$_2$ molecular states that correlate asymptotically with the Me-atom $^1P_1$ and $^3P_1$ excited states is feasible.
Moreover, using a pump-and-probe approach one can investigate higher (Rydberg) states correlating asymptotically with, e.g., the Cd-atom $6^3S_1$ excited state.
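The long-range expansion of Eq. (11), and the statement that higher-order terms matter more as R decreases, can be explored with a short sketch. The function names and the dispersion coefficients used below are hypothetical placeholders (in cm$^{-1}$ Å$^n$), chosen only to show the trend; they are not coefficients from the review, and the series is truncated after the $C_{10}$ term.

```python
def long_range_potential(r, d, c6, c8=0.0, c10=0.0):
    """Eq. (11) truncated after the C10 term: U(R) = D - C6/R^6 - C8/R^8 - C10/R^10."""
    return d - c6 / r ** 6 - c8 / r ** 8 - c10 / r ** 10


def dispersion_fractions(r, c6, c8, c10):
    """Fraction of the (truncated) dispersion energy carried by each term at separation r."""
    terms = {6: c6 / r ** 6, 8: c8 / r ** 8, 10: c10 / r ** 10}
    total = sum(terms.values())
    return {power: value / total for power, value in terms.items()}


if __name__ == "__main__":
    # Hypothetical coefficients, illustration only (cm^-1 * angstrom^n).
    C6, C8, C10 = 1.0e6, 5.0e6, 3.0e7
    for r in (5.0, 10.0, 50.0):
        fractions = dispersion_fractions(r, C6, C8, C10)
        higher_order = fractions[8] + fractions[10]
        print(f"R = {r:5.1f} A: higher-order share = {100.0 * higher_order:.2f}%")
```

With any positive coefficients the higher-order share falls monotonically with increasing R, which is the behaviour described in the text.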
Fig. 6. A schematic diagram showing the correlation of the Me atomic $^3P_J$ multiplet (where J = 0, 1, 2 is the total atomic angular momentum) with the MeRG molecular spin–orbit levels for two extremes: Hund's case (c), i.e. high spin–orbit coupling, on the left, and Hund's case (a), i.e. low spin–orbit coupling, on the right. The majority of MeRG systems correspond to a transition from Hund's pure case (a) for $R \to \infty$ and/or $U(R) \to 0$ towards Hund's case (c) for $R = R_e$ and $U(R_e) = D_e$. Thick lines indicate molecular states investigated by the author (see Section 6).
3.5.1. Heteronuclear molecules

In a real molecular system, such as the heteronuclear MeRG complex, a particular coupling case (Hund's case (a) or Hund's case (c)) is applicable. Which one applies depends on the degree of the spin–orbit interaction in the Me atom, which is proportional to $Z^4$, where Z is the atomic number, i.e. the positive charge on the Me-atom nucleus. For the nsnp $^3P_J$ multiplets the L–S coupling is large in Hg (4265 cm$^{-1}$), intermediate in Cd (1142 cm$^{-1}$) and small in Zn (385 cm$^{-1}$) [71]. Its magnitude should be compared with the bond strengths $D_e$ of the MeRG electronic states correlating with the corresponding atomic asymptotes. For example, because the L–S coupling in the Hg(6s6p $^3P_J$) multiplet is larger than the $D_e$ of the HgRG states that correlate with this multiplet, a Hund's case (c) treatment of the electronic interactions at $R_e$ is appropriate. Fig. 6 shows a correlation diagram for Hund's case (a) and Hund's case (c) drawn for the 12-group Me-atom nsnp $^3P_J$ multiplet. For instance, the Hg($6^3P_1$) atom can form two electronic states with the RG atom. The first, the A state ($\Omega = 0$ and (+) symmetry), corresponds to a pure $\Pi$-orientation ($\pi$-alignment of the Hg 6p electronic density distribution, see below), similar to the Hund's case (a) $^3\Pi$ state where the L–S interaction is small compared to $D_e$. The second, the B state ($\Omega = 1$), cannot be described so simply because of L–S mixing with the high-lying $\Omega = 1$ ($\Pi$) state correlating with the Hg($6^3P_2$) asymptote. In effect, going from low L–S (Hund's case (a)) to high L–S (Hund's case (c)) interactions, the multiplets of the $^3\Sigma^+_{\Omega=1}$ and $^3\Pi_{\Omega=1}$ states mix strongly near their $R_e$ values. Thus, the B($\Omega = 1$) state can in fact be described as approximately 50% $\Sigma$ ($\sigma$-alignment) and 50% $\Pi$ ($\pi$-alignment) orientations ($\Sigma$–$\Pi$ Hund's case (a) mixing) with respect to the RG-atom $1s^2$ or $np^6$ symmetric orbital (see below).
Fig. 7. A schematic diagram representing how degenerate molecular electronic states with an nsnp configuration are split by the four terms $V_{RG}$, $e^2/\hat{R}_{sp}$, $\hat{a}\hat{\ell}_z\hat{s}_z$, and $\hat{a}(\hat{\ell}^+\hat{s}^- + \hat{\ell}^-\hat{s}^+)/2$ in the molecular Hamiltonian (Eq. (12)). The right-hand side shows the atomic energy levels, which can be defined as the convergence limits of the molecular eigenstates when the interaction $V_{RG}$ becomes zero. Thick lines indicate molecular states investigated by the author (see Section 6).
A more detailed picture is obtained from the Hamiltonian for the Me(nsnp) + RG configuration:

\[ H = H_0 + V_{RG} + \frac{e^2}{\hat{R}_{sp}} + H_{LS}, \tag{12} \]

where $H_0$ is a zero-order term for the two ns and np electrons (no interaction included), $V_{RG}$ is an interaction (repulsive, i.e. exchange, and attractive, i.e. dispersion) between the ns and np electrons and the RG atom, $e^2/\hat{R}_{sp}$ is the exchange interaction between the ns and np electrons (Coulombic interaction and the Pauli exclusion principle), with $\hat{R}_{sp}$ representing the distance between the ns and np electrons, and finally $H_{LS} = \sum_i \hat{a}_i \hat{\boldsymbol{\ell}}_i \cdot \hat{\mathbf{s}}_i$ represents a single-electron spin–orbit interaction, where $\hat{a}_i$ acts only on the radial part of a wave function and $\hat{\boldsymbol{\ell}}_i$ and $\hat{\mathbf{s}}_i$ are single-electron operators (e.g. [67,68]). For a heteronuclear diatom such as MeRG, $H_{LS}$ can be expressed as $\hat{a}[\hat{\ell}_z\hat{s}_z + (\hat{\ell}^+\hat{s}^- + \hat{\ell}^-\hat{s}^+)/2]$. As pointed out, the operator $\hat{a}$ depends on $Z_{eff}$, the effective charge on a nucleus in the molecule; therefore $\hat{a}$ becomes larger as $Z_{eff}$ increases, and, in general, $Z_{eff}$ becomes larger with increasing Z. Fig. 7 illustrates how the $V_{RG}$, $e^2/\hat{R}_{sp}$, and $H_{LS}$ interactions split the energy levels in an MeRG system. Due to the interaction $V_{RG}$ between the excited Me atom and the ground-state RG atom, the nsnp configuration splits into $\Sigma$ and $\Pi$ states. Then, the exchange interaction $e^2/\hat{R}_{sp}$ between the ns and np electrons generates the energy difference between the triplet and the singlet states. Finally, the spin–orbit interaction $H_{LS}$ causes two types of splitting. Firstly, all the states are split into $\Omega = 0$, 1 and 2 due to the diagonal-interaction term $\hat{a}\hat{\ell}_z\hat{s}_z$ (which couples only states with the same values of $\Lambda$, $\Sigma$, and $\Omega$, while S can either be the same or differ by $\pm 1$). Secondly, the two degenerate states $^3\Sigma^+_{0,1}$ and $^3\Pi_{0^{\pm}}$ each split further into two states through the off-diagonal $\hat{a}(\hat{\ell}^+\hat{s}^- + \hat{\ell}^-\hat{s}^+)/2$ interaction term, which couples configurations with the same value of $\Omega$, but with $\Delta\Lambda = \pm 1$, $\Delta\Sigma = \mp 1$, and $\Delta S = 0, \pm 1$. It is clearly
Fig. 8. A partial PE diagram for the CdNe molecule showing PE curves for the ground and excited states correlated with the Cd($5^1S_0$) + Ne($2^1S_0$), Cd($5^3P_1$) + Ne, Cd($5^1P_1$) + Ne and Cd($6^3S_1$) + Ne atomic asymptotes. All potentials are represented by Morse functions with parameters reported in Refs. [44,45]. Solid arrows with wavelengths corresponding to the centres of excitation from the X0$^+$ to the A0$^+$($5^3P_1$), B1($5^3P_1$) and D1($5^1P_1$) excited states and from A0$^+$ to the E1($6^3S_1$) Rydberg state, as well as fluorescence regions to the repulsive part of the ground state (dashed arrows), are depicted. Full (X0$^+$, A0$^+$, D1) and open (B1) circles represent ab initio points of Czuchaj and Stoll [72]. For the molecular states the Hund's case (c) notation is used.
seen that the last interaction causes the above-mentioned $\Sigma$–$\Pi$ mixing in those triplet states with the same value of $\Omega$ and the same total parity (+ or −), $^3\Pi_1$/$^3\Sigma^+_1$ and $^3\Pi_{0^-}$/$^3\Sigma^+_{0^-}$, and thus causes them to become mixed $\Sigma$–$\Pi$ states as the L–S coupling increases (see footnote 14). Fig. 8 presents an example of a PE diagram drawn for the ground and several excited electronic energy states of the heteronuclear CdNe molecule. Parameters determined in Ref. [45] were used. Similar diagrams could be drawn for the remaining CdRG as well as ZnRG and HgRG complexes, with the only difference that the molecular transitions would be centred at the wavelengths that correspond to the resonance $n^3P_1$–$n^1S_0$ and $n^1P_1$–$n^1S_0$ atomic transitions in Zn (n = 4) and Hg (n = 6), as depicted in Fig. 5. For the sake of simplicity, the potentials in Fig. 8 are represented by Morse functions (Section 3.5.4) in the whole range of R. As explained above, the molecular energy states correlate with four atomic asymptotes that correspond to the ground state and three excited states of the Cd atom, each combined with the ground state of the Ne atom: Cd($5^1S_0$) + Ne($2^1S_0$), Cd($5^3P_1$) + Ne($2^1S_0$), Cd($5^1P_1$) + Ne($2^1S_0$), and Cd($6^3S_1$) + Ne($2^1S_0$). The molecular states are denoted by X0$^+$ (or $^1\Sigma^+$ in Hund's case (a)), A0$^+$ (or $^3\Pi$), B1 (or $^3\Pi$ + $^3\Sigma^+$), D1 (or $^1\Pi$) and E1 (or $^3\Sigma^+$) (see footnote 15), and in Fig. 8 the Hund's case (c) notation is used. As analysed for the HgRG molecule, the B1 state has both $\Sigma$ and $\Pi$ components and can be expected to have a minimum (in contrast to a pure $^3\Sigma^+$ state, which is purely repulsive). The A0$^+$ state ($5^3P_1 + 2^1S_0$
Footnote 14. The $\hat{a}\hat{\ell}^+\hat{s}^-$-type interaction can also couple the $^1\Pi_1$/$^3\Sigma^+_1$ and $^1\Sigma^+$/$^3\Pi_{0^+}$ states, but it will not be considered here.

Footnote 15. The Hund's case (c) notation will be mostly used throughout this review when appropriate, and the Me-atom energy state with which the particular molecular state correlates, rather than the corresponding notation in Hund's case (a), will be given in parentheses (i.e. D1($5^1P_1$) rather than D1($5\,^1\Pi$)).
J. Koperski / Physics Reports 369 (2002) 177 – 326
asymptote) has predominantly $\Pi$ character and is expected to feature a more pronounced minimum than $B1$. Similarly, the singlet $D1$ state ($5\,^1P_1 + 2\,^1S_0$ asymptote) has purely $\Pi$ character, while the triplet $E1$ state ($6\,^3S_1 + 2\,^1S_0$ asymptote) is a $^3\Sigma^+$ state. The $X0^+$ ground state ($^1S_0 + {}^1S_0$ asymptote) is dominated in its long-range tail by a pure $1/R^6$ vdW interaction (Section 3.4), and a balance with the short-range repulsive force produces a very shallow minimum at $R_e$. The molecular transitions are centred at the energies that correspond to the $5\,^3P_1$–$5\,^1S_0$, $5\,^1P_1$–$5\,^1S_0$ and $6\,^3S_1$–$5\,^3P_1$ transitions in the Cd atom (see also Fig. 5). As will be presented below (Section 6.1), some of the molecular transitions are red-shifted (e.g. $A0^+ \leftarrow X0^+$), some are blue-shifted (e.g. $B1 \leftarrow X0^+$), and some occupy both sides of the corresponding atomic transition (e.g. $D1 \leftarrow X0^+$). It should be mentioned that Fig. 8 shows only those excited states that are directly accessible from the ground state via bound–bound ro-vibrational transitions. There is one more repulsive state, $0^+$ (or $^1\Sigma^+$ in Hund's case (a)), with a very shallow minimum, which correlates with the Cd($5\,^1P_1$) + Ne($2\,^1S_0$) asymptote. However, because of the difference between the $0^+$ and ground-state equilibrium internuclear separations, excitation to this particular state would produce only an unstructured dissociative continuum [72]. There are also several other excited metastable molecular states correlating with two other atomic asymptotes, $5\,^3P_0$ and $5\,^3P_2$, belonging to the $5\,^3P_J$ multiplet in atomic Cd. They are denoted in Hund's cases (c) or (a), respectively: $a0^-$ (or $^3\Pi^-$) for that which correlates with $5\,^3P_0$, and $b2$ (or $^3\Pi$), $c1$ (or $^3\Pi$), and $d0^-$ (or $^3\Sigma^+$) for those which correlate with $5\,^3P_2$. Taking into consideration the selection rules that hold only in Hund's case (c) (points C.1 and C.2 in Section 3.3), it is possible to excite only one of these states directly from the ground state, i.e. $c1$($5\,^3P_2$). As shown by Amano et al. [73] and Kurosawa et al. [74], the allowed $c1 \leftarrow X0^+$ transition dipole moment in the HgAr molecule is small, so that the persistence time of the fluorescence from the excited $v'$-levels could be longer than 6 ms and the excited molecule would escape from the supersonic-beam detection region before it fluoresces. So far, that kind of $c1 \leftarrow X0^+$ excitation has not been observed in experiments employing MeRG supersonic beams. The remaining $b2$ and $d0^-$ states, as well as the $c1$ or even higher-lying $E1$ states, are also optically accessible by employing pump-and-probe methods and making use of one of the higher-lying molecular states (e.g. $1$($7\,^3S_1$) in HgAr [73,75], or $A0^+$ and $B1$ in CdRG (RG = Ne, Ar, Kr)) as an intermediate one.

It is worthwhile to quote a simple qualitative explanation that allows a better understanding of the repulsion and attraction in a variety of molecular states of the MeRG complex. This depends on how the electron density distribution of the excited-state Me atomic p-orbital is oriented with respect to that of the RG atom and to the internuclear axis in the molecule. Fig. 9 shows such an example for the ZnNe $4p\sigma$ and $4p\pi$ orbitals in the $^3\Sigma^+$ and $^3\Pi$ states, respectively, correlating with the Zn($4s^2$) ground and Zn($4s4p$) excited states. In the ground $^1\Sigma^+$ molecular state, the spherically symmetric $4s^2$ Zn and $2p^6$ Ne orbitals act upon each other with strong repulsion and a weak induced dipole–induced dipole interaction (see also Fig. 4) in the short- and long-range limits, respectively. The excited $^3\Sigma^+$ and $^3\Pi$ molecular states correspond to the excitation of one of the zinc $4s^2$ electrons to the $4p$ orbital (resulting in the $4s4p$ configuration). Occupation of this orbital produces two distinct configurations in the complex. Orientation of the p-orbital perpendicular to the internuclear axis ($4p\pi$) results in a stronger attraction ($^3\Pi$ state), while the excited $^3\Sigma^+$ state, corresponding to orientation of the p-orbital along the internuclear axis ($4p\sigma$), has a smaller bond energy. This appears to be due to the ability of the RG atom to "feel" the dispersive attraction of the diffuse p-orbital down to much shorter distances for the $\pi$ orientation before repulsion begins. For the $\Sigma$ states,
Fig. 9. Schematic representation of the electron density distributions for the $\Sigma$ and $\Pi$ states of the ZnNe molecule. The $^1\Sigma^+$ and $^3\Sigma^+$, $^3\Pi$ molecular states correlate with the Zn($4s^2$) ground and Zn($4s4p$) excited atomic configurations, respectively. Contours represent the Zn $4s^2$ (ground state) and $4p$ (excited state) valence electron orbitals and the $2p^6$ electronic cloud of the ground-state Ne atom. The scheme also represents the interaction for all other RG-atom orbitals ($3p^6$, $4p^6$ and $5p^6$ for Ar, Kr and Xe, respectively) as well as for the $1s^2$ closed-shell He orbital.
where the diffuse p-orbital is pointing directly towards the RG atom, electron–electron repulsion sets in at larger distances, resulting in very shallow potential wells.

3.5.2. Homonuclear molecules

The second category of diatomic molecules presented here are homonuclear Me$_2$ vdW complexes. Fig. 10 presents a partial PE diagram for the Hg$_2$ molecule, drawn to show the ground and excited electronic energy states correlated with the Hg($6\,^3P_1$) + Hg($6\,^1S_0$), Hg($6\,^3P_2$) + Hg($6\,^1S_0$) and Hg($6\,^1P_1$) + Hg($6\,^1S_0$) atomic asymptotes. The electronic energy states shown in Fig. 10 were investigated in Refs. [51,60–62]. Fig. 10 includes only those molecular states to which a direct electric-dipole excitation from the ground state is possible. It occurs in the UV region, for wavelengths from approximately 2700 Å down to 2000 Å. For the sake of simplicity, the excited-state potentials are represented by Morse functions in the whole range of $R$. Additionally, the ground-state potential is represented by a general form of the Lennard–Jones (L–J($n-6$)) function, where $n = 6.21$ [62] (Eq. (20)). What distinguishes this example from that of the heteronuclear MeRG molecule discussed above is the additional 'g' and 'u' symmetry of the electronic wave function. Here, the molecular states are denoted by $X$ ($\Omega = 0_g^+$ or $^1\Sigma_g^+$, in Hund's case (c) or (a), respectively), $D$ ($\Omega = 1_u$ or $^3\Sigma_u^+$), $F$ ($\Omega = 0_u^+$ or $^3\Pi_u$), $E$ ($\Omega = 1_u$ or $^3\Pi_u$) and $G$ ($\Omega = 0_u^+$ or $^1\Sigma_u^+$). In Fig. 10 the rigorous Hund's case (c) notation, as the only appropriate one, is used.^16 Because spectroscopic investigation of this molecule dates back to the early 1910s [12], the alphabetical convention traditionally used to label the Hg$_2$ molecular states differs from one investigator to another. However, the one used in Fig. 10 has been established since the late 1970s (e.g. [76–79]) with only one exception: the newly characterized $E1_u$ state [61] was omitted in earlier classifications, where the letter E labelled a different state. Fig. 10 also shows wavelengths corresponding to the centres of excitation bands from the $X0_g^+, v''=0$ to the $F0_u^+$, $D1_u$, $E1_u$ and $G0_u^+$

^16 Although in the heavy Hg$_2$ molecule the Hund's case (c) coupling is in effect, it is justified to quote the respective Hund's case (a) parent.
Fig. 10. A partial PE diagram for the Hg$_2$ molecule showing PE curves for the ground and excited states correlated with the $6\,^3P_1 + 6\,^1S_0$, $6\,^3P_2 + 6\,^1S_0$ and $6\,^1P_1 + 6\,^1S_0$ atomic asymptotes. All excited-state potentials are represented by Morse functions with parameters reported in Refs. [60–62]. The ground-state potential is represented by a Lennard–Jones ($n-6$) function ($n = 6.21$). Solid arrows with wavelengths corresponding to the centres of excitation bands from the $X0_g^+, v''=0$ to the $F0_u^+$($6\,^3P_1$), $D1_u$($6\,^3P_1$), $E1_u$($6\,^3P_2$) and $G0_u^+$($6\,^1P_1$) excited states, as well as fluorescence regions (grey-shaded rectangles with dashed arrows), are depicted. In the case of the deeper $D1_u$ and $G0_u^+$ states, ranges of $v'$-levels probed from the ground state (so-called FC-"windows") are also shown. Full circles represent ab initio points of Dolg and Flad [92] for the $X0_g^+$ state. Full circles ($F0_u^+$, $E1_u$) and open circles ($D1_u$, $G0_u^+$) represent ab initio results of Czuchaj et al. [93] for the excited states. For the molecular states the Hund's case (c) notation is used.
excited states, as well as two fluorescence bands ($D1_u \rightarrow X0_g^+$ and $G0_u^+ \rightarrow X0_g^+$) from the deeper excited electronic states to the repulsive part of the ground-state potential after a selective excitation of a $v'$-level in the upper state. There are also a number of other excited molecular states with 'g' symmetry that correlate with the $6\,^3P_1$ atomic asymptote, as well as those with 'g' and 'u' symmetries that correlate with the $6\,^1P_1$ and two other asymptotes, $6\,^3P_0$ and $6\,^3P_2$, belonging to the $6\,^3P_J$ multiplet. They are shown in Fig. 11 and were extensively investigated in Hg-vapour cell experiments by Krause and co-workers [79–85] as well as Mrozowski [86], Drullinger and co-workers [87,88], Komine and Byer [89], Mosburg and Wilke [90], and Callear and co-workers [91] in order to determine whether the Hg$_2$ system could be used as an efficient laser medium (excimer). The states are denoted as follows (the Hund's case (a) parent is given in parentheses): $B1_g$ ($^3\Pi_g$) and $A0_g^+$ ($^3\Pi_g$) (correlated with $6\,^3P_1$); $2_u$ ($^3\Pi_u$), $0_u^-$ ($^3\Pi_u$), $2_g$ ($^3\Pi_g$), $1_g$ ($^3\Sigma_g^+ + {}^1\Pi_g$ mixing), and $0_g^-$ ($^3\Sigma_g^+$) (correlated with $6\,^3P_2$); as well as $A0_g^-$ ($^3\Pi_g$) and $C0_u^-$ ($^3\Sigma_u^+$) (correlated with $6\,^3P_0$). The states that correlate with the $6\,^1P_1$ asymptote are $1_g$ ($^1\Pi_g + {}^3\Sigma_g^+$), $0_g^+$ ($^1\Sigma_g^+$) and $1_u$ ($^1\Pi_u$). It is important to mention here that for the Hg$_2$ excimer studied in vapour cells, the $A0_g^{\pm}$ states serve as a metastable-population reservoir created using a collisionally induced photoassociation process of the $A0_g^+$, $B1_g$ and $D1_u$ states followed by their collisional relaxation (the $A0_g^{\pm} \rightarrow X0_g^+$ decay is forbidden). The population of the reservoir is also depleted by radiative losses due to the $D1_u \rightarrow X0_g^+$ transition as well as by Hg$_3$ formation and its radiation in the visible [87,88]. The description of the ground and low-lying excited molecular states of the homonuclear Hg$_2$ molecule presented above holds also for the Zn$_2$ and Cd$_2$ complexes investigated in Refs. [58,59]. However, for
Fig. 11. A schematic diagram showing the correlation of the $6\,^1S_0$, $6\,^3P_J$ (where $J = 0, 1, 2$ is the total atomic angular momentum), and $6\,^1P_1$ Hg-atomic asymptotes with the Hg$_2$ molecular spin–orbit levels for Hund's case (c), i.e. strong spin–orbit coupling. Hund's case (a), i.e. weak spin–orbit coupling, states are also indicated. Thick lines indicate molecular states studied by the author. Here, the capital-letter labelling rule is broken and only the lowest A, B, C labels are used. Next, the labelling continues for the states D, E, F, G. This convention is adhered to in this work.
Cd$_2$ and Zn$_2$ the number of experimentally identified excited states and transitions is smaller than that for the Hg$_2$ complex. In both Zn$_2$ and Cd$_2$ molecules only the $0_u^+$($n\,^3P_1$) state ($n = 4$ and 5, respectively), as well as $1_u$($5\,^3P_2$) in Cd$_2$, were excited in supersonic-beam experiments from the shallow ground-state potential [58,59,94–96]. So far, there is no such evidence for the $1_u$($n\,^3P_1$) and $0_u^+$($n\,^1P_1$) states in Zn$_2$ and Cd$_2$, and $1_u$($4\,^3P_2$) in Zn$_2$, which in the case of Hg$_2$ lend themselves to excitation from the $X0_g^+$ state. The singlet $0_u^+$($n\,^1P_1$) state in Zn$_2$ and Cd$_2$ (e.g. [97–101]), the triplet $1_u$($4\,^3P_1$) state in Zn$_2$ [102], and the triplet $0_u^+$($5\,^3P_1$), $1_u$($5\,^3P_1$) and $1_u$($5\,^3P_2$) states in Cd$_2$ [103–105] were investigated by various groups in vapour-cell and solid-RG experiments. The model for interacting atomic electron density distributions shown in Fig. 9 as an example representing the interaction in heteronuclear MeRG molecules is also valid for the closed-shell orbitals in the Hg–Hg (as well as Zn–Zn and Cd–Cd) interaction. The $4s^2$, $5s^2$ and $6s^2$ spherical ground-state
orbitals of the Me atom replace the ground-state orbital of the RG atom and should interact similarly with the respective ground $ns^2$ and excited $nsnp$ atomic orbitals in the ground and excited molecular states, respectively. In the very weakly bound ground state of the Me$_2$ complex, the long-range interaction is dominated by vdW forces [58,62], as expected from the simple consideration of the closed-shell atomic configurations. However, as will be discussed below, recent ab initio calculations of the components of the interaction energy in the ground state of the mercury dimer [106] show that short-range induction effects play a significant role in the stabilization of Hg$_2$, which distinguishes the mercury dimer from its RG counterparts. Because of these short-range induction effects, the mercury dimer may be regarded as an intermediate case between a weakly bound vdW molecule and a chemically bound species (e.g. [107–109]). Very recently, the same behaviour has been inferred from ab initio calculations of Zn$_2$ and Cd$_2$ [110,111] and awaits experimental verification. This will be discussed in the next section as well as in Section 6.

3.5.3. Ab initio calculations

From the theoretical point of view, ab initio methods constitute the best possible approach for determining properties and interatomic potentials of molecules in their ground and excited electronic states. The ab initio methods have developed considerably over the last decade and their precision is increasing rapidly with the advent of modern computers. Unfortunately, standard ab initio calculations, which in principle can be very accurate provided a sufficient number of electronic configurations are taken into account, rapidly become difficult as the number of electrons in the system increases. The difficulty grows even more when one wants to determine PE curves of higher-lying excited molecular states, especially of excited states with the same symmetry as the ground state. For heavy diatomics such as Hg$_2$ the calculations also represent a major challenge for testing various relativistic all-electron treatments. For the many-electron MeRG and Me$_2$ (Me = Zn, Cd, Hg) molecular systems discussed here (among them are molecules consisting of one or two heavy atoms, Table 1) the difficulties can be overcome by employing calculation methods that apply the concept of pseudopotentials. The methods take advantage of the fact that the 12-group Me and RG atoms possess closed-shell structures, which allows such a two-atom complex to be approximated by a two-valence-electron model system. This reduces the necessary computational effort as well as the relative accuracy. Pseudopotential calculations were introduced on a large scale by Baylis for alkali–RG [112] and mercury-dimer [113] interatomic potentials. Since then, they have drawn broad attention and have been applied by theoreticians to the light (RG = He, Ne) [114] and heavy (RG = Ar, Kr, Xe) [115] MeRG systems as well as to the Me$_2$ [93,116–118] complexes.

Apart from the above-mentioned motivations for ab initio calculations, a reason that is practically always present, especially for the Zn$_2$, Cd$_2$ and Hg$_2$ complexes, is their prospective application as laser active media, in analogy with lasers based on the RG dimers and RG halides [24]. As stated above, both the RG and 12-group Me dimers have ground states that are essentially repulsive with only shallow vdW minima. Therefore, lasing action could be possible on bound–free transitions from several bound excited states, with the dissociative ground state facilitating the population inversion. Hence, it is justified to study these molecules theoretically in order to determine not only interatomic potentials and spectroscopic constants but also, e.g., oscillator strengths, absorption and emission coefficients, and transition dipole moments between the corresponding states considered as
Table 1
Atomic weights $m_{\mathrm{Me}}$ and $m_{\mathrm{RG}}$, and reduced masses $\mu_{\mathrm{MeRG}} = m_{\mathrm{Me}} m_{\mathrm{RG}}/(m_{\mathrm{Me}} + m_{\mathrm{RG}})$ and $\mu_{\mathrm{Me_2}} = m_{\mathrm{Me}}/2$ of the MeRG and Me$_2$ molecules, respectively (Me = Zn, Cd, Hg; RG = rare gas) investigated in Refs. [40–63]. All values are in atomic mass units (a.m.u.), 1 a.m.u. = 1.66043 × 10⁻²⁷ kg

RG atom | He | Ne | Ar | Kr | Xe
$m_{\mathrm{RG}}$* | 4.003 | 20.18 | 39.95 | 83.80 | 131.30

Me atom | $m_{\mathrm{Me}}$* | $\mu_{\mathrm{MeHe}}$ | $\mu_{\mathrm{MeNe}}$ | $\mu_{\mathrm{MeAr}}$ | $\mu_{\mathrm{MeKr}}$ | $\mu_{\mathrm{MeXe}}$ | $\mu_{\mathrm{Me_2}}$
Zn | 65.37 | 3.77 | 15.42 | 24.80 | 36.72 | 43.64 | 32.69
Cd | 112.40 | 3.87 | 17.11 | 29.47 | 48.01 | 60.56 | 56.20
Hg | 200.59 | 3.92 | 18.34 | 33.31 | 59.11 | 79.36 | 100.30

* Atomic weights are masses of natural elements [119].
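The reduced masses of Table 1 follow directly from the two definitions given in its caption, and a minimal Python sketch can be used to recompute them from the listed atomic weights. The function and variable names below are illustrative only and do not appear in the original work; the homonuclear case is treated as the equal-mass limit of the general formula, which gives $m_{\mathrm{Me}}/2$ (for Zn this evaluates to 32.69 a.m.u.).

```python
# Atomic weights in a.m.u. as listed in Table 1 (natural isotopic mixtures).
M_ME = {"Zn": 65.37, "Cd": 112.40, "Hg": 200.59}
M_RG = {"He": 4.003, "Ne": 20.18, "Ar": 39.95, "Kr": 83.80, "Xe": 131.30}

AMU_KG = 1.66043e-27  # kg per a.m.u., the value quoted in the caption of Table 1


def reduced_mass(m1, m2):
    """Reduced mass m1*m2/(m1+m2), in the same units as the inputs."""
    return m1 * m2 / (m1 + m2)


def reduced_mass_table():
    """Return {molecule name: reduced mass in a.m.u.} for all MeRG and Me2 species."""
    table = {}
    for me, m_me in M_ME.items():
        for rg, m_rg in M_RG.items():
            table[me + rg] = reduced_mass(m_me, m_rg)
        # Homonuclear dimer: the m1 == m2 limit of the same formula, i.e. m/2.
        table[me + "2"] = reduced_mass(m_me, m_me)
    return table


if __name__ == "__main__":
    for name, mu in reduced_mass_table().items():
        print(f"{name:5s} {mu:7.2f} a.m.u. = {mu * AMU_KG:.4e} kg")
```

The value in kilograms is the one needed whenever the reduced mass enters a formula in SI units, for instance the Morse constant of Eq. (14).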
potential laser systems. Some of the results boded ill (for Cd$_2$ [120,121], Hg$_2$ [122]) and some boded well (for Hg$_2$ [93]) for the projected laser operation on specific dimer bands. Investigations of mercury clusters [123–125] have documented for the first time the gradual transformation of the bonding character with size, among others, in the Hg$_N$ systems. For $2 \le N \le 8$–13 they appear to be weakly bound vdW clusters, then the bonding changes to covalent for $30 \le N \le 70$ [125], and then rather abruptly to metallic for $N \ge 20$–80 [123,124] or $N = 100$ [125]. Hence, ab initio investigation of bonding properties as a function of the cluster size seems to be justified and necessary [126]. Another motivation, already mentioned in the previous section, is the search for an answer to whether these complexes in their ground states form pure vdW molecules or whether some other kind of bonding is admixed. The problem was first raised by Neumann and Krauss [127] in their ab initio study of open-shell systems. Recent results of Kunz et al. [106] and Schautz et al. [128] suggest that covalent bonding contributions at the equilibrium distance may play an important role. It will be shown that for the mercury dimer these new postulates are confirmed in two different experiments: laser spectroscopy in a supersonic beam performed in Ref. [62] and depolarized collision-induced light scattering of Bonechi et al. [129,130]. Table 2 collects ab initio methods known to the author that have been employed in calculations of interatomic potentials of the MeRG and Me$_2$ complexes discussed here. The methods themselves
Table 2
Comparison of ab initio methods applied in calculation of interatomic potentials and spectroscopic constants of the MeRG and Me$_2$ molecules (Me = Zn, Cd, Hg; RG = rare gas)

Molecules studied | Type of calculation employed | Calculated electronic-energy-state potentials | Ref.
MeHe (Me = Zn, Cd) | Effective Hamiltonian and $\ell$-dependent pseudopot. at the valence level | Excited correl. with $np\,^3P_J$, $np\,^1P_1$ at. asympt.; spin–orbit coupl. | [144]
ZnRG (RG = He, Ne, Ar, Kr, Xe) | Complete-active-space multiconfigur. self-consistent field/complete-active-space multiref. second-order perturb. theory (CASSCF/CASPT2); relat. effects incl. | $X\,^1\Sigma^+$, excited correl. with $4p\,^3P_J$, $4p\,^1P_1$ at. asympt.; spin–orbit coupl. | [133]
ZnRG (RG = He, Ne, Ar, Kr, Xe) | Coupled-cluster with single and double excit. and perturb. contrib. of connected triple excit. (CCSD(T)); large-scale relat. Zn$^{20+}$ and RG$^{8+}$ pseudopot. | $X\,^1\Sigma^+$ | [134]
ZnRG (RG = He, Ne, Ar, Kr, Xe) | CASSCF/CASPT2 calc., Zn$^{20+}$ and RG$^{8+}$ pseudopot.; relat. effects incl. | $X\,^1\Sigma^+$, excited correl. with $4p\,^3P_J$, spin–orbit coupl. | [321]
CdRG (RG = Ar, Kr, Xe) | $\ell$-dependent statistical semi-empirical pseudopot. | $X\,^1\Sigma^+$, excited correl. with $5p\,^3P_J$, $5p\,^1P_1$ at. asympt.; spin–orbit coupl. | [115]
MeHe, MeNe (Me = Cd, Hg) | $\ell$-dependent pseudopot. at self-consistent field/configur. interact. (SCF/CI) level; relat. effects incl. | $X\,^1\Sigma^+$, excited correl. with $np\,^3P_J$, $np\,^1P_1$, $(n+1)s\,^3S_1$ at. asympt.; spin–orbit coupl. approximately | [114]
CdRG (RG = He, Ne, Ar, Kr, Xe) | Valence CASSCF/CASPT2 calcul.; Cd$^{2+}$ and RG$^{8+}$ pseudopot.; relat. effects incl. | $X\,^1\Sigma^+$, excited correl. with $5p\,^3P_J$, $5p\,^1P_1$, $6s\,^3S_1$, $6s\,^1S_0$ at. asympt.; spin–orbit coupl. | [72]
CdRG (RG = He, Ne, Ar, Kr, Xe) | CASSCF/CASPT2 calc., Cd$^{20+}$ and RG$^{8+}$ pseudopot.; relat. effects incl. | $X\,^1\Sigma^+$, excited correl. with $5p\,^3P_J$, $5p\,^1P_1$; spin–orbit coupl. | [131] [132]
CdRG (RG = He, Ne, Ar, Kr, Xe) | Valence CCSD(T) calcul.; quasirelat. energy-consistent small-core Cd$^{20+}$ and RG$^{8+}$ pseudopot. | $X\,^1\Sigma^+$ | [145]
CdAr | CASSCF; no vdW interact. incl. | $X\,^1\Sigma^+$, excited correl. with $5p\,^3P_J$, $5p\,^1P_1$, $6s\,^3S_1$ at. asympt.; no spin–orbit coupl. | [146]
HgHe, HgNe | (MR)-CI(SD) CI procedure + $\ell$-dependent | Excited correl. with $6p\,^3P_J$, $6p\,^1P_1$, $7s\,^3S_1$, $7s\,^1S_0$ at. asympt.; spin–orbit coupl. partly incl. | [147]
HgRG (RG = He, Ne, Ar, Kr, Xe) | Valence CASSCF/CASPT2 calcul.; Hg$^{20+}$ and RG$^{8+}$ pseudopot.; relat. effects incl. | $X\,^1\Sigma^+$, excited correl. with $6p\,^3P_J$, $6p\,^1P_1$, $7s\,^3S_0$, $7s\,^3S_1$; spin–orbit coupl. | [135] [137]
HgRG (RG = He, Ne, Ar, Kr, Xe) | Valence CCSD(T) calcul.; quasirelat. energy-consistent small-core Hg$^{20+}$ and RG$^{8+}$ pseudopot. | $X\,^1\Sigma^+$ | [136]
Zn$_2$, Hg$_2$ | Full polarization (POL)-CI; relat. effects incl. | $X\,^1\Sigma_g^+$, excited correl. with $np\,^3P_J$, $np\,^1P_1$ at. asympt.; spin–orbit coupl. | [71]
Zn$_2$, Cd$_2$ | SCF/CI | $X\,^1\Sigma_g^+$, excited correl. with $np\,^3P_J$, $np\,^1P_1$ at. asympt.; spin–orbit coupl. incl. for Cd$_2$ | [120]
Zn$_2$ | Independent SCF/CI | $X\,^1\Sigma_g^+$, excited correl. with $4p\,^3P_J$, $4p\,^1P_1$ at. asympt. | [148]
Zn$_2$ | Multireference pseudopot. | $X\,^1\Sigma_g^+$, excited correl. with $4p\,^3P_J$, $4p\,^1P_1$, $5s\,^3S_1$, $5s\,^1S_0$ at. asympt. | [116]
Cd$_2$ | Multiconfiguration (MC)-SCF | Excited correl. with $5p\,^3P_J$, $5p\,^1P_1$, $6s\,^3S_1$, $6s\,^1S_0$ at. asympt.; spin–orbit coupl. not incl. | [121]
Cd$_2$ | MRCI(SD) + semi-empirical $\ell$-dependent pseudopot. | $X\,^1\Sigma_g^+$, excited correl. with $5p\,^3P_J$, $5p\,^1P_1$, $6s\,^3S_1$, $6s\,^1S_0$ at. asympt.; spin–orbit coupl. | [117] [118]
Hg$_2$ | Hartree–Fock (HF) and restricted Hartree–Fock (RHF) + semi-empirical pseudopot. | $X\,^1\Sigma_g^+$ | [113]
Hg$_2$ | Investigation of relat. effects on the potential at short and intermediate distances | $X\,^1\Sigma_g^+$ | [149]
Hg$_2$ | MCSCF | $X\,^1\Sigma_g^+$, excited correl. with $6p\,^3P_J$, $6p\,^1P_1$ at. asympt.; spin–orbit coupl. | [150]
Hg$_2$ | Large-scale CI procedures (valence) + ab initio relat. effective potentials (core) | $X\,^1\Sigma_g^+$, excited correl. with $6p\,^3P_J$, $6p\,^1P_1$ at. asympt.; spin–orbit coupl. | [122]
Hg$_2$ | Pseudopot. local-density-approx. (LDA) | $X\,^1\Sigma_g^+$ | [126]
Hg$_2$ | CAS-MCSCF + 1st and 2nd order CI and relat. CI | $X\,^1\Sigma_g^+$, excited correl. with $6p\,^3P_J$, $6p\,^1P_1$, $7s\,^3S_1$, $7s\,^1S_0$ at. asympt.; spin–orbit coupl. | [151]
Hg$_2$ | Quasi-relat. pseudopot. with $n$th order Møller–Plesset (MP$n$), quadratic QCISD and QCISD(T) approx. | $X\,^1\Sigma_g^+$ | [152]
Hg$_2$ | All-electron Dirac–Fock–Slater SCF relat. | $X\,^1\Sigma_g^+$ | [153]
Zn$_2$, Cd$_2$, Hg$_2$ | RHF + MP$n$ ($n$ = 3, 4) + quadratic CI | $X\,^1\Sigma_g^+$ | [154]
Zn$_2$, Cd$_2$, Hg$_2$ | Pure diffusion quantum Monte Carlo (QMC) with relat. energy-consistent large-core pseudopot. and core-polarization potentials | $X\,^1\Sigma_g^+$ | [128]
Zn$_2$, Cd$_2$, Hg$_2$ | Relat. energy-consistent small-core pseudopot., large valence basis sets and coupled cluster singles and doubles correlation treatment | $X\,^1\Sigma_g^+$, spin–orbit coupl. incl. for Hg$_2$ | [110]
Hg$_2$ | CCSD(T) large-scale relat. pseudopot. | $X\,^1\Sigma_g^+$ | [155]
Hg$_2$ | CCSD(T) and relat. effective core potentials | $X\,^1\Sigma_g^+$ | [314]
Hg$_2$ | CASSCF relat. all-electron super molecular potential + damped dispersion energy | $X\,^1\Sigma_g^+$ | [106]
Hg$_2$ | SCF/MRCI + quasirelat. $\ell$-dependent pseudopot. | $X\,^1\Sigma_g^+$, excited correl. with $6p\,^3P_J$, $6p\,^1P_1$, $7s\,^3S_1$, $7s\,^1S_0$, $7p\,^3P_J$, $7p\,^1P_1$ at. asympt.; spin–orbit coupl. | [93]
Hg$_2$ | Scalar relativistic (SR) coupled cluster with correlation consistent uncorrelated basis set (cc-USB) | $X\,^1\Sigma_g^+$ | [318]
Hg$_2$ | Multireference single- and double-excitations configur. interact. (MRSDCI) | $X\,^1\Sigma_g^+$ | [156]
will not be characterized here, as this goes beyond the assumed framework of this review. The reader is referred to the literature. A comparison of the results of ab initio calculations with characteristics determined experimentally by the author and other investigators will be presented in Section 6. At this point of the discussion, examples of the results obtained by Czuchaj and Stoll for CdNe [72], and by Dolg and Flad [92] and Czuchaj et al. [93] for Hg$_2$, are presented together with the author's potentials in Figs. 8 and 10, respectively. It is worthwhile to point out that, at the present time, the all-electron Dirac–Fock–Slater self-consistent-field (SCF) relativistic calculations of Dolg and Flad [92] result in an Hg$_2$ ground-state PE curve almost identical to the experimental one [62]. This attests to real progress in the theoretical treatment of this "difficult" relativistic molecule. Concerning the calculation of excited-state potentials, the results reported by Czuchaj and Stoll [72] and Czuchaj et al. [131,132] for CdRG, Czuchaj and Krośnicki [133,321] and Czuchaj et al. [134] for ZnRG, Czuchaj et al. [135,137] and Czuchaj and Krośnicki [136] for HgRG, and Czuchaj et al. for Zn$_2$ [116], Cd$_2$ [117,118] and Hg$_2$ [93] are so far the closest to those determined experimentally. Ab initio calculations of other dispersion-interaction characteristics, such as the static dipole and quadrupole polarizabilities of the Zn, Cd, Hg, and RG atoms as well as the $C_6$, $C_8$ and $C_{10}$ long-range constants for the corresponding MeRG and Me$_2$ molecules, are documented in the literature (e.g. [106,127,138–140]). The relativistic and electron correlation effects were discussed thoroughly by Seth et al. [141], Desclaux et al. [142] and Lam [143], and it was demonstrated that both effects are important in Zn, Cd and Hg atoms, whilst the relativistic effects do not significantly affect the static dipole polarizabilities of RG atoms, $\alpha_{\mathrm{RG}}$.
Experimentally, apart from the inaccurate Me static dipole polarizabilities, $\alpha_{\mathrm{Me}}$, of Miller and Bederson for Zn, Cd and Hg atoms (±50% error margin for $\alpha_{\mathrm{Cd}}$ and $\alpha_{\mathrm{Hg}}$) [138], relatively accurate $\alpha_{\mathrm{Zn}}$ [157], $\alpha_{\mathrm{Cd}}$ [158], and $\alpha_{\mathrm{Hg}}$ [159] were determined, showing that the values of Ref. [138] should be corrected or used with great care.

3.5.4. The Morse potential

Before reviewing the commonly used analytical representations of diatomic-molecule PE curves, it is necessary to formulate criteria which must be satisfied by any function to be accepted as such a representation over the whole range of $R$. According to the general consideration presented in Section 2.1, the function (1) should asymptotically approach a finite value as $R \to \infty$, (2) should have a minimum at $R = R_e$, and (3) should become infinite at $R = 0$. The most useful and frequently used general-purpose analytical function describing the behaviour of an anharmonic oscillator is that introduced in 1929 by Morse [160].^17 It is given by two exponential terms

$$U_M(R) = D_e\left[e^{-2\beta(R-R_e)} - 2e^{-\beta(R-R_e)}\right], \tag{13a}$$

^17 Note that in the original paper of Morse [160] there is a misprint in formula (13) that has been recently discovered [161]. The denominator of the expression for the normalizing integral should contain $\Gamma(s+1)$ instead of $\Gamma(s-1)$, where $\Gamma$ is the Euler gamma function.
Fig. 12. Repulsive, $D_e e^{-2\beta(R-R_e)}$, and attractive, $-2D_e e^{-\beta(R-R_e)}$, parts of the Morse potential. The two parts generate a function with well-depth $D_e$ and equilibrium internuclear separation $R_e$, as expressed by Eq. (13).
where the first and second terms correspond to the repulsive and attractive parts of the potential, respectively (Fig. 12). The Morse potential can be expressed in a more compact form often used in the literature

$$U_M(R) = D_e\left[1 - e^{-\beta(R-R_e)}\right]^2, \tag{13b}$$

where the well-depth $D_e$ is defined as in Fig. 12,^18 and

$$\beta = \sqrt{\frac{8\pi^2 c\,\mu\,\omega_e x_e}{h}} \tag{14}$$

is a constant which depends on the reduced mass $\mu$ of the molecule and the potential anharmonicity. One of the advantages of the Morse function is that, when substituted into the Schrödinger equation, it allows the equation to be solved analytically. As eigenvalues $G(v)$, one obtains exact solutions expressed by Eq. (3) without cubic (i.e. $\sim(v + 1/2)^3$) and higher terms, and with anharmonicity

$$\omega_e x_e = \frac{\omega_e^2}{4D_e} \tag{15}$$

(all in cm$^{-1}$), just as required for the Birge–Sponer (B–S) extrapolation procedure to be valid (Section 4.1.1). The Morse potential accommodates a finite number of vibrational levels ($v = 0, 1, 2, \ldots, v_{\max}$) in the potential well. It should be noted that in the Morse function the well-depth $D_e$ is measured from the minimum of the PE curve, not from the lowest real $v$-level, so that $D_e$ is slightly greater than the real dissociation energy $D_0$ measured from the lowest level, $v = 0$ (Fig. 1). $D_e$ may be determined from $D_0$ using the relation

$$D_e = D_0 + \frac{\omega_e}{2} - \frac{\omega_e x_e}{4} + \frac{\omega_e y_e}{8} + \cdots. \tag{16}$$

^18 Eqs. (13a) and (13b) are equivalent up to the choice of the energy zero. To get Eq. (13a) one should subtract $D_e$ from Eq. (13b).
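As a rough numerical illustration of Eqs. (13)–(16), the following Python sketch evaluates both forms of the Morse function, the constant $\beta$ of Eq. (14) in SI units, and the quantities $D_e$, $D_0$ and $v_{\max}$ for a pure Morse oscillator (i.e. assuming $\omega_e y_e = 0$ and no higher terms). The function names and the numerical inputs in the example are hypothetical and are not taken from the original work; the Planck constant and the speed of light are the current CODATA values, while the a.m.u. conversion is the one quoted in Table 1.

```python
import math

H_PLANCK = 6.62607015e-34   # J s
C_LIGHT = 2.99792458e8      # m/s
AMU_KG = 1.66043e-27        # kg per a.m.u., value quoted in Table 1


def morse_13a(r, d_e, beta, r_e):
    """Eq. (13a): energy zero at the dissociation limit, minimum equal to -D_e."""
    x = math.exp(-beta * (r - r_e))
    return d_e * (x * x - 2.0 * x)


def morse_13b(r, d_e, beta, r_e):
    """Eq. (13b): energy zero at the bottom of the well, dissociation limit at +D_e."""
    return d_e * (1.0 - math.exp(-beta * (r - r_e))) ** 2


def beta_per_angstrom(mu_amu, omega_e_x_e):
    """Eq. (14) evaluated in SI units; omega_e_x_e in cm^-1, result in 1/Angstrom."""
    mu_kg = mu_amu * AMU_KG
    wexe_per_m = omega_e_x_e * 100.0            # cm^-1 -> m^-1
    beta_per_m = math.sqrt(
        8.0 * math.pi ** 2 * C_LIGHT * mu_kg * wexe_per_m / H_PLANCK
    )
    return beta_per_m * 1.0e-10                 # m^-1 -> Angstrom^-1


def morse_constants(omega_e, omega_e_x_e):
    """Return (D_e, D_0, v_max) for a pure Morse oscillator; energies in cm^-1."""
    d_e = omega_e ** 2 / (4.0 * omega_e_x_e)            # Eq. (15) solved for D_e
    d_0 = d_e - (omega_e / 2.0 - omega_e_x_e / 4.0)     # Eq. (16) with omega_e y_e = 0
    # Largest v for which v + 1/2 does not exceed omega_e / (2 omega_e x_e).
    v_max = int(math.floor(omega_e / (2.0 * omega_e_x_e) - 0.5))
    return d_e, d_0, v_max


if __name__ == "__main__":
    # Hypothetical constants for a weakly bound state (not fitted values).
    omega_e, omega_e_x_e, mu = 20.0, 0.5, 17.11
    d_e, d_0, v_max = morse_constants(omega_e, omega_e_x_e)
    beta = beta_per_angstrom(mu, omega_e_x_e)
    print(f"D_e = {d_e:.3f} cm^-1, D_0 = {d_0:.3f} cm^-1, v_max = {v_max}")
    print(f"beta = {beta:.4f} 1/Angstrom")
```

The two functions differ only by the constant $D_e$, which is what footnote 18 states; which of them is more convenient depends on whether energies are referred to the dissociation limit or to the potential minimum.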
Along with the above-named advantages, the Morse function also has its limitations in practice and theory. The potential curve to which it leads is of the correct general shape, with its minimum at $R_e$, and it approaches the dissociation limit exponentially as $R$ increases. As $R$ tends to zero, however, the PE represented by the Morse function rises to a very high value but remains finite. This defect is not important in practice; nevertheless, it should be noted. Since only three spectroscopically determined parameters, $D_e$, $\beta$ and $R_e$, are required ($R_e$ is to be determined from rotational analysis using the constant $B_e$, Eq. (7)), the function can often be used for molecular states for which spectroscopic data are limited. However, the Morse function cannot be adapted to give a more accurate fit that might be justified by more precise data. In diatomic molecular systems (such as MeRG or Me$_2$) it usually represents the real potential satisfactorily in the region of the potential well, particularly in the vicinity of $R_e$. As will be shown below, using examples of the author's results, this representation usually cannot be made to reproduce faithfully the shape of the PE curve in both the short- and long-range limits. Of course, there are some exceptions to this statement. Some investigators (e.g. [162,163]) modified the simple Morse function by incorporating additional parameters in order to make it more adequate in the short- and long-range regions.

3.5.5. The Lennard–Jones (n − m) potential

For weak interatomic forces, a Lennard–Jones L–J($n-m$) inverse-power potential form was first proposed by Mie in 1906 [5,6] and adopted by Lennard–Jones in the early 1920s [164] in the form

$$U_{\text{L–J}}(R) = \frac{C_n}{R^n} - \frac{C_m}{R^m}, \tag{17}$$

where $C_n$ and $C_m$ are repulsive and attractive constants, respectively ($n > m$).^19 It has been frequently applied, especially for $m = 6$, i.e. L–J($n-6$), not only because of its simplicity but also because there is a theoretical justification for the $1/R^6$ term in the attractive component of the function. As already pointed out, this is the leading term in the long-range attractive interaction (vdW dispersion interaction) between two closed-shell atoms. The $C_6$ vdW constant can be calculated for a given molecule using various exact (Section 3.5.3) or approximate methods (e.g. L–D [2–6], S–K [7], K–H [8] formulas), which will be discussed later (Section 4.4). For known $D_e$ and $R_e$ parameters, the L–J($n-m$) function can be written as [4]

$$U_{\text{L–J}}(R) = \frac{D_e}{n-m}\left[m\left(\frac{R_e}{R}\right)^{n} - n\left(\frac{R_e}{R}\right)^{m}\right], \tag{18}$$

with the long-range potential coefficient

$$C_m = \frac{n}{n-m}\,D_e R_e^{m}. \tag{19}$$

A commonly applied form of the L–J($n-m$) potential (18) is that for a dominating vdW long-range term ($m = 6$) [165,166]

$$U_{\text{L–J}}(R) = \frac{D_e}{n-6}\left[6\left(\frac{R_e}{R}\right)^{n} - n\left(\frac{R_e}{R}\right)^{6}\right]. \tag{20}$$

^19 The potential (17) was used by Lennard–Jones to explain the experimental temperature variation of viscosity in gases on the basis that both the repulsive and attractive parts of the molecular field varied according to an inverse power of the distance.
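The relation between the coefficient form, Eq. (17), and the $(D_e, R_e)$ form, Eqs. (18)–(20), can be checked numerically with a short Python sketch. The function names are illustrative and not part of the original work, and the well depth and equilibrium distance used in the example are hypothetical. The repulsive coefficient $C_n = m D_e R_e^{n}/(n-m)$ is not quoted explicitly in the text; it is obtained here by comparing the first terms of Eqs. (17) and (18), in the same way as Eq. (19) follows from the second terms.

```python
def lj_nm(r, d_e, r_e, n, m=6.0):
    """Eq. (18): L-J(n-m) potential with zero at R -> infinity and minimum -D_e at R_e."""
    if n <= m:
        raise ValueError("Eq. (17) requires n > m")
    s = r_e / r
    return d_e / (n - m) * (m * s ** n - n * s ** m)


def lj_coefficients(d_e, r_e, n, m=6.0):
    """Return (C_n, C_m) of Eq. (17) expressed through D_e and R_e."""
    c_n = m / (n - m) * d_e * r_e ** n   # from the repulsive term of Eq. (18)
    c_m = n / (n - m) * d_e * r_e ** m   # Eq. (19)
    return c_n, c_m


def lj_from_coefficients(r, c_n, c_m, n, m=6.0):
    """Eq. (17): the same potential written with explicit C_n and C_m."""
    return c_n / r ** n - c_m / r ** m


if __name__ == "__main__":
    # Hypothetical parameters (cm^-1 and Angstrom); n = 6.21 is the exponent
    # quoted in the text for the Hg2 ground state, n = 12 gives L-J(12-6).
    d_e, r_e = 100.0, 4.0
    for n in (6.21, 12.0):
        c_n, c_6 = lj_coefficients(d_e, r_e, n)
        print(f"n = {n}: U(R_e) = {lj_nm(r_e, d_e, r_e, n):.3f}, "
              f"C_n = {c_n:.4e}, C_6 = {c_6:.4e}")
```

A smaller $n$ gives a softer repulsive wall for the same $D_e$ and $R_e$, but also a larger $C_6$ through Eq. (19), so the exponent cannot be tuned to the short-range region without changing the long-range tail.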
The L–J(n − m) function (and its special case, L–J(n − 6)) has the advantage that the repulsive forces, which depend on the coefficient n, can be made sufficiently soft. Eq. (20) has been used by a number of researchers to represent interatomic vdW potentials for hetero- and homonuclear complexes in a wide range of $R$ (e.g. [62,167,168]). In the specific case of n = 12 describing the repulsive part, Eq. (17) takes the form of the best-known potential function for diatomic systems (see footnote 20). The L–J(12 − 6) potential was proposed by Behmenburg [169] and Hindmarsh et al. [170]

\[ U_{\mathrm{L\text{-}J}}(R) = \frac{C_{12}}{R^{12}} - \frac{C_6}{R^{6}}, \tag{21} \]
where the constant $C_{12}$ can be evaluated using so-called Hindmarsh radii defined for the individual atoms in the molecule. For given $D_e$ and $R_e$ parameters, the L–J(12 − 6) potential can be expressed as the particular case of Eq. (20)

\[ U_{\mathrm{L\text{-}J}}(R) = D_e\left[ \left(\frac{R_e}{R}\right)^{12} - 2\left(\frac{R_e}{R}\right)^{6} \right] \tag{22} \]

and the $C_6$ and $C_{12}$ coefficients are now defined as $2D_eR_e^{6}$ and $D_eR_e^{12}$, respectively.

3.5.6. The Maitland–Smith (n0, n1) potential

In 1973, Maitland and Smith [171] proposed a function for accurate representation of the ground-state interatomic potential of homonuclear RG2 and heteronuclear RG–RG vdW molecules [172–174]. The function has the form of Eq. (20), i.e. the L–J(n − 6) function with given $D_e$ and $R_e$ parameters, with the important modification that n becomes an $R$-dependent variable:

\[ n^{*}(R) = n_0 + n_1\left(\frac{R}{R_e} - 1\right), \tag{23} \]

where $n_0$ and $n_1$ are constants. This introduces an additional parameter $n_1$ (see footnote 21). The Maitland–Smith M–S($n_0$, $n_1$), i.e. L–J($n^{*}(R)$ − 6), function has been found to represent remarkably accurately not only the PE curves of the RG2 molecules, but also the repulsive parts of the MeRG molecules (e.g. [42,47–50,54,56,175]). This will be widely discussed later. However, there is one restriction concerning the relation between the $n_0$ and $n_1$ coefficients. If the requirement that $U_{\mathrm{M\text{-}S}}(R \to 0) \to \infty$ is to be satisfied, the condition $n_0 > n_1$ has to be fulfilled. Otherwise, for $n_0 < n_1$, at very small $R$ the $U_{\mathrm{M\text{-}S}}$ reaches a maximum and then tends to $-\infty$ as $R \to 0$.

Fig. 13 presents an illustrative comparison of the Morse (Eq. (13)), L–J(12 − 6) (Eq. (22)), M–S($n_0$, $n_1$) (Eqs. (20) and (23)), and single $C_6/R^6$ vdW functions, using as an example the HgKr ground-state PE curve plotted with characteristics determined in Ref. [56]. Because the comparison should be made individually in different regions of $R$, Figs. 13(a) and (b) show separately the short- and long-range regions, respectively. From the comparison, it is apparent that softness of the repulsive part of the potential (Fig. 13(a)) increases
Footnote 20. For diatomic systems the interaction energy depends only on the separation $R$ of the atomic nuclei.
Footnote 21. It has been found [171,173] that for homonuclear RG2 and heteronuclear RG–RG molecules $n_0 = 13$ in Eq. (23), with the only exception, $n_0 = 12$, for He2.
Fig. 13. HgKr ground-state PE curve represented by several functions and drawn using parameters (i.e. $\omega_e x_e$, $\omega_e$, $D_e$ and $R_e$) obtained in Ref. [56]: Morse, L–J(12 − 6), M–S($n_0 = 11.39$, $n_1 = 10.5$), and single $C_6/R^6$ vdW tail, where $C_6 = 0.711 \times 10^{6}$ cm$^{-1}$ Å$^{6}$; (a) and (b), repulsive and long-range parts of the potential, respectively.
from the L–J(12 − 6), through Morse, to M–S(11.39, 10.5), while in the long-range limit the Morse function converges to zero more rapidly than the $C_6/R^6$, M–S and L–J functions.

3.5.7. Combined Morse–vdW potential

From the preceding discussion one should acknowledge that there are specific potential forms that represent the real interatomic potential more satisfactorily in one specific region of $R$ than in another. A Morse function (13) can adequately represent the potential in the bound-well region, while the expression $\sim 1/R^{n}$, where $n > 6$ or is given by Eq. (23), is more appropriate for the repulsive, short-range part of the potential. As already pointed out, a single expression $-C_6/R^6$ in Eqs. (17) and (21) can, if necessary, be used separately to approximate the long-range behaviour of the interatomic potential according to the interaction that dominates in this region. To introduce higher-multipole contributions (Section 3.4, Eq. (11)) in the long-range representation one should employ higher terms, proportional to $-(1/R^{8} + 1/R^{10} + \cdots)$. However, there is always an open question of how to represent the potential in a wider range of $R$ with a single representation. York et al. [176] proposed a Morse–vdW potential in order to represent simultaneously the bound-well and long-range limit of the NaRG molecules. It has the form

\[ U_{\mathrm{M\text{-}vdW}}(R) = D_e\left(1 - e^{-\beta(R - R_e)}\right)^{2} - \left(1 - e^{-(R/R_c)^{12}}\right)\left(\frac{C_6}{R^{6}} + \frac{C_8}{R^{8}} + \cdots\right), \tag{24} \]

where the term $1 - \exp[-(R/R_c)^{12}]$ cuts off the long-range part and $R_c$ is chosen to yield a smooth transition between the Morse and long-range representations. Whilst $R_c$ is not unique, it has to be set properly and, as will be seen, it is located within the region of $R$ where the long-range approximation is valid.
The Morse–vdW potential was successfully applied to represent the ground states of NaRG (RG = Ar, Kr, Xe) [176,177], and the bound-well and long-range parts of the ground as well as excited states of the ZnRG (RG = Ar, Kr) [42,47], CdRG (RG = Ne, Ar, Kr) [45,46,48,49], and HgKr [56] complexes. Instead of the $1 - \exp[-(R/R_c)^{12}]$ term in formula (24), other investigators employ complex polynomials in $x = R/R_e$ (so-called spline functions), and the potential is represented by a Morse–spline–vdW (MSV) function. The MSV potential was applied to represent, among others, the ground-state PE curve of Ar2 [178].
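The behaviour of the M–S($n_0$, $n_1$) function at small $R$ and the role of the switching term in the Morse–vdW form can be illustrated with a short numerical sketch. It is written in plain Python under stated assumptions: the function names are hypothetical, the Morse exponent is denoted `beta`, and all parameter values in the usage part are placeholders rather than values from the cited references (apart from the pair $n_0 = 11.39$, $n_1 = 10.5$ quoted for HgKr in Fig. 13).

```python
import math


def maitland_smith(R, De, Re, n0, n1):
    """M-S(n0, n1) potential: Eq. (20) with the R-dependent exponent of Eq. (23).

    Note: the expression is singular where n*(R) equals 6 exactly.
    """
    n = n0 + n1 * (R / Re - 1.0)
    x = Re / R
    return De / (n - 6.0) * (6.0 * x**n - n * x**6)


def morse_vdw(R, De, Re, beta, Rc, C6, C8=0.0):
    """Combined Morse-vdW potential in the form of Eq. (24), truncated after C8."""
    morse = De * (1.0 - math.exp(-beta * (R - Re))) ** 2
    switch = 1.0 - math.exp(-((R / Rc) ** 12))  # ~0 for R << Rc, ~1 for R >> Rc
    return morse - switch * (C6 / R**6 + C8 / R**8)


if __name__ == "__main__":
    De, Re = 100.0, 4.0  # placeholder well depth and equilibrium distance
    # n0 > n1: repulsive wall rises; n0 < n1: potential turns over towards -infinity.
    print(maitland_smith(0.2, De, Re, 11.39, 10.5))
    print(maitland_smith(0.2, De, Re, 10.0, 12.0))
    # Far beyond Rc the vdW tail is fully switched on.
    print(morse_vdw(60.0, De, Re, beta=1.5, Rc=6.0, C6=1.0e6))
```

The two M–S evaluations show the sign change at small $R$ that follows from the condition $n_0 > n_1$ stated in Section 3.5.6.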
3.5.8. Other forms of combined potentials

Of the variety of analytical functions [179] that are available for interatomic-potential representations, it is worthwhile to point out the so-called Buckingham-type potential [180]

\[ U_B(R) = A e^{-bR} - \left(\frac{C_6}{R^{6}} + \frac{C_8}{R^{8}}\right), \tag{25} \]

where $A$ and $b$ are constants, and the repulsive wall (governed by the parameter $b$) is represented by a simple exponential function. The generalized $Ae^{-bR}$ exponential form, introduced by Born and Mayer in 1932 for crystal forces, has been most frequently used in representations of short-range interatomic energy [4]. There is an example of the single exponential form fitted to represent the repulsive wall of the ground- and excited-state potentials of the InAr complex [181]. Function (25) has been used to represent, among others, the PE curves of HgRG ground states [182], and ground and excited states of CdRG complexes [183]. Function (25), however, has a deficiency. Although the exponential term rises steeply as $R$ decreases, it remains finite at $R = 0$, so the dispersion term dominates at very small $R$, and the potential reaches a maximum and then tends to $-\infty$ as $R \to 0$ (similarly to $U_{\mathrm{M\text{-}S}}$ for $n_0 < n_1$). This can be overcome by damping the dispersion term (see below) at the cost of complicating the form of the function.

The Buckingham-type potential (25) with allowance for higher $-C_m/R^{m}$ terms in the dispersion part, and with a universal damping function $F(R) = 1 - \left[\sum_{k=0}^{2n} (bR)^{k}/k!\right] e^{-bR}$, where $k = 0, \ldots, 2n$, cutting down the asymptotic divergence of the long-range part for small values of $R$, is called the Tang and Toennies (T–T) potential [184]. When a different form of damping function $F(x)$ is used, with F(x) = exp[−((D/x)4 − 1)^2] for $x < D$ and $F(x) = 1$ for $x > D$, where $x = R/R_c$ and D and 4 are parameters, the potential is of Hartree–Fock-dispersion type (HFD) [185]. The HFD potential has been used to reproduce a wide range of properties of the interaction between two RG atoms [186,187].
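The deficiency of function (25) at small $R$, and its removal by damping, can be sketched numerically. The example below is plain Python with hypothetical function names and placeholder parameters. It assumes the standard T–T convention in which each $C_{2n}/R^{2n}$ term is multiplied by its own damping function of order $2n$; for small $bR$ the damping function is summed as a tail series to avoid round-off, which is an implementation choice of this sketch.

```python
import math


def buckingham(R, A, b, C6, C8=0.0):
    """Undamped Buckingham-type potential of Eq. (25)."""
    return A * math.exp(-b * R) - (C6 / R**6 + C8 / R**8)


def tt_damping(R, b, order):
    """T-T damping function 1 - exp(-bR) * sum_{k=0}^{order} (bR)^k / k!."""
    x = b * R
    if x > 30.0:
        # Direct evaluation is numerically safe for large bR.
        s = sum(x**k / math.factorial(k) for k in range(order + 1))
        return 1.0 - math.exp(-x) * s
    # Equivalent tail series exp(-x) * sum_{k>order} x^k / k!, stable for small x.
    k = order + 1
    term = x**k / math.factorial(k)
    total = 0.0
    for _ in range(500):
        total += term
        k += 1
        term *= x / k
        if term <= 1e-17 * total:
            break
    return math.exp(-x) * total


def damped_buckingham(R, A, b, C6, C8=0.0):
    """Eq. (25) with each dispersion term damped (assumed T-T convention)."""
    dispersion = tt_damping(R, b, 6) * C6 / R**6 + tt_damping(R, b, 8) * C8 / R**8
    return A * math.exp(-b * R) - dispersion


if __name__ == "__main__":
    A, b, C6 = 1.0e5, 3.0, 1.0e3  # placeholder parameters
    print(buckingham(0.01, A, b, C6))         # large and negative: the deficiency
    print(damped_buckingham(0.01, A, b, C6))  # stays close to A
```

At large $R$ the damping function tends to unity, so the damped and undamped forms share the same long-range tail.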
The T–T and HFD potentials have also been used to represent the ground state of Hg2 [129,130], NaKr [188] and NaXe [177] molecules. A general form of the Buckingham-type function for given $D_e$ and $R_e$ is called the exp(n, m) potential

\[ U_{\exp}(R) = \frac{D_e}{n-m}\left[ m\, e^{\,n(1 - R/R_e)} - n\left(\frac{R_e}{R}\right)^{m} \right], \tag{26} \]
where m has the same meaning as in Eq. (18), and the parameter n, like $b$ in Eq. (25), determines the steepness of the repulsive part of the potential. The long-range potential coefficient of the exp(n, m) potential has the same form (19) as for the general L–J(n − m) potential. For m = 6, Eq. (26) expresses the commonly known exp(n, 6) or exp-6 model potential

\[ U_{\exp}(R) = \frac{D_e}{n-6}\left[ 6\, e^{\,n(1 - R/R_e)} - n\left(\frac{R_e}{R}\right)^{6} \right], \tag{27} \]
which has also been extensively used [4]. The L–J(n − m) potential, especially in its L–J(12 − 6) form, has enjoyed a high degree of popularity since its introduction, although its close equivalent, the Buckingham-type exp(n, m) potential, can probably be better justified theoretically than the simple power law for the repulsive term. The mixture of power law and exponential does, however, pose some problems in establishing parameters and in the analytical treatment of molecular problems. For this reason, the L–J power potential originally
devised to treat RG gases won an early popularity, which it has still not lost. Also, in the investigations discussed here, the L–J(n − 6) potential as well as its direct variation, the M–S($n_0$, $n_1$) potential, play a primary role in representing the ground- and excited-state PE curves of the MeRG and Me2 molecules.

3.5.9. Hybrid potential

Empirical potentials such as those represented by the analytical functions presented above cover the entire range of $R$ piecewise. However, there is no single form of the interaction potential applicable in the whole range of $R$, as most forms lose their validity for either small or large $R$. The choice, therefore, depends on the problem under consideration. Thus, one can either deduce (usually from an experiment or other semi-empirical considerations) a reliable analytical function that describes the potential in the widest possible range of $R$, or construct a hybrid functional form comprising several separate representations for the short-$R$ region, the bound well and the long-range limit. If possible, the single representations can be joined smoothly using simple or more complicated functions to form a complex expression, as shown above for the Morse–vdW combined potential. In some cases, however, the only suggested form of analytical representation of a real PE curve is a hybrid form, as for example proposed for HgAr [54] and HgKr [56,175] as well as for LiHg [189] and Ne2 [190].

Summarizing the above discussion on the numerical, semi-empirical and empirical representations of an interatomic potential, it is necessary to stress that all the representations considered here conform to the simplest shape of the conventional PE curve of a diatomic molecule shown in Fig. 1, i.e. they fulfil the necessary criteria (1)–(3) postulated in Section 3.5.4. However, this is not the only conceivable form for the potential. PE curves with at least one maximum between the main minimum and the dissociation limit have been postulated [64].
Moreover, in some particular cases the possibility of actual multiple maxima and minima is not excluded [191].

3.5.10. Double-well potential

Fig. 14 shows the B1 excited-state potential of the CdKr molecule investigated recently [48–50]. A double-well character of the potential was predicted in ab initio calculations of Czuchaj and Stoll [72] (Fig. 14, [50]) and, subsequently, the experimental data confirmed this result (Section 4.1.3, Fig. 24) [49]. There have been several attempts to represent a double-well potential analytically using properly joined single-well functions. From the studies of Okunishi et al. [192], and Onda and Yamanouchi [193], related to Rydberg states in the HgNe molecule, it is known that a double-well potential representation may consist of two Morse potentials (13) joined smoothly by a polynomial expansion. This kind of representation can be used in simulation of the vibrational excitation spectrum, as shown in Refs. [50,192,193]. It has also been used in the analytical representation of the E1($6^{3}S_1$) Rydberg states in the CdRG (RG = Ar, Kr) molecules (see Section 6.1.3 and Fig. 40).

3.6. Vibrational structure of electronic transitions. Vibrational bands

Excitation and fluorescence spectra of the MeRG and Me2 molecules discussed here were recorded in the UV region of electromagnetic radiation, in the range of wavelengths from 1996 Å, i.e. the short-wavelength wing of the $G0_u^+ \leftarrow X0_g^+$ excitation spectrum in Hg2 [62], to approximately 3700 Å, i.e. the long-wavelength wing of the $A0^+_{v'=9} \to X0^+$ fluorescence spectrum in CdKr [49] (compare with Figs. 5, 8 and 10; this does not include the $E1 \leftarrow A0^+$ and $E1 \leftarrow B1$ transitions in CdNe, CdAr and
Fig. 14. Double-well B1-state hybrid potential of the CdKr molecule (thick line) considered in studies of Ref. [50]. The potential consists of (a) experimentally determined Morse potential ($R > 4.5$ Å) and (b) Morse potential approximating (c) ab initio points of Ref. [72] ($R < 3.6$ Å), joined with (d) a fourth-degree polynomial. A similar hybrid potential was used in simulation of the $B1 \leftarrow X0^+_{v''=0}$ progression in the CdKr excitation spectrum shown in Fig. 24.
CdKr). This corresponds to wave numbers in the approximate range from 27000 to 50100 cm$^{-1}$. The spectra are due to electronic transitions between ro-vibrational levels of the ground and excited states in these molecules. The wave numbers of the transitions (both in excitation and fluorescence spectra) are given by (compare with Eq. (2)):

\[ \nu = T' - T'' = (T'_e - T''_e) + [G(v') - G(v'')] + [F_{v'}(J') - F_{v''}(J'')] = \nu_e + \nu_{vib} + \nu_{rot}, \tag{28} \]
where $\nu_e = T'_e - T''_e$ is the electronic part (origin of the band system) (see footnote 22), which is constant for a given electronic transition (usually it is assumed that $T''_e = 0$), and $\nu_{vib} = G(v') - G(v'')$ and $\nu_{rot} = F_{v'}(J') - F_{v''}(J'')$ are the vibrational and rotational parts, respectively (the subscript $v$ is used for $F_v$ to emphasize that the rotational term $F$ is given for a particular vibrational level). The $\nu_{vib}$ and $\nu_{rot}$ vary from one transition to another. All terms in Eq. (28) are in cm$^{-1}$. The $G(v)$ and $F_v(J)$ terms are expressed using Eqs. (3) and (4), respectively. Since, in general, $F_v(J) < G(v)$, one can neglect $\nu_{rot}$ while discussing excitation and fluorescence spectra in this section, and consider transitions between the rotationless states ($F_{v'}(J') = F_{v''}(J'') = 0$), i.e. only the vibrational structure of the electronic transition (coarse structure). The $\nu_{rot}$ term will be taken into consideration while discussing the so-called fine (i.e. rotational) structure of the vibrational band.

3.6.1. Bound–bound and bound–free transitions

Fig. 15 shows a simplified diagram of transitions between ground- and excited-state vibrational levels in a diatomic molecule (a rotationless structure) in the excitation (Fig. 15(a)) and fluorescence (Fig. 15(b)) spectra. In the excitation, one can distinguish progressions (those starting from a common $v''$ or $v'$ level and terminating at different $v'$ or $v''$ levels are called $v'$- or
Footnote 22. A band system represents the totality of the transitions between two different electronic states of a molecule.
Fig. 15. A simplified diagram illustrating formation of (a) excitation, and (b) fluorescence spectrum in a diatomic molecule. In the excitation from the ground to the excited state one can distinguish progressions (the $v' \leftarrow v'' = 0$ progression is drawn as an example) and sequences, i.e. transitions with $\Delta v = v' - v'' = \mathrm{const}$ (the $\Delta v = 0$ sequence is shown as an example). The excited-state electronic term, $T'_e$, the energy corresponding to the excited-state dissociation limit, $D'$, the difference between the equilibrium internuclear separations, $\Delta R_e$, the well depths and dissociation energies for the ground ($D''_e$, $D''_0$) and excited ($D'_e$, $D'_0$) states, as well as the energies corresponding to the $v' = 0 \leftarrow v'' = 0$ band, $\nu_{00}$, and the atomic transition, $\nu_{at}$, are depicted. The fluorescence emitted after a selective excitation of a particular vibrational level $v'$ (transition with energy $\nu_{0v'}$) may comprise bound–free and bound–bound transitions. In the diagram, a rotationless structure is considered.
[Fig. 16 plot: Hg2 $F0_u^+ \leftarrow X0_g^+$; LIF (arb. units) versus laser wavelength (Å), 2535–2545 Å; panels (a), (b) and (c); sequences $\Delta v = 2, 1, 0, -1, -2$ labelled.]
Fig. 16. (a) $F0_u^+ \leftarrow X0_g^+$ experimental transition in the excitation spectrum of Hg2 studied in Ref. [60], showing $\Delta v = -2, -1, 0, 1, 2$ sequences measured on the long-wavelength side of the $6^{3}P_1$–$6^{1}S_0$ atomic transition. The bands formed by particular $\Delta v$ sequences are due to transitions schematically shown in (c). Computer simulation (b) presents the $\Delta v = 1$ and $-1$ sequences as well as the dominant $\Delta v = 0$ sequence (the feature in the middle). FWHM of the Lorentzian convolution function representing the laser linewidth is 0.3 cm$^{-1}$. As reported [60], the excited $F0_u^+$ and the ground states have similar $R_e$, $\omega_e$ and $\omega_e x_e$ characteristics. This causes the unusual shape of the bands. The presentation is made mainly to emphasize the difference between the $\Delta v = 1$ and $-1$ sequences as well as the similarity of the $\Delta v = -1$ sequence to a "shading" of a vibrational band, which is usually due to its rotational substructure.
$v''$-progressions, respectively [64]) and sequences, i.e. transitions with $\Delta v = v' - v'' = \mathrm{const}$. Two or more $v'$-progressions recorded simultaneously in one spectrum increase the accuracy of determination of the excited-state characteristics, while $v''$-progressions (so-called "hot" bands) introduce a possibility of direct determination of the ground-state characteristics (see for example an analysis of the $E1_u \leftarrow X0_g^+$ transition in Hg2 [61] (Fig. 23) or the $0_u^+ \leftarrow X0_g^+$ transition in Cd2 [58]). In Ref. [60] one can find a detailed analysis of the $\Delta v = -2, -1, 0, 1, 2$ sequences in the $F0_u^+ \leftarrow X0_g^+$ transition in Hg2 (Fig. 16). The sequences, particularly those for $\Delta v < 0$, form vibrational band profiles that can easily be misinterpreted as rotational structure of the band (so-called "shading" of the band, see below). Thus, a very careful analysis of the data, reinforced by a number of simulations of the spectral profiles, is often necessary to arrive at the correct interpretation. This peculiar case is illustrated in Fig. 16, in which trace (a) is taken from Ref. [60]. Additionally, to emphasize the similarity of the $\Delta v = -1$ sequence to a "shading" due to the rotational substructure, a simulation of the experimental trace is shown in part (b). The rotational "shading" of vibrational bands as well as the result shown in Fig. 16(a) will be discussed in detail below; they are mentioned here for the sake of illustration. In some particular cases, with a favourable F–C "window" for the excitation, a discrete $v'$- or $v''$-progression continues beyond the excited- and/or ground-state dissociation limits, respectively,
where it joins an unstructured continuum that corresponds to the dissociation of the molecule. This affords direct determination of the excited- and/or ground-state dissociation limit with respect to the energy of the level from which the excitation took place (for the determination of the excited-state dissociation limit see for example analyses of the $B1 \leftarrow X0^+$ excitation spectrum in CdNe [45], CdAr [46] or HgAr [53,54]). The fluorescence spectrum in Fig. 15(b) consists of two parts, bound–free and bound–bound (see footnote 23). After a selective excitation of a $v'$-level, the radiative transition can terminate on the repulsive part or on the discrete bound-level structure of the ground state. This will be discussed in detail later while analysing results presented in this review. Here, it will only be pointed out that spectra of this kind were investigated for the ZnAr and ZnKr [40,42,47], CdNe [45], CdAr [47] (Fig. 27) and CdKr [48,49], HgAr [53,54], HgKr [56], and Hg2 [61,62] molecules. Moreover, in the case of the CdNe, CdAr and CdKr complexes, two independent "channels" of the bound–free fluorescence terminating on the same part of the ground-state potential (Section 4.3, Fig. 29) were used, increasing dramatically the accuracy of determination of the repulsive part of the PE curve above the dissociation limit. In the case of the $A0^+ \to X0^+$ and $B1 \to X0^+$ bound–bound fluorescence in HgAr (Fig. 26) [53,54], the recorded data made possible a direct characterization of the ground-state potential well below the dissociation limit, supplementing information obtained from the excitation spectra. The character of the excitation and fluorescence spectra determines the method of their detection. An excitation spectrum is produced by scanning, for example, a dye laser through the structure of ro-vibrational levels while monitoring the total fluorescence signal using a suitable optical detection system (e.g. a photomultiplier).
A fluorescence spectrum is recorded following a selective excitation of a particular ro-vibronic level ($\nu_{0v'}$ in Fig. 15(b)) by scanning the resulting fluorescence with an appropriate spectral analyser (e.g. a monochromator). The ability to resolve a fine rotational structure in the excitation spectra depends on the spectral bandwidth of the laser. In the fluorescence spectra, the ability to resolve an energetically "dense" structure of the ro-vibrational levels close to the ground-state dissociation limit depends on the spectral resolution of the monochromator. Experimental details will be discussed in Section 5.

3.6.2. Franck–Condon principle and Born–Oppenheimer approximation

As seen in Fig. 15, all the vibrational transitions between the ground and excited states symbolically occur along vertical lines at the initial $R$-value. This and the other aspect, the intensity distribution in the vibrational structure, are consequences of the Franck–Condon (F–C) principle, which was formulated qualitatively in 1925 by Franck [194] and developed quantum-mechanically in 1928 by Condon [195,196]. The F–C principle says that the electronic transition in a molecule takes place so rapidly in comparison with the vibrational motion of the nuclei that immediately afterwards the nuclei still have very nearly the same relative position and velocity as before the electronic transition. The first requirement (the same position) means that the electronic transition occurs between points which lie on a vertical line in Fig. 15 (upward or downward vertical transitions). The second requirement (the same velocity) means that the transition takes place between stationary points of the vibrational motion of the nuclei, i.e. between the classical turning points or, in
Footnote 23. The structure of a fluorescence spectrum depends on the relative shift, $\Delta R_e = R'_e - R''_e$, of the equilibrium internuclear separations in the excited and ground states. In some cases, only bound–bound [42,61] or only bound–free transitions are observed.
general, higher probability of a transition occurs for vertical transitions whose wavelengths tend to preserve the kinetic energy of the nuclear motion. The F–C principle can be illustrated with respect to the difference $\Delta R_e$ between equilibrium internuclear separations in the excited, $R'_e$, and ground, $R''_e$, states. Basically, one can distinguish between three situations: (1) $R'_e \cong R''_e$, (2) $R'_e > R''_e$ or $R'_e < R''_e$, and (3) $R'_e \gg R''_e$. They have an influence on the vibrational-progression intensity distributions, and the reader will find examples for all three situations in this review. Here, attention is drawn only to typical examples of the three situations illustrated in the author's studies: (1) the $F0_u^+ \leftarrow X0_g^+$ transition in Hg2 [60] (Fig. 16) or the $B1 \leftarrow X0^+$ transition in CdXe [42], (2) the $E1_u \leftarrow X0_g^+$ [61] (Fig. 23), $D1_u \leftarrow X0_g^+$ [61] (Fig. 22) or $G0_u^+ \leftarrow X0_g^+$ [62] transitions in Hg2, and (3) the $B1 \leftarrow X0^+$ transition in CdNe [45] or the $B1 \leftarrow X0^+$ transitions in HgAr [53]. Taking into account the F–C principle, one can derive a formula for the so-called F–C factors (F–CF), $q_{v'v''}$. First, one has to assume the Born–Oppenheimer approximation [65], which separates completely the electronic and vibrational motion in the molecule, and therefore the total molecular wave function can be written as

\[ \psi_{molecular}(r, R) = \psi_e(r, R)\, \psi_n(R), \tag{29} \]
where $\psi_e(r, R)$ and $\psi_n(R)$ are the electronic and nuclear parts, respectively (see footnote 24). The internuclear separation $R$ is a fixed parameter in the electronic wave function $\psi_e(r)$ and $r$ is a relative electronic coordinate. The standard derivation of the formula for the F–CF can be found in any textbook (e.g. [64]). The F–CF is given by the square of the vibrational overlap integral ($\psi_n$ replaced with $\psi_v$) (see footnote 24)

\[ \left| \int (\psi_{v'})^{*}\, \psi_{v''}\, dR \right|^{2} = q_{v'v''} \tag{30} \]
and it expresses the relative intensity of a transition between any two vibrational states. Eq. (30) represents an overlap between the vibrational wave functions corresponding to the excited and ground states of the electronic transition. Bearing in mind that all the transitions occur vertically, the only regions of the excited-state potential that are accessible in the transition (F–C "window" for the excitation) are those for which the vibrational wave function of the ground state has an appreciable magnitude. Furthermore, if the vibrational wave functions have several nodes (the number of nodes is equal to $v$), there will be interference between the two (the ground- and the excited-state wave functions), leading to oscillatory variations with $v'$ in the F–CF (see for example $v' \leftarrow v'' > 0$ progressions in the $0_u^+ \leftarrow X0_g^+$ excitation spectrum of Cd2 [58] or in the $E1_u \leftarrow X0_g^+$ excitation spectrum of Hg2 [61]). An analogous argument holds for fluorescence spectra: the only regions of the ground state that are accessible (F–C "window" for the fluorescence) are those for which the vibrational wave function of the excited state, from which the molecule is radiating, has an appreciable magnitude. Illustrative examples can be found in the bound–bound fluorescence from a selectively excited $v'$ level terminating on the discrete vibrational $v''$ levels in the ground-state bound well. The reader is referred to appropriate examples in Refs. [53,54], where the $A0^+_{v'=2,3,4,5} \to X0^+$ and $B1_{v'=0,1,2,3} \to X0^+$
Footnote 24. The nuclear part of the wave function, $\psi_n$, describes the vibrational as well as rotational movements of the nuclei, and if only the former is considered, $\psi_n$ is replaced with $\psi_v$.
transitions in HgAr are reported, as well as Refs. [61] and [62], in which the $E1_{u,v'=0,1,2,3} \to X0_g^+$ and $G0^+_{u,v'=39} \to X0_g^+$ transitions, respectively, were analysed in Hg2. The above considerations allow a general conclusion to be drawn concerning the suitability of the method reported here for spectroscopic characterization of molecular states. In principle, it depends on whether the excitation or the fluorescence spectrum is investigated. In the former case both the excited (upper) and the ground (lower) state can be characterized. In the latter case the characterization may be limited to the ground state only. For a detailed discussion, see Sections 4.1 and 4.2.

3.6.3. Isotope structure of vibrational band. Isotope shift

To interpret spectra of diatomic molecules correctly one has to be aware of the influence of a mass difference between, for instance, two isotopic molecules (isotopomers) on the components of the total molecular energy. The mass difference affects the vibrational, $G(v)$, and rotational, $F_v(J)$, energy of the molecule in each electronic state. The electronic energy, $T_e$, is the same for these two hypothetical isotopomers since it depends only on the motion of the electrons and on the Coulomb repulsion of the nuclei (see footnote 25). As a consequence, the form as well as the relative position of the corresponding PE curves of different excited electronic states are the same for the two isotopic molecules.
Restricting the consideration to the non-rotating molecule ($F_v(J) = 0$) one can obtain an expression for the isotope shift, $\Delta\nu_{ij}$, in a $v' \leftarrow v''$ vibronic transition between components corresponding to the various combinations ($m_1 + m_2$) of the masses $m_1$ and $m_2$ of the isotopes present in the natural elements whose atoms form the molecule

\[ \Delta\nu_{ij}(v', v'') = (1 - \rho)\left[\omega'_e\left(v' + \tfrac{1}{2}\right) - \omega''_e\left(v'' + \tfrac{1}{2}\right)\right] - (1 - \rho^{2})\left[\omega'_e x'_e\left(v' + \tfrac{1}{2}\right)^{2} - \omega''_e x''_e\left(v'' + \tfrac{1}{2}\right)^{2}\right], \tag{31} \]

where cubic and higher terms in $(v + 1/2)$ were neglected, $\rho = \sqrt{\mu_i/\mu_j}$ is an "isotopic ratio", and $\mu_i$ and $\mu_j$ are the reduced masses of two isotopic molecules with different ($m_1 + m_2$) mass combinations. If one considers only the $v' \leftarrow v'' = 0$ progression, which is often the case in supersonic-beam spectroscopy, the isotope shift is given by

\[ \Delta\nu_{ij}(v', v'' = 0) = (1 - \rho)\,\omega'_e\left(v' + \tfrac{1}{2}\right) - (1 - \rho^{2})\,\omega'_e x'_e\left(v' + \tfrac{1}{2}\right)^{2} - (1 - \rho)\,\frac{\omega''_e}{2} + (1 - \rho^{2})\,\frac{\omega''_e x''_e}{4} \tag{32} \]
and it depends only on $v'$. The conclusions that can be drawn from this consideration are that a relatively substantial isotope shift can be observed in transitions involving rather large $v'$ and for species with a considerably rich isotopic composition. Moreover, analysis of the isotope shift in vibrational spectra facilitates an unambiguous $v'$-assignment, while lack thereof (a single-isotope molecule) can render such a determination impossible. The latter is especially true for transitions between pairs of states with large $\Delta R_e$, when the $v' = 0 \leftarrow v'' = 0$ component is too weak to be detected.
Footnote 25. The motion of the electrons is almost independent, while the Coulomb repulsion is entirely independent, of the masses of the nuclei.
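The isotope shift of Eqs. (31) and (32) is straightforward to evaluate numerically. The sketch below, in plain Python, is an illustration only: the function names are hypothetical, the vibrational constants in the usage part are placeholders rather than values from Ref. [61], and the choice of which isotopomer plays the role of $i$ and which of $j$ is an assumption. The two reduced masses are those listed in Table 3 for the 198 + 198 and 204 + 204 combinations.

```python
import math


def isotopic_ratio(mu_i, mu_j):
    """Isotopic ratio rho = sqrt(mu_i / mu_j) of two reduced masses."""
    return math.sqrt(mu_i / mu_j)


def isotope_shift(v_up, rho, we_up, wexe_up, we_low, wexe_low, v_low=0):
    """Isotope shift of Eq. (31); with v_low = 0 it reduces to Eq. (32).

    we_* and wexe_* are omega_e and omega_e*x_e (cm^-1) of the upper and
    lower electronic states; cubic and higher terms are neglected.
    """
    a = v_up + 0.5
    b = v_low + 0.5
    linear = (1.0 - rho) * (we_up * a - we_low * b)
    quadratic = (1.0 - rho**2) * (wexe_up * a**2 - wexe_low * b**2)
    return linear - quadratic


if __name__ == "__main__":
    rho = isotopic_ratio(98.983, 101.987)  # reduced masses from Table 3 (a.m.u.)
    # Placeholder vibrational constants (cm^-1), chosen for illustration only.
    for v in (0, 20, 40, 57):
        print(v, isotope_shift(v, rho, 100.0, 0.5, 20.0, 0.3))
```

Because $(1 - \rho)$ and $(1 - \rho^2)$ are small for heavy dimers, the shift becomes appreciable only at large $v'$, in line with the conclusion drawn from Eq. (32).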
[Fig. 17 plot: LIF (arb. units) versus laser wavelength (Å), 2657–2663 Å; panels (a) and (b); component $v' = (57 \pm 1) \leftarrow v'' = 0$; isotopic peaks labelled $A_1 + A_2$ = 396–406 and 408.]
Fig. 17. (a) The isotopic structure of the $D1_{u,v'=57\pm1} \leftarrow X0^+_{g,v''=0}$ vibrational component reported in Ref. [61]. (b) Simulated profile of the ($m_1 + m_2$) isotopic structure with the respective ($A_1 + A_2$) combinations shown above each component (compare with Table 3). The relative positions of the isotopic peaks were calculated using Eq. (32). Their amplitudes were weighted relative to the isotopic abundances in natural mercury. The individual isotopic peaks were represented by a Lorentzian convolution function with FWHM of 1.3 cm$^{-1}$. It should be noted that the individual experimental peaks are "blue-shaded" due to the unresolved rotational structure, which was absent from the simulation. However, the considerably large width of the assumed Lorentzian representation causes apparent broadening of the simulated isotope component.

Among the spectra discussed here, the isotope structure of the $v' \leftarrow v'' = 0$ progression in the $G0_u^+ \leftarrow X0_g^+$ transition of the Hg2 excitation spectrum was thoroughly analysed in the range from $v' = 25$ to 52 [62]. The isotope shifts were also observed in the $v' \leftarrow v'' = 0$ progression in the $D1_u \leftarrow X0_g^+$ transition of the Hg2 [61] and the $A0^+ \leftarrow X0^+$ transition of the CdXe excitation spectra [42] for $v'$ from 41 to 68 and from 14 to 23, respectively. As an example, one can focus attention on the isotopic structure of one particular, $v' = 57 \leftarrow v'' = 0$, vibronic component of the $D1_u \leftarrow X0_g^+$ transition of Hg2 (i.e. $^{A_1}$Hg + $^{A_2}$Hg, where $A_1$ and $A_2$ are mass numbers) [61] (Fig. 17). First, one has to explore the relative abundances of the isotopes included in natural mercury $^{A}_{Z}$Hg ($Z = 80$). There are 10 different isotopes $^{A}$Hg, of which 6 have significant abundance (those for which masses, in a.m.u., are given) [119]:
A      Abundance     m (a.m.u.)
196    0.15%         -
197    unstable      -
198    10.0%         197.966
199    16.8%         198.968
200    23.1%         199.968
201    13.2%         200.970
202    29.8%         201.970
203    unstable      -
204    6.9%          203.973
205    unstable      -
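The abundances and masses listed above can be combined pairwise to enumerate the isotopomers of Hg2. The following plain-Python sketch is illustrative: the function name is hypothetical, and the statistical weight (a factor of 2 for mixed pairs, product of abundances otherwise) is an assumption about how relative amplitudes would be formed, since the text states only that amplitudes were weighted relative to the isotopic abundances.

```python
from itertools import combinations_with_replacement

# Mass number: (abundance in %, atomic mass in a.m.u.), as listed in the text.
HG_ISOTOPES = {
    198: (10.0, 197.966), 199: (16.8, 198.968), 200: (23.1, 199.968),
    201: (13.2, 200.970), 202: (29.8, 201.970), 204: (6.9, 203.973),
}


def dimer_combinations(isotopes):
    """Return (A1 + A2, A1, A2, m1 + m2, reduced mass, relative weight) rows."""
    rows = []
    for a1, a2 in combinations_with_replacement(sorted(isotopes), 2):
        p1, m1 = isotopes[a1]
        p2, m2 = isotopes[a2]
        mu = m1 * m2 / (m1 + m2)                         # reduced mass of the pair
        weight = (1.0 if a1 == a2 else 2.0) * p1 * p2    # assumed statistical weight
        rows.append((a1 + a2, a1, a2, m1 + m2, mu, weight))
    return rows


if __name__ == "__main__":
    rows = dimer_combinations(HG_ISOTOPES)
    print(len(rows), "pairs,", len({r[0] for r in rows}), "distinct A1 + A2 sums")
    for total, a1, a2, msum, mu, weight in sorted(rows):
        print(total, a1, a2, round(msum, 3), round(mu, 3), round(weight, 1))
```

Grouping the rows by $A_1 + A_2$ and averaging the reduced masses within each group gives one value per resolved isotopic component.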
The 6 most abundant isotopes form 21 isotopic ($m_1 + m_2$) combinations but only 12 with different molecular reduced masses (Table 3). Therefore, the $57 \leftarrow 0$ vibronic component should have a structure which reflects the isotopic composition. Fig. 17 shows the $57 \leftarrow 0$ component of the $D1_u \leftarrow X0_g^+$ transition in Hg2 investigated and reported in Ref. [61]. Here, it is completed with a simulation of the isotopic structure, which clearly shows 12 components corresponding to different
Table 3
Twelve different isotopic ($A_1 + A_2$) and ($m_1 + m_2$) combinations of two $^{A_1}$Hg + $^{A_2}$Hg atoms forming the mercury dimer. Generally, the 12 combinations originate from all possible ($A_1 + A_2$) combinations and have 12 different molecular reduced masses $\mu_{aver}$ (all $m$ and $\mu$ in a.m.u.). Note that in seven cases $\mu_{aver}$ is formed from several, slightly different $\mu$. This causes additional small shifts in the isotopic structure of Hg2 of ±(0.006–0.012) Å, i.e. ±(0.085–0.17) cm$^{-1}$ in the spectral region shown in Fig. 17

A1 + A2   A1    A2    m1 + m2    μ          μ_aver
396       198   198   395.932    98.983     98.98
397       198   199   396.934    99.233     99.23
398       198   200   397.934    99.481     99.48
          199   199   397.936    99.484
399       198   201   398.936    99.728     99.73
          199   200   398.936    99.733
400       198   202   399.936    99.974     99.98
          199   201   399.938    99.982
          200   200   399.936    99.984
401       199   202   400.938    100.229    100.23
          200   201   400.938    100.234
402       198   204   401.939    100.462    100.48
          200   202   401.938    100.482
          201   201   401.940    100.485
403       199   204   402.941    100.720    100.73
          201   202   402.940    100.734
404       200   204   403.941    100.975    100.98
          202   202   403.940    100.985
405       201   204   404.943    101.230    101.23
406       202   204   405.943    101.483    101.48
408       204   204   407.946    101.987    101.99
$(m_1 + m_2$ or $A_1 + A_2)$ mass combinations from Table 3. The relative intensities and separations of the components were estimated from the relative abundances and the isotope shift, respectively (according to Eq. (32)). Additionally, Fig. 18 compares the experimental isotope shift in the $D1_u \leftarrow X0_g^+, v'' = 0$ transition (vertical bars) with the $\Delta\nu_{ij}(v')$ dependence of Eq. (32) plotted according to three different sets of data: (a) Ref. [61], (b) an improved analysis of the $D1_u \leftarrow X0_g^+$ transition, and (c) the analysis reported in Ref. [197]. The analysis presented in Fig. 18 evidently offers a very valuable option
Fig. 18. The measured (vertical bars correspond to experimental errors, Ref. [61]) and (a)–(c) calculated $\Delta\nu_{ij}$ isotope shifts of the $D1_u \leftarrow X0_g^+, v'' = 0$ vibrational progression. (a) The $\Delta\nu_{ij}(v')$ dependence plotted according to Ref. [61] (empty triangles), also shown assuming the $v' \pm 1$ assignment (dashed lines). (b) Result of an improved characterization of the $D1_u \leftarrow X0_g^+, v'' = 0$ vibrational progression in which the isotope structure was taken into consideration (empty circles). (c) Plotted according to the result of Zehnacker et al. [197] (empty squares).
for analysing excitation spectra with satisfactorily resolved isotopic structure. First, it is feasible to verify the $v'$-assignment assumed in the excitation spectra or determined in the fluorescence spectra (see the dashed lines in Fig. 18 plotted for $v' \pm 1$). Moreover, it is very easy to adjust the $\omega_e'$ and $\omega_e' x_e'$ parameters determined previously using the excitation spectrum with an assumed $v'$-assignment. The adjustment is carried out until the calculated $\Delta\nu_{ij}(v')$ approximates the experimentally determined isotope shift (see Fig. 18(b), in which the previously determined $\omega_e'$ [61] was appropriately adjusted). Details of the adjustment will be discussed in Section 6. In addition, the isotope-shift analysis allows a comparison with results determined by other investigators; see graph (c) of Fig. 18, plotted according to the data of Ref. [197]. A similar analysis of the isotope shift in the $G0_u^+ \leftarrow X0_g^+, v'' = 0$ transition in Hg$_2$ [62] facilitated a correction of the $v'$-assignment reported earlier by Schlauf et al. [198].

3.7. Rotational structure of a vibrational band. Determination of rotational characteristics

Using a laser beam of sufficiently narrow bandwidth (i.e. $\Delta\nu_L \leq 0.1$–$0.6$ cm$^{-1}$ in the case of the MeRG complexes discussed here) and for reasonably light molecules, it is possible to resolve the fine rotational structure of a band system. As will be discussed below, in three cases of the reviewed experimental data the rotational structure of a vibrational component was resolved, recorded and analysed (analysis of the $A0^+, v' = 0 \leftarrow X0^+, v'' = 0$ bands recorded in CdHe [44], CdNe [45] (Fig. 25), and HgHe [51,52] (Fig. 19)). In Section 3.6, the bound–bound and bound–free transitions were analysed disregarding the rotational term in Eq. (28) (i.e. $\nu_{rot} = F_{v'}(J') - F_{v''}(J'') = 0$).

Here, the reader's attention will be drawn to this particular term and its significance in the analysis of the experimental data, which allows the determination of rotational characteristics of the molecule (compare with Eqs. (4)–(8) in Section 2.2). Using the previously characterized rotational terms for the ground and excited states, as well as the rotational constants describing a non-rigid rotator, the wave number of a rotational
Fig. 19. The $A0^+ \leftarrow X0^+, v'' = 0$ transition in the excitation spectrum of HgHe investigated in Refs. [51,52], showing the $v' \leftarrow v'' = 0$ progression and a rotational structure in the $v' = 0 \leftarrow v'' = 0$ vibrational component. The spectra were detected (a) without, and (b) with a Hg vapour filter that was applied to absorb out the intense Hg 2537 Å atomic fluorescence, which otherwise tended to swamp near-lying molecular spectral features (see Section 5 for details). The "blue-shading" in the rotational structure of the $0 \leftarrow 0$ component indicates that $R_e' < R_e''$. The vertical arrow points to the onset of the dissociation in the $B1 \leftarrow X0^+$ transition. (c) Simulation of the rotational structure of the $0 \leftarrow 0$ component. It was deduced from the simulation that the P-branch is mostly responsible for the band head, while the components of the R-branch form the degraded ("blue-shaded") contour (because of the $\Omega = 0^+ \leftarrow \Omega = 0^+$ transition the Q-branch is not present in the spectrum, see also [199–201]). A rotational temperature $T_{rot} = 4$ K was obtained. A FWHM of 0.6 cm$^{-1}$ was used for the Lorentzian convolution function. No attempt was made to take into account the isotopic shift (according to Ref. [200], $\Delta\nu_{ij} \approx 0.15$ cm$^{-1}$). (d) Experimental spacing between R-branch lines in the $0 \leftarrow 0$ component plotted against $J$. The slope and the intercept with the vertical axis give information on the $B_{v'=0}$ and $B_{v''=0}$ rotational constants. The $B_e'$ and $B_e''$ constants were determined using Eq. (36).
transition can be expressed as

$$\nu_{rot} = B_{v'} J'(J'+1) - D_{v'} J'^2 (J'+1)^2 + \cdots - [B_{v''} J''(J''+1) - D_{v''} J''^2 (J''+1)^2 + \cdots], \qquad (33)$$

where $B_{v'}$, $D_{v'}$, $B_{v''}$ and $D_{v''}$ are defined for a given excited or ground vibrational state using Eqs. (5) and (6).$^{26}$ It should be remembered that in the measured ro-vibrational spectrum the total wave number for the transition is expressed by Eq. (28); therefore the rotational wave number $\nu_{rot}$ is related to the quantity $\nu_e + \nu_{vib} = \nu_0$, which in the present analysis, for a specific vibrational transition, is considered to be a constant.$^{27}$ As a consequence, all possible transitions with wave numbers $\nu_{rot}$ that occur for the constant $\nu_0$ form a single band,$^{28}$ which is now analysed in detail. Depending on the applied general selection rule $\Delta J = J' - J'' = 0$ or $\pm 1$ (except $J' = J'' = 0$, Section 3.3, A.1), one can distinguish between three or two$^{29}$ series of lines (the so-called P-, Q- and R-branches):

$$\nu_P(J) = \nu_0 - (B' + B'')J + (B' - B'')J^2, \quad \text{for the P-branch } (J' = J'' - 1), \qquad (34a)$$

$$\nu_Q(J) = \nu_0 + (B' - B'')J + (B' - B'')J^2, \quad \text{for the Q-branch } (J' = J''), \qquad (34b)$$

$$\nu_R(J) = \nu_0 + 2B' + (3B' - B'')J + (B' - B'')J^2, \quad \text{for the R-branch } (J' = J'' + 1), \qquad (34c)$$

$^{26}$ The notations $B_{v'}$, $B_{v''}$, $D_{v'}$, $D_{v''}$ and $B'$, $B''$, $D'$, $D''$ are equivalent in the literature. Here, for the sake of convenience, they will be used interchangeably.
where $J = J''$ $^{30}$ and higher-order terms in $D_v$ were neglected. Using Eqs. (34) for a given band origin $\nu_0$, i.e. within one vibrational component, one can calculate the spacings between lines in the three branches:

$$\Delta\nu_P(J) = \nu_P(J+1) - \nu_P(J) = 2(B' - B'')J - 2B'', \qquad (35a)$$

$$\Delta\nu_Q(J) = \nu_Q(J+1) - \nu_Q(J) = 2(B' - B'')J + 2(B' - B''), \qquad (35b)$$

$$\Delta\nu_R(J) = \nu_R(J+1) - \nu_R(J) = 2(B' - B'')J + 2(2B' - B''). \qquad (35c)$$
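The way Eqs. (34) and (35) connect line positions to the rotational constants can be checked numerically. The sketch below is illustrative only: the band origin and the values of $B'$ and $B''$ are hypothetical numbers chosen for the example (not data from Refs. [51,52]), and the function names are introduced here for convenience.

```python
import numpy as np


def branch_positions(nu0, b_up, b_low, j):
    """Line positions of Eqs. (34a)-(34c); j is the lower-state J''."""
    j = np.asarray(j, dtype=float)
    p = nu0 - (b_up + b_low) * j + (b_up - b_low) * j**2
    q = nu0 + (b_up - b_low) * j + (b_up - b_low) * j**2
    r = nu0 + 2.0 * b_up + (3.0 * b_up - b_low) * j + (b_up - b_low) * j**2
    return p, q, r


def constants_from_r_spacings(j, spacings):
    """Invert Eq. (35c): slope = 2(B' - B''), intercept = 2(2B' - B'')."""
    slope, intercept = np.polyfit(j, spacings, 1)
    b_up = 0.5 * (intercept - slope)
    b_low = b_up - 0.5 * slope
    return b_up, b_low


# Hypothetical band origin and rotational constants (cm^-1), B' > B''.
NU0, B_UP, B_LOW = 1000.0, 0.30, 0.20

j = np.arange(0, 10)
p, q, r = branch_positions(NU0, B_UP, B_LOW, j)

# Spacings between successive R-branch lines, Delta nu_R(J) for J = 0..8.
r_spacings = np.diff(r)
b_up_fit, b_low_fit = constants_from_r_spacings(j[:-1], r_spacings)
print(b_up_fit, b_low_fit)
```

With real data the spacings carry experimental scatter, so the linear fit returns the constants only within the corresponding uncertainty.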
Plotting the spacings $\Delta\nu$ obtained from an experiment against $J$, one can determine $B' = B_{v'}$ and $B'' = B_{v''}$ from the slope of the plot together with its intercept with the vertical axis. Footnote 4 in Section 2.2 gives a practical suggestion which one can use to derive the $B_e'$ and $B_e''$ constants from $B_{v'}$ and $B_{v''}$. Using this approximation one obtains (compare with Eq. (5))

$$B_v \approx B_e - \alpha_e \left(v + \tfrac{1}{2}\right) \approx B_e - \frac{\omega_e x_e}{\omega_e} B_e \left(v + \tfrac{1}{2}\right) = B_e \left[1 - \frac{\omega_e x_e}{\omega_e}\left(v + \tfrac{1}{2}\right)\right], \qquad (36)$$

which relates $B_v$ and $B_e$ with the vibrational constants $\omega_e x_e$ and $\omega_e$ determined independently, e.g. from a B–S plot (Section 4.1.1). Consequently, using Eqs. (7) and (8) one obtains absolute values for $R_e'$ and $R_e''$, and the centrifugal constants $D_e'$ and $D_e''$.

$^{27}$ Note that Eq. (33) should be supplemented with $+\phi_i(J)$, where $\phi_i$ is a term resulting from the interaction between rotation and electronic motion, which manifests itself as $\Lambda$-doubling (Hund's case (a)) or $\Omega$-doubling (Hund's case (c)).
$^{28}$ $\nu_0$ is also called the band origin or the zero line [64].
$^{29}$ $\Delta J = 0$ is forbidden if $\Lambda = 0$ in both electronic states ($^1\Sigma$–$^1\Sigma$ transitions, Hund's case (a)) or for $\Omega = 0$–$\Omega = 0$ transitions (Hund's case (c)).
$^{30}$ It should be noted that for the P-branch $J$ decreases and increases by 1 in absorption and emission, respectively, while for the R-branch $J$ changes inversely.

The rotational structure of the $A0^+, v' = 0 \leftarrow X0^+, v'' = 0$ vibrational band in CdNe has been recorded and analysed in Ref. [45] (see Section 4.1.4 and Fig. 25). Here, a recorded rotational profile of the $A0^+, v' = 0 \leftarrow X0^+, v'' = 0$ band in HgHe [51,52] will be discussed. The band is shown in Figs. 19(a, b). Its analysis, initiated by the plot presented in Fig. 19(d), led to a determination of the $B_{v'=0}$ and $B_{v''=0}$
constants, and consequently of other rotational characteristics such as $B_e$ and $D_e$, as well as absolute values for the $R_e$ of the ground and excited states of the complex (Table 12, Section 6.2). As seen in Figs. 19(a, b), the profile of the $A0^+, v' = 0 \leftarrow X0^+, v'' = 0$ component has a characteristic shape: it is "blue-shaded" (degraded) and has a distinct band head at its long-wavelength side. This is caused by the linear and quadratic terms in $J$ (Eqs. (34)), which can be positive or negative depending on the $(B_{v'} - B_{v''})$ difference. The maximum or minimum $\nu(J)$ is reached at a certain $J$ value, forming the band head. If $(B_{v'} - B_{v''}) < 0$ (i.e. $B_{v'} < B_{v''}$, and therefore $R_e' > R_e''$) the band is "red-shaded", while in the opposite case ($(B_{v'} - B_{v''}) > 0$, i.e. $B_{v'} > B_{v''}$, and therefore $R_e' < R_e''$) the band is degraded toward the short-wavelength side ("blue-shaded"). Thus, a simple inspection of rotationally non-resolved band profiles in detected excitation spectra gives a first insight into the relation between the equilibrium internuclear separations of the ground and excited states of the complex.

4. Determination of a potential energy curve in different regions of internuclear separations

To accurately determine the shape of the interatomic potential $U(R)$ of a molecular electronic state in the widest possible range of $R$, it is necessary to take into consideration as many complementary types of experimental data as possible. As will be emphasized below, the reliability of this determination increases if the interatomic potentials involved are probed using different ro-vibrational transitions in different ranges of $R$. Generally, an interatomic potential of a specific molecule can be characterized in different regions of $R$ using an appropriate method of investigation, according to the F–C principle, i.e. the overlap between the corresponding wave functions of the ground and excited molecular electronic states.
For example, an excitation from the lowest ground-state vibrational levels to the F–C-accessible vibrational levels in the excited state usually provides reliable information about the excited-state potential well, i.e. below the dissociation limit. Fluorescence from a selectively excited vibrational level may terminate, again according to the F–C principle, at different parts of the ground-state potential: when the fluorescence terminates at the discrete levels and/or on the repulsive part of the ground state, it completes the information on the ground-state PE curve below and/or above its dissociation limit, respectively. Therefore, combining all the experimental data available for a particular molecular system, one can develop a representation for $U(R)$ in the widest attainable range of $R$. The variety of the experimental data includes excitation spectra with "hot" and "cold" vibrational progressions. The former ($v' \leftarrow v'' > 0$) allow a direct characterization of the ground state (see examples for ZnNe [41], CdNe [45], CdAr [46], CdXe [42], HgNe and HgAr [53], HgKr [56], Cd$_2$ [58,59], and Hg$_2$ [60,61]). The latter ($v' \leftarrow v'' = 0$) extend up to the dissociation limit, facilitating conclusions on the shape of the excited-state PE curve at large $R$ (see the characterization of the B1 state in CdNe [45] and CdAr [46], as well as in HgAr [55] and HgKr [56]). In certain cases considered here, one has to deal with very weakly bound molecules, with bound wells supporting only a few vibrational levels (e.g. the $X0^+$ state in ZnNe [41], the $A0^+$ state in CdHe [43,44], or the $X0^+$ and B1 states in CdNe [45]). The determination of the ground- and/or excited-state potentials then has to be based on rather sparse evidence, which makes the whole procedure very difficult, and the analysis has to be performed with great care. Similar precautions have to be taken when drawing conclusions about the potential from experimental data that are limited by the F–C "window" for the
excitation. A convenient, although erroneous, assumption is sometimes made that the analytical shape of the potential inferred from the data recorded in a relatively narrow F–C "window" can be freely extrapolated over a much wider range of $R$. This review considers these cases as well (e.g. $A0^+$ states in CdKr [48,49], CdXe [42], HgAr [53], HgKr [56], and excited states in Hg$_2$ [61,62] and Cd$_2$ [58,59]). However, conclusions about $U(R)$ are drawn with care and only in those ranges of $R$ that are supported by the experimental data. Consequently, as one of the main achievements presented here, several potentials in MeRG and Me$_2$ molecules have been corrected and new analytical representations have been proposed. Fluorescence spectra make possible a precise determination of the steepness of the ground-state repulsive part (see examples for ZnAr [47], CdNe [45], CdAr [47], CdKr [48], HgAr [54], HgKr [56] and Hg$_2$ [61,62]) or a characterization of the ground-state bound well (e.g. for CdAr [47] and HgAr [53]). The precision of this determination increases if the experimental data include more than one "channel" of the fluorescence, originating at different excited vibrational levels but terminating at the same part of the ground-state potential (see analyses of the fluorescence in CdNe [45], CdAr [47], CdKr [48], and HgAr [54]).

4.1. Analysis of excitation spectra

4.1.1. Birge–Sponer method

In all analyses discussed here [40–63], the B–S method plays an important role in the determination of the vibrational characteristics of the interatomic potential. The method was proposed in 1926 [202] and assumes that the separations between bound vibrational levels decrease linearly with the vibrational quantum number $v$. Neglecting cubic and higher terms in Eq. (3), the separation of successive vibrational levels is given by [64]

$$\Delta G_{v+1/2} = G_0(v+1) - G_0(v) = \omega_0 - 2\omega_0 x_0 \left(v + \tfrac{1}{2}\right) = G(v+1) - G(v) = \omega_e - 2\omega_e x_e (v + 1), \qquad (37)$$

where, similarly to Eq. (3), the term value

$$G_0(v) = \omega_0 v - \omega_0 x_0 v^2 + \omega_0 y_0 v^3 + \cdots \qquad (38)$$

is referred to the lowest vibrational level $v = 0$ instead of the bottom of the potential well (index 'e'), and

$$\omega_0 \approx \omega_e - \omega_e x_e, \qquad (39a)$$

$$\omega_0 x_0 \approx \omega_e x_e \qquad (39b)$$

within the anharmonic potential approximation$^{31}$ (see the discussion on a Morse potential in Section 3.5.4).

$^{31}$ Without this approximation, $\omega_0 = \omega_e - \omega_e x_e + 3\omega_e y_e/4 + \cdots$ and $\omega_0 x_0 = \omega_e x_e - 3\omega_e y_e/2 + \cdots$.

As seen from Eq. (37), a plot of $\Delta G_{v+1/2}$ against $(v + 1/2)$ or $(v + 1)$ gives directly the vibrational frequencies $\omega_0$ or $\omega_e$, respectively (from the intercepts with the vertical axis), and the anharmonicities $\omega_0 x_0$ or $\omega_e x_e$, respectively (from the slopes, which equal $-2\omega_0 x_0$ or $-2\omega_e x_e$). The applicability and reliability of the B–S plot has been
[Fig. 20 plot: panels (a) and (b) show $\Delta G_{v'+1/2}$ (cm$^{-1}$) against $v'$; the position of $v_D'$ is marked in panel (a).]
Fig. 20. B–S plot for the $E1_u \leftarrow X0_g^+, v'' = 0$ transition in the excitation spectrum of Hg$_2$ reported in Ref. [61]. (a) The whole range of $v'$, showing the linear long extrapolation (solid line) used for the determination of the last discrete vibrational level before dissociation, $v_D' = \omega_0'/2\omega_0' x_0'$, and the real curvature (dashed line) of the plot according to the result for $D_0'$ (Eq. (41)). (b) A detail of (a), showing the short extrapolation used for the determination of the vibrational frequency $\omega_0'$ (the intercept with the vertical axis) and the anharmonicity $\omega_0' x_0'$ (the slope).
discussed in many textbooks (e.g. [64,191]); however, for the sake of clarity some of its aspects need to be underlined here, especially in the context of the particular examples considered in this review. To gain an insight into the problem, the reader is referred to Refs. [40–63] as well as to numerous other articles on the subject. Nevertheless, it is necessary to stress that the accuracy of the vibrational constants determined using the B–S plot depends on the experimental data. The method works very well in the vicinity of $R_e$, i.e. when dealing with data close to the bottom of the potential well. Fig. 20 presents a B–S plot ($\Delta G_{v'+1/2}$ vs. $(v' + 1/2)$) constructed using data of the $v' \leftarrow v'' = 0$ progression of the $E1_u \leftarrow X0_g^+$ transition in Hg$_2$ [61] (Fig. 23). This example illustrates very well a situation in which the $\omega_0'$ and $\omega_0' x_0'$ vibrational constants determined from the experimental points (i.e. for lower $v'$, short extrapolation shown in Fig. 20(b)) cannot be extended to the region of higher $v'$, i.e. closer to the excited-state dissociation limit. Using the above $\omega_0'$ and $\omega_0' x_0'$ constants and assuming a linear B–S plot in the whole range of $v'$, which is equivalent to the assumption that a Morse function represents the real bound well of the $E1_u$-state potential up to the dissociation limit, it is possible to approximate the excited-state dissociation energy using the following formula (compare with Eq. (15)) [64]:

$$D_0' = \frac{\omega_0'^2}{4\omega_0' x_0'}. \qquad (40)$$

The above expression gives $D_0' = 2222$ cm$^{-1}$, which should be compared with the value obtained using the relation (Figs. 15 and 29)$^{32}$

$$D_0' = D_0'' + \nu_{at} - \nu_{00}. \qquad (41)$$

$^{32}$ Eq. (41) is a very good approximation for states that are free of potential maxima (energy barriers) near the dissociation limit.
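The B–S procedure of Eqs. (37) and (40) amounts to a straight-line fit of the vibrational spacings. The following minimal sketch illustrates it on synthetic term values generated from Eq. (38) truncated after the quadratic term; the constants used to generate the data are hypothetical and are not those of the $E1_u$ state, and the function name `birge_sponer` is introduced here only for the example.

```python
import numpy as np


def birge_sponer(v, g0):
    """Linear B-S analysis of term values G0(v) for consecutive v.

    Returns omega_0, omega_0 x_0, the Morse-type estimate of D_0 from
    Eq. (40), and the extrapolated index v_D = omega_0 / (2 omega_0 x_0).
    """
    v = np.asarray(v, dtype=float)
    g0 = np.asarray(g0, dtype=float)
    spacings = np.diff(g0)               # Delta G_{v+1/2} = G0(v+1) - G0(v)
    x = v[:-1] + 0.5                     # abscissa (v + 1/2) of Eq. (37)
    slope, intercept = np.polyfit(x, spacings, 1)
    omega0 = intercept                   # intercept with the vertical axis
    omega0x0 = -0.5 * slope              # slope equals -2 omega_0 x_0
    d0 = omega0**2 / (4.0 * omega0x0)    # Eq. (40)
    v_d = omega0 / (2.0 * omega0x0)
    return omega0, omega0x0, d0, v_d


# Hypothetical constants (cm^-1) used only to generate test data.
OMEGA0, OMEGA0X0 = 50.0, 0.5
v = np.arange(0, 15)
g0 = OMEGA0 * v - OMEGA0X0 * v**2        # Eq. (38) without cubic terms

omega0_fit, omega0x0_fit, d0_fit, vd_fit = birge_sponer(v, g0)
print(omega0_fit, omega0x0_fit, d0_fit, vd_fit)
```

Because the synthetic data are exactly quadratic, the fit returns the input constants; for real levels the linear extrapolation carries the limitations discussed in the text.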
Using the experimentally determined Hg$_2$ ground-state dissociation energy $D_0''$ [61,62], the energy $\nu_{at}$ corresponding to the Hg $6^3P_2$–$6^1S_0$ atomic transition [203], and the directly recorded energy $\nu_{00}$ corresponding to the $v' = 0 \leftarrow v'' = 0$ transition [61], one obtains $D_0' = 1640$ cm$^{-1}$. The straightforward conclusion is that the B–S plot overestimates the $D_0'$ value for the $E1_u$ state and that the real $\Delta G_{v'+1/2}$ vs. $(v' + 1/2)$ dependence has a negative curvature when approaching the dissociation (dashed line in Fig. 20(a)). There are examples of overestimation as well as underestimation (a positive curvature) when B–S plots are used for the determination of $D_0$ [64,191]. Overestimation is a common behaviour that characterizes considerably deep molecular states; however, a B–S plot can also be non-linear in the opposite sense (underestimation) when approaching the dissociation in the case of very shallow potentials governed only by weak vdW forces in the long-range limit. This will be discussed in the following section. In conclusion, it has to be stressed that the applicability of the B–S method is very limited. It works better when restricted to the bottom of the potential well (determination of $\omega_0$ and $\omega_0 x_0$ in this region) than when applied to the long-range tail of the PE curve (great caution is needed in the determination of $D_0$ from Eq. (40)). Nevertheless, the B–S plot is a basic method that was applied in order to derive the fundamental vibrational characteristics of the excited molecular states reported here. In some cases, however, the B–S method was also employed in order to characterize the ground state. See, for example, the case of the HgAr $A0^+, v' = 2, 3, 4, 5 \rightarrow X0^+$ fluorescence spectra [53,54], and the cases of CdNe [45], CdAr [46,47], CdXe [42], HgKr [56] and Cd$_2$ [58] with detection of "hot" bands in a variety of excitation spectra.

4.1.2. Limiting and generalized near-dissociation expansions

In 1970 Le Roy and Bernstein suggested an alternative treatment to the B–S analysis for the determination of the asymptotic long-range potential, the effective (non-integer) vibrational index at the dissociation limit, $v_D$, and the energy corresponding to the dissociation limit $D$ itself [204,205].$^{33}$ It is very well known [207], and was already mentioned above, that near dissociation, and for very weakly bound vdW systems, a Morse-type linear B–S extrapolation is not adequate to describe the long-range behaviour of the internuclear potential. According to the limiting Le Roy–Bernstein (LR–B) method, the long-range behaviour of the potential can be expressed by$^{34}$

$$U(R) = D - \sum_m \frac{C_m}{R^m}, \qquad (42)$$

where the powers $m$ have positive integer values and the constants $C_m$ are yielded by perturbation theory [69]. As the dissociation limit is approached, the vibrational spacings $\Delta G_{v+1/2}$ lying very near the dissociation depend mainly on the asymptotically dominant inverse-power contribution. Therefore, the dominant exponent $m$ is determined by the nature of the leading long-range attractive interaction between the dissociating atoms. As already discussed, in the case of vdW molecules the long-range
$^{33}$ A similar approach was proposed independently by Stwalley [206].
$^{34}$ Note that Eq. (11) in Section 3.4 is a special case of Eq. (42) written for $m = 2k$, i.e. for pure vdW interaction $m = 6, 8, 10, \ldots$.
limit of internuclear separations is governed by the induced dipole–induced dipole interaction, and $m = 6$ is the leading term in the expression for $U(R)$. Therefore, it is justified to approximate the long-range part of the potential by $U(R) \approx D - C_6/R^6$.$^{35}$ In the LR–B theory, the vibrational first differences $\Delta G_v = [G(v+1) - G(v-1)]/2$ $^{36}$ of the $G(v-1)$ and $G(v+1)$ total vibrational energies (terms) of the $v - 1$ and $v + 1$ levels, respectively, are related to $G(v)$ by the equation

$$(\Delta G_v)^{2m/(m+2)} = (K_m)^{2m/(m+2)} (D - G(v)), \qquad (43)$$

and the constant $K_m$ is given by

$$K_m = \frac{\hbar \sqrt{2\pi}}{\sqrt{\mu}\, C_m^{1/m}} \, \frac{m\,\Gamma(1 + 1/m)}{\Gamma(1/2 + 1/m)}, \qquad (44)$$

where $\hbar = h/2\pi$, $\mu$ is the reduced mass, and $\Gamma$ is the Euler gamma function. For $m = 6$, Eqs. (43) and (44) simplify to the following expressions:

$$(\Delta G_v)^{3/2} = (K_6)^{3/2} (D - G(v)), \qquad (43a)$$

$$K_6 = \frac{\hbar \sqrt{2\pi}}{\sqrt{\mu}\, C_6^{1/6}} \, \frac{6\,\Gamma(7/6)}{\Gamma(2/3)}. \qquad (44a)$$
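A limiting LR–B analysis for $m = 6$ reduces to two straight-line fits: Eq. (43a) yields $D$ and $K_6$, and the relation $(D - G(v))^{1/3} = H_6 (v_D - v)$ with $H_6 = K_6/3$ yields $v_D$. The sketch below illustrates this on synthetic levels that follow the limiting $m = 6$ behaviour exactly. The dissociation limit and $H_6$ used to generate the data are hypothetical, $v_D = 11.2$ is used only as an illustrative input, and the function name is introduced here; the central finite difference used for $\Delta G_v$ makes the recovered parameters approximate.

```python
import numpy as np


def leroy_bernstein_m6(v, g):
    """Limiting LR-B analysis for m = 6 on consecutive levels G(v).

    Returns estimates of D, K_6, H_6 and v_D.
    """
    v = np.asarray(v, dtype=float)
    g = np.asarray(g, dtype=float)
    dg = 0.5 * (g[2:] - g[:-2])          # Delta G_v = [G(v+1) - G(v-1)] / 2
    v_mid, g_mid = v[1:-1], g[1:-1]

    # Eq. (43a): (Delta G_v)^{3/2} is linear in G(v) and vanishes at G = D.
    # Regressing G on (Delta G_v)^{3/2} makes the intercept equal to D.
    slope, d_limit = np.polyfit(dg**1.5, g_mid, 1)
    k6 = (-1.0 / slope) ** (2.0 / 3.0)

    # (D - G(v))^{1/3} = H_6 (v_D - v): slope is -H_6, zero crossing is v_D.
    s, c = np.polyfit(v_mid, np.cbrt(d_limit - g_mid), 1)
    h6 = -s
    v_d = c / h6
    return d_limit, k6, h6, v_d


# Hypothetical parameters used only to generate synthetic levels (cm^-1).
D_TRUE, H6_TRUE, VD_TRUE = 30750.0, 0.35, 11.2
v = np.arange(0, 11)
g = D_TRUE - (H6_TRUE * (VD_TRUE - v)) ** 3

d_fit, k6_fit, h6_fit, vd_fit = leroy_bernstein_m6(v, g)
print(d_fit, k6_fit, h6_fit, vd_fit)
```

Given a reduced mass, $C_6$ would follow from the fitted $K_6$ through Eq. (44a); that step is omitted here because it requires unit conversions not specified in the text.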
It follows directly from Eqs. (43) and (43a) that, for a given $m$, a plot of $(\Delta G_v)^{2m/(m+2)}$ against $G(v)$ (the so-called LR–B plot) determines the energy corresponding to the dissociation limit $D$ as the intercept with the horizontal axis. Fig. 21(a) shows the LR–B plot ($m = 6$, i.e. Eq. (43a)) for the B1 state investigated in CdAr [46]. The plot is compared with that which corresponds to a linear B–S characteristic, i.e. $(\Delta G_{v'})^2$ vs. $G(v')$.$^{37}$ As reported in Ref. [46], the vibrational components in the $B1 \leftarrow X0^+$ transition were measured up to the dissociation limit, and the applicability of the limiting LR–B method was fully justified (i.e. $D' - G(v'_{max}) \leq 0.1 D_e'$). However, it is clearly seen that, even though the experimental data provided full evidence of the vibrational-level structure very close to the dissociation limit (within the ±0.25 cm$^{-1}$ experimental accuracy), the B–S method gives a slightly different value for $D'$ (a difference of 1.80 cm$^{-1}$). This shows that when the dissociation is asymptotically approached, a simple Morse-potential shape does not represent the real B1-state interatomic potential, and vdW forces take over in the long-range attractive part of $R$. A discussion of this effect in the ground and excited states of MeRG (ZnNe [41], CdHe [43], CdNe [45], CdAr [46,47], CdKr [48,49], CdXe [42], HgAr [55], HgKr [56]) and Me$_2$ (Cd$_2$ and Zn$_2$ [58]) molecules can be found in the literature. Besides $D$, the LR–B method can yield the $C_m$ constant of the asymptotically dominant inverse-power contribution to the potential as well as the effective (in general non-integer) vibrational
$^{35}$ A complete discussion on the inverse-power contributions to the long-range potential can be found in Ref. [237].
$^{36}$ A difference between $\Delta G_v$ and $\Delta G_{v+1/2}$ of Eq. (37) is strongly emphasized. See also [179] and the discussion in [64, p. 98].
$^{37}$ For the Morse potential, it is straightforward to show that the B–S plot obeys the relationship $(\Delta G_v)^2 = 4\omega_e x_e (D - G(v))$ [56].
[Fig. 21 plot: panel (a) shows $(\Delta G_{v'})^{3/2}$ (cm$^{-3/2}$, LR–B method) and $(\Delta G_{v'})^2$ (cm$^{-2}$, linear B–S plot) against $G(v')$, with $D'_{B-S}$ and $D'_{LR-B}$ marked; panel (b) shows $(D' - G(v'))^{1/3}$ (cm$^{-1/3}$) against $v'$, with $v_D'$ marked.]
Fig. 21. (a) LR–B plot (full squares, according to Eq. (43a)) constructed for the B1 energy state of the CdAr molecule investigated in Ref. [46]. A linear long extrapolation of the plot performed for the points closest to the dissociation, as advised in Refs. [204,205,208], gives the dissociation limit $D'_{LR-B}$. The result of the LR–B method is compared with that corresponding to the linear B–S procedure (empty squares, see footnote 37). Vertical error bars are indicated. (b) Determination of $H_6$ (slope), the $C_6$ constant ($C_6 = 1.47 \times 10^6$ cm$^{-1}$ Å$^6$), and the effective vibrational index $v_D'$ at the dissociation ($v_D' = 11.2$) for the B1 energy state of the CdAr molecule (according to Eq. (45a) and [46]). The size of the data points corresponds to the horizontal error bars in (a) and (b).
index at the dissociation limit, $v_D$. With the $C_m/R^m$ dominant inverse-power asymptote close to the dissociation, the energies $G(v)$ and the vibrational quantum numbers are related as follows:

$$(D - G(v))^{(m-2)/2m} = H_m (v_D - v), \qquad (45)$$

where $H_m = K_m (m - 2)/2m$, and for the particular case of $m = 6$ Eq. (45) takes a simple form:

$$(D - G(v))^{1/3} = H_6 (v_D - v), \qquad (45a)$$

which for the considered B1 state of CdAr can be plotted as $(D' - G(v'))^{1/3}$ against $v'$. Fig. 21(b) presents the plot. It gives $H_6$ (and consequently $C_6$) and $v_D'$ as the slope and the intercept with the horizontal axis of a linear extrapolation, respectively.$^{38}$ It is worthwhile to point out that the determined effective vibrational index at the dissociation, $v_D' = 11.2$, is larger than that obtained with the help of the B–S plot, $v_D' = 9.5$ [46]. This supports the result derived for $D'$ using both the LR–B and B–S methods.

$^{38}$ The relationship equivalent to Eq. (45) that corresponds to the Morse-type B–S method is $(D - G(v))^{1/2} = (\omega_e x_e)^{1/2} (v_D - v)$ [56].

As mentioned above, the limiting LR–B method is a reliable tool for analysing the long-range part of the interatomic potential only when sufficient experimental data exist near the dissociation limit. To overcome this difficulty when dealing with vibrational energies $G(v)$ detected far from the dissociation limit, Le Roy developed a special procedure [208], which was applied here to characterize the $X0^+$ and D1 states in ZnNe [41], the $A0^+$ and D1 states in CdNe [45], the $X0^+$ state in CdAr [46], and the $A0^+$ and
$X0^+$ states in HgKr [56]. The generalized near-dissociation-expansion (NDE) theory is based on a transformed Eq. (45),

$$G(v) = D - X_m (v_D - v)^{2m/(m-2)} f(v_D - v), \qquad (46)$$

where

$$X_m = \tilde{X}_m \frac{1}{(\mu^m C_m^2)^{1/(m-2)}} \qquad (47)$$
and $\tilde{X}_m$ is a known numerical factor which depends only on physical constants and the value of $m$ [209,210], and the $f(v_D - v)$ term is an empirically determined analytical function which is constrained to approach 1 as $v$ approaches $v_D$. In the Gv NDE program [208], the $f(v_D - v)$ term is represented by a ratio of polynomials in $(v_D - v)$. Eq. (46) incorporates the limiting behaviour of the analogous Eq. (45) of the limiting near-dissociation procedure into empirical expansions that account for the observed energies of the lower vibrational levels $v$. The program performs fits to a variety of those $f(v_D - v)$ forms and generates weighted averages of the physically significant parameters ($D$, $v_D$ or $C_m$). This eventually yields more realistic estimates of these parameters than could otherwise be obtained. To increase the accuracy of the procedure, the theoretically known limiting behaviour is introduced into the program through a plausible estimate of the leading $C_m$ coefficient. Hence, the program gives a more realistic extrapolation of other long-range characteristics. Methods of simple evaluation of the $C_m$ constants for the vdW long-range interaction ($m = 6$) will be discussed below (Section 4.4).

4.1.3. Simulation of bound–bound vibrational spectra

The B–S plot is a fundamental method that is applied in order to derive the vibrational characteristics of the ground and excited molecular states. When the isotopic structure of vibrational components is resolved, the results of isotope-shift analysis can improve the molecular constants obtained from the B–S analysis (Section 3.6.3). All of the above serve as the main source of information that can be used in the simulation of bound–bound components in the vibrational excitation spectra. As already described and discussed in Section 3.6.2 (Eq. (30)), the simulation relied on the calculation of series of F–CF, i.e. squares of the vibrational overlap integrals, which express the relative intensities of transitions between the ground and excited vibrational states. A Turbo Pascal code used for the simulation of excitation spectra was developed in the Windsor group [168,211] and offered the possibility of representing the ground- and excited-state potentials by Morse functions. However, it was limited to generating vibrational wave functions for levels with $v < 20$. Therefore, for the simulation of the majority of cases presented here, a Fortran code of Le Roy [212] was used. The code (LEVEL 6.1–7.2) solves the radial Schrödinger equation for bound $(v, J)$ states and calculates the F–CF for transitions between the ground- and excited-state ro-vibrational levels. The program offers a whole variety of standard analytical representations for interatomic potentials, has an option for the approximation of double-well potentials (Section 3.5.10 and Fig. 14) and has practically no restriction on the size of $v$. Therefore, it can be used in the simulation of excitation spectra for considerably deep excited-state potentials. The range of ro-vibrational levels that can be accessed in the excitation process from the energetically lowest ground-state levels (see the characteristics of supersonic expansion below) depends on the difference $\Delta R_e = R_e' - R_e''$ between the excited and ground states, and defines the F–C "window" for excitation. The "window" can be "open" for the excited-state levels that lie in different but narrow
Fig. 22. (a) The $D1_u \leftarrow X0_g^+, v'' = 0$ transition in the excitation spectrum of Hg$_2$ studied in Ref. [61]. The spectrum shows the $v' \leftarrow v'' = 0$ progression. The lower traces (b) and (c) show the computer-simulated spectrum (using the LEVEL 6.1 code [212]) of the $v' \leftarrow v'' = 0$ progression for the single $(m_1 + m_2) = 401$ isotopic peaks, assuming (b) an L–J($n - 6$) ground-state potential (Eq. (20) in Section 3.5.5) with $n = 6.21$ and $R_e'' = 3.69$ Å and a Morse potential with $R_e' = 2.71$ Å for the excited state ($\Delta R_e = R_e' - R_e'' = -0.980 \pm 0.005$ Å), or (c) the same L–J($6.21 - 6$) ground-state potential and an L–J($12 - 6$) excited-state potential with $R_e' = 2.71$ Å (dotted envelope only). The individual $m_1 + m_2 = 401$ isotopic peaks in each vibrational component were represented by a Lorentzian convolution function (FWHM of 0.5 cm$^{-1}$) to give them a finite width. A detailed view of the $v' = 57 \leftarrow v'' = 0$ component is shown in Fig. 17. The isotope-shift analysis is presented in Fig. 18 (Section 3.6.3). See Table 14 for complete results.
regions of the excited-state potential well: near the bottom (e.g. studies of the E1u state in Hg2 + [61] or 0+ u and 1u states in Cd 2 [58,59]), in the mid-height of the bound well (e.g. D1u and G0u states in Hg2 investigated in Refs. [61] and [62], respectively, see also Figs. 10 and 29, as well as A0+ states in characterization of the CdXe [42] or HgKr [56]) or close to the dissociation limit (e.g. [46,53,55]). Consequently, one is forced to extend the characterization of the potentials to the regions of internuclear separations, which were not covered in the experiment. In majority of cases, it introduces certain inaccuracies (minor or major) creating a necessity of rather complicated analytical PE representations (Section 3.5). Only rarely one does Lnd the exclusive situation in which the F–C-“window” is “opened” for all the excited-state ro-vibrational transitions from the bottom up to the dissociation limit. This oAers a unique possibility for complete description of the investigated PE curve. This is possible for rather shallow excited-state potentials with suRciently large \Re , e.g. some B1 states in MeRG studied in Refs. [45,46,53,55]. The regions of R that are characterized, extend from the intermediate (vicinity of Re ) up to the long-range limit. Nevertheless, the main characteristic that is straightforwardly derived from the simulation is \Re . This is well documented in all articles discussed here [40 – 63]. It should be stressed that the accuracy of determining \Re is rather high (usually from ±0:3% [62] to ±1:6% [53]). Frequently, this is the only information on a relative position of minima of molecular potentials involved in the analysis (e.g. when the rotational structure is not resolved and absolute values of the Re cannot be determined). This will be discussed later (Section 4.5.1). A very interesting case is presented in Fig. 22. 
It illustrates the influence that a particular excited-state analytical representation can have on the simulated excitation spectrum, in addition to variations in $\Delta R_e$. In the simulation procedure one has to assume a certain analytical function
J. Koperski / Physics Reports 369 (2002) 177 – 326
(e.g. Morse in Fig. 22(b) or L–J($12-6$) in Fig. 22(c)) to represent the excited-state potential. A similar choice has to be made for the ground-state potential. Keeping all parameters (i.e. $D_e$, $\omega_e$, $\omega_e x_e$, $\Delta R_e$, $\nu_{00}$, and the $v'$- and $v''$-assignments) in the simulation unchanged, it is shown that when the L–J($12-6$) replaces the Morse representation, the maximum of the envelope of the simulated spectrum shifts towards longer wavelengths. Therefore, not only $\Delta R_e$ but also the electronic-state representation can serve as an adjustable parameter in the simulation. The other information that can be derived from the simulation is the vibrational temperature $T_{vib}$, especially when rich experimental data on “hot” bands are present in the spectrum. $T_{vib}$ governs the population distribution among the vibrational levels supported by the ground-state well. It has to be taken into consideration that the method of production of the molecules characterized in this review (a supersonic expansion beam, Section 5.1) generates them with a generally unpredictable non-Boltzmann component in the ground-state $v''$-population distribution. In spite of that, as an approximation, a Boltzmann $v''$-population distribution has to be assumed for the sake of calculations. Accordingly, the number of molecules in each of the $v''$ states is taken to be proportional to the Boltzmann factor $e^{-hcG_0(v'')/kT_{vib}} \approx e^{-hc(\omega_0''v'' - \omega_0''x_0''v''^2)/kT_{vib}}$, where $k$ is the Boltzmann constant (see Eq. (38) as well). $T_{vib}$ has a direct influence on the amplitudes of “hot” vibrational components in different $v'$-progressions in the spectrum. In the supersonic expansion, $T_{vib}$ can be partly controlled, and this will be discussed later. Using a trial-and-error approach in the simulation of all “cold” $v'$- (i.e. starting from $v''=0$, Fig. 15(a)) and “hot” $v'$- (i.e.
starting from $v''>0$) progressions, and adjusting the relative intensities of vibrational components belonging to different progressions, it is possible to determine $T_{vib}$ with an accuracy of several percent. It should be remembered, however, that this determination carries an inherent error due to the approximation of a strict Boltzmann ground-state $v''$-population distribution. One more important aspect of the simulation, which has to be stressed here, is the assumption of a Lorentzian convolution function chosen to illustrate the result of the simulation of the total excitation spectrum. The assumption was made only to make the simulated profile resemble the real shape of the observed vibrational components and the spectrum as a whole. It should be borne in mind that in Figs. 22 and 23 the simulated components represent only the single $(m_1+m_2)=401$ isotopic peaks or rotationally “shaded” bands, respectively. The isotopic structure of the $v'$ components from Fig. 22 was analysed and simulated in Section 3.6.3 (Figs. 17 and 18). On the other hand, the “shading” of the rotational structure in a vibrational component was already discussed and simulated in Section 3.7 (Fig. 19) and will not be repeated here (see also the next section). An exception was made in Fig. 24, where the rotational “shading” of the $A0^+ \leftarrow X0^+$ and $B1 \leftarrow X0^+$ transitions in the CdKr excitation spectra was reproduced using rotational constants determined from the vibrational characteristics $\omega_e$ and $\omega_e x_e$, and $R_e$ (see Table 11, and Eqs. (7) and (36), as well as the caption of Fig. 24 for details). The other reason for using a convolution function (not presented here) in illustrating the result of the simulation was to show the influence of the neighbouring atomic line (see footnote 39) or of other vibrational components in the spectrum. They can be large in amplitude or close to each other, and can apparently affect the intensity distribution amongst particular vibrational bands.
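The two ingredients described above, Boltzmann weighting of the $v''$ levels at $T_{vib}$ and Lorentzian broadening of the resulting stick spectrum, can be sketched as follows. This is only an illustration under stated assumptions, not the program used in the reviewed work: the function names are hypothetical, and the ground-state constants passed in the example are placeholders that are not taken from the text. Only $T_{vib}=17.5$ K is borrowed from the caption of Fig. 23.

```python
import numpy as np

HC_OVER_K = 1.438777  # hc/k in cm*K (second radiation constant)


def vib_populations(omega0, omega0x0, t_vib, v_max):
    """Relative Boltzmann populations of v'' = 0..v_max (v'' = 0 has weight 1).

    Uses G0(v'') = omega0*v'' - omega0x0*v''**2, all in cm^-1, as in the text.
    """
    v = np.arange(v_max + 1)
    g0 = omega0 * v - omega0x0 * v**2
    return np.exp(-HC_OVER_K * g0 / t_vib)


def lorentzian_spectrum(grid, positions, intensities, fwhm):
    """Sum of Lorentzian profiles (unit peak height) centred on each stick."""
    grid = np.asarray(grid, dtype=float)[:, None]
    pos = np.asarray(positions, dtype=float)[None, :]
    half = 0.5 * fwhm
    profiles = half**2 / ((grid - pos) ** 2 + half**2)
    return profiles @ np.asarray(intensities, dtype=float)


# Placeholder ground-state constants (cm^-1); assumptions, not values from the text.
pops = vib_populations(omega0=19.0, omega0x0=0.25, t_vib=17.5, v_max=3)

# Weight hypothetical stick intensities of a "cold" and a "hot" band and broaden them.
sticks_cm = [43000.0, 43000.0 - 18.75]      # hypothetical band positions
weights = [1.0 * pops[0], 1.0 * pops[1]]    # equal F-CF assumed for illustration
grid_cm = np.linspace(42960.0, 43020.0, 1201)
profile = lorentzian_spectrum(grid_cm, sticks_cm, weights, fwhm=1.5)
```

In a trial-and-error fit of the kind described in the text, `t_vib` would be varied until the relative heights of the “cold” and “hot” components match the measured ones.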
Ignoring this overlap with neighbouring lines can lead to inaccuracies in the quantitative interpretation of the spectra (for example, see
Footnote 39: In cases of very shallow molecular potentials the excitation spectrum is located in the proximity of the atomic transition, and vibrational components often overlap an intense atomic line (see Section 6).
[Fig. 23 graphic: LIF (arb. units) versus laser wavelength (Å), 2310–2345 Å; panels (a)–(c); combs mark the $v' \leftarrow v''=0$, $v' \leftarrow v''=1$ and $v' \leftarrow v''=2$ progressions.]
Fig. 23. (a)–(b) $E1_u \leftarrow X0^+_{g,\,v''=0}$ transition in the excitation spectrum of Hg$_2$ reported in Ref. [61]. The spectrum shows the $v' \leftarrow v''=0$ and $v' \leftarrow v''=1$ progressions. A partial “hot” $v'=0 \leftarrow v''$ progression is present in (a). The contribution of “hot” bands to the spectrum was achieved by lowering the pressure of the carrier gas, increasing the nozzle orifice, and decreasing the distance of the excitation region from the nozzle. (c) Trace showing a computer-simulated spectrum of both the $v' \leftarrow v''=0$ and $v' \leftarrow v''=1$ progressions, performed assuming an L–J($n-6$), $n=6.21$, function for the ground-state potential ($R_e''=3.69$ Å) and a Morse representation for the excited-state potential ($R_e'=3.445 \pm 0.002$ Å). A vibrational temperature $T_{vib}=17.5 \pm 0.5$ K reproduces the contribution of “hot” bands to the spectrum. The vibrational components were represented by a Lorentzian convolution function (FWHM of 1.5 cm$^{-1}$). Rotational shading was ignored in the simulation. See Table 14 for complete results.
simulations of excitation spectra in ZnNe [41], CdHe [43], CdNe [45], CdAr [46], CdKr [48,49] and CdXe [42]).

4.1.4. Simulation of rotational structures

Simulation of the rotational structure within the HgHe $A0^+_{v'=0} \leftarrow X0^+_{v''=0}$ vibrational component was presented in Section 3.7 as an example of analysis of the fine structure of a vibrational band (Fig. 19(c)). Here, as another illustration of the simulation, a result obtained in Ref. [45] will be discussed in detail and additional aspects of the analysis will be presented. Fig. 25(a) shows the rotational structure of the $v'=0 \leftarrow v''=0$ component detected in the $A0^+ \leftarrow X0^+$ transition of the excitation spectrum of the CdNe molecule [45]. As in HgHe, the Q-branch is not present ($\Omega'=0^+ \leftarrow \Omega''=0^+$ transition) and only the P- and R-branches need to be analysed (see footnote 30 and Eqs. (34)). It is assumed here that the intensity distribution among rotational transitions in a particular vibrational band is governed principally by the Boltzmann distribution of population among the initial $J=J''$ levels, multiplied by $\nu$, where $\nu$ is the wave number associated with the photons absorbed in a particular transition. The number of molecules in each of the $J''$ levels is proportional to the Boltzmann factor $e^{-hcF(J'')/kT_{rot}} \approx e^{-hcB''J''(J''+1)/kT_{rot}}$ (compare with Eq. (10)) multiplied by $(2J''+1)$, which represents the $J''$-level degeneracy (see footnote 40).

Footnote 40: In the case when only P- and R-branches appear, the degeneracy is $J'+J''+1$, which equals $2J''$ and $2(J''+1)$ for the P- and R-branches, respectively. For bands where a Q-branch is present, the $J$-dependence is given by the so-called Hönl–London coefficients [64].
[Fig. 24 graphic: LIF (arb. units) versus laser wavelength (Å), 3250–3285 Å; panels (a)–(c); combs mark the $B1_{v'} \leftarrow X0^+_{v''=0}$ and $A0^+_{v'} \leftarrow X0^+_{v''=0}$ progressions; the $5^3P_1$–$5^1S_0$ atomic line and asterisked components are indicated.]
Fig. 24. (a) $B1 \leftarrow X0^+$ and $A0^+ \leftarrow X0^+$ transitions in the excitation spectrum of the CdKr molecule reported in Refs. [48,49]. (b) Simulation of the $B1_{v'} \leftarrow X0^+_{v''=0}$ and $A0^+_{v'} \leftarrow X0^+_{v''=0}$ transitions, assuming that Morse functions represent the $A0^+$- and $X0^+$-state potentials and that a double-well potential similar to that of Fig. 14 (Section 3.5.10, see text for details) represents the $B1$-state potential [50]. An attempt to reproduce the rotational “shading” of individual vibrational components was made using rotational constants evaluated from $R_e$, $\omega_e$ and $\omega_e x_e$ determined in the vibrational analysis (compare with Eqs. (7) and (36)) and $T_{rot}=5$ K. A Lorentzian convolution function (FWHM of 1.0 cm$^{-1}$) was used in the simulation. The isotope shift within each vibrational component ($\Delta\nu_{ij}=0.1$ cm$^{-1}$) was found to be negligible at this level of spectral resolution and was not analysed. (c) Individual $v' \leftarrow v''=0$ and $v' \leftarrow v''=1$ progressions of the $B1 \leftarrow X0^+$ and $A0^+ \leftarrow X0^+$ transitions, represented by thick and thin bars, respectively, assuming Morse representations for all three states ($T_{vib}=9.0 \pm 0.5$ K). From the comparison of simulations (b) and (c) for the $B1 \leftarrow X0^+$ transition it is evident that the excited-state Morse representation reproduces only the bottom of the deeper well in Fig. 14. Positions of Cd$_2$ vibrational components in the $0^+_u \leftarrow X0^+_g$ transition investigated in Ref. [58] are indicated with asterisks.
Figs. 25(c)–(e) show three simulations of the $A0^+_{v'=0} \leftarrow X0^+_{v''=0}$ band in CdNe, performed for three different rotational temperatures, $T_{rot}=0.5$, 5 and 10 K, respectively. It is evident that by adjusting $T_{rot}$ it is possible to fit the shape of the experimental band, and here it was determined
Fig. 25. (a) Rotational structure of the $v'=0 \leftarrow v''=0$ band recorded in the $A0^+ \leftarrow X0^+$ transition of the CdNe excitation spectrum [45]. The $J=J''$ assignment is shown for the R-branch (see footnote 30). The lower part shows fringes detected using a Fabry–Perot etalon (FSR = 1 cm$^{-1}$) to monitor the tuning of the fundamental laser frequency. No isotope splitting was resolved (according to Eq. (32), $\Delta\nu_{ij}=0.005$–$0.010$ cm$^{-1}$). (b) Graphical illustration of rotational transitions between $J=J''$ and $J'$ for the three P-, Q- and R-branches (see also Eqs. (34)). (c), (d) and (e) Simulations of the rotational contour from (a), performed using rotational constants determined in Ref. [45] and assuming $T_{rot}=0.5$, 5 and 10 K, respectively. From the simulation, $T_{rot}=5$ K was obtained as the “best fit” to the experimental spectrum in (a). It is seen (also for the case of the $A0^+_{v'=0} \leftarrow X0^+_{v''=0}$ band in HgHe, Fig. 19) that the P-branch is mostly responsible for the band head, while components of the R-branch form a degraded (“blue-shaded”) contour (because of the $\Omega'=0^+ \leftarrow \Omega''=0^+$ transition the Q-branch is not present in the spectrum, see also [213]). A FWHM of 0.2 cm$^{-1}$ was used for the Lorentzian convolution function.
that the simulation for $T_{rot}=5$ K constitutes the “best fit”. Performing the simulation also makes it possible to adjust the $B'$ and $B''$ constants determined previously for HgHe using the plot shown in Fig. 19(d). From the simulation of the separate branches (here the P- and R-branches) it is easy to analyse the contribution of particular branches to the total band profile. In Fig. 25, the P-branch is mostly responsible for the band head, while components of the R-branch form a degraded (“blue-shaded”) contour, which is characteristic for $R_e' < R_e''$ (see discussion in Section 3.7). The same was found to be true for the rotational structure of the recently analysed $A0^+_{v'=0} \leftarrow X0^+_{v''=0}$ band in CdHe [44].
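The intensity model of this section (Boltzmann factor at $T_{rot}$, the $2J''$ and $2(J''+1)$ weights of footnote 40, and the factor $\nu$) can be written down in a few lines. The sketch below is a minimal illustration, not the simulation code of Ref. [45]; the function name, the band origin and the rotational constants are hypothetical placeholders, and rigid-rotor term values $F(J)=BJ(J+1)$ are assumed.

```python
import numpy as np

HC_OVER_K = 1.438777  # hc/k in cm*K


def pr_branch_sticks(nu0, b_upper, b_lower, t_rot, j_max=40):
    """Stick positions (cm^-1) and relative intensities of the P- and R-branches
    of an Omega = 0+ <- 0+ band (no Q-branch), rigid-rotor approximation."""
    j = np.arange(j_max + 1)                       # J'' values
    f_lower = b_lower * j * (j + 1)                # F(J'') in cm^-1
    boltz = np.exp(-HC_OVER_K * f_lower / t_rot)   # Boltzmann factor at T_rot

    # R-branch: J' = J'' + 1, weight 2(J'' + 1)
    nu_r = nu0 + b_upper * (j + 1) * (j + 2) - f_lower
    int_r = nu_r * 2 * (j + 1) * boltz

    # P-branch: J' = J'' - 1 (J'' >= 1), weight 2J''
    jp = j[1:]
    nu_p = nu0 + b_upper * (jp - 1) * jp - f_lower[1:]
    int_p = nu_p * 2 * jp * boltz[1:]
    return (nu_p, int_p), (nu_r, int_r)


# Hypothetical constants with B' > B'' (i.e. R_e' < R_e''), not values from the text.
NU0, B_UP, B_LO = 30000.0, 0.06, 0.04
(p_cold, ip_cold), (r_cold, ir_cold) = pr_branch_sticks(NU0, B_UP, B_LO, t_rot=0.5)
(p_warm, ip_warm), (r_warm, ir_warm) = pr_branch_sticks(NU0, B_UP, B_LO, t_rot=10.0)
```

With $B'>B''$ the P-branch positions turn back on themselves and pile up into a head on the low-wavenumber side, while the R-branch runs monotonically towards higher wavenumbers, which is the “blue-shaded” contour described above. Broadening the sticks and comparing contours for several trial values of `t_rot` mirrors the procedure of Figs. 25(c)–(e).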
[Fig. 26 graphic: HgAr $B1_{v'=3} \rightarrow X0^+_{v''}$; LIF (arb. units) versus monochromator wavelength (Å), 2520–2540 Å; panels (a)–(c); comb marks $v'=3 \rightarrow v''=0, 1, 2, 3, 4, \ldots$]
Fig. 26. (a) $B1_{v'=3} \rightarrow X0^+_{v''}$ fluorescence spectrum (interference in character, see explanation in text) of the HgAr molecule reported in Ref. [54]. The spectrum was detected after selective excitation of the $v'=3$ excited-state vibrational level. It was recorded with the highest possible spectral resolution of the detection system, i.e. a 2.5 cm$^{-1}$ monochromator band-pass. It shows partly resolved bound–bound $v'=3 \rightarrow v''=0, 1, 2, 3, \ldots$ transitions. (b) Simulation of the F–CF for the $B1_{v'=3} \rightarrow X0^+_{v''}$ transitions, assuming that Morse functions represent the bound wells of the $B1$ and $X0^+$ states, and with molecular parameters determined in an analysis of the $B1 \leftarrow X0^+$ excitation spectrum [53] (see also Table 12). (c) Simulation performed to resemble the spectrum. Each individual vibrational peak was convoluted with a Gaussian convolution function with FWHM of 0.16 Å (i.e. 2.5 cm$^{-1}$) representing the monochromator band-pass.
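The band-pass quoted in the caption of Fig. 26 in both Å and cm$^{-1}$, and the Gaussian broadening of the F–CF sticks, can be reproduced with the short sketch below. It is an illustration only; the function names are hypothetical, and the reference wavelength of 2530 Å is an assumption (the middle of the wavelength axis of Fig. 26), as are the stick positions and F–CF values in the example.

```python
import numpy as np


def bandpass_angstrom_to_wavenumber(fwhm_angstrom, wavelength_angstrom):
    """Convert a band-pass from Angstrom to cm^-1: |d(nu)| = 1e8 * d(lambda) / lambda^2."""
    return 1.0e8 * fwhm_angstrom / wavelength_angstrom**2


def gaussian_spectrum(grid, positions, fcf, fwhm):
    """Sum of Gaussian profiles (unit peak height) weighted by the F-CF sticks."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    grid = np.asarray(grid, dtype=float)[:, None]
    pos = np.asarray(positions, dtype=float)[None, :]
    profiles = np.exp(-0.5 * ((grid - pos) / sigma) ** 2)
    return profiles @ np.asarray(fcf, dtype=float)


# 0.16 Angstrom at an assumed wavelength of 2530 Angstrom
bandpass_cm = bandpass_angstrom_to_wavenumber(0.16, 2530.0)

# Hypothetical stick spectrum (positions in Angstrom, arbitrary F-CF values)
grid_A = np.linspace(2520.0, 2540.0, 2001)
simulated = gaussian_spectrum(grid_A, [2526.0, 2527.2, 2528.3], [0.2, 0.5, 0.3], fwhm=0.16)
```

Because the conversion scales as $1/\lambda^2$, the same band-pass in Å corresponds to a slightly different width in cm$^{-1}$ at each end of the spectrum; over a 20 Å window this variation is small.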
4.2. Analysis of fluorescence spectra

Fluorescence from a selectively excited ro-vibrational level can be helpful in the characterization of a large part of the ground-state PE curve. As concluded previously, if the $\Delta R_e$ displacement is favourable, both the repulsive wall and the potential well below the dissociation limit are “accessible” in the fluorescence. This makes it possible to determine directly ground-state characteristics that are not accessible in the excitation, or to supplement and/or complete the characterization performed while studying the excitation spectra. In this case, the regions of $R$ that are characterized extend from the intermediate down to the short-range limit. Combining these two approaches, i.e. analyses of both the fluorescence and excitation spectra, is a powerful tool for characterization of the ground- and excited-state potentials over a wide range of $R$. Ground-state characterization employing an analysis of the bound–bound and bound–free parts of the fluorescence was performed for ZnAr [40,47], CdNe [45], CdAr [47] (Fig. 27), CdKr [48,49], HgAr [53,54] (Fig. 26), HgKr [56] and Hg$_2$ [61,62]. The determination was more accurate if not just one but two or more “channels” of the fluorescence (from $v'$ of different electronic states and/or from different $v'$-states in the same electronic state) terminating on the same part of the ground-state PE curve were analysed. To examine these, the reader is
referred to analyses of the CdNe ($D1_{v'=1} \rightarrow X0^+$ and $A0^+_{v'=1} \rightarrow X0^+$) [45], CdAr ($D1_{v'=7,8} \rightarrow X0^+$ and $A0^+_{v'=4} \rightarrow X0^+$) [47], CdKr ($D1_{v'=16} \rightarrow X0^+$ and $A0^+_{v'=9} \rightarrow X0^+$) [48,49], HgAr ($A0^+_{v'=2,3,4,5} \rightarrow X0^+$ and $B1_{v'=0,1,2,3} \rightarrow X0^+$, [53] and [54]), and Hg$_2$ ($E1_{u,\,v'=0,1,2,3} \rightarrow X0^+_g$ and $D1_{u,\,v'=57\pm1} \rightarrow X0^+_g$ [61], and $G0^+_{u,\,v'=39} \rightarrow X0^+_g$ [62]) spectra.
4.2.1. Modelling of bound–bound discrete transitions and bound–free continuous spectra

Usually, the shortest-wavelength part of the fluorescence spectrum consists of bound–bound discrete transitions (see Fig. 15(b) for the corresponding scheme of transitions between the excited- and ground-state PE curves). If the detection system comprises a monochromator with sufficient spectral resolution (details in Section 5.3), it is possible to resolve the bound–bound transitions and reveal the structure of the ground-state bound levels. The spectral resolution of the monochromators was sufficient to resolve only vibrational transitions (this work does not consider isotopic structures or rotational spectra detected in the fluorescence); however, this was enough to improve or supplement the ground-state characteristics determined in the course of analyses of the excitation spectra. To examine the character of the bound–bound fluorescence, the reader is referred to the example analyses presented in Ref. [47]: $D1_{v'=8} \rightarrow X0^+_{v''}$ transitions in CdAr (Fig. 27); Refs. [53,54]: $A0^+_{v'=2,3,4,5} \rightarrow X0^+_{v''}$ and $B1_{v'=0,1,2,3} \rightarrow X0^+_{v''}$ (Fig. 26) transitions in HgAr; and Refs. [61] and [62]: $E1_{u,\,v'=0,1,2,3} \rightarrow X0^+_{g,\,v''}$ and $G0^+_{u,\,v'=39} \rightarrow X0^+_{g,\,v''}$ transitions in Hg$_2$, respectively. For the bound–free continua in the fluorescence one should inspect the analyses of Refs. [40,47]: $D1_{v'=10} \rightarrow X0^+$ in ZnAr (Fig. 46 in Section 6.3); Ref. [45]: $A0^+_{v'=1} \rightarrow X0^+$ and $D1_{v'=1} \rightarrow X0^+$ in CdNe; Ref. [47]: $A0^+_{v'=4} \rightarrow X0^+$ and $D1_{v'=7,8} \rightarrow X0^+$ in CdAr; Refs. [48,49]: $A0^+_{v'=9} \rightarrow X0^+$ and $D1_{v'=16} \rightarrow X0^+$ in CdKr; Ref. [54]: $A0^+_{v'=2,3,4,5} \rightarrow X0^+$ in HgAr; Ref. [56]: $A0^+_{v'=8} \rightarrow X0^+$ in HgKr; and Refs. [61,62]: $D1_{u,\,v'=57\pm1} \rightarrow X0^+_g$ and $G0^+_{u,\,v'=39} \rightarrow X0^+_g$ in Hg$_2$, respectively. For the sake of illustration, the $B1_{v'=3} \rightarrow X0^+_{v''}$ transitions in HgAr [54] and the $D1_{v'=8} \rightarrow X0^+$ transitions in CdAr [47] are shown in Figs. 26 and 27, respectively (see also Fig. 46 for the $D1_{v'=10} \rightarrow X0^+$ transitions in ZnAr [40,47]).
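The bound–bound part of such a spectrum is governed by Franck–Condon factors between the emitting $v'$ level and the ladder of $v''$ levels. As a minimal sketch of how these can be obtained numerically for two Morse potentials, the code below diagonalizes a finite-difference Hamiltonian on a radial grid. This is a swapped-in generic technique, not the Turbo Pascal or LEVEL codes cited in the text; all function names and all potential parameters are hypothetical placeholders chosen for illustration only.

```python
import numpy as np

HBAR2_OVER_2 = 16.857629  # hbar^2/2 in amu * Angstrom^2 * cm^-1


def morse(r, de, beta, re):
    """Morse potential (cm^-1) with its minimum at zero energy."""
    return de * (1.0 - np.exp(-beta * (r - re))) ** 2


def vib_states(r, potential, mu):
    """Eigenvalues (cm^-1) and grid-normalised eigenvectors of the radial
    Hamiltonian, using a three-point finite-difference second derivative."""
    h = r[1] - r[0]
    t = HBAR2_OVER_2 / (mu * h**2)
    n = len(r)
    ham = np.diag(2.0 * t + potential) - t * np.eye(n, k=1) - t * np.eye(n, k=-1)
    return np.linalg.eigh(ham)


def franck_condon_factors(vec_upper, vec_lower):
    """FCF[v', v''] = |<v'|v''>|^2 for grid-normalised eigenvectors."""
    return (vec_upper.T @ vec_lower) ** 2


# Hypothetical parameters (amu, cm^-1, Angstrom^-1, Angstrom); not from the text.
MU = 30.0
r = np.linspace(2.0, 12.0, 600)
e_up, vec_up = vib_states(r, morse(r, de=300.0, beta=1.5, re=3.0), MU)
e_lo, vec_lo = vib_states(r, morse(r, de=100.0, beta=1.2, re=4.0), MU)
fcf = franck_condon_factors(vec_up, vec_lo)

# F-CF pattern for emission from v' = 3 to the first few v'' levels
pattern = fcf[3, :6]
```

Repeating the calculation while shifting `re` of one potential shows how sensitively the F–CF pattern responds to $\Delta R_e$, which is the basis of the $\Delta R_e$ determinations discussed in this section. Only eigenvalues below the dissociation limit correspond to bound levels; the higher eigenvectors are box-discretized continuum states.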
The spectra in Figs. 26 and 27 present characteristic profiles of the fluorescence arising from the decay of the selectively excited $v'=3$ and $v'=8$ vibronic levels in the $B1$ (HgAr) and $D1$ (CdAr) states, respectively. However, the two spectra differ from each other. Firstly, the $B1 \rightarrow X0^+$ spectrum possesses an interference character, while the $D1 \rightarrow X0^+$ spectrum has a reflection character. These two characters of the fluorescence spectra will be discussed below. Secondly, they differ because of the overlap of the excited- and ground-state potentials (defined by $\Delta R_e$ and the shape of the PE curves), which promotes only bound–bound transitions in the case of the $B1 \rightarrow X0^+$ spectrum, and both bound–bound and bound–free transitions in the case of the $D1 \rightarrow X0^+$ spectrum. As mentioned above, the transitions in fluorescence spectra can terminate on the repulsive as well as on the bound part of the ground-state PE curve. They give rise to the so-called Condon internal diffraction (CID) patterns [195,196], which are the result of interference between the vibrational wave function of the bound excited state, $\psi_{v'}(R)$, and the continuum of wave functions belonging to the unbound ground state, $\psi''(R;E)$ (see below). In the fluorescence spectra, the CID oscillatory pattern can continue into the bound–bound part of the spectra in the form of the F–C envelope of the discrete transitions (see footnote 41). The profile of a CID pattern depends on the “Mulliken difference potential”

Footnote 41: The actual shape of the envelope may be unrecognizable when the period of oscillations of the envelope becomes comparable to, or even smaller than, the spacing of discrete lines in the spectrum.
Fig. 27. $D1_{v'=8} \rightarrow X0^+$ fluorescence spectrum (reflection in character, see explanation in text) of the CdAr molecule detected after selective excitation of the $v'=8$ excited-state vibrational level. The spectrum was studied and analysed in Ref. [47]. (a) and (b) The shortest-wavelength part of the spectrum detected with a spectral resolution of 7.0 and 3.1 cm$^{-1}$ monochromator band-pass, respectively. Trace (b) shows the partly resolved first $v'=8 \rightarrow v''=0, 1, 2, 3, 4$ bound–bound transitions. The onset of the bound–free transitions is shown. (c) and (d) The simulated bound–bound part of the spectrum, generated on the assumption that Morse functions represent the bound wells of the $D1$ and $X0^+$ states, and using molecular parameters determined in the analysis of the $D1 \leftarrow X0^+$ excitation spectrum [47]. The simulation in (c) allowed a precise determination of the difference of the equilibrium internuclear separations, $\Delta R_e = R_e'-R_e'' = -1.070 \pm 0.005$ Å. The simulations in (c) and (d) were performed to resemble the bound–bound spectra in (a) and (b), respectively. Therefore, the individual F–CF corresponding to vibrational peaks were represented by a Gaussian convolution function representing the monochromator throughput with FWHM of 0.4 Å (i.e. 7.5 cm$^{-1}$) and 0.15 Å (i.e. 2.8 cm$^{-1}$), respectively. (e) The gross spectrum with bound–bound (a horizontal bar shows their range) and bound–free transitions detected with low spectral resolution of the detection system (50 cm$^{-1}$ monochromator band-pass). (f) The simulation of the bound–free part of the spectrum, assuming that the excited state and the repulsive part of the ground-state potential are represented by Morse and M–S(10.6, 7.0) functions, respectively. The insert shows the M–S ground-state repulsive-potential representation compared with a Morse function plotted using the determined parameters (see also Table 11).
defined as

$$U_{Mull}(R) = G(v') - \left[U'(R) - U''(R)\right], \tag{48}$$

where $G(v')$ is the total vibrational energy (term) of the emitting $v'$ level, and $U'(R)$ and $U''(R)$ are the potential energies of the emitting (excited) and final (ground) states, respectively [214]. When $U_{Mull}(R)$ has one or more extrema (Fig. 28(a)), the spectrum displays a more complicated
Fig. 28. The PE curves for (a) the $B1$ excited and $X0^+$ ground states of the HgAr molecule, represented by Morse and M–S(11.96, 10.8) potentials, and (b) the $D1$ excited and $X0^+$ ground states of the CdAr molecule, represented by Morse and M–S(10.6, 7.0) potentials, determined in Refs. [53,54] and [47], respectively. The “Mulliken difference potentials” $U_{Mull}(R)$ (compare with Eq. (48)) related to (a) the $B1$ and $X0^+$ potentials of HgAr (interference character) and (b) the $D1$ and $X0^+$ potentials of CdAr (reflection character) are displayed for transitions from (a) the $v'=3$ and (b) the $v'=8$ level, respectively. For the sake of clarity the $B1$- and $D1$-state PE curves were shifted down by $\nu_{at}(6^3P_1$–$6^1S_0) = 39412.3$ cm$^{-1}$ in Hg [70] and by $\nu_{at}(5^1P_1$–$5^1S_0) = 43692.5$ cm$^{-1}$ in Cd [70], respectively.
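A curve of the kind plotted in Fig. 28 can be generated directly from Eq. (48), and its character (monotonic or with extrema) checked numerically. The sketch below is only an illustration: it uses Morse functions for both states for simplicity (the figure itself uses M–S functions for the ground states), and the function names and all parameter values are hypothetical placeholders, not the potentials of Refs. [47,53,54].

```python
import numpy as np


def morse(r, de, beta, re, te=0.0):
    """Morse potential (cm^-1) whose minimum lies at energy te."""
    return te + de * (1.0 - np.exp(-beta * (r - re))) ** 2


def mulliken_difference(g_v, u_upper, u_lower):
    """Eq. (48): U_Mull(R) = G(v') - [U'(R) - U''(R)], all on one energy scale."""
    return g_v - (u_upper - u_lower)


def count_extrema(y):
    """Number of interior extrema, found as sign changes of the first difference."""
    s = np.sign(np.diff(y))
    s = s[s != 0]
    return int(np.sum(s[1:] != s[:-1]))


# Hypothetical potentials on a common energy scale (ground-state asymptote at 0).
r = np.linspace(2.5, 8.0, 2001)
u_up = morse(r, de=300.0, beta=1.5, re=3.0, te=30000.0)
u_lo = morse(r, de=100.0, beta=1.2, re=4.0) - 100.0
g_v = 30000.0 + 120.0                    # energy of the emitting level (assumed)

# Restrict to the classically allowed region of the emitting level.
allowed = u_up <= g_v
u_mull = mulliken_difference(g_v, u_up[allowed], u_lo[allowed])
character = "reflection" if count_extrema(u_mull) == 0 else "interference"
```

The restriction to the classically allowed region is a design choice of this sketch: the emitting-level wave function is concentrated there, so extrema of $U_{Mull}(R)$ outside it have little effect on the spectrum.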
interference structure (compare with Fig. 26). When $U_{Mull}(R)$ is monotonic (Fig. 28(b)), the spectrum displays a reflection structure, i.e. regular oscillations in the resulting spectrum are reflections of the squared vibrational wave function $(\psi_{v'})^2$ associated with the initial level $v'$ (compare with Fig. 27; see footnote 42). Therefore, the number of maxima in the resulting CID pattern exceeds the $v'$ quantum number of the emitting level by one. Consequently, detection and simulation of the fluorescence spectra is one of the methods that allow the $v'$-assignment assumed while analysing the excitation spectrum to be confirmed. Examples of characteristic reflection fluorescence, the $D1_{v'=8} \rightarrow X0^+$ spectrum of CdAr and the $D1_{v'=10} \rightarrow X0^+$ spectrum of ZnAr, are shown in Figs. 27 and 46, respectively. The intense peak at the short-wavelength end of the band is interpreted as being due mainly to unresolved bound–bound transitions to the closely spaced vibrational levels of the shallow ground state, while the broad maximum at the long-wavelength end corresponds to the inner turning point in the potential of the emitting $v'$ level. The simulation of such a spectrum consisted of two distinct parts, bound–bound and bound–free, that had to be performed separately. The simulation of the bound–bound part relies on the calculation of F–CF (Eq. (30)) for transitions from the emitting $v'$ level to the ladder of bound $v''$ levels (Fig. 15(b)), while the simulation of the bound–free part consists of the calculation of continuous intensity profiles of the fluorescence bands. As in the simulation of the excitation spectra (Section 4.1.3), for the simulation of intensities of the bound–bound discrete transitions in the fluorescence, a Turbo Pascal code of Windsor’s group

Footnote 42: A detailed discussion of reflection and interference spectra was given by Tellinghuisen [242]. See also an illustrative discussion in Ref. [80].
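The statement that a reflection spectrum shows $v'+1$ maxima can be illustrated with a deliberately simplified model. The sketch below assumes harmonic-oscillator wave functions in a dimensionless coordinate and a linear (hence monotonic) difference potential, so that the intensity is simply the squared wave function mapped onto the energy axis. These are simplifying assumptions for illustration, not the Morse and JWKB treatment of the reviewed work, and the function names and numerical values are hypothetical.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial.hermite import hermval


def ho_wavefunction(v, x):
    """Normalised harmonic-oscillator eigenfunction in the dimensionless coordinate x."""
    coeffs = np.zeros(v + 1)
    coeffs[v] = 1.0                       # select the physicists' Hermite polynomial H_v
    norm = 1.0 / sqrt(2.0**v * factorial(v) * sqrt(pi))
    return norm * hermval(x, coeffs) * np.exp(-0.5 * x**2)


def reflection_profile(v, x, e_center, slope):
    """Reflection approximation with a linear difference potential E = e_center - slope*x:
    I(E) is proportional to |psi_v(x)|^2 / |dE/dx|."""
    energy = e_center - slope * x
    intensity = ho_wavefunction(v, x) ** 2 / abs(slope)
    return energy, intensity


def count_maxima(y):
    """Number of strict interior local maxima of a sampled curve."""
    return int(np.sum((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])))


x = np.linspace(-8.0, 8.0, 4001)
energy, intensity = reflection_profile(8, x, e_center=40000.0, slope=500.0)
n_maxima = count_maxima(intensity)        # compare with v' + 1
```

A non-linear but still monotonic difference potential stretches or compresses the pattern along the energy axis without changing the number of maxima, which is why counting maxima is a robust check of the $v'$-assignment.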
[168,211] was mostly employed (see examples in Figs. 26 and 27). As mentioned above, the program had its limitations. Usually, the simulation of the bound–bound part employs all the vibrational levels of the ground state, and it is a very good test of the particular representation of the ground-state potential in the bound-well region. As in the simulation of excitation spectra, the simulation of bound–bound fluorescence can also be used to determine or improve the value of $\Delta R_e$, which was the case for the $D1_{v'=8} \rightarrow X0^+$ spectrum of CdAr presented in Fig. 27(c) and (d). The Turbo Pascal code reported in Ref. [168] generated the CID patterns, i.e. the F–C integrals of Eq. (30), which are in that instance a function of the (non-quantized) energy of the two atoms in the repulsive molecular ground state. This implies that the intensity of the emitted fluorescence $I(E)$ (or $I(\nu = E/h)$ or $I(\lambda = hc/E)$) may be represented as a function of the photon energy $E$ [214]:

$$I(E) \propto \left| \int_0^\infty \psi_{v'}(R)\, \psi''\big((G(v')-E);R\big)\, dR \right|^2, \tag{49}$$

where $G(v')-E$ is the total energy of the ground-state continuum at which the transition terminates. Eq. (49) neglects the rotational structure of the vibrational levels (the $J$-dependence is discussed in [54]) and the variation of the electronic transition moment $M$ with $R$. It also neglects the $\nu^3$-dependence of the emission intensity (which is equivalent to the photomultiplier current). These will be discussed below. In the program [168], the excited state was represented by a Morse function, while for the ground-state representation one could choose between Morse and L–J($n-6$) functions. In a later version, the program could also accommodate an M–S($n_0,n_1$) function (Section 3.5.6) as a ground-state representation (see analyses in Refs. [40,42,45,47–49,54,56]). Having identified the particular vibrational level $v'$ of the emitting state, the next step was to assume one of the available analytical functions to represent the repulsive part of the ground-state potential above the dissociation limit. The simulation procedure relied on choosing a free parameter or parameters ($\Delta R_e$, $n$, $n_0$, $n_1$ or other characteristics; for a detailed discussion of the simulation see Refs. [54,62]) and simulating the bound–free spectrum using the representative ground-state potential. Then, the parameters of the potential were varied (a trial-and-error approach) until the best agreement between the observed and calculated positions of the oscillatory maxima and minima was achieved. Generally, using this particular program, three ground-state potential representations (i.e. Morse, L–J($n-6$) and M–S($n_0,n_1$)) were exercised. In the program, the analytical form of the ground-state potential was used to generate wave functions associated with its energy continuum [54]. They were generated in the JWKB approximation using Langer’s condition [215] to obviate problems arising from the singularities in the quasiclassical wave functions at the classical turning points.
The vibrational wave functions for the excited $v'$ states were written analytically on the basis of the Morse potential and the spectroscopic constants derived in analyses of the excitation (vibrational, rotational and/or isotopic) spectra. The products of the excited- and ground-state wave functions were then integrated according to Eq. (49) to produce the simulated fluorescence intensity profiles. The procedure used in the program also assumed a $\nu^3$-dependence of the emission intensity. Since monochromators normally have a pass band with constant $\Delta\lambda$, it might be appropriate to multiply the individual “F–CF” in the continuum by $\nu^5$ rather than $\nu^3$. This would change the relative intensities and shapes of the peaks in the structured continua, but not the positions of the maxima and minima, which constituted the criterion for the “best fit”. Fig. 27(f) shows a result of the simulation performed for the $D1_{v'=8} \rightarrow X0^+$ fluorescence profile in CdAr, assuming Morse and M–S(10.6, 7.0) functions representing the
excited- and ground-state potentials, respectively (see also the example of the ZnAr fluorescence shown in Fig. 46). It is worthwhile to emphasize here that, to approximate the starting value of the $n$ or $n_0$ parameter in the L–J($n-6$) or M–S($n_0,n_1$) ground-state potential, respectively, it was assumed that within a harmonic-oscillator approximation the relation (see footnote 43)

$$\left.\frac{d^2U}{dR^2}\right|_{R=R_e''} = \mu\,(2\pi c\,\omega_e'')^2 \tag{50}$$

produces the formula

$$n \ \text{or} \ n_0 = \frac{2\pi^2 c\,\mu\,(\omega_e'')^2 (R_e'')^2}{3hD_e''}. \tag{51}$$
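Eq. (51) is easy to evaluate once the units are fixed. The sketch below assumes $\omega_e''$ and $D_e''$ in cm$^{-1}$, $R_e''$ in Å and the reduced mass $\mu$ in atomic mass units, and it assumes the usual L–J($n-6$) form $U(R) = D_e[\,6/(n-6)\,(R_e/R)^n - n/(n-6)\,(R_e/R)^6\,]$ for the consistency check. The function names are hypothetical, and the values of $\omega_e''$, $D_e''$ and $\mu$ in the example are placeholders that are not taken from the text; only $R_e''=3.69$ Å is borrowed from the caption of Fig. 22.

```python
import numpy as np

H = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e10      # speed of light, cm/s (pairs with wavenumbers in cm^-1)
AMU = 1.66053907e-27   # atomic mass unit, kg
ANGSTROM = 1.0e-10     # m


def lj_exponent(omega_e, r_e, d_e, mu):
    """Eq. (51): starting value of n (or n0).

    omega_e and d_e in cm^-1, r_e in Angstrom, mu in amu.
    """
    return (2.0 * np.pi**2 * C * (mu * AMU) * omega_e**2 * (r_e * ANGSTROM) ** 2
            / (3.0 * H * d_e))


def lj_n6(r, d_e, r_e, n):
    """Assumed L-J(n-6) form with well depth d_e at r_e (energy zero at dissociation)."""
    return d_e * (6.0 / (n - 6.0) * (r_e / r) ** n - n / (n - 6.0) * (r_e / r) ** 6)


def omega_from_curvature(k_cm_per_A2, mu):
    """Harmonic wavenumber (cm^-1) from a force constant given in cm^-1 / Angstrom^2."""
    k_si = k_cm_per_A2 * H * C / ANGSTROM**2        # J / m^2
    return np.sqrt(k_si / (mu * AMU)) / (2.0 * np.pi * C)


# Placeholder inputs (assumptions), except r_e = 3.69 Angstrom from the Fig. 22 caption.
n_start = lj_exponent(omega_e=19.6, r_e=3.69, d_e=380.0, mu=100.3)
```

The formula follows from equating the curvature of the L–J($n-6$) function at its minimum, $6nD_e''/(R_e'')^2$, with the harmonic force constant of Eq. (50). The value obtained is only a starting point for the trial-and-error variation described in the text.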
Wallace et al. [216] made a similar approximation (50) in an analysis of regularities in the NaRG and MeRG (Me = Zn, Cd, Hg; RG = Ne, Ar, Kr) molecules. In a simulation employing the M–S($n_0,n_1$) potential, Eq. (51) allowed $n_0$ to be set, and there was still one adjustable parameter, $n_1$, that was used during the procedure. The “best fit” of the simulated profile to the experimental spectrum defined a single pair ($n_0,n_1$), which labels a single M–S($n_0,n_1$) potential (an example of a result is shown in Fig. 27(g)). In the analysis of the $G0^+_{u,\,v'=39} \rightarrow X0^+_g$ bound–free fluorescence in Hg$_2$, to overcome limitations imposed by the Turbo Pascal program, the Fortran code BCONT 1.4 of Le Roy [217] was employed. It offered a larger range of available analytical functions to represent the ground-state repulsive wall. Moreover, there was no limitation on the magnitude of the $v'$-value and, what was very convenient, the program made provision for the variation of the electronic transition dipole moment $M$ with $R$, which can be found in the literature for some diatoms reviewed here (e.g. in Ref. [150] for Hg$_2$). The main result of the modelling of bound–free continua is an analytical representation of the ground-state repulsive wall above its dissociation limit. It is always an open question how to join two representations obtained in the analysis of different kinds of experimental data: one from the simulation of bound–free transitions, and another from modelling of the excitation spectra, which provides a representation of the ground-state potential well below its dissociation limit. One particular approach was employed by Le Roy et al. [218] for modelling of the bound–free $d\,^3\Pi_1 \rightarrow a\,^3\Sigma^+$ spectra of NaK. The fitted potential was required to merge smoothly into the experimentally determined bound part of the electronic state, i.e. both the function and its first derivative were made continuous at the junction. However, this requirement was omitted by Masters et al.
[219] in an analysis of the same bound–free emission in NaK. In the simulations reviewed here the approach was simplified: it was only required that the amplitude of the last F–CF in the bound–bound fluorescence spectrum be equal to the initial amplitude of the adjacent bound–free profile. This simplification probably caused the potential representing the bound well of the ground state not to join smoothly with the potential describing the repulsive wall (e.g. shifts of approximately $8\times10^{-3}$, $5\times10^{-3}$, $9\times10^{-3}$ and $1.8\times10^{-3}$ Å were observed for HgAr [54], Hg$_2$ [62], CdKr [48,49] and HgKr [56], respectively).
Footnote 43: Eq. (50) yields for the Morse potential the well-known approximation $D_e = \omega_e^2/4\omega_e x_e$ (Eq. (15)) and should be satisfied by every analytically realistic potential.
For some particular cases it was very interesting to compare the results of modelling of the bound–free continua with the ground-state repulsive wall “constructed” with the help of the semi-classical RKR-like (see footnote 44) inversion method of Le Roy [220]. This approach, which is complementary to the “exact” computational and fitting procedures, offers one particular advantage: it distinguishes between the “phase” and “amplitude” information in the experimental spectrum and shows how the positions of the intensity extrema are determined by the shape of the repulsive potential, while the peak heights depend on the transition dipole moment. The experimental input data were the energy values of the intensity extrema (maxima and minima) in the recorded fluorescence bands. A five-constant polynomial fit was used to represent the positions of the experimental intensity extrema, while the representation of the emitting state was taken from other analyses. The method was applied to the $G0^+_{u,\,v'=39} \rightarrow X0^+_g$ and $D1_{v'=16} \rightarrow X0^+$ spectra of Hg$_2$ [62] and CdKr [49], respectively, and the results are shown in Fig. 50 (Section 6.5.1) and Fig. 30(a) (Section 4.3), respectively (open circles). These results will be discussed later. It is important to appreciate that both the simulation of the bound–free continua with an analytical representation of the ground-state repulsive wall and the RKR-like inversion procedure reproduce the real potential only in a specific region of $R$. The region is determined by the experimental data available. Because the information about the repulsive potential comes from the bound–free fluorescence spectra, the characterization limit $R_{lim}$, which depends on $E_{lim} = hc/\lambda_{lim}$, is determined by the extent of the spectra towards long wavelengths (i.e. up to $\lambda_{lim}$). Therefore, the repulsive part of the ground-state potential can be “mapped” accurately only up to a point $G(v') - E_{lim} - D_0''$ above the ground-state dissociation limit (see the scheme shown in Fig.
29 with $E=0$ cm$^{-1}$ at the ground-state dissociation limit). Consequently, the repulsive walls of the ground-state potentials of ZnAr [40,42,47], CdNe [45], CdAr [47], CdKr [48,49], HgAr [54], HgKr [56] and Hg$_2$ [62] were determined up to 6640, 390, 3610, 5650, 1600, 3100 and 7210 cm$^{-1}$ above their individual dissociation limits, respectively, which corresponds to a characterization down to $R_{lim} = 2.45$, 3.15, 2.85, 2.75, 2.90, 2.85 and 2.45 Å, respectively.

4.3. Complementary results of excitation- and fluorescence-spectra simulations

Fig. 29 presents a diagram of the PE curves of the $X0^+$ ground state and three excited states, $A0^+$, $B1$ and $D1$, of the CdKr molecule investigated in Refs. [48,49]. The excited states correlate with the respective $5^3P_1$ and $5^1P_1$ Cd atomic asymptotes. The curves are drawn according to the experimentally determined parameters using a Morse approximation (see also Table 11). The scheme in Fig. 29 also shows the frequencies $\nu_{00}$ corresponding to the excitation from $v''=0$ to $v'=0$ of the $A0^+$ and $D1$ states; two fluorescence channels (wide vertical arrows) from the $v'=9$ and $v'=16$ vibrational levels of the $A0^+$ and $D1$ states, respectively, to the same repulsive part of the ground state; the $D_0'$ and $D_0''$ dissociation energies of the excited and ground states, respectively; and the frequencies $\nu_{at}$ corresponding to the atomic transitions from the $5^1S_0$ ground state to the $5^3P_1$ and $5^1P_1$ excited states in Cd. At the top of the figure, three characteristic regions of internuclear separation are indicated approximately, corresponding to the (a) short-, (b) intermediate- and (c) long-range regions of $R$ (i.e. to the repulsive wall,

Footnote 44: The RKR-like character of the inversion procedure refers to the very well known Rydberg–Klein–Rees potential inversion method [222–224] of diatomic spectroscopy. It produces the classical turning points of the PE curve under investigation from experimentally determined frequencies of ro-vibrational transitions.
J. Koperski / Physics Reports 369 (2002) 177 – 326
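[Fig. 29 graphic: PE curves of the X0+, A0+, B1 and D1 states of CdKr, correlating with the Cd(5 ¹S₀)+Kr, Cd*(5 ³P₁)+Kr and Cd*(5 ¹P₁)+Kr asymptotes. Vertical axis: E (cm⁻¹); horizontal axis: R (Å). Labelled: v″ = 0, Re″ and D0″ of the ground state (E = 0 at its dissociation limit); v′ = 0, v′ = 9 (A0+) and v′ = 16 (D1); D0′; ν00 and νat for both asymptotes; regions a, b and c at the top.]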
Fig. 29. Scheme of the PE curves drawn for the ground and lowest excited electronic states of CdKr molecule. For details of the description, see text.
bound well, and long-range tail of the potentials, respectively). In the articles reviewed here, those three regions were characterized separately using different experimental approaches and methods of data processing. Thus, the short-range characterization was related only to the ground-state repulsive walls listed at the end of Section 4.2.1 (see also Figs. 13(a) and 30(a)). The experimental data were provided by the bound–free fluorescence detected after selective excitation of the corresponding vibrational state. Using this approach, four analytical representations were tested, i.e. Morse (Eq. (13)), L–J(n − 6) (Eq. (20)), L–J(12 − 6) (Eq. (21)) and M–S(n0, n1) (Section 3.5.6). It was found that generally the most suitable representations for the MeRG and Me2 ground-state repulsive walls are the M–S(n0, n1) and L–J(n − 6) functions, respectively. This will be discussed in detail in Section 6. Because of the F–C “window” for the excitation, almost all the excited states in MeRG and Me2 molecules were satisfactorily characterized in their region of intermediate R (bound wells,
Fig. 30(b)). As already discussed, the experimental data were not always sufficient to fully “cover” the entire well of an excited-state potential with transitions from v″ = 0 (and the several lowest v″ > 0) to the ladder of corresponding v′ levels. As an example, Fig. 29 shows the ranges of v′ in the A0+ and D1 states to which transitions were allowed by the F–C “window” in the excitation from v″ = 0 (dotted horizontal lines). Therefore, the description of the A0+ and D1 excited-state potentials was based on the energies of the transitions within those ranges and had to be extended down to the bottom of the potential wells as well as up to the dissociation limits. The former extension is much more reliable than the latter, a conclusion already reached in Section 4.1.1. Moreover, the characterization in the intermediate region of R may be improved provided that excitation spectra resolving transitions between rotational levels are detected. This allows the excited-state characterization to be extended beyond the vibrational structure to include the rotational structure as well. The ground-state characterization in this region of R merits a comment. Because the supersonic expansion is by nature a source of vdW molecules in their lowest vibrational levels v″ (Section 5.1), a characterization of the ground state is usually indirect. Most researchers in this field [26–28], who use similar methods of producing very weakly bound molecules, characterize the ground state using the relation expressed by Eq. (41). Thus, the accuracy of D0″ depends on the accuracy of the determination of D0′ which, as already noted, is not always precise (Section 4.1.1 and the example of D0′(E1u) in Hg2 in Fig. 20). An error in its determination propagates, rendering this approach unreliable.⁴⁵ With this in mind, the main effort in the majority of ground-state characterizations has been to produce the studied molecules also in rather “hot” conditions of the supersonic expansion and, consequently, with higher v″ levels populated.
Controlling the conditions of the supersonic expansion is a subtle procedure, which will be discussed below. As a result, a considerable number of “hot” bands were present in the recorded excitation spectra, extending the possibilities for a direct characterization of the molecular ground state. It is worthwhile to list here certain representative examples of such analyses, performed for ZnNe [41], ZnAr [40,42,47], ZnKr [42], CdHe [43], CdNe [45], CdAr [46], CdXe [42], HgNe and HgAr [53], Zn2 and Cd2 [58], and Hg2 [61,62]. These illustrate the effort that was made to make the most of the available experimental data. As a result, in these cases the ground-state characterization was either performed for the first time or significantly enhanced. In some cases, it was even possible to employ a B–S method, as the number of “hot”-band frequencies was sufficiently large (e.g. [42,45,58]). However, these examples concern transitions from the lowest v″. Spectroscopy of bound–bound transitions in fluorescence from a selectively excited v′ offers a possibility of approaching the v″ levels closest to the ground-state dissociation limit.⁴⁶ To detect the near-dissociation v″ levels, however, one must have a detection system with sufficiently high spectral resolution. In the experiments discussed here, only v′ → v″ transitions to the lowest v″ were resolved, enabling a direct characterization of the ground state (Fig. 27(b) and results presented in Refs. [53,54]). The main conclusion drawn from the characterization of the ground-state bound wells is that in this region of R the most adequate representation of the interatomic potential is a Morse function. This is mostly because of the method (i.e. the B–S plot) assumed in their characterization. As
⁴⁵ Even in this case it is always possible to approximate the basic ground-state vibrational characteristics knowing the dissociation energy D0″ and the energy interval between the two lowest ground-state vibrational levels, Δ = G0(v″ = 1) − G0(v″ = 0): ω0″ = 2D0″[1 − (1 − Δ/D0″)^(1/2)] and ω0″x0″ = D0″[1 − (1 − Δ/D0″)^(1/2)]² [64].
⁴⁶ This would enable a long-range ground-state characterization as well.
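The two expressions of footnote 45 can be checked numerically. The sketch below assumes the standard Morse term formula G0(v) = ω0 v − ω0x0 v² (energies measured from v = 0), for which D0 = ω0²/(4 ω0x0); the function name and the input numbers are illustrative only and are not taken from the reviewed articles.

```python
import math


def morse_constants(d0, delta):
    """Return (omega0, omega0x0) of a Morse oscillator.

    Assumption: G0(v) = omega0*v - omega0x0*v**2, so that
    D0 = omega0**2 / (4*omega0x0) and delta = G0(1) - G0(0).
    d0 and delta must share one energy unit (e.g. cm^-1).
    """
    if not 0.0 < delta < d0:
        raise ValueError("need 0 < delta < d0")
    root = math.sqrt(1.0 - delta / d0)
    omega0 = 2.0 * d0 * (1.0 - root)
    omega0x0 = d0 * (1.0 - root) ** 2
    return omega0, omega0x0


# Hypothetical input chosen for easy checking, not data from the review.
omega0, omega0x0 = morse_constants(d0=100.0, delta=19.0)
print(f"omega0 = {omega0:.3f}, omega0x0 = {omega0x0:.3f}")
```

The result can be verified by inserting the returned constants back into Δ = ω0 − ω0x0 and D0 = ω0²/(4 ω0x0).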
found in simulations of the experimental spectra, this particular representation works very well, and the experimental traces were reproduced satisfactorily. The characterization in the long-range limit (Fig. 30(c)) was based on two kinds of experimental data: detection of v′ ← v″ = 0 frequencies up to the excited-state dissociation limit (see examples of analyses of the B1 ← X0+ transitions in CdNe [45], CdAr [46] and HgAr [53,54]) and subsequent application of the limiting LR–B method [204,205], or detection of vibrational frequencies considerably far from the dissociation limit (as allowed by the F–C “window”) and application of the generalized method of Le Roy [208] (see discussion in Section 4.1.2). Clearly, the former approach produces more reliable long-range characteristics than the latter. The best representation of the long-range limit for most of the cases considered in the discussed articles was a combined Morse–vdW function (Eq. (24)), which joins smoothly to a Morse representation describing the bound-well intermediate region. The accuracy of this representation depends on the accuracy with which the long-range characteristics are determined. The Morse–vdW approach was employed in Refs. [45,46,48,56] to represent the interatomic potentials of the ground and several excited states of the CdNe, CdAr, CdKr and HgKr molecules, respectively, in their intermediate and long-range regions. In several investigations, instead of trying to find a single analytical representation of an interatomic potential over the whole range of R, a hybrid-potential characterization was proposed. As a result, separate PE-curve representations for separate regions of internuclear separation were specified for the ground states of the HgAr [54] and HgKr [56] molecules.

4.4. Approximations for long-range characteristics

In the following sections, very practical methods based on an empirical approach to approximating the long-range behaviour of the interatomic potential (i.e.
K–H, L–D and S–K formulas as well as the L–P method) will be discussed. They can be useful in both the LR–B limiting and the Le Roy generalized methods for evaluating the C6 constants.

4.4.1. Kramer–Herschbach combination rule

In 1970, Kramer and Herschbach [8] formulated a simple combination rule that allows the C6(MeRG) constants for unlike partners to be predicted accurately from a knowledge of the corresponding constants
Fig. 30. Example of a three-region ground-state characterization of the CdKr molecule investigated in Refs. [48,49] in the (a) short-, (b) intermediate- and (c) long-range regions of internuclear separation (compare with Fig. 29). In the short-range region (a), the repulsive part of the potential is represented by the M–S(8.6, 7.3) function, which is compared with the Morse and L–J(8.6 − 6) representations as well as with points from the RKR-like inversion method of Le Roy [220–224] (open circles) and the ab initio result of Czuchaj and Stoll [72] (full circles). In the intermediate region (b), the bound well is represented by the combined Morse–vdW function (see Eq. (24)) with C6 = 254 a.u. (average from Tables 5 and 6) and Rc = 9.45 Å, which is compared with the M–S(8.6, 7.3) function and the ab initio points of Ref. [72] (full circles). Note that at the intersection of U(R) with the dissociation limit (U(R) = 0) the shift between Morse–vdW and M–S is very small (9 × 10⁻³ Å) and, consequently, not discernible. In the long-range region (c), the tail of the potential is represented by the combined Morse–vdW function (as in (b)), which is compared with the Morse function, the C6/R⁶ vdW attractive branch (C6 = 254 a.u.) and the M–S(8.6, 7.3) function, as well as with the ab initio points of Ref. [72] (full circles). Note that the Morse–vdW representation approximates the Morse function for smaller R (R < 7.6 Å) and the vdW C6/R⁶ tail for larger R (R > 7.6 Å). The Le Roy radius for the CdKr molecule is evaluated to be RLR = 7.2 Å (Table 7).
Table 4
Atomic polarizabilities αMe and αRG, C6(Hg2) and C6(RG2) constants, and resulting C6(HgRG) constants evaluated according to the K–H formula of Eq. (52) [8]. The table also contains C6(HgRG) constants evaluated assuming the C6(Hg2) that results from the L–D theory (Section 4.4.2) and from the S–K formula with the correction of Cambi et al. (Section 4.4.3). The αHg is known with ±0.9% accuracy [159]. The αHe is an “exact” value, while αNe, αAr, αKr and αXe are known within ±2%, ±0.5%, ±0.5% and ±0.5% error margins, respectively [138]. 1 a.u. = 0.148185 Å³

αHg (b) = 33.9 a.u.; C6(Hg2) = 240 (a), 329.98 (d) or 290.46 (e) a.u.

RG    C6(RG2) (a.u.)    αRG (c) (a.u.)
He    1.47              1.383
Ne    6.30              2.663
Ar    65.0              11.080
Kr    130               16.734
Xe    270               27.292

C6(HgRG) in a.u., with the value in 10⁶ cm⁻¹ Å⁶ in parentheses:

HgRG    C6(Hg2) = 240 (a)    C6(Hg2) = 329.98 (d)    C6(Hg2) = 290.46 (e)
HgHe    15.41  (0.07426)     19.61  (0.09452)        17.83  (0.08597)
HgNe    30.53  (0.14716)     39.18  (0.18887)        35.53  (0.17124)
HgAr    112.51 (0.54231)     139.86 (0.67416)        128.52 (0.61949)
HgKr    163.43 (0.78774)     201.29 (0.97025)        185.67 (0.89498)
HgXe    245.18 (1.18182)     296.49 (1.42912)        275.55 (1.32822)

(a) Ref. [8]. (b) Ref. [159]. (c) Ref. [138]. (d) From Table 5 (L–D theory [2–6]). (e) From Table 6 (S–K formula [7] with correction of Cambi et al. [226]).
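for like partners. For the MeRG molecules the formula reads:
$$C_6(\mathrm{MeRG}) = \frac{2\,C_6(\mathrm{Me_2})\,C_6(\mathrm{RG_2})}{(\alpha_{\mathrm{RG}}/\alpha_{\mathrm{Me}})\,C_6(\mathrm{Me_2}) + (\alpha_{\mathrm{Me}}/\alpha_{\mathrm{RG}})\,C_6(\mathrm{RG_2})}, \qquad (52)$$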
where αMe and αRG are the static dipole polarizabilities of the Me and RG atoms, respectively (they were discussed in Section 3.5.3), and C6(Me2) and C6(RG2) are the induced-dipole–induced-dipole vdW constants for the Me2 and RG2 molecules, respectively (all in a.u.). This combination rule was first suggested by Moelwyn-Hughes in 1957 [225]. The authors of Ref. [8] published a table of α and C6 parameters for testing the combination rule (see their Table I). For the purpose of this review, only pairs of Hg and RG atoms can be used (no data for Zn or Cd are available). The evaluated C6(HgRG) constants are listed in Table 4.

4.4.2. London–Drude theory of dispersion interactions

The C6(MeRG) constants can also be calculated using a simple model proposed by London [2,3] and Drude [4–6]. According to the L–D model, the dispersive interaction between the ground-state
Table 5
Atomic polarizabilities αMe and αRG, ionization potentials IMe and IRG, and resulting C6(MeRG) and C6(Me2) constants evaluated according to the formula C6(MeRG) = (3/2) αMe αRG IMe IRG/(IMe + IRG) or C6(Me2) = (3/4) αMe² IMe of the L–D theory [2–6] (Eq. (53)) for all reviewed molecules. The αMe for Zn, Cd and Hg are known with ±2% [157], ±3% [158] and ±0.9% [159] accuracies, respectively. The αHe is an “exact” value, while αNe, αAr, αKr and αXe are known within ±2%, ±0.5%, ±0.5% and ±0.5% error margins, respectively [138]

Me        αMe (a.u.)    αMe (Å³)    IMe (d) (cm⁻¹)
Zn (a)    38.8          5.75        75 743.5
Cd (b)    49.7          7.36        72 517.3
Hg (c)    33.9          5.02        84 155.9

RG    αRG (e) (a.u.)    αRG (Å³)    IRG (cm⁻¹)
He    1.383             0.205       198 251.0
Ne    2.663             0.395       173 885.0
Ar    11.080            1.642       127 072.6
Kr    16.734            2.481       112 885.3
Xe    27.292            4.044       97 810.8

MeRG, Me2    C6 (a.u.)    C6 (10⁶ cm⁻¹ Å⁶)
ZnHe         20.10        0.09690
ZnNe         37.29        0.17975
ZnAr         139.43       0.67209
ZnKr         201.23       0.96998
ZnXe         308.89       1.48891
CdHe         24.93        0.12017
CdNe         46.30        0.22316
CdAr         173.63       0.83695
CdKr         250.90       1.20937
CdXe         385.71       1.85918
HgHe         18.92        0.09120
HgNe         34.99        0.16867
HgAr         129.86       0.62597
HgKr         186.86       0.90071
HgXe         285.77       1.37748
Zn2          389.65       1.87820
Cd2          611.21       2.94617
Hg2          329.98       1.59057

(a) Ref. [157]. (b) Ref. [158]. (c) Ref. [159]. (d) Ref. [70]. (e) Ref. [138].
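Me and RG atoms is proportional to −1/R⁶ and can be expressed by the relationship
$$U(R) = -\frac{3}{2}\,\alpha_{\mathrm{Me}}\alpha_{\mathrm{RG}}\,\frac{I_{\mathrm{Me}} I_{\mathrm{RG}}}{I_{\mathrm{Me}} + I_{\mathrm{RG}}} \times \frac{1}{R^6} = -\frac{3}{2}\,\frac{\alpha_{\mathrm{Me}}\alpha_{\mathrm{RG}}}{F} \times \frac{1}{R^6} = -\frac{C_6(\mathrm{MeRG})}{R^6}, \qquad (53)$$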
where IMe and IRG are the ionization potentials of the Me and RG atoms, respectively, F = (IMe + IRG)/(IMe IRG) is a reduced ionization potential (a similar relationship can be written for two Me atoms in an Me2 molecule), and |U(Re)| = De. Table 5 lists the C6 constants for the MeRG and Me2 molecules evaluated using this approach (α, I and C6 are expressed in (Å³), (cm⁻¹) and (cm⁻¹ Å⁶), respectively). The above determination of the C6(MeRG) and C6(Me2) constants merits a comment. The accuracy of C6 depends mainly on that of αMe and αRG because the IMe and IRG potentials are
determined with very high accuracy [70] and are considered as “exact” values. Moreover, the αRG are also known with high precision: the error margins for αNe, αAr, αKr and αXe are ±2%, ±0.5%, ±0.5% and ±0.5%, respectively [138]. Furthermore, αHe is considered an “exact” value. Until recently, the polarizabilities αCd and αHg were known with rather poor accuracy, in contrast to αZn, whose error margin of ±2% is comparable to those of the αRG. The αCd and αHg were determined only with ±50% accuracy [138], and more precise values were needed to improve the precision of C6. Here, the recently published αCd (±3%) and αHg (±0.9%) polarizabilities were adopted from Refs. [158,159], respectively. In conclusion, instead of the well-established but inaccurate “old” values of αZn = 47.8 ± 1.0 a.u. (7.08 ± 0.14 Å³) [138], αCd = 40.5 ± 20.3 a.u. (6.0 ± 3.0 Å³) [138] and αHg = 34.4 ± 17.2 a.u. (5.1 ± 2.6 Å³) [138], the more precise values of αZn = 38.8 ± 0.8 a.u. (5.74 ± 0.11 Å³) [157], αCd = 49.7 ± 1.6 a.u. (7.36 ± 0.22 Å³) [158] and αHg = 33.9 ± 0.3 a.u. (5.02 ± 0.045 Å³) [159] were used in the analyses discussed here.

4.4.3. Slater–Kirkwood formula

Plausible estimates of the C6(MeRG) constants can also be evaluated with the help of the Slater–Kirkwood (S–K) formula [7]
$$C_6 = \frac{3}{2}\,\frac{\alpha_{\mathrm{Me}}\,\alpha_{\mathrm{RG}}}{(\alpha_{\mathrm{Me}}/N_{\mathrm{Me}})^{1/2} + (\alpha_{\mathrm{RG}}/N_{\mathrm{RG}})^{1/2}}, \qquad (54)$$
where NMe and NRG are the numbers of electrons in the outer shell (K, L, M, etc.) of the Me and RG atoms, respectively. Table 6 collects the C6 constants for the MeRG and Me2 molecules evaluated using Eq. (54). When calculating C6, it is necessary to take into account the correction of Cambi et al. [226] for the NMe and NRG values and to substitute for each of them the corresponding effective electron number, Neff, given by the formula
$$N_{\mathrm{eff}} = N_{\mathrm{ext}}\left[1 + \left(1 - \frac{N_{\mathrm{ext}}}{N_{\mathrm{int}}}\right)\left(\frac{N_{\mathrm{int}}}{N_{\mathrm{tot}}}\right)^{2}\right], \qquad (55)$$
where Nint and Next are the total numbers of inner and outer electrons, respectively, and Ntot = Nint + Next.
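The three estimates of Section 4.4 (Eqs. (52)–(55)) lend themselves to a quick numerical check against Tables 4–6. The following is a minimal sketch, not the code used in the reviewed work: the function names are illustrative, all polarizabilities and C6 values are taken in atomic units, and the hartree-to-cm⁻¹ conversion factor is the standard value, which is an assumption in the sense that it does not appear in the text.

```python
import math

HARTREE_CM1 = 219474.63  # 1 hartree in cm^-1 (standard value, not from the text)


def n_eff(n_ext, n_tot):
    """Effective electron number of Cambi et al., Eq. (55)."""
    n_int = n_tot - n_ext
    if n_int == 0:  # H and He: no inner electrons, no correction
        return float(n_ext)
    return n_ext * (1.0 + (1.0 - n_ext / n_int) * (n_int / n_tot) ** 2)


def c6_slater_kirkwood(alpha_a, n_a, alpha_b, n_b):
    """S-K formula, Eq. (54); n_a, n_b may be outer-shell or effective numbers."""
    return 1.5 * alpha_a * alpha_b / (
        math.sqrt(alpha_a / n_a) + math.sqrt(alpha_b / n_b)
    )


def c6_london_drude(alpha_a, i_a_cm1, alpha_b, i_b_cm1):
    """L-D formula, Eq. (53); ionization potentials are given in cm^-1."""
    i_a = i_a_cm1 / HARTREE_CM1
    i_b = i_b_cm1 / HARTREE_CM1
    return 1.5 * alpha_a * alpha_b * i_a * i_b / (i_a + i_b)


def c6_kramer_herschbach(c6_aa, alpha_a, c6_bb, alpha_b):
    """K-H combination rule, Eq. (52)."""
    return 2.0 * c6_aa * c6_bb / (
        (alpha_b / alpha_a) * c6_aa + (alpha_a / alpha_b) * c6_bb
    )


# Input values from Tables 4-6 (Zn: 30 electrons, 2 outer; He: 2 electrons).
print("Neff(Zn)      =", round(n_eff(2, 30), 2))
print("S-K  ZnHe     =", round(c6_slater_kirkwood(38.8, 2, 1.383, 2), 2))
print("S-K  ZnHe, Neff =", round(c6_slater_kirkwood(38.8, n_eff(2, 30), 1.383, 2), 2))
print("L-D  ZnHe     =", round(c6_london_drude(38.8, 75743.5, 1.383, 198251.0), 2))
print("K-H  HgHe     =", round(c6_kramer_herschbach(240.0, 33.9, 1.47, 1.383), 2))
```

With the tabulated inputs, these functions land within rounding of the corresponding ZnHe, Zn2 and HgHe entries of Tables 4–6; small differences in the last digit presumably reflect rounding of the input data.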
The empirical formula (55) does not require knowledge of atomic properties whose values are not always easily available, especially for heavier atoms. The correction of Ref. [226], applied in the evaluation of the long-range constant, assesses the role played by the inner-shell electrons relative to that played by the outer-shell ones. It gives the deviation of the effective number of electrons, expressed by the product of the two terms in round brackets. The first term, 1 − Next/Nint, gives the sign of the deviation, while the product gives its magnitude. For many-electron systems, the ratio Next/Nint is less than one and therefore Neff is larger than Next (for the elements from B to Ne the deviation is negative, i.e. Neff < Next, and for the first-row elements H and He no correction is due, as Ntot = Next and Nint = 0). As can be seen from Table 6, the C6 constants calculated using the correction of Cambi et al. are larger by about 20–25% for MeRG and about 25–27% for Me2 molecules. At this point, it is worthwhile to compare the C6 constants from Table 6 with some of those available in the literature. For MeRG complexes, the C6 constant of CdNe was determined experimentally using the limiting LR–B method (Section 4.1.2) to be 31.7 a.u. [45]. Moreover, Brym [227] calculated the C6 constants for all CdRG molecules. The theoretical values of Brym, who used the Unsöld formula [228] with the aid of the Hartree–Fock approximation, are: 16.79 a.u.
Table 6
Atomic polarizabilities αMe and αRG, numbers NMe and NRG of electrons in the outer shell of the Me and RG atoms, respectively, effective electron numbers Neff replacing NMe and NRG, and resulting C6(MeRG) and C6(Me2) constants evaluated according to the S–K formula of Eq. (54) [7] for all reviewed molecules (for Me2, formula (54) reduces to C6 = (3/4) αMe^(3/2) NMe^(1/2)). The C6 constants are calculated without and with the correction of Cambi et al. [226]. 1 a.u. = 4820.2 cm⁻¹ Å⁶

Me        αMe (a.u.)    NMe    Neff (a) (Me)
Zn (c)    38.8          2      3.62
Cd (d)    49.7          2      3.75
Hg (e)    33.9          2      3.85

RG    αRG (a.u.)    NRG    Neff (b) (RG)
He    1.384         2      2
Ne    2.6663        8      7.04
Ar    11.080        8      8.49
Kr    16.734        8      11.45
Xe    27.292        8      12.76

MeRG, Me2    C6 (a.u.), without correction    C6 (a.u.), with correction
ZnHe         15.38                            19.62
ZnNe         31.11                            42.44
ZnAr         115.54                           146.02
ZnKr         166.46                           217.26
ZnXe         254.08                           335.62
CdHe         17.74                            23.07
CdNe         35.70                            46.65
CdAr         134.06                           172.70
CdKr         193.98                           257.25
CdXe         297.81                           398.71
HgHe         14.22                            18.52
HgNe         28.85                            37.80
HgAr         106.43                           142.60
HgKr         183.85                           203.75
HgXe         232.95                           313.28
Zn2          256.34                           344.88
Cd2          371.63                           508.88
Hg2          209.35                           290.46

(a) Ref. [133]. (b) Ref. [134]. (c) Ref. [135]. (d) Eq. (58). (e) Ref. [114].
(CdHe), 32.49 a.u. (CdNe), 134.83 a.u. (CdAr), 201.08 a.u. (CdKr) and 332.16 a.u. (CdXe). For Me2 complexes, the most intensively investigated has been the Hg2 molecule, for which C6 values of 224 a.u. [127], 255 a.u. [159], 240 a.u. [229] and 290 a.u. [106] were determined theoretically or experimentally. Furthermore, for the Zn2 and Cd2 molecules, C6 = 257 a.u. [157] and C6 = 466 a.u. [158], respectively, were derived experimentally.

4.5. Approximations for internuclear separations

4.5.1. Liuti–Pirani method

In 1985, Liuti and Pirani [9] found a useful regularity related to the vdW interactions. They discovered a correlation between the potential parameters and atomic polarizabilities which leads, as a consequence, to a correlation of the C6 constants describing the long-range interaction with
the potential well depth De and equilibrium distance Re. Ref. [9] cites about 50 systems (among them MeRG complexes) for which the correlation works with remarkable accuracy, assuming that only the long-range forces are in operation. It requires knowledge of the C6 constants of the “investigated” (e.g. MeRG) and “reference” (ref) molecules, and is based on the following relationship:
$$R_e(\mathrm{MeRG}) = \left[\frac{D_e(\mathrm{ref})}{D_e(\mathrm{MeRG})}\right]^{1/6} \times \left[\frac{C_6(\mathrm{MeRG})}{C_6(\mathrm{ref})}\right]^{1/6} \times R_e(\mathrm{ref}), \qquad (56)$$
where Re(MeRG) and Re(ref) are the equilibrium distances, De(MeRG) and De(ref) the dissociation energies, and C6(MeRG) and C6(ref) the long-range constants of the “investigated” and “reference” molecules, respectively. The method can be used when there are no experimental data (e.g. rotationally resolved spectra) for a direct determination of the absolute values of Re. The method was employed in studies of the MeRG and Me2 molecules as more reliable and precise, and replaced the so-called Kong’s rule [230,231], widely used in Japanese investigations of HgRG [232,233] as well as in HgRG [234,235] and CdRG [236] studies by other investigators. As seen from Eq. (56), it accounts for the attractive long-range parameters of the interatomic potential, which is not the case for Kong’s rule. In the analyses, relationship (56) was used to evaluate the Re of ZnNe [41], CdHe [43], CdNe [45], CdXe [42], HgKr [56], and Zn2 and Cd2 [58], using as a reference the well-established Re and De characteristics determined in high-resolution studies of MeRG and Me2 molecules, as well as C6 constants evaluated using the S–K method with the correction of Cambi et al. [226] (see Tables 11–13).

4.6. Le Roy radius evaluation

In Fig. 30(c) of Section 4.3, the so-called Le Roy radius, RLR, was depicted for the ground state of the CdKr molecule. In this particular case, it was assumed that for R > RLR = 7.2 Å the ground-state interatomic potential is most likely described by a C6/R⁶ vdW tail instead of a Morse function.
There is always the question of what is a justified limit at which the long-range approximation begins, i.e. what can generally be considered the region of validity of the inverse-power expansion of Eq. (42). According to Le Roy [237], for the MeRG molecule the approximation is valid for internuclear distances larger than
$$R_{\mathrm{LR}} = 2\left[\langle R^2_{\mathrm{Me}}\rangle^{1/2} + \langle R^2_{\mathrm{RG}}\rangle^{1/2}\right], \qquad (57)$$
where ⟨R²a⟩ is the expectation value of the square of the electronic radius of the valence shell of atom a (in other words, ⟨R²a⟩^(1/2) is the radius of the valence electron shell).⁴⁷ This means that the two atoms (e.g. Me and RG) must be far enough apart that the mutual distortion of their charge clouds is small and that the short-range interactions (the charge-overlap interaction, i.e. the Coulomb repulsion among the electrons, and the exchange interaction imposed by the Pauli exclusion principle) are small compared with the long-range interactions. The criterion (57) has been widely adopted in the long-range
⁴⁷ A modification of the Le Roy criterion (57), which accounts for the spatial orientation of the atomic electron density distributions (atomic orbitals), was proposed by Stwalley and co-workers [238]. However, for the spherically symmetric ground-state atomic orbitals (S-state atoms) it gives the same result as Eq. (57).
Table 7
The Le Roy radii, RLR, evaluated according to Eq. (57) for the MeRG and Me2 molecules investigated in Refs. [40–63] and discussed in this review

Me    ⟨R²Me⟩^(1/2) (a) (Å)
Zn    1.52
Cd    1.63
Hg    1.50

RG    ⟨R²RG⟩^(1/2) (a) (Å)
He    1.53
Ne    1.60
Ar    1.92
Kr    1.97
Xe    2.60

RLR (Å):
Me    MeHe    MeNe    MeAr    MeKr    MeXe    Me2
Zn    6.10    6.24    6.88    6.98    8.24    6.08
Cd    6.32    6.46    7.10    7.20    8.46    6.52
Hg    6.06    6.20    6.84    6.94    8.20    6.00

(a) Refs. [239–241]. Note: the ⟨R²a⟩ values from different sources differ. Those quoted here (columns 2 and 4 of the original table) are averaged values from the above sources.
analysis of various molecular states. Table 7 collects ⟨R²Me⟩^(1/2) and ⟨R²RG⟩^(1/2) as well as RLR evaluated according to Eq. (57) for the MeRG and Me2⁴⁸ molecules reviewed here. As shown in several articles discussed here [45,46,48,49,56], RLR helped in the practical determination of the R beyond which vdW forces effectively dominate. Therefore, in the ground-state potential representations given by, for example, a combined Morse–vdW function (24), the parameter Rc was always chosen to be equal to or greater than RLR (see CdNe [45], CdAr [46], CdKr [49] and HgKr [56]).

5. Experimental considerations

5.1. Supersonic expansion: a source of molecules

The supersonic expansion technique is a widely used method in the laser spectroscopy of molecules. The method has been applied in different fields of research in physics and chemistry. The supersonic expansion technique provides a source of rotationally and vibrationally cold molecules, which are
⁴⁸ For the Me2 molecule Eq. (57) reduces to RLR = 4⟨R²Me⟩^(1/2).
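Eq. (57) and its Me2 limit in footnote 48 are simple enough to reproduce directly. The sketch below uses the averaged valence-shell radii quoted in Table 7; the dictionary and function names are illustrative choices, not notation from the reviewed articles.

```python
# Averaged <R^2>^(1/2) valence-shell radii in angstrom, as quoted in Table 7.
RMS_RADIUS = {
    "Zn": 1.52, "Cd": 1.63, "Hg": 1.50,
    "He": 1.53, "Ne": 1.60, "Ar": 1.92, "Kr": 1.97, "Xe": 2.60,
}


def le_roy_radius(atom_a, atom_b):
    """Le Roy radius of Eq. (57) in angstrom; for atom_a == atom_b it is 4*<R^2>^(1/2)."""
    return 2.0 * (RMS_RADIUS[atom_a] + RMS_RADIUS[atom_b])


for metal in ("Zn", "Cd", "Hg"):
    row = [f"{le_roy_radius(metal, rg):.2f}" for rg in ("He", "Ne", "Ar", "Kr", "Xe")]
    print(metal, *row, f"{le_roy_radius(metal, metal):.2f}")
```

For CdKr this gives the 7.20 Å used as the lower bound for Rc in the Morse–vdW representation discussed above.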
Fig. 31. Schematic diagram of a cross-section of the supersonic expansion beam. M: Mach numbers (Meff, effective; MT, terminal); Xeff, XT: corresponding distances from the nozzle; XM: distance to the Mach disk shock; P0, T0, n0: pressure, temperature and density of the expanding species in the source; D: diameter of the orifice; P1: background pressure (P1 ≪ P0). The magnitude of the Mach number (M = 1, M > 1, M ≫ 1, M < 1) is indicated along the expansion (see text for details). The shading, from black to white, represents the change (from high to low) of the density, of the number of collisions of the expanding species and of the temperature along the expansion. Black and white arrows indicate schematically the thermal and ordered movement of the expanding species behind and downstream from the nozzle, respectively. The narrowing of the axial velocity distribution N(v), from N(v) ∝ exp(−mv²/2kT0) to N(v) ∝ exp(−m(v − u)²/2kTt), where u and m are the velocity of the gas in the beam and the mass of the expanding species, respectively, is also shown below for different points of the expansion.
very weakly bound in their ground electronic states. Moreover, in a certain part of the expansion the molecules can be treated as isolated objects that are “travelling” in the beam without collisions.

5.1.1. Phenomenological characteristics of supersonic expansion

There are excellent textbooks and reviews devoted to different aspects of the method of free-jet supersonic expansion, which is a technique of so-called internal cooling of molecules [243–246]. When a gas of Me atoms and Me2 molecules mixed with a carrier RG⁴⁹ expands freely from a high-pressure region (P0) through a small orifice of diameter D into the vacuum (P1) (Fig. 31), an adiabatic cooling of the internal energy occurs. During the process, the thermal energy of the molecules in the source is partly transferred into the expansion energy. In the source, the thermal energy comprises translational, vibrational and rotational energies, and T0 = Tt = Tvib = Trot, where T0, Tt, Tvib and Trot are the source, translational, vibrational and rotational temperatures, respectively. The energy transfer takes place in the orifice at densities where the collision probability is
⁴⁹ Here one considers mainly two-component supersonic beams, a special class of which are the so-called seeded molecular beams (the seeded gas, e.g. Me, has a much smaller number density than the carrier gas, RG). In the supersonic expansion the translational temperature, Tt, of the RG carrier gas drops first and then, via collisions, the rotational and vibrational energy of the seeded Me2 is transferred to the reservoir of the cold RG “bath”.
Table 8
Rotational, Trot, and vibrational, Tvib, temperatures evaluated according to Eq. (58) for typical parameters used in the experiments discussed in the review [40–63]: T0 = 800 K, D = 115, 150 and 300 µm, and P0 = 5 and 15 atm

                       P0 = 5 atm                    P0 = 15 atm
Temperature            115 µm   150 µm   300 µm      115 µm   150 µm   300 µm
Trot ≈ TT (K)          1.3      1.1      0.6         0.6      0.4      0.3
Tvib ≈ 10 Trot (K)     13       11       6           6        4        3
very high. The degree of cooling depends on the number of collisions during the expansion, which is proportional to the product n0D, where n0 is the density of the expanding species in the orifice [245]. Since the cross-sections for elastic collisions (σelast) are larger than those for collision-induced rotational transitions (σrot), which are in turn larger than those for collision-induced vibrational transitions (σvib), i.e. σelast > σrot > σvib, the translational cooling (i.e. monokinetization, the narrowing of the axial velocity distribution N(v), Fig. 31) is more effective than the rotational or vibrational cooling. Consequently, after the adiabatic expansion Tt < Trot < Tvib. An estimate of Trot and Tvib can be made from the terminal translational temperature, TT, in the beam [247]:
$$T_T = \frac{T_0}{1 + 5896\,(P_0 D)^{0.8}}, \qquad (58)$$
where P0, T0 and D are expressed in (atm), (K) and (cm), respectively. As an approximation, one can assume Trot ≈ TT and Tvib about one order of magnitude higher than Trot [248]. The results presented here were obtained in three different experiments using: (1) a continuous (Cd seeded in RG) supersonic beam in the laboratory in Kraków, (2) a continuous (Zn or Cd seeded in RG) and (3) a pulsed⁵⁰ (Hg seeded in RG) supersonic beam in the laboratory in Windsor. Table 8 collects typical rotational and vibrational temperatures attainable in these beams for carrier-gas pressures of P0 = 5 and 15 atm and a source temperature of T0 = 800 K. It is a straightforward conclusion, consistent with the above discussion, that increasing both P0 and D causes more efficient cooling of the rotational and vibrational degrees of freedom. The internal cooling has two advantages for laser spectroscopy. First, only the lowest ro-vibrational levels in the electronic ground state are populated (compare with the Boltzmann coefficients in Sections 4.1.3 and 4.1.4). This reduction to a few populated levels implies a considerable reduction of the number of absorption lines and leads to an appreciable simplification of the spectra (Fig. 32). Second, because of the low temperatures, very weakly bound molecules with small dissociation energies, D0, can be formed in supersonic beams.⁵¹ The most weakly bound molecules studied by the author were MeHe and MeNe, for which the well depths ranged from ∼8 cm⁻¹ for HgHe [51,52], through 14.2 cm⁻¹ for CdHe [43,44], to 23.6 and 28.3 cm⁻¹ for ZnNe [41] and CdNe [45], respectively. In order to determine the region in the beam that is most useful for spectroscopy, it is worthwhile to review other characteristics of the supersonic expansion. Considering the region downstream
⁵⁰ The characteristics of continuous (stationary) seeded supersonic beams can easily be applied to pulsed beams [249].
⁵¹ They would immediately dissociate at room temperature, where kT ≫ D0.
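The entries of Table 8 follow directly from Eq. (58). The short sketch below reproduces them under the stated unit convention (P0 in atm, T0 in K, D in cm); the function name is illustrative, and the factor of ten for Tvib is only the order-of-magnitude rule quoted in the text.

```python
def terminal_temperature(t0_k, p0_atm, d_cm):
    """Terminal translational temperature T_T of Eq. (58), in K."""
    return t0_k / (1.0 + 5896.0 * (p0_atm * d_cm) ** 0.8)


T0 = 800.0  # source temperature in K, as in Table 8
for p0 in (5.0, 15.0):
    for d_um in (115.0, 150.0, 300.0):
        t_rot = terminal_temperature(T0, p0, d_um * 1.0e-4)  # micrometre -> cm
        t_vib = 10.0 * t_rot  # rough estimate: one order of magnitude above T_rot
        print(f"P0 = {p0:4.0f} atm, D = {d_um:3.0f} um: "
              f"T_rot ~ {t_rot:.1f} K, T_vib ~ {t_vib:.0f} K")
```

The loop makes the trend stated above explicit: raising either P0 or D lowers the attainable rotational and vibrational temperatures.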
252
J. Koperski / Physics Reports 369 (2002) 177 – 326 63P1-61S0 atomic line
A0+←X0+
B1←X0+
LIF (arb. units)
(b) ↓↓
↓
↓
↓
↓ ↓
↓
↓ ↓
↓
10x
(a) 10x 2525
2530
2535
2540
2545
2550
laser wavelength (Å)
Fig. 32. Illustration of the simplification of an excitation spectrum by vibrational and rotational cooling in the supersonic expansion beam. The profiles show the same B1 ← X0+ and A0+ ← X0+ transitions in the excitation of the HgAr molecule, studied in Ref. [53], detected for (a) “cold”, Trot ≈ 0.6 K (Eq. (58)), Tvib ≈ 6 K (P0 = 2 atm, D = 0.3 mm, Xeff = 17 mm, T0 = 450 K), and (b) “hot”, Trot ≈ 2.2 K, Tvib ≈ 22 K (P0 = 0.5 atm, D = 0.3 mm, Xeff = 10 mm, T0 = 460 K) conditions in the beam. The “cold” excitation spectrum (a) shows essentially v′ ← v″ = 0 progressions (vibrational cooling, i.e. lowering of Tvib), while the “hot” trace (b) contains so-called “hot” bands due to transitions from v″ = 1 and 2 (indicated with vertical arrows), which are sufficiently populated during the expansion. The “blue-” and “red-shading” of the profiles of the vibrational components in the B1 ← X0+ and A0+ ← X0+ transitions, respectively, is very distinct in the “hot” spectrum owing to the considerably larger population of the J > 0 levels, which is governed by Trot in the beam. The relative intensity scale has been changed (×10) near the long-wavelength limit of the spectra.
from the nozzle, both the temperature and the density, n, of the expanding gas (consisting of atoms, diatomic molecules and sometimes larger clusters ⁵²) decrease with increasing effective distance Xeff. Therefore, the collision rate falls rapidly with increasing Xeff. At this point, the N(C) axial velocity distribution and the Mach number, M, ⁵³ become effectively “frozen”, i.e. no further cooling takes place (“zone of silence”, Fig. 31). The so-called terminal Mach number, MT, for a monoatomic gas (here Ar) is given by [245]

$$M_T = 133\,(P_0 D)^{0.4}, \qquad (59)$$

⁵² The relative composition of the expanding species in the supersonic beam depends on many parameters (e.g. P0, P1, T0, D, length of the nozzle channel) and the problem has been discussed in a number of articles [14–16,250,251] as well as in the study of the higher Hg2RGn, HgRG2 and Hg3 clusters [57].
⁵³ M is the ratio of the local velocity of the gas in the beam, u, to the local velocity of sound, Cs (Cs ∼ √T; therefore, when M increases, T drops rapidly along the expansion).
Fig. 33. (a) Terminal Mach number, MT (Eq. (59)), and (b) terminal distance, XT, to MT (Eq. (60)) as well as the distance to the Mach disk, XM (Eq. (62) for P1 = 10⁻⁴ atm ≈ 10⁻¹ mbar ≈ 76 mTorr), plotted against P0; (c) effective Mach number, Meff, and (d) relative density in the beam, n/n0 (Eq. (61)), plotted against the effective distance from the nozzle, Xeff, for the three beam sources used: (1) D = 115 μm (continuous beam in Kraków), (2) D = 150 μm (continuous beam in Windsor), (3) D = 300 μm (pulsed beam in Windsor). It is shown that (a) for increasing orifice diameter, D, the same MT is attainable at a lower carrier-gas pressure, P0, and (b) the distances XT to MT and XM to the Mach disk increase considerably with P0 (the lengths of the “zone of silence” are indicated with arrows). This illustrates that the distance to the collisionless zone and its length are larger for nozzles with a larger orifice. As shown in (c), the effective Mach number, Meff, increases faster for a nozzle with a smaller orifice. The decrease of the relative density along Xeff is illustrated in (d): the nozzle diameter D affects n/n0; however, for (3) D = 300 μm, n/n0 at Xeff = 1 mm and at Xeff = 10 mm (not shown) has decreased by about two and four orders of magnitude, respectively, from its starting value (i.e. that for Xeff = 0 mm). Therefore, from this point of view, large distances from the nozzle are not favourable for spectroscopy.
where P0 and D are expressed in (atm) and (cm), respectively. Expression (59) is a good approximation for the other monoatomic rare gases (except He [252]). Fig. 33(a) shows the terminal Mach numbers attainable in the experiments discussed here as a function of P0, which for seeded beams is in fact the carrier-gas pressure (see also Table 9). The terminal distance, XT, at which MT occurs, is given by

$$X_T = D\left(\frac{M_T}{3.26}\right)^{1.5}. \qquad (60)$$
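As a quick numerical check of Eqs. (59) and (60), the following minimal Python sketch evaluates MT and XT for a given carrier-gas pressure and orifice diameter. The function names and the choice of millimetres as the input unit for D are assumptions made here for illustration; only the two formulas and the unit convention of Eq. (59) (P0 in atm, D in cm) are taken from the text.

```python
def terminal_mach(p0_atm: float, d_mm: float) -> float:
    """Terminal Mach number M_T from Eq. (59).

    Eq. (59) expects D in cm, so the orifice diameter given in mm
    (an illustrative input convention) is converted first.
    """
    d_cm = d_mm / 10.0
    return 133.0 * (p0_atm * d_cm) ** 0.4


def terminal_distance_mm(p0_atm: float, d_mm: float) -> float:
    """Terminal distance X_T from Eq. (60), returned in mm (same unit as D)."""
    m_t = terminal_mach(p0_atm, d_mm)
    return d_mm * (m_t / 3.26) ** 1.5


if __name__ == "__main__":
    # Orifice diameters of the three beam sources described in the text.
    for label, d_mm in (("Krakow", 0.115), ("Windsor cont.", 0.15), ("Windsor pulsed", 0.3)):
        m_t = terminal_mach(10.0, d_mm)
        x_t = terminal_distance_mm(10.0, d_mm)
        print(f"{label:15s} D = {d_mm} mm: M_T = {m_t:.1f}, X_T = {x_t:.1f} mm")
```

For P0 = 10 atm and D = 0.15 mm the sketch gives MT ≈ 62 and XT ≈ 12.5 mm, in line with the lower limits listed for ZnNe in Table 9.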
Table 9. Parameters of the supersonic expansion for all transitions in the MeRG and Me2 molecules (Me = Zn, Cd, Hg; RG = rare gas) investigated in Refs. [40–63]. Multiple entries within a cell are separated by “/”.

No. | Molecule | Transition studied | Carrier gas | T0 (K) | P0 (atm) | P1* (atm) | PMe (Torr)
1 | ZnNe | D ← X | Ne | 920–950 | 10–14 | 1–5 × 10⁻⁴ | ∼50
2 | ZnAr | D ↔ X | Ar | 920–970 | 5–11 | 1–6 × 10⁻⁴ | ∼65
3 | ZnKr | D ← X | 10%Kr+Ne | 930–980 | 11 | 3.3 × 10⁻⁴ | ∼75
4 | CdHe | A, B ← X | He | 800–870 | 9.5–10.9 | 1.3 × 10⁻⁴ | 24–81
5 | CdNe | A, B, D ← X / A, D ← X / E ← A, B ← X | Ne | 750–870 / 780–900 / 820–920 | 8–11 / 8–11 / 11 | 2–3 × 10⁻⁴ / 2–3 × 10⁻⁴ / 4 × 10⁻⁴ | 10–100 / 20–120 / 70–180
6 | CdAr | B ← X / A ↔ X / E ← A, B ← X | Ar | 770–870 / 790–850 / 820–920 | 8–12 / 6.8–11 / 11 | 1–5 × 10⁻³ / 1–2 × 10⁻⁴ / 2 × 10⁻⁴ | 12–100 / 20–55 / 70–180
7 | CdKr | A, B, D ↔ X / A ↔ X, B ← X / E ← A, B ← X | 10%Kr+Ne / 10%Kr+Ne, Kr / 20%Kr+He | 800 / 750–760 / 820–920 | 11 / 9–12 / 11 | 3.2 × 10⁻⁴ / 2–5 × 10⁻³ / 8 × 10⁻⁴ | 22 / 8–10 / 70–180
8 | CdXe | A, B ← X | 1,10%Xe+Ne / 10%Xe+He | 850 / 720–760 | 7 / 6–7 | 1.4 × 10⁻⁴ / 1–3 × 10⁻³ | 55 / 5–10
9 | HgHe | A, B ← X | He | 455–500 | 17–18 | 2 × 10⁻⁴ | 8.5–44
10 | HgNe | A, B ← X | Ne | 460–500 | 1–7 | 1–3 × 10⁻⁵ | 10–44
11 | HgAr | A, B ← X / A, B → X | Ar | 450–460 / 480–500 | 1–2 | 4–6 × 10⁻⁶ / 2–4 × 10⁻⁵ | 7.9–44
12 | HgKr | A, B ← X / A → X | Kr | 410–420 | 0.2–0.7 / 0.2 | 3–8 × 10⁻⁶ | 2–3
13 | Zn2 | 0u⁺ ← X0g⁺ | Ar | 980–990 | 0.7–1.4 | 1–3 × 10⁻⁵ | 75–100
14 | Cd2 | 0u⁺ ← X0g⁺ / 1u ← X0g⁺ | Ar / Ar | 890–930 / 850–870 | 0.8–1.5 / 6.8 | 1–4 × 10⁻⁵ / 1.5 × 10⁻⁴ | 180–190 / 55–81
15 | Hg2 | F0u⁺ ← X0g⁺ / E1u ↔ X0g⁺; D1u ↔ X0g⁺ / G0u⁺ ↔ X0g⁺ | He / Ne, Ar / Ne, Ar | 505–515 / 505–515; 450–460 / 500–510 | 0.01–3 / 0.2–3 / 1 | 1 × 10⁻⁶–1 × 10⁻⁵ / 5 × 10⁻⁶–1 × 10⁻⁵ / 5 × 10⁻⁶ | 45.5–60 / 45.5–60; 7.9–11.2 / 38–51
Table 9 (continued). Multiple entries within a cell are separated by “/”.

No. | X1 (mm) | D (mm) | X/D | XT (mm), Eq. (60) | MT, Eq. (59) | XM (mm), Eq. (62) | Refs.
1 | 4–9.6 | 0.15 | 27–64 | 12.5–15.3 | 62–71 | 16–35 | [40–42]
2 | 1–15 | 0.15 | 6.7–100 | 8.3–13.3 | 47–65 | 13–19 | [40,42]
3 | 5–8 | 0.15 | 33–53 | 13.2 | 65 | 18 | [42]
4 | 10–15 | 0.15 | 67–100 | 12.1–13.2 | 61–64.5 | 19 | [43,44]
5 | 4–18 / 4–18 / 4–7 | 0.15 / 0.15 / 0.16 | 27–120 / 27–120 / 25–44 | 11–13.3 / 11–13.3 / 14.7 | 57–64.7 / 57–64.7 / 66.4 | 18–22 / 18–22 / 17.7 | [45] / [45]
6 | 5–15 / 5–10 / 4–7 | 0.115 / 0.15 / 0.16, 0.20 | 43–130 / 33–67 / 20–44 | 7.2–9.1 / 10–13.2 / 14.7–21 | 51–60 / 53–65 / 66.4, 72.5 | 3.5–10 / 21–27 / 25, 31 | [46] / [47]
7 | 6–8 / 5–10 / 4–7 | 0.15 / 0.115 / 0.18, 0.20 | 40–53 / 43–87 / 20–39 | 13.2 / 3.7–9.1 / 18–21 | 65 / 54–60 / 70–72.5 | 19 / 4–9 / 14, 16 | [48–50] / [48–50]
8 | 2–6 / 3–7 | 0.15 / 0.115 | 15–40 / 26–61 | 13.2 / 6–6.6 | 65 / 46–49 | 22 / 6.3–10.5 | [42]
9 | 5–18 | 0.3 | 17–60 | 52–54 | 102–104 | 59 | [51,52]
10 | 7–20 | 0.3 | 23–67 | 9.5–31 | 33–71 | 45–75 | [53]
11 | 10–20 | 0.3 / 0.9 | 33–67 / 11–22 | 9.5–14.5 / 55–84 | 33–43 / 51–67 | 82–142 / 95–191 | [53–55]
12 | 10–15 / 5 | 0.2 / 0.4 | 50–75 / 12.5 | 2–4 | 15–24 | 34–50 / 54 | [56]
13 | 5–8 | 0.15 | 33–53 | 2.5–4 | 21–28 | 23 | [58]
14 | 2.5–8 / 3–5 | 0.15 / 0.18 | 17–53 / 17–28 | 2.8–4 / 13 | 23–29 / 57 | 22 / 26 | [58] / [59]
15 | 5–25 / 5–15 / 5–12 | 0.3; 0.9 / 0.3; 0.9 / 0.3; 0.9 | 17–83; 6–28 / 17–50; 6–17 / 17–40; 6–13 | 1–18; 1–107 / 4–54; 21–105 / 33–55 | 5–50; 8–78 / 17–50; 27–78 / 10–51 | 20–110; 60–330 / 30–110; 70–330 / 28; 85 | [60] / [61] / [62]

* In the case of the pulsed jets (HgRG, Hg2), P1 depended also on the pulsed-valve repetition rate (10–15 Hz) as well as on the amplitude and width (1–3 ms) of the valve pulse.
The dependence (60) is shown in Fig. 33(b) for the three experimental set-ups mentioned above, assuming MT given by Eq. (59). Fig. 33(b) shows that the distance to the collisionless “zone of silence” is larger for nozzles with a larger orifice. One of the goals of an experimental procedure that employs a supersonic beam is to carry out the laser excitation in this particular zone, where molecules do not disturb each other and, in some sense, can be treated as isolated [247]. Unfortunately, in many experimental set-ups the distance to this zone is considerable and the relative density of molecules at XT is low. The relative density n/n0 is expressed by [247]

$$\frac{n}{n_0} = \left[1 + \tfrac{1}{2}(\gamma - 1)\,M_{\mathrm{eff}}^{2}\right]^{-1/(\gamma - 1)}, \qquad (61)$$

where n is the density at a given point of the expansion (Fig. 31), and γ is the heat capacity ratio cp/cv. ⁵⁴ A particular point of the expansion is characterized by an effective Mach number Meff = 3.26(Xeff/D)^0.67 [247] (Fig. 33(c)). Fig. 33(d) illustrates the relationship of Eq. (61) for the three experimental set-ups. It is evident that n/n0 drops very rapidly with Xeff: at Xeff = 1 and 10 mm the relative density for D = 300 μm (curve (3)) has decreased from its starting value (Xeff = 0 mm) by about two and four orders of magnitude, respectively. For D = 115 μm (curve (1)) at the same Xeff, the density has decreased from its starting value by about three and five orders of magnitude, respectively. Therefore, one has to increase considerably the density n0 (i.e. T0 of the expanding species) in the source to ensure a sufficiently high density n(Xeff) in the excitation region. This relatively simple picture of the expansion is complicated by the presence of shock waves in the expanding beam [245]. The central core of the expansion, phenomenologically described above, is surrounded concentrically by a shock boundary (barrel shock). Particularly important is the shock front, called the Mach disk, perpendicular to the direction of the flow (Fig. 31).
The disk originates as the beam molecules collide with the surrounding gas. Therefore, the collisionless zone defined above extends up to XM, the distance at which the Mach disk occurs [247]

$$X_M = 0.67\,D\,\sqrt{\frac{P_0}{P_1}}, \qquad (62)$$

where P1 is the pressure in the vacuum chamber. Fig. 33(b) shows the dependence given by Eq. (62), plotted against P0 assuming a standard pressure in the vacuum chamber attainable in the three experiments, P1 = 10⁻⁴ atm ≈ 10⁻¹ mbar ≈ 76 mTorr. Together with the XT vs. P0 dependence, also shown in Fig. 33(b), this illustrates the size of the “zone of silence”, which extends from XT to XM (see vertical arrows).

5.1.2. Pulsed supersonic beam

As concluded in the previous section, in most supersonic beam sources that produce MeRG and Me2 molecules, it is essential to heat the Me sample to temperatures considerably higher than its melting point. ⁵⁵ This assures a high density n0 in the orifice. The considerably lower temperatures needed to produce mercury vapour at sufficiently high densities allow solenoid pulsed valves to be used in supersonic beam sources.
⁵⁴ cp/cv = 5/3 for a monoatomic gas.
⁵⁵ The melting points of Zn, Cd and Hg are 693 K (420 °C), 594 K (321 °C) and 234 K (−39 °C), respectively.
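Returning briefly to the expansion model of Section 5.1.1, Eqs. (61) and (62) together with the effective Mach number can be evaluated with the short Python sketch below. The function names and the use of millimetres for lengths are illustrative assumptions; the formulas and the value γ = 5/3 for a monoatomic gas are those given in the text.

```python
import math

GAMMA_MONOATOMIC = 5.0 / 3.0  # cp/cv for a monoatomic carrier gas


def effective_mach(x_eff_mm: float, d_mm: float) -> float:
    """Effective Mach number M_eff = 3.26 (X_eff / D)^0.67."""
    return 3.26 * (x_eff_mm / d_mm) ** 0.67


def relative_density(m_eff: float, gamma: float = GAMMA_MONOATOMIC) -> float:
    """Relative density n/n0 from Eq. (61)."""
    return (1.0 + 0.5 * (gamma - 1.0) * m_eff ** 2) ** (-1.0 / (gamma - 1.0))


def mach_disk_distance_mm(d_mm: float, p0_atm: float, p1_atm: float) -> float:
    """Distance to the Mach disk X_M from Eq. (62), in the unit of D (mm)."""
    return 0.67 * d_mm * math.sqrt(p0_atm / p1_atm)


if __name__ == "__main__":
    for d_mm in (0.115, 0.3):
        for x_mm in (1.0, 10.0):
            ratio = relative_density(effective_mach(x_mm, d_mm))
            print(f"D = {d_mm} mm, X_eff = {x_mm} mm: n/n0 = {ratio:.1e}")
    print(f"X_M = {mach_disk_distance_mm(0.15, 10.0, 1.0e-4):.1f} mm")
```

For D = 0.3 mm the sketch gives n/n0 of roughly 10⁻² at Xeff = 1 mm and roughly 10⁻⁴ at Xeff = 10 mm, consistent with the two and four orders of magnitude quoted for curve (3) of Fig. 33(d).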
Fig. 34. Arrangement of the apparatus used in the investigation of HgRG and Hg2 molecules in the laboratory in Windsor [51–56,60–62]. Molecular beam system: VC, vacuum chamber; V, high-temperature pulsed valve (General Valve Series 9) assembly; Hg, mercury reservoir placed outside the chamber; G, carrier gas feed; MB, molecular beam; VPS, vacuum pump system (Edwards EH500A Roots pump backed by E2M80 rotary pumps, 8 × 10⁻³ mTorr = 1 × 10⁻⁸ atm ultimate pressure). Laser system: N2 or YAG LASER and DL, nitrogen- (in-house-built) or Q-switched Nd:YAG-laser (Quanta Ray DCR 1A) pumped dye laser (in-house-built); HG, second or third harmonic generator (Inrad, KDP-C and/or BBO-C); P, prism (second harmonic separator); L, lens; BD, laser beam dump. Detection system and controls: F1, F2, spectral filters (UG5 and Hg-vapour absorption filters); PM, photomultiplier (cooled EMI 9813QB); WR or DO, waveform recorder (Biomation 6500) or digitising oscilloscope (Hewlett-Packard 54111D); PC, computer; C, scanning controllers; VD, valve driver; DG, delay generator; W, Fizeau wavelength meter [253] or 0.85 m SPEX double-grating monochromator. While observing fluorescence spectra, a grating monochromator (Jobin-Yvon HR320) was placed in front of the PM.
There are several crucial advantages of pulsed supersonic beams over those operated in the continuous mode. Pulsed expansions allow the use of larger orifice diameters and, therefore, lower P0 pressures to obtain sufficient cooling conditions (Fig. 33(a)). They offer higher transient beam densities yet impose a lower average carrier RG load on the pumping system. Consequently, employing pulsed beams significantly reduces the consumption of expensive RG carriers. Details of the pulsed Hg-beam source assembly that was designed, constructed, and used for the studies of HgRG and Hg2 (as well as the higher Hg2RG and Hg3 complexes) are given in Refs. [53,56,57,60–62]. A general scheme of the experimental set-up, together with the pulsed source assembly, is presented in Fig. 34. The expansion chamber was evacuated by a vacuum pump system consisting of an Edwards EH500A Roots pump backed by an E2M80 rotary pump, which assured a pumping speed of 500 m³/h (8 × 10⁻³ mTorr = 1 × 10⁻⁸ atm ultimate pressure). The beam source assembly consisted of a high-temperature pulsed valve (General Valve Series 9, rated up to 600 K (∼330 °C), with in-house modifications). The source was connected to a stainless-steel reservoir with triple-distilled mercury placed outside the chamber and to a carrier gas (He, Ne, Ar, Kr) feed. A variety of commercial (General Valve) and in-house-built stainless-steel valve orifices
were used (D = 100, 200, 300, 400, 900 μm). In the investigation of the larger Hg2RG and Hg3 complexes [57], the valve was modified so that it could be operated with various lengths of the channel (from 2 to 20 mm) between the poppet and the nozzle orifice, since increased channel lengths allowed for an increase in the rate of the collision processes leading to the formation of larger complexes [14–16,250,251]. A differential heating system maintained the valve and the valve nozzle at a temperature about 20–50 K higher than T0 to prevent condensation of Hg inside the valve. T0 and P0 were kept in different ranges while investigating the HgRG and Hg2 molecules (Table 9). The valve was driven at a repetition rate of 10–15 Hz, producing expansion beam pulses of 2 ms duration (the laser pulses were delayed 3–4 ms relative to the start of the valve pulses, to overlap with the expansion beam pulses in the interaction region).

5.1.3. Continuous supersonic beam

Frequently, experimental requirements do not allow a pulsed valve to be employed in a supersonic expansion-beam apparatus. It is then necessary to use continuously operating beam sources. Such continuous Zn and Cd supersonic beams were used in studies of ZnRG (RG = Ne, Ar, Kr) [40,42,47], CdRG (RG = He, Ne, Ar, Kr, Xe) [42–50], and Zn2 and Cd2 [58,59] molecules. The Cd supersonic beam system designed, constructed, and employed in the investigation of the CdRG molecules in Kraków is shown in Fig. 35. Construction of an efficient supersonic Zn or Cd beam source is an experimental challenge for several reasons. Firstly, one of these two elements, Zn, is highly reactive at high temperatures (900–1000 K) and causes corrosion of the stainless-steel oven body.
This manifests itself, in particular, by clogging and/or severe damage of the nozzle orifice, especially when higher carrier gas pressures are applied (strong adiabatic expansion causes intense cooling of the upper part of the oven in which the nozzle is situated, Fig. 35). Therefore, in the construction of the Zn beam source, it was necessary to use a material that is resistant to the aggressive Zn vapour. Solid molybdenum meets these requirements and is essential in the production of Zn beam sources despite the fact that it is very difficult to machine. Secondly, as emphasized above, because of the high temperatures necessary to heat the Zn or Cd metal sufficiently in the oven (Table 9), it is difficult to employ a pulsed valve. ⁵⁶ Therefore, the Zn and Cd supersonic beams are operated continuously, which places high demands on the efficiency of the vacuum pump systems used to evacuate the chamber during beam operation. The Edwards and Leybold rotary pump and Roots pump combinations employed in the Windsor (EH500/E1M80) and Kraków (RUVAC-WAU501/SOGEVAC-SV200) experimental set-ups, respectively, ensured pumping speeds of 500 m³/h (1 mTorr = 1.3 × 10⁻⁶ atm ultimate pressure) and 505 m³/h (1.5 mTorr = 1.95 × 10⁻⁶ atm ultimate pressure), respectively. As shown in Fig. 35, the stainless-steel (or molybdenum) oven was heated by two independent insulated cables (WATLOW) and was insulated from the vacuum chamber interior by a water-cooled shield. The beam source was mounted on an XYZ translator on the bottom flange of the vacuum chamber. Additionally, a thermal screen was placed between the oven and the water shield to reflect the thermal radiation back to the oven body. The RG carrier gas (research grade) was admitted through a tube fed into the oven, in whose side wall an additional hole (inside the oven) was drilled to prevent the cold stream of the carrier gas from blowing directly on the nozzle orifice. Temperature was monitored
⁵⁶ There has been one report by a Japanese experimental group (studies of CdAr vdW molecules) [254] of the application of a high-temperature pulsed valve with a long metal poppet (rated up to 870 K (∼600 °C)).
Fig. 35. Arrangement of the apparatus used in the investigation of the CdRG molecules in the laboratory in Kraków [46–50,59]. A similar experimental set-up was used in studies of the ZnRG [40–42,47], CdRG [43–45,47–49], and Zn2 and Cd2 [58,59] molecules in the Physics Department, University of Windsor. Molecular beam system: VC, vacuum chamber; O, stainless-steel or molybdenum oven; Me, metal (Cd or Zn); RG, carrier gas feed; MB, molecular beam; V, vacuum pump system (Leybold RUVAC WAU501 Roots pump backed by SOGEVAC SV200 rotary pump, 1.5 mTorr = 1.95 × 10⁻⁶ atm ultimate pressure in Kraków, or Edwards EH500 Roots pump backed by E1M80 rotary pump, 1 mTorr = 1.3 × 10⁻⁶ atm ultimate pressure in Windsor). Laser system: Nd:YAG laser (Continuum Powerlite Series 7000 in Kraków, or Quanta Ray in Windsor) with second (SHG) and third (THG) harmonic generators; dye laser (SOPRA LCR I in Kraków, or in-house-built in Windsor); KDP-C or BBO-C, second harmonic generators (Inrad); SHS, second harmonic separator (e.g. Pellin-Broca prism); BS, beam splitters; L, lenses. Detection system and controls: F, spectral filter; PMT, photomultiplier (Peltier-cooled EMI 9893QB/350 in Kraków, or Schlumberger EMR-541-N-03-14 in Windsor); Digital Scope, digitising oscilloscope (Tektronix TDS-210 in Kraków, or Hewlett-Packard 54510A in Windsor); C, scanning controllers; FP, Fabry–Perot etalon or interferometer; Wavelength meter (Burleigh WA 4500 in Kraków, or in-house-built [223] in Windsor); PD, photodiode. While observing fluorescence spectra, a grating monochromator (Jarrel Ash 500 mm Ebert in Kraków, or Jobin–Yvon HR640 in Windsor) was placed in front of the PMT.
by four thermocouples. Three of them were mounted on the lower, middle, and upper parts of the oven body and the fourth on the nozzle. CdRG and Cd2, or ZnRG and Zn2, molecules were formed by heating Cd or Zn shots (purity > 99.999%) in the oven and expanding through a nozzle 115 μm (in Kraków) or 150 μm (in Windsor) in diameter (D) and approximately 200–300 μm in length. Typical experimental parameters that characterized all three supersonic beam sources are collected in Table 9. In the experiments presented here (as in most experiments with atomic and molecular beams), the supersonic and laser beams propagate along perpendicular directions. Therefore, the Doppler broadening is limited to that caused by the divergence of the supersonic beam (due to the non-zero translational temperature, Fig. 31). One can estimate the residual Doppler broadening, Δ_Dopp, in the beam using the formula Δ_Dopp = ν u_a sin θ/c, where u_a and θ are the mean velocity in the beam and the divergence
angle of the supersonic beam, respectively. Using the wave number ν = 30000 cm⁻¹, u_a = 500 m/s and θ = 45° (a value recently found in the pump-and-probe experiment, see Section 6.1.3, as characteristic of the divergence when no skimmer was used in the beam source), one can easily calculate Δ_Dopp = 0.1 cm⁻¹, which is smaller than the smallest spectral bandwidth of the dye laser used in the experiments (Section 5.2.3).

5.2. Laser systems

Figs. 34 and 35 present the laser system set-ups in the laboratory in Windsor (spectroscopy of HgRG and Hg2, Fig. 34) and in Kraków (spectroscopy of CdRG, Fig. 35; a similar arrangement was used in Windsor for spectroscopy of ZnRG, Zn2, CdRG and Cd2). Each of the experimental pulsed laser systems consisted of a pump laser (an in-house-built N2 laser in earlier studies of HgRG and Hg2, and a Quanta Ray DCR 1A Q-switched Nd:YAG laser in later studies of HgRG and Hg2 as well as in the MeRG and Me2 studies in Windsor, or a Continuum Powerlite Series 7000 Nd:YAG laser in the CdRG studies in Kraków). The pump laser was used to excite a dye laser. The measured energies (per pulse) of the pump lasers were reported as 12, 7 and 2 mJ for the 2nd, 3rd and 4th harmonics of the Quanta Ray Nd:YAG laser, respectively, and 100 and 60 mJ for the 2nd and 3rd harmonics of the Continuum Powerlite Nd:YAG laser, respectively. In Windsor, the dye lasers were in-house-built oscillator and one-amplifier-stage systems (the design was similar to the Molectron DL-300 laser). The oscillator stage included a diffraction grating/mirror cavity (Littrow configuration, quad-prism expander [255]) and a magnetically stirred quartz dye cell (Molectron DL-051); a similar dye cell was used in the amplifier stage. Both stages were side-pumped (with a 1:4 pump-power ratio of the oscillator:amplifier stage) with the aid of cylindrical lenses to focus the pumping radiation in the dye cells.
In Kraków, the dye laser was a commercial Sopra LCR I system with an oscillator stage enhanced with an in-house-built amplifier stage [46]. The oscillator stage consisted of a diffraction grating/mirror cavity (Littrow configuration, telescope expander) and a commercial quartz dye flow-cell serviced by a dye circulator (RBM RD250 system); a similar dye flow-cell with a dye circulator was used in the amplifier stage. Both stages were side-pumped (with a 1:4 pump-power ratio of the oscillator:amplifier stage) employing cylindrical lenses to focus the pumping radiation in the dye cells. In all laser systems the 2nd and 3rd dye-laser frequency harmonics were produced using angle-tuned (phase-matched) non-linear KDP-C and BBO-C generators from Inrad. In pump-and-probe experiments (see Section 6.1.3) two Nd:YAG-laser- and N2-laser-pumped dye lasers were used. The delay time between the pump and probe pulses was adjusted individually (between 10 and 50 ns) for the experimental conditions (i.e. X, P0, T0) employed in a particular measurement.

5.2.1. Laser dyes

The centres of the absorption bands in the excitation spectra of the studied MeRG and Me2 molecules (see e.g. Figs. 8 and 10) are situated in the UV and VIS regions of electromagnetic radiation. The spectral ranges that had to be covered with the laser radiation extended from approximately 2000 Å (i.e. the short-wavelength limit of the G0u⁺ ← X0g⁺ transition in Hg2 [62]), through approximately 3300 Å (i.e. the long-wavelength limit of the A0⁺ ← X0⁺ transition in CdXe [42], compare with Fig. 38), up to 5200 Å (i.e. the long-wavelength limit of the E1 ← A0⁺ transition in CdKr). For each dye applied in the experiments, a “power curve” was recorded to be able to normalize the signal
Table 10. Laser dyes (or their mixtures) used in studies of the MeRG and Me2 molecules. Multiple entries within a cell are separated by “/”; (S) and (T) denote second and third harmonic, respectively.

Laser dye or dye mixture | Solvent or solvent mixture | Fundamental wavelength range (Å) | Second (S) or third (T) harmonic wavelength range (Å) | Molecule | Transition in excitation | Refs.
R610:R640 1:3 | Eth:Meth 1:3 | 5985–6220 (a) | 1995–2075 (T) | Hg2 | G0u⁺ ← X0g⁺ | [62]
S420:C440 1:1 | Meth:Eth 1:1 | 4240–4340 (b) | 2120–2170 (S) | ZnNe / ZnAr / ZnKr | D1 ← X0⁺ | [40,41] / [40,42,47] / [42]
C460:C480 1:20 | Eth:Meth 1:20 | 4520–4640 (b) | 2260–2320 (S) | CdNe / CdAr / CdKr | D1 ← X0⁺ | [45] / [47] / [48,49]
C480 | Eth / Meth | 4600–4720 (c) / 4740–4870 (c) | 2300–2360 (S) / – | Hg2 / CdRG | E1u ← X0g⁺ / E1(6³S₁) ← A0⁺, B1 | [61] / (d)
C500 | Eth / Eth | 5000–5140 (a,c) / 4910–5180 (c) | 2500–2570 (S) / – | Hg2 / HgHe / HgNe / HgAr / HgKr / CdRG | F0u⁺ ← X0g⁺ / A0⁺ ← X0⁺, B1 ← X0⁺ / E1(6³S₁) ← A0⁺, B1 | [60] / [51,52] / [53] / [53–55] / [56] / [63]
C540A:C500 1:10 | Eth | 5200–5560 (c) | 2600–2780 (S) | Hg2 | D1u ← X0g⁺ | [61]
R610 | Eth | 6100–6180 (d) | 3050–3090 (S) | Zn2 | 0u⁺ ← X0g⁺ | [58]
DCM | DMSO or Meth | 6300–6620 (a,c) | 3150–3310 (S) | CdHe / CdNe / CdAr / CdKr / CdXe / Cd2 / Cd2 | A0⁺ ← X0⁺, B1 ← X0⁺ / 0u⁺ ← X0g⁺ / 1u ← X0g⁺ | [43,44] / [45] / [46,47] / [48–50] / [42] / [58] / [59]

R = Rhodamine; C = Coumarine; S = Stilbene; Eth = Ethanol; Meth = Methanol; DMSO = Dimethyl-sulphoxide.
(a) Pumped with 2nd harmonic of Nd:YAG laser.
(b) Pumped with 3rd harmonic of Nd:YAG laser.
(c) Pumped with N2 laser.
(d) J. Koperski and M. Czajkowski, to be published.
intensity variations vs. wavelength in the range of the measured excitation spectra (e.g. Fig. 3 of Ref. [62]). Table 10 lists all laser dyes and their mixtures used, as well as the solvents, the ranges of generated fundamental frequencies, and the corresponding ranges of their second and/or third harmonics, along with the corresponding transitions in the studied MeRG and Me2 molecules.
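The harmonic ranges listed in Table 10 follow from the fundamental ranges by dividing the wavelength by the harmonic order. A minimal Python sketch of this conversion is given below; the function name is an assumption introduced here, and the difference between air and vacuum wavelengths is ignored.

```python
def harmonic_range(fund_lo_angstrom: float, fund_hi_angstrom: float, order: int):
    """Wavelength range (in angstroms) of the 2nd or 3rd harmonic.

    Frequency multiplication by `order` divides the wavelength by `order`.
    """
    if order not in (2, 3):
        raise ValueError("only second (2) and third (3) harmonics are considered")
    return fund_lo_angstrom / order, fund_hi_angstrom / order


if __name__ == "__main__":
    # S420:C440 mixture, frequency doubled (Table 10).
    print(harmonic_range(4240.0, 4340.0, 2))
    # R610:R640 mixture, frequency tripled (Table 10).
    print(harmonic_range(5985.0, 6220.0, 3))
```

The doubled 4240–4340 Å range reproduces the 2120–2170 Å entry of Table 10; the tripled 5985–6220 Å range comes out at about 1995–2073 Å, which the table quotes rounded as 1995–2075 Å.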
Fig. 36. Arrangement for the frequency mixing (3rd harmonic generation) used in the investigation of the G0u⁺–X0g⁺ transition in the excitation and fluorescence spectra of Hg2 [62] in the laboratory in Windsor. Linearly vertically polarized (represented by a double arrow) radiation from an Nd:YAG laser-pumped dye laser (VIS, λVIS = 6000 Å) was frequency doubled in a KDP-C angle-tuned crystal, generating a linearly horizontally polarized (represented by a full circle, •) second harmonic (UV2, λUV2 = 3000 Å). The horizontal polarization of the 2nd harmonic was rotated by 90° in a Nanosecond Polarization Rotator (Inrad, BCBH-1100) to match the requirements for frequency mixing in the BBO-C angle-tuned crystal. As a result, a 3rd harmonic was generated (UV3, λUV3 = 2000 Å), which, after separation from the fundamental and 2nd harmonics in a 3rd harmonic separator (THS), was used to excite Hg2 molecules in the vacuum chamber. To properly tune the fundamental and 3rd harmonic over the range of approximately 100 Å, the dye-laser dispersion element (diffraction grating) and the KDP-C and BBO-C crystals were synchronously rotated with the aid of stepper motors (S) driven by the computer via scanning controllers. The system was also equipped with a stage that had provision for compensating the angular walk-off of the 3rd harmonic component.
5.2.2. Second and third harmonics generation

To generate the desired laser radiation in the UV region, doubling or tripling of the fundamental dye-laser frequency was employed. The process of frequency doubling (2nd harmonic generation) required non-linear angle-tuned BBO type C (5 mm long β-barium borate) or KDP type C (6 mm long potassium-dideuterium-phosphate) crystals for the fundamental frequency in the 4200–5600 or 6100–6650 Å range, respectively. To generate laser radiation in the vicinity of 2000 Å (Table 10), a frequency mixing (3rd harmonic generation) process required both the KDP-C and BBO-C crystals, as well as a commercial rotator of polarization (Inrad) to meet the requirements for frequency mixing in the second non-linear crystal [62]. Details of the arrangement are presented in Fig. 36. It is worthwhile to mention that a single scan of the 3rd harmonic over approximately 100 Å lasted about 30 min. The scanning process was very sensitive to the environmental conditions and required a laboratory-temperature stability better than ±0.5 K.

5.2.3. Tuning, stability, and calibration of the laser wavelength

As the design of the dye-laser resonators in both laboratories, in Windsor and Kraków, is based on the Littrow configuration, the dye lasers were tuned over a large range of generated wavelengths by rotating a diffraction grating (1800 or 2400 grooves/mm in the 1st, 2nd or 3rd order of diffraction in the Windsor lab, and 420 grooves/mm in the 6th or 7th order of diffraction in the Kraków lab) with the help of a sine-drive activated with a computer-controlled stepper motor. This allows tuning of the laser wavelength
corresponding to the fundamental laser frequency over considerably large ranges. The largest stable scan range achieved during the experiments in Windsor varied from 231 Å (i.e. 620 cm⁻¹) to 340 Å (i.e. 1180 cm⁻¹) in the case of the G0u⁺ ← X0g⁺ [62] and D1u ← X0g⁺ [61] transitions, respectively, in the excitation spectra of Hg2. Moreover, the largest scan (the E1 ← A0⁺ transition in CdKr) covered 430 Å (i.e. 1800 cm⁻¹) and required two dyes to be used. In Kraków, the largest laser scan was attained in the investigation of the A0⁺, B1 ← X0⁺ transitions in the excitation spectra of CdAr, CdKr and CdXe [30,46–50]. The dye laser was scanned over approximately 160 Å, which corresponded to a range of 380 cm⁻¹ of the fundamental frequency. During the experiments, the laser-wavelength tuning process was monitored with Fabry–Perot etalons (with free spectral ranges, FSR, from 0.7 cm⁻¹ (in Kraków) to 1 cm⁻¹ (in Windsor)). Examples of such monitoring are presented in Fig. 25(a) as well as in Refs. [45,46]. The spectral bandwidth, Δ_L, of the dye-laser fundamental radiation was estimated with the help of Fabry–Perot interferometers and etalons; it varied between 0.15 and 2 cm⁻¹ in Windsor and was not larger than 0.25 cm⁻¹ in Kraków. The dye-laser tuning was always performed with the requirement that the finest step of the tuning not exceed the dye-laser bandwidth Δ_L. The spectral resolution of the laser systems and the accuracy of measuring the energies of vibrational and ro-vibrational components in the excitation spectra depended on the dye-laser beam spectral bandwidth, Δ_L. The long-term stability of the dye-laser wavelength, important especially in the detection of fluorescence spectra, was satisfactory, and was better than 0.02 Å (0.08 cm⁻¹) per 5 h and 0.05 Å (0.12 cm⁻¹) per 4 h in the laboratories in Windsor and Kraków, respectively.
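The scan ranges and stabilities above are quoted both in Å and in cm⁻¹; the two are related through the wavenumber 10⁸/λ (λ in Å), so the conversion depends on the wavelength at which the interval is taken. The Python sketch below illustrates this. The function names are hypothetical, and the centre wavelengths used in the example (6100 Å and 5000 Å) are assumptions chosen here from the dye ranges of Table 10, not values stated in the text.

```python
def wavenumber_cm1(wavelength_angstrom: float) -> float:
    """Wavenumber in cm^-1 for a wavelength given in angstroms."""
    return 1.0e8 / wavelength_angstrom


def span_cm1(center_angstrom: float, width_angstrom: float) -> float:
    """Wavenumber interval covered by a wavelength interval of the given
    width, centred (by assumption) on `center_angstrom`."""
    short = center_angstrom - 0.5 * width_angstrom
    long_ = center_angstrom + 0.5 * width_angstrom
    return wavenumber_cm1(short) - wavenumber_cm1(long_)


if __name__ == "__main__":
    # 231 A scan, assumed to be centred near 6100 A (R610:R640 range).
    print(f"{span_cm1(6100.0, 231.0):.0f} cm^-1")
    # 0.02 A drift, assumed to be taken near 5000 A.
    print(f"{span_cm1(5000.0, 0.02):.2f} cm^-1")
```

With these assumed centre wavelengths the sketch gives about 620 cm⁻¹ for the 231 Å scan and 0.08 cm⁻¹ for the 0.02 Å drift, in agreement with the values quoted in the text.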
In Windsor, calibration of the dye-laser wavelength was performed with 0.01 Å (0.04 cm⁻¹) or 0.05 Å (0.12 cm⁻¹) accuracy using an in-house-built Fizeau wavelength meter [253] or a 0.85 m SPEX double-grating monochromator, respectively. In Kraków, the dye-laser wavelength was calibrated with a 0.05 Å (0.12 cm⁻¹) accuracy using a Burleigh WA 4500 pulsed wavelength meter.

5.3. Detection systems

As shown in Figs. 34 and 35, the laser beam crossed the molecular supersonic beam in the vacuum chamber at some distance, Xeff, from the orifice. In the interaction region, the laser beam was focused down to 0.5–1.5 mm. The resulting fluorescence was observed at right angles to the laser and molecular beams. The excitation spectra were produced by simultaneous rotation of the dye-laser dispersion element and of one (dye-laser frequency doubling) or two (dye-laser frequency tripling) non-linear crystals, while monitoring the total laser-induced fluorescence (LIF) signal with a photomultiplier. The fluorescence spectra were recorded by setting the laser frequency at a particular v′ ← v″ vibronic transition and scanning the dispersed fluorescence spectrum with a monochromator furnished with a photomultiplier. In the laboratory in Kraków, to enable collection of the LIF at the highest possible intensity level and to focus it directly on the photocathode of the PMT, a specially designed optical system was put into the chamber. The arrangement is shown in Fig. 37 and consists of a concave mirror and a plano-convex lens situated on both sides of the interaction region. Moreover, in both continuous-beam apparatuses, to reduce the amount of scattered light reaching the PMT from the chamber interior, suitably cone-shaped shields screened the optical elements. In the case of the detection of excitation spectra, the total LIF collected from the interaction region was passed through a suitable spectral filter (i.e. UG5 or an appropriate narrow-band interference
Fig. 37. LIF collecting optics of the continuous supersonic beam apparatus in the laboratory in Kraków. A concave mirror and a plano-convex lens are situated on both sides of the interaction region, enabling collection of the LIF from the interaction region and focusing it directly on the photocathode of the PMT. Suitably cone-shaped shields screened the mirror and lens to reduce the amount of scattered light reaching the PMT from the chamber interior. The second plano-convex lens and the PMT are situated outside the vacuum chamber.
filter in the case of radiation close to the n³P₁–n¹S₀ or n¹P₁–n¹S₀ transitions in the Zn (n = 4), Cd (n = 5) or Hg (n = 6) atom, respectively) mainly to cut off the scattered radiation of the dye-laser fundamental frequency. In the pump-and-probe experiments, an appropriate broadband Corning colour filter eliminated the strong laser radiation scattered from the chamber interior due to the first (pump) step in the excitation. The LIF was detected with a Peltier-cooled EMI 9813QB, a Schlumberger EMR-541-N-03-14 or Hamamatsu R1463-01, and a Peltier-cooled Electron Tubes 9893QB/350 PMT in the Windsor pulsed, Windsor continuous or Windsor pump-and-probe, and Kraków experiments, respectively. The EMI, Schlumberger and Electron Tubes PMTs had their peak sensitivities in the UV and blue spectral regions, and were practically insensitive to wavelengths longer than 6000 Å. This further diminished the detection of stray light of the fundamental frequency scattered from the chamber interior. The Hamamatsu PMT had a broad cathode sensitivity in the 2000–8000 Å region, enabling detection of the weaker fluorescence of the E1 → A0⁺ and E1 → B1 transitions in CdRG molecules. In the case of acquisition of the excitation spectra of the HgHe [51,52], HgNe and HgAr [53] molecules, an Hg-vapour absorption filter was placed in front of the PMT. The filter consisted of a mixture of Hg vapour and air contained in a cylindrical quartz cell 5 cm long and 5 cm in diameter. The cell was maintained at about 320 K (∼50 °C). Its purpose was to absorb the intense Hg 2537 Å atomic fluorescence, which otherwise tended to swamp near-lying molecular spectral features. In the case of detection of fluorescence spectra, the emitted fluorescence was focused on the entrance slit of the Jobin-Yvon HR-320, HR-640 or 500 mm Ebert (Jarrel Ash, 82-000 Series) grating monochromator in the Windsor or Kraków experiment, respectively, and the monochromators were fitted with the respective PMT mentioned above.
The monochromators were calibrated using Hg and Cd spectral lamps. The monochromator+PMT systems were intensity-calibrated with the
J. Koperski / Physics Reports 369 (2002) 177 – 326
help of appropriate (Hg or deuterium) calibration lamps, and the spectral response characteristics of the systems were found to be virtually flat in the UV and blue wavelength regions. To encompass the total bound–free fluorescence bands, the spectral resolution of the monochromators was deliberately lowered, resulting in typical band-passes of 50–200 cm−1 (e.g. 0.3–1.3 mm slit width, HR-640). For the HR-640, to resolve bound–bound components the monochromator slit width was narrowed down to 0.07 mm, which resulted in the highest possible resolution of 10 cm−1 band-pass. However, this required increasing the intensity of the measured fluorescence (e.g. by lowering Xeff, or increasing P0, T0 or D) in order to compensate for the reduced radiation throughput of the monochromator. Usually, the higher resolution of the monochromator made it possible to resolve several vibrational components corresponding to transitions to lower-lying vibrational levels of the ground-state bound well (up to v″ = 6 in the A0+, B1 → X0+ spectra of HgAr [53,54], Fig. 26, and up to v″ = 5 in the D1 → X0+ spectra of CdAr [47], Fig. 27). Unfortunately, these data were not sufficient to characterize the ground-state interatomic potentials in their long-range limits, as the vibrational structure of the levels close to the dissociation limit could not be resolved.

5.4. Experimental procedure and data acquisition systems

The PMT signal was registered with a transient digitiser (a Biomation 6500 waveform recorder or Hewlett-Packard 54111D digitising oscilloscope in the pulsed Hg supersonic beam experiments, or a Hewlett-Packard 54510 or Tektronix TDS-210 digitising oscilloscope in the continuous supersonic beam experiments in the Windsor or Kraków laboratories, respectively). The transient digitiser was triggered by the laser pulse (Fig. 35), which largely eliminated background due to scattered light. The output of the transient digitiser was stored in a computer coupled through a GPIB interface.
Simultaneous tuning of the dispersion element in the dye laser and rotation of one or two non-linear crystals, or scanning of the monochromator, were realized by in-house-built or Euro-crate scanning controllers coupled with the computer through an RS232 serial port. The scans of the laser wavelength and monochromator were repeated several times to average the signal and to reduce the effect of noise, such as the pulse-to-pulse amplitude jitter. In studies of larger Hg3 and Hg2RG complexes [57], the transient digitiser was also used to measure the time evolution of the fluorescence intensity by scanning a properly set time window over the time interval in which the fluorescence occurred. This made it possible to monitor the time evolution of larger complexes formed in the supersonic expansion, as well as its dependence on Xeff, P0 and T0. The computer code used for programming the experimental procedure was written in Borland Turbo Pascal (versions 4.0 to 6.0) in Windsor and in Borland C++ (version 3) in Kraków.

Summing up this section, it is necessary to stress that the completion of the experiments described above hinged entirely on the successful operation of the supersonic beam source and efficient production of the molecules under investigation. The heart of the experiment, the beam source, made it possible to produce extremely weakly bound species, which served as objects of experimental studies. In several cases, it was a challenge to carry out the experimental procedure successfully, as the molecules were produced under extreme conditions in the oven and vacuum chamber. It involved very high carrier-gas pressures [43,44], high oven temperatures [59,40–42], and a high level of dexterity required to manipulate the carrier-gas pressure while dealing with mixtures of expensive carriers [42,48,49]. Also, maintaining the in-house-built set-up for tripling the dye-laser frequency (Fig. 36) [62] put high demands on its mechanical and temperature stability.
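The effect of repeating scans, as described earlier in this subsection, can be illustrated with a minimal sketch. It assumes uncorrelated Gaussian pulse-to-pulse jitter, for which averaging N scans is expected to lower the noise by roughly a factor of √N. The function names, the noise model and the choice of 16 scans are illustrative assumptions and are not taken from the acquisition code used in the experiments.

```python
import random
import statistics


def average_scans(scans):
    """Return the point-by-point mean of several scans of equal length."""
    n_points = len(scans[0])
    if any(len(scan) != n_points for scan in scans):
        raise ValueError("all scans must have the same number of points")
    return [sum(scan[i] for scan in scans) / len(scans) for i in range(n_points)]


def simulate_scan(true_signal, jitter, rng):
    """Hypothetical scan: the true signal plus Gaussian amplitude jitter."""
    return [y + rng.gauss(0.0, jitter) for y in true_signal]


rng = random.Random(0)               # fixed seed, so the sketch is repeatable
true_signal = [0.0] * 200            # flat baseline: residuals are pure noise
single = simulate_scan(true_signal, 1.0, rng)
averaged = average_scans(
    [simulate_scan(true_signal, 1.0, rng) for _ in range(16)]
)

noise_single = statistics.pstdev(single)      # noise of one scan
noise_averaged = statistics.pstdev(averaged)  # noise after averaging 16 scans
print(noise_single, noise_averaged)
```

For real spectra the gain may be smaller than √N if the noise is correlated between scans, for instance through slow drifts of the laser power.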
The fact that the resulting laser frequency would be as high as 50 000 cm−1 (i.e. 2000 Å) posed an additional substantial experimental difficulty and hence high technical requirements.
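The correspondence between 50 000 cm−1 and 2000 Å quoted above follows from λ = 1/ν̃ with 1 cm = 10^8 Å. A small sketch of the conversion is given below; the function names are illustrative, and vacuum wavelengths are assumed (no correction for the refractive index of air).

```python
ANGSTROM_PER_CM = 1.0e8  # 1 cm = 1e8 angstrom


def wavenumber_to_angstrom(wavenumber_cm):
    """Vacuum wavelength in angstrom for a wavenumber given in cm^-1."""
    return ANGSTROM_PER_CM / wavenumber_cm


def angstrom_to_wavenumber(wavelength_angstrom):
    """Wavenumber in cm^-1 for a vacuum wavelength given in angstrom."""
    return ANGSTROM_PER_CM / wavelength_angstrom


def interval_in_wavenumbers(lambda_a, lambda_b):
    """Width in cm^-1 of the interval between two wavelengths (angstrom)."""
    return abs(angstrom_to_wavenumber(lambda_a) - angstrom_to_wavenumber(lambda_b))


print(wavenumber_to_angstrom(50000.0))  # tripled dye-laser output quoted in the text
```

The same helper can be used to translate a wavelength interval of a spectrum into an energy interval, which depends on where in the spectrum the interval lies.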
6. Interpretation of results

Among the results discussed here are pioneering studies of the extremely weakly bound CdHe [43,44] and ZnNe [41] molecules as well as the first-time observed B1 ← X0+, E1u ← X0g+ and 1u ← X0g+ transitions in the excitation spectra of CdXe [42], Hg2 [61] and Cd2 [59], respectively, and the D1(v′ = 10) → X0+, D1(v′ = 16) → X0+, B1(v′ = 0, 1, 2, 3) → X0+, A0+(v′ = 8) → X0+ and G0u+(v′ = 39) → X0g+ transitions in the fluorescence spectra of ZnAr [40,42,47], CdKr [48,49], HgAr [53,54], HgKr [56] and Hg2 [62], respectively. Furthermore, it was possible to measure directly the dissociation limits of the B1 state in both HgAr [55] and CdAr [46]. These led to a reliable description of the long-range behaviour of the B1-state potentials. Also, in a number of cases a reliable characterization of the ground-state potentials of ZnNe [41], ZnAr [48,42], CdHe [43], CdAr [46], HgNe [53] (footnote 57), HgAr [53,54] and Cd2 [58], obtained with the help of observed “hot” bands, superseded previous ground-state characteristics, which were sometimes erroneous or inaccurate. For two molecules, CdNe [45] and CdKr [48–50], the interpretation of the B1 ← X0+ transitions in excitation was corrected. This made it possible to propose functional representations of the B1-state interatomic potentials in these molecules. Finally, the investigation puts a special emphasis on the characterization of the ground-state repulsive regions of the interatomic potentials through interpretation of the observed fluorescence spectra. Consequently, the ground-state short-range repulsive walls were directly determined for the first time for ZnAr [42,47] and HgKr [56] molecules. This determination was more accurate when two channels of fluorescence were used and both terminated on the same part of the ground-state repulsive wall, as reported for CdNe [45], CdAr [47], CdKr [48,49] and HgAr [54].

6.1. CdRG molecules

Because the experimental data for CdRG molecules presented here are more complete than those for HgRG and ZnRG, the detailed discussion of the results begins with the CdRG family. Experimental investigation of CdRG molecules in supersonic beams is easier than that of ZnRG. This is because the oven temperature requirements are less stringent (Table 9) and the aggressiveness of Cd metal does not pose a problem as it does for Zn. Therefore, the experimental evidence on CdRG molecules is richer, and one can find a number of articles on the laser spectroscopy of the ground and lowest excited states of CdRG produced in supersonic beams. Three research groups are known to be presently involved (or to have been involved in the past) in these studies.

The group at the University of Utah reported on spectroscopy of the ground X0+(¹Σ+) and excited A0+(³Π) triplet states of CdRG (RG = Ne, Ar, Kr and Xe) [213] as well as the C ¹Π₁ (footnote 58) singlet states of CdRG (RG = Ne, Ar and Kr) [183] molecules. In addition, the C ¹Π₁ and D ¹Σ0+ singlet states of CdXe have been studied [256]. These three reports concluded with a characterization of the A0+ triplet states using Morse representations, including rotational characteristics for the X0+ and A0+ states of CdNe and CdAr as well as L–J (Eq. (21)) C6 and C12 constants for the A0+ (CdNe) and A0+ (CdAr) states. The ground and C ¹Π₁ singlet states were characterized using Buckingham-type (Eq. (25)) and Morse representations, respectively. However, the ground-state well depths of CdNe, CdAr and CdKr were assumed to be known from observation of the B1 ← X0+ spectra reported in the early work of Kowalski et al. [257], where the De″ values were assessed indirectly (i.e. using relationship (41)). Moreover, those B1 ← X0+ excitation spectra for CdNe and CdKr were afterwards found to be incorrectly interpreted (see Refs. [45] and [48–50], respectively).

The second group, at the University of Tokyo, applied a pulsed supersonic beam to investigate the C1 ← X ¹Σ0+ transition in the excitation spectrum of the CdAr complex. Their work resulted in a rotational characterization of the C ¹Π₁ excited state. Unfortunately, the indirect ground-state characteristics were adopted from Refs. [213,257].

The third group, at the University of Windsor, investigated the A0+ ← X0+ and B1 ← X0+ transitions in the excitation spectra of CdNe and CdAr [258] as well as of CdKr [259] molecules. In addition, the D1 ← X0+ transitions in the excitation spectra of CdNe and CdAr were studied [260,261]. For the sake of data interpretation, the investigated molecular states were assumed to have the form of Morse functions, and special attention was paid to accurately characterizing the ground states using relationship (41) applied simultaneously to the A0+ ← X0+, B1 ← X0+ and D1 ← X0+ transitions. Certain systematic trends in the vdW interaction in MeRG diatomic molecules were also studied using the L–D theory expressed by Eq. (53).

57 Amendment to the HgNe ground-state characterization is presented in Ref. [45].
58 The C ¹Π₁ (or C1(¹P₁) in Hund’s case (c)) state is denoted D ¹Π₁ (or D1(¹P₁)) in the author’s articles as well as throughout this review. It is the author’s intention to keep the notation for molecular states that was used by other investigators.
Complex analyses of ZnRG, CdRG and HgRG characteristics concluded with a linear De vs. αRG dependence, which made it possible to predict experimentally unobserved ground-state well depths, equilibrium internuclear separations as well as C6 long-range characteristics [260] using the L–P methodology [9] (Section 4.5.1).

In studies of the excitation spectra of CdHe [43], CdNe [45], CdAr [46], CdKr [48,49], and CdXe [42] molecules (comparison in Figs. 38 and 39), special attention was paid to an efficient population of the v″ > 0 vibrational states in order to observe a number of “hot” bands in all studied transitions. Detection of the CdAr, CdKr, and CdXe excitation spectra was realized using the two experimental set-ups in the laboratories in Windsor and Kraków. In the case of the CdKr and CdXe complexes, different mixtures of carrier gases were used (Table 9) to increase certainty in inducing and detecting the proper vibrational components that were actually investigated. In all cases of carrier-gas mixtures, the lighter gas (He or Ne) served as carrier and “solvent” for the expensive heavier Kr or Xe. This is different from the method employed at the University of Utah, where the heavier Ar was usually used as the “solvent” [183,213,256]. The latter choice most likely caused a CdAr “contamination” of the investigated CdKr and CdXe spectra as well as a higher probability of Cd2 and higher clusters forming in the supersonic beam (the production of higher MeN clusters increases as the carrier-gas mass, mRG, increases [249,261]). In the case of CdNe, CdAr and CdKr investigated by the author, the accuracy of the characterization of the ground-state repulsive part was successfully improved by detection of two or three channels of bound–free fluorescence.
On the whole, the studies of CdRG molecules completed in both laboratories constitute the most comprehensive investigation of all MeRG complexes discussed here, resulting in a complete characterization of the ground and the A0+, B1 and D1 excited states in wide regions of R. Moreover, the E1 Rydberg states of CdNe, CdAr and CdKr molecules have been characterized for the first time using the pump-and-probe method, providing reliable analytical representations for their PE curves (see Section 6.1.3).
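Since Morse representations are used repeatedly in the characterizations summarized above, a short sketch may help to show how the Morse parameters are tied to the spectroscopic constants. It uses the standard Morse relations De = ωe²/(4ωexe) and β = ωe·(2π²cμ/(h·De))^(1/2), and the CdHe ground-state constants listed in Table 11 (ωe = 9.6 cm−1, ωexe = 1.63 cm−1, De = 14.2 cm−1, Re = 4.54 Å). The standard atomic weights used for the reduced mass and all function names are assumptions made for illustration; this is not the fitting procedure of the reviewed studies.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
C = 299792458.0           # speed of light, m/s
AMU = 1.66053906660e-27   # atomic mass unit, kg


def morse_well_depth(we, wexe):
    """Well depth De (cm^-1) of a pure Morse oscillator."""
    return we ** 2 / (4.0 * wexe)


def morse_bound_levels(we, wexe):
    """Number of bound vibrational levels of a pure Morse oscillator."""
    return math.floor(we / (2.0 * wexe) - 0.5) + 1


def morse_beta(we, de, mu_amu):
    """Morse range parameter beta in angstrom^-1 (we, de in cm^-1)."""
    we_m, de_m = we * 100.0, de * 100.0          # cm^-1 -> m^-1
    beta_m = we_m * math.sqrt(2.0 * math.pi ** 2 * C * mu_amu * AMU / (H * de_m))
    return beta_m * 1.0e-10                      # m^-1 -> angstrom^-1


def morse_potential(r, de, beta, re):
    """Morse energy (cm^-1) above the well minimum at separation r (angstrom)."""
    return de * (1.0 - math.exp(-beta * (r - re))) ** 2


# CdHe ground state; masses are standard atomic weights (an assumption).
mu = 112.411 * 4.002602 / (112.411 + 4.002602)
de_morse = morse_well_depth(9.6, 1.63)
n_levels = morse_bound_levels(9.6, 1.63)
beta = morse_beta(9.6, 14.2, mu)
print(de_morse, n_levels, beta)
```

Under these assumptions the Morse estimate of De comes out close to the tabulated 14.2 cm−1, the level count agrees with the three ground-state levels mentioned for CdHe, and β is close to the tabulated 0.612. Near the dissociation limit a pure Morse function is not expected to hold, which is why the text combines it with long-range (vdW) terms.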
[Fig. 38 plot: LIF (arb. units) versus laser wavelength (Å), about 3240–3300 Å; stacked traces for CdHe, CdNe, CdAr, CdKr and CdXe showing the B1 ← X0+ and A0+ ← X0+ progressions on either side of the atomic transition energy Eat(5 ³P₁–5 ¹S₀); asterisks mark features of other species.]
Fig. 38. A comparison of the B1 ← X0+ and A0+ ← X0+ transitions in the excitation spectra of CdHe, CdNe, CdAr, CdKr (see also Fig. 24) and CdXe molecules reported in Refs. [43–50] and [42], respectively. The B1 ← X0+(v″ = 0) and A0+ ← X0+(v″ = 0) progressions are situated on the short- and long-wavelength sides of the 5 ³P₁–5 ¹S₀ atomic transition, respectively. A straightforward conclusion that can be drawn is that the B1 and A0+ excited states are more weakly and more strongly bound, respectively, than the X0+ ground state (compare with Fig. 29 and the De values in Table 11). Because of the experimental procedure and properties of supersonically cooled species (see text), the CdAr and CdKr spectra contain Cd2 features, while in the CdXe spectrum there are CdNe components present (all marked with asterisks).
6.1.1. X0+ singlet, and A0+ and B1 triplet states

6.1.1.1. CdHe. The pioneering study of the A0+ ← X0+ and B1 ← X0+ transitions in the excitation spectra of the CdHe molecule produced in a supersonic beam was published in Refs. [43,44], and to date this is the only report in which a stable CdHe ground state was observed. This extremely weakly bound complex (D0″ = 10.4 cm−1; D0′(B1) = 4.6 cm−1!) could be produced only under high pressure of the carrier gas and by locating the excitation region considerably far from the nozzle (large Xeff) in order to achieve stronger cooling of the expanding species (Fig. 33(c) and Table 9). This resulted, however, in a rather uncomfortably low density of absorbing molecules (Fig. 33(d)), requiring the sensitivity of the detection system to be adjusted to the highest possible level. A further increase of Xeff, which might improve the cooling efficiency [247], resulted in a rapid decline of the LIF signal. Even far from the nozzle, it was possible to detect the v′ = 1 ← v″ = 1 “hot” band in the A0+ ← X0+ transition. This facilitated more reliable determination of the ground-state characteristics
[Fig. 39 plot: LIF (arb. units) versus laser wavelength (Å), about 2285–2315 Å; stacked traces for CdNe, CdAr and CdKr showing the D1 ← X0+ progressions relative to the atomic transition energy Eat(5 ¹P₁–5 ¹S₀); asterisks mark features of other species.]
Fig. 39. A comparison of the D1 ← X0+ transition in the excitation spectra of CdNe, CdAr and CdKr molecules reported in Refs. [45,47–49], respectively. In all cases, the D1 ← X0+(v″ = 0) progressions are situated on the long-wavelength side of the 5 ¹P₁–5 ¹S₀ atomic transition, which leads to the conclusion that the D1 excited state is more strongly bound than the X0+ ground state (compare with Fig. 29). Because of the experimental procedure and properties of supersonically cooled species (see text), the CdKr spectrum contains CdNe components, which are marked with asterisks. The relative intensity scale has been changed (×10) for the CdAr spectrum near the long-wavelength limit.
using formulas of footnote 45 along with relationship (41). The entire spectrum spanned a range of merely 5 Å (i.e. 48 cm−1). Vibrational components of the A0+ ← X0+ and B1 ← X0+ transitions were situated very close to the dominating atomic transition (Fig. 38). Therefore, in the simulation of the F–CF intensity distribution one had to take into account the large intensity of the atomic line, which modified the F–CF envelope. This resulted in a more accurate value for ΔRe = Re′(A0+) − Re″. It was found that the extremely weak bonding in the B1 and X0+ states still accommodates two and three vibrational levels, respectively. The X0+, A0+ and B1-state interatomic potentials were represented by Morse functions in the intermediate regions of R, and a LR–B procedure was employed to assess the long-range characteristics (C6, D, vD) for the X0+ and A0+ states. Recently, the rotational analysis of the A0+(v′ = 0) ← X0+(v″ = 0) band has been performed [44], making it possible to determine directly the ground- and A0+-state equilibrium internuclear separations (see Table 11).

6.1.1.2. CdNe. Until the CdNe investigation [45], despite the number of articles published [94,183,213,258,260], several controversies existed concerning the determination of the interatomic
Table 11. Summary of the X0+(¹Σ+), A0+(³Π), B1(³Σ+ + ³Π), D1(¹Π) and E1(³Σ+)-state potential characteristics for the CdRG molecules (RG = rare gas). Results of the author’s studies are put in bold. The most recent ab initio values of Refs. [72,131,132] are included. Phenomenological ground-state long-range characteristics are collected in Tables 5–7. Note: ΔRe = Re′ − Re″.
Designation
(1) | CdHe (2) | CdNe (3) | CdAr (4) | CdKr (5) | CdXe (6)
X0+(¹Σ+) state:
De (cm−1) | 14.2(a); 15.1(b); 16.8(d) | 28.3(c); 39(g); 55(p); 34(q) | 106.7 ± 0.7(d); 106(g); 112(p); 107(q) | 165(e); 129; 134(p); 145(q) | 276 ± 5(e); 187(h); 153(p); 192(q)
Re (Å) | 4.54; 4.6 ± 0.2(b); 4.50(p); 4.24(q) | 4.32 ± 0.02(c,k); 4.26 ± 0.05(i); 4.13(p); 4.22(q) | 4.31 ± 0.02(d); 4.31 ± 0.03(g); 4.29(p,q) | 4.27 ± 0.02(e); 4.33(g); 4.45(p); 4.34(q) | 4.21 ± 0.05(f)
ωe (cm−1) | 9.6(a); 10(p); 10.8(q) | 15.0(c); 13.2(g); 15.2(p); 13.1(q) | 19.8(d); 18.8(g); 17.5(p); 18.3(q) | 18.1(e); 16.6(p); 15.3(p); 16.6(q) | 33.1 ± 0.6(f); 14(h); 14(p); 16.8(q)
ωexe (cm−1) | 1.63(a) | 1.94(c); 1.15(g) | 0.93(d); 0.87(g) | 0.50(e); 0.58(g) | 0.99 ± 0.01(d); 0.26(h)
Be (cm−1) | 0.203 ± 0.010(b) | 0.053(c,k); 0.0542(i) | 0.0306(i) | – | –
αe (cm−1) | – | 0.00685(c,k) | – | – | –
+
1
+
X0 ( ) De (cm−1 )
Re
De
Y (A)
a
(cm
−1
)
n or n0 n1 C6
−6c; k
—
2.65 × 10
—
—
2.3
12.3
10.6d
8.6e
— (a:u:)
Y (×10 A 8
10.2
79–98 −1
)
0.612
a
a
r
7.0
c
31.7 128.8 1.389
g
c
m
7.3
d
162 295.5 1.275
g
d
21
m
361.7
4.66p 4.45q
— g
—
e
1.886f
1.194
A0+(³Π) state | CdHe | CdNe | CdAr | CdKr | CdXe
De (cm−1) | 41.2(a); 36(p); 27.5(q) | 70.8(c); 77(i); 115(p); 53(q) | 323(d); 325(i); 355(p); 324(q) | 541(e); 513(i); 535(p); 568(q) | 1196 ± 10(f); 1086 ± 40(i); 934(p); 1040(q)
Re (Å) | 3.04(a,n); 3.7 ± 0.2(b); 3.57(p,q) | 3.76 ± 0.02(c,k); 3.62 ± 0.05(i); 3.44(p); 3.61(q) | 3.51 ± 0.03(d,l); 3.45 ± 0.03(i); 3.39(p); 3.37(q) | 3.34 ± 0.03(e,n); 3.39(p); 3.30(q) | 3.02 ± 0.05(f,n); 3.31(p); 3.27(q)
ΔRe (Å) | −1.50(a) | −0.56 ± 0.02(c) | −0.80 ± 0.01(d) | −0.93 ± 0.01(e) | −1.19 ± 0.01(f)
ωe (cm−1) | 20.0(a); 19.5(p); 19.7(q) | 24.9(c); 22.6(i); 25.8(p); 16.5(q) | 39.2(d); 38.5(i); 35.9(p); 37.1(q) | 37.0(e); 37.1(i); 36.1(p); 40.2(q) | 52.3 ± 0.5(f); 50.7(i); 44.7(p); 52(q)
ωexe (cm−1) | 2.4(a) | 2.2(c); 1.6(j) | 1.22(d); 1.22(i) | 0.63(e); 0.65(i) | 0.60 ± 0.01(f); 0.6(i)
Be (cm−1) | 0.320 ± 0.016(b) | 0.070(c,k); 0.0753(i) | –; 0.0481(i) | – | –
Table 11 (continued) Designation
CdHe
CdNe
CdAr
CdKr
CdXe
1
2
3
4
5
6
e (cm−1 )
— —
0.00640c; k 0.0075i
0.0075i
— —
— —
De (cm−1 )
—
2.41 × 10−6c; k
—
—
—
C6
(a:u:)
42–104
Y (×10 A 8
−1
)
0.742
a
a
c
76.1 82.8 1.493
i
287.3
c
i
1.461
—
d
1.340
— e
1.468f
B1 (3 + + 3 ) De (cm−1 )
109.0e 72j 112p (145,77)q
227.9 ± 5.0f
28p 10q
59.7 ± 1.5d 56j 82p 48q
4.66a; n 5.56p 5.98q
5.12 ± 0.02c; n
5.01 ± 0.02d; n
5.09p 5.33q
5.03p 5.09q
4.78 ± 0.03e; n 4.70o 4.97p (6.37,9.38)q
4.26 ± 0.05f ; n 4.89p 3.29q
Y \Re (A)
0.12a
0.80 ± 0.02c
0.70 ± 0.02d
0.51 ± 0.01e
0.05 ± 0.01f
!e (cm−1 )
3.6a 4.8p
6.5c 8.8p 4.5q
11.8 ± 0.1d 11.7p 9.1q
9.3e 9.2p 30.2q
18.3 ± 0.3f 10p 9.1q
!e xe (cm−1 )
0.53a
1.1c
0.57 ± 0.02d
0.20e
0.37f
C6 (a:u:)
—
76.8c
305 ± 10d
—
—
0.349a
1.057c
0.998d
0.755e
1.153f
75p
78.7c 89g 111p
539d 544g 475p
1089e 1036g 843p
2485h 1750p
3.60 ± 0.02c; n 3:61 ± 0:05g 3.49p
3.24 ± 0.02d; n 3:28 ± 0:04g 3.23p
3.105 ±0.025e; n 3.17g 3.12p
2.92h 3.10p
−0.72 ± 0.01c −0:65g
−1.07 ± 0.005d −1:03g
−1.165 ± 0.005e −1:16g
−1:29h
23.5c 23.4g ; 23.7p
48.7d 47.97g ; 42.2p
59.4e 56.72g ; 48.8p
87.7h 69p
—
1.78c ; 1.80g
1.1d ; 1.11g
0.85e ; 0.81g
0.775h
—
69.1c
Y Re (A)
Y (×108 A
−1
)
6.1a 7.8p 2.2q
9.6c
151p 572q
D1 (1 ) De (cm−1 ) Y Re (A) 3.44 Y \Re (A)
—
!e (cm−1 ) !e xe
(cm
−1
35.6 )
C6 (a:u:) Y (×10 A 8
p
−1
)
p
— c
g
— d
1.387 1.399
g
— e
—
1.344 1.350
1.556 1.5236
E1(³Σ+) state | CdHe | CdNe | CdAr | CdKr
De(in) (cm−1) | – | 91.0 ± 4.0(s) | 1285 ± 10(t); 1266(v); 1252.8(w) | 1644 ± 3(u)
De(out) (cm−1) | – | – | 24.2 ± 1.0(t) | 25(u)
g
1.669h —
— (continued on next page)
Table 11 (continued)
Designation (1) | CdHe (2) | CdNe (3) | CdAr (4) | CdKr (5) | CdXe (6)
E1(³Σ+) state (continued):
Re(in) (Å) | – | 3.21 ± 0.05(s) | 2.84 ± 0.03(t); 2.84(v); 2.84 ± 0.03(w) | – | –
Re(out) (Å) | – | – | 5.60 ± 0.05(t) | – | –
ΔRe(in) (Å) | – | −0.55 ± 0.05(s) | −0.61 ± 0.02(t); −0.61 ± 0.01(w) | – | –
ΔRe(out) (Å) | – | – | 0.57 ± 0.02(t) | – | –
ωe(in) (cm−1) | – | 56.6 ± 3.0(s) | 106.8 ± 2.0(t); 105.4(v); 105(w) | 91 ± 1(u) | –
ωe(out) (cm−1) | – | – | 4.40 ± 0.02(t) | 4.1(u) | –
ωexe(in) (cm−1) | – | 8.8 ± 0.4(s) | 2.1 ± 0.1(t); 2.19(v); 2.21(w) | 1.27 ± 0.01(u) | –
ωexe(out) (cm−1) | – | – | 0.200 ± 0.002(t) | 0.17(u) | –
(a) Ref. [43].
(b) Ref. [44].
(c) Ref. [45].
(d) Refs. [46,47].
(e) Refs. [48–50].
(f) Ref. [42].
(g) Ref. [183].
(h) Ref. [256].
(i) Ref. [213].
(j) Ref. [257].
(k) From rotational analysis (Fig. 25).
(l) From Eq. (51).
(m) From simulation of the D1(v′ = 7, 8) → X0+, A0+(v′ = 4) → X0+, and D1(v′ = 16) → X0+, A0+(v′ = 9) → X0+ bound–free spectra of CdAr and CdKr, respectively (e.g. Fig. 27).
(n) From ΔRe obtained in simulation of the corresponding excitation and/or fluorescence spectra assuming a Morse representation for the ground and excited states.
(o) From ΔRe obtained in simulation of the B1 ← X0+ transition assuming a double-well potential for the B1 state and a Morse representation for the X0+ state.
(p) Ab initio values of Ref. [72].
(q) Ab initio values of Refs. [131,132].
(r) n1 chosen so that the M–S(n0, n1) potential has the same slope as the Morse potential.
(s) Based on a B–S analysis of the E1 ← A0+(v′ = 0, 1) and E1 ← B1(v′ = 0, 1) transitions, Ref. [63].
(t) Based on a B–S analysis of the E1 ← A0+(v′ = 5) and E1 ← B1(v′ = 0, 1, 2) transitions (J. Koperski and M. Czajkowski, to be published).
(u) Based on a B–S analysis of the E1 ← A0+(v′ = 9) and E1 ← B1(v′ = 1) transitions (J. Koperski and M. Czajkowski, to be published).
(in) Inner well (see Fig. 41).
(out) Outer well (see Fig. 41).
(v) Ref. [319].
(w) Ref. [320].
potentials for various electronic molecular states of the complex. Firstly, the excitation spectrum of the B1 ← X0+ transition, presented and analysed twice in Refs. [257,258], seemed to be incorrectly interpreted. Compared to the analogous, very well known B1 ← X0+ transition in the excitation spectrum of HgNe (Refs. [53,200], and discussion below), the v′ ← v″ = 0 progression in CdNe, which is detectable up to the dissociation limit of the B1 state, was claimed to contain components with suspiciously high v′ vibrational quantum numbers. Secondly, in the first investigation [257] the dissociation energy of the CdNe ground state was determined indirectly, assuming knowledge of the dissociation energy of an excited state and the energy corresponding to a relevant v′ = 0 ← v″ = 0 transition. The value De″ = 39 cm−1 [257] was then adopted by others and used as a reference in their studies of CdRG complexes [183,213,258]. Therefore, the ground-state dissociation energy of CdNe established in the literature was somewhat uncertain and called for another, direct and more reliable determination.

In the study of Ref. [45], the first-time observed A0+(v′ = 1) → X0+ transition as well as a repeated detection of the D1(v′ = 1) → X0+ transition in the fluorescence, and the A0+ ← X0+, B1 ← X0+ (Fig. 38) and D1 ← X0+ (Fig. 39) transitions in the excitation spectra were reported. Three aspects of the new investigation were emphasized: efficient detection of “hot” bands (v′ ← v″ = 0, 1, 2) that provided information on D0″; a good signal-to-noise ratio in the detection of the B1 ← X0+ transition for accurate determination of the number of bound v′ levels in the B1-state potential well; and separate detection of two “channels” of fluorescence that originate from selectively excited vibrational levels in different electronic energy states and terminate on the same repulsive part of the ground-state potential.
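The indirect determination criticized above rests on a simple energy cycle: the atomic transition energy plus the ground-state dissociation energy equals the v′ = 0 ← v″ = 0 transition energy plus the excited-state dissociation energy. A hedged sketch of this bookkeeping follows. The exact form of relationship (41) is not reproduced in this section, so the cycle below is an assumption about its content, and the numbers in the example are purely illustrative rather than measured values.

```python
def ground_state_d0(d0_excited, nu_00, nu_atomic):
    """Ground-state D0'' (cm^-1) from the cycle nu_at + D0'' = nu_00 + D0'.

    d0_excited : excited-state dissociation energy D0' (cm^-1)
    nu_00      : energy of the v' = 0 <- v'' = 0 transition (cm^-1)
    nu_atomic  : energy of the corresponding atomic transition (cm^-1)
    """
    return d0_excited + nu_00 - nu_atomic


def excited_state_more_weakly_bound(nu_00, nu_atomic):
    """True if the 0-0 band lies on the short-wavelength (blue) side of the
    atomic line, i.e. the excited state is more weakly bound than the
    ground state."""
    return nu_00 > nu_atomic


# Illustrative numbers only (not taken from the reviewed experiments).
d0_ground = ground_state_d0(d0_excited=50.0, nu_00=30700.0, nu_atomic=30656.0)
print(d0_ground)
```

Any error in the assumed excited-state D0′ or in the assignment of the 0–0 band propagates directly into D0″, which is why a direct determination from “hot” bands is preferable.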
The spectra were subjected to a rigorous analysis based on a complete simulation of the bound–free and bound–bound parts. The B–S and LR–B procedures as well as the GvNDE program were employed. As a result, spectroscopic constants for all four electronic states were determined. In particular, more reliable parameters of the potentials were found for the B1 and X0+ states. Moreover, similarly to the ZnNe and CdHe excitation spectra, the simulation of the A0+ ← X0+ transition allowed for the influence of the intense atomic line on the F–CF intensity distribution. This procedure changed the view on the previously evaluated difference ΔRe between the A0+- and X0+-state bond lengths. The rotational analysis of the A0+(v′ = 0) ← X0+(v″ = 0) band (Fig. 25) fully corroborated that approach. It was found that a Morse function combined with an adequate long-range approximation represents well the interatomic PE curves of the X0+, A0+ and B1 states below their dissociation limits. The ground-state interatomic potential was represented by a combined Morse–vdW function (24). It was also determined that in the long-range limit the three excited states have strong non-Morse components with a dominating vdW interaction. The dissociation energy of the ground state was determined from the “hot” bands observed in the excitation spectra. Simulation of the fluorescence spectra confirmed the resulting ground-state dissociation energy. Moreover, the repulsive part of the ground state above the dissociation limit was accurately determined in the range of 3.15–3.75 Å, and found to be represented by a Morse potential. Therefore, the Morse–vdW representation of the CdNe ground-state potential was extended to the short-range region and proposed as a combined representation in a wide region of R.

6.1.1.3. CdAr. A direct measurement of the B1-state dissociation energy constituted an important achievement in the characterization of CdAr [46].
This made it possible to derive a long-range representation of the B1-state potential with the aid of the LR–B method presented in Fig. 21 (the B1 ← X0+ transition is shown in Fig. 38 on the short-wavelength side of the atomic transition). Similarly to CdNe, also in this case the B1-state interatomic potential below its dissociation limit was represented by a combined Morse–vdW function (24). A number of “hot” bands in the B1 ← X0+ transition were recorded. This made it possible to characterize the ground state in the intermediate region of R for the first time. Together with the NDE method applied, it resulted in a reliable description of the ground-state potential below its dissociation limit (i.e. in the intermediate and long-range regions). Again, a Morse–vdW representation was used. To extend the ground-state representation over the short-range region (as well as to confirm the v′-assignment in the A0+ and D1 states), the A0+(v′ = 4) → X0+ and D1(v′ = 7, 8) → X0+ bound–bound and bound–free profiles in fluorescence were recorded [47] (Fig. 27). They originated from different, selectively excited vibrational levels (in the different A0+ and D1 electronic states) and terminated on the same part of the ground-state repulsive wall. As a result of the modelling of the bound–free spectra (Section 4.2), the M–S(10.6, 7.0) potential representation was determined (Fig. 27(g)), and together with the Morse–vdW function of Ref. [46] it served as a hybrid CdAr ground-state representation.

6.1.1.4. CdKr. A recently published result of ab initio studies of the CdKr molecule [72] suggests the existence of a potential barrier in the B1-state bound well in the region of R that is a priori accessible via excitation from the lower v″ levels of the ground state. This provides additional information related to the interpretation of the B1 ← X0+ transition in the CdKr excitation spectrum reported previously [257,259]. In both earlier investigations, pure Kr [257] and a 10% mixture of Kr in Ne [259] were used, and the characteristic envelope of the intensity distribution was observed.
The spectra were subjected to the standard B–S analysis, but it was assumed [259] that the B1-state well supports a lower number of vibrational levels than expected from the linear B–S approximation, and that the B1-state potential exhibits strong non-Morse behaviour as it approaches the dissociation limit. In the reviewed studies, the 10% Kr mixture in Ne and pure Kr were used in the experiments in Windsor and Kraków, respectively (Table 9) [48,49]. The shortest-wavelength part of the B1 ← X0+ transition in excitation as well as the D1(v′ = 16) → X0+ and A0+(v′ = 9) → X0+ transitions in fluorescence were investigated in detail. The B1 ← X0+ and A0+ ← X0+ transitions are presented in Figs. 24 and 38. From the former, it is apparent that after a rapid decrease of the intensity of the vibrational components, occurring approximately for v′ = 5, there is a minimum in the intensity distribution, and then a subtle revival occurs approximately for v′ = 6–7. From the simulation shown in Fig. 24(c), in which both the B1 and X0+ wells were represented by Morse functions, it is obvious that the Morse representation is adequate for the B1-state potential only up to v′ = 4–5, because the frequencies and intensities of the vibrational components are accurately reproduced only up to these energies. For shorter wavelengths, the simulation does not resemble the envelope of the experimental trace. One explanation of the previous erroneous interpretation of the B1 ← X0+ transition [257,259] is the omission of a possible strong “contamination” of the short-wavelength part of the B1 ← X0+ spectrum by “cold” (v′ ← v″ = 0) and “hot” (v′ ← v″ = 1) vibrational components of the 0u+(³Πu) ← X0g+(¹Σg+) transition in the efficiently produced Cd2 molecule (Section 6.5.2) [50,58]. These Cd2 components were previously interpreted as higher B1 ← X0+(v″ = 0) transitions in the CdKr molecule [259]!
It resulted in an incorrect interpretation of the whole transition and consequently, as mentioned above, in an inaccurate determination of the ground-state dissociation energy. Therefore, in the approach reviewed here [49,50], a very thorough analysis relying on the exclusion of Cd2 components from the spectrum was performed (footnote 59). The same spectrum was recorded under somewhat different experimental conditions, which decreased the probability of detecting the Cd2 components. As suggested by the ab initio result [72], a double-well potential was assumed to represent the B1-state potential (footnote 60), with a long-range tail represented by the C6/R^6 approximation, and the LEVEL 7.2 Fortran code of Le Roy [212] was applied to simulate the spectrum. A result of the simulation is shown in Fig. 24(b). From the comparison of both simulations (i.e. in Figs. 24(b) and (c)), it is obvious that the excited-state Morse representation reproduces only the bottom of the deeper B1-state well (compare with Fig. 14). For a detailed discussion of the simulation procedure, the reader is referred to Ref. [50]. From the analysis of the A0+ ← X0+ transition, a Morse–vdW representation for the A0+ state was concluded. Moreover, two channels of fluorescence, A0+(v′ = 9) → X0+ and D1(v′ = 16) → X0+ (footnote 61), resulted in a very accurate M–S(8.6, 7.3) representation of the CdKr ground-state repulsive wall, which together with a Morse–vdW representation in the intermediate and long-range regions provided the accurate hybrid PE curve shown in Figs. 30(a)–(c) separately for three regions of R.

6.1.1.5. CdXe. Surprisingly, despite the effort invested in the description of the CdRG molecules investigated using the supersonic beam method, the B1 ← X0+ transition in the CdXe excitation spectrum had not been investigated until the recent studies. The B1 ← X0+ and A0+ ← X0+ transitions in that molecule were reported in 1996 [42] (Fig. 38). As in the case of CdKr, experiments were performed in both laboratories, in Windsor and Kraków, using 10% mixtures in pure Ne and He, respectively. A number of “hot” bands, observed mainly in the B1 ← X0+ transition, facilitated a reliable and direct ground-state characterization, which departed rather considerably from the indirect, very speculative value of Ref. [256].
The interpretation, analysis (B–S and LR–B methods), and simulation of the spectrum resulted in hybrid Morse and vdW representations for the ground-state and two excited-state potentials below their dissociation limits.

6.1.2. D1 singlet states

6.1.2.1. CdNe, CdAr and CdKr. Analysis and simulation of the D1 ← X0+ transitions in the excitation spectra of the CdNe, CdAr and CdKr molecules (Fig. 39), reported in Refs. [45,47–50], essentially reinforced the respective ground-state characterization through detection of a number of “hot” bands, and also served in the search for the D1-state potential representations in the intermediate and long-range regions. The studies resulted mostly in the Morse–vdW combined functions (24) as the most adequate ground-state representations. The D1-state characteristics for the three molecules obtained in both B–S and LR–B (or NDE) analyses do not differ considerably from those of Ref. [183], as shown in Table 11. In Table 11, the reader can also find a complete summary of the CdRG characteristics in the ground and the A0+, B1 and D1 lowest excited electronic states that were

⁵⁹ The decay of LIF was simultaneously monitored while scanning the dye-laser frequency through the v′ ← v″ transitions, excluding those of Cd₂ with a short decay time (∼1 μs [95]). The lifetime of the CdKr v′-vibrational states was approximately 2–3 times longer [258].
⁶⁰ The F–CF intensity envelope in the B1 ← X0+ transition in CdKr is similar to that of the same transition in HgXe [233]. The recently published ab initio B1-state interatomic potential of HgXe [135–137] appears to have a double-well structure as well.
⁶¹ Detection of the gross D1(v′ = 16) → X0+ bound–free profile (which differs from the detection of only the short-wavelength part reported in Ref. [183]) as well as its simulation confirmed the D1-state v′-assignment.
J. Koperski / Physics Reports 369 (2002) 177 – 326
obtained in the studies. The results are compared with the respective experimental characteristics of the most reliable Refs. [183,213,256,257] as well as with the ab initio results of Ref. [72]. To illustrate the comparison of the experimental and ab initio results for the X0+, A0+, B1 and D1 states separately, the well depths, De, as well as the bond lengths, Re⁶, were plotted vs. α_RG. They are shown in Fig. 40 for the four investigated states.

6.1.3. E1 triplet Rydberg states

Fig. 41 shows the E1(6³S₁) ← A0+(v′ = 5) ← X0+(v″ = 0) bound–bound (trace (a)) and E1 ← B1(v′ = 2) ← X0+(v″ = 0) bound–bound and bound–free (trace (b)) transitions in the excitation spectrum of the CdAr molecule studied using a pump-and-probe method (the method can also be called optical–optical double resonance). The E1-state characterization was performed using two paths of excitation, i.e. via the A0+ v′ = 5 or the B1 v′ = 0, 1, 2 intermediate vibrational levels. This made it possible to reach different parts of the E1-state PE curve (inner, E1in, and outer, E1out, respectively). The inner part of the CdAr E1-state potential was characterized previously [319,320]. However, the outer, very shallow well was not described in those experiments. In the same manner the E1 Rydberg states were characterized for the first time in the CdNe and CdKr molecules, revealing humps in their PE curves. This tendency is in agreement with the recent ab initio studies of Czuchaj et al. [131] for CdRG molecules. The PE characteristics for the CdRG E1 states are collected in Table 11.

6.1.4. Conclusions: CdRG family

Summarizing the CdRG (RG = He, Ne, Ar, Kr, Xe) characterization in the four X0+, A0+, B1 and D1 electronic states (Table 11 and Figs. 40(a)–(d)), one can draw a straightforward conclusion that in the author’s [42–50] as well as in other experimental [83,213,256,257] and ab initio [72] results related to the ground- and excited-state well depths, certain trends are present:

$$D_e''(\mathrm{CdHe}) < D_e''(\mathrm{CdNe}) < D_e''(\mathrm{CdAr}) < D_e''(\mathrm{CdKr}) < D_e''(\mathrm{CdXe}), \tag{63a}$$

$$D_e'(\mathrm{CdHe}) < D_e'(\mathrm{CdNe}) < D_e'(\mathrm{CdAr}) < D_e'(\mathrm{CdKr}) < D_e'(\mathrm{CdXe}). \tag{63b}$$

Let us focus the reader’s attention on relationship (63a), which is illustrated in Fig. 40(a). The De″ values increase regularly with increasing α_RG, which reflects a well-known trend of any kind of attractive forces (here an induced-dipole–induced-dipole (dispersion) interaction). Compared with both the experimental and the ab initio results of others, the De″(CdRG) vs. α_RG dependence obtained using the author’s results shows a linear trend, which is fully justified by the L–D relationship (53) discussed in Section 4.4.2 (also Section 3.4 and Ref. [260]). Similarly to the ground CdRG states, the excited-state De′ values increase as α_RG increases. The trend is present for the experimental as well as the ab initio values, but a linear dependence is no longer observed for the A0+, B1 and D1 states, as
Fig. 40. (a)–(d) Well depths, De, and (e)–(h) bond lengths, Re⁶, plotted as a function of the RG polarizability, α_RG, for the X0+ ground ((a), (e)), and A0+ ((b), (f)), B1 ((c), (g)) and D1 ((d), (h)) excited states of CdRG molecules. Results of Refs. [42–50] (the linear fit in (a) produces a 68.5 cm⁻¹ Å⁻³ slope) are compared with those of Refs. [183,213,256,257]. Results of the ab initio calculations of Czuchaj and Stoll [72], and Czuchaj et al. [131,132] are also shown. Inserts illustrate the mutual orientations of the electron density distributions in the ground and excited molecular electronic states.
[Fig. 40, panels (a)–(h): De″(CdRG), De′(A0+, CdRG), De′(B1, CdRG) and De′(D1, CdRG) in cm⁻¹, and the corresponding [Re″(CdRG)]⁶, [Re′(A0+, CdRG)]⁶, [Re′(B1, CdRG)]⁶ and [Re′(D1, CdRG)]⁶ in Å⁶, each plotted against α_RG in Å³ for RG = He, Ne, Ar, Kr, Xe. Legends distinguish the author's Refs. [42–50], other experimental results (Refs. [183,213,256,257]) and the ab initio results of Refs. [72,131,132].]
[Fig. 41: LIF (arb. units) vs. probe laser wavelength (Å), approximately 4750–5100 Å. Trace (a): E1in ← A0+(v′ = 5); trace (b): E1in and E1out ← B1(v′ = 2). Inset: PE-curve diagram of the X0+, A0+, B1 and E1 states.]
Fig. 41. (a) E1(6³S₁) ← A0+(v′ = 5) ← X0+(v″ = 0) bound–bound and (b) E1 ← B1(v′ = 2) ← X0+(v″ = 0) bound–bound and bound–free transitions in the excitation spectrum of CdAr studied using a pump-and-probe method. The E ← A ← X (solid arrows) and E ← B ← X (dashed arrows) excitation paths are shown in the PE-curve diagram. The former probes the inner (E1in) while the latter probes mostly the outer (E1out) part of the E1 Rydberg-state potential. Similar experiments allowed the E1 states in the CdNe and CdKr molecules to be characterized for the first time (see Table 11).
shown in Figs. 40(b), (c) and (d), respectively. This reflects the fact that the dispersion L–D theory (53) cannot be applied here.⁶² Inspecting Figs. 40(a)–(d), one can conclude that in almost all cases the ab initio De values are smaller than the experimental ones [42–50], especially for the A0+ and D1 states and for the heavier CdRG molecules. The ab initio values published in Ref. [72] as well as in Refs. [131,132], depicted in Fig. 40, were obtained taking into account not only the Cd but also the RG valence electrons (the Cd²⁺ and RG⁸⁺, as well as the Cd²⁰⁺ and RG⁸⁺ cores, respectively, were modelled by ℓ-dependent scalar relativistic pseudopotentials, and a core-polarization potential was applied for Cd²⁺ [72] and Cd²⁰⁺ [131,132], see Table 2). This treatment, as stated in Ref. [72], considerably improves the calculated CdRG potentials as compared with former approaches [114,115]. Moreover, the evident systematic deviations of the ab initio De values from the experimental ones can be reduced by treating the spin–orbit interaction in a more advanced manner, namely by taking the R-dependence of this coupling term into account.

⁶² As compared to the CdRG ground-state De″ = (68.5 cm⁻¹ Å⁻³) × α_RG linear dependence, the respective relationships written for the A0+, B1 and D1 excited states reveal a non-linear power-law dependence of De′ on α_RG, where, intriguingly, the exponent for CdAr, CdKr and CdXe appears to be twice that for CdHe and CdNe. These interesting regularities are presently under investigation, laying out a potential direction for future studies.
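The linear ground-state dependence quoted for Fig. 40(a) can be turned into a quick numerical check. The sketch below is illustrative only: the slope of 68.5 cm⁻¹ Å⁻³ is the value quoted in the text, whereas the polarizabilities in `ALPHA_RG` are rounded textbook values assumed here (they are not taken from this review), and the function names are hypothetical.

```python
# Illustrative sketch: ground-state well depth from the linear relation
# De'' = slope * alpha_RG (slope quoted for the fit in Fig. 40(a)).

SLOPE_CM1_PER_A3 = 68.5  # cm^-1 per Å^3, quoted in the text

# ASSUMPTION: rounded textbook static dipole polarizabilities in Å^3;
# these numbers are not taken from the review.
ALPHA_RG = {"He": 0.205, "Ne": 0.396, "Ar": 1.64, "Kr": 2.48, "Xe": 4.04}


def linear_well_depth(rare_gas, slope=SLOPE_CM1_PER_A3):
    """Return the well depth (cm^-1) predicted by the linear relation."""
    return slope * ALPHA_RG[rare_gas]


def follows_trend_63a(depths):
    """True if the depths increase strictly from CdHe to CdXe, as in (63a)."""
    ordered = [depths[rg] for rg in ("He", "Ne", "Ar", "Kr", "Xe")]
    return all(a < b for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    depths = {rg: linear_well_depth(rg) for rg in ALPHA_RG}
    for rg, de in depths.items():
        print(f"Cd{rg}: De'' ~ {de:.1f} cm^-1")
    print("trend (63a) satisfied:", follows_trend_63a(depths))
```

Because the relation is strictly linear with a positive slope, the ordering (63a) follows automatically from the ordering of the polarizabilities; the excited-state depths, which do not follow a linear law, cannot be generated this way.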
Figs. 40(e), (f), (g) and (h) present the Re⁶ vs. α_RG dependence (as recommended by relationship (53)) plotted for the X0+, A0+, B1 and D1 states, respectively, using the author’s [42–50] as well as other experimental [183,213,256,257] and ab initio [72,131,132] results. Comparing these dependencies for the X0+ state (Fig. 40(e)), one finds that the ground-state bond lengths obtained in Refs. [42–50] are related as follows:

$$R_e''(\mathrm{CdHe}) > R_e''(\mathrm{CdNe}) > R_e''(\mathrm{CdAr}) > R_e''(\mathrm{CdKr}) > R_e''(\mathrm{CdXe}), \tag{64a}$$

reflecting a tendency of Re″ to decrease as α_RG increases. This is consistent with the previously observed behaviour of De″ vs. α_RG shown in Fig. 40(a) and with the L–D model (53). The increase of the ground-state potential well depth has to be compensated by a decrease of the respective ground-state bond length. For the ab initio Re″ values (except Re″(CdHe)⁶³) the trend is the reverse of relationship (64a). For the other experimental Re″ values of Refs. [183,256] the trend follows that of the ab initio studies of Refs. [131,132], and Re″(CdNe) < Re″(CdAr) < Re″(CdKr). For the excited-state bond lengths (Figs. 40(f)–(h)), the general tendency is the same for both the experimental (obtained by the author in Refs. [42–50] and by other investigators [183,213,256,257]) and the ab initio [72,131,132] results, and is described by the following relationship (except Re′(CdHe) and Re′(B1) for CdKr, which depart from this regularity):

$$R_e'(\mathrm{CdNe}) > R_e'(\mathrm{CdAr}) > R_e'(\mathrm{CdKr}) > R_e'(\mathrm{CdXe}). \tag{64b}$$

The tendency (64b) is consistent with that for the ground states, and can be qualitatively illustrated using the electron density distributions in the ground-RG and excited-Me atomic states. As α_RG increases from that of Ne, through Ar and Kr, to Xe, the electron density distributions tend to be spatially closer to each other regardless of their mutual orientation (5p-orbital along the internuclear axis (i.e. in σ-alignment) for the B1 state, or 5p-orbitals perpendicular to the internuclear axis (i.e. in π-alignment) for the A0+ and D1 states). Overall, the electron-density distribution approach is consistent with the results reviewed here. Moreover, it is very interesting that for all four molecular states the ab initio Re values of Ref. [72] (except Re(CdHe)) are smaller than the experimental ones for the CdNe or CdAr molecules (α_RG small) and larger than the experimental ones for CdKr or CdXe (α_RG large). This is consistent with the already mentioned trend that, for the heavier CdRG molecules, the ab initio well depths are smaller than the experimental ones. It is apparent, and may serve as a possible explanation, that for RG atoms with larger polarizabilities the second-order Møller–Plesset perturbation theory applied in the ab initio calculation [72] does not fully account for the large vdW attraction and produces shallower CdRG potentials with larger bond lengths. The ab initio results of Refs. [131,132] are closer to the experimental values for the X0+ and A0+ states. However, the largest discrepancy occurs for Re′(B1) of CdKr. Inspecting Fig. 40 as well as the results collected in Table 11, one can see that the ground-state well depths and bond lengths of CdRG are smaller and larger, respectively, than the De′ and Re′ values for the A0+ excited state, while exactly the reverse is true for the B1 excited state. As already discussed above and in Section 3.5.1 (Fig. 9), this is caused by the mutual orientation of the electron density distributions in the X0+(¹Σ⁺), B1(³Σ⁺ + ³Π) and A0+(³Π) molecular states (inserts in Fig. 40). The purely repulsive ³Σ⁺ admixture to the ³Π configuration in the B1

⁶³ The characteristics of almost every MeHe molecule depart from the presented regularities. This is often attributed to the He atom, which possesses an s² orbital while the rest of the RG atoms have p⁶ valence orbitals.
Fig. 42. Comparison of the ground-state interatomic potentials of the CdRG (RG = Ne, Ar, Kr, and n = 2, 3, 4, respectively) vdW molecules. The CdNe, CdAr and CdKr ground-state repulsive parts were determined from the modelling of bound–free fluorescence spectra. They are represented by Morse (and M–S(12.3, 10.2), see Table 11) [45], M–S(10.6, 7.0) [47] and M–S(8.6, 7.3) [48,49] functions, respectively. Thick lines represent the ranges actually probed in the experiment.
states results in much shallower wells and larger bond lengths than in the A0+ states. At long range, the B1 states are actually more attractive than the A0+ states, but the net attraction persists to much shorter distances in the A0+ state. It is also interesting to compare De and Re for the two pure Π-states: the A0+(³Π) triplet and D1(¹Π) singlet states. In both states the excited-state 5p-orbitals are oriented perpendicularly to the internuclear axis (π-alignment); however, the node of the singlet-state orbital approaches closer to the spherical np⁶ orbital of the RG ground state,⁶⁴ resulting in deeper wells and shorter bond lengths in the D1 states. In conclusion, it is worthwhile to compare the degree of repulsion in the ground-state potentials of CdNe, CdAr and CdKr in the short-range limit, as determined from modelling the bound–free parts of the fluorescence spectra (Fig. 42). All three repulsive parts of the potentials were represented by M–S(n0, n1) functions (the n1 = 10.2 coefficient for CdNe was adjusted so that the M–S function smoothly merges into the Morse potential determined in Ref. [45], Table 11). As seen in Fig. 42, the degree of repulsion (slope) for the three molecules is approximately the same, suggesting that the spherically symmetric np⁶ orbital of the RG atom, even for different RG, does not influence the potential in the short-range region. This immediately implies that for R < Re in this family of molecules the contribution of the long-range vdW interaction is negligible. As will be seen below, this is not true for the Me₂ diatoms.

6.2. HgRG molecules

The first reports on the spectroscopy of HgRG vdW molecules produced in supersonic beams were published in the early 1980s. The studies were initiated in Yokohama, Japan, with the first detection of the A0+ ← X0+ and B1 ← X0+ transitions in the excitation spectrum of HgAr [262]. Since then, using

⁶⁴ The higher-energy nsnp ¹P₁ atomic states have larger and more diffuse p-orbitals than the lower-energy nsnp ³P_J states [183]. This results in closer penetration of the RG atom, as there is less np–RG repulsion (compare with Fig. 9), as well as more effective attraction because of the larger C₆ dispersion coefficient for the D1 than for the A0+ state (Table 11).
this method, noteworthy experimental investigations of HgRG were carried out by several groups in Japan (at the University of Tokyo, at Keio University, Yokohama, and at Tohoku University in Sendai). Moreover, these complexes were studied at the Université Paris-Sud in Orsay, at the Clarendon Laboratory in Oxford, at Purdue University in West Lafayette, IN, United States, and by the author in laboratories in Windsor and Kraków. Compared with the laser spectroscopy of ZnRG (Section 6.3) and CdRG (Section 6.1) produced in supersonic beams, studies of HgRG do not pose quite as many difficulties and experimental problems as do the two other classes of molecules. This is because comparable Hg-vapour densities are relatively easy to obtain at relatively low temperatures of the supersonic beam source (Table 9). Moreover, those considerably lower temperatures permit the use of solenoid pulsed valves to drive the beams, which consequently lowers the demands on the efficiency of the vacuum pump systems. A short overview of the most significant studies of the ground and lowest excited electronic energy states of HgRG reported so far by other investigators should start with those in which the HgHe complex was characterized in the X0+, A0+ and B1 states [199–201]. It was found that in this very weakly bound molecule the two bound electronic states, A0+ and X0+, could support three and at least one vibrational levels, respectively, whereas the B1 state was found to be totally repulsive. Moreover, from the observed A0+(v′) ← X0+(v″ = 0) vibrational progression and rotational contours it was concluded that the A0+ state is extremely anharmonic and that an L–J(6–12) function (Eq. (21)) rather than a Morse function was more suitable to represent the A0+-state potential [201].
Similarly to CdHe, the ground and excited states in HgHe correlated with the 6³P₁ atomic asymptote are extremely shallow; therefore, the corresponding transitions in the excitation spectrum lie very close to the 6³P₁–6¹S₀ atomic line. To expose the spectral features that are otherwise obscured by atomic fluorescence, an Hg-vapour filter was employed (Section 5.3) [201]. The ground and lower A0+, B1 and C¹1(6¹P₁) (see footnote 58) excited states of the HgNe molecule were investigated [200,232,263,264], including rotational structures and isotopic shifts in the A0+ ← X0+ and B1 ← X0+ transitions [200]. Moreover, a direct observation of the dissociation limit in the B1 ← X0+ transition using photofragment excitation spectroscopy was reported [263]. All those studies concluded with Morse representations for the ground and all three excited states. The Morse representations were extended over a wide region of internuclear separations despite the fact that they were determined from experimental data mostly in the intermediate region of R. In addition, and very confusingly, De′ < D0′ values were concluded for the A0+ state in HgNe [200]. Rich experimental data exist on the spectroscopy of the analogous transitions in the HgAr molecule [73,200,232,262–269]. The ground and the A0+, B1 and C¹1 excited states of the HgAr molecule were investigated, including rotational [200,264,268] and isotopic structures [200] of the A0+ ← X0+, B1 ← X0+ and C¹1 ← X0+ transitions. As for HgNe, the dissociation limit in the B1 ← X0+ transition was also observed in HgAr [263]. In most cases, the interatomic potentials in the intermediate regions of R were represented by Morse functions with parameters determined from analysis of the spectra. As for HgNe, a confusing De′ < D0′ conclusion was also drawn for HgAr, for the dissociation energies and well depths of the A0+ as well as the B1 states [200].
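Why a result with the well depth below the dissociation energy is confusing can be made explicit with the standard Morse-oscillator relations, in which the dissociation energy D0 equals the well depth De reduced by the zero-point energy. The sketch below assumes a pure Morse potential; the HgAr ground-state constants used in the example (ωe″ = 24.7 cm⁻¹, ωe″xe″ = 1.14 cm⁻¹) are the values listed in Table 12, and the function names are hypothetical.

```python
# Minimal sketch, assuming a pure Morse oscillator with term values
# G(v) = we*(v + 1/2) - wexe*(v + 1/2)**2  (all quantities in cm^-1).


def morse_well_depth(omega_e, omega_e_x_e):
    """Well depth De of a Morse potential: De = we^2 / (4 wexe)."""
    return omega_e**2 / (4.0 * omega_e_x_e)


def zero_point_energy(omega_e, omega_e_x_e):
    """Energy of the v = 0 level above the potential minimum, G(0)."""
    return 0.5 * omega_e - 0.25 * omega_e_x_e


def morse_dissociation_energy(omega_e, omega_e_x_e):
    """Dissociation energy D0 = De - G(0), measured from the v = 0 level."""
    return (morse_well_depth(omega_e, omega_e_x_e)
            - zero_point_energy(omega_e, omega_e_x_e))


if __name__ == "__main__":
    # HgAr X0+ constants as listed in Table 12 (cm^-1).
    we, wexe = 24.7, 1.14
    de = morse_well_depth(we, wexe)
    d0 = morse_dissociation_energy(we, wexe)
    print(f"De = {de:.1f} cm^-1, D0 = {d0:.1f} cm^-1, De - D0 = {de - d0:.2f} cm^-1")
```

For these constants the Morse well depth comes out near 133.8 cm⁻¹, close to the 133.7 ± 2.0 cm⁻¹ listed for HgAr in Table 12, and D0 lies about 12 cm⁻¹ below it. Since the zero-point energy is positive for any bound level, D0 must be smaller than De.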
In studies of the alignment of photofragments after photodissociation [269], hybrid potentials (Section 3.5.9) were adopted to represent the X0+, A0+ and B1-state PE curves in the intermediate and long-range regions of R. In the work on picosecond spectroscopy of HgAr [265], the RKR method was used to determine the B1-state potential. The same method was adopted in a very accurate determination of the A0+-state potential [266]. Moreover, for the A0+ excited state, noteworthy attempts have been made to determine the
shape of the PE curve in the long-range region by investigating the Ω-type doubling in the B1 electronic state of the ²⁰⁰HgAr isotopomer [267]. Reports on laser spectroscopy of the HgKr and HgXe complexes are significantly less extensive, probably due to the large difference in the cost of the experiment (cut down considerably here, though, by using a pulsed supersonic beam apparatus). The X0+, A0+ and B1 states of HgKr and HgXe were characterized in Refs. [232,233], respectively, and the investigations were then revisited [270], correcting mostly the excited-state characteristics of these two molecules. The characterization of the higher-excited singlet C¹1 state of HgKr and HgXe was presented in Ref. [264]. All those reports concluded with Morse representations for the ground and excited HgKr and HgXe states, mostly in the intermediate regions of R. Refs. [264,270] also focused on the investigation of rotational and isotopic structures in the excitation spectra of HgKr and HgXe using methods of high-resolution spectroscopy. It should be mentioned here that there were several attempts to represent the HgKr ground-, A0+- and B1-state interatomic potentials with M–S(n0, n1) functions [175,271], mostly in the short and intermediate regions of R, employing methods other than laser spectroscopy in supersonic beams. In the studies of the excitation spectra of the HgHe [51,52], HgNe [53], HgAr [53–55] and HgKr [56] molecules (see the comparison in Fig. 43) performed in the Windsor laboratory, special care was taken to characterize the X0+, A0+ and B1-state potentials reliably over a wide region of R, using both excitation and fluorescence spectra as well as all available methods for analysing the data and simulating the spectra. An efficient population of the v″ > 0 vibrational states was assured by properly adjusting the conditions of the supersonic expansion. Therefore, a number of previously unseen “hot” bands were observed in all studied transitions (especially for HgNe and HgAr [53]).
This facilitated a direct and more reliable characterization of the ground states of the HgNe and HgAr molecules. Detection of the excitation and fluorescence spectra of HgRG was performed using pure RG carriers (Table 9) to avoid “contamination” of the investigated spectra by unwanted HgRG components. In the detection of the HgHe [51,52], HgNe and HgAr [53] excitation spectra, an Hg-vapour filter was used. In the case of HgAr and HgKr, the accuracy of the characterization of the ground-state repulsive part was improved by detection of bound–bound [53] and bound–free [53,54] fluorescence terminating on the same part of the respective ground-state repulsive wall. The A0+(v′ = 8) → X0+ fluorescence in HgKr, observed for the first time, was reported in Ref. [56]. The direct determination of the number of vibrational components accommodated in the B1-state bound well, and of the B1-state dissociation limit [55], resolved the controversy that had surfaced over the determination of the highest vibrational level, v′max [265]. The most thorough and complete characterization of the X0+, A0+ and B1 states was presented in Ref. [56] as a result of the study of the HgKr molecule.

6.2.1. X0+ singlet, and A0+ and B1 triplet states

6.2.1.1. HgHe. Spectroscopy of the HgHe A0+ ← X0+ and B1 ← X0+ transitions in excitation was presented at the OSA Annual Meeting in Toronto in 1993 [51]. Successful production of HgHe in its bound ground electronic state required (as in the case of CdHe) particular experimental parameters, namely a significantly high pressure of the carrier gas (P0 > 17 atm) and a large distance, Xeff, from the nozzle to the excitation region (Table 9). These assured that conditions in the supersonic beam favoured the production of extremely cold molecules, allowing an appreciable population of HgHe in its shallow ground state. The laser spectroscopy of HgHe had been previously reported [199–201]. In Ref. [51], as in Ref. [201], an Hg-vapour filter was employed to test its applicability and
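[Fig. 43: LIF (arb. units) vs. laser wavelength (Å), approximately 2525–2555 Å, for HgKr, HgAr (part of the trace scaled 10×), HgNe and HgHe. The A0+ ← X0+ and B1 ← X0+ progressions and the atomic energy Eat(6³P₁–6¹S₀) are marked.]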
Fig. 43. A comparison of the B1 ← X0+ and A0+ ← X0+ transitions in the excitation spectra of the HgHe, HgNe, HgAr and HgKr molecules reported in Refs. [51–53,55,56], respectively. The spectra of HgHe, HgNe and HgAr were recorded with an Hg-vapour filter that was applied to absorb out the intense Hg 2537 Å atomic fluorescence (see text for details). The B1 ← X0+(v″ = 0) (for HgNe, HgAr and HgKr) and A0+ ← X0+(v″ = 0) (for HgHe, HgNe, HgAr and HgKr) progressions are situated on the short- and long-wavelength sides of the 6³P₁–6¹S₀ atomic transition, respectively. As in the case of CdRG, the conclusion is that the B1 and A0+ excited states are more weakly and more strongly bound, respectively, than the X0+ ground state. The relative intensity scale has been changed (×10) for the HgAr spectrum near the long-wavelength limit.
performance in the experimental set-up of the Windsor laboratory. The excitation spectrum recorded with and without the filter is shown in Figs. 19(a) and (b) (see also Fig. 43). The reader can appreciate the improvement in the quality of the spectrum detected when the strong atomic fluorescence is eliminated. The interpretation of the spectrum as well as the v′-assignment were similar to those in previous studies [199–201]. A rotational analysis of the A0+(v′ = 0) ← X0+(v″ = 0) band is presented in Fig. 19(d) along with a simulation of the P- and R-branches (Fig. 19(c); the Q-branch is not present in the spectrum, Section 3.7). The limited experimental data allowed only some of the molecular characteristics to be determined (see footnotes of Table 12). However, application of the Hg-vapour filter was found to be quite useful, especially for very weakly bound HgRG complexes, whose spectroscopic components overlap with prominent atomic features in the spectrum.
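The link between a rotational analysis and a bond length can be illustrated with the rigid-rotor relation B = h/(8π²cμR²). The sketch below is a hedged illustration, not the procedure of Refs. [51,52]: it assumes average atomic masses for Hg and He (an assumption made here; the review does not state which isotopic masses were used), takes the HgHe bond lengths 4.50 Å (X0+) and 3.55 Å (A0+) listed in Table 12, and uses hypothetical function names.

```python
import math

# Physical constants (exact or recommended SI values).
PLANCK_H = 6.62607015e-34        # J s
SPEED_OF_LIGHT_CM = 2.99792458e10  # cm/s, so that B comes out in cm^-1
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg
ANGSTROM = 1.0e-10               # m

# ASSUMPTION: average atomic masses in u (not specified in the review).
MASS_HG = 200.59
MASS_HE = 4.0026


def reduced_mass(m1, m2):
    """Reduced mass of a diatom, in the same units as m1 and m2."""
    return m1 * m2 / (m1 + m2)


def rotational_constant(bond_length_angstrom, mu_u):
    """Rigid-rotor rotational constant B in cm^-1 for a bond length in Å."""
    mu = mu_u * ATOMIC_MASS_UNIT
    r = bond_length_angstrom * ANGSTROM
    return PLANCK_H / (8.0 * math.pi**2 * SPEED_OF_LIGHT_CM * mu * r**2)


if __name__ == "__main__":
    mu = reduced_mass(MASS_HG, MASS_HE)
    for label, r_e in (("X0+", 4.50), ("A0+", 3.55)):
        print(f"HgHe {label}: Re = {r_e:.2f} Å -> B = {rotational_constant(r_e, mu):.3f} cm^-1")
```

Under these assumptions the two bond lengths give about 0.212 and 0.341 cm⁻¹, which agree with the HgHe rotational constants listed in Table 12. The same function can be inverted to estimate a bond length from a measured rotational constant.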
Table 12
Summary of the X0+(¹Σ⁺), A0+(³Π), B1(³Σ⁺ + ³Π) and D1(¹Π)-state potential characteristics for the HgRG molecules (RG = rare gas). Results of the author’s studies are put in bold. The most recent ab initio values of Refs. [135–137] are included. Phenomenological ground-state long-range characteristics are collected in Tables 4–7. Note: ΔRe = Re′ − Re″
Designation 1
HgHe 2
HgNe 3
HgAr 4
HgKr 5
HgXe 6
De (cm−1 )
6.2 ± 0.4a 8.0e 13.7p
41.4 ± 1.1b 46f 42.3p
133.7 ± 2.0c 143f 118p
178.49 ± 0.06d 178k 166p
254r 304s 213p
Re (Å)
4.50a 4.6e 4.19p
3.89 ± 0.01b 3.90f 3.99p
3.99 ± 0.01c 3.99f 4.10p
4.03 ± 0.02d 4.07k 4.16p
4.25r 4.05s 4.32p
20.6 ± 0.5b 18.5f 13.5p
24.7 ± 0.04c 23.5f 19.4p
20.7 ± 0.2d 20k 17.4p
18.3r 16.6p
2.56 ± 0.03b 1.6f
1.14 ± 0.02c 1.06f
0.06 ± 0.10d 0.54k
0.33r
0.197 0.202f
0.0597f
0.0311f
— —
— —
Be (cm−1 )
0.212a 0.20e
— —
— —
— —
— —
n or n0 m
—
14.1b
11.96c 11.3n
11.39d 10.63l
9.3
10.5n 16.86l
—
X0+ (1 + )
!e (cm−1 )
16.9
!e xe (cm−1 ) Bv =0
(cm
−1
2.6
f
f a
)
n1
—
Y (×10 A 8
−1
10.8 b
1.247r
27.7 ± 0.8a 29d 46p
79.0 ± 1.0b 67f 81p
362.3 ± 6.0c 369.19g 447p
627.9d 628.7q 832p
1380.9q 1769p
Re (Å)
3.55a 3.6e 3.43p
3.48 ± 0.02b; o 3.47f 3.53p
3.35 ± 0.02c; o 3.368g 3.29p
3.27 ± 0.04d; o 3.35q 3.17p
3.15q 3.09p
ΔRe (Å)
— —
−0.41 ± 0.02b −0:43f
−0.64 ± 0.01c −0:63f
−0.76 ± 0.02d −0:72k
−1:13q; r
26.9 ± 0.4b 28.3f 20.4p
41.7 ± 0.6c 41.2g 43.8p
40.61d 40.63q 48p
54.17q 65.1p
2.28 ± 0.05b 3.0f
1.20 ± 0.06c 1.207g
0.686d 0.691q
0.565q
0.316 0.319f
0.0727f
— —
— —
— —
0.341a 0.32e
— —
— —
— —
— —-
—
—
182.6h
230.7d
—
—
1.575b
1.540c
1.552d
1.634q
13.3 ± 0.8b 13f 17p
67.2 ± 1.0c 53f 78p
104.2d 104.8q 120p
187.6q 403p
4.57 ± 0.02b; o 4.92f 4.98p
4.64 ± 0.02c; o 4.70f 4.74p
4.49 ± 0.04d; o 4.58q 4.67p
4.47q 3.18p
)
(cm
−1
A0 ( De
)
!e (cm−1 )
21.1
!e xe (cm−1 ) Bv =0
(cm
−1
— a
)
Be (cm−1 ) C6 (a:u:) Y (×108 A
−1
3
3
+
B1 ( + De (cm−1 ) Re
Y (A)
p
)
1.451
d
1.669
3
1.501
c
—
+
)
—
n
) 4.4
p
5.55p
Table 12 (continued) Designation 1
HgHe 2
HgNe 3
HgAr 4
HgKr 5
HgXe 6
ΔRe (Å)
— —
0.68 ± 0.01b 1.02f
0.65 ± 0.01c 0.71f
0.46 ± 0.02d 0.51q; k
0.22q; r
7.7 ± 0.4b 7.9f 6.7p
12.7 ± 0.2c 11.5f 11.7p
10.95d 11.1q 10.4p
9.71q 44.5p
0.303d 0.301q
0.215q
!e (cm−1 ) !e xe (cm−1 )
—
1.12 ± 0.05b 1.2f
0.60 ± 0.01c 0.63f
Bv =0 (cm−1 )
—
0.0356f
0.0233f
C6
(a:u:)
—
Y (×10 A 8
−1
)
c
—
315.3 231.2 b
1.089
c
— i
— d
390.4 272.2 1.043
d
l
— 1.006q
—
1.104
104p
89j 91p
487j 532s 702p
1410j 1498s 1485p
3463j 3615s 3451p
3.41j 3.50p
3.28j 3.12p
2.93j 2.98p
2.95j 2.75s 2.89p
−0:49f ; j
−0:71f ; j
−1:14j; k
−1:30j; r
27.3j 20.8p
50.3j 49.0s 55.6p
69.1j 69.1s 66.5p
98.8j 98.8s 88.8p
—
2.1j
1.29j
0.85j
0.71j
—
1.512j
1.597j
1.727j
1.829j
C1 1 (1 )59 De (cm−1 ) Y Re (A)
3.15
ΔRe (Å)
—
!e (cm−1 ) !e xe
(cm
p
−1
42.6 )
Y −1 ) (×108 A
p
a Refs. [51,52]; only D0″ and D0′(A0+) were evaluated by the author and they were put in the table instead of De″ and De′(A0+), respectively; Re″ and Re′(A0+) obtained from rotational analysis (Fig. 19).
b Ref. [53].
c Refs. [53–55].
d Ref. [56].
e Ref. [199]; only D0″ and D0′(A0+) were evaluated, and they were put in the table instead of De″ and De′(A0+).
f Ref. [200].
g Ref. [266].
h Ref. [267].
i Ref. [265].
j Ref. [264].
k Ref. [232].
l Ref. [175].
m From Eq. (51).
n From simulation of the A0+(v′ = 2, 3, 4, 5) → X0+ and A0+(v′ = 8) → X0+ bound–free spectra of HgAr [54] and HgKr [56], respectively.
o From ΔRe obtained in simulation of the corresponding excitation spectrum.
p Ab initio values of Refs. [135–137].
q Ref. [270].
r Ref. [233]; Re(HgKr) and Re(HgXe) evaluated using the so-called Kong’s intercombination rule of Refs. [230,231].
s Values obtained as a result of re-examination of the C¹1 ← X0+ excitation spectra of HgAr, HgKr and HgXe of Ref. [264] with the help of the LEVEL 6.1 code [212] and the L–P method [9,226].
6.2.1.2. HgNe. The Hg-vapour absorption filter was also applied in the detection of the A0+ ← X0+ and B1 ← X0+ transitions in the excitation spectrum of HgNe [53] (Fig. 43). It improved the recorded spectrum in comparison with those previously reported [200,232]. The dissociation continuum, usually present in the short-wavelength part of the spectrum, was eliminated, allowing detection of the “pure” B1(v′) ← X0+(v″ = 0) progression up to the dissociation limit. Moreover, a large number of “hot” bands detected in both transitions (most of them observed for the first time) allowed a reliable characterization of the molecular ground state (see also Ref. [45]). The improved v′-assignment of the A0+ ← X0+ transition components [200] corrected the earlier erroneous assignment of Ref. [232]. The new assignment was confirmed by observation of the A0+(v′ = 1) → X0+ fluorescence as well as in the detailed study of the F0u+ ← X0g+ transition in Hg₂ [60].⁶⁵ It resulted in slightly modified values for the A0+- and B1-state well depths, and in the most reliable representation of the ground-state potential. As a result, all three states were represented by Morse functions in the intermediate regions of R.

6.2.1.3. HgAr. The spectroscopic characterization of the HgAr molecule presented in Refs. [53–55], along with the hybrid ground-state representation, is one of the most complete analyses of experimental data discussed in this review. This concerns mostly the characterization of the X0+ state, which was based on the “hot” bands detected in the excitation spectrum [53] as well as on the B1(v′ = 0, 1, 2, 3) → X0+ bound–bound [53,55] (Fig. 26) and A0+(v′ = 2, 3, 4, 5) → X0+ bound–free transitions recorded in fluorescence [55]. As for HgHe and HgNe, the Hg-vapour filter was also employed during the detection of the HgAr excitation spectrum, revealing the true intensity of the A0+(v′ = 7) ← X0+ vibrational component, usually obscured by the intense atomic-line profile (Fig. 43).
Moreover, analogously to HgNe, the dissociation continuum at the shortest wavelengths was eliminated, allowing detection of the “pure” $B1_{v'} \leftarrow X0^+_{v''=0}$ progression up to the B1-state dissociation limit, $D'(B1)$. Determination of $D'(B1)$ with the aid of the limiting LR–B method [55] (Section 4.1.2) facilitated a characterization of the B1-state potential in the long-range limit, assuming that pure vdW forces dominate in that region of $R$. Thus, the B1-state long-range tail was represented by the $D'(B1) - C_6/R^6$ approximation, taking over from the Morse representation valid in the intermediate region. It should be mentioned here that the B1-state interatomic potential was found to be represented by a function that is less steep than the Morse function. That was concluded by Duval et al. [75] from the $C^{3}1_{v=2,10}(7^3S_1) \rightarrow B1$ bound–free fluorescence.

A very interesting example of exploiting data that complement one another is the characterization of the HgAr ground state [53,54]. The bound well of the $X0^+$ state below its dissociation limit was determined using both the “hot” bands detected in the $A0^+ \leftarrow X0^+$ and $B1 \leftarrow X0^+$ transitions in the excitation spectrum, and the $B1_{v'=0,1,2,3} \rightarrow X0^+$ and $A0^+_{v'=2,3,4,5} \rightarrow X0^+$ bound–bound transitions recorded in fluorescence. Consequently, the Morse function was derived as a representation of the ground-state bound well. To characterize the repulsive part of the $X0^+$-state potential above the dissociation limit, the $A0^+_{v'=2,3,4,5} \rightarrow X0^+$ bound–free profiles were recorded and then simulated using the M–S(11.3, 10.8) function. This differed from the conclusion of previous studies by other investigators, where a Morse function was postulated [232]. In conclusion, a hybrid Morse–M–S representation was proposed for the $X0^+$-state interatomic potential [54].

$^{65}$ As pointed out in Ref. [53], the $F0^+_{u,v'=0} \leftarrow X0^+_{g,v''=0}$ vibrational component in the Hg$_2$ excitation spectrum was mistakenly interpreted in Ref. [232] as the HgNe $A0^+_{v'=0} \leftarrow X0^+_{v''=0}$ component.
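For orientation, the two analytic pieces named in the HgAr discussion above (a Morse function in the intermediate region and a pure vdW tail in the long-range region) can be written out explicitly. The expressions below are the standard textbook forms, with the energy measured from the potential minimum so that both tend to the dissociation limit at large $R$; the symbols $\beta$, $R_e$, $D_e$ and $C_6$ follow common usage and are not quoted from Refs. [53–55], and the way the two pieces are joined is left open here.

```latex
U_{\mathrm{Morse}}(R) = D_e \left[ 1 - e^{-\beta (R - R_e)} \right]^2 ,
\qquad
U_{\mathrm{LR}}(R) = D_e - \frac{C_6}{R^6} \quad (R \gg R_e) .
```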
J. Koperski / Physics Reports 369 (2002) 177 – 326
6.2.1.4. HgKr. The last HgRG molecule reviewed here is the HgKr diatom [56] (Fig. 43). The $A0^+ \leftarrow X0^+$ and $B1 \leftarrow X0^+$ bound–bound transitions in the excitation spectrum, and the $A0^+_{v'=8} \rightarrow X0^+$ bound–free transition in the fluorescence spectrum, were detected using pure Kr as a carrier gas. After an analysis involving the B–S, limiting LR–B and generalized NDE methods, the $A0^+$ and B1 excited states were represented by Morse functions, which in the long-range region of $R$ were approximated with pure $C_6/R^6$ vdW tails. It was found that both excited states exhibit strong non-Morse behaviour in the attractive long-range region. The $A0^+_{v'=8} \rightarrow X0^+$ bound–free fluorescence, which had never been observed before, allowed an accurate determination of the repulsive part of the ground-state potential, which differs considerably from that obtained previously in the study of blue satellites of the Hg 253.7 nm line perturbed by Kr [175]. Therefore, a hybrid three-part potential (for the repulsive, intermediate and long-range parts) was proposed to represent the $X0^+$ state. This complete representation illustrates well how experimental data from different spectra, recorded in spectral regions corresponding to different ranges of $R$, can complement one another to increase the accuracy with which a real interatomic potential is determined. The representations for the short and long $R$-ranges are presented in Figs. 13(a) and (b), respectively.

6.2.2. Conclusions – HgRG family

Because the author’s results obtained for the HgRG molecules do not include HgXe or the $C^{1}1(6^1P_1)$ state (see footnote 58), results of Refs. [233,264,270] were used in the conclusions related to the characterization of the $X0^+$, $A0^+$, B1 and $C^{1}1$ electronic states in HgRG (RG = He, Ne, Ar, Kr, Xe) (Table 12 and Fig. 44). One can draw a straightforward conclusion that, similarly to the case of CdRG analysed above, in Refs.
[51–56] as well as in other experimental [175,199,200,232,233,264–267,270] and ab initio [135–137] results related to the ground- and excited-state well depths, certain trends are present:$^{66}$

$$D_e''(\mathrm{HgHe}) < D_e''(\mathrm{HgNe}) < D_e''(\mathrm{HgAr}) < D_e''(\mathrm{HgKr}) < D_e''(\mathrm{HgXe}), \tag{65a}$$

$$D_e'(\mathrm{HgHe}) < D_e'(\mathrm{HgNe}) < D_e'(\mathrm{HgAr}) < D_e'(\mathrm{HgKr}) < D_e'(\mathrm{HgXe}). \tag{65b}$$

As for the CdRG molecules, the $D_e''$ values in HgRG also increase regularly with increasing $\alpha_{RG}$, which reflects an induced-dipole–induced-dipole interaction. Compared with both the experimental and the ab initio results of other investigators, the $D_e''$(HgRG) vs. $\alpha_{RG}$ dependence obtained using the results of Refs. [51–56] can be approximated linearly within the L–D theory (see Fig. 44(a) and the above discussion on CdRG). Consequently, one may conclude that the HgXe ground-state well depth determined experimentally by Yamanouchi et al. [233] and obtained in the ab initio calculation by Czuchaj and co-workers [135–137] is perhaps too small, and is expected to be larger by approximately 20%, i.e. $D_e''$(HgXe) = 304 cm$^{-1}$.$^{67}$ As for the ground states, the $D_e'$ values increase as $\alpha_{RG}$ increases (the trend is present for the author’s and other experimental values as well as for the ab initio values), but a linear dependence is no longer observed, as shown in Figs. 44(b), (c) and (d)

$^{66}$ For HgHe only $D_0''$ and $D_0'(A0^+)$ were evaluated, and they were put in Table 12 instead of $D_e''$ and $D_e'(A0^+)$, respectively. In this molecule the B1 state was found to be totally repulsive (experimental data) or very weakly bound (ab initio calculation). There are no experimental data for $D_e'(C^{1}1)$ in HgHe.

$^{67}$ Indeed, the author’s re-examination of the previously published $C^{1}1 \leftarrow X0^+$ excitation spectrum of HgXe [264] leads to corrected values for $R_e''$ and $R_e'(C^{1}1)$ as well as for $D_e''$ and $D_e'(C^{1}1)$ (Table 12). It was done with the help of the LEVEL 6.1 code of LeRoy [212] and the L–P method [9,226] described in Section 4.5.1.
[Fig. 44, panels (a)–(h): well depths $D_e''$(HgRG), $D_e'(A0^+$, HgRG), $D_e'(B1$, HgRG) and $D_e'(C^{1}1$, HgRG) in cm$^{-1}$, and the corresponding $[R_e]^6$ values in Å$^6$, plotted against $\alpha_{RG}$ in Å$^3$ for RG = He, Ne, Ar, Kr, Xe. Legend entries: the author’s Refs. [51–56]; other experimental references; ab initio, Refs. [135–137]; values re-examined by the author (Table 12).]
for the $A0^+$, B1 and $C^{1}1$ excited states, respectively. This reflects the fact that, in general, the dispersion L–D theory (53) cannot be applied to the excited states. Inspecting Figs. 44(a)–(d), one can conclude that the ab initio values generally follow the trend of those determined experimentally, except the $D_e'(B1)$ for CdXe. Minor discrepancies between ab initio and experimental results occur for $D_e''$(HgXe) and for $D_e'(A0^+)$ in HgKr and HgXe. Figs. 44(e), (f), (g) and (h) present the $R_e^6$ vs. $\alpha_{RG}$ dependence (according to relationship (53)) plotted for the $X0^+$, $A0^+$, B1 and $C^{1}1$ states, respectively, using the author’s [51–56] as well as other experimental [175,199,200,232,233,264–267,270] and ab initio [135–137] results for HgRG. Comparing the dependence obtained for the $X0^+$ state (Fig. 44(e)) with that for the CdRG ground states (Fig. 40(e)), one can easily notice the difference: except for $R_e''$(HgHe), the bond lengths increase as $\alpha_{RG}$ increases. The $R_e''$(HgRG) values obtained experimentally as well as in ab initio calculations are related as follows:

$$R_e''(\mathrm{HgNe}) < R_e''(\mathrm{HgAr}) < R_e''(\mathrm{HgKr}) < R_e''(\mathrm{HgXe}). \tag{66a}$$
From Fig. 44(e) it is obvious that the ab initio $R_e''$ values for HgRG (RG = Ne, Ar, Kr, Xe) are systematically larger with increasing $\alpha_{RG}$. It is also very interesting that $R_e''$(HgHe) does not follow tendency (66a) and is anomalously large compared to the other $R_e''$ values; this is observed for both the experimental and the ab initio values. A similar, anomalously large ab initio value is observed for $R_e'(B1)$ while, according to the experimental studies of HgHe [199,200], the B1 state is totally repulsive (or too shallow to support any vibrational levels). From the above discussion of the behaviour of $R_e''$ for CdRG (64a) and HgRG (66a) (compare Figs. 40(e) and 44(e)) it is evident that these two trends seem to be opposite, and the latter cannot be explained using the simple L–D theory. Instead of decreasing (as in CdRG) from $R_e''$(CdHe) = 4.33 Å to $R_e''$(CdXe) = 4.21 Å (Table 11), here one can observe an increase in bond lengths from $R_e''$(HgNe) = 3.89 Å to $R_e''$(HgXe) = 4.25 Å (Table 12). It seems that the $R_e''$(HgRG) tendency (except for $R_e''$(HgHe)) follows that of the RG-atom “hard-sphere” diameters, 2.8 : 3.4 : 3.6 : 4.1 Å for Ne:Ar:Kr:Xe, respectively [272], suggesting that the repulsive forces, which start manifesting themselves at the closest approach of the Hg and RG atoms, rather than the long-range attractive ones (the case of CdRG), dominate here. For the excited-state bond lengths (Figs. 44(f)–(h)), the general tendency is the same for the $A0^+$ and B1 states when considering both the experimental results (obtained in Refs. [51–56] and by other investigators [199,200,232,233,264,266,270]) and the ab initio [135–137] results. The tendency is described by the following relationship (except for the ab initio value of $R_e'(A0^+)$ of HgHe, which departs from this regularity):

$$R_e'(\mathrm{HgHe}) > R_e'(\mathrm{HgNe}) > R_e'(\mathrm{HgAr}) > R_e'(\mathrm{HgKr}) > R_e'(\mathrm{HgXe}). \tag{66b}$$
The relationship (66b) is consistent with that present for the CdRG excited states and can be similarly explained (Section 6.1.4). As $\alpha_{RG}$ increases from that of He to that of Xe, the spatial electron density
Fig. 44. (a)–(d) Well depths, $D_e$ (for HgHe there are $D_0$ instead), and (e)–(h) bond lengths, $R_e^6$, plotted as a function of the RG polarizability, $\alpha_{RG}$, for the $X0^+$ ground ((a), (e)), and $A0^+$ ((b), (f)), B1 ((c), (g)) and D1 ((d), (h)) excited states of HgRG molecules. Results of Refs. [51–56] (the linear fit in (a) produces a 75.1 cm$^{-1}$ Å$^{-3}$ slope) are compared with those of Refs. [199,200,232,233,264,266,270]. The result of the ab initio calculation of Czuchaj et al. [135–137] is also shown. Inserts illustrate mutual orientations of the electron density distributions in the ground and excited molecular electronic states.
Fig. 45. Comparison of the ground-state interatomic potentials of the HgRG (RG = Ne, Ar, Kr, Xe, and $n$ = 2, 3, 4, 5, respectively) vdW molecules. The HgAr and HgKr ground-state repulsive parts were determined through the modelling of bound–free fluorescence spectra. They are represented by M–S(11.3, 10.8) [54] and M–S(11.39, 10.5) [56] functions, respectively. The HgNe [45,53] and HgXe potentials represented by Morse functions are shown for comparison. Thick lines represent ranges actually probed in the experiment.
distributions tend to be closer to each other despite their mutual orientation (6p-orbital along the internuclear axis for B1 ($\sigma$-alignment), or 6p-orbital perpendicular to the internuclear axis for $A0^+$ ($\pi$-alignment)). A large discrepancy and no distinct tendency are observed when comparing experimental [264] and ab initio [135–137] results for $R_e'(C^{1}1)$ (Fig. 44(h)). It is a conclusion of this review that, along with the corrections described in Table 12 (see footnote 19 to Table 12), the results for HgXe, especially those for the $X0^+$ and $C^{1}1$ states, call for additional investigation. Inspecting Fig. 44 as well as the numbers collected in Table 12, one can see that, as for CdRG, the ground-state well depths and bond lengths of HgRG molecules are smaller and larger, respectively, than the corresponding $D_e'$ and $R_e'$ values for the B1 excited state, while exactly the reverse is true for the $A0^+$ state. Also, a comparison of $D_e'$ and $R_e'$ for the two pure $\Pi$-states, the $A0^+(^3\Pi)$ triplet and the $D1(^1\Pi)$ singlet, leads to the conclusion that well depths are larger and bond lengths shorter for the D1 states. The interpretation of this observation is similar to that for the CdRG molecules, and the reader is referred to Section 6.1.4. Finally, it is worthwhile to compare the degree of repulsion of the HgRG ground-state potentials in the short-range limit, determined for HgAr and HgKr from modelling of the bound–free parts of the fluorescence spectra (Fig. 45). The two repulsive parts of the potentials were represented by M–S($n_0$, $n_1$) functions (Table 12). As seen in Fig. 45, in which the HgNe [45,53] and HgXe Morse representations are also included, the degree of repulsion is approximately the same for all repulsive branches, suggesting the same conclusion as that derived for the repulsive branches of the CdRG ground states.

6.3. ZnRG molecules

As already mentioned, the ZnRG molecules in their stable ground states are relatively difficult to produce.
There are two main obstacles: high temperatures (to obtain a sufficiently high density of Zn vapour in the oven, i.e. 50–100 Torr, one has to heat it up to 920–1000 K (∼650–730 °C),
Table 9), and the aggressiveness of the Zn metal when in contact with the stainless-steel body of the oven. This is most likely the reason why there are so few reports on the spectroscopy of these molecules in supersonic beams. The only other group dealing with the laser spectroscopy of ZnRG by detecting excitation spectra and employing the supersonic beam method is that of the University of Utah. Four articles presenting a spectroscopic characterization of the $C^{1}1$ excited (see footnote 58) and $^1\Sigma^+$ ground states in ZnNe [273],$^{68}$ ZnAr [274], ZnKr [216] and ZnXe [275] can be found in the literature. These reports characterize the intermediate regions of $R$ using Morse representations for the ground and excited interatomic potentials, including their rotational characteristics. The ground-state characterization in Refs. [216,273–275] is indirect, i.e. based on expression (41) and relying on the determination of the excited-state dissociation energy, $D_0'$, and the experimentally measured frequency $\nu_{00}$. Therefore, as for CdRG and HgRG, in the studies of the excitation spectra of ZnNe [40,41], ZnAr [40,42] and ZnKr [42]$^{69}$ molecules (comparison in Fig. 46) a particular effort was made to produce an efficient population of the $v'' > 0$ vibrational states in order to observe a number of “hot” bands facilitating the ground-state characterization. Also, in the case of the ZnKr spectroscopy, the more favourable mixture of 5% of Kr in Ne, rather than in Ar as reported in [216], was employed to prevent the $D1 \leftarrow X0^+$ transition in ZnKr from heavy “contamination” by ZnAr vibrational components (as observed in Fig. 1 of Ref. [216]). Moreover, in the case of ZnAr, a successful observation of the bound–free fluorescence profile from the selectively excited $D1_{v'=10}$ vibrational state [47] (Fig. 47) made it possible to determine the degree of repulsion in the ZnAr ground-state short-range limit.

6.3.1. $X0^+$ and D1 singlet states

6.3.1.1. ZnNe. The excitation spectrum of this very weakly bound molecule (Fig. 46) was observed for the first time in the Windsor laboratory and reported at the 12th International Conference on Laser Spectroscopy in 1995 [40]. The direct spectroscopic characterization of the $X0^+$ state based on “hot” bands detected in the $D1 \leftarrow X0^+$ transition, as well as the characterization of the D1 state, were proposed and published afterwards [41]. The entire spectrum spanned a range of merely 3 Å (i.e. 65 cm$^{-1}$ in this spectral range), and was located very close to the atomic transition. Therefore, in the simulation of the F–CF intensity distribution, the influence of the large amplitude of the atomic line needed to be taken into account. This modified the $\Delta R_e$. The ground-state interatomic potential was represented by a Morse function in the intermediate region of $R$. To properly characterize the long-range behaviour of the ground-state potential tail, the Gv NDE program of Le Roy [208] was employed (Section 4.1.2) with the assumption that the long-range forces between Zn and Ne atoms are dominated by a pure vdW interaction. The D1 excited state was also characterized in the intermediate (a Morse representation) and long-range (a $C_6/R^6$ vdW approximation) regions. The latter characterization was obtained with the help of the Gv NDE program.

6.3.1.2. ZnAr. The most important result of the spectroscopy of the ZnAr molecule is the detection of the $D1$–$X0^+$ transition in both excitation (Fig. 46) and fluorescence (Figs. 47(a) and (b)) spectra which, together with the recorded “hot” bands, allowed a comprehensive ground-state characterization

$^{68}$ Ref. [273] was published during the approval process of Ref. [41] in Phys. Rev. A. See the Note added in proof in Ref. [41].

$^{69}$ Results of Refs. [40,42] were preliminary, and were improved later in additional experiments and in precise B–S and LR–B analyses.
[Fig. 46 plot: LIF signal (arb. units) vs. laser wavelength (2135–2160 Å) for the $D1 \leftarrow X0^+$ transitions of ZnNe, ZnAr and ZnKr; the position of the atomic transition, $E_{at}(4^1P_1$–$4^1S_0)$, is marked, and asterisks mark individual components.]
Fig. 46. A comparison of the $D1 \leftarrow X0^+$ transitions in the excitation spectra of ZnNe, ZnAr and ZnKr molecules reported in Refs. [40–42], respectively. The $D1_{v'} \leftarrow X0^+_{v''=0}$ progressions in all ZnRG molecules are situated on the long-wavelength side of the $4^1P_1$–$4^1S_0$ atomic transition. This leads to the conclusion that the D1 excited state is more strongly bound than the $X0^+$ ground state. Because of the experimental procedure (see text), the ZnKr spectrum contains ZnNe components, which are marked with asterisks. The analysis of the spectra resulted in the characteristics collected in Table 13.
in the short as well as in the intermediate regions of $R$ [40,42]. It appears that for this class of molecules (i.e. ZnRG) the measured $D1_{v'=10} \rightarrow X0^+$ bound–free profile [40,47] is the only experimental record of a ZnRG fluorescence spectrum reported in the literature. The studies concluded with a hybrid M–S(11.3, 9.0)–Morse potential representation (Sections 3.5.6 and 3.5.9) for the ground-state repulsive wall in the short and intermediate regions of $R$, and a Morse representation for the D1-state potential (mainly in the intermediate region). In Fig. 47(e) the M–S(11.3, 9.0) representation for the short-range region is compared with the Morse function, and it appears that the M–S potential is steeper than the Morse function. This shows the inadequacy of the latter function as a representation of the ground-state repulsive wall.

6.3.1.3. ZnKr. As for the ZnNe and ZnAr molecules, in the case of ZnKr the detection of “hot” bands in the $D1 \leftarrow X0^+$ transition of the excitation spectrum facilitated a more reliable ground-state characterization. What clearly sets apart the experimental approach of Ref. [42] from
Fig. 47. $D1_{v'=10} \rightarrow X0^+$ fluorescence spectrum (reflection in character, Section 4.2) of the ZnAr molecule reported in Refs. [40,47]. The spectrum was detected after a selective excitation of the $v' = 10$ vibrational level. (a) Gross spectrum detected with low spectral resolution of the detection system (150 cm$^{-1}$ monochromator band pass). (b) The shortest-wavelength part of the spectrum detected with a higher spectral resolution (30 cm$^{-1}$ slit-width). (c) Simulation of the bound–free part of the spectrum performed with the assumption that the excited state and the repulsive part of the ground-state potential are represented by Morse and M–S(11.3, 9.0) functions, respectively (the amplitudes of the first three short-wavelength maxima were changed). (d) Simulated bound–bound part of the spectrum (two shortest-wavelength maxima) generated on the assumption that Morse functions represent the bound wells of the D1 and $X0^+$ states. The individual F–CF corresponding to vibrational peaks (vertical bars) were represented by a Gaussian convolution function representing the monochromator throughput with a FWHM of 0.65 Å (i.e., 15 cm$^{-1}$). The vertical scale for the simulated bound–bound transitions differs from that for the bound–free transitions. The horizontal bar represents the range presented in the insert. (e) The M–S(11.3, 9.0) ground-state repulsive potential representation compared with a Morse function plotted using parameters determined in Refs. [40,47] (see Table 13).
that of Wallace et al. [216] is the use of a 5% Kr + 95% Ne mixture as a carrier gas rather than Kr mixed with Ar (so that no ZnAr components overlap the ZnKr components). As shown in Fig. 46, the ZnNe components are situated close to the atomic transition and do not interfere with those of ZnKr. The studies resulted in representations of the $X0^+$ as well as the D1 states using Morse functions in the intermediate ranges of $R$ [42,47] (see footnote 69).

6.3.2. Absence of evidence for the $A0^+$ and B1 triplet states

As discussed in Ref. [41], using the supersonic beam apparatus in the laboratory in Windsor, the direct optical excitation of the ZnRG (RG = Ne, Ar, Kr) molecules from their $X0^+$ ground states to the $A0^+$ and B1 triplet states was practically impossible. This is because the Zn($4^1S_0$–$4^3P_1$) oscillator strength is too low and the radiative lifetimes of the triplet states are longer than the transit time
from the nozzle to the interaction region. After approximately 27 μs$^{70}$ the ZnRG molecules, under conditions of a typical supersonic expansion (i.e. $P_0$ = 10 atm, $T_T \approx 2$ K), travel a distance three times larger than $X_T$ and rapidly dissociate in the region of high-temperature turbulence (Fig. 31). This considerably reduces the fluorescence-collection efficiency and makes a direct excitation of the ZnRG triplet states within the time window of the observation very unlikely. The long radiative lifetime of the Zn $4^3P_1$ state may be one of the possible explanations why the normally observed fluorescence from the triplet-asymptote $A0^+(n^3P_1)$ and $B1(n^3P_1)$ states ($n$ = 5 and 6 for CdRG and HgRG, respectively) cannot be detected in the case of the ZnRG molecules ($n$ = 4). The same conclusion was drawn in the ZnAr investigation [276].

6.3.3. Conclusions – ZnRG family

The experimental data for ZnRG molecules are relatively scarce compared to those for CdRG and HgRG. Therefore, the sets of spectroscopic constants of the ZnRG molecules are not as complete as those for the two remaining families of molecules. Summarizing the ZnRG (RG = Ne, Ar, Kr, Xe) characterization in the two electronic singlet states, $X0^+$ and D1 (Table 13), one can conclude that in the author’s [40–42,47] and other investigators’ (i.e. Wallace et al. [216,274,275] and McCaffrey et al. [273]) results for the ground- and excited-state well depths, certain trends are present:

$$D_e''(\mathrm{ZnNe}) < D_e''(\mathrm{ZnAr}) < D_e''(\mathrm{ZnKr}) < D_e''(\mathrm{ZnXe}), \tag{67a}$$

$$D_e'(\mathrm{ZnNe}) < D_e'(\mathrm{ZnAr}) < D_e'(\mathrm{ZnKr}) < D_e'(\mathrm{ZnXe}), \tag{67b}$$
which also characterized the CdRG and HgRG molecules in the corresponding electronic states (Sections 6.1.4 and 6.2.2). Similarly, the $D_e''$(ZnRG) vs. $\alpha_{RG}$ dependence, plotted according to the results of Refs. [40–42,47], shows a linear trend (Fig. 48(a)), which reflects an induced-dipole–induced-dipole interaction. Assuming the linear dependence, it is most likely that the reported ab initio and experimental $D_e''$ values of ZnXe are underestimated by some 20–25%. Obviously, this calls for additional investigation in the future. Also, analogously to all the MeRG ground states and to the D1 states in CdRG and HgRG molecules, the well depths of the D1 states in ZnRG increase as $\alpha_{RG}$ increases; however, a linear dependence is not observed (Fig. 48(b)) and the L–D theory (53) is not applicable. From the comparison shown in Fig. 48(b) it is evident that, as was the case for the D1 states of CdRG, the ab initio $D_e'$ values are smaller than the experimental ones for all ZnRG (RG = Ne, Ar, Kr, Xe) molecules. Therefore, the possible explanation that was proposed in the case of the CdRG D1 states (Section 6.1.4) is also applicable here. The ab initio values [133,134] shown in Fig. 48(b) were obtained taking into account not only the Zn but also the RG valence electrons (the Zn$^{20+}$ and RG$^{8+}$ cores were modelled by $\ell$-dependent scalar relativistic pseudopotentials, and a core-polarization potential was applied for Zn$^{20+}$, see Table 2). As for CdRG, the systematic deviations of the ab initio $D_e'$ values from the experimental ones can be reduced by treating the spin–orbit interaction in a more advanced manner [133,134].

$^{70}$ The radiative lifetime $\tau$(Zn $4^3P_1$) = 27 μs is longer than the estimated transit time from the nozzle to the end of the “zone of silence” (Mach disk) in the supersonic beam. This is not the case for the radiative lifetimes of the Cd $5^3P_1$ and Hg $6^3P_1$ atomic states, which are reported to be $\tau$ = 2.3 and 0.1 μs, respectively [277].
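The lifetime argument of footnote 70 can be made concrete with a short calculation. The sketch below assumes simple exponential radiative decay and uses a purely hypothetical transit time of 5 μs (the review does not quote one here); the function name and the transit-time constant are illustrative only. It shows why, for the same flight time, most excited Cd and Hg atoms radiate inside the beam while most excited Zn atoms do not.

```python
import math

def fraction_radiated(transit_time_us: float, lifetime_us: float) -> float:
    """Fraction of excited species that decay radiatively within the transit time,
    assuming a single-exponential decay with the given lifetime."""
    return 1.0 - math.exp(-transit_time_us / lifetime_us)

# Radiative lifetimes quoted in footnote 70, in microseconds.
LIFETIMES_US = {"Zn 4^3P_1": 27.0, "Cd 5^3P_1": 2.3, "Hg 6^3P_1": 0.1}

# Hypothetical transit time (assumption for illustration, not a value from the review).
ASSUMED_TRANSIT_US = 5.0

for state, tau in LIFETIMES_US.items():
    frac = fraction_radiated(ASSUMED_TRANSIT_US, tau)
    print(f"{state}: {100.0 * frac:.1f}% radiated within {ASSUMED_TRANSIT_US} us")
```

With a different assumed transit time the percentages change, but the ordering Zn < Cd < Hg follows directly from the ordering of the lifetimes.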
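Table 13. Summary of the $X0^+(^1\Sigma^+)$- and $D1(^1\Pi)$-state potential characteristics for the ZnRG molecules (RG = rare gas). Results of the author’s studies carry footnotes b, c and d. The most recent ab initio values of Refs. [133,134] are included. Phenomenological ground-state long-range characteristics are collected in Tables 5–7. Note: $\Delta R_e = R_e' - R_e''$.

| Designation | ZnHe | ZnNe | ZnAr | ZnKr | ZnXe |
|---|---|---|---|---|---|
| $X0^+(^1\Sigma^+)$ | | | | | |
| $D_e''$ (cm$^{-1}$) | 10.5$^{a}$; 7.6$^{n}$ | 23.6 ± 1.2$^{b}$; $18^{+12}_{-3}$ $^{f}$; 26.7$^{n}$ | 81.7$^{c}$; 57.1 ± 0.5$^{g}$; 89.9$^{n}$ | 123$^{d}$; 115$^{h}$; 118.3$^{n}$ | 162 ± 1$^{e}$; 157$^{n}$ |
| $R_e''$ (Å) | 4.39$^{i}$; 4.46$^{n}$ | 4.42 ± 0.06$^{b,i}$; 4.16 ± 0.10$^{f}$; 4.20$^{n}$ | 4.38 ± 0.02$^{c,i}$; 4.18 ± 0.07$^{g}$; 4.23$^{n}$ | 4.36 ± 0.03$^{d,i}$; 4.20$^{h}$; 4.27$^{n}$ | 4.32$^{i}$; 4.38$^{e}$; 4.42$^{n}$ |
| $\omega_e''$ (cm$^{-1}$) | — | 15.06 ± 1.80$^{b}$; 15 ± 3$^{f}$; 10$^{n}$ | 19.8$^{c}$; 23 ± 1$^{g}$; 17$^{n}$ | 17.2$^{d}$; 13.5$^{h}$; 17.3$^{n}$ | 13.2 ± 1.2$^{e}$; 18.2$^{n}$ |
| $\omega_e'' x_e''$ (cm$^{-1}$) | — | 2.68 ± 0.90$^{b}$; 2.1$^{f}$ | 1.2$^{c}$; 2.3 ± 0.5$^{g}$ | 0.6$^{d}$; 0.4$^{h}$ | 0.27 ± 0.05$^{e}$ |
| $n$ or $n_0$ | — | 14.3 | 11.3$^{c}$ | 8.3 | 4.5 |
| $n_1$ | — | 9.8$^{k}$ | 9.0$^{l}$ | 6.7$^{k}$ | — |
| $\beta''$ ($\times 10^{8}$ Å$^{-1}$) | — | 1.566$^{b}$ | 1.329$^{c}$ | 1.143$^{d}$ | — |
| $D1(^1\Pi)$ | | | | | |
| $D_e'$ (cm$^{-1}$) | 99.7$^{n}$ | 75.53 ± 1.00$^{b}$; $71.1^{+12}_{-3}$ $^{f}$; 61.5$^{n}$ | 690$^{c}$; 667 ± 5$^{g}$; 508.5$^{n}$ | 1476$^{d}$; 1400 ± 32$^{h}$; 1164$^{n}$ | 3241 ± 142$^{e}$; 2703$^{n}$ |
| $R_e'$ (Å) | 3.14$^{n}$ | 3.58 ± 0.08$^{b,m}$; 3.48 ± 0.06$^{f}$; 3.62$^{n}$ | 3.18 ± 0.03$^{c,m}$; 2.97 ± 0.03$^{g}$; 3.04$^{n}$ | 2.96 ± 0.05$^{d,m}$; 2.74 ± 0.03$^{h}$; 2.86$^{n}$ | 2.82$^{n}$ |
| $\Delta R_e$ (Å) | — | −0.85 ± 0.02$^{b}$; −0.68$^{f}$ | −1.20 ± 0.01$^{c}$; −1.21 ± 0.02$^{g}$ | −1.40 ± 0.02$^{d}$; −1.38$^{h}$ | −1.55 ± 0.04$^{e}$ |
| $\omega_e'$ (cm$^{-1}$) | 39.7$^{n}$ | 25.35 ± 1.07$^{b}$; 25.8 ± 0.4$^{f}$; 16.1$^{n}$ | 62$^{c}$; 61.57 ± 0.07$^{g}$; 47$^{n}$ | 82.7$^{d}$; 81.0 ± 0.6$^{h}$; 70$^{n}$ | 116.9 ± 2$^{e}$; 106.5$^{n}$ |
| $\omega_e' x_e'$ (cm$^{-1}$) | — | 2.99 ± 0.89$^{b}$; 2.34 ± 0.20$^{f}$ | 1.4$^{c}$; 1.442 ± 0.003$^{g}$ | 1.16$^{d}$; 1.17 ± 0.02$^{h}$ | 1.05 ± 0.03$^{e}$ |
| $\beta'$ ($\times 10^{8}$ Å$^{-1}$) | — | 1.654$^{b}$ | 1.435$^{c}$ | 1.590$^{d}$ | — |

$^{a}$ Hypothetical values from the $D_e''$ vs. $\alpha_{RG}$ dependence of Ref. [260]. $^{b}$ Ref. [41]. $^{c}$ Refs. [40,42,47], footnote 69. $^{d}$ Ref. [42], footnote 69. $^{e}$ Ref. [275]. $^{f}$ Ref. [273]. $^{g}$ Ref. [274]. $^{h}$ Ref. [216]. $^{i}$ With the aid of the L–P method of Refs. [9,226], Section 4.5.1. $^{j}$ From Eq. (51). $^{k}$ $n_1$ chosen so that the M–S($n_0$, $n_1$) potential has the same slope as the Morse potential. $^{l}$ From simulation of the $D1_{v'=10} \rightarrow X0^+$ bound–free spectrum (Fig. 47(c)). $^{m}$ From $\Delta R_e$ obtained in simulation of the $D1 \leftarrow X0^+$ spectrum. $^{n}$ Ab initio values of Refs. [133,134].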
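The ZnAr ground-state entries of Table 13 can be turned into the two analytic potentials that the text compares. The sketch below is illustrative only: it assumes the Maitland–Smith form with an exponent varying linearly in $R/R_e$, $n = n_0 + n_1 (R/R_e - 1)$, which may differ in detail from the definition used in Section 3.5.6, and it takes the Morse range parameter as 1.329 Å$^{-1}$ (an assumption about the unit). The function names are hypothetical.

```python
import math

def morse(r: float, de: float, re: float, beta: float) -> float:
    """Morse potential referred to the dissociation limit (minimum at -de)."""
    return de * ((1.0 - math.exp(-beta * (r - re))) ** 2 - 1.0)

def maitland_smith(r: float, de: float, re: float, n0: float, n1: float) -> float:
    """Assumed M-S(n0, n1) form: (n, 6) potential whose exponent n varies with R/Re."""
    x = r / re
    n = n0 + n1 * (x - 1.0)
    return de * (6.0 / (n - 6.0) * x ** (-n) - n / (n - 6.0) * x ** (-6))

# ZnAr ground-state values as listed in Table 13 (author's entries).
DE = 81.7      # well depth, cm^-1
RE = 4.38      # bond length, Angstrom
BETA = 1.329   # Morse range parameter, taken here in Angstrom^-1 (assumption)
N0, N1 = 11.3, 9.0

for r in (3.0, 3.5, 4.0, RE):
    u_morse = morse(r, DE, RE, BETA)
    u_ms = maitland_smith(r, DE, RE, N0, N1)
    print(f"R = {r:.2f} A   Morse = {u_morse:9.1f} cm^-1   M-S = {u_ms:9.1f} cm^-1")
```

Both functions are normalised so that the minimum lies at $-D_e''$ for $R = R_e''$; any statement about which wall is steeper depends on the exact functional form and parameters adopted, so the printed numbers should be read as a qualitative guide.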
[Fig. 48, panels (a)–(d): $D_e''$(ZnRG) and $D_e'$(D1, ZnRG) in cm$^{-1}$, and $[R_e''(\mathrm{ZnRG})]^6$ and $[R_e'(D1, \mathrm{ZnRG})]^6$ in Å$^6$, plotted against $\alpha_{RG}$ in Å$^3$ for RG = He, Ne, Ar, Kr, Xe. Legend entries: the author’s Refs. [40–42]; hypothetical value, Ref. [260]; Refs. [216,273–275]; ab initio, Refs. [133,134].]
Fig. 48. (a)–(b) Well depths, $D_e$, and (c)–(d) bond lengths, $R_e^6$, plotted as a function of the RG polarizability, $\alpha_{RG}$, for the $X0^+$ ground ((a), (c)) and D1 excited ((b), (d)) states of ZnRG molecules. Results of Refs. [40–42] (the linear fit in (a) produces a 49.6 cm$^{-1}$ Å$^{-3}$ slope) are compared with those of Refs. [216,273–275] and the hypothetical value of Ref. [260]. Results of the ab initio calculations of Czuchaj and Krośnicki [133] and Czuchaj et al. [134] are also shown. Inserts illustrate mutual orientations of the electron density distributions in the ground and excited molecular electronic states.
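The linear fits quoted in the captions of Figs. 44(a) and 48(a) lend themselves to a quick consistency check of the extrapolated Xe well depths. The sketch below assumes that the fitted lines pass through the origin and uses a commonly tabulated Xe polarizability of 4.04 Å$^3$; neither assumption is stated in the captions, and the function names are illustrative. The reported ZnXe values are those of Table 13.

```python
# Slopes of the linear D_e'' vs. alpha_RG fits quoted in the figure captions,
# in cm^-1 per cubic Angstrom.
SLOPE = {"Hg": 75.1, "Zn": 49.6}

# Xe polarizability in cubic Angstrom (assumed literature value, not from the review).
ALPHA_XE = 4.04

def well_depth_from_fit(metal: str, alpha_rg: float) -> float:
    """Ground-state well depth (cm^-1) on a fitted line assumed to pass through the origin."""
    return SLOPE[metal] * alpha_rg

def increase_needed_percent(reported: float, extrapolated: float) -> float:
    """Percentage by which a reported value would have to grow to reach the fitted line."""
    return 100.0 * (extrapolated - reported) / reported

hgxe = well_depth_from_fit("Hg", ALPHA_XE)
znxe = well_depth_from_fit("Zn", ALPHA_XE)
print(f"HgXe on the fitted line: {hgxe:.1f} cm^-1")
print(f"ZnXe on the fitted line: {znxe:.1f} cm^-1")

# Reported ZnXe ground-state well depths from Table 13 (experiment, ab initio).
for label, value in (("experiment", 162.0), ("ab initio", 157.0)):
    print(f"ZnXe {label}: +{increase_needed_percent(value, znxe):.0f}% needed")
```

Under these assumptions the HgXe extrapolation lands close to the 304 cm$^{-1}$ quoted in Section 6.2.2, and the reported ZnXe values fall short of the fitted line by roughly a quarter.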
Figs. 48(c) and (d) present the $R_e^6$ vs. $\alpha_{RG}$ dependence (according to relationship (53)) plotted for the $X0^+$ and D1 states, respectively, using the results of Refs. [40–42,47] as well as those of other experimental [216,273–275] and ab initio [133,134] investigations. Comparing the dependencies obtained for the $X0^+$ state (Fig. 48(c)), one can find striking differences between the three sets of results. The ground-state bond lengths obtained in Refs. [40–42,47] are related as follows (except for $R_e''$(ZnHe)):

$$R_e''(\mathrm{ZnNe}) > R_e''(\mathrm{ZnAr}) > R_e''(\mathrm{ZnKr}) > R_e''(\mathrm{ZnXe}), \tag{68a}$$
reflecting a regular decrease of $R_e''$ as $\alpha_{RG}$ increases (the same was true for $R_e''$(CdRG)). As for CdRG, this is consistent with the previously observed behaviour of $D_e''$ vs. $\alpha_{RG}$ shown in Fig. 48(a) and with the L–D model (53). For the ab initio and other experimental $R_e''$(ZnRG) values of Refs. [216,273–275] (except for $R_e''$(ZnHe)) the reverse of (68a) is the case, which is rather puzzling and difficult to explain. For the D1-state bond lengths (Fig. 48(d)), the general tendency is the same (as it was for the D1 state of CdRG) for both the author’s [40–42,47] and other investigators’ [216,273–275] experimental results as well as for the ab initio [133,134] results, and is described by the following relationship (except
the ab initio $R_e'(D1)$ of ZnHe):

$$R_e'(\mathrm{ZnNe}) > R_e'(\mathrm{ZnAr}) > R_e'(\mathrm{ZnKr}) > R_e'(\mathrm{ZnXe}). \tag{68b}$$

Again, as for CdRG, this tendency is consistent with that obtained for the $X0^+$ states, and can be qualitatively illustrated using electron density distributions in the ground-RG and excited-metal atomic states. As $\alpha_{RG}$ increases, the electron density distributions, shown in the inserts of Figs. 48(a) and (b), approach each other. Overall, this represents a consistent model as far as the results of the studies in Refs. [40–42,47] are concerned. Inspecting Fig. 48 as well as the numbers collected in Table 13, one can see that the ground-state well depths and bond lengths of ZnRG are smaller and larger, respectively, than the corresponding $D_e'$ and $R_e'$ values for the D1 excited state. As already discussed above for CdRG and in Section 3.5.1 (Fig. 9), this is caused by the mutual orientation of the electron density distributions in the $X0^+(^1\Sigma^+)$ and $D1(^1\Pi)$ molecular states. In the D1 states, the 4p-orbital oriented perpendicularly to the internuclear axis (i.e., $\pi$-alignment) approaches the spherically symmetric $4p^6$-orbital of the RG atom. This results in larger well depths and shorter bond lengths than those of the $X0^+$ states.

6.4. MeRG families of molecules – comparison

Results of the author’s studies as well as other experimental investigations make it possible to compare well depths and bond lengths of the $X0^+$ ground and D1 excited states for all the MeRG molecules discussed here (Tables 11–13). Fig. 49 shows plots of $D_e$ and $R_e^6$ vs. $\alpha_{RG}$ for $X0^+$ ((a), (b)) and D1 ((c), (d)) states of MeRG molecules, for which data were complete enough for this comparison. Concerning the ground-state characteristics of Figs. 49(a) and (b), an interesting experimental trend can be observed. For all four molecular families, i.e. MeNe, MeAr, MeKr and MeXe (excluding the experimental $D_e''$ and $R_e''$ of MeHe), the ground-state well depths and bond lengths increase and decrease, respectively, in the following sequences:

$$D_e''(\mathrm{ZnRG}) < D_e''(\mathrm{CdRG}) < D_e''(\mathrm{HgRG}), \tag{69a}$$

$$R_e''(\mathrm{ZnRG}) > R_e''(\mathrm{CdRG}) > R_e''(\mathrm{HgRG}). \tag{69b}$$
The experimental trends (69) are consistent with various results of ab initio calculations [72,133–137], especially for the ground-state well depths, as is evident from a comparison of Fig. 49(a) with Fig. 49(c). Concerning bond lengths, the distinct experimental trend from Fig. 49(b) is also present in the ab initio values for MeNe and, partly, for MeHe molecules (Fig. 49(d)). As seen in Fig. 49(a), there is no particular trend in $D_e''$(MeRG) with respect to the Me-atom polarizabilities $\alpha_{Me}$,$^{71}$ as would be expected from the L–D theory (Eq. (53) and Tables 4–6). On the contrary, the HgRG molecules have the largest well depths despite the fact that Hg has the smallest polarizability of all the Me atoms. This will be discussed in more detail in Section 6.5 while analysing regularities in the ground-state bonding of Me$_2$. The pronounced increase of well depths (69a) can be explained qualitatively in a simple manner following a suggestion of Ref. [278], and particularly of Ref. [216], by assuming a steric effect, which

$^{71}$ In fact, even though the $\alpha_{Me}$ values are close to each other, the relation $\alpha_{Hg} < \alpha_{Zn} < \alpha_{Cd}$ is present, as obtained from experimental [157–159] and theoretical [110,128] studies.
[Fig. 49, panels (a)–(d): $D_e''$(MeRG) in cm$^{-1}$ and $[R_e''(\mathrm{MeRG})]^6$ in Å$^6$ plotted against $\alpha_{Me}$ in Å$^3$ for Me = Hg, Zn, Cd, with separate series for MeHe, MeNe, MeAr, MeKr and MeXe; (a), (b) author’s studies; (c), (d) ab initio.]
Fig. 49. (a)–(b) Experimental and (c)–(d) ab initio ground-state well depths, De'' (for the HgHe ground state the experimental D0'' is given instead), and bond lengths, (Re'')⁶, plotted vs. αRG for MeRG molecules according to the results of Refs. [40–56] and the ab initio results of Refs. [72,133–137]. In those cases where the author’s studies do not provide the necessary data, results of other experimental investigations were adopted: for X0+ of HgXe [233] and ZnXe [275].
relates to differences in the electron charge density in the Me-atom ns² shell and in the Me-atomic radii. In the sequence of Zn:Cd:Hg atoms, their atomic radii, calculated as half the distance of closest approach of atomic centres in the crystalline state, are 1.33 : 1.48 : 1.56 Å [272], respectively. Assuming a simple picture of spherical electron density distributions of the interacting atoms (inserts in Figs. 40(a), 44(a) and 48(a)), it is possible that two neutral atoms such as Zn and RG establish an equilibrium distance that is effectively longer than the one established by the same RG atom and the larger Cd or Hg atom. This apparently surprising conclusion may be a result of a different electron charge of the outer s² shell of the Me atoms. The RG atom may partly penetrate the outer ns² shells, and the penetration is more effective in the case of Hg(6s²) than in the case of the Zn(4s²) outer shell. Under such an assumption, it is a result of the effective difference in the “e−–e−” repulsive interaction of the outer electrons of both atoms. This results in relation (69b), which supports the general model described above.

A very interesting analysis of the Me–RG interaction in higher excited pure electronic states is presented in Ref. [216], in which a simplified “model potential” is employed:

$$U(R) = \frac{C_{12}}{R^{12}} - \frac{\alpha_{\mathrm{RG}}\,(Ze)^2}{2R^4}, \tag{70}$$
Fig. 50. Comparison of the ground-state interatomic potentials of (a) MeAr and (b) MeKr vdW molecules. The MeAr and MeKr ground-state repulsive parts were determined through the modelling of bound–free fluorescence spectra (except ZnKr, for which a Morse approximation was used). They are represented by M–S (11.3,9.0) [40,47], M–S (10.6,7.0) [47], M–S (11.3,10.8) [54], M–S (8.6,7.3) [48,49] and M–S (10.6,7.0) [56] for ZnAr, CdAr, HgAr, CdKr and HgKr, respectively. Thick lines represent ranges actually probed in the experiment.
where the first and second parts of U(R) are the repulsive short-range term (compare with Eq. (21)) and the long-range electron (or ion, e.g. Z = 1)–molecule interaction term, respectively, and Z and e are the effective charge on the Me atom and the unit charge, respectively. Using condition (50) (Section 4.2.1), one can calculate the excited-state vibrational frequency

$$\omega_e^{(\mathrm{calc})}\,[\mathrm{cm^{-1}}] = \sqrt{\frac{1620\,D_e\,[\mathrm{cm^{-1}}]}{\mu\,[\mathrm{a.m.u.}]\;R_e^{2}\,[\mathrm{\AA}^{2}]}}. \tag{71}$$

In analyses of the C¹1(n¹P1)-state parameters in the MeNe as well as HgAr, HgKr and HgXe molecules, the D1-state vibrational frequencies were calculated and compared with those obtained from the experiments of other investigators (Tables 11–13). It was found that for MeNe and the majority of HgRG molecules the agreement is very good (discrepancies are smaller than 1%); however, for molecules with heavier RG atoms (e.g. HgKr, HgXe) the ωe(expt) vs. ωe(calc) discrepancies are larger (approximately 5–10%). This led to a re-examination of the C¹1 ← X0+ transitions in the excitation spectra of HgKr and HgXe published by Tsuchizawa et al. [264] (Table 12), and consequently resulted in slightly different C¹1-state characteristics as well as better agreement with the calculated ωe(calc) values. Thus, it is the author’s belief that the “model potential” presented in Ref. [216], simple as it is, is a significant step towards a better understanding of the nature of the vdW bonding in the higher excited states.

As already discussed above, the ground-state repulsive parts in CdRG (Fig. 42) and HgRG (Fig. 45) molecules have similar slopes within each of these families. It is very interesting to compare the short-range parts of the ground-state potentials for the MeAr and MeKr families. Fig. 50 presents a comparison of the repulsive branches determined in the (a) MeAr and (b) MeKr molecular families.
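As a rough illustration of Eq. (71), the sketch below evaluates ωe(calc) from a well depth, a reduced mass and a bond length, and checks that a numerical factor close to 1620 follows from a 12–4 potential of the form of Eq. (70). The route taken here is an assumption of this example, not a derivation quoted from Ref. [216]: the equilibrium condition for U(R) = C12/R^12 − C4/R^4 gives a force constant k = 48 De/Re², which is then inserted into the harmonic-oscillator expression for ωe. The function names are illustrative and the input numbers in the usage line are hypothetical, not values from Tables 11–13.

```python
import math

# CODATA constants; the speed of light is kept in cm/s so that
# wavenumbers come out directly in cm^-1.
H_PLANCK = 6.62607015e-34       # J s
C_CM_PER_S = 2.99792458e10      # cm/s
AMU_KG = 1.66053906660e-27      # kg
ANGSTROM_M = 1.0e-10            # m


def prefactor_12_4() -> float:
    """Factor K in omega_e^2 = K * De / (mu * Re^2) for a 12-4 potential.

    Units: De and omega_e in cm^-1, mu in a.m.u., Re in Angstrom.
    Assumes k = 48 * De / Re^2 (equilibrium condition of the 12-4 form).
    """
    return 48.0 * H_PLANCK / (
        4.0 * math.pi**2 * C_CM_PER_S * AMU_KG * ANGSTROM_M**2
    )


def omega_e_calc(de_cm: float, mu_amu: float, re_angstrom: float,
                 factor: float = 1620.0) -> float:
    """Vibrational frequency (cm^-1) in the form of Eq. (71)."""
    return math.sqrt(factor * de_cm / (mu_amu * re_angstrom**2))


# The factor obtained from fundamental constants is close to 1620.
print(round(prefactor_12_4(), 1))

# Hypothetical inputs: De = 100 cm^-1, mu = 20 a.m.u., Re = 3 Angstrom.
print(omega_e_calc(100.0, 20.0, 3.0))
```

The default `factor` is the rounded value printed in Eq. (71); passing `prefactor_12_4()` instead changes the result by well under 0.1%.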
As opposed to CdRG and HgRG, in which the presence of the RG-atom np-orbital, even for different αRG, does not influence the repulsive part of the potential, in the present case the repulsive parts in the MeAr or MeKr families have different slopes, showing a direct influence of the Me atoms (with different αMe) on the interatomic potential in this region of R. This immediately implies that, for the same RG atom, the Me atom plays a more important role than the RG atom plays in each of the CdRG or HgRG families. The different slopes of the repulsive potentials show a much more diverse influence of the spherically symmetric Me-atom ns-orbital in close approach with the RG-atom np-orbital (here the He-atom 1s-orbital is excluded, as the behaviour of the molecular parameters in MeHe departs strongly from the MeRG regularities discussed above). It is evident that the vdW interaction, which in the simplest L–D model depends directly on αMe (e.g. Eq. (53)) and which dominates in the long-range region, here slightly modifies the degree of repulsion between RG and different 12-group Me atoms.

Summarizing the characterization of the MeRG vdW molecules in their ground and several excited states, one can certainly say that, despite their simplicity, they are physical objects that are not easy to describe. Existing methods allow as much information as possible to be extracted from experimental data to determine the molecular interatomic potentials. As demonstrated, these methods complement one another, giving a reasonably reliable picture of PE curves that can be compared with the results of ab initio calculations. However, there are still several working theories that try to find and reliably describe regularities that may exist in these small systems bound by very weak long-range forces. As shown here, this is quite a difficult task, and each observation is a challenge to interpret. However, it may indicate directions for possible future experimental and theoretical studies.

6.5. Me2 dimers

An adequate introduction to the spectroscopy of 12-group Me2 dimers (Zn2, Cd2 and Hg2) has been given in Section 3.5.2 where, as an example, a PE-curve diagram of Hg2 (Fig. 10) was shown and all molecular states correlating with the 6¹S0, 6³PJ and 6¹P1 Hg atomic asymptotes (Fig. 11) were listed.
In the 12-group Me2 dimers, the weak bonding interaction in the X0g+(X¹Σg+) ground state is usually denoted as a vdW (dispersion) interaction. However, as recently shown [106,110,111,128,279], covalent bonding contributions appear in addition to pure vdW interactions in the ground states of these complexes. From ab initio studies of the Me2 dimers, these contributions are expected to manifest their influence through short-range induction effects (“softening” of the repulsive wall) and in the vicinity of the equilibrium separation (Re larger than that for a pure vdW molecule) [106,110]. They play a significant role in the stabilization of Me2, in which the Me2 dimers differ from their RG2 counterparts.

6.5.1. Hg2 interatomic potentials from excitation and fluorescence spectra

Before 1994, spectroscopy of Hg2 molecules produced in supersonic beams was the subject of three scientific reports [197,198,280]. As already stated in Section 3.5.2, the preparation of a stable Hg2 ground state in the supersonic beam presents the possibility of a direct excitation of the odd (“ungerade”) states from the even (“gerade”) ground state, if allowed by the F–C “window” for the excitation. This efficiently extended the possibilities of investigating the odd Hg2 states that are not accessible in vapour-cell experiments (where most of the Hg population resides in the form of free atoms), and created an opportunity to investigate the properties of the Hg2 ground state, allowing a direct determination of its spectroscopic characteristics.

The D1u ← X0g+ transition in the excitation spectrum was investigated by Zehnacker et al. [197]. Additionally, the isotopic structure of a vibrational band was resolved and a v'-assignment was given with ±1 accuracy. The well depths, vibrational frequencies and anharmonicities were determined for the X0g+ and D1u states assuming Morse representations for the two states. Although the absolute values of the equilibrium internuclear separations were not determined, the difference ΔRe = Re'(D1u) − Re'' = −1.1 ± 0.1 Å was concluded. It was also suggested that the value of Re'' = 3.25 ± 0.20 Å known from the literature [12,122,150] is most probably too small.

The rotationally resolved single-isotopic m1 + m2 = 402 component (Table 3) in the D1u ← X0g+ transition as well as the F0u+ ← X0g+ transition in excitation were analysed by van Zee et al. [280]. The former resulted in a direct determination of the ground-state bond length and of the X0g+- and D1u-state rotational characteristics (see footnote 72), while the latter led to the conclusion that the 2540 Å band consisted of a series of sequence bands. All the studied Hg2 electronic energy states (X0g+, D1u and F0u+) were assumed to be represented by Morse functions for the sake of result interpretation.

The G0u+ ← X0g+ transition in absorption was detected and analysed by Schlauf et al. [198]. Instead of a laser, a UV deuterium lamp was used. A B–S analysis of the v' ← v'' = 0 progression (and therefore a Morse representation for the G0u+-state potential) resulted in a determination of the excited-state characteristics, with the ground-state spectroscopic constants of Ref. [280] assumed. The v'-assignment was determined within a ±5 error margin.

At this point it is necessary to mention a vapour-cell experiment on high-resolution spectroscopy of the G0u+ ← A0g+ transition in the (²⁰²Hg)2 single isotopomer [83], which provided very accurate (vibrational and rotational) spectroscopic constants of the G0u+ excited state as well as a suggestion that the G0u+-state v'-assignment of Ref. [198] is inaccurate. The precise values determined in Ref. [83] were later used as reference data in studies of the G0u+ ← X0g+ transition (see below). Furthermore, Ref. [83] also reports the detection and analysis of the G0u+,v'=12 → X0g+ fluorescence band. Simulation of its bound–free part resulted in an L–J(n − 6), n = 6.53 representation for the ground-state repulsive part, however with the assumption of Re'' = 3.63 ± 0.04 Å from Ref. [280]. Table 14 collects the results of all investigations described above, compared with the results of the author’s studies and the most reliable ab initio calculations.

Ab initio calculations for the ground- and excited-state PE curves of Hg2 were carried out by several investigators (Table 2). The most recent [93] were performed using self-consistent field (SCF) multireference configuration interaction (MRCI) and a two-valence-electron energy-adjusted pseudopotential representing the Hg core. The spin–orbit effects were taken into account only approximately. Furthermore, the Hg2 ground-state interatomic potential was investigated [92,110,155] using large-scale ab initio relativistic calculations. Significantly large basis sets for modelling the Hg atom as a 2- and 20-valence-electron system were employed, including spin–orbit effects. The ground-state spectroscopic parameters reported there (Table 14), obtained at the highest level of approximation, are in excellent agreement with the results of Refs. [60–62].

6.5.1.1. X0g+ ground state. The characterization of the Hg2 ground state was performed through detection and analysis of “hot” bands in the F0u+ ← X0g+ [60] (Fig. 16) and E1u ← X0g+ [61] (Fig. 23) transitions of the excitation spectra. A very thorough analysis of the Δv = −2, −1, 0, 1 and 2 sequences in the F0u+ ← X0g+ transition (also Fig. 15(a)) and of the v' = 0 ← v'' progression in the E1u ← X0g+ transition allowed the determination of the ground-state vibrational constants and of a Morse representation of the potential below its dissociation limit. Fluorescence bands detected after a selective excitation of different v' levels

[Footnote 72] As pointed out in Ref. [60], a discrepancy between the values for ωe'(F0u+) and ωe'' of Ref. [280] has been observed (see their Table 1). The most reasonable solution is to swap one for the other, as this is consistent with the Morse approximation employed there.
Table 14. Summary of the X0g+(¹Σg+) ground and 0u+(³Πu) excited state potential characteristics in Me2 (Me = Zn, Cd, Hg), 1u(³Πu) potentials in Cd2 and Hg2, as well as D1u(³Σu+) and G0u+(¹Σu+) excited state potential characteristics in Hg2 only. Results of the author’s studies are the entries marked a–g, m and t. The most recent ab initio values as well as results of other experiments are included. Ground-state long-range characteristics are given in Tables 5–7 (Section 4.4). Note: ΔRe = Re' − Re''. Letters in parentheses refer to the notes below the table.

Designation | Zn2 | Cd2 | Hg2
----------- | --- | --- | ---
X0g+(¹Σg+) | | |
De'' (cm−1) | 279.1(a); 274(l); 194 ± 56(p); 220(q) | 330.5(a); 323(l); 250 ± 40(p); 316.5(r) | 380 ± 15(b); 350 ± 20(h); 296(n); 379.1(o); 379.149(s)
Re'' (Å) | 4.19(a); 3.88 ± 0.05(p); 4.12(q) | 4.07(a); 4.05 ± 0.03(p); 4.39(r) | 3.69 ± 0.01(d); 3.63 ± 0.04(i); 3.94(n); 3.730(o); 3.72(s)
Be'' (cm−1) | – | – | 0.0123 ± 0.0001(f); 0.0127 ± 0.0003(i); 0.0122 ± 0.0003(s)
ωe'' (cm−1) | 25.9 ± 0.2(a); 25.7 ± 0.2(l); 25 ± 2(p) | 23.0 ± 0.2(a); 23.0 ± 0.2(l); 21 ± 1(p) | 19.6 ± 0.3(c); 19 ± 2(h); 19.7 ± 0.5(i); 19(o); 19.6446(s)
ωe''xe'' (cm−1) | 0.60 ± 0.05(a) | 0.40 ± 0.01(a) | 0.26 ± 0.03(b); 0.25(h); 0.27(i); 0.2265(s)
n, L–J(n − 6) | – | – | 6.21 ± 0.03(d,m); 6.53(k)
0u+(³Πu) | | |
De' (cm−1) | 220.7 ± 1.0(a); 215.0 ± 0.5(l); 130(q) | 260 ± 1(a); 252 ± 0.5(l); 250(r) | 432 ± 10(b); 410 ± 20(i); 313(n)
Re' (Å) | 4.49(a); 4.37(q) | 4.33(a); 4.71(r) | 3.66 ± 0.04(b); 3.61 ± 0.5(i); 4.10(n)
ΔRe (Å) | 0.300 ± 0.015(a) | 0.26 ± 0.03(a) | 0.030 ± 0.002(b); −0.02(i)
ωe' (cm−1) | 20.3 ± 0.2(a); 20.1 ± 0.2(l) | 18.50 ± 0.02(a); 18.4 ± 0.2(l) | 18.6 ± 0.4(b); 18.5 ± 0.5(i)
ωe'xe' (cm−1) | 0.47 ± 0.02(a); 0.47 ± 0.05(l) | 0.330 ± 0.005(a); 0.33(l) | 0.20 ± 0.02(b); 0.21(i)
1u(³Πu) | | |
De' (cm−1) | – | 723 ± 10(t); 845.5 ± 20(u) | 1660 ± 40(c); 1305(n)
Re' (Å) | – | 3.93 ± 0.05(t); 3.3 ± 0.3(u) | 3.38 ± 0.04(c); 3.445 ± 0.002(e); 3.44(n)
ΔRe (Å) | – | −0.14 ± 0.02(t) | −0.250 ± 0.004(c); −0.245 ± 0.002(e)
ωe' (cm−1) | – | 28.7 ± 1.0(t) | 40.2 ± 0.3(c)
ωe'xe' (cm−1) | – | 0.220 ± 0.005(t) | 0.18 ± 0.02(c)

Hg2 | D1u(³Σu+) | G0u+(¹Σu+)
--- | --- | ---
De' (cm−1) | 8100 ± 200(c); 8385 ± 100(g); 8260 ± 200(h); 6204(n) | 8280 ± 15(d); 7092 ± 700(j); 7832(n)
Re' (Å) | 2.710 ± 0.005(e); 2.5 ± 0.1(i); 2.83(n) | 3.00 ± 0.03(j); 2.8506(k); 2.91(n)
ΔRe (Å) | −0.980 ± 0.005(e); −1.1 ± 0.1(h) | −0.8394 ± 0.0100(d); −0.63 ± 0.03(j)
Be' (cm−1) | 0.0228 ± 0.0001(f) | 0.020542 ± 0.000001(k)
ωe' (cm−1) | 127.0 ± 0.6(c); 129.5 ± 0.3(g); 133 ± 1(h) | 79 ± 1(j); 88.5901 ± 0.0003(k)
ωe'xe' (cm−1) | 0.50 ± 0.01(c); 0.52(h) | 0.22 ± 0.03(j); 0.29566 ± 0.00002(k)

(a) Ref. [58].
(b) Ref. [60].
(c) Ref. [61]; Re'(E1u) obtained from ΔRe and Re'' of Ref. [250].
(d) Ref. [62].
(e) This work, Figs. 17 and 18 (Section 3.6.3) and Figs. 22 and 23 (Section 4.1.3).
(f) This work, using Re'' of Ref. [62] and Re'(D1u) of this analysis.
(g) This work, from analysis of the isotope shift in the D1u ← X0g+,v''=0 transition (Section 3.6.3).
(h) Ref. [197].
(i) Ref. [280].
(j) Ref. [198].
(k) Ref. [83].
(l) Ref. [96].
(m) From simulation of the G0u+,v'=39 → X0g+ bound–free spectrum.
(n) Ab initio values of Ref. [93].
(o) Ab initio values of Ref. [155].
(p) Ab initio values of Ref. [128].
(q) Ab initio values of Ref. [116].
(r) Ab initio values of Refs. [117,118].
(s) Ab initio values of Ref. [314].
(t) Ref. [59].
(u) Ref. [105].
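As a consistency check on Table 14, the Hg2 rotational constants can be recomputed from the tabulated bond lengths with the standard rigid-rotor relation Be = h/(8π²cμRe²). The sketch below does this for the ground and D1u states. Two things are assumptions of this example rather than statements from the text: that this standard relation is what underlies the tabulated Be values, and that the reduced mass of the m1 + m2 = 402 isotopomer can be approximated from the mass numbers 200 and 202 instead of exact isotopic masses. The function names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34       # J s
C_CM_PER_S = 2.99792458e10      # cm/s, so that Be comes out in cm^-1
AMU_KG = 1.66053906660e-27      # kg
ANGSTROM_M = 1.0e-10            # m


def reduced_mass_amu(m1_amu: float, m2_amu: float) -> float:
    """Reduced mass of a diatomic molecule in a.m.u."""
    return m1_amu * m2_amu / (m1_amu + m2_amu)


def rotational_constant_cm(mu_amu: float, re_angstrom: float) -> float:
    """Rigid-rotor rotational constant Be in cm^-1."""
    inertia = mu_amu * AMU_KG * (re_angstrom * ANGSTROM_M) ** 2
    return H_PLANCK / (8.0 * math.pi**2 * C_CM_PER_S * inertia)


# Assumption: mass numbers 200 + 202 stand in for the isotopic masses.
mu = reduced_mass_amu(200.0, 202.0)

be_ground = rotational_constant_cm(mu, 3.69)    # Re'' of Ref. [62]
be_d1u = rotational_constant_cm(mu, 2.710)      # Re'(D1u), this analysis

print(f"Be''      = {be_ground:.4f} cm-1")
print(f"Be'(D1u)  = {be_d1u:.4f} cm-1")
```

With these inputs both values fall within the uncertainties of the entries marked (f) in Table 14 (0.0123 ± 0.0001 and 0.0228 ± 0.0001 cm−1).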
in several excited states permitted corroboration of the applicability of the Morse representation for the ground-state bound well, as well as a precise determination of the ground-state repulsive part above its dissociation limit. The E1u,v'=0,1,2,3 → X0g+ bound–bound transitions, recorded for the first time [61], though unresolved, allowed visualization of the shape of the squared vibrational wave function, (ψv')², in the initial v' level from which the fluorescence was emitted. Modelling of the fluorescence bands confirmed that a Morse function is adequate to represent the ground-state bound well. The G0u+,v'=39 → X0g+ bound–free fluorescence profiles were recorded and modelled in Ref. [62], yielding a very accurate slope of the potential above its dissociation limit. It was found that an L–J(n − 6) potential (Eq. (20)) with n = 6.21 ± 0.03 is the function that more appropriately represents the
Fig. 51. Comparison of different experimental and theoretical representations of the Hg2 ground state potential for the (a) bound-well and (b) short-range regions. The Morse and L–J(6.21 − 6) potentials as well as the points from RKR-like inversion methods (◦) were obtained in the investigations reported in Refs. [61,62]. The results of Refs. [60–62] are compared with those of ab initio calculations ( ) [92], the potential of Bonechi et al. [129] with a damping term for suppression of the dispersion part at close range (see Section 3.5.8), and the points ( ) and M–S (6.668,2.916) potential of Greif [283] used in collision-induced Raman scattering of Hg2.
•
ground-state repulsive part. The L–J(6.21 − 6) function approximates very well the points determined using the semi-classical RKR-like inversion method of Le Roy [220,221] (Section 4.2.1). It is apparent that the repulsive part of the Hg2 ground state [62] is unusually soft (as compared, e.g., to the MeRG or RG2 [172–174] ground-state repulsive walls). The substantial lowering of the repulsive forces, as a result of strong induction effects, was indicated in ab initio studies of the Hg2 X0g+-state potential [106]. As acknowledged by several other investigators (e.g. [10,107]), those induction effects play a significant role in the interaction between two ground-state Hg atoms and in the stabilization of Hg2. This is quite unlike the situation commonly encountered for the RG2 dimers, and it makes the interaction in the Hg–Hg pair not completely vdW in nature, hence quite different from that in RG gases (see footnote 73). Therefore, the result [62], together with the results of other theoretical [110,111,281,282] and experimental [129] studies, supports the surmise that, in addition to the vdW interaction, a covalent bonding contributes to the net forces acting between two Hg atoms in the short as well as intermediate (near Re) regions of internuclear separation. In Ref. [129] a special damping term (Section 3.5.8) was included in the Hg2 interatomic potential for suppression of the dispersion part at close range. It is considered a success of both experimental and theoretical efforts to accurately describe the complex interactions and effects in this considerably heavy molecule.

Fig. 51 shows different representations of the bound-well (a) and repulsive (b) parts of the Hg2 ground-state potential obtained by the author and other investigators. From the comparison, it is obvious that below the dissociation energy limit, in the repulsive part of the well (R < Re), the Morse and L–J(6.21 − 6) potentials almost overlap, while in the attractive part of

[Footnote 73] As emphasized in Ref. [107], because of the short-range induction effects, Hg2 may be regarded as an intermediate case between a weakly bound vdW molecule and a chemically bonded species. Induction contributions to the bonding energy of Hg2 are an early indication of the transition of the bonding in mercury clusters from weak vdW, to covalent, and finally to metallic bonding as a function of size.
the well (R > Re) the Morse potential converges to zero much more quickly than the L–J potential. The ab initio points of Ref. [92] as well as of Ref. [155] almost overlie the Morse (or L–J) potential for R < Re and the Morse potential for R > Re. However, an improved model of the Hg–Hg interaction potential with a damping term for suppression of the long-range part at small R [129], as well as the potential of Greif [283], both used in the interpretation of mercury diatom Raman spectra, depart slightly from the above potentials for R < 4 Å. Above the dissociation limit, in the repulsive part of the potential, the L–J(6.21 − 6) function obtained from modelling of the G0u+,v'=39 → X0g+ bound–free fluorescence band [62] is an appropriate ground-state representation, which consequently overlays the experimental points from the inversion procedure of Le Roy. Moreover, the ab initio points [92] agree with the slope of the L–J potential, extending the agreement of experiment with theory above the dissociation limit. Therefore, it is recommended to represent the interatomic potential using a hybrid function (Section 3.5.9) composed of the L–J(6.21 − 6) and Morse functions for R < Re and R > Re, respectively.

6.5.1.2. F0u+ and D1u excited states. Among the four Hund’s case (c) electronic energy states that correlate with the 6³P1 atomic asymptote there are two of odd (‘u’) symmetry, F0u+ and D1u (Figs. 10 and 11), to which electric dipole transitions from the even (‘g’) ground state are allowed. The v' = 0 ← v'' = 0 excitation frequency to the first one is red-shifted by only ∼50 cm−1 from the 6³P1–6¹S0 atomic transition [60], while the centre of the second, broad band is red-shifted by about 1900 cm−1 [61]. This immediately leads to the conclusion that the F0u+- and D1u-state bound wells are considerably shallow and deep, respectively.

The F0u+ ← X0g+ transition was studied in detail in Ref. [60] (Fig. 16).
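The ∼50 cm−1 red shift quoted above for the F0u+ ← X0g+ origin can be cross-checked against Table 14 with a simple energy balance: since both states dissociate to atomic limits separated by the atomic transition energy, the 0–0 band lies below the atomic line by D0' − D0''. The sketch below assumes the usual anharmonic-oscillator relation D0 = De − ωe/2 + ωexe/4, which is an assumption of this example and not a formula quoted from the text. It uses the author’s Table 14 values for Hg2 (entries marked b and c); the function names are illustrative.

```python
def d0_from_de(de_cm: float, we_cm: float, wexe_cm: float) -> float:
    """Dissociation energy D0 from the well depth De (all in cm^-1).

    Subtracts the zero-point energy of an anharmonic oscillator,
    G(0) = we/2 - wexe/4.
    """
    return de_cm - 0.5 * we_cm + 0.25 * wexe_cm


def origin_red_shift(upper: tuple, lower: tuple) -> float:
    """Red shift (cm^-1) of the 0-0 band from the atomic line.

    Each argument is (De, we, wexe) in cm^-1. A positive result means
    the molecular origin lies below the atomic transition.
    """
    return d0_from_de(*upper) - d0_from_de(*lower)


# Table 14, Hg2: F0u+ state (432, 18.6, 0.20) and X0g+ state (380, 19.6, 0.26).
shift = origin_red_shift((432.0, 18.6, 0.20), (380.0, 19.6, 0.26))
print(f"F0u+ <- X0g+ origin red shift: {shift:.1f} cm-1")
```

The result is of the order of 50 cm−1, in line with the quoted shift, although the uncertainties of the two well depths (±10 and ±15 cm−1) are a sizeable fraction of the difference.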
It was concluded that the mutual configuration of the frequencies of the v' ← v'' transitions is responsible for the shape and complex structure of the registered bands (see discussion in Section 3.6.1). The interpretation of the F0u+ ← X0g+ spectra was aggravated by the fact that both states have almost identical well depths and bond lengths (Table 14). Five Δv sequences that were registered with very high sensitivity, as well as simulation of the vibrational-band profiles (Fig. 16(b)), led to the conclusion that all Δv < 0 sequence bands appear as blue-shaded and, if incorrectly perceived, can be erroneously interpreted as rotational contours of vibrational bands. Analysis of the Δv = 0, ±1 and ±2 sequence bands allowed the determination of the so-called turning points and band-heads [64] for the components in the F0u+ ← X0g+ transition within an anharmonic potential approximation. It was concluded that the conditions of the supersonic expansion created in the experiment allowed an unusually high number of ground-state v'' levels (up to v'' = 20) to be populated efficiently. This circumstance, in particular, rendered the experiment successful. The excited-state interatomic potential was represented by a Morse function in the intermediate region of R.

The large isotope shift (Eq. (32) and Fig. 17) in the D1u ← X0g+ transition (Fig. 22), associated with high-v' excited vibrational levels, made it possible to perform an isotope-shift analysis that was omitted in Ref. [61]. The experimentally measured Δνij were plotted against v' together with the isotope shift evaluated from Eq. (32), assuming a v'-assignment and an improved ωe' vibrational frequency (the ωe'xe' anharmonicity was not changed, as it was quite precisely determined in the B–S analysis in the range of the observed v' ← v'' = 0 vibrational components [61]). The results are compared in Fig. 18 along with the Δνij calculated according to the results of the previous analysis [61] as well as the result of Ref. [197].
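The isotope-shift comparison described above can be sketched with the standard Born–Oppenheimer mass scaling of the vibrational constants, ωe ∝ μ^(−1/2) and ωexe ∝ μ^(−1). This scaling is assumed here to be the content of Eq. (32), which is not reproduced in this section. The function names are illustrative, and the choice of isotopomers in the usage lines (reduced masses approximated as 101 and 100 a.m.u. from mass numbers) is a hypothetical example, not the pair analysed in the original work.

```python
import math


def vib_term(we_cm: float, wexe_cm: float, v: int) -> float:
    """Vibrational term value G(v) = we(v+1/2) - wexe(v+1/2)^2 in cm^-1."""
    x = v + 0.5
    return we_cm * x - wexe_cm * x * x


def isotope_shift(upper: tuple, v_up: int, lower: tuple, v_lo: int,
                  mu_ref: float, mu_other: float) -> float:
    """Shift nu(other) - nu(reference) of the band v_up <- v_lo in cm^-1.

    `upper` and `lower` are (we, wexe) of the reference isotopomer.
    Constants of the other isotopomer follow from rho = sqrt(mu_ref/mu_other):
    we -> rho * we, wexe -> rho^2 * wexe. Electronic shifts are neglected.
    """
    rho = math.sqrt(mu_ref / mu_other)

    def term_difference(we_cm: float, wexe_cm: float, v: int) -> float:
        scaled = vib_term(rho * we_cm, rho * rho * wexe_cm, v)
        return scaled - vib_term(we_cm, wexe_cm, v)

    return term_difference(*upper, v_up) - term_difference(*lower, v_lo)


# D1u <- X0g+ in Hg2 with Table 14 constants; v' = 57 is the level
# mentioned in the text. Reduced masses are hypothetical round values.
shift = isotope_shift((129.5, 0.50), 57, (19.6, 0.26), 0,
                      mu_ref=101.0, mu_other=100.0)
print(f"isotope shift at v' = 57: {shift:.2f} cm-1")
```

Because the shift grows roughly linearly with v' + 1/2, a plot of measured Δνij against a trial v'-numbering is sensitive to an offset of a few units in the assignment, which is the property the analysis in the text relies on.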
It was concluded that the improved value of ωe' (Table 14) resulted in better agreement between the experimental and calculated isotope shifts than the values from [46] and other [197] investigations (see footnote 74). Moreover, in Ref. [61] the 4th harmonic of the Nd:YAG laser (2660.3 Å) was used to selectively excite the D1u,v'=57 level, and a long bound–free profile was then recorded, confirming the v'-assignment (with a v' ± 1 accuracy) obtained from the isotope-shift analysis. As a result, the v'-assignment of Zehnacker et al. [197] was corrected by −3 (i.e. v'Zehnacker − 3 = v'present).

A very important consequence of the improved value of ωe' and of the new result for Re'' [62] (see below) was the possibility of simulating the v' ← v'' = 0 progression in the D1u ← X0g+ transition and, as a direct result, of determining ΔRe. The result of the simulation is shown in Fig. 22. The analysis produced a new value, Re'(D1u) = 2.710 ± 0.005 Å. Having determined Re'' and Re'(D1u), it was possible to re-examine the rotational structure of the m1 + m2 = 402 single-isotope component (v' = 60 ± 1) in the excitation spectrum reported in Ref. [280] (see their Fig. 2). Using formulas (7) and (36), the Be'', Be'(D1u) and other rotational constants were evaluated (Table 14), and they were found to agree within the margin of error with those reported by van Zee et al. [280]. Then, the new rotational constants were used to simulate the band reported in Ref. [280] under the experimental conditions of van Zee et al. The agreement was found to be satisfactory and supported the result for Re'' presented here.

6.5.1.3. E1u excited state. Fig. 23 presents the excitation spectrum of the E1u ← X0g+ transition in Hg2, detected for the first time using laser excitation combined with the supersonic beam method [61]. In 1909, this transition was seen by Wood in collision-induced absorption (so-called Wood’s bands) [284], and it was then investigated in the early works of Mrozowski, Grotrian, Hamada and Lord Rayleigh [86,285–288]. However, it was never explored in detail in laser spectroscopic studies.
The E1u(³Πu, Hund’s case (a)) electronic state correlates with the 6³P2 atomic asymptote, and even though the 6³P2–6¹S0 transition in atomic Hg is forbidden (ΔJ = 2), in molecular Hg2 the E1u ← X0g+ transition is not (ΔΩ = 1). The centre of the E1u ← X0g+ profile is red-shifted by approximately 1190 cm−1 from the forbidden 2270 Å atomic line, indicating that the excited-state well depth is larger than that of the ground state. By properly adjusting the conditions of the supersonic expansion, it was possible to record pronounced v' ← v'' = 0 and v' ← v'' = 1 progressions (Fig. 23). The v'-assignment in the E1u ← X0g+ transition was confirmed through observation of the E1u,v'=0,1,2,3 → X0g+ bound–bound transitions in fluorescence spectra. The inadequacy of the B–S plot (Fig. 20) for a reliable determination of the E1u well depth and dissociation energy was analysed in Section 4.1.1 and will not be repeated here. It has to be stressed, however, that in this case the B–S method overestimates D0'(E1u) by about 35%. Summarizing, near its bottom (i.e. in the vicinity of Re') it is justified to represent the E1u-state interatomic potential with a Morse function; however, as the dissociation limit is approached, the potential departs from its Morse-like behaviour and one has to consider estimating it by a properly chosen long-range approximation.

6.5.1.4. G0u+ excited state. As mentioned above, the characterization of the G0u+ state was performed by Kedzierski et al. [83] in high-resolution spectroscopy of the G0u+ ← A0g+ transition in the (²⁰²Hg)2 single isotopomer. Therefore, the author’s study of the G0u+ ← X0g+ transition was focused only on the analysis of the Δνij(v') isotope shift and the simulation of the F–CF intensity profile [62].

[Footnote 74] However, one has to be very cautious in determining the D1u-state dissociation energy using formula (40). As shown for the E1u state (Section 4.1.1), D0' can be overestimated. Therefore, it is safer to use formula (41), for which, however, a precise knowledge of ν00 and D0'' is necessary.
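The size of the B–S errors discussed in this section can be reproduced from Table 14. The sketch below assumes that the B–S (Morse) estimate takes the familiar form De ≈ ωe²/(4ωexe); whether this coincides exactly with formula (40) of the text is an assumption of this example. The inputs are the Table 14 constants for the E1u, D1u and G0u+ states of Hg2, and the helper name is illustrative.

```python
def birge_sponer_de(we_cm: float, wexe_cm: float) -> float:
    """Morse / linear Birge-Sponer well-depth estimate in cm^-1."""
    return we_cm * we_cm / (4.0 * wexe_cm)


# (we', we'xe', reference De') from Table 14, all in cm^-1.
STATES = {
    "E1u": (40.2, 0.18, 1660.0),
    "G0u+": (88.5901, 0.29566, 8280.0),
}

for name, (we, wexe, de_ref) in STATES.items():
    ratio = birge_sponer_de(we, wexe) / de_ref
    print(f"{name}: B-S estimate / reference De' = {ratio:.2f}")

# For D1u the B-S estimate reproduces the entry marked (g) in Table 14.
print(f"D1u: B-S estimate = {birge_sponer_de(129.5, 0.50):.0f} cm-1")
```

The E1u ratio comes out near 1.35 and the G0u+ ratio near 0.80, matching the roughly 35% overestimate and 20% underestimate quoted in the text. The comparison here is made on De rather than D0, which changes the percentages only slightly.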
Both analyses provided new results. The isotope-shift examination led to a change by +13 of the v'-assignment in the G0u+ ← X0g+ transition reported by Schlauf et al. [198] (i.e. v'Schlauf + 13 = v'present). The simulation of the F–CF intensity profile provided ΔRe = Re'(G0u+) − Re''. Consequently, a larger value for Re'', obtained from ΔRe and the very precise Re' of Ref. [83], was calculated (Table 14). The new v'-assignment was confirmed by the observation of the G0u+,v'=39 → X0g+ bound–bound and bound–free fluorescence bands. In general, both the work of Kedzierski et al. [83] and the studies of Ref. [62] provided a broader view of the G0u+-state interatomic potential, and it was found, as opposed to the cases of the D1u- and E1u-state potentials, that the B–S method underestimates the G0u+-state dissociation energy by approximately 20%.

A comparison of the results of Refs. [60–62] with the ab initio points of Ref. [93] for the Hg2 F0u+, D1u, E1u and G0u+ excited states is shown in Fig. 10. All other experimental results are compared in Table 14.

6.5.2. Cd2 and Zn2 interatomic potentials from excitation spectra

As compared to the laser spectroscopy of Hg2 species produced in supersonic beams, Cd2 and especially Zn2 were studied less extensively (for a historical review see Ref. [12]). The reason was the same as that quoted while discussing the spectroscopy of the HgRG, CdRG and ZnRG molecules. In 1985, a laser excitation spectrum of the 0u+ ← X0g+ transition in Cd2 produced in a supersonic beam was reported by Kowalski et al. [94], providing first estimates of the ground- and excited-state dissociation energies. Some years later, the same transition was investigated by Czajkowski et al. in Cd2 [95] and Zn2 [96] using a similar experimental approach. Both reports assumed Morse representations for the ground- and excited-state interatomic potentials; however, the ground-state characterization was indirect and the internuclear separations were estimated from ΔRe obtained in a simulation of the F–CF intensity distribution. To the best of the author’s knowledge, those are the only experimental studies of Cd2 and Zn2 produced in supersonic beams that had been carried out before the studies presented in Refs. [58,59].

To complete the view of the characterization of Cd2 and Zn2, it is necessary to include other important experimental studies as well as ab initio calculations. The most reliable ones related to the laser spectroscopy of Cd and Zn vapours are those of Eden and co-workers [97,98]. Two excited states, B¹Σu+(5¹P1) and a³Πg(5³P1), and the repulsive part of the X¹Σg+ ground state in Cd2 were investigated [98]. A more detailed study of the Cd2 and Zn2 bound–free emission in the B¹Σu+ → X¹Σg+ transition was also reported. The main conclusion drawn was that the ground-state repulsive parts in both molecules are represented by Morse functions. Electron-beam excitation of Zn2 [289] and Cd2 [290] molecules was studied as well, particularly the bound–free emission in the ³Σu+(n³P) → X¹Σg+ transition. These studies were driven by a search for an effective energy reservoir in a possible tuneable laser medium (excimer).

The ab initio calculations of the Cd2 and Zn2 interatomic potentials are summarized in Table 2 (Section 3.5.3). The most recent ones, to which the experimental characteristics are compared, are those of Czuchaj et al., Refs. [116–118] (see also Fig. 52) for Zn2 and Cd2, respectively, as well as Schautz et al. [128] for Zn2 and Cd2.

6.5.2.1. X0g+ ground, 0u+ and 1u excited states in Cd2. The 0u+ ← X0g+ transition in Cd2 produced in a supersonic beam was studied in Ref.
[58], and an emphasis was put on eRcient population of the v ¿ 0 vibrational levels in order to detect as many “hot” bands as possible. Moreover, the ground
308
J. Koperski / Physics Reports 369 (2002) 177 – 326
+ 3 3 Fig. 52. Comparison of the result for Cd 2 interatomic potentials of the X0+ g ground, and 0u (5 P1 ) [58] and 1u (5 P2 ) [59] excited states with ab initio points of Refs. [117,118] (full circles). It is shown that a reasonable agreement is present not + only for the bound-well regions but also for the repulsive parts of the X0+ g and 0u -state potentials. Central wavelengths + + + of the 0u ← X0g and 1u ← X0g transitions are indicated.
and excited states characterization included modelling of the long-range parts of these potentials. The 0u⁺ ← X0g⁺ transition is blue-shifted by about 80 cm⁻¹ from the 5³P₁–5¹S₀ atomic line, and the unresolved rotational profiles of the vibrational components are "red-shaded". Thus, the excited state is expected to have a shallower well and a larger bond length than the ground state. The study of Ref. [58] reports the detection of seven "hot" bands (the highest v″ sufficiently populated in the beam was v″ = 4). This permitted a direct and more reliable characterization of the ground state, which increased the determined ground-state well depth by 2.5% with respect to the indirect value of Ref. [95]. The long-range LR–B analysis made it possible to determine the excited-state dissociation limit and to improve the accuracy of the 0u⁺-state dissociation energy (Table 14). The determination of R_e was based on the L–P method (56), and the new R_e(0u⁺) value was obtained via ΔR_e from modelling of the F–CF intensity distribution of the v′ ← v″ = 0, 1, 2, 3 vibrational progressions. The experimentally determined PE curves of Cd2 [58] are in very good agreement with the results of the ab initio calculations of Ref. [98], as shown in Fig. 52. The 1u ← X0g⁺ transition in Cd2 studied in Ref. [59] revealed a short v′ ← v″ = 0 "cold" progression and weak v′ ← v″ = 1, 2 "hot" progressions. The 1u ← X0g⁺ transition is red-shifted by about 390 cm⁻¹ from the 5³P₂–5¹S₀ forbidden atomic transition (3141 Å). The first five vibrational components of the v′ ← v″ = 0 progression revealed a linear B–S plot which, as a result, overestimated the 1u-state dissociation energy obtained from Eq. (41). Therefore, it was postulated that the analytical representation of the 1u-state PE curve should be the L–J(12 − 6) function of Eq. (22) rather than a Morse function [59] (Fig. 52). Comparing the PE curves determined in studies of Hg2, shown in Fig. 10, with those in Fig. 52, it is intriguing that at the time this review was written there was no evidence of studies of other Cd2 excited states accessible by direct excitation from the ground state in experiments with crossed laser and supersonic beams. One may expect the possibility of exciting two such Hg2 analogues, 1u(³Σu⁺) and 0u⁺(¹Σu⁺), correlating with the 5³P₁ and 5¹P₁ atomic asymptotes, respectively. The two interatomic potentials have been calculated ab initio to have bound wells [98] and are
J. Koperski / Physics Reports 369 (2002) 177 – 326
accessible in excitation from the ground state. The search for experimental data of this kind has started in the laboratory in Kraków.

6.5.2.2. X¹Σg⁺ ground and ³Πu excited states in Zn2

There are only two articles [58,96] devoted to experimental studies of the Zn2 molecule produced in a supersonic expansion. Chronologically, the first one [96] reported an observation of the ³Πu(4³P₁) ← X¹Σg⁺ transition in Zn2,⁷⁵ in which the v′ ← v″ = 0 progression was analysed using the B–S approach. The report concluded with Morse representations for the two molecular states. As in Cd2, the ³Πu ← X¹Σg⁺ transition in Zn2 is blue-shifted with respect to the corresponding 4³P₁–4¹S₀ atomic line, and all vibrational components in the spectrum are "red-shaded", indicating that the excited-state well is shallower and the excited-state bond length is larger than those of the ground state. In Ref. [58], the previously recorded transition was analysed in the excitation spectrum taking "hot" bands into account. The long-range behaviour of the ground- and excited-state interatomic potentials was also analysed thoroughly. As a result, improved values for the ground- and excited-state well depths and bond lengths were obtained, along with the energy corresponding to the dissociation limit of the excited state (Table 14). The experimentally determined Zn2 characteristics (D_e, R_e, ω_e) [58] are in reasonably good agreement with the results of the ab initio calculations of Refs. [116,128]. Future studies of Zn2 in supersonic beams will be an unquestionable challenge for investigators, especially considering the skill required to maintain the Zn-beam source. As in the case of the Cd2 molecule, one may expect the possibility of direct excitation from the ground state of the two ³Σu⁺ and ¹Σu⁺ states correlating with the 4³P and 4¹P atomic asymptotes, respectively.

6.6. Me2 dimers: comparison

Table 14 summarizes the potential characteristics of the X0g⁺(¹Σg⁺) ground and 0u⁺(³Πu) excited states of Me2, of the 1u(³Πu) excited state of Cd2 and Hg2, as well as of the D1u(³Σu⁺) and G0u⁺(¹Σu⁺) excited states of Hg2. Table 14 also collects the most recent results of the ab initio calculations of Czuchaj et al. [93,116–118], Yu and Dolg [110], Schautz et al. [128], Dolg and Flad [155] and Munro et al. [314], as well as the results of other experiments of Kedzierski et al. [83], Czajkowski et al. [96], Zehnacker et al. [197], Schlauf et al. [198], and van Zee et al. [280]. Very interesting conclusions can be drawn from the comparison shown in Fig. 53, where the experimental [58–62] and ab initio ground-state well depths and bond lengths are plotted vs. α_Me. The D_e vs. α_Me tendency in the experimental as well as the ab initio values (Fig. 53(a)) is similar in nature to those of Figs. 49(a) and (c), where analogous trends were shown for the MeRG molecules. However, the character of the D_e vs. α_Me tendency in Fig. 53 rather resembles that for the heaviest MeXe molecule. This behaviour is quite expected, as the magnitude of α_Xe approaches that of α_Me (Tables 5 and 6). A comparison of the overall trend in the ab initio and experimental (R_e)⁶ vs. α_Me dependencies reveals that the ab initio results are approximately linear with α_Me,⁷⁶ similarly to the trend observed in Fig. 49(d) for heavier MeRG

⁷⁵ In the Zn atom, the magnitude of the L–S coupling (Fig. 5) is comparable to the well depths of the 0u⁺(³Πu) and X0g⁺(¹Σg⁺) molecular states in Zn2 (Table 14). Therefore, in this case it is justified to use Hund's case (a) rather than Hund's case (c) notation.
⁷⁶ Due to relativistic effects and shell-structure effects (the so-called lanthanide contraction), the bond length of Hg2 is smaller than those of Zn2 and Cd2 [110]. A very interesting discussion of the subject can be found in Ref. [291].
[Fig. 53 plot. Panel (a): D_e″(Me2) [cm⁻¹] vs. α_Me [Å³]. Panel (b): [R_e″(Me2)]⁶ [Å⁶] vs. α_Me [Å³]. Points for Zn, Cd and Hg. Legend: author's Refs. [58–62]; ab initio, Czuchaj et al. [93,116–118]; ab initio, Schautz et al. [128]; Dolg and Flad [155].]
Fig. 53. Experimental and ab initio ground-state (a) well depths, D_e″, and (b) bond lengths, (R_e″)⁶, plotted vs. α_Me for Me2 molecules, according to the results of Refs. [58–62] and the ab initio results of Czuchaj et al. [93,116–118], Schautz et al. [128], and Dolg and Flad [155].
(RG = Ar, Kr, Xe) molecules. This behaviour is also rather expected. However, the experimental R_e(Zn2) does not conform to the plausible linear trend determined by the remaining R_e(Cd2) and R_e(Hg2) values. This calls for a more thorough and direct experimental determination of R_e(Zn2). Nevertheless, the overall experimental-to-ab initio comparison shown in Fig. 53 and Table 14 is encouraging. Firstly, the ab initio values of Dolg and Flad [155] and Munro et al. [314] for D_e(Hg2) and R_e(Hg2), and of Schautz et al. [128] for D_e(Cd2), are in almost perfect agreement with those from the experimental studies [58,62]. Secondly, the ab initio values of Czuchaj et al. for D_e(Cd2) [117,118] and R_e(Zn2) [116] are very close to the experimental ones [58]. Finally, the two ab initio values for D_e(Zn2) [116,128] are close to each other. As mentioned in Section 6.5.1 when discussing the Hg2 ground-state potential, there is strong theoretical evidence, first argued for by Kunz et al. [106], that an induction contribution to the bonding energy is essential in the Hg2 ground-state potential. Therefore, the authors of Ref. [106] concluded that the picture of the mercury dimer as a vdW complex should be modified to include a non-negligible covalent contribution to the binding energy. One can ask whether covalent contributions are also present in the ground-state bonding of the other Me2 molecules, Zn2 and Cd2. Systematic ab initio investigations of the covalent contributions to the Zn2, Cd2 and Hg2 ground-state energies were presented in Ref. [110] at the complete-active-space self-consistent-field level, and in Ref. [128] using pure quantum Monte-Carlo calculations (Table 2). This was achieved by studying the interatomic charge fluctuations, whose presence indicates covalent bonding. The stronger the covalent contribution to the bonding, the less equally the electrons are distributed between the two atomic domains of Me–Me, i.e. an increase in the charge fluctuations is observed. On the other hand, since a pure vdW interaction results from simultaneous intraatomic excitations (e.g. s² → sp on both atoms), no charge fluctuations are observed in that case. The theoretical studies [110,128] led to the clear conclusion that the group-12 homonuclear dimers, although bound by a vdW-type interaction, exhibit significant covalent contributions to the bonding (with the ratio of vdW to covalent contributions being approximately 0.75 to 0.25). Estimating the relative strength of the dispersion interaction using the L–D formula (∼I_Me(α_Me)², i.e. ∼C₆, Eq. (53), Section 4.4.2) would give the ordering D_e(Hg2) < D_e(Zn2) < D_e(Cd2).
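The London–Drude estimate quoted above can be checked with a few lines of Python. This is only a minimal sketch: the ionisation potentials and static dipole polarizabilities below are rounded, approximate literature values assumed for illustration (they are not taken from Tables 5 and 6 of this review), and the names `ATOMS`, `dispersion_strength` and `ordering` are hypothetical helpers. With these inputs the product I·α² reproduces the ordering Hg < Zn < Cd.

```python
# Relative London-Drude dispersion strength, I * alpha**2 (proportional to C6).
# The atomic data are approximate values ASSUMED for illustration only;
# they are not the entries of Tables 5 and 6 of the review.

ATOMS = {
    # label: (ionisation potential / eV, static dipole polarizability / A^3)
    "Zn": (9.39, 5.7),
    "Cd": (8.99, 7.4),
    "Hg": (10.44, 5.0),
}


def dispersion_strength(ionisation_ev, alpha_a3, r_e_a=None):
    """Return I*alpha^2, divided by R_e^6 when a bond length (in A) is given."""
    strength = ionisation_ev * alpha_a3 ** 2
    if r_e_a is not None:
        strength /= r_e_a ** 6
    return strength


def ordering(strengths):
    """Sort labels from the weakest to the strongest estimated interaction."""
    return sorted(strengths, key=strengths.get)


# C6-like estimate without any bond-length factor.
c6_like = {name: dispersion_strength(i, a) for name, (i, a) in ATOMS.items()}
print(ordering(c6_like))
```

Passing a bond length through the optional `r_e_a` argument applies the R_e⁻⁶ scaling, so the sensitivity of the ordering to R_e can be explored with whatever bond lengths one trusts.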
According to this estimate, Hg2 should have the smallest binding energy of the three group-12 dimers (!), which contradicts both experimental and theoretical evidence (Fig. 53(a)). However, as obtained in Ref. [110], at R_e the covalent contributions obey the order Zn2 < Cd2 ≅ Hg2, and the bond length of Hg2 is smaller than those of Zn2 and Cd2 (Fig. 53(b); due to relativistic effects and shell-structure effects, see footnote 76). Therefore, the relative strength of the dispersion interaction at R_e (D_e ∼ I_Me(α_Me)²/(R_e)⁶) follows D_e(Zn2) < D_e(Cd2) < D_e(Hg2), explaining the experimentally observed sequence of the D_e values (Fig. 53(a)). This also explains the trends seen in Fig. 49 for the MeRG molecules (Section 6.4), where for a particular RG atom in the MeRG class of molecules (i.e. ZnRG, CdRG or HgRG) similar trends are observed in the results of experimental and theoretical investigations. More experimental data are needed to confirm the covalent contribution to the Me2 ground-state bonding energy. In particular, there are relatively few results for the shape of the ground-state repulsive part in Zn2 and Cd2. This indicates possible directions for future studies of these molecules.

7. Summary and conclusions

This review is based on the author's studies of the MeRG and Me2 diatoms, where Me is a group-12 atom (Zn, Cd, Hg) and RG stands for a rare gas atom (He, Ne, Ar, Kr or Xe). The molecules were produced in three different supersonic beams and studied using methods of laser spectroscopy. The experiments were carried out at the University of Windsor, Windsor, Canada, and using a newly designed and constructed experimental apparatus at the Jagiellonian University, Kraków, Poland. The goal of the review article was to provide a comprehensive characterization of the MeRG and Me2 diatoms: ZnRG [40–42,47], CdRG [43–50,63], HgRG [51–57] as well as Zn2 [58], Cd2 [58,59] and Hg2 [51,57,60–62]. As a result, the ground and a number of low-lying and Rydberg excited electronic energy states of the molecules have been characterized, several of them for the first time. Analytical functions have been proposed to represent the PE curves in three separate regions of internuclear separation: in the short-range region, in the vicinity of R_e (an intermediate region), and in the long-range limit. This provided a characterization of the interatomic potentials over a broad range of R. A number of controversies and ambiguous interpretations concerning the earlier observed spectra of these molecules have been clarified and new interpretations have been proposed. Among the most important results discussed here are:
(i) Pioneering studies of the extremely weakly bound CdHe and ZnNe molecules and characterization of their ground and lower excited electronic states.
(ii) Studies of the E1 Rydberg states in the CdNe, CdAr and CdKr molecules.
(iii) First-time observation of the B1 ← X0⁺, 1u ← X0g⁺ and E1u ← X0g⁺ transitions in the excitation spectra of CdXe, Cd2 and Hg2, respectively. This made it possible to characterize the B1, 1u and E1u excited states in these molecules.
(iv) First-time observation of the D1(v′ = 10) → X0⁺, B1(v′ = 0–3) → X0⁺, A0⁺(v′ = 8) → X0⁺ and G0u⁺(v′ = 39) → X0g⁺ transitions in the fluorescence spectra of ZnAr, HgAr, HgKr and Hg2, respectively. Consequently, the repulsive parts of the ground-state potentials in these molecules were determined.
(v) Direct observation of the B1-state dissociation limits in HgAr and CdAr, which enabled a reliable description of the long-range behaviour of the B1-state potentials.
(vi) Direct characterization of the ground-state potentials of ZnNe, ZnAr, CdHe, CdAr, HgNe, HgAr and Cd2 using observed "hot" bands. In a number of cases, this made it possible to supersede previous, indirect ground-state characteristics, which were sometimes erroneous or inaccurate.
(vii) For the CdNe and CdKr molecules, the interpretation of the B1 ← X0⁺ transitions in their excitation spectra was corrected with respect to the previous analyses. This resulted in improved representations of the B1-state interatomic potentials of these molecules.
(viii) Special emphasis was put on the characterization of the ground-state repulsive branches of the interatomic potentials from the observed fluorescence spectra. Consequently, the ground-state short-range repulsive walls of the ZnAr and HgKr molecules were directly determined for the first time. The determination was more accurate when two "channels" of the fluorescence terminating on the same part of the ground-state repulsive branch were used for the analysis, as reported for the CdNe, CdAr, CdKr and HgAr fluorescence spectra.
(ix) In the general discussion on the classification of regularities in the MeRG molecules, a simple model of the dispersive vdW interaction has been applied, and a distinct linear trend of D_e vs. α_RG has been shown to occur.
(x) An unusually soft repulsive wall of the ground-state mercury dimer has been determined, supporting the hypothesis that short-range induction effects play a significant role in the stabilization of Hg2. A theoretical prediction of covalent bonding contributions to the Me2 ground-state interaction potential has been partly confirmed by experimental observations.

As an additional result of the discussion carried out here, it is worthwhile to mention several problems that were identified and that indicate possible directions for future studies of the MeRG and Me2 molecules.
(i) There still exist several theories that try to find and reliably describe regularities that may exist in the small MeRG systems bound by very weak long-range forces. As shown here, this is quite a difficult task, and every interesting observation is a challenge to interpret. For example, as compared to the linear D_e vs. α_RG dependence in CdRG (Fig. 40(a)), the respective relationships for the A0⁺, B1 and D1 excited states reveal a nonlinear D_e ∼ (α_RG)^γ relationship, where, strikingly, the exponent γ for CdAr, CdKr and CdXe is twice that for CdHe and CdNe. These interesting regularities are presently under investigation.
(ii) The D_e(ZnRG) vs. α_RG dependence, plotted according to the author's results, shows a distinct linear trend. Assuming the linear dependence, it is evident that the ab initio and experimental D_e values of ZnXe determined by other investigators are too small by some 20–25%. This calls for additional investigation.
(iii) More experimental data are needed to confirm the covalent contribution to the Me2 ground-state bonding energy. In particular, there are relatively few results for the shape of the ground-state repulsive part in Zn2 and Cd2.
(iv) The experimental R_e(Zn2) does not fall into the plausible linear trend determined by the remaining R_e(Cd2) and R_e(Hg2) values, which calls for a more thorough and direct experimental determination of R_e(Zn2).
(v) One should expect the possibility of exciting two excimer states in the Cd2 molecule: 1u(³Σu⁺) and 0u⁺(¹Σu⁺). These interatomic potentials have been calculated ab initio to have bound wells and are accessible via excitation from the ground state. Similarly, one may expect the possibility of direct excitation from the Zn2 ground state of the two ³Σu⁺ and ¹Σu⁺ excited states correlating with the 4³P and 4¹P atomic asymptotes, respectively.

The results discussed in this review [40,41,43–46,48,53–58,60–63] have been recognized and applied as a source of experimental spectroscopic data in different fields of molecular physics and chemistry related to small weakly bound species [10,11,31–35,39,72,73,83,93,107–111,128–137,145,211,263,265–267,279,281,292–318,321,322].
Notation e Me RG e 1 4 4 ! \Gv \Gv+1=2 \@ !L \ \Dopp \ij \P; Q; R \Re = Re − Re \v = v − v F $ @ @lim aver 0 00 0v at e
rotational constant for vibrationless state (at R = Re ) static dipole polarizability of metal atom static dipole polarizability of rare gas atom constants (exponent) in the Morse function rotational constant for vibrationless state (at R = Re ) gamma function parameter in the Hartree–Fock-dispersion-type potential heat capacity ratio molecular electronic energy state with = 2 vibrational Lrst diAerence separation of successive vibrational levels monochromator pass-band constant laser spectral bandwidth energy interval between the lowest ground-state vibrational levels Doppler (inhomogeneous) broadening isotope shift between ith and jth isotopomers frequency spacing in P,Q,R-branches of rotational transition diAerence between equilibrium internuclear separations in the excited and ground electronic molecular states sequence of vibrational transitions (\v = const.) divergence angle of the supersonic beam component of L momentum along the internuclear axis quantum number of wavelength long-wavelength characterization limit molecular reduced mass averaged molecular reduced mass wave number of electronic transition band origin or the zero line wave number of v = 0 ← v = 0 transition wave number of v ← v = 0 transition wave number of atomic transition electronic part of electronic-transition wave number (origin of the band system)
ν_P,Q,R: wave number of the P-, Q-, R-branch of a rotational transition
ν_rot: rotational part of the electronic-transition wave number
ν_vib: vibrational part of the electronic-transition wave number
Π: molecular electronic energy state with Λ = 1
ρ: "isotopic ratio"
Σ (vector): component of the S momentum along the internuclear axis
Σ: quantum number of Σ
Σ: molecular electronic energy state with Λ = 0
+: state whose electronic eigenfunction ψ_e remains unchanged upon reflection at any plane passing through both nuclei
−: state whose electronic eigenfunction ψ_e changes sign upon reflection at any plane passing through both nuclei
g: even ("gerade") state, whose electronic eigenfunction ψ_e remains unchanged when reflected at the centre of symmetry
u: odd ("ungerade") state, whose electronic eigenfunction ψ_e changes sign when reflected at the centre of symmetry
σ_elast: cross sections for elastic collisions
σ_rot: cross sections for collision-induced rotational transitions
σ_vib: cross sections for collision-induced vibrational transitions
τ: radiation lifetime
φ_i: correction allowing for the interaction between rotation and electronic motion in the molecule
Ψ_molecular: total molecular wave function
ψ(R; E): continuum of wave functions belonging to the unbound ground state
ψ_e: electronic part of the molecular wave function
ψ_n: nuclear part of the molecular wave function
ψ_v′; ψ_v″: wave function of the excited- or ground-state vibrational level
Ω (vector): total electronic angular momentum about the internuclear axis
Ω: quantum number of Ω
ω_0: vibrational frequency if the zero energy is at the lowest vibrational level
ω_0′ (ω_0″): vibrational frequency of the excited (ground) state if the zero energy is at the lowest vibrational level
ω_e: vibrational frequency
ω_e′; ω_e″: vibrational frequency of the excited or ground state
ω_0x_0: single anharmonicity if the zero energy is at the lowest vibrational level
ω_0′x_0′; ω_0″x_0″: single anharmonicity of the excited or ground state if the zero energy is at the lowest vibrational level
ω_ex_e: single anharmonicity
ω_e′x_e′; ω_e″x_e″: single anharmonicity of the excited or ground state
ω_ey_e: "second-order" anharmonicity
A: constant of molecular spin–orbit coupling for a given electronic state
A: constant in the Buckingham-type potential
A: mass number
A_1; A_2: atomic masses
â: operator of the H_LS Hamiltonian acting on the radial part of the wave function
a ℓ̂_z s_z: diagonal element of the one-electron spin–orbit coupling operator
a ℓ̂_+ s_−: off-diagonal element of the one-electron spin–orbit coupling operator
a ℓ̂_− s_+: off-diagonal element of the one-electron spin–orbit coupling operator
a.m.u.: atomic mass unit
b: constant in the Buckingham-type potential
b: exponent in the universal damping function
B_e: rotational constant (rigidity) for the vibrationless state (at R = R_e)
B_e′; B_e″: excited- or ground-state rotational constant (rigidity) for the vibrationless state (at R = R_e)
B–S: Birge–Sponer (method, plot)
B_v: rotational constant (rigidity) for a single vibrational level v
B_v′; B′: rotational constant (rigidity) of the excited-state vibrational level v′
B_v″; B″: rotational constant (rigidity) of the ground-state vibrational level v″
CASPT2: complete-active-space multireference second-order perturbation theory (ab initio calculations)
CASSCF: complete-active-space multiconfiguration self-consistent field (ab initio calculations)
C_2k; C_m: long-range constants (C_6, C_8, C_10, ... for 2k or m = 6, 8, 10, ...)
C_6: long-range vdW constant
C_6′; C_6″: excited- or ground-state long-range vdW constant
C_n: short-range constant (C_12, for n = 12)
C_12: short-range constant
CCSD(T): coupled-cluster with single, double and perturbative triple excitations (ab initio calculations)
CI: configuration interaction
CID: Condon internal diffraction (pattern)
c: speed of light
c_p: specific heat at constant pressure
c_v: specific heat at constant volume
D: diameter of the orifice
D: dissociation energy limit
D′ (D″): excited- (ground-) state dissociation limit
D_0: dissociation energy referred to the lowest v = 0 vibrational level
D_0′ (D_0″): excited- (ground-) state dissociation energy referred to the lowest v = 0 vibrational level
D_e: centrifugal stretching rotational constant for the vibrationless state (at R = R_e)
D_e′ (D_e″): excited- (ground-) state centrifugal stretching rotational constant for the vibrationless state (at R = R_e)
D_e: well depth (bond strength)
D_e′ (D_e″): excited- (ground-) state well depth (bond strength)
D_v: centrifugal stretching rotational constant for a single vibrational level v
D_v′ (or D′): centrifugal stretching rotational constant of the excited-state vibrational level v′
D_v″ (or D″): centrifugal stretching rotational constant of the ground-state vibrational level v″
E: energy (total)
E_e: electronic energy
E_rot: rotational energy
E_vib: vibrational energy
F: reduced ionisation potential
F: universal damping function
F(J): rotational term
F–C: Franck–Condon (principle, "window" for excitation or emission)
F–CF: Franck–Condon factor, q_v′v″
F_v′ (or F′): excited-state rotational term for a particular v′ level
F_v″ (or F″): ground-state rotational term for a particular v″ level
f(v_D − v): term in the GvNDE program represented by a ratio of polynomials in (v_D − v)
FSR: free spectral range
FWHM: full width at half maximum
G(v): vibrational term
G(v′) (G(v″)): excited- (ground-) state vibrational term
G_0(v): vibrational term if the zero energy is at the lowest vibrational level
GvNDE: near-dissociation expansion program of Le Roy
H: total Hamiltonian of the molecular configuration
h: Planck constant
ħ: Planck constant/2π
H_0: zero-order term in the total Hamiltonian of the molecular configuration
H_6′ (H_6″): constants in the LR–B formula (slopes) for the excited (ground) state
HF: Hartree–Fock (calculations)
HFD: Hartree–Fock-dispersion-type (function, potential)
H_LS: Hamiltonian of the spin–orbit interaction
H_m; H_6: constant in the LR–B formula (slope)
I(E): intensity of the emitted fluorescence
I_Me; I_RG: ionisation potential of the Me or RG atom
J (vector): total angular momentum of the molecule
J: quantum number of J
J: rotational quantum number
J′; J″: excited- or ground-state rotational quantum number
J_a: resultant of L and S in the molecule
JWKB: Jordan–Wentzel–Kramers–Brillouin
k: Boltzmann constant
K–H: Kramer–Herschbach (model)
K_m; K_6: constants in the LR–B theory, limiting slopes
L: resulting electronic orbital angular momentum
ℓ: one-electron orbital angular momentum
L–D: London–Drude (theory)
LDA: local density approximation (ab initio calculations)
L–J(n − m): Lennard–Jones (function, potential)
L–P: Liuti–Pirani (regularity)
LR–B: Le Roy–Bernstein (method, plot)
L–S: spin–orbit coupling
M: electronic transition moment
M: Mach number
M_eff: effective Mach number
M_T: terminal Mach number
MC: multiconfiguration (ab initio calculation)
Me: metal (atom)
Me2: metal–metal (metal dimer)
MeRG: metal–rare gas (molecule)
MOT: magneto-optical trap
MPn: nth-order Møller–Plesset (method in ab initio calculations)
MR: multireference (ab initio calculations)
MRSDCI: multireference single- and double-excitation configuration interaction (ab initio calculations)
M–S(n_0; n_1): Maitland–Smith (function, potential)
MSV: Morse–spline–vdW (function, potential)
m_1; m_2: atomic masses
m_Me: mass of the metal atom
m_RG: mass of the rare gas atom
N (vector): angular momentum of nuclear rotation in the molecule
N: quantum number of N
N(C): axial velocity distribution
n: principal quantum number
n: density of molecules at a given point of the expansion
n_0: density of molecules in the source and orifice
n_0; n_1: coefficients in the M–S function
n*: modified M–S exponent
NDE: near-dissociation expansion (method, theory)
Nd:YAG: neodymium-doped yttrium aluminium garnet
N_eff: effective electron number
N_ext: number of total outer electrons
N_int: number of total inner electrons
N_Me; N_RG: numbers of electrons in the outer shell of the Me or RG atom
N_tot: total number of electrons
P_0: high pressure in the molecular source
P_1: pressure in the vacuum chamber (background pressure)
PE: potential energy
P_Me: metal-vapour pressure in the molecular source
p⁶: filled-shell electronic configurations with six p electrons
pπ: p-orbital perpendicular to the internuclear axis, π-configuration
pσ: p-orbital parallel to the internuclear axis, σ-configuration
QMC: quantum Monte-Carlo (ab initio calculations)
q_v′v″: Franck–Condon factor
R: internuclear separation
R²_a: expectation value of the square of the electronic radius of the unfilled valence shell of atom a (radius of the valence electron shell)
R_c: parameter in the U_M–vdW potential
R_e: equilibrium internuclear separation (bond length)
R_e′ (R_e″): excited- (ground-) state equilibrium internuclear separation (bond length)
RG: rare gas (atom)
RHF: restricted Hartree–Fock (ab initio calculations)
RKR: Rydberg–Klein–Rees (-like inversion method of Le Roy)
R_lim: repulsive ground-state potential characterization limit
R_LR: Le Roy radius
R̂_sp: s–p electrons distance operator
^(2S+1)Λ_Ω: Hund's case (a) notation of a molecular electronic energy state (e.g. ¹Π₁; S = 0, Λ = 1, Ω = 1)
^(2S+1)Ω: Hund's case (c) notation of a molecular electronic energy state (e.g. ³1; S = 1, Ω = 1)
S (vector): resulting electron spin
S: quantum number of S
s: one-electron electron spin operator
s²: filled-shell electronic configurations with two s electrons
SCF: self-consistent-field (ab initio calculations)
S–K: Slater–Kirkwood (model)
T: total term
T′; T″: total term of the excited or ground state
T_0: temperature in the molecular source
T_e: electronic term
T_e′; T_e″: electronic term of the excited or ground state
T–T: Tang and Toennies (function, potential)
T_rot: rotational temperature
T_T: terminal translational temperature
T_t: translational temperature
T_vib: vibrational temperature
U: interatomic potential
U′; U″: interatomic potential of the excited (emitting) or ground (final) state
U_B: Buckingham-type potential
U_exp: exp(n; m), exp(n; 6) or exp-6 Buckingham-type potential
U_L–J: Lennard–Jones interatomic potential
U_M: Morse interatomic potential
U_M–S: Maitland–Smith interatomic potential
U_Mull: Mulliken difference potential
U_M–vdW: combined Morse–vdW interatomic potential
V_RG: exchange interaction between the valence Me electrons and the RG atom in the MeRG molecule
v: vibrational quantum number
v′ (v″): vibrational quantum number of the excited (ground) state
v_s: local velocity of sound
v_D: vibrational quantum number of the last discrete vibrational level
v_D′ (v_D″): vibrational quantum number of the last discrete vibrational level in the excited (ground) state
vdW: van der Waals (molecule, interaction)
(v; J): ro-vibrational level with vibrational, v, and rotational, J, quantum numbers
v_max: maximum vibrational quantum number
X: symbolizes the ground molecular electronic state (e.g. XΣ⁺; X0g⁺)
X_eff: effective distance from the nozzle
X_M: distance to the Mach disk shock
X_m; X̃_m: numerical factors in NDE
X_T: distance to the terminal Mach number
Z: atomic number
Z_eff: effective charge of the nucleus in the Me atom
Acknowledgements

I would like to express my thanks to Prof. L. Krause, Prof. M. Czajkowski and Prof. J.B. Atkinson (all of University of Windsor) for long-term cooperation. I thank Prof. T. Dohnalik, Prof. W. Gawlik and Prof. K. Musiol (all of Jagiellonian University) for their support. I also thank Prof. J. Vigué (Université Paul Sabatier, Toulouse), Prof. P. Hannaford (Swinburne University, Melbourne), Prof. R.J. Le Roy (University of Waterloo), Prof. E. Czuchaj (Gdańsk University), Dr. J. Supronowicz (LEAR, Dearborn), Prof. W. Kedzierski (University of Windsor) and MSc. David Gough (Swinburne University, Melbourne). I value the help and expertise of the personnel of the electric and machine shops in both the Department of Physics of the University of Windsor and the Institute of Physics of the Jagiellonian University. My personal appreciation goes to my beloved wife Agata for her help and limitless understanding, and to my daughter Monika for cheering me up. The financial assistance of the Polish Committee for Scientific Research (KBN Grants 2 P03B 107 10 and 5 P03B 037 20) is acknowledged.

References

[1] K.P. Huber, G. Herzberg, Molecular Spectra and Molecular Structure. IV. Constants of Diatomic Molecules, D. Van Nostrand, New York, 1979.
[2] F. London, Z. Phys. 63 (1930) 245.
[3] F. London, Z. Phys. Chem. Abt. B 11 (1930) 222.
[4] G.C. Maitland, M. Rigby, E.B. Smith, W.A. Wakeham, Intermolecular Forces, Clarendon Press, Oxford, 1987.
[5] A.J. Stone, The Theory of Intermolecular Forces, Clarendon Press, Oxford, 1996.
[6] I.M. Torrens, Interatomic Potentials, Academic Press, New York, 1972.
[7] J.C. Slater, J.G. Kirkwood, Phys. Rev. 37 (1931) 682. *
[8] H.L. Kramer, D.R. Herschbach, J. Chem. Phys. 53 (1970) 2792. *
[9] G. Liuti, F. Pirani, Chem. Phys. Lett. 122 (1985) 1245. *
[10] F. Hensel, Adv. Phys. 44 (1995) 3 and references therein.
[11] F. Hensel, Phil. Trans. R. Soc. London A 356 (1998) 97 and references therein.
[12] M.D. Morse, Chem. Rev. 86 (1986) 1049 and references therein.
[13] S. Martenchard-Barra, C. Jouvet, C. Lardeux-Dedonder, D. Solgadi, J. Chem. Phys. 98 (1993) 5281.
[14] K.F. Willey, P.Y. Cheng, C.S. Yeh, D.L. Robbins, M.A. Duncan, J. Chem. Phys. 95 (1991) 6249.
[15] D.L. Robbins, K.F. Willey, C.S. Yeh, M.A. Duncan, J. Phys. Chem. 96 (1992) 4824.
[16] V. Beutel, G.L. Bhale, M. Kuhn, W. Demtröder, Chem. Phys. Lett. 185 (1991) 313.
[17] B. Cabaud, A. Hoareau, P. Melinon, J. Phys. D 13 (1980) 1831.
[18] C. Bréchignac, M. Broyer, Ph. Cahuzac, G. Delacretaz, P. Labastie, L. Wöste, Chem. Phys. Lett. 120 (1985) 559.
[19] H. Haberland, H. Kornmeier, H. Langosh, M. Oschwald, G. Tanner, J. Chem. Soc. Faraday Trans. 86 (1990) 2473.
[20] K. Rademann, O. Dimopoulou-Rademann, M. Schlauf, U. Even, F. Hensel, Phys. Rev. Lett. 69 (1992) 3208.
[21] H. Haberland, B. Issendorff, J. Yufeng, T. Kolar, G. Thanner, Z. Phys. D 26 (1993) 8.
[22] B. Land, A. Vierheilig, E. Wiedenmann, H. Buchenau, G. Gerber, Z. Phys. D 40 (1997) 1.
[23] J.C. Miller, L. Andrews, Appl. Spectrosc. Rev. 16 (1980) 1 and references therein.
[24] C.K. Rhodes, Excimer Lasers, 2nd Edition, Springer, Berlin, 1984, and references therein.
[25] P.D. Lett, P.S. Julienne, W.D. Phillips, Ann. Rev. Phys. Chem. 46 (1995) 423.
[26] D.H. Levy, in: J. Jortner, R. Levine, S.A. Rice (Eds.), Photoselective Chemistry, Advances in Chemical Physics, Vol. 47, Wiley-Interscience, New York, 1981, Part I, pp. 323–362.
[27] B.L. Blaney, G.E. Ewing, Ann. Rev. Phys. Chem. 27 (1976) 553.
[28] W.H. Breckenridge, C. Jouvet, B. Soep, in: M. Duncan (Ed.), Advances in Metal and Semiconductor Clusters, Vol. 3, JAI Press Inc., Greenwich, CT, 1995, pp. 1–83.
[29] C. Champenois, E. Audouard, P. Duplàa, J. Vigué, J. Phys. II France 7 (1997) 523.
[30] R.C. Forrey, L. You, V. Kharchenko, A. Dalgarno, Phys. Rev. A 55 (1997) R3311.
[31] B. Kohler, A.H. Zewail, M. Quack, B.A. Hess, H. Hamaguchi, J.-L. Martin, K. Yamanouchi, P. Backhaus, J. Manz, B. Schmidt, Adv. Chem. Phys. 101 (1997) 83.
[32] P. Backhaus, B. Schmidt, Chem. Phys. 217 (1997) 131.
[33] P. Backhaus, B. Schmidt, M. Dantus, Chem. Phys. Lett. 306 (1999) 18.
[34] U. Marvet, Q. Zhang, M. Dantus, J. Phys. Chem. A 102 (1998) 4111.
[35] U. Marvet, M. Dantus, Chem. Phys. Lett. 245 (1995) 393.
[36] E.S. Fry, Th. Walther, R.A. Kenefick, Phys. Scr. T 76 (1998) 47.
[37] E.S. Fry, Th. Walther, S. Li, Phys. Rev. A 52 (1995) 4381.
[38] E.S. Fry, Th. Walther, Adv. At. Mol. Opt. Phys. 42 (2000) 1.
[39] Th. Walther, Workshop on Prospects of Cold Molecules, Heidelberg, November 1999.
[40] M. Czajkowski, J. Koperski, in: M. Inguscio, M. Allegrini, A. Sasso (Eds.), Laser Spectroscopy XII, World Scientific, Singapore, 1996, pp. 392–393. *
[41] J. Koperski, M. Czajkowski, Phys. Rev. A 62 (2000) 12505. ***
[42] M. Czajkowski, J. Koperski, in: Proceedings of the 28th EGAS Conference Graz, EPS Europhysics Conference Abstracts 20D (1996) 19 and 551. *
[43] J. Koperski, M. Czajkowski, J. Chem. Phys. 109 (1998) 459. ***
[44] J. Koperski, M. Czajkowski, Chem. Phys. Lett. 350 (2001) 367. ***
[45] J. Koperski, M. Czajkowski, Eur. Phys. J. D 10 (2000) 363. *
[46] J. Koperski, S. Kielbasa, M. Czajkowski, Spectrochim. Acta A 56 (2000) 1613. *
[47] J. Koperski, M. Czajkowski, J. Mol. Spectrosc. 212 (2002) 162–170.
[48] J. Koperski, M. Lukomski, M. Czajkowski, in: J. Seidel (Ed.), Spectral Line Shapes, Vol. 11, American Institute of Physics, Melville, New York, 2001, pp. 304–306. *
[49] J. Koperski, M. Lukomski, M. Czajkowski, Spectrochim. Acta A (2002) in press. *
[50] M. Lukomski, J. Koperski, M. Czajkowski, Spectrochim. Acta A 58 (2002) 1757. *
[51] J. Koperski, J.B. Atkinson, L. Krause, in: Proceedings of the 1993 OSA Annual Meeting, Tech. Digest Ser. 16 (1993) 20. *
[52] J. Koperski, J.B. Atkinson, L. Krause, 1993, unpublished result.
[53] J. Koperski, J.B. Atkinson, L. Krause, Chem. Phys. 186 (1994) 401. **
[54] J. Koperski, Chem. Phys. 211 (1996) 191; J. Koperski, Chem. Phys. 214 (1997) 431. **
J. Koperski / Physics Reports 369 (2002) 177 – 326 [55] [56] [57] [58] [59] [60] [61] [62] [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] [76] [77] [78] [79] [80] [81] [82] [83] [84] [85] [86] [87] [88] [89] [90] [91] [92] [93] [94] [95] [96] [97] [98] [99] [100]
321
J. Koperski, J. Chem. Phys. 105 (1996) 4920. ** J. Koperski, J.B. Atkinson, L. Krause, J. Mol. Spectrosc. 207 (2001) 172. ** J. Koperski, J.B. Atkinson, L. Krause, J. Mol. Spectrosc. 187 (1998) 181. M. Czajkowski, J. Koperski, Spectrochim. Acta A 55A (1999) 2221. * J. Koperski, M. Lukomski, M. Czajkowski, Spectrochim. Acta A 58 (2002) 927. * J. Koperski, J.B. Atkinson, L. Krause, Chem. Phys. Lett. 219 (1994) 161. *** J. Koperski, J.B. Atkinson, L. Krause, Can. J. Phys. 72 (1994) 1070. *** J. Koperski, J.B. Atkinson, L. Krause, J. Mol. Spectrosc. 184 (1997) 300. *** J. Koperski, M. Czajkowski, Chem. Phys. Lett. 357 (2002) 119. G. Herzberg, Molecular Spectra and Molecular Structure. I. Spectra of Diatomic Molecules, 2nd Edition, D. Van Nostrand, Princeton, NJ, 1950. M. Born, R. Oppenheimer, Ann. Phys. 84 (1927) 457. F. Hund, Z. Phys. 36 (1926) 637. H. Lefebvre-Brion, R.W. Field, Perturbations in the Spectra of Diatomic Molecules, Academic Press, Orlando, 1986. K. Onda, K. Yamanouchi, J. Chem. Phys. 104 (1996) 9376. H. Margenau, Rev. Mod. Phys. 11 (1939) 1. C.E. Moore, Atomic energy levels, Vol. III, NSRDS-NBS 35, 1971. P.J. Hay, T.H. Dunning Jr., R.C. RaAenetti, J. Chem. Phys. 65 (1976) 2679. E. Czuchaj, H. Stoll, Chem. Phys. 248 (1999) 1. ** K. Amano, K. Ohmori, T. Kurosawa, H. Chiba, M. Okunishi, K. Ueda, Y. Sato, A.Z. Devdariani, E.E. Nikitin, J. Chem. Phys. 108 (1998) 8110. T. Kurosawa, K. Ohmori, H. Chiba, M. Okunishi, K. Ueda, Y. Sato, A.Z. Devdariani, E.E. Nikitin, J. Chem. Phys. 108 (1998) 8101. M.-C. Duval, O. Benoist D’Azy, W.H. Breckenridge, C. Jouvet, B. Soep, J. Chem. Phys. 85 (1986) 6324. R.E. Drullinger, M.M. Hessel, E.W. Smith, J. Chem. Phys. 66 (1977) 5656. E.W. Smith, R.E. Drullinger, M.M. Hessel, J. Cooper, J. Chem. Phys. 66 (1977) 5667. D.J. Ehrlich, R.M. Osgood, IEEE J. Quantum Electron QE-15 (1978) 301. R.J. Niefer, J.B. Atkinson, L. Krause, J. Phys. B 16 (1983) 3531; R.J. Niefer, J.B. Atkinson, L. Krause, J. Phys. B 16 (1983) 3767. R.J. 
Niefer, J. Supronowicz, J.B. Atkinson, L. Krause, Phys. Rev. A 34 (1986) 1137; R.J. Niefer, J. Supronowicz, J.B. Atkinson, L. Krause, Phys. Rev. A 35 (1987) 4629. W. Kedzierski, J. Supronowicz, A. Czajkowski, M.J. Hinek, J.B. Atkinson, L. Krause, Chem. Phys. Lett. 218 (1994) 314. A. Czajkowski, W. Kedzierski, J.B. Atkinson, L. Krause, Chem. Phys. Lett. 238 (1995) 327. W. Kedzierski, J. Supronowicz, A. Czajkowski, J.B. Atkinson, L. Krause, J. Mol. Spectrosc. 173 (1995) 510. * A. Czajkowski, W. Kedzierski, J.B. Atkinson, L. Krause, J. Mol. Spectrosc. 181 (1997) 1. L. Krause, W. Kedzierski, A. Czajkowski, J.B. Atkinson, Phys. Scr. T 72 (1997) 48. S. Mrozowski, Rev. Mod. Phys. 16 (1944) 153. R.E. Drullinger, M.M. Hessel, E.W. Smith, J. Chem. Phys. 66 (1977) 5656. M. Stock, E.W. Smith, R.E. Drullinger, M.M. Hessel, J. Chem. Phys. 76 (1977) 2463. H. Komine, R.L. Byer, J. Chem. Phys. 67 (1977) 2536. E.R. Mosburg Jr., M.D. Wilke, J. Chem. Phys. 66 (1977) 5682. A.B. Callear, Chem. Rev. 87 (1987) 335 and references therein. M. Dolg, H.-J. Flad, Private communication, 1996. E. Czuchaj, F. Rebentrost, H. Stoll, H. Preuss, Chem. Phys. 214 (1997) 277. ** A. Kowalski, M. Czajkowski, W.H. Breckenridge, Chem. Phys. Lett. 119 (1985) 368. M. Czajkowski, R. Bobkowski, L. Krause, Phys. Rev. A 40 (1989) 4338. M. Czajkowski, R. Bobkowski, L. Krause, Phys. Rev. A 41 (1990) 277. G. Rodriguez, J.G. Eden, J. Chem. Phys. 95 (1991) 5539. H.C. Tran, J.G. Eden, J. Chem. Phys. 105 (1996) 6771. C.-H. Su, Y. Huang, R.F. Brebrick, J. Phys. B 18 (1985) 3187. B.S. Ault, L. Andrews, J. Mol. Spectrosc. 65 (1977) 102.
322 [101] [102] [103] [104] [105] [106] [107] [108] [109] [110] [111] [112] [113] [114] [115] [116] [117] [118] [119] [120] [121] [122] [123] [124] [125] [126] [127] [128] [129] [130] [131] [132] [133] [134] [135] [136] [137] [138] [139] [140] [141] [142] [143] [144] [145] [146] [147] [148] [149] [150] [151]
J. Koperski / Physics Reports 369 (2002) 177 – 326 C. Bousquet, J. Phys. B 19 (1986) 3859. D. Xing, K. Ueda, H. Takuma, Jpn. J. Appl. Phys. 33 (1994) L1676. D. Xing, Q. Wang, S. Tan, K. Ueda, Jpn. J. Appl. Phys. 36 (1997) L1301. M.S. Helmi, T. Grycuk, G.D. Roston, Spectrochim. Acta B 51 (1996) 633. G.D. Roston, M.S. Helmi, Chem. Phys. 258 (2000) 55. C.F. Kunz, C. HVattig, B.A. Hess, Mol. Phys. 89 (1996) 139. P.P. Edwards, R.L. Johnston, F. Hensel, C.N.R. Rao, D.P. Tunstall, Solid State Phys. 52 (1999) 229. F. Barocchi, F. Hensel, M. Sampoli, Chem. Phys. Lett. 232 (1995) 445. M. Sampoli, F. Hensel, F. Barocchi, Phys. Rev. A 53 (1996) 4594. M. Yu, M. Dolg, Chem. Phys. Lett. 273 (1997) 329. * H.-J. Flad, F. Schautz, Y. Wang, M. Dolg, A. Savin, Eur. Phys. J. D 6 (1999) 243. * W.E. Baylis, J. Chem. Phys. 51 (1969) 2665. W.E. Baylis, J. Phys. B 10 (1977) L583. E. Czuchaj, H. Stoll, H. Preuss, J. Phys. B 20 (1987) 1487. * E. Czuchaj, J. Sienkiewicz, J. Phys. B 17 (1984) 2251. * E. Czuchaj, F. Rebentrost, H. Stoll, H. Preuss, Chem. Phys. Lett. 255 (1996) 203. * E. Czuchaj, F. Rebentrost, H. Stoll, H. Preuss, Chem. Phys. Lett. 225 (1994) 233. * E. Czuchaj, Private communication, 1995. R.M. Tennent (Ed.), Science Data Book, Oliver & Boyd, Edinburgh, 1976. C.F. Bender, T.N. Rescigno, H.F. Schaefer, A.E. Orel, J. Chem. Phys. 71 (1979) 1122. W.J. Stevens, Appl. Phys. Lett. 35 (1979) 751. K.C. Celestino, W.C. Ermler, J. Chem. Phys. 81 (1984) 1872. K. Rademann, B. Kaiser, U. Even, H. Hensel, Phys. Rev. Lett. 59 (1987) 2319. C. BrPechignac, M. Broyer, P. Cahusac, G. Delacretaz, P. Labastie, J.P. Wolf, L. WVoste, Phys. Rev. Lett. 60 (1988) 275. H. Haberland, H. Kornmeier, H. Langosh, M. Oschwald, G. Tanner, J. Chem. Soc. Faraday Trans. 86 (1990) 2473. P. Ballone, G. Galli, Phys. Rev. B 40 (1989) 8563. D.B. Neumann, M. Krauss, J. Chem. Phys. 75 (1981) 315. F. Schautz, H.-J. Flad, M. Dolg, Theor. Chem. Acc. 99 (1998) 231. A. Bonechi, M. Moraldi, L. Frommhold, J. Chem. Phys. 
109 (1998) 5880. A. Bonechi, F. Barocchi, M. Moraldi, C. Bierman, R. Winter, L. Frommhold, Phys. Rev. A 57 (1998) 2635. E. Czuchaj, M. KroPsnicki, H. Stoll, Theor. Chem. Acc. 105 (2001) 219. * E. Czuchaj, M. KroPsnicki, J. Czub, Eur. Phys. J. D 13 (2001) 345. * E. Czuchaj, M. KroPsnicki, H. Stoll, Chem. Phys. 265 (2001) 291. * E. Czuchaj, M. KroPsnicki, Chem. Phys. Lett. 335 (2001) 440. * E. Czuchaj, M. KroPsnicki, H. Stoll, Chem. Phys. 263 (2001) 7. * E. Czuchaj, M. KroPsnicki, J. Phys. B 33 (2000) 5425. * E. Czuchaj, M. KroPsnicki, J. Czub, Mol. Phys. 99 (2001) 255. * T.M. Miller, B. Bederson, Adv. At. Mol. Phys. 13 (1977) 1. F. Maeder, W. Kutzelnigg, Chem. Phys. 42 (1979) 95. A. Nicklass, M. Dolg, H. Stoll, H. Preuss, J. Chem. Phys. 102 (1995) 8942. M. Seth, P. Schwerdtfeger, M. Dolg, J. Chem. Phys. 106 (1997) 3623. * J.P. Desclaux, L. Laaksonen, P. PyykkVo, J. Phys. B 14,419 (1981). L.T. Sin Fai Lam, J. Phys. B 14 (1981) 3543. A.L. Zagrebin, M.G. Lednev, Opt. Spectrosc. 75 (1993) 562. E. Czuchaj, M. KroPsnicki, Chem. Phys. Lett. 329 (2000) 495. * J.A. Boatz, K.L. Bak, J. Simons, Theor. Chim. Acta 83 (1992) 209. A.L. Zagrebin, M.G. Lednev, S.I. Tserkovnyi, Opt. Spectrosc. 74 (1993) 14. H. Tatewaki, M. Tomonari, T. Nakamura, J. Chem. Phys. 82 (1985) 5608. N.C. Pyper, I.P. Grant, R.B. Gerber, Chem. Phys. Lett. 49 (1977) 479. F.H. Mies, W.J. Stevens, M. Krauss, J. Mol. Spectrosc. 72 (1978) 303. K. Balasubramanian, K.K. Das, D.W. Liao, Chem. Phys. Lett. 195 (1992) 487.
J. Koperski / Physics Reports 369 (2002) 177 – 326 [152] [153] [154] [155] [156] [157] [158] [159] [160] [161] [162] [163] [164] [165] [166] [167] [168] [169] [170] [171] [172] [173] [174] [175] [176] [177] [178] [179] [180] [181] [182] [183] [184] [185] [186] [187] [188] [189] [190] [191] [192] [193] [194] [195] [196] [197] [198] [199] [200] [201]
323
P. Schwerdtfeger, J. Li, P. PyykkVo, Theor. Chem. Acta 87 (1994) 313. T. Baestufg , W.-D. Sepp, D. Kolb, B. Fricke, E.J. Baerends, G. Te Velde, J. Phys. B 28 (1995) 2325. J.R. BieroPn, W.E. Baylis, Chem. Phys. 197 (1995) 129. M. Dolg, H.-J. Flad, J. Phys. Chem. 100 (1996) 6147,6152. * T. Sumi, E. Miyoshi, Y. Sakai, O. Matsuoka, Phys. Rev. B 57 (1998) 914. D. Goebel, U. Hohm, G. Maroulis, Phys. Rev. A 54 (1996) 1973. * D. Goebel, U. Hohm, Phys. Rev. A 52 (1995) 3691. * D. Goebel, U. Hohm, J. Phys. Chem. 100 (1996) 7710. * P.M. Morse, Phys. Rev. 34 (1929) 57. J.S. Supronowicz, Private communication, 1996. J. Tellinghuisen, A. Ragone, M.S. Kim, D.J. Auerbach, R.E. Smalley, L. Wharton, D.H. Levy, J. Chem. Phys. 71 (1979) 1283. W. Hua, Phys. Rev. A 42 (1990) 2524. J.E. Lennard-Jones, Proc. R. Soc. A 106 (1924) 441, 463. D.E. Pritchard, F.Y. Chu, Phys. Rev. A 2 (1970) 1932. J.H. Dymond, M. Rigby, E.B. Smith, Phys. Fluids 9 (1966) 1222. D. Eisel, D. Zevgolis, W. DemtrVoder, J. Chem. Phys. 71 (1979) 2005. J. Supronowicz, D. Petro, J.B. Atkinson, L. Krause, Phys. Rev. A 50 (1994) 2161. * W. Behmenburg, Z. Naturforsch. 27a (1972) 31. W.R. Hindmarsh, A.D. Petford, G. Smith, Proc. R. Soc. A 297 (1968) 296. G.C. Maitland, E.B. Smith, Chem. Phys. Lett. 22 (1973) 443. *** D.W. Gough, E.B. Smith, G.C. Maitland, Mol. Phys. 27 (1974) 967. R.A. Aziz, J. Chem. Phys. 64 (1976) 490. K.M. Smith, A.M. Rulis, G. Scoles, R.A. Aziz, V. Nain, J. Chem. Phys. 67 (1977) 152. M. Findeisen, T. Grycuk, J. Phys. B 22 (1989) 1583. G. York, R. Scheps, A. Gallagher, J. Chem. Phys. 63 (1975) 1052. P. Baumann, D. Zimmermann, R. BrVuhl, J. Mol. Spectrosc. 155 (1992) 277. J.M. Parson, P.E. Siska, Y.T. Lee, J. Chem. Phys. 56 (1972) 1511. Y.P. Varshni, Rev. Mod. Opt. 29 (1957) 664. R.A. Buckingham, Proc. R. Soc. 168 (1938) 264. W.M. Fawzy, R.J. Le Roy, B. Simard, H. Niki, P.A. Hackett, J. Chem. Phys. 98 (1993) 140. R. Heller, J. Chem. Phys. 9 (1941) 154. D.J. Funk, A. Kvaran, W.H. Breckenridge, J. Chem. 
Phys. 90 (1989) 2915. K.T. Tang, J.P. Toennies, J. Chem. Phys. 80 (1984) 3726. R. Ahlrichs, R. Penco, G. Scoles, Chem. Phys. 19 (1977) 119. R.A. Aziz, H.H. Chen, J. Chem. Phys. 67 (1977) 5719. D.A. Barrow, R.A. Aziz, J. Chem. Phys. 89 (1988) 6189. R. BrVuhl, J. Kapetanakis, D. Zimmermann, J. Chem. Phys. 94 (1991) 5865. R.E. Olson, J. Chem. Phys. 49 (1968) 4499. B. Brunetti, F. Pirani, F. Vecchiocattivi, E. Luzzatti, Chem. Phys. Lett. 58 (1978) 504. A.G. Gaydon, Dissociation Energies and Spectra of Diatomic Molecules, Chapman & Hall, London, 1968. M. Okunishi, K. Yamanouchi, K. Onda, S. Tsuchiya, J. Chem. Phys. 98 (1993) 2675. K. Onda, K. Yamanouchi, J. Chem. Phys. 102 (1995) 1129. J. Franck, Trans. Faraday Soc. 21 (1925) 536. E.U. Condon, Phys. Rev. 32 (1928) 858. E.U. Condon, Phys. Rev. 41 (1932) 759. A. Zehnacker, M.-C. Duval, C. Jouvet, C. Lardeux-Dedonder, D. Solgadi, B. Soep, O. Benoist d’Azy, J. Chem. Phys. 86 (1987) 6565. * M. Schlauf, O. Dimopolou-Rademann, K. Rademann, U. Even, F. Hensel, J. Chem. Phys. 90 (1989) 4630. * M.-C. Duval, C. Jouvet, B. Soep, Chem. Phys. Lett. 119 (1985) 317. K. Yamanouchi, S. Isogai, M. Okunishi, S. Tsuchiya, J. Chem. Phys. 88 (1988) 205. R.D. van Zee, S.C. Blankespoor, T.S. Zwier, Chem. Phys. Lett. 158 (1989) 306. *
324 [202] [203] [204] [205] [206] [207] [208] [209] [210] [211] [212] [213] [214] [215] [216] [217] [218] [219] [220] [221] [222] [223] [224] [225] [226] [227] [228] [229] [230] [231] [232] [233] [234] [235] [236] [237] [238] [239] [240] [241] [242] [243] [244] [245] [246]
J. Koperski / Physics Reports 369 (2002) 177 – 326 R.T. Birge, H. Sponer, Phys. Rev. 28 (1926) 259. * C.E. Moore, Atomic Energy Levels, Natl. Bur. Stand. US NSRDS No. 35, US GPO, Washington, DC, 1971. R.J. Le Roy, R.B. Bernstein, J. Chem. Phys. 52 (1970) 3869. ** R.J. Le Roy, R.B. Bernstein, Chem. Phys. Lett. 5 (1970) 42. ** W.C. Stwalley, Chem. Phys. Lett. 6 (1970) 241. R.J. Le Roy, J. Chem. Phys. 57 (1972) 573. * R.J. Le Roy, J. Chem. Phys. 101 (1994) 10214. * R.J. Le Roy, W.-H. Lam, Chem. Phys. Lett. 71 (1980) 544. * R.J. Le Roy, in: M.S. Child (Ed.), Semiclassical Methods in Molecular Scattering and Spectroscopy, Reidel, Dordrecht, 1980, p. 109. ** J. Supronowicz, J.B. Atkinson, L. Krause, Phys. Rev. A 50 (1994) 3719. R.J. Le Roy, LEVEL 6.1–7.2: A computer program solving the radial SchrVodinger equation for bound and quasibound levels, and calculating various expectation values and matrix elements, University of Waterloo Chemical Physics Research Report CP-555R, 1996, unpublished. A. Kvaran, D.J. Funk, A. Kowalski, W.H. Breckenridge, J. Chem. Phys. 89 (1988) 6069. R.S. Mulliken, J. Chem. Phys. 55 (1971) 309. P.E. Langer, Phys. Rev. 51 (1937) 669. I. Wallace, J. Ryter, W.H. Breckenridge, J. Chem. Phys. 96 (1992) 136. R.J. Le Roy, Comput. Phys. Commun. 52 (1989) 383. R.J. Le Roy, W.J. Keogh, M.S. Child, J. Chem. Phys. 89 (1988) 4564. M. Masters, J. Huennekens, W.-T. Luh, L. Li, A.M. Lyyra, K. Sando, V. ZaLropulos, W.C. Stwalley, J. Chem. Phys. 92 (1990) 5801. R.J. Le Roy, A computer programs for inversion of oscillatory bound-continuum spectra, University of Waterloo Chemical Physics Research Report CP-327, 1988, unpublished. M.S. Child, H. EssPen, R.J. Le Roy, J. Chem. Phys. 78 (1983) 6732. R. Rydberg, Z. Phys. 73 (1931) 376. O. Klein, Z. Phys. 76 (1932) 226. A.L.G. Rees, Proc. Phys. Soc. 59 (1932) 998. E.A. Moelwyn-Hughes, Physical Chemistry, Pergamon, New York, 1957, p. 332. R. Cambi, D. Cappelletti, G. Liuti, F. Pirani, J. Chem. Phys. 95 (1991) 1852. ** S.H. 
Brym, Habilitation Thesis, Pedagogical University, Olsztyn, 1996, p. 101, Table 6.3 (in Polish). A. UnsVold, Physik der Sternatmospharen, Springer, Berlin, 1968, p. 269. W.C. Stwalley, H.C. Kramer, J. Chem. Phys. 49 (1968) 5555. C.L. Kong, J. Chem. Phys. 59 (1973) 968,1953. C.L. Kong, M.R. Chakrabarty, J. Phys. Chem. 77 (1973) 2668. K. Fuke, T. Saito, K. Kaya, J. Chem. Phys. 81 (1984) 2591. K. Yamanouchi, J. Fukuyama, H. Horiguchi, S. Tsuchiya, J. Chem. Phys. 85 (1986) 1806. T. Grycuk, E. Czerwosz, Physica 106C (1981) 431. A. Borysow, T. Grycuk, Physica 114C (1982) 414. M.S. Helmi, T. Grycuk, G.D. Roston, Chem. Phys. 209 (1996) 53. R.J. Le Roy, in: R.F. Barrow, D.A. Long, D.J. Millen (Eds.), Molecular Spectroscopy, Vol. I: A Specialist Periodical Report of the Chemical Society, London, 1973, p. 113. * B. Ji, C.-C. Tsai, W.C. Stwalley, Chem. Phys. Lett. 236 (1995) 242. C.C. Lu, T.A. Carlson, F.B. Malik, T.C. Tucker, C.W. Nestor Jr., At. Data 3 (1971) 1. M. Karplus, R.N. Porter, Atoms and Molecules, Benjamin, New York, 1970. M. Winter, WebElements 2.0: http://www.shef.ac.uk/chemistry/web-elements/, University of SheReld, UK. J. Tellinghuisen, in: K.P. Lawley (Ed.), Photodissociation and Photoionization, Wiley, New York, 1985, p. 299. D.H. Levy, Ann. Rev. Phys. Chem. 31 (1980) 197. R.E. Smalley, L. Wharton, D.H. Levy, Acc. Chem. Res. 10 (1977) 139. J.B. Anderson, R.P. Andres, J.B. Fenn, Supersonic nozzle beams, in: J. Ross (Ed.), Molecular Beams, Interscience Publishers, New York, 1966, p. 275. T.A. Miller, Science 223 (1984) 545.
J. Koperski / Physics Reports 369 (2002) 177 – 326
325
[247] D.M. Lubman, C.T. Rettner, R.N. Zare, J. Phys. Chem. 86 (1982) 1129. ** [248] G.M. McClelland, K.L. Saenger, J.J. Valentini, D.R. Herschbach, J. Phys. Chem. 83 (1979) 947. [249] K. Bier, O. Hagena, in: J.H. de Leeuw (Ed.), Advanced Appllied Mechanics, Vol. 2, Academic Press, New York, 1965. [250] C.L. Callender, S.A. Mitchell, P.A. Hackett, J. Chem. Phys. 90 (1989) 2535. [251] O.F. Hagena, Rev. Sci. Instrum. 63 (1992) 2374. [252] J.P. Toennies, K. Winkelmann, J. Chem. Phys. 66 (1977) 3965. [253] W. Kedzierski, R. Berends, J.B. Atkinson, L. Krause, J. Phys. E 21 (1988) 796. [254] M. Okunishi, K. Yamanouchi, S. Tsuchiya, Chem. Lett. 3 (1989) 393. [255] R.J. Niefer, J.B. Atkinson, Opt. Commun. 67 (1988) 139. [256] D.J. Funk, W.H. Breckenridge, J. Chem. Phys. 90 (1989) 2927. [257] A. Kowalski, M. Czajkowski, W.H. Breckenridge, Chem. Phys. Lett. 121 (1985) 217. [258] R. Bobkowski, M. Czajkowski, L. Krause, Phys. Rev. A 41 (1990) 243. [259] M. Czajkowski, R. Bobkowski, L. Krause, Phys. Rev. A 44 (1994) 5730. [260] M. Czajkowski, L. Krause, R. Bobkowski, Phys. Rev. A 49 (1994) 775. * [261] M. Czajkowski, R. Bobkowski, L. Krause, in: SPIE Proceedings Series, Vol. 1711, High Performance Optical Spectrometry, 1992, p. 129. [262] K. Fuke, T. Saito, K. Kaya, J. Chem. Phys. 79 (1983) 2487. [263] T. Tasaka, K. Onda, A. Hishikawa, K. Yamanouchi, Bull. Chem. Soc. Jpn. 70 (1997) 103. [264] T. Tsuchizawa, K. Yamanouchi, S. Tsuchiya, J. Chem. Phys. 89 (1988) 4646. [265] L. Krim, B. Soep, J.P. Visticot, J. Chem. Phys. 103 (1995) 9589. [266] S.J. Lawrence, D.N. Stacey, I.M. Bell, K. Burnett, J. Chem. Phys. 104 (1996) 7860. [267] A. Hishikawa, H. Sato, K. Yamanouchi, J. Chem. Phys. 108 (1998) 9202. [268] Y. Ohshima, M. Iida, Y. Endo, J. Chem. Phys. 92 (1990) 3990. [269] C.J.K. Quayle, I.M. Bell, E. TakPacs, X. Chen, K. Burnett, J. Chem. Phys. 99 (1993) 9608. [270] M. Okunishi, H. Nakazawa, K. Yamanouchi, S. Tsuchiya, J. Chem. Phys. 93 (1990) 7526. [271] D.M. Segal, K. Burnett, J. 
Chem. Soc. Faraday Trans. 285 (1989) 925. [272] D.R. Lide (Ed.), Handbook of Chemistry and Physics, 72nd Edition, CRC Press, Boca Raton, FL, 1991–1992. [273] J.G. McCaArey, D. Bellert, A.W.K. Leung, W.H. Breckenridge, Chem. Phys. Lett. 302 (1999) 113. [274] I. Wallace, R.R. Bennett, W.H. Breckenridge, Chem. Phys. Lett. 153 (1988) 127. [275] I. Wallace, J.G. Kaup, W.H. Breckenridge, J. Phys. Chem. 95 (1991) 8060. [276] R.R. Bennett, W.H. Breckenridge, J. Chem. Phys. 92 (1990) 1588. [277] R.H. Garstang, J. Opt. Soc. Am. 52 (1962) 845. [278] C. Jouvet, C. Lardeux-Dedonder, S. Martenchard, D. Solgadi, J. Chem. Phys. 94 (1991) 1759. [279] S. Ceccherini, M. Moraldi, Phys. Rev. Lett. 85 (2000) 952. [280] R.D. van Zee, S.C. Blankespoor, T.S. Zwier, J. Chem. Phys. 88 (1988) 4650. * [281] M. Dolg, H.-J. Flad, Mol. Phys. 91 (1997) 815. * [282] Y.X. Wang, H.-J. Flad, M. Dolg, Phys. Rev. B 61 (2000) 2362. [283] J.N. Greif-WVustenbecker, Ph.D. Thesis, Philipps-UniversitVat Marburg, 2000, unpublished. [284] R.W. Wood, Phil. Mag. 18 (1909) 240. [285] S. Mrozowski, Z. Phys. 62 (1930) 314. [286] Grotrian, Z. Phys. 5 (1921) 148. [287] H. Hamada, Phil. Mag. 12 (1931) 50. [288] Lord Rayleigh, Proc. R. Soc. London A 135 (1932) 617. [289] D. Xing, K. Ueda, H. Takuma, Jpn. J. Appl. Phys. 33 (1994) L1676. [290] D. Xing, Q. Wang, S. Tan, K. Ueda, Jpn. J. Appl. Phys. 36 (1997) L1301. [291] M. Seth, M. Dolg, P. Fulde, P. Schwerdtfeger, J. Am. Chem. Soc. 117 (1995) 6597. * [292] Y.X. Wang, J.-H. Flad, M. Dolg, Int. J. Mass Spectroscosc. 201 (2000) 197. [293] R. Wesendrup, L. Kloo, P. Schwerdtfeger, Int. J. Mass Spectr. 201 (2000) 17. [294] T. Sumi, E. Miyoshi, K. Tanaka, Phys. Rev. B 59 (1999) 6153. [295] S.D. Baranovskii, R. Dettmer, F. Hensel, H. Uchtmann, J. Chem. Phys. 103 (1995) 7796. [296] M.V. Korolkov, B. Schmidt, Chem. Phys. 237 (1998) 123.
326 [297] [298] [299] [300] [301] [302] [303] [304] [305] [306] [307] [308] [309] [310] [311] [312] [313] [314] [315] [316] [317] [318] [319] [320] [321] [322]
J. Koperski / Physics Reports 369 (2002) 177 – 326 P. Gross, M. Dantus, J. Chem. Phys. 106 (1997) 8013. J.G. Eden, H.T. Tran, V.S. Zuev, J. Russ. Laser Res. 19 (1998) 116. J.G. Eden, H.C. Tran, V.S. Zuev, J. Russ. Laser Res. 20 (1999) 342. B.M. Smirnov, A.S. Jacenko, Usp. Fiz. Nauk 166 (1996) 225. S.S. Batsanov, Russ. J. Phys. Chem. 72 (1998) 894. S.S. Batsanov, Zh. Fiz. Khim. 72 (1998) 1008. L. Krause, W. Kedzierski, A. Czajkowski, J.B. Atkinson, Phys. Scr. T72 (1997) 48. J. Supronowicz, E. Hegazi, J.B. Atkinson, L. Krause, Chem. Phys. Lett. 222 (1994) 149. Y. Sato, T. Nakamura, M. Okunishi, K. Ohmori, H. Chiba, K. Ueda, Phys. Rev. A 53 (1996) 867. S. Brym, R. Ciurylo, R.S. TrawiPnski, A. Bielski, Phys. Rev. A 56 (1997) 4501. A. Bielski, D. Lisak, R.S. TrawiPnski, Eur. Phys. J. D 14 (2001) 27. R.S. TrawiPnski, A. Bielski, D. Lisak, Acta Phys. Polon. A 99 (2001) 243. A. Bonechi, M. Moraldi, in: R. Herman (Ed.), Spectral Line Shape, Vol. 10, AIP, New York, 1999, p. 427. M.S.A. El-Kader, Phys. Lett. A 257 (1999) 301. J. Helbing, A. Haydar, M. Chergui, Chem. Phys. Lett. 310 (1999) 43. J. Helbing, M. Chergui, A. Haydar, J. Chem. Phys. 113 (2000) 3621. C. CrPepin-Gilbert, A. Tramer, Int. Rev. Phys. Chem. 18 (1999) 485. L.J. Munro, J.K. Johnson, K.D. Jordan, J. Chem. Phys. 114 (2001) 5545. S. Ceccherini, M. Moraldi, Chem. Phys. Lett. 337 (2001) 386. J. Tellinghuisen, J. Chem. Phys. 114 (2001) 3465. E. Sarantopoulou, Z. Kollia, A.C. Cefalas, Lambda Highlights 58 (2001) 6. P. Schwerdtfeger, R. Wesendrup, G.E. Moyano, A.J. Sadlej, J. Greif, F. Hensel, J. Chem. Phys. 115 (2001) 7401. R.R. Bennett, W.H. Breckenridge, J. Chem. Phys. 96 (1992) 882. M. Czajkowski, R. Bobkowski, L. Krause, Phys. Rev. A 45 (1992) 6451. E. Czuchaj, M. KroPsnicki, Spectrochim. Acta A 57 (2001) 2463. E. Sarantopoulou, C. Skordoulis, A.C. Cefalas, A. Vourdas, Synth. Met. 124 (2001) 267.
Physics Reports 369 (2002) 327 – 430 www.elsevier.com/locate/physrep
Dilaton gravity in two dimensions

D. Grumiller (a,*), W. Kummer (a), D.V. Vassilevich (b,c)

(a) Institut für Theoretische Physik, TU Wien, Wiedner Hauptstr. 8–10, A-1040 Wien, Austria
(b) Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10, D-04109 Leipzig, Germany
(c) V.A. Fock Institute of Physics, St. Petersburg University, 198904 St. Petersburg, Russia

Received 1 June 2002; editor: A. Schwimmer
Abstract

The study of general two-dimensional models of gravity allows one to tackle basic questions of quantum gravity, bypassing important technical complications which make the treatment in higher dimensions difficult. As the physically important examples of spherically symmetric Black Holes, together with string inspired models, belong to this class, valuable knowledge can also be gained for these systems in the quantum case. In the last decade, new insights regarding the exact quantization of the geometric part of such theories have been obtained. They allow a systematic quantum field theoretical treatment, also in interactions with matter, without explicit introduction of a specific classical background geometry. The present review tries to assemble these results in a coherent manner, putting them at the same time into the perspective of the quite large literature on this subject. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 04.60.−w; 04.60.Ds; 04.60.Gw; 04.60.Kz; 04.70.−s; 04.70.Bw; 04.70.Dy; 11.10.Lm; 97.60.Lf

Keywords: Dilaton gravity; Quantum gravity; Black holes; Two-dimensional models
Contents

1. Introduction
1.1. Structure of this review
1.2. Differential geometry
1.2.1. Short primer for general dimensions
1.2.2. Two dimensions
2. Models in 1 + 1 dimensions
2.1. Generalized dilaton theories
2.1.1. Spherically reduced gravity
2.1.2. Dilaton gravity from strings
2.1.3. Generalized dilaton theories – the action
2.1.4. Conformally related theories
2.2. Equivalence to first-order formalism
2.3. Relation to Poisson-Sigma models
3. General classical treatment
3.1. All classical solutions
3.2. Global structure
3.2.1. Schwarzschild metric
3.2.2. More general cases
3.3. Black hole in Minkowski, Rindler or de Sitter space
4. Additional fields
4.1. Dilaton-Yang–Mills theory
4.2. Dilaton supergravity
4.3. Dilaton gravity with matter
4.3.1. Scalar and fermionic matter, quintessence
4.3.2. Exact solutions – conservation law for geometry and matter
5. Energy considerations
5.1. ADM mass and quasilocal energy
5.2. Conservation laws
5.3. Symmetries
6. Hawking radiation
6.1. Minimally coupled scalars
6.2. Non-minimally coupled scalars
7. Non-perturbative path integral quantization
7.1. Constraint algebra
7.2. Path integral quantization
7.3. Path integral without matter
7.4. Path integral with matter
7.4.1. General formalism
7.4.2. Perturbation theory
7.4.3. Exact path integral with matter
8. Virtual black hole and S-matrix
8.1. Non-minimal coupling, spherically reduced gravity
8.2. Effective line element
8.3. Virtual black hole
8.4. Non-local φ⁴ vertices
8.5. Scattering amplitude
8.6. Implications for the information paradox
9. Canonical quantization
10. Conclusions and discussion
Acknowledgements
Appendix A. Spherical reduction of the curvature two-form
Appendix B. Heat kernel expansion
References

* Corresponding author. E-mail addresses: [email protected] (D. Grumiller), [email protected] (W. Kummer), [email protected] (D.V. Vassilevich).

0370-1573/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0370-1573(02)00267-3
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
1. Introduction

The fundamental difficulties encountered in the numerous attempts to merge quantum theory with General Relativity are by now well known even far outside the narrow circle of specialists in these fields. Despite many valiant efforts and new approaches like loop quantum gravity [1] or string theory,$^{1}$ a final solution is not in sight. However, even many special questions still await an answer.$^{2}$

Of course, at energies which will be accessible experimentally in the foreseeable future, due to the smallness of Newton’s constant (respectively, the large value of the Planck mass), an effective quantum theory of gravity can be constructed [3] in a standard way which, in its infrared asymptotic regime, may well describe our low energy world. Its extremely small corrections to classical General Relativity (GR) are in full agreement with experimental limits [4]. However, the fact that Newton’s constant carries a dimension inevitably makes perturbative quantum gravity inconsistent at energies of the order of the Planck mass.

In more technical language, starting from a fixed classical background, perturbation theory showed a long time ago that although pure gravity is one-loop renormalizable [5], this renormalizability breaks down at two loops [6], and already at one loop when matter interactions are taken into account. Supergravity was only able to push the onset of non-renormalizability to higher loop order (cf. e.g. [7–9]). It is often argued that a full treatment of the metric, including non-perturbative effects from the backreaction of matter, may solve the problem, but to this day this remains a conjecture.$^{3}$

A basic conceptual problem of a theory like gravity is the double role of geometric variables, which are not only fields but also determine the (dynamical) background upon which the physical variables live. This is, e.g., of special importance for the uncertainty relation at energies above the Planck scale, leading to Wheeler’s notion of “space–time foam” [11].

Another question which has baffled theorists is the problem of time. In ordinary quantum mechanics the time variable is set apart from the “observables”, whereas in the straightforward quantum formulation of gravity (the so-called Wheeler–DeWitt equation [12,13]) a variable like time must be introduced more or less by hand through “time-slicing”, a multi-fingered time, etc. [14]. Already at the classical level of GR “time” and “space” change their roles when passing through a horizon, which leads again to considerable complications in a Hamiltonian approach [15,16].

Measuring the “observables” of usual quantum mechanics, one realizes that the genuine measurement process is always related to a determination of the matrix element of some scattering operator with asymptotically defined ingoing and outgoing states. For a gauge theory like gravity, existing proofs of gauge-independence for the S-matrix [17] may be applicable for asymptotically flat quantum gravity systems. But the problem of other experimentally accessible (gauge independent!) genuine observables is open when the dynamics of the geometry comes into play in a non-trivial manner, affecting, e.g., the notion of what is meant by asymptotics.

The quantum properties of black holes (BH) still pose many questions. Because of the emission of Hawking radiation [18,19], a semi-classical effect, a BH should successively lose energy. If there is no remnant of its previous existence at the end of its lifetime, the information of pure states swallowed by it will have only turned into the mixed state of Hawking radiation, violating basic
notions of quantum mechanics. Thus, of special interest (and outside the range of methods based upon the fixed background of a large BH) are the last stages of BH evaporation.

$^{1}$ The recent book [2] can be recommended.
$^{2}$ A brief history of quantum gravity can be found in Ref. [1].
$^{3}$ For a recent argument in favor of this conjecture using Weinberg’s argument of “asymptotic safety” cf. e.g. [10].

Other open problems, related to BH physics and more generally to quantum gravity, have been the virtual BH appearing as an intermediate stage in scattering processes, the (non-)existence of a well-defined S-matrix and CPT (non-)invariance. When the metric of the BH is quantized, its fluctuations may include “negative” volumes. Should those fluctuations be allowed or excluded?

The intuitive notion of “space–time foam” seems to suggest quantum gravity induced topology fluctuations. Is it possible to extract such processes from a model without ad hoc assumptions?

From the experience of quantum field theory in Minkowski space, one may hope that a classical singularity like the one in the Schwarzschild BH may be eliminated by quantum effects, possibly at the price of a necessary renormalization procedure. Of course, the latter may just reflect the fact that interactions with further fields (e.g. other modes in string theory) are not taken into account properly. Can this hope be fulfilled?

In attempts to find answers to these questions, it seems very reasonable to always try to proceed as far as possible with the known laws of quantum mechanics applied to GR. This is extremely difficult$^{4}$ in $D=4$. Therefore, for many years a rich literature developed on lower dimensional models of gravity. The 2D Einstein–Hilbert action is just the Gauss–Bonnet term. Therefore, intrinsically 2D models are locally trivial and a further structure is introduced. This is provided by the dilaton field which naturally arises in all sorts of compactifications from higher dimensions. Such models, the most prominent being the one of Jackiw and Teitelboim (JT), were thoroughly investigated during the 1980s [21–30]. An excellent summary (containing also a more comprehensive list of references on literature before 1988) is contained in the textbook of Brown [31].

Among those models spherically reduced gravity (SRG), the truncation of $D=4$ gravity to its s-wave part, possesses perhaps the most direct physical motivation. One can either treat this system directly in $D=4$ and impose spherical symmetry in the equations of motion (e.o.m.’s) [32] or impose spherical symmetry already in the action [19,32–41], thus obtaining a dilaton theory.$^{5}$ Classically, both approaches are equivalent.

The rekindled interest in generalized dilaton theories (henceforth GDTs) in $D=2$ started in the early 1990s, triggered by the string inspired [42–49] dilaton black hole model,$^{6}$ studied in the influential paper of Callan, Giddings, Harvey and Strominger (CGHS) [52]. At approximately the same time it was realized that 2D dilaton gravity can be treated as a non-linear gauge theory [53,54].

As already suggested by earlier work, all GDTs considered so far could be extracted from the dilaton action [55,56]

$$L^{(\mathrm{dil})} = \int d^2x\,\sqrt{-g}\left[\frac{R}{2}X - \frac{U(X)}{2}(\nabla X)^2 + V(X)\right] + L^{(m)}, \tag{1.1}$$

where $R$ is the Ricci-scalar, $X$ the dilaton, $U(X)$ and $V(X)$ arbitrary functions thereof, $g$ is the determinant of the metric $g_{\mu\nu}$, and $L^{(m)}$ contains possible matter fields.

When $U(X)=0$ the e.o.m. for the dilaton from (1.1) is algebraic. For invertible $V'(X)$ the dilaton field can be eliminated altogether, and the Lagrangian density is given by an arbitrary function of
the Ricci-scalar. A recent review on the classical solution of such models is Ref. [57]. In comparison with that, the literature on such models generalized to depend also$^{7}$ on torsion is relatively scarce. It mainly consists of elaborations based upon a theory proposed by Katanaev and Volovich (KV) which is quadratic in curvature and torsion [27,28], also known as “Poincaré gauge gravity” [58].

$^{4}$ A recent survey of the present situation is the one of Carlip [20].
$^{5}$ The dilaton appears due to the “warped product” structure of the metric. For details of the spherical reduction procedure we refer to Appendix A.
$^{6}$ A textbook-like discussion of this model can be found in Refs. [50,51].

A common feature of these classical treatments of models with and without torsion is the almost exclusive use$^{8}$ of the gauge-fixing for the $D=2$ metric familiar from string theory, namely the conformal gauge. Then the e.o.m.’s become complicated partial differential equations. The determination of the solutions, which turns out to be always possible in the matterless case ($L^{(m)}=0$ in (1.1)), usually requires considerable mathematical effort for non-trivial dilaton field dependence. The same had been true for the first papers on theories with torsion [27,28]. However, in that context it was soon realized that gauge-fixing is not necessary, because the invariant quantities $R$ and $T^aT_a$ themselves may be taken as variables in the KV-model [60–63]. This approach has been extended to general theories with torsion.$^{9}$

As a matter of fact, in GR many other gauge-fixings for the metric have been well known for a long time: the Eddington–Finkelstein (EF) gauge, the Painlevé–Gullstrand gauge, the Lemaître gauge, etc. As compared to the “diagonal” gauges like the conformal and the Schwarzschild-type gauge, they possess the advantage that coordinate singularities can be avoided, i.e. the singularities in those metrics are essentially related to the “physical” ones in the curvature. It was shown for the first time in [65] that the use of a temporal gauge for the Cartan variables (cf. Eq. (3.3)) in the (matterless) KV-model made the solution extremely simple. This gauge corresponds to the EF gauge for the metric. Soon afterwards it was realized that the solution could be obtained even without previous gauge-fixing, either by guessing the Darboux coordinates [66] or by direct solution of the e.o.m.’s [67] (cf. Section 3.1). Then the temporal gauge of [65] merely represents the most natural gauge fixing within this gauge-independent setting.

The basis of these results had been a first-order formulation of $D=2$ covariant theories by means of a covariant Hamiltonian action in terms of the Cartan variables and further auxiliary fields $X^a$ which (beside the dilaton field $X$) take the role of canonical momenta (cf. Eq. (2.17)). They cover a very general class of theories comprising not only the KV-model, but also more general theories with torsion.$^{10}$

The most attractive feature of theories of type (2.17) is that an important subclass of them is in a one-to-one correspondence with the GDTs (1.1). This dynamical equivalence, including the essential feature that even the global properties are exactly identical, seems to have been noticed first in [68] and used extensively in studies of the corresponding quantum theory [69–71].

Generalizing the formulation (2.17) to the much more comprehensive class of “Poisson-Sigma models” [72,73] on the one hand helped to explain the deeper reasons for the advantages of the first-order version, and on the other hand led to very interesting applications in other fields [74], including especially also string theory [75,76]. Recently, this approach was shown to represent a very direct route to 2D dilaton supergravity [77] without auxiliary fields.

$^{7}$ For the definition of the Lorentz scalar formed by torsion and of the curvature scalar, both expressed in terms of Cartan variables zweibeine $e^a_\mu$ and spin connection $\omega_\mu{}^{ab}$, we refer to Section 1.2.
$^{8}$ A notable exception is Polyakov [59].
$^{9}$ A recent review of this approach is provided by Obukhov and Hehl [64].
$^{10}$ In that case, there is the restriction that it must be possible to eliminate all auxiliary fields $X^a$ and $X$ (see Section 2.1.3).
Apart from the dilaton BH [52] where an exact (classical) solution is possible even when matter is included, general solutions for generic $D=2$ gravity theories with matter cannot be obtained. This has been possible only in restricted cases, namely when fermionic matter is chiral$^{11}$ [79] or when the interaction with (anti)self-dual scalar matter is considered [80].

Semi-classical treatments of GDTs take the one-loop correction from matter into account when the classical e.o.m.’s are solved. They have been used mainly in the CGHS-model and its generalizations [48,81–90,438]. In our present report we concentrate only upon Hawking radiation as a quantum effect of matter on a fixed (classical) geometrical background, because just during the last years interesting insight has been obtained there, although by no means all problems have been settled.

Finally, we turn to the full quantization of GDTs. It was believed by several authors (cf. e.g. [55,56,91–93]) that even in the absence of interactions with matter non-trivial quantum corrections exist and can be computed by a perturbative path integral on some fixed background. Again the evaluation in the temporal gauge [439], at first for the KV-model, showed that the use of other gauges just obscures a very simple mechanism. Actually, all divergent counter-terms can be absorbed into one compact expression. After subtracting that expression, in the absence of matter the solution of the classical theory represents an exact “quantum” result. Later this perturbative argument was reformulated as an exact path integral, first again for the KV-model [94] and then for general theories of gravity in $D=2$ [69–71,95–97].

In our present review we concentrate on the path integral approach, with Dirac quantization only referred to for the sake of comparison. In any case, the common starting point is the Hamiltonian analysis which in a theory formulated in terms of Cartan variables in $D=2$ possesses substantial technical advantages. The constraints, even in the presence of matter interactions, form an algebra with momentum-dependent structure constants. Despite that non-linearity the simplest version of the Batalin–Vilkovisky procedure [98] suffices, namely the one also applicable to ordinary non-Abelian gauge theories in Minkowski space. With a temporal gauge fixing for the Cartan variables also used in the quantized theory, the geometric part of the action yields the exact path integral. Possible background geometries appear naturally as homogeneous solutions of differential equations which coincide with the classical ones, reflecting “local quantum triviality” of 2D gravity theories in the absence of matter, a property which had been observed before as well in the Dirac quantization of the KV-model [66]. These features are very difficult to locate in the GDT-formulation (1.1), but become evident in the equivalent first-order version with a “Hamiltonian” action.

Of course, non-renormalizability persists in the perturbation expansion when the matter fields are integrated out. But as an effective theory in cases like spherically reduced gravity, specific processes can be calculated, relying on the (gauge-independent) concept of S-matrix elements. With this method, scattering of s-waves in spherically reduced gravity has provided a very direct way to create a “virtual” BH as an intermediate state without further assumptions [96].

The structure of our present report is determined essentially by the approach described in the last paragraphs. One reason is the fact that a very comprehensive overview of very general classical and quantum theories in $D=2$ is made possible in this manner. Also a presentation seems to be overdue in which results, scattered now among many different original papers, can be integrated into
a coherent picture. Parallel developments and differences to other approaches will be included in the appropriate places.

$^{11}$ This solution was rediscovered in Ref. [78].

1.1. Structure of this review

This review is organized as follows:
• Section 1 in its remaining part contains a short primer on differential geometry (with special emphasis on $D=2$). En passant most of our notations are fixed in that subsection.
• Section 2 motivates the study of GDTs, introduces their action in the three most frequently used forms (dilaton action, first-order action, and Poisson-Sigma action) and describes the relations between them.
• Section 3 gives all classical solutions of GDTs in the absence of matter. The global structure of such theories is discussed using Schwarzschild space–time as a simple example. As a further illustration we consider a family of dilaton models describing a single black hole in Minkowski, Rindler or de Sitter space–time.
• Section 4 extends the discussion to additional gauge-fields, supergravity and (bosonic or fermionic) matter fields.
• Section 5 considers the role of energy in GDTs. In particular, the ADM mass, quasilocal energy, an absolute conservation law and its corresponding Noether symmetry are discussed.
• Section 6 leaves the classical realm, providing a concise treatment of (semi-classical) Hawking radiation for minimally and non-minimally coupled matter.
• Section 7 is devoted to non-perturbative path integral quantization of the geometric sector of GDTs with (scalar) matter, giving rise to a non-local and non-polynomial effective action depending solely on the matter fields and external sources. The matter sector is treated perturbatively.
• Section 8 shows some consequences of the previously developed perturbation theory: the virtual black hole phenomenon, the appearance of non-local vertices, and S-matrix elements for s-wave gravitational scattering.
• Section 9 describes the status of Dirac quantization for a typical example of that approach.
• Section 10 concludes with a brief summary and an outlook regarding open questions.
• Appendix A recalls the spherical reduction procedure in the Cartan formalism.
• Appendix B collects some basic properties of the heat kernel expansion needed in Section 6.

Several topics are closely related to the subject of this review, but are not included:
(1) Various calculations and explanations of the BH entropy [99,100] became a large and rather independent field of research which shows, however, overlaps [101,102] with the general treatment of the dilaton theories presented in this review. We do not cover approaches which imply further physical assumptions which transgress the orthodox application of quantum theory to gravity [103–107].
(2) The ideas of the holographic principle [108,109] and of the AdS/CFT correspondence [110–112] are now being actively applied to BH physics (see, e.g. [113] and references therein).
(3) There exist different approaches to integrability of gravity models in two dimensions [114–117]. In particular, a rather sophisticated technique has been applied to solve the effective 2D models emerging after toroidal reduction (instead of the spherical reduction considered in this review)
of the four-dimensional Einstein equations [118,119]. Recently, again interesting developments should be noted in Liouville gravity [120,121]. Some relations between 2D dilaton gravity and the theory of solitons were discussed in [122,123].

Each of these topics deserves a separate review, and in some cases such reviews exist. Therefore, we have restricted ourselves in those fields to just a few (somewhat randomly selected) references which hopefully will permit further orientation.

1.2. Differential geometry

1.2.1. Short primer for general dimensions

In the comprehensive approach advocated for $D=2$ gravity, the use of Cartan variables (zweibeine, spin connection) plays a pivotal role. As an introduction and in order to fix our notations we shall briefly review this formalism. For details we refer to the mathematical literature (cf. e.g. [124]).

On a manifold with $D$ dimensions in each point one introduces vielbeine $e^a_\mu(x)$, where Greek indices refer to the (holonomic) coordinates $x^\mu = (x^0, x^1, \dots, x^{D-1})$ and Latin indices denote the ones related to a (local) Lorentz frame with metric $\eta = \mathrm{diag}(1,-1,\dots,-1)$. The dual vector space is spanned by the inverse vielbeine$^{12}$ $e_a^\mu(x)$:

$$e_a^\mu e^b_\mu = \delta_a^b. \tag{1.2}$$

$SO(1,D-1)$ matrices $L^a{}_b(x)$ of the (local) Lorentz transformations obey

$$L^a{}_c L_b{}^c = \delta^a_b. \tag{1.3}$$

A Lorentz vector $V^a = e^a_\mu V^\mu$ transforms under local Lorentz transformations as

$$V'^a(x) = L^a{}_b(x)V^b(x). \tag{1.4}$$
This implies a covariant derivative

$$(D_\mu)^a{}_b = \delta^a_b\,\partial_\mu + \omega_\mu{}^a{}_b, \tag{1.5}$$

if the spin connection $\omega_\mu{}^a{}_b$ is introduced as the appropriate gauge field with transformation

$$\omega'_\mu{}^a{}_b = -L_b{}^d(\partial_\mu L^a{}_d) + L^a{}_c\,\omega_\mu{}^c{}_d\,L_b{}^d. \tag{1.6}$$

The infinitesimal version of (1.6) follows from $L^a{}_b = \delta^a_b + l^a{}_b + O(l^2)$ where $l^a{}_b = -l_b{}^a$.

Formally also diffeomorphisms

$$\bar x^\mu(x) = x^\mu - \xi^\mu(x) + O(\xi^2) \tag{1.7}$$

can be interpreted, at least locally, as gauge transformations, when the Lie variation is employed which implies a transformation referring to the same point. In

$$\frac{\partial\bar x^\mu}{\partial x^\nu} = \delta^\mu_\nu - \xi^\mu{}_{,\nu}, \qquad \frac{\partial x^\mu}{\partial\bar x^\nu} = \delta^\mu_\nu + \xi^\mu{}_{,\nu}, \tag{1.8}$$

partial derivatives with respect to $x^\nu$ have been abbreviated by the index after a comma.
$^{12}$ For simplicity we shall use indiscriminately the term “vielbein” for the vielbein, the inverse vielbein and the dual basis of one-forms (the components of which are given by the inverse vielbein) whenever the meaning is clear either from the context or from the position of indices.
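For instance, for the Lie variation of a tensor of first order, $\bar V_\mu(\bar x) = (\partial x^\nu/\partial\bar x^\mu)V_\nu(x)$, one obtains

$$\delta_\xi V_\mu(x) = \bar V_\mu(x) - V_\mu(x) = \xi^\nu{}_{,\mu}V_\nu + \xi^\nu V_{\mu,\nu}. \tag{1.9}$$

For the dual to the tangential space, e.g. $V^\mu\partial_\mu = V^\mu(\partial_\mu\bar x^\nu)\bar\partial_\nu = \bar V^\nu\bar\partial_\nu$, one derives the analogous transformation

$$\delta_\xi V^\mu = \bar V^\mu(x) - V^\mu(x) = -\xi^\mu{}_{,\nu}V^\nu + \xi^\nu V^\mu{}_{,\nu}. \tag{1.10}$$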
The metric $g_{\mu\nu}$ in the line element is a quadratic expression of the vielbeine,

$$(ds)^2 = g_{\mu\nu}\,dx^\mu dx^\nu = e^a_\mu e^b_\nu\,\eta_{ab}\,dx^\mu dx^\nu, \tag{1.11}$$

and, therefore, a less elementary variable. Also the reparameterization invariant volume element

$$\sqrt{(-)^{D-1}g}\;d^Dx = \sqrt{(-)^{D-1}\det g_{\mu\nu}}\;d^Dx = \sqrt{(-)^{D-1}(\det e^a_\mu)^2\det\eta}\;d^Dx = |\det e^a_\mu|\,d^Dx = |e|\,d^Dx \tag{1.12}$$

is of polynomial form if expressed in vielbein components. The advantage of the form calculus [124] is that diffeomorphism invariance is automatically implied when the Cartan variables are converted into one-forms

$$e^a_\mu \to e^a = e^a_\mu\,dx^\mu, \qquad \omega_\mu{}^a{}_b \to \omega^a{}_b = \omega_\mu{}^a{}_b\,dx^\mu, \tag{1.13}$$

which are special cases of $p$-forms

$$\Omega_p = \frac{1}{p!}\,\Omega_{\mu_1\dots\mu_p}\,dx^{\mu_1}\wedge dx^{\mu_2}\wedge\cdots\wedge dx^{\mu_p}. \tag{1.14}$$

Due to the antisymmetry of the wedge product $dx^\mu\wedge dx^\nu = dx^\mu\otimes dx^\nu - dx^\nu\otimes dx^\mu = -dx^\nu\wedge dx^\mu$, all totally antisymmetric tensors $\Omega_{\mu_1\dots\mu_p}$ are described in this way. Clearly $\Omega_p = 0$ for $p > D$. The action of the $(p+q)$-form $\Omega_p\wedge\Psi_q$ on $p+q$ vectors is defined by

$$\Omega_p\wedge\Psi_q(V_1,\dots,V_{p+q}) = \frac{1}{p!\,q!}\sum_\pi\delta_\pi\,\Omega(V_{\pi(1)},\dots,V_{\pi(p)})\,\Psi(V_{\pi(p+1)},\dots,V_{\pi(p+q)}), \tag{1.15}$$

where the sum is taken over all permutations $\pi$ of $1,\dots,p+q$ and $\delta_\pi$ is $+1$ for an even number of transpositions and $-1$ for an odd number of transpositions. It is convenient at this point to introduce the condensed notation for (anti)symmetrization:

$$\alpha_{[\mu_1\dots\mu_p]} := \frac{1}{p!}\sum_\pi\delta_\pi\,\alpha_{\mu_{\pi(1)}\dots\mu_{\pi(p)}}, \qquad \beta_{(\mu_1\dots\mu_p)} := \frac{1}{p!}\sum_\pi\beta_{\mu_{\pi(1)}\dots\mu_{\pi(p)}}, \tag{1.16}$$

where the sum is taken over all permutations $\pi$ of $1,\dots,p$ and $\delta_\pi$ is defined as before. In the volume form

$$\Omega_{p=D} = \frac{1}{D!}\,a_{[\mu_1\dots\mu_D]}\,\tilde\epsilon^{\mu_1\dots\mu_D}\,d^Dx = \frac{1}{D!}\,a_{[\mu_1\dots\mu_D]}\,|e|\,\epsilon^{\mu_1\dots\mu_D}\,d^Dx, \tag{1.17}$$

the product of differentials must be proportional to the totally antisymmetric Levi-Civita symbol $\tilde\epsilon^{01\dots(D-1)} = -1$ or, alternatively, to the tensor $\epsilon = |e|^{-1}\tilde\epsilon$ (cf. (1.12)). The integral of the volume form $\int_{M_D}\Omega_D$ on the manifold $M_D$ contains the scalar $a = a_{[\mu_1\dots\mu_D]}\,\epsilon^{\mu_1\dots\mu_D}$ which is the starting point to construct diffeomorphism invariant Lagrangians.
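The permutation sum in (1.15) can be made concrete with a short script. The following is only a minimal sketch, not part of the original review: forms are represented as plain Python functions of vectors, the helper names `sign`, `wedge`, `dx0` and `dx1` are illustrative assumptions, and components are assumed to be integers or rationals so that the prefactor $1/(p!\,q!)$ can be applied exactly.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """+1 for an even permutation of 0..n-1, -1 for an odd one (inversion count)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def wedge(omega, p, psi, q):
    """Wedge product of a p-form and a q-form following the permutation sum (1.15).

    Both forms are multilinear functions of vectors with integer or rational values.
    """

    def product(*vectors):
        assert len(vectors) == p + q
        total = 0
        for perm in permutations(range(p + q)):
            args = [vectors[k] for k in perm]
            total += sign(perm) * omega(*args[:p]) * psi(*args[p:])
        return Fraction(total, factorial(p) * factorial(q))

    return product


# Coordinate one-forms dx^0 and dx^1 in D = 2 (hypothetical helpers for this sketch).
def dx0(v):
    return v[0]


def dx1(v):
    return v[1]


area = wedge(dx0, 1, dx1, 1)            # dx^0 ^ dx^1
V, W = (2, 3), (5, 7)
print(area(V, W))                       # V^0 W^1 - V^1 W^0
print(wedge(dx1, 1, dx0, 1)(V, W))      # opposite sign: antisymmetry
three_form = wedge(area, 2, dx0, 1)     # a 3-form in D = 2
print(three_form((1, 2), (3, 4), (5, 6)))
```

In this sketch the last value illustrates the statement $\Omega_p = 0$ for $p > D$: any three-form evaluated on vectors of a two-dimensional space vanishes.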
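By means of the metric (1.11) a mixed $\epsilon$-tensor

$$\epsilon_{\mu_1\dots\mu_p}{}^{\mu_{p+1}\dots\mu_{p+q}} = g_{\mu_1\nu_1}g_{\mu_2\nu_2}\cdots g_{\mu_p\nu_p}\,\epsilon^{\nu_1\dots\nu_p\mu_{p+1}\dots\mu_{p+q}} \tag{1.18}$$

can be defined which allows the introduction of the Hodge dual of $\Omega_p$ as a $(D-p)$-form

$$*\Omega_p = \Omega_{D-p} = \frac{1}{p!\,(D-p)!}\,\epsilon_{\mu_1\dots\mu_{D-p}}{}^{\nu_1\dots\nu_p}\,\Omega_{\nu_1\dots\nu_p}\,dx^{\mu_1}\wedge\cdots\wedge dx^{\mu_{D-p}}. \tag{1.19}$$

For even $D$ and Lorentzian signature we obtain for a $p$-form

$$**\Omega_p = (-1)^{p+1}\,\Omega_p. \tag{1.20}$$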
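As a quick consistency check of (1.19) and (1.20), the double Hodge dual can be evaluated by hand or by machine in the simplest case. The sketch below is an illustration under stated assumptions rather than part of the original text: it takes flat $D=2$ Minkowski space with $|e|=1$, metric $\mathrm{diag}(+1,-1)$ and $\epsilon^{01}=-1$ as in (1.17), and the function names are hypothetical. A two-form is stored through its single independent component $\Omega_{01}$.

```python
# Flat D = 2 Minkowski space with |e| = 1, so the epsilon tensor and symbol coincide.
ETA = [[1, 0], [0, -1]]        # g_{mu nu} = diag(+1, -1)
EPS_UP = [[0, -1], [1, 0]]     # epsilon^{mu nu} with epsilon^{01} = -1


def eps_mixed():
    """epsilon_mu^nu = g_{mu rho} epsilon^{rho nu}, the mixed tensor (1.18) for one lowered index."""
    return [
        [sum(ETA[m][r] * EPS_UP[r][n] for r in range(2)) for n in range(2)]
        for m in range(2)
    ]


def eps_low_01():
    """epsilon_{01} = g_{0 rho} g_{1 sigma} epsilon^{rho sigma}."""
    return sum(
        ETA[0][r] * ETA[1][s] * EPS_UP[r][s] for r in range(2) for s in range(2)
    )


def hodge_zero_form(f):
    """(1.19) with p = 0: the two-form has components f * epsilon_{mu nu}; return Omega_{01}."""
    return f * eps_low_01()


def hodge_one_form(w):
    """(1.19) with p = 1: (*w)_mu = epsilon_mu^nu w_nu."""
    mixed = eps_mixed()
    return [sum(mixed[m][n] * w[n] for n in range(2)) for m in range(2)]


def hodge_two_form(c01):
    """(1.19) with p = 2: *Omega = (1/2) epsilon^{mu nu} Omega_{mu nu} = epsilon^{01} Omega_{01}."""
    return EPS_UP[0][1] * c01


print(hodge_one_form(hodge_one_form([3, 5])))   # p = 1: sign (-1)^2 = +1
print(hodge_two_form(hodge_zero_form(7)))       # p = 0: sign (-1)^1 = -1
print(hodge_zero_form(hodge_two_form(4)))       # p = 2: sign (-1)^3 = -1
```

With these conventions the three cases $p = 0, 1, 2$ reproduce the signs predicted by (1.20).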
The exterior differential one-form $d = dx^\mu\partial_\mu$ with $d^2 = 0$ increases the form degree by one:

$$d\Omega_p = \frac{1}{p!}\,\partial_\mu\Omega_{\mu_1\dots\mu_p}\,dx^\mu\wedge dx^{\mu_1}\wedge\cdots\wedge dx^{\mu_p}. \tag{1.21}$$

Onto a product of forms $d$ acts as

$$d(\Omega_p\wedge\Omega_q) = d\Omega_p\wedge\Omega_q + (-1)^p\,\Omega_p\wedge d\Omega_q. \tag{1.22}$$

We shall need little else from the form calculus [124] except the Poincaré Lemma which says that for a closed form, obeying $d\Omega_p = 0$, in a certain (“star-shaped”) neighborhood of a point $x$ on a manifold $M$, $\Omega_p$ is exact, i.e. can be written as $\Omega_p = d\Omega_{p-1}$. In order to simplify our notation we shall drop the $\wedge$ symbol whenever the meaning is clear from the context.

The Cartan variables expressed as one-forms (1.13) in view of their Lorentz-tensor properties are examples of algebra valued forms. This is also the case for the covariant derivative (1.5), now written as

$$D^a{}_b = \delta^a_b\,d + \omega^a{}_b, \tag{1.23}$$

when it acts on a Lorentz vector. From (1.13) and (1.23) the two natural quantities to be defined on a manifold are the torsion two-form

$$T^a = D^a{}_b\,e^b \tag{1.24}$$

(“First Cartan’s structure equation”) and the curvature two-form

$$R^a{}_b = D^a{}_c\,\omega^c{}_b \tag{1.25}$$

(“Second Cartan’s structure equation”). From (1.23) immediately follows

$$(D^2)^a{}_b = D^a{}_c\,D^c{}_b = R^a{}_b, \tag{1.26}$$

Bianchi’s first identity. Using (1.26), $D^3$ can be written in two equivalent ways:

$$D^a{}_b\,R^b{}_c - R^a{}_b\,D^b{}_c = 0, \tag{1.27}$$

corresponding to

$$(dR^{ab}) + \omega^a{}_c\,R^{cb} + \omega^b{}_c\,R^{ac} =: (DR)^{ab} = 0. \tag{1.28}$$

The left-hand side (l.h.s.) defines the action of the covariant derivative (1.23) on $R^{ab}$, a Lorentz tensor with two indices. The brackets indicate that those derivatives only act upon the quantity $R$ and not
further to the right. Eq. (1.28) is called Bianchi’s second identity.

The structure equations together with the Bianchi identities show that the covariant action for any gravity action in $D$ dimensions depending on $e^a$, $\omega^a{}_b$ can be constructed as a volume form depending solely on $R^{ab}$, $T^a$ and $e^a$. The most prominent example is Einstein gravity in $D=4$ [125,126] which in the Palatini formulation reads [127]

$$L_{\mathrm{HEP}} \propto \int_{M_4} R^{ab}\,e^c\,e^d\,\epsilon_{abcd}, \tag{1.29}$$

having used the definition $\epsilon_{abcd} = \epsilon^{\mu\nu\sigma\tau}\,e_{a\mu}e_{b\nu}e_{c\sigma}e_{d\tau}$. The condition of vanishing torsion $T^a = 0$ for this special case already follows from varying $\omega^a{}_b$ independently in (1.29).

In the usual textbook formulations of Einstein gravity, in terms of the metric, the affine connection $\Gamma_{\mu\nu}{}^\rho$ appears as the only variable in the covariant derivative, e.g. for a contravariant vector $X^\nu$

$$X^\nu{}_{;\mu} := \nabla_\mu X^\nu = (\nabla_\mu)^\nu{}_\rho\,X^\rho = (\partial_\mu\delta^\nu_\rho + \Gamma_{\mu\rho}{}^\nu)\,X^\rho. \tag{1.30}$$

In the vielbein basis $e^a_\mu$ we relate $X^b = e^b_\rho X^\rho$ and let (1.5) act onto that $X^b$. Multiplying by the inverse vielbein (1.2) and comparing with (1.30) yields

$$\Gamma_{\mu\nu}{}^\rho = e_a^\rho\,[(D_\mu)^a{}_b\,e^b_\nu]. \tag{1.31}$$

The same identification follows, of course, from the covariant derivative of a covariant vector:

$$X_{\nu;\mu} := \partial_\mu X_\nu - \Gamma_{\mu\nu}{}^\rho X_\rho. \tag{1.32}$$

Covariant derivatives may be constructed easily also for tensors with mixed space–time and local Lorentz indices. For instance, that derivative acting upon the vielbeine $e^c_\rho$,

$$(D_\mu e)^a_\nu = [(D_\mu)_\nu{}^\rho]^a{}_c\,e^c_\rho := (\nabla_\mu)_\nu{}^\rho\,e^a_\rho + (\omega_\mu)^a{}_c\,e^c_\nu = 0, \tag{1.33}$$

is seen to vanish. By (1.2) this implies the same result for analogously defined vielbeine $e_a^\rho$:

$$(D_\mu e)_a^\rho = 0. \tag{1.34}$$

From (1.34) and the antisymmetry $\omega^{ab} = -\omega^{ba}$ (one version of metricity), corresponding to its property as a Lorentz generator of $SO(1,D-1)$, immediately

$$\nabla_\mu g_{\rho\sigma} = 0 \tag{1.35}$$

can be derived, the version of the metricity usually employed in torsionless theories.

Comparing the antisymmetrized part of the affine connection $\Gamma_{[\mu\nu]}{}^\rho = \frac{1}{2}(\Gamma_{\mu\nu}{}^\rho - \Gamma_{\nu\mu}{}^\rho)$ of (1.31) with the components of the torsion (1.24), multiplied by the inverse vielbein, shows that the expressions are identical:

$$e_a^\rho\,T^a_{\mu\nu} = \Gamma_{[\mu\nu]}{}^\rho. \tag{1.36}$$

This allows one to express the full affine connection

$$\Gamma_{\mu\nu}{}^\rho = \Gamma_{(\mu\nu)}{}^\rho + T_{\mu\nu}{}^\rho \tag{1.37}$$

in terms of Christoffel symbols $\{\mu,\nu,\rho\}$ and the contorsion $K$

$$\Gamma_{(\mu\nu)\rho} = g_{\rho\sigma}\,\Gamma_{(\mu\nu)}{}^\sigma = \{\mu,\nu,\rho\} + K_{(\mu\nu)\rho} \tag{1.38}$$
by the standard trick of considering (1.35) in the form

$$g_{\nu\rho,\mu} = \Gamma_{\mu\nu}{}^\lambda\,g_{\lambda\rho} + \Gamma_{\mu\rho}{}^\lambda\,g_{\nu\lambda} \tag{1.39}$$

with (1.38) and by taking the linear combination of the identity (1.39) minus the one for $g_{\mu\nu,\rho}$ plus the one for $g_{\mu\rho,\nu}$. In this way the Christoffel symbol

$$\{\mu,\nu,\rho\} = \tfrac{1}{2}\,(g_{\nu\rho,\mu} + g_{\mu\rho,\nu} - g_{\mu\nu,\rho}), \tag{1.40}$$

but also the additional contorsion contribution $K$ from the non-vanishing torsion in (1.38),

$$K_{(\mu\nu)\rho} = T_{[\rho\mu]\nu} + T_{[\rho\nu]\mu}, \tag{1.41}$$

can be found.

Non-vanishing torsion and thus also a non-vanishing contorsion are important for the determination of the global properties of a certain solution of a generic theory of gravity. In contrast to ordinary Minkowski space field theories, the variables of gravity (in the most general case the independent Cartan variables $e$ and $\omega$) in the dynamical evolution also determine the non-Minkowski dynamical background upon which the theory lives. Thus, for the investigation of that background a device must be found which acts like a test charge in an electromagnetic field. The simplest possibility in gravity is to add the Lagrangian of a point particle with path $x^\mu = \bar x^\mu(\tau)$ to the original action ($\dot{\bar x}^\mu = d\bar x^\mu/d\tau$ with the affine parameter $\tau$),

$$L^{(p)} = -m\int_{\tau_1}^{\tau_2} ds = -m\int_{\tau_1}^{\tau_2}\sqrt{g_{\mu\nu}(\bar x)\,\dot{\bar x}^\mu\dot{\bar x}^\nu}\;d\tau, \tag{1.42}$$

with a mass $m$, small enough to be of negligible gravitational influence. Variation of $L^{(p)}$ with respect to $\bar x^\mu$ leads to the usual geodesic equation

$$\ddot{\bar x}^\mu + \tilde\Gamma_{(\rho\sigma)}{}^\mu\,\dot{\bar x}^\rho\dot{\bar x}^\sigma = 0, \tag{1.43}$$

where, by construction from (1.42), $\tilde\Gamma_{(\rho\sigma)}{}^\mu = g^{\mu\alpha}\{\rho,\sigma,\alpha\}$ only “feels” the Christoffel part (1.40) of the affine connection and not the contorsion (1.41). Alternatively, also the full affine connection $\Gamma$ may be considered in (1.43) (“autoparallels”) [128,129]. For that modified geodesic equation for $\bar x^\alpha(\tau)$ also a (non-local) action replacing (1.42) can be found in the literature [130,131].

In order to explore the local and topological properties of a certain manifold which corresponds to a solution of a generic gravity theory, all points must be connected which can be reached by a device like the geodesic (1.43) by means of a time-like, but also space-like or light-like path. The classification of possible extensions of a certain patch uses the notion of “geodesic” incompleteness: a geodesic which has only a finite range of affine parameter, but which is inextendible$^{13}$ in at least one direction is called incomplete. A space–time with at least one incomplete (time/space/light-like) geodesic is called (time/space/light-like) geodesically incomplete.

The notion of incompleteness also yields the most satisfactory classification of (geometric) singularities. For example, a singularity like the one in the Schwarzschild metric [134] can be reached by at least one (time- or light-like) geodesic with finite affine parameter (i.e. with finite proper time for massive test particles). For $D=2$ theories a complete discussion of “geodesic topology” for any generic theory can be carried out (cf. Section 3.2) [135,136]. Here we just want to emphasize the importance of the type of device to be used for the determination of the “effective” topology of the manifold which, in principle, may be different for geodesics, autoparallels, spinning particles, etc.
$^{13}$ This means the corresponding geodesic must have (at least) one endpoint. For details we refer to [132,133].
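1.2.2. Two dimensions

In $D=2$ the Lorentz transformations (1.3), (1.4) simply reduce to a boost with velocity $v$,

$$L^a{}_b = \begin{pmatrix}\cosh v & \sinh v\\ \sinh v & \cosh v\end{pmatrix}^a{}_b = \delta^a_b + \epsilon^a{}_b\,v + O(v^2), \tag{1.44}$$

where in local Lorentz indices with metric $\eta_{ab} = \eta^{ab}$ ($\eta = \mathrm{diag}(+1,-1)$) the Levi-Civita symbol $\epsilon^a{}_b = \eta^{ac}\epsilon_{cb}$ ($\epsilon_{01} = -\epsilon^{01} = +1$) coincides with the tensor. It is related to the $\epsilon$ tensor in holonomic coordinates (cf. (1.17)) by (explicit values of Lorentz indices in (1.47) are underlined)

$$\epsilon = -\frac{1}{2}\,\epsilon_{ab}\,e^a\wedge e^b, \tag{1.45}$$

$$\epsilon_{\mu\nu} = e^a_\mu e^b_\nu\,\epsilon_{ab} = |e|\,\tilde\epsilon_{\mu\nu} = |e|^{-1}g_{\mu\rho}\,g_{\nu\sigma}\,\tilde\epsilon^{\rho\sigma}, \tag{1.46}$$

$$|e| = \det e^a_\mu = e^{\underline{0}}_0\,e^{\underline{1}}_1 - e^{\underline{0}}_1\,e^{\underline{1}}_0. \tag{1.47}$$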
As there is only one generator $\epsilon^a{}_b$ in SO(1,1) (cf. (1.44)), the spin connection one-form simplifies to a single term $\omega^a{}_b = \omega\,\epsilon^a{}_b$ and hence the term quadratic in $\omega$ of $R^{ab}$ (1.25) vanishes:

$$R^a{}_b = \epsilon^a{}_b\, d\omega. \tag{1.48}$$
From now on, for simplicity, we shall refer to the one-form $\omega$ as the “spin connection”. This shows that the curvature in $D = 2$ only possesses one independent component, which we take to be the Ricci-scalar:$^{14}$

$$R = 2 * d\omega = 2|e|^{-1}\,\tilde\epsilon^{\rho\sigma}\partial_\rho\omega_\sigma. \tag{1.49}$$
It is clear from this expression that the Hilbert–Einstein action in two dimensions contains a total divergence. In (compact) Euclidean space ($\sqrt{-g} \to \sqrt{g}$) without boundaries, it is proportional to the Euler characteristic $2(1-\gamma)$ of a 2D Riemannian space with genus $\gamma$,

$$\int_{M_\gamma} d^2x\, \sqrt{g}\, R = 8\pi(1 - \gamma). \tag{1.50}$$
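Also the torsion simplifies to a volume form,

$$T^a = \tfrac{1}{2}\, T^a{}_{\mu\nu}\, dx^\mu \wedge dx^\nu, \qquad T^a{}_{\mu\nu} = (D_\mu e_\nu)^a - (D_\nu e_\mu)^a, \tag{1.51}$$

with

$$(D_\mu)^a{}_b = \partial_\mu \delta^a_b + \omega_\mu\, \epsilon^a{}_b. \tag{1.52}$$

The Hodge dual of $T^a$ here is a diffeomorphism scalar:

$$\tau^a := * T^a = |e|^{-1}\, \tilde\epsilon^{\mu\nu} (D_\mu e_\nu)^a. \tag{1.53}$$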
In $D = 2$ the inverse of the zweibeine from (1.2) obeys the simple relation

$$e_a^\mu = -|e|^{-1}\, \tilde\epsilon^{\mu\nu} \epsilon_{ab}\, e^b_\nu. \tag{1.54}$$
$^{14}$ Our convention corresponds to the contraction $R = R^{ab}{}_{\mu\nu}\, e_a^\mu e_b^\nu$, where $R^{ab}{}_{\mu\nu}$ are the tensor components of $R^{ab}$. $R^{\mu\nu}{}_{\rho\sigma}$ then coincides with the usual textbook definition [133].
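Relation (1.54) can be checked on a concrete example. The following minimal Python sketch builds the inverse zweibein from the formula and contracts it with the original. It assumes the sign conventions $\epsilon_{01} = +1$ and $\tilde\epsilon^{01} = -1$ (the latter is our reading of the conventions above, not an explicit statement of the text). The sample zweibein is hypothetical, and exact rational arithmetic is used so that the result can be compared without rounding.

```python
from fractions import Fraction

# Assumed conventions: eps_{01} = +1 (Lorentz indices, lower) and
# eps~^{01} = -1 (holonomic symbol, upper indices).
EPS_LOWER = [[0, 1], [-1, 0]]        # eps_{ab}
EPS_TILDE_UPPER = [[0, -1], [1, 0]]  # eps~^{mu nu}


def det2(e):
    """Determinant |e| of the zweibein stored as e[a][mu] = e^a_mu."""
    return e[0][0] * e[1][1] - e[0][1] * e[1][0]


def inverse_zweibein(e):
    """Return inv[a][mu] = e_a^mu = -|e|^{-1} eps~^{mu nu} eps_{ab} e^b_nu, cf. (1.54)."""
    d = det2(e)
    inv = [[Fraction(0), Fraction(0)] for _ in range(2)]
    for a in range(2):
        for mu in range(2):
            total = Fraction(0)
            for nu in range(2):
                for b in range(2):
                    total += EPS_TILDE_UPPER[mu][nu] * EPS_LOWER[a][b] * e[b][nu]
            inv[a][mu] = -total / d
    return inv


def contract(inv, e):
    """Return m[mu][nu] = e_a^mu e^a_nu, which should be the unit matrix."""
    return [[sum(inv[a][mu] * e[a][nu] for a in range(2)) for nu in range(2)]
            for mu in range(2)]


# Hypothetical non-degenerate zweibein, chosen only for illustration.
zweibein = [[Fraction(2), Fraction(1)], [Fraction(-1, 3), Fraction(5)]]
inverse = inverse_zweibein(zweibein)
identity_check = contract(inverse, zweibein)
```

With the opposite sign for $\tilde\epsilon^{01}$ the contraction would return minus the unit matrix, so the check is sensitive to the convention assumed here.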
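The formula for the change of the Ricci-scalar under a conformal transformation of the metric $\hat g_{\mu\nu} = e^{2\rho} g_{\mu\nu}$ is most easily derived from a transformation $\hat e^a = e^{\rho} e^a$ of the zweibeine for vanishing torsion $T^a = 0$, i.e. with $\omega = \tilde\omega = e_a * de^a$ in the Ricci-scalar (1.49):

$$|e|\, R = 2|e| * d(e_a * de^a) = 2\,\tilde\epsilon^{\tau\sigma}\partial_\tau\!\left( \frac{e_{a\sigma}\,\tilde\epsilon^{\mu\nu}}{|e|}\, \partial_\mu e^a_\nu \right). \tag{1.55}$$

Remembering $|\hat e| = |e|\, e^{2\rho}$ and using (1.54) for $e^a_\mu e^b_\nu \eta_{ab} = g_{\mu\nu}$ yields ($(\hat e) = \sqrt{-\hat g}$) an important identity:

$$\sqrt{-\hat g}\,\hat R = \sqrt{-g}\, R - 2\,\partial_\tau\!\left(\sqrt{-g}\, g^{\tau\sigma}\partial_\sigma \rho\right). \tag{1.56}$$

Light-cone Lorentz vectors are especially useful in $D = 2$,

$$X^\pm := \tfrac{1}{\sqrt{2}}\,(X^0 \pm X^1), \tag{1.57}$$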
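yielding $X^2 = X^a X_a = X^{\bar a} X_{\bar a} = \eta_{\bar a \bar b} X^{\bar a} X^{\bar b} = 2 X^+ X^-$ with metric

$$\eta_{\bar a \bar b} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{1.58}$$

and the corresponding Lorentz $\epsilon$-tensor $\epsilon^{\bar a}{}_{\bar b} = \eta^{\bar a \bar c}\epsilon_{\bar c \bar b}$ with $\epsilon^\pm{}_\pm = \pm 1$. The light-cone components of the torsion (1.51) become

$$T^\pm = (d \pm \omega)\, e^\pm. \tag{1.59}$$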
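Since we are going to discuss fermionic matter (as well as supergravity) we have to fix our spinor notation. The $\gamma^a$-matrices are defined in a local Lorentz frame,

$$\{\gamma^a, \gamma^b\} = 2\eta^{ab}, \tag{1.60}$$

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad \gamma_* := -\gamma^0\gamma^1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = -\tfrac{1}{2}[\gamma^0, \gamma^1]. \tag{1.61}$$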
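In light-cone components we obtain a representation in terms of nilpotent matrices,

$$\gamma^+ = \sqrt{2} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \gamma^- = \sqrt{2} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{1.62}$$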
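The spinor conventions (1.60)–(1.62) can be verified mechanically. The sketch below is only an illustration in plain Python: it checks the Clifford algebra for $\eta = \mathrm{diag}(+1,-1)$, the two expressions for $\gamma_*$, and the nilpotency of $\gamma^\pm = (\gamma^0 \pm \gamma^1)/\sqrt{2}$. The helper names are ours.

```python
import math

ETA = [[1, 0], [0, -1]]        # eta^{ab} = diag(+1, -1)
IDENTITY = [[1, 0], [0, 1]]
GAMMA0 = [[0, 1], [1, 0]]      # gamma^0 from (1.61)
GAMMA1 = [[0, 1], [-1, 0]]     # gamma^1 from (1.61)


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


def close(a, b):
    """Entry-wise comparison with a small tolerance for the float matrices."""
    return all(math.isclose(a[i][j], b[i][j], abs_tol=1e-12)
               for i in range(2) for j in range(2))


gammas = [GAMMA0, GAMMA1]
# Clifford algebra (1.60): {gamma^a, gamma^b} = 2 eta^{ab}
clifford_ok = all(
    anticommutator(gammas[a], gammas[b]) == scale(2 * ETA[a][b], IDENTITY)
    for a in range(2) for b in range(2)
)

# gamma_* = -gamma^0 gamma^1 and the commutator [gamma^0, gamma^1]
gamma_star = scale(-1, matmul(GAMMA0, GAMMA1))
commutator = add(matmul(GAMMA0, GAMMA1), scale(-1, matmul(GAMMA1, GAMMA0)))

# Light-cone components, following the definition (1.57) for vectors
root2 = math.sqrt(2.0)
gamma_plus = scale(1 / root2, add(GAMMA0, GAMMA1))
gamma_minus = scale(1 / root2, add(GAMMA0, scale(-1, GAMMA1)))
```

In this basis $\{\gamma^+, \gamma^-\} = 2\eta^{+-} = 2$, in line with the light-cone metric (1.58).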
The covariant derivative acting on two-dimensional Dirac fermions,

$$D_\mu = \partial_\mu - \tfrac{1}{2}\,\gamma_*\,\omega_\mu, \tag{1.63}$$

is determined by the Lorentz generator for spinors $[\gamma^0, \gamma^1]/4 = -\gamma_*/2$.

2. Models in 1 + 1 dimensions

There are (at least) four different motivations to study generalized dilaton theories (GDT) in $D = 2$:
• Starting from Einstein gravity in $D \geq 4$ and imposing spherical symmetry one reproduces a certain GDT.
• A certain limit of (super-)string theory yields a particular GDT as effective action.
• GDTs can be viewed as toy models for quantization of gravity and as a laboratory for studying BH evaporation.
• In a first-order formulation, the underlying Poisson structure reveals relations to non-commutative geometry and deformation quantization. Again, GDTs are a convenient laboratory to elucidate these new concepts and techniques.

Moreover, a result obtained along one route is of course also valid for all other approaches after having translated the jargon from one field to the others. In this sense, GDTs may even serve as a link between general relativity, string theory, BH physics and non-commutative geometry. We base our discussion on the first (somewhat more phenomenological) route and show the links to the other fields in this section.

2.1. Generalized dilaton theories

2.1.1. Spherically reduced gravity

The introduction of dilaton fields allows the treatment of the dynamics for a generic higher dimensional ($D > 2$) theory of gravity in an effective theory at lower dimension $D_1 < D$, which is still diffeomorphism invariant. In certain special cases the isometry group of the $D$-dimensional metric is such that it allows for a reduction to $D_1 = 2$. Important examples for $D = 4$ are toroidal reduction [137–141] and spherical reduction [19,32–41]. The latter is of special importance, because it covers the Schwarzschild BH. Therefore, we concentrate on that example. Splitting locally the $D$-dimensional manifold $M_D$ into a direct product $M_2 \otimes S^{D-2}$ the line element becomes

$$(ds)^2_{(D)} = g_{\mu\nu}(x)\, dx^\mu dx^\nu - \lambda^{-2} X^{2/(D-2)}\, (d\Omega)^2_{S^{D-2}}, \tag{2.1}$$
where $(d\Omega)^2_{S^{D-2}}$ is the surface element of the $(D-2)$-dimensional sphere, $x^\mu = \{x^0, x^1\}$ are the coordinates in $M_2$, and $\lambda$ is a parameter of mass dimension one. A straightforward calculation (cf. e.g. [142]; explicit formulae for the curvature two-form, the ensuing Ricci-scalar and the Euler- and Pontryagin-class can be found in Appendix A) for the $D$-dimensional Hilbert–Einstein action $L_{HE} = \int d^Dx\, \sqrt{-g_{(D)}}\, R_{(D)}$ yields ($(\nabla X)^2 = g^{\mu\nu}\partial_\mu X \partial_\nu X$)

$$L^{(SRG)} = \frac{O_{D-2}}{16\pi G_N \lambda^{D-2}} \int d^2x\, \sqrt{-g} \left[ X R + \frac{D-3}{D-2}\, \frac{(\nabla X)^2}{X} - \lambda^2 (D-2)(D-3)\, X^{(D-4)/(D-2)} \right]. \tag{2.2}$$

In the prefactor, which will be dropped consistently in the following, $O_{D-2}$ denotes the surface of the unit sphere $S^{D-2}$. Fixing the 2D diffeomorphisms (partially) as $X = (\lambda r)^{D-2}$ (the radius $r$ representing one of the coordinates and $\lambda > 0$) Eq. (2.1) yields the usual spherically symmetric line element in which $r > 0$ is required.

Another way to obtain a 2D theory from a higher dimensional one is to suppose that the $D$-dimensional manifold is a direct product $M_D = M_2 \otimes T^{D-2}$, where $T^{D-2}$ is a torus, and that all fields are independent of the $D - 2$ extra coordinates. This procedure is called dimensional
reduction. It also produces a dilaton theory in 2D if the higher dimensional theory already contains the dilaton [143].

2.1.2. Dilaton gravity from strings

Developments in string theory contributed much to the increase of interest in dilaton gravity in the 1990s. The simplest way to obtain it from strings is to consider the conditions for world-sheet conformal invariance [144]. The starting point is the non-linear sigma model action for the closed bosonic string,

$$L^{(\sigma)} = \frac{1}{4\pi\alpha'} \int d^2\xi\, \sqrt{-h}\, \left[ g_{\mu\nu}\, h^{ij} \partial_i X^\mu \partial_j X^\nu + \alpha' \phi R \right], \tag{2.3}$$

where $\xi$ is a coordinate on the string world-sheet, $h_{ij}$ is a metric$^{15}$ there, and $R$ represents the corresponding scalar curvature. The other symbols denote: the target space coordinates ($X^\mu$), the target space metric ($g_{\mu\nu}$), and the dilaton field ($\phi$). As usual, $\alpha'$ is the inverse string tension. The antisymmetric B-field is set to zero.

It is essential for string consistency that, as a quantum field theory, the sigma model be locally scale invariant. This is equivalent to the requirement that the trace of the 2D world-sheet energy–momentum tensor vanishes. Its general structure is

$$2\pi T^i{}_i = \beta^\phi R + \beta^g_{\mu\nu}\, h^{ij} \partial_i X^\mu \partial_j X^\nu, \tag{2.4}$$

where the “beta functions” $\beta^\phi$ and $\beta^g_{\mu\nu}$ are local functionals of the couplings $g_{\mu\nu}$ and $\phi$, usually calculated in the form of a power series in $\alpha'$. Note that the first term in $L^{(\sigma)}$ is conformally invariant and contributes to the $\beta$-functions at the quantum level only through the conformal anomaly. It corresponds to $O(\alpha')^0$. The second term in (2.3) breaks local scale invariance already at the classical level. Due to the factor $\alpha'$ its contributions to the trace (2.4) also start with the zeroth power of $\alpha'$. The leading terms in $\beta^\phi$ and $\beta^g_{\mu\nu}$ were calculated in Ref. [144]. With our sign conventions they read:

$$\frac{\beta^\phi}{\alpha'} = -\frac{\lambda^2}{4\pi^2} - \frac{1}{16\pi^2}\left( 4(\nabla\phi)^2 - 4\nabla_\mu\nabla^\mu\phi - R \right), \tag{2.5}$$

$$\beta^g_{\mu\nu} = R_{\mu\nu} + 2\nabla_\mu\nabla_\nu\phi, \tag{2.6}$$

where $\nabla$ is the covariant derivative in target space, and $R$ is the scalar curvature of the target space manifold. The constant $\lambda$ depends on the central charge. For the bosonic string it is

$$\lambda^2 = \frac{26 - D}{12\alpha'}. \tag{2.7}$$

This constant vanishes for critical strings.

The key observation regarding the beta functions (2.5) and (2.6) is that the conditions of conformal invariance $\beta^\phi = 0$ and $\beta^g_{\mu\nu} = 0$ are equivalent to the e.o.m.’s to be derived from the dilaton gravity action

$$L^{(dil)} = \int d^DX\, \sqrt{-g}\, e^{-2\phi} \left[ R + 4(\nabla\phi)^2 - 4\lambda^2 \right]. \tag{2.8}$$

$^{15}$ This metric should not be confused with $g_{\mu\nu}$ restricted to $D = 2$ in (2.8).
In particular, the dilaton e.o.m. is equivalent to $\beta^\phi = 0$. The Einstein equations are given by a combination of the two beta functions, $\beta^g_{\mu\nu} - 8\pi^2 g_{\mu\nu}\beta^\phi/\alpha' = 0$. For $D = 2$ action (2.8) describes the “string inspired” dilaton (CGHS) model [52] which has been studied since the early 1990s [145–150]. It is intimately related to the SO(2,1)/U(1)-WZW exact conformal field theory$^{16}$ [42–45]. An amusing feature of (2.8) with $D = 2$ is that after the identification $X = e^{-2\phi}$ it can be obtained from (2.2) by taking there the limit $D \to \infty$ keeping $\lambda^2 (D-2)(D-3) \to \mathrm{const.} = 4\lambda^2$ (with $\lambda$ on the left-hand side the parameter of (2.2) and on the right-hand side that of (2.8)). This corresponds to the classical limit $\alpha' \to \infty$.

2.1.3. Generalized dilaton theories – the action

A result like (2.2) or (2.8) suggests the consideration of GDTs

$$L^{(dil)} = \int d^2x\, \sqrt{-g} \left[ \frac{R}{2}\, X - \frac{U(X)}{2}\, (\nabla X)^2 + V(X) \right], \tag{2.9}$$
where the overall factor has been chosen for later convenience. Clearly, an even more general action could contain still another arbitrary function $Z(X)$, replacing $X$ in the first term of the square bracket [56,152]. However, we assume that $Z(X)$ is invertible for the range of $X$ to be considered.$^{17}$ This allows the inversion $X = Z^{-1}(\tilde X)$ and the reduction to the form (2.9). Indeed the “physical” applications seem to be always of that type.

The BH singularity of SRG reveals itself in the singular factor $U$ of the dynamical term for the dilaton field. This is the first hint to the fact that the “strength” of that singularity in the solution of (2.2) is not fixed by the action; it will actually turn out to be a “constant of motion” which for the BH coincides with the ADM mass (cf. Section 5).

An alternative representation is suggested by (2.8):

$$L^{(dil)} = \frac{1}{2} \int d^2x\, \sqrt{-g}\, e^{-2\phi} \left[ R - \tilde U(\phi)(\nabla\phi)^2 + 2\tilde V(\phi) \right], \tag{2.10}$$

with $\tilde U(\phi) = 4\exp(-2\phi)\, U(\exp(-2\phi))$ and $\tilde V(\phi) = \exp(2\phi)\, V(\exp(-2\phi))$. Eqs. (2.9) and (2.10) are related by the redefinition of the dilaton field

$$X = e^{-2\phi}, \tag{2.11}$$
explicitly taking into account positivity of $X$ which is required in many models.

Among the GDTs (2.9) with $U(X) = 0$ the simplest non-trivial choice of Refs. [21–25,154],

$$V_{JT} = \Lambda X, \qquad U_{JT} = 0, \tag{2.12}$$

the Jackiw–Teitelboim (JT) model, has played a decisive role for the understanding of 2D (lineal) gravity [26]. Depending on the sign of $\Lambda$ it describes a 2D (anti-)de Sitter manifold with constant positive or negative curvature. The symmetry properties of the model are related to the Lie algebra SO(1,2). It has been explored in detail in the quoted references. Below, this algebra will turn out to represent the special linear case of some, in general, nonlinear (finite W-)algebra [155] associated with a generic dilaton theory (2.9) (cf. Section 2.3).

$^{16}$ The non-compact form is SO(2,1)/SO(1,1). An early review on 2D gravity and 2D string theory from the stringy point of view is Ref. [151].
$^{17}$ To the best of our knowledge there is no literature on non-trivial models where $Z(X)$ is not invertible (cf. also [153]). By a suitable redefinition, a different simplification with $U(X) = 1$, $Z(X) = 1$ was proposed in Ref. [55].
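The equivalence of the two representations (2.9) and (2.10) under the redefinition (2.11) amounts to simple algebra, which the following Python sketch spot-checks numerically. It compares the two integrands (without the common factor $\sqrt{-g}$) at one sample point. The sample potentials and the numerical values are hypothetical and serve only as an illustration; $(\nabla\phi)^2$ is treated as a plain number of either sign.

```python
import math


def integrand_x(x, ricci, grad_x_sq, u, v):
    """Square bracket of (2.9): R X / 2 - U(X) (grad X)^2 / 2 + V(X)."""
    return 0.5 * ricci * x - 0.5 * u(x) * grad_x_sq + v(x)


def integrand_phi(phi, ricci, grad_phi_sq, u, v):
    """Integrand of (2.10) without sqrt(-g), including the overall factor 1/2."""
    x = math.exp(-2.0 * phi)
    u_tilde = 4.0 * x * u(x)   # U~(phi) = 4 e^{-2 phi} U(e^{-2 phi})
    v_tilde = v(x) / x         # V~(phi) = e^{2 phi} V(e^{-2 phi})
    return 0.5 * x * (ricci - u_tilde * grad_phi_sq + 2.0 * v_tilde)


def compare(phi, ricci, grad_phi_sq, u, v):
    """Evaluate both integrands for the same field configuration."""
    x = math.exp(-2.0 * phi)
    # From X = e^{-2 phi}: dX = -2 e^{-2 phi} d(phi), hence (grad X)^2 = 4 X^2 (grad phi)^2
    grad_x_sq = 4.0 * x * x * grad_phi_sq
    return (integrand_x(x, ricci, grad_x_sq, u, v),
            integrand_phi(phi, ricci, grad_phi_sq, u, v))


# Hypothetical sample potentials (singular U, constant V), for illustration only.
def sample_u(x):
    return -1.0 / (2.0 * x)


def sample_v(x):
    return -1.0


lhs, rhs = compare(phi=0.3, ricci=1.7, grad_phi_sq=-0.4, u=sample_u, v=sample_v)
```

Any other pair of functions `u`, `v` can be passed to `compare` in the same way, as long as they are defined for $X > 0$.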
More complicated models with $U(X) = 0$, but $V(X)$ exhibiting a singularity in $X$, among others may also involve solutions with space–time structure of a BH or its generalizations. For example, the choice$^{18}$

$$V_{RN} = -\frac{2M}{X^2} + \frac{Q^2}{4X^3} \tag{2.13}$$

produces a line element like the one for the Reissner–Nordström BH with charge $Q$ and mass $M$ [156,157]. Evidently in this case the singularities are kept fixed by parameters of the action. They cannot be related to the conservation law referred to already above for a “dynamical” model with singular non-vanishing $U(X)$ and regular $V(X)$.

A final remark for the case $U = 0$ concerns the possibility to eliminate the dilaton field altogether by means of the algebraic equation of motion produced by varying $X$ in (2.9), $V'(X) = -R/2$. If this equation can be inverted, the dilaton Lagrangian for $U = 0$ turns into a Lagrangian depending on a function of $R$ alone [57,153,158,159]:

$$L = \int d^2x\, \sqrt{-g}\, f(R). \tag{2.14}$$
As compared to such theories (2.14), the literature on models generalized so as to depend also on torsion (cf. (1.53)),

$$L = \int d^2x\, \sqrt{-g}\, h(R, \tau^a\tau_a), \tag{2.15}$$

is relatively scarce. It mainly consists of elaborations based upon the model of Katanaev and Volovich [27,28] where the function $h$ in (2.15) is quadratic in $R$ and linear in $\tau^a\tau_a$, also known as “Poincaré gauge gravity” [58,60–64].

Models with $U(X) \neq 0$ and different assumptions for that function and $V(X)$ have been studied extensively (cf. e.g. [29,55,56,81,82,152,160–166]). For their solution throughout these works, the conformal or the Schwarzschild gauge have been used, leading to complicated e.o.m.’s, the solution of which often requires considerable mathematical effort. Because we shall avoid this complication altogether (Section 3) no explicit examples of this approach will be given here.

$^{18}$ Solving the general theory in Section 3.1 we shall find that the potentials $U$ and $V$ as in (2.9) determining a dilaton action can even be “designed”, starting from a given line element.

2.1.4. Conformally related theories

Sometimes, it is convenient [167–175] to use a conformal transformation (1.56) with $\rho(X) = \frac{1}{2}\int^X U(y)\, dy$ in (2.9) to simplify the dynamics by the transition to a new theory with $\tilde U = 0$ and $\tilde V(X) = V(X)\exp(-2\rho)$. One has to keep in mind, however, that the two theories need not be equivalent physically. To interpret the results one must always return to the original theory. This subtlety was sometimes ignored. One source of this misunderstanding seems to be that in field theory, the transformation of field variables in a fixed flat Minkowski background is allowed, as long as such a transformation is regular. For a GDT (2.9) with singular $U$ this has to fail for two reasons. The first one is that such a conformal transformation must be singular in order to compensate for a singularity in $U(X)$. Still one could argue that locally such a transformation should be permissible. However, and this is the second crucial reason, in gravity the field theory in its variables at the
same time determines the (dynamical!) manifold upon which it lives. For a singular conformal transformation, the new manifold can possess completely different topological properties. An extreme example is the CGHS model (2.8) [52] which from a Schwarzschild-like topology may be transformed into flat (Minkowski) space. The reason can be seen most easily in the transformation behavior of geodesics: only null geodesics are mapped onto (in general non-affinely parameterized) null geodesics and their corresponding affine parameters are related by [133]

$$\frac{d\tilde\tau}{d\tau} \propto e^{2\rho}. \tag{2.16}$$
If $\rho$ approaches infinity at a certain point, by such a singular conformal transformation geometric properties like geodesic (in)completeness can be altered.$^{19}$

In fact, this misunderstanding had been clarified already half a century ago [177] in connection with the Jordan–Brans–Dicke theory in $D = 4$ [178–180]. There, already in $D = 4$, a “Jordan-field” $X$ is introduced in an action like (2.9) with $U(X) = \mathrm{const.}$ The $D = 4$ version of identity (1.56), together with an appropriate transformation of $X$, may be used to transform that action so that the term involving $R$ is reduced to the Hilbert–Einstein form. At that time a controversy arose whether the latter (the “Einstein frame”) or the original one (the “Jordan frame”) was the “correct” one. As argued by Fierz [177] the answer to that question depends on the definition of geodesics, to be used for the determination of the global topology (cf. Section 1.2). A geodesic depending on the metric $g$ in the Jordan frame is quite different from the one which feels the conformally transformed metric $\hat g$ in the Einstein frame. Of course, for a (globally) regular conformal transformation $\Omega^2$, $g = \Omega^2 \hat g$, it would be perfectly correct to simultaneously transform $g$ into the Jordan frame. But then the equation of the geodesic, when expressed in terms of $\hat g$, acquires an additional dependence on $\Omega(X)$, i.e. the test particle would feel a non-geodesic external force exerted by the Jordan-field $X$.

The confusion in $D = 2$ probably also originated from the by now very familiar situation in string theory [2,181]. Its conformal invariance does not carry over automatically to the world-sheet, where it is achieved by imposing the e.o.m.’s in target space (cf. Section 2.1.2). String theory yields dilaton gravity as its low energy limit also in higher dimensions. In that context, the Jordan frame is now usually called the string frame and the old discussion referred to in the previous paragraph has been resurrected in modern language [182–186].

A simple example of a singular conformal transformation leading to a change of (time-like) geodesic (in)completeness can be found in [133] (Fig. 9.1). Another obvious case is provided by the Schwarzschild metric, Eq. (3.37) below. A (singular) conformal transformation with $\Omega^2 = \xi^{-1} = (1 - 2M/r)^{-1}$ and a (singular) coordinate transformation $\tilde r = \int^r dy/\xi(y)$ leads to Minkowski space–time. This is, of course, a rather trivial consequence of (patchwise) conformal flatness of any 2D metric. It will be discussed below why ADM mass (Section 5.1) and Hawking radiation (Section 6) are, in general, different in conformally related theories.

$^{19}$ Since the usual conformal transformation involved in this context is proportional to the integral of $U(X)$ and the latter has a singularity in, practically, all physically interesting models, there will be at least one such singular point in addition to the (asymptotic) singularity at $X \to \infty$ [176].
2.2. Equivalence to
which seems to have been introduced first for the special case ($\mathcal{V} = 0$) in string theory [53], then considered for a special model in Ref. [54] and finally generalized to the most general form (2.17) for a theory of pure gravity in $D = 2$ in Refs. [72,73]. It depends on auxiliary fields $X^a$ and $X$ so that it is sufficient to include only the first derivatives of the zweibeine (torsion) and of the spin connection (curvature). The whole dynamical content is encoded in a (Lorentz-invariant) potential $\mathcal{V}$ multiplied by the volume form (1.45). In the following very often light-cone coordinates (1.57) and (1.59) will be used:

$$X_a De^a = X^+ (d - \omega)\, e^- + X^- (d + \omega)\, e^+. \tag{2.18}$$
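We also recall (1.49), the relation $2 * d\omega = R$ between spin connection and curvature scalar. The component version of (2.17) with (2.49) follows from the identification (cf. (1.53), (1.55)) implying the Hodge duals,

$$(d \pm \omega)\, e^\pm \Rightarrow \tfrac{1}{2}\,\tilde\epsilon^{\mu\nu}\, T^\pm{}_{\mu\nu}\, d^2x = (e)\,\tau^\pm\, d^2x,$$

$$d\omega \Rightarrow \tilde\epsilon^{\mu\nu}\partial_\mu\omega_\nu\, d^2x = (e)\,\frac{R}{2}\, d^2x,$$

$$\epsilon \Rightarrow -\tfrac{1}{2}\,\epsilon_{ab}\, e^a_\mu e^b_\nu\, \tilde\epsilon^{\mu\nu}\, d^2x = (e)\, d^2x, \tag{2.19}$$

as

$$L^{(FOG)} = \int d^2x\, \left\{ \tilde\epsilon^{\mu\nu} \left[ X^+ (\partial_\mu - \omega_\mu)\, e^-_\nu + X^- (\partial_\mu + \omega_\mu)\, e^+_\nu + X \partial_\mu\omega_\nu \right] + (e)\, \mathcal{V}(2X^+X^-, X) \right\}. \tag{2.20}$$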
The original intention of formulation (2.17) had been to express a general 2D Lagrangian involving the only independent geometric quantities (Ricci-scalar $R$, and torsion scalar $\tau^2 = \tau^a\tau_a = 2\tau^+\tau^-$, cf. (1.53), (2.15)),

$$L^{(R,\tau^2)} = \int d^2x\, \sqrt{-g}\, h(R, \tau^2), \tag{2.21}$$

in a simpler fashion. The variables $X^a$ and $X$ can be eliminated by the algebraic e.o.m.’s from variation $\delta X^a$, $\delta X$ in (2.17) or (2.20),

$$\tau_a + \frac{\partial\mathcal{V}}{\partial X^a} = 0, \qquad \frac{R}{2} + \frac{\partial\mathcal{V}}{\partial X} = 0, \tag{2.22}$$

provided the Hessian $|\partial^2\mathcal{V}/\partial X^A \partial X^B|$ does not vanish ($X^A = \{X, X^a\}$). Evidently, this is not always possible, but also, inversely, not every action $L^{(R,\tau^2)}$ permits a reformulation as $L^{(FOG)}$ in (2.17).$^{20}$

$^{20}$ For a mathematically more precise discussion of this point we refer to Ref. [153].
Fortunately, the relation of (2.17) to GDT (2.9) and especially to models with a physical motivation (e.g. SRG) is more immediate and subjected to weaker conditions. Then, instead, only $X^a$ and the torsion-dependent part of the spin connection are eliminated by e.o.m.’s which are linear and algebraic and thus may be reinserted into the action.$^{21}$ From the definition for $*T^a$ (1.53) with $\omega = \omega_a e^a$ in the local Lorentz basis $e^a$, the identities $e^a \wedge e^b = \epsilon^{ab}\,\epsilon$ and $*\epsilon = 1$, one gets

$$* T^a = * de^a - \omega^a \tag{2.23}$$

or

$$\omega = \omega_a e^a = e_a * de^a - e_a * T^a =: \tilde\omega - * T, \tag{2.24}$$

where $\tilde\omega$ represents the torsion-free part of the spin connection. The e.o.m. from variation of $X^a$ in (2.17),

$$de^a + \epsilon^a{}_b\, \omega \wedge e^b + \epsilon\, \frac{\partial\mathcal{V}}{\partial X_a} = 0, \tag{2.25}$$

after taking the Hodge dual, multiplication with $e_a$ and comparison with the identity (2.24) yields the relation between $*T$ and $\mathcal{V}$,

$$* T = -e^a\, \frac{\partial\mathcal{V}}{\partial X^a}. \tag{2.26}$$

Reinserting this algebraic equation into (2.17) produces

$$L^{(FOG)} = \int_{M_2} \left[ X^a \epsilon_{ab}\, \frac{\partial\mathcal{V}}{\partial X^c}\, e^c \wedge e^b + X\, d\tilde\omega - dX \wedge e^c\, \frac{\partial\mathcal{V}}{\partial X^c} + \epsilon\, \mathcal{V} \right], \tag{2.27}$$

where the torsion-dependent part of $\omega$ now has been eliminated, but the dependence on $X^a$ is retained. For potentials with $\partial\mathcal{V}/\partial X^a = 0$, Eq. (2.27) already by the second and third equations of (2.19) can be identified directly as GDT (2.9) with $U = 0$, $\mathcal{V}(X) = V(X)$. When $\partial\mathcal{V}/\partial X^a \neq 0$ the e.o.m. from $\delta X^a$ in (2.27) must be used,

$$(dX \wedge e^c + X^c \epsilon)\, \frac{\partial^2\mathcal{V}}{\partial X^c \partial X^a} = 0, \tag{2.28}$$

which for non-vanishing Hessian of $\mathcal{V}$, now with respect to the $X^a$ alone, leads to$^{22}$

$$X^a = *(e^a \wedge dX) = \frac{\tilde\epsilon^{\mu\nu}}{(e)}\, e^a_\mu\, (\partial_\nu X). \tag{2.29}$$

$^{21}$ This equivalence has been published first in Refs. [68,187] for the KV-model [27,28]. The proof below follows the formulation used in Ref. [188] for the even more general case of 2D dilaton supergravity (cf. also Section 4.2).
$^{22}$ For potentials $\mathcal{V}$ of the form (2.31), Eq. (2.29) does not hold necessarily at points $X_0$ where $U(X_0) = 0$.

For easy comparison with the GDT (2.9), before using (2.29) in (2.27), the latter action is rewritten in component form. After cancellation of two terms with $\partial\mathcal{V}/\partial X^a$, the final result is very simple,

$$L^{(dil)} = \int d^2x\, (e) \left[ \frac{X \tilde R}{2} + \mathcal{V}(-(\nabla X)^2, X) \right], \tag{2.30}$$
where, according to (2.29), the argument $X^a X_a$ in $\mathcal{V}$ has been replaced by a second derivative term of $X$, to be identified also here with the same dilaton field as in (2.9). The curvature scalar $\tilde R = 2 * d\tilde\omega$ refers to the torsionless part of the spin connection in (2.24). Thus it may be expressed equally well directly by the 2D metric $g_{\mu\nu}$. For potentials quadratic in the torsion-related variable $X^a$,

$$\mathcal{V} = U(X)\, \frac{X^a X_a}{2} + V(X), \tag{2.31}$$

action (2.30) exactly coincides with (2.9) in which torsion had been zero from the beginning. As we have used the algebraic (and even only linear) e.o.m.’s to reduce the configuration space of $L^{(FOG)}$ to the one of $L^{(dil)}$, the two actions lead to the same dynamics. This equivalence can be verified easily as well by the study of the explicit analytic solution (cf. Section 3.1). We anticipate also that at the quantum level the steps above can be simply reinterpreted as “integrating out” the torsion-dependent part of $\omega$ and $X^a$ [69], cf. footnote 60.

Apart from covering torsionless dilaton theories (2.9), the first-order formulation (2.17) also permits the inclusion of 2D theories with non-vanishing torsion. The choice

$$\mathcal{V}_{KV} = \frac{\alpha}{2}\, X^a X_a + \frac{\beta}{2}\, X^2 - \Lambda, \tag{2.32}$$

after elimination of $X^a$ and $X$ according to (2.22), produces the KV-model [27,28] which is quadratic in curvature and torsion and thus of the type of “Poincaré gauge” theory [128,129]. By our equivalence relation, it could also have been written as the corresponding dilaton theory (2.9), of course.

A final remark concerns the overall normalization of our action. By comparing the term $\propto R$ in (2.9) and in SRG (2.2), the factor $O_{D-2}/(16\pi G_N \lambda^{D-2})$ is replaced by $\frac{1}{2}$ in (2.9). We shall find it more convenient to stick to the latter normalization so that e.g. for SRG (2.2)

$$U_{SRG} = -\frac{D-3}{(D-2)X}, \qquad V_{SRG} = -\frac{\lambda^2}{2}\,(D-2)(D-3)\, X^{(D-4)/(D-2)}. \tag{2.33}$$

Of course, when introducing matter by spherical reduction (cf. Section 4.1) the same overall normalization must be chosen.

Fig. 2.1. A selection of dilaton theories
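The normalization statement can be checked directly: with the SRG potentials (2.33), twice the square bracket of (2.9) should reproduce the square bracket of (2.2). The short Python sketch below spot-checks this for a few dimensions and also confirms that the gauge choice $X = (\lambda r)^{D-2}$ turns the angular coefficient of (2.1) into $r^2$. The function names and the sample numbers are ours and purely illustrative.

```python
import math


def u_srg(x, dim):
    """U_SRG from (2.33)."""
    return -(dim - 3) / ((dim - 2) * x)


def v_srg(x, dim, lam):
    """V_SRG from (2.33)."""
    return -0.5 * lam**2 * (dim - 2) * (dim - 3) * x ** ((dim - 4) / (dim - 2))


def srg_bracket(x, ricci, grad_x_sq, dim, lam):
    """Square bracket of (2.2), with the overall prefactor dropped."""
    return (x * ricci
            + (dim - 3) / (dim - 2) * grad_x_sq / x
            - lam**2 * (dim - 2) * (dim - 3) * x ** ((dim - 4) / (dim - 2)))


def gdt_bracket(x, ricci, grad_x_sq, dim, lam):
    """Square bracket of (2.9) evaluated with the SRG potentials."""
    return 0.5 * ricci * x - 0.5 * u_srg(x, dim) * grad_x_sq + v_srg(x, dim, lam)


def areal_radius_squared(r, dim, lam):
    """Coefficient lam^{-2} X^{2/(D-2)} of (2.1) for X = (lam r)^{D-2}."""
    x = (lam * r) ** (dim - 2)
    return x ** (2.0 / (dim - 2)) / lam**2


# Hypothetical sample values for X, R and (grad X)^2.
sample = dict(x=2.0, ricci=0.8, grad_x_sq=-1.3)
ratios_match = all(
    math.isclose(srg_bracket(dim=d, lam=0.7, **sample),
                 2.0 * gdt_bracket(dim=d, lam=0.7, **sample),
                 rel_tol=1e-9)
    for d in (4, 5, 7, 11)
)
```

For $D = 4$ the potentials reduce to $U_{SRG} = -1/(2X)$ and $V_{SRG} = -\lambda^2$, which is the case used most often in the literature on the Schwarzschild BH.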
The potentials for the most frequently used dilaton models are summarized in Fig. 2.1.

2.3. Relation to Poisson-Sigma models

Possibly, the most important by-product of the approach to 2D-gravity theories as presented in this report has been the realization that all models of type (2.17) are a special case of the more
comprehensive concept of Poisson-Sigma models (PSM) [72,189–191] with the action

$$L^{(PSM)} = \int_{M_2} \left[ dX^I \wedge A_I + \frac{1}{2}\, P^{IJ} A_J \wedge A_I \right], \tag{2.34}$$
defined on a 2D base manifold $M_2$ with target space $N$ with coordinates $X^I$. Those coordinates as well as the gauge fields $A_I$ are functions of the coordinates $x^\mu$ on the base manifold ($X^I(x)$, $A_I(x)$). The same symbols are used to denote the mapping of $M_2$ to $N$. The $dX^I$ stand for the pullback of the target space differential $dX^I = dx^\mu \partial_\mu X^I$ and $A_I$ are one-forms on $M_2$ with values in the cotangent space of $N$. The non-trivial (topological) content of a certain PSM is encoded in the Poisson tensor $P^{IJ} = -P^{JI}$ which only depends on the target space coordinates. This tensor may be related to the Schouten–Nijenhuis bracket [192,193]

$$\{X^I, X^J\} = P^{IJ}, \tag{2.35}$$

which is assumed to obey a vanishing bracket of $P$ with itself, i.e. nothing else than a Jacobi identity which expresses the vanishing of the Nijenhuis tensor [193],

$$P^{IL}\, \frac{\partial P^{JK}}{\partial X^L} + \mathrm{cycl}(I, J, K) = 0. \tag{2.36}$$
Only for $P^{IJ}$ linear in $X^I$ (in gravity theories the Jackiw–Teitelboim model [21–26]), Eq. (2.36) reduces to the Jacobi identity for the structure constants of a Lie algebra and becomes independent of $X$. In general, algebra (2.35) with (2.36) covers a class of finite W-algebras [155]. Early versions of these non-linear algebras from 2D gravity were discussed as constraint algebra of the Hamiltonian in the context of the KV-model in [194], and with scalar and fermionic matter in [79]. The interpretation as a non-linear gauge theory in a related approach goes back to [54,189].

Although we are dealing with bosonic fields in the present section our notation anticipates already the graded PSM (gPSM) of supergravity in Section 4.3. Thus the index summation in (2.34) is in agreement with the convention used in supersymmetry and (just here and in Section 4.3) we shall also define, instead of (1.22), the exterior differentiation to act from the right:

$$d(\omega_p \wedge \omega_q) = \omega_p \wedge d\omega_q + (-1)^q\, d\omega_p \wedge \omega_q. \tag{2.37}$$

In the bosonic PSM for 2D gravity, action (2.34) reduces to (2.17) with the identification

$$X^I \to (X, X^a), \qquad A_I \to (\omega, e_a). \tag{2.38}$$
The component $P^{aX}$ of $P^{IJ}$ is determined by local Lorentz transformation for which (cf. (2.43))

$$P^{aX} = X^b \epsilon_b{}^a \tag{2.39}$$

is the generator. The components

$$P^{ab} = \mathcal{V}\, \epsilon^{ab} \tag{2.40}$$

contain the potential $\mathcal{V}(Y, X)$ which determines the specific model ($Y = X^a X_a/2$).
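That the tensor defined by (2.39) and (2.40) satisfies the Jacobi identity (2.36) for a Lorentz-invariant potential can be illustrated numerically. The Python sketch below is a rough check under stated assumptions: it uses $\eta = \mathrm{diag}(+1,-1)$, $\epsilon_{01} = +1$, central finite differences instead of exact derivatives, and a hypothetical sample point. As a contrast it also evaluates a “potential” that depends on $X^0$ alone and therefore is not Lorentz invariant.

```python
ETA = [[1.0, 0.0], [0.0, -1.0]]          # eta_{ab} = eta^{ab}
EPS_LOWER = [[0.0, 1.0], [-1.0, 0.0]]    # eps_{ab} with eps_{01} = +1


def poisson_tensor(point, potential):
    """P^{IJ} at point = (X, X^0, X^1); index 0 is X, indices 1 and 2 are X^a."""
    xa = point[1:]
    # eps_b^a = eps_{bc} eta^{ca} and eps^{ab} = eta^{ac} eta^{bd} eps_{cd}
    eps_mixed = [[sum(EPS_LOWER[b][c] * ETA[c][a] for c in range(2))
                  for a in range(2)] for b in range(2)]
    eps_upper = [[sum(ETA[a][c] * ETA[b][d] * EPS_LOWER[c][d]
                      for c in range(2) for d in range(2))
                  for b in range(2)] for a in range(2)]
    v = potential(point)
    p = [[0.0] * 3 for _ in range(3)]
    for a in range(2):
        p_ax = sum(xa[b] * eps_mixed[b][a] for b in range(2))  # (2.39)
        p[a + 1][0] = p_ax
        p[0][a + 1] = -p_ax
        for b in range(2):
            p[a + 1][b + 1] = v * eps_upper[a][b]              # (2.40)
    return p


def derivative(point, potential, index, step):
    """Central finite difference of P^{IJ} with respect to X^index."""
    plus, minus = list(point), list(point)
    plus[index] += step
    minus[index] -= step
    pp = poisson_tensor(plus, potential)
    pm = poisson_tensor(minus, potential)
    return [[(pp[i][j] - pm[i][j]) / (2.0 * step) for j in range(3)]
            for i in range(3)]


def jacobiator(point, potential, step=1e-5):
    """Left-hand side of (2.36) for (I, J, K) = (X, X^0, X^1)."""
    p = poisson_tensor(point, potential)
    dp = [derivative(point, potential, l, step) for l in range(3)]
    i, j, k = 0, 1, 2
    return sum(p[i][l] * dp[l][j][k] + p[j][l] * dp[l][k][i]
               + p[k][l] * dp[l][i][j] for l in range(3))


def invariant_potential(point):
    """Sample potential U(X) Y + V(X) with U = -1/(2X), V = -1 (illustrative)."""
    x, x0, x1 = point
    y = 0.5 * (x0 * x0 - x1 * x1)
    return -y / (2.0 * x) - 1.0


def non_invariant_potential(point):
    """Depends on X^0 only, hence not a function of Y and X."""
    return point[1]


sample_point = (1.5, 0.7, 0.3)  # hypothetical values of (X, X^0, X^1)
jacobi_invariant = jacobiator(sample_point, invariant_potential)
jacobi_broken = jacobiator(sample_point, non_invariant_potential)
```

In three target space dimensions the component $(X, X^0, X^1)$ is the only independent one, so a single number per potential suffices here. For the non-invariant choice the result is $-X^1$ rather than zero.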
With the present convention (2.37), the e.o.m.’s from (2.34) become

$$dX^I + P^{IJ} A_J = 0, \tag{2.41}$$

$$dA_I + \frac{1}{2}\, \frac{\partial P^{JK}}{\partial X^I}\, A_K \wedge A_J = 0. \tag{2.42}$$

The identities (2.36) are the essential ingredient to show the validity of the symmetries,$^{23}$

$$\delta X^I = P^{IJ} \epsilon_J, \tag{2.43}$$

$$\delta A_I = -d\epsilon_I - \frac{\partial P^{JK}}{\partial X^I}\, \epsilon_K A_J, \tag{2.44}$$

in terms of the local infinitesimal parameters $\epsilon_I(x)$. Eq. (2.44) reveals the gauge field property of $A_I$. Whereas for 2D gravity with (2.38), (2.39), (2.40) local Lorentz-transformations ($\epsilon_I \to \epsilon_X$) can be extracted easily from (2.43) and (2.44), diffeomorphisms (1.9) are obtained by considering $\epsilon_I = \xi^\mu A_{\mu I}$ [73].

Evidently, (2.34) is invariant under target space diffeomorphisms too. Only when those transformations are diffeomorphisms also globally, the topology of $M_2$ remains unchanged. It should be noted that conformal transformations of the world sheet metric can be expressed as target space diffeomorphisms. Otherwise the problems discussed in Section 2.1.4 are relevant also at the present, more general, level. Singular target space reparameterization (analogous to the conformal transformations discussed there) could eliminate singularities of the manifold $M_2$ if the identification (2.38) of the PSM variables is retained. Of course, an appropriate simultaneous (singular) redefinition in the relation between $A_I$ and the Cartan variables could formally keep the topology of $M_2$ in terms of the new variables intact, at the price of those singularities appearing in the relation between $A_I$ and $(e_a, \omega)$.

In 2D gravity the Poisson-tensor $P^{IJ}$ is not of full rank, because the number of target space coordinates is odd. This also may happen for general PSMs and it implies the existence of “Casimir functions”, whose commutator with $X^I$ in the sense of (2.35) vanishes. In 2D gravity there is only one such function,$^{24}$

$$\{X^I, C\} = P^{IJ}\, \frac{\partial C}{\partial X^J} = 0. \tag{2.45}$$

The conservation of $C$ with respect to both coordinates of the manifold,

$$dC = dX^I\, \frac{\partial C}{\partial X^I} = -P^{IJ} A_J\, \frac{\partial C}{\partial X^I} = 0, \tag{2.46}$$

follows from (2.45) and the use of (2.41) in (2.46). Lorentz-invariance requires $C = C(Y, X)$ with $Y = X^a X_a/2$ and thus according to (2.45) $C$ must obey (cf. (2.39) and (2.40))

$$\frac{\partial C}{\partial X} - \mathcal{V}(Y, X)\, \frac{\partial C}{\partial Y} = 0. \tag{2.47}$$

This partial differential equation has a simple analytic solution for the physically most interesting potentials of type (2.31). It will be discussed in connection with the solution in closed form in Section 3.1.

$^{23}$ Applying (2.43) and (2.44) to the commutator of infinitesimal transformations the resulting one is again a symmetry only if the e.o.m.’s (2.41) are used, or if $P^{IJ}$ is linear in $X$ [153].
$^{24}$ For more details regarding generic PSMs with more Casimir functions $C$, we refer to Ref. [153].
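To make the Casimir condition concrete, the sketch below tests a candidate function numerically against the form $\partial C/\partial X - \mathcal{V}\,\partial C/\partial Y = 0$. The candidate $C = e^{Q(X)}\, Y + w(X)$ with $Q' = U$ and $w' = e^{Q} V$ is an assumption made here for illustration (the text defers the solution to Section 3.1). It is evaluated for the sample potentials $U = -1/(2X)$, $V = -\lambda^2$, for which $e^{Q} = X^{-1/2}$ and $w = -2\lambda^2\sqrt{X}$. Derivatives are approximated by central finite differences.

```python
import math


def potential(y, x, lam=1.0):
    """Potential of type (2.31): U(X) Y + V(X) with U = -1/(2X), V = -lam^2."""
    return -y / (2.0 * x) - lam**2


def casimir(y, x, lam=1.0):
    """Assumed candidate C = e^{Q} Y + w with e^{Q} = X^{-1/2}, w = -2 lam^2 sqrt(X)."""
    return y / math.sqrt(x) - 2.0 * lam**2 * math.sqrt(x)


def casimir_residual(y, x, lam=1.0, step=1e-6):
    """dC/dX - V(Y, X) dC/dY, which should vanish for a Casimir function."""
    dc_dx = (casimir(y, x + step, lam) - casimir(y, x - step, lam)) / (2.0 * step)
    dc_dy = (casimir(y + step, x, lam) - casimir(y - step, x, lam)) / (2.0 * step)
    return dc_dx - potential(y, x, lam) * dc_dy


def jt_residual(y, x, big_lambda=0.8, step=1e-6):
    """Same check for U = 0, V = Lambda X with the candidate C = Y + Lambda X^2 / 2."""
    def c(yy, xx):
        return yy + 0.5 * big_lambda * xx * xx
    dc_dx = (c(y, x + step) - c(y, x - step)) / (2.0 * step)
    dc_dy = (c(y + step, x) - c(y - step, x)) / (2.0 * step)
    return dc_dx - big_lambda * x * dc_dy


residual_sample = casimir_residual(y=0.4, x=1.5, lam=0.9)
residual_jt = jt_residual(y=-0.2, x=2.0)
```

Both residuals vanish up to discretization error, which supports the reading of the Casimir condition used above. A candidate with the roles of $X$ and $Y$ interchanged would not pass this check.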
The rank of the Poisson-tensor is not constant in general but may change at special points in the target space or corresponding points on the world-sheet. A notable example is a Killing-horizon. Thus, the introduction of so-called Casimir–Darboux coordinates in which the Poisson-tensor

$$P^{IJ}_{CD} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \tag{2.48}$$

is constant only works patchwise. Such singular points may be modelled by “Casimir–non-Darboux” coordinates $Z^I$,

$$P^{IJ}_{CnD} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & Z^1 \\ 0 & -Z^1 & 0 \end{pmatrix}. \tag{2.49}$$

This allows the extension of patches over a point which is singular in Casimir–Darboux coordinates since for $Z^1 = 0$ the rank changes from 2 to 0. In addition, however, such a coordinate system still may change the singularity structure of the original theory: e.g. a singularity like the one at $X = 0$ in SRG is not visible in (2.49); thus the transformation between these coordinate systems must be singular at $X = 0$.

A different route to simplify the target space structure is symplectic extension [195]. By adding an auxiliary target space coordinate, one can elevate the Poisson structure to a symplectic structure. Again, this works only patchwise, in general, since the determinant of the Poisson-tensor can be singular. At a physical level, the symplectic extension resembles Kuchař’s geometrodynamics of the Schwarzschild BHs [32]: one introduces a canonically conjugate variable for the conserved quantity (in Kuchař’s scenario on the world-sheet boundary, in the symplectic extension in the bulk of target space).

For the application to gravity and supergravity theories in $D = 2$, we shall not need to know more about the PSM formulation. However, the field of PSM-theories recently has attracted substantial interest in string theory [75,76] and, quite generally, in mathematical physics in connection with the Kontsevich formula for the non-commutative star product$^{25}$ [198,199]. The quantization of general PSMs [200,201] essentially follows the approach which will be presented in Section 7 for the special case of dilaton gravity.

$^{25}$ For the definition and physical applications of deformation quantization, the seminal papers [196,197] may be consulted.

3. General classical treatment

Simple counting of degrees of freedom shows that dilaton gravity without matter fields in 2D has no propagating modes. Therefore, in terms of suitable variables, the dynamics may be made essentially trivial. This suggests (but in no way guarantees!) that all classical solutions can be found in a closed form.

As pointed out already above, the fact that the solutions for dilaton theories of type (2.9) can be obtained in analytic form had been tested in many specific cases [29,62,159,163,164,202], always
352
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
using the conformal gauge

$$ds^2 = 2e^{2\rho}\,dx^+ dx^- \tag{3.1}$$

or the Schwarzschild gauge (cf. (3.34)). With (3.1), even for a theory as simple as (2.12) the solution of a Liouville equation

$$\Box\rho = -\Lambda e^{\rho} \tag{3.2}$$

had been necessary. The advantage of the light-cone gauge for Lorentz indices combined with a temporal gauge for the Cartan variables,

$$e_0^+ = 0, \qquad e_0^- = 1, \qquad \omega_0 = 0, \tag{3.3}$$

was first realized [439] in connection with the classical solution of the KV-model [27]. In this gauge the line element for any 2D gravity theory becomes

$$(ds)^2 = 2e_1^+\,dx^1\,(dx^0 + e_1^-\,dx^1), \tag{3.4}$$

which in GR represents an Eddington–Finkelstein (EF) gauge [203,204]. In terms of the Killing-field $k^\alpha = (0, 1)$, the existence of which is a property of the solutions ($\partial g_{\mu\nu}/\partial x^1 = 0$), and redefining the $x^0$-coordinate by $d\bar x^0 = e_1^+(x^0)\,dx^0$, the line element (3.4) may be rewritten as

$$(ds)^2 = dx^1\,(2\,d\bar x^0 + k^2\,dx^1) \tag{3.5}$$
with the Killing-norm $k^2(\bar x^0) = k^\alpha k_\alpha$ containing all the information of the system (like $\rho(x)$ in the conformal gauge (3.1)). The key advantage of the ingoing (outgoing) EF gauge as compared to the conformal or the Schwarzschild gauge is that it is free from coordinate singularities on an ingoing (outgoing) horizon. The only singularities of $k^2(\bar x^0)$ correspond to singularities of the curvature; zeros $k^2(\bar x^0) = 0$ describe horizons. This gauge will turn out to be intimately related to the natural solution of the e.o.m.'s for all models in the first order formulation.

In Section 3.1 all classical solutions of GDTs without matter are determined in a very simple way, maintaining gauge invariance. Among the specific gauge choices, the EF gauge emerges as the most natural one, also for the analysis of the global structure of these solutions. The most important dilaton gravity models (cf. Table 2.1) belong to a two-parameter sub-family of all possible theories. This family of models is considered in more detail in the last subsection.

3.1. All classical solutions

In anticipation of what we shall need in Section 4.3 we derive the e.o.m.'s from an action (2.17) supplemented as $L = L^{(FOG)} + L^{(m)}$ by an, as yet, unspecified matter part $L^{(m)}$. The quantities

$$W^\pm := \delta L^{(m)}/\delta e^\mp, \qquad W := \delta L^{(m)}/\delta X \tag{3.6}$$

contain the couplings to matter. A dependence of $L^{(m)}$ on the spin connection or the auxiliary fields $X^\pm$ will be discarded (cf. Section 4.3). Variation of $\omega$, $e^\mp$, $X$, and $X^\mp$, respectively, yields the e.o.m.'s:

$$dX + X^- e^+ - X^+ e^- = 0, \tag{3.7}$$

$$(d \pm \omega)X^\pm \mp \mathcal{V} e^\pm + W^\pm = 0, \tag{3.8}$$

$$d\omega + \epsilon\,\frac{\partial \mathcal{V}}{\partial X} + W = 0, \tag{3.9}$$

$$(d \pm \omega)e^\pm + \epsilon\,\frac{\partial \mathcal{V}}{\partial X^\mp} = 0. \tag{3.10}$$

The first equation (3.7) can be used to eliminate the auxiliary fields $X^\pm$ in terms of $e^\pm$ and $dX$. The second pair (3.8) is contained in the set of higher-dimensional Einstein equations for the special case of dimensionally reduced gravity. Eq. (3.9) yields the dilaton current $W$ which is proportional to the trace of the higher-dimensional energy momentum tensor for dimensionally reduced gravity, and (3.10) entails the torsion condition. If the potential $\mathcal{V}$ is independent of the auxiliary fields $X^\pm$ the condition for vanishing torsion (1.59) is obtained. Of course, in addition to (3.7) – (3.10) the e.o.m.'s for matter $\delta L^{(m)}/\delta A = 0$ for generic matter fields $A$ must be taken into account as well.

In the present section we are interested only in the direct solution of (3.7) – (3.10) without fixing any gauge [67], in the absence of matter ($W = W^\pm = 0$). Linear combination of the two equations (3.8), multiplied, respectively, by $X^-$ and $X^+$ and using (3.7) leads to ($Y = X^a X_a/2 = X^+ X^-$)

$$d(X^+ X^-) + \mathcal{V}(Y, X)\,dX = 0. \tag{3.11}$$
This indicates the existence of a conservation law for a function $C(Y, X) = C_0 = \text{const.}$ which is nothing else than the Casimir function of the Poisson-Sigma model of Section 2.3. In the application to physically motivated 2D models, potentials of form (2.31) were found to be the most important ones. We therefore concentrate on those. Multiplying (3.11) by the integrating factor $\exp Q$ with

$$Q = \int^X U(y)\,dy, \tag{3.12}$$

one obtains the conservation law

$$dC = 0 \tag{3.13}$$

for the Casimir function

$$C = e^Q Y + w \tag{3.14}$$

with

$$w(X) = \int^X e^{Q(y)} V(y)\,dy. \tag{3.15}$$
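The conservation law (3.13) can be illustrated numerically. The following minimal sketch assumes that the potential (2.31) has the form $\mathcal{V} = V(X) + Y\,U(X)$ (consistent with (3.12) – (3.15)), so that (3.11), restricted to a curve $Y(X)$, reads $dY/dX = -(V + Y U)$. The function names, the reference point `x_ref` for both lower limits of integration, and the sample potentials are illustrative choices, not taken from the text.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0


def make_casimir(U, V, x_ref=1.0):
    """Return C(X, Y) = exp(Q) Y + w, with Q and w as in (3.12) and (3.15).

    Both lower limits of integration are set to x_ref (an assumption;
    the text leaves them to a model-dependent convention).
    """
    def Q(X):
        return simpson(U, x_ref, X)

    def w(X):
        return simpson(lambda y: math.exp(Q(y)) * V(y), x_ref, X)

    def C(X, Y):
        return math.exp(Q(X)) * Y + w(X)

    return C


def flow_Y(U, V, X0, Y0, X1, steps=400):
    """Integrate dY/dX = -(V + Y U) from X0 to X1 with classical RK4."""
    def rhs(X, Y):
        return -(V(X) + Y * U(X))

    h = (X1 - X0) / steps
    X, Y = X0, Y0
    for _ in range(steps):
        k1 = rhs(X, Y)
        k2 = rhs(X + h / 2, Y + h * k1 / 2)
        k3 = rhs(X + h / 2, Y + h * k2 / 2)
        k4 = rhs(X + h, Y + h * k3)
        Y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        X += h
    return Y


if __name__ == "__main__":
    # Sample potentials (illustrative): U = -1/(2X), V = -1.
    U = lambda X: -0.5 / X
    V = lambda X: -1.0
    C = make_casimir(U, V)
    Y_end = flow_Y(U, V, 1.0, 0.3, 4.0)
    print(C(1.0, 0.3), C(4.0, Y_end))  # the two values should agree
```

Changing `x_ref` rescales $C$ and shifts it by a constant, which mirrors the freedom in the lower limits of (3.12) and (3.15).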
Of course, any function of $C$ is also absolutely conserved. Therefore, for some specific model, among others, a suitable convention must be used to fix the lower limit of integration in $Q$ (influencing an overall factor of $C$) and the lower limit in (3.15) (yielding an additive overall contribution).

We assume $X^+ \neq 0$ which will be realized (cf. (3.8)) if $V(X) = 0$ does not possess a non-trivial solution for $X$. This is true in SRG, but e.g. in the KV-model [27] such a “point-solution” may appear for certain values of the parameters [66]. If $X^+ \neq 0$ the first component of Eq. (3.8) with a new one form $Z := e^+/X^+$,

$$\omega = -\frac{dX^+}{X^+} + Z\mathcal{V}, \tag{3.16}$$

determines the spin connection in terms of the other variables. In a similar way, Eq. (3.7) may be taken to define $e^-$:

$$e^- = \frac{dX}{X^+} + X^- Z. \tag{3.17}$$

From (3.16) and (3.17) and Eq. (3.10) with the upper sign, recalling that now

$$\epsilon = -e^- \wedge e^+ = -dX \wedge Z \tag{3.18}$$

for the potential (2.31), the short relation

$$dZ - dX \wedge Z\,U = 0 \tag{3.19}$$

follows. The ansatz $Z = \hat Z \exp Q$, with the same integrating factor (3.12) as the one introduced above for $C$, reduces Eq. (3.19) to $d\hat Z = 0$. Now by application of the Poincaré Lemma (cf. Section 1.2) $\hat Z = df$ is the only “integration” necessary for the full solution (footnote 26):

$$e^+ = X^+ e^Q\,df, \tag{3.20}$$

$$e^- = \frac{dX}{X^+} + X^- e^Q\,df, \tag{3.21}$$

$$\omega = -\frac{dX^+}{X^+} + \mathcal{V} e^Q\,df, \tag{3.22}$$

$$C = e^Q X^+ X^- + w(X) = C_0 = \text{const.} \tag{3.23}$$
Indeed, all the other equations are easily checked to be fulfilled identically. Eq. (3.23) can be used to express $X^-$ in terms of $X$ and $X^+$, so that in addition to $f$, besides the constant $C_0$, we have the free functions $X$ and $X^+$. Eqs. (3.7) – (3.10) are symmetric in the light-cone coordinates. Therefore, the whole derivation could have started as well from the assumption $X^- \neq 0$. It is straightforward, although eventually tedious in detail, to generalize the solution (3.20) – (3.23) to dilaton theories where in (2.9) the factor of the Ricci-scalar is a more general (non-invertible) function $Z(X)$.

Comparing the number of arbitrary functions ($f$, $X$, $X^+$) in the solution (3.20) – (3.23) with the three continuous gauge degrees of freedom, the theory is a topological one (footnote 27), albeit of a different type from other topological theories like the Chern–Simons theory [206 –210]: there is no discrete topological charge like the winding number associated to the solutions. In agreement with Section 2.3, the only variable which determines the different solutions for a given action is the constant $C_0 \in \mathbb{R}$. The key role of $C$ is exhibited by the line element

$$(ds)^2 = 2e^+ \otimes e^- = e^{Q(X)}\,df \otimes [2\,dX + 2(C_0 - w(X))\,df] \tag{3.24}$$

after elimination of $X^-$ by (3.23).

Footnote 26: This type of solution has been given first in Ref. [66], starting from the Darboux coordinates for the KV-model [27].

Footnote 27: In the sense that no continuous physical degrees of freedom are present [205].

Whenever a redefinition of $X$ by

$$d\tilde X = dX \exp Q \tag{3.25}$$

is possible (footnote 28), Eq. (3.24) becomes

$$(ds)^2 = 2\,df \otimes d\tilde X + \xi(\tilde X)\,df \otimes df, \tag{3.26}$$

$$\xi(\tilde X) = 2e^Q (C_0 - w)\big|_{X = X(\tilde X)}, \tag{3.27}$$
i.e. the EF gauge is obtained when $f$ and $\tilde X$ are taken directly as the coordinates. Then $\xi(\tilde X)$ coincides with the Killing-norm $k^2$ (cf. (3.5)). As we shall see in the next subsection the “topological” properties of $\xi(\tilde X)$, i.e. the sequence of singularities and zeros (horizons), and the behavior at the boundaries of the range for $\tilde X$, completely determine the global structure of a solution.

Eqs. (3.24) – (3.27), together with the definitions (3.12), (3.15) represent the main result of this section. They are exact expressions for the geometric variables and thus also for the line element $(ds)^2$ valid for (almost) arbitrary dilaton gravity models without matter.

It is now easy to specify other gauges by taking (3.24) as the point of departure, that is after having solved the e.o.m.'s (3.7) – (3.10) in the simple manner demonstrated above, namely inserting ($x^0 = t$, $x^1 = r$, $F' = \partial F/\partial r$, $\dot F = \partial F/\partial t$)

$$d\tilde X = \tilde X'\,dr + \dot{\tilde X}\,dt, \qquad df = f'\,dr + \dot f\,dt, \tag{3.28}$$

into (3.26) with (3.27) with appropriate choices for $\tilde X$ and $f$. Of particular interest are diagonal gauges, a class of gauges to which prominent choices (Schwarzschild and conformal gauge) belong. The absence of mixed terms $dr\,dt$ in the metric can be guaranteed in a certain patch by the gauge conditions

$$\tilde X = \tilde X(r), \tag{3.29}$$

$$\tilde X' + \xi f' = 0, \tag{3.30}$$

and $\dot f \neq 0$. The solution for $f$ from (3.30) (footnote 29),

$$f = -\int^r dx\,\frac{1}{\xi(\tilde X(x))}\,\frac{d\tilde X(x)}{dx} + \bar f(t) = -\frac{1}{2} K(\tilde X) + \bar f(t), \tag{3.31}$$

contains the integral $K(\tilde X)$ defined by

$$K(r) = 2\int_{r_0}^{r} dy\,\xi^{-1}(y). \tag{3.32}$$

The diagonal line element

$$(ds)^2 = \xi\,[(\dot f\,dt)^2 - (f'\,dr)^2] \tag{3.33}$$

for $\dot f = f' = 1$ attains the form of the conformal gauge. Requiring furthermore $\det g = -1$ as in Schwarzschild-type gauges yields

$$(ds)^2 = \xi\,(dt)^2 - \xi^{-1}(dr)^2. \tag{3.34}$$

Footnote 28: If the redefinition is not possible for all $X$ the computation of geodesics should start from (3.24).

Footnote 29: $\bar f(t)$ is arbitrary except $\dot{\bar f} \neq 0$.
As a concrete example we take SRG (2.33) with $D = 4$, where $Q_{SRG} = \int^X U_{SRG}(y)\,dy = -\frac{1}{2}\ln X$ with a natural choice $y = 1$ for the lower limit of integration and $w_{SRG} = -2\lambda^2\sqrt{X}$ with the lower limit $X = 0$ (cf. (3.12), (3.15)). The conserved quantity (3.14) becomes

$$C_{SRG} = \frac{X^+ X^-}{\sqrt{X}} - 2\lambda^2\sqrt{X} = C_0, \tag{3.35}$$

and the Killing-norm (3.27) reads

$$k^2_{SRG} = \xi_{SRG} = \frac{2C_0}{\sqrt{X}} + 4\lambda^2. \tag{3.36}$$
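The SRG expressions can be cross-checked with a few lines of code. The sketch below assumes $U_{SRG} = -1/(2X)$ and $V_{SRG} = -\lambda^2$, which are inferred here by differentiating the quoted $Q_{SRG}$ and $w_{SRG}$ (they are not spelled out in this section); the numerical value of $\lambda$ and all function names are illustrative. It verifies $dQ/dX = U$, $dw/dX = e^Q V$, and that the Killing-norm (3.36), divided by $4\lambda^2$, equals $1 - 2M/r$ with $r = \sqrt{X}/\lambda$ and $M = -C_0/(4\lambda^3)$.

```python
import math

LAM = 1.3  # illustrative value of lambda


def U_srg(X):
    return -1.0 / (2.0 * X)          # assumed: derivative of Q_SRG


def V_srg(X, lam=LAM):
    return -lam ** 2                 # assumed: e^{-Q} dw/dX


def Q_srg(X):
    return -0.5 * math.log(X)


def w_srg(X, lam=LAM):
    return -2.0 * lam ** 2 * math.sqrt(X)


def ddx(g, X, h=1e-5):
    """Central finite difference of g at X."""
    return (g(X + h) - g(X - h)) / (2.0 * h)


def killing_norm(X, C0, lam=LAM):
    """xi = 2 e^Q (C0 - w), cf. (3.27)."""
    return 2.0 * math.exp(Q_srg(X)) * (C0 - w_srg(X, lam))


def schwarzschild_factor(X, C0, lam=LAM):
    """1 - 2M/r with r = sqrt(X)/lambda and M = -C0/(4 lambda^3)."""
    r = math.sqrt(X) / lam
    M = -C0 / (4.0 * lam ** 3)
    return 1.0 - 2.0 * M / r


if __name__ == "__main__":
    X, C0 = 2.0, -3.0
    print(ddx(Q_srg, X), U_srg(X))
    print(ddx(w_srg, X), math.exp(Q_srg(X)) * V_srg(X))
    print(killing_norm(X, C0) / (4 * LAM ** 2), schwarzschild_factor(X, C0))
```

A negative $C_0$ corresponds to positive mass in this convention, so the horizon sits at $\sqrt{X} = -C_0/(2\lambda^2)$.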
In terms of the new variable $\tilde X = 2\sqrt{X}$, the EF gauge (3.26) follows with $\xi(\tilde X) = 4C_0/\tilde X + 4\lambda^2$. Further trivial redefinitions ($r = \tilde X/(2\lambda)$, $u = 2\lambda f$, $dt = du + dr/\xi$) yield the Schwarzschild metric [134]

$$(ds)^2_{sch} = \left(1 - \frac{2M}{r}\right)(dt)^2 - \left(1 - \frac{2M}{r}\right)^{-1}(dr)^2, \tag{3.37}$$

where, as expected, $C_0$ is related to the mass $M$:

$$M = -\frac{C_0}{4\lambda^3}. \tag{3.38}$$

From the steps leading to the line element (3.24) or to one of its subsequent versions, it is obvious that these steps can be retraced backwards as easily, say from a Killing-norm $\xi(\tilde X)$ in (3.27) towards an action. This procedure is not unique, because one function $\xi(\tilde X)$ is to be related to two other functions ($U$ and $V$). For an associated model (2.9) with $U = 0$, from (3.15) and (3.27) the potential $V(X)$ simply results by differentiation $dw/dX$. However, as emphasized already above, these models essentially encode their topology in the parameters of the action determined by $V(X)$. In the Killing-norm $\xi = 2(C_0 - w)$ the value of $C_0$ by which the solutions differ in those models only influences the position of the zeros of $\xi$, the horizons. For instance, the Reissner–Nordström solution of Eq. (2.13) follows from the potential anticipated in that equation. As will be seen in Section 4, when interactions with matter are turned on, $C$ of the present chapter is part of the conservation law involving matter contributions. Thus e.g. the influx of matter only changes $C$ and hence in models with $U = 0$ the position of the horizons, but does not change the “strength” of the singularities which is fixed here by the mass and the charge $Q$.

Actually, SRG belongs to the interesting class of models in which a given $\xi(\tilde X)$ is related to both functions $U$ and $V$ in a very specific way, namely, such that $C_0 = 0$ corresponds to a flat (Minkowski) manifold. According to (3.24) this can be simply achieved by the condition

$$e^Q w = \alpha = \text{const.}, \tag{3.39}$$

because then the only dependence on $X$ resides in the term $\propto C_0$ [68]. Thus all models of this class (“Minkowski ground-state dilaton theories”) are characterized by the relation between $U$ and $V$ in (2.9) following from (3.39):

$$V(X) = \frac{\alpha}{2}\,\frac{d}{dX}\exp\left[-2\int^X U(y)\,dy\right]. \tag{3.40}$$

A last related remark concerns the conformal transformation (1.56) of a dilaton theory represented by the first-order action (2.17). It is always possible, starting from a theory with $\bar V(X)$ and vanishing $\bar U(X)$, to arrive at a model with non-vanishing $U(X)$ by the transformation:

$$e^a = \bar e^a e^{Q/2}, \qquad X^a = \bar X^a e^{-Q/2}, \qquad \omega = \bar\omega + \frac{U}{2}\,\bar X^a \bar e_a, \qquad \mathcal{V} = e^{-Q}\,\bar V(X) + \frac{X^a X_a}{2}\,U(X), \tag{3.41}$$

where $Q$ is defined as in (3.12). In the language of the PSM formulation (Section 2.3) this is an explicit example of a target space diffeomorphism (footnote 30). However, as pointed out already several times, this mathematical transformation connects two models with solutions of, in general, completely different topology and/or properties regarding the role of the conserved quantity $C_0$.

3.2. Global structure

As emphasized in Section 1.2 the global properties of a solution for the geometric variables are obtained by following the path of some device on the manifold. The most important example is the geodesic (1.43) which may penetrate horizons, but ends when singularities are encountered at finite affine parameter. When no geodesic can reach a boundary of the space–time for finite values of the affine parameter, the space–time is called geodesically complete, otherwise geodesically incomplete. It should be emphasized that the procedure presented below does not require the explicit or implicit knowledge of Kruskal-like global coordinates.

For the analysis of the global structure, it is convenient to use outgoing or ingoing EF coordinates. In a simplified notation [136] the line element (3.26) is written for the outgoing case ($f \propto u$, $\tilde X \propto r$, $k^2(r) = \xi(r)$, $\xi_\infty := +1$)

$$(ds)^2_{out} = du\,(2\,dr + \xi(r)\,du). \tag{3.42}$$
The ingoing EF gauge (still with $\xi_\infty = +1$)

$$(ds)^2_{in} = dv\,(2\,dr - \xi(r)\,dv), \tag{3.43}$$

which will be used here in order to construct patch A of the conformal diagram for Schwarzschild space–time (see below), could have been obtained if one uses $Z := e^-/X^-$ for $X^- \neq 0$ in (3.16) – (3.23). Eq. (3.43) is the most suitable starting point for our subsequent arguments in the present example. For the ingoing metric ($x^\alpha = \{v, r\}$)

$$g_{\alpha\beta} = \begin{pmatrix} -\xi(r) & 1 \\ 1 & 0 \end{pmatrix}, \tag{3.44}$$

the geodesics ($v = v(\tau)$, $r = r(\tau)$) obey

$$\ddot v + \frac{\xi'\,\dot v^2}{2} = 0, \tag{3.45}$$

$$\ddot r - \xi'\,\dot r\,\dot v + \frac{\xi\,\xi'\,\dot v^2}{2} = 0. \tag{3.46}$$

These are the e.o.m.'s of (1.42) if the affine parameter $\tau$ is identified with $s$. The Killing-field $\partial/\partial v$ implies a constant of motion ($k^\alpha = (1, 0)$)

$$g_{\alpha\beta}\,k^\alpha \dot x^\beta = \dot r - \xi(r)\,\dot v = \sqrt{|\tilde A|} = \text{const.}, \tag{3.47}$$

which could have been derived as well from (3.46) and (3.45) by taking a proper linear combination. From (3.47) the affine parameter $\tau$ can be identified with parameters describing the line element (3.43),

$$(d\tau)^2 = \frac{1}{|\tilde A|}\,(dr - \xi\,dv)^2 = \pm(ds)^2, \tag{3.48}$$

where the two signs correspond to time-like, resp. space-like geodesics. The first-order differential equation

$$\frac{dv}{dr} = \frac{1}{\xi(r)}\left[1 \mp \left(1 + \frac{\xi}{A}\right)^{-1/2}\right] \tag{3.49}$$

for the two signs in (3.49) describes two types of geodesics $v^{(1)}(r)$ and $v^{(2)}(r)$ at each point $\{v, r\}$. To avoid confusion with the association of signs, in the new constant $A = \pm|\tilde A|$ the two signs from (3.48) have been absorbed so that now $A > 0$ and $A < 0$ correspond to time-like, resp. space-like geodesics. Inserting (3.49) into (3.48) provides the relation

$$s(r) = \pm\frac{1}{\sqrt{|A|}}\int_{r_0}^{r} dy\,\frac{1}{\sqrt{1 - \xi(y)/A}}. \tag{3.50}$$

Footnote 30: See the discussion after (2.44).

“Special” geodesics $\tilde A = 0$ with

$$\frac{dv}{dr} = \xi^{-1}(r), \tag{3.51}$$

$$s = \int_{r_0}^{r} dy\,|\xi(y)|^{-1/2}, \tag{3.52}$$
and “degenerate” ones obeying $A = \xi(r) = \text{const.}$ must be considered separately. The advantage of the EF-gauge is visible e.g. in (3.49). The geodesic with the upper sign in the square bracket clearly passes continuously ($C^\infty$) through a horizon where $\xi(r_h) = 0$ which, therefore, does not represent a boundary of the patch for the solution (provided $A \neq 0$).

For a first orientation of the global properties of a manifold, it is sufficient to study the behavior of null-directions. Light-like directions are immediately read off from $(ds)^2 = 0$ in (3.43):

$$(dv = 0) \;\to\; v^{(1)} = \text{const.} := v_0^{(1)}, \tag{3.53}$$

$$(dv \neq 0) \;\to\; v^{(2)} = K(r) + \text{const.} := K(r) + v_0^{(2)}, \tag{3.54}$$

with $K$ defined in (3.32). In terms of the new variables

$$\tilde v = v^{(1)}, \qquad \tilde u = v^{(2)} - K(r) \tag{3.55}$$

those null-directions become the straight lines $\tilde v = \text{const.}$ and $\tilde u = \text{const.}$ The line element in these (conformal) coordinates, of course, exhibits (coordinate) singularities at the horizons. It should be stressed that conformal coordinates are being used here only in order to be in agreement with standard diagrammatic representations. Furthermore, it is convenient to map $(ds)^2$ by a conformal transformation onto a finite region by considering e.g. [133]

$$(d\tilde s)^2 = \frac{2\,d\tilde u\,d\tilde v\,\xi(r(\tilde u))}{(1 + \tilde u^2)(1 + \tilde v^2)}, \tag{3.56}$$

where the powers of the two factors in the denominator are chosen appropriately. Light-like geodesics are mapped onto light-like geodesics, i.e. the causal structure is not changed by this transformation. The conformal diagrams obtained in this way have been introduced by Carter and Penrose [211,212].

The trivial example is Minkowski space with $\xi = 1$. From (3.56) with (3.32) and (3.37) both light-cone variables

$$\tilde u = v - 2(r - r_0), \qquad \tilde v = v \tag{3.57}$$

lie in $-\infty \le \{\tilde u, \tilde v\} \le +\infty$. By the compactification (3.56), in the line element

$$(d\tilde s)^2 = 2\,dU\,dV \tag{3.58}$$

the new variables $U = \arctan\tilde u$, $V = \arctan\tilde v$ are restricted to the finite interval $-\pi/2 \le U, V \le +\pi/2$.

3.2.1. Schwarzschild metric

As a typical non-trivial example for the general procedure in curved space [136] we take the Schwarzschild BH with

$$\xi(r) = 1 - \frac{2M}{r}. \tag{3.59}$$

The second light-like coordinate, solving (3.54) with (3.32),

$$v^{(2)} = v_0^{(2)} + 2r^* = v_0^{(2)} + 2r + 4M\ln\left|1 - \frac{r}{2M}\right|, \tag{3.60}$$

is intimately related to the “Regge–Wheeler tortoise coordinate” $r^*$, but, as we see below, the actual integration need not even be performed. It is sufficient to just regard the general features of the curves. The steps from Figs. 3.1 to 3.3 are obvious by inspection. The change to conformal (null) coordinates in Fig. 3.4 implies the introduction of $\tilde u$ as the horizontal axis. Thus the curves $v^{(2)} = \text{const.}$ in Fig. 3.3 are to be “straightened” into vertical lines. Above the line $\tilde u = 0$ this pushes the lines $r = \text{const.}$ in the regions A and B of Figs. 3.1–3.3 together so that they all terminate in the point (a) in Fig. 3.4. For negative $\tilde u$, those lines are pushed apart to end in the corners (b) and (c). The value $r = 2M$ corresponds to the lines (b) – (c) and (a) – (e) with the exception of the endpoints (a), (b) and (c). Similarly, the value $r = \infty$ corresponds to the lines (a) – (d) and (c) – (d), except for the endpoints (a) and (c). The integration constant $v_0^{(2)}$, the endpoint of those curves for $r = 0$ in Fig. 3.3, always terminates at some finite value which is smaller than all $v^{(1)}|_{r=0} > v_0^{(2)}$. Therefore, the left-hand boundary in Fig. 3.4 for $r = 0$ experiences a “cut off”, described by the line from (a) to (b) (footnote 31). In the language of general relativity the nomenclature for the points (a), (c), (d) and (e) is, respectively, $i^+$, $i^-$, $i^0$ and the bifurcation-two-sphere. The lines (a) – (b), (a) – (d), (d) – (c) and (a) – (e) are, respectively, the singularity, $\mathscr{I}^+$, $\mathscr{I}^-$ and the Killing-horizon.

Fig. 3.1. Killing norm (3.59) of Schwarzschild metric.

Fig. 3.2. Derivative of Eq. (3.54).

Fig. 3.3. $v^{(2)}$ of Eq. (3.54).

Footnote 31: Whether this is really a straight line as drawn in Fig. 3.4 depends among others on the compression factor. The same is also true for the other boundaries at $r = \infty$, $\tilde u = -\infty$. However, the shape of those curves is irrelevant, as far as the topological properties are concerned which are determined by their mutual arrangement only.
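The closed form in (3.60) can be checked against the definition (3.32) without performing the integral: it suffices that $dK/dr = 2/\xi(r)$ on both sides of the horizon. The following sketch does this by finite differences for the Schwarzschild Killing norm (3.59); the function names and the sample values of $M$ and $r$ are illustrative, and the absolute value inside the logarithm is an assumption made here so that one expression covers $r < 2M$ and $r > 2M$.

```python
import math


def xi(r, M):
    """Schwarzschild Killing norm, Eq. (3.59)."""
    return 1.0 - 2.0 * M / r


def K_closed(r, M):
    """Candidate antiderivative 2r + 4M ln|1 - r/(2M)|, cf. (3.60)."""
    return 2.0 * r + 4.0 * M * math.log(abs(1.0 - r / (2.0 * M)))


def dK_dr(r, M, h=1e-6):
    """Central finite difference of K_closed."""
    return (K_closed(r + h, M) - K_closed(r - h, M)) / (2.0 * h)


if __name__ == "__main__":
    M = 1.0
    for r in (0.5, 1.5, 3.0, 10.0):   # inside and outside the horizon r = 2M
        print(r, dK_dr(r, M), 2.0 / xi(r, M))
    # K diverges logarithmically at the horizon:
    print(K_closed(2.0 * M + 1e-9, M))
```

The logarithmic divergence of $K$ at $r = 2M$ is what pushes the lines $r = \text{const.}$ together or apart when $v^{(2)} = \text{const.}$ is straightened in the conformal coordinates.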
Fig. 3.4. Conformal coordinates with “compression factor”. [Figure labels: regions A and B; points (a) to (e); boundary $r = 0$.]

Fig. 3.5. Reorientation of Fig. 3.4: patch A. [Figure labels: regions A and B.]
We emphasize again that in the EF gauge the whole patch of Fig. 3.4 is connected by continuous geodesics. A treatment in the conformal gauge [213], although using simpler geodesics, suffers from the drawback that the connection between the regions A and B must be made by explicit continuation through the coordinate singularity at the horizon $r = 2M$. We now turn Fig. 3.4 by 45° (Fig. 3.5) and call it patch A. Clearly, $r \to \infty$ is complete in the sense of Section 1.2.1 because there the space becomes asymptotically flat (cf. (3.59)). The singularity at $r = 0$ can be reached for finite affine parameter. At the edge $v = -\infty$ incompleteness is observed, and (in conformal gauge) a coordinate singularity. Therefore, an extension must be possible. Indeed, introducing coordinates $v^B$, $r^B$ in patch B by

$$r^B = r, \qquad v^B = K(r) - v \tag{3.61}$$

with

$$dr = dr^B, \qquad dv = -dv^B + 2\xi^{-1}(r^B)\,dr^B \tag{3.62}$$

again transforms the line element (3.43) into itself, but with the replacement $r \to r^B$, $v \to v^B$. Moreover, we obtain the same differential equations as the ones in the patch A except for the change of sign $v \to -v$ (cf. (3.61)). As a consequence patch B is given by Fig. 3.6, the mirror image of Fig. 3.5. Further patch solutions C and D can be obtained by simply changing both signs on the right-hand side of (3.55), resp. (3.61), yielding the patches of Fig. 3.7. Now the key observation is that the lines $r = \text{const.}$ correspond to the same variable in the regions A of patch A and A′ of patch B. The same is true in B′ and B″ for patches B and C, and for A″ and A‴ for patches C and D. Superimposing those regions we arrive at the well-known Carter–Penrose (CP) diagram for the Schwarzschild solution (Fig. 3.8).

Fig. 3.6. Mirror image of Fig. 3.5: patch B. [Figure labels: regions A′ and B′.]

Fig. 3.7. Further flips: patches C and D. [Figure labels: regions B″, A″ and B‴, A‴.]

Fig. 3.8. CP diagram for the Schwarzschild solution.

3.2.2. More general cases

We have glossed over several delicate points in this procedure [136,214]. As pointed out at the beginning of this section, in a more complicated case a careful analysis of geodesics is necessary at external boundaries and, especially, at the corners of a diagram like Fig. 3.8. One may encounter “completeness” in this way in some corners, but also in the middle of a diagram, resembling Fig. 3.8. Still, in all those cases the analysis does not need the full solution of the geodesic equations (3.45), (3.46). It suffices to check their properties in the appropriate limits only.

Also the diagram alone may not be sufficient to read off some important “physical” properties. The line of reasoning, passing through Figs. 3.1–3.5, shows that obviously all Killing-norms $\xi(r)$ with one singularity, one (single) zero and $\xi_\infty = 1$ will lead to the same diagram Fig. 3.8. However, e.g. the incomplete boundary at the singularity may behave differently. For the CGHS model [52] in which the power $r^{-1}$ (or $r^{-(D-3)}$ for SRG from $D > 4$) is replaced by an exponential $e^{-r}$, only time-like geodesics are incomplete at $r = 0$. This means that light signals take “infinitely long time” to reach the singularity (null completeness), whereas massive objects do not.

Another important point to be checked is whether by superimposing patches around some center as the bifurcation 2-sphere in Fig. 3.8, one really arrives at B‴ = B (uniqueness), or whether this can be obtained only by imposing certain further conditions. Otherwise not a planar picture like Fig. 3.8, but an infinite continuation in the form of a “spiral staircase” extending above (and below) the drawing plane may emerge.

When $\xi$ exhibits two zeros as for the Reissner–Nordström metric

$$\xi_{RN} = 1 - \frac{2M}{r} + \frac{Q^2}{r^2}, \tag{3.63}$$

the basic patch of Fig. 3.9 with the superposition method described above leads to the well-known [215,216] one-dimensional infinite periodic extension of Fig. 3.10.

Fig. 3.9. Basic patch of Reissner–Nordström metric.

Fig. 3.10. Penrose diagram for Reissner–Nordström metric.
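Since the number of zeros of the Killing-norm decides which basic patch applies, it can be useful to count them explicitly. The sketch below does this for (3.63), using the elementary fact that $r^2\xi_{RN} = r^2 - 2Mr + Q^2$ is a quadratic in $r$; the function names are illustrative, and only positive roots are kept because $r = 0$ is the singularity rather than a horizon.

```python
import math


def xi_rn(r, M, Q):
    """Reissner-Nordstrom Killing norm, Eq. (3.63)."""
    return 1.0 - 2.0 * M / r + Q * Q / (r * r)


def rn_horizons(M, Q):
    """Positive zeros of xi_rn, i.e. roots of r^2 - 2 M r + Q^2 = 0."""
    disc = M * M - Q * Q
    if disc < 0:
        return []                      # no horizon (naked singularity)
    if disc == 0:
        roots = [M]                    # degenerate (extremal) horizon
    else:
        s = math.sqrt(disc)
        roots = [M - s, M + s]
    return [r for r in roots if r > 0]


if __name__ == "__main__":
    print(rn_horizons(1.0, 0.0))   # Schwarzschild limit: a single zero
    print(rn_horizons(1.0, 0.6))   # two zeros, as assumed for Fig. 3.9
    print(rn_horizons(1.0, 1.5))   # no zero
```

For $Q = 0$ the function returns the single Schwarzschild horizon $r = 2M$, in line with the one-zero case treated in Section 3.2.1.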
When even three zeros are present in the Killing-norm the global diagram becomes periodic in two directions (cf. e.g. [142], Figs. 2.1–2.3), i.e. covers the whole plane (footnote 32). Such one-, two- or more dimensional lattices exhibit discrete symmetries, which, in turn, may be used to compactify manifolds by identifying certain curves. If this identification occurs in a nontrivial manner, “solitonic” manifolds are produced [217], as illustrated by Fig. 3.11.

Fig. 3.11. A possible RN-kink (cf. [217]). [Figure label: identify.]

Footnote 32: Here we have even discarded the “uniqueness”-problem, referred to above.

3.3. Black hole in Minkowski, Rindler or de Sitter space

As a further illustration for the application of the methods described in this section a family of dilaton gravities [187] is considered which includes the physically most interesting models describing a single BH in Minkowski (cf. (2.33)), Rindler or de Sitter space. The potentials $U$ and $V$ are assumed to be of a simple monomial form,

$$U(X) = -\frac{a}{X}, \qquad V(X) = -\frac{B}{2}\,X^{a+b}. \tag{3.64}$$

Among the constants $a$, $b$ and $B$ only $a$ and $b$ distinguish between physically inequivalent models. $B$ plays the same role as $\lambda^2$ in (2.33), defining an overall scale factor.

In the line element (3.24) the functions $Q$ and $w$ read (cf. (3.12) and (3.15))

$$e^{Q(X)} = X^{-a}, \qquad w(X) = -\frac{B}{2(b+1)}\,X^{b+1}, \tag{3.65}$$

so that

$$(ds)^2 = X^{-a}\,df \otimes \left[2\,dX + 2\left(C_0 + \frac{B}{2(b+1)}\,X^{b+1}\right)df\right]. \tag{3.66}$$

The equation $w = C_0$ has at most one solution on the positive semi-axis. Hence the metric (3.66) exhibits at most one horizon. The most interesting models correspond to positive $a$ for which the function $X^{-a}$ diverges at $X = 0$.

For $X > 0$, in terms of the alternative definition of the dilaton field $\phi$ (2.11), the dilaton action (2.10) with the potentials (3.64) becomes

$$L^{(dil)} = \frac{1}{2}\int d^2x\,\sqrt{-g}\;e^{-2\phi}\left[R + 4a(\nabla\phi)^2 - B e^{2(1-a-b)\phi}\right], \tag{3.67}$$

which may be more familiar to a string audience.

It is also instructive to calculate the scalar curvature:

$$R = 2aC_0 X^{a-2} + \frac{Bb}{b+1}\,(b + 1 - a)\,X^{a+b-1}. \tag{3.68}$$
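Eq. (3.68) can be cross-checked against the line element (3.66). The sketch below assumes the sign convention in which, for an EF metric of the form (3.26), the scalar curvature equals the second derivative of the Killing-norm with respect to $\tilde X$; with $d\tilde X = X^{-a}dX$ this convention reproduces (3.68) term by term. Function names, step sizes and sample parameters are illustrative.

```python
def xi(X, a, b, B, C0):
    """Killing norm 2 e^Q (C0 - w) for the potentials (3.64), cf. (3.66)."""
    return 2.0 * X ** (-a) * (C0 + B * X ** (b + 1) / (2.0 * (b + 1)))


def d_dXtilde(g, X, a, h=1e-4):
    """d/dX~ = X^a d/dX, since dX~ = X^{-a} dX; central difference in X."""
    return X ** a * (g(X + h) - g(X - h)) / (2.0 * h)


def curvature_numeric(X, a, b, B, C0, h=1e-4):
    """Second X~-derivative of xi (assumed to equal R, see lead-in)."""
    first = lambda x: d_dXtilde(lambda y: xi(y, a, b, B, C0), x, a, h)
    return d_dXtilde(first, X, a, h)


def curvature_formula(X, a, b, B, C0):
    """Right-hand side of Eq. (3.68)."""
    return (2.0 * a * C0 * X ** (a - 2)
            + B * b * (b + 1 - a) / (b + 1) * X ** (a + b - 1))


if __name__ == "__main__":
    args = dict(a=0.3, b=0.8, B=2.0, C0=-1.0)
    print(curvature_numeric(1.7, **args), curvature_formula(1.7, **args))
    # Ground states (C0 = 0): flat for a = b + 1 and for b = 0,
    # constant curvature for a = 1 - b.
    print(curvature_formula(1.7, a=0.5, b=-0.5, B=2.0, C0=0.0))
    print(curvature_formula(1.7, a=0.7, b=0.0, B=2.0, C0=0.0))
    print(curvature_formula(1.7, a=0.0, b=1.0, B=2.0, C0=0.0),
          curvature_formula(5.0, a=0.0, b=1.0, B=2.0, C0=0.0))
```

The ground-state evaluations show directly which combinations of $a$ and $b$ remove the $X$-dependence of the second term in (3.68).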
In what follows only models with $b \neq -1$ will be considered (although $b = -1$ can be analyzed too [187]). For the “ground state” solutions $C_0 = 0$ only the second term in (3.68) survives. If $a = b + 1$ or $b = 0$ the scalar curvature of the ground state is zero. A more detailed analysis shows that the first case ($a = b + 1$) corresponds to Minkowski space, and the second ($b = 0$) to Rindler space. The condition $a = b + 1$ for the Minkowski ground-state models also follows from (3.40). If $a = 1 - b$ the ground state has constant curvature and corresponds to (anti-) de Sitter space.

For the general solutions (3.66) with $C_0 \neq 0$ it follows from (3.68) that, depending on the values of $a$ and $b$, they may show a curvature singularity at $X = 0$, at $X = \infty$, or at both values. In the special cases considered above there could be only one singularity. Therefore, these models describe (in a somewhat generalized sense) a single BH immersed in Minkowski, or Rindler, or de Sitter space.

Many interesting and important models belong to the two-parameter family of this section. The SRG models (2.2) for general dimension $D$ lie on the line $a = b + 1$ between $a = \frac{1}{2}$ (this point corresponds to $D = 4$) and $a = 1$. As $D$ grows, these models approach the point $a = 1$, corresponding to the CGHS model [52]. The point $a = 0$, $b = 1$ describes the Jackiw–Teitelboim model [21,23,26]. Lemos and Sa [166] gave the global solutions for $b = 1 - a$, Mignemi [218] considered $a = 1$ and all values of $b$. The models of [219] correspond to $b = 0$ and $a \le 1$. The general solution for the whole plane was obtained in Ref. [187] and is summarized in Fig. 3.12.

Fig. 3.12. “Phase” diagram showing the CP diagrams related to the ($a$- and $b$-dependent) action (3.67). Bold lines in those diagrams denote geodesically incomplete boundaries. Spherically reduced models lie on the half-line $b = a - 1$, $b \le 0$, the endpoint of which corresponds to CGHS. The special case of SRG from $D = 4$ is depicted by the point labelled by SBH. [Figure labels: axes $a$ and $b$; line $b = a - 1$; point SBH; “null incomplete at singularity”.]

In order to calculate quantities like the ADM mass and the Hawking flux it is essential to re-write the line element (3.66) for asymptotically Minkowski, de Sitter and Rindler models in such a form that it becomes the standard one in the asymptotic region. Here we give an explicit expression for the asymptotically Minkowski solutions only (other cases can be found in Ref. [220]) where such a representation is possible for $a \in (0, 2)$. We repeat the steps which led before from the metric (3.24) to the Schwarzschild black hole (3.37). Namely, we first introduce the coordinates $r$ and $u$ (cf. (3.26)),

$$u = \sqrt{\frac{B}{a}}\;f\,\mathrm{sign}(1 - a), \qquad r = \sqrt{\frac{a}{B}}\;\frac{X^{1-a}}{|1 - a|}, \tag{3.69}$$

and write the metric in EF form (3.27) where now $\xi$ reads

$$\xi(r) = 2C_0\,|1 - a|^{a/(a-1)} \left(\frac{B}{a}\right)^{(2-a)/(2(a-1))} r^{a/(a-1)} + 1. \tag{3.70}$$
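The passage from (3.66) to (3.70) is pure algebra and can be checked numerically. The sketch below assumes the Minkowski ground-state condition $b = a - 1$ and the coordinates $u = \sqrt{B/a}\,f\,\mathrm{sign}(1-a)$, $r = \sqrt{a/B}\,X^{1-a}/|1-a|$, for which $2\,du\,dr = 2\,df\,d\tilde X$ and the Killing-norm of (3.66) is rescaled by $a/B$. Function names and sample parameter values are illustrative.

```python
import math


def xi_rescaled(X, a, B, C0):
    """(a/B) * 2 X^{-a} (C0 + B X^a / (2a)): Killing norm of (3.66) for
    b = a - 1, rescaled so that the ground state C0 = 0 gives xi = 1."""
    return (a / B) * 2.0 * X ** (-a) * (C0 + B * X ** a / (2.0 * a))


def r_of_X(X, a, B):
    """Radial coordinate assumed in the lead-in."""
    return math.sqrt(a / B) * X ** (1.0 - a) / abs(1.0 - a)


def xi_closed(r, a, B, C0):
    """Closed form in terms of r, cf. (3.70)."""
    p = a / (a - 1.0)
    q = (2.0 - a) / (2.0 * (a - 1.0))
    return 1.0 + 2.0 * C0 * abs(1.0 - a) ** p * (B / a) ** q * r ** p


if __name__ == "__main__":
    for a in (0.5, 1.5):
        X, B, C0 = 3.0, 2.0, -1.0
        print(a, xi_rescaled(X, a, B, C0), xi_closed(r_of_X(X, a, B), a, B, C0))
    # Asymptotic regions: r -> infinity for a in (0, 1), r -> 0 for a in (1, 2).
    print(xi_closed(1e8, 0.5, 2.0, -1.0), xi_closed(1e-8, 1.5, 2.0, -1.0))
```

For $a = 1/2$ the exponent $a/(a-1)$ equals $-1$, so the Schwarzschild fall-off $\propto 1/r$ is recovered.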
Following the steps after (3.28) one arrives at the generalized Schwarz-schild metric (3.34) with as in (3.70). For a ∈ (0; 1) the asymptotic region corresponds to r → ∞, while for a ∈ (1; 2) it is reached with the limit r → 0. In both cases (r) → 1, and the metric assumes the standard Minkowski form g → diag(1; −1). For the asymptotically Rindler and de Sitter solution with a belonging to the same interval, a ∈ (0; 2), the quantity becomes (r) = rB21=2 − Mr a=(a−1)
(3.71)
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
367
for the Rindler solutions, and (r) = r 2 B2 − Mr a=(a−1)
(3.72)
for the de Sitter solutions. Explicit expressions for the constants M and B2 and for the variables t and r can be found in Ref. [220]. The presence of typical linear (Rindler) and quadratic (de Sitter) terms in (3.71) and (3.72) should be noted. Although the CGHS model a = 1, b = 0 belongs to the family of the asymptotically Minkowski models considered above, Eqs. (3.69) are singular at a = 1. Appropriate coordinates for this case are √ 1 u = − Bf; (3.73) r = − √ ln X : B The line element (3.27) now contains √
2C0 e Br : (3.74) (r) = 1 + B A somewhat non-standard feature of (3.74) is that the asymptotic region is situated at r → −∞. 4. Additional elds In 2D there are neither gravitons nor photons, i.e. no propagating physical modes exist [205]. This feature makes the inclusion of Yang–Mills 9elds in 2D dilaton gravity or an extension to supergravity straightforward. Indeed, both generalizations can be treated again as a PSM (2.34) with generalized AI and X I . More locally conserved quantities (Casimir functions) may emerge and the integrability concept is extended. Besides gauge 9elds also scalar and spinor 9elds may be added. If the latter are derived from higher dimensions through spherical reduction they are non-minimally coupled to the dilaton. The introduction of those 9elds in general destroys the integrability. Only in special cases exact solutions still can be obtained. An example are chiral fermions [79] or (anti-)self-dual scalars [80]. 4.1. Dilaton-Yang–Mills theory The interaction with additional one-form Yang–Mills 9elds Aa related to local gauge transformations belonging to a compact Lie group G is simply included in (2.17) by introducing further auxiliary variables Z a (additional target space coordinates in the PSM language) in the dilaton-Yang–Mills (DYM) action (DYM) L = [Xa Dea + X d! + Z a (DA)a + jV(X a Xa ; X; c1 (Z); c2 (Z); : : : cn−1 (Z))] : (4.1) M2
The gauge covariant derivative (DA)a = dAa + gfabc Ab Ac
(4.2)
contains the structure constants fabc and the gauge coupling g. The potential V, invariant under local Lorentz transformations and transformations G now also may depend on the Casimir invariants ci of the group G. For instance in G = SU (N ), there are N − 1 independent invariant polynomials of degree 2; 3; : : : ; N − 1 in terms of the components Z a .
368
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
The Abelian case ($f^{abc} = 0$ in (4.2)) is especially simple. There $V$ depends only on the single variable $Z$. Variation of (4.1) with respect to $A$ directly yields $dZ = 0$, i.e. $Z = Z_0 = \text{const.}$ is conserved; it is an additional Casimir function in the PSM interpretation of (4.1). Because $Z = Z_0$ results from solving a differential equation, it cannot simply be reinserted into (4.1). Variation with respect to $Z$ yields
\[ dA = -\epsilon \left. \frac{\partial V}{\partial Z} \right|_{Z = Z_0} . \tag{4.3} \]
The remaining e.o.m.'s can be solved as for (2.17) by (3.20)–(3.23) with just an additional dependence on the constant $Z = Z_0$ in the potential $V$ in (3.22) and in $w(X, Z)$ of (3.23) [221,222].

For a non-Abelian gauge group $G$ the coupling between the gauge fields $A^a$ and the geometric variables is somewhat more complicated, but as a PSM it can be treated again along the lines of Section 3.1. For a potential of the type $V(X^+ X^-, X) + \alpha(X) Z^a Z^a$ the solution of the geometric sector can be obtained like the one for an ordinary GDT, because $Z^a Z^a$ is constant on-shell [136,153,214]. Such a potential correctly reproduces the e.o.m.'s for ordinary $D = 2$ dilaton-Yang–Mills theory:
\[ (DA)^a = -\epsilon\, \alpha(X) Z^a , \qquad d(Z^a Z^a) = 0 . \tag{4.4} \]
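The on-shell constancy of $Z^a Z^a$ quoted in (4.4) can be made explicit in one line. The following is a sketch only, assuming totally antisymmetric structure constants (as for a compact group in an orthonormal basis); the constant $\kappa$ stands for the convention-dependent numerical factor and sign arising in the variation of (4.1) with respect to $A^a$ and is introduced here purely for illustration.

```latex
\begin{align}
  dZ^a &= \kappa\, g\, f^{abc} A^b Z^c
  && \text{(variation of (4.1) with respect to } A^a \text{)} \\
  d(Z^a Z^a) &= 2 Z^a\, dZ^a
              = 2 \kappa\, g\, f^{abc}\, Z^a A^b Z^c = 0
  && \text{(} f^{abc} = -f^{cba},\ Z^a Z^c \text{ symmetric in } a, c \text{)}
\end{align}
```

For $f^{abc} = 0$ the first line reduces to the Abelian statement $dZ = 0$.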
Some explicit solutions for the dilaton-Maxwell-scalar system can be found in Ref. [223]. The action (4.1) does not contain the special case which emerges from spherical reduction of Einstein–Yang–Mills (EYM) theory in $D = 4$. The reason is the following: while the Killing vectors $\xi_s$ associated with spherical symmetry act trivially on the metric, $\mathcal{L}_{\xi_s} g_{\mu\nu} = 0$ with $\mathcal{L}_{\xi_s}$ being the Lie derivative with respect to $\xi_s$, a corresponding transformation of the gauge field, $A_\mu \to A_\mu - \epsilon_s \mathcal{L}_{\xi_s} A_\mu$, can be compensated by a suitable gauge transformation, $A_\mu \to A_\mu + \epsilon_s D_\mu W_s$, such that $\mathcal{L}_{\xi_s} A_\mu = D_\mu W_s$ [224]. Already for $SU(2)$ the most general solution compatible with spherical symmetry, sometimes called "Witten's ansatz" [225], yields terms which are not described by (4.1), namely an additional (minimally coupled) charged scalar field with dilaton-dependent mass term and quartic self-interaction.$^{33}$

4.2. Dilaton supergravity

The superfield approach was applied to supersymmetric extensions of GDTs a long time ago [227]. Superfields are expressed in terms of supercoordinates $z^M = (x^m, \theta^\mu)$, where the bosonic coordinates $x^m$ are supplemented$^{34}$ by anticommuting (Grassmann) coordinates $\theta^\mu$. We assume the latter to be Majorana spinors, i.e. we restrict ourselves to $N = 1$ superspace in $D = 2$. Besides the $Z_2$-grading property for coordinates,$^{35}$
\[ z^M z^N = (-1)^{MN} z^N z^M , \tag{4.5} \]
the derivatives with respect to $z^M$ are defined to act to the right, $\vec{\partial}_M = \vec{\partial} / \partial z^M$. Only those derivatives will appear in the following, so the arrow will be dropped.
$^{33}$ See Ref. [226] for a comprehensive review of non-trivial EYM solutions.
$^{34}$ Throughout this subsection we employ the generally accepted supersymmetry notation with Latin indices from the middle of the alphabet for the holonomic bosonic coordinates $x^n$ (denoted "$x^\mu$" in the rest of this Report). Greek indices are reserved for the fermionic coordinates $\theta^\mu$. A similar notation is used in tangential space for Lorentz vectors $X^A = (X^a, X^\alpha)$ with indices from the beginning of the alphabet.
$^{35}$ In the exponent of $(-1)$, $M$ or $N$ is zero for a bosonic component; for a fermionic one $M$, resp. $N$, is 1.
Any vector field in superspace $V = V^M \partial_M$ is invariant under non-degenerate coordinate changes $z^M \to \bar{z}^M(z)$,
\[ V^M \partial_M = \bar{V}^L\, \frac{\partial z^M}{\partial \bar{z}^L}\, \frac{\partial \bar{z}^N}{\partial z^M}\, \frac{\partial}{\partial \bar{z}^N} , \tag{4.6} \]
which shows the advantage of the conventional summation of indices "ten to four" in supersymmetry, already introduced in the section on PSMs (Section 2.3). Now any formula of differential geometry in ordinary space of the PSM can be copied to superspace notation. The $p$-forms of Eq. (1.14) in Section 1.2.1 turn into the same expressions written in terms of $dz^M$ instead of $dx^\mu$, adhering strictly to the summation convention of supersymmetry:
\[ \Phi = \frac{1}{p!}\, dz^{M_p} \wedge \cdots \wedge dz^{M_1}\, \Phi_{M_1 \cdots M_p} . \tag{4.7} \]
Also the exterior differential is defined in agreement with Section 2.3 as
\[ d\Phi = \frac{1}{p!}\, dz^{M_p} \wedge \cdots \wedge dz^{M_1} \wedge dz^N\, \partial_N \Phi_{M_1 \cdots M_p} , \tag{4.8} \]
which implies the Leibniz rule (2.37) for superforms, already introduced in that section in anticipation of the graded PSM (gPSM) approach below. Clearly, the (anti-)symmetry properties of the tensor $\omega_{M_1 \cdots M_p}$ now depend on the graded commutation properties (4.5) of the indices. Instead of (1.13), (1.14) the one-forms of superzweibein and superconnection are
\[ E_M{}^A , \qquad \omega_{MA}{}^B = \omega_M L_A{}^B , \tag{4.9} \]
where in (4.9) the simplification for two dimensions with
\[ L_A{}^B = \begin{pmatrix} \epsilon_a{}^b & 0 \\ 0 & -\frac{1}{2} (\gamma_*)_\alpha{}^\beta \end{pmatrix} \tag{4.10} \]
already has been taken into account. The fermionic part in (4.10), the generator of Lorentz transformations in spinor space, agrees with (1.61). Covariant derivatives, defined by analogy to (1.23),
\[ \nabla_M V^A = \partial_M V^A + \omega_M V^B L_B{}^A , \qquad \nabla_M V_A = \partial_M V_A - \omega_M L_A{}^B V_B , \tag{4.11} \]
lead to the expressions for the components of supercurvature and supertorsion
\[ R_{MNA}{}^B = \bigl( \partial_M \omega_N - \partial_N \omega_M (-1)^{MN} \bigr) L_A{}^B =: F_{MN} L_A{}^B , \tag{4.12} \]
\[ T_{MN}{}^A = \partial_M E_N{}^A + E_N{}^B \omega_M L_B{}^A - (M \leftrightarrow N)(-1)^{MN} . \tag{4.13} \]
Again Bianchi identities, direct generalizations of (1.26) and (1.28), must hold (cf. [195], Eqs. (1.58) and (2.20)). They restrict the component fields contained in $E_M{}^A$ and $\omega_M$ when these expressions are expanded in terms of (a finite number of) ordinary fields, appearing as coefficients of powers of $\theta$ and $\theta^2 = \bar{\theta}\theta$. It turns out that to deal with supertorsion it is more convenient to use its anholonomic components:
\[ T_{AB}{}^C = (-1)^{A(B+N)} E_B{}^N E_A{}^M T_{MN}{}^C . \tag{4.14} \]
The literature on 2D supergravity (cf. e.g. [227–232]) is strongly influenced by its close relation to string theory, where the bosonic torsion vanishes, $T_{ab}{}^c = 0$. It uses the further constraints on $T_{AB}{}^C$
\[ T_{\alpha\beta}{}^c = 2i (\gamma^c)_{\alpha\beta} , \qquad T_{\alpha\beta}{}^\gamma = 0 , \tag{4.15} \]
the first of which is dictated by the requirement that in the limit of global transformations ordinary supersymmetry should be restored. The second one turns out to be a convenient choice, because then the Bianchi identities in a Wess–Zumino-type gauge are fulfilled identically [188]. In the application to 2D gravity including bosonic torsion, it seems natural to retain (4.15) and simply to drop the condition of vanishing bosonic torsion. However, as a consequence of the Bianchi identities it turned out [188] that the superfield components obtained in this manner do not, after all, permit the construction of 2D supergravity Lagrangians with non-vanishing bosonic torsion. Only after replacing (4.15) by the weaker set ($F_{\alpha\beta} = E_\alpha{}^M E_\beta{}^N F_{NM} (-1)^M$, cf. (4.12))
\[ (\gamma_a)^{\beta\alpha} T_{\alpha\beta}{}^c = -4i\, \delta_a{}^c , \qquad T_{\alpha\beta}{}^\gamma = 0 , \qquad (\gamma_a)^{\beta\alpha} F_{\alpha\beta} = 0 \tag{4.16} \]
a solution can be found [195]. However, the mathematical complexity of this approach becomes considerable. Instead, we turn to the generalization of the PSM, adding fermionic target space coordinates $\chi^\alpha$ and corresponding Rarita–Schwinger one-form fields $\psi_\alpha$ to the degrees of freedom in (2.34), (2.38) as$^{36}$
\[ X^I = (X, X^a, \chi^\alpha) , \qquad A_I = (\omega, e_a, \psi_\alpha) . \tag{4.17} \]
Apart from that, the PSM action retains the form (2.34). Both $\chi^\alpha$ and $\psi_\alpha$ denote Majorana fields when, as in what follows, $N = 1$ supergravity is considered.$^{37}$ The graded Poisson tensor $P^{IJ} = (-1)^{IJ+1} P^{JI}$, instead of (2.36), is now assumed to fulfill a graded Jacobi identity
\[ P^{IL}\, \vec{\partial}_L P^{JK} + \text{gcycl}(IJK) = 0 . \tag{4.18} \]
Except for the range of fields to be summed over, the e.o.m.'s are again (2.41), (2.42). The symmetries (2.43), (2.44) depend on infinitesimal local parameters $\epsilon_I = (\epsilon, \epsilon_a, \epsilon_\alpha)$. The mixed components $P^{\alpha X}$ are constructed by analogy to $P^{aX}$ in (2.39) with the appropriate generator $(-\gamma_*/2)$ of Lorentz transformations in 2D spinor space (cf. Eq. (4.10)). Then $d\epsilon_\alpha$ in the second set of Eq. (2.44) acquires an additional term casting it into the covariant combination $(D\epsilon)_\alpha$, with covariant derivative (1.63). This is precisely the form required for the (dilaton-deformed) supergravity transformation of the "gravitino" $\psi_\alpha$. As the Poisson tensor $P^{IJ}$ is not of full rank here either, Casimir functions $C(Y, X, \chi^2)$ exist which, following the same line of argument as in Section 2.3, obey $dC = 0$. From Lorentz invariance a bosonic $C$ in supergravity is of the form
\[ C = c + \tfrac{1}{2} \chi^2 c_2 , \tag{4.19} \]
$^{36}$ Under the title "free differential algebras" this has been proposed for simple models in Ref. [233], cf. also Refs. [189,234,235].
$^{37}$ In higher-$N$ supergravity Majorana fields $\chi^\alpha_{(1)}, \ldots, \chi^\alpha_{(N)}$ and corresponding $\psi_\alpha^{(i)}$ are needed with an additional $SO(N)$ symmetry.
where $c$ coincides with the quantity denoted by $C$ in the pure bosonic case (2.46), (3.23). However, fermionic Casimir functions may also appear (see below). The determination of all possible minimally extended supergravities reduces to the solution of the Jacobi identities (4.18). In the general ansatz for $P^{IJ}$
\[ P^{ab} = V \epsilon^{ab} , \tag{4.20} \]
\[ P^{bX} = X^a \epsilon_a{}^b , \tag{4.21} \]
\[ P^{\alpha X} = -\tfrac{1}{2} \chi^\beta (\gamma_*)_\beta{}^\alpha , \tag{4.22} \]
\[ P^{\alpha b} = \chi^\beta (F^b)_\beta{}^\alpha , \tag{4.23} \]
\[ P^{\alpha\beta} = v^{\alpha\beta} + \frac{\chi^2}{2}\, v_2^{\alpha\beta} , \tag{4.24} \]
the function
\[ V = \mathcal{V}(X, Y) + \frac{\chi^2}{2}\, v_2(X, Y) \tag{4.25} \]
contains the original bosonic potential $\mathcal{V}$. As explained above, Eqs. (4.21) and (4.22) are fixed completely by Lorentz invariance. This invariance also implies that the (symmetric) spinor-tensor $V^{\alpha\beta}$ in (4.24) can be expanded further into three scalar functions of $X$ and $Y$, multiplying the symmetric matrices $\gamma_*^{\alpha\beta}$, $X^a \gamma_a^{\alpha\beta}$, $X^a (\gamma_* \gamma_a)^{\alpha\beta}$:
\[ V^{\alpha\beta} = U \gamma_*^{\alpha\beta} + i \tilde{U} X^a \gamma_a^{\alpha\beta} + i \hat{U} X^a \epsilon_a{}^b \gamma_b^{\alpha\beta} = v^{\alpha\beta} + \frac{\chi^2}{2}\, v_2^{\alpha\beta} . \tag{4.26} \]
Each function $U$ again has a pure bosonic part and, as indicated in the second equality of (4.26), a term proportional to $\chi^2$. In a similar manner the spinor $F^b$ in (4.23) can be expressed in terms proportional to $\delta_\beta{}^\alpha$, $(\gamma_a)_\beta{}^\alpha$, $(\gamma_*)_\beta{}^\alpha$, which finally requires eight scalar functions of $X$ and $Y$, multiplied by appropriate factors constructed with the help of $X^a$ and $\epsilon^{ab}$ [77]. In (4.23) the multiplying factor $\chi$ precludes terms with factor $\chi^2$ in $F$. Thanks to (4.21) and (4.22) the Jacobi identities with one index referring to $X$ are fulfilled automatically. The remaining ones can be solved algebraically, provided a quite specific sequence of steps is followed (for details see [77]).

Three main cases are determined by the rank of the $2 \times 2$ spinor matrix $v^{\alpha\beta}$ in (4.26). For full rank ($\det v^{\alpha\beta} \neq 0$) the solution is found to depend on five scalar functions of $X$, $Y$ and the derivatives thereof, if the bosonic potential $\mathcal{V}$ in (4.25) is assumed to be given. If the fermionic rank is reduced, besides the bosonic Casimir function (4.19) one or two fermionic ones exist. They are of the generic form
\[ C^{(\pm)} = \chi^\pm \left( \frac{X^{--}}{X^{++}} \right)^{\pm 1/4} c_{(\pm)}(X, Y) \tag{4.27} \]
and owe their Lorentz invariance to the interplay of the Abelian boost transformations $\exp(\pm\beta)$ of the light-cone coordinates $X^{\pm\pm}$ related to $X^a$, and $\exp(\pm\beta/2)$ of the chiral spinor components $\chi^\pm$ of
$\chi^\alpha$. For fermionic rank 1, the general solution contains four arbitrary functions besides the bosonic potential, and one additional Casimir function of type (4.27). For rank zero of the fermionic extension in $P^{IJ}$, besides (4.19) both fermionic Casimir functions (4.27) are conserved and three functions remain arbitrary for a given bosonic potential. In order to avoid solving differential equations when imposing the Jacobi identities (4.18) also for reduced fermionic rank, it is important to make intensive use of the information on the Casimir functions.

The arbitrariness of the solution of the Jacobi identities can be understood as well by studying reparameterizations of the target space, spanned by the $X^I$ in the gPSM. Those reparameterizations may generate new models. Therefore, they can be useful to create a more general gPSM from a simpler one, although this approach is difficult to handle if the bosonic potential in (4.25) is assumed to be the given starting point. However, the subset of those reparameterizations which leaves a given bosonic theory unchanged may be analyzed. Again the same number of arbitrary functions emerges for the different cases described in the paragraphs above.

As an illustration we quote Eq. (4.252) from [77], which represents one (of many) supergravity actions which possess a bosonic potential (2.31) quadratic in torsion:$^{38}$
\[
\begin{aligned}
L^{(QBT)} = \int_M \Bigl[\, & X\, d\omega + X^a De_a + \chi^\alpha D\psi_\alpha + \epsilon \bigl( V + \tfrac{1}{2} X^a X_a U + \tfrac{1}{2} \chi^2 v_2 \bigr) \\
& + \frac{U}{4}\, X^a (\chi \gamma^3 \gamma_a \gamma^b e_b \psi) - \frac{iV}{2u}\, (\chi \gamma^a e_a \psi) \\
& - \frac{i}{2}\, X^a (\psi \gamma_a \psi) - \frac{1}{2} \Bigl( u + \frac{U}{8} \chi^2 \Bigr) (\psi \gamma^3 \psi) \Bigr] ,
\end{aligned}
\]
\[ v_2 = -\frac{1}{2u} \left( V U + V' + \frac{V^2}{u^2} \right) . \tag{4.28} \]
In this formula $U(X)$ is the quantity defined in (2.31). The prepotential $u$ is related to $U(X)$ and $V(X)$ by
\[ u^2(X) = -2 e^{-Q(X)} w(X) , \tag{4.29} \]
where $Q$ and $w$ have been defined in (3.12) and (3.15). The supergravity transformations of $e_a$ and $\psi_\alpha$ with small fermionic parameter $\epsilon$ for this action (4.28) (cf. Eqs. (4.255)–(4.259) of [77] and footnote 38) are of the form
\[ \delta e_a = i (\epsilon \gamma_a \psi) + \cdots , \qquad \delta \psi_\alpha = -(D\epsilon)_\alpha + \cdots . \tag{4.30} \]
Thus, they contain the essential terms but also others, not shown here, which arise from the deformation by the dilaton field. SRG is the special case (2.33) for $U$ and $V$, but as a supergravity extension (4.28) is not unique.

$^{38}$ We comply with our present notation by the replacements $\phi \to X$, $Z \to U(X)$ in Ref. [77]. Furthermore an arbitrary constant is now fixed as $\tilde{u}_0 = 1$.

A generic property of the fermionic extensions obtained in this analysis is the appearance of obstructions, which is a typical feature of supersymmetric theories (cf. e.g. [236]). The first type
of those consists in singular functions of the bosonic variables $X$ and $Y$, multiplying the fermionic parts of a supergravity action, when no such singularities are present in the bosonic part. But even in the absence of such additional singularities, a relation like (4.29) between the original potential and some prepotential $u$, dictated by the corresponding supergravity theory, either leads to a restriction of the range of $X$ and/or $Y$ as given by the original bosonic theory, or even altogether prevents any extension of the latter. Remarkably, a known 2D supergravity model like the one of Howe [227], which originally had been constructed with the full machinery of the superfield technique, is one example which escapes such obstructions. There, in our language, the PSM potential $V = -2\lambda^2 X^3$ permits an expansion in terms of the prepotential $u(X)$ through $V = -du^2/dX$ because $Q = 0$. An example where obstructions seem to be inevitable is the KV-model [27] with quadratic bosonic torsion. On the other hand, the supergravity extension of SRG following from the action (4.28) is free from such problems. However, it is not the only possible extension of the bosonic theory. Indeed, the hope that a link could be found between the possibility of reducing the arbitrariness of extensions referred to above and the absence of such obstructions did not materialize. Several counterexamples could be given, including different singular and non-singular extensions of SRG.

Another very important point concerns the "triviality" of supergravity extensions, proved earlier by Strobl [235]. It was based upon the observation that locally a formulation of the dynamics in terms of Darboux coordinates allows one to elevate the infinitesimal transformations (2.43), (2.44) (on-shell) to finite ones. Then the latter may be used to gauge the fermionic fields in 2D supergravity to zero.
By providing the explicit form of those Darboux coordinates in the explicit solution of a generic model, additional support has been given to the original argument of Ref. [235]. However, the appearance of the obstructions and the ensuing singular factors in the transition to the Darboux coordinates may introduce a new aspect. When those new singularities appear at isolated points without restriction of the range of the original bosonic field variable, they may be interpreted and discarded much like coordinate singularities. Another way to circumvent this problem in the presence of restrictions on the range, and thus to retain triviality, is to allow a continuation of our (real) theory to complex variables. This triviality disappears anyhow when interactions with additional matter fields are introduced which obey the same symmetry as given by the gPSM theory. A proposal in this direction can be found in Ref. [233].

In order to eliminate the arbitrariness of superdilaton extensions the only viable approach seems to be to start from a supergravity theory in higher dimensions (e.g. $D = 4$) and to reduce it (spherically or toroidally) to a $D = 2$ effective theory. It turns out that the Killing spinors needed in that case must be Dirac spinors, requiring the generalization of the work of Ref. [77] to (at least) $N = 2$, where, however, the same technique of gPSMs can be applied.

As in the bosonic case an action like (4.28) or, better, directly its gPSM form (2.34) can be converted into a dilaton theory by elimination of the torsion-dependent part of the spin connection and of $X^a$. Also for supergravity the equation for $X^a$ is independent of the potential $V$, Eq. (2.29) being replaced by
\[ X^a = -e_m{}^a \epsilon^{mn} \bigl[ (\partial_n X) + \tfrac{1}{2} (\chi \gamma_* \psi_n) \bigr] . \tag{4.31} \]
The corresponding dilaton action in the gPSM notation becomes
\[ L = \int_M \Bigl[ X\, d\tilde{\omega} + \chi^\alpha (D\psi)_\alpha + \tfrac{1}{2}\, P^{AB} \big|_{X^a}\, e_B e_A \Bigr] , \tag{4.32} \]
where $\tilde{\omega}$ is the torsionless part of the spin connection as in (2.27) and $|_{X^a}$ means that the components of the Poisson tensor (in the anholonomic basis) are to be taken with $X^a$ given by (4.31). When (4.32) is written in components (cf. Eq. (4.246) of [77]) with the bosonic potential (2.31) for the Howe model ($V = -\tfrac{1}{2} X^3$, $U = 0$), it can be checked that the superdilaton theory obtained in this way differs from the direct superextension of the bosonic theory [237]. The reason is that in our approach $X$ is directly promoted to be the bosonic component of the superfield, whereas in Ref. [237] the dilaton represents the bosonic part of yet another scalar superfield.

4.3. Dilaton gravity with matter

4.3.1. Scalar and fermionic matter, quintessence

For the inclusion of scalar matter as in Section 2.1 we start with the example of spherical reduction of Einstein theory. When massless scalar fields are coupled minimally in $D$ dimensions ($(\nabla_{(D)} \phi)^2 = g^{MN} \partial_M \phi\, \partial_N \phi$; $M, N = 0, 1, \ldots, D-1$),
\[ L^{(m,D)}_{(\phi)} = \frac{1}{2} \int d^D x\, \sqrt{-g_{(D)}}\, (\nabla_{(D)} \phi)^2 , \tag{4.33} \]
for scalar fields $\phi = \phi(x^0, x^1)$ the 2D reduced action in terms of the components of $g_{MN}$, as derived from the line element (2.1), becomes
\[ L^{(m)}_{(\phi)} = \frac{O_{D-2}}{\lambda^{D-2}} \int d^2 x\, \sqrt{-g}\, F(X) (\nabla \phi)^2 , \tag{4.34} \]
\[ F_{SRG}(X) = \frac{X}{2} . \tag{4.35} \]
Such an interaction is an example of non-minimal coupling ($F(X) \neq \text{const.}$) in the reduced case. Admitting a general function $F(X)$, a dilaton field dependence different from SRG (Eq. (4.35)) can be covered as well. An especially simple theory follows for $F = \text{const.}$, i.e. minimal coupling at the $D = 2$ level.$^{39}$ Below we shall absorb the relative factor $\tfrac{1}{2}$ between the dilaton action (2.2) and the first-order action (2.17), which of course must also be adjusted properly in the matter action, into the coupling function $F(X)$.
Dropping, as in our convention for the geometrical part of the FOG action in (2.17), the prefactor $2 O_{D-2} / \lambda^{D-2}$, the action (4.34) can also be written as
\[ L^{(m)}_{(\phi)} = \frac{1}{2} \int_{M_2} F\, d\phi\, {*d\phi} = \frac{1}{2} \int_{M_2} F\, (d\phi\, e^a) * (d\phi\, e^b)\, \eta_{ab} . \tag{4.36} \]
In order to avoid the delicate subject of Killing spinors, necessary for the (spherical) reduction of fermions$^{40}$ (cf. e.g. [238] for $D = 4$), we shall only deal with fermions introduced directly in $D = 2$. The diffeomorphism covariant generalization (cf. (1.63)) of the Dirac action
\[ L^{(m)}_{(\psi)} = \frac{i}{2} \int d^2 x\, \sqrt{-g}\, e_a{}^\mu \bigl( \bar{\psi} \gamma^a \overleftrightarrow{D}_\mu \psi \bigr) \tag{4.37} \]
as in Minkowski space must contain a two-sided derivative $a \overleftrightarrow{D} b = a (Db) - (Da) b$ in order to yield a real action. The cancellation of $\omega$ in that derivative is a peculiar feature of $D = 2$, i.e. the simplification $\overleftrightarrow{D} = \overleftrightarrow{\partial}$ is possible there. Thus both interactions (4.36) and (4.37) do not depend on the spin connection, as anticipated already in deriving the (classical) e.o.m.'s with matter, Eqs. (3.6)–(3.10).

$^{39}$ This is not to be confused with $F \propto X$, corresponding to minimal coupling in the original dimension in the case of SRG. We will refer to that case as non-minimal coupling.
$^{40}$ Such a reduction yields a dilaton-dependent "mass" term and coupling of spinors to the auxiliary fields $X^\pm$.

An additional geometrical degree of freedom may also appear already at the $D = 4$ level. Recently, in connection with the observation of supernovae at high values of the redshift [239–242], the validity of the Hilbert–Einstein theory has been put into doubt [243]. The simplest theoretical description requires the introduction of a (still very small) cosmological constant $\Lambda$ (de Sitter theory). Therefore, extensions of the Einstein theory towards the old Jordan–Brans–Dicke (JBD) theory [177,179,180] have been revived, where already at $D = 4$ an additional scalar field $\phi$ (Jordan field, "quintessence") is assumed to exist [244–248]. Then, already in $D = 4$ an action like (2.9) is postulated with $X \to \phi$ and appropriate assumptions for the functions $U_{(4)}(\phi)$, $V_{(4)}(\phi)$ so that the "effective cosmological constant" is driven to its present (small) value:
\[ L^{(Q)} = \int d^4 x\, \sqrt{-g_{(4)}}\, \bigl[ R_{(4)} \phi + U_{(4)}(\phi) (\nabla_{(4)} \phi)^2 + V_{(4)}(\phi) \bigr] . \tag{4.38} \]
After spherical reduction of (4.38), even without including a genuine matter interaction, a 2D theory emerges where the (in $D = 4$ geometric) variable $\phi$ turns into something like an additional scalar field (besides the genuine $D = 2$ dilaton field $X$). These "two-dilaton theories" have been studied in more detail in [249]. The most interesting feature of such theories is that one dilaton field plays the role of a "geometric" dilaton field in $D = 2$ and the other one behaves like matter, providing continuous physical degrees of freedom. We shall discuss the classical and quantum properties of dilaton gravity with (4.34), (4.35) in Section 7.

4.3.2. Exact solutions: conservation law for geometry and matter

In the presence of interactions with additional fields which, in contrast to gauge fields (cf. Section 4.1) or supergravity (Section 4.2), cannot be incorporated into the PSM approach, the possibility to find analytic solutions is restricted.$^{41}$ Nevertheless, interest in such solutions had been raised, especially by the work on the CGHS model (cf. Section 2.1.2). It possesses a global structure very much like the one of the genuine Schwarzschild BH (Fig. 3.8). Also, application of the (singular) dilaton-field-dependent conformal transformation (1.56) with conformal factor
\[ \rho_{CGHS} = -\frac{1}{2} \int^X U_{CGHS}(X')\, dX' = \frac{1}{2} \ln X \tag{4.39} \]
is found to cancel the torsion term (cf. Section 2.1.4). The transformed potentials read
\[ \tilde{U}(X) = 0 , \qquad \tilde{V}(X) = V_{CGHS}(X)\, e^{Q_{CGHS}(X)} = \text{const.} \tag{4.40} \]
Then also the action for the scalar field (4.36) becomes the one in a flat background ($g_{\mu\nu} \to \eta_{\mu\nu}$). In this model $F$ is taken to be constant (minimal coupling). After (trivially) solving for the (free) scalar or fermionic fields, the inverse conformal transformation is applied. This method has also been extended to include one-loop quantum effects in the semi-classical approach, by describing that lowest-order quantum effect through the Polyakov effective action [252]. Adding to this action another piece, adapted suitably so that the exact solubility is maintained, more semi-classical solutions have been studied [82–88]. An approximate analysis of the solutions is of course possible for a larger class of models. For instance, it has been demonstrated [253] that adding the Polyakov term to SRG shifts and attenuates the BH singularity.

$^{41}$ It is possible, however, to adjust the dilaton potentials in such a way that some exact solutions with matter can be obtained [250,251].

In the exact solution without matter, the important step has been to use one of the Eqs. (3.8) to express $\omega$ as in (3.16). If the theory depends only on one type of chiral fields, either $W^-$ or $W^+$ in (3.8) vanishes. Then the same elimination of $\omega$ can be used. Separating e.g. Dirac fermions into chiral components as
\[ \psi = \sqrt[4]{2} \begin{pmatrix} \chi_R \\ \chi_L \end{pmatrix} , \qquad \bar{\psi} = \sqrt[4]{2}\, \bigl( \chi_R^\dagger , \chi_L^\dagger \bigr) , \tag{4.41} \]
the interaction (4.37) may be written as
\[ L^{(m)}_{(\psi)} = -\int_{M_2} \bigl( e^+ J^- + e^- J^+ \bigr) , \tag{4.42} \]
\[ J^{-,+} = J^{R,L} = i \bigl[ \chi^\dagger_{R,L} (d\chi_{R,L}) - (d\chi^\dagger_{R,L}) \chi_{R,L} \bigr] . \tag{4.43} \]
Also "amplitudes" $k$ and "phases" $\varphi$ can be introduced,
\[ \chi_{R,L} = \frac{1}{\sqrt{2}}\, k_{R,L}\, e^{i \varphi_{R,L}} , \tag{4.44} \]
in terms of which the chiral currents (4.43) become
\[ J^{+,-} = -k^2_{L,R}\, d\varphi_{L,R} . \tag{4.45} \]
Since the theory depends only upon one type of chiral fields, either $J^+$ or $J^-$ in (4.42) is zero [79,80]. An equation like (3.16) without matter contribution still holds, and the further steps of the solution for the geometric variables are exactly as in the matterless situation, except for the matter contribution in the conservation law. In the general case the latter may be derived from the relation, following from (3.8) and (3.7),
\[ d(X^+ X^-) + V(X, Y)\, dX + X^- W^+ + X^+ W^- = 0 . \tag{4.46} \]
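How (4.46) turns into a conservation law can be sketched in two lines. The following assumes a potential of the form $V(X, Y) = V(X) + Y\, U(X)$ with $Y = X^+ X^-$, and takes the functions $Q$ and $w$ to satisfy $dQ = U\, dX$ and $dw = e^{Q} V(X)\, dX$; these are the properties we assume for the definitions referred to as (3.12) and (3.15), which are not reproduced in this section.

```latex
\begin{align}
  0 &= e^{Q} \bigl[\, dY + \bigl( V(X) + Y\, U(X) \bigr)\, dX
        + X^- W^+ + X^+ W^- \bigr] \\
    &= d\bigl( e^{Q} Y + w \bigr) + e^{Q} \bigl( X^- W^+ + X^+ W^- \bigr) ,
\end{align}
```

where $d(e^{Q} Y) = e^{Q} (dY + U\, Y\, dX)$ has been used. Without matter ($W^\pm = 0$) the combination $e^{Q} Y + w$ is therefore constant, and with matter its differential is balanced by the $W^\pm$ terms.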
For chiral fermions only one of the last two terms remains. As the consequences of (4.46) are of more general importance we will come back to that relation when the situation without restrictions on matter is discussed below. The matter equations for chiral fermions ($k_L = 0$ in (4.44)), obtained by variation of $\varphi_R$ and $k_R^2$,
\[ e^+\, d\varphi_R = 0 , \tag{4.47} \]
\[ d(k_R^2\, e^+) = 0 , \tag{4.48} \]
are solved easily, because $e^+$ is the previous solution (3.20). Eq. (4.47) implies that $\varphi_R = \varphi_R(f)$ where $f$ is the arbitrary function introduced in (3.20). Inserting the latter into (4.48) determines the amplitude:
\[ k_R^2 = \frac{e^{-Q(X)}}{X^+}\, g(f) . \tag{4.49} \]
An analogous procedure works for chiral scalars. Variation with respect to $e^\mp$ of the action (4.36) for $F = \text{const.}$ in light-like coordinates ($\phi^\pm = *(d\phi\, e^\pm)$) yields
W ± = 12 (∗± ) d ;
(4.50)
so that for (anti-)self-dual scalars with either $\phi^+ = 0$ (self-dual: $*d\phi = d\phi$) or $\phi^- = 0$ (anti-self-dual: $*d\phi = -d\phi$) again one of the Eqs. (3.8) is independent of the matter contribution. The subsequent steps to obtain the exact solution proceed as for chiral fermions. The e.o.m. for minimally coupled scalars can be written as
\[ d * d\phi = d(\phi^- e^+) = d(\phi^+ e^-) = 0 , \tag{4.51} \]
where the last two forms are to be used, respectively, in the (anti-)self-dual cases. Thus (anti-)self-dual matter identically solves (4.51). Furthermore, e.g. for self-dual $\phi$ the condition $e^+ d\phi = 0$ makes the lines of $\phi = \text{const.}$ light-like. With the same solution (3.20) for $e^+$ as without matter, the latter relation leads to $d\phi\, df = 0$, i.e. $\phi = \phi(f)$, or vice versa. As $f$ represented a null coordinate in the EF line element, the peculiar light-like nature of $\phi$ is confirmed. The similarity between (anti-)self-dual scalars and chiral fermions is not surprising in view of the well-known close relation between scalars and fermions in $D = 2$ [254].

Other non-trivial examples of exactly soluble systems are static solutions$^{42}$ [256] or continuously self-similar solutions [257] of SRG with a massless (non-minimally coupled) scalar field. Minimally coupled scalars with SRG for the geometric sector can be solved in a perturbative manner [258] or in the static limit [259,260]. With the exception of the aforementioned special cases, the CGHS model, the models of Refs. [250,251] or some other simple potentials like e.g. constant $V$ (i.e. Rindler metric) and of teleparallelism (pure torsion, $V = 0$) [261], no exact solution with general (scalar or fermionic) matter seems to be known.$^{43}$ Indeed, one of the main open problems in classical dilaton theory with matter is an analytic (as opposed to numerical) description of non-trivial systems showing the feature of critical collapse.$^{44}$

$^{42}$ An extensive discussion of these solutions can be found in Refs. [142,255].
$^{43}$ Recently, cosmological solutions in the JT model have been obtained [262,263].
$^{44}$ Here Ref. [264] for the seminal work of Choptuik should be quoted. It had been triggered by previous analytic studies of Christodoulou [265–268]. Recent reviews are Refs. [269,270].

5. Energy considerations

It is important to clarify the relation of certain quantities appearing in generic ($D = 2$) gravity theories (like the absolutely conserved quantity $C^{(g)}$) to the corresponding concepts well known from $D = 4$ Einstein gravity.

5.1. ADM mass and quasilocal energy

Because of diffeomorphism invariance the Hamiltonian density vanishes on the surface of the constraints in all gravity theories including all 2D dilaton models.$^{45}$

$^{45}$ See also the general Hamiltonian analysis in Section 7.

However, a boundary term must be included in the Hamiltonian which can be used to define a global or "quasilocal" energy. This is the essence of the Arnowitt–Deser–Misner procedure [271]. As discussed in detail by Faddeev [272] in the context of 4D Einstein gravity, this boundary is the one at (spatial) infinity, and the value of the gravitational energy is very sensitive to the choice of asymptotic conditions and it is
not invariant under the change of coordinates at infinity. This reflects the fact that the energy is related to an observer connected with the asymptotic coordinate system who measures this energy. In generic 2D dilaton gravity the situation becomes even more complicated since a natural asymptotic coordinate system does not exist for some of the models.$^{46}$ This lack of asymptotic diffeomorphism invariance (and of conformal invariance) has led to much confusion in the literature.

To facilitate comparison with other works on the ADM approach we consider here the second-order dilaton action (3.67) with the exponential parameterization $X = e^{-2\phi}$ for the dilaton field. Since only the first term $e^{-2\phi} R$ is essential for the calculation of the ADM mass, this does not imply any restrictions on the potentials $U$ and $V$. The diagonal gauge for the metric ($N = \sqrt{\xi}$, $\Lambda = 1/\sqrt{\xi}$ in (3.33)),
\[ (ds)^2 = N^2 (dt)^2 - \Lambda^2 (dr)^2 , \tag{5.1} \]
is sufficient for our purposes since the lapse $N$ is a Lagrange multiplier for the Hamiltonian constraint in generally covariant theories. A more complete canonical analysis can be found in Refs. [32,40,220,273,274]. The next step is to supplement the volume action (3.67) by a suitable boundary term. The role of the latter is to convert second-order derivatives to first-order ones. More exactly, such a term must provide standard variational equations for the boundary data, the induced metric on the boundary. In particular, this means that the first variation of the total action must not contain boundary contributions with the normal derivative of the variation of the lapse $N$. Since only the curvature term contains second derivatives, the boundary term also must not depend on the potentials $U$ and $V$ in (2.9). Therefore, we may adopt for all models the expression for the SRG model. It can be obtained by spherical reduction of the standard extrinsic curvature term in four dimensions. This yields
\[ L^{b} = \int_{\partial M} N\, dt\, e^{-2\phi} K , \tag{5.2} \]
where we assume that $\partial M$ corresponds to a constant value of $r$. $N\, dt$ is the surface element on the boundary, and $e^{-2\phi}$ is produced by the spherical reduction. $K$ is the extrinsic curvature,
\[ K = -\frac{1}{N}\, \partial_n N = \mp \frac{1}{\Lambda}\, \partial_r (\log N) , \tag{5.3} \]
where $\partial_n$ denotes the derivative with respect to an outward pointing unit normal. The upper (lower) sign in (5.3) should be taken on the "right" ("left") component of the boundary. For linear variations around a static background in the diagonal gauge (3.33) the curvature follows from the simplified formula
\[ \sqrt{-g}\, R = 2\, \partial_r \Bigl( \frac{1}{\Lambda}\, \partial_r N \Bigr) . \tag{5.4} \]
Now the lapse $N$ is varied in the total action:
\[ \delta (L^{dil} + L^{b}) = \int_M (\delta N)\, [\text{e.o.m.}]\, d^2 x + \int_{\partial M} dt\, (\delta N)\, 2 e^{-2\phi}\, \partial_n \phi . \tag{5.5} \]

$^{46}$ In the absence of asymptotic flatness as in the KV-model [27,28] other possibilities than the ADM mass were discussed in Ref. [67].
The second integral generates the so-called quasilocal energy associated with the observer at the boundary:
\[ E_{ql} = 2 e^{-2\phi}\, \partial_n \phi = -\partial_n X . \tag{5.6} \]
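The second equality in (5.6) is simply the chain rule applied to the exponential parameterization $X = e^{-2\phi}$ of the dilaton field used in this subsection; as a one-line check:

```latex
\[
  \partial_n X = \partial_n e^{-2\phi} = -2 e^{-2\phi}\, \partial_n \phi
  \quad \Longrightarrow \quad
  2 e^{-2\phi}\, \partial_n \phi = -\partial_n X .
\]
```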
To obtain the Hamiltonian (or the ADM mass), (5.6) must be multiplied by $N$:
\[ H = E_{ADM} = -N\, \partial_n X . \tag{5.7} \]
Evidently, solutions with constant dilaton (see, e.g., [275]) lead to zero ADM mass. If one moves the boundary to the asymptotic region for a generic $D = 2$ dilaton theory, the right-hand side of (5.7) diverges. In order to arrive at a finite value of a "generalized" ADM mass a procedure like the Gibbons–Hawking subtraction [276,277] is needed. A rather natural idea [278,279] is to subtract from the total action an action functional calculated for some reference space–time with the same induced metric on the spatial boundary ($\phi$ or $X$ for SRG). Note that normal derivatives of the boundary data and the normal metric ($\Lambda$) may be different for the reference space–time. Effectively, this means subtracting from the quasilocal "physical" energy the one of some reference ("empty space") configuration, denoted by a subscript 0:
\[ E_{ql}^{reg} = -\partial_n X + [\partial_n X]_0 . \tag{5.8} \]
To obtain the ADM mass measured in a physical space–time, (5.8) should be multiplied by the lapse function $N$ corresponding also to the physical space–time:
\[ M_{ADM}^{reg} = N \bigl( -\partial_n X + [\partial_n X]_0 \bigr) . \tag{5.9} \]
Eq. (5.9) may be evaluated directly for dilaton theories admitting solutions which are flat everywhere, i.e. where U and V are related by (3.40). SRG is a special case of this class, where it is natural to take the reference frame to be the Minkowski space solution with $C_0 = 0$. Instead of studying the most general situation, we concentrate on the subclass of models (3.67). Identifying values of the dilaton field on the boundary for physical and reference space–times is equivalent to identifying the coordinate r of the boundary. This yields the total ADM mass
$$M_{ADM}^{reg} = \pm\lim_{r\to I}\sqrt{\xi(r)}\left(-\sqrt{\xi(r)} + \sqrt{\xi_0(r)}\right)\partial_r X , \tag{5.10}$$
where the upper sign (+) should be taken if the asymptotic region $I$ corresponds to $r\to\infty$, and the lower one (−) if $I$ corresponds to $r\to 0$. By substituting e.g. expressions (3.70) and (3.69) for $\xi(r)$ and $r(X)$ and taking into account the position of the asymptotic region, one obtains
$$M_{ADM}^{reg} = -C_0\sqrt{\frac{a}{B}} . \tag{5.11}$$
It is instructive to check whether the mass of the Schwarzschild black hole fits correctly into this procedure. For D = 4 the values $a = \frac12$ and $B = 2\lambda^2$ follow from (2.33) and (3.64). Multiplying (5.11) by the coefficient which has been omitted while passing from (2.2) to (2.9) indeed yields
$$M_{ADM}^{reg} = -\frac{C_0}{4G_N\lambda^3} . \tag{5.12}$$
This value exactly coincides with (3.38) where units with $G_N = 1$ have been used.
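The limit in (5.10) and the closed form (5.11) can be compared numerically. The following is a minimal sketch under stated assumptions: expressions (3.69) and (3.70) are not reproduced in this section, so the functions `dilaton` and `killing_norm` below use assumed forms of $X(r)$ and $\xi(r)$ for $0 < a < 1$, chosen to be compatible with (6.6) and (6.7) further below. The square roots in (5.10), the reference value $\xi_0 = 1$, and all function names are likewise assumptions of this sketch.

```python
import math

def dilaton(r, a, B):
    # Assumed X(r) for 0 < a < 1 (hypothetical stand-in for inverting (3.69)).
    return ((1.0 - a) * math.sqrt(B / a) * r) ** (1.0 / (1.0 - a))

def dilaton_prime(r, a, B):
    # dX/dr for the assumed X(r): sqrt(B/a) * X^a.
    return math.sqrt(B / a) * dilaton(r, a, B) ** a

def killing_deficit(r, a, B, C0):
    # 1 - xi(r) for the assumed Killing norm xi = 1 + (2 a C0 / B) X^(-a).
    return -(2.0 * a * C0 / B) * dilaton(r, a, B) ** (-a)

def killing_norm(r, a, B, C0):
    return 1.0 - killing_deficit(r, a, B, C0)

def adm_mass_at(r, a, B, C0):
    # Bracket of (5.10) at finite r, upper sign, reference xi_0 = 1.
    # sqrt(xi_0) - sqrt(xi) is rewritten as (1 - xi) / (1 + sqrt(xi))
    # to avoid cancellation at large r.
    root = math.sqrt(killing_norm(r, a, B, C0))
    difference = killing_deficit(r, a, B, C0) / (1.0 + root)
    return root * difference * dilaton_prime(r, a, B)

def adm_mass_closed_form(a, B, C0):
    # Right-hand side of (5.11).
    return -C0 * math.sqrt(a / B)

print(adm_mass_at(1.0e6, 0.5, 2.0, -1.0), adm_mass_closed_form(0.5, 2.0, -1.0))
```

For negative $C_0$ the finite-r value approaches the closed form from below as r grows, in line with the remark that positive-mass BHs correspond to negative $C_0$.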
For the CGHS model ($a = 1$ in (5.11)) the different coordinate system (3.73) has to be used. Substituting also (3.74) in (5.10) and taking the lower sign there, since the asymptotic region corresponds to the lower limit of r, the ADM mass becomes
$$M_{ADM}^{reg} = -C_0 B^{-1/2} , \tag{5.13}$$
which somewhat surprisingly coincides with the naive limit $a\to 1$ in (5.11). This value is also consistent with the calculations existing in the literature. For example, Witten’s result [44] is recovered if we take $B = 8/k$, replace $C_0$ by a constant shift of the dilaton $\phi$ and take into account the overall factor of $\frac12$ assumed in our action. It should be noted that in all examples considered above, positive mass BHs correspond to negative values of $C_0$ whereas positive values of the latter describe naked singularities.

For Minkowski ground-state theories (cf. (3.40)), of which SRG is a special case, that subtraction procedure appears very natural. For more complicated models, it might be preferable to subtract the total energy of a reference space from the total energy of the physical space [220]:
$$\tilde M_{ADM}^{reg} = -[N\,\partial_n X] + [N\,\partial_n X]_0 . \tag{5.14}$$
This formula, of course, cannot reproduce the correct mass, e.g. in the case of the Schwarzschild BH, but it may be useful in a different context. There are considerable variations in the details of such a subtraction being used by different authors and in different models. Sometimes, it appeared more appropriate to subtract an extremal BH solution instead of the Minkowski space [280]. As noted in [281] the subtraction procedures of [44,162,220,282,283], applied to the so-called exact string BH [45], lead to different results. Treating this model is especially tricky since the corresponding action is not known. It should always be kept in mind that a strong dependence on the asymptotic conditions and on the subtraction procedure has a clear physical origin: energy depends on the observer who measures it. This is true for both reference and physical space–times.

Definitions of the ADM mass for higher dimensional dilaton gravities have been considered in Refs. [284,285]. For asymptotically Rindler and de Sitter models, the ADM mass has been calculated in Ref. [220] (see also [286]). This concept can also be introduced in the presence of radiation [287] and of a shock wave of matter fields [288]. An extension of the Hamiltonian analysis to the case of charged BHs is possible too [289].

Obviously, according to this procedure, the ADM mass is not conformally invariant. This means that it will change, in general, if a conformal transformation, for example, removes the kinetic term for the dilaton. The reason was clearly stated in Refs. [284,290]. Even though it is possible to make the unregularized energy conformally invariant for a selected class of models, the subtraction term will inevitably destroy that invariance since an “empty” reference space is mapped into a non-trivial configuration. In other words, physical and reference observers are being transformed differently.
A final remark concerns approaches where instead of the ADM mass the conserved quantity $C^{(g)}$ has been related directly to a quasilocal energy expression. The notion of quasilocal energy has been investigated thoroughly in the context of General Relativity by Brown and York [277,291] and in the context of 2D dilaton gravity by Kummer and Lau [274]. As shown in Ref. [67] a relation to $C^{(g)}$ is possible following the arguments leading to Wald’s energy density [292,293]. Approaches which require no explicit subtraction have been suggested as well [67,162].
5.2. Conservation laws

Even in the most general dilaton theory including matter interactions, when no exact solution is known, the conservation law which may be derived from (4.46) contains important information. Again only potentials with quadratic torsion (2.31) are considered, because there the integrating factor $\exp Q$ can be determined easily. Attaching that factor as in (3.14), (3.15) yields
$$dC^{(g)} + W^{(m)} = 0 , \tag{5.15}$$
where $C^{(g)}$ is the quantity defined in (3.14) for the geometric variable and (cf. (3.8), (3.6))
$$W^{(m)} = e^{Q}\left(X^+ W^- + X^- W^+\right) . \tag{5.16}$$
Clearly, $W^{(m)}$ from (5.15) must obey the integrability condition $dW^{(m)} = 0$, a relation which in turn must be expressible in terms of the e.o.m.’s. With $W^{(m)} = dC^{(m)}$, Eq. (5.15) simply becomes$^{47}$
$$dC^{(tot)} = d\left(C^{(g)} + C^{(m)}\right) = 0 \tag{5.17}$$
and $C^{(tot)} = C_0 = \mathrm{const.}$ is an absolutely (i.e. in both coordinates) conserved quantity [67]. Obviously, in the presence of further gauge fields, the present argument can be generalized easily following the steps of Section 4.1. Using the integrability condition for the components $W_\mu$ in $W^{(m)} = W_\mu\,dx^\mu$, Eq. (5.15) can be integrated in two equivalent ways ($x^0 = t$, $x^1 = r$, $C_0$ is an integration constant):
$$\begin{aligned}
C^{(g)}(t,r;t_0,r_0) &= -\int_{t_0}^{t} dt'\,W_0(t',r) - \int_{r_0}^{r} dr'\,W_1(t_0,r') + C_0 \\
&= -\int_{r_0}^{r} dr'\,W_1(t,r') - \int_{t_0}^{t} dt'\,W_0(t',r_0) + C_0
\end{aligned} \tag{5.18}$$
for any first-order gravity action (2.17) or its equivalent dilaton form (2.9). SRG may serve as a concrete example [41]. In the “diagonal” gauge widely used in spherical BH simulations [264,269,270] with $g_{00} = \alpha^2(t,r)$, $g_{11} = -a^2 = -(1-2m/r)^{-1}$, $g_{01} = g_{10} = 0$, the zweibein must be fixed as $e_0^+ = e_0^- = \alpha/\sqrt2$, $e_1^+ = -e_1^- = a/\sqrt2$ and the dilaton field by $X = \lambda^2 r^2/4$. Then the quantity $m(t,r)$ (sometimes called mass aspect function) is proportional to $C^{(g)}$ with the proportionality factor for SRG given by (3.38). As shown in [41] e.g. the first version of (5.18) turns into
$$m(t,r;t_0,r_0) = \int_{t_0}^{t} dt'\,\frac{4\pi r^2}{a^2}(\partial_0\varphi)(\partial_1\varphi) + \int_{r_0}^{r} dr'\,2\pi r'^2\left[\frac{(\partial_0\varphi)^2}{\alpha^2} + \frac{(\partial_1\varphi)^2}{a^2}\right]_{t=t_0} + m_0 . \tag{5.19}$$
Defining the ADM-mass as $m_{ADM} = m(t_0,\infty;t_0,r_0)$ in the limit $r\to\infty$ in (5.19) and using asymptotically free ingoing and outgoing spherical scalar waves $\varphi\sim[f_+(t-r) + f_-(t+r)]/(r\sqrt{4\pi})$ yields the effective time-dependent mass of the (eventual) BH
$$m_{BH}^{(eff)}(t) = m(t,\infty;t_0,0) = m_{ADM} + \int_{t_0}^{t} dt'\left[(f_-')^2 - (f_+')^2\right] . \tag{5.20}$$

$^{47}$ An early version of a conservation law of this type [162] was not general enough to cover interacting scalars and fermions as introduced in the present chapter.
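The time dependence in (5.20) can be illustrated with a short numerical sketch. It assumes that the integrand is the difference of the squared derivatives of the ingoing and outgoing profiles, $(f_-')^2 - (f_+')^2$, evaluated as functions of time at infinity. The Gaussian pulse, the amplitude 0.3, the value $m_{ADM} = 1$ and all function names are purely illustrative choices, not taken from the text.

```python
import math

def effective_mass(t, t0, m_adm, df_minus, df_plus, steps=20000):
    # Trapezoidal estimate of m_ADM + int_{t0}^{t} [(f_-')^2 - (f_+')^2] dt'.
    h = (t - t0) / steps

    def integrand(s):
        return df_minus(s) ** 2 - df_plus(s) ** 2

    total = 0.5 * (integrand(t0) + integrand(t))
    for i in range(1, steps):
        total += integrand(t0 + i * h)
    return m_adm + h * total

def gaussian_pulse(s, amplitude=0.3):
    # Hypothetical derivative profile f'(s) of a localized wave packet.
    return amplitude * math.exp(-s * s)

def no_flux(s):
    return 0.0

# Purely ingoing pulse: the effective mass grows.
m_after_infall = effective_mass(10.0, -10.0, 1.0, gaussian_pulse, no_flux)
# Purely outgoing pulse of the same shape: the effective mass decreases.
m_after_outflow = effective_mass(10.0, -10.0, 1.0, no_flux, gaussian_pulse)
print(m_after_infall, m_after_outflow)
```

For this pulse the integral can be done in closed form, $0.3^2\sqrt{\pi/2}$, which provides a simple check of the quadrature.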
The matter contribution has the intuitive interpretation as the total incoming, resp. outgoing, flux to infinity of matter at a certain time t, starting from $t = t_0$. In fact, when such fluxes exist, it is necessary to use $m_{BH}^{(eff)}$ as a measure of BH formation rather than $m_{ADM}$ alone [269].$^{48}$ It is remarkable that, in contrast to the situation in D = 4, in D = 2 something like a standard energy conservation law can be formulated in this manner. Clearly, the importance of the conservation laws (5.17) and (5.18) is not restricted to SRG where the ADM-mass can be defined from an asymptotically flat region. For generic 2D dilaton theories (2.9) or (2.17) there is a close relation of $C^{(tot)}$ to the concept of “quasilocal energy” [67,220,274,277] which has been dealt with also in the previous subsection.

$^{48}$ As pointed out in Ref. [41] the first-order formulation also seems to be much more convenient in gauges of the Sachs–Bondi type, where no coordinate singularity is created at the horizon. Then the introduction of the extrinsic curvature as an additional variable [294] can be avoided altogether.

5.3. Symmetries

The final topic of this subsection is the question of symmetries, to be attached to (5.17). It should be emphasized that these symmetries are quite different from the (gauge-like) ones incorporated automatically in the PSM approach, because the latter is valid for the geometric part of the action alone. When matter is absent the Noether symmetry of $C^{(tot)} = C^{(g)}$ is realized by a translation in the Killing-direction [221]. In the presence of matter, the integrability condition $dW^{(m)} = 0$ for (5.15) can be interpreted as a conservation law of another one-form current $W^{(m)}$ which is related to a symmetry transformation with another type of parameters. Both ingredients are necessary for the peculiar “two-stage” Noether symmetry which has been encountered here [295]. It seems to be yet another special feature of a generic D = 2 theory. In order to simplify the discussion of this unusual symmetry, the mechanism is explained in the frame of a toy-model [221] which, nevertheless, contains all essential features:
$$\hat L = \int_{M_2}\left(X\,dw + K w\,d\varphi\right) . \tag{5.21}$$
The first (“geometric”) term in (5.21) can be considered as a simplification of the Lagrangian (2.17) whereas the second (“matter”) term resembles the fermion interaction, as written in (4.42) with (4.43), and a current expressed in terms of amplitude (one-form $w$), phase (zero-form $\varphi$), and Lagrange multipliers (zero-forms $X$, $K$). In the e.o.m.’s to be derived from (5.21)
$$dX + K\,d\varphi = 0 , \tag{5.22}$$
$$dw = 0 , \tag{5.23}$$
$$w\,d\varphi = 0 , \tag{5.24}$$
$$d(Kw) = 0 , \tag{5.25}$$
Eq. (5.22) represents the analog of the conservation law in the form (4.46) with $W^{(m)} = K\,d\varphi$. The integrability condition $dW^{(m)} = 0$ becomes $dK\,d\varphi = 0$. This implies $K = K(\varphi)$ so that $K\,d\varphi = d\left(\int_{y_0}^{\varphi}K(y)\,dy\right)$, and
$$dC = d\left(X + \int_{y_0}^{\varphi}K(y)\,dy\right) = 0 \tag{5.26}$$
is the counterpart of (5.17). Thus $C = C_0 = \mathrm{const.}$ characterizes the solutions of this theory. From (5.23) and (5.24) similarly $w = w(\varphi)$ can be concluded, so that (5.25) is fulfilled identically. In the “matterless” case ($K = 0$) the “geometric” symmetry transformations are constant translations $\delta w = \mathrm{const.}$ The integrability condition $d(K\,d\varphi) = 0$ allows an expansion in terms of e.o.m.’s (5.23)–(5.25) which correspond, respectively, to the variations $\delta\hat L/\delta X$, $\delta\hat L/\delta K$, $\delta\hat L/\delta\varphi$:
$$d(K\,d\varphi) = -K\,\frac{\delta\hat L}{\delta X}\frac{\partial_0\varphi}{w_0} + \frac{\delta\hat L}{\delta K}\frac{\partial_0 K}{w_0} + \frac{\delta\hat L}{\delta\varphi}\frac{\partial_0\varphi}{w_0} . \tag{5.27}$$
The apparent dependence on the specific coordinate $x^0$ is spurious (e.g. $\partial_0\varphi/w_0 = \partial_1\varphi/w_1$ from (5.24)). Thus (5.27) permits the introduction of a “matter” symmetry with global parameter $\delta\rho$
$$\delta\varphi = \frac{\partial_0\varphi}{w_0}\,\delta\rho , \qquad \delta X = -K\,\frac{\partial_0\varphi}{w_0}\,\delta\rho , \qquad \delta K = \frac{\partial_0 K}{w_0}\,\delta\rho \tag{5.28}$$
in $\hat L = \int\hat{\mathcal L}$, or an equivalent one with $\partial_0\to\partial_1$, $w_0\to w_1$. It can be checked that the Lagrangian $\hat{\mathcal L}$ of (5.21) indeed transforms as a total divergence:
$$\delta\hat{\mathcal L} = d\left[\left(K\,d\varphi - K\,\frac{\partial_0\varphi}{w_0}\,w\right)\delta\rho\right] . \tag{5.29}$$
The related conserved Noether one-form current becomes $J = K\,d\varphi$, or $*J^\mu = \epsilon^{\mu\nu}K\,\partial_\nu\varphi$ in components for the Hodge dual of $J$. Hence the conservation law for the complete expression (5.26) is related to a simultaneous transformation of the action $\hat L$ with respect to both symmetry parameters, that of the geometric translation and $\delta\rho$, the second of which belongs to a different (one-form) current $W^{(m)} = K\,d\varphi$. It is straightforward to apply the procedure, as outlined in this simple example, to a general theory with matter interactions in D = 2. The resulting formulas are quite lengthy (cf. [295]) and, therefore, will not be reproduced here.

6. Hawking radiation

One of the main motivations for studying low-dimensional gravity theories is the hope to get insight into the dynamics of a BH, its quantum radiation and eventual evaporation [18]. Therefore, it is important to make sure that especially the effect of Hawking radiation still exists in two-dimensional theories and to study its basic properties like the temperature–mass relation.

It should be kept in mind, though, that this effect, discovered more than a quarter of a century ago, is a fixed background phenomenon. No quantum gravity is involved; only the matter field action is taken into account in the one-loop approximation. The vacuum polarization is described by the energy–momentum tensor, induced by this quantum effect,
$$T_{\mu\nu} = \frac{2}{\sqrt{-g}}\,\frac{\delta W}{\delta g^{\mu\nu}} , \tag{6.1}$$
where $W$ is the one-loop effective action for the matter fields on a classical background manifold with metric $g_{\mu\nu}$. For minimal coupling of scalars in 2D, $W$ in (6.1) is the famous Polyakov action [252]. In a suitable coordinate system, the Hawking flux is given by the light-cone component $T_{--}$ calculated in the asymptotic region. To this end various methods have been developed [296]. Most of them can be applied in 2D. Variation of $W$ as in (6.1) allows the direct determination of $T_{\mu\nu}$. Alternatively, the thermal particle distribution may be reproduced by comparing different vacuum states from the Bogoliubov coefficients [297,298]. In this review we follow the approach of Christensen and Fulling [299] based upon the conformal anomaly. Like the comparison of thermal distributions it should not be sensitive to the dimensionality of space–time. Here the computation for minimally coupled scalars is very simple, and a closed expression for the energy–momentum tensor may be given for any dilaton gravity model. For non-minimal coupling, the situation is much more complicated. Several problems still remain unsolved, although the result for the flux from D = 4 can be reproduced correctly. A detailed and elementary discussion of the non-minimal case can be found in Ref. [300], where it was shown that the use of the fully integrated effective action could be avoided altogether.

6.1. Minimally coupled scalars

The simplest example is a minimally coupled scalar field with action (4.34) and $F\,\mathcal{O}_{D-2}/\lambda^{D-2} = 1/2$:
$$L^{\min}_{(\varphi)} = \frac12\int d^2x\,\sqrt{-g}\,g^{\mu\nu}(\partial_\mu\varphi)(\partial_\nu\varphi) . \tag{6.2}$$
If $\varphi$ is taken to be an on-shell classical field the energy–momentum tensor satisfies the usual conservation equation
$$\nabla^\mu T_{\mu\nu} = 0 . \tag{6.3}$$
The same relation holds for one-loop quantum corrections with a trivial background field $\varphi = 0$, where in (6.3) the effective action $W$ enters through (6.1). Eq. (6.3) is most conveniently analyzed in the conformal gauge (3.1). We change variables $dz = dr/\xi(r)$ in the generalized Schwarzschild gauge (3.34) to obtain (3.33) in the form
$$(ds)^2 = \xi(r)\left((dt)^2 - (dz)^2\right) . \tag{6.4}$$
In light-cone coordinates $x^\pm = (t\pm z)/\sqrt2$ the line element (6.4) will be expressed as
$$(ds)^2 = 2e^{2\rho}\,dx^+dx^- , \qquad \rho = \tfrac12\log(\xi) . \tag{6.5}$$
For the asymptotically Minkowski models $0 < a < 1$ considered in (3.70) it is convenient to write $\xi$ as
$$\xi(r) = 1 - \left(\frac{r_h}{r}\right)^{a/(1-a)} , \tag{6.6}$$
where
$$r_h = \frac{1}{|1-a|}\,(-2C_0)^{(1-a)/a}\left(\frac{B}{a}\right)^{(a-2)/(2a)} \tag{6.7}$$
is the value of r at the horizon. The explicit form (6.7) will not be needed until the very end of this calculation. There are only two non-zero components of the Levi-Civita connection: $\Gamma_{++}{}^{+} = 2\partial_+\rho$ and $\Gamma_{--}{}^{-} = 2\partial_-\rho$. The minus component of $\nu$ in (6.3) yields
$$\partial_+T_{--} + \partial_-T_{+-} - 2(\partial_-\rho)\,T_{+-} = 0 . \tag{6.8}$$
On static backgrounds, which depend on the variable r alone, the relations
$$\partial_+ = -\partial_- = \frac{1}{\sqrt2}\,\partial_z = \frac{1}{\sqrt2}\,\xi(r)\,\partial_r \tag{6.9}$$
between partial derivatives hold. Therefore, (6.8) becomes a simple first-order ordinary differential equation
$$\left(\partial_z - 2(\partial_z\rho)\right)T_{+-} = \partial_zT_{--} . \tag{6.10}$$
The flux component $T_{--}$ can be found easily from the trace [299]
$$T^\mu_\mu = 2e^{-2\rho}\,T_{+-} . \tag{6.11}$$
As the classical trace of $T_{\mu\nu}$ for a massless field is zero in D = 2, the whole contribution to $T^\mu_\mu$ arises from the conformal (or Weyl) anomaly (cf. [301] for a historical review). With minimally coupled scalars in 2D the calculations are especially simple, but they serve to illustrate several important points. As a first step, in the action (6.2) an integration by parts is performed and it is continued to the Euclidean domain,
$$L_E = \frac12\int d^2x\,\sqrt{g}\,\varphi A\varphi , \tag{6.12}$$
where $A = -\Delta = -g^{\mu\nu}\nabla_\mu\nabla_\nu$ is the Laplace operator on the curved background. The path integral measure is defined by the relation
$$1 = \int(\mathcal{D}\varphi)\,\exp\left(-\int d^2x\,\sqrt{g}\,\varphi^2\right) , \tag{6.13}$$
so that the procedure maintains diffeomorphism invariance and thus preserves the conservation equation (6.3). It is also possible to trade part of the diffeomorphism invariance for Weyl invariance [84,154,302–304], but this option will not be considered here. The partition function for the field $\varphi$ reads
$$Z = \int(\mathcal{D}\varphi)\,\exp\left(-\int d^2x\,\sqrt{g}\,\varphi A\varphi\right) = (\det A)^{-1/2} , \tag{6.14}$$
where the determinant is divergent. The zeta function regularization [305,306]
$$W = -\ln Z = -\tfrac12\,\zeta_A'(0) , \qquad \zeta_A(s) = \mathrm{Tr}(A^{-s}) , \tag{6.15}$$
is very convenient in the present context. Prime denotes differentiation with respect to s. Strictly speaking, to keep the argument of the zeta function in (6.15) dimensionless, one has to multiply it by $\mu^{2s}$ where $\mu$ is a parameter with mass dimension one. Then the effective action $W$ will be shifted by $-\tfrac12\zeta_A(0)\ln\mu^2$ which represents the usual renormalization ambiguity. This term, however, does not contribute to the anomaly.
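A toy spectrum may help to see how the regularized expression and the renormalization ambiguity come about. The example below is only an illustration and not an operator from the text: it assumes eigenvalues $\lambda_n = n$ ($n = 1, 2, \dots$), so that the zeta function $\zeta_A(s) = \mathrm{Tr}(A^{-s})$ reduces to the Riemann zeta function $\zeta_R(s)$, whose values at the origin are standard.

```latex
\begin{align}
\zeta_A(s) &= \sum_{n=1}^{\infty} n^{-s} = \zeta_R(s), \qquad
\zeta_R(0) = -\tfrac12, \qquad \zeta_R'(0) = -\tfrac12\ln(2\pi), \\
W &= -\tfrac12\,\zeta_A'(0) = \tfrac14\ln(2\pi), \qquad
\det A = e^{-\zeta_A'(0)} = \sqrt{2\pi}, \\
\zeta_A(s) \to \mu^{2s}\zeta_A(s) &: \quad
W \to W - \tfrac12\,\zeta_A(0)\ln\mu^2 = W + \tfrac14\ln\mu^2 .
\end{align}
```

The divergent product of eigenvalues is thus assigned the finite value $\sqrt{2\pi}$, and the scale $\mu$ enters only through the additive term proportional to $\zeta_A(0)$.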
The following analysis will be valid for an arbitrary conformally covariant operator which means that under an infinitesimal conformal transformation $\delta g_{\mu\nu} = 2g_{\mu\nu}\,\delta\rho(x)$ of the metric (6.5) the operator $A$ changes as
$$\delta A = -2(\delta\rho(x))A . \tag{6.16}$$
Because of this property, the variation of the zeta function is simply
$$\delta\zeta_A(s) = -s\,\mathrm{Tr}\left((\delta A)A^{-1-s}\right) = 2s\,\mathrm{Tr}\left((\delta\rho)A^{-s}\right) , \tag{6.17}$$
i.e. the operator $A^{-s}$ is restored with its original power. The corresponding change of the effective action is expressed in terms of a generalized (“smeared”) zeta function:
$$\delta W = -\zeta(0|\delta\rho;A) , \qquad \zeta(s|\delta\rho;A) := \mathrm{Tr}\left((\delta\rho)A^{-s}\right) . \tag{6.18}$$
At vanishing argument $s\to0$, Eq. (6.18) can be evaluated easily by heat kernel methods. For the operator $A = -\Delta$ the result is$^{49}$
$$\zeta(0|\delta\rho;-\Delta) = \frac{1}{24\pi}\int d^2x\,\sqrt{g}\,R\,\delta\rho . \tag{6.19}$$
On the other hand, the definition of the energy–momentum tensor (6.1) yields
$$\delta W = \frac12\int d^2x\,\sqrt{g}\,T_{\mu\nu}\,\delta g^{\mu\nu} = -\int d^2x\,\sqrt{g}\,T^\mu_\mu\,\delta\rho . \tag{6.20}$$
Comparing (6.18) with (6.20) and (6.19) the well-known expression for the trace anomaly
$$T^\mu_\mu = \frac{1}{24\pi}\,R \tag{6.21}$$
follows, which remains unchanged after continuation back to Minkowski signature. In conformal gauge the Ricci-scalar becomes$^{50}$
$$R = 2e^{-2\rho}\,\partial_z^2\rho . \tag{6.22}$$
In light-cone coordinates, (6.11) yields
$$T_{+-} = \frac{1}{24\pi}\,\partial_z^2\rho . \tag{6.23}$$
With this input, the conservation equation (6.10) is solved easily,
$$T_{--} = \frac{1}{24\pi}\left[\partial_z^2\rho - (\partial_z\rho)^2\right] + t_- , \tag{6.24}$$
where $t_-$ is the integration constant. Different choices of $t_-$ correspond to different “quantum vacua” [19,307–309]. There is nothing specific for 2D models in this respect.

$^{49}$ See Appendix B for details.

$^{50}$ This expression may be obtained most easily from the identity (1.56) with $\hat g$ from the line element (6.5), $\hat R = 0$, $\partial_r = \partial_z$.

We assume that the Killing-horizon is non-degenerate, i.e. $\xi(r)$ has a simple zero at $r = r_h$ as for $\xi(r)$ in (6.6). To ensure regularity of the energy–momentum tensor at the horizon in global (Kruskal) coordinates one has to require that $T_{--}$ exhibits a second-order zero at $r = r_h$. There is only one integration constant $t_-$ available. Therefore, fixing it by the requirement
$$T_{--}|_h = 0 , \tag{6.25}$$
it must be checked later on whether (6.25) indeed produces a second-order zero. In terms of the function $\xi$ the energy–momentum tensor (6.24) can be expressed as (cf. (6.5) and (6.9))
$$T_{--} = \frac{1}{96\pi}\left[2\xi\xi'' - (\xi')^2\right] + t_- . \tag{6.26}$$
With $t_-$ determined from (6.25) it is an easy exercise to show that for the asymptotically Minkowski models (6.6) the Hawking flux in the asymptotic region becomes
$$T_{--}|_{as} = \frac{a^2}{96\pi(a-1)^2r_h^2} . \tag{6.27}$$
This flux defines the Hawking temperature $T_H$ of the BH. In 2D the Stefan–Boltzmann law contains $T_H^2$:
$$T_{--}|_{as} = \frac{\pi}{6}\,T_H^2 . \tag{6.28}$$
Comparing (6.27) and (6.28) the value of $T_H$ agrees with the one derived from surface gravity, $T_H = \frac{1}{4\pi}\,\xi'|_{r_h}$. These equations together with (6.7) and (5.11) fix the dependence of the Hawking temperature on the ADM mass for this class of models:
$$T_H \propto (M_{ADM})^{(a-1)/a} . \tag{6.29}$$
The well-known inverse mass law for the Schwarzschild BH ($a = \frac12$) is reproduced. Eq. (6.29) reveals an intriguing property [220] of the class of 2D models discussed in Section 3.3: depending on the parameter a the Hawking temperature may be proportional to a negative, but also a positive power of the BH mass. It is easy to check that near the horizon indeed
$$T_{--}|_{r\to r_h} \sim (r-r_h)^2 \tag{6.30}$$
for all values of a, i.e. the requirement of a continuous flux in Kruskal coordinates is fulfilled.

Again, the CGHS model must be considered separately. By substituting (3.74) in (6.26) the Hawking flux
$$T_{--}|_{as} = \frac{B}{96\pi} \tag{6.31}$$
is obtained, consistent with the earlier calculation [52]. It is important to note that in the CGHS model, Hawking radiation does not depend on the ADM mass.

Hawking radiation can be studied as well for asymptotically Rindler and de Sitter models. Explicit expressions can be found in Ref. [220]. $T_{--}$ for “exotic” configurations with constant dilaton has been calculated in Refs. [275,286]. It is very sensitive with respect to asymptotic conditions on the metric. Physically, this means that one has to fix length and time scales used in its measurement. Clearly, different scales yield different results, as may be seen by comparing Refs. [219] and [220] where asymptotically Rindler spaces were studied. By choosing an accelerated reference system Hawking radiation may be converted into Unruh radiation [310].
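The chain (6.25)–(6.30) lends itself to a quick numerical check. The following is a minimal sketch, assuming the Killing norm of (6.6) (written `killing_norm` here) and reading the primes in (6.26) as r-derivatives. The function names and the sample values a = 1/2, r_h = 2 are illustrative choices and not part of the original text.

```python
import math

def exponent(a):
    # Power k = a / (1 - a) appearing in (6.6); requires 0 < a < 1.
    return a / (1.0 - a)

def killing_norm(r, a, r_h):
    return 1.0 - (r_h / r) ** exponent(a)

def killing_norm_d1(r, a, r_h):
    k = exponent(a)
    return k * (r_h / r) ** k / r

def killing_norm_d2(r, a, r_h):
    k = exponent(a)
    return -k * (k + 1.0) * (r_h / r) ** k / r ** 2

def t_minus(a, r_h):
    # (6.25): the Killing norm vanishes at r_h, so only the squared
    # first derivative survives there.
    return killing_norm_d1(r_h, a, r_h) ** 2 / (96.0 * math.pi)

def flux(r, a, r_h):
    # (6.26) with the integration constant fixed by (6.25).
    xi = killing_norm(r, a, r_h)
    d1 = killing_norm_d1(r, a, r_h)
    d2 = killing_norm_d2(r, a, r_h)
    return (2.0 * xi * d2 - d1 ** 2) / (96.0 * math.pi) + t_minus(a, r_h)

def asymptotic_flux(a, r_h):
    # Right-hand side of (6.27).
    return a ** 2 / (96.0 * math.pi * (a - 1.0) ** 2 * r_h ** 2)

def hawking_temperature(a, r_h):
    # Surface gravity expression quoted below (6.28).
    return killing_norm_d1(r_h, a, r_h) / (4.0 * math.pi)

a, r_h = 0.5, 2.0
far_flux = flux(1.0e8, a, r_h)
stefan_boltzmann = math.pi / 6.0 * hawking_temperature(a, r_h) ** 2
eps = 1.0e-3
# For a second-order zero, doubling the distance to the horizon
# should multiply the flux by roughly four.
ratio = flux(r_h + 2.0 * eps, a, r_h) / flux(r_h + eps, a, r_h)
print(far_flux, asymptotic_flux(a, r_h), stefan_boltzmann, ratio)
```

The same functions can be evaluated for other values of a in the range 0 < a < 1 to see that the quadratic vanishing at the horizon does not depend on the model parameter.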
It should be stressed that Hawking radiation behaves quite differently in conformally related models as witnessed by the results of Ref. [311] vs. Ref. [220]. Conformal transformations change, in general, also the asymptotic behavior of the metric and that of the path integral measure. Indeed the very existence of the conformal anomaly means conformal non-invariance of the theory.

The case of minimally coupled spinor fields interacting (again minimally) with an Abelian gauge field can also be analyzed along the same lines. One has to add a contribution of the chiral anomaly to the Polyakov action [312,313].$^{51}$ Another generalization [315] consists in considering the Casimir force due to a minimally coupled scalar field between two surfaces on a CGHS background.

$^{51}$ The case of neutral matter on the background of a charged BH is even simpler. One has to modify only the metric in the Polyakov action [314]. Expression (6.26) still holds in terms of a different $\xi$.

6.2. Non-minimally coupled scalars

The scalar field action (4.34), (4.35) contains a non-minimal coupling to the dilaton from spherical reduction. On dimensional and symmetry grounds for a GDT in the path integral measure also a general function $S$ of the dilaton $\phi$ may be introduced,
$$1 = \int(\mathcal{D}\varphi)\,\exp\left(-\int d^2x\,\sqrt{g}\,e^{-2S}\varphi^2\right) , \tag{6.32}$$
instead of the standard mode normalization condition following from D dimensional spherical reduction ($S = \phi$), using the exponential parameterization of the dilaton (2.11). Then the rescaled field $\tilde\varphi = e^{-S}\varphi$ still possesses the standard dilaton-independent path integral measure in 2D (6.13). In terms of this new field the action (4.34) reads
$$L^{(nm)} = \frac12\int d^2x\,\sqrt{g}\,\tilde\varphi A^{(nm)}\tilde\varphi , \tag{6.33}$$
$$A^{(nm)} = -e^{2(S-\phi)}g^{\mu\nu}\left(\nabla_\mu\nabla_\nu + 2(S_{,\mu}-\phi_{,\mu})\partial_\nu + S_{,\mu\nu} + S_{,\mu}S_{,\nu} - 2S_{,\mu}\phi_{,\nu}\right) , \tag{6.34}$$
where an integration by parts has been performed and an irrelevant overall factor in the action has been dropped ($S_{,\mu} = \nabla_\mu S$).

The first calculation of the conformal anomaly for non-minimally coupled scalar fields with the spherically reduced path integral measure ($S = \phi$) has been presented by Mukhanov, Wipf and Zelnikov [316] who were also the first to address the problem of Hawking radiation for spherically reduced matter. Their result was confirmed later [317,318] and extended to arbitrary measure [319].$^{52}$ As in the minimally coupled case the zeta function regularization (6.15) may be employed. The operator $A^{(nm)}$ being conformally covariant, Eqs. (6.16)–(6.18) and (6.20) (after the replacement $A\to A^{(nm)}$) are still valid. The conformal anomaly is derived from $\zeta(0|\delta\rho;A^{(nm)})$. Again the general formulae from Appendix B can be used, because $A^{(nm)}$ may be expressed in the standard form (B.1)
$$A^{(nm)} = -\left(\hat g^{\mu\nu}\hat\nabla_\mu\hat\nabla_\nu + \hat E\right) , \tag{6.35}$$
with the “effective” metric $\hat g^{\mu\nu} = e^{2(S-\phi)}g^{\mu\nu}$ and the covariant derivative
$$\hat\nabla_\mu = \partial_\mu + \hat\Gamma_\mu + \hat\omega_\mu , \qquad \hat\omega_\mu = S_{,\mu} - \phi_{,\mu} , \tag{6.36}$$

$^{52}$ The literature on this subject is quite large (cf. e.g. [320–325]).
where $\hat\Gamma$ is the Christoffel connection for the metric $\hat g$. Here the potential $\hat E$ reads
$$\hat E = \hat g^{\mu\nu}\left(-\phi_{,\mu}\phi_{,\nu} + \phi_{,\mu\nu}\right) . \tag{6.37}$$
According to Appendix B (Eq. (B.5) with (B.9) and (B.13)), after returning to Minkowski space, one obtains for the smeared $\zeta$-function (6.18)
$$\zeta(0|\delta\rho;A^{(nm)}) = \frac{1}{24\pi}\int d^2x\,\sqrt{-\hat g}\,(\hat R + 6\hat E)\,\delta\rho , \tag{6.38}$$
where $\hat R$ is the scalar curvature determined from $\hat g$, so that (6.37) and (6.38) yield the trace anomaly [319]
$$T^\mu_\mu = \frac{1}{24\pi}\left(R - 6(\nabla\phi)^2 + 4\nabla^2\phi + 2\nabla^2S\right) . \tag{6.39}$$
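The step from (6.37) and (6.38) to (6.39) can be made explicit. The sketch below writes $\phi$ for the dilaton and assumes the standard two-dimensional Weyl rescaling identity for the scalar curvature. The abbreviation $\sigma = \phi - S$, with $\hat g_{\mu\nu} = e^{2\sigma}g_{\mu\nu}$, is introduced here only for convenience and does not appear in the original text.

```latex
\begin{align}
\sqrt{-\hat g} &= e^{2\sigma}\sqrt{-g}, \qquad
\hat R = e^{-2\sigma}\left(R - 2\nabla^2\sigma\right)
\;\Rightarrow\;
\sqrt{-\hat g}\,\hat R = \sqrt{-g}\left(R - 2\nabla^2\phi + 2\nabla^2 S\right), \\
\sqrt{-\hat g}\,\hat E &= \sqrt{-g}\,g^{\mu\nu}\left(\phi_{,\mu\nu} - \phi_{,\mu}\phi_{,\nu}\right)
= \sqrt{-g}\left(\nabla^2\phi - (\nabla\phi)^2\right), \\
\sqrt{-\hat g}\,\bigl(\hat R + 6\hat E\bigr) &= \sqrt{-g}\left(R - 6(\nabla\phi)^2 + 4\nabla^2\phi + 2\nabla^2 S\right).
\end{align}
```

Inserting the last line into (6.38) and comparing with (6.20) gives the combination quoted in (6.39).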
For the spherically reduced measure $S = \phi$, this expression agrees with Refs. [316–318]. Different expressions for the conformal anomaly with various choices of the measure were reported too [320–322,326]. It is often important to keep track of total derivatives (or zero modes in the compact case) in computations of $T^\mu_\mu$. A careful analysis of this type has been performed by Dowker [327] (cf. also [300,328]) who confirmed the result (6.39) for SRG.

When scalars are coupled non-minimally to a dilaton field, the conservation law for the one-loop energy–momentum tensor has to be modified,
$$\nabla^\mu T_{\mu\nu} = -(\partial_\nu\phi)\,\frac{1}{\sqrt{-g}}\,\frac{\delta W}{\delta\phi} , \tag{6.40}$$
as can be seen by applying the usual assumption of diffeomorphism invariance to a $\phi$-dependent matter action. In the absence of classical matter fluxes, the matter action can be replaced again by the one-loop effective action of non-minimally coupled scalars [329].

In contrast to conformal transformations, a shift of the dilaton field $\phi\to\phi+\delta\phi$ does not act on the operator $A^{(nm)}$ in a covariant way.$^{53}$ Therefore, the variation of $(A^{(nm)})^{-s}$ in the zeta function does not yield a power of $A^{(nm)}$ anymore. As a consequence the variation of the effective action cannot be expressed in terms of known heat kernel coefficients. A way to overcome this difficulty has been suggested in [330]. By keeping the same classical action, but by changing the hermiticity requirements of relevant operators, $A^{(nm)}$ has been transformed into a product of two operators of Dirac type each of which transforms homogeneously under the shifts of the dilaton field. This allowed us to calculate the energy–momentum tensor and the effective action in a closed analytical form. Although this procedure changes the original spectral problem, the results exhibit several attractive features which have to be present in spherically reduced theories. For example, the Hawking temperature coincides with its geometrical expression through surface gravity. Thus, in hindsight one may conjecture [300] that this procedure somehow takes into account the “dimensional reduction anomaly” (see below). Another model where the energy–momentum tensor can be calculated exactly has been proposed recently [331].

$^{53}$ From now on, we assume that the function $S$ in the measure is fixed. The most relevant choice (SRG) is $S = \phi$.
In this connection it should be remarked that in 2D a generic differential operator can be represented in “dilaton” form$^{54}$
$$AZ = -\left(e^{S}\nabla^\mu e^{-\phi}\right)\left(e^{-\phi}\nabla_\mu e^{S}\right) := L^\mu L^\dagger_\mu . \tag{6.41}$$
This parameterization proved very convenient in resumming the perturbative expansion of the effective action [333]. It also allows one to prove some symmetry relations between functional determinants even in higher dimensions or if $S$ and $\phi$ are matrix-valued fields [334]. Roughly speaking, these symmetry relations allow one to interchange $L$ and $L^\dagger$ inside the determinant, which is a rather non-trivial operation because of the summation over $\mu$ in (6.41).

In order to study the quantum back reaction upon the classical BH, solutions of the field equations obtained from an action containing both classical and one-loop parts are needed. It is not possible to solve such equations in general, even if the quantum effects are represented by the simplest Polyakov action. However, for particular dilaton theories exact solutions can be obtained [82–88,335], although in some of these papers the “quantum” part was introduced by hand rather than derived.

An effective action for non-minimally coupled fields was presented in [330]. Admittedly, its derivation lacked complete rigor. Many authors [336–347] employ the “conformal” action
$$W^{(conf)} = \frac{1}{96\pi}\int d^2x\,\sqrt{-g}\left[R\,\frac{1}{\Box}\,R - 12(\nabla\phi)^2\,\frac{1}{\Box}\,R + 12R\phi\right] \tag{6.42}$$
which correctly reproduces the conformal anomaly (6.39) for $S = \phi$ but neglects (an undetermined) conformally invariant part.$^{55}$ The first term under the integral in (6.42) yields the Polyakov action. It is interesting to note that although (6.42) differs from the full effective action obtained in [330], many physical predictions are identical.

Even if it is assumed that $W^{(conf)}$ provides a correct description of one-loop effects for non-minimally coupled matter, many problems remain open. The first one is how to deal with the non-local terms in (6.42). Direct variation of this equation with respect to the metric leads to very complicated expressions [349]. It was proposed in Ref. [350] to convert (6.42) into a local action by introducing two auxiliary fields, $f_1 = (1/\Box)R$ and $f_2 = (1/\Box)(\nabla\phi)^2$. Various versions of this method have frequently been used since (e.g. [339–342,351,352]). As the new action in terms of $f_1$ and $f_2$ is local, it is quite straightforward to vary it with respect to the metric in order to arrive at the energy–momentum tensor. However, since $f_1$ and $f_2$ are to be found from $R$ and $(\nabla\phi)^2$ by solving second-order differential equations, the energy–momentum tensor obtained in this way will, in general, depend on four integration constants. This is an indication that such an extended action does not necessarily yield the same physics as the original one. For the latter a single
integrable) one [300]. In Ref. [354] this singularity has been attributed to a breakdown of the WKB approximation.

In the case of non-minimally coupled scalars derived from SRG, action and path integral measure (the mode normalization condition) coincide with the ones for the s-wave parts of the corresponding quantities in four dimensions. Does this guarantee that the 2D Hawking flux will be just the s-wave part of the Hawking flux in four dimensions? The answer is negative, because renormalization and dimensional reduction do not commute. Indeed, even if each individual angular momentum contribution to the energy–momentum tensor or to the effective action were finite, the sum over the angular momenta will, in general, diverge. In fact, the effective action $W$ can be written in the zeta function regularization as
$$W^{reg} = -\frac12\,\Gamma(s)\sum_l\sum_{n_l}\lambda_{l,n_l}^{-s} = \sum_l W_l^{reg} , \tag{6.43}$$
where the “partial wave” effective action is
$$W_l^{reg} = -\frac12\,\Gamma(s)\sum_{n_l}\lambda_{l,n_l}^{-s} . \tag{6.44}$$
Here, $\lambda_{l,n_l}$ are eigenvalues of the kinetic operator in four dimensions corresponding to the angular momentum l. To remove divergences in $W_l^{reg}$ as $s\to0$ one must subtract the pole terms:
$$\delta W_l = \frac{1}{2s}\,\zeta_l(0) + \cdots , \qquad \zeta_l(s) = \sum_{n_l}\lambda_{l,n_l}^{-s} . \tag{6.45}$$
Here dots denote finite renormalization terms. After that one obtains the familiar expression (6.15) for each l with an appropriate operator $A$. However, the sum
$$W = \sum_l\left(W_l^{reg} + \delta W_l\right) \tag{6.46}$$
will diverge. Thus, a subtraction term which is needed to make (6.43) 9nite has nothing to do with the sum over l of the individual pole terms (6.45). This latter sum simply does not exist! This means that the four-dimensional theory requires more counter terms and counter terms of a di?erent type than the spherically reduced one. This problem was noted long ago [355] in calculations of tunnel determinants. In the context of SRG the non-commutativity of renormalization and dimensional reduction has been called “dimensional reduction anomaly” [356]; it has been the subject of extensive studies over recent years [354,357,358]. We conclude this section by noting that for massive matter 9elds the situation is simpler than for massless ones. One can apply e.g. the high frequency approximation [359] to estimate the energy– momentum tensor. The zero mass limit in such calculations is, of course, singular. The massive case is also less interesting because the Hawking Rux is suppressed by the mass. 7. Non-perturbative path integral quantization As pointed out in the previous section, dilaton gravity is a convenient laboratory for studying semi-classical e?ects like Hawking radiation. For quantum e?ects the simplicity of 2D theories
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
becomes even more important, since quantum gravity is beset with well-known conceptual problems, which are essentially independent of the considered dimension (cf. e.g. [20]). Thus, a study in a framework where the purely technical challenges are not as demanding as in higher-dimensional theories is desirable. GDT in 2D with matter as a theoretical laboratory for quantum gravity has several advantages as compared to other models:

• As outlined in the Introduction, it encompasses many different theories proposed in the literature, including models with strong physical motivation, like SRG.
• It exhibits continuous physical degrees of freedom and thus provides physical scattering processes with a non-trivial S-matrix, as opposed to pure SRG or GDT without matter.
• It is still simple enough to allow a non-perturbative treatment in the geometric sector.

The main advantage of the first point is that the same techniques can be used uniformly for a large class of theories. The second point will be elaborated in detail in the next section, where the S-matrix for s-wave gravitational scattering will be calculated. The third point is conceptually and technically very important: the split of geometric variables into background plus fluctuations in perturbation theory is something which can be avoided here. From the viewpoint of GR this is very attractive. After integrating out geometry exactly a non-local and non-polynomial action is obtained, depending solely on the matter fields and external sources. When perturbation theory is introduced at this point geometry can be reconstructed self-consistently to each given order. In particular, the proper back-reaction from matter is included automatically.

From a technical point of view the use of Cartan variables in a first-order formulation has been crucial. The ensuing constraint algebra also with matter shares the essential features with the one in the PSM model (cf.
Section 2.3) which governs the matterless case: It becomes a finite W-algebra for minimally coupled matter and a Lie-algebra for the JT model [21–26]. Moreover, it still closes with $\delta$-functions rather than derivatives of them. Within the BRST quantization procedure the “temporal” gauge has turned out to be extremely useful. It will lead to an effective metric in Sachs–Bondi form. As seen above (cf. Section 3.1) that gauge appeared to be already the most natural one in the absence of matter interactions. Even when the latter is present the classical action in that gauge remains linear in the canonical coordinates of the geometrical sector. Consequently, by integration three functional delta functions are generated which are used to perform an exact path integration over the corresponding canonical momenta. If no matter fields are present, an exact generating functional for the Green’s functions is obtained in this way. The effective field theory with non-local interactions for the case with matter fields will be considered in detail in Section 8.

7.1. Constraint algebra

A prerequisite for the proper formulation of a path integral is the Hamiltonian analysis. The key advantage of the formulation (2.17) for the geometric part of the action is its “Hamiltonian” form. The component version (2.20), together with the one for scalar fields, non-minimally coupled by $F(X)\neq{\rm const.}$ to the dilaton field (cf. (4.36)),

$$\mathcal{L}^{(m)} = (e)F(X)\left[\frac{1}{2}\eta_{ab}\,(\tilde\epsilon^{\mu\nu}e^a_\mu\partial_\nu\phi)(\tilde\epsilon^{\kappa\lambda}e^b_\kappa\partial_\lambda\phi) - f(\phi)\right] \tag{7.1}$$
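in terms of the canonical coordinates $\phi$ and

$$q_i = (\omega_1, e_1^-, e_1^+), \qquad \bar q_i = (\omega_0, e_0^-, e_0^+) \tag{7.2}$$

allow the identification of the respective canonical momenta$^{56}$ from the total Lagrangian $\mathcal{L} = \mathcal{L}^{(g)} + \mathcal{L}^{(m)}$ ($L = \int d^2x\,\mathcal{L}$, $\partial_0 q_i = \dot q_i$, etc.)

$$p_i = \frac{\partial\mathcal{L}}{\partial\dot q_i} = (X, X^+, X^-), \tag{7.3}$$

$$\pi = \frac{\partial\mathcal{L}}{\partial\dot\phi}, \tag{7.4}$$

$$\bar p_i = \frac{\partial\mathcal{L}}{\partial\dot{\bar q}_i} = 0. \tag{7.5}$$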
Eqs. (7.5) are three primary constraints. The canonical Hamiltonian density

$$\mathcal{H}_c = p_i\dot q_i + \pi\dot\phi - \mathcal{L}, \tag{7.6}$$

after elimination of $\dot\phi$ and $\dot q_i$ becomes

$$\mathcal{H}_c = -\bar q_i G_i, \tag{7.7}$$

with the secondary first class constraints

$$G_i(q,p,\phi,\pi) := G_i^{(g)}(q,p) + G_i^{(m)}(q,p,\phi,\pi), \tag{7.8}$$

the geometric part of which is given by ($\mathcal{V} = \mathcal{V}(p_2p_3, p_1)$ as defined in (2.31))

$$G_1^{(g)} = \partial_1 p_1 + p_3q_3 - p_2q_2, \tag{7.9}$$

$$G_2^{(g)} = \partial_1 p_2 + q_1p_2 - q_3\mathcal{V}, \tag{7.10}$$

$$G_3^{(g)} = \partial_1 p_3 - q_1p_3 + q_2\mathcal{V}, \tag{7.11}$$

and its matter part reads

$$G_1^{(m)} = 0, \tag{7.12}$$

$$G_2^{(m)} = \frac{F(p_1)}{4q_2}\left[(\partial_1\phi) - \frac{\pi}{F(p_1)}\right]^2 - F(p_1)\,q_3 f(\phi), \tag{7.13}$$

$$G_3^{(m)} = -\frac{F(p_1)}{4q_3}\left[(\partial_1\phi) + \frac{\pi}{F(p_1)}\right]^2 + F(p_1)\,q_2 f(\phi). \tag{7.14}$$

56 Strictly speaking the relation between $p_i$ and the variables $X$, $X^\pm$ yields primary second class constraints. However, the canonical procedure using Dirac brackets in the present case justifies the shortcut implied by (7.3). We use the standard nomenclature of Hamiltonian analysis (cf. e.g. [360–362]).
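By means of the Poisson bracket ($p_i' = p_i(x')$, etc.)

$$\{q_i, p_j'\} = \delta_{ij}\,\delta(x^1 - x^{1\prime}) \tag{7.15}$$

the stability of the first class primary constraints (7.5) identifies the $G_i$ of (7.8) as first class secondary constraints. There are no ternary constraints as can be seen from the Poisson algebra of the $G_i$

$$\{G_i, G_j'\} = C_{ij}{}^{k}\,G_k\,\delta(x^1 - x^{1\prime}), \tag{7.16}$$

with ($C_{ij}{}^{k} = -C_{ji}{}^{k}$; all non-listed $C_{ij}{}^{k}$-components vanish)

$$C_{12}{}^{2} = -1, \qquad C_{13}{}^{3} = 1,$$
$$C_{23}{}^{1} = -\frac{\partial\mathcal{V}}{\partial p_1} + \frac{F'(p_1)}{(e)F(p_1)}\,\mathcal{L}^{(m)}, \qquad C_{23}{}^{2} = -\frac{\partial\mathcal{V}}{\partial p_2}, \qquad C_{23}{}^{3} = -\frac{\partial\mathcal{V}}{\partial p_3}. \tag{7.17}$$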
In the matterless case and for minimal coupling ($F' = 0$ in (7.17)) the structure functions $C_{ij}{}^{k}$ depend on the momenta only. For the JT model (2.12) with $\mathcal{V} = \Lambda p_1$ again the Lie-algebra of SO(1,2) is reproduced (cf. the observation after Eq. (2.36)). Already without matter the symmetry generated by the $G_i = G_i^{(g)}$ corresponds to a non-linear (finite W-) algebra $\mathcal{A}^{(g)}$. Including also the (mutually commuting) momenta $p_i$ in $\bar{\mathcal{A}}^{(g)} = \{\mathcal{A}^{(g)}, p_i\}$, that algebra closes and the Casimir invariant $\mathcal{C}^{(g)}$ of the PSM appears as one of the two elements of the center [194], the second of which can be expressed as $\partial_1\mathcal{C}^{(g)}$.

It is remarkable that the commutators (7.16) nevertheless resemble the ones of an ordinary gauge theory or the Ashtekar approach to gravity [363,364] in the sense that no space-derivatives of the delta functions appear.$^{57}$ The usual Hamiltonian constraints $H$ and the diffeomorphism constraints $H_1$ in an analysis of the ADM-type [271] always lead to such derivatives (cf. e.g. [365]). Indeed, $H$ and $H_1$ can be reproduced by suitable linear combinations of the $G_i$ [221,366].

We emphasize that upon quantization we have no ordering problems in the present formalism.$^{58}$ Due to the linear appearance of coordinates in the geometric parts $G_i^{(g)}$ of the constraints any hermitian version of them is automatically Weyl ordered.$^{59}$ Moreover, the Hamiltonian is hermitian if the constraints exhibit that property (since the Hamiltonian essentially is just a sum over them). This property carries over to $G_i$ since $C_{ij}{}^{k}$ for minimal coupling depend only on the momenta and for non-minimal coupling the only addition consists of the matter Lagrangian. The commutator between structure functions and constraints vanishes: for minimal coupling this is a trivial consequence of the PSM structure of the Hamiltonian and for non-minimal coupling the only non-trivial term (present in $C_{23}{}^{1}$) vanishes as well since it commutes with $G_1$.
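As a small consistency check of the JT statement above, the structure functions (7.17) can be written out explicitly. The sketch below assumes $U = 0$ and $\mathcal{V} = \Lambda p_1$ with minimal coupling, suppresses the common factor $\delta(x^1 - x^{1\prime})$, and is meant only as an illustration, not as part of the original derivation.

```latex
% JT model (assumed: U = 0, V = \Lambda p_1, F' = 0).
% Non-vanishing structure functions from (7.17):
%   C_{12}{}^{2} = -1,  C_{13}{}^{3} = +1,  C_{23}{}^{1} = -\Lambda,
% i.e. constants, so (7.16) is a genuine Lie algebra:
\begin{align*}
\{G_1, G_2\} &= -G_2, &
\{G_1, G_3\} &= G_3, &
\{G_2, G_3\} &= -\Lambda\, G_1 .
\end{align*}
% Jacobi identity (delta functions suppressed):
\begin{align*}
\{G_1,\{G_2,G_3\}\} + \{G_2,\{G_3,G_1\}\} + \{G_3,\{G_1,G_2\}\}
 &= -\Lambda\{G_1,G_1\} - \{G_2,G_3\} - \{G_3,G_2\} \\
 &= 0 .
\end{align*}
```

With $G_1$ in the role of the Cartan generator and $G_2$, $G_3$ as lowering and raising generators, these are the commutation relations of a three-dimensional simple Lie algebra, in line with the SO(1,2) statement in the text.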
57 It is also possible to switch to other sets of constraints, e.g. including $\partial_1\mathcal{C}^{(g)}$ as one of them, thus Abelianizing them in the matterless case. This, however, works only in a given patch, since the transformation involved breaks down at a horizon (cf. e.g. [142]).
58 We are grateful to P. van Nieuwenhuizen for discussions on that point.
59 To be more explicit: classical terms of the form $pq$ have a unique hermitian representation, namely $(qp + pq)/2$. This is also their Weyl ordered version.

Moreover, the commutator of two (Weyl ordered) constraints again yields the classical expressions (7.17) in Weyl ordered form. Therefore,
the Poisson algebra (7.16) can be elevated without problems to a commutator algebra for quantum operators.

As no ternary constraints exist and as all constraints are first class, the extended phase space with (anticommuting) ghost fields can be constructed easily, following the approach of Batalin, Vilkovisky and Fradkin [98,367,368]. One first determines the BRST charge $\Omega$ which fulfills $\Omega^2 = \frac{1}{2}\{\Omega,\Omega\} = 0$. Treating $\bar q_i$ as canonical variable one obtains two quadruplets of constraint/canonical coordinate/ghost/ghost momentum

$$(\bar p_i, \bar q_i, b_i, p_i^b), \qquad (G_i, -, c_i, p_i^c), \tag{7.18}$$

with canonical (graded) brackets

$$\{c_i, p_j^{c\,\prime}\} = -\delta_{ij}\,\delta(x^1 - x^{1\prime}) = \{b_i, p_j^{b\,\prime}\}. \tag{7.19}$$
In (7.18) no “coordinate” conjugate to the secondary constraints $G_i$ appears, although one could try to construct some quantities which fulfill canonical Poisson bracket relations with them. These quantities are not needed for the BRST procedure which according to [142] yields

$$\Omega = \int\left(c_iG_i + \frac{1}{2}c_ic_jC_{ij}{}^{k}p_k^c + b_i\bar p_i\right)d^2x. \tag{7.20}$$

Since the structure functions (7.17) are field dependent, it is non-trivial that the homological perturbation series stops at rank = 1. In general one would expect the presence of higher order ghost terms (“ghost self interactions”). However, it can be verified easily that $\Omega$ as defined in (7.20) is nilpotent by itself. For the matterless case this is a simple consequence of the Poisson structure: the Jacobi identity (2.36) for the Poisson-tensor implies that the homological perturbation series already stops at the Yang–Mills level [72]. It turns out that the inclusion of (dynamical) scalars does not change this feature.

The quantity (7.20) generates BRST transformations with anticommuting constant parameter $\lambda$ by $\delta\mathcal{H}_c = \{\lambda\Omega, \mathcal{H}_c\} = 0$. It not only leaves the canonical Hamiltonian density $\mathcal{H}_c$ invariant, but also the extended Hamiltonian density

$$\mathcal{H}_{\rm ext} = \mathcal{H}_c + \{\Psi, \Omega\} \tag{7.21}$$

in which $\mathcal{H}_c$ has been supplemented by a (BRST exact) term with the gauge-fixing fermion $\Psi$. In our case also $\mathcal{H}_c = \{p_i^c\bar q_i, \Omega\}$ is exact, a well-known feature of reparameterization invariant theories. A useful class of gauge-fixing fermions is given by [94]

$$\Psi = p_i^b Q_i, \tag{7.22}$$

where $Q_i$ are some gauge-fixing functions. The class of temporal gauges (3.3)

$$\bar q_i = a_i, \qquad a_i = (0, 1, 0) \tag{7.23}$$

has turned out to be very convenient for the exact path integration of the geometric part of the action [65,69,70,94]. It can be incorporated in (7.22) by the choice

$$Q_i = \frac{1}{\epsilon}(\bar q_i - a_i) \tag{7.24}$$

with $\epsilon$ being a positive constant. Then (7.21) reduces to

$$\mathcal{H}_{\rm ext} = \frac{1}{\epsilon}p_i^b b_i - \frac{1}{\epsilon}(\bar q_i - a_i)\bar p_i - \bar q_iG_i - \bar q_ic_jC_{ij}{}^{k}p_k^c + p_i^cb_i. \tag{7.25}$$
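It is necessary to perform the limit $\epsilon\to 0$ to impose (7.23) in the path integral. This can be achieved by a redefinition of the canonical momenta

$$\bar p_i \to \hat{\bar p}_i = \epsilon^{-1}\bar p_i, \qquad p_i^b \to \hat p_i^b = \epsilon^{-1}p_i^b, \tag{7.26}$$

which has unit super-Jacobian in the path integral measure. Taking $\epsilon\to 0$ afterwards, in terms of the new momenta, yields a well-defined extended Hamiltonian.

7.2. Path integral quantization

After that step the path integral in extended phase space becomes

$$W = \int(\mathcal{D}q_i)(\mathcal{D}p_i)(\mathcal{D}\bar q_i)(\mathcal{D}\hat{\bar p}_i)(\mathcal{D}\phi)(\mathcal{D}\pi)(\mathcal{D}c_i)(\mathcal{D}p_i^c)(\mathcal{D}b_i)(\mathcal{D}\hat p_i^b)\,\exp\left[i\int\left(\mathcal{L}^{(0)} + J_ip_i + j_iq_i + \sigma\phi\right)d^2x\right], \tag{7.27}$$

with

$$\mathcal{L}^{(0)} = p_i\dot q_i + \epsilon\,\hat{\bar p}_i\dot{\bar q}_i + \pi\dot\phi + \epsilon\,\hat p_i^b\dot b_i + p_i^c\dot c_i - \mathcal{H}_{\rm ext}. \tag{7.28}$$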
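The effect of the rescaling (7.26) can be made explicit by substituting it into (7.25) and into the kinetic terms. The lines below are a sketch under the assumption that (7.26) reads $\hat p = \epsilon^{-1}p$ for both $\bar p_i$ and $p_i^b$, which is the reading required by the factors of $\epsilon$ in (7.28); normalization constants are not tracked.

```latex
% Singular terms of (7.25) become \epsilon-independent:
\begin{align*}
\frac{1}{\epsilon}\,p_i^b b_i &= \hat p_i^b\, b_i , &
\frac{1}{\epsilon}\,(\bar q_i - a_i)\,\bar p_i &= (\bar q_i - a_i)\,\hat{\bar p}_i ,
\end{align*}
% while the corresponding kinetic terms drop out in the limit:
\begin{align*}
\bar p_i\,\dot{\bar q}_i &= \epsilon\,\hat{\bar p}_i\,\dot{\bar q}_i \to 0 , &
p_i^b\,\dot b_i &= \epsilon\,\hat p_i^b\,\dot b_i \to 0
\qquad (\epsilon \to 0) .
\end{align*}
% Unit super-Jacobian: three commuting and three anticommuting momenta
% are rescaled by the same factor (Berezin measures scale inversely),
\[
\prod_{i=1}^{3} d\hat{\bar p}_i \, d\hat p_i^b
 = \left(\epsilon^{-1}\right)^{3}\epsilon^{3}\,\prod_{i=1}^{3} d\bar p_i\, dp_i^b .
\]
% \hat{\bar p}_i is then a Lagrange multiplier enforcing the gauge (7.23):
\[
\int (\mathcal{D}\hat{\bar p}_i)\,
 \exp\left[\, i\int d^2x\,(\bar q_i - a_i)\,\hat{\bar p}_i \right]
 \propto \delta(\bar q_i - a_i) .
\]
```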
It turns out to be very useful to introduce sources $j_i$ and $\sigma$ not only for the geometric variables $q_i$ and for the scalar field $\phi$, but also for the momenta $p_i$, denoted by $J_i$. Integrating out $\hat{\bar p}_i$ and $\bar q_i$ yields an effective Lagrangian with $\bar q_i = a_i$ as required by (7.23). After further trivial integrations with respect to $b_i$ and $\hat p_i^b$, and finally $c_i$ and $p_i^c$, (7.27) simplifies to

$$W = \int(\mathcal{D}q_i)(\mathcal{D}p_i)(\mathcal{D}\phi)(\mathcal{D}\pi)\,\det M\,\exp iL^{(1)} \tag{7.29}$$

with

$$L^{(1)} = \int d^2x\left(p_i\dot q_i + \pi\dot\phi + a_iG_i + J_ip_i + j_iq_i + \sigma\phi\right). \tag{7.30}$$

In the functional matrix

$$M = \begin{pmatrix}\partial_0 & -1 & 0\\ 0 & \partial_0 & 0\\ K & p_3U(p_1) & \partial_0 + p_2U(p_1)\end{pmatrix} \tag{7.31}$$

the (complicated) contribution $K$ is irrelevant for its determinant:

$$\det M = (\det\partial_0)^2\,\det(\partial_0 + p_2U(p_1)). \tag{7.32}$$
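That $K$ drops out can be seen by a formal cofactor expansion. The sketch below treats the operator-valued entries like ordinary matrix elements and assumes the arrangement of entries in (7.31) with $K$ in the lower left corner.

```latex
% The third column of M has a single non-zero entry, M_{33}.
% Expanding along that column, K never enters:
\begin{align*}
\det M
 &= \det\bigl(\partial_0 + p_2 U(p_1)\bigr)\,
    \det\begin{pmatrix} \partial_0 & -1 \\ 0 & \partial_0 \end{pmatrix} \\
 &= (\det \partial_0)^2\, \det\bigl(\partial_0 + p_2 U(p_1)\bigr) .
\end{align*}
```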
Indeed, apart from that important contribution to the measure the generating functional of the Green’s functions $W$ with the effective Lagrangian (7.30) is nothing but the “naive” result, obtained by gauge fixing the Hamiltonian $\mathcal{H}_c$.

The present approach to quantization may be questioned because it is not based directly upon the classical physical theory like the dilaton action appearing naturally in SRG. However, the classical equivalence argument of Section 2.2 in the quantum language simply means to integrate out$^{60}$ the torsion-independent part of the spin connection and of the $X^a$ in (2.17). The only delicate point is the transformation in the measure of the path integral. As shown in Ref. [69] there exists a gauge ($e_0^- = 1$, $e_1^+ = 1$, $e_0^+ = 0$) which does not produce a Faddeev–Popov-type determinant in the transition to the equivalent dilaton theory (of the form (2.9)). Thus for all physical observables, defined as to be independent of the gauge, that equivalence should also hold at the quantum level.

60 In the path integral the classical elimination procedure is replaced by first integrating the two components of $\omega$ which appear linearly. The resulting $\delta$-functions allow the elimination of $X^a$ by means of the relation (2.29).

The (Gaussian) path integral$^{61}$ of momenta $\pi$ in (7.29) is done in the next step:

$$W = \int(\mathcal{D}q_i)(\mathcal{D}p_i)(\mathcal{D}\phi)\,(\det q_2)^{1/2}\,\det M\,\exp iL^{(2)}, \tag{7.33}$$

$$L^{(2)} = \int d^2x\Bigl[p_i\dot q_i + q_1p_2 - q_3\bigl(V(p_1) + U(p_1)p_2p_3\bigr) + F(p_1)\bigl((\partial_1\phi)(\partial_0\phi) - q_2(\partial_0\phi)^2 - q_3f(\phi)\bigr) + j_iq_i + J_ip_i + \sigma\phi\Bigr]. \tag{7.34}$$
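The step from (7.30) to (7.34) amounts to completing the square in $\pi$. The following worked lines are a sketch that assumes the matter constraint (7.13) in the form $G_2^{(m)} = \frac{F}{4q_2}\bigl[(\partial_1\phi) - \pi/F\bigr]^2 - Fq_3f(\phi)$ and the gauge $a_i = (0,1,0)$, so that only $G_2$ contributes; overall normalization factors are not tracked.

```latex
% \pi-dependent part of the integrand in (7.30), with F = F(p_1):
\begin{align*}
\pi\dot\phi + \frac{F}{4q_2}\Bigl[(\partial_1\phi) - \frac{\pi}{F}\Bigr]^2
 &= \frac{1}{4 q_2 F}\Bigl[\pi - F(\partial_1\phi) + 2 q_2 F\,\dot\phi\Bigr]^2 \\
 &\quad + F\Bigl[(\partial_1\phi)(\partial_0\phi) - q_2(\partial_0\phi)^2\Bigr] .
\end{align*}
% The shifted Gaussian integral over \pi gives, point by point,
\[
\int d\pi\, \exp\Bigl[\frac{i}{4 q_2 F}(\pi - \pi_0)^2\Bigr] \propto (q_2 F)^{1/2},
\]
% whose q_2-dependence is the origin of the factor (\det q_2)^{1/2} in (7.33),
% while the remainder reproduces the matter terms in (7.34).
```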
In the only contributing constraint $G_2$, which is now written explicitly, the total derivative $\partial_1p_2$ has been dropped.$^{62}$ As it stands (7.33) lacks a covariant measure for the final matter integration. If this is not corrected properly, counter terms emerging from a non-covariant measure may obscure the quantization procedure. The accepted remedy [369] (cf. also [370,371]) is to insert the appropriate factor by hand. It arises from the requirement that the Gaussian integral

$$\int(\mathcal{D}\sqrt[4]{-g}\,\phi)\,\exp\left(i\int\sqrt{-g}\,\phi\,\Gamma\,\phi\right) = (\det\Gamma)^{-1/2} \tag{7.35}$$

be independent of the metric $g$, as is evident in (7.35). In the present gauge (3.3) with $\sqrt{-g} = (e) = e_1^+ = q_3$ this means that $(\det q_2)^{1/2}$ in (7.33) should be replaced by $(\det q_3)^{1/2}$.

In the customary approach the next step would be the integral of the momenta $p_i$. However, in a generic 2D gravity model, including the physically relevant SRG, the $p$-integrals are not Gaussian. On the other hand, the action (7.34) is linear in the geometric coordinates $q_i$. Even the new determinant in the measure by the identity ($u$ and $\bar u$ are anticommuting scalars and $v$ a commuting one)

$$(\det q_3)^{1/2} = \int(\mathcal{D}v)(\mathcal{D}u)(\mathcal{D}\bar u)\,\exp\left[i\int(v^2 + u\bar u)\,q_3\right] \tag{7.36}$$

may be reexpressed formally as yet another linear contribution (in $q_3$) to the action. This suggests to perform the $q_i$-integrals first.$^{63}$

61 An effective action for a general class of gauges where this integration is not possible has been proposed too [255].
62 There exists a shortcut to obtain (7.33) with (7.34) and (7.32) [142]: instead of (7.22) with (7.24) one can use the gauge fixing fermion $\Psi = p_2^c$ and straightforwardly integrate all ghosts and their momenta, without limiting procedure for a quantity $\epsilon$ as in (7.24).
63 Historically, this exact integrability was realized first for the matterless KV-model [94] where the quadratic dependence on $p_i$ allowed the first integration to be the (traditional) Gaussian one. The inverted sequence of integrals has been initiated in [69].
The vanishing arguments of the three $\delta$-functions for the respective $(q_1, q_2, q_3)$-integrals yield three differential equations ($h = u\bar u + v^2$ from (7.36))

$$\partial_0p_1 - p_2 - j_1 = 0, \tag{7.37}$$

$$\partial_0p_2 - j_2 + F(p_1)(\partial_0\phi)^2 = 0, \tag{7.38}$$

$$\bigl(\partial_0 + p_2U(p_1)\bigr)p_3 + V(p_1) + F(p_1)f(\phi) - h - j_3 = 0. \tag{7.39}$$
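How (7.37)–(7.39) arise can be traced term by term. The sketch below assumes that $p_i\dot q_i$ in (7.34) is integrated by parts with boundary terms neglected, and that the measure factor has been exponentiated through (7.36), which adds $h\,q_3$ to the integrand.

```latex
% Terms of L^{(2)} + \int d^2x\, h\, q_3 that are linear in q_i:
\begin{align*}
\int d^2x\,\Bigl\{\;
 & q_1\bigl[-\partial_0 p_1 + p_2 + j_1\bigr] \\
 {}+{} & q_2\bigl[-\partial_0 p_2 - F(p_1)(\partial_0\phi)^2 + j_2\bigr] \\
 {}+{} & q_3\bigl[-\partial_0 p_3 - V(p_1) - U(p_1)\,p_2 p_3
        - F(p_1) f(\phi) + h + j_3\bigr] \Bigr\} .
\end{align*}
% Writing the three brackets as E_1, E_2, E_3, the q_i-integrals give
\[
\int (\mathcal{D}q_i)\,\exp\Bigl[\, i \int d^2x\; q_i E_i \Bigr]
 \propto \prod_{i=1}^{3} \delta(E_i) ,
\]
% and E_i = 0 are (up to an overall sign) Eqs. (7.37)-(7.39).
```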
The differential operators acting on $p_i$ precisely combine to the ones in the determinant $\det M$ of the measure in (7.29) with (7.32). Therefore, $\det M$ will be cancelled exactly in the subsequent integration of $p_i$. Eqs. (7.37)–(7.39) are the classical (Hamiltonian) differential equations for the momenta (with sources $j_i$). Matter is represented in (7.38), (7.39) by the terms proportional to Newton’s constant ($F\propto\kappa$).

7.3. Path integral without matter

It is not possible to obtain an exact solution for $p_i$ for general matter interactions. Therefore, the latter can be treated only perturbatively, and in the first step $F\to 0$ should be considered.$^{64}$ Then the solutions of (7.37)–(7.39) can be written as

$$p_1 = B_1 = \bar p_1 + \partial_0^{-1}(p_2 + j_1), \tag{7.40}$$

$$p_2 = B_2 = \bar p_2 + \partial_0^{-1}j_2, \tag{7.41}$$

$$p_3 = B_3 = e^{-Q}\left[\partial_0^{-1}e^{Q}\bigl(j_3 - V(p_1)\bigr) + \bar p_3\right], \tag{7.42}$$

where $\partial_0\bar p_i = 0$ and $\partial_0^{-1}$, $\partial_0^{-2}$ have to be properly defined one-dimensional Green’s functions in the genuine realm of quantum theory (see below). In the integration of (7.39) the differential operator $H = (\partial_0 + p_2U(p_1))$ has been reexpressed in terms of

$$Q = \partial_0^{-1}\bigl(U(B_1)B_2\bigr) \tag{7.43}$$

as $H^{-1} = e^{-Q}\partial_0^{-1}e^{Q}$. Proceeding as announced above, after $\int(\mathcal{D}q)(\mathcal{D}p)$ we arrive at the exact expression for the generating functional for the Green’s functions

$$W_0(j, J) = \exp iL^{(0)}_{\rm eff}, \tag{7.44}$$

$$L^{(0)}_{\rm eff} = \int d^2x\left[J_iB_i + \tilde L_0(j, B)\right], \tag{7.45}$$
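That the expressions (7.40)–(7.42) solve the matterless equations can be checked by direct differentiation. The sketch below assumes $F\to 0$ and $h = 0$ in (7.38), (7.39), and treats $\partial_0^{-1}$ as a right inverse of $\partial_0$.

```latex
% With \partial_0 \bar p_i = 0 and \partial_0\,\partial_0^{-1} = 1:
\begin{align*}
\partial_0 B_1 &= p_2 + j_1 , &
\partial_0 B_2 &= j_2 , &
\partial_0 Q &= U(B_1)\, B_2 .
\end{align*}
% Differentiating B_3 = e^{-Q}[\partial_0^{-1} e^{Q}(j_3 - V) + \bar p_3]:
\begin{align*}
\partial_0 B_3
 &= -(\partial_0 Q)\, B_3 + e^{-Q} e^{Q}\,\bigl(j_3 - V(B_1)\bigr) \\
 &= -B_2\, U(B_1)\, B_3 + j_3 - V(B_1) ,
\end{align*}
% which is (7.39) at F = 0, h = 0:
\[
\bigl(\partial_0 + p_2 U(p_1)\bigr)\, p_3 + V(p_1) - j_3 = 0 .
\]
```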
where $B_i = B_i(j, \bar p_i)$. Here $L^{(0)}_{\rm eff}$ trivially coincides with the generating functional of connected Green’s functions. In (7.45) a new contribution $\tilde L_0(j, B)$ has been added. It originates from an ambiguity in the first term of the square bracket. In an expression $\int dx^0\int dy^0\,J_{x^0}\,(\partial_0^{-1})_{x^0y^0}\,A_{y^0}$ the symbol $\partial_0^{-1}$ means an integral which when acting upon $J$ contains an undetermined integration constant $\bar g(x^1)$. This generates a new term $\bar gA$. Applying this to $J_1B_1 + J_2B_2$ yields uninteresting couplings.$^{65}$ But from $B_3$ with $A = e^{Q}(j_3 - V)$ together with $J_3$ an important contribution to the action follows:

$$\tilde L^{(0)} = \bar g\,e^{Q}(j_3 - V). \tag{7.46}$$

Indeed that term is the only one to survive in the matterless case at $J_i = 0$, i.e. for vanishing sources of the momenta. On the other hand, precisely that action $\tilde L^{(0)}$ had been derived in the first exact path integral [94] computed for the KV-model [27,28]. There $J_i\equiv 0$ had been taken from the beginning. It also plays a crucial role for the derivation of the solutions for the (classical) e.o.m.’s for the geometric variables which simply follow from the “expectation values” in the matterless case

$$\langle q_i\rangle = \frac{1}{iW_0(0)}\left.\frac{\delta W_0}{\delta j_i}\right|_{j=J=0}. \tag{7.47}$$

These $\langle q_i\rangle$ indeed coincide with the classical solutions (3.20)–(3.22) when constants of integration are adjusted and the gauge (3.3) is assumed. The usefulness of the sources $J_i$ for the momenta is evident when the Casimir function $\mathcal{C}^{(g)}$ of (3.14) is determined from its expectation value:

$$\langle\mathcal{C}^{(g)}(p)\rangle = \frac{1}{W_0(J)}\left.\mathcal{C}^{(g)}\!\left(\frac{1}{i}\frac{\delta}{\delta J}\right)W_0(J)\right|_{J=0} = \bar p_3. \tag{7.48}$$

The last equality in (7.48) follows from introducing the solutions (7.40)–(7.42) into $\mathcal{C}^{(g)}(p) = \mathcal{C}^{(g)}(B_i^0)$, where in $B_i^0 = B_i(j=0)$ the residual gauge is fixed so that $B_2^0 = \bar p_2 = 1$, $B_1^0 = x^0$. Here, as well as in the solutions $\langle q_i\rangle$ of (7.47) it is evident that $\bar p_3$ describes the (classical) background.

64 For minimally coupled scalars $F = {\rm const.}$ is independent of $p_i$ so that this term can be taken along one more step. In the end, however, one cannot avoid that perturbation expansion.

It should be emphasized that one encounters the unusual situation that in the matterless case the classical theory is expressed by $W_0$ in a quantum field theoretical formalism. Therefore, the question arises as to the whereabouts of the e.o.m.’s which are the counterparts of (7.37)–(7.39), but which depend on derivatives $\partial_1p_i$ instead of $\partial_0p_i$.
These relations disappeared because of the gauge fixing, much like the Gauss law disappears in the temporal gauge $A_0 = 0$ for the U(1) gauge field $A_\mu$. Actually, in the “quantum” formalism of the classical result they reappear as “Ward-identities” by gauge variations (diffeomorphisms, local Lorentz transformations) of $W_0(j, J)$. For details Refs. [70,142] can be consulted.

There is one most important lesson to be drawn retrospectively for the matterless case which may have consequences in a more general context than the present one of 2D theories of gravity: The exact quantum integral of the geometry leading here to the classical theory uses, among others, a path integral over all values of $q_3$ in order to arrive at the classical equation for the momenta through the $\delta$-function. However, $q_3$ in the gauge (3.3) is identical to the determinant $(e) = \sqrt{-g}$. Therefore, a summation including negative and vanishing volumes has to be made to arrive at the correct (classical) result.

65 It turns out that the corresponding ambiguous contributions are fixed uniquely by imposing boundary conditions on the momenta $p_1$ and $p_2$.

It is instructive to derive the effective action corresponding to the generating functional (7.45). In terms of the mean fields $\langle q_i\rangle$, $\langle p_i\rangle$

$$\langle q_i\rangle = \frac{\delta L^{(0)}_{\rm eff}}{\delta j_i}, \qquad \langle p_i\rangle = \frac{\delta L^{(0)}_{\rm eff}}{\delta J_i}, \tag{7.49}$$

the effective action $\Gamma(\langle q_i\rangle, \langle p_i\rangle)$ results from the Legendre transform of $L^{(0)}_{\rm eff}$,

$$\Gamma(\langle q_i\rangle, \langle p_i\rangle) = L^{(0)}_{\rm eff}(j, J) - \int d^2x\left(j_i\langle q_i\rangle + J_i\langle p_i\rangle\right), \tag{7.50}$$
where the sources must be expressed through the mean fields. To economize writing the brackets the notations

$$\langle q_i\rangle = (\omega_1, e_1^-, e_1^+), \qquad \langle p_i\rangle = (X, X^+, X^-), \tag{7.51}$$

imply a simple return to the original geometric variables (cf. (7.2) and (7.3)). A peculiar feature of the first-order formalism is that only the last three of the equations (7.49) are needed:

$$j_1 = \partial_0X - X^+, \tag{7.52}$$

$$j_2 = \partial_0X^+, \tag{7.53}$$

$$j_3 - V(X) = e^{-Q}\partial_0\left(e^{Q}X^-\right) = \bigl(\partial_0 + X^+U(X)\bigr)X^-. \tag{7.54}$$

The exact effective action (7.50) for the dilaton gravity models without matter immediately follows from (7.52) to (7.54):

$$\Gamma = \int_{\mathcal{M}}d^2x\left[\omega_1X^+ - \omega_1(\partial_0X) - e_1^-(\partial_0X^+) - e_1^+(\partial_0X^-) - e_1^+\bigl(V(X) + X^+X^-U(X)\bigr)\right] \pm \bar g\int_{\partial\mathcal{M}}dx^1\,e^{Q}X^-. \tag{7.55}$$
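The step from (7.50) to (7.55) can be spelled out as follows. This is a sketch that assumes $\langle p_i\rangle = B_i$, so that the $J_iB_i$ term of (7.45) cancels against $J_i\langle p_i\rangle$, and that $\bar g$ depends on $x^1$ only.

```latex
% After the cancellation of the J-dependent terms:
\[
\Gamma = \int d^2x\, \tilde L^{(0)} - \int d^2x\; j_i \langle q_i\rangle .
\]
% Bulk part, inserting (7.51)-(7.54):
\begin{align*}
-j_1\,\omega_1 &= \omega_1 X^+ - \omega_1(\partial_0 X) , \\
-j_2\, e_1^-   &= -e_1^-(\partial_0 X^+) , \\
-j_3\, e_1^+   &= -e_1^+(\partial_0 X^-)
                  - e_1^+\bigl(V(X) + X^+ X^- U(X)\bigr) .
\end{align*}
% Boundary part, using (7.46) and the first equality in (7.54):
\begin{align*}
\int d^2x\; \bar g\, e^{Q}\,(j_3 - V)
 &= \int d^2x\; \bar g\, \partial_0\bigl(e^{Q} X^-\bigr) \\
 &= \int dx^1\; \bar g\, e^{Q} X^- \,\Big|_{x^0 = r_1}^{x^0 = r_2} ,
\end{align*}
% i.e. the surface term of (7.55), with opposite signs on the two boundaries.
```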
It has been assumed that the manifold $\mathcal{M}$ has the form of a strip $\mathcal{M} = [r_1, r_2]\times\mathbb{R}$. The upper (+) sign in front of the surface term$^{66}$ corresponds to the “right” boundary $x^0 = r_2$, and the lower (−) sign to $x^0 = r_1$. The volume term in (7.55) is just the classical action in the temporal gauge (cf. first line in (7.34)). Thus all dilaton gravities without matter are locally quantum trivial [69]. Therefore, all possible quantum effects are encoded in the boundary part and are global.

Except in Section 5.1 and in Eq. (7.55) so far complications from boundary effects were entirely ignored. Their inclusion in the path integral approach is a highly non-trivial problem and, in general, requires the introduction of non-local operators at the boundary (cf. e.g. [372–376]). Matterless 2D quantum gravity being a special case of PSMs, the discussion can be incorporated into the one of these more general models,$^{67}$ if the bulk action (2.34) is supplemented by

$$\int_{\partial\mathcal{M}_2}f(\mathcal{C})\,X^IA_I, \tag{7.56}$$

where $\mathcal{C}$ is the Casimir function (2.45)–(2.47). A consistent way$^{68}$ to implement boundary conditions is fixing $f(\mathcal{C}) = 0$ and $\delta X^I = 0$ at time-like $\partial\mathcal{M}_2$, which implies $X^I|_{\partial\mathcal{M}_2} = X^I(r)$ with $r$ being the “radius”. In fact, this prescription we had imposed tacitly by dropping all boundary contributions in (7.44) and by choosing fixed functions of $x^0$ (which corresponds to a radial coordinate in our gauge) for the boundary values of $p_i$.

66 The two omitted surface terms (cf. footnote 65) are just $\int_{\partial\mathcal{M}}dx^1X^+$ and $\int_{\partial\mathcal{M}}dx^1X$. Since both quantities will be fixed by suitable boundary conditions in the next section we have already dropped them.
67 We are grateful to L. Bergamin and P. van Nieuwenhuizen for discussions on that model.
68 We mean consistency as defined in [377]: boundary conditions arise from (1) extremizing the action, (2) invariance of the action and (3) closure of the set of boundary conditions under symmetry transformations. In Maxwell theory, e.g., these consistency requirements single out electric or magnetic boundary conditions [378].

As pointed out in Section 2.3, gravity theories in D = 2 without matter are special examples of PSMs. Not surprisingly, the exact path integral also is encountered there [200]. Recently, an “almost closed expression” for the partition function on an arbitrary oriented two-manifold has been presented as well [201].

It is interesting to compare this result with local perturbative calculations [55,91–93]. Results obtained in different gauges must coincide on-shell only. The effective “quantum” actions obtained in these papers indeed vanish on-shell in full agreement with our non-perturbative calculations. Hence the non-trivial off-shell counterterms appearing in Refs. [55,91–93] are pure artifacts of the gauges employed. In Ref. [379] local quantum triviality of some dilaton models has been verified with the conformal field theory technique. Some authors [380,381] also rely on rather complicated field redefinitions which, however, as a rule produce Jacobian factors, making a comparison with the result [69] very difficult. Finally, Ref. [382] should be mentioned where loop calculations in the presence of the Polyakov term have been performed, although only part of the degrees of freedom was quantized.

7.4. Path integral with matter

7.4.1. General formalism

When the geometry interacts with matter the computation must be resumed at the generating functional (7.33). After performing the integrations $\int(\mathcal{D}p)(\mathcal{D}q)$ as in the matterless case one arrives at

$$W = \int(\mathcal{D}\phi)\,\exp iL^{(3)}, \tag{7.57}$$

$$L^{(3)} = \int d^2x\left[F(\hat B_1)(\partial_0\phi)(\partial_1\phi) + J_i\hat B_i + \sigma\phi + \tilde L(j_i, \hat B_i)\right], \tag{7.58}$$

$$\tilde L(j_i, \hat B_i) = \bar g\,e^{\hat Q}\left(h + j_3 - V(\hat B_1) - F(\hat B_1)f(\phi)\right). \tag{7.59}$$
Here $\hat B_i$ are the solutions for $p_i$ from (7.37) to (7.39) and thus are functions of the scalar field $\phi$ as well. The notation $\hat Q$ and $\hat V$ also indicates the dependence of these quantities on $\hat B_i$ instead of $B_i$ (cf. (7.43)). Nevertheless, the action still is seen to be only linear in $h = u\bar u + v^2$ in $\tilde L$ (7.59) and $\hat B_3$ of (7.39). Therefore the identity (7.36) may be used backwards with $q_3$ replaced by
$$E_1^+(J, j, \bar p, f) = J_3\,e^{-\hat Q}\,\partial_0^{-1}e^{\hat Q} + \bar g\,e^{\hat Q}. \tag{7.60}$$
˜ h=0 + '] ; d2 x[F(Bˆ 1 )(90 )(91 ) + Ji Bˆ i |h=0 + L|
(7.60)
(7.61) (7.62)
the measure allows an intuitive interpretation. For physically interesting Green’s functions with $J_3 = 0$ the penultimate term in (7.60) is nothing but $q_3$ again, however expressed in terms of the sources $j$ and containing the scalar field. Thus this determinant in the measure duly takes into account backreactions of scalar matter upon the geometry. This is how far we can get using non-perturbative methods. The final matter integration cannot be performed exactly.

7.4.2. Perturbation theory

In the treatment of matter, one now may follow the usual steps of perturbative quantum field theory. In order to avoid cumbersome formulas and, nevertheless, elucidate the basic principles we restrict the discussion to minimal coupling ($F(p_1) = 1$) and no local self-interaction ($f(\phi) = 0$). First the terms quadratic in $\phi$ of (7.58) are isolated by expanding $\hat B_i$ and $\tilde L$ to this order. Then, they are considered together with the source term $\sigma\phi$ in a (Gaussian) path integral. The $\phi$-dependence in the measure contributes to higher loop order only. Higher order terms $O(\phi^{2n})$ ($n\geq 2$) in (7.62) are interpreted as vertices and taken outside the integral, with the replacement $\phi\to(1/i)\,\delta/\delta\sigma$. We denote them summarily as $Z(\phi)$:

$$W = \exp\left[iZ\!\left(\frac{1}{i}\frac{\delta}{\delta\sigma}\right)\right]\tilde W, \tag{7.63}$$

$$\tilde W(j, J, \sigma) = \int(\mathcal{D}\phi)\,(\det E_1^+)^{1/2}\,\exp i\int\left[(\partial_0\phi)(\partial_1\phi) - E_1^-(\partial_0\phi)^2 + \sigma\phi\right]. \tag{7.64}$$

In the coefficient $E_1^-$ the different (non-local) contributions from the quadratic terms in $(\partial_0\phi)^2$ of (7.63) are lumped together. Comparing (7.64) with the path integral

$$\tilde W = \int(\mathcal{D}\sqrt[4]{-g}\,\phi)\,\exp i\int d^2x\left[\sqrt{-g}\,\frac{1}{2}\,g^{\mu\nu}(\partial_\mu\phi)(\partial_\nu\phi) + \sigma\phi\right] \tag{7.65}$$

explains our choice of the symbol $E_1^-$, because it is a generalization of the zweibein component $e_1^-$ which for the EF gauge would appear in this place. Here, by construction, $E_1^+$ and $E_1^-$ depend on the external sources and not on the scalar field. A more general form (with $F'(p_1)\neq 0$) of the “effective zweibein” will be considered in Section 8.
A Gaussian integral like (7.65) leads to the inverse square root of a functional determinant which in D = 2 may be reexpressed as a Polyakov action$^{69}$ [252] ($\Box = g^{\mu\nu}\nabla_\mu\partial_\nu$)

$$\left[\det\sqrt{-g}\,\Box\right]^{-1/2} = \exp iL^{(\rm Pol)}, \tag{7.66}$$

$$L^{(\rm Pol)} = -\frac{1}{96\pi}\int d^2x\int d^2y\,\sqrt{-g}\,R_x\,\Box^{-1}_{xy}\,R_y. \tag{7.67}$$

Then the full expression $\tilde W$ (7.65), written as (7.66), becomes

$$\tilde W = \exp\left[iL^{(\rm Pol)}\right]\exp\left[-\frac{i}{2}\int d^2x\int d^2y\,\sigma_x\,\Delta_{xy}\,\sigma_y\right], \tag{7.68}$$

where the propagator $\Delta = \Delta(j, J) = [\sqrt{-g}\,\Box]^{-1}$ by its dependence on $j$ contains the possible interaction with external zweibeine. For minimally coupled scalars ($F = 1$) it obeys

$$\left(\partial_0\partial_1 - \partial_0E_1^-\partial_0\right)\Delta_{xx'} = \delta^2(x - x'). \tag{7.69}$$

The ansatz

$$\Delta_{xx'} = \int_{x''}M_{xx''}\,(\partial_0^{-1})_{x''x'}\,d^2x'' \tag{7.70}$$

allows the formal computation of $M_{xx'}$ as ($P$ means path ordering)

$$M_{xx'} = \mathcal{P}^{-1}\,\partial_1^{-1}\,\mathcal{P}, \tag{7.71}$$

$$\mathcal{P}(x) = P\exp\left[-\int^{x^1}dy^1\,E_1^-(x^0, y^1)\,\partial_0\right]. \tag{7.72}$$

69 For non-minimal coupling (e.g. $F(p_1)\propto p_1$) its place would be taken by a corresponding quantity generalized to depend on the dilaton field $p_1 = \hat B_1$ expressed in terms of the scalar field and external sources. The effective action proposed in [330] indicates the possible form of such a generalization.
For the classical expressions in the exact path integral the meaning of $\partial_0^{-1}$ as an integration (with undetermined integration constant) is evident. In the quantum case a more careful definition of the Green’s functions is required which implies a UV and IR regularization. A suitable definition is ($\partial_0\to\nabla_0 = \partial_0 - i\mu$)

$$\nabla_0^{-1} = -\theta(y^0 - x^0)\,e^{i\mu(x^0 - y^0)}. \tag{7.73}$$
The regularization parameter $\mu = \mu_0 - i\epsilon$ ($\mu_0\to +0$, $\epsilon\to +0$) guarantees proper behavior at $x^0 - y^0\to\pm\infty$. One easily verifies the same property in $(\nabla_0^{-2})_{xy} = \int_z(\nabla_0^{-1})_{xz}(\nabla_0^{-1})_{zy}$ as well as in higher powers. Only in expressions involving the classical background like $\nabla_0^{-1}\bar p_2$ in (7.40) when (7.41) is inserted this rule must be adapted. With $\nabla_0\bar p_2 = 0$ and thus $\bar p_2 = \bar p_2(x^1)\,e^{i\mu x^0}$ the expression $\nabla_0^{-1}\bar p_2$ diverges. The solution consists in simply going back in these (classical) terms to the classical interpretation where $\nabla_0^{-1} = \partial_0^{-1}$ corresponds to integration.

The formulas for GDTs with non-minimally coupled self interacting scalars can be derived retaining $F\neq 1$ and $f\neq 0$ in (7.38), (7.39). Then the perturbation expansion in terms of Newton’s constant requires an expansion in terms of $F$ already in the solution of these equations. Together with the perturbation theory outlined in connection with a path integral (7.64) this yields rather complicated formulas which, therefore, will not be exhibited here. Only in connection with the “virtual BH” of Section 8 this case will be dealt with.

At the moment, there seems to exist only one computation of higher loop effects for the simpler case of minimally coupled scalars ($F = 1$) and a corresponding dilaton theory without kinetic term ($U = 0$), but the tools are available for arbitrary loop calculations. As noted in Ref. [71] for that specific class of theories the whole two-loop effect is just a renormalization of the potential $V$.

7.4.3. Exact path integral with matter

The JT model (2.12) is an example of a situation where the path integral can be calculated exactly even in the presence of minimally coupled matter fields [69]. There the integration $\int(\mathcal{D}\phi)$ produces the Polyakov action. Thus the generating functional for the Green’s functions reads:

$$W^{\rm JT} = \int(\mathcal{D}q_i)(\mathcal{D}p_i)\,\exp iL^{\rm JT}_{\rm eff}, \tag{7.74}$$

$$L^{\rm JT}_{\rm eff} = \int d^2x\left[p_i\dot q_i + q_1p_2 + \Lambda q_3p_1\right] + L^{(\rm Pol)}(q_2, q_3), \tag{7.75}$$
where source terms $j$, $J$, $\sigma$ and the propagator for the scalar field have been dropped. The crucial observation is that (7.75) is now linear in $p_i$. Therefore, when integrating first over the momenta one obtains three functional delta functions which may be used to integrate over $q_i$. Up to this change, the whole procedure works as before. For details we refer to Ref. [69]. Again, something like local quantum triviality occurs. The action (7.75) already incorporates all quantum effects, because in this case no higher loop corrections exist, a feature used in Refs. [383–385] to extend the one-loop calculations to all orders of perturbation theory.

The method of exact functional integration described here seems to be a rather general one, although it does not seem possible to formulate general criteria of applicability. We just note that in a similar way an exact path integral has been calculated in a different context, namely the Bianchi IX reduction of Ashtekar gravity [386]. As a final remark of this section it should be stressed that in gravity theories, in contrast to quantum field theory in Minkowski space, there is no immediate relation between classical and quantum integrability.

8. Virtual black hole and S-matrix

Only the interaction with matter in D = 2 provides continuous physical degrees of freedom. Since the asymptotic states depend on the model under consideration we do not discuss the most general case, but for illustrating the main technical details we focus on an explicit example instead: SRG with a non-minimally coupled massless scalar field. We also select a situation in which the existence of an S-matrix in the usual quantum field theoretic sense should be unchallenged, namely gravitational scattering of scalars in asymptotically flat space, however, without fixing the background further before quantization.

In Section 7 we have demonstrated that all geometrical degrees of freedom can be integrated out exactly.
This procedure yields effective non-local interactions of the remaining fields (scalars in our case). In this section we calculate explicitly some lowest order effective vertices and ensuing tree level S-matrix elements corresponding to gravitational scattering of s-wave scalars. The results can be interpreted as an exchange of virtual black holes.

The vertices are extracted from the action (7.62) after separation of the interaction part Z (cf. (7.63)). They appear as complicated non-local expressions with multiple integrals in $x^0$ from repeated multiplications of the formal object $\partial_0^{-1}$ present in $\hat{B}_i$, the solutions for $p_i$ of Eqs. (7.37)–(7.39). There exist two classes of vertices (we have attached all outer legs in the formulae below), symmetric ones

$$V_a^{(2n)} = \int d^2x_1 \cdots d^2x_n\; v_a^{(2n)}(x_1,\dots,x_n)\,(\partial_0\phi)^2_{x_1}\cdots(\partial_0\phi)^2_{x_n}, \tag{8.1}$$

and non-symmetric ones$^{70}$

$$V_b^{(2n)} = \int d^2x_1 \cdots d^2x_n\; v_b^{(2n)}(x_1,\dots,x_n)\,(\partial_0\phi\,\partial_1\phi)_{x_1}(\partial_0\phi)^2_{x_2}\cdots(\partial_0\phi)^2_{x_n}. \tag{8.2}$$

$^{70}$ For the evaluation of the S-matrix one has to permute all external legs and thus leg-exchange symmetry is restored.
They have the following properties:
• They contain an even number of outer legs. Thus, in addition to a propagator term (cf. e.g. (7.68) for the simpler case of minimally coupled scalars) there are $\phi^4$-vertices, $\phi^6$-vertices and so on.
• Each pair of outer legs is attached at one point $x_i$ to the non-local vertex.
• Each outer leg contains one derivative.
• Non-locality is inherited from the $\partial_0^{-1}$ operators in the $\hat{B}_i$.
• The symmetric vertices originate from the term $\tilde{L}$ in (7.62).
• The non-symmetric vertices are produced on the one hand by $F(\hat{B}_1)\,\partial_0\phi\,\partial_1\phi$ in (7.62), on the other hand also $\tilde{L}$ yields such terms. For minimal coupling all non-symmetric vertices vanish.
• The information contained in the tree-graphs is classical. Thus, it must be possible to extract it by other means. Nevertheless, the path integral seems to be the most adequate language to derive scattering amplitudes.$^{71}$

The lowest order tree-graph and the ensuing S-matrix element has been evaluated in Ref. [95] for $F(X)$ = const. The result was trivial, unless mass terms $f(\phi) = m^2\phi^2$ had been added (cf. the discussion after Eq. (8.23)). Therefore, we will focus in the rest of this section on the (also phenomenologically more relevant) case of non-minimal coupling [96].

8.1. Non-minimal coupling, spherically reduced gravity

In principle, all effective interactions of the scalars can be extracted by expanding the non-local action (7.62) in a power series of $\phi$. At each order the number of integrations increases, and one has to fix the ranges appropriately. This becomes cumbersome already at the $\phi^4$ level. Fortunately, two observations [70] simplify the calculations considerably. First, instead of dealing with complicated non-local kernels one may solve corresponding differential equations. All ambiguities are then removed by imposing asymptotic conditions on the solutions.
Second, instead of taking the $n$th functional derivative of the action with respect to bilinear combinations of the scalar, the matter fields may be localized at $n$ different space–time points. This mimics the effect of functional differentiation. To be more specific, the symmetric vertex (8.1) may serve as an example

$$v_a^{(2n)}(x_1,\dots,x_n) \propto \frac{\delta^n L^{\rm eff}}{\delta((\partial_0\phi(x_1))^2)\cdots\delta((\partial_0\phi(x_n))^2)}. \tag{8.3}$$

By its definition, the functional derivative is a response to a small localized change of the functional argument ($(\partial_0\phi)^2$ in the present case). Therefore, let us choose a specific matter distribution such that $(\partial_0\phi)^2$ is localized at $n-1$ points:

$$(\partial_0\phi)^2(x) \propto \sum_{k=2}^{n} c^{[k]}\,\delta^2(x - x_k). \tag{8.4}$$

$^{71}$ As in other well-known examples, e.g. the Klein–Nishina formula for relativistic Compton scattering [387], this formalism seems to be much superior to a classical computation.
Now let $X(x_1)$ and $E_1^-$ be solutions of the classical field equations in the presence of localized matter (8.4). In

$$v_a^{(2n)}(x_1,\dots,x_n) \propto F(X(x_1))\,E_1^-(x_1)[x_2,\dots,x_n]\Big|_{\Pi c} \tag{8.5}$$

we have indicated the dependence of $X$ and $E_1^-$ on the matter distribution. The notation $\Pi c$ after the vertical line means that one has to expand in $c^{[k]}$ and to select the coefficient in front of the product $\prod_{k=2}^{n} c^{[k]}$. The proof of this statement with explicit coefficients instead of the proportionality symbol as well as a corresponding argument for the non-symmetric vertex (8.2) can be found in Refs. [70,95,96].

Qualitatively, the result (8.5) is rather easy to understand. In the classical action $(\partial_0\phi)^2$ appears multiplied by $F(X)e_1^-$. Due to local quantum triviality in the absence of matter it is natural to expect “effective” quantities of the same nature in the vertices.

The most interesting case is SRG [96], where $V = -2$, $U = -(2p_1)^{-1}$, $f(\phi) = 0$, $F = p_1/2$, $j_i = J_i = Q = 0$. To extract the terms quartic in $\phi$ ($n = 2$ in (8.1), (8.2)) one has to take a matter distribution localized at one space–time point (cf. (8.4))

$$\Phi_0 := \tfrac{1}{2}(\partial_0\phi)^2 \to c_0\,\delta^2(x - y), \tag{8.6}$$

$$\Phi_1 := \tfrac{1}{2}(\partial_0\phi)(\partial_1\phi) \to c_1\,\delta^2(x - y), \tag{8.7}$$

and to solve the classical e.o.m.’s up to linear order in the constants $c_0$ and $c_1$ which just keep track of the number of sources. The differential equations (7.37)–(7.39), together with classical equations for $q_i$ become

$$\begin{aligned}
\partial_0 q_1 &= \frac{q_3 p_2 p_3}{2p_1^2} + \Phi_1 - q_2\Phi_0, &\qquad \partial_0 p_1 &= p_2,\\
\partial_0 q_2 &= -q_1 - \frac{q_3 p_3}{2p_1}, &\qquad \partial_0 p_2 &= p_1\Phi_0,\\
\partial_0 p_3 &= 2 + \frac{p_2 p_3}{2p_1}, &\qquad \partial_0 q_3 &= -\frac{q_3 p_2}{2p_1}.
\end{aligned} \tag{8.8}$$

Their solutions, to be substituted back into the action with the inverse replacement (8.6) and (8.7), are found easily:

$$p_1(x) = x^0 - (x^0 - y^0)\,c_0\,y^0\,h(x,y), \tag{8.9}$$

$$p_2(x) = 1 - c_0\,y^0\,h(x,y), \tag{8.10}$$

$$q_2(x) = 4\sqrt{p_1} + \left(8c_0\,y^0\sqrt{p_1} - 2c_0\,(y^0)^{3/2} - c_1\,y^0 + \left(c_1 - 6c_0\,(y^0)^{1/2}\right)p_1\right)h(x,y), \tag{8.11}$$

$$q_3(x) = \frac{1}{\sqrt{p_1}}. \tag{8.12}$$

Here $h(x,y) := \theta(y^0 - x^0)\,\delta(x^1 - y^1)$ corresponds to one of the possible prescriptions introduced in Ref. [95] for the boundary values at $x^0 \to \infty$. It turns out that the vertices below are independent of any such choice. The matching conditions at $x^0 = y^0$ follow from continuity properties: $p_1$, $q_2$
and $q_3$ are $C^0$ and $\partial_0 q_2(y^0 + 0) - \partial_0 q_2(y^0 - 0) = -(c_1 - q_2(y^0)c_0)\,\delta(x^1 - y^1)$. Integration constants which would produce an asymptotic (i.e. for $x^0 \to \infty$) Schwarzschild term have been fixed to zero. Consistency of integration constants with the set of e.o.m.’s containing $\partial_1$ automatically yields a vanishing Rindler term. Furthermore, it relates the asymptotic Schwarzschild term to the asymptotic value for the geometric part of the conserved quantity [70]. Thus, only four integration constants can be chosen independently. We fix those in $p_i$ and $q_3$. Because of our particular choice $p_3(x^0 \to \infty) = 0$ a BH may appear only at an intermediate stage (the “virtual black hole”, see below), but should not act asymptotically. Due to the infinite range of gravity this is necessary for a proper S-matrix element if spherical waves are used as asymptotic states for incoming and outgoing scalar particles.

8.2. Effective line element

The arguments of the previous section suggest that matter interacts with some effective geometry which solves the classical e.o.m.’s in the presence of external sources. Moreover, this geometry can be extracted directly from the vertices. A more formal (but essentially equivalent) way to see this is to calculate the vacuum expectation values of $q_2$ and $q_3$ by varying the exact path integral (7.61) with respect to $j_2$ and $j_3$ in lowest order of the matter loop expansion and in the presence of external matter field. The method described in the previous subsection appears to be more straightforward and considerably simpler. The matter-dependent solutions in the gauge (3.3) with (8.11) and (8.12) define an effective line element

$$(ds)^2 = 2q_3\,dx^0(dx^1 + q_2\,dx^0) = 2\,dr\,du + K(r,u)(du)^2, \tag{8.13}$$

with the identifications$^{72}$

$$u = 2\sqrt{2}\,x^1, \qquad r = \sqrt{p_1(x^0)/2}. \tag{8.14}$$

$^{72}$ The somewhat unusual role of the coordinates should be noted: $x^0$ is asymptotically proportional to $r^2$; thus our Hamiltonian evolves with respect to a “radius” as “time”-parameter. This also implies that e.g. the asymptotic energy density is related to the component $T_{11}$ and not $T_{00}$ of the energy–momentum tensor.

In the asymptotic region by our previous residual gauge fixing the Killing-norm $K(r,u)|_{x^0 > y^0} = 1$ is constant. The line element (8.13) then appears in outgoing Sachs–Bondi form. In the VBH region the Killing-norm

$$K(r,u)|_{x^0 < y^0} = \left(1 - \frac{2m}{r} - ar + d\right)(1 + O(c_0)), \tag{8.15}$$

with $m = \delta(x^1 - y^1)\,(c_1 y^0 + 2c_0 (y^0)^{3/2})/2^{7/2}$, $a = \delta(x^1 - y^1)\,(6c_0 (y^0)^{1/2} - c_1)/2^{3/2}$ and $d = \delta(x^1 - y^1)\,2c_0 y^0$ has two zeros located approximately at $r = 2m$ and $r = 1/a$ corresponding for positive $m$ and $a$ to a Schwarzschild horizon and a Rindler-type one.

8.3. Virtual black hole

The geometric part of the conserved quantity (3.14) in our present notation (3.35) reads

$$C^{(g)} = \frac{p_2 p_3}{\sqrt{p_1}} - 4\sqrt{p_1}. \tag{8.16}$$
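The statement that the Killing norm (8.15) has two zeros near $r = 2m$ and $r = 1/a$ can be checked numerically. The following is only a minimal sketch: it assumes the leading-order form $K(r) = 1 - 2m/r - ar + d$, drops the $O(c_0)$ factor, and treats $m$, $a$ and $d$ as ordinary small positive numbers (in the text they carry a factor $\delta(x^1 - y^1)$). The function names are hypothetical and not taken from the references.

```python
import math


def killing_norm(r, m, a, d):
    """Leading-order Killing norm K(r) = 1 - 2m/r - a*r + d (assumed form)."""
    return 1.0 - 2.0 * m / r - a * r + d


def killing_zeros(m, a, d):
    """Return the two zeros (r_inner, r_outer) of K(r) for m > 0 and a > 0.

    K(r) = 0 is equivalent to the quadratic a*r**2 - (1 + d)*r + 2*m = 0.
    The smaller root is computed from the product of the roots (2m/a) to
    avoid cancellation when a*m is tiny.
    """
    if m <= 0 or a <= 0:
        raise ValueError("this sketch assumes positive m and a")
    disc = (1.0 + d) ** 2 - 8.0 * a * m
    if disc < 0:
        raise ValueError("no real zeros for these parameters")
    s = math.sqrt(disc)
    r_outer = ((1.0 + d) + s) / (2.0 * a)   # Rindler-type horizon, close to 1/a
    r_inner = 4.0 * m / ((1.0 + d) + s)     # Schwarzschild-type horizon, close to 2m
    return r_inner, r_outer


if __name__ == "__main__":
    m, a, d = 1e-3, 1e-3, 0.0
    r_in, r_out = killing_zeros(m, a, d)
    print("inner zero / (2m):", r_in / (2 * m))
    print("outer zero * a   :", r_out * a)
```

For $a\,m \ll 1$ and $d \ll 1$ both ratios printed by the sketch are close to one, which is the sense in which the zeros sit "approximately" at $2m$ and $1/a$.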
Fig. 8.1. CP diagram of the VBH.
As a consequence of the choice of integration constants $C^{(g)}$ vanishes in the asymptotic region $x^0 > y^0$. The functions $p_1$ and $p_3$ are continuous, but $p_2$ jumps at $x^0 = y^0$. Thus, $C^{(g)}$ is discontinuous. This phenomenon has been called “virtual black hole” (VBH) in [95]. It is generic rather than an artifact of our special choice of asymptotic conditions. The reason why we have chosen this name is simple: the geometric part of the conserved quantity (8.16) is essentially equivalent to the so-called mass aspect function, which is closely related to the BH mass (cf. Section 5.2). Moreover, inspection of the Killing-norm (8.15) reveals that for negligible Rindler acceleration $a$ the Schwarzschild horizon corresponds to a BH with precisely that mass. It disappears in the asymptotic states (by construction), but mediates an interaction between them. The idea that BHs must be considered in the S-matrix together with elementary matter fields has been put forward some time ago [388]. The approach [96] reviewed here made it possible, for the first time, to derive (rather than to conjecture) the appearance of the BH states in the quantum scattering matrix of gravity. Solutions (8.9) and (8.10) establish

$$C^{(g)}\big|_{x^0 < y^0} = 4c_0\,(y^0)^{3/2} \propto -m_{\rm VBH}. \tag{8.17}$$
Thus, $c_1$ only enters the Rindler term in the Killing-norm, but not the VBH mass (8.17).

The CP diagram corresponding to the line element (8.15) as presented in Fig. 8.1 needs some explanations: first of all, the effective line element is non-local in the sense that it depends not only on one set of coordinates (e.g. $u$, $r$) but on two ($x = (u, r)$, $y = (u_0, r_0)$), where $r_0$ and $u_0$ are related to $y^0$ and $y^1$ like $r$ and $u$ to $x^0$ and $x^1$ in (8.14). As discussed previously, this non-locality was a consequence of integrating out geometry non-perturbatively. For each choice of $y$, it is possible to draw an ordinary CP-diagram treating $u_0$, $r_0$ as external parameters. The light-like “cut” in Fig. 8.1 corresponds to $u = u_0$ and the endpoint labelled by $y$ to the point $x = y$. The non-trivial part of our effective geometry is concentrated on the cut. We do not want to suggest to take the effective geometry (8.13) at face value: this would be like over-interpreting the role of virtual particles in a loop diagram. It is a non-local entity and we still have to “sum” (read: integrate) over all possible geometries of this type in order to obtain the non-local vertices and the scattering amplitude. Nonetheless, the simplicity of this geometry and the fact that all possible configurations are summed over, are nice qualitative features of this picture.

Fig. 8.2. Total $V^{(4)}$-vertex with outer legs (sum of the two graphs $V^{(4)}(x,y)_a$ and $V^{(4)}(x,y)_b$, with outer legs $\partial_0\phi$, one leg $\partial_1\phi$ in the second graph, and momenta $q$, $q'$, $k$, $k'$).

The localization of “mass” and “Rindler acceleration” on a light-like cut (see Fig. 8.1) in (8.13) is not an artifact of an accidental gauge choice, but has a physical interpretation in terms of the Ricci-scalar [389], the explicit form of which is given by [97]

$$R^{(\rm VBH)}(u,r;u_0,r_0) = \delta(u - u_0)\left[-\delta(r - r_0)\left(\frac{4m_0}{r^2} + 6a_0 - \frac{4d}{r}\right) + \Theta(r_0 - r)\left(\frac{6a_0}{r} - \frac{2d}{r^2}\right)\right]. \tag{8.18}$$

As discussed in Ref. [97] certain parallels to Hawking’s Euclidean VBHs [390] can be observed, but also essential differences. The main one is our Minkowski signature which we deem to be a positive feature.

8.4. Non-local $\phi^4$ vertices

All integration constants have been fixed by the arguments in the preceding paragraphs. The fourth-order vertex of quantum field theory is extracted from the second line of (7.34) by collecting the terms linear in $c_0$ and $c_1$ replacing each by $\Phi_0$ and $\Phi_1$, respectively. The tree graphs we obtain in that way (cf. Fig. 8.2) contain the non-local vertices

$$\begin{aligned}
V_a^{(4)} &= \int_x\int_y \Phi_0(x)\Phi_0(y)\left(\frac{dq_2}{dc_0}\,p_1 + q_2\,\frac{dp_1}{dc_0}\right)\bigg|_{c_i=0}\\
&= \int_x\int_y \Phi_0(x)\Phi_0(y)\,\left|\sqrt{y^0} - \sqrt{x^0}\right|\sqrt{x^0y^0}\,\left(3x^0 + 3y^0 + 2\sqrt{x^0y^0}\right)\delta(x^1 - y^1),
\end{aligned} \tag{8.19}$$
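and

$$\begin{aligned}
V_b^{(4)} &= -\int_x\int_y \left(\Phi_0(y)\Phi_1(x)\,\frac{dp_1}{dc_0} - \Phi_0(x)\Phi_1(y)\,p_1\,\frac{dq_2}{dc_1}\right)\bigg|_{c_i=0}\\
&= -\int_x\int_y \Phi_0(x)\Phi_1(y)\,|x^0 - y^0|\,x^0\,\delta(x^1 - y^1),
\end{aligned} \tag{8.20}$$

with $\int_x := \int_0^\infty dx^0 \int_{-\infty}^{\infty} dx^1$.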
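The kernels in (8.19) and (8.20) can be cross-checked against the solutions (8.9) and (8.11) by differentiating numerically with respect to $c_0$ and $c_1$. The sketch below is illustrative only and rests on stated assumptions: it works in the region $x^0 < y^0$ (where $\theta(y^0 - x^0) = 1$), strips off the common factor $\delta(x^1 - y^1)$, and uses $q_2 = 4\sqrt{p_1} + \big(8c_0 y^0\sqrt{p_1} - 2c_0 (y^0)^{3/2} - c_1 y^0 + (c_1 - 6c_0\sqrt{y^0})\,p_1\big)$ for (8.11). All function names are hypothetical.

```python
import math


def p1(x0, y0, c0):
    """p_1 of (8.9) for x0 < y0, delta function in x^1 stripped off."""
    return x0 - (x0 - y0) * c0 * y0


def q2(x0, y0, c0, c1):
    """q_2 of (8.11) for x0 < y0 (assumed exponents, see lead-in)."""
    p = p1(x0, y0, c0)
    bracket = (8 * c0 * y0 * math.sqrt(p) - 2 * c0 * y0 ** 1.5
               - c1 * y0 + (c1 - 6 * c0 * math.sqrt(y0)) * p)
    return 4 * math.sqrt(p) + bracket


def ddc(f, eps=1e-6):
    """Central finite difference of f at c = 0."""
    return (f(eps) - f(-eps)) / (2 * eps)


def kernel_a(x0, y0):
    """Symmetric kernel of (8.19) without the delta function."""
    s = math.sqrt(x0 * y0)
    return abs(math.sqrt(y0) - math.sqrt(x0)) * s * (3 * x0 + 3 * y0 + 2 * s)


def kernel_b(x0, y0):
    """Kernel of (8.20) multiplying Phi_0(x) Phi_1(y), without the delta."""
    return abs(x0 - y0) * x0


if __name__ == "__main__":
    x0, y0 = 1.0, 4.0  # any pair with 0 < x0 < y0
    # d(q2 * p1)/dc0 at c_i = 0; the one-sided result is twice the symmetric
    # kernel, because symmetrizing under x <-> y supplies a factor 1/2.
    da = ddc(lambda c: q2(x0, y0, c, 0.0) * p1(x0, y0, c))
    print(da, 2 * kernel_a(x0, y0))
    # Second term of (8.20): -p1 * dq2/dc1 gives the kernel directly.
    db = p1(x0, y0, 0.0) * ddc(lambda c: q2(x0, y0, 0.0, c))
    print(-db, kernel_b(x0, y0))
    # First term: dp1/dc0 with field point x0 and source y0 equals the kernel
    # after relabelling x <-> y.
    dp = ddc(lambda c: p1(x0, y0, c))
    print(dp, kernel_b(y0, x0))
```

In this convention the two terms of (8.20) cover the complementary regions $x^0 > y^0$ and $x^0 < y^0$, which is how the absolute value $|x^0 - y^0|$ arises.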
8.5. Scattering amplitude

In terms of the time variable $t := r + u$ the scalar field asymptotically satisfies the spherical wave equation. For proper s-waves only the spherical Bessel function

$$R_{k0}(r) = \frac{\sin(kr)}{kr} \tag{8.21}$$

survives in the mode decomposition ($Dk := 4\pi k^2\,dk$):

$$\phi(r,t) = \frac{1}{(2\pi)^{3/2}}\int_0^\infty \frac{Dk}{\sqrt{2k}}\,R_{k0}\left[a_k^+ e^{ikt} + a_k^- e^{-ikt}\right]. \tag{8.22}$$

The operators $a_k^\pm$ obey the commutation relation $[a_k^-, a_{k'}^+] = \delta(k - k')/(4\pi k^2)$; they will be used to define asymptotic states and to construct the Fock space. The normalization factor is chosen such that the Hamiltonian of asymptotic scalars reads

$$H^{(\rm as)} = \frac{1}{2}\int_0^\infty Dr\,\left[(\partial_t\phi)^2 + (\partial_r\phi)^2\right] = \int_0^\infty Dk\; a_k^+ a_k^-\, k. \tag{8.23}$$

In Ref. [95] we had observed a non-physical feature in the massless case for (in D = 2) minimally coupled scalars: either the S-matrix was divergent or, if the VBH was “plugged” by suitable boundary conditions on $\phi$ at $r = 0$, it vanished. This implied an effective decoupling of the plane waves from the geometry. For massive scalars a finite non-vanishing scattering amplitude has been found.

In the present more physical case of s-waves from D = 4 GR at a first glance it may seem surprising that the simple additional factor $X$ in front of the matter Lagrangian induces fundamental changes in the qualitative behavior. In fact, it causes the partial differential equations (8.8) to become coupled, giving rise to an additional vertex ($V_b^{(4)}$). After a long and tedious calculation (for details see Refs. [142,391]) for the S-matrix element with ingoing modes $q$, $q'$ and outgoing ones $k$, $k'$,

$$T(q,q';k,k') = \frac{1}{2}\,\langle 0|\,a_k^- a_{k'}^-\left(V_a^{(4)} + V_b^{(4)}\right)a_q^+ a_{q'}^+\,|0\rangle, \tag{8.24}$$

having restored$^{73}$ the full dependence on the gravitational constant $\kappa = 8\pi G_N$, we arrive at

$$T(q,q';k,k') = -\frac{i\kappa\,\delta(k + k' - q - q')}{2(4\pi)^4\,|kk'qq'|^{3/2}}\,E^3\,\tilde{T} \tag{8.25}$$

with the conserved total energy $E = q + q' = k + k'$,

$$\tilde{T}(q,q';k,k') := \frac{1}{E^3}\left[\Pi\,\ln\frac{\Pi^2}{E^6} + \frac{1}{\Pi}\sum_{p\in\{k,k',q,q'\}} p^2\,\ln\frac{p^2}{E^2}\cdot\left(3kk'qq' - \frac{1}{2}\sum_{r\neq p}\,\sum_{s\neq r,p}(r^2s^2)\right)\right], \tag{8.26}$$

and the momentum transfer function $\Pi = (k + k')(k - q)(k' - q)$. The interesting part of the scattering amplitude is encoded in the scale-independent factor $\tilde{T}$.

$^{73}$ Up to this point the overall factor in (2.2) had been omitted.
Fig. 8.3. Kinematic plot of s-wave cross section $d\sigma/d\alpha$ (axes: alpha, beta, sigma).
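With the definitions $k = E\alpha$, $k' = E(1 - \alpha)$, $q = E\beta$, and $q' = E(1 - \beta)$ ($\alpha, \beta \in [0,1]$, $E \in \mathbb{R}^+$) a quantity to be interpreted as a cross section for spherical waves can be defined [96]:

$$\frac{d\sigma}{d\alpha} = \frac{1}{4(4\pi)^3}\,\frac{\kappa^2 E^2\,|\tilde{T}(\alpha,\beta)|^2}{(1 - |2\beta - 1|)(1 - \alpha)(1 - \beta)\,\alpha\beta}. \tag{8.27}$$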
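To make the kinematics concrete, the scale-independent factor and the cross section can be evaluated numerically. The following is a sketch under assumptions, not the authors' code: it takes $\tilde{T} = E^{-3}\big[\Pi\ln(\Pi^2/E^6) + \Pi^{-1}\sum_p p^2\ln(p^2/E^2)\,(3kk'qq' - \tfrac12\sum_{r\neq p}\sum_{s\neq r,p} r^2s^2)\big]$ with $\Pi = (k+k')(k-q)(k'-q)$ for (8.26), and the form of (8.27) quoted above. The function names `t_tilde` and `dsigma_dalpha` are hypothetical, and the points $\alpha = \beta$, $\alpha + \beta = 1$ and the edges of the unit square are excluded because $\Pi$ or a momentum vanishes there.

```python
import math
from itertools import permutations


def t_tilde(q, qp, k, kp):
    """Scale-independent factor of (8.26) for momenta q, q' (in) and k, k' (out)."""
    energy = q + qp
    pi_fn = (k + kp) * (k - q) * (kp - q)  # momentum transfer function
    legs = [k, kp, q, qp]
    total = pi_fn * math.log(pi_fn ** 2 / energy ** 6)
    leg_sum = 0.0
    for i, p in enumerate(legs):
        others = legs[:i] + legs[i + 1:]
        # (1/2) * sum over ordered pairs (r, s) of the three remaining legs
        pair_sum = 0.5 * sum(r * r * s * s for r, s in permutations(others, 2))
        leg_sum += p * p * math.log(p * p / energy ** 2) * (3 * k * kp * q * qp - pair_sum)
    return (total + leg_sum / pi_fn) / energy ** 3


def dsigma_dalpha(alpha, beta, energy, kappa):
    """Cross section (8.27) with k = E*alpha, k' = E*(1-alpha), q = E*beta, q' = E*(1-beta)."""
    k, kp = energy * alpha, energy * (1 - alpha)
    q, qp = energy * beta, energy * (1 - beta)
    amp = t_tilde(q, qp, k, kp)
    denom = (1 - abs(2 * beta - 1)) * (1 - alpha) * (1 - beta) * alpha * beta
    return kappa ** 2 * energy ** 2 * amp ** 2 / (4 * (4 * math.pi) ** 3 * denom)


if __name__ == "__main__":
    # Same (alpha, beta), two different total energies: t_tilde should agree.
    print(t_tilde(0.6, 0.4, 0.3, 0.7), t_tilde(2.22, 1.48, 1.11, 2.59))
    print(dsigma_dalpha(0.3, 0.6, 1.0, 1.0))
```

Within this sketch $\tilde{T}$ depends only on $\alpha$ and $\beta$, so the whole energy dependence of the cross section sits in the prefactor $E^2$.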
The kinematic plot, Fig. 8.3, contains the relevant physical information. The dependence of the cross section on the total incoming energy is trivially given by the monomial prefactor $E^2$: it vanishes in the IR limit and diverges quadratically in the UV limit. At least the last fact is not surprising, considering our assumption of energies being small as compared to the Planck energy. It simply signals the breakdown of our perturbation theory. The main results of the detailed discussion [96,97] are:
• Poles exist in the case of vanishing momentum transfer (forward scattering).
• An ingoing s-wave can decay into three outgoing ones. Although this may be expected on general grounds, within the present formalism it is possible to provide explicit results for the decay rate.
• Despite the non-locality of the effective theory, the S-matrix is CPT invariant at tree level.
• Fig. 8.3 appears to exhibit self-similarity. Indeed, by zooming into the center of that figure one obtains again an identically looking plot. However, this self-similarity is only a leading (and next-to-leading) order effect and breaks down in the flat regions.

8.6. Implications for the information paradox

Very roughly, the information paradox [392] may be formulated in the following way. Imagine a pure quantum state in a non-singular asymptotically Minkowski space–time. Let this pure state collapse into a BH which evaporates due to the Hawking effect. This effect is only understood for a background which does not change appreciably due to radiation: if it does, it is, nevertheless, assumed that this evaporation proceeds through an (unknown!) final phase so that the BH disappears. The final state of this process will be Minkowski space filled with thermal radiation which is definitely a mixed quantum state. Therefore, a pure quantum state seems to evolve into a mixed one, contradicting
basic laws of quantum mechanics. One reason already has been given why this picture is a rather approximate one. In addition, the exact thermal Planck spectrum of the radiation requires infinite time for its formation, i.e. radiation can be strictly thermal only if the BH never disappears completely. However, the formation of even approximately thermal final states seems very difficult to master in quantum theory which has prompted several very interesting developments in quantum gravity, as e.g. models of stable BH remnants [393] and the S-matrix approach of Ref. [394].

The understanding of the evolution of VBHs is crucial for quantum gravity [395]. Their evaporation (or, more exactly, their conversion to mixed states) would inevitably violate either locality or energy–momentum conservation [396]. Since the non-perturbative approach to path integral quantization now also predicts VBHs in two dimensions, it is important to understand whether there is an “information loss” in these models. Of course, there is none at the tree level discussed above. However, we also have good grounds to believe that the situation will remain the same in higher loops.

Our first argument is somewhat formal. In the two-dimensional model we were able to extract the VBH from the degrees of freedom already present in the theory rather than to be forced to introduce it from the outside. The whole system has been quantized in full accord with the general principles of the quantization for systems with constraints. According to general theorems [361] the resulting quantum theory must be unitary, respect causality and energy conservation, and must forbid transitions of pure states to mixed ones, as long as we are able to refer to a Fock space of the asymptotic states.

Our next argument is more physical. BH evaporation is related to the condition (6.25) which fixes the energy–momentum tensor at the horizon and thus defines the Unruh vacuum state. This condition is clearly not applicable to VBHs. The relevant vacuum state for the scalar field is just the usual Minkowski space vacuum containing no information about VBH states which may be formed in quantum scattering. Kruskal coordinates for a VBH cannot be associated with any real observer. Therefore, the argument that the energy–momentum tensor must be finite at the horizon is not applicable to it. The only vacuum state which can be defined by a condition at infinity rather than on a horizon is the Boulware vacuum which does not contain Hawking radiation so that VBHs do not radiate anything to infinity.

It must be admitted that in order to put this argumentation upon a firm basis, one should calculate the next (one-loop) order in the path integral. Since Hawking radiation is a one-loop effect, this order of the perturbation theory will be actually sufficient.

9. Canonical quantization

Canonical quantization methods dominated 2D dilaton gravity during its early years. They owe their success to the fact that the geometrical sector contains no propagating degrees of freedom, and, therefore, the problem reduces effectively to a quantum mechanical one. After the extensive discussion of the path integral in the previous section we intend to be brief for this essentially equivalent approach.

The prehistory of canonical quantization of gravity involves the seminal papers of Arnowitt et al. [271], Wheeler [12] and DeWitt [13] which led to Misner’s “minisuperspace quantization” program [397], where almost all degrees of freedom were frozen by symmetry requirements. Kuchař extended these techniques to “midisuperspace quantization” for the
explicit example of cylindrical gravitational waves [137], i.e. to a system with field degrees of freedom, albeit still using symmetry requirements in order to simplify the formalism. This seems to be the only midisuperspace model that could be treated exactly. The most notable example of a non-soluble model is the collapse of spherically symmetric matter [33] (cf. [19] for an essential correction to that paper). A canonical treatment of a complete Schwarzschild space–time under somewhat too strong assumptions provided Lund’s proof [398] of the non-existence of an extrinsic time representation for vacuum Schwarzschild BHs [399]. Possibly due to the impact of Hawking’s work on semi-classical radiation of BHs [18] the discussion of genuine quantum gravity effects was postponed until the CGHS model [52] rekindled the interest in (exact) quantization of BHs [32,106,145,146,149,163,165,202,399–408]. In particular, in Ref. [32] an extrinsic time representation for the quantized Schwarzschild BH made it possible to circumvent Lund’s no-go theorem by relaxing its premises.

As an explicit example for demonstrating the main points we focus on the CGHS model, the Dirac quantization of which has been studied by Jackiw and collaborators [146,147,402–404], by Mikovic [145,405,406], and later also by other authors. Our brief summary follows the work of Kuchař et al. [149].

The starting point is not really the CGHS action (2.8), but its conformally related one (4.40). In this way, one eliminates the kinetic term of the dilaton field in (2.9) at the cost of a singular conformal transformation (4.39). This action is then cast into canonical form by the standard ADM decomposition. It has to be supplemented by surface terms invoking the requirement of functional differentiability.$^{74}$ However, the boundary action leads to an important caveat: at the left and right infinity corresponding to the asymptotic regions of patch A and patch B in Figs. 3.5 and 3.6 arbitrary variations of the lapse are required. Otherwise unwanted “natural” boundary conditions for the BH mass emerge which imply vanishing of the BH mass. This problem has been resolved for the Schwarzschild BH [32], parameterizing the lapse function at the boundaries by a proper time function.

$^{74}$ The physically most transparent way to impose it is a careful treatment of asymptotic conditions on the geometric variables [409] (including lapse and shift; cf. also Section 5.1).

It turns out that the total action

$$L = \int d^2x\,\left(\pi_\phi\dot{\phi} + \pi_y\dot{y} + p_\sigma\dot{\sigma} - NH - N^1H_1\right) + \int dt\,\left(\dot{\tau}_L m_L - \dot{\tau}_R m_R\right), \tag{9.1}$$

depends on these two additional parameters $\tau_L$ and $\tau_R$ and on the standard canonical variables: $N$ is the lapse, $N^1$ the shift, $H$ the Hamiltonian constraint, $H_1$ the momentum constraint, $\pi_\phi$, $\phi$ denote the matter degree of freedom (the presence of a single minimally coupled scalar field is assumed), and in the notation of Ref. [149] $y$, $\pi_y$, $\sigma$, $p_\sigma$ are geometric canonical field variables. In the boundary action the indices $L$, $R$ refer to “left infinity” and “right infinity”, $m_{L,R}$ are the (conveniently normalized) BH masses. The relative sign between the last two terms originates from the different time orientations one chooses for the two patches in order to match the behavior of the Killing-time $T$ in the corresponding global diagram.

There exists a canonical transformation mapping the action (9.1) onto a simpler one (in terms of which the constraints become second-order polynomials). One has to be particularly careful with the boundary part. In the new variables the constraints can be solved exactly, because they have the same form as those of a parameterized massless scalar field propagating on a flat 2D background. The main obstacle in replacing the canonical variables by corresponding operators is a Schwinger
term encountered in the commutators of the energy–momentum tensor operators [410]. This anomaly converts the classical first class constraints into quantum second class ones and thus the imposition of the operator constraints on the states leads to inconsistencies.$^{75}$ Kuchař proposed a trick to get rid of that anomaly in the Schrödinger picture [411,412]: the momentum operators are supplemented by an additional term which does not change the canonical commutation relations but which cancels the anomaly.

$^{75}$ This is true at least in the Schrödinger picture. In the Heisenberg picture the quantum theory is well-defined and has the same number of degrees of freedom as the classical one. Indeed, also the Heisenberg e.o.m.’s have the same form as the classical e.o.m.’s (of course, one has two additional quantum mechanical degrees of freedom from the two parameters in the boundary action, but they are just constants of motion).

To summarize: by performing first a conformal and then a canonical transformation the CGHS model was mapped onto a parameterized field theory on a flat background which could be quantized successfully. Clearly, this quantum theory of the parameterized field is a standard unitary quantum field theory, i.e. no information loss is encountered. However, the interesting questions are precisely those related to the physical space–time. Thus, one still has to show how to pose such questions in the framework based on the auxiliary flat background. As emphasized by the authors themselves [32,149] it seems that difficult problems reemerge which were avoided so far:
• It is not clear how to make sense of the operator version of the physical line element $(ds)^2_{\rm physical} = (ds)^2_{\rm flat}\exp(-2\rho)$.
• The “correct” operator ordering of the conformal factor is an open question when $\rho$ is expressed in terms of the auxiliary canonical variables.
• The classical dilaton field should remain positive to ensure the correct signature of the physical metric. In a quantum theory it is highly non-trivial to maintain this positivity requirement. There have been attempts to clarify this issue with a (1 + 0)-dimensional model [413].

Besides, the presence of an anomaly may add difficulties in implementation of the Dirac quantization scheme [404].

If there is no matter field in the model the canonical quantization is especially simple. The reduced phase space quantization program$^{76}$ can be carried through exactly to the very end, i.e. one can solve the constraints and fix the gauge freedom. However, the result is essentially trivial for spherically reduced gravity [32,38,415,416], as well as for the other dilaton models [417]: the quantum functional only depends on the ADM mass. Since the Maxwell field in two dimensions does not add new propagating degrees of freedom an extension of the canonical approach to charged BHs may be done in a rather straightforward manner (cf. [289,418]).

$^{76}$ A clear explanation of the reduced phase space quantization can be found in [414].

Other instances where the programme of canonical quantization has been carried through in essentially quantum mechanical models like the collapse of spherically symmetric (null-)dust are Refs. [419–424]. An example for a semiclassical model of BH evolution with time variable is Ref. [425].
10. Conclusions and discussion

The last decade has seen remarkable progress in the treatment of 2D dilaton gravity models. Having retraced the main historical lines of this development in the Introduction we now confront the main results of this field with a list of well-known problems which quantum gravity, in fact, shares with other quantum theories in which geometry plays a fundamental dynamical role. First we summarize the main contributions which dilaton gravity has been able to provide with respect to these questions and which we have described in some detail in this report.

Dilaton models in D = 2 possess the basic advantage that their geometric part, in a certain sense, is a “topological” theory, albeit one where the solutions are not related to a discrete winding number. An important special case is the theory which arises from spherical reduction of Einstein gravity in D = 4, i.e. also the treatment of Schwarzschild black holes is covered by it. Other relevant members are the string-inspired dilaton theory and the Jackiw–Teitelboim model.

In the absence of matter, the classical solution for all such theories can be given in closed analytic form, a result which appears more naturally in the Eddington–Finkelstein gauge for the metric. In that gauge also a very straightforward procedure allows the construction of the global solution without the necessity to introduce explicitly or implicitly global Kruskal-like coordinates. It is a peculiar feature of effective two dimensions of space–time that the ADM mass, even in the presence of matter interactions, generalizes to an “absolute” (in space and time!) conserved quantity.

Technically, many new results are related to the complete dynamical equivalence between the standard formulation of dilaton theories by an action expressed in terms of metric and dilaton field on the one hand, and a “first order” (“covariant Hamiltonian”) action on the other hand. The latter involves auxiliary fields and the geometry is expressed in Cartan variables (zweibeine and spin connection). This equivalent formulation also contains non-trivial torsion and turns out to represent a special case of the very general concept of Poisson-Sigma models, a new and rapidly developing field of research with important connections to strings and non-commutative geometry. Certain generalizations, as e.g. Yang–Mills fields or supergravity extensions are covered directly by this formalism.

Strictly speaking, the (semi-classical) treatment of Hawking radiation does not represent an application of quantum gravity but it is formulated with respect to a given classical (Black Hole) background. Nevertheless, in order to justify other 2D quantum gravity results derived from an effective 2D theory, it should emerge as well from a treatment of the spherically reduced case. As far as (in D = 2) minimally coupled scalar fields are concerned all aspects are well understood. In spherically reduced matter (non-minimal coupling in D = 2) a correct relation between Hawking temperature and Hawking flux has been proposed, however based upon mathematical steps whose justification as yet has not been proved conclusively. For a wide class of two-dimensional gravity models relations between Hawking temperature and ADM mass can be obtained which differ from the one in Einstein gravity.

The full impact of the advantages from the first-order formulation of dilaton models in D = 2 is revealed in the path integral quantization of such theories. In the temporal gauge for Cartan variables, corresponding to the Eddington–Finkelstein gauge of the metric, it proved possible to exactly integrate out all geometric degrees of freedom. This intrinsically non-perturbative result is closely related to the quantum field theoretical “triviality” of generic gravity theories without matter interactions in D = 2. In this derivation the path integral over all positive and negative “volumes” is
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
an essential ingredient, thus establishing an important confirmation of the conjecture that this should also be the correct procedure in D = 4. If matter fields are present, an effective theory is still obtained in which geometry is treated in a non-perturbative manner. A perturbation expansion in terms of the interactions with matter follows standard quantum field theoretical methods. It is valid as long as the energies are small compared to the Planck mass. The effective non-local vertices of scalar fields in this formulation can be interpreted as the appearance of an intermediate “virtual” Black Hole in certain scattering amplitudes of spherical waves. It should be stressed that this seems to be the first instance where such a virtual Black Hole reflects an intrinsic feature of the theory and is not introduced by any additional assumption.

As far as the problem of observables in quantum gravity is concerned, the computation of a special (gauge-independent) S-matrix element for spherically reduced Einstein gravity seems to be an interesting feature as well. Also some progress has been reported regarding the final stages of Black Hole evaporation and the intimately connected “quantum information paradox”. The very fact that a formulation of that system now exists in the form of a standard quantum field theory implies that, even at the very end of its existence, a Black Hole does not violate quantum mechanical concepts like unitarity.

In that case as in others it has turned out that, at least in D = 2, the application of standard quantum field theoretic techniques can go very far, leading to interesting results without the necessity to invoke additional concepts. This suggests further studies in many directions which have not yet been covered sufficiently.

At the classical level a more systematic search for 2D gravity theories involving matter interactions, but still allowing exact solutions, seems desirable, e.g. for a description of critical behavior in spherical collapse that is perhaps at first only qualitative, but nevertheless exact. The same applies to models with additional Abelian or non-Abelian gauge fields, from which the spherically symmetric Black Hole with (non-Abelian) charges could be studied. Although the general principle to obtain supergravity extensions from 2D dilaton theories is now available, the new comprehensive approach based upon the Poisson-Sigma structure of such models has posed many new questions.

Recently, a whole new field of scalar–tensor theories in D = 4 (quintessence) has been developed. There a dilaton (“Jordan”) field already appears in the higher dimension. Certain important aspects of these models should be accessible by 2D methods when the effective spherically reduced theory is considered.

Within the realm of semi-classical problems, despite new insight into the treatment of Hawking radiation starting from the spherically reduced case, several important questions are still open.

Among the possible directions of research in full 2D quantum gravity, higher loop corrections could be investigated. The issue of “quantum” observables is closely related to the treatment of systems with finite boundary and related boundary variables. Possibly also new elements for the long discussion of quantum gravity at the Big Bang (quantum cosmology) could emerge.

A more immediate consequence of the present approach is a generalization of the gravitational scattering of scalars described in this report to scattering off a Black Hole. Another generalization in the quantum case would be the treatment of fermions, either directly introduced in D = 2 or obtained from D = 4 by reduction. Finally, the virtual Black Hole phenomenon exists for generic dilaton models. It could be interesting to study the S-matrix of gravitational scattering
of matter in the extended context of generalized dilaton theories. The range of technically feasible investigations has now certainly been enlarged substantially.

Acknowledgements

We have profited from numerous enlightening discussions with our previous collaborators S. Alexandrov, M. Bordag, M. Ertl, P. Fischer, F. Haider, D. Hofmann, M.O. Katanaev, T. Klösch, S. Lau, H. Liebl, D.J. Schwarz, T. Strobl, G. Tieber, P. Widerin, A. Zelnikov and the members of the Institute for Theoretical Physics at the TU Vienna (especially H. Balasin and L. Bergamin). The exchange of views and e-mails on the quantization part with P. van Nieuwenhuizen and R. Jackiw is gratefully acknowledged. We also thank those authors who suggested supplementary references. One of the authors (W.K.) is especially grateful to R. Jackiw and G. Segrè who in different ways during the 80s kindled his interest in quantum gravity, especially for models in D = 2. The LaTeX-nical support of F. Hochfellner and E. Mössmer has been a great help for us. Finally, we render special thanks to T. Klösch for letting us “steal” two of his beautiful xfig-pictures. This work has been supported by Project P-14650-TPH of the Austrian Science Foundation (FWF) and by Project BO 1112/11-1 of the Deutsche Forschungsgemeinschaft (DFG).

Appendix A. Spherical reduction of the curvature two-form

In a D-dimensional pseudo-Riemannian manifold $M$ with Lorentzian signature $(+, -, -, \ldots, -)$ and spherical symmetry$^{77}$ the coordinates describing the manifold can be separated into a two-dimensional Lorentzian part spanning the manifold $L$ and a $(D-2)$-dimensional Riemannian angular part constituting an $S^{D-2}$. In adapted coordinates the line element reads

$$ ds^2_M = g_{\mu\nu}\, dx^\mu \otimes dx^\nu = g_{\alpha\beta}\, dx^\alpha \otimes dx^\beta - \Phi^2(x^\alpha)\, g_{\kappa\lambda}\, dx^\kappa \otimes dx^\lambda , \tag{A.1} $$

using letters from the beginning of the alphabet ($\alpha, \beta, \ldots$; $a, b, \ldots$) for quantities connected with $L$, letters from the middle of the alphabet ($\mu, \nu, \ldots$; $m, n, \ldots$) for quantities connected with $M$ and letters from the end of the alphabet for quantities connected with $S^{D-2}$ ($\kappa, \lambda, \ldots$; $r, s, \ldots$). Indices will be lowered and raised with their corresponding metrics.

In the vielbein formalism$^{78}$ $ds^2_L = \eta_{ab}\, \tilde e^a \otimes \tilde e^b$ and $ds^2_S = \delta_{rs}\, \tilde e^r \otimes \tilde e^s$. Comparing with $ds^2_M = \eta_{mn}\, e^m \otimes e^n = ds^2_L - \Phi^2 ds^2_S$ yields

$$ e^a = \tilde e^a , \qquad e^r = \Phi\, \tilde e^r , \qquad e_a = \tilde e_a , \qquad e_r = \Phi^{-1}\, \tilde e_r . \tag{A.2} $$

Metricity and torsionlessness for the connection one-forms on $M$, $L$ and $S$ lead to

$$ \omega^a{}_b = \tilde\omega^a{}_b , \qquad \omega^r{}_s = \tilde\omega^r{}_s , \qquad \omega^r{}_a = (\tilde e_a \Phi)\, \tilde e^r , \qquad \omega^a{}_r = \eta^{ab} \delta_{rs}\, (\tilde e_b \Phi)\, \tilde e^s , \tag{A.3} $$

using relations (A.2).

77 That is, the isometry group of the metric has a group isomorphic to $SO(D-1)$ as subgroup with $S^{D-2}$-spheres as orbits.
78 The notation and the meaning of all quantities appearing here is explained in Section 2.1.1.
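From Cartan’s structure equation (1.25) the curvature two-form on $M$ follows:

$$ R^a{}_b = \tilde R^a{}_b , \tag{A.4} $$

$$ R^r{}_s = \tilde R^r{}_s + \eta^{ab} (\tilde e_a \Phi)(\tilde e_b \Phi)\, \tilde e^r \wedge \tilde e_s , \tag{A.5} $$

$$ R^a{}_r = \eta^{ac} (\tilde e_b \tilde e_c \Phi)\, \tilde e^b \wedge \tilde e_r + (\tilde e_b \Phi)\, \tilde\omega^{ab} \wedge \tilde e_r , \tag{A.6} $$

$$ R^r{}_a = (\tilde e_b \tilde e_a \Phi)\, \tilde e^b \wedge \tilde e^r - (\tilde e_b \Phi)\, \tilde\omega^b{}_a \wedge \tilde e^r , \tag{A.7} $$

where $\tilde R^a{}_b$ and $\tilde R^r{}_s$ are the curvature two-forms on $L$ and $S$, respectively. Contracting the vector indices with the form indices and using $\tilde R^r{}_s = \tilde e^r \wedge \tilde e_s$ yields the curvature scalar

$$ R = R^L - \frac{(D-2)(D-3)}{\Phi^2} \left[ 1 + (\nabla_\alpha \Phi)(\nabla^\alpha \Phi) \right] - 2\, \frac{D-2}{\Phi}\, (\Box \Phi) , \tag{A.8} $$

where $\nabla_\alpha$ is the covariant derivative with respect to the metric on $L$ and $\Box = \nabla_\alpha \nabla^\alpha$. This, together with $\sqrt{|g^M|} = \Phi^{(D-2)} \sqrt{-g^L}$, is the starting point of spherically reduced gravity formulated by a 2D effective action. Note that the generalization to continuous and negative dimensions D is possible in (A.8), which leads to the subclass b = a − 1 of the models of (3.67).

Characteristic classes are independent of the metrical structure since they depend solely on the topology, but typically they can be expressed as integrals over local quantities using index theorems. As an example we treat the Euler and Pontryagin classes in D = 4. The latter can be expressed as

$$ P_4 = \frac{1}{8\pi^2} \int_M R^{mn} \wedge R_{mn} = 0 , \tag{A.9} $$

and it vanishes because $R^{ab} \wedge R_{ab} = 0 = R^{st} \wedge R_{st}$ and, with (A.6) and (A.7), also $R^{as} \wedge R_{as}$ yields no contribution. The Euler class

$$ E_4 = \frac{1}{2(4\pi)^2} \int_M R^{kl} \wedge R^{mn}\, \epsilon_{klmn} \tag{A.10} $$

is non-trivial in general and can be expressed as a 2D integral over $L$:

$$ E_4 = \frac{1}{4\pi} \int_L \epsilon_a{}^b \Big[ R^a{}_b \big( 1 + \eta^{cd} (\tilde e_c \Phi)(\tilde e_d \Phi) \big) - 2 \big( \eta^{ad} (\tilde e_c \tilde e_d \Phi)\, \tilde e^c + (\tilde e_c \Phi)\, \tilde\omega^{ac} \big) \wedge \big( \eta_{be} (\tilde e_c \tilde e^e \Phi)\, \tilde e^c + (\tilde e_c \Phi)\, \tilde\omega^c{}_b \big) \Big] . \tag{A.11} $$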
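As a quick consistency check of the curvature scalar (A.8), one can insert flat Minkowski space. The following is only a sketch, assuming the standard adapted coordinates $(t, r)$ on $L$ with $\Phi = r$; these coordinate names are introduced here for illustration and are not part of the derivation above.

```latex
% Flat-space check of (A.8), written in the form
%   R = R^L - (D-2)(D-3) \Phi^{-2} [1 + (\nabla\Phi)^2] - 2 (D-2) \Phi^{-1} \Box\Phi .
% Assumption: g_{\alpha\beta} = diag(+1,-1) on L and \Phi = r.
\begin{align*}
ds^2_L &= dt^2 - dr^2 , \qquad \Phi(t,r) = r , \qquad R^L = 0 , \\
(\nabla_\alpha \Phi)(\nabla^\alpha \Phi) &= g^{rr}\, (\partial_r r)^2 = -1 , \\
\Box \Phi &= (\partial_t^2 - \partial_r^2)\, r = 0 , \\
R &= 0 - \frac{(D-2)(D-3)}{r^2}\, \big[\, 1 + (-1) \,\big] - 2\, \frac{D-2}{r} \cdot 0 = 0 .
\end{align*}
```

Both $\Phi$-dependent terms vanish separately for every D, as they must for flat space, so this check does not depend on the overall sign convention chosen for the curvature.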
Appendix B. Heat kernel expansion

Some basic properties of the heat kernel expansion which are needed in the main text are collected here. More details can be found in the monographs [426–429]. In most quantum field theory problems one deals with an operator of Laplace type. In a suitable basis, such an operator can be represented as

$$ A = -(g^{\mu\nu} \nabla_\mu \nabla_\nu + E) , \tag{B.1} $$

where $\nabla$ is a covariant derivative, and $E$ is an endomorphism of a vector bundle (or, in simpler terms, a matrix valued function). The connection in the covariant derivative and the matrix $E$ may have gauge and spin indices. We consider the operator $A$ in arbitrary dimension D. The smeared heat kernel is defined by the equation

$$ K(f, A, t) = \mathrm{Tr}\,(f \exp(-tA)) , \tag{B.2} $$

where $f$ is a function, but more complicated cases with $f$ being a differential operator may be considered as well [430]. If the underlying manifold $M$ has no boundary, there exists an asymptotic series as $t \to +0$

$$ K(f, A, t) \simeq \sum_{n=0}^{\infty} a_n(f, A)\, t^{-(D/2)+n} , \tag{B.3} $$

where the coefficients $a_n$ are locally computable. This means that they can be expressed as integrals of local polynomials constructed from the Riemannian curvature, $E$, the gauge field strength, and covariant derivatives. On manifolds with boundaries half-integer n are also admitted. A very important property is that numerical coefficients in front of a monomial depend on the dimension D via an overall factor $(4\pi)^{-D/2}$ only [426]. This last statement follows immediately from writing the general form of such a coefficient on a product manifold $M = M_1 \otimes S^1$ and assuming complete triviality in the $S^1$ direction.

The heat kernel coefficients $a_n$ are known for $n \le 5$ [431]. We find it instructive to present here the calculation of $a_0$ and $a_1$ in order to make our review self-contained, and to advertise a very powerful method for such calculations. The first step is to write down all possible invariants of an appropriate dimension. The mass dimension of the operator $A$ is given by $\dim A = +2$. Therefore, $\dim t = -2$. The volume element has the dimension $-D$. All geometric invariants (e.g. curvature) have positive dimensions. The lowest dimension ($-D$) involves just the integral of the smearing function over the volume. This explains why the expansion (B.3) starts with $t^{-D/2}$. Thus, the first two terms in (B.3) must read

$$ a_0(f, A) = (4\pi)^{-D/2} \int_M d^Dx\, \sqrt{g}\; \mathrm{tr}(\alpha_0 f) , \tag{B.4} $$

$$ a_1(f, A) = (4\pi)^{-D/2} \int_M d^Dx\, \sqrt{g}\; \mathrm{tr}\big(f(\alpha_1 E + \alpha_2 R)\big) , \tag{B.5} $$

where tr denotes the finite-dimensional matrix trace. At this point the $\alpha_i$ are still unknown constants which will be determined by particular case calculations or through functional relations between the heat kernels for different operators. The constant $\alpha_0$ follows from the well-known solution of the heat equation in flat space:

$$ \alpha_0 = 1 . \tag{B.6} $$
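The value (B.6) can be made explicit with the flat-space heat kernel. The following is a minimal sketch, assuming the simplest case $A = -\partial^\mu \partial_\mu$ on Euclidean $\mathbb{R}^D$ acting on a single scalar component (so that tr is trivial and $E = 0$, $R = 0$); the kernel notation $K(x, y; t)$ is introduced here for illustration only.

```latex
% Assumption: A = -\partial^2 on flat Euclidean R^D, one scalar component.
\begin{align*}
(\partial_t - \partial_x^2)\, K(x,y;t) &= 0 , \qquad K(x,y;0) = \delta^D(x-y) , \\
K(x,y;t) &= (4\pi t)^{-D/2} \exp\!\left( -\frac{|x-y|^2}{4t} \right) , \\
K(f,A,t) = \mathrm{Tr}\big( f e^{-tA} \big)
  &= \int d^Dx\, f(x)\, K(x,x;t)
   = t^{-D/2}\, (4\pi)^{-D/2} \int d^Dx\, f(x) .
\end{align*}
```

Comparing with the $n = 0$ term $a_0\, t^{-D/2}$ of (B.3) and with the general form (B.4) gives $\alpha_0 = 1$. In this flat example with $E = 0$ all higher coefficients vanish.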
Let us now consider how the heat kernel changes under conformal transformations of the operator $A$ and under a shift by a function:

$$ \frac{d}{d\epsilon}\Big|_{\epsilon=0} a_n(1, e^{-2\epsilon f} A) = (D - 2n)\, a_n(f, A) , \tag{B.7} $$

$$ \frac{d}{d\epsilon}\Big|_{\epsilon=0} a_n(1, A - \epsilon F) = a_{n-1}(F, A) . \tag{B.8} $$

Here $f$ and $F$ are arbitrary functions. The proof of these two properties is purely combinatorial. It uses differentiation of an exponential (B.2) and commutativity under the trace. From Eqs. (B.6) and (B.8)

$$ \alpha_1 = 1 \tag{B.9} $$

follows. A combination of the two transformations, $A(\epsilon, \delta) = e^{-2\epsilon f}(A - \delta F)$, allows one to prove that for $D = 2(n+1)$

$$ 0 = \frac{d}{d\epsilon}\Big|_{\epsilon=0} a_{n+1}(1, A(\epsilon, \delta)) , $$

$$ 0 = \frac{d}{d\delta}\Big|_{\delta=0} \frac{d}{d\epsilon}\Big|_{\epsilon=0} a_{n+1}(1, A(\epsilon, \delta)) = \frac{d}{d\epsilon}\Big|_{\epsilon=0} \frac{d}{d\delta}\Big|_{\delta=0} a_{n+1}(1, A(\epsilon, \delta)) = \frac{d}{d\epsilon}\Big|_{\epsilon=0} a_n(e^{-2\epsilon f} F, e^{-2\epsilon f} A) . \tag{B.10} $$

The conformal transformations of the individual invariants which may enter (B.10) must be specified. They are perfectly standard in the “geometry” part:

$$ \frac{d}{d\epsilon}\Big|_{\epsilon=0} \sqrt{g} = D f \sqrt{g} , \qquad \frac{d}{d\epsilon}\Big|_{\epsilon=0} R = -2fR - 2(D-1) \nabla^2 f . \tag{B.11} $$

$E$ is transformed such that the operator $A$ is conformally covariant:

$$ \frac{d}{d\epsilon}\Big|_{\epsilon=0} E = -2fE + \frac{1}{2}(D-2) \nabla^2 f . \tag{B.12} $$
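The “purely combinatorial” proof of (B.7) and (B.8) can be sketched in a few lines. The sketch below is formal (operator ordering is handled only through cyclicity of the trace, and the expansion (B.3) is differentiated term by term); it is an illustrative reconstruction under those assumptions, not a quotation of the cited proofs. The same variations then fix $\alpha_1$ and, for $D = 4$ and $n = 1$ in (B.10), the remaining constant $\alpha_2$.

```latex
% Shift by a function, (B.8):
\begin{align*}
\frac{d}{d\epsilon}\Big|_{\epsilon=0} \mathrm{Tr}\, e^{-t(A - \epsilon F)}
  &= t\, \mathrm{Tr}\big( F e^{-tA} \big)
  \;\Rightarrow\;
  \frac{d}{d\epsilon}\Big|_{\epsilon=0} a_n(1, A - \epsilon F) = a_{n-1}(F, A) .
\end{align*}
% Conformal rescaling, (B.7):
\begin{align*}
\frac{d}{d\epsilon}\Big|_{\epsilon=0} \mathrm{Tr}\, e^{-t e^{-2\epsilon f} A}
  &= 2t\, \mathrm{Tr}\big( f A\, e^{-tA} \big)
   = -2t\, \frac{d}{dt}\, \mathrm{Tr}\big( f e^{-tA} \big)
   = \sum_n (D - 2n)\, a_n(f, A)\, t^{\,n - D/2} .
\end{align*}
% alpha_1: A - \epsilon F means E -> E + \epsilon F in (B.5), so (B.8) with n = 1 gives
\begin{align*}
(4\pi)^{-D/2} \int d^Dx \sqrt{g}\; \mathrm{tr}(\alpha_1 F)
  = a_0(F, A)
  = (4\pi)^{-D/2} \int d^Dx \sqrt{g}\; \mathrm{tr}(\alpha_0 F)
  \;\Rightarrow\; \alpha_1 = \alpha_0 = 1 .
\end{align*}
% alpha_2: vary \sqrt{g}\, F (\alpha_1 E + \alpha_2 R) with (B.11), (B.12) at D = 4,
% where F -> e^{-2\epsilon f} F contributes -2 f F.
\begin{align*}
0 &= \frac{d}{d\epsilon}\Big|_{\epsilon=0} a_1(e^{-2\epsilon f} F, e^{-2\epsilon f} A) \\
  &= (4\pi)^{-2} \int d^4x \sqrt{g}\; \mathrm{tr}\Big( (4 - 2 - 2)\, f F (\alpha_1 E + \alpha_2 R)
     + F\, (\alpha_1 - 6 \alpha_2)\, \nabla^2 f \Big) \\
  &= (4\pi)^{-2} (\alpha_1 - 6\alpha_2) \int d^4x \sqrt{g}\; \mathrm{tr}\big( F\, \nabla^2 f \big)
  \;\Rightarrow\; \alpha_2 = \tfrac{1}{6} .
\end{align*}
```

The homogeneous pieces cancel because the weights of $\sqrt{g}$, $F$ and the curvature-type invariants add up to zero in $D = 4$; only the inhomogeneous $\nabla^2 f$ terms of (B.11) and (B.12) survive, and since $f$ and $F$ are arbitrary their coefficient must vanish.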
Note that for the standard conformal (Weyl) transformations the “potential” term $E$ transforms homogeneously, i.e. the second term on the r.h.s. of (B.12) is absent. Finally, the general expression (B.5) is substituted in the variational equation (B.10) for D = 4. The result

$$ \alpha_2 = \frac{1}{6} \tag{B.13} $$

completes the calculation of $a_1$.

Heat kernel methods became standard in quantum field theory after the famous works by DeWitt [432]$^{79}$, where a different calculation scheme was used. The approach we have presented here goes back to the paper by Gilkey [434]. This approach appears somewhat simpler, although it is less “algorithmic” since one has to invent new functional relations appropriate for a particular problem. The full power of this method has been demonstrated on manifolds with boundaries [435] (cf. also [436] for minor corrections). No other method has made possible a calculation as complicated as the one for $a_{5/2}$ with mixed boundary conditions [437].

79 For the first time the heat kernel (proper time) methods were used in quantum theory by Fock [433].

The last topic to be addressed is the relation between the heat kernel and the zeta function of the same operator. It is clear from definitions (6.18) and (B.2) that

$$ \zeta(s|f, A) = \frac{1}{\Gamma(s)} \int_0^\infty dt\, t^{s-1} K(f, A, t) . \tag{B.14} $$

This relation can be inverted,

$$ K(f, A, t) = \frac{1}{2\pi i} \oint ds\, \Gamma(s)\, \zeta(s|f, A)\, t^{-s} , \tag{B.15} $$

where the integration contour encircles all poles of the integrand. The coefficient in front of $t^p$ in the asymptotic expansion (B.3) corresponds to the residue of $\Gamma(s)\zeta(s|f, A)$ at the point $s = -p$. In particular,

$$ a_{D/2}(f, A) = \mathrm{Res}_{s=0}\big( \Gamma(s)\, \zeta(s|f, A) \big) = \zeta(0|f, A) . \tag{B.16} $$
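The statement about residues can be checked by inserting the expansion (B.3) into the Mellin transform (B.14). The following is a sketch assuming a positive operator $A$, so that the heat kernel decays for large t and only the region of small t produces poles; splitting the integral at $t = 1$ is an arbitrary choice made here for illustration.

```latex
% Assumption: A > 0, so \int_1^\infty dt\, t^{s-1} K(f,A,t) is regular in s.
\begin{align*}
\Gamma(s)\, \zeta(s|f,A)
  &= \int_0^1 dt\, t^{s-1} \sum_n a_n(f,A)\, t^{\,n - D/2} + (\text{regular terms}) \\
  &= \sum_n \frac{a_n(f,A)}{s + n - D/2} + (\text{regular terms}) , \\
\mathrm{Res}_{s = D/2 - n} \big( \Gamma(s)\, \zeta(s|f,A) \big) &= a_n(f,A) , \\
\Gamma(s) = \frac{1}{s} + O(1) \quad (s \to 0)
  \quad &\Rightarrow \quad \zeta(0|f,A) = a_{D/2}(f,A) .
\end{align*}
```

With $p = n - D/2$ the pole sits at $s = -p$, in agreement with the text, and the pole of $\Gamma(s)$ at $s = 0$ is what converts the residue into the value of the zeta function itself.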
For $D = 2$, $A = -\Delta$, $E = 0$, Eqs. (B.5), (B.13) and (B.16) provide relation (6.19) of the main text.

References

[1] C. Rovelli, Living Rev. Rel. 1 (1998) 1, gr-qc/9710008.
[2] J. Polchinski, String Theory, An Introduction to the Bosonic String, Vol. 1, Cambridge University Press, Cambridge, 1998.
[3] J.F. Donoghue, Phys. Rev. D 50 (1994) 3874, gr-qc/9405057.
[4] C.M. Will, Living Rev. Rel. 4 (2001) 4, gr-qc/0103036.
[5] G. ’t Hooft, M.J.G. Veltman, Ann. Poincare Phys. Theor. A 20 (1974) 69.
[6] M.H. Goroff, A. Sagnotti, Nucl. Phys. B 266 (1986) 709.
[7] P.S. Howe, K.S. Stelle, P.K. Townsend, Nucl. Phys. B 236 (1984) 125.
[8] Z. Bern, et al., Nucl. Phys. B 530 (1998) 401, hep-th/9802162.
[9] S. Deser, D. Seminara, Phys. Rev. Lett. 82 (1999) 2435, hep-th/9812136.
[10] O. Lauscher, M. Reuter, Class. Quant. Grav. 19 (2002) 483, hep-th/0110021.
[11] J. Wheeler, Geometrodynamics and the issue of the final state, in: C. DeWitt, B. DeWitt (Eds.), Relativity, Groups and Topology, Gordon and Breach, London, 1964, p. 316.
[12] J. Wheeler, in: C. DeWitt, J. Wheeler (Eds.), Battelle Rencontres: 1967 Lectures in Mathematics and Physics, Benjamin, New York, 1968.
[13] B.S. DeWitt, Phys. Rev. 160 (1967) 1113.
[14] C.J. Isham, Lectures given at 30th International Schladming Winter School, Schladming, Austria, February 27–March 8, 1991.
[15] A. Ashtekar, R. Geroch, Rep. Prog. Phys. 37 (1974) 1211.
[16] K. Kuchař, in: Proceedings on Quantum Gravity, Oxford, Vol. 2, 1980, pp. 329–376.
[17] W. Kummer, Eur. Phys. J. C 21 (2001) 175, hep-th/0104123.
[18] S.W. Hawking, Commun. Math. Phys. 43 (1975) 199.
[19] W.G. Unruh, Phys. Rev. D 14 (1976) 870.
[20] S. Carlip, Rep. Prog. Phys. 64 (2001) 885, arXiv:gr-qc/0108040.
[21] B.M. Barbashov, V.V. Nesterenko, A.M. Chervyakov, Theor. Math. Phys. 40 (1979) 572; Teor. Mat. Fiz. 40 (1979) 15–27; J. Phys. A 13 (1980) 301–312.
[22] E. D’Hoker, R. Jackiw, Phys. Rev. D 26 (1982) 3517.
[23] C. Teitelboim, Phys. Lett. B 126 (1983) 41.
[24] E. D’Hoker, D. Freedman, R. Jackiw, Phys. Rev. D 28 (1983) 2583.
[25] E. D’Hoker, R. Jackiw, Phys. Rev. Lett. 50 (1983) 1719.
[26] R. Jackiw, Nucl. Phys. B 252 (1985) 343.
[27] M.O. Katanaev, I.V. Volovich, Phys. Lett. B 175 (1986) 413.
[28] M.O. Katanaev, I.V. Volovich, Ann. Phys. 197 (1990) 1.
[29] R.B. Mann, A. Shiekh, L. Tarasov, Nucl. Phys. B 341 (1990) 134.
[30] A.E. Sikkema, R.B. Mann, Class. Quant. Grav. 8 (1991) 219.
[31] J. Brown, Lower Dimensional Gravity, World Scientific, Singapore, 1988.
[32] K.V. Kuchař, Phys. Rev. D 50 (1994) 3961, arXiv:gr-qc/9403003.
[33] B.K. Berger, et al., Phys. Rev. D 5 (1972) 2467.
[34] R. Benguria, P. Cordero, C. Teitelboim, Nucl. Phys. B 122 (1977) 61.
[35] P. Thomi, B. Isaak, P. Hájíček, Phys. Rev. D 30 (1984) 1168.
[36] P. Hájíček, Phys. Rev. D 30 (1984) 1178.
[37] S. Mignemi, D.L. Wiltshire, Class. Quant. Grav. 6 (1989) 987.
[38] T. Thiemann, H.A. Kastrup, Nucl. Phys. B 399 (1993) 211, arXiv:gr-qc/9310012.
[39] H.A. Kastrup, T. Thiemann, Nucl. Phys. B 425 (1994) 665, arXiv:gr-qc/9401032.
[40] S.R. Lau, Class. Quant. Grav. 13 (1996) 1541, arXiv:gr-qc/9508028.
[41] D. Grumiller, W. Kummer, Phys. Rev. D 61 (2000) 064006, gr-qc/9902074.
[42] G. Mandal, A.M. Sengupta, S.R. Wadia, Mod. Phys. Lett. A 6 (1991) 1685.
[43] S. Elitzur, A. Forge, E. Rabinovici, Nucl. Phys. B 359 (1991) 581.
[44] E. Witten, Phys. Rev. D 44 (1991) 314.
[45] R. Dijkgraaf, H. Verlinde, E. Verlinde, Nucl. Phys. B 371 (1992) 269.
[46] M.D. McGuigan, C.R. Nappi, S.A. Yost, Nucl. Phys. B 375 (1992) 421, hep-th/9111038.
[47] N. Ishibashi, M. Li, A.R. Steif, Phys. Rev. Lett. 67 (1991) 3336.
[48] S.P. de Alwis, J. Lykken, Phys. Lett. B 269 (1991) 264.
[49] S.P. Khastgir, A. Kumar, Mod. Phys. Lett. A 6 (1991) 3365, hep-th/9109026.
[50] S.B. Giddings, Trieste HEP Cosmology, 1994, 0530, arXiv:hep-th/9412138.
[51] A. Strominger, 1994, arXiv:hep-th/9501071, Talk given at NATO Advanced Study Institute, 9501071.
[52] C.G. Callan Jr., et al., Phys. Rev. D 45 (1992) 1005, hep-th/9111056.
[53] H. Verlinde, Trieste Spring School on Strings and Quantum Gravity, 1991, pp. 178–207.
[54] N. Ikeda, K.I. Izawa, Prog. Theor. Phys. 90 (1993) 237, hep-th/9304012.
[55] J.G. Russo, A.A. Tseytlin, Nucl. Phys. B 382 (1992) 259, arXiv:hep-th/9201021.
[56] S.D. Odintsov, I.L. Shapiro, Phys. Lett. B 263 (1991) 183.
[57] H.J. Schmidt, Gen. Rel. Grav. 31 (1999) 1187, gr-qc/9905051.
[58] E.W. Mielke et al., Phys. Rev. D 48 (1993) 3648, hep-th/9304043.
[59] A.M. Polyakov, Mod. Phys. Lett. A 2 (1987) 893.
[60] S.N. Solodukhin, Class. Quant. Grav. 10 (1993) 1011.
[61] S.N. Solodukhin, JETP Lett. 57 (1993) 329.
[62] S.N. Solodukhin, Phys. Lett. B 319 (1993) 87, hep-th/9302040.
[63] S. Solodukhin, Mod. Phys. Lett. A 9 (1994) 2817, hep-th/9404034.
[64] Y.N. Obukhov, F.W. Hehl, 1997, hep-th/9807101.
[65] W. Kummer, D.J. Schwarz, Phys. Rev. D 45 (1992) 3628.
[66] P. Schaller, T. Strobl, Class. Quant. Grav. 11 (1994) 331, arXiv:hep-th/9211054.
[67] W. Kummer, P. Widerin, Phys. Rev. D 52 (1995) 6965, arXiv:gr-qc/9502031.
[68] M.O. Katanaev, W. Kummer, H. Liebl, Phys. Rev. D 53 (1996) 5609, gr-qc/9511009.
[69] W. Kummer, H. Liebl, D.V. Vassilevich, Nucl. Phys. B 493 (1997) 491, gr-qc/9612012.
[70] W. Kummer, H. Liebl, D.V. Vassilevich, Nucl. Phys. B 544 (1999) 403, hep-th/9809168.
[71] W. Kummer, H. Liebl, D.V. Vassilevich, Nucl. Phys. B 513 (1998) 723, hep-th/9707115.
[72] P. Schaller, T. Strobl, Mod. Phys. Lett. A 9 (1994) 3129, hep-th/9405110.
[73] T. Strobl, Poisson structure induced field theories and models of 1+1 dimensional gravity, Ph.D. Thesis, Technische Universität Wien, 1994, hep-th/0011248.
[74] A.Y. Alekseev, P. Schaller, T. Strobl, Phys. Rev. D 52 (1995) 7146, hep-th/9505012.
[75] V. Schomerus, JHEP 06 (1999) 030, hep-th/9903205.
[76] N. Seiberg, E. Witten, JHEP 09 (1999) 032, hep-th/9908142.
[77] M. Ertl, W. Kummer, T. Strobl, JHEP 01 (2001) 042, arXiv:hep-th/0012219.
[78] S. Solodukhin, Phys. Rev. D 51 (1995) 603, hep-th/9404045.
[79] W. Kummer, in: D. Bruncko, J. Urban (Eds.), HADRON STRUCTURE ’92, Stara Lesna, Czechoslovakia, 1992.
[80] H. Pelzer, T. Strobl, Class. Quant. Grav. 15 (1998) 3803, arXiv:gr-qc/9805059.
[81] A. Bilal, C.G. Callan, Nucl. Phys. B 394 (1993) 73, hep-th/9205089.
[82] J.G. Russo, L. Susskind, L. Thorlacius, Phys. Lett. B 292 (1992) 13, hep-th/9201074.
[83] S. Bose, L. Parker, Y. Peleg, Phys. Rev. D 52 (1995) 3512, hep-th/9502098.
[84] J. Cruz, J. Navarro-Salas, Phys. Lett. B 375 (1996) 47, hep-th/9512187.
[85] A. Fabbri, J. Navarro-Salas, Phys. Rev. D 58 (1998) 084011, gr-qc/9805082.
[86] S. Kim, H. Lee, Phys. Lett. B 458 (1999) 245, gr-qc/9907013.
[87] O.B. Zaslavsky, Phys. Lett. B 459 (1999) 105, hep-th/9904184.
[88] O.B. Zaslavsky, Phys. Rev. D 59 (1999) 084013, hep-th/9804089.
[89] K.J. Hamada, Phys. Lett. B 300 (1993) 322, hep-th/9206071.
[90] K.J. Hamada, A. Tsuchiya, Int. J. Mod. Phys. A 8 (1993) 4897, hep-th/9211135.
[91] R. Kantowski, C. Marzban, Phys. Rev. D 46 (1992) 5449, hep-th/9208015.
[92] E. Elizalde, S. Naftulin, S.D. Odintsov, Int. J. Mod. Phys. A 9 (1994) 933, hep-th/9304091.
[93] E. Elizalde, et al., Phys. Lett. B 352 (1995) 235, hep-th/9505030.
[94] F. Haider, W. Kummer, Int. J. Mod. Phys. A 9 (1994) 207.
[95] D. Grumiller, W. Kummer, D.V. Vassilevich, Nucl. Phys. B 580 (2000) 438, gr-qc/0001038.
[96] P. Fischer, et al., Phys. Lett. B 521 (2001) 357, gr-qc/0105034; Erratum, Phys. Lett. B 532 (2002) 373.
[97] D. Grumiller, Class. Quant. Grav. 19 (2002) 997, gr-qc/0111097.
[98] I.A. Batalin, G.A. Vilkovisky, Phys. Lett. B 69 (1977) 309.
[99] V.P. Frolov, D.V. Fursaev, Class. Quant. Grav. 15 (1998) 2041, hep-th/9802010.
[100] A.W. Peet, Class. Quant. Grav. 15 (1998) 3291, hep-th/9712253.
[101] V. Frolov, et al., Phys. Rev. D 60 (1999) 024016, hep-th/9901087.
[102] V.P. Frolov, W. Israel, S.N. Solodukhin, Phys. Rev. D 54 (1996) 2732, hep-th/9602105.
[103] V.A. Berezin, A.M. Boyarsky, A.Y. Neronov, Phys. Rev. D 57 (1998) 1118, gr-qc/9708060.
[104] V.A. Berezin, A.M. Boyarsky, A.Y. Neronov, Phys. Lett. B 455 (1999) 109, gr-qc/9808027.
[105] M. Bojowald et al., Phys. Rev. D 62 (2000) 044026, gr-qc/9906105.
[106] A. Barvinsky, G. Kunstatter, Phys. Lett. B 389 (1996) 231, hep-th/9606134.
[107] J. Bekenstein, Lett. Nuovo Cimento 11 (1974) 467.
[108] G. ’t Hooft, Salamfestschrift, World Scientific, Singapore, 1993, gr-qc/9310026.
[109] L. Susskind, J. Math. Phys. 36 (1995) 6377, hep-th/9409089.
[110] J. Maldacena, Adv. Theor. Math. Phys. 2 (1998) 231, hep-th/9711200.
[111] S.S. Gubser, I.R. Klebanov, A.M. Polyakov, Phys. Lett. B 428 (1998) 105, hep-th/9802109.
[112] E. Witten, Adv. Theor. Math. Phys. 2 (1998) 253, hep-th/9802150.
[113] I. Sachs, S.N. Solodukhin, Phys. Rev. D 64 (2001) 124023, hep-th/0107173.
[114] H. Nicolai, D. Korotkin, H. Samtleben, 1996, hep-th/9612065.
[115] D. Korotkin, H. Samtleben, Nucl. Phys. B 527 (1998) 657, hep-th/9710210.
[116] D. Korotkin, H. Samtleben, Phys. Rev. Lett. 80 (1998) 14, gr-qc/9705013.
[117] H. Nicolai, H. Samtleben, Nucl. Phys. B 533 (1998) 210, hep-th/9804152.
[118] D. Bernard, N. Regnault, Commun. Math. Phys. 210 (2000) 177, solv-int/9902017.
[119] G.G. Varzugin, 2000, gr-qc/0001024.
[120] L.D. Faddeev, R.M. Kashaev, A.Y. Volkov, Commun. Math. Phys. 219 (2001) 199, hep-th/0006156.
[121] J. Teschner, Class. Quant. Grav. 18 (2001) R153, hep-th/0104158.
[122] M. Cadoni, Phys. Rev. D 58 (1998) 104001, hep-th/9803257.
[123] D.J. Navarro, J. Navarro-Salas, Mod. Phys. Lett. A 13 (1998) 2049, hep-th/9807003.
[124] M. Nakahara, Geometry, Topology and Physics, IOP Publishing, Bristol, 1990.
[125] A. Einstein, Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.) 1915 (1915) 778.
[126] A. Einstein, Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.) 1915 (1915) 844.
[127] A. Palatini, Rend. Circ. Mat. Palermo 43 (1919) 203.
[128] F.W. Hehl, et al., Rev. Mod. Phys. 48 (1976) 393.
[129] F.W. Hehl, et al., Phys. Rep. 258 (1995) 1, arXiv:gr-qc/9402012.
[130] P. Fiziev, H. Kleinert, Europhys. Lett. 35 (1996) 241, hep-th/9503074.
[131] H. Kleinert, A. Pelster, Gen. Rel. Grav. 31 (1999) 1439, gr-qc/9605028.
[132] S. Hawking, G. Ellis, The Large Scale Structure of Space–Time, Cambridge University Press, Cambridge, 1973.
[133] R.M. Wald, General Relativity, The University of Chicago Press, Chicago, 1984.
[134] K. Schwarzschild, Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.) 1916 (1916) 189, arXiv:physics/9905030.
[135] M. Walker, J. Math. Phys. 11 (1970) 2280.
[136] T. Klösch, T. Strobl, Class. Quant. Grav. 13 (1996) 2395, arXiv:gr-qc/9511081.
[137] K. Kuchař, Phys. Rev. D 4 (1971) 955.
[138] R.H. Gowdy, Phys. Rev. Lett. 27 (1971) 826.
[139] R. Geroch, J. Math. Phys. 13 (1972) 394.
[140] V. Husain, L. Smolin, Nucl. Phys. B 327 (1989) 205.
[141] O. Brodbeck, M. Zagermann, Class. Quant. Grav. 17 (2000) 2749, arXiv:gr-qc/9911118.
[142] D. Grumiller, Quantum dilaton gravity in two dimensions with matter, Ph.D. Thesis, Technische Universität Wien, 2001, gr-qc/0105078.
[143] G.W. Gibbons, K. Maeda, Nucl. Phys. B 298 (1988) 741.
[144] C.G. Callan Jr, et al., Nucl. Phys. B 262 (1985) 593.
[145] A. Mikovic, Phys. Lett. B 291 (1992) 19, hep-th/9207006.
[146] D. Cangemi, R. Jackiw, B. Zwiebach, Ann. Phys. 245 (1996) 408, hep-th/9505161.
[147] E. Benedict, R. Jackiw, H.J. Lee, Phys. Rev. D 54 (1996) 6213, hep-th/9607062.
[148] A. Mikovic, V. Radovanovic, Class. Quant. Grav. 14 (1997) 2647, gr-qc/9703035.
[149] K.V. Kuchař, J.D. Romano, M. Varadarajan, Phys. Rev. D 55 (1997) 795, gr-qc/9608011.
[150] M. Varadarajan, Phys. Rev. D 57 (1998) 3463, gr-qc/9801058.
[151] P. Ginsparg, G.W. Moore, 1993, hep-th/9304011.
[152] T. Banks, M. O’Loughlin, Nucl. Phys. B 362 (1991) 649.
[153] T. Strobl, Habilitation thesis, 1999, hep-th/0011240.
[154] R. Jackiw, 1995, hep-th/9501016.
[155] J. de Boer, F. Harmsze, T. Tjin, Phys. Rep. 272 (1996) 139, hep-th/9503161.
[156] H. Reissner, Ann. Phys. 50 (1916) 106.
[157] G. Nordström, Proc. K. Ned. Akad. Wet. 20 (1916) 1238.
[158] H.J. Schmidt, J. Math. Phys. 32 (1991) 1562.
[159] V.P. Frolov, Phys. Rev. D 46 (1992) 5383.
[160] S.D. Odintsov, I.L. Shapiro, Mod. Phys. Lett. A 7 (1992) 437.
[161] S.P. de Alwis, Phys. Lett. B 300 (1993) 330, hep-th/9206020.
[162] R.B. Mann, Phys. Rev. D 47 (1993) 4438, hep-th/9206044.
[163] J. Gegenberg, G. Kunstatter, Phys. Rev. D 47 (1993) 4192, gr-qc/9302006.
[164] D. Louis-Martinez, G. Kunstatter, Phys. Rev. D 49 (1994) 5227.
[165] D. Louis-Martinez, J. Gegenberg, G. Kunstatter, Phys. Lett. B 321 (1994) 193, gr-qc/9309018.
[166] J.P.S. Lemos, P.M. Sa, Phys. Rev. D 49 (1994) 2897, arXiv:gr-qc/9311008.
[167] D. Louis-Martinez, Phys. Rev. D 55 (1997) 7982, hep-th/9611031.
[168] M. Navarro, Phys. Rev. D 56 (1997) 2384, arXiv:gr-qc/9702040.
[169] Y. Kiem, C.Y. Lee, D. Park, Class. Quant. Grav. 15 (1998) 2973, arXiv:hep-th/9703044.
[170] J. Cruz, et al., Phys. Rev. D 58 (1998) 044010, arXiv:hep-th/9704168.
[171] G. Kunstatter, R. Petryk, S. Shelemy, Phys. Rev. D 57 (1998) 3537, arXiv:gr-qc/9709043.
[172] M. Cavaglia, Phys. Rev. D 59 (1999) 084011, arXiv:hep-th/9811059.
[173] J. Cruz, A. Fabbri, J. Navarro-Salas, Phys. Lett. B 449 (1999) 30, arXiv:hep-th/9811246.
[174] S. Cassemiro F.F., V.O. Rivelles, Phys. Lett. B 452 (1999) 234, arXiv:hep-th/9812096.
[175] M. Cavaglia, Mod. Phys. Lett. A 15 (2000) 2113, hep-th/0011136.
[176] D. Grumiller, D. Hofmann, W. Kummer, Mod. Phys. Lett. A 16 (2001) 1597, arXiv:gr-qc/0012026.
[177] M. Fierz, Helv. Phys. Acta 29 (1956) 128.
[178] P. Jordan, Schwerkraft und Weltall: Grundlagen der theoretischen Kosmologie, 2nd Edition, Vieweg, Braunschweig, 1955.
[179] P. Jordan, Z. Phys. 157 (1959) 112.
[180] C. Brans, R.H. Dicke, Phys. Rev. 124 (1961) 925.
[181] M.B. Green, J.H. Schwarz, E. Witten, Superstring Theory, Introduction, Vol. 1, Cambridge University Press, Cambridge, 1987.
[182] S. Capozziello, R. de Ritis, A.A. Marino, Class. Quant. Grav. 14 (1997) 3243, gr-qc/9612053.
[183] R. Dick, Gen. Rel. Grav. 30 (1998) 435.
[184] R. Casadio, B. Harms, Mod. Phys. Lett. A 14 (1999) 1089, gr-qc/9806032.
[185] V. Faraoni, E. Gunzig, Int. J. Theor. Phys. 38 (1999) 217.
[186] E. Alvarez, J. Conde, Mod. Phys. Lett. A 17 (2002) 413, gr-qc/0111031.
[187] M.O. Katanaev, W. Kummer, H. Liebl, Nucl. Phys. B 486 (1997) 353, gr-qc/9602040.
[188] M.F. Ertl, M.O. Katanaev, W. Kummer, Nucl. Phys. B 530 (1998) 457, hep-th/9710051.
[189] N. Ikeda, Ann. Phys. 235 (1994) 435, arXiv:hep-th/9312059.
[190] P. Schaller, T. Strobl, Integrable models and strings, Helsinki, 1993, 1994, pp. 98–122, gr-qc/9406027.
[191] P. Schaller, T. Strobl, Finite dimensional integrable systems, Dubna, 1994, pp. 181–190, hep-th/9411163.
[192] J. Schouten, Conv. Geom. Diff. 1 (1954) 7.
[193] A. Nijenhuis, Proc. K. Ned. Akad. Wet. Amsterdam A 58 (1955) 390.
[194] H. Grosse, et al., J. Math. Phys. 33 (1992) 3892, hep-th/9205071.
[195] M. Ertl, Supergravity in two space–time dimensions, Ph.D. Thesis, Technische Universität Wien, 2001, arXiv:hep-th/0102140.
[196] F. Bayen, et al., Ann. Phys. 111 (1978) 61.
[197] F. Bayen, et al., Ann. Phys. 111 (1978) 111.
[198] A.S. Cattaneo, G. Felder, Commun. Math. Phys. 212 (2000) 591, math.qa/9902090.
[199] A.S. Cattaneo, G. Felder, Mod. Phys. Lett. A 16 (2001) 179, hep-th/0102208.
[200] A.C. Hirshfeld, T. Schwarzweller, Ann. Phys. 9 (2000) 83, hep-th/9910178.
[201] A.C. Hirshfeld, T. Schwarzweller, 2001, hep-th/0112086.
[202] J. Gegenberg, G. Kunstatter, D. Louis-Martinez, Phys. Rev. D 51 (1995) 1781, gr-qc/9408015.
[203] A.S. Eddington, Nature 113 (1924) 192.
[204] D. Finkelstein, Phys. Rev. 110 (1958) 965.
[205] D. Birmingham, et al., Phys. Rept. 209 (1991) 129.
[206] A.S. Schwarz, Lett. Math. Phys. 2 (1978) 247.
[207] A.S. Schwarz, Commun. Math. Phys. 67 (1979) 1.
[208] E. Witten, Commun. Math. Phys. 117 (1988) 353.
[209] E. Witten, Commun. Math. Phys. 118 (1988) 411.
[210] E. Witten, Commun. Math. Phys. 121 (1989) 351.
[211] B. Carter, J. Math. Phys. 10 (1969) 70.
[212] R. Penrose, Riv. Nuovo Cimento 1 (1969) 252.
[213] M.O. Katanaev, J. Math. Phys. 34 (1993) 700.
[214] T. Klösch, T. Strobl, Class. Quant. Grav. 13 (1996) 965, arXiv:gr-qc/9508020.
[215] W. Israel, Phys. Rev. 143 (1966) 1016.
[216] T. Klösch, T. Strobl, Class. Quant. Grav. 13 (1996) 1191, arXiv:gr-qc/9507011.
[217] T. Klösch, T. Strobl, Phys. Rev. D 57 (1998) 1034, arXiv:gr-qc/9707053.
[218] S. Mignemi, Phys. Rev. D 50 (1994) 4733.
[219] A. Fabbri, J.G. Russo, Phys. Rev. D 53 (1996) 6995, hep-th/9510109.
[220] H. Liebl, D.V. Vassilevich, S. Alexandrov, Class. Quant. Grav. 14 (1997) 889, arXiv:gr-qc/9605044.
[221] W. Kummer, P. Widerin, Mod. Phys. Lett. A 9 (1994) 1407.
[222] D. Louis-Martinez, G. Kunstatter, Phys. Rev. D 52 (1995) 3494, gr-qc/9503016.
[223] D. Park, Y. Kiem, Phys. Rev. D 53 (1996) 5513, hep-th/9601166.
[224] P. Forgacs, N.S. Manton, Commun. Math. Phys. 72 (1980) 15.
[225] E. Witten, Phys. Rev. Lett. 38 (1977) 121.
[226] M.S. Volkov, D.V. Gal’tsov, Phys. Rep. 319 (1999) 1, hep-th/9810070.
[227] P.S. Howe, J. Phys. A 12 (1979) 393.
[228] M. Brown, J. Gates, S. James, Ann. Phys. 122 (1979) 443.
[229] E.J. Martinec, Phys. Rev. D 28 (1983) 2604.
[230] M. Rocek, P. van Nieuwenhuizen, S.C. Zhang, Ann. Phys. 172 (1986) 348.
[231] V.O. Rivelles, Phys. Lett. B 321 (1994) 189, hep-th/9301029.
[232] M.M. Leite, V.O. Rivelles, Class. Quant. Grav. 12 (1995) 627, hep-th/9410003.
[233] J.M. Izquierdo, Phys. Rev. D 59 (1999) 084017, arXiv:hep-th/9807007.
[234] N. Ikeda, Int. J. Mod. Phys. A 9 (1994) 1137.
[235] T. Strobl, Phys. Lett. B 460 (1999) 87, arXiv:hep-th/9906230.
[236] M.J. Duff, B.E.W. Nilsson, C.N. Pope, Phys. Rep. 130 (1986) 1.
[237] Y.C. Park, A. Strominger, Phys. Rev. D 47 (1993) 1569, arXiv:hep-th/9210017.
[238] R. Penrose, W. Rindler, Spinors and Space–Time I, Cambridge University Press, Cambridge, 1984.
[239] A.G. Riess, et al., Astrophys. J. 560 (2001) 49, astro-ph/0104455.
[240] A.G. Riess, Publ. Astron. Soc. Pacific 112 (2000) 1284.
[241] A.G. Riess, et al., Supernova Search Team, Astron. J. 116 (1998) 1009, astro-ph/9805201.
[242] S. Perlmutter, et al., Supernova Cosmology Project, Astrophys. J. 517 (1999) 565, arXiv:astro-ph/9812133.
[243] N.A. Bahcall, et al., Science 284 (1999) 1481, arXiv:astro-ph/9906463.
[244] C. Wetterich, Nucl. Phys. B 302 (1988) 668.
[245] L.M. Wang, P.J. Steinhardt, Astrophys. J. 508 (1998) 483, arXiv:astro-ph/9804015.
[246] S.M. Carroll, Phys. Rev. Lett. 81 (1998) 3067, arXiv:astro-ph/9806099.
[247] I. Zlatev, L.M. Wang, P.J. Steinhardt, Phys. Rev. Lett. 82 (1999) 896, arXiv:astro-ph/9807002.
[248] B. Ratra, P.J.E. Peebles, Phys. Rev. D 37 (1988) 3406.
[249] D. Grumiller, D. Hofmann, W. Kummer, Ann. Phys. 290 (2001) 69, arXiv:gr-qc/0005098.
[250] A.T. Filippov, Mod. Phys. Lett. A 11 (1996) 1691, hep-th/9605008.
[251] A.T. Filippov, Int. J. Mod. Phys. A 12 (1997) 13, gr-qc/9612058.
[252] A.M. Polyakov, Phys. Lett. B 103 (1981) 207.
[253] D.I. Kazakov, S.N. Solodukhin, Nucl. Phys. B 429 (1994) 153, hep-th/9310150.
[254] E. Witten, Commun. Math. Phys. 92 (1984) 455.
[255] M.O. Katanaev, Ann. Phys. 296 (2002) 1, gr-qc/0101033.
[256] I.Z. Fisher, Zh. Eksp. Teor. Fiz. 18 (1948) 636, gr-qc/9911008.
[257] M.D. Roberts, Gen. Rel. Grav. 21 (1989) 907.
[258] A. Mikovic, Phys. Rev. D 56 (1997) 6067, gr-qc/9705030.
[259] L.A. Gergely, Phys. Rev. D 58 (1998) 084030, gr-qc/9809024.
[260] L.A. Gergely, Phys. Rev. D 59 (1999) 104014, gr-qc/9902016.
[261] G. Tieber, Gravitationsmodelle mit materie in zwei Dimensionen und ihre Symmetrien, Ph.D. Thesis, Technische Universität Wien, 1997 (in German).
[262] M. Cadoni, S. Mignemi, (2002), gr-qc/0202066.
[263] G.M. Kremer, F.P. Devecchi, Phys. Rev. D 65 (2002) 083515, gr-qc/0202025.
[264] M.W. Choptuik, Phys. Rev. Lett. 70 (1993) 9.
[265] D. Christodoulou, Commun. Math. Phys. 106 (1986) 587.
[266] D. Christodoulou, Commun. Math. Phys. 105 (1986) 337.
[267] D. Christodoulou, Commun. Math. Phys. 109 (1987) 591.
[268] D. Christodoulou, Commun. Math. Phys. 109 (1987) 613.
[269] C. Gundlach, Adv. Theor. Math. Phys. 2 (1998) 1, arXiv:gr-qc/9712084.
[270] C. Gundlach, Living Rev. Rel. 2 (1999) 4, arXiv:gr-qc/0001046.
[271] R. Arnowitt, S. Deser, C.W. Misner, in: L. Witten (Ed.), Gravitation: An Introduction to Current Research, Wiley, New York, 1962.
[272] L.D. Faddeev, Sov. Phys. Usp. 25 (1982) 130.
[273] J. Louko, B.F. Whiting, Phys. Rev. D 51 (1995) 5583, gr-qc/9411017.
[274] W. Kummer, S.R. Lau, Ann. Phys. 258 (1997) 37, arXiv:gr-qc/9612021.
[275] W.T. Kim, Phys. Rev. D 60 (1999) 024011, hep-th/9810055.
[276] G.W. Gibbons, S.W. Hawking, Phys. Rev. D 15 (1977) 2752.
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430 [277] [278] [279] [280] [281] [282] [283] [284] [285] [286] [287] [288] [289] [290] [291] [292] [293] [294] [295] [296] [297] [298] [299] [300] [301] [302] [303] [304] [305] [306] [307] [308] [309] [310] [311] [312] [313] [314] [315] [316] [317] [318] [319] [320] [321] [322] [323] [324] [325] [326] [327]
427
J.D. Brown, J. York, W. James, Phys. Rev. D 47 (1993) 1407. L.F. Abbott, S. Deser, Nucl. Phys. B 195 (1982) 76. S.W. Hawking, G.T. Horowitz, Class. Quant. Grav. 13 (1996) 1487, gr-qc=9501014. N. Berkovits, S. Gukov, B.C. Vallilo, Nucl. Phys. B 614 (2001) 195, hep-th=0107140. V.A. Kazakov, A.A. Tseytlin, JHEP 06 (2001) 021, hep-th=0104138. G.W. Gibbons, M.J. Perry, Int. J. Mod. Phys. D 1 (1992) 335, hep-th=9204090. C.R. Nappi, A. Pasquinucci, Mod. Phys. Lett. A 7 (1992) 3337, gr-qc=9208002. K.C.K. Chan, J.D.E. Creighton, R.B. Mann, Phys. Rev. D 54 (1996) 3892, gr-qc=9604055. M.Z. Iofa, 1995, hep-th=9602023. W.T. Kim, J.J. Oh, Phys. Lett. B 461 (1999) 189, hep-th=9905007. W.T. Kim, J. Lee, Int. J. Mod. Phys. A 11 (1996) 553, hep-th=9502078. J. Cruz, A. Fabbri, J. Navarro-Salas, Phys. Rev. D 60 (1999) 107506, gr-qc=9902084. A.J.M. Medved, G. Kunstatter, Phys. Rev. D 59 (1999) 104005, hep-th=9811052. K.C.K. Chan, 1997, gr-qc=9701029. J.D. Brown, S.R. Lau, J.W. York, 2000, gr-qc=0010024. R.M. Wald, Phys. Rev. D 48 (1993) 3427, gr-qc=9307038. V. Iyer, R.M. Wald, Phys. Rev. D 50 (1994) 846, gr-qc=9403028. R.L. Marsa, M.W. Choptuik, Phys. Rev. D 54 (1996) 4929, arXiv:gr-qc=9607034. W. Kummer, G. Tieber, Phys. Rev. D 59 (1999) 044001, arXiv:hep-th=9807122. V.P. Frolov, I.D. Novikov, Black Hole Physics: Basic Concepts and New Developments, Kluwer Academic Publishers, Dordrecht, Netherlands, 1998. T. Christodoulakis, et al., Phys. Lett. B 501 (2001) 269, hep-th=0010097. E.C. Vagenas, Phys. Lett. B 503 (2001) 399, hep-th=0012134. S.M. Christensen, S.A. Fulling, Phys. Rev. D 15 (1977) 2088. W. Kummer, D.V. Vassilevich, Ann. Phys. 8 (1999) 801, gr-qc=9907041. M.J. Du?, Class. Quant. Grav. 11 (1994) 1387, hep-th=9308075. D.R. Karakhanian, R.P. Manvelian, R.L. Mkrtchian, Phys. Lett. B 329 (1994) 185, hep-th=9401031. G. Amelino-Camelia, D. Bak, D. Seminara, Phys. Lett. B 354 (1995) 213, hep-th=9505136. H. La, 1995, hep-th=9510147. J.S. Dowker, R. Critchley, Phys. Rev. 
D 13 (1976) 3224. S.W. Hawking, Commun. Math. Phys. 55 (1977) 133. D.G. Boulware, Phys. Rev. D 11 (1975) 1404. W. Israel, Phys. Lett. A 57 (1976) 107. G.W. Gibbons, M.J. Perry, Phys. Rev. Lett. 36 (1976) 985. M. Cadoni, S. Mignemi, Phys. Lett. B 358 (1995) 217, gr-qc=9505032. M. Cadoni, Phys. Rev. D 53 (1996) 4413, gr-qc=9510012. S. Nojiri, I. Oda, Phys. Lett. B 294 (1992) 317, hep-th=9206087. A. Ori, Phys. Rev. D 63 (2001) 104016, gr-qc=0102067. K. Diba, D.A. Lowe, 2002, hep-th=0202005. T. Christodoulakis, et al., Phys. Rev. D 64 (2001) 124022, hep-th=0107049. V. Mukhanov, A. Wipf, A. Zelnikov, Phys. Lett. B 332 (1994) 283, hep-th=9403018. T. Chiba, M. Siino, Mod. Phys. Lett. A 12 (1997) 709. S. Ichinose, Phys. Rev. D 57 (1998) 6224, hep-th=9707025. W. Kummer, H. Liebl, D.V. Vassilevich, Mod. Phys. Lett. A 12 (1997) 2683, hep-th=9707041. R. Bousso, S.W. Hawking, Phys. Rev. D 56 (1997) 7788, hep-th=9705236. S. Nojiri, S.D. Odintsov, Mod. Phys. Lett. A 12 (1997) 2083, hep-th=9706009. A. Mikovic, V. Radovanovic, Class. Quant. Grav. 15 (1998) 827, hep-th=9706066. S. Nojiri, S.D. Odintsov, Phys. Rev. D 59 (1999) 044003, hep-th=9806055. S. Ichinose, S.D. Odintsov, Nucl. Phys. B 539 (1999) 643, hep-th=9802043. S. Nojiri, S.D. Odintsov, Int. J. Mod. Phys. A 16 (2001) 1015, hep-th=0009202. S. Nojiri, S.D. Odintsov, Phys. Rev. D 57 (1998) 2363, hep-th=9706143. J.S. Dowker, Class. Quant. Grav. 15 (1998) 1881, hep-th=9802029.
428 [328] [329] [330] [331] [332] [333] [334] [335] [336] [337] [338] [339] [340] [341] [342] [343] [344] [345] [346] [347] [348] [349] [350] [351] [352] [353] [354] [355] [356] [357] [358] [359] [360] [361] [362] [363] [364] [365] [366] [367] [368] [369] [370] [371] [372] [373] [374] [375] [376]
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430 W. Kummer, H. Liebl, D.V. Vassilevich, Phys. Rev. D 58 (1998) 108501, hep-th=9801122. D. Hofmann, Quantum radiation from Black Holes, Ph.D. Thesis, Technische UniversitYat Wien, 2002. W. Kummer, D.V. Vassilevich, Phys. Rev. D 60 (1999) 084021, hep-th=9811092. V. Frolov, A. Zelnikov, Phys. Rev. D 63 (2001) 125026, hep-th=0012252. D.V. Vassilevich, JHEP 03 (2001) 023, hep-th=0102091. Y.V. Gusev, A.I. Zelnikov, Phys. Rev. D 61 (2000) 084010, hep-th=9910198. D.V. Vassilevich, A. Zelnikov, Nucl. Phys. B 594 (2001) 501, hep-th=0009084. J.G. Russo, L. Susskind, L. Thorlacius, Phys. Rev. D 46 (1992) 3444, hep-th=9206070. R. Bousso, S.W. Hawking, Phys. Rev. D 57 (1998) 2436, hep-th=9709224. R. Bousso, Phys. Rev. D 58 (1998) 083511, hep-th=9805081. R. Bousso, Phys. Rev. D 60 (1999) 063503, hep-th=9902183. M. Buric, M. Dimitrijevic, V. Radovanovic, Phys. Rev. D 65 (2002) 064022, hep-th=0108036. M. Buric, V. Radovanovic, Class. Quant. Grav. 17 (2000) 33, gr-qc=9907106. M. Buric, V. Radovanovic, Class. Quant. Grav. 16 (1999) 3937, gr-qc=9907036. M. Buric, V. Radovanovic, Phys. Rev. D 63 (2001) 044020, hep-th=0007172. A.J.M. Medved, G. Kunstatter, Phys. Rev. D 60 (1999) 104029, hep-th=9904070. A.J.M. Medved, G. Kunstatter, Phys. Rev. D 63 (2001) 104005, hep-th=0009050. A.J.M. Medved, 2001, hep-th=0112056. A.J.M. Medved, 2001, hep-th=0111091. C. Barbachoux, A. Fabbri, 2002, hep-th=0201133. Y.V. Novozhilov, D.V. Vassilevich, Phys. Lett. B 220 (1989) 36. F.C. Lombardo, F.D. Mazzitelli, J.G. Russo, Phys. Rev. D 59 (1999) 064007, gr-qc=9808048. M. Buric, V. Radovanovic, A. Mikovic, Phys. Rev. D 59 (1999) 084002, gr-qc=9804083. R. Balbinot, A. Fabbri, Phys. Rev. D 59 (1999) 044031, hep-th=9807123. R. Balbinot, A. Fabbri, 2000, hep-th=0012140. R. Balbinot, A. Fabbri, Phys. Lett. B 459 (1999) 112, gr-qc=9904034. R. Balbinot, et al., Phys. Rev. D 63 (2001) 084029, hep-th=0012048. A.W. Wipf, Nucl. Phys. B 269 (1986) 24. V. 
Frolov, P. Sutton, A. Zelnikov, Phys. Rev. D 61 (2000) 024021, hep-th=9909086. P. Sutton, Phys. Rev. D 62 (2000) 044033, hep-th=0003290. G. Cognola, S. Zerbini, Nucl. Phys. B 602 (2001) 383, hep-th=0008061. R. Balbinot et al., 2002, hep-th=0202036. D.M. Gitman, I.V. Tyutin, Quantization of Fields with Constraints, Springer, Berlin, 1990. M. Henneaux, C. Teitelboim, Quantization of Gauge Systems, Princeton University Press, Princeton, NJ, USA, 1992. P.A.M. Dirac, Lectures on Quantum Mechanics, Belfer Graduate School of Science, Yeshiva University, New York, 1996. A. Ashtekar, Phys. Rev. Lett. 57 (1986) 2244. A. Ashtekar, Phys. Rev. D 36 (1987) 1587. T. Thiemann, 2001, gr-qc=0110034. M.O. Katanaev, Nucl. Phys. B 416 (1994) 563, hep-th=0101168. E.S. Fradkin, G.A. Vilkovisky, Phys. Lett. B 55 (1975) 224. E.S. Fradkin, T.E. Fradkina, Phys. Lett. B 72 (1978) 343. D.J. Toms, Phys. Rev. D 35 (1987) 3796. K. Fujikawa, et al., Phys. Rev. D 37 (1988) 391. F. Bastianelli, O. Corradini, Phys. Rev. D 60 (1999) 044014, hep-th=9810119. A.O. Barvinsky, Phys. Lett. B 195 (1987) 344. V.N. Marachevsky, D. Vassilevich, Class. Quant. Grav. 13 (1996) 645, gr-qc=9509051. D.V. Vassilevich, Mod. Phys. Lett. A 10 (1995) 2239, hep-th=9504011. I.G. Moss, P.J. Silva, Phys. Rev. D 55 (1997) 1072, gr-qc=9610023. G. Esposito, Quantum Gravity in Four-Dimensions, Nova Science, New York, 2001.
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
429
[377] U. LindstrYom, M. Ro_cek, P. van Nieuwenhuizen, Open String Boundary Conditions and Bulk Tachyons, in preparation. [378] D.V. Vassilevich, Phys. Rev. D 52 (1995) 999, gr-qc=9411036. [379] M. Cadoni, P. Carta and S. Mignemi, 2002, hep-th=0202010. [380] M. Cavaglia, C. Ungarelli, Phys. Rev. D 61 (2000) 064019, hep-th=9912024. [381] M. Cavaglia, A. Fabbri, Phys. Rev. D 65 (2002) 044012, hep-th=0108050. [382] F.C. Lombardo, F.D. Mazzitelli, Phys. Rev. D 58 (1998) 024009, gr-qc=9712091. [383] A. Fabbri, D.J. Navarro, J. Navarro-Salas, Nucl. Phys. B 595 (2001) 381, hep-th=0006035. [384] A. Fabbri, D.J. Navarro, J. Navarro-Salas, Phys. Rev. Lett. 85 (2000) 2434, hep-th=0004027. [385] M. Brigante, et al., JHEP 03 (2002) 005, hep-th=0202073. [386] S. Alexandrov, I. Grigentch, D. Vassilevich, Class. Quant. Grav. 15 (1998) 573, gr-qc=9705080. [387] O. Klein, Y. Nishina, Z. Phys. 52 (1929) 853. [388] G. ’t Hooft, Int. J. Mod. Phys. A 11 (1996) 4623, gr-qc=9607022. [389] D. Grumiller, Int. J. Mod. Phys. A 17 (2001) 989, hep-th=0111138. [390] S.W. Hawking, Phys. Rev. D 53 (1996) 3099, hep-th=9510029. [391] P. Fischer, Vertices in spherically reduced quantum gravity, Master’s Thesis, Vienna University of Technology, 2001. [392] T. Banks, Nucl. Phys. (Proc. Suppl.) 41 (1995) 21, hep-th=9412131. [393] Y. Aharonov, A. Casher, S. Nussinov, Phys. Lett. B 191 (1987) 51. [394] C.R. Stephens, G. ’t Hooft, B.F. Whiting, Class. Quant. Grav. 11 (1994) 621, gr-qc=9310006. [395] S.W. Hawking, Commun. Math. Phys. 87 (1982) 395. [396] T. Banks, L. Susskind, M.E. Peskin, Nucl. Phys. B 244 (1984) 125. [397] C. Misner, in: J. Klauder (Ed.), Magic without Magic: John Archibald Wheeler, A Collection of Essays in Honor of his 60th Birthday, Freeman, San Francisco, CA, 1972. [398] F. Lund, Phys. Rev. D 8 (1973) 3247. [399] D. Cangemi, R. Jackiw, Phys. Lett. B 299 (1993) 24, hep-th=9210036. [400] T. Strobl, Phys. Rev. D 50 (1994) 7346, hep-th=9403121. [401] J. Gegenberg, G. Kunstatter, T. 
Strobl, Phys. Rev. D 55 (1997) 7651, gr-qc=9612033. [402] D. Cangemi, R. Jackiw, Phys. Rev. Lett. 69 (1992) 233, hep-th=9203056. [403] D. Cangemi, R. Jackiw, Ann. Phys. 225 (1993) 229, hep-th=9302026. [404] D. Cangemi, R. Jackiw, Phys. Lett. B 337 (1994) 271, hep-th=9405119. [405] A. Mikovic, Phys. Lett. B 304 (1993) 70, hep-th=9211082. [406] A. Mikovic, Phys. Lett. B 355 (1995) 85, hep-th=9407104. [407] M. Varadarajan, Phys. Rev. D 52 (1995) 7080, gr-qc=9508039. [408] S.N. Vergeles, Sov. Phys. JETP Lett. 117 (2000) 3, gr-qc=0102001. [409] R. Beig, N. oV Murchadha, Ann. Phys. (N.Y.) 174 (1987) 463. [410] D. Boulware, S. Deser, J. Math. Phys. 8 (1967) 1468. [411] K. Kucha_r, Phys. Rev. D 39 (1989) 2263. [412] K. Kucha_r, Phys. Rev. D 39 (1989) 1579. [413] G. Watson, J.R. Klauder, J. Math. Phys. 41 (2000) 8072, quant-ph=0001026. [414] L.D. Faddeev, R. Jackiw, Phys. Rev. Lett. 60 (1988) 1692. [415] M. Cavaglia, V. de Alfaro, A.T. Filippov, Int. J. Mod. Phys. D 4 (1995) 661, gr-qc=9411070. [416] M. Cavaglia, V. de Alfaro, A.T. Filippov, Int. J. Mod. Phys. D 5 (1996) 227, gr-qc=9508062. [417] M. Cavaglia, V. de Alfaro, A.T. Filippov, Phys. Lett. B 424 (1998) 265, hep-th=9802158. [418] A. Barvinsky, S. Das, G. Kunstatter, Phys. Lett. B 517 (2001) 415, hep-th=0102061. [419] J. Bicak, K.V. Kucha_r, Phys. Rev. D 56 (1997) 4878, gr-qc=9704053. [420] P. Hajicek, I. Kouletsis, 2001, gr-qc=0112060. [421] P. Hajicek, I. Kouletsis, 2001, gr-qc=0112061. [422] I. Kouletsis, P. Hajicek, 2001, gr-qc=0112062. [423] C. Vaz, L. Witten, T.P. Singh, 2001, gr-qc=0112024. [424] P. Hajicek, 2002, gr-qc=0204049. [425] R. Casadio, Phys. Lett. B 511 (2001) 285, gr-qc=0102006.
430
D. Grumiller et al. / Physics Reports 369 (2002) 327 – 430
[426] P.B. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah–Singer Index Theorem, CRC Press, Boca Raton, FL, 1994. [427] G. Esposito, A. Kamenshchik, G. Pollifrone, Euclidean Quantum Gravity on Manifolds with Boundary, Kluwer Academic Publishers, Dordrecht, 1997. [428] I.G. Avramidi, Heat Kernel and Quantum Gravity, Springer, Berlin, 2000. [429] K. Kirsten, Spectral Functions in Mathematics and Physics, Chapman & Hall=CRC Press, Boca Raton, FL, 2001. [430] T.P. Branson, P.B. Gilkey, D.V. Vassilevich, J. Math. Phys. 39 (1998) 1040, hep-th=9702178. [431] A.E.M. van de Ven, Class. Quant. Grav. 15 (1998) 2311, hep-th=9708152. [432] B.S. DeWitt, Dynamical Theory of Groups and Fields, Gordon and Breach, New York, 1965. [433] V.A. Fock, Izv. USSR Acad. Sci. (Phys.) 4 –5 (1937) 551. [434] P.B. Gilkey, J. Di?erential Geom. 10 (1975) 601. [435] T.P. Branson, P.B. Gilkey, Commun. Partial Di?erential Equations 15 (1990) 245. [436] D.V. Vassilevich, J. Math. Phys. 36 (1995) 3174, gr-qc=9404052. [437] T.P. Branson, et al., Nucl. Phys. B 563 (1999) 603, hep-th=9906144. [438] C. Vaz, L. Witten, Nucl. Phys. B 487 (1997) 409. [439] W. Kummer, D.J. Schwarz, Nucl. Phys. B 382 (1992) 172.
Physics Reports 369 (2002) 431 – 548 www.elsevier.com/locate/physrep
Fundamentals of quantum information theory

Michael Keyl

TU-Braunschweig, Institute of Mathematical Physics, Mendelssohnstraße 3, D-38106 Braunschweig, Germany

Received 3 June 2002; editor: J. Eichler
Abstract

In this paper we give a self-contained introduction to the conceptual and mathematical foundations of quantum information theory. In the first part we introduce basic notions such as entanglement, channels and teleportation, together with their mathematical description. The second part presents the quantitative aspects of the theory. Topics discussed in this context include entanglement measures, channel capacities, relations between the two, additivity and continuity properties, and asymptotic rates of quantum operations. Finally, we give an overview of some recent developments and open questions. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 03.67.−a; 03.65.−w
E-mail address: [email protected] (M. Keyl).
0370-1573/02/$ - see front matter. PII: S0370-1573(02)00266-1

Contents

1. Introduction
1.1. What is quantum information?
1.2. Tasks of quantum information
1.3. Experimental realizations
2. Basic concepts
2.1. Systems, states and effects
2.1.1. Operator algebras
2.1.2. Quantum mechanics
2.1.3. Classical probability
2.1.4. Observables
2.2. Composite systems and entangled states
2.2.1. Tensor products
2.2.2. Compound and hybrid systems
2.2.3. Correlations and entanglement
2.2.4. Bell inequalities
2.3. Channels
2.3.1. Completely positive maps
2.3.2. The Stinespring theorem
2.3.3. The duality lemma
2.4. Separability criteria and positive maps
2.4.1. Positivity
2.4.2. The partial transpose
2.4.3. The reduction criterion
3. Basic examples
3.1. Entanglement
3.1.1. Maximally entangled states
3.1.2. Werner states
3.1.3. Isotropic states
3.1.4. OO-invariant states
3.1.5. PPT states
3.1.6. Multipartite states
3.2. Channels
3.2.1. Quantum channels
3.2.2. Channels under symmetry
3.2.3. Classical channels
3.2.4. Observables and preparations
3.2.5. Instruments and parameter-dependent operations
3.2.6. LOCC and separable channels
3.3. Quantum mechanics in phase space
3.3.1. Weyl operators and the CCR
3.3.2. Gaussian states
3.3.3. Entangled Gaussians
3.3.4. Gaussian channels
4. Basic tasks
4.1. Teleportation and dense coding
4.1.1. Impossible machines revisited: classical teleportation
4.1.2. Entanglement enhanced teleportation
4.1.3. Dense coding
4.2. Estimating and copying
4.2.1. Quantum state estimation
4.2.2. Approximate cloning
4.3. Distillation of entanglement
4.3.1. Distillation of pairs of qubits
4.3.2. Distillation of isotropic states
4.3.3. Bound entangled states
4.4. Quantum error correction
4.5. Quantum computing
4.5.1. The network model of classical computing
4.5.2. Computational complexity
4.5.3. Reversible computing
4.5.4. The network model of a quantum computer
4.5.5. Simon's problem
4.6. Quantum cryptography
5. Entanglement measures
5.1. General properties and definitions
5.1.1. Axiomatics
5.1.2. Pure states
5.1.3. Entanglement measures for mixed states
5.2. Two qubits
5.2.1. Pure states
5.2.2. EOF for Bell diagonal states
5.2.3. Wootters formula
5.2.4. Relative entropy for Bell diagonal states
5.3. Entanglement measures under symmetry
5.3.1. Entanglement of formation
5.3.2. Werner states
5.3.3. Isotropic states
5.3.4. OO-invariant states
5.3.5. Relative entropy of entanglement
6. Channel capacity
6.1. The general case
6.1.1. The definition
6.1.2. Simple calculations
6.2. The classical capacity
6.2.1. Classical channels
6.2.2. Quantum channels
6.2.3. Entanglement assisted capacity
6.2.4. Examples
6.3. The quantum capacity
6.3.1. Alternative definitions
6.3.2. Upper bounds and achievable rates
6.3.3. Relations to entanglement measures
7. Multiple inputs
7.1. The general scheme
7.1.1. Figures of merit
7.1.2. Covariant operations
7.1.3. Group representations
7.1.4. Distillation of entanglement
7.2. Optimal devices
7.2.1. Optimal cloning
7.2.2. Purification
7.2.3. Estimating pure states
7.2.4. The UNOT gate
7.3. Asymptotic behaviour
7.3.1. Estimating mixed state
7.3.2. Purification and cloning
References
1. Introduction

Quantum information and quantum computation have recently attracted a lot of interest. The promise of new technologies like safe cryptography and new “super computers”, capable of handling otherwise intractable problems, has excited not only researchers from many different fields like physics, mathematics and computer science, but also a large public audience. On a practical level all these new visions are based on the ability to control the quantum states of (a small number of)
M. Keyl / Physics Reports 369 (2002) 431 – 548
microsystems individually and to use them for information transmission and processing. From a more fundamental point of view the crucial point is a reconsideration of the foundations of quantum mechanics in an information theoretical context. The purpose of this work is to follow the second path and to guide physicists into the theoretical foundations of quantum information and some of the most relevant topics of current research.

To this end the outline of this paper is as follows: The rest of this introduction is devoted to a rough and informal overview of the field, discussing some of its tasks and experimental realizations. Afterwards, in Section 2, we will consider the basic formalism which is necessary to present more detailed results. Typical keywords in this context are: systems, states, observables, correlations, entanglement and quantum channels. We then clarify these concepts (in particular, entanglement and channels) with several examples in Section 3, and in Section 4 we discuss the most important tasks of quantum information in greater detail. The last three sections are devoted to a more quantitative analysis, where we make closer contact to current research: In Section 5 we will discuss how entanglement can be measured. Section 6 treats channel capacities, i.e. the amount of information which can maximally be transmitted over a noisy channel, and in Section 7 we consider state estimation, optimal cloning and related tasks.

Quantum information is a rapidly developing field and the present work can of course reflect only a small part of it. An incomplete list of other general sources the reader should consult is: the books of Lo [111], Gruska [76], Nielsen and Chuang [122], Bouwmeester et al. [23] and Alber et al. [3], the lecture notes of Preskill [130] and the collection of references by Cabello [37] which particularly contains many references to other reviews.

1.1. What is quantum information?
Classical information is, roughly speaking, everything which can be transmitted from a sender to a receiver with “letters” from a “classical alphabet”, e.g. the two digits “0” and “1” or any other finite set of symbols. In the context of classical information theory, it is completely irrelevant which type of physical system is used to perform the transmission. This abstract approach is successful because it is easy to transform information between different types of carriers like electric currents in a wire, laser pulses in an optical fiber, or symbols on a piece of paper without loss of data; and even if there are losses they are well understood and it is known how to deal with them. However, quantum information theory breaks with this point of view. It studies, loosely speaking, that kind of information (“quantum information”) which is transmitted by microparticles from a preparation device (sender) to a measuring apparatus (receiver) in a quantum mechanical experiment; in other words, the distinction between carriers of classical and quantum information becomes essential. This approach is justified by the observation that a lossless conversion of quantum information into classical information is in the above sense not possible. Therefore, quantum information is a new kind of information. In order to explain why there is no way from quantum to classical information and back, let us discuss what such a conversion would look like. To convert quantum to classical information we need a device which takes quantum systems as input and produces classical information as output; this is nothing other than a measuring apparatus. The converse translation from classical to quantum information can be rephrased similarly as “parameter-dependent preparation”, i.e. the classical input to such a device is used to control the state (and possibly the type of system) in
Fig. 1.1. Schematic representation of classical teleportation. Here and in the following diagrams a curly arrow stands for quantum systems and a straight one for the flow of classical information.
Fig. 1.2. A teleportation process should not affect the results of a statistical experiment with quantum systems. A more precise explanation of the diagram is given in the text.
which the microparticles should be prepared. A combination of these two elements can be done in two ways. Let us first consider a device which goes from classical to quantum to classical information. This is a possible task and in fact technically realized already. A typical example is the transmission of classical information via an optical fiber. The information transmitted through the fiber is carried by microparticles (photons) and is therefore quantum information (in the sense of our preliminary definition). To send classical information we first have to prepare photons in a certain state, send them through the channel and measure an appropriate observable at the output side. This is exactly the combination of a classical → quantum with a quantum → classical device just described.

The crucial point is now that the converse composition, performing the measurement $M$ first and the preparation $P$ afterwards (cf. Fig. 1.1), is more problematic. Such a process is called classical teleportation, if the particles produced by $P$ are “indistinguishable” from the input systems. We will show the impossibility of such a device via a hierarchy of other “impossible machines” which traces the problem back to the fundamental structure of quantum mechanics. This finally will prove our statement that quantum information is a new kind of information.$^1$

To start with, we have to clarify the precise meaning of “indistinguishable” in this context. This has to be done in a statistical way, because the only possibility to compare quantum mechanical systems is in terms of statistical experiments. Hence, we need an additional preparation device $P'$ and an additional measuring apparatus $M'$. Indistinguishable now means that it does not matter whether we perform $M'$ measurements directly on $P'$ outputs or whether we switch a teleportation device in between; cf. Fig. 1.2. In both cases we should get the same distribution of measuring results for a large number of repetitions of the corresponding experiment. This requirement should hold for any preparation $P'$ and any measurement $M'$, but for fixed $M$ and $P$. The latter means that we are not allowed to use a priori knowledge about $P'$ or $M'$ to adapt the teleportation process (otherwise we can choose in the most extreme case always $P'$ for $P$ and the whole discussion becomes meaningless).

$^1$ The following chain of arguments is taken from [168], where it is presented in greater detail. This concerns, in particular, the construction of Bell’s telephone from a joint measurement, which we have omitted here.
Fig. 1.3. Constructing a quantum copying machine from a teleportation device. Fig. 1.4. Constructing a joint measurement for the observables A and B from a quantum copying machine.
The second impossible machine we have to consider is a quantum copying machine. This is a device $C$ which takes one quantum system $p$ as input and produces two systems $p_1, p_2$ of the same type as output. The limiting condition on $C$ is that $p_1$ and $p_2$ are indistinguishable from the input, where “indistinguishable” has to be understood in the same way as above: Any statistical experiment performed with one of the output particles (i.e. always with $p_1$ or always with $p_2$) yields the same result as applied directly to the input $p$. To get such a device from teleportation is easy: We just have to perform an $M$ measurement on $p$, make two copies of the classical data obtained, and run the preparation $P$ on each of them; cf. Fig. 1.3. Hence if teleportation is possible copying is possible as well. According to the “no-cloning theorem” of Wootters and Zurek [173], however, a quantum copy machine does not exist and this basically concludes our proof. However, we will give an easy argument for this theorem in terms of a third impossible machine: a joint measuring device $M_{AB}$ for two arbitrary observables $A$ and $B$. This is a measuring apparatus which produces each time it is invoked a pair $(a, b)$ of classical outputs, where $a$ is a possible output of $A$ and $b$ a possible output of $B$. The crucial requirement for $M_{AB}$ again is of statistical nature: The statistics of the $a$ outcomes is the same as for device $A$, and similarly for $B$. It is known from elementary quantum mechanics that many quantum observables are not jointly measurable in this way. The most famous examples are position and momentum or different components of angular momentum. Nevertheless, a device $M_{AB}$ could be constructed for arbitrary $A$ and $B$ from a quantum copy machine $C$. We simply have to operate with $C$ on the input system $p$ producing two outputs $p_1$ and $p_2$ and to perform an $A$ measurement on $p_1$ and a $B$ measurement on $p_2$; cf. Fig. 1.4.
Since the outputs $p_1$, $p_2$ are, by assumption, indistinguishable from the input $p$, the overall device constructed this way would give a joint measurement for $A$ and $B$. Hence, a quantum copying machine cannot exist, as stated by the no-cloning theorem. This in turn implies that classical teleportation is impossible, and therefore we cannot transform quantum information losslessly into classical information and back. This concludes our chain of arguments.

1.2. Tasks of quantum information

So we have seen that quantum information is something new, but what can we do with it? There are three answers to this question which we want to present here. First of all let us remark that
in fact all information in a modern data processing environment is carried by microparticles (e.g. electrons or photons). Hence, quantum information comes automatically into play. Currently, it is safe to ignore this and to use classical information theory to describe all relevant processes. If the size of the structures on a typical circuit decreases below a certain limit, however, this is no longer true and quantum information will become relevant.

This leads us to the second answer. Although it is far too early to say which concrete technologies will emerge from quantum information in the future, several interesting proposals show that devices based on quantum information can solve certain practical tasks much better than classical ones. The best known and most exciting one is, without a doubt, quantum computing. The basic idea is, roughly speaking, that a quantum computer can operate not only on one number per register but on superpositions of numbers. This possibility leads to an “exponential speedup” for some computations, which makes feasible some problems that are considered intractable for any classical algorithm. This is most impressively demonstrated by Shor’s factoring algorithm [139,140]. A second example which is quite close to a concrete practical realization (i.e. outside the laboratory; see next section) is quantum cryptography. The fact that it is impossible to perform a quantum mechanical measurement without disturbing the state of the measured system is used here for the secure transmission of a cryptographic key (i.e. each eavesdropping attempt can be detected with certainty). Together with a subsequent application of a classical encryption method known as the “one-time pad” this leads to a cryptographic scheme with provable security, in contrast to currently used public key systems whose security relies on possibly doubtful assumptions about (pseudo) random number generators and prime numbers.
We will come back to both subjects, quantum computing and quantum cryptography, in Sections 4.5 and 4.6.

The third answer to the above question is of more fundamental nature. The discussion of questions from information theory in the context of quantum mechanics leads to a deeper and in many cases to a more quantitative understanding of quantum theory. Maybe the most relevant example for this statement is the study of entanglement, i.e. non-classical correlations between quantum systems, which lead to violations of the Bell inequalities.$^2$ Entanglement is a fundamental aspect of quantum mechanics and demonstrates the differences between quantum and classical physics in the most drastic way; this can be seen from Bell-type experiments, like the one of Aspect et al. [5], and the discussion about them. Nevertheless, for a long time it was only considered as an exotic feature of the foundations of quantum mechanics which is not so relevant from a practical point of view. Since quantum information attained broader interest, however, this has changed completely. It has turned out that entanglement is an essential resource whenever classical information processing is outperformed by quantum devices. One of the most remarkable examples is the experimental realization of “entanglement enhanced” teleportation [24,22]. We have argued in Section 1.1 that classical teleportation, i.e. transmission of quantum information through a classical information channel, is impossible. If sender and receiver share, however, an entangled pair of particles (which can be used as an additional resource) the impossible task becomes, most surprisingly, possible [11]! (We will discuss this fact in detail in Section 4.1.) The study of entanglement and in particular the question how it can be quantified is therefore a central topic within quantum information theory (cf. Section 5).
Further examples for fields where quantum information has led to a deeper and in particular more quantitative insight include “capacities” of quantum information channels and “quantum cloning”. A detailed discussion of these topics will be given in Sections 6 and 7. Finally, let us remark that classical information theory benefits in a similar way from the synthesis with quantum mechanics. Besides the just mentioned channel capacities this concerns, for example, the theory of computational complexity which analyzes the scaling behavior of time and space consumed by an algorithm as a function of the size of the input data. Quantum information challenges here, in particular, the fundamental Church–Turing hypothesis [45,152] which claims that each computation can be simulated “efficiently” on a Turing machine; we come back to this topic in Section 4.5.

$^2$ This is only a very rough characterization. A more precise one will be given in Section 2.2.

1.3. Experimental realizations

Although this is a theoretical paper, it is of course necessary to say something about experimental realizations of the ideas of quantum information. Let us consider quantum computing first. Whatever way we go here, we need systems which can be prepared very precisely in a few distinct states (i.e. we need “qubits”), which can be manipulated afterwards individually (we have to realize “quantum gates”) and which can finally be measured with an appropriate observable (we have to “read out” the result).

One of the most highly developed approaches to quantum computing is the ion trap technique (see Sections 4.3 and 5.3 in [23] and Section 7.6 of [122] for an overview and further references). A “quantum register” is realized here by a string of ions kept by electromagnetic fields in high vacuum inside a Paul trap, and two long-living states of each ion are chosen to represent “0” and “1”. A single ion can be manipulated by laser beams and this allows the implementation of all “one-qubit gates”. To get two-qubit gates as well (for a quantum computer we need at least one two-qubit gate together with all one-qubit operations; cf. Section 4.5) the collective motional state of the ions has to be used.
A “program” on an ion trap quantum computer starts with a preparation of the register in an initial state, usually the ground state of the ions. This is done by optical pumping and laser cooling (which is in fact one of the most difficult parts of the whole procedure, in particular if many ions are involved). Then the “network” of quantum gates is applied, in terms of a (complicated) sequence of laser pulses. The readout finally is done by laser beams which illuminate the ions one after another. The beams are tuned to a fast transition which affects only one of the qubit states and the fluorescent light is detected. Concrete implementations (see e.g. [118,102]) are currently restricted to two qubits; however, there is some hope that we will be able to control up to 10 or 12 qubits in the not too distant future.

A second quite successful technique is NMR quantum computing (see Section 5.4 of [23] and Section 7.7 of [122] together with the references therein for details). NMR stands for “nuclear magnetic resonance” and it is the study of transitions between Zeeman levels of an atomic nucleus in a magnetic field. The qubits are in this case different spin states of the nuclei in an appropriate molecule and quantum gates are realized by high-frequency oscillating magnetic fields in pulses of controlled duration. In contrast to ion traps, however, we do not use one molecule but a whole cup of liquid containing some $10^{20}$ of them. This causes a number of problems, concerning in particular the preparation of an initial state, fluctuations in the free time evolution of the molecules and the readout. There are several ways to overcome these difficulties and we refer the reader again to [23,122] for details. Concrete implementations of NMR quantum computers are capable of using up to five qubits [113]. Other realizations include the implementation of several known quantum algorithms on two and three qubits; see e.g. [44,96,109].
The fundamental problem of the two methods for quantum computation discussed so far is their lack of scalability. It is realistic to assume that NMR and ion-trap quantum computers with up to tens of qubits will exist at some point in the future, but not with the thousands of qubits which are necessary for “real-world” applications. There are, however, many alternative proposals available and some of them might be capable of avoiding this problem. The following is a small (not at all exhaustive) list: atoms in optical lattices [28], semiconductor nanostructures such as quantum dots (there are many works in this area, some recent ones are [149,30,21,29]) and arrays of Josephson junctions [112].

A second circle of experiments we want to mention here is grouped around quantum communication and quantum cryptography (for a more detailed overview let us refer to [163,69]). Realizations of quantum cryptography are fairly far developed and it is currently possible to cover distances of up to 50 km with optical fibers (e.g. [93]). Potentially greater distances can be bridged by “free space cryptography” where the quantum information is transmitted through the air (e.g. [34]). With this technology satellites can be used as some sort of “relays”, thus enabling quantum key distribution over arbitrary distances. In the meantime there are quite a lot of successful implementations. For a detailed discussion we refer the reader to the review of Gisin et al. [69] and the references therein. Other experiments concern the usage of entanglement in quantum communication. The creation and detection of entangled photons is here a fundamental building block. Nowadays this is no problem and the most famous experiment in this context is the one of Aspect et al. [5], where the maximal violation of Bell inequalities was demonstrated with polarization-correlated photons.
Another spectacular experiment is the creation of entangled photons over a distance of 10 km using standard telecommunication optical fibers by the Geneva group [151]. Among the most exciting applications of entanglement are the realization of entanglement based quantum key distribution [95], the first successful “teleportation” of a photon [24,22] and the implementation of “dense coding” [115]; cf. Section 4.1.
2. Basic concepts

After we have got a first, rough impression of the basic ideas and most relevant subjects of quantum information theory, let us start with a more detailed presentation. First, we have to introduce the fundamental notions of the theory and their mathematical description. Fortunately, much of the material we should have to present here, like Hilbert spaces, tensor products and density matrices, is known already from quantum mechanics and we can focus our discussion on those concepts which are less familiar, like POV measures, completely positive maps and entangled states.

2.1. Systems, states and effects

Like classical probability theory, quantum mechanics is a statistical theory. Hence, its predictions are of probabilistic nature and can only be tested if the same experiment is repeated very often and the relative frequencies of the outcomes are calculated. In more operational terms this means: The experiment has to be repeated according to the same procedure as it can be set out in a detailed laboratory manual. If we consider a somewhat idealized model of such a statistical experiment we get, in fact, two different types of procedures: first preparation procedures which prepare a certain
kind of physical system in a distinguished state and second registration procedures measuring a particular observable. A mathematical description of such a setup basically consists of two sets $\mathcal{S}$ and $\mathcal{E}$ and a map $\mathcal{S} \times \mathcal{E} \ni (\rho, A) \mapsto \rho(A) \in [0, 1]$. The elements of $\mathcal{S}$ describe the states, i.e. preparations, while the $A \in \mathcal{E}$ represent all yes/no measurements (effects) which can be performed on the system. The probability (i.e. the relative frequency for a large number of repetitions) to get the result “yes”, if we are measuring the effect $A$ on a system prepared in the state $\rho$, is given by $\rho(A)$. This is a very general scheme applicable not only to quantum mechanics but also to a very broad class of statistical models, containing, in particular, classical probability. In order to make use of it we have to specify, of course, the precise structure of the sets $\mathcal{S}$ and $\mathcal{E}$ and the map $\rho(A)$ for the types of systems we want to discuss.

2.1.1. Operator algebras

Throughout this paper we will encounter three different kinds of systems: Quantum and classical systems and hybrid systems which are half classical, half quantum (cf. Section 2.2.2). In this subsection we will describe a general way to define states and effects which is applicable to all three cases and which therefore provides a handy way to discuss all three cases simultaneously (this will become most useful in Sections 2.2 and 2.3). The scheme we are going to discuss is based on an algebra $\mathcal{A}$ of bounded operators acting on a Hilbert space $\mathcal{H}$. More precisely, $\mathcal{A}$ is a (closed) linear subspace of $\mathcal{B}(\mathcal{H})$, the algebra of bounded operators on $\mathcal{H}$, which contains the identity ($\mathbb{1} \in \mathcal{A}$) and is closed under products ($A, B \in \mathcal{A} \Rightarrow AB \in \mathcal{A}$) and adjoints ($A \in \mathcal{A} \Rightarrow A^* \in \mathcal{A}$). For simplicity we will refer to each such $\mathcal{A}$ as an observable algebra. The key observation is now that each type of system we will study in the following can be completely characterized by its observable algebra $\mathcal{A}$, i.e. once $\mathcal{A}$ is known there is a systematic way to derive the sets $\mathcal{S}$ and $\mathcal{E}$ and the map $(\rho, A) \mapsto \rho(A)$ from it. We frequently make use of this fact by referring to systems in terms of their observable algebra $\mathcal{A}$, or even by identifying them with their algebra and saying that $\mathcal{A}$ is the system.

Although $\mathcal{A}$ and $\mathcal{H}$ can be infinite dimensional in general, we will consider only finite-dimensional Hilbert spaces, as long as nothing else is explicitly stated. Since most research in quantum information is done up to now for finite-dimensional systems (the only exception in this work is the discussion of Gaussian systems in Section 3.3) this is not a too severe loss of generality. Hence we can choose $\mathcal{H} = \mathbb{C}^d$ and $\mathcal{B}(\mathcal{H})$ is just the algebra of complex $d \times d$ matrices. Since $\mathcal{A}$ is a subalgebra of $\mathcal{B}(\mathcal{H})$ it operates naturally on $\mathcal{H}$ and it inherits from $\mathcal{B}(\mathcal{H})$ the operator norm $\|A\| = \sup_{\|\psi\| = 1} \|A\psi\|$ and the operator ordering $A \geq B \Leftrightarrow \langle\psi, A\psi\rangle \geq \langle\psi, B\psi\rangle\ \forall \psi \in \mathcal{H}$. Now we can define

$$\mathcal{S}(\mathcal{A}) = \{\rho \in \mathcal{A}^* \mid \rho \geq 0,\ \rho(\mathbb{1}) = 1\}, \tag{2.1}$$

where $\mathcal{A}^*$ denotes the dual space of $\mathcal{A}$, i.e. the set of all linear functionals on $\mathcal{A}$, and $\rho \geq 0$ means $\rho(A) \geq 0\ \forall A \geq 0$. Elements of $\mathcal{S}(\mathcal{A})$ describe the states of the system in question while effects are given by

$$\mathcal{E}(\mathcal{A}) = \{A \in \mathcal{A} \mid A \geq 0,\ A \leq \mathbb{1}\}. \tag{2.2}$$

The probability to measure the effect $A$ in the state $\rho$ is $\rho(A)$. More generally, we can look at $\rho(A)$ for an arbitrary $A$ as the expectation value of $A$ in the state $\rho$. Hence, the idea behind Eq. (2.1) is to define states in terms of their expectation value functionals.
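As a rough numerical illustration of Eqs. (2.1) and (2.2), the following sketch checks the state and effect conditions for $2 \times 2$ matrices and evaluates the probability of a "yes" answer. It assumes that a state is represented by a matrix acting through the trace, $\rho(A) = \mathrm{tr}(\rho A)$; the helper names (`is_state`, `is_effect`, `probability`) and the sample matrices are illustrative choices, not taken from the text.

```python
import math

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), c]]."""
    a, b, c = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius

def trace_of_product(m, n):
    """tr(m n) for 2x2 matrices given as nested lists."""
    return sum(m[i][k] * n[k][i] for i in range(2) for k in range(2))

def is_state(rho, tol=1e-12):
    """Positivity and normalization, cf. Eq. (2.1): rho >= 0 and tr(rho) = 1."""
    lowest, _ = eigenvalues_2x2(rho)
    return lowest >= -tol and abs(rho[0][0] + rho[1][1] - 1) <= tol

def is_effect(a, tol=1e-12):
    """Effect condition, cf. Eq. (2.2): 0 <= A <= identity."""
    lowest, highest = eigenvalues_2x2(a)
    return lowest >= -tol and highest <= 1 + tol

def probability(rho, a):
    """rho(A) = tr(rho A): probability of the answer "yes" for the effect A."""
    return trace_of_product(rho, a).real

# Illustrative inputs (assumptions, not from the text).
rho = [[0.75, 0.25], [0.25, 0.25]]   # a mixed state of a two-level system
sharp = [[1, 0], [0, 0]]             # a projection, i.e. an ideal detector
fuzzy = [[0.9, 0], [0, 0]]           # the same detector with 90% efficiency

if __name__ == "__main__":
    print(is_state(rho), is_effect(sharp), is_effect(fuzzy))
    print(probability(rho, sharp), probability(rho, fuzzy))
```

The closed-form eigenvalue formula keeps the sketch free of external dependencies; for larger matrices a numerical eigenvalue routine would be the natural replacement.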
Both spaces are convex, i.e. $\rho, \sigma \in \mathcal{S}(\mathcal{A})$ and $0 \leq \lambda \leq 1$ implies $\lambda\rho + (1 - \lambda)\sigma \in \mathcal{S}(\mathcal{A})$, and similarly for $\mathcal{E}(\mathcal{A})$. The extremal points of $\mathcal{S}(\mathcal{A})$, respectively $\mathcal{E}(\mathcal{A})$, i.e. those elements which do not admit a proper convex decomposition ($x = \lambda y + (1 - \lambda) z \Rightarrow \lambda = 1$ or $\lambda = 0$ or $y = z = x$), play a distinguished role: The extremal points of $\mathcal{S}(\mathcal{A})$ are pure states and those of $\mathcal{E}(\mathcal{A})$ are the propositions of the system in question. The latter represent those effects which register a property with certainty, in contrast to non-extremal effects which admit some “fuzziness”. As a simple example for the latter consider a detector which registers particles not with certainty but only with a probability which is smaller than one.

Finally, let us note that the complete discussion of this section can be generalized easily to infinite-dimensional systems, if we replace $\mathcal{H} = \mathbb{C}^d$ by an infinite-dimensional Hilbert space (e.g. $\mathcal{H} = L^2(\mathbb{R})$). This would require, however, more material about $C^*$ algebras and measure theory than we want to use in this paper.

2.1.2. Quantum mechanics

For quantum mechanics we have

$$\mathcal{A} = \mathcal{B}(\mathcal{H}), \tag{2.3}$$

where we have chosen again $\mathcal{H} = \mathbb{C}^d$. The corresponding systems are called $d$-level systems, or qubits if $d = 2$ holds. To avoid clumsy notations we frequently write $\mathcal{S}(\mathcal{H})$ and $\mathcal{E}(\mathcal{H})$ instead of $\mathcal{S}[\mathcal{B}(\mathcal{H})]$ and $\mathcal{E}[\mathcal{B}(\mathcal{H})]$. From Eq. (2.2) we immediately see that an operator $A \in \mathcal{B}(\mathcal{H})$ is an effect iff it is positive and bounded from above by $\mathbb{1}$. An element $P \in \mathcal{E}(\mathcal{H})$ is a proposition iff $P$ is a projection operator ($P^2 = P$).

States are described in quantum mechanics usually by density matrices, i.e. positive and normalized trace class$^3$ operators. To make contact to the general definition in Eq. (2.1) note first that $\mathcal{B}(\mathcal{H})$ is a Hilbert space with the Hilbert–Schmidt scalar product $\langle A, B\rangle = \mathrm{tr}(A^* B)$. Hence, each linear functional $\rho \in \mathcal{B}(\mathcal{H})^*$ can be expressed in terms of a (trace class) operator $\tilde\rho$ by$^4$ $A \mapsto \rho(A) = \mathrm{tr}(\tilde\rho A)$. It is obvious that each $\tilde\rho$ defines a unique functional $\rho$. If we start on the other hand with $\rho$ we can recover the matrix elements of $\tilde\rho$ from $\rho$ by $\tilde\rho_{kj} = \mathrm{tr}(\tilde\rho\, |j\rangle\langle k|) = \rho(|j\rangle\langle k|)$, where $|j\rangle\langle k|$ denotes the canonical basis of $\mathcal{B}(\mathcal{H})$ (i.e. $(|j\rangle\langle k|)_{ab} = \delta_{ja}\delta_{kb}$). More generally, we get for $\psi, \phi \in \mathcal{H}$ the relation $\langle\phi, \tilde\rho\psi\rangle = \rho(|\psi\rangle\langle\phi|)$, where $|\psi\rangle\langle\phi|$ now denotes the rank one operator which maps $\eta \in \mathcal{H}$ to $\langle\phi, \eta\rangle\psi$. In the following we drop the $\sim$ and use the same symbol for the operator and the functional whenever confusion can be avoided. Due to the same abuse of language we will interpret elements of $\mathcal{B}(\mathcal{H})^*$ frequently as (trace class) operators instead of linear functionals (and write $\mathrm{tr}(\rho A)$ instead of $\rho(A)$). However, we do not identify $\mathcal{B}(\mathcal{H})^*$ with $\mathcal{B}(\mathcal{H})$ in general, because the two different notations help to keep track of the distinction between spaces of states and spaces of observables. In addition, we equip $\mathcal{B}^*(\mathcal{H})$ with the trace-norm $\|\rho\|_1 = \mathrm{tr}|\rho|$ instead of the operator norm.

Positivity of the functional $\rho$ implies positivity of the operator $\rho$ due to $0 \leq \rho(|\psi\rangle\langle\psi|) = \langle\psi, \rho\psi\rangle$ and the same holds for normalization: $1 = \rho(\mathbb{1}) = \mathrm{tr}(\rho)$. Hence, we can identify the state space from Eq. (2.1) with the set of density matrices, as expected for quantum mechanics. Pure states of a quantum system are the one-dimensional projectors. As usual, we will frequently identify the density matrix $|\psi\rangle\langle\psi|$ with the wave function $\psi$ and call the latter in abuse of language a state.

$^3$ On a finite-dimensional Hilbert space this attribute is of course redundant, since each operator is of trace class in this case. Nevertheless, we will frequently use this terminology, due to greater consistency with the infinite-dimensional case.

$^4$ If we consider infinite-dimensional systems this is not true. In this case the dual space of the observable algebra is much larger and Eq. (2.1) leads to states which are not necessarily given by trace class operators. Such “singular states” play an important role in theories which admit an infinite number of degrees of freedom like quantum statistics and quantum field theory; cf. [25,26]. For applications of singular states within quantum information see [97].

To get a useful parameterization of the state space consider again the Hilbert–Schmidt scalar product
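$\langle\rho, \sigma\rangle = \mathrm{tr}(\rho^*\sigma)$, but now on $\mathcal{B}^*(\mathcal{H})$. The space of trace free matrices in $\mathcal{B}^*(\mathcal{H})$ (alternatively the functionals with $\rho(\mathbb{1}) = 0$) is the corresponding orthocomplement $\mathbb{1}^\perp$ of the unit operator. If we choose a basis $\sigma_1, \ldots, \sigma_{d^2-1}$ with $\langle\sigma_j, \sigma_k\rangle = 2\delta_{jk}$ in $\mathbb{1}^\perp$ we can write each self-adjoint (trace class) operator $\rho$ with $\mathrm{tr}(\rho) = 1$ as

$$\rho = \frac{\mathbb{1}}{d} + \frac{1}{2}\sum_{j=1}^{d^2-1} x_j \sigma_j =: \frac{\mathbb{1}}{d} + \frac{1}{2}\,\vec{x}\cdot\vec{\sigma} \quad \text{with } \vec{x} \in \mathbb{R}^{d^2-1}. \tag{2.4}$$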
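The parameterization (2.4) can be made concrete for $d = 2$. The sketch below takes the Pauli matrices as the basis $\sigma_j$ (they are trace free and satisfy $\mathrm{tr}(\sigma_j\sigma_k) = 2\delta_{jk}$), reads off the coordinates as $x_j = \mathrm{tr}(\rho\sigma_j)$, and rebuilds $\rho$ from them. The function names and the sample state are illustrative assumptions.

```python
# Pauli matrices: trace-free 2x2 matrices with tr(s_j s_k) = 2 delta_jk.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def trace_of_product(m, n):
    """tr(m n) for 2x2 matrices given as nested lists."""
    return sum(m[i][k] * n[k][i] for i in range(2) for k in range(2))

def bloch_vector(rho):
    """Coordinates x_j = tr(rho sigma_j) of Eq. (2.4) for d = 2."""
    return [trace_of_product(rho, s).real for s in PAULI]

def from_bloch_vector(x):
    """rho = identity/2 + (1/2) sum_j x_j sigma_j, i.e. Eq. (2.4) for d = 2."""
    rho = [[0.5, 0], [0, 0.5]]
    for xj, s in zip(x, PAULI):
        for i in range(2):
            for k in range(2):
                rho[i][k] = rho[i][k] + 0.5 * xj * s[i][k]
    return rho

def purity(rho):
    """tr(rho^2), which equals 1/2 + |x|^2 / 2 for d = 2."""
    return trace_of_product(rho, rho).real

# Illustrative input (an assumption, not from the text).
rho = [[0.75, 0.25], [0.25, 0.25]]

if __name__ == "__main__":
    x = bloch_vector(rho)
    print(x, purity(rho), from_bloch_vector(x))
```

For this sample state the squared length of $\vec{x}$ is below one, so the state is mixed; a one-dimensional projector such as `[[1, 0], [0, 0]]` gives a vector of unit length.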
If $d = 2$ or $d = 3$ holds, it is most natural to choose the Pauli matrices, respectively the Gell–Mann matrices (cf. e.g. [48], Section 13.4), for the $\sigma_j$. In the qubit case it is easy to see that $\rho \geq 0$ holds iff $|\vec{x}| \leq 1$. Hence the state space $\mathcal{S}(\mathbb{C}^2)$ coincides with the Bloch ball $\{\vec{x} \in \mathbb{R}^3 \mid |\vec{x}| \leq 1\}$, and the set of pure states with its boundary, the Bloch sphere $\{\vec{x} \in \mathbb{R}^3 \mid |\vec{x}| = 1\}$. This shows in a very geometric way that the pure states are the extremal points of the convex set $\mathcal{S}(\mathcal{H})$. If $\rho$ is more generally a pure state of a $d$-level system we get

$$1 = \mathrm{tr}(\rho^2) = \frac{1}{d} + \frac{1}{2}|\vec{x}|^2 \ \Rightarrow\ |\vec{x}| = \sqrt{2(1 - 1/d)}. \tag{2.5}$$

This implies that all states are contained in the ball with radius $2^{1/2}(1 - 1/d)^{1/2}$; however, not all operators in this set are positive. A simple example is $d^{-1}\mathbb{1} \pm \tfrac{1}{2}\, 2^{1/2}(1 - 1/d)^{1/2}\sigma_j$, which is positive only if $d = 2$ holds.

2.1.3. Classical probability

Since the difference between classical and quantum systems is an important issue in this work let us reformulate classical probability theory according to the general scheme from Section 2.1.1. The restriction to finite-dimensional observable algebras leads now to the assumption that all systems we are considering admit a finite set $X$ of elementary events. Typical examples are: throwing a die, $X = \{1, \ldots, 6\}$, tossing a coin, $X = \{\text{“head”}, \text{“tail”}\}$, or classical bits, $X = \{0, 1\}$. To simplify the notations we write (as in quantum mechanics) $\mathcal{S}(X)$ and $\mathcal{E}(X)$ for the spaces of states and effects. The observable algebra $\mathcal{A}$ of such a system is the space

$$\mathcal{A} = \mathcal{C}(X) = \{f : X \to \mathbb{C}\} \tag{2.6}$$

of complex-valued functions on $X$. To interpret this as an operator algebra acting on a Hilbert space $\mathcal{H}$ (as indicated in Section 2.1.1) choose an arbitrary but fixed orthonormal basis $|x\rangle$, $x \in X$ in $\mathcal{H}$ and identify the function $f \in \mathcal{C}(X)$ with the operator $f = \sum_x f_x |x\rangle\langle x| \in \mathcal{B}(\mathcal{H})$ (we use the same symbol for the function and the operator, provided confusion can be avoided). Most frequently we have $X = \{1, \ldots, d\}$ and we can choose $\mathcal{H} = \mathbb{C}^d$ and the canonical basis for $|x\rangle$. Hence, $\mathcal{C}(X)$ becomes the algebra of diagonal $d \times d$ matrices. Using Eq. (2.2) we immediately see that $f \in \mathcal{C}(X)$ is an effect iff $0 \leq f_x \leq 1\ \forall x \in X$. Physically, we can interpret $f_x$ as the probability that the effect $f$ registers the elementary event $x$. This makes the distinction between propositions and “fuzzy” effects very transparent: $P \in \mathcal{E}(X)$ is a proposition iff we have either $P_x = 1$ or $P_x = 0$ for all $x \in X$. Hence, the propositions $P \in \mathcal{C}(X)$ are in one-to-one correspondence with the subsets $\omega_P = \{x \in X \mid P_x = 1\} \subset X$
M. Keyl / Physics Reports 369 (2002) 431 – 548
which in turn describe the events of the system. Hence, $P$ registers the event $\omega_P$ with certainty, while a fuzzy effect $f < P$ does this only with a probability less than one.

Since $\mathcal{C}(X)$ is finite dimensional and admits the distinguished basis $|x\rangle\langle x|$, $x\in X$, it is naturally isomorphic to its dual $\mathcal{C}^*(X)$. More precisely: each linear functional $\rho\in\mathcal{C}^*(X)$ defines and is uniquely defined by the function $x\mapsto\rho_x = \rho(|x\rangle\langle x|)$ and we have $\rho(f) = \sum_x f_x\rho_x$. As in the quantum case we will identify the function $\rho$ with the linear functional and use the same symbol for both, although we keep the notation $\mathcal{C}^*(X)$ to indicate that we are talking about states rather than observables.

Positivity of $\rho\in\mathcal{C}^*(X)$ is given by $\rho_x \ge 0$ for all $x$, and normalization leads to $1 = \rho(\mathbb{1}) = \rho\big(\sum_x |x\rangle\langle x|\big) = \sum_x\rho_x$. Hence to be a state $\rho\in\mathcal{C}^*(X)$ must be a probability distribution on $X$, and $\rho_x$ is the probability that the elementary event $x$ occurs during statistical experiments with systems in the state $\rho$. More generally $\rho(f) = \sum_j\rho_j f_j$ is the probability to measure the effect $f$ on systems in the state $\rho$. If $P$ is, in particular, a proposition, $\rho(P)$ gives the probability for the event $\omega_P$. The pure states of the system are the Dirac measures $\delta_x$, $x\in X$, with $\delta_x(|y\rangle\langle y|) = \delta_{xy}$. Hence, each $\rho\in\mathcal{S}(X)$ can be decomposed in a unique way into a convex linear combination of pure states.

2.1.4. Observables

Up to now we have discussed only effects, i.e. yes/no experiments. In this subsection we will have a first short look at more general observables. We will come back to this topic in Section 3.2.4 after we have introduced channels. We can think of an observable $E$ taking its values in a finite set $X$ as a map which associates to each possible outcome $x\in X$ the effect $E_x\in\mathcal{E}(\mathcal{A})$ (if $\mathcal{A}$ is the observable algebra of the system in question) which is true if $x$ is measured and false otherwise. If the measurement is performed on systems in the state $\rho$ we get for each $x\in X$ the probability $p_x = \rho(E_x)$ to measure $x$. Hence, the family of the $p_x$ should be a probability distribution on $X$, and this implies that $E$ should be a positive operator-valued measure (POV measure) on $X$.⁵

Definition 2.1. Consider an observable algebra $\mathcal{A}\subset\mathcal{B}(\mathcal{H})$ and a finite set $X$. A family $E = (E_x)_{x\in X}$ of effects in $\mathcal{A}$ (i.e. $0 \le E_x \le \mathbb{1}$) is called a POV measure on $X$ if $\sum_{x\in X} E_x = \mathbb{1}$ holds. If all $E_x$ are projections, $E$ is called a projection-valued measure (PV measure).
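The conditions of Definition 2.1 can be checked directly for a small example. The sketch below uses the three-outcome "trine" measurement on a qubit, which is our own choice of illustration rather than an example from the text; the function names are hypothetical, and the matrices are real symmetric so that plain floats suffice.

```python
import math

def trine_effect(k):
    """Effect E_k = (1/3)(1 + n_k . sigma), n_k in the x-z plane at angle 2*pi*k/3."""
    theta = 2 * math.pi * k / 3
    nx, nz = math.sin(theta), math.cos(theta)
    return [[(1 + nz) / 3, nx / 3],
            [nx / 3, (1 - nz) / 3]]

def is_effect(E, tol=1e-12):
    """Check 0 <= E <= 1 for a real symmetric 2x2 matrix via its eigenvalues."""
    tr = E[0][0] + E[1][1]
    det = E[0][0] * E[1][1] - E[0][1] * E[1][0]
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    lo, hi = tr / 2 - disc, tr / 2 + disc
    return lo >= -tol and hi <= 1 + tol

def is_projection(E, tol=1e-12):
    """Check E * E == E entry by entry."""
    sq = [[sum(E[a][c] * E[c][b] for c in range(2)) for b in range(2)]
          for a in range(2)]
    return all(abs(sq[a][b] - E[a][b]) < tol for a in range(2) for b in range(2))

def probabilities(rho, effects):
    """Outcome probabilities p_x = tr(rho E_x)."""
    return [sum(rho[a][b] * E[b][a] for a in range(2) for b in range(2))
            for E in effects]

effects = [trine_effect(k) for k in range(3)]
# Sum of all effects; for a POV measure this must be the identity.
total = [[sum(E[a][b] for E in effects) for b in range(2)] for a in range(2)]
rho = [[1.0, 0.0], [0.0, 0.0]]      # the pure state |0><0|
p = probabilities(rho, effects)
```

Each $E_k$ here is an effect but not a projection, so the family is a POV measure that is not a PV measure, and the numbers `p` form a probability distribution on the three outcomes.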
From basic quantum mechanics we know that observables are described by self-adjoint operators on a Hilbert space $\mathcal{H}$. But how does this point of view fit into the previous definition? The answer is given by the spectral theorem [134, Theorem VIII.6]: Each self-adjoint operator $A$ on a finite-dimensional Hilbert space $\mathcal{H}$ has the form $A = \sum_{\lambda\in\sigma(A)}\lambda P_\lambda$, where $\sigma(A)$ denotes the spectrum of $A$, i.e. the set of eigenvalues, and $P_\lambda$ denotes the projection onto the corresponding eigenspace. Hence, there is a unique PV measure $P = (P_\lambda)_{\lambda\in\sigma(A)}$ associated to $A$ which is called the spectral measure of $A$. It is uniquely characterized by the property that the expectation value $\sum_\lambda\lambda\,\rho(P_\lambda)$ of $P$ in the state $\rho$ is given for any state $\rho$ by $\rho(A) = \operatorname{tr}(\rho A)$, as is well known from quantum mechanics. Hence, the traditional way to define observables within quantum mechanics perfectly fits into the scheme just outlined; however, it only covers the projection-valued case and therefore admits no fuzziness. For this reason POV measures are sometimes called generalized observables.

⁵ This is of course an artificial restriction and in many situations not justified (cf. in particular the discussion of quantum state estimation in Section 4.2 and Section 7). However, it helps us to avoid measure-theoretical subtleties; cf. Holevo's book [79] for a more general discussion.
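As a worked instance of the spectral measure of a self-adjoint operator, consider the Pauli matrix $\sigma_x$ on $\mathbb{C}^2$. The example and the choice of state $\rho = |0\rangle\langle 0|$ are ours and serve only as an illustration:

```latex
\begin{aligned}
\sigma_x &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
          = (+1)\,P_{+1} + (-1)\,P_{-1},
\qquad
P_{\pm 1} = \frac{1}{2}\begin{pmatrix} 1 & \pm 1 \\ \pm 1 & 1 \end{pmatrix}, \\
P_{+1} + P_{-1} &= \mathbb{1}, \qquad P_{+1}P_{-1} = 0, \qquad P_{\pm 1}^2 = P_{\pm 1}, \\
\rho = |0\rangle\langle 0| :\quad
\rho(P_{\pm 1}) &= \tfrac{1}{2}, \qquad
\sum_{\lambda}\lambda\,\rho(P_\lambda) = \tfrac{1}{2} - \tfrac{1}{2} = 0
  = \operatorname{tr}(\rho\,\sigma_x).
\end{aligned}
```

The two projections form a PV measure on the outcome set $\{+1,-1\}$, and the weighted sum of their probabilities reproduces the usual expectation value.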
Finally, note that the eigenprojections $P_\lambda$ of $A$ are elements of an observable algebra $\mathcal{A}$ iff $A\in\mathcal{A}$. This shows two things: First of all we can consider self-adjoint elements of any *-subalgebra $\mathcal{A}$ of $\mathcal{B}(\mathcal{H})$ as observables of $\mathcal{A}$-systems, and this is precisely the reason why we have called $\mathcal{A}$ observable algebra. Secondly, we see why it is essential that $\mathcal{A}$ is really a subalgebra of $\mathcal{B}(\mathcal{H})$: if it is only a linear subspace of $\mathcal{B}(\mathcal{H})$ the relation $A\in\mathcal{A}$ does not imply $P_\lambda\in\mathcal{A}$.

2.2. Composite systems and entangled states

Composite systems occur in many places in quantum information theory. A typical example is a register of a quantum computer, which can be regarded as a system consisting of $N$ qubits (if $N$ is the length of the register). The crucial point is that this opens the possibility for correlations and entanglement between subsystems. In particular, entanglement is of great importance, because it is a central resource in many applications of quantum information theory like entanglement-enhanced teleportation or quantum computing; we already discussed this in Section 1.2 of the Introduction. To explain entanglement in greater detail and to introduce some necessary formalism we have to complement the scheme developed in the last section by a procedure which allows us to construct states and observables of the composite system from its subsystems. In quantum mechanics this is done, of course, in terms of tensor products, and we will review in the following some of the most relevant material.

2.2.1. Tensor products

Consider two (finite-dimensional) Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$. To each pair of vectors $\psi_1\in\mathcal{H}$, $\psi_2\in\mathcal{K}$ we can associate a bilinear form $\psi_1\otimes\psi_2$, called the tensor product of $\psi_1$ and $\psi_2$, by $\psi_1\otimes\psi_2(\phi_1,\phi_2) = \langle\psi_1,\phi_1\rangle\langle\psi_2,\phi_2\rangle$. For two product vectors $\psi_1\otimes\psi_2$ and $\eta_1\otimes\eta_2$ their scalar product is defined by $\langle\psi_1\otimes\psi_2,\eta_1\otimes\eta_2\rangle = \langle\psi_1,\eta_1\rangle\langle\psi_2,\eta_2\rangle$, and it can be shown that this definition extends in a unique way to the span of all $\psi_1\otimes\psi_2$, which therefore defines the tensor product $\mathcal{H}\otimes\mathcal{K}$.
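If we have more than two Hilbert spaces $\mathcal{H}_j$, $j = 1,\dots,N$, their tensor product $\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_N$ can be defined similarly.

The tensor product $A_1\otimes A_2$ of two bounded operators $A_1\in\mathcal{B}(\mathcal{H})$, $A_2\in\mathcal{B}(\mathcal{K})$ is defined first for product vectors $\psi_1\otimes\psi_2\in\mathcal{H}\otimes\mathcal{K}$ by $A_1\otimes A_2(\psi_1\otimes\psi_2) = (A_1\psi_1)\otimes(A_2\psi_2)$ and then extended by linearity. The space $\mathcal{B}(\mathcal{H}\otimes\mathcal{K})$ coincides with the span of all $A_1\otimes A_2$. If $\rho\in\mathcal{B}(\mathcal{H}\otimes\mathcal{K})$ is not of product form (and of trace class for infinite-dimensional $\mathcal{H}$ and $\mathcal{K}$) there is nevertheless a way to define "restrictions" to $\mathcal{H}$, respectively $\mathcal{K}$, called the partial trace of $\rho$. It is defined by the equation
$$\operatorname{tr}[\operatorname{tr}_{\mathcal{K}}(\rho)A] = \operatorname{tr}(\rho\,A\otimes\mathbb{1}) \quad \forall A\in\mathcal{B}(\mathcal{H}), \tag{2.7}$$
where the trace on the left-hand side is over $\mathcal{H}$ and on the right-hand side over $\mathcal{H}\otimes\mathcal{K}$.

If two orthonormal bases $\phi_1,\dots,\phi_n$ and $\psi_1,\dots,\psi_m$ are given in $\mathcal{H}$, respectively $\mathcal{K}$, we can consider the product basis $\phi_1\otimes\psi_1,\dots,\phi_n\otimes\psi_m$ in $\mathcal{H}\otimes\mathcal{K}$, and we can expand each $\Psi\in\mathcal{H}\otimes\mathcal{K}$ as $\Psi = \sum_{jk}\Psi_{jk}\,\phi_j\otimes\psi_k$ with $\Psi_{jk} = \langle\phi_j\otimes\psi_k,\Psi\rangle$. This procedure works for an arbitrary number of tensor factors. However, if we have exactly a twofold tensor product, there is a more economic way to expand $\Psi$, called Schmidt decomposition, in which only diagonal terms of the form $\phi_j\otimes\psi_j$ appear.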
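The defining property (2.7) of the partial trace can be verified numerically for a small case. The following plain-Python sketch assumes the index convention that the product basis vector $\phi_j\otimes\psi_k$ sits at position `j * dim_k + k`; this convention and all function names are our own illustrative choices.

```python
import math

def outer(psi):
    """Density matrix |psi><psi| of a vector given as a list of complex amplitudes."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def partial_trace_K(rho, dim_h, dim_k):
    """Partial trace over the second factor K of an operator on H (x) K."""
    return [[sum(rho[j * dim_k + k][l * dim_k + k] for k in range(dim_k))
             for l in range(dim_h)]
            for j in range(dim_h)]

def kron_with_identity(a, dim_k):
    """Return the matrix of A (x) 1_K for an operator A on H."""
    dim_h = len(a)
    n = dim_h * dim_k
    out = [[0j] * n for _ in range(n)]
    for j in range(dim_h):
        for l in range(dim_h):
            for k in range(dim_k):
                out[j * dim_k + k][l * dim_k + k] = a[j][l]
    return out

def trace_product(a, b):
    """tr(a b) for two square matrices of equal size."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
bell = [s + 0j, 0j, 0j, s + 0j]          # (|00> + |11>)/sqrt(2)
rho = outer(bell)
reduced = partial_trace_K(rho, 2, 2)     # restriction of rho to the first qubit
```

For this vector the restriction is the maximally mixed state $\mathbb{1}/2$, although the state of the pair is pure; comparing `trace_product(reduced, A)` with `trace_product(rho, kron_with_identity(A, 2))` for any $2\times 2$ matrix `A` checks Eq. (2.7).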
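Proposition 2.2. For each element $\Psi$ of the twofold tensor product $\mathcal{H}\otimes\mathcal{K}$ there are orthonormal systems $\phi_j$, $j = 1,\dots,n$ and $\psi_k$, $k = 1,\dots,n$ (not necessarily bases; i.e. $n$ can be smaller than $\dim\mathcal{H}$ and $\dim\mathcal{K}$) of $\mathcal{H}$ and $\mathcal{K}$, respectively, such that $\Psi = \sum_j\sqrt{\lambda_j}\,\phi_j\otimes\psi_j$ holds. The $\phi_j$ and $\psi_j$ are uniquely determined by $\Psi$. The expansion is called Schmidt decomposition and the numbers $\sqrt{\lambda_j}$ are the Schmidt coefficients.

Proof. Consider the partial trace $\rho_1 = \operatorname{tr}_{\mathcal{K}}(|\Psi\rangle\langle\Psi|)$ of the one-dimensional projector $|\Psi\rangle\langle\Psi|$ associated to $\Psi$. It can be decomposed in terms of its eigenvectors $\phi_n$ and we get $\operatorname{tr}_{\mathcal{K}}(|\Psi\rangle\langle\Psi|) = \rho_1 = \sum_n\lambda_n|\phi_n\rangle\langle\phi_n|$. Now we can choose an orthonormal basis $\psi'_k$, $k = 1,\dots,m$ in $\mathcal{K}$ and expand $\Psi$ with respect to $\phi_j\otimes\psi'_k$. Carrying out the $k$ summation we get a family of vectors $\psi''_j = \sum_k\langle\phi_j\otimes\psi'_k,\Psi\rangle\,\psi'_k$ with the property $\Psi = \sum_j\phi_j\otimes\psi''_j$. Now we can calculate the partial trace and get for any $A\in\mathcal{B}(\mathcal{H})$:
$$\sum_j\lambda_j\langle\phi_j,A\phi_j\rangle = \operatorname{tr}(\rho_1 A) = \langle\Psi,(A\otimes\mathbb{1})\Psi\rangle = \sum_{j,k}\langle\phi_j,A\phi_k\rangle\langle\psi''_j,\psi''_k\rangle. \tag{2.8}$$
Since $A$ is arbitrary we can compare the left- and right-hand side of this equation term by term and we get $\langle\psi''_j,\psi''_k\rangle = \delta_{jk}\lambda_j$. Hence, $\psi_j = \lambda_j^{-1/2}\psi''_j$ is the desired orthonormal system.

As an immediate application of this result we can show that each mixed state $\rho\in\mathcal{B}^*(\mathcal{H})$ (of the quantum system $\mathcal{B}(\mathcal{H})$) can be regarded as a pure state on a larger Hilbert space $\mathcal{H}\otimes\mathcal{H}'$. We just have to consider the eigenvalue expansion $\rho = \sum_j\lambda_j|\phi_j\rangle\langle\phi_j|$ of $\rho$ and to choose an arbitrary orthonormal system $\psi_j$, $j = 1,\dots,n$ in $\mathcal{H}'$. Using Proposition 2.2 we get

Corollary 2.3. Each state $\rho\in\mathcal{B}^*(\mathcal{H})$ can be extended to a pure state $\Psi$ on a larger system with Hilbert space $\mathcal{H}\otimes\mathcal{H}'$ such that $\operatorname{tr}_{\mathcal{H}'}|\Psi\rangle\langle\Psi| = \rho$ holds.

2.2.2. Compound and hybrid systems

To discuss the composition of two arbitrary (i.e. classical or quantum) systems it is very convenient to use the scheme developed in Section 2.1.1 and to talk about the two subsystems in terms of their observable algebras $\mathcal{A}\subset\mathcal{B}(\mathcal{H})$ and $\mathcal{B}\subset\mathcal{B}(\mathcal{K})$. The observable algebra of the composite system is then simply given by the tensor product of $\mathcal{A}$ and $\mathcal{B}$, i.e.
$$\mathcal{A}\otimes\mathcal{B} := \operatorname{span}\{A\otimes B \mid A\in\mathcal{A},\ B\in\mathcal{B}\} \subset \mathcal{B}(\mathcal{H}\otimes\mathcal{K}). \tag{2.9}$$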
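The dual of $\mathcal{A}\otimes\mathcal{B}$ is generated by product states, $(\rho\otimes\sigma)(A\otimes B) = \rho(A)\sigma(B)$, and we therefore write $\mathcal{A}^*\otimes\mathcal{B}^*$ for $(\mathcal{A}\otimes\mathcal{B})^*$. The interpretation of the composed system $\mathcal{A}\otimes\mathcal{B}$ in terms of states and effects is straightforward and therefore postponed to the next subsection. We will consider first the special cases arising from different choices for $\mathcal{A}$ and $\mathcal{B}$. If both systems are quantum ($\mathcal{A} = \mathcal{B}(\mathcal{H})$ and $\mathcal{B} = \mathcal{B}(\mathcal{K})$) we get
$$\mathcal{B}(\mathcal{H})\otimes\mathcal{B}(\mathcal{K}) = \mathcal{B}(\mathcal{H}\otimes\mathcal{K}) \tag{2.10}$$
as expected. For two classical systems $\mathcal{A} = \mathcal{C}(X)$ and $\mathcal{B} = \mathcal{C}(Y)$ recall that elements of $\mathcal{C}(X)$ (respectively, $\mathcal{C}(Y)$) are complex-valued functions on $X$ (on $Y$). Hence, the tensor product $\mathcal{C}(X)\otimes\mathcal{C}(Y)$ consists of complex-valued functions on $X\times Y$, i.e. $\mathcal{C}(X)\otimes\mathcal{C}(Y) = \mathcal{C}(X\times Y)$. In other words, states and observables of the composite system $\mathcal{C}(X)\otimes\mathcal{C}(Y)$ are, in accordance with classical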
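probability theory, given by probability distributions and random variables on the Cartesian product $X\times Y$.

If only one subsystem is classical and the other is quantum, e.g. a microparticle interacting with a classical measuring device, we have a hybrid system. The elements of its observable algebra $\mathcal{C}(X)\otimes\mathcal{B}(\mathcal{H})$ can be regarded as operator-valued functions on $X$, i.e. $X\ni x\mapsto A_x\in\mathcal{B}(\mathcal{H})$, and $A$ is an effect iff $0\le A_x\le\mathbb{1}$ holds for all $x\in X$. The elements of the dual $\mathcal{C}^*(X)\otimes\mathcal{B}^*(\mathcal{H})$ are in a similar way $\mathcal{B}^*(\mathcal{H})$-valued functions $X\ni x\mapsto\rho_x\in\mathcal{B}^*(\mathcal{H})$, and $\rho$ is a state iff each $\rho_x$ is a positive trace class operator on $\mathcal{H}$ and $\sum_x\operatorname{tr}(\rho_x) = 1$. The probability to measure the effect $A$ in the state $\rho$ is $\sum_x\rho_x(A_x)$.

2.2.3. Correlations and entanglement

Let us now consider two effects $A\in\mathcal{A}$ and $B\in\mathcal{B}$; then $A\otimes B$ is an effect of the composite system $\mathcal{A}\otimes\mathcal{B}$. It is interpreted as the joint measurement of $A$ on the first and $B$ on the second subsystem, where the "yes" outcome means "both effects give yes". In particular, $A\otimes\mathbb{1}$ means to measure $A$ on the first subsystem and to ignore the second one completely. If $\rho$ is a state of $\mathcal{A}\otimes\mathcal{B}$ we can define its restrictions by $\rho^{\mathcal{A}}(A) = \rho(A\otimes\mathbb{1})$ and $\rho^{\mathcal{B}}(B) = \rho(\mathbb{1}\otimes B)$. If both systems are quantum the restrictions of $\rho$ are the partial traces, while in the classical case we have to sum over the $\mathcal{B}$, respectively $\mathcal{A}$, variables. For two states $\rho_1\in\mathcal{S}(\mathcal{A})$ and $\rho_2\in\mathcal{S}(\mathcal{B})$ there is always a state $\rho$ of $\mathcal{A}\otimes\mathcal{B}$ such that $\rho_1 = \rho^{\mathcal{A}}$ and $\rho_2 = \rho^{\mathcal{B}}$ holds: We just have to choose the product state $\rho_1\otimes\rho_2$. However, in general, we have $\rho\ne\rho^{\mathcal{A}}\otimes\rho^{\mathcal{B}}$, which means nothing else than that $\rho$ also contains correlations between the two subsystems.

Definition 2.4. A state $\rho$ of a bipartite system $\mathcal{A}\otimes\mathcal{B}$ is called correlated if there are some $A\in\mathcal{A}$, $B\in\mathcal{B}$ such that $\rho(A\otimes B)\ne\rho^{\mathcal{A}}(A)\rho^{\mathcal{B}}(B)$ holds.

We immediately see that $\rho = \rho_1\otimes\rho_2$ implies $\rho(A\otimes B) = \rho_1(A)\rho_2(B) = \rho^{\mathcal{A}}(A)\rho^{\mathcal{B}}(B)$; hence $\rho$ is not correlated. If on the other hand $\rho(A\otimes B) = \rho^{\mathcal{A}}(A)\rho^{\mathcal{B}}(B)$ holds for all $A$ and $B$ we get $\rho = \rho^{\mathcal{A}}\otimes\rho^{\mathcal{B}}$. Hence, the definition of correlations just given perfectly fits into our intuitive considerations.

An important issue in quantum information theory is the comparison of correlations between quantum systems on the one hand and classical systems on the other. Hence, let us have a closer look at the state space of a system consisting of at least one classical subsystem.

Proposition 2.5. Each state $\rho$ of a composite system $\mathcal{A}\otimes\mathcal{B}$ consisting of a classical ($\mathcal{A} = \mathcal{C}(X)$) and an arbitrary system ($\mathcal{B}$) has the form
$$\rho = \sum_{j\in X}\lambda_j\,\rho_j^{\mathcal{A}}\otimes\rho_j^{\mathcal{B}} \tag{2.11}$$
with positive weights $\lambda_j > 0$ and $\rho_j^{\mathcal{A}}\in\mathcal{S}(\mathcal{A})$, $\rho_j^{\mathcal{B}}\in\mathcal{S}(\mathcal{B})$.

Proof. Since $\mathcal{A} = \mathcal{C}(X)$ is classical, there is a basis $|j\rangle\langle j|\in\mathcal{A}$, $j\in X$, of mutually orthogonal one-dimensional projectors and we can write each $A\in\mathcal{A}$ as $\sum_j a_j|j\rangle\langle j|$ (cf. Subsection 2.1.3). For each state $\rho\in\mathcal{S}(\mathcal{A}\otimes\mathcal{B})$ we can now define $\rho_j^{\mathcal{A}}\in\mathcal{S}(\mathcal{A})$ with $\rho_j^{\mathcal{A}}(A) = \operatorname{tr}(A|j\rangle\langle j|) = a_j$ and $\rho_j^{\mathcal{B}}\in\mathcal{S}(\mathcal{B})$ with $\rho_j^{\mathcal{B}}(B) = \lambda_j^{-1}\rho(|j\rangle\langle j|\otimes B)$ and $\lambda_j = \rho(|j\rangle\langle j|\otimes\mathbb{1})$. Hence we get $\rho = \sum_{j\in X}\lambda_j\,\rho_j^{\mathcal{A}}\otimes\rho_j^{\mathcal{B}}$ with positive $\lambda_j$ as stated.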
If $\mathcal{A}$ and $\mathcal{B}$ are two quantum systems it is still possible for them to be correlated in the way just described. We can simply prepare them with a classical random generator which triggers two preparation devices to produce systems in the states $\rho_j^{\mathcal{A}}$, $\rho_j^{\mathcal{B}}$ with probability $\lambda_j$. The overall state produced by this setup is obviously the $\rho$ from Eq. (2.11). However, the crucial point is that not all correlations of quantum systems are of this type! This is an immediate consequence of the definition of pure states $\rho = |\Psi\rangle\langle\Psi|\in\mathcal{S}(\mathcal{H})$: Since there is no proper convex decomposition of $\rho$, it can be written as in Proposition 2.5 iff $\Psi$ is a product vector, i.e. $\Psi = \phi\otimes\psi$. This observation motivates the following definition.

Definition 2.6. A state $\rho$ of the composite system $\mathcal{B}(\mathcal{H}_1)\otimes\mathcal{B}(\mathcal{H}_2)$ is called separable or classically correlated if it can be written as
$$\rho = \sum_j\lambda_j\,\rho_j^{(1)}\otimes\rho_j^{(2)} \tag{2.12}$$
with states $\rho_j^{(k)}$ of $\mathcal{B}(\mathcal{H}_k)$ and weights $\lambda_j > 0$. Otherwise $\rho$ is called entangled. The set of all separable states is denoted by $\mathcal{D}(\mathcal{H}_1\otimes\mathcal{H}_2)$ or just $\mathcal{D}$ if $\mathcal{H}_1$ and $\mathcal{H}_2$ are understood.

2.2.4. Bell inequalities

We have just seen that it is quite easy for pure states to check whether they are entangled or not. In the mixed case, however, this is a much bigger, and in general unsolved, problem. In this subsection we will have a short look at the Bell inequalities, which are maybe the oldest criterion for entanglement (for a more detailed review see [169]). Today more powerful methods, most of them based on positivity properties, are available. We will postpone the corresponding discussion to the end of the following section, after we have studied (completely) positive maps (cf. Section 2.4).

Bell inequalities are traditionally discussed in the framework of "local hidden variable theories". More precisely, we will say that a state $\rho$ of a bipartite system $\mathcal{B}(\mathcal{H}\otimes\mathcal{K})$ admits a hidden variable model if there is a probability space $(X,\mu)$ and (measurable) response functions $X\ni x\mapsto F_A(x,k), F_B(x,l)\in\mathbb{R}$ for all discrete PV measures $A = A_1,\dots,A_N\in\mathcal{B}(\mathcal{H})$, respectively $B = B_1,\dots,B_M\in\mathcal{B}(\mathcal{K})$, such that
$$\int_X F_A(x,k)F_B(x,l)\,\mu(dx) = \operatorname{tr}(\rho\,A_k\otimes B_l) \tag{2.13}$$
holds for all $k$, $l$ and $A$, $B$. The value of the functions $F_A(x,k)$ is interpreted as the probability to get the value $k$ during an $A$ measurement with known "hidden parameter" $x$. The set of states admitting a hidden variable model is a convex set and as such it can be described by an (infinite) hierarchy of correlation inequalities. Any one of these inequalities is usually called a (generalized) Bell inequality. The best-known one is that given by Clauser et al. [47]: The state $\rho$ satisfies the CHSH inequality if
$$\rho\big(A\otimes(B + B') + A'\otimes(B - B')\big) \le 2 \tag{2.14}$$
holds for all $A, A'\in\mathcal{B}(\mathcal{H})$, respectively $B, B'\in\mathcal{B}(\mathcal{K})$, with $-\mathbb{1}\le A, A'\le\mathbb{1}$ and $-\mathbb{1}\le B, B'\le\mathbb{1}$. For the special case of two dichotomic observables the CHSH inequalities are sufficient to characterize the states with a hidden variable model. In the general case the CHSH inequalities are a necessary but not a sufficient condition and a complete characterization is not known.

It is now easy to see that each separable state $\rho = \sum_{j=1}^n\lambda_j\,\rho_j^{(1)}\otimes\rho_j^{(2)}$ admits a hidden variable model: we have to choose $X = \{1,\dots,n\}$, $\mu(\{j\}) = \lambda_j$, $F_A(x,k) = \rho_x^{(1)}(A_k)$ and $F_B$ analogously. Hence, we immediately see that each state of a composite system with at least one classical subsystem satisfies the Bell inequalities (in particular the CHSH version) while this is not the case for pure quantum systems. The most prominent examples are "maximally entangled states" (cf. Subsection 3.1.1) which violate the CHSH inequality (for appropriately chosen $A, A', B, B'$) with a maximal value of $2\sqrt{2}$. This observation is the starting point for many discussions concerning the interpretation of quantum mechanics, in particular because the maximal violation of $2\sqrt{2}$ was observed experimentally in 1982 by Aspect and coworkers [5]. We do not want to follow this path (see [169] and the references therein instead). Interesting for us is the fact that Bell inequalities, in particular the CHSH case in Eq. (2.14), provide a necessary condition for a state to be separable. However, there exist entangled states admitting a hidden variable model [165]. Hence, Bell inequalities are not sufficient for separability.

2.3. Channels
Assume now that we have a number of quantum systems, e.g. a string of ions in a trap. To "process" the quantum information they carry we have to perform, in general, many steps of a quite different nature. Typical examples are: free time evolution, controlled time evolution (e.g. the application of a "quantum gate" in a quantum computer), preparations and measurements. The purpose of this section is to provide a unified framework for the description of all these different operations. The basic idea is to represent each processing step by a "channel", which converts input systems, described by an observable algebra $\mathcal{A}$, into output systems described by a possibly different algebra $\mathcal{B}$. Henceforth we will call $\mathcal{A}$ the input and $\mathcal{B}$ the output algebra. If we consider e.g. the free time evolution, we need quantum systems of the same type on the input and the output side; hence, in this case we have $\mathcal{A} = \mathcal{B} = \mathcal{B}(\mathcal{H})$ with an appropriately chosen Hilbert space $\mathcal{H}$. If, on the other hand, we want to describe a measurement we have to map quantum systems (the measured system) to classical information (the measuring result). Therefore, we need in this example $\mathcal{A} = \mathcal{B}(\mathcal{H})$ for the input and $\mathcal{B} = \mathcal{C}(X)$ for the output algebra, where $X$ is the set of possible outcomes of the measurement (cf. Section 2.1.4).

Our aim is now to get a mathematical object which can be used to describe a channel. To this end consider an effect $A\in\mathcal{B}$ of the output system. If we invoke first a channel which transforms $\mathcal{A}$ systems into $\mathcal{B}$ systems, and measure $A$ afterwards on the output systems, we end up with a measurement of an effect $T(A)$ on the input systems. Hence, we get a map $T : \mathcal{E}(\mathcal{B})\to\mathcal{E}(\mathcal{A})$ which completely describes the channel.⁶ Alternatively, we can look at the states and interpret a channel as a map $T^* : \mathcal{S}(\mathcal{A})\to\mathcal{S}(\mathcal{B})$ which transforms $\mathcal{A}$ systems in the state $\rho\in\mathcal{S}(\mathcal{A})$ into $\mathcal{B}$ systems in the state $T^*(\rho)$.
To distinguish between both maps we can say that $T$ describes the channel in the Heisenberg picture and $T^*$ in the Schrödinger picture. On the level of the statistical interpretation both points of view should coincide of course, i.e. the probabilities⁷ $(T^*\rho)(A)$ and $\rho(TA)$ to get the result "yes" during an $A$ measurement on $\mathcal{B}$ systems in the state $T^*\rho$, respectively a $TA$ measurement on $\mathcal{A}$ systems in the state $\rho$, should be the same. Since $(T^*\rho)(A)$ is linear in $A$ we see immediately that $T$ must be an affine map, i.e. $T(\lambda_1A_1 + \lambda_2A_2) = \lambda_1T(A_1) + \lambda_2T(A_2)$ for each convex linear combination $\lambda_1A_1 + \lambda_2A_2$ of effects in $\mathcal{B}$, and this in turn implies that $T$ can be extended naturally to a linear map, which we will identify in the following with the channel itself, i.e. we say that $T$ is the channel.

⁶ Note that the direction of the mapping arrow is reversed compared to the natural ordering of processing.

2.3.1. Completely positive maps

Let us now slightly change our point of view and start with a linear operator $T : \mathcal{A}\to\mathcal{B}$. To be a channel, $T$ must map effects to effects, i.e. $T$ has to be positive, $T(A)\ge 0$ for all $A\ge 0$, and bounded from above by $\mathbb{1}$, i.e. $T(\mathbb{1})\le\mathbb{1}$. In addition it is natural to require that two channels in parallel are again a channel. More precisely, if two channels $T : \mathcal{A}_1\to\mathcal{B}_1$ and $S : \mathcal{A}_2\to\mathcal{B}_2$ are given we can consider the map $T\otimes S$ which associates to each $A\otimes B\in\mathcal{A}_1\otimes\mathcal{A}_2$ the tensor product $T(A)\otimes S(B)\in\mathcal{B}_1\otimes\mathcal{B}_2$. It is natural to assume that $T\otimes S$ is a channel which converts composite systems of type $\mathcal{A}_1\otimes\mathcal{A}_2$ into $\mathcal{B}_1\otimes\mathcal{B}_2$ systems. Hence $T\otimes S$ should be positive as well [125].

Definition 2.7. Consider two observable algebras $\mathcal{A}$, $\mathcal{B}$ and a linear map $T : \mathcal{A}\to\mathcal{B}\subset\mathcal{B}(\mathcal{H})$.
1. $T$ is called positive if $T(A)\ge 0$ holds for all positive $A\in\mathcal{A}$.
2. $T$ is called completely positive (cp) if $T\otimes\mathrm{Id} : \mathcal{A}\otimes\mathcal{B}(\mathbb{C}^n)\to\mathcal{B}(\mathcal{H})\otimes\mathcal{B}(\mathbb{C}^n)$ is positive for all $n\in\mathbb{N}$. Here $\mathrm{Id}$ denotes the identity map on $\mathcal{B}(\mathbb{C}^n)$.
3. $T$ is called unital if $T(\mathbb{1}) = \mathbb{1}$ holds.

Consider now the map $T^* : \mathcal{B}^*\to\mathcal{A}^*$ which is dual to $T$, i.e. $T^*\rho(A) = \rho(TA)$ for all $\rho\in\mathcal{B}^*$ and $A\in\mathcal{A}$. It is called the Schrödinger picture representation of the channel $T$, since it maps states to states provided $T$ is unital. (Complete) positivity can be defined in the Schrödinger picture as in the Heisenberg picture and we immediately see that $T$ is (completely) positive iff $T^*$ is.
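The duality $T^*\rho(A) = \rho(TA)$ between the two pictures can be illustrated with the simplest kind of channel, a unitary time evolution on a qubit. The sketch below is our own example: it assumes the Hadamard matrix as the unitary $U$ and takes $T(A) = U^*AU$ and $T^*(\rho) = U\rho U^*$; the function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

s = 1 / math.sqrt(2)
U = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]        # Hadamard gate, unitary on C^2

def heisenberg(A):
    """T(A) = U* A U, acting on observables and effects."""
    return matmul(dagger(U), matmul(A, U))

def schroedinger(rho):
    """T*(rho) = U rho U*, acting on density matrices."""
    return matmul(U, matmul(rho, dagger(U)))

identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
rho = [[0.75 + 0j, 0.25j], [-0.25j, 0.25 + 0j]]  # a density matrix
A = [[1 + 0j, 0j], [0j, 0j]]                     # the effect |0><0|

lhs = trace(matmul(schroedinger(rho), A))        # (T* rho)(A)
rhs = trace(matmul(rho, heisenberg(A)))          # rho(T A)
```

Both numbers agree by cyclicity of the trace, and `heisenberg(identity)` returns the identity again, so this particular $T$ is unital.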
It is natural to ask whether the distinction between positivity and complete positivity is really necessary, i.e. whether there are positive maps which are not completely positive. If at least one of the algebras $\mathcal{A}$ or $\mathcal{B}$ is classical the answer is no: each positive map is completely positive in this case. If both algebras are quantum, however, complete positivity is not implied by positivity alone. We will discuss explicit examples in Section 2.4.2.

If item 2 holds only for a fixed $n\in\mathbb{N}$ the map $T$ is called $n$-positive. This is obviously a weaker condition than complete positivity. However, $n$-positivity implies $m$-positivity for all $m\le n$, and for $\mathcal{A} = \mathcal{B}(\mathbb{C}^d)$ complete positivity is implied by $n$-positivity, provided $n\ge d$ holds.

Let us consider now the question whether a channel should be unital or not. We have already mentioned that $T(\mathbb{1})\le\mathbb{1}$ must hold since effects should be mapped to effects. If $T(\mathbb{1})$ is not equal to $\mathbb{1}$ we get $\rho(T\mathbb{1}) = T^*\rho(\mathbb{1}) < 1$ for the probability to measure the effect $\mathbb{1}$ on systems in the state $T^*\rho$, but this is impossible for channels which produce an output with certainty, because $\mathbb{1}$ is the effect which is always true. In other words: If a cp map is not unital it describes a channel which sometimes produces no output at all, and $T(\mathbb{1})$ is the effect which measures whether we have got an output. We will assume in the future that channels are unital if nothing else is explicitly stated.

⁷ To keep notations more readable we will frequently follow the usual convention to drop the parentheses around arguments of linear operators. Hence, we will write $TA$ and $T^*\rho$ instead of $T(A)$ and $T^*(\rho)$. Similarly, we will simply write $TS$ instead of $T\circ S$ for compositions.

2.3.2. The Stinespring theorem

Consider now channels between quantum systems, i.e. $\mathcal{A} = \mathcal{B}(\mathcal{H}_1)$ and $\mathcal{B} = \mathcal{B}(\mathcal{H}_2)$. A fairly simple example (not necessarily unital) is given in terms of an operator $V : \mathcal{H}_1\to\mathcal{H}_2$ by $\mathcal{B}(\mathcal{H}_1)\ni A\mapsto VAV^*\in\mathcal{B}(\mathcal{H}_2)$. A second example is the restriction to a subsystem, which is given in the Heisenberg picture by $\mathcal{B}(\mathcal{H})\ni A\mapsto A\otimes\mathbb{1}_{\mathcal{K}}\in\mathcal{B}(\mathcal{H}\otimes\mathcal{K})$. Finally, the composition $S\circ T = ST$ of two channels is again a channel. The following theorem, which is the most fundamental structural result about cp maps,⁸ says that each channel can be represented as a composition of these two examples [147].

Theorem 2.8 (Stinespring dilation theorem). Every completely positive map $T : \mathcal{B}(\mathcal{H}_1)\to\mathcal{B}(\mathcal{H}_2)$ has the form
$$T(A) = V^*(A\otimes\mathbb{1}_{\mathcal{K}})V, \tag{2.15}$$
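with an additional Hilbert space $\mathcal{K}$ and an operator $V : \mathcal{H}_2\to\mathcal{H}_1\otimes\mathcal{K}$. Both (i.e. $\mathcal{K}$ and $V$) can be chosen such that the span of all $(A\otimes\mathbb{1})V\phi$ with $A\in\mathcal{B}(\mathcal{H}_1)$ and $\phi\in\mathcal{H}_2$ is dense in $\mathcal{H}_1\otimes\mathcal{K}$. This particular decomposition is unique (up to unitary equivalence) and called the minimal decomposition. If $\dim\mathcal{H}_1 = d_1$ and $\dim\mathcal{H}_2 = d_2$ the minimal $\mathcal{K}$ satisfies $\dim\mathcal{K}\le d_1^2d_2$.

By introducing a family $|\chi_j\rangle\langle\chi_j|$ of one-dimensional projectors with $\sum_j|\chi_j\rangle\langle\chi_j| = \mathbb{1}$ we can define the "Kraus operators" $V_j$ by $\langle\psi,V_j\phi\rangle = \langle\psi\otimes\chi_j,V\phi\rangle$. In terms of them we can rewrite Eq. (2.15) in the following form [105]:

Corollary 2.9 (Kraus form). Every completely positive map $T : \mathcal{B}(\mathcal{H}_1)\to\mathcal{B}(\mathcal{H}_2)$ can be written in the form
$$T(A) = \sum_{j=1}^{N}V_j^*AV_j \tag{2.16}$$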
with operators $V_j : \mathcal{H}_2\to\mathcal{H}_1$ and $N\le\dim(\mathcal{H}_1)\dim(\mathcal{H}_2)$.

⁸ Basically, there is a more general version of this theorem which works with arbitrary output algebras. It needs, however, some material from representation theory of C*-algebras which we want to avoid here. See e.g. [125,83].

2.3.3. The duality lemma

We will consider a fundamental relation between positive maps and bipartite systems, which will allow us later on to translate properties of entangled states to properties of channels and vice versa. The basic idea originates from elementary linear algebra: A bilinear form $\phi$ on a $d$-dimensional vector space $V$ can be represented by a $d\times d$ matrix, just as an operator on $V$. Hence, we can transform $\phi$ into an operator simply by reinterpreting the matrix elements. In our situation things are more difficult, because the positivity constraints for states and channels should match up in the right way. Nevertheless, we have the following theorem.

Theorem 2.10. Let $\rho$ be a density operator on $\mathcal{H}\otimes\mathcal{H}_1$. Then there is a Hilbert space $\mathcal{K}$, a pure state $\sigma$ on $\mathcal{H}\otimes\mathcal{K}$ and a channel $T : \mathcal{B}(\mathcal{H}_1)\to\mathcal{B}(\mathcal{K})$ with
$$\rho = (\mathrm{Id}\otimes T^*)\sigma, \tag{2.17}$$
where $\mathrm{Id}$ denotes the identity map on $\mathcal{B}^*(\mathcal{H})$. The pure state $\sigma$ can be chosen such that $\operatorname{tr}_{\mathcal{H}}(\sigma)$ has no zero eigenvalue. In this case $T$ and $\sigma$ are uniquely determined (up to unitary equivalence) by Eq. (2.17); i.e. if $\tilde\sigma$, $\tilde T$ with $\rho = (\mathrm{Id}\otimes\tilde T^*)\tilde\sigma$ are given, we have $\tilde\sigma = (\mathbb{1}\otimes U)^*\sigma(\mathbb{1}\otimes U)$ and $\tilde T(\cdot) = U^*T(\cdot)U$ with an appropriate unitary operator $U$.

Proof. The state $\sigma$ is obviously the purification of $\operatorname{tr}_{\mathcal{H}_1}(\rho)$. Hence if $\lambda_j$ and $\psi_j$ are eigenvalues and eigenvectors of $\operatorname{tr}_{\mathcal{H}_1}(\rho)$ we can set $\sigma = |\Psi\rangle\langle\Psi|$ with $\Psi = \sum_j\sqrt{\lambda_j}\,\psi_j\otimes\phi_j$, where $\phi_j$ is an (arbitrary) orthonormal basis in $\mathcal{K}$. It is clear that $\sigma$ is uniquely determined up to a unitary. Hence, we only have to show that a unique $T$ exists if $\Psi$ is given. To satisfy Eq. (2.17) we must have
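\begin{align}
\rho\big(|\psi_j\otimes\eta_k\rangle\langle\psi_l\otimes\eta_p|\big)
 &= \big\langle\Psi,(\mathrm{Id}\otimes T)\big(|\psi_j\otimes\eta_k\rangle\langle\psi_l\otimes\eta_p|\big)\Psi\big\rangle \tag{2.18} \\
 &= \big\langle\Psi,|\psi_j\rangle\langle\psi_l|\otimes T\big(|\eta_k\rangle\langle\eta_p|\big)\Psi\big\rangle \tag{2.19} \\
 &= \sqrt{\lambda_j\lambda_l}\,\big\langle\phi_j,T\big(|\eta_k\rangle\langle\eta_p|\big)\phi_l\big\rangle, \tag{2.20}
\end{align}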
where $\eta_k$ is an (arbitrary) orthonormal basis in $\mathcal{H}_1$. Hence $T$ is uniquely determined by $\rho$ in terms of its matrix elements and we only have to check complete positivity. To this end it is useful to note that the map $\rho\mapsto T$ is linear if the $\lambda_j$ are fixed. Hence, it is sufficient to consider the case $\rho = |\chi\rangle\langle\chi|$. Inserting this into Eq. (2.20) we immediately see that $T(A) = V^*AV$ with $\langle V\phi_j,\eta_k\rangle = \lambda_j^{-1/2}\langle\psi_j\otimes\eta_k,\chi\rangle$ holds. Hence $T$ is completely positive. Since normalization $T(\mathbb{1}) = \mathbb{1}$ follows from the choice of the $\lambda_j$ the theorem is proved.

2.4. Separability criteria and positive maps

We have already stated in Section 2.3.1 that positive but not completely positive maps exist whenever input and output algebra are quantum. No such map represents a valid quantum operation; nevertheless they are of great importance in quantum information theory, due to their deep relations to entanglement properties. Hence, this section is a continuation of the study of separability criteria which we have started in Section 2.2.4. In contrast to the rest of this section, all maps are considered in the Schrödinger rather than in the Heisenberg picture.

2.4.1. Positivity

Let us consider now an arbitrary positive, but not necessarily completely positive map $T^* : \mathcal{B}^*(\mathcal{H})\to\mathcal{B}^*(\mathcal{K})$. If $\mathrm{Id}$ again denotes the identity map, it is easy to see that $(\mathrm{Id}\otimes T^*)(\sigma_1\otimes\sigma_2) = \sigma_1\otimes T^*(\sigma_2)\ge 0$ holds for each product state $\sigma_1\otimes\sigma_2\in\mathcal{S}(\mathcal{H}\otimes\mathcal{K})$. Hence $(\mathrm{Id}\otimes T^*)\rho\ge 0$ for each positive $T^*$ is a necessary condition for $\rho$ to be separable. The following theorem, proved in [86], shows that sufficiency holds as well.
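Theorem 2.11. A state $\rho\in\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K})$ is separable iff for any positive map $T^* : \mathcal{B}^*(\mathcal{K})\to\mathcal{B}^*(\mathcal{H})$ the operator $(\mathrm{Id}\otimes T^*)\rho$ is positive.

Proof. We will only give a sketch of the proof; see [86] for details. The condition is obviously necessary since $(\mathrm{Id}\otimes T^*)\rho_1\otimes\rho_2\ge 0$ holds for any product state provided $T^*$ is positive. The proof of sufficiency relies on the fact that it is always possible to separate a point $\rho$ (an entangled state) from a convex set $\mathcal{D}$ (the set of separable states) by a hyperplane. A precise formulation of this idea leads to the following proposition.

Proposition 2.12. For any entangled state $\rho\in\mathcal{S}(\mathcal{H}\otimes\mathcal{K})$ there is an operator $A$ on $\mathcal{H}\otimes\mathcal{K}$, called entanglement witness for $\rho$, with the property $\rho(A) < 0$ and $\sigma(A)\ge 0$ for all separable $\sigma\in\mathcal{S}(\mathcal{H}\otimes\mathcal{K})$.

Proof. Since $\mathcal{D}\subset\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K})$ is a closed convex set, for each $\rho\in\mathcal{S}\subset\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K})$ with $\rho\notin\mathcal{D}$ there exists a linear functional $\alpha$ on $\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K})$ such that $\alpha(\rho) < \gamma\le\alpha(\sigma)$ for each $\sigma\in\mathcal{D}$ with a constant $\gamma$. This holds as well in infinite-dimensional Banach spaces and is a consequence of the Hahn–Banach theorem (cf. [135, Theorem 3.4]). Without loss of generality, we can assume that $\gamma = 0$ holds. Otherwise we just have to replace $\alpha$ by $\alpha - \gamma\operatorname{tr}$. Hence, the result follows from the fact that each linear functional on $\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K})$ has the form $\alpha(\sigma) = \operatorname{tr}(A\sigma)$ with $A\in\mathcal{B}(\mathcal{H}\otimes\mathcal{K})$.

To continue the proof of Theorem 2.11 associate now to any operator $A\in\mathcal{B}(\mathcal{H}\otimes\mathcal{K})$ the map $T_A^* : \mathcal{B}^*(\mathcal{K})\to\mathcal{B}^*(\mathcal{H})$ with
$$\operatorname{tr}(A\,\rho_1\otimes\rho_2) = \operatorname{tr}\big(\rho_1^T\,T_A^*(\rho_2)\big), \tag{2.21}$$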
where $(\cdot)^T$ denotes the transposition in an arbitrary but fixed orthonormal basis $|j\rangle$, $j = 1,\dots,d$. It is easy to see that $T_A^*$ is positive if $\operatorname{tr}(A\,\rho_1\otimes\rho_2)\ge 0$ for all product states $\rho_1\otimes\rho_2\in\mathcal{S}(\mathcal{H}\otimes\mathcal{K})$ [94]. A straightforward calculation [86] shows in addition that
$$\operatorname{tr}(A\rho) = \operatorname{tr}\big(|\Psi\rangle\langle\Psi|\,(\mathrm{Id}\otimes T_A^*)(\rho)\big) \tag{2.22}$$
holds, where $\Psi = d^{-1/2}\sum_j|j\rangle\otimes|j\rangle$. Assume now that $(\mathrm{Id}\otimes T^*)\rho\ge 0$ for all positive $T^*$. Since $T_A^*$ is positive this implies that the right-hand side of (2.22) is positive; hence $\operatorname{tr}(A\rho)\ge 0$ provided $\operatorname{tr}(A\sigma)\ge 0$ holds for all separable $\sigma$, and the statement follows from Proposition 2.12.

2.4.2. The partial transpose

The most typical example for a positive non-cp map is the transposition $\Theta A = A^T$ of $d\times d$ matrices, which we have just used in the proof of Theorem 2.11. $\Theta$ is obviously a positive map, but the partial transpose
$$\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K})\ni\rho\mapsto(\mathrm{Id}\otimes\Theta)(\rho)\in\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K}) \tag{2.23}$$
is not. The latter can be easily checked with the maximally entangled state (cf. Section 3.1.1)
$$\Psi = \frac{1}{\sqrt{d}}\sum_j|j\rangle\otimes|j\rangle, \tag{2.24}$$
where $|j\rangle\in\mathbb{C}^d$, $j = 1,\dots,d$ denote the canonical basis vectors. In low dimensions the transposition is basically the only positive map which is not cp. Due to results of Størmer [148] and Woronowicz
[174] we have: $\dim\mathcal{H} = 2$ and $\dim\mathcal{K} = 2, 3$ imply that each positive map $T^* : \mathcal{B}^*(\mathcal{H})\to\mathcal{B}^*(\mathcal{K})$ has the form $T^* = T_1^* + T_2^*\Theta$ with two cp maps $T_1^*$, $T_2^*$ and the transposition $\Theta$ on $\mathcal{B}(\mathcal{H})$. This immediately implies that positivity of the partial transpose is necessary and sufficient for separability of a state $\rho\in\mathcal{S}(\mathcal{H}\otimes\mathcal{K})$ (cf. [86]):

Theorem 2.13. Consider a bipartite system $\mathcal{B}(\mathcal{H}\otimes\mathcal{K})$ with $\dim\mathcal{H} = 2$ and $\dim\mathcal{K} = 2, 3$. A state $\rho\in\mathcal{S}(\mathcal{H}\otimes\mathcal{K})$ is separable iff its partial transpose is positive.

To use positivity of the partial transpose as a separability criterion was proposed for the first time by Peres [127], and he conjectured that it is a necessary and sufficient condition in arbitrary finite dimension. Although it has turned out in the meantime that this conjecture is wrong in general (cf. Section 3.1.5), partial transposition has become a crucial tool within entanglement theory and we define:

Definition 2.14. A state $\rho\in\mathcal{B}^*(\mathcal{H}\otimes\mathcal{K})$ of a bipartite quantum system is called ppt-state if $(\mathrm{Id}\otimes\Theta)\rho\ge 0$ holds and npt-state otherwise (ppt = "positive partial transpose" and npt = "negative partial transpose").

2.4.3. The reduction criterion

Another frequently used example of a non-cp but positive map is $\mathcal{B}^*(\mathcal{H})\ni\rho\mapsto T^*(\rho) = (\operatorname{tr}\rho)\mathbb{1} - \rho\in\mathcal{B}^*(\mathcal{H})$. The eigenvalues of $T^*(\rho)$ are given by $\operatorname{tr}\rho - \lambda_i$, where $\lambda_i$ are the eigenvalues of $\rho$. If $\rho\ge 0$ we have $\lambda_i\ge 0$ and therefore $\sum_j\lambda_j - \lambda_k\ge 0$. Hence $T^*$ is positive. That $T^*$ is not completely positive follows if we consider again the example $|\Psi\rangle\langle\Psi|$ from Eq. (2.24); hence we get
$$\mathbb{1}\otimes\operatorname{tr}_2(\rho) - \rho\ge 0, \qquad \operatorname{tr}_1(\rho)\otimes\mathbb{1} - \rho\ge 0 \tag{2.25}$$
for any separable state ∈ B∗ (H⊗K). These equations are another non-trivial separability criterion, which is called the reduction criterion [85,42]. It is closely related to the ppt criterion, due to the following proposition (see [85] for a proof). Proposition 2.15. Each ppt-state ∈ S(H ⊗ K) satis<es the reduction criterion. If dim H = 2 and dim K = 2; 3 both criteria are equivalent. Hence we see with Theorem 2.13 that a state in 2 × 2 or 2 × 3 dimensions is separable i= it satis.es the reduction criterion.
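The ppt criterion of Definition 2.14 and the reduction criterion (2.25) are both simple eigenvalue tests, so they can be tried out numerically. The following is a minimal NumPy sketch, not part of the original text: the function names and the one-parameter test family `mixed_family` are assumptions made for illustration, and the two reduced states are computed directly as partial traces rather than in the $\operatorname{tr}_1$, $\operatorname{tr}_2$ notation of Eq. (2.25).

```python
import numpy as np

def partial_transpose(rho, d1, d2):
    """Apply Id (x) Theta: transpose the second tensor factor only."""
    r = rho.reshape(d1, d2, d1, d2)              # indices (i, a; j, b)
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

def min_eig(op):
    """Smallest eigenvalue of a Hermitian matrix."""
    return np.linalg.eigvalsh(op).min()

def is_ppt(rho, d1, d2, tol=1e-12):
    return bool(min_eig(partial_transpose(rho, d1, d2)) >= -tol)

def satisfies_reduction(rho, d1, d2, tol=1e-12):
    """Check both inequalities of Eq. (2.25)."""
    r = rho.reshape(d1, d2, d1, d2)
    rho_first = np.trace(r, axis1=1, axis2=3)    # reduced state on the first factor
    rho_second = np.trace(r, axis1=0, axis2=2)   # reduced state on the second factor
    a = np.kron(rho_first, np.eye(d2)) - rho
    b = np.kron(np.eye(d1), rho_second) - rho
    return bool(min_eig(a) >= -tol and min_eig(b) >= -tol)

def mixed_family(p, d=2):
    """Hypothetical test family: p |Psi><Psi| + (1 - p) 1/d^2, Psi from Eq. (2.24)."""
    psi = np.eye(d).reshape(-1) / np.sqrt(d)
    return p * np.outer(psi, psi) + (1 - p) * np.eye(d * d) / d**2

for p in (0.2, 0.5, 1.0):
    rho = mixed_family(p)
    print(p, is_ppt(rho, 2, 2), satisfies_reduction(rho, 2, 2))
```

For two qubits the two tests agree on every member of this family, as Proposition 2.15 leads one to expect.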
3. Basic examples

After the somewhat abstract discussion in the last section we will become more concrete now. In the following, we will present a number of examples which help on the one hand to understand the structures just introduced, and which are of fundamental importance within quantum information on the other.
3.1. Entanglement

Although our definition of entanglement (Definition 2.6) is applicable in arbitrary dimensions, detailed knowledge about entangled states is available only for low-dimensional systems or for states with very special properties. In this section we will discuss some of the most basic examples.

3.1.1. Maximally entangled states

Let us start with a look at pure states of a composite system $\mathcal{A}\otimes\mathcal{B}$ and their possible correlations. If one subsystem is classical, i.e. $\mathcal{A} = \mathcal{C}(\{1,\ldots,d\})$, the state space is given according to Section 2.2.2 by $\mathcal{S}(\mathcal{B})^d$, and $\rho \in \mathcal{S}(\mathcal{B})^d$ is pure iff $\rho = (\delta_{j1}\tau,\ldots,\delta_{jd}\tau)$ with $j = 1,\ldots,d$ and a pure state $\tau$ of the $\mathcal{B}$ system. Hence, the restrictions of $\rho$ to $\mathcal{A}$, respectively $\mathcal{B}$, are the Dirac measure $\delta_j \in \mathcal{S}(X)$ or $\tau \in \mathcal{S}(\mathcal{B})$; in other words, both restrictions are pure. This is completely different if $\mathcal{A}$ and $\mathcal{B}$ are quantum, i.e. $\mathcal{A}\otimes\mathcal{B} = \mathcal{B}(\mathcal{H}\otimes\mathcal{K})$: Consider $\rho = |\Psi\rangle\langle\Psi|$ with $\Psi \in \mathcal{H}\otimes\mathcal{K}$ and Schmidt decomposition (Proposition 2.2) $\Psi = \sum_j \lambda_j^{1/2}\,\phi_j\otimes\psi_j$. Calculating the $\mathcal{A}$ restriction, i.e. the partial trace over $\mathcal{K}$, we get
$$\operatorname{tr}[\operatorname{tr}_{\mathcal{K}}(\rho)A] = \operatorname{tr}[|\Psi\rangle\langle\Psi|\,A\otimes\mathbb{1}] = \sum_{jk}\lambda_j^{1/2}\lambda_k^{1/2}\langle\phi_j, A\phi_k\rangle\,\delta_{jk}, \tag{3.1}$$
hence $\operatorname{tr}_{\mathcal{K}}(\rho) = \sum_j \lambda_j|\phi_j\rangle\langle\phi_j|$ is mixed iff $\Psi$ is entangled. The most extreme case arises if $\mathcal{H} = \mathcal{K} = \mathbb{C}^d$ and $\operatorname{tr}_{\mathcal{K}}(\rho)$ is maximally mixed, i.e. $\operatorname{tr}_{\mathcal{K}}(\rho) = \mathbb{1}/d$. We get for $\Psi$
$$\Psi = \frac{1}{\sqrt{d}}\sum_{j=1}^{d}\phi_j\otimes\psi_j \tag{3.2}$$
with two orthonormal bases $\phi_1,\ldots,\phi_d$ and $\psi_1,\ldots,\psi_d$. In $2n\times 2n$ dimensions these states violate maximally the CHSH inequalities, with appropriately chosen operators $A, A', B, B'$. Such states are therefore called maximally entangled. The most prominent examples of maximally entangled states are the four “Bell states” for two qubit systems, i.e. $\mathcal{H} = \mathcal{K} = \mathbb{C}^2$; $|1\rangle, |0\rangle$ denotes the canonical basis and
$$\Phi_0 = \frac{1}{\sqrt{2}}(|11\rangle + |00\rangle), \qquad \Phi_j = i(\mathbb{1}\otimes\sigma_j)\Phi_0, \quad j = 1, 2, 3, \tag{3.3}$$
where we have used the shorthand notation $|jk\rangle$ for $|j\rangle\otimes|k\rangle$ and the $\sigma_j$ denote the Pauli matrices. The Bell states, which form an orthonormal basis of $\mathbb{C}^2\otimes\mathbb{C}^2$, are the best studied and most relevant examples of entangled states within quantum information. A mixture of them, i.e. a density matrix $\rho \in \mathcal{S}(\mathbb{C}^2\otimes\mathbb{C}^2)$ with eigenvectors $\Phi_j$ and eigenvalues $0 \le \lambda_j \le 1$, $\sum_j\lambda_j = 1$, is called a Bell diagonal state. It can be shown [16] that $\rho$ is entangled iff $\max_j\lambda_j > \tfrac{1}{2}$ holds. We omit the proof of this statement here, but we will come back to this point in Section 5 within the discussion of entanglement measures.

Let us come back to the general case now and consider an arbitrary $\rho \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$. Using maximally entangled states, we can introduce another separability criterion in terms of the maximally entangled fraction (cf. [16])
$$F(\rho) = \sup_{\Psi\ \text{max. ent.}}\langle\Psi, \rho\Psi\rangle. \tag{3.4}$$
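As an illustration of Eq. (3.3), the four Bell states can be built explicitly and checked for orthonormality. The sketch below is an addition, not taken from the paper: the helper names are hypothetical, and the separability test for Bell diagonal states is done through the partial transpose, which by Theorem 2.13 is decisive for two qubits.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Phi_0 = (|00> + |11>)/sqrt(2) in the ordering |00>, |01>, |10>, |11>
phi0 = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell = [phi0] + [1j * np.kron(np.eye(2), s) @ phi0 for s in sigma]

# Gram matrix of the four vectors; the identity means they are orthonormal
gram = np.array([[np.vdot(a, b) for b in bell] for a in bell])

def bell_diagonal(weights):
    """Density matrix with eigenvectors Phi_j and eigenvalues weights[j]."""
    return sum(w * np.outer(v, v.conj()) for w, v in zip(weights, bell))

def min_pt_eigenvalue(rho):
    """Smallest eigenvalue of the partial transpose of a two-qubit state."""
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return np.linalg.eigvalsh(pt).min()

print(np.allclose(gram, np.eye(4)))
print(min_pt_eigenvalue(bell_diagonal([0.6, 0.2, 0.1, 0.1])))   # negative: entangled
print(min_pt_eigenvalue(bell_diagonal([0.4, 0.3, 0.2, 0.1])))   # positive: separable
```

In both examples the smallest eigenvalue of the partial transpose equals $\tfrac{1}{2} - \max_j\lambda_j$, which is consistent with the threshold $\max_j\lambda_j > \tfrac{1}{2}$ quoted above.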
If $\rho$ is separable the reduction criterion (2.25) implies $\langle\Psi, [\operatorname{tr}_1(\rho)\otimes\mathbb{1} - \rho]\Psi\rangle \ge 0$ for any maximally entangled state. Since the partial trace of $|\Psi\rangle\langle\Psi|$ is $d^{-1}\mathbb{1}$ we get
$$d^{-1} = \langle\Psi, \operatorname{tr}_1(\rho)\otimes\mathbb{1}\,\Psi\rangle \ge \langle\Psi, \rho\Psi\rangle, \tag{3.5}$$
hence $F(\rho) \le 1/d$. This condition is not very sharp, however. Using the ppt criterion it can be shown that $\rho = \lambda|\Phi_1\rangle\langle\Phi_1| + (1-\lambda)|00\rangle\langle 00|$ (with the Bell state $\Phi_1$) is entangled for all $0 < \lambda \le 1$, but a straightforward calculation shows that $F(\rho) \le 1/2$ holds for $\lambda \le 1/2$.

Finally, we have to mention here a very useful parameterization of the set of pure states on $\mathcal{H}\otimes\mathcal{H}$ in terms of maximally entangled states: If $\Psi$ is an arbitrary but fixed maximally entangled state, each $\phi \in \mathcal{H}\otimes\mathcal{H}$ admits (uniquely determined) operators $X_1, X_2$ such that
$$\phi = (X_1\otimes\mathbb{1})\Psi = (\mathbb{1}\otimes X_2)\Psi \tag{3.6}$$
holds. This can be easily checked in a product basis.

3.1.2. Werner states

If we consider entanglement of mixed states rather than pure ones, the analysis becomes quite difficult, even if the dimensions of the underlying Hilbert spaces are low. The reason is that the state space $\mathcal{S}(\mathcal{H}_1\otimes\mathcal{H}_2)$ of a two-partite system with $\dim\mathcal{H}_i = d_i$ is a geometric object in a $(d_1^2 d_2^2 - 1)$-dimensional space. Hence even in the simplest non-trivial case (two qubits) the dimension of the state space becomes very high (15 dimensions) and naive geometric intuition can be misleading. Therefore, it is often useful to look at special classes of model states, which can be characterized by only a few parameters. A quite powerful tool is the study of symmetry properties, i.e. to investigate the set of states which is invariant under a group of local unitaries. A general discussion of this scheme can be found in [159]. In this paper we will present only three of the most prominent examples.

Consider first a state $\rho \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$ (with $\mathcal{H} = \mathbb{C}^d$) which is invariant under the group of all $U\otimes U$ with a unitary $U$ on $\mathcal{H}$; i.e. $[U\otimes U, \rho] = 0$ for all $U$. Such a $\rho$ is usually called a Werner state [165,128] and its structure can be analyzed quite easily using a well-known result of group theory which goes back to Weyl [171] (see also [142, Theorem IX.11.5]), and which we will state in detail for later reference:

Theorem 3.1. Each operator $A$ on the $N$-fold tensor product $\mathcal{H}^{\otimes N}$ of the (
(3.7)
In our case ($N = 2$) there are only two permutations: the identity $\mathbb{1}$ and the flip $F(\phi\otimes\psi) = \psi\otimes\phi$. Hence $\rho = a\mathbb{1} + bF$ with appropriate coefficients $a, b$. Since $\rho$ is a density matrix, $a$ and $b$ are not independent. To get a transparent way to express these constraints, it is reasonable to consider the eigenprojections $P_\pm$ of $F$ rather than $\mathbb{1}$ and $F$; i.e. $FP_\pm = \pm P_\pm$ and $P_\pm = (\mathbb{1} \pm F)/2$. The $P_\pm$ are the projections on the subspaces $\mathcal{H}^{\otimes 2}_\pm \subset \mathcal{H}\otimes\mathcal{H}$ of symmetric, respectively antisymmetric, tensor products (Bose-, respectively Fermi-subspace). If we write $d_\pm = d(d \pm 1)/2$ for the dimensions of
$\mathcal{H}^{\otimes 2}_\pm$ we get for each Werner state
$$\rho = \frac{\lambda}{d_+}P_+ + \frac{(1-\lambda)}{d_-}P_-, \qquad \lambda \in [0, 1]. \tag{3.8}$$
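Eq. (3.8) is easy to realize as a matrix. The following NumPy sketch is an illustration added here under stated assumptions: the function names are hypothetical, and the unitary used for the invariance check is obtained from a QR decomposition, which is unitary but not exactly Haar distributed. It builds the flip $F$, the projections $P_\pm$ and a Werner state, and checks the $U\otimes U$ invariance together with the flip expectation value $\operatorname{tr}(\rho F) = 2\lambda - 1$.

```python
import numpy as np

def flip(d):
    """Flip operator F(phi (x) psi) = psi (x) phi on C^d (x) C^d."""
    F = np.zeros((d * d, d * d))
    for j in range(d):
        for k in range(d):
            F[j * d + k, k * d + j] = 1.0        # F |k j> = |j k>
    return F

def werner(lam, d):
    """Werner state of Eq. (3.8) with parameter lam in [0, 1]."""
    F, one = flip(d), np.eye(d * d)
    p_plus, p_minus = (one + F) / 2, (one - F) / 2
    d_plus, d_minus = d * (d + 1) / 2, d * (d - 1) / 2
    return lam * p_plus / d_plus + (1 - lam) * p_minus / d_minus

def random_unitary(d, rng):
    """A unitary from a QR decomposition (unitary, but not exactly Haar)."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

d, lam = 3, 0.3
rho = werner(lam, d)
U = random_unitary(d, np.random.default_rng(0))
UU = np.kron(U, U)

print(np.allclose(UU @ rho @ UU.conj().T, rho))      # U (x) U invariance
print(np.trace(rho @ flip(d)), 2 * lam - 1)          # flip expectation value
```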
On the other hand, it is obvious that each state of this form is $U\otimes U$ invariant, hence a Werner state. If $\rho$ is given, it is very easy to calculate the parameter $\lambda$ from the expectation value of $\rho$ and the flip, $\operatorname{tr}(\rho F) = 2\lambda - 1 \in [-1, 1]$. Therefore, we can write for an arbitrary state $\sigma \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$
$$P_{UU}(\sigma) = \frac{\operatorname{tr}(\sigma F) + 1}{2d_+}P_+ + \frac{(1 - \operatorname{tr}\sigma F)}{2d_-}P_- \tag{3.9}$$
and this defines a projection from the full state space to the set of Werner states which is called the twirl operation. In many cases it is quite useful that it can be written alternatively as a group average of the form
$$P_{UU}(\sigma) = \int_{U(d)}(U\otimes U)\,\sigma\,(U^*\otimes U^*)\,dU, \tag{3.10}$$
where $dU$ denotes the normalized, left invariant Haar measure on $U(d)$. To check this identity note first that its right-hand side is indeed $U\otimes U$ invariant, due to the invariance of the volume element $dU$. Hence, we have to check only that the trace of $F$ times the integral coincides with $\operatorname{tr}(F\sigma)$:
$$\operatorname{tr}\Bigl[F\int_{U(d)}(U\otimes U)\,\sigma\,(U^*\otimes U^*)\,dU\Bigr] = \int_{U(d)}\operatorname{tr}[F(U\otimes U)\,\sigma\,(U^*\otimes U^*)]\,dU \tag{3.11}$$
$$= \operatorname{tr}(F\sigma)\int_{U(d)}dU = \operatorname{tr}(F\sigma), \tag{3.12}$$
where we have used the fact that $F$ commutes with $U\otimes U$ and the normalization of $dU$. We can apply $P_{UU}$ obviously to arbitrary operators $A \in \mathcal{B}(\mathcal{H}\otimes\mathcal{H})$ and, as an integral over unitarily implemented operations, we get a channel. Substituting $U \to U^*$ in (3.10) and cycling the trace $\operatorname{tr}(AP_{UU}(\sigma))$ we find $\operatorname{tr}(P_{UU}(A)\rho) = \operatorname{tr}(AP_{UU}(\rho))$, hence $P_{UU}$ has the same form in the Heisenberg and the Schrödinger picture (i.e. $P_{UU}^* = P_{UU}$).

If $\sigma \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$ is a separable state the integrand of $P_{UU}(\sigma)$ in Eq. (3.10) consists entirely of separable states, hence $P_{UU}(\sigma)$ is separable. Since each Werner state $\rho$ is the twirl of itself, we see that $\rho$ is separable iff it is the twirl $P_{UU}(\sigma)$ of a separable state $\sigma \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$. To determine the set of separable Werner states we therefore have to calculate only the set of all $\operatorname{tr}(F\sigma) \in [-1, 1]$ with separable $\sigma$. Since each such $\sigma$ admits a convex decomposition into pure product states it is sufficient to look at
$$\langle\psi\otimes\phi, F\,\psi\otimes\phi\rangle = |\langle\psi, \phi\rangle|^2, \tag{3.13}$$
which ranges from 0 to 1. Hence $\rho$ from Eq. (3.8) is separable iff $\tfrac{1}{2} \le \lambda \le 1$ and entangled otherwise (due to $\lambda = (\operatorname{tr}(F\rho) + 1)/2$). If $\mathcal{H} = \mathbb{C}^2$ holds, each Werner state is Bell diagonal and we recover the result from Section 3.1.1 (separable if the highest eigenvalue is less than or equal to $1/2$).

3.1.3. Isotropic states

To derive a second class of states consider the partial transpose $(\mathrm{Id}\otimes\Theta)\rho$ (with respect to a distinguished base $|j\rangle \in \mathcal{H}$, $j = 1,\ldots,d$) of a Werner state $\rho$. Since $\rho$ is, by definition, $U\otimes U$
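invariant, it is easy to see that $(\mathrm{Id}\otimes\Theta)\rho$ is $U\otimes\bar{U}$ invariant, where $\bar{U}$ denotes componentwise complex conjugation in the base $|j\rangle$ (we just have to use that $U^{*T} = \bar{U}$ holds). Each state $\tau$ with this kind of symmetry is called an isotropic state [132], and our previous discussion shows that $\tau$ is a linear combination of $\mathbb{1}$ and the partial transpose of the flip, which is the rank one operator
$$\tilde{F} = (\mathrm{Id}\otimes\Theta)F = |\Psi\rangle\langle\Psi| = \sum_{j,k=1}^{d}|jj\rangle\langle kk|, \tag{3.14}$$
where $\Psi = \sum_j|jj\rangle$ is, up to normalization, a maximally entangled state. Hence, each isotropic $\tau$ can be written as
$$\tau = \frac{1}{d}\Bigl(\lambda\frac{\mathbb{1}}{d} + (1-\lambda)\tilde{F}\Bigr), \qquad \lambda \in \Bigl[0, \frac{d^2}{d^2-1}\Bigr], \tag{3.15}$$
where the bounds on $\lambda$ follow from normalization and positivity. As above we can determine the parameter $\lambda$ from the expectation value
$$\operatorname{tr}(\tilde{F}\tau) = \frac{1-d^2}{d}\lambda + d, \tag{3.16}$$
which ranges from 0 to $d$, and this again leads to a twirl operation: For an arbitrary state $\sigma \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$ we can define
$$P_{U\bar{U}}(\sigma) = \frac{1}{d(1-d^2)}\Bigl([\operatorname{tr}(\tilde{F}\sigma) - d]\,\mathbb{1} + [1 - d\operatorname{tr}(\tilde{F}\sigma)]\,\tilde{F}\Bigr) \tag{3.17}$$
and as for Werner states $P_{U\bar{U}}$ can be rewritten in terms of a group average
$$P_{U\bar{U}}(\sigma) = \int_{U(d)}(U\otimes\bar{U})\,\sigma\,(U^*\otimes\bar{U}^*)\,dU. \tag{3.18}$$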
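The relations (3.14) to (3.17) can be verified numerically in a few lines. The sketch below is an added illustration with hypothetical function names: it confirms that the partial transpose of the flip is the rank one operator $\tilde{F}$, that the isotropic state of Eq. (3.15) has the expectation value (3.16), and that the map (3.17) leaves isotropic states fixed.

```python
import numpy as np

def flip(d):
    F = np.zeros((d * d, d * d))
    for j in range(d):
        for k in range(d):
            F[j * d + k, k * d + j] = 1.0
    return F

def partial_transpose(op, d):
    """Id (x) Theta on C^d (x) C^d."""
    return op.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

def flip_tilde(d):
    """F~ = sum_{jk} |jj><kk| of Eq. (3.14)."""
    psi = np.eye(d).reshape(-1)                  # unnormalized sum_j |jj>
    return np.outer(psi, psi)

def isotropic(lam, d):
    """Isotropic state of Eq. (3.15)."""
    return (lam * np.eye(d * d) / d + (1 - lam) * flip_tilde(d)) / d

def twirl_uubar(sigma, d):
    """Closed form (3.17) of the U (x) U-bar twirl."""
    Ft = flip_tilde(d)
    t = np.trace(Ft @ sigma).real
    return ((t - d) * np.eye(d * d) + (1 - d * t) * Ft) / (d * (1 - d**2))

d, lam = 3, 0.4
tau = isotropic(lam, d)
print(np.allclose(partial_transpose(flip(d), d), flip_tilde(d)))      # Eq. (3.14)
print(np.trace(flip_tilde(d) @ tau), (1 - d**2) / d * lam + d)        # Eq. (3.16)
print(np.allclose(twirl_uubar(tau, d), tau))                          # fixed point of (3.17)
```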
Now we can proceed in the same way as above: $P_{U\bar{U}}$ is a channel with $P_{U\bar{U}}^* = P_{U\bar{U}}$, its fixed points $P_{U\bar{U}}(\tau) = \tau$ are exactly the isotropic states, and the image of the set of separable states under $P_{U\bar{U}}$ coincides with the set of separable isotropic states. To determine the latter we have to consider the expectation values (cf. Eq. (3.13))
$$\langle\psi\otimes\phi, \tilde{F}\,\psi\otimes\phi\rangle = \Bigl|\sum_{j=1}^{d}\psi_j\phi_j\Bigr|^2 = |\langle\psi, \bar{\phi}\rangle|^2 \in [0, 1]. \tag{3.19}$$
This implies that $\tau$ is separable iff
$$\frac{d(d-1)}{d^2-1} \le \lambda \le \frac{d^2}{d^2-1} \tag{3.20}$$
holds and entangled otherwise. For $\lambda = 0$ we recover the maximally entangled state. For $d = 2$ we again recover the special case of Bell diagonal states encountered already in the last subsection.

3.1.4. OO-invariant states

Let us combine now Werner states with isotropic states, i.e. we look for density matrices $\rho$ which can be written as $\rho = a\mathbb{1} + bF + c\tilde{F}$, or, if we introduce the three mutually orthogonal projection operators
$$p_0 = \frac{1}{d}\tilde{F}, \qquad p_1 = \frac{1}{2}(\mathbb{1} - F), \qquad p_2 = \frac{1}{2}(\mathbb{1} + F) - \frac{1}{d}\tilde{F} \tag{3.21}$$
Fig. 3.1. State space of OO-invariant states (upper triangle) and its partial transpose (lower triangle) for d = 3, plotted with tr(ρF) on the horizontal axis and tr(ρF̃) on the vertical axis. The special cases of isotropic and Werner states are drawn as thin lines.
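as a convex linear combination of $\operatorname{tr}(p_j)^{-1}p_j$, $j = 0, 1, 2$:
$$\rho = (1 - \lambda_1 - \lambda_2)\,p_0 + \lambda_1\frac{p_1}{\operatorname{tr}(p_1)} + \lambda_2\frac{p_2}{\operatorname{tr}(p_2)}, \qquad \lambda_1, \lambda_2 \ge 0, \quad \lambda_1 + \lambda_2 \le 1. \tag{3.22}$$
Each such operator is invariant under all transformations of the form $U\otimes U$ if $U$ is a unitary with $U = \bar{U}$; in other words, $U$ should be a real orthogonal matrix. A little bit of representation theory of the orthogonal group shows that in fact all operators with this invariance property have the form given in (3.22); cf. [159]. The corresponding states are therefore called OO-invariant, and we can apply basically the same machinery as in Section 3.1.2 if we replace the unitary group $U(d)$ by the orthogonal group $O(d)$. This includes, in particular, the definition of a twirl operation as an average over $O(d)$ (for an arbitrary $\rho \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$):
$$P_{OO}(\rho) = \int_{O(d)}(U\otimes U)\,\rho\,(U\otimes U)^*\,dU, \tag{3.23}$$
which we can express alternatively in terms of the expectation values $\operatorname{tr}(F\rho)$, $\operatorname{tr}(\tilde{F}\rho)$ by
$$P_{OO}(\rho) = \frac{\operatorname{tr}(\tilde{F}\rho)}{d}\,p_0 + \frac{1 - \operatorname{tr}(F\rho)}{2\operatorname{tr}(p_1)}\,p_1 + \Bigl(\frac{1 + \operatorname{tr}(F\rho)}{2} - \frac{\operatorname{tr}(\tilde{F}\rho)}{d}\Bigr)\frac{p_2}{\operatorname{tr}(p_2)}. \tag{3.24}$$
The range of allowed values for $\operatorname{tr}(F\rho)$, $\operatorname{tr}(\tilde{F}\rho)$ is given by
$$-1 \le \operatorname{tr}(F\rho) \le 1, \qquad 0 \le \operatorname{tr}(\tilde{F}\rho) \le d, \qquad \operatorname{tr}(F\rho) \ge \frac{2\operatorname{tr}(\tilde{F}\rho)}{d} - 1. \tag{3.25}$$
For $d = 3$ this is the upper triangle in Fig. 3.1.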
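The claims about the operators in Eq. (3.21) and the coordinates $(\operatorname{tr}(F\rho), \operatorname{tr}(\tilde{F}\rho))$ of the states (3.22) can be checked directly. The sketch below is an added illustration for $d = 3$ with hypothetical names; the closed forms $\operatorname{tr}(F\rho) = 1 - 2\lambda_1$ and $\operatorname{tr}(\tilde{F}\rho) = d(1 - \lambda_1 - \lambda_2)$ used in the comparison follow from $Fp_0 = p_0$, $Fp_1 = -p_1$, $Fp_2 = p_2$ and $\tilde{F} = d\,p_0$.

```python
import numpy as np

d = 3
one = np.eye(d * d)

F = np.zeros((d * d, d * d))                 # flip operator
for j in range(d):
    for k in range(d):
        F[j * d + k, k * d + j] = 1.0

psi = np.eye(d).reshape(-1)                  # unnormalized sum_j |jj>
Ft = np.outer(psi, psi)                      # F~ of Eq. (3.14)

p0 = Ft / d                                  # the three operators of Eq. (3.21)
p1 = (one - F) / 2
p2 = (one + F) / 2 - Ft / d
projections = [p0, p1, p2]

def oo_state(l1, l2):
    """OO-invariant state of Eq. (3.22); tr(p0) = 1, so p0 needs no normalization."""
    return (1 - l1 - l2) * p0 + l1 * p1 / np.trace(p1) + l2 * p2 / np.trace(p2)

l1, l2 = 0.2, 0.5
rho = oo_state(l1, l2)
x, y = np.trace(F @ rho), np.trace(Ft @ rho)

print([np.allclose(p @ p, p) for p in projections])    # each p_j is a projection
print(np.allclose(p0 @ p1, 0), np.allclose(p0 @ p2, 0), np.allclose(p1 @ p2, 0))
print(x, 1 - 2 * l1)                                   # tr(F rho)
print(y, d * (1 - l1 - l2))                            # tr(F~ rho)
print(-1 <= x <= 1 and 0 <= y <= d and x >= 2 * y / d - 1 - 1e-12)   # Eq. (3.25)
```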
The values in the lower (dotted) triangle belong to partial transpositions of OO-invariant states. The intersection of both, i.e. the gray-shaded square $Q = [0, 1]\times[0, 1]$, represents therefore the set of OO-invariant ppt states, and at the same time the set of separable states, since each OO-invariant ppt state is separable. To see the latter note that separable OO-invariant states form a convex subset of $Q$. Hence, we only have to show that the corners of $Q$ are separable. To do this note that (1) $P_{OO}(\rho)$ is separable whenever $\rho$ is and (2) that $\operatorname{tr}(FP_{OO}(\rho)) = \operatorname{tr}(F\rho)$ and $\operatorname{tr}(\tilde{F}P_{OO}(\rho)) = \operatorname{tr}(\tilde{F}\rho)$ holds (cf. Eq. (3.12)). We can consider pure product states $|\phi\otimes\psi\rangle\langle\phi\otimes\psi|$ for $\rho$ and get $(|\langle\phi, \psi\rangle|^2, |\langle\phi, \bar{\psi}\rangle|^2)$ for the tuple $(\operatorname{tr}(F\rho), \operatorname{tr}(\tilde{F}\rho))$. Now the point $(1, 1)$ in $Q$ is obtained if $\psi = \phi$ is real, the point $(0, 0)$ is obtained for real and orthogonal $\phi, \psi$, and the point $(1, 0)$ belongs to the case $\psi = \phi$ and $\langle\phi, \bar{\phi}\rangle = 0$. Symmetrically we get $(0, 1)$ with the same $\phi$ and $\psi = \bar{\phi}$.

3.1.5. PPT states

We have seen in Theorem 2.13 that separable states and ppt states coincide in $2\times 2$ and $2\times 3$ dimensions. Another class of examples with this property are the OO-invariant states just studied. Nevertheless, separability and a positive partial transpose are not equivalent. An easy way to produce examples of states which are entangled and ppt is given in terms of unextendible product bases [14]. An orthonormal family $\phi_j \in \mathcal{H}_1\otimes\mathcal{H}_2$, $j = 1,\ldots,N < d_1d_2$ (with $d_k = \dim\mathcal{H}_k$) is called an unextendible product basis$^{9}$ (UPB) iff (1) all $\phi_j$ are product vectors and (2) there is no product vector orthogonal to all $\phi_j$. Let us denote the projector to the span of all $\phi_j$ by $E$, its orthocomplement by $E^\perp$, i.e. $E^\perp = \mathbb{1} - E$, and define the state $\rho = (d_1d_2 - N)^{-1}E^\perp$. It is entangled because there is by construction no product vector in the support of $\rho$, and it is ppt. The latter can be seen as follows: The projector $E$ is a sum of the one-dimensional projectors $|\phi_j\rangle\langle\phi_j|$, $j = 1,\ldots,N$. Since all $\phi_j$ are product vectors the partial transposes of the $|\phi_j\rangle\langle\phi_j|$ are of the form $|\tilde{\phi}_j\rangle\langle\tilde{\phi}_j|$, with another UPB $\tilde{\phi}_j$, $j = 1,\ldots,N$, and the partial transpose $(\mathbb{1}\otimes\Theta)E$ of $E$ is the sum of the $|\tilde{\phi}_j\rangle\langle\tilde{\phi}_j|$. Hence $(\mathbb{1}\otimes\Theta)E^\perp = \mathbb{1} - (\mathbb{1}\otimes\Theta)E$ is a projector and therefore positive.

To construct entangled ppt states we have to find UPBs. The following two examples are taken from [14]. Consider first the five vectors
$$\phi_j = N\bigl(\cos(2\pi j/5), \sin(2\pi j/5), h\bigr), \qquad j = 0,\ldots,4 \tag{3.26}$$
with $N = 2/\sqrt{5 + \sqrt{5}}$ and $h = \tfrac{1}{2}\sqrt{1 + \sqrt{5}}$. They form the apex of a regular pentagonal pyramid with height $h$. The latter is chosen such that non-adjacent vectors are orthogonal. It is now easy to show that the five vectors
$$\Psi_j = \phi_j\otimes\phi_{2j \bmod 5}, \qquad j = 0,\ldots,4 \tag{3.27}$$
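form a UPB in the Hilbert space $\mathcal{H}\otimes\mathcal{H}$, $\dim\mathcal{H} = 3$ (cf. [14]).

$^{9}$ This name is somewhat misleading because the $\phi_j$ are not a basis of $\mathcal{H}_1\otimes\mathcal{H}_2$.

A second example, again in a $(3\times 3)$-dimensional Hilbert space, are the following five vectors (called “Tiles” in [14]):
$$\frac{1}{\sqrt{2}}|0\rangle\otimes(|0\rangle - |1\rangle), \quad \frac{1}{\sqrt{2}}|2\rangle\otimes(|1\rangle - |2\rangle), \quad \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)\otimes|2\rangle, \quad \frac{1}{\sqrt{2}}(|1\rangle - |2\rangle)\otimes|0\rangle, \quad \frac{1}{3}(|0\rangle + |1\rangle + |2\rangle)\otimes(|0\rangle + |1\rangle + |2\rangle), \tag{3.28}$$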
where $|k\rangle$, $k = 0, 1, 2$ denotes the standard basis in $\mathcal{H} = \mathbb{C}^3$.

3.1.6. Multipartite states

In many applications of quantum information rather big systems, consisting of a large number of subsystems, occur (e.g. a quantum register of a quantum computer) and it is necessary to study the corresponding correlation and entanglement properties. Since this is a fairly difficult task, not much is known about it, much less than in the two-partite case, which we mainly consider in this paper. Nevertheless, in this subsection we will give a rough outline of some of the most relevant aspects.

At the level of pure states the most significant difficulty is the lack of an analog of the Schmidt decomposition [126]. More precisely, there are elements in an $N$-fold tensor product $\mathcal{H}^{(1)}\otimes\cdots\otimes\mathcal{H}^{(N)}$ (with $N > 2$) which cannot be written as$^{10}$
$$\Psi = \sum_{j=1}^{d}\lambda_j\,\phi_j^{(1)}\otimes\cdots\otimes\phi_j^{(N)} \tag{3.29}$$
with $N$ orthonormal bases $\phi_1^{(k)},\ldots,\phi_d^{(k)}$ of $\mathcal{H}^{(k)}$, $k = 1,\ldots,N$. To get examples for such states in the tri-partite case, note first that any partial trace of $|\Psi\rangle\langle\Psi|$ with $\Psi$ from Eq. (3.29) has separable eigenvectors. Hence, each purification (Corollary 2.3) of an entangled, two-partite, mixed state with inseparable eigenvectors (e.g. a Bell diagonal state) does not admit a Schmidt decomposition. This implies on the one hand that there are interesting new properties to be discovered, but on the other we see that many techniques developed for bipartite pure states can be generalized in a straightforward way only for states which are Schmidt decomposable in the sense of Eq. (3.29). The most well-known representative of this class for a tripartite qubit system is the GHZ state [73]
$$\Psi = \frac{1}{\sqrt{2}}(|000\rangle + |111\rangle), \tag{3.30}$$
which has the special property that contradictions between local hidden variable theories and quantum mechanics occur even for non-statistical predictions (as opposed to maximally entangled states of bipartite systems [73,117,116]).

$^{10}$ There is, however, the possibility to choose the bases $\phi_1^{(k)},\ldots,\phi_d^{(k)}$ such that the number of summands becomes minimal. For tri-partite systems this “minimal canonical form” is studied in [1].

A second new aspect arising in the discussion of multiparty entanglement is the fact that several different notions of separability occur. A state $\rho$ of an $N$-partite system $\mathcal{B}(\mathcal{H}_1)\otimes\cdots\otimes\mathcal{B}(\mathcal{H}_N)$ is called $N$-separable if
$$\rho = \sum_J \lambda_J\,\rho_{j_1}\otimes\cdots\otimes\rho_{j_N} \tag{3.31}$$
with states $\rho_{j_k} \in \mathcal{B}^*(\mathcal{H}_k)$ and multiindices $J = (j_1,\ldots,j_N)$. Alternatively, however, we can decompose $\mathcal{B}(\mathcal{H}_1)\otimes\cdots\otimes\mathcal{B}(\mathcal{H}_N)$ into two subsystems (or even into $M$ subsystems if $M < N$) and call $\rho$ biseparable if it is separable with respect to this decomposition. It is obvious that $N$-separability implies
biseparability with respect to all possible decompositions. The converse is, not very surprisingly, not true. One way to construct a corresponding counterexample is to use an unextendible product basis (cf. Section 3.1.5). In [14] it is shown that the tripartite qubit state complementary to the UPB
$$|0, 1, +\rangle, \quad |1, +, 0\rangle, \quad |+, 0, 1\rangle, \quad |-, -, -\rangle \qquad\text{with } |\pm\rangle = \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle) \tag{3.32}$$
is entangled (i.e. tri-inseparable) but biseparable with respect to any decomposition into two subsystems (cf. [14] for details).

Another, maybe more systematic, way to find examples for multipartite states with interesting properties is the generalization of the methods used for Werner states (Section 3.1.2), i.e. to look for density matrices $\rho \in \mathcal{B}^*(\mathcal{H}^{\otimes N})$ which commute with all unitaries of the form $U^{\otimes N}$. Applying again Theorem 3.1 we see that each such $\rho$ is a linear combination of permutation unitaries. Hence, the structure of the set of all $U^{\otimes N}$ invariant states can be derived from representation theory of the symmetric group (which can be tedious for large $N$!). For $N = 3$ this program is carried out in [61] and it turns out that the corresponding set of invariant states is a five-dimensional (real) manifold. We skip the details here and refer to [61] instead.

3.2. Channels

In Section 2.3 we have introduced channels as very general objects transforming arbitrary types of information (i.e. classical, quantum and mixtures of them) into one another. In the following, we will consider some of the most important special cases.

3.2.1. Quantum channels

Many tasks of quantum information theory require the transmission of quantum information over long distances, using devices like optical fibers, or storing quantum information in some sort of memory. Both situations can be described by a channel or quantum operation $T: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$, where $T^*(\rho)$ is the quantum information which will be received when $\rho$ was sent, or alternatively: which will be read off the quantum memory when $\rho$ was written. Ideally, we would prefer those channels which do not affect the information at all, i.e. $T = \mathbb{1}$, or, as the next best choice, a $T$ whose action can be undone by a physical device, i.e. $T$ should be invertible and $T^{-1}$ is again a channel.
The Stinespring Theorem (Theorem 2.8) immediately shows that this implies $T^*\rho = U\rho U^*$ with a unitary $U$; in other words, the systems carrying the information do not interact with the environment. We will call such a kind of channel an ideal channel. In real situations, however, interaction with the environment, i.e. additional, unobservable degrees of freedom, cannot be avoided. The general structure of such a noisy channel is given by
$$T^*(\rho) = \operatorname{tr}_{\mathcal{K}}\bigl(U(\rho\otimes\rho_0)U^*\bigr), \tag{3.33}$$
where $U: \mathcal{H}\otimes\mathcal{K} \to \mathcal{H}\otimes\mathcal{K}$ is a unitary operator describing the common evolution of the system (Hilbert space $\mathcal{H}$) and the environment (Hilbert space $\mathcal{K}$) and $\rho_0 \in \mathcal{S}(\mathcal{K})$ is the initial state of the environment (cf. Fig. 3.2). It is obvious that the quantum information originally stored in $\rho \in \mathcal{S}(\mathcal{H})$ cannot be completely recovered from $T^*(\rho)$ if only one system is available. It is an easy consequence of the Stinespring theorem that each channel can be expressed in this form.
Fig. 3.2. Noisy channel.
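The structure of Eq. (3.33), a joint unitary evolution followed by a partial trace over the environment, translates directly into a few lines of NumPy. The following is a minimal sketch added for illustration; the function names and the choice of test unitaries (identity, flip, and a random unitary from a QR decomposition) are assumptions, not taken from the text.

```python
import numpy as np

def noisy_channel(rho, rho0, U):
    """Schroedinger-picture channel rho -> tr_K( U (rho (x) rho0) U^* ) of Eq. (3.33)."""
    dH, dK = rho.shape[0], rho0.shape[0]
    out = U @ np.kron(rho, rho0) @ U.conj().T
    # partial trace over the second (environment) factor
    return np.trace(out.reshape(dH, dK, dH, dK), axis1=1, axis2=3)

def flip(d):
    F = np.zeros((d * d, d * d))
    for j in range(d):
        for k in range(d):
            F[j * d + k, k * d + j] = 1.0
    return F

rho = np.array([[0.75, 0.25], [0.25, 0.25]])         # some system state
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])            # pure environment state |0><0|

rng = np.random.default_rng(1)
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U_random, _ = np.linalg.qr(z)                        # a generic system-environment coupling

print(np.allclose(noisy_channel(rho, rho0, np.eye(4)), rho))    # no interaction: ideal channel
print(np.allclose(noisy_channel(rho, rho0, flip(2)), rho0))     # swap: input is lost completely
out = noisy_channel(rho, rho0, U_random)
print(np.trace(out).real, np.linalg.eigvalsh(out).min() >= -1e-12)
```

The two extreme cases bracket the generic one: without interaction the state passes unchanged, while a swap with the environment replaces it by $\rho_0$.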
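Corollary 3.2 (Ancilla form). Assume that $T: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ is a channel. Then there is a Hilbert space $\mathcal{K}$, a pure state $\rho_0$ and a unitary map $U: \mathcal{H}\otimes\mathcal{K} \to \mathcal{H}\otimes\mathcal{K}$ such that Eq. (3.33) holds. It is always possible to choose $\mathcal{K}$ such that $\dim(\mathcal{K}) = \dim(\mathcal{H})^3$ holds.

Proof. Consider the Stinespring form $T(A) = V^*(A\otimes\mathbb{1})V$ with $V: \mathcal{H} \to \mathcal{H}\otimes\mathcal{K}$ of $T$ and choose a vector $\psi \in \mathcal{K}$ such that $U(\phi\otimes\psi) = V(\phi)$ can be extended to a unitary map $U: \mathcal{H}\otimes\mathcal{K} \to \mathcal{H}\otimes\mathcal{K}$ (this is always possible since $T$ is unital and $V$ therefore isometric). If $e_j \in \mathcal{H}$, $j = 1,\ldots,d_1$ and $f_k \in \mathcal{K}$, $k = 1,\ldots,d_2$ are orthonormal bases with $f_1 = \psi$ we get
$$\operatorname{tr}[T(A)\rho] = \operatorname{tr}[\rho V^*(A\otimes\mathbb{1})V] = \sum_j\langle V\rho e_j, (A\otimes\mathbb{1})Ve_j\rangle \tag{3.34}$$
$$= \sum_{jk}\langle U(\rho\otimes|\psi\rangle\langle\psi|)(e_j\otimes f_k), (A\otimes\mathbb{1})U(e_j\otimes f_k)\rangle \tag{3.35}$$
$$= \operatorname{tr}\bigl[\operatorname{tr}_{\mathcal{K}}[U(\rho\otimes|\psi\rangle\langle\psi|)U^*]A\bigr], \tag{3.36}$$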
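which proves the statement.

Note that there are, in general, many ways to express a channel this way, e.g. if $T$ is an ideal channel $\rho \mapsto T^*\rho = U\rho U^*$ we can rewrite it with an arbitrary unitary $U_0: \mathcal{K} \to \mathcal{K}$ by $T^*\rho = \operatorname{tr}_2(U\otimes U_0\,\rho\otimes\rho_0\,U^*\otimes U_0^*)$. This is the weakness of the ancilla form compared to the Stinespring representation of Theorem 2.8. Nevertheless, Corollary 3.2 shows that each channel which is not an ideal channel is noisy in the described way.

The most prominent example for a noisy channel is the depolarizing channel for $d$-level systems (i.e. $\mathcal{H} = \mathbb{C}^d$)
$$\mathcal{S}(\mathcal{H}) \ni \rho \mapsto \vartheta\rho + (1-\vartheta)\frac{\mathbb{1}}{d} \in \mathcal{S}(\mathcal{H}), \qquad 0 \le \vartheta \le 1, \tag{3.37}$$
or in the Heisenberg picture
$$\mathcal{B}(\mathcal{H}) \ni A \mapsto \vartheta A + (1-\vartheta)\frac{\operatorname{tr}(A)}{d}\mathbb{1} \in \mathcal{B}(\mathcal{H}). \tag{3.38}$$
A Stinespring dilation of $T$ (not the minimal one; this can be checked by counting dimensions) is given by $\mathcal{K} = \mathcal{H}\otimes\mathcal{H}\oplus\mathbb{C}$ and $V: \mathcal{H} \to \mathcal{H}\otimes\mathcal{K} = \mathcal{H}^{\otimes 3}\oplus\mathcal{H}$ with
$$|j\rangle \mapsto V|j\rangle = \Bigl[\sqrt{\frac{1-\vartheta}{d}}\sum_{k=1}^{d}|k\rangle\otimes|k\rangle\otimes|j\rangle\Bigr] \oplus \bigl[\sqrt{\vartheta}\,|j\rangle\bigr], \tag{3.39}$$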
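where $|k\rangle$, $k = 1,\ldots,d$ denotes again the canonical basis in $\mathcal{H}$. An ancilla form of $T$ with the same $\mathcal{K}$ is given by the (pure) environment state
$$\psi = \Bigl[\sqrt{\frac{1-\vartheta}{d}}\sum_{k=1}^{d}|k\rangle\otimes|k\rangle\Bigr] \oplus \bigl[\sqrt{\vartheta}\,|0\rangle\bigr] \in \mathcal{K} \tag{3.40}$$
and the unitary operator $U: \mathcal{H}\otimes\mathcal{K} \to \mathcal{H}\otimes\mathcal{K}$ with
$$U(\phi_1\otimes\phi_2\otimes\phi_3\oplus\chi) = \phi_2\otimes\phi_3\otimes\phi_1\oplus\chi, \tag{3.41}$$
i.e. $U$ is the direct sum of a permutation unitary and the identity.

3.2.2. Channels under symmetry

Similarly to the discussion in Section 3.1 it is often useful to consider channels with special symmetry properties. To be more precise, consider a group $G$ and two unitary representations $\pi_1, \pi_2$ on the Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively. A channel $T: \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H}_2)$ is called covariant (with respect to $\pi_1$ and $\pi_2$) if
$$T[\pi_1(U)A\pi_1(U)^*] = \pi_2(U)T[A]\pi_2(U)^* \qquad \forall A \in \mathcal{B}(\mathcal{H}_1)\ \ \forall U \in G \tag{3.42}$$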
holds. The general structure of covariant channels is governed by a fairly powerful variant of Stinespring’s theorem which we will state below (and which will be very useful for the study of the cloning problem in Section 7). Before we do this let us have a short look at a particular class of examples which is closely related to OO-invariant states. Hence consider a channel $T: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ which is covariant with respect to the orthogonal group, i.e. $T(UAU^*) = UT(A)U^*$ for all unitaries $U$ on $\mathcal{H}$ with $\bar{U} = U$ in a distinguished basis $|j\rangle$, $j = 1,\ldots,d$. The maximally entangled state $\psi = d^{-1/2}\sum_j|jj\rangle$ is OO-invariant, i.e. $U\otimes U\psi = \psi$ for all these $U$. Therefore, each state $\rho = (\mathrm{Id}\otimes T^*)|\psi\rangle\langle\psi|$ is OO-invariant as well and by the duality lemma (Theorem 2.10) $T$ and $\psi$ are uniquely determined (up to unitary equivalence) by $\rho$. This means we can use the structure of OO-invariant states derived in Section 3.1.4 to characterize all orthogonal covariant channels. As a first step consider the linear maps $X_1(A) = d\operatorname{tr}(A)\mathbb{1}$, $X_2(A) = dA^T$ and $X_3(A) = dA$. They are not channels (they are not unital and $X_2$ is not cp) but they have the correct covariance property and it is easy to see that they correspond to the operators $\mathbb{1}, F, \tilde{F} \in \mathcal{B}(\mathcal{H}\otimes\mathcal{H})$, i.e.
$$(\mathrm{Id}\otimes X_1)|\psi\rangle\langle\psi| = \mathbb{1}, \qquad (\mathrm{Id}\otimes X_2)|\psi\rangle\langle\psi| = F, \qquad (\mathrm{Id}\otimes X_3)|\psi\rangle\langle\psi| = \tilde{F}. \tag{3.43}$$
Using Eq. (3.21), we can determine therefore the channels which belong to the three extremal OO-invariant states (the corners of the upper triangle in Fig. 3.1):
$$T_0(A) = A, \qquad T_1(A) = \frac{\operatorname{tr}(A)\mathbb{1} - A^T}{d-1}, \tag{3.44}$$
$$T_2(A) = \frac{2}{d(d+1)-2}\Bigl[\frac{d}{2}\bigl(\operatorname{tr}(A)\mathbb{1} + A^T\bigr) - A\Bigr]. \tag{3.45}$$
Each OO-invariant channel is a convex linear combination of these three. Special cases are the channels corresponding to Werner and isotropic states. The latter leads to depolarizing channels
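$T(A) = \vartheta A + (1-\vartheta)d^{-1}\operatorname{tr}(A)\mathbb{1}$ with $\vartheta \in [0, d^2/(d^2-1)]$; cf. Eq. (3.15), while Werner states correspond to
$$T(A) = \frac{\vartheta}{d+1}\bigl[\operatorname{tr}(A)\mathbb{1} + A^T\bigr] + \frac{1-\vartheta}{d-1}\bigl[\operatorname{tr}(A)\mathbb{1} - A^T\bigr], \qquad \vartheta \in [0, 1], \tag{3.46}$$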
cf. Eq. (3.8). Let us come back now to the general case. We will state here the covariant version of the Stinespring theorem (see [98] for a proof). The basic idea is that all covariant channels are parameterized by representations on the dilation space. Theorem 3.3. Let G be a group with
x
holds. Hence, the family $(T_{xy})_{x\in X}$ is a probability distribution on $X$ and $T_{xy}$ is therefore the probability to get the information $x \in X$ at the output side of the channel if $y \in Y$ was sent. Each classical channel is uniquely determined by its matrix of transition probabilities. For $X = Y$ we see that the information is transmitted without error iff $T_{xy} = \delta_{xy}$, i.e. $T$ is an ideal channel if $T = \mathrm{Id}$ holds and noisy otherwise.

3.2.4. Observables and preparations

Let us consider now a channel which transforms quantum information $\mathcal{B}(\mathcal{H})$ into classical information $\mathcal{C}(X)$. Since positivity and complete positivity are again equivalent, we just have to look at a positive and unital map $E: \mathcal{C}(X) \to \mathcal{B}(\mathcal{H})$. With the canonical basis $|x\rangle\langle x|$, $x \in X$ of $\mathcal{C}(X)$ we get a family $E_x = E(|x\rangle\langle x|)$, $x \in X$ of positive operators $E_x \in \mathcal{B}(\mathcal{H})$ with $\sum_{x\in X}E_x = \mathbb{1}$. Hence the $E_x$ form a POV measure, i.e. an observable. If on the other hand a POV measure $E_x \in \mathcal{B}(\mathcal{H})$, $x \in X$ is given we can define a quantum to classical channel $E: \mathcal{C}(X) \to \mathcal{B}(\mathcal{H})$ by $E(f) = \sum_x f(x)E_x$.
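The correspondence between a POV measure and the map $E(f) = \sum_x f(x)E_x$ can be made concrete with a small example. The sketch below is an addition for illustration: the three-outcome qubit measurement (often called a trine) and all names are assumptions, not taken from the text. It checks that $E$ is unital and that $\operatorname{tr}(\rho E(f))$ is the expectation of $f$ under the outcome distribution $p_x = \operatorname{tr}(\rho E_x)$.

```python
import numpy as np

# Hypothetical three-outcome POV measure on a qubit: E_x = (2/3)|v_x><v_x|
angles = 2 * np.pi * np.arange(3) / 3
vectors = [np.array([np.cos(t), np.sin(t)]) for t in angles]
E = [2 / 3 * np.outer(v, v) for v in vectors]

def channel_E(f):
    """Heisenberg-picture map C(X) -> B(H), f -> sum_x f(x) E_x."""
    return sum(fx * Ex for fx, Ex in zip(f, E))

rho = np.array([[1.0, 0.0], [0.0, 0.0]])             # state |0><0|
probs = np.array([np.trace(rho @ Ex) for Ex in E])   # outcome probabilities tr(rho E_x)

f = np.array([1.0, -2.0, 0.5])                       # an arbitrary classical observable on X
print(np.allclose(channel_E(np.ones(3)), np.eye(2))) # E is unital: sum_x E_x = 1
print(probs, probs.sum())
print(np.trace(rho @ channel_E(f)), f @ probs)       # the two expectation values agree
```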
This shows that the observable $E_x$, $x \in X$ and the channel $E$ can be identified and we say $E$ is the observable.

Keeping this interpretation in mind it is possible to have a short look at continuous observables without the need of abstract measure theory: We only have to define the classical algebra $\mathcal{C}(X)$ for a set $X$ which is not finite or discrete. For simplicity, we assume that $X = \mathbb{R}$ holds; however, the generalization to other locally compact spaces is straightforward. We choose for $\mathcal{C}(\mathbb{R})$ the space of continuous, complex-valued functions vanishing at infinity, i.e. $|f(x)| < \epsilon$ for each $\epsilon > 0$ provided $|x|$ is large enough. $\mathcal{C}(\mathbb{R})$ can be equipped with the sup-norm and becomes an Abelian C*-algebra (cf. [25]). To interpret it as an operator algebra as assumed in Section 2.1.1 we have to identify $f \in \mathcal{C}(\mathbb{R})$ with the corresponding multiplication operator on $L^2(\mathbb{R})$. An observable taking arbitrary real values can now be defined as a positive map $E: \mathcal{C}(\mathbb{R}) \to \mathcal{B}(\mathcal{H})$. The probability to get a result in the interval $[a, b] \subset \mathbb{R}$ during an $E$ measurement on systems in the state $\rho$ is$^{11}$
$$\mu([a, b]) = \sup\{\operatorname{tr}(E(f)\rho) \mid f \in \mathcal{C}(\mathbb{R}),\ 0 \le f \le \mathbb{1},\ \operatorname{supp} f \subset [a, b]\}, \tag{3.48}$$
where supp denotes the support of $f$. The most well-known example for $\mathbb{R}$ valued observables are of course position $Q$ and momentum $P$ of a free particle in one dimension. In this case we have $\mathcal{H} = L^2(\mathbb{R})$ and the channels corresponding to $Q$ and $P$ are (in position representation) given by $\mathcal{C}(\mathbb{R}) \ni f \mapsto E_Q(f) \in \mathcal{B}(\mathcal{H})$ with $E_Q(f)\psi = f\psi$, respectively $\mathcal{C}(\mathbb{R}) \ni f \mapsto E_P(f) \in \mathcal{B}(\mathcal{H})$ with $E_P(f)\psi = (f\hat{\psi})^\vee$, where $\wedge$ and $\vee$ denote the Fourier transform and its inverse.

$^{11}$ Due to the Riesz–Markov theorem (cf. [134, Theorem IV.18]) the set function $\mu$ extends in a unique way to a probability measure on the real line.

Let us return now to a finite set $X$ and exchange the role of $\mathcal{C}(X)$ and $\mathcal{B}(\mathcal{H})$; in other words let us consider a channel $R: \mathcal{B}(\mathcal{H}) \to \mathcal{C}(X)$ with a classical input and a quantum output algebra. In the Schrödinger picture we get a family of density matrices $\rho_x := R^*(\delta_x) \in \mathcal{B}^*(\mathcal{H})$, $x \in X$, where $\delta_x \in \mathcal{C}^*(X)$ again denote the Dirac measures (cf. Section 2.1.3). Hence, we get a parameter-dependent preparation which can be used to encode the classical information $x \in X$ into the quantum information $\rho_x \in \mathcal{B}^*(\mathcal{H})$.

3.2.5. Instruments and parameter-dependent operations

An observable describes only the statistics of measuring results, but does not contain information about the state of the system after the measurement. To get a description which fills this gap we have to consider channels which operate on quantum systems and produce hybrid systems as output, i.e. $T: \mathcal{B}(\mathcal{H})\otimes\mathcal{M}(X) \to \mathcal{B}(\mathcal{K})$. Following Davies [50] we will call such an object an instrument. From $T$ we can derive the subchannel
$$\mathcal{C}(X) \ni f \mapsto T(\mathbb{1}\otimes f) \in \mathcal{B}(\mathcal{K}), \tag{3.49}$$
which is the observable measured by $T$, i.e. $\operatorname{tr}[T(\mathbb{1}\otimes|x\rangle\langle x|)\rho]$ is the probability to measure $x \in X$ on systems in the state $\rho$. On the other hand, we get for each $x \in X$ a quantum channel (which is not unital)
$$\mathcal{B}(\mathcal{H}) \ni A \mapsto T_x(A) = T(A\otimes|x\rangle\langle x|) \in \mathcal{B}(\mathcal{K}). \tag{3.50}$$
Fig. 3.3. Instrument. Fig. 3.4. Parameter-dependent operation.
It describes the operation performed by the instrument $T$ if $x \in X$ was measured. More precisely, if a measurement on systems in the state $\rho$ gives the result $x \in X$ we get (up to normalization) the state $T_x^*(\rho)$ after the measurement (cf. Fig. 3.3), while
$$\operatorname{tr}(T_x^*(\rho)) = \operatorname{tr}(T_x^*(\rho)\mathbb{1}) = \operatorname{tr}\bigl(\rho\,T(\mathbb{1}\otimes|x\rangle\langle x|)\bigr) \tag{3.51}$$
is (again) the probability to measure $x \in X$ on $\rho$. The instrument $T$ can be expressed in terms of the operations $T_x$ by
$$T(A\otimes f) = \sum_x f(x)T_x(A); \tag{3.52}$$
hence, we can identify $T$ with the family $T_x$, $x \in X$. Finally, we can consider the second marginal of $T$
$$\mathcal{B}(\mathcal{H}) \ni A \mapsto T(A\otimes\mathbb{1}) = \sum_{x\in X}T_x(A) \in \mathcal{B}(\mathcal{K}). \tag{3.53}$$
It describes the operation we get if the outcome of the measurement is ignored.

The best-known example of an instrument is a von Neumann–Lüders measurement associated to a PV measure given by a family of projections $E_x$, $x = 1,\dots,d$; e.g. the eigenprojections of a self-adjoint operator $A \in \mathcal{B}(\mathcal{H})$. It is defined as the channel

$$T : \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{H}) \quad \text{with } X = \{1,\dots,d\} \text{ and } T_x(A) = E_x A E_x. \tag{3.54}$$

Hence, we get the final state $\mathrm{tr}(E_x\rho)^{-1} E_x \rho E_x$ if we measure the value $x \in X$ on systems initially in the state $\rho$; this is well known from quantum mechanics.

Let us now exchange the roles of $\mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X)$ and $\mathcal{B}(\mathcal{K})$; in other words, consider a channel $T : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X)$ with hybrid input and quantum output. It describes a device which changes the state of a system depending on additional classical information. As for an instrument, $T$ decomposes into a family of (unital!) channels $T_x : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{H})$ such that we get $T^*(\rho \otimes p) = \sum_x p_x T_x^*(\rho)$ in the Schrödinger picture. Physically, $T$ describes a parameter-dependent operation: depending on the classical information $x \in X$, the quantum information $\rho \in \mathcal{B}(\mathcal{K})$ is transformed by the operation $T_x$ (cf. Fig. 3.4).

Finally, we can consider a channel $T : \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{K}) \otimes \mathcal{C}(Y)$ with hybrid input and output to get a parameter-dependent instrument (cf. Fig. 3.5). Similar to the discussion in the last paragraph, we can define a family of instruments $T_y : \mathcal{B}(\mathcal{H}) \otimes \mathcal{C}(X) \to \mathcal{B}(\mathcal{K})$, $y \in Y$, by the equation $T^*(\rho \otimes p) = \sum_y p_y T_y^*(\rho)$. Physically, $T$ describes the following device: It receives
Fig. 3.5. Parameter-dependent instrument. Fig. 3.6. One-way LOCC operation; cf. Fig. 3.7 for an explanation.
the classical information $y \in Y$ and a quantum system in the state $\rho \in \mathcal{B}^*(\mathcal{K})$ as input. Depending on $y$, a measurement with the instrument $T_y$ is performed, which in turn produces the measuring value $x \in X$ and leaves the quantum system in the state (up to normalization) $T_{y,x}^*(\rho)$, with $T_{y,x}$ given as in Eq. (3.50) by $T_{y,x}(A) = T_y(A \otimes |x\rangle\langle x|)$.

3.2.6. LOCC and separable channels

Let us consider now channels acting on finite-dimensional bipartite systems: $T : \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2) \to \mathcal{B}(\mathcal{K}_1 \otimes \mathcal{K}_2)$. In this case we can ask whether a channel preserves separability. Simple examples are local operations (LOs), i.e. $T = T^A \otimes T^B$ with two channels $T^{A,B} : \mathcal{B}(\mathcal{H}_j) \to \mathcal{B}(\mathcal{K}_j)$. Physically, we think of such a $T$ in terms of two physicists, Alice and Bob, both performing operations on their own particle but without any transmission of information, either classical or quantum. The next, more difficult, step is LOs with one-way classical communication (one-way LOCC). This means Alice operates on her system with an instrument, communicates the classical measuring result $j \in X = \{1,\dots,N\}$ to Bob, and he selects an operation depending on these data. We can write such a channel as a composition $T = (T^A \otimes \mathrm{Id})(\mathrm{Id} \otimes T^B)$ of the instrument $T^A : \mathcal{B}(\mathcal{H}_1) \otimes \mathcal{C}(X_1) \to \mathcal{B}(\mathcal{K}_1)$ and the parameter-dependent operation $T^B : \mathcal{B}(\mathcal{H}_2) \to \mathcal{C}(X_1) \otimes \mathcal{B}(\mathcal{K}_2)$ (cf. Fig. 3.6):

$$\mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2) \xrightarrow{\ \mathrm{Id} \otimes T^B\ } \mathcal{B}(\mathcal{H}_1) \otimes \mathcal{C}(X) \otimes \mathcal{B}(\mathcal{K}_2) \xrightarrow{\ T^A \otimes \mathrm{Id}\ } \mathcal{B}(\mathcal{K}_1 \otimes \mathcal{K}_2). \tag{3.55}$$
It is of course possible to continue the chain in Eq. (3.55), i.e. instead of just operating on his system, Bob can invoke a parameter-dependent instrument depending on Alice's data $j_1 \in X_1$, send the corresponding measuring results $j_2 \in X_2$ to Alice, and so on. Writing down the corresponding chain of maps (as in Eq. (3.55)) is simple but not very illuminating and is therefore omitted; cf. Fig. 3.7 instead. If we allow Alice and Bob to drop some of their particles, i.e. the operations they perform need not be unital, we get an LOCC channel ("local operations and classical communications"). It represents the most general physical process which can be performed on a bipartite system if only classical communication (in both directions) is available. The LOCC channels play a significant role in entanglement theory (we will see this in Section 4.3), but they are difficult to handle. Fortunately, it is often possible to replace them by closely
Fig. 3.7. LOCC operation. The upper and lower curly arrows represent Alice's and Bob's quantum systems, respectively, while the straight arrows in the middle stand for the classical information Alice and Bob exchange. The boxes symbolize the channels applied by Alice and Bob.
related operations with a simpler structure: A not necessarily unital channel $T : \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2) \to \mathcal{B}(\mathcal{K}_1 \otimes \mathcal{K}_2)$ is called separable if it is a sum of (in general non-unital) local operations, i.e.

$$T = \sum_{j=1}^{N} T_j^A \otimes T_j^B. \tag{3.56}$$
It is easy to see that a separable $T$ maps separable states to separable states (up to normalization) and that each LOCC channel is separable (cf. [13]). The converse, however, is (somewhat surprisingly) not true: there are separable channels which are not LOCC; see [13] for a concrete example.

3.3. Quantum mechanics in phase space

Up to now we have considered only finite-dimensional systems, and even in this extremely idealized situation it is not easy to get non-trivial results. At first sight the discussion of continuous quantum systems therefore seems hopeless. If, however, we restrict our attention to small classes of states and channels with sufficiently simple structure, many problems become tractable. Phase space quantum mechanics, which will be reviewed in this section (see [79, Chapter 5] for details), provides a very powerful tool in this context.

Before we start, let us add some remarks to the discussion of Section 2, which we have restricted to finite-dimensional Hilbert spaces. Basically, most of the material considered there can be generalized in a straightforward way, as long as topological issues like continuity and convergence arguments are treated carefully enough. There are of course some caveats (cf. in particular footnote 4 of Section 2); however, they do not lead to problems in the framework we are going to discuss and can therefore be ignored.

3.3.1. Weyl operators and the CCR

The kinematical structure of a quantum system with $d$ degrees of freedom is usually described by a separable Hilbert space $\mathcal{H}$ and $2d$ self-adjoint operators $Q_1,\dots,Q_d,P_1,\dots,P_d$ satisfying the canonical commutation relations $[Q_j,Q_k] = 0$, $[P_j,P_k] = 0$, $[Q_j,P_k] = i\delta_{jk}\mathbb{1}$. The latter can be rewritten in a more compact form as

$$R_{2j-1} = Q_j, \quad R_{2j} = P_j, \quad j = 1,\dots,d; \qquad [R_j,R_k] = -i\sigma_{jk}. \tag{3.57}$$
Here $\sigma$ denotes the symplectic matrix

$$\sigma = \mathrm{diag}(J,\dots,J), \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{3.58}$$
which plays a crucial role for the geometry of classical mechanics. We will henceforth call the pair $(V,\sigma)$, consisting of $\sigma$ and the $2d$-dimensional real vector space $V = \mathbb{R}^{2d}$, the classical phase space. The relations in Eq. (3.57) are, however, not sufficient to fix the operators $R_j$ up to unitary equivalence. The best way to remove the remaining physical ambiguities is to study the unitaries

$$W(x) = \exp(i x \cdot \sigma \cdot R), \quad x \in V, \qquad x \cdot \sigma \cdot R = \sum_{j,k=1}^{2d} x_j \sigma_{jk} R_k \tag{3.59}$$
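instead of the $R_j$ directly. If the family $W(x)$, $x \in V$, is irreducible (i.e. $[W(x),A] = 0$ for all $x \in V$ implies $A = \lambda\mathbb{1}$ with $\lambda \in \mathbb{C}$) and satisfies$^{12}$

$$W(x)W(x') = \exp\!\left(-\frac{i}{2}\, x \cdot \sigma \cdot x'\right) W(x + x'), \tag{3.60}$$

it is called an (irreducible) representation of the Weyl relations (on $(V,\sigma)$) and the operators $W(x)$ are called Weyl operators. By the well-known Stone–von Neumann uniqueness theorem all these representations are mutually unitarily equivalent, i.e. if we have two of them, $W_1(x)$ and $W_2(x)$, there is a unitary operator $U$ with $UW_1(x)U^* = W_2(x)$ for all $x \in V$. This implies that it does not matter from a physical point of view which representation we use. The best-known one is of course the Schrödinger representation, where $\mathcal{H} = L^2(\mathbb{R}^d)$ and $Q_j$, $P_k$ are the usual position and momentum operators.

3.3.2. Gaussian states

A density operator $\rho \in \mathcal{S}(\mathcal{H})$ has finite second moments if the expectation values $\mathrm{tr}(\rho Q_j^2)$ and $\mathrm{tr}(\rho P_j^2)$ are finite for all $j = 1,\dots,d$. In this case we can define the mean $m \in \mathbb{R}^{2d}$ and the correlation matrix $\alpha$ by

$$m_j = \mathrm{tr}(\rho R_j), \qquad \alpha_{jk} + i\sigma_{jk} = 2\,\mathrm{tr}[(R_j - m_j)\,\rho\,(R_k - m_k)]. \tag{3.61}$$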
The mean $m$ can be arbitrary, but the correlation matrix $\alpha$ must be real and symmetric and the positivity condition

$$\alpha + i\sigma \geq 0 \tag{3.62}$$

must hold (this is an easy consequence of the canonical commutation relations (3.57)).

Our aim is now to distinguish exactly one state among all others with the same mean and correlation matrix. This is the point where the Weyl operators come into play. Each state $\rho \in \mathcal{S}(\mathcal{H})$ can be characterized uniquely by its quantum characteristic function $X \ni x \mapsto \mathrm{tr}[W(x)\rho] \in \mathbb{C}$, which
$^{12}$ Note that the CCR (3.57) are implied by the Weyl relations (3.60) but the converse is, in contrast to popular belief, not true: There are representations of the CCR which are unitarily inequivalent to the Schrödinger representation; cf. [134, Section VIII.5] for particular examples. Hence, uniqueness can only be achieved on the level of Weyl operators, which is one major reason to study them.
should be regarded as the quantum Fourier transform of $\rho$ and is in fact the Fourier transform of the Wigner function of $\rho$ [164]. We call $\rho$ Gaussian if

$$\mathrm{tr}[W(x)\rho] = \exp\!\left(i\, m \cdot x - \tfrac{1}{4}\, x \cdot \alpha \cdot x\right) \tag{3.63}$$
holds. By differentiation it is easy to check that $\rho$ has indeed mean $m$ and covariance matrix $\alpha$. The most prominent examples of Gaussian states are the ground state $\rho_0$ of a system of $d$ harmonic oscillators (where the mean is $0$ and $\alpha$ is given by the corresponding classical Hamiltonian) and its phase space translates $\rho_m = W(m)\rho_0 W(-m)$ (with mean $m$ and the same $\alpha$ as $\rho_0$), which are known from quantum optics as coherent states. $\rho_0$ and $\rho_m$ are pure states, and it can be shown that a Gaussian state is pure iff $(\sigma^{-1}\alpha)^2 = -\mathbb{1}$ holds (see [79, Chapter 5]). Examples of mixed Gaussians are temperature states of harmonic oscillators. In one degree of freedom this is

$$\rho_N = \frac{1}{N+1} \sum_{n=0}^{\infty} \left(\frac{N}{N+1}\right)^{n} |n\rangle\langle n|, \tag{3.64}$$

where $|n\rangle\langle n|$ denotes the number basis and $N$ is the mean photon number. The characteristic function of $\rho_N$ is

$$\mathrm{tr}[W(x)\rho_N] = \exp\!\left[-\tfrac{1}{2}\left(N + \tfrac{1}{2}\right)|x|^2\right] \tag{3.65}$$
and its correlation matrix is simply $\alpha = 2(N + 1/2)\mathbb{1}$.

3.3.3. Entangled Gaussians

Let us now consider bipartite systems. Hence the phase space $(V,\sigma)$ decomposes into a direct sum $V = V_A \oplus V_B$ (where $A$ stands for "Alice" and $B$ for "Bob") and the symplectic matrix $\sigma = \sigma_A \oplus \sigma_B$ is block diagonal with respect to this decomposition. If $W_A(x)$, respectively $W_B(y)$, denote Weyl operators acting on the Hilbert spaces $\mathcal{H}_A$, $\mathcal{H}_B$ and corresponding to the phase spaces $V_A$ and $V_B$, it is easy to see that the tensor product $W_A(x) \otimes W_B(y)$ satisfies the Weyl relations with respect to $(V,\sigma)$. Hence, by the Stone–von Neumann uniqueness theorem, we can identify $W(x \oplus y)$, $x \oplus y \in V_A \oplus V_B = V$, with $W_A(x) \otimes W_B(y)$. This immediately shows that a state $\rho$ on $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ is a product state iff its characteristic function factorizes. Separability$^{13}$ is characterized as follows (we omit the proof; see [170] instead).

Theorem 3.4. A Gaussian state with covariance matrix $\alpha$ is separable iff there are covariance matrices $\alpha_A$, $\alpha_B$ such that

$$\alpha \geq \begin{pmatrix} \alpha_A & 0 \\ 0 & \alpha_B \end{pmatrix} \tag{3.66}$$

holds.

This theorem is somewhat similar to Theorem 2.1: It provides a useful criterion as far as abstract considerations are concerned, but not for explicit calculations. In contrast to finite-dimensional
$^{13}$ In infinite dimensions we have to define separable states (in slight generalization of Definition 2.5) as a trace-norm convergent convex sum of product states.
systems, however, separability of Gaussian states can be decided by an operational criterion in terms of nonlinear maps between matrices [65]. To state it we have to introduce some terminology first. The key tool is a sequence of $(2n + 2m) \times (2n + 2m)$ matrices $\alpha_N$, $N \in \mathbb{N}$, written in block matrix notation as

$$\alpha_N = \begin{pmatrix} A_N & C_N \\ C_N^T & B_N \end{pmatrix}. \tag{3.67}$$

Given $\alpha_0$, the other $\alpha_N$ are recursively defined by

$$A_{N+1} = B_{N+1} = A_N - \mathrm{Re}(X_N) \quad \text{and} \quad C_{N+1} = -\mathrm{Im}(X_N) \tag{3.68}$$

if $\alpha_N - i\sigma \geq 0$, and $\alpha_{N+1} = 0$ otherwise. Here we have set $X_N = C_N (B_N - i\sigma_B)^{-1} C_N^T$, and the inverse denotes the pseudoinverse$^{14}$ if $B_N - i\sigma_B$ is not invertible. Now we can state the following theorem (see [65] for a proof).

Theorem 3.5. Consider a Gaussian state $\rho$ of a bipartite system with correlation matrix $\alpha_0$ and the sequence $\alpha_N$, $N \in \mathbb{N}$ just de
$^{14}$ $A^{-1}$ is the pseudoinverse of a matrix $A$ if $AA^{-1} = A^{-1}A$ is the projector onto the range of $A$. If $A$ is invertible, $A^{-1}$ is the usual inverse.
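Proposition 3.6. A Gaussian state is ppt iff its correlation matrix $\alpha$ satisfies

$$\alpha + i\tilde\sigma \geq 0 \quad \text{with} \quad \tilde\sigma = \begin{pmatrix} -\sigma_A & 0 \\ 0 & \sigma_B \end{pmatrix}. \tag{3.69}$$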
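The conditions (3.62) and (3.69) are both finite-dimensional eigenvalue checks, so they are easy to try out numerically. The following is a minimal sketch, not part of the original text: the helper names (`symplectic_form`, `is_valid_covariance`, `is_ppt`, `two_mode_squeezed`) are hypothetical, and the two-mode squeezed covariance matrix is an assumed illustrative example written in the ordering $(q_A, p_A, q_B, p_B)$ with the normalization in which the vacuum has $\alpha = \mathbb{1}$.

```python
import numpy as np

# 2x2 block J from Eq. (3.58)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])


def symplectic_form(d, sign=1.0):
    """sign * diag(J, ..., J) for d degrees of freedom."""
    return sign * np.kron(np.eye(d), J)


def direct_sum(a, b):
    """Block-diagonal matrix with blocks a and b."""
    out = np.zeros((a.shape[0] + b.shape[0], a.shape[1] + b.shape[1]))
    out[: a.shape[0], : a.shape[1]] = a
    out[a.shape[0]:, a.shape[1]:] = b
    return out


def min_eig(alpha, form):
    """Smallest eigenvalue of the Hermitian matrix alpha + i * form."""
    return np.linalg.eigvalsh(alpha + 1j * form).min()


def is_valid_covariance(alpha, d_a, d_b, tol=1e-9):
    """Positivity condition (3.62): alpha + i sigma >= 0."""
    sigma = direct_sum(symplectic_form(d_a), symplectic_form(d_b))
    return min_eig(alpha, sigma) >= -tol


def is_ppt(alpha, d_a, d_b, tol=1e-9):
    """ppt condition (3.69): alpha + i sigma_tilde >= 0, with Alice's block sign-flipped."""
    sigma_tilde = direct_sum(symplectic_form(d_a, -1.0), symplectic_form(d_b))
    return min_eig(alpha, sigma_tilde) >= -tol


def two_mode_squeezed(r):
    """Assumed example: covariance matrix of a two-mode squeezed state with squeezing r."""
    c, s = np.cosh(2 * r), np.sinh(2 * r)
    z = np.diag([1.0, -1.0])
    return np.block([[c * np.eye(2), s * z], [s * z, c * np.eye(2)]])


if __name__ == "__main__":
    for r in (0.0, 0.5):
        alpha = two_mode_squeezed(r)
        print(r, is_valid_covariance(alpha, 1, 1), is_ppt(alpha, 1, 1))
```

For $r > 0$ the example matrix passes the positivity test but fails the ppt test, so for this $1 \times 1$ system the state is entangled; for $r = 0$ (the vacuum) both tests pass.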
The interesting question is now whether the ppt criterion is (for a given number of degrees of freedom) equivalent to separability or not. The following theorem, which was proved in [144] for $1 \times 1$ systems and in [170] for the $1 \times d$ case, gives a complete answer.

Theorem 3.7. A Gaussian state of a quantum system with $1 \times d$ degrees of freedom (i.e. $\dim X_A = 2$ and $\dim X_B = 2d$) is separable iff it is ppt; in other words, iff the condition of Proposition 3.6 holds.

For other kinds of systems the ppt criterion may fail, which means that there are entangled Gaussian states which are ppt. A systematic way to construct such states can be found in [170]. Roughly speaking, it is based on the idea of going to the boundary of the set of ppt covariance matrices, i.e. $\alpha$ has to satisfy Eqs. (3.62) and (3.69) and it has to be a minimal matrix with this property. Using this method, explicit examples of ppt and entangled Gaussians are constructed for $2 \times 2$ degrees of freedom (cf. [170] for details).

3.3.4. Gaussian channels

Finally, we want to give a short review of a special class of channels for infinite-dimensional quantum systems (cf. [84] for details). To explain the basic idea, first note that each finite set of Weyl operators ($W(x_j)$, $j = 1,\dots,N$, $x_j \neq x_k$ for $j \neq k$) is linearly independent. This can be checked easily using expectation values of $\sum_j \lambda_j W(x_j)$ in Gaussian states. Hence, linear maps on the space of finite linear combinations of Weyl operators can be defined by $T[W(x)] = f(x)W(Ax)$, where $f$ is a complex-valued function on $V$ and $A$ is a $2d \times 2d$ matrix. If we choose $A$ and $f$ carefully enough, such that some continuity properties match, $T$ can be extended in a unique way to a linear map on $\mathcal{B}(\mathcal{H})$, which is, however, in general not completely positive. This means we have to consider special choices for $A$ and $f$.

The easiest case arises if $f \equiv 1$ and $A$ is a symplectic isomorphism, i.e. $A^T \sigma A = \sigma$. If this holds, the map $V \ni x \mapsto W(Ax)$ is a representation of the Weyl relations and therefore unitarily equivalent to the representation we have started with. In other words, there is a unitary operator $U$ with $T[W(x)] = W(Ax) = UW(x)U^*$, i.e. $T$ is unitarily implemented, hence completely positive and, in fact, well known as a Bogolubov transformation.

If $A$ does not preserve the symplectic matrix, $f \equiv 1$ is no option. Instead, we have to choose $f$ such that the matrices

$$M_{jk} = f(x_j - x_k) \exp\!\left(-\frac{i}{2}\, x_j \cdot \sigma x_k + \frac{i}{2}\, A x_j \cdot \sigma A x_k\right) \tag{3.70}$$

are positive. Complete positivity of the corresponding $T$ is then a standard result of abstract $C^*$-algebra theory (cf. [51]). If the factor $f$ is in addition a Gaussian, i.e. $f(x) = \exp(-\tfrac{1}{2}\, x \cdot \beta x)$ for a positive definite matrix $\beta$, the cp-map $T$ is called a Gaussian channel.
A simple way to construct a Gaussian channel is in terms of an ancilla representation. More precisely, if $A : V \to V$ is an arbitrary linear map, we can extend it to a symplectic map $V \ni x \mapsto Ax \oplus A'x \in V \oplus V'$, where the symplectic vector space $(V',\sigma')$ now refers to the environment. Consider now the Weyl operator $W(x) \otimes W'(x') = W(x,x')$ on the Hilbert space $\mathcal{H} \otimes \mathcal{H}'$ associated to the phase space element $x \oplus x' \in V \oplus V'$. Since $A \oplus A'$ is symplectic, it admits a unitary Bogolubov transformation $U : \mathcal{H} \otimes \mathcal{H}' \to \mathcal{H} \otimes \mathcal{H}'$ with $U^* W(x,x') U = W(Ax, A'x)$. If $\rho'$ denotes a Gaussian density matrix on $\mathcal{H}'$ describing the initial state of the environment, we get a Gaussian channel by

$$\mathrm{tr}[T^*(\rho)W(x)] = \mathrm{tr}[\rho \otimes \rho'\, U^* W(x,x') U] = \mathrm{tr}[\rho\, W(Ax)]\, \mathrm{tr}[\rho'\, W'(A'x)]. \tag{3.71}$$
Hence $T[W(x)] = f(x)W(Ax)$ with $f(x) = \mathrm{tr}[\rho' W'(A'x)]$. Particular examples of Gaussian channels in the case of one degree of freedom are attenuation and amplification channels [81,84]. They are given in terms of a real parameter $k \neq 1$ by

$$\mathbb{R}^2 \ni x \mapsto Ax = kx \in \mathbb{R}^2, \qquad \mathbb{R}^2 \ni x \mapsto A'x = \sqrt{1 - k^2}\, x \in \mathbb{R}^2 \tag{3.72}$$

for $k < 1$ and

$$\mathbb{R}^2 \ni (q,p) \mapsto A'(q,p) = (\kappa q, -\kappa p) \in \mathbb{R}^2 \quad \text{with } \kappa = \sqrt{k^2 - 1} \tag{3.73}$$

for $k > 1$. If the environment is initially in a thermal state $\rho_{\tilde N}$ (cf. Eq. (3.64)), this leads to

$$T[W(x)] = \exp\!\left[-\frac{1}{2}\left(\frac{|k^2 - 1|}{2} + N_c\right) x^2\right] W(kx), \tag{3.74}$$

where we have set $N_c = |k^2 - 1|\tilde N$. If we start initially with a thermal state $\rho_N$, it is mapped by $T$ again to a thermal state $\rho_{N'}$ with mean photon number $N'$ given by

$$N' = k^2 N + \max\{0, k^2 - 1\} + N_c. \tag{3.75}$$
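Eq. (3.75) follows from Eqs. (3.65) and (3.74) by comparing the Gaussian exponents of the characteristic functions. The short sketch below carries out this bookkeeping numerically; the function names are hypothetical and the parameter values in the loop are arbitrary illustrations.

```python
def noise_exponent(k, n_env):
    """Coefficient c in f(x) = exp(-c x^2 / 2), read off from Eq. (3.74)."""
    n_c = abs(k**2 - 1.0) * n_env  # N_c = |k^2 - 1| * N_tilde
    return abs(k**2 - 1.0) / 2.0 + n_c


def output_photon_number(k, n_in, n_env):
    """Mean photon number of the output state, via characteristic functions.

    By Eq. (3.65), tr[rho_N W(kx)] = exp(-(N + 1/2) k^2 x^2 / 2), so the output
    exponent is the sum of the signal and noise coefficients. A thermal state
    with mean photon number N' has coefficient N' + 1/2.
    """
    total = k**2 * (n_in + 0.5) + noise_exponent(k, n_env)
    return total - 0.5


def closed_form(k, n_in, n_env):
    """Right-hand side of Eq. (3.75)."""
    n_c = abs(k**2 - 1.0) * n_env
    return k**2 * n_in + max(0.0, k**2 - 1.0) + n_c


if __name__ == "__main__":
    for k in (0.5, 0.9, 1.5, 2.0):
        print(k, output_photon_number(k, 3.0, 0.7), closed_form(k, 3.0, 0.7))
```

The two columns agree for attenuation ($k < 1$) as well as amplification ($k > 1$), which is exactly the case distinction hidden in the $\max\{0, k^2 - 1\}$ term.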
If $N_c = 0$ this means that $T$ amplifies ($k > 1$) or damps ($k < 1$) the mean photon number, while $N_c > 0$ leads to additional classical, Gaussian noise. We will reconsider this channel in greater detail in Section 6.

4. Basic tasks

Having discussed the conceptual foundations of quantum information, we will now consider some of its basic tasks. The spectrum ranges from elementary processes, like teleportation (Section 4.1) or error correction (Section 4.4), which are building blocks for more complex applications, up to possible future technologies like quantum cryptography (Section 4.6) and quantum computing (Section 4.5).

4.1. Teleportation and dense coding

Maybe the most striking feature of entanglement is the fact that otherwise impossible machines become possible if entangled states are used as an additional resource. The most prominent examples are teleportation and dense coding, which we want to discuss in this section.
4.1.1. Impossible machines revisited: classical teleportation

We have already pointed out in the introduction that classical teleportation, i.e. transmission of quantum information over a classical information channel, is impossible. With the material introduced in the last two chapters it is now possible to reconsider this subject in a slightly more mathematical way, which makes the following treatment of entanglement-enhanced teleportation more transparent.

To "teleport" the state $\rho \in \mathcal{B}^*(\mathcal{H})$, Alice performs a measurement (described by a POV measure $E_1,\dots,E_N \in \mathcal{B}(\mathcal{H})$) on her system and gets a value $x \in X = \{1,\dots,N\}$ with probability $p_x = \mathrm{tr}(E_x\rho)$. These data she communicates to Bob and he prepares a $\mathcal{B}(\mathcal{H})$ system in the state $\rho_x$. Hence the overall state Bob gets if the experiment is repeated many times is $\tilde\rho = \sum_{x \in X} \mathrm{tr}(E_x\rho)\rho_x$ (cf. Fig. 1.1). The latter can be rewritten as the composition

$$\mathcal{B}^*(\mathcal{H}) \xrightarrow{\ E^*\ } \mathcal{C}^*(X) \xrightarrow{\ D^*\ } \mathcal{B}^*(\mathcal{H}) \tag{4.1}$$

of the channels

$$\mathcal{C}(X) \ni f \mapsto E(f) = \sum_{x \in X} f(x) E_x \in \mathcal{B}(\mathcal{H}) \tag{4.2}$$

and

$$\mathcal{C}^*(X) \ni p \mapsto D^*(p) = \sum_{x \in X} p_x \rho_x \in \mathcal{B}^*(\mathcal{H}), \tag{4.3}$$

i.e. $\tilde\rho = D^*E^*(\rho)$, and this equation makes sense even if $X$ is not finite. The teleportation is successful if the output state $\tilde\rho$ cannot be distinguished from the input state $\rho$ by any statistical experiment, i.e. if $D^*E^*(\rho) = \rho$. Hence the impossibility of classical teleportation can be rephrased simply as $ED \neq \mathrm{Id}$ for all observables $E$ and all preparations $D$.

4.1.2. Entanglement enhanced teleportation

Let us now change our setup slightly. Assume that Alice wants to send a quantum state $\rho \in \mathcal{B}^*(\mathcal{H})$ to Bob and that she shares an entangled state $\sigma \in \mathcal{B}^*(\mathcal{K} \otimes \mathcal{K})$ and an ideal classical communication channel $\mathcal{C}(X) \to \mathcal{C}(X)$ with him. Alice can perform a measurement $E : \mathcal{C}(X) \to \mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ on the composite system $\mathcal{B}(\mathcal{H} \otimes \mathcal{K})$ consisting of the particle to teleport ($\mathcal{B}(\mathcal{H})$) and her part of the entangled system ($\mathcal{B}(\mathcal{K})$). Then she communicates the classical data $x \in X$ to Bob and he operates with the parameter-dependent operation $D : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K}) \otimes \mathcal{C}(X)$ appropriately on his particle (cf. Fig. 4.1). Hence, the overall procedure can be described by the channel $T = (E \otimes \mathrm{Id})D$,
Fig. 4.1. Entanglement enhanced teleportation.
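or in analogy to (4.1)

$$\mathcal{B}^*(\mathcal{H} \otimes \mathcal{K}^{\otimes 2}) \xrightarrow{\ E^* \otimes \mathrm{Id}\ } \mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{K}) \xrightarrow{\ D^*\ } \mathcal{B}^*(\mathcal{H}). \tag{4.4}$$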
The teleportation of $\rho$ is successful if

$$T^*(\rho \otimes \sigma) := D^*\big((E^* \otimes \mathrm{Id})(\rho \otimes \sigma)\big) = \rho \tag{4.5}$$
holds, in other words if there is no statistical measurement which can distinguish the final state $T^*(\rho \otimes \sigma)$ of Bob's particle from the initial state $\rho$ of Alice's input system. The two channels $E$ and $D$ and the entangled state $\sigma$ form a teleportation scheme if Eq. (4.5) holds for all states $\rho$ of the $\mathcal{B}(\mathcal{H})$ system, i.e. if each state of a $\mathcal{B}(\mathcal{H})$ system can be teleported without loss of quantum information.

Assume now that $\mathcal{H} = \mathcal{K} = \mathbb{C}^d$ and $X = \{0,\dots,d^2 - 1\}$ holds. In this case we can define a teleportation scheme as follows: The entangled state shared by Alice and Bob is a maximally entangled state $\sigma = |\Omega\rangle\langle\Omega|$ and Alice performs a measurement which is given by the one-dimensional projections $E_j = |\Phi_j\rangle\langle\Phi_j|$, where $\Phi_j \in \mathcal{H} \otimes \mathcal{H}$, $j = 0,\dots,d^2 - 1$, is a basis of maximally entangled vectors. If her result is $j = 0,\dots,d^2 - 1$, Bob has to apply the operation $\tau \mapsto U_j^* \tau U_j$ on his partner of the entangled pair, where the $U_j \in \mathcal{B}(\mathcal{H})$, $j = 0,\dots,d^2 - 1$, are an orthonormal family of unitary operators, i.e. $\mathrm{tr}(U_j^* U_k) = d\,\delta_{jk}$. Hence, the parameter-dependent operation $D$ has the form (in the Schrödinger picture):

$$\mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{H}) \ni (p,\tau) \mapsto D^*(p,\tau) = \sum_{j=0}^{d^2-1} p_j U_j^* \tau U_j \in \mathcal{B}^*(\mathcal{H}). \tag{4.6}$$
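Therefore, we get for $T^*(\rho \otimes \sigma)$ from Eq. (4.5)

$$\mathrm{tr}[T^*(\rho \otimes \sigma)A] = \mathrm{tr}[(E \otimes \mathrm{Id})^*(\rho \otimes \sigma)D(A)] \tag{4.7}$$

$$= \mathrm{tr}\Big[\sum_{j=0}^{d^2-1} \mathrm{tr}_{12}\big[|\Phi_j\rangle\langle\Phi_j|(\rho \otimes \sigma)\big]\, U_j^* A U_j\Big] \tag{4.8}$$

$$= \sum_{j=0}^{d^2-1} \mathrm{tr}\big[(\rho \otimes \sigma)\, |\Phi_j\rangle\langle\Phi_j| \otimes (U_j^* A U_j)\big]. \tag{4.9}$$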
Here $\mathrm{tr}_{12}$ denotes the partial trace over the first two tensor factors (= Alice's qubits). If $\Omega$, the $\Phi_j$ and the $U_j$ are related by the equation

$$\Phi_j = (U_j \otimes \mathbb{1})\Omega, \tag{4.10}$$

it is a straightforward calculation to show that $T^*(\rho \otimes \sigma) = \rho$ holds as expected [167]. If $d = 2$ there is basically a unique choice: the $\Phi_j$, $j = 0,\dots,3$, are the four Bell states (cf. Eq. (3.3)), $\Omega = \Phi_0$ and the $U_j$ are the identity and the three Pauli matrices. In this way, we recover the standard example of teleportation, published for the first time in [11]. The first experimental realizations are [24,22].
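The "straightforward calculation" can be checked numerically in the qubit case. The following is a minimal sketch under stated assumptions: it uses $d = 2$, the Pauli matrices as the $U_j$, and builds the Bell basis from Eq. (4.10); the function name `teleport` and the tensor ordering (input, Alice's half, Bob's half) are illustrative choices, not taken from the text. The output state is obtained by dualizing Eq. (4.9), i.e. Bob's conditional state is conjugated with $U_j$; since the Pauli matrices are Hermitian this coincides with the map $\tau \mapsto U_j^* \tau U_j$ of Eq. (4.6).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = [
    I2,
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Maximally entangled vector Omega = (|00> + |11>) / sqrt(2) and sigma = |Omega><Omega|
OMEGA = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
SIGMA = np.outer(OMEGA, OMEGA.conj())


def teleport(rho):
    """Output state T*(rho ⊗ sigma), summed over all four measurement outcomes."""
    total = np.kron(rho, SIGMA)  # factors: input (1), Alice's half (2), Bob's half (3)
    out = np.zeros((2, 2), dtype=complex)
    for u in PAULIS:
        phi = np.kron(u, I2) @ OMEGA  # Phi_j = (U_j ⊗ 1) Omega, Eq. (4.10)
        proj = np.kron(np.outer(phi, phi.conj()), I2)  # |Phi_j><Phi_j| on factors 1, 2
        # Partial trace over factors 1 and 2 gives Bob's unnormalized state
        block = (proj @ total).reshape(4, 2, 4, 2)
        bob = np.einsum("iaib->ab", block)
        # Bob's correction; u is Hermitian, so u @ bob @ u^† equals u^† @ bob @ u
        out += u @ bob @ u.conj().T
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = m @ m.conj().T
    rho = rho / np.trace(rho)
    print(np.allclose(teleport(rho), rho))
```

Each of the four outcomes occurs with probability $1/4$ independently of $\rho$, which is why the unnormalized contributions simply add up to the input state.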
Fig. 4.2. Dense coding.
4.1.3. Dense coding

We have just shown how quantum information can be transmitted via a classical channel if entanglement is available as an additional resource. Now we look at the dual procedure: transmission of classical information over a quantum channel. To send the classical information $x \in X = \{1,\dots,n\}$ to Bob, Alice can prepare a $d$-level quantum system in the state $\rho_x \in \mathcal{B}^*(\mathcal{H})$ and send it to Bob, who measures an observable given by positive operators $E_1,\dots,E_m$. The probability for Bob to receive the signal $y \in X$ if Alice has sent $x \in X$ is $\mathrm{tr}(\rho_x E_y)$, and this defines a classical information channel by (cf. Section 3.2.3)

$$\mathcal{C}^*(X) \ni p \mapsto \Big(\sum_{x \in X} p(x)\,\mathrm{tr}(\rho_x E_1), \dots, \sum_{x \in X} p(x)\,\mathrm{tr}(\rho_x E_m)\Big) \in \mathcal{C}^*(X). \tag{4.11}$$

To get an ideal channel we just have to choose mutually orthogonal pure states $\rho_x = |\psi_x\rangle\langle\psi_x|$, $x = 1,\dots,d$, on Alice's side and the corresponding one-dimensional projections $E_y = |\psi_y\rangle\langle\psi_y|$, $y = 1,\dots,d$, on Bob's. If $d = 2$ and $\mathcal{H} = \mathbb{C}^2$, it is possible to send one bit of classical information via one qubit of quantum information. The crucial point is now that the amount of classical information can be increased (doubled in the qubit case) if Alice shares an entangled state $\sigma \in \mathcal{S}(\mathcal{H} \otimes \mathcal{H})$ with Bob. To send the classical information $x \in X = \{1,\dots,n\}$ to Bob, Alice operates on her particle with an operation $D_x : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ and sends it through an (ideal) quantum channel to Bob, who performs a measurement $E_1,\dots,E_n \in \mathcal{B}(\mathcal{H} \otimes \mathcal{H})$ on both particles. The probability for Bob to measure $y \in X$ if Alice has sent $x \in X$ is given by

$$\mathrm{tr}[(D_x \otimes \mathrm{Id})^*(\sigma)E_y] \tag{4.12}$$
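and this defines the transition matrix of a classical communication channel $T$. If $T$ is an ideal channel, i.e. if the transition matrix (4.12) is the identity, we will call $E$, $D$ and $\sigma$ a dense coding scheme (cf. Fig. 4.2). In analogy to Eq. (4.4) we can rewrite the channel $T$ defined by (4.12) in terms of the composition

$$\mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{H}) \otimes \mathcal{B}^*(\mathcal{H}) \xrightarrow{\ D^* \otimes \mathrm{Id}\ } \mathcal{B}^*(\mathcal{H}) \otimes \mathcal{B}^*(\mathcal{H}) \xrightarrow{\ E^*\ } \mathcal{C}^*(X) \tag{4.13}$$

of the parameter-dependent operation

$$D^* : \mathcal{C}^*(X) \otimes \mathcal{B}^*(\mathcal{H}) \to \mathcal{B}^*(\mathcal{H}), \qquad p \otimes \tau \mapsto \sum_{j=1}^{n} p_j D_j^*(\tau) \tag{4.14}$$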
and the observable

$$E : \mathcal{C}(X) \to \mathcal{B}(\mathcal{H} \otimes \mathcal{H}), \qquad p \mapsto \sum_{j=1}^{n} p_j E_j, \tag{4.15}$$
i.e. $T^*(p) = E^* \circ (D^* \otimes \mathrm{Id})(p \otimes \sigma)$. The advantage of this point of view is that it works as well for infinite-dimensional Hilbert spaces and continuous observables.

Finally, let us again consider the case where $\mathcal{H} = \mathbb{C}^d$ and $X = \{1,\dots,d^2\}$. If we choose, as in the last paragraph, a maximally entangled vector $\Omega \in \mathcal{H} \otimes \mathcal{H}$, an orthonormal basis $\Phi_x \in \mathcal{H} \otimes \mathcal{H}$, $x = 1,\dots,d^2$, of maximally entangled vectors and an orthonormal family $U_x \in \mathcal{B}(\mathcal{H})$, $x = 1,\dots,d^2$, of unitary operators, we can construct a dense coding scheme as follows: $E_x = |\Phi_x\rangle\langle\Phi_x|$, $D_x(A) = U_x^* A U_x$ and $\sigma = |\Omega\rangle\langle\Omega|$. If $\Omega$, the $\Phi_x$ and the $U_x$ are related by Eq. (4.10), it is easy to see that we really get a dense coding scheme [167]. If $d = 2$ holds, we again have to take the Bell basis for the $\Phi_x$, $\Omega = \Phi_0$, and the identity and the Pauli matrices for the $U_x$. We recover in this case the standard example of dense coding proposed in [19], and we see that we can transfer two bits via one qubit, as stated above.

4.2. Estimating and copying

The impossibility of classical teleportation can be rephrased as follows: It is impossible to get complete information about the state $\rho$ of a quantum system by one measurement on one system. However, if we have many systems, say $N$, all prepared in the same state $\rho$, it should be possible to get (with a clever measuring strategy) as much information on $\rho$ as possible, provided $N$ is large enough. In this way, we can circumvent the impossibility of devices like classical teleportation or quantum copying at least in an approximate way.

4.2.1. Quantum state estimation

To discuss this idea in more detail, consider a number $N$ of $d$-level quantum systems, all of them prepared in the same (unknown) state $\rho \in \mathcal{B}^*(\mathcal{H})$. Our aim is to estimate the state $\rho$
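by measurements on the compound system $\rho^{\otimes N}$. This is described in terms of an observable $E^N : \mathcal{C}(X_N) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ with values in a finite subset$^{15}$ $X_N \subset \mathcal{S}(\mathcal{H})$ of the quantum state space $\mathcal{S}(\mathcal{H})$. According to Section 3.2.4 each such $E^N$ is given in terms of a tuple $E^N_\sigma$, $\sigma \in X_N$, by $E^N(f) = \sum_\sigma f(\sigma)E^N_\sigma$; hence, we get for the expectation value of an $E^N$ measurement on systems in the state $\rho^{\otimes N}$ the density matrix $\hat\rho_N \in \mathcal{S}(\mathcal{H})$ with matrix elements

$$\langle \phi, \hat\rho_N \psi \rangle = \sum_{\sigma \in X_N} \langle \phi, \sigma\psi \rangle\, \mathrm{tr}(E^N_\sigma \rho^{\otimes N}). \tag{4.16}$$

We will call the channel $E^N$ an estimator, and the criterion for a good estimator $E^N$ is that for any one-particle density operator $\rho$, the value $\sigma \in X_N$ measured on a state $\rho^{\otimes N}$ is likely to be close to $\rho$,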
$^{15}$ This is a severe restriction at this point and physically not very well motivated. There might be more general (i.e. continuous) observables taking their values in the whole state space $\mathcal{S}(\mathcal{H})$ which lead to much better estimates. However, we do not discuss this possibility in order to keep the mathematics more elementary.
i.e. that the probability

$$K_N(\omega) := \mathrm{tr}(E^N(\omega)\rho^{\otimes N}) \quad \text{with} \quad E^N(\omega) = \sum_{\sigma \in X_N \cap \omega} E^N_\sigma \tag{4.17}$$
is small if $\omega \subset \mathcal{S}(\mathcal{H})$ is the complement of a small ball around $\rho$. Of course, we will look at this problem for large $N$. So the task is to find a whole sequence of observables $E^N$, $N = 1,2,\dots$, making error probabilities like (4.17) go to zero as $N \to \infty$.

The most direct way to get a family $E^N$, $N \in \mathbb{N}$, of estimators with this property is to perform a sequence of measurements on each of the $N$ input systems separately. A finite set of observables which leads to a successful estimation strategy is usually called a "quorum" (cf. e.g. [107,162]). For example, for $d = 2$ we can perform alternating measurements of the three spin components. If $\rho = \frac{1}{2}(\mathbb{1} + \vec{x} \cdot \vec{\sigma})$ is the Bloch representation of $\rho$ (cf. Section 2.1.2), we see that the expectation values of these measurements are given by $\frac{1}{2}(1 + x_j)$. Hence we get an arbitrarily good estimate if $N$ is large enough. A similar procedure is possible for arbitrary $d$ if we consider the generalized Bloch representation for $\rho$ (see again Section 2.1.2). There are, however, more efficient strategies based on "entangled" measurements (i.e. the $E^N_\sigma$ cannot be decomposed into pure tensor products) on the whole input system $\rho^{\otimes N}$ (e.g. [156,99]). Somewhat in between are "adaptive schemes" [63], which consist of separate measurements in which the $j$th measurement depends on the results of the $(j-1)$th. We will reconsider this circle of questions in a more quantitative way in Section 7.

4.2.2. Approximate cloning

By virtue of the no-cloning theorem [173], it is impossible to produce $M$ perfect copies of a $d$-level quantum system if $N < M$ input systems in the common (unknown) state $\rho^{\otimes N}$ are given. More precisely, there is no channel $T_{MN} : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ such that $T^*_{MN}(\rho^{\otimes N}) = \rho^{\otimes M}$ holds for all $\rho \in \mathcal{S}(\mathcal{H})$.
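Using state estimation, however, it is easy to find a device $T_{MN}$ which produces at least approximate copies which become exact in the limit $N, M \to \infty$: If $\rho^{\otimes N}$ is given, we measure the observable $E^N$ and get the classical data $\sigma \in X_N \subset \mathcal{S}(\mathcal{H})$, which we use subsequently to prepare $M$ systems in the state $\sigma^{\otimes M}$. In other words, $T_{MN}$ has the form

$$\mathcal{B}^*(\mathcal{H}^{\otimes N}) \ni \tau \mapsto \sum_{\sigma \in X_N} \mathrm{tr}(E^N_\sigma \tau)\, \sigma^{\otimes M} \in \mathcal{B}^*(\mathcal{H}^{\otimes M}). \tag{4.18}$$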
We immediately see that the probability to get wrong copies coincides exactly with the error probability of the estimator given in Eq. (4.17). This shows first that we get exact copies in the limit $N \to \infty$ and second that the quality of the copies does not depend on the number $M$ of output systems, i.e. the asymptotic rate $\lim_{N,M \to \infty} M/N$ of output systems per input system can be arbitrarily large.

The fact that we get classical data at an intermediate step allows a further generalization of this scheme. Instead of just preparing $M$ systems in the state $\sigma$ detected by the estimator, we can first apply an arbitrary transformation $F : \mathcal{S}(\mathcal{H}) \to \mathcal{S}(\mathcal{H})$ to the density matrix $\sigma$ and prepare $F(\sigma)^{\otimes M}$ instead of $\sigma^{\otimes M}$. In this way, we get the channel (cf. Fig. 4.3)

$$\mathcal{B}^*(\mathcal{H}^{\otimes N}) \ni \tau \mapsto \sum_{\sigma \in X_N} \mathrm{tr}(E^N_\sigma \tau)\, F(\sigma)^{\otimes M} \in \mathcal{B}^*(\mathcal{H}^{\otimes M}), \tag{4.19}$$
Fig. 4.3. Approximating the impossible machine F by state estimation.
i.e. a physically realizable device which approximates the impossible machine $F$. The probability to get a bad approximation of the state $F(\rho)^{\otimes M}$ (if the input state was $\rho^{\otimes N}$) is again given by the error probability of the estimator, and we get a perfect realization of $F$ at arbitrary rate as $M, N \to \infty$.

There are in particular two interesting tasks which become possible this way: The first is the "universal not gate", which associates to each pure state of a qubit the unique pure state orthogonal to it [36]. This is a special example of an antiunitarily implemented symmetry operation and therefore not completely positive. The second example is the purification of states [46,100]. Here it is assumed that the input states were once pure but have later passed through a depolarizing channel $|\psi\rangle\langle\psi| \mapsto \vartheta|\psi\rangle\langle\psi| + (1 - \vartheta)\mathbb{1}/d$. If $\vartheta > 0$ this map is invertible, but its inverse does not describe an allowed quantum operation because it maps some density operators to operators with negative eigenvalues. Hence the reversal of noise is not possible with a one-shot operation but can be done with high accuracy if enough input systems are available. We return to this topic in Section 7.

4.3. Distillation of entanglement

Let us now return to entanglement. We have seen in Section 4.1 that maximally entangled states play a crucial role for processes like teleportation and dense coding. In practice, however, entanglement is a rather fragile property: If Alice produces a pair of particles in a maximally entangled state $|\Omega\rangle\langle\Omega| \in \mathcal{S}(\mathcal{H}_A \otimes \mathcal{H}_B)$ and distributes one of them over a great distance to Bob, both end up with a mixed state $\rho$ which contains much less entanglement than the original and which cannot be used any longer for teleportation. The latter can be seen quite easily if we try to apply the qubit teleportation scheme (Section 4.1.2) with a non-maximally entangled isotropic state (Eq. (3.15) with $\lambda > 0$) instead of $\Omega$.
Hence the question arises whether it is possible to recover $|\Omega\rangle\langle\Omega|$ from $\rho$ or, following the reasoning of the last section, at least a small number of (almost) maximally entangled states from a large number $N$ of copies of $\rho$. However, since the distance between Alice and Bob is large (and quantum communication therefore impossible), only LOCC operations (Section 3.2.6) are available for this task: Alice and Bob can only operate on their respective particles, drop some of them and communicate classically with one another. This excludes procedures like the purification scheme just sketched, because we would need “entangled” measurements to get an asymptotically exact estimate
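for the state $\rho$. Hence, we need a sequence of LOCC channels
\[ T_N : \mathcal{B}(\mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}) \to \mathcal{B}(\mathcal{H}_A^{\otimes N} \otimes \mathcal{H}_B^{\otimes N}) \tag{4.20} \]
such that
\[ \bigl\| T_N^*(\rho^{\otimes N}) - |\Omega_N\rangle\langle\Omega_N| \bigr\|_1 \to 0 \quad \text{for } N \to \infty \tag{4.21} \]
holds, with a sequence of maximally entangled vectors $\Omega_N \in \mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}$. Note that we have to use here the natural isomorphism $\mathcal{H}_A^{\otimes N} \otimes \mathcal{H}_B^{\otimes N} \cong (\mathcal{H}_A \otimes \mathcal{H}_B)^{\otimes N}$, i.e. we have to reshuffle $\rho^{\otimes N}$ such that the first $N$ tensor factors belong to Alice ($\mathcal{H}_A$) and the last $N$ to Bob ($\mathcal{H}_B$). If confusion can be avoided we will use this isomorphism in the following without further note. We will call a sequence of LOCC channels $T_N$ satisfying (4.21) with a state $\rho \in \mathcal{S}(\mathcal{H}_A \otimes \mathcal{H}_B)$ a distillation scheme for $\rho$, and $\rho$ is called distillable if it admits a distillation scheme. The asymptotic rate with which maximally entangled states can be distilled with a given protocol is
\[ \liminf_{N \to \infty} \frac{\log_2(d_N)}{N}. \tag{4.22} \]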
This quantity will become relevant in the framework of entanglement measures (Section 5).

4.3.1. Distillation of pairs of qubits

Concrete distillation protocols are in general rather complicated procedures. In the following we sketch how any pair of entangled qubits can be distilled. The first step is a scheme first proposed by Bennett et al. [12]. It can be applied if the maximally entangled fraction $F$ (Eq. (3.4)) is greater than $1/2$. As indicated above, we assume that Alice and Bob share a large number of pairs in the state $\rho$, so that the total state is $\rho^{\otimes N}$. To obtain a smaller number of pairs with a higher $F$ they proceed as follows:

1. First they take two pairs (let us call them pairs 1 and 2), i.e. $\rho \otimes \rho$, and apply to each of them the twirl operation $P_{U\bar U}$ associated with isotropic states (cf. Eq. (3.18)). This can be done by LOCC operations in the following way: Alice selects at random (respecting the Haar measure on $U(2)$) a unitary operator $U$, applies it to her qubits and tells Bob which transformation she has chosen; then he applies $\bar U$ to his particles. They end up with two isotropic states $\tilde\rho \otimes \tilde\rho$ with the same maximally entangled fraction as $\rho$.
2. Each party performs the unitary transformation
\[ U_{\mathrm{XOR}} : |a\rangle \otimes |b\rangle \mapsto |a\rangle \otimes |a + b \bmod 2\rangle \tag{4.23} \]
on his/her members of the pairs.
3. Finally, Alice and Bob perform local measurements in the basis $|0\rangle, |1\rangle$ on pair 1 and discard it afterwards. If the measurements agree, pair 2 is kept and has a higher $F$. Otherwise pair 2 is discarded as well.

If this procedure is repeated over and over again, it is possible to get states with an arbitrarily high $F$, but we have to sacrifice more and more pairs and the asymptotic rate is zero. To overcome this problem we can apply the scheme above until $F(\rho)$ is high enough that $1 + \operatorname{tr}(\rho \ln \rho) > 0$ holds, and then continue with another scheme called hashing [16], which leads to a non-vanishing rate.
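The effect of repeating steps 1 to 3 can be illustrated numerically. The sketch below iterates the recurrence formula for the maximally entangled fraction that is standard for this protocol in the literature following Bennett et al.; the formula itself is not stated in the text above and is taken here as an assumption, and the function names `recurrence_step` and `distill` are purely illustrative.

```python
from fractions import Fraction


def recurrence_step(F):
    """One round on two isotropic qubit pairs with maximally entangled fraction F.

    Returns (F_new, p_success): the fraction of the kept pair and the
    probability that the two local measurement results agree.
    Assumption: standard recurrence formula, not taken from the text.
    """
    q = (1 - F) / 3                      # weight of each of the other three Bell states
    p_success = F**2 + 2 * F * q + 5 * q**2
    F_new = (F**2 + q**2) / p_success
    return F_new, p_success


def distill(F, target, max_rounds=50):
    """Iterate the recurrence until F >= target; track pairs kept per input pair."""
    rounds, pairs_kept = 0, Fraction(1)
    while F < target and rounds < max_rounds:
        F, p = recurrence_step(F)
        pairs_kept *= p / 2              # two pairs in, at most one pair out
        rounds += 1
    return F, rounds, pairs_kept


if __name__ == "__main__":
    F, rounds, kept = distill(Fraction(3, 4), Fraction(99, 100))
    print(f"F = {float(F):.4f} after {rounds} rounds, yield {float(kept):.2e}")
```

Under this assumption $F = 1/2$ is a fixed point of the map, any $F > 1/2$ increases in each round, and the fraction of pairs that survive shrinks geometrically with the number of rounds, which is the reason for the vanishing asymptotic rate mentioned above.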
If finally $F(\rho) \le 1/2$ but $\rho$ is entangled, Alice and Bob can increase $F$ for some of their particles by […] is by assumption entangled) and second that we can write each vector $\psi \in \mathcal{H} \otimes \mathcal{H}$ as $(X_\psi \otimes \mathbb{1})\Phi_0$ with the Bell state $\Phi_0$ and an appropriately chosen operator $X_\psi$ (see Section 3.1.1). Now we can define $T$ in terms of the two operations $T_1, T_2$ (cf. Eq. (3.52)) with
\[ T_1(A) = X_\psi^* A X_\psi^{-1}, \qquad \mathrm{Id} - T_1 = T_2. \tag{4.24} \]
It is straightforward to check that we end up with
\[ \tilde\rho = \frac{(T_x \otimes \mathrm{Id})^*(\rho)}{\operatorname{tr}[(T_x \otimes \mathrm{Id})^*(\rho)]} \tag{4.25} \]
such that $F(\tilde\rho) > 1/2$ holds and we can continue with the scheme described in the previous paragraph.

4.3.2. Distillation of isotropic states

Consider now an entangled isotropic state $\rho$ in $d$ dimensions, i.e. we have $\mathcal{H} = \mathbb{C}^d$ and $0 \le \operatorname{tr}(\tilde F \rho) \le 1$ (with the operator $\tilde F$ of Section 3.1.3). Each such state is distillable via the following scheme [27,85]. First, Alice and Bob apply a filter operation $T : \mathcal{C}(X) \otimes \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ on their respective particles, given by $T_1(A) = PAP$, $T_2 = 1 - T_1$, where $P$ is the projection onto a two-dimensional subspace. If both measure the value 1 they get a qubit pair in the state $\tilde\rho = (T_1 \otimes T_1)(\rho)$. Otherwise they discard their particles (this requires classical communication). The state $\tilde\rho$ is entangled (this can be easily checked), hence they can proceed as in the previous subsection.

The scheme just proposed can be used to show that each state $\rho$ which violates the reduction criterion (cf. Section 2.4.3) can be distilled [85]. The basic idea is to project with the twirl $P_{U\bar U}$ (which is LOCC as we have seen above; cf. Section 4.3.1) to an isotropic state $P_{U\bar U}(\rho)$ and to apply the procedure from the last paragraph afterwards. We only have to guarantee that $P_{U\bar U}(\rho)$ is entangled. To this end we use a vector $\psi \in \mathcal{H} \otimes \mathcal{H}$ with $\langle\psi, (\mathbb{1} \otimes \operatorname{tr}_1(\rho) - \rho)\psi\rangle < 0$ (which exists by assumption since $\rho$ violates the reduction criterion) and apply the filter operation given by $\psi$ via Eq. (4.24).

4.3.3. Bound entangled states

It is obvious that separable states are not distillable, because an LOCC operation maps separable states to separable states. However, is each entangled state distillable? The answer, maybe somewhat surprisingly, is no, and an entangled state which is not distillable is called bound entangled [87] (distillable states are sometimes called free entangled, in analogy to thermodynamics).
Examples of bound entangled states are all ppt entangled states [87]: This is an easy consequence of the fact that each separable channel (and therefore each LOCC channel as well) maps ppt states to ppt states (this is easy to check), but a maximally entangled state is never ppt. It is not yet known, whether
bound entangled npt states exist; however, there are at least some partial results: (1) It is sufficient to solve this question for Werner states, i.e. if we can show that each npt Werner state is distillable it follows that all npt states are distillable [85]. (2) Each npt Gaussian state is distillable [64]. (3) For each $N \in \mathbb{N}$ there is an npt Werner state $\rho$ which is not “$N$-copy distillable”, i.e. $\langle\psi, (\rho^{T})^{\otimes N}\psi\rangle \ge 0$ holds for each pure state $\psi$ with exactly two Schmidt summands, where $\rho^{T}$ denotes the partial transpose of $\rho$ [55,58]. This gives some evidence for the existence of bound entangled npt states, because $\rho$ is distillable iff it is $N$-copy distillable for some $N$ [87,55,58].

Since bound entangled states cannot be distilled, they cannot be used for teleportation. Nevertheless bound entanglement can produce a non-classical effect, called “activation of bound entanglement” [92]. To explain the basic idea, assume that Alice and Bob share one pair of particles in a distillable state $\rho_f$ and many particles in a bound entangled state $\rho_b$. Assume in addition that $\rho_f$ cannot be used for teleportation or, in other words, that if $\rho_f$ is used for teleportation the particle Bob receives is in a state which differs from the state Alice has sent. This problem cannot be solved by distillation, since Alice and Bob share only one pair of particles in the state $\rho_f$. Nevertheless, they can try to apply an appropriate filter operation to $\rho_f$ to get, with a certain probability, a new state which leads to a better quality of the teleportation (or, if the filtering fails, to get nothing at all). It can be shown, however [88], that there are states $\rho_f$ such that the error occurring in this process (e.g. measured by the trace norm distance between the state Alice sends and the state Bob receives) is always above a certain threshold. This is the point where the bound entangled states $\rho_b$ come into play: if Alice and Bob operate with an appropriate protocol on $\rho_f$ and many copies of $\rho_b$, this distance can be made arbitrarily small (although the probability of success goes to zero).
Another example of activation of bound entanglement is related to the distillability of npt states: if Alice and Bob share a certain ppt-entangled state as an additional resource, each npt state $\rho$ becomes distillable (even if $\rho$ is bound entangled) [60,104]. For a more detailed survey of the role of bound entanglement and further references see [91].

4.4. Quantum error correction

If we try to distribute quantum information over large distances or store it for a long time in some sort of “quantum memory” we always have to deal with “decoherence effects”, i.e. unavoidable interactions with the environment. This results in a significant information loss, which is particularly bad for the functioning of a quantum computer. Similar problems arise in a classical computer as well, but the methods used there to circumvent them cannot be transferred to the quantum regime. For example, the simplest strategy to protect classical information against noise is redundancy: instead of storing the information once we make three copies and decide during readout by a majority vote which bit to take. It is easy to see that this reduces the probability of an error from order $\epsilon$ to order $\epsilon^2$. Quantum mechanically, however, such a procedure is forbidden by the no-cloning theorem.

Nevertheless, quantum error correction is possible, although we have to do it in a more subtle way than just copying; this was first observed independently in [39,146]. Let us consider first the general scheme and assume that $T : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{K})$ is a noisy quantum channel. To send quantum systems of type $\mathcal{B}(\mathcal{H})$ undisturbed through $T$ we need an encoding channel $E : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{H})$ and a decoding channel $D : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ such that $ETD = \mathrm{Id}$ holds, respectively $D^* T^* E^* = \mathrm{Id}$ in the Schrödinger picture; cf. Fig. 4.4.
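The classical redundancy argument above can be checked with a few lines of code. The following sketch assumes that each stored copy is flipped independently with probability `eps` (an assumption about the noise model, which the text does not specify) and computes the exact probability that the majority vote returns the wrong bit; the name `majority_error` is illustrative.

```python
from itertools import product


def majority_error(eps, copies=3):
    """Probability that a majority vote over `copies` noisy copies is wrong."""
    wrong = 0.0
    for flips in product((0, 1), repeat=copies):
        k = sum(flips)                       # number of flipped copies
        if 2 * k > copies:                   # a majority of the copies is flipped
            wrong += eps**k * (1 - eps) ** (copies - k)
    return wrong


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        # For three copies the exact value is 3*eps^2 - 2*eps^3.
        print(eps, majority_error(eps), 3 * eps**2 - 2 * eps**3)
```

For three copies the result is $3\epsilon^2 - 2\epsilon^3$, i.e. of order $\epsilon^2$ as stated.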
Fig. 4.4. Five-bit quantum code: encoding one qubit into five and correcting one error.
A powerful error correction scheme should not be restricted to one particular type of error, i.e. one particular noisy channel $T$. Assume instead that $\mathcal{E} \subset \mathcal{B}(\mathcal{K})$ is a linear subspace of “error operators” and $T$ is any channel given by
\[ T^*(\rho) = \sum_j F_j \rho F_j^*, \qquad F_j \in \mathcal{E}. \tag{4.26} \]
An isometry $V : \mathcal{H} \to \mathcal{K}$ is called an error correcting code for $\mathcal{E}$ if for each $T$ of form (4.26) there is a decoding channel $D : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{K})$ with $D^*(T^*(V\rho V^*)) = \rho$ for all $\rho \in \mathcal{S}(\mathcal{H})$. By the theory of Knill and Laflamme [103] this is equivalent to the factorization condition
\[ \langle V\psi, F_j^* F_k V\phi\rangle = \omega(F_j^* F_k)\langle\psi, \phi\rangle, \tag{4.27} \]
where $\omega(F_j^* F_k)$ is a factor which does not depend on the arbitrary vectors $\psi, \phi \in \mathcal{H}$.

The most relevant examples of error correcting codes are those which generalize, in a certain sense, the classical idea of sending multiple copies. This means we encode a small number $N$ of $d$-level systems into a big number $M \gg N$ of systems of the same type, which are then transmitted and decoded back into $N$ systems afterwards. During the transmission $K < M$ arbitrary errors are allowed. Hence, we have $\mathcal{H} = \mathcal{H}_1^{\otimes N}$, $\mathcal{K} = \mathcal{H}_1^{\otimes M}$ with $\mathcal{H}_1 = \mathbb{C}^d$, and $T$ is an arbitrary tensor product of $K$ noisy channels $S_j$, $j = 1, \ldots, K$, and $M - K$ ideal channels $\mathrm{Id}$. The best-known code for this type of error is the “five-bit code”, where one qubit is encoded into five and one error is corrected [16] (cf. Fig. 4.4 for $N = 1$, $M = 5$ and $K = 1$). To define the corresponding error space $\mathcal{E}$ consider the finite sets $X = \{1, \ldots, N\}$ and $Y = \{1 + N, \ldots, M + N\}$ and define first for each subset $Z \subset Y$:
\[ \mathcal{E}(Z) = \operatorname{span}\{A_1 \otimes \cdots \otimes A_M \in \mathcal{B}(\mathcal{K}) \mid A_j \in \mathcal{B}(\mathcal{H}_1) \text{ arbitrary for } j + N \in Z,\ A_j = \mathbb{1} \text{ otherwise}\}. \tag{4.28} \]
$\mathcal{E}$ is now the span of all $\mathcal{E}(Z)$ with $|Z| \le K$ (i.e. the length of $Z$ is less than or equal to $K$). We say that an error correcting code for this particular $\mathcal{E}$ corrects $K$ errors.

There are several ways to construct error correcting codes (see e.g. [70,38,4]). Most of these methods are somewhat involved, however, and require knowledge from classical error correction which we want to skip. Therefore, we will only present the scheme proposed in [137], which is quite easy to describe and admits a simple way to check the error correction condition. Let us first sketch the general scheme. We start with an undirected graph $\Gamma$ with two kinds of vertices: a set of input vertices, labeled by $X$, and a set of output vertices, labeled by $Y$. The links of the graph are given by the adjacency matrix, i.e. an $(N + M) \times (N + M)$ matrix $\Gamma$ with $\Gamma_{jk} = 1$ if nodes $k$ and $j$ are
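linked and $\Gamma_{jk} = 0$ otherwise. With respect to $\Gamma$ we can now define an isometry $V_\Gamma : \mathcal{H}_1^{\otimes N} \to \mathcal{H}_1^{\otimes M}$ by
\[ \langle j_{N+1} \ldots j_{N+M} | V_\Gamma | j_1 \ldots j_N \rangle = \exp\Bigl(\frac{i\pi}{d}\, \vec{j} \cdot \Gamma \vec{j}\Bigr) \tag{4.29} \]
with $\vec{j} = (j_1, \ldots, j_{N+M}) \in \mathbb{Z}_d^{N+M}$ (where $\mathbb{Z}_d$ denotes the cyclic group with $d$ elements).

Fig. 4.5. Two graphs belonging to (equivalent) five bit codes. The input node can be chosen in both cases arbitrarily.

Fig. 4.6. Symbols and definition for the three elementary gates AND, OR and NOT.

There is an easy condition under which $V_\Gamma$ is an error correcting code. To write it down we need the following additional terminology: We say that an error correcting code $V : \mathcal{H}_1^{\otimes N} \to \mathcal{H}_1^{\otimes M}$ detects the error configuration $Z \subset Y$ if
\[ \langle V\psi, F V\phi\rangle = \omega(F)\langle\psi, \phi\rangle \qquad \forall F \in \mathcal{E}(Z) \tag{4.30} \]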
holds. With Eq. (4.27) it is easy to see that $V$ corrects $K$ errors iff it detects all error configurations of length $2K$ or less. Now we have the following theorem:

Theorem 4.1. The quantum code $V_\Gamma$ de[…] implies that
\[ g_l = 0,\ l \in X \quad \text{and} \quad \sum_{l \in Z} \Gamma_{kl}\, g_l = 0,\ k \in X \tag{4.32} \]
holds.

We omit the proof; see [137] instead. Two particular examples (which are equivalent!) are given in Fig. 4.5. In both cases we have $N = 1$, $M = 5$ and $K = 1$, i.e. one input node, which can be chosen arbitrarily, five output nodes, and the corresponding codes correct one error. For a more detailed survey on quantum error correction, in particular for more examples, we refer to [20].
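The factorization condition (4.27) is equivalent to the operator identity $V^* F_j^* F_k V = \omega(F_j^* F_k)\mathbb{1}$, which is easy to test numerically for a small code. The sketch below does this for the three-qubit repetition code $|0\rangle \mapsto |000\rangle$, $|1\rangle \mapsto |111\rangle$, which is neither the five-bit code nor a graph code from [137] and is chosen here only because it is small; all helper names are illustrative.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]       # bit flip
Z = [[1, 0], [0, -1]]      # phase flip


def kron(A, B):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def dagger(A):
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def on_site(op, site, n=3):
    """Operator acting as `op` on qubit `site` and as the identity elsewhere."""
    result = [[1]]
    for k in range(n):
        result = kron(result, op if k == site else I2)
    return result


# Isometry V: |0> -> |000> (index 0), |1> -> |111> (index 7); an 8 x 2 matrix.
V = [[0, 0] for _ in range(8)]
V[0][0] = 1
V[7][1] = 1


def satisfies_factorization(errors, tol=1e-12):
    """Check that V* Fj* Fk V is a multiple of the identity for all error pairs."""
    Vd = dagger(V)
    for Fj in errors:
        for Fk in errors:
            M = matmul(Vd, matmul(dagger(Fj), matmul(Fk, V)))
            off_diagonal = abs(M[0][1]) > tol or abs(M[1][0]) > tol
            unequal_diagonal = abs(M[0][0] - M[1][1]) > tol
            if off_diagonal or unequal_diagonal:
                return False
    return True


identity = on_site(I2, 0)
bit_flips = [identity] + [on_site(X, k) for k in range(3)]
phase_flips = [identity] + [on_site(Z, k) for k in range(3)]

if __name__ == "__main__":
    print("single bit flips correctable:  ", satisfies_factorization(bit_flips))
    print("single phase flips correctable:", satisfies_factorization(phase_flips))
```

The repetition code satisfies the condition for the error space spanned by single bit flips but violates it for phase flips, which illustrates why genuinely quantum codes such as the five-bit code need more than classical redundancy.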
Fig. 4.7. Half-adder circuit as an example for a Boolean network (inputs x and y; outputs c and x + y mod 2).
4.5. Quantum computing

Quantum computing is without doubt the most prominent and most far-reaching application of quantum information theory, since it promises, on the one hand, “exponential speedup” for some problems which are “hard to solve” with a classical computer and gives, on the other, completely new insights into classical computing and complexity theory. Unfortunately, an exhaustive discussion would require its own review article. Hence, we are only able to give a short overview (see Part II of [122] for a more complete presentation and for further references).

4.5.1. The network model of classical computing

Let us start with a brief (and very informal) introduction to classical computing (for a more complete review and hints for further reading see [122, Chapter 3]). What we need first is a mathematical model for computation. There are, in fact, several different choices, and the Turing machine [152] is the most prominent one. More appropriate for our purposes is, however, the so-called network model, since it allows an easier generalization to the quantum case. The basic idea is to interpret a classical (deterministic) computation as the evaluation of a map $f : \mathbb{B}^N \to \mathbb{B}^M$ (where $\mathbb{B} = \{0, 1\}$ denotes the field with two elements) which maps $N$ input bits to $M$ output bits. If $M = 1$ holds, $f$ is called a Boolean function, and for many purposes it is sufficient to consider this special case, since each general $f$ is in fact a Cartesian product of Boolean functions. Particular examples are the three elementary gates AND, OR and NOT defined in Fig. 4.6 and arbitrary algebraic expressions constructed from them, e.g. the XOR gate $(x, y) \mapsto x + y \bmod 2$, which can be written as $(x \vee y) \wedge \neg(x \wedge y)$. It is a standard result of Boolean algebra that each Boolean function can be represented in this way, and there are in general many possibilities to do this. A special case is the disjunctive normal form of $f$; cf. [161].
Writing such an expression down in the form of equations is, however, somewhat confusing. $f$ is therefore expressed most conveniently in graphical form as a circuit or network, i.e. a graph $C$ with nodes representing elementary gates and edges (“wires”) which determine how the gates should be composed; cf. Fig. 4.7 for an example. A classical computation can now be defined as a circuit applied to a specified string of input bits. Variants of this model arise if we replace AND, OR and NOT by another (finite) set $G$ of elementary gates. We only have to guarantee that each function $f$ can be expressed as a composition of elements from $G$. A typical example for $G$ is the set which contains only the NAND gate $(x, y) \mapsto x \uparrow y = \neg(x \wedge y)$. Since AND, OR and NOT can be rewritten in terms of NAND (e.g. $\neg x = x \uparrow x$) we can calculate each Boolean function by a circuit of NAND gates.
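The statements about NAND and XOR can be verified directly. The sketch below builds NOT, AND and OR from NAND alone, composes XOR as $(x \vee y) \wedge \neg(x \wedge y)$ and assembles the half-adder of Fig. 4.7; the function names are illustrative, and bits are represented as the Python integers 0 and 1.

```python
def NAND(x, y):
    return 1 - (x & y)


def NOT(x):
    return NAND(x, x)                    # not x = x NAND x


def AND(x, y):
    return NOT(NAND(x, y))


def OR(x, y):
    return NAND(NOT(x), NOT(y))          # De Morgan


def XOR(x, y):
    return AND(OR(x, y), NOT(AND(x, y)))  # (x or y) and not (x and y)


def half_adder(x, y):
    """Return (sum bit, carry bit) of two input bits."""
    return XOR(x, y), AND(x, y)


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, half_adder(x, y))
```

Every gate in this sketch is ultimately a composition of NAND gates, so the half-adder is an example of a Boolean network over the one-element gate set $G = \{\text{NAND}\}$.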
4.5.2. Computational complexity

One of the most relevant questions within classical computing, and the central subject of computational complexity, is whether a given problem is easy to solve or not, where “easy” is defined in terms of the scaling behavior of the resources needed as a function of the size of the input data. In the following we give a rough survey of the most basic aspects of this field and refer the reader to [124] for a detailed presentation. To start with, let us specify the basic question in greater detail. First of all, the problems we want to analyze are decision problems, which only give the two possible values “yes” and “no”. They are mathematically described by Boolean functions acting on bit strings of arbitrary size. A well-known example is the factoring problem given by the function fac with $\mathrm{fac}(m, l) = 1$ if $m$ (more precisely the natural number represented by $m$) has a divisor less than $l$ and $\mathrm{fac}(m, l) = 0$ otherwise. Note that many tasks of classical computation can be reformulated in this way, so that we do not get a severe loss of generality. The second crucial point we have to clarify is what exactly the resources mentioned above are and how we have to quantify them. A natural physical quantity which comes to mind immediately is the time needed to perform the computation (space is another candidate, which we do not discuss here, however). Hence, the question we have to discuss is how the computation time $t$ depends on the size $L$ of the input data $x$ (i.e. the length $L$ of the smallest register needed to represent $x$ as a bit string). However, a precise definition of “computation time” is still model dependent. For a Turing machine we can simply take the number of head movements needed to solve the problem, and in the network model we choose the number of steps needed to execute the whole circuit, if gates which operate on different bits are allowed to work simultaneously.
[Footnote 16: Note that we have glossed over a lot of technical problems at this point. The crucial difficulty is that each circuit $C_N$ allows only the computation of a Boolean function $f_N : \mathbb{B}^N \to \mathbb{B}$ which acts on input data of length $N$. Since we are interested in answers for inputs of arbitrary finite length, a sequence $C_N$, $N \in \mathbb{N}$, of circuits with appropriate uniformity properties is needed; cf. [124] for details.]

Even with a fixed type of model the functional behavior of $t$ depends on the set of elementary operations we choose, e.g. the set of elementary gates in the network model. It is therefore useful to divide computational problems into complexity classes whose definitions do not suffer from model-dependent aspects. The most fundamental one is the class P, which contains all problems which can be computed in “polynomial time”, i.e. $t$ is, as a function of $L$, bounded from above by a polynomial. The model independence of this class is basically the content of the strong Church-Turing hypothesis, which states, roughly speaking, that each model of computation can be simulated in polynomial time on a probabilistic Turing machine. Problems of class P are considered “easy”, everything else is “hard”. However, even if a (decision) problem is hard the situation is not hopeless. Consider, for example, the factoring problem fac described above. It is generally believed (although not proved) that this problem is not in class P. But if somebody gives us a divisor $p < l$ of $m$ it is easy to check whether $p$ is really a factor, and if the answer is true we have computed $\mathrm{fac}(m, l)$. This example motivates the following definition: A decision problem $f$ is in class NP (“non-deterministic polynomial time”) if there is a Boolean function $f'$ in class P such that $f'(x, y) = 1$ for some $y$ implies $f(x) = 1$. In our example $\mathrm{fac}'$ is obviously defined by $\mathrm{fac}'(m, l, p) = 1 \Leftrightarrow p < l$ and $p$ is a divisor of $m$. It is obvious that P is a subset of NP; the other inclusion, however, is rather non-trivial. The conjecture is that $\mathrm{P} \neq \mathrm{NP}$ holds, and great parts of
complexity theory are based on it. Its proof (or disproof), however, represents one of the biggest open questions of theoretical informatics.

To introduce a third complexity class we have to generalize our point of view slightly. Instead of a function $f : \mathbb{B}^N \to \mathbb{B}^M$ we can look at a noisy classical channel $T$ which sends the input value $x \in \mathbb{B}^N$ to a probability distribution $T_{xy}$, $y \in \mathbb{B}^M$, on $\mathbb{B}^M$ (i.e. $T_{xy}$ is the transition matrix of the classical channel $T$; cf. Section 3.2.3). Roughly speaking, we can interpret such a channel as a probabilistic computation which can be realized as a circuit consisting of “probabilistic gates”. This means there are several different ways to proceed at each step, and we use a classical random number generator to decide which of them to choose. If we run our device several times on the same input data $x$ we get different results $y$ with probability $T_{xy}$. The crucial point is now that we can allow some of the outcomes to be wrong as long as there is an easy way (i.e. a class P algorithm) to check the validity of the results. Hence, we define BPP (“bounded error probabilistic polynomial time”) as the class of all decision problems which admit a polynomial time probabilistic algorithm with error probability less than $1/2 - \epsilon$ (for fixed $\epsilon$). It is obvious that $\mathrm{P} \subset \mathrm{BPP}$ holds, but the relation between BPP and NP is not known.

4.5.3. Reversible computing

In the last subsection we have discussed the time needed to perform a certain computation. Other physical quantities which seem to be important are space and energy. Space can be treated in a similar way as time, and there are in fact space-related complexity classes (e.g. PSPACE, which stands for “polynomial space”). Energy, however, is different, because it turns out, surprisingly, that it is possible to do any calculation without expending any energy! One source of energy consumption in a usual computer is the intrinsic irreversibility of the basic operations. For example, a basic gate like AND maps two input bits to one output bit, which obviously implies that the input cannot be reconstructed from the output. In other words, one bit of information is erased during the operation of the AND gate; hence a small amount of energy is dissipated to the environment. A thermodynamic analysis, known as Landauer’s principle, shows that this energy loss is at least $k_B T \ln 2$, where $T$ is the temperature of the environment [106].

If we want to avoid this kind of energy dissipation we are restricted to reversible processes, i.e. it should be possible to reconstruct the input data from the output data. This is called reversible computation, and it is performed in terms of reversible gates, which in turn can be described by invertible functions $f : \mathbb{B}^N \to \mathbb{B}^N$. This does not restrict the class of problems which can be solved, however: we can repackage a non-invertible function $f : \mathbb{B}^N \to \mathbb{B}^M$ into an invertible one $f' : \mathbb{B}^{N+M} \to \mathbb{B}^{N+M}$ simply by $f'(x, 0) = (x, f(x))$ and an appropriate extension to the rest of $\mathbb{B}^{N+M}$. It can even be shown that a reversible computer performs as well as a usual one, i.e. an “irreversible” network can be simulated in polynomial time by a reversible one. This will be of particular importance for quantum computing, because a reversible computer is, as we will see soon, a special case of a quantum computer.

4.5.4. The network model of a quantum computer

Now we are ready to introduce a mathematical model for quantum computation. To this end we will generalize the network model discussed in Section 4.5.1 to the network model of quantum computation.
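The repackaging of a non-invertible function into an invertible one, described in Section 4.5.3, can be made concrete as follows. The text only fixes $f'(x, 0) = (x, f(x))$; the sketch assumes the common extension $f'(x, y) = (x, y + f(x) \bmod 2)$ to the rest of $\mathbb{B}^{N+M}$, which is one possible choice and not necessarily the one the author has in mind. The names `make_reversible`, `AND` and `toffoli` are illustrative.

```python
from itertools import product


def make_reversible(f, n, m):
    """Extend f: B^n -> B^m to a bijection on B^(n+m) via (x, y) -> (x, y XOR f(x))."""
    def f_rev(bits):
        x, y = tuple(bits[:n]), tuple(bits[n:])
        return x + tuple(yi ^ fi for yi, fi in zip(y, f(x)))
    return f_rev


def AND(x):
    """The irreversible AND gate as a map B^2 -> B^1 on bit tuples."""
    return (x[0] & x[1],)


# The reversible version of AND acts on three bits (a Toffoli-type gate).
toffoli = make_reversible(AND, n=2, m=1)

if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        print(bits, "->", toffoli(bits))
```

With this extension $f'$ is its own inverse, so no information is erased, and setting the last $M$ bits to zero recovers $f(x)$ in the output register.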
Fig. 4.8. Universal sets of quantum gates.
Fig. 4.9. Quantum circuit for the discrete Fourier transform on a 4-qubit register.
A classical computer operates by a network of gates on a finite number of classical bits. A quantum computer operates on a finite number of qubits in terms of a network of quantum gates; this is the rough idea. To be more precise, consider the Hilbert space $\mathcal{H}^{\otimes N}$ with $\mathcal{H} = \mathbb{C}^2$, which describes a quantum register consisting of $N$ qubits. In $\mathcal{H}$ there is a preferred set $|0\rangle, |1\rangle$ of orthogonal states, describing the two values a classical bit can have. Hence, we can describe each possible value $x$ of a classical register of length $N$ in terms of the computational basis $|x\rangle = |x_1\rangle \otimes \cdots \otimes |x_N\rangle$, $x \in \mathbb{B}^N$. A quantum gate is now nothing else but a unitary operator acting on a small number of qubits (preferably 1 or 2), and a quantum network is a graph representing the composition of elementary gates taken from a small set $G$ of unitaries. A quantum computation can now be defined as the application of such a network to an input state $\psi$ of the quantum register (cf. Fig. 4.9 for an example). Similar to the classical case the set $G$ should be universal, i.e. each unitary operator on a quantum register of arbitrary length can be represented as a composition of elements from $G$. Since the group of unitaries on a Hilbert space is continuous, it is not possible to do this with a finite set $G$. However, we can find at least suitably small sets which have a chance of being technically realizable (e.g. in an ion trap) at some point in the future. Particular examples are the controlled $U$ operations on the one hand and the set consisting of CNOT and all one-qubit gates on the other (cf. Fig. 4.8; for a proof of universality see Section 4.5 of [122]).

In principle, we could have considered arbitrary quantum operations as gates instead of only unitaries. However, in Section 3.2.1 we have seen that we can implement each operation unitarily if we add an ancilla to the systems. Hence, this kind of generalization is already covered by the model. (This holds as long as non-unitarily implemented operations are a desired feature. Decoherence effects due to unavoidable interaction with the environment are a completely different story; we come back to this point at the end of the subsection.) The same holds for measurements at intermediate steps and subsequent conditioned operations. In this case we get basically the same result with a different
network where all measurements are postponed to the end. (Often it is, however, very useful to allow measurements at intermediate steps, as we will see in the next subsection.)

Having a mathematical model of quantum computers in mind we are now ready to discuss how it would work in principle.

1. The first step is in most cases preprocessing of the input data on a classical computer. For example, the Shor algorithm for the factoring problem does not work if the input number $m$ is a pure prime power. However, in this case there is an efficient classical algorithm. Hence, we have to check first whether $m$ is of this particular form and use this classical algorithm where appropriate.
2. In the next step we have to prepare the quantum register based on these preprocessed data. In the simplest case this means writing classical data, i.e. preparing the state $|x\rangle \in \mathcal{H}^{\otimes N}$ if the (classical) input is $x \in \mathbb{B}^N$. In many cases, however, it might be more intelligent to use a superposition of several $|x\rangle$, e.g. the state
\[ \Psi = \frac{1}{\sqrt{2^N}} \sum_{x \in \mathbb{B}^N} |x\rangle, \tag{4.33} \]
which represents the superposition of all numbers the register can represent. This is indeed the crucial point of quantum computing and we come back to it below.
3. Now we can apply the quantum circuit $C$ to the input state $\psi$, and after the calculation we get the output state $U\psi$, where $U$ is the unitary represented by $C$.
4. To read out the data after the calculation we perform a von Neumann measurement in the computational basis, i.e. we measure the observable given by the one-dimensional projectors $|x\rangle\langle x|$, $x \in \mathbb{B}^N$. Hence, we get $x \in \mathbb{B}^N$ with probability $p_x = |\langle U\psi, x\rangle|^2$.
5. Finally, we have to postprocess the measured value $x$ on a classical computer to end up with the final result $x'$. If, however, the output state $U\Psi$ is a proper superposition of basis vectors $|x\rangle$ (and not just one $|x\rangle$), the probability $p_x$ to get this particular $x$ is less than 1. In other words, we have performed a probabilistic calculation as described in the last paragraph of Section 4.5.2. Hence, we have to check the validity of the results (with a class P algorithm on a classical computer), and if they are wrong we have to go back to step 2.

So, why is quantum computing potentially useful? First of all, a quantum computer can perform at least as well as a classical computer. This follows immediately from our discussion of reversible computing in Section 4.5.3 and the fact that any invertible function $f : \mathbb{B}^N \to \mathbb{B}^N$ defines a unitary by $U_f : |x\rangle \mapsto |f(x)\rangle$ (the quantum CNOT gate in Fig. 4.8 arises exactly in this way from the classical CNOT). On the other hand, there is strong evidence that a quantum computer can solve problems in polynomial time which a classical computer cannot. The most striking example is the Shor algorithm, which provides a way to solve the factoring problem (which is most probably not in class P) in polynomial time. If we introduce the new complexity class BQP of decision problems which can be solved with high probability and in polynomial time with a quantum computer, we can express this conjecture as $\mathrm{BPP} \neq \mathrm{BQP}$.

The mechanism which gives a quantum computer its potential power is the ability to operate not just on one value $x \in \mathbb{B}^N$, but on whole superpositions of values, as already mentioned in step 2 above. Consider, for example, a not necessarily invertible map $f : \mathbb{B}^N \to \mathbb{B}^M$ and the unitary operator $U_f$ with
\[ \mathcal{H}^{\otimes N} \otimes \mathcal{H}^{\otimes M} \ni |x\rangle \otimes |0\rangle \mapsto U_f\, |x\rangle \otimes |0\rangle = |x\rangle \otimes |f(x)\rangle \in \mathcal{H}^{\otimes N} \otimes \mathcal{H}^{\otimes M}. \tag{4.34} \]
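If we let $U_f$ act on a register in the state $\Psi \otimes |0\rangle$ with $\Psi$ from Eq. (4.33) we get the result
\[ U_f(\Psi \otimes |0\rangle) = \frac{1}{\sqrt{2^N}} \sum_{x \in \mathbb{B}^N} |x\rangle \otimes |f(x)\rangle. \tag{4.35} \]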
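Equation (4.35) can be reproduced with a small state-vector simulation. The sketch below stores amplitudes in a plain list indexed by the integer $x \cdot 2^M + y$ and assumes the extension $U_f : |x\rangle \otimes |y\rangle \mapsto |x\rangle \otimes |y + f(x) \bmod 2\rangle$ (bitwise addition), since the text specifies $U_f$ only on vectors of the form $|x\rangle \otimes |0\rangle$. The example function `f` and all other names are hypothetical.

```python
import math


def uniform_superposition_with_zero(n, m):
    """Amplitudes of Psi (x) |0> for an n-qubit and an m-qubit register."""
    state = [0.0] * (2 ** (n + m))
    amplitude = 1 / math.sqrt(2 ** n)
    for x in range(2 ** n):
        state[x << m] = amplitude        # basis index of |x>|0>
    return state


def apply_U_f(state, f, n, m):
    """Apply |x>|y> -> |x>|y XOR f(x)>, a permutation of the basis vectors."""
    out = [0.0] * len(state)
    mask = 2 ** m - 1
    for index, amplitude in enumerate(state):
        x, y = index >> m, index & mask
        out[(x << m) | (y ^ f(x))] += amplitude
    return out


def f(x):
    """Hypothetical example map B^2 -> B^2 (not invertible)."""
    return (x * x) % 4


if __name__ == "__main__":
    n = m = 2
    state = apply_U_f(uniform_superposition_with_zero(n, m), f, n, m)
    for index, amplitude in enumerate(state):
        if abs(amplitude) > 1e-12:
            print(f"x={index >> m}, f(x)={index & (2**m - 1)}, amplitude={amplitude:.3f}")
```

A single application of `apply_U_f` produces non-zero amplitudes exactly on the pairs $(x, f(x))$, each with weight $2^{-N/2}$; a measurement in the computational basis would nevertheless return only one such pair.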
Hence, a quantum computer can evaluate the function $f$ on all possible arguments $x \in B^N$ at the same time! To benefit from this feature, usually called quantum parallelism, is, however, not as easy as it looks. If we perform a measurement on $U_f(\Psi \otimes |0\rangle)$ in the computational basis we get the value of $f$ for exactly one argument, and the rest of the information originally contained in $U_f(\Psi \otimes |0\rangle)$ is destroyed. In other words, it is not possible to read out all pairs $(x, f(x))$ from $U_f(\Psi \otimes |0\rangle)$ and to fill a (classical) lookup table with them. To take advantage of quantum parallelism we have to use a clever algorithm within the quantum computation step (step 3 above). In the next section we will consider a particular example for this.

Before we come to this point, let us give some additional comments which link this section to other parts of quantum information. The first point concerns entanglement. The state $U_f(\Psi \otimes |0\rangle)$ is highly entangled (although $\Psi$ is separable since $\Psi = [2^{-1/2}(|0\rangle + |1\rangle)]^{\otimes N}$), and this fact is essential for the "exponential speedup" of computations we could gain in a quantum computer. In other words, to outperform a classical computer, entanglement is the most crucial resource; this will become more transparent in the next section. The second remark concerns error correction. Up to now we have implicitly assumed that all components of a quantum computer work perfectly without any error. In reality, however, decoherence effects make it impossible to realize unitarily implemented operations, and we have to deal with noisy channels. Fortunately, it is possible within quantum information to correct at least a certain amount of errors, as we have seen in Section 4.4. Hence, unlike an analog computer (see footnote 17), a quantum computer can be designed fault tolerant, i.e. it can work with imperfectly manufactured components.

4.5.5. Simon's problem

We will consider now a particular problem (known as Simon's problem; cf. [143]) which shows explicitly how a quantum computer can speed up a problem which is hard to solve with a classical computer. It does not fit, however, exactly into the general scheme sketched in the last subsection, because a quantum "oracle" is involved, i.e. a black box which performs an (a priori unknown) unitary transformation on an input state given to it. The term "oracle" indicates here that we are not interested in the time the black box needs to perform the calculation but only in the number of times we have to access it. Hence, this example does not prove the conjecture BPP $\neq$ BQP stated above. Other quantum algorithms which we do not have the room here to discuss include: the Deutsch [52] and Deutsch–Jozsa problem [53], the Grover search algorithm [74,75] and of course Shor's factoring algorithm [139,140].

Hence, let us assume that our black box calculates the unitary $U_f$ from Eq. (4.34) with a map $f : B^N \to B^N$ which is two to one and has period $a$, i.e. $f(x) = f(y)$ with $x \neq y$ iff $y = x + a \bmod 2$. The task is to find $a$. Classically, this problem is hard, i.e. we have to query the oracle exponentially often. To see this note first that we have to find a pair $(x, y)$ with $f(x) = f(y)$, and the probability to get it with two random queries is $2^{-N}$ (since there is for each $x$ exactly one $y \neq x$ with $f(x) = f(y)$).

17 If an analog computer works reliably only with a certain accuracy, we can rewrite the algorithm into a digital one.
M. Keyl / Physics Reports 369 (2002) 431 – 548
If we use the box $2^{N/4}$ times, we get less than $2^{N/2}$ different pairs. Hence, the probability to get the correct solution is $2^{-N/2}$, i.e. arbitrarily small even with exponentially many queries.

Assume now that we let our box act on a quantum register $H^{\otimes N} \otimes H^{\otimes N}$ in the state $\Psi \otimes |0\rangle$ with $\Psi$ from Eq. (4.33) to get $U_f(\Psi \otimes |0\rangle)$ from (4.35). Now we measure the second register. The outcome is one of $2^{N-1}$ possible values (say $f(x_0)$), each of which occurs with equal probability. Hence, after the measurement the first register is in the state $2^{-1/2}(|x_0\rangle + |x_0 + a\rangle)$. Now we let a Hadamard gate $H$ (cf. Fig. 4.9) act on each qubit of the first register and the result is (this follows with a short calculation)

$$\frac{1}{\sqrt{2}} H^{\otimes N}(|x_0\rangle + |x_0 + a\rangle) = \frac{1}{\sqrt{2^{N-1}}} \sum_{a \cdot y = 0} (-1)^{x_0 \cdot y} |y\rangle, \tag{4.36}$$

where the dot denotes the ($B$-valued) scalar product in the vector space $B^N$. Now we perform a measurement on the first register (in the computational basis) and we get a $y \in B^N$ with the property $y \cdot a = 0$. If we repeat this procedure until we have $N - 1$ linearly independent values $y_j$, we can determine $a$ as the nonzero solution of the system of equations $y_1 \cdot a = 0, \ldots, y_{N-1} \cdot a = 0$. Each $y$ with $y \cdot a = 0$ appears as an outcome of the second measurement with probability $2^{1-N}$. Therefore, the success probability can be made arbitrarily big while the number of times we have to access the box is linear in $N$.

4.6. Quantum cryptography

Finally, we want to have a short look at quantum cryptography, another more practical application of quantum information, which has the potential to become a technology in the not so distant future (see e.g. [95,93,34] for some experimental realizations and [69] for a more detailed overview). Hence, let us assume that Alice has a message $x \in B^N$ which she wants to send secretly to Bob over a public communication channel. One way to do this is the so-called "one-time pad": Alice randomly generates a second bit-string $y \in B^N$ of the same length as $x$ and sends $x + y$ instead of $x$.
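The one-time pad can be sketched in a few lines. This is an illustration only: the helper names `xor_bits` and `random_key` are assumptions introduced here, and addition in $B^N$ is implemented as bitwise XOR.

```python
import secrets


def xor_bits(a, b):
    """Bitwise addition mod 2 of two equally long bit lists."""
    if len(a) != len(b):
        raise ValueError("message and key must have the same length")
    return [p ^ q for p, q in zip(a, b)]


def random_key(n):
    """Uniformly random bit-string y of length n."""
    return [secrets.randbits(1) for _ in range(n)]


message = [1, 0, 1, 1, 0, 0, 1, 0]   # x
key = random_key(len(message))       # y
cipher = xor_bits(message, key)      # Alice sends x + y
recovered = xor_bits(cipher, key)    # adding y again returns x
```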
Without knowledge of the key $y$ it is completely impossible to recover the message $x$ from $x + y$. Hence, this is a perfectly secure method to transmit secret data. Unfortunately, it is completely useless without a secure way to transmit the key $y$ to Bob, because Bob needs $y$ to decrypt the message $x + y$ (simply by adding $y$ again). What makes the situation even worse is the fact that the key $y$ can be used only once (hence the name one-time pad). If two messages $x_1$, $x_2$ are encrypted with the same key we can use $x_1$ as a key to decrypt $x_2$ and vice versa: $(x_1 + y) + (x_2 + y) = x_1 + x_2$, hence both messages are partly compromised.

Due to these problems completely different approaches, namely "public key systems" like DSA and RSA, are used today for cryptography. The idea is to use two keys instead of one: a private key which is used for decryption and only known to its owner, and a public key used for encryption, which is publicly available (we do not discuss the algorithms needed for key generation, encryption and decryption here; see [145] and the references therein instead). To use this method, Bob generates a key pair $(z, y)$, keeps his private key ($y$) at a secure place and sends the public one ($z$) to Alice over a public channel. Alice encrypts her message with $z$, sends the result to Bob, and he can decrypt it with $y$. The security of this scheme relies on the assumption that the factoring problem is computationally hard, i.e. not in class P, because to calculate $y$ from $z$ requires the factorization of large integers. Since the latter is tractable on quantum computers via Shor's algorithm, the security
of public key systems breaks down if quantum computers become available in the future. Another problem of a more fundamental nature is the unproven status of the conjecture that factorization is not solvable in polynomial time. Consequently, security of public key systems is not proven either.

The crucial point is now that quantum information provides a way to distribute a cryptographic key $y$ in a secure way, such that $y$ can be used as a one-time pad afterwards. The basic idea is to use the no cloning theorem to detect possible eavesdropping attempts. To make this more transparent, let us consider a particular example here, namely the probably most prominent protocol, proposed by Bennett and Brassard in 1984 [10].

1. Assume that Alice wants to transmit bits from the (randomly generated) key $y \in B^N$ through an ideal quantum channel to Bob. Before they start they settle upon two orthonormal bases $e_0, e_1 \in H$ and $f_0, f_1 \in H$, which are mutually non-orthogonal, i.e. $|\langle e_j, f_k \rangle| \geq \epsilon > 0$ with $\epsilon$ big enough for each $j, k = 0, 1$. If photons are used as information carriers a typical choice is linearly polarized photons with polarization directions rotated by $45^\circ$ against each other.
2. To send one bit $j \in B$ Alice now selects at random one of the two bases, say $e_0, e_1$, and then she sends a qubit in the state $|e_j\rangle\langle e_j|$ through the channel. Note that neither Bob nor a potential eavesdropper knows which basis she has chosen.
3. When Bob receives the qubit he selects, as Alice before, a basis at random and performs the corresponding von Neumann measurement to get one classical bit $k \in B$, which he records together with the measurement method.
4. Both repeat this procedure until the whole string $y \in B^N$ is transmitted and then Bob tells Alice (through a classical, public communication channel) bit for bit which basis he has used for the measurement (but not the result of the measurement). If he has used the same basis as Alice both keep the corresponding bit, otherwise they discard it.

They end up with a bit-string $y \in B^M$ of a reduced length $M$. If this is not sufficient they have to continue sending random bits until the key is long enough. For large $N$ the rate of successfully transmitted bits per bit sent is obviously $1/2$. Hence, Alice has to send approximately twice as many bits as they need.

To see why this procedure is secure, assume now that the eavesdropper Eve can listen to and modify the information sent through the quantum channel and that she can listen on the classical channel but cannot modify it (we come back to this restriction in a minute). Hence, Eve can intercept the qubits sent by Alice and make two copies of them. One she forwards to Bob and the other she keeps for later analysis. Due to the no cloning theorem, however, she has produced errors in both copies, and the quality of her own decreases if she tries to make the error in Bob's as small as possible. Even if Eve knows about the two bases $e_0, e_1$ and $f_0, f_1$ she does not know which one Alice uses to send a particular qubit (see footnote 18). Hence, Eve has to decide randomly which basis to choose (as Bob does). If $e_0, e_1$ and $f_0, f_1$ are chosen optimally, i.e. $|\langle e_j, f_k \rangle|^2 = 0.5$, it is easy to see that the error rate Eve necessarily produces if she randomly measures in one of the bases is $1/4$ for large $N$. To detect this error Alice and Bob simply have to sacrifice portions of the generated key and compare randomly selected bits using their classical channel. If the error rate they detect is too big they can decide to drop the whole key and restart from the beginning.

18 If Alice and Bob use only one basis to send the data and Eve knows about it she can produce, of course, ideal copies of the qubits. This is actually the reason why two non-orthogonal bases are necessary.
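The sifting rate $1/2$ and the error rate $1/4$ can be illustrated with a small Monte Carlo simulation. This is a sketch under explicit assumptions, not a model of a real implementation: the channel is ideal, the bases are optimal ($|\langle e_j, f_k \rangle|^2 = 1/2$, so a measurement in the wrong basis is modelled as a fair coin), Eve measures each qubit in a random basis and resends her result, and the function names `measure` and `bb84` are chosen here for illustration.

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Outcome for a qubit prepared as `bit` in `prep_basis`.

    Assumes optimal bases: a mismatched measurement gives a fair coin.
    """
    return bit if prep_basis == meas_basis else rng.randint(0, 1)


def bb84(n, eavesdrop, seed=0):
    """Return (sifting rate, error rate in the sifted key) for n qubits."""
    rng = random.Random(seed)
    kept = errors = 0
    for _ in range(n):
        bit, a_basis = rng.randint(0, 1), rng.randint(0, 1)
        sent_bit, sent_basis = bit, a_basis
        if eavesdrop:
            # Eve measures in a random basis and resends what she found.
            e_basis = rng.randint(0, 1)
            sent_bit = measure(bit, a_basis, e_basis, rng)
            sent_basis = e_basis
        b_basis = rng.randint(0, 1)
        result = measure(sent_bit, sent_basis, b_basis, rng)
        if b_basis == a_basis:       # sifting over the public channel
            kept += 1
            errors += result != bit
    return kept / n, (errors / kept if kept else 0.0)


rate_clean, err_clean = bb84(20000, eavesdrop=False)
rate_eve, err_eve = bb84(20000, eavesdrop=True)
```

Without Eve the sifted key is error free; with her the error rate in the sifted key settles near $1/4$, which is what Alice and Bob detect when they compare sacrificed bits.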
So let us finally discuss a situation where Eve is able to intercept the quantum and the classical channel. This would imply that she can play Bob's part for Alice and Alice's for Bob. As a result she shares a key with Alice and one with Bob. Hence, she can decode all secret data Alice sends to Bob, read it, and finally encode it again to forward it to Bob. To secure against such a "woman in the middle attack", Alice and Bob can use classical authentication protocols which ensure that the correct person is at the other end of the line. This implies that they need a small amount of initial secret material which can be renewed, however, from the new key they have generated through quantum communication.

5. Entanglement measures

In the last section we have seen that entanglement is an essential resource for many tasks of quantum information theory, like teleportation or quantum computation. This means that entangled states are needed for the functioning of many processes and that they are consumed during operation. It is therefore necessary to have measures which tell us whether the entanglement contained in a number of quantum systems is sufficient to perform a certain task. What makes this subject difficult is the fact that we cannot restrict the discussion to systems in a maximally or at least highly entangled pure state. Due to unavoidable decoherence effects realistic applications have to deal with imperfect systems in mixed states, and exactly in this situation the question of the amount of available entanglement is interesting.

5.1. General properties and de
The next point concerns the range of $E$. If $\rho$ is unentangled $E(\rho)$ should be zero of course and it should be maximal on maximally entangled states. But what happens if we allow the dimensions of $H$ and $K$ to grow? To get an answer consider first a pair of qubits in a maximally entangled state $\rho$. It should contain exactly one bit of entanglement, i.e. $E(\rho) = 1$, and $N$ pairs in the state $\rho^{\otimes N}$ should contain $N$ bits. If we interpret $\rho^{\otimes N}$ as a maximally entangled state of an $H \otimes H$ system with $H = \mathbb{C}^{2^N}$ we get $E(\rho^{\otimes N}) = \log_2(\dim(H)) = N$, where we have to reshuffle in $\rho^{\otimes N}$ the tensor factors such that $(\mathbb{C}^2 \otimes \mathbb{C}^2)^{\otimes N}$ becomes $(\mathbb{C}^2)^{\otimes N} \otimes (\mathbb{C}^2)^{\otimes N}$ (i.e. "all Alice particles to the left and all Bob particles to the right"; cf. Section 4.3). This observation motivates the following.

Axiom E1 (Normalization). $E$ vanishes on separable and takes its maximum on maximally entangled states. More precisely, this means that $E(\sigma) \leq E(\rho) = \log_2(d)$ for $\rho, \sigma \in S(H \otimes H)$ and $\rho$ maximally entangled.

One thing an entanglement measure should tell us is how much quantum information can be maximally teleported with a certain amount of entanglement, where this maximum is taken over all possible teleportation schemes and distillation protocols; hence it cannot be increased further by additional LOCC operations on the entangled systems in question. This consideration motivates the following axiom.

Axiom E2 (LOCC monotonicity). $E$ cannot increase under LOCC operations, i.e. $E[T(\rho)] \leq E(\rho)$ for all states $\rho$ and all LOCC channels $T$.

A special case of LOCC operations are, of course, local unitary operations $U \otimes V$. Axiom E2 implies now that $E(U \otimes V \rho U^* \otimes V^*) \leq E(\rho)$ and on the other hand $E(U^* \otimes V^* \tilde{\rho}\, U \otimes V) \leq E(\tilde{\rho})$; hence with $\tilde{\rho} = U \otimes V \rho U^* \otimes V^*$ we get $E(\rho) \leq E(U \otimes V \rho U^* \otimes V^*)$ and therefore $E(\rho) = E(U \otimes V \rho U^* \otimes V^*)$. We fix this property as a weakened version of Axiom E2.

Axiom E2a (Local unitary invariance). $E$ is invariant under local unitaries, i.e. $E(U \otimes V \rho U^* \otimes V^*) = E(\rho)$ for all states $\rho$ and all unitaries $U$, $V$.

This axiom shows why we do not have to bother about families of functions as mentioned above. If $E$ is defined on $S(H \otimes H)$ it is automatically defined on $S(H_1 \otimes H_2)$ for all Hilbert spaces $H_k$ with $\dim(H_k) \leq \dim(H)$, because we can embed $H_1 \otimes H_2$ under this condition unitarily into $H \otimes H$.

Consider now a convex linear combination $\lambda\rho + (1 - \lambda)\sigma$ with $0 \leq \lambda \leq 1$. Entanglement cannot be "generated" by mixing two states, i.e. $E(\lambda\rho + (1 - \lambda)\sigma) \leq \lambda E(\rho) + (1 - \lambda)E(\sigma)$.

Axiom E3 (Convexity). $E$ is a convex function, i.e. $E(\lambda\rho + (1 - \lambda)\sigma) \leq \lambda E(\rho) + (1 - \lambda)E(\sigma)$ for two states $\rho$, $\sigma$ and $0 \leq \lambda \leq 1$.

The next property concerns the continuity of $E$, i.e. if we perturb $\rho$ slightly the change of $E(\rho)$ should be small. This can be expressed most conveniently as continuity of $E$ in the trace norm. At this point, however, it is not quite clear how we have to handle the fact that $E$ is defined for
arbitrary Hilbert spaces. The following version is motivated basically by the fact that it is a crucial assumption in Theorems 5.2 and 5.3.

Axiom E4 (Continuity). Consider a sequence of Hilbert spaces $H_N$, $N \in \mathbb{N}$ and two sequences of states $\rho_N, \sigma_N \in S(H_N \otimes H_N)$ with $\lim_{N \to \infty} \|\rho_N - \sigma_N\|_1 = 0$. Then we have

$$\lim_{N \to \infty} \frac{E(\rho_N) - E(\sigma_N)}{1 + \log_2(\dim H_N)} = 0. \tag{5.1}$$

The last point we have to consider here are additivity properties: Since we are looking at entanglement as a resource, it is natural to assume that we can do with two pairs in the state $\rho$ twice as much as with one $\rho$, or more precisely $E(\rho \otimes \rho) = 2E(\rho)$ (in $\rho \otimes \rho$ we have to reshuffle tensor factors again; see above).

Axiom E5 (Additivity). For any pair of two-partite states $\rho, \sigma \in S(H \otimes K)$ we have $E(\sigma \otimes \rho) = E(\sigma) + E(\rho)$.

Unfortunately, this rather natural looking axiom seems to be too strong (it excludes reasonable candidates). It should, however, always be true that entanglement cannot increase if we put two pairs together.

Axiom E5a (Subadditivity). For any pair of states $\rho$, $\sigma$ we have $E(\rho \otimes \sigma) \leq E(\rho) + E(\sigma)$.

There are further modifications of additivity available in the literature. Most frequently used is the following, which restricts Axiom E5 to the case $\rho = \sigma$.

Axiom E5b (Weak additivity). For any state $\rho$ of a bipartite system we have $N^{-1} E(\rho^{\otimes N}) = E(\rho)$.

Finally, the weakest version of additivity only deals with the behavior of $E$ for large tensor products, i.e. $\rho^{\otimes N}$ for $N \to \infty$.

Axiom E5c (Existence of a regularization). For each state $\rho$ the limit

$$E^\infty(\rho) = \lim_{N \to \infty} \frac{E(\rho^{\otimes N})}{N} \tag{5.2}$$

exists.
5.1.2. Pure states

Let us consider now a pure state $\rho = |\psi\rangle\langle\psi| \in S(H \otimes K)$. If it is entangled its partial trace $\sigma = \mathrm{tr}_H |\psi\rangle\langle\psi| = \mathrm{tr}_K |\psi\rangle\langle\psi|$ is mixed and for a maximally entangled state it is maximally mixed. This suggests using the von Neumann entropy (see footnote 19) of $\sigma$, which measures how much a state is mixed, as an entanglement measure for pure states, i.e. we define [9,16]

$$E_{vN}(\rho) = -\mathrm{tr}[\mathrm{tr}_H \rho \, \ln(\mathrm{tr}_H \rho)]. \tag{5.3}$$

19 We assume here and in the following that the reader is sufficiently familiar with entropies. If this is not the case we refer to [123].
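For two qubits, Eq. (5.3) can be evaluated in closed form, because the reduced state is a $2 \times 2$ matrix with unit trace whose eigenvalues are fixed by its determinant. The following minimal sketch assumes a state $\psi = a|00\rangle + b|01\rangle + c|10\rangle + d|11\rangle$; the function name and the `base` parameter are introduced here for illustration. With `base=2` the result is in bits, matching the normalization of Axiom E1, while `base=math.e` reproduces the natural logarithm written in Eq. (5.3).

```python
import math


def reduced_entropy_two_qubits(a, b, c, d, base=2.0):
    """Entropy of the reduced state of a|00> + b|01> + c|10> + d|11>."""
    norm = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    # Determinant of the (normalized) reduced density matrix.
    det = abs(a * d - b * c) ** 2 / norm ** 2
    disc = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    eigenvalues = [(1.0 + disc) / 2.0, (1.0 - disc) / 2.0]
    return -sum(p * math.log(p, base) for p in eigenvalues if p > 0.0)


s = 1.0 / math.sqrt(2.0)
entropy_product = reduced_entropy_two_qubits(1.0, 0.0, 0.0, 0.0)  # |00>
entropy_bell = reduced_entropy_two_qubits(s, 0.0, 0.0, s)         # (|00>+|11>)/sqrt(2)
```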
It is easy to deduce from the properties of the von Neumann entropy that $E_{vN}$ satisfies Axioms E0, E1, E3 and E5b. Somewhat more difficult is only Axiom E2 which follows, however, from a nice theorem of Nielsen [119] which relates LOCC operations (on pure states) to the theory of majorization. To state it here we first need some terminology. Consider two probability distributions $\lambda = (\lambda_1, \ldots, \lambda_M)$ and $\mu = (\mu_1, \ldots, \mu_N)$, both given in decreasing order (i.e. $\lambda_1 \geq \cdots \geq \lambda_M$ and $\mu_1 \geq \cdots \geq \mu_N$). We say that $\lambda$ is majorized by $\mu$, in symbols $\lambda \prec \mu$, if

$$\sum_{j=1}^{k} \lambda_j \leq \sum_{j=1}^{k} \mu_j \quad \forall k = 1, \ldots, \min(M, N) \tag{5.4}$$
holds. Now we have the following result (see [119] for a proof).

Theorem 5.1. A pure state $\psi = \sum_j \lambda_j^{1/2} e_j \otimes e_j' \in H \otimes K$ can be transformed into another pure state $\phi = \sum_j \mu_j^{1/2} f_j \otimes f_j' \in H \otimes K$ via an LOCC operation iff the Schmidt coefficients of $\psi$ are majorized by those of $\phi$, i.e. $\lambda \prec \mu$.

The von Neumann entropy of the restriction $\mathrm{tr}_H |\psi\rangle\langle\psi|$ can be immediately calculated from the Schmidt coefficients of $\psi$ by $E_{vN}(|\psi\rangle\langle\psi|) = -\sum_j \lambda_j \ln(\lambda_j)$. Axiom E2 follows therefore from the fact that the entropy $S(\lambda) = -\sum_j \lambda_j \ln(\lambda_j)$ of a probability distribution $\lambda$ is a Schur concave function, i.e. $\lambda \prec \mu$ implies $S(\lambda) \geq S(\mu)$; see [121].

Hence, we have seen so far that $E_{vN}$ is one possible candidate for an entanglement measure on pure states. In the following we will see that it is in fact the only candidate which is physically reasonable. There are basically two reasons for this. The first one deals with distillation of entanglement. It was shown by Bennett et al. [9] that each state $\psi \in H \otimes K$ of a bipartite system can be prepared out of (a possibly large number of) systems in an arbitrary entangled state $\phi$ by LOCC operations. To be more precise, we can find a sequence of LOCC operations

$$T_N : B[(H \otimes K)^{\otimes M(N)}] \to B[(H \otimes K)^{\otimes N}] \tag{5.5}$$

such that

$$\lim_{N \to \infty} \| T_N^*(|\phi\rangle\langle\phi|^{\otimes N}) - |\psi\rangle\langle\psi|^{\otimes M(N)} \|_1 = 0 \tag{5.6}$$

holds with a non-vanishing rate $r = \lim_{N \to \infty} M(N)/N$. This is done either by distillation ($r < 1$ if $\psi$ is higher entangled than $\phi$) or by "diluting" entanglement, i.e. creating many less entangled states from few highly entangled ones ($r > 1$). All this can be performed in a reversible way: We can start with some maximally entangled qubits, dilute them to get many less entangled states which can be distilled afterwards to get the original states back (again only in an asymptotic sense). The crucial point is that the asymptotic rate $r$ of these processes is given in terms of $E_{vN}$ by $r = E_{vN}(|\phi\rangle\langle\phi|)/E_{vN}(|\psi\rangle\langle\psi|)$. Hence, we can say, roughly speaking, that $E_{vN}(|\psi\rangle\langle\psi|)$ describes exactly the amount of maximally entangled qubits which is contained in $|\psi\rangle\langle\psi|$.

A second, somewhat more formal reason is that $E_{vN}$ is the only entanglement measure on the set of pure states which satisfies the axioms formulated above. In other words the following "uniqueness theorem for entanglement measures" holds [129,155,57].
Theorem 5.2. The reduced von Neumann entropy $E_{vN}$ is the only entanglement measure on pure states which satisfies Axioms E0–E5.

5.1.3. Entanglement measures for mixed states

To find reasonable entanglement measures for mixed states is much more difficult. There are in fact many possibilities (e.g. the maximally entangled fraction introduced in Section 3.1.1 can be regarded as a simple measure) and we want to present therefore only four of the most reasonable candidates. Among those measures which we do not discuss here are negativity quantities ([158] and the references therein), the "best separable approximation" [108], the base norm associated with the set of separable states [157,136] and ppt-distillation rates [133].

The first measure we want to present is oriented along the discussion of pure states: We define, roughly speaking, the asymptotic rate with which maximally entangled qubits can be distilled at most out of a state $\rho \in S(H \otimes K)$ as the entanglement of distillation $E_D(\rho)$ of $\rho$; cf. [12]. To be more precise consider all possible distillation protocols for $\rho$ (cf. Section 4.3), i.e. all sequences of LOCC channels

$$T_N : B(\mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}) \to B(H^{\otimes N} \otimes K^{\otimes N}), \tag{5.7}$$

such that

$$\lim_{N \to \infty} \| T_N^*(\rho^{\otimes N}) - |\Omega_N\rangle\langle\Omega_N| \|_1 = 0 \tag{5.8}$$

holds with a sequence of maximally entangled states $\Omega_N \in \mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}$. Now we can define

$$E_D(\rho) = \sup_{(T_N)_{N \in \mathbb{N}}} \limsup_{N \to \infty} \frac{\log_2(d_N)}{N}, \tag{5.9}$$
where the supremum is taken over all possible distillation protocols $(T_N)_{N \in \mathbb{N}}$. It is not very difficult to see that $E_D$ satisfies Axioms E0, E1, E2 and E5b. It is not known whether continuity (Axiom E4) and convexity (Axiom E3) hold. It can be shown, however, that $E_D$ is not convex (and not additive; Axiom E5) if npt bound entangled states exist (see [141], cf. also Section 4.3.3).

For pure states we have discussed, beside distillation, the "dilution" of entanglement, and we can use, similarly to $E_D$, the asymptotic rate with which bipartite systems in a given state $\rho$ can be prepared out of maximally entangled singlets [78]. Hence, consider again a sequence of LOCC channels

$$T_N : B(H^{\otimes N} \otimes K^{\otimes N}) \to B(\mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}) \tag{5.10}$$

and a sequence of maximally entangled states $\Omega_N \in \mathbb{C}^{d_N} \otimes \mathbb{C}^{d_N}$, $N \in \mathbb{N}$, but now with the property

$$\lim_{N \to \infty} \| \rho^{\otimes N} - T_N^*(|\Omega_N\rangle\langle\Omega_N|) \|_1 = 0. \tag{5.11}$$

Then we can define the entanglement cost $E_C(\rho)$ of $\rho$ as

$$E_C(\rho) = \inf_{(T_N)_{N \in \mathbb{N}}} \liminf_{N \to \infty} \frac{\log_2(d_N)}{N}, \tag{5.12}$$

where the infimum is taken over all dilution protocols $T_N$, $N \in \mathbb{N}$. It is again easy to see that $E_C$ satisfies Axioms E0, E1, E2 and E5b. In contrast to $E_D$, however, it can be shown that $E_C$ is convex (Axiom E3), while it is not known whether $E_C$ is continuous (Axiom E4); cf. [78] for proofs.
$E_D$ and $E_C$ are directly based on operational concepts. The remaining two measures we want to discuss here are defined in a more abstract way. The first can be characterized as the minimal convex extension of $E_{vN}$ to mixed states: We define the entanglement of formation $E_F$ of $\rho$ as [16]

$$E_F(\rho) = \inf_{\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|} \sum_j p_j E_{vN}(|\psi_j\rangle\langle\psi_j|), \tag{5.13}$$

where the infimum is taken over all decompositions of $\rho$ into a convex sum of pure states. $E_F$ satisfies E0–E4 and E5a (cf. [16] for E2 and [120] for E4; the rest follows directly from the definition). Whether $E_F$ is (weakly) additive (Axiom E5b) is not known. Furthermore, it is conjectured that $E_F$ coincides with $E_C$. However, only the identity $E_F^\infty = E_C$ is proven, where the existence of the regularization $E_F^\infty$ of $E_F$ follows directly from subadditivity.

Another idea to quantify entanglement is to measure the "distance" of the (entangled) $\rho$ from the set of separable states $D$. It has turned out [154] that among all possible distance functions the relative entropy is physically most reasonable. Hence, we define the relative entropy of entanglement as

$$E_R(\rho) = \inf_{\sigma \in D} S(\rho|\sigma), \qquad S(\rho|\sigma) = \mathrm{tr}(\rho \log_2 \rho - \rho \log_2 \sigma), \tag{5.14}$$
where the infimum is taken over all separable states. It can be shown that $E_R$ satisfies, as $E_F$ does, the Axioms E0–E4 and E5a, where E1 and E2 are shown in [154] and E4 in [56]; the rest follows directly from the definition. It is shown in [159] that $E_R$ does not satisfy E5b; cf. also Section 5.3. Hence, the regularization $E_R^\infty$ of $E_R$ differs from $E_R$.

Finally, let us give some comments on the relation between the measures just introduced. On pure states all measures just discussed coincide with the reduced von Neumann entropy; this follows from Theorem 5.2 and the properties stated in the last subsection. For mixed states the situation is more difficult. It can be shown, however, that $E_D \leq E_C$ holds and that all "reasonable" entanglement measures lie in between [89].

Theorem 5.3. For each entanglement measure $E$ satisfying E0, E1, E2 and E5b and each state $\rho \in S(H \otimes K)$ we have $E_D(\rho) \leq E(\rho) \leq E_C(\rho)$.

Unfortunately, no measure we have discussed in the last subsection satisfies all the assumptions of the theorem. It is possible, however, to get a similar statement for the regularization $E^\infty$ with weaker assumptions on $E$ itself (in particular, without assuming additivity); cf. [57].

5.2. Two qubits

Even more difficult than finding reasonable entanglement measures are explicit calculations. All measures we have discussed above involve optimization processes over spaces which grow exponentially with the dimension of the Hilbert space. A direct numerical calculation for a general state $\rho$ is therefore hopeless. There are, however, some attempts to get either some bounds on entanglement measures or to get explicit calculations for special classes of states. We will restrict this discussion to some relevant special cases. On the one hand, we will concentrate on $E_F$ and $E_R$, and on the other we will look at two special classes of states where explicit calculations are possible: two-qubit systems in this section and states with symmetry properties in the next one.
5.2.1. Pure states

Assume for the rest of this section that $H = \mathbb{C}^2$ holds and consider first a pure state $\psi \in H \otimes H$. To calculate $E_{vN}(\psi)$ is of course not difficult and it is straightforward to see that (cf. [16] for all material of this and the following subsection)

$$E_{vN}(\psi) = H\left[\tfrac{1}{2}\left(1 + \sqrt{1 - C(\psi)^2}\right)\right] \tag{5.15}$$

holds, with

$$H(x) = -x \log_2(x) - (1 - x) \log_2(1 - x) \tag{5.16}$$

and the concurrence $C(\psi)$ of $\psi$, which is defined by

$$C(\psi) = \Big| \sum_{j=0}^{3} \alpha_j^2 \Big| \quad \text{with} \quad \psi = \sum_{j=0}^{3} \alpha_j \Phi_j, \tag{5.17}$$
where $\Phi_j$, $j = 0, \ldots, 3$ denotes the Bell basis (3.3). Since $C$ becomes rather important in the following let us reexpress it as $C(\psi) = |\langle \psi, L\psi \rangle|$, where $\psi \mapsto L\psi$ denotes complex conjugation in the Bell basis. Hence, $L$ is an antiunitary operator and it can be written as the tensor product $L = M \otimes M$ of the map $H \ni \phi \mapsto \sigma_2 \bar{\phi}$, where $\bar{\phi}$ denotes complex conjugation in the canonical basis and $\sigma_2$ is the second Pauli matrix. Hence, local unitaries (i.e. those of the form $U_1 \otimes U_2$) commute with $L$ and it can be shown that this is not only a necessary but also a sufficient condition for a unitary to be local [160]. We see from Eqs. (5.15) and (5.17) that $C(\psi)$ ranges from 0 to 1 and that $E_{vN}(\psi)$ is a monotone function in $C(\psi)$. The latter can be considered therefore as an entanglement quantity in its own right. For a Bell state we get in particular $C(\Phi_j) = 1$ while a separable state $\phi_1 \otimes \phi_2$ leads to $C(\phi_1 \otimes \phi_2) = 0$; this can be seen easily with the factorization $L = M \otimes M$.

Assume now that one of the $\alpha_j$, say $\alpha_0$, satisfies $|\alpha_0|^2 > 1/2$. This implies that $C(\psi)$ cannot be zero since

$$\Big| \sum_{j=1}^{3} \alpha_j^2 \Big| \leq 1 - |\alpha_0|^2 \tag{5.18}$$

must hold. Hence, $C(\psi)$ is at least $2|\alpha_0|^2 - 1$ and this implies for $E_{vN}$ and arbitrary $\psi$

$$E_{vN}(\psi) \geq h(|\langle \Phi_0, \psi \rangle|^2) \quad \text{with} \quad h(x) = \begin{cases} H\left[\tfrac{1}{2} + \sqrt{x(1 - x)}\right], & x \geq \tfrac{1}{2}, \\ 0, & x < \tfrac{1}{2}. \end{cases} \tag{5.19}$$

This inequality remains valid if we replace $\Phi_0$ by any other maximally entangled state $\Phi \in H \otimes H$. To see this note that two maximally entangled states $\Phi, \Phi' \in H \otimes H$ are related (up to a phase) by a local unitary transformation $U_1 \otimes U_2$ (this follows immediately from their Schmidt decomposition; cf. Section 3.1.1). Hence, if we replace the Bell basis in Eq. (5.17) by $\Phi_j' = U_1 \otimes U_2 \Phi_j$, $j = 0, \ldots, 3$ we get for the corresponding $C'$ the equation $C'(\psi) = |\langle U_1^* \otimes U_2^* \psi, L\, U_1^* \otimes U_2^* \psi \rangle| = C(\psi)$ since $L$ commutes with local unitaries. We can even replace $|\langle \Phi_0, \psi \rangle|^2$ with the supremum over all maximally entangled states and therefore get

$$E_{vN}(\psi) \geq h[F(|\psi\rangle\langle\psi|)], \tag{5.20}$$
where $F(|\psi\rangle\langle\psi|)$ is the maximally entangled fraction of $|\psi\rangle\langle\psi|$ which we have introduced in Section 3.1.1. To see that even equality holds in Eq. (5.20) note first that it is sufficient to consider the case $\psi = a|00\rangle + b|11\rangle$ with $a, b \geq 0$, $a^2 + b^2 = 1$, since each pure state can be brought into this form (this follows again from the Schmidt decomposition) by a local unitary transformation, which on the other hand does not change $E_{vN}$. The maximally entangled state which maximizes $|\langle \psi, \Phi \rangle|^2$ is in this case $\Phi_0$ and we get $F(|\psi\rangle\langle\psi|) = (a + b)^2/2 = 1/2 + ab$. Straightforward calculations now show that $h[F(|\psi\rangle\langle\psi|)] = h(1/2 + ab) = E_{vN}(\psi)$ holds as stated.

5.2.2. EOF for Bell diagonal states

It is easy to extend inequality (5.20) to mixed states if we use the convexity of $E_F$ and the fact that $E_F$ coincides with $E_{vN}$ on pure states. Hence, (5.20) becomes

$$E_F(\rho) \geq h[F(\rho)]. \tag{5.21}$$
For general two-qubit states this bound is not achieved, however. This can be seen with the example state which we have already considered in the last paragraph of Section 3.1.1 (an equal mixture of $|00\rangle\langle 00|$ and a second pure state). It is easy to see that $F(\rho) = 1/2$ holds, hence $h[F(\rho)] = 0$, but $\rho$ is entangled. Nevertheless, we can show that equality holds in Eq. (5.21) if we restrict it to the Bell diagonal states $\rho = \sum_{j=0}^{3} \lambda_j |\Phi_j\rangle\langle\Phi_j|$. To prove this statement we have to find a convex decomposition $\rho = \sum_j \mu_j |\Psi_j\rangle\langle\Psi_j|$ of such a $\rho$ into pure states $|\Psi_j\rangle\langle\Psi_j|$ such that $h[F(\rho)] = \sum_j \mu_j E_{vN}(|\Psi_j\rangle\langle\Psi_j|)$ holds. Since $E_F(\rho)$ cannot be smaller than $h[F(\rho)]$ due to inequality (5.21) this decomposition must be optimal and equality is proven.

To find such $\Psi_j$ assume first that the biggest eigenvalue of $\rho$ is greater than $1/2$, and let, without loss of generality, $\lambda_0$ be this eigenvalue. A good choice for the $\Psi_j$ are then the eight pure states

$$\sqrt{\lambda_0}\, \Phi_0 + i \sum_{j=1}^{3} (\pm\sqrt{\lambda_j})\, \Phi_j. \tag{5.22}$$

The reduced von Neumann entropy of all these states equals $h(\lambda_0)$, hence $\sum_j \mu_j E_{vN}(|\Psi_j\rangle\langle\Psi_j|) = h(\lambda_0)$ and therefore $E_F(\rho) = h(\lambda_0)$. Since the maximally entangled fraction of $\rho$ is obviously $\lambda_0$ we see that (5.21) holds with equality. Assume now that the highest eigenvalue is at most $1/2$. Then we can find phase factors $\exp(i\theta_j)$ such that $\sum_{j=0}^{3} \exp(i\theta_j)\lambda_j = 0$ holds and $\rho$ can be expressed as a convex linear combination of the states

$$e^{i\theta_0/2}\sqrt{\lambda_0}\, \Phi_0 + i \sum_{j=1}^{3} (\pm e^{i\theta_j/2}\sqrt{\lambda_j})\, \Phi_j. \tag{5.23}$$

The concurrence $C$ of all these states is 0, hence their entanglement is 0 by Eq. (5.15), which in turn implies $E_F(\rho) = 0$. Again, we see that equality is achieved in (5.21) since the maximally entangled fraction of $\rho$ is at most $1/2$. Summarizing this discussion we have shown (cf. Fig. 5.1)

Proposition 5.4. A Bell diagonal state $\rho$ is entangled iff its highest eigenvalue $\lambda$ is greater than $1/2$. In this case the entanglement of formation of $\rho$ is given by

$$E_F(\rho) = H\left[\tfrac{1}{2} + \sqrt{\lambda(1 - \lambda)}\right]. \tag{5.24}$$
[Figure: curves $E_F(\rho)$ (entanglement of formation) and $E_R(\rho)$ (relative entropy), vertical axis from 0 to 1, horizontal axis: highest eigenvalue $\lambda$ of $\rho$ from 0.5 to 1.]

Fig. 5.1. Entanglement of formation and relative entropy of entanglement for the Bell diagonal states, plotted as a function of the highest eigenvalue $\lambda$ of $\rho$.
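The two curves of Fig. 5.1 can be tabulated directly from Eq. (5.24) and from the formula for $E_R$ stated in Proposition 5.6. The following is a minimal sketch; the function names are chosen here for illustration and are not part of the original text.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def ef_bell_diagonal(lam):
    """Entanglement of formation, Eq. (5.24); zero for lam <= 1/2."""
    if lam <= 0.5:
        return 0.0
    return binary_entropy(0.5 + math.sqrt(lam * (1.0 - lam)))


def er_bell_diagonal(lam):
    """Relative entropy of entanglement, Proposition 5.6."""
    if lam <= 0.5:
        return 0.0
    return 1.0 - binary_entropy(lam)


# Rows (lambda, E_F, E_R) on the range plotted in Fig. 5.1.
table = [(k / 20, ef_bell_diagonal(k / 20), er_bell_diagonal(k / 20))
         for k in range(10, 21)]
```

Both quantities vanish at $\lambda = 1/2$ and reach 1 at $\lambda = 1$, with $E_R$ lying below $E_F$ in between.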
5.2.3. Wootters formula

If we have a general two-qubit state $\rho$ there is a formula of Wootters [172] which allows an easy calculation of $E_F$. It is based on a generalization of the concurrence $C$ to mixed states. To motivate it rewrite $C^2(\psi) = |\langle \psi, L\psi \rangle|^2$ as

$$C^2(\psi) = \mathrm{tr}(|\psi\rangle\langle\psi|\, L |\psi\rangle\langle\psi| L) = \mathrm{tr}(\rho L \rho L) = \mathrm{tr}(R^2) \tag{5.25}$$

with

$$R = \sqrt{\sqrt{\rho}\, L \rho L \sqrt{\rho}}. \tag{5.26}$$

Here we have set $\rho = |\psi\rangle\langle\psi|$. The definition of the Hermitian matrix $R$, however, makes sense for arbitrary $\rho$ as well. If we write $\lambda_j$, $j = 1, \ldots, 4$ for the eigenvalues of $R$ and $\lambda_1$ is, without loss of generality, the biggest one, we can define the concurrence of an arbitrary two-qubit state $\rho$ as [172]

$$C(\rho) = \max(0, 2\lambda_1 - \mathrm{tr}(R)) = \max(0, \lambda_1 - \lambda_2 - \lambda_3 - \lambda_4). \tag{5.27}$$

It is easy to see that $C(|\psi\rangle\langle\psi|)$ coincides with $C(\psi)$ from (5.17). The crucial point is now that Eq. (5.15) holds for $E_F(\rho)$ if we insert $C(\rho)$ instead of $C(\psi)$:

Theorem 5.5 (Wootters formula). The entanglement of formation of a two-qubit system in a state $\rho$ is given by

$$E_F(\rho) = H\left[\tfrac{1}{2}\left(1 + \sqrt{1 - C(\rho)^2}\right)\right], \tag{5.28}$$

where the concurrence of $\rho$ is given in Eq. (5.27) and $H$ denotes the binary entropy from (5.16).
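Eqs. (5.26)–(5.28) translate into a short numerical routine. The sketch below is an illustration under stated assumptions: it uses NumPy, represents $\rho$ as a $4 \times 4$ matrix in the computational basis, realizes $L\rho L$ as $(\sigma_2 \otimes \sigma_2)\bar{\rho}(\sigma_2 \otimes \sigma_2)$ according to the factorization $L = M \otimes M$, and obtains the eigenvalues of $R$ as square roots of the eigenvalues of $\rho\, L\rho L$ (the two matrices share their spectrum). The function names are chosen here for illustration.

```python
import math

import numpy as np

SIGMA_Y = np.array([[0, -1j], [1j, 0]])
YY = np.kron(SIGMA_Y, SIGMA_Y)


def concurrence(rho):
    """Concurrence, Eq. (5.27), of a two-qubit density matrix."""
    rho = np.asarray(rho, dtype=complex)
    rho_tilde = YY @ rho.conj() @ YY          # L rho L in matrix form
    ev = np.linalg.eigvals(rho @ rho_tilde)   # squares of eigenvalues of R
    lam = np.sort(np.sqrt(np.clip(ev.real, 0.0, None)))[::-1]
    return max(0.0, float(lam[0] - lam[1] - lam[2] - lam[3]))


def binary_entropy(x):
    """Binary entropy H from Eq. (5.16), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def entanglement_of_formation(rho):
    """Wootters formula, Eq. (5.28)."""
    c = min(concurrence(rho), 1.0)            # guard against rounding above 1
    return binary_entropy(0.5 * (1.0 + math.sqrt(1.0 - c * c)))


phi_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_bell = np.outer(phi_plus, phi_plus.conj())   # maximally entangled
rho_product = np.diag([1.0, 0.0, 0.0, 0.0])      # |00><00|, separable
```

For the Bell state the routine returns a concurrence and an entanglement of formation close to 1, for the product state both vanish.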
To prove this theorem we firstly have to find a convex decomposition $\rho = \sum_j \mu_j |\Psi_j\rangle\langle\Psi_j|$ of $\rho$ into pure states $\Psi_j$ such that the average reduced von Neumann entropy $\sum_j \mu_j E_{vN}(\Psi_j)$ coincides with the right-hand side of Eq. (5.28). Secondly, we have to show that we have really found the minimal decomposition. Since this is much more involved than the simple case discussed in Section 5.2.2 we omit the proof and refer to [172] instead. Note, however, that Eq. (5.28) really coincides with the special cases we have derived for the pure and the Bell diagonal states. Finally, let us add the remark that there is no analog of Wootters' formula for higher dimensional Hilbert spaces. It can be shown [160] that the essential properties of the Bell basis $\Phi_j$, $j = 0, \ldots, 3$ which would be necessary for such a generalization are available only in $2 \times 2$ dimensions.

5.2.4. Relative entropy for Bell diagonal states

To calculate the relative entropy of entanglement $E_R$ for two-qubit systems is more difficult. However, there is at least an easy formula for the Bell diagonal states which we will give in the following [154]:

Proposition 5.6. The relative entropy of entanglement for a Bell diagonal state $\rho$ with highest eigenvalue $\lambda$ is given by (cf. Fig. 5.1)

$$E_R(\rho) = \begin{cases} 1 - H(\lambda), & \lambda > \tfrac{1}{2}, \\ 0, & \lambda \leq \tfrac{1}{2}. \end{cases} \tag{5.29}$$

Proof. For a Bell diagonal state $\rho = \sum_{j=0}^{3} \lambda_j |\Phi_j\rangle\langle\Phi_j|$ we have to calculate

$$E_R(\rho) = \inf_{\sigma \in D} \left[\mathrm{tr}(\rho \log_2 \rho - \rho \log_2 \sigma)\right] \tag{5.30}$$

$$= \mathrm{tr}(\rho \log_2 \rho) + \inf_{\sigma \in D} \Big[ -\sum_{j=0}^{3} \lambda_j \langle \Phi_j, \log_2(\sigma) \Phi_j \rangle \Big]. \tag{5.31}$$

Since $\log$ is a concave function we have $-\log_2 \langle \Phi_j, \sigma \Phi_j \rangle \leq \langle \Phi_j, -\log_2(\sigma) \Phi_j \rangle$ and therefore

$$E_R(\rho) \geq \mathrm{tr}(\rho \log_2 \rho) + \inf_{\sigma \in D} \Big[ -\sum_{j=0}^{3} \lambda_j \log_2 \langle \Phi_j, \sigma \Phi_j \rangle \Big]. \tag{5.32}$$

Hence, only the diagonal elements of $\sigma$ in the Bell basis enter the minimization on the right-hand side of this inequality, and this implies that we can restrict the infimum to the set of separable Bell diagonal states. Since a Bell diagonal state is separable iff none of its eigenvalues is greater than $1/2$ (Proposition 5.4) we get

$$E_R(\rho) \geq \mathrm{tr}(\rho \log_2 \rho) + \inf_{p_j \in [0, 1/2]} \Big[ -\sum_{j=0}^{3} \lambda_j \log_2 p_j \Big] \quad \text{with} \quad \sum_{j=0}^{3} p_j = 1. \tag{5.33}$$

This is an optimization problem (with constraints) over only four real parameters and easy to solve. If the highest eigenvalue of $\rho$ is greater than $1/2$ we get $p_1 = 1/2$ and $p_j = \lambda_j/(2 - 2\lambda)$ for the remaining $j$, where we have chosen without loss of generality $\lambda = \lambda_1$. We get a lower bound on $E_R(\rho)$ which is achieved
if we insert the corresponding $\sigma$ in Eq. (5.31). Hence, we have proven the statement for $\lambda > 1/2$, which completes the proof, since we have already seen that $\lambda \le 1/2$ implies that $\rho$ is separable (Proposition 5.4).

5.3. Entanglement measures under symmetry

The problems occurring if we try to calculate quantities like $E_R$ or $E_F$ for general density matrices arise from the fact that we have to solve optimization problems over very high dimensional spaces. One possible strategy to get explicit results is therefore parameter reduction by symmetry arguments. This can be done if the state in question admits some invariance properties like Werner, isotropic or OO-invariant states; cf. Section 3.1. In the following, we will give some particular examples for such calculations, while a detailed discussion of the general idea (together with many more examples and further references) can be found in [159].

5.3.1. Entanglement of formation

Consider a compact group of unitaries $G \subset B(H\otimes H)$ (where $H$ is again arbitrary finite dimensional), the set of $G$-invariant states, i.e. all $\rho$ with $[V,\rho] = 0$ for all $V\in G$, and the corresponding twirl operation $P_G\rho = \int_G V\rho V^*\,dV$. Particular examples we are looking at are: (1) Werner states where $G$ consists of all unitaries $U\otimes U$, (2) isotropic states where each $V\in G$ has the form $V = U\otimes\bar U$ and finally (3) OO-invariant states where $G$ consists of unitaries $U\otimes U$ with real matrix elements ($U = \bar U$) and the twirl is given in Eq. (3.24). One way to calculate $E_F$ for a $G$-invariant state $\rho$ consists now of the following steps:

(1) Determine the set $M_\rho$ of pure states $\Phi$ such that $P_G|\Phi\rangle\langle\Phi| = \rho$ holds.

(2) Calculate the function
$$P_G S \ni \rho \mapsto \epsilon_G(\rho) = \inf\{E_{vN}(\sigma) \mid \sigma\in M_\rho\} \in \mathbb{R}, \tag{5.34}$$
where we have denoted the set of $G$-invariant states with $P_G S$.

(3) Determine $E_F(\rho)$ then in terms of the convex hull of $\epsilon$, i.e.
$$E_F(\rho) = \inf\Big\{\sum_j \lambda_j\,\epsilon(\sigma_j) \;\Big|\; \sigma_j\in P_G S,\ 0\le\lambda_j\le 1,\ \rho = \sum_j\lambda_j\sigma_j,\ \sum_j\lambda_j = 1\Big\}. \tag{5.35}$$

The equality in the last equation is of course a non-trivial statement which has to be proved. We skip this point, however, and refer the reader to [159]. The advantage of this scheme relies on the fact that spaces of $G$-invariant states are in general very low dimensional (if $G$ is not too small). Hence, the optimization problem contained in step 3 has a much bigger chance to be tractable than the one we have to solve for the original definition of $E_F$. There is of course no guarantee that any of these three steps can be carried out in a concrete situation. For the three examples mentioned above, however, there are results available, which we will present in the following.

5.3.2. Werner states

Let us start with Werner states [159]. In this case $\rho$ is uniquely determined by its flip expectation value $\operatorname{tr}(\rho F)$ (cf. Section 3.1.2). To determine $\Phi\in H\otimes H$ such that $P_{UU}|\Phi\rangle\langle\Phi| = \rho$ holds, we have to solve therefore the equation
$$\langle\Phi, F\Phi\rangle = \sum_{jk}\overline{\Phi_{jk}}\,\Phi_{kj} = \operatorname{tr}(F\rho), \tag{5.36}$$
Fig. 5.2. Entanglement of formation $E_F(\rho)$ for Werner states plotted as a function of the flip expectation $\operatorname{tr}(\rho F)$.
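where $\Phi_{jk}$ denote components of $\Phi$ in the canonical basis. On the other hand, the reduced density matrix $\operatorname{tr}_1|\Phi\rangle\langle\Phi|$ has the matrix elements $\sum_l \Phi_{jl}\overline{\Phi_{kl}}$. By exploiting $U\otimes U$ invariance we can assume without loss of generality that this reduced density matrix is diagonal. Hence, to get the function $\epsilon_{UU}$ we have to minimize
$$E_{vN}(|\Phi\rangle\langle\Phi|) = \sum_j S\Big(\sum_k |\Phi_{jk}|^2\Big) \tag{5.37}$$
under constraint (5.36), where $S(x) = -x\log_2(x)$ denotes the von Neumann entropy. We skip these calculations here (see [159] instead) and state the results only. For $\operatorname{tr}(F\rho)\ge 0$ we get $\epsilon(\rho) = 0$ (as expected since $\rho$ is separable in this case) and with $H$ from (5.16)
$$\epsilon_{UU}(\rho) = H\left[\tfrac12\left(1 - \sqrt{1 - \operatorname{tr}(F\rho)^2}\right)\right] \tag{5.38}$$
for $\operatorname{tr}(F\rho) < 0$. The minima are taken for $\Phi$ where all $\Phi_{jk}$ except one diagonal element are zero in the case $\operatorname{tr}(F\rho)\ge 0$, and for $\Phi$ with only two (non-diagonal) coefficients $\Phi_{jk}$, $\Phi_{kj}$, $j\ne k$, non-zero if $\operatorname{tr}(\rho F) < 0$. The function $\epsilon$ is convex and coincides therefore with its convex hull such that we get

Proposition 5.7. For any Werner state $\rho$ the entanglement of formation is given by (cf. Fig. 5.2)
$$E_F(\rho) = \begin{cases} H\left[\tfrac12\left(1 - \sqrt{1 - \operatorname{tr}(F\rho)^2}\right)\right], & \operatorname{tr}(F\rho) < 0, \\ 0, & \operatorname{tr}(F\rho)\ge 0. \end{cases} \tag{5.39}$$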
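Proposition 5.7 is straightforward to evaluate. The sketch below builds a Werner state from a prescribed flip expectation $f = \operatorname{tr}(\rho F)$ as a mixture of the normalized projections onto the symmetric and antisymmetric subspaces, and evaluates Eq. (5.39). The helper names and the parametrization by $f$ are assumptions made for this illustration.

```python
import numpy as np

def binary_entropy(x):
    """Binary entropy H(x) in bits, with the convention 0*log2(0) = 0."""
    x = min(max(float(x), 0.0), 1.0)
    if x == 0.0 or x == 1.0:
        return 0.0
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def flip_operator(d):
    """Flip F on C^d (x) C^d, defined by F(e_j (x) e_k) = e_k (x) e_j."""
    F = np.zeros((d * d, d * d))
    for j in range(d):
        for k in range(d):
            F[j * d + k, k * d + j] = 1.0
    return F

def werner_state(f, d):
    """Werner state on C^d (x) C^d with flip expectation tr(rho F) = f."""
    F = flip_operator(d)
    identity = np.eye(d * d)
    p_plus, p_minus = (identity + F) / 2, (identity - F) / 2
    weight_plus, weight_minus = (1 + f) / 2, (1 - f) / 2
    return (weight_plus * p_plus / np.trace(p_plus)
            + weight_minus * p_minus / np.trace(p_minus))

def ef_werner(f):
    """Entanglement of formation of a Werner state, Eq. (5.39)."""
    if f >= 0:
        return 0.0
    return binary_entropy(0.5 * (1 - np.sqrt(1 - f ** 2)))

rho = werner_state(-0.5, d=3)
flip_expectation = np.trace(rho @ flip_operator(3))
```

Note that the result depends on $f$ only, not on the dimension $d$, and that $f = -1$ gives $E_F = 1$.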
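5.3.3. Isotropic states

Let us now consider isotropic, i.e. $U\otimes\bar U$ invariant states. They are determined by the expectation value $\operatorname{tr}(\rho\tilde F)$ with $\tilde F$ from Eq. (3.14). Hence, we have to look first for pure states $\Phi$ with $\langle\Phi,\tilde F\Phi\rangle = \operatorname{tr}(\rho\tilde F)$ (since this determines, as for Werner states above, those $\Phi$ with $P_{U\bar U}(|\Phi\rangle\langle\Phi|) = \rho$). To this end assume that $\Phi$ has the Schmidt decomposition $\Phi = \sum_j\lambda_j f_j\otimes f_j' = (U_1\otimes U_2)\sum_j\lambda_j e_j\otimes e_j$ with appropriate unitary matrices $U_1$, $U_2$ and the canonical basis $e_j$, $j = 1,\ldots,d$. Exploiting the $U\otimes\bar U$ invariance of $\rho$ we get
$$\operatorname{tr}(\rho\tilde F) = \Big\langle(\mathbb{1}\otimes V)\sum_j\lambda_j e_j\otimes e_j,\ \tilde F(\mathbb{1}\otimes V)\sum_k\lambda_k e_k\otimes e_k\Big\rangle \tag{5.40}$$
$$= \sum_{j,k,l,m}\lambda_j\lambda_k\,\langle e_j\otimes Ve_j,\ e_l\otimes e_l\rangle\langle e_m\otimes e_m,\ e_k\otimes Ve_k\rangle \tag{5.41}$$
$$= \Big|\sum_j\lambda_j\langle e_j, Ve_j\rangle\Big|^2 \tag{5.42}$$
with $V = U_1^T U_2$ and after inserting the definition of $\tilde F$. Following our general scheme, we have to minimize $E_{vN}(|\Phi\rangle\langle\Phi|)$ under the constraint given in Eq. (5.42). This is explicitly done in [150]. We will only state the result here, which leads to the function
$$\epsilon_{U\bar U}(\rho) = \begin{cases} H(\gamma) + (1-\gamma)\log_2(d-1), & \operatorname{tr}(\rho\tilde F)\ge 1, \\ 0, & \operatorname{tr}(\rho\tilde F) < 1, \end{cases} \tag{5.43}$$
with
$$\gamma = \frac{1}{d^2}\left(\sqrt{\operatorname{tr}(\rho\tilde F)} + \sqrt{[d-1][d-\operatorname{tr}(\rho\tilde F)]}\right)^2. \tag{5.44}$$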
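The function in Eqs. (5.43) and (5.44) can be tabulated directly. The following sketch writes $t = \operatorname{tr}(\rho\tilde F)\in[0,d]$ and assumes that the entangled branch applies for $t\ge 1$, i.e. beyond the separable boundary $\operatorname{tr}(\tilde F\sigma_1) = 1$; the names `gamma` and `eps_isotropic` are illustrative.

```python
import numpy as np

def binary_entropy(x):
    """Binary entropy H(x) in bits, with the convention 0*log2(0) = 0."""
    x = min(max(float(x), 0.0), 1.0)
    if x == 0.0 or x == 1.0:
        return 0.0
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def gamma(t, d):
    """Parameter gamma from Eq. (5.44), with t = tr(rho F~) in [0, d]."""
    return (np.sqrt(t) + np.sqrt((d - 1) * (d - t))) ** 2 / d ** 2

def eps_isotropic(t, d):
    """The function of Eq. (5.43); assumed to vanish for t < 1."""
    if t < 1:
        return 0.0
    g = gamma(t, d)
    return binary_entropy(g) + (1 - g) * np.log2(d - 1)

# Tabulate the function for d = 2, 3, 4 on the entangled range [1, d].
table = {d: [eps_isotropic(t, d) for t in np.linspace(1, d, 5)]
         for d in (2, 3, 4)}
```

At $t = 1$ one gets $\gamma = 1$ and the function vanishes; at $t = d$ one gets $\gamma = 1/d$ and the value $\log_2 d$ of a maximally entangled state. For $d = 3$ a midpoint comparison near the right endpoint shows the lack of convexity that makes the convex hull necessary.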
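For $d\ge 3$ this function is not convex (cf. Fig. 5.3), hence we get

Proposition 5.8. For any isotropic state $\rho$ the entanglement of formation is given as the convex hull
$$E_F(\rho) = \inf\Big\{\sum_j\lambda_j\,\epsilon_{U\bar U}(\sigma_j) \;\Big|\; \rho = \sum_j\lambda_j\sigma_j,\ P_{U\bar U}\sigma_j = \sigma_j\Big\} \tag{5.45}$$
of the function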
Fig. 5.3. $\epsilon$-function $\epsilon_{U\bar U}(\rho)$ for isotropic states ($d = 2, 3, 4$) plotted as a function of the flip expectation $\operatorname{tr}(\rho\tilde F)$. For $d > 2$ it is not convex near the right endpoint.

Fig. 5.4. State space of OO-invariant states (separable square and triangles A, B, C).
of $U\in G$ translated copies of $\sigma$. This follows from the fact that $\rho$ is by definition of $M_\rho$ the twirl of $\sigma$. Hence any convex linear combination of pure states $U\sigma U^*$ with $U\in G$ has the same $E_F$ as $\rho$. A detailed analysis of the corresponding optimization problems in the case of Werner and isotropic states (which we have omitted here; see [159,150] instead) leads therefore to the following results about OO-invariant states: The space of OO-invariant states decomposes into four regions: the separable square and three triangles A, B, C; cf. Fig. 5.4. For all states in triangle A we can calculate $E_F(\rho)$ as for Werner states in Proposition 5.7 and in triangle B we have to apply the result for isotropic states from Proposition 5.8. This implies in particular that $E_F$ depends in A only on $\operatorname{tr}(\rho F)$ and in B only on $\operatorname{tr}(\rho\tilde F)$ and the dimension.
Fig. 5.5. Relative entropy of entanglement $E_R(\rho)$ for Werner states, plotted as a function of the flip expectation $\operatorname{tr}(\rho F)$.
5.3.5. Relative entropy of entanglement

To calculate $E_R(\rho)$ for a symmetric state $\rho$ is even easier than the treatment of $E_F(\rho)$, because we can restrict the minimization in the definition of $E_R(\rho)$ in Eq. (5.14) to $G$-invariant separable states, provided $G$ is a group of local unitaries. To see this assume that $\sigma\in D$ minimizes $S(\rho|\sigma)$ for a $G$-invariant state $\rho$. Then we get $S(\rho|U\sigma U^*) = S(\rho|\sigma)$ for all $U\in G$ since the relative entropy $S$ is invariant under unitary transformations of both arguments, and due to its convexity we even get $S(\rho|P_G\sigma)\le S(\rho|\sigma)$. Hence $P_G\sigma$ minimizes $S(\rho|\cdot)$ as well, and since $P_G\sigma\in D$ holds for a group $G$ of local unitaries, we get $E_R(\rho) = S(\rho|P_G\sigma)$ as stated.

The sets of Werner and isotropic states are just intervals and the corresponding separable states form subintervals over which we have to perform the optimization. Due to the convexity of the relative entropy in both arguments, however, it is clear that the minimum is attained exactly at the boundary between entangled and separable states. For Werner states this is the state $\sigma_0$ with $\operatorname{tr}(F\sigma_0) = 0$, i.e. it gives equal weight to both minimal projections. To get $E_R(\rho)$ for a Werner state $\rho$ we have to calculate therefore only the relative entropy with respect to this state. Since all Werner states can be simultaneously diagonalized this is easily done and we get (cf. Fig. 5.5)
$$E_R(\rho) = 1 - H\left(\frac{1 + \operatorname{tr}(F\rho)}{2}\right). \tag{5.46}$$
Similarly, the boundary point $\sigma_1$ for isotropic states is given by $\operatorname{tr}(\tilde F\sigma_1) = 1$ which leads to (cf. Fig. 5.6)
$$E_R(\rho) = \log_2 d - \left(1 - \frac{\operatorname{tr}(\tilde F\rho)}{d}\right)\log_2(d-1) - S\left(\frac{\operatorname{tr}(\tilde F\rho)}{d},\ 1 - \frac{\operatorname{tr}(\tilde F\rho)}{d}\right) \tag{5.47}$$
Fig. 5.6. Relative entropy of entanglement $E_R(\rho)$ for isotropic states and $d = 2, 3, 4$, plotted as a function of $\operatorname{tr}(\rho\tilde F)$.
for each entangled isotropic state $\rho$, and 0 if $\rho$ is separable. ($S(p_1,p_2)$ denotes here the entropy of the probability vector $(p_1,p_2)$.)

Let us now consider OO-invariant states. As for $E_F$ we divide the state space into the separable square and the three triangles A, B, C; cf. Fig. 5.4. The state at the coordinates $(1,d)$ is a maximally entangled state and all separable states on the line connecting $(0,1)$ with $(1,1)$ minimize the relative entropy for this state. Hence consider a particular state $\sigma$ on this line. The convexity property of the relative entropy immediately shows that $\sigma$ is a minimizer for all states on the line connecting $\sigma$ with the state at $(1,d)$. In this way, it is easy to calculate $E_R(\rho)$ for all $\rho$ in A. In a similar way we can treat the triangle B: We just have to draw a line from $\rho$ to the state at $(-1,0)$ and find the minimizer for $\rho$ at the intersection with the separable border between $(0,0)$ and $(0,1)$. For all states in the triangle C the relative entropy is minimized by the separable state at $(0,1)$.

An application of the scheme just reviewed is a proof that $E_R$ is not additive, i.e. it does not satisfy Axiom E5b. To see this consider the state $\rho = \operatorname{tr}(P_-)^{-1}P_-$ where $P_-$ denotes the projector on the antisymmetric subspace. It is a Werner state with flip expectation $-1$ (i.e. it corresponds to the point $(-1,0)$ in Fig. 5.4). According to our discussion above $S(\rho|\cdot)$ is minimized in this case by the separable state $\sigma_0$ and we get $E_R(\rho) = 1$ independently of the dimension $d$. The tensor product $\rho^{\otimes 2}$ can be regarded as a state in $S(H^{\otimes 2}\otimes H^{\otimes 2})$ with $U\otimes U\otimes V\otimes V$ symmetry, where $U$, $V$ are unitaries on $H$. Note that the corresponding state space of $UUVV$ invariant states can be parameterized by the expectation of the three operators $F\otimes\mathbb{1}$, $\mathbb{1}\otimes F$ and $F\otimes F$ (cf. [159]) and we can apply the machinery just described to get the minimizer $\tilde\sigma$ of $S(\rho^{\otimes 2}|\cdot)$. If $d > 2$ holds it turns out that
$$\tilde\sigma = \frac{d+1}{2d\operatorname{tr}(P_+)^2}\,P_+\otimes P_+ + \frac{d-1}{2d\operatorname{tr}(P_-)^2}\,P_-\otimes P_- \tag{5.48}$$
holds (where $P_\pm$ denote the projections onto the symmetric and antisymmetric subspaces of $H\otimes H$) and not $\tilde\sigma = \sigma_0\otimes\sigma_0$ as one would expect. As a consequence we get the inequality
$$E_R(\rho^{\otimes 2}) = 2 - \log_2\left(\frac{2d-1}{d}\right) < 2 = S(\rho^{\otimes 2}|\sigma_0^{\otimes 2}) = 2E_R(\rho). \tag{5.49}$$
$d = 2$ is a special case, where $\sigma_0^{\otimes 2}$ and $\tilde\sigma$ (and all their convex linear combinations) give the same value 2. Hence for $d > 2$ the relative entropy of entanglement is, as stated, not additive.

6. Channel capacity

In Section 4.4 we have seen that it is possible to send (quantum) information undisturbed through a noisy quantum channel, if we encode one qubit into a (possibly long and highly entangled) string of qubits. This process is wasteful, since we have to use many instances of the channel to send just one qubit of quantum information. It is therefore natural to ask which resources we need at least if we are using the best possible error correction scheme. More precisely the question is: With which maximal rate, i.e. information sent per channel usage, can we transmit quantum information undisturbed through a noisy channel? This question naturally leads to the concept of channel capacities which we will review in this section.

6.1. The general case

We are mainly interested in classical and quantum capacities. The basic ideas behind both situations are however quite similar. In this section we will consider therefore a general definition of capacity which applies to arbitrary channels and both kinds of information. (See also [168] as a general reference for this section.)

6.1.1. The definition
Id n : B(Cn ) → B(Cn ) :
(6.1)
The cb-norm improves the sometimes annoying property of the usual operator norm that quantities like $\|T\otimes\mathrm{Id}_{B(\mathbb{C}^d)}\|$ may increase with the dimension $d$. On infinite-dimensional observable algebras $\|T\|_{cb}$ can be infinite although each term in the supremum is finite. A particular example for a map with such a behavior is the transposition on an infinite-dimensional Hilbert space. A map with finite cb-norm is therefore called completely bounded. In a finite-dimensional setup each linear map is completely bounded. For the transposition $\Theta$ on $\mathbb{C}^d$ we have in particular $\|\Theta\|_{cb} = d$. The cb-norm has some nice features which we will use frequently; this includes its multiplicativity $\|T_1\otimes T_2\|_{cb} = \|T_1\|_{cb}\|T_2\|_{cb}$ and the fact that $\|T\|_{cb} = 1$ holds for each (unital) channel. Another useful relation is $\|T\|_{cb} = \|T\otimes\mathrm{Id}_{B(H)}\|$, which holds if $T$ is a map $B(H)\to B(H)$. For more properties of the cb-norm let us refer to [125]. Now we can define the quantity
$$\Delta(T,B) = \inf_{E,D}\|ETD - \mathrm{Id}_B\|_{cb}, \tag{6.2}$$
where the infimum is taken over all channels $E: A_2\to B$ and $D: B\to A_1$, and $\mathrm{Id}_B$ is again the ideal $B$-channel. $\Delta$ describes, as indicated above, the smallest possible error we have to take into account if we try to transmit one $B$ system through one copy of the channel $T$ using any encoding $E$ and decoding $D$. In Section 4.4, however, we have seen that we can reduce the error if we take $M$ copies of the channel instead of just one. More generally we are interested in the transmission of "codewords of length" $N$, i.e. $B^{\otimes N}$ systems, using $M$ copies of the channel $T$. Encodings and decodings are in this case channels of the form $E: A_2^{\otimes M}\to B^{\otimes N}$ respectively $D: B^{\otimes N}\to A_1^{\otimes M}$. If we increase the number $M$ of channels the error $\Delta(T^{\otimes M}, B^{\otimes N(M)})$ decreases provided the rate with which $N$ grows as a function of $M$ is not too large. A more precise formulation of this idea leads to the following definition.

Definition 6.1. Let $T$ be a channel and $B$ an observable algebra. A number $c\ge 0$ is called achievable rate for $T$ with respect to $B$, if for any pair of sequences $M_j$, $N_j$, $j\in\mathbb{N}$, with $M_j\to\infty$ and $\limsup_{j\to\infty} N_j/M_j < c$ we have
$$\lim_{j\to\infty}\Delta(T^{\otimes M_j}, B^{\otimes N_j}) = 0. \tag{6.3}$$
The supremum of all achievable rates is called the capacity of $T$ with respect to $B$ and denoted by $C(T,B)$.

Note that by definition $c = 0$ is an achievable rate, hence $C(T,B)\ge 0$. If on the other hand each $c > 0$ is achievable we write $C(T,B) = \infty$. At first sight it seems cumbersome to check all pairs of sequences with given upper ratio when testing $c$. Due to some monotonicity properties of $\Delta$, however, it can be shown that it is sufficient to check only one sequence provided the $M_j$ satisfy the additional condition $M_j/M_{j+1}\to 1$.

6.1.2. Simple calculations

We see that there are in fact many different capacities of a given channel depending on the type of information we want to transmit. However, there are only two different cases we are interested in: $B$ can be either classical or quantum. We will discuss both special cases in greater detail in the next
two sections. Before we do this, however, we will have a short look at some simple calculations which can be done in the general case. To this end it is convenient to introduce the notations
$$M_d = B(\mathbb{C}^d) \quad\text{and}\quad C_d = C(\{1,\ldots,d\}) \tag{6.4}$$
as shorthand notations for $B(\mathbb{C}^d)$ and $C(\{1,\ldots,d\})$, since some notations become otherwise a little bit clumsy. First of all let us have a look at capacities of ideal channels. If $\mathrm{Id}_{M_f}$ and $\mathrm{Id}_{C_f}$ denote the identity channels on the quantum algebra $M_f$, respectively the classical algebra $C_f$, we get
$$C(\mathrm{Id}_{C_f}, M_d) = 0, \qquad C(\mathrm{Id}_{C_f}, C_d) = C(\mathrm{Id}_{M_f}, M_d) = C(\mathrm{Id}_{M_f}, C_d) = \frac{\log_2 f}{\log_2 d}. \tag{6.5}$$
The first equation is the channel capacity version of the no-teleportation theorem: It is impossible to transfer quantum information through a classical channel. The other equations follow simply by counting dimensions. For the next relation it is convenient to associate to a pair of channels $T$, $S$ the quantity $C(T,S)$ which arises if we replace in Definition 6.1 and Eq. (6.2) the ideal channel $\mathrm{Id}_B$ by an arbitrary channel $S$. Hence $C(T,S)$ is a slight generalization of the channel capacity which describes with which asymptotic rate the channel $S$ can be approximated by $T$ (and appropriate encodings and decodings). These generalized capacities satisfy the two-step coding inequality, i.e. for the three channels $T_1$, $T_2$, $T_3$ we have
$$C(T_3,T_1)\ge C(T_2,T_1)\,C(T_3,T_2). \tag{6.6}$$
To prove it consider the relations
$$\|T_1^{\otimes N} - E_1E_2T_3^{\otimes K}D_2D_1\|_{cb} = \|T_1^{\otimes N} - E_1T_2^{\otimes M}D_1 + E_1T_2^{\otimes M}D_1 - E_1E_2T_3^{\otimes K}D_2D_1\|_{cb} \tag{6.7}$$
$$\le \|T_1^{\otimes N} - E_1T_2^{\otimes M}D_1\|_{cb} + \|E_1\|_{cb}\,\|T_2^{\otimes M} - E_2T_3^{\otimes K}D_2\|_{cb}\,\|D_1\|_{cb} \tag{6.8}$$
$$\le \|T_1^{\otimes N} - E_1T_2^{\otimes M}D_1\|_{cb} + \|T_2^{\otimes M} - E_2T_3^{\otimes K}D_2\|_{cb}, \tag{6.9}$$
where we have used for the last inequality the fact that the cb-norm of a channel is one. If $c_1$ is an achievable rate of $T_1$ with respect to $T_2$ such that $\limsup_{j\to\infty} M_j/N_j < c_1$ and $c_2$ is an achievable rate of $T_2$ with respect to $T_3$ such that $\limsup_{j\to\infty} N_j/K_j < c_2$ we see that
$$\limsup_{j\to\infty}\frac{M_j}{K_j} = \limsup_{j\to\infty}\frac{M_j}{N_j}\frac{N_j}{K_j} \le \limsup_{j\to\infty}\frac{M_j}{N_j}\ \limsup_{k\to\infty}\frac{N_k}{K_k}. \tag{6.10}$$
If we choose the sequences $M_j$, $N_j$ and $K_j$ cleverly enough (cf. the remark following Definition 6.1) this implies that $c_1c_2$ is an achievable rate for $T_1$ with respect to $T_3$ and this proves Eq. (6.6).

As a first application of (6.6), we can relate all capacities $C(T,M_d)$ (and $C(T,C_d)$) for different $d$ to one another. If we choose $T_3 = T$, $T_1 = \mathrm{Id}_{M_d}$ and $T_2 = \mathrm{Id}_{M_f}$ we get with (6.5) $C(T,M_d)\ge(\log_2 f/\log_2 d)\,C(T,M_f)$, and exchanging $d$ with $f$ shows that even equality holds.
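The equality claimed in the last step can be spelled out as follows. This is only a sketch that applies Eq. (6.6) twice, once as stated and once with the roles of $d$ and $f$ exchanged, and inserts the ideal-channel capacities from Eq. (6.5); nothing beyond those two equations is assumed.

```latex
% (6.6) with T_3 = T, T_1 = Id_{M_d}, T_2 = Id_{M_f}, and (6.5):
C(T, M_d) \;\ge\; C(\mathrm{Id}_{M_f}, M_d)\, C(T, M_f)
          \;=\; \frac{\log_2 f}{\log_2 d}\; C(T, M_f)

% (6.6) with T_3 = T, T_1 = Id_{M_f}, T_2 = Id_{M_d}, and (6.5):
C(T, M_f) \;\ge\; C(\mathrm{Id}_{M_d}, M_f)\, C(T, M_d)
          \;=\; \frac{\log_2 d}{\log_2 f}\; C(T, M_d)

% Multiplying the first line by log_2 d and the second by log_2 f gives
% two opposite inequalities between the same quantities, hence
\log_2 d \;\, C(T, M_d) \;=\; \log_2 f \;\, C(T, M_f)
```

In other words, $C(T,M_d)\log_2 d$ does not depend on $d$, which is why fixing $d = 2$ amounts to a choice of units.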
A similar relation can be shown for $C(T,C_d)$. Hence, the dimension of the observable algebra $B$ describing the type of information to be transmitted enters only via a multiplicative constant, i.e. it is only a choice of units, and we define the classical capacity $C_c(T)$ and the quantum capacity $C_q(T)$ of a channel $T$ as
$$C_c(T) = C(T,C_2), \qquad C_q(T) = C(T,M_2). \tag{6.11}$$
A second application of Eq. (6.6) is a relation between the classical and the quantum capacity of a channel. Setting $T_3 = T$, $T_1 = \mathrm{Id}_{C_2}$ and $T_2 = \mathrm{Id}_{M_2}$ we get again with (6.5),
$$C_q(T)\le C_c(T). \tag{6.12}$$
Note that it is now not possible to interchange the roles of $C_2$ and $M_2$. Hence equality does not hold here.

Another useful relation concerns concatenated channels: We transmit information of type $B$ first through a channel $T_1$ and then through a second channel $T_2$. It is reasonable to assume that the capacity of the composition $T_2T_1$ cannot be bigger than the capacity of the channel with the smallest bandwidth. This conjecture is indeed true and known as the "Bottleneck inequality":
$$C(T_2T_1,B)\le\min\{C(T_1,B),\,C(T_2,B)\}. \tag{6.13}$$
To see this consider an encoding and a decoding channel $E$, respectively $D$, for $(T_2T_1)^{\otimes M}$, i.e. in the definition of $C(T_2T_1,B)$ we look at
$$\|\mathrm{Id}_B^{\otimes N} - E(T_2T_1)^{\otimes M}D\|_{cb} = \|\mathrm{Id}_B^{\otimes N} - (ET_2^{\otimes M})T_1^{\otimes M}D\|_{cb}. \tag{6.14}$$
This implies that $ET_2^{\otimes M}$ and $D$ are an encoding and a decoding channel for $T_1$. Something similar holds for $E$ and $T_1^{\otimes M}D$ with respect to $T_2$. Hence each achievable rate for $T_2T_1$ is also an achievable rate for $T_2$ and $T_1$, and this proves Eq. (6.13).

Finally, we want to consider two channels $T_1$, $T_2$ in parallel, i.e. we consider the tensor product $T_1\otimes T_2$. If $E_j$, $D_j$, $j = 1,2$, are encoding, respectively decoding, channels for $T_1^{\otimes M}$ and $T_2^{\otimes M}$ such that $\|\mathrm{Id}_B^{\otimes N_j} - E_jT_j^{\otimes M}D_j\|_{cb}\le\epsilon$ holds, we get
$$\|\mathrm{Id} - \mathrm{Id}\otimes(E_2T_2^{\otimes M}D_2) + \mathrm{Id}\otimes(E_2T_2^{\otimes M}D_2) - E_1\otimes E_2\,(T_1\otimes T_2)^{\otimes M}D_1\otimes D_2\|_{cb} \tag{6.15}$$
$$\le \|\mathrm{Id}\otimes(\mathrm{Id} - E_2T_2^{\otimes M}D_2)\|_{cb} + \|(\mathrm{Id} - E_1T_1^{\otimes M}D_1)\otimes E_2T_2^{\otimes M}D_2\|_{cb} \tag{6.16}$$
$$\le \|\mathrm{Id} - E_2T_2^{\otimes M}D_2\|_{cb} + \|\mathrm{Id} - E_1T_1^{\otimes M}D_1\|_{cb} \le 2\epsilon. \tag{6.17}$$
Hence $c_1 + c_2$ is achievable for $T_1\otimes T_2$ if $c_j$ is achievable for $T_j$. This implies the inequality
$$C(T_1\otimes T_2,B)\ge C(T_1,B) + C(T_2,B). \tag{6.18}$$
When all channels are ideal, or when all systems involved are classical, even equality holds, i.e. channel capacities are additive in this case. However, if quantum channels are considered, it is one of the big open problems of the field to decide under which conditions additivity holds.
6.2. The classical capacity

In this section we will discuss the classical capacity $C_c(T)$ of a channel $T$. There are in fact three different cases to consider: $T$ can be either classical or quantum and in the quantum case we can use either ordinary encodings and decodings or a dense coding scheme (cf. Section 4.1.3).

6.2.1. Classical channels

Let us consider first a classical to classical channel $T: C(Y)\to C(X)$. This is basically the situation of classical information theory and we will only have a short look here, mainly to show how this (well known) situation fits into the general scheme described in the last section.²⁰ First of all we have to calculate the error quantity $\Delta(T,C_2)$ defined in Eq. (6.2). As stated in Section 3.2.3, $T$ is completely determined by its transition probabilities $T_{xy}$, $(x,y)\in X\times Y$, describing the probability to receive $x\in X$ when $y\in Y$ was sent. Since the cb-norm for a classical algebra coincides with the ordinary norm we get (we have set $X = Y$ for this calculation)
$$\|\mathrm{Id} - T\|_{cb} = \|\mathrm{Id} - T\| = \sup_{x,f}\Big|\sum_y(\delta_{xy} - T_{xy})f_y\Big| \tag{6.19}$$
$$= 2\sup_x(1 - T_{xx}), \tag{6.20}$$
where the supremum in the first equation is taken over all $f\in C(X)$ with $\|f\| = \sup_y|f_y|\le 1$. We see that the quantity in Eq. (6.20) is exactly twice the maximal error probability, i.e. the maximal probability of sending $x$ and getting anything different. Inserting this quantity for $\Delta$ in Definition 6.1 applied to a classical channel $T$ and the "bit-algebra" $B = C_2$, we get exactly Shannon's classical definition of the capacity of a discrete memoryless channel [138]. Hence we can apply Shannon's noisy channel coding theorem to calculate $C_c(T)$ for a classical channel. To state it we have to introduce first some terminology. Consider therefore a state $p\in C^*(X)$ of the classical input algebra $C(X)$ and its image $q = T^*(p)\in C^*(Y)$ under the channel. $p$ and $q$ are probability distributions on $X$, respectively $Y$, and $p_x$ can be interpreted as the probability that the "letter" $x\in X$ was sent. Similarly $q_y = \sum_x T_{xy}p_x$ is the probability that $y\in Y$ was received and $P_{xy} = T_{xy}p_x$ is the probability that $x\in X$ was sent and $y\in Y$ was received. The family of all $P_{xy}$ can be interpreted as a probability distribution $P$ on $X\times Y$ and the $T_{xy}$ can be regarded as conditional probability of $P$ under the condition $x$. Now we can introduce the mutual information
$$I(p,T) = S(p) + S(q) - S(P) = \sum_{(x,y)\in X\times Y} P_{xy}\log_2\left(\frac{P_{xy}}{p_xq_y}\right), \tag{6.21}$$
where $S(p)$, $S(q)$ and $S(P)$ denote the entropies of $p$, $q$ and $P$. The mutual information describes, roughly speaking, the information that $p$ and $q$ contain about each other. E.g. if $p$ and $q$ are completely uncorrelated (i.e. $P_{xy} = p_xq_y$) we get $I(p,T) = 0$. If $T$ is on the other hand an ideal bit-channel and $p$ equally distributed we have $I(p,T) = 1$. Now we can state Shannon's Theorem which expresses the classical capacity of $T$ in terms of mutual informations [138]:

²⁰ Please note that this implies in particular that we do not give a complete review of the foundations of classical information theory here; cf. [101,62,49] instead.
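The mutual information of Eq. (6.21) is easy to evaluate numerically. The sketch below uses the convention $q_y = \sum_x T_{xy}p_x$, i.e. the array entry `T[x, y]` is read as the probability of receiving $y$ when $x$ was sent, so each row sums to one. The grid search over binary input distributions and the binary symmetric channel with flip probability 0.1 are illustrative choices, not taken from the text.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (0*log2(0) = 0)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(p, T):
    """I(p, T) = S(p) + S(q) - S(P), Eq. (6.21).

    Convention (assumed): T[x, y] is the probability of receiving y
    when x was sent, so that q_y = sum_x T[x, y] p_x.
    """
    p = np.asarray(p, dtype=float)
    T = np.asarray(T, dtype=float)
    P = p[:, None] * T          # joint distribution P_xy = T_xy p_x
    q = P.sum(axis=0)           # output distribution q_y
    return entropy(p) + entropy(q) - entropy(P)

def max_mutual_information_binary(T, grid=2001):
    """Grid search for sup_p I(p, T) over binary input distributions."""
    return max(mutual_information([a, 1 - a], T)
               for a in np.linspace(0.0, 1.0, grid))

ideal_bit = np.eye(2)
uncorrelated = np.array([[0.3, 0.7], [0.3, 0.7]])   # output ignores input
bsc = np.array([[0.9, 0.1], [0.1, 0.9]])            # binary symmetric channel
```

For the binary symmetric channel the maximum is attained at the uniform input and equals one minus the binary entropy of the flip probability.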
Theorem 6.2 (Shannon). The classical capacity $C_c(T)$ of a classical communication channel $T: C(Y)\to C(X)$ is given by
$$C_c(T) = \sup_p I(p,T), \tag{6.22}$$
where the supremum is taken over all states $p\in C^*(X)$.

6.2.2. Quantum channels

If we transmit classical data through a quantum channel $T: B(H)\to B(H)$ the encoding $E: B(H)\to C_2$ is a parameter-dependent preparation and the decoding $D: C_2\to B(H)$ is an observable. Hence, the composition $ETD$ is a channel $C_2\to C_2$, i.e. a purely classical channel, and we can calculate its capacity in terms of Shannon's Theorem (Theorem 6.2). This observation leads to the definition of the "one-shot" classical capacity of $T$:
$$C_{c,1}(T) = \sup_{E,D} C_c(ETD), \tag{6.23}$$
where the supremum is taken over all encodings and decodings of classical bits. The term "one-shot" in this definition arises from the fact that we need apparently only one invocation of the channel $T$. However, many uses of the channel are hidden in the definition of the classical capacity on the right-hand side. Hence, $C_{c,1}(T)$ can be defined alternatively in the same way as $C_c(T)$ except that no entanglement is allowed during encoding and decoding, or more precisely in Definition 6.1 we consider only encodings $E: B(K)^{\otimes M}\to C_2^{\otimes N}$ which prepare separable states and only decodings $D: C_2^{\otimes N}\to B(H)^{\otimes M}$ which lead to separable observables. It is not yet known whether entangled codings can help to increase the transmission rate. Therefore, we only know that
$$C_{c,1}(T)\le C_c(T) = \sup_{M\in\mathbb{N}}\frac{1}{M}\,C_{c,1}(T^{\otimes M}) \tag{6.24}$$
holds. One reason why $C_{c,1}(T)$ is an interesting quantity relies on the fact that we have, due to the following theorem by Holevo [80], a computable expression for it.

Theorem 6.3. The one-shot classical capacity $C_{c,1}(T)$ of a quantum channel $T: B(H)\to B(H)$ is given by
$$C_{c,1}(T) = \sup_{p_j,\rho_j}\Big[S\Big(\sum_j p_jT^*[\rho_j]\Big) - \sum_j p_jS(T^*[\rho_j])\Big], \tag{6.25}$$
where the supremum is taken over all probability distributions $p_j$ and collections of density operators $\rho_j$.

6.2.3. Entanglement assisted capacity

Another classical capacity of a quantum channel arises if we use dense coding schemes instead of simple encodings and decodings to transmit the data through the channel $T$. In other words we can define the entanglement enhanced classical capacity $C_e(T)$ in the same way as $C_c(T)$ but by replacing the encoding and decoding channels in Definition 6.1 and Eq. (6.2) by dense coding protocols. Note that this implies that the sender Alice and the receiver Bob share an (arbitrary) amount of (maximally) entangled states prior to the transmission.
For this quantity a coding theorem was recently proven by Bennett and others [18] which we want to state in the following. To this end assume that we are transmitting systems in the state $\rho\in B^*(H)$ through the channel and that $\rho$ has the purification $\psi\in H\otimes H$, i.e. $\rho = \operatorname{tr}_1|\psi\rangle\langle\psi| = \operatorname{tr}_2|\psi\rangle\langle\psi|$. Then we can define the entropy exchange
$$S(\rho,T) = S[(T\otimes\mathrm{Id})(|\psi\rangle\langle\psi|)]. \tag{6.26}$$
The density operator $(T\otimes\mathrm{Id})(|\psi\rangle\langle\psi|)$ has the output state $T^*(\rho)$ and the input state $\rho$ as its partial traces. It can be regarded therefore as the quantum analog of the input/output probability distribution $P_{xy}$ defined in Section 6.2.1. Another way to look at $S(\rho,T)$ is in terms of an ancilla representation of $T$: If $T^*(\rho) = \operatorname{tr}_K(U\rho\otimes\rho_KU^*)$ with a unitary $U$ on $H\otimes K$ and a pure environment state $\rho_K$ it can be shown [7] that $S(\rho,T) = S[T_K^*\rho]$ where $T_K$ is the channel describing the information transfer into the environment, i.e. $T_K^*(\rho) = \operatorname{tr}_H(U\rho\otimes\rho_KU^*)$; in other words $S(\rho,T)$ is the final entropy of the environment. Now we can define
$$I(\rho,T) = S(\rho) + S(T^*\rho) - S(\rho,T), \tag{6.27}$$
which is the quantum analog of the mutual information given in Eq. (6.21). It has a number of nice properties, in particular positivity, concavity with respect to the input state and additivity [2], and its maximum with respect to $\rho$ coincides actually with $C_e(T)$ [18].

Theorem 6.4. The entanglement assisted capacity $C_e(T)$ of a quantum channel $T: B(H)\to B(H)$ is given by
$$C_e(T) = \sup_\rho I(\rho,T), \tag{6.28}$$
where the supremum is taken over all input states $\rho\in B^*(H)$.

Due to the nice additivity properties of the quantum mutual information $I(\rho,T)$ the capacity $C_e(T)$ is known to be additive as well. This implies that it coincides with the corresponding "one-shot" capacity, and this is an essential simplification compared to the classical capacity $C_c(T)$.

6.2.4. Examples

Although the expressions in Theorems 6.3 and 6.4 are much easier than the original definitions they still involve some optimization problems over possibly large parameter spaces. Nevertheless, there are special cases which allow explicit calculations. As a first example we will consider the "quantum erasure channel" which transmits with probability $1-\vartheta$ the $d$-dimensional input state intact while it is replaced with probability $\vartheta$ by an "erasure symbol", i.e. a $(d+1)$th pure state $\psi_e$ which is orthogonal to all others [72]. In the Schrödinger picture this is
$$B^*(\mathbb{C}^d)\ni\rho\mapsto T^*(\rho) = (1-\vartheta)\rho + \vartheta\operatorname{tr}(\rho)|\psi_e\rangle\langle\psi_e|\in B^*(\mathbb{C}^{d+1}). \tag{6.29}$$
This example is very unusual, because all capacities discussed up to now (including the quantum capacity as we will see in Section 6.3.2) can be calculated explicitly: We get $C_{c,1}(T) = C_c(T) = (1-\vartheta)\log_2(d)$ for the classical and $C_e(T) = 2C_c(T)$ for the entanglement enhanced classical capacity [15,17]. Hence the gain by entanglement assistance is exactly a factor two; cf. Fig. 6.1.
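The value quoted for $C_e(T)$ can be cross-checked numerically by evaluating the quantum mutual information of Eq. (6.27) at the maximally mixed input. The sketch below does not perform the supremum of Eq. (6.28); it only evaluates one candidate input. The Kraus representation of the erasure channel and all function names are assumptions made for this illustration.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Von Neumann entropy in bits of a density matrix."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def erasure_kraus(d, theta):
    """Kraus operators K: C^d -> C^(d+1) with T*(rho) = sum_K K rho K^dagger.

    The last basis vector of C^(d+1) plays the role of the erasure symbol.
    """
    embed = np.vstack([np.eye(d), np.zeros((1, d))])
    ops = [np.sqrt(1 - theta) * embed]
    for j in range(d):
        K = np.zeros((d + 1, d))
        K[d, j] = 1.0
        ops.append(np.sqrt(theta) * K)
    return ops

def quantum_mutual_information(kraus, rho):
    """I(rho, T) = S(rho) + S(T* rho) - S(rho, T), Eqs. (6.26) and (6.27)."""
    d = rho.shape[0]
    ev, vec = np.linalg.eigh(rho)
    sqrt_rho = (vec * np.sqrt(np.clip(ev, 0.0, None))) @ vec.conj().T
    psi = sqrt_rho.reshape(d * d)            # purification of rho
    proj = np.outer(psi, psi.conj())
    out = sum(K @ rho @ K.conj().T for K in kraus)
    joint = sum(np.kron(K, np.eye(d)) @ proj @ np.kron(K, np.eye(d)).conj().T
                for K in kraus)              # channel acts on the first factor
    return (von_neumann_entropy(rho) + von_neumann_entropy(out)
            - von_neumann_entropy(joint))

d, theta = 3, 0.3
value = quantum_mutual_information(erasure_kraus(d, theta), np.eye(d) / d)
```

For the maximally mixed input the three entropies are $\log_2 d$, $H(\vartheta) + (1-\vartheta)\log_2 d$ and $H(\vartheta) + \vartheta\log_2 d$, so the result is $2(1-\vartheta)\log_2 d$, in agreement with $C_e(T) = 2C_c(T)$.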
Fig. 6.1. Capacities of the quantum erasure channel (classical capacity $C_c(T)$, entanglement enhanced classical capacity $C_e(T)$ and quantum capacity $C_q(T)$) plotted as a function of the error probability $\vartheta$.
Our next example is the depolarizing channel
$$B^*(\mathbb{C}^d)\ni\rho\mapsto T^*(\rho) = (1-\vartheta)\rho + \vartheta\operatorname{tr}(\rho)\frac{\mathbb{1}}{d}\in B^*(\mathbb{C}^d), \tag{6.30}$$
already discussed in Section 3.2. It is more interesting and more difficult to study. It is in particular not known whether $C_c$ and $C_{c,1}$ coincide in this case (i.e. the value of $C_c$ is not known). Therefore we can compare $C_e(T)$ only with $C_{c,1}$. Using the unitary covariance of $T$ (cf. Section 3.2.2) we see first that $I(U\rho U^*,T) = I(\rho,T)$ holds for all unitaries $U$ (to calculate $S(U\rho U^*,T)$ note that $(U\otimes U)\psi$ is a purification of $U\rho U^*$ if $\psi$ is a purification of $\rho$). Due to the concavity of $I(\rho,T)$ in the first argument we can average over all unitaries and see that the maximum in Eq. (6.28) is achieved on the maximally mixed state. Straightforward calculation therefore shows that
$$C_e(T) = \log_2(d^2) + \left(1 - \vartheta\frac{d^2-1}{d^2}\right)\log_2\left(1 - \vartheta\frac{d^2-1}{d^2}\right) + \vartheta\frac{d^2-1}{d^2}\log_2\frac{\vartheta}{d^2} \tag{6.31}$$
holds, while we have
$$C_{c,1}(T) = \log_2(d) + \left(1 - \vartheta\frac{d-1}{d}\right)\log_2\left(1 - \vartheta\frac{d-1}{d}\right) + \vartheta\frac{d-1}{d}\log_2\frac{\vartheta}{d}, \tag{6.32}$$
where the maximum in Eq. (6.25) is achieved for an ensemble of equiprobable pure states taken from an orthonormal basis in $H$ [82]. This is plausible since the first term under the sup in Eq. (6.25) becomes maximal and the second becomes minimal: $\sum_j p_jT^*\rho_j$ is maximally mixed in this case and its entropy is therefore maximal. The entropies of the $T^*\rho_j$ are on the other hand minimal if the $\rho_j$ are pure. In Fig. 6.2 we have plotted both capacities as a function of the noise parameter $\vartheta$ and in Fig. 6.3 we have plotted the quotient $C_e(T)/C_{c,1}(T)$ which gives an upper bound on the gain we get from entanglement assistance.
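The closed forms (6.31) and (6.32) can be tabulated with a few lines of code. The sketch below uses the convention $0\log_2 0 = 0$ at the endpoints $\vartheta = 0$ and restricts attention to $0\le\vartheta\le 1$; the function names are illustrative.

```python
import numpy as np

def x_log2_y(x, y):
    """x * log2(y) with the convention that the product is 0 when x == 0."""
    return 0.0 if x == 0 else x * np.log2(y)

def c_e_depolarizing(theta, d):
    """Entanglement assisted capacity of the depolarizing channel, Eq. (6.31)."""
    a = theta * (d ** 2 - 1) / d ** 2
    return np.log2(d ** 2) + x_log2_y(1 - a, 1 - a) + x_log2_y(a, theta / d ** 2)

def c_c1_depolarizing(theta, d):
    """One-shot classical capacity of the depolarizing channel, Eq. (6.32)."""
    b = theta * (d - 1) / d
    return np.log2(d) + x_log2_y(1 - b, 1 - b) + x_log2_y(b, theta / d)

def gain(theta, d):
    """Quotient C_e(T) / C_{c,1}(T), the quantity plotted in Fig. 6.3."""
    return c_e_depolarizing(theta, d) / c_c1_depolarizing(theta, d)

# Qubit channel (d = 2) at a few noise levels.
gains = [gain(theta, 2) for theta in (0.1, 0.5, 0.9)]
```

For the noiseless qubit channel this gives $C_e = 2$ and $C_{c,1} = 1$, the dense coding factor of two, and both capacities vanish at $\vartheta = 1$. For the three sample noise levels the quotient lies between 2 and 3, the range shown in Fig. 6.3.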
Fig. 6.2. Entanglement enhanced and one-shot classical capacity, $C_e(T)$ and $C_{c,1}(T)$, of a depolarizing qubit channel.

Fig. 6.3. Gain of using entanglement assisted versus unassisted classical capacity, $C_e(T)/C_{c,1}(T)$, for a depolarizing qubit channel.
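Eqs. (6.31) and (6.32) are simple enough to evaluate numerically. The sketch below is only an illustration: the names `depolarizing_capacities` and `_xlog2x` are hypothetical, `theta` stands for the noise parameter $\vartheta$, and the convention $0\log_2 0 = 0$ is assumed at the boundary.

```python
import math


def _xlog2x(x: float) -> float:
    """x * log2(x) with the convention 0 * log2(0) = 0."""
    return 0.0 if x <= 0.0 else x * math.log2(x)


def depolarizing_capacities(theta: float, d: int = 2) -> tuple[float, float]:
    """Return (C_e, C_{c,1}) in bits according to Eqs. (6.31) and (6.32)."""
    d2 = d * d
    # Eq. (6.31): one eigenvalue 1 - theta (d^2-1)/d^2, and d^2-1 copies of theta/d^2
    c_e = (math.log2(d2)
           + _xlog2x(1.0 - theta * (d2 - 1) / d2)
           + (d2 - 1) * _xlog2x(theta / d2))
    # Eq. (6.32): one eigenvalue 1 - theta (d-1)/d, and d-1 copies of theta/d
    c_1 = (math.log2(d)
           + _xlog2x(1.0 - theta * (d - 1) / d)
           + (d - 1) * _xlog2x(theta / d))
    return c_e, c_1
```

For a qubit channel the quotient `c_e / c_1` computed this way can be compared against Fig. 6.3; at $\vartheta = 0$ it equals 2, and both capacities vanish at $\vartheta = 1$.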
As a third example we want to consider Gaussian channels defined in Section 3.3.4. Hence consider the Hilbert space $\mathcal{H} = L^2(\mathbb{R})$ describing a one-dimensional harmonic oscillator (or one mode of the electromagnetic field) and the amplification/attenuation channel $T$ defined in Eq. (3.74). The results we want to state concern a slight modification of the original definitions of $C_{c,1}(T)$ and $C_e(T)$: we will consider capacities for channels with constrained input. This means that only a restricted class of states on the input Hilbert space of the channel is allowed for encoding. In our case this means that we will consider the constraint $\mathrm{tr}(\rho a a^*) \le N$ for a positive real number $N > 0$ and with the usual creation and annihilation operators $a^*, a$. This can be rewritten as an energy constraint for a quadratic Hamiltonian; hence this is a physically realistic restriction. For the entanglement enhanced capacity it can be shown now that the maximum in Eq. (6.28) is taken on Gaussian states. To get $C_e(T)$ it is sufficient therefore to calculate the quantum mutual information $I(\rho_N, T)$ for the Gaussian state $\rho_N$ from Eq. (3.64). The details can be found in [84,18]; we will only state the results here. With the abbreviation

$$g(x) = (x+1)\log_2(x+1) - x\log_2 x, \tag{6.33}$$

we get $S(\rho_N) = g(N)$ and $S(T[\rho_N]) = g(N')$ with $N' = k^2 N + \max\{0, k^2-1\} + N_c$ (cf. Eq. (3.75)) for the entropies of input and output states and

$$S(\rho_N, T) = g\left(\frac{D + N' - N - 1}{2}\right) + g\left(\frac{D - N' + N - 1}{2}\right) \tag{6.34}$$

with

$$D = \sqrt{(N + N' + 1)^2 - 4k^2 N(N+1)} \tag{6.35}$$

for the entropy exchange. Combining the three terms, $C_e(T) = S(\rho_N) + S(T[\rho_N]) - S(\rho_N, T)$, which we have plotted in Fig. 6.4 as a function of $k$. To calculate the one-shot capacity $C_{c,1}(T)$ the optimization in Eq. (6.25) has to be carried out over probability distributions $p_j$ and collections of density operators $\rho_j$ such that $\sum_j p_j\,\mathrm{tr}(a a^*\rho_j) \le N$ holds. It is conjectured but not yet proven [84] that the maximum is achieved on coherent states with Gaussian probability distribution $p(x) = (\pi N)^{-1}\exp(-|x|^2/N)$. If this is true we get

$$C_{c,1}(T) = g(N') - g(N'_0) \quad\text{with}\quad N'_0 = \max\{0, k^2-1\} + N_c. \tag{6.36}$$

Fig. 6.4. One-shot and entanglement enhanced classical capacity, $C_{c,1}(T)$ and $C_e(T)$, of a Gaussian amplification/attenuation channel with $N_c = 0$ and input noise $N = 10$, plotted as a function of $k$.

Fig. 6.5. Gain of using entanglement assisted versus unassisted classical capacity, $C_e(T)/C_{c,1}(T)$, for a Gaussian amplification/attenuation channel with $N_c = 0$ and input noise $N = 0.1, 1, 10$, plotted as a function of $k$.
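The Gaussian formulas (6.33) to (6.36) can be combined into a few lines of Python. This is a sketch under stated assumptions, not the computation used for the figures: the function names are hypothetical, `n_in` and `n_c` stand for $N$ and $N_c$, the mutual information is assembled as $S(\rho_N) + S(T[\rho_N]) - S(\rho_N, T)$, and the one-shot value relies on the conjecture quoted from [84].

```python
import math


def g(x: float) -> float:
    """Eq. (6.33); g(0) = 0, and tiny negative rounding errors are clipped."""
    if x <= 0.0:
        return 0.0
    return (x + 1.0) * math.log2(x + 1.0) - x * math.log2(x)


def gaussian_capacities(k: float, n_in: float, n_c: float = 0.0) -> tuple[float, float]:
    """Return (C_e, C_{c,1}) for the amplification/attenuation channel."""
    excess = max(0.0, k * k - 1.0) + n_c           # N'_0 in Eq. (6.36)
    n_out = k * k * n_in + excess                  # N' (output photon number)
    d = math.sqrt((n_in + n_out + 1.0) ** 2 - 4.0 * k * k * n_in * (n_in + 1.0))
    exchange = (g((d + n_out - n_in - 1.0) / 2.0)
                + g((d - n_out + n_in - 1.0) / 2.0))   # Eq. (6.34)
    c_e = g(n_in) + g(n_out) - exchange
    c_1 = g(n_out) - g(excess)                     # Eq. (6.36), conjectured
    return c_e, c_1
```

Two limiting cases serve as sanity checks: for $k = 1$, $N_c = 0$ (the ideal channel) the entropy exchange vanishes and the ratio $C_e/C_{c,1}$ equals 2, while for $k = 0$ both capacities are zero.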
The resulting $C_{c,1}(T)$ is plotted as a function of $k$ in Fig. 6.4 and the ratio $G = C_e/C_{c,1}$ in Fig. 6.5. $G$ gives an upper bound on the gain of using entanglement assisted versus unassisted classical capacity.

6.3. The quantum capacity

The quantum capacity of a quantum channel $T: \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ is more difficult to treat than the classical capacities discussed in the last section. There is, in particular, no coding theorem available which would allow explicit calculations. Nevertheless, there are partial results available, which we will review in the following.

6.3.1. Alternative definitions
(6.37)
and if $\mathcal{H}' = \mathcal{H}$ holds we simply write $F_p(T)$. Hence a number $c$ is an achievable rate if

$$\lim_{j\to\infty} F_p(E_j T^{\otimes M_j} D_j) = 1 \tag{6.38}$$

holds for sequences

$$E_j: \mathcal{B}(\mathcal{H})^{\otimes M_j} \to \mathcal{M}_2^{\otimes N_j}, \quad D_j: \mathcal{M}_2^{\otimes N_j} \to \mathcal{B}(\mathcal{H})^{\otimes M_j}, \quad j \in \mathbb{N} \tag{6.39}$$

of encodings and decodings and sequences of integers $M_j, N_j$, $j \in \mathbb{N}$ satisfying the same constraints as in Definition 6.1 (in particular $\lim_{j\to\infty} N_j/M_j < c$). The equivalence to our version of $C_q(T)$ follows now from the estimates [168]

$$\|T - \mathrm{Id}\| \le \|T - \mathrm{Id}\|_{cb} \le 4\sqrt{\|T - \mathrm{Id}\|}, \tag{6.40}$$

$$\|T - \mathrm{Id}\| \le 4\sqrt{1 - F_p(T)} \le 4\sqrt{\|T - \mathrm{Id}\|}. \tag{6.41}$$
A second version of $C_q(T)$ is given in [7]. To state it let us define first a quantum source as a sequence $\rho_N$, $N \in \mathbb{N}$ of density operators $\rho_N \in \mathcal{B}_*(\mathcal{K}^{\otimes N})$ (with an appropriate Hilbert space $\mathcal{K}$) and the entropy rate of this source as $\limsup_{N\to\infty} S(\rho_N)/N$. In addition we need the entanglement fidelity of a state $\rho$ (with respect to a channel $T$)

$$F_e(\rho, T) = \langle\Psi, (T \otimes \mathrm{Id})[|\Psi\rangle\langle\Psi|]\Psi\rangle, \tag{6.42}$$

where $\Psi$ is the purification of $\rho$. Now we define $c \ge 0$ to be achievable if there is a quantum source $\rho_N$, $N \in \mathbb{N}$ with entropy rate $c$ such that

$$\lim_{N\to\infty} F_e(\rho_N, E_N T^{\otimes N} D_N) = 1 \tag{6.43}$$

holds with encodings and decodings

$$E_N: \mathcal{B}(\mathcal{H})^{\otimes N} \to \mathcal{B}(\mathcal{K}^{\otimes N}), \quad D_N: \mathcal{B}(\mathcal{K}^{\otimes N}) \to \mathcal{B}(\mathcal{H})^{\otimes N}, \quad N \in \mathbb{N}. \tag{6.44}$$

Note that these $E_N$, $D_N$ play a slightly different role than the $E_j$, $D_j$ in Eq. (6.39) (and in Definition 6.1), because the number of tensor factors of the input and the output algebra is always identical, while in Eq. (6.39) the quotients of these numbers lead to the achievable rate. To relate both definitions we have to derive an appropriately chosen family of subspaces $\mathcal{H}_N \subset \mathcal{K}^{\otimes N}$ from the $\rho_N$ such that the minimal fidelities $F_p(\mathcal{H}_N, E_N T^{\otimes N} D_N)$ of these subspaces go to 1 as $N \to \infty$. If we identify the $\mathcal{H}_N$ with tensor products of $\mathbb{C}^2$ and the $E_j$, $D_j$ of Eq. (6.39) with restrictions of $E_N$, $D_N$ to these tensor products we recover Eq. (6.38). A precise implementation of this rough idea can be found in [6] and it shows that both definitions just discussed are indeed equivalent.

6.3.2. Upper bounds and achievable rates

Although there is no coding theorem for the quantum capacity $C_q(T)$, there is a fairly good candidate which is related to the coherent information

$$J(\rho, T) = S(T^*\rho) - S(\rho, T). \tag{6.45}$$
Here $S(T^*\rho)$ is the entropy of the output state and $S(\rho, T)$ is the entropy exchange defined in Eq. (6.26). It is argued [7] that $J(\rho, T)$ plays a role in quantum information theory which is analogous to that of the (classical) mutual information (6.21) in classical information theory. $J(\rho, T)$ has some nasty properties, however: it can be negative [41] and it is known to be not additive [54]. To relate it to $C_q(T)$ it is therefore not sufficient to consider a one-shot capacity as in Shannon's theorem (Theorem 6.2). Instead, we have to define

$$C_s(T) = \sup_N \frac{1}{N} C_{s,1}(T^{\otimes N}) \quad\text{with}\quad C_{s,1}(T) = \sup_\rho J(\rho, T). \tag{6.46}$$

In [7,8] it is shown that $C_s(T)$ is an upper bound on $C_q(T)$. Equality, however, is conjectured but not yet proven, although there are good heuristic arguments [110,90].

A second interesting quantity which provides an upper bound on the quantum capacity uses the transposition operation $\Theta$ on the output systems. More precisely it is shown in [84] that

$$C_q(T) \le C_\Theta(T) = \log_2\|\Theta T\|_{cb} \tag{6.47}$$

holds for any channel. In contrast to many other calculations in this field it is particularly easy to derive this relation from properties of the cb-norm. Hence we are able to give a proof here. We start with the fact that $\|\Theta\|_{cb} = d$ if $d$ is the dimension of the Hilbert space on which $\Theta$ operates. Assume that $N_j/M_j \to c \le C_q(T)$ and $j$ large enough such that $\|\mathrm{Id}_2^{\otimes N_j} - E_j T^{\otimes M_j} D_j\|_{cb} \le \epsilon$ with appropriate encodings and decodings $E_j, D_j$. We get

$$2^{N_j} = \|\mathrm{Id}_2^{\otimes N_j}\Theta\|_{cb} \le \|\Theta(\mathrm{Id}_2^{\otimes N_j} - E_j T^{\otimes M_j} D_j)\|_{cb} + \|\Theta E_j T^{\otimes M_j} D_j\|_{cb} \tag{6.48}$$

$$\le 2^{N_j}\|\mathrm{Id}_2^{\otimes N_j} - E_j T^{\otimes M_j} D_j\|_{cb} + \|\Theta E_j\Theta(\Theta T)^{\otimes M_j} D_j\|_{cb} \tag{6.49}$$

$$\le 2^{N_j}\epsilon + \|\Theta T\|_{cb}^{M_j}, \tag{6.50}$$

where we have used for the last inequality the fact that $D_j$ and $\Theta E_j\Theta$ are channels and that the cb-norm is multiplicative. Taking logarithms on both sides we get

$$\frac{N_j}{M_j} + \frac{\log_2(1-\epsilon)}{M_j} \le \log_2\|\Theta T\|_{cb}. \tag{6.51}$$

In the limit $j \to \infty$ this implies $c \le \log_2\|\Theta T\|_{cb}$ and therefore $C_q(T) \le \log_2\|\Theta T\|_{cb} = C_\Theta(T)$ as stated.

Since $C_\Theta(T)$ is an upper bound on $C_q(T)$ it is particularly useful to check whether the quantum capacity for a particular channel is zero. If, e.g., $T$ is classical we have $\Theta T = T$ since the transposition coincides on a classical algebra $\mathbb{C}^d$ with the identity (elements of $\mathbb{C}^d$ are just diagonal matrices). This implies $C_\Theta(T) = \log_2\|\Theta T\|_{cb} = \log_2\|T\|_{cb} = 0$, because the cb-norm of a channel is 1. We see therefore that the quantum capacity of a classical channel is 0; this is just another proof of the no-teleportation theorem. A slightly more general result concerns channels $T = RS$ which are the composition of a preparation $R: \mathcal{M}_d \to \mathbb{C}^f$ and a subsequent measurement $S: \mathbb{C}^f \to \mathcal{M}_d$. It is easy to see that $\Theta T = \Theta RS$ is a channel, because $\Theta R\Theta$ is a channel and $\Theta$ is the identity on $\mathbb{C}^f$, hence $\Theta R\Theta = \Theta R$ and $\Theta R\Theta S = \Theta RS = \Theta T$. Again we get $C_\Theta(T) = 0$.

Let us consider now some examples. The simplest case is again the quantum erasure channel from Eq. (6.29). As for the classical capacities its quantum capacity can be explicitly calculated [15] and we have $C_q(T) = \max(0, (1-2\vartheta)\log_2(d))$; cf. Fig. 6.1.
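The starting point of the proof above, $\|\Theta\|_{cb} = d$, can be made plausible numerically. The sketch below only checks the lower bound: it applies the transposition to one half of a maximally entangled state and computes the trace norm of the result, which equals $d$. The function name is hypothetical and NumPy is assumed to be available; that $d$ is also an upper bound is taken from the text.

```python
import numpy as np


def transposed_maximally_entangled_norm(d: int) -> float:
    """Trace norm of (Id ⊗ Θ)|Ψ><Ψ| for the maximally entangled Ψ on C^d ⊗ C^d."""
    psi = np.eye(d).reshape(d * d) / np.sqrt(d)      # Ψ = d^{-1/2} Σ_i |ii>
    projector = np.outer(psi, psi)
    # Partial transpose on the second tensor factor: swap its row/column index.
    pt = projector.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    # Trace norm = sum of singular values.
    return float(np.linalg.svd(pt, compute_uv=False).sum())
```

Since the input is a normalized state, a trace norm of $d$ after partial transposition shows that amplification by a factor $d$ is actually reached once an ancilla of the same dimension is attached.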
Fig. 6.6. $C_\Theta(T)$ (transposition bound), $C_{s,1}(T)$ (one-shot coherent information) and the Hamming bound of a depolarizing qubit channel plotted as a function of the noise parameter $\vartheta$.
For the depolarizing channel (6.30) precise calculations of $C_q(T)$ are not available. Hence let us consider first the coherent information. $J(\rho, T)$ inherits from $T$ its unitary covariance, i.e. we have $J(U\rho U^*, T) = J(\rho, T)$. In contrast to the mutual information, however, it does not have nice concavity properties, which makes the optimization over all input states more difficult to solve. Nevertheless, the calculation of $J(\rho, T)$ is straightforward and we get in the qubit case (if $\vartheta$ is the noise parameter of $T$ and $\lambda$ is the highest eigenvalue of $\rho$):

$$J(\rho, T) = S\left(\lambda(1-\vartheta) + \frac{\vartheta}{2}\right) + S\left((1-\lambda)(1-\vartheta) + \frac{\vartheta}{2}\right) - S\left(\frac{1 - \vartheta/2 + A}{2}\right) - S\left(\frac{1 - \vartheta/2 - A}{2}\right) - S\left(\frac{\lambda\vartheta}{2}\right) - S\left(\frac{(1-\lambda)\vartheta}{2}\right), \tag{6.52}$$

where $S(x) = -x\log_2(x)$ denotes again the entropy function and

$$A = \sqrt{(2\lambda-1)^2(1-\vartheta/2)^2 + 4\lambda(1-\lambda)(1-\vartheta)^2}. \tag{6.53}$$

The first two terms in Eq. (6.52) are the contributions of the two eigenvalues of the output state $T^*\rho$; the remaining four terms are the contributions of the eigenvalues entering the entropy exchange. Optimization over $\lambda$ can be performed at least numerically (the maximum is attained at the left boundary ($\lambda = 1/2$) if $J$ is positive there, and at the right boundary otherwise). The result is plotted together with $C_\Theta(T)$ in Fig. 6.6 as a function of $\vartheta$. The quantity $C_\Theta(T)$ is much easier to compute and we get

$$C_\Theta(T) = \max\left\{0, \log_2\left(2 - \frac{3}{2}\vartheta\right)\right\}. \tag{6.54}$$

To get a lower bound on $C_q(T)$ we have to show that a certain rate $r \le C_q(T)$ can be achieved with an appropriate sequence

$$E_M: \mathcal{M}_d^{\otimes M} \to \mathcal{M}_2^{\otimes N(M)}, \quad M, N(M) \in \mathbb{N} \tag{6.55}$$

of error correcting codes and corresponding decodings $D_M$. I.e. we need

$$\lim_{M\to\infty} N(M)/M = r \quad\text{and}\quad \lim_{M\to\infty}\|E_M T^{\otimes M} D_M - \mathrm{Id}\|_{cb} = 0. \tag{6.56}$$
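Returning for a moment to the coherent information of the depolarizing qubit channel: the closed form of Eqs. (6.52) and (6.53) can be cross-checked against a direct eigenvalue computation. The following sketch is illustrative only. The function names are hypothetical, NumPy is assumed, and the closed form is taken to contain the entropy contributions of both output eigenvalues, $\lambda(1-\vartheta)+\vartheta/2$ and $(1-\lambda)(1-\vartheta)+\vartheta/2$.

```python
import math

import numpy as np


def s_term(x: float) -> float:
    """-x log2 x with the convention 0 log 0 = 0 (tiny values are clipped)."""
    return 0.0 if x <= 1e-15 else -x * math.log2(x)


def coherent_info_closed_form(lam: float, theta: float) -> float:
    """J(rho, T) from Eqs. (6.52)-(6.53) for rho = diag(lam, 1 - lam)."""
    a = math.sqrt((2 * lam - 1) ** 2 * (1 - theta / 2) ** 2
                  + 4 * lam * (1 - lam) * (1 - theta) ** 2)
    output = (s_term(lam * (1 - theta) + theta / 2)
              + s_term((1 - lam) * (1 - theta) + theta / 2))
    exchange = (s_term((1 - theta / 2 + a) / 2) + s_term((1 - theta / 2 - a) / 2)
                + s_term(lam * theta / 2) + s_term((1 - lam) * theta / 2))
    return output - exchange


def coherent_info_numeric(lam: float, theta: float) -> float:
    """Same quantity from the spectra of T*(rho) and (T ⊗ Id)|Ψ><Ψ|."""
    rho = np.diag([lam, 1 - lam])
    psi = np.array([math.sqrt(lam), 0.0, 0.0, math.sqrt(1 - lam)])  # purification
    joint = (1 - theta) * np.outer(psi, psi) + theta * np.kron(np.eye(2) / 2, rho)
    out_state = (1 - theta) * rho + theta * np.eye(2) / 2

    def entropy(m: np.ndarray) -> float:
        return sum(s_term(float(x)) for x in np.linalg.eigvalsh(m))

    return entropy(out_state) - entropy(joint)
```

Scanning `lam` over $[1/2, 1]$ for fixed `theta` is then a simple way to carry out the numerical optimization mentioned in the text.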
To find such a sequence note first that we can look at the depolarizing channel as a device which produces an error with probability $\vartheta$ and leaves the quantum information intact otherwise. If more and more copies of $T$ are used in parallel, i.e. if $M$ goes to infinity, the number of errors therefore approaches $\vartheta M$. In other words, the probability to have more than $\vartheta M$ errors vanishes asymptotically. To see this consider

$$T^{\otimes M} = \left((1-\vartheta)\mathrm{Id} + \vartheta d^{-1}\mathrm{tr}(\cdot)\mathbb{1}\right)^{\otimes M} = \sum_{K=0}^{M}\vartheta^K(1-\vartheta)^{M-K}\,T_K^{(M)}, \tag{6.57}$$

where $T_K^{(M)}$ denotes the sum of all $M$-fold tensor products with $d^{-1}\mathrm{tr}(\cdot)\mathbb{1}$ on $K$ places and $\mathrm{Id}$ on the $M-K$ remaining, i.e. $T_K^{(M)}$ collects all ways of producing exactly $K$ errors on $M$ transmitted systems. Now we have

$$\Bigl\|T^{\otimes M} - \sum_{K\le\vartheta M}\vartheta^K(1-\vartheta)^{M-K}\,T_K^{(M)}\Bigr\|_{cb} \tag{6.58}$$

$$= \Bigl\|\sum_{K>\vartheta M}\vartheta^K(1-\vartheta)^{M-K}\,T_K^{(M)}\Bigr\|_{cb} \tag{6.59}$$

$$\le \sum_{K>\vartheta M}\vartheta^K(1-\vartheta)^{M-K}\,\|T_K^{(M)}\|_{cb} \tag{6.60}$$

$$\le \sum_{K>\vartheta M}\binom{M}{K}\vartheta^K(1-\vartheta)^{M-K} = R. \tag{6.61}$$

The quantity $R$ is the tail of a binomial series and vanishes therefore in the limit $M \to \infty$ (cf. e.g. [131, Appendix B]). This shows that for $M \to \infty$ only terms $T_K^{(M)}$ with $K \le \vartheta M$ are relevant in Eq. (6.57); in other words at most $\vartheta M$ errors occur asymptotically, as stated. This implies that we need a sequence of codes $E_M$ which encode $N(M)$ qubits and correct $\vartheta M$ errors on $M$ places. One way to get such a sequence is "random coding"; the classical version of this method is well known from the proof of Shannon's theorem. The idea is, basically, to generate error correcting codes of a certain type randomly. E.g. we can generate a sequence of random graphs with $N(M)$ input and $M$ output vertices (cf. Section 4.4). If we can show that the corresponding codes correct (asymptotically) $\vartheta M$ errors, the corresponding rate $r = \lim_{M\to\infty} N(M)/M$ is achievable. For the depolarizing channel$^{21}$ such an analysis, using randomly generated stabilizer codes, shows [16,71]

$$C_q(T) \ge 1 - H(\vartheta) - \vartheta\log_2 3, \tag{6.62}$$

where $H$ is the binary entropy from Eq. (5.16). This bound can be further improved using a more clever coding strategy; cf. [54].

$^{21}$ With a more thorough discussion similar results can be obtained for a much more general class of channels, e.g. all $T$ in a neighborhood of the identity channel; cf. [114].

As a third example let us consider again the Gaussian channel studied already in Section 6.2.4. For $C_\Theta(T)$ we have (the corresponding calculation is not trivial and uses properties of Gaussian channels which we have not discussed; cf. [84])

$$C_\Theta(T) = \max\{0, \log_2(k^2+1) - \log_2(|k^2-1| + 2N_c)\} \tag{6.63}$$

and we see that $C_\Theta(T)$ and therefore $C_q(T)$ become zero if $N_c$ is large enough (i.e. $N_c \ge \max\{1, k^2\}$). The coherent information for the Gaussian state $\rho_N$ from Eq. (3.64) has the form

$$J(\rho_N, T) = g(N') - g\left(\frac{D + N' - N - 1}{2}\right) - g\left(\frac{D - N' + N - 1}{2}\right) \tag{6.64}$$

with $N'$, $D$ and $g$ as in Section 6.2.4. It increases with $N$ and we can calculate therefore the maximum over all Gaussian states (which might differ from $C_s(T)$) as

$$C_G(T) = \lim_{N\to\infty} J(\rho_N, T) = \log_2 k^2 - \log_2|k^2-1| - g\left(\frac{N_c}{|k^2-1|}\right). \tag{6.65}$$

We have plotted both quantities in Fig. 6.7 as a function of $k$. Finally let us have a short look at the special case $k = 1$, i.e. $T$ describes in this case only the influence of classical Gaussian noise on the transmitted qubits. If we set $k = 1$ in Eq. (6.64) and take the limit $N \to \infty$ we get $C_G(T) = -\log_2(N_c e)$ and $C_\Theta(T)$ becomes $C_\Theta(T) = \max\{0, -\log_2(N_c)\}$; both quantities are plotted in Fig. 6.8. This special case is interesting because the one-shot coherent information $C_G(T)$ is achievable, provided the noise parameter $N_c$ satisfies certain conditions$^{22}$ [77]. Hence there is strong evidence that the quantum capacity lies between the two lines in Fig. 6.8.

$^{22}$ It is only shown that $\log_2(\lfloor 1/(N_c e)\rfloor)$ can be achieved, where $\lfloor x\rfloor$ denotes the biggest integer less than $x$. It is very likely however that this is only a restriction of the methods used in the proof and not of the result.

Fig. 6.7. $C_\Theta(T)$ (transposition bound) and $C_s(T)$ (one-shot coherent information) of a Gaussian amplification/attenuation channel as a function of the amplification parameter $k$.

Fig. 6.8. $C_\Theta(T)$ (transposition bound) and $C_s(T)$ (one-shot coherent information) of a Gaussian amplification/attenuation channel as a function of the noise parameter $N_c$ (and with $k = 1$).

6.3.3. Relations to entanglement measures

The duality lemma proved in Section 2.3.3 provides an interesting way to derive bounds on channel capacities and capacity-like quantities from entanglement measures (and vice versa) [16,90]: To derive a state of a bipartite system from a channel $T$ we can take a maximally entangled state $\Psi \in \mathcal{H}\otimes\mathcal{H}$, send one particle through $T$ and get a less entangled pair in the state $\rho_T = (\mathrm{Id}\otimes T^*)|\Psi\rangle\langle\Psi|$. If on the other hand an entangled state $\rho \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$ is given, we can use it as a resource for teleportation and get a channel $T_\rho$. The two maps $\rho \mapsto T_\rho$ and $T \mapsto \rho_T$ are, however, not inverse to one another. This can be seen easily from the duality lemma (Theorem 2.10): For each state $\rho \in \mathcal{S}(\mathcal{H}\otimes\mathcal{H})$ there is a channel $T$ and a pure state $\Phi \in \mathcal{H}\otimes\mathcal{H}$ such that $\rho = (\mathrm{Id}\otimes T^*)|\Phi\rangle\langle\Phi|$ holds; but $\Phi$ is in general not maximally entangled (and uniquely determined by $\rho$). Nevertheless, there are special cases in which the state $\rho_{T_\rho}$ derived from $T_\rho$ coincides with $\rho$: a particular class of examples is given by teleportation channels derived from a Bell-diagonal state.

On $\rho_T$ we can evaluate an entanglement measure $E(\rho_T)$ and get in this way a quantity which is related to the capacity of $T$. A particularly interesting candidate for $E$ is the "one-way LOCC" distillation rate $E_{D,\to}$. It is defined in the same way as the entanglement of distillation $E_D$, except that only one-way LOCC operations are allowed in Eq. (5.8). According to [16] $E_{D,\to}$ is related to $C_q$ by the inequalities $E_{D,\to}(\rho) \ge C_q(T_\rho)$ and $E_{D,\to}(\rho_T) \le C_q(T)$. Hence if $\rho_{T_\rho} = \rho$ we can calculate $E_{D,\to}(\rho)$ in terms of $C_q(T_\rho)$ and vice versa.

A second interesting example is the transposition bound $C_\Theta(T)$ introduced in the last subsection. It is related to the logarithmic negativity [158]

$$E_\Theta(\rho_T) = \log_2\|(\mathrm{Id}\otimes\Theta)\rho_T\|_1, \tag{6.66}$$

which measures the degree to which the partial transpose of $\rho$ fails to be positive. $E_\Theta$ can be regarded as an entanglement measure although it has some drawbacks: it is not LOCC monotone (Axiom E2), it is not convex (Axiom E3) and, most severely, it does not coincide with the reduced von Neumann entropy on pure states, which we have considered as "the" entanglement measure for pure states. On the other hand, it is easy to calculate and it gives bounds on distillation rates and teleportation capacities [158]. In addition $E_\Theta$ can be used together with the relation between depolarizing channels and isotropic states to derive Eq. (6.54) in a very simple way.
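The last remark can be illustrated numerically for qubits. The sketch below builds the state $\rho_T = (\mathrm{Id}\otimes T^*)|\Psi\rangle\langle\Psi|$ of a depolarizing channel, which is an isotropic state, evaluates the logarithmic negativity (6.66) and compares it with the closed form $\max\{0, \log_2(2 - \tfrac{3}{2}\vartheta)\}$ of Eq. (6.54). Function names are hypothetical, NumPy is assumed, and the check is a numerical consistency test rather than the derivation referred to in the text.

```python
import numpy as np


def depolarizing_choi_state(theta: float, d: int = 2) -> np.ndarray:
    """(Id ⊗ T*)|Ψ><Ψ| for T*(rho) = (1 - theta) rho + theta tr(rho) 1/d."""
    psi = np.eye(d).reshape(d * d) / np.sqrt(d)      # maximally entangled Ψ
    projector = np.outer(psi, psi)
    return (1 - theta) * projector + theta * np.eye(d * d) / (d * d)


def log_negativity(rho: np.ndarray, d: int) -> float:
    """log2 of the trace norm of the partial transpose (second factor)."""
    pt = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    # pt is Hermitian, so its trace norm is the sum of |eigenvalues|.
    return float(np.log2(np.abs(np.linalg.eigvalsh(pt)).sum()))


def transposition_bound_qubit(theta: float) -> float:
    """Closed form of Eq. (6.54)."""
    return max(0.0, float(np.log2(2.0 - 1.5 * theta)))
```

For $\vartheta \ge 2/3$ the partial transpose is positive, the trace norm equals 1 and both expressions give zero.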
7. Multiple inputs

We have seen in Section 4 that many tasks of quantum information which are impossible with one-shot operations can be approximated by channels which operate on a large number of equally prepared inputs. Typical examples are approximate cloning, undoing noise and distillation of entanglement. There are basically two questions which are interesting for a quantitative analysis: first, we can search for the optimal solutions for a fixed number $N$ of input systems and second we can ask for the asymptotic behavior in the limit $N \to \infty$. In the latter case the asymptotic rate, i.e. the number of outputs (of a certain quality) per input system, is of particular interest.

7.1. The general scheme

Both types of questions just mentioned can be treated (up to a certain degree) independently from the (impossible) task we are dealing with. In the following we will study the corresponding general scheme. Hence consider a channel $T: \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ which operates on $N$ input systems and produces $M$ outputs of the same type. Our aim is to optimize a “
(7.1)
a figure of merit for the first case is given by

$$F_{c,1}(T) = \inf_{j=1,\dots,N}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{(j)}\,T^*(\sigma^{\otimes N})\bigr). \tag{7.2}$$

It measures the worst one-particle fidelity of the output state $T^*(\sigma^{\otimes N})$. If we are interested in correlations too, we have to choose

$$F_{c,\mathrm{all}}(T) = \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{\otimes M}\,T^*(\sigma^{\otimes N})\bigr), \tag{7.3}$$

which is again a "worst case" fidelity, but now of the full output with respect to $M$ uncorrelated copies of the input $\sigma$. Instead of fidelities we can consider other error quantities like trace-norm distances or relative entropies. In general, however, we do not get significantly different results from such alternative choices; hence, we can safely ignore them. Real variants arise if we consider, instead of the infima over all pure states, quantities which prefer a (possibly discrete or even finite) class of states. Such a choice leads to "state-dependent cloning", because the corresponding optimal devices perform better than "universal" ones (i.e. those described by the figures of merit above) on some states but much worse on the rest. We ignore state-dependent cloning in this work, because the universal case is physically more relevant and technically more challenging. Other cases which we do not discuss either include "asymmetric cloning", which arises if we trade in Eq. (7.2) the quality of one particular output system against the rest (see [40]), and cloning of mixed states. The latter is much more difficult than the pure state case and non-trivial even for classical systems, where it is related to the so-called "bootstrap" technique [59].

Closely related to cloning is purification, i.e. undoing noise. This means we are considering $N$ systems originally prepared in the same (unknown) pure state $\sigma$ but which have passed a depolarizing channel

$$R^*\sigma = \vartheta\sigma + (1-\vartheta)\mathbb{1}/d \tag{7.4}$$
afterwards. The task is now to find a device $T$ acting on $N$ of the decohered systems such that $T^*[(R^*\sigma)^{\otimes N}]$ is as close as possible to the original pure state. We have the same basic choices for a figure of merit as in the cloning problem. Hence, we define

$$F_{R,1}(T) = \inf_{j=1,\dots,N}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{(j)}\,T^*[(R^*\sigma)^{\otimes N}]\bigr) \tag{7.5}$$

and

$$F_{R,\mathrm{all}}(T) = \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{\otimes M}\,T^*[(R^*\sigma)^{\otimes N}]\bigr). \tag{7.6}$$

These quantities can be regarded as generalizations of $F_{c,1}$ and $F_{c,\mathrm{all}}$, which we recover if $R^*$ is the identity.

Another task we can consider is the approximation of a map $\Theta$ which is positive but not completely positive, like the transposition. Positivity and normalization imply that $\Theta^*$ maps states to states but $\Theta$ cannot be realized by a physical device. An explicit example is the universal not gate (UNOT) which maps each pure qubit state $\sigma$ to its orthocomplement $\sigma^\perp$ [36]. It is given by the anti-unitary operator

$$\psi = \alpha|0\rangle + \beta|1\rangle \mapsto \Theta\psi = \bar\beta|0\rangle - \bar\alpha|1\rangle. \tag{7.7}$$
Since $\Theta\sigma$ is a state if $\sigma$ is, we can ask again for a channel $T$ such that $T^*(\sigma^{\otimes N})$ approximates $(\Theta\sigma)^{\otimes M}$. As in the two previous examples we have the choice to allow arbitrary correlations in the output or not and we get the following figures of merit:

$$F_{\Theta,1}(T) = \inf_{j=1,\dots,N}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl((\Theta\sigma)^{(j)}\,T^*(\sigma^{\otimes N})\bigr) \tag{7.8}$$

and

$$F_{\Theta,\mathrm{all}}(T) = \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl((\Theta\sigma)^{\otimes M}\,T^*(\sigma^{\otimes N})\bigr). \tag{7.9}$$

Note that we can plug in for $\Theta$ basically any functional which maps states to states. In addition we can combine Eqs. (7.5) and (7.6) on the one hand with (7.8) and (7.9) on the other. As a result we would get a measure for devices which undo an operation $R$ and approximate an impossible machine $\Theta$ at the same time.

7.1.2. Covariant operations

All the functionals just defined give rise to optimization problems which we will study in greater detail in the next sections. This means we are interested in two things: first of all the maximal value of $F_{\#,\natural}$ (with $\# = c, R, \Theta$ and $\natural = 1, \mathrm{all}$) given by

$$F_{\#,\natural}(N, M) = \sup_T F_{\#,\natural}(T), \tag{7.10}$$

where the supremum is taken over all channels $T: \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$, and second the particular channel $\hat T$ where the optimum is attained. At first sight a complete solution of these problems seems to be impossible, due to the large dimension of the space of all $T$, which scales exponentially in $M$ and $N$. Fortunately, all $F_{\#,\natural}(T)$ admit a large symmetry group which allows in many cases the explicit calculation of the optimal values $F_{\#,\natural}(N, M)$ and the determination of optimizers $\hat T$ with a certain covariance behavior. Note that this is an immediate consequence of our decision to restrict the discussion to "universal" procedures, which do not prefer any particular input state.

Let us consider permutations of the input systems first: if $p \in S_N$ is a permutation on $N$ places and $V_p$ the corresponding unitary on $\mathcal{H}^{\otimes N}$ (cf. Eq. (3.7)) we get obviously $T^*(V_p\sigma^{\otimes N}V_p^*) = T^*(\sigma^{\otimes N})$, hence

$$F_{\#,\natural}[\alpha_p(T)] = F_{\#,\natural}(T) \quad \forall p \in S_N \quad\text{with}\quad [\alpha_p(T)](A) = V_p^*\,T(A)\,V_p. \tag{7.11}$$

In other words: $F_{\#,\natural}(T)$ is invariant under permutations of the input systems. Similarly, we can show that $F_{\#,\natural}(T)$ is invariant under permutations of the output systems:

$$F_{\#,\natural}[\beta_p(T)] = F_{\#,\natural}(T) \quad \forall p \in S_M \quad\text{with}\quad [\beta_p(T)](A) = T(V_p^* A V_p). \tag{7.12}$$

To see this consider e.g. for $\# = c$ and $\natural = \mathrm{all}$

$$\mathrm{tr}[\sigma^{\otimes M} V_p T^*(\sigma^{\otimes N}) V_p^*] = \mathrm{tr}[V_p^*\sigma^{\otimes M} V_p\,T^*(\sigma^{\otimes N})] = \mathrm{tr}[\sigma^{\otimes M} T^*(\sigma^{\otimes N})]. \tag{7.13}$$

For the other cases similar calculations apply.
Finally, none of the $F_{\#,\natural}(T)$ singles out a preferred direction in the one-particle Hilbert space $\mathcal{H}$. This implies that we can rotate $T$ by local unitaries of the form $U^{\otimes N}$, respectively $U^{\otimes M}$, without changing $F_{\#,\natural}(T)$. More precisely we have

$$F_{\#,\natural}[\tau_U(T)] = F_{\#,\natural}(T) \quad \forall U \in U(d) \tag{7.14}$$

with

$$[\tau_U(T)](A) = U^{*\otimes N}\,T(U^{\otimes M} A\,U^{*\otimes M})\,U^{\otimes N}. \tag{7.15}$$

The validity of Eq. (7.14) can be proven in the same way as (7.11) and (7.12). The details are therefore left to the reader. Now we can average over the groups $S_N$, $S_M$ and $U(d)$. Instead of the operation $T$ we consider

$$\bar T = \frac{1}{N!\,M!}\sum_{p\in S_N}\sum_{q\in S_M}\int_G \alpha_p\beta_q\tau_U(T)\,dU, \tag{7.16}$$

where $dU$ denotes the normalized, left invariant Haar measure on $G = U(d)$. We see immediately that $\bar T$ has the following symmetry properties:

$$\alpha_p(\bar T) = \bar T, \quad \beta_q(\bar T) = \bar T, \quad \tau_U(\bar T) = \bar T \quad \forall p \in S_N\ \ \forall q \in S_M\ \ \forall U \in U(d) \tag{7.17}$$

and we will call each operation $T$ fully symmetric if it satisfies this equation. The concavity of $F_{\#,\natural}$ implies immediately that it cannot decrease if we replace $T$ by $\bar T$:

$$F_{\#,\natural}(\bar T) = F_{\#,\natural}\left(\frac{1}{N!\,M!}\sum_{p\in S_N}\sum_{q\in S_M}\int_G \alpha_p\beta_q\tau_U(T)\,dU\right) \tag{7.18}$$

$$\ge \frac{1}{N!\,M!}\sum_{p\in S_N}\sum_{q\in S_M}\int_G F_{\#,\natural}[\alpha_p\beta_q\tau_U(T)]\,dU = F_{\#,\natural}(T). \tag{7.19}$$

To calculate the optimal value $F_{\#,\natural}(N, M)$ it is therefore completely sufficient to search for a maximizer of $F_{\#,\natural}(T)$ only among fully symmetric $T$ and to evaluate $F_{\#,\natural}(T)$ for this particular operation. This simplifies the problem significantly because the size of the parameter space is extremely reduced. Of course, we do not know from this argument whether the optimum is attained on non-symmetric operations as well; however, this information is in general less important (and for some problems like optimal cloning a uniqueness result is available).

7.1.3. Group representations

To get an idea how this parameter reduction can be exploited practically, let us reconsider Theorem 3.1: the two representations $U \mapsto U^{\otimes N}$ and $p \mapsto V_p$ of $U(d)$, respectively $S_N$, on $\mathcal{H}^{\otimes N}$ are "commutants" of each other, i.e., any operator on $\mathcal{H}^{\otimes N}$ commuting with all $U^{\otimes N}$ is a linear combination of the $V_p$, and conversely. This knowledge can be used to decompose the representation $U^{\otimes N}$ (and $V_p$ as well) into irreducible components. To reduce the group theoretic overhead, we will discuss this procedure first for qubits only and come back to the general case afterwards.
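The easy half of the commutant statement, namely that every $V_p$ commutes with every $U^{\otimes N}$, is simple to check numerically. The sketch below does this for a randomly drawn unitary; all function names are hypothetical, NumPy is assumed, and the unitary obtained from a QR decomposition is merely some unitary, with no claim of being Haar distributed. The converse direction (that the $V_p$ span the whole commutant) is not tested here.

```python
import itertools
from functools import reduce

import numpy as np


def random_unitary(d: int, rng: np.random.Generator) -> np.ndarray:
    """Some d x d unitary, obtained from the QR decomposition of a random matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q


def tensor_power(u: np.ndarray, n: int) -> np.ndarray:
    """U ⊗ U ⊗ ... ⊗ U with n factors."""
    return reduce(np.kron, [u] * n)


def permutation_operator(perm: tuple[int, ...], d: int) -> np.ndarray:
    """Unitary on (C^d)^{⊗n} that permutes the tensor factors according to perm."""
    n = len(perm)
    eye = np.eye(d ** n).reshape([d] * n + [d ** n])
    return eye.transpose(list(perm) + [n]).reshape(d ** n, d ** n)


def max_commutator_norm(d: int = 2, n: int = 3, seed: int = 0) -> float:
    """Largest Frobenius norm of [U^{⊗n}, V_p] over all permutations p."""
    rng = np.random.default_rng(seed)
    u_n = tensor_power(random_unitary(d, rng), n)
    worst = 0.0
    for perm in itertools.permutations(range(n)):
        v = permutation_operator(perm, d)
        worst = max(worst, float(np.linalg.norm(u_n @ v - v @ u_n)))
    return worst
```

Up to rounding errors the returned value is zero, independently of the seed and of the chosen `d` and `n`.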
Hence assume that $\mathcal{H} = \mathbb{C}^2$ holds. Then $\mathcal{H}^{\otimes N}$ is the Hilbert space of $N$ (distinguishable) spin-1/2 particles and it can be decomposed in terms of eigenspaces of total angular momentum. More precisely consider

$$L_k = \frac{1}{2}\sum_j \sigma_k^{(j)}, \quad k = 1, 2, 3 \tag{7.20}$$

the $k$-component of total angular momentum (i.e. $\sigma_k$ is the $k$th Pauli matrix and $\sigma_k^{(j)} \in \mathcal{B}(\mathcal{H}^{\otimes N})$ is defined according to Eq. (7.1)) and $\vec L^2 = \sum_k L_k^2$. The eigenvalue expansion of $\vec L^2$ is well known to be

$$\vec L^2 = \sum_s s(s+1)P_s \quad\text{with}\quad s = \begin{cases} 0, 1, \dots, N/2, & N\ \text{even}, \\ 1/2, 3/2, \dots, N/2, & N\ \text{odd}, \end{cases} \tag{7.21}$$

where the $P_s$ denote the projections to the eigenspaces of $\vec L^2$. It is easy to see that both representations $U \mapsto U^{\otimes N}$ and $p \mapsto V_p$ commute with $\vec L^2$. Hence the eigenspaces $P_s\mathcal{H}^{\otimes N}$ of $\vec L^2$ are invariant subspaces of $U^{\otimes N}$ and $V_p$ and this implies that the restrictions of $U^{\otimes N}$ and $V_p$ to them are representations of $SU(2)$, respectively $S_N$. Since $\vec L^2$ is constant on $P_s\mathcal{H}^{\otimes N}$ the $SU(2)$ representation we get in this way must be (naturally isomorphic to) a multiple of the irreducible spin-$s$ representation $\pi_s$. It is defined by

$$\pi_s\left(\exp\left(\frac{i}{2}\sigma_k\right)\right) = \exp\bigl(iL_k^{(s)}\bigr) \quad\text{with}\quad L_k^{(s)} = \frac{1}{2}\sum_{j=1}^{2s}\sigma_k^{(j)} \tag{7.22}$$

on the representation space

$$\mathcal{H}_s = \mathcal{H}_+^{\otimes 2s} \tag{7.23}$$

(the Bose-subspace of $\mathcal{H}^{\otimes 2s}$). Hence we get

$$P_s\mathcal{H}^{\otimes N} \cong \mathcal{H}_s\otimes\mathcal{K}_{N,s}, \quad U^{\otimes N}\psi = (\pi_s(U)\otimes\mathbb{1})\psi \quad \forall\psi \in P_s\mathcal{H}^{\otimes N}. \tag{7.24}$$

Since $V_p$ and $U^{\otimes N}$ commute the Hilbert space $\mathcal{K}_{N,s}$ carries a representation $\hat\pi_{N,s}(p)$ of $S_N$ which is irreducible as well. Note that $\mathcal{K}_{N,s}$ depends, in contrast to $\mathcal{H}_s$, on the number $N$ of tensor factors and its dimension is (see [100] or [142] for general $d$)

$$\dim\mathcal{K}_{N,s} = \frac{2s+1}{N/2+s+1}\binom{N}{N/2-s}. \tag{7.25}$$

Summarizing the discussion we get

$$\mathcal{H}^{\otimes N} \cong \bigoplus_s \mathcal{H}_s\otimes\mathcal{K}_{N,s}, \quad U^{\otimes N} \cong \bigoplus_s \pi_s(U)\otimes\mathbb{1}, \quad V_p \cong \bigoplus_s \mathbb{1}\otimes\hat\pi_{N,s}(p). \tag{7.26}$$
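A quick consistency check of the decomposition (7.26) is the dimension count $\sum_s (2s+1)\dim\mathcal{K}_{N,s} = 2^N$, with $\dim\mathcal{K}_{N,s}$ taken from Eq. (7.25) and $\dim\mathcal{H}_s = 2s+1$. The following sketch performs it with exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def spin_values(n: int) -> list[Fraction]:
    """Allowed total spins s for n qubits, cf. Eq. (7.21)."""
    start = Fraction(n % 2, 2)               # 0 for even n, 1/2 for odd n
    return [start + i for i in range(n // 2 + 1)]


def multiplicity(n: int, s: Fraction) -> int:
    """dim K_{N,s} according to Eq. (7.25)."""
    two_s = int(2 * s)
    lower = (n - two_s) // 2                 # N/2 - s
    upper = (n + two_s) // 2 + 1             # N/2 + s + 1
    value = Fraction(two_s + 1, upper) * comb(n, lower)
    if value.denominator != 1:
        raise ArithmeticError("multiplicity is not an integer")
    return int(value)


def total_dimension(n: int) -> int:
    """Sum over s of dim H_s * dim K_{N,s}; should equal 2**n."""
    return sum(int(2 * s + 1) * multiplicity(n, s) for s in spin_values(n))
```

For example, three qubits decompose into one spin-3/2 representation and a spin-1/2 representation with multiplicity two, giving $4\cdot 1 + 2\cdot 2 = 8$.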
Let us consider now a fully symmetric operation $T$. Permutation invariance ($\alpha_p(T) = T$ and $\beta_p(T) = T$) implies together with Eq. (7.26) that

$$T(A_j\otimes B_j) = \bigoplus_s \frac{\mathrm{tr}(B_j)}{\dim\mathcal{K}_{N,j}}\,T_{sj}(A_j)\otimes\mathbb{1} \quad\text{with}\quad T_{sj}: \mathcal{B}(\mathcal{H}_j) \to \mathcal{B}(\mathcal{H}_s) \tag{7.27}$$

holds if $A_j\otimes B_j \in \mathcal{B}(\mathcal{H}_j\otimes\mathcal{K}_{N,j})$. The operations $T_{sj}$ are unital and have, according to $\tau_U(T) = T$, the following covariance properties:

$$\pi_s(U)\,T_{sj}(A_j)\,\pi_s(U^*) = T_{sj}[\pi_j(U)A_j\pi_j(U^*)] \quad \forall U \in SU(2). \tag{7.28}$$

The classification of all fully symmetric channels $T$ is reduced therefore to the study of all these $T_{sj}$. We can apply now the covariant version of Stinespring's theorem (Theorem 3.3) to find that

$$T_{sj}(A_j) = V^*(A_j\otimes\mathbb{1})V, \quad V: \mathcal{H}_s \to \mathcal{H}_j\otimes\tilde{\mathcal{H}}, \quad V\pi_s(U) = \pi_j(U)\otimes\tilde\pi(U)\,V, \tag{7.29}$$

where $\tilde\pi$ is a representation of $SU(2)$ on $\tilde{\mathcal{H}}$. If $\tilde\pi$ is irreducible with total angular momentum $l$ the "intertwining operator" $V$ is well known: its components in a particularly chosen basis coincide with certain Clebsch–Gordan coefficients. Hence, the corresponding operation is uniquely determined (up to unitary equivalence) and we write

$$T_{sjl}(A_j) = V_l^*(A_j\otimes\mathbb{1})V_l, \quad V_l\pi_s(U) = \pi_j(U)\otimes\pi_l(U)\,V_l, \tag{7.30}$$

where $l$ can range from $|j-s|$ to $j+s$. Since a general representation $\tilde\pi$ can be decomposed into irreducible components we see that each covariant $T_{sj}$ is a convex linear combination of the $T_{sjl}$ and we get with Eq. (7.27)

$$T(A_j\otimes B_j) = \bigoplus_s \sum_l c_{jl}\,[T_{sjl}(A_j)\otimes(\mathrm{tr}(B_j)\mathbb{1})], \tag{7.31}$$

where the $c_{jl}$ are constrained by $c_{jl} \ge 0$ and $\sum_j c_{jl} = (\dim\mathcal{K}_{N,j})^{-1}$. In this way we have parameterized the set of fully symmetric operations completely in terms of group theoretical data and we can rewrite $F_{\#,\natural}(T)$ accordingly. This leads to an optimization problem for a quantity depending only on $s$, $j$ and $l$, which is at least in some cases solvable.

To generalize the scheme just presented to the case $\mathcal{H} = \mathbb{C}^d$ with arbitrary $d$ we only have to find a replacement for the decomposition in Eq. (7.26). This, however, is well known from group theory:

$$\mathcal{H}^{\otimes N} \cong \bigoplus_Y \mathcal{H}_Y\otimes\mathcal{K}_Y, \quad U^{\otimes N} \cong \bigoplus_Y \pi_Y(U)\otimes\mathbb{1}, \quad V_p \cong \bigoplus_Y \mathbb{1}\otimes\hat\pi_Y(p), \tag{7.32}$$

where $\pi_Y: U(d) \to \mathcal{B}(\mathcal{H}_Y)$ and $\hat\pi_Y: S_N \to \mathcal{B}(\mathcal{K}_Y)$ are irreducible representations. The summation index $Y$ runs over all Young frames with $d$ rows and $N$ boxes, i.e. over the arrangements of $N$ boxes into $d$ rows of lengths $Y_1 \ge Y_2 \ge \dots \ge Y_d \ge 0$ with $\sum_k Y_k = N$. The relation to the total angular momentum $s$ used as the parameter for $d = 2$ is given by $Y_1 - Y_2 = 2s$, which determines $Y$ together with $Y_1 + Y_2 = N$ completely. The rest of the arguments applies without significant changes; this is in particular the case for Eq. (7.31), which holds for general $d$ if we replace $s$, $j$ and $l$ by Young frames. However, the representation theory of $U(d)$ becomes much more difficult. The generalization of results available for qubits ($d = 2$) to $d > 2$ is therefore not straightforward.

Finally, let us give a short comment on Gaussian states here. Obviously, the methods just described do not apply in this case. However, we can consider, instead of $U^{\otimes N}$-covariance, covariance with respect to phase-space translations. Following this idea some results concerning optimal cloning of Gaussian states have been obtained (see [43] and the references therein), but the corresponding general theory is not as far developed as in the finite-dimensional case.
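The labelling by Young frames used in Eq. (7.32) can be made concrete with a small enumeration. The sketch below lists all frames with $d$ rows and $N$ boxes and, for $d = 2$, recovers the spin label through $Y_1 - Y_2 = 2s$. The function names are hypothetical and only the combinatorics is illustrated, not the representations themselves.

```python
from fractions import Fraction
from typing import Iterator, Optional


def young_frames(n_boxes: int, n_rows: int,
                 max_len: Optional[int] = None) -> Iterator[tuple[int, ...]]:
    """Yield row lengths (Y_1 >= ... >= Y_d >= 0) with Y_1 + ... + Y_d = n_boxes."""
    if max_len is None:
        max_len = n_boxes
    if n_rows == 0:
        if n_boxes == 0:
            yield ()
        return
    # Choose the first row, then fill the remaining rows with shorter-or-equal rows.
    for first in range(min(n_boxes, max_len), -1, -1):
        for rest in young_frames(n_boxes - first, n_rows - 1, first):
            yield (first,) + rest


def spins_from_frames(n_boxes: int) -> list[Fraction]:
    """Total spin s = (Y_1 - Y_2)/2 for every two-row frame with n_boxes boxes."""
    return [Fraction(y1 - y2, 2) for (y1, y2) in young_frames(n_boxes, 2)]
```

For $N = 4$ and $d = 2$ the frames are $(4,0)$, $(3,1)$ and $(2,2)$, corresponding to $s = 2, 1, 0$, in agreement with Eq. (7.21).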
532
M. Keyl / Physics Reports 369 (2002) 431 – 548
7.1.4. Distillation of entanglement
Finally, let us have another look at distillation of entanglement. The basic idea is much the same as for optimal cloning: use multiple inputs to approximate a task which is impossible with one-shot operations. From a more technical point of view, however, it does not fit into the general scheme proposed up to now. Nevertheless, some of the arguments can be adopted easily. First of all we have to replace the “one-particle” Hilbert space $\mathcal{H}$ with a twofold tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$, and the channels we have to look at are LOCC operations
\[
T : \mathcal{B}(\mathcal{H}_A^{\otimes M} \otimes \mathcal{H}_B^{\otimes M}) \to \mathcal{B}(\mathcal{H}_A^{\otimes N} \otimes \mathcal{H}_B^{\otimes N}),
\tag{7.33}
\]
cf. Section 4.3. Our aim is to determine $T$ such that $T^*(\rho^{\otimes N})$ is, for each distillable (mixed) state $\rho \in \mathcal{B}^*(\mathcal{H}_A \otimes \mathcal{H}_B)$, close to the $M$-fold tensor product $|\Psi\rangle\langle\Psi|^{\otimes M}$ of a maximally entangled state $\Psi \in \mathcal{H}_A \otimes \mathcal{H}_B$. A figure of merit with a structure similar to the $F_{\#,\mathrm{all}}$ studied above can be derived directly from the definition of the entanglement measure $E_D$ in Section 5.1.3. We define (replacing the trace-norm distance with a fidelity)
\[
F_D(T) = \inf_{\rho} \inf_{\Psi} \langle \Psi^{\otimes M}, T^*(\rho^{\otimes N}) \Psi^{\otimes M} \rangle,
\tag{7.34}
\]
where the infima are taken over all maximally entangled states $\Psi$ and all distillable states $\rho$. Alternatively, we can look at state-dependent measures, which seem to be particularly important if we try to calculate $E_D(\rho)$ for some state $\rho$. In this case we simply get
\[
F_{D,\rho}(T) = \inf_{\Psi} \langle \Psi^{\otimes M}, T^*(\rho^{\otimes N}) \Psi^{\otimes M} \rangle.
\tag{7.35}
\]
Translating the group-theoretical analysis of the last two subsections is somewhat more difficult. As in the case of $F_{\#,\natural}$ we can restrict the search for optimizers to permutation invariant operations, i.e. $\alpha_p(T) = T$ and $\beta_p(T) = T$ in the terminology of Section 7.1.2. Unitary covariance
\[
U^{\otimes N} T(A) U^{*\otimes N} = T(U^{\otimes M} A U^{*\otimes M}),
\tag{7.36}
\]
however, cannot be assumed for all unitaries $U$ of $\mathcal{H}_A \otimes \mathcal{H}_B$, but only for local ones ($U = U_A \otimes U_B$) in the case of $F_D$, or only for local $U$ which leave $\rho$ invariant in the case of $F_{D,\rho}$. This makes the analog of the decomposition scheme from Section 7.1.3 more difficult, and such a study has (to my knowledge) not yet been done. A related subproblem arises if we consider $F_{D,\rho}$ from Eq. (7.35) for a state $\rho$ with special symmetry properties, e.g. an OO-invariant state. The corresponding optimization might be simpler, and a solution would be relevant for the calculation of $E_D$.

7.2. Optimal devices
Now we can consider the optimization problems associated with the figures of merit discussed in the last section. This means that we are searching for those devices which approximate the impossible tasks in question in the best possible way. As pointed out at the beginning of this section, this can be done for finite $N$ and in the limit $N \to \infty$. The latter is postponed to the next section.

7.2.1. Optimal cloning
The quality of an optimal pure-state cloner is defined by the figures of merit $F_{c,\#}$ in Eqs. (7.2) and (7.3), and the group-theoretic ideas sketched in Section 7.1.3 allow the complete solution of
this problem. We will demonstrate some of the basic ideas in the qubit case first and state the final result afterwards in full generality. The solvability of this problem relies in part on the special structure of the figures of merit $F_{c,\#}$, which allows further simplifications of the general scheme sketched in Section 7.1.3. If we consider e.g. $F_{c,1}(T)$ (the other case works similarly) we get
\begin{align}
F_{c,1}(T) &= \inf_{j=1,\dots,N}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(\sigma^{(j)} T^*(\sigma^{\otimes N})\bigr) \tag{7.37} \\
&= \inf_{j=1,\dots,N}\ \inf_{\sigma\ \mathrm{pure}} \mathrm{tr}\bigl(T(\sigma^{(j)}) \sigma^{\otimes N}\bigr) \tag{7.38} \\
&= \inf_{j=1,\dots,N}\ \inf_{\psi} \langle \psi^{\otimes N}, T(\sigma^{(j)}) \psi^{\otimes N} \rangle. \tag{7.39}
\end{align}
Hence $F_{c,\#}$ only depends on the $\mathcal{B}(\mathcal{H}_+^{\otimes N})$ component of $T$ (where $\mathcal{H}_+^{\otimes N}$ again denotes the Bose subspace of $\mathcal{H}^{\otimes N}$), and we can assume without loss of generality that $T$ is of the form
\[
T : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}_+^{\otimes N}).
\tag{7.40}
\]
The restriction of $U^{\otimes N}$ to $\mathcal{H}_+^{\otimes N}$ is an irreducible representation (for any $d$), and in the qubit case ($d = 2$) we have $U^{\otimes N}\psi = \pi_s(U)\psi$ with $s = N/2$ for all $\psi \in \mathcal{H}_+^{\otimes N}$. The decomposition of $T$ from Eq. (7.27) therefore contains only those summands with $s = N/2$. This simplifies the optimization problem significantly, since the number of variables needed to parametrize all relevant cloning maps according to Eq. (7.31) is reduced from 3 to 2. A more detailed (and non-trivial) analysis shows that the maximum for $F_{c,1}$ and $F_{c,\mathrm{all}}$ is attained if all terms in (7.31) except the one with $s = N/2$, $j = N/2$ and $l = (M - N)/2$ vanish. The precise result is stated in the following theorem ([68,31,32] for qubits and [166,98] for general $d$).
Theorem 7.1. For each H = Cd both
$\hat{T}$ is the unique solution for both optimization problems; i.e. there is no other operation $T$ of the form (7.40) which maximizes $F_{c,1}$ or $F_{c,\mathrm{all}}$.

There are two aspects of this result which deserve special attention. One is the relation to state estimation, which is postponed to Section 7.2.3. The second concerns the role of correlations: it does not matter whether we are looking for the quality of each single clone ($F_{c,1}$) only, or whether correlations are taken into account ($F_{c,\mathrm{all}}$). In both cases we get the same optimal solution. This is
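a special feature of pure states, however. Although there are no concrete results for quantum systems, it can be checked quite easily in the classical case that considering correlations changes the optimal cloner for arbitrary mixed states drastically.

7.2.2. Puri
\begin{align}
\rho(\beta) &= \frac{1}{2\cosh(\beta)} \exp\Bigl(2\beta\,\frac{\sigma_3}{2}\Bigr) = \frac{1}{e^{\beta} + e^{-\beta}} \begin{pmatrix} e^{\beta} & 0 \\ 0 & e^{-\beta} \end{pmatrix} \tag{7.44} \\
&= \tanh(\beta)\,|\psi\rangle\langle\psi| + (1 - \tanh(\beta))\,\frac{\mathbb{1}}{2}, \qquad \psi = |0\rangle. \tag{7.45}
\end{align}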
The parameterization of $\rho$ in terms of the “pseudo-temperature” $\beta$ is chosen here because it simplifies some calculations significantly (as we will see soon). The relation to the form of $\rho = R^*\sigma$ initially given in Eq. (7.4) is obviously $\vartheta = \tanh(\beta)$. To state the main result of this subsection we have to decompose the product state $\rho(\beta)^{\otimes N}$ into spin-$s$ components. This can be done in terms of Eq. (7.26). $\rho(\beta)$ is not unitary, of course. However, we can apply (7.26) by analytic continuation, i.e. we treat $\rho(\beta)$ in the same way as we would $\exp(i\beta\sigma_3)$. It is then straightforward to get
\[
\rho(\beta)^{\otimes N} = \bigoplus_s w_N(s)\, \rho_s(\beta) \otimes \frac{\mathbb{1}}{\dim \mathcal{K}_{N,s}}
\tag{7.46}
\]
with
\[
w_N(s) = \frac{\sinh((2s+1)\beta)}{\sinh(\beta)\,(2\cosh(\beta))^N}\, \dim \mathcal{K}_{N,s}
\quad\text{and}\quad
\rho_s(\beta) = \frac{\sinh(\beta)}{\sinh((2s+1)\beta)}\, \exp\bigl(2\beta L_3^{(s)}\bigr),
\tag{7.47}
\]
where $L_3^{(s)}$ is the three-component of angular momentum in the spin-$s$ representation and the dimension of $\mathcal{K}_{N,s}$ is given in Eq. (7.25). By (7.23) the representation space of $\pi_s$ coincides with the symmetric tensor product $\mathcal{H}_+^{\otimes 2s}$. Hence we can interpret $\rho_s(\beta)$ as a state of $2s$ (indistinguishable) particles. In other words, the decomposition of $\rho(\beta)^{\otimes N}$ leads in a natural way to a family of operations
\[
Q_s : \mathcal{B}(\mathcal{H}_+^{\otimes 2s}) \to \mathcal{B}(\mathcal{H}^{\otimes N})
\quad\text{with}\quad
Q_s^*[\rho(\beta)^{\otimes N}] = \rho_s(\beta).
\tag{7.48}
\]
We can think of the family $Q_s$ of operations as an instrument $Q$ which measures the number of output systems and transforms $\rho(\beta)^{\otimes N}$ to the appropriate $\rho_s(\beta)$. The crucial point is now that the purity of $\rho_s(\beta)$, measured in terms of fidelities with respect to $\psi$, increases provided $s > 1/2$ holds.
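The weights $w_N(s)$ in Eq. (7.47) form a probability distribution over the spin sectors, since $\rho(\beta)^{\otimes N}$ has trace one. The Python sketch below checks this numerically. It assumes the standard multiplicity formula $\dim \mathcal{K}_{N,s} = \frac{2s+1}{N/2+s+1}\binom{N}{N/2-s}$ for the spin-$s$ sector of $N$ qubits, which is taken here as a stand-in for Eq. (7.25) (that equation is not reproduced in this section). The function names are hypothetical.

```python
import math


def dim_multiplicity(n, twos):
    """Multiplicity of spin s = twos/2 in N = n qubits.

    Assumption: the standard formula (2s+1)/(N/2+s+1) * C(N, N/2-s),
    used in place of Eq. (7.25) of the text.
    """
    if twos < 0 or twos > n or (n - twos) % 2:
        return 0
    return (twos + 1) * math.comb(n, (n - twos) // 2) // ((n + twos) // 2 + 1)


def weight(n, twos, beta):
    """w_N(s) as written in Eq. (7.47), with s = twos/2 and beta > 0."""
    return (math.sinh((twos + 1) * beta) * dim_multiplicity(n, twos)
            / (math.sinh(beta) * (2 * math.cosh(beta)) ** n))


def weights(n, beta):
    """Map each admissible spin s to its weight w_N(s)."""
    return {twos / 2: weight(n, twos, beta) for twos in range(n % 2, n + 1, 2)}


# The spin sectors exhaust the N-qubit space, and the weights sum to one.
total_dimension = sum((t + 1) * dim_multiplicity(5, t) for t in range(1, 6, 2))
total_weight = sum(weights(5, 0.7).values())
```

With growing $\beta$ (less noise) the distribution shifts towards the maximal spin $s = N/2$, which is the sector occupied by a pure product state.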
Hence, we can think of $Q$ as a purifier which arises naturally by reduction to irreducible spin components [46]. Unfortunately, $Q$ does not produce a fixed number of output systems. The most obvious way to construct a device which always produces the same number $M$ of outputs is to run the optimal $2s \to M$ cloner $\hat{T}_{2s\to M}$ if $2s < M$, or to drop $2s - M$ particles if $M \leq 2s$ holds. More precisely, we can define $\hat{Q} : \mathcal{B}(\mathcal{H}^{\otimes M}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ by
\[
\hat{Q}^*[\rho(\beta)^{\otimes N}] = \sum_s w_N(s)\, \hat{T}^*_{2s\to M}[\rho_s(\beta)]
\tag{7.49}
\]
with
\[
\hat{T}^*_{2s\to M}(\rho) =
\begin{cases}
\dfrac{d[2s]}{d[M]}\, S_M(\rho \otimes \mathbb{1}) S_M & \text{for } M > 2s, \\[2mm]
\mathrm{tr}_{2s-M}\, \rho & \text{for } M \leq 2s.
\end{cases}
\tag{7.50}
\]
Here $\mathrm{tr}_{2s-M}$ denotes the partial trace over the first $2s - M$ tensor factors. Applying the general scheme of Section 7.1.3 shows that this is the best way to get exactly $M$ purified qubits [100]: Theorem 7.2. The operation $\hat{Q}$ de
s
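with
\[
2 f_1(M, \beta, s) - 1 =
\begin{cases}
\dfrac{2s+1}{2s} \coth((2s+1)\beta) - \dfrac{1}{2s} \coth\beta & \text{for } 2s > M, \\[2mm]
\dfrac{1}{2s+2}\, \dfrac{M+2}{M}\, \bigl((2s+1) \coth((2s+1)\beta) - \coth\beta\bigr) & \text{for } 2s \leq M,
\end{cases}
\]
and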
2s + 1 1 − e−2? M + 1 1 − e−(4s+2)? −1 fall (M; ?; s) = −2? 2s K 1 − e e2?(K −s) 1 − e−(4s+2)? M M K
(7.52)
M 6 2s (7.53) M ¿ 2s:
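The one-qubit expression (7.52) can be checked against a direct computation. The Python sketch below evaluates $f_1(M, \beta, s)$ and compares $2f_1 - 1$ in the case $2s > M$ with the polarization $\langle L_3 \rangle / s$ of the state $\rho_s(\beta) \propto \exp(2\beta L_3^{(s)})$ from Eq. (7.47). That comparison rests on an assumption made for this illustration: dropping particles from a permutation-symmetric state leaves each remaining qubit with Bloch component $\langle L_3 \rangle / s$. Function names are hypothetical.

```python
import math


def f1(m, beta, twos):
    """One-qubit fidelity f_1(M, beta, s) of Eq. (7.52); twos = 2s >= 1, beta > 0."""
    bracket = ((twos + 1) / math.tanh((twos + 1) * beta)
               - 1 / math.tanh(beta))
    if twos > m:
        prefactor = 1 / twos                      # drop 2s - M particles
    else:
        prefactor = (m + 2) / (m * (twos + 2))    # clone 2s -> M
    return 0.5 * (1 + prefactor * bracket)


def polarization(beta, twos):
    """<L_3>/s in the state proportional to exp(2*beta*L_3), by direct summation."""
    ms = [k - twos / 2 for k in range(twos + 1)]
    boltzmann = [math.exp(2 * beta * m) for m in ms]
    mean_l3 = sum(m * w for m, w in zip(ms, boltzmann)) / sum(boltzmann)
    return mean_l3 / (twos / 2)


# For 2s = 1 nothing is gained: the fidelity is that of a single input qubit,
# whose Bloch component is tanh(beta). For larger s the output is purer.
unpurified = 2 * f1(1, 0.5, 1) - 1
purified = 2 * f1(1, 0.5, 4) - 1
```

In this sketch the $2s = 1$ sector reproduces $\vartheta = \tanh\beta$, while sectors with larger $s$ give a strictly larger Bloch component, in line with the statement that the purity increases for $s > 1/2$.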
The expressions for the optimal fidelities given here look rather complicated and are not very illuminating. We have therefore plotted both quantities as functions of $\vartheta$ (Fig. 7.1), of $N$ (Fig. 7.2) and of $M$ (Fig. 7.3). While the first two plots look quite similar, the functional behavior in dependence of $M$ seems to be very different. The study of the asymptotic behavior in the next section will give a precise analysis of this observation.

7.2.3. Estimating pure states
We have already seen in Section 4.2 that the cloning problem and state estimation are closely related, because we can construct an approximate cloner $T$ from an estimator $E$ simply by running
Fig. 7.1. One- and all-qubit fidelities of the optimal purifier for $N = 100$ and $M = 10$, plotted as a function of the noise parameter $\vartheta$.

Fig. 7.2. One- and all-qubit fidelities of the optimal purifier for $\vartheta = 0.5$ and $M = 10$, plotted as a function of $N$.
$E$ on the $N$ input states, and preparing $M$ systems according to the attained classical information. In this section we want to go the other way round and show that the optimal cloner derived in Theorem 7.1 leads immediately to an optimal pure state estimator; cf. [33]. To this end let us assume that $E$ has the form (cf. Section 4.2)
\[
\mathcal{C}(X) \ni f \mapsto E(f) = \sum_{\sigma \in X} f(\sigma) E_\sigma \in \mathcal{B}(\mathcal{H}^{\otimes N}),
\tag{7.54}
\]
Fig. 7.3. One- and all-qubit fidelities of the optimal purifier for $\vartheta = 0.5$ and $N = 10$, plotted as a function of $M$.
where $X \subset \mathcal{B}^*(\mathcal{H})$ is a finite set$^{23}$ of pure states. The quality of $E$ can be measured in analogy to Section 7.1.1 by a fidelity-like quantity
\[
F_s(E) = \inf_{\psi \in \mathcal{H}} \langle \psi, \rho_\psi \psi \rangle = \inf_{\psi \in \mathcal{H}} \sum_{\sigma \in X} \langle \psi^{\otimes N}, E_\sigma \psi^{\otimes N} \rangle \langle \psi, \sigma \psi \rangle,
\tag{7.55}
\]
where $\rho_\psi = \sum_{\sigma} \langle \psi^{\otimes N}, E_\sigma \psi^{\otimes N} \rangle\, \sigma$ is the (density matrix valued) expectation value of $E$ and the infimum is taken over all pure states $\psi$. Hence $F_s(E)$ measures the worst fidelity of $\rho_\psi$ with respect to the input state $\psi$. If we now construct a cloner $T_E$ from $E$ by
\[
T_E^*(|\psi\rangle\langle\psi|^{\otimes N}) = \sum_{\sigma} \langle \psi^{\otimes N}, E_\sigma \psi^{\otimes N} \rangle\, \sigma^{\otimes M},
\tag{7.56}
\]
its one-particle fidelity $F_{c,1}(T_E)$ obviously coincides with $F_s(E)$. Since we can produce in this way arbitrarily many clones of the same quality, we see that $F_s(E)$ is smaller than $F_{c,1}(N, M)$ for all $M$ and therefore
\[
F_s(E) \leq F_{c,1}(N, \infty) = \lim_{M \to \infty} F_{c,1}(N, M) = \frac{d-1}{d}\, \frac{N}{N+d},
\tag{7.57}
\]
where we can regard $F_{c,1}(N, \infty)$ as the optimal quality of a cloner which produces arbitrarily many outputs from $N$ input systems. To see that this bound can be saturated, consider an asymptotically exact family
\[
\mathcal{C}(X_M) \ni f \mapsto E^M(f) = \sum_{\sigma \in X_M} f(\sigma) E^M_\sigma \in \mathcal{B}(\mathcal{H}^{\otimes M}), \qquad X_M \subset \mathcal{S}(\mathcal{H})
\tag{7.58}
\]
of estimators, i.e. the error probabilities (4.17) vanish in the limit $N \to \infty$. If the $E^M_\sigma \in \mathcal{B}(\mathcal{H}^{\otimes M})$ are pure tensor products (i.e. the $E^M$ are realized by a “quorum” of observables as described in Section 4.2.1) they cannot distinguish between the output state $\hat{T}^*(\sigma^{\otimes N})$ (which is highly correlated) and the pure product state $\tilde{\sigma}^{\otimes M}$, where $\tilde{\sigma} \in \mathcal{B}^*(\mathcal{H})$ denotes the partial trace over $M - 1$ tensor factors (due to permutation invariance it does not matter which factors we trace away here). Hence if we apply $E^M$ to the output of the optimal $N$ to $M$ cloner $\hat{T}_{N\to M}$ we get an estimate for $\tilde{\sigma}$, and in the limit $M \to \infty$ this estimate is exact. The fidelity $\langle \psi, \tilde{\sigma} \psi \rangle$ of $\tilde{\sigma}$ with respect to the pure input state $\psi$ of $\hat{T}_{N\to M}$ coincides, however, with $F_{c,1}(N, M)$. Hence the composition of $\hat{T}_{N\to M}$ with $E^M$ converges$^{24}$ to an estimator $E$ with $F_s(E) = F_{c,1}(N, \infty)$. We can rephrase this result roughly in the form: “producing infinitely many optimal clones of a pure state is the same as estimating optimally”.

$^{23}$ The generalization of the following considerations to continuous sets and a measure-theoretic setup is straightforward and does not lead to a different result; i.e. we cannot improve the estimation quality with continuous observables.

7.2.4. The UNOT gate
The discussion of the last subsection shows that the optimal cloner $\hat{T}_{N\to M}$ produces better clones than any estimation-based scheme (as in Eq. (7.56)), as long as we are interested only in
The dependence on the number $M$ of outputs is not interesting here, because the optimal device produces arbitrarily many copies of the same quality.

$^{24}$ Basically convergence must be shown here. It follows however easily from the corresponding property of the $E^M$.

7.3. Asymptotic behaviour
If a device, such as the optimal cloner, is given which produces $M$ output systems from $N$ inputs, it is interesting to ask for the maximal rate, i.e. the maximal ratio $M(N)/N$ in the limit $N \to \infty$ such that the asymptotic fidelity $\lim_{N\to\infty} F(N, M(N))$ is above a certain threshold (preferably equal to one). Note that this type of question was very important as well for distillation of entanglement and channel capacities, but hardly computable there. In the current context this type of question is somewhat easier to answer. This relies on the one hand on the group theoretical structure presented
in the last section and on the other on the close relation to quantum state estimation. We therefore start this section with a look at some aspects of the asymptotics of mixed state estimation.

7.3.1. Estimating mixed state
If we do not know a priori that the input systems are in a pure state, much less is known about estimating and cloning. It is, in particular, almost impossible to say anything about optimality for finitely many input systems (except if $N$ is very small, e.g. [156]). Nevertheless, some strong results are available for the behavior in the limit $N \to \infty$, and we will give here a short review of some of them. One quantity of interest for a family of estimators $E^N$ in the limit $N \to \infty$ is the variance of the $E^N$. To state some results in this context it is convenient to parameterize the state space $\mathcal{S}(\mathcal{H})$ or parts of it in terms of $n$ real parameters $x = (x_1, \dots, x_n) \in U \subset \mathbb{R}^n$ and to write $\rho(x)$ for the corresponding state. If we want to cover all states, one particular parameterization is e.g. the generalized Bloch ball from Section 2.1.2. An estimator taking $N$ input systems is now a (discrete) observable $E^N_x \in \mathcal{B}(\mathcal{H}^{\otimes N})$, $x \in X_N$, with values in a (finite) subset $X_N$ of $U$. The expectation value of $E^N$ in the state $\rho(x)^{\otimes N}$ is therefore the vector $\langle E^N \rangle_x$ with components $\langle E^N \rangle_{x,j}$, $j = 1, \dots, n$, given by
\[
\langle E^N \rangle_{x,j} = \sum_{y \in X_N} y_j\, \mathrm{tr}\bigl(E^N_y \rho(x)^{\otimes N}\bigr)
\tag{7.60}
\]
and the mean quadratic error is described by the matrix
\[
V^N_{jk}(x) = \sum_{y \in X_N} \bigl(\langle E^N \rangle_{x,j} - y_j\bigr)\bigl(\langle E^N \rangle_{x,k} - y_k\bigr)\, \mathrm{tr}\bigl(E^N_y \rho(x)^{\otimes N}\bigr).
\tag{7.61}
\]
For a good estimation strategy we expect that $V^N_{jk}(x)$ decreases as $1/N$, i.e.
\[
V^N_{jk}(x) \simeq \frac{W_{jk}(x)}{N},
\tag{7.62}
\]
where the scaled mean quadratic error matrix $W_{jk}(x)$ does not depend on $N$. The task is now to find bounds on this matrix. We will state here one result taken from [66]. To this end we need the Helstrom quantum information matrix
\[
H_{jk}(x) = \mathrm{tr}\Bigl[\rho(x)\, \frac{\lambda_j(x)\lambda_k(x) + \lambda_k(x)\lambda_j(x)}{2}\Bigr],
\tag{7.63}
\]
which is defined in terms of the symmetric logarithmic derivatives $\lambda_j$, which in turn are implicitly given by
\[
\frac{\partial \rho(x)}{\partial x_j} = \frac{\lambda_j(x)\rho(x) + \rho(x)\lambda_j(x)}{2}.
\tag{7.64}
\]
Now we have the following theorem [66]:

Theorem 7.4. Consider a family of estimators $E^N$, $N \in \mathbb{N}$, as described above such that the following conditions hold:
1. The scaled mean quadratic error matrix $N V^N_{jk}(x)$ converges uniformly in $x$ to $W_{jk}(x)$ as $N \to \infty$.
2. $W_{jk}(x)$ is continuous at a point $x_0 = x$.
3. $H_{jk}(x)$ and its derivatives are bounded in a neighborhood of $x_0$.
Then we have
\[
\mathrm{tr}\bigl[H^{-1}(x_0) W^{-1}(x_0)\bigr] \leq (d - 1).
\tag{7.65}
\]

For qubits this bound can be attained by a particular estimation strategy which measures each qubit separately. We refer to [66] for details. A second quantity of interest in the limit $N \to \infty$ is the error probability defined in Section 4.2; cf. Eq. (4.17). For a good estimation strategy it should of course go to zero; an additional question, however, concerns the rate at which this happens. We will review here a result from [99] which concerns the subproblem of estimating the spectrum. Hence we are looking now at a family of observables $E^N : \mathcal{C}(X_N) \to \mathcal{B}(\mathcal{H}^{\otimes N})$, $N \in \mathbb{N}$, taking their values in a finite subset $X_N$ of the set
\[
U = \Bigl\{ (x_1, \dots, x_d) \in \mathbb{R}^d \;\Big|\; x_1 \geq \cdots \geq x_d \geq 0,\ \sum_j x_j = 1 \Bigr\}
\tag{7.66}
\]
of ordered spectra of density operators on $\mathcal{H} = \mathbb{C}^d$. Our aim is to determine the behavior of the error probabilities (cf. Eq. (4.17))
\[
K_N(\Delta) = \sum_{x \in \Delta \cap X_N} \mathrm{tr}\bigl(E^N_x \rho^{\otimes N}\bigr)
\tag{7.67}
\]
in the limit $N \to \infty$. Following the general arguments in Section 7.1.2 we can restrict our attention here to covariant observables, i.e. we can assume without loss of cloning quality that the $E^N_x$ commute with all permutation unitaries $V_p$, $p \in S_N$, and all local unitaries $U^{\otimes N}$, $U \in U(d)$. If we restrict our attention in addition to projection-valued measures, which is suggestive for ruling out unnecessary fuzziness, we see that each $E^N_x$ must coincide with a (sum of) projections $P_Y$ from $\mathcal{H}^{\otimes N}$ onto the $U(d)$, respectively $V_p$, invariant subspace $\mathcal{H}_Y \otimes \mathcal{K}_Y$, which is defined in Eq. (7.32), where $Y = (Y_1, \dots, Y_d)$ refers here to Young frames with $d$ rows and $N$ boxes. The only remaining freedom for the $E^N$ is the assignment $x(Y) \in U$ of Young frames (and therefore projections $E^N$) to points in $U$. Since the Young frames themselves have, up to normalization, the same structure as the elements of $U$, one possibility for $s(Y)$ is just $s(Y) = Y/N$. Written as a quantum to classical channel this is
\[
\mathcal{C}(X_N) \ni f \mapsto \sum_Y f(Y/N) P_Y \in \mathcal{B}(\mathcal{H}^{\otimes N}),
\tag{7.68}
\]
where $X_N \subset U$ is the set of normalized Young frames, i.e. all $Y/N$ if $Y$ has $d$ rows and $N$ boxes. It turns out, somewhat surprisingly, that this choice leads indeed to an asymptotically exact estimation strategy with exponentially decaying error probability (7.67). The following theorem can be proven with methods from the theory of large deviations:

Theorem 7.5. The family of estimators $E^N$, $N \in \mathbb{N}$, given in Eq. (7.68) is asymptotically exact; i.e. the error probabilities $K_N(\Delta)$ vanish in the limit $N \to \infty$ if $\Delta$ is a complement of a ball around
the spectrum $r \in U$ of $\rho$. If $\Delta$ is a set (possibly containing $r$) whose interior is dense in its closure, we have the asymptotic estimate for $K_N(\Delta)$:
\[
\lim_{N \to \infty} \frac{1}{N} \ln K_N(\Delta) = -\inf_{s \in \Delta} I(s),
\tag{7.69}
\]
where the “rate function” $I : U \to \mathbb{R}$ is just the relative entropy between the two probability vectors $s$ and $r$:
\[
I(s) = \sum_j s_j (\ln s_j - \ln r_j).
\tag{7.70}
\]
To make this statement more transparent, note that we can rephrase (7.69) as
\[
K_N(\Delta) \approx \exp\Bigl(-N \inf_{s \in \Delta} I(s)\Bigr).
\tag{7.71}
\]
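The rate function (7.70) is easy to evaluate numerically. The following Python sketch computes $I(s)$ for probability vectors $s$ and $r$ and the resulting exponent $\inf_{s \in \Delta} I(s)$ over a finite list of candidate spectra. The finite candidate list is an assumption made for illustration (the infimum in the theorem runs over the whole set $\Delta$), and the function names are hypothetical.

```python
import math


def rate_function(s, r):
    """Relative entropy I(s) = sum_j s_j (ln s_j - ln r_j), with 0 ln 0 = 0."""
    total = 0.0
    for s_j, r_j in zip(s, r):
        if s_j == 0:
            continue            # convention 0 * ln 0 = 0
        if r_j == 0:
            return math.inf     # s has weight where r has none
        total += s_j * (math.log(s_j) - math.log(r_j))
    return total


def error_exponent(candidates, r):
    """Smallest rate function value over a finite list of candidate spectra."""
    return min(rate_function(s, r) for s in candidates)


# True spectrum r and a few wrong estimates that an error set might contain.
true_spectrum = (0.9, 0.1)
wrong_estimates = [(0.5, 0.5), (0.7, 0.3), (1.0, 0.0)]
exponent = error_exponent(wrong_estimates, true_spectrum)
```

Under these assumptions the probability of landing on one of the wrong estimates decays roughly like $\exp(-N \cdot \texttt{exponent})$, dominated by the candidate closest to $r$ in relative entropy.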
Since the rate function $I$ vanishes only for $s = r$, we see that the probability measures $K_N$ converge (weakly) to a point measure concentrated at $r \in U$. The rate of this convergence is exponential and measured exactly by the function $I$.

7.3.2. Puri
$\rho_s(\beta) \otimes \dfrac{\mathbb{1}}{\dim \mathcal{K}_{N,s}}$, where $P_s$ is the projection from $\mathcal{H}^{\otimes N}$ to $\mathcal{H}_s \otimes \mathcal{K}_{N,s}$. In other words, $P_s$ is equal to $P_Y$ from Eq. (7.68) if we apply the reparametrization
\[
(Y_1, Y_2) \mapsto (s, N) = \bigl((Y_1 - Y_2)/2,\ Y_1 + Y_2\bigr).
\tag{7.73}
\]
In a similar way we can rewrite the set of ordered spectra by $U \ni (x_1, x_2) \mapsto x_1 - x_2 \in [0, 1]$, and $K_N(\Delta)$ becomes a measure on $[0, 1]$ (i.e. $\Delta \subset [0, 1]$):
\[
K_N(\Delta) = \sum_{2s/N \in \Delta} \mathrm{tr}\bigl(\rho(\beta)^{\otimes N} P_s\bigr) = \sum_{2s/N \in \Delta} w_N(s),
\tag{7.74}
\]
and the sum
\[
F_{R,\#}(N, M(N)) = \sum_s w_N(s)\, f_\#(M(N), \beta, s)
\tag{7.75}
\]
can be rephrased as the integral of a function $[0, 1] \ni x \mapsto \tilde{f}_\#(N, \beta, x) \in \mathbb{R}$ with respect to this measure, provided $\tilde{f}_\#$ is related to $f_\#$ by $\tilde{f}_\#(N, \beta, 2s/N) = f_\#(M(N), \beta, s)$. According to Theorem 7.5 the $K_N$ converge to a point measure concentrated at the ordered spectrum of $\rho(\beta)$; but the latter corresponds, according to the reparametrization above, to the noise parameter $\vartheta = \tanh\beta$. Hence, if
Fig. 7.4. Asymptotic all-qubit fidelity $\Phi(\mu)$ plotted as a function of the rate $\mu$, for $\vartheta = 0.25$, $0.50$, $0.75$ and $1.00$.
the sequence of functions $\tilde{f}_\#(N, \beta, \cdot)$ converges for $N \to \infty$ uniformly (or at least uniformly on a neighborhood of $\vartheta$) to $\tilde{f}_\#(\beta, \cdot)$, we get
\[
\lim_{N \to \infty} F(N, M(N)) = \lim_{N \to \infty} \sum_s w_N(s)\, \tilde{f}_\#(N, \beta, 2s/N) = \tilde{f}_\#(\beta, \vartheta)
\tag{7.76}
\]
for the limit of the fidelities. A precise formulation of this idea leads to the following theorem [100].

Theorem 7.6. The two puri
(7.77)
N →∞ M →∞
and
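\[
\Phi(\mu) = \lim_{\substack{N \to \infty \\ M/N \to \mu}} F_{R,\mathrm{all}}(N, M) =
\begin{cases}
\dfrac{2\vartheta^2}{2\vartheta^2 + \mu(1 - \vartheta)} & \text{if } \mu \leq \vartheta, \\[2mm]
\dfrac{2\vartheta^2}{\mu(1 + \vartheta)} & \text{if } \mu \geq \vartheta.
\end{cases}
\tag{7.78}
\]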
If we are only interested in the quality of each qubit separately, we can produce arbitrarily good purified qubits at any rate. If, on the other hand, the correlations between the output systems should vanish in the limit, the rate is always zero. This can be seen from the function $\Phi$, which is the asymptotic all-qubit fidelity that can be reached at a given rate $\mu$. We have plotted it in Fig. 7.4. Note finally that the results just stated contain the rates of optimal cloning machines as a special case; we only have to set $\vartheta = 1$.
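The behavior of the asymptotic all-qubit fidelity can be explored with a few lines of Python. The sketch below implements the piecewise expression of Eq. (7.78), read as $\Phi(\mu) = 2\vartheta^2 / (2\vartheta^2 + \mu(1-\vartheta))$ for $\mu \leq \vartheta$ and $\Phi(\mu) = 2\vartheta^2 / (\mu(1+\vartheta))$ for $\mu \geq \vartheta$. The function and argument names are chosen for this illustration only.

```python
def asymptotic_all_qubit_fidelity(mu, theta):
    """Phi(mu) from Eq. (7.78) for rate mu >= 0 and noise parameter 0 < theta <= 1."""
    if mu <= theta:
        return 2 * theta ** 2 / (2 * theta ** 2 + mu * (1 - theta))
    return 2 * theta ** 2 / (mu * (1 + theta))


# Both branches agree at mu = theta, where Phi = 2*theta / (1 + theta).
theta = 0.5
left_branch = 2 * theta ** 2 / (2 * theta ** 2 + theta * (1 - theta))
right_branch = 2 * theta ** 2 / (theta * (1 + theta))

# Sampled curve for one of the noise levels shown in Fig. 7.4.
curve = [asymptotic_all_qubit_fidelity(k / 10, theta) for k in range(0, 21)]
```

For any $\vartheta < 1$ the fidelity equals one only at rate $\mu = 0$ and decreases monotonically afterwards, which is the quantitative content of the statement that the rate is zero when correlations are required to vanish. For $\vartheta = 1$ (pure inputs, i.e. cloning) the sketch gives $\Phi(\mu) = 1$ for $\mu \leq 1$ and $1/\mu$ beyond.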
References
[1] A. Acín, A. Andrianov, L. Costa, E. Jané, J.I. Latorre, R. Tarrach, Schmidt decomposition and classification of three-quantum-bit states, Phys. Rev. Lett. 85 (7) (2000) 1560–1563.
[2] C. Adami, N.J. Cerf, Von Neumann capacity of noisy quantum channels, Phys. Rev. A 56 (5) (1997) 3470–3483.
[3] G. Alber, T. Beth, M. Horodecki, R. Horodecki, M. Rötteler, H. Weinfurter, R. Werner, A. Zeilinger (Eds.), Quantum Information, Springer, Berlin, 2001.
[4] A. Ashikhmin, E. Knill, Nonbinary quantum stabilizer codes, IEEE Trans. Inf. Theory 47 (7) (2001) 3065–3072.
[5] A. Aspect, J. Dalibard, G. Roger, Experimental test of Bell’s inequalities using time-varying analyzers, Phys. Rev. Lett. 49 (1982) 1804–1807.
[6] H. Barnum, E. Knill, M.A. Nielsen, On quantum fidelities and channel capacities, IEEE Trans. Inf. Theory 46 (2000) 1317–1329.
[7] H. Barnum, M.A. Nielsen, B. Schumacher, Information transmission through a noisy quantum channel, Phys. Rev. A 57 (6) (1998) 4153–4175.
[8] H. Barnum, J.A. Smolin, B.M. Terhal, Quantum capacity is properly defined without encodings, Phys. Rev. A 58 (5) (1998) 3496–3501.
[9] C.H. Bennett, H.J. Bernstein, S. Popescu, B. Schumacher, Concentrating partial entanglement by local operations, Phys. Rev. A 53 (4) (1996) 2046–2052.
[10] C.H. Bennett, G. Brassard, Quantum key distribution and coin tossing, in: Proceedings of the IEEE International Conference on Computers, Systems, and Signal Processing, Bangalore, India, IEEE, New York, 1984, pp. 175–179.
[11] C.H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, W.K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein–Podolsky–Rosen channels, Phys. Rev. Lett. 70 (1993) 1895–1899.
[12] C.H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J.A. Smolin, W.K. Wootters, Purification of noisy entanglement and faithful teleportation via noisy channels, Phys. Rev. Lett. 76 (5) (1996) 722–725; C.H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J.A. Smolin, W.K. Wootters, Erratum, Phys. Rev. Lett. 78 (10) (1997) 2031.
[13] C.H. Bennett, D.P. DiVincenzo, C.A. Fuchs, T. Mor, E.M. Rains, P.W. Shor, J.A. Smolin, W.K. Wootters, Quantum nonlocality without entanglement, Phys. Rev. A 59 (2) (1999) 1070–1091.
[14] C.H. Bennett, D.P. DiVincenzo, T. Mor, P.W. Shor, J.A. Smolin, B.M. Terhal, Unextendible product bases and bound entanglement, Phys. Rev. Lett. 82 (26) (1999) 5385–5388.
[15] C.H. Bennett, D.P. DiVincenzo, J.A. Smolin, Capacities of quantum erasure channels, Phys. Rev. Lett. 78 (16) (1997) 3217–3220.
[16] C.H. Bennett, D.P. DiVincenzo, J.A. Smolin, W.K. Wootters, Mixed-state entanglement and quantum error correction, Phys. Rev. A 54 (4) (1996) 3824–3851.
[17] C.H. Bennett, P.W. Shor, J.A. Smolin, A.V. Thapliyal, Entanglement-assisted classical capacity of noisy quantum channels, Phys. Rev. Lett. 83 (15) (1999) 3081–3084.
[18] C.H. Bennett, P.W. Shor, J.A. Smolin, A.V. Thapliyal, Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem, 2001, quant-ph/0106052.
[19] C.H. Bennett, S.J. Wiesner, Communication via one- and two-particle operators on Einstein–Podolsky–Rosen states, Phys. Rev. Lett. 20 (1992) 2881–2884.
[20] T. Beth, M. Rötteler, Quantum algorithms: applicable algebra and quantum physics, in: G. Alber, et al. (Eds.), Quantum Information, Springer, Berlin, 2001, pp. 97–150.
[21] E. Biolatti, R.C. Iotti, P. Zanardi, F. Rossi, Quantum information processing with semiconductor macroatoms, Phys. Rev. Lett. 85 (26) (2000) 5647–5650.
[22] D. Boschi, S. Branca, F. De Martini, L. Hardy, S. Popescu, Experimental realization of teleporting an unknown pure quantum state via dual classical and Einstein–Podolsky–Rosen channels, Phys. Rev. Lett. 80 (6) (1998) 1121–1125.
[23] D. Bouwmeester, A.K. Ekert, A. Zeilinger (Eds.), The Physics of Quantum Information: Quantum Cryptography, Quantum Teleportation, Quantum Computation, Springer, Berlin, 2000.
[24] D. Bouwmeester, J.-W. Pan, K. Mattle, M. Eibl, H. Weinfurter, A. Zeilinger, Experimental quantum teleportation, Nature 390 (1997) 575–579.
[25] O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics I, Springer, New York, 1979.
[26] O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics II, Springer, Berlin, 1997.
[27] S.L. Braunstein, C.M. Caves, R. Jozsa, N. Linden, S. Popescu, R. Schack, Separability of very noisy mixed states and implications for NMR quantum computing, Phys. Rev. Lett. 83 (5) (1999) 1054–1057.
[28] G.K. Brennen, C.M. Caves, I.H. Deutsch, F.S. Jessen, Quantum logic gates in optical lattices, Phys. Rev. Lett. 82 (5) (1999) 1060–1063.
[29] K.R. Brown, D.A. Lidar, K.B. Whaley, Quantum computing with quantum dots on linear supports, 2001, quant-ph/0105102.
[30] T.A. Brun, H.L. Wang, Coupling nanocrystals to a high-Q silica microsphere: entanglement in quantum dots via photon exchange, Phys. Rev. A 61 (2000) 032307.
[31] D. Bruß, D.P. DiVincenzo, A. Ekert, C.A. Fuchs, C. Macchiavello, J.A. Smolin, Optimal universal and state-dependent cloning, Phys. Rev. A 57 (4) (1998) 2368–2378.
[32] D. Bruß, A.K. Ekert, C. Macchiavello, Optimal universal quantum cloning and state estimation, Phys. Rev. Lett. 81 (12) (1998) 2598–2601.
[33] D. Bruß, C. Macchiavello, Optimal state estimation for d-dimensional quantum systems, Phys. Lett. A 253 (1999) 249–251.
[34] W.T. Buttler, R.J. Hughes, S.K. Lamoreaux, G.L. Morgan, J.E. Nordholt, C.G. Peterson, Daylight quantum key distribution over 1.6 km, Phys. Rev. Lett. 84 (2000) 5652–5655.
[35] V. Bužek, M. Hillery, Universal optimal cloning of qubits and quantum registers, Phys. Rev. Lett. 81 (22) (1998) 5003–5006.
[36] V. Bužek, M. Hillery, R.F. Werner, Optimal manipulations with qubits: universal-not gate, Phys. Rev. A 60 (4) (1999) R2626–R2629.
[37] A. Cabello, Bibliographic guide to the foundations of quantum mechanics and quantum information, 2000, quant-ph/0012089.
[38] A.R. Calderbank, E.M. Rains, P.W. Shor, N.J.A. Sloane, Quantum error correction and orthogonal geometry, Phys. Rev. Lett. 78 (3) (1997) 405–408.
[39] A.R. Calderbank, P.W. Shor, Good quantum error-correcting codes exist, Phys. Rev. A 54 (1996) 1098–1105.
[40] N.J. Cerf, Asymmetric quantum cloning in any direction, J. Mod. Opt. 47 (2) (2000) 187–209.
[41] N.J. Cerf, C. Adami, Negative entropy and information in quantum mechanics, Phys. Rev. Lett. 79 (26) (1997) 5194–5197.
[42] N.J. Cerf, C. Adami, R.M. Gingrich, Reduction criterion for separability, Phys. Rev. A 60 (2) (1999) 898–909.
[43] N.J. Cerf, S. Iblisdir, G. van Assche, Cloning and cryptography with quantum continuous variables, 2001, quant-ph/0107077.
[44] I.L. Chuang, L.M.K. Vandersypen, X.L. Zhou, D.W. Leung, S. Lloyd, Experimental realization of a quantum algorithm, Nature 393 (1998) 143–146.
[45] A. Church, An unsolved problem of elementary number theory, Am. J. Math. 58 (1936) 345–363.
[46] J.I. Cirac, A.K. Ekert, C. Macchiavello, Optimal purification of single qubits, Phys. Rev. Lett. 82 (1999) 4344–4347.
[47] J.F. Clauser, M.A. Horne, A. Shimony, R.A. Holt, Proposed experiment to test local hidden-variable theories, Phys. Rev. Lett. 23 (15) (1969) 880–884.
[48] J.F. Cornwell, Group Theory in Physics II, Academic Press, London, 1984.
[49] T.M. Cover, J.A. Thomas, Elements of Information Theory, Wiley, Chichester, 1991.
[50] E.B. Davies, Quantum Theory of Open Systems, Academic Press, London, 1976.
[51] B. Demoen, P. Vanheuverzwijn, A. Verbeure, Completely positive maps on the CCR-algebra, Lett. Math. Phys. 2 (1977) 161–166.
[52] D. Deutsch, Quantum theory, the Church–Turing principle and the universal quantum computer, Proc. R. Soc. London A 400 (1985) 97–117.
[53] D. Deutsch, R. Jozsa, Rapid solution of problems by quantum computation, Proc. R. Soc. London A 439 (1992) 553–558.
[54] D.P. DiVincenzo, P.W. Shor, J.A. Smolin, Quantum-channel capacity of very noisy channels, Phys. Rev. A 57 (2) (1998) 830–839; D.P. DiVincenzo, P.W. Shor, J.A. Smolin, Erratum, Phys. Rev. A 59 (2) (1999) 1717.
[55] D.P. DiVincenzo, P.W. Shor, J.A. Smolin, B.M. Terhal, A.V. Thapliyal, Evidence for bound entangled states with negative partial transpose, Phys. Rev. A 61 (6) (2000) 062312.
[56] M.J. Donald, M. Horodecki, Continuity of relative entropy of entanglement, Phys. Lett. A 264 (4) (1999) 257–260.
[57] M.J. Donald, M. Horodecki, O. Rudolph, The uniqueness theorem for entanglement measures, 2001, quant-ph/0105017.
[58] W. Dür, J.I. Cirac, M. Lewenstein, D. Bruß, Distillability and partial transposition in bipartite systems, Phys. Rev. A 61 (6) (2000) 062313.
[59] B. Efron, R.J. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, New York, 1993.
[60] T. Eggeling, K.G.H. Vollbrecht, R.F. Werner, M.M. Wolf, Distillability via protocols respecting the positivity of the partial transpose, Phys. Rev. Lett. 87 (2001) 257902.
[61] T. Eggeling, R.F. Werner, Separability properties of tripartite states with U × U × U-symmetry, Phys. Rev. A 63 (4) (2001) 042111.
[62] A. Feinstein, Foundations of Information Theory, McGraw-Hill, New York, 1958.
[63] D.G. Fischer, M. Freyberger, Estimating mixed quantum states, Phys. Lett. A 273 (2000) 293–302.
[64] G. Giedke, L.-M. Duan, J.I. Cirac, P. Zoller, Distillability criterion for all bipartite Gaussian states, Quant. Inf. Comput. 1 (3) (2001).
[65] G. Giedke, B. Kraus, M. Lewenstein, J.I. Cirac, Separability properties of three-mode Gaussian states, Phys. Rev. A 64 (5) (2001) 052303.
[66] R.D. Gill, S. Massar, State estimation for large ensembles, Phys. Rev. A 61 (2000) 2312–2327.
[67] N. Gisin, Hidden quantum nonlocality revealed by local filters, Phys. Lett. A 210 (3) (1996) 151–156.
[68] N. Gisin, S. Massar, Optimal quantum cloning machines, Phys. Rev. Lett. 79 (11) (1997) 2153–2156.
[69] N. Gisin, G. Ribordy, W. Tittel, H. Zbinden, Quantum Cryptography, 2001, quant-ph/0101098.
[70] D. Gottesman, Class of quantum error-correcting codes saturating the quantum Hamming bound, Phys. Rev. A 54 (1996) 1862–1868.
[71] D. Gottesman, Stabilizer codes and quantum error correction, Ph.D. Thesis, California Institute of Technology, 1997, quant-ph/9705052.
[72] M. Grassl, T. Beth, T. Pellizzari, Codes for the quantum erasure channel, Phys. Rev. A 56 (1) (1997) 33–38.
[73] D.M. Greenberger, M.A. Horne, A. Zeilinger, Going beyond Bell’s theorem, in: M. Kafatos (Ed.), Bell’s Theorem, Quantum Theory, and Conceptions of the Universe, Kluwer Academic Publishers, Dordrecht, 1989, pp. 69–72.
[74] L.K. Grover, Quantum computers can search arbitrarily large databases by a single query, Phys. Rev. A 56 (23) (1997) 4709–4712.
[75] L.K. Grover, Quantum mechanics helps in searching for a needle in a haystack, Phys. Rev. Lett. 79 (2) (1997) 325–328.
[76] J. Gruska, Quantum Computing, McGraw-Hill, New York, 1999.
[77] J. Harrington, J. Preskill, Achievable rates for the Gaussian quantum channel, Phys. Rev. A 64 (6) (2001) 062301.
[78] P.M. Hayden, M. Horodecki, B.M. Terhal, The asymptotic entanglement cost of preparing a quantum state, J. Phys. A. Math. Gen. 34 (35) (2001) 6891–6898.
[79] A.S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory, North-Holland, Amsterdam, 1982.
[80] A.S. Holevo, Coding theorems for quantum channels, Tamagawa University Research Review no. 4, 1998, quant-ph/9809023.
[81] A.S. Holevo, Sending quantum information with Gaussian states, in: Proceedings of the Fourth International Conference on Quantum Communication, Measurement and Computing, Evanston, 1998, quant-ph/9809022.
[82] A.S. Holevo, On entanglement-assisted classical capacity, 2001, quant-ph/0106075.
[83] A.S. Holevo, Statistical Structure of Quantum Theory, Springer, Berlin, 2001.
[84] A.S. Holevo, R.F. Werner, Evaluating capacities of bosonic Gaussian channels, Phys. Rev. A 63 (3) (2001) 032312.
[85] M. Horodecki, P. Horodecki, Reduction criterion of separability and limits for a class of distillation protocols, Phys. Rev. A 59 (6) (1999) 4206–4216.
[86] M. Horodecki, P. Horodecki, R. Horodecki, Separability of mixed states: necessary and sufficient conditions, Phys. Lett. A 223 (1–2) (1996) 1–8.
[87] M. Horodecki, P. Horodecki, R. Horodecki, Mixed-state entanglement and distillation: is there a “bound” entanglement in nature? Phys. Rev. Lett. 80 (24) (1998) 5239–5242.
[88] M. Horodecki, P. Horodecki, R. Horodecki, General teleportation channel, singlet fraction, and quasidistillation, Phys. Rev. A 60 (3) (1999) 1888–1898.
546
M. Keyl / Physics Reports 369 (2002) 431 – 548
[89] M. Horodecki, P. Horodecki, R. Horodecki, Limits for entanglement measures, Phys. Rev. Lett. 84 (9) (2000) 2014–2017. [90] M. Horodecki, P. Horodecki, R. Horodecki, Uni.ed approach to quantum capacities: towards quantum noisy coding theorem, Phys. Rev. Lett. 85 (2) (2000) 433–436. [91] M. Horodecki, P. Horodecki, R. Horodecki, Mixed-state entanglement and quantum communication, in: G. Alber, et al., (Eds.), Quantum Information, Springer, Berlin, 2001, pp. 151–195. [92] P. Horodecki, M. Horodecki, R. Horodecki, Bound entanglement can be activated, Phys. Rev. Lett. 82 (5) (1999) 1056–1059. [93] R.J. Hughes, G.L. Morgan, C.G. Peterson, Quantum key distribution over a 48 km optical .bre network, J. Mod. Opt. 47 (2–3) (2000) 533–547. [94] A. Jamiolkowski, Linear transformations which preserve trace and positive semide.niteness of operators, Rep. Math. Phys. 3 (1972) 275–278. [95] T. Jennewein, C. Simon, G. Weihs, H. Weinfurter, A. Zeilinger, Quantum cryptography with entangled photons, Phys. Rev. Lett. 84 (2000) 4729–4732. [96] J.A. Jones, M. Mosca, R.H. Hansen, Implementation of a quantum search algorithm on a quantum computer, Nature 393 (1998) 344–346. [97] M. Keyl, D. Schlingemann, R.F. Werner, In.nitely entangled states, in preparation. [98] M. Keyl, R.F. Werner, Optimal cloning of pure states, testing single clones, J. Math. Phys. 40 (1999) 3283–3299. [99] M. Keyl, R.F. Werner, Estimating the spectrum of a density operator, Phys. Rev. A 64 (5) (2001) 052311. [100] M. Keyl, R.F. Werner, The rate of optimal puri.cation procedures, Ann H. Poincar_e 2 (2001) 1–26. [101] A.I. Khinchin, Mathematical Foundations of Information Theory, Dover Publications, New York, 1957. [102] B.E. King, C.S. Wood, C.J. Myatt, Q.A. Turchette, D. Leibfried, W.M. Itano, C. Monroe, D.J. Wineland, Cooling the collective motion of trapped ions to initialize a quantum register, Phys. Rev. Lett. 81 (7) (1998) 1525–1528. [103] E. Knill, R. 
LaJamme, Theory of quantum error-correcting codes, Phys. Rev. A 55 (2) (1997) 900–911. [104] B. Kraus, M. Lewenstein, J.I. Cirac, Characterization of distillable and activable states using entanglement witnesses, 2001, quant-ph=0110174. [105] K. Kraus, States E=ects and Operations, Springer, Berlin, 1983. [106] R. Landauer, Irreversibility and heat generation in the computing process, IBM J. Res. Dev. 5 (1961) 183. [107] U. Leonhardt, Measuring the Quantum State of Light, Cambridge University Press, Cambridge, 1997. [108] M. Lewenstein, A. Sanpera, Separability and entanglement of composite quantum systems, Phys. Rev. Lett. 80 (11) (1998) 2261–2264. [109] N. Linden, H. Barjat, R. Freeman, An implementation of the Deutsch–Jozsa algorithm on a three-qubit NMR quantum computer, Chem. Phys. Lett. 296 (1–2) (1998) 61–67. [110] S. Lloyd, Capacity of the noisy quantum channel, Phys. Rev. A 55 (3) (1997) 1613–1622. [111] H.-K. Lo, T. Spiller, S. Popescu (Eds.), Introduction to Quantum Computation and Information, World Scienti.c, Singapore, 1998. [112] Y. Makhlin, G. SchWon, A. Shnirman, Quantum-state engineering with Josephson-junction devices, Rev. Mod. Phys. 73 (2) (2001) 357–400. [113] R. Marx, A.F. Fahmy, J.M. Myers, W. Bermel, S.J. Glaser, Approaching .ve-bit NMR quantum computing, Phys. Rev. A 62 (1) (2000) 012310. [114] R. Matsumoto and T. Uyematsu, Lower bound for the quantum capacity of a discrete memoryless quantum channel, 2001, quant-ph=0105151. [115] K. Mattle, H. Weinfurter, P.G. Kwiat, A. Zeilinger, Dense coding in experimental quantum communication, Phys. Rev. Lett. 76 (25) (1996) 4656–4659. [116] N.D. Mermin, Quantum mysteries revisited, Am. J. Phys. 58 (8) (1990) 731–734. [117] N.D. Mermin, What’s wrong with these elements of reality? Phys. Today 43 (6) (1990) 9–11. [118] H.C. Nagerl, W. Bechter, J. Eschner, F. Schmidt-Kaler, R. Blatt, Ion strings for quantum gates, Appl. Phys. B 66 (5) (1998) 603–608. [119] M. A. 
Nielsen, Conditions for a class of entanglement transformations, Phys. Rev. Lett. 83 (2) (1999) 436–439. [120] M.A. Nielsen, Continuity bounds for entanglement, Phys. Rev. A 61 (6) (2000) 064301. [121] M.A. Nielsen, Characterizing mixing and measurement in quantum mechanics, Phys. Rev. A 63 (2) (2001) 022114.
M. Keyl / Physics Reports 369 (2002) 431 – 548
547
[122] M.A. Nielsen, I.L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, 2000. [123] M. Ohya, D. Petz, Quantum Entropy and its Use, Springer, Berlin, 1993. [124] C.M. Papadimitriou, Computational Complexity, Addison-Wesley, Reading, MA, 1994. [125] V.I. Paulsen, Completely Bounded Maps and Dilations, Longman Scienti.c & Technical, New York, 1986. [126] A. Peres, Higher order schmidt decompositions, Phys. Lett. A 202 (1) (1995) 16–17. [127] A. Peres, Separability criterion for density matrices, Phys. Rev. Lett. 77 (8) (1996) 1413–1415. [128] S. Popescu, Bell’s inequalities versus teleportation: what is nonlocality? Phys. Rev. Lett. 72 (6) (1994) 797–799. [129] S. Popescu, D. Rohrlich, Thermodynamics and the measure of entanglement, Phys. Rev. A 56 (5) (1997) R3319–R3321. [130] J. Preskill, Lecture notes for the course ‘Information for Physics 219=Computer Science 219, Quantum Computation,’ Caltech, Pasadena, California, 1999, www.theory.caltech.edu/people/preskill/ph229. [131] M. Purser, Introduction to Error-Correcting Codes, Artech House, Boston, 1995. [132] E.M. Rains, Bound on distillable entanglement, Phys. Rev. A 60 (1) (1999) 179–184; E.M. Rains, Erratum, Phys. Rev. A 63 (1) (2001) 019902(E). [133] E.M. Rains, A semide.nite program for distillable entanglement, IEEE Trans. Inf. Theory 47 (7) (2001) 2921–2933. [134] M. Reed, B. Simon, Methods of Modern Mathematical Physics I, Academic Press, San Diego, 1980. [135] W. Rudin, Functional Analysis, McGraw-Hill, New-York, 1973. [136] O. Rudolph, A separability criterion for density operators, J. Phys. A 33 (21) (2000) 3951–3955. [137] D. Schlingemann, R.F. Werner, Quantum error-correcting codes associated with graphs, 2000, quant-ph=0012111. [138] C.E. Shannon, A mathematical theory of communication, Bell. Syst. Tech. J. 27 (1948) 379 – 423, 623– 656. [139] P.W. Shor, Algorithms for quantum computation: discrete logarithms and factoring, in: S. 
Goldwasser (Ed.), Proceedings of the 35th Annual Symposium on the Foundations of Computer Science, IEEE Computer Science, Society Press, Los Alamitos, CA, 1994, pp. 124–134. [140] P.W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, Soc. Ind. Appl. Math. J. Comput. 26 (1997) 1484–1509. [141] P.W. Shor, J.A. Smolin, B.M. Terhal, Nonadditivity of bipartite distillable entanglement follows from a conjecture on bound entangled Werner states, Phys. Rev. Lett. 86 (12) (2001) 2681–2684. [142] B. Simon, Representations of Finite and Compact Groups, American Mathematical Society, Providence, RI, 1996. [143] D. Simon, On the power of quantum computation, in: Proceedings of the 35th Annual Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, 1994, pp. 124 –134. [144] R. Simon, Peres-Horodecki separability criterion for continuous variable systems, Phys. Rev. Lett. 84 (12) (2000) 2726–2729. [145] S. Singh, The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography, Fourth Estate, London, 1999. [146] A.M. Steane, Multiple particle interference and quantum error correction, Proc. Roy. Soc. London A 452 (1996) 2551–2577. [147] W.F. Stinespring, Positive functions on C*-algebras, Proc. Am. Math. Soc. (1955) 211–216. [148] E. StHrmer, Positive linear maps of operator algebras, Acta Math. 110 (1693) 233–278. [149] T. Tanamoto, Quantum gates by coupled asymmetric quantum dots and controlled-not-gate operation, Phys. Rev. A 61 (2000) 022305. [150] B.M. Terhal, K.G.H. Vollbrecht, Entanglement of formation for isotropic states, Phys. Rev. Lett. 85 (12) (2000) 2625–2628. [151] W. Tittel, J. Brendel, H. Zbinden, N. Gisin, Violation of Bell inequalities by photons more than 10 km apart, Phys. Rev. Lett. 81 (17) (1998) 3563–3566. [152] A.M. Turing, On computable numbers, with an application to the entscheidungsproblem, Proc. London Math. Soc. Ser. 2 42 (1936) 230–265. [153] V. 
Vedral, M.B. Plenio, Entanglement measures and puri.cation procedures, Phys. Rev. A 54 (3) (1998) 1619–1633. [154] V. Vedral, M.B. Plenio, M.A. Rippin, P.L. Knight, Quantifying entanglement, Phys. Rev. Lett. 78 (12) (1997) 2275–2279. [155] G. Vidal, Entanglement monotones, J. Mod. Opt. 47 (2–3) (2000) 355–376.
548
M. Keyl / Physics Reports 369 (2002) 431 – 548
[156] G. Vidal, J.I. Latorre, P. Pascual, R. Tarrach, Optimal minimal measurements of mixed states, Phys. Rev. A 60 (1999) 126–135. [157] G. Vidal, R. Tarrach, Robustness of entanglement, Phys. Rev. A 59 (1) (1999) 141–155. [158] G. Vidal, R.F. Werner, A computable measure of entanglement, 2001, quant-ph=0102117. [159] K.G.H. Vollbrecht, R.F. Werner, Entanglement measures under symmetry, 2000, quant-ph=0010095. [160] K.G.H. Vollbrecht, R.F. Werner, Why two qubits are special, J. Math. Phys. 41 (10) (2000) 6772–6782. [161] I. Wegener, The Complexity of Boolean Functions, Teubner, Stuttgart, 1987. [162] S. Weigert, Reconstruction of quantum states and its conceptual implications, in: H.D. Doebner, S.T. Ali, M. Keyl, R.F. Werner (Eds.), Trends in Quantum Mechanics, World Scienti.c, Singapore, 2000, pp. 146–156. [163] H. Weinfurter, A. Zeilinger, Quantum communication, in: G. Alber, et al., (Eds.), Quantum Information, Springer, Berlin, 2001, pp. 58–95. [164] R.F. Werner, Quantum harmonic analysis on phase space, J. Math. Phys. 25 (1984) 1404–1411. [165] R.F. Werner, Quantum states with Einstein–Podolsky–Rosen correlations admitting a hidden-variable model, Phys. Rev. A 40 (8) (1989) 4277–4281. [166] R.F. Werner, Optimal cloning of pure states, Phys. Rev. A 58 (1998) 980–1003. [167] R.F. Werner, All teleportation and dense coding schemes, 2000, quant-ph=0003070. [168] R.F. Werner, Quantum information theory—an invitation, in: G. Alber, et al., (Eds.), Quantum Information, Springer, Berlin, 2001, pp. 14–59. [169] R.F. Werner, M.M. Wolf, Bell inequalities and entanglement, Quant. Inf. Comput. 1 (3) (2001) 1–25. [170] R.F. Werner, M.M. Wolf, Bound entangled gaussian states, Phys. Rev. Lett. 86 (16) (2001) 3658–3661. [171] H. Weyl, The Classical Groups, Princeton University, Princeton, NJ, 1946. [172] W.K. Wooters, Entanglement of formation of an arbitrary state of two qubits, Phys. Rev. Lett. 80 (10) (1998) 2245–2248. [173] W.K. Wootters, W.H. 
Zurek, A single quantum cannot be cloned, Nature 299 (1982) 802–803. [174] S.L. Woronowicz, Positive maps of low dimensional matrix algebras, Rep. Math. Phys. 10 (1976) 165–183.
Physics Reports 369 (2002) 549 – 686 www.elsevier.com/locate/physrep
Microscopic formulation of black holes in string theory

Justin R. David^a, Gautam Mandal^b, Spenta R. Wadia^{b,*}

^a Department of Physics, University of California, Santa Barbara, CA 93106, USA
^b Department of Theoretical Physics, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400 005, India

Received 1 June 2002; editor: A. Schwimmer
Abstract

In this report we review the microscopic formulation of the five-dimensional black hole of type IIB string theory in terms of the D1–D5 brane system. The emphasis here is more on the brane dynamics than on supergravity solutions. We show how the low energy brane dynamics, combined with crucial inputs from AdS/CFT correspondence, leads to a derivation of black hole thermodynamics and the rate of Hawking radiation. Our approach requires a detailed exposition of the gauge theory and conformal field theory of the D1–D5 system. We also discuss some applications of the AdS/CFT correspondence in the context of black hole formation in three dimensions by thermal transition and by collision of point particles. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 04.70.Dy; 11.15.−q; 11.25.−w; 11.25.Hf

Keywords: Black holes; D-branes; Gauge theory
* Corresponding author.

0370-1573/02/$ - see front matter. PII: S0370-1573(02)00271-5

Contents

1. Introduction
   1.1. Quantum theory and general relativity
   1.2. Black holes and the information puzzle
   1.3. The string theory framework for black holes
   1.4. Plan of this report
2. Construction of classical solutions
   2.1. Classical solutions of M-theory
      2.1.1. The 2-brane solution of M-theory: M2
      2.1.2. Intersecting M2-branes
   2.2. The 6D (D1–D5) black string solution of IIB on T^4
   2.3. The extremal 5D black hole solution
   2.4. Non-extremal five-dimensional black hole
      2.4.1. Geometry
      2.4.2. Hawking temperature and Bekenstein–Hawking entropy
      2.4.3. Comments on brane–antibrane and other non-BPS solutions
   2.5. Supergravity solution with non-zero vev of B_NS
      2.5.1. Asymptotically flat geometry
      2.5.2. Near-horizon geometry
   2.6. Near-horizon limit and AdS_3 × S^3
      2.6.1. The three-dimensional anti-de Sitter space or AdS_3
      2.6.2. The BTZ black hole
      2.6.3. The two-dimensional black hole
3. Semiclassical derivation of Hawking radiation
   3.1. Minimal scalar
      3.1.1. Absorption cross-section
      3.1.2. Hawking radiation
      3.1.3. Importance of near-horizon physics
   3.2. Fixed scalars
4. The microscopic modeling of black hole and gauge theory of the D1–D5 system
   4.1. The D1–D5 System and the N = 4, U(Q_1) × U(Q_5) gauge theory in 2-dimensions
   4.2. The potential terms
   4.3. D-flatness equations and the moduli space
   4.4. The bound state in the Higgs phase
   4.5. The conformally invariant limit of the gauge theory
   4.6. A quick derivation of entropy and temperatures from CFT
   4.7. D1-branes as solitonic strings of the D5 gauge theory
5. The SCFT on the orbifold M
   5.1. The N = 4 superconformal algebra
   5.2. Free field realization of N = (4,4) SCFT on the orbifold M
   5.3. The SO(4) algebra
   5.4. The supergroup SU(1,1|2)
   5.5. Short multiplets of SU(1,1|2)
   5.6. The resolutions of the symmetric product
      5.6.1. The untwisted sector
      5.6.2. Z_2 twists
      5.6.3. Higher twists
   5.7. The chiral primaries of the N = (4,4) SCFT on M
      5.7.1. The k-cycle twist operator
      5.7.2. The complete set of chiral primaries
   5.8. Short multiplets of N = (4,4) SCFT on M
   5.9. Stringy exclusion principle
6. Near-horizon supergravity and SCFT
   6.1. Classification of the supergravity modes
   6.2. The supergravity moduli
   6.3. AdS_3/CFT_2 correspondence
   6.4. Supergravity moduli and the marginal operators
7. Location of the symmetric product
   7.1. Dynamics of the decay of the D1–D5 system from gravity
   7.2. The linear sigma model description of R^4/Z_2
   7.3. The gauge theory relevant for the decay of the D1–D5 system
   7.4. Dynamics of the decay of the D1–D5 system from gauge theory
   7.5. The symmetric product
8. The microscopic derivation of Hawking radiation
   8.1. Near-horizon limit and Fermion boundary conditions
   8.2. The black hole state
   8.3. The coupling with the bulk fields for the D1–D5 black hole
   8.4. Determination of the strength of the coupling
      8.4.1. Evaluation of the tree-level vertices in supergravity
      8.4.2. Two-point function
   8.5. Absorption cross-section as thermal Green's function
   8.6. Absorption cross-section of minimal scalars from the D1–D5 SCFT
      8.6.1. Absorption cross-section for the blow up modes
   8.7. Fixed scalars
   8.8. Intermediate scalars
9. Non-renormalization theorems
   9.1. The spectrum of short multiplets
   9.2. Entropy and area
   9.3. Hawking radiation
      9.3.1. Independence of Hawking radiation calculation vis-a-vis moduli: supergravity
      9.3.2. Independence of Hawking radiation vis-a-vis moduli: SCFT
10. Strings in AdS_3
   10.1. The S-dual of the D1–D5 system
   10.2. String propagation on AdS_3
   10.3. Spectrum of strings on AdS_3
   10.4. Strings on Euclidean AdS_3
      10.4.1. The long string worldsheet algebra
   10.5. Strings on the thermal AdS_3
11. Applications of AdS_3–CFT_2 duality
   11.1. Hawking–Page transition in AdS_3
      11.1.1. Euclidean free energy from supergravity
      11.1.2. Boundary conditions
      11.1.3. Finding the saddle points
      11.1.4. Free energy of BTZ
   11.2. Conical defects and particles in AdS_3
      11.2.1. Black hole creation by particle collision
12. Concluding remarks and open problems
Acknowledgements
Appendix A. Euclidean derivation of Hawking temperature
Appendix B. A brief heuristic motivation for Rules 1 and 2 of Section 2.4
Appendix C. Coordinate systems for AdS_3 and related spaces
   C.1. AdS_3
   C.2. BTZ black hole
   C.3. Conical spaces
   C.4. Euclidean sections and thermal physics
References
1. Introduction

1.1. Quantum theory and general relativity

Quantum theory and the general theory of relativity form the basis of modern physics. However, these two theories seem to be fundamentally incompatible. Quantizing general relativity leads to
a number of basic problems: (1) Ultraviolet divergences render general relativity ill-de1ned as a quantum theory (see, e.g. Weinberg in [1]). This speci1cally means that if we perform a perturbation expansion around Cat Minkowski space–time (which is a good 1rst approximation to our world) then to subtract in1nities from the divergent diagrams we have to add an in1nite number of counterterms to the Einstein–Hilbert action with coeKcients that are proportional to appropriate powers of the ultraviolet cutoL. There is good reason to believe that string theory [2,3] solves this ultraviolet problem because the extended nature of √ string interactions have an inherent ultraviolet cutoL given by the fundamental string length . Furthermore, for length scales much larger than the string length the Einstein–Hilbert action emerges [4,5] as a low energy eLective action from string theory, with Newton’s constant (for type II strings in ten dimensions) given by GN(10) = 86 gs2 4 ;
(1.1)
where gs is the string coupling. (2) There are many singular classical solutions of general relativity (for standard textbooks on classical general relativity, see, e.g., [6 –10]), including the Schwarzschild black hole and the Big Bang model of cosmology. Black holes and their higher dimensional analogues (black branes) also appear as solutions of low energy string theory. A quantum theory of gravity must (a) present an understanding regarding which of these singular geometries can arise from a well de1ned quantum mechanics in an appropriate limit, and (b) formulate such a quantum mechanics where possible. String theory has been able to “resolve” a class of singularities in this way, but a complete understanding of the issue of singularities is still lacking (see [11–15] for a partial list of related papers; Ref. [15] contains a review and a comprehensive list of references). (3) While the above problems are related to the high energy (short distance) behaviour of general relativity, there exists another basic problem when we quantize matter 1elds in the presence of a black hole, which does not ostensibly depend on high energy processes. This problem is called the information puzzle ([16,17], for early reviews see, e.g., [1,18,19]). In the following we shall explain the issue in some detail and subsequently summarize the attempts within string theory to resolve the puzzle in a certain class of black holes. Besides being a long-lasting problem of general relativity, this is an important problem for string theory for the following reason. String theory has been proposed as a theory that describes all elementary particles and their interactions. Presently, the theory is not in the stage of development where it can provide quantitative predictions in particle physics. However if string theory can resolve some logical problem that arises in the applications of standard quantum 1eld theory to general relativity, then it is a step forward for string theory. 
(4) Finally, any quantum theory of gravity should lead to an understanding of the problem of the cosmological constant (for reviews see, e.g., [20–22]).

1.2. Black holes and the information puzzle

Let us briefly review some general properties of black holes [6–10]. Black holes are objects which result as end points of gravitational collapse of matter. For an object of mass greater than roughly
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
three solar masses (see, e.g., [23]), the gravitational force overcomes all other forces and the matter generically collapses into a black hole (in some exceptional cases a naked singularity might result). This would suggest that to specify a black hole it is necessary to give in detail the initial conditions of the collapse. As we will see below, a black hole is completely specified by only a few parameters.

To introduce various concepts related to black holes we will discuss two examples. First, let us consider the Schwarzschild black hole in 3 + 1 dimensions. It is a time independent, spherically symmetric solution of Einstein gravity without matter. The metric is given by

$$ds^2 = -\left(1 - \frac{2 G_N M}{r}\right) dt^2 + \left(1 - \frac{2 G_N M}{r}\right)^{-1} dr^2 + r^2\, d\Omega^2 , \tag{1.2}$$

where $G_N$ is Newton's constant (see footnote 1) and $d\Omega^2 = d\theta^2 + \sin^2\theta\, d\phi^2$ is the metric on $S^2$. We have chosen units so that the velocity of light is $c = 1$. The surface $r = 2 G_N M$ is called the event horizon. It is a coordinate singularity ($g_{rr} = \infty$) but not a curvature singularity (e.g., the Ricci scalar is finite there). Light-like geodesics and time-like geodesics starting "inside" the event horizon ($r < 2 G_N M$) end up at $r = 0$ (the curvature singularity) in a finite proper time. This means that classically the black hole is truly black: it cannot emit anything. Note that the solution is completely specified by only one parameter $M$, which coincides with the ADM mass (see (2.12)) of the black hole.

Next we consider the Reissner–Nordström (RN) black hole. It is a time independent, spherically symmetric solution of Einstein gravity coupled to the electromagnetic field. The solution is given by the following backgrounds:

$$\begin{aligned}
ds^2 &= -\left(1 - \frac{2 G_N M}{r} + \frac{G_N Q^2}{r^2}\right) dt^2 + \left(1 - \frac{2 G_N M}{r} + \frac{G_N Q^2}{r^2}\right)^{-1} dr^2 + r^2\, d\Omega^2 , \\
A_0 &= \frac{Q}{r} , \qquad A_i = 0 , \quad i = 1, 2, 3 ,
\end{aligned} \tag{1.3}$$
where $A_0$ is the time component of the vector potential. This solution carries a charge $Q$ (the Schwarzschild black hole, Eq. (1.2), corresponds to the special case $Q = 0$). There are two coordinate singularities ($g_{rr} = \infty$), at $r = r_+$ (outer horizon) and $r = r_-$ (inner horizon):

$$r_\pm = G_N M \pm \sqrt{(G_N M)^2 - G_N Q^2} . \tag{1.4}$$

The event horizon coincides with the outer horizon $r = r_+$. When $M = |Q|/\sqrt{G_N}$, $r_+$ coincides with $r_-$. Such a black hole is called an extremal black hole. Note that the black hole (1.3) is completely specified by its mass $M$ and the charge $Q$. The D1–D5 black hole, which will be the main subject of this report, is similar to the RN black hole; it is charged under Ramond–Ramond fields in type IIB string theory and has the same causal structure, equivalently the same Penrose diagram (see Section 2.4.1), as the RN black hole.

The point $r = 0$ is a curvature singularity for both the Schwarzschild solution (1.2) and the Reissner–Nordström solution (1.3). The singularity is "clothed" by the event horizon in the physical range of parameters ($M > 0$ for Schwarzschild, $M > |Q|/\sqrt{G_N}$ for RN). The ranges of mass $M < 0$ for Schwarzschild and $M < |Q|/\sqrt{G_N}$ for RN are unphysical because they result in a naked singularity (see, e.g., [11]). The limiting case for the RN black hole, $M = |Q|/\sqrt{G_N}$, is called an extremal black hole, as already mentioned above.

Footnote 1: We will use the notation $G_N^{(D)}$ or $G_N^D$ for Newton's constant in $D$ spacetime dimensions, reserving the default $G_N$ for $D = 4$.

In general, collapsing matter results in black holes which are completely specified by the mass $M$, the $U(1)$ charges $Q_i$ and the angular momentum $J$. This is called the no hair theorem (see, e.g., [24–27] for a review). Any other information present (for example, multipole moments) decays exponentially fast during the collapse, with a characteristic time $\tau = r_h/c$ ($r_h$ is the radius of the horizon and $c$ is the speed of light). Thus, all detailed information carried by the collapsing matter is completely lost.

The other beautiful result of classical general relativity is the so-called area law. It says that [10] the area of the horizon of a black hole cannot decrease with time, and if two (or more) black holes (of areas $A_1, A_2, \ldots$) merge to form a single black hole, the area $A$ of its horizon will satisfy

$$A \geq A_1 + A_2 + \cdots . \tag{1.5}$$
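As a rough numerical illustration of the area law (1.5), the sketch below assumes that both the initial and the final black holes are Schwarzschild (no charge, no rotation), so that $A = 16\pi G_N^2 M^2$ from (1.2). Under that assumption the area law bounds the final mass from below and hence bounds the energy that a merger can radiate away. The function names are illustrative and not taken from the text.

```python
import math

def schwarzschild_area(mass, g_newton=1.0):
    """Horizon area A = 4*pi*(2 G_N M)^2 of a Schwarzschild black hole (c = 1)."""
    return 16.0 * math.pi * (g_newton * mass) ** 2

def min_final_mass(m1, m2):
    """Smallest final mass compatible with A >= A1 + A2 (assumed Schwarzschild merger)."""
    return math.sqrt(m1 ** 2 + m2 ** 2)

def max_radiated_fraction(m1, m2):
    """Upper bound on the fraction of m1 + m2 that can be radiated away."""
    return 1.0 - min_final_mass(m1, m2) / (m1 + m2)

m1, m2 = 1.0, 1.0
m_final = min_final_mass(m1, m2)
# At the bound, inequality (1.5) is saturated.
assert math.isclose(schwarzschild_area(m_final),
                    schwarzschild_area(m1) + schwarzschild_area(m2))
# With no radiation at all (M = m1 + m2) the area strictly increases.
assert schwarzschild_area(m1 + m2) > schwarzschild_area(m1) + schwarzschild_area(m2)
print(max_radiated_fraction(m1, m2))  # equals 1 - 1/sqrt(2) for equal masses
```

For equal masses the bound is $1 - 1/\sqrt{2}$, i.e. a little under 30% of the total mass.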
This general result is easy to verify for the Schwarzschild and Reissner–Nordström black holes that we have discussed above.

Bekenstein argued that unless a black hole had entropy, an infalling hot body would lead to a violation of the second law of thermodynamics, because the entropy of the hot body would be lost once it was absorbed by the black hole, thus causing the total entropy to decrease. He postulated that the entropy of a black hole is proportional to the area of the event horizon, the constant of proportionality being universal for all black holes. The area law then has a natural thermodynamic interpretation, and the macroscopic observables allowed by the no hair theorems play the role of macroscopic thermodynamic variables. In the classical theory, however, there is no notion of the absolute entropy of a system. Such a concept, as we shall now see, requires quantum theory.

In his pioneering work in the 1970s, Hawking [16] quantized matter fields in the background geometry of a black hole. He found that the Schwarzschild black hole is not truly black. A semi-classical calculation showed that it emits radiation with the spectrum of a black body at a temperature $T_H$ given by

$$T_H = \frac{\hbar}{8\pi G_N M} . \tag{1.6}$$
The quantum nature of this effect is clearly evident from the fact that the temperature is proportional to $\hbar$. For the Reissner–Nordström black hole the temperature of Hawking radiation is

$$T_H = \frac{(r_+ - r_-)\hbar}{4\pi r_+^2} , \tag{1.7}$$
where $r_\pm$ are defined in (1.4) as functions of $Q$ and $M$. A brief derivation of the temperature formulae, using the Euclidean approach [28], is presented in Appendix A. One notes that the extremal Reissner–Nordström black hole has $T_H = 0$ and does not Hawking radiate. In general the Hawking temperature turns out to be a function of the mass, the charge(s) and the angular momentum alone. Thus, even semi-classical effects do not provide further information about the black hole.

The works in the 1970s culminated in the following laws of black holes [29–32], which are analogous to the laws of thermodynamics.
(1) First Law: Two neighbouring equilibrium states of a black hole of mass $M$, charges $Q_i$ and angular momentum $J$ are related by

$$dM = T_H\, d\!\left(\frac{A}{4 G_N \hbar}\right) + \Phi_i\, dQ_i + \Omega\, dJ , \tag{1.8}$$

where $A$ is the area of the event horizon, $\Phi$ the electric surface potential and $\Omega$ the angular velocity. For the special case of the Reissner–Nordström black hole the first law reduces to

$$dM = T_H\, d\!\left(\frac{A}{4 G_N \hbar}\right) + \Phi\, dQ , \tag{1.9}$$

where $\Phi = Q/r_+$. This is explicitly verified in Appendix A, Eq. (A.11).

(2) Second Law: Black holes have entropy $S$ given by

$$S = \frac{A}{4 G_N \hbar} = \frac{1}{4} \frac{A}{A_{Pl}} , \tag{1.10}$$

$$A_{Pl} = \frac{G_N \hbar}{c^3} , \tag{1.11}$$

where we have reinstated in the last equation the speed of light $c$. The generalized second law says that the sum of the entropy of the black hole and that of the surroundings never decreases (this is a generalized form of (1.5)).

Formula (1.10), called the Bekenstein–Hawking entropy formula, is very important because it provides a counting of the effective number of degrees of freedom of a black hole, which any theory of quantum gravity must reproduce. We note that $A_{Pl}$ in (1.11) is a basic unit of area (Planck unit), involving all three fundamental constants ($A_{Pl} = 2.61 \times 10^{-66}$ cm$^2$ in four dimensions). The Bekenstein–Hawking formula simply states that the entropy of a black hole is a quarter of its horizon area measured in units of $A_{Pl}$. The entropy of the Schwarzschild black hole, according to (1.10), is

$$S = \frac{4\pi G_N M^2}{\hbar} , \tag{1.12}$$

while that of the RN black hole is

$$S = \frac{\pi r_+^2}{G_N \hbar} . \tag{1.13}$$

The Hawking radiation as calculated in semi-classical general relativity is a mixed state. It turns out to be difficult to calculate the correlations between the infalling matter and outgoing Hawking particles in the standard framework of general relativity. Such a calculation would require a good quantum theory of gravity where controlled approximations are possible [33]. If we accept the semi-classical result that black holes emit radiation that is exactly thermal, then it leads to the information puzzle [1,16–19]. Initially the matter that formed the black hole is in a pure quantum mechanical state. Here in principle we know all the quantum mechanical correlations between the degrees of freedom of the system. If the black hole evaporates completely, then the final state of the system is purely thermal and hence it is a mixed state. This evolution of a pure state to a mixed state is in conflict with the standard laws of quantum mechanics, which involve unitary time evolution that sends pure states to pure states.
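The Reissner–Nordström formulae (1.4), (1.7), (1.9) and (1.13) can be cross-checked numerically. The following is a minimal sketch, assuming units with $G_N = \hbar = c = 1$ by default and using central finite differences in place of the analytic derivation of Appendix A; the function names are illustrative only. It checks that the extremal black hole has $T_H = 0$, that $Q = 0$ reproduces (1.6), and that $T_H\, \partial S/\partial M = 1$ and $T_H\, \partial S/\partial Q = -\Phi$, which together are equivalent to the first law (1.9).

```python
import math

def rn_horizons(mass, charge, g_newton=1.0):
    """Outer and inner horizon radii r_+ and r_- of Eq. (1.4)."""
    gm = g_newton * mass
    disc = gm ** 2 - g_newton * charge ** 2
    if disc < 0:
        raise ValueError("naked singularity: M < |Q| / sqrt(G_N)")
    root = math.sqrt(disc)
    return gm + root, gm - root

def rn_temperature(mass, charge, g_newton=1.0, hbar=1.0):
    """Hawking temperature of Eq. (1.7)."""
    r_plus, r_minus = rn_horizons(mass, charge, g_newton)
    return (r_plus - r_minus) * hbar / (4.0 * math.pi * r_plus ** 2)

def rn_entropy(mass, charge, g_newton=1.0, hbar=1.0):
    """Bekenstein-Hawking entropy of Eq. (1.13)."""
    r_plus, _ = rn_horizons(mass, charge, g_newton)
    return math.pi * r_plus ** 2 / (g_newton * hbar)

def first_law_residuals(mass, charge, eps=1e-6):
    """Return (T dS/dM - 1, T dS/dQ + Phi); both vanish if Eq. (1.9) holds."""
    temp = rn_temperature(mass, charge)
    r_plus, _ = rn_horizons(mass, charge)
    phi = charge / r_plus  # electric surface potential, Phi = Q / r_+
    ds_dm = (rn_entropy(mass + eps, charge) - rn_entropy(mass - eps, charge)) / (2 * eps)
    ds_dq = (rn_entropy(mass, charge + eps) - rn_entropy(mass, charge - eps)) / (2 * eps)
    return temp * ds_dm - 1.0, temp * ds_dq + phi

# Extremal case M = |Q| / sqrt(G_N): the two horizons coincide and T_H vanishes.
assert rn_temperature(1.0, 1.0) == 0.0
# Q = 0 reduces to the Schwarzschild temperature (1.6).
assert math.isclose(rn_temperature(2.0, 0.0), 1.0 / (8.0 * math.pi * 2.0))
print(first_law_residuals(2.0, 1.0))  # both entries should be close to zero
```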
Hence it would appear that we have to modify quantum mechanics, as was advocated by Hawking [17]. However, in the following we shall argue that if we replace the paradigm of quantum field theory by that of string theory (Section 1.3), we are able to retain quantum mechanics and resolve the information puzzle (for a certain class of black holes) by discovering the microscopic degrees of freedom of the black hole.

It is well worth pointing out that the existence of black holes in nature (for which there is mounting evidence [23]) compels us to resolve the conundrums that black holes present. One can perhaps take recourse to the fact that for a black hole whose mass is a few solar masses the Hawking temperature is very tiny ($\sim 10^{-8}$ K), and of no observable consequence. However, the logical problem that we have described above cannot be wished away, and its resolution makes a definitive case for the string paradigm as a correct framework for fundamental physics, as opposed to standard local quantum field theory.

This assertion implicitly assumes that in string theory there exists a controlled calculational scheme to calculate the properties of black holes. Fortunately there does exist a class of black holes in type IIB string theory compactified on a 4-manifold ($T^4$ or $K3$) which has sufficient supersymmetry to enable a precise calculation of low energy processes of this class of black holes. This aspect is the main focus of this review.

1.3. The string theory framework for black holes

We now briefly describe the conceptual framework of black hole thermodynamics in string theory. A black hole of a given mass $M$, charges $Q_i$ and angular momentum $J$ is defined by a density matrix (see also (8.2)):

$$\rho = \frac{1}{\Omega} \sum_{i \in \mathcal{S}} |i\rangle \langle i| , \tag{1.14}$$
where $|i\rangle$ is a microstate which can be any of a set $\mathcal{S}$ of states (microcanonical ensemble), all of which are characterized by the above-mentioned mass, charges and angular momentum. A definition like (1.14) is of course standard in quantum statistical mechanics, where a system with a large number of degrees of freedom is described by a density matrix to derive a thermodynamic description. Using (1.14), thermodynamic quantities like temperature, entropy and the rates of Hawking radiation can be derived for a black hole in string theory (see Section 8 for details). In particular the Bekenstein–Hawking formula is derived from Boltzmann's law:

$$S = \ln \Omega , \tag{1.15}$$
where $\Omega$ is the number of microstates of the system. We are using units in which the Boltzmann constant is unity.

Given this, we can calculate formulas of black hole thermodynamics just as we calculate the thermodynamic properties of macroscopic objects using standard methods of statistical mechanics. Here, the quantum correlations that existed in the initial state of the system are in principle all present and are only erased by our procedure of defining the black hole state in terms of a density matrix. In this way one can account not only for the entropy of the system, which is a counting problem, but also for the rate of Hawking radiation, which depends on interactions.
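To get a feeling for the size of the counting problem posed by (1.15), the sketch below evaluates (1.6), (1.10) and (1.11) in SI units for a Schwarzschild black hole of one solar mass. The numerical constants are rounded values inserted here as assumptions, and the function names are illustrative; factors of $c$ and $k_B$ have been restored by dimensional analysis.

```python
import math

# Rounded SI values (assumptions of this sketch, not taken from the text).
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34  # reduced Planck constant, J s
C = 2.998e8        # speed of light, m / s
K_B = 1.3807e-23   # Boltzmann constant, J / K
M_SUN = 1.989e30   # solar mass, kg

def planck_area_cm2():
    """Planck area A_Pl = G hbar / c^3 of Eq. (1.11), converted from m^2 to cm^2."""
    return G * HBAR / C ** 3 * 1.0e4

def hawking_temperature_kelvin(mass_kg):
    """Eq. (1.6) with c and k_B restored: T_H = hbar c^3 / (8 pi G M k_B)."""
    return HBAR * C ** 3 / (8.0 * math.pi * G * mass_kg * K_B)

def bekenstein_hawking_entropy(mass_kg):
    """Dimensionless entropy S = A / (4 A_Pl) of Eq. (1.10), i.e. ln(Omega) in (1.15)."""
    horizon_radius = 2.0 * G * mass_kg / C ** 2
    area = 4.0 * math.pi * horizon_radius ** 2
    return area / (4.0 * G * HBAR / C ** 3)

print(planck_area_cm2())                  # roughly 2.6e-66 cm^2
print(hawking_temperature_kelvin(M_SUN))  # roughly 6e-8 K
print(bekenstein_hawking_entropy(M_SUN))  # roughly 1e77
```

With these inputs, $\ln \Omega$ for a solar-mass black hole is of order $10^{77}$, and the Hawking temperature is of order $10^{-8}$ K.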
Let us recall the treatment of radiation coming from a star or a lump of hot coal. The "thermal" description of the emitted radiation is the result of averaging over a large number of quantum states of the coal. In principle, by making detailed measurements on the wave function of the emitted radiation we can infer the precise quantum state of the emitting body. For black holes the reasoning is similar. Hence in the string theory formulation, the black hole can exist as a pure state: one among the highly degenerate set of states that are characterized by a small number of parameters. Let us also note that in Hawking's semi-classical analysis, which uses quantum field theory in a given black-hole space–time, there is no possibility of a microscopic construction of black hole wave functions. To repeat, in string theory Hawking radiation is not thermal and in principle we can reconstruct the initial state of the system from the final state, which therefore resolves the information paradox.

We shall see that in the case of the five-dimensional black hole of type IIB string theory it is possible to construct a precise microscopic description of black hole thermodynamics. We now summarize the four basic ingredients we need to describe and calculate Hawking radiation for near-extremal black holes, which have low Hawking temperature:

(1) The microscopic constituents of the black hole. In the case of the five-dimensional black hole of type IIB string theory the microscopic modeling is in terms of a system of D1–D5 branes wrapped on $S^1 \times M_4$, where $M_4$ is a four-dimensional compact manifold, which can be either $T^4$ or $K3$. In this report we will mainly focus on $T^4$.

(2) The spectrum of the low energy degrees of freedom of the bound state of the D1–D5 system. These are derived at weak coupling and we need to know if the spectrum survives at strong coupling.

(3) The coupling of the low energy degrees of freedom to supergravity modes.

(4) The description of the black hole as a density matrix.
This implies expressions for decay and absorption probabilities which are related to S-matrix elements between initial and final states of the black hole.

To understand the microscopic calculation of the Hawking rate in a nutshell, consider a black hole of mass $M$ and charge $Q$, described by a microcanonical ensemble $\mathcal{S}$. Consider the process of absorption of some particles by the black hole which changes the mass and charge to $M'$, $Q'$ (corresponding to another microcanonical ensemble, say $\mathcal{S}'$). Let $\Omega$ ($\Omega'$) be the total number of states in $\mathcal{S}$ (resp. $\mathcal{S}'$). The absorption probability from a state $|i\rangle \in \mathcal{S}$ to a state $|f\rangle \in \mathcal{S}'$ is given by

$$P_{\rm abs}(i \to f) = \frac{1}{\Omega} \sum_{i,f} |\langle f|S|i\rangle|^2 , \tag{1.16}$$
where, by definition, we sum over the final states and average over the initial states. Similarly, the decay probability from a state $|i\rangle \in \mathcal{S}'$ to a state $|f\rangle \in \mathcal{S}$ is given by

$$P_{\rm decay}(i \to f) = \frac{1}{\Omega'} \sum_{i,f} |\langle f|S|i\rangle|^2 . \tag{1.17}$$
Point (3) above allows us to calculate the matrix element $\langle f|S|i\rangle$ in string theory, thus leading to a calculation of the absorption cross-section and the Hawking rate. We will elaborate on this in detail in Section 8.
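The role of the normalizations $1/\Omega$ and $1/\Omega'$ in (1.16) and (1.17) can be illustrated with a toy model. The sketch below fills a table of amplitudes $\langle f|S|i\rangle$ with random complex numbers (a stand-in only, not a string theory computation) and assumes that the decay amplitude between a given pair of states has the same magnitude as the corresponding absorption amplitude. Under that assumption the ratio of decay to absorption probability is fixed purely by state counting, $\Omega/\Omega' = e^{S - S'}$, with $S = \ln \Omega$ as in (1.15).

```python
import random

def random_amplitudes(n_final, n_initial, seed=0):
    """Toy table of amplitudes <f|S|i>; rows label f in S', columns label i in S."""
    rng = random.Random(seed)
    return [[complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
             for _ in range(n_initial)]
            for _ in range(n_final)]

def averaged_probability(amplitudes, n_initial_states):
    """Sum |amplitude|^2 over all pairs of states, then average over initial states."""
    total = sum(abs(a) ** 2 for row in amplitudes for a in row)
    return total / n_initial_states

omega, omega_prime = 6, 10  # sizes of the toy ensembles S and S'
amps = random_amplitudes(omega_prime, omega)

# Absorption, Eq. (1.16): initial states lie in S, so we divide by Omega.
p_abs = averaged_probability(amps, omega)
# Decay, Eq. (1.17): initial states lie in S', so we divide by Omega'.
# Assumption: |<i|S|f>| = |<f|S|i>|, so the same table of magnitudes is reused.
p_decay = averaged_probability(amps, omega_prime)

print(p_decay / p_abs, omega / omega_prime)  # the two numbers agree
```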
One of the important issues in this subject is that the string theory ingredients (points (1) and (2) above) are usually known in the case when the effective open string coupling is small. In this case, the Schwarzschild radius $R_{\rm sch}$ of the black hole is smaller than the string length $l_s$ and we have a controlled definition of a string state. As the coupling is increased we go over to the supergravity description, where $R_{\rm sch} \gg l_s$ and we have a black hole. Whether the spectrum of the theory undergoes a drastic change is then an issue of dynamics, and it determines whether the description of states at weak coupling, which enabled a thermodynamic description, remains valid. In the model we will consider, we will see that the description of the weak coupling effective Lagrangian goes over to strong coupling because of supersymmetry. It is an outstanding challenge to understand this problem when the weak coupling theory has little or no supersymmetry [34–37]. We will briefly touch upon this issue in Section 11.

1.4. Plan of this report

The plan of the rest of this report is as follows.

In Section 2, we discuss in some detail the construction of the five-dimensional black hole solution in type IIB supergravity. We provide a brief background on how to construct BPS solutions in M-theory from first principles by explicitly solving Killing spinor equations; we focus on the example of the M2-brane and its intersections. We then construct the D1–D5 black string solution by dualizing M2 ⊥ M2, and the extremal (BPS) D1–D5 black hole by dualizing M2 ⊥ M2 ⊥ M2. We then describe an algorithm for generating non-BPS solutions from BPS ones; we provide a motivation for this algorithm in Appendix B. We compute the Bekenstein–Hawking entropy and the Hawking temperature of these black holes. We next discuss how to generate non-zero $B_{NS}$ in the D1–D5 system by exact dualities in supergravity.
In the absence of the $B_{NS}$ vev, the D1–D5 system is marginally bound and can fragment into clusters of smaller D1–D5 systems. We show, by constructing the exact supergravity solution with non-zero $B_{NS}$, that the BPS mass formula becomes non-linear in the masses of the D1- and D5-branes and that a binding energy is generated. The picture of the bound state thus obtained is compared with gauge theory and conformal field theory in Sections 4 and 7. Next, we describe the appearance of AdS$_3$ and BTZ black holes as the "near-horizon limit" of the D1–D5 string and the D1–D5 black hole, respectively; this discussion is used as background material (together with Appendix C) for the discussions of AdS/CFT in Sections 6 and 11. In the context of the near-horizon physics we also discuss a connection between the pure five-brane system and the two-dimensional black hole.

In Section 3, we describe the semiclassical derivation of the absorption cross-section (also called the graybody factor) and the rate of Hawking radiation for the near-extremal five-dimensional black hole constructed in Section 2. We describe various fluctuations around the black hole solutions. We specifically focus on (a) minimal scalars, which couple only to the five-dimensional Einstein-frame metric, and (b) fixed scalars, which couple also to other background fields like the Ramond–Ramond fluxes. We treat case (a) in more detail and show that the computation of the graybody factor and the Hawking rate essentially depends only on the near-horizon limit. Both cases (a) and (b) are compared with the D1–D5 conformal field theory calculations in Section 8, where we find exact agreement with the results of Section 3. The agreement involves new first principles calculations in the D-brane conformal field theory which involve significant conceptual departures from the phenomenological approach adopted in the early calculations (see [38–40] for the original
papers and [41] for a review). We mention the new points in detail in Section 8 (esp. Sections 8.6.1 and 8.7). The results in Sections 3 and 8 have a priori different regions of validity; a comparison therefore involves an extrapolation which is made possible by non-renormalization theorems discussed in Section 9.

In Section 4, we discuss the gauge theory for the D1–D5 system. This system is closely related to the five-dimensional black hole solution. The D1–D5 system, as discussed in Section 2, consists of $Q_1$ D1-branes and $Q_5$ D5-branes wrapped on a $T^4$ in type IIB theory. It is a black string in six dimensions. The five-dimensional black hole solution is obtained by wrapping the black string on a circle and introducing Kaluza–Klein momenta along this direction. For the purpose of understanding Hawking radiation from the five-dimensional black hole it is sufficient to study the low energy effective theory of the D1–D5 system. The low energy theory of the D1–D5 system is a 1 + 1 dimensional supersymmetric gauge theory with gauge group $U(Q_1) \times U(Q_5)$. It has 8 supersymmetries. The matter content of this theory consists of hypermultiplets transforming as bi-fundamentals of $U(Q_1) \times U(Q_5)$. We review in detail the field content of this theory and its symmetries. The bound state of the D1- and D5-branes is described by the Higgs branch of this theory. We show the existence of this bound state when the Fayet–Iliopoulos terms are non-zero. The Higgs branch of the D1–D5 system flows in the infrared to a certain $\mathcal{N} = (4,4)$ superconformal field theory (SCFT). A more detailed understanding of this conformal field theory can be obtained by thinking of the bound D1-branes as solitonic strings of the D5-brane theory. From this point of view the target space of the SCFT is the moduli space of $Q_1$ instantons of a $U(Q_5)$ gauge theory on $T^4$. This moduli space is known to be a resolution of the orbifold $(\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$, which we denote by $\mathcal{M}$.
$S(Q_1 Q_5)$ stands for the symmetric permutation group of $Q_1 Q_5$ elements. The torus $\tilde{T}^4$ can be distinct from the compactification torus $T^4$. We review the evidence for this result. The mere fact that the low energy theory is a (super)conformal field theory with a known central charge is used in Section 4.6 to provide a short-hand derivation of the Bekenstein–Hawking entropy and Hawking temperature. The complete derivation, including that of the graybody factor and Hawking rate, is however possible only in Section 8, after we have discussed in detail the precise SCFT in Section 5, the couplings to the bulk fields in Section 6 and the location of the SCFT used in the moduli space of the $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$. It turns out that the supergravity calculation and the SCFT are at different points in the moduli space, and the reason they agree is the non-renormalization theorems discussed in Section 9.

In Section 5 we formulate the SCFT on $\mathcal{M}$ and discuss in detail a realization of this orbifold SCFT as a free field theory with identifications. The symmetries of this SCFT, including a new $SO(4)$ algebra, will also be discussed. As the SCFT relevant for the D1–D5 system is a resolution of the SCFT on the orbifold $\mathcal{M}$, we construct all the marginal operators of this SCFT, including operators which correspond to blowing up modes. We classify the marginal operators according to the short multiplets of the global part of the $\mathcal{N} = (4,4)$ superalgebra and the new $SO(4)$ algebra. We then explicitly construct all the short multiplets of the SCFT on $\mathcal{M}$ and classify them according to the global subgroup of the $\mathcal{N} = (4,4)$ superconformal algebra. These short multiplets are shown to be in one-to-one correspondence with the spectrum of short multiplets of supergravity in the near horizon geometry of the D1–D5 system in Section 6.

Now that we have discussed the microscopic degrees of freedom in Section 6, we take the next step towards the microscopic understanding of Hawking radiation.
We find the precise coupling of the fields of the supergravity to the microscopic SCFT. This is given by a specific SCFT operator
$\mathcal{O}(z, \bar{z})$, which couples to the supergravity field $\phi$ in the form of an interaction

$$S_{\rm int} = \mu \int d^2 z\, \phi(z, \bar{z})\, \mathcal{O}(z, \bar{z}) , \tag{1.18}$$
where $\mu$ is the strength of the coupling. As the $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$ is an "effective" theory of the D1–D5 system, it is difficult to fix the coupling of this theory to the supergravity fields using ab initio methods. We fix the operator in the SCFT corresponding to the supergravity field using the method of symmetries. The near horizon limit of the D1–D5 system exhibits enhanced symmetries. This is a special case of the AdS/CFT correspondence. As we have seen in Section 2, the near horizon geometry of the D1–D5 system reduces to that of AdS$_3 \times S^3 \times T^4$. The AdS/CFT correspondence [42] states that string theory on AdS$_3 \times S^3 \times T^4$ is dual to the 1 + 1 dimensional SCFT describing the Higgs branch of the D1–D5 system on $T^4$. For large $Q_1 Q_5$ the radius of $S^3$ is large in string units. The Kaluza–Klein modes on $T^4$ are of the order of the string length. Thus type IIB string theory on AdS$_3 \times S^3 \times T^4$ passes over to six-dimensional (2,2) supergravity on AdS$_3 \times S^3$. We will work in the supergravity limit.

The evidence for this correspondence comes from symmetries. The isometries of AdS$_3$ correspond to the global part of the Virasoro group of the SCFT. The R-symmetry of the SCFT is identified with the isometry of the $S^3$. The number of supersymmetries of the bulk gets enhanced from 8 to 16. These correspond to the global supersymmetries of the $\mathcal{N} = (4,4)$ superalgebra of the SCFT. Thus the global part of the superalgebra of the SCFT is identified with the AdS$_3 \times S^3$ supergroup, $SU(1,1|2) \times SU(1,1|2)$. Therefore a viable strategy to fix the coupling is to classify both the bulk fields and the SCFT operators according to the symmetries. The question then is whether this procedure can fix the SCFT operator required for analysing Hawking radiation. We will review the classification of the entire set of Kaluza–Klein modes of the six-dimensional supergravity on $S^3$ as short multiplets of $SU(1,1|2) \times SU(1,1|2)$ [43].
We use symmetries, including the new global $SO(4)$ algebra, to identify the marginal operators constructed in Section 5 with their corresponding supergravity fields. This enables us to identify the operators corresponding to the minimal scalars. We also identify the quantum numbers of the fixed scalars and the intermediate scalars.

From Sections 5 and 6 we see that there is a one-to-one correspondence between the moduli of the $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$ and the moduli of supergravity. In Section 7 we address the question of where in this moduli space the free field orbifold SCFT, which we use in our calculations, lies with respect to the supergravity solution. Let us first examine the supergravity side. In Section 2 we constructed the supergravity solution with the self-dual Neveu–Schwarz $B$-field turned on. The self-dual NS $B$-field is a modulus of the D1–D5 background. The mass formula for these solutions was non-linear and had a binding energy for the break up of the D1–D5 system into constituent D1- and D5-branes. This is unlike the case when no moduli are turned on. The mass formula in that case is linear and the system is marginally stable with respect to decay into constituent branes. We see in Section 7 that the effective theory from supergravity governing the decay of the D1–D5 system is a Liouville theory. This is derived by using probe branes near the boundary of AdS$_3$. These branes are called long strings. We then derive this Liouville theory from the D1–D5 gauge theory. As we have seen in Section 4, absence of a bound state implies that the Fayet–Iliopoulos and $\theta$ terms have to be set to zero. In Section 7 we see that the effective theory near the origin of the Higgs branch (i.e. when these terms are small but non-zero) is the same Liouville theory as seen in supergravity. This Liouville theory is strongly coupled and singular at the origin of the Higgs branch. The free orbifold SCFT on $\mathcal{M}$, on the other hand, is regular and finite.
Thus we conclude that the free orbifold theory
must correspond (a) in the gauge theory to a point where the Fayet–Iliopoulos and/or $\theta$ terms do not vanish, or equivalently (b) in supergravity to a point where some of the moduli are turned on, leading to non-zero binding energy, as mentioned above. More specifically, we will see that the orbifold theory corresponds to a point where the $\theta$ term in the gauge theory of the D1–D5 system is turned on. Thus, the simple supergravity solution of Sections 2.3 and 2.4 is at a different point in moduli space compared to the SCFT. In Section 9 we describe why calculations done in the free orbifold SCFT can be valid at the point in moduli space which corresponds to the simple supergravity solution.

In Section 8, we derive the thermodynamics of the five-dimensional black hole from the microscopic viewpoint (SCFT). We use the free field orbifold $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$ as the microscopic theory. The D1–D5 black hole is identified as an excited state with definite left and right conformal weights over the Ramond sector of this theory. Using Cardy's formula we get the asymptotic state counting for a conformal field theory in terms of its central charge and the level number. Use of the Cardy formula in the SCFT for the black hole state shows that the entropy evaluated from the SCFT precisely matches the entropy of the D1–D5 black hole. We then discuss Hawking radiation of minimal scalars from the D1–D5 black hole. In order to do this, we have to fix the strength of the coupling $\mu$ of the minimal scalar to its boundary operator in (1.18). We determine this by matching the two point functions evaluated in the SCFT and supergravity. It is only here that we use the more quantitative version of the AdS/CFT correspondence. Once the coupling, and hence $S_{\rm int}$, is determined, the Hawking radiation/absorption cross-section from the microscopic SCFT can be derived as a purely quantum scattering process.
We formulate the absorption cross-section calculation from the SCFT as an evaluation of the thermal Green's function of the operators $\mathcal{O}$ in (1.18). The absorption cross-section evaluated from the SCFT agrees with the semiclassical calculation in supergravity in Section 3. In Sections 8.6.1 and 8.7 we point out the differences between the present SCFT results and those in the early works (reviewed in [41]), which adopted a phenomenological approach for the D-brane degrees of freedom as well as for the coupling to supergravity fields. The phenomenological method had a discrepancy with the semiclassical results of Section 3 for the 4 minimal scalars which correspond to blow-up modes in the SCFT, and for the fixed scalars. In fact, using the phenomenological Dirac–Born–Infeld action for the D-brane degrees of freedom there is no accounting at all for the minimal scalars corresponding to the blow-up modes. We show that the first principles calculation presented in Section 8 removes these discrepancies. We also outline the absorption cross-section calculation for the intermediate scalars.

Though the moduli of the $\mathcal{N} = (4,4)$ SCFT and the near horizon moduli of supergravity agree, we have seen in Section 7 that the free field orbifold theory is at a different point in moduli space compared to the supergravity solution. In Section 9 we see why calculations done using the free orbifold theory are valid and agree with those of supergravity. We show how the spectrum of short multiplets, the entropy and the calculation of Hawking radiation are independent of moduli both in the SCFT and in supergravity due to non-renormalization theorems. In discussing non-renormalization theorems for Hawking radiation, we first examine the dilute gas approximation and the low energy approximation made in the semiclassical calculation in Section 3 and show how they appear in the SCFT.
In Section 10 we go beyond the supergravity approximation in the study of the near horizon geometry, AdS$_3 \times S^3 \times T^4$, and study string theory in this curved background to explore the AdS/CFT correspondence. Since string theory in Ramond–Ramond backgrounds is difficult to study it is
convenient to study the near horizon geometry of the S-dual to the D1–D5 system, the NS1–NS5 brane system. We review the spectrum of strings in this background and show that the long strings found in Section 7 play an important role in the completion of the spectrum. We then review the formulation of string theory on Euclidean AdS$_3$ and derive the world sheet algebra of the long string. It is seen, as expected from S-duality, that the world sheet theory is a Liouville theory which coincides with that of the single long D1-string near the boundary of AdS$_3$ in Section 7. We then review the evaluation of the partition function of a gas of strings in thermal AdS$_3$. From this partition function it can be verified that long strings are present in the spectrum, as expected from the analysis of the spectrum with the Lorentzian signature metric.

In Section 11, we discuss some applications of the AdS/CFT correspondence (introduced in Section 6). In the first part we discuss the thermal phase transition (Hawking–Page) in AdS$_3$ and how to understand it in terms of the dual CFT picture. We show that the Euclidean partition function of asymptotically AdS$_3$ spaces can be evaluated in the leading semiclassical approximation as a sum over an $SL(2, Z)$ family of saddle-point configurations, of which two members are AdS$_3$ and the BTZ black hole. We discuss the issue of boundary conditions of this partition function (see also Appendix C) and relate it to the boundary conditions of the CFT partition function. We calculate the BTZ partition function in AdS$_3$ supergravity as well as in the boundary SCFT and show that they agree. We show that the supergravity partition function is dominated at low energies by AdS$_3$, and at high energies by the Euclidean BTZ black hole; from the CFT viewpoint this phenomenon gets related to the fact that at low temperatures the NS sector dominates and at high temperatures the Ramond sector dominates.
The modular invariance of the boundary SCFT points gets related to a similar modular invariance in the three-dimensional supergravity. We next describe some new spaces which are also asymptotically AdS3 ; these are conical spaces which are created by static point particles. We discuss various ways of understanding them as solutions of string theory, the simplest being the case of ZN cones which are described as orbifolds. We brieCy discuss the CFT duals. We end the discussion with conical spaces created by moving point particles and describe the scenario in which there are two such particles with suKcient energy to form a BTZ black hole. The dual CFT is left as an open problem. In Section 12 we conclude with a discussion of some open problems.
2. Construction of classical solutions

The aim of this section is to construct the classical solution representing the five-dimensional black hole in [44]. Rather than presenting the solution and showing that it solves the low energy equations of type II superstring theory, we will describe some aspects of the art of solution-building. There are many excellent reviews of this area (see, for example, [45–53]; other general reviews on black holes in string/M theories include [54–56]), so we shall be brief. The method of construction of various classical solutions, as we will see, throws light on the microscopic configurations corresponding to these solutions. Two widely used methods for construction of classical solutions are (a) the method of harmonic superposition; (b) $O(d, d)$ transformations.
We will mainly concentrate on the first one below. For a more detailed account including the second method, see, e.g. [50].

As is well known by now, classical solutions of type II string theories can be obtained from those of M-theory [57,58] through suitable compactification and dualities. We will accordingly start with classical solutions of M-theory, or alternatively of 11-dimensional supergravity. We should note two important points:

(a) For the classical supergravity description of these solutions to be valid, we need the curvature to be small (in the scale of the 11-dimensional Planck length $l_{11}$ for solutions of M-theory, or of the string length $l_s$ for string theories).

(b) Since various superstring theories are defined (through perturbation theory) only in the (respective) weakly coupled regimes, in order to meaningfully talk about classical solutions of various string theories we need the string coupling also to be small.

For the RR-charged type II solutions (charge $Q$) that we will describe below, both the above conditions can be met if $Q \gg 1/g_s \gg 1$ (that is, $g_s Q \gg 1$, $g_s \ll 1$).

2.1. Classical solutions of M-theory

The massless modes of M-theory are those of 11-dimensional supergravity: the metric $G_{MN}$, the gravitino $\psi_M$ and a three-form $A^{(3)}_{MNP}$; $M = 0, 1, \ldots, 10$. The (bosonic part of the) classical action is
$$S_{11} = \frac{1}{2\kappa_{11}^2}\left[\int d^{11}x\,\sqrt{-G}\left(R - \frac{1}{48}(dA^{(3)})^2\right) - \frac{1}{6}\int A^{(3)}\wedge dA^{(3)}\wedge dA^{(3)}\right]. \tag{2.1}$$
There are two basic classical solutions of this Lagrangian, the M2- and M5-branes, whose intersections account for most stable supersymmetric solutions of M-theory [47,59,60].

2.1.1. The 2-brane solution of M-theory: M2

It will suffice to describe only the M2-brane solution [61] in some detail. Statement of the problem: we want to find (a) a relativistic 2-brane solution of (2.1) (say stretching along $x^{1,2}$) with (b) some number of unbroken supersymmetries.
Condition (a) implies that the solution must have an $SO(2,1)_{0,1,2} \times SO(8)_{3,4,\ldots,10}$ symmetry, together with translational symmetries along $x^{0,1,2}$ [footnote 2]. This uniquely leads to
$$ds_{11}^2 = e^{2A_1(r)}\,dx^\mu dx_\mu + e^{2A_2(r)}\,dx^m dx^m, \qquad A^{(3)} = e^{A_3(r)}\,dt\wedge dx^1\wedge dx^2, \tag{2.2}$$
where $\mu = 0, 1, 2$ denote directions parallel to the world-volume and $m = 3, \ldots, 9, 10$ denote the transverse directions; $r^2 \equiv x^m x^m$.

Footnote 2: The subscripts denote which directions are acted on by the $SO$ groups. We denote spacetime coordinates by $x^M$, $M = 0, 1, \ldots, 10$.
Condition (b) implies that there should exist a non-empty set of supersymmetry transformations $\epsilon$ preserving the solution (2.2); in particular the gravitino variation
$$\delta_\epsilon\psi_M = D_M\epsilon + \frac{1}{288}\left(\Gamma_M{}^{NPQR} - 8\,\delta_M^N\,\Gamma^{PQR}\right)F_{NPQR}\,\epsilon, \qquad D_M\epsilon \equiv \left(\partial_M + \frac{1}{4}\,\omega_M{}^{BC}\,\Gamma_{BC}\right)\epsilon, \tag{2.3}$$
must vanish for some $\epsilon$'s. It is straightforward to see that Eq. (2.3) vanishes for $M = \mu$ (world volume directions) if
$$\partial_\mu\epsilon = 0, \qquad A_3 = 3A_1 \qquad \text{and} \qquad \Gamma^{\hat 0\hat 1\hat 2}\epsilon = \epsilon, \tag{2.4}$$
where the caret denotes local Lorentz indices. (Flipping the sign of $A^{(3)}$ would correspond to $-\epsilon$ on the right-hand side of the last equation of (2.4): this would correspond to an anti-brane solution in our convention.) The $M = m$ (transverse) components of (2.3) give rise to the further conditions
$$A_1 = -2A_2, \qquad \epsilon = e^{A_3/6}\,\epsilon_0. \tag{2.5}$$
Harmonic equation. Eqs. (2.4) and (2.5) fix the three functions $A_i$ in (2.2) in terms of just one function, say $A_3$. It is easy to determine it by looking at the equation of motion of the three-form potential:
$$\partial_M\left(\sqrt{-g}\,F^{MNPQ}\right) + \frac{1}{2\cdot(4!)^2}\,\epsilon^{NPQABCDEFGH}F_{ABCD}F_{EFGH} = 0. \tag{2.6}$$
The second term is clearly zero for our ansatz (2.2) for $A^{(3)}$. The first term, evaluated for $(N, P, Q) = (0, 1, 2)$, gives
$$\partial_m\partial_m\left(e^{-A_3}\right) = 0. \tag{2.7}$$
Thus, the full M2 solution is given by
$$ds_{11}^2 = H^{1/3}\left[H^{-1}\,dx^\mu dx_\mu + dx^m dx^m\right], \qquad A^{(3)} = H^{-1}\,dt\wedge dx^1\wedge dx^2, \tag{2.8}$$
where $H = H(r)$ satisfies the harmonic equation in the transverse coordinates
$$\partial_m\partial_m H = 0. \tag{2.9}$$
The simplest solution for $H$, in an asymptotically flat space, is given by
$$H = 1 + k/r^6. \tag{2.10}$$
Clearly, multi-centred solutions are also allowed:
$$H = 1 + \sum_i \frac{k_i}{|\vec x - \vec x_i|^6}, \tag{2.11}$$
where $\vec x$ denotes the transverse directions $x^m$.
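The power of $r$ in (2.10) is fixed by the number of transverse directions. As a quick check (our own sketch, not part of the original derivation): for a radial power law in $n$ flat dimensions, $\partial_m\partial_m r^{-p} = p(p+2-n)\,r^{-p-2}$, so $r^{-p}$ is harmonic away from the origin only for $p = n-2$. The function names below are hypothetical helpers introduced for illustration.

```python
def laplacian_coefficient(p: int, n: int) -> int:
    """Coefficient c in  laplacian(r**-p) = c * r**(-p-2)  in n flat dimensions.

    Obtained from the radial Laplacian  r**(1-n) d/dr ( r**(n-1) d/dr ).
    """
    return p * (p + 2 - n)


def harmonic_power(n: int) -> int:
    """Power p for which 1/r**p solves the harmonic equation in n dimensions."""
    return n - 2


# M2-brane in 11 dimensions: 11 - 3 = 8 transverse directions.
transverse_dims = 11 - 3
p = harmonic_power(transverse_dims)
print(p, laplacian_coefficient(p, transverse_dims))  # prints: 6 0
```

With eight transverse directions this reproduces the $1/r^6$ fall-off of (2.10); any other power gives a non-zero coefficient and fails (2.9).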
We note that the constant, 1, in (2.10) is essentially an integration constant. Clearly, it can also be zero; such choices have led to M/string theory solutions involving AdS spaces. The point of this remark is to emphasize that the near-horizon geometry ($r \to 0$), important in the context of the AdS/CFT correspondence [42,62,63], in which $H = k/r^6$, corresponds to a complete solution in its own right without the appendage of the asymptotically flat regions. We will return to the AdS/CFT correspondence many times in this as well as later sections.

ADM mass. The integration constant $k$ in (2.10) affects the asymptotic fall-off of the metric as well as of the field strength, and is related to the ADM mass (per unit area of the 2-brane) $M$ and to the gauge charge (per unit area) $q$. Using the definitions [footnote 3]
$$M = \int_{S^7} d^7\Sigma^m\left(\partial^n h_{mn} - \partial_m h\right), \tag{2.12}$$
$$q = \int_{S^7} d^7\Sigma^m\,F_{m012}, \tag{2.13}$$
we get
$$M = 6k\,\Omega_7 = q. \tag{2.14}$$
Here $S^7$ represents the sphere at $r^2 = x^m x^m = \infty$ [footnote 4], $h_{MN} \equiv g_{MN} - \eta_{MN}$, $h \equiv \sum_{M=1}^{10} h_{MM}$, and $\Omega_n \equiv 2\pi^{(n+1)/2}/\Gamma((n+1)/2)$ is the volume of the unit sphere $S^n$.

BPS nature. The mass-charge equality in the last equation (2.14) is characteristic of a “BPS solution”. We provide a very brief introduction below. The 11-dimensional supersymmetry algebra [64] is
$$\{Q, Q\} = C\left(\Gamma^M P_M + \Gamma^{MN}U_{MN} + \Gamma^{MNPQR}V_{MNPQR}\right), \tag{2.15}$$
where $C$ is the charge conjugation matrix and $P$, $U$ and $V$ are various central terms. When (2.15) is evaluated [65] for the above M2 solution, we get
$$\frac{1}{V_2}\{Q_\alpha, Q_\beta\} = (C\Gamma^{\hat 0})_{\alpha\beta}\,M + (C\Gamma^{\hat 1\hat 2})_{\alpha\beta}\,q, \tag{2.16}$$
where we have used the notation
$$P_{\hat 0} = V_2 M, \qquad U_{\hat 1\hat 2} = V_2 q, \tag{2.17}$$
$V_2$ being the spatial volume of the 2-brane (assumed compactified on a large $T^2$). Now, the positivity of the $Q^2$ operator implies that $M \ge q$, where the inequality is saturated when the right-hand side of (2.16) has a zero eigenvector. For our solution (2.8), we see from (2.4) that the unbroken supersymmetry transformation parameter satisfies
$$\left(1 - \Gamma^{\hat 0\hat 1\hat 2}\right)\epsilon = 0. \tag{2.18}$$
This clearly leads to $M = q$. This is a typical example of how classical solutions with (partially) unbroken supersymmetries satisfy the extremality condition mass = charge. We note here that the remaining half of the supersymmetry transformations, the complement of the ones in (2.18), are non-linearly realized in the M2 geometry and can be regarded as spontaneously broken supersymmetries. Interestingly, the supersymmetry variations under these transformations vanish in the near-horizon limit, which has the geometry [66] $AdS_4 \times S^7$. As a result, the broken supersymmetry transformations reemerge as unbroken, leading to an enhancement of the number of supersymmetry charges $16 \to 32$ in the near-horizon limit, a fact that plays a crucial role in the AdS/CFT correspondence.

Footnote 3: We follow the normalizations in [45] which differ from, e.g. [48].

Footnote 4: The total ADM mass, which diverges, includes integrals over $x^{1,2}$ as well; we ignore them here since we are interested in the mass per unit area. Similar remarks apply to the charge.

Identi
$$A^{(3)} = e^{A_5}\,dt\wedge dx^1\wedge dx^2 + e^{A_6}\,dt\wedge dx^3\wedge dx^4. \tag{2.19}$$
Now, as before, the desire to have a BPS solution leads to the existence of unbroken supersymmetry, or $\delta_\epsilon\psi_M = 0$. This now yields four different types of equations, depending on whether the index $M$ is 0, {1, 2}, {3, 4} or the rest. These express the six functions above in terms of two independent functions $H_1$, $H_2$. These functions turn out to be harmonic in the common transverse directions when one imposes closure of the SUSY algebra or the equation of motion. The solution ultimately is
$$ds_{11}^2 = (H_1H_2)^{1/3}\left[-\frac{dt^2}{H_1H_2} + \frac{dx_1^2 + dx_2^2}{H_1} + \frac{dx_3^2 + dx_4^2}{H_2} + dx^i dx^i\right],$$
$$A^{(3)} = \frac{1}{H_1}\,dt\wedge dx^1\wedge dx^2 + \frac{1}{H_2}\,dt\wedge dx^3\wedge dx^4. \tag{2.20}$$
The above is an example of “harmonic superposition of branes” (see, e.g. [60]).

Delocalized nature of the solution. We note that the ansatz above (2.19), as well as the solution (2.20), represent a “delocalized solution”. A localized M2⊥M2 intersection would destroy translational symmetries along the spatial world-sheet of both the 2-branes. The subject of localized intersections is interesting in its own right (see, e.g. [67] which is especially relevant to the D1–D5 system), although we do not have space to discuss it here. The delocalization here involves “smearing” the first M2 solution along the directions $x^3, x^4$ (by using a continuous superposition in (2.11), see e.g. [47]), and “smearing” the second M2 solution along $x^1, x^2$.
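The effect of smearing on the harmonic function can be illustrated numerically. The sketch below is our own illustration under simple assumptions (unit density of centres, unit strength $k_i$), not the construction of [47]: it integrates the multi-centred $1/|\vec x - \vec x_i|^6$ profile over two smeared directions and compares the result with the closed form $\pi/(2\rho^4)$, where $\rho$ is the distance in the remaining six transverse directions. The function names and the step count are hypothetical.

```python
import math


def smeared_m2_profile(rho: float, steps: int = 2000) -> float:
    """Integrate 1/(rho^2 + y^2)^3 over the plane of the two smeared directions.

    In polar coordinates the integral is 2*pi * int_0^inf y dy / (rho^2 + y^2)^3.
    Substituting y = rho * tan(theta) maps it to the finite interval [0, pi/2]:
        (2*pi / rho^4) * int_0^{pi/2} sin(theta) * cos(theta)^3 dtheta,
    which is evaluated here with the midpoint rule.
    """
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += math.sin(theta) * math.cos(theta) ** 3 * d_theta
    return 2.0 * math.pi * total / rho ** 4


def closed_form(rho: float) -> float:
    """Exact value of the same integral: pi / (2 rho^4)."""
    return math.pi / (2.0 * rho ** 4)


for rho in (1.0, 2.0):
    print(rho, smeared_m2_profile(rho), closed_form(rho))
```

The smeared profile falls off as $1/\rho^4$, which is the harmonic power law in six transverse dimensions, two fewer than for the localized M2-brane.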
M2⊥M2⊥M2. Extending the above method, we get the following supergravity solution for three orthogonal M2-branes, extending respectively along $x^{1,2}$, $x^{3,4}$ and $x^{5,6}$:
$$ds_{11}^2 = (H_1H_2H_3)^{1/3}\left[(H_1H_2H_3)^{-1}(-dt^2) + H_1^{-1}(dx_1^2 + dx_2^2) + H_2^{-1}(dx_3^2 + dx_4^2) + H_3^{-1}(dx_5^2 + dx_6^2) + dx^i dx^i\right],$$
$$A^{(3)} = H_1^{-1}\,dt\wedge dx^1\wedge dx^2 + H_2^{-1}\,dt\wedge dx^3\wedge dx^4 + H_3^{-1}\,dt\wedge dx^5\wedge dx^6. \tag{2.21}$$
2.2. The 6D (D1–D5) black string solution of IIB on $T^4$

In the following, we will construct solutions of type II string theories using the above M-theory solutions by using various duality relations which we will describe as we go along. For an early account of black p-brane solutions in string theory, see [68]. We apply the transformation $T_{567}R_{10}$ to the M2⊥M2 solution (2.20):
$$\begin{array}{ccccc}
\text{M-theory} & \xrightarrow{\;R_{10}\;} & \text{IIA} & \xrightarrow{\;T_{567}\;} & \text{IIB} \\
\text{M2 (8,9)} & & \text{D2 (8,9)} & & \text{D5 (5,6,7,8,9)} \\
\text{M2 (6,7)} & & \text{D2 (6,7)} & & \text{D1 (5)}
\end{array}$$
The first transformation $R_{10}$ denotes the reduction from M-theory to type IIA. To do this, one first needs to compactify the M2⊥M2 solution along $x^{10}$ (by using the multi-centred harmonic functions, with centres separated by a distance $2\pi R_{10}$ along $x^{10}$). Essentially, at transverse distances large compared to $R_{10}$, this amounts to replacement of the harmonic function $1/r^4$ by $1/r^3$ and a suitable modification of the integration constant to reflect the appropriate quantization conditions. At this stage, one still has 11-dimensional fields. To get to IIA fields, we use the reduction formula
$$ds_{11}^2 = \exp[-2\phi/3]\,ds_{10}^2 + \exp[4\phi/3]\left(dx^{10} + C^{(1)}_\mu dx^\mu\right)^2, \qquad A^{(3)} = B\wedge dx^{10} + C^{(3)}. \tag{2.22}$$
It is instructive to verify at this stage that the classical D2 solutions do come out of the M2-brane after these transformations. We use the notation $C^{(n)}$ for the n-form Ramond–Ramond (RR) potentials in type II theories. The second transformation $T_{567}$ involves a sequence of T-dualities (for a recent account of T-duality transformations involving RR fields, see [69]). We denote by $T_m$ T-duality along the direction $x^m$; $T_{567}$ denotes $T_5T_6T_7$. The final transformation, not explicitly written in the above table, is to wrap $x^{6,7,8,9}$ on $T^4$. We will denote the volume of the $T^4$ by
$$V_{T^4} \equiv \alpha'^2(2\pi)^4\,\tilde v. \tag{2.23}$$
Assuming the number of the two orthogonal sets of M2-branes to be $Q_5$, $Q_1$, respectively, the final result is: $Q_5$ strings from wrapping D5 on $T^4$ and $Q_1$ D-strings. This is the D1–D5 system in IIB supergravity [footnote 5], characterized by the following solution:
$$ds_{10}^2 = f_1^{-1/2}f_5^{-1/2}\left(-dt^2 + dx_5^2\right) + f_1^{1/2}f_5^{1/2}\,dx^i dx^i + f_1^{1/2}f_5^{-1/2}\,dx^a dx^a,$$
$$C^{(2)}_{05} = -\frac{1}{2}\left(f_1^{-1} - 1\right), \qquad F^{(3)}_{ijk} = \epsilon_{ijkl}\,\partial_l f_5, \qquad F^{(3)} = dC^{(2)},$$
$$e^{-2\phi} = f_5 f_1^{-1}, \qquad f_{1,5} = 1 + \frac{r_{1,5}^2}{r^2}. \tag{2.24}$$
Here $C^{(2)}$ is the 2-form RR gauge potential of type IIB string theory. The parameters $r_1$, $r_5$ are defined in terms of $Q_1$, $Q_5$, see Eq. (2.28).

Spacetime symmetry. The spacetime symmetry $S$ of the above solution is:
$$S = SO(1,1) \times SO(4)_E \times \text{‘}SO(4)_I\text{’}, \tag{2.25}$$
where $SO(1,1)$ refers to directions 0, 5, $SO(4)_E$ to directions 1, 2, 3, 4 (E for external) and ‘$SO(4)_I$’ to directions 6, 7, 8, 9 (I stands for internal; the quotes signify that the symmetry is broken by wrapping the directions on a four-torus, although for low energies compared to the inverse radii it remains a symmetry of the supergravity solution).

Supersymmetry. The unbroken supersymmetry can be read off either by recalling those of the M-theory solution and following the dualities or by solving the Killing spinor equations (analogous to (2.3)). The result is:
$$\begin{aligned}
\Gamma^{056789}\epsilon_L &= \epsilon_R, \\
\Gamma^{05}\epsilon_L &= \epsilon_R.
\end{aligned} \tag{2.26}$$
The first line corresponds to the unbroken supersymmetry appropriate for the D5-brane (extending in the 5, 6, 7, 8, 9 directions). The second line refers to the D1-brane. (The superscripts in $\Gamma^{ab\ldots}$ denote local Lorentz indices like in (2.3), although we have dropped the carets.) To solve Eq. (2.26), we recast it as
$$\Gamma^{6789}\epsilon_L = \epsilon_L, \qquad \epsilon_R = \Gamma^{05}\epsilon_L.$$
Since $\epsilon_L$ has 16 independent real components to begin with, the first equation cuts it down by a half, thus leaving 8 real components. These, by virtue of the second equation, completely determine $\epsilon_R$, thus leaving no further degrees of freedom. Thus, there are eight unbroken real supersymmetries. Recall that type II string theory has 32 supersymmetries (i.e., $\epsilon_L$, $\epsilon_R$ each has 16 independent real components). In the near-horizon limit, eight of the broken supersymmetries reemerge as unbroken.

Footnote 5: Strictly speaking, we should wrap the D1–D5 string on a large circle to avoid the Gregory–Laflamme instability [70].
2.3. The extremal 5D black hole solution

Let us now compactify $x^5$ along a circle of radius $R_5$ and wrap the above solution along $x^5$ to get a spherically symmetric object in five dimensions. Let us also “add” gravitational waves (denoted $W$) moving to the “left” along $x^5$. This gives us the BPS version [44,71] of the five-dimensional black hole. Adding such a wave can be achieved either (a) by applying the Garfinkle–Vachaspati transformation [72] to the black string solution (2.24) and wrapping it on $S^1$, or (b) by augmenting the M2⊥M2 solution (2.20) by a third, transverse, set of M2-branes along $x^{5,10}$ (cf. (2.21)), and passing it through the same sequence of transformations $T_{567}R_{10}$ as before (see the table in Section 2.2), with the result that the third set of M2-branes becomes a gravitational wave (= momentum mode) along the $x^5$-circle:
$$\begin{array}{ccccc}
\text{M-theory} & \xrightarrow{\;R_{10}\;} & \text{IIA} & \xrightarrow{\;T_{567}\;} & \text{IIB} \\
\text{M2 (8,9)} & & \text{D2 (8,9)} & & \text{D5 (5,6,7,8,9)} \\
\text{M2 (6,7)} & & \text{D2 (6,7)} & & \text{D1 (5)} \\
\text{M2 (5,10)} & & \text{NS1 (5)} & & W\text{ (5)}
\end{array}$$
The last transformation essentially reflects the fact that T-duality changes winding modes to momentum modes. ($W$ denotes a gravitational wave and not a winding mode.) The final configuration corresponds to D5-branes along $x^{5,6,7,8,9}$ and D1-branes along $x^5$, with a non-zero amount of (left-moving) momentum. If the numbers of the three sets of M2-branes are $Q_1$, $Q_5$ and $N$, respectively, then these will correspond to the numbers of D1-, D5-branes and the quantized left-moving momentum, respectively. The solution for the extremal five-dimensional D1–D5 black hole is thus given by
$$ds_{10}^2 = f_1^{-1/2}f_5^{-1/2}\left(-du\,dv + (f_n - 1)\,du^2\right) + f_1^{1/2}f_5^{1/2}\,dx^i dx^i + f_1^{1/2}f_5^{-1/2}\,dx^a dx^a,$$
$$C^{(2)}_{05} = -\frac{1}{2}\left(f_1^{-1} - 1\right), \qquad F^{(3)}_{ijk} = \epsilon_{ijkl}\,\partial_l f_5,$$
$$e^{-2\phi} = f_5 f_1^{-1}, \qquad f_{1,5,n} = 1 + \left(\frac{r_{1,5,n}}{r}\right)^2. \tag{2.27}$$
The parameters $r_1$, $r_5$, $r_n$ are defined in terms of $Q_1$, $Q_5$, $N$, respectively, see Eq. (2.28).

Symmetries. Curling up $x^5$ and adding momentum along it reduces the spacetime symmetry and supersymmetry of the solution (2.27), compared to (2.25) and (2.26). Thus the spacetime symmetry is $SO(4)_E \times$ ‘$SO(4)_I$’ while the number of supersymmetries is reduced to four due to an additional condition on the Killing spinor: $\Gamma^{05}\epsilon_{L,R} = \epsilon_{L,R}$.
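Charge quantization. The parameters $r_{1,5,n}^2$ in (2.27) are related to the integer-quantized charges $Q_{1,5}$ and momentum $N$ by
$$\begin{aligned}
r_1^2 &= c_1 Q_1, & c_1 &= \frac{4G_N^5 R_5}{\pi\alpha' g_s} = \frac{g_s\alpha'}{\tilde v}, \\
r_5^2 &= c_5 Q_5, & c_5 &= g_s\alpha', \\
r_n^2 &= c_n N, & c_n &= \frac{4G_N^5}{\pi R_5} = \frac{g_s^2\alpha'^2}{\tilde v R_5^2},
\end{aligned} \tag{2.28}$$
where
$$G_N^5 = \frac{G_N^{10}}{2\pi R_5 V_{T^4}} = \frac{\pi g_s^2\alpha'^2}{4\tilde v R_5}. \tag{2.29}$$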
In the above we have used (1.1) and (2.23). For a detailed discussion of quantization conditions like (2.28), see, e.g. [45,48,73]. Here $G_N^d$ denotes the $d$-dimensional Newton's constant. $V_{T^4}$ is the volume of the four-torus in the directions $x^{6,7,8,9}$, while $R_5$ is the radius of the circle along $x^5$. Explicit
+)=3
ds52
(2.30)
(the first two exponential factors are simply the definitions of the scalars; the factor in front of $ds_5^2$ can be found easily by demanding that $ds_5^2$ is the five-dimensional Einstein metric). Here the index takes the values 1, 2, 3, 4. Using (2.30), the five-dimensional Einstein metric is given by
$$ds_5^2 = -f^{-2/3}(r)\,dt^2 + f^{1/3}(r)\left(dr^2 + r^2 d\Omega_3^2\right), \qquad f(r) = f_1(r)f_5(r)f_n(r), \tag{2.31}$$
where $f_{1,5,n}(r)$ are defined in (2.27).

Area and entropy. The above metric has a horizon at $r = 0$, which has a finite radius $R_h$ and a finite area $A_h$, given by
$$A_h = 2\pi^2 R_h^3, \qquad R_h = (r_1 r_5 r_n)^{1/3}. \tag{2.32}$$
The Bekenstein–Hawking entropy (1.10) is given by
$$S = 2\pi^2\,\frac{\sqrt{c_1c_5c_n}}{4G_N^5}\,\sqrt{Q_1Q_5N} = 2\pi\sqrt{Q_1Q_5N}, \tag{2.33}$$
where in the second step we have used
$$\sqrt{c_1c_5c_n} = \frac{4G_N^5}{\pi}, \tag{2.34}$$
which follows easily from (2.28). The fact that all “moduli” like the coupling and radii disappear from (2.33), and that the entropy is ultimately given only in terms of quantized charges, is remarkable. We defer a discussion of the geometry of this solution till the next Section 2.4 where we describe the non-extremal version.

2.4. Non-extremal
$$dx^i dx^i \to h^{-1}(r)\,dr^2 + r^2 d\Omega_{d-1}^2, \qquad h(r) = 1 - \mu/r^{d-2},$$
with the harmonic function now defined as
$$H(r) = 1 + \tilde Q/r^{d-2}, \tag{2.35}$$
where $\tilde Q$ is a combination of the non-extremality parameter $\mu$ and some “boost” angle $\delta$:
$$\tilde Q = \mu\sinh^2\delta \tag{2.36}$$
(for multi-centred solutions, $\tilde Q_i = \mu\sinh^2\delta_i$).

Rule 2. In the expression for $F_4 = dA$, make the substitution
$$H \to \tilde H(r) = 1 + \frac{\bar Q}{r^{d-2} + \tilde Q - \bar Q} = \left(1 - \frac{\bar Q}{r^{d-2}}\,H^{-1}\right)^{-1}, \qquad \bar Q = \mu\sinh\delta\cosh\delta. \tag{2.37}$$
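The equality of the two forms of $\tilde H$ in (2.37) follows directly from the definition (2.35) of $H$. The short check below is our own working, written with $\mu$ for the non-extremality parameter and $\delta$ for the boost angle; it is a sketch of the algebra rather than a quotation from the cited literature.

```latex
\begin{align*}
1 - \frac{\bar Q}{r^{d-2}}\,H^{-1}
  &= 1 - \frac{\bar Q}{r^{d-2} + \tilde Q}
   = \frac{r^{d-2} + \tilde Q - \bar Q}{r^{d-2} + \tilde Q}, \\
\left(1 - \frac{\bar Q}{r^{d-2}}\,H^{-1}\right)^{-1}
  &= \frac{r^{d-2} + \tilde Q}{r^{d-2} + \tilde Q - \bar Q}
   = 1 + \frac{\bar Q}{r^{d-2} + \tilde Q - \bar Q}.
\end{align*}
```

Since $\bar Q/\tilde Q = \coth\delta$, taking $\mu \to 0$ and $\delta \to \infty$ with $\tilde Q$ held fixed gives $\bar Q \to \tilde Q$, so that $\tilde H \to H$ and $h \to 1$: both rules reduce to the identity in the extremal limit.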
Applying these rules to the M2⊥M2⊥M2 case (2.21), we get
$$ds_{11}^2 = (H_1H_2H_3)^{1/3}\left[-(H_1H_2H_3)^{-1}h\,dt^2 + H_1^{-1}(dy_1^2 + dy_2^2) + H_2^{-1}(dy_3^2 + dy_4^2) + H_3^{-1}(dy_5^2 + dy_6^2) + h^{-1}dr^2 + r^2 d\Omega_{d-1}^2\right]. \tag{2.38}$$
The rest of the story is similar to the BPS case described in the previous subsection. Namely, we apply the duality transformation $T_{567}R_{10}$ as in the table in Section 2.3: by first reducing the M-theory solution (2.38) to IIA and then T-dualizing to IIB, and finally wrapping it on $T^4 \times S^1$.
Under the reduction from M-theory to type IIA in ten dimensions, we get
$$e^{-2\phi} = f_1 f_5^{-1},$$
$$ds_{10}^2 = f_1^{-1/2}f_5^{-1/2}\left[-dt^2 + dx_5^2 + (1 - h)\left(\cosh\alpha_n\,dt + \sinh\alpha_n\,dx_5\right)^2\right] + f_1^{1/2}f_5^{1/2}\left[\frac{dr^2}{h} + r^2 d\Omega_3^2\right] + f_1^{1/2}f_5^{-1/2}\,dx^a dx^a,$$
$$h = 1 - r_0^2/r^2, \tag{2.39}$$
where $a = 6, \ldots, 9$, and $(r, \Omega_3)$ are polar coordinates for $x^{1,2,3,4}$. $f_1$, $f_5$ (also $f_n$ in (2.40)) are defined as in (2.27), except that the parameters $r_1$, $r_5$ (also $r_n$) are no longer defined by (2.28), but by their non-extremal counterparts (2.40) and (2.41). The parameter $r_0^2$ is the same as the non-extremality parameter of (2.35), while $\alpha_{1,5,n}$ are related to the boost angle of (2.36). This is still a IIA solution. In order to get the IIB version, we have to apply the sequence $T_{567}$. We omit the details here, which are fairly straightforward. At the end, after we further use the Kaluza–Klein reduction (2.30), we get the following five-dimensional Einstein metric [75]:
$$ds_5^2 = -h f^{-2/3}\,dt^2 + f^{1/3}\left(\frac{dr^2}{h} + r^2 d\Omega_3^2\right),$$
$$f = f_1f_5f_n = \left(1 + r_1^2/r^2\right)\left(1 + r_5^2/r^2\right)\left(1 + r_n^2/r^2\right), \qquad r_{1,5,n}^2 = r_0^2\sinh^2\alpha_{1,5,n}. \tag{2.40}$$
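There are six independent parameters of the metric: $\alpha_{1,5,n}$, $r_0$, $R_5$, $\tilde v \equiv V_{T^4}/(2\pi l_s)^4$ ($l_s = \sqrt{\alpha'}$). The boost angles and the non-extremality parameters are related to the three charges and the mass $M$ as follows ($F^{(3)} \equiv dC^{(2)}$):
$$Q_1 = \frac{V}{4\pi^2 g_s}\int e^{\phi} * F^{(3)} = \frac{r_0^2\sinh 2\alpha_1}{2c_1},$$
$$Q_5 = \frac{1}{4\pi^2 g_s}\int F^{(3)} = \frac{r_0^2\sinh 2\alpha_5}{2c_5},$$
$$N = \frac{r_0^2\sinh 2\alpha_n}{2c_n},$$
$$M = \frac{R_5\,\tilde v\,r_0^2}{2\alpha'^2 g_s^2}\left(\cosh 2\alpha_1 + \cosh 2\alpha_5 + \cosh 2\alpha_n\right). \tag{2.41}$$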
There is another very interesting representation of the above-mentioned six parameters in terms of what looks like brane-, antibrane-numbers and left-, right-moving momenta:
$$N_{1,\bar 1} = \frac{r_0^2\,e^{\pm 2\alpha_1}}{4c_1}, \qquad N_{5,\bar 5} = \frac{r_0^2\,e^{\pm 2\alpha_5}}{4c_5}, \qquad N_{L,R} = \frac{r_0^2\,e^{\pm 2\alpha_n}}{4c_n}. \tag{2.42}$$
The coefficients $c_1$, $c_5$, $c_n$ are as in (2.28). Clearly
$$N_1 - N_{\bar 1} = Q_1, \qquad N_5 - N_{\bar 5} = Q_5, \qquad N_L - N_R = N. \tag{2.43}$$
The extremal limit corresponds to taking $r_0 \to 0$, $\alpha_{1,5,n} \to \infty$ keeping the charges $Q_{1,5}$, $N$ finite. We comment on the brane–antibrane interpretation in Section 2.4.2.

2.4.1. Geometry

It is easy to see that the above solution is a five-dimensional black hole, with horizon at $r = r_0$. The horizon has a finite area $A_h$, given by
$$A_h = 2\pi^2 r_0^3\cosh\alpha_1\cosh\alpha_5\cosh\alpha_n = 8\pi G_N^5\left(\sqrt{N_1} + \sqrt{N_{\bar 1}}\right)\left(\sqrt{N_5} + \sqrt{N_{\bar 5}}\right)\left(\sqrt{N_L} + \sqrt{N_R}\right). \tag{2.44}$$
Here we have used (2.34) and (2.42). The fact that the horizon has a finite area indicates that the singularity lies “inside” $r = r_0$. It is not at $r = 0$, however, which corresponds to the inner horizon (where light-cones “flip” the second time as one travels in). To locate the singularity one needs to use other coordinate patches which extend the manifold further “inside”. The singularity is time-like and the Carter–Penrose diagram (Fig. 1) is similar to that of the non-extremal Reissner–Nordström metric (see (1.3) and (1.4)). The inner and outer horizons (cf. (1.4)) in the present case are $r_- = 0$, $r_+ = r_0$.
Fig. 1. Carter–Penrose diagram for the non-extremal 5D black hole. (Labels in the figure: regions I, II and III; $r = r_0$; $r^2 = -r_n^2$.)
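The two expressions for the horizon area in (2.44) can be compared numerically. The sketch below is our own check, assuming the normalizations $G_N^5 = \pi g_s^2\alpha'^2/(4\tilde v R_5)$ and $c_1 = g_s\alpha'/\tilde v$, $c_5 = g_s\alpha'$, $c_n = g_s^2\alpha'^2/(\tilde v R_5^2)$ as we read them from (2.28) and (2.29); the function names and the sample parameter values are hypothetical and carry no physical significance.

```python
import math


def quantization_constants(g_s, alpha_p, v_tilde, R5):
    """Return (c1, c5, cn, G5), assuming the forms read from (2.28)-(2.29)."""
    G5 = math.pi * g_s**2 * alpha_p**2 / (4.0 * v_tilde * R5)
    c1 = g_s * alpha_p / v_tilde
    c5 = g_s * alpha_p
    cn = g_s**2 * alpha_p**2 / (v_tilde * R5**2)
    return c1, c5, cn, G5


def number_pair(r0, boost, c):
    """(N, N_bar) of (2.42) for one charge with boost angle `boost`."""
    return (r0**2 * math.exp(2.0 * boost) / (4.0 * c),
            r0**2 * math.exp(-2.0 * boost) / (4.0 * c))


def area_from_boosts(r0, boosts):
    """First form of (2.44): 2 pi^2 r0^3 times the product of cosh(boost)."""
    product = 1.0
    for b in boosts:
        product *= math.cosh(b)
    return 2.0 * math.pi**2 * r0**3 * product


def area_from_numbers(G5, pairs):
    """Second form of (2.44): 8 pi G5 times the product of (sqrt N + sqrt N_bar)."""
    product = 1.0
    for n, n_bar in pairs:
        product *= math.sqrt(n) + math.sqrt(n_bar)
    return 8.0 * math.pi * G5 * product


# Arbitrary illustrative values (not taken from the text).
c1, c5, cn, G5 = quantization_constants(g_s=0.1, alpha_p=1.0, v_tilde=2.0, R5=3.0)
r0, boosts = 0.5, (0.3, 0.7, 1.1)
pairs = [number_pair(r0, b, c) for b, c in zip(boosts, (c1, c5, cn))]

area_a = area_from_boosts(r0, boosts)
area_b = area_from_numbers(G5, pairs)
print(area_a, area_b)
```

The agreement relies only on $\sqrt{N} + \sqrt{\bar N} = r_0\cosh\alpha/\sqrt{c}$ for each pair together with (2.34), so it holds for any choice of the six parameters.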
2.4.2. Hawking temperature and Bekenstein–Hawking entropy

By using the formula (A.9) we get the following Hawking temperature ($\hbar = 1$) (see Appendix A for details):
$$\frac{1}{T_H} = 2\pi r_0\cosh(\alpha_1)\cosh(\alpha_5)\cosh(\alpha_n). \tag{2.45}$$
We will compare this with the CFT result for the temperature (4.31) in Section 4 (see also Section 8). By using the formula $S = A_h/4G_N^5$ (cf. (1.10)) and (2.44), we get
$$S_{BH} = 2\pi\left(\sqrt{N_1} + \sqrt{N_{\bar 1}}\right)\left(\sqrt{N_5} + \sqrt{N_{\bar 5}}\right)\left(\sqrt{N_L} + \sqrt{N_R}\right). \tag{2.46}$$
Of course, the extremal entropy (2.33) corresponds to the special case $N_R = N_{\bar 1} = N_{\bar 5} = 0$ (use (2.43)). A somewhat more general case is when $N_{\bar 1} = N_{\bar 5} = 0$, $N_R \ne 0$; the entropy in that case is given by
$$S_{BH} = 2\pi\sqrt{Q_1Q_5}\left(\sqrt{N_L} + \sqrt{N_R}\right). \tag{2.47}$$
The entropy formulae (2.33) and (2.46) are U-duality invariant, in the following sense. Consider an $S(3)$ subgroup of the U-duality group of type IIB on $T^5$, which permutes the three charges $Q_1$, $Q_5$ and $N$. Such an $S(3)$ is generated by (a) $T_{6789}$ which sends $Q_1 \to Q_5$, $Q_5 \to Q_1$, $N \to N$, and (b) $T_{9876}ST_{65}$ which sends $Q_1 \to Q_5$, $Q_5 \to N$, $N \to Q_1$. The entropy formula (2.33) remains invariant under these permutations. Since the “anti”-objects are also permuted among each other by these U-duality transformations, we can say that the entropy formula (2.46) is also U-duality invariant.

2.4.3. Comments on brane–antibrane and other non-BPS solutions

It should be noted that the “brane–antibrane” representation of the above non-extremal black hole is only suggestive at the moment. The subject of supergravity representation of brane–antibrane and other non-BPS systems is very much open; for a partial list of papers see [76–86].

2.5. Supergravity solution with non-zero vev of $B_{NS}$

Our discussion so far has been devoted to supergravity solutions in which the values of all the moduli fields were set to zero. Such solutions have the characteristic that the mass of the D1–D5 system is a sum of the charges that characterize the system.
Such bound states are marginal, without any binding energy, and can fragment into clusters of D1–D5 branes. The corresponding CFT has singularities. In order to obtain a stable bound state and a non-singular CFT we have to turn on certain moduli fields. We will consider the case when $B_{NS}$ is non-zero. The construction of the supergravity solution that corresponds to a 1/4 BPS configuration with a non-zero $B_{NS}$ was presented in [87,88]. $B_{NS}$ has non-zero components only along the directions 6, 7, 8, 9 of the internal torus. From the viewpoint of open string theory this is then a non-commutative torus. Here we will summarize the result. The solution contains, besides D1- and D5-brane charges, D3-brane charges that are induced by the $B_{NS}$. For simplicity we consider only non-zero values for $B_{79}$ and $B_{68}$. The asymptotic values are given by $B_{79}^{(\infty)} = b_{79}$ and $B_{68}^{(\infty)} = b_{68}$. It is important that at least 2 components of the $B_{NS}$ are non-zero, in order to be able to discuss the self-dual and anti-self-dual components. Below, we present the full solution, which can be derived by a solution generating technique. Details can be found in [88].
$$ds^2 = (f_1f_5)^{-1/2}\left(-dt^2 + (dx^5)^2\right) + (f_1f_5)^{1/2}\left(dr^2 + r^2 d\Omega_3^2\right) + (f_1f_5)^{1/2}\left\{Z_\varphi^{-1}\left((dx^6)^2 + (dx^8)^2\right) + Z_\psi^{-1}\left((dx^7)^2 + (dx^9)^2\right)\right\},$$
$$e^{2\phi} = f_1f_5/(Z_\varphi Z_\psi),$$
$$B_{NS}^{(2)} = \left(Z_\varphi^{-1}\sin\varphi\cos\varphi\,(f_1 - f_5) + b_{68}\right)dx^6\wedge dx^8 + \left(Z_\psi^{-1}\sin\psi\cos\psi\,(f_1 - f_5) + b_{79}\right)dx^7\wedge dx^9,$$
$$F^{(3)} = \cos\varphi\cos\psi\,\tilde K^{(3)} + \sin\varphi\sin\psi\,K^{(3)},$$
$$F^{(5)} = Z_\varphi^{-1}\left(-f_5\cos\varphi\sin\psi\,K^{(3)} + f_1\cos\psi\sin\varphi\,\tilde K^{(3)}\right)\wedge dx^6\wedge dx^8 + Z_\psi^{-1}\left(-f_5\cos\psi\sin\varphi\,K^{(3)} + f_1\cos\varphi\sin\psi\,\tilde K^{(3)}\right)\wedge dx^7\wedge dx^9,$$
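$$Z_{\varphi,\psi} = 1 + \frac{\mu_{\varphi,\psi}}{2r^2}, \qquad \mu_\varphi = \mu_1\sin^2\varphi + \mu_5\cos^2\varphi, \qquad \mu_\psi = \mu_1\sin^2\psi + \mu_5\cos^2\psi. \tag{2.48}$$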
Here $b_{68}$ and $b_{79}$ are arbitrary constants which we have added at the end by a T-duality transformation that shifts the NS B-field by a constant. Note that for $\varphi = \psi = 0$ and $b_{68} = b_{79} = 0$, the above solution reduces to the known solution for the D1–D5 system without B-field. The above solution depends upon 4 parameters $\mu_1$, $\mu_5$, and the angles $\varphi$ and $\psi$, and in general represents a system of D1-, D5- and D3-branes. Since we are seeking a solution that has no source D3-branes we require that the D3-brane charges are only induced by the presence of the non-zero $B_{NS}$. This leads to certain conditions on the solutions which we do not derive here, but whose physical implication we analyze. We discuss both the asymptotically flat and near-horizon geometry.

2.5.1. Asymptotically flat geometry

In this case the induced D3-brane charges along the (5, 7, 9) and (5, 6, 8) directions are
$$Q_3 = B_{79}^{(\infty)}Q_5, \qquad Q_{3'} = B_{68}^{(\infty)}Q_5, \tag{2.49}$$
where $B_{79}^{(\infty)} = b_{79}$, $B_{68}^{(\infty)} = b_{68}$. There is an induced contribution to the D1-brane charge. The charge $Q_{1s}$ of the source D1-branes is
$$Q_{1s} = Q_1 - b_{68}b_{79}Q_5, \tag{2.50}$$
while the D5-brane charge remains unaffected by the moduli.
Mass. Let us now study the mass formula as a function of the charges and the moduli. The mass corresponding to the 1/4 BPS solution [46], which coincides with the ADM mass, is given in terms of the appropriate charges by
$$M^2 = (Q_1 + Q_5)^2 + (Q_3 - Q_{3'})^2. \tag{2.51}$$
This can in turn be expressed in terms of $Q_{1s}$, $Q_5$ and $b_{68}$, $b_{79}$:
$$M^2 = (Q_{1s} + b_{68}b_{79}Q_5 + Q_5)^2 + Q_5^2(b_{68} - b_{79})^2. \tag{2.52}$$
We must consider the mass as a function of the moduli, holding $Q_{1s}$ and $Q_5$ fixed. We see that for non-zero moduli we have a true bound state that turns marginal when the moduli are set to zero. To locate the values of the moduli which minimize the mass, we extremize the mass with respect to the moduli. The extremal values of the moduli are
$$b_{68} = -b_{79} = \pm\sqrt{Q_{1s}/Q_5 - 1}. \tag{2.53}$$
This says that the $B_{NS}$ moduli are self-dual in the asymptotically flat metric. The mass at the critical point of the true bound state is then given by
$$M^2 = 4Q_{1s}Q_5. \tag{2.54}$$
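The extremization leading to (2.53) and (2.54) is easy to check numerically. The following sketch is our own illustration: it evaluates the mass formula (2.52) at the critical moduli and at a small grid of neighbouring points. The function names and the sample charges are hypothetical, and the charges are treated as plain real numbers.

```python
import itertools
import math


def mass_squared(q1s, q5, b68, b79):
    """BPS mass formula (2.52) at fixed source charges Q1s and Q5."""
    return (q1s + b68 * b79 * q5 + q5) ** 2 + q5 ** 2 * (b68 - b79) ** 2


def critical_moduli(q1s, q5):
    """Moduli of (2.53), choosing the upper sign; real only for Q1s >= Q5."""
    if q1s < q5:
        raise ValueError("(2.53) gives real moduli only when Q1s >= Q5")
    b = math.sqrt(q1s / q5 - 1.0)
    return b, -b


q1s, q5 = 5.0, 2.0  # arbitrary illustrative charges
b68, b79 = critical_moduli(q1s, q5)
m2_critical = mass_squared(q1s, q5, b68, b79)

# Mass squared on a small grid of perturbed moduli around the critical point.
eps = 1e-3
neighbours = [
    mass_squared(q1s, q5, b68 + d1, b79 + d2)
    for d1, d2 in itertools.product((-eps, 0.0, eps), repeat=2)
]
print(m2_critical, 4.0 * q1s * q5, min(neighbours))
```

Writing $b_{68} = b + \epsilon_1$, $b_{79} = -b + \epsilon_2$ with $b^2 = Q_{1s}/Q_5 - 1$, one finds $M^2 - 4Q_{1s}Q_5 = Q_5^2\left[(\epsilon_1 + \epsilon_2)^2 + (b(\epsilon_1 - \epsilon_2) - \epsilon_1\epsilon_2)^2\right] \ge 0$, so the critical point is in fact a minimum of (2.52), consistent with the grid above.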
2.5.2. Near-horizon geometry

In this case, absence of D3-brane sources is ensured if we set
$$Q_3^{(h)} = B_{79}^{(h)}Q_5, \qquad Q_{3'}^{(h)} = B_{68}^{(h)}Q_5, \tag{2.55}$$
where
$$B_{68}^{(h)} = \frac{\mu_1 - \mu_5}{\mu_\varphi}\sin\varphi\cos\varphi + b_{68}, \tag{2.56}$$
$$B_{79}^{(h)} = \frac{\mu_1 - \mu_5}{\mu_\psi}\sin\psi\cos\psi + b_{79} \tag{2.57}$$
are the horizon values of the two non-zero components of the B-field. Moreover, we can see that in this case
$$\frac{B_{68}^{(h)}}{\mu_\psi} = -\frac{B_{79}^{(h)}}{\mu_\varphi}, \tag{2.58}$$
which is the self-duality condition on the B-field in the near-horizon geometry. We also note that the volume of $T^4$ at the horizon is given by
$$V_{T^4}^{(h)} = \frac{\mu_1\mu_5}{\mu_\varphi\mu_\psi} = \frac{Q_{1s}^{(h)}}{Q_5}. \tag{2.59}$$
The D1-brane charge that arises from source D1-branes in this case is given by
$$Q_{1s}^{(h)} = Q_1^{(h)} - B_{68}^{(h)}B_{79}^{(h)}Q_5. \tag{2.60}$$
One can show that
$$Q_{1s}^{(h)} = Q_{1s}, \tag{2.61}$$
where $Q_{1s}$ is given by (2.50). Thus we see that not only do the parameters $b_{68}$ and $b_{79}$ have the same values here as in the asymptotically flat case, but even the source D1-branes are identical, despite the total D1-brane charges being very different in the two cases.

Mass. The 1/4 BPS mass formula in terms of the various charge densities in this case is
$$\left(\frac{M^{(h)}}{V_{T^4}^{(h)}}\right)^2 = \left(\frac{Q_1^{(h)}}{V_{T^4}^{(h)}} + Q_5\right)^2 + \left(\frac{Q_3^{(h)}}{\sqrt{g_{77}g_{99}}} - \frac{Q_{3'}^{(h)}}{\sqrt{g_{66}g_{88}}}\right)^2. \tag{2.62}$$
Using (2.55)–(2.61) it can be easily seen that
$$\left(M^{(h)}\right)^2 = V_{T^4}^{(h)}\,(4Q_{1s}Q_5). \tag{2.63}$$
Apart from the extra factor of the $T^4$ volume in the near-horizon geometry, this is exactly the same as (2.54). The extra volume factor correctly takes into account the difference in the 6-dimensional Newton's constant between the asymptotically flat and near-horizon geometries because of the difference in the $T^4$ volume in the two cases. We have already seen that the B-field is automatically self-dual in the near-horizon geometry and that the volume of $T^4$ satisfies the condition given by (2.59) and (2.60). We now see that the mass of the bound state is already at the fixed point value. Thus the solution we have here provides an explicit demonstration of the attractor mechanism [89]. The significance of this solution is that it is the description of a stable bound state in the near-horizon geometry. As we shall discuss later (Section 7), this situation corresponds to a non-singular dual CFT.

2.6. Near-horizon limit and $AdS_3 \times S^3$

In this section we will exhibit the form of the classical solution in the near-horizon limit of Maldacena [42]. This subsection, together with Appendix C, will be used as background for discussions of the AdS/CFT correspondence in Sections 6 and 11. The basic idea of the near-horizon limit is that, near the horizon of a black hole or a black brane, the energies of particles as seen by the asymptotic observer get red-shifted:
$$E_\infty = \sqrt{g_{00}}\,E. \tag{2.64}$$
For the metric (2.24) the red-shift factor is
$$\sqrt{g_{00}} = (f_1f_5)^{-1/4}. \tag{2.65}$$
Clearly as $r \to \infty$ the red-shift factor is unity. However, near the horizon we get the equation
$$E_\infty = \frac{r}{R}\,E, \tag{2.66}$$
where $R$, Eq. (2.73), is the length scale that characterizes the geometry. In the near-horizon region, defined by
$$r \ll R, \tag{2.67}$$
we see that the energy observed by the asymptotic observer goes to zero for 1nite values of E. This means that near the horizon (de1ned by (2.67)) an excitation of arbitrary energy looks massless. For massless modes this means that they have almost in1nitely long wavelengths and for massive
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
modes they appear as long-wavelength massless excitations. If one examines the potential energy of a particle in the above geometry, then in the near-horizon limit the potential barrier becomes very high, so that the modes near the horizon cannot get out. In the exact limit of $Q_1$ and $Q_5$ going to infinity the horizon degrees of freedom become exactly massless and decouple from the bulk degrees of freedom. As we shall see later, it is in this limit that the bulk string theory is dual to a SCFT which also exhibits massless behaviour in the infrared.

2.6.1. The three-dimensional anti-de Sitter space or $AdS_3$

We now apply these ideas to the metric of the D1–D5 black string with the KK charge $N = 0$, namely (2.24). In the region (2.67) the metric and other backgrounds are still given by (2.24), except that the harmonic functions change to
$$f_1 = \frac{16\pi^4 g_s \alpha'^3 Q_1}{V_4\, r^2}, \qquad f_5 = \frac{g_s \alpha' Q_5}{r^2}. \qquad (2.68)$$
Here $r^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2$ denotes the distance measured in the direction transverse to all the D-branes. The above metric differs from (2.24) only in that the harmonic functions no longer have the “1” term (see remarks after (2.11)). A more precise scaling limit of the geometry is given by [90]
$$\alpha' \to 0, \qquad \frac{r}{\alpha'} \equiv U = \text{fixed}, \qquad v \equiv \frac{V_4}{16\pi^4 \alpha'^2} = \text{fixed}, \qquad g_6 = \frac{g_s}{\sqrt{v}} = \text{fixed}. \qquad (2.69)$$
In this limit the metric in (2.24) becomes
$$ds^2 = \alpha'\,(ds_3^2 + ds^2[S^3] + ds^2[T^4]), \qquad (2.70)$$
where
$$ds_3^2 = \frac{U^2}{l^2}\,(-dx_0^2 + dx_5^2) + l^2\,\frac{dU^2}{U^2} \qquad (2.71)$$
represents three-dimensional anti-de Sitter space $AdS_3$ (see Appendix C, Eq. (C.10)) and
$$ds^2[S^3] = l^2\, d\Omega_3^2, \qquad ds^2[T^4] = \sqrt{\frac{Q_1}{v Q_5}}\,(dx_6^2 + \cdots + dx_9^2) \qquad (2.72)$$
represent a three-sphere and a four-torus. Thus the near-horizon geometry is that of $AdS_3 \times S^3 \times T^4$. Our notation for coordinates here is as follows: $AdS_3$: $(x_0, x_5, r)$; $S^3$: $(\Omega_3 \equiv (\chi, \theta, \phi))$; $T^4$: $(x_6, x_7, x_8, x_9)$. Here $r, \chi, \theta, \phi$ are spherical polar coordinates for the directions $x_1, x_2, x_3, x_4$. The length scale $l$ is the dimensionless radius of $S^3$ and of the anti-de Sitter space:
$$l = \frac{R}{\sqrt{\alpha'}}, \qquad R = \sqrt{\alpha'}\,(g_6^2 Q_1 Q_5)^{1/4}. \qquad (2.73)$$
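As a rough numerical illustration of the red-shift argument in (2.64)–(2.67), the sketch below compares the full red-shift factor with its near-horizon form $r/R$. It rests on two assumptions that are not spelled out in this subsection: the full harmonic functions are taken to be $f_i = 1 + r_i^2/r^2$ (the near-horizon forms (2.68) with the “1” restored), and the scale $R$ is taken to be $\sqrt{r_1 r_5}$, which is what (2.68), (2.69) and (2.73) combine to give. The numerical values of `r1` and `r5` are arbitrary.

```python
import math


def redshift_factor(r, r1, r5):
    """Full red-shift factor (f1 f5)^(-1/4), assuming f_i = 1 + r_i^2 / r^2."""
    f1 = 1.0 + (r1 / r) ** 2
    f5 = 1.0 + (r5 / r) ** 2
    return (f1 * f5) ** -0.25


def near_horizon_redshift(r, r1, r5):
    """Near-horizon form r / R, assuming R = sqrt(r1 * r5)."""
    return r / math.sqrt(r1 * r5)


if __name__ == "__main__":
    r1, r5 = 3.0, 5.0  # hypothetical gravitational lengths
    for r in (1e-1, 1e-2, 1e-3):
        exact = redshift_factor(r, r1, r5)
        approx = near_horizon_redshift(r, r1, r5)
        # The relative deviation shrinks like r^2 as r / R -> 0.
        print(r, exact, approx, abs(exact / approx - 1.0))
```

The two expressions agree up to corrections of order $r^2/r_1^2 + r^2/r_5^2$, and the full factor tends to unity far from the branes, as stated after (2.65).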
Note that the effective string coupling in the near-horizon limit is given by $g_{\rm eff} = g_6\sqrt{Q_1/Q_5}$. The formulas for the black hole entropy and temperature, which depend only on the near-horizon properties of the geometry, do not change in the near-horizon limit. In Section 6 we will discuss in detail the symmetries of the near-horizon geometry (2.70). The spacetime symmetries as well as supersymmetries get enhanced compared to (2.25) and (2.26).

2.6.2. The BTZ black hole

The above discussion was about the near-horizon limit of the six-dimensional black string. We now turn to the near-horizon limit of the five-dimensional black hole (2.40). The near-horizon scaling limit is given by [90,91]
$$\alpha' \to 0, \qquad r \to 0, \qquad r_0 \to 0 \qquad (2.74)$$
with
$$U \equiv \frac{r}{\alpha'} = \text{fixed}, \qquad U_0 \equiv \frac{r_0}{\alpha'} = \text{fixed}, \qquad v \equiv \frac{V_4}{16\pi^4\alpha'^2} = \text{fixed}, \qquad g_6 = \frac{g_s}{\sqrt{v}} = \text{fixed}, \qquad R_5 = \text{fixed}. \qquad (2.75)$$
In this limit the metric of the D1–D5 black hole (cf. Eq. (2.39)) reduces to the following:
$$ds^2 = \alpha'\,(ds_3^2 + ds^2[S^3] + ds^2[T^4]), \qquad (2.76)$$
where the metrics on the 3-sphere and 4-torus are still given by (2.72), whereas
$$ds_3^2 = \frac{U^2}{l^2}\,(-dx_0^2 + dx_5^2) + \frac{U_0^2}{l^2}\,(\cosh\sigma\, dx_0 + \sinh\sigma\, dx_5)^2 + \frac{l^2}{U^2 - U_0^2}\,dU^2 \qquad (2.77)$$
now represents the BTZ geometry, as we will show below. Thus the near-horizon geometry is BTZ $\times\, T^4 \times S^3$. Our coordinate definitions here are as follows: $t, \phi, \tilde r$ refer to BTZ coordinates, $\Omega_3$ stands for the $S^3$ and $x_6, x_7, x_8, x_9$ stand for the coordinates of $T^4$. To identify $ds_3^2$ with the BTZ metric (i.e., the black hole in three-dimensional anti-de Sitter space discovered in [92]), we make the coordinate redefinitions given below [90,91]:
$$\tilde r^2 = (U^2 + U_0^2\sinh^2\sigma)\,\frac{R_5^2}{l^2}, \qquad r_+ = \frac{R_5 U_0\cosh\sigma}{l}, \qquad r_- = \frac{R_5 U_0\sinh\sigma}{l}, \qquad \phi = \frac{x_5}{R_5}, \qquad t = \frac{l\,x_0}{R_5}. \qquad (2.78)$$
The metric (2.77) in these new coordinates is given by
$$ds_3^2 = -\frac{(\tilde r^2 - r_+^2)(\tilde r^2 - r_-^2)}{l^2\,\tilde r^2}\,dt^2 + \frac{\tilde r^2\, l^2}{(\tilde r^2 - r_+^2)(\tilde r^2 - r_-^2)}\,d\tilde r^2 + \tilde r^2\left(d\phi + \frac{r_+ r_-}{\tilde r^2\, l}\,dt\right)^2. \qquad (2.79)$$
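The coordinate change (2.78) can be checked numerically. The sketch below is only an illustration under stated assumptions: it reads the time differential in the second term of (2.77) as $dx_0$, takes the boost parameter to be a free number `sigma`, and uses arbitrary values for $U$, $U_0$, $l$ and $R_5$. It pushes the components of (2.77) through the Jacobian of (2.78) and compares them with the components read off from (2.79).

```python
import math


def near_horizon_to_btz(U, U0, sigma, l, R5):
    """Transform the components of (2.77) to the (t, phi, r~) coordinates of (2.78).

    Returns r~ and the tuple (g_tt, g_tphi, g_phiphi, g_rr).
    """
    ch, sh = math.cosh(sigma), math.sinh(sigma)
    # Components of (2.77) in the (x0, x5, U) coordinates.
    g00 = (-U**2 + (U0 * ch) ** 2) / l**2
    g05 = U0**2 * ch * sh / l**2
    g55 = (U**2 + (U0 * sh) ** 2) / l**2
    gUU = l**2 / (U**2 - U0**2)
    # Jacobian of (2.78): x0 = R5 t / l, x5 = R5 phi, U^2 = l^2 r~^2 / R5^2 - U0^2 sinh^2(sigma).
    r_tilde = math.sqrt(U**2 + (U0 * sh) ** 2) * R5 / l
    dU_dr = l**2 * r_tilde / (R5**2 * U)
    g_tt = g00 * (R5 / l) ** 2
    g_tphi = g05 * R5**2 / l
    g_phiphi = g55 * R5**2
    g_rr = gUU * dU_dr**2
    return r_tilde, (g_tt, g_tphi, g_phiphi, g_rr)


def btz_components(r_tilde, r_plus, r_minus, l):
    """Components (g_tt, g_tphi, g_phiphi, g_rr) read off from (2.79)."""
    lapse_sq = (r_tilde**2 - r_plus**2) * (r_tilde**2 - r_minus**2) / (l**2 * r_tilde**2)
    shift = r_plus * r_minus / (r_tilde**2 * l)
    return (-lapse_sq + r_tilde**2 * shift**2, r_tilde**2 * shift, r_tilde**2, 1.0 / lapse_sq)


if __name__ == "__main__":
    U, U0, sigma, l, R5 = 2.0, 1.0, 0.7, 1.5, 2.5  # hypothetical values
    r_tilde, transformed = near_horizon_to_btz(U, U0, sigma, l, R5)
    r_plus = R5 * U0 * math.cosh(sigma) / l
    r_minus = R5 * U0 * math.sinh(sigma) / l
    print(transformed)
    print(btz_components(r_tilde, r_plus, r_minus, l))
```

The two printed tuples should coincide up to rounding, which is the content of the statement that (2.77) becomes (2.79) under (2.78).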
In this form, the metric coincides with that of a BTZ black hole (cf. Eqs. (C.11) and (C.13)), with mass $M$ and angular momentum $J$ given by (cf. (C.12))
$$M = \frac{r_+^2 + r_-^2}{l^2}, \qquad J = \frac{2 r_+ r_-}{l}. \qquad (2.80)$$
The mass $M$ and the angular momentum $J$ for the BTZ black hole are related to the parameters of the D1–D5 black hole by
$$\frac{M}{2} = L_0 + \bar L_0 = \frac{N_L + N_R}{Q_1 Q_5}, \qquad \frac{J}{2l} = L_0 - \bar L_0 = \frac{N_L - N_R}{Q_1 Q_5}, \qquad (2.81)$$
where $N_L$, $N_R$ are defined in (2.42) and $L_0$, $\bar L_0$ are the levels of the SCFT. The extremal limit is given by $r_+ = r_-$. From (2.80) and (2.81) we see that in the extremal limit $N_R = 0$, as expected for the D1–D5 black hole.

It is important to mention the global properties of the metric (2.79), especially in relation to those of the $AdS_3$ metric in (2.70) above. Let us consider the simplest BTZ solution first, namely with $r_+ = r_- = 0$. Substituting these values in (2.79) we find that the metric is given by
$$ds_3^2 = -\frac{\tilde r^2}{l^2}\,dx_0^2 + \frac{l^2}{\tilde r^2}\,d\tilde r^2 + \tilde r^2\, d\phi^2. \qquad (2.82)$$
By comparison with (2.70) one can see that this metric is locally $AdS_3$ except for the global identification $\phi \equiv \phi + 2\pi$. This of course reflects the fact that the $r_+ = r_- = 0$ BTZ solution corresponds to the near-horizon limit of the D1–D5 string (with $N_L = N_R = 0$) wrapped on a circle along $x_5 \equiv x_5 + 2\pi R_5$. This periodic identification has two important implications:

(a) Firstly, the zero-mass BTZ black hole is a quotient of the $AdS_3$ space by a discrete isometry. Indeed, as has been shown in [90], the global property of the more general near-horizon solution (2.79) also corresponds to an appropriate quotient of $AdS_3$ by a discrete isometry, consistent with the expected global properties of BTZ black holes [92].

(b) Secondly, the difference between the geometry of the zero-mass BTZ black hole and that of $AdS_3$ (although identical locally) leads to an important difference in the boundary conditions for the fermions. For the case of $AdS_3$ the fermions are anti-periodic in $\phi$, and for the zero-mass BTZ black hole they are periodic in $\phi$. One can easily see that the constant-time slice of the metric in (2.71) has the topology of a disk. This forces the fermions to be anti-periodic in $\phi$ for $AdS_3$. For the case of the zero-mass BTZ black hole in (2.82), the constant-time slice has a singularity at $\tilde r = 0$. Therefore the fermions can be either periodic or anti-periodic. An analysis of the Killing spinors in the background of the zero-mass BTZ black hole shows that the fermions in fact have to be periodic [93].

2.6.3. The two-dimensional black hole

In this subsubsection we will discuss the near-horizon geometry of a related (pure 5-brane) system and its connection to the two-dimensional black hole [94–96]. Consider the supergravity solution of the non-extremal black hole in type IIB string theory (2.40) with $Q_1 = 0$ and $\sigma = 0$. The D1–D5 supergravity solution then reduces to the non-extremal D5-brane.
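We will now show that there is a near-horizon region where the geometry can be approximated by that of the two-dimensional black hole [94–96]. The ten-dimensional geometry with $Q_1 = 0$ is given by
$$e^{-2\phi} = \frac{1}{g_s^2}\left(1 + \frac{r_5^2}{r^2}\right), \qquad H = Q_5,$$
$$ds^2 = \left(1 + \frac{g_s\alpha' Q_5}{r^2}\right)^{-1/2}\left[-\left(1 - \frac{r_0^2}{r^2}\right)dt^2 + dx_5^2 + \cdots + dx_9^2\right] + \left(1 + \frac{g_s\alpha' Q_5}{r^2}\right)^{1/2}\left[\frac{dr^2}{1 - r_0^2/r^2} + r^2\, d\Omega_3^2\right]. \qquad (2.83)$$
To obtain the near-horizon geometry we use the IMSY limit [97] for the case of D5-branes. This is given by
$$U = \frac{r}{\alpha'} = \text{fixed}, \qquad U_0 = \frac{r_0}{\alpha'} = \text{fixed}, \qquad g_{YM}^2 = (2\pi)^3 g_s\alpha' = \text{fixed} \quad \text{with } \alpha' \to 0. \qquad (2.84)$$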
Here $g_{YM}$ is the Yang–Mills coupling on the D5-brane. This geometry is non-conformal and the dilaton depends on the scale $U$. For $\sqrt{Q_5} \ll g_{YM} U$, the string coupling and the curvature in string units are large, and therefore the valid description of the background is obtained by performing an S-duality. The solution reduces to the near-horizon geometry of non-extremal NS 5-branes. The metric and the dilaton after S-duality are given by
$$e^{2\phi} = \frac{(2\pi)^3 Q_5}{g_{YM}^2 U^2},$$
$$ds^2 = -\left(1 - \frac{U_0^2}{U^2}\right)dt^2 + (dx_5^2 + \cdots + dx_9^2) + g_s\alpha' Q_5\left(\frac{dU^2}{U^2\,(1 - U_0^2/U^2)} + d\Omega_3^2\right). \qquad (2.85)$$
Here we have scaled the metric by $g_s$ so that the ten-dimensional Newton's constant is invariant. To see that this is the metric of (2d black hole) $\times\, S^3 \times R^5$ we change coordinates by substituting $U = U_0\cosh\rho$. Then we get
$$e^{2\phi} = \frac{(2\pi)^3 Q_5}{g_{YM}^2 U_0^2}\,\frac{1}{\cosh^2\rho},$$
$$ds^2 = -\tanh^2\rho\; dt^2 + g_s\alpha' Q_5\,(d\rho^2 + d\Omega_3^2) + (dx_5^2 + \cdots + dx_9^2). \qquad (2.86)$$
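The effect of the hyperbolic substitution for $U$ can be verified with a few lines of arithmetic. The sketch below is an illustration only; the coordinate is called `rho` in the code, and the value of `U0` is arbitrary. It checks that the time component $-(1 - U_0^2/U^2)$ becomes $-\tanh^2$ of the new coordinate, and that the radial term $dU^2/[U^2(1 - U_0^2/U^2)]$ reduces to the square of the new coordinate differential, i.e. that its coefficient is exactly one.

```python
import math


def time_component(rho, U0):
    """-(1 - U0^2 / U^2) evaluated at U = U0 cosh(rho)."""
    U = U0 * math.cosh(rho)
    return -(1.0 - (U0 / U) ** 2)


def radial_coefficient(rho, U0):
    """Coefficient of d(rho)^2 obtained from dU^2 / (U^2 (1 - U0^2 / U^2))."""
    U = U0 * math.cosh(rho)
    dU_drho = U0 * math.sinh(rho)
    return dU_drho**2 / (U**2 * (1.0 - (U0 / U) ** 2))


if __name__ == "__main__":
    U0 = 2.0  # hypothetical non-extremality scale
    for rho in (0.3, 0.8, 2.0):
        print(rho, time_component(rho, U0), -math.tanh(rho) ** 2, radial_coefficient(rho, U0))
```

Both identities are independent of $U_0$, which is why the two-dimensional part of the metric takes the universal $-\tanh^2$ form.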
Now it is easily seen that the geometry reduces to that of (2d black hole) $\times\, S^3 \times R^5$. This near-horizon limit of extremal NS 5-branes was obtained in [98].

3. Semiclassical derivation of Hawking radiation

We described in some detail the construction of the D1–D5 black hole, (2.27) and (2.40), in the last section. We will now address the issue of absorption and Hawking radiation by this black
hole. Both absorption and Hawking radiation involve interesting questions, as we remarked in the introduction. For instance, classically the black hole only absorbs and does not emit. One of our goals will be to ultimately interpret this in the microscopic model, thereby explaining a crucial aspect of the event horizon. Secondly, the semiclassical treatment of Hawking radiation leads to the information puzzle, and we would like to see how standard scattering processes described in terms of the microscopic model give rise to such radiation within a unitary quantum theory. Before we proceed to the microscopic description, however, we will devote the present section to a brief review of the (semi)classical calculations of absorption/emission of particles (in the type IIB spectrum) by the D1–D5 black hole (2.40).

The absorption cross-section and emission rate of a particular field depend on how the field propagates in and backscatters from the geometry of the black hole. We will look at the equation of propagation of scalar fluctuations. We begin by writing down the IIB Lagrangian [99–101] compactified on $T^5$ (of which the D1–D5 black hole (2.40) and (2.41) is a solution):
$$S_5 = \frac{1}{2\kappa_5^2}\int d^5x\,\sqrt{-g}\,\Big[R - \frac{4}{3}(\partial_\mu\phi_5)^2 - \frac{1}{4}G^{ab}G^{cd}\big(\partial_\mu G_{ac}\,\partial^\mu G_{bd} + e^{2\phi_5}\sqrt{G}\,\partial_\mu C_{ac}\,\partial^\mu C_{bd}\big) - \frac{e^{-4\phi_5/3}}{4}\,G_{ab}F^a_{\mu\nu}F^{b\,\mu\nu} - \frac{e^{2\phi_5/3}}{4}\sqrt{G}\,G^{ab}H_{\mu\nu a}H^{\mu\nu}{}_{b} - \frac{e^{4\phi_5/3}}{12}\sqrt{G}\,H_{\mu\nu\lambda}^2\Big]. \qquad (3.1)$$
Notation: $a, b, \ldots = 5, \ldots, 9$ denote the directions along $T^5$, while $\mu, \nu, \ldots = 0, \ldots, 4$ denote the non-compact directions.
We have included in the above Lagrangian only the following ten-dimensional fields:
• the ten-dimensional dilaton $\phi$,
• the ten-dimensional string-frame metric$^6$ $ds^2$, written as $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu + G_{ab}(dy^a + A^a_\mu dx^\mu)(dy^b + A^b_\nu dx^\nu)$, which identifies $A^a_\mu$ as the KK vector fields,
• and the RR 2-form field $C^{(2)}$, written as $C^{(2)} = C_{\mu\nu}\,dx^\mu\wedge dx^\nu + C_{ab}\,dx^a\wedge dx^b$.
The various fields appearing in (3.1) are defined in terms of the above fields as follows:
• the five-dimensional dilaton $\phi_5 = \phi_{10} - (1/4)\ln[\det_{ab} G_{ab}]$,
• the KK field strengths $F^a = dA^a$,
• and the $H$-fields given by [99]
$$H_{\mu\nu a} = F_{a\,\mu\nu} - C_{ab}F^b_{\mu\nu}, \qquad H_{\mu\nu\lambda} = \partial_\mu B_{\nu\lambda} - \frac{1}{2}A^a_\mu F_{a\,\nu\lambda} - \frac{1}{2}A_{a\mu}F^a_{\nu\lambda} + \text{cyc. perm.}, \qquad (3.2)$$
$^6$ Related to the Einstein-frame metric $ds_E^2$ as $ds^2 = \exp[\phi/2]\,ds_E^2$.
where
$$A_{a\mu} = C_{a\mu} + C_{ab}A^b_\mu, \qquad F_a = dA_a, \qquad B_{\mu\nu} = C_{\mu\nu} + A^a_{[\mu}A_{\nu]a} - A^a_\mu C_{ab}A^b_\nu.$$
The five-dimensional Newton's constant, $16\pi (G_N^5)^2 \equiv 2\kappa_5^2$, is defined as in (2.29). We will now simplify the Lagrangian even further by assuming that (a) of the KK gauge fields only $A^5_\mu$ is non-zero and is of the “electric” type, and (b) $C_{ab} = 0$. This is a consistent truncation, and the D1–D5 black hole (2.40) is a solution of the truncated system. We will consider below two separate sets of scalar fluctuations:

(1) This set of fluctuations $h_{ab}$, $a \neq b$, $a, b = 6, 7, 8, 9$, is defined by
$$G_{ab} = f_1^{1/2} f_5^{-1/2}\,(\delta_{ab} + h_{ab}), \qquad a, b = 6, 7, 8, 9. \qquad (3.3)$$
Recall that $G_{ab} = f_1^{1/2} f_5^{-1/2}\,\delta_{ab}$ represents the background value (cf. (2.39)).

(2) This set of fluctuations is defined by
• $e^{2\nu} \equiv G_{66}$ (assumed equal to $G_{77} = G_{88} = G_{99}$),
• $\phi \equiv \phi_5 + \frac{1}{2}\nu_5$, where $e^{2\nu_5} = G_{55}$,
• $\lambda \equiv \frac{3}{4}\nu_5 - \frac{1}{2}\phi_5$.

For the fluctuations in Set (1), action (3.1) reduces to
$$S = -\frac{1}{8\kappa_5^2}\int d^5x\,\sqrt{-g}\;\partial_\mu h_{ab}\,\partial^\mu h_{ab}, \qquad (3.4)$$
whereas for the three fluctuations in Set (2), action (3.1) reduces to
$$S = \frac{1}{2\kappa_5^2}\int d^5x\,\sqrt{-g}\,\Big[R - (\partial_\mu\phi)^2 - \frac{4}{3}(\partial_\mu\lambda)^2 - 4(\partial_\mu\nu)^2 - \frac{1}{4}e^{(8/3)\lambda}(F^{(5)}_{\mu\nu})^2 - \frac{1}{4}e^{-(4/3)\lambda + 4\nu}F_{5,\mu\nu}^2 - \frac{1}{12}e^{(4/3)\lambda + 4\nu}H_{\mu\nu\lambda}^2\Big]. \qquad (3.5)$$
The background values of the five-dimensional Einstein metric $g_{\mu\nu}$ and the other fields are to be read off from the D1–D5 black hole solution (2.40) and (2.41). Note that $h_{ab}$, $a \neq b$, and $\phi$ couple only to $g_{\mu\nu}$ (defined in Eq. (2.40)); because of this property they are called “minimal scalars”. On the other hand, $\nu$, $\lambda$ couple to the dilaton and the RR fields as well; these are called “fixed scalars” because their values at the horizon cannot be arbitrarily chosen but are fixed by the charges $Q_1, Q_5, N$. In the following we will first calculate the absorption cross-section for minimal scalars, and later briefly mention the case of fixed scalars.

3.1. Minimal scalar

For the semiclassical absorption/emission [38–40], all we need is the equation for propagation of the fluctuation $h_{ab}$ (or $\phi$) on the black hole metric $g_{\mu\nu}$. We will denote the minimal scalar fluctuation generically by the symbol $\varphi$; since it couples only to the five-dimensional Einstein metric as in
Fig. 2. Potential for minimal scalar ($V_w$ plotted against $r$, with the incoming, reflected and transmitted waves indicated).
Eq. (3.4), the equation of motion is $D_\mu\partial^\mu\varphi = 0$. For the five-dimensional black hole metric (2.40) the above equation becomes, for the s-wave mode,
$$\left[\frac{h}{r^3}\frac{d}{dr}\left(h r^3\frac{d}{dr}\right) + f w^2\right]R_w(r) = 0, \qquad (3.6)$$
where $\varphi = R_w(r)\exp[-iwt]$. The idea behind the absorption calculation is very simple. In terms of $\psi = r^{3/2}R$ the above equation becomes
$$\left[-\frac{d^2}{dr_*^2} + V_w(r_*)\right]\psi = 0, \qquad (3.7)$$
where
$$V_w(r_*) = -w^2 f + \frac{3}{4r^2}\left(1 + \frac{2r_0^2}{r^2} - \frac{3r_0^4}{r^4}\right). \qquad (3.8)$$
The shape of the potential is shown in Fig. 2. Absorption is caused by the tunnelling of an incoming wave into the “pit of the potential”.

Near and far solutions: It is not possible to solve the wave equation exactly. However, we can devise near and far zones where the potential simplifies enough to admit known solutions. If the zones have an overlap region, then matching the near and far wave-functions and their radial derivatives will provide the solution for our purpose. In the following we will closely follow [40]. We will work in the following range of frequency and parameters:
$$r_0, r_n \ll r_1, r_5; \qquad w r_5 \ll 1. \qquad (3.9)$$
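The far and near solutions will be matched at an intermediate point $r_m$ such that
$$r_0, r_n \ll r_m \ll r_1, r_5; \qquad w r_1 \ll r_m/r_1. \qquad (3.10)$$
The existence of such an intermediate point $r_m$ is guaranteed by the conditions (3.9).

Far zone ($r > r_m$): Here the potential $V_w$ becomes (in terms of $\rho = wr$)
$$V_w(\rho) = -w^2\left(1 - \frac{3}{4\rho^2}\right). \qquad (3.11)$$
This gives a Bessel equation, so that
$$\psi = \alpha F(\rho) + \beta G(\rho), \qquad F(\rho) = \sqrt{\pi\rho/2}\;J_1(\rho), \qquad G(\rho) = \sqrt{\pi\rho/2}\;N_1(\rho). \qquad (3.12)$$
Using $R = r^{-3/2}\psi$ and the asymptotic forms of the above Bessel functions, we find the following asymptotic form for $R$:
$$R = \frac{1}{r^{3/2}}\left[\frac{e^{iwr}}{2}\left(\alpha e^{-i3\pi/4} - \beta e^{-i\pi/4}\right) + \frac{e^{-iwr}}{2}\left(\alpha e^{i3\pi/4} - \beta e^{i\pi/4}\right)\right]. \qquad (3.13)$$
Near zone ($r < r_m$): Here we have
$$\frac{h}{r^3}\frac{d}{dr}\left(h r^3\frac{dR}{dr}\right) + \left[\frac{(w r_n r_1 r_5)^2}{r^6} + \frac{w^2 r_1^2 r_5^2}{r^4}\right]R_w(r) = 0, \qquad (3.14)$$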
which is a hypergeometric equation, with solution [40]
$$R = A\tilde F + B\tilde G,$$
$$\tilde F = z^{-i(a+b)/2}\,F(-ia, -ib;\, 1 - ia - ib;\, z), \qquad \tilde G = z^{i(a+b)/2}\,F(-ia, -ib;\, 1 - ia - ib;\, z),$$
$$z = 1 - r_0^2/r^2, \qquad a = w/(4\pi T_R), \qquad b = w/(4\pi T_L), \qquad (3.15)$$
where we have introduced two parameters $T_L$, $T_R$, given by
$$T_{L,R} = \frac{r_0\, e^{\pm\alpha_n}}{2\pi r_1 r_5} = \frac{1}{2\pi r_0\sinh(\alpha_1)\sinh(\alpha_5)\exp[\mp\alpha_n]}. \qquad (3.16)$$
These, as we will see in Section 4, play the role of “left”- and “right”-moving temperatures (cf. Eq. (4.29)). In the second step we have used the definition of the gravitational lengths $r_1$, $r_5$ in (2.40). We now impose on the “near solution” (3.15) the condition that the wave at the horizon should not have any outgoing component: it should be purely ingoing (no “white hole”). This gives $B = 0$.
Matching: We now match $R$ and $dR/dr$ between the near and far regions at some point $r_m$ in the overlap region. This gives
$$\sqrt{\pi/2}\;w^{3/2}\,\frac{\alpha}{2} = A\,\frac{\Gamma(1 - ia - ib)}{\Gamma(1 - ib)\,\Gamma(1 - ia)}, \qquad \beta/\alpha \ll 1. \qquad (3.17)$$
Fluxes: Eq. (3.6) for $R$ implies $(d/dr)\mathcal{F} = 0$, where
$$\mathcal{F}(r) = \frac{1}{2i}\left[R^*\, h r^3\,\frac{dR}{dr} - \text{c.c.}\right]. \qquad (3.18)$$
In order to find out what fraction of the flux gets absorbed at the horizon, we compute the ratio
$$R_1 = \mathcal{F}(r_0)/\mathcal{F}^{\rm in}(\infty) = r_0^2\,\frac{a + b}{w\,|e_1|^2}\,w^3\,\frac{\pi}{2}, \qquad (3.19)$$
where the superscript “in” indicates the flux calculated from the “ingoing” part of the wave at infinity, and $e_1$ denotes the ratio of Gamma functions appearing on the right-hand side of (3.17).

3.1.1. Absorption cross-section

In order to define the absorption cross-section in the standard way, one has to consider plane waves and not s-waves. It is easy to derive that
$$e^{-iwz} = (4\pi/w^3)\,e^{-iwr} Z_{000} + \text{other partial waves}, \qquad (3.20)$$
where we use the notation $Z_{l m_1 m_2}$ for the $S^3$ analogs of the spherical harmonics $Y_{lm}$. From this and the standard definition of absorption cross-section we get
$$\sigma_{\rm abs} = (4\pi/w^3)\,R_1, \qquad (3.21)$$
which evaluates to [40]
$$\sigma_{\rm abs} = 2\pi^2 r_1^2 r_5^2\,\frac{\pi w}{2}\,\frac{\exp(w/T_H) - 1}{(\exp(w/2T_R) - 1)(\exp(w/2T_L) - 1)}, \qquad (3.22)$$
where $T_{L,R}$ is given by (3.16), and $T_H$, to be identified below with the Hawking temperature, is given by the harmonic mean
$$\frac{1}{T_H} = \frac{1}{2}\left(\frac{1}{T_L} + \frac{1}{T_R}\right) = 2\pi r_0\sinh(\alpha_1)\sinh(\alpha_5)\cosh(\alpha_n). \qquad (3.23)$$
Note that in the regime (3.9), the Hawking temperature agrees with (2.45). We will make this comparison in Section 4.6, where we will also compare the temperatures (3.16) and (3.23) with the values obtained from the D1–D5 CFT (see also Section 8). In the $w \to 0$ limit, one gets [38]
$$\sigma_{\rm abs} = A_h, \qquad (3.24)$$
where $A_h$ denotes the area of the event horizon.

3.1.2. Hawking radiation

The semiclassical calculation of Hawking radiation is performed through the standard route of finding Bogoliubov coefficients representing mixing of negative and positive frequency modes due to evolution from “in” to “out” vacua, defined with respect to Minkowski observers existing in the
asymptotically flat regions at $t = -\infty$ and $t = +\infty$, respectively [16] (see, e.g. [18,102] for more leisurely derivations). The rate of radiation is given by
$$\Gamma_H = \sigma_{\rm abs}\,(e^{w/T_H} - 1)^{-1}\,\frac{d^4k}{(2\pi)^4}. \qquad (3.25)$$
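The formulas (3.16), (3.22), (3.23) and (3.25) are simple enough to evaluate directly, and doing so gives a quick check of the low-energy limit (3.24). The sketch below is an illustration under stated assumptions: the boost parameter in (3.16) is passed as `alpha_n`, the horizon area is taken in the form $2\pi^2 r_1 r_5 r_0\cosh\alpha_n$ appropriate to the regime (3.9) (Eq. (2.44) itself is not reproduced in this section), and all numerical values are arbitrary.

```python
import math


def temperatures(r0, r1, r5, alpha_n):
    """T_L and T_R from (3.16), and T_H as their harmonic mean, (3.23)."""
    prefactor = r0 / (2.0 * math.pi * r1 * r5)
    t_left = prefactor * math.exp(alpha_n)
    t_right = prefactor * math.exp(-alpha_n)
    t_hawking = 2.0 / (1.0 / t_left + 1.0 / t_right)
    return t_left, t_right, t_hawking


def sigma_abs(w, r0, r1, r5, alpha_n):
    """Minimal-scalar absorption cross-section, Eq. (3.22)."""
    t_left, t_right, t_hawking = temperatures(r0, r1, r5, alpha_n)
    thermal = math.expm1(w / t_hawking) / (
        math.expm1(w / (2.0 * t_right)) * math.expm1(w / (2.0 * t_left))
    )
    return 2.0 * math.pi**2 * r1**2 * r5**2 * (math.pi * w / 2.0) * thermal


def hawking_rate_density(w, r0, r1, r5, alpha_n):
    """Gamma_H of (3.25) per unit phase-space volume d^4k / (2 pi)^4."""
    t_hawking = temperatures(r0, r1, r5, alpha_n)[2]
    return sigma_abs(w, r0, r1, r5, alpha_n) / math.expm1(w / t_hawking)


def horizon_area_dilute(r0, r1, r5, alpha_n):
    """Assumed form of the horizon area in the regime (3.9)."""
    return 2.0 * math.pi**2 * r1 * r5 * r0 * math.cosh(alpha_n)


if __name__ == "__main__":
    r0, r1, r5, alpha_n = 0.1, 3.0, 5.0, 0.4  # hypothetical parameters
    w = 1e-8  # frequency far below all three temperatures
    print(sigma_abs(w, r0, r1, r5, alpha_n), horizon_area_dilute(r0, r1, r5, alpha_n))
    print(hawking_rate_density(w, r0, r1, r5, alpha_n))
```

For $w$ much smaller than $T_L$, $T_R$ the two numbers on the first printed line should agree closely, which is the statement $\sigma_{\rm abs} \to A_h$; the exponentials are evaluated with `math.expm1` to avoid cancellation at small $w$.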
As we remarked above, the Hawking temperature, given by Eq. (3.23), agrees with the temperature (2.45) in the region (3.9) (see Section 4.6). We will see in Section 8 how $\Gamma_H$ and $\sigma_{\rm abs}$ are reproduced in the D-brane picture.

3.1.3. Importance of near-horizon physics

It is interesting to note two points for later use:
(a) The “near zone” described above is simply the near-horizon region as in the AdS/CFT context (see Sections 6 and 8).
(b) With the inequality $\beta/\alpha \ll 1$ in (3.17), solution (3.13) in the far zone simply becomes
$$R = \frac{\alpha}{\sqrt{2}}\,e^{iwr}, \qquad (3.26)$$
which is just a free incoming wave, with flux $\mathcal{F}^{\rm in}(\infty) = |\alpha|^2$. As we saw (Eq. (3.19)), the parameter $\alpha$ also disappears from the ultimate calculation because of the division by the flux at infinity.
Thus, at the end of the day, it is only the near-horizon geometry, together with the mere existence of the asymptotically flat region, which ultimately determines the absorption cross-section (3.22) and the Hawking flux (3.25).

3.2. Fixed scalars

The graybody factor for the fixed scalars $\phi$, $\lambda$ [101–106] (also reviewed in [107]) follows from a similar, but more involved, analysis of the (coupled) equations of motion of these two fields which follow from action (3.5). These were solved for general $Q_1, Q_5, N$ in [104]. The method for computing the absorption cross-section and the Hawking rate is similar to that employed for the minimal scalars. As we noted above, the important ingredient in the semiclassical calculation is the near-horizon equation of motion; this turns out to be Eq. (8.57). Using this, we arrive at the following result for the absorption cross-section for fixed scalars (for $w, T_R \ll T_L$):
$$\sigma_{\rm abs} = \frac{1}{4}A_h\,(w r_n)^2\left(1 + \frac{4\pi^2 T_H^2}{w^2}\right), \qquad (3.27)$$
where the temperatures $T_L, T_R, T_H$ are defined in (3.16) and (3.23) and the area $A_h$ of the horizon is defined in (2.44). As we will see in Section 8, understanding the absorption and emission of fixed scalars from D-brane models is a subtle problem [104]. Resolution of this problem requires [108] a new insight from the AdS/CFT correspondence about the coupling of D-branes to supergravity.
4. The microscopic modeling of black hole and gauge theory of the D1–D5 system

In Section 2.2 we discussed the supergravity solution of the D1–D5 black string. The solution with $N = 0$ consists of $Q_1$ D1-branes and $Q_5$ D5-branes. The realization that solitons carrying Ramond–Ramond charges can be represented at weak string coupling by open strings with Dirichlet boundary conditions [109] allows the formulation of the microscopic theory for the D1–D5 system. We will be interested only in the low-energy degrees of freedom of the D1–D5 system, and thus we ignore all the massive string modes. There are two ways to proceed in the study of the massless modes, and we shall discuss both of them. The first method is a description in terms of a 2-dimensional gauge theory of the D-branes, and the second method involves identifying D1-branes with instantons of a 4-dimensional gauge theory. The latter description is more accurate and is valid for instantons of all sizes. The 2-dimensional gauge theory description is valid near the point in the moduli space of instantons where the instantons have shrunk to zero size [110]. We will discuss this more approximate description first and detail its domain of validity.

4.1. The D1–D5 System and the $N = 4$, $U(Q_1) \times U(Q_5)$ gauge theory in 2-dimensions

Consider type IIB string theory with five coordinates, say $x^5, \ldots, x^9$, compactified on $S^1 \times T^4$. The microscopic model for solution (2.27) with $N = 0$ consists of $Q_1$ D1-branes and $Q_5$ D5-branes [44,71]. The D1-branes are along the $x^5$ coordinate compactified to a circle $S^1$ of radius $R_5 \equiv R$, while the D5-branes are parallel to $x^5$ and $x^6, \ldots, x^9$ compactified on a torus $T^4$ of volume $V_{T^4} \equiv V_4$. The charge $N$ is related to the momenta of the excitations of this system along $S^1$. We will work in the following region of parameter space:
$$V_4 \sim O(\alpha'^2), \qquad R \gg l_s \equiv \sqrt{\alpha'}. \qquad (4.1)$$
Let us briefly discuss the implications of the above region in parameter space. The size of the torus $T^4$ is of the order of the string scale, so the masses of the winding and momentum modes of the strings are of the order of $1/l_s$. This implies that for energies $E \ll 1/l_s$ we can neglect these modes. On the other hand, the radius of the $S^1$ is much larger than the string scale. Therefore for $E \ll 1/l_s$ the winding modes can be neglected but one has to retain all the momentum modes. Effectively we can then treat the circle as non-compact.

We now discuss the symmetries preserved by this configuration of D-branes. The $SO(1,9)$ symmetry of ten dimensions is broken down to $SO(1,1) \times SO(4)_E \times SO(4)_I$. The $SO(4)_E$ stands for rotations of the 1, 2, 3, 4 directions and the $SO(4)_I$ for rotations of the 6, 7, 8, 9 directions. As the 6, 7, 8, 9 directions are compactified on the torus $T^4$, the $SO(4)_I$ symmetry is also broken. But we can still use the $SO(4)_I$ algebra to classify states and organize fields. This configuration of D-branes preserves 8 supersymmetries out of the 32 supersymmetries of type IIB theory. Since we are retaining only momentum modes along $x^5$, the low-energy effective action for the collective modes of this D-brane configuration is (1+1)-dimensional.

More precisely, we shall see that the low-energy dynamics of this D-brane system is described by a $U(Q_1) \times U(Q_5)$ gauge theory in two dimensions with $N = 4$ supersymmetry [48,111]. The gauge theory will be assumed to be in the Higgs phase because we are interested in the bound state where the branes are not separated from each other in the transverse direction. In order to really achieve this
Fig. 3. Open strings in the D1–D5 system: the (1,1), (1,5), (5,1) and (5,5) strings.
and prevent branes from splitting off we will turn on the Fayet–Iliopoulos parameters. We shall show in Section 6.2 that in supergravity these parameters correspond to the vev of the Neveu–Schwarz $B_{NS}$. In principle we can also turn on the $\theta$ term in the gauge theory. This corresponds to a vev of a certain linear combination of the RR 0-form and 4-form.

The elementary excitations of the D-brane system (see Fig. 3) correspond to open strings with two ends attached to the branes, and there are three classes of such strings: the (1,1), (5,5) and (1,5) strings. The associated fields fall into vector multiplets and hypermultiplets, using the terminology of $N = 2$, $D = 4$ supersymmetry.

(1,1) strings: The part of the spectrum coming from (1,1) strings is simply the dimensional reduction, to 1+1 dimensions (the $(t, x^5)$-space), of the $N = 1$, $U(Q_1)$ gauge theory in 9+1 dimensions [109]. The bosonic fields of this theory can be organized into the vector multiplet and the hypermultiplet of $N = 2$ theory in four dimensions as
$$\text{Vector multiplet}: \; A_0^{(1)}, A_5^{(1)}, Y_m^{(1)}, \; m = 1, 2, 3, 4; \qquad \text{Hypermultiplet}: \; Y_i^{(1)}, \; i = 6, 7, 8, 9. \qquad (4.2)$$
The $A_0^{(1)}, A_5^{(1)}$ are the $U(Q_1)$ gauge fields in the non-compact directions. The $Y_m^{(1)}$'s and $Y_i^{(1)}$'s are gauge fields in the compact directions of the $N = 1$ super Yang–Mills theory in ten dimensions. They
are hermitian $Q_1 \times Q_1$ matrices transforming as adjoints of $U(Q_1)$. The hypermultiplets of $N = 2$ supersymmetry are doublets of the $SU(2)_R$ symmetry of the theory. The adjoint matrices $Y_i^{(1)}$ can be arranged as doublets under $SU(2)_R$ as
$$N^{(1)} = \begin{pmatrix} N_1^{(1)} \\ N_2^{(1)\dagger} \end{pmatrix} = \begin{pmatrix} Y_9^{(1)} + iY_8^{(1)} \\ Y_7^{(1)} - iY_6^{(1)} \end{pmatrix}. \qquad (4.3)$$
(5,5) strings: The field content of these massless open strings is similar to that of the (1,1) strings, except for the fact that the gauge group is $U(Q_5)$ instead of $U(Q_1)$. Normally, one would have expected the gauge theory of the (5,5) strings to be a dimensional reduction of $N = 1$ $U(Q_5)$ super Yang–Mills to 5+1 dimensions. In region (4.1), where we are working, we can ignore the Kaluza–Klein modes on $T^4$, effectively leading to a theory in 1+1 dimensions. The vector multiplets and the hypermultiplets are given by
$$\text{Vector multiplet}: \; A_0^{(5)}, A_5^{(5)}, Y_m^{(5)}, \; m = 1, 2, 3, 4; \qquad \text{Hypermultiplet}: \; Y_i^{(5)}, \; i = 6, 7, 8, 9. \qquad (4.4)$$
The $A_0^{(5)}, A_5^{(5)}$ are the $U(Q_5)$ gauge fields in the non-compact directions. The $Y_m^{(5)}$'s and $Y_i^{(5)}$'s are gauge fields in the compact directions of the $N = 1$ super Yang–Mills theory in ten dimensions. They are hermitian $Q_5 \times Q_5$ matrices transforming as adjoints of $U(Q_5)$. The hypermultiplets $Y_i^{(5)}$ can be arranged as doublets under $SU(2)_R$ as
$$N^{(5)} = \begin{pmatrix} N_1^{(5)} \\ N_2^{(5)\dagger} \end{pmatrix} = \begin{pmatrix} Y_9^{(5)} + iY_8^{(5)} \\ Y_7^{(5)} - iY_6^{(5)} \end{pmatrix}. \qquad (4.5)$$
Since the $x^m$ are compact, the (1,1) strings can also have winding modes around the $T^4$. These are, however, massive states in the (1+1)-dimensional theory and can be ignored. This is because their masses are proportional to $1/\sqrt{\alpha'}$ (see (4.1)), which can be neglected for energies $E \ll 1/l_s$. Similarly, the part of the spectrum coming from (5,5) strings is the dimensional reduction, to 5+1 dimensions, of the $N = 1$, $U(Q_5)$ gauge theory in 9+1 dimensions. In this case, the gauge field components $A_m^{(5)}$ ($m = 6, 7, 8, 9$) also have a dependence on $x^m$. Momentum modes corresponding to this dependence are neglected because the size of the 4-torus is of the order of the string scale $\sqrt{\alpha'}$. The neglect of the winding modes of the (1,1) strings and the KK modes of the (5,5) strings is consistent with T-duality. A set of four T-duality transformations along $x^m$ interchanges D1- and D5-branes and also converts the momentum modes of the (5,5) strings along $T^4$ into winding modes of (1,1) strings around the dual torus [112]. Since these winding modes have been ignored, a T-duality covariant formulation requires that we should also ignore the associated momentum modes.

(1,5) and (5,1) strings: The field content obtained so far is that of $N = 2$, $U(Q_1) \times U(Q_5)$ gauge theory, in 1+5 dimensions, reduced to 1+1 dimensions on $T^4$. The $SO(4)_I \sim SU(2)_L \times SU(2)_R$ rotations on the tangent space of the torus act on the components of the adjoint hypermultiplets $X_i^{(1,5)}$ as an R-symmetry. To this set of fields we have to add the fields from the (1,5) sector that are constrained to live in 1+1 dimensions by the ND boundary conditions. These strings have their ends fixed on different types of D-branes and, therefore, the corresponding fields transform in the fundamental representation of both $U(Q_1)$ and $U(Q_5)$. The ND boundary conditions have the important consequence that the (1,5) sector fields form a hypermultiplet which
is chiral w.r.t. $SO(4)_I$. The chirality projection is due to the GSO projection. Hence, the R-symmetry group is $SU(2)_R$.
$$\chi = \begin{pmatrix} A \\ B^\dagger \end{pmatrix}. \qquad (4.6)$$
A few comments are in order:
(1) The inclusion of these fields, coming from the (1,5) and (5,1) strings, breaks the supersymmetry by half, to the equivalent of $N = 1$ in $D = 6$, and the final theory only has $SU(2)_R$ R-symmetry.
(2) The fermionic superpartners of these hypermultiplets, which arise from the Ramond sector of the massless excitations of (1,5) and (5,1) strings, carry spinorial indices under $SO(4)_E$ and are singlets under $SO(4)_I$.
(3) The $U(1) \times U(1)$ subgroup is important. One combination, involving the sum of the $U(1)$'s, leaves the hypermultiplet invariant. $(A_{a'a}, B_{aa'})$ have charges $(+1, -1)$ under the relative $U(1)$.
(4) $\chi$ is a chiral spinor of $SO(4)_I$ with convention $\Gamma_{6789}\chi = \chi$.
(5) Since we are describing the Higgs phase in which all the branes sit on top of each other, we have $Y_i^{(1,5)} = 0$.
(6) There are two coupling constants in the gauge theory: the coupling constant of the D1-brane gauge theory, $g_1^2 = g_s/(2\pi\alpha')$, and the coupling constant of the D5-brane gauge theory, $g_5^2 = g_s/(2\pi\alpha'\tilde v)$. Here $\tilde v$ is related to the volume of $T^4$ by $V_{T^4} = \alpha'^2 (2\pi)^4\tilde v$. Since we are interested in low energies $E \ll 1/l_s$, the gauge theory is strongly coupled.
(7) In the above discussion, from the geometry of the configuration, the fields $X_i^{(1,5)}$ along the torus directions and the fields $\chi$ are compact, since they parametrize positions along the compact directions. However, it is not consistent with gauge invariance to take hypermultiplets of the $N = 2$ multiplet to be compact.$^7$ Therefore the hypermultiplets are non-compact. Since we are interested in energies $E \ll 1/l_s$, the expectation values of the hypers (which have units of energy) satisfy $X_i^{(1,5)}, \chi \ll 1/l_s$. Thus, the (1,1), (5,5) and (1,5) strings do not probe the entire domain of $T^4$. Therefore, even though geometrically we are on $T^4$, it is consistent to take the hypers to be non-compact as long as we are interested in $E \ll 1/l_s$.
(8) The above discussion of the domain of validity of the gauge theory ties in nicely with the fact that the gauge theory is an approximate description while the instanton moduli space description is a more global description. The gauge theory is valid when the hypers get small expectation values. The hypermultiplet of the gauge theory corresponds to the scales of the instanton (via the ADHM construction for instantons on $R^4$). Thus the gauge theory is valid when the instantons have shrunk to almost zero size [110].

In summary, the gauge theory of the D1–D5 system is a (1+1)-dimensional (4,4) supersymmetric gauge theory with gauge group $U(Q_1) \times U(Q_5)$. The matter content of this theory consists of hypermultiplets $Y^{(1)}$, $Y^{(5)}$ transforming as adjoints of $U(Q_1)$ and $U(Q_5)$, respectively. It also has the hypermultiplets $\chi$, which transform as bi-fundamentals of $U(Q_1) \times U(Q_5)$.

$^7$ This can be seen as follows. The D-term equations in the $N = 2$ theory admit a symmetry under the complexified gauge symmetry $GL(C, Q_1) \times GL(C, Q_5)$ (see for instance [113]). This symmetry involves an arbitrary scaling.
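4.2. The potential terms

The Lagrangian of the above gauge theory can be worked out from the dimensional reduction of the $d = 6$, $N = 1$ gauge theory. The potential energy density of the vector and hypermultiplets is a sum of four positive terms (in this section, for convenience of notation, we have defined $Y_i^{(1)} = Y_i$, $Y_i^{(5)} = X_i$, $Y_m^{(1)} = Y_m$, $Y_m^{(5)} = X_m$) [48,111]:

\[ V = V_1 + V_2 + V_3 + V_4 , \tag{4.7} \]
\[ V_1 = -\frac{1}{4g_1^2} \sum_{m,n} \mathrm{Tr}_{U(Q_1)} [Y_m, Y_n]^2 - \frac{1}{4g_5^2} \sum_{m,n} \mathrm{Tr}_{U(Q_5)} [X_m, X_n]^2 , \tag{4.8} \]
\[ V_2 = -\frac{1}{2g_1^2} \sum_{i,m} \mathrm{Tr}_{U(Q_1)} [Y_i, Y_m]^2 - \frac{1}{2g_5^2} \sum_{i,m} \mathrm{Tr}_{U(Q_5)} [X_i, X_m]^2 , \tag{4.9} \]
\[ V_3 = \frac{1}{4} \sum_{m} \mathrm{Tr}_{U(Q_1)} (\chi X_m - Y_m \chi)(X_m \chi^\dagger - \chi^\dagger Y_m) , \tag{4.10} \]
\[ V_4 = \frac{1}{4} \mathrm{Tr}_{U(Q_1)} \left( \chi\, i\Gamma^T_{ij} \chi^\dagger + i[Y_i, Y_j]^+ - \zeta^+_{ij} \frac{1}{Q_1} \right)^2 + \frac{1}{4} \mathrm{Tr}_{U(Q_5)} \left( \chi^\dagger i\Gamma_{ij} \chi + i[X_i, X_j]^+ - \zeta^+_{ij} \frac{1}{Q_5} \right)^2 . \tag{4.11} \]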
The potential energy $V_4$ comes from a combination of F and D terms of the higher dimensional gauge theory. $\Gamma_{ij} = (i/2)[\Gamma_i, \Gamma_j]$ are spinor rotation matrices. The notation $a^+_{ij}$ denotes the self-dual part of the antisymmetric tensor $a_{ij}$. In $V_4$ we have included the Fayet–Iliopoulos (FI) terms $\zeta^+_{ij}$, which form a triplet under $SU(2)_R$. Their inclusion is consistent with $N = 4$ SUSY. The FI terms can be identified with the self-dual part of $B_{ij}$, the antisymmetric tensor of the NS sector of the closed string theory [114]. This identification at this stage rests on the fact that (i) $\zeta^+_{ij}$ and $B^+_{ij}$ have identical transformation properties under $SU(4)_I$ and (ii) at the origin of the Higgs branch, where $\chi = X = Y = 0$, $V_4 \sim \zeta^+_{ij} \zeta^+_{ij}$. This signals a tachyonic mode from the viewpoint of string perturbation theory [114]. The tachyon mass is easily computed and this implies the relation $\zeta^+_{ij} \zeta^+_{ij} \sim B^+_{ij} B^+_{ij}$. These issues are discussed further in Section 7.

4.3. D-flatness equations and the moduli space

The supersymmetric ground state (semiclassical) is characterized by the two sets of D-flatness equations, which are obtained by setting $V_4 = 0$. They are best written in terms of the $SU(2)_R$ doublet fields $N^{(1)}$ and $N^{(5)}$:

\[ N^{(1)} = \begin{pmatrix} N_1^{(1)} \\ N_2^{(1)\dagger} \end{pmatrix} = \begin{pmatrix} Y_9 + iY_8 \\ Y_7 + iY_6 \end{pmatrix} , \qquad N^{(5)} = \begin{pmatrix} N_1^{(5)} \\ N_2^{(5)\dagger} \end{pmatrix} = \begin{pmatrix} X_9 + iX_8 \\ X_7 + iX_6 \end{pmatrix} . \tag{4.12} \]
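We also define $\zeta = \zeta^+_{69}$ and $\zeta_c = \zeta^+_{67} + i\zeta^+_{68}$. With these definitions the two sets of D-flatness conditions become:

\[ \begin{aligned}
(AA^\dagger - B^\dagger B)_{a'b'} + [N_1^{(1)}, N_1^{(1)\dagger}]_{a'b'} - [N_2^{(1)}, N_2^{(1)\dagger}]_{a'b'} &= \frac{\zeta}{Q_1} \delta_{a'b'} , \\
(AB)_{a'b'} + [N_1^{(1)}, N_2^{(1)\dagger}]_{a'b'} &= \frac{\zeta_c}{Q_1} \delta_{a'b'} , \\
(A^\dagger A - BB^\dagger)_{ab} + [N_1^{(5)}, N_1^{(5)\dagger}]_{ab} - [N_2^{(5)}, N_2^{(5)\dagger}]_{ab} &= \frac{\zeta}{Q_5} \delta_{ab} , \\
(A^\dagger B^\dagger)_{ab} + [N_1^{(5)}, N_2^{(5)\dagger}]_{ab} &= \frac{\zeta_c}{Q_5} \delta_{ab} .
\end{aligned} \tag{4.13} \]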
Here $a', b'$ run from $1, \ldots, Q_1$ and $a, b$ run from $1, \ldots, Q_5$. The hypermultiplet moduli space is a solution of the above equations modulo the gauge group $U(Q_1) \times U(Q_5)$. A detailed discussion of the procedure was given in [88,111]. Below we summarize the main points. If we take the trace parts of (4.13) we get the same set of three equations as the D-flatness equations for a $U(1)$ theory with $Q_1 Q_5$ hypermultiplets, with $U(1)$ charge assignment $(+1, -1)$ for $(A_{a'b}, B^T_{a'b})$. Thus,

\[ \sum_{a'b} (A_{a'b} A^*_{a'b} - B^T_{a'b} B^{T*}_{a'b}) = \zeta , \tag{4.14} \]
\[ \sum_{a'b} A_{a'b} B^T_{a'b} = \zeta_c . \tag{4.15} \]

For a given point on the surface defined by (4.14) and (4.15) the traceless parts of (4.13) lead to $3Q_1^2 + 3Q_5^2 - 6$ constraints among $4Q_1^2 + 4Q_5^2 - 8$ degrees of freedom corresponding to the traceless parts of the adjoint hypermultiplets $N^{(1)}$ and $N^{(5)}$. Using the $Q_1^2 + Q_5^2 - 2$ gauge conditions corresponding to $SU(Q_1) \times SU(Q_5)$ we have $(3Q_1^2 + 3Q_5^2 - 6) + (Q_1^2 + Q_5^2 - 2) = 4Q_1^2 + 4Q_5^2 - 8$ conditions for the $4Q_1^2 + 4Q_5^2 - 8$ degrees of freedom in the traceless parts of $N^{(1)}$ and $N^{(5)}$. The 8 degrees of freedom corresponding to $\mathrm{Tr}\, X_i$ and $\mathrm{Tr}\, Y_i$, $i = 6, 7, 8, 9$, correspond to the centre-of-mass of the D5- and D1-branes, respectively.

4.4. The bound state in the Higgs phase

Having discussed the moduli space that characterizes the SUSY ground state, we can discuss the fluctuations of the transverse vector multiplet scalars $X_m$ and $Y_m$, $m = 1, 2, 3, 4$. In the Higgs phase $X_m = Y_m = 0$ and $\chi = \bar\chi$ lies on the surface defined by (4.14) and (4.15). The relevant action of fluctuations in the path integral is

\[ S = \sum_m \int dt\, dx_5 \left( \mathrm{Tr}_{U(Q_5)} \partial_\alpha X_m \partial^\alpha X_m + \mathrm{Tr}_{U(Q_1)} \partial_\alpha Y_m \partial^\alpha Y_m \right) + \int dt\, dx_5\, (V_2 + V_3) . \tag{4.16} \]
We restrict the discussion to the case when $Q_5 = 1$ and $Q_1$ is arbitrary. In this case the matrix $X_m$ is a real number which we denote by $x_m$. $\chi$ is a complex column vector with components $(A_{a'}, B_{a'})$. Since we are looking at the fluctuations of the $Y_m$ only to quadratic order in the path integral, the integrals over the different $Y_m$ decouple from each other and we can treat each of them separately. Let us discuss the fluctuation $Y_1$ and set $(Y_1)_{a'b'} = \delta_{a'b'}\, y_{1a'}$. Then the potential $V_3$, (4.10), becomes

\[ V_3 = \sum_{a'} (|A_{a'}|^2 + |B_{a'}|^2)(y_{1a'} - x_1)^2 . \tag{4.17} \]
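We will prove that $|A_{a'}|^2 + |B_{a'}|^2$ can never vanish if the FI terms are non-zero. In order to do this let us analyze the second D-term equation in (4.13):

\[ A_{a'} B_{b'} + [N_1^{(1)}, N_2^{(1)\dagger}]_{a'b'} = \frac{\zeta_c}{Q_1} \delta_{a'b'} . \tag{4.18} \]

We can use the complex gauge group $GL(C, Q_1)$ to diagonalize the complex matrix $N_1^{(1)}$ [113], with eigenvalues $n_{a'}$. Then (4.18) becomes

\[ A_{a'} B_{b'} + (n_{a'} - n_{b'}) (N_2^{(1)\dagger})_{a'b'} = \frac{\zeta_c}{Q_1} \delta_{a'b'} . \tag{4.19} \]

For $a' \neq b'$, this determines the non-diagonal components of $N_2^{(1)}$:

\[ (N_2^{(1)\dagger})_{a'b'} = - \frac{A_{a'} B_{b'}}{n_{a'} - n_{b'}} . \tag{4.20} \]

For $a' = b'$, we get the equations

\[ A_{a'} B_{a'} = \frac{\zeta_c}{Q_1} , \tag{4.21} \]

which imply that

\[ |A_{a'} B_{a'}| = \frac{|\zeta_c|}{Q_1} \tag{4.22} \]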
with the consequence that $|A_{a'}|$ and $|B_{a'}|$ are non-zero for all $a' = 1, \ldots, Q_1$. This implies that $(|A_{a'}|^2 + |B_{a'}|^2) > 0$, and hence the fluctuation $(y_{1a'} - x_1)$ is massive. If we change variables $y_{1a'} \to y_{1a'} + x_1$, then $x_1$ is the only flat direction. This corresponds to the global translation of the 5-brane in the $x_1$ direction. A similar analysis can be done for all the remaining directions $m = 2, 3, 4$ with identical conclusions. This shows that a non-zero FI term implies a true bound state of the $Q_5 = 1$, $Q_1 = N$ system. If FI = 0, then there is no such guarantee and the system can easily fragment, due to the presence of flat directions in $(Y_m)_{a'b'}$.

What the above result says is that when the FI parameters are non-zero the zero mode of the fields $(Y_m)_{a'b'}$ is massive. If we regard the zero mode as a collective coordinate, then the Hamiltonian of the zero mode has a quadratic potential which agrees with the near-horizon limit of the Liouville potential derived in [88,114]. The general case with an arbitrary number of $Q_1$- and $Q_5$-branes seems significantly harder to prove and is an open question, but the result is very plausible on physical grounds. If the potential for a single test D1-brane is attractive, it is hard to imagine any change in this fact if there are two test D1-branes, because the D1-branes by themselves can form a bound state.
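The positivity argument above can be made quantitative with a short numerical sketch. It is an illustration only, not part of the original derivation: the function names `mass_coefficient` and `lower_bound` are hypothetical, and the bound $|A_{a'}|^2 + |B_{a'}|^2 \geq 2|\zeta_c|/Q_1$ is the elementary arithmetic-geometric mean inequality applied to (4.22).

```python
def mass_coefficient(a: complex, zeta_c: complex, q1: int) -> float:
    """Return |A|^2 + |B|^2, with B fixed by the constraint A * B = zeta_c / Q1 (eq. 4.21).

    Hypothetical helper: `a` is one non-zero component A_{a'}.
    """
    b = zeta_c / (q1 * a)
    return abs(a) ** 2 + abs(b) ** 2


def lower_bound(zeta_c: complex, q1: int) -> float:
    """AM-GM bound: |A|^2 + |B|^2 >= 2 |A B| = 2 |zeta_c| / Q1."""
    return 2 * abs(zeta_c) / q1


if __name__ == "__main__":
    zeta_c, q1 = 0.3 + 0.4j, 5
    for a in (0.1, 1.0, 2.0 - 1.0j):
        print(a, mass_coefficient(a, zeta_c, q1), ">=", lower_bound(zeta_c, q1))
```

Under these assumptions the coefficient of $(y_{1a'} - x_1)^2$ in (4.17) is bounded away from zero whenever $\zeta_c \neq 0$, and the bound collapses to zero when $\zeta_c = 0$, in line with the remark that flat directions reappear when the FI terms vanish.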
4.5. The conformally invariant limit of the gauge theory

In the previous section we showed that the D1–D5 system leads to a bound state in the Higgs phase. The next question is about the low energy collective excitations of the bound state. They are described by the sigma model corresponding to the hypermultiplet moduli defined by Eqs. (4.13). The Lagrangian (bosonic part) is

\[ S = \sum_i \int dt\, dx_5 \left( \mathrm{tr}_{U(Q_5)} \partial_\alpha X_i \partial^\alpha X_i + \mathrm{tr}_{U(Q_1)} \partial_\alpha Y_i \partial^\alpha Y_i \right) + \int dt\, dx_5\, (\partial_\alpha \chi\, \partial^\alpha \chi^\dagger) . \tag{4.23} \]
This is a difficult non-linear system, with $N = 4$ SUSY. Since we are interested in the low energy dynamics we may as well ask whether there is an SCFT fixed point. This SCFT fixed point is relevant in the study of the near-horizon geometry (2.71). Such an SCFT must have (4,4) supersymmetry (16 real supersymmetries) with a central charge $c = 6(Q_1 Q_5 + 1)$. Now note that Eqs. (4.14) and (4.15) describe a hyper-Kähler manifold and hence the sigma model defined on it is an SCFT with (4,4) SUSY. We can then consider the part of the action involving the $X_i$ and $Y_i$, which are solved in terms of the $\chi$, as giving a deformation of the SCFT. Now this deformation clearly breaks the superconformal symmetry. The sigma model action at the conformally invariant point is

\[ \sum_{a'b} \int dt\, dx_5 \left( \partial_\alpha A_{a'b}\, \partial^\alpha A^*_{a'b} + \partial_\alpha B^T_{a'b}\, \partial^\alpha B^{T*}_{a'b} \right) . \tag{4.24} \]
The sigma model fields are constrained to be on the surface defined by (4.14) and (4.15). Further, after appropriate gauge fixing the residual gauge invariance inherited from the gauge theory is the Weyl group $S(Q_1) \times S(Q_5)$ [111]. The Weyl invariance can be used to construct gauge invariant strings of various lengths. If $Q_1$ and $Q_5$ are relatively prime it is indeed possible to prove the existence of a single winding string with minimum unit of momentum given by $1/(Q_1 Q_5)$. This is associated with the longest cyclic subgroup of $S(Q_1) \times S(Q_5)$. Cyclic subgroups of shorter length cycles lead to strings with minimum momentum $1/(l_1 l_5)$, where $l_1$ and $l_5$ are the lengths of the cycles [111]. In a different way of describing these degrees of freedom, we shall see in the next sections that strings of various lengths are associated with chiral primary operators of the conformal field theory on the moduli space of instantons on a 4-torus.

4.6. A quick derivation of entropy and temperatures from CFT

We pause in this section to show that certain deductions about thermodynamic properties can be made just by the knowledge of the central charge ($c = 6Q_1 Q_5$, same as in Section 4.5 for large $Q_1 Q_5$) and the level of the Virasoro algebra of the unitary superconformal field theory mentioned above. This information is sufficient to calculate the number of microstates (a more detailed and complete derivation is given in Section 8). To find the microstates of the D1–D5 black hole we look for states with $L_0 = N_L$ and $\bar L_0 = N_R$. The asymptotic number of distinct states of this SCFT is given by Cardy's formula [115]

\[ \Omega = \Omega_L \Omega_R , \qquad \Omega_{L,R} = \exp\left[ 2\pi \sqrt{c N_{L,R}/6} \right] = \exp\left[ 2\pi \sqrt{Q_1 Q_5 N_{L,R}} \right] . \tag{4.25} \]
From the Boltzmann formula one obtains

\[ S = \ln \Omega = 2\pi \left( \sqrt{Q_1 Q_5 N_L} + \sqrt{Q_1 Q_5 N_R} \right) . \tag{4.26} \]
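As a quick numerical illustration of (4.25) and (4.26), the following sketch evaluates the Cardy growth and the resulting entropy. The function names `log_cardy` and `d1d5_entropy` are hypothetical, and the central charge is taken to be $c = 6Q_1 Q_5$ as stated in this section.

```python
import math


def log_cardy(c: float, level: float) -> float:
    """Logarithm of the Cardy degeneracy, ln Omega = 2 pi sqrt(c N / 6), for one chirality."""
    return 2 * math.pi * math.sqrt(c * level / 6)


def d1d5_entropy(q1: int, q5: int, n_l: float, n_r: float) -> float:
    """Entropy S = ln(Omega_L Omega_R) with c = 6 Q1 Q5, as in eq. (4.26)."""
    c = 6 * q1 * q5
    return log_cardy(c, n_l) + log_cardy(c, n_r)


if __name__ == "__main__":
    print(d1d5_entropy(q1=2, q5=3, n_l=4, n_r=9))
```

The two chiralities contribute additively to $\ln \Omega$, which is why the entropy is a sum of two square roots rather than the square root of a sum.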
This exactly reproduces the Bekenstein–Hawking entropy (2.47). The exact agreement is surprising since for arbitrary $N_R, N_L \neq 0$ the states being considered are far from being supersymmetric. The quantity $\Omega_L(N_L)$ defines (the number of states in) a microcanonical ensemble in which we fix the energy of the left-movers to be

\[ E_L = \frac{N_L}{R_5} . \]

The equivalent canonical ensemble is defined by

\[ Z_L \equiv \mathrm{Tr} \exp[-\beta_L E_L] = \sum_{N_L = 0}^{\infty} \Omega_L(N_L) \exp[-\beta_L N_L / R_5] = \sum_{N_L = 0}^{\infty} \exp\left[ 2\pi \sqrt{Q_1 Q_5 N_L} - \beta_L N_L / R_5 \right] . \tag{4.27} \]
For large enough temperature $T_L$ ($= 1/\beta_L$), the sum in the second line is dominated by a saddle point value that occurs at

\[ \pi \sqrt{\frac{Q_1 Q_5}{N_L}} = \frac{\beta_L}{R_5} . \tag{4.28} \]

This determines the temperature of the left movers; a similar reasoning works for the right movers. The two temperatures are given by

\[ T_L = \frac{1}{\beta_L} = \frac{1}{\pi R_5} \sqrt{\frac{N_L}{Q_1 Q_5}} \quad \text{and} \quad T_R = \frac{1}{\beta_R} = \frac{1}{\pi R_5} \sqrt{\frac{N_R}{Q_1 Q_5}} . \tag{4.29} \]

The temperature $T_H = 1/\beta_H$ of the full system is conjugate to the total energy $E = E_L + E_R$, and is given by (see footnote 8)

\[ \beta_H = \frac{1}{2} (\beta_L + \beta_R) \;\Rightarrow\; \frac{1}{T_H} = \frac{1}{2} \left( \frac{1}{T_L} + \frac{1}{T_R} \right) , \tag{4.30} \]

\[ T_H = 1/\beta_H = \frac{2}{\pi R_5 \sqrt{Q_1 Q_5}} \, \frac{\sqrt{N_L N_R}}{\sqrt{N_L} + \sqrt{N_R}} . \tag{4.31} \]

Footnote 8: To derive the "harmonic" rule in (4.30) (cf. also (3.23)), define the full partition function as $Z \equiv Z_L Z_R = \mathrm{Tr} \exp[-(\beta_L E_L + \beta_R E_R)]$, which we can rewrite as $Z = \mathrm{Tr} \exp[-\frac{1}{2} (\beta_L + \beta_R)(E_L + E_R) - \frac{1}{2} (\beta_L - \beta_R)(E_L - E_R)]$. Performing the sum over $E_L - E_R$ (equivalently, over $N_L - N_R$), and ignoring the unimportant multiplicative constant, we get $Z = \mathrm{Tr} \exp[-\beta_H (E_L + E_R)]$, where $\beta_H = \frac{1}{2} (\beta_L + \beta_R)$.
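The saddle-point condition and the harmonic rule for the temperatures can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the formulas coded are (4.27) to (4.31) with $c = 6Q_1 Q_5$.

```python
import math


def exponent(n: float, q1q5: float, beta: float, r5: float) -> float:
    """Exponent of the summand in eq. (4.27): 2 pi sqrt(Q1 Q5 N) - beta N / R5."""
    return 2 * math.pi * math.sqrt(q1q5 * n) - beta * n / r5


def beta_chiral(q1q5: float, n: float, r5: float) -> float:
    """Inverse temperature fixed by the saddle point, eq. (4.28)."""
    return math.pi * r5 * math.sqrt(q1q5 / n)


def hawking_temperature(q1q5: float, n_l: float, n_r: float, r5: float) -> float:
    """Closed form of eq. (4.31)."""
    root = math.sqrt(n_l * n_r) / (math.sqrt(n_l) + math.sqrt(n_r))
    return 2 * root / (math.pi * r5 * math.sqrt(q1q5))


if __name__ == "__main__":
    q1q5, n_l, n_r, r5 = 6.0, 400.0, 100.0, 2.0
    beta_l = beta_chiral(q1q5, n_l, r5)
    beta_r = beta_chiral(q1q5, n_r, r5)
    # Harmonic rule (4.30): beta_H is the average of the chiral inverse temperatures.
    print(1 / (0.5 * (beta_l + beta_r)), hawking_temperature(q1q5, n_l, n_r, r5))
```

Because the exponent is a concave function of $N_L$, the stationary point found from (4.28) is a maximum, which is what justifies keeping only the saddle in the sum.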
Comparison with supergravity: We have so far encountered three expressions for temperature, in Eq. (2.45) in Section 2, in Eqs. (3.16) and (3.23) in Section 3, and the expressions (4.29) and (4.31) above. It will be interesting to compare them. Note that the dilute gas regime (3.9) implies (see Eqs. (2.40)) sinh 1 ; sinh 5 1; sinh n :
(4.32)
In this region (see (2.41) and (2.42)) Q1; 5 ≈ N1; 5 N1;U 5U : This gives r0 cosh 1; 5 ≈
r0 exp[1; 5 ] ≈ c1; 5 Q1; 5 : 2
Eq. (2.42) gives r0 en = 2 cn NL ; r0 e−n = 2 cn NR :
Using these expressions (and also (2.34)) it is easy to see that (2.45) and (3.23) both reduce to (4.31). Also (3.16) reduces to (4.29).

4.7. D1-branes as solitonic strings of the D5 gauge theory

In the previous subsection we found that the Higgs branch of the gauge theory of the D1–D5 system flows in the infrared to an $N = (4, 4)$ SCFT on a target space $\tilde{M}$ with central charge $6Q_1 Q_5$ (we have excluded the centre of mass degrees of freedom). For black hole processes like Hawking radiation it is important to have a better handle on the target space $\tilde{M}$. In this section we review the arguments which show that the target space $\tilde{M}$ is a resolution of the orbifold $T^4 \times (\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$. (We use the notation $\tilde{T}^4$ to distinguish it from the compactification torus $T^4$.) This discussion gives a succinct description of the bound state in the Higgs phase.

The $Q_1$ D1-branes can be thought of as $Q_1$ instantons in the 5 + 1 dimensional $U(Q_5)$ super Yang–Mills theory of the $Q_5$ D5-branes [116,117]. To see this, note that the DBI action of the D5-branes has a coupling

\[ \int d^6 x\; C^{(2)} \wedge \mathrm{Tr}[F^{(5)} \wedge F^{(5)}] . \tag{4.33} \]

The non-trivial gauge configurations which are independent of $x^0, x^5$ and have zero values of $A_0^{(5)}$ and $A_5^{(5)}$ but non-zero values of $\mathrm{Tr}[F^{(5)} \wedge F^{(5)}]$ act as sources of the Ramond–Ramond two-form $C^{(2)}_{05}$. If these gauge field configurations have to preserve half the supersymmetries of the D5-brane action they should be self-dual. Thus they are instanton solutions of four-dimensional Euclidean Yang–Mills theory in the 6, 7, 8, 9 directions. Additional evidence for this comes from the fact that the integral property of $\mathrm{Tr}[F^{(5)} \wedge F^{(5)}]$ corresponds to the quantization of the D1-brane charge. The action for a $Q_1$ instanton solution is $Q_1/g_{YM}^2$. This agrees with the tension of $Q_1$ D1-branes, namely $Q_1/g_s$. If one is dealing with
non-compact D5-branes and D1-branes it is seen that the D-flatness conditions of the D1-brane theory are identical to the ADHM construction of $Q_1$ instantons of $U(Q_5)$ gauge theory [117]. In fact, the last two equations in (4.13) are the relevant ADHM equations in this case [118].

From the discussion in the preceding paragraphs we conclude that, excluding Wilson lines of the $U(Q_5)$ gauge theory, $\tilde{M}$ can be thought of as the moduli space of $Q_1$ instantons of a $U(Q_5)$ gauge theory on $T^4$. This moduli space is known to be the Hilbert scheme of $Q_1 Q_5$ points on $\tilde{T}^4$ [119]. $\tilde{T}^4$ can be different from the compactification torus $T^4$. This is a smooth resolution of the singular orbifold $(\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$. We will provide physically motivated evidence for the fact that the moduli space of $Q_1$ instantons of a $U(Q_5)$ gauge theory on $T^4$ is a smooth resolution of the orbifold $(\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$ using string dualities. The evidence is topological and it comes from realizing that the cohomology of $\tilde{M}$ is the degeneracy of the ground states of the D1–D5 gauge theory. We can calculate this degeneracy in two ways. One is by explicitly counting the cohomology of $(\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$ [120]. The second method is to use string dualities as discussed below. Both these methods give identical answers. Thus at least at the level of cohomology we are able to verify that the moduli space of $Q_1$ instantons of a $U(Q_5)$ gauge theory on $T^4$ is a smooth resolution of $(\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$.

Consider type IIB string theory compactified on $S^1 \times T^4$ with a fundamental string having $Q_5$ units of winding along $x^6$ and $Q_1$ units of momentum along $x^6$. On performing the sequence of dualities $S\, T_{6789}\, S\, T_{56}$ we can map the fundamental string to the D1–D5 system (we can de-compactify the $x^5$ direction finally) with $Q_1$ D1-branes along $x^5$ and $Q_5$ D5-branes along $x^5, x^6, x^7, x^8, x^9$ [121].
Therefore, using this U-duality sequence the BPS states of this fundamental string (that is, states with either purely left moving or right moving oscillators) map to ground states of the D1–D5 system. The number of ground states of the D1–D5 system is given by the dimension of the cohomology of $\tilde{M}$. From the perturbative string degeneracy counting, the generating function of BPS states with left moving oscillator number $N_L$ is given by

\[ \sum_{N_L = 0}^{\infty} d(N_L)\, q^{N_L} = 256 \times \prod_{n=1}^{\infty} \left( \frac{1 + q^n}{1 - q^n} \right)^8 , \tag{4.34} \]

where $d(N_L)$ refers to the degeneracy of states with left moving oscillator number $N_L$. The D1–D5 system is U-dual to the perturbative string with $N_L = Q_1 Q_5$. Explicit counting of the cohomology of $(\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$ gives $d(Q_1 Q_5)/256$. The factor 256 comes from quantization of the center of mass coordinate along the 1, 2, 3, 4 directions and the 6, 7, 8, 9 directions. The center of mass coordinate is represented by the $U(1)$ of $U(Q_1) \times U(Q_5)$. Therefore the low energy theory of the bound D1–D5 system is an SCFT on the target space

\[ R^4 \times T^4 \times (\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5) . \tag{4.35} \]
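The first few degeneracies $d(N_L)$ in (4.34) can be obtained by expanding the infinite product as a truncated power series. The sketch below is a minimal illustration (the function name `bps_degeneracies` is hypothetical); it multiplies in each factor $(1 + q^n)/(1 - q^n)$ eight times using exact integer arithmetic.

```python
def bps_degeneracies(max_level: int) -> list[int]:
    """Coefficients d(0..max_level) of 256 * prod_n ((1 + q^n) / (1 - q^n))^8."""
    coeffs = [0] * (max_level + 1)
    coeffs[0] = 1
    for n in range(1, max_level + 1):
        for _ in range(8):
            # Multiply by (1 + q^n): go downwards so each old coefficient is used once.
            for k in range(max_level, n - 1, -1):
                coeffs[k] += coeffs[k - n]
            # Multiply by 1 / (1 - q^n) = 1 + q^n + q^(2n) + ...: go upwards.
            for k in range(n, max_level + 1):
                coeffs[k] += coeffs[k - n]
    return [256 * c for c in coeffs]


if __name__ == "__main__":
    for level, d in enumerate(bps_degeneracies(4)):
        print(level, d)
```

Factors with $n$ larger than `max_level` do not affect the retained coefficients, so truncating the product at $n =$ `max_level` is exact to the order kept.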
It is also useful to interpret the moduli represented by the free $T^4$ from the gauge theory of $Q_5$ D5-branes. As this theory is on a torus $T^4$ it admits Wilson lines along all the cycles of the $T^4$. The free torus in the above equation stands for Wilson lines in the theory of the D5-branes. From the D1–D5 gauge theory point of view the free $T^4$ belongs to the Higgs branch, as it stands for the center of mass coordinate on $T^4$ parametrized by $\mathrm{Tr}_{U(Q_5)}(Y_i^{(5)}) + \mathrm{Tr}_{U(Q_1)}(Y_i^{(1)})$, while $R^4$ belongs to the Coulomb branch. It is parametrized by $\mathrm{Tr}_{U(Q_5)}(Y_m^{(5)}) + \mathrm{Tr}_{U(Q_1)}(Y_m^{(1)})$. Thus the Higgs branch of the D1–D5 gauge theory flows in the infrared to an $N = (4, 4)$ SCFT on $T^4 \times (\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$. The SCFT on $T^4$ is free while the symmetric product contains interesting dynamics. From now on we will denote the symmetric product orbifold $(\tilde{T}^4)^{Q_1 Q_5}/S(Q_1 Q_5)$ by $M$.
5. The SCFT on the orbifold M

As we have seen in the last section, the Higgs branch of the D1–D5 system flows in the infrared to a product of an $N = (4, 4)$ SCFT on a resolution of the orbifold $M$ with a free theory on $T^4$. As the SCFT on $T^4$ is free and decoupled we will first focus on the symmetric product. We will formulate the SCFT on $M$ as a free field theory with identifications and discuss its symmetries. In particular we find a new $SO(4)$ algebra which is useful in the classification of states. We then construct the operators which correspond to the moduli of the SCFT, including the four operators which correspond to the resolution of the orbifold. Finally, we explicitly construct the chiral primaries of the $N = (4, 4)$ SCFT on the symmetric product orbifold $M$.

The $N = (4, 4)$ SCFT on $M$ is described by the free Lagrangian

\[ S = \frac{1}{2} \int d^2 z \left[ \partial x^i_A \bar\partial x_{i,A} + \psi^i_A(z) \bar\partial \psi^i_A(z) + \tilde\psi^i_A(\bar z) \partial \tilde\psi^i_A(\bar z) \right] . \tag{5.1} \]

Here $i$ runs over the $\tilde{T}^4$ coordinates 1, 2, 3, 4 and $A = 1, 2, \ldots, Q_1 Q_5$ labels the various copies of the four-torus. The symmetric group $S(Q_1 Q_5)$ acts by permuting the copy indices. It introduces various twisted sectors which we will discuss later. The free field realization of this SCFT has $N = (4, 4)$ superconformal symmetry. To set up our notations and conventions we review the $N = 4$ superconformal algebra.

5.1. The N = 4 superconformal algebra

The algebra is generated by the stress energy tensor, four supersymmetry currents, and a local $SU(2)$ R-symmetry current. The operator product expansions (OPEs) of the algebra with central charge $c$ are given by (see for example [122])

\[ \begin{aligned}
T(z) T(w) &= \frac{\partial T(w)}{z - w} + \frac{2T(w)}{(z - w)^2} + \frac{c}{2(z - w)^4} , \\
G^a(z) G^{b\dagger}(w) &= \frac{2T(w) \delta_{ab}}{z - w} + \frac{2\bar\sigma^i_{ab} \partial J^i}{z - w} + \frac{4\bar\sigma^i_{ab} J^i}{(z - w)^2} + \frac{2c\, \delta_{ab}}{3(z - w)^3} , \\
J^i(z) J^j(w) &= \frac{i\epsilon^{ijk} J^k}{z - w} + \frac{c\, \delta^{ij}}{12(z - w)^2} , \\
T(z) G^a(w) &= \frac{\partial G^a(w)}{z - w} + \frac{3G^a(w)}{2(z - w)^2} , \\
T(z) G^{a\dagger}(w) &= \frac{\partial G^{a\dagger}(w)}{z - w} + \frac{3G^{a\dagger}(w)}{2(z - w)^2} , \\
T(z) J^i(w) &= \frac{\partial J^i(w)}{z - w} + \frac{J^i(w)}{(z - w)^2} , \\
J^i(z) G^a(w) &= \frac{G^b(w) (\sigma^i)^{ba}}{2(z - w)} , \\
J^i(z) G^{a\dagger}(w) &= - \frac{(\sigma^i)^{ab} G^{b\dagger}(w)}{2(z - w)} .
\end{aligned} \tag{5.2} \]
Here $T(z)$ is the stress energy tensor, $G^a(z)$, $G^{b\dagger}(z)$ the $SU(2)$ doublet of supersymmetry generators and $J^i(z)$ the $SU(2)$ R-symmetry current. The $\sigma$'s stand for Pauli matrices and the $\bar\sigma$'s stand for the complex conjugates of Pauli matrices. In the free field realization described below, the above holomorphic currents occur together with their anti-holomorphic counterparts, which we will denote by $\tilde{J}(\bar z)$, $\tilde{G}(\bar z)$ and $\tilde{T}(\bar z)$. In particular, the R-parity group will be denoted by $SU(2)_R \times \widetilde{SU(2)}_R$.

5.2. Free field realization of N = (4, 4) SCFT on the orbifold M

A free field realization of the $N = 4$ superconformal algebra with $c = 6Q_1 Q_5$ can be constructed out of $Q_1 Q_5$ copies of four real fermions and bosons. The generators are given by

\[ \begin{aligned}
T(z) &= \partial X_A(z) \partial X_A^\dagger(z) + \frac{1}{2} \Psi_A(z) \partial \Psi_A^\dagger(z) - \frac{1}{2} \partial \Psi_A(z) \Psi_A^\dagger(z) , \\
G^a(z) &= \begin{pmatrix} G^1(z) \\ G^2(z) \end{pmatrix} = \sqrt{2} \begin{pmatrix} \Psi^1_A(z) \\ \Psi^2_A(z) \end{pmatrix} \partial X^2_A(z) + \sqrt{2} \begin{pmatrix} -\Psi^{2\dagger}_A(z) \\ \Psi^{1\dagger}_A(z) \end{pmatrix} \partial X^1_A(z) , \\
J_R^i(z) &= \frac{1}{2} \Psi_A(z) \sigma^i \Psi_A^\dagger(z) .
\end{aligned} \tag{5.3} \]

We will use the following notation for the zero mode of the R-parity current:

\[ J_R^i = \frac{1}{2} \oint \frac{dz}{2\pi i}\, \Psi_A(z) \sigma^i \Psi_A^\dagger(z) . \tag{5.4} \]
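In the above, the summation over $A$, which runs from 1 to $Q_1 Q_5$, is implied. The bosons $X$ and the fermions $\Psi$ are

\[ \begin{aligned}
X_A(z) &= (X^1_A(z), X^2_A(z)) = \frac{1}{\sqrt{2}} \left( x^1_A(z) + i x^2_A(z),\; x^3_A(z) + i x^4_A(z) \right) , \\
\Psi_A(z) &= (\Psi^1_A(z), \Psi^2_A(z)) = \frac{1}{\sqrt{2}} \left( \psi^1_A(z) + i \psi^2_A(z),\; \psi^3_A(z) + i \psi^4_A(z) \right) , \\
X_A^\dagger(z) &= \begin{pmatrix} X^{1\dagger}_A(z) \\ X^{2\dagger}_A(z) \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} x^1_A(z) - i x^2_A(z) \\ x^3_A(z) - i x^4_A(z) \end{pmatrix} , \\
\Psi_A^\dagger(z) &= \begin{pmatrix} \Psi^{1\dagger}_A(z) \\ \Psi^{2\dagger}_A(z) \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \psi^1_A(z) - i \psi^2_A(z) \\ \psi^3_A(z) - i \psi^4_A(z) \end{pmatrix} .
\end{aligned} \tag{5.5} \]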
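5.3. The SO(4) algebra

In addition to the local R-symmetry, the free field realization of the $N = 4$ superconformal algebra has additional global symmetries which can be used to classify the states. There are two global $SU(2)$ symmetries which correspond to the $SO(4)$ rotations of the four bosons $x^i$. The corresponding charges are given by

\[ \begin{aligned}
I_1^i &= \frac{1}{4} \oint \frac{dz}{2\pi i}\, X_A \sigma^i \partial X_A^\dagger - \frac{1}{4} \oint \frac{dz}{2\pi i}\, \partial X_A \sigma^i X_A^\dagger + \frac{1}{2} \oint \frac{dz}{2\pi i}\, \Phi_A \sigma^i \Phi_A^\dagger , \\
I_2^i &= \frac{1}{4} \oint \frac{dz}{2\pi i}\, \mathcal{X}_A \sigma^i \partial \mathcal{X}_A^\dagger - \frac{1}{4} \oint \frac{dz}{2\pi i}\, \partial \mathcal{X}_A \sigma^i \mathcal{X}_A^\dagger .
\end{aligned} \tag{5.6} \]

Here

\[ \mathcal{X}_A = (X^1_A, -X^{2\dagger}_A) , \quad \Phi_A = (\Psi^1_A, \Psi^{2\dagger}_A) , \quad \mathcal{X}_A^\dagger = \begin{pmatrix} X^{1\dagger}_A \\ -X^2_A \end{pmatrix} , \quad \Phi_A^\dagger = \begin{pmatrix} \Psi^{1\dagger}_A \\ \Psi^2_A \end{pmatrix} . \tag{5.7} \]

These charges are generators of the $SU(2) \times SU(2)$ algebra:

\[ [I_1^i, I_1^j] = i\epsilon^{ijk} I_1^k , \qquad [I_2^i, I_2^j] = i\epsilon^{ijk} I_2^k , \qquad [I_1^i, I_2^j] = 0 . \tag{5.8} \]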
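The commutation relations of these new global charges with the various local charges are given below:

\[ \begin{aligned}
[I_1^i, G^a(z)] &= 0 , & [I_1^i, G^{a\dagger}(z)] &= 0 , \\
[I_1^i, T(z)] &= 0 , & [I_1^i, J(z)] &= 0 , \\
[I_2^i, \mathcal{G}^a(z)] &= \frac{1}{2} \mathcal{G}^b(z) \sigma^i_{ba} , & [I_2^i, \mathcal{G}^{a\dagger}(z)] &= -\frac{1}{2} \sigma^i_{ab} \mathcal{G}^{b\dagger}(z) , \\
[I_2^i, T(z)] &= 0 , & [I_2^i, J(z)] &= 0 ,
\end{aligned} \tag{5.9} \]

where

\[ \mathcal{G} = (G^1, G^{2\dagger}) , \qquad \mathcal{G}^\dagger = \begin{pmatrix} G^{1\dagger} \\ G^2 \end{pmatrix} . \tag{5.10} \]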
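The following commutation relations show that the bosons transform as $(2, 2)$ under $SU(2)_{I_1} \times SU(2)_{I_2}$:

\[ \begin{aligned}
[I_1^i, X^a_A] &= \frac{1}{2} X^b_A \sigma^i_{ba} , & [I_1^i, X^{a\dagger}_A] &= -\frac{1}{2} \sigma^i_{ab} X^{b\dagger}_A , \\
[I_2^i, \mathcal{X}^a_A] &= \frac{1}{2} \mathcal{X}^b_A \sigma^i_{ba} , & [I_2^i, \mathcal{X}^{a\dagger}_A] &= -\frac{1}{2} \sigma^i_{ab} \mathcal{X}^{b\dagger}_A .
\end{aligned} \tag{5.11} \]

The fermions transform as $(2, 1)$ under $SU(2)_{I_1} \times SU(2)_{I_2}$, as can be seen from the commutation relations given below:

\[ \begin{aligned}
[I_1^i, \Phi^a_A] &= \frac{1}{2} \Phi^b_A \sigma^i_{ba} , & [I_1^i, \Phi^{a\dagger}_A] &= -\frac{1}{2} \sigma^i_{ab} \Phi^{b\dagger}_A , \\
[I_2^i, \Psi^a] &= 0 , & [I_2^i, \bar\Psi^a] &= 0 .
\end{aligned} \tag{5.12} \]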
We are interested in studying the states of the $N = (4, 4)$ SCFT on $M$. The classification of the states and their symmetry properties can be analyzed by studying the states of a free field realization of an $N = (4, 4)$ SCFT on $R^{4Q_1 Q_5}/S(Q_1 Q_5)$. This is realized by considering the holomorphic and the anti-holomorphic $N = 4$ superconformal algebra with $c = \bar{c} = 6Q_1 Q_5$ constructed out of $Q_1 Q_5$ copies of four real fermions and bosons. So we have an anti-holomorphic component for each field, generator and charge discussed above. These are labelled by the same symbols used for the holomorphic components but distinguished by a tilde.

The charges $I_1, I_2$ constructed above generate $SO(4)$ transformations only on the holomorphic bosons $X_A(z)$. Similarly, we can construct charges $\tilde{I}_1, \tilde{I}_2$ which generate $SO(4)$ transformations only on the anti-holomorphic bosons $\tilde{X}_A(\bar z)$. Normally one would expect these charges to give rise to a global $SO(4)_{hol} \times SO(4)_{antihol}$ symmetry. However, the kinetic term of the bosons in the free field realization is not invariant under independent holomorphic and anti-holomorphic $SO(4)$ rotations. It is easy to see, for example by using the Noether procedure, that there is a residual $SO(4)$ symmetry generated by the charges

\[ J_I = I_1 + \tilde{I}_1 , \qquad \tilde{J}_I = I_2 + \tilde{I}_2 . \tag{5.13} \]
We will denote this symmetry as $SO(4)_I = SU(2)_I \times \widetilde{SU(2)}_I$, where the $SU(2)$ factors are generated by $J_I$, $\tilde{J}_I$. These charges satisfy the property that (a) they correspond to $SO(4)$ transformations of the bosons $X_A(z, \bar z) = X_A(z) + \tilde{X}_A(\bar z)$ and (b) they fall into representations of the $N = (4, 4)$ algebra (as can be proved by using the commutation relations (5.11) of the $I$'s). The bosons $X(z, \bar z)$ transform as $(2, 2)$ under $SU(2)_I \times \widetilde{SU(2)}_I$.

5.4. The supergroup SU(1, 1|2)

The global part of the $N = 4$ superconformal algebra forms the supergroup $SU(1, 1|2)$. Let $L_{\pm, 0}$, $J_R^{(1),(2),(3)}$ be the global charges of the currents $T(z)$ and $J_R^{(i)}(z)$, and $G^a_{1/2, -1/2}$ the global charges of the supersymmetry currents $G^a(z)$ in the Neveu–Schwarz sector. From the OPEs (5.2) we obtain the following commutation relations for the global charges:

\[ \begin{aligned}
[L_0, L_\pm] &= \mp L_\pm , & [L_1, L_{-1}] &= 2L_0 , \\
\{G^a_{1/2}, G^{b\dagger}_{-1/2}\} &= 2\delta^{ab} L_0 + 2\sigma^i_{ab} J_R^{(i)} , & \{G^a_{-1/2}, G^{b\dagger}_{1/2}\} &= 2\delta^{ab} L_0 - 2\sigma^i_{ab} J_R^{(i)} , \\
[J_R^{(i)}, J_R^{(j)}] &= i\epsilon^{ijk} J_R^{(k)} , \\
[L_0, G^a_{\pm 1/2}] &= \mp \frac{1}{2} G^a_{\pm 1/2} , & [L_0, G^{a\dagger}_{\pm 1/2}] &= \mp \frac{1}{2} G^{a\dagger}_{\pm 1/2} , \\
[L_+, G^a_{1/2}] &= 0 , & [L_-, G^a_{-1/2}] &= 0 , \\
[L_+, G^a_{-1/2}] &= G^a_{1/2} , & [L_-, G^a_{1/2}] &= -G^a_{-1/2} , \\
[L_+, G^{a\dagger}_{1/2}] &= 0 , & [L_-, G^{a\dagger}_{-1/2}] &= 0 , \\
[L_+, G^{a\dagger}_{-1/2}] &= G^{a\dagger}_{1/2} , & [L_-, G^{a\dagger}_{1/2}] &= -G^{a\dagger}_{-1/2} , \\
[J_R^{(i)}, G^a_{\pm 1/2}] &= \frac{1}{2} G^b_{\pm 1/2} (\sigma^i)^{ba} , & [J_R^{(i)}, G^{a\dagger}_{\pm 1/2}] &= -\frac{1}{2} (\sigma^i)^{ba} G^{b\dagger}_{\pm 1/2} .
\end{aligned} \tag{5.14} \]

The above commutation relations form the algebra of the supergroup $SU(1, 1|2)$. The global part of the $N = (4, 4)$ superconformal algebra forms the supergroup $SU(1, 1|2) \times SU(1, 1|2)$.
5.5. Short multiplets of SU(1, 1|2)

The representations of the supergroup $SU(1, 1|2)$ are classified according to the conformal weight and $SU(2)_R$ quantum number. The highest weight states $|hw\rangle = |h, j_R, j_R^3 = j_R\rangle$ satisfy the following properties:

\[ \begin{aligned}
L_1 |hw\rangle &= 0 , & L_0 |hw\rangle &= h |hw\rangle , \\
J_R^{(+)} |hw\rangle &= 0 , & J_R^{(3)} |hw\rangle &= j_R |hw\rangle , \\
G^a_{1/2} |hw\rangle &= 0 , & G^{a\dagger}_{1/2} |hw\rangle &= 0 ,
\end{aligned} \tag{5.15} \]

where $J_R^{+} = J_R^{(1)} + iJ_R^{(2)}$. Highest weight states which satisfy $G^{2\dagger}_{-1/2} |hw\rangle = 0$, $G^1_{-1/2} |hw\rangle = 0$ are chiral primaries. They satisfy $h = j$. We will denote these states as $|hw\rangle_S$. Short multiplets are generated from the chiral primaries through the action of the raising operators $J_-$, $G^{1\dagger}_{-1/2}$ and $G^2_{-1/2}$. The structure of the short multiplet is given below:

\[ \begin{array}{llll}
\text{States} & j & L_0 & \text{Degeneracy} \\
|hw\rangle_S & h & h & 2h + 1 \\
G^{1\dagger}_{-1/2} |hw\rangle_S ,\; G^2_{-1/2} |hw\rangle_S & h - 1/2 & h + 1/2 & 2h + 2h = 4h \\
G^{1\dagger}_{-1/2} G^2_{-1/2} |hw\rangle_S & h - 1 & h + 1 & 2h - 1
\end{array} \tag{5.16} \]
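The content of table (5.16) can be tabulated for any $h$ with a short sketch. The helper name `short_multiplet` is hypothetical; the rows and degeneracies are taken directly from the table, and rows with zero degeneracy are dropped.

```python
from fractions import Fraction


def short_multiplet(h) -> list[tuple[str, Fraction, Fraction, Fraction]]:
    """Rows (label, j, L0, degeneracy) of the SU(1,1|2) short multiplet in table (5.16)."""
    h = Fraction(h)
    half = Fraction(1, 2)
    rows = [
        ("bottom: |hw>_S", h, h, 2 * h + 1),
        ("middle: one G acting on |hw>_S", h - half, h + half, 4 * h),
        ("top: two G's acting on |hw>_S", h - 1, h + 1, 2 * h - 1),
    ]
    # A row with zero degeneracy is absent from the multiplet.
    return [row for row in rows if row[3] > 0]


if __name__ == "__main__":
    for h in (Fraction(1, 2), 1, Fraction(3, 2)):
        rows = short_multiplet(h)
        print(h, [str(r[3]) for r in rows], "total:", sum(r[3] for r in rows))
```

For $h = 1/2$ the last row has degeneracy $2h - 1 = 0$ and drops out, so the multiplet ends at the middle row; for every $h$ the degeneracies sum to $8h$.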
The short multiplets of the supergroup $SU(1, 1|2) \times SU(1, 1|2)$ are obtained by the tensor product of the above multiplet. We denote the short multiplet of $SU(1, 1|2) \times SU(1, 1|2)$ as $(2h + 1, 2h + 1)_S$. These stand for the degeneracy of the bottom component, the top row in (5.16). The top component of the short multiplet consists of the states belonging to the last row in (5.16). The short multiplet $(2, 2)_S$ is special: it terminates at the middle row of (5.16). For this case, the top component is the middle row. These states have $h = \bar{h} = 1$ and transform as $(1, 1)$ of $SU(2)_R \times \widetilde{SU(2)}_R$. There are 4 such states for each $(2, 2)_S$.

5.6. The resolutions of the symmetric product

The Higgs branch of the D1–D5 system at low energies, apart from the SCFT on the free torus $T^4$, is an SCFT on a resolution of the orbifold $M$. So it is important for us to understand the operators corresponding to the moduli and the resolution of the orbifold $M$. To this end we construct all the marginal operators of the $N = (4, 4)$ SCFT on the symmetric product orbifold $M$. We will find the four operators which correspond to the resolution of the orbifold singularity.

5.6.1. The untwisted sector

Let us first focus on the operators constructed from the untwisted sector. The operators of lowest conformal weight are

\[ \begin{array}{ll}
\Psi^1_A(z) \tilde\Psi^1_A(\bar z) , & \Psi^1_A(z) \tilde\Psi^{2\dagger}_A(\bar z) , \\
\Psi^{2\dagger}_A(z) \tilde\Psi^1_A(\bar z) , & \Psi^{2\dagger}_A(z) \tilde\Psi^{2\dagger}_A(\bar z) ,
\end{array} \tag{5.17} \]
where summation over $A$ is implied. These four operators have conformal dimension $(h, \bar{h}) = (1/2, 1/2)$ and $(j_R^3, \tilde{j}_R^3) = (1/2, 1/2)$ under the R-symmetry $SU(2)_R \times \widetilde{SU(2)}_R$. Since $(h, \bar{h}) = (j_R^3, \tilde{j}_R^3)$, these operators are chiral primaries and have non-singular operator product expansions (OPEs) with the supersymmetry currents $G^1(z)$, $G^{2\dagger}(z)$, $\tilde{G}^1(\bar z)$, $\tilde{G}^{2\dagger}(\bar z)$. These properties indicate that they belong to the bottom component of the short multiplet $(2, 2)_S$ (see footnote 9). Each of the four chiral primaries gives rise to four top components of the short multiplet $(2, 2)_S$. They are given by the leading pole $((z - w)^{-1} (\bar z - \bar w)^{-1})$ in the OPEs

\[ \begin{array}{ll}
G^2(z) \tilde{G}^2(\bar z) P(w, \bar w) , & G^2(z) \tilde{G}^{1\dagger}(\bar z) P(w, \bar w) , \\
G^{1\dagger}(z) \tilde{G}^2(\bar z) P(w, \bar w) , & G^{1\dagger}(z) \tilde{G}^{1\dagger}(\bar z) P(w, \bar w) ,
\end{array} \tag{5.18} \]
where $P$ stands for any of the four chiral primaries in (5.17). From the superconformal algebra it is easily seen that the top components constructed above have weights $(1, 1)$ and transform as $(1, 1)$ under $SU(2)_R \times \widetilde{SU(2)}_R$. The OPEs (5.18) can be easily evaluated. We find that the 16 top components of the $4(2, 2)_S$ short multiplets are $\partial x^i_A \bar\partial x^j_A$.

We classify the above operators belonging to the top component according to representations of (a) the $SO(4)_I$ rotational symmetry of the $\tilde{T}^4$ (the four-torus $\tilde{T}^4$ breaks this symmetry but we assume the target space is $R^4$ for the classification of states), (b) the R-symmetry of the SCFT and (c) the conformal weights. As all of these operators belong to the top component of $(2, 2)_S$, the only property which distinguishes them is the representation under $SO(4)_I$. The quantum numbers of these operators under the various symmetries are

\[ \begin{array}{llll}
\text{Operator} & SU(2)_I \times \widetilde{SU(2)}_I & SU(2)_R \times \widetilde{SU(2)}_R & (h, \bar{h}) \\
\partial x^{\{i}_A(z) \bar\partial x^{j\}}_A(\bar z) - \frac{1}{4} \delta^{ij} \partial x^k_A(z) \bar\partial x^k_A(\bar z) & (3, 3) & (1, 1) & (1, 1) \\
\partial x^{[i}_A(z) \bar\partial x^{j]}_A(\bar z) & (3, 1) + (1, 3) & (1, 1) & (1, 1) \\
\partial x^i_A(z) \bar\partial x^i_A(\bar z) & (1, 1) & (1, 1) & (1, 1)
\end{array} \tag{5.19} \]

Footnote 9: We restrict our operators to be single trace operators. This excludes operators of the type $\sum_{A=1}^{Q_1 Q_5} \Psi^1_A(z) \sum_{B=1}^{Q_1 Q_5} \tilde\Psi^{2\dagger}_B(\bar z)$. Multi-trace operators in the AdS/CFT correspondence have been discussed in [123,124].
Therefore we have 16 marginal operators from the untwisted sector. As these are top components they can be added to the free SCFT as perturbations without violating the $N = (4, 4)$ supersymmetry.

5.6.2. Z2 twists

We now construct the marginal operators from the various twisted sectors of the orbifold SCFT. The twist fields of the SCFT on the orbifold $M$ are labeled by the conjugacy classes of the symmetric group $S(Q_1 Q_5)$ [125–127]. The conjugacy classes consist of cyclic groups of various lengths. The various conjugacy classes and the multiplicity in which they occur in $S(Q_1 Q_5)$ can be found from the solutions of the equation

\[ \sum_n n N_n = Q_1 Q_5 , \tag{5.20} \]

where $n$ is the length of the cycle and $N_n$ is the multiplicity of the cycle. Consider the simplest non-trivial conjugacy class, which is given by $N_1 = Q_1 Q_5 - 2$, $N_2 = 1$ and the rest of the $N_n = 0$. A representative element of this class is

\[ (X_1 \to X_2 ,\; X_2 \to X_1) , \quad X_3 \to X_3 , \; \ldots , \; X_{Q_1 Q_5} \to X_{Q_1 Q_5} . \tag{5.21} \]

Here the $X_A$'s are related to the $x_A$'s appearing in the action (5.1) by (5.5). To exhibit the singularity of this group action we go over to the following new coordinates:

\[ X_{cm} = X_1 + X_2 \quad \text{and} \quad \phi = X_1 - X_2 . \tag{5.22} \]

Under the group action (5.21) $X_{cm}$ is invariant and $\phi \to -\phi$. Thus the singularity is locally of the type $R^4/Z_2$. The bosonic twist operators for this orbifold singularity are given by the following OPEs [128]:

\[ \begin{aligned}
\partial\phi^1(z) \sigma^1(w, \bar w) &= \frac{\tau^1(w, \bar w)}{(z - w)^{1/2}} , & \partial\phi^{1\dagger}(z) \sigma^1(w, \bar w) &= \frac{\tau^1(w, \bar w)}{(z - w)^{1/2}} , \\
\partial\phi^2(z) \sigma^2(w, \bar w) &= \frac{\tau^2(w, \bar w)}{(z - w)^{1/2}} , & \partial\phi^{2\dagger}(z) \sigma^2(w, \bar w) &= \frac{\tau^2(w, \bar w)}{(z - w)^{1/2}} , \\
\bar\partial\tilde\phi^1(\bar z) \sigma^1(w, \bar w) &= \frac{\tilde\tau^1(w, \bar w)}{(\bar z - \bar w)^{1/2}} , & \bar\partial\tilde\phi^{1\dagger}(\bar z) \sigma^1(w, \bar w) &= \frac{\tilde\tau^1(w, \bar w)}{(\bar z - \bar w)^{1/2}} , \\
\bar\partial\tilde\phi^2(\bar z) \sigma^2(w, \bar w) &= \frac{\tilde\tau^2(w, \bar w)}{(\bar z - \bar w)^{1/2}} , & \bar\partial\tilde\phi^{2\dagger}(\bar z) \sigma^2(w, \bar w) &= \frac{\tilde\tau^2(w, \bar w)}{(\bar z - \bar w)^{1/2}} .
\end{aligned} \tag{5.23} \]
The $\tau$'s are excited twist operators. The fermionic twists are constructed from bosonized currents defined by

\[ \chi^1(z) = e^{iH^1(z)} , \quad \chi^{1\dagger}(z) = e^{-iH^1(z)} , \quad \chi^2(z) = e^{iH^2(z)} , \quad \chi^{2\dagger}(z) = e^{-iH^2(z)} , \tag{5.24} \]

where the $\chi$'s, defined as $\Psi_1 - \Psi_2$, are the superpartners of the bosons $\phi$. From the above we construct the supersymmetric twist fields which act both on fermions and bosons as follows:

\[ \begin{aligned}
\Sigma^{(1/2, 1/2)}_{(12)} &= \sigma^1(z, \bar z) \sigma^2(z, \bar z)\, e^{iH^1(z)/2} e^{-iH^2(z)/2} e^{i\tilde{H}^1(\bar z)/2} e^{-i\tilde{H}^2(\bar z)/2} , \\
\Sigma^{(1/2, -1/2)}_{(12)} &= \sigma^1(z, \bar z) \sigma^2(z, \bar z)\, e^{iH^1(z)/2} e^{-iH^2(z)/2} e^{-i\tilde{H}^1(\bar z)/2} e^{i\tilde{H}^2(\bar z)/2} , \\
\Sigma^{(-1/2, 1/2)}_{(12)} &= \sigma^1(z, \bar z) \sigma^2(z, \bar z)\, e^{-iH^1(z)/2} e^{+iH^2(z)/2} e^{i\tilde{H}^1(\bar z)/2} e^{-i\tilde{H}^2(\bar z)/2} , \\
\Sigma^{(-1/2, -1/2)}_{(12)} &= \sigma^1(z, \bar z) \sigma^2(z, \bar z)\, e^{-iH^1(z)/2} e^{+iH^2(z)/2} e^{-i\tilde{H}^1(\bar z)/2} e^{+i\tilde{H}^2(\bar z)/2} .
\end{aligned} \tag{5.25} \]
The subscript (12) refers to the fact that these twist operators were constructed for the representative group element (5.21), which exchanges the 1 and 2 labels of the coordinates of $\tilde T^4$. The superscript stands for the $(j^3_R, \tilde j^3_R)$ quantum numbers. The twist operator for the orbifold $\mathcal{M}$ belonging to the conjugacy class under consideration is obtained by summing over these $Z_2$ twist operators for all representative elements of this class:
\[ \Sigma^{(1/2,\,1/2)} = \sum_{i=1}^{Q_1 Q_5} \ \sum_{j=1,\, j \neq i}^{Q_1 Q_5} \Sigma^{(1/2,\,1/2)}_{(ij)} . \tag{5.26} \]
We can define the rest of the twist operators for the orbifold in a similar manner. The conformal dimensions of these operators are (1/2, 1/2). They transform as (2, 2) under the $SU(2)_R \times \widetilde{SU(2)}_R$ symmetry of the SCFT. They belong to the bottom component of the short multiplet $(2,2)_S$. The operator $\Sigma^{(1/2,\,1/2)}$ is a chiral primary. As before, the 4 top components of this short multiplet, which we denote by
\[ T^{(1/2,\,1/2)}, \qquad T^{(-1/2,\,1/2)}, \qquad T^{(1/2,\,-1/2)}, \qquad T^{(-1/2,\,-1/2)} \tag{5.27} \]
are given by the leading pole in the following OPEs, respectively:
\[
\begin{aligned}
& G^2(z)\,\tilde G^2(\bar z)\,\Sigma^{(1/2,\,1/2)}(w,\bar w), \qquad
G^{1\dagger}(z)\,\tilde G^2(\bar z)\,\Sigma^{(1/2,\,1/2)}(w,\bar w), \\
& G^2(z)\,\tilde G^{1\dagger}(\bar z)\,\Sigma^{(1/2,\,1/2)}(w,\bar w), \qquad
G^{1\dagger}(z)\,\tilde G^{1\dagger}(\bar z)\,\Sigma^{(1/2,\,1/2)}(w,\bar w) .
\end{aligned}
\tag{5.28}
\]
These are the 4 blow up modes of the $R^4/Z_2$ singularity [129] and they have conformal weight (1, 1).$^{10}$ They transform as (1, 1) under $SU(2)_R \times \widetilde{SU(2)}_R$. As before, since these are top components of the short multiplet $(2,2)_S$ they can be added to the free SCFT as perturbations without violating the N = (4,4) supersymmetry of the SCFT. The various quantum numbers of these operators are listed below.

Operator                                                  $(j^3, \tilde j^3)_I$    $SU(2)_R \times \widetilde{SU(2)}_R$    $(h, \bar h)$
$T^1_{(1)} = T^{(1/2,\,1/2)}$                              (0, 1)                   (1, 1)                                  (1, 1)
$T^1_{(0)} = T^{(1/2,\,-1/2)} + T^{(-1/2,\,1/2)}$          (0, 0)                   (1, 1)                                  (1, 1)
$T^1_{(-1)} = T^{(-1/2,\,-1/2)}$                           (0, -1)                  (1, 1)                                  (1, 1)
$T^0 = T^{(1/2,\,-1/2)} - T^{(-1/2,\,1/2)}$                (0, 0)                   (1, 1)                                  (1, 1)
(5.29)

10 Relevance of $Z_2$ twist operators to the marginal deformations of the SCFT has earlier been discussed in [130,131].
The first three operators of the above table can be organized as a (1, 3) under $SU(2)_I \times \widetilde{SU(2)}_I$. We will denote these 3 operators as $T^1$. The last operator transforms as a scalar (1, 1) under $SU(2)_I \times \widetilde{SU(2)}_I$ and is denoted by $T^0$. The simplest way of figuring out the $(j^3, \tilde j^3)_I$ quantum numbers in the above table is to note that (a) the $\Sigma$-operators of (5.25) are singlets under $SU(2)_I \times \widetilde{SU(2)}_I$, as can be verified by computing the action on them of the operators $I_1, I_2$ and $\tilde I_1, \tilde I_2$, (b) the $T$-operators are obtained from the $\Sigma$'s by the action of the supersymmetry currents as in (5.28), and (c) the quantum numbers of the supersymmetry currents under $I_1, I_2$ and $\tilde I_1, \tilde I_2$ are given by (5.9).

5.6.3. Higher twists

We now show that the twist operators corresponding to any other conjugacy class of $S(Q_1 Q_5)$ are irrelevant. Consider the class with $N_1 = Q_1 Q_5 - 3$, $N_3 = 1$ and the rest of $N_n = 0$. A representative element of this class is
\[ (X_1 \to X_2,\ X_2 \to X_3,\ X_3 \to X_1),\quad X_4 \to X_4,\ \ldots,\ X_{Q_1 Q_5} \to X_{Q_1 Q_5} . \tag{5.30} \]
To make the action of this group element transparent we diagonalize the group action as follows:
\[
\begin{pmatrix} \phi_1 \\ \phi_2 \\ \phi_3 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega^4 \end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix},
\tag{5.31}
\]
where $\omega = \exp(2\pi i/3)$. These new coordinates are identified under the group action (5.30) as $\phi_1 \to \phi_1$, $\phi_2 \to \omega^2 \phi_2$ and $\phi_3 \to \omega \phi_3$. These identifications are locally characteristic of the orbifold
\[ R^4 \times R^4/\omega \times R^4/\omega^2 . \tag{5.32} \]
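The phases quoted below (5.31) can be checked numerically. The following is a minimal sketch, not part of the original construction: the function name `cycle_phases` is hypothetical, and the combinations are taken to be $\phi_{r+1} = \sum_j \omega^{rj} X_{j+1}$ as read off from the rows of (5.31). It applies the cyclic relabelling $X_1 \to X_2 \to \cdots \to X_k \to X_1$ to the coefficient vectors and reads off the phase each combination picks up.

```python
import cmath


def cycle_phases(k):
    """Phases picked up by phi_{r+1} = sum_j omega^{r j} X_{j+1}, r = 0..k-1,
    under the cyclic relabelling X_1 -> X_2 -> ... -> X_k -> X_1."""
    omega = cmath.exp(2j * cmath.pi / k)
    phases = []
    for r in range(k):
        # Coefficients of phi_{r+1} in the basis (X_1, ..., X_k).
        coeffs = [omega ** (r * j) for j in range(k)]
        # After the relabelling, the coefficient of X_{j+1} is the old
        # coefficient of X_j.
        shifted = [coeffs[(j - 1) % k] for j in range(k)]
        # phi_{r+1} is an eigenvector, so shifted = phase * coeffs.
        phase = shifted[0] / coeffs[0]
        assert all(abs(s - phase * c) < 1e-12 for s, c in zip(shifted, coeffs))
        phases.append(phase)
    return phases


if __name__ == "__main__":
    for r, phase in enumerate(cycle_phases(3), start=1):
        print(f"phi_{r} -> ({phase:.3f}) * phi_{r}")
```

For $k = 3$ the three phases should agree with $1$, $\omega^2$ and $\omega$, matching the identifications stated after (5.31).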
The dimension of the supersymmetric twist operator which twists the coordinates by a phase $e^{2\pi i k/N}$ in 2 complex dimensions is $h(k,N) = k/N$ [128]. The twist operator which implements the action of the group element (5.30) combines the supersymmetric twist operators acting on $\phi_2$ and $\phi_3$ and therefore has total dimension
\[ h = h(1,3) + h(2,3) = 1/3 + 2/3 = 1 . \tag{5.33} \]
It is the superpartners of these which could be candidates for the blow up modes. However, these have weight 3/2. These operators are therefore irrelevant.
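The value $h(k,N) = k/N$ can be cross-checked with a short calculation. The sketch below rests on two standard free-field assumptions that are not spelled out in the text: a twist field for one complex boson twisted by $e^{2\pi i \eta}$ has dimension $\eta(1-\eta)/2$, and the fermionic twist $e^{\pm i \eta H}$ has dimension $\eta^2/2$. The function names are hypothetical. With two complex bosons and two complex fermions the sum is $\eta$, and adding the contributions of the non-trivially twisted coordinates of a cycle of length $k$ gives $(k-1)/2$.

```python
from fractions import Fraction


def bosonic_twist_dim(eta):
    """Assumed dimension of the twist field of one complex boson
    twisted by exp(2 pi i eta), with 0 < eta < 1."""
    return eta * (1 - eta) / 2


def fermionic_twist_dim(eta):
    """Assumed dimension of exp(+/- i eta H) for a canonically
    normalised chiral boson H."""
    return eta * eta / 2


def susy_twist_dim(k, n):
    """h(k, n): two complex bosons plus their two complex fermion partners."""
    eta = Fraction(k, n)
    return 2 * bosonic_twist_dim(eta) + 2 * fermionic_twist_dim(eta)


def cycle_twist_dim(k):
    """Total dimension of the twist operator of a single cycle of length k."""
    return sum(susy_twist_dim(i, k) for i in range(1, k))


if __name__ == "__main__":
    print(susy_twist_dim(1, 2))   # Z_2 twist
    print(cycle_twist_dim(3))     # 3-cycle, compare (5.33)
```

Exact rational arithmetic is used so that the comparison with $k/N$ and $(k-1)/2$ involves no rounding.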
For the class $N_1 = Q_1 Q_5 - k$, $N_k = 1$ and the rest of $N_n = 0$, the total dimension of the twist operator is
\[ h = \sum_{i=1}^{k-1} h(i,k) = (k-1)/2 . \tag{5.34} \]
Its superpartner has dimension $k/2$. Now it is easy to see that all conjugacy classes other than the exchange of 2 elements give rise to irrelevant twist operators. Thus the orbifold $\mathcal{M}$ is resolved by the 4 blow up modes corresponding to the conjugacy class represented by (5.21). We have thus identified the 20 marginal operators of the N = (4,4) SCFT on $\tilde T^4$. They are all top components of the $5(2,2)_S$ short multiplets. The $5(2,2)_S$ have 20 operators of conformal dimensions $(h, \bar h) = (1/2, 1/2)$. These are relevant operators for the SCFT. It would be interesting to investigate the role of these relevant operators. As they are chiral primaries they would break only half of the supersymmetries of the SCFT and therefore the renormalization group flow induced by these operators would presumably be tractable for study.

5.7. The chiral primaries of the N = (4,4) SCFT on $\mathcal{M}$

In this section we will explicitly construct all the chiral primaries corresponding to single particle states of the SCFT on the orbifold $\mathcal{M}$. For this purpose we will have to construct the twist operator corresponding to the conjugacy class $N_1 = Q_1 Q_5 - k$, $N_k = 1$ and the rest of $N_n = 0$.

5.7.1. The k-cycle twist operator

We will extend the method of construction of the 2-cycle twist operator of Section 5.6.2 to the construction of the k-cycle twist operator. Consider the conjugacy class given by $N_1 = Q_1 Q_5 - k$, $N_k = 1$ and the rest of $N_n = 0$. A representative element of this class is the following group action:
\[ (X_1 \to X_2,\ \ldots,\ X_k \to X_1),\quad X_{k+1} \to X_{k+1},\ \ldots,\ X_{Q_1 Q_5} \to X_{Q_1 Q_5} . \tag{5.35} \]
We can diagonalize the group action as follows:
\[
\begin{pmatrix} \phi_k \\ \phi_{k-1} \\ \phi_{k-2} \\ \vdots \\ \phi_1 \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \omega & \omega^2 & \cdots & \omega^{k-1} \\
1 & \omega^2 & \omega^4 & \cdots & \omega^{2(k-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \omega^{k-1} & \omega^{(k-1)2} & \cdots & \omega^{(k-1)(k-1)}
\end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \\ X_3 \\ \vdots \\ X_k \end{pmatrix},
\tag{5.36}
\]
where $\omega = e^{2\pi i/k}$. These new coordinates are identified under the group action (5.35) as
\[ \phi_1 \to \omega \phi_1, \quad \phi_2 \to \omega^2 \phi_2, \quad \phi_3 \to \omega^3 \phi_3, \ \ldots,\ \phi_{k-1} \to \omega^{k-1} \phi_{k-1}, \quad \phi_k \to \omega^k \phi_k . \tag{5.37} \]
These identifications are locally characteristic of the orbifold
\[ R^4 \times R^4/\omega \times R^4/\omega^2 \times \cdots \times R^4/\omega^{k-1} . \tag{5.38} \]
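The coordinate $\phi_m$ is twisted by the phase $\omega^m$ ($m$ runs from 1 to $k$). The bosonic twist operators corresponding to this twist are defined by the following OPEs:
\[
\begin{aligned}
\partial\phi^1_m(z)\,\sigma^1_m(w,\bar w) &= \frac{\tau^1_m(w,\bar w)}{(z-w)^{1-m/k}}, &\qquad
\partial\phi^{1\dagger}_m(z)\,\sigma^1_m(w,\bar w) &= \frac{\tau^1_m(w,\bar w)}{(z-w)^{m/k}}, \\
\partial\phi^2_m(z)\,\sigma^2_m(w,\bar w) &= \frac{\tau^2_m(w,\bar w)}{(z-w)^{1-m/k}}, &\qquad
\partial\phi^{2\dagger}_m(z)\,\sigma^2_m(w,\bar w) &= \frac{\tau^2_m(w,\bar w)}{(z-w)^{m/k}}, \\
\bar\partial\tilde\phi^1_m(\bar z)\,\sigma^1_m(w,\bar w) &= \frac{\tilde\tau^1_m(w,\bar w)}{(\bar z-\bar w)^{m/k}}, &\qquad
\bar\partial\tilde\phi^{1\dagger}_m(\bar z)\,\sigma^1_m(w,\bar w) &= \frac{\tilde\tau^1_m(w,\bar w)}{(\bar z-\bar w)^{1-m/k}}, \\
\bar\partial\tilde\phi^2_m(\bar z)\,\sigma^2_m(w,\bar w) &= \frac{\tilde\tau^2_m(w,\bar w)}{(\bar z-\bar w)^{m/k}}, &\qquad
\bar\partial\tilde\phi^{2\dagger}_m(\bar z)\,\sigma^2_m(w,\bar w) &= \frac{\tilde\tau^2_m(w,\bar w)}{(\bar z-\bar w)^{1-m/k}} .
\end{aligned}
\tag{5.39}
\]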
As in Section 5.6.2, the $\tau$'s are excited twist operators. The fermionic twists are constructed from bosonized currents defined by
\[ \chi^1_m(z) = e^{iH^1_m(z)}, \qquad \chi^{1\dagger}_m(z) = e^{-iH^1_m(z)}, \qquad \chi^2_m(z) = e^{iH^2_m(z)}, \qquad \chi^{2\dagger}_m(z) = e^{-iH^2_m(z)}, \tag{5.40} \]
where the $\chi_m$'s are the superpartners of the bosons $\phi_m$. The twist operators corresponding to the fermions $\chi_m$ are given by $e^{\pm i m H_m/k}$. We now assemble all these operators to construct the k-cycle twist operator, which is a chiral primary. The k-cycle twist operator is given by
\[
\Sigma^{(k-1)/2}_{(12 \ldots k)} = \prod_{m=1}^{k-1} \left[ \sigma^1_m(z,\bar z)\,\sigma^2_m(z,\bar z)\, e^{i m H^1_m(z)/k}\, e^{-i m H^2_m(z)/k}\, e^{i m \tilde H^1_m(\bar z)/k}\, e^{-i m \tilde H^2_m(\bar z)/k} \right] .
\tag{5.41}
\]
The subscript $(12 \ldots k)$ refers to the fact that these twist operators were constructed for the representative group element (5.35), which cyclically permutes the $1, \ldots, k$ labels of the coordinates of $\tilde T^4$. The superscript $(k-1)/2$ stands for the conformal dimension of this operator. As we saw in Section 5.6.3, the conformal dimension of the twist operator for the conjugacy class $N_1 = Q_1 Q_5 - k$, $N_k = 1$ and the rest of $N_n = 0$ is $(h, \bar h) = ((k-1)/2, (k-1)/2)$. The twist operator for the conjugacy class under consideration is obtained by summing over the k-cycle twist operators for all representative elements of this class:
\[ \Sigma^{(k-1)/2}(z,\bar z) = \sum_{\{i_1, \ldots, i_k\}} \Sigma_{i_1 i_2 \ldots i_k}(z,\bar z) , \tag{5.42} \]
where the sum runs over all k-tuples $\{i_1, \ldots, i_k\}$ such that $i_1 \neq i_2 \neq \cdots \neq i_k$. The $i_m$ take values from 1 to $Q_1 Q_5$. The operator $\Sigma^{(k-1)/2}$ is a chiral primary with conformal dimension $(h, \bar h) = ((k-1)/2, (k-1)/2)$ and $(j^3_R, \tilde j^3_R) = ((k-1)/2, (k-1)/2)$. As the largest cycle is of length $Q_1 Q_5$, the maximal dimension of the k-cycle twist operator is $((Q_1 Q_5 - 1)/2, (Q_1 Q_5 - 1)/2)$. It belongs to the bottom component of the short multiplet $(k,k)_S$. The other components of the short multiplet $(k,k)_S$ corresponding to the k-cycle twists can be generated by the action of supersymmetry currents and the R-symmetry currents of the N = (4,4) theory on $\mathcal{M}$.
5.7.2. The complete set of chiral primaries

We have seen in Section 5.6 that there are five chiral primaries corresponding to the short multiplet $5(2,2)_S$. In this section we will construct the complete set of chiral primaries from single particle states of the SCFT on $\mathcal{M}$. It is known that the chiral primaries with weight $(h, \bar h)$ of an N = (4,4) superconformal field theory on a manifold $K$ correspond to the elements of the cohomology $H_{2h\,2\bar h}(K)$ [132]. The chiral primaries are formed by the product of the chiral primaries corresponding to the cohomology of the diagonal $\tilde T^4$, denoted by $B^4$ (the sum of all copies of $\tilde T^4$), and the various k-cycle chiral primaries constructed in Section 5.7.1. We will list the chiral primaries below.

Chiral primaries with $h - \bar h = 0$: All the k-cycle chiral primaries have $h - \bar h = 0$. To construct chiral primaries with $h - \bar h = 0$ we need the four chiral primaries which correspond to the cohomology $H_{11}(B^4)$ with weight (1/2, 1/2). They are given in (5.17). Using this we can construct the following chiral primaries:
\[
\begin{aligned}
& \Sigma^{(k-1)/2}(z,\bar z)\,\Psi^1_A(z)\,\tilde\Psi^1_A(\bar z), \qquad
\Sigma^{(k-1)/2}(z,\bar z)\,\Psi^1_A(z)\,\tilde\Psi^{2\dagger}_A(\bar z), \\
& \Sigma^{(k-1)/2}(z,\bar z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^1_A(\bar z), \qquad
\Sigma^{(k-1)/2}(z,\bar z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^{2\dagger}_A(\bar z),
\end{aligned}
\tag{5.43}
\]
where summation over $A$ is implied. These four operators have conformal dimension $(k/2, k/2)$. There is one more chiral primary corresponding to the cohomology $H_{22}(B^4)$ for which $h - \bar h = 0$. It is given by
\[ \Psi^1_A(z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^1_A(\bar z)\,\tilde\Psi^{2\dagger}_A(\bar z) , \tag{5.44} \]
where summation over all indices of $A$ is implied. This chiral primary corresponds to the top form of $B^4$. The cohomology $H_{00}(B^4)$ gives rise to chiral primaries of conformal dimension $(k/2, k/2)$. It is given by
\[ \Sigma^{(k-2)/2}(z,\bar z)\,\Psi^1_A(z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^1_A(\bar z)\,\tilde\Psi^{2\dagger}_A(\bar z) . \tag{5.45} \]
From the equation above we see that these chiral primaries exist only for $k \geq 2$. Finally, we have the chiral primary $\Sigma^{k/2}(z,\bar z)$ of conformal dimension $(k/2, k/2)$. Thus for $k \geq 2$ and $k \leq Q_1 Q_5 - 1$ there are 6 chiral primaries of dimension $(k/2, k/2)$. The complete list of chiral primaries with weight $(h, \bar h)$ and $h - \bar h = 0$ corresponding to single particle states is given by

$(h, \bar h)$                                        Degeneracy
(1/2, 1/2)                                           5
(1, 1)                                               6
(3/2, 3/2)                                           6
...                                                  ...
$((Q_1 Q_5 - 1)/2,\ (Q_1 Q_5 - 1)/2)$                6
$(Q_1 Q_5/2,\ Q_1 Q_5/2)$                            5
$((Q_1 Q_5 + 1)/2,\ (Q_1 Q_5 + 1)/2)$                1
(5.46)

In the above table we have ignored the vacuum with weight $(h, \bar h) = (0, 0)$.
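The degeneracies in table (5.46) follow from simple bookkeeping, sketched below. The function name `degeneracy` and the parameter `n` (standing for $Q_1 Q_5$) are hypothetical, and the sketch assumes that the twist operator $\Sigma^{(l-1)/2}$ exists exactly for cycle lengths $1 \leq l \leq Q_1 Q_5$, with $l = 1$ taken to be the identity.

```python
def degeneracy(k, n):
    """Number of single-particle chiral primaries of weight (k/2, k/2)
    for n = Q1*Q5 and k >= 1, counted as in (5.43)-(5.45)."""

    def twist_exists(length):
        # Sigma^{(length-1)/2} comes from a cycle of this length;
        # length 1 is treated as the identity operator (assumption).
        return 1 <= length <= n

    count = 0
    if twist_exists(k):        # Sigma^{(k-1)/2} times the four H_11 primaries
        count += 4
    if twist_exists(k - 1):    # Sigma^{(k-2)/2} times the top form of B^4
        count += 1
    if twist_exists(k + 1):    # Sigma^{k/2} on its own
        count += 1
    return count


if __name__ == "__main__":
    n = 6  # illustrative value of Q1*Q5
    for k in range(1, n + 3):
        print(f"(h, hbar) = ({k}/2, {k}/2): {degeneracy(k, n)}")
```

The first and last two rows of the table arise because one or two of the three families are absent at the ends of the allowed range of cycle lengths.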
Chiral primaries with $h - \bar h = 1/2$: The chiral primaries of $B^4$ which correspond to the elements of the cohomology $H_{10}(B^4)$ are given by
\[ \sum_{A=1}^{Q_1 Q_5} \Psi^1_A(z) \quad \text{and} \quad \sum_{A=1}^{Q_1 Q_5} \Psi^{2\dagger}_A(z) . \tag{5.47} \]
We can construct chiral primaries with weight $((k+1)/2, k/2)$ by taking the product of the above chiral primaries with the twist operator $\Sigma^{k/2}(z,\bar z)$. These give the following chiral primaries:
\[ \Sigma^{k/2}(z,\bar z) \sum_{A=1}^{Q_1 Q_5} \Psi^1_A(z) \quad \text{and} \quad \Sigma^{k/2}(z,\bar z) \sum_{A=1}^{Q_1 Q_5} \Psi^{2\dagger}_A(z) . \tag{5.48} \]
The chiral primaries of the diagonal $B^4$ which correspond to the elements of the cohomology $H_{21}(B^4)$ are
\[ \Psi^1_A(z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^1_A(\bar z) \quad \text{and} \quad \Psi^1_A(z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^{2\dagger}_A(\bar z) . \tag{5.49} \]
Here summation over all the three indices of $A$ is implied. From these one can construct chiral primaries with weight $((k+1)/2, k/2)$ as follows:
\[ \Sigma^{(k-1)/2}(z,\bar z)\,\Psi^1_A(z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^1_A(\bar z) \quad \text{and} \quad \Sigma^{(k-1)/2}(z,\bar z)\,\Psi^1_A(z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^{2\dagger}_A(\bar z) . \tag{5.50} \]
Therefore there are 4 chiral primaries with weight $((k+1)/2, k/2)$ for $1 \leq k \leq (Q_1 Q_5 - 1)$ and 2 chiral primaries with weight $((Q_1 Q_5 + 1)/2, Q_1 Q_5/2)$. There are also 2 chiral primaries with weight (1/2, 0).

Chiral primaries with $\bar h - h = 1/2$: The procedure for constructing these chiral primaries is identical to the case $h - \bar h = 1/2$. The four chiral primaries with weight $(k/2, (k+1)/2)$ are given by
\[
\begin{aligned}
& \Sigma^{k/2}(z,\bar z) \sum_{A=1}^{Q_1 Q_5} \tilde\Psi^1_A(\bar z), \qquad
\Sigma^{k/2}(z,\bar z) \sum_{A=1}^{Q_1 Q_5} \tilde\Psi^{2\dagger}_A(\bar z), \\
& \Sigma^{(k-1)/2}(z,\bar z)\,\Psi^1_A(z)\,\tilde\Psi^1_A(\bar z)\,\tilde\Psi^{2\dagger}_A(\bar z), \qquad
\Sigma^{(k-1)/2}(z,\bar z)\,\Psi^{2\dagger}_A(z)\,\tilde\Psi^1_A(\bar z)\,\tilde\Psi^{2\dagger}_A(\bar z) .
\end{aligned}
\tag{5.51}
\]
There are 4 chiral primaries with weight $(k/2, (k+1)/2)$ for $1 \leq k \leq (Q_1 Q_5 - 1)$, 2 chiral primaries with weight $(Q_1 Q_5/2, (Q_1 Q_5 + 1)/2)$ and 2 chiral primaries with weight (0, 1/2).

Chiral primaries with $h - \bar h = 1$: As in the previous cases, let us first look at the chiral primaries corresponding to the cohomology element $H_{20}(B^4)$. There is only one element, which is given by
\[ \Psi^1_A(z)\,\Psi^{2\dagger}_A(z) , \tag{5.52} \]
where summation over $A$ is implied. The single chiral primary with weight $((k+2)/2, k/2)$ constructed out of the above chiral primary is
\[ \Sigma^{k/2}(z,\bar z)\,\Psi^1_A(z)\,\Psi^{2\dagger}_A(z) . \tag{5.53} \]
Thus there is one chiral primary with weight $((k+2)/2, k/2)$ for $0 \leq k \leq (Q_1 Q_5 - 1)$. The operator product expansion of two chiral primaries will give rise to other chiral primaries consistent with conservation laws. These are known to form a ring. It will be interesting to understand the structure of this ring.

Chiral primaries with $\bar h - h = 1$: The construction of these is parallel to the case $h - \bar h = 1$. The single chiral primary with weight $(k/2, (k+2)/2)$ for $0 \leq k \leq (Q_1 Q_5 - 1)$ is given by
\[ \Sigma^{k/2}(z,\bar z)\,\tilde\Psi^1_A(\bar z)\,\tilde\Psi^{2\dagger}_A(\bar z) . \tag{5.54} \]
There are no chiral primaries with $h - \bar h > 1$ or $\bar h - h > 1$. From the construction of the chiral primaries we see that such chiral primaries can exist only if there is an element in $H_{r0}(B^4)$ or $H_{0r}(B^4)$ with $r > 2$. As the cohomology groups of $B^4$ are identical to those of a four-torus, we know that such elements do not exist.

5.8. Short multiplets of N = (4,4) SCFT on $\mathcal{M}$

Using the results of Section 5.7 we will write the complete set of short multiplets of single particle states of the N = (4,4) SCFT on $\mathcal{M}$. In Section 6 we will compare this set of short multiplets with that obtained from supergravity. We will see in Section 6.1 that supergravity is a good approximation in string theory only when $Q_1 \to \infty$, $Q_5 \to \infty$. Therefore we write down the list of short multiplets for $(\tilde T^4)^{(\infty)}/S(\infty)$. Basically this means that the list of chiral primaries of the previous Section 5.7 does not terminate. We have seen that each chiral primary of weight $(h, h')$ gives rise to the short multiplet $(2h+1, 2h'+1)_S$. Therefore the results of Section 5.7 indicate that the list of short multiplets corresponding to the single particle states of N = (4,4) SCFT on $(\tilde T^4)^{(\infty)}/S(\infty)$ is given by
\[
\begin{aligned}
& 5(2,2)_S + 6 \bigoplus_{m \geq 3} (m,m)_S \\
& + 2(1,2)_S + 2(2,1)_S + (1,3)_S + (3,1)_S \\
& + \bigoplus_{m \geq 2} \left[ (m,m+2)_S + (m+2,m)_S + 4(m,m+1)_S + 4(m+1,m)_S \right] .
\end{aligned}
\tag{5.55}
\]
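As a cross-check of (5.55), the sketch below converts the chiral primary counts of Section 5.7, taken in the limit where the list does not terminate, into short multiplets using the rule that weight $(h, h')$ gives $(2h+1, 2h'+1)_S$. The function names and the cutoff `k_max` are hypothetical and only serve to truncate the infinite towers for illustration.

```python
from collections import Counter
from fractions import Fraction


def short_multiplet(h, hbar):
    """Label (2h+1, 2hbar+1) of the short multiplet built on weight (h, hbar)."""
    return (int(2 * h + 1), int(2 * hbar + 1))


def scft_short_multiplets(k_max):
    """Multiplicities of short multiplets from chiral primaries with k <= k_max,
    assuming Q1*Q5 -> infinity so that no tower terminates."""
    half = Fraction(1, 2)
    mult = Counter()
    for k in range(0, k_max + 1):
        w = k * half
        if k >= 1:
            # h = hbar = k/2: 5 primaries at k = 1, 6 for k >= 2 (table 5.46)
            mult[short_multiplet(w, w)] += 5 if k == 1 else 6
        # |h - hbar| = 1/2: 2 primaries at k = 0, 4 for k >= 1
        n_half = 2 if k == 0 else 4
        mult[short_multiplet(w + half, w)] += n_half
        mult[short_multiplet(w, w + half)] += n_half
        # |h - hbar| = 1: one primary for every k >= 0
        mult[short_multiplet(w + 1, w)] += 1
        mult[short_multiplet(w, w + 1)] += 1
    return mult


if __name__ == "__main__":
    for label, count in sorted(scft_short_multiplets(3).items()):
        print(f"{count} x {label}_S")
```

The $k = 0$ entries reproduce the isolated terms $2(1,2)_S + 2(2,1)_S + (1,3)_S + (3,1)_S$, and $k \geq 1$ reproduces the towers.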
In our discussion so far, we have ignored the short multiplets from the free torus $T^4$ which forms a part of the Higgs branch of the D1–D5 system. We will see in Section 6 that the short multiplets from the free torus are not present in the supergravity. Thus for comparison with supergravity it is sufficient for us to restrict our attention to the short multiplets on $\mathcal{M}$.

5.9. Stringy exclusion principle

We see from the preceding discussion that the spin of short multiplets in the SCFT is bounded by $(1 + Q_1 Q_5)/2$. In the context of the AdS$_3$/CFT$_2$ correspondence (see Section 6) this is puzzling at first, since there is no corresponding bound on spin from supergravity. However, since supergravity is only valid at $Q_1, Q_5 \to \infty$ these two facts are reconciled. The existence of a maximum spin for finite $Q_1, Q_5$ has been called the "stringy exclusion principle" [90]. Clearly this bound cannot be understood in supergravity and should be understood in terms of an exact treatment of strings in AdS$_3$.
6. Near-horizon supergravity and SCFT

In this section we will classify the supergravity fields according to the symmetries of the near-horizon geometry of the D1–D5 system, which was derived in Section 2.6, and compare them with the chiral multiplets of the SCFT on $\mathcal{M}$. Let us examine the symmetries of the near-horizon geometry (2.70). The bosonic symmetries arise from the isometries of AdS$_3$ and $S^3$. The isometries of the AdS$_3$ space form the non-compact group $SO(2,2)$, while the isometries of $S^3$ form the group $SO(4)_E = SU(2)_E \times \widetilde{SU(2)}_E$. Though the compactification on $T^4$ breaks the $SO(4)$ rotations of the coordinates $x_6, \ldots, x_9$, we can still use this symmetry to classify supergravity fields. We will call this symmetry $SO(4)_I$. The D1–D5 system preserves eight out of the 32 supersymmetries of the type IIB theory. In the near-horizon limit the number of supersymmetries gets enhanced from eight to sixteen [133,134]. These symmetries fix the form of the effective anti-de Sitter supergravity theory near the horizon.

The bosonic symmetries $SO(2,2) \times SO(4)_E = (SL(2,R) \times SU(2)) \times (SL(2,R) \times SU(2))$ form the bosonic symmetries of the anti-de Sitter supergravity in three dimensions. Simple anti-de Sitter supergroups in three dimensions were classified in [135]. It can be seen that the only simple supergroups whose bosonic part is $SL(2,R) \times SU(2)$ are $Osp(3|2,R)$ and $SU(1,1|2)$. The former contains the bosonic subgroup $O(3) \times SL(2,R)$. The supercharges of the supergroup $Osp(3|2,R)$ transform as the vector representation of the group $O(3)$, while the supercharges of the supergroup $SU(1,1|2)$ transform as 2 of the group $SU(2)$. The unbroken supercharges of the D1–D5 system transform in the spinor representation of $SO(4)_E$ and therefore they transform as 2 of $SU(2)$. This rules out $Osp(3|2,R)$. Therefore the near-horizon anti-de Sitter supergravity is based on the supergroup $SU(1,1|2) \times SU(1,1|2)$ with matter fields.$^{11}$

6.1. Classi
and
\[ g_s Q_5 \gg 1 . \tag{6.1} \]
These inequalities imply that we are working in the regime where closed string perturbation theory is valid and where all length scales are greater than the string length. Therefore we are justified in using supergravity. Kaluza–Klein reduction of type IIB supergravity to six dimensions leads to six-dimensional (2,2) supergravity. We show that the Kaluza–Klein spectrum of the six-dimensional theory on AdS$_3 \times S^3$ can be completely organized as short multiplets of the supergroup $SU(1,1|2) \times SU(1,1|2)$. We will follow the method developed by [43]. The massless spectrum of (2,2) six-dimensional supergravity consists of: a graviton, 8 gravitinos, 5 two-forms, 16 gauge fields, 40 fermions and 25 scalars. Since these are massless, the physical degrees of freedom fall into various representations $R_4$ of the little group $SO(4)_L$ of $R^{(5,1)}$. For example, the graviton transforms as a (3, 3) under the little group $SO(4)_L = SU(2)_L \times \widetilde{SU(2)}_L$.

11 The pure anti-de Sitter supergravity based on the supergroup $SU(1,1|2) \times SU(1,1|2)$ was constructed in [136] using the fact that it is a Chern–Simons theory.
On further compactifying $R^{(5,1)}$ into AdS$_3 \times S^3$, each representation $R_4$ decomposes into various representations $R_3$ of $SO(3)$, the local Lorentz group of the $S^3$. This $SO(3) \simeq SU(2)$ is the diagonal $SU(2)$ of $SU(2)_L \times \widetilde{SU(2)}_L$. For example, the graviton decomposes as 1 + 3 + 5 under the $SO(3)$, the local Lorentz group of $S^3$. The dependence of each of these fields on the angles of $S^3$ leads to a decomposition in terms of Kaluza–Klein modes on the $S^3$ which transform according to some representation of the isometry group $SO(4)$ of $S^3$. Only those representations of $SO(4)$ occur in these decompositions which contain the representation $R_3$ of $SO(3)$. To be more explicit, consider the field $\phi_{R_{SO(3)}}(x_0, x_5, r, \theta, \phi, \chi)$ which transforms as some representation $R_{SO(3)}$ of the local Lorentz group of $S^3$. The Kaluza–Klein expansion of this field on $S^3$ is given by
\[ \phi_{R_{SO(3)}}(x_0, x_5, r, \theta, \phi, \chi) = \sum_{R_{SO(4)}} \tilde\phi_{R_{SO(4)}}(x_0, x_5, r)\, Y^{R_{SO(4)}}_{R_{SO(3)}}(\theta, \phi, \chi) . \tag{6.2} \]
Here $Y^{R_{SO(4)}}_{R_{SO(3)}}(\theta, \phi, \chi)$ stands for the spherical harmonics on $S^3$. In the above expansion the only representations $R_{SO(4)}$ allowed are the ones which contain $R_{SO(3)}$. For example, $\phi(x_0, x_5, r, \theta, \phi, \chi)$, which is a scalar under the local Lorentz group of $S^3$, can be expanded as
\[ \phi(x_0, x_5, r, \theta, \phi, \chi) = \sum_{m,\, m';\ m = m'} \tilde\phi_{m m'}(x_0, x_5, r)\, Y^{(m, m')}(\theta, \phi, \chi) . \tag{6.3} \]
Once the complete set of Kaluza–Klein modes is obtained we will organize them into short multiplets of the supergroup $SU(1,1|2) \times SU(1,1|2)$. Let us now consider all the massless fields of (2,2) supergravity in six dimensions individually. The graviton transforms as (3, 3) of the little group in 6 dimensions. The Kaluza–Klein harmonics of this field according to the rules discussed above are
\[
\begin{aligned}
& (1,1) + 2(2,2) + (3,1) + (1,3) + 3 \bigoplus_{m \geq 3} (m,m) + 2 \bigoplus_{m \geq 2} \left[ (m+2,m) + (m,m+2) \right] \\
& + \bigoplus_{m \geq 1} \left[ (m+4,m) + (m,m+4) \right] .
\end{aligned}
\tag{6.4}
\]
The little group representations of the 8 gravitinos are 4(2,3) + 4(3,2). Their Kaluza–Klein harmonics are
\[ 8\left[ (1,2) + (2,1) \right] + 16 \bigoplus_{m \geq 2} \left[ (m+1,m) + (m,m+1) \right] + 8 \bigoplus_{m \geq 1} \left[ (m+3,m) + (m,m+3) \right] . \tag{6.5} \]
The Kaluza–Klein harmonics of the 5 two-forms transforming in (1,3) + (3,1) of the little group are
\[ 10 \bigoplus_{m \geq 2} (m,m) + 10 \bigoplus_{m \geq 1} \left[ (m+2,m) + (m,m+2) \right] . \tag{6.6} \]
The Kaluza–Klein harmonics of the 16 gauge fields (2,2) are
\[ 16(1,1) + 32 \bigoplus_{m \geq 2} (m,m) + 16 \bigoplus_{m \geq 1} \left[ (m,m+2) + (m+2,m) \right] . \tag{6.7} \]
The 40 fermions 20(2,1) + 20(1,2) give rise to the following harmonics:
\[ 40 \bigoplus_{m \geq 1} \left[ (m,m+1) + (m+1,m) \right] . \tag{6.8} \]
The 25 scalars (1,1) give rise to the harmonics
\[ 25 \bigoplus_{m \geq 1} (m,m) . \tag{6.9} \]
Putting all this together, the complete Kaluza–Klein spectrum of type IIB on AdS$_3 \times S^3 \times T^4$ yields
\[
\begin{aligned}
& 42(1,1) + 69(2,2) + 48\left[ (1,2) + (2,1) \right] + 27\left[ (1,3) + (3,1) \right] \\
& + 70 \bigoplus_{m \geq 3} (m,m) + 56 \bigoplus_{m \geq 2} \left[ (m,m+1) + (m+1,m) \right] \\
& + 28 \bigoplus_{m \geq 2} \left[ (m,m+2) + (m+2,m) \right] + 8 \bigoplus_{m \geq 1} \left[ (m,m+3) + (m+3,m) \right] \\
& + \bigoplus_{m \geq 1} \left[ (m,m+4) + (m+4,m) \right] .
\end{aligned}
\tag{6.10}
\]
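The counting that leads to (6.10) can be reproduced mechanically. The sketch below is an illustration under the rule stated above: a field in the little-group representation $(d_L, d_R)$ decomposes into spins of the diagonal $SU(2)$, and each spin contributes one Kaluza–Klein tower consisting of all $SO(4)$ representations $(m, m')$ that contain it. The names `FIELDS`, `spins_in_product` and `kk_spectrum`, and the cutoff `max_dim`, are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# (multiplicity, (dim_L, dim_R)) under the little group SO(4)_L.
FIELDS = [
    (1, (3, 3)),                   # graviton
    (4, (2, 3)), (4, (3, 2)),      # 8 gravitinos
    (5, (1, 3)), (5, (3, 1)),      # 5 two-forms
    (16, (2, 2)),                  # 16 gauge fields
    (20, (2, 1)), (20, (1, 2)),    # 40 fermions
    (25, (1, 1)),                  # 25 scalars
]


def spins_in_product(dim_a, dim_b):
    """Spins j of the diagonal SU(2) contained in (dim_a, dim_b)."""
    ja, jb = Fraction(dim_a - 1, 2), Fraction(dim_b - 1, 2)
    spins = []
    j = abs(ja - jb)
    while j <= ja + jb:
        spins.append(j)
        j += 1
    return spins


def kk_spectrum(max_dim):
    """Multiplicity of each SO(4) representation (m, m') with m, m' <= max_dim."""
    spectrum = Counter()
    for mult, (dim_l, dim_r) in FIELDS:
        for spin in spins_in_product(dim_l, dim_r):        # R_SO(3) content
            for m in range(1, max_dim + 1):
                for mp in range(1, max_dim + 1):
                    if spin in spins_in_product(m, mp):    # (m, m') contains R_SO(3)
                        spectrum[(m, mp)] += mult
    return spectrum


if __name__ == "__main__":
    s = kk_spectrum(4)
    for rep in [(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3), (2, 4), (1, 4)]:
        print(rep, s[rep])
```

Because each $(m, m')$ is tested independently, the cutoff does not affect the multiplicities of the representations that are kept.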
We now organize the above Kaluza–Klein modes into short representations of $SU(1,1|2) \times SU(1,1|2)$ [43]. The short multiplet of $SU(1,1|2)$ consists of the following states:

$j$            $L_0$          Degeneracy
$h$            $h$            $2h + 1$
$h - 1/2$      $h + 1/2$      $2(2h)$
$h - 1$        $h + 1$        $2h - 1$
(6.11)

In the above table $j$ labels the representation of $SU(2)$, which is identified as one of the $SU(2)$'s of the isometry group of $S^3$. $L_0$ denotes the conformal weight of the state. We denote the short multiplet of $SU(1,1|2) \times SU(1,1|2)$ as $(2h+1, 2h'+1)_S$. On organizing the Kaluza–Klein spectrum into short multiplets we get the following set:
\[
\begin{aligned}
& 5(2,2)_S + 6 \bigoplus_{m \geq 3} (m,m)_S \\
& + \bigoplus_{m \geq 2} \left[ (m,m+2)_S + (m+2,m)_S + 4(m,m+1)_S + 4(m+1,m)_S \right] .
\end{aligned}
\tag{6.12}
\]
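That the set (6.12) accounts for every mode in (6.10) can be verified by expanding each short multiplet into its $SU(2) \times SU(2)$ content using table (6.11). The sketch below is an illustration only: the function names are hypothetical, `kk_610` simply hard-codes the multiplicities of (6.10), and the cutoff is chosen so that every multiplet contributing to the compared representations is included.

```python
from collections import Counter


def short_multiplet_dims(n):
    """SU(2) dimensions in the SU(1,1|2) short multiplet with n = 2h + 1,
    following table (6.11): one of dimension n, two of n - 1, one of n - 2."""
    return [d for d in (n, n - 1, n - 1, n - 2) if d >= 1]


def sugra_short_multiplets(max_n):
    """Multiplicities of the short multiplets in (6.12) with labels <= max_n."""
    mult = Counter()
    mult[(2, 2)] += 5
    for m in range(3, max_n + 1):
        mult[(m, m)] += 6
    for m in range(2, max_n + 1):
        for a, b, c in [(m, m + 2, 1), (m + 2, m, 1), (m, m + 1, 4), (m + 1, m, 4)]:
            if max(a, b) <= max_n:
                mult[(a, b)] += c
    return mult


def so4_content(multiplets):
    """Expand short multiplets into SO(4) representations (a, b)."""
    out = Counter()
    for (n, nb), c in multiplets.items():
        for a in short_multiplet_dims(n):
            for b in short_multiplet_dims(nb):
                out[(a, b)] += c
    return out


def kk_610(a, b):
    """Multiplicity of (a, b) as listed in (6.10)."""
    lo, d = min(a, b), abs(a - b)
    if d == 0:
        return {1: 42, 2: 69}.get(lo, 70)
    if d == 1:
        return 48 if lo == 1 else 56
    if d == 2:
        return 27 if lo == 1 else 28
    return {3: 8, 4: 1}.get(d, 0)


if __name__ == "__main__":
    cutoff = 6
    content = so4_content(sugra_short_multiplets(cutoff + 2))
    ok = all(content[(a, b)] == kk_610(a, b)
             for a in range(1, cutoff + 1) for b in range(1, cutoff + 1))
    print("short multiplets reproduce (6.10):", ok)
```

A representation $(a, b)$ can only arise from multiplets with labels between $a$ and $a + 2$ and between $b$ and $b + 2$, which is why the multiplet list is built two units beyond the comparison cutoff.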
Eq. (6.10) shows that there are 42(1,1) $SO(4)$ representations in the supergravity Kaluza–Klein spectrum. We know that one of these arises from the s-wave of $g_{55}$ from Eq. (6.4). This is one of the fixed scalars. 16(1,1) come from the s-waves of the 16 gauge fields (the components along $x_5$) as seen in Eq. (6.7). The remaining 25 come from the 25 scalars of the six-dimensional theory. We would like to see where these 42(1,1) fit in the short multiplets of $SU(1,1|2) \times SU(1,1|2)$. From Eq. (6.12) one can read that 20 of them are in the $5(2,2)_S$ with $(j = 0, L_0 = 1;\ \bar j = 0, \bar L_0 = 1)$. 6 of them are in $6(3,3)_S$ with $(j = 0, L_0 = 2;\ \bar j = 0, \bar L_0 = 2)$. These correspond to the fixed scalars. Finally, the remaining 16 of them belong to $4(2,3)_S + 4(3,2)_S$. 8 of them have $(j = 0, L_0 = 1;\ \bar j = 0, \bar L_0 = 2)$ and 8 of them have $(j = 0, L_0 = 2;\ \bar j = 0, \bar L_0 = 1)$. These scalars can be recognized as the intermediate scalars.
Comparison of supergravity short multiplets with SCFT: In Section 5.8 we have listed the complete set of short multiplets corresponding to single-particle states of the N = (4,4) SCFT on the orbifold $\mathcal{M}$. Comparing Eq. (5.55) and the list of short multiplets of single-particle states obtained from supergravity in (6.12), we find that they are identical except for the presence of the following additional short multiplets in the SCFT:
\[ 2(1,2)_S + 2(2,1)_S + (1,3)_S + (3,1)_S . \tag{6.13} \]
These correspond to non-propagating degrees of freedom in the supergravity [43]. Therefore they are not present in the list of short multiplets obtained from supergravity (6.12). Furthermore, note that in (5.55) we have ignored the contribution of short multiplets from the free $T^4$ which forms part of the Higgs branch of the D1–D5 system. Thus the list of short multiplets in supergravity also ignores the contribution from the free torus. It is pertinent here to mention that the AdS/CFT duality for D1–D5 systems in theories with 16 supercharges was studied in [137,138]. These theories were obtained by considering various orbifolds of type IIB.

6.2. The supergravity moduli

In this section we will analyze in detail the massless scalars in the near-horizon geometry of the D1–D5 system. Type IIB supergravity compactified on $T^4$ has 25 scalars. There are 10 scalars $h_{ij}$ which arise from compactification of the metric; $i, j, k, \ldots$ stand for the directions of $T^4$. There are 6 scalars $b_{ij}$ which arise from the Neveu–Schwarz B-field and similarly there are 6 scalars $b'_{ij}$ from the Ramond–Ramond $B'$-field. The remaining 3 scalars are the ten-dimensional dilaton $\phi_{10}$, the Ramond–Ramond scalar $\chi$ and the Ramond–Ramond 4-form $C_{6789}$. These scalars parameterize the coset $SO(5,5)/(SO(5) \times SO(5))$. The near-horizon limit of the D1–D5 system is AdS$_3 \times S^3 \times T^4$ (2.70). In this geometry 5 of the 25 scalars become massive [90]. They are $h_{ii}$ (the trace of the metric of $T^4$, which is proportional to the volume of $T^4$), the 3 components of the anti-self-dual part of the Neveu–Schwarz B-field $b^-_{ij}$, and a linear combination of the Ramond–Ramond scalar and the 4-form [114]. The massless scalars in the near-horizon geometry parameterize the coset $SO(5,4)/(SO(5) \times SO(4))$ [139]. As we have seen, the near-horizon symmetries form the supergroup $SU(1,1|2) \times SU(1,1|2)$.
We have classified all the massless supergravity fields of type IIB supergravity on AdS$_3 \times S^3 \times T^4$, ignoring the Kaluza–Klein modes on $T^4$, according to the short multiplets of the supergroup $SU(1,1|2) \times SU(1,1|2)$. The isometries of the anti-de Sitter space allow us to relate the quantum number $L_0 + \bar L_0$ to the mass of the scalar field through the relation [90]:
\[ h + \bar h = 1 + \sqrt{1 + m^2} . \tag{6.14} \]
Here $m$ is the mass of the scalar in units of the radius of AdS$_3$ and $(h, \bar h)$ is the eigenvalue of $(L_0, \bar L_0)$ under the classification of the scalar in short multiplets of $SU(1,1|2) \times SU(1,1|2)$. Thus the massless fields of the near-horizon geometry of the D1–D5 system fall into the top component of the $5(2,2)_S$ short multiplet. We further classify these fields according to the representations of $SO(4)_I$, the rotations of the $x_6, x_7, x_8, x_9$ directions. As we have mentioned before, this is not a symmetry of the supergravity as it is compactified on $T^4$, but it can be used to classify states. The quantum numbers of the massless supergravity fields are listed below.

Field                                         $SU(2)_I \times \widetilde{SU(2)}_I$    $SU(2)_E \times \widetilde{SU(2)}_E$    Mass
$h_{ij} - \frac{1}{4}\delta_{ij} h_{kk}$      (3, 3)                                  (1, 1)                                  0
$b'_{ij}$                                     (3, 1) + (1, 3)                         (1, 1)                                  0
$\phi_6$                                      (1, 1)                                  (1, 1)                                  0
$a_1 \chi + a_2 C_{6789}$                     (1, 1)                                  (1, 1)                                  0
$b^+_{ij}$                                    (1, 3)                                  (1, 1)                                  0
(6.15)
The linear combination appearing on the fourth line is the one that remains massless in the near-horizon limit. $\phi_6$ refers to the six-dimensional dilaton. $SU(2)_E \times \widetilde{SU(2)}_E$ stands for the $SO(4)$ isometries of the $S^3$. All the above fields are s-waves of scalars in the near-horizon geometry.

6.3. AdS$_3$/CFT$_2$ correspondence

We have already seen in Section 6.1 that all the supergravity modes can be organized as short multiplets of the SCFT on $\mathcal{M}$. This is evidence for Maldacena's AdS/CFT correspondence. Maldacena's conjecture [42,62,63,90] for the case of the D1–D5 system states that string theory on AdS$_3 \times S^3 \times T^4$ is dual to the 1 + 1 dimensional conformal field theory of the Higgs branch of the gauge theory of the D1–D5 system. Here we briefly review the evidence for this conjecture from symmetries. To describe the D1–D5 system at a generic point in the moduli space we can use the N = (4,4) SCFT on the orbifold $\mathcal{M} \times T^4$ to describe the Higgs branch of the gauge theory of the D1–D5 system, as we have argued in Section 4. Here the dynamics on $T^4$ is decoupled from the symmetric product. However, in string theory on AdS$_3 \times S^3 \times T^4$ all fields couple to gravity and no field is free. Thus in this case of the AdS/CFT correspondence we have to ignore the free torus $T^4$. This is the same reason that the $U(1)$ of the N = 4 super Yang–Mills theory with gauge group $U(N)$ is ignored in the correspondence with string theory on AdS$_5 \times S^5$. The gauge group used in the correspondence is in fact $SU(N)$ [140]. The volume of $T^4$ is of the order of the string length and the radius of $S^3$ is large, therefore we can pass over from string theory on AdS$_3 \times S^3 \times T^4$ to six-dimensional (2,2) supergravity on AdS$_3 \times S^3$. We will compare symmetries in the supergravity limit. The identification of the isometries of the near-horizon geometry with the symmetries of the SCFT is given in the following table.

Symmetries of the bulk                                                     Symmetries of SCFT
(a) Isometries of AdS$_3$: $SO(2,2) \simeq SL(2,R) \times \widetilde{SL(2,R)}$     The global part of the Virasoro group, $SL(2,R) \times \widetilde{SL(2,R)}$
(b) Isometries of $S^3$: $SO(4)_E \simeq SU(2) \times SU(2)$                       R-symmetry of the SCFT, $SU(2)_R \times \widetilde{SU(2)}_R$
(c) Sixteen near-horizon supersymmetries                                   Global supercharges of N = (4,4) SCFT
(d) $SO(4)_I$ of $T^4$                                                     $SO(4)_I$ of $\tilde T^4$
To summarize, the $SU(1,1|2) \times SU(1,1|2)$ symmetry of the near-horizon geometry is identified with the global part of the N = (4,4) SCFT on the orbifold $\mathcal{M}$, together with the identification of the $SO(4)_I$ algebra of $T^4$ and $\tilde T^4$.

6.4. Supergravity moduli and the marginal operators

We would like to match the 20 supergravity moduli appearing in (6.15) with the 20 marginal operators appearing in (5.19) and (5.29) by comparing their symmetry properties under the AdS/CFT correspondence [141]. The symmetries, or equivalently quantum numbers, to be compared under the AdS/CFT correspondence are as follows:

(a) The isometries of the supergravity are identified with the global symmetries of the superconformal field theory. For the AdS$_3$ case the symmetries form the supergroup $SU(1,1|2) \times SU(1,1|2)$. The identification of this supergroup with the global part of the N = (4,4) superalgebra leads to the mass–dimension relation (6.14). Since in our case the SCFT operators are marginal and the supergravity fields are massless, the mass–dimension relation is obviously satisfied.

(b) The $SU(2)_E \times \widetilde{SU(2)}_E$ quantum number of the bulk supergravity field corresponds to the $SU(2)_R \times \widetilde{SU(2)}_R$ quantum number of the boundary operator. By an inspection of column three of the tables in (5.19), (5.29) and (6.15), we see that these quantum numbers also match.

(c) The location of the bulk fields and the boundary operators as components of the short multiplet can be found by the supersymmetry properties of the bulk fields and the boundary operators. Noting the fact that all the 20 bulk fields as well as all the marginal operators mentioned above correspond to top components of short multiplets, this property also matches.

(d) The above symmetries alone do not distinguish between the 20 operators or the 20 bulk fields. To further distinguish these operators and the fields we identify the $SO(4)_I$ symmetry of the directions $x_6, x_7, x_8, x_9$ with the $SO(4)_I$ of the SCFT.
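At the level of classification of states this identification is reasonable though these are not actual symmetries. Using the quantum numbers under this group we obtain the following matching of the boundary operators and the supergravity moduli:
\begin{equation}
\begin{array}{lll}
\text{Operator} & \text{Field} & SU(2)_I \times \widetilde{SU(2)}_I \\[4pt]
\partial x^{\{i}_A(z)\,\bar\partial x^{j\}A}(\bar z) - \tfrac{1}{4}\,\delta^{ij}\,\partial x^{k}_A\,\bar\partial x^{kA} & h_{ij} - \tfrac{1}{4}\,\delta_{ij}\,h_{kk} & (\mathbf{3},\mathbf{3}) \\
\partial x^{[i}_A(z)\,\bar\partial x^{j]A}(\bar z) & b'_{ij} & (\mathbf{3},\mathbf{1}) + (\mathbf{1},\mathbf{3}) \\
\partial x^{i}_A(z)\,\bar\partial x^{iA}(\bar z) & \phi & (\mathbf{1},\mathbf{1}) \\
T^1 & b^{+}_{ij} & (\mathbf{1},\mathbf{3}) \\
T^0 & a_1\chi + a_2 C_{6789} & (\mathbf{1},\mathbf{1})
\end{array}
\tag{6.16}
\end{equation}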
Note that both the representations $(\mathbf{1},\mathbf{3})$ and $(\mathbf{1},\mathbf{1})$ occur twice in the above table. This could give rise to a two-fold ambiguity in identifying either the $(\mathbf{1},\mathbf{3})$ or the $(\mathbf{1},\mathbf{1})$ operators with their corresponding bulk fields. We have resolved it as follows. The operators $T^1$ and $T^0$ correspond to blow-up modes of the orbifold and, as we will show in Section 7, they are related to the Fayet–Iliopoulos terms and the $\theta$-term in the D1–D5 gauge theory. Tuning these operators one can reach
the singular SCFT [114] that corresponds to fragmentation of the D1–D5 system. In supergravity, similarly, it is only the moduli $b^{+}_{ij}$ and $a_1\chi + a_2 C_{6789}$ which affect the stability of the D1–D5 system [114,119,142]. As a result, it is $b^{+}_{ij}$ (and not $b'^{+}_{ij}$) which should correspond to the operator $T^1$, and similarly $a_1\chi + a_2 C_{6789}$ should correspond to $T^0$. Another reason for this identification is as follows. $b^{+}_{ij}$ and $a_1\chi + a_2 C_{6789}$ are odd under world sheet parity while $b'^{+}_{ij}$ and $\phi$ are even under world sheet parity. In a $Z_2$ orbifolded theory there is a $Z_2$ symmetry which can be used to classify the states [143]. Under this symmetry the $Z_2$ quantum number of the twisted sectors is $-1$ and the $Z_2$ quantum number of the untwisted sectors is $+1$. If under the AdS/CFT correspondence one can identify these $Z_2$ quantum numbers in the boundary SCFT and the bulk, then the correspondence we have made is further justified. Thus, we arrive at a one-to-one correspondence between operators of the SCFT and the supergravity moduli.

7. Location of the symmetric product

In the previous section we have studied the moduli of the D1–D5 system in detail. In (6.16) we have listed all the supergravity moduli and their corresponding operators in the SCFT on $\mathcal{M}$. The D1–D5 system is unspecified until all its moduli are given. In this section we will find the location of the free field orbifold $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$ in the D1–D5 moduli space. It is easy to see from the mass formula of the D1–D5 system that the system is marginally stable to decay. The mass per unit length of the D1–D5 system is given by
\begin{equation}
M = \frac{1}{2\pi\alpha' g_s}\,(Q_1 + v\,Q_5) .
\tag{7.1}
\end{equation}
Here $v$ is defined in (2.69). Note that the above formula is linear in $Q_1$ and $Q_5$; therefore it does not cost any energy for the D1–D5 system to decay to subsystems with smaller values of $Q_1$ and $Q_5$. But when any of the moduli in (6.16) is turned on, the D1–D5 system is stable. In fact there is a binding energy which prevents its decay. To see this let us turn on one of the moduli in (6.16). Consider turning on the self-dual Neveu–Schwarz B-field along the D5-brane direction. We can choose it to be given by
\begin{equation}
B_{ij} = \frac{1}{2\pi\alpha'}
\begin{pmatrix}
0 & b & 0 & 0 \\
-b & 0 & 0 & 0 \\
0 & 0 & 0 & b \\
0 & 0 & -b & 0
\end{pmatrix} .
\tag{7.2}
\end{equation}
Here $i, j$ run over $6, \ldots, 9$. For convenience let the metric of $T^4$ be $\delta_{ij}$ and $v = 1$. To demonstrate that there is a binding energy it is sufficient to consider the case of $Q_1 = 1$ and $Q_5 = 1$. The mass$^{12}$ of a single D1-brane is given by
\begin{equation}
M_{D1} = \frac{1}{2\pi\alpha' g_s} .
\tag{7.3}
\end{equation}

$^{12}$ Here mass refers to mass per unit length.
Similarly, the mass of the D5-brane wrapped on $T^4$ with the B-field (7.2) is given by
\begin{equation}
M_{D5} = \frac{1}{2\pi\alpha' g_s}\,(1 + b^2) .
\tag{7.4}
\end{equation}
It is easy to understand this mass formula from the Dirac–Born–Infeld action
\begin{equation}
S = \frac{1}{2\pi\alpha' g_s} \int \sqrt{\mathrm{Det}(G + 2\pi\alpha' B)} ,
\tag{7.5}
\end{equation}
where $G$ is the induced metric. Substituting the value of $B$ from (7.2) and the metric of the $T^4$, and expanding the Dirac–Born–Infeld action in the static gauge, we obtain the mass for the D5-brane with the B-field as given by (7.4). The mass of the D1–D5 system with the B-field is given by [46] (cf. (2.52) with $b_{68} = -b_{79} = b$)
\begin{equation}
M = \frac{1}{2\pi\alpha' g_s} \sqrt{\left(Q_1 + (1 - b^2)\,Q_5\right)^2 + 4 b^2 Q_5^2} .
\tag{7.6}
\end{equation}
Here $Q_1$ and $Q_5$ stand for the number of D1- and D5-branes, respectively. Substituting $Q_1 = Q_5 = 1$ we get
\begin{equation}
M_{D1-D5} = \frac{1}{2\pi\alpha' g_s} \sqrt{4 + b^4} .
\tag{7.7}
\end{equation}
The binding energy is given by
\begin{equation}
\Delta M = (M_{D1} + M_{D5}) - M_{D1-D5} .
\tag{7.8}
\end{equation}
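The positivity of the binding energy (7.8) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it works in units where the common overall prefactor of (7.3), (7.4) and (7.6) is set to 1 (so only ratios matter), takes the $T^4$ metric to be the identity and $v = 1$ as in the text, and assumes the B-field of (7.2) is normalized so that it enters the determinant of (7.5) as the bare antisymmetric matrix with entries $\pm b$. The function names are illustrative choices.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def d5_mass(b):
    """sqrt(Det(G + B)) for G = identity and the block form of (7.2); expected 1 + b^2."""
    g_plus_b = [
        [1.0, b, 0.0, 0.0],
        [-b, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, b],
        [0.0, 0.0, -b, 1.0],
    ]
    return math.sqrt(det(g_plus_b))

def bound_state_mass(q1, q5, b):
    """Mass formula (7.6) with v = 1, in units of the overall prefactor."""
    return math.sqrt((q1 + (1.0 - b * b) * q5) ** 2 + 4.0 * b * b * q5 * q5)

def binding_energy(b):
    """Delta M of (7.8) for Q1 = Q5 = 1, with the D1 mass (7.3) normalized to 1."""
    m_d1 = 1.0
    return m_d1 + d5_mass(b) - bound_state_mass(1, 1, b)

if __name__ == "__main__":
    for b in (0.0, 0.1, 0.5, 1.0, 2.0):
        print(b, binding_energy(b))
```

Analytically, $(2 + b^2)^2 = 4 + 4b^2 + b^4 > 4 + b^4$ for $b \neq 0$, so the binding energy vanishes only at $b = 0$ and is positive otherwise, which is what the scan shows.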
It is easy to see this binding energy is positive.$^{13}$ One can repeat similar calculations with the other moduli given in (6.16) and demonstrate the existence of a positive binding energy. This issue of the stability of the D1–D5 system with the various moduli turned on has been discussed in [88,114,119,142,144].

$^{13}$ If the NS B-field were anti-self-dual then the binding energy would be zero, and the D1–D5 system would be marginally stable.

It has been observed in [114] that the effective theory of a single D1-brane separating off the D1–D5 bound state is a linear dilaton theory. This was derived by studying the dynamics of a D1-brane close to the boundary of AdS$_3$. In Section 7.1 we derive the effective theory of a set of $q_1$ D1-branes and $q_5$ D5-branes splitting off the D1–D5 bound state. The AdS/CFT correspondence suggests that this decay of the D1–D5 system should also be seen from the conformal field theory of the Higgs branch of the D1–D5 system. It is convenient to extract the dynamics of the decay from the D1–D5 gauge theory. In fact such a decay signals a singularity in the world volume gauge theory associated with the origin of the Higgs branch. The dynamics of the decay can be extracted from the D1–D5 gauge theory using the methods developed in [145]. In Section 7.3 we extract the dynamics of the decay from the D1–D5 gauge theory and show that it is described by the same linear dilaton theory observed by [114] in supergravity. The singularity mentioned above leads to a singular conformal field theory. However, generic values of the supergravity moduli which do not involve fragmentation into constituents are described by well-defined conformal field theories, and therefore string perturbation theory makes sense.

We have seen in Section 5.6 that the important singularity structure of the $\mathcal{N} = (4,4)$ SCFT on the orbifold $\mathcal{M}$ is locally of the type $R^4/Z_2$. The resolution of this singularity gives rise to marginal
operators. An orbifold theory realized as a free field SCFT on $R^4/Z_2$ is non-singular, as all correlation functions are finite. The reason for this can be understood from the linear sigma model description of the $R^4/Z_2$ singularity, which will be discussed in Section 7.2. We will see that though the $R^4/Z_2$ singularity is geometrically singular, the SCFT is finite because it corresponds to a non-zero theta term in the linear sigma model. The geometric resolution of this singularity corresponds to adding Fayet–Iliopoulos terms to the D-term equations of the linear sigma model. This deforms the $R^4/Z_2$ singularity to an Eguchi–Hanson space. In the orbifold theory this deformation is caused by the twist operator $T^1$. The Eguchi–Hanson space is asymptotically $R^4/Z_2$ but the singularity at the origin is blown up to a 2-sphere. One can use the $SU(2)_R$ symmetry of the linear sigma model to rotate the three Fayet–Iliopoulos terms to one term. This term corresponds to the radius of the blown-up 2-sphere. The theta term of the linear sigma model corresponds to B-flux through the 2-sphere. The change of this B-flux is caused by deforming the orbifold SCFT by the twist operator $T^0$. Thus the SCFT realized as a free field theory on the orbifold $R^4/Z_2$ is regular even though the 2-sphere is squashed to zero size, because of the non-zero value of the B-flux trapped in the squashed 2-sphere [146]. We summarize this discussion in Fig. 4.

Fig. 4. Linear sigma model and CFT description of the $R^4/Z_2$ singularity. (Diagram labels: linear sigma model (UV) at $\theta = \pi$ and at $\theta = 0$; marginal deformation $T^0$; non-singular CFT and singular CFT as the SCFT (IR).)

For most of our discussion we have assumed that the Higgs branch of the D1–D5 gauge theory is a resolved $\mathcal{N} = (4,4)$ theory on $\mathcal{M}$. Furthermore we have realized this theory as a free field theory with orbifold identification. This implies that we are at a point in the moduli space of the D1–D5 system at which the orbifold is geometrically singular but, because of the non-zero value of the theta term, the SCFT is regular and not at the singularity corresponding to fragmentation.

In other words, the orbifold SCFT corresponds to a bound state of $Q_1$ D1-branes and $Q_5$ D5-branes (henceforth denoted as the $(Q_1, Q_5)$ bound state). The supergravity solution (2.70) has no moduli turned on. This implies that the dual SCFT is singular and is far away in moduli space from the regular conformal theory on $\mathcal{M}$. But the fact that we could show that all the short multiplets of the supergravity modes on AdS$_3 \times S^3$ are in one-to-one correspondence with the short multiplets of the SCFT on $\mathcal{M}$ implies that these multiplets are protected by non-renormalization theorems. This will be discussed in detail in Section 9.
7.1. Dynamics of the decay of the D1–D5 system from gravity

We consider a set of $(q_1, q_5)$ test D-branes with $q_1, q_5 \ll Q_1, Q_5$ close to the boundary of AdS$_3$ but separated from the rest of the branes of the D1–D5 system. When the test branes are close to the boundary it is easy to see, using the UV/IR correspondence, that the gauge group is broken to $U(q_1) \times U(q_5) \times U(Q_1 - q_1) \times U(Q_5 - q_5)$ from $U(Q_1) \times U(Q_5)$ in the IR. Thus we can extract the infrared dynamics of the decay of the D1–D5 system from gravity.

Let us first consider the case when $q_5 = 0$ [114]. The AdS/CFT correspondence tells us that we need to consider $q_1$ D1-branes in the background of AdS$_3 \times S^3 \times T^4$. (We will work in the Euclidean AdS$_3$ coordinates.) The radius of $S^3$ and the anti-de Sitter space is given by $r_0 = \sqrt{\alpha'}\,(g_6^2 Q_1' Q_5')^{1/4}$,$^{14}$ where $Q_1' = Q_1 - q_1$ and $Q_5' = Q_5 - q_5$. For the supergravity to be valid we need to consider the limit $r_0 \gg 0$. Let us focus on the distance between the boundary of AdS$_3 \times S^3 \times T^4$ and the set of $q_1$ D1-branes. We are interested in the infrared description of the splitting process. By the UV/IR correspondence, the D1-branes should be close to the boundary of AdS$_3 \times S^3 \times T^4$ to obtain the infrared description of the splitting process in the supergravity. We assume that the D1-branes are fixed at a particular point on the $S^3$ and the $T^4$. The action of $q_1$ D1-branes in the background of AdS$_3$ and the Ramond–Ramond two-form $B_{05}$ is given by the DBI action. We can use the DBI action for multiple D1-branes as we are interested only in the dynamics of the centre of mass of the collection of $q_1$ D1-branes. The DBI action of $q_1$ D1-branes is given by
\begin{equation}
S = \frac{q_1}{2\pi\alpha' g_s} \int d^2\sigma\, e^{-\phi} \sqrt{\det(g^{\rm ind}_{\alpha\beta})} - \frac{q_1}{2\pi\alpha' g_s} \int B ,
\tag{7.9}
\end{equation}
where $\sigma$ stands for the world volume coordinates and $\alpha, \beta$ label these coordinates. $g^{\rm ind}_{\alpha\beta}$ is the induced metric on the world volume. $B$ is the Ramond–Ramond 2-form potential. We choose a gauge in which the world volume coordinates are the coordinates of the boundary of AdS$_3$.

Let the metric on the boundary be $g_{\alpha\beta}(\sigma)$. One can extend the metric $g_{\alpha\beta}(\sigma)$ on the boundary to the interior of AdS$_3$ in the neighbourhood of the boundary [114]. This is given by
\begin{equation}
ds^2 = \frac{r_0^2}{t^2} \left( dt^2 + \hat{g}_{\alpha\beta}(\sigma, t)\, d\sigma^\alpha d\sigma^\beta \right)
\tag{7.10}
\end{equation}
with
\begin{equation}
\hat{g}_{\alpha\beta}(\sigma, 0) = g_{\alpha\beta}(\sigma), \qquad
\hat{g}_{\alpha\beta}(\sigma, t) = g_{\alpha\beta}(\sigma) - t^2 P_{\alpha\beta} + O(t^3) + \cdots .
\tag{7.11}
\end{equation}
Here $g^{\alpha\beta} P_{\alpha\beta} = R/2$, where $R$ is the world sheet curvature. In global coordinates the metric of Euclidean AdS$_3$ is given by$^{15}$
\begin{equation}
ds^2 = r_0^2 \left( d\phi^2 + \sinh^2\!\phi \; d\Omega^2 \right) ,
\tag{7.12}
\end{equation}
where $d\Omega^2$ is the round metric on $S^2$. Near the boundary the metric is given by
\begin{equation}
ds^2 = r_0^2 \left( d\phi^2 + \frac{e^{2\phi}}{4}\, d\Omega^2 \right) .
\tag{7.13}
\end{equation}

$^{14}$ We have called this quantity $R = \sqrt{\alpha'}\, l$ in (2.73) and (11.4); the $r_0$ here is not to be confused with the non-extremality parameter, such as in (2.39).

$^{15}$ This can be obtained from (C.26) and (C.27) by defining $Y_{-1} = l \cosh\phi$, $(Y_0, Y_1, Y_2) = l \sinh\phi\, \vec{\Omega}$, $\vec{\Omega} \in S^2$, and then replacing the notation $l$ by $r_0$; see footnote 14.
Motivated by this we use $\phi$, defined by $t = 2e^{-\phi}$, to measure the distance from the boundary of AdS$_3$. Substituting the metric in (7.10) and the near-horizon values of the Ramond–Ramond 2-form and the dilaton in (7.9), we obtain the following effective action of the D1-branes near the boundary:
\begin{equation}
\begin{aligned}
S &= \frac{q_1 r_0^2}{4\pi\alpha' g_s} \sqrt{\frac{Q_5' v}{Q_1'}} \int d^2\sigma \sqrt{g} \left( \partial_\alpha\phi\, \partial^\alpha\phi + \phi R - \tfrac{1}{2} R + O(e^{-2\phi}) \right) \\
&= \frac{q_1 Q_5'}{4\pi} \int d^2\sigma \sqrt{g} \left( \partial_\alpha\phi\, \partial^\alpha\phi + \phi R - \tfrac{1}{2} R + O(e^{-2\phi}) \right) .
\end{aligned}
\tag{7.14}
\end{equation}
Now consider the case when $q_1 = 0$. The $q_5$ D5-branes are wrapped on $T^4$. Therefore the world volume of the D5-branes is of the form $M_2 \times T^4$, where $M_2$ is any 2-manifold. The D5-branes are located at a point on the $S^3$. We ignore the fluctuations on $T^4$ as we are interested in the dynamics on AdS$_3$. The DBI action of $q_5$ D5-branes is given by
\begin{equation}
S = \frac{q_5}{32\pi^5 g_s \alpha'^3} \left[ \int d^6\sigma\, e^{-\phi} \sqrt{\det(g^{\rm ind}_{\alpha\beta})} - \int C^{(6)} \right] ,
\tag{7.15}
\end{equation}
where $C^{(6)}$ is the Ramond–Ramond 6-form potential coupling to the D5-brane. Performing a similar calculation for the D5-branes and substituting the near-horizon values of the 6-form Ramond–Ramond potential, the dilaton and the volume of $T^4$, one obtains the following effective action for the D5-branes:
\begin{equation}
\begin{aligned}
S &= \frac{q_5 r_0^2}{4\pi\alpha' g_s} \sqrt{\frac{Q_1' v}{Q_5'}} \int d^2\sigma \sqrt{g} \left( \partial_\alpha\phi\, \partial^\alpha\phi + \phi R - \tfrac{1}{2} R + O(e^{-2\phi}) \right) \\
&= \frac{q_5 Q_1'}{4\pi} \int d^2\sigma \sqrt{g} \left( \partial_\alpha\phi\, \partial^\alpha\phi + \phi R - \tfrac{1}{2} R + O(e^{-2\phi}) \right) .
\end{aligned}
\tag{7.16}
\end{equation}
For the case when $q_1 \neq 0$ and $q_5 \neq 0$ we just add the contributions from (7.14) and (7.16) to obtain the effective action of the $(q_1, q_5)$ string in AdS$_3$. We can do this because there is no force between the test D1- and D5-branes. Thus, to leading order, the total effective action of the $(q_1, q_5)$ string near the boundary is given by
\begin{equation}
S = \frac{q_1 Q_5' + q_5 Q_1'}{4\pi} \int d^2\sigma \sqrt{g} \left( \partial_\alpha\phi\, \partial^\alpha\phi + \phi R - \tfrac{1}{2} R \right) .
\tag{7.17}
\end{equation}
Rescaling $\phi$ so that the normalization of the kinetic energy term is canonical, one obtains a linear dilaton action with a background charge given by
\begin{equation}
Q_{\rm SUGRA} = \sqrt{2\,(q_1 Q_5' + q_5 Q_1')} .
\tag{7.18}
\end{equation}
To summarize, the effective dynamics of the $(Q_1, Q_5)$ D1–D5 system decaying into $(q_1, q_5)$ branes is governed by a linear dilaton theory with background charge given by (7.18). Note that the linear dilaton theory in (7.17) is strongly coupled at $\phi \to \infty$, the boundary of AdS$_3$. If a similar analysis is performed for the supergravity solution with the self-dual NS B-field turned on (2.48), one obtains a potential for the linear dilaton which prevents the coupling to grow
to infinity at the boundary of AdS$_3$ [88]. Thus the effective theory is non-singular in the presence of the NS B-field.

7.2. The linear sigma model description of $R^4/Z_2$

The linear sigma model is a 1 + 1 dimensional $U(1)$ gauge theory with (4,4) supersymmetry [147]. It has 2 hypermultiplets charged under the $U(1)$. The scalar fields of the hypermultiplets can be organized as doublets under the $SU(2)_R$ symmetry of the (4,4) theory as
\begin{equation}
\chi_1 = \begin{pmatrix} A_1 \\ B_1^\dagger \end{pmatrix}
\quad \text{and} \quad
\chi_2 = \begin{pmatrix} A_2 \\ B_2^\dagger \end{pmatrix} .
\tag{7.19}
\end{equation}
The $A$'s have charge $+1$ and the $B$'s have charge $-1$ under the $U(1)$. The vector multiplet has 4 real scalars $\varphi_i$, $i = 1, \ldots, 4$. They do not transform under the $SU(2)_R$. One can include 4 parameters in this theory consistent with (4,4) supersymmetry. They are the 3 Fayet–Iliopoulos terms and the theta term.

Let us first investigate the hypermultiplet moduli space of this theory with the 3 Fayet–Iliopoulos terms and the theta term set to zero. The Higgs phase of this theory is obtained by setting $\varphi_i$ and the D-terms to zero. The D-term equations are
\begin{equation}
|A_1|^2 + |A_2|^2 - |B_1|^2 - |B_2|^2 = 0 , \qquad
A_1 B_1 + A_2 B_2 = 0 .
\tag{7.20}
\end{equation}
The hypermultiplet moduli space is the space of solutions of the above equations modded out by the $U(1)$ gauge symmetry. Counting the number of degrees of freedom indicates that this space is 4-dimensional. To obtain the explicit form of this space it is convenient to introduce the following gauge invariant variables:
\begin{equation}
M = A_1 B_2 , \qquad N = A_2 B_1 ,
\tag{7.21}
\end{equation}
\begin{equation}
P = A_1 B_1 = -A_2 B_2 .
\tag{7.22}
\end{equation}
These variables are not independent. Setting the D-terms equal to zero and modding out the resulting space by $U(1)$ is equivalent to the equation
\begin{equation}
P^2 + MN = 0 .
\tag{7.23}
\end{equation}
This homogeneous equation is an equation of the space $R^4/Z_2$. To see this, note that the solution of the above equation can be parameterized by 2 complex numbers $(\zeta, \eta)$ such that
\begin{equation}
P = i\zeta\eta , \qquad M = \zeta^2 , \qquad N = \eta^2 .
\tag{7.24}
\end{equation}
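The algebra behind (7.21) to (7.24) can be verified with a few lines of complex arithmetic. The sketch below is illustrative only: the function names and the sample values are choices made here, and the two complex parameters of (7.24) are called `zeta` and `eta` in the code. It checks that the gauge-invariant variables satisfy (7.23) once the complex D-term equation in (7.20) holds, that the parameterization (7.24) solves (7.23), and that flipping the sign of both parameters gives the same point.

```python
def gauge_invariants(a1, a2, b1, b2):
    """Return (M, N, P) of (7.21)-(7.22) built from the hypermultiplet scalars."""
    return a1 * b2, a2 * b1, a1 * b1

def parameterization(zeta, eta):
    """Return (M, N, P) of (7.24) for two complex parameters."""
    return zeta ** 2, eta ** 2, 1j * zeta * eta

def conifold_residual(m, n, p):
    """Left-hand side of (7.23); zero on the moduli space."""
    return p * p + m * n

if __name__ == "__main__":
    # Pick A1, A2, B1 freely and solve the complex D-term A1 B1 + A2 B2 = 0 for B2.
    a1, a2, b1 = 0.3 + 1.1j, -0.7 + 0.2j, 1.5 - 0.4j
    b2 = -a1 * b1 / a2
    print(abs(conifold_residual(*gauge_invariants(a1, a2, b1, b2))))

    zeta, eta = 0.8 - 0.5j, -1.2 + 0.9j
    print(abs(conifold_residual(*parameterization(zeta, eta))))
    print(parameterization(zeta, eta) == parameterization(-zeta, -eta))
```

The sign-flip invariance in the last line is the $Z_2$ identification: every monomial in (7.24) is quadratic in the two parameters.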
Thus the points $(\zeta, \eta)$ and $(-\zeta, -\eta)$ are the same point in the space of solutions of (7.23). We have shown that the hypermultiplet moduli space is $R^4/Z_2$.

The above singularity at the origin of the moduli space is a geometric singularity in the hypermultiplet moduli space. We now argue that this singularity is a genuine singularity of the SCFT that the linear sigma model flows to in the infrared. At the origin of the classical moduli space the
Coulomb branch meets the Higgs branch. In addition to the potential due to the D-terms, the linear sigma model contains the following term in the superpotential:$^{16}$
\begin{equation}
V = \left( |A_1|^2 + |A_2|^2 + |B_1|^2 + |B_2|^2 \right) \left( \varphi_1^2 + \varphi_2^2 + \varphi_3^2 + \varphi_4^2 \right) .
\tag{7.25}
\end{equation}
Thus at the origin of the hypermultiplet moduli space a flat direction for the Coulomb branch opens up. The ground state at this point is not normalizable due to the non-compactness of the Coulomb branch. This renders the infrared SCFT singular.

This singularity can be avoided in two distinct ways. If one turns on the Fayet–Iliopoulos D-terms, the D-term equations are modified to [147]$^{17}$
\begin{equation}
|A_1|^2 + |A_2|^2 - |B_1|^2 - |B_2|^2 = r_3 , \qquad
A_1 B_1 + A_2 B_2 = r_1 + i r_2 ,
\tag{7.26}
\end{equation}
where $r_1, r_2, r_3$ are the 3 Fayet–Iliopoulos D-terms transforming as the adjoint of the $SU(2)_R$. Now the origin is no longer a solution of these equations and the non-compactness of the Coulomb branch is avoided. In this case wave-functions will have compact support on the Coulomb branch. This ensures that the infrared SCFT is non-singular. Turning on the Fayet–Iliopoulos D-terms thus corresponds to the geometric resolution of the singularity. The resolved space is known [146,147] to be described by an Eguchi–Hanson metric in which $r_{1,2,3}$ parameterize a shrinking two-cycle.

The second way to avoid the singularity in the SCFT is to turn on the theta angle $\theta$. This induces a constant electric field in the vacuum. This electric field is screened at any point other than the origin in the hypermultiplet moduli space, as the $U(1)$ gauge field is massive with a mass proportional to the vacuum expectation value of the hypers. At the origin the $U(1)$ field is not screened and thus it contributes to the energy density of the vacuum. This energy is proportional to $\theta^2$. Thus turning on the theta term lifts the flat directions of the Coulomb branch. This ensures that the corresponding infrared SCFT is well defined though the hypermultiplet moduli space remains geometrically singular. In terms of the Eguchi–Hanson space, the $\theta$-term corresponds to a flux of the anti-symmetric tensor through the two-cycle mentioned above.

The (4,4) SCFT on $R^4/Z_2$ at the orbifold point is well defined. Since the orbifold has a geometric singularity but the SCFT is non-singular, it must correspond to the linear sigma model with a finite value of $\theta$ and the Fayet–Iliopoulos D-terms set to zero. Deformations of the $R^4/Z_2$ orbifold by its 4 blow-up modes correspond to changes in the Fayet–Iliopoulos D-terms and the theta term of the linear sigma model.$^{18}$ The global description of the moduli of an $\mathcal{N} = (4,4)$ SCFT on a resolved $R^4/Z_2$ orbifold is provided by the linear sigma model.

In conclusion let us describe this linear sigma model in terms of the gauge theory of D-branes. The theory described above arises on a single D1-brane in the presence of 2 D5-branes. The singularity at the point $r_1, r_2, r_3, \theta = 0$ is due to the non-compactness of the flat direction of the Coulomb branch. Thus it corresponds to the physical situation of the D1-brane leaving the D5-branes.

$^{16}$ These terms can be understood from the coupling $A_\mu A^\mu \chi^* \chi$ in six dimensions, and recognizing that under dimensional reduction to two dimensions the $\varphi_i$'s appear from the components of $A_\mu$ in the compact directions.

$^{17}$ cf. Eqs. (4.14) and (4.15), whose parameters $\zeta, \zeta_c$ are related to the present parameters $r_1, r_2, r_3$ by $r_3 \equiv \zeta$, $r_1 + i r_2 \equiv \zeta_c$.

$^{18}$ If we identify the $SU(2)_R$ of the linear sigma-model with $\widetilde{SU(2)}_I$ of the orbifold SCFT, then the Fayet–Iliopoulos parameters will correspond to $T^1$ and the $\theta$-term to $T^0$. This is consistent with Witten's observation [113] that the $SO(4)_E$ symmetry of the linear sigma-model (the one that rotates the $\varphi_i$'s) corresponds to the $SU(2)_R$ of the orbifold SCFT.
7.3. The gauge theory relevant for the decay of the D1–D5 system

As we have seen in Section 5.6, the resolutions of the $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$ are described by 4 marginal operators, which were identified in the previous subsection with the Fayet–Iliopoulos D-terms and the theta term of the linear sigma model description of the $R^4/Z_2$ singularity. We now want to indicate how these four parameters make their appearance in the gauge theory description of the full D1–D5 system.

Motivated by the D-brane description of the $R^4/Z_2$ singularity, we look for the degrees of freedom characterizing the break-up of the $(Q_1, Q_5)$ system into $(q_1, q_5)$ and $(Q_1', Q_5')$, where $Q_1' = Q_1 - q_1$ and $Q_5' = Q_5 - q_5$. Physically the relevant degree of freedom describing this process is the relative coordinate between the centre of mass of the $(q_1, q_5)$ system and that of the $(Q_1', Q_5')$ system. We will describe the effective theory of this degree of freedom below.

For the bound state $(Q_1, Q_5)$ the hypermultiplets $\chi$ are charged under the relative $U(1)$ of $U(Q_1) \times U(Q_5)$, that is, under the gauge field $A_\mu = \mathrm{Tr}_{U(Q_1)}(A_\mu) - \mathrm{Tr}_{U(Q_5)}(A_\mu)$. The relative $U(1)$ gauge multiplet corresponds to the degree of freedom of the relative coordinate between the centre of mass of the collection of $Q_1$ D1-branes and $Q_5$ D5-branes. At a generic point of the Higgs phase all the $\chi$'s have expectation values, thus making this degree of freedom massive. This is consistent with the fact that we are looking at the bound state $(Q_1, Q_5)$.

Consider the break-up of the $(Q_1, Q_5)$ bound state into the bound states $(q_1, q_5)$ and $(Q_1', Q_5')$. To find out the charges of the hypermultiplets under the various $U(1)$'s, we will organize the hypers as
\begin{equation}
\chi = \begin{pmatrix} \chi_{a\bar{b}} & \chi_{a\bar{b}'} \\ \chi_{a'\bar{b}} & \chi_{a'\bar{b}'} \end{pmatrix} , \quad
Y_i^{(1)} = \begin{pmatrix} Y^{(1)}_{i(a\bar{a})} & Y^{(1)}_{i(a\bar{a}')} \\ Y^{(1)}_{i(a'\bar{a})} & Y^{(1)}_{i(a'\bar{a}')} \end{pmatrix}
\quad \text{and} \quad
Y_i^{(5)} = \begin{pmatrix} Y^{(5)}_{i(b\bar{b})} & Y^{(5)}_{i(b\bar{b}')} \\ Y^{(5)}_{i(b'\bar{b})} & Y^{(5)}_{i(b'\bar{b}')} \end{pmatrix} ,
\tag{7.27}
\end{equation}
where $a, \bar{a}$ run from $1, \ldots, q_1$; $b, \bar{b}$ from $1, \ldots, q_5$; $a', \bar{a}'$ from $1, \ldots, Q_1'$; and $b', \bar{b}'$ from $1, \ldots, Q_5'$. We organize the scalars of the vector multiplets corresponding to the gauge groups $U(Q_1)$ and $U(Q_5)$ as
\begin{equation}
\phi_m^{(1)} = \begin{pmatrix} \phi_m^{(1)a\bar{a}} & \phi_m^{(1)a\bar{a}'} \\ \phi_m^{(1)a'\bar{a}} & \phi_m^{(1)a'\bar{a}'} \end{pmatrix}
\quad \text{and} \quad
\phi_m^{(5)} = \begin{pmatrix} \phi_m^{(5)b\bar{b}} & \phi_m^{(5)b\bar{b}'} \\ \phi_m^{(5)b'\bar{b}} & \phi_m^{(5)b'\bar{b}'} \end{pmatrix} ,
\tag{7.28}
\end{equation}
where $m = 1, 2, 3, 4$.

Let us call the $U(1)$ gauge fields (traces) of $U(q_1)$, $U(q_5)$, $U(Q_1')$, $U(Q_5')$ as $A_1$, $A_5$, $A_1'$, $A_5'$, respectively. We will also use the notation $A_\pm \equiv A_1 \pm A_5$ and $A'_\pm \equiv A_1' \pm A_5'$.

As we are interested in the bound states $(q_1, q_5)$ and $(Q_1', Q_5')$, in what follows we will work with a specific classical background in which we give vevs to the block-diagonal hypers $\chi_{a\bar{b}}$, $\chi_{a'\bar{b}'}$, $Y^{(1)}_{i(a\bar{a})}$, $Y^{(5)}_{i(b\bar{b})}$, $Y^{(1)}_{i(a'\bar{a}')}$ and $Y^{(5)}_{i(b'\bar{b}')}$. These vevs are chosen so that the classical background satisfies the D-term equations (4.13).

The vevs of the $\chi$'s render the fields $A_-$ and $A'_-$ massive, with a mass proportional to the vevs. In the low energy effective Lagrangian these gauge fields can therefore be neglected. In the following we will focus on the $U(1)$ gauge field $A_r = \tfrac{1}{2}(A_+ - A'_+)$, which does not get a mass from the above vevs. The gauge multiplet corresponding to $A_r$ contains four real scalars, denoted below by $\varphi_m$. These represent the relative coordinate between the centres of mass of the $(q_1, q_5)$ and the $(Q_1', Q_5')$ bound states. We will be interested in the question of whether the $\varphi_m$'s remain massless or not.
The massless case would correspond to a non-compact Coulomb branch and an eventual singularity of the SCFT. In order to address the above question we need to find the low energy degrees of freedom which couple to the gauge multiplet corresponding to $A_r$.

The fields charged under $A_r$ are the hypermultiplets $\chi_{a\bar{b}'}$, $\chi_{a'\bar{b}}$, $Y^{(1)}_{i(a\bar{a}')}$, $Y^{(1)}_{i(a'\bar{a})}$, $Y^{(5)}_{i(b\bar{b}')}$, $Y^{(5)}_{i(b'\bar{b})}$ and the vector multiplets $\phi_m^{(1)a\bar{a}'}$, $\phi_m^{(1)a'\bar{a}}$, $\phi_m^{(5)b\bar{b}'}$, $\phi_m^{(5)b'\bar{b}}$. In order to find out which of these are massless, we look at the following terms in the Lagrangian of the $U(Q_1) \times U(Q_5)$ gauge theory:
\begin{equation}
\begin{aligned}
L &= L_1 + L_2 + L_3 + L_4 , \\
L_1 &= \chi^\dagger \phi_m^{(1)\dagger} \phi_m^{(1)} \chi , \\
L_2 &= \chi^\dagger \phi_m^{(5)\dagger} \phi_m^{(5)} \chi , \\
L_3 &= \mathrm{Tr}\left( [Y_i^{(1)}, Y_j^{(1)}]\,[Y_i^{(1)}, Y_j^{(1)}] \right) , \\
L_4 &= \mathrm{Tr}\left( [Y_i^{(5)}, Y_j^{(5)}]\,[Y_i^{(5)}, Y_j^{(5)}] \right) .
\end{aligned}
\tag{7.29}
\end{equation}
The terms $L_1$ and $L_2$ originate from terms of the type $|A_M \chi|^2$, where $A_M \equiv (A_\mu, \phi_m)$ is the (4,4) vector multiplet in two dimensions. The terms $L_3$ and $L_4$ arise from commutators of gauge fields in compactified directions.

The fields $Y$ are in general massive. The reason is that the traces $y_i^{(1)} \equiv \mathrm{Tr}(Y^{(1)}_{i(a\bar{a})})$, representing the centre-of-mass position in the $T^4$ of the $q_1$ D1-branes, and $y_i^{(1)\prime} \equiv \mathrm{Tr}(Y^{(1)}_{i(a'\bar{a}')})$, representing the centre-of-mass position in the $T^4$ of the $Q_1'$ D1-branes, are neutral and will have vevs which are generically separated (the centres of mass can be separated in the torus even when they are on top of each other in physical space). The mass of $Y^{(1)}_{i(a\bar{a}')}$, $Y^{(1)}_{i(a'\bar{a})}$ can be read off from the term $L_3$ in (7.29) to be proportional to $(y^{(1)} - y^{(1)\prime})^2$. Similarly the mass of $Y^{(5)}_{i(b\bar{b}')}$, $Y^{(5)}_{i(b'\bar{b})}$ is proportional to $(y^{(5)} - y^{(5)\prime})^2$ (as can be read off from the term $L_4$ in (7.29)), where $y^{(5)}$ and $y^{(5)\prime}$ are the centres of mass of the $q_5$ D5-branes and the $Q_5'$ D5-branes along the directions of the dual four-torus $\hat{T}^4$. (At special points where their centres of mass coincide, these fields become massless. The analysis for these cases can also be carried out by incorporating these fields in (7.33)–(7.35), with no change in the conclusion.)

The fields $\phi_m^{(1)a\bar{a}'}$, $\phi_m^{(1)a'\bar{a}}$ are also massive. Their masses can be read off from $L_1$ in (7.29). Specifically they arise from the following terms:
\begin{equation}
\chi^*_{a_1\bar{b}}\, \phi_m^{(1)a'\bar{a}_1 *}\, \phi_m^{(1)a'\bar{a}_2}\, \chi_{a_2\bar{b}}
+ \chi^*_{a'_1\bar{b}'}\, \phi_m^{(1)a\bar{a}'_1 *}\, \phi_m^{(1)a\bar{a}'_2}\, \chi_{a'_2\bar{b}'} ,
\tag{7.30}
\end{equation}
where the $a_i$ run from $1, \ldots, q_1$ and the $a'_i$ run from $1, \ldots, Q_1'$. These terms show that their masses are proportional to the expectation values of the hypers $\chi_{a\bar{b}}$ and $\chi_{a'\bar{b}'}$. Similarly the terms of $L_2$ in (7.29),
\begin{equation}
\chi^*_{a\bar{b}_1}\, \phi_m^{(5)b_1\bar{b}' *}\, \phi_m^{(5)b_2\bar{b}'}\, \chi_{a\bar{b}_2}
+ \chi^*_{a'\bar{b}'_1}\, \phi_m^{(5)b'_1\bar{b} *}\, \phi_m^{(5)b'_2\bar{b}}\, \chi_{a'\bar{b}'_2} ,
\tag{7.31}
\end{equation}
show that the fields $\phi_m^{(5)b\bar{b}'}$, $\phi_m^{(5)b'\bar{b}}$ are massive, with masses proportional to the expectation values of the hypers $\chi_{a\bar{b}}$ and $\chi_{a'\bar{b}'}$. In the above equation the $b_i$ take values from $1, \ldots, q_5$ and the $b'_i$ take values from
$1, \ldots, Q_5'$. Note that these masses remain non-zero even in the limit when the $(q_1, q_5)$ and $(Q_1', Q_5')$ systems are on the verge of separating.

Thus the relevant degrees of freedom describing the splitting process form a 1 + 1 dimensional $U(1)$ gauge theory of $A_r$ with (4,4) supersymmetry. The matter content of this theory consists of hypermultiplets $\chi_{a\bar{b}'}$ with charge $+1$ and $\chi_{a'\bar{b}}$ with charge $-1$. This theory contains a total of $q_1 Q_5' + q_5 Q_1'$ hypers. We define the individual components of the hypers as the following doublets:
\begin{equation}
\chi_{a\bar{b}'} = \begin{pmatrix} A_{a\bar{b}'} \\ B^\dagger_{a\bar{b}'} \end{pmatrix} , \qquad
\chi_{a'\bar{b}} = \begin{pmatrix} A_{a'\bar{b}} \\ B^\dagger_{a'\bar{b}} \end{pmatrix} .
\tag{7.32}
\end{equation}
Let us now describe the dynamics of the splitting process. This is given by analyzing the hypermultiplet moduli space of the effective theory described above with the help of the D-term equations:
\begin{equation}
\begin{aligned}
A_{a\bar{b}'} A^*_{a\bar{b}'} - A_{a'\bar{b}} A^*_{a'\bar{b}} - B_{b'\bar{a}} B^*_{b'\bar{a}} + B_{b\bar{a}'} B^*_{b\bar{a}'} &= 0 , \\
A_{a\bar{b}'} B_{b'\bar{a}} - A_{a'\bar{b}} B_{b\bar{a}'} &= 0 .
\end{aligned}
\tag{7.33}
\end{equation}
In the above equations the sum over $a, b, a', b'$ is understood. These equations are a generalized version of (7.20), discussed for the $R^4/Z_2$ singularity in Section 7.2. At the origin of the Higgs branch, where the classical moduli space meets the Coulomb branch, this linear sigma model would flow to an infrared conformal field theory which is singular. The reason for this is the same as for the $R^4/Z_2$ case. The linear sigma model contains the following term in the superpotential:
\begin{equation}
V = \left( A_{a\bar{b}'} A^*_{a\bar{b}'} + A_{a'\bar{b}} A^*_{a'\bar{b}} + B_{b'\bar{a}} B^*_{b'\bar{a}} + B_{b\bar{a}'} B^*_{b\bar{a}'} \right) \left( \varphi_1^2 + \varphi_2^2 + \varphi_3^2 + \varphi_4^2 \right) .
\tag{7.34}
\end{equation}
As in the discussion of the $R^4/Z_2$ case, at the origin of the hypermultiplet moduli space the flat direction of the Coulomb branch leads to a ground state which is not normalizable. This singularity can be avoided by deforming the D-term equations by the Fayet–Iliopoulos terms (cf. Eqs. (4.14) and (4.15) and footnote 17):
\begin{equation}
\begin{aligned}
A_{a\bar{b}'} A^*_{a\bar{b}'} - A_{a'\bar{b}} A^*_{a'\bar{b}} - B_{b'\bar{a}} B^*_{b'\bar{a}} + B_{b\bar{a}'} B^*_{b\bar{a}'} &= r_3 , \\
A_{a\bar{b}'} B_{b'\bar{a}} - A_{a'\bar{b}} B_{b\bar{a}'} &= r_1 + i r_2 .
\end{aligned}
\tag{7.35}
\end{equation}
We note here that the Fayet–Iliopoulos terms break the relative $U(1)$ under discussion and the gauge field becomes massive. The reason is that the D-term equations with the Fayet–Iliopoulos terms do not permit all the $A$'s and $B$'s in the above equation to vanish simultaneously. At least one of them must be non-zero. As these $A$'s and $B$'s are charged under the $U(1)$, their non-zero values give mass to the vector multiplet. This can be seen from the potential (7.34). The scalars of the vector multiplet become massive, with a mass proportional to the vevs of $A$, $B$. Thus the relative $U(1)$ is broken.

The singularity associated with the non-compact Coulomb branch can also be avoided by turning on the $\theta$ term, the mechanism being similar to the one discussed in the previous subsection.

If any of the 3 Fayet–Iliopoulos D-terms or the $\theta$ term is turned on, the flat directions of the Coulomb branch are lifted, leading to normalizable ground states of the Higgs branch. This prevents the breaking up of the $(Q_1, Q_5)$ system into subsystems. Thus we see that the 4 parameters which resolve the singularity of the $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$ make their appearance in the gauge theory as the Fayet–Iliopoulos terms and the theta term.
It would be interesting to extract the singularity structure of the gauge theory of the D1–D5 system through mappings similar to (7.21) – (7.24). 19 The case (Q1 ; Q5 ) → (Q1 − 1; Q5 ) + (1; 0): splitting of a single D1-brane: It is illuminating to consider the special case in which 1 D1-brane splits oL from the bound state (Q1 ; Q5 ). The eLective dynamics is again described in terms of a U (1) gauge theory associated with the relative separation between the single D1-brane and the bound state (Q1 − 1; Q5 ). The massless hypermultiplets charged under this U (1) correspond to open strings joining the single D1-brane with the D5-branes and are denoted by Ab : (7.36) E b = Bb† The D-term equations, with the Fayet–Iliopoulos terms, become in this case Q5
(|Ab |2 − |Bb |2 ) = r3 ;
b =1
Q5
Ab Bb = r1 + ir2
(7.37)
b =1
while the potential is Q5 (|Ab |2 + |Bb |2 ) (’21 + ’22 + ’23 + ’24 ) : V=
(7.38)
b =1
In this simple case it is easy to see that the presence of the Fayet–Iliopoulos terms in (7.37) ensures that the $A$'s and $B$'s do not all vanish simultaneously. The vevs of $A$, $B$ give mass to the $\varphi$'s. Thus the relative $U(1)$ is broken when the Fayet–Iliopoulos term is nonzero. The D-term equations above agree with those in [114], which discusses the splitting of a single D1-brane. It is important to emphasize that the potential and the D-term equations describe an effective dynamics in the classical background corresponding to the $(Q_1 - 1, Q_5)$ bound state. This corresponds to the description in [114] of the splitting process in an AdS$_3$ background, which represents a mean field of the above bound state.

7.4. Dynamics of the decay of the D1–D5 system from gauge theory

We have seen in Section 7.3 that the effective theory describing the dynamics of the splitting of the $(Q_1, Q_5)$ system into subsystems $(q_1, q_5)$ and $(Q_1', Q_5')$ is $(4,4)$, $U(1)$ super Yang–Mills coupled to $q_1 Q_5' + q_5 Q_1'$ hypermultiplets. The SCFT to which this gauge theory flows in the infra-red is singular if the Fayet–Iliopoulos terms and the theta term are set to zero. The description of the superconformal theory of the Higgs branch of a $U(1)$ gauge theory with $(4,4)$ supersymmetry and $N$ hypermultiplets near the singularity was found in [145]. The Higgs branch near the singularity was expressed in the Coulomb variables. The SCFT near the singularity was derived using the R-symmetry of the Higgs branch. It consists of a bosonic $SU(2)$ Wess–Zumino–Witten model at level $N - 2$, four free fermions and a linear dilaton with background charge given by

\[ Q = \sqrt{\frac{2}{N}}\,(N - 1) . \tag{7.39} \]

$^{19}$ The singularity structure for a $U(1)$ theory coupled to $N$ hypermultiplets has been obtained in [88].
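The field content listed around (7.39) can be checked against the total central charge $6(N-1)$ with exact rational arithmetic. The sketch below is illustrative only and rests on assumptions not spelled out in the text: the standard $SU(2)_k$ value $c = 3k/(k+2)$, real free fermions with $c = 1/2$ each, and the linear-dilaton convention $c = 1 + 3Q^2$ with $Q^2 = 2(N-1)^2/N$. The function name is hypothetical.

```python
from fractions import Fraction


def near_singularity_central_charge(n_hyper: int) -> Fraction:
    """Central charge of SU(2)_{N-2} WZW + 4 free fermions + linear dilaton.

    Assumptions (not taken from the text): c = 3k/(k+2) for SU(2) at level k,
    c = 1/2 per real fermion, and c = 1 + 3 Q^2 for the linear dilaton.
    """
    level = n_hyper - 2
    c_wzw = Fraction(3 * level, level + 2)
    c_fermions = 4 * Fraction(1, 2)
    q_squared = Fraction(2 * (n_hyper - 1) ** 2, n_hyper)  # Q^2 from (7.39)
    c_dilaton = 1 + 3 * q_squared
    return c_wzw + c_fermions + c_dilaton


# The three pieces add up to 6(N - 1) for every N >= 2 tried here.
for n in (2, 3, 10, 1000):
    assert near_singularity_central_charge(n) == 6 * (n - 1)
```

Under these conventions the sum is exactly $6(N-1)$, which also fixes the square root in (7.39): without it the pieces do not add up to an integer multiple of 6.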
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
The central charge of this SCFT is $6(N - 1)$. Using this result for the $U(1)$ theory describing the splitting we get a background charge for the linear dilaton given by

\[ Q_{\rm GaugeTheory} = \sqrt{\frac{2}{q_1 Q_5' + q_5 Q_1'}}\,\left( q_1 Q_5' + q_5 Q_1' - 1 \right) . \tag{7.40} \]

For large $Q_1$ and $Q_5$ we see from the above equation and (7.18) that $Q_{\rm GaugeTheory} = Q_{\rm SUGRA}$. Consider the case of a single string splitting off the D1–D5 bound state; the linear dilaton theory relevant for this decay then has a background charge of $Q = \sqrt{2/Q_5}\,(Q_5 - 1)$. This effective theory is called the long string. On performing an S-duality transformation this long D-string turns into a long fundamental string. This argument demonstrates the existence of long fundamental strings in the S-dual of the near-horizon geometry of the D1–D5 system. We will discuss these solutions in detail in Section 10.

7.5. The symmetric product

From the arguments of this section we see that the free field orbifold conformal field theory on $\mathcal{M}$ does not correspond to the D1–D5 system given by the supergravity solution in (2.70). This solution does not have any moduli turned on. We saw in this section that in the absence of moduli the SCFT is singular. The effective theory near the singularities does not depend only on the product $Q_1 Q_5$. For instance, the theory near the singularity corresponding to the decay of a single D1-brane is characterized by a background charge of $\sqrt{2/Q_5}\,(Q_5 - 1)$. Thus it is not clear whether the symmetric product moduli space is connected to this singular SCFT. In spite of this, from the fact that the short multiplets of the SCFT on $\mathcal{M}$ agree with the supergravity modes on AdS$_3 \times S^3$ we see that, at least for calculations involving correlation functions of the short multiplets, we can trust the SCFT on the symmetric product $\mathcal{M}$. The reason for this is that correlation functions involving short multiplets are protected by non-renormalization theorems.

8. The microscopic derivation of Hawking radiation

From Sections 5 and 6 we have seen that there is a one-to-one correspondence between the supergravity modes and the short multiplets of the $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$.
In this section we use this fact to obtain a precise understanding of Hawking radiation from the D1–D5 system starting from the microscopic SCFT. As mentioned in the introduction, after extracting the low-energy degrees of freedom of the black hole, the next step towards understanding Hawking radiation is to find the coupling of these degrees of freedom to the supergravity modes. This is given by a specific SCFT operator $O(z, \bar z)$ which couples to the supergravity field $\phi$ in the form of the interaction

\[ S_{\rm int} = \mu \int d^2 z\, \phi(z, \bar z)\, O(z, \bar z) , \tag{8.1} \]

where $\mu$ is the strength of the coupling. We have no first-principles method of determining the operator $O$ which couples to the supergravity mode $\phi$, as the microscopic theory is an effective theory. Therefore we appeal to symmetries to determine the operator. A coupling such as
the one given in (8.1) can exist only if the operator $O$ and the field $\phi$ have the same symmetries. The identification of the bulk and boundary symmetries of the D1–D5 system in Section 6.3 enables the determination of the operator which couples to a given supergravity mode. Strictly speaking, in the near-horizon limit (2.6.2) there is no coupling of the bulk to the boundary. Here, we will assume $\alpha'$ small but not strictly zero, so that we can discuss Hawking radiation. From the near-horizon limit of the D1–D5 black hole, whose metric in 5-d is given in (2.42), we can infer that the black hole is an excited state of the Ramond sector of the same SCFT as that of the unexcited D1–D5 system. Therefore the coupling (8.1) should be the same as that of the D1–D5 system. The strength of the coupling is determined by comparing bulk and boundary two-point functions using the AdS/CFT correspondence for the D1–D5 system. Once the interaction in (8.1) is determined, the calculation of Hawking radiation from the SCFT reduces to a purely quantum mechanical evaluation of a scattering matrix in the SCFT.

In Sections 8.1 and 8.2 we identify the D1–D5 black hole as an excited state in the Ramond sector of the SCFT of the D1–D5 system. We show how the entropy of the D1–D5 black hole matches with this excited state in the SCFT. In Section 8.3 we determine the coupling of the minimal scalar corresponding to the metric fluctuation of the torus to the SCFT operator. In Section 8.4 we review the formulation of the absorption cross-section calculation from SCFT as an evaluation of the thermal Green's function of the operators $O$ corresponding to the supergravity field $\phi$. We then evaluate the absorption cross-section from SCFT and show that it agrees with the one evaluated from supergravity, including all the graybody factors. In Section 8.6 we address the Hawking radiation of fixed scalars from the SCFT point of view.
We show that fixing the SCFT operators using symmetries resolves the disagreement observed in [104] between the 'effective string' calculation of the Hawking radiation and the supergravity calculation. Finally in Section 8.7 we outline how Hawking radiation of the intermediate scalars can also be determined from the SCFT.

8.1. Near-horizon limit and fermion boundary conditions

The near-horizon geometry of the D1–D5 black hole is described in detail in Section 2.6.2. We see from remark (b) at the end of that section that the boundary condition for fermions in the BTZ case is periodic; this implies that the SCFT state relevant for the D1–D5 black hole with Kaluza–Klein momentum $N = 0$ is the Ramond vacuum of the $\mathcal{N} = (4,4)$ SCFT on the orbifold $\mathcal{M}$. The microscopic states corresponding to the general D1–D5 black hole are states with $L_0 \neq 0$ and $\bar L_0 \neq 0$ excited over the Ramond vacuum of the $\mathcal{N} = (4,4)$ SCFT on the orbifold $\mathcal{M}$. In the AdS$_3$ case, the fermion boundary condition is anti-periodic; therefore the appropriate SCFT is that of the NS sector.

8.2. The black hole state

As we have seen, the general non-extremal black hole will have Kaluza–Klein excitations along both directions on the $S^1$. In the SCFT on $\mathcal{M}$, it is represented by states with $L_0 \neq 0$ and $\bar L_0 \neq 0$ over the Ramond vacuum. The black hole is represented by a density matrix (cf. Eq. (1.14))

\[ \rho = \frac{1}{\Omega} \sum_{\{i\}} |i\rangle \langle i| . \tag{8.2} \]
The states $|i\rangle$ belong to the various twisted sectors of the orbifold theory. They satisfy the constraint

\[ L_0 = \frac{N_L}{Q_1 Q_5} , \qquad \bar L_0 = \frac{N_R}{Q_1 Q_5} . \tag{8.3} \]

We have suppressed the index which labels the vacuum. $\Omega$ is the volume of the phase space in the microcanonical ensemble. It can be seen that the maximally twisted sector of the orbifold gives rise to the dominant contribution to the sum in (8.2) over the various twisted sectors. The maximally twisted sector is obtained by the action of the twist operator $\Sigma^{(Q_1 Q_5 - 1)/2}$ on the Ramond vacuum. From the OPEs in (5.39) we see that the twist operator $\Sigma^{(Q_1 Q_5 - 1)/2}$ introduces a cut in the complex plane such that

\[ X_A(e^{2\pi i} z, e^{-2\pi i} \bar z) = X_{A+1}(z, \bar z) . \tag{8.4} \]
Thus this changes the boundary conditions of the bosons and the fermions. Again from the OPEs in (5.39) one infers that excitations over the maximally twisted state $|\Sigma^{(Q_1 Q_5 - 1)/2}\rangle$ have modes in units of $1/(Q_1 Q_5)$. A simple way of understanding this is to note that the boundary conditions in (8.4) imply that $X_A(z, \bar z)$ is periodic with a period of $2\pi Q_1 Q_5$. This forces the modes to be quantized in units of $1/(Q_1 Q_5)$.

We now show that the maximally twisted sector can account for the entire entropy of the black hole. The entropy of the D1–D5 black hole can be written as

\[ S_{\rm SUGRA} = 2\pi \sqrt{N_L} + 2\pi \sqrt{N_R} . \tag{8.5} \]

Using Cardy's formula, the degeneracy of the states in the maximally twisted sector with $L_0 = N_L/(Q_1 Q_5)$ and $\bar L_0 = N_R/(Q_1 Q_5)$ is given by

\[ \Omega = e^{2\pi \sqrt{N_L} + 2\pi \sqrt{N_R}} . \tag{8.6} \]
By the Boltzmann formula,

\[ S(\text{maximally twisted}) = 2\pi \sqrt{N_L} + 2\pi \sqrt{N_R} . \tag{8.7} \]
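The Cardy estimate behind (8.6) and (8.7) can be reproduced in a few lines. This is a minimal sketch, assuming the usual form $S = 2\pi\sqrt{c L_0/6}$ per chirality and the central charge $c = 6 Q_1 Q_5$ of the orbifold SCFT (an input not restated in this section); the function names and the sample charges are hypothetical.

```python
import math


def cardy_entropy(central_charge: float, level: float) -> float:
    """Cardy entropy for one chirality: S = 2*pi*sqrt(c * L0 / 6)."""
    return 2.0 * math.pi * math.sqrt(central_charge * level / 6.0)


def maximally_twisted_entropy(q1: int, q5: int, n_left: int, n_right: int) -> float:
    """Entropy of states with L0 = N_L/(Q1 Q5), L0bar = N_R/(Q1 Q5).

    Assumes c = 6 Q1 Q5 for the SCFT on the orbifold (not stated in this section).
    """
    c = 6 * q1 * q5
    l0 = n_left / (q1 * q5)
    l0_bar = n_right / (q1 * q5)
    return cardy_entropy(c, l0) + cardy_entropy(c, l0_bar)


# Illustrative charges only: the Q1 Q5 dependence cancels, leaving 2*pi*(sqrt(N_L) + sqrt(N_R)).
entropy = maximally_twisted_entropy(q1=3, q5=5, n_left=30, n_right=15)
expected = 2.0 * math.pi * (math.sqrt(30) + math.sqrt(15))
assert math.isclose(entropy, expected, rel_tol=1e-12)
```

The factors of $Q_1 Q_5$ in $c$ and in the fractional levels cancel, which is why the result depends only on $N_L$ and $N_R$.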
Thus the maximally twisted sector entirely accounts for the D1–D5 black hole entropy. $N_L$ and $N_R$ are multiples of $Q_1 Q_5$ due to the orbifold projection. Therefore, writing $N_{L,R} = N'_{L,R}\, Q_1 Q_5$, the entropy can be written as

\[ S = 2\pi \sqrt{N'_L Q_1 Q_5} + 2\pi \sqrt{N'_R Q_1 Q_5} . \tag{8.8} \]

With this understanding, we restrict the calculations of Hawking radiation and absorption cross-section to the maximally twisted sector only. The probability amplitude for the Hawking process is given by

\[ P = \frac{1}{\Omega} \sum_{f, i} |\langle f| S_{\rm int} |i\rangle|^2 , \tag{8.9} \]
where $|f\rangle$ denotes the final states the black hole can decay into. We have averaged over the initial states in the microcanonical ensemble. It is more convenient to work with the canonical ensemble. We now discuss the method of determining the temperature of the canonical ensemble. Consider the generating function

\[ Z = {\rm Tr}_R \left( e^{-\beta_L E_0}\, e^{-\beta_R \bar E_0} \right) , \tag{8.10} \]
where the trace is evaluated over the Ramond states in the maximally twisted sector. $E_0$ and $\bar E_0$ are the energies of the left- and right-moving modes,

\[ E_0 = \frac{L_0}{R_5} , \qquad \bar E_0 = \frac{\bar L_0}{R_5} . \tag{8.11} \]
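From the generating function $Z$ in (8.10) we see that the coefficient of $e^{-\beta_L N_L/(Q_1 Q_5 R_5)}$ and $e^{-\beta_R N_R/(Q_1 Q_5 R_5)}$ is the degeneracy of the states with $L_0 = N_L/(Q_1 Q_5)$ and $\bar L_0 = N_R/(Q_1 Q_5)$ corresponding to the D1–D5 black hole. A simple way to satisfy this constraint is to choose $\beta_L$ and $\beta_R$ such that $Z$ is peaked at this value of $L_0$ and $\bar L_0$. Evaluating the trace one obtains

\[ Z = \prod_{n=1}^{\infty} \left( \frac{1 + e^{-\beta_L n/(Q_1 Q_5 R_5)}}{1 - e^{-\beta_L n/(Q_1 Q_5 R_5)}} \right)^4 \left( \frac{1 + e^{-\beta_R n/(Q_1 Q_5 R_5)}}{1 - e^{-\beta_R n/(Q_1 Q_5 R_5)}} \right)^4 . \tag{8.12} \]

Then

\[ \ln Z = 4 \left[ \sum_{n=1}^{\infty} \ln\left(1 + e^{-\beta_L n/Q_1 Q_5 R_5}\right) - \sum_{n=1}^{\infty} \ln\left(1 - e^{-\beta_L n/Q_1 Q_5 R_5}\right) \right] \tag{8.13} \]

\[ \qquad\; + 4 \left[ \sum_{n=1}^{\infty} \ln\left(1 + e^{-\beta_R n/Q_1 Q_5 R_5}\right) - \sum_{n=1}^{\infty} \ln\left(1 - e^{-\beta_R n/Q_1 Q_5 R_5}\right) \right] . \tag{8.14} \]

We can evaluate the sum by approximating it by an integral given by

\[ \ln Z = 4 Q_1 Q_5 R_5 \int_0^{\infty} dx \left[ \ln \frac{1 + e^{-\beta_L x}}{1 - e^{-\beta_L x}} + \ln \frac{1 + e^{-\beta_R x}}{1 - e^{-\beta_R x}} \right] . \tag{8.15} \]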
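From the partition function in (8.10) we see that

\[ -\frac{\partial \ln Z}{\partial \beta_L} = \frac{\langle N_L \rangle}{Q_1 Q_5 R_5} \qquad \text{and} \qquad -\frac{\partial \ln Z}{\partial \beta_R} = \frac{\langle N_R \rangle}{Q_1 Q_5 R_5} , \tag{8.16} \]

where $\langle \cdot \rangle$ indicates the average value of $N_L$ and $N_R$. As the distribution is peaked at $N_L$ and $N_R$ we assume that $\langle N_L \rangle = N_L$ and $\langle N_R \rangle = N_R$. Using (8.15) we obtain

\[ \frac{\pi^2 Q_1 Q_5 R_5}{\beta_L^2} = \frac{N_L}{Q_1 Q_5 R_5} \qquad \text{and} \qquad \frac{\pi^2 Q_1 Q_5 R_5}{\beta_R^2} = \frac{N_R}{Q_1 Q_5 R_5} . \tag{8.17} \]

Thus

\[ T_L = \frac{1}{\beta_L} = \frac{\sqrt{N_L}}{\pi R_5 Q_1 Q_5} \qquad \text{and} \qquad T_R = \frac{1}{\beta_R} = \frac{\sqrt{N_R}}{\pi R_5 Q_1 Q_5} . \tag{8.18} \]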
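The passage from the sums (8.13)–(8.14) to the integral (8.15), and from there to the temperatures (8.18), can be checked numerically. The sketch below is only illustrative: it treats one chirality, writes $a = \beta/(Q_1 Q_5 R_5)$, and uses the closed form $\int_0^\infty dx\, \ln\frac{1+e^{-\beta x}}{1-e^{-\beta x}} = \pi^2/(4\beta)$, a standard result that is not quoted explicitly in the text. All function names and sample values are hypothetical.

```python
import math


def log_z_sum(a: float, cutoff: float = 40.0) -> float:
    """One chirality of (8.13): 4 * sum_n [ln(1 + e^{-a n}) - ln(1 - e^{-a n})].

    The sum is truncated where a*n exceeds `cutoff`, beyond which terms are negligible.
    """
    n_max = int(cutoff / a) + 1
    total = 0.0
    for n in range(1, n_max + 1):
        q = math.exp(-a * n)
        total += math.log1p(q) - math.log1p(-q)
    return 4.0 * total


def log_z_integral(a: float) -> float:
    """Continuum approximation (8.15) for one chirality: 4 * (1/a) * pi^2/4 = pi^2 / a."""
    return math.pi ** 2 / a


def temperature(n_osc: float, q1: int, q5: int, r5: float) -> float:
    """Temperature from (8.18): T = sqrt(N) / (pi * R5 * Q1 * Q5)."""
    return math.sqrt(n_osc) / (math.pi * r5 * q1 * q5)


# The integral approximates the sum better as a = beta/(Q1 Q5 R5) becomes small.
for a, tolerance in ((0.01, 0.02), (0.001, 0.005)):
    relative_gap = abs(log_z_sum(a) - log_z_integral(a)) / log_z_integral(a)
    assert relative_gap < tolerance

# Saddle-point condition (8.17): pi^2 Q1 Q5 R5 T^2 = N / (Q1 Q5 R5).
q1, q5, r5, n_left = 4, 6, 2.5, 240.0
t_left = temperature(n_left, q1, q5, r5)
assert math.isclose(math.pi ** 2 * q1 * q5 * r5 * t_left ** 2, n_left / (q1 * q5 * r5))
```

The residual gap between sum and integral is logarithmic in $a$, so the approximation is good precisely in the regime of large $Q_1 Q_5 R_5 / \beta$.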
Above we have introduced a left temperature $T_L$ and a right temperature $T_R$ corresponding to the left- and right-moving excitations of the SCFT to pass over to the canonical ensemble. To see that the temperatures $T_L$, $T_R$ defined here are the same as in (4.29), note that the oscillator numbers $N_L$, $N_R$ are $Q_1 Q_5$ times the oscillator numbers that enter (4.29). The reason is that in (4.29) we defined
$N_L$, $N_R$ simply as the eigenvalues of $L_0$, $\bar L_0$; in this section we are working with the maximally twisted sectors, which have fractional oscillator numbers, so to reach the same energy we have to work with oscillator numbers which are $Q_1 Q_5$ times larger. As mentioned before (cf. (4.30)), the temperature of the combined system (conjugate to $E_L + E_R$), to be identified as the Hawking temperature, is given by

\[ \frac{1}{T_H} = \frac{1}{2} \left( \frac{1}{T_L} + \frac{1}{T_R} \right) . \tag{8.19} \]
8.3. The coupling with the bulk fields for the D1–D5 black hole

In Section 6 we showed that there is a one-to-one map between the supergravity fields on AdS$_3 \times S^3$ and the short multiplets of the $\mathcal{N} = (4,4)$ SCFT on $\mathcal{M}$. Therefore in principle we can determine $O$ for each bulk field $\phi$ in (8.1) just by matching the symmetries of the operator and the field. But, as we mentioned above, the black hole is represented by an excited state in the Ramond sector and not in the Neveu–Schwarz sector, which corresponds to the AdS$_3$ boundary conditions. The couplings determined in the Neveu–Schwarz sector of the SCFT do not change in the Ramond sector, as interaction terms do not depend on whether one is in the Ramond sector or the Neveu–Schwarz sector. Therefore we can continue to use these couplings for the D1–D5 black hole. The scaling dimension of an operator is given by operator product expansions (OPEs) with the stress energy tensor. Since OPEs are local relations, they do not change on going from the Neveu–Schwarz sector to the Ramond sector; the same can be said of the R-charge of the operator. In Section 8, we will see that the calculation of Hawking radiation from the SCFT depends only on the scaling dimension and the R-charge of the operator. Since these are the same in the Ramond sector and the Neveu–Schwarz sector, the operator $O$ is the same as the one identified using AdS$_3$ as the near-horizon geometry.

8.4. Determination of the strength of the coupling $\mu$

Before we perform the calculation of Hawking radiation/absorption cross-section from the SCFT corresponding to the D1–D5 black hole it is important to determine the strength of the coupling $\mu$ in (8.1). In this section we will determine $\mu$ for the case of minimal scalars $h_{ij}$.$^{20}$ In (6.16) we have identified the SCFT operator corresponding to these fields of the supergravity. The SCFT operator is given by

\[ O_{ij}(z, \bar z) = \partial x_A^{\{i}(z, \bar z)\, \bar\partial x_A^{j\}}(z, \bar z) - \frac{1}{4} \delta_{ij}\, \partial x_A^k\, \bar\partial x_A^k (z, \bar z) . \tag{8.20} \]
Let us suppose the background metric of the torus $T^4$ is $g_{ij} = \delta_{ij}$. The interaction Lagrangian of the SCFT with the fluctuation $h_{ij}$ is given by

\[ S_{\rm int} = \mu\, T_{\rm eff} \int d^2 z \left[ h_{ij}\, \partial x_A^i\, \bar\partial x_A^j \right] . \tag{8.21} \]

$^{20}$ From now on $h_{ij}$ will denote the traceless part of the metric fluctuations of $T^4$.
The effective string tension $T_{\rm eff}$ of the conformal field theory, which also appears in the free part of the action

\[ S_0 = T_{\rm eff} \int d^2 z \left[ \partial_z x_A^i\, \partial_{\bar z} x_{i,A} + \text{fermions} \right] , \tag{8.22} \]

has been discussed in [48,101,111]. The specific value of $T_{\rm eff}$ is not important for the calculation of the S-matrix for absorption or emission, since the factor just determines the normalization of the two-point function of the operator $O_{ij}(z, \bar z)$. In this section we will argue that the constant $\mu = 1$.

A direct string theory computation would of course provide the constant as well (albeit at weak coupling). This would be analogous to fixing the normalization of the Dirac–Born–Infeld action for a single D-brane by comparing with the one-loop open string diagram [109]. However, for a large number and more than one type of D-branes it is a difficult proposition and we will not attempt to pursue it here. Fortunately, the method of symmetries using the AdS/CFT correspondence, employed for determining the operator $O$, helps us determine the value of $\mu$ as well. For the latter, however, we need to use the more quantitative version [62,63] of the Maldacena conjecture. We will see below that for this quantitative conjecture to be true for the two-point function (which can be calculated independently from the $\mathcal{N} = (4,4)$ SCFT and from supergravity) we need $\mu = 1$.

We will see that the above normalization leads to precise equality between the absorption cross-sections (and consequently Hawking radiation rates) computed from the moduli space of the D1–D5 system and from semiclassical gravity. This method of fixing the normalization can perhaps be criticized on the ground that it borrows from supergravity and does not rely entirely on the SCFT. However, we would like to emphasize two things:

(a) We have fixed $\mu = 1$ by comparing with supergravity around the AdS$_3$ background, which does not have a black hole.
On the other hand, the supergravity calculation of absorption cross-section and Hawking flux is performed around a black hole background represented in the near-horizon limit by the BTZ black hole. From the viewpoint of semiclassical gravity these two backgrounds are rather different. The fact that normalizing $\mu$ with respect to the former background leads to the correctly normalized absorption cross-section around the black hole background is a rather remarkable prediction.

(b) Similar issues are involved in fixing the coupling constant between the electron and the electromagnetic field in the semiclassical theory of radiation in terms of the physical electric charge, and in similarly fixing the gravitational coupling of extended objects in terms of Newton's constant. These issues too are decided by comparing two-point functions of currents with Coulomb's or Newton's laws, respectively. In the present case the quantitative version of the AdS/CFT conjecture [62,63] provides the counterpart of Newton's law or Coulomb's law at strong coupling. Without this the best result one can achieve is that the Hawking radiation rates computed from D1–D5 branes and from semiclassical gravity are proportional.

We should remark that fixing the normalization by the use of the Dirac–Born–Infeld action, as has been done previously, is not satisfactory, since the DBI action is meant for single D-branes and extending it to a system of multiple D1–D5 branes does not always give the right results, as we shall see in Sections 8.6.1 and 8.7. The method of using the equivalence principle to fix the normalization is not very general and cannot be applied to the case of non-minimal scalars, for example.

Let us now compare the two-point function for the minimal scalar $h_{ij}$ determined from the AdS/CFT correspondence and the SCFT to determine the normalization constant $\mu$. We will discuss
the more quantitative version of the AdS/CFT conjecture [62,63] to compare the two-point correlation function of $O_{ij}$ from supergravity and SCFT. The relation between the correlators is as follows. Let the supergravity Lagrangian be

\[ L = \int d^3 x_1\, d^3 x_2\; b_{ij, i'j'}(x_1, x_2)\, h_{ij}(x_1)\, h_{i'j'}(x_2) + \int d^3 x_1\, d^3 x_2\, d^3 x_3\; c_{ij, i'j', i''j''}(x_1, x_2, x_3)\, h_{ij}(x_1)\, h_{i'j'}(x_2)\, h_{i''j''}(x_3) + \cdots , \tag{8.23} \]

where we have only exhibited terms quadratic and cubic in the $h_{ij}$'s. The coefficient $b$ determines the propagator and the coefficient $c$ is the tree-level 3-point vertex in supergravity. The coefficients $b$ and $c$ are local operators; $b$ is the kinetic operator. The two-point function of the $O_{ij}$'s (at large $g_s Q_1$, $g_s Q_5$) is given by [62,63], assuming $S_{\rm int}$ given by (8.21),

\[ \langle O_{ij}(z_1)\, O_{i'j'}(z_2) \rangle = 2 (\mu T_{\rm eff})^{-2} \int d^3 x_1\, d^3 x_2 \left[ b_{ij, i'j'}(x_1, x_2)\, K(x_1 | z_1)\, K(x_2 | z_2) \right] , \tag{8.24} \]

where $K$ is the boundary-to-bulk Green's function for massless scalars [62],

\[ K(x | z) = \frac{1}{\pi} \left[ \frac{x_0}{x_0^2 + |z_x - z|^2} \right]^2 . \tag{8.25} \]
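A quick way to sanity-check the normalization of a kernel of the form (8.25) is to verify that it integrates to one over the boundary plane for any bulk depth $x_0$, as expected of a kernel that tends to a delta function as $x_0 \to 0$. That property is a standard feature of boundary-to-bulk propagators and is assumed here rather than stated in the text; the midpoint quadrature and the function names below are illustrative choices.

```python
import math


def boundary_to_bulk_kernel(x0: float, r: float) -> float:
    """K(x|z) of the form (8.25), with r = |z_x - z| the boundary distance."""
    return (x0 / (x0 ** 2 + r ** 2)) ** 2 / math.pi


def integrate_over_boundary(x0: float, r_max_over_x0: float = 2000.0,
                            steps: int = 200_000) -> float:
    """Midpoint-rule integral of K over the boundary plane in polar coordinates.

    The radial integral is cut off at r_max = r_max_over_x0 * x0; the neglected
    tail is of relative size x0^2 / (x0^2 + r_max^2).
    """
    r_max = r_max_over_x0 * x0
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += 2.0 * math.pi * r * boundary_to_bulk_kernel(x0, r) * h
    return total


# The result is independent of the bulk depth x0, up to quadrature and cutoff error.
for depth in (0.5, 3.0):
    assert abs(integrate_over_boundary(depth) - 1.0) < 1e-4
```

The independence of $x_0$ reflects the scale invariance of the kernel: rescaling $x_0$ and the boundary coordinate together leaves the integral unchanged.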
We use complex $z$ for coordinates of the SCFT, and $x = (x_0, z_x)$ for the Poincaré coordinates of the bulk theory.

8.4.1. Evaluation of the tree-level vertices in supergravity

We begin with the bosonic sector of Type IIB supergravity. The Lagrangian is (we follow the conventions of [148])

\[ I = I_{NS} + I_{RR} , \]
\[ I_{NS} = -\frac{1}{2 k_{10}^2} \int d^{10}x \sqrt{-G} \left[ e^{-2\phi} \left( R - 4 (d\phi)^2 + \frac{1}{12} (dB_{NS})^2 \right) \right] , \]
\[ I_{RR} = -\frac{1}{2 k_{10}^2} \int d^{10}x \sqrt{-G} \left( \sum_{n=3,7,\ldots} \frac{1}{2\, n!} (H^n)^2 \right) , \tag{8.26} \]

with $k_{10}^2 = 64 \pi^7 g_s^2 \alpha'^4$. We use $\hat M, \hat N, \ldots$ to denote 10-dimensional indices, $i, j, \ldots$ to denote coordinates on the torus $T^4$, $M, N, \ldots$ to denote the remaining 6 dimensions and $\mu, \nu, \ldots$ to denote coordinates on the AdS$_3$. We have separately indicated the terms depending on Neveu–Schwarz–Neveu–Schwarz and Ramond–Ramond backgrounds.

Our aim will be to obtain the Lagrangian of the minimally coupled scalars corresponding to the fluctuations of the metric of the $T^4$ in the D1–D5-brane system. We will find the Lagrangian up to cubic order in the near-horizon limit. Let us first focus on $I_{NS}$. We substitute the values of the background fields of the D1–D5 system in the Type IIB Lagrangian with the following change in the metric (cf. Eq. (3.3)):

\[ f_1^{1/2} f_5^{-1/2}\, \delta_{ij} \;\to\; f_1^{1/2} f_5^{-1/2}\, (\delta_{ij} + h_{ij}) , \tag{8.27} \]
where $h_{ij}$ are the minimally coupled scalars with trace zero. These scalars are functions of the 6-dimensional coordinates. Retaining the terms up to $O(h^3)$ and ignoring the traces, the Lagrangian can be written as

\[ I_{NS} = -\frac{V_4}{2 k_{10}^2} \int d^6 x \sqrt{-G}\; \frac{G^{MN}}{4} \left[ \partial_M h_{ij}\, \partial_N h_{ij} + \partial_M (h_{ik} h_{kj})\, \partial_N h_{ij} \right] . \tag{8.28} \]

In the above equation we have used the near-horizon limit, and $V_4$ is the volume of the $T^4$. It is easy to see that to $O(h^2)$ the minimally coupled scalars do not mix with any other scalars: the minimally coupled scalars are all massless (see the first line in (6.15)), whereas the other scalars are all massive. For our purpose of computing the two-point function using the AdS/CFT correspondence it is sufficient to determine the tree-level action correct to $O(h^2)$. The metric $G_{MN}$ near the horizon is (writing $r$ for $U$ in Eq. (2.71), and including a factor of $\alpha'$)

\[ ds^2 = \frac{r^2}{R^2} \left( -dx_0^2 + dx_5^2 \right) + \frac{R^2}{r^2}\, dr^2 + R^2\, d\Omega_3^2 . \tag{8.29} \]

We make a change of variables to the Poincaré coordinates by substituting

\[ z_0 = \frac{R}{r} , \qquad z_1 = \frac{x_0}{R} , \qquad z_2 = \frac{x_5}{R} . \tag{8.30} \]

The metric becomes

\[ ds^2 = R^2\, \frac{1}{z_0^2} \left( dz_0^2 - dz_1^2 + dz_2^2 \right) + R^2\, d\Omega_3^2 . \tag{8.31} \]

Here $R = \sqrt{\alpha'}\, (g_6^2 Q_1 Q_5)^{1/4}$ is the radius of curvature of AdS$_3$ (also of the $S^3$) (see (2.73)). For s-waves the minimal scalars do not depend on the coordinates of the $S^3$. Finally, in Poincaré coordinates $I_{NS}$ (correct to cubic order in $h$) can be written as

\[ I_{NS} = -\frac{V_4}{8 k_{10}^2}\, R^3\, V_{S^3} \int d^3 z \sqrt{-g}\, g^{\mu\nu} \left[ \partial_\mu h_{ij}\, \partial_\nu h_{ij} + \partial_\mu (h_{ik} h_{kj})\, \partial_\nu h_{ij} \right] , \tag{8.32} \]
where $V_{S^3} = 2\pi^2$ is the volume of a three-sphere of unit radius. Now we would like to show that to all orders in $h$, $I_{RR} = 0$ in the near-horizon geometry. The relevant terms in our case are

\[ I_{RR} = -\frac{1}{4 \times 3!\, k_{10}^2} \int d^{10}x \sqrt{-G}\; H_{\hat M \hat N \hat O} H^{\hat M \hat N \hat O} . \tag{8.33} \]

We substitute the values of $B$ due to the magnetic and electric components of the Ramond–Ramond charges and the value of $G$. The contribution from the electric part of $B$, after going to the near-horizon limit and performing the integral over $S^3$ and $T^4$, is

\[ \frac{V_4}{4 k_{10}^2}\, R\, V_{S^3} \int d^3 z \sqrt{-g\, \det(\delta_{ij} + h_{ij})} . \tag{8.34} \]

The contribution of the magnetic part of $B$ in the same limit is

\[ -\frac{V_4}{4 k_{10}^2}\, R\, V_{S^3} \int d^3 z \sqrt{-g\, \det(\delta_{ij} + h_{ij})} . \tag{8.35} \]
We note that the contributions of the electric and the magnetic parts cancel, giving no couplings for the minimal scalars to the Ramond–Ramond background. Therefore the tree-level supergravity action correct to cubic order in $h$ is given by$^{21}$

\[ I = -\frac{Q_1 Q_5}{16\pi} \int d^3 z \left[ \partial_\mu h_{ij}\, \partial_\mu h_{ij} + \partial_\mu (h_{ik} h_{kj})\, \partial_\mu h_{ij} \right] . \tag{8.36} \]

The coefficient $Q_1 Q_5/(16\pi)$ is U-duality invariant. This is because it is a function of only the integers $Q_1$ and $Q_5$. This can be tested by computing the same coefficient from the NS5-brane–fundamental string background, which is related to the D1–D5 system by S-duality. The NS5-brane–fundamental string background also gives the same coefficient. U-duality transformations which generate $B_{NS}$ backgrounds [88] also give rise to the same coefficient.

8.4.2. Two-point function

The two-point function of the operator $O_{ij}$ can be evaluated by substituting the value of $b_{ij, i'j'}(x_1, x_2)$ obtained from (8.36) into (8.24) and using the boundary-to-bulk Green's function given in (8.25). On evaluating the integral in (8.24) using formulae given in [150], we find that

\[ \langle O_{ij}(z)\, O_{i'j'}(w) \rangle = (\mu T_{\rm eff})^{-2}\, \delta_{ii'}\, \delta_{jj'}\, \frac{Q_1 Q_5}{16\pi^2}\, \frac{1}{|z - w|^4} . \tag{8.37} \]
This is exactly the value of the two-point function obtained from the SCFT described by the free Lagrangian (5.1), provided we put $\mu = 1$.

We have compared the two-point function obtained from the supergravity corresponding to the near-horizon geometry of the D1–D5 system with no moduli to the orbifold SCFT. As we have argued before, the orbifold SCFT corresponds to the D1–D5 system with moduli. Thus naively this comparison seems to be meaningless. On further examination we note that the coefficient $b_{ij, i'j'}$ in (8.24) was U-duality invariant. Since the D1–D5 system with moduli can be obtained through U-duality transformations, we know that this coefficient will not change for the D1–D5 system with moduli. It is only the value of this coefficient which fixes $\mu$ to be 1. Thus the comparison we have made is valid.

It is remarkable that even at strong coupling the two-point function of $O_{ij}$ can be computed from the free Lagrangian (8.22). This is consistent with the non-renormalization theorems involving the $\mathcal{N} = (4,4)$ SCFT which will be discussed in Section 9. The choice $\mu = 1$ ensures that the perturbation (8.21) of (8.22) is consistent with the perturbation implied in (8.27). We will see in the next section that this choice leads to precise equality between absorption cross-sections (and consequently Hawking radiation rates) calculated from semiclassical gravity and from the D1–D5 branes. The overall multiplicative constant $T_{\rm eff}$ will not be important for the absorption cross-section calculation. This factor finally cancels in the calculation, as we will see in Section 8.6.

Higher-point correlation functions in the orbifold conformal field theory were determined in [151,152] using general methods of computing correlation functions of twist fields on symmetric product orbifolds developed in [153,154].
$^{21}$ The cubic couplings of all fields in type IIB supergravity on AdS$_3 \times S^3 \times T^4$ were determined in [149].
8.5. Absorption cross-section as thermal Green's function

Let us now relate the absorption cross-section of a supergravity fluctuation $\delta\phi$ to the thermal Green's function of the corresponding operator of the $\mathcal{N} = (4,4)$ SCFT on the orbifold $\mathcal{M}$ [155]. The notation $\delta\phi$ implies that we are considering the supergravity field $\phi$ to be of the form

\[ \phi = \phi_0 + \mu\, \delta\bar\phi , \tag{8.38} \]

where $\phi_0$ represents the background value and $\mu$ is the strength of the coupling. Then

\[ S = S_0 + \int d^2 z \left[ \phi_0 + \mu\, \delta\bar\phi \right] O(z, \bar z) = S_{\phi_0} + S_{\rm int} , \tag{8.39} \]

where

\[ S_{\phi_0} = S_0 + \int d^2 z\; \phi_0\, O(z, \bar z) , \tag{8.40} \]
\[ S_{\rm int} = \mu \int d^2 z\; \delta\bar\phi\, O(z, \bar z) . \tag{8.41} \]
$O$ is the operator corresponding to the supergravity field $\phi$. $S_{\phi_0}$ is the Lagrangian of the SCFT which includes the deformations due to various backgrounds in the supergravity. For example, the free Lagrangian in (8.22) corresponds to the case when the field $a_1 \chi + a_2 C_{6789}$ in (6.15) is turned on. We calculate the absorption of a quantum $\delta\bar\phi = \kappa_5\, e^{-ipx}$ corresponding to the operator $O$ using Fermi's Golden Rule. $\kappa_5$ is related to the five-dimensional Newton's constant $G_5$ and the ten-dimensional Newton's constant $G_{10}$ as

\[ \kappa_5^2 = 8\pi G_5 = \frac{8\pi G_{10}}{V_4\, 2\pi R_5} = \frac{64 \pi^7 g_s^2 \alpha'^4}{V_4\, 2\pi R_5} . \tag{8.42} \]

We see that $\kappa_5$ is proportional to $\alpha'^2$. In the Maldacena limit (2.74) the coupling of the bulk fluctuation to the SCFT drops out. We retain this term for our absorption cross-section calculation. In fact we will see below that the absorption cross-section turns out to be proportional to $\sqrt{g_6^2 Q_1 Q_5}\, \alpha'^2$. In this computation of the absorption cross-section the black hole is represented by a canonical ensemble at a given temperature. The above interaction gives the thermally averaged transition probability $P$ as

\[ P = \sum_{i,f} \frac{e^{-\beta \cdot p_i}}{Z}\, P_{i \to f} = \mu^2 \kappa_5^2\, L\, t \sum_{i,f} \frac{e^{-\beta \cdot p_i}}{Z}\, (2\pi)^2\, \delta^2(p + p_i - p_f)\, |\langle f | O(0,0) | i \rangle|^2 . \tag{8.43} \]

Here $i$ and $f$ refer to initial and final states, respectively, and $p_i$, $p_f$ refer to the initial and final momenta of these states. $L = 2\pi R_5$ denotes the length of the string and $t$ is the time of interaction. As we have seen in Section 8.2, the inverse temperature $\beta$ has two components $\beta_L$ and $\beta_R$. The relation of these temperatures to the parameters of the D1–D5 black hole is

\[ \beta_L = \frac{1}{T_L} \qquad \text{and} \qquad \beta_R = \frac{1}{T_R} . \tag{8.44} \]

The left-moving momenta $p_+$ and the right-moving momenta $p_-$ are in a thermal bath with inverse temperatures $\beta_L$ and $\beta_R$, respectively. $\beta \cdot p$ is defined as $\beta \cdot p = \beta_L p_+ + \beta_R p_-$. $Z$ stands for the partition function of the thermal ensemble.
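The Green's function in Euclidean time is given by

\[ G(-i\tau, x) = \langle O^\dagger(-i\tau, x)\, O(0,0) \rangle = {\rm Tr}\left( \rho\, T_\tau \{ O^\dagger(-i\tau, x)\, O(0,0) \} \right) , \tag{8.45} \]

where $\rho = e^{-\beta \cdot \hat p}/Z$. Time ordering is defined as $T_\tau$ with respect to $-{\rm Imaginary}(t)$. This definition coincides with radial ordering on mapping the coordinate $(\tau, x)$ from the cylinder to the plane. The advantage of doing this is that the integral

\[ \int dt\, dx\; e^{i p \cdot x}\, G(t - i\epsilon, x) = \sum_{i,f} \frac{e^{-\beta \cdot p_i}}{Z}\, (2\pi)^2\, \delta^2(p + p_i - p_f)\, |\langle f | O(0,0) | i \rangle|^2 . \tag{8.46} \]

The Green's function $G$ is determined by the two-point function of the operator $O$. This is in turn determined by the conformal dimension $(h, \bar h)$ of the operator $O$ and the normalization of the two-point function. As we have to subtract out the emission probability, we get the cross-section as

\[ \sigma_{\rm abs}\, \mathcal{F}\, t = P \left( 1 - e^{-\beta \cdot p} \right) , \tag{8.47} \]

where $\mathcal{F}$ is the flux and $P$ is given by (8.43). Substituting the value of $P$ from (8.46) we get

\[ \sigma_{\rm abs} = \frac{\mu^2 \kappa_5^2 L}{\mathcal{F}} \int dt\, dx \left( G(t - i\epsilon, x) - G(t + i\epsilon, x) \right) . \tag{8.48} \]

In the above equation we have related the evaluation of the absorption cross-section to the evaluation of the thermal Green's function. Evaluating the integral one obtains

\[ \sigma_{\rm abs} = \frac{\mu^2 \kappa_5^2 L\, C_O}{\mathcal{F}}\; \frac{(2\pi T_L)^{2h-1} (2\pi T_R)^{2\bar h - 1}}{\Gamma(2h)\, \Gamma(2\bar h)}\; \frac{e^{\beta \cdot p/2} - (-1)^{2h + 2\bar h}\, e^{-\beta \cdot p/2}}{2} \times \left| \Gamma\!\left( h + i \frac{p_+}{2\pi T_L} \right) \Gamma\!\left( \bar h + i \frac{p_-}{2\pi T_R} \right) \right|^2 , \tag{8.49} \]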
where $C_O$ is the coefficient of the leading-order term in the OPE of the two-point function of the operator $O$.

8.6. Absorption cross-section of minimal scalars from the D1–D5 SCFT

In the previous section we related the thermal Green's function of the SCFT operator to the absorption cross-section. We will apply the results of the previous section to the case of the minimal scalars corresponding to the fluctuation of the metric of $T^4$. Let the background metric of the torus be $\delta_{ij}$. Consider the minimal scalar $h_{67}$. We know that the SCFT operator corresponding to this has conformal dimension $(1,1)$. From Section 8.4 we know that $\mu = 1$. The interaction Lagrangian is given by

\[ S_{\rm int} = 2 T_{\rm eff} \int d^2 z\; h_{67}\, \partial x_A^6(z, \bar z)\, \bar\partial x_A^7(z, \bar z) , \tag{8.50} \]

where we have set $\mu = 1$. The factor of 2 arises because of the symmetric property of $h_{67}$. $S_0$ is given by

\[ S_0 = T_{\rm eff} \int d^2 z\; \partial x_A^i(z, \bar z)\, \bar\partial x_A^j(z, \bar z) . \tag{8.51} \]
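Comparing with the previous section, the operator is $O = 2 T_{\rm eff}\, \partial x_A^6(z, \bar z)\, \bar\partial x_A^7(z, \bar z)$. For the absorption of a quantum of energy $\omega$, using (8.49) we obtain

\[ \sigma_{\rm abs} = 2\pi^2 r_1^2 r_5^2\; \frac{\pi\omega}{2}\; \frac{\exp(\omega/T_H) - 1}{\left( \exp(\omega/2T_R) - 1 \right) \left( \exp(\omega/2T_L) - 1 \right)} , \tag{8.52} \]

where we have used $L = 2\pi R_5$, $\mathcal{F} = \omega$, (8.42) for $\kappa_5$ and (8.37) for $C_O$. Comparing the absorption cross-section of the minimal scalars obtained from supergravity in (3.22) with (8.52) we find that

\[ \sigma_{\rm abs}(\text{SCFT}) = \sigma_{\rm abs}(\text{Supergravity}) . \tag{8.53} \]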
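The temperature dependence of (8.52) follows from (8.49) at $h = \bar h = 1$ through the identity $|\Gamma(1 + iy)|^2 = \pi y/\sinh(\pi y)$. The sketch below compares the two forms numerically. It assumes $p_+ = p_- = \omega/2$ for the absorbed s-wave quantum, which is an assumption made here for illustration and not a statement taken from the text, and it drops the common prefactor $\mu^2 \kappa_5^2 L C_O/\mathcal{F}$. All function names are hypothetical.

```python
import math


def hawking_temperature(t_left: float, t_right: float) -> float:
    """T_H from (8.19): 1/T_H = (1/T_L + 1/T_R) / 2."""
    return 2.0 / (1.0 / t_left + 1.0 / t_right)


def gamma_modulus_squared(y: float) -> float:
    """|Gamma(1 + i y)|^2 = pi*y / sinh(pi*y), with the limit 1 at y = 0."""
    if y == 0.0:
        return 1.0
    return math.pi * y / math.sinh(math.pi * y)


def thermal_factor_from_green(omega: float, t_left: float, t_right: float) -> float:
    """Temperature-dependent part of (8.49) at h = hbar = 1.

    Assumes p_+ = p_- = omega/2 (illustrative assumption), so beta.p = omega/T_H.
    """
    t_hawking = hawking_temperature(t_left, t_right)
    p_chiral = omega / 2.0
    return ((2.0 * math.pi * t_left) * (2.0 * math.pi * t_right)
            * math.sinh(omega / (2.0 * t_hawking))
            * gamma_modulus_squared(p_chiral / (2.0 * math.pi * t_left))
            * gamma_modulus_squared(p_chiral / (2.0 * math.pi * t_right)))


def thermal_factor_closed_form(omega: float, t_left: float, t_right: float) -> float:
    """Same factor written as in (8.52), up to the constant 2*pi^2 r1^2 r5^2 / omega."""
    t_hawking = hawking_temperature(t_left, t_right)
    return (math.pi ** 2 * omega ** 2 / 2.0) * math.expm1(omega / t_hawking) / (
        math.expm1(omega / (2.0 * t_left)) * math.expm1(omega / (2.0 * t_right)))


for omega, t_l, t_r in ((1.0, 1.0, 1.0), (0.3, 2.0, 0.5), (5.0, 0.7, 1.9)):
    assert math.isclose(thermal_factor_from_green(omega, t_l, t_r),
                        thermal_factor_closed_form(omega, t_l, t_r), rel_tol=1e-9)
```

The agreement is exact analytically: with $a = \omega/4T_L$ and $b = \omega/4T_R$, the ratio $\sinh(a+b)/(\sinh a\, \sinh b)$ equals $2(e^{2(a+b)} - 1)/((e^{2a} - 1)(e^{2b} - 1))$, which produces the greybody factor in (8.52).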
Thus the SCFT calculation and the supergravity calculation of the absorption cross-section agree exactly. As we have mentioned earlier, this implies an exact agreement of decay rates between SCFT and the semiclassical calculation. It is important to note that we have used the N = (4; 4) SCFT realized as a free SCFT on the orbifold M as the background Lagrangian S0 . As we have said before, this SCFT is non-singular and therefore cannot correspond to the case of the D1–D5 system with no moduli. In Section 9.3.1 we have argued that the supergravity calculation of the absorption cross-section is independent of moduli. Therefore it makes sense to compare it with the SCFT result for the case with moduli turned on. In the next section we will show that the SCFT calculation is also independent of the moduli. Before closing this section, we should mention the semiclassical calculation of decay rates from BTZ black holes and its comparison to CFT [156 –158]. The greybody factor for BTZ black hole is connected with that for the D1–D5 black hole in [159,160]. 8.6.1. Absorption cross-section for the blow up modes Another point worth mentioning is that the method followed above for the calculation of the absorption cross-section from the SCFT can be easily extended for the case of minimal scalars corresponding to the four blow up modes. These minimal scalars are listed in the last two lines of (6.16). They are the self-dual NS B-1eld and a linear combination of the Ramond–Ramond four form and the zero form. The operators for these scalars are the Z2 twists in the SCFT. Absorption cross-section calculations for these scalars cannot be performed on the ‘eLective string’ model based on the DBI action. The simple reason being that these operators are not present in the ‘eLective’ string model. Thus the ‘eLective’ string model does not capture all the degrees of freedom of the D1–D5 black hole. 8.7. 
Fixed scalars

Out of the 25 scalars mentioned earlier which form part of the spectrum of IIB supergravity on $T^4$, five become massive when further compactified on $AdS_3 \times S^3$. There is an important additional scalar field which appears after this compactification: $h_{55}$. Let us recall the notation used for the coordinates: $AdS_3$: $(x_0, x_5, r)$; $S^3$: $(\chi, \theta, \phi)$; $T^4$: $(x_6, x_7, x_8, x_9)$. Here $r, \chi, \theta, \phi$ are spherical polar coordinates for the directions $x_1, x_2, x_3, x_4$. In terms of the D-brane wrappings, the D5-branes are wrapped along the directions $x_5, x_6, x_7, x_8, x_9$ and the D1-branes are aligned along $x_5$. The field $h_{55}$ is a scalar in the sense that it is a scalar under the local Lorentz group SO(3) of $S^3$. In what follows we will specifically consider the three scalars $\phi_{10}$, $h_{ii}$ and $h_{55}$. The equations of motion of these fields in supergravity are coupled and have been discussed in detail in the
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
literature [101,103–106]. It turns out that the six-dimensional dilaton $\phi_6 = \phi_{10} - h_{ii}/4$, which is a linear combination of $h_{ii}$ and $\phi_{10}$, remains massless; it is part of the 20 massless (minimal) scalars previously discussed. The two other linear combinations $\lambda$ and $\nu$ are defined as (see case (2) after Eq. (3.3))

$$\lambda = \frac{h_{55}}{2} - \frac{\phi_{10}}{2} + \frac{h_{ii}}{8} , \qquad \nu = \frac{h_{ii}}{8} . \qquad (8.54)$$
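$\lambda$ and $\nu$ satisfy coupled differential equations. They can be decoupled by the following position independent linear transformation [104] (footnote 22):

$$\lambda = (\cos\alpha)\,\phi_+ + (\sin\alpha)\,\phi_- , \qquad \nu = -(\sin\alpha)\,\phi_+ + (\cos\alpha)\,\phi_- , \qquad (8.55)$$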
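where the mixing angle $\alpha$ can be found by solving the equation

$$\tan\alpha - \frac{1}{\tan\alpha} = \frac{2}{\sqrt{3}}\,\frac{Q_1 + vQ_5}{Q_1 - vQ_5} . \qquad (8.56)$$

Then $\phi_\pm$ obey the following equations:

$$\left[\frac{1}{r^3}\,\partial_r r^3 \partial_r + \omega^2 f_1 f_5 - \frac{8\,Q_\pm^2}{r^2 (r^2 + Q_\pm)^2}\right]\phi_\pm = 0 , \qquad (8.57)$$

where $\omega$ is the frequency of the wave and

$$Q_\pm = \frac{g_s\alpha'}{3}\left[\frac{Q_1}{v} + Q_5 \mp \sqrt{\frac{Q_1^2}{v^2} + Q_5^2 - \frac{Q_1 Q_5}{v}}\,\right] . \qquad (8.58)$$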
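A small numerical sketch may help in reading the fixed-scalar equation (8.57). It assumes the potential term has the form $8Q_\pm^2/(r^2(r^2+Q_\pm)^2)$, so that $r^2$ times the potential tends to 8 near the horizon, independently of $Q_\pm$. It also assumes the standard $AdS_3$ relation $h + \bar h = 1 + \sqrt{1 + m^2}$ for a scalar of mass $m$ in units of the AdS radius. The function names are illustrative only.

```python
import math

def fixed_scalar_potential(r, q):
    """Potential term 8 Q^2 / (r^2 (r^2 + Q)^2), as read from (8.57)."""
    return 8.0 * q**2 / (r**2 * (r**2 + q) ** 2)

def near_horizon_mass_squared(q, r=1e-6):
    # r^2 * V(r) approaches the AdS_3 mass squared (AdS units) as r -> 0
    return r**2 * fixed_scalar_potential(r, q)

def total_weight(mass_squared):
    # Assumed standard AdS_3 relation for scalars: h + hbar = 1 + sqrt(1 + m^2)
    return 1.0 + math.sqrt(1.0 + mass_squared)
```

Under these assumptions both decoupled fields share the same near-horizon mass, and the dual operators have total weight $h + \bar h = 4$.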
These are examples of fixed scalars. They pick up masses in the background geometry of the D1–D5 system. To see this, take the near-horizon limit defined in (2.69) in Eq. (8.57); we get (see also [161])

$$\left[\frac{1}{l^2 U}\,\frac{\partial}{\partial U}\, U^3 \frac{\partial}{\partial U} + \frac{\omega^2 l^2}{U^2} - \frac{8}{l^2}\right]\phi_\pm = 0 . \qquad (8.59)$$

Note that this is the Klein–Gordon equation of a massive scalar in $AdS_3$ with (mass)$^2$ = 8 in units of the $AdS_3$ radius. Furthermore, since the equations for $\phi_+$ and $\phi_-$ are identical in the near-horizon limit, the equations of motion of $\lambda$ and $\nu$ decouple. They obey the massive Klein–Gordon equation in $AdS_3$, with near-horizon mass $m^2 = 8$ in units of the radius of $AdS_3$.

Understanding the absorption and emission properties of fixed scalars is an important problem, because the D-brane computation and the semiclassical black hole calculation of these properties are at variance [103,104]. The discrepancy essentially originates from the ‘expected’ couplings of $\lambda$ and $\nu$ to SCFT operators with $(h,\bar h) = (1,3)$ and $(3,1)$ (see also [101]). These SCFT operators lead to qualitatively different graybody factors from what the fixed scalars exhibit semiclassically. The semiclassical graybody factors would agree with D-brane computations if the couplings were only to (2,2) operators. The coupling to (1,3) and (3,1) operators is guessed from qualitative reasoning based on the Dirac–Born–Infeld action. Since we now have a method of deducing the couplings to the bulk fields based on near-horizon symmetries, let us use it in the case of the fixed scalars.

22 We have set $r_0 = 0$ in the equations in [104] as we are looking at the background corresponding to the D1–D5 system.

(a) By the mass dimension relation (6.14) we see that the fixed scalars $\lambda$ and $\nu$ correspond to operators with weights $h + \bar h = 4$.

(b) The fixed scalars have $SU(2)_E \times \widetilde{SU(2)}_E$ quantum numbers (1,1).

As all the supergravity fields are classified according to the short multiplets of $SU(1,1|2) \times SU(1,1|2)$, we can find the field corresponding to these quantum numbers among the short multiplets. Searching through the short multiplets (see below (6.12)), we find that the fixed scalars belong to the short multiplet $(3,3)_S$ of $SU(1,1|2) \times SU(1,1|2)$. They occur as the top component of $(3,3)_S$. There are six fixed scalars in all. We conclude that the operators with $(h,\bar h) = (1,3)$ or $(h,\bar h) = (3,1)$ (which were inferred by the DBI method) are ruled out by the analysis of symmetries.

In summary, since the (1,3) and (3,1) operators are ruled out by our analysis, the discrepancy between the D-brane calculation and the semiclassical calculation of absorption and emission rates disappears. Using the coupling to (2,2) operators as we derived above, we can compute $\sigma_{abs}$ for fixed scalars using (8.49). This agrees exactly with the result (3.27).

8.8. 
Intermediate scalars

We only make the remark that the classification presented in Section 6.1 correctly accounts for all 16 intermediate scalars, and predicts that they should couple to SCFT operators with $(h,\bar h) = (1,2)$ belonging to the short multiplet $(2,3)_S$ or operators with $(h,\bar h) = (2,1)$ belonging to the short multiplet $(3,2)_S$ (see below Eq. (6.12)). This agrees with the ‘phenomenological’ prediction made earlier in the literature [162].

9. Non-renormalization theorems

Let us review the three major agreements between the N = (4,4) SCFT on $\mathcal{M}$ and supergravity. We showed in Section 6 that the spectrum of short multiplets of the N = (4,4) SCFT on $\mathcal{M}$ is in one to one correspondence with the supergravity modes in the near-horizon geometry of the D1–D5 system. In Sections 5 and 8 we saw that the entropy calculated from the microscopic SCFT agrees with that of the D1–D5 black hole. Finally, in Section 8 we have seen that the calculation of Hawking radiation from the SCFT agrees precisely with the semiclassical calculation, including the gray body factors.

In this section we will discuss the validity of these calculations both in the boundary SCFT and in the bulk supergravity. The regime of validity of the calculations performed on the conformal field theory side does not, in general, overlap with that on the supergravity side. Conformal field theory calculations are performed using the N = (4,4) free orbifold theory on $\mathcal{M}$. In Section 7 we saw that the
SCFT corresponding to supergravity is singular and presumably involves a large deformation in the moduli space from the free orbifold conformal field theory. Therefore the calculations performed in supergravity are valid in the region of moduli space where the conformal field theory is singular.

Let us compare the moduli space of deformations of the SCFT and of the supergravity. The supergravity and the SCFT moduli are listed in (6.16). They are in one to one correspondence. There are 20 moduli in all. On the supergravity side the moduli parameterize the homogeneous space $\tilde{\mathcal{M}} = SO(4,5)/(SO(4) \times SO(5))$ [114,119,139]. On the SCFT side, N = (4,4) supersymmetry highly constrains the metric on the moduli space to also be that of $\tilde{\mathcal{M}}$ [163]. Therefore for every point on the moduli space of the SCFT there is a corresponding point on the moduli space of supergravity. As we have seen in Section 7, the D1–D5 supergravity solution and the free orbifold SCFT are at different points in the moduli space. Therefore it is natural to conclude that there are non-renormalization theorems that allow us to interpolate between the calculations done using the free orbifold theory and the supergravity calculations. In this section we will detail these non-renormalization theorems for the three calculations: the spectrum of short multiplets, the entropy and Hawking radiation.

9.1. The spectrum of short multiplets

Short multiplets in the SCFT are built on chiral primaries both for the left and right movers (see Section 5.5). The chiral primaries satisfy a BPS bound: their R-charge is the same as their conformal dimension. Thus their spectrum is independent of any perturbation of the conformal field theory which preserves the N = (4,4) supersymmetric structure [132,164]. The entire structure of short multiplets is then dictated by the N = (4,4) SCFT algebra. Therefore the spectrum of short multiplets is invariant under deformations of the SCFT. 
In Section 5.8 the entire set of short multiplets was evaluated using the free orbifold theory on $\mathcal{M}$. This will remain invariant under deformations of this SCFT away from the free orbifold point. A further piece of evidence that the short multiplet structure does not change under deformations is that the number of chiral primaries which form the bottom component of a short multiplet can be counted in the case of the SCFT on $\mathcal{M}$ using a topological partition function [165]. This partition function is invariant under deformations of the SCFT.

The short multiplet structure of the supergravity modes obtained in Section 6.1 ignored the winding and momentum modes on the torus. In fact the mass spectrum would not change if there were any metric deformation or Neveu–Schwarz B-field through the torus. These affect only winding and momentum modes on the torus. Thus the short multiplet structure of the supergravity modes presented in (6.12) is invariant under deformations of the moduli that involve the traceless components of the metric and the self-dual NS B-field (6.16).

We have just demonstrated that the calculation of the spectrum of short multiplets in the supergravity and in the SCFT is independent of the moduli. This allows us to compare the short multiplet spectrum on both sides and obtain agreement.

9.2. Entropy and area

As we have seen in Section 8, the black hole is represented as a state with $L_0 \neq 0$, $\bar L_0 \neq 0$ over the Ramond sector of the SCFT. The entropy in the SCFT is calculated by evaluating the asymptotic density of states with these values of $L_0$ and $\bar L_0$. The asymptotic density of states in a conformal
field theory is given by Cardy’s formula [115]. This depends on the level and the central charge of the conformal field theory. The central charge of a conformal field theory is independent of moduli. The entropy is given by the logarithm of the asymptotic density of states. Therefore the entropy calculated from the SCFT remains invariant for various values of the moduli.

The Bekenstein–Hawking entropy in supergravity is evaluated in the five-dimensional Einstein metric and is proportional to the area of the horizon. From the equations of motion of type IIB supergravity [101], we can explicitly see that the five-dimensional Einstein metric is not changed by turning on the 16 moduli listed on the top three lines of (6.16). These are the traceless components of the metric, the self-dual NS B-field along the torus and the six-dimensional dilaton. This can also be seen explicitly from the supergravity solution with moduli (2.48) constructed in [88]. Thus the area of the D1–D5 black hole does not change with moduli in supergravity. This ensures that the evaluation of the entropy from the SCFT will agree with that in supergravity even though they are evaluated at different points in the moduli space.

9.3. Hawking radiation

Before we discuss the dependence of Hawking radiation on the moduli, let us examine the various approximations made in the derivation of Hawking radiation both in supergravity and in the SCFT. Recall that the semiclassical calculation of Hawking radiation was done in the dilute gas limit (3.9), $r_0, r_n \ll r_1, r_5$. We also made the approximation of low energies compared to the horizon radius, $\omega r_5 \ll 1$. As the SCFT calculation relies very much on the near-horizon limit and the enhanced symmetries near the horizon, one can ask whether the dilute gas approximation is obeyed in the near-horizon limit. Let us convert $r_0, r_n, r_1, r_5$ to near-horizon variables:

$$r_0 = U_0\,\alpha' , \qquad r_n = U_0\,\alpha' \sinh\sigma , \qquad r_1 = \frac{\sqrt{g_s Q_1 \alpha'}}{\sqrt{v}} , \qquad r_5 = \sqrt{g_s Q_5 \alpha'} . \qquad (9.1)$$
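The scaling in (9.1) can be illustrated numerically. The sketch below assumes the reading $r_0 = U_0\alpha'$, $r_n = U_0\alpha'\sinh\sigma$, $r_1 = \sqrt{g_s Q_1\alpha'/v}$, $r_5 = \sqrt{g_s Q_5\alpha'}$, with all other parameters held fixed. The function names and sample values are purely illustrative.

```python
import math

def near_horizon_radii(alpha_prime, u0, sigma, g_s, q1, q5, v):
    """Radii of (9.1) written in near-horizon variables (assumed reading)."""
    r0 = u0 * alpha_prime
    rn = u0 * alpha_prime * math.sinh(sigma)
    r1 = math.sqrt(g_s * q1 * alpha_prime / v)
    r5 = math.sqrt(g_s * q5 * alpha_prime)
    return r0, rn, r1, r5

def dilute_gas_ratio(alpha_prime, u0, sigma, g_s, q1, q5, v):
    # Largest of (r0, rn) over smallest of (r1, r5); the dilute gas
    # regime requires this ratio to be much smaller than one.
    r0, rn, r1, r5 = near_horizon_radii(alpha_prime, u0, sigma, g_s, q1, q5, v)
    return max(r0, rn) / min(r1, r5)
```

Since the numerator is linear in $\alpha'$ and the denominator scales as $\sqrt{\alpha'}$, the ratio falls off as $\sqrt{\alpha'}$ when $\alpha' \to 0$.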
Now it is easy to see that in the near-horizon limit $\alpha' \to 0$ the dilute gas approximation always holds. We also use the low energy approximation in the SCFT calculation. In the microscopic calculation of Hawking radiation from the SCFT we restricted our attention to first order in perturbation theory. In fact we used Fermi's golden rule to obtain the absorption cross-section in (8.9). It is easy to see that for the metric fluctuation the higher order terms in perturbation theory go in powers of $\omega^2 r_5^2$ [166]. Thus higher order terms in the SCFT calculation are suppressed due to the low energy approximation.

Although the dilute gas approximation and the low energy approximation are made on both the supergravity and the SCFT side, the Hawking radiation calculations for the D1–D5 black hole in supergravity and in the SCFT are done at different points in the moduli space. We now discuss why, in spite of this, they agree.

9.3.1. Independence of Hawking radiation calculation vis-a-vis moduli: supergravity

We recall that the D1–D5 black hole solution in the absence of moduli [44,71] is obtained from the D1–D5 system by further compactifying $x_5$ on a circle of radius $R_5$ and adding left (right) moving Kaluza–Klein momenta along $x_5$. The corresponding supergravity solution is given.
The absorption cross-section of minimal scalars in the absence of moduli is given by (3.22) [38,40]. We will now show that the absorption cross-section remains unchanged even when the moduli are turned on.

From the equations of motion of type IIB supergravity [101], we can explicitly see that the five-dimensional Einstein metric $ds^2_{5,{\rm Ein}}$ is not changed by turning on the 16 moduli corresponding to the metric $G_{ij}$ on $T^4$ and the Ramond–Ramond 2-form potential B. As regards the four blowing up moduli, the invariance of $ds^2_{5,{\rm Ein}}$ can be seen from the fact that turning on these moduli corresponds to an SO(4,5) transformation (which is a part of a U-duality transformation) and from the fact that the Einstein metric does not change under U-duality.

Now we know that the minimal scalars $\phi^i$ all satisfy the wave equation

$$D_\mu \partial^\mu \phi^i = 0 , \qquad (9.2)$$
where the Laplacian is with respect to the Einstein metric in five dimensions. Since it is only this wave equation that determines the absorption cross-section completely, we see that $\sigma_{abs}$ is the same as before. It is straightforward to see that the Hawking rate, given by (3.25), is also not changed when moduli are turned on.

9.3.2. Independence of Hawking radiation vis-a-vis moduli: SCFT

In this section we will study the independence of the Hawking radiation on D1–D5 moduli. In Section 2 we have listed the twenty (1,1) operators $O_i(z,\bar z)$ in the SCFT based on the symmetric product orbifold $\mathcal{M}$ which is dual to the D1–D5 system. Turning on various moduli $\phi^i$ of supergravity corresponds to perturbing the SCFT

$$S = S_0 + \sum_i \int d^2z\, \bar\phi^i\, O_i(z,\bar z) , \qquad (9.3)$$

where $\bar\phi^i$ denote the near-horizon limits of the various moduli fields $\phi^i$. We note here that $S_0$ corresponds to the free SCFT based on the symmetric product orbifold $\mathcal{M}$. As we have seen in Section 7, this SCFT is non-singular (all correlation functions are finite), so it does not correspond to the marginally stable BPS solution originally found in [44,71]. Instead, it corresponds to a five-dimensional black hole solution in supergravity with suitable “blow-up” moduli turned on.

Let us now relate the absorption cross-section of a supergravity fluctuation $\delta\phi^i$ to the thermal Green’s function of the corresponding operator of the SCFT. The notation $\delta\phi^i$ implies that we are considering the supergravity field to be of the form

$$\phi^i = \phi^i_0 + \mu\,\delta\phi^i , \qquad (9.4)$$

where $\phi^i_0$ represents the background value and $\mu$ is the strength of the coupling. The action then becomes

$$S = S_0 + \sum_i \int d^2z\, [\bar\phi^i_0 + \mu\,\delta\bar\phi^i]\, O_i(z,\bar z) = S_{\phi_0} + S_{int} , \qquad (9.5)$$
where

$$S_{\phi_0} = S_0 + \sum_i \int d^2z\, \bar\phi^i_0\, O_i(z,\bar z) , \qquad (9.6)$$

$$S_{int} = \mu \sum_i \int d^2z\, \delta\bar\phi^i\, O_i(z,\bar z) . \qquad (9.7)$$
As we have seen in Section 8.5, the absorption cross-section of the supergravity fluctuation $\delta\phi^i$ involves essentially the two-point function of the operator $O_i$ calculated with respect to the SCFT action $S_{\phi_0}$. Since $O_i$ is a marginal operator, its two-point function is completely determined apart from a constant. Regarding the marginality of the operators $O_i$, it is easy to establish it up to one-loop order by direct computation ($c_{ijk} = 0$). The fact that these operators are exactly marginal can be argued as follows. The 20 operators $O_i$ arise as top components of five chiral primaries. It is known that the number of chiral primaries with $(j_R, \tilde j_R) = (m,n)$ is the Hodge number $h_{2m,2n}$ of the target space $\mathcal{M}$ of the SCFT. Since this number is a topological invariant, it should be the same at all points of the moduli space of deformations.

We showed in Section 8.4 that if the operator $O_i$ corresponding to $h_{ij}$ is canonically normalized (OPE has residue 1) and if $\delta\phi^i$ is canonically normalized in supergravity, then the normalization of $S_{int}$ as in (9.7) ensures that $\sigma_{abs}$ from the SCFT agrees with the supergravity result. The crucial point now is the following: once we fix the normalization of $S_{int}$ at a given point in moduli space, at some other point it may acquire a constant ($\neq 1$) in front of the integral when $O_i$ and $\delta\phi^i$ are canonically normalized at the new point. This would imply that $\sigma_{abs}$ gets multiplied by this constant, in turn implying disagreement with supergravity. We need to show that this does not happen.

To start with a simple example, let us first restrict to the moduli $g_{ij}$ of the torus $\tilde T^4$. We have

$$S = \int d^2z\, \partial x^i\, \bar\partial x^j\, g_{ij} . \qquad (9.8)$$

The factor of string tension has been absorbed in the definition of $x^i$. In Section 8.4 we had $g_{ij} = \delta_{ij} + h_{ij}$, leading to

$$S = S_0 + S_{int} , \qquad S_0 = \int d^2z\, \partial x^i\, \bar\partial x^j\, \delta_{ij} , \qquad S_{int} = \int d^2z\, \partial x^i\, \bar\partial x^j\, h_{ij} . \qquad (9.9)$$
In the above equation we have set $\mu = 1$. As we have remarked above, this $S_{int}$ gives rise to the correctly normalized $\sigma_{abs}$. Now, if we expand around some other metric

$$g_{ij} = g_{0ij} + h_{ij} \qquad (9.10)$$
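then the above action (9.8) implies

$$S = S_{g_0} + S_{int} , \qquad S_{g_0} = \int d^2z\, \partial x^i\, \bar\partial x^j\, g_{0ij} , \qquad S_{int} = \int d^2z\, \partial x^i\, \bar\partial x^j\, h_{ij} . \qquad (9.11)$$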
Now the point is that neither $h_{ij}$ nor the operator $O^{ij} = \partial X^i\, \bar\partial X^j$ in $S_{int}$ is canonically normalized at $g_{ij} = g_{0ij}$. When we do use the canonically normalized operators, do we pick up an additional constant in front? Note that

$$\langle O^{ij} O^{kl} \rangle_{g_0} = g_0^{ik}\, g_0^{jl}\, |z - w|^{-4} \qquad (9.12)$$

and

$$\langle h_{ij}(x)\, h_{kl}(y) \rangle_{g_0} = g_{0,ik}\, g_{0,jl}\, D(x,y) , \qquad (9.13)$$

where $D(x,y)$ is the massless scalar propagator. This shows that

Statement (1): The two-point functions of $O^{ij}$ and $h_{ij}$ pick up inverse factors.

As a result, $S_{int}$ remains correctly normalized when re-written in terms of the canonically normalized h and O, and no additional constant is picked up.

The above result is in fact valid in the full twenty-dimensional moduli space $\tilde{\mathcal{M}}$ because Statement (1) above remains true generally. To see this, let us first rephrase our result for the special case of the metric moduli (9.8) in a more geometric way. The $g_{ij}$’s can be regarded as some of the coordinates of the moduli space $\tilde{\mathcal{M}}$ (known to be a coset $SO(4,5)/(SO(4) \times SO(5))$). The infinitesimal perturbations $h_{ij}, h_{kl}$ can be thought of as defining tangent vectors at the point $g_{0,ij}$ (namely the vectors $\partial/\partial g_{ij}$, $\partial/\partial g_{kl}$). The (residue of the) two-point function given by (9.12) defines the inner product between these two tangent vectors according to the Zamolodchikov metric [163,167].

The fact that the moduli space $\tilde{\mathcal{M}}$ of the N = (4,4) SCFT on $\mathcal{M}$ is the coset $SO(4,5)/(SO(4) \times SO(5))$ is argued in [163]. If the superconformal theory has N = (4,4) supersymmetry and if the dimension of the moduli space is d, then it is shown in [163] that the moduli space of the SCFT is given by

$$\frac{SO(4, d/4)}{SO(4) \times SO(d/4)} . \qquad (9.14)$$
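The dimension of the coset in (9.14) is straightforward to count. The short sketch below uses the standard fact that SO(p, q) has the same dimension as SO(p + q), namely (p + q)(p + q − 1)/2; the helper names are illustrative.

```python
def dim_so(n):
    # Dimension of SO(n), and of any real form SO(p, q) with p + q = n
    return n * (n - 1) // 2

def coset_dimension(p, q):
    """Dimension of SO(p, q) / (SO(p) x SO(q)), which equals p * q."""
    return dim_so(p + q) - dim_so(p) - dim_so(q)
```

For example, `coset_dimension(4, 5)` counts the 20 moduli of the D1–D5 system, and `coset_dimension(4, d // 4)` returns d whenever d is a multiple of 4.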
As a simple check note that the dimension of the space in (9.14) is d. The outline of the argument is as follows. An N = (4,4) SCFT has superconformal $SU(2)_R \times \widetilde{SU(2)}_R$ symmetry. We have seen that the bottom component of the short multiplet $(2,2)_S$, which contains the marginal operator, transforms as a (2,2) under $SU(2)_R \times \widetilde{SU(2)}_R$. The top component, which corresponds to the moduli, transforms as a (1,1) under the R-symmetry. The holonomy group of the Zamolodchikov metric should leave
invariant the action of $SU(2)_R \times \widetilde{SU(2)}_R$. Then the holonomy group should have the form

$$K \subset SU(2) \times SU(2) \times \tilde K \subset SO(d) . \qquad (9.15)$$
Then (9.15), together with N = (4,4) supersymmetry and the left–right symmetry of the two $SU(2)_R$’s of the SCFT, fixes the moduli space to be uniquely that given in (9.14). We have found in Section 2 that there are 20 marginal operators for the N = (4,4) SCFT on the orbifold $\mathcal{M}$. Therefore the dimension of the moduli space is 20. Thus $\tilde{\mathcal{M}}$ is given by

$$\tilde{\mathcal{M}} = \frac{SO(4,5)}{SO(4) \times SO(5)} . \qquad (9.16)$$

Consider, on the other hand, the propagator (inverse two-point function) of $h_{ij}, h_{kl}$ in supergravity. The moduli space action of low energy fluctuations is nothing but the supergravity action evaluated around the classical solutions $g_{0,ij}$. The kinetic term of such a moduli space action defines the metric of moduli space. Statement (1) above is a simple reflection of the fact that the Zamolodchikov metric defines the metric on moduli space, and hence

Statement (2): The propagator of supergravity fluctuations, viewed as a matrix, is the inverse of the two-point functions in the SCFT.

The last statement is of course not specific to the moduli $g_{ij}$ and is true of all the moduli. We find, therefore, that fixing the normalization of $S_{int}$ (9.7) at any one point $\phi_0$ ensures that the normalization remains correct at any other point $\phi_0'$ by virtue of Statement (2). We should note in passing that Statement (2) is consistent with, and could have been derived from, the AdS/CFT correspondence as applied to the two-point function. Thus, we find that $\sigma_{abs}$ is independent of the moduli, in agreement with the result from supergravity.

10. Strings in AdS3

The AdS/CFT correspondence [41,42,62,63,90] for the case of the D1–D5 system states that type IIB string theory on $AdS_3 \times S^3 \times T^4$ is dual to the 1 + 1 dimensional conformal field theory of the Higgs branch of the gauge theory of the D1–D5 system. In our study of the bulk geometry we have worked only in the supergravity approximation of string theory on $AdS_3 \times S^3$. 
To fully explore the AdS/CFT correspondence for the D1–D5 system we need to understand string propagation in the bulk geometry. String theory on $AdS_3 \times S^3 \times T^4$ involves Ramond–Ramond fluxes through the $S^3$. Progress in the formulation of string propagation in Ramond–Ramond backgrounds for the case of $AdS_3$ has been made in [168–173]. Unfortunately, these models have been difficult to quantize and results have been hard to obtain. It is more convenient to study string theory on the background which is S-dual to the D1–D5 system. The near-horizon geometry of the S-dual is also $AdS_3 \times S^3 \times T^4$ but with Neveu–Schwarz H-fluxes through the $S^3$ and $AdS_3$. String propagation on this dual background can be quantized. It also provides an exact string background in which the metric component $g_{00}$ is non-trivial.

An important result from the study of strings in $AdS_3$ has been in understanding the role of long strings in the spectrum. We saw in Section 7 that there exist long D1-brane solutions near the boundary of $AdS_3$. Therefore, in the S-dual geometry this would mean that there exist long fundamental strings. These strings have been constructed as classical solutions and have also been identified in the full
quantum spectrum [174,175]. They play an important role in constructing a consistent spectrum of strings in $AdS_3$.

In Section 10.1 we will introduce the S-dual of the D1–D5 system. We then formulate string propagation on $AdS_3$ and study its spectrum in Section 10.2. Our discussion will be based on [174]. To understand the essential aspects of the spectrum it is enough to focus on the $AdS_3$ part of the geometry; we also restrict our discussion to bosonic strings in $AdS_3$. In Section 10.3 we briefly review string propagation in Euclidean $AdS_3$, which was initiated in [139]. We then write down the long string solution in Euclidean $AdS_3$ and discuss its symmetries [114]. Finally, in Section 10.4 we discuss string theory on thermal $AdS_3$ backgrounds. The one loop free energy of a gas of strings in $AdS_3$ was evaluated in [175]. It was shown to be modular invariant, and the space–time spectrum read off from it matched the proposal for the spectrum in [174].

We just mention two topics which we will not review here. For a discussion of correlation functions for string theory on $AdS_3$ and their role in the AdS/CFT correspondence see [176] and references therein. Branes in $AdS_3$ have been extensively studied; for recent work see [177,178] and references therein.

10.1. The S-dual of the D1–D5 system

Let us consider the S-dual of the D1–D5 system whose metric is given in (2.24). S-duality takes $\phi \to -\phi$, $C^{(2)} \to B_{NS}$ and $ds^2 \to e^{-\phi} ds^2$. Performing these operations on the supergravity solution given in (2.24), along with a rescaling of the coordinates by $\sqrt{g_s}$ (coordinates are rescaled to keep the 10-dimensional Newton’s constant invariant), we obtain

$$ds^2 = f_1^{-1}(-dt^2 + dx_5^2) + f_5\,(dx_1^2 + \cdots + dx_4^2) + (dx_6^2 + \cdots + dx_9^2) ,$$

$$e^{-2\phi} = \frac{1}{g_s'^2}\, f_1 f_5^{-1} , \qquad B_{05} = \frac{1}{2}(f_1^{-1} - 1) , \qquad H_{abc} = \frac{1}{2}\,\epsilon_{abcd}\,\partial_d f_5 , \quad a,b,c,d = 1,2,3,4 , \qquad (10.1)$$

where $g_s' = 1/g_s$ and

$$f_1 = 1 + \frac{16\pi^4 g_s'^2 \alpha'^3 Q_1}{V_4\, r^2} , \qquad f_5 = 1 + \frac{\alpha' Q_5}{r^2} . \qquad (10.2)$$
Here $V_4$ refers to the volume of the $T^4$ measured in the scaled coordinates. From the Neveu–Schwarz fluxes it is now easy to see that this system is the supergravity solution of $Q_5$ Neveu–Schwarz branes with $Q_1$ fundamental strings smeared over the four torus $T^4$. Let us now take the near-horizon limit of this solution. This is given by

$$\alpha' \to 0 , \qquad \frac{r}{\alpha'} \equiv U = \text{fixed} , \qquad v \equiv \frac{V_4}{16\pi^4 \alpha'^2} = \text{fixed} , \qquad g_6 \equiv \frac{g_s'}{\sqrt{v}} = \text{fixed} . \qquad (10.3)$$
Under this scaling limit the metric given in (10.1) reduces to

$$ds^2 = \alpha' U^2 Q_5 (-dt^2 + dx_5^2) + \alpha' Q_5 \frac{dU^2}{U^2} + \alpha' Q_5\, d\Omega_3^2 + (dx_6^2 + \cdots + dx_9^2) ,$$

$$e^{-2\phi} = \frac{Q_1}{v\, Q_5} , \qquad H_{05U} = \alpha' Q_5\, U , \qquad H_{\chi\theta\phi} = \alpha' Q_5 . \qquad (10.4)$$

Here we have rescaled the coordinates t and $x_5$ by $\sqrt{Q_1 Q_5 g_6^2}$. Thus the near-horizon geometry of $Q_1$ fundamental strings and $Q_5$ NS branes is $AdS_3 \times S^3 \times T^4$ with Neveu–Schwarz fluxes. The radius of $S^3$ is $\sqrt{Q_5 \alpha'}$. Note that the near-horizon geometry depends on $Q_1$ only through the string coupling constant, which is fixed by the ratio $Q_1/Q_5$. It is convenient to study string propagation on this geometry as it involves only Neveu–Schwarz fluxes. We will restrict our attention to string propagation on the $AdS_3$ part of the geometry. Strings on $S^3$ with H flux through the sphere are described by an SU(2) WZW model at level $Q_5$, while strings on $T^4$ are described by a free field conformal theory. We refer to these conformal field theories as the internal conformal field theory. To simplify the discussion we will study only bosonic strings on $AdS_3$.

10.2. String propagation on AdS3

String propagation on $AdS_3$ with H flux is an exact conformal field theory, because $AdS_3$ is the SL(2,R) group manifold. To see this, consider the SL(2,R) group element parameterized by

$$g = \exp\left(i\,\frac{t+\phi}{2}\,\sigma_2\right) \exp(\rho\,\sigma_3)\, \exp\left(i\,\frac{t-\phi}{2}\,\sigma_2\right) , \qquad (10.5)$$

where $\sigma_i$ are the Pauli matrices. We use the following generators for the SL(2,R) Lie algebra:

$$T^3 = -\frac{i}{2}\,\sigma_2 , \qquad T^\pm = \frac{1}{2}(\sigma_3 \pm i\sigma_1) . \qquad (10.6)$$

Then the metric on the SL(2,R) group manifold is given by

$$g_{\mu\nu} = \frac{1}{2}\,{\rm Tr}(g^{-1}\partial_\mu g\; g^{-1}\partial_\nu g) , \qquad (10.7)$$

where $\mu, \nu$ are indices referring to $\rho, t, \phi$. Evaluating the metric using the parameterization given in (10.5) we get

$$ds^2 = -\cosh^2\!\rho\; dt^2 + d\rho^2 + \sinh^2\!\rho\; d\phi^2 , \qquad (10.8)$$
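The step from the group element (10.5) to the metric (10.8) can be checked numerically. The sketch below is a minimal pure-Python check under stated assumptions: it reads (10.5) as $g = e^{i(t+\phi)\sigma_2/2}\, e^{\rho\sigma_3}\, e^{i(t-\phi)\sigma_2/2}$, uses the fact that $e^{i\theta\sigma_2}$ is a real rotation matrix, and approximates derivatives by central finite differences. All function names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rot(theta):
    # exp(i * theta * sigma_2) = [[cos, sin], [-sin, cos]]
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def group_element(t, rho, phi):
    boost = [[math.exp(rho), 0.0], [0.0, math.exp(-rho)]]  # exp(rho * sigma_3)
    return matmul(matmul(rot((t + phi) / 2.0), boost), rot((t - phi) / 2.0))

def inverse(g):
    # det g = 1 on SL(2, R), so the inverse is the adjugate matrix
    return [[g[1][1], -g[0][1]], [-g[1][0], g[0][0]]]

def left_invariant_form(coords, index, eps=1e-6):
    # g^{-1} d_mu g, with the derivative taken by central finite difference
    up, down = list(coords), list(coords)
    up[index] += eps
    down[index] -= eps
    g_up, g_down = group_element(*up), group_element(*down)
    dg = [[(g_up[i][j] - g_down[i][j]) / (2.0 * eps) for j in range(2)]
          for i in range(2)]
    return matmul(inverse(group_element(*coords)), dg)

def induced_metric(t, rho, phi):
    """g_{mu nu} = (1/2) Tr(g^{-1} d_mu g g^{-1} d_nu g), ordered (t, rho, phi)."""
    forms = [left_invariant_form((t, rho, phi), mu) for mu in range(3)]
    def half_trace(a, b):
        m = matmul(a, b)
        return 0.5 * (m[0][0] + m[1][1])
    return [[half_trace(forms[mu], forms[nu]) for nu in range(3)]
            for mu in range(3)]
```

At any point the result should be close to the diagonal matrix with entries $-\cosh^2\rho$, $1$, $\sinh^2\rho$, up to finite-difference error.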
which is the metric on $AdS_3$ expressed in the global coordinates $(t, \phi, \rho)$ (C.6). Thus string propagation on $AdS_3$ can be expressed in terms of the WZW action given below:

$$S = \frac{Q_5}{4\pi} \int_M dx^+ dx^-\, {\rm Tr}(g^{-1}\partial_+ g\; g^{-1}\partial_- g) + \frac{Q_5}{12\pi} \int_N {\rm Tr}(\omega^3) . \qquad (10.9)$$

Here M is the embedding of the world sheet into the group manifold $AdS_3$, N is any 3-dimensional manifold whose boundary is M, and $\omega = g^{-1}dg$ is the Maurer–Cartan 1-form. The second term is called
the Wess–Zumino term. We have used Minkowski signature on the world sheet and $x^\pm = \tau \pm \sigma$, where $\tau$ and $\sigma$ are the world sheet time and position coordinates. As a check on the action, compare the H field in (10.4) with the one induced by the Wess–Zumino term. It is easily seen that both are proportional to the volume form on $AdS_3$.

Classical solutions: Now that we have the action of strings in $AdS_3$ we obtain the form of the classical solutions. The equation of motion derived from the action in (10.9) is

$$\partial_-(\partial_+ g\; g^{-1}) = 0 . \qquad (10.10)$$

Thus a general solution of this equation is given by

$$g = g_+(x^+)\, g_-(x^-) . \qquad (10.11)$$

It is thus easy to construct classical solutions. Consider the following solution:

$$g_+(x^+) = e^{(i\sigma_2/2)\,x^+} , \qquad g_-(x^-) = e^{(i\sigma_2/2)\,x^-} . \qquad (10.12)$$

For this solution we have $g = e^{i\tau\sigma_2}$. From the parametrization of the group element in (10.5) we see that the solution is a timelike geodesic with $\rho = 0$, $\phi = 0$ and $t = \tau$. Note that the solution does not have $\sigma$ dependence, therefore it represents a particle trajectory. Spacelike geodesics can also be constructed. The solution

$$g_+(x^+) = e^{(\sigma_3/2)\,x^+} , \qquad g_-(x^-) = e^{(\sigma_3/2)\,x^-} \qquad (10.13)$$

represents a spacelike geodesic with $g = e^{\tau\sigma_3}$. From (10.5) we see that the trajectory is given by $\rho = \tau$, which is spacelike.

It is interesting to note that there is a symmetry which allows the generation of new solutions from a given solution. The transformation

$$g_+ = e^{(i/2)\,w\,x^+ \sigma_2}\; \tilde g_+ , \qquad g_- = \tilde g_-\; e^{(i/2)\,w\,x^- \sigma_2} , \qquad (10.14)$$

where $\tilde g_+$ and $\tilde g_-$ are the old solution, also gives a solution. From (10.5) we see that this acts on t and $\phi$ as

$$t \to t + w\tau , \qquad \phi \to \phi + w\sigma . \qquad (10.15)$$
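The action of this transformation on the coordinates can be checked with a few lines of code. The sketch below assumes that the flow multiplies $g_+$ from the left and $g_-$ from the right by $e^{(i/2) w x^\pm \sigma_2}$, which is the reading under which the product $g_+ g_-$ shifts t and $\phi$ as in (10.15). It also assumes the parameterization (10.5) with $e^{i\theta\sigma_2}$ written as a real rotation matrix. The function names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rot(theta):
    # exp(i * theta * sigma_2), a real 2x2 rotation matrix
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def boost(rho):
    # exp(rho * sigma_3)
    return [[math.exp(rho), 0.0], [0.0, math.exp(-rho)]]

def group_element(t, rho, phi):
    # Parameterization (10.5)
    return matmul(matmul(rot((t + phi) / 2.0), boost(rho)), rot((t - phi) / 2.0))

def spectral_flow(g_plus, g_minus, w, tau, sigma):
    # Assumed form of the flow: left factor on g_plus, right factor on g_minus
    x_plus, x_minus = tau + sigma, tau - sigma
    new_plus = matmul(rot(w * x_plus / 2.0), g_plus)
    new_minus = matmul(g_minus, rot(w * x_minus / 2.0))
    return new_plus, new_minus

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Splitting a group element at $(t, \rho, \phi)$ into left and right factors and applying `spectral_flow` should reproduce `group_element(t + w * tau, rho, phi + w * sigma)`.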
The periodicity of the string worldsheet under $\sigma \to \sigma + 2\pi$ is obeyed for integer w. This transformation is called spectral flow. It stretches the geodesic in the t-direction and gives $\sigma$ dependence to the geodesics. In fact the solution now represents a string winding w times around the centre $\rho = 0$ of $AdS_3$. The spectral flows of timelike geodesics are called short strings; their energy is bounded from above. The spectral flows of spacelike geodesics are called long strings; their energy is bounded from below. Thus long strings are like scattering states while short strings are like bound states in $AdS_3$ [174]. These long strings, as we will see in Section 10.4, are the duals of the long D-strings discussed in Section 7.

Symmetries of the SL(2,R) WZW action: The WZW action has an infinite set of conserved charges given by

$$J^a_n = Q_5 \int_0^{2\pi} \frac{dx^+}{2\pi}\, e^{inx^+}\, {\rm Tr}(T^a\, \partial_+ g\; g^{-1}) , \qquad \bar J^a_n = Q_5 \int_0^{2\pi} \frac{dx^-}{2\pi}\, e^{inx^-}\, {\rm Tr}(T^{a*}\, g^{-1} \partial_- g) . \qquad (10.16)$$
They obey the commutation relations

$$[J^3_n, J^3_m] = -\frac{Q_5}{2}\, n\, \delta_{n+m,0} , \qquad [J^3_n, J^\pm_m] = \pm J^\pm_{n+m} , \qquad [J^+_n, J^-_m] = -2 J^3_{n+m} + Q_5\, n\, \delta_{n+m,0} . \qquad (10.17)$$

There is a similar set of commutation relations for the right movers $\bar J^a_n$. Using the Sugawara construction one can define the Virasoro generators; they are given by

$$L_0 = \frac{1}{Q_5 - 2}\left[\frac{1}{2}(J^+_0 J^-_0 + J^-_0 J^+_0) - (J^3_0)^2 + \sum_{m=1}^{\infty} (J^+_{-m} J^-_m + J^-_{-m} J^+_m - 2 J^3_{-m} J^3_m)\right] ,$$

$$L_{n \neq 0} = \frac{1}{Q_5 - 2} \sum_{m=1}^{\infty} (J^+_{n-m} J^-_m + J^-_{n-m} J^+_m - 2 J^3_{n-m} J^3_m) . \qquad (10.18)$$

These generators obey the Virasoro algebra with the central charge given by

$$c = \frac{3 Q_5}{Q_5 - 2} . \qquad (10.19)$$
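For orientation, the central charge (10.19) is easy to tabulate. The helper below is a minimal sketch with a hypothetical function name; it uses exact rational arithmetic and rejects levels at or below 2, where the formula as written is singular or negative.

```python
from fractions import Fraction

def sl2r_central_charge(level):
    """Central charge c = 3k / (k - 2) at level k = Q5, as in (10.19)."""
    if level <= 2:
        # Guard chosen for this sketch: the formula is singular at k = 2
        raise ValueError("level must exceed 2 for c = 3k/(k-2) to be finite and positive")
    return Fraction(3 * level, level - 2)
```

For large $Q_5$ the value approaches 3, the dimension of the group manifold, as expected in the semiclassical limit.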
10.3. Spectrum of strings on AdS3

From the fact that the WZW action admits the SL(2,R) current algebra we see that the physical spectrum of a string in $AdS_3$ must be in unitary representations of the current algebra. We construct the unitary representations of the SL(2,R) current algebra by first constructing unitary representations of the global part of SL(2,R). This is given by

$$[J^3_0, J^\pm_0] = \pm J^\pm_0 , \qquad [J^+_0, J^-_0] = -2 J^3_0 . \qquad (10.20)$$
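The algebra (10.20) can be realized explicitly on a lowest-weight tower of states $|j, m\rangle$ with $m = j, j+1, \ldots$, which makes the value of the quadratic Casimir easy to verify. The sketch below is an illustration under stated assumptions: it takes $J^-_0 = (J^+_0)^\dagger$ with $J^-_0 |j,j\rangle = 0$, so that $[J^+_0, J^-_0] = -2J^3_0$ fixes $|\langle j, m+1| J^+_0 |j, m\rangle|^2 = m(m+1) - j(j-1)$, and it truncates the infinite tower to a finite number of states. Relations involving the last kept state are therefore not expected to hold. The function names are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lowest_weight_rep(j, size):
    """Truncated matrices (J3, J+, J-) on |j, m>, m = j, ..., j + size - 1."""
    ms = [j + n for n in range(size)]
    j3 = [[ms[a] if a == b else 0.0 for b in range(size)] for a in range(size)]
    jp = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        m = ms[n]
        # Matrix element fixed by [J+, J-] = -2 J3 together with J- |j, j> = 0
        jp[n + 1][n] = math.sqrt(m * (m + 1) - j * (j - 1))
    jm = [[jp[b][a] for b in range(size)] for a in range(size)]  # J- = (J+)^T
    return j3, jp, jm

def casimir_diagonal(j, size):
    # Diagonal of c2 = (1/2)(J+ J- + J- J+) - (J3)^2
    j3, jp, jm = lowest_weight_rep(j, size)
    pm, mp, j3sq = matmul(jp, jm), matmul(jm, jp), matmul(j3, j3)
    return [0.5 * (pm[n][n] + mp[n][n]) - j3sq[n][n] for n in range(size)]
```

Away from the truncation edge every diagonal entry of the Casimir equals $-j(j-1)$, independent of m.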
From this algebra we see that states are classified by the eigenvalue of $J^3_0$, which we denote by m, and by j, which is related to the Casimir $c_2 = \frac{1}{2}(J^+_0 J^-_0 + J^-_0 J^+_0) - (J^3_0)^2$ by $c_2 = -j(j-1)$. Unitary representations of SL(2,R) fall in five classes (footnote 23):

(1) Identity: The trivial representation $|0\rangle$. This representation has $j = 0$, $m = 0$ and $J^\pm_0 |0\rangle = 0$.

(2) Principal discrete representations (lowest weight): These are representations of the form

$$D^+_j = \{ |j, m\rangle : m = j, j+1, j+2, \ldots \} . \qquad (10.21)$$

Here $|j, j\rangle$ is annihilated by $J^-_0$. The tower of states over $|j, j\rangle$ is built by the repeated action of $J^+_0$. The norm of these states is positive and the representation is unitary if j is real and $j > 0$. j is restricted to be a half-integer if we are considering representations of the group SL(2,R); however, for the universal cover of SL(2,R), which is our interest, j can be any positive real number.

23
See [179] for a review.
(3) Principal discrete representations (highest weight): These are representations of the form

$$D^-_j = \{|j, m\rangle : m = -j, -j-1, -j-2, \ldots\}, \tag{10.22}$$

where $|j, -j\rangle$ is annihilated by $J^+_0$. The representation has positive norm and is unitary if $j$ is real and $j > 0$. This representation is the charge conjugate of $D^+_j$.

(4) Principal continuous representations: These are representations of the form

$$C^\alpha_j = \{|j, \alpha, m\rangle : m = \alpha, \alpha \pm 1, \alpha \pm 2, \ldots\}. \tag{10.23}$$

Without loss of generality, we can restrict to $0 \leq \alpha < 1$. The representation has positive norm and is unitary if $j = 1/2 + is$, where $s$ is real.

(5) Complementary representations: These are of the form

$$E^\alpha_j = \{|j, \alpha, m\rangle : m = \alpha, \alpha \pm 1, \alpha \pm 2, \ldots\}. \tag{10.24}$$
Again without loss of generality we can restrict to $0 \leq \alpha < 1$. The representation has positive norm and is unitary if $j$ is real and $j(1-j) > \alpha(1-\alpha)$.

Among these representations, we restrict to those which admit square integrable wave functions in the point particle limit. As AdS$_3$ is non-compact, square-integrability refers to delta function normalizable wave functions. This imposes the restriction $j > 1/2$.$^{24}$ It is known that $C^\alpha_{j=1/2+is} \otimes C^\alpha_{j=1/2+is}$ and $D^\pm_j \otimes D^\pm_j$ with $j > 1/2$ form a complete basis of square integrable wave functions on AdS$_3$, so it is sufficient to work with these representations only. Let us call a unitary representation of SL(2, R) with this restriction $\mathcal{H}$.

Now that we have the unitary representations of the global part of the current algebra, we can obtain the unitary representations of the SL(2, R) current algebra by taking the states of $\mathcal{H}$ as primary states, which are annihilated by $J^3_n$, $J^\pm_n$ with $n > 0$. The full representation is then obtained by the action of $J^3_n$, $J^\pm_n$ with $n < 0$ on $\mathcal{H}$. We denote these full representations by $\hat D^\pm_j$ and $\hat C^\alpha_{j=1/2+is}$. In general, a representation of the SL(2, R) current algebra contains negative norm states. String theory on AdS$_3$ is consistent if one can remove these negative norm states by imposing the Virasoro constraints on the Hilbert space of a single string state. If we consider the bosonic string theory on AdS$_3$, the Virasoro constraints are given by

$$\left(L^{\rm total}_n - \delta_{n,0}\right)|{\rm Physical}\rangle = 0, \qquad n \geq 0. \tag{10.25}$$

Here $L^{\rm total}_n$ refers to the Virasoro generator of the $c = 26$ conformal field theory including the SL(2, R) WZW model. It has been shown that there are no negative norm states for $\hat D^\pm_j$ with $0 < j < k/2$ and for $\hat C^\alpha_{j=1/2+is}$ (see [174,181,182] for a list of references). In [174] it was seen that the SL(2, R) WZW model admits a symmetry given by

$$J^3_n = \tilde J^3_n + \frac{k}{2}\, w\, \delta_{n,0}, \qquad J^+_n = \tilde J^+_{n-w}, \qquad J^-_n = \tilde J^-_{n+w}, \tag{10.26}$$

24 This condition is also the condition for the Breitenlohner–Freedman bound [180] on $D^\pm_j$, which states that the mass of a scalar in AdS$_3$ satisfies $m^2 = j(j-1) \geq -\frac{1}{4}$.
where $w$ is any integer. The map of $J$'s to $\tilde J$'s preserves the commutation relations (10.17). The Virasoro generators $\tilde L_n$ can be found using the Sugawara construction, and they are related to the $L_n$'s by the map

$$L_n = \tilde L_n - w \tilde J^3_n - \frac{k}{4}\, w^2\, \delta_{n,0}. \tag{10.27}$$

This symmetry of the SL(2, R) WZW model is called spectral flow. It is the same symmetry as the one which allowed the generation of new classical solutions from old ones in Section 10.2. This can be seen by computing the change in the stress energy tensor and the SL(2, R) generators under the map (10.14): they are identical to (10.27) and (10.26), respectively. The spectral flow maps one representation to another. For a compact group like SU(2) it does not generate a new representation, but for the non-compact group SL(2, R) it generates new representations. Let us call the resulting representations $\hat D^{\pm, w}_{\tilde j}$ and $\hat C^{\alpha, w}_{1/2+is}$, where $\tilde j$ denotes the SL(2, R) spin before the flow. The representations obtained by the spectral flow also contain negative norm states. It has been shown in [174] that, after imposing the Virasoro constraints, there are no negative norm states in the representations obtained from spectral flow for $1/2 < \tilde j < (k-1)/2$.

Now we have the ingredients to state the proposal for the spectrum of strings on AdS$_3$ [174]. The spectrum consists of two kinds of representations: the spectral flow of the continuous representations with the same amount of spectral flow on the left and right, $\hat C^{\alpha, w}_{1/2+is, L} \otimes \hat C^{\alpha, w}_{1/2+is, R}$, along with the spectral flow of the discrete representations, $\hat D^{\pm, w}_{\tilde j, L} \otimes \hat D^{\pm, w}_{\tilde j, R}$. The value of $\tilde j$ is restricted to $1/2 < \tilde j < (k-1)/2$. These representations should be tensored with the representations of the internal CFT, which contributes to the net central charge. We then have to impose the Virasoro constraints. The expressions for the energy and the Virasoro constraints for both the discrete and the continuous representations are given in [174].
This proposal was verified in [175] by reading off the spectrum from the modular invariant one-loop partition function in the thermal AdS$_3$ background. The discrete and continuous states obtained from the one-loop partition function were in agreement with the above proposal. We will review [175] in Section 10.5.

10.4. Strings on Euclidean AdS$_3$

In this section we formulate string theory on Euclidean AdS$_3$, which we denote by H (see Appendix C). Again we will restrict our attention to a single Poincaré patch in H. Consider the following redefinition of the coordinates in the Euclidean version of (10.4):

$$U = e^{\phi}, \qquad \gamma = it + x_5, \qquad \bar\gamma = -it + x_5. \tag{10.28}$$

Then the metric on H becomes (cf. (C.35) with $h = e^{-\phi}$)

$$ds^2 = l^2\left(d\phi^2 + e^{2\phi}\, d\gamma\, d\bar\gamma\right). \tag{10.29}$$

Here $l^2 = Q_5$. The value of the B-field can again be read off from (10.4) and is given by

$$B = l^2 e^{2\phi}\, d\gamma \wedge d\bar\gamma. \tag{10.30}$$
The B-field is necessary for worldsheet conformal invariance. The B-field in Euclidean AdS$_3$ is imaginary. We work with a Euclidean worldsheet theory, which makes the contribution of the B-field to the worldsheet Lagrangian real. The worldsheet action is given by

$$S = \frac{l^2}{2\pi}\int d^2z \left(\partial\phi\,\bar\partial\phi + e^{2\phi}\,\bar\partial\gamma\,\partial\bar\gamma\right). \tag{10.31}$$

Let us write this action in a more convenient form. Introducing auxiliary fields $\beta$ and $\bar\beta$ of weights (1, 0) and (0, 1), we can write the action as

$$S = \frac{l^2}{2\pi}\int d^2z \left(\partial\phi\,\bar\partial\phi + \beta\,\bar\partial\gamma + \bar\beta\,\partial\bar\gamma - e^{-2\phi}\,\beta\bar\beta\right). \tag{10.32}$$

Integrating out the auxiliary fields $\beta$ and $\bar\beta$ in the above equation we obtain (10.31). Scaling $\phi$ so that it has the canonical normalization and taking into account the measure, we obtain

$$S = \frac{1}{4\pi}\int d^2z \left(\partial\phi\,\bar\partial\phi - \frac{2}{\alpha_+}\hat R\,\phi + \beta\,\bar\partial\gamma + \bar\beta\,\partial\bar\gamma - \beta\bar\beta\,\exp\left(-\frac{2}{\alpha_+}\phi\right)\right). \tag{10.33}$$

Here $\alpha_+ = \sqrt{2Q_5 - 4}$ and $\hat R$ is the worldsheet curvature. Notice that the coefficient in the exponent has been renormalized. The action becomes free as $\phi \to \infty$, which is near the boundary of AdS$_3$. The worldsheet propagators of the fields in (10.33) are

$$\langle \phi(z)\phi(0) \rangle = -\log|z|^2, \qquad \langle \beta(z)\gamma(0) \rangle = \frac{1}{z}. \tag{10.34}$$

The action (10.33) admits an SL(2, R) × SL(2, R) current algebra. This representation of the SL(2, R) current algebra is called the Wakimoto representation [183]. The holomorphic currents are given by

$$J^3 = \beta\gamma + \frac{\alpha_+}{2}\,\partial\phi, \qquad J^+ = \beta\gamma^2 + \alpha_+\,\gamma\,\partial\phi + Q_5\,\partial\gamma, \qquad J^- = \beta. \tag{10.35}$$
Similar definitions exist for the anti-holomorphic currents. The modes of these currents generate the SL(2, R) current algebra given in (10.17), with the same central charge as given in (10.19).

10.4.1. The long string worldsheet algebra

We now derive the worldsheet degrees of freedom of the long string solution in H. Using supersymmetry one can derive the worldsheet theory of the long string exactly, unlike our classical analysis in Section 7.1. We will follow the discussion given in [114]. The long string solution in the static gauge is given by [114,139,174]

$$\phi = \phi_0, \qquad \gamma(z, \bar z) = z, \qquad \bar\gamma(z, \bar z) = \bar z. \tag{10.36}$$

The ghosts corresponding to $\gamma$ and $\bar\gamma$ decouple [114]. Thus the bosonic worldsheet degrees of freedom of the long string are the coordinate $\phi$ characterizing its radial position in H, the coordinates on the sphere $S^3$, which form an SU(2) current algebra $j^a$, $a = 1, \ldots, 3$, and the coordinates of $T^4$. The 4 fermionic partners of the coordinates on $T^4$ together with the bosons form an N = (4, 4) superconformal algebra with central charge 6. The superpartners of the coordinate $\phi$ and the SU(2) current algebra $j^a$ of the sphere $S^3$ are 4 free fermions $S^\mu$ with $\mu = 1, \ldots, 4$. From the fact that the radius of $S^3$ is $\sqrt{Q_5}$ we know that the SU(2) current algebra is at level $Q_5 - 2$.$^{25}$ These arguments lead us to the following operator product expansions among the fields:

$$S^\mu(z) S^\nu(w) = -\frac{\delta^{\mu\nu}}{z - w}, \qquad \partial\phi(z)\,\partial\phi(w) = -\frac{1}{(z-w)^2}, \qquad j^a(z) j^b(w) = -\frac{\delta^{ab}(Q_5 - 2)}{2(z-w)^2} + \frac{\epsilon^{abc} j^c}{z - w}. \tag{10.37}$$

25 The shift in the level is because we have used decoupled fermions (see for instance [184]).
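To keep our discussion less cumbersome we have ignored the anti-holomorphic fields. As the spacetime preserves 16 supersymmetries, we should construct out of these fields an N = (4, 4) superconformal algebra. It is known [114,184] that the following construction has the required properties of the long string:

$$T = -\frac{1}{2}\,\partial S^\mu S^\mu - \frac{j^a j^a}{Q_5} - \frac{1}{2}\,\partial\phi\,\partial\phi + \frac{\sqrt{2}\,(Q_5 - 1)}{2\sqrt{Q_5}}\,\partial^2\phi,$$

$$J^a = j^a + \frac{1}{2}\,\eta^a_{\mu\nu} S^\mu S^\nu,$$

$$G^\mu = \frac{\sqrt{2}}{2}\,\partial\phi\, S^\mu - \frac{2}{\sqrt{Q_5}}\,\eta^a_{\mu\nu}\, j^a S^\nu + \frac{1}{6\sqrt{Q_5}}\,\epsilon_{\mu\nu\rho\sigma} S^\nu S^\rho S^\sigma - \frac{Q_5 - 1}{\sqrt{Q_5}}\,\partial S^\mu. \tag{10.38}$$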
These generators form the N = (4, 4) superconformal algebra given in (5.2), with the following definition of the two-component supercharge in (5.2):

$$G^a = (G^1 + iG^2,\; G^3 - iG^4). \tag{10.39}$$
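As a consistency check on this construction, the central charge of the long string algebra can be tallied from its free-field ingredients. The sketch below assumes the standard values $c = 1/2$ per free fermion, $c = 3k/(k+2)$ for SU(2) at level $k$, and $c = 1 + 3Q^2$ for a linear dilaton with background charge $Q$, and takes $Q = \sqrt{2/Q_5}\,(Q_5 - 1)$ with the SU(2) level $Q_5 - 2$. The function name is hypothetical.

```python
from fractions import Fraction

def long_string_central_charge(q5):
    """Central charge of (4 fermions) + SU(2)_{Q5-2} + linear dilaton.

    Assumptions (standard free-field values, not derived here):
      * each free fermion contributes 1/2,
      * SU(2) at level k contributes 3k/(k+2), with k = Q5 - 2,
      * a linear dilaton with background charge Q contributes 1 + 3 Q^2,
        with Q^2 = 2 (Q5 - 1)^2 / Q5.
    """
    q5 = Fraction(q5)
    c_fermions = Fraction(4, 2)
    c_su2 = 3 * (q5 - 2) / q5
    q_squared = 2 * (q5 - 1) ** 2 / q5
    c_dilaton = 1 + 3 * q_squared
    return c_fermions + c_su2 + c_dilaton

if __name__ == "__main__":
    for q5 in range(2, 8):
        print(q5, long_string_central_charge(q5), 6 * (q5 - 1))
```

Exact rational arithmetic is used so that the equality with $6(Q_5 - 1)$ holds identically rather than up to rounding.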
The following points are worth noting:

(1) Note from the definition of the stress energy tensor that the field $\phi$ is a linear dilaton with background charge $Q = \sqrt{2/Q_5}\,(Q_5 - 1)$. This is precisely the background charge of the linear dilaton theory for the case of a single D1-brane splitting off the D1–D5 bound state (see below (7.40)).

(2) The central charge of this algebra is $6(Q_5 - 1)$ and not 6, as one would have expected if this algebra were describing the spacetime geometry.

(3) Note also that the R-symmetry generators in (10.38) involve the bosonic fields $j^a$, which correspond to the symmetry of $S^3$. This is a characteristic of the Higgs branch [113].

10.5. Strings on the thermal AdS$_3$

To discuss thermal boundary conditions on AdS$_3$ it is convenient to parameterize Euclidean AdS$_3$ using the following coordinates:

$$\gamma = v e^{\phi}, \qquad \bar\gamma = \bar v e^{\phi}. \tag{10.40}$$

In terms of these new coordinates the metric on H becomes (see (10.29) and (C.35))

$$ds^2 = l^2\left(d\phi^2 + (dv + v\, d\phi)(d\bar v + \bar v\, d\phi)\right). \tag{10.41}$$
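The worldsheet action with the B-field given by (10.30) is

$$S = \frac{Q_5}{2\pi}\int d^2z \left(\partial\phi\,\bar\partial\phi + (\partial\bar v + \partial\phi\,\bar v)(\bar\partial v + \bar\partial\phi\, v)\right). \tag{10.42}$$

Thermal AdS$_3$ is then defined by the following identifications (see Appendix C.4):

$$v \sim v\, e^{i\mu\beta}, \qquad \bar v \sim \bar v\, e^{-i\mu\beta}, \qquad \phi \sim \phi + \beta. \tag{10.43}$$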
Here $\beta$ is the inverse temperature and $\mu$ is the chemical potential. We now set up the one-loop evaluation of the partition function on H [175]. From this it is easy to evaluate the spacetime free energy and thus determine the spectrum of strings in AdS$_3$. We have to evaluate the path integral on a torus with modular parameter $\tau$. The conformal field theory consists of the worldsheet Lagrangian (10.42) with the identifications (10.43), the $b$, $c$ ghosts and an internal conformal field theory. Let the partition function of the internal conformal field theory be given by

$$Z_M = (q\bar q)^{-c_{\rm int}/24}\sum_{h, \bar h} D(h, \bar h)\, q^h\, \bar q^{\bar h}, \tag{10.44}$$

where $q = e^{2\pi i\tau}$ and $D(h, \bar h)$ is the degeneracy of the state with weight $(h, \bar h)$. Putting this partition function and that of the $b$, $c$ ghosts together we get [175]

$$Z(\beta, \mu) = \frac{\beta\,(k-2)^{1/2}}{8\pi}\int_0^\infty \frac{d\tau_2}{\tau_2^{3/2}}\int_{-1/2}^{1/2} d\tau_1\; e^{4\pi\tau_2\left(1 - \frac{1}{4(k-2)}\right)}\sum_{h, \bar h} D(h, \bar h)\, q^h\, \bar q^{\bar h}$$

$$\qquad \times \sum_{m=1}^{\infty}\frac{e^{-(k-2)m^2\beta^2/4\pi\tau_2}}{|\sinh(m\hat\beta/2)|^2}\left|\prod_{n=1}^{\infty}\frac{1 - e^{2\pi i n\tau}}{(1 - e^{m\hat\beta + 2\pi i n\tau})(1 - e^{-m\hat\beta + 2\pi i n\tau})}\right|^2, \tag{10.45}$$

where $\hat\beta = \beta + i\mu\beta$. From the one-loop partition function it is easy to extract the spacetime free energy $F$, which is given by $Z(\beta, \mu) = -\beta F$. One can rewrite the free energy as a sum over states in the single particle string Hilbert space $\mathcal{H}$:

$$F(\beta, \mu) = \frac{1}{\beta}\sum_{{\rm string} \in \mathcal{H}}\log\left(1 - e^{-\beta(E_{\rm string} + i\mu\, l_{\rm string})}\right), \tag{10.46}$$
where $E_{\rm string}$ and $l_{\rm string}$ are the energy and angular momentum of the string state. One can compare (10.45) and (10.46) and show that the spectrum is precisely the one proposed in [174], which was discussed in Section 10.3.

11. Applications of AdS$_3$–CFT$_2$ duality

The microscopic derivation (Section 8) of Hawking radiation from the D1–D5 black hole shows that the microstates of the black hole are to be identified with states of the "boundary CFT". One way to understand this is to note that [41,42] in the black hole description (large 't Hooft coupling $g_s Q$) the propagation of closed string quanta in the curved geometry (see Section 3) consists of (a) free propagation in the asymptotically flat region and (b) propagation in the AdS$_3$ geometry (throat region); in the weak coupling description the closed string quanta (a′) propagate freely in flat space and (b′) occasionally interact with the D1–D5 system which, for the low energy scales associated with Hawking quanta, is described by a CFT. Since we are describing the same physical process as (a) + (b) at strong coupling and as (a′) + (b′) at weak coupling, and since (a) is clearly equivalent to (a′), we say that (b) is equivalent (dual) to (b′). That is, as seen by a closed string probe, supergravity in AdS$_3$ is equivalent to the CFT of the D1–D5 system. This was indeed the reasoning behind the discovery of the AdS/CFT duality [42].

While the above equivalence gives us important insight in the context of the full geometry including the asymptotically flat part, the equivalence between asymptotically AdS$_3$ spaces and the boundary CFT$_2$ has a number of important applications. This subject has been discussed in a fair amount of detail in [41]. Our discussion here should be regarded as complementary to it; we will focus mainly on (a) thermodynamics and phase transitions in AdS$_3$ and (b) black hole formation by particle collision. We will begin with (a).

11.1. Hawking–Page transition in AdS$_3$

It is well known [32,185] that a thermal state in flat Minkowski space, no matter how low the temperature, is unstable to the formation of a black hole:

• For a flat space with infinite volume, the mass of a thermal state is infinite; therefore the state will gravitationally collapse to a black hole. The conclusion remains true even if one restricts to a large but finite volume.

• Furthermore, because of the negative specific heat implied by $T = 1/(8\pi G M)$ (Eq. (1.2)), a black hole can only be in an unstable equilibrium with radiation at the same temperature (in a sufficiently large volume to keep the temperature constant). Any fluctuation resulting in an increase of the black hole mass would reduce its temperature below that of its surroundings, causing more absorption than emission, so that the black hole continues to grow. This implies a breakdown of the canonical ensemble.

As found in [186], the situation is different in an AdS space. Let us write the metric of AdS$_3$ as (see (2.71) and (B.7))

$$ds^2 = -dt^2\left(1 + \frac{r^2}{l^2}\right) + dr^2\left(1 + \frac{r^2}{l^2}\right)^{-1} + r^2\, d\phi^2 \tag{11.1}$$

and that of BTZ as (see (2.77), (C.11) and (C.13))

$$ds^2 = -\left(\frac{r^2}{l^2} - M + \left(\frac{J}{2r}\right)^2\right)dt^2 + \left(\frac{r^2}{l^2} - M + \left(\frac{J}{2r}\right)^2\right)^{-1}dr^2 + r^2\left(-\frac{J}{2r^2}\, dt + d\phi\right)^2, \tag{11.2}$$
where $M$ and $J$ refer to the mass and angular momentum of the BTZ black hole ($J \leq M$). The arguments of [186], applied to the present case, state that a thermal state describing radiation in AdS$_3$ at a finite temperature has a finite mass. For $l \gg 1$ the energy density of radiation is given approximately by the formula $\rho \propto T^3$, where $T$ is the locally measured temperature $T = T_0 (1 + r^2/l^2)^{-1/2}$ which, for $r \gg l$, goes as $T \propto 1/r$. The total energy is given by the integral $M \approx l \int dr\, d\phi\, \rho$, which is clearly finite since $\rho \propto r^{-3}$ at large distances $r$. Thus, unlike in flat space, thermal radiation in AdS$_3$ can be a stable configuration if its (free) energy is less than that of a black hole at the same temperature (see below). We will also see below that the BTZ black hole has a positive specific heat (see, e.g., the formula for the temperature (C.45)), unlike the Schwarzschild black hole. Thus a stable equilibrium between a BTZ black hole and radiation in AdS$_3$ is possible: a fluctuation resulting in an increase of the mass of the black hole increases its temperature above that of the radiation, causing more emission than absorption by the black hole and thus restoring its energy to the equilibrium value.

11.1.1. Euclidean free energy from supergravity

As we argued in Section 2, the AdS$_3$ and BTZ spacetimes (11.1), (11.2) (times $S^3 \times K$) are near-horizon limits of solutions of type IIB supergravity (namely the D1–D5 string and the D1–D5 black hole). In fact they are exact solutions of type IIB supergravity in their own right. Stated differently, AdS$_3$ and BTZ are solutions of three-dimensional supergravity obtained by a Kaluza–Klein reduction of type IIB theory on $S^3 \times K$. Let us elaborate on this a bit more.
The action for pure anti-de Sitter supergravity, based on the supergroup SU(1, 1|2) × SU(1, 1|2) (see Section 6), in a three-dimensional spacetime with cosmological constant $\Lambda = -1/l^2 < 0$, is given by (see [136,187])$^{26}$

$$S = \frac{1}{16\pi G_N^{(3)}}\int d^3x \left[ eR + \frac{2}{l^2}\, e - \epsilon^{\mu\nu\rho}\,\bar\psi_\mu D_\nu \psi_\rho - 8l\,\epsilon^{\mu\nu\rho}\left(A^i_\mu \partial_\nu A^i_\rho - \frac{4i\epsilon_{ijk}}{3} A^i_\mu A^j_\nu A^k_\rho\right) \right.$$

$$\qquad \left. -\, \epsilon^{\mu\nu\rho}\,\bar\psi'_\mu D'_\nu \psi'_\rho + 8l\,\epsilon^{\mu\nu\rho}\left(A'^i_\mu \partial_\nu A'^i_\rho - \frac{4i\epsilon_{ijk}}{3} A'^i_\mu A'^j_\nu A'^k_\rho\right) \right]. \tag{11.3}$$

Here $G_N^{(3)}$ is the three-dimensional Newton's constant, $D_\nu = \partial_\nu + \omega_{ab\nu}\gamma^{ab}/4 - e_{a\nu}\gamma^a/(2l) - 2A^i_\nu \sigma^i$ and $D'_\nu = \partial_\nu + \omega_{ab\nu}\gamma^{ab}/4 + e_{a\nu}\gamma^a/(2l) - 2A'^i_\nu \sigma^i$. The basic fields appearing in the Lagrangian are the vielbein $e^a$, the fermions $\psi_\rho$, $\psi'_\rho$ and the gauge fields $A^i$ and $A'^i$. The $\omega$'s are spin connections. As we remarked before (11.3), the same three-dimensional supergravity can be obtained from type IIB string theory compactified on $K \times S^3$ (with constant flux on $S^3$). This identifies the cosmological constant and Newton's constant in terms of type IIB parameters and the parameters of compactification (cf. Eq. (2.73)):

$$G_N^{(3)} = \frac{4\pi^4 g_s^2}{V_4\, l^3}, \qquad l^4 = \frac{16\pi^4 g_s^2\, Q_1 Q_5}{V_4}, \tag{11.4}$$

26 Recently the three-dimensional SO(4) gauged supergravity, which provides the coupling of the pure anti-de Sitter supergravity based on the supergroup SU(1, 1|2) × SU(1, 1|2) to the lowest matter multiplets including the massless scalar fields, has been constructed in [188].
where $V_4$ is the volume of $T^4$ and $g_s$ is the string coupling (we are working in units $\alpha' = 1$). From our remarks above, it is clear that AdS$_3$ and BTZ are solutions of (11.3). We will now use this supergravity Lagrangian to calculate the Euclidean free energy. Our method will be similar in spirit to that of [28,186,189]. The presentation will more closely follow [90,190,191]. Let us recall that the partition function of AdS$_3$ supergravity, which determines the free energy, is given by

$$Z_{\rm SUGRA} = \int_{\mathcal{B}} \mathcal{D}[\text{fields}]\, \exp[-S_E], \tag{11.5}$$

where $S_E$ is the Euclidean version of the action written in (11.3).

11.1.2. Boundary conditions

In the above equation, $\mathcal{B}$ denotes the boundary conditions that define the functional integral. The semiclassical evaluation of the functional integral is performed by summing over the saddle point configurations which satisfy the given boundary conditions. The boundary conditions are specified as follows:

(1) Asymptotically AdS$_3$ Euclidean metrics satisfy the boundary condition

$$ds^2 \;\xrightarrow{\; r \to \infty \;}\; dr^2/r^2 + r^2\, |d\phi + i\, dt|^2, \tag{11.6}$$

where $\phi$, $t$ satisfy the periodicity conditions

$$\phi + it \equiv \phi + it + 2\pi(n + m\tau). \tag{11.7}$$
This defines the (conformal) boundary of the space to be a torus with modular parameter $\tau$. For Euclidean AdS$_3$, Eqs. (C.24) and (C.32) imply that the modular parameter of the boundary torus is

$$\tau = i\beta/(2\pi), \qquad \beta = 1/T - i\Phi, \tag{11.8}$$

where the complex temperature $\beta$ (see (C.33)) relates to a partition function of the form $\mathrm{Tr}\,\exp[-H/T + i\Phi J]$. For Euclidean BTZ, on the other hand, Eqs. (C.37) and (C.44) imply that the modular parameter of the boundary torus (cf. (11.6)) is now

$$\tilde\tau = i\beta/2\pi = -1/\tau, \tag{11.9}$$

where $\tau$ is given by (C.41). The inverse relation (11.9) between the modular parameters $\tau$, $\tilde\tau$ arises because of the difference in our identifications of space and Euclidean time in (C.31) and (C.40). Indeed, from this viewpoint Euclidean BTZ is a special case of an SL(2, Z) family of instanton configurations [90,191]. In the more general case we replace the parameter $\tau$ in (C.38) by

$$\tilde\tau = \frac{a\tau + b}{c\tau + d}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, Z), \tag{11.10}$$

and (C.40) is replaced by

$$2\pi u = \frac{i}{c\tau + d}\,(\phi + it). \tag{11.11}$$
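The SL(2, Z) action in (11.10) and its special case (11.9) can be illustrated numerically. The sketch below assumes the identification $\tau = i\beta/(2\pi)$ of (11.8) and uses hypothetical helper names; it shows that the element $a = 0$, $b = -1$, $c = 1$, $d = 0$ maps a complex temperature $\beta$ to $4\pi^2/\beta$.

```python
import math

def mobius(tau, a, b, c, d):
    """Act on tau with an SL(2,Z) element: tau -> (a tau + b) / (c tau + d)."""
    if a * d - b * c != 1:
        raise ValueError("matrix is not in SL(2,Z): determinant must be 1")
    return (a * tau + b) / (c * tau + d)

def tau_from_beta(beta):
    """Modular parameter for complex inverse temperature, tau = i beta / (2 pi)."""
    return 1j * beta / (2 * math.pi)

def beta_from_tau(tau):
    """Inverse of tau_from_beta."""
    return 2 * math.pi * tau / 1j

if __name__ == "__main__":
    beta = 1.3 - 0.4j                      # sample complex temperature
    tau = tau_from_beta(beta)
    tau_s = mobius(tau, 0, -1, 1, 0)       # the S transformation, tau -> -1/tau
    print(beta_from_tau(tau_s), 4 * math.pi ** 2 / beta)
```

Other members of the family are obtained by passing different integers with $ad - bc = 1$; for example `mobius(tau, 1, 1, 0, 1)` shifts $\tau$ by one.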
In [191] an elliptic genus calculated from the boundary CFT is interpreted as a sum over this entire SL(2, Z) family of gravitational instantons. In our simplified treatment below, only the AdS$_3$ and BTZ configurations will dominate the path integral. An upshot of the above discussion is the relation (cf. (11.8) and (C.44))

$$Z_{\rm AdS}(\tau = i\beta/2\pi) = Z_{\rm BTZ}(\tau = i\beta_0/2\pi), \qquad \beta_0 = 4\pi^2/\beta. \tag{11.12}$$
We should note that for the Euclidean BTZ black hole the complex temperature is fixed by the geometry (see (C.41)), whereas for Euclidean AdS$_3$ the complex temperature is for us to specify.

(2) Gauge field: the boundary condition on the gauge fields will not play an important role in the following discussion (see Eq. (5.7) of [191] for details).

(3) Fermions: the above discussion, especially (11.6) and (11.7), points to two non-trivial cycles of the boundary torus. The fermion fields can have either periodic (P) or anti-periodic (AP) boundary conditions along either cycle. Thus we can have any of the four boundary conditions (P, P), (P, AP), (AP, P), (AP, AP), where by convention the first entry denotes the boundary condition along the cycle $\phi + it \to \phi + it + 2\pi n$ and the second entry denotes the boundary condition along $\phi + it \to \phi + it + i\beta$, $\tau = i\beta/2\pi$. In the language of 2D SCFT, a periodic boundary condition corresponds to Ramond (R) fermions and an anti-periodic boundary condition corresponds to Neveu–Schwarz (NS) fermions. By the remarks at the end of Section 2.6.2 (see also Section 6) we find that the BTZ fermions correspond to Ramond boundary conditions along the "space" cycle; thus a standard Euclidean partition function in BTZ should correspond to the boundary condition (P, AP) (thermal fermions represented by $\mathrm{Tr}(e^{-\beta H})$ are anti-periodic along the "time" cycle). We will compute such a partition function below. By the duality (11.12) this should be related to (AP, P) for the fermions in AdS$_3$. This is consistent with the remarks at the end of Section 2.6.2 (see also Section 6) that AdS$_3$ fermions are represented by NS boundary conditions at the boundary.

11.1.3. Finding the saddle points

We will evaluate (11.5) by finding saddle points of the action subject to the specific boundary conditions mentioned above. By virtue of the equation of motion $R = -6/l^2$, the Euclidean action $S$ of a classical spacetime $X$ is simply its volume times a constant. To be precise,

$$S(X) = \frac{1}{4\pi l^2 G_N^{(3)}}\, \mathrm{Vol}(X), \qquad \mathrm{Vol}(X) = \int_0^{\beta} dt \int_{r_0}^{R} dr \int_0^{2\pi} d\phi\, \sqrt{g}. \tag{11.13}$$
The ranges of $\phi$ and $t$ follow from the identifications mentioned above. The lower limit $r_0$ of the $r$-integral is identically zero for AdS$_3$ and the conical spaces, whereas for BTZ it denotes the location of the horizon (the Euclidean section is defined only up to the horizon). The upper limit $R$ is kept as an infrared regulator to make the volume finite. In practice we will only be interested in free energies relative to AdS$_3$, and the $R$-dependent divergent term will disappear from that calculation.
11.1.4. Free energy of BTZ

We will now compute the free energy of BTZ relative to AdS$_3$ (in a manner similar to [92,189]). As detailed in Appendix C, the AdS$_3$ solution can be at any temperature $\beta$ (C.33), while the temperature of the black hole is fixed to be $\beta_0$ (see (C.44) and (C.41)), which is given by the geometry. To compare with the AdS$_3$ background one must adjust $\beta$ so that the geometries of the two manifolds match at the hypersurface of radius $R$ (in other words, we must use the same infrared regulator on all saddle points of the functional integral). This gives the following relation:

$$\beta_0 = \beta\, \sqrt{\frac{1 + l^2/R^2}{1 - M l^2/R^2}}, \tag{11.14}$$

$$S({\rm BTZ}) - S({\rm AdS}_3) = \frac{1}{4\pi l^2 G_N^{(3)}}\left[\int_{\rm BTZ} d^3x\, \sqrt{g} - \int_{{\rm AdS}_3} d^3x\, \sqrt{g}\right] = \frac{1}{4\pi l^2 G_N^{(3)}}\left[\pi\beta_0\,(R^2 - r_+^2) - \pi\beta R^2\right]. \tag{11.15}$$

Substituting the value of $\beta_0$ in terms of $\beta$$^{27}$ and taking the limit $R \to \infty$ we obtain

$$S({\rm BTZ}) - S({\rm AdS}_3) = \frac{1}{4\pi l^2 G_N^{(3)}}\left[\frac{\pi}{2}\,\beta_0\, l^2 - \pi^2 r_+ l^2\right], \tag{11.16}$$

where the complex temperature $\beta_0$ is given by Eqs. (B.41) and (B.45). Using these variables the difference in the action becomes (identifying $\beta_0$ with $\beta$) [190]

$$S({\rm BTZ}) - S({\rm AdS}_3) = \frac{1}{4\pi l^2 G_N^{(3)}}\left[\frac{\pi}{4}(\beta + \beta^*)\, l^2 - \pi^3 l^4\left(\frac{1}{\beta} + \frac{1}{\beta^*}\right)\right]. \tag{11.17}$$

By using (11.12) we find

$$S({\rm AdS}_3) = -\frac{1}{4\pi l^2 G_N^{(3)}}\,\frac{\pi}{4}\,(\beta + \beta^*)\, l^2, \tag{11.18}$$

$$S({\rm BTZ}) = -\frac{1}{4\pi l^2 G_N^{(3)}}\,\pi^3 l^4\left(\frac{1}{\beta} + \frac{1}{\beta^*}\right). \tag{11.19}$$
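The competition between the two saddle points can be made concrete in the non-rotating case. The following is a minimal sketch, not the authors' computation: it assumes real $\beta$, the thermal AdS$_3$ action $S = \beta M$ with $M = -1/(8G)$, and the BTZ action $S = \beta M - 2\pi r_+/(4G)$ with $M = r_+^2/(8Gl^2)$ and $r_+ = 2\pi l^2/\beta$. The function names and the default values of `l` and `g` are illustrative.

```python
import math

def action_ads3(beta, l=1.0, g=1.0):
    """Euclidean action of thermal AdS3 (assumed: S = beta * M, M = -1/(8G))."""
    return -beta / (8 * g)

def action_btz(beta, l=1.0, g=1.0):
    """Euclidean action of non-rotating BTZ (assumed: S = beta*M - entropy)."""
    r_plus = 2 * math.pi * l ** 2 / beta          # horizon radius at inverse temperature beta
    mass = r_plus ** 2 / (8 * g * l ** 2)
    entropy = 2 * math.pi * r_plus / (4 * g)
    return beta * mass - entropy

def dominant_saddle(beta, l=1.0, g=1.0):
    """Name of the saddle with the smaller Euclidean action."""
    return "AdS3" if action_ads3(beta, l, g) < action_btz(beta, l, g) else "BTZ"

def transition_beta(l=1.0):
    """Inverse temperature at which the two actions coincide."""
    return 2 * math.pi * l

if __name__ == "__main__":
    for beta in (0.5, transition_beta(), 20.0):
        print(beta, action_ads3(beta), action_btz(beta), dominant_saddle(beta))
```

Under these assumptions the two actions cross at $\beta = 2\pi l$, the self-dual point of $\beta \to 4\pi^2 l^2/\beta$; thermal AdS$_3$ has the lower action at larger $\beta$ (low temperature) and BTZ at smaller $\beta$ (high temperature).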
It is clear that at low temperatures the AdS$_3$ saddle point dominates the path integral, whereas at high temperatures BTZ dominates. The transition from AdS$_3$ at low temperature to BTZ at high temperature is the 3-dimensional analogue of the Hawking–Page transition.

Euclidean free energy from CFT. The aim of this section is to calculate the partition function of the (4, 4) CFT on the orbifold $(T^4)^{Q_1Q_5}/S(Q_1Q_5)$. The partition function depends on the boundary conditions of the fermions of the CFT. We will first calculate the partition function when the bulk geometry is that of the BTZ black hole.

CFT partition function corresponding to BTZ. The fermions of the CFT are periodic along the angular coordinate of the cylinder if the bulk geometry is that of the BTZ black hole. This can be seen by observing that the zero mass BTZ black hole admits Killing spinors which are periodic along the angular coordinate [93]. Therefore the zero mass BTZ black hole corresponds to the Ramond sector of the CFT. The general case of the BTZ black hole with mass and angular momentum corresponds to excited states of the CFT over the Ramond vacuum with

$$L_0 + \bar L_0 = Ml, \qquad L_0 - \bar L_0 = J_E, \tag{11.20}$$

27 We should make a remark here to avoid any potential confusion between equations like (11.12) and (11.14). The former equation is a statement that the Euclidean partition function of AdS$_3$, corresponding to a certain complex temperature, matches that of BTZ at a different temperature. In calculating the supergravity partition function, however, we want to regard both as having the same temperature ($R \to \infty$ limit of (11.14)); the difference between which bulk geometry corresponds to AdS$_3$ and which to BTZ arises here from the choice of "space" and "time" (see Appendix C.4).
where $M$ and $J_E$ are the mass and the (Euclidean) angular momentum of the BTZ black hole. Therefore the partition function of the BTZ black hole should correspond to

$$Z = \mathrm{Tr}_R\left(e^{2\pi i\tau L_0}\, e^{-2\pi i\bar\tau \bar L_0}\right). \tag{11.21}$$
The Hilbert space of the CFT on the orbifold $(T^4)^{Q_1Q_5}/S(Q_1Q_5)$ can be decomposed into twisted sectors labeled by the conjugacy classes of the permutation group $S(Q_1Q_5)$. The conjugacy classes of the permutation group are characterized by cycles of various lengths. The various conjugacy classes and the multiplicities with which they occur in $S(Q_1Q_5)$ can be found from the solutions of the equation

$$\sum_{n=0}^{Q_1Q_5} n N_n = Q_1 Q_5, \tag{11.22}$$

where $n$ is the length of the cycle and $N_n$ is the multiplicity of the cycle. The Hilbert space is given by

$$\mathcal{H} = \bigoplus_{\sum n N_n = Q_1Q_5}\; \bigotimes_{n > 0}\; S^{N_n} \mathcal{H}^{P_n}_{(n)}. \tag{11.23}$$
Here $S^N \mathcal{H}$ denotes the symmetrized product of $N$ copies of the Hilbert space $\mathcal{H}$. By the symbol $\mathcal{H}^{P_n}_{(n)}$ we mean the Hilbert space of the twisted sector with a cycle of length $n$, in which only states which are invariant under the projection operator

$$P_n = \frac{1}{n}\sum_{k=1}^{n} e^{2\pi i k (L_0 - \bar L_0)} \tag{11.24}$$

are retained. The values of $L_0$ or $\bar L_0$ in the twisted sector of length $n$ are of the form $p/n$, where $p$ is a positive integer. This projection forces the value of $L_0 - \bar L_0$ to be an integer in the twisted sector. It arises because the black hole can exchange only integer valued Kaluza–Klein momentum with the bulk [108]. The dominant contribution to the partition function arises from the maximally twisted sector, that is, from the longest single cycle of length $Q_1Q_5$. It is given by

$$Z = \sum_{m, n} d(Q_1Q_5\, n + m)\, d(m)\, e^{2\pi i n\tau}\, e^{2\pi i m\tau/Q_1Q_5}\, e^{-2\pi i m\bar\tau/Q_1Q_5}, \tag{11.25}$$

where the $d$'s are the coefficients defined by the expansion

$$Z_{T^4} = \left(\frac{\Theta_2(0|\tau)}{\eta^3(\tau)}\right)^2 = \sum_{n \geq 0} d(n)\, e^{2\pi i\tau n}. \tag{11.26}$$
In the above equation $Z_{T^4}$ is the partition function of the holomorphic sector of the CFT on $T^4$. We will first evaluate the sum

$$P(m, \tau) = \sum_{n=0}^{\infty} d(Q_1Q_5\, n + m)\, e^{2\pi i n\tau}. \tag{11.27}$$

For large values of $Q_1Q_5$ we can use the asymptotic form of $d(Q_1Q_5\, n + m)$:

$$d(Q_1Q_5\, n + m) \sim \exp\left(2\pi\sqrt{Q_1Q_5\, n + m}\right). \tag{11.28}$$
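Substituting the above value of $d(Q_1Q_5\, n + m)$ in $P(m, \tau)$ we obtain a sum which has an integral representation, as shown below:

$$P(m, \tau) = \sum_{n=1}^{\infty} e^{2\pi\sqrt{Q_1Q_5\, n + m} + 2\pi i n\tau} + d(m) = \frac{i}{2}\, P\!\int_{-\infty}^{\infty} d\omega\, \coth(\pi\omega)\, e^{2\pi\sqrt{iQ_1Q_5\,\omega + m} - 2\pi\omega\tau} + d(m) - \frac{e^{2\pi\sqrt{m}}}{2}, \tag{11.29}$$

where $P$ denotes the "principal value" of the integral. We are interested in the high temperature limit of the partition function. The leading contribution to the integral in the limit $\tau \to 0$ is

$$P(m, \tau) \sim \frac{iQ_1Q_5}{\tau}\, e^{i\pi Q_1Q_5/2\tau - i2\pi m\tau/Q_1Q_5}. \tag{11.30}$$

Substituting the above value of $P(m, \tau)$, the partition function becomes

$$Z = \frac{iQ_1Q_5}{\tau}\, e^{i\pi Q_1Q_5/2\tau}\sum_{m=0}^{\infty} d(m)\, e^{-2\pi i m\bar\tau/Q_1Q_5} \sim \exp\left(i\pi Q_1Q_5\left(\frac{1}{2\tau} - \frac{1}{2\bar\tau}\right)\right). \tag{11.31}$$

Thus the free energy at high temperatures is given by

$$-\ln Z = -\frac{i\pi Q_1Q_5}{2}\left(\frac{1}{\tau} - \frac{1}{\bar\tau}\right). \tag{11.32}$$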
This exactly agrees with (11.19) with the identification $\tau = i\beta_+/(2\pi l)$.

CFT partition function corresponding to AdS$_3$. We will not attempt to calculate the AdS$_3$ partition function here (with fermion boundary conditions (AP, P)), except to note that by using spectral flow arguments [191] it is possible to show that the CFT result agrees with (11.18). Indeed, in [191] a more general agreement between the CFT elliptic genus and the corresponding AdS$_3$ quantity is demonstrated for the entire SL(2, Z) family of solutions, of which AdS$_3$ and BTZ are special cases.

11.2. Conical defects and particles in AdS$_3$

In this section we will mention some of the continuing developments regarding point particle dynamics and black hole formation in AdS$_3$. It has long been noted [192,193] that there is a one-parameter family of spacetimes, given by

$$ds^2 = -\left(\gamma + \frac{r^2}{l^2}\right)dt^2 + \frac{dr^2}{\gamma + r^2/l^2} + r^2\, d\phi^2, \tag{11.33}$$

which represent the so-called point masses in AdS$_3$. These spacetimes are all asymptotically AdS$_3$, and they interpolate between AdS$_3$ and BTZ ($\gamma = -M = 1$ is AdS$_3$, $\gamma = -M \leq 0$ is BTZ ($J = 0$), and $\gamma = -M \in (0, 1)$ are the conical spaces). The conical spaces are so called because they have a defect angle $\Delta = 2\pi(1 - \sqrt{\gamma})$ (see Appendix C). The parameter $\gamma$ is related to the "mass" of the point particle that causes the conical defect. In more detail, consider a free particle (a geodesic) $g$ in AdS$_3$, given by

$$Y(s) = e^{s p^a \gamma_a}, \tag{11.34}$$
where we have used the coordinates (B.3). The quantities $p^a$ play the role of "momenta" (see Section 11.2.1 for more detail). For a static particle

$$p^0 = m, \qquad p^1 = p^2 = 0. \tag{11.35}$$

The geodesic (11.34) satisfies, in the coordinates (C.9),

$$\phi(t) = 0, \qquad \rho(t) = 0. \tag{11.36}$$
We state without proof here (for more details see Section 11.2.1) that the gravitational back reaction of the point particle (11.34) amounts to cutting a wedge out of AdS$_3$ and identifying the edges. Such an identification is achieved by (C.19)–(C.22).

String theory embedding and CFT duals:

(1) It has been shown in [190] that the conical space $C$ (11.33) can be embedded as a solution of three-dimensional supergravity based on the supergroup SU(1, 1|2) × SU(1, 1|2) described by (11.3). The latter appears as the low energy description of type IIB string theory on $S^3 \times T^4$. At first sight supersymmetrization of conical spaces seems to be impossible, since the candidate Killing spinors [93] typically pick up phases when transported around the conical singularity and are therefore not single valued. It was shown in [194] that this problem could be avoided by employing an extended supersymmetry N = (2, 0) (see [195] for the notation N = (p, q) supergravity) and assigning a background value to the gauge field (which occurs in the supergravity multiplet). By choosing the background value appropriately, the gravitational holonomy picked up by the Killing spinors can be cancelled by the gauge holonomy. Since the extended supergravity N = (2, 0) can be embedded, further, in the SU(1, 1|2) × SU(1, 1|2) supergravity, which is N = (4, 4), the conical space $C$ can be embedded as a solution of this latter supergravity. We therefore end up with a solution $C \times S^3 \times T^4$ of type IIB supergravity [190]. The background value of the SU(2) gauge field is given by $A = (l/2)\,\gamma\, d\phi\, \tau_3$; from the point of view of IIB theory this is one of the "Kaluza–Klein" gauge fields that appear on reduction on $S^3$.
The CFT dual of the conical spaces (11.33), from this viewpoint, is identified [190] as the spectrally flowed CFT Hilbert space with spectral flow parameter
\[ \eta = \sqrt{\gamma}. \tag{11.37} \]
The spectral flow provides a one-parameter interpolation between the NS (\(\eta = 1\)) and R (\(\eta = 0\)) sectors of the CFT, just as the conical spaces (parameterized by \(\gamma\)) interpolate between AdS\(_3\) (\(\gamma = 1\)) and zero-mass BTZ (\(\gamma = 0\)). Note the energy of the spectrally flowed CFT ground state:
\[ L_0 |0\rangle_\eta = \bar{L}_0 |0\rangle_\eta = -\frac{c}{24}\,\eta^2\, |0\rangle_\eta. \]
This precisely agrees with the ADM mass of the conical spaces provided one uses (11.37). Indeed, the free energy at a finite temperature also agrees with the CFT calculation [190].

(2) In the above embedding, the identification (C.19) apparently acts only on the AdS\(_3\), and not on the \(S^3 \times T^4\), although the vev of the SU(2) gauge field does indirectly affect the \(S^3\). In [196,197] a different, though perhaps not entirely unrelated, embedding of (11.33) into type IIB theory is used, where the holonomy acts simultaneously on AdS\(_3\) and on \(S^3\) (as we go around the conical singularity in AdS\(_3\), we also go around a circle in \(S^3\)), leading to an embedding in type IIB theory as \((\mathrm{AdS}_3 \times S^3)/Z \times T^4\). This description naturally arises as the near-horizon limit of spinning black holes. The CFT duals of the conical spaces in this approach are (a microcanonical ensemble of) RR states which depend on the parameter \(\gamma\) (see [196], Eqs. (118)–(121)).

(3) For \(\gamma = 1/N^2\) the conical space (11.33) becomes the orbifold \(\mathrm{AdS}_3/Z_N\), which has an exact worldsheet CFT description and hence can be embedded as an exact solution in string theory [198]. This can be done in two ways:
• \(\mathrm{AdS}_3/Z_N \times S^3 \times K\). This theory is tachyonic, and forms a model of closed string tachyons.
• \((\mathrm{AdS}_3 \times S^3)/Z_N \times K\). This theory is supersymmetric and has no tachyons.
The AdS\(_3\)/CFT\(_2\) dual of the conical defect from this viewpoint is given in terms of a fractionally moded \(N = (4,4)\) superconformal algebra, the fractional moding being given by \(1/N\) spectral flow [198] from the Ramond sector.

11.2.1. Black hole creation by particle collision

In 3D gravity (\(\Lambda < 0\)) there are explicit solutions [199] where a BTZ black hole is created by point particle collision. We will very briefly mention some salient points here:
• The conical spaces (11.33) above are a special case of the conical spacetimes for a moving point particle.
We will describe here the simplest case of a moving particle, which turns out to be that of a massless particle. In this case, the geodesic \(g\) (11.34) is specified by the “momenta”
\[ p^0 = p^1 = \tan\epsilon, \qquad p^2 = 0. \tag{11.38} \]
In coordinates (C.9), \(g\) now satisfies
\[ \phi(t) = 0, \qquad \rho(t) = \tan(t/2). \tag{11.39} \]
The gravitational back-reaction of such a particle can be exactly computed, and the resulting spacetime is obtained by cutting out a wedge \(W\) from AdS\(_3\), with edges \(\partial W = w_+ \cup w_-\),
\[ w_\pm: \quad \frac{2\rho}{1+\rho^2}\,\sin(\epsilon \pm \phi) = \sin t\,\sin\epsilon, \]
where \(w_+\) and \(w_-\) are identified. The identification is achieved by quotienting AdS\(_3\) by an isometry as in Eqs. (C.19) and (C.20) with the momenta now given by (11.38). It is clear that the geodesic
\(g\) (11.39) is a fixed point set of the quotienting operation (C.19). Note that under this identification \(w_+ \equiv w_- = g\). Thus a moving particle is described by a wedge \(W\) as constructed above. The spacetime constructed this way is \(\mathrm{AdS}_3/Z\), where \(Z\) is a discrete subgroup of the isometry group as just described. Hence the resulting spacetime remains an exact solution of three-dimensional gravity with \(\Lambda < 0\). The energy of the particle is related to the parameter \(\epsilon\), which determines the holonomy of the conical spacetime.
• It is easy to generalize the above procedure to construct spacetimes representing two particles moving towards each other. Each geodesic, \(g^{(1)}\) or \(g^{(2)}\), represents a wedge, \(W^{(1)}\) or \(W^{(2)}\), with edges given by
\[ w^{(1)}_\pm: \quad \frac{2\rho}{1+\rho^2}\,\sin(\epsilon \pm \phi) = \sin t\,\sin\epsilon, \qquad w^{(2)}_\pm: \quad \frac{2\rho}{1+\rho^2}\,\sin(\epsilon \pm \phi) = -\sin t\,\sin\epsilon. \]
As before, \(w^{(1)}_+ \equiv w^{(1)}_- \equiv g^{(1)}\) under the holonomy matrix \(u^{(1)} = 1 + \tan\epsilon\,(\gamma_0 + \gamma_1)\), and similarly for the second particle. The full spacetime is represented in terms of two charts, one obtained by quotienting with the holonomy matrix \(u^{(1)} u^{(2)}\), and the other by quotienting with \(u^{(2)} u^{(1)}\). Once again, the full spacetime is \(\mathrm{AdS}_3/Z\), where the quotienting is by a discrete isometry subgroup; hence the resulting spacetime is an exact solution of AdS\(_3\) gravity.
• Since the above construction gives the full spacetime, the time development of the collision process is computed by looking at the Poincaré discs (see the definition near (C.9)) at various times \(t\). Thus, on the Poincaré disc corresponding to \(t = 0\), the two wedges \(W^{(1)}\) and \(W^{(2)}\) meet; this therefore represents the time when the two massless particles collide. At later times \(t > 0\), the Poincaré discs exhibit a single wedge (the two wedges get identified!) which corresponds to the worldline
\[ \frac{2\rho}{1+\rho^2} = \sin(t)\,\tan(\epsilon). \tag{11.40} \]
For energies \(\epsilon\) low enough that \(\pi/4 > \epsilon > 0\), we have \(\tan(\epsilon) < 1\), and it is easy to verify that (11.40) then represents a timelike worldline. In this energy range, therefore, the collision of two massless particles results in a single massive particle (see [199, Fig. 5]). For higher energies \(\pi/2 > \epsilon > \pi/4\), i.e. \(\tan(\epsilon) > 1\), the geodesic (11.40) is spacelike. As shown in [199], this spacelike worldline is identified with the future singularity of a BTZ black hole (see [199, Fig. 6]). Indeed, the holonomy matrix is identified as exactly the one appropriate for a BTZ black hole. The spacetime constructed this way therefore corresponds to the formation of a BTZ black hole by a collision of two particles. Once again, since the identifications used correspond to discrete isometries, we have an exact solution of AdS\(_3\) gravity. The embedding of the above solutions in string theory or AdS\(_3\) supergravity remains an open problem, although it is likely that the constructions described here will admit straightforward generalizations. A more interesting problem is to understand the CFT dual of these solutions. If we succeed in applying any one of the candidate CFT duals of conical spaces described above, we will have a unitary quantum mechanical description of black hole formation. We should mention that black hole formation in three dimensions can also be described as a collapsing scalar field configuration [200] or as collapsing dust shells [201]. The former process, which exhibits a critical scaling behaviour, has been discussed in the context of the AdS\(_3\)/CFT\(_2\)
correspondence in [202]. CFT duals of collapsing dust shell solutions in AdS spaces are discussed in detail in [203–205].
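The timelike/spacelike criterion for the post-collision worldline (11.40) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Poincaré-disc metric (C.9) in units l = 1, holds the angle fixed along the worldline, and uses the hypothetical function names `worldline_norm` and `collision_outcome`.

```python
import math


def worldline_norm(eps, t):
    """Return ds^2/dt^2 along 2*rho/(1+rho^2) = sin(t)*tan(eps), phi = const.

    Uses the disc metric (C.9) with l = 1 (an assumption of this sketch).
    A negative value means timelike, a positive value spacelike.
    """
    k = math.tan(eps)
    s = k * math.sin(t)  # s = 2*rho/(1+rho^2)
    if not 0.0 <= s < 1.0:
        raise ValueError("need 0 <= sin(t)*tan(eps) < 1 to stay inside the disc")
    # Root of s*rho^2 - 2*rho + s = 0 that lies inside the unit disc.
    rho = s / (1.0 + math.sqrt(1.0 - s * s))
    # Differentiate s(rho) = k*sin(t) to obtain d(rho)/dt.
    ds_drho = 2.0 * (1.0 - rho**2) / (1.0 + rho**2) ** 2
    drho_dt = k * math.cos(t) / ds_drho
    g_tt = -(((1.0 + rho**2) / (1.0 - rho**2)) ** 2)
    g_rr = (2.0 / (1.0 - rho**2)) ** 2
    return g_tt + g_rr * drho_dt**2


def collision_outcome(eps):
    """Classify the collision product by the threshold tan(eps) = 1."""
    if not 0.0 < eps < math.pi / 2:
        raise ValueError("eps must lie strictly between 0 and pi/2")
    return "massive particle" if math.tan(eps) < 1.0 else "BTZ black hole"


if __name__ == "__main__":
    for eps in (0.5, 1.0):
        print(eps, worldline_norm(eps, 0.3), collision_outcome(eps))
```

Analytically the sign of the norm reduces to the sign of tan²(ε) − 1 at every time along the worldline, which is why the threshold sits at ε = π/4.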
12. Concluding remarks and open problems

In this report (see Section 1.4 for a more detailed summary) we have presented a detailed calculable formalism of the near-extremal black hole (Sections 2 and 3) of type IIB string theory in terms of the D1–D5 system of branes (Sections 4 and 5). We discussed (Section 8) how far, for this black hole, the thermodynamics and the rates of Hawking radiation of all the massless particles can be reproduced from string theory to match the results derived in supergravity. The facility of extrapolating weak coupling calculations to the strong coupling regime is owed to the high degree of supersymmetry that exists in the effective Lagrangian of the low-energy degrees of freedom in the string theory (Sections 7 and 9). As we emphasized in this report, a crucial input (Section 6) in the calculation of Hawking rates comes from the AdS/CFT correspondence of Maldacena [42], without which the Hawking radiation from some massless scalars is impossible to calculate from CFT (in particular, we saw in Sections 8.7 and 8.6.1 that we get incorrect results if we use the earlier DBI approach to derive the interaction of the D1–D5 bound state with the Hawking quanta). We also presented a review of the AdS\(_3\)/CFT\(_2\) correspondence beyond the supergravity approximation (using the NS5 version, see Section 10), and some applications of this correspondence to black hole formation in three dimensions (Section 11) by thermal transition and by collision of point particles.

In the light of what has been achieved, the following open problems naturally suggest themselves:

(1) The emergence of AdS\(_3\) (\(\times S^3\)) spacetime from the \(N = (4,4)\) SCFT is an important open problem. In the context of AdS\(_5 \times S^5\), Dorey et al. [206] (see also [207]) have argued that this spacetime emerges from the moduli and the fermion zero modes associated with the large \(N\) saddle point of the dual \(N = 4\) SUSY gauge theory.
Another approach to the question of how the radial direction of the AdS space arises from the viewpoint of the boundary theory is via Liouville theory, where the Liouville or conformal degree of freedom is interpreted as a space dimension [208–211]. For a recent discussion of the formulation of holography in AdS\(_3\) using Liouville theory see [212,213]. Also see [214] for a discussion of black hole creation by point particles using Liouville theory.

(2) Although the D1–D5 system provides a correct derivation of the black hole entropy, there is no precise understanding of why the entropy is given by the area of the event horizon. The reason why we lack this understanding is that the counting of microstates (see Section 8) is performed in the dual theory. An important open question is how to understand these states in the language of supergravity/string theory. Related questions have been discussed in [215,216]. See also [217], where a conformal algebra is constructed in AdS\(_3\) supergravity with a central charge that coincides with that of the boundary CFT obtained from the D1–D5 system.

(3) A matrix string description of the D1–D5 black hole was proposed in [218]. It will be interesting to make a detailed comparison between this work and the microscopic formulation presented in this report.

(4) Most of the discussion (especially where it involves the micro-description) in the present report involves BPS or near-BPS black holes. There is a vast amount of literature on non-supersymmetric
black holes in supergravity, but we are quite far from a microscopic understanding of them. Works which address this problem include:
• D0-brane black holes (see [219] and references therein): Quantum mechanics of \(N\) D0-branes at a finite temperature is analysed at large ’t Hooft coupling \(g_s N\), using a mean-field theory approximation. This leads to an entropy in good agreement (over a certain range of temperatures) with the Bekenstein–Hawking entropy of a ten-dimensional type IIA black hole carrying 0-brane charge.
• Correspondence principle and string–black hole phase transitions: It has been shown [34,35] that the entropy formula for a large class of non-supersymmetric black holes agrees with that for a highly excited string state (with the same mass and charge) at a correspondence point where the curvature at the horizon becomes of the order of the string scale. This suggests a phase transition [36,37] from a string state to a black hole (see also [220,221]). A precise understanding of such a transition is obviously important.
• Closed string tachyons and black hole entropy [222]: A string theoretic version of the Gibbons–Hawking derivation [28] of the entropy of a Schwarzschild black hole is suggested. The Schwarzschild black hole is in fact replaced by a cone for computing the entropy. The corresponding orbifold string theory has a closed string tachyon whose dynamics is dealt with using the techniques of [223]. It is of obvious interest to understand this calculation in terms of microstates.

(5) One of the potential applications of the AdS/CFT correspondence (especially in the context of AdS\(_3\)) is black hole dynamics. In Section 11.2 we briefly discussed the problem of black hole formation by particle collisions in AdS\(_3\) (see also [202], which discusses Choptuik scaling [200] in the context of the AdS\(_3\)/CFT\(_2\) correspondence). Clearly, the problem of evaporation of black holes too maps into interesting time-dependent phenomena in the dual gauge theory/CFT.
The Euclidean AdS/CFT correspondence describes equilibrium physics; the approach to this equilibrium is an important unsolved problem.

(6) Issues of singularities and the D1–D5 system: The D1–D5 system (with K3 compactification) has been used to resolve naked singularities [13]. The detailed understanding of this mechanism is an important problem, especially in what precise sense the gauge theory acts as a source for the geometry, and how precisely the “matching” of the gauge theory and the geometry can be understood.

(7) Recently [224] a particular scaling limit of AdS spaces (“pp wave”) was found where string theory in the Ramond–Ramond background is solved exactly, leading to new insights into the AdS/CFT correspondence. It is important to explore the consequences for the D1–D5 system, in particular to understand from the CFT the IIB string spectrum in the pp-wave limit of AdS\(_3 \times S^3\) [224,225].

(8) We found in Section 2.6.3 that the two-dimensional black hole [94–96] arises as a limit of the non-extremal 5-brane. In [226] a holographic description of the two-dimensional black hole is proposed in terms of a quantum mechanical matrix model. It will be interesting to see how this matrix model fits into (a holographic description of) the five-brane theory, e.g. whether it can perhaps lead to the phenomenological model of [227] consisting of a gas of strings on the five-brane (with tension 1=(2Q5 gs ) and√central charge =6), that gives the Bekenstein–Hawking entropy of this black hole S = U02 gYM Q5 . It will be interesting to compare both these
approaches with the recent calculation [228] of the partition function of the two-dimensional black hole.
Acknowledgements

We would like to thank L. Alvarez-Gaume, S. Das, T. Damour, A. Dhar, M.R. Douglas, D. Gross, S.F. Hassan, K. Krasnov, K. Maeda, J. Maldacena, G. Moore, K.S. Narain, A. Sen, L. Susskind, G. Veneziano and E. Witten for useful discussions. The work of J.R.D. is supported by NSF grant PHY00-98396.
Appendix A. Euclidean derivation of Hawking temperature

We will present a derivation (the purist may regard this as a “mnemonic for a derivation”; for more detailed accounts see, e.g. [1,7,28]) of the Hawking temperature for a class of black holes represented by the metric
\[ ds^2 = -F(r)C(r)\,dt^2 + \frac{dr^2}{C(r)} + H(r)\,r^2\,d\Omega^2, \tag{A.1} \]
which of course includes (1.2) and (1.3). The black hole could be three, four or higher dimensional, the number of angles represented by \(d\Omega^2\) varying accordingly. We will assume that \(C(r)\) vanishes at the real value \(r = r_h\) and is non-vanishing for \(r > r_h\) (\(C(r) = 0\) can have roots smaller than \(r_h\); they are irrelevant for the present discussion). We will also assume that \(F(r)\) and \(H(r)\) are smooth and positive for \(r > r_h\). For asymptotically flat black holes, \(C(r), F(r), H(r) \to 1\) as \(r \to \infty\). The surface \(r = r_h\) corresponds to the location of the event horizon. We will focus our attention near \(r = r_h\), where
\[ C(r) = C'(r_h)(r - r_h) + O\big((r - r_h)^2\big). \tag{A.2} \]
It is useful to define a new radial coordinate \(\rho\):
\[ d\rho^2 = dr^2/C(r), \qquad r \in\, ]r_h, \infty], \qquad \rho \in\, ]0, \infty]. \tag{A.3} \]
In the near-horizon region (A.2),
\[ \rho = 2\sqrt{\frac{r - r_h}{C'(r_h)}}\;[1 + O(r - r_h)]. \tag{A.4} \]
The near-horizon metric is given by
\[ ds^2 = -dt^2\,F(r_h)\,[C'(r_h)]^2\,\frac{\rho^2}{4} + d\rho^2 + H(r_h)\,r_h^2\,d\Omega^2. \tag{A.5} \]
The Euclidean continuation \(t = -i\tau\) is given by
\[ ds_E^2 = d\tau^2\,F(r_h)\,[C'(r_h)]^2\,\frac{\rho^2}{4} + d\rho^2 + H(r_h)\,r_h^2\,d\Omega^2. \tag{A.6} \]
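If we require the \((\tau, \rho)\) plane not to have a conical singularity [28], we must assign the following periodicity:
\[ \tau \equiv \tau + \beta, \qquad \beta = \frac{4\pi}{C'(r_h)\sqrt{F(r_h)}}. \tag{A.7} \]
To see this, write \(\tau = (\beta/2\pi)\,\theta\), so that
\[ ds_E^2 = \rho^2\,d\theta^2 \left[\frac{\beta\,C'(r_h)\sqrt{F(r_h)}}{4\pi}\right]^2 + d\rho^2 + H(r_h)\,r_h^2\,d\Omega^2. \tag{A.8} \]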
Absence of a conical singularity implies that the quantity inside the square brackets must be equal to one. A periodic Euclidean time with period \(\beta\) implies a temperature
\[ T_H = \hbar/\beta = \frac{\hbar\,C'(r_h)\sqrt{F(r_h)}}{4\pi}. \tag{A.9} \]
• Example 1: for the RN black hole (1.3), \(r_h = r_+\), \(C(r) = (1 - r_-/r)(1 - r_+/r)\), \(F(r) = 1\), hence \(C'(r_h) = (r_+ - r_-)/r_+^2\), leading to (1.7).
• Example 2: for the non-extremal five-dimensional black hole (2.40), \(r_h = r_0\), \(C(r) = h f^{-1/3}\), \(F(r) = f^{-1/3}\), \(H(r) = f^{1/3}\), hence
\[ C'(r_h)\sqrt{F(r_h)} = h'(r_0)\,f(r_0)^{-1/2} = 2 r_0^{-1}\,(\cosh\alpha_1 \cosh\alpha_5 \cosh\alpha_n)^{-1}, \]
leading to (2.45).
Entropy and the
r+ − r− r+2 r+ − r − d = ; GN 2GN r+ 4r+2
r + + r− 1 = dr+ = TH dS + ' dQ : 2GN 2GN
(A.11)
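The temperature formula (A.9) and the first-law check for the RN black hole can be reproduced numerically. The sketch below is illustrative only and rests on labelled assumptions: units with ħ = G_N = 1, the standard RN relations M = (r₊ + r₋)/(2G_N), S = πr₊²/G_N and Φ = Q/r₊, together with Q² = r₊r₋/G_N as quoted in this appendix. The function names are hypothetical.

```python
import math

G_N = 1.0  # Newton's constant; set to one for this sketch (assumption)


def hawking_temperature(C, F, r_h, dr=1e-6):
    """T_H = C'(r_h) * sqrt(F(r_h)) / (4*pi), Eq. (A.9) with hbar = 1.

    C'(r_h) is estimated by a central finite difference.
    """
    c_prime = (C(r_h + dr) - C(r_h - dr)) / (2.0 * dr)
    return c_prime * math.sqrt(F(r_h)) / (4.0 * math.pi)


def rn_state(r_plus, r_minus):
    """Return (M, Q, S, T, Phi) for an RN black hole (standard relations assumed)."""
    M = (r_plus + r_minus) / (2.0 * G_N)
    Q = math.sqrt(r_plus * r_minus / G_N)
    S = math.pi * r_plus**2 / G_N
    T = (r_plus - r_minus) / (4.0 * math.pi * r_plus**2)
    Phi = Q / r_plus
    return M, Q, S, T, Phi


def first_law_residual(r_plus, r_minus, d_plus, d_minus):
    """dM - (T dS + Phi dQ) for a small change (d_plus, d_minus) of the radii."""
    M0, Q0, S0, T, Phi = rn_state(r_plus, r_minus)
    M1, Q1, S1, _, _ = rn_state(r_plus + d_plus, r_minus + d_minus)
    return (M1 - M0) - (T * (S1 - S0) + Phi * (Q1 - Q0))


if __name__ == "__main__":
    r_plus, r_minus = 2.0, 0.5
    C = lambda r: (1.0 - r_minus / r) * (1.0 - r_plus / r)
    F = lambda r: 1.0
    print(hawking_temperature(C, F, r_plus), rn_state(r_plus, r_minus)[3])
    print(first_law_residual(r_plus, r_minus, 1e-6, 2e-6))
```

The residual is of second order in the variations, so it shrinks quadratically as the step is reduced.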
In the first line we have used \(Q^2 = r_+ r_-/G_N \Rightarrow Q\,dQ = \tfrac{1}{2}\,d(Q^2) = (r_-/2G_N)\,dr_+\). We thus verify Eq. (1.9).

Appendix B. A brief heuristic motivation for Rules 1 and 2 of Section 2.4

We present a brief, heuristic motivation for the algorithm of Section 2.4. Suppose we view a static Schwarzschild black hole, of ADM mass \(m\), from the viewpoint of the five-dimensional
(\(R^4 \times S^1\)) Kaluza–Klein theory (we use coordinates \(x^{0,1,2,3}\) for \(R^4\) and \(x^5\) for the \(S^1\)). The 5-momentum \((p^0, \vec{p}, p^5)\) would be given by
\[ p^0 = m, \qquad p^5 \propto \text{charge} = 0, \qquad \vec{p} = 0. \tag{B.1} \]
The \(p^5\) equation follows because the Schwarzschild black hole is neutral. A way of generating charged solutions (see, e.g. [229–231] for details) is to unwrap the above solution in five non-compact dimensions, perform a boost in the 0–5 plane (which is a symmetry of the non-compact theory), and compactify the new \(x^5\) direction to get a charged (RN) black hole in four non-compact dimensions. The momenta should transform as
\[ p^0 \equiv M = m\cosh\delta, \qquad p^5 \equiv Q = m\sinh\delta, \qquad \vec{p} = 0. \tag{B.2} \]
In the above, we have absorbed the factor of radius in the definition of the charge so that the extremality condition reads \(M = Q\). The extremal limit can be attained by
\[ \delta \to \infty, \qquad m \to 0, \qquad m e^{\delta} \to \text{constant}, \tag{B.3} \]
so that
\[ Q \to M = m e^{\delta}/2, \qquad p_R \equiv p^0 - p^5 \to 0. \tag{B.4} \]
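The boost relations (B.2) and the extremal limit (B.3)–(B.4) are easy to check with a few lines of arithmetic. This is a sketch under stated assumptions: the boost parameter is called `delta` here, and the helper names `boosted_charges` and `extremal_sequence` are hypothetical.

```python
import math


def boosted_charges(m, delta):
    """Return (M, Q) = (m*cosh(delta), m*sinh(delta)) as in Eq. (B.2)."""
    return m * math.cosh(delta), m * math.sinh(delta)


def extremal_sequence(M0, deltas):
    """Follow the limit (B.3): keep m*exp(delta)/2 fixed at M0 while delta grows."""
    rows = []
    for delta in deltas:
        m = 2.0 * M0 * math.exp(-delta)
        M, Q = boosted_charges(m, delta)
        rows.append((delta, m, M, Q, M - Q))  # last entry is p_R = p0 - p5
    return rows


if __name__ == "__main__":
    for row in extremal_sequence(1.0, (1.0, 3.0, 6.0, 10.0)):
        print("delta=%5.1f  m=%.3e  M=%.9f  Q=%.9f  p_R=%.3e" % row)
```

Two facts are visible in the output: M² − Q² = m² holds at every step (the boost preserves the invariant mass), and p_R = m e^{−δ} goes to zero while M and Q approach the same constant.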
Near-extremal limit: The near-extremal limit is obtained by keeping the leading corrections in the expansion parameter \(e^{-\delta}\). Thus,
\[ Q/M = \tanh\delta \simeq 1 - m^2/(2\tilde{Q}^2), \qquad p_R \ll p_L, \qquad p_L \equiv p^0 + p^5. \tag{B.5} \]
In terms of these parameters, the four-dimensional metric for a near-extremal charged (RN) black hole is given by
\[ ds_4^2 = -f\,dt^2 + f^{-1}\,dr^2 + r^2\,d\Omega_2^2, \]
\[ f \equiv 1 - 2M/r + \tilde{Q}^2/r^2 = f_{\rm ext}\,h(r), \qquad f_{\rm ext} \equiv (1 - \tilde{Q}/r)^2, \qquad h(r) = (1 - \mu/r), \qquad \mu = m^2/\tilde{Q}. \tag{B.6} \]
The last equality implies
\[ \tilde{Q} = \mu\,\sinh^2\delta, \tag{B.7} \]
same as (2.36) above. Also, the second equality agrees with Rule 1 for relating the non-extremal \(g_{tt}, g_{rr}\) to their extremal counterparts. Of course, we have considered here only the near-extremal case. The remarkable thing about the algorithm mentioned in Section 2.4 is that it works for arbitrary deviations from extremality.

Appendix C. Coordinate systems for AdS\(_3\) and related spaces

For a more detailed discussion and derivations, see, e.g. [10,232]. This appendix is meant to serve as a compendium of some definitions and results about the geometry of AdS\(_3\) and related spaces. We discuss both Lorentzian and Euclidean signatures.
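C.1. AdS\(_3\)

AdS\(_3\) is defined as a hyperboloid
\[ -Y_0^2 - Y_{-1}^2 + Y_1^2 + Y_2^2 \equiv Y_+ Y_- + \sum_{\mu=0,1} Y_\mu Y^\mu = -l^2 \tag{C.1} \]
in \(R^{2,2}\) with metric
\[ ds^2 = -dY_0^2 - dY_{-1}^2 + dY_1^2 + dY_2^2 = dY_\mu\,dY^\mu + dY_+\,dY_-. \tag{C.2} \]
Here \(Y_\pm = Y_2 \pm Y_{-1}\). Condition (C.1) can be equivalently stated by saying that a point in AdS\(_3\) is represented by an SL(2) matrix \((1/l)\mathbf{Y}\), where
\[ \mathbf{Y} = Y_{-1}\,\mathbf{1} + Y^a \gamma_a, \qquad a = 0, 1, 2, \tag{C.3} \]
where
\[ \gamma_0 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad \gamma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{C.4} \]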
Global coordinates:
\[ Y_{-1} = l\cosh\chi\,\sin t, \qquad Y_0 = l\cosh\chi\,\cos t, \qquad Y_1 = l\sinh\chi\,\cos\phi, \qquad Y_2 = l\sinh\chi\,\sin\phi. \tag{C.5} \]
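As a quick consistency check, the global coordinates (C.5) can be substituted into the hyperboloid condition (C.1) and into the matrix form (C.3). The sketch below is illustrative: it treats the components \(Y^a\) as the plain numbers \((Y_0, Y_1, Y_2)\), which is an assumption about index placement, and the function names are hypothetical.

```python
import math


def embed_global(l, chi, t, phi):
    """Global coordinates (C.5): return (Y_-1, Y_0, Y_1, Y_2)."""
    return (
        l * math.cosh(chi) * math.sin(t),
        l * math.cosh(chi) * math.cos(t),
        l * math.sinh(chi) * math.cos(phi),
        l * math.sinh(chi) * math.sin(phi),
    )


def hyperboloid(Y):
    """Left-hand side of (C.1): -Y_0^2 - Y_-1^2 + Y_1^2 + Y_2^2."""
    ym1, y0, y1, y2 = Y
    return -y0**2 - ym1**2 + y1**2 + y2**2


def as_matrix(Y):
    """Matrix Y_-1 * 1 + Y^a gamma_a of (C.3), with the gamma matrices of (C.4)."""
    ym1, y0, y1, y2 = Y
    return ((ym1 + y2, y0 + y1), (-y0 + y1, ym1 - y2))


def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    l = 2.0
    Y = embed_global(l, 0.7, 1.1, 2.3)
    print(hyperboloid(Y), -l**2)      # both should equal -l^2
    print(det2(as_matrix(Y)), l**2)   # det = l^2, so (1/l) Y has unit determinant
```

The determinant equals \(Y_{-1}^2 + Y_0^2 - Y_1^2 - Y_2^2\), so the unit-determinant condition on \((1/l)\mathbf{Y}\) is the same statement as (C.1).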
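The metric is
\[ ds^2 = l^2\left(-\cosh^2\chi\,dt^2 + d\chi^2 + \sinh^2\chi\,d\phi^2\right). \tag{C.6} \]
By redefining \(\sinh\chi = r/l\) we get
\[ ds^2 = -l^2\left(1 + \frac{r^2}{l^2}\right)dt^2 + \frac{dr^2}{1 + r^2/l^2} + r^2\,d\phi^2. \tag{C.7} \]
Range of coordinates:
\[ \phi \in [0, 2\pi], \qquad \chi\ (\text{equivalently } r) \in (0, \infty), \qquad t \in [0, 2\pi]. \tag{C.8} \]
If we unwrap \(t\) to the range \(t \in (-\infty, \infty)\) we get the so-called CAdS\(_3\) (covering space), which is geodesically complete. There is another popular form, given by the coordinate transformation \(\rho = \tanh(\chi/2)\), leading to (in units with \(l = 1\))
\[ ds^2 = -\left(\frac{1+\rho^2}{1-\rho^2}\right)^2 dt^2 + \left(\frac{2}{1-\rho^2}\right)^2 \left(d\rho^2 + \rho^2\,d\phi^2\right). \tag{C.9} \]
The section at any given \(t\) has the metric of a disc, called the Poincaré disc.

Poincaré coordinates:
\[ Y_+ = l^2/u, \qquad Y^\mu = l\,x^\mu/u \]
(\(Y_-\) determined by (C.1)). The metric is
\[ ds^2 = \frac{l^2}{u^2}\left(du^2 + dx_\mu\,dx^\mu\right). \tag{C.10} \]
Range: \(u \in (0, \infty)\), \(x^0 = t \in (-\infty, \infty)\), \(x^1 \in (-\infty, \infty)\); this covers half of AdS\(_3\) (C.1).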
C.2. BTZ black hole

The three-dimensional black hole [92,232] in an asymptotically AdS spacetime (\(\Lambda = -1/l^2\)), of mass \(M\) and angular momentum \(J\), is given by the metric
\[ ds^2 = -N^2(r)\,dt^2 + \frac{dr^2}{N^2(r)} + r^2\left(N^\phi(r)\,dt + d\phi\right)^2, \]
\[ N^2(r) = \frac{r^2}{l^2} - M + \left(\frac{J}{2r}\right)^2, \qquad N^\phi(r) = \frac{J}{2r^2}, \tag{C.11} \]
Range: \(t \in (-\infty, \infty)\), \(r \in (0, \infty)\), \(\phi \in [0, 2\pi]\).

Properties:
• The lapse function \(N(r)\) vanishes at \(r = r_\pm\), where
\[ r_\pm = l\left[\frac{M}{2}\left(1 \pm \sqrt{1 - \left(\frac{J}{Ml}\right)^2}\,\right)\right]^{1/2}. \tag{C.12} \]
Thus,
\[ N^2(r) = \frac{1}{l^2 r^2}\,(r^2 - r_+^2)(r^2 - r_-^2), \qquad N^\phi(r) = \frac{r_+ r_-}{r^2 l}. \tag{C.13} \]
\(r_+\) corresponds to the event horizon.
• \(g_{00} = -N(r)^2 + r^2 N^\phi(r)^2 = -(r^2/l^2 - M)\) vanishes at \(r_{\rm erg} = l\sqrt{M}\), which represents the surface of infinite red-shift.
• These three special values of \(r\) satisfy \(r_- \le r_+ \le r_{\rm erg}\). The region between \(r_+\) and \(r_{\rm erg}\) is called the “ergosphere”.

Furthermore, the above black hole (C.11) can be obtained as a quotient of the AdS\(_3\) space (C.1). We show this for the \(J = 0\), \(M > 0\) BTZ, given by
\[ ds^2 = -\left(\frac{r^2}{l^2} - M\right)dt^2 + \left(\frac{r^2}{l^2} - M\right)^{-1}dr^2 + r^2\,d\phi^2. \tag{C.14} \]
In the patch \(Y_{-1}^2 > Y_1^2\), \(Y_0^2 \le Y_2^2\) we define
\[ Y_{-1} = \pm\frac{r}{\sqrt{M}}\cosh\!\left(\sqrt{M}\,\phi\right), \qquad Y_0 = \left(\frac{r^2}{M} - l^2\right)^{1/2}\sinh\!\left(\frac{\sqrt{M}\,t}{l}\right), \]
\[ Y_1 = \frac{r}{\sqrt{M}}\sinh\!\left(\sqrt{M}\,\phi\right), \qquad Y_2 = \pm\left(\frac{r^2}{M} - l^2\right)^{1/2}\cosh\!\left(\frac{\sqrt{M}\,t}{l}\right). \tag{C.15} \]
With this, the metric (C.2) coincides with (C.14). In (C.14) the angle \(\phi \equiv \phi + 2\pi\). This implies, through (C.15), a discrete quotienting of the original hyperboloid. We thus find that the BTZ metric is equivalent to \(\mathrm{AdS}_3/Z\). (We have shown this here in a certain coordinate patch, but it can be proved more generally [232].)
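The horizon radii (C.12), the factorized lapse (C.13) and the embedding (C.15) can be verified numerically. The following sketch is illustrative rather than definitive: the function names are hypothetical, the upper signs are taken in (C.15), and the sample values of M, J and l are arbitrary.

```python
import math


def btz_horizons(M, J, l):
    """Return (r_plus, r_minus) from Eq. (C.12); requires |J| <= M*l."""
    disc = math.sqrt(1.0 - (J / (M * l)) ** 2)
    r_plus = l * math.sqrt(0.5 * M * (1.0 + disc))
    r_minus = l * math.sqrt(0.5 * M * (1.0 - disc))
    return r_plus, r_minus


def lapse_squared(r, M, J, l):
    """N^2(r) = r^2/l^2 - M + (J/2r)^2, as in Eq. (C.11)."""
    return r**2 / l**2 - M + (J / (2.0 * r)) ** 2


def lapse_squared_factored(r, M, J, l):
    """N^2(r) in the factorized form of Eq. (C.13)."""
    r_plus, r_minus = btz_horizons(M, J, l)
    return (r**2 - r_plus**2) * (r**2 - r_minus**2) / (l**2 * r**2)


def embed_nonrotating(r, t, phi, M, l):
    """Embedding (C.15) for J = 0 (upper signs): return (Y_-1, Y_0, Y_1, Y_2)."""
    a = r / math.sqrt(M)
    b = math.sqrt(r**2 / M - l**2)  # real only outside the horizon, r >= l*sqrt(M)
    s = math.sqrt(M)
    return (
        a * math.cosh(s * phi),
        b * math.sinh(s * t / l),
        a * math.sinh(s * phi),
        b * math.cosh(s * t / l),
    )


if __name__ == "__main__":
    M, J, l = 1.0, 0.6, 1.0
    r_plus, r_minus = btz_horizons(M, J, l)
    print(r_plus, r_minus, l * math.sqrt(M))  # r_- <= r_+ <= r_erg
    print(lapse_squared(r_plus, M, J, l))     # vanishes at the horizon
    ym1, y0, y1, y2 = embed_nonrotating(2.0, 0.4, 1.0, M, l)
    print(-y0**2 - ym1**2 + y1**2 + y2**2)    # equals -l^2, Eq. (C.1)
```

The product of the two radii reproduces the angular momentum through \(r_+ r_- = Jl/2\), which is how the two expressions for \(N^\phi\) in (C.11) and (C.13) agree.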
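C.3. Conical spaces

\[ ds^2 = -l^2\left(\gamma + \frac{r^2}{l^2}\right)dt^2 + \frac{dr^2}{\gamma + r^2/l^2} + r^2\,d\phi^2. \tag{C.16} \]
Range: \(r \in (0, \infty)\), \(t \in (-\infty, \infty)\), \(\phi \in [0, 2\pi]\).

Define \(r = \sqrt{\gamma}\,\tilde{r}\), \(t = \tilde{t}/\sqrt{\gamma}\), \(\phi = \tilde{\phi}/\sqrt{\gamma}\) to get
\[ ds^2 = -l^2\left(1 + \frac{\tilde{r}^2}{l^2}\right)d\tilde{t}^2 + \frac{d\tilde{r}^2}{1 + \tilde{r}^2/l^2} + \tilde{r}^2\,d\tilde{\phi}^2, \tag{C.17} \]
\[ \text{Range}: \quad \tilde{\phi} \in [0, 2\pi\sqrt{\gamma}]. \tag{C.18} \]
Defect angle \(\Delta = 2\pi(1 - \sqrt{\gamma})\).

Conic as \(\mathrm{AdS}_3/Z\). We will show that (C.17) can be obtained from (C.1) and (C.2) modulo the following identification (using the notation of (C.3)):
\[ \mathbf{Y} \equiv \mathbf{u}^{-1}\,\mathbf{Y}\,\mathbf{u}. \tag{C.19} \]
The holonomy matrix \(\mathbf{u}\) is given by the momentum \(p^a\), \(a = 0, 1, 2\), of the particle:
\[ \mathbf{u} = u\,\mathbf{1} + p^a \gamma_a. \tag{C.20} \]
In the case of a static particle of mass \(m\),
\[ p^0 = m, \qquad p^1 = p^2 = 0. \tag{C.21} \]
Hence \(\mathbf{u} = u\,\mathbf{1} + m\gamma_0\), where \(u = \sqrt{1 - m^2}\) by the SL(2) condition. The identification (C.19) affects only the components \(Y_1, Y_2\) and reads
\[ \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \equiv \begin{pmatrix} 1 - 2m^2 & -2m\sqrt{1 - m^2} \\ 2m\sqrt{1 - m^2} & 1 - 2m^2 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}. \]
Note that the matrix is an SO(2) rotation matrix with \(\cos(\theta) = 1 - 2m^2\). In terms of the coordinates (C.5) and (C.17) this implies an identification
\[ \tilde{\phi} \equiv \tilde{\phi} + \cos^{-1}(1 - 2m^2). \]
Comparing with (C.18) we get a relation between the parameter \(\gamma\) and the mass \(m\):
\[ 1 - 2m^2 = \cos(2\pi\sqrt{\gamma}). \tag{C.22} \]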
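The claim that the static-particle holonomy acts as a rotation in the \((Y_1, Y_2)\) plane can be checked by explicit 2×2 matrix multiplication. The sketch below is illustrative: it uses the \(\gamma\) matrices of (C.4), treats \(Y^a\) as plain numbers, and the helper names are hypothetical. Depending on whether one conjugates with \(\mathbf{u}\) or \(\mathbf{u}^{-1}\), the rotation comes out as the matrix quoted above or its inverse; both generate the same identification.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def holonomy(m):
    """u = sqrt(1 - m^2) * 1 + m * gamma_0 for a static particle, Eqs. (C.20)-(C.21)."""
    c = math.sqrt(1.0 - m * m)
    return ((c, m), (-m, c))


def inverse_unimodular(u):
    """Inverse of a 2x2 matrix with unit determinant."""
    return ((u[1][1], -u[0][1]), (-u[1][0], u[0][0]))


def from_components(ym1, y0, y1, y2):
    """Y = Y_-1 * 1 + Y^a gamma_a, Eq. (C.3)."""
    return ((ym1 + y2, y0 + y1), (-y0 + y1, ym1 - y2))


def to_components(Y):
    """Invert from_components: return (Y_-1, Y_0, Y_1, Y_2)."""
    return (
        0.5 * (Y[0][0] + Y[1][1]),
        0.5 * (Y[0][1] - Y[1][0]),
        0.5 * (Y[0][1] + Y[1][0]),
        0.5 * (Y[0][0] - Y[1][1]),
    )


def identify(components, m):
    """Apply Y -> u^{-1} Y u, Eq. (C.19), to a point given by its components."""
    u = holonomy(m)
    Y = from_components(*components)
    return to_components(matmul(inverse_unimodular(u), matmul(Y, u)))


if __name__ == "__main__":
    m = 0.6
    print(identify((0.3, -0.2, 1.0, 0.0), m))
    print(1.0 - 2.0 * m * m, -2.0 * m * math.sqrt(1.0 - m * m))
```

The first two components come back unchanged, while \((Y_1, Y_2) = (1, 0)\) is rotated to \((1 - 2m^2, -2m\sqrt{1 - m^2})\), so the rotation angle satisfies \(\cos\theta = 1 - 2m^2\) as stated.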
C.4. Euclidean sections and thermal physics

Euclidean AdS\(_3\) can be defined by the global coordinates of (C.7) with the replacement \(t = -i t_E\), leading to the metric
\[ ds^2 = l^2\left(1 + \frac{r^2}{l^2}\right)dt_E^2 + \frac{dr^2}{1 + r^2/l^2} + r^2\,d\phi^2, \tag{C.23} \]
where (cf. (C.8))
\[ \phi \equiv \phi + 2\pi. \tag{C.24} \]
Like (C.3) in the Lorentzian case, a point in Euclidean AdS\(_3\) can alternatively be defined as a Hermitian matrix \((1/l)\mathbf{Y}\) of unit determinant, where
\[ \mathbf{Y} = Y_{-1}\,\mathbf{1} + Y^a \tilde{\gamma}_a, \qquad a = 0, 1, 2, \tag{C.25} \]
and the new \(\tilde{\gamma}\) matrices are obtained by replacing \(\gamma_0\) in (C.4) by \(\tilde{\gamma}_0 = i\gamma_0\). The determinant condition now reads
\[ -Y_{-1}^2 + Y_0^2 + Y_1^2 + Y_2^2 = -l^2. \tag{C.26} \]
The metric in this parameterization is
\[ ds^2 = \mathrm{Tr}\,d\mathbf{Y}^2 = -dY_{-1}^2 + dY_0^2 + dY_1^2 + dY_2^2. \tag{C.27} \]
To make contact with the global coordinates of (C.23) we first introduce the following parameterization:
\[ \mathbf{Y}/l = \begin{pmatrix} e^{u} & 0 \\ 0 & e^{-u} \end{pmatrix} \begin{pmatrix} \sqrt{1 + r^2/l^2} & r/l \\ r/l & \sqrt{1 + r^2/l^2} \end{pmatrix} \begin{pmatrix} e^{\bar{u}} & 0 \\ 0 & e^{-\bar{u}} \end{pmatrix}. \tag{C.28} \]
In order to cover the space only once, we must identify
\[ 2u \equiv 2u + 2\pi i\,n. \tag{C.29} \]
The metric (C.27) then becomes
\[ ds^2 = l^2\left(1 + \frac{r^2}{l^2}\right)(du + d\bar{u})^2 + \frac{dr^2}{1 + r^2/l^2} - r^2\,(du - d\bar{u})^2. \tag{C.30} \]
A comparison of the periodicities (C.29) and (C.24) suggests that we define
\[ 2u = t_E + i\phi. \tag{C.31} \]
With this definition we recover (C.23) from (C.30). For Euclidean AdS\(_3\), the periodicity in the \(t_E\) direction is to be supplied as an input from physics. A thermal ensemble implies in the usual fashion \(t_E \equiv t_E + 1/T\). In addition to a temperature one may want to introduce an angular potential \(\Phi\) (conjugate to the angular momentum \(J \sim i\,\partial/\partial\phi\)); such an ensemble, described by \(\mathrm{Tr}\exp[-H/T + i\Phi J]\), implies a more general (twisted) identification
\[ \phi + i t_E \equiv \phi + i t_E + \Phi + i/T := \phi + i t_E + i\beta, \tag{C.32} \]
where the second step is a definition of the “complex temperature” (cf. (A.7))
\[ \beta = 1/T - i\Phi. \tag{C.33} \]
Euclidean Poincaré coordinates. The Euclidean version of (C.10) is defined by
\[ \mathbf{Y}/l = \begin{pmatrix} h + w\bar{w}/h & w/h \\ \bar{w}/h & 1/h \end{pmatrix}. \tag{C.34} \]
The metric (C.27) becomes
\[ ds^2 = \frac{l^2}{h^2}\left(dw\,d\bar{w} + dh^2\right). \tag{C.35} \]
Euclidean BTZ. Euclidean BTZ is usually de1ned by de1ning t = −it; JE = iJ in (C.11): 2 2 −1 2 2 2 r J r JE E 2 2 2 2 iJE dt + 2 − M − dr + r dt + d ; ds = 2 − M − l 2r l 2r 2r 2
(C.36)
where (cf. (C.11))
\[ \phi \equiv \phi + 2\pi. \tag{C.37} \]
As in the Lorentzian case, we can obtain the metric (C.36) as EBTZ = EAdS\(_3/Z\) [90,191], as follows. Define a quotient of (C.25) by [92]
\[ \mathbf{Y} \equiv \begin{pmatrix} e^{-i\pi\tau} & 0 \\ 0 & e^{i\pi\tau} \end{pmatrix} \mathbf{Y} \begin{pmatrix} e^{i\pi\bar{\tau}} & 0 \\ 0 & e^{-i\pi\bar{\tau}} \end{pmatrix}. \tag{C.38} \]
In terms of (C.28) the above identification (C.38) reads
\[ 2u \equiv 2u + 2\pi i\,n\,\tau. \tag{C.39} \]
Note that this identification is in addition to (C.29). The two identifications define two independent cycles in the \(u\)-plane. We will show below that the description (C.36) follows by identifying the cycle (C.39) with the “space” cycle, namely (C.37). This can be easily done by defining
\[ 2u = -i\tau(\phi + i t_E). \tag{C.40} \]
The cycle (C.29) now becomes the “time” cycle, unlike in the case of thermal AdS\(_3\), where it was “space”. If we define
\[ \tau = |r_-| - i r_+ \tag{C.41} \]
and a new radial coordinate r˜ in terms of the r of (C.28), as follows: r˜2 =l2 − $22 r 2 =l2 = ; |$|2 where l2 M JE2 2 1± 1+ 2 2 r± = 2 M l
(C.42)
(C.43)
(note that r− = i|r− | is purely imaginary), then the metric on, for r˜ ¿ r+ , becomes (C.36) once we drop the tilde from r. ˜ Temperature. Note that the periodicity along the new “time” cycle (C.29), through (C.40) implies the following complex periodicity (complex temperature): + it ≡ + it + i=0 ; =0 = 2i(1=$) =
2r+ 2|r− | +i 2 : 2 2 − r− r+ − r−
r+2
The real temperature T0 ≡ (R=0 )−1 is given by 2 r 2 − r− T0 = + 2r+ which agrees with the expression in [92].
(C.44)
(C.45)
Entropy. As can be derived from the partition function calculation in Section 11.1, the entropy of the BTZ black hole is given by [92]
\[ S = \frac{2\pi r_+}{4 G_N^{(3)}}, \tag{C.46} \]
which of course agrees with the Bekenstein–Hawking formula (1.10) as well.

References

[1] S.W. Hawking, W. Israel, General Relativity: an Einstein Centenary Survey, Cambridge University Press, Cambridge, UK, 1979, 919pp.
[2] M.B. Green, J.H. Schwarz, E. Witten, Superstring Theory. Vol. 1: Introduction; Vol. 2: Loop amplitudes, anomalies and phenomenology, Cambridge Monographs on Mathematical Physics, Cambridge University Press, Cambridge, UK, 1987, 469pp.
[3] J. Polchinski, String Theory. Vol. 1: An Introduction to the bosonic string; Vol. 2: Superstring Theory and Beyond, Cambridge University Press, Cambridge, UK, 1998, 402pp.
[4] T. Yoneya, Connection of dual models to electrodynamics and gravidynamics, Prog. Theor. Phys. 51 (1974) 1907–1920.
[5] J. Scherk, J.H. Schwarz, Dual models and the geometry of space–time, Phys. Lett. B 52 (1974) 347.
[6] C.W. Misner, K.S. Thorne, J.A. Wheeler, Gravitation, Freeman, 1970, 1279pp.
[7] R.M. Wald, General relativity, Chicago University Press, Chicago, USA, 1984, 491pp.
[8] R.P. Feynman, F.B. Morinigo, W.G. Wagner, B. Hatfield (Ed.), Feynman Lectures on Gravitation, Addison-Wesley, Reading, USA, 1995, 232pp (The advanced book program).
[9] S. Weinberg, Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity, Wiley, New York, 1972, 657pp.
[10] S. Hawking, G. Ellis, The large scale structure of space time, Cambridge University Press, Cambridge, UK, 1973, 391pp.
[11] G.T. Horowitz, R.C. Myers, The value of singularities, Gen. Rel. Grav. 27 (1995) 915–919, gr-qc/9503062.
[12] C.V. Johnson, N. Kaloper, R.R. Khuri, R.C. Myers, Is string theory a theory of strings? Phys. Lett. B 368 (1996) 71–77, hep-th/9509070.
[13] C.V. Johnson, A.W. Peet, J. Polchinski, Gauge theory and the excision of repulson singularities, Phys. Rev. D 61 (2000) 086001, hep-th/9911161.
[14] S.S. Gubser, Curvature singularities: the good, the bad, and the naked, hep-th/0002160.
[15] M. Natsuume, The singularity problem in string theory, gr-qc/0108059.
[16] S.W. Hawking, Particle creation by black holes, Commun. Math. Phys. 43 (1975) 199–220.
[17] S.W. Hawking, Breakdown of predictability in gravitational collapse, Phys. Rev. D 14 (1976) 2460–2473.
[18] G. ’t Hooft, Black holes, Hawking radiation, and the information paradox, Nucl. Phys. Proc. Suppl. 43 (1995) 1–11.
[19] S.B. Giddings, The black hole information paradox, hep-th/9508151.
[20] S.M. Carroll, The cosmological constant, Living Rev. Rel. 4 (2001) 1, astro-ph/0004075.
[21] S. Weinberg, The cosmological constant problems, astro-ph/0005265.
[22] E. Witten, The cosmological constant from the viewpoint of string theory, hep-ph/0002297.
[23] A. Celotti, J.C. Miller, D.W. Sciama, Astrophysical evidence for the existence of black holes, astro-ph/9912186.
[24] P.T. Chrusciel, ‘No hair’ theorems: folklore, conjectures, results, gr-qc/9402032.
[25] J.D. Bekenstein, Black hole hair: twenty-five years after, gr-qc/9605059.
[26] B. Carter, Has the black hole equilibrium problem been solved?, gr-qc/9712038.
[27] D. Marolf, Some brane-theoretic no-hair results (and their field-theory duals), gr-qc/9908017.
[28] G.W. Gibbons, S.W. Hawking, Action integrals and partition functions in quantum gravity, Phys. Rev. D 15 (1977) 2752–2756.
[29] J.D. Bekenstein, Black holes and entropy, Phys. Rev. D 7 (1973) 2333–2346.
[30] J.D. Bekenstein, Generalized second law of thermodynamics in black hole physics, Phys. Rev. D 9 (1974) 3292–3300.
[31] J.M. Bardeen, B. Carter, S.W. Hawking, The four laws of black hole mechanics, Commun. Math. Phys. 31 (1973) 161–170.
[32] S.W. Hawking, Black holes and thermodynamics, Phys. Rev. D 13 (1976) 191–197.
[33] G. ’t Hooft, The scattering matrix approach for the quantum black hole: an overview, Int. J. Mod. Phys. A 11 (1996) 4623–4688, gr-qc/9607022.
[34] L. Susskind, Some speculations about black hole entropy in string theory, hep-th/9309145.
[35] G.T. Horowitz, J. Polchinski, A correspondence principle for black holes and strings, Phys. Rev. D 55 (1997) 6189–6197, hep-th/9612146.
[36] G.T. Horowitz, J. Polchinski, Self gravitating fundamental strings, Phys. Rev. D 57 (1998) 2557–2563, hep-th/9707170.
[37] T. Damour, G. Veneziano, Self-gravitating fundamental strings and black holes, Nucl. Phys. B 568 (2000) 93–119, hep-th/9907030.
[38] A. Dhar, G. Mandal, S.R. Wadia, Absorption vs decay of black holes in string theory and T-symmetry, Phys. Lett. B 388 (1996) 51–59, hep-th/9605234.
[39] S.R. Das, S.D. Mathur, Comparing decay rates for black holes and D-branes, Nucl. Phys. B 478 (1996) 561–576, hep-th/9606185.
[40] J. Maldacena, A. Strominger, Black hole greybody factors and D-brane spectroscopy, Phys. Rev. D 55 (1997) 861–870, hep-th/9609026.
[41] O. Aharony, S.S. Gubser, J. Maldacena, H. Ooguri, Y. Oz, Large N field theories, string theory and gravity, Phys. Rep. 323 (2000) 183–386, hep-th/9905111.
[42] J. Maldacena, The large N limit of superconformal field theories and supergravity, Adv. Theor. Math. Phys. 2 (1998) 231–252, hep-th/9711200.
[43] J. de Boer, Six-dimensional supergravity on S3 × AdS3 and 2d conformal field theory, Nucl. Phys. B 548 (1999) 139–166, hep-th/9806104.
[44] C.G. Callan, J.M. Maldacena, D-brane approach to black hole quantum mechanics, Nucl. Phys. B 472 (1996) 591–610, hep-th/9602043.
[45] K.S. Stelle, BPS branes in supergravity, hep-th/9803116.
[46] N.A. Obers, B. Pioline, U-duality and M-theory, Phys. Rep. 318 (1999) 113–225, hep-th/9809039.
[47] J.P. Gauntlett, Intersecting branes, hep-th/9705011.
[48] J.M. Maldacena, Black holes in string theory, hep-th/9607235.
[49] M. Cvetic, Properties of black holes in toroidally compactified string theory, Nucl. Phys. Proc. Suppl. B 56 (1997) 1–10, hep-th/9701152.
[50] D. Youm, Black holes and solitons in string theory, Phys. Rep. 316 (1999) 1–232, hep-th/9710046.
[51] J.R. David, String theory and black holes, hep-th/9911003.
[52] G. Mandal, A review of the D1/D5 system and five dimensional black hole from supergravity and brane viewpoints, hep-th/0002184.
[53] S.R. Wadia, Lectures on the microscopic modeling of the 5-dim black hole of IIB string theory and the D1/D5 system, hep-th/0006190.
[54] A.W. Peet, The Bekenstein formula and string theory (N-brane theory), Class. Quant. Grav. 15 (1998) 3291–3338, hep-th/9712253.
[55] K. Skenderis, Black holes and branes in string theory, Lect. Notes Phys. 541 (2000) 325–364, hep-th/9901050.
[56] M.J. Duff, Lectures on branes, black holes and anti-de Sitter space, hep-th/9912164.
[57] P.K. Townsend, The eleven-dimensional supermembrane revisited, Phys. Lett. B 350 (1995) 184–187, hep-th/9501068.
[58] E. Witten, String theory dynamics in various dimensions, Nucl. Phys. B 443 (1995) 85–126, hep-th/9503124.
[59] G. Papadopoulos, P.K. Townsend, Intersecting M-branes, Phys. Lett. B 380 (1996) 273–279, hep-th/9603087.
[60] A.A. Tseytlin, Harmonic superpositions of M-branes, Nucl. Phys. B 475 (1996) 149–163, hep-th/9604035.
[61] M.J. Duff, K.S. Stelle, Multi-membrane solutions of D = 11 supergravity, Phys. Lett. B 253 (1991) 113–118.
[62] E. Witten, Anti-de Sitter space and holography, Adv. Theor. Math. Phys. 2 (1998) 253–291, hep-th/9802150.
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
681
[63] S.S. Gubser, I.R. Klebanov, A.M. Polyakov, Gauge theory correlators from non-critical string theory, Phys. Lett. B 428 (1998) 105–114, hep-th=9802109. [64] E. Cremmer, B. Julia, J. Scherk, Supergravity theory in 11 dimensions, Phys. Lett. B 76 (1978) 409–412. [65] J.A. deAzcarraga, J.P. Gauntlett, J.M. Izquierdo, P.K. Townsend, Topological extensions of the supersymmetry algebra for extended objects, Phys. Rev. Lett. 63 (1989) 2443. [66] G.W. Gibbons, P.K. Townsend, Vacuum interpolation in supergravity via super p-branes, Phys. Rev. Lett. 71 (1993) 3754–3757, hep-th=9307049. [67] S. Surya, D. Marolf, Localized branes and black holes, Phys. Rev. D 58 (1998) 124013, hep-th=9805121. [68] G.T. Horowitz, A. Strominger, Black strings and p-branes, Nucl. Phys. B 360 (1991) 197–209. [69] S.F. Hassan, T-duality, space-time spinors and R–R 1elds in curved backgrounds, Nucl. Phys. B 568 (2000) 145–161, hep-th=9907152. [70] R. Gregory, R. LaCamme, Black strings and p-branes are unstable, Phys. Rev. Lett. 70 (1993) 2837–2840, hep-th=9301052. [71] A. Strominger, C. Vafa, Microscopic origin of the Bekenstein–Hawking entropy, Phys. Lett. B 379 (1996) 99–104, hep-th=9601029. [72] D. Gar1nkle, T. Vachaspati, Cosmic string traveling waves, Phys. Rev. D 42 (1990) 1960–1963. [73] M.S. Bremer, H. Lu, C.N. Pope, K.S. Stelle, Dirac quantisation conditions and Kaluza–Klein reduction, Nucl. Phys. B 529 (1998) 259–294, hep-th=9710244. [74] M. Cvetic, A.A. Tseytlin, Non-extreme black holes from non-extreme intersecting M-branes, Nucl. Phys. B 478 (1996) 181–198, hep-th=9606033. [75] G.T. Horowitz, J.M. Maldacena, A. Strominger, Nonextremal black hole microstates and U-duality, Phys. Lett. B 383 (1996) 151–159, hep-th=9603109. [76] A. Sen, Strong coupling dynamics of branes from M-theory, JHEP 10 (1997) 002, hep-th=9708002. [77] E. Eyras, S. Panda, The spacetime life of a non-BPS D-particle, Nucl. Phys. B 584 (2000) 251–283, hep-th=0003033. [78] Y. 
Lozano, Non-BPS D-brane solutions in six dimensional orbifolds, Phys. Lett. B 487 (2000) 180–186, hep-th=0003226. [79] P. Brax, G. Mandal, Y. Oz, Supergravity description of non-BPS branes, Phys. Rev. D 63 (2001) 064008, hep-th=0005242. [80] M. Bertolini, et al., Is a classical description of stable non-BPS D-branes possible? Nucl. Phys. B 590 (2000) 471–503, hep-th=0007097. [81] P. Bain, Taming the supergravity description of non-BPS D-branes: The D=anti-D solution, JHEP 04 (2001) 014, hep-th=0012211. [82] K.A. Intriligator, M. Kleban, J. Kumar, Comments on unstable branes, JHEP 02 (2001) 023, hep-th=0101010. [83] Y.C. Liang, E. Teo, Black diholes with unbalanced magnetic charges, Phys. Rev. D 64 (2001) 024019, hep-th=0101221. [84] R. Emparan, E. Teo, Macroscopic and microscopic description of black diholes, Nucl. Phys. B 610 (2001) 190–214, hep-th=0104206. [85] S.H.S. Alexander, InCation from D–anti-D brane annihilation, Phys. Rev. D 65 (2002) 023507, hep-th=0105032. [86] G.L. Alberghi, E. Caceres, K. Goldstein, D.A. Lowe, Stacking non-BPS D-branes, Phys. Lett. B 520 (2001) 361–366, hep-th=0105205. [87] J.M. Maldacena, J.G. Russo, Large N limit of non-commutative gauge theories, JHEP 09 (1999) 025, hep-th=9908134. [88] A. Dhar, G. Mandal, S.R. Wadia, K.P. Yogendran, D1=D5 system with B-1eld, noncommutative geometry and the CFT of the Higgs branch, Nucl. Phys. B 575 (2000) 177–194, hep-th=9910194. [89] S. Ferrara, G.W. Gibbons, R. Kallosh, Black holes and critical points in moduli space, Nucl. Phys. B 500 (1997) 75–93, hep-th=9702103. [90] J. Maldacena, A. Strominger, AdS3 black holes and a stringy exclusion principle, JHEP 12 (1998) 005, hep-th=9804085. [91] K. Sfetsos, K. Skenderis, Microscopic derivation of the Bekenstein–Hawking entropy formula for non-extremal black holes, Nucl. Phys. B 517 (1998) 179–204, hep-th=9711138.
682
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
[92] M. Banados, C. Teitelboim, J. Zanelli, The black hole in three-dimensional space-time, Phys. Rev. Lett. 69 (1992) 1849–1851, hep-th=9204099. [93] O. Coussaert, M. Henneaux, Supersymmetry of the (2 + 1) black holes, Phys. Rev. Lett. 72 (1994) 183–186, hep-th=9310194. [94] G. Mandal, A.M. Sengupta, S.R. Wadia, Classical solutions of two-dimensional string theory, Mod. Phys. Lett. A 6 (1991) 1685–1692. [95] E. Witten, On string theory and black holes, Phys. Rev. D 44 (1991) 314–324. [96] S. Elitzur, A. Forge, E. Rabinovici, Some global aspects of string compacti1cations, Nucl. Phys. B 359 (1991) 581–610. [97] N. Itzhaki, J.M. Maldacena, J. Sonnenschein, S. Yankielowicz, Supergravity and the large N limit of theories with sixteen supercharges, Phys. Rev. D 58 (1998) 046004, hep-th=9802042. [98] J.M. Maldacena, A. Strominger, Semiclassical decay of near-extremal 1vebranes, JHEP 12 (1997) 008, hep-th=9710014. [99] J. Maharana, J.H. Schwarz, Noncompact symmetries in string theory, Nucl. Phys. B 390 (1993) 3–32, hep-th=9207016. [100] E. BergshoeL, C.M. Hull, T. Ortin, Duality in the type II superstring eLective action, Nucl. Phys. B 451 (1995) 547–578, hep-th=9504081. [101] C.G. Callan, S.S. Gubser, I.R. Klebanov, A.A. Tseytlin, Absorption of 1xed scalars and the D-brane approach to black holes, Nucl. Phys. B 489 (1997) 65–94, hep-th=9610172. [102] N.D. Birrell, P.C.W. Davies, Quantum Fields in Curved Space, Cambridge University Press, Cambridge, UK, 1982, 340pp. [103] I.R. Klebanov, M. Krasnitz, Fixed scalar greybody factors in 1ve and four dimensions, Phys. Rev. D 55 (1997) 3250–3254, hep-th=9612051. [104] M. Krasnitz, I.R. Klebanov, Testing eLective string models of black holes with 1xed scalars, Phys. Rev. D 56 (1997) 2173–2179, hep-th=9703216. [105] M.M. Taylor-Robinson, Absorption of 1xed scalars, hep-th=9704172. [106] H.W. Lee, Y.S. Myung, J.Y. Kim, Absorption of 1xed scalars in scattering oL 5d black holes, Phys. Rev. D 58 (1998) 104006, hep-th=9708099. 
[107] S.S. Gubser, Dynamics of D-brane black holes, hep-th=9908004. [108] J.R. David, G. Mandal, S.R. Wadia, Absorption and Hawking radiation of minimal and 1xed scalars, and AdS=CFT correspondence, Nucl. Phys. B 544 (1999) 590–611, hep-th=9808168. [109] J. Polchinski, TASI lectures on D-branes, hep-th=9611050. [110] M.R. Douglas, J. Polchinski, A. Strominger, Probing 1ve-dimensional black holes with D-branes, JHEP 12 (1997) 003, hep-th=9703031. [111] S.F. Hassan, S.R. Wadia, Gauge theory description of D-brane black holes: emergence of the eLective SCFT and Hawking radiation Nucl. Phys. B 526 (1998) 311–333, hep-th=9712213. [112] W. Taylor IV, D-brane 1eld theory on compact spaces, Phys. Lett. B 394 (1997) 283–287, hep-th=9611042. [113] E. Witten, On the conformal 1eld theory of the Higgs branch, JHEP 07 (1997) 003, hep-th=9707093. [114] N. Seibergm, E. Witten, The D1=D5 system and singular CFT, JHEP 04 (1999) 017, hep-th=9903224. [115] J.L. Cardy, Operator content of two-dimensional conformally invariant theories, Nucl. Phys. B 270 (1986) 186–204. [116] C. Vafa, Instantons on D-branes, Nucl. Phys. B 463 (1996) 435–442, hep-th=9512078. [117] M.R. Douglas, Branes within branes, hep-th=9512077. [118] E. Witten, Sigma models and the ADHM construction of instantons, J. Geom. Phys. 15 (1995) 215–226, hep-th=9410052. [119] R. Dijkgraaf, Instanton strings and hyperKaehler geometry, Nucl. Phys. B 543 (1999) 545–571, hep-th=9810210. [120] C. Vafa, Gas of D-branes and Hagedorn density of BPS states, Nucl. Phys. B 463 (1996) 415–419, hep-th=9511088. [121] A. Sen, U-duality and intersecting D-branes, Phys. Rev. D 53 (1996) 2874–2894, hep-th=9511026. [122] M. Yu, The unitary representations of the N = 4, SU (2) extended superconformal algebras, Nucl. Phys. B 294 (1987) 890. [123] O. Aharony, M. Berkooz, E. Silverstein, Non-local string theories on AdS3 × S 3 and stable non-supersymmetric backgrounds, Phys. Rev. D 65 (2002) 106007, hep-th=0112178.
J.R. David et al. / Physics Reports 369 (2002) 549 – 686 [124] [125] [126] [127] [128] [129] [130] [131] [132] [133] [134] [135] [136] [137] [138] [139] [140] [141] [142] [143] [144] [145] [146] [147] [148] [149] [150] [151] [152] [153] [154]
683
E. Witten, Multi-trace operators, boundary conditions, and AdS=CFT correspondence, hep-th=0112258. L.J. Dixon, J.A. Harvey, C. Vafa, E. Witten, Strings on orbifolds, Nucl. Phys. B 261 (1985) 678–686. L.J. Dixon, J.A. Harvey, C. Vafa, E. Witten, Strings on orbifolds. 2, Nucl. Phys. B 274 (1986) 285–314. R. Dijkgraaf, G.W. Moore, E. Verlinde, H. Verlinde, Elliptic genera of symmetric products and second quantized strings, Commun. Math. Phys. 185 (1997) 197–209, hep-th=9608096. L.J. Dixon, D. Friedan, E.J. Martinec, S.H. Shenker, The conformal 1eld theory of orbifolds, Nucl. Phys. B 282 (1987) 13–73. M. Cvetic, ELective Lagrangian of the (blownup) oribifolds, in: G. Furlan et al. (Eds.), Superstrings, Uni1ed Theories and Cosmology, 1987. S.F. Hassan, S.R. Wadia, D-brane black holes: large-N limit and the eLective string description Phys. Lett. B 402 (1997) 43–52, hep-th=9703163. R. Dijkgraaf, E. Verlinde, H. Verlinde, 5d black holes and matrix strings, Nucl. Phys. B 506 (1997) 121–142, hep-th=9704018. E. Witten, Constraints on supersymmetry breaking, Nucl. Phys. B 202 (1982) 253. H.J. Boonstra, B. Peeters, K. Skenderis, Duality and asymptotic geometries, Phys. Lett. B 411 (1997) 59–67, hep-th=9706192. P. Claus, et al., Black holes and superconformal mechanics, Phys. Rev. Lett. 81 (1998) 4553–4556, hep-th=9804177. M. Gunaydin, G. Sierra, P.K. Townsend, The unitary supermultiplets of d = 3 anti-de Sitter and d = 2 conformal superalgebras, Nucl. Phys. B 274 (1986) 429. J.R. David, Anti-de Sitter gravity associated with the supergroup SU (1; 1|2) × SU (1; 1|2), Mod. Phys. Lett. A 14 (1999) 1143–1148, hep-th=9904068. E. Gava, A.B. Hammou, J.F. Morales, K.S. Narain, D1=D2 systems in N = 4 string theories, Nucl. Phys. B 605 (2001) 17–63, hep-th=0012118. E. Gava, A.B. Hammou, J.F. Morales, K.S. Narain, AdS=CFT correspondence and D1=D5 systems in theories with 16 supercharges, JHEP 03 (2001) 035, hep-th=0102043. A. Giveon, D. Kutasov, N. 
Seiberg, Comments on string theory on AdS3 , Adv. Theor. Math. Phys. 2 (1998) 733–780, hep-th=9806194. E. Witten, AdS=CFT correspondence and topological 1eld theory, JHEP 12 (1998) 012, hep-th=9812012. J.R. David, G. Mandal, S.R. Wadia, D1=D5 moduli in SCFT and gauge theory, and Hawking radiation, Nucl. Phys. B 564 (2000) 103–127, hep-th=9907075. F. Larsen, E.J. Martinec, U(1) charges and moduli in the D1–D5 system, JHEP 06 (1999) 019, hep-th=9905064. C. Vafa, Quantum symmetries of string vacua, Mod. Phys. Lett. A 4 (1989) 1615. A. Mikhailov, D1D5 system and noncommutative geometry, Nucl. Phys. B 584 (2000) 545–588, hep-th=9910126. O. Aharony, M. Berkooz, IR dynamics of d = 2; N = (4; 4) gauge theories and DLCQ of ‘little string theories’, JHEP 10 (1999) 030, hep-th=9909101. P.S. Aspinwall, Enhanced gauge symmetries and K3 surfaces, Phys. Lett. B 357 (1995) 329–334, hep-th=9507012. E. Witten, Some comments on string dynamics, hep-th=9507121. C.P. Bachas, Lectures on D-branes, hep-th=9806199. G. Arutyunov, A. Pankiewicz, S. Theisen, Cubic couplings in D = 6N = 4b supergravity on AdS3 × S 3 , Phys. Rev. D 63 (2001) 044024, hep-th=0007061. D.Z. Freedman, S.D. Mathur, A. Matusis, L. Rastelli, Correlation functions in the CFTd =AdSd+1 correspondence, Nucl. Phys. B 546 (1999) 96–118, hep-th=9804058. O. Lunin, S.D. Mathur, Correlation functions for M N =S N orbifolds, Commun. Math. Phys. 219 (2001) 399–442, hep-th=0006196. O. Lunin, S.D. Mathur, Three-point functions for M N =S N orbifolds with n = 4 supersymmetry, Commun. Math. Phys. 227 (2002) 385–419, hep-th=0103169. G.E. Arutyunov, S.A. Frolov, Virasoro amplitude from the S N R24 orbifold sigma model, Theor. Math. Phys. 114 (1998) 43–66, hep-th=9708129. G.E. Arutyunov, S.A. Frolov, Four graviton scattering amplitude from S N R8 supersymmetric orbifold sigma model, Nucl. Phys. B 524 (1998) 159–206, hep-th=9712061.
684
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
[155] S.S. Gubser, Absorption of photons and fermions by black holes in four dimensions, Phys. Rev. D 56 (1997) 7854–7868, hep-th=9706100. [156] D. Birmingham, I. Sachs, S. Sen, Three-dimensional black holes and string theory, Phys. Lett. B 413 (1997) 281–286, hep-th=9707188. [157] H.J.W. Muller-Kirsten, N. Ohta, J.-G. Zhou, AdS3 =CFT correspondence, poincare vacuum state and greybody factors in BTZ black holes, Phys. Lett. B 445 (1999) 287–295, hep-th=9809193. [158] N. Ohta, J.-G. Zhou, Thermalization of poincare vacuum state and fermion emission from AdS3 black holes in bulk-boundary correspondence, JHEP 12 (1998) 023, hep-th=9811057. [159] H.W. Lee, Y.S. Myung, Holographic connection between the BTZ black hole and 5d black hole, Phys. Rev. D 58 (1998) 104013, hep-th=9804095. [160] H.W. Lee, Y.S. Myung, Scattering from an AdS3 bubble and an exact AdS3 , Phys. Rev. D 61 (2000) 024031, hep-th=9903054. [161] E. Teo, Black hole absorption cross-sections and the anti-de Sitter-conformal 1eld theory correspondence, Phys. Lett. B 436 (1998) 269–274, hep-th=9805014. [162] I.R. Klebanov, A. Rajaraman, A.A. Tseytlin, Intermediate scalars and the eLective string model of black holes, Nucl. Phys. B 503 (1997) 157–176, hep-th=9704112. [163] S. Cecotti, N = 2 Landau–Ginzburg versus Calabi–Yau sigma models: nonperturbative aspects Int. J. Mod. Phys. A 6 (1991) 1749–1814. [164] E. Witten, Dynamical breaking of supersymmetry, Nucl. Phys. B 188 (1981) 513. [165] J.M. Maldacena, G.W. Moore, A. Strominger, Counting BPS black holes in toroidal type II string theory, hep-th=9903163. [166] S.R. Das, The eLectiveness of D-branes in the description of near-extremal black holes, Phys. Rev. D 56 (1997) 3582–3590, hep-th=9703146. [167] A.B. Zamolodchikov, Irreversibility’ of the Cux of the renormalization group in a 2-d 1eld theory, JETP Lett. 43 (1986) 730–732. [168] I. Pesando, The GS type IIB superstring action on AdS3 × S 3 × T 4 , JHEP 02 (1999) 007, hep-th=9809145. [169] J. 
Rahmfeld, A. Rajaraman, The GS string action on AdS3 × S 3 with Ramond–Ramond charge, Phys. Rev. D 60 (1999) 064014, hep-th=9809164. [170] J. Park, S.-J. Rey, Green-Schwarz superstring on AdS3 × S 3 , JHEP 01 (1999) 001, hep-th=9812062. [171] M. Yu, B. Zhang, Light-cone gauge quantization of string theories on AdS3 space, Nucl. Phys. B 551 (1999) 425–449, hep-th=9812216. [172] N. Berkovits, C. Vafa, E. Witten, Conformal 1eld theory of AdS background with Ramond–Ramond Cux, JHEP 03 (1999) 018, hep-th=9902098. [173] L. Dolan, E. Witten, Vertex operators for AdS(3) background with Ramond–Ramond Cux, JHEP 11 (1999) 003, hep-th=9910205. [174] J. Maldacena, H. Ooguri, Strings in AdS3 and SL(2; R) WZW model. I, J. Math. Phys. 42 (2001) 2929–2960, hep-th=0001053. [175] J. Maldacena, H. Ooguri, J. Son, Strings in AdS3 and the SL(2; R) WZW model. II: Euclidean black hole J. Math. Phys. 42 (2001) 2961–2977, hep-th=0005183. [176] J. Maldacena, H. Ooguri, Strings in AdS3 and the SL(2; R) WZW model. III: Correlation functions, hep-th=0111180. [177] P. Lee, H. Ooguri, J. Park, Boundary states for AdS2 branes in AdS3 , hep-th=0112188. [178] B. Ponsot, V. Schomerus, J. Teschner, Branes in the Euclidean AdS3 , hep-th=0112198. [179] L.J. Dixon, M.E. Peskin, J. Lykken, N = 2 superconformal symmetry and SO(2; 1) current algebra, Nucl. Phys. B 325 (1989) 329–355. [180] P. Breitenlohner, D.Z. Freedman, Positive energy in anti-de Sitter backgrounds and gauged extended supergravity, Phys. Lett. B 115 (1982) 197. [181] J.M. Evans, M.R. Gaberdiel, M.J. Perry, The no-ghost theorem for AdS3 and the stringy exclusion principle, Nucl. Phys. B 535 (1998) 152–170, hep-th=9806024. [182] Y. Satoh, On string theory in AdS3 backgrounds, hep-th=0005169. [183] M. Wakimoto, Fock representations of the aKne lie algebra A1(1), Commun. Math. Phys. 104 (1986) 605–609. [184] C.G. Callan, J.A. Harvey, A. Strominger, Supersymmetric string solitons, hep-th=9112030.
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
685
[185] D.J. Gross, M.J. Perry, L.G. YaLe, Instability of Cat space at 1nite temperature, Phys. Rev. D 25 (1982) 330–355. [186] S.W. Hawking, D.N. Page, Thermodynamics of black holes in anti-de Sitter space, Commun. Math. Phys. 87 (1983) 577. [187] M. Nishimura, Y. Tanii, Super weyl anomalies in the AdS=CFT correspondence, Int. J. Mod. Phys. A 14 (1999) 3731–3744, hep-th=9904010. [188] H. Nicolai, H. Samtleben, N = 8 matter coupled AdS3 supergravities, Phys. Lett. B 514 (2001) 165–172, hep-th=0106153. [189] E. Witten, Anti-de Sitter space, thermal phase transition, and con1nement in gauge theories, Adv. Theor. Math. Phys. 2 (1998) 505–532, hep-th=9803131. [190] J.R. David, G. Mandal, S. Vaidya, S.R. Wadia, Point mass geometries, spectral Cow and AdS3 –CFT2 correspondence, Nucl. Phys. B 564 (2000) 128–141, hep-th=9906112. [191] R. Dijkgraaf, J. Maldacena, G.W. Moore, E. Verlinde, A black hole farey tail, hep-th=0005003. [192] S. Deser, R. Jackiw, G. ’t Hooft, Three-dimensional Einstein gravity: dynamics of Cat space Ann. Phys. 152 (1984) 220. [193] S. Deser, R. Jackiw, Three-dimensional cosmological gravity: dynamics of constant curvature Ann. Phys. 153 (1984) 405–416. [194] J.M. Izquierdo, P.K. Townsend, Supersymmetric space-times in (2 + 1) ads supergravity models, Class. Quant. Grav. 12 (1995) 895–924, gr-qc=9501018. [195] A. Achucarro, P.K. Townsend, Extended supergravities in d = (2 + 1) as Chern–Simons theories, Phys. Lett. B 229 (1989) 383. [196] V. Balasubramanian, J. deBoer, E. Keski-Vakkuri, S.F. Ross, Supersymmetric conical defects: towards a string theoretic description of black hole formation Phys. Rev. D 64 (2001) 064011, hep-th=0011217. [197] J. Maldacena, L. Maoz, De-singularization by rotation, hep-th=0012025. [198] E.J. Martinec, W. McElgin, String theory on AdS orbifolds, hep-th=0106171. [199] H.-J. Matschull, Black hole creation in 2+1-dimensions, Class. Quant. Grav. 16 (1999) 1069–1095, gr-qc=9809087. [200] M.W. 
Choptuik, Universality and scaling in gravitational collapse of a massless scalar 1eld, Phys. Rev. Lett. 70 (1993) 9–12. [201] Y. Peleg, A.R. Steif, Phase transition for gravitationally collapsing dust shells in (2 + 1)-dimensions, Phys. Rev. D 51 (1995) 3992–3996, gr-qc=9412023. [202] D. Birmingham, Choptuik scaling and quasinormal modes in the AdS=CFT correspondence, Phys. Rev. D 64 (2001) 064024, hep-th=0101194. [203] U.H. Danielsson, E. Keski-Vakkuri, M. Kruczenski, Spherically collapsing matter in ads, holography, and shellons, Nucl. Phys. B 563 (1999) 279–292, hep-th=9905227. [204] U.H. Danielsson, E. Keski-Vakkuri, M. Kruczenski, Black hole formation in ads and thermalization on the boundary, JHEP 02 (2000) 039, hep-th=9912209. [205] S.B. Giddings, A. Nudelman, Gravitational collapse and its boundary description in ads, JHEP 02 (2002) 003, hep-th=0112099. [206] N. Dorey, T.J. Hollowood, V.V. Khoze, M.P. Mattis, S. Vandoren, Multi-instantons and Maldacena’s conjecture, JHEP 06 (1999) 023, hep-th=9810243. [207] M. Blau, K.S. Narain, G. Thompson, Instantons, the information metric, and the AdS=CFT correspondence, hep-th=0108122. [208] S.R. Das, S.Naik, S.R. Wadia, Quantization of the Liouville mode and string theory, Mod. Phys. Lett. A 4 (1989) 1033. [209] A. Dhar, T. Jayaraman, K.S. Narain, S.R. Wadia, The role of quantized two-dimensional gravity in string theory, Mod. Phys. Lett. A 5 (1990) 863. [210] S.R. Das, A. Dhar, S.R. Wadia, Critical behavior in two-dimensional quantum gravity and equations of motion of the string, Mod. Phys. Lett. A 5 (1990) 799. [211] A. Dhar, S.R. Wadia, Noncritical strings, RG Cows and holography, Nucl. Phys. B 590 (2000) 261–272, hep-th=0006043. [212] K. Krasnov, 3D gravity, point particles and Liouville theory, Class. Quant. Grav. 18 (2001) 1291–1304, hep-th=0008253.
686
J.R. David et al. / Physics Reports 369 (2002) 549 – 686
[213] K. Krasnov, Lambda ¡ 0 quantum gravity in 2 + 1 dimensions. I: Quantum states and stringy S-matrix, hep-th=0112164. [214] K. Krasnov, Lambda ¡ 0 quantum gravity in 2 + 1 dimensions. II: Black hole creation by point particles, hep-th=0202117. [215] T. Banks, M.R. Douglas, G.T. Horowitz, E.J. Martinec, AdS dynamics from conformal 1eld theory, hep-th=9808016. [216] V. Balasubramanian, S.B. Giddings, A.E. Lawrence, What do CFTs tell us about anti-de Sitter spacetimes? JHEP 03 (1999) 001, hep-th=9902052. [217] J.D. Brown, M. Henneaux, Central charges in the canonical realization of asymptotic symmetries: an example from three-dimensional gravity, Commun. Math. Phys. 104 (1986) 207–226. [218] R. Dijkgraaf, E. Verlinde, H. Verlinde, 5d black holes and matrix strings, Nucl. Phys. B 506 (1997) 121–142, hep-th=9704018. [219] D. Kabat, G. Lifschytz, D.A. Lowe, Black hole entropy from non-perturbative gauge theory, Phys. Rev. D 64 (2001) 124015, hep-th=0105171. [220] S.A. Abel, J.L.F. Barbon, I.I. Kogan, E. Rabinovici, Some thermodynamical aspects of string theory, hep-th=9911004. [221] J.L.F. Barbon, E. Rabinovici, Closed-string tachyons and the hagedorn transition in AdS space, hep-th=0112173. [222] A. Dabholkar, Tachyon condensation and black hole entropy, Phys. Rev. Lett. 88 (2002) 091301, hep-th=0111004. [223] A. Adams, J. Polchinski, E. Silverstein, Don’t panic! closed string tachyons in ALE space-times, JHEP 10 (2001) 029, hep-th=0108075. [224] D. Berenstein, J. Maldacena, H. Nastase, Strings in Cat space and pp waves from N = 4 super yang mills, hep-th=0202021. [225] L.A.P. Zayas, J. Sonnenschein, On Penrose limits and gauge theories, hep-th=0202186. [226] V. Kazakov, I.K. Kostov, D. Kutasov, A matrix model for the two-dimensional black hole, Nucl. Phys. B 622 (2002) 141–188, hep-th=0101011. [227] J.M. Maldacena, Statistical entropy of near extremal 1ve-branes, Nucl. Phys. B 477 (1996) 168–174, hep-th=9605016. [228] A. Hanany, N. Prezas, J. 
Troost, The partition function of the two-dimensional black hole conformal 1eld theory, hep-th=0202129. [229] A. Sen, Rotating charged black hole solution in heterotic string theory, Phys. Rev. Lett. 69 (1992) 1006–1009, hep-th=9204046. [230] G.T. Horowitz, The dark side of string theory: black holes and black strings, hep-th=9210119. [231] J.G. Russo, A.A. Tseytlin, Waves, boosted branes and BPS states in M-theory, Nucl. Phys. B 490 (1997) 121–144, hep-th=9611047. [232] M. Banados, M. Henneaux, C. Teitelboim, J. Zanelli, Geometry of the (2 + 1) black hole, Phys. Rev. D 48 (1993) 1506–1525, gr-qc=9302012.
687
J.R. David, G. Mandal, S.R. Wadia, Microscopic formulation of black holes in string theory